  1. Percona Monitoring and Management
  2. PMM-5823

pmm-server log download and version API request fail with a timeout

    Details

    • Type: Bug
    • Status: In Review
    • Priority: Medium
    • Resolution: Unresolved
    • Affects Version/s: 2.4.0, 2.5.0
    • Fix Version/s: None
    • Component/s: PMM ManageD
    • Story Points: 1
    • Sprint: Platform Sprint 18, Platform Sprint 21, Platform Sprint 24
    • Needs Review: Yes

      Description

      Issue: Downloading the pmm-server logs and calling the API that returns the version both fail with a timeout.

      Steps to reproduce:

      1. Install PMM Server:

      docker create -v /srv --name pmm-data percona/pmm-server:2.5.0 /bin/true
      docker run -d -p 80:80 -p 443:443 --volumes-from pmm-data --name pmm-server --restart always percona/pmm-server:2.5.0

      2. Log in to the PMM GUI.

      3. Connect to the pmm-server container and list the logs:

      docker exec -it pmm-server /bin/bash
      ls -alh /srv/logs

      4. Run pmm-admin summary --pprof from inside the pmm-server container while trying to download https://<address-of-your-pmm-server>/managed/logs.zip
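
The last step above can be scripted so that the download is timed while profiling is still running. This is only a sketch: the address and credentials are the ones that appear in this report, PPROF_CMD and MAX_TIME are hypothetical knobs added for illustration, and it assumes curl is available wherever the script runs (inside the pmm-server container, since pmm-admin is started there).

```shell
#!/bin/sh
# Sketch: time the logs.zip download while profiling runs in parallel.
# PMM_URL and PMM_AUTH are assumptions taken from this report; adjust them.
PMM_URL="${PMM_URL:-https://172.17.0.2}"
PMM_AUTH="${PMM_AUTH:-admin:admin}"
# Command that loads the server; overridable so the flow can be dry-run.
PPROF_CMD="${PPROF_CMD:-pmm-admin summary --pprof}"
# Client-side limit in seconds (assumption: longer than the 60 s profile).
MAX_TIME="${MAX_TIME:-120}"

# Print "<http_code> <seconds>" for one GET of logs.zip.
download_logs() {
    curl -k -s -u "$PMM_AUTH" -o /dev/null \
         --max-time "$MAX_TIME" \
         -w '%{http_code} %{time_total}\n' \
         "$PMM_URL/managed/logs.zip"
}

# Start the profiling command in the background, then attempt the download.
reproduce() {
    # Intentionally unquoted so the command string splits into words.
    $PPROF_CMD >/dev/null 2>&1 &
    pprof_pid=$!
    sleep 1            # give profiling a moment to start
    download_logs
    wait "$pprof_pid"
}
```

Calling reproduce prints one line with the HTTP status and the elapsed time. A status of 504 after roughly a minute would match the behaviour described in this report; 200 means the run did not hit the problem.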

      The version API request also fails with a timeout:

      # curl -k 'https://admin:admin@172.17.0.2:443/managed/v1/version' 
      <html>
      <head><title>504 Gateway Time-out</title></head>
      <body>
      <center><h1>504 Gateway Time-out</h1></center>
      <hr><center>nginx</center>
      </body>
      </html>
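
To tell an nginx gateway timeout apart from other failures without reading the HTML body, the same request can be reduced to its status code. A minimal sketch, assuming the address and credentials from this report; the 90 second client limit is an assumption meant to outlast the proxy timeout so that the 504 produced by nginx is what gets reported, and classify_status is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: probe /managed/v1/version and report the HTTP status only.
# PMM_URL and PMM_AUTH are assumptions taken from this report; adjust them.
PMM_URL="${PMM_URL:-https://172.17.0.2}"
PMM_AUTH="${PMM_AUTH:-admin:admin}"

# Map an HTTP status code to a short diagnosis.
classify_status() {
    case "$1" in
        200) echo "ok" ;;
        401) echo "unauthorized: check credentials" ;;
        502) echo "bad gateway: upstream unreachable or invalid response" ;;
        504) echo "gateway timeout: upstream did not answer in time" ;;
        000) echo "no response: connection failed or client timeout" ;;
        *)   echo "unexpected status $1" ;;
    esac
}

# Request the version endpoint and print "<code>: <diagnosis>".
probe_version() {
    code=$(curl -k -s -u "$PMM_AUTH" -o /dev/null --max-time 90 \
                -w '%{http_code}' "$PMM_URL/managed/v1/version")
    echo "$code: $(classify_status "$code")"
}
```

Running probe_version in a loop while pmm-admin summary --pprof is active should show whether the endpoint recovers once profiling finishes.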
      
      Contents of /srv/logs on pmm-server, before and after running pmm-admin summary --pprof:
      [root@2275edb6f0ec logs]# ls -lah      
      total 96K
      drwxrwxr-x 2 pmm      pmm      4.0K Apr 24 06:54 .
      drwxr-xr-x 9 root     root     4.0K Apr 24 06:53 ..
      -rw-r----- 1 root     root        0 Apr 24 06:54 clickhouse-server.err.log
      -rw-r----- 1 root     root     6.5K Apr 24 06:54 clickhouse-server.log
      -rw-r--r-- 1 root     root      305 Apr 24 06:54 clickhouse-server.startup.log
      -rw-r--r-- 1 root     root        0 Apr 24 06:54 cron.log
      -rw-r--r-- 1 root     root      134 Apr 24 06:54 dashboard-upgrade.log
      -rw-r--r-- 1 root     root     5.6K Apr 24 06:54 grafana.log
      -rw-r--r-- 1 root     root     1.3K Apr 24 06:54 nginx.access.log
      -rw-r--r-- 1 root     root      254 Apr 24 06:54 nginx.error.log
      -rw-r--r-- 1 root     root      112 Apr 24 06:54 nginx.startup.log
      -rw-r--r-- 1 root     root      12K Apr 24 06:54 pmm-agent.log
      -rw-r--r-- 1 root     root      17K Apr 24 06:54 pmm-managed.log
      -rw------- 1 postgres postgres  182 Apr 24 06:54 postgresql.log
      -rw-r--r-- 1 root     root      694 Apr 24 06:54 postgresql.startup.log
      -rw-r--r-- 1 root     root     2.2K Apr 24 06:54 prometheus.log
      -rw-r--r-- 1 root     root      968 Apr 24 06:54 qan-api2.log
      -rw-r--r-- 1 root     root     2.6K Apr 24 06:54 supervisord.log
      [root@2275edb6f0ec logs]# pmm-admin summary --pprof 
      Getting http://127.0.0.1:9933/debug/pprof/profile?seconds=60 ...
      Getting http://127.0.0.1:7773/debug/pprof/profile?seconds=60 ...
      Getting http://127.0.0.1:7777/debug/pprof/profile?seconds=60 ...
      Getting http://127.0.0.1:7773/debug/pprof/heap?gc=1 ...
      Getting http://127.0.0.1:9933/debug/pprof/heap?gc=1 ...
      Getting http://127.0.0.1:7777/debug/pprof/heap?gc=1 ...
      Getting http://127.0.0.1:9933/debug/pprof/trace?seconds=10 ...
      Getting http://127.0.0.1:7777/debug/pprof/trace?seconds=10 ...
      Getting http://127.0.0.1:7773/debug/pprof/trace?seconds=10 ...
      [GET /logs.zip][401] Logs default  &{Code:16 Error:Unauthorized Message:Unauthorized}
      summary_2275edb6f0ec_2020_04_24_06_55_15.zip created.
      
      
      [root@2275edb6f0ec logs]# ls -lah 
      total 244K
      drwxrwxr-x 2 pmm      pmm      4.0K Apr 24 06:55 .
      drwxr-xr-x 9 root     root     4.0K Apr 24 06:53 ..
      -rw-r----- 1 root     root        0 Apr 24 06:54 clickhouse-server.err.log
      -rw-r----- 1 root     root      18K Apr 24 06:56 clickhouse-server.log
      -rw-r--r-- 1 root     root      305 Apr 24 06:54 clickhouse-server.startup.log
      -rw-r--r-- 1 root     root        0 Apr 24 06:54 cron.log
      -rw-r--r-- 1 root     root      134 Apr 24 06:54 dashboard-upgrade.log
      -rw-r--r-- 1 root     root      16K Apr 24 06:56 grafana.log
      -rw-r--r-- 1 root     root      43K Apr 24 06:56 nginx.access.log
      -rw-r--r-- 1 root     root     1.5K Apr 24 06:55 nginx.error.log
      -rw-r--r-- 1 root     root      112 Apr 24 06:54 nginx.startup.log
      -rw-r--r-- 1 root     root      13K Apr 24 06:56 pmm-agent.log
      -rw-r--r-- 1 root     root      32K Apr 24 06:56 pmm-managed.log
      -rw------- 1 postgres postgres  182 Apr 24 06:54 postgresql.log
      -rw-r--r-- 1 root     root      694 Apr 24 06:54 postgresql.startup.log
      -rw-r--r-- 1 root     root     2.2K Apr 24 06:54 prometheus.log
      -rw-r--r-- 1 root     root     2.1K Apr 24 06:56 qan-api2.log
      -rw-r--r-- 1 root     root      71K Apr 24 06:56 summary_2275edb6f0ec_2020_04_24_06_55_15.zip
      -rw-r--r-- 1 root     root     2.6K Apr 24 06:54 supervisord.log
      [root@2275edb6f0ec logs]# 
      

      summary_2275edb6f0ec_2020_04_24_06_55_15.zip

      cat nginx.error.log

      2020/04/24 06:54:19 [error] 36#36: *4 upstream rejected request with error 0 while reading response header from upstream, client: 172.17.0.1, server: _, request: "POST /agent.Agent/Connect HTTP/2.0", upstream: "grpc://127.0.0.1:7771", host: "172.17.0.2"
      2020/04/24 06:55:29 [warn] 36#36: *70 an upstream response is buffered to a temporary file /var/cache/nginx/proxy_temp/1/00/0000000001 while reading upstream, client: 172.17.0.1, server: _, request: "GET /graph/public/build/vendors~app.5f08acfc6cecf932dc51.js HTTP/1.1", upstream: "http://127.0.0.1:3000/public/build/vendors~app.5f08acfc6cecf932dc51.js", host: "172.17.0.2", referrer: "http://172.17.0.2/graph/login"
      2020/04/24 06:55:29 [warn] 36#36: *73 an upstream response is buffered to a temporary file /var/cache/nginx/proxy_temp/2/00/0000000002 while reading upstream, client: 172.17.0.1, server: _, request: "GET /graph/public/build/app.5f08acfc6cecf932dc51.js HTTP/1.1", upstream: "http://127.0.0.1:3000/public/build/app.5f08acfc6cecf932dc51.js", host: "172.17.0.2", referrer: "http://172.17.0.2/graph/login"
      2020/04/24 06:55:40 [warn] 36#36: *71 an upstream response is buffered to a temporary file /var/cache/nginx/proxy_temp/3/00/0000000003 while reading upstream, client: 172.17.0.1, server: _, request: "GET /graph/public/build/vendors~app.5f08acfc6cecf932dc51.js HTTP/1.1", upstream: "http://127.0.0.1:3000/public/build/vendors~app.5f08acfc6cecf932dc51.js", host: "172.17.0.2", referrer: "http://172.17.0.2/graph/graph/"
      2020/04/24 06:56:45 [error] 36#36: *71 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 172.17.0.1, server: _, request: "POST /v1/Updates/Check HTTP/1.1", upstream: "http://127.0.0.1:7772/v1/Updates/Check", host: "172.17.0.2", referrer: "http://172.17.0.2/graph/d/pmm-home/home-dashboard?orgId=1&refresh=1m"
      2020/04/24 06:56:47 [error] 36#36: *62 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 172.17.0.1, server: _, request: "GET /managed/logs.zip HTTP/1.1", upstream: "http://127.0.0.1:7772/logs.zip", host: "172.17.0.2"
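
The two [error] entries that matter are easy to miss among the buffering warnings. A small sketch that keeps only the "upstream timed out" lines and prints the request next to the upstream that did not answer; the function name is hypothetical and the log path is passed as an argument (in this report it is /srv/logs/nginx.error.log).

```shell
#!/bin/sh
# Sketch: list requests that hit "upstream timed out" in an nginx error log.

# Print "<date> <time> <request> -> <upstream>" for every upstream timeout.
list_upstream_timeouts() {
    grep 'upstream timed out' "$1" |
    sed -E 's/^([0-9/]+ [0-9:]+) .*request: "([^"]*)", upstream: "([^"]*)".*$/\1 \2 -> \3/'
}
```

Applied to the log above, both timeouts resolve to the same upstream, 127.0.0.1:7772, which is presumably pmm-managed given that /managed/logs.zip is proxied there.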
      

      Expected result: neither the log download nor the version API request should fail with a timeout.

      Note: The issue does not reproduce consistently, but it is reproducible most of the time.

          Attachments

          1. 500_2.png (295 kB)
          2. 500.png (175 kB)
          3. 504.png (235 kB)
          4. image-2020-06-08-15-06-58-326.png (32 kB)
          5. image-2020-07-07-13-07-15-427.png (32 kB)
          6. pmm-server_2020-04-24_07-38.zip (56 kB)
          7. Screenshot from 2020-07-07 13-34-38.png (16 kB)
          8. Screenshot from 2020-07-07 13-34-38.png (16 kB)
          9. Screenshot from 2020-07-07 13-34-48.png (40 kB)
          10. summary_2275edb6f0ec_2020_04_24_06_55_15.zip (71 kB)
          11. summary_c37e632b018d_2020_04_24_07_47_02.zip (77 kB)
          12. timeout for pmm-server log.png (131 kB)

                People

                Assignee: Puneet Kala (puneet.kala)
                Reporter: Lalit Choudhary (lalit.choudhary)
                Votes: 4
                Watchers: 9

                  Dates

                  Created:
                  Updated:

                    Time Tracking

                    Estimated: Not Specified
                    Remaining: Not Specified
                    Logged: 4d 7h