Percona Monitoring and Management / PMM-5823

PMM Server: Timeout when simultaneously generating and accessing logs via download or API

    Description

      THE PROBLEM OCCURS ONLY RIGHT AFTER YOU LOG IN TO GRAFANA.

      Please see this comment with summary and steps to reproduce.

      Issue: Downloading the pmm-server logs and calling the version API both fail with a timeout.

      Steps to reproduce:
      1. Install PMM Server:

       

      docker create -v /srv --name pmm-data percona/pmm-server:2.5.0 /bin/true
      docker run -d -p 80:80 -p 443:443 --volumes-from pmm-data --name pmm-server --restart always percona/pmm-server:2.5.0
      

       

      2. Log in to the PMM GUI.

      3. Connect to pmm-server:

      docker exec -it pmm-server /bin/bash
      
      ls -alh /srv/logs
      

      4. Run pmm-admin summary --pprof inside the pmm-server container and, while it is running, try to download https://<address-of-your-pmm-server>/managed/logs.zip

      The version API request also fails with a timeout:

      # curl -k 'https://admin:[email protected]:443/managed/v1/version' 
      <html>
      <head><title>504 Gateway Time-out</title></head>
      <body>
      <center><h1>504 Gateway Time-out</h1></center>
      <hr><center>nginx</center>
      </body>
      </html>
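
To hit both endpoints at the same moment, the two requests can be sent concurrently from a shell. The following is a minimal sketch and not part of the original report: the `PMM_URL`, `PMM_AUTH`, and `MAX_TIME` variables, their default values, and the `probe`, `verdict`, and `run_probes` helper names are assumptions for illustration. Only the two endpoint paths are taken from the report.

```shell
#!/usr/bin/env bash
# Sketch: request logs.zip and the version endpoint concurrently and print
# the HTTP status of each. All defaults below are assumptions; override
# them through the environment.
PMM_URL="${PMM_URL:-https://localhost}"   # assumed server address
PMM_AUTH="${PMM_AUTH:-admin:admin}"       # assumed user:password
MAX_TIME="${MAX_TIME:-120}"               # assumed client-side limit, seconds

# Print "<path> <http_status>". curl writes 000 when no HTTP response arrived.
probe() {
  local path="$1" code
  code=$(curl -k -s -o /dev/null -u "$PMM_AUTH" --max-time "$MAX_TIME" \
              -w '%{http_code}' "${PMM_URL}${path}")
  echo "${path} ${code:-000}"
}

# Map an HTTP status to a short label.
verdict() {
  case "$1" in
    200) echo ok ;;
    504) echo gateway_timeout ;;
    000) echo no_response ;;
    *)   echo other ;;
  esac
}

# Start both requests in the background and wait for both to finish.
run_probes() {
  probe /managed/logs.zip &
  probe /managed/v1/version &
  wait
}
```

Source the file and call `run_probes` while `pmm-admin summary --pprof` is running in the container. Each output line can be passed through `verdict` to label the result.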
      
      On the PMM Server, in /srv/logs:
      [root@2275edb6f0ec logs]# ls -lah      
      total 96K
      drwxrwxr-x 2 pmm      pmm      4.0K Apr 24 06:54 .
      drwxr-xr-x 9 root     root     4.0K Apr 24 06:53 ..
      -rw-r----- 1 root     root        0 Apr 24 06:54 clickhouse-server.err.log
      -rw-r----- 1 root     root     6.5K Apr 24 06:54 clickhouse-server.log
      -rw-r--r-- 1 root     root      305 Apr 24 06:54 clickhouse-server.startup.log
      -rw-r--r-- 1 root     root        0 Apr 24 06:54 cron.log
      -rw-r--r-- 1 root     root      134 Apr 24 06:54 dashboard-upgrade.log
      -rw-r--r-- 1 root     root     5.6K Apr 24 06:54 grafana.log
      -rw-r--r-- 1 root     root     1.3K Apr 24 06:54 nginx.access.log
      -rw-r--r-- 1 root     root      254 Apr 24 06:54 nginx.error.log
      -rw-r--r-- 1 root     root      112 Apr 24 06:54 nginx.startup.log
      -rw-r--r-- 1 root     root      12K Apr 24 06:54 pmm-agent.log
      -rw-r--r-- 1 root     root      17K Apr 24 06:54 pmm-managed.log
      -rw------- 1 postgres postgres  182 Apr 24 06:54 postgresql.log
      -rw-r--r-- 1 root     root      694 Apr 24 06:54 postgresql.startup.log
      -rw-r--r-- 1 root     root     2.2K Apr 24 06:54 prometheus.log
      -rw-r--r-- 1 root     root      968 Apr 24 06:54 qan-api2.log
      -rw-r--r-- 1 root     root     2.6K Apr 24 06:54 supervisord.log
      [root@2275edb6f0ec logs]# pmm-admin summary --pprof 
      Getting http://127.0.0.1:9933/debug/pprof/profile?seconds=60 ...
      Getting http://127.0.0.1:7773/debug/pprof/profile?seconds=60 ...
      Getting http://127.0.0.1:7777/debug/pprof/profile?seconds=60 ...
      Getting http://127.0.0.1:7773/debug/pprof/heap?gc=1 ...
      Getting http://127.0.0.1:9933/debug/pprof/heap?gc=1 ...
      Getting http://127.0.0.1:7777/debug/pprof/heap?gc=1 ...
      Getting http://127.0.0.1:9933/debug/pprof/trace?seconds=10 ...
      Getting http://127.0.0.1:7777/debug/pprof/trace?seconds=10 ...
      Getting http://127.0.0.1:7773/debug/pprof/trace?seconds=10 ...
      [GET /logs.zip][401] Logs default  &{Code:16 Error:Unauthorized Message:Unauthorized}
      summary_2275edb6f0ec_2020_04_24_06_55_15.zip created.
      
      
      [root@2275edb6f0ec logs]# ls -lah 
      total 244K
      drwxrwxr-x 2 pmm      pmm      4.0K Apr 24 06:55 .
      drwxr-xr-x 9 root     root     4.0K Apr 24 06:53 ..
      -rw-r----- 1 root     root        0 Apr 24 06:54 clickhouse-server.err.log
      -rw-r----- 1 root     root      18K Apr 24 06:56 clickhouse-server.log
      -rw-r--r-- 1 root     root      305 Apr 24 06:54 clickhouse-server.startup.log
      -rw-r--r-- 1 root     root        0 Apr 24 06:54 cron.log
      -rw-r--r-- 1 root     root      134 Apr 24 06:54 dashboard-upgrade.log
      -rw-r--r-- 1 root     root      16K Apr 24 06:56 grafana.log
      -rw-r--r-- 1 root     root      43K Apr 24 06:56 nginx.access.log
      -rw-r--r-- 1 root     root     1.5K Apr 24 06:55 nginx.error.log
      -rw-r--r-- 1 root     root      112 Apr 24 06:54 nginx.startup.log
      -rw-r--r-- 1 root     root      13K Apr 24 06:56 pmm-agent.log
      -rw-r--r-- 1 root     root      32K Apr 24 06:56 pmm-managed.log
      -rw------- 1 postgres postgres  182 Apr 24 06:54 postgresql.log
      -rw-r--r-- 1 root     root      694 Apr 24 06:54 postgresql.startup.log
      -rw-r--r-- 1 root     root     2.2K Apr 24 06:54 prometheus.log
      -rw-r--r-- 1 root     root     2.1K Apr 24 06:56 qan-api2.log
      -rw-r--r-- 1 root     root      71K Apr 24 06:56 summary_2275edb6f0ec_2020_04_24_06_55_15.zip
      -rw-r--r-- 1 root     root     2.6K Apr 24 06:54 supervisord.log
      [root@2275edb6f0ec logs]# 
      

      summary_2275edb6f0ec_2020_04_24_06_55_15.zip

      cat nginx.error.log

      2020/04/24 06:54:19 [error] 36#36: *4 upstream rejected request with error 0 while reading response header from upstream, client: 172.17.0.1, server: _, request: "POST /agent.Agent/Connect HTTP/2.0", upstream: "grpc://127.0.0.1:7771", host: "172.17.0.2"
      2020/04/24 06:55:29 [warn] 36#36: *70 an upstream response is buffered to a temporary file /var/cache/nginx/proxy_temp/1/00/0000000001 while reading upstream, client: 172.17.0.1, server: _, request: "GET /graph/public/build/vendors~app.5f08acfc6cecf932dc51.js HTTP/1.1", upstream: "http://127.0.0.1:3000/public/build/vendors~app.5f08acfc6cecf932dc51.js", host: "172.17.0.2", referrer: "http://172.17.0.2/graph/login"
      2020/04/24 06:55:29 [warn] 36#36: *73 an upstream response is buffered to a temporary file /var/cache/nginx/proxy_temp/2/00/0000000002 while reading upstream, client: 172.17.0.1, server: _, request: "GET /graph/public/build/app.5f08acfc6cecf932dc51.js HTTP/1.1", upstream: "http://127.0.0.1:3000/public/build/app.5f08acfc6cecf932dc51.js", host: "172.17.0.2", referrer: "http://172.17.0.2/graph/login"
      2020/04/24 06:55:40 [warn] 36#36: *71 an upstream response is buffered to a temporary file /var/cache/nginx/proxy_temp/3/00/0000000003 while reading upstream, client: 172.17.0.1, server: _, request: "GET /graph/public/build/vendors~app.5f08acfc6cecf932dc51.js HTTP/1.1", upstream: "http://127.0.0.1:3000/public/build/vendors~app.5f08acfc6cecf932dc51.js", host: "172.17.0.2", referrer: "http://172.17.0.2/graph/graph/"
      2020/04/24 06:56:45 [error] 36#36: *71 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 172.17.0.1, server: _, request: "POST /v1/Updates/Check HTTP/1.1", upstream: "http://127.0.0.1:7772/v1/Updates/Check", host: "172.17.0.2", referrer: "http://172.17.0.2/graph/d/pmm-home/home-dashboard?orgId=1&refresh=1m"
      2020/04/24 06:56:47 [error] 36#36: *62 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 172.17.0.1, server: _, request: "GET /managed/logs.zip HTTP/1.1", upstream: "http://127.0.0.1:7772/logs.zip", host: "172.17.0.2"
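
In the excerpt above, the two `upstream timed out` entries are the relevant ones; the `[warn]` lines about buffering are unrelated noise. A small filter can pull only the timed-out requests out of a longer log. This is a sketch under assumptions: the function name `upstream_timeouts` is hypothetical, and the default path simply combines the `/srv/logs` directory and file name shown in the report.

```shell
# Sketch: list the requests that nginx logged as upstream timeouts.
# Prints "<date> <time> <request line>" for each matching entry.
upstream_timeouts() {
  # Keep only timeout entries, then reduce each line to its timestamp
  # and the quoted request field.
  grep -F 'upstream timed out' "${1:-/srv/logs/nginx.error.log}" |
    sed -E 's/^([^ ]+ [^ ]+) .*request: "([^"]*)".*/\1 \2/'
}
```

Call it with no arguments inside the container, or pass the path to a copy of the log, for example `upstream_timeouts ./nginx.error.log`.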
      

      Expectation: Neither the log download nor the version API request should fail with a timeout.

      Note: The issue does not reproduce consistently, but it is reproducible most of the time.

      Attachments

        1. 500_2.png (295 kB)
        2. 500.png (175 kB)
        3. 504.png (235 kB)
        4. image-2020-06-08-15-06-58-326.png (32 kB)
        5. image-2020-07-07-13-07-15-427.png (32 kB)
        6. image-2020-10-08-13-09-11-141.png (29 kB)
        7. image-2020-10-16-16-25-48-373.png (26 kB)
        8. pmm2.11.0.zip (63 kB)
        9. pmm2.5.0.zip (55 kB)
        10. pmm-server_2020-04-24_07-38.zip (56 kB)
        11. pmm-server_2020-10-14_07-11.zip (55 kB)
        12. Screenshot from 2020-07-07 13-34-38.png (16 kB)
        13. Screenshot from 2020-07-07 13-34-38.png (16 kB)
        14. Screenshot from 2020-07-07 13-34-48.png (40 kB)
        15. summary_2275edb6f0ec_2020_04_24_06_55_15.zip (71 kB)
        16. summary_c37e632b018d_2020_04_24_07_47_02.zip (77 kB)
        17. timeout for pmm-server log.png (131 kB)

      People

        Assignee: Unassigned
        Reporter: Lalit Choudhary (lalit.choudhary)
        Votes: 4
        Watchers: 13

      Time Tracking

        Original Estimate: Not Specified
        Remaining Estimate: Not Specified
        Time Spent: 4d 7h