Percona Monitoring and Management / PMM-7464

NGINX misconfiguration leads to log storm in push mode



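      client_max_body_size is set to 10m, which is much higher than the request body buffer size (client_body_buffer_size, which NGINX defaults to 8k or 16k depending on the platform unless set explicitly). NGINX writes any request body larger than the buffer to a temporary file and logs a warning each time it does so. This results in a log storm if the vmagent(s) in push mode have a backlog.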

      In a reasonably small amount of time (15m or so), with only 2 instances under monitoring, the following was observed:

      $ grep -Fc "client request body is buffered to a temporary file" /srv/logs/nginx.error.log 

      Setting client_body_buffer_size higher than the max body size resolved the log storm, although an equal size is good enough (for example, client_body_buffer_size 10m to match the current client_max_body_size).
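A quick way to check whether a config is affected is to compare the two directives. The following is a minimal sketch only: the function names are hypothetical, and it assumes both directives are set in the same file, each on a single line ending in ';'.

```shell
#!/bin/sh
# Hypothetical helpers (not part of PMM or NGINX) to compare the two directives.

# Convert an NGINX size value (plain bytes, or a k/K or m/M suffix) to bytes.
to_bytes() {
    case "$1" in
        *[kK]) echo $(( ${1%[kK]} * 1024 )) ;;
        *[mM]) echo $(( ${1%[mM]} * 1024 * 1024 )) ;;
        *)     echo "$1" ;;
    esac
}

# Print the value of the first occurrence of a directive in a config file.
# Assumes "name value;" on one line; commented-out lines are not matched.
directive_value() {
    awk -v name="$1" '$1 == name { sub(/;.*$/, "", $2); print $2; exit }' "$2"
}

# Return 0 when client_body_buffer_size >= client_max_body_size,
# 1 when the buffer is smaller, 2 when either directive is missing.
buffer_covers_max_body() {
    max=$(directive_value client_max_body_size "$1")
    buf=$(directive_value client_body_buffer_size "$1")
    if [ -z "$max" ] || [ -z "$buf" ]; then
        return 2
    fi
    [ "$(to_bytes "$buf")" -ge "$(to_bytes "$max")" ]
}
```

For example, buffer_covers_max_body /etc/nginx/conf.d/pmm.conf && echo "buffer is large enough". That both directives live in pmm.conf is an assumption; point the function at whichever file sets them.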

      In addition, because the instances all sent their backlog as fast as they could, the PMM server host was under load for quite a while.

      Steps to reproduce:

      1. Use push-mode for pmm-agent and allow data to gather
      2. Modify /etc/nginx/conf.d/pmm.conf to add
        location /victoriametrics/api/v1/write { return 404; }

        ahead of the existing location directive for VictoriaMetrics (VM)

      3. Reload NGINX
      4. Watch the pmm-agent logs to confirm that messages like the following appear:
        error        VictoriaMetrics/app/vmagent/remotewrite/client.go:261        unexpected status code received after sending a block with size 656 bytes to "1:secret-url" during retry #8: 404; response body="<html>\r\n<head><title>404 Not Found</title></head>\r\n<body>\r\n<center><h1>404 Not Found</h1></center>\r\n<hr><center>nginx</center>\r\n</body>\r\n</html>\r\n"; re-sending the block in 60.000 seconds  agentID=/agent_id/a03bde40-7931-4a41-b840-879a43d86745 component=agent-process type=vm_agent

      5. Monitor /srv/logs/nginx.error.log for "client request body is buffered to a temporary file" entries
      6. Allow the agents to build up a backlog (5-10 minutes seems to be enough)
      7. Remove the fake location directive and apply the change in the PR
      8. Reload NGINX
      9. Confirm that the pmm-agents are sending normally and that the NGINX error log is free from these messages
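For steps 5 and 9, a per-minute count makes it easier to see when the storm starts and stops than a single total. This is a rough sketch: the function name is hypothetical, and it relies on the standard NGINX error-log prefix of "YYYY/MM/DD HH:MM:SS [level] ...".

```shell
#!/bin/sh
# Hypothetical helper: count "buffered to a temporary file" warnings per minute.
# Output is one line per minute: "<count> <date> <HH:MM>".
warnings_per_minute() {
    grep -F "client request body is buffered to a temporary file" "$1" |
        awk '{ print $1, substr($2, 1, 5) }' |   # keep date and HH:MM only
        sort | uniq -c
}
```

For example, warnings_per_minute /srv/logs/nginx.error.log. After the fix is applied and NGINX is reloaded, no new minutes should appear in the output.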


              alexander.tymchuk Alexander Tymchuk
              ceri.williams Ceri Williams