Details
Bug
Status: Done
High
Resolution: Fixed
2.12.0, 2.14.0
0.5
06 - Core
C/S Core
Description
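client_max_body_size is set to 10m, which is much larger than the request body buffer (client_body_buffer_size). NGINX writes any request body larger than the buffer to a temporary file and logs a warning each time, so a log storm results whenever the vmagent(s) in push mode have a backlog to send.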
In a reasonably small amount of time (15m or so), with only 2 instances under monitoring, the following was observed:
$ grep -Fc "client request body is buffered to a temporary file" /srv/logs/nginx.error.log
71850
Setting client_body_buffer_size higher than the max body size resolved the log storm, although setting it equal to the max body size is sufficient.
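As an illustration of that change, a minimal sketch of the relevant directives follows. The `location /victoriametrics/` block and its placement are assumptions made for the example; the actual layout of /etc/nginx/conf.d/pmm.conf and the change in the PR may differ.

```nginx
# Hypothetical excerpt from /etc/nginx/conf.d/pmm.conf; the location path and
# the surrounding directives are assumptions, not copied from the real file.
location /victoriametrics/ {
    # Limit from this report: request bodies of up to 10m are accepted.
    client_max_body_size    10m;

    # Match the buffer to the limit so that every accepted body fits in memory
    # and NGINX no longer logs
    # "a client request body is buffered to a temporary file".
    client_body_buffer_size 10m;

    # ... existing proxy directives unchanged ...
}
```

With the two values equal, any body that passes the size limit also fits in the buffer, which is why an equal size is enough.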
In addition, because all the instances sent their backlog as fast as they could, the PMM server host was under heavy load for quite a while.
Steps to reproduce:
- Use push-mode for pmm-agent and allow data to gather
- Modify /etc/nginx/conf.d/pmm.conf to add
location /victoriametrics/api/v1/write { return 404; }
ahead of the existing location directive for VictoriaMetrics (VM)
- Reload NGINX
- Watch the pmm-agent logs to confirm that messages like the following appear:
error VictoriaMetrics/app/vmagent/remotewrite/client.go:261 unexpected status code received after sending a block with size 656 bytes to "1:secret-url" during retry #8: 404; response body="<html>\r\n<head><title>404 Not Found</title></head>\r\n<body>\r\n<center><h1>404 Not Found</h1></center>\r\n<hr><center>nginx</center>\r\n</body>\r\n</html>\r\n"; re-sending the block in 60.000 seconds agentID=/agent_id/a03bde40-7931-4a41-b840-879a43d86745 component=agent-process type=vm_agent
- Monitor /srv/logs/nginx.error.log to look for "client request body is buffered to a temporary file" entries
- Allow agents to build up backlog (5-10m seems enough)
- Remove the fake location directive and apply the change from the PR
- Reload NGINX
- Confirm pmm-agents are sending normally and NGINX error log is free from error messages