Details
- Improvement
- Status: Done
- High
- Resolution: Done
- None
- 2
- Platform Sprint 29 (2.12+a2)
- Yes
- Yes
- Yes
Description
For better integration with k8s, there should be a way to wait for pmm-agent to become fully configured and connected.
Add a -wait flag to the pmm-admin status command. The flag's default value (0s) should not change the command's behavior. Otherwise, the flag's value should contain a duration (as in pmm-admin status -wait=30s). If given, pmm-admin should wait up to that long for pmm-agent to become connected. It should retry checking pmm-agent's status one second after the previous request failed or reported that pmm-agent is not connected. It should retry silently unless -debug or -trace is given. Once pmm-agent is connected or the time has elapsed, it should behave as it does now (i.e. produce normal output, or exit with an appropriate error message and a non-zero status code).
The flag should cover the following cases:
- when pmm-agent is not running (which currently produces the "Failed to get PMM Agent status from local pmm-agent: Post "http://127.0.0.1:7777/local/Status": dial tcp 127.0.0.1:7777: connect: connection refused." error);
- when pmm-agent is running but not configured (which currently produces the "Failed to get PMM Agent status from local pmm-agent: pmm-agent is running, but not set up." error);
- when pmm-agent is running and configured, but not connected to PMM Server (which currently produces the "Failed to get PMM Agent status from local pmm-agent: pmm-agent is not connected to PMM Server." error; this also fails, but it is a distinct case).
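One way to picture the three cases is as distinct states that all lead to a retry, with only the fully connected state ending the wait. The sketch below is hypothetical: agentState and shouldRetry are illustrative names and do not come from the pmm-admin code base.

```go
package main

import "fmt"

// agentState is a hypothetical enumeration of what a status check can observe.
type agentState int

const (
	notRunning   agentState = iota // connection to 127.0.0.1:7777 is refused
	notSetUp                       // pmm-agent is running, but not set up
	notConnected                   // configured, but not connected to PMM Server
	connected                      // fully configured and connected
)

// String gives a short human-readable label for each state.
func (s agentState) String() string {
	switch s {
	case notRunning:
		return "not running"
	case notSetUp:
		return "running, but not set up"
	case notConnected:
		return "not connected to PMM Server"
	case connected:
		return "connected"
	default:
		return "unknown"
	}
}

// shouldRetry reports whether the wait loop should keep polling in state s.
// All three failure cases from the list above are treated as retryable.
func shouldRetry(s agentState) bool {
	return s != connected
}

func main() {
	for _, s := range []agentState{notRunning, notSetUp, notConnected, connected} {
		fmt.Printf("%-30s retry=%t\n", s, shouldRetry(s))
	}
}
```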
How to test:
- Set up PMM Server
- Run pmm-admin status --wait=30s
- After 30s, it should return the error "Failed to get PMM Agent status from local pmm-agent: pmm-agent is running, but not set up."
- Configure pmm-agent
- Run pmm-admin status --wait=30s
- It should return the status immediately
- Stop PMM Server
- Run pmm-admin status --wait=30s
- After 30s, it should return the error "Failed to get PMM Agent status from local pmm-agent: pmm-agent is not connected to PMM Server."
- Run pmm-admin status --wait=30s and start PMM Server
- After a few seconds, the status should be returned
- Stop pmm-agent
- Run pmm-admin status --wait=30s
- After 30s, it should return the error "Failed to get PMM Agent status from local pmm-agent: Post "http://127.0.0.1:7777/local/Status": dial tcp 127.0.0.1:7777: connect: connection refused."
- Run pmm-admin status --wait=30s and start pmm-agent
- After a few seconds, the status should be returned
Documentation:
Documentation should be added to the pmm-admin man page.
--wait values are expected in Go duration format, e.g. `30s`, `5m`, `1h`
Issue Links
- relates to PMM-5588 pmm-admin status should always work (Done)