  1. Percona Operator for PostgreSQL
  2. K8SPG-96

pmm container should not crash in case of issues

Details

    • Improvement
    • Status: Done
    • Medium
    • Resolution: Done
    • None
    • 1.1.0
    • None
    • None
    • Yes

    Description

      Our Operator provides an integration with Percona Monitoring and Management.

      We deploy pmm-agent as a sidecar container.

      If pmm-agent cannot connect to the server, or wrong credentials are supplied, the whole container crashes. This in turn puts the whole Pod that runs the DB cluster component into CrashLoopBackOff status (see https://jira.percona.com/browse/PMM-7677).

       

      The PMM team added flags that let us avoid the container crash: instead, pmm-agent is simply restarted until it recovers the connection.

      Ticket: https://jira.percona.com/browse/PMM-7677

      PR: https://github.com/Percona-Lab/pmm-submodules/pull/1923/files

      Flags introduced:

      • PMM_AGENT_SIDECAR - if true, `pmm-agent` is restarted whenever it fails.
      • PMM_AGENT_SIDECAR_SLEEP - time to wait before restarting `pmm-agent` when PMM_AGENT_SIDECAR is true; defaults to 1 second.
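
The restart behaviour described by these two flags can be pictured with a small supervisor loop. This is only an illustrative sketch, not the actual PMM entrypoint: the `supervise` function, its `run_agent` callable, and the choice to read the flags from an environment-style mapping and to restart only on a non-zero exit code are all assumptions made for the example.

```python
import time
from typing import Callable, Mapping


def supervise(
    run_agent: Callable[[], int],
    env: Mapping[str, str],
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run the agent once, or keep restarting it on failure in sidecar mode.

    run_agent is a hypothetical callable that starts the agent, blocks until
    it exits, and returns its exit code.
    """
    # Assumption: the flags arrive as strings, e.g. from os.environ.
    sidecar = env.get("PMM_AGENT_SIDECAR", "false").lower() == "true"
    delay = float(env.get("PMM_AGENT_SIDECAR_SLEEP", "1"))  # 1 second by default

    while True:
        code = run_agent()
        if code == 0 or not sidecar:
            # Clean exit, or sidecar mode disabled: propagate the exit code,
            # which is what lets the container terminate.
            return code
        # Sidecar mode: the agent failed, so wait and start it again
        # instead of letting the container exit.
        sleep(delay)
```

With sidecar mode off, a failing agent makes `supervise` return immediately with the failure code; with it on, the loop keeps going until the agent exits cleanly.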

      We are going to set the flags as follows:

      • PMM_AGENT_SIDECAR to true
      • PMM_AGENT_SIDECAR_SLEEP to 5 (seconds)
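
As a rough sketch of what these settings could look like on the sidecar, the snippet below builds the `env` entries of a container spec as plain Python data. It assumes the flags are passed as container environment variables; the helper `pmm_sidecar_env` and the container name `pmm-client` are hypothetical and not taken from the Operator's code. Kubernetes expects environment variable values to be strings, so the sleep interval is converted explicitly.

```python
import json
from typing import Dict, List


def pmm_sidecar_env(sleep_seconds: int = 5) -> List[Dict[str, str]]:
    """Return env entries in the name/value shape used by a container spec."""
    return [
        {"name": "PMM_AGENT_SIDECAR", "value": "true"},
        # Env values must be strings, so the integer is converted here.
        {"name": "PMM_AGENT_SIDECAR_SLEEP", "value": str(sleep_seconds)},
    ]


# Hypothetical container fragment; the name is an assumption for illustration.
container = {"name": "pmm-client", "env": pmm_sidecar_env()}
print(json.dumps(container, indent=2))
```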

       


            People

              slava.sarzhan Slava Sarzhan
              sergey.pronin Sergey Pronin