  1. Percona Monitoring and Management
  2. PMM-7350

Restart DB Cluster restarts only one PXC server and one proxySQL, not all of them

Details

    • 3
    • Yes
    • Server Integrations
    • Just follow the steps to reproduce; the issue should no longer be reproducible.

    Description

      Impact on the user:

      • Restarting the DB cluster does not do what the user expects: most pods are not restarted

      Steps to reproduce:

      1. Set up a PXC cluster in PMM
      2. Wait until it is ready
      3. Restart it
      4. Wait until it is restarted
      5. Check the pod status in k8s

      Actual result:
      NAME                      READY   STATUS              RESTARTS   AGE
      spronin-test-proxysql-0   3/3     Running             6          13h
      spronin-test-proxysql-1   3/3     Running             0          13h
      spronin-test-proxysql-2   0/3     ContainerCreating   0          1s
      spronin-test-pxc-0        1/1     Running             0          13h
      spronin-test-pxc-1        1/1     Running             8          13h
      spronin-test-pxc-2        1/1     Terminating         0          2m29s

      As you can see, only one PXC pod and one ProxySQL pod were started recently; the others have been up for 13h.

      Expected Result:
      All pods have recent uptime

      Workaround:
      Restart the pods with kubectl.

      Suggested Implementation:

      • When dbaas-controller gets a Restart request:
      • Pause the DB cluster
      • Wait until the DB cluster is paused
        • Start a new goroutine to resume the DB cluster
      • Return the response

      Possible issues:

      • If dbaas-controller restarts while a cluster restart is in progress, the DB cluster may get stuck in the paused state
      • After the DB cluster is paused, it may still report an active status for a few seconds (PMM-7397)
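One possible mitigation for the first issue, which this ticket does not propose and which is shown only as a sketch: mark a cluster before pausing it, and on controller startup resume every cluster that is paused and still carries the mark. The annotation key `restartMarker` and the `Cluster` type below are hypothetical.

```go
package main

import "fmt"

// restartMarker is a hypothetical annotation key that the controller would
// set before pausing a cluster and remove after resuming it.
const restartMarker = "dbaas.example/restart-in-progress"

// Cluster is a simplified, illustrative view of a DB cluster resource.
type Cluster struct {
	Name        string
	Paused      bool
	Annotations map[string]string
}

// ClustersToResume returns the clusters that were paused by an interrupted
// restart. Clusters paused deliberately by a user carry no marker and are
// left alone.
func ClustersToResume(clusters []Cluster) []string {
	var names []string
	for _, c := range clusters {
		// Reading a missing key from a nil map yields "", so clusters
		// without annotations are handled safely.
		if c.Paused && c.Annotations[restartMarker] == "true" {
			names = append(names, c.Name)
		}
	}
	return names
}

func main() {
	clusters := []Cluster{
		{Name: "interrupted", Paused: true, Annotations: map[string]string{restartMarker: "true"}},
		{Name: "paused-by-user", Paused: true},
		{Name: "running", Paused: false},
	}
	fmt.Println(ClustersToResume(clusters))
}
```

Keeping the marker on the cluster resource itself, rather than in controller memory, is what would let the state survive a controller restart.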

      Details:

      PMM-6824 implemented database cluster restart with kubectl rollout restart. That is good enough for alpha but not for beta: we can lose data this way, and the operators team confirmed that it is not safe.

      For beta1, we should implement restart via full cluster pause/resume: https://www.percona.com/doc/kubernetes-operator-for-pxc/pause.html

      The same functionality will be available for PSMDB.
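In the PXC operator, pausing and resuming is driven by the `pause` field in the spec of the cluster custom resource. The sketch below only builds the JSON merge patch bodies that toggle that field; how the patch is sent (client library, `kubectl patch --type merge`, and so on) is left out, and the function name `PausePatch` is an assumption made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PausePatch builds a JSON merge patch that sets spec.pause on the cluster
// custom resource: true pauses the cluster, false resumes it.
func PausePatch(pause bool) ([]byte, error) {
	patch := map[string]interface{}{
		"spec": map[string]interface{}{
			"pause": pause,
		},
	}
	return json.Marshal(patch)
}

func main() {
	for _, pause := range []bool{true, false} {
		body, err := PausePatch(pause)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("pause=%v patch=%s\n", pause, body)
	}
}
```

A restart is then the pause patch, a wait until the cluster reports that it is paused, and the resume patch.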

            People

              andrew.minkin Andrew Minkin
              roma.novikov Roma Novikov
              Votes: 0
              Watchers: 6
