Percona Monitoring and Management / PMM-5924

Alertmanager not running after PMM Server upgrade via Docker

    Details

    • Type: Bug
    • Status: Done
    • Priority: High
    • Resolution: Fixed
    • Affects Version/s: 2.6.0
    • Fix Version/s: 2.6.1
    • Component/s: None
    • Labels: None
    • Story Points: 2
    • Sprint: Platform Sprint 16
    • Needs Review: Yes
    • Needs QA: Yes
    • Needs Doc: No

      Description

      User Impact:
      Alertmanager does not stay running after the upgrade: supervisord keeps respawning it and it exits with status 1 each time.
      Steps to reproduce:

      1. Install PMM Server 2.5.0 via Docker.
      2. Upgrade to 2.6.0 the Docker way:
        docker stop pmm-server
        docker rm pmm-server
        docker run -d -p 80:80 -p 443:443 --volumes-from pmm-nailya-20200512080852-13399-data --name pmm-nailya-20200512080852-13399-server --restart always -e PERCONA_TEST_CHECKS_INTERVAL=10s percona/pmm-server:2.6.0
      3. Check the PMM Server logs.
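
      Besides `docker logs`, the state of a single supervised program can be inspected from inside the container. The following is a minimal sketch, not part of the original report. It assumes that `supervisorctl` is available in the PMM Server image and that each program writes to `/srv/logs/<program>.log`; both are assumptions to verify against your image. The helper names `program_status` and `program_log_tail` are hypothetical.

```shell
# Container name taken from the report; override CONTAINER for your own setup.
CONTAINER="${CONTAINER:-pmm-nailya-20200512080852-13399-server}"
# Set DOCKER=echo to print the commands instead of running them.
DOCKER="${DOCKER:-docker}"

# Show supervisord's view of one program (default: alertmanager).
program_status() {
    "$DOCKER" exec "$CONTAINER" supervisorctl status "${1:-alertmanager}"
}

# Show the last 50 lines of the program's own log file.
# ASSUMPTION: logs live under /srv/logs/<program>.log inside the container.
program_log_tail() {
    "$DOCKER" exec "$CONTAINER" tail -n 50 "/srv/logs/${1:-alertmanager}.log"
}
```

      Calling `program_status` followed by `program_log_tail` should surface the error that makes Alertmanager exit, which the supervisord output alone does not show.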

      Actual result:

      docker logs pmm-nailya-20200512080852-13399-server
      WARN[2020-05-12T11:31:59.727+00:00] Configuration warning: environment variable "PERCONA_TEST_CHECKS_INTERVAL" IS NOT SUPPORTED and WILL BE REMOVED IN THE FUTURE. 
      2020-05-12 11:31:59,816 INFO Included extra file "/etc/supervisord.d/alertmanager.ini" during parsing
      2020-05-12 11:31:59,816 INFO Included extra file "/etc/supervisord.d/pmm.ini" during parsing
      2020-05-12 11:31:59,816 INFO Included extra file "/etc/supervisord.d/prometheus.ini" during parsing
      2020-05-12 11:31:59,816 INFO Included extra file "/etc/supervisord.d/qan-api2.ini" during parsing
      2020-05-12 11:31:59,816 INFO Set uid to user 0 succeeded
      2020-05-12 11:31:59,824 INFO RPC interface 'supervisor' initialized
      2020-05-12 11:31:59,824 INFO supervisord started with pid 1
      2020-05-12 11:32:00,826 INFO spawned: 'postgresql' with pid 12
      2020-05-12 11:32:00,827 INFO spawned: 'clickhouse' with pid 13
      2020-05-12 11:32:00,829 INFO spawned: 'grafana' with pid 14
      2020-05-12 11:32:00,831 INFO spawned: 'nginx' with pid 15
      2020-05-12 11:32:00,832 INFO spawned: 'cron' with pid 16
      2020-05-12 11:32:00,834 INFO spawned: 'prometheus' with pid 17
      2020-05-12 11:32:00,836 INFO spawned: 'alertmanager' with pid 18
      2020-05-12 11:32:00,837 INFO spawned: 'dashboard-upgrade' with pid 19
      2020-05-12 11:32:00,839 INFO spawned: 'qan-api2' with pid 20
      2020-05-12 11:32:00,841 INFO spawned: 'pmm-managed' with pid 21
      2020-05-12 11:32:00,842 INFO spawned: 'pmm-agent' with pid 22
      2020-05-12 11:32:00,880 INFO success: dashboard-upgrade entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
      2020-05-12 11:32:00,916 INFO exited: qan-api2 (exit status 1; not expected)
      2020-05-12 11:32:01,127 INFO exited: alertmanager (exit status 1; not expected)
      2020-05-12 11:32:01,960 INFO success: postgresql entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
      2020-05-12 11:32:01,960 INFO success: clickhouse entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
      2020-05-12 11:32:01,960 INFO success: grafana entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
      2020-05-12 11:32:01,961 INFO success: nginx entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
      2020-05-12 11:32:01,961 INFO success: cron entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
      2020-05-12 11:32:01,961 INFO success: prometheus entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
      2020-05-12 11:32:01,962 INFO spawned: 'qan-api2' with pid 134
      2020-05-12 11:32:01,962 INFO success: pmm-managed entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
      2020-05-12 11:32:01,963 INFO success: pmm-agent entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
      2020-05-12 11:32:01,974 INFO waiting for grafana to stop
      2020-05-12 11:32:02,132 INFO spawned: 'alertmanager' with pid 145
      2020-05-12 11:32:02,440 INFO exited: alertmanager (exit status 1; not expected)
      2020-05-12 11:32:02,843 INFO stopped: grafana (exit status 0)
      2020-05-12 11:32:03,176 INFO success: qan-api2 entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
      2020-05-12 11:32:04,663 INFO spawned: 'alertmanager' with pid 178
      2020-05-12 11:32:04,701 INFO exited: alertmanager (exit status 1; not expected)
      2020-05-12 11:32:08,231 INFO spawned: 'alertmanager' with pid 220
      2020-05-12 11:32:08,299 INFO spawned: 'grafana' with pid 226
      2020-05-12 11:32:08,321 INFO exited: alertmanager (exit status 1; not expected)
      2020-05-12 11:32:10,249 INFO success: grafana entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
      2020-05-12 11:32:12,337 INFO spawned: 'alertmanager' with pid 284
      2020-05-12 11:32:12,451 INFO exited: alertmanager (exit status 1; not expected)
      2020-05-12 11:32:13,260 INFO waiting for grafana to stop
      2020-05-12 11:32:13,266 INFO stopped: grafana (exit status 0)
      2020-05-12 11:32:18,354 INFO spawned: 'alertmanager' with pid 332
      2020-05-12 11:32:18,361 INFO spawned: 'grafana' with pid 333
      2020-05-12 11:32:18,437 INFO exited: alertmanager (exit status 1; not expected)
      2020-05-12 11:32:19,371 INFO success: grafana entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
      2020-05-12 11:32:24,834 INFO spawned: 'alertmanager' with pid 402
      2020-05-12 11:32:24,917 INFO exited: alertmanager (exit status 1; not expected)
      2020-05-12 11:32:30,896 INFO waiting for grafana to stop
      2020-05-12 11:32:30,899 INFO stopped: grafana (exit status 0)
      2020-05-12 11:32:32,226 INFO spawned: 'alertmanager' with pid 476
      2020-05-12 11:32:32,261 INFO exited: alertmanager (exit status 1; not expected)
      2020-05-12 11:32:35,993 INFO spawned: 'grafana' with pid 510
      2020-05-12 11:32:37,063 INFO success: grafana entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
      2020-05-12 11:32:37,087 INFO exited: dashboard-upgrade (exit status 0; expected)
      2020-05-12 11:32:40,647 INFO spawned: 'alertmanager' with pid 544
      2020-05-12 11:32:40,712 INFO exited: alertmanager (exit status 1; not expected)
      2020-05-12 11:32:49,724 INFO spawned: 'alertmanager' with pid 621
      2020-05-12 11:32:49,769 INFO exited: alertmanager (exit status 1; not expected)
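
      To make the restart loop easier to see, the supervisord output can be reduced to a count of unexpected exits per program. This is a minimal sketch that relies only on the line format shown in the log above; the function name `count_unexpected_exits` is hypothetical.

```shell
# Read supervisord log lines on stdin and print "<program> <count>" for every
# program that exited unexpectedly, sorted by program name.
count_unexpected_exits() {
    awk '
        # Line layout: <date> <time> INFO exited: <program> (exit status N; not expected)
        / INFO exited: / && /not expected/ {
            count[$5]++
        }
        END {
            for (program in count) print program, count[program]
        }
    ' | sort
}
```

      Usage: `docker logs pmm-nailya-20200512080852-13399-server 2>&1 | count_unexpected_exits`. Applied to the excerpt above, it reports 10 unexpected exits for alertmanager and 1 for qan-api2, which recovered on its second start.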
      

    People

    • Assignee: Unassigned
    • Reporter: Nailya Kutlubaeva (nailya.kutlubaeva)
    • Votes: 0
    • Watchers: 3

    Time Tracking

    • Original Estimate: Not Specified
    • Remaining Estimate: Not Specified
    • Time Spent: 2 days
