- Type: Bug
- Status: Done
- Priority: High
- Resolution: Fixed
- Affects Version/s: 2.5.0, 2.6.0
- Fix Version/s: 2.6.1
- Component/s: Virtual Appliance
- Labels: None
- Story Points: 2
- Sprint: Platform Sprint 16
- Needs Review: Yes
- Needs QA: Yes
- Needs Doc: No
User Impact:
PMM Server UI is not available
STR:
- Install PMM 2.5.0
- Enable laboratory and testing repositories
- Upgrade to 2.6.0
- Log in to the PMM Server and reboot it
- Wait a few minutes
- Run supervisorctl status; at this point supervisord is still running
- Wait a few more minutes
Given result: supervisord is stopped
[admin@localhost ~]$ sudo supervisorctl status
unix:///var/run/supervisor/supervisor.sock no such file
[admin@localhost ~]$ sudo service supervisord start
Redirecting to /bin/systemctl start supervisord.service
Job for supervisord.service failed because a timeout was exceeded. See "systemctl status supervisord.service" and "journalctl -xe" for details.
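The manual check in the steps above (run supervisorctl status, wait, run it again) could be repeated automatically to catch the moment supervisord stops. The following is only a sketch: the function names, the number of checks, and the interval are hypothetical, and the "stopped" detection relies solely on the "no such file" message shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper: classify captured output of `supervisorctl status`.
# Assumption: a missing supervisor.sock ("no such file") means supervisord is stopped.
supervisord_state() {
    case "$1" in
        *"no such file"*) echo "stopped" ;;
        *) echo "running" ;;
    esac
}

# Hypothetical helper: poll supervisord after a reboot.
# $1: number of checks (default 10), $2: seconds between checks (default 30).
watch_supervisord() {
    checks="${1:-10}"
    interval="${2:-30}"
    i=0
    while [ "$i" -lt "$checks" ]; do
        out="$(sudo supervisorctl status 2>&1)"
        echo "$(date '+%H:%M:%S') supervisord is $(supervisord_state "$out")"
        sleep "$interval"
        i=$((i + 1))
    done
}
```

Running watch_supervisord 20 30 right after the reboot would print one timestamped line every 30 seconds for 10 minutes, which should show when the state changes from running to stopped.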
In the journal:
May 11 11:20:25 localhost crond[1568]: (CRON) INFO (Shutting down)
May 11 11:20:25 localhost supervisord[1559]: 2020-05-11 11:20:25,582 INFO exited: grafana (exit status 0; expected)
May 11 11:20:25 localhost supervisord[1559]: 2020-05-11 11:20:25,583 INFO exited: nginx (exit status 0; expected)
May 11 11:20:25 localhost supervisord[1559]: 2020-05-11 11:20:25,583 INFO exited: cron (exit status 0; expected)
May 11 11:20:25 localhost supervisord[1559]: 2020-05-11 11:20:25,583 INFO exited: alertmanager (exit status 0; expected)
May 11 11:20:25 localhost supervisord[1559]: 2020-05-11 11:20:25,583 INFO exited: pmm-agent (exit status 0; expected)
May 11 11:20:25 localhost supervisord[1559]: 2020-05-11 11:20:25,583 WARN received SIGTERM indicating exit request
May 11 11:20:25 localhost supervisord[1559]: 2020-05-11 11:20:25,583 INFO waiting for postgresql, prometheus, pmm-managed, qan-api2, clickhouse to die
May 11 11:20:25 localhost supervisord[1559]: 2020-05-11 11:20:25,591 INFO exited: pmm-managed (exit status 0; expected)
May 11 11:20:25 localhost supervisord[1559]: 2020-05-11 11:20:25,591 INFO exited: qan-api2 (exit status 0; expected)
May 11 11:20:25 localhost supervisord[1559]: 2020-05-11 11:20:25,595 INFO exited: prometheus (exit status 0; expected)
May 11 11:20:25 localhost supervisord[1559]: 2020-05-11 11:20:25,609 INFO exited: postgresql (exit status 0; expected)
May 11 11:20:25 localhost supervisord[1559]: 2020-05-11 11:20:25,830 INFO exited: clickhouse (exit status 0; expected)
May 11 11:20:25 localhost systemd[1]: Failed to start Process Monitoring and Control Daemon.
-- Subject: Unit supervisord.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit supervisord.service has failed.
--
-- The result is failed.
May 11 11:20:25 localhost systemd[1]: Unit supervisord.service entered failed state.
May 11 11:20:25 localhost systemd[1]: supervisord.service failed.
May 11 11:20:25 localhost polkitd[767]: Unregistered Authentication Agent for unix-process:1541:63960 (system bus name :1.23, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_US.UT
May 11 11:20:25 localhost sudo[1539]: pam_unix(sudo:session): session closed for user root
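Most of the journal excerpt above is routine "exited ... expected" noise; the informative lines are the SIGTERM received by supervisord and the systemd failure messages. A small filter along these lines might help when comparing journals from other affected appliances. The function name is hypothetical, and the patterns are taken only from the messages that appear in this excerpt.

```shell
#!/bin/sh
# Hypothetical helper: read journal text on stdin and keep only the lines that
# show supervisord being asked to exit or systemd reporting the unit as failed.
# Patterns are copied from the journal excerpt in this ticket.
supervisord_shutdown_lines() {
    grep -E 'received SIGTERM indicating exit request|Failed to start|entered failed state'
}
```

One possible invocation, assuming the unit is named supervisord.service as in the output above, is: sudo journalctl -u supervisord.service -b | supervisord_shutdown_lines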
The same problem occurs when upgrading from 2.4.0 to 2.5.0.
Logs collected after the upgrade are attached.
The same problem occurs on AMI after the upgrade.
Additional information: The problem appears only after the upgraded OVF/AMI is restarted.
- causes: PMM-5960 Supervisord restarts correctly after restart of PMM Server virtual appliances (Done)