Details
-
Bug
-
Status: Done
-
High
-
Resolution: Fixed
-
2.22.0, 2.23.0
-
1
-
Yes
-
Yes
-
[obsolete] Server Features
Description
Issue:
--------
After upgrading to 2.22, the pmm-managed component crashes with the following panic:
panic: interface conversion: agentpb.AgentResponsePayload is nil, not *agentpb.GetVersionsResponse
goroutine 268 [running]:
github.com/percona/pmm-managed/services/agents.(*VersionerService).GetVersions(0xc0000bc550, 0xc000664600, 0x2e, 0xc000bd0380, 0x4, 0x4, 0x0, 0x0, 0x0, 0x0, ...)
/home/builder/rpm/BUILD/pmm-managed-1ef5c4435934798165ff937e2f4c916b4329558a/src/github.com/percona/pmm-managed/services/agents/versioner.go:122 +0x5c9
github.com/percona/pmm-managed/services/versioncache.(*Service).updateVersionsForNextService(0xc000313bf0, 0xc000000004, 0x1639d1b, 0x14)
/home/builder/rpm/BUILD/pmm-managed-1ef5c4435934798165ff937e2f4c916b4329558a/src/github.com/percona/pmm-managed/services/versioncache/versioncache.go:164 +0x137
github.com/percona/pmm-managed/services/versioncache.(*Service).Run(0xc000313bf0, 0x18897d8, 0xc00062aae0)
/home/builder/rpm/BUILD/pmm-managed-1ef5c4435934798165ff937e2f4c916b4329558a/src/github.com/percona/pmm-managed/services/versioncache/versioncache.go:235 +0x308
main.main.func13(0xc000756c40, 0xc000313bf0, 0x18897d8, 0xc00062aae0)
/home/builder/rpm/BUILD/pmm-managed-1ef5c4435934798165ff937e2f4c916b4329558a/src/github.com/percona/pmm-managed/main.go:831 +0x6b
created by main.main
/home/builder/rpm/BUILD/pmm-managed-1ef5c4435934798165ff937e2f4c916b4329558a/src/github.com/percona/pmm-managed/main.go:829 +0x3bdd
The panic recurs every 4 hours, and each occurrence crashes pmm-managed.
Cause:
----------
The service version check functionality was added in 2.22 (PMM-8460).
The code does not check the response for nil before the type assertion:
response, err := agent.channel.SendAndWaitResponse(request)
if err != nil {
    return nil, errors.WithStack(err)
}
versionsResponse := response.(*agentpb.GetVersionsResponse).Versions
if len(versionsResponse) != len(softwaresRequest) {
    return nil, errors.Errorf("response and request slice length mismatch %d != %d", len(versionsResponse), len(softwaresRequest))
}
The problem is that the response can be nil when communication fails, or when a remote pmm-agent runs an older version that does not recognize this kind of request. The unchecked type assertion then panics.
Possible solution:
------------------------
Add a simple "if response == nil" check before the type assertion so that pmm-managed handles this case properly instead of crashing. For example, it could log an ERROR/WARNING message and continue.