Operator cannot clean replication's failover sources if replications have been stopped

Description

When having data in

 

But without any running replication

 

 

 

 

Operator fails with

 

It is failing in: https://github.com/percona/percona-xtradb-cluster-operator/blob/v1.12.0/pkg/controller/pxc/replication.go#L305-L348

It queries the existing channels with "db.CurrentReplicationChannels()", which reads the mysql.replication_asynchronous_connection_failover table.

It then executes STOP SLAVE for each channel, but never checks beforehand whether that channel is running.

 

Suggested fix, either:

  • use "SHOW SLAVE STATUS" to get the existing channels (replication_asynchronous_connection_failover is currently queried twice in the same function: once to list the channels and again before removing the replication source)

  • handle the case where the channel is already stopped or does not exist, so that the cleanup is idempotent
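
The second option could look roughly like the sketch below. It is only an illustration, not the operator's actual code: `replicaOps`, `cleanupFailoverSources` and `ErrChannelNotFound` are hypothetical names, and the sentinel error stands in for whatever error the server really returns when the channel is already stopped or missing.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrChannelNotFound is a hypothetical sentinel standing in for the error
// reported when a channel is already stopped or does not exist.
var ErrChannelNotFound = errors.New("replication channel not found or not running")

// replicaOps is a hypothetical set of database operations used only for this
// sketch; it is not the operator's real API.
type replicaOps struct {
	ListFailoverChannels  func() ([]string, error)
	StopReplication       func(channel string) error
	RemoveFailoverSources func(channel string) error
}

// cleanupFailoverSources stops each channel and removes its failover sources.
// A channel that is already stopped or missing is not treated as a failure,
// so running the cleanup repeatedly gives the same result.
func cleanupFailoverSources(ops replicaOps) error {
	channels, err := ops.ListFailoverChannels()
	if err != nil {
		return fmt.Errorf("list failover channels: %w", err)
	}
	for _, ch := range channels {
		err := ops.StopReplication(ch)
		if err != nil && !errors.Is(err, ErrChannelNotFound) {
			return fmt.Errorf("stop replication for channel %q: %w", ch, err)
		}
		if err := ops.RemoveFailoverSources(ch); err != nil {
			return fmt.Errorf("remove failover sources for channel %q: %w", ch, err)
		}
	}
	return nil
}
```

With this shape, only the "already stopped / non-existent" case is swallowed; any other error from stopping replication still aborts the cleanup and is reported with the channel name.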

 

 

Environment

None

AFFECTED CS IDs

CS0037089

Activity

Slava Sarzhan December 7, 2023 at 5:06 PM

The issue was fixed.

Done

Details

Needs QA

Yes

Created June 12, 2023 at 1:52 PM
Updated March 5, 2024 at 5:25 PM
Resolved February 1, 2024 at 4:42 PM