Percona XtraDB Cluster / PXC-512

Replication slave from PXC can crash if wsrep_forced_binlog_format=ROW

Details

    • Type: Bug
    • Status: Done
    • Priority: Medium
    • Resolution: Fixed
    • Affects versions: 5.5.31-23.7.5, Not 8.0.x, 5.7.35-31.53 (Q3 2021)
    • Fix version: 5.7.40-31.63 (Q4 2022)
    • None

    Description

      LP #1214465
      https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1214465

      =============
      Unfortunately, I didn't have much time to dig into this any deeper.
      I have the following setup:

      • 3 nodes in a PXC setup (server names "db1", "db2", "db3"); the version is the same on all PXC nodes.
        Server version: 5.5.31-23.7.5 Percona XtraDB Cluster (GPL) 5.5.31-23.7.5, Revision 438, wsrep_23.7.5.r3880
      • 1 replication slave (server name "s1"), which uses node db3 as its master.
        Server version: 5.5.32-rel31.0-log Percona Server with XtraDB (GPL), Release rel31.0, Revision 549

      db3 has the following option enabled:
      wsrep_forced_binlog_format='row'
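
      To confirm which nodes actually have the option in effect, the variables can be queried directly. This is only a sketch for checking the setup described above; it would be run on db3 and, for comparison, on db1 and db2.

      ```sql
      -- Run on each PXC node (db1, db2, db3) and compare the results.
      -- wsrep_forced_binlog_format overrides binlog_format when it is not NONE.
      SHOW GLOBAL VARIABLES LIKE 'wsrep_forced_binlog_format';
      SHOW GLOBAL VARIABLES LIKE 'binlog_format';
      ```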

      Now let's create a test procedure on one of the nodes in our PXC:

      mysql> delimiter //
      mysql> create procedure test_proc() BEGIN INSERT INTO test VALUES(1);end//
      Query OK, 0 rows affected (0.01 sec)
      mysql> delimiter ;

      Then go to the s1 server and check the slave status. Only the relevant parts of the output are shown:

      mysql> show slave status\G
      ...
      Slave_IO_Running: Yes
      Slave_SQL_Running: No
      ...
      Last_Error: Could not execute Write_rows event on table mysql.proc; Duplicate entry 'test-test_proc-PROCEDURE' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log db3-bin.000018, end_log_pos 2687
      ...
      Last_SQL_Error: Could not execute Write_rows event on table mysql.proc; Duplicate entry 'test-test_proc-PROCEDURE' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log db3-bin.000018, end_log_pos 2687
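
      To see what the Write_rows event collides with, it may help to look at the row s1 already holds and at the events db3 wrote. The sketch below assumes the procedure lives in the `test` schema, as the duplicate key value 'test-test_proc-PROCEDURE' suggests; the binary log name is taken from the error message.

      ```sql
      -- On s1: the existing row that the Write_rows event conflicts with.
      -- The primary key of mysql.proc is (db, name, type).
      SELECT db, name, type, created, modified
        FROM mysql.proc
       WHERE db = 'test' AND name = 'test_proc' AND type = 'PROCEDURE';

      -- On db3: list the events in the binary log named in the error,
      -- then look for the entries ending at end_log_pos 2687.
      SHOW BINLOG EVENTS IN 'db3-bin.000018';
      ```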

      That's it: replication on our slave (s1) is now broken. Skipping the failed event lets replication resume.
      It looks like PXC tried to create this procedure twice, and the second attempt failed on s1 because of a duplicate record in mysql.proc (a primary key violation).
      I didn't find any duplicate records in the binary log, and the commands counter increased by 1 as expected, so the problem is probably somewhere in the Galera library.
      Dropping this procedure leads to the same issue.
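
      For reference, skipping the failed event on s1 could look like the following. This is a minimal sketch using the standard MySQL 5.5 replication statements; it assumes the procedure already exists on s1 with the correct definition, so that nothing is lost by skipping.

      ```sql
      -- On s1: skip one event group from the master and resume replication.
      STOP SLAVE;
      SET GLOBAL sql_slave_skip_counter = 1;
      START SLAVE;

      -- Verify that Slave_SQL_Running is back to Yes (\G is a mysql client terminator).
      SHOW SLAVE STATUS\G
      ```

      Skipping only clears the current error; the next CREATE or DROP PROCEDURE on the cluster would be expected to stop the slave again while db3 still forces ROW format.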

      This issue can be avoided by setting wsrep_forced_binlog_format to NONE (or STATEMENT).
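
      A sketch of applying that setting on db3 is shown below. It assumes the variable is dynamic in this build, as the PXC documentation describes; if it is not, the same value would have to go into the [mysqld] section of my.cnf, followed by a node restart.

      ```sql
      -- On db3 (the node s1 replicates from): stop forcing ROW format.
      SET GLOBAL wsrep_forced_binlog_format = 'NONE';

      -- Confirm the new value.
      SHOW GLOBAL VARIABLES LIKE 'wsrep_forced_binlog_format';

      -- To keep the setting across restarts, also set in my.cnf under [mysqld]:
      --   wsrep_forced_binlog_format=NONE
      ```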

          People

            amonar Anton Matvienko
            kenn.takara Kenn Takara (Inactive)