Percona XtraDB Cluster / PXC-2023

LP #1741818: Missing data when using 'load data' in PXC5.7 leads to data inconsistency

Details

    • Type: Bug
    • Status: Done
    • Priority: Low
    • Resolution: Fixed

    Description

      Reported in Launchpad by Peter Zhao, last updated 08-01-2018 05:03:58

      Environment:
      Server version: 5.7.19-17-log Percona Server (GPL), wsrep_29.22
      OS version: CentOS 6.7

      We found an issue when using Percona XtraDB Cluster 5.7.19: there is a chance that a LOAD DATA operation leads to data inconsistency in PXC 5.7.

      After upgrading from PXC 5.6 to PXC 5.7, we found that instances of the cluster crashed frequently. According to the error log, the crash was caused by a DELETE statement. We investigated this issue for several days and finally found the cause.

      In our practice, we built a Percona XtraDB Cluster of three instances using GTID replication. We use one instance for writing/reading and the two other instances for reading. Besides, every morning a crontab task executes several shell scripts that load data from text files and run transactions to update several tables on one instance (a sketch of this job follows).
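
      A minimal sketch of the kind of nightly job we run; the schedule, paths,
      database, and table names below are illustrative stand-ins, not our real
      setup:

        # crontab entry on the writer node: run the import at 06:00
        # 0 6 * * * /usr/local/bin/nightly_load.sh

        #!/bin/bash
        # nightly_load.sh: bulk-load a text file, then update related tables
        mysql --local-infile=1 mydb -e "
          LOAD DATA LOCAL INFILE '/data/feed.txt'
          INTO TABLE t1 FIELDS TERMINATED BY ',';"
        mysql mydb -e "
          START TRANSACTION;
          UPDATE t2 SET refreshed_at = NOW();
          COMMIT;"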

      We are aware that PXC splits one LOAD DATA operation into several batches of up to 10,000 records each. But we found that when loading data from a file on disk, if the file is large enough that the LOAD DATA operation spans two or more binary logs, data goes missing: some records are not replicated to the slave instances. Besides, we found that in the binary logs such transactions have no XID attribute.
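
      One way to observe this is to inspect the binary log that the load
      spilled into; the binlog file name below is a placeholder. A committed
      transaction's events normally end with an Xid event, which is exactly
      what we found missing for these transactions:

        # Decode the row events and look at the transaction boundaries;
        # a healthy committed transaction ends with "Xid = <n>" / COMMIT.
        mysqlbinlog --base64-output=decode-rows -vv mysql-bin.000042 \
          | grep -E 'GTID|Table_map|Xid|COMMIT'

        # The same check from the server:
        mysql -e "SHOW BINLOG EVENTS IN 'mysql-bin.000042' LIMIT 200;"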

      In our config we set enforce_gtid_consistency = 1 and gtid_mode = ON, and this issue occurs almost every day. But when we set enforce_gtid_consistency = 0 and gtid_mode = OFF, the issue no longer occurs.
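
      For reference, these are the relevant settings (shown as my.cnf comments)
      together with a quick way to verify the running values; the file layout
      is an assumption, the variable names are standard MySQL 5.7:

        # my.cnf, [mysqld] section, when the issue reproduces:
        #   gtid_mode                = ON
        #   enforce_gtid_consistency = 1
        # Verify what the server is actually running with:
        mysql -e "SELECT @@global.gtid_mode, @@global.enforce_gtid_consistency;"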

      I'll show you how to reproduce this bug.
      =============
      1. Create a Percona XtraDB Cluster of at least two instances using MySQL native GTID for replication.
      2. Prepare a text file for loading.
      3. Load data from the file and watch the size of the binary logs. If the load operation covers at least two binary logs, you can find the data inconsistency between the master and the other slaves: the slave instances will be missing some data. (See the sketch after this list.)
      =============
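
      A concrete sketch of these steps; the table, file path, row count, and
      slave hostname are made up for illustration, and shrinking
      max_binlog_size is just a trick to make the load cross a binlog boundary
      quickly:

        # 1. On the writer node, make binlogs small so one load spans several.
        mysql -e "SET GLOBAL max_binlog_size = 4194304;"  # 4 MB, test only

        # 2. Create a test table and a large input file.
        mysql test -e "CREATE TABLE IF NOT EXISTS t (id INT PRIMARY KEY, pad CHAR(64));"
        seq 1 2000000 | awk '{printf "%d,%064d\n", $1, $1}' > /tmp/big.csv

        # 3. Load the file on the writer, then compare writer and reader.
        mysql --local-infile=1 test -e "
          LOAD DATA LOCAL INFILE '/tmp/big.csv'
          INTO TABLE t FIELDS TERMINATED BY ',';"
        mysql           test -e "SELECT COUNT(*) FROM t; CHECKSUM TABLE t;"
        mysql -h slave1 test -e "SELECT COUNT(*) FROM t; CHECKSUM TABLE t;"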

      Although we have now turned off gtid_mode and the issue no longer happens, I don't think this workaround is good enough. Anyone else who meets this issue would be confused by it. I hope your team can review this issue and fix the bug in a later version.

            People

              krunal.bauskar Krunal Bauskar (Inactive)
              lpjirasync lpjirasync (Inactive)