Percona Toolkit / PT-187

pt-table-checksum not able to get differences for columns

    Details

    • Type: Bug
    • Status: Done
    • Priority: Medium
    • Resolution: Won't Fix
    • Affects Version/s: 3.0.4
    • Fix Version/s: 3.0.5
    • Component/s: None
    • Labels:
      None

      Description

      I have an old case that I'm trying to get solved. pt-table-checksum is not detecting that the same table holds different values for some of the rows on the master and on the slave, although both copies have the same number of rows. The problem is not hard to reproduce, and I would like to know whether there is a better way to check the data for complete consistency beyond the number of rows (IMHO, we cannot really say a table is 100% consistent between master and slaves just because count() matches).

      Let's consider a simple example:

      box01 [foo]> select i,j from foo01;
      +---+------+
      | i | j    |
      +---+------+
      | 1 | a    |
      | 2 | b    |
      | 3 | c    |
      | 4 | d    |
      | 5 | e    |
      +---+------+
      5 rows in set (0.00 sec)
      
      box02 [foo]> select * from foo.foo01;
      +---+------+
      | i | j    |
      +---+------+
      | 1 | a    |
      | 2 | b    |
      | 3 | c    |
      | 4 | f    |
      | 5 | g    |
      +---+------+
      5 rows in set (0.00 sec)
      
      [root@box01 ~]# pt-table-checksum --recursion-method=dsn="h=192.168.50.11,D=bianchi,t=dsns" --replicate bianchi.checksums --no-check-binlog-format --no-version-check --no-check-replication-filters --databases foo --tables foo01 -c j
      TS             ERRORS DIFFS ROWS CHUNKS SKIPPED TIME  TABLE
      08-04T20:53:34      0     0    5      1       0 0.016 foo.foo01

       

      Above, I used -c j to force the checksum onto that comma-separated list of columns only, and --databases/--tables to limit the run, since my intention is to check just this table and no other.

      By the way, after running the simple checksum above, I checked the checksums table and ...

      box02 [foo]> select db,tbl,this_cnt,master_cnt,this_crc,master_crc from bianchi.checksums\G
      *************************** 1. row ***************************
      db: foo
      tbl: foo01
      this_cnt: 5
      master_cnt: 5
      this_crc: 2e243fc8
      master_crc: 2e243fc8
      1 row in set (0.00 sec)
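
      For reference, the comparison that decides whether a chunk counts as a diff can be sketched as below. This is only an illustration, not the tool's own code: it uses an in-memory SQLite table as a hypothetical stand-in for bianchi.checksums, loaded with the single row shown above, and the names DIFF_QUERY and differing_tables are made up for the example. The predicate flags a chunk when the row counts or the CRCs disagree.

```python
import sqlite3

# Hypothetical stand-in for bianchi.checksums (SQLite, not MySQL),
# holding only the row shown in the output above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE checksums (db TEXT, tbl TEXT, this_cnt INTEGER, "
    "master_cnt INTEGER, this_crc TEXT, master_crc TEXT)"
)
conn.execute(
    "INSERT INTO checksums VALUES (?, ?, ?, ?, ?, ?)",
    ("foo", "foo01", 5, 5, "2e243fc8", "2e243fc8"),
)

# A chunk is reported when counts differ, CRCs differ,
# or exactly one of the two CRCs is NULL.
DIFF_QUERY = """
SELECT db, tbl FROM checksums
WHERE master_cnt <> this_cnt
   OR master_crc <> this_crc
   OR (master_crc IS NULL) <> (this_crc IS NULL)
"""


def differing_tables(connection):
    """Return (db, tbl) pairs whose chunk rows indicate a difference."""
    return connection.execute(DIFF_QUERY).fetchall()


print(differing_tables(conn))  # prints []
```

      With identical counts and identical CRCs, as recorded here, nothing is flagged, which matches the DIFFS value of 0 in the report.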

       

      Anything to share here on this case folks? Thanks a lot!!

      PS: of course, if I detect that this is happening, a good solution would be to work with the customer to determine which dataset is the most accurate and agree on whether to merge the data or simply use pt-table-sync --sync-to-master. Still, if pt-table-checksum were able to catch this at some point, e.g. by comparing a CRC32 on a per-row basis, that would be good stuff as well.
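
      To make the per-row CRC32 idea concrete, here is a minimal sketch using the two result sets from the example. It is an assumption about what such a check could look like, not how pt-table-checksum works internally; the function names row_crc and differing_keys are hypothetical, and the rows are hard-coded rather than fetched from box01 and box02.

```python
import zlib


def row_crc(row):
    """CRC32 of one row; columns are joined with '#' so that
    ('ab', 'c') and ('a', 'bc') do not produce the same payload.
    Note: NULL (None) and the empty string hash the same here."""
    payload = "#".join("" if value is None else str(value) for value in row)
    return zlib.crc32(payload.encode("utf-8"))


def differing_keys(master_rows, replica_rows):
    """Return the sorted keys (first column) whose row CRC differs,
    including keys present on only one side."""
    master = {row[0]: row_crc(row) for row in master_rows}
    replica = {row[0]: row_crc(row) for row in replica_rows}
    all_keys = master.keys() | replica.keys()
    return sorted(k for k in all_keys if master.get(k) != replica.get(k))


# Rows copied from the SELECT output on box01 (master) and box02 (slave).
box01 = [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e")]
box02 = [(1, "a"), (2, "b"), (3, "c"), (4, "f"), (5, "g")]

print(differing_keys(box01, box02))  # prints [4, 5]
```

      Both sides have five rows, yet rows 4 and 5 are reported, which is the kind of result the row count alone cannot give.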

              People

              • Assignee: Carlos Salguero (carlos.salguero)
              • Reporter: Carlos Salguero (carlos.salguero)
              • Reviewer: Jericho Rivera