Uploaded image for project: 'Percona Toolkit'
  1. Percona Toolkit
  2. PT-2112

pt-archiver specifying secondary index adds all columns in purge operation

Details

    • Bug
    • Status: Open
    • Medium
    • Resolution: Unresolved
    • None
    • None
    • None
    • None
    • Yes
    • Yes

    Description

      Hello team, I noticed that this issue was actually reported on ticket PT-157 and marked fixed on version 3.0.13. However, having version 3.4.0 I still see the issue

       

      [email protected]:~$ pt-archiver --version
      pt-archiver 3.4.0 

      This is the table definition

       

       

      mysql> show create table dbtest.sbtest1\G
      *************************** 1. row ***************************
             Table: sbtest1
      Create Table: CREATE TABLE `sbtest1` (
        `id` int(11) NOT NULL AUTO_INCREMENT,
        `k` int(11) NOT NULL DEFAULT '0',
        `c` char(120) NOT NULL DEFAULT '',
        `pad` char(60) NOT NULL DEFAULT '',
        PRIMARY KEY (`id`),
        KEY `k_1` (`k`)
      ) ENGINE=InnoDB AUTO_INCREMENT=4000001 DEFAULT CHARSET=latin1
      1 row in set (0.00 sec) 

      I executed the following, please notice the secondary index usage

       

      pt-archiver \
      --source=h=localhost,P=3306,D=dbtest,t=sbtest1,b=0,i=k_1 \
      --where='k < 2000000' --limit=1000 \
      --primary-key-only --bulk-delete --dry-run --purge 

      This is what got generated, it ignored the seconday index and forced the primary index

       

       

      SELECT /*!40001 SQL_NO_CACHE */ `id` FROM `dbtest`.`sbtest1` FORCE INDEX(`PRIMARY`) WHERE (k < 2000000) AND (`id` < '4000000') ORDER BY `id` LIMIT 1000
      SELECT /*!40001 SQL_NO_CACHE */ `id` FROM `dbtest`.`sbtest1` FORCE INDEX(`PRIMARY`) WHERE (k < 2000000) AND (`id` < '4000000') AND ((`id` >= ?)) ORDER BY `id` LIMIT 1000
      DELETE FROM `dbtest`.`sbtest1` WHERE (((`id` >= ?))) AND (((`id` <= ?))) AND (k < 2000000) LIMIT 1000
      [email protected]:~$ pt-archiver --version 

      I changed the command, by specifiying the id column and removing the --primary-key-only option

       

       

      pt-archiver \
      --source=h=localhost,P=3306,D=dbtest,t=sbtest1,b=0,i=k_1 \
      --where='k < 2000000' --limit=1000 \
      --columns=id --bulk-delete --dry-run --purge 

      This is the result, it ignored the --columns option, but at least it did used the secondary index

       

       

      SELECT /*!40001 SQL_NO_CACHE */ `id`,`k`,`c`,`pad` FROM `dbtest`.`sbtest1` FORCE INDEX(`k_1`) WHERE (k < 2000000) ORDER BY `k` LIMIT 1000
      SELECT /*!40001 SQL_NO_CACHE */ `id`,`k`,`c`,`pad` FROM `dbtest`.`sbtest1` FORCE INDEX(`k_1`) WHERE (k < 2000000) AND ((`k` >= ?)) ORDER BY `k` LIMIT 1000
      DELETE FROM `dbtest`.`sbtest1` WHERE (((`k` >= ?))) AND (((`k` <= ?))) AND (k < 2000000) LIMIT 1000 

      I believe that the original fix was to just add the --primary-key-only option, and it would just include the id column on the select clause and respect the defined secondary index.

       

       

       

      Attachments

        Activity

          People

            Unassigned Unassigned
            michael.villegas Michael Villegas
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:

              Smart Checklist