Uploaded image for project: 'Percona Toolkit'
  1. Percona Toolkit
  2. PT-1978

Add reporting on unused/redundant indexes for MongoDB by pt-mongodb-summary

Details

    Description

      User story:
      As a PT user, I need to have information about unused and redundant indexes in the MongoDB server I'm analyzing with pt-mongodb-summary so that I'm aware of such indexes and can do future optimizations

      UI/UX:
      N/A

      Acceptance criteria

      • by running pt-mongodb-summary against MongoDB server user provided an additional information

      Out of scope:

      • an additional flag for the data

      Suggested implementation:
      TBD

      How to test

      Set up

      1. Connect to the MongoDB primary in your cluster

      mongo 127.0.0.1:17001/admin
      

      2. Create a database, a collection and insert some documents

      for (var i=0; i< 100;i++) { db.getSiblingDB("testdb").test_col.insert({f1:i, f2: _rand()*100, f3: _rand()*300, f4: _rand()*300, f5: (Math.random()+1).toString(36).substring(2) }); }
      

      3. Check there are documents (optional)

      db.getSiblingDB("testdb").test_col.find().count()
      100
      

      4. Create some indexes

      db.getSiblingDB("testdb").test_col.createIndex({"f1": 1, "f2": -1, "f3": 1, "f4": 1}, {"name": "idx_01"});
      db.getSiblingDB("testdb").test_col.createIndex({"f1": 1, "f2": -1, "f3": 1}, {"name": "idx_02"});
      db.getSiblingDB("testdb").test_col.createIndex({"f1": 1, "f2": -1}, {"name": "idx_03"});
      db.getSiblingDB("testdb").test_col.createIndex({"f3": -1}, {"name": "idx_04"});
      db.getSiblingDB("testdb").test_col.createIndex({"f4": -1}, {"name": "idx_05"});
      db.getSiblingDB("testdb").test_col.createIndex({"f5": -1}, {"name": "idx_06"});
      

      5. Run a query to make use of one of the created indexes

      db.getSiblingDB("testdb").test_col.find({"f1": {"$gte": 50}});
      

      Test

      Run the program to get a list of unused and duplicated indexes
      In the example I am using the check-all command but you can use check-unused or check-duplicates commands.

      ./pt-mongodb-index-check --mongodb.uri=mongodb://127.0.0.1:17001 --databases=testdb --all-collections check-all
      
      Duplicated indexes
      
      testdb.test_col, index 'idx_03', with fields {f1:1, f2:-1} is the prefix of 'idx_02' with fields {f1:1, f2:-1, f3:1}
      testdb.test_col, index 'idx_03', with fields {f1:1, f2:-1} is the prefix of 'idx_01' with fields {f1:1, f2:-1, f3:1, f4:1}
      testdb.test_col, index 'idx_02', with fields {f1:1, f2:-1, f3:1} is the prefix of 'idx_01' with fields {f1:1, f2:-1, f3:1, f4:1}
      
      Unused indexes since last restart
      
      testdb.test_col, index 'idx_02' with fields {f1:1 }, f2:-1 }, f3:1 }
      testdb.test_col, index 'idx_03' with fields {f1:1 }, f2:-1 }
      testdb.test_col, index 'idx_05' with fields {f4:-1 }
      testdb.test_col, index 'idx_04' with fields {f3:-1 }
      testdb.test_col, index 'idx_06' with fields {f5:-1 }
      

      Notice that the index idx_01 doesn't appear on the unused index list because we ran a query to make use of it.

      Details:
      See an attached scripts

      Attachments

        Activity

          People

            carlos.salguero Carlos Salguero (Inactive)
            matthew.boehm@percona.com Matthew Boehm
            Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Smart Checklist