Uploaded image for project: 'Percona Server for MySQL'
  1. Percona Server for MySQL
  2. PS-1297

LP #1099526: XtraDB does not scale well on 12 core system for specific all in memory SELECT

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Done
    • Priority: High
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None

      Description

      **Reported in Launchpad by Valerii Kravchuk last update 23-04-2014 15:42:22

      For the query joining two tables on a long character primary key we see that on 12 cores it does not scale even to 10 threads. Tested with recent 5.5.28-rel29.3:

      CREATE TABLE `incident` (
      `sys_id` char(32) NOT NULL DEFAULT '',
      `category` varchar(40) DEFAULT NULL,
      PRIMARY KEY (`sys_id`),
      KEY `incident_category` (`category`)
      ) ENGINE=InnoDB DEFAULT CHARSET=utf8;

      insert into incident(sys_id) values (rand()*1000000);
      insert into incident(sys_id) select rand()*1000000 from incident; – 13 times

      CREATE TABLE `task` (
      `sys_id` char(32) NOT NULL DEFAULT '',
      `u_root_cause` char(32) DEFAULT NULL,
      `u_business_impact_description` mediumtext,
      `u_business_impact_category` mediumtext,
      PRIMARY KEY (`sys_id`)
      ) ENGINE=InnoDB DEFAULT CHARSET=utf8;

      insert into task(sys_id) values(rand()*1000000);
      insert into task(sys_id) select rand()*1000000 from task; – 13 times

      insert into task(sys_id) select sys_id from incident limit 100;

      update incident set category=rand()*100000;

      show table status like 'task'\G
      show table status like 'incident'\G

      Problematic query is:

      select count, category from task inner join incident on task.sys_id=incident.sys_id group by incident.category;

      Here is what we get from mysqlslap (server was started with --no-defaults in this case, but all data fit well into memory and further tuning does NOT help to scale better it seems):

      [valerii.kravchuk@cisco1 Percona-Server-5.5.28-rel29.3-388.Linux.x86_64]$ bin/mysqlslap -uroot --concurrency=1 --create-schema=test --no-drop --number-of-queries=1000 --iterations=10 --query='select count, category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
      Benchmark
      Average number of seconds to run all queries: 35.766 seconds
      Minimum number of seconds to run all queries: 35.488 seconds
      Maximum number of seconds to run all queries: 35.828 seconds
      Number of clients running queries: 1
      Average number of queries per client: 1000

      [valerii.kravchuk@cisco1 Percona-Server-5.5.28-rel29.3-388.Linux.x86_64]$ bin/mysqlslap -uroot --concurrency=2 --create-schema=test --no-drop --number-of-queries=1000 --iterations=10 --query='select count, category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
      Benchmark
      Average number of seconds to run all queries: 22.655 seconds
      Minimum number of seconds to run all queries: 20.245 seconds
      Maximum number of seconds to run all queries: 23.943 seconds
      Number of clients running queries: 2
      Average number of queries per client: 500

      [valerii.kravchuk@cisco1 Percona-Server-5.5.28-rel29.3-388.Linux.x86_64]$ bin/mysqlslap -uroot --concurrency=4 --create-schema=test --no-drop --number-of-queries=1000 --iterations=10 --query='select count, category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
      Benchmark
      Average number of seconds to run all queries: 12.668 seconds
      Minimum number of seconds to run all queries: 10.320 seconds
      Maximum number of seconds to run all queries: 14.053 seconds
      Number of clients running queries: 4
      Average number of queries per client: 250

      [valerii.kravchuk@cisco1 Percona-Server-5.5.28-rel29.3-388.Linux.x86_64]$ bin/mysqlslap -uroot --concurrency=8 --create-schema=test --no-drop --number-of-queries=1000 --iterations=10 --query='select count, category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
      Benchmark
      Average number of seconds to run all queries: 9.055 seconds
      Minimum number of seconds to run all queries: 8.533 seconds
      Maximum number of seconds to run all queries: 9.676 seconds
      Number of clients running queries: 8
      Average number of queries per client: 125

      [valerii.kravchuk@cisco1 Percona-Server-5.5.28-rel29.3-388.Linux.x86_64]$ bin/mysqlslap -uroot --concurrency=10 --create-schema=test --no-drop --number-of-queries=1000 --iterations=10 --query='select count, category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
      Benchmark
      Average number of seconds to run all queries: 9.934 seconds
      Minimum number of seconds to run all queries: 9.503 seconds
      Maximum number of seconds to run all queries: 10.310 seconds
      Number of clients running queries: 10
      Average number of queries per client: 100

      Here total time to run 1000 SELECTs went up already.

      [valerii.kravchuk@cisco1 Percona-Server-5.5.28-rel29.3-388.Linux.x86_64]$ bin/mysqlslap -uroot --concurrency=12 --create-schema=test --no-drop --number-of-queries=1000 --iterations=10 --query='select count, category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
      Benchmark
      Average number of seconds to run all queries: 11.015 seconds
      Minimum number of seconds to run all queries: 10.520 seconds
      Maximum number of seconds to run all queries: 11.889 seconds
      Number of clients running queries: 12
      Average number of queries per client: 83

      At this moment we have the following waits:

      mysql> show engine innodb mutex;
      ------------------------------------------------

      Type Name Status

      ------------------------------------------------

      InnoDB &log_sys->mutex os_waits=1
      InnoDB &kernel_mutex os_waits=1
      InnoDB combined &block->mutex os_waits=118933
      InnoDB &dict_operation_lock os_waits=2
      InnoDB &log_sys->checkpoint_lock os_waits=113

      ------------------------------------------------
      5 rows in set (0.00 sec)

      So, it is all about combined &block->mutex it seems in case of Percona Server/XtraDB.

      Surely with more threads throughput degrades further:

      [valerii.kravchuk@cisco1 Percona-Server-5.5.28-rel29.3-388.Linux.x86_64]$ bin/mysqlslap -uroot --concurrency=24 --create-schema=test --no-drop --number-of-queries=1000 --iterations=10 --query='select count, category from task inner join incident on task.sys_id=incident.sys_id group by incident.category'
      Benchmark
      Average number of seconds to run all queries: 14.353 seconds
      Minimum number of seconds to run all queries: 13.935 seconds
      Maximum number of seconds to run all queries: 14.467 seconds
      Number of clients running queries: 24
      Average number of queries per client: 41

      Question is: why it does not scale even to 10 threads, if simpler scale to 24 and even 48 on this same box with these same tables?

      See upstream bug (there waits info is different), http://bugs.mysql.com/bug.php?id=68079, for more details (and dump to load for correct comparison).

        Smart Checklist

          Attachments

            Activity

              People

              Assignee:
              Unassigned
              Reporter:
              lpjirasync lpjirasync (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved: