Details
-
Bug
-
Status: Done
-
Medium
-
Resolution: Fixed
-
8.0.19-10, 5.7.30-33
-
None
-
None
Description
Originally reported at https://bugs.mysql.com/bug.php?id=99440
Description
Multi-threaded slave randomly stacks, fails with lock wait timeout, or asserts.
How to Repeat
1. Start replication setup using MTR:
$ ./mtr --suite=rpl --start --mysqld=--binlog_format=mixed --mysqld=--slave_preserve_commit_order=1 --mysqld=--slave_parallel_workers=16 --mysqld=--slave_parallel_type=logical_clock --mysqld=--binlog_order_commits=1 --mysqld=--log_slave_updates=1 rpl_alter &
2. Connect to the slave and run:
set global innodb_lock_wait_timeout=1; CHANGE MASTER TO master_host='127.0.0.1', master_port=13000, master_user='root'; start slave;
3. On the master:
CREATE TABLE `table_name` ( `id` bigint(20) NOT NULL AUTO_INCREMENT, `a_id` bigint(20) NOT NULL, `b_id` bigint(20) NOT NULL, `c_id` smallint(5) NOT NULL, `d_id` varchar(255) NOT NULL DEFAULT '', `created_at` datetime NOT NULL, PRIMARY KEY (`id`), UNIQUE KEY `unique_table_name` (`a_id`,`b_id`), KEY `index_b` (`b_id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='table definition';
4. Run mysqlslap job:
$ ../bin/mysqlslap -uroot -h127.0.0.1 -P13000 --query=' INSERT IGNORE INTO table_name (d_id, b_id, c_id, a_id) VALUES ('20123', 1234203, 3, 12345212);' --create-schema=test --iterations=100 --concurrency=100
5. On the slave observe:
5.1. Slowly moving Exec_Master_Log_Pos in the SHOW SLAVE STATUS output
5.2. Long-waiting threads for dependent or preceding transaction
5.3. After some time slave error:
Last_SQL_Errno: 3030 Last_SQL_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 4 failed executing transaction 'ANONYMOUS' at master log master-bin.000001, end_log_pos 14803. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
5.4. Then restart slave (if you are not lucky - multiple times) and observe the crash:
2020-07-21T14:22:45.507956Z 56 [Warning] [MY-010584] [Repl] Slave SQL for channel '': Worker 10 failed executing transaction 'ANONYMOUS' at master log master-bin.000001, end_log_pos 30875; Error 'Lock wait timeout exceeded; try restarting transaction' on query. Default database: 'test'. Query: 'INSERT IGNORE INTO table_name (d_id, b_id, c_id, a_id) VALUES (20123, 1234203, 3, 12345212)', Error_code: MY-001205 mysqld: /home/sveta/src/percona-server/sql/sql_error.h:381: uint Diagnostics_area::mysql_errno() const: Assertion `m_status == DA_ERROR' failed. 14:22:45 UTC - mysqld got signal 6 ; Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware. Thread pointer: 0x7f7bb40008d0 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... stack_bottom = 7f7c583cfcd0 thread_stack 0x46000 /home/sveta/build/ps-8.0/bin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x59) [0x56520ac5ce7f] /home/sveta/build/ps-8.0/bin/mysqld(handle_fatal_signal+0x2ac) [0x565209a2351a] /lib/x86_64-linux-gnu/libpthread.so.0(+0x153c0) [0x7f7c9579e3c0] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb) [0x7f7c94e2d18b] /lib/x86_64-linux-gnu/libc.so.6(abort+0x12b) [0x7f7c94e0c859] /lib/x86_64-linux-gnu/libc.so.6(+0x25729) [0x7f7c94e0c729] /lib/x86_64-linux-gnu/libc.so.6(+0x36f36) [0x7f7c94e1df36] /home/sveta/build/ps-8.0/bin/mysqld(Diagnostics_area::mysql_errno() const+0x3e) [0x5652093afa50] /home/sveta/build/ps-8.0/bin/mysqld(Slave_worker::check_and_report_end_of_retries(THD*)+0x1fa) [0x56520a906b14] /home/sveta/build/ps-8.0/bin/mysqld(Slave_worker::retry_transaction(unsigned int, unsigned long long, unsigned int, unsigned long long)+0xc3) [0x56520a906c7b] /home/sveta/build/ps-8.0/bin/mysqld(slave_worker_exec_job_group(Slave_worker*, Relay_log_info*)+0x4c9) [0x56520a9091ea] /home/sveta/build/ps-8.0/bin/mysqld(+0x462d60d) [0x56520a92360d] /home/sveta/build/ps-8.0/bin/mysqld(+0x51495a1) [0x56520b43f5a1] /lib/x86_64-linux-gnu/libpthread.so.0(+0x9609) [0x7f7c95792609] /lib/x86_64-linux-gnu/libc.so.6(clone+0x43) [0x7f7c94f09103] Trying to get some variables. Some pointers may be invalid and cause the dump to abort. Query (0): Connection ID (thread ID): 48 Status: NOT_KILLED
Attachments
Issue Links
- is blocked by
-
PS-7147 Modified Relay_log_info::cannot_safely_rollback() to handle null pointer
-
- Done
-