Details
Bug
Status: Done
High
Resolution: Fixed
5.7.28-31.41
None
None
Description
The whole cluster is effectively blocked for writes when a DDL query fails to trigger a brute force abort. This happens when thread_handling = pool-of-threads is used.
As a result, the DDL statement waits for a long time on a metadata lock (MDL), which should never happen in PXC/Galera because DDL is handled with priority. For example:
Id: 9
User: msandbox
Host: localhost
db: test
Command: Query
Time: 332
State: Waiting for table metadata lock
Info: truncate table t1
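A session stuck like the one above can also be found with a query rather than by reading the full process list. The following is a minimal sketch, assuming the standard information_schema.processlist table available in MySQL 5.7 based servers; the filter string is the State value shown above.

```sql
-- List sessions that are currently waiting on a metadata lock,
-- longest wait first. On a healthy PXC node a DDL statement should
-- not stay in this state for long.
SELECT id, user, host, db, command, time, state, info
FROM information_schema.processlist
WHERE state = 'Waiting for table metadata lock'
ORDER BY time DESC;
```

A row whose time value keeps growing across repeated runs would match the symptom described here.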
An example debug-level log from the affected node is attached.
How to reproduce
Use a PXC cluster node configured with thread_handling = pool-of-threads.
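Before running the steps, it may help to confirm that the thread pool is actually active on the node. A minimal sketch, assuming a client session on the node under test:

```sql
-- thread_handling is a startup option set in the server configuration;
-- this statement only reads the current value. For this scenario it
-- should report pool-of-threads.
SHOW GLOBAL VARIABLES LIKE 'thread_handling';
```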
In session one start a transaction on a simple table:
node1 [localhost:26529] {msandbox} (test) > begin; select * from t1;
Query OK, 0 rows affected (0.00 sec)
Empty set (0.00 sec)
In session two on the same node, run a DDL statement on the same table, for example:
node1 [localhost:26529] {msandbox} (test) > truncate table t1;
Confirmed on PXC 5.7.26, 5.7.27, and 5.7.28.