Description
Launchpad: https://bugs.launchpad.net/percona-toolkit/+bug/1016272
We're using pt-kill with the following options:
pt-kill --busy-time 120s --print --match-info "^(select|SELECT)" --interval 10
The idea is to only kill select queries running longer than 120 seconds.
However, there's a problem with the logic that pt-kill uses to kill queries. If the Command is not equal to Query, then busy time is ignored, and the other match parameters are free to take effect.
In this case, a prepared statement is in the "Execute" Command state with a SELECT in the Info column, so it is killed immediately by pt-kill every time.
The current assumption is that only items in the processlist with Command=Query are "busy", but that isn't the case.
I think the right fix is to say that any query which is not idle is busy.
That is, change the current busy time check from:
if ( $find_spec{busy_time} && ($query->{Command} || '') eq 'Query' ) {
to
if ( $find_spec{busy_time} && ($query->{Command} || '') ne 'Sleep' ) {
However, I understand that this isn't a trivial change. The smallest change which would fix this issue is to add Execute to the if statement as well as Query, but that seems like it could leave the script vulnerable to similar issues.
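To make the difference between the three options concrete, here is a minimal Python sketch of the behaviour described above. It is not pt-kill's code: the function would_kill, the BUSY_RULES table and the sample processlist row are all hypothetical, and the model assumes (as described in this report) that a row which is not considered busy skips the busy-time check and is then evaluated by --match-info alone.

```python
import re

# Three candidate definitions of "busy". The labels are used only in this sketch.
BUSY_RULES = {
    "current": lambda cmd: cmd == "Query",               # existing check
    "minimal": lambda cmd: cmd in ("Query", "Execute"),  # smallest possible change
    "proposed": lambda cmd: cmd != "Sleep",              # anything not idle is busy
}

def would_kill(row, rule, busy_time=120, match_info=r"^(select|SELECT)"):
    """Return True if this simplified model would kill the processlist row."""
    cmd = row.get("Command") or ""
    # The busy-time gate only applies to rows the chosen rule counts as busy.
    if busy_time and BUSY_RULES[rule](cmd):
        if row["Time"] < busy_time:
            return False
    # Rows that skip or pass the gate are still subject to --match-info.
    return re.search(match_info, row.get("Info") or "") is not None

# A prepared statement that has been running a SELECT for 3 seconds.
prepared = {"Command": "Execute", "Time": 3, "Info": "SELECT * FROM t WHERE id = 1"}

for rule in BUSY_RULES:
    print(rule, would_kill(prepared, rule))
# current True
# minimal False
# proposed False
```

Under these assumptions only the current rule kills the 3-second prepared statement. The minimal and proposed rules both spare it, but the proposed rule would also cover any other non-Sleep Command value without further changes.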
The MySQL documentation seems to support this view: many threads whose Command is not Query should still be considered busy: http://dev.mysql.com/doc/refman/5.1/en/thread-commands.html
Issue Links
- relates to PT-1492: pt-kill in version 3.0.7 seems not to respect busy-time any longer (Done)