Details
-
Bug
-
Status: Done
-
Low
-
Resolution: Fixed
-
3.0.13
-
None
Description
Reported in Launchpad by Paul Namuag, last update 11-05-2016 03:48:24
On a customer's system we ran a command identical to the one below:
pt-online-schema-change --execute --alter-foreign-keys-method auto --set-vars "tx_isolation='READ-COMMITTED'" --alter "CHANGE COLUMN employerIdentificationNumber employerIdentificationNumber VARCHAR(50) NULL DEFAULT NULL" --critical-load Threads_running=200 --max-load Threads_running=200 h=10.5.1.1,D=somedb,t=sometable
and we encountered this error:
Use of uninitialized value $host in string eq at /usr/bin/pt-online-schema-change line 4245.
Running the tool with PTDEBUG=1 shows the following:
- Cxn:3886 18596 DBI::db=HASH(0x18a04d8) Setting dbh
- Cxn:3891 18596 DBI::db=HASH(0x18a04d8) SELECT @@server_id /*!50038 , @@hostname*/
- Cxn:3893 18596 DBI::db=HASH(0x18a04d8) hostname: xxxxxx.ocal 52
- Cxn:3874 18596 DBI::db=HASH(0x18a04d8) Connected dbh to xxxxx.local h=10.15.11.52
- MasterSlave:4218 18596 Looking for slaves on D=somedb,h=10.15.11.52 using methods processlist hosts
- MasterSlave:4225 18596 Finding slaves with _find_slaves_by_processlist
- MasterSlave:4287 18596 DBI::db=HASH(0x18a04d8) SHOW GRANTS FOR CURRENT_USER()
- MasterSlave:4317 18596 DBI::db=HASH(0x18a04d8) SHOW PROCESSLIST
Use of uninitialized value $host in string eq at /usr/bin/pt-online-schema-change line 4245.
- Cxn:4012 18596 Destroying cxn
- Cxn:4021 18596 DBI::db=HASH(0x17c6a58) Disconnecting dbh on xxxxx h=10.15.11.46
- Cxn:4012 18596 Destroying cxn
- Cxn:4021 18596 DBI::db=HASH(0x18a04d8) Disconnecting dbh on xxxxx h=10.15.11.52
- Cxn:4012 18596 Destroying cxn
- Cxn:4021 18596 DBI::db=HASH(0x188d240) Disconnecting dbh on xxxxx h=10.15.11.23
- Cxn:4012 18596 Destroying cxn
- Cxn:4021 18596 DBI::db=HASH(0x17c94f0) Disconnecting dbh on xxxxx h=10.15.11.46
Opening the script at the line named in the error (line 4245) shows this code:
$ vim /usr/bin/pt-online-schema-change +4245
4234 sub _find_slaves_by_processlist {
4235 my ( $self, $dsn_parser, $dbh, $dsn ) = @_;
4236
4237 my @slaves = map {
4238 my $slave = $dsn_parser->parse("h=$_", $dsn);
4239 $slave->{source} = 'processlist';
4240 $slave;
4241 }
4242 grep { $_ }
4243 map {
4244 my ( $host ) = $_->{host} =~ m/^([^:]+):/;
4245 if ( $host eq 'localhost' ) {
4246 $host = '127.0.0.1'; # Replication never uses sockets.
4247 }
4248 $host;
4249 } $self->get_connected_slaves($dbh);
4250
4251 return @slaves;
4252 }
The bug occurs there: it seems that $host ends up undefined because this match fails:
my ( $host ) = $_->{host} =~ m/^([^:]+):/;
The pattern requires a colon after the host name, but the tool finds a host that has no colon, for example "localhost".
The processlist shows this:
1 | event_scheduler | localhost | NULL | Daemon | 40041 | Waiting for next activation | NULL |
9 | someuser | 10.15.11.23:50630 | NULL | Binlog Dump | 1031379 | Master has sent all binlog to slave; waiting for binlog to be updated | NULL |
17 | someuser | 10.15.11.52:57571 | NULL | Binlog Dump | 1031369 | Master has sent all binlog to slave; waiting for binlog to be updated | NULL |
18 | someuser | 10.19.11.31:54500 | NULL | Binlog Dump | 1031369 | Master has sent all binlog to slave; waiting for binlog to be updated | NULL |
284044 | someuser | 10.15.11.145:55431 | NULL | Binlog Dump | 843843 | Master has sent all binlog to slave; waiting for binlog to be updated | NULL |
So the bug occurs there.
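The failing match can be reproduced outside the tool. The Perl sketch below is only an illustration: host_from_processlist is a hypothetical helper, not a function in pt-online-schema-change, and it is not necessarily how the issue was fixed upstream. It keeps the original pattern but checks that the match succeeded before comparing $host, so a Host value without a ":port" suffix is skipped instead of triggering the warning.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of pt-online-schema-change): return the host
# part of a processlist Host value, or nothing if it has no "host:port" form.
sub host_from_processlist {
    my ($value) = @_;
    return unless defined $value;

    # Same pattern as in the report: capture everything before the first colon.
    my ($host) = $value =~ m/^([^:]+):/;

    # The match fails for values such as "localhost" (no colon), leaving $host
    # undefined. Returning here avoids the "uninitialized value" warning.
    return unless defined $host;

    # Same substitution as the original code.
    $host = '127.0.0.1' if $host eq 'localhost';
    return $host;
}

# Sample Host values; only the first two appear in the report's processlist,
# the third is made up to exercise the localhost substitution.
my @values = ( 'localhost', '10.15.11.23:50630', 'localhost:3306' );

my @hosts = grep { defined } map { host_from_processlist($_) } @values;
print join( ', ', @hosts ), "\n";
```

With these sample values the bare "localhost" entry is dropped and the other two are kept, the second one rewritten to 127.0.0.1.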
The workaround is to use --recursion-method=dsn="h=yourdsnhost,D=percona,t=dsns".
Version:
~ #> pt-online-schema-change --version
pt-online-schema-change 2.2.17