Uploaded image for project: 'Percona XtraDB Cluster'
  1. Percona XtraDB Cluster
  2. PXC-2005

LP #1714684: Falling join node after a successful SST

    Details

    • Type: Bug
    • Status: Done
    • Priority: Low
    • Resolution: Cannot Reproduce
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      **Reported in Launchpad by Stanislav Gritsenko last update 05-09-2017 03:14:04

      I observe a strange situation when I try to add a second node to the percona galera cluster.
      After a long file upload using xtrabackup-v2 (3.5TB of data and about 1,000,000 files) and successfully passed all the stages, judging by the log, the second node falls with an absolutely uninformative error:

      2017-09-02 01:31:14 17707 [Note] InnoDB: Using atomics to ref count buffer pool pages
      2017-09-02 01:31:14 17707 [Note] InnoDB: The InnoDB memory heap is disabled
      2017-09-02 01:31:14 17707 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
      2017-09-02 01:31:14 17707 [Note] InnoDB: Memory barrier is not used
      2017-09-02 01:31:14 17707 [Note] InnoDB: Compressed tables use zlib 1.2.7
      2017-09-02 01:31:14 17707 [Note] InnoDB: Using Linux native AIO
      2017-09-02 01:31:14 17707 [Note] InnoDB: Using CPU crc32 instructions
      2017-09-02 01:31:14 17707 [Warning] InnoDB: innodb_file_io_threads is deprecated. Please use innodb_read_io_threads and innodb_write_io_threads instead
      2017-09-02 01:31:14 17707 [Note] InnoDB: Initializing buffer pool, size = 10.0G
      2017-09-02 01:31:15 17707 [Note] InnoDB: Completed initialization of buffer pool
      2017-09-02 01:31:16 17707 [Note] InnoDB: Highest supported file format is Barracuda.
      2017-09-02 01:32:27 17707 [Note] WSREP: Created page /var/lib/mysql/gcache.page.000047 of size 134217728 bytes
      2017-09-02 02:44:02 17707 [Note] WSREP: Created page /var/lib/mysql/gcache.page.000048 of size 134217728 bytes
      2017-09-02 03:03:40 17707 [Note] WSREP: Created page /var/lib/mysql/gcache.page.000049 of size 134217728 bytes
      2017-09-02 03:21:14 17707 [Note] WSREP: Created page /var/lib/mysql/gcache.page.000050 of size 134217728 bytes
      2017-09-02 03:40:19 17707 [Note] WSREP: Created page /var/lib/mysql/gcache.page.000051 of size 134217728 bytes
      2017-09-02 03:44:41 17707 [Note] WSREP: Created page /var/lib/mysql/gcache.page.000052 of size 134217728 bytes
      2017-09-02 03:49:21 17707 [Note] WSREP: Created page /var/lib/mysql/gcache.page.000053 of size 134217728 bytes
      2017-09-02 03:53:24 17707 [Note] InnoDB: 128 rollback segment(s) are active.
      2017-09-02 03:53:24 17707 [Note] InnoDB: Waiting for purge to start
      2017-09-02 03:53:24 17707 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.36-82.0 started; log sequence number 7400309372985
      2017-09-02 03:53:24 17707 [ERROR] Aborting

      2017-09-02 03:53:24 17707 [Note] WSREP: Signalling cancellation of the SST request.
      2017-09-02 03:53:24 17707 [Note] WSREP: SST request was cancelled
      2017-09-02 03:53:24 17707 [Note] WSREP: Closing send monitor...
      2017-09-02 03:53:24 17707 [Note] WSREP: Closed send monitor.
      2017-09-02 03:53:24 17707 [Note] WSREP: gcomm: terminating thread
      2017-09-02 03:53:24 17707 [Note] WSREP: gcomm: joining thread
      2017-09-02 03:53:24 17707 [Note] WSREP: gcomm: closing backend
      2017-09-02 03:53:26 17707 [Note] WSREP: Service disconnected.
      2017-09-02 03:53:26 17707 [Note] WSREP: Waiting to close threads......
      2017-09-02 03:53:26 17707 [Note] WSREP: rollbacker thread exiting

      -------------------------------------

      This info correspond with strace log data:

      2049 03:53:24 <... setpriority resumed> ) = 0
      2049 03:53:24 getpriority(PRIO_PROCESS, 2049) = 20
      17707 03:53:24 write(2, "2017-09-02 03:53:24 17707 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.36-82.0 started; log sequence number 7400309372985\n", 137 <unfinished ...>
      2049 03:53:24 select(0, NULL, NULL, NULL,

      {1, 0}

      <unfinished ...>
      17707 03:53:24 <... write resumed> ) = 137
      17707 03:53:24 mmap(NULL, 8392704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7fb32cbf9000
      17707 03:53:24 mprotect(0x7fb32cbf9000, 4096, PROT_NONE) = 0
      17707 03:53:24 clone(child_stack=0x7fb32d3f8e30, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7fb32d3f99d0, tls=0x7fb32d3f9700, child_tidptr=0x7fb32d3f99d0) = 2050
      2050 03:53:24 set_robust_list(0x7fb32d3f99e0, 24 <unfinished ...>
      17707 03:53:24 mmap(NULL, 8392704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0 <unfinished ...>
      2050 03:53:24 <... set_robust_list resumed> ) = 0
      17707 03:53:24 <... mmap resumed> ) = 0x7fb32c3f8000
      17707 03:53:24 mprotect(0x7fb32c3f8000, 4096, PROT_NONE) = 0
      2050 03:53:24 futex(0x7fbf9746b58c, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
      17707 03:53:24 clone( <unfinished ...>
      2051 03:53:24 set_robust_list(0x7fb32cbf89e0, 24 <unfinished ...>
      17707 03:53:24 <... clone resumed> child_stack=0x7fb32cbf7e30, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7fb32cbf89d0, tls=0x7fb32cbf8700, child_tidptr=0x7fb32cbf89d0) = 2051
      2051 03:53:24 <... set_robust_list resumed> ) = 0
      17707 03:53:24 mmap(NULL, 8392704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0 <unfinished ...>
      2051 03:53:24 futex(0x7fb4cd3ef90c, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1,

      {1504324414, 807821000}

      , ffffffff <unfinished ...>
      17707 03:53:24 <... mmap resumed> ) = 0x7fb32bbf7000
      17707 03:53:24 mprotect(0x7fb32bbf7000, 4096, PROT_NONE) = 0
      17707 03:53:24 clone( <unfinished ...>
      2052 03:53:24 set_robust_list(0x7fb32c3f79e0, 24 <unfinished ...>
      17707 03:53:24 <... clone resumed> child_stack=0x7fb32c3f6e30, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7fb32c3f79d0, tls=0x7fb32c3f7700, child_tidptr=0x7fb32c3f79d0) = 2052
      2052 03:53:24 <... set_robust_list resumed> ) = 0
      2052 03:53:24 futex(0x7fb334add74c, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1,

      {1504324409, 808928000}

      , ffffffff <unfinished ...>
      2047 03:53:24 <... pread resumed> "K\315\1\320\0\t\276\200\0\2\321b\0\2\321\\\0\0\6\271\7U\2727E\277\0\0\0\0\0\0\0\0\0\0\0\0\0!=\251\0\233\16\356\6\353\0\0\0\1\0\1\0\206\0\0\0\0\0\0\0\0\0\1\0\0\0\0\0\0\0\f\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\10\1\0\0\3<\234infimum\0\t\6\0\10\3\0\0supremum\0NJ% \0\20\7;\335whois_record_delta/#sql-291557_8e08aawhois_record_delta/FK9F837BD2D0C7A375\0\n \350NJ% \0\30\0077\313whois_record_delta/#sql-291557_a77c0bwhois_record_delta/FKA6CD4E737B556202\0\n\4\276NJ% \0 \7\0\335whois_record_delta/#sql-291557_a7a611whois_record_delta/FKB4869798B8CF12D0\0\n\4\277NJ% \0(\7\0014whois_record_delta/#sql-291557_a7ba06whois_record_delta/FKC8DFC7E4D0C7A375\0\n \340`\\7\0\0000\7\2Kwhois_record_delta/registry_data_academy_add_2017_07_31whois_record_delta/FKA03AB7BF7B556202\0\3`Md`;\0\0008\7\2\270whois_record_delta/registry_data_accountants_add_2017_04_13whois_record_delta/FK5DCE39197B556202\0\t\352\21c_:\0\0@\7\3$whois_record_delta/registry_data_accountant_add_2017_07_28whois_record_delta/FK27E110F37B556202\0\4Z\36c_:\5\0H\7\3\220whois_record_delta/registry_data_accountant_add_2017_08_28whois_record_delta/FK27E18552D0C7A375\0\4Z\35^Z5\0\0P\7\3\367whois_record_delta/registry_data_actor_add_2017_07_06whois_record_delta/FK38C0C9FED0C7A375\0\0029\24^Z5\0\0X\7\4^whois_rec"..., 16384, 10462691328) = 16384
      2047 03:53:24 pread(12, "\230S\314\376\0\n \354\0\3`J\0\3`M\0\0\6\271\0079B\302E\277\0\0\0\0\0\0\0\0\0\0\0\0\0\7*~\0\200\35\222 ~\0\0\0\5\0\0\0\31\0\0\0\0\246\0\312m\0\0\0\0\0\0\0\0\0\f\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\10\1\0\0\3\24\17infimum\0\t\3\0\10\3\0\0supremum\0J% \0\20\5\2\303whois_record_delta/#sql-291557_abd422whois_record_delta/FKF3D90A9420710653J%$\0\30\5\1{whois_record_delta/#sql-291557_abd422whois_record_delta/FKF3D90A9479B00024J%$\0 \5\0\327whois_record_delta/#sql-291557_abd422whois_record_delta/FKF3D90A947B556202J% \0(\5\1\315whois_record_delta/#sql-291557_abd422whois_record_delta/FKF3D90A94B8CF12D0J% \0000\5\0\205whois_record_delta/#sql-291557_abd422whois_record_delta/FKF3D90A94D0C7A375J% \0008\5#zwhois_record_delta/#sql-291557_abd422whois_record_delta/FKFED1313020710653J%$\0@\5\3gwhois_record_delta/#sql-291557_abd422whois_record_delta/FKFED1313079B00024J%$\0H\5\2qwhois_record_delta/#sql-291557_abd422whois_record_delta/FKFED131307B556202J% \0P\5\3\271whois_record_delta/#sql-291557_abd422whois_record_delta/FKFED13130B8CF12D0J% \0X\5\3\25whois_record_delta/#sql-291557_abd422whois_record_delta/FKFED13130C7212EEFJ%$\0`\5\2\37whois_record_delta/#sql-291557_abd422whois_record_delta/FKFED13130D0C7A"..., 16384, 10875502592) = 16384
      2047 03:53:24 pread(12, <unfinished ...>
      17707 03:53:24 write(2, "2017-09-02 03:53:24 17707 [ERROR] Aborting\n\n", 44) = 44
      17707 03:53:24 write(2, "2017-09-02 03:53:24 17707 [Note] WSREP: Signalling cancellation of the SST request.\n", 84) = 84
      17707 03:53:24 write(2, "2017-09-02 03:53:24 17707 [Note] WSREP: SST request was cancelled\n", 66) = 66
      17707 03:53:24 futex(0x7fb885bdb224, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7fb885bdb220,

      {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}

      <unfinished ...>
      ------------------

      All logs has been attached.

        Smart Checklist

          Attachments

            Activity

              People

              • Assignee:
                lalit.choudhary Lalit Choudhary
                Reporter:
                lpjirasync lpjirasync (Inactive)
              • Votes:
                0 Vote for this issue
                Watchers:
                2 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: