Percona XtraDB Cluster / PXC-1078

LP #1212739: backend must be restarted message causes node hang

    Details

    • Type: Bug
    • Status: Wait for Subtasks
    • Priority: High
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      Reported in Launchpad by Jay Janssen; last update 17-11-2015 08:24:36

      Scenario:

      node1 is up and running as a 1-node cluster.

      node2 starts, but it has an older grastate. This seems to cause a 'prim' conflict and drops node1 into the Init state. node2 fails, and node1 goes into this state:

      130815 15:21:40 [Note] WSREP: declaring 620654b6-05be-11e3-9fed-cbbc588bf177 stable
      130815 15:21:40 [Note] WSREP: declaring d7f678fa-05bd-11e3-b212-922779171530 stable
      130815 15:21:40 [Warning] WSREP: 32e0c052-05be-11e3-a964-ee85c7ae2549 conflicting prims: my prim: view_id(PRIM,32e0c052-05be-11e3-a964-ee85c7ae2549,1) other prim: view_id(PRIM,d7f678fa-05bd-11e3-b212-922779171530,10)
      130815 15:21:40 [ERROR] WSREP: caught exception in PC, state dump to stderr follows:
      pc::Proto{uuid=32e0c052-05be-11e3-a964-ee85c7ae2549,start_prim=1,npvo=0,ignore_sb=0,ignore_quorum=0,state=1,last_sent_seq=3,checksum=1,instances=
      32e0c052-05be-11e3-a964-ee85c7ae2549,prim=1,un=0,last_seq=3,last_prim=view_id(PRIM,32e0c052-05be-11e3-a964-ee85c7ae2549,1),to_seq=2,weight=1
      ,state_msgs=
      32e0c052-05be-11e3-a964-ee85c7ae2549,pcmsg{ type=STATE, seq=0, flags= 0, node_map { 32e0c052-05be-11e3-a964-ee85c7ae2549,prim=1,un=0,last_seq=3,last_prim=view_id(PRIM,32e0c052-05be-11e3-a964-ee85c7ae2549,1),to_seq=2,weight=1 }}
      620654b6-05be-11e3-9fed-cbbc588bf177,pcmsg{ type=STATE, seq=0, flags= 0, node_map { 620654b6-05be-11e3-9fed-cbbc588bf177,prim=0,un=0,last_seq=4294967295,last_prim=view_id(NON_PRIM,00000000-0000-0000-0000-000000000000,0),to_seq=-1,weight=1 d7f678fa-05bd-11e3-b212-922779171530,prim=1,un=0,last_seq=2,last_prim=view_id(PRIM,d7f678fa-05bd-11e3-b212-922779171530,10),to_seq=-1,weight=1 }}

      ,current_view=view(view_id(REG,32e0c052-05be-11e3-a964-ee85c7ae2549,12) memb {
      32e0c052-05be-11e3-a964-ee85c7ae2549,
      620654b6-05be-11e3-9fed-cbbc588bf177,
      d7f678fa-05bd-11e3-b212-922779171530,
      } joined {
      620654b6-05be-11e3-9fed-cbbc588bf177,
      d7f678fa-05bd-11e3-b212-922779171530,
      } left {
      } partitioned {
      }),pc_view=view(view_id(PRIM,32e0c052-05be-11e3-a964-ee85c7ae2549,1) memb {
      32e0c052-05be-11e3-a964-ee85c7ae2549,
      } joined {
      } left {
      } partitioned {
      }),mtu=32636}

      130815 15:21:40 [Note] WSREP: evs::msg{version=0,type=1,user_type=255,order=4,seq=0,seq_range=0,aru_seq=-1,flags=4,source=d7f678fa-05bd-11e3-b212-922779171530,source_view_id=view_id(REG,32e0c052-05be-11e3-a964-ee85c7ae2549,12),range_uuid=00000000-0000-0000-0000-000000000000,range=[-1,-1],fifo_seq=271,node_list=()
      } 116
      130815 15:21:40 [ERROR] WSREP: exception caused by message: evs::msg{version=0,type=3,user_type=255,order=1,seq=0,seq_range=-1,aru_seq=0,flags=4,source=620654b6-05be-11e3-9fed-cbbc588bf177,source_view_id=view_id(REG,32e0c052-05be-11e3-a964-ee85c7ae2549,12),range_uuid=00000000-0000-0000-0000-000000000000,range=[-1,-1],fifo_seq=17,node_list=()
      }
      state after handling message: evs::proto(evs::proto(32e0c052-05be-11e3-a964-ee85c7ae2549, OPERATIONAL, view_id(REG,32e0c052-05be-11e3-a964-ee85c7ae2549,12)), OPERATIONAL) {
      current_view=view(view_id(REG,32e0c052-05be-11e3-a964-ee85c7ae2549,12) memb {
      32e0c052-05be-11e3-a964-ee85c7ae2549,
      620654b6-05be-11e3-9fed-cbbc588bf177,
      d7f678fa-05bd-11e3-b212-922779171530,
      } joined {
      } left {
      } partitioned {
      }),
      input_map=evs::input_map: {aru_seq=0,safe_seq=0,node_index=node: {idx=0,range=[1,0],safe_seq=0}
      node: {idx=1,range=[1,0],safe_seq=0}
      node: {idx=2,range=[1,0],safe_seq=0}
      },
      fifo_seq=105,
      last_sent=0,
      known={
      32e0c052-05be-11e3-a964-ee85c7ae2549,evs::node {operational=1,suspected=0,installed=1,fifo_seq=-1,}
      620654b6-05be-11e3-9fed-cbbc588bf177,evs::node {operational=1,suspected=0,installed=1,fifo_seq=17,}
      d7f678fa-05bd-11e3-b212-922779171530,evs::node {operational=1,suspected=0,installed=1,fifo_seq=273,}
      }
      }130815 15:21:40 [ERROR] WSREP: exception from gcomm, backend must be restarted:32e0c052-05be-11e3-a964-ee85c7ae2549 aborting due to conflicting prims: older overrides (FATAL)
      at gcomm/src/pc_proto.cpp:handle_state():888
      130815 15:21:40 [Note] WSREP: Received self-leave message.
      130815 15:21:40 [Note] WSREP: Flow-control interval: [0, 0]
      130815 15:21:40 [Note] WSREP: Received SELF-LEAVE. Closing connection.
      130815 15:21:40 [Note] WSREP: Shifting SYNCED -> CLOSED (TO: 0)
      130815 15:21:40 [Note] WSREP: RECV thread exiting 0: Success
      130815 15:21:40 [Note] WSREP: New cluster view: global state: 32e1936e-05be-11e3-9b1b-56bdddb5fdd3:0, view# -1: non-Primary, number of nodes: 0, my index: -1, protocol version 2
      130815 15:21:40 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
      130815 15:21:40 [Note] WSREP: applier thread exiting (code:0)

      However, the daemon stays in the Init state. When I try to shut it down, it just hangs:

      130815 15:22:31 [Note] /usr/sbin/mysqld: Normal shutdown

      130815 15:22:31 [Note] WSREP: Stop replication
      130815 15:22:31 [Note] WSREP: Closing send monitor...
      130815 15:22:31 [Note] WSREP: Closed send monitor.
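
The "conflicting prims" warning in the first log carries both primary-component view IDs (type, UUID, sequence number). When triaging similar reports it can help to pull those fields out of an error log mechanically. Below is a minimal Python sketch that does so. The regular expression is derived only from the single warning line quoted in this report, so it assumes other Galera versions print the message in the same shape. The function name `parse_conflicting_prims` is hypothetical and not part of any Percona or Galera tool. The sketch only extracts the fields; it does not try to reproduce Galera's rule for deciding which view overrides the other.

```python
import re

# One view_id(TYPE,uuid,seq) token, as printed in the log quoted in this report.
# {p} is a prefix that keeps the named groups unique when the pattern is used twice.
VIEW_ID = r"view_id\((?P<{p}type>\w+),(?P<{p}uuid>[0-9a-f-]{{36}}),(?P<{p}seq>-?\d+)\)"

# Assumption: the warning always has the form
# "conflicting prims: my prim: view_id(...) other prim: view_id(...)".
CONFLICT_RE = re.compile(
    r"conflicting prims: my prim: " + VIEW_ID.format(p="my_")
    + r" other prim: " + VIEW_ID.format(p="other_")
)


def parse_conflicting_prims(line):
    """Return both view IDs from a 'conflicting prims' line, or None if absent."""
    m = CONFLICT_RE.search(line)
    if m is None:
        return None
    return {
        "my": (m["my_type"], m["my_uuid"], int(m["my_seq"])),
        "other": (m["other_type"], m["other_uuid"], int(m["other_seq"])),
    }


# The warning line from this report, copied verbatim.
SAMPLE = (
    "130815 15:21:40 [Warning] WSREP: 32e0c052-05be-11e3-a964-ee85c7ae2549 "
    "conflicting prims: my prim: "
    "view_id(PRIM,32e0c052-05be-11e3-a964-ee85c7ae2549,1) "
    "other prim: view_id(PRIM,d7f678fa-05bd-11e3-b212-922779171530,10)"
)

result = parse_conflicting_prims(SAMPLE)
print(result)
```

Run against the line above, `result["my"]` holds node1's own view (sequence 1) and `result["other"]` holds the view reported by the joining side (sequence 10), which are the two values the FATAL message is complaining about.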

    People

    • Assignee: Krunal Bauskar
    • Reporter: lpjirasync (Inactive)