Percona XtraDB Cluster: PXC-3396

SST failing because of wrong binlog name

    Details

    • Type: Bug
    • Status: On Hold
    • Priority: Medium
    • Resolution: Unresolved
    • Affects Version/s: 8.0.20-11.2(CVE)
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Environment:

      Description

      We reduced our 5-node PXC 5.7 cluster to a single node, then upgraded that node to PXC 8 with the following steps:

      1. stopped the node and upgraded all the packages
      2. applied all the configuration changes for 8
      3. made the node read-only and started it without the Galera configuration
      4. the node performed an in-place upgrade; everything was fine
      5. executed a "reset master" on the node and stopped it again
      6. removed the read-only setting from the config and added the wsrep settings to the node again (with an empty gcomm address)
      7. started the node again, and we had our first PXC 8 node with Galera

       

      We then moved on to the next node:

      1. mysqld was already stopped
      2. upgraded the packages to PXC 8 (including xtrabackup, as on the first node)
      3. applied the configuration changes for 8
      4. cleared the datadir with "rm -rf DATADIR/*"
      5. started the node
      6. the node recognized the empty directory and started an SST from our first node
      7. this took 3 hours, since the data set is 1.4 TB
      8. at the end, the receiver node reported the following errors:

      Sep 02 23:50:31 receiver-node.example.com -wsrep-sst-joiner[24784]: 2020-09-02T21:50:25.614581Z 1 [System] [MY-000000] [WSREP] PXC upgrade completed successfully
      Sep 02 23:50:31 receiver-node.example.com -wsrep-sst-joiner[24784]: mysqld: File '/data/mysql/.000003' not found (OS errno 2 - No such file or directory)
      Sep 02 23:50:31 receiver-node.example.com -wsrep-sst-joiner[24784]: 2020-09-02T21:50:25.680151Z 0 [ERROR] [MY-010958] [Server] Could not open log file.
      Sep 02 23:50:31 receiver-node.example.com -wsrep-sst-joiner[24784]: 2020-09-02T21:50:25.680445Z 0 [ERROR] [MY-010041] [Server] Can't init tc log
      Sep 02 23:50:31 receiver-node.example.com -wsrep-sst-joiner[24784]: 2020-09-02T21:50:25.681044Z 0 [ERROR] [MY-010119] [Server] Aborting
      Sep 02 23:50:31 receiver-node.example.com -wsrep-sst-joiner[24784]: 2020-09-02T21:50:30.620730Z 0 [System] [MY-010910] [Server] /usr/sbin/mysqld: Shutdown complete (mysqld 8.0.>
      Sep 02 23:50:31 receiver-node.example.com -wsrep-sst-joiner[24784]: EOF:
      2020-09-02T21:50:32.116964Z 0 [ERROR] [MY-000000] [WSREP] Process completed with error: wsrep_sst_xtrabackup-v2 --role 'joiner' --address '192.168.0.201' --datadir '/data/mysql/' --basedir '/usr/' --plugindir '/usr/lib64/mysql/plugin/' --defaults-file '/etc/my.cnf' --defaults-group-suffix '' --parent '24259' --mysqld-version '8.0.19-10' --binlog '/data/mysql/mysql-bin' : 3 (No such process)
      2020-09-02T21:50:32.119162Z 0 [ERROR] [MY-000000] [WSREP] Failed to read uuid:seqno from joiner script.
      2020-09-02T21:50:32.119207Z 0 [ERROR] [MY-000000] [WSREP] SST script aborted with error 3 (No such process)
      Sep 02 23:50:33 receiver-node.example.com systemd[1]: mysql.service: Main process exited, code=killed, status=6/ABRT
      Sep 02 23:50:33 receiver-node.example.com systemd[1]: mysql.service: Failed with result 'signal'.
      Sep 02 23:50:33 receiver-node.example.com systemd[1]: Failed to start Percona XtraDB Cluster.

       

      For some reason, the binlog file arrives in the data directory with the correct name (/data/mysql/mysql-bin.000003). However, the index file (/data/mysql/mysql-bin.index) contains the wrong filename (/data/mysql/.000003).
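
      The mismatch described above can be checked with a small script. The following is only a minimal sketch, not part of PXC or of the SST scripts: the function names `check_binlog_index` and `demo` are hypothetical, and it assumes the index file is plain text with one binlog path per line. It flags entries whose file name has an empty base name (such as `.000003`) and entries that point to a file that does not exist.

      ```python
      import os
      import tempfile


      def check_binlog_index(index_path):
          """Return (entry, problem) tuples for suspicious lines in a binlog index file.

          Assumption: the index holds one binlog path per line; relative
          paths are resolved against the directory of the index file.
          """
          problems = []
          index_dir = os.path.dirname(os.path.abspath(index_path))
          with open(index_path, "r", encoding="utf-8") as fh:
              for raw in fh:
                  entry = raw.strip()
                  if not entry:
                      continue  # skip blank lines
                  path = entry if os.path.isabs(entry) else os.path.join(index_dir, entry)
                  name = os.path.basename(path)
                  if name.startswith("."):
                      # e.g. ".000003": the base name before the sequence number is missing
                      problems.append((entry, "empty base name"))
                  elif not os.path.exists(path):
                      problems.append((entry, "file missing"))
          return problems


      def demo():
          """Rebuild the reported situation in a temporary directory (illustrative only)."""
          with tempfile.TemporaryDirectory() as datadir:
              good = os.path.join(datadir, "mysql-bin.000003")
              open(good, "w").close()  # the binlog file itself has the correct name
              index = os.path.join(datadir, "mysql-bin.index")
              with open(index, "w", encoding="utf-8") as fh:
                  fh.write(os.path.join(datadir, ".000003") + "\n")  # broken entry
                  fh.write(good + "\n")                              # valid entry
              return [problem for _, problem in check_binlog_index(index)]
      ```

      On the joiner it would be called as `check_binlog_index("/data/mysql/mysql-bin.index")`. The script only reads the index; it does not modify it or explain why the wrong name was written.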

       

      So why is there a completely wrong filename in the index, and how can I solve this issue? I tried simply starting the node again, hoping it would recognize the error and start an IST; however, it just dropped everything and started a new SST.

        Attachments

        1. donor.log (4 kB)
        2. joiner-failed.log (19 kB)
        3. joiner-successful.log (20 kB)
        4. my.cnf (1 kB)
        5. my.cnf-donor (5 kB)
        6. my.cnf-joiner (5 kB)

            People

            Assignee: Unassigned
            Reporter: romulus Thomas bruckmann
            Votes: 1
            Watchers: 6
