  1. Percona Server for MySQL
  2. PS-7231

Modify `Slave_worker::retry_transaction()` to call `mysql_errno()` only when `thd->is_error()` is true

    Details

    • Needs Review:
      Yes

      Description

      Assertion failure `m_status == DA_ERROR`

      This assertion is observed only when both slave_transaction_retries and slave_preserve_commit_order are enabled on the replica server, and it is more likely to occur when slave_transaction_retries is set to a low value.

      In general, if a replication applier thread fails to execute a transaction because of an InnoDB deadlock or because the transaction's execution time exceeded InnoDB's innodb_lock_wait_timeout, it automatically retries the transaction up to slave_transaction_retries times before stopping with an error.
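
      The retry behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the server's code: `ApplyResult`, `apply_with_retries`, and `max_retries` (standing in for slave_transaction_retries) are hypothetical names, and the sketch assumes that the first attempt does not count as a retry.

```cpp
#include <functional>

// Hypothetical outcome of one attempt to apply a transaction.
enum class ApplyResult { kOk, kTemporaryError, kPermanentError };

// Applies a transaction, retrying after temporary errors (for example a
// deadlock or a lock wait timeout). max_retries stands in for
// slave_transaction_retries. Returns true if the transaction was applied.
// attempts_out receives the total number of attempts, including the first.
bool apply_with_retries(const std::function<ApplyResult()>& apply_once,
                        unsigned max_retries, unsigned* attempts_out) {
  unsigned attempts = 0;
  for (;;) {
    ++attempts;
    const ApplyResult result = apply_once();
    if (result == ApplyResult::kOk) {
      *attempts_out = attempts;
      return true;
    }
    // Stop on a permanent error, or once the first attempt plus
    // max_retries retries have all failed.
    if (result == ApplyResult::kPermanentError || attempts > max_retries) {
      *attempts_out = attempts;
      return false;
    }
  }
}
```

      With a low `max_retries`, the "retries exhausted" branch is reached after only a few failed attempts, which fits the observation that the assertion is more likely with a low slave_transaction_retries.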

      When slave_preserve_commit_order is enabled, the replica server ensures that transactions are externalized on the replica in the same order as they appear in the replica's relay log, and prevents gaps in the sequence of transactions that have been executed from the relay log. If a worker thread finishes executing its transaction before the thread handling the preceding transaction, it waits until all previous transactions are committed before committing its own.

      When the waiting thread holds locks on rows that are needed by the thread executing a previous transaction, the InnoDB deadlock detection algorithm kicks in and the preceding thread asks the waiting thread to roll back (only if the preceding thread's sequence number is lower than that of the waiting thread).

      When this happens, the waiting thread wakes up from its cond_wait and learns that it was asked to roll back by the other transaction, because it was holding a lock that the other transaction needs in order to progress. It then goes ahead and retries the transaction.

      But just before the transaction is retried, the worker checks whether it encountered any error during execution. If there is no error, it simulates an ER_LOCK_DEADLOCK error so that the failure is treated as a temporary error and the worker thread retries the transaction.
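
      The check described above might look roughly like the sketch below. It is only an illustration: `RecordedError`, `error_for_retry_decision`, and `is_temporary_error` are hypothetical names, and the set of temporary errors is limited here to the two causes mentioned in this description (the server may treat others as temporary too). The numeric values are the standard MySQL codes for ER_LOCK_WAIT_TIMEOUT and ER_LOCK_DEADLOCK.

```cpp
// Standard MySQL server error codes.
constexpr unsigned kErLockWaitTimeout = 1205;  // ER_LOCK_WAIT_TIMEOUT
constexpr unsigned kErLockDeadlock = 1213;     // ER_LOCK_DEADLOCK

// Hypothetical view of what the worker recorded while executing.
struct RecordedError {
  bool present;   // true if an error was actually recorded
  unsigned code;  // meaningful only when present is true
};

// If nothing was recorded, pretend a deadlock happened so that the
// failure is classified as temporary. Nothing is written anywhere:
// the error is "simulated" only for the purpose of the decision.
unsigned error_for_retry_decision(const RecordedError& recorded) {
  return recorded.present ? recorded.code : kErLockDeadlock;
}

// Assumption: only these two codes count as temporary in this sketch.
bool is_temporary_error(unsigned code) {
  return code == kErLockDeadlock || code == kErLockWaitTimeout;
}
```

      The point to notice is that the simulated code exists only as a return value; no diagnostics state is populated by it.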

      However, when the retries are exhausted, the worker thread writes an error to the error log, reading the error code from the thread's diagnostics area by calling `thd->get_stmt_da()->mysql_errno()`. If the error was simulated (not raised through a `my_error` call), the diagnostics area is still empty, which causes the assertion `DBUG_ASSERT(m_status == DA_ERROR);` to fail.
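
      The guard proposed in the issue title can be illustrated with a small stand-in for the diagnostics area. This is a sketch under stated assumptions, not the server's `Diagnostics_area`: the class `DiagnosticsArea`, its reduced status enum, and the helper `errno_for_report` with its `fallback` parameter are all hypothetical, and plain `assert` stands in for `DBUG_ASSERT`.

```cpp
#include <cassert>

// Simplified, hypothetical stand-in for the thread's diagnostics area.
class DiagnosticsArea {
 public:
  enum Status { DA_EMPTY, DA_ERROR };

  // Models an error being raised and recorded.
  void set_error(unsigned code) {
    m_status = DA_ERROR;
    m_errno = code;
  }

  bool is_error() const { return m_status == DA_ERROR; }

  // Like the accessor in the backtrace, this asserts that an error
  // was recorded before returning its code.
  unsigned mysql_errno() const {
    assert(m_status == DA_ERROR);
    return m_errno;
  }

 private:
  Status m_status = DA_EMPTY;
  unsigned m_errno = 0;
};

// Reads the error code only when one was recorded. When the error was
// merely simulated, the area is empty and the caller-supplied fallback
// (an assumption of this sketch) is used instead.
unsigned errno_for_report(const DiagnosticsArea& da, unsigned fallback) {
  return da.is_error() ? da.mysql_errno() : fallback;
}
```

      Calling `mysql_errno()` directly on an empty `DiagnosticsArea` would trip the assertion, which mirrors the failure in the backtrace; going through `errno_for_report` avoids it.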

       

      Thread pointer: 0x7f7bb40008d0
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 7f7c583cfcd0 thread_stack 0x46000
      /home/sveta/build/ps-8.0/bin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x59) [0x56520ac5ce7f]
      /home/sveta/build/ps-8.0/bin/mysqld(handle_fatal_signal+0x2ac) [0x565209a2351a]
      /lib/x86_64-linux-gnu/libpthread.so.0(+0x153c0) [0x7f7c9579e3c0]
      /lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb) [0x7f7c94e2d18b]
      /lib/x86_64-linux-gnu/libc.so.6(abort+0x12b) [0x7f7c94e0c859]
      /lib/x86_64-linux-gnu/libc.so.6(+0x25729) [0x7f7c94e0c729]
      /lib/x86_64-linux-gnu/libc.so.6(+0x36f36) [0x7f7c94e1df36]
      /home/sveta/build/ps-8.0/bin/mysqld(Diagnostics_area::mysql_errno() const+0x3e) [0x5652093afa50]
      /home/sveta/build/ps-8.0/bin/mysqld(Slave_worker::check_and_report_end_of_retries(THD*)+0x1fa) [0x56520a906b14]
      /home/sveta/build/ps-8.0/bin/mysqld(Slave_worker::retry_transaction(unsigned int, unsigned long long, unsigned int, unsigned long long)+0xc3) [0x56520a906c7b]
      /home/sveta/build/ps-8.0/bin/mysqld(slave_worker_exec_job_group(Slave_worker*, Relay_log_info*)+0x4c9) [0x56520a9091ea]
      /home/sveta/build/ps-8.0/bin/mysqld(+0x462d60d) [0x56520a92360d]
      /home/sveta/build/ps-8.0/bin/mysqld(+0x51495a1) [0x56520b43f5a1]
      /lib/x86_64-linux-gnu/libpthread.so.0(+0x9609) [0x7f7c95792609]
      /lib/x86_64-linux-gnu/libc.so.6(clone+0x43) [0x7f7c94f09103]
      
      Trying to get some variables.
      Some pointers may be invalid and cause the dump to abort.
      Query (0): Connection ID (thread ID): 48
      Status: NOT_KILLED

              People

              Assignee:
              venkatesh.prasad Venkatesh Prasad
              Reporter:
              venkatesh.prasad Venkatesh Prasad