Percona Server for MongoDB / PSMDB-183

Crash in 3.6 MongoRocks snapshot manager

    Details

    • Type: Bug
    • Status: Done
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      This issue appears sporadically; I tried to create a small reproducer, but I don't have one yet. It causes crashes in the CleanEveryN test hook, but it also causes sporadic crashes in other tests regardless of that hook.

      I have generated a core file; the gdb backtrace is pasted below:

      Program terminated with signal SIGSEGV, Segmentation fault.
      (gdb) bt
      #0  __GI___pthread_mutex_lock (mutex=0x50) at ../nptl/pthread_mutex_lock.c:67
      #1  0x000055ccb0ae41fe in __gthread_mutex_lock (__mutex=0x50) at /usr/include/x86_64-linux-gnu/c++/5/bits/gthr-default.h:748
      #2  std::mutex::lock (this=0x50) at /usr/include/c++/5/mutex:135
      #3  std::lock_guard<std::mutex>::lock_guard (__m=..., this=<synthetic pointer>) at /usr/include/c++/5/mutex:386
      #4  mongo::RocksSnapshotManager::cleanupUnneededSnapshots (this=0x0) at src/mongo/db/storage/rocks/src/rocks_snapshot_manager.cpp:65
      #5  0x000055ccb0ad800f in mongo::RocksRecoveryUnit::RocksRecoveryUnit (this=0x55ccb49a52c0, transactionEngine=<optimized out>,
          snapshotManager=<optimized out>, db=<optimized out>, counterManager=<optimized out>, compactionScheduler=<optimized out>, durabilityManager=0x0,
          durable=true) at src/mongo/db/storage/rocks/src/rocks_recovery_unit.cpp:229
      #6  0x000055ccb0abd33d in mongo::RocksEngine::newRecoveryUnit (this=0x55ccb40f96c0) at src/mongo/db/storage/rocks/src/rocks_engine.cpp:296
      #7  0x000055ccb0ccf5d5 in mongo::ServiceContextMongoD::_newOpCtx (this=0x55ccb3b3d480, client=<optimized out>, opId=4979)
          at src/mongo/db/service_context_d.cpp:275
      #8  0x000055ccb2173431 in mongo::ServiceContext::makeOperationContext (this=0x55ccb3b3d480, client=0x55ccb499a400) at src/mongo/db/service_context.cpp:237
      #9  0x000055ccb216ecf7 in mongo::Client::makeOperationContext (this=<optimized out>) at src/mongo/db/client.cpp:128
      #10 0x000055ccb0ce39e3 in mongo::ServiceStateMachine::_processMessage (this=0x55ccb49ac550, guard=...) at src/mongo/transport/service_state_machine.cpp:356
      #11 0x000055ccb0cdf5a7 in mongo::ServiceStateMachine::_runNextInGuard (this=0x55ccb49ac550, guard=...) at src/mongo/transport/service_state_machine.cpp:419
      #12 0x000055ccb0ce27f1 in mongo::ServiceStateMachine::<lambda()>::operator() (__closure=0x55ccb4fd1d40) at src/mongo/transport/service_state_machine.cpp:456
      #13 std::_Function_handler<void(), mongo::ServiceStateMachine::_scheduleNextWithGuard(mongo::ServiceStateMachine::ThreadGuard, mongo::transport::ServiceExecutor::ScheduleFlags, mongo::ServiceStateMachine::Ownership)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...)
          at /usr/include/c++/5/functional:1871
      #14 0x000055ccb1c11372 in std::function<void ()>::operator()() const (this=0x7f63b063d2b0) at /usr/include/c++/5/functional:2267
      #15 mongo::transport::ServiceExecutorSynchronous::schedule(std::function<void ()>, mongo::transport::ServiceExecutor::ScheduleFlags) (
          this=this@entry=0x55ccb40ea700, task=..., flags=flags@entry=mongo::transport::ServiceExecutor::kMayRecurse)
          at src/mongo/transport/service_executor_synchronous.cpp:118
      #16 0x000055ccb0cde410 in mongo::ServiceStateMachine::_scheduleNextWithGuard (this=this@entry=0x55ccb49ac550, guard=...,
          flags=flags@entry=mongo::transport::ServiceExecutor::kMayRecurse, ownershipModel=ownershipModel@entry=mongo::ServiceStateMachine::Ownership::kOwned)
          at src/mongo/transport/service_state_machine.cpp:459
      #17 0x000055ccb0ce0952 in mongo::ServiceStateMachine::_sourceCallback (this=this@entry=0x55ccb49ac550, status=...)
          at src/mongo/transport/service_state_machine.cpp:291
      #18 0x000055ccb0ce124b in mongo::ServiceStateMachine::_sourceMessage (this=this@entry=0x55ccb49ac550, guard=...)
          at src/mongo/transport/service_state_machine.cpp:250
      #19 0x000055ccb0cdf62d in mongo::ServiceStateMachine::_runNextInGuard (this=0x55ccb49ac550, guard=...) at src/mongo/transport/service_state_machine.cpp:416
      #20 0x000055ccb0ce27f1 in mongo::ServiceStateMachine::<lambda()>::operator() (__closure=0x55ccb4179260) at src/mongo/transport/service_state_machine.cpp:456
      #21 std::_Function_handler<void(), mongo::ServiceStateMachine::_scheduleNextWithGuard(mongo::ServiceStateMachine::ThreadGuard, mongo::transport::ServiceExecutor::ScheduleFlags, mongo::ServiceStateMachine::Ownership)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...)
          at /usr/include/c++/5/functional:1871
      #22 0x000055ccb1c118d5 in std::function<void ()>::operator()() const (this=<optimized out>) at /usr/include/c++/5/functional:2267
      #23 mongo::transport::ServiceExecutorSynchronous::<lambda()>::operator() (__closure=0x55ccb3ed4020)
          at src/mongo/transport/service_executor_synchronous.cpp:135
      #24 std::_Function_handler<void(), mongo::transport::ServiceExecutorSynchronous::schedule(mongo::transport::ServiceExecutor::Task, mongo::transport::ServiceExecutor::ScheduleFlags)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...) at /usr/include/c++/5/functional:1871
      #25 0x000055ccb2178204 in std::function<void ()>::operator()() const (this=<optimized out>) at /usr/include/c++/5/functional:2267
      #26 mongo::(anonymous namespace)::runFunc (ctx=0x55ccb41789a0) at src/mongo/transport/service_entry_point_utils.cpp:55
      #27 0x00007f63ae88e6ba in start_thread (arg=0x7f63b063e700) at pthread_create.c:333
      #28 0x00007f63ae5c43dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
      
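      Frame #4 shows cleanupUnneededSnapshots entered with this=0x0, i.e. the RocksRecoveryUnit constructor invoked the snapshot manager before it existed; the faulting address mutex=0x50 in frame #0 is then just the offset of the mutex member inside the (nonexistent) object. A minimal standalone sketch of this crash pattern and a defensive null check (class and member names here are illustrative stand-ins, not the actual MongoRocks code):

      ```cpp
      #include <cstdio>
      #include <mutex>

      // Hypothetical stand-in for RocksSnapshotManager.
      class SnapshotManager {
      public:
          void cleanupUnneededSnapshots() {
              // If `this` is null, locking the member mutex dereferences
              // the member's offset as an address -- producing a fault at
              // a small address like 0x50, exactly as in frame #0.
              std::lock_guard<std::mutex> lock(_mutex);
              // ... drop snapshots that are no longer referenced ...
          }

      private:
          char _padding[0x50]{};  // places _mutex at offset 0x50, mirroring the crash
          std::mutex _mutex;
      };

      // Hypothetical stand-in for RocksRecoveryUnit.
      class RecoveryUnit {
      public:
          explicit RecoveryUnit(SnapshotManager* sm) : _sm(sm) {
              // Defensive variant: only touch the manager once it exists.
              if (_sm) {
                  _sm->cleanupUnneededSnapshots();
              }
          }

      private:
          SnapshotManager* _sm;
      };

      int main() {
          SnapshotManager sm;
          RecoveryUnit normal(&sm);         // normal path: manager is set up
          RecoveryUnit survived(nullptr);   // would SIGSEGV without the check
          std::puts("no crash with null snapshot manager");
          return 0;
      }
      ```

      Whether the real fix belongs in the constructor or in the engine's startup ordering is for the developers to decide; the sketch only demonstrates why the trace faults at such a low address.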

      I used the following suite to generate the core; it took just a few minutes:

      buildscripts/resmoke.py --jobs=4 --shuffle --storageEngineCacheSizeGB=1 --storageEngine=rocksdb   --suites=replica_sets_resync_static_jscore_passthrough --reportFile=resmoke_replica_sets_resync_static_jscore_passthrough_rocksdb_1.json
      

      I will also upload the full `thread apply all bt` output from gdb, but as an attached file so as not to clutter the report.

      People

      Assignee: denis.protivenskii Denis Protivenskii (Inactive)
      Reporter: tomislav.plavcic@percona.com Tomislav Plavcic
      Votes: 0
      Watchers: 1
