Details
-
Improvement
-
Status: Done
-
Medium
-
Resolution: Fixed
-
None
-
None
-
None
Description
Hi,
There is no hard STR, however from time to time PITR doesn't restart automatically after backup.
STR:
1) perform backup
pbm logs -e "backup/2021-08-04T11:35:56Z" -t 0 -n rs1 -s D 2021-08-04T11:36:26Z D [rs1/localhost:27020] [backup/2021-08-04T11:35:56Z] releasing lock 2021-08-04T11:36:26Z I [rs1/localhost:27020] [backup/2021-08-04T11:35:56Z] backup finished 2021-08-04T11:36:26Z I [rs1/localhost:27020] [backup/2021-08-04T11:35:56Z] dropping tmp collections 2021-08-04T11:36:21Z I [rs1/localhost:27020] [backup/2021-08-04T11:35:56Z] s3.uploadPartSize is set to 10485760 (~10Mb) 2021-08-04T11:36:21Z D [rs1/localhost:27020] [backup/2021-08-04T11:35:56Z] set oplog span to {1628076973 1} / {1628076978 15} 2021-08-04T11:36:18Z I [rs1/localhost:27020] [backup/2021-08-04T11:35:56Z] mongodump finished, waiting for the oplog 2021-08-04T11:36:17Z I [rs1/localhost:27020] [backup/2021-08-04T11:35:56Z] s3.uploadPartSize is set to 10485760 (~10Mb) 2021-08-04T11:36:16Z D [rs1/localhost:27020] [backup/2021-08-04T11:35:56Z] wait for tmp users {1628076976 7} 2021-08-04T11:36:13Z D [rs1/localhost:27019] [backup/2021-08-04T11:35:56Z] skip after nomination, probably started by another node 2021-08-04T11:36:13Z D [rs1/localhost:27018] [backup/2021-08-04T11:35:56Z] skip after nomination, probably started by another node 2021-08-04T11:36:13Z I [rs1/localhost:27020] [backup/2021-08-04T11:35:56Z] backup started
2) start PITR
2021-08-04T11:36:56Z I [rs1/localhost:27020] [pitr] streaming started from 2021-08-04 11:36:18 +0000 UTC / 1628076978 2021-08-04T11:36:56Z D [rs1/localhost:27020] [pitr] start_ok 2021-08-04T11:36:56Z D [rs1/localhost:27020] [pitr] lastTS set to {1628076978 15} 2021-08-04T11:36:18 2021-08-04T11:36:56Z I [rs1/localhost:27020] [pitr] restore after 2021-08-04T07:54:49Z the chunk &{rs1 pbmPitr/rs1/20210804/20210804075637-19.20210804080421-1.oplog.snappy s2 {%!s(uint32=1628063797) %!s(uint32=19)} {%!s(uint32=1628064261) %!s(uint32=1)} %!s(int64=0)}, skip
3) perform backup
pbm logs -e "backup/2021-08-04T11:37:14Z" -t 0 -n rs1 -s D 2021-08-04T11:37:42Z D [rs1/localhost:27020] [backup/2021-08-04T11:37:14Z] releasing lock 2021-08-04T11:37:42Z I [rs1/localhost:27020] [backup/2021-08-04T11:37:14Z] backup finished 2021-08-04T11:37:42Z I [rs1/localhost:27020] [backup/2021-08-04T11:37:14Z] dropping tmp collections 2021-08-04T11:37:39Z I [rs1/localhost:27020] [backup/2021-08-04T11:37:14Z] s3.uploadPartSize is set to 10485760 (~10Mb) 2021-08-04T11:37:39Z D [rs1/localhost:27020] [backup/2021-08-04T11:37:14Z] set oplog span to {1628077053 1} / {1628077056 7} 2021-08-04T11:37:35Z I [rs1/localhost:27020] [backup/2021-08-04T11:37:14Z] mongodump finished, waiting for the oplog 2021-08-04T11:37:35Z I [rs1/localhost:27020] [backup/2021-08-04T11:37:14Z] s3.uploadPartSize is set to 10485760 (~10Mb) 2021-08-04T11:37:35Z D [rs1/localhost:27020] [backup/2021-08-04T11:37:14Z] wait for tmp users {1628077055 4} 2021-08-04T11:37:32Z D [rs1/localhost:27019] [backup/2021-08-04T11:37:14Z] skip after nomination, probably started by another node 2021-08-04T11:37:31Z D [rs1/localhost:27018] [backup/2021-08-04T11:37:14Z] skip after nomination, probably started by another node 2021-08-04T11:37:31Z I [rs1/localhost:27020] [backup/2021-08-04T11:37:14Z] backup started
PITR temporary stopped for the backup, but never restarts again:
2021-08-04T11:37:35Z I [rs1/localhost:27020] [pitr] pausing/stopping with last_ts 2021-08-04 11:37:33 +0000 UTC 2021-08-04T11:37:35Z I [rs1/localhost:27020] [pitr] created chunk 2021-08-04T11:36:18 - 2021-08-04T11:37:33 2021-08-04T11:37:35Z I [rs1/localhost:27020] [pitr] s3.uploadPartSize is set to 10485760 (~10Mb) 2021-08-04T11:37:31Z I [rs1/localhost:27020] [pitr] got wake_up signal
Current list of backups and PITR chunks:
pbm list --full Backup snapshots: 2021-08-04T07:54:49Z [complete: 2021-08-04T07:55:11] 2021-08-04T07:56:15Z [complete: 2021-08-04T07:56:37] 2021-08-04T11:35:56Z [complete: 2021-08-04T11:36:18] 2021-08-04T11:37:14Z [complete: 2021-08-04T11:37:36] 2021-08-04T13:54:25Z [complete: 2021-08-04T13:54:47] 2021-08-04T13:56:03Z [complete: 2021-08-04T13:56:25] PITR <on>: 2021-08-04T07:55:11 - 2021-08-04T08:04:21 2021-08-04T11:36:18 - 2021-08-04T11:37:33 2021-08-04T13:54:47 - 2021-08-04T14:06:32 config: [{ 2021-08-04T07:55:11 - 2021-08-04T08:04:26} { 2021-08-04T11:36:18 - 2021-08-04T11:38:00} { 2021-08-04T13:54:47 - 2021-08-04T14:06:32}] rs1: [{ 2021-08-04T07:55:11 - 2021-08-04T08:04:21} { 2021-08-04T11:36:18 - 2021-08-04T11:37:33} { 2021-08-04T13:54:47 - 2021-08-04T14:06:33}] rs2: [{ 2021-08-04T07:55:11 - 2021-08-04T08:04:25} { 2021-08-04T11:36:18 - 2021-08-04T11:37:57} { 2021-08-04T13:54:47 - 2021-08-04T14:06:32}]
On rs2 PITR was successfully restarted:
2021-08-04T11:37:42Z I [rs2/localhost:28018] [pitr] streaming started from 2021-08-04 11:37:36 +0000 UTC / 1628077056 2021-08-04T11:37:42Z D [rs2/localhost:28018] [pitr] start_ok 2021-08-04T11:37:42Z D [rs2/localhost:28018] [pitr] lastTS set to {1628077056 7} 2021-08-04T11:37:36 2021-08-04T11:37:42Z I [rs2/localhost:28018] [pitr] copied chunk 2021-08-04T11:37:33 - 2021-08-04T11:37:36 2021-08-04T11:37:35Z I [rs2/localhost:28018] [pitr] pausing/stopping with last_ts 2021-08-04 11:37:34 +0000 UTC 2021-08-04T11:37:35Z I [rs2/localhost:28018] [pitr] created chunk 2021-08-04T11:36:18 - 2021-08-04T11:37:34 2021-08-04T11:37:35Z I [rs2/localhost:28018] [pitr] s3.uploadPartSize is set to 10485760 (~10Mb) 2021-08-04T11:37:31Z I [rs2/localhost:28018] [pitr] got wake_up signal