PBM failed to finish external backup

Description

Hello,

We use Sanoid to take ZFS snapshots, with the help of pre- and post-snapshot scripts. The pre-snapshot script (/opt/sanoid-scripts/start-backup.sh) looks like this:

Then Sanoid creates a snapshot.

The post-snapshot script is (/opt/sanoid-scripts/stop-backup.sh):

The snapshot has been created:

But the backup itself is marked as still running:

It is now impossible to create a new backup:

Nor delete the current one:

Here is the backup description:

PBM version:

Installed via Percona Debian repositories.

The error can be reproduced by creating a script that runs both pre and post script sequentially:

The fix is to follow the optional step (3) described in the documentation: describe the backup and wait until its status is “copyReady” at both the PBM and replica set levels before the pre-snapshot script (/opt/sanoid-scripts/start-backup.sh) exits:
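For readers who want to try the same approach, here is a minimal sketch of such a wait loop. It is not the reporter's script. It assumes that `pbm describe-backup <name>` prints one `status:` line for the backup and one per replica set; the function names, the default timeout, and the polling interval are all illustrative.

```shell
#!/usr/bin/env bash
# Sketch only: block until PBM reports an external backup as copyReady.
# Assumption: `pbm describe-backup` prints "status: <value>" lines for the
# backup itself and for each replica set.

# Reads describe-backup output on stdin. Succeeds only if at least one
# "status:" line is present and every such line reads "copyReady".
all_copy_ready() {
    awk '
        /^[ \t-]*status:/ { seen = 1; if ($NF != "copyReady") bad = 1 }
        END { if (seen && !bad) exit 0; exit 1 }
    '
}

# Usage: wait_for_copy_ready BACKUP_NAME [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Polls until every status is copyReady, or gives up after the timeout.
wait_for_copy_ready() {
    local name=$1 timeout=${2:-300} interval=${3:-2} waited=0
    until pbm describe-backup "$name" 2>/dev/null | all_copy_ready; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "backup $name not copyReady after ${timeout}s" >&2
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
}
```

The pre-snapshot script would call `wait_for_copy_ready` with the name of the backup it has just started, and exit non-zero if the call fails so that no snapshot is taken of a backup that is not ready.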

Is the describe operation really optional? What could explain this broken backup state when all the operations are performed very quickly?

Kind regards,

Environment

None

Activity


Julien Riou February 7, 2024 at 3:39 PM

Hi,

No worries. We have added a 60-second delay after the backup creation and the error no longer occurs. But this is a quick-and-dirty workaround.

Aaditya Dubey February 7, 2024 at 2:57 PM

Hi

Thank you for the report.
We are checking the issue and will keep you posted.
Sorry for the delay in responding.

Julien Riou January 29, 2024 at 9:57 AM

The “workaround” has been applied for 3 days and we still have the issue:

Done

Details

Needs Review: Yes
Needs QA: Yes
Sprint: None
Created January 25, 2024 at 2:31 PM
Updated October 9, 2024 at 10:16 AM
Resolved September 24, 2024 at 12:24 PM