pgbackrest-archive-push-local-gcs.sh could hang for days calling SSH

Description

PostgreSQL expects the archive command to finish within 60 seconds, but with the current setup, network problems can cause ssh to hang for days. This breaks backups and the creation of new replicas.

/opt/crunchy/bin/postgres-ha/pgbackrest/pgbackrest-archive-push-local-gcs.sh calls pgbackrest on the shared repo

Possible solutions:
A) Implement a proper SSH keepalive timeout with ServerAliveInterval (http://man.openbsd.org/ssh_config#ServerAliveInterval).
B) Use the timeout command to kill pgbackrest archive-push after 60+ seconds (e.g. 120).
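
A minimal sketch of option B, assuming GNU coreutils timeout is available in the container. The variable ARCHIVE_PUSH_TIMEOUT and the function run_with_timeout are hypothetical names used for illustration; they are not taken from pgbackrest-archive-push-local-gcs.sh, and the real script's arguments may differ.

```shell
#!/bin/bash
# Hypothetical wrapper: bound how long one archive-push attempt may run.
# ARCHIVE_PUSH_TIMEOUT is an assumed variable name (seconds), defaulting to 120.
ARCHIVE_PUSH_TIMEOUT="${ARCHIVE_PUSH_TIMEOUT:-120}"

run_with_timeout() {
    # timeout(1) sends SIGTERM once the limit is reached; --kill-after=10
    # follows up with SIGKILL if the process is still alive 10 seconds later.
    timeout --kill-after=10 "${ARCHIVE_PUSH_TIMEOUT}" "$@"
}

# Illustrative call only; the arguments of the real script are not shown here:
# run_with_timeout pgbackrest archive-push "$1"
```

When the limit is hit and SIGTERM stops the command, timeout exits with status 124. Because PostgreSQL treats any non-zero exit status of the archive command as a failure and retries the same WAL segment later, a timed-out push should be retried rather than silently lost.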

Environment

None

AFFECTED CS IDs

CS0034210

Activity

Slava Sarzhan March 14, 2023 at 9:20 AM

A timeout was added to the "pgbackrest archive-push" command.

Done

Created March 6, 2023 at 3:07 PM
Updated March 5, 2024 at 3:51 PM
Resolved March 15, 2023 at 9:13 AM