pgbackrest-archive-push-local-gcs.sh could hang for days calling SSH
Description
PostgreSQL expects the archive command to finish within 60 seconds, but with the current setup a network problem can leave ssh hanging for days. This breaks backups and the creation of new replicas.
/opt/crunchy/bin/postgres-ha/pgbackrest/pgbackrest-archive-push-local-gcs.sh calls pgbackrest on the shared repo.
Possible solutions:
A) Implement a proper timeout at the SSH level with http://man.openbsd.org/ssh_config#ServerAliveInterval
B) Use the timeout command to kill pgbackrest archive-push once it runs longer than 60 seconds (e.g. after 120).
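
A rough sketch of option B, assuming the coreutils timeout command is available in the container. The names run_with_timeout, ARCHIVE_PUSH_TIMEOUT and ARCHIVE_PUSH_KILL_AFTER are hypothetical and not part of the existing script, and the pgbackrest arguments in the usage comment are placeholders. For option A, the ssh_config keywords that would matter are ServerAliveInterval and ServerAliveCountMax (plus ConnectTimeout for the initial connection); how they get passed to the ssh that pgbackrest spawns depends on the image's configuration.

```shell
#!/bin/bash
# Sketch only: bound the runtime of the archive push (option B).
# All variable and function names here are hypothetical.

ARCHIVE_PUSH_TIMEOUT="${ARCHIVE_PUSH_TIMEOUT:-120}"      # seconds before SIGTERM is sent
ARCHIVE_PUSH_KILL_AFTER="${ARCHIVE_PUSH_KILL_AFTER:-10}" # extra seconds before SIGKILL

# Run a command under a time limit.
# Returns 1 if the limit was hit, otherwise the command's own exit status.
run_with_timeout() {
    local limit="$1"
    shift
    timeout -k "${ARCHIVE_PUSH_KILL_AFTER}" "${limit}" "$@"
    local rc=$?
    # timeout exits 124 when the limit expired, 137 when SIGKILL was needed.
    if [ "${rc}" -eq 124 ] || [ "${rc}" -eq 137 ]; then
        echo "command exceeded ${limit}s and was stopped" >&2
        return 1
    fi
    return "${rc}"
}

# Possible use inside the wrapper script (arguments are placeholders):
# run_with_timeout "${ARCHIVE_PUSH_TIMEOUT}" pgbackrest archive-push "$1"
```

A non-zero exit makes PostgreSQL retry the same WAL segment later, so a hung push turns into a failed attempt instead of blocking the archiver indefinitely.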