pg_createsubscriber: Fix an unpredictable recovery wait time.
author: Amit Kapila
Tue, 30 Jul 2024 08:47:30 +0000 (14:17 +0530)
committer: Amit Kapila
Tue, 30 Jul 2024 08:47:30 +0000 (14:17 +0530)

The tool uses the LSN returned by pg_create_logical_replication_slot()
as recovery_target_lsn. This LSN is ahead of the current WAL position, so
recovery cannot end until the publisher writes a WAL record that reaches
the target. On idle systems, this wait time is unpredictable and could
cause the promotion of the subscriber to fail. To avoid that, insert a
harmless WAL record on the publisher by calling pg_log_standby_snapshot().

Reported-by: Alexander Lakhin and Tom Lane
Diagnosed-by: Hayato Kuroda
Author: Euler Taveira
Reviewed-by: Hayato Kuroda, Amit Kapila
Backpatch-through: 17
Discussion: https://postgr.es/m/2377319.1719766794%40sss.pgh.pa.us
Discussion: https://postgr.es/m/CA+TgmoYcY+Wb67NAwaHT7MvxCSeV86oSc+va9hHKaasE42ukyw@mail.gmail.com

src/bin/pg_basebackup/pg_createsubscriber.c

index b02318782a66d3363502882d1180c7d02210dec0..b15fb98994aca8df0ec6b624c037f4c206863748 100644
@@ -778,6 +778,28 @@ setup_publisher(struct LogicalRepInfo *dbinfo)
        else
            exit(1);
 
+       /*
+        * Since we are using the LSN returned by the last replication slot as
+        * recovery_target_lsn, this LSN is ahead of the current WAL position
+        * and the recovery waits until the publisher writes a WAL record to
+        * reach the target and ends the recovery. On idle systems, this wait
+        * time is unpredictable and could lead to failure in promoting the
+        * subscriber. To avoid that, insert a harmless WAL record.
+        */
+       if (i == num_dbs - 1 && !dry_run)
+       {
+           PGresult   *res;
+
+           res = PQexec(conn, "SELECT pg_log_standby_snapshot()");
+           if (PQresultStatus(res) != PGRES_TUPLES_OK)
+           {
+               pg_log_error("could not write an additional WAL record: %s",
+                            PQresultErrorMessage(res));
+               disconnect_database(conn, true);
+           }
+           PQclear(res);
+       }
+
        disconnect_database(conn, false);
    }
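
The wait described in the commit message can be illustrated with a small standalone sketch. It is not code from pg_createsubscriber: the helper names parse_lsn() and wal_reaches_target() are hypothetical, and the comparison is a deliberately simplified model of when recovery can stop. It relies only on the textual LSN format "hi/lo" (two hexadecimal 32-bit halves) that PostgreSQL prints.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: convert a textual LSN such as "0/3000060" into a
 * 64-bit position. The part before the slash is the high 32 bits, the
 * part after it the low 32 bits. Returns false on malformed input.
 */
bool
parse_lsn(const char *text, uint64_t *lsn)
{
    unsigned int hi;
    unsigned int lo;
    int          consumed = 0;

    if (sscanf(text, "%X/%X%n", &hi, &lo, &consumed) != 2)
        return false;
    if (text[consumed] != '\0')
        return false;           /* trailing garbage after the LSN */

    *lsn = ((uint64_t) hi << 32) | lo;
    return true;
}

/*
 * Simplified model (an assumption, not the server's actual logic):
 * recovery can end only once the publisher's WAL has advanced to or
 * beyond the recovery target.
 */
bool
wal_reaches_target(uint64_t current_lsn, uint64_t target_lsn)
{
    return current_lsn >= target_lsn;
}
```

In this model, an idle publisher whose WAL position stays below the target keeps wal_reaches_target() false indefinitely. Writing one extra record, as the patch does with pg_log_standby_snapshot(), moves the position past the target so the check can succeed.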