author    Heikki Linnakangas
          Mon, 14 Jun 2010 06:04:21 +0000 (06:04 +0000)
committer Heikki Linnakangas
          Mon, 14 Jun 2010 06:04:21 +0000 (06:04 +0000)

If a corrupt WAL record is received by streaming replication, disconnect
and retry. If the record is genuinely corrupt in the master database,
there's little hope of recovering, but it's better than simply retrying
to apply the corrupt WAL record in a tight loop without even trying to
retransmit it, which is what we used to do.
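
The decision this commit adds can be sketched in isolation as follows. This is a simplified illustration, not the code in xlog.c: the flag values, the NextAction enum, the next_action() helper, and the stubbed WalRcvInProgress()/ShutdownWalRcv() bodies are all assumptions made for the example. Only the identifier names failedSources, XLOG_FROM_STREAM, WalRcvInProgress and ShutdownWalRcv come from the diff.

```c
#include <stdbool.h>

/* Bit flags naming WAL sources. The names follow xlog.c; the values are assumed. */
#define XLOG_FROM_ARCHIVE  (1 << 0)
#define XLOG_FROM_PG_XLOG  (1 << 1)
#define XLOG_FROM_STREAM   (1 << 2)

/* Hypothetical stand-ins for the walreceiver state and its control calls. */
static bool walrcv_running = true;
static int  shutdown_calls = 0;

static bool WalRcvInProgress(void) { return walrcv_running; }

static void ShutdownWalRcv(void)
{
    walrcv_running = false;
    shutdown_calls++;
}

/* Hypothetical result type: what the recovery loop should do next. */
typedef enum
{
    ACTION_WAIT_FOR_STREAM,     /* keep waiting for the walreceiver */
    ACTION_RETRY_FROM_ARCHIVE   /* go back to archive/pg_xlog */
} NextAction;

/*
 * failedSources is a bitmask of sources that already produced an invalid
 * record. If the stream itself delivered the bad record, waiting for more
 * streamed WAL would only replay the same record in a tight loop, so the
 * walreceiver is shut down and the other sources are tried again.
 */
static NextAction next_action(int failedSources)
{
    if (WalRcvInProgress())
    {
        if (failedSources & XLOG_FROM_STREAM)
        {
            ShutdownWalRcv();
            return ACTION_RETRY_FROM_ARCHIVE;
        }
        return ACTION_WAIT_FOR_STREAM;
    }

    /* Simplification: with no walreceiver running, read local sources. */
    return ACTION_RETRY_FROM_ARCHIVE;
}
```

In this sketch, calling next_action(XLOG_FROM_ARCHIVE | XLOG_FROM_PG_XLOG) while the walreceiver is running keeps waiting on the stream, whereas next_action(XLOG_FROM_STREAM) shuts the walreceiver down once and falls back to archive/pg_xlog.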

src/backend/access/transam/xlog.c

index a72d7f24da03d1ce33ec64685227bce968dfe8df..5787b3d164c95bba1732d9d416fc65e0c6a044de 100644
@@ -7,7 +7,7 @@
  * Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
- * $PostgreSQL: pgsql/src/backend/access/transam/xlog.c,v 1.423 2010/06/12 09:14:52 petere Exp $
+ * $PostgreSQL: pgsql/src/backend/access/transam/xlog.c,v 1.424 2010/06/14 06:04:21 heikki Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -9270,6 +9270,22 @@ retry:
            {
                if (WalRcvInProgress())
                {
+                   /*
+                    * If we find an invalid record in the WAL streamed from
+                    * master, something is seriously wrong. There's little
+                    * chance that the problem will just go away, but PANIC
+                    * is not good for availability either, especially in
+                    * hot standby mode. Disconnect, and retry from
+                    * archive/pg_xlog again. The WAL in the archive should
+                    * be identical to what was streamed, so it's unlikely
+                    * that it helps, but one can hope...
+                    */
+                   if (failedSources & XLOG_FROM_STREAM)
+                   {
+                       ShutdownWalRcv();
+                       continue;
+                   }
+
                    /*
                     * While walreceiver is active, wait for new WAL to arrive
                     * from primary.