-
-
Alternative Method for Log Shipping
-
- An alternative to the built-in standby mode described in the previous
- sections is to use a restore_command that polls the archive location.
- This was the only option available in versions 8.4 and below.
-
-
- Note that in this mode, the server will apply WAL one file at a
- time, so if you use the standby server for queries (see Hot Standby),
- there is a delay between an action in the primary and when the
- action becomes visible in the standby, corresponding to the time it takes
- to fill up the WAL file. archive_timeout can be used to make that delay
- shorter. Also note that you can't combine streaming replication with
- this method.
-
-
- The operations that occur on both primary and standby servers are
- normal continuous archiving and recovery tasks. The only point of
- contact between the two database servers is the archive of WAL files
- that both share: primary writing to the archive, standby reading from
- the archive. Care must be taken to ensure that WAL archives from separate
- primary servers do not become mixed together or confused. The archive
- need not be large if it is only required for standby operation.
-
-
- The magic that makes the two loosely coupled servers work together is
- simply a restore_command used on the standby that,
- when asked for the next WAL file, waits for it to become available from
- the primary. Normal recovery
- processing would request a file from the WAL archive, reporting failure
- if the file was unavailable. For standby processing it is normal for
- the next WAL file to be unavailable, so the standby must wait for
- it to appear. For files ending in
- .history there is no need to wait, and a non-zero return
- code must be returned. A waiting restore_command can be
- written as a custom script that loops after polling for the existence of
- the next WAL file. There must also be some way to trigger failover, which
- should interrupt the restore_command, break the loop and
- return a file-not-found error to the standby server. This ends recovery
- and the standby will then come up as a normal server.
-
-
- Pseudocode for a suitable restore_command is:
-triggered = false;
-while (!NextWALFileReady() && !triggered)
-{
- sleep(100000L); /* wait for ~0.1 sec */
- if (CheckForExternalTrigger())
- triggered = true;
-}
-if (!triggered)
- CopyWALFileForRecovery();
-
-
-
- The method for triggering failover is an important part of planning
- and design. One potential option is the restore_command
- command. It is executed once for each WAL file, but the process
- running the restore_command is created and dies for
- each file, so there is no daemon or server process, and
- signals or a signal handler cannot be used. Therefore, the
- restore_command is not suitable to trigger failover.
- It is possible to use a simple timeout facility, especially if
- used in conjunction with a known archive_timeout
- setting on the primary. However, this is somewhat error prone
- since a network problem or busy primary server might be sufficient
- to initiate failover. A notification mechanism such as the explicit
- creation of a trigger file is ideal, if this can be arranged.
-
-
-
-
Implementation
-
- The short procedure for configuring a standby server using this alternative
- method is as follows. For
- full details of each step, refer to previous sections as noted.
-
-
- Set up primary and standby systems as nearly identical as
- possible, including two identical copies of
-
PostgreSQL at the same release level.
-
-
-
- Set up continuous archiving from the primary to a WAL archive
- directory on the standby server. Ensure that
- ,
- and
-
- are set appropriately on the primary
- (see ).
-
-
-
- Make a base backup of the primary server (see
- linkend="backup-base-backup"/>), and load this data onto the standby.
-
-
-
- Begin recovery on the standby server from the local WAL
- archive, using restore_command that waits
- as described previously (see ).
-
-
-
-
-
- Recovery treats the WAL archive as read-only, so once a WAL file has
- been copied to the standby system it can be copied to tape at the same
- time as it is being read by the standby database server.
- Thus, running a standby server for high availability can be performed at
- the same time as files are stored for longer term disaster recovery
- purposes.
-
-
- For testing purposes, it is possible to run both primary and standby
- servers on the same system. This does not provide any worthwhile
- improvement in server robustness, nor would it be described as HA.
-
-
-
-
Hot Standby