- The result is equivalent to replacing the target data directory with the
- source one. Only changed blocks from relation files are copied;
- all other files are copied in full, including configuration files. The
- advantage of pg_rewind over taking a new base backup, or
- tools like rsync, is that pg_rewind does
- not require reading through unchanged blocks in the cluster. This makes
- it a lot faster when the database is large and only a small
- fraction of blocks differ between the clusters.
+ After a successful rewind, the state of the target data directory is
+ analogous to a base backup of the source data directory. Unlike taking
+ a new base backup or using a tool like rsync,
+ pg_rewind does not require comparing or copying
+ unchanged relation blocks in the cluster. Only changed blocks from existing
+ relation files are copied; all other files, including new relation files,
+ configuration files, and WAL segments, are copied in full. As such, the
+ rewind operation is significantly faster than other approaches when the
+ database is large and only a small fraction of blocks differ between the
+ clusters.
- When the target server is started for the first time after running
- pg_rewind, it will go into recovery mode and replay all
- WAL generated in the source server after the point of divergence.
- If some of the WAL was no longer available in the source server when
- pg_rewind was run, and therefore could not be copied by the
- pg_rewind session, it must be made available when the
- target server is started. This can be done by creating a
- recovery.signal file in the target data directory
- and configuring suitable
- in postgresql.conf.
+ After running pg_rewind, WAL replay needs to
+ complete for the data directory to be in a consistent state. When the
+ target server is started again, it will enter archive recovery and replay
+ all WAL generated in the source server from the last checkpoint before
+ the point of divergence. If some of the WAL was no longer available in the
+ source server when pg_rewind was run, and
+ therefore could not be copied by the pg_rewind
+ session, it must be made available when the target server is started.
+ This can be done by creating a recovery.signal file
+ in the target data directory and by configuring a suitable
+ in
+ postgresql.conf.
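
As a rough illustration of the recovery setup described above, the following Python sketch creates `recovery.signal` and appends a recovery setting to `postgresql.conf`. The text does not name the setting; `restore_command` is assumed here, and the helper name `prepare_archive_recovery` is hypothetical. Backslashes in the command are not handled.

```python
from pathlib import Path


def prepare_archive_recovery(target_pgdata: str, restore_command: str) -> None:
    """Hypothetical helper: prepare a rewound data directory for archive recovery."""
    pgdata = Path(target_pgdata)

    # recovery.signal is an empty marker file; only its presence matters.
    (pgdata / "recovery.signal").touch()

    # Single quotes inside a postgresql.conf string value are written doubled.
    quoted = restore_command.replace("'", "''")

    # Assumption: the setting to configure is restore_command.
    with open(pgdata / "postgresql.conf", "a", encoding="utf-8") as conf:
        conf.write(f"\nrestore_command = '{quoted}'\n")
```

Because the configuration files are copied from the source, a step like this would run after the rewind finishes and before the target server is started.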
recovered. In such a case, taking a new fresh backup is recommended.
+ As pg_rewind copies configuration files
+ entirely from the source, it may be required to correct the configuration
+ used for recovery before restarting the target server, especially if
+ the target is reintroduced as a standby of the source. If you restart
+ the server after the rewind operation has finished but without configuring
+ recovery, the target may again diverge from the primary.
+
+
pg_rewind will fail immediately if it finds
files it cannot write directly to. This can happen for example when
Copy all those changed blocks from the source cluster to
the target cluster, either using direct file system access
(--source-pgdata) or SQL (--source-server).
+ Relation files are now in a state equivalent to the moment of the last
+ completed checkpoint prior to the point at which the WAL timelines of the
+ source and target diverged, plus the current state on the source of any
+ blocks changed on the target after that divergence.
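
The two source modes named in this step can be sketched as a small command builder in Python. The function name is hypothetical. The `--target-pgdata` and `--dry-run` options do not appear in the text above and are included on the assumption that the installed pg_rewind supports them, as is the requirement that exactly one source be given.

```python
from typing import List, Optional


def build_pg_rewind_command(
    target_pgdata: str,
    source_pgdata: Optional[str] = None,
    source_server: Optional[str] = None,
    dry_run: bool = False,
) -> List[str]:
    """Hypothetical helper: assemble a pg_rewind argument list."""
    # Assumption: exactly one of the two source modes must be chosen.
    if (source_pgdata is None) == (source_server is None):
        raise ValueError("specify exactly one of source_pgdata or source_server")

    cmd = ["pg_rewind", f"--target-pgdata={target_pgdata}"]
    if source_pgdata is not None:
        # Direct file system access to the source data directory.
        cmd.append(f"--source-pgdata={source_pgdata}")
    else:
        # SQL access through a connection string to the source server.
        cmd.append(f"--source-server={source_server}")
    if dry_run:
        cmd.append("--dry-run")
    return cmd
```

The returned list can be passed to `subprocess.run` without a shell, which avoids quoting problems in connection strings.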
- Copy all other files such as pg_xact and
- configuration files from the source cluster to the target cluster
- (everything except the relation files). Similarly to base backups,
- the contents of the directories pg_dynshmem/,
+ Copy all other files, including new relation files, WAL segments,
+ pg_xact, and configuration files from the source
+ cluster to the target cluster. Similarly to base backups, the contents
+ of the directories pg_dynshmem/,
pg_notify/, pg_replslot/,
pg_serial/, pg_snapshots/,
- pg_stat_tmp/, and
- pg_subtrans/ are omitted from the data copied
- from the source cluster. Any file or directory beginning with
- pgsql_tmp is omitted, as well as are
+ pg_stat_tmp/, and pg_subtrans/
+ are omitted from the data copied from the source cluster. The files
backup_label,
tablespace_map,
pg_internal.init,
- postmaster.opts and
- postmaster.pid.
+ postmaster.opts, and
+ postmaster.pid, as well as any file or directory
+ beginning with pgsql_tmp, are omitted.
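
The omission rules listed in this step can be expressed as a small predicate. This is only a sketch of the rules as worded above, not pg_rewind's actual implementation. The function name is hypothetical, and matching file names on the final path component at any depth is an assumption.

```python
from pathlib import PurePosixPath

# Directories whose contents are omitted (the directory itself is kept).
EXCLUDED_DIR_CONTENTS = {
    "pg_dynshmem", "pg_notify", "pg_replslot", "pg_serial",
    "pg_snapshots", "pg_stat_tmp", "pg_subtrans",
}

# Individual files that are omitted.
EXCLUDED_FILES = {
    "backup_label", "tablespace_map", "pg_internal.init",
    "postmaster.opts", "postmaster.pid",
}

TEMP_PREFIX = "pgsql_tmp"


def is_omitted(relative_path: str) -> bool:
    """Return True if a path relative to the data directory would be skipped."""
    parts = PurePosixPath(relative_path).parts
    if not parts:
        return False
    # Any file or directory beginning with pgsql_tmp is omitted.
    if any(part.startswith(TEMP_PREFIX) for part in parts):
        return True
    # Only the contents of the listed directories are omitted.
    if len(parts) > 1 and parts[0] in EXCLUDED_DIR_CONTENTS:
        return True
    # Assumption: listed file names are matched on the last path component.
    return parts[-1] in EXCLUDED_FILES
```

A copy loop could call `is_omitted` on each path reported by the source and skip those for which it returns true.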
+
+
+
+ Create a backup_label file to begin WAL replay at
+ the checkpoint created at failover and configure the
+ pg_control file with a minimum consistency LSN
+ defined as the result of pg_current_wal_insert_lsn()
+ when rewinding from a live source or the last checkpoint LSN when
+ rewinding from a stopped source.
- Apply the WAL from the source cluster, starting from the checkpoint
- created at failover. (Strictly speaking, pg_rewind
- doesn't apply the WAL, it just creates a backup label file that
- makes PostgreSQL start by replaying all WAL from
- that checkpoint forward.)
+ When starting the target, PostgreSQL replays
+ all the required WAL, resulting in a data directory in a consistent
+ state.