-
+
Write-Ahead Logging (WAL)
Problems with indexes (problems 1 and 2) could possibly have been
fixed by additional fsync() calls, but it is
not obvious how to handle the last case without
-
WAL;
WAL saves the entire
- data page content in the log if that is required to ensure page
+
WAL;
WAL saves the entire
data
+ page content in the log if that is required to ensure page
consistency for after-crash recovery.
made by aborted transactions will still occupy disk space and that
we still need a permanent pg_clog file to hold
the status of transactions, since we are not able to re-use
- transaction identifiers. Once UNDO is implemented,
+ transaction identifiers. Once UNDO is implemented,
pg_clog will no longer be required to be
permanent; it will be possible to remove
- pg_clog at shutdown. (However, the urgency
- of this concern has decreased greatly with the adoption of a segmented
+ pg_clog at shutdown. (However, the urgency of
+ this concern has decreased greatly with the adoption of a segmented
storage method for pg_clog --- it is no longer
necessary to keep old pg_clog entries around
forever.)
- A difficulty standing in the way of realizing these benefits is that they
- require saving
WAL entries for considerable periods
- of time (eg, as long as the longest possible transaction if transaction
- UNDO is wanted). The present
WAL format is
- extremely bulky since it includes many disk page snapshots.
- This is not a serious concern at present, since the entries only need
- to be kept for one or two checkpoint intervals; but to achieve
- these future benefits some sort of compressed
WAL
- format will be needed.
+ A difficulty standing in the way of realizing these benefits is that
+ they require saving
WAL entries for considerable
+ periods of time (e.g., as long as the longest possible transaction if
+ transaction UNDO is wanted). The present
WAL
+ format is extremely bulky since it includes many disk page
+ snapshots. This is not a serious concern at present, since the
+ entries only need to be kept for one or two checkpoint intervals;
+ but to achieve these future benefits some sort of compressed
+
WAL format will be needed.
not occur often enough to prevent
WAL buffers
being written by LogInsert. On such systems
one should increase the number of
WAL buffers by
- modifying the WAL_BUFFERS parameter. The default
- number of
WAL buffers is 8. Increasing this
- value will correspondingly increase shared memory usage.
+ modifying the postgresql.conf
+ WAL_BUFFERS parameter. The default number of
+ WAL buffers is 8. Increasing this value will
+ correspondingly increase shared memory usage.
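
As a rough illustration of the change described in this hunk, a postgresql.conf fragment raising the number of WAL buffers might look like the sketch below. The value 16 is an arbitrary assumption chosen for the example, not a recommendation taken from the text, which only states that the default is 8 and that a larger value uses more shared memory.

```
# postgresql.conf (illustrative fragment)
WAL_BUFFERS = 16    # hypothetical value; the documented default is 8
```
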
Reducing CHECKPOINT_SEGMENTS and/or
- CHECKPOINT_TIMEOUT causes checkpoints to be
- done more often. This allows faster after-crash recovery (since
- less work will need to be redone). However, one must balance this against
- the increased cost of flushing dirty data pages more often. In addition,
- to ensure data page consistency, the first modification of a data page
- after each checkpoint results in logging the entire page content.
- Thus a smaller checkpoint interval increases the volume of output to
- the log, partially negating the goal of using a smaller interval, and
- in any case causing more disk I/O.
+ CHECKPOINT_TIMEOUT causes checkpoints to be done
+ more often. This allows faster after-crash recovery (since less work
+ will need to be redone). However, one must balance this against the
+ increased cost of flushing dirty data pages more often. In addition,
+ to ensure data page consistency, the first modification of a data
+ page after each checkpoint results in logging the entire page
+ content. Thus a smaller checkpoint interval increases the volume of
+ output to the log, partially negating the goal of using a smaller
+ interval, and in any case causing more disk I/O.
The number of 16MB segment files will always be at least
WAL_FILES + 1, and will normally not exceed
- WAL_FILES + 2 * CHECKPOINT_SEGMENTS
- + 1. This may be used to estimate space requirements for WAL. Ordinarily,
- when an old log segment file is no longer needed, it is recycled (renamed
- to become the next sequential future segment). If, due to a short-term
- peak of log output rate, there are more than WAL_FILES +
- 2 * CHECKPOINT_SEGMENTS + 1 segment files, then unneeded
- segment files will be deleted instead of recycled until the system gets
- back under this limit. (If this happens on a regular basis,
- WAL_FILES should be increased to avoid it. Deleting log
- segments that will only have to be created again later is expensive and
- pointless.)
+ WAL_FILES + MAX(WAL_FILES,
+ CHECKPOINT_SEGMENTS) + 1. This may be used to
+ estimate space requirements for WAL. Ordinarily, when old log
+ segment files are no longer needed, they are recycled (renamed to
+ become the next sequential future segments). If, due to a short-term
+ peak of log output rate, there are more than
+ WAL_FILES + MAX(WAL_FILES,
+ CHECKPOINT_SEGMENTS) + 1 segment files, then
+ unneeded segment files will be deleted instead of recycled until the
+ system gets back under this limit. (If this happens on a regular
+ basis, WAL_FILES should be increased to avoid it.
+ Deleting log segments that will only have to be created again later
+ is expensive and pointless.)
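
The segment-count bounds in this last hunk can be turned into a quick disk-space estimate. The following is a minimal sketch, not part of the patch: the function names are hypothetical, and the sample settings (WAL_FILES = 0, CHECKPOINT_SEGMENTS = 3) are assumptions picked only to show the arithmetic. It evaluates both the formula on the removed lines and the one on the added lines, using the 16MB segment size stated in the text.

```python
SEGMENT_MB = 16  # segment file size stated in the documentation text


def min_segments(wal_files):
    """Lower bound on the number of segment files: WAL_FILES + 1."""
    return wal_files + 1


def normal_max_segments_new(wal_files, checkpoint_segments):
    """Usual upper bound per the added (+) lines:
    WAL_FILES + MAX(WAL_FILES, CHECKPOINT_SEGMENTS) + 1."""
    return wal_files + max(wal_files, checkpoint_segments) + 1


def normal_max_segments_old(wal_files, checkpoint_segments):
    """Usual upper bound per the removed (-) lines:
    WAL_FILES + 2 * CHECKPOINT_SEGMENTS + 1."""
    return wal_files + 2 * checkpoint_segments + 1


def disk_space_mb(segment_count):
    """Disk space in MB occupied by the given number of segment files."""
    return segment_count * SEGMENT_MB


if __name__ == "__main__":
    # Hypothetical settings, chosen only for illustration.
    wal_files, checkpoint_segments = 0, 3
    low = min_segments(wal_files)
    high_new = normal_max_segments_new(wal_files, checkpoint_segments)
    high_old = normal_max_segments_old(wal_files, checkpoint_segments)
    print("minimum segments:", low, "->", disk_space_mb(low), "MB")
    print("normal maximum (new):", high_new, "->", disk_space_mb(high_new), "MB")
    print("normal maximum (old):", high_old, "->", disk_space_mb(high_old), "MB")
```

Because a short-term peak in log output can push the file count above the normal maximum before excess segments are deleted, the result is best read as a planning figure rather than a hard ceiling.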