-
+
Arrays
- An alternative syntax, which conforms to the SQL standard, may
+ An alternative syntax, which conforms to the SQL standard, can
be used for one-dimensional arrays.
pay_by_quarter could have been defined
as:
To write an array value as a literal constant, enclose the element
values within curly braces and separate them by commas. (If you
know C, this is not unlike the C syntax for initializing
- structures.) You may put double quotes around any element value,
+ structures.) You can put double quotes around any element value,
and must do so if it contains commas or curly braces. (More
details appear below.) Thus, the general format of an array
constant is the following:
- The ARRAY> constructor syntax may also be used:
+ The ARRAY> constructor syntax can also be used:
INSERT INTO sal_emp
VALUES ('Bill',
WHERE name = 'Carol';
- An array may also be updated at a single element:
+ An array can also be updated at a single element:
UPDATE sal_emp SET pay_by_quarter[4] = 15000
Note that the concatenation operator discussed above is preferred over
direct use of these functions. In fact, the functions exist primarily for use
- in implementing the concatenation operator. However, they may be directly
+ in implementing the concatenation operator. However, they might be directly
useful in the creation of user-defined aggregates. Some examples:
Arrays are not sets; searching for specific array elements
- may be a sign of database misdesign. Consider
+ can be a sign of database misdesign. Consider
using a separate table with a row for each item that would be an
array element. This will be easier to search, and is likely to
scale up better to large numbers of elements.
or backslashes disables this and allows the literal string value
NULL> to be entered. Also, for backwards compatibility with
pre-8.2 versions of
PostgreSQL>, the
- linkend="guc-array-nulls"> configuration parameter may be turned
+ linkend="guc-array-nulls"> configuration parameter might be turned
off> to suppress recognition of NULL> as a NULL.
- You may write whitespace before a left brace or after a right
- brace. You may also write whitespace before or after any individual item
+ You can write whitespace before a left brace or after a right
+ brace. You can also write whitespace before or after any individual item
string. In all of these cases the whitespace will be ignored. However,
whitespace within double-quoted elements, or surrounded on both sides by
non-whitespace characters of an element, is not ignored.
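The literal-constant syntax, the ARRAY constructor, and single-element update discussed above can be sketched together. This uses the sal_emp table from the chapter's running example; the exact column layout is assumed from context:

```sql
-- SQL-standard alternative syntax for a one-dimensional array column.
CREATE TABLE sal_emp (
    name            text,
    pay_by_quarter  integer ARRAY[4]
);

-- Literal constant: element values inside curly braces, separated by
-- commas.  Double quotes around an element are optional unless it
-- contains commas or curly braces.
INSERT INTO sal_emp VALUES ('Bill', '{10000, 10000, 10000, 10000}');

-- The ARRAY constructor syntax.
INSERT INTO sal_emp VALUES ('Carol', ARRAY[20000, 25000, 25000, 25000]);

-- Updating an array at a single element.
UPDATE sal_emp SET pay_by_quarter[4] = 15000 WHERE name = 'Bill';
```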
-
+
Backup and Restore
By default, the
psql> script will continue to
- execute after an SQL error is encountered. You may wish to use the
+ execute after an SQL error is encountered. You might wish to use the
following command at the top of the script to alter that
behaviour and have
psql exit with an
exit status of 3 if an SQL error occurs:
passing the -1> or --single-transaction>
command-line options to
psql>. When using this
mode, be aware that even the smallest of errors can rollback a
- restore that has already run for many hours. However, that may
+ restore that has already run for many hours. However, that might
still be preferable to manually cleaning up a complex database
after a partially restored dump.
If you have dug into the details of the file system layout of the
- database, you may be tempted to try to back up or restore only certain
+ database, you might be tempted to try to back up or restore only certain
individual tables or databases from their respective files or
directories. This will not> work because the
information contained in these files contains only half the
- If your database is spread across multiple file systems, there may not
+ If your database is spread across multiple file systems, there might not
be any way to obtain exactly-simultaneous frozen snapshots of all
the volumes. For example, if your data files and WAL log are on different
disks, or if tablespaces are on different file systems, it might
Since we can string together an indefinitely long sequence of WAL files
for replay, continuous backup can be achieved simply by continuing to archive
the WAL files. This is particularly valuable for large databases, where
- it may not be convenient to take a full backup frequently.
+ it might not be convenient to take a full backup frequently.
As with the plain file-system-backup technique, this method can only
support restoration of an entire database cluster, not a subset.
- Also, it requires a lot of archival storage: the base backup may be bulky,
+ Also, it requires a lot of archival storage: the base backup might be bulky,
and a busy system will generate many megabytes of WAL traffic that
have to be archived. Still, it is the preferred backup technique in
many situations where high reliability is needed.
which will copy archivable WAL segments to the directory
/mnt/server/archivedir>. (This is an example, not a
- recommendation, and may not work on all platforms.)
+ recommendation, and might not work on all platforms.)
In writing your archive command, you should assume that the file names to
- be archived may be up to 64 characters long and may contain any
+ be archived can be up to 64 characters long and can contain any
combination of ASCII letters, digits, and dots. It is not necessary to
remember the original relative path (%p>) but it is necessary to
remember the file name (%f>).
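The %p/%f substitution described above can be sketched as a small shell helper. This mirrors the documentation's cp-based example and, like it, is an illustration rather than a recommendation; the helper name and archive directory are hypothetical:

```shell
# Hypothetical helper equivalent to an archive_command such as:
#   archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'
# $1 = %p (path of the WAL segment), $2 = archive directory, $3 = %f (file name).
archive_segment() {
    src="$1"; dst="$2/$3"
    # Refuse to overwrite an already-archived segment, then copy it.
    test ! -f "$dst" && cp "$src" "$dst"
}
```

The `test ! -f` guard matters: an archive command should fail rather than silently overwrite a segment that was already archived.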
postgresql.conf>, pg_hba.conf> and
pg_ident.conf>), since those are edited manually rather
than through SQL operations.
- You may wish to keep the configuration files in a location that will
+ You might wish to keep the configuration files in a location that will
be backed up by your regular file system backup procedures. See
for how to relocate the
configuration files.
between pg_start_backup> and the start of the actual backup,
nor between the end of the backup and pg_stop_backup>; a
few minutes' delay won't hurt anything. (However, if you normally run the
- server with full_page_writes> disabled, you may notice a drop
+ server with full_page_writes> disabled, you might notice a drop
in performance between pg_start_backup> and
pg_stop_backup>, since full_page_writes> is
effectively forced on during backup mode.) You must ensure that these
- You may, however, omit from the backup dump the files within the
+ You can, however, omit from the backup dump the files within the
pg_xlog/> subdirectory of the cluster directory. This
slight complication is worthwhile because it reduces the risk
of mistakes when restoring. This is easy to arrange if
the file system backup and the WAL segment files used during the
backup (as specified in the backup history file), all archived WAL
segments with names numerically less are no longer needed to recover
- the file system backup and may be deleted. However, you should
+ the file system backup and can be deleted. However, you should
consider keeping several backup sets to be absolutely certain that
you can recover your data.
require that you have enough free space on your system to hold two
copies of your existing database. If you do not have enough space,
you need at the least to copy the contents of the pg_xlog>
- subdirectory of the cluster data directory, as it may contain logs which
+ subdirectory of the cluster data directory, as it might contain logs which
were not archived before the system went down.
Create a recovery command file recovery.conf> in the cluster
- data directory (see ). You may
+ data directory (see ). You might
also want to temporarily modify pg_hba.conf> to prevent
ordinary users from connecting until you are sure the recovery has worked.
recovery.conf> is the restore_command>,
which tells
PostgreSQL> how to get back archived
WAL file segments. Like the archive_command>, this is
- a shell command string. It may contain %f>, which is
+ a shell command string. It can contain %f>, which is
replaced by the name of the desired log file, and %p>,
which is replaced by the path name to copy the log file to.
(The path name is relative to the working directory of the server,
It should also be noted that the default
WAL
format is fairly bulky since it includes many disk page snapshots.
These page snapshots are designed to support crash recovery, since
- we may need to fix partially-written disk pages. Depending on
- your system hardware and software, the risk of partial writes may
+ we might need to fix partially-written disk pages. Depending on
+ your system hardware and software, the risk of partial writes might
be small enough to ignore, in which case you can significantly
reduce the total volume of archived logs by turning off page
snapshots using the
use of the logs for PITR operations. An area for future
development is to compress archived WAL data by removing
unnecessary page copies even when full_page_writes> is
- on. In the meantime, administrators may wish to reduce the number
+ on. In the meantime, administrators might wish to reduce the number
of page snapshots included in WAL by increasing the checkpoint
interval parameters as much as feasible.
connectivity between the two and the viability of the primary. It is
also possible to use a third system (called a witness server) to avoid
some problems of inappropriate failover, but the additional complexity
- may not be worthwhile unless it is set-up with sufficient care and
+ might not be worthwhile unless it is set up with sufficient care and
rigorous testing.
Once failover to the standby occurs, we have only a
single server in operation. This is known as a degenerate state.
The former standby is now the primary, but the former primary is down
- and may stay down. To return to normal operation we must
+ and might stay down. To return to normal operation we must
fully recreate a standby server,
either on the former primary system when it comes up, or on a third,
possibly new, system. Once complete the primary and standby can be
It is recommended that you use the
pg_dump> and
pg_dumpall> programs from the newer version of
PostgreSQL>, to take advantage of any enhancements
- that may have been made in these programs. Current releases of the
+ that might have been made in these programs. Current releases of the
dump programs can read data from any server version back to 7.0.
When you move the old installation out of the way
- it may no longer be perfectly usable. Some of the executable programs
+ it might no longer be perfectly usable. Some of the executable programs
contain absolute paths to various installed programs and data files.
This is usually not a big problem but if you plan on using two
installations in parallel for a while you should assign them
-
+
- Related information may be found in the documentation for
+ Related information can be found in the documentation for
-
+
The catalog pg_amop stores information about
operators associated with access method operator families. There is one
row for each operator that is a member of an operator family. An operator
- can appear in more than one family, but may not appear in more than one
+ can appear in more than one family, but cannot appear in more than one
position within a family.
Always -1 in storage, but when loaded into a row descriptor
- in memory this may be updated to cache the offset of the attribute
+ in memory this might be updated to cache the offset of the attribute
within the row
bool
- This column is defined locally in the relation. Note that a column may
+ This column is defined locally in the relation. Note that a column can
be locally defined and inherited simultaneously
database authorization identifiers (roles). A role subsumes the concepts
of users> and groups>. A user is essentially just a
role with the rolcanlogin> flag set. Any role (with or
- without rolcanlogin>) may have other roles as members; see
+ without rolcanlogin>) can have other roles as members; see
pg_auth_members .
|
rolcreaterole
bool
- Role may create more roles
+ Role can create more roles
|
rolcreatedb
bool
- Role may create databases
+ Role can create databases
|
rolcatupdate
bool
- Role may update system catalogs directly. (Even a superuser may not do
+ Role can update system catalogs directly. (Even a superuser cannot do
this unless this column is true)
rolcanlogin
bool
- Role may log in. That is, this role can be given as the initial
+ Role can log in. That is, this role can be given as the initial
session authorization identifier
admin_option
bool
- True if member> may grant membership in
+ True if member> can grant membership in
roleid> to others
char
- Indicates what contexts the cast may be invoked in.
+ Indicates what contexts the cast can be invoked in.
e> means only as an explicit cast (using
CAST> or ::> syntax).
a> means implicitly in assignment
In all cases, a pg_depend entry indicates that the
- referenced object may not be dropped without also dropping the dependent
+ referenced object cannot be dropped without also dropping the dependent
object. However, there are several subflavors identified by
deptype>:
A normal relationship between separately-created objects. The
- dependent object may be dropped without affecting the
- referenced object. The referenced object may only be dropped
+ dependent object can be dropped without affecting the
+ referenced object. The referenced object can only be dropped
by specifying CASCADE>, in which case the dependent
object is dropped, too. Example: a table column has a normal
dependency on its data type.
- Other dependency flavors may be needed in future.
+ Other dependency flavors might be needed in future.
This is false for internal languages (such as
SQL ) and true for user-defined languages.
Currently,
pg_dump still uses this
- to determine which languages need to be dumped, but this may be
+ to determine which languages need to be dumped, but this might be
replaced by a different mechanism in the future
True if this is a trusted language, which means that it is believed
not to grant access to anything outside the normal SQL execution
- environment. Only superusers may create functions in untrusted
+ environment. Only superusers can create functions in untrusted
languages
bytea
Actual data stored in the large object.
- This will never be more than LOBLKSIZE> bytes and may be less
+ This will never be more than LOBLKSIZE> bytes and might be less
Each row of pg_largeobject holds data
for one page of a large object, beginning at
byte offset (pageno * LOBLKSIZE>) within the object. The implementation
- allows sparse storage: pages may be missing, and may be shorter than
+ allows sparse storage: pages might be missing, and might be shorter than
LOBLKSIZE> bytes even if they are not the last page of the object.
Missing regions within a large object read as zeroes.
It is s for stable> functions,
whose results (for fixed inputs) do not change within a scan.
It is v for volatile> functions,
- whose results may change at any time. (Use v also
+ whose results might change at any time. (Use v also
for functions with side-effects, so that calls to them cannot get
optimized away.)
In all cases, a pg_shdepend entry indicates that
- the referenced object may not be dropped without also dropping the dependent
+ the referenced object cannot be dropped without also dropping the dependent
object. However, there are several subflavors identified by
deptype>:
- Other dependency flavors may be needed in future. Note in particular
+ Other dependency flavors might be needed in future. Note in particular
that the current definition only supports roles as referenced objects.
- Since different kinds of statistics may be appropriate for different
+ Since different kinds of statistics might be appropriate for different
kinds of data, pg_statistic is designed not
to assume very much about what sort of statistics it stores. Only
extremely general statistics (such as nullness) are given dedicated
pg_statistic should not be readable by the
public, since even statistical information about a table's contents
- may be considered sensitive. (Example: minimum and maximum values
+ might be considered sensitive. (Example: minimum and maximum values
of a salary column might be quite interesting.)
pg_stats
is a publicly readable view on
default expression represented by typdefaultbin>. If
typdefaultbin> is null and typdefault> is
not, then typdefault> is the external representation of
- the type's default value, which may be fed to the type's input
+ the type's default value, which might be fed to the type's input
converter to produce a constant
Cursors are used internally to implement some of the components
of
PostgreSQL>, such as procedural languages.
- Therefore, the pg_cursors> view may include cursors
+ Therefore, the pg_cursors> view might include cursors
that have not been explicitly created by the user.
pg_locks contains one row per active lockable
object, requested lock mode, and relevant transaction. Thus, the same
- lockable object may
+ lockable object might
appear many times, if multiple transactions are holding or waiting
for locks on it. However, an object that currently has no locks on it
will not appear at all.
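The one-row-per-lock-per-transaction layout described above can be sketched with a query; the columns used here are assumed from the view's documented layout:

```sql
-- Multiple rows can appear for the same lockable object: list who holds
-- or waits for each relation-level lock.
SELECT locktype, relation, pid, mode, granted
FROM pg_locks
WHERE locktype = 'relation';
```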
rolcreaterole
bool
- Role may create more roles
+ Role can create more roles
|
rolcreatedb
bool
- Role may create databases
+ Role can create databases
|
bool
- Role may update system catalogs directly. (Even a superuser may not do
+ Role can update system catalogs directly. (Even a superuser cannot do
this unless this column is true.)
bool
- Role may log in. That is, this role can be given as the initial
+ Role can log in. That is, this role can be given as the initial
session authorization identifier
usecreatedb
bool
- User may create databases
+ User can create databases
|
bool
- User may update system catalogs. (Even a superuser may not do
+ User can update system catalogs. (Even a superuser cannot do
this unless this column is true.)
|
usecreatedb
bool
- User may create databases
+ User can create databases
|
usecatupd
bool
- User may update system catalogs. (Even a superuser may not do
+ User can update system catalogs. (Even a superuser cannot do
this unless this column is true.)
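The per-role capability flags described above can be inspected directly. A sketch using the publicly readable pg_roles view (rather than pg_authid, which also exposes passwords):

```sql
-- Which roles can log in, create databases, or create other roles?
SELECT rolname, rolcanlogin, rolcreatedb, rolcreaterole
FROM pg_roles
ORDER BY rolname;
```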
-
+
Localization>
environment variables seen by the server, not by the environment
of any client. Therefore, be careful to configure the correct locale settings
before starting the server. A consequence of this is that if
- client and server are set up in different locales, messages may
+ client and server are set up in different locales, messages might
appear in different languages depending on where they originated.
If locale support doesn't work in spite of the explanation above,
check that the locale support in your operating system is
correctly configured. To check what locales are installed on your
- system, you may use the command locale -a if
+ system, you can use the command locale -a if
your operating system provides it.
-
+
Client Authentication
runs. If all the users of a particular server also have accounts on
the server's machine, it makes sense to assign database user names
that match their operating system user names. However, a server that
- accepts remote connections may have many database users who have no local operating system
+ accepts remote connections might have many database users who have no local operating system
account, and in such cases there need be no connection between
database user names and OS user names.
- A record may have one of the seven formats
+ A record can have one of the seven formats
local database user auth-method auth-option
host database user CIDR-address auth-method auth-option
IP-mask
- These fields may be used as an alternative to the
+ These fields can be used as an alternative to the
CIDR-address notation. Instead of
specifying the mask length, the actual mask is specified in a
separate column. For example, 255.0.0.0> represents an IPv4
# If these are the only three lines for local connections, they will
# allow local users to connect only to their own databases (databases
# with the same name as their database user name) except for administrators
-# and members of role "support", who may connect to all databases. The file
+# and members of role "support", who can connect to all databases. The file
# $PGDATA/admins contains a list of names of administrators. Passwords
# are required in all cases.
#
trust> authentication is appropriate and very
convenient for local connections on a single-user workstation. It
is usually not> appropriate by itself on a multiuser
- machine. However, you may be able to use trust> even
+ machine. However, you might be able to use trust> even
on a multiuser machine, if you restrict access to the server's
Unix-domain socket file using file-system permissions. To do this, set the
unix_socket_permissions (and possibly
./configure --with-krb-srvnam=whatever>. In most environments,
this parameter never needs to be changed. However, to support multiple
PostgreSQL> installations on the same host it is necessary.
- Some Kerberos implementations may also require a different service name,
+ Some Kerberos implementations might also require a different service name,
such as Microsoft Active Directory which requires the service name
to be in uppercase (POSTGRES ).
as which database user. The same map-name> can be
used repeatedly to specify more user-mappings within a single map.
There is no restriction regarding how many database users a given
- operating system user may correspond to, nor vice versa.
+ operating system user can correspond to, nor vice versa.
will encrypt only the connection between the PostgreSQL server
and the LDAP server. The connection between the client and the
PostgreSQL server is not affected by this setting. To make use of
- TLS encryption, you may need to configure the LDAP library prior
+ TLS encryption, you might need to configure the LDAP library prior
to configuring PostgreSQL. Note that encrypted LDAP is available only
if the platform's LDAP library supports it.
The database you are trying to connect to does not exist. Note that
if you do not specify a database name, it defaults to the database
- user name, which may or may not be the right thing.
+ user name, which might or might not be the right thing.
- The server log may contain more information about an
+ The server log might contain more information about an
authentication failure than is reported to the client. If you are
confused about the reason for a failure, check the log.
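The record formats listed above can be sketched in a pg_hba.conf fragment; the addresses and method choices here are illustrative, not recommendations:

```
# TYPE   DATABASE   USER   CIDR-ADDRESS   METHOD
local    all        all                   trust
host     all        all    127.0.0.1/32   md5
# The same host rule, using the separate IP-address and IP-mask columns
# instead of CIDR-address notation:
host     all        all    127.0.0.1      255.255.255.255   md5
```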
-
+
Server Configuration
All parameter names are case-insensitive. Every parameter takes a
value of one of four types: Boolean, integer, floating point,
- or string. Boolean values may be written as ON ,
+ or string. Boolean values can be written as ON ,
OFF , TRUE ,
FALSE , YES ,
NO , 1 , 0
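The four value types and the accepted boolean spellings can be sketched in a postgresql.conf fragment; the particular parameters chosen are illustrative:

```
# Boolean: any of on, off, true, false, yes, no, 1, 0
# (case-insensitively) is accepted.
fsync = on
log_connections = yes
# Integer, floating-point, and string values:
max_connections = 100
random_page_cost = 4.0
log_directory = 'pg_log'
```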
postgresql.conf . Note that this means you won't
be able to change the value on-the-fly by editing
postgresql.conf , so while the command-line
- method may be convenient, it can cost you flexibility later.
+ method might be convenient, it can cost you flexibility later.
Determines the maximum number of concurrent connections to the
database server. The default is typically 100 connections, but
- may be less if your kernel settings will not support it (as
+ might be less if your kernel settings will not support it (as
determined during
initdb>). This parameter can
only be set at server start.
- Increasing this parameter may cause PostgreSQL>
+ Increasing this parameter might cause PostgreSQL>
to request more System V> shared
memory or semaphores than your operating system's default configuration
allows. See for information on how to
On systems that support the TCP_KEEPCNT socket option, specifies how
- many keepalives may be lost before the connection is considered dead.
+ many keepalives can be lost before the connection is considered dead.
A value of zero uses the system default. If TCP_KEEPCNT is not
supported, this parameter must be zero. This parameter is ignored
for connections made via a Unix-domain socket.
Sets the amount of memory the database server uses for shared
memory buffers. The default is typically 32 megabytes
- (32MB>), but may be less if your kernel settings will
+ (32MB>), but might be less if your kernel settings will
not support it (as determined during
initdb>).
This setting must be at least 128 kilobytes and at least 16
kilobytes times . (Non-default
- Increasing this parameter may cause PostgreSQL>
+ Increasing this parameter might cause PostgreSQL>
to request more System V> shared
memory than your operating system's default configuration
allows. See for information on how to
- Increasing this parameter may cause PostgreSQL>
+ Increasing this parameter might cause PostgreSQL>
to request more System V> shared
memory than your operating system's default configuration
allows. See for information on how to
operations can be executed at a time by a database session, and
an installation normally doesn't have many of them running
concurrently, it's safe to set this value significantly larger
- than work_mem . Larger settings may improve
+ than work_mem . Larger settings might improve
performance for vacuuming and for restoring database dumps.
routine in the server, but only in key potentially-recursive routines
such as expression evaluation. The default setting is two
megabytes (2MB>), which is conservatively small and
- unlikely to risk crashes. However, it may be too small to allow
+ unlikely to risk crashes. However, it might be too small to allow
execution of complex functions. Only superusers can change this
setting.
These parameters control the size of the shared free space
map>, which tracks the locations of unused space in the database.
- An undersized free space map may cause the database to consume
+ An undersized free space map can cause the database to consume
increasing amounts of disk space over time, because free space that
is not in the map cannot be re-used; instead
PostgreSQL>
will request more disk space from the operating system when it needs
- Increasing these parameters may cause PostgreSQL>
+ Increasing these parameters might cause PostgreSQL>
to request more System V> shared
memory than your operating system's default configuration
allows. See for information on how to
By preloading a shared library, the library startup time is avoided
when the library is first used. However, the time to start each new
- server process may increase slightly, even if that process never
+ server process might increase slightly, even if that process never
uses the library. So this parameter is recommended only for
libraries that will be used in most sessions.
Note that on many systems, the effective resolution
of sleep delays is 10 milliseconds; setting
vacuum_cost_delay to a value that is
- not a multiple of 10 may have the same results as setting it
+ not a multiple of 10 might have the same results as setting it
to the next higher multiple of 10.
(200ms>). Note that on many systems, the effective
resolution of sleep delays is 10 milliseconds; setting
bgwriter_delay> to a value that is not a multiple of
- 10 may have the same results as setting it to the next higher
+ 10 might have the same results as setting it to the next higher
multiple of 10. This parameter can only be set in the
postgresql.conf> file or on the server command line.
allowed to do its best in buffering, ordering, and delaying
writes. This can result in significantly improved performance.
However, if the system crashes, the results of the last few
- committed transactions may be lost in part or whole. In the
- worst case, unrecoverable data corruption may occur.
+ committed transactions might be lost in part or whole. In the
+ worst case, unrecoverable data corruption might occur.
(Crashes of the database software itself are not>
a risk factor here. Only an operating-system-level crash
creates a risk of corruption.)
Turning this parameter off speeds normal operation, but
might lead to a corrupt database after an operating system crash
or power failure. The risks are similar to turning off
- fsync>, though smaller. It may be safe to turn off
+ fsync>, though smaller. It might be safe to turn off
this parameter if you have hardware (such as a battery-backed disk
controller) or file-system software that reduces
the risk of partial page writes to an acceptably low level (e.g., ReiserFS 4).
- Increasing this parameter may cause PostgreSQL>
+ Increasing this parameter might cause PostgreSQL>
to request more System V> shared
memory than your operating system's default configuration
allows. See for information on how to
These configuration parameters provide a crude method of
influencing the query plans chosen by the query optimizer. If
the default plan chosen by the optimizer for a particular query
- is not optimal, a temporary solution may be found by using one
+ is not optimal, a temporary solution can be found by using one
of these configuration parameters to force the optimizer to
choose a different plan. Turning one of these settings off
permanently is seldom a good idea, however.
Sets the default statistics target for table columns that have
not had a column-specific target set via ALTER TABLE
SET STATISTICS>. Larger values increase the time needed to
- do ANALYZE>, but may improve the quality of the
+ do ANALYZE>, but might improve the quality of the
planner's estimates. The default is 10. For more information
on the use of statistics by the
PostgreSQL>
query planner, refer to .
The planner will merge sub-queries into upper queries if the
resulting FROM list would have no more than
- this many items. Smaller values reduce planning time but may
+ this many items. Smaller values reduce planning time but might
yield inferior query plans. The default is eight. It is usually
wise to keep this less than .
For more information see .
The planner will rewrite explicit JOIN>
constructs (except FULL JOIN>s) into lists of
FROM> items whenever a list of no more than this many items
- would result. Smaller values reduce planning time but may
+ would result. Smaller values reduce planning time but might
yield inferior query plans.
explicit JOIN>s. Thus, the explicit join order
specified in the query will be the actual order in which the
relations are joined. The query planner does not always choose
- the optimal join order; advanced users may elect to
+ the optimal join order; advanced users can elect to
temporarily set this variable to 1, and then specify the join
order they desire explicitly.
For more information see .
This method, in combination with logging to
stderr>,
is often more useful than
logging to
syslog>, since some types of messages
- may not appear in syslog> output (a common example
+ might not appear in syslog> output (a common example
is dynamic-linker failure messages).
This parameter can only be set at server start.
When redirect_stderr> is enabled, this parameter
determines the directory in which log files will be created.
- It may be specified as an absolute path, or relative to the
+ It can be specified as an absolute path, or relative to the
cluster data directory.
This parameter can only be set in the postgresql.conf>
file or on the server command line.
log_rotation_age to 60 , and
log_rotation_size to 1000000 .
Including %M> in log_filename allows
- any size-driven rotations that may occur to select a file name
+ any size-driven rotations that might occur to select a file name
different from the hour's initial file name.
When logging to
syslog> is enabled, this parameter
- facility to be used. You may choose
+ facility to be used. You can choose
from LOCAL0>, LOCAL1>,
LOCAL2>, LOCAL3>, LOCAL4>,
LOCAL5>, LOCAL6>, LOCAL7>;
NOTICE
- Provides information that may be helpful to users, e.g.,
+ Provides information that might be helpful to users, e.g.,
truncation of long identifiers and the creation of indexes as part
of primary keys.
Controls whether the server should start the
statistics-collection subprocess. This is on by default, but
- may be turned off if you know you have no interest in
+ can be turned off if you know you have no interest in
collecting statistics or running autovacuum.
This parameter can only be set at server start, because the collection
subprocess cannot be started or stopped on-the-fly. (However, the
Sets the locale to use for formatting date and time values.
- (Currently, this setting does nothing, but it may in the
+ (Currently, this setting does nothing, but it might in the
future.) Acceptable values are system-dependent; see
linkend="locale"> for more information. If this variable is
set to the empty string (which is the default) then the value
- Increasing this parameter may cause
PostgreSQL>
+ Increasing this parameter might cause
PostgreSQL>
to request more System V> shared
memory than your operating system's default configuration
allows. See for information on how to
The regular expression flavor> can be set to
advanced>, extended>, or basic>.
The default is advanced>. The extended>
- setting may be useful for exact backwards compatibility with
+ setting might be useful for exact backwards compatibility with
pre-7.4 releases of
PostgreSQL>. See
for details.
behavior of treating backslashes as escape characters.
The default will change to on> in a future release
to improve compatibility with the standard.
- Applications may check this
+ Applications can check this
parameter to determine how string literals will be processed.
The presence of this parameter can also be taken as an indication
that the escape string syntax (E'...'>) is supported.
installed. As such, they have been excluded from the sample
postgresql.conf> file. These options report
various aspects of
PostgreSQL behavior
- that may be of interest to certain applications, particularly
+ that might be of interest to certain applications, particularly
administrative front-ends.
the system to instead report a warning, zero out the damaged page,
and continue processing. This behavior will destroy data>,
namely all the rows on the damaged page. But it allows you to get
- past the error and retrieve rows from any undamaged pages that may
+ past the error and retrieve rows from any undamaged pages that might
be present in the table. So it is useful for recovering data if
corruption has occurred due to hardware or software error. You should
generally not set this on until you have given up hope of recovering
-
+
- If you have a fast link to the Internet, you may not need
+ If you have a fast link to the Internet, you might not need
-z3 , which instructs
CVS to use
gzip compression for transferred data. But
on a modem-speed link, it's a very substantial win.
-
+
Data Types
PostgreSQL has a rich set of native data
- types available to users. Users may add new types to
+ types available to users. Users can add new types to
PostgreSQL using the
linkend="sql-createtype" endterm="sql-createtype-title"> command.
paths, or have several possibilities for formats, such as the date
and time types.
Some of the input and output functions are not invertible. That is,
- the result of an output function may lose accuracy when compared to
+ the result of an output function might lose accuracy when compared to
the original input.
- The bigint type may not function correctly on all
+ The bigint type might not function correctly on all
platforms, since it relies on compiler support for eight-byte
integers. On a machine without such support, bigint
acts the same as integer (but still takes up eight
Inexact means that some values cannot be converted exactly to the
internal format and are stored as approximations, so that storing
- and printing back out a value may show slight discrepancies.
+ and printing back out a value might show slight discrepancies.
Managing these errors and how they propagate through calculations
is the subject of an entire branch of mathematics and computer
science and will not be discussed further here, except for the
- Comparing two floating-point values for equality may or may
+ Comparing two floating-point values for equality might or might
not work as expected.
1E-37 to 1E+37 with a precision of at least 6 decimal digits. The
double precision type typically has a range of around
1E-307 to 1E+308 with a precision of at least 15 digits. Values that
- are too large or too small will cause an error. Rounding may
+ are too large or too small will cause an error. Rounding might
take place if the precision of an input number is too high.
Numbers too close to zero that are not representable as distinct
from zero will cause an underflow error.
digits. The assumption that real and
double precision have exactly 24 and 53 bits in the
mantissa respectively is correct for IEEE-standard floating point
- implementations. On non-IEEE platforms it may be off a little, but
+ implementations. On non-IEEE platforms it might be off a little, but
for simplicity the same ranges of p are used
on all platforms.
The storage requirement for data of these types is 4 bytes plus the
actual string, and in case of character plus the
padding. Long strings are compressed by the system automatically, so
- the physical requirement on disk may be less. Long values are also
+ the physical requirement on disk might be less. Long values are also
stored in background tables so they do not interfere with rapid
access to the shorter column values. In any case, the longest
possible character string that can be stored is about 1 GB. (The
terminator) but should be referenced using the constant
NAMEDATALEN . The length is set at compile time (and
is therefore adjustable for special uses); the default maximum
- length may change in a future release. The type "char"
+ length might change in a future release. The type "char"
(note the quotes) is different from char(1) in that it
only uses one byte of storage. It is internally used in the system
catalogs as a poor-man's enumeration type.
Depending on the front end to
PostgreSQL> you use,
- you may have additional work to do in terms of escaping and
- unescaping bytea strings. For example, you may also
+ you might have additional work to do in terms of escaping and
+ unescaping bytea strings. For example, you might also
have to escape line feeds and carriage returns if your interface
automatically translates these.
When timestamp> values are stored as double precision floating-point
numbers (currently the default), the effective limit of precision
- may be less than 6. timestamp values are stored as seconds
+ might be less than 6. timestamp values are stored as seconds
before or after midnight 2000-01-01. Microsecond precision is achieved for
dates within a few years of 2000-01-01, but the precision degrades for
dates further away. When timestamp values are stored as
time type can.
Time zones in the real world have little meaning unless
associated with a date as well as a time,
- since the offset may vary through the year with daylight-saving
+ since the offset can vary through the year with daylight-saving
time boundaries.
A time zone abbreviation, for example PST>. Such a
specification merely defines a particular offset from UTC, in
- contrast to full time zone names which may imply a set of daylight
+ contrast to full time zone names which might imply a set of daylight
savings transition-date rules as well. The recognized abbreviations
are listed in the
pg_timezone_abbrevs> view (see
linkend="view-pg-timezone-abbrevs">). You cannot set the
- Functions coded in C (whether built-in or dynamically loaded) may be
+ Functions coded in C (whether built-in or dynamically loaded) can be
declared to accept or return any of these pseudo data types. It is up to
the function author to ensure that the function will behave safely
when a pseudo-type is used as an argument type.
- Functions coded in procedural languages may use pseudo-types only as
+ Functions coded in procedural languages can use pseudo-types only as
allowed by their implementation languages. At present the procedural
languages all forbid use of a pseudo-type as argument type, and allow
only void> and record> as a result type (plus
-
+
Date/Time Support
PostgreSQL uses an internal heuristic
parser for all date/time input support. Dates and times are input as
strings, and are broken up into distinct fields with a preliminary
- determination of what kind of information may be in the
+ determination of what kind of information can be in the
field. Each field is interpreted and either assigned a numeric
value, ignored, or rejected.
The parser contains internal lookup tables for all textual fields,
If the numeric token contains a dash (->), slash
(/>), or two or more dots (.>), this is
- a date string which may have a text month. If a date token has
+ a date string which might have a text month. If a date token has
already been seen, it is instead interpreted as a time zone
name (e.g., America/New_York>).
- Gregorian years AD 1-99 may be entered by using 4 digits with leading
+ Gregorian years AD 1-99 can be entered by using 4 digits with leading
zeros (e.g., 0099> is AD 99).
- A timezone abbreviation file may contain blank lines and comments
+ A timezone abbreviation file can contain blank lines and comments
beginning with #>. Non-comment lines must have one of
these formats:
The @OVERRIDE> syntax indicates that subsequent entries in the
- file may override previous entries (i.e., entries obtained from included
+ file can override previous entries (i.e., entries obtained from included
files). Without this, conflicting definitions of the same timezone
abbreviation are considered an error.
-
+
Data Definition
- The default value may be an expression, which will be
+ The default value can be an expression, which will be
evaluated whenever the default value is inserted
(not when the table is created). A common example
- is that a timestamp column may have a default of now()>,
+ is that a timestamp column can have a default of now()>,
so that it gets set to the time of row insertion. Another common
example is generating a serial number> for each row.
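A minimal sketch of such a default (the table and column names are hypothetical):

```sql
-- Hypothetical table: "created" defaults to now(), evaluated at each
-- insertion, not once at CREATE TABLE time.
CREATE TABLE events (
    id      serial PRIMARY KEY,
    created timestamp DEFAULT now()
);
INSERT INTO events DEFAULT VALUES;  -- "created" is set to the insertion time
```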
In
PostgreSQL this is typically done by
The NOT NULL constraint has an inverse: the
NULL constraint. This does not mean that the
column must be null, which would surely be useless. Instead, this
- simply selects the default behavior that the column may be null.
+ simply selects the default behavior that the column might be null.
The NULL constraint is not present in the SQL
standard and should not be used in portable applications. (It was
only added to
PostgreSQL to be
unique constraint it is possible to store duplicate
rows that contain a null value in at least one of the constrained
columns. This behavior conforms to the SQL standard, but we have
- heard that other SQL databases may not follow this rule. So be
+ heard that other SQL databases might not follow this rule. So be
careful when developing applications that are intended to be
portable.
(Of course, this is only possible if the table contains fewer
than 232> (4 billion) rows, and in practice the
table size had better be much less than that, or performance
- may suffer.)
+ might suffer.)
PostgreSQL> will attempt to convert the column's
default value (if any) to the new type, as well as any constraints
- that involve the column. But these conversions may fail, or may
+ that involve the column. But these conversions might fail, or might
produce surprising results. It's often best to drop any constraints
on the column before altering its type, and then add back suitably
modified constraints afterwards.
in turn contain tables. Schemas also contain other kinds of named
objects, including data types, functions, and operators. The same
object name can be used in different schemas without conflict; for
- example, both schema1> and myschema> may
+ example, both schema1> and myschema> can
contain tables named mytable>. Unlike databases,
- schemas are not rigidly separated: a user may access objects in any
+ schemas are not rigidly separated: a user can access objects in any
of the schemas in the database he is connected to, if he has
privileges to do so.
Schema names beginning with pg_> are reserved for
- system purposes and may not be created by users.
+ system purposes and cannot be created by users.
own. To allow that, the owner of the schema needs to grant the
USAGE privilege on the schema. To allow users
to make use of the objects in the schema, additional privileges
- may need to be granted, as appropriate for the object.
+ might need to be granted, as appropriate for the object.
the search path. If it is not named explicitly in the path then
it is implicitly searched before> searching the path's
schemas. This ensures that built-in names will always be
- findable. However, you may explicitly place
+ findable. However, you can explicitly place
pg_catalog> at the end of your search path if you
prefer to have user-defined names override built-in names.
In
PostgreSQL versions before 7.3,
table names beginning with pg_> were reserved. This is
- no longer true: you may create such a table name if you wish, in
+ no longer true: you can create such a table name if you wish, in
any non-system schema. However, it's best to continue to avoid
such names, to ensure that you won't suffer a conflict if some
future version defines a system table named the same as your
- In some cases you may wish to know which table a particular row
+ In some cases you might wish to know which table a particular row
originated from. There is a system column called
tableoid in each table which can tell you the
originating table:
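A sketch of such a query (the measurement table name is hypothetical):

```sql
-- Count rows per originating table by joining tableoid against pg_class;
-- "measurement" stands in for any table with child tables or partitions.
SELECT c.relname, count(*)
FROM measurement m, pg_class c
WHERE m.tableoid = c.oid
GROUP BY c.relname;
```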
- Bulk loads and deletes may be accomplished by adding or removing
+ Bulk loads and deletes can be accomplished by adding or removing
partitions, if that requirement is planned into the partitioning design.
ALTER TABLE> is far faster than a bulk operation.
It also entirely avoids the VACUUM
As we can see, a complex partitioning scheme could require a
substantial amount of DDL. In the above example we would be
- creating a new partition each month, so it may be wise to write a
+ creating a new partition each month, so it might be wise to write a
script that generates the required DDL automatically.
This allows further operations to be performed on the data before
it is dropped. For example, this is often a useful time to back up
the data using
COPY>, pg_dump>, or
- similar tools. It can also be a useful time to aggregate data
+ similar tools. It might also be a useful time to aggregate data
into smaller formats, perform other data manipulations, or run
reports.
-
+
Monitoring Disk Usage
there is also a
TOAST> file associated with the table,
which is used to store values too wide to fit comfortably in the main
table (see ). There will be one index on the
- TOAST> table, if present. There may also be indexes associated
+ TOAST> table, if present. There might also be indexes associated
with the base table. Each table and index is stored in a separate disk
file — possibly more than one file, if the file would exceed one
gigabyte. Naming conventions for these files are described in
The most important disk monitoring task of a database administrator
is to make sure the disk doesn't grow full. A filled data disk will
- not result in data corruption, but it may well prevent useful activity
+ not result in data corruption, but it might well prevent useful activity
from occurring. If the disk holding the WAL files grows full, database
- server panic and consequent shutdown may occur.
+ server panic and consequent shutdown might occur.
-
+
Data Manipulation
UPDATE products SET price = 10 WHERE price = 5;
- This may cause zero, one, or many rows to be updated. It is not
+ This might cause zero, one, or many rows to be updated. It is not
an error to attempt an update that does not match any rows.
Let's look at that command in detail. First is the key word
UPDATE followed by the table name. As usual,
- the table name may be schema-qualified, otherwise it is looked up
+ the table name can be schema-qualified, otherwise it is looked up
in the path. Next is the key word SET followed
by the column name, an equals sign and the new column value. The
new column value can be any scalar expression, not just a constant.
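For instance, the new value can be computed from the old one (reusing the products table from the earlier example):

```sql
-- Raise all prices by 10%; the SET expression references the old column value.
UPDATE products SET price = price * 1.10;
```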
-
+
Documentation
The following tools are used to process the documentation. Some
- may be optional, as noted.
+ might be optional, as noted.
We have documented experience with several installation methods for
the various tools that are needed to process the documentation.
- These will be described below. There may be some other packaged
+ These will be described below. There might be some other packaged
distributions for these tools. Please report package status to the
documentation mailing list, and we will include that information
here.
It appears that current versions of the
PostgreSQL documentation
trigger some bug in or exceed the size limit of OpenJade. If the
build process of the
RTF version hangs for a
- long time and the output file still has size 0, then you may have
+ long time and the output file still has size 0, then you might have
hit that problem. (But keep in mind that a normal build takes 5
to 10 minutes, so don't abort too soon.)
The
PostgreSQL distribution includes a
parsed DTD definitions file reference.ced .
- You may find that when using
PSGML , a
+ You might find that when using
PSGML , a
comfortable way of working with these separate files of book
parts is to insert a proper DOCTYPE
declaration while you're editing them. If you are working on
Reference pages that describe executable commands should contain
the following sections, in this order. Sections that do not apply
- may be omitted. Additional top-level sections should only be used
+ can be omitted. Additional top-level sections should only be used
in special circumstances; often that information belongs in the
Usage
section.
A list describing each command-line option. If there are a
- lot of options, subsections may be used.
+ lot of options, subsections can be used.
-
+
EXEC SQL ...;
These statements syntactically take the place of a C statement.
- Depending on the particular statement, they may appear at the
+ Depending on the particular statement, they can appear at the
global level or within a function. Embedded
SQL statements follow the case-sensitivity rules
of normal
SQL code, and not those of C.
(single-quoted) string literal or a variable reference. The
connection target DEFAULT initiates a connection
to the default database under the default user name. No separate
- user name or connection name may be specified in that case.
+ user name or connection name can be specified in that case.
As above, the parameters username and
- password may be an SQL identifier, an
+ password can be an SQL identifier, an
SQL string literal, or a reference to a character variable.
EXEC SQL EXECUTE IMMEDIATE :stmt;
- You may not execute statements that retrieve data (e.g.,
+ You cannot execute statements that retrieve data (e.g.,
SELECT ) this way.
...
EXEC SQL EXECUTE mystmt INTO v1, v2, v3 USING 37;
- An EXECUTE command may have an
+ An EXECUTE command can have an
INTO clause, a USING clause,
both, or neither.
FETCH statement. An SQL descriptor area groups
the data of one row of data together with metadata items into one
data structure. The metadata is particularly useful when executing
- dynamic SQL statements, where the nature of the result columns may
+ dynamic SQL statements, where the nature of the result columns might
not be known ahead of time.
The statement sent to the
PostgreSQL
server was empty. (This cannot normally happen in an embedded
- SQL program, so it may point to an internal error.) (SQLSTATE
+ SQL program, so it might point to an internal error.) (SQLSTATE
YE002)
If you manage the build process of a larger project using
- make , it may be convenient to include
+ make , it might be convenient to include
the following implicit rule to your makefiles:
ECPG = ecpg
Here is a complete example describing the output of the
- preprocessor of a file foo.pgc (details may
+ preprocessor of a file foo.pgc (details might
change with each particular version of the preprocessor):
EXEC SQL BEGIN DECLARE SECTION;
-
+
According to the standard, the first two characters of an error code
denote a class of errors, while the last three characters indicate
a specific condition within that class. Thus, an application that
- does not recognize the specific error code may still be able to infer
+ does not recognize the specific error code might still be able to infer
what to do from the error class.
-
+
A domain is based on a particular base type and for many purposes
- is interchangeable with its base type. However, a domain may
+ is interchangeable with its base type. However, a domain can
have constraints that restrict its valid values to a subset of
what the underlying base type would allow.
-
+
External Projects
All other language interfaces are external projects and are distributed
separately. includes a list of
- some of these projects. Note that some of these packages may not be
+ some of these projects. Note that some of these packages might not be
released under the same license as
PostgreSQL>. For more
information on each language interface, including licensing terms, refer to
its website and documentation.
In addition, there are a number of procedural languages that are developed
and maintained outside the core
PostgreSQL
distribution. lists some of these
- packages. Note that some of these projects may not be released under the same
+ packages. Note that some of these projects might not be released under the same
license as
PostgreSQL>. For more information on each
procedural language, including licensing information, refer to its website
and documentation.
-
+
SQL Conformance
9075 Working Group during the preparation of SQL:2003. Even so,
many of the features required by SQL:2003 are already supported,
though sometimes with slightly differing syntax or function.
- Further moves towards conformance may be expected in later releases.
+ Further moves towards conformance can be expected in later releases.
PostgreSQL supports most of the major features of SQL:2003. Out of
164 mandatory features required for full Core conformance,
PostgreSQL conforms to at least 150. In addition, there is a long
- list of supported optional features. It may be worth noting that at
+ list of supported optional features. It might be worth noting that at
the time of writing, no current version of any database management
system claims full conformance to Core SQL:2003.
that
PostgreSQL supports, followed by a
list of the features defined in
SQL:2003 which
are not yet supported in
PostgreSQL .
- Both of these lists are approximate: There may be minor details that
+ Both of these lists are approximate: There might be minor details that
are nonconforming for a feature that is listed as supported, and
- large parts of an unsupported feature may in fact be implemented.
+ large parts of an unsupported feature might in fact be implemented.
The main body of the documentation always contains the most accurate
information about what does and does not work.
-
+
Functions and Operators
- Some applications may expect that
+ Some applications might expect that
expression = NULL
returns true if expression evaluates to
the null value. It is highly recommended that these applications
data type as its argument.
The functions working with double precision data are mostly
implemented on top of the host system's C library; accuracy and behavior in
- boundary cases may therefore vary depending on the host system.
+ boundary cases can therefore vary depending on the host system.
other characters, the respective character in
pattern must be
preceded by the escape character. The default escape
- character is the backslash but a different one may be selected by
+ character is the backslash but a different one can be selected by
using the ESCAPE clause. To match the escape
character itself, write two escape characters.
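A brief sketch of the ESCAPE clause, using only literal values:

```sql
SELECT 'abc' LIKE 'a_c';              -- true: _ matches any single character
SELECT 'a_c' LIKE 'a#_c' ESCAPE '#';  -- true: #_ matches a literal underscore
SELECT 'abc' LIKE 'a#_c' ESCAPE '#';  -- false: b is not a literal underscore
```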
Like LIKE , the SIMILAR TO
operator succeeds only if its pattern matches the entire string;
this is unlike common regular expression practice, wherein the pattern
- may match any part of the string.
+ can match any part of the string.
Also like
LIKE , SIMILAR TO uses
_> and %> as wildcard characters denoting
- Parentheses () may be used to group items into
+ Parentheses () can be used to group items into
a single logical item.
A constraint> matches an empty string, but matches only when
specific conditions are met. A constraint can be used where an atom
- could be used, except it may not be followed by a quantifier.
+ could be used, except it cannot be followed by a quantifier.
The simple constraints are shown in
;
some more constraints are described later.
- An RE may not end with \>.
+ An RE cannot end with \>.
{>m>,>n>}>
a sequence of m> through n>
- (inclusive) matches of the atom; m> may not exceed
+ (inclusive) matches of the atom; m> cannot exceed
n>
- Lookahead constraints may not contain back references>
+ Lookahead constraints cannot contain back references>
(see ),
and all parentheses within them are considered non-capturing.
^ are the members of an equivalence class, then
[[=o=]] , [[=^=]] , and
[o^] are all synonymous. An equivalence class
- may not be an endpoint of a range.
+ cannot be an endpoint of a range.
xdigit . These stand for the character classes
defined in
ctype 3 .
- A locale may provide others. A character class may not be used as
+ A locale can provide others. A character class cannot be used as
an endpoint of a range.
- An ARE may begin with embedded options>:
+ An ARE can begin with embedded options>:
a sequence (?>xyz>)>
(where xyz> is one or more alphabetic characters)
specifies options affecting the rest of the RE.
Embedded options take effect at the )> terminating the sequence.
- They may appear only at the start of an ARE (after the
+ They can appear only at the start of an ARE (after the
***:> director if any).
- Certain modifiers may be applied to any template pattern to alter its
+ Certain modifiers can be applied to any template pattern to alter its
behavior. For example, FMMonth
is the Month pattern with the
FM modifier.
In these expressions, the desired time zone zone> can be
specified either as a text string (e.g., 'PST' )
or as an interval (e.g., INTERVAL '-08:00' ).
- In the text case, a time zone name may be specified in any of the ways
+ In the text case, a time zone name can be specified in any of the ways
described in .
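Both forms might be sketched as follows (the timestamp value is arbitrary):

```sql
SELECT TIMESTAMP '2001-02-16 20:38:40' AT TIME ZONE 'PST';
SELECT TIMESTAMP '2001-02-16 20:38:40' AT TIME ZONE INTERVAL '-08:00';
```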
- Other database systems may advance these values more
+ Other database systems might advance these values more
frequently.
statement (more specifically, the time of receipt of the latest command
message from the client).
statement_timestamp()> and transaction_timestamp()>
- return the same value during the first command of a transaction, but may
+ return the same value during the first command of a transaction, but might
differ during subsequent commands.
clock_timestamp()> returns the actual current time, and
therefore its value changes even within a single SQL command.
The effective resolution of the sleep interval is platform-specific;
0.01 seconds is a common value. The sleep delay will be at least as long
- as specified. It may be longer depending on factors such as server load.
+ as specified. It might be longer depending on factors such as server load.
t.p> is a point> column then
SELECT p[0] FROM t> retrieves the X coordinate and
UPDATE t SET p[1] = ...> changes the Y coordinate.
- In the same way, a value of type box> or lseg> may be treated
+ In the same way, a value of type box> or lseg> can be treated
as an array of two point> values.
Note that late binding was the only behavior supported in
PostgreSQL releases before 8.1, so you
- may need to do this to preserve the semantics of old applications.
+ might need to do this to preserve the semantics of old applications.
value and sets its is_called field to true ,
meaning that the next nextval will advance the sequence
before returning a value. In the three-parameter form,
- is_called may be set either true or
+ is_called can be set either true or
false . If it's set to false ,
the next nextval will return exactly the specified
value, and sequence advancement commences with the following
same sequence, a nextval operation is never rolled back;
that is, once a value has been fetched it is considered used, even if the
transaction that did the nextval later aborts. This means
- that aborted transactions may leave unused holes in the
+ that aborted transactions might leave unused holes in the
sequence of assigned values. setval operations are never
rolled back, either.
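A sketch of the setval forms (the sequence name serial_seq is hypothetical):

```sql
SELECT setval('serial_seq', 42);         -- next nextval returns 43
SELECT setval('serial_seq', 42, true);   -- equivalent to the form above
SELECT setval('serial_seq', 42, false);  -- next nextval returns exactly 42
```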
It should be noted that except for count ,
these functions return a null value when no rows are selected. In
particular, sum of no rows returns null, not
- zero as one might expect. The coalesce function may be
+ zero as one might expect. The coalesce function can be
used to substitute zero for null when necessary.
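For example (the table t and column x are hypothetical):

```sql
SELECT sum(x) FROM t WHERE false;               -- returns null, not zero
SELECT coalesce(sum(x), 0) FROM t WHERE false;  -- returns 0
```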
Users accustomed to working with other SQL database management
- systems may be surprised by the performance of the
+ systems might be surprised by the performance of the
count aggregate when it is applied to the
entire table. A query like:
whether at least one row is returned, not all the way to completion.
It is unwise to write a subquery that has any side effects (such as
calling sequence functions); whether the side effects occur or not
- may be difficult to predict.
+ might be difficult to predict.
- The search path may be altered at run time. The command is:
+ The search path can be altered at run time. The command is:
SET search_path TO schema> , schema>, ...
creating command for a constraint, index, rule, or trigger. (Note that this
is a decompiled reconstruction, not the original text of the command.)
pg_get_expr decompiles the internal form of an
- individual expression, such as the default value for a column. It may be
+ individual expression, such as the default value for a column. It can be
useful when examining the contents of system catalogs.
pg_get_viewdef reconstructs the SELECT>
query that defines a view. Most of these functions come in two variants,
the transaction log archive area. The history file includes the label given to
pg_start_backup>, the starting and ending transaction log locations for
the backup, and the starting and ending times of the backup. The return
- value is the backup's ending transaction log location (which again may be of little
+ value is the backup's ending transaction log location (which again might be of little
interest). After noting the ending location, the current transaction log insertion
point is automatically advanced to the next transaction log file, so that the
ending transaction log file can be archived immediately to complete the backup.
bigint
Disk space used by the table or index with the specified name.
- The table name may be qualified with a schema name
+ The table name can be qualified with a schema name
|
bigint
Total disk space used by the table with the specified name,
- including indexes and toasted data. The table name may be
+ including indexes and toasted data. The table name can be
qualified with a schema name
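A sketch of both calls (the schema and table names are hypothetical):

```sql
SELECT pg_relation_size('myschema.mytable');        -- main table file only
SELECT pg_total_relation_size('myschema.mytable');  -- plus indexes and TOAST data
```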
The functions shown in
linkend="functions-admin-genfile"> provide native file access to
files on the machine hosting the server. Only files within the
- database cluster directory and the log_directory> may be
+ database cluster directory and the log_directory> can be
accessed. Use a relative path for files within the cluster directory,
and a path matching the log_directory> configuration setting
for log files. Use of these functions is restricted to superusers.
pg_advisory_lock> locks an application-defined resource,
- which may be identified either by a single 64-bit key value or two
+ which can be identified either by a single 64-bit key value or two
32-bit key values (note that these two key spaces do not overlap). If
another session already holds a lock on the same resource, the
function will wait until the resource becomes available. The lock
The function xmlcomment creates an XML value
containing an XML comment with the specified text as content.
- The text may not contain -- or end with a
+ The text cannot contain -- or end with a
- so that the resulting construct is a valid
XML comment. If the argument is null, the result is null.
-
+
GIN Indexes
GIN stands for Generalized Inverted Index. It is
an index structure storing a set of (key, posting list) pairs, where
a posting list> is a set of rows in which the key occurs. Each
- indexed value may contain many keys, so the same row ID may appear in
+ indexed value can contain many keys, so the same row ID can appear in
multiple posting lists.
Returns TRUE if the indexed value satisfies the query operator with
- strategy number n> (or may satisfy, if the operator is
+ strategy number n> (or might satisfy, if the operator is
marked RECHECK in the operator class). The check> array has
the same length as the number of keys previously returned by
extractQuery> for this query. Each element of the
- GIN searches keys only by equality matching. This may
+ GIN searches keys only by equality matching. This might
be improved in future.
-
+
GiST Indexes
Usually, replay of the WAL log is sufficient to restore the integrity
of a GiST index following a database crash. However, there are some
corner cases in which the index state is not fully rebuilt. The index
- will still be functionally correct, but there may be some performance
+ will still be functionally correct, but there might be some performance
degradation. When this occurs, the index can be repaired by
VACUUM>ing its table, or by rebuilding the index using
REINDEX>. In some cases a plain VACUUM> is
-
+
Index Access Method Interface Definition
block number and an item number within that block (see
linkend="storage-page-layout">). This is sufficient
information to fetch a particular row version from the table.
- Indexes are not directly aware that under MVCC, there may be multiple
+ Indexes are not directly aware that under MVCC, there might be multiple
extant versions of the same logical row; to an index, each tuple is
an independent object that needs its own index entry. Thus, an
update of a row always creates all-new index entries for the row, even if
heap_tid> is the TID to be indexed.
If the access method supports unique indexes (its
pg_am>.amcanunique> flag is true) then
- check_uniqueness> may be true, in which case the access method
+ check_uniqueness> might be true, in which case the access method
must verify that there is no conflicting row; this is the only situation in
which the access method normally needs the heapRelation>
parameter. See for details.
Because of limited maintenance_work_mem>,
- ambulkdelete> may need to be called more than once when many
+ ambulkdelete> might need to be called more than once when many
tuples are to be deleted. The stats> argument is the result
of the previous call for this index (it is NULL for the first call within a
VACUUM> operation). This allows the AM to accumulate statistics
Clean up after a VACUUM operation (zero or more
ambulkdelete> calls). This does not have to do anything
- beyond returning index statistics, but it may perform bulk cleanup
+ beyond returning index statistics, but it might perform bulk cleanup
such as reclaiming empty index pages. stats> is whatever the
last ambulkdelete> call returned, or NULL if
ambulkdelete> was not called because no tuples needed to be
- The operator family may indicate that the index is lossy> for a
+ The operator family can indicate that the index is lossy> for a
particular operator; this implies that the index scan will return all the
entries that pass the scan key, plus possibly additional entries that do
not. The core system's index-scan machinery will then apply that operator
The access method must support marking> a position in a scan
- and later returning to the marked position. The same position may be
+ and later returning to the marked position. The same position might be
restored multiple times. However, only one position need be remembered
per scan; a new ammarkpos> call overrides the previously
marked position.
would have found the entry if it had existed when the scan started, or for
the scan to return such an entry upon rescanning or backing
up even though it had not been returned the first time through. Similarly,
- a concurrent delete may or may not be reflected in the results of a scan.
+ a concurrent delete might or might not be reflected in the results of a scan.
What is important is that insertions or deletions not cause the scan to
miss or multiply return entries that were not themselves being inserted or
deleted.
RowExclusiveLock> when updating the index (including plain
VACUUM>). Since these lock
types do not conflict, the access method is responsible for handling any
- fine-grained locking it may need. An exclusive lock on the index as a whole
+ fine-grained locking it might need. An exclusive lock on the index as a whole
will be taken only during index creation, destruction,
REINDEX>, or VACUUM FULL>.
heap>) and the index. Because
PostgreSQL separates accesses
and updates of the heap from those of the index, there are windows in
- which the index may be inconsistent with the heap. We handle this problem
+ which the index might be inconsistent with the heap. We handle this problem
with the following rules:
against this scenario by requiring the scan keys to be rechecked
against the heap row in all cases, but that is too expensive. Instead,
we use a pin on an index page as a proxy to indicate that the reader
- may still be in flight> from the index entry to the matching
+ might still be in flight> from the index entry to the matching
heap entry. Making ambulkdelete> block on such a pin ensures
that VACUUM> cannot delete the heap entry before the reader
is done with it. This solution costs little in run time, and adds blocking
entry. This is expensive for a number of reasons. An
asynchronous> scan in which we collect many TIDs from the index,
and only visit the heap tuples sometime later, requires much less index
- locking overhead and may allow a more efficient heap access pattern.
+ locking overhead and can allow a more efficient heap access pattern.
Per the above analysis, we must use the synchronous approach for
non-MVCC-compliant snapshots, but an asynchronous scan is workable
for a query using an MVCC snapshot.
-
+
Indexes
Once an index is created, no further intervention is required: the
system will update the index when the table is modified, and it will
use the index in queries when it thinks this would be more efficient
- than a sequential table scan. But you may have to run the
+ than a sequential table scan. But you might have to run the
ANALYZE command regularly to update
statistics to allow the query planner to make educated decisions.
See for information about
how to find out whether an index is used and when and why the
- planner may choose not to use an index.
+ planner might choose not to use an index.
indexes to perform no better than B-tree indexes, and the
index size and build time for hash indexes is much worse.
Furthermore, hash index operations are not presently WAL-logged,
- so hash indexes may need to be rebuilt with REINDEX>
+ so hash indexes might need to be rebuilt with REINDEX>
after a database crash.
For these reasons, hash index use is presently discouraged.
SELECT name FROM test2 WHERE major = constant AND minor = constant ;
- then it may be appropriate to define an index on the columns
+ then it might be appropriate to define an index on the columns
major and
minor together, e.g.,
Currently, only the B-tree and GiST index types support multicolumn
- indexes. Up to 32 columns may be specified. (This limit can be
+ indexes. Up to 32 columns can be specified. (This limit can be
altered when building
PostgreSQL ; see the
file pg_config_manual.h .)
In all but the simplest applications, there are various combinations of
- indexes that may be useful, and the database developer must make
+ indexes that might be useful, and the database developer must make
trade-offs to decide which indexes to provide. Sometimes multicolumn
indexes are best, but sometimes it's better to create separate indexes
and rely on the index-combination feature. For example, if your
- Indexes may also be used to enforce uniqueness of a column's value,
+ Indexes can also be used to enforce uniqueness of a column's value,
or the uniqueness of the combined values of more than one column.
CREATE UNIQUE INDEX name ON table (column , ... );
The syntax of the CREATE INDEX> command normally requires
writing parentheses around index expressions, as shown in the second
- example. The parentheses may be omitted when the expression is just
+ example. The parentheses can be omitted when the expression is just
a function call, as in the first example.
SELECT * FROM orders WHERE order_nr = 3501;
- The order 3501 may be among the billed or among the unbilled
+ The order 3501 might be among the billed or among the unbilled
orders.
Finally, a partial index can also be used to override the system's
- query plan choices. It may occur that data sets with peculiar
- distributions will cause the system to use an index when it really
+ query plan choices. Also, data sets with peculiar
+ distributions might cause the system to use an index when it really
should not. In that case the index can be set up so that it is not
available for the offending query. Normally,
PostgreSQL> makes reasonable choices about index
- An index definition may specify an operator
+ An index definition can specify an operator
class for each column of an index.
CREATE INDEX name ON table (column opclass , ... );
via run-time parameters (described in
linkend="runtime-config-query-constants">).
An inaccurate selectivity estimate is due to
- insufficient statistics. It may be possible to improve this by
+ insufficient statistics. It might be possible to improve this by
tuning the statistics-gathering parameters (see
).
If you do not succeed in adjusting the costs to be more
- appropriate, then you may have to resort to forcing index usage
- explicitly. You may also want to contact the
+ appropriate, then you might have to resort to forcing index usage
+ explicitly. You might also want to contact the
PostgreSQL> developers to examine the issue.
-
+
The Information Schema
grantee
sql_identifier
- Name of the role to which this role membership was granted (may
+ Name of the role to which this role membership was granted (can
be the current user, or a different role in case of nested role
memberships)
grantee
sql_identifier
- Name of the role to which this role membership was granted (may
+ Name of the role to which this role membership was granted (can
be the current user, or a different role in case of nested role
memberships)
If data_type identifies a numeric type, this
column contains the (declared or implicit) precision of the
type for this attribute. The precision indicates the number of
- significant digits. It may be expressed in decimal (base 10)
+ significant digits. It can be expressed in decimal (base 10)
or binary (base 2) terms, as specified in the column
numeric_precision_radix . For all other data
types, this column is null.
If data_type identifies an exact numeric
type, this column contains the (declared or implicit) scale of
the type for this attribute. The scale indicates the number of
- significant digits to the right of the decimal point. It may
+ significant digits to the right of the decimal point. It can
be expressed in decimal (base 10) or binary (base 2) terms, as
specified in the column
numeric_precision_radix . For all other data
YES if the column is possibly nullable,
NO if it is known not nullable. A not-null
constraint is one way a column can be known not nullable, but
- there may be others.
+ there can be others.
If data_type identifies a numeric type, this
column contains the (declared or implicit) precision of the
type for this column. The precision indicates the number of
- significant digits. It may be expressed in decimal (base 10)
+ significant digits. It can be expressed in decimal (base 10)
or binary (base 2) terms, as specified in the column
numeric_precision_radix . For all other data
types, this column is null.
If data_type identifies an exact numeric
type, this column contains the (declared or implicit) scale of
the type for this column. The scale indicates the number of
- significant digits to the right of the decimal point. It may
+ significant digits to the right of the decimal point. It can
be expressed in decimal (base 10) or binary (base 2) terms, as
specified in the column
numeric_precision_radix . For all other data
is supposed to identify the underlying built-in type of the column.
In
PostgreSQL , this means that the type
is defined in the system catalog schema
- pg_catalog . This column may be useful if the
+ pg_catalog . This column might be useful if the
application can handle the well-known built-in types specially (for
example, format the numeric types differently or use the data in
the precision columns). The columns udt_name ,
If the domain has a numeric type, this column contains the
(declared or implicit) precision of the type for this column.
The precision indicates the number of significant digits. It
- may be expressed in decimal (base 10) or binary (base 2) terms,
+ can be expressed in decimal (base 10) or binary (base 2) terms,
as specified in the column
numeric_precision_radix . For all other data
types, this column is null.
If the domain has an exact numeric type, this column contains
the (declared or implicit) scale of the type for this column.
The scale indicates the number of significant digits to the
- right of the decimal point. It may be expressed in decimal
+ right of the decimal point. It can be expressed in decimal
(base 10) or binary (base 2) terms, as specified in the column
numeric_precision_radix . For all other data
types, this column is null.
For permission checking, the set of applicable roles
- is applied, which may be broader than the set of enabled roles. So
+ is applied, which can be broader than the set of enabled roles. So
generally, it is better to use the view
applicable_roles instead of this one; see also
there.
|
routine_name
sql_identifier
- Name of the function (may be duplicated in case of overloading)
+ Name of the function (might be duplicated in case of overloading)
|
applies to domains, and since domains do not have real privileges
in
PostgreSQL , this view is empty.
Further information can be found under
- usage_privileges . In the future, this view may
+ usage_privileges . In the future, this view might
contain more useful information.
|
routine_name
sql_identifier
- Name of the function (may be duplicated in case of overloading)
+ Name of the function (might be duplicated in case of overloading)
|
|
routine_name
sql_identifier
- Name of the function (may be duplicated in case of overloading)
+ Name of the function (might be duplicated in case of overloading)
|
This column contains the (declared or implicit) precision of
the sequence data type (see above). The precision indicates
- the number of significant digits. It may be expressed in
+ the number of significant digits. It can be expressed in
decimal (base 10) or binary (base 2) terms, as specified in the
column numeric_precision_radix .
This column contains the (declared or implicit) scale of the
sequence data type (see above). The scale indicates the number
of significant digits to the right of the decimal point. It
- may be expressed in decimal (base 10) or binary (base 2) terms,
+ can be expressed in decimal (base 10) or binary (base 2) terms,
as specified in the column
numeric_precision_radix .
incompatibilities with the SQL standard that affect the
representation in the information schema. First, trigger names are
local to the table in
PostgreSQL , rather
- than being independent schema objects. Therefore there may be duplicate
+ than being independent schema objects. Therefore there can be duplicate
trigger names defined in one schema, as long as they belong to
different tables. (trigger_catalog and
trigger_schema are really the values pertaining
in
PostgreSQL , this view shows implicit
USAGE privileges granted to
PUBLIC for all domains. In the future, this
- view may contain more useful information.
+ view might contain more useful information.
-
+
PostgreSQL>]]>
executables considerably, and on non-GCC compilers it usually
also disables compiler optimization, causing slowdowns. However,
having the symbols available is extremely helpful for dealing
- with any problems that may arise. Currently, this option is
+ with any problems that might arise. Currently, this option is
recommended for production installations only if you use GCC.
But you should always have it on if you are doing development work
or running a beta version.
investigates (for example, software upgrades), then it's a good
idea to do gmake distclean> before reconfiguring and
rebuilding. Without this, your changes in configuration choices
- may not propagate everywhere they need to.
+ might not propagate everywhere they need to.
- To stop a server running in the background you can type
+ To stop a server running in the background you can type:
kill `cat /usr/local/pgsql/data/postmaster.pid`
-
+
Preface
- contains assorted information that may be of
+ contains assorted information that might be of
use to
PostgreSQL> developers.
-
+
Using hostaddr> instead of host> allows the
- application to avoid a host name look-up, which may be important in
+ application to avoid a host name look-up, which might be important in
applications with time constraints. However, Kerberos authentication
requires the host name. The following therefore applies: If
host> is specified without hostaddr>, a host name
If PQconnectStart> succeeds, the next stage is to poll
- libpq> so that it may proceed with the connection sequence.
+ libpq> so that it can proceed with the connection sequence.
Use PQsocket(conn) to obtain the descriptor of the
socket underlying the database connection.
Loop thus: If PQconnectPoll(conn) last returned
- At any time during connection, the status of the connection may be
+ At any time during connection, the status of the connection can be
checked by calling PQstatus>. If this gives CONNECTION_BAD>, then the
connection procedure has failed; if it gives CONNECTION_OK>, then the
connection is ready. Both of these states are equally detectable
- from the return value of PQconnectPoll>, described above. Other states may also occur
+ from the return value of PQconnectPoll>, described above. Other states might also occur
during (and only during) an asynchronous connection procedure. These
- indicate the current stage of the connection procedure and may be useful
+ indicate the current stage of the connection procedure and might be useful
to provide feedback to the user for example. These statuses are:
- Returns a connection options array. This may be used to determine
+ Returns a connection options array. This can be used to determine
all possible PQconnectdb options and their
current default values. The return value points to an array of
PQconninfoOption structures, which ends
This function will close the connection
to the server and attempt to reestablish a new
connection to the same server, using all the same
- parameters previously used. This may be useful for
+ parameters previously used. This might be useful for
error recovery if a working connection is lost.
These functions will close the connection to the server and attempt to
reestablish a new connection to the same server, using all the same
- parameters previously used. This may be useful for error recovery if a
+ parameters previously used. This can be useful for error recovery if a
working connection is lost. They differ from PQreset (above) in that they
act in a nonblocking manner. These functions suffer from the same
restrictions as PQconnectStart> and PQconnectPoll>.
Connection Status Functions
- These functions may be used to interrogate the status
+ These functions can be used to interrogate the status
of an existing database connection object.
If no value for standard_conforming_strings> is reported,
-applications may assume it is off>, that is, backslashes
+applications can assume it is off>, that is, backslashes
are treated as escapes in string literals. Also, the presence of this
-parameter may be taken as an indication that the escape string syntax
+parameter can be taken as an indication that the escape string syntax
(E'...'>) is accepted.
int PQprotocolVersion(const PGconn *conn);
-Applications may wish to use this to determine whether certain features
+Applications might wish to use this to determine whether certain features
are supported.
Currently, the possible values are 2 (2.0 protocol), 3 (3.0 protocol),
or zero (connection bad). This will not change after connection
int PQserverVersion(const PGconn *conn);
-Applications may use this to determine the version of the database server they
+Applications might use this to determine the version of the database server they
are connected to. The number is formed by converting the major, minor, and
revision numbers into two-decimal-digit numbers and appending them
together. For example, version 8.1.5 will be returned as 80105, and version
The number of parameters supplied; it is the length of the arrays
paramTypes[]>, paramValues[]>,
paramLengths[]>, and paramFormats[]>. (The
- array pointers may be NULL when nParams>
+ array pointers can be NULL when nParams>
is zero.)
Specifies the actual data lengths of binary-format parameters.
It is ignored for null parameters and text-format parameters.
- The array pointer may be null when there are no binary parameters.
+ The array pointer can be null when there are no binary parameters.
The primary advantage of PQexecParams> over PQexec>
-is that parameter values may be separated from the command string, thus
+is that parameter values can be separated from the command string, thus
avoiding the need for tedious and error-prone quoting and escaping.
The function creates a prepared statement named
stmtName>
from the
query> string, which must contain a single SQL command.
- stmtName> may be ""> to create an unnamed statement,
+ stmtName> can be ""> to create an unnamed statement,
in which case any pre-existing unnamed statement is automatically replaced;
otherwise it is an error if the statement name is already defined in the
current session.
to in the query as $1>, $2>, etc.
nParams> is the number of parameters for which types are
pre-specified in the array
paramTypes[]>. (The array pointer
- may be NULL when nParams> is zero.)
+ can be NULL when nParams> is zero.)
paramTypes[]> specifies, by OID, the data types to be assigned to
the parameter symbols. If
paramTypes> is NULL ,
or any particular element in the array is zero, the server assigns a data type
to the parameter symbol in the same way it would do for an untyped literal
-string. Also, the query may use parameter symbols with numbers higher than
+string. Also, the query can use parameter symbols with numbers higher than
nParams>; data types will be inferred for these symbols as
well. (See PQdescribePrepared for a means to find out
what data types were inferred.)
- stmtName> may be ""> or NULL to reference the unnamed
+ stmtName> can be ""> or NULL to reference the unnamed
statement, otherwise it must be the name of an existing prepared statement.
On success, a PGresult> with status
PGRES_COMMAND_OK is returned. The functions
PQnparams and PQparamtype
-may be applied to this PGresult> to obtain information
+can be applied to this PGresult> to obtain information
about the parameters of the prepared statement, and the functions
PQnfields , PQfname ,
PQftype , etc provide information about the result
- portalName> may be ""> or NULL to reference the unnamed
+ portalName> can be ""> or NULL to reference the unnamed
portal, otherwise it must be the name of an existing portal.
On success, a PGresult> with status
PGRES_COMMAND_OK is returned. The functions
PQnfields , PQfname ,
-PQftype , etc may be applied to the
+PQftype , etc can be applied to the
PGresult> to obtain information about the result
columns (if any) of the portal.
to retrieve zero rows still shows PGRES_TUPLES_OK .
PGRES_COMMAND_OK is for commands that can never
return rows (INSERT , UPDATE ,
-etc.). A response of PGRES_EMPTY_QUERY may indicate
+etc.). A response of PGRES_EMPTY_QUERY might indicate
a bug in the client software.
Detail: an optional secondary error message carrying more detail about
-the problem. May run to multiple lines.
+the problem. Might run to multiple lines.
Hint: an optional suggestion what to do about the problem. This is
intended to differ from detail in that it offers advice (potentially
-inappropriate) rather than hard facts. May run to multiple lines.
+inappropriate) rather than hard facts. Might run to multiple lines.
-Commonly this is just the name of the command, but it may include additional
+Commonly this is just the name of the command, but it might include additional
data such as the number of rows processed. The caller should
not free the result directly. It will be freed when the
associated PGresult> handle is passed to
PQescapeStringConn>; the difference is that it does not
take
conn> or error> parameters. Because of this,
it cannot adjust its behavior depending on the connection properties (such as
-character encoding) and therefore it may give the wrong results>.
+character encoding) and therefore it might give the wrong results>.
Also, it has no way to report error conditions.
take a PGconn> parameter. Because of this, it cannot adjust
its behavior depending on the connection properties (in particular,
whether standard-conforming strings are enabled)
- and therefore it may give the wrong results>. Also, it
+ and therefore it might give the wrong results>. Also, it
has no way to return an error message on failure.
-PQexec waits for the command to be completed. The application may have other
+PQexec waits for the command to be completed. The application might have other
work to do (such as maintaining a user interface), in which case it won't
want to block waiting for the response.
After successfully calling PQsendQuery , call
PQgetResult one or more
- times to obtain the results. PQsendQuery may not be called
+ times to obtain the results. PQsendQuery cannot be called
again (on the same connection) until PQgetResult has returned a null pointer,
indicating that the command is done.
PQerrorMessage can be consulted). Note that the result
does not say
whether any input data was actually collected. After calling
-PQconsumeInput , the application may check
+PQconsumeInput , the application can check
PQisBusy and/or PQnotifies to see if
their state has changed.
-PQconsumeInput may be called even if the application is not
+PQconsumeInput can be called even if the application is not
prepared to deal with a result or notification just yet. The
function will read available data and save it in a buffer, thereby
causing a select() read-ready indication to go away. The
or data values are sent. (It is much more probable if the application
sends data via COPY IN , however.) To prevent this possibility and achieve
completely nonblocking database operation, the following additional
-functions may be used.
+functions can be used.
-This interface is somewhat obsolete, as one may achieve similar performance
+This interface is somewhat obsolete, as one can achieve similar performance
and greater functionality by setting up a prepared statement to define the
function call. Then, executing the statement with binary transmission of
parameters and results substitutes for a fast-path function call.
is returned to indicate success or failure of the transfer. Its status
will be PGRES_COMMAND_OK for success or
PGRES_FATAL_ERROR if some problem was encountered.
- At this point further SQL commands may be issued via
+ At this point further SQL commands can be issued via
PQexec . (It is not possible to execute other SQL
commands using the same connection while the COPY
operation is in progress.)
-The application may divide the COPY data stream into buffer loads of any
+The application can divide the COPY data stream into buffer loads of any
convenient size. Buffer-load boundaries have no semantic significance when
sending. The contents of the data stream must match the data format expected
by the COPY> command; see
After successfully calling PQputCopyEnd>, call
PQgetResult> to obtain the final result status of the
-COPY> command. One may wait for
+COPY> command. One can wait for
this result to be available in the usual way. Then return to normal
operation.
After PQgetCopyData> returns -1, call
PQgetResult> to obtain the final result status of the
-COPY> command. One may wait for
+COPY> command. One can wait for
this result to be available in the usual way. Then return to normal
operation.
returned messages include severity, primary text, and position only;
this will normally fit on a single line. The default mode produces
messages that include the above plus any detail, hint, or context
-fields (these may span multiple lines). The VERBOSE>
+fields (these might span multiple lines). The VERBOSE>
mode includes all available fields. Changing the verbosity does not
affect the messages available from already-existing
PGresult> objects, only subsequently-created ones.
form before it is sent. The arguments are the cleartext password, and the SQL
name of the user it is for. The return value is a string allocated by
malloc , or NULL if out of memory.
-The caller may assume the string doesn't contain any special
+The caller can assume the string doesn't contain any special
characters that would require escaping. Use PQfreemem> to free
the result when done with it.
hostname :port :database :username :password
-Each of the first four fields may be a literal value, or * ,
+Each of the first four fields can be a literal value, or * ,
which matches anything. The password field from the first line that matches the
current connection parameters will be used. (Therefore, put more-specific
entries first when you are using wildcards.)
-
+
Large Objects
large object data. We use the
libpq C
library for the examples in this chapter, but most programming
interfaces native to
PostgreSQL support
- equivalent functionality. Other interfaces may use the large
+ equivalent functionality. Other interfaces might use the large
object interface internally to provide generic support for large
values. This is not described here.
Closing a Large Object Descriptor
- A large object descriptor may be closed by calling
+ A large object descriptor can be closed by calling
int lo_close(PGconn *conn, int fd);
is a sample program which shows how the large object
interface
- in libpq> may be used. Parts of the program are
+ in libpq> can be used. Parts of the program are
commented out but are left in the source for the reader's
benefit. This program can also be found in
src/test/examples/testlo.c in the source distribution.
-
+
Routine Database Maintenance Tasks
Clearly, a table that receives frequent updates or deletes will need
to be vacuumed more often than tables that are seldom updated. It
- may be useful to set up periodic cron> tasks that
+ might be useful to set up periodic cron> tasks that
VACUUM only selected tables, skipping tables that are known not to
change often. This is only likely to be helpful if you have both
large heavily-updated tables and large seldom-updated tables — the
If you have multiple databases
in a cluster, don't forget to VACUUM each one;
the program
- may be helpful.
+ might be helpful.
generate good plans for queries. These statistics are gathered by
the ANALYZE> command, which can be invoked by itself or
as an optional step in VACUUM>. It is important to have
- reasonably accurate statistics, otherwise poor choices of plans may
+ reasonably accurate statistics, otherwise poor choices of plans might
degrade database performance.
As with vacuuming for space recovery, frequent updates of statistics
are more useful for heavily-updated tables than for seldom-updated
- ones. But even for a heavily-updated table, there may be no need for
+ ones. But even for a heavily-updated table, there might be no need for
statistics updates if the statistical distribution of the data is
not changing much. A simple rule of thumb is to think about how much
the minimum and maximum values of the columns in the table change.
of row update will have a constantly-increasing maximum value as
rows are added and updated; such a column will probably need more
frequent statistics updates than, say, a column containing URLs for
- pages accessed on a website. The URL column may receive changes just
+ pages accessed on a website. The URL column might receive changes just
as often, but the statistical distribution of its values probably
changes relatively slowly.
- Although per-column tweaking of ANALYZE> frequency may not be
- very productive, you may well find it worthwhile to do per-column
+ Although per-column tweaking of ANALYZE> frequency might not be
+ very productive, you might well find it worthwhile to do per-column
adjustment of the level of detail of the statistics collected by
ANALYZE>. Columns that are heavily used in WHERE> clauses
- and have highly irregular data distributions may require a finer-grain
+ and have highly irregular data distributions might require a finer-grain
data histogram than other columns. See ALTER TABLE SET
STATISTICS>.
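For example, to collect a finer-grained histogram for one heavily-filtered, irregularly-distributed column (table and column names are hypothetical):

```sql
-- raise the statistics target for a single column, then refresh statistics
ALTER TABLE orders ALTER COLUMN customer_id SET STATISTICS 200;
ANALYZE orders;
```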
Recommended practice for most sites is to schedule a database-wide
ANALYZE> once a day at a low-usage time of day; this can
usefully be combined with a nightly VACUUM>. However,
- sites with relatively slowly changing table statistics may find that
+ sites with relatively slowly changing table statistics might find that
this is overkill, and that less-frequent ANALYZE> runs
are sufficient.
One disadvantage of decreasing vacuum_freeze_min_age> is that
- it may cause VACUUM> to do useless work: changing a table row's
+ it might cause VACUUM> to do useless work: changing a table row's
XID to FrozenXID> is a waste of time if the row is modified
soon thereafter (causing it to acquire a new XID). So the setting should
be large enough that rows are not frozen until they are unlikely to change
The number of obsolete tuples is obtained from the statistics
collector; it is a semi-accurate count updated by each
UPDATE and DELETE operation. (It
- is only semi-accurate because some information may be lost under heavy
+ is only semi-accurate because some information might be lost under heavy
load.) For analyze, a similar condition is used: the threshold, defined as
analyze threshold = analyze base threshold + analyze scale factor * number of tuples
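As a worked example, assuming the default settings of this era (an analyze base threshold of 50 and an analyze scale factor of 0.1), a table of 10000 tuples is re-analyzed once roughly this many tuples have been inserted, updated, or deleted:

```sql
SELECT 50 + 0.1 * 10000 AS analyze_threshold;   -- yields 1050
```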
postgres into a
file, you will have log output, but
the only way to truncate the log file is to stop and restart
- the server. This may be OK if you are using
+ the server. This might be OK if you are using
PostgreSQL in a development environment,
but few production servers would find this behavior acceptable.
On many systems, however,
syslog> is not very reliable,
- particularly with large log messages; it may truncate or drop messages
+ particularly with large log messages; it might truncate or drop messages
just when you need them the most. Also, on
Linux>,
syslog> will sync each message to disk, yielding poor
performance. (You can use a -> at the start of the file name
-
+
Managing Databases
exactly as described above.
The reference page contains the invocation
details. Note that createdb> without any arguments will create
- a database with the current user name, which may or may not be what
+ a database with the current user name, which might or might not be what
you want.
pg_dump> dump: the dump script should be restored in a
virgin database to ensure that one recreates the correct contents
of the dumped database, without any conflicts with additions that
- may now be present in template1>.
+ might now be present in template1>.
It is possible to create additional template databases, and indeed
- one may copy any database in a cluster by specifying its name
+ one can copy any database in a cluster by specifying its name
as the template for CREATE DATABASE>. It is important to
understand, however, that this is not (yet) intended as
a general-purpose COPY DATABASE
facility.
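Cloning an existing database this way is a one-line command (database names are hypothetical); note that the source database must have no other active connections while it is being copied:

```sql
-- create newdb as a copy of mydb
CREATE DATABASE newdb TEMPLATE mydb;
```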
Two useful flags exist in
pg_database pg_database>> for each
database: the columns datistemplate and
datallowconn . datistemplate
- may be set to indicate that a database is intended as a template for
- CREATE DATABASE>. If this flag is set, the database may be
+ can be set to indicate that a database is intended as a template for
+ CREATE DATABASE>. If this flag is set, the database can be
cloned by
any user with CREATEDB> privileges; if it is not set, only superusers
- and the owner of the database may clone it.
+ and the owner of the database can clone it.
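These flags can be inspected directly, and adjusted by a superuser; the UPDATE below is a sketch with a hypothetical database name:

```sql
-- list each database's template and connection flags
SELECT datname, datistemplate, datallowconn FROM pg_database;

-- mark a database as a template (superuser only)
UPDATE pg_database SET datistemplate = true WHERE datname = 'mytemplate';
```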
If datallowconn is false, then no new connections
to that database will be allowed (but existing sessions are not killed
simply by setting the flag false). The template0
The postgres> database is also created when a database
cluster is initialized. This database is meant as a default database for
users and applications to connect to. It is simply a copy of
- template1> and may be dropped and recreated if required.
+ template1> and can be dropped and recreated if required.
-
+
Monitoring Database Activity
but one should not neglect regular Unix monitoring programs such as
ps>, top>, iostat>, and vmstat>.
Also, once one has identified a
- poorly-performing query, further investigation may be needed using
+ poorly-performing query, further investigation might be needed using
PostgreSQL 's
endterm="sql-explain-title"> command.
discusses EXPLAIN>
The user, database, and connection source host items remain the same for
the life of the client connection, but the activity indicator changes.
- The activity may be idle> (i.e., waiting for a client command),
+ The activity can be idle> (i.e., waiting for a client command),
idle in transaction> (waiting for client inside a BEGIN> block),
or a command type name such as SELECT>. Also,
waiting> is attached if the server process is presently waiting
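A quick way to see these per-connection activity indicators is to query the pg_stat_activity> view (the column names shown are those of this era):

```sql
-- one row per server process, with its current activity
SELECT procpid, usename, datname, current_query, waiting
FROM pg_stat_activity;
```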
The parameter must be
set to true> for the statistics collector to be launched
- at all. This is the default and recommended setting, but it may be
+ at all. This is the default and recommended setting, but it can be
turned off if you have no interest in statistics and want to
squeeze out every last drop of overhead. (The savings is likely to
be small, however.) Note that this option cannot be changed while
invoking a kernel call. However, these statistics do not give the
entire story: due to the way in which
PostgreSQL>
handles disk I/O, data that is not in the
- PostgreSQL> buffer cache may still reside in the
- kernel's I/O cache, and may therefore still be fetched without
+ PostgreSQL> buffer cache might still reside in the
+ kernel's I/O cache, and might therefore still be fetched without
requiring a physical read. Users interested in obtaining more
detailed information on
PostgreSQL> I/O behavior are
advised to use the
PostgreSQL> statistics collector
You should remember that trace programs need to be carefully written and
- debugged prior to their use, otherwise the trace information collected may
+ debugged prior to their use, otherwise the trace information collected might
be meaningless. In most cases where problems are found it is the
instrumentation that is at fault, not the underlying system. When
discussing information found using dynamic tracing, be sure to enclose
- The dynamic tracing utility may require you to further define these trace
+ The dynamic tracing utility might require you to further define these trace
points. For example, DTrace requires you to add new probes to the file
src/backend/utils/probes.d> as shown here:
-
+
Concurrency Control
Committed and Serializable. When you select the level Read
Uncommitted you really get Read Committed, and when you select
Repeatable Read you really get Serializable, so the actual
- isolation level may be stricter than what you select. This is
+ isolation level might be stricter than what you select. This is
permitted by the SQL standard: the four isolation levels only
define which phenomena must not happen, they do not define which
phenomena must happen. The reason that
PostgreSQL>
behave the same as SELECT
in terms of searching for target rows: they will only find target rows
that were committed as of the command start time. However, such a target
- row may have already been updated (or deleted or locked) by
+ row might have already been updated (or deleted or locked) by
another concurrent transaction by the time it is found. In this case, the
would-be updater will wait for the first updating transaction to commit or
roll back (if it is still in progress). If the first updater rolls back,
The partial transaction isolation provided by Read Committed mode is
adequate for many applications, and this mode is fast and simple to use.
- However, for applications that do complex queries and updates, it may
+ However, for applications that do complex queries and updates, it might
be necessary to guarantee a more rigorously consistent view of the
database than the Read Committed mode provides.
in terms of searching for target rows: they will only find target rows
that were committed as of the transaction start time. However, such a
target
- row may have already been updated (or deleted or locked) by
+ row might have already been updated (or deleted or locked) by
another concurrent transaction by the time it is found. In this case, the
serializable transaction will wait for the first updating transaction to commit or
roll back (if it is still in progress). If the first updater rolls back,
- Note that only updating transactions may need to be retried; read-only
+ Note that only updating transactions might need to be retried; read-only
transactions will never have serialization conflicts.
transaction sees a wholly consistent view of the database. However,
the application has to be prepared to retry transactions when concurrent
updates make it impossible to sustain the illusion of serial execution.
- Since the cost of redoing complex transactions may be significant,
+ Since the cost of redoing complex transactions might be significant,
this mode is recommended only when updating transactions contain logic
- sufficiently complex that they may give wrong answers in Read
+ sufficiently complex that they might give wrong answers in Read
Committed mode. Most commonly, Serializable mode is necessary when
a transaction executes several successive commands that must see
identical views of the database.
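For example, a report that must see one consistent snapshot across several successive queries can be wrapped like this (the table name is hypothetical):

```sql
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT sum(balance) FROM accounts;
-- further queries in this transaction see the same snapshot
SELECT count(*) FROM accounts;
COMMIT;
```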
The intuitive meaning (and mathematical definition) of
serializable> execution is that any two successfully committed
concurrent transactions will appear to have executed strictly serially,
- one after the other — although which one appeared to occur first may
+ one after the other — although which one appeared to occur first might
not be predictable in advance. It is important to realize that forbidding
the undesirable behaviors listed in
is not sufficient to guarantee true serializability, and in fact
between one lock mode and another is the set of lock modes with
which each conflicts. Two transactions cannot hold locks of conflicting
modes on the same table at the same time. (However, a transaction
- never conflicts with itself. For example, it may acquire
+ never conflicts with itself. For example, it might acquire
ACCESS EXCLUSIVE lock and later acquire
ACCESS SHARE lock on the same table.) Non-conflicting
- lock modes may be held concurrently by many transactions. Notice in
+ lock modes can be held concurrently by many transactions. Notice in
particular that some lock modes are self-conflicting (for example,
an ACCESS EXCLUSIVE lock cannot be held by more than one
transaction at a time) while others are not self-conflicting (for example,
To acquire an exclusive row-level lock on a row without actually
modifying the row, select the row with SELECT FOR
UPDATE. Note that once the row-level lock is acquired,
- the transaction may update the row multiple times without
+ the transaction can update the row multiple times without
fear of conflicts.
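A minimal sketch, using a hypothetical accounts table:

```sql
BEGIN;
-- lock the row without changing it yet
SELECT * FROM accounts WHERE id = 1 FOR UPDATE;
-- the same transaction can now update it repeatedly without conflict
UPDATE accounts SET balance = balance - 100.00 WHERE id = 1;
COMMIT;
```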
PostgreSQL doesn't remember any
information about modified rows in memory, so it has no limit to
the number of rows locked at one time. However, locking a row
- may cause a disk write; thus, for example, SELECT FOR
+ might cause a disk write; thus, for example, SELECT FOR
UPDATE will modify selected rows to mark them locked, and so
will result in disk writes.
occurred. One should also ensure that the first lock acquired on
an object in a transaction is the highest mode that will be
needed for that object. If it is not feasible to verify this in
- advance, then deadlocks may be handled on-the-fly by retrying
+ advance, then deadlocks can be handled on-the-fly by retrying
transactions that are aborted due to deadlock.
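One way to apply both rules at once is to take all needed table locks up front, in a single statement and always in the same order (table names are hypothetical):

```sql
BEGIN;
-- acquire the highest mode needed, for all tables, in one consistent order
LOCK TABLE accounts, branches IN SHARE ROW EXCLUSIVE MODE;
-- ... updates on accounts and branches ...
COMMIT;
```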
Another way to think about it is that each
transaction sees a snapshot of the database contents, and concurrently
- executing transactions may very well see different snapshots. So the
+ executing transactions might very well see different snapshots. So the
whole concept of now
is somewhat ill-defined anyway.
This is not normally
a big problem if the client applications are isolated from each other,
but if the clients can communicate via channels outside the database
- then serious confusion may ensue.
+ then serious confusion might ensue.
lock(s) before performing queries. A lock obtained by a
serializable transaction guarantees that no other transactions modifying
the table are still running, but if the snapshot seen by the
- transaction predates obtaining the lock, it may predate some now-committed
+ transaction predates obtaining the lock, it might predate some now-committed
changes in the table. A serializable transaction's snapshot is actually
frozen at the start of its first query or data-modification command
(SELECT>, INSERT>,
read/write access. Locks are released immediately after each
index row is fetched or inserted. But note that a GIN-indexed
value insertion usually produces several index key insertions
- per row, so GIN may do substantial work for a single value's
+ per row, so GIN might do substantial work for a single value's
insertion.
-
+
The # character introduces a comment. If whitespace immediately
follows the # character, then this is a comment maintained by the
- translator. There may also be automatic comments, which have a
+ translator. There can also be automatic comments, which have a
non-whitespace character immediately following the #. These are
maintained by the various tools that operate on the PO files and
are intended to aid the translator.
ISO 639-1 two-letter language code (in lower case), e.g.,
fr.po for French. If there is really a need
for more than one translation effort per language then the files
- may also be named
+ can also be named
language _region .po
where region is the
AVAIL_LANGUAGES := de fr
- (Other languages may appear, of course.)
+ (Other languages can appear, of course.)
- As the underlying program or library changes, messages may be
+ As the underlying program or library changes, messages might be
changed or added by the programmers. In this case you do not need
to start from scratch. Instead, run the command
The PO files can be edited with a regular text editor. The
translator should only change the area between the quotes after
- the msgstr directive, may add comments and alter the fuzzy flag.
+ the msgstr directive, add comments, and alter the fuzzy flag.
There is (unsurprisingly) a PO mode for Emacs, which I find quite
useful.
digits $ needs to
follow the % immediately, before any other format manipulators.
(This feature really exists in the printf
- family of functions. You may not have heard of it before because
+ family of functions. You might not have heard of it before because
there is little use for it outside of message
internationalization.)
normally. The corrected string can be merged in when the
program sources have been updated. If the original string
contains a factual mistake, report that (or fix it yourself)
- and do not translate it. Instead, you may mark the string with
+ and do not translate it. Instead, you can mark the string with
a comment in the PO file.
open file %s) should probably not start with a
capital letter (if your language distinguishes letter case) or
end with a period (if your language uses punctuation marks).
- It may help to read .
+ It might help to read .
- This may tend to add a lot of clutter. One common shortcut is to use
+ This tends to add a lot of clutter. One common shortcut is to use
#define _(x) gettext(x)
printf("Files were %s.\n", flag ? "copied" : "removed");
- The word order within the sentence may be different in other
+ The word order within the sentence might be different in other
languages. Also, even if you remember to call gettext() on each
- fragment, the fragments may not translate well separately. It's
+ fragment, the fragments might not translate well separately. It's
better to duplicate a little code so that each message to be
translated is a coherent whole. Only numbers, file names, and
such-like run-time variables should be inserted at run time into
printf("copied %d files", n):
then be disappointed. Some languages have more than two forms,
- with some peculiar rules. We may have a solution for this in
+ with some peculiar rules. We might have a solution for this in
the future, but for now the matter is best avoided altogether.
You could write:
-
+
Performance Tips
- Estimated total cost (If all rows were to be retrieved, which they may
+ Estimated total cost (If all rows were to be retrieved, which they might
not be: for example, a query with a LIMIT> clause will stop
short of paying the total cost of the Limit> plan node's
input node.)
- If the WHERE> condition is selective enough, the planner may
+ If the WHERE> condition is selective enough, the planner might
switch to a simple> index scan plan:
run time will normally be just a little larger than the total time
reported for the top-level plan node. For INSERT>,
UPDATE>, and DELETE> commands, the total run time
- may be considerably larger, because it includes the time spent processing
+ might be considerably larger, because it includes the time spent processing
the result rows. In these commands, the time for the top plan node
essentially is the time spent computing the new rows and/or locating the
old ones, but it doesn't include the time spent applying the changes.
It is worth noting that EXPLAIN> results should not be extrapolated
to situations other than the one you are actually testing; for example,
results on a toy-sized table can't be assumed to apply to large tables.
- The planner's cost estimates are not linear and so it may well choose
+ The planner's cost estimates are not linear and so it might well choose
a different plan for a larger or smaller table. An extreme example
is that on a table that only occupies one disk page, you'll nearly
always get a sequential scan plan whether indexes are available or not.
command, or globally by setting the
configuration variable.
The default limit is presently 10 entries. Raising the limit
- may allow more accurate planner estimates to be made, particularly for
+ might allow more accurate planner estimates to be made, particularly for
columns with irregular data distributions, at the price of consuming
more space in pg_statistic and slightly more
- time to compute the estimates. Conversely, a lower limit may be
+ time to compute the estimates. Conversely, a lower limit might be
appropriate for columns with simple data distributions.
between two input tables, so it's necessary to build up the result
in one or another of these fashions.) The important point is that
these different join possibilities give semantically equivalent
- results but may have hugely different execution costs. Therefore,
+ results but might have hugely different execution costs. Therefore,
the planner will explore all of them to try to find the most
efficient query plan.
orders to worry about. But the number of possible join orders grows
exponentially as the number of tables expands. Beyond ten or so input
tables it's no longer practical to do an exhaustive search of all the
- possibilities, and even for six or seven tables planning may take an
+ possibilities, and even for six or seven tables planning might take an
annoyingly long time. When there are too many input tables, the
PostgreSQL planner will switch from exhaustive
search to a genetic probabilistic search
Therefore the planner has no choice of join order here: it must join
B to C and then join A to that result. Accordingly, this query takes
less time to plan than the previous query. In other cases, the planner
- may be able to determine that more than one join order is safe.
+ might be able to determine that more than one join order is safe.
For example, given
SELECT * FROM a LEFT JOIN b ON (a.bid = b.id) LEFT JOIN c ON (a.cid = c.id);
Populating a Database
- One may need to insert a large amount of data when first populating
+ One might need to insert a large amount of data when first populating
a database. This section contains some suggestions on how to make
this process as efficient as possible.
Turn off autocommit and just do one commit at the end. (In plain
SQL, this means issuing BEGIN at the start and
- COMMIT at the end. Some client libraries may
+ COMMIT at the end. Some client libraries might
do this behind your back, in which case you need to make sure the
library does it when you want it done.) If you allow each
insertion to be committed separately,
- If you cannot use COPY , it may help to use
+ If you cannot use COPY , it might help to use
linkend="sql-prepare" endterm="sql-prepare-title"> to create a
prepared INSERT statement, and then use
EXECUTE as many times as required. This avoids
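A sketch of the PREPARE>/EXECUTE> pattern (the statement and table names are hypothetical):

```sql
PREPARE bulk_insert (integer, text) AS
    INSERT INTO mytable VALUES ($1, $2);
EXECUTE bulk_insert(1, 'one');
EXECUTE bulk_insert(2, 'two');
-- ... one EXECUTE per row ...
```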
If you are adding large amounts of data to an existing table,
- it may be a win to drop the index,
+ it might be a win to drop the index,
load the table, and then recreate the index. Of course, the
- database performance for other users may be adversely affected
+ database performance for other users might be adversely affected
during the time that the index is missing. One should also think
twice before dropping unique indexes, since the error checking
afforded by the unique constraint will be lost while the index is
Just as with indexes, a foreign key constraint can be checked
- in bulk> more efficiently than row-by-row. So it may be
+ in bulk> more efficiently than row-by-row. So it might be
useful to drop foreign key constraints, load data, and re-create
the constraints. Again, there is a trade-off between data load
speed and loss of error checking while the constraint is missing.
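The drop-and-recreate pattern for a foreign key might look like this (all names and the data file path are hypothetical):

```sql
ALTER TABLE orders DROP CONSTRAINT orders_customer_fkey;
COPY orders FROM '/tmp/orders.dat';
ALTER TABLE orders ADD CONSTRAINT orders_customer_fkey
    FOREIGN KEY (customer_id) REFERENCES customers (id);
```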
Turn off archive_command
- When loading large amounts of data you may want to unset the
- before loading. It may be
+ When loading large amounts of data you might want to unset the
+ before loading. It might be
faster to take a new base backup once the load has completed
than to allow a large archive to accumulate.
includes bulk loading large amounts of data into the table. Running
ANALYZE (or VACUUM ANALYZE )
ensures that the planner has up-to-date statistics about the
- table. With no statistics or obsolete statistics, the planner may
+ table. With no statistics or obsolete statistics, the planner might
make poor decisions during query planning, leading to poor
performance on any tables with inaccurate or nonexistent
statistics.
-
+
How the Planner Uses Statistics
The outputs and algorithms shown below are taken from version 8.0.
- The behavior of earlier (or later) versions may vary.
+ The behavior of earlier (or later) versions might vary.
345 | 10000
The planner will check the relpages
- estimate (this is a cheap operation) and if incorrect may scale
+ estimate (this is a cheap operation) and if incorrect might scale
reltuples to obtain a row estimate. In this
case it does not, thus:
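The relpages> and reltuples> figures the planner starts from can be inspected directly in the system catalog:

```sql
-- per-table page and row estimates recorded by VACUUM/ANALYZE
SELECT relpages, reltuples FROM pg_class WHERE relname = 'tenk1';
```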
-
+
PL/Perl - Perl Procedural Language
The usual advantage to using PL/Perl is that this allows use,
within stored functions, of the manyfold string
munging operators and functions available for Perl. Parsing
- complex strings may be be easier using Perl than it is with the
+ complex strings might be easier using Perl than it is with the
string functions and control structures provided in PL/pgSQL.
spi_query and spi_fetchrow
- work together as a pair for row sets which may be large, or for cases
+ work together as a pair for row sets which might be large, or for cases
where you wish to return rows as they arrive.
spi_fetchrow works only with
spi_query . The following example illustrates how
The advantage of prepared queries is that it is possible to use one prepared plan for more
- than one query execution. After the plan is not needed anymore, it may be freed with
+ than one query execution. After the plan is not needed anymore, it can be freed with
spi_freeplan :
external modules). There is no way to access internals of the
database server process or to gain OS-level access with the
permissions of the server process,
- as a C function can do. Thus, any unprivileged database user may
+ as a C function can do. Thus, any unprivileged database user can
be permitted to use this language.
-
+
PL/pgSQL - SQL Procedural Language
substantially reduce the total amount of time required to parse
and generate execution plans for the statements in a
PL/pgSQL> function. A disadvantage is that errors
- in a specific expression or command may not be detected until that
+ in a specific expression or command cannot be detected until that
part of the function is reached in execution.
- PL/pgSQL> functions may also be declared to accept
+ PL/pgSQL> functions can also be declared to accept
and return the polymorphic types
anyelement and anyarray . The actual
data types handled by a polymorphic function can vary from call to
- Finally, a PL/pgSQL> function may be declared to return
+ Finally, a PL/pgSQL> function can be declared to return
void> if it has no useful return value.
The following chart shows what you have to do when writing quote
- marks without dollar quoting. It may be useful when translating
+ marks without dollar quoting. It might be useful when translating
pre-dollar quoting code into something more comprehensible.
type of the structure you are referencing, and most importantly,
if the data type of the referenced item changes in the future (for
instance: you change the type of user_id>
- from integer to real ), you may not need
+ from integer to real ), you might not need
to change your function definition.
%TYPE is particularly valuable in polymorphic
- functions, since the data types needed for internal variables may
+ functions, since the data types needed for internal variables can
change from one call to the next. Appropriate variables can be
created by applying %TYPE to the function's
arguments or result placeholders.
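A brief sketch of %TYPE in a declaration, assuming a hypothetical users table:

```sql
CREATE FUNCTION get_user_name(integer) RETURNS text AS $$
DECLARE
    -- v_id automatically tracks the declared type of users.user_id
    v_id users.user_id%TYPE := $1;
BEGIN
    RETURN (SELECT name FROM users WHERE user_id = v_id);
END;
$$ LANGUAGE plpgsql;
```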
Note that RECORD> is not a true data type, only a placeholder.
One should also realize that when a
PL/pgSQL
function is declared to return type record>, this is not quite the
- same concept as a record variable, even though such a function may well
+ same concept as a record variable, even though such a function might well
use a record variable to hold its result. In both cases the actual row
structure is unknown when the function is written, but for a function
returning record> the actual structure is determined when the
loops). FOUND is set this way when the
FOR> loop exits; inside the execution of the loop,
FOUND is not modified by the
- FOR> statement, although it may be changed by the
+ FOR> statement, although it might be changed by the
execution of other statements within the loop body.
for
PL/pgSQL> stores the entire result set
before returning from the function, as discussed above. That
means that if a
PL/pgSQL> function produces a
- very large result set, performance may be poor: data will be
+ very large result set, performance might be poor: data will be
written to disk to avoid memory exhaustion, but the function
itself will not return until the entire result set has been
- generated. A future version of PL/pgSQL> may
+ generated. A future version of PL/pgSQL> might
allow users to define set-returning functions
that do not have this limitation. Currently, the point at
which data begins being written to disk is controlled by the
name CURSOR ( arguments ) FOR query ;
- (FOR> may be replaced by IS> for
+ (FOR> can be replaced by IS> for
arguments , if specified, is a
comma-separated list of pairs name
curs3 CURSOR (key integer) IS SELECT * FROM tenk1 WHERE unique1 = key;
All three of these variables have the data type refcursor>,
- but the first may be used with any query, while the second has
+ but the first can be used with any query, while the second has
a fully specified query already bound> to it, and the last
has a parameterized query bound to it. (key> will be
replaced by an integer parameter value when the cursor is opened.)
FETCH retrieves the next row from the
- cursor into a target, which may be a row variable, a record
+ cursor into a target, which might be a row variable, a record
variable, or a comma-separated list of simple variables, just like
SELECT INTO . As with SELECT
- INTO, the special variable FOUND may
+ INTO, the special variable FOUND can
be checked to see whether a row was obtained or not.
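Opening and fetching from the parameterized cursor declared above might look like this (rowvar> is a hypothetical, previously declared record variable):

```sql
OPEN curs3(42);             -- key is replaced by 42
FETCH curs3 INTO rowvar;
IF FOUND THEN
    RAISE NOTICE 'fetched: %', rowvar;
END IF;
CLOSE curs3;
```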
- Row-level triggers fired BEFORE> may return null to signal the
+ Row-level triggers fired BEFORE> can return null to signal the
trigger manager to skip the rest of the operation for this row
(i.e., subsequent triggers are not fired, and the
INSERT>/UPDATE>/DELETE> does not occur
The return value of a BEFORE> or AFTER>
statement-level trigger or an AFTER> row-level trigger is
- always ignored; it may as well be null. However, any of these types of
- triggers can still abort the entire operation by raising an error.
+ always ignored; it might as well be null. However, any of these types of
+ triggers might still abort the entire operation by raising an error.
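A minimal BEFORE> row-level trigger that skips the row operation by returning null, as described above (all names are hypothetical):

```sql
CREATE FUNCTION ignore_deletes() RETURNS trigger AS $$
BEGIN
    RETURN NULL;   -- skip the DELETE for this row
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER protect_rows BEFORE DELETE ON important_table
    FOR EACH ROW EXECUTE PROCEDURE ignore_deletes();
```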
original table for certain queries — often with vastly reduced run
times.
This technique is commonly used in Data Warehousing, where the tables
- of measured or observed data (called fact tables) can be extremely large.
+ of measured or observed data (called fact tables) might be extremely large.
shows an example of a
trigger procedure in
PL/pgSQL that maintains
a summary table for a fact table in a data warehouse.
-
+
PL/Python - Python Procedural Language
available as an untrusted> language (meaning it does not
offer any way of restricting what users can do in it). It has
therefore been renamed to plpythonu>. The trusted
- variant plpython> may become available again in future,
+ variant plpython> might become available again in future,
if a new secure execution mechanism is developed in Python.
- If TD["when"] is BEFORE>, you may
+ If TD["when"] is BEFORE>, you can
return None or "OK" from the
Python function to indicate the row is unmodified,
"SKIP"> to abort the event, or "MODIFY"> to
-
+
PL/Tcl - Tcl Procedural Language
provides no way to access internals of the database server or to
gain OS-level access under the permissions of the
PostgreSQL server process, as a C
- function can do. Thus, unprivileged database users may be trusted
+ function can do. Thus, unprivileged database users can be trusted
to use this language; it does not give them unlimited authority.
The other notable implementation restriction is that Tcl functions
- may not be used to create input/output functions for new data
+ cannot be used to create input/output functions for new data
types.
PL/Tcl>>
- The query may use parameters, that is, placeholders for
+ The query can use parameters, that is, placeholders for
values to be supplied whenever the plan is actually executed.
In the query string, refer to parameters
by the symbols $1 ... $n .
Doubles all occurrences of single quote and backslash characters
- in the given string. This may be used to safely quote strings
+ in the given string. This can be used to safely quote strings
that are to be inserted into SQL commands given
to spi_exec or
spi_prepare .
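The doubling rule described here can be mimicked in a couple of lines (a sketch of the escaping rule only, not the actual Tcl command):

```python
def quote_for_sql(s: str) -> str:
    """Double every single quote and backslash, per the rule above,
    so the result can be embedded in a quoted SQL literal."""
    return s.replace("\\", "\\\\").replace("'", "''")
```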
Tcl Procedure Names
- In PostgreSQL , one and the same function name can be used for
- different functions as long as the number of arguments or their types
+ In PostgreSQL , the same function name can be used for
+ different function definitions as long as the number of arguments or their types
differ. Tcl, however, requires all procedure names to be distinct.
PL/Tcl deals with this by making the internal Tcl procedure names contain
the object
-
+
- This part contains assorted information that can be of use to
+ This part contains assorted information that might be of use to
-
+
Bug Reporting Guidelines
If the function or the options do not exist then your version is
more than old enough to warrant an upgrade.
If you run a prepackaged version, such as RPMs, say so, including any
- subversion the package may have. If you are talking about a CVS
+ subversion the package might have. If you are talking about a CVS
snapshot, mention that, including its date and time.
-
+
Frontend/Backend Protocol
The protocol is supported over
TCP/IP and also over
Unix-domain sockets. Port number 5432 has been registered with IANA as
the customary TCP port number for servers supporting this protocol, but
- in practice any non-privileged port number may be used.
+ in practice any non-privileged port number can be used.
count) before attempting to process its contents. This allows easy
recovery if an error is detected while processing the contents. In
extreme situations (such as not having enough memory to buffer the
- message), the receiver may use the byte count to determine how much
+ message), the receiver can use the byte count to determine how much
input to skip before it resumes reading messages.
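The length-prefixed framing described above might look like this on the receiving side (a sketch assuming the usual layout of a one-byte type code followed by a four-byte big-endian length that counts itself plus the body):

```python
import struct

def next_message(buf: bytes, offset: int):
    """Return (type, body, next_offset) for the message at offset.
    The length word counts itself and the body, but not the type byte,
    so a receiver can skip a message it cannot buffer or parse."""
    mtype = buf[offset:offset + 1]
    (length,) = struct.unpack_from("!I", buf, offset + 1)
    body = buf[offset + 5 : offset + 1 + length]
    return mtype, body, offset + 1 + length
```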
portals>. A prepared statement represents the result of
parsing, semantic analysis, and (optionally) planning of a textual query
string.
- A prepared statement is not necessarily ready to execute, because it may
+ A prepared statement is not necessarily ready to execute, because it might
lack specific values for parameters>. A portal represents
a ready-to-execute or already-partially-executed statement, with any
missing parameter values filled in. (For SELECT> statements,
execute> step that runs a portal's query. In the case of
a query that returns rows (SELECT>, SHOW>, etc),
the execute step can be told to fetch only
- a limited number of rows, so that multiple execute steps may be needed
+ a limited number of rows, so that multiple execute steps might be needed
to complete the operation.
the only supported formats are text> and binary>,
but the protocol makes provision for future extensions. The desired
format for any value is specified by a format code>.
- Clients may specify a format code for each transmitted parameter value
+ Clients can specify a format code for each transmitted parameter value
and for each column of a query result. Text has format code zero,
binary has format code one, and all other format codes are reserved
for future definition.
Binary representations for integers use network byte order (most
significant byte first). For other data types consult the documentation
or source code to learn about the binary representation. Keep in mind
- that binary representations for complex data types may change across
+ that binary representations for complex data types might change across
server versions; the text format is usually the more portable choice.
This message informs the frontend about the current (initial)
setting of backend parameters, such as
- linkend="guc-client-encoding"> or .
- The frontend may ignore this message, or record the settings
+ linkend="guc-client-encoding"> or .
+ The frontend can ignore this message, or record the settings
for its future use; see for
more details. The frontend should not respond to this
message, but should continue listening for a ReadyForQuery
ReadyForQuery
- Start-up is completed. The frontend may now issue commands.
+ Start-up is completed. The frontend can now issue commands.
The backend then sends one or more response
messages depending on the contents of the query command string,
and finally a ReadyForQuery response message. ReadyForQuery
- informs the frontend that it may safely send a new command.
+ informs the frontend that it can safely send a new command.
(It is not actually necessary for the frontend to wait for
ReadyForQuery before issuing another command, but the frontend must
then take responsibility for figuring out what happens if the earlier
Processing of the query string is complete. A separate
- message is sent to indicate this because the query string may
+ message is sent to indicate this because the query string might
contain multiple SQL commands. (CommandComplete marks the
end of processing one SQL command, not the whole string.)
ReadyForQuery will always be sent, whether processing
In the event of an error, ErrorResponse is issued followed by
ReadyForQuery. All further processing of the query string is aborted by
ErrorResponse (even if more queries remained in it). Note that this
- may occur partway through the sequence of messages generated by an
+ might occur partway through the sequence of messages generated by an
individual query.
A frontend must be prepared to accept ErrorResponse and
NoticeResponse messages whenever it is expecting any other type of
message. See also concerning messages
- that the backend may generate due to outside events.
+ that the backend might generate due to outside events.
about data types of parameter placeholders, and the
name of a destination prepared-statement object (an empty string
selects the unnamed prepared statement). The response is
- either ParseComplete or ErrorResponse. Parameter data types may be
+ either ParseComplete or ErrorResponse. Parameter data types can be
specified by OID; if not given, the parser attempts to infer the
data types in the same way as it would do for untyped literal string
constants.
Query planning for named prepared-statement objects occurs when the Parse
message is processed. If a query will be repeatedly executed with
- different parameters, it may be beneficial to send a single Parse message
+ different parameters, it might be beneficial to send a single Parse message
containing a parameterized query, followed by multiple Bind
and Execute messages. This will avoid replanning the query on each
execution.
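A Parse message as described might be assembled like this (a sketch of the documented layout; OID 23 in the test below is the type OID for four-byte integers):

```python
import struct

def build_parse(stmt_name: str, query: str, param_oids=()) -> bytes:
    """Assemble a Parse ('P') message: two NUL-terminated strings
    (statement name, then query text), a 16-bit count of parameter
    type OIDs, then the OIDs themselves.  An empty OID list asks the
    server to infer the parameter types."""
    body = stmt_name.encode() + b"\x00" + query.encode() + b"\x00"
    body += struct.pack("!H", len(param_oids))
    for oid in param_oids:
        body += struct.pack("!I", oid)
    # Length word counts itself plus the body, not the type byte.
    return b"P" + struct.pack("!I", len(body) + 4) + body
```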
- Query plans generated from a parameterized query may be less
+ Query plans generated from a parameterized query might be less
efficient than query plans generated from an equivalent query with actual
parameter values substituted. The query planner cannot make decisions
based on actual parameter values (for example, index selectivity) when
FunctionCall message to the backend. The backend then sends one
or more response messages depending on the results of the function
call, and finally a ReadyForQuery response message. ReadyForQuery
- informs the frontend that it may safely send a new query or
+ informs the frontend that it can safely send a new query or
function call.
At present, NotificationResponse can only be sent outside a
transaction, and thus it will not occur in the middle of a
- command-response series, though it may occur just before ReadyForQuery.
+ command-response series, though it might occur just before ReadyForQuery.
It is unwise to design frontend logic that assumes that, however.
Good practice is to be able to accept NotificationResponse at any
point in the protocol.
Cancelling Requests in Progress
- During the processing of a query, the frontend may request
+ During the processing of a query, the frontend might request
cancellation of the query. The cancel request is not sent
directly on the open connection to the backend for reasons of
implementation efficiency: we don't want to have the backend
- The cancellation signal may or may not have any effect — for
+ The cancellation signal might or might not have any effect — for
example, if it arrives after the backend has finished processing
the query, then it will have no effect. If the cancellation is
effective, it results in the current command being terminated
server and not across the regular frontend/backend communication
link, it is possible for the cancel request to be issued by any
process, not just the frontend whose query is to be canceled.
- This may have some benefits of flexibility in building
+ This might provide additional flexibility when building
multiple-process applications. It also introduces a security
risk, in that unauthorized persons might try to cancel queries.
The security risk is addressed by requiring a dynamically
In rare cases (such as an administrator-commanded database shutdown)
- the backend may disconnect without any frontend request to do so.
+ the backend might disconnect without any frontend request to do so.
In such cases the backend will attempt to send an error or notice message
giving the reason for the disconnection before it closes the connection.
is being processed, the backend will probably finish the query
before noticing the disconnection. If the query is outside any
transaction block (BEGIN> ... COMMIT>
- sequence) then its results may be committed before the
+ sequence) then its results might be committed before the
disconnection is recognized.
StartupMessage. The server then responds with a single byte
containing S> or N>, indicating that it is
willing or unwilling to perform
SSL ,
- respectively. The frontend may close the connection at this point
+ respectively. The frontend might close the connection at this point
if it is dissatisfied with the response. To continue after
S>, perform an SSL startup handshake
(not described here, part of the
SSL
response to SSLRequest from the server. This would only occur if
the server predates the addition of
SSL support
to
PostgreSQL>. In this case the connection must
- be closed, but the frontend may choose to open a fresh connection
+ be closed, but the frontend might choose to open a fresh connection
and proceed without requesting
SSL .
- An initial SSLRequest may also be used in a connection that is being
+ An initial SSLRequest can also be used in a connection that is being
opened to send a CancelRequest message.
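Both special start-up packets mentioned here share the same shape: a length word followed by a magic request code (a sketch; 80877103 and 80877102 are the documented SSLRequest and CancelRequest codes):

```python
import struct

def ssl_request() -> bytes:
    # Int32 length (8) + Int32 SSLRequest code
    return struct.pack("!II", 8, 80877103)

def cancel_request(pid: int, secret: int) -> bytes:
    # Int32 length (16) + Int32 CancelRequest code + process ID + secret key
    return struct.pack("!IIII", 16, 80877102, pid, secret)
```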
While the protocol itself does not provide a way for the server to
- force SSL encryption, the administrator may
+ force SSL encryption, the administrator can
configure the server to reject unencrypted sessions as a byproduct
of authentication checking.
This section describes the detailed format of each message. Each is marked to
-indicate that it may be sent by a frontend (F), a backend (B), or both
+indicate that it can be sent by a frontend (F), a backend (B), or both
(F & B).
Notice that although each message includes a byte count at the beginning,
the message format is defined so that the message end can be found without
reference to the byte count. This aids validity checking. (The CopyData
message is an exception, because it forms part of a data stream; the contents
-of any individual CopyData message may not be interpretable on their own.)
+of any individual CopyData message might not be interpretable on their own.)
Data that forms part of a COPY data stream. Messages sent
from the backend will always correspond to single data rows,
- but messages sent by frontends may divide the data stream
+ but messages sent by frontends might divide the data stream
arbitrarily.
The message body consists of one or more identified fields,
- followed by a zero byte as a terminator. Fields may appear in
+ followed by a zero byte as a terminator. Fields can appear in
any order. For each field there is the following:
the message terminator and no string follows.
The presently defined field types are listed in
.
- Since more field types may be added in future,
+ Since more field types might be added in future,
frontends should silently ignore fields of unrecognized
type.
The message body consists of one or more identified fields,
- followed by a zero byte as a terminator. Fields may appear in
+ followed by a zero byte as a terminator. Fields can appear in
any order. For each field there is the following:
the message terminator and no string follows.
The presently defined field types are listed in
.
- Since more field types may be added in future,
+ Since more field types might be added in future,
frontends should silently ignore fields of unrecognized
type.
The number of parameters used by the statement
- (may be zero).
+ (can be zero).
The number of parameter data types specified
- (may be zero). Note that this is not an indication of
+ (can be zero). Note that this is not an indication of
the number of parameters that might appear in the
query string, only the number that the frontend wants to
prespecify types for.
- Specifies the number of fields in a row (may be zero).
+ Specifies the number of fields in a row (can be zero).
In addition to the above, any run-time parameter that can be
- set at backend start time may be listed. Such settings
+ set at backend start time can be listed. Such settings
will be applied during backend start (after parsing the
command-line options if any). The values will act as
session defaults.
Error and Notice Message Fields
-This section describes the fields that may appear in ErrorResponse and
+This section describes the fields that can appear in ErrorResponse and
NoticeResponse messages. Each field type has a single-byte identification
token. Note that any given field type should appear at most once per
message.
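The field list described above can be decoded with a short loop (a sketch: each field is a one-byte type token followed by a NUL-terminated string, and a lone zero byte ends the list):

```python
def parse_fields(body: bytes) -> dict:
    """Split an ErrorResponse/NoticeResponse body into {token: text}.
    Unrecognized tokens are kept rather than rejected, matching the
    advice that frontends silently ignore unknown field types."""
    fields, i = {}, 0
    while body[i:i + 1] not in (b"", b"\x00"):
        end = body.index(b"\x00", i + 1)
        fields[chr(body[i])] = body[i + 1:end].decode()
        i = end + 1
    return fields
```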
Detail: an optional secondary error message carrying more
- detail about the problem. May run to multiple lines.
+ detail about the problem. Might run to multiple lines.
Hint: an optional suggestion what to do about the problem.
This is intended to differ from Detail in that it offers advice
(potentially inappropriate) rather than hard facts.
- May run to multiple lines.
+ Might run to multiple lines.
ErrorResponse and NoticeResponse ('E>' and 'N>')
-messages now contain multiple fields, from which the client code may
+messages now contain multiple fields, from which the client code can
assemble an error message of the desired level of verbosity. Note that
individual fields will typically not end with a newline, whereas the single
string sent in the older protocol always did.
backend message types ParseComplete, BindComplete, PortalSuspended,
ParameterDescription, NoData, and CloseComplete. Existing clients do not
have to concern themselves with this sub-protocol, but making use of it
-may allow improvements in performance or functionality.
+might allow improvements in performance or functionality.
The NotificationResponse ('A>') message has an additional string
-field, which is presently empty but may someday carry additional data passed
+field, which is presently empty but might someday carry additional data passed
from the NOTIFY event sender.
-
+
Queries
FROM table_reference , table_reference , ...
- A table reference may be a table name (possibly schema-qualified),
+ A table reference can be a table name (possibly schema-qualified),
or a derived table such as a subquery, a table join, or complex
combinations of these. If more than one table reference is listed
in the FROM> clause they are cross-joined (see below)
- to form the intermediate virtual table that may then be subject to
+ to form the intermediate virtual table that can then be subject to
transformations by the WHERE>, GROUP BY>,
and HAVING> clauses and is finally the result of the
overall table expression.
Joins of all types can be chained together or nested: either or
both of T1 and
- T2 may be joined tables. Parentheses
- may be used around JOIN> clauses to control the join
+ T2 can be joined tables. Parentheses
+ can be used around JOIN> clauses to control the join
order. In the absence of parentheses, JOIN> clauses
nest left-to-right.
of either base data types (scalar types) or composite data types
(table rows). They are used like a table, view, or subquery in
the FROM> clause of a query. Columns returned by table
- functions may be included in SELECT>,
+ functions can be included in SELECT>,
JOIN>, or WHERE> clauses in the same manner
as a table, view, or subquery column.
- A table function may be aliased in the FROM> clause,
- but it also may be left unaliased. If a function is used in the
+ A table function can be aliased in the FROM> clause,
+ but it can also be left unaliased. If a function is used in the
FROM> clause with no alias, the function name is used
as the resulting table name.
After passing the WHERE> filter, the derived input
- table may be subject to grouping, using the GROUP BY>
+ table might be subject to grouping, using the GROUP BY>
clause, and elimination of group rows using the HAVING>
clause.
p.name , and p.price must be
in the GROUP BY> clause since they are referenced in
the query select list. (Depending on how exactly the products
- table is set up, name and price may be fully dependent on the
+ table is set up, name and price might be fully dependent on the
product ID, so the additional groupings could theoretically be
unnecessary, but this is not implemented yet.) The column
s.units> does not have to be in the GROUP
- After the select list has been processed, the result table may
+ After the select list has been processed, the result table can
optionally be subject to the elimination of duplicate rows. The
DISTINCT key word is written directly after
SELECT to specify this:
When more than one expression is specified,
the later values are used to sort rows that are equal according to the
- earlier values. Each expression may be followed by an optional
+ earlier values. Each expression can be followed by an optional
ASC> or DESC> keyword to set the sort direction to
ascending or descending. ASC> order is the default.
Ascending order puts smaller values first, where
When using LIMIT>, it is important to use an
ORDER BY> clause that constrains the result rows into a
unique order. Otherwise you will get an unpredictable subset of
- the query's rows. You may be asking for the tenth through
+ the query's rows. You might be asking for the tenth through
twentieth rows, but tenth through twentieth in what ordering? The
ordering is unknown, unless you specified ORDER BY>.
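The unpredictability can be seen even with plain Python sorting: a stable sort over a non-unique key preserves whatever input order it happens to receive, so "the first row" is not well defined (a toy illustration of the principle, not SQL semantics):

```python
rows = [("apple", 1), ("banana", 1), ("cherry", 2)]

# Sort by the non-unique second column only; ties keep input order.
first_pass = sorted(rows, key=lambda r: r[1])
second_pass = sorted(reversed(rows), key=lambda r: r[1])

# Both orderings satisfy the sort key, yet a LIMIT-1-style slice differs:
assert first_pass[0] == ("apple", 1)
assert second_pass[0] == ("banana", 1)
```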
The rows skipped by an OFFSET> clause still have to be
computed inside the server; therefore a large OFFSET>
- can be inefficient.
+ might be inefficient.
-
+
- White space (i.e., spaces, tabs, and newlines) may be used freely
+ White space (i.e., spaces, tabs, and newlines) can be used freely
in SQL commands. That means you can type the command aligned
differently than above, or even all on one line. Two dashes
(--
) introduce comments.
a type for storing single precision floating-point numbers.
date should be self-explanatory. (Yes, the column of
type date is also named date .
- This may be convenient or confusing — you choose.)
+ This might be convenient or confusing — you choose.)
You can update existing rows using the
UPDATE command.
Suppose you discover the temperature readings are
- all off by 2 degrees after November 28. You may correct the
+ all off by 2 degrees after November 28. You can correct the
data as follows:
-
+
Reference
length an authoritative, complete, and formal summary about their
respective subjects. More information about the use of
PostgreSQL , in narrative, tutorial, or
- example form, may be found in other parts of this book. See the
+ example form, can be found in other parts of this book. See the
cross-references listed on each reference page.
This part contains reference information for
PostgreSQL client applications and
utilities. Not all of these commands are of general utility, some
- may require special privileges. The common feature of these
+ might require special privileges. The common feature of these
applications is that they can be run on any host, independent of
where the database server resides.
-
+
Regression Tests
If you have configured
PostgreSQL to install
into a location where an older
PostgreSQL
installation already exists, and you perform gmake check>
- before installing the new version, you may find that the tests fail
+ before installing the new version, you might find that the tests fail
because the new programs try to use the already-installed shared
libraries. (Typical symptoms are complaints about undefined symbols.)
If you wish to run the tests before overwriting the old installation,
scripts, which means forty processes: there's a server process and a
psql> process for each test script.
So if your system enforces a per-user limit on the number of processes,
- make sure this limit is at least fifty or so, else you may get
+ make sure this limit is at least fifty or so, else you might get
random-seeming failures in the parallel test. If you are not in
a position to raise the limit, you can cut down the degree of parallelism
by setting the MAX_CONNECTIONS> parameter. For example,
generated on a reference system, so the results are sensitive to
small system differences. When a test is reported as
failed
, always examine the differences between
- expected and actual results; you may well find that the
+ expected and actual results; you might well find that the
differences are not significant. Nonetheless, we still strive to
maintain accurate reference files across all supported platforms,
so it can be expected that all tests pass.
Some of the regression tests involve intentional invalid input
values. Error messages can come from either the
PostgreSQL code or from the host
- platform system routines. In the latter case, the messages may
+ platform system routines. In the latter case, the messages can
vary between platforms, but should reflect similar
information. These differences in messages will result in a
failed
regression test that can be validated by
If you run the tests against an already-installed server that was
initialized with a collation-order locale other than C, then
- there may be differences due to sort order and follow-up
+ there might be differences due to sort order and follow-up
failures. The regression test suite is set up to handle this
problem by providing alternative result files that together are
known to handle a large number of locales.
-
+
+
Composite Types
To write a composite value as a literal constant, enclose the field
- values within parentheses and separate them by commas. You may put double
+ values within parentheses and separate them by commas. You can put double
quotes around any field value, and must do so if it contains commas or
parentheses. (More details appear below.) Thus, the general format of a
composite constant is the following:
- The ROW expression syntax may also be used to
+ The ROW expression syntax can also be used to
construct composite values. In most cases this is considerably
simpler to use than the string-literal syntax, since you don't have
to worry about multiple layers of quoting. We already used this
The decoration consists of parentheses ((> and )>)
around the whole value, plus commas (,>) between adjacent
items. Whitespace outside the parentheses is ignored, but within the
- parentheses it is considered part of the field value, and may or may not be
+ parentheses it is considered part of the field value, and might or might not be
significant depending on the input conversion rules for the field data type.
For example, in
- As shown previously, when writing a composite value you may write double
+ As shown previously, when writing a composite value you can write double
quotes around any individual field value.
You must> do so if the field value would otherwise
confuse the composite-value parser. In particular, fields containing
with a data type whose input routine also treated backslashes specially,
bytea> for example, we might need as many as eight backslashes
in the command to get one backslash into the stored composite field.)
- Dollar quoting (see ) may be
+ Dollar quoting (see ) can be
used to avoid the need to double backslashes.
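The quoting rules above can be illustrated with a small splitter (a simplified sketch: it handles parentheses, commas, and doubled quotes, but not backslash escapes or NULLs):

```python
def split_composite(text: str):
    """Split a composite literal like '(42,"hi, there")' into its raw
    field strings, honoring double-quoted fields and doubled quotes."""
    assert text.startswith("(") and text.endswith(")")
    fields, buf, in_quotes = [], "", False
    i = 1
    while i < len(text) - 1:
        ch = text[i]
        if ch == '"':
            if in_quotes and text[i + 1:i + 2] == '"':
                buf += '"'      # doubled quote inside a quoted field
                i += 1
            else:
                in_quotes = not in_quotes
        elif ch == "," and not in_quotes:
            fields.append(buf)
            buf = ""
        else:
            buf += ch
        i += 1
    fields.append(buf)
    return fields
```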
-
+
The Rule System
the originally given query will be executed, and its command
status will be returned as usual. (But note that if there were
any conditional INSTEAD> rules, the negation of their qualifications
- will have been added to the original query. This may reduce the
+ will have been added to the original query. This might reduce the
number of rows it processes, and if so the reported status will
be affected.)
-
+
Operating System Environment
This usually means just what it suggests: you tried to start
another server on the same port where one is already running.
However, if the kernel error message is not Address
- already in use or some variant of that, there may
+ already in use or some variant of that, there might
be a different problem. For example, trying to start a server
- on a reserved port number may draw something like:
+ on a reserved port number might draw something like:
$ postgres -p 666
LOG: could not bind IPv4 socket: Permission denied
can try starting the server with a smaller-than-normal number of
buffers (). You will eventually want
to reconfigure your kernel to increase the allowed shared memory
- size. You may also see this message when trying to start multiple
+ size. You might also see this message when trying to start multiple
servers on the same machine, if their total space requested
exceeds the kernel limit.
space. It means your kernel's limit on the number of
class="osname">System V> semaphores is smaller than the number
PostgreSQL wants to create. As above,
- you may be able to work around the problem by starting the
+ you might be able to work around the problem by starting the
server with a reduced number of allowed connections
(), but you'll eventually want to
increase the kernel limit.
connection request and rejected it. That case will produce a
different message, as shown in
linkend="client-authentication-problems">.) Other error messages
- such as Connection timed out may
+ such as Connection timed out might
indicate more fundamental problems, like lack of network
connectivity.
- Older distributions may not have the sysctl program,
+ Older distributions might not have the sysctl program,
but equivalent changes can be made by manipulating the
/proc file system:
In OS X 10.3.9 and later, instead of editing /etc/rc>
- you may create a file named /etc/sysctl.conf>,
+ you can create a file named /etc/sysctl.conf>,
containing variable assignments such as
kern.sysv.shmmax=4194304
sort of configuration commonly used for other databases such
- It may, however, be necessary to modify the global
+ It might, however, be necessary to modify the global
ulimit information in
/etc/security/limits , as the default hard
limits for file sizes (fsize ) and numbers of
- files (nofiles ) may be too low.
+ files (nofiles ) might be too low.
socially friendly
values that allow many users to
coexist on a machine without using an inappropriate fraction of
the system resources. If you run many servers on a machine this
- is perhaps what you want, but on dedicated servers you may want to
+ is perhaps what you want, but on dedicated servers you might want to
raise this limit.
In Linux 2.4 and later, the default virtual memory behavior is not
optimal for
PostgreSQL . Because of the
- way that the kernel implements memory overcommit, the kernel may
+ way that the kernel implements memory overcommit, the kernel might
terminate the
PostgreSQL server (the
master server process) if the memory demands of
another process cause the system to run out of virtual memory.
sysctl -w vm.overcommit_memory=2
or placing an equivalent entry in /etc/sysctl.conf>.
- You may also wish to modify the related setting
+ You might also wish to modify the related setting
vm.overcommit_ratio>. For details see the kernel documentation
file Documentation/vm/overcommit-accounting>.
It is best not to use SIGKILL to shut down
the server. Doing so will prevent the server from releasing
- shared memory and semaphores, which may then have to be done
+ shared memory and semaphores, which might then have to be done
manually before a new server can be started. Furthermore,
SIGKILL kills the postgres
process without letting it relay the signal to its subprocesses,
-
+
PostgreSQL Coding Conventions
func_signature_string(funcname, nargs,
actual_arg_types)),
errhint("Unable to choose a best candidate function. "
- "You may need to add explicit typecasts.")));
+ "You might need to add explicit typecasts.")));
This illustrates the use of format codes to embed run-time values into
a message text. Also, an optional hint> message is provided.
Rationale: keeping the primary message short helps keep it to the point,
and lets clients lay out screen space on the assumption that one line is
- enough for error messages. Detail and hint messages may be relegated to a
+ enough for error messages. Detail and hint messages can be relegated to a
verbose mode, or perhaps a pop-up error-details window. Also, details and
hints would normally be suppressed from the server log to save
space. Reference to implementation details is best avoided since users
Don't put any specific assumptions about formatting into the message
texts. Expect clients and the server log to wrap lines to fit their own
- needs. In long messages, newline characters (\n) may be used to indicate
+ needs. In long messages, newline characters (\n) can be used to indicate
suggested paragraph breaks. Don't end a message with a newline. Don't
use tabs or other formatting characters. (In error context displays,
newlines are automatically added to separate levels of context such as
messages are not grammatically complete sentences anyway. (And if they're
long enough to be more than one sentence, they should be split into
primary and detail parts.) However, detail and hint messages are longer
- and may need to include multiple sentences. For consistency, they should
+ and might need to include multiple sentences. For consistency, they should
follow complete-sentence style even when there's only one sentence.
The first one means that the attempt to open the file failed. The
message should give a reason, such as disk full
or
file doesn't exist
. The past tense is appropriate because
- next time the disk might not be full anymore or the file in question may
+ next time the disk might not be full anymore or the file in question might
exist.
-
+
Server Programming Interface
Note that if a command invoked via SPI fails, then control will not be
returned to your procedure. Rather, the
transaction or subtransaction in which your procedure executes will be
- rolled back. (This may seem surprising given that the SPI functions mostly
+ rolled back. (This might seem surprising given that the SPI functions mostly
have documented error-return conventions. Those conventions only apply
for errors detected within the SPI functions themselves, however.)
It is possible to recover control after an error by establishing your own
SPI_connect opens a connection from a
procedure invocation to the SPI manager. You must call this
function if you want to execute commands through SPI. Some utility
- SPI functions may be called from unconnected procedures.
+ SPI functions can be called from unconnected procedures.
- This function may only be called from a connected procedure.
+ This function can only be called from a connected procedure.
- You may pass multiple commands in one string.
+ You can pass multiple commands in one string.
SPI_execute returns the
result for the command executed last. The
count
limit applies to each command separately, but it is not applied to
SPI_OK_INSERT_RETURNING ,
SPI_OK_DELETE_RETURNING , or
SPI_OK_UPDATE_RETURNING ,
- then you may use the
+ then you can use the
global pointer SPITupleTable *SPI_tuptable to
access the result rows. Some utility commands (such as
EXPLAIN>) also return row sets, and SPI_tuptable>
vals> is an array of pointers to rows. (The number
of valid entries is given by SPI_processed .)
- tupdesc> is a row descriptor which you may pass to
+ tupdesc> is a row descriptor which you can pass to
SPI functions dealing with rows. tuptabcxt>,
alloced>, and free> are internal
fields not intended for use by SPI callers.
When the same or a similar command is to be executed repeatedly, it
- may be advantageous to perform the planning only once.
+ might be advantageous to perform the planning only once.
SPI_prepare converts a command string into an
execution plan that can be executed repeatedly using
SPI_execute_plan .
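The prepare-once, execute-many pattern might be sketched as below. This is a non-runnable fragment for a C function that has already called SPI_connect; the query and table name are illustrative, while the SPI calls themselves follow the documented interfaces of this era:

```c
/* Illustrative sketch, not a complete loadable function. */
Oid    argtypes[1] = { INT4OID };
Datum  values[1];
void  *plan;

/* Plan once ... */
plan = SPI_prepare("SELECT * FROM mytable WHERE id = $1", 1, argtypes);
if (plan == NULL)
    elog(ERROR, "SPI_prepare failed");

/* ... then execute repeatedly with different parameter values. */
values[0] = Int32GetDatum(42);
if (SPI_execute_plan(plan, values, NULL, true, 0) == SPI_OK_SELECT)
{
    uint32 i;

    for (i = 0; i < SPI_processed; i++)
    {
        char *val = SPI_getvalue(SPI_tuptable->vals[i],
                                 SPI_tuptable->tupdesc, 1);
        elog(NOTICE, "first column: %s", val ? val : "NULL");
    }
}
```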
There is a disadvantage to using parameters: since the planner does
not know the values that will be supplied for the parameters, it
- may make worse planning choices than it would make for a normal
+ might make worse planning choices than it would make for a normal
command with all constants visible.
- All functions described in this section may be used by both
+ All functions described in this section can be used by both
connected and unconnected procedures.
- All functions described in this section may be used by both
+ All functions described in this section can be used by both
connected and unconnected procedures. In an unconnected procedure,
they act the same as the underlying ordinary server functions
(palloc>, etc.).
-
+
SQL
- The tables PART and SUPPLIER may be regarded as
+ The tables PART and SUPPLIER can be regarded as
entities and
- SELLS may be regarded as a relationship
+ SELLS can be regarded as a relationship
between a particular
part and a particular supplier.
- Arithmetic operations may be used in the target list and in the WHERE
+ Arithmetic operations can be used in the target list and in the WHERE
clause. For example if we want to know how much it would cost if we
take two pieces of a part we could use the following query:
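The query shown at this point was elided; one in the same spirit, assuming the PART table of this chapter with columns pname and price, might be:

```sql
SELECT pname, price * 2 AS double_price
FROM part
WHERE price * 2 < 50;
```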
A joined table, created using JOIN syntax, is a table reference list
item that occurs in a FROM clause and before any WHERE, GROUP BY,
or HAVING clause. Other table references, including table names or
- other JOIN clauses, may be included in the FROM clause if separated
+ other JOIN clauses, can be included in the FROM clause if separated
by commas. JOINed tables are logically like any other
table listed in the FROM clause.
JOINs of all types can be chained together or nested where either or both of
T1 and
- T2 may be JOINed tables.
+ T2 can be JOINed tables.
Parentheses can be used around JOIN clauses to control the order
of JOINs, which are otherwise processed left to right.
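The chaining and nesting described above might look like this; the table and column names are illustrative:

```sql
-- Chained: processed left to right
SELECT *
FROM a JOIN b ON a.id = b.a_id
       JOIN c ON b.id = c.b_id;

-- Nested: parentheses force b JOIN c to be formed first
SELECT *
FROM a JOIN (b JOIN c ON b.id = c.b_id) ON a.id = b.a_id;
```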
Note that for a query using GROUP BY and aggregate
functions to make sense, the target list can only refer directly to
- the attributes being grouped by. Other attributes may only be used
+ the attributes being grouped by. Other attributes can only be used
inside the arguments of aggregate functions. Otherwise there would
not be a unique value to associate with the other attributes.
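Using the SELLS table of this chapter, the rule can be illustrated as follows:

```sql
-- OK: sno is grouped, pno appears only inside an aggregate
SELECT sno, COUNT(pno) FROM sells GROUP BY sno;

-- Error: pno is not grouped, so there is no unique value per group
-- SELECT sno, pno FROM sells GROUP BY sno;
```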
Create View
- A view may be regarded as a virtual table ,
+ A view can be regarded as a virtual table ,
i.e. a table that
does not physically exist in the database
but looks to the user
-
+
Getting Started
If your site administrator has not set things up in the default
- way, you may have some more work to do. For example, if the
+ way, you might have some more work to do. For example, if the
database server machine is a remote machine, you will need to set
the PGHOST environment variable to the name of the
database server machine. The environment variable
- PGPORT may also have to be set. The bottom line is
+ PGPORT might also have to be set. The bottom line is
this: if you try to start an application program and it complains
that it cannot connect to the database, you should consult your
site administrator or, if that is you, the documentation to make
- More about createdb and dropdb may
+ More about createdb and dropdb can
be found in and
respectively.
-
+
when compiling the server). In a table, all the pages are logically
equivalent, so a particular item (row) can be stored in any page. In
indexes, the first page is generally reserved as a metapage>
-holding control information, and there may be different types of pages
+holding control information, and there can be different types of pages
within the index, depending on the index access method.
- All the details may be found in
+ All the details can be found in
src/include/storage/bufpage.h .
The number of item identifiers present can be determined by looking at
pd_lower>, which is increased to allocate a new identifier.
Because an item
- identifier is never moved until it is freed, its index may be used on a
+ identifier is never moved until it is freed, its index can be used on a
long-term basis to reference an item, even when the item itself is moved
around on the page to compact free space. In fact, every pointer to an
item (ItemPointer , also known as
- The final section is the special section which may
- contain anything the access method wishes to store. For example,
+ The final section is the special section which can
+ contain anything the access method wishes to store. For example,
b-tree indexes store links to the page's left and right siblings,
as well as some other data relevant to the index structure.
Ordinary tables do not use a special section at all (indicated by setting
- All the details may be found in
+ All the details can be found in
src/include/access/htup.h .
variable length field (attlen = -1) then it's a bit more complicated.
All variable-length datatypes share the common header structure
varattrib , which includes the total length of the stored
- value and some flag bits. Depending on the flags, the data may be either
+ value and some flag bits. Depending on the flags, the data can be either
inline or in a
TOAST> table;
it might be compressed, too (see ).
-
+
SQL Syntax
key word can be letters, underscores, digits
(0 -9 ), or dollar signs
($>). Note that dollar signs are not allowed in identifiers
- according to the letter of the SQL standard, so their use may render
+ according to the letter of the SQL standard, so their use might render
applications less portable.
The SQL standard will not define a key word that contains
digits or starts or ends with an underscore, so identifiers of this
digits (0 through 9). At least one digit must be before or after the
decimal point, if one is used. At least one digit must follow the
exponent marker (e ), if one is present.
- There may not be any spaces or other characters embedded in the
+ There cannot be any spaces or other characters embedded in the
constant. Note that any leading plus or minus sign is not actually
considered part of the constant; it is an operator applied to the
constant.
The string constant's text is passed to the input conversion
routine for the type called type . The
result is a constant of the indicated type. The explicit type
- cast may be omitted if there is no ambiguity as to the type the
+ cast can be omitted if there is no ambiguity as to the type the
constant must be (for example, when it is assigned directly to a
table column), in which case it is automatically coerced.
typename ( 'string ' )
- but not all type names may be used in this way; see
+ but not all type names can be used in this way; see
linkend="sql-syntax-type-casts"> for details.
A dollar sign ($ ) followed by digits is used
to represent a positional parameter in the body of a function
definition or a prepared statement. In other contexts the
- dollar sign may be part of an identifier or a dollar-quoted string
+ dollar sign can be part of an identifier or a dollar-quoted string
constant.
where the comment begins with /* and extends to
the matching occurrence of */ . These block
comments nest, as specified in the SQL standard but unlike C, so that one can
- comment out larger blocks of code that may contain existing block
+ comment out larger blocks of code that might contain existing block
comments.
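The nesting behavior might be illustrated like this:

```sql
/* outer comment
   /* inner comment: nests, as in the SQL standard but unlike C */
   still inside the outer comment */
SELECT 1;
```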
associativity of the operators in
PostgreSQL>.
Most operators have the same precedence and are left-associative.
The precedence and associativity of the operators is hard-wired
- into the parser. This may lead to non-intuitive behavior; for
+ into the parser. This can lead to non-intuitive behavior; for
example the Boolean operators <> and
>> have a different precedence than the Boolean
operators <=> and >=>. Also, you will
the key words NEW or OLD .
(NEW and OLD can only appear in rewrite rules,
while other correlation names can be used in any SQL statement.)
- The correlation name and separating dot may be omitted if the column name
+ The correlation name and separating dot can be omitted if the column name
is unique across all the tables being used in the current query. (See also .)
In general the array expression must be
- parenthesized, but the parentheses may be omitted when the expression
+ parenthesized, but the parentheses can be omitted when the expression
to be subscripted is just a column reference or positional parameter.
Also, multiple subscripts can be concatenated when the original array
is multidimensional.
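Using the sal_emp example table from the arrays chapter (columns pay_by_quarter and schedule), the subscripting rules might look like:

```sql
-- Column reference: no parentheses needed
SELECT pay_by_quarter[3] FROM sal_emp;

-- General expression: parenthesize before subscripting
SELECT (array_prepend(10000, pay_by_quarter))[1] FROM sal_emp;

-- Multidimensional array: concatenated subscripts
SELECT schedule[1][2] FROM sal_emp;
```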
In general the row expression must be
- parenthesized, but the parentheses may be omitted when the expression
+ parenthesized, but the parentheses can be omitted when the expression
to be selected from is just a table reference or positional parameter.
For example,
The list of built-in functions is in .
- Other functions may be added by the user.
+ Other functions can be added by the user.
The predefined aggregate functions are described in
- linkend="functions-aggregate">. Other aggregate functions may be added
+ linkend="functions-aggregate">. Other aggregate functions can be added
by the user.
- An aggregate expression may only appear in the result list or
+ An aggregate expression can only appear in the result list or
HAVING> clause of a SELECT> command.
It is forbidden in other clauses, such as WHERE>,
because those clauses are logically evaluated before the results
- An explicit type cast may usually be omitted if there is no ambiguity as
+ An explicit type cast can usually be omitted if there is no ambiguity as
to the type that a value expression must produce (for example, when it is
assigned to a table column); the system will automatically apply a
type cast in such cases. However, automatic casting is only done for
Multidimensional array values can be built by nesting array
constructors.
- In the inner constructors, the key word ARRAY may
+ In the inner constructors, the key word ARRAY can
be omitted. For example, these produce the same result:
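The pair of equivalent constructors shown here was elided; in the spirit of the original:

```sql
SELECT ARRAY[ARRAY[1,2], ARRAY[3,4]];
SELECT ARRAY[[1,2],[3,4]];
-- both produce the two-dimensional array {{1,2},{3,4}}
```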
By default, the value created by a ROW> expression is of
an anonymous record type. If necessary, it can be cast to a named
composite type — either the row type of a table, or a composite type
- created with CREATE TYPE AS>. An explicit cast may be needed
+ created with CREATE TYPE AS>. An explicit cast might be needed
to avoid ambiguity. For example:
CREATE TABLE mytable(f1 int, f2 float, f3 text);
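Given that table definition, a ROW> value can be cast to the table's row type like this:

```sql
SELECT ROW(1, 2.5, 'this is a test')::mytable;
-- equivalently:
SELECT CAST(ROW(1, 2.5, 'this is a test') AS mytable);
```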
rely on side effects or evaluation order in WHERE> and HAVING> clauses,
since those clauses are extensively reprocessed as part of
developing an execution plan. Boolean
- expressions (AND>/OR>/NOT> combinations) in those clauses may be reorganized
+ expressions (AND>/OR>/NOT> combinations) in those clauses can be reorganized
in any manner allowed by the laws of Boolean algebra.
When it is essential to force evaluation order, a CASE>
- construct (see ) may be
+ construct (see ) can be
used. For example, this is an untrustworthy way of trying to
avoid division by zero in a WHERE> clause:
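The elided example might be reconstructed as follows (x and y stand for arbitrary columns):

```sql
-- Untrustworthy: AND operands can be reorganized, so y/x might be
-- evaluated for rows where x = 0
SELECT ... WHERE x > 0 AND y/x > 1.5;

-- Forcing evaluation order with CASE avoids the division for x = 0
SELECT ... WHERE CASE WHEN x > 0 THEN y/x > 1.5 ELSE false END;
```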
-
+
Triggers
The return value is ignored for row-level triggers fired after an
- operation, and so they may as well return NULL>.
+ operation, and so they might as well return NULL>.
If a trigger function executes SQL commands then these
- commands may fire triggers again. This is known as cascading
+ commands might fire triggers again. This is known as cascading
triggers. There is no direct limitation on the number of cascade
levels. It is possible for cascades to cause a recursive invocation
of the same trigger; for example, an INSERT
changes for rows previously processed in the same outer
command. This requires caution, since the ordering of these
change events is not in general predictable; a SQL command that
- affects multiple rows may visit the rows in any order.
+ affects multiple rows can visit the rows in any order.
tg_event>
- Describes the event for which the function is called. You may use the
+ Describes the event for which the function is called. You can use the
following macros to examine tg_event :
Here is a very simple example of a trigger function written in C.
- (Examples of triggers written in procedural languages may be found
+ (Examples of triggers written in procedural languages can be found
in the documentation of the procedural languages.)
-
+
Type Conversion
SELECT ~ '20' AS "negation";
ERROR: operator is not unique: ~ "unknown"
-HINT: Could not choose a best candidate operator. You may need to add explicit
+HINT: Could not choose a best candidate operator. You might need to add explicit
type casts.
This happens because the system can't decide which of the several
Since numeric constants with decimal points are initially assigned the
type numeric , the following query will require no type
-conversion and may therefore be slightly more efficient:
+conversion and might therefore be slightly more efficient:
SELECT round(4.0, 4);
-
+
Database Roles and Privileges
- The set of database roles a given client connection may connect as
+ The set of database roles a given client connection can connect as
is determined by the client authentication setup, as explained in
. (Thus, a client is not
necessarily limited to connect as the role with the same name as
Role Attributes
- A database role may have a number of attributes that define its
+ A database role can have a number of attributes that define its
privileges and interact with the client authentication system.
Functions and triggers allow users to insert code into the backend
- server that other users may execute unintentionally. Hence, both
+ server that other users might execute unintentionally. Hence, both
mechanisms permit users to Trojan horse
others with relative ease. The only real protection is tight
control over who can define functions.
-
+
Reliability and the Write-Ahead Log
- Next, there may be a cache in the disk drive controller; this is
+ Next, there might be a cache in the disk drive controller; this is
particularly common on
RAID> controller cards. Some of
these caches are write-through>, meaning writes are passed
along to the drive as soon as they arrive. Others are
Write-Ahead Logging (
WAL )
is a standard approach to transaction logging. Its detailed
- description may be found in most (if not all) books about
+ description can be found in most (if not all) books about
transaction processing. Briefly,
WAL 's central
concept is that changes to data files (where tables and indexes
reside) must be written only after those changes have been logged,
file needs to be flushed to disk at the time of transaction
commit, rather than every data file changed by the transaction.
In multiuser environments, commits of many transactions
- may be accomplished with a single fsync of
+ can be accomplished with a single fsync of
the log file. Furthermore, the log file is written sequentially,
and so the cost of syncing the log is much less than the cost of
flushing the data pages. This is especially true for servers
checkpoint_segments . Occasional appearance of such
a message is not cause for alarm, but if it appears often then the
checkpoint control parameters should be increased. Bulk operations such
- as large COPY> transfers may cause a number of such warnings
+ as large COPY> transfers might cause a number of such warnings
to appear if you have not set checkpoint_segments> high
enough.
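A configuration fragment in this spirit (the values are illustrative, not recommendations):

```
# postgresql.conf
checkpoint_segments = 16    # default is 3; raise if checkpoint warnings appear
checkpoint_warning = 30s    # warn when checkpoints occur closer together than this
```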
is used on every database low level modification (for example, row
insertion) at a time when an exclusive lock is held on affected
data pages, so the operation needs to be as fast as possible. What
- is worse, writing WAL buffers may also force the
+ is worse, writing WAL buffers might also force the
creation of a new log segment, which takes even more
time. Normally,
WAL buffers should be written
and flushed by a LogFlush request, which is
made, for the most part, at transaction commit time to ensure that
transaction records are flushed to permanent storage. On systems
- with high log output, LogFlush requests may
+ with high log output, LogFlush requests might
not occur often enough to prevent LogInsert
from having to do writes. On such systems
one should increase the number of
WAL buffers by
compiled with support for it) will result in each
LogInsert and LogFlush
WAL call being logged to the server log. This
- option may be replaced by a more general mechanism in the future.
+ option might be replaced by a more general mechanism in the future.
It is of advantage if the log is located on another disk than the
- main database files. This may be achieved by moving the directory
+ main database files. This can be achieved by moving the directory
pg_xlog to another location (while the server
is shut down, of course) and creating a symbolic link from the
original location in the main data directory to the new location.
The aim of
WAL , to ensure that the log is
- written before database records are altered, may be subverted by
+ written before database records are altered, can be subverted by
disk drives
disk drive>> that falsely report a
successful write to the kernel,
when in fact they have only cached the data and not yet stored it
- on the disk. A power failure in such a situation may still lead to
+ on the disk. A power failure in such a situation might still lead to
irrecoverable data corruption. Administrators should try to ensure
that disks holding
PostgreSQL 's
WAL log files do not make such false reports.
-
+
User-Defined Aggregates
Thus, in addition to the argument and result data types seen by a user
of the aggregate, there is an internal state-value data type that
- may be different from both the argument and result types.
+ might be different from both the argument and result types.
- Aggregate functions may use polymorphic
+ Aggregate functions can use polymorphic
state transition functions or final functions, so that the same functions
can be used to implement multiple aggregates.
See
for an explanation of polymorphic functions.
- Going a step further, the aggregate function itself may be specified
+ Going a step further, the aggregate function itself can be specified
with polymorphic input type(s) and state type, allowing a single
aggregate definition to serve for multiple input data types.
Here is an example of a polymorphic aggregate:
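The example itself was elided here; the polymorphic aggregate from the documentation of this era collects any element type into an array:

```sql
CREATE AGGREGATE array_accum (anyelement)
(
    sfunc = array_append,
    stype = anyarray,
    initcond = '{}'
);
```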
-
+
User-Defined Functions
of function can take base types, composite types, or
combinations of these as arguments (parameters). In addition,
every kind of function can return a base type or
- a composite type. Functions may also be defined to return
+ a composite type. Functions can also be defined to return
sets of base or composite values.
SETOF>function>> Alternatively,
- an SQL function may be declared to return a set, by specifying the
+ an SQL function can be declared to return a set, by specifying the
function's return type as SETOF
sometype>. In this case all rows of the
last query's result are returned. Further details appear below.
body using the syntax $n>>: $1>
refers to the first argument, $2> to the second, and so on.
If an argument is of a composite type, then the dot notation,
- e.g., $1.name , may be used to access attributes
+ e.g., $1.name , can be used to access attributes
of the argument. The arguments can only be used as data values,
not as identifiers. Thus for example this is reasonable:
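The elided example contrasts a data-value use of an argument with an identifier use:

```sql
-- Reasonable: the argument supplies a data value
INSERT INTO mytable VALUES ($1);

-- Not allowed: an argument cannot supply an identifier
-- INSERT INTO $1 VALUES (42);
```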
SQL Functions as Table Sources
- All SQL functions may be used in the FROM> clause of a query,
+ All SQL functions can be used in the FROM> clause of a query,
but it is particularly useful for functions returning composite types.
If the function is defined to return a base type, the table function
produces a one-column table. If the function is defined to return
- Currently, functions returning sets may also be called in the select list
+ Currently, functions returning sets can also be called in the select list
of a query. For each row that the query
generates by itself, the function returning set is invoked, and an output
row is generated for each element of the function's result set. Note,
- however, that this capability is deprecated and may be removed in future
+ however, that this capability is deprecated and might be removed in future
releases. The following is an example function returning a set from the
select list:
Polymorphic SQL Functions
- SQL functions may be declared to accept and
+ SQL functions can be declared to accept and
return the polymorphic types anyelement and
anyarray . See
linkend="extend-types-polymorphic"> for a more detailed
- More than one function may be defined with the same SQL name, so long
+ More than one function can be defined with the same SQL name, so long
as the arguments they take are different. In other words,
function names can be overloaded . When a
query is executed, the server will determine which function to
a lot whether a function is executed once during planning or once during
query execution startup. But there is a big difference if the plan is
saved and reused later. Labeling a function IMMUTABLE> when
- it really isn't may allow it to be prematurely folded to a constant during
+ it really isn't might allow it to be prematurely folded to a constant during
planning, resulting in a stale value being re-used during subsequent uses
of the plan. This is a hazard when using prepared statements or when
using function languages that cache plans (such as
- On the other hand, fixed-length types of any size may
+ On the other hand, fixed-length types of any size can
be passed by-reference. For example, here is a sample
implementation of a
PostgreSQL type:
Never> modify the contents of a pass-by-reference input
value. If you do so you are likely to corrupt on-disk data, since
- the pointer you are given may well point directly into a disk buffer.
+ the pointer you are given might well point directly into a disk buffer.
The sole exception to this rule is explained in
.
that uses a built-in type of
PostgreSQL>.
The Defined In
column gives the header file that
needs to be included to get the type definition. (The actual
- definition may be in a different file that is included by the
+ definition might be in a different file that is included by the
listed file. It is recommended that users stick to the defined
interface.) Note that you should always include
postgres.h first in any source file, because
(Better style would be to use just 'funcs'> in the
AS> clause, after having added
DIRECTORY to the search path. In any
- case, we may omit the system-specific extension for a shared
+ case, we can omit the system-specific extension for a shared
library, commonly .so or
.sl .)
- At first glance, the version-1 coding conventions may appear to
+ At first glance, the version-1 coding conventions might appear to
be just pointless obscurantism. They do, however, offer a number
of improvements, because the macros can hide unnecessary detail.
An example is that in coding add_one_float8>, we no longer need to
Before we turn to the more advanced topics, we should discuss
some coding rules for
PostgreSQL
- C-language functions. While it may be possible to load functions
+ C-language functions. While it might be possible to load functions
written in languages other than C into
PostgreSQL , this is usually difficult
(when it is possible at all) because other languages, such as
memset . Without this, it's difficult to
support hash indexes or hash joins, as you must pick out only
the significant bits of your data structure to compute a hash.
- Even if you initialize all fields of your structure, there may be
- alignment padding (holes in the structure) that may contain
+ Even if you initialize all fields of your structure, there might be
+ alignment padding (holes in the structure) that might contain
garbage values.
Composite types do not have a fixed layout like C structures.
- Instances of a composite type may contain null fields. In
+ Instances of a composite type can contain null fields. In
addition, composite types that are part of an inheritance
- hierarchy may have different fields than other members of the
+ hierarchy can have different fields than other members of the
same inheritance hierarchy. Therefore,
PostgreSQL provides a function
interface for accessing fields of composite types from C.
Polymorphic Arguments and Return Types
- C-language functions may be declared to accept and
+ C-language functions can be declared to accept and
return the polymorphic types
anyelement and anyarray .
See for a more detailed explanation
Shared Memory and LWLocks
- Add-ins may reserve LWLocks and an allocation of shared memory on server
+ Add-ins can reserve LWLocks and an allocation of shared memory on server
startup. The add-in's shared library must be preloaded by specifying
it in
shared-preload-libraries>>.
-
+
Interfacing Extensions To Indexes
called because one thing they specify is the set of
WHERE>-clause operators that can be used with an index
(i.e., can be converted into an index-scan qualification). An
- operator class may also specify some support
+ operator class can also specify some support
procedures> that are needed by the internal operations of the
index method, but do not directly correspond to any
WHERE>-clause operator that can be used with the index.
To handle these needs,
PostgreSQL
uses the concept of an operator
- An operator family contains one or more operator classes, and may also
+ An operator family contains one or more operator classes, and can also
contain indexable operators and corresponding support functions that
belong to the family as a whole but not to any single class within the
family. We say that such operators and functions are loose>
Consider again the situation where we are storing in the index only
the bounding box of a complex object such as a polygon. In this
case there's not much value in storing the whole polygon in the index
- entry — we may as well store just a simpler object of type
+ entry — we might as well store just a simpler object of type
box>. This situation is expressed by the STORAGE>
option in CREATE OPERATOR CLASS>: we'd write something like
-
+
User-Defined Operators
This assists the optimizer by
giving it some idea of how many rows will be eliminated by WHERE>
clauses that have this form. (What happens if the constant is on
- the left, you may be wondering? Well, that's one of the things that
+ the left, you might be wondering? Well, that's one of the things that
COMMUTATOR> is for...)
There are additional selectivity estimation functions designed for geometric
operators in src/backend/utils/adt/geo_selfuncs.c : areasel , positionsel ,
- and contsel . At this writing these are just stubs, but you may want
+ and contsel . At this writing these are just stubs, but you might want
to use them (or even better, improve them) anyway.
Care should be exercised when preparing a hash function, because there
are machine-dependent ways in which it might fail to do the right thing.
- For example, if your data type is a structure in which there may be
+ For example, if your data type is a structure in which there might be
uninteresting pad bits, you can't simply pass the whole structure to
hash_any>. (Unless you write your other operators and
functions to ensure that the unused bits are always zero, which is the
strict, the
function must also be complete: that is, it should return true or
false, never null, for any two nonnull inputs. If this rule is
- not followed, hash-optimization of IN> operations may
+ not followed, hash-optimization of IN> operations might
generate wrong results. (Specifically, IN> might return
false where the correct answer according to the standard would be null;
or it might yield an error complaining that it wasn't prepared for a
A merge-joinable operator must have a commutator (itself if the two
operand data types are the same, or a related equality operator
if they are different) that appears in the same operator family.
- If this is not the case, planner errors may occur when the operator
+ If this is not the case, planner errors might occur when the operator
is used. Also, it is a good idea (but not strictly required) for
a btree operator family that supports multiple datatypes to provide
equality operators for every combination of the datatypes; this
-
+
Procedural Languages
only necessary to execute CREATE LANGUAGE>
language_name> to install the language into the
current database. Alternatively, the program
- linkend="app-createlang"> may be used to do this from the shell
+ linkend="app-createlang"> can be used to do this from the shell
command line. For example, to install the language
PL/pgSQL into the database
template1>, use
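The elided command can be given either from SQL or from the shell:

```sql
-- Run while connected to the template1 database:
CREATE LANGUAGE plpgsql;
-- Or equivalently from the shell: createlang plpgsql template1
```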
- Optionally, the language handler may provide a validator>
+ Optionally, the language handler can provide a validator>
function that checks a function definition for correctness without
actually executing it. The validator function is called by
CREATE FUNCTION> if it exists. If a validator function
*
*
* IDENTIFICATION
- * $PostgreSQL: pgsql/src/backend/parser/parse_func.c,v 1.191 2007/01/05 22:19:34 momjian Exp $
+ * $PostgreSQL: pgsql/src/backend/parser/parse_func.c,v 1.192 2007/01/31 20:56:20 momjian Exp $
*
*-------------------------------------------------------------------------
*/
func_signature_string(funcname, nargs,
actual_arg_types)),
errhint("Could not choose a best candidate function. "
- "You may need to add explicit type casts."),
+ "You might need to add explicit type casts."),
parser_errposition(pstate, location)));
else
ereport(ERROR,
func_signature_string(funcname, nargs,
actual_arg_types)),
errhint("No function matches the given name and argument types. "
- "You may need to add explicit type casts."),
+ "You might need to add explicit type casts."),
parser_errposition(pstate, location)));
}
*
*
* IDENTIFICATION
- * $PostgreSQL: pgsql/src/backend/parser/parse_oper.c,v 1.91 2007/01/05 22:19:34 momjian Exp $
+ * $PostgreSQL: pgsql/src/backend/parser/parse_oper.c,v 1.92 2007/01/31 20:56:20 momjian Exp $
*
*-------------------------------------------------------------------------
*/
errmsg("operator is not unique: %s",
op_signature_string(op, oprkind, arg1, arg2)),
errhint("Could not choose a best candidate operator. "
- "You may need to add explicit type casts."),
+ "You might need to add explicit type casts."),
parser_errposition(pstate, location)));
else
ereport(ERROR,
errmsg("operator does not exist: %s",
op_signature_string(op, oprkind, arg1, arg2)),
errhint("No operator matches the given name and argument type(s). "
- "You may need to add explicit type casts."),
+ "You might need to add explicit type casts."),
parser_errposition(pstate, location)));
}