- http://www.postgresql.org/docs/current/static/install-upgrading.html
- for specific instructions.
-
-
3.7) What computer hardware should I use?
-
-
Because PC hardware is mostly compatible, people tend to believe that
- all PC hardware is of equal quality. It is not. ECC RAM, SCSI, and
- quality motherboards are more reliable and have better performance than
- less expensive hardware. PostgreSQL will run on almost any hardware,
- but if reliability and performance are important it is wise to
- research your hardware options thoroughly. A disk controller with a
- battery-backed cache is also useful. Our email lists can be used
- to discuss hardware options and tradeoffs.
-
-
-
-
Operational Questions
-
-
4.1) How do I SELECT only the
- first few rows of a query? A random row?
-
-
To retrieve only a few rows, if you know the number of rows
- needed at the time of the SELECT, use
- LIMIT. If an index matches the ORDER
- BY, it is possible the entire query does not have to be
- executed. If you don't know the number of rows at
- SELECT time, use a cursor and
- FETCH.
-
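As a minimal sketch of both approaches, assuming the placeholder table tab and column col used elsewhere in this FAQ (the cursor name cur and the row counts are illustrative only):

```sql
-- Known row count: let LIMIT cap the result.
SELECT col
FROM tab
ORDER BY col
LIMIT 10;

-- Unknown row count: open a cursor and fetch in batches.
-- A cursor declared this way lives only inside a transaction block.
BEGIN;
DECLARE cur CURSOR FOR
    SELECT col FROM tab ORDER BY col;
FETCH 5 FROM cur;   -- first batch
FETCH 5 FROM cur;   -- next batch; repeat until no rows come back
CLOSE cur;
COMMIT;
```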
-
To SELECT a random row, use:
- SELECT col
- FROM tab
- ORDER BY random()
- LIMIT 1;
-
-
-
4.2) How do I find out what tables, indexes,
- databases, and users are defined? How do I see the queries used
- by psql to display them?
-
-
Use the \dt command to see tables in psql. For a complete list of
- commands inside psql you can use \?. Alternatively, you can read the source
- code for psql in the file pgsql/src/bin/psql/describe.c, which
- contains the SQL commands that generate the output for
- psql's backslash commands. You can also start psql with the
- -E option so it will print out the queries it uses to execute the
- commands you give. PostgreSQL also provides an SQL-compliant
- INFORMATION SCHEMA interface you can query to get information about the
- database.
-
-
There are also system tables beginning with pg_ that describe
- these too.
-
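The queries below are one possible starting point for both interfaces, not the exact statements psql issues (use the -E option to see those). The schema filter in the first query is an assumption about what you want to hide:

```sql
-- Tables visible through the INFORMATION SCHEMA, skipping system schemas.
SELECT table_schema, table_name
FROM information_schema.tables
WHERE table_schema NOT IN ('pg_catalog', 'information_schema');

-- The same kinds of objects through the pg_ system catalogs and views.
SELECT schemaname, tablename, indexname FROM pg_indexes;  -- indexes
SELECT datname FROM pg_database;                          -- databases
SELECT usename FROM pg_user;                              -- users
```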
-
Use psql -l to list all databases.
-
-
Also try the file pgsql/src/tutorial/syscat.source. It
- illustrates many of the SELECTs needed to get
- information from the database system tables.
-
-
4.3) How do you change a column's data type?
-
-
Changing the data type of a column can be done easily in 8.0
- and later with ALTER TABLE ALTER COLUMN TYPE.
-
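A short sketch of the 8.0-and-later form, assuming a hypothetical table tab with an integer column qty and a text column code that holds only digits:

```sql
-- A cast from the old type to the new one exists: no extra clause needed.
ALTER TABLE tab ALTER COLUMN qty TYPE bigint;

-- No automatic cast (text to integer): spell out the conversion with USING.
ALTER TABLE tab ALTER COLUMN code TYPE integer USING code::integer;
```

The USING expression is evaluated for every existing row, so the statement fails if any stored value cannot be converted.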
-
In earlier releases, do this:
- BEGIN;
- ALTER TABLE tab ADD COLUMN new_col new_data_type;
- UPDATE tab SET new_col = CAST(old_col AS new_data_type);
- ALTER TABLE tab DROP COLUMN old_col;
- COMMIT;
-
-
You might then want to do VACUUM FULL tab to reclaim the
- disk space used by the expired rows.
-
-
4.4) What is the maximum size for a row, a
- table, and a database?
-
-
-
Maximum size for a database? | unlimited (32 TB databases exist) |
-
Maximum size for a table? | 32 TB |
-
Maximum size for a row? | 400 GB |
-
Maximum size for a field? | 1 GB |
-
Maximum number of rows in a table? | unlimited |
-
Maximum number of columns in a table? | 250-1600 depending on column types |
-
Maximum number of indexes on a table? | unlimited |
-
-
-
-
Of course, these are not actually unlimited, but limited by
- available disk space and memory/swap space. Performance may suffer
- when these values get unusually large.
-
-
The maximum table size of 32 TB does not require large file
- support from the operating system. Large tables are stored as
- multiple 1 GB files so file system size limits are not
- important.
-
-
The maximum table size, row size, and maximum number of columns
- can be quadrupled by increasing the default block size to 32k. The
- maximum table size can also be increased using table partitioning.
-
-
One limitation is that indexes can not be created on columns
- longer than about 2,000 characters. Fortunately, such indexes are
- rarely needed. Uniqueness is best guaranteed by a function index
- of an MD5 hash of the long column, and full text indexing
- allows for searching of words within the column.
-
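Both workarounds might look like the following sketch. The table tab, the column long_col, the index names, and the 'english' text search configuration are assumptions for illustration, and the full text part requires a release with built-in full text search:

```sql
-- Enforce uniqueness of a long text column through its MD5 hash.
CREATE UNIQUE INDEX tab_long_col_md5_idx ON tab (md5(long_col));

-- Index the words in the same column for full text search.
CREATE INDEX tab_long_col_fts_idx ON tab
    USING gin (to_tsvector('english', long_col));

-- A word search written so that it can use the full text index.
SELECT *
FROM tab
WHERE to_tsvector('english', long_col) @@ to_tsquery('english', 'postgres');
```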
-
4.5) How much database disk space is required
- to store data from a typical text file?
-
-
A PostgreSQL database may require up to five times the disk
- space to store data from a text file.
-
-
As an example, consider a file of 100,000 lines with an integer
- and text description on each line. Suppose the text string
- averages twenty bytes in length. The flat file would be 2.8 MB.
- The size of the PostgreSQL database file containing this data can
- be estimated as 5.2 MB:
- 24 bytes: each row header (approximate)
- 24 bytes: one int field and one text field
- + 4 bytes: pointer on page to tuple
- ----------------------------------------
- 52 bytes per row
-
- The data page size in PostgreSQL is 8192 bytes (8 KB), so:
-
- 8192 bytes per page
- ------------------- = 157 rows per database page (rounded down)
- 52 bytes per row
-
- 100000 data rows
- -------------------- = 637 database pages (rounded up)
- 157 rows per page
-
-637 database pages * 8192 bytes per page = 5,218,304 bytes (5.2 MB)
-
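To compare an estimate like this with what a table really occupies, the size functions can be queried directly. This sketch assumes the data was loaded into a hypothetical table named tab and that the release in use provides these functions:

```sql
-- On-disk size of the table alone, and of the table plus its indexes
-- and TOAST data, in human-readable units.
SELECT pg_size_pretty(pg_relation_size('tab'))       AS table_only,
       pg_size_pretty(pg_total_relation_size('tab')) AS with_indexes_and_toast;
```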
-
-
Indexes do not require as much overhead, but do contain the data
- that is being indexed, so they can be large also.
-
-
NULLs are stored as bitmaps, so they
- use very little space.
-
-
4.6) Why are my queries slow? Why don't they
- use my indexes?
-
-
Indexes are not used by every query. Indexes are used only if the
- table is larger than a minimum size, and the query selects only a
- small percentage of the rows in the table. This is because the random
- disk access caused by an index scan can be slower than a straight read
- through the table, or sequential scan.
-
-
To determine if an index should be used, PostgreSQL must have
- statistics about the table. These statistics are collected using
- VACUUM ANALYZE, or simply ANALYZE.
- Using statistics, the optimizer knows how many rows are in the
- table, and can better determine if indexes should be used.
- Statistics are also valuable in determining optimal join order and
- join methods. Statistics collection should be performed
- periodically as the contents of the table change.
-
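For example, assuming a hypothetical table named tab, statistics could be refreshed and then inspected roughly like this (reltuples and relpages are the planner's estimates, not exact counts):

```sql
-- Collect statistics only.
ANALYZE tab;

-- Reclaim space from expired rows and collect statistics in one pass.
VACUUM ANALYZE tab;

-- See the row and page estimates the optimizer now works from.
SELECT relname, reltuples, relpages
FROM pg_class
WHERE relname = 'tab';
```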
-
Indexes are normally not used for ORDER BY or to
- perform joins. A sequential scan followed by an explicit sort is
- usually faster than an index scan of a large table.
- However, LIMIT combined with ORDER BY
- often will use an index because only a small portion of the table
- is returned.
-
-
If you believe the optimizer is incorrect in choosing a
- sequential scan, use SET enable_seqscan TO 'off' and
- run the query again to see if an index scan is indeed faster.
-
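One way to run that comparison in a single session is sketched below. The table tab, the column col, and the search value are placeholders. EXPLAIN ANALYZE executes the query and reports the actual run time next to the chosen plan:

```sql
-- Plan and timing with the optimizer's own choice.
EXPLAIN ANALYZE SELECT * FROM tab WHERE col = 'abc';

-- Discourage sequential scans for this session, then measure again.
SET enable_seqscan TO off;
EXPLAIN ANALYZE SELECT * FROM tab WHERE col = 'abc';

-- Restore the default so later queries are planned normally.
RESET enable_seqscan;
```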
-
When using wild-card operators such as LIKE or
- ~, indexes can only be used in certain circumstances:
-
The beginning of the search string must be anchored to the start
- of the string, i.e.
-
LIKE patterns must not start with %.
-
~ (regular expression) patterns must start with
- ^.
-
-
The search string can not start with a character class,
- e.g. [a-e].
-
Case-insensitive searches such as ILIKE and
- ~* do not utilize indexes. Instead, use expression
- indexes, which are described in section 4.8.
-
The default C locale must be used during
- initdb because it is not possible to know the next-greatest
- character in a non-C locale. For non-C locales, you can create a special
- text_pattern_ops index that works only
- for LIKE indexing. It is also possible to use
- full text indexing for word searches.
-
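The conditions above can be illustrated with a small sketch. It assumes a text column col in a hypothetical table tab, and the index name is made up:

```sql
-- Pattern-matching index for a database that does not use the C locale.
CREATE INDEX tab_col_pattern_idx ON tab (col text_pattern_ops);

-- Anchored at the start of the string: eligible for the index.
SELECT * FROM tab WHERE col LIKE 'abc%';
SELECT * FROM tab WHERE col ~ '^abc';

-- Not anchored: the index cannot be used for these.
SELECT * FROM tab WHERE col LIKE '%abc';
SELECT * FROM tab WHERE col ~ 'abc';
```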
-
-
-
4.7) How do I see how the query optimizer is
- evaluating my query?
-
-
See the EXPLAIN manual page.
-
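A brief sketch of typical usage, with a placeholder table tab and column col. Plain EXPLAIN only shows the plan, while EXPLAIN ANALYZE really executes the statement, so a data-modifying statement is wrapped in a transaction that is rolled back:

```sql
-- Show the estimated plan without running the query.
EXPLAIN SELECT * FROM tab WHERE col = 'abc';

-- Run the statement for real to get actual row counts and timings,
-- then undo its effects.
BEGIN;
EXPLAIN ANALYZE UPDATE tab SET col = 'xyz' WHERE col = 'abc';
ROLLBACK;
```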
-
4.8) How do I perform regular expression
- searches and case-insensitive regular expression searches? How do I
- use an index for case-insensitive searches?
-
-
The ~ operator does regular expression matching, and
- ~* does case-insensitive regular expression matching. The
- case-insensitive variant of LIKE is called
- ILIKE.
-
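For instance, against a placeholder table tab with a text column col, the three operators could be used as follows:

```sql
-- Case-sensitive regular expression: values starting with "abc".
SELECT * FROM tab WHERE col ~ '^abc';

-- Case-insensitive regular expression: also matches "ABC...", "Abc...".
SELECT * FROM tab WHERE col ~* '^abc';

-- Case-insensitive LIKE with the usual % and _ wildcards.
SELECT * FROM tab WHERE col ILIKE 'abc%';
```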
-
Case-insensitive equality comparisons are normally expressed
- as:
- SELECT *
- FROM tab
- WHERE lower(col) = 'abc';
-
- This will not use a standard index. However, if you create an
- expression index, it will be used:
- CREATE INDEX tabindex ON tab (lower(col));
-
-
If the above index is created as UNIQUE, the column
- can still store both uppercase and lowercase characters, but it cannot hold
- values that differ only in case. To force a particular
- case to be stored in the column, use a CHECK
- constraint or a trigger.
-
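A minimal sketch of both ideas, assuming the same placeholder table tab and column col (the index and constraint names are invented for the example):

```sql
-- Reject values that differ from an existing one only in case.
CREATE UNIQUE INDEX tab_lower_col_key ON tab (lower(col));

-- Alternatively, allow only lowercase text to be stored at all.
ALTER TABLE tab
    ADD CONSTRAINT tab_col_lowercase CHECK (col = lower(col));
```

Adding the CHECK constraint fails if existing rows already violate it, so existing data may need to be converted first.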
-
4.9) In a query, how do I detect if a field
- is NULL? How do I concatenate possible NULLs?
- How can I sort on whether a field is NULL or not?
-
-
You test the column with IS NULL and IS
- NOT NULL, like this:
-
- SELECT *
- FROM tab
- WHERE col IS NULL;
-
-
-
To concatenate with possible NULLs, use COALESCE(),
- like this:
- SELECT COALESCE(col1, '') || COALESCE(col2, '')
- FROM tab;
-
-
-
To sort by the NULL status, use the IS NULL
- and IS NOT NULL modifiers in your ORDER BY clause.
- In ascending order, false sorts before true,
- so the following will put NULL entries at the top of the resulting list:
-
- SELECT *
- FROM tab
- ORDER BY (col IS NOT NULL);
-
-
-
4.10) What is the difference between the
- various character types?
-
-
Type | Internal Name | Notes |
-
VARCHAR(n) | varchar | size specifies maximum length, no padding |
-
CHAR(n) | bpchar | blank padded to the specified fixed length |
-
TEXT | text | no specific upper limit on length |
-
BYTEA | bytea | variable-length byte array (null-byte safe) |
-
"char" | char | one character |
-
-
-
-
You will see the internal name when examining system catalogs
- and in some error messages.
-
-
The first four types above are "varlena" types (i.e., the first
- four bytes on disk are the length, followed by the data). Thus the
- actual space used is slightly greater than the declared size.
- However, long values are also subject to compression, so the space
- on disk might also be less than expected.
-
- VARCHAR(n) is best when storing variable-length
- strings and it limits how long a string can be. TEXT
- is for strings of unlimited length, with a maximum of one gigabyte.
-
CHAR(n) is for storing strings that are all the
- same length. CHAR(n) pads with blanks to the specified
- length, while VARCHAR(n) only stores the characters
- supplied. BYTEA is for storing binary data,
- particularly values that include NULL bytes. All the
- types described here have similar performance characteristics.
-
-
4.11.1) How do I create a
- serial/auto-incrementing field?
-
-
PostgreSQL supports a SERIAL data type. It
- auto-creates a sequence. For example, this:
- CREATE TABLE person (
- id SERIAL,
- name TEXT
- );
-
-
- is automatically translated into this:
-
- CREATE SEQUENCE person_id_seq;
- CREATE TABLE person (
- id INT4 NOT NULL DEFAULT nextval('person_id_seq'),
- name TEXT
- );
-
-
-
Automatically created sequences are named
- <table>_<serialcolumn>_seq, where
- table and serialcolumn are the names of the table and
- SERIAL column, respectively. See the
- create_sequence manual page for more information about
- sequences.
-
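If you would rather look the sequence up than rely on the naming rule, something like the following should work for the person table from the example above, on releases that provide pg_get_serial_sequence:

```sql
-- Ask the server which sequence backs the SERIAL column.
SELECT pg_get_serial_sequence('person', 'id');

-- Inspect the sequence's current state without advancing it.
SELECT last_value FROM person_id_seq;
```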
-
4.11.2) How do I get the value of a
- SERIAL insert?
-
-
The simplest way is to retrieve the assigned SERIAL