GiST Indexes
-
+gist-intro">
Introduction
-
+gist-extensibility">
Extensibility
-
+gist-implementation">
Implementation
-
+gist-examples">
Examples
- To see example implementations of index methods implemented using
-
GiST, examine the following contrib modules:
+ The
PostgreSQL source distribution includes
+ several examples of index methods implemented using
+
GiST. The core system currently provides R-Tree
+ equivalent functionality for some of the built-in geometric datatypes
+ (see src/backend/access/gist/gistproc.c). The following
+
contrib modules also contain GiST
+ operator classes:
btree_gist
+
B-Tree equivalent functionality for several datatypes
ltree
-
Indexing for tree-like stuctures
+
Indexing for tree-like structures
- rtree_gist
+ pg_trgm
+
Text similarity using trigram matching
seg
-
Storage and indexed access for float ranges
+
Indexing for float ranges
- tsearch and tsearch2
+ tsearch2
+
+
Crash Recovery
+
+ Usually, replay of the WAL log is sufficient to restore the integrity
+ of a GiST index following a database crash. However, there are some
+ corner cases in which the index state is not fully rebuilt. The index
+ will still be functionally correct, but there may be some performance
+ degradation. When this occurs, the index can be repaired by
+ VACUUMing its table, or by rebuilding the index using
+ REINDEX. In some cases a plain VACUUM is
+ not sufficient, and either VACUUM FULL or REINDEX
+ is needed. The need for one of these procedures is indicated by occurrence
+ of this log message during crash recovery:
+LOG: index NNN/NNN/NNN needs VACUUM or REINDEX to finish crash recovery
+
+ or this log message during routine index insertions:
+LOG: index "FOO" needs VACUUM or REINDEX to finish crash recovery
+
+ If a plain VACUUM finds itself unable to complete recovery
+ fully, it will return a notice:
+NOTICE: index "FOO" needs VACUUM FULL or REINDEX to finish crash recovery
+
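As a rough sketch of the repair steps described above, the statements below show the options in increasing order of cost. The table name `places` and the index name `places_area_gist` are hypothetical, chosen only for illustration.

```sql
-- Hypothetical names: table "places", GiST index "places_area_gist".

-- Cheapest option: a plain VACUUM of the table that owns the index.
VACUUM places;

-- If plain VACUUM reports that it could not finish recovery,
-- fall back to one of the following.
VACUUM FULL places;
REINDEX INDEX places_area_gist;
```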
+
+
+
-
+
Indexes
CREATE INDEX name ON table USING hash (column);
-
- Testing has shown
PostgreSQL's hash
- indexes to perform no better than B-tree indexes, and the
- index size and build time for hash indexes is much worse. For
- these reasons, hash index use is presently discouraged.
-
-
equivalent to the R-tree operator classes, and many other GiST operator
classes are available in the contrib collection or as separate
projects. For more information see .
-
- It is likely that the R-tree index type will be retired in a future
- release, as GiST indexes appear to do everything R-trees can do with
- similar or better performance. Users are encouraged to migrate
- applications that use R-tree indexes to GiST indexes.
-
-
+
+
+ Testing has shown
PostgreSQL's hash
+ indexes to perform no better than B-tree indexes, and the
+ index size and build time for hash indexes are much worse.
+ Furthermore, hash index operations are not presently WAL-logged,
+ so hash indexes may need to be rebuilt with REINDEX
+ after a database crash.
+ For these reasons, hash index use is presently discouraged.
+
+
+ Similarly, R-tree indexes do not seem to have any performance
+ advantages compared to the equivalent operations of GiST indexes.
+ Like hash indexes, they are not WAL-logged and may need
+ REINDEXing after a database crash.
+
+
+ While the problems with hash indexes may be fixed eventually,
+ it is likely that the R-tree index type will be retired in a future
+ release. Users are encouraged to migrate applications that use R-tree
+ indexes to GiST indexes.
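A migration along these lines might look like the following sketch. The table `places`, its `box` column `area`, and both index names are hypothetical. The sketch assumes the column's type has a built-in GiST operator class.

```sql
-- Before: an R-tree index on a geometric column (hypothetical names).
-- CREATE INDEX places_area_rtree ON places USING rtree (area);

-- Build the GiST replacement first, then drop the old R-tree index.
CREATE INDEX places_area_gist ON places USING gist (area);
DROP INDEX places_area_rtree;
```

Creating the new index before dropping the old one keeps an index on the column throughout the migration.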
+
+
A multicolumn GiST index can only be used when there is a query condition
- on its leading column. As with B-trees, conditions on additional columns
- restrict the entries returned by the index, but do not in themselves aid
- the index search.
+ on its leading column. Conditions on additional columns restrict the
+ entries returned by the index, but the condition on the first column is the
+ most important one for determining how much of the index needs to be
+ scanned. A GiST index will be relatively ineffective if its first column
+ has only a few distinct values, even if there are many distinct values in
+ additional columns.
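To illustrate the point about the leading column, here is a sketch using a hypothetical table `places` with a `box` column `area` and an integer column `category`. It assumes the btree_gist module is installed so that the integer column has a GiST operator class.

```sql
-- Put the column with many distinct values ("area") first.
CREATE INDEX places_area_cat_gist ON places USING gist (area, category);

-- Has a condition on the leading column, so the index can be used;
-- the condition on "category" further restricts the returned entries.
SELECT * FROM places
WHERE area && box '((0,0),(10,10))' AND category = 3;

-- No condition on the leading column: per the rule stated above,
-- this query would not use the index.
SELECT * FROM places WHERE category = 3;
```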
- B-tree indexes
+ B-tree
and GiST indexes
- Short-term share/exclusive page-level locks are used for
- read/write access. Locks are released immediately after each
- index row is fetched or inserted. B-tree indexes provide
- the highest concurrency without deadlock conditions.
+ Short-term share/exclusive page-level locks are used for
+ read/write access. Locks are released immediately after each
+ index row is fetched or inserted. These index types provide
+ the highest concurrency without deadlock conditions.
-
GiST and R-tree indexes
+ Hash indexes
- Share/exclusive index-level locks are used for read/write access.
- Locks are released after the command is done.
+ Share/exclusive hash-bucket-level locks are used for read/write
+ access. Locks are released after the whole bucket is processed.
+ Bucket-level locks provide better concurrency than index-level
+ ones, but deadlock is possible since the locks are held longer
+ than one index operation.
- Hash indexes
+ R-tree indexes
- Share/exclusive hash-bucket-level locks are used for read/write
- access. Locks are released after the whole bucket is processed.
- Bucket-level locks provide better concurrency than index-level
- ones, but deadlock is possible since the locks are held longer
- than one index operation.
+ Share/exclusive index-level locks are used for read/write access.
+ Locks are released after the entire command is done.
- In short, B-tree indexes offer the best performance for concurrent
+ Currently, B-tree indexes offer the best performance for concurrent
applications; since they also have more features than hash
indexes, they are the recommended index type for concurrent
applications that need to index scalar data. When dealing with
- non-scalar data, B-trees obviously cannot be used; in that
- situation, application developers should be aware of the
- relatively poor concurrent performance of GiST and R-tree
- indexes.
+ non-scalar data, B-trees are not useful, and GiST indexes should
+ be used instead. R-tree indexes are deprecated and are likely
+ to disappear entirely in a future release.