+ url="http://www.sai.msu.su/~megera/postgres/gist/">>.
+
+
+
+
+
+
Extensibility
+
+ Traditionally, implementing a new index access method meant a lot of
+ difficult work. It was necessary to understand the inner workings of the
+ database, such as the lock manager and Write-Ahead Log. The
+
GiST interface has a high level of abstraction,
+ requiring the access method implementor to only implement the semantics of
+ the data type being accessed. The
GiST layer itself
+ takes care of concurrency, logging and searching the tree structure.
+
+
+ This extensibility should not be confused with the extensibility of the
+ other standard search trees in terms of the data they can handle. For
+ example,
PostgreSQL supports extensible B+-trees
+ and R-trees. That means that you can use
+
PostgreSQL to build a B+-tree or R-tree over any
+ data type you want. But B+-trees only support range predicates
+ (<, =, >),
+ and R-trees only support n-D range queries (contains, contained, equals).
+
+
+ So if you index, say, an image collection with a
+
PostgreSQL B+-tree, you can only issue queries
+ such as is imagex equal to imagey
, is imagex less
+ than imagey and is imagex greater than imagey
?
+ Depending on how you define equals
, less than
+ and greater than
in this context, this could be useful.
+ However, by using a
GiST based index, you could create
+ ways to ask domain-specific questions, perhaps find all images of
+ horses or find all over-exposed images
.
+
+
+ All it takes to get a
GiST access method up and running
+ is to implement seven user-defined methods, which define the behavior of
+ keys in the tree. Of course these methods have to be pretty fancy to
+ support fancy queries, but for all the standard queries (B+-trees,
+ R-trees, etc.) they're relatively straightforward. In short,
+
GiST combines extensibility along with generality, code
+ reuse, and a clean interface.
+
+
+
+
+
+
Implementation
+
+ There are seven methods that an index operator class for
+
+
+
+
+ consistent
+
+ Given a predicate p on a tree page, and a user
+ query, q, this method will return false if it is
+ certain that both p and q cannot
+ be true for a given data item.
+
+
+
+
+
+ union
+
+ This method consolidates information in the tree. Given a set of
+ entries, this function generates a new predicate that is true for all
+ the entries.
+
+
+
+
+
+ compress
+
+ Converts the data item into a format suitable for physical storage in
+ an index page.
+
+
+
+
+
+ decompress
+
+ The reverse of the compress method. Converts the
+ index representation of the data item into a format that can be
+ manipulated by the database.
+
+
+
+
+
+ penalty
+
+ Returns a value indicating the cost
of inserting the new
+ entry into a particular branch of the tree. items will be inserted
+ down the path of least penalty in the tree.
+
+
+
+
+
+ picksplit
+
+ When a page split is necessary, this function decides which entries on
+ the page are to stay on the old page, and which are to move to the new
+ page.
+
+
+
+
+
+ same
+
+ Returns true if two entries are identical, false otherwise.
+
+
+
+
+
+
+
+
+
+
Limitations
+
+ The current implementation of
GiST within
+
PostgreSQL has some major limitations:
+
GiST access is not concurrent; the
+
GiST interface doesn't allow the development of certain
+ data types, such as digital trees (see papers by Aoki et al); and there
+ is not yet any support for write-ahead logging of updates in
+
+
+ Solutions to the concurrency problems appear in Marcel Kornacker's
+ thesis; however these ideas have not yet been put into practice in the
+
PostgreSQL implementation.
+
+
+ The lack of write-ahead logging is just a small matter of programming,
+ but since it isn't done yet, a crash could render a
GiST
+ index inconsistent, forcing a REINDEX.
+
+
+
+
+
+
Examples
+
+ To see example implementations of index methods implemented using
+
GiST, examine the following contrib modules:
+
+
+
+
+ btree_gist
+
+
+
+
+
+ cube
+
+
Indexing for multi-dimensional cubes
+
+
+
+
+ intarray
+
+
RD-Tree for one-dimensional array of int4 values
+
+
+
+
+ ltree
+
+
Indexing for tree-like stuctures
+
+
+
+
+ rtree_gist
+
+
+
+
+
+ seg
+
+
Storage and indexed access for float ranges
+
+
+
+
+ tsearch and tsearch2
+
+
+
+
+
+
+
+