Improve documentation about GiST opclass support functions.

author Tom Lane

Fri, 12 Jun 2009 19:48:53 +0000 (19:48 +0000)

committer Tom Lane

Fri, 12 Jun 2009 19:48:53 +0000 (19:48 +0000)
diff --git a/doc/src/sgml/gist.sgml b/doc/src/sgml/gist.sgml

index f236e6ad614f5172ba386081a87c67669f617b92..eddaaad5dfa39f34c26a5aee1a2302f6113ce1e7 100644
--- a/doc/src/sgml/gist.sgml
+++ b/doc/src/sgml/gist.sgml
@@ -1,4 +1,4 @@
-
+
  
  
  GiST Indexes
@@ -25,16 +25,17 @@
   
  
    
-    Some of the information here is derived from the University of California at
-    Berkeley's GiST Indexing Project
-    web site and 
+    Some of the information here is derived from the University of California
+    at Berkeley's GiST Indexing Project
+    web site and
+    Marcel Kornacker's thesis,
      
-    Marcel Kornacker's thesis, Access Methods for Next-Generation Database Systems.
+    Access Methods for Next-Generation Database Systems.
      The GiST
      implementation in PostgreSQL is primarily
      maintained by Teodor Sigaev and Oleg Bartunov, and there is more
      information on their
-    website.
+    web site.
    
  
  
@@ -47,11 +48,11 @@
     difficult work.  It was necessary to understand the inner workings of the
     database, such as the lock manager and Write-Ahead Log.  The
     GiST interface has a high level of abstraction,
-   requiring the access method implementer to only implement the semantics of
+   requiring the access method implementer only to implement the semantics of
     the data type being accessed.  The GiST layer itself
     takes care of concurrency, logging and searching the tree structure.
   
- 
+
   
     This extensibility should not be confused with the extensibility of the
     other standard search trees in terms of the data they can handle.  For
@@ -62,12 +63,12 @@
     (<, =, >),
     and hash indexes only support equality queries.
   
- 
+
   
     So if you index, say, an image collection with a
     PostgreSQL B-tree, you can only issue queries
     such as is imagex equal to imagey, is imagex less
-   than imagey and is imagex greater than imagey?
+   than imagey and is imagex greater than imagey.
     Depending on how you define equals, less than
     and greater than in this context, this could be useful.
     However, by using a GiST based index, you could create
@@ -89,87 +90,479 @@
  
  
   Implementation
- 
+
   
     There are seven methods that an index operator class for
-   GiST must provide:
+   GiST must provide. Correctness of the index is ensured
+   by proper implementation of the same, consistent
+   and union methods, while efficiency (size and speed) of the
+   index will depend on the penalty and picksplit
+   methods.
+   The remaining two methods are compress and
+   decompress, which allow an index to have internal tree data of
+   a different type than the data it indexes. The leaves are to be of the
+   indexed data type, while the other tree nodes can be of any C struct (but
+   you still have to follow PostgreSQL data type rules here;
+   see varlena for variable-sized data). If the tree's
+   internal data type exists at the SQL level, the STORAGE option
+   of the CREATE OPERATOR CLASS command can be used.
   
  
   
      
-     consistent
+     consistent
       
        
-       Given a predicate p on a tree page, and a user
-       query, q, this method will return false if it is
-       certain that both p and q cannot
-       be true for a given data item.  For a true result, a
-       recheck flag must also be returned; this indicates whether
-       the predicate implies the query (recheck = false) or
-       not (recheck = true).
+       Given an index entry p and a query value q,
+       this function determines whether the index entry is
+       consistent with the query; that is, could the predicate
+       indexed_column
+       indexable_operator q be true for
+       any row represented by the index entry?  For a leaf index entry this is
+       equivalent to testing the indexable condition, while for an internal
+       tree node this determines whether it is necessary to scan the subtree
+       of the index represented by the tree node.  When the result is
+       true, a recheck flag must also be returned.
+       This indicates whether the predicate is certainly true or only possibly
+       true.  If recheck = false then the index has
+       tested the predicate condition exactly, whereas if recheck
+       = true the row is only a candidate match.  In that case the
+       system will automatically evaluate the
+       indexable_operator against the actual row value to see
+       if it is really a match.  This convention allows
+       GiST to support both lossless and lossy index
+       structures.
+      
+
+      
+        The SQL declaration of the function must look like this:
+
+
+CREATE OR REPLACE FUNCTION my_consistent(internal, data_type, smallint, oid, internal)
+RETURNS bool
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+
+        And the matching code in the C module could then follow this skeleton:
+
+
+Datum       my_consistent(PG_FUNCTION_ARGS);
+PG_FUNCTION_INFO_V1(my_consistent);
+
+Datum
+my_consistent(PG_FUNCTION_ARGS)
+{
+    GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
+    data_type  *query = PG_GETARG_DATA_TYPE_P(1);
+    StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
+    /* Oid subtype = PG_GETARG_OID(3); */
+    bool       *recheck = (bool *) PG_GETARG_POINTER(4);
+    data_type  *key = DatumGetDataType(entry->key);
+    bool        retval;
+
+    /*
+     * determine return value as a function of strategy, key and query.
+     *
+     * Use GIST_LEAF(entry) to know where you're called in the index tree,
+     * which comes in handy when supporting the = operator, for example (you
+     * could check for non-empty union() in non-leaf nodes and equality in
+     * leaf nodes).
+     */
+
+    *recheck = true;        /* or false if check is exact */
+
+    PG_RETURN_BOOL(retval);
+}
+
+
+       Here, key is an element in the index and query
+       the value being looked up in the index. The StrategyNumber
+       parameter indicates which operator of your operator class is being
+       applied — it matches one of the operator numbers in the
+       CREATE OPERATOR CLASS command.  Depending on what operators
+       you have included in the class, the data type of query could
+       vary with the operator, but the above skeleton assumes it doesn't.
        
+
       
      
  
      
-     union
+     union
       
        
         This method consolidates information in the tree.  Given a set of
-       entries, this function generates a new predicate that is true for all
-       the entries.
+       entries, this function generates a new index entry that represents
+       all the given entries.
+      
+
+      
+        The SQL declaration of the function must look like this:
+
+
+CREATE OR REPLACE FUNCTION my_union(internal, internal)
+RETURNS internal
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+
+        And the matching code in the C module could then follow this skeleton:
+
+
+Datum       my_union(PG_FUNCTION_ARGS);
+PG_FUNCTION_INFO_V1(my_union);
+
+Datum
+my_union(PG_FUNCTION_ARGS)
+{
+    GistEntryVector *entryvec = (GistEntryVector *) PG_GETARG_POINTER(0);
+    GISTENTRY  *ent = entryvec->vector;
+    data_type  *out,
+               *tmp,
+               *old;
+    int         numranges,
+                i = 0;
+
+    numranges = entryvec->n;
+    tmp = DatumGetDataType(ent[0].key);
+    out = tmp;
+
+    if (numranges == 1)
+    {
+        out = data_type_deep_copy(tmp);
+
+        PG_RETURN_DATA_TYPE_P(out);
+    }
+
+    for (i = 1; i < numranges; i++)
+    {
+        old = out;
+        tmp = DatumGetDataType(ent[i].key);
+        out = my_union_implementation(out, tmp);
+    }
+
+    PG_RETURN_DATA_TYPE_P(out);
+}
+
+      
+
+      
+        As you can see, in this skeleton we're dealing with a data type
+        where union(X, Y, Z) = union(union(X, Y), Z). It's easy
+        enough to support data types where this is not the case, by
+        implementing the proper union algorithm in this
+        GiST support method.
+      
+
+      
+        The union implementation function should return a
+        pointer to newly palloc()ed memory. You can't just
+        return whatever the input is.
        
       
      
  
      
-     compress
+     compress
       
        
         Converts the data item into a format suitable for physical storage in
         an index page.
        
+
+      
+        The SQL declaration of the function must look like this:
+
+
+CREATE OR REPLACE FUNCTION my_compress(internal)
+RETURNS internal
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+
+        And the matching code in the C module could then follow this skeleton:
+
+
+Datum       my_compress(PG_FUNCTION_ARGS);
+PG_FUNCTION_INFO_V1(my_compress);
+
+Datum
+my_compress(PG_FUNCTION_ARGS)
+{
+    GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
+    GISTENTRY  *retval;
+
+    if (entry->leafkey)
+    {
+        /* replace entry->key with a compressed version */
+        compressed_data_type *compressed_data = palloc(sizeof(compressed_data_type));
+
+        /* fill *compressed_data from entry->key ... */
+
+        retval = palloc(sizeof(GISTENTRY));
+        gistentryinit(*retval, PointerGetDatum(compressed_data),
+                      entry->rel, entry->page, entry->offset, FALSE);
+    }
+    else
+    {
+        /* typically we needn't do anything with non-leaf entries */
+        retval = entry;
+    }
+
+    PG_RETURN_POINTER(retval);
+}
+
+      
+
+      
+       You have to adapt compressed_data_type to the specific
+       type you're converting to in order to compress your leaf nodes, of
+       course.
+      
+
+      
+        Depending on your needs, you might also have to handle
+        compression of NULL values here, for example by storing
+        (Datum) 0 as gist_circle_compress does.
+      
       
      
  
      
-     decompress
+     decompress
       
        
         The reverse of the compress method.  Converts the
         index representation of the data item into a format that can be
         manipulated by the database.
        
+
+      
+        The SQL declaration of the function must look like this:
+
+
+CREATE OR REPLACE FUNCTION my_decompress(internal)
+RETURNS internal
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+
+        And the matching code in the C module could then follow this skeleton:
+
+
+Datum       my_decompress(PG_FUNCTION_ARGS);
+PG_FUNCTION_INFO_V1(my_decompress);
+
+Datum
+my_decompress(PG_FUNCTION_ARGS)
+{
+    PG_RETURN_POINTER(PG_GETARG_POINTER(0));
+}
+
+
+        The above skeleton is suitable for the case where no decompression
+        is needed.
+      
       
      
  
      
-     penalty
+     penalty
       
        
         Returns a value indicating the cost of inserting the new
-       entry into a particular branch of the tree.  items will be inserted
+       entry into a particular branch of the tree.  Items will be inserted
         down the path of least penalty in the tree.
        
+
+      
+        The SQL declaration of the function must look like this:
+
+
+CREATE OR REPLACE FUNCTION my_penalty(internal, internal, internal)
+RETURNS internal
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;  -- in some cases penalty functions need not be strict
+
+
+        And the matching code in the C module could then follow this skeleton:
+
+
+Datum       my_penalty(PG_FUNCTION_ARGS);
+PG_FUNCTION_INFO_V1(my_penalty);
+
+Datum
+my_penalty(PG_FUNCTION_ARGS)
+{
+    GISTENTRY  *origentry = (GISTENTRY *) PG_GETARG_POINTER(0);
+    GISTENTRY  *newentry = (GISTENTRY *) PG_GETARG_POINTER(1);
+    float      *penalty = (float *) PG_GETARG_POINTER(2);
+    data_type  *orig = DatumGetDataType(origentry->key);
+    data_type  *new = DatumGetDataType(newentry->key);
+
+    *penalty = my_penalty_implementation(orig, new);
+    PG_RETURN_POINTER(penalty);
+}
+
+      
+
+      
+        The penalty function is crucial to good performance of
+        the index. It'll get used at insertion time to determine which branch
+        to follow when choosing where to add the new entry in the tree. At
+        query time, the more balanced the index, the quicker the lookup.
+      
       
      
  
      
-     picksplit
+     picksplit
       
        
-       When a page split is necessary, this function decides which entries on
-       the page are to stay on the old page, and which are to move to the new
-       page.
+       When an index page split is necessary, this function decides which
+       entries on the page are to stay on the old page, and which are to move
+       to the new page.
+      
+
+      
+        The SQL declaration of the function must look like this:
+
+
+CREATE OR REPLACE FUNCTION my_picksplit(internal, internal)
+RETURNS internal
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+
+        And the matching code in the C module could then follow this skeleton:
+
+
+Datum       my_picksplit(PG_FUNCTION_ARGS);
+PG_FUNCTION_INFO_V1(my_picksplit);
+
+Datum
+my_picksplit(PG_FUNCTION_ARGS)
+{
+    GistEntryVector *entryvec = (GistEntryVector *) PG_GETARG_POINTER(0);
+    OffsetNumber maxoff = entryvec->n - 1;
+    GISTENTRY  *ent = entryvec->vector;
+    GIST_SPLITVEC *v = (GIST_SPLITVEC *) PG_GETARG_POINTER(1);
+    int         i,
+                nbytes;
+    OffsetNumber *left,
+               *right;
+    data_type  *tmp_union;
+    data_type  *unionL;
+    data_type  *unionR;
+    GISTENTRY **raw_entryvec;
+
+    maxoff = entryvec->n - 1;
+    nbytes = (maxoff + 1) * sizeof(OffsetNumber);
+
+    v->spl_left = (OffsetNumber *) palloc(nbytes);
+    left = v->spl_left;
+    v->spl_nleft = 0;
+
+    v->spl_right = (OffsetNumber *) palloc(nbytes);
+    right = v->spl_right;
+    v->spl_nright = 0;
+
+    unionL = NULL;
+    unionR = NULL;
+
+    /* Initialize the raw entry vector. */
+    raw_entryvec = (GISTENTRY **) palloc(entryvec->n * sizeof(GISTENTRY *));
+    for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+        raw_entryvec[i] = &(entryvec->vector[i]);
+
+    for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+    {
+        int         real_index = raw_entryvec[i] - entryvec->vector;
+
+        tmp_union = DatumGetDataType(entryvec->vector[real_index].key);
+        Assert(tmp_union != NULL);
+
+        /*
+         * Choose where to put the index entries and update unionL and unionR
+         * accordingly. Append the entries to either v->spl_left or
+         * v->spl_right, and update the counters.
+         */
+
+        if (my_choice_is_left(unionL, unionR, tmp_union))
+        {
+            if (unionL == NULL)
+                unionL = tmp_union;
+            else
+                unionL = my_union_implementation(unionL, tmp_union);
+
+            *left = real_index;
+            ++left;
+            ++(v->spl_nleft);
+        }
+        else
+        {
+            /*
+             * Same on the right
+             */
+        }
+    }
+
+    v->spl_ldatum = DataTypeGetDatum(unionL);
+    v->spl_rdatum = DataTypeGetDatum(unionR);
+    PG_RETURN_POINTER(v);
+}
+
+      
+
+      
+        Like penalty, the picksplit function
+        is crucial to good performance of the index.  Designing suitable
+        penalty and picksplit implementations
+        is where the challenge of implementing well-performing
+        GiST indexes lies.
        
       
      
  
      
-     same
+     same
       
        
-       Returns true if two entries are identical, false otherwise.
+       Returns true if two index entries are identical, false otherwise.
+      
+
+      
+        The SQL declaration of the function must look like this:
+
+
+CREATE OR REPLACE FUNCTION my_same(internal, internal, internal)
+RETURNS internal
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+
+        And the matching code in the C module could then follow this skeleton:
+
+
+Datum       my_same(PG_FUNCTION_ARGS);
+PG_FUNCTION_INFO_V1(my_same);
+
+Datum
+my_same(PG_FUNCTION_ARGS)
+{
+    data_type  *v1 = PG_GETARG_DATA_TYPE_P(0);
+    data_type  *v2 = PG_GETARG_DATA_TYPE_P(1);
+    bool       *result = (bool *) PG_GETARG_POINTER(2);
+
+    *result = my_eq(v1, v2);
+    PG_RETURN_POINTER(result);
+}
+
+
+        For historical reasons, the same function doesn't
+        just return a boolean result; instead it has to store the flag
+        at the location indicated by the third argument.
        
       
      
@@ -189,9 +582,9 @@
    R-Tree equivalent functionality for some of the built-in geometric data types
    (see src/backend/access/gist/gistproc.c).  The following
    contrib modules also contain GiST
-  operator classes: 
+  operator classes:
   
- 
+
   
    
     btree_gist
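
The skeletons in this patch leave my_union_implementation, my_penalty_implementation and my_eq undefined. As a rough sketch of what such helpers might compute, assume a hypothetical key type Interval, a closed range [lo, hi] on the number line. None of the names below come from PostgreSQL, and the functions deliberately omit the fmgr and GISTENTRY plumbing so that only the semantics are shown. The union here satisfies union(X, Y, Z) = union(union(X, Y), Z), which is the property the my_union skeleton relies on.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical key type (an assumption, not a PostgreSQL type):
 * a closed interval [lo, hi] with lo <= hi. */
typedef struct Interval
{
    double      lo;
    double      hi;
} Interval;

/* Possible body for my_union_implementation: the smallest interval
 * that covers both inputs. */
Interval
interval_union(Interval a, Interval b)
{
    Interval    out;

    assert(a.lo <= a.hi && b.lo <= b.hi);
    out.lo = (a.lo < b.lo) ? a.lo : b.lo;
    out.hi = (a.hi > b.hi) ? a.hi : b.hi;
    return out;
}

/* Possible body for my_eq, as used by the same method:
 * two keys are identical when both bounds match. */
bool
interval_same(Interval a, Interval b)
{
    return a.lo == b.lo && a.hi == b.hi;
}

/* Possible body for my_penalty_implementation: how much the existing
 * key would have to grow in length to cover the new one. */
float
interval_penalty(Interval orig, Interval add)
{
    Interval    merged = interval_union(orig, add);

    return (float) ((merged.hi - merged.lo) - (orig.hi - orig.lo));
}

/* Possible core of a consistent method for an "overlaps" operator.
 * The same test serves leaf and internal entries, because an internal
 * key built by interval_union covers every key beneath it. */
bool
interval_overlaps(Interval key, Interval query)
{
    return key.lo <= query.hi && query.lo <= key.hi;
}
```

With a = [1, 3] and b = [2, 6], the union is [1, 6] and the penalty of adding b to a is 3, while adding a to a branch whose key is already [1, 6] costs 0. Under this sketch, insertion would therefore descend into the branch that already covers the new key.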