-
+
GIN Indexes
GIN stands for Generalized Inverted Index. It is
an index structure storing a set of (key, posting list) pairs, where
- 'posting list' is a set of rows in which the key occurs. The
+ 'posting list' is a set of rows in which the key occurs. Each
row may contain many keys.
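The (key, posting list) layout described above can be sketched in plain C. This is only an illustration under stated assumptions: the names PostingEntry, InvertedIndex, find_entry and index_add are hypothetical, keys are plain integers, and the fixed-size arrays stand in for GIN's real on-disk structures, which this sketch does not model.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_KEYS 16				/* illustrative capacity, not a GIN limit */
#define MAX_ROWS 16				/* illustrative capacity, not a GIN limit */

/* One (key, posting list) pair: the rows in which the key occurs. */
typedef struct
{
	int			key;
	int			rows[MAX_ROWS];
	size_t		nrows;
} PostingEntry;

/* Hypothetical in-memory inverted index: a flat array of entries. */
typedef struct
{
	PostingEntry entries[MAX_KEYS];
	size_t		nentries;
} InvertedIndex;

/* Return the entry for key, or NULL if the key is not indexed yet. */
static PostingEntry *
find_entry(InvertedIndex *idx, int key)
{
	size_t		i;

	for (i = 0; i < idx->nentries; i++)
		if (idx->entries[i].key == key)
			return &idx->entries[i];
	return NULL;
}

/*
 * Index one row: append the row to the posting list of each of its keys.
 * All keys of a row are assumed to arrive in a single call, so a repeated
 * key within the row is detected by looking at the last posting only.
 */
static void
index_add(InvertedIndex *idx, int row, const int keys[], size_t nkeys)
{
	size_t		i;

	for (i = 0; i < nkeys; i++)
	{
		PostingEntry *e = find_entry(idx, keys[i]);

		if (e == NULL)
		{
			assert(idx->nentries < MAX_KEYS);
			e = &idx->entries[idx->nentries++];
			e->key = keys[i];
			e->nrows = 0;
		}
		if (e->nrows > 0 && e->rows[e->nrows - 1] == row)
			continue;			/* key repeated within the same row */
		assert(e->nrows < MAX_ROWS);
		e->rows[e->nrows++] = row;
	}
}
```

Indexing a row with several keys touches several posting lists, which is the reason a single row insertion can turn into many index insertions.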
Returns an array of keys of the query to be executed. n contains
- strategy number of operation (see ).
+ the strategy number of the operation
+ (see ).
Depending on n, query may be different type.
bool consistent( bool check[], StrategyNumber n, Datum query)
- Returns TRUE if indexed value satisfies query qualifier with strategy n
- (or may satisfy in case of RECHECK mark in operator class).
- Each element of the check array is TRUE if indexed value has a
+ Returns TRUE if the indexed value satisfies the query qualifier with
+ strategy n (or may satisfy in case of RECHECK mark in operator class).
+ Each element of the check array is TRUE if the indexed value has a
corresponding key in the query: if (check[i] == TRUE ) the i-th key of
the query is present in the indexed value.
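How the check array might be combined can be sketched as follows. This is a simplified stand-in, not an operator class implementation: the strategy numbers STRATEGY_OVERLAP and STRATEGY_CONTAINS are hypothetical, the number of keys is passed explicitly (the signature above does not carry it), and the query Datum is omitted because the sketch never inspects it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical strategy numbers; a real operator class defines its own. */
enum
{
	STRATEGY_OVERLAP = 1,		/* at least one query key is present */
	STRATEGY_CONTAINS = 2		/* every query key is present */
};

/*
 * Simplified consistent(): check[i] is true when the i-th key of the
 * query is present in the indexed value.
 */
static bool
consistent_sketch(const bool check[], size_t nkeys, int strategy)
{
	size_t		i;

	assert(check != NULL || nkeys == 0);

	switch (strategy)
	{
		case STRATEGY_OVERLAP:
			for (i = 0; i < nkeys; i++)
				if (check[i])
					return true;
			return false;

		case STRATEGY_CONTAINS:
			for (i = 0; i < nkeys; i++)
				if (!check[i])
					return false;
			return true;

		default:
			/* unknown strategy: report no match in this sketch */
			return false;
	}
}
```

For a check array of {true, false, true}, the overlap strategy in this sketch reports a match while the contains strategy does not.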
Create vs insert
- In most cases, insertion into
GIN index is slow
because
- many GIN keys may be inserted for each table row. So, when loading data
- in bulk it may be useful to drop index and recreate it
- after the data is loaded in the table.
+ In most cases, insertion into a
GIN index is slow
+ due to the likelihood of many keys being inserted for each value.
+ So, for bulk insertions into a table it is advisable to drop the GIN
+ index and recreate it after finishing bulk insertion.
gin_fuzzy_search_limit
- The primary goal of development
GIN indices was
+ The primary goal of developing
GIN indices was
support for highly scalable, full-text search in
PostgreSQL and there are often situations when
a full-text search returns a very large set of results. Since reading
Such queries usually contain very frequent words, so the results are not
very helpful. To facilitate execution of such queries
-
GIN has a configurable
soft upper limit of the size
+
GIN has a configurable soft upper limit of the size
of the returned set, determined by the
gin_fuzzy_search_limit GUC variable. It is set to 0 by
default (no limit).
Limitations
-
GIN doesn't support full
scan of index due to it's
- extremely inefficiency: because of a lot of keys per value,
+
GIN doesn't support full
index scans due to their
+ extreme inefficiency: because there are often many keys per value,
each heap pointer will returned several times.
- When extractQuery returns zero
number of keys, GIN will
- emit a error: for different opclass and strategy semantic meaning of void
- query may be different (for example, any array contains void array,
- but they
aren't overlapped with void one), and GIN can't
+ When extractQuery returns zero
keys, GIN will emit an
+ error: for different opclasses and strategies the semantic meaning of a void
+ query may be different (for example, any array contains the void array,
+ but they
don't overlap the void array), and GIN can't
suggest reasonable answer.
-
+
Indexes
index
GIN is a inverted index and it's usable for values which have more
- than one key, arrays for example. Like to GiST, GIN may support
+ than one key, arrays for example. Like GiST, GIN may support
many different user-defined indexing strategies and the particular
operators with which a GIN index can be used vary depending on the
indexing strategy.
(See for the meaning of
these operators.)
- Another GIN operator classes are available in the contrib
+ Other GIN operator classes are available in the contrib
tsearch2 and intarray modules. For more information see .
-
+
Concurrency Control
- Short-term share/exclusive page-level locks are used for
- read/write access. Locks are released immediately after each
- index row is fetched or inserted. However, note that a GIN index
- usually requires several inserts for each table row.
+ Short-term share/exclusive page-level locks are used for
+ read/write access. Locks are released immediately after each
+ index row is fetched or inserted. But note that inserting a
+ GIN-indexed value usually produces several index key insertions
+ per row, so GIN may do substantial work for a single value's
+ insertion.