grows exponentially with the number of joins included in it. Further
optimization effort is caused by the support of a variety of
join methods
- (e.g., nested loop,
index scan, merge join in
Postgres) to
+ (e.g., nested loop,
hash join, merge join in
Postgres) to
process individual joins and a diversity of
indices (e.g., r-tree,
b-tree, hash in
Postgres) as access paths for relations.
The current
Postgres optimizer
- implementation performs a near-
- exhaustive search over the space of alternative strategies. This query
+ implementation performs a near-exhaustive search
+ over the space of alternative strategies. This query
optimization technique is inadequate to support database application
domains that involve the need for extensive queries, such as artificial
intelligence.
- Performance difficulties within exploring the space of possible query
- plans arose the demand for a new optimization technique being developed.
+ Performance difficulties in exploring the space of possible query
+ plans created the demand for a new optimization technique being developed.
- The
GA is a heuristic optimization method which operates through
+ The
GA is a heuristic optimization method which
+ operates through
determined, randomized search. The set of possible solutions for the
optimization problem is considered as a
- popula of individuals.
+ population of individuals.
The degree of adaption of an individual to its environment is specified
by its fitness.
is encoded by the integer string '4-1-3-2',
which means, first join relation '4' and '1', then '3', and
- then '2', where 1, 2, 3, 4 are relids in
Postgres.
+ then '2', where 1, 2, 3, 4 are relids within the
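
To make the integer-string encoding concrete, the following is a minimal sketch of how a string such as '4-1-3-2' could be read as a sequence of join steps. It assumes a strictly left-deep interpretation (each step joins the accumulated result with one more relation), which is how the text above describes it; the function name `decode_join_order` is hypothetical and is not part of the Postgres sources.

```python
def decode_join_order(encoded):
    """Turn an encoded individual like '4-1-3-2' into left-deep join steps.

    Each step is a pair (already_joined_relids, next_relid).
    This is an illustrative helper, not a routine from the GEQO module.
    """
    relids = [int(token) for token in encoded.split("-")]
    if len(set(relids)) != len(relids):
        raise ValueError("each relid may appear only once in an individual")

    steps = []
    joined = (relids[0],)          # start with the first relation alone
    for relid in relids[1:]:
        steps.append((joined, relid))
        joined = joined + (relid,)  # the join result now includes relid
    return steps


if __name__ == "__main__":
    for left, right in decode_join_order("4-1-3-2"):
        print("join", left, "with", right)
```

Because every individual is a permutation of the same relids, any reordering of the string is still a valid join sequence under this assumption, which is what lets a GA recombine individuals freely.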
- Usage of edge recombination crossover which is especially suited
+ Usage of edge recombination crossover which is
+ especially suited
to keep edge losses low for the solution of the
-
TSP by means of a
GA;
+
TSP by means of a GA;
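
As an illustration of why edge recombination crossover keeps edge losses low, here is a small sketch of the textbook form of the operator. It treats each parent as a cyclic tour, as in the TSP; whether the GEQO module handles tours in exactly this way is an assumption, and the names `build_edge_map` and `edge_recombination` are hypothetical rather than taken from the Postgres sources.

```python
import random


def build_edge_map(parent_a, parent_b):
    """Map each gene to the set of its neighbours in either parent tour."""
    edge_map = {gene: set() for gene in parent_a}
    for tour in (parent_a, parent_b):
        n = len(tour)
        for i, gene in enumerate(tour):
            edge_map[gene].add(tour[i - 1])        # predecessor (cyclic)
            edge_map[gene].add(tour[(i + 1) % n])  # successor (cyclic)
    return edge_map


def edge_recombination(parent_a, parent_b, rng=None):
    """Build a child tour that reuses parental edges wherever possible."""
    rng = rng or random.Random()
    if sorted(parent_a) != sorted(parent_b):
        raise ValueError("parents must be permutations of the same genes")

    edge_map = build_edge_map(parent_a, parent_b)
    current = rng.choice([parent_a[0], parent_b[0]])
    child = [current]
    remaining = set(parent_a) - {current}

    while remaining:
        # A visited gene can no longer be anyone's next neighbour.
        for neighbours in edge_map.values():
            neighbours.discard(current)

        candidates = sorted(edge_map[current])
        if candidates:
            # Prefer the neighbour with the fewest remaining edges, so
            # that it is less likely to become isolated later on.
            fewest = min(len(edge_map[g]) for g in candidates)
            current = rng.choice(
                [g for g in candidates if len(edge_map[g]) == fewest]
            )
        else:
            # Dead end: no parental edge is left, so an edge is "lost".
            current = rng.choice(sorted(remaining))

        child.append(current)
        remaining.discard(current)
    return child


if __name__ == "__main__":
    print(edge_recombination([4, 1, 3, 2], [1, 2, 3, 4], random.Random(0)))
```

The child only introduces a new edge when it reaches a dead end, so most adjacencies in the child come from one of the two parents.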
- The
GEQO module gives the following benefits to
- compared to the
Postgres query optimizer implementation:
-
-
-
- Handling of large join queries through non-exhaustive search;
-
-
-
-
- Improved cost size approximation of query plans since no longer
- plan merging is needed (the
GEQO module evaluates the cost for a
- query plan as an individual).
-
-
-
+ the
Postgres query optimizer to
+ support large join queries effectively through
+ non-exhaustive search.
-
-
-
+
Future Implementation Tasks for
-
-
Basic Improvements
-
-
-
Improve genetic algorithm parameter settings
-
+ Work is still needed to improve the genetic algorithm parameter
+ settings.
In file backend/optimizer/geqo/geqo_params.c, routines
gimme_pool_size and gimme_number_generations,
we have to find a compromise for the parameter settings
-
-
-
-
Find better solution for integer overflow
-
- In file backend/optimizer/geqo/geqo_eval.c, routine
- geqo_joinrel_size,
- the present hack for MAXINT overflow is to set the
Postgres integer
- value of rel->size to its logarithm.
- Modifications of Rel in backend/nodes/relation.h will
- surely have severe impacts on the whole
Postgres implementation.
-
-
-
-
-
Find solution for exhausted memory
- Memory exhaustion may occur with more than 10 relations involved in a query.
- In file backend/optimizer/geqo/geqo_eval.c, routine
- gimme_tree is recursively called.
- Maybe I forgot something to be freed correctly, but I dunno what.
- Of course the rel data structure of the
- join keeps growing and
- growing the more relations are packed into it.
- Suggestions are welcome :-(
-
-
-
References