-
+
Server Configuration
not had a column-specific target set via ALTER TABLE
SET STATISTICS. Larger values increase the time needed to
do ANALYZE, but might improve the quality of the
- planner's estimates. The default is 10. For more information
+ planner's estimates. The default is 100. For more information
on the use of statistics by the
PostgreSQL
query planner, refer to .
-
+
Performance Tips
column-by-column basis using the ALTER TABLE SET STATISTICS
command, or globally by setting the
configuration variable.
- The default limit is presently 10 entries. Raising the limit
+ The default limit is presently 100 entries. Raising the limit
might allow more accurate planner estimates to be made, particularly for
columns with irregular data distributions, at the price of consuming
more space in pg_statistic and slightly more
This form
sets the per-column statistics-gathering target for subsequent
operations.
- The target can be set in the range 0 to 1000; alternatively, set it
+ The target can be set in the range 0 to 10000; alternatively, set it
to -1 to revert to using the system default statistics
target ().
For more information on the use of statistics by the
will change slightly each time ANALYZE is run,
even if the actual table contents did not change. This might result
in small changes in the planner's estimated costs shown by
- . In rare situations, this
- non-determinism will cause the query optimizer to choose a
- different query plan between runs of ANALYZE. To
- avoid this, raise the amount of statistics collected by
+ .
+ In rare situations, this non-determinism will cause the planner's
+ choices of query plans to change after ANALYZE is run.
+ To avoid this, raise the amount of statistics collected by
ANALYZE, as described below.
endterm="sql-altertable-title">). The target value sets the
maximum number of entries in the most-common-value list and the
maximum number of bins in the histogram. The default target value
- is 10, but this can be adjusted up or down to trade off accuracy of
+ is 100, but this can be adjusted up or down to trade off accuracy of
planner estimates against the time taken for
ANALYZE and the amount of space occupied in
pg_statistic. In particular, setting the
*
*
* IDENTIFICATION
- * $PostgreSQL: pgsql/src/backend/commands/analyze.c,v 1.128 2008/11/10 00:49:37 tgl Exp $
+ * $PostgreSQL: pgsql/src/backend/commands/analyze.c,v 1.129 2008/12/13 19:13:44 tgl Exp $
*
*-------------------------------------------------------------------------
*/
/* Default statistics target (GUC parameter) */
-int default_statistics_target = 10;
+int default_statistics_target = 100;
/* A few variables that don't seem worth passing around as parameters */
static int elevel = -1;
* error in bin size f, and error probability gamma, the minimum
* random sample size is
* r = 4 * k * ln(2*n/gamma) / f^2
- * Taking f = 0.5, gamma = 0.01, n = 1 million rows, we obtain
+ * Taking f = 0.5, gamma = 0.01, n = 10^6 rows, we obtain
* r = 305.82 * k
* Note that because of the log function, the dependence on n is
- * quite weak; even at n = 1 billion, a 300*k sample gives <= 0.59
+ * quite weak; even at n = 10^12, a 300*k sample gives <= 0.66
* bin size error with probability 0.99. So there's no real need to
* scale for n, which is a good thing because we don't necessarily
* know it at this point.
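
The figures quoted in the comment above can be checked by working the arithmetic through. This is only a sketch: the second line solves the stated formula for f at a fixed sample of r = 300*k, a rearrangement made here for illustration and not part of the patch.

```latex
\begin{align*}
r &= \frac{4\,k\,\ln(2n/\gamma)}{f^{2}} \\
f = 0.5,\ \gamma = 0.01,\ n = 10^{6}:\quad
r &= \frac{4\,k\,\ln(2\times 10^{8})}{0.25}
   \approx 16 \times 19.114\,k \approx 305.82\,k \\
r = 300\,k,\ \gamma = 0.01,\ n = 10^{12}:\quad
f &= \sqrt{\frac{4\,\ln(2\times 10^{14})}{300}}
   \approx \sqrt{\frac{4 \times 32.929}{300}} \approx 0.66
\end{align*}
```

Growing n from 10^6 to 10^12 moves the error bound for a 300*k sample only from roughly 0.5 to about 0.66, which is the weak dependence on n that the comment describes.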
*
*
* IDENTIFICATION
- * $PostgreSQL: pgsql/src/backend/commands/tablecmds.c,v 1.272 2008/12/06 23:22:46 momjian Exp $
+ * $PostgreSQL: pgsql/src/backend/commands/tablecmds.c,v 1.273 2008/12/13 19:13:44 tgl Exp $
*
*-------------------------------------------------------------------------
*/
errmsg("statistics target %d is too low",
newtarget)));
}
- else if (newtarget > 1000)
+ else if (newtarget > 10000)
{
- newtarget = 1000;
+ newtarget = 10000;
ereport(WARNING,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("lowering statistics target to %d",
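
The range check in this hunk can be sketched as a standalone function. The names check_statistics_target and STATS_TARGET_MAX are hypothetical, fprintf stands in for ereport, and the lower bound of -1 is an assumption taken from the ALTER TABLE documentation text earlier in this patch (the condition itself is not visible in the hunk).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical name for the new upper limit introduced by the patch. */
enum { STATS_TARGET_MAX = 10000 };

/*
 * Validate a requested per-column statistics target.
 * Returns false if the value is rejected; otherwise stores the accepted
 * (possibly lowered) target in *accepted and returns true.
 * Assumption: -1 means "revert to the system default" and is accepted.
 */
static bool
check_statistics_target(int requested, int *accepted)
{
	assert(accepted != NULL);

	if (requested < -1)
	{
		/* stands in for ereport(ERROR, ...) */
		fprintf(stderr, "ERROR: statistics target %d is too low\n", requested);
		return false;
	}
	if (requested > STATS_TARGET_MAX)
	{
		/* over-large values are lowered with a warning, not rejected */
		requested = STATS_TARGET_MAX;
		fprintf(stderr, "WARNING: lowering statistics target to %d\n", requested);
	}
	*accepted = requested;
	return true;
}
```

With this sketch, a request of 20000 is accepted as 10000 with a warning, while -2 is rejected outright.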
*
*
* IDENTIFICATION
- * $PostgreSQL: pgsql/src/backend/tsearch/ts_typanalyze.c,v 1.3 2008/11/27 21:17:39 heikki Exp $
+ * $PostgreSQL: pgsql/src/backend/tsearch/ts_typanalyze.c,v 1.4 2008/12/13 19:13:44 tgl Exp $
*
*-------------------------------------------------------------------------
*/
attr->attstattarget = default_statistics_target;
stats->compute_stats = compute_tsvector_stats;
- /* see comment about the choice of minrows from analyze.c */
+ /* see comment about the choice of minrows in commands/analyze.c */
stats->minrows = 300 * attr->attstattarget;
PG_RETURN_BOOL(true);
* is no more than a few times w.
*
* We use a hashtable for the D structure and a bucket width of
- * statistic_target * 100, where 100 is an arbitrarily chosen constant, meant
- * to approximate the number of lexemes in a single tsvector.
+ * statistics_target * 100, where 100 is an arbitrarily chosen constant,
+ * meant to approximate the number of lexemes in a single tsvector.
*/
static void
compute_tsvector_stats(VacAttrStats *stats,
LexemeHashKey hash_key;
TrackItem *item;
- /* We want statistic_target * 100 lexemes in the MCELEM array */
+ /* We want statistics_target * 100 lexemes in the MCELEM array */
num_mcelem = stats->attr->attstattarget * 100;
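
To make the effect of the larger default concrete, the two multipliers used in these hunks (300 rows per unit of target for minrows, 100 lexemes per unit of target for the MCELEM array) can be sketched as plain functions. The function names are hypothetical and exist only for illustration.

```c
#include <assert.h>

/* Minimum rows to sample for one column: 300 * target, as in minrows. */
static long
sample_rows_for_target(int stats_target)
{
	assert(stats_target >= 0);
	return 300L * stats_target;
}

/* Lexemes wanted in the MCELEM array of a tsvector column: target * 100. */
static long
mcelem_for_target(int stats_target)
{
	assert(stats_target >= 0);
	return 100L * stats_target;
}
```

Under the old default of 10 the sample is 3000 rows per column; under the new default of 100 it is 30000 rows, and a tsvector column tracks 10000 lexemes.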
/*
* Written by Peter Eisentraut.
*
* IDENTIFICATION
- * $PostgreSQL: pgsql/src/backend/utils/misc/guc.c,v 1.482 2008/12/02 02:00:32 alvherre Exp $
+ * $PostgreSQL: pgsql/src/backend/utils/misc/guc.c,v 1.483 2008/12/13 19:13:44 tgl Exp $
*
*--------------------------------------------------------------------
*/
"column-specific target set via ALTER TABLE SET STATISTICS.")
},
&default_statistics_target,
- 10, 1, 1000, NULL, NULL
+ 100, 1, 10000, NULL, NULL
},
{
{"from_collapse_limit", PGC_USERSET, QUERY_TUNING_OTHER,
# - Other Planner Options -
-#default_statistics_target = 10 # range 1-1000
+#default_statistics_target = 100 # range 1-10000
#constraint_exclusion = off
#cursor_tuple_fraction = 0.1 # range 0.0-1.0
#from_collapse_limit = 8