
Commit 72c2c97

akorotkov authored and Commitfest Bot committed
Use extended stats for precise estimation of bucket size in hash join
In real life, columns in a table often have functional dependencies, so PostgreSQL's estimate of the number of distinct values over a set of columns can be too low (or, much more rarely, too high) when dealing with a multi-clause JOIN. In the case of hash join, this can end up with a small predicted number of hash buckets and, as a result, with picking a non-optimal merge join.

To improve the situation, we introduce one additional stage of bucket size estimation: given two or more join clauses, the estimator looks up extended statistics and uses them for a multicolumn estimate. Clauses are grouped into lists, each containing expressions that reference the same relation. The result of the multicolumn estimation made over such a list is combined with the others according to the caller's logic. Clauses that are not estimated are returned to the caller for further estimation.

Discussion: https://postgr.es/m/52257607-57f6-850d-399a-ec33a654457b%40postgrespro.ru
Author: Andrei Lepikhov <[email protected]>
Reviewed-by: Andy Fan <[email protected]>
Reviewed-by: Tomas Vondra <[email protected]>
Reviewed-by: Alena Rybakina <[email protected]>
Reviewed-by: Alexander Korotkov <[email protected]>
1 parent 3c472a1 commit 72c2c97
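
A concrete illustration, not part of the commit message itself, built from the regression test below (numbers are approximate, since ANALYZE works from a sample): sb_2 holds gs % 49, gs % 51, and gs % 73, so the per-column ndistinct estimates are about 49, 51, and 73, and the best bucket fraction derivable from any single join clause is about 1/73, i.e. roughly 1.4% of the inner rows. In reality every (x, y, z) triple among the 10,000 rows is distinct (49, 51, and 73 are pairwise coprime and their product far exceeds 10,000), so a multivariate ndistinct statistic brings the estimate down to about 1/10,000 = 0.01% — two orders of magnitude smaller, which is what tips the planner from merge join to hash join.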

File tree: 5 files changed, +266 −1 lines changed

src/backend/optimizer/path/costsize.c (+11 −1)

@@ -4339,9 +4339,19 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path,
 	}
 	else
 	{
+		List	   *otherclauses;
+
 		innerbucketsize = 1.0;
 		innermcvfreq = 1.0;
-		foreach(hcl, hashclauses)
+
+		/* At first, try to estimate bucket size using extended statistics. */
+		otherclauses = estimate_multivariate_bucketsize(root,
+														inner_path->parent,
+														hashclauses,
+														&innerbucketsize);
+
+		/* Pass through the remaining clauses */
+		foreach(hcl, otherclauses)
 		{
 			RestrictInfo *restrictinfo = lfirst_node(RestrictInfo, hcl);
 			Selectivity thisbucketsize;
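
A note on the combination logic (my reading of the surrounding final_cost_hashjoin code, so treat it as an interpretation rather than part of the patch): innerbucketsize starts at 1.0, estimate_multivariate_bucketsize lowers it to 1/ndistinct for whatever clause groups it managed to estimate, and the remaining per-clause loop over otherclauses can only lower it further, one clause at a time. Keeping those two stages consistent is exactly the synchronization the new function's header comment warns about.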

src/backend/utils/adt/selfuncs.c (+177)

@@ -3765,6 +3765,183 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
 	return numdistinct;
 }
 
+/*
+ * Try to estimate the bucket size of the hash join inner side when the join
+ * condition contains two or more clauses by employing extended statistics.
+ *
+ * The main idea of this approach is that a multivariate estimate over two or
+ * more columns yields more distinct values, and hence a smaller bucket size,
+ * than an estimate over any one of those columns separately.
+ *
+ * IMPORTANT: It is crucial to synchronize the approach of combining different
+ * estimations with the caller's method.
+ *
+ * Return a list of clauses for which no extended statistics were found.
+ */
+List *
+estimate_multivariate_bucketsize(PlannerInfo *root, RelOptInfo *inner,
+								 List *hashclauses,
+								 Selectivity *innerbucketsize)
+{
+	List	   *clauses = list_copy(hashclauses);
+	List	   *otherclauses = NIL;
+	double		ndistinct = 1.0;
+
+	if (list_length(hashclauses) <= 1)
+
+		/*
+		 * Nothing to do for a single clause.  Could we employ univariate
+		 * extended stats here?
+		 */
+		return hashclauses;
+
+	while (clauses != NIL)
+	{
+		ListCell   *lc;
+		int			relid = -1;
+		List	   *varinfos = NIL;
+		List	   *origin_rinfos = NIL;
+		double		mvndistinct;
+		List	   *origin_varinfos;
+		int			group_relid = -1;
+		RelOptInfo *group_rel = NULL;
+		ListCell   *lc1,
+				   *lc2;
+
+		/*
+		 * Find clauses referencing the same single base relation and try to
+		 * estimate such a group with extended statistics.  Create a varinfo
+		 * for each approved clause; push a clause to otherclauses if it
+		 * can't be estimated here, or skip it to be processed on the next
+		 * iteration.
+		 */
+		foreach(lc, clauses)
+		{
+			RestrictInfo *rinfo = lfirst_node(RestrictInfo, lc);
+			Node	   *expr;
+			Relids		relids;
+			GroupVarInfo *varinfo;
+
+			/*
+			 * Find the inner side of the join, for which we need to estimate
+			 * the number of buckets.  Use outer_is_left because the
+			 * clause_sides_match_join routine has been called on the hash
+			 * clauses.
+			 */
+			relids = rinfo->outer_is_left ?
+				rinfo->right_relids : rinfo->left_relids;
+			expr = rinfo->outer_is_left ?
+				get_rightop(rinfo->clause) : get_leftop(rinfo->clause);
+
+			if (bms_get_singleton_member(relids, &relid) &&
+				root->simple_rel_array[relid]->statlist != NIL)
+			{
+				/*
+				 * This inner-side expression references only one relation,
+				 * so extended statistics on this clause may exist.
+				 */
+				if (group_relid < 0)
+				{
+					RangeTblEntry *rte = root->simple_rte_array[relid];
+
+					if (!rte || (rte->relkind != RELKIND_RELATION &&
+								 rte->relkind != RELKIND_MATVIEW &&
+								 rte->relkind != RELKIND_FOREIGN_TABLE &&
+								 rte->relkind != RELKIND_PARTITIONED_TABLE))
+					{
+						/* Extended statistics can't exist in principle */
+						otherclauses = lappend(otherclauses, rinfo);
+						clauses = foreach_delete_current(clauses, lc);
+						continue;
+					}
+
+					group_relid = relid;
+					group_rel = root->simple_rel_array[relid];
+				}
+				else if (group_relid != relid)
+
+					/*
+					 * Being in the group-forming state, we don't need
+					 * clauses from other relations.
+					 */
+					continue;
+
+				varinfo = (GroupVarInfo *) palloc(sizeof(GroupVarInfo));
+				varinfo->var = expr;
+				varinfo->rel = root->simple_rel_array[relid];
+				varinfo->ndistinct = 0.0;
+				varinfo->isdefault = false;
+				varinfos = lappend(varinfos, varinfo);
+
+				/*
+				 * Remember the link to the RestrictInfo in case the clause
+				 * fails to be estimated.
+				 */
+				origin_rinfos = lappend(origin_rinfos, rinfo);
+			}
+			else
+				/* This clause can't be estimated with extended statistics */
+				otherclauses = lappend(otherclauses, rinfo);
+
+			clauses = foreach_delete_current(clauses, lc);
+		}
+
+		if (list_length(varinfos) < 2)
+		{
+			/*
+			 * Multivariate statistics don't apply to a single column, except
+			 * for expressions, but that has not been implemented yet.
+			 */
+			otherclauses = list_concat(otherclauses, origin_rinfos);
+			list_free_deep(varinfos);
+			list_free(origin_rinfos);
+			continue;
+		}
+
+		Assert(group_rel != NULL);
+
+		/* Let's employ extended statistics. */
+		origin_varinfos = varinfos;
+		for (;;)
+		{
+			bool		estimated = estimate_multivariate_ndistinct(root,
+																	group_rel,
+																	&varinfos,
+																	&mvndistinct);
+
+			if (!estimated)
+				break;
+
+			/*
+			 * We've got an estimate.  Use the ndistinct value in a way
+			 * consistent with the caller's logic (see final_cost_hashjoin).
+			 */
+			if (ndistinct < mvndistinct)
+				ndistinct = mvndistinct;
+			Assert(ndistinct >= 1.0);
+		}
+
+		Assert(list_length(origin_varinfos) == list_length(origin_rinfos));
+
+		/* Skip already estimated clauses; pass the rest back to the caller */
+		forboth(lc1, origin_varinfos, lc2, origin_rinfos)
+		{
+			GroupVarInfo *vinfo = lfirst(lc1);
+
+			if (!list_member_ptr(varinfos, vinfo))
+				/* Already estimated */
+				continue;
+
+			/* Can't be estimated here - push it to the returning list */
+			otherclauses = lappend(otherclauses, lfirst(lc2));
+		}
+	}
+
+	*innerbucketsize = 1.0 / ndistinct;
+	return otherclauses;
+}
+
 /*
  * Estimate hash bucket statistics when the specified expression is used
  * as a hash key for the given number of buckets.
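
For reference, estimate_multivariate_ndistinct pulls these numbers from ndistinct-type extended statistics gathered by ANALYZE. A minimal SQL sketch for inspecting what the estimator will see (pg_stats_ext is the standard system view; extstat_sb_2 is the statistics object created in the regression test below):

SELECT statistics_name, n_distinct
FROM pg_stats_ext
WHERE statistics_name = 'extstat_sb_2';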

src/include/utils/selfuncs.h (+4)

@@ -218,6 +218,10 @@ extern double estimate_num_groups(PlannerInfo *root, List *groupExprs,
 								  double input_rows, List **pgset,
 								  EstimationInfo *estinfo);
 
+extern List *estimate_multivariate_bucketsize(PlannerInfo *root,
+											  RelOptInfo *inner,
+											  List *hashclauses,
+											  Selectivity *innerbucketsize);
 extern void estimate_hash_bucket_stats(PlannerInfo *root,
 									   Node *hashkey, double nbuckets,
 									   Selectivity *mcv_freq,

src/test/regress/expected/stats_ext.out (+45)

@@ -3383,3 +3383,48 @@ SELECT * FROM check_estimated_rows('
 (1 row)
 
 DROP TABLE grouping_unique;
+--
+-- Extended statistics on sb_2 (x, y, z) improve the bucket size estimate,
+-- and the optimizer may choose a hash join.
+--
+CREATE TABLE sb_1 AS
+  SELECT gs % 10 AS x, gs % 10 AS y, gs % 10 AS z
+  FROM generate_series(1, 1e4) AS gs;
+CREATE TABLE sb_2 AS
+  SELECT gs % 49 AS x, gs % 51 AS y, gs % 73 AS z, 'abc' || gs AS payload
+  FROM generate_series(1, 1e4) AS gs;
+ANALYZE sb_1, sb_2;
+-- During the hash join estimation, the number of distinct values is
+-- calculated for each column separately.  Whichever single-column estimate
+-- the optimizer picks, the resulting bucket size is quite big, because
+-- possible correlations between the columns are not taken into account.
+EXPLAIN (COSTS OFF) -- Choose merge join
+SELECT * FROM sb_1 a, sb_2 b WHERE a.x = b.x AND a.y = b.y AND a.z = b.z;
+                         QUERY PLAN
+-------------------------------------------------------------
+ Merge Join
+   Merge Cond: ((a.z = b.z) AND (a.x = b.x) AND (a.y = b.y))
+   ->  Sort
+         Sort Key: a.z, a.x, a.y
+         ->  Seq Scan on sb_1 a
+   ->  Sort
+         Sort Key: b.z, b.x, b.y
+         ->  Seq Scan on sb_2 b
+(8 rows)
+
+-- The ndistinct extended statistics on (x, y, z) provide a more reliable
+-- bucket size estimate.
+CREATE STATISTICS extstat_sb_2 (ndistinct) ON x, y, z FROM sb_2;
+ANALYZE sb_2;
+EXPLAIN (COSTS OFF) -- Choose hash join
+SELECT * FROM sb_1 a, sb_2 b WHERE a.x = b.x AND a.y = b.y AND a.z = b.z;
+                         QUERY PLAN
+------------------------------------------------------------
+ Hash Join
+   Hash Cond: ((a.x = b.x) AND (a.y = b.y) AND (a.z = b.z))
+   ->  Seq Scan on sb_1 a
+   ->  Hash
+         ->  Seq Scan on sb_2 b
+(5 rows)
+
+DROP TABLE sb_1, sb_2 CASCADE;
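
Why the plan flips: with extstat_sb_2 in place, ANALYZE records that (x, y, z) has close to 10,000 distinct combinations (all generated triples are distinct, since 49, 51, and 73 are pairwise coprime and 49 * 51 * 73 far exceeds the 10,000 rows), so the estimated bucket fraction shrinks from roughly 1/73 to roughly 1/10,000 and the hash join becomes the cheaper plan. Exact estimates can vary slightly between runs because ANALYZE samples the table.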

src/test/regress/sql/stats_ext.sql (+29)

@@ -1719,3 +1719,32 @@ SELECT * FROM check_estimated_rows('
   ON t1.t1 = q1.x;
 ');
 DROP TABLE grouping_unique;
+
+--
+-- Extended statistics on sb_2 (x, y, z) improve the bucket size estimate,
+-- and the optimizer may choose a hash join.
+--
+CREATE TABLE sb_1 AS
+  SELECT gs % 10 AS x, gs % 10 AS y, gs % 10 AS z
+  FROM generate_series(1, 1e4) AS gs;
+CREATE TABLE sb_2 AS
+  SELECT gs % 49 AS x, gs % 51 AS y, gs % 73 AS z, 'abc' || gs AS payload
+  FROM generate_series(1, 1e4) AS gs;
+ANALYZE sb_1, sb_2;
+
+-- During the hash join estimation, the number of distinct values is
+-- calculated for each column separately.  Whichever single-column estimate
+-- the optimizer picks, the resulting bucket size is quite big, because
+-- possible correlations between the columns are not taken into account.
+EXPLAIN (COSTS OFF) -- Choose merge join
+SELECT * FROM sb_1 a, sb_2 b WHERE a.x = b.x AND a.y = b.y AND a.z = b.z;
+
+-- The ndistinct extended statistics on (x, y, z) provide a more reliable
+-- bucket size estimate.
+CREATE STATISTICS extstat_sb_2 (ndistinct) ON x, y, z FROM sb_2;
+ANALYZE sb_2;
+
+EXPLAIN (COSTS OFF) -- Choose hash join
+SELECT * FROM sb_1 a, sb_2 b WHERE a.x = b.x AND a.y = b.y AND a.z = b.z;
+
+DROP TABLE sb_1, sb_2 CASCADE;
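
A note for anyone adapting this test by hand: the check_estimated_rows helper defined earlier in stats_ext.sql verifies row-count estimates only; the bucket size improvement shows up in the plan choice itself, which is why this test relies on the EXPLAIN (COSTS OFF) output flipping from merge join to hash join.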
