Commit a547e68

Adjust cost model for HashAgg that spills to disk.
Tomas Vondra observed that the IO behavior for HashAgg tends to be worse than for Sort. Penalize HashAgg IO costs accordingly.

Also, account for the CPU effort of spilling the tuples and reading them back.

Discussion: https://postgr.es/m/20200906212112.nzoy5ytrzjjodpfh@development
Reviewed-by: Tomas Vondra
Backpatch-through: 13
1 parent 53367e6 commit a547e68

1 file changed: +13 -0 lines changed

src/backend/optimizer/path/costsize.c

@@ -2416,6 +2416,7 @@ cost_agg(Path *path, PlannerInfo *root,
 		double		pages;
 		double		pages_written = 0.0;
 		double		pages_read = 0.0;
+		double		spill_cost;
 		double		hashentrysize;
 		double		nbatches;
 		Size		mem_limit;
@@ -2453,9 +2454,21 @@ cost_agg(Path *path, PlannerInfo *root,
 		pages = relation_byte_size(input_tuples, input_width) / BLCKSZ;
 		pages_written = pages_read = pages * depth;
 
+		/*
+		 * HashAgg has somewhat worse IO behavior than Sort on typical
+		 * hardware/OS combinations. Account for this with a generic penalty.
+		 */
+		pages_read *= 2.0;
+		pages_written *= 2.0;
+
 		startup_cost += pages_written * random_page_cost;
 		total_cost += pages_written * random_page_cost;
 		total_cost += pages_read * seq_page_cost;
+
+		/* account for CPU cost of spilling a tuple and reading it back */
+		spill_cost = depth * input_tuples * 2.0 * cpu_tuple_cost;
+		startup_cost += spill_cost;
+		total_cost += spill_cost;
 	}
 
 	/*
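
For readers who want to try the adjusted arithmetic outside the planner, the following is a minimal standalone sketch of the spill portion of the cost as it reads after this change. It is not the PostgreSQL code itself: the `SpillCostParams` and `SpillCost` structs and the `hashagg_spill_cost` function are hypothetical names introduced for illustration, the page count is passed in directly rather than derived via `relation_byte_size`, and the function returns only the increments that the spill branch adds to `startup_cost` and `total_cost`.

```c
/* Hypothetical container for the three planner cost parameters used here. */
typedef struct SpillCostParams
{
	double		random_page_cost;
	double		seq_page_cost;
	double		cpu_tuple_cost;
} SpillCostParams;

/* Hypothetical result type: increments added by the spill branch only. */
typedef struct SpillCost
{
	double		startup;
	double		total;
} SpillCost;

/*
 * Sketch of the spill cost arithmetic from the diff above.
 * pages:        input size in disk pages (assumed precomputed by the caller)
 * depth:        estimated number of spill passes
 * input_tuples: estimated number of input tuples
 */
SpillCost
hashagg_spill_cost(double pages, double depth, double input_tuples,
				   const SpillCostParams *p)
{
	SpillCost	c = {0.0, 0.0};
	double		pages_written = pages * depth;
	double		pages_read = pages * depth;
	double		cpu_spill;

	/* generic 2x penalty: HashAgg IO tends to behave worse than Sort IO */
	pages_read *= 2.0;
	pages_written *= 2.0;

	/* writes are charged at the random rate, reads at the sequential rate */
	c.startup += pages_written * p->random_page_cost;
	c.total += pages_written * p->random_page_cost;
	c.total += pages_read * p->seq_page_cost;

	/* CPU effort: each tuple is spilled once and read back once per pass */
	cpu_spill = depth * input_tuples * 2.0 * p->cpu_tuple_cost;
	c.startup += cpu_spill;
	c.total += cpu_spill;

	return c;
}
```

As an illustration, with 1000 pages, a depth of 1, 100000 input tuples, and the default settings `random_page_cost = 4.0`, `seq_page_cost = 1.0`, and `cpu_tuple_cost = 0.01`, the sketch adds roughly 10000 to the startup cost and 12000 to the total cost. Without the 2x penalty and the CPU term, the same inputs would add 4000 and 5000 respectively.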
