Until now, ExecChooseHashTableSize() considered only the size of the
in-memory hash table when picking the nbatch value, and completely
ignored the memory needed for the batch files. That memory can be
substantial, because each batch needs two BufFiles (each with a BLCKSZ
buffer). The same applies when increasing the number of batches during
execution.
With enough batches, the batch files may use orders of magnitude more
memory than the in-memory hash table, but the sizing logic is entirely
oblivious to this.
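To illustrate the scale, here is a rough sketch (not part of the patch;
the helper name is mine) of the buffer overhead implied by the two
BufFiles per batch described above:

    #include <stddef.h>

    #ifndef BLCKSZ
    #define BLCKSZ 8192             /* default PostgreSQL block size */
    #endif

    /* Memory consumed just by the per-batch file buffers: one inner
     * and one outer BufFile per batch, each with a BLCKSZ buffer. */
    static size_t
    batch_buffer_memory(size_t nbatch)
    {
        return nbatch * 2 * BLCKSZ;
    }

With nbatch = 1,048,576 this comes to 2 * 1,048,576 * 8kB = 16GB of
buffer memory, no matter how small the hash table limit is.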
It's also possible to trigger a "batch explosion", e.g. due to duplicate
values or skew in general. We've seen reports of joins with hundreds of
thousands (or even millions) of batches, consuming gigabytes of memory
and triggering OOM errors. These cases are fairly rare, but it's clearly
possible to hit them.
We can't prevent this during planning - better estimates would help,
but they do nothing for batch explosions that happen at execution time.
What we can do is reduce the impact by using as little memory as
possible.
This patch improves memory usage by rebalancing how the memory is
divided between the in-memory hash table and the batch files. Sometimes
it's better to use fewer batch files, even if that means the hash table
exceeds the limit.
Whenever we need to increase the capacity of the hash node, we can do
that by either doubling the number of batches or doubling the size of
the in-memory hash table. The outcome is the same, allowing the hash
node to handle a relation twice the size. But the memory usage may be
very different - for low nbatch values it's better to add batches, for
high nbatch values it's better to allow a larger hash table.
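A minimal sketch of that trade-off (simplified; the function name and
signature are mine, reusing the batch_buffer_memory() helper from the
sketch above, not the actual patch code):

    /* Double the capacity of the hash node in whichever way adds less
     * memory: doubling nbatch adds nbatch * 2 * BLCKSZ of file
     * buffers, doubling the hash table limit adds space_allowed. */
    static void
    double_capacity(size_t *space_allowed, size_t *nbatch)
    {
        if (batch_buffer_memory(*nbatch) < *space_allowed)
            *nbatch *= 2;           /* batch buffers still cheap */
        else
            *space_allowed *= 2;    /* buffers dominate, grow the table */
    }

The effect is that the two components stay roughly balanced, so the
total memory grows as slowly as possible as the input gets larger.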
This might seem like relaxing the memory limit - but that's not really
the case. The total memory used could always exceed the limit; the
memory used by the batch files was simply ignored, as if the files were
free. This commit improves the situation by accounting for that memory
when adjusting the nbatch value.
Increasing the hash table memory limit may also help to prevent the
batch explosion in the first place. Given enough hash collisions or
duplicate hashes, it's easy to get a batch that can't be split,
resulting in a cycle of quickly doubling the number of batches.
Allowing the hash table to get larger may stop this, once the limit is
large enough to fit the skewed data.