Hi.
One of the issues we hit when trying to use sharding in PostgreSQL is
the absence of partial aggregate pushdown.
I see several opportunities to alleviate this issue.
Citus, for example, implements an aggregate that calculates the internal
state of an arbitrary aggregate function and exports it as text. So we
could calculate internal states independently on all data sources and
then finalize them, which allows computing an arbitrary aggregate.
But, as mentioned in thread [1], some functions (like
count/max/min/sum) can simply be pushed down as they are. This seems
easy and covers a lot of cases.
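
To illustrate why these cases are easy, here is a minimal Python sketch
(not the attached patch). It assumes every shard is non-empty and returns
its own count/sum/min/max; the names push_down and combine and the sample
data are hypothetical. The coordinator only has to re-aggregate the
per-shard results: counts and sums are added up, min and max are taken
over the partial values.

```python
# Hypothetical sample data: rows held by three shards (all non-empty).
shards = [[3, 1, 4], [1, 5], [9, 2, 6]]

def push_down(rows):
    # What each remote server would compute locally for its own rows.
    return {
        "count": len(rows),
        "sum": sum(rows),
        "min": min(rows),
        "max": max(rows),
    }

def combine(partials):
    # What the coordinator does with the per-shard results:
    # count -> sum of counts, sum -> sum of sums,
    # min -> min of mins, max -> max of maxes.
    return {
        "count": sum(p["count"] for p in partials),
        "sum": sum(p["sum"] for p in partials),
        "min": min(p["min"] for p in partials),
        "max": max(p["max"] for p in partials),
    }

partials = [push_down(rows) for rows in shards]
total = combine(partials)
```

No internal state has to cross the wire here: the value each shard
returns is itself a valid input for the combining step.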
For now there are still issues: for example, you can't handle functions
such as avg(), since we would somehow have to get its internal state, or
sum() variants that need aggserialfn/aggdeserialfn. A preliminary
version is attached.
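
The avg() problem can be seen in a small Python sketch. It assumes a
hypothetical (sum, count) pair as the state each shard sends back;
PostgreSQL's real internal state for avg() depends on the data type and
is not modeled here. All function names are made up for illustration.

```python
# Hypothetical sample data: two shards of different sizes.
shards = [[1, 2, 3, 4], [10]]

def avg_partial(rows):
    # Per-shard partial state: (running sum, row count).
    return (sum(rows), len(rows))

def avg_combine(a, b):
    # Merge two partial states on the coordinator.
    return (a[0] + b[0], a[1] + b[1])

def avg_final(state):
    # Turn the merged state into the final result; None for no rows.
    s, n = state
    return s / n if n else None

# Naive pushdown: average the per-shard averages. This is wrong
# whenever the shards hold different numbers of rows.
naive = sum(sum(rows) / len(rows) for rows in shards) / len(shards)

# Partial aggregation: ship the state, combine it, finalize once.
state = (0, 0)
for rows in shards:
    state = avg_combine(state, avg_partial(rows))
correct = avg_final(state)
```

With this data the naive result is 6.25, while the true average of the
five rows is 4.0, so the finished avg() value alone is not enough to
combine on the coordinator.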
Is anyone else working on this issue? Does the suggested approach make
sense?
[1]
https://www.postgresql.org/message-id/flat/9998c3af9fdb5f7d62a6c7ad0fcd9142%40postgrespro.ru
--
Best regards,
Alexander Pyhalov,
Postgres Professional