Lists: | pgsql-hackers |
---|
From: | Thomas <thomasmannhart97(at)gmail(dot)com> |
---|---|
To: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Cc: | boehlen(at)ifi(dot)uzh(dot)ch, dignoes(at)inf(dot)unibz(dot)it, gamper(at)inf(dot)unibz(dot)it, P(dot)Moser(at)noi(dot)bz(dot)it |
Subject: | Patch: Range Merge Join |
Date: | 2021-06-09 15:05:35 |
Message-ID: | CAMWfgiAsJgVrkbrsv740Y2=2duO4rVYRhaD08EhFqBuJFmBH1A@mail.gmail.com |
Hi Hackers,
More than a year ago we submitted a patch that offered two primitives
(ALIGN and NORMALIZE) to support the processing of temporal data with range
types. During the ensuing discussion we decided to withdraw the original
patch and to split it into smaller parts.
In the context of my BSc thesis, we started implementing a Range Merge Join
(RMJ), which is key for most temporal operations. The RMJ is a useful
operator in its own right, and any future temporal extension would benefit
greatly from it.
We have implemented the Range Merge Join algorithm by extending the
existing Merge Join to also support range conditions, i.e., BETWEEN-AND
or @> (containment for range types). Range joins contain a containment
condition and may have (optional) equality conditions. For example, the
following query joins employees, each with a department and a work period,
with the events that take place on a specific day in that department:
SELECT emps.name, emps.dept, events.event, events.day
FROM emps JOIN events ON emps.dept = events.dept
AND events.day <@ emps.eperiod;
The resulting query plan is as follows:
QUERY PLAN
----------------------------------------------------------------------------------------------
 Range Merge Join (cost=106.73..118.01 rows=3 width=100) (actual rows=6 loops=1)
   Merge Cond: (emps.dept = events.dept)
   Range Cond: (events.day <@ emps.eperiod)
   -> Sort (cost=46.87..48.49 rows=650 width=96) (actual rows=5 loops=1)
         Sort Key: emps.dept, emps.eperiod
         Sort Method: quicksort Memory: 25kB
         -> Seq Scan on emps (cost=0.00..16.50 rows=650 width=96) (actual rows=5 loops=1)
   -> Sort (cost=59.86..61.98 rows=850 width=68) (actual rows=6 loops=1)
         Sort Key: events.dept, events.day
         Sort Method: quicksort Memory: 25kB
         -> Seq Scan on events (cost=0.00..18.50 rows=850 width=68) (actual rows=5 loops=1)
 Planning Time: 0.077 ms
 Execution Time: 0.092 ms
(13 rows)
Example queries and instances of tables can be found at the end of the mail.
The range merge join works with range types using <@, and also with scalar
data types using "a.ts BETWEEN b.ts AND b.te" or "b.ts <= a.ts AND a.ts <= b.te".
Currently, PostgreSQL does not provide a specialized join algorithm for range
conditions: besides index nested loops, it offers only Hash Join and Merge
Join, which evaluate equality conditions only.
Our idea is to have a separate range_cond besides the merge_cond for the
Merge Join that stores the potential range conditions of a query. The state
diagram of the Merge Join is then extended to also take the range_cond into
consideration. See the simplified state diagram of the Range Merge Join as
an extension of the Merge Join in the attachment. Apart from one boolean
check, these additions have no effect on the Merge Join when no range
condition is present.
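To illustrate how a range condition can be folded into a merge-style scan, here is a minimal sketch in C for the scalar case without an equality condition. It is not code from the patch: the array-based inputs and the names Range, Match and range_merge_join are hypothetical, and the executor's state machine is reduced to a single "mark" position that only moves forward because both inputs are sorted.

```c
#include <stddef.h>

/* Hypothetical row of the range input: closed interval [lo, hi]. */
typedef struct { double lo; double hi; } Range;

/* Hypothetical result row: indexes into the two inputs. */
typedef struct { size_t range_idx; size_t elem_idx; } Match;

/*
 * Join every element e with every range r where r.lo <= e <= r.hi.
 * Preconditions (assumed, not checked): ranges[] is sorted by lo and
 * elems[] is sorted ascending.  Writes at most cap matches to out[]
 * (out may be NULL when cap is 0) and returns the total number of matches.
 */
size_t range_merge_join(const Range *ranges, size_t n_ranges,
                        const double *elems, size_t n_elems,
                        Match *out, size_t cap)
{
    size_t mark = 0;    /* first element that can still match a later range */
    size_t n = 0;

    for (size_t r = 0; r < n_ranges; r++) {
        /* Elements below this range's start are below all later starts too. */
        while (mark < n_elems && elems[mark] < ranges[r].lo)
            mark++;

        /* Scan forward from the mark until the range condition fails. */
        for (size_t e = mark; e < n_elems && elems[e] <= ranges[r].hi; e++) {
            if (n < cap) {
                out[n].range_idx = r;
                out[n].elem_idx = e;
            }
            n++;
        }
    }
    return n;
}
```

The mark never moves backwards, while the inner scan restarts from the mark for each range. This is in the spirit of the mark/restore mechanism that the existing Merge Join uses for duplicate keys.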
We provide extensive testing results and further information, including the
full BSc thesis (technical report), which describes the implementation and
tests in detail, at http://tpg.inf.unibz.it/project-rmj and
http://tpg.inf.unibz.it/downloads/rmj-report.pdf.
We performed several experiments and show that, depending on the selectivity
of the range condition, the range merge join outperforms existing execution
algorithms by up to an order of magnitude. Because the planner has to extract
the range_cond from inequalities, planning time increases, but only by a very
small amount in some TPC-H queries (see Table 5.3 in the technical report),
and in general only by a very small amount even for a large number of joins
or many inequality conditions (see Figure 5.1). To check the overhead of our
extension on the execution time of the traditional merge join, we executed
the TPC-H queries using the merge join (hash join disabled) and found no
statistically significant difference (see Table 5.4).
We are looking forward to your feedback and any suggestions to improve the
patch.
Best Regards,
Thomas Mannhart
Attachments: State Diagram and Patch
OPEN POINTS AND TODOs:
- Currently we do not consider parallelization
- Not all cases for input sort orders are considered yet
EXAMPLE QUERIES:
The first query uses a range condition using BETWEEN AND only and no
equality condition.
----------------------------------------------------------------------------------------------
DROP TABLE IF EXISTS marks;
DROP TABLE IF EXISTS grades;
CREATE TABLE marks (name text, snumber numeric, mark numeric);
CREATE TABLE grades (mmin numeric, mmax numeric, grade numeric);
INSERT INTO marks (name, snumber, mark) VALUES
('Anton', 1232, 23.5),
('Thomas', 4356, 95),
('Michael', 1125, 72),
('Hans', 3425, 90);
INSERT INTO grades (mmin, mmax, grade) VALUES
(0.0, 18, 1),
(18.5, 36, 2),
(36.5, 54, 3),
(54.5, 72, 4),
(72.5, 90, 5),
(90.5, 100, 6);
EXPLAIN(ANALYZE, TIMING FALSE)
SELECT marks.name, marks.snumber, grades.grade
FROM marks JOIN grades ON marks.mark BETWEEN grades.mmin AND grades.mmax;
QUERY PLAN
-----------------------------------------------------------------------------------------------
 Range Merge Join (cost=93.74..920.13 rows=46944 width=96) (actual rows=16 loops=1)
   Range Cond: ((marks.mark >= grades.mmin) AND (marks.mark <= grades.mmax))
   -> Sort (cost=46.87..48.49 rows=650 width=96) (actual rows=12 loops=1)
         Sort Key: grades.mmin
         Sort Method: quicksort Memory: 25kB
         -> Seq Scan on grades (cost=0.00..16.50 rows=650 width=96) (actual rows=12 loops=1)
   -> Sort (cost=46.87..48.49 rows=650 width=96) (actual rows=21 loops=1)
         Sort Key: marks.mark
         Sort Method: quicksort Memory: 25kB
         -> Seq Scan on marks (cost=0.00..16.50 rows=650 width=96) (actual rows=8 loops=1)
 Planning Time: 0.078 ms
 Execution Time: 0.068 ms
(12 rows)
----------------------------------------------------------------------------------------------
The second query uses a range and an equality condition and joins the
relations using contained in (<@).
----------------------------------------------------------------------------------------------
DROP TABLE IF EXISTS emps;
DROP TABLE IF EXISTS events;
CREATE TABLE emps (name text, dept text, eperiod daterange);
CREATE TABLE events (event text, dept text, day date);
INSERT INTO emps (name, dept, eperiod) VALUES
('Anton', 'Sales', '(2020-01-01, 2020-03-31)'),
('Thomas', 'Marketing', '(2020-01-01, 2020-06-30)'),
('Michael', 'Marketing', '(2020-03-01, 2020-12-31)'),
('Hans', 'Sales', '(2020-01-01, 2020-12-31)'),
('Thomas', 'Accounting', '(2020-07-01, 2020-12-31)');
INSERT INTO events (event, dept, day) VALUES
('Fair CH', 'Marketing', '2020-03-05'),
('Presentation', 'Sales', '2020-06-15'),
('Fair IT', 'Marketing', '2020-08-03'),
('Balance Report', 'Accounting', '2020-08-03'),
('Product launch', 'Marketing', '2020-10-15');
EXPLAIN(ANALYZE, TIMING FALSE)
SELECT emps.name, emps.dept, events.event, events.day
FROM emps JOIN events ON emps.dept = events.dept
AND events.day <@ emps.eperiod;
QUERY PLAN
----------------------------------------------------------------------------------------------
 Range Merge Join (cost=106.73..118.01 rows=3 width=100) (actual rows=6 loops=1)
   Merge Cond: (emps.dept = events.dept)
   Range Cond: (events.day <@ emps.eperiod)
   -> Sort (cost=46.87..48.49 rows=650 width=96) (actual rows=5 loops=1)
         Sort Key: emps.dept, emps.eperiod
         Sort Method: quicksort Memory: 25kB
         -> Seq Scan on emps (cost=0.00..16.50 rows=650 width=96) (actual rows=5 loops=1)
   -> Sort (cost=59.86..61.98 rows=850 width=68) (actual rows=6 loops=1)
         Sort Key: events.dept, events.day
         Sort Method: quicksort Memory: 25kB
         -> Seq Scan on events (cost=0.00..18.50 rows=850 width=68) (actual rows=5 loops=1)
 Planning Time: 0.077 ms
 Execution Time: 0.092 ms
(13 rows)
----------------------------------------------------------------------------------------------
Attachment | Content-Type | Size |
---|---|---|
postgres-rmj.patch | application/octet-stream | 48.9 KB |
simple-RMJ-annotated.pdf | application/pdf | 55.0 KB |
From: | David Rowley <dgrowleyml(at)gmail(dot)com> |
---|---|
To: | Thomas <thomasmannhart97(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com> |
Cc: | PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, boehlen(at)ifi(dot)uzh(dot)ch, dignoes(at)inf(dot)unibz(dot)it, gamper(at)inf(dot)unibz(dot)it, P(dot)Moser(at)noi(dot)bz(dot)it |
Subject: | Re: Patch: Range Merge Join |
Date: | 2021-06-10 03:09:54 |
Message-ID: | CAApHDvoyAAkJjZ=DxyOMDhSNDg3Fo799OAF6c8gzfLWBMNaK0Q@mail.gmail.com |
On Thu, 10 Jun 2021 at 03:05, Thomas <thomasmannhart97(at)gmail(dot)com> wrote:
> We have implemented the Range Merge Join algorithm by extending the
> existing Merge Join to also support range conditions, i.e., BETWEEN-AND
> or @> (containment for range types).
It shouldn't be a blocker for you, but just so you're aware, there was
a previous proposal for this in [1] and a patch in [2]. I've included
Jeff here just so he's aware of this. Jeff may wish to state his
intentions with his own patch. It's been a few years now.
I only just glanced over the patch. I'd suggest getting rid of the /*
Thomas */ comments. We use git, so if you need an audit trail about
changes then you'll find it in git blame. If you have those for an
internal audit trail then you should consider using git. No committer
would commit those to PostgreSQL, so they might as well disappear.
For further review, please add the patch to the July commitfest [3].
We should be branching for pg15 sometime before the start of July.
There will be more focus on new patches around that time. Further
details in [4].
Also, I see this is your first post to this list, so welcome, and
thank you for the contribution. Also, just to set expectations;
patches like this almost always take a while to get into shape for
PostgreSQL. Please expect a lot of requests to change things. That's
fairly standard procedure. The process often drags on for months and
in some less common cases, years.
David
[1] https://www.postgresql.org/message-id/flat/6227.1334559170%40sss.pgh.pa.us#82c771950ba486dec911923a5e910000
[2] https://www.postgresql.org/message-id/flat/CAMp0ubfwAFFW3O_NgKqpRPmm56M4weTEXjprb2gP_NrDaEC4Eg%40mail.gmail.com
[3] https://commitfest.postgresql.org/33/
[4] https://wiki.postgresql.org/wiki/CommitFest
From: | Thomas <thomasmannhart97(at)gmail(dot)com> |
---|---|
To: | David Rowley <dgrowleyml(at)gmail(dot)com> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org, boehlen(at)ifi(dot)uzh(dot)ch, dignoes(at)inf(dot)unibz(dot)it, gamper(at)inf(dot)unibz(dot)it, P(dot)Moser(at)noi(dot)bz(dot)it |
Subject: | Re: Patch: Range Merge Join |
Date: | 2021-06-10 09:04:08 |
Message-ID: | CAMWfgiB2zdoXfKuA+UNgX_ZqFJRDAczMFqeRdYD4nmu4-KSL0A@mail.gmail.com |
Thank you for the feedback.
I removed the redundant comments from the patch and added this thread to
the July CF [1].
Best Regards,
Thomas Mannhart
[1] https://commitfest.postgresql.org/33/3160/
Attachment | Content-Type | Size |
---|---|---|
postgres-rmj.patch | application/octet-stream | 46.8 KB |
From: | Jeff Davis <pgsql(at)j-davis(dot)com> |
---|---|
To: | David Rowley <dgrowleyml(at)gmail(dot)com>, Thomas <thomasmannhart97(at)gmail(dot)com> |
Cc: | PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, boehlen(at)ifi(dot)uzh(dot)ch, dignoes(at)inf(dot)unibz(dot)it, gamper(at)inf(dot)unibz(dot)it, P(dot)Moser(at)noi(dot)bz(dot)it |
Subject: | Re: Patch: Range Merge Join |
Date: | 2021-06-11 02:14:32 |
Message-ID: | [email protected] |
On Thu, 2021-06-10 at 15:09 +1200, David Rowley wrote:
> It shouldn't be a blocker for you, but just so you're aware, there
> was
> a previous proposal for this in [1] and a patch in [2]. I've included
> Jeff here just so he's aware of this. Jeff may wish to state his
> intentions with his own patch. It's been a few years now.
Great, thank you for working on this!
I'll start with the reason I set the work down before: it did not work
well with multiple join keys. That might be fine, but I also started
thinking it was specialized enough that I wanted to look into doing it
as an extension using the CustomScan mechanism.
Do you have any solution to working better with multiple join keys? And
do you have thoughts on whether it would be a good candidate for the
CustomScan extension mechanism, which would make it easier to
experiment with?
Regards,
Jeff Davis
From: | Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec> |
---|---|
To: | Jeff Davis <pgsql(at)j-davis(dot)com> |
Cc: | David Rowley <dgrowleyml(at)gmail(dot)com>, Thomas <thomasmannhart97(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, boehlen(at)ifi(dot)uzh(dot)ch, dignoes(at)inf(dot)unibz(dot)it, gamper(at)inf(dot)unibz(dot)it, P(dot)Moser(at)noi(dot)bz(dot)it |
Subject: | Re: Patch: Range Merge Join |
Date: | 2021-10-04 21:27:54 |
Message-ID: | 20211004212754.GA30366@ahch-to |
On Thu, Jun 10, 2021 at 07:14:32PM -0700, Jeff Davis wrote:
> On Thu, 2021-06-10 at 15:09 +1200, David Rowley wrote:
> > It shouldn't be a blocker for you, but just so you're aware, there
> > was
> > a previous proposal for this in [1] and a patch in [2]. I've included
> > Jeff here just so he's aware of this. Jeff may wish to state his
> > intentions with his own patch. It's been a few years now.
>
> Great, thank you for working on this!
>
> I'll start with the reason I set the work down before: it did not work
> well with multiple join keys. That might be fine, but I also started
> thinking it was specialized enough that I wanted to look into doing it
> as an extension using the CustomScan mechanism.
>
> Do you have any solution to working better with multiple join keys? And
> do you have thoughts on whether it would be a good candidate for the
> CustomScan extension mechanism, which would make it easier to
> experiment with?
>
Hi,
It seems this has been stalled since June 2021. I intend to mark this as
RwF (Returned with Feedback) unless someone speaks up in the next hour or so.
--
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL
From: | Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec> |
---|---|
To: | Jeff Davis <pgsql(at)j-davis(dot)com> |
Cc: | David Rowley <dgrowleyml(at)gmail(dot)com>, Thomas <thomasmannhart97(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, boehlen(at)ifi(dot)uzh(dot)ch, dignoes(at)inf(dot)unibz(dot)it, gamper(at)inf(dot)unibz(dot)it, P(dot)Moser(at)noi(dot)bz(dot)it |
Subject: | Re: Patch: Range Merge Join |
Date: | 2021-10-05 00:30:34 |
Message-ID: | 20211005003034.GA30929@ahch-to |
> On Mon, Oct 04, 2021 at 04:27:54PM -0500, Jaime Casanova wrote:
>> On Thu, Jun 10, 2021 at 07:14:32PM -0700, Jeff Davis wrote:
>> >
>> > I'll start with the reason I set the work down before: it did not work
>> > well with multiple join keys. That might be fine, but I also started
>> > thinking it was specialized enough that I wanted to look into doing it
>> > as an extension using the CustomScan mechanism.
>> >
>> > Do you have any solution to working better with multiple join keys? And
>> > do you have thoughts on whether it would be a good candidate for the
>> > CustomScan extension mechanism, which would make it easier to
>> > experiment with?
>> >
>>
>> Hi,
>>
>> It seems this has been stalled since June 2021. I intend to mark this as
>> RwF (Returned with Feedback) unless someone speaks up in the next hour or so.
>>
Thomas <thomasmannhart97(at)gmail(dot)com> wrote to me:
> Hi,
>
> I registered this patch for the July commitfest. It was not reviewed and was moved to the next CF. I would still like to submit it.
>
> Regards,
> Thomas
>
Just to clarify, RwF does not imply rejection of the patch.
Nevertheless, given that there has been no real review, I will mark this
patch as "Waiting on Author" and move it to the next CF.
Meanwhile, cfbot (aka http://commitfest.cputube.org) says this doesn't
compile. Here is a little patch to fix the compilation errors; with it
applied, the patch passes all tests in make check-world.
I have also attached a rebased version of your patch with the fixes, so that
the cfbot entry turns green again.
--
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL
Attachment | Content-Type | Size |
---|---|---|
rmj_fixes_to_compile.txt | text/plain | 1002 bytes |
postgres-rmj-20211004.patch | text/x-diff | 46.6 KB |
From: | Thomas <thomasmannhart97(at)gmail(dot)com> |
---|---|
To: | Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>, pgsql(at)j-davis(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Cc: | dignoes(at)inf(dot)unibz(dot)it, boehlen(at)ifi(dot)uzh(dot)ch, p(dot)moser(at)noi(dot)bz(dot)it, gamper(at)inf(dot)unibz(dot)it, Thomas Mannhart <thomas_m(at)hotmail(dot)ch> |
Subject: | Re: Patch: Range Merge Join |
Date: | 2021-11-10 14:03:55 |
Message-ID: | CAMWfgiDDvHqN2STsCyKj9oyBJ1yCQdDOp_6Jc0xupuJ8fn9xtQ@mail.gmail.com |
Dear all,
thanks for the feedback!
We had a closer look at the previous patches and the CustomScan
infrastructure.
Compared to the previous patch, we do not (directly) focus on joins
with the overlap (&&) condition in this patch. Instead we consider
joins with containment (@>) between a range and an element, and joins
with conditions over scalars of the form "right.element BETWEEN
left.start AND left.end", and more generally left.start <(=)
right.element AND right.element <(=) left.end. We call such conditions
range conditions and these conditions can be combined with equality
conditions in the Range Merge Join.
The Range Merge Join can use (optional) equality conditions and one
range condition of the form shown above. In this case the inputs are
sorted first by the attributes used for equality and then one input by
the range (or start in the case of scalars) and the other input by the
element. The Range Merge Join is then a simple extension of the Merge
Join that in addition to the (optional) equality attributes also uses
the range condition in the merge join states. This is similar to an
index-nested loop with scalars for cases when the relation containing
the element has an index on the equality attributes followed by the
element. The Range Merge Join uses sorting and thus does not require
the index for this purpose and performs better.
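The sort orders described above can be sketched as follows. This is a simplified illustration in C, not the patch's executor code: the Emp and Event structs, the integer encoding of departments and dates (yyyymmdd), and the function names are assumptions made for the example, and range bounds are treated as inclusive.

```c
#include <stddef.h>
#include <stdlib.h>

/* Range input (hypothetical layout): valid over the closed interval [start, end]. */
typedef struct { int dept; int start; int end; } Emp;

/* Element input (hypothetical layout): a single day. */
typedef struct { int dept; int day; } Event;

static int cmp_int(int a, int b) { return (a > b) - (a < b); }

/* Range input: equality attribute first, then range start. */
static int cmp_emp(const void *a, const void *b)
{
    const Emp *x = a, *y = b;
    int c = cmp_int(x->dept, y->dept);
    return c != 0 ? c : cmp_int(x->start, y->start);
}

/* Element input: equality attribute first, then the element. */
static int cmp_event(const void *a, const void *b)
{
    const Event *x = a, *y = b;
    int c = cmp_int(x->dept, y->dept);
    return c != 0 ? c : cmp_int(x->day, y->day);
}

/* Sorts both inputs in place and returns the number of joined pairs. */
size_t range_merge_join_eq(Emp *emps, size_t n_emps,
                           Event *events, size_t n_events)
{
    size_t mark = 0, matches = 0;

    qsort(emps, n_emps, sizeof(Emp), cmp_emp);
    qsort(events, n_events, sizeof(Event), cmp_event);

    for (size_t i = 0; i < n_emps; i++) {
        const Emp *e = &emps[i];

        /* Skip events that sort before (dept, start); later emps skip them too. */
        while (mark < n_events &&
               (events[mark].dept < e->dept ||
                (events[mark].dept == e->dept && events[mark].day < e->start)))
            mark++;

        /* Same department and day not past the end of the range: a match. */
        for (size_t j = mark;
             j < n_events && events[j].dept == e->dept && events[j].day <= e->end;
             j++)
            matches++;
    }
    return matches;
}
```

Encoding the emps and events rows from the thread's second example as integers (departments as 0, 1, 2 in alphabetical order, dates as yyyymmdd), the function returns 6, which agrees with the "actual rows=6" reported for the Range Merge Join node.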
The patch uses the optimizer estimates to evaluate if the Range Merge
Join is beneficial as compared to other execution strategies, but when
no equality attributes are present, it becomes the only efficient
option for the above range conditions. If a join contains multiple
range conditions, then based on the estimates the most effective
strategy is chosen for the Range Merge Join.
Although we do not directly focus on joins with the overlap (&&)
condition between two ranges, we show in [1] that these joins can be
evaluated using the union (UNION ALL) of two joins with a range
condition, where intuitively, one tests that the start of one input
falls within the range of the other and vice versa. We evaluated this
using regular (B-tree) indices, compared it to joins with the
overlap (&&) condition using GiST, SP-GiST and others, and found that
it performs better. The Range Merge Join would improve this further
and would not require the creation of an index.
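The decomposition of an overlap join into two range joins can be sketched as below. The thread does not spell out how [1] handles intervals that start at the same point, so the strict comparison in the second branch is an assumption made here to keep the two branches disjoint. Intervals are taken as closed with start <= end, and all names are hypothetical. Nested loops are used for brevity; each branch is the kind of condition the Range Merge Join targets.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical closed interval [start, end] with start <= end. */
typedef struct { int start; int end; } Interval;

/* Reference predicate: what r && s means for closed intervals. */
static bool overlaps(Interval r, Interval s)
{
    return r.start <= s.end && s.start <= r.end;
}

/* Overlap join evaluated directly, returning the number of result pairs. */
size_t count_overlap_direct(const Interval *r, size_t nr,
                            const Interval *s, size_t ns)
{
    size_t n = 0;
    for (size_t i = 0; i < nr; i++)
        for (size_t j = 0; j < ns; j++)
            if (overlaps(r[i], s[j]))
                n++;
    return n;
}

/* The same join as the UNION ALL of two joins with a range condition. */
size_t count_overlap_split(const Interval *r, size_t nr,
                           const Interval *s, size_t ns)
{
    size_t n = 0;

    /* Branch 1: r.start BETWEEN s.start AND s.end */
    for (size_t i = 0; i < nr; i++)
        for (size_t j = 0; j < ns; j++)
            if (s[j].start <= r[i].start && r[i].start <= s[j].end)
                n++;

    /* Branch 2: s.start > r.start AND s.start <= r.end (strict on the left) */
    for (size_t i = 0; i < nr; i++)
        for (size_t j = 0; j < ns; j++)
            if (r[i].start < s[j].start && s[j].start <= r[i].end)
                n++;

    return n;
}
```

Every overlapping pair falls into exactly one branch, depending on which interval starts first, so no duplicate elimination is needed and UNION ALL suffices.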
We did not consider an implementation as a CustomScan, as we feel the
join is rather general and can be implemented using a small extension of
the existing Merge Join, whereas a CustomScan would require a substantial
duplication of the Merge Join code.
Kind regards,
Thomas, Anton, Johann, Michael, Peter
[1] https://doi.org/10.1007/s00778-021-00692-3 (open access)
Attachment | Content-Type | Size |
---|---|---|
postgres.patch | application/octet-stream | 47.1 KB |
From: | Daniel Gustafsson <daniel(at)yesql(dot)se> |
---|---|
To: | Thomas <thomasmannhart97(at)gmail(dot)com> |
Cc: | Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>, pgsql <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, dignoes(at)inf(dot)unibz(dot)it, boehlen(at)ifi(dot)uzh(dot)ch, p(dot)moser(at)noi(dot)bz(dot)it, gamper(at)inf(dot)unibz(dot)it, Thomas Mannhart <thomas_m(at)hotmail(dot)ch> |
Subject: | Re: Patch: Range Merge Join |
Date: | 2021-11-17 14:03:32 |
Message-ID: | [email protected] |
This patch fails to compile due to an incorrect function name in an assertion:
nodeMergejoin.c:297:9: warning: implicit declaration of function 'list_legth' is invalid in C99 [-Wimplicit-function-declaration]
Assert(list_legth(node->rangeclause) < 3);
^
--
Daniel Gustafsson https://vmware.com/
From: | Thomas <thomasmannhart97(at)gmail(dot)com> |
---|---|
To: | daniel(at)yesql(dot)se |
Cc: | Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>, pgsql <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, dignoes(at)inf(dot)unibz(dot)it, boehlen(at)ifi(dot)uzh(dot)ch, p(dot)moser(at)noi(dot)bz(dot)it, gamper(at)inf(dot)unibz(dot)it, Thomas Mannhart <thomas_m(at)hotmail(dot)ch> |
Subject: | Re: Patch: Range Merge Join |
Date: | 2021-11-17 14:45:26 |
Message-ID: | CAMWfgiBRETALLvhk9P3dCsii8cL1XyYHv87=MH4OtnT5Y+o9Ng@mail.gmail.com |
Thank you for the feedback and sorry for the oversight. I fixed the bug and
attached a new version of the patch.
Kind Regards, Thomas
Attachment | Content-Type | Size |
---|---|---|
postgres-rmj_11-17-21.patch | application/octet-stream | 47.1 KB |
From: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
---|---|
To: | Thomas <thomasmannhart97(at)gmail(dot)com>, daniel(at)yesql(dot)se |
Cc: | Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>, pgsql <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, dignoes(at)inf(dot)unibz(dot)it, boehlen(at)ifi(dot)uzh(dot)ch, p(dot)moser(at)noi(dot)bz(dot)it, gamper(at)inf(dot)unibz(dot)it, Thomas Mannhart <thomas_m(at)hotmail(dot)ch> |
Subject: | Re: Patch: Range Merge Join |
Date: | 2021-11-17 22:28:43 |
Message-ID: | [email protected] |
On 11/17/21 15:45, Thomas wrote:
> Thank you for the feedback and sorry for the oversight. I fixed the bug
> and attached a new version of the patch.
>
> Kind Regards, Thomas
>
> On Wed, 17 Nov 2021 at 15:03, Daniel Gustafsson
> <daniel(at)yesql(dot)se> wrote:
>
> This patch fails to compile due to an incorrect function name in an
> assertion:
>
> nodeMergejoin.c:297:9: warning: implicit declaration of function
> 'list_legth' is invalid in C99 [-Wimplicit-function-declaration]
> Assert(list_legth(node->rangeclause) < 3);
>
That still doesn't compile with asserts, because MJCreateRangeData has
Assert(list_length(node->rangeclause) < 3);
but there's no 'node' variable :-/
I took a brief look at the patch, and I think there are two main issues
preventing it from moving forward.
1) no tests
There's not a *single* regression test exercising the new code, so even
after adding Assert(false) to MJCreateRangeData() tests pass just fine.
Clearly, that needs to change.
2) lack of comments
The patch adds a bunch of functions, but it does not really explain what
the functions do (unlike the various surrounding functions). Even if I
can work out what the functions do, it's much harder to determine what
the "contract" is (i.e. what assumptions the functions make and what is
guaranteed).
Similarly, the patch modifies/reworks large blocks of executor code,
without updating the comments describing what the block does.
See 0002 for various places that I think are missing comments.
Aside from that, I have a couple minor comments:
3) I'm not quite sure I like "Range Merge Join" to be honest. It's still
a "Merge Join" pretty much. What about ditching the "Range"? There'll
still be "Range Cond" key, which should be good enough I think.
4) Some minor whitespace issues (tabs vs. spaces). See 0002.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment | Content-Type | Size |
---|---|---|
0002-review.patch | text/x-patch | 9.2 KB |
0001-2021-11-17.patch | text/x-patch | 48.5 KB |
From: | Julien Rouhaud <rjuju123(at)gmail(dot)com> |
---|---|
To: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
Cc: | Thomas <thomasmannhart97(at)gmail(dot)com>, daniel(at)yesql(dot)se, Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>, pgsql <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, dignoes(at)inf(dot)unibz(dot)it, boehlen(at)ifi(dot)uzh(dot)ch, p(dot)moser(at)noi(dot)bz(dot)it, gamper(at)inf(dot)unibz(dot)it, Thomas Mannhart <thomas_m(at)hotmail(dot)ch> |
Subject: | Re: Patch: Range Merge Join |
Date: | 2022-01-17 07:39:33 |
Message-ID: | 20220117073933.m7ef2hwaubpcd75v@jrouhaud |
Hi,
On Wed, Nov 17, 2021 at 11:28:43PM +0100, Tomas Vondra wrote:
> On 11/17/21 15:45, Thomas wrote:
> > Thank you for the feedback and sorry for the oversight. I fixed the bug
> > and attached a new version of the patch.
> >
> > Kind Regards, Thomas
> >
> > On Wed, 17 Nov 2021 at 15:03, Daniel Gustafsson
> > <daniel(at)yesql(dot)se> wrote:
> >
> > This patch fails to compile due to an incorrect function name in an
> > assertion:
> >
> > nodeMergejoin.c:297:9: warning: implicit declaration of function
> > 'list_legth' is invalid in C99 [-Wimplicit-function-declaration]
> > Assert(list_legth(node->rangeclause) < 3);
> >
>
> That still doesn't compile with asserts, because MJCreateRangeData has
>
> Assert(list_length(node->rangeclause) < 3);
>
> but there's no 'node' variable :-/
>
>
> I took a brief look at the patch, and I think there are two main issues
> preventing it from moving forward.
>
> 1) no tests
>
> 2) lack of comments
>
> 3) I'm not quite sure I like "Range Merge Join" to be honest. It's still a
> "Merge Join" pretty much. What about ditching the "Range"? There'll still be
> "Range Cond" key, which should be good enough I think.
>
> 4) Some minor whitespace issues (tabs vs. spaces). See 0002.
It's been 2 months since Tomas posted that review.
Thomas, do you plan to work on that patch during this commitfest?