BUG #18959: Name collisions of expression indexes during parallel index creation on a partitioned table.

Lists: pgsql-bugs
From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: maximilian(dot)chrzan(at)here(dot)com
Subject: BUG #18959: Name collisions of expression indexes during parallel Index creations on a pratitioned table.
Date: 2025-06-13 14:10:00
Message-ID: [email protected]

The following bug has been logged on the website:

Bug reference: 18959
Logged by: Maximilian Chrzan
Email address: maximilian(dot)chrzan(at)here(dot)com
PostgreSQL version: 17.4
Operating system: x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.
Description:

Dear PostgreSQL team,
We encountered a reproducible issue when creating expression indexes on a
partitioned table using:
CREATE INDEX IF NOT EXISTS ... ON partitioned_table ((expression));
When such statements are executed in parallel (e.g., via separate
transactions or threads), the PostgreSQL engine attempts to propagate the
index to each child partition using internally generated names like:
partition_name_expr_idx
partition_name_expr_idx1
partition_name_expr_idx2
...
These internal names are not derived from the index expression or parent
index name, but instead appear to be based on a counter of existing
expression indexes.
The Issue:
When multiple expressions are indexed in parallel on the same partitioned
table, even with distinct expressions and parent index names, the system may
generate the same internal name for partition-level indexes, causing:
(Postgres <17): ERROR: duplicate key value violates unique constraint "pg_class_relname_nsp_index" (SQLSTATE 23505)
(Postgres 17): ERROR: relation "{index_name}" already exists (SQLSTATE 42P07)
This occurs even though the parent-level index names are unique and
expressions differ.
Reproducer (simplified):
-- In separate sessions concurrently:
CREATE INDEX IF NOT EXISTS idx_expr1 ON parent_table (((jsondata -> 'a' -> 'b')));
CREATE INDEX IF NOT EXISTS idx_expr2 ON parent_table (((jsondata -> 'x' -> 'y')));
Internally, PostgreSQL attempts to create something like:
CREATE INDEX parent_table_partition1_expr_idx ON ...
CREATE INDEX parent_table_partition1_expr_idx ON ... -- collision
Expected behavior:
If expressions or parent index names differ, partition-level index names
should be derived deterministically from:
* Parent index name (preferred), e.g. parent_idx_name_partition1
* Or a hash of the expression (as fallback)
This would avoid internal naming collisions and allow safe concurrent
execution of CREATE INDEX IF NOT EXISTS on partitioned tables.
This issue limits scalability when programmatically creating multiple
JSON-path expression indexes on partitioned tables, and complicates use of
parallelism. While advisory locking is a possible workaround, it is not
ideal.
Thanks in advance for looking into it.
Best regards,
Max Chrzan
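
The collision described in the report can be modeled with a few lines of Python. This is only a sketch of the counter-style naming (base name, then base name plus 1, 2, ...) that the report observes; the function choose_name and the set-based "catalog" are hypothetical stand-ins, not PostgreSQL's actual code. It assumes each session can see only names that were already committed when it chose.

```python
def choose_name(base, visible_names):
    """Return base, or base plus the first free counter suffix (1, 2, ...)."""
    if base not in visible_names:
        return base
    n = 1
    while f"{base}{n}" in visible_names:
        n += 1
    return f"{base}{n}"


base = "parent_table_partition1_expr_idx"

# Sequential case: the second CREATE INDEX sees the first one's committed name,
# so the counter suffix avoids the conflict.
committed = set()
first = choose_name(base, committed)
committed.add(first)
second = choose_name(base, committed)
committed.add(second)

# Concurrent case: both sessions choose before either has committed,
# so both look at the same (empty) set of visible names.
snapshot = set()
session_a = choose_name(base, snapshot)
session_b = choose_name(base, snapshot)
collision = session_a == session_b
```

In the sequential case the two names differ; in the concurrent case both sessions settle on the same name, which matches the failure reported above.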


From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: maximilian(dot)chrzan(at)here(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18959: Name collisions of expression indexes during parallel Index creations on a pratitioned table.
Date: 2025-06-18 10:29:48
Message-ID: CAFiTN-vB9UHF0yJxFynKuFiLkw094z7KLyDw4EAfUeaj1pZ00g@mail.gmail.com

On Sat, Jun 14, 2025 at 3:15 PM PG Bug reporting form
<noreply(at)postgresql(dot)org> wrote:
>
> The following bug has been logged on the website:
>
> Bug reference: 18959
> Logged by: Maximilian Chrzan
> Email address: maximilian(dot)chrzan(at)here(dot)com
> PostgreSQL version: 17.4
> Operating system: x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.
> Description:
>
> Dear PostgreSQL team,
> We encountered a reproducible issue when creating expression indexes on a
> partitioned table using:
> CREATE INDEX IF NOT EXISTS ... ON partitioned_table ((expression));
> When such statements are executed in parallel (e.g., via separate
> transactions or threads), the PostgreSQL engine attempts to propagate the
> index to each child partition using internally generated names like:
> partition_name_expr_idx
> partition_name_expr_idx1
> partition_name_expr_idx2
> ...
> These internal names are not derived from the index expression or parent
> index name, but instead appear to be based on a counter of existing
> expression indexes.
> The Issue:
> When multiple expressions are indexed in parallel on the same partitioned
> table, even with distinct expressions and parent index names, the system may
> generate the same internal name for partition-level indexes, causing:
> (Postgres <17): ERROR: duplicate key value violates unique constraint
> "pg_class_relname_nsp_index" 23505
> (Postgres 17): relation "{index_name}" already exists 42P07
> This occurs even though the parent-level index names are unique and
> expressions differ.
> Reproducer (simplified):
> -- In separate sessions concurrently:
> CREATE INDEX IF NOT EXISTS idx_expr1 ON parent_table (((jsondata -> 'a' ->
> 'b')));
> CREATE INDEX IF NOT EXISTS idx_expr2 ON parent_table (((jsondata -> 'x' ->
> 'y')));
> Internally, PostgreSQL attempts to create something like:
> CREATE INDEX parent_table_partition1_expr_idx ON ...
> CREATE INDEX parent_table_partition1_expr_idx ON ... -- collision
> Expected behavior:
> If expressions or parent index names differ, partition-level index names
> should be derived deterministically from:
> * Parent index name (preferred) eg.: parent_idx_name_partition1
> * Or a hash of the expression (as fallback)
> This would avoid internal naming collisions and allow safe concurrent
> execution of CREATE INDEX IF NOT EXISTS on partitioned tables.
> This issue limits scalability when programmatically creating multiple
> JSON-path expression indexes on partitioned tables, and complicates use of
> parallelism. While advisory locking is a possible workaround, it is not
> ideal.

It seems beneficial to embed the parent index name within the names of
its child partition indexes. That becomes tricky when building an index
for a multi-level partition hierarchy, but we could simplify it by
referencing only the top-level, user-provided index name. This is my
perspective, and I'm open to other ideas.

--
Regards,
Dilip Kumar
Google


From: Phineas Jensen <phin(at)zayda(dot)net>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: maximilian(dot)chrzan(at)here(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18959: Name collisions of expression indexes during parallel Index creations on a pratitioned table.
Date: 2025-06-18 13:46:38
Message-ID: [email protected]

> On Jun 18, 2025, at 4:29 AM, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> It seems beneficial to embed the parent index name within the names of
> its partitioned child indexes, although it would become tricky when
> building an index for a multi level partition hierarchy but we could
> simplify this by only referencing the top-level user-provided index
> name. This is my perspective, and I'm open to other ideas.

I agree that embedding the parent index name would be the simplest solution for this case, but a similar bug would still happen if no index name was specified for the parent at all (e.g. CREATE INDEX ON parent_table ((jsondata->'a'->'b')) ), although in that case, the conflict is on the parent table, not the child tables.

Would it be worth making CREATE INDEX add a short hash or some other unique key when no name is specified? Or does it make more sense to just say (maybe in the documentation) that if you run CREATE INDEX multiple times concurrently, you should specify a name to avoid conflicts?

I created SQL and Bash scripts to reproduce the problem, which I’ve attached.

Phin Jensen
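
For readers who generate index names client-side, the "short hash" idea could look roughly like the sketch below. It is an illustration only: hashed_index_name, the choice of SHA-1, the 8-character digest and the "_expr_..._idx" layout are all assumptions made here, not anything PostgreSQL does. The only PostgreSQL fact relied on is that identifiers are limited to 63 bytes by default; the sketch further assumes ASCII names so that characters and bytes coincide.

```python
import hashlib


def hashed_index_name(table, expression, max_len=63):
    """Build an index name from the table name and a short hash of the expression.

    Hypothetical helper: the same (table, expression) pair always yields the
    same name, and different expressions yield different names unless their
    8-hex-digit digests happen to collide.
    """
    digest = hashlib.sha1(expression.encode("utf-8")).hexdigest()[:8]
    suffix = f"_expr_{digest}_idx"
    # Truncate the table part so the whole name fits the identifier limit.
    return table[: max_len - len(suffix)] + suffix


name_a = hashed_index_name("parent_table", "(jsondata -> 'a' -> 'b')")
name_b = hashed_index_name("parent_table", "(jsondata -> 'x' -> 'y')")
```

Passing such a name explicitly to CREATE INDEX avoids relying on the server's generated name for the parent index; it does not by itself change how the partition-level names are chosen.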




From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: maximilian(dot)chrzan(at)here(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18959: Name collisions of expression indexes during parallel Index creations on a pratitioned table.
Date: 2025-06-18 15:21:51
Message-ID: [email protected]

Dilip Kumar <dilipbalaut(at)gmail(dot)com> writes:
> On Sat, Jun 14, 2025 at 3:15 PM PG Bug reporting form
> <noreply(at)postgresql(dot)org> wrote:
>> If expressions or parent index names differ, partition-level index names
>> should be derived deterministically from:
>> * Parent index name (preferred) eg.: parent_idx_name_partition1
>> * Or a hash of the expression (as fallback)
>> This would avoid internal naming collisions and allow safe concurrent
>> execution of CREATE INDEX IF NOT EXISTS on partitioned tables.

> It seems beneficial to embed the parent index name within the names of
> its partitioned child indexes, although it would become tricky when
> building an index for a multi level partition hierarchy but we could
> simplify this by only referencing the top-level user-provided index
> name. This is my perspective, and I'm open to other ideas.

This seems very closely related to commit 3db61db48 [1], which fixed
a similar behavior for child foreign key constraints. Per that commit
message, it's a good idea for the child objects to have names related
to the parent objects, so we ought to change this behavior regardless
of any concurrent-failure considerations.

Having said that, I do not think that the OP's idea of fully
deterministic index name choice is workable. We don't constrain
partitions to be exactly like their parents; that means that an index
name that works fine at an upper level might conflict with some
pre-existing index on a child. So unless you prefer failure to
selecting a different name at the child level, it's necessary to
allow the child index names to sometimes be different.

But ... the code *does* have the ability to dodge conflicting
index names already; this is why you get
partition_name_expr_idx
partition_name_expr_idx1
partition_name_expr_idx2
and not immediate failure. If this isn't working reliably in
concurrent situations, that must mean that we are not obtaining
an exclusive lock before looking for pre-existing index names.
I'm not sure if that's a bug or intentional. My vague recollection
is that we intend to allow multiple CREATE INDEX in parallel, so it
may be that obtaining a lock would be a cure worse than the disease.

In any case, deriving the child index name(s) from the parent name
would reduce the scope of this problem, so I agree we ought to
make it do that.

regards, tom lane

[1] https://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=3db61db48


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: maximilian(dot)chrzan(at)here(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18959: Name collisions of expression indexes during parallel Index creations on a pratitioned table.
Date: 2025-06-18 16:46:35
Message-ID: [email protected]

I wrote:
> This seems very closely related to commit 3db61db48 [1], which fixed
> a similar behavior for child foreign key constraints. Per that commit
> message, it's a good idea for the child objects to have names related
> to the parent objects, so we ought to change this behavior regardless
> of any concurrent-failure considerations.

I experimented with the attached, which borrows a couple of ideas
from 3db61db48 to produce names like "parent_index_2" when cloning
indexes. While it should help with the immediate problem, I'm not
sure if this is acceptable, because there are a *lot* of ensuing
changes in the regression tests, many more than 3db61db48 caused.
(Note that I didn't bother to fix places where the tests rely on
a generated name that has changed; the delta in the test outputs
is merely meant to give an idea of how much churn there is.
I didn't check non-core test suites, either.)

Also, looking at the error message changes, I'm less sure that
this is a UX improvement than I was about 3db61db48. Do people
care which partition a uniqueness constraint failed in? In
the current behavior, the index name will reflect that, but
with this behavior, not so much.

Anyway, maybe this is a good idea or maybe it isn't. Thoughts?

regards, tom lane

Attachment Content-Type Size
wip-change-choice-of-cloned-index-names.patch text/x-diff 118.0 KB
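
To illustrate why parent-derived names shrink the problem, here is a rough Python model. The message above says only that the patch produces names like "parent_index_2"; the exact numbering rule is not described in this thread, so the "first free _N suffix" rule and the function choose_child_index_name below are assumptions made for illustration.

```python
def choose_child_index_name(parent_index, taken):
    """Hypothetical rule: parent index name plus '_N', using the first free N."""
    n = 1
    while f"{parent_index}_{n}" in taken:
        n += 1
    return f"{parent_index}_{n}"


# Two sessions work from the same view of the catalog, but their parent
# index names differ, so the derived child names differ as well.
taken = set()
child_a = choose_child_index_name("idx_expr1", taken)
child_b = choose_child_index_name("idx_expr2", taken)

# A partition may already own an index with the preferred name, so the
# chooser still has to be able to pick a different one.
dodged = choose_child_index_name("idx_expr1", {"idx_expr1_1"})
```

Under this model two concurrent builds collide only if they use the same parent index name, while a pre-existing conflicting name on a partition is still dodged rather than causing a failure.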

From: "Chrzan, Maximilian" <maximilian(dot)chrzan(at)here(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: AW: [EXTERNAL] Re: BUG #18959: Name collisions of expression indexes during parallel Index creations on a pratitioned table.
Date: 2025-06-19 14:08:34
Message-ID: DU2PR04MB91304CF5226E97E1F0F986C69E7DA@DU2PR04MB9130.eurprd04.prod.outlook.com

We are working with very large partitioned tables (500M+ rows, >1 TB of data) and need to create multiple expression indexes on them.

To avoid the issues with parallel index creation, we switched to sequential execution: as soon as one index finishes (usually after 1–2 hours), we immediately start the next (typically within a second). In this setup, there is no actual parallelism — yet we occasionally still hit this error:

ERROR: duplicate key value violates unique constraint "pg_class_relname_nsp_index"
Detail: Key (relname, relnamespace) = (…) already exists.

This suggests that the issue is not limited to concurrent execution. It can also occur when index creation happens in quick succession.

Additionally, we noticed that two parallel index creations on a partitioned table will block each other — even if they target different expressions. Here's a simplified example:

CREATE TABLE test (
    jsondata JSONB,
    version BIGINT NOT NULL DEFAULT 9223372036854775807
) PARTITION BY RANGE (version);

CREATE TABLE test_p0 PARTITION OF test FOR VALUES FROM (0) TO (100000);

Transaction 1:

DO $$
BEGIN
    CREATE INDEX IF NOT EXISTS idx_1 ON test
        (((jsondata -> 'properties') -> 'foo1') ASC NULLS LAST);
    PERFORM pg_sleep(10);
END;
$$;

Transaction 2 (started in parallel):

DO $$
BEGIN
    CREATE INDEX IF NOT EXISTS idx_2 ON test
        (((jsondata -> 'properties') -> 'foo2') ASC NULLS LAST);
END;
$$;

Transaction 2 will block until Transaction 1 completes — and then fail with:

ERROR: duplicate key value violates unique constraint "pg_class_relname_nsp_index"
Detail: Key (relname, relnamespace) = (test_p1_expr_idx, 2200) already exists.

If the same indexes are created directly on the partition "test_p0", the second index is created immediately — without blocking or error.
________________________________
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Sent: Wednesday, June 18, 2025 18:46
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Chrzan, Maximilian <maximilian(dot)chrzan(at)here(dot)com>; pgsql-bugs(at)lists(dot)postgresql(dot)org <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: [EXTERNAL] Re: BUG #18959: Name collisions of expression indexes during parallel Index creations on a pratitioned table.


I wrote:
> This seems very closely related to commit 3db61db48 [1], which fixed
> a similar behavior for child foreign key constraints. Per that commit
> message, it's a good idea for the child objects to have names related
> to the parent objects, so we ought to change this behavior regardless
> of any concurrent-failure considerations.

I experimented with the attached, which borrows a couple of ideas
from 3db61db48 to produce names like "parent_index_2" when cloning
indexes. While it should help with the immediate problem, I'm not
sure if this is acceptable, because there are a *lot* of ensuing
changes in the regression tests, many more than 3db61db48 caused.
(Note that I didn't bother to fix places where the tests rely on
a generated name that has changed; the delta in the test outputs
is merely meant to give an idea of how much churn there is.
I didn't check non-core test suites, either.)

Also, looking at the error message changes, I'm less sure that
this is a UX improvement than I was about 3db61db48. Do people
care which partition a uniqueness constraint failed in? In
the current behavior, the index name will reflect that, but
with this behavior, not so much.

Anyway, maybe this is a good idea or maybe it isn't. Thoughts?

regards, tom lane


From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: "Chrzan, Maximilian" <maximilian(dot)chrzan(at)here(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: [EXTERNAL] Re: BUG #18959: Name collisions of expression indexes during parallel Index creations on a pratitioned table.
Date: 2025-06-19 15:37:57
Message-ID: CAFiTN-v+x4o_9C715FCYnvyKeqHMx1aYCTj=GGaJYaiW4PxBTQ@mail.gmail.com

On Thu, Jun 19, 2025 at 7:38 PM Chrzan, Maximilian
<maximilian(dot)chrzan(at)here(dot)com> wrote:
>
> We are working with very large partitioned tables (500M+ rows, >1 TB of data) and need to create multiple expression indexes on them.
>
> To avoid the issues with parallel index creation, we switched to sequential execution: as soon as one index finishes (usually after 1–2 hours), we immediately start the next (typically within a second). In this setup, there is no actual parallelism — yet we occasionally still hit this error:
>
> ERROR: duplicate key value violates unique constraint "pg_class_relname_nsp_index"
> Detail: Key (relname, relnamespace) = (…) already exists.
>
> This suggests that the issue is not limited to concurrent execution. It can also occur when index creation happens in quick succession.
>
> Additionally, we noticed that two parallel index creations on a partitioned table will block each other — even if they target different expressions. Here's a simplified example:
>
> CREATE TABLE test (
> jsondata JSONB,
> version BIGINT NOT NULL DEFAULT 9223372036854775807
> ) PARTITION BY RANGE (version);
>
> CREATE TABLE test_p0 PARTITION OF test FOR VALUES FROM (0) TO (100000);
>
> Transaction 1:
>
> DO $$
> BEGIN
> CREATE INDEX IF NOT EXISTS idx_1 ON test
> (((jsondata -> 'properties') -> 'foo1') ASC NULLS LAST);
> PERFORM pg_sleep(10);
> END;
> $$;
>
> Transaction 2 (started in parallel):
>
> DO $$
> BEGIN
> CREATE INDEX IF NOT EXISTS idx_2 ON test
> (((jsondata -> 'properties') -> 'foo2') ASC NULLS LAST);
> END;
> $$;
>
> Transaction 2 will block until Transaction 1 completes — and then fail with:

I believe this is fundamentally the same issue we're addressing here.
We're observing duplicate index name creation on child tables. If the
first transaction remains open, the second transaction waits for it to
commit or roll back because it's attempting to insert the same index
name key into the catalog. Once the first transaction commits, the
second will roll back due to a unique key violation. Conversely, if
the first transaction rolls back, the second will succeed.

--
Regards,
Dilip Kumar
Google


From: Junwang Zhao <zhjwpku(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, maximilian(dot)chrzan(at)here(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18959: Name collisions of expression indexes during parallel Index creations on a pratitioned table.
Date: 2025-06-19 15:53:20
Message-ID: CAEG8a3+EwEJVS-xLC6xB59mu9VkriPy-cC+oBtvY7oSu==PFTQ@mail.gmail.com

Hi Tom,

On Thu, Jun 19, 2025 at 12:46 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> I wrote:
> > This seems very closely related to commit 3db61db48 [1], which fixed
> > a similar behavior for child foreign key constraints. Per that commit
> > message, it's a good idea for the child objects to have names related
> > to the parent objects, so we ought to change this behavior regardless
> > of any concurrent-failure considerations.
>
> I experimented with the attached, which borrows a couple of ideas
> from 3db61db48 to produce names like "parent_index_2" when cloning
> indexes. While it should help with the immediate problem, I'm not
> sure if this is acceptable, because there are a *lot* of ensuing
> changes in the regression tests, many more than 3db61db48 caused.
> (Note that I didn't bother to fix places where the tests rely on
> a generated name that has changed; the delta in the test outputs
> is merely meant to give an idea of how much churn there is.
> I didn't check non-core test suites, either.)

I think this approach is better because each child index inherits its
parent's index name with an extra number, creating a more
intuitive hierarchy. This naming convention makes it easier to
understand the partition levels directly from the index name.
So I'm +1 for this idea.

>
> Also, looking at the error message changes, I'm less sure that
> this is a UX improvement than I was about 3db61db48. Do people
> care which partition a uniqueness constraint failed in? In
> the current behavior, the index name will reflect that, but
> with this behavior, not so much.

I can see the benefit of being able to identify the associated
partition directly by checking the index name. Could we prepend
the partition's relation name to the index name? That would make the
index name longer, so I'm not sure it's acceptable.

>
> Anyway, maybe this is a good idea or maybe it isn't. Thoughts?
>
> regards, tom lane
>

--
Regards
Junwang Zhao


From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: maximilian(dot)chrzan(at)here(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18959: Name collisions of expression indexes during parallel Index creations on a pratitioned table.
Date: 2025-06-19 16:04:22
Message-ID: CAFiTN-tVuS7pWbSZtCh5kAsGihen5ESpirS811KNFPCgLBm1WQ@mail.gmail.com

On Wed, Jun 18, 2025 at 10:16 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> I wrote:
> > This seems very closely related to commit 3db61db48 [1], which fixed
> > a similar behavior for child foreign key constraints. Per that commit
> > message, it's a good idea for the child objects to have names related
> > to the parent objects, so we ought to change this behavior regardless
> > of any concurrent-failure considerations.
>
> I experimented with the attached, which borrows a couple of ideas
> from 3db61db48 to produce names like "parent_index_2" when cloning
> indexes. While it should help with the immediate problem, I'm not
> sure if this is acceptable, because there are a *lot* of ensuing
> changes in the regression tests, many more than 3db61db48 caused.
> (Note that I didn't bother to fix places where the tests rely on
> a generated name that has changed; the delta in the test outputs
> is merely meant to give an idea of how much churn there is.
> I didn't check non-core test suites, either.)
>
> Also, looking at the error message changes, I'm less sure that
> this is a UX improvement than I was about 3db61db48. Do people
> care which partition a uniqueness constraint failed in? In
> the current behavior, the index name will reflect that, but
> with this behavior, not so much.
>
> Anyway, maybe this is a good idea or maybe it isn't. Thoughts?

I haven't reviewed the patch itself, but I like the idea. We're now
consistently using the parent index name for partitioned indexes,
whether they're named or unnamed indexes. That looks like a great
improvement. And I think including the partition number of each level
in the index name significantly enhances its clarity, especially
within a multi-level partition hierarchy.

--
Regards,
Dilip Kumar
Google


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: maximilian(dot)chrzan(at)here(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18959: Name collisions of expression indexes during parallel Index creations on a pratitioned table.
Date: 2025-06-19 16:57:46
Message-ID: [email protected]

Dilip Kumar <dilipbalaut(at)gmail(dot)com> writes:
> On Wed, Jun 18, 2025 at 10:16 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I experimented with the attached, which borrows a couple of ideas
>> from 3db61db48 to produce names like "parent_index_2" when cloning
>> indexes. While it should help with the immediate problem, I'm not
>> sure if this is acceptable, because there are a *lot* of ensuing
>> changes in the regression tests, many more than 3db61db48 caused.

> I haven't reviewed the patch itself, but I like the idea. We're now
> consistently using the parent index name for partitioned indexes,
> whether they're named or unnamed indexes. That looks like a great
> improvement. And I think including the partition number of each level
> in the index name significantly enhances its clarity, especially
> within a multi-level partition hierarchy.

A different approach that we could take --- possibly alongside doing
the above --- is to try to remove the race condition between two
sessions choosing the same index name. It doesn't look practical
to close the race window completely, but it's quite simple to make
it a whole lot shorter. If we check for a conflicting relation
name using SnapshotDirty instead of only looking for committed
pg_class rows, then the window is little more than the time needed
to insert the index's pg_class row, rather than being the whole
time needed to build the index. (The fact that the OP is working
with terabyte-sized tables is what's making this so bad for him.)

In the attached draft I only bothered to change the initial
probe for a conflicting pg_class entry. We could go further and
apply the same idea in ConstraintNameExists(), but I'm not sure
it's worth the trouble.

regards, tom lane

Attachment Content-Type Size
wip-shorten-index-name-choice-race-condition.patch text/x-diff 2.4 KB
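
The effect of probing with SnapshotDirty can be sketched with a toy catalog in Python. This is a model of the idea only, under the assumption that a dirty probe also sees rows inserted by transactions that have not yet committed; Row, name_in_use and choose_name are hypothetical names, not functions from the patch.

```python
from dataclasses import dataclass


@dataclass
class Row:
    relname: str
    committed: bool  # False: inserted by a transaction that is still running


def name_in_use(rows, name, dirty):
    """A committed-only probe ignores in-progress rows; a dirty probe sees them."""
    return any(r.relname == name and (r.committed or dirty) for r in rows)


def choose_name(rows, base, dirty):
    """Pick base, or base plus the first counter suffix the probe reports as free."""
    if not name_in_use(rows, base, dirty):
        return base
    n = 1
    while name_in_use(rows, f"{base}{n}", dirty):
        n += 1
    return f"{base}{n}"


# Session A has inserted its pg_class row and is still building the index.
rows = [Row("test_p0_expr_idx", committed=False)]

committed_only = choose_name(rows, "test_p0_expr_idx", dirty=False)
with_dirty = choose_name(rows, "test_p0_expr_idx", dirty=True)
```

With the committed-only probe, session B picks the name session A is already using; with the dirty probe it moves on to the next suffix. In this model the only remaining window is the one where both sessions choose before either has inserted its row, which matches the description above of a much shorter but not fully closed race.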

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: maximilian(dot)chrzan(at)here(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18959: Name collisions of expression indexes during parallel Index creations on a pratitioned table.
Date: 2025-06-19 20:59:46
Message-ID: [email protected]

Dilip Kumar <dilipbalaut(at)gmail(dot)com> writes:
> I haven't reviewed the patch itself, but I like the idea. We're now
> consistently using the parent index name for partitioned indexes,
> whether they're named or unnamed indexes. That looks like a great
> improvement. And I think including the partition number of each level
> in the index name significantly enhances its clarity, especially
> within a multi-level partition hierarchy.

Since people seem to think this might be a good way to proceed,
I spent some effort on cleaning up the regression test changes.

While doing that, I decided that applying this behavioral change to
CREATE TABLE LIKE (the original user of generateClonedIndexStmt)
might not be such a hot idea: the regression test changes that
that induced felt less natural than the ones involving partitioned
indexes. Another practical reason is that all the calls for
partitioned indexes will call DefineIndex immediately, so the
race-condition window for some other session to claim the same
index name is barely wider than it was before. But in CREATE TABLE
LIKE, there's considerably more delay, and I think it might even
be possible to construct counterexamples where our own process
could try to create two identically-named indexes if we try to
nail down the index name in generateClonedIndexStmt.

So that leads me to the attached. Excluding CREATE TABLE LIKE
reduces the number of regression-test changes a little, but
there's still a lot of them, implying this is a nontrivial
behavioral change for users. So I feel like this is not
something to squeeze into v18 post-beta-1. I'm thinking it'd
be appropriate for v19 instead. (We could perhaps back-patch
the other SnapshotDirty patch to ameliorate the problem in the
back branches.)

regards, tom lane

Attachment Content-Type Size
v1-0001-Change-the-names-generated-for-index-partitions.patch text/x-diff 133.9 KB

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: maximilian(dot)chrzan(at)here(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18959: Name collisions of expression indexes during parallel Index creations on a pratitioned table.
Date: 2025-06-20 11:29:10
Message-ID: CAFiTN-t0LFgiHgD7DvHZtCxL46X3qdARAmYoLREVizH_YyEn6A@mail.gmail.com

On Fri, Jun 20, 2025 at 2:29 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Dilip Kumar <dilipbalaut(at)gmail(dot)com> writes:
> > I haven't reviewed the patch itself, but I like the idea. We're now
> > consistently using the parent index name for partitioned indexes,
> > whether they're named or unnamed indexes. That looks like a great
> > improvement. And I think including the partition number of each level
> > in the index name significantly enhances its clarity, especially
> > within a multi-level partition hierarchy.
>
> Since people seem to think this might be a good way to proceed,
> I spent some effort on cleaning up the regression test changes.
>
> While doing that, I decided that applying this behavioral change to
> CREATE TABLE LIKE (the original user of generateClonedIndexStmt)
> might not be such a hot idea: the regression test changes that
> that induced felt less natural than the ones involving partitioned
> indexes. Another practical reason is that all the calls for
> partitioned indexes will call DefineIndex immediately, so the
> race-condition window for some other session to claim the same
> index name is barely wider than it was before. But in CREATE TABLE
> LIKE, there's considerably more delay, and I think it might even
> be possible to construct counterexamples where our own process
> could try to create two identically-named indexes if we try to
> nail down the index name in generateClonedIndexStmt.
>
> So that leads me to the attached.

The patch LGTM

> Excluding CREATE TABLE LIKE
> reduces the number of regression-test changes a little, but
> there's still a lot of them, implying this is a nontrivial
> behavioral change for users. So I feel like this is not
> something to squeeze into v18 post-beta-1. I'm thinking it'd
> be appropriate for v19 instead. (We could perhaps back-patch
> the other SnapshotDirty patch to ameliorate the problem in the
> back branches.)

Yes, it makes sense to apply that in v19 because of the user-visible
behavior changes in index names. I agree the SnapshotDirty patch can
provide relief for this case in the back branches.

--
Regards,
Dilip Kumar
Google


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: maximilian(dot)chrzan(at)here(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18959: Name collisions of expression indexes during parallel Index creations on a pratitioned table.
Date: 2025-06-20 17:50:41
Message-ID: [email protected]

Dilip Kumar <dilipbalaut(at)gmail(dot)com> writes:
> Yes, that makes sense to apply in v19 because of user visible behavior
> changes in index names. I agree the SnapshotDirty patch can give
> relief for this case for back branches.

OK, I pushed the SnapshotDirty patch. The other patch still seems
to apply over it, so I won't repost that unless the cfbot thinks
differently.

regards, tom lane