BUG #18559: Crash after detaching a partition concurrently from another session

Lists: pgsql-bugs
From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: kuntalghosh(dot)2007(at)gmail(dot)com
Subject: BUG #18559: Crash after detaching a partition concurrently from another session
Date: 2024-07-30 13:47:15
Message-ID: [email protected]

The following bug has been logged on the website:

Bug reference: 18559
Logged by: Kuntal Ghosh
Email address: kuntalghosh(dot)2007(at)gmail(dot)com
PostgreSQL version: 17beta2
Operating system: AL2
Description:

I've encountered the following crash after detaching a partition concurrently
and then dropping it, while another session concurrently plans a query on the
parent table.

#0 0x0000000000900e5f in heap_getattr (tup=0x0, attnum=33,
tupleDesc=0x7f40db0a5458, isnull=0x7ffcb110197e) at
../../../src/include/access/htup_details.h:801
801 if (attnum > (int)
HeapTupleHeaderGetNatts(tup->t_data))
(gdb) bt
#0 0x0000000000900e5f in heap_getattr (tup=0x0, attnum=33,
tupleDesc=0x7f40db0a5458, isnull=0x7ffcb110197e) at
../../../src/include/access/htup_details.h:801
#1 0x000000000090123b in RelationBuildPartitionDesc (rel=0x7f40db0b68e8,
omit_detached=true) at partdesc.c:237
#2 0x0000000000900fe0 in RelationGetPartitionDesc (rel=0x7f40db0b68e8,
omit_detached=true) at partdesc.c:109
#3 0x0000000000901889 in PartitionDirectoryLookup (pdir=0x24287e8,
rel=0x7f40db0b68e8) at partdesc.c:457
#4 0x00000000008e77c3 in set_relation_partition_info (root=0x241c308,
rel=0x241d518, relation=0x7f40db0b68e8) at plancat.c:2367
#5 0x00000000008e48c6 in get_relation_info (root=0x241c308,
relationObjectId=16388, inhparent=true, rel=0x241d518) at plancat.c:554
#6 0x00000000008eb8b7 in build_simple_rel (root=0x241c308, relid=1,
parent=0x0) at relnode.c:340
#7 0x000000000089f007 in add_base_rels_to_query (root=0x241c308,
jtnode=0x241be90) at initsplan.c:165
#8 0x000000000089f04e in add_base_rels_to_query (root=0x241c308,
jtnode=0x241c238) at initsplan.c:173
#9 0x00000000008a5363 in query_planner (root=0x241c308,
qp_callback=0x8aba74 <standard_qp_callback>, qp_extra=0x7ffcb1101da0) at
planmain.c:170
#10 0x00000000008a7d88 in grouping_planner (root=0x241c308,
tuple_fraction=0, setops=0x0) at planner.c:1520
#11 0x00000000008a74b0 in subquery_planner (glob=0x241b988, parse=0x241d1f8,
parent_root=0x0, hasRecursion=false, tuple_fraction=0, setops=0x0) at
planner.c:1089
#12 0x00000000008a5ae7 in standard_planner (parse=0x241d1f8,
query_string=0x23732d8 "prepare p1 as select * from p;", cursorOptions=2048,
boundParams=0x0) at planner.c:415
#13 0x00000000008a587e in planner (parse=0x241d1f8, query_string=0x23732d8
"prepare p1 as select * from p;", cursorOptions=2048, boundParams=0x0) at
planner.c:282
#14 0x00000000009e7dbc in pg_plan_query (querytree=0x241d1f8,
query_string=0x23732d8 "prepare p1 as select * from p;", cursorOptions=2048,
boundParams=0x0) at postgres.c:904
#15 0x00000000009e7eed in pg_plan_queries (querytrees=0x241c2b8,
query_string=0x23732d8 "prepare p1 as select * from p;", cursorOptions=2048,
boundParams=0x0) at postgres.c:996
#16 0x0000000000b9e50f in BuildCachedPlan (plansource=0x2374270,
qlist=0x241c2b8, boundParams=0x0, queryEnv=0x0) at plancache.c:962
#17 0x0000000000b9eaeb in GetCachedPlan (plansource=0x2374270,
boundParams=0x0, owner=0x0, queryEnv=0x0) at plancache.c:1199
#18 0x00000000006cfd2c in ExecuteQuery (pstate=0x2372ed8, stmt=0x2349130,
intoClause=0x0, params=0x0, dest=0x2372e48, qc=0x7ffcb1102630) at
prepare.c:193
#19 0x00000000009f0c2c in standard_ProcessUtility (pstmt=0x23491e0,
queryString=0x2348720 "execute p1;", readOnlyTree=false,
context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x2372e48,
qc=0x7ffcb1102630)
at utility.c:750
#20 0x00000000009f061b in ProcessUtility (pstmt=0x23491e0,
queryString=0x2348720 "execute p1;", readOnlyTree=false,
context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x2372e48,
qc=0x7ffcb1102630)
at utility.c:523
#21 0x00000000009ef237 in PortalRunUtility (portal=0x23c8100,
pstmt=0x23491e0, isTopLevel=true, setHoldSnapshot=true, dest=0x2372e48,
qc=0x7ffcb1102630) at pquery.c:1158
#22 0x00000000009eefa0 in FillPortalStore (portal=0x23c8100,
isTopLevel=true) at pquery.c:1031
#23 0x00000000009ee90e in PortalRun (portal=0x23c8100,
count=9223372036854775807, isTopLevel=true, run_once=true, dest=0x23495a0,
altdest=0x23495a0, qc=0x7ffcb1102800) at pquery.c:763
#24 0x00000000009e83e6 in exec_simple_query (query_string=0x2348720 "execute
p1;") at postgres.c:1274
#25 0x00000000009ecb0b in PostgresMain (dbname=0x2381fa8 "postgres",
username=0x2381f88 "kuntalgh") at postgres.c:4696
#26 0x00000000009e4c0a in BackendMain (startup_data=0x7ffcb1102b0c "",
startup_data_len=4) at backend_startup.c:107
#27 0x0000000000910ea3 in postmaster_child_launch (child_type=B_BACKEND,
startup_data=0x7ffcb1102b0c "", startup_data_len=4,
client_sock=0x7ffcb1102b30) at launch_backend.c:274
#28 0x0000000000916661 in BackendStartup (client_sock=0x7ffcb1102b30) at
postmaster.c:3495
#29 0x0000000000913d7c in ServerLoop () at postmaster.c:1662
#30 0x0000000000913736 in PostmasterMain (argc=3, argv=0x2342ea0) at
postmaster.c:1360
#31 0x00000000007d2e9f in main (argc=3, argv=0x2342ea0) at main.c:197

I've reproduced the issue by following [1] with a minor modification.

1. ./configure --enable-debug --enable-depend --enable-cassert CFLAGS=-O0
2. make -j; make install -j; initdb -D ./primary; pg_ctl -D ./primary -l
logfile start
3. alter system set plan_cache_mode to 'force_generic_plan';
select pg_reload_conf();
4. create table p (a int, b int) partition by range (a);
create table p1 partition of p for values from (0) to (1);
create table p2 partition of p for values from (1) to (2);

Now, we need to use GDB to reproduce the crash.

Session 1:
1. Attach GDB and put a breakpoint at ATExecDetachPartition

Session 2:
1. SQL: prepare p1 as select * from p;
2. Attach GDB and put a breakpoint at ProcessUtility() and
find_inheritance_children_extended()

Session 1:
1. alter table p detach partition p2 concurrently;
2. The session will be stalled at ATExecDetachPartition. Continue stepping
next till CommitTransactionCommand();

Session 2:
1. SQL: execute p1;
2. The session will be stalled at ProcessUtility(). Before that, it takes
the snapshot.

Session 1:
1. Continue till DetachPartitionFinalize.

Session 2:
1. Continue till find_inheritance_children_extended(). It'll find two
partitions as transaction 1 isn't yet committed. Complete the execution in
that function.

Session 1:
1. Run to completion.
2. SQL: drop table p2;

Session 2:
1. It will crash, as the code assumes a pg_class entry exists for the dropped
relation.

The following code assumes that a pg_class entry for the detached partition
will always be available, which is wrong.

Thanks,
Kuntal
[1]
https://www.postgresql.org/message-id/CAHewXNkaKgVmT%2BOkVA9UHrEYm%2Bb8J6o_8%2B-84Qey6V5tM-%2Bz9A%40mail.gmail.com


From: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>
To: kuntalghosh(dot)2007(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org, Alvaro Herrera from 2ndQuadrant <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: BUG #18559: Crash after detaching a partition concurrently from another session
Date: 2024-07-30 13:52:33
Message-ID: CAGz5QC+yTwKXAYgvvmiPAwez4CVkwO==ECDK+3fMAJ8j9Lcbkw@mail.gmail.com
Lists: pgsql-bugs

On Tue, Jul 30, 2024 at 7:18 PM PG Bug reporting form
<noreply(at)postgresql(dot)org> wrote:
Adding Alvaro.
>
> The following code assumes that an pg_class entry for the detached partition
> will always be available which is wrong.
>
Referring to the following code.
    /*
     * Two problems are possible here.  First, a concurrent ATTACH
     * PARTITION might be in the process of adding a new partition, but
     * the syscache doesn't have it, or its copy of it does not yet have
     * its relpartbound set.  We cannot just AcceptInvalidationMessages(),
     * because the other process might have already removed itself from
     * the ProcArray but not yet added its invalidation messages to the
     * shared queue.  We solve this problem by reading pg_class directly
     * for the desired tuple.
     *
     * The other problem is that DETACH CONCURRENTLY is in the process of
     * removing a partition, which happens in two steps: first it marks it
     * as "detach pending", commits, then unsets relpartbound.  If
     * find_inheritance_children_extended included that partition but
     * below we see that DETACH CONCURRENTLY has reset relpartbound for
     * it, we'd see an inconsistent view.  (The inconsistency is seen
     * because table_open below reads invalidation messages.)  We protect
     * against this by retrying find_inheritance_children_extended().
     */
    if (boundspec == NULL)
    {
        Relation    pg_class;
        SysScanDesc scan;
        ScanKeyData key[1];
        Datum       datum;
        bool        isnull;

        pg_class = table_open(RelationRelationId, AccessShareLock);
        ScanKeyInit(&key[0],
                    Anum_pg_class_oid,
                    BTEqualStrategyNumber, F_OIDEQ,
                    ObjectIdGetDatum(inhrelid));
        scan = systable_beginscan(pg_class, ClassOidIndexId, true,
                                  NULL, 1, key);
        tuple = systable_getnext(scan);
        datum = heap_getattr(tuple, Anum_pg_class_relpartbound,
                             RelationGetDescr(pg_class), &isnull);
        if (!isnull)
            boundspec = stringToNode(TextDatumGetCString(datum));
        systable_endscan(scan);
        table_close(pg_class, AccessShareLock);

IIUC, a simple fix would be to retry if an entry is not found. Attached a patch.

--
Thanks & Regards,
Kuntal Ghosh

Attachment Content-Type Size
0001-Fix-creation-of-partition-descriptor-during-concurre.patch application/octet-stream 1.2 KB

From: Tender Wang <tndrwang(at)gmail(dot)com>
To: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org, Alvaro Herrera from 2ndQuadrant <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: BUG #18559: Crash after detaching a partition concurrently from another session
Date: 2024-07-31 02:34:01
Message-ID: CAHewXN=56U_4gfYcDJcUUn5t9mPKNs-HBDWpxr7Jowx-_chKyA@mail.gmail.com
Lists: pgsql-bugs

Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com> 于2024年7月30日周二 21:52写道:

> On Tue, Jul 30, 2024 at 7:18 PM PG Bug reporting form
> <noreply(at)postgresql(dot)org> wrote:
> Adding Alvaro.
> >
> > The following code assumes that an pg_class entry for the detached
> partition
> > will always be available which is wrong.
> >
> Referring to the following code.
> /*
>  * Two problems are possible here.  First, a concurrent ATTACH
>  * PARTITION might be in the process of adding a new partition, but
>  * the syscache doesn't have it, or its copy of it does not yet have
>  * its relpartbound set.  We cannot just AcceptInvalidationMessages(),
>  * because the other process might have already removed itself from
>  * the ProcArray but not yet added its invalidation messages to the
>  * shared queue.  We solve this problem by reading pg_class directly
>  * for the desired tuple.
>  *
>  * The other problem is that DETACH CONCURRENTLY is in the process of
>  * removing a partition, which happens in two steps: first it marks it
>  * as "detach pending", commits, then unsets relpartbound.  If
>  * find_inheritance_children_extended included that partition but
>  * below we see that DETACH CONCURRENTLY has reset relpartbound for
>  * it, we'd see an inconsistent view.  (The inconsistency is seen
>  * because table_open below reads invalidation messages.)  We protect
>  * against this by retrying find_inheritance_children_extended().
>  */
> if (boundspec == NULL)
> {
>     Relation    pg_class;
>     SysScanDesc scan;
>     ScanKeyData key[1];
>     Datum       datum;
>     bool        isnull;
>
>     pg_class = table_open(RelationRelationId, AccessShareLock);
>     ScanKeyInit(&key[0],
>                 Anum_pg_class_oid,
>                 BTEqualStrategyNumber, F_OIDEQ,
>                 ObjectIdGetDatum(inhrelid));
>     scan = systable_beginscan(pg_class, ClassOidIndexId, true,
>                               NULL, 1, key);
>     tuple = systable_getnext(scan);
>     datum = heap_getattr(tuple, Anum_pg_class_relpartbound,
>                          RelationGetDescr(pg_class), &isnull);
>     if (!isnull)
>         boundspec = stringToNode(TextDatumGetCString(datum));
>     systable_endscan(scan);
>     table_close(pg_class, AccessShareLock);
>
> IIUC, a simple fix would be to retry if an entry is not found. Attached a
> patch.
>

I took a quick look at the attached patch. It looks good to me.

--
Tender Wang


From: Junwang Zhao <zhjwpku(at)gmail(dot)com>
To: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org, Alvaro Herrera from 2ndQuadrant <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: BUG #18559: Crash after detaching a partition concurrently from another session
Date: 2024-08-10 13:54:58
Message-ID: CAEG8a3+qEsMW_R502GB9szReSGnhUvoNuvLci1wvw07+Lbegiw@mail.gmail.com
Lists: pgsql-bugs

On Tue, Jul 30, 2024 at 9:52 PM Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com> wrote:
>
> On Tue, Jul 30, 2024 at 7:18 PM PG Bug reporting form
> <noreply(at)postgresql(dot)org> wrote:
> Adding Alvaro.
> >
> > The following code assumes that an pg_class entry for the detached partition
> > will always be available which is wrong.
> >
> Referring to the following code.
> /*
> * Two problems are possible here. First, a concurrent ATTACH
> * PARTITION might be in the process of adding a new partition, but
> * the syscache doesn't have it, or its copy of it does not yet have
> * its relpartbound set. We cannot just AcceptInvalidationMessages(),
> * because the other process might have already removed itself from
> * the ProcArray but not yet added its invalidation messages to the
> * shared queue. We solve this problem by reading pg_class directly
> * for the desired tuple.
> *
> * The other problem is that DETACH CONCURRENTLY is in the process of
> * removing a partition, which happens in two steps: first it marks it
> * as "detach pending", commits, then unsets relpartbound. If
> * find_inheritance_children_extended included that partition but we
> * below we see that DETACH CONCURRENTLY has reset relpartbound for
> * it, we'd see an inconsistent view. (The inconsistency is seen
> * because table_open below reads invalidation messages.) We protect
> * against this by retrying find_inheritance_children_extended().
> */
> if (boundspec == NULL)
> {
> Relation pg_class;
> SysScanDesc scan;
> ScanKeyData key[1];
> Datum datum;
> bool isnull;
>
> pg_class = table_open(RelationRelationId, AccessShareLock);
> ScanKeyInit(&key[0],
> Anum_pg_class_oid,
> BTEqualStrategyNumber, F_OIDEQ,
> ObjectIdGetDatum(inhrelid));
> scan = systable_beginscan(pg_class, ClassOidIndexId, true,
> NULL, 1, key);
> tuple = systable_getnext(scan);
> datum = heap_getattr(tuple, Anum_pg_class_relpartbound,
> RelationGetDescr(pg_class), &isnull);
> if (!isnull)
> boundspec = stringToNode(TextDatumGetCString(datum));
> systable_endscan(scan);
> table_close(pg_class, AccessShareLock);
>
> IIUC, a simple fix would be to retry if an entry is not found. Attached a patch.

I can reproduce the issue, and the patch LGTM.

>
> --
> Thanks & Regards,
> Kuntal Ghosh

--
Regards
Junwang Zhao


From: Alvaro Herrera from 2ndQuadrant <alvherre(at)alvh(dot)no-ip(dot)org>
To: Junwang Zhao <zhjwpku(at)gmail(dot)com>
Cc: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Tender Wang <tndrwang(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18559: Crash after detaching a partition concurrently from another session
Date: 2024-08-12 18:24:27
Message-ID: [email protected]
Lists: pgsql-bugs

On 2024-Aug-10, Junwang Zhao wrote:

> > IIUC, a simple fix would be to retry if an entry is not found. Attached a patch.
>
> I can reproduce the issue, and the patch LGTM.

Interesting issue, thanks for reporting and putting together a
reproducer. I have added some comments to the proposed patch, so here's
a v2 for it. I'm going to write a commit message for it and push to all
branches since 14.

--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"La experiencia nos dice que el hombre peló millones de veces las patatas,
pero era forzoso admitir la posibilidad de que en un caso entre millones,
las patatas pelarían al hombre" (Ijon Tichy)

Attachment Content-Type Size
v2-0001-Fix-creation-of-partition-descriptor-during-concu.patch text/x-diff 2.9 KB

From: Alvaro Herrera from 2ndQuadrant <alvherre(at)alvh(dot)no-ip(dot)org>
To: Junwang Zhao <zhjwpku(at)gmail(dot)com>
Cc: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Tender Wang <tndrwang(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18559: Crash after detaching a partition concurrently from another session
Date: 2024-08-12 22:33:09
Message-ID: [email protected]
Lists: pgsql-bugs

On 2024-Aug-12, Alvaro Herrera from 2ndQuadrant wrote:

> On 2024-Aug-10, Junwang Zhao wrote:
>
> > > IIUC, a simple fix would be to retry if an entry is not found. Attached a patch.
> >
> > I can reproduce the issue, and the patch LGTM.
>
> Interesting issue, thanks for reporting and putting together a
> reproducer. I have added some comments to the proposed patch, so here's
> a v2 for it. I'm going to write a commit message for it and push to all
> branches since 14.

Dept. of second thoughts. I couldn't find any reason why it's okay to
dereference the return value from systable_getnext() without verifying
that it's not NULL, so I added the check to the older branches too. As
far as I know we've never had a crash report that could be traced to
lack of that check, but that code still looks like it's assuming a
little too much.

One more thing here. Applying the test scripts that I used for the
previous bug with addition of DROP after the DETACH CONCURRENTLY, I get
a different failure during planning, which reports this error:

ERROR: could not open relation with OID 457639
STATEMENT: select * from p where a = $1;

and changing that error (in relation_open) from ERROR to PANIC would
result in this backtrace:

#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo(at)entry=6, no_tid=no_tid(at)entry=0) at ./nptl/pthread_kill.c:44
#1 0x00007f149c3cae8f in __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2 0x00007f149c37bfb2 in __GI_raise (sig=sig(at)entry=6) at ../sysdeps/posix/raise.c:26
#3 0x00007f149c366472 in __GI_abort () at ./stdlib/abort.c:79
#4 0x0000557db57b8611 in errfinish (filename=<optimized out>, lineno=61, funcname=0x557db5816128 <__func__.1> "relation_open")
at ../../../../../../pgsql/source/master/src/backend/utils/error/elog.c:599
#5 0x0000557db52441ad in relation_open (relationId=17466, lockmode=lockmode(at)entry=1)
at ../../../../../../pgsql/source/master/src/backend/access/common/relation.c:61
#6 0x0000557db537d1d9 in table_open (relationId=<optimized out>, lockmode=lockmode(at)entry=1)
at ../../../../../../pgsql/source/master/src/backend/access/table/table.c:44
#7 0x0000557db55b0cf8 in expand_partitioned_rtentry (root=root(at)entry=0x557db9c8b530, relinfo=relinfo(at)entry=0x557db9c8c2d8,
parentrte=parentrte(at)entry=0x557db9c8c178, parentRTindex=parentRTindex(at)entry=1, parentrel=parentrel(at)entry=0x7f149bc46060, parent_updatedCols=0x0,
top_parentrc=0x0, lockmode=1) at ../../../../../../pgsql/source/master/src/backend/optimizer/util/inherit.c:390
#8 0x0000557db55b121f in expand_inherited_rtentry (root=root(at)entry=0x557db9c8b530, rel=0x557db9c8c2d8, rte=0x557db9c8c178, rti=rti(at)entry=1)
at ../../../../../../pgsql/source/master/src/backend/optimizer/util/inherit.c:154
#9 0x0000557db558e11a in add_other_rels_to_query (root=root(at)entry=0x557db9c8b530)
at ../../../../../../pgsql/source/master/src/backend/optimizer/plan/initsplan.c:214
#10 0x0000557db5591983 in query_planner (root=root(at)entry=0x557db9c8b530, qp_callback=qp_callback(at)entry=0x557db5593470 <standard_qp_callback>,
qp_extra=qp_extra(at)entry=0x7ffd987c39e0) at ../../../../../../pgsql/source/master/src/backend/optimizer/plan/planmain.c:268
#11 0x0000557db5597968 in grouping_planner (root=root(at)entry=0x557db9c8b530, tuple_fraction=<optimized out>, tuple_fraction(at)entry=0, setops=setops(at)entry=0x0)
at ../../../../../../pgsql/source/master/src/backend/optimizer/plan/planner.c:1520
#12 0x0000557db559a8f3 in subquery_planner (glob=glob(at)entry=0x557db9c8c758, parse=parse(at)entry=0x557db9c8bf68, parent_root=parent_root(at)entry=0x0,
hasRecursion=hasRecursion(at)entry=false, tuple_fraction=tuple_fraction(at)entry=0, setops=setops(at)entry=0x0)
at ../../../../../../pgsql/source/master/src/backend/optimizer/plan/planner.c:1089
#13 0x0000557db559acea in standard_planner (parse=0x557db9c8bf68, query_string=<optimized out>, cursorOptions=2048, boundParams=0x0)
at ../../../../../../pgsql/source/master/src/backend/optimizer/plan/planner.c:415
#14 0x0000557db5677117 in pg_plan_query (querytree=querytree(at)entry=0x557db9c8bf68, query_string=query_string(at)entry=0x557db9cb34c8 "select * from p where a = $1;",
cursorOptions=cursorOptions(at)entry=2048, boundParams=boundParams(at)entry=0x0) at ../../../../../pgsql/source/master/src/backend/tcop/postgres.c:912
#15 0x0000557db5677273 in pg_plan_queries (querytrees=querytrees(at)entry=0x557db9d69268, query_string=0x557db9cb34c8 "select * from p where a = $1;",
cursorOptions=2048, boundParams=boundParams(at)entry=0x0) at ../../../../../pgsql/source/master/src/backend/tcop/postgres.c:1006
#16 0x0000557db57a39b3 in BuildCachedPlan (plansource=plansource(at)entry=0x557db9c5ff10, qlist=qlist(at)entry=0x557db9d69268, boundParams=boundParams(at)entry=0x0,
queryEnv=queryEnv(at)entry=0x0) at ../../../../../../pgsql/source/master/src/backend/utils/cache/plancache.c:962
#17 0x0000557db57a4204 in GetCachedPlan (plansource=plansource(at)entry=0x557db9c5ff10, boundParams=boundParams(at)entry=0x557db9cc0778, owner=owner(at)entry=0x0,
queryEnv=queryEnv(at)entry=0x0) at ../../../../../../pgsql/source/master/src/backend/utils/cache/plancache.c:1199
#18 0x0000557db56786c5 in exec_bind_message (input_message=0x7ffd987c3e70) at ../../../../../pgsql/source/master/src/backend/tcop/postgres.c:2017
#19 PostgresMain (dbname=<optimized out>, username=<optimized out>) at ../../../../../pgsql/source/master/src/backend/tcop/postgres.c:4814

Clearly this is undesirable, but I'm not sure how strongly we need to
pursue a fix for it. It seems hard to forbid dropping the table after
the detach. Is it enough to advise users to not drop partitions
immediately after detaching them? Is this a sign of a more fundamental
problem in the mechanism for DETACH CONCURRENTLY when used together with
prepared statements?

--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"Industry suffers from the managerial dogma that for the sake of stability
and continuity, the company should be independent of the competence of
individual employees." (E. Dijkstra)


From: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>
To: Alvaro Herrera from 2ndQuadrant <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Junwang Zhao <zhjwpku(at)gmail(dot)com>, Tender Wang <tndrwang(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18559: Crash after detaching a partition concurrently from another session
Date: 2024-08-13 12:12:29
Message-ID: CAGz5QCKGAhdN_Aq109zRAv44nJfZw2Wc6Fx22EZZ-UmX-WGFVg@mail.gmail.com
Lists: pgsql-bugs

On Tue, Aug 13, 2024 at 4:03 AM Alvaro Herrera from 2ndQuadrant
<alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>
> On 2024-Aug-12, Alvaro Herrera from 2ndQuadrant wrote:
>
> > On 2024-Aug-10, Junwang Zhao wrote:
> >
> > > > IIUC, a simple fix would be to retry if an entry is not found. Attached a patch.
> > >
> > > I can reproduce the issue, and the patch LGTM.
> >
> > Interesting issue, thanks for reporting and putting together a
> > reproducer. I have added some comments to the proposed patch, so here's
> > a v2 for it. I'm going to write a commit message for it and push to all
> > branches since 14.
>
> Dept. of second thoughts. I couldn't find any reason why it's okay to
> dereference the return value from systable_getnext() without verifying
> that it's not NULL, so I added the check to the older branches too. As
> far as I know we've never had a crash report that could be traced to
> lack of that check, but that code still looks like it's assuming a
> little too much.
+1.

>
> One more thing here. Applying the test scripts that I used for the
> previous bug with addition of DROP after the DETACH CONCURRENTLY, I get
> a different failure during planning, which reports this error:
>
> ERROR: could not open relation with OID 457639
> STATEMENT: select * from p where a = $1;
>
> #0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo(at)entry=6, no_tid=no_tid(at)entry=0) at ./nptl/pthread_kill.c:44
> #1 0x00007f149c3cae8f in __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
> #2 0x00007f149c37bfb2 in __GI_raise (sig=sig(at)entry=6) at ../sysdeps/posix/raise.c:26
> #3 0x00007f149c366472 in __GI_abort () at ./stdlib/abort.c:79
> #4 0x0000557db57b8611 in errfinish (filename=<optimized out>, lineno=61, funcname=0x557db5816128 <__func__.1> "relation_open")
> at ../../../../../../pgsql/source/master/src/backend/utils/error/elog.c:599
> #5 0x0000557db52441ad in relation_open (relationId=17466, lockmode=lockmode(at)entry=1)
> at ../../../../../../pgsql/source/master/src/backend/access/common/relation.c:61
> #6 0x0000557db537d1d9 in table_open (relationId=<optimized out>, lockmode=lockmode(at)entry=1)
> at ../../../../../../pgsql/source/master/src/backend/access/table/table.c:44
> #7 0x0000557db55b0cf8 in expand_partitioned_rtentry (root=root(at)entry=0x557db9c8b530, relinfo=relinfo(at)entry=0x557db9c8c2d8,
> parentrte=parentrte(at)entry=0x557db9c8c178, parentRTindex=parentRTindex(at)entry=1, parentrel=parentrel(at)entry=0x7f149bc46060, parent_updatedCols=0x0,
> top_parentrc=0x0, lockmode=1) at ../../../../../../pgsql/source/master/src/backend/optimizer/util/inherit.c:390
That means that after getting the live partitions from
prune_append_rel_partitions(), by the time the code tries to lock a
child, it has already been dropped.

I'm able to reproduce the issue with the same steps, with a small tweak:

Session 1:
1. Continue till DetachPartitionFinalize.

Session 2:
1. Continue till expand_partitioned_rtentry(). It'll find two live
partitions after calling prune_append_rel_partitions().

Session 1:
1. Run to completion.
2. SQL: drop table p2;

Session 2:
1. Continue

The table_open call will throw the error "could not open relation with OID".

The function find_inheritance_children_extended deals with the missing
partition as follows:

    /* Get the lock to synchronize against concurrent drop */
    LockRelationOid(inhrelid, lockmode);

    /*
     * Now that we have the lock, double-check to see if the relation
     * really exists or not.  If not, assume it was dropped while we
     * waited to acquire lock, and ignore it.
     */
    if (!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(inhrelid)))
    {
        /* Release useless lock */
        UnlockRelationOid(inhrelid, lockmode);
        /* And ignore this relation */
        continue;
    }
However, no similar check exists in expand_partitioned_rtentry().
Introducing the same check would fix the issue. But I don't know how it
affects the pruning part, since this partition wasn't pruned earlier,
which is why we're opening the child partition.

The key points I see are:
1. The find_inheritance_children_extended() function includes a
partition that's being detached based on the current snapshot.
2. Later in the code path, the expand_partitioned_rtentry function
takes a heavy-weight lock on the partitions.

Between these two steps, some code paths expect the child partition to
exist in the syscache, which leads to errors or crashes.

--
Thanks & Regards,
Kuntal Ghosh


From: Alvaro Herrera from 2ndQuadrant <alvherre(at)alvh(dot)no-ip(dot)org>
To: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>
Cc: Junwang Zhao <zhjwpku(at)gmail(dot)com>, Tender Wang <tndrwang(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18559: Crash after detaching a partition concurrently from another session
Date: 2024-08-13 18:18:58
Message-ID: [email protected]
Lists: pgsql-bugs

On 2024-Aug-13, Kuntal Ghosh wrote:

> That means - after getting the live partitions from
> prune_append_rel_partitions(), by the time the code tries to lock a
> child, it's already dropped.

Right.

> However, similar check is not there in expand_partitioned_rtentry().
> Introducing the same check will fix the issue. But, I don't know how
> it affects the pruning part as this partition couldn't be pruned
> earlier and that's why we're opening the child partition.

Hmm, we could just remove the partition from the set of live partitions
-- then it should behave the same as if the partition had been pruned.
Something like the attached, perhaps.

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"In fact, the basic problem with Perl 5's subroutines is that they're not
crufty enough, so the cruft leaks out into user-defined code instead, by
the Conservation of Cruft Principle." (Larry Wall, Apocalypse 6)

Attachment Content-Type Size
0001-Don-t-open-partitions-that-were-detached-and-dropped.patch text/x-diff 1.2 KB

From: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>
To: Alvaro Herrera from 2ndQuadrant <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Junwang Zhao <zhjwpku(at)gmail(dot)com>, Tender Wang <tndrwang(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18559: Crash after detaching a partition concurrently from another session
Date: 2024-08-15 19:31:49
Message-ID: CAGz5QC+XXc+rV=BBGwGVoOO8X7hmhpLfNBvVOiZpk22M6PyfBg@mail.gmail.com
Lists: pgsql-bugs

On Tue, Aug 13, 2024 at 11:49 PM Alvaro Herrera from 2ndQuadrant
<alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> > That means - after getting the live partitions from
> > prune_append_rel_partitions(), by the time the code tries to lock a
> > child, it's already dropped.
>
> Right.
>
> > However, similar check is not there in expand_partitioned_rtentry().
> > Introducing the same check will fix the issue. But, I don't know how
> > it affects the pruning part as this partition couldn't be pruned
> > earlier and that's why we're opening the child partition.
>
> Hmm, we could just remove the partition from the set of live partitions
> -- then it should behave the same as if the partition had been pruned.
> Something like the attached, perhaps.
>
Thanks for the patch. LGTM. I've verified that it's fixing the issue.

--
Thanks & Regards,
Kuntal Ghosh


From: Junwang Zhao <zhjwpku(at)gmail(dot)com>
To: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>
Cc: Alvaro Herrera from 2ndQuadrant <alvherre(at)alvh(dot)no-ip(dot)org>, Tender Wang <tndrwang(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18559: Crash after detaching a partition concurrently from another session
Date: 2024-08-18 14:46:45
Message-ID: CAEG8a3L-4qyFUQO=ARZu+=-owVbW8oJziDH0GcU9GYt0__jYvg@mail.gmail.com
Lists: pgsql-bugs

On Fri, Aug 16, 2024 at 3:32 AM Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com> wrote:
>
> On Tue, Aug 13, 2024 at 11:49 PM Alvaro Herrera from 2ndQuadrant
> <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> > > That means - after getting the live partitions from
> > > prune_append_rel_partitions(), by the time the code tries to lock a
> > > child, it's already dropped.
> >
> > Right.
> >
> > > However, similar check is not there in expand_partitioned_rtentry().
> > > Introducing the same check will fix the issue. But, I don't know how
> > > it affects the pruning part as this partition couldn't be pruned
> > > earlier and that's why we're opening the child partition.
> >
> > Hmm, we could just remove the partition from the set of live partitions
> > -- then it should behave the same as if the partition had been pruned.
> > Something like the attached, perhaps.
> >
> Thanks for the patch. LGTM. I've verified that it's fixing the issue.
>
+1

>
>
> --
> Thanks & Regards,
> Kuntal Ghosh

--
Regards
Junwang Zhao


From: Alvaro Herrera from 2ndQuadrant <alvherre(at)alvh(dot)no-ip(dot)org>
To: Junwang Zhao <zhjwpku(at)gmail(dot)com>
Cc: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Tender Wang <tndrwang(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18559: Crash after detaching a partition concurrently from another session
Date: 2024-08-20 18:46:41
Message-ID: [email protected]
Lists: pgsql-bugs

On 2024-Aug-18, Junwang Zhao wrote:

> On Fri, Aug 16, 2024 at 3:32 AM Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com> wrote:

> > Thanks for the patch. LGTM. I've verified that it's fixing the issue.
>
> +1

Thanks both for looking, I pushed this yesterday.

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/


From: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>
To: Alvaro Herrera from 2ndQuadrant <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Junwang Zhao <zhjwpku(at)gmail(dot)com>, Tender Wang <tndrwang(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18559: Crash after detaching a partition concurrently from another session
Date: 2024-08-22 15:14:22
Message-ID: CAGz5QCJ66Tc56+i2o1mN48=7VV8+9BJaLa2rwuHUvEtX1708sw@mail.gmail.com
Lists: pgsql-bugs

On Wed, Aug 21, 2024 at 12:16 AM Alvaro Herrera from 2ndQuadrant <
alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> Thanks both for looking, I pushed this yesterday.

Awesome. I've marked the commitfest entry[1] as committed.

[1] https://commitfest.postgresql.org/49/5155/

--
Thanks & Regards,
Kuntal Ghosh