Commit fd0b9dc

Committed by Amit Kapila
Prohibit combining publications with different column lists.

Currently, we simply combine the column lists when publishing tables on multiple publications and that can sometimes lead to unexpected behavior. Say, if a column is published in any row-filtered publication, then the values for that column are sent to the subscriber even for rows that don't match the row filter, as long as the row matches the row filter for any other publication, even if that other publication doesn't include the column.

The main purpose of introducing a column list is to have statically different shapes on publisher and subscriber or hide sensitive column data. In both cases, it doesn't seem to make sense to combine column lists.

So, we disallow the cases where the column list is different for the same table when combining publications. It can be later extended to combine the column lists for selective cases where required.

Reported-by: Alvaro Herrera
Author: Hou Zhijie
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/[email protected]
1 parent 99f6f19 commit fd0b9dc
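
The problem described in the commit message can be illustrated with a small standalone model. This is only a sketch, not PostgreSQL code: the `Row` and `Pub` types, the column bit flags, and the two filter functions are hypothetical names invented for the illustration. It models the old behavior, where column lists were merged and a row was sent if it matched any publication's row filter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical table row with three columns: id, secret, amount. */
typedef struct Row
{
    int id;
    int secret;
    int amount;
} Row;

/* One bit per column (illustrative stand-in for a column list). */
enum { COL_ID = 1 << 0, COL_SECRET = 1 << 1, COL_AMOUNT = 1 << 2 };

/* Hypothetical publication: a column list plus a row filter. */
typedef struct Pub
{
    unsigned columns;
    bool (*row_filter)(const Row *row);
} Pub;

static bool amount_over_100(const Row *row) { return row->amount > 100; }
static bool id_is_even(const Row *row) { return row->id % 2 == 0; }

/*
 * Model of the pre-commit behavior: the column lists of all publications
 * are merged, and the row is sent if any row filter matches. Returns the
 * mask of columns sent, or 0 if the row matches no filter.
 */
static unsigned
columns_sent_when_merging(const Pub *pubs, size_t npubs, const Row *row)
{
    unsigned merged = 0;
    bool row_matches = false;

    for (size_t i = 0; i < npubs; i++)
    {
        merged |= pubs[i].columns;
        if (pubs[i].row_filter(row))
            row_matches = true;
    }

    return row_matches ? merged : 0;
}
```

With one publication exposing `secret` only for rows where `amount > 100` and another exposing `amount` for even ids, a row with an even id and a small amount still has `secret` sent under this model, which is the leak the commit prevents by rejecting differing column lists.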

6 files changed, +181 -166 lines changed

doc/src/sgml/ref/alter_publication.sgml

+11-1
@@ -116,7 +116,17 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r
 
   <para>
    Optionally, a column list can be specified. See <xref
-   linkend="sql-createpublication"/> for details.
+   linkend="sql-createpublication"/> for details. Note that a subscription
+   having several publications in which the same table has been published
+   with different column lists is not supported. So, changing the column
+   lists of the tables being subscribed could cause inconsistency of column
+   lists among publications, in which case <command>ALTER PUBLICATION</command>
+   will be successful but later the WalSender on the publisher or the
+   subscriber may throw an error. In this scenario, the user needs to
+   recreate the subscription after adjusting the column list or drop the
+   problematic publication using
+   <literal>ALTER SUBSCRIPTION ... DROP PUBLICATION</literal> and then add
+   it back after adjusting the column list.
   </para>
 
   <para>

doc/src/sgml/ref/create_subscription.sgml

+5
@@ -355,6 +355,11 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
    copied data that would be incompatible with subsequent filtering.
   </para>
 
+  <para>
+   Subscriptions having several publications in which the same table has been
+   published with different column lists are not supported.
+  </para>
+
   <para>
    We allow non-existent publications to be specified so that users can add
    those later. This means

src/backend/commands/subscriptioncmds.c

+23-5
@@ -1754,24 +1754,35 @@ AlterSubscriptionOwner_oid(Oid subid, Oid newOwnerId)
 /*
  * Get the list of tables which belong to specified publications on the
  * publisher connection.
+ *
+ * Note that we don't support the case where the column list is different for
+ * the same table in different publications to avoid sending unwanted column
+ * information for some of the rows. This can happen when both the column
+ * list and row filter are specified for different publications.
  */
 static List *
 fetch_table_list(WalReceiverConn *wrconn, List *publications)
 {
     WalRcvExecResult *res;
     StringInfoData cmd;
     TupleTableSlot *slot;
-    Oid tableRow[2] = {TEXTOID, TEXTOID};
+    Oid tableRow[3] = {TEXTOID, TEXTOID, NAMEARRAYOID};
     List *tablelist = NIL;
+    bool check_columnlist = (walrcv_server_version(wrconn) >= 150000);
 
     initStringInfo(&cmd);
-    appendStringInfoString(&cmd, "SELECT DISTINCT t.schemaname, t.tablename\n"
-                           " FROM pg_catalog.pg_publication_tables t\n"
+    appendStringInfoString(&cmd, "SELECT DISTINCT t.schemaname, t.tablename \n");
+
+    /* Get column lists for each relation if the publisher supports it */
+    if (check_columnlist)
+        appendStringInfoString(&cmd, ", t.attnames\n");
+
+    appendStringInfoString(&cmd, "FROM pg_catalog.pg_publication_tables t\n"
                            " WHERE t.pubname IN (");
     get_publications_str(publications, &cmd, true);
     appendStringInfoChar(&cmd, ')');
 
-    res = walrcv_exec(wrconn, cmd.data, 2, tableRow);
+    res = walrcv_exec(wrconn, cmd.data, check_columnlist ? 3 : 2, tableRow);
     pfree(cmd.data);
 
     if (res->status != WALRCV_OK_TUPLES)
@@ -1795,7 +1806,14 @@ fetch_table_list(WalReceiverConn *wrconn, List *publications)
         Assert(!isnull);
 
         rv = makeRangeVar(nspname, relname, -1);
-        tablelist = lappend(tablelist, rv);
+
+        if (check_columnlist && list_member(tablelist, rv))
+            ereport(ERROR,
+                    errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                    errmsg("cannot use different column lists for table \"%s.%s\" in different publications",
+                           nspname, relname));
+        else
+            tablelist = lappend(tablelist, rv);
 
         ExecClearTuple(slot);
     }
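
The check added to fetch_table_list relies on the query being SELECT DISTINCT over schema name, table name and column names: if the same table comes back in more than one row, its column lists must differ between publications. A minimal standalone sketch of that idea follows. It is not the PostgreSQL implementation; `PubTableRow` and `find_conflicting_table` are hypothetical names, and plain string comparison stands in for `RangeVar` and `list_member`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one row of the publisher's query result. */
typedef struct PubTableRow
{
    const char *schemaname;
    const char *tablename;
    const char *attnames;   /* column list rendered as text, e.g. "{a,b}" */
} PubTableRow;

/*
 * Return the index of the first row whose (schema, table) pair already
 * appeared in an earlier row, or -1 if every table appears exactly once.
 *
 * The rows are assumed to be DISTINCT over all three fields, so a repeated
 * (schema, table) pair implies two different column lists.
 */
static int
find_conflicting_table(const PubTableRow *rows, size_t nrows)
{
    for (size_t i = 0; i < nrows; i++)
    {
        for (size_t j = 0; j < i; j++)
        {
            if (strcmp(rows[i].schemaname, rows[j].schemaname) == 0 &&
                strcmp(rows[i].tablename, rows[j].tablename) == 0)
                return (int) i;
        }
    }

    return -1;
}
```

A caller would report an error for the table at the returned index, which corresponds to the ereport branch in the hunk above.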

src/backend/replication/logical/tablesync.c

+38-34
@@ -753,25 +753,14 @@ fetch_remote_table_info(char *nspname, char *relname,
     /*
      * Get column lists for each relation.
      *
-     * For initial synchronization, column lists can be ignored in following
-     * cases:
-     *
-     * 1) one of the subscribed publications for the table hasn't specified
-     * any column list
-     *
-     * 2) one of the subscribed publications has puballtables set to true
-     *
-     * 3) one of the subscribed publications is declared as ALL TABLES IN
-     * SCHEMA that includes this relation
-     *
      * We need to do this before fetching info about column names and types,
      * so that we can skip columns that should not be replicated.
      */
     if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
     {
         WalRcvExecResult *pubres;
         TupleTableSlot *slot;
-        Oid attrsRow[] = {INT2OID};
+        Oid attrsRow[] = {INT2VECTOROID};
         StringInfoData pub_names;
         bool first = true;
 
@@ -786,19 +775,17 @@ fetch_remote_table_info(char *nspname, char *relname,
 
         /*
          * Fetch info about column lists for the relation (from all the
-         * publications). We unnest the int2vector values, because that makes
-         * it easier to combine lists by simply adding the attnums to a new
-         * bitmap (without having to parse the int2vector data). This
-         * preserves NULL values, so that if one of the publications has no
-         * column list, we'll know that.
+         * publications).
          */
         resetStringInfo(&cmd);
         appendStringInfo(&cmd,
-                         "SELECT DISTINCT unnest"
+                         "SELECT DISTINCT"
+                         " (CASE WHEN (array_length(gpt.attrs, 1) = c.relnatts)"
+                         " THEN NULL ELSE gpt.attrs END)"
                          " FROM pg_publication p,"
-                         " LATERAL pg_get_publication_tables(p.pubname) gpt"
-                         " LEFT OUTER JOIN unnest(gpt.attrs) ON TRUE"
-                         " WHERE gpt.relid = %u"
+                         " LATERAL pg_get_publication_tables(p.pubname) gpt,"
+                         " pg_class c"
+                         " WHERE gpt.relid = %u AND c.oid = gpt.relid"
                          " AND p.pubname IN ( %s )",
                          lrel->remoteid,
                          pub_names.data);
@@ -813,26 +800,43 @@ fetch_remote_table_info(char *nspname, char *relname,
                             nspname, relname, pubres->err)));
 
         /*
-         * Merge the column lists (from different publications) by creating a
-         * single bitmap with all the attnums. If we find a NULL value, that
-         * means one of the publications has no column list for the table
-         * we're syncing.
+         * We don't support the case where the column list is different for
+         * the same table when combining publications. See comments atop
+         * fetch_table_list. So there should be only one row returned.
+         * Although we already checked this when creating the subscription, we
+         * still need to check here in case the column list was changed after
+         * creating the subscription and before the sync worker is started.
+         */
+        if (tuplestore_tuple_count(pubres->tuplestore) > 1)
+            ereport(ERROR,
+                    errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                    errmsg("cannot use different column lists for table \"%s.%s\" in different publications",
+                           nspname, relname));
+
+        /*
+         * Get the column list and build a single bitmap with the attnums.
+         *
+         * If we find a NULL value, it means all the columns should be
+         * replicated.
          */
         slot = MakeSingleTupleTableSlot(pubres->tupledesc, &TTSOpsMinimalTuple);
-        while (tuplestore_gettupleslot(pubres->tuplestore, true, false, slot))
+        if (tuplestore_gettupleslot(pubres->tuplestore, true, false, slot))
         {
             Datum cfval = slot_getattr(slot, 1, &isnull);
 
-            /* NULL means empty column list, so we're done. */
-            if (isnull)
+            if (!isnull)
             {
-                bms_free(included_cols);
-                included_cols = NULL;
-                break;
-            }
+                ArrayType *arr;
+                int nelems;
+                int16 *elems;
 
-            included_cols = bms_add_member(included_cols,
-                                           DatumGetInt16(cfval));
+                arr = DatumGetArrayTypeP(cfval);
+                nelems = ARR_DIMS(arr)[0];
+                elems = (int16 *) ARR_DATA_PTR(arr);
+
+                for (natt = 0; natt < nelems; natt++)
+                    included_cols = bms_add_member(included_cols, elems[natt]);
+            }
 
             ExecClearTuple(slot);
         }
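
Two ideas from the tablesync.c hunks can be modeled in isolation: a column list that names every column of the table is normalized to NULL (the CASE expression in the query), and a non-NULL list is turned into a bitmap by walking the int2 array. The sketch below is a simplified model, not the PostgreSQL code: a `uint64_t` stands in for `Bitmapset`, so attribute numbers are assumed to lie in 1..63, the value 0 plays the role of NULL ("replicate all columns"), and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Build a bitmask with one bit set per attribute number.
 * Assumes every attnum is in the range 1..63.
 */
static uint64_t
attnums_to_mask(const int16_t *elems, int nelems)
{
    uint64_t mask = 0;

    for (int i = 0; i < nelems; i++)
        mask |= UINT64_C(1) << elems[i];

    return mask;
}

/*
 * Mirror the normalization done in the query: a missing column list, or one
 * whose length equals the table's column count (relnatts), is treated as
 * "no column list", represented here as 0.
 */
static uint64_t
normalized_column_mask(const int16_t *elems, int nelems, int relnatts)
{
    if (elems == NULL || nelems == relnatts)
        return 0;

    return attnums_to_mask(elems, nelems);
}
```

Because of this normalization, a publication that lists all columns explicitly and one that has no column list produce the same value, so the two are not reported as a mismatch.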

src/backend/replication/pgoutput/pgoutput.c

+40-40
@@ -979,30 +979,31 @@ pgoutput_column_list_init(PGOutputData *data, List *publications,
                           RelationSyncEntry *entry)
 {
     ListCell *lc;
+    bool first = true;
+    Relation relation = RelationIdGetRelation(entry->publish_as_relid);
 
     /*
      * Find if there are any column lists for this relation. If there are,
-     * build a bitmap merging all the column lists.
-     *
-     * All the given publication-table mappings must be checked.
+     * build a bitmap using the column lists.
      *
      * Multiple publications might have multiple column lists for this
      * relation.
      *
+     * Note that we don't support the case where the column list is different
+     * for the same table when combining publications. See comments atop
+     * fetch_table_list. But one can later change the publication so we still
+     * need to check all the given publication-table mappings and report an
+     * error if any publications have a different column list.
+     *
      * FOR ALL TABLES and FOR ALL TABLES IN SCHEMA implies "don't use column
-     * list" so it takes precedence.
+     * list".
      */
     foreach(lc, publications)
     {
         Publication *pub = lfirst(lc);
         HeapTuple cftuple = NULL;
         Datum cfdatum = 0;
-
-        /*
-         * Assume there's no column list. Only if we find pg_publication_rel
-         * entry with a column list we'll switch it to false.
-         */
-        bool pub_no_list = true;
+        Bitmapset *cols = NULL;
 
         /*
          * If the publication is FOR ALL TABLES then it is treated the same as
@@ -1011,6 +1012,8 @@ pgoutput_column_list_init(PGOutputData *data, List *publications,
          */
         if (!pub->alltables)
         {
+            bool pub_no_list = true;
+
             /*
              * Check for the presence of a column list in this publication.
              *
@@ -1024,51 +1027,48 @@ pgoutput_column_list_init(PGOutputData *data, List *publications,
 
             if (HeapTupleIsValid(cftuple))
             {
-                /*
-                 * Lookup the column list attribute.
-                 *
-                 * Note: We update the pub_no_list value directly, because if
-                 * the value is NULL, we have no list (and vice versa).
-                 */
+                /* Lookup the column list attribute. */
                 cfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, cftuple,
                                           Anum_pg_publication_rel_prattrs,
                                           &pub_no_list);
 
-                /*
-                 * Build the column list bitmap in the per-entry context.
-                 *
-                 * We need to merge column lists from all publications, so we
-                 * update the same bitmapset. If the column list is null, we
-                 * interpret it as replicating all columns.
-                 */
+                /* Build the column list bitmap in the per-entry context. */
                 if (!pub_no_list) /* when not null */
                 {
                     pgoutput_ensure_entry_cxt(data, entry);
 
-                    entry->columns = pub_collist_to_bitmapset(entry->columns,
-                                                              cfdatum,
-                                                              entry->entry_cxt);
+                    cols = pub_collist_to_bitmapset(cols, cfdatum,
+                                                    entry->entry_cxt);
+
+                    /*
+                     * If column list includes all the columns of the table,
+                     * set it to NULL.
+                     */
+                    if (bms_num_members(cols) == RelationGetNumberOfAttributes(relation))
+                    {
+                        bms_free(cols);
+                        cols = NULL;
+                    }
                 }
+
+                ReleaseSysCache(cftuple);
             }
         }
 
-        /*
-         * Found a publication with no column list, so we're done. But first
-         * discard column list we might have from preceding publications.
-         */
-        if (pub_no_list)
+        if (first)
         {
-            if (cftuple)
-                ReleaseSysCache(cftuple);
-
-            bms_free(entry->columns);
-            entry->columns = NULL;
-
-            break;
+            entry->columns = cols;
+            first = false;
         }
-
-        ReleaseSysCache(cftuple);
+        else if (!bms_equal(entry->columns, cols))
+            ereport(ERROR,
+                    errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                    errmsg("cannot use different column lists for table \"%s.%s\" in different publications",
+                           get_namespace_name(RelationGetNamespace(relation)),
+                           RelationGetRelationName(relation)));
     } /* loop all subscribed publications */
+
+    RelationClose(relation);
 }
 
 /*
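
The loop in pgoutput_column_list_init now follows a "remember the first, compare the rest" pattern: the first publication's column list becomes the entry's list, and every later publication must yield an equal list or an error is raised. A minimal standalone sketch of that pattern is shown here. It is an illustration under stated assumptions rather than the actual implementation: column lists are modeled as `uint64_t` masks instead of `Bitmapset`, 0 means "no column list", and `resolve_column_list` is a hypothetical name that returns false where the real code would call ereport.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Walk the per-publication column masks for one table. The first mask is
 * adopted; any later mask that differs is a conflict.
 *
 * Returns true and stores the agreed mask in *result if all publications
 * agree, or false on the first mismatch (leaving *result untouched).
 */
static bool
resolve_column_list(const uint64_t *pub_masks, size_t npubs, uint64_t *result)
{
    bool first = true;
    uint64_t columns = 0;

    for (size_t i = 0; i < npubs; i++)
    {
        if (first)
        {
            columns = pub_masks[i];
            first = false;
        }
        else if (columns != pub_masks[i])
            return false;
    }

    *result = columns;
    return true;
}
```

Note that under this model a publication without a column list (mask 0) conflicts with one that restricts columns, which matches the commit's intent of rejecting any difference instead of silently widening the list.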
