Age | Commit message (Collapse) | Author |
|
This allows specifying an optional column list when adding a table to
logical replication. The column list may be specified after the table
name, enclosed in parentheses. Columns not included in this list are not
sent to the subscriber, allowing the schema on the subscriber to be a
subset of the publisher schema.
For UPDATE/DELETE publications, the column list needs to cover all
REPLICA IDENTITY columns. For INSERT publications, the column list is
arbitrary and may omit some REPLICA IDENTITY columns. Furthermore, if
the table uses REPLICA IDENTITY FULL, column list is not allowed.
The column list can contain only simple column references. Complex
expressions, function calls etc. are not allowed. This restriction could
be relaxed in the future.
During the initial table synchronization, only columns included in the
column list are copied to the subscriber. If the subscription has
several publications, containing the same table with different column
lists, columns specified in any of the lists will be copied.
This means all columns are replicated if the table has no column list
at all (which is treated as column list with all columns), or when of
the publications is defined as FOR ALL TABLES (possibly IN SCHEMA that
matches the schema of the table).
For partitioned tables, publish_via_partition_root determines whether
the column list for the root or the leaf relation will be used. If the
parameter is 'false' (the default), the list defined for the leaf
relation is used. Otherwise, the column list for the root partition
will be used.
Psql commands \dRp+ and \d <table-name> now display any column lists.
Author: Tomas Vondra, Alvaro Herrera, Rahila Syed
Reviewed-by: Peter Eisentraut, Alvaro Herrera, Vignesh C, Ibrar Ahmed,
Amit Kapila, Hou zj, Peter Smith, Wang wei, Tang, Shi yu
Discussion: https://postgr.es/m/CAH2L28vddB_NFdRVpuyRBJEBWjz4BSyTB=_ektNRH8NJ1jf95g@mail.gmail.com
|
|
This commit adds support for decoding of sequences to the built-in
replication (the infrastructure was added by commit 0da92dc530).
The syntax and behavior mostly mimics handling of tables, i.e. a
publication may be defined as FOR ALL SEQUENCES (replicating all
sequences in a database), FOR ALL SEQUENCES IN SCHEMA (replicating
all sequences in a particular schema) or individual sequences.
To publish sequence modifications, the publication has to include
'sequence' action. The protocol is extended with a new message,
describing sequence increments.
A new system view pg_publication_sequences lists all the sequences
added to a publication, both directly and indirectly. Various psql
commands (\d and \dRp) are improved to also display publications
including a given sequence, or sequences included in a publication.
Author: Tomas Vondra, Cary Huang
Reviewed-by: Peter Eisentraut, Amit Kapila, Hannu Krosing, Andres
Freund, Petr Jelinek
Discussion: https://postgr.es/m/[email protected]
Discussion: https://postgr.es/m/[email protected]
|
|
Noticed via -fsanitize=undefined.
Backpatch to all branches, for the same reasons as 46ab07ffda9.
Discussion: https://postgr.es/m/[email protected]
Backpatch: 10-
|
|
A security invoker view checks permissions for accessing its
underlying base relations using the privileges of the user of the
view, rather than the privileges of the view owner. Additionally, if
any of the base relations are tables with RLS enabled, the policies of
the user of the view are applied, rather than those of the view owner.
This allows views to be defined without giving away additional
privileges on the underlying base relations, and matches a similar
feature available in other database systems.
It also allows views to operate more naturally with RLS, without
affecting the assignments of policies to users.
Christoph Heiss, with some additional hacking by me. Reviewed by
Laurenz Albe and Wolfgang Walther.
Discussion: https://postgr.es/m/b66dd6d6-ad3e-c6f2-8b90-47be773da240%40cybertec.at
|
|
Restructure things so that the functions which update the global
variables shared_map and local_map are separate from the functions
which just read and write relation map files without touching any
global variables.
In the new structure of things, write_relmap_file() writes a relmap
file but no longer performs global variable updates. A symmetric
function read_relmap_file() that just reads a file without changing
any global variables is added, and load_relmap_file(), which does
change the global variables, uses it as a subroutine.
Because write_relmap_file() no longer updates shared_map and
local_map, that logic is moved to perform_relmap_update(). However,
no similar logic is added to relmap_redo() even though it also calls
write_relmap_file(). That's because recovery must not rely on the
contents of the relation map, and therefore there is no need to
initialize it. In fact, doing so seems like a mistake, because we
might then manage to rely on the in-memory map where we shouldn't.
Patch by me, based on earlier work by Dilip Kumar. Reviewed by
Ashutosh Sharma.
Discussion: http://postgr.es/m/CA+TgmobQLgrt4AXsc0ru7aFFkzv=9fS-Q_yO69=k9WY67RCctg@mail.gmail.com
|
|
Commit 8b069ef5d changed this function to look at pg_constraint.conindid
rather than searching pg_depend. That was a good performance improvement,
but it failed to preserve the exact semantics. The old code would only
return an index that was "owned by" (internally dependent on) the
specified constraint, whereas the new code will also return indexes that
are just referenced by foreign key constraints. This confuses ALTER
TABLE, which was implicitly expecting the previous semantics, into
failing with errors like
ERROR: relation 146621 has multiple clustered indexes
or
ERROR: "pk_attbl" is not an index for table "atref"
We can fix this without reverting the performance improvement by adding
a contype check in get_constraint_index(). Another way could be to
make ALTER TABLE check it, but I'm worried that extension code could
also have subtle dependencies on the old semantics.
Tom Lane and Japin Li, per bug #17409 from Holly Roberts.
Back-patch to v14 where the error crept in.
Discussion: https://postgr.es/m/[email protected]
|
|
There are three parallel ways to call parse/analyze: with fixed
parameters, with variable parameters, and by supplying your own parser
callback. Some of the involved functions were confusingly named and
made this API structure more confusing. This patch renames some
functions to make this clearer:
parse_analyze() -> parse_analyze_fixedparams()
pg_analyze_and_rewrite() -> pg_analyze_and_rewrite_fixedparams()
(Otherwise one might think this variant doesn't accept parameters, but
in fact all three ways accept parameters.)
pg_analyze_and_rewrite_params() -> pg_analyze_and_rewrite_withcb()
(Before, and also when considering pg_analyze_and_rewrite(), one might
think this is the only way to pass parameters. Moreover, the parser
callback doesn't necessarily need to parse only parameters, it's just
one of the things it could do.)
parse_fixed_parameters() -> setup_parse_fixed_parameters()
parse_variable_parameters() -> setup_parse_variable_parameters()
(These functions don't actually do any parsing, they just set up
callbacks to use during parsing later.)
This patch also adds some const decorations to the fixed-parameters
API, so the distinction from the variable-parameters API is more
clear.
Reviewed-by: Nathan Bossart <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/[email protected]
|
|
This feature adds row filtering for publication tables. When a publication
is defined or modified, an optional WHERE clause can be specified. Rows
that don't satisfy this WHERE clause will be filtered out. This allows a
set of tables to be partially replicated. The row filter is per table. A
new row filter can be added simply by specifying a WHERE clause after the
table name. The WHERE clause must be enclosed by parentheses.
The row filter WHERE clause for a table added to a publication that
publishes UPDATE and/or DELETE operations must contain only columns that
are covered by REPLICA IDENTITY. The row filter WHERE clause for a table
added to a publication that publishes INSERT can use any column. If the
row filter evaluates to NULL, it is regarded as "false". The WHERE clause
only allows simple expressions that don't have user-defined functions,
user-defined operators, user-defined types, user-defined collations,
non-immutable built-in functions, or references to system columns. These
restrictions could be addressed in the future.
If you choose to do the initial table synchronization, only data that
satisfies the row filters is copied to the subscriber. If the subscription
has several publications in which a table has been published with
different WHERE clauses, rows that satisfy ANY of the expressions will be
copied. If a subscriber is a pre-15 version, the initial table
synchronization won't use row filters even if they are defined in the
publisher.
The row filters are applied before publishing the changes. If the
subscription has several publications in which the same table has been
published with different filters (for the same publish operation), those
expressions get OR'ed together so that rows satisfying any of the
expressions will be replicated.
This means all the other filters become redundant if (a) one of the
publications have no filter at all, (b) one of the publications was
created using FOR ALL TABLES, (c) one of the publications was created
using FOR ALL TABLES IN SCHEMA and the table belongs to that same schema.
If your publication contains a partitioned table, the publication
parameter publish_via_partition_root determines if it uses the partition's
row filter (if the parameter is false, the default) or the root
partitioned table's row filter.
Psql commands \dRp+ and \d <table-name> will display any row filters.
Author: Hou Zhijie, Euler Taveira, Peter Smith, Ajin Cherian
Reviewed-by: Greg Nancarrow, Haiying Tang, Amit Kapila, Tomas Vondra, Dilip Kumar, Vignesh C, Alvaro Herrera, Andres Freund, Wei Wang
Discussion: https://www.postgresql.org/message-id/flat/CAHE3wggb715X%2BmK_DitLXF25B%3DjE6xyNCH4YOwM860JR7HarGQ%40mail.gmail.com
|
|
There were a number of places in the code that used bespoke bit-twiddling
expressions to do bitwise rotation. While we've had pg_rotate_right32()
for a while now, we hadn't gotten around to standardizing on that. Do so
now. Since many potential call sites look more natural with the "left"
equivalent, add that function too.
Reviewed by Tom Lane and Yugo Nagata
Discussion:
https://www.postgresql.org/message-id/CAFBsxsH7c1LC0CGZ0ADCBXLHU5-%3DKNXx-r7tHYPAW51b2HK4Qw%40mail.gmail.com
|
|
The SQL standard has been ambiguous about whether null values in
unique constraints should be considered equal or not. Different
implementations have different behaviors. In the SQL:202x draft, this
has been formalized by making this implementation-defined and adding
an option on unique constraint definitions UNIQUE [ NULLS [NOT]
DISTINCT ] to choose a behavior explicitly.
This patch adds this option to PostgreSQL. The default behavior
remains UNIQUE NULLS DISTINCT. Making this happen in the btree code
is pretty easy; most of the patch is just to carry the flag around to
all the places that need it.
The CREATE UNIQUE INDEX syntax extension is not from the standard,
it's my own invention.
I named all the internal flags, catalog columns, etc. in the negative
("nulls not distinct") so that the default PostgreSQL behavior is the
default if the flag is false.
Reviewed-by: Maxim Orlov <[email protected]>
Reviewed-by: Pavel Borisov <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/[email protected]
|
|
xlog.h is directly and indirectly #included in a lot of places. With
this change, xloginsert.h is no longer unnecessarily included in the
large number of them that don't need it.
Author: Bharath Rupireddy <[email protected]>
Discussion: https://postgr.es/m/CALj2ACVe-W+WM5P44N7eG9C2_FmaeM8Dq5aCnD3fHt0Ba=WR6w@mail.gmail.com
|
|
Add pg_statistic_ext_data.stxdinherit flag, so that for each extended
statistics definition we can store two versions of data - one for the
relation alone, one for the whole inheritance tree. This is analogous to
pg_statistic.stainherit, but we failed to include such flag in catalogs
for extended statistics, and we had to work around it (see commits
859b3003de, 36c4bc6e72 and 20b9fa308e).
This changes the relationship between the two catalogs storing extended
statistics objects (pg_statistic_ext and pg_statistic_ext_data). Until
now, there was a simple 1:1 mapping - for each definition there was one
pg_statistic_ext_data row, and this row was inserted while creating the
statistics (and then updated during ANALYZE). With the stxdinherit flag,
we don't know how many rows there will be (child relations may be added
after the statistics object is defined), so there may be up to two rows.
We could make CREATE STATISTICS to always create both rows, but that
seems wasteful - without partitioning we only need stxdinherit=false
rows, and declaratively partitioned tables need only stxdinherit=true.
So we no longer initialize pg_statistic_ext_data in CREATE STATISTICS,
and instead make that a responsibility of ANALYZE. Which is what we do
for regular statistics too.
Patch by me, with extensive improvements and fixes by Justin Pryzby.
Author: Tomas Vondra, Justin Pryzby
Reviewed-by: Tomas Vondra, Justin Pryzby
Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
|
|
Backpatch-through: 10
|
|
The API spec for lookup_rowtype_tupdesc previously said you could use
either ReleaseTupleDesc or DecrTupleDescRefCount. However, the latter
choice means the caller must be certain that the returned tupdesc is
refcounted. I don't recall right now whether that was always true
when this spec was written, but it's certainly not always true since
we introduced shared record typcaches for parallel workers. That means
that callers using DecrTupleDescRefCount are dependent on typcache
behavior details that they probably shouldn't be. Hence, change the API
spec to say that you must call ReleaseTupleDesc, and fix the half-dozen
callers that weren't.
AFAICT this is just future-proofing, there's no live bug here.
So no back-patch.
Per gripe from Chapman Flack.
Discussion: https://postgr.es/m/[email protected]
|
|
Extend the foreign key ON DELETE actions SET NULL and SET DEFAULT by
allowing the specification of a column list, like
CREATE TABLE posts (
...
FOREIGN KEY (tenant_id, author_id) REFERENCES users ON DELETE SET NULL (author_id)
);
If a column list is specified, only those columns are set to
null/default, instead of all the columns in the foreign-key
constraint.
This is useful for multitenant or sharded schemas, where the tenant or
shard ID is included in the primary key of all tables but shouldn't be
set to null.
Author: Paul Martinez <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/CACqFVBZQyMYJV=njbSMxf+rbDHpx=W=B7AEaMKn8dWn9OZJY7w@mail.gmail.com
|
|
Add more macros to group some RELKIND_* macros:
- RELKIND_HAS_PARTITIONS()
- RELKIND_HAS_TABLESPACE()
- RELKIND_HAS_TABLE_AM()
Reviewed-by: Michael Paquier <[email protected]>
Reviewed-by: Alvaro Herrera <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/a574c8f1-9c84-93ad-a9e5-65233d6fc00f%40enterprisedb.com
|
|
When determining whether an index update may be skipped by using HOT, we
can ignore attributes indexed only by BRIN indexes. There are no index
pointers to individual tuples in BRIN, and the page range summary will
be updated anyway as it relies on visibility info.
This also removes rd_indexattr list, and replaces it with rd_attrsvalid
flag. The list was not used anywhere, and a simple flag is sufficient.
Patch by Josef Simanek, various fixes and improvements by me.
Author: Josef Simanek
Reviewed-by: Tomas Vondra, Alvaro Herrera
Discussion: https://postgr.es/m/CAFp7QwpMRGcDAQumN7onN9HjrJ3u4X3ZRXdGFT0K5G2JWvnbWg%40mail.gmail.com
|
|
fsstate in process_pending_requests (in postgres_fdw.c) was added in
8998e3cafa2 as an assertion-only variable, 1ec7fca8592 stated using
the variable outside of assertions.
rd_index in get_index_column_opclass (in lsyscache.c) was introduced
in 2a6368343ff, and then promptly used in the fix commit 7e041603904
shortly thereafter.
This removes the PG_USED_FOR_ASSERTS_ONLY variable decoration from
the above mentioned variables.
Reviewed-by: Greg Nancarrow <[email protected]>
Discussion: https://postgr.es/m/[email protected]
|
|
A new option "FOR ALL TABLES IN SCHEMA" in Create/Alter Publication allows
one or more schemas to be specified, whose tables are selected by the
publisher for sending the data to the subscriber.
The new syntax allows specifying both the tables and schemas. For example:
CREATE PUBLICATION pub1 FOR TABLE t1,t2,t3, ALL TABLES IN SCHEMA s1,s2;
OR
ALTER PUBLICATION pub1 ADD TABLE t1,t2,t3, ALL TABLES IN SCHEMA s1,s2;
A new system table "pg_publication_namespace" has been added, to maintain
the schemas that the user wants to publish through the publication.
Modified the output plugin (pgoutput) to publish the changes if the
relation is part of schema publication.
Updates pg_dump to identify and dump schema publications. Updates the \d
family of commands to display schema publications and \dRp+ variant will
now display associated schemas if any.
Author: Vignesh C, Hou Zhijie, Amit Kapila
Syntax-Suggested-by: Tom Lane, Alvaro Herrera
Reviewed-by: Greg Nancarrow, Masahiko Sawada, Hou Zhijie, Amit Kapila, Haiying Tang, Ajin Cherian, Rahila Syed, Bharath Rupireddy, Mark Dilger
Tested-by: Haiying Tang
Discussion: https://www.postgresql.org/message-id/CALDaNm0OANxuJ6RXqwZsM1MSY4s19nuH3734j4a72etDwvBETQ@mail.gmail.com
|
|
The purpose of commit 8a54e12a38d1545d249f1402f66c8cde2837d97c was to
fix this, and it sufficed when the PREPARE TRANSACTION completed before
the CIC looked for lock conflicts. Otherwise, things still broke. As
before, in a cluster having used CIC while having enabled prepared
transactions, queries that use the resulting index can silently fail to
find rows. It may be necessary to reindex to recover from past
occurrences; REINDEX CONCURRENTLY suffices. Fix this for future index
builds by making CIC wait for arbitrarily-recent prepared transactions
and for ordinary transactions that may yet PREPARE TRANSACTION. As part
of that, have PREPARE TRANSACTION transfer locks to its dummy PGPROC
before it calls ProcArrayClearTransaction(). Back-patch to 9.6 (all
supported versions).
Andrey Borodin, reviewed (in earlier versions) by Andres Freund.
Discussion: https://postgr.es/m/[email protected]
|
|
CIC and REINDEX CONCURRENTLY assume backends see their catalog changes
no later than each backend's next transaction start. That failed to
hold when a backend absorbed a relevant invalidation in the middle of
running RelationBuildDesc() on the CIC index. Queries that use the
resulting index can silently fail to find rows. Fix this for future
index builds by making RelationBuildDesc() loop until it finishes
without accepting a relevant invalidation. It may be necessary to
reindex to recover from past occurrences; REINDEX CONCURRENTLY suffices.
Back-patch to 9.6 (all supported versions).
Noah Misch and Andrey Borodin, reviewed (in earlier versions) by Andres
Freund.
Discussion: https://postgr.es/m/[email protected]
|
|
Commit 01e658fa74 added hash support for row types. This also added
support for hashing anonymous record types, using the same approach
that the type cache uses for comparison support for record types: It
just reports that it works, but it might fail at run time if a
component type doesn't actually support the operation. We get away
with that for comparison because most types support that. But some
types don't support hashing, so the current state can result in
failures at run time where the planner chooses hashing over sorting,
whereas that previously worked if only sorting was an option.
We do, however, want the record hashing support for path tracking in
recursive unions, and the SEARCH and CYCLE clauses built on that. In
that case, hashing is the only plan option. So enable that, this
commit implements the following approach: The type cache does not
report that hashing is available for the record type. This undoes
that part of 01e658fa74. Instead, callers that require hashing no
matter what can override that result themselves. This patch only
touches the callers to make the aforementioned recursive query cases
work, namely the parse analysis of unions, as well as the hash_array()
function.
Reported-by: Sait Talha Nisanci <[email protected]>
Bug: #17158
Discussion: https://www.postgresql.org/message-id/flat/17158-8a2ba823982537a4%40postgresql.org
|
|
The existing data structures in inval.c are fairly inefficient for
the common case of a command or subtransaction that registers a small
number of cache invalidation events. While this doesn't matter if we
commit right away, it can build up to a lot of bloat in a transaction
that contains many DDL operations. By making a few more assumptions
about the expected use-case, we can switch to a representation using
densely-packed arrays. Although this eliminates some data-copying,
it doesn't seem to make much difference time-wise. But the space
consumption decreases substantially.
Patch by me; thanks to Nathan Bossart for review.
Discussion: https://postgr.es/m/[email protected]
|
|
This patch causes EXPLAIN output to refer to objects that are in
the current session's temp schema with the "pg_temp" schema alias
rather than that schema's actual name. This is useful for our own
testing purposes since it will stabilize EXPLAIN VERBOSE output
for such cases, allowing us to use that in regression tests.
It should be less confusing for end users too.
Since ruleutils.c needs to change behavior for this, the change
also leaks into a few other users of ruleutils.c, for example
pg_get_viewdef(). AFAICS that won't cause any problems.
We did find that aggressively trying to change this behavior
across-the-board would cause issues, but as long as "pg_temp"
only appears within generated SQL text, I think it'll be fine.
Along the way, make get_namespace_name_or_temp conform to the
same API as get_namespace_name, ie that it returns a palloc'd
string or NULL. The current behavior hasn't caused any bugs
since no callers attempt to pfree the result, but if it gets
more widespread usage that could become a problem.
Amul Sul, reviewed and extended by me
Discussion: https://postgr.es/m/CAAJ_b97W=QaGmag9AhWNbmx3uEYsNkXWL+OVW1_E1D3BtgWvtw@mail.gmail.com
|
|
Commit 0563a3a8b changed how partition constraints were generated such
that this function no longer computes the mapping of parent attnos to
child attnos.
This is an external function that extensions could use, so this is
potentially a breaking change. No external callers are known, however,
and this will make it simpler to write such callers in the future.
Author: Hou Zhijie
Reviewed-by: David Rowley, Michael Paquier, Soumyadeep Chakraborty
Discussion: https://www.postgresql.org/message-id/flat/OS0PR01MB5716A75A45BE46101A1B489894379@OS0PR01MB5716.jpnprd01.prod.outlook.com
|
|
The name introduced by commit 4656e3d66 was agreed to be unreasonably
long. To match this change, rename initdb's recently-added
--clobber-cache option to --discard-caches.
Discussion: https://postgr.es/m/[email protected]
|
|
If an error occurred in the wrong place, it was possible to leave an
unintialized entry in the hash table, leading to a crash. Fixed.
Also, be more careful about the order of operations so that an
allocation error doesn't leak memory in CacheMemoryContext or
unnecessarily advance NextRecordTypmod.
Backpatch through version 11. Earlier versions (prior to 35ea75632a5)
do not exhibit the problem, because an uninitialized hash entry
contains a valid empty list.
Author: Sait Talha Nisanci <[email protected]>
Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/HE1PR8303MB009069D476225B9A9E194B8891779@HE1PR8303MB0090.EURPRD83.prod.outlook.com
Backpatch-through: 11
|
|
Commit 03ffc4d6d added logic to bypass all caching behavior in
LookupOpclassInfo when CLOBBER_CACHE_ALWAYS is enabled. It doesn't
look like I stopped to think much about what that would cost, but
recent investigation shows that the cost is enormous: it roughly
doubles the time needed for cache-clobber test runs.
There does seem to be value in this behavior when trying to test
the opclass-cache loading logic itself, but for other purposes the
cost is excessive. Hence, let's back off to doing this only when
debug_invalidate_system_caches_always is at least 3; or in older
branches, when CLOBBER_CACHE_RECURSIVELY is defined.
While here, clean up some other minor issues in LookupOpclassInfo.
Re-order the code so we aren't left with broken cache entries (leading
to later core dumps) in the unlikely case that we suffer OOM while
trying to allocate space for a new entry. (That seems to be my
oversight in 03ffc4d6d.) Also, in >= v13, stop allocating one array
entry too many. That's evidently left over from sloppy reversion in
851b14b0c.
Back-patch to all supported branches, mainly to reduce the runtime
of cache-clobbering buildfarm animals.
Discussion: https://postgr.es/m/[email protected]
|
|
Several places were performing a tight loop to determine the first power
of 2 number that's > or >= the required memory. Instead of using a loop
for that, we can use pg_nextpower2_32 or pg_nextpower2_64. When we need a
power of 2 number equal to or greater than a given amount, we just pass
the amount to the nextpower2 function. When we need a power of 2 greater
than the amount, we just pass the amount + 1.
Additionally, in tsearch there were a couple of locations that were
performing a while loop when a simple "if" would have done. In both of
these locations only 1 item is being added, so the loop could only have
ever iterated once. Changing the loop into an if statement makes the code
very slightly more optimal as the condition is checked once rather than
twice.
There are quite a few remaining locations that increase the size of the
buffer in the following form:
while (reqsize >= buflen)
{
buflen *= 2;
buf = repalloc(buf, buflen);
}
These are not touched in this commit. repalloc will error out for sizes
larger than MaxAllocSize. Changing these to use pg_nextpower2_32 would
remove the chance of that error being raised. It's unclear from the code
if the sizes could ever become that large, so err on the side of caution.
Discussion: https://postgr.es/m/CAApHDvp=tns7RL4PH0ZR0M+M-YFLquK7218x=0B_zO+DbOma+w@mail.gmail.com
Reviewed-by: Zhihong Yu
|
|
Instead of defining symbols such as AmOidIndexId explicitly, include
them as an argument of DECLARE_INDEX() and have genbki.pl generate the
way as the table OID symbols from the CATALOG() declaration.
Reviewed-by: John Naylor <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/ccef1e46-a404-25b1-9b4c-85f2c08e1f28%40enterprisedb.com
|
|
We were using RelationGetIndexList() to update the relation's replica
identity index but instead, we can directly use RelationGetReplicaIndex()
which uses the same functionality. This is a minor code readability
improvement.
Author: Japin Li
Reviewed-By: Takamichi Osumi, Amit Kapila
Discussion: https://postgr.es/m/[email protected]
|
|
In previous commit, I missed that relmap_redo() was also not acquiring the
RelationMappingLock. Thanks to Thomas Munro for pointing that out.
Backpatch-through: 9.6, like previous commit.
Discussion: https://www.postgresql.org/message-id/CA%2BhUKGLev%3DPpOSaL3WRZgOvgk217et%2BbxeJcRr4eR-NttP1F6Q%40mail.gmail.com
|
|
Contrary to the comment here, POSIX does not guarantee atomicity of a
read(), if another process calls write() concurrently. Or at least Linux
does not. Add locking to load_relmap_file() to avoid the race condition.
Fixes bug #17064. Thanks to Alexander Lakhin for the report and test case.
Backpatch-through: 9.6, all supported versions.
Discussion: https://www.postgresql.org/message-id/[email protected]
|
|
Commit e7eea52b2d has introduced a new function
RelationGetIdentityKeyBitmap which omits to handle the case where there is
no replica identity index on a relation.
Author: Mark Dilger
Reviewed-by: Takamichi Osumi, Amit Kapila
Discussion: https://www.postgresql.org/message-id/[email protected]
|
|
Parallel query processes that called BlessTupleDesc() for identical
tuple descriptors at the same moment could crash. There was code to
handle that rare case, but it dereferenced a bogus DSA pointer. Repair.
Back-patch to 11, where commit cc5f8136 added support for sharing tuple
descriptors in parallel queries.
Reported-by: Eric Thinnes <[email protected]>
Discussion: https://postgr.es/m/99aaa2eb-e194-bf07-c29a-1a76b4f2bcf9%40gmx.de
|
|
Also "make reformat-dat-files".
The only change worthy of note is that pgindent messed up the formatting
of launcher.c's struct LogicalRepWorkerId, which led me to notice that
that struct wasn't used at all anymore, so I just took it out.
|
|
Design problems were discovered in the handling of composite types and
record types that would cause some relevant versions not to be recorded.
Misgivings were also expressed about the use of the pg_depend catalog
for this purpose. We're out of time for this release so we'll revert
and try again.
Commits reverted:
1bf946bd: Doc: Document known problem with Windows collation versions.
cf002008: Remove no-longer-relevant test case.
ef387bed: Fix bogus collation-version-recording logic.
0fb0a050: Hide internal error for pg_collation_actual_version(<bad OID>).
ff942057: Suppress "warning: variable 'collcollate' set but not used".
d50e3b1f: Fix assertion in collation version lookup.
f24b1569: Rethink extraction of collation dependencies.
257836a7: Track collation versions for indexes.
cd6f479e: Add pg_depend.refobjversion.
7d1297df: Remove pg_collation.collversion.
Discussion: https://postgr.es/m/CA%2BhUKGLhj5t1fcjqAu8iD9B3ixJtsTNqyCCD4V0aTO9kAKAjjA%40mail.gmail.com
|
|
Makes partition descriptor acquisition faster during the transient
period in which a partition is in the process of being detached.
This also adds the restriction that only one partition can be in
pending-detach state for a partitioned table.
While at it, return find_inheritance_children() API to what it was
before 71f4c8c6f74b, and create a separate
find_inheritance_children_extended() that returns detailed info about
detached partitions.
(This incidentally fixes a bug in 8aba9322511 whereby a memory context
holding a transient partdesc is reparented to a NULL PortalContext,
leading to permanent leak of that memory. The fix is to no longer rely
on reparenting contexts to PortalContext. Reported by Amit Langote.)
Per gripe from Amit Langote
Discussion: https://postgr.es/m/CA+HiwqFgpP1LxJZOBYGt9rpvTjXXkg5qG2+Xch2Z1Q7KrqZR1A@mail.gmail.com
|
|
The Truncate operation acquires an exclusive lock on the target relation
and indexes. It then waits for logical replication of the operation to
finish at commit. Now because we are acquiring the shared lock on the
target index to get index attributes in pgoutput while sending the
changes for the Truncate operation, it leads to a deadlock.
Actually, we don't need to acquire a lock on the target index as we build
the cache entry using a historic snapshot and all the later changes are
absorbed while decoding WAL. So, we wrote a special purpose function for
logical replication to get a bitmap of replica identity attribute numbers
where we get that information without locking the target index.
We decided not to backpatch this as there doesn't seem to be any field
complaint about this issue since it was introduced in commit 5dfd1e5a in
v11.
Reported-by: Haiying Tang
Author: Takamichi Osumi, test case by Li Japin
Reviewed-by: Amit Kapila, Ajin Cherian
Discussion: https://postgr.es/m/OS0PR01MB6113C2499C7DC70EE55ADB82FB759@OS0PR01MB6113.jpnprd01.prod.outlook.com
|
|
Andrew Gierth reported that it's possible to crash the backend if no
pg_attrdef record is found to match an attribute that has atthasdef set.
AttrDefaultFetch warns about this situation, but then leaves behind
a relation tupdesc that has null "adbin" pointer(s), which most places
don't guard against.
We considered promoting the warning to an error, but throwing errors
during relcache load is pretty drastic: it effectively locks one out
of using the relation at all. What seems better is to leave the
load-time behavior as a warning, but then throw an error in any code
path that wants to use a default and can't find it. This confines
the error to a subset of INSERT/UPDATE operations on the table, and
in particular will at least allow a pg_dump to succeed.
Also, we should fix AttrDefaultFetch to not leave any null pointers
in the tupdesc, because that just creates an untested bug hazard.
While at it, apply the same philosophy of "warn at load, throw error
only upon use of the known-missing info" to CHECK constraints.
CheckConstraintFetch is very nearly the same logic as AttrDefaultFetch,
but for reasons lost in the mists of time, it was throwing ERROR for
the same cases that AttrDefaultFetch treats as WARNING. Make the two
functions more nearly alike.
In passing, get rid of potentially-O(N^2) loops in equalTupleDesc
by making AttrDefaultFetch sort the entries after fetching them,
so that equalTupleDesc can assume that entries in two equal tupdescs
must be in matching order. (CheckConstraintFetch already was sorting
CHECK constraints, but equalTupleDesc hadn't been told about it.)
There's some argument for back-patching this, but with such a small
number of field reports, I'm content to fix it in HEAD.
Discussion: https://postgr.es/m/[email protected]
|
|
Membership consists, implicitly, of the current database owner. Expect
use in template databases. Once pg_database_owner has rights within a
template, each owner of a database instantiated from that template will
exercise those rights.
Reviewed by John Naylor.
Discussion: https://postgr.es/m/[email protected]
|
|
Allow a partition be detached from its partitioned table without
blocking concurrent queries, by running in two transactions and only
requiring ShareUpdateExclusive in the partitioned table.
Because it runs in two transactions, it cannot be used in a transaction
block. This is the main reason to use dedicated syntax: so that users
can choose to use the original mode if they need it. But also, it
doesn't work when a default partition exists (because an exclusive lock
would still need to be obtained on it, in order to change its partition
constraint.)
In case the second transaction is cancelled or a crash occurs, there's
ALTER TABLE .. DETACH PARTITION .. FINALIZE, which executes the final
steps.
The main trick to make this work is the addition of column
pg_inherits.inhdetachpending, initially false; can only be set true in
the first part of this command. Once that is committed, concurrent
transactions that use a PartitionDirectory will include or ignore
partitions so marked: in optimizer they are ignored if the row is marked
committed for the snapshot; in executor they are always included. As a
result, and because of the way PartitionDirectory caches partition
descriptors, queries that were planned before the detach will see the
rows in the detached partition and queries that are planned after the
detach, won't.
A CHECK constraint is created that duplicates the partition constraint.
This is probably not strictly necessary, and some users will prefer to
remove it afterwards, but if the partition is re-attached to a
partitioned table, the constraint needn't be rechecked.
Author: Álvaro Herrera <[email protected]>
Reviewed-by: Amit Langote <[email protected]>
Reviewed-by: Justin Pryzby <[email protected]>
Discussion: https://postgr.es/m/[email protected]
|
|
To allow inserts in parallel-mode this feature has to ensure that all the
constraints, triggers, etc. are parallel-safe for the partition hierarchy
which is costly and we need to find a better way to do that. Additionally,
we could have used existing cached information in some cases like indexes,
domains, etc. to determine the parallel-safety.
List of commits reverted, in reverse chronological order:
ed62d3737c Doc: Update description for parallel insert reloption.
c8f78b6161 Add a new GUC and a reloption to enable inserts in parallel-mode.
c5be48f092 Improve FK trigger parallel-safety check added by 05c8482f7f.
e2cda3c20a Fix use of relcache TriggerDesc field introduced by commit 05c8482f7f.
e4e87a32cc Fix valgrind issue in commit 05c8482f7f.
05c8482f7f Enable parallel SELECT for "INSERT INTO ... SELECT ...".
Discussion: https://postgr.es/m/[email protected]
|
|
Previously, to check relation permanence, the Relation's Form_pg_class
structure member relpersistence was compared to the value
RELPERSISTENCE_PERMANENT ("p"). This commit adds the macro
RelationIsPermanent() and is used in appropirate places to simplify the
code. This matches other RelationIs* macros.
This macro will be used in more places in future cluster file encryption
patches.
Discussion: https://postgr.es/m/[email protected]
|
|
Some code paths in this function perform syscache lookups, which
can lead to table accesses and possibly leakage of cruft into
the caller's context. If said context is CacheMemoryContext,
we eventually will have visible bloat. But fixing this is no
harder than moving one memory context switch step. (The other
callers don't have a problem.)
Andres Freund and I independently found this via valgrind testing.
Back-patch to v12 where this code was added.
Discussion: https://postgr.es/m/[email protected]
Discussion: https://postgr.es/m/[email protected]
|
|
Although these lists are usually NIL, and even when not empty
are unlikely to be large, constant relcache update traffic could
eventually result in visible bloat of CacheMemoryContext.
Found via valgrind testing.
Back-patch to v10 where this field was added.
Discussion: https://postgr.es/m/[email protected]
|
|
Parallel SELECT can't be utilized for INSERT in the following cases:
- INSERT statement uses the ON CONFLICT DO UPDATE clause
- Target table has a parallel-unsafe: trigger, index expression or
predicate, column default expression or check constraint
- Target table has a parallel-unsafe domain constraint on any column
- Target table is a partitioned table with a parallel-unsafe partition key
expression or support function
The planner is updated to perform additional parallel-safety checks for
the cases listed above, for determining whether it is safe to run INSERT
in parallel-mode with an underlying parallel SELECT. The planner will
consider using parallel SELECT for "INSERT INTO ... SELECT ...", provided
nothing unsafe is found from the additional parallel-safety checks, or
from the existing parallel-safety checks for SELECT.
While checking parallel-safety, we need to check it for all the partitions
on the table which can be costly especially when we decide not to use a
parallel plan. So, in a separate patch, we will introduce a GUC and or a
reloption to enable/disable parallelism for Insert statements.
Prior to entering parallel-mode for the execution of INSERT with parallel
SELECT, a TransactionId is acquired and assigned to the current
transaction state. This is necessary to prevent the INSERT from attempting
to assign the TransactionId whilst in parallel-mode, which is not allowed.
This approach has a disadvantage in that if the underlying SELECT does not
return any rows, then the TransactionId is not used, however that
shouldn't happen in practice in many cases.
Author: Greg Nancarrow, Amit Langote, Amit Kapila
Reviewed-by: Amit Langote, Hou Zhijie, Takayuki Tsunakawa, Antonin Houska, Bharath Rupireddy, Dilip Kumar, Vignesh C, Zhihong Yu, Amit Kapila
Tested-by: Tang, Haiying
Discussion: https://postgr.es/m/CAJcOf-cXnB5cnMKqWEp2E2z7Mvcd04iLVmV=qpFJrR3AcrTS3g@mail.gmail.com
Discussion: https://postgr.es/m/CAJcOf-fAdj=nDKMsRhQzndm-O13NY4dL6xGcEvdX5Xvbbi0V7g@mail.gmail.com
|
|
The portions fixing the documentation are backpatched where needed.
Author: Justin Pryzby
Discussion: https://postgr.es/m/[email protected]
backpatch-through: 9.6
|
|
For an index, attstattarget can be updated using ALTER INDEX SET
STATISTICS. This data was lost on the new index after REINDEX
CONCURRENTLY.
The update of this field is done when the old and new indexes are
swapped to make the fix back-patchable. Another approach we could look
after in the long-term is to change index_create() to pass the wanted
values of attstattarget when creating the new relation, but, as this
would cause an ABI breakage this can be done only on HEAD.
Reported-by: Ronan Dunklau
Author: Michael Paquier
Reviewed-by: Ronan Dunklau, Tomas Vondra
Discussion: https://postgr.es/m/16628084.uLZWGnKmhe@laptop-ronand
Backpatch-through: 12
|
|
This patch essentially is cleaning up technical debt left behind
by the original implementation of plpgsql procedures, particularly
commit d92bc83c4. That patch (or more precisely, follow-on patches
fixing its worst bugs) forced us to re-plan CALL and DO statements
each time through, if we're in a non-atomic context. That wasn't
for any fundamental reason, but just because use of a saved plan
requires having a ResourceOwner to hold a reference count for the
plan, and we had no suitable resowner at hand, nor would the
available APIs support using one if we did. While it's not that
expensive to create a "plan" for CALL/DO, the cycles do add up
in repeated executions.
This patch therefore makes the following API changes:
* GetCachedPlan/ReleaseCachedPlan are modified to let the caller
specify which resowner to use to pin the plan, rather than forcing
use of CurrentResourceOwner.
* spi.c gains a "SPI_execute_plan_extended" entry point that lets
callers say which resowner to use to pin the plan. This borrows the
idea of an options struct from the recently added SPI_prepare_extended,
hopefully allowing future options to be added without more API breaks.
This supersedes SPI_execute_plan_with_paramlist (which I've marked
deprecated) as well as SPI_execute_plan_with_receiver (which is new
in v14, so I just took it out altogether).
* I also took the opportunity to remove the crude hack of letting
plpgsql reach into SPI private data structures to mark SPI plans as
"no_snapshot". It's better to treat that as an option of
SPI_prepare_extended.
Now, when running a non-atomic procedure or DO block that contains
any CALL or DO commands, plpgsql creates a ResourceOwner that
will be used to pin the plans of the CALL/DO commands. (In an
atomic context, we just use CurrentResourceOwner, as before.)
Having done this, we can just save CALL/DO plans normally,
whether or not they are used across transaction boundaries.
This seems to be good for something like 2X speedup of a CALL
of a trivial procedure with a few simple argument expressions.
By restricting the creation of an extra ResourceOwner like this,
there's essentially zero penalty in cases that can't benefit.
Pavel Stehule, with some further hacking by me
Discussion: https://postgr.es/m/CAFj8pRCLPdDAETvR7Po7gC5y_ibkn_-bOzbeJb39WHms01194Q@mail.gmail.com
|