postgresql.git - This is the main PostgreSQL git repository.

Age	Commit message (Collapse)	Author
10 days	Message style improvements	Peter Eisentraut

2025-06-03	Rename log_lock_failure GUC to log_lock_failures for consistency.	Fujii Masao
	This commit renames the GUC log_lock_failure to log_lock_failures to align with the existing similar setting log_lock_waits, which uses the plural form. This improves naming consistency across related GUCs. Suggested-by: Peter Eisentraut <[email protected]> Author: Fujii Masao <[email protected] Reviewed-by: Peter Eisentraut <[email protected]> Discussion: https://postgr.es/m/[email protected]
2025-04-25	Fix bug allowing io_combine_limit > io_max_combine_combine limit	Andres Freund
	10f66468475 intended to limit the value of io_combine_limit to the minimum of io_combine_limit and io_max_combine_limit. To avoid issues with interdependent GUCs, it introduced io_combine_limit_guc and set io_combine_limit in assign hooks. That plan was thwarted by guc_tables.c accidentally still referencing io_combine_limit, instead of io_combine_limit_guc. That lead to the GUC machinery overriding the work done in the assign hooks, potentially leaving io_combine_limit with a too high value. The consequence of this bug was that when running with io_combine_limit > io_combine_limit_guc the AIO machinery would not have reserved large enough iovec and IO data arrays, with one IO's arrays overlapping with another IO's, leading to total confusion. To make such a problem easier to detect in the future, add assertions to pgaio_io_set_handle_data_* checking the length is smaller than io_max_combine_limit (not just PG_IOV_MAX). It'd be nice to have a few tests for this, but it's not entirely obvious how to do so portably. As remarked upon by Tom, the GUC assignment hooks really shouldn't set the underlying variable, that's the job of the GUC machinery. Change that as well. Discussion: https://postgr.es/m/c5jyqnuwrpigd35qe7xdypxsisdjrdba5iw63mhcse4mzjogxo@qdjpv22z763f
2025-04-08	Introduce file_copy_method setting.	Thomas Munro
	It can be set to either COPY (the default) or CLONE if the system supports it. CLONE causes callers of copydir(), currently CREATE DATABASE ... STRATEGY=FILE_COPY and ALTER DATABASE ... SET TABLESPACE = ..., to use copy_file_range (Linux, FreeBSD) or copyfile (macOS) to copy files instead of a read-write loop over the contents. CLONE gives the kernel the opportunity to share block ranges on copy-on-write file systems and push copying down to storage on others, depending on configuration. On some systems CLONE can be used to clone large databases quickly with CREATE DATABASE ... TEMPLATE=source STRATEGY=FILE_COPY. Other operating systems could be supported; patches welcome. Co-authored-by: Nazir Bilal Yavuz <[email protected]> Reviewed-by: Robert Haas <[email protected]> Reviewed-by: Ranier Vilela <[email protected]> Discussion: https://postgr.es/m/CA%2BhUKGLM%2Bt%2BSwBU-cHeMUXJCOgBxSHLGZutV5zCwY4qrCcE02w%40mail.gmail.com
2025-04-07	Add support for basic NUMA awareness	Tomas Vondra
	Add basic NUMA awareness routines, using a minimal src/port/pg_numa.c portability wrapper and an optional build dependency, enabled by --with-libnuma configure option. For now this is Linux-only, other platforms may be supported later. A built-in SQL function pg_numa_available() allows checking NUMA support, i.e. that the server was built/linked with the NUMA library. The main function introduced is pg_numa_query_pages(), which allows determining the NUMA node for individual memory pages. Internally the function uses move_pages(2) syscall, as it allows batching, and is more efficient than get_mempolicy(2). Author: Jakub Wartak <[email protected]> Co-authored-by: Bertrand Drouvot <[email protected]> Reviewed-by: Andres Freund <[email protected]> Reviewed-by: Álvaro Herrera <[email protected]> Reviewed-by: Tomas Vondra <[email protected]> Discussion: https://postgr.es/m/CAKZiRmxh6KWo0aqRqvmcoaX2jUxZYb4kGp3N%3Dq1w%2BDiH-696Xw%40mail.gmail.com
2025-04-07	Remove GUC_NOT_IN_SAMPLE from enable_self_join_elimination	Alexander Korotkov
	fc069a3a6319 implements Self-Join Elimination (SJE) and provides a new GUC variable: enable_self_join_elimination. This new GUC variable was marked as GUC_NOT_IN_SAMPLE. However, enable_self_join_elimination is documented and is not different from any other enable_* GUCs. Thus, remove GUC_NOT_IN_SAMPLE from it and add it to the postgresql.conf.sample. Discussion: https://postgr.es/m/CAPpHfdsqMTEsmxk3aQwt6xPz%2BKpUELO%3D6fzmER9ZRGrbs4uMfA%40mail.gmail.com Author: Tender Wang <[email protected]> Reviewed-by: Tom Lane <[email protected]>
2025-03-30	Enable IO concurrency on all systems	Andres Freund
	Previously effective_io_concurrency and maintenance_io_concurrency could not be set above 0 on machines without fadvise support. AIO enables IO concurrency without such support, via io_method=worker. Currently only subsystems using the read stream API will take advantage of this. Other users of maintenance_io_concurrency (like recovery prefetching) which leverage OS advice directly will not benefit from this change. In those cases, maintenance_io_concurrency will have no effect on I/O behavior. Author: Melanie Plageman <[email protected]> Reviewed-by: Noah Misch <[email protected]> Discussion: https://postgr.es/m/CAAKRu_atGgZePo=_g6T3cNtfMf0QxpvoUh5OUqa_cnPdhLd=gw@mail.gmail.com
2025-03-27	Remove the query_id_squash_values GUC	Álvaro Herrera
	Commit 62d712ecfd94 introduced the capability to calculate the same queryId for queries with different lengths of constants in a list for an IN clause. This behavior was originally enabled with a GUC query_id_squash_values. After a discussion about the value of such a GUC, it was decided to back out of the use of a GUC and make the squashing behavior the only available option. Author: Sami Imseih <[email protected]> Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/CA+q6zcVTK-3C-8NWV1oY2NZrvtnMCDqnyYYyk1T7WMUG65MeOQ@mail.gmail.com
2025-03-24	Redefine max_files_per_process to control additionally opened files	Andres Freund
	Until now max_files_per_process=N limited each backend to open N files in total (minus a safety factor), even if there were already more files opened in postmaster and inherited by backends. Change max_files_per_process to control how many additional files each process is allowed to open. The main motivation for this is the patch to add io_method=io_uring, which needs to open one file for each backend. Without this patch, even if RLIMIT_NOFILE is high enough, postmaster will fail in set_max_safe_fds() if started with a high max_connections. The cause of the failure is that, until now, set_max_safe_fds() subtracted the already open files from max_files_per_process. Reviewed-by: Noah Misch <[email protected]> Discussion: https://postgr.es/m/w6uiicyou7hzq47mbyejubtcyb2rngkkf45fk4q7inue5kfbeo@bbfad3qyubvs Discussion: https://postgr.es/m/CAGECzQQh6VSy3KG4pN1d=h9J=D1rStFCMR+t7yh_Kwj-g87aLQ@mail.gmail.com
2025-03-21	Add GUC option to control maximum active replication origins.	Masahiko Sawada
	This commit introduces a new GUC option max_active_replication_origins to control the maximum number of active replication origins. Previously, this was controlled by 'max_replication_slots'. Having a separate GUC option provides better flexibility for setting up subscribers, as they may not require replication slots (for cascading replication) but always require replication origins. Author: Euler Taveira <[email protected]> Reviewed-by: Amit Kapila <[email protected]> Reviewed-by: Masahiko Sawada <[email protected]> Reviewed-by: Peter Eisentraut <[email protected]> Reviewed-by: vignesh C <[email protected]> Discussion: https://postgr.es/m/[email protected]
2025-03-20	Add vacuum_truncate configuration parameter.	Nathan Bossart
	This new parameter works just like the storage parameter of the same name: if set to true (which is the default), autovacuum and VACUUM attempt to truncate any empty pages at the end of the table. It is primarily intended to help users avoid locking issues on hot standbys. The setting can be overridden with the storage parameter or VACUUM's TRUNCATE option. Since there's presently no way to determine whether a Boolean storage parameter is explicitly set or has just picked up the default value, this commit also introduces an isset_offset member to relopt_parse_elt. Suggested-by: Will Storey <[email protected]> Author: Nathan Bossart <[email protected]> Co-authored-by: Gurjeet Singh <[email protected]> Reviewed-by: Laurenz Albe <[email protected]> Reviewed-by: Fujii Masao <[email protected]> Reviewed-by: Robert Treat <[email protected]> Discussion: https://postgr.es/m/Z2DE4lDX4tHqNGZt%40dev.null
2025-03-19	extension_control_path	Peter Eisentraut
	The new GUC extension_control_path specifies a path to look for extension control files. The default value is $system, which looks in the compiled-in location, as before. The path search uses the same code and works in the same way as dynamic_library_path. Some use cases of this are: (1) testing extensions during package builds, (2) installing extensions outside security-restricted containers like Python.app (on macOS), (3) adding extensions to PostgreSQL running in a Kubernetes environment using operators such as CloudNativePG without having to rebuild the base image for each new extension. There is also a tweak in Makefile.global so that it is possible to install extensions using PGXS into an different directory than the default, using 'make install prefix=/else/where'. This previously only worked when specifying the subdirectories, like 'make install datadir=/else/where/share pkglibdir=/else/where/lib', for purely implementation reasons. (Of course, without the path feature, installing elsewhere was rarely useful.) Author: Peter Eisentraut <[email protected]> Co-authored-by: Matheus Alcantara <[email protected]> Reviewed-by: David E. Wheeler <[email protected]> Reviewed-by: Gabriele Bartolini <[email protected]> Reviewed-by: Marco Nenciarini <[email protected]> Reviewed-by: Niccolò Fei <[email protected]> Discussion: https://www.postgresql.org/message-id/flat/[email protected]
2025-03-19	Introduce io_max_combine_limit.	Thomas Munro
	The existing io_combine_limit can be changed by users. The new io_max_combine_limit is fixed at server startup time, and functions as a silent clamp on the user setting. That in itself is probably quite useful, but the primary motivation is: aio_init.c allocates shared memory for all asynchronous IOs including some per-block data, and we didn't want to waste memory you'd never used by assuming they could be up to PG_IOV_MAX. This commit already halves the size of 'AioHandleIov' and 'AioHandleData'. A follow-up commit can now expand PG_IOV_MAX without affecting that. Since our GUC system doesn't support dependencies or cross-checks between GUCs, the user-settable one now assigns a "raw" value to io_combine_limit_guc, and the lower of io_combine_limit_guc and io_max_combine_limit is maintained in io_combine_limit. Reviewed-by: Andres Freund <[email protected]> (earlier version) Discussion: https://postgr.es/m/CA%2BhUKG%2B2T9p-%2BzM6Eeou-RAJjTML6eit1qn26f9twznX59qtCA%40mail.gmail.com
2025-03-18	Introduce squashing of constant lists in query jumbling	Álvaro Herrera
	pg_stat_statements produces multiple entries for queries like SELECT something FROM table WHERE col IN (1, 2, 3, ...) depending on the number of parameters, because every element of ArrayExpr is individually jumbled. Most of the time that's undesirable, especially if the list becomes too large. Fix this by introducing a new GUC query_id_squash_values which modifies the node jumbling code to only consider the first and last element of a list of constants, rather than each list element individually. This affects both the query_id generated by query jumbling, as well as pg_stat_statements query normalization so that it suppresses printing of the individual elements of such a list. The default value is off, meaning the previous behavior is maintained. Author: Dmitry Dolgov <[email protected]> Reviewed-by: Sergey Dudoladov (mysterious, off-list) Reviewed-by: David Geier <[email protected]> Reviewed-by: Robert Haas <[email protected]> Reviewed-by: Álvaro Herrera <[email protected]> Reviewed-by: Sami Imseih <[email protected]> Reviewed-by: Sutou Kouhei <[email protected]> Reviewed-by: Tom Lane <[email protected]> Reviewed-by: Michael Paquier <[email protected]> Reviewed-by: Marcos Pegoraro <[email protected]> Reviewed-by: Julien Rouhaud <[email protected]> Reviewed-by: Zhihong Yu <[email protected]> Tested-by: Yasuo Honda <[email protected]> Tested-by: Sergei Kornilov <[email protected]> Tested-by: Maciek Sakrejda <[email protected]> Tested-by: Chengxi Sun <[email protected]> Tested-by: Jakub Wartak <[email protected]> Discussion: https://postgr.es/m/CA+q6zcWtUbT_Sxj0V6HY6EZ89uv5wuG5aefpe_9n0Jr3VwntFg@mail.gmail.com
2025-03-18	aio: Infrastructure for io_method=worker	Andres Freund
	This commit contains the basic, system-wide, infrastructure for io_method=worker. It does not yet actually execute IO, this commit just provides the infrastructure for running IO workers, kept separate for easier review. The number of IO workers can be adjusted with a PGC_SIGHUP GUC. Eventually we'd like to make the number of workers dynamically scale up/down based on the current "IO load". To allow the number of IO workers to be increased without a restart, we need to reserve PGPROC entries for the workers unconditionally. This has been judged to be worth the cost. If it turns out to be problematic, we can introduce a PGC_POSTMASTER GUC to control the maximum number. As io workers might be needed during shutdown, e.g. for AIO during the shutdown checkpoint, a new PMState phase is added. IO workers are shut down after the shutdown checkpoint has been performed and walsender/archiver have shut down, but before the checkpointer itself shuts down. See also 87a6690cc69. Updates PGSTAT_FILE_FORMAT_ID due to the addition of a new BackendType. Reviewed-by: Noah Misch <[email protected]> Co-authored-by: Thomas Munro <[email protected]> Co-authored-by: Andres Freund <[email protected]> Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/stj36ea6yyhoxtqkhpieia2z4krnam7qyetc57rfezgk4zgapf@gcnactj4z56m
2025-03-18	Add X25519 to the default set of curves	Daniel Gustafsson
	Since many clients default to the X25519 curve in the TLS handshake, the fact that the server by defualt doesn't support it cause an extra roundtrip for each TLS connection. By adding multiple curves, which is supported since 3d1ef3a15c3eb68da, we can reduce the risk of extra roundtrips. Author: Daniel Gustafsson <[email protected]> Co-authored-by: Jacob Champion <[email protected]> Reported-by: Andres Freund <[email protected]> Reviewed-by: Jacob Champion <[email protected]> Discussion: https://postgr.es/m/[email protected]
2025-03-17	aio: Basic subsystem initialization	Andres Freund
	This commit just does the minimal wiring up of the AIO subsystem, added in the next commit, to the rest of the system. The next commit contains more details about motivation and architecture. This commit is kept separate to make it easier to review, separating the changes across the tree, from the implementation of the new subsystem. We discussed squashing this commit with the main commit before merging AIO, but there has been a mild preference for keeping it separate. Reviewed-by: Heikki Linnakangas <[email protected]> Reviewed-by: Noah Misch <[email protected]> Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
2025-03-14	Add GUC option to log lock acquisition failures.	Fujii Masao
	This commit introduces a new GUC, log_lock_failure, which controls whether a detailed log message is produced when a lock acquisition fails. Currently, it only supports logging lock failures caused by SELECT ... NOWAIT. The log message includes information about all processes holding or waiting for the lock that couldn't be acquired, helping users analyze and diagnose the causes of lock failures. Currently, this option does not log failures from SELECT ... SKIP LOCKED, as that could generate excessive log messages if many locks are skipped, causing unnecessary noise. This mechanism can be extended in the future to support for logging lock failures from other commands, such as LOCK TABLE ... NOWAIT. Author: Yuki Seino <[email protected]> Co-authored-by: Fujii Masao <[email protected]> Reviewed-by: Jelte Fennema-Nio <[email protected]> Discussion: https://postgr.es/m/[email protected]
2025-03-12	Modularize log_connections output	Melanie Plageman
	Convert the boolean log_connections GUC into a list GUC comprised of the connection aspects to log. This gives users more control over the volume and kind of connection logging. The current log_connections options are 'receipt', 'authentication', and 'authorization'. The empty string disables all connection logging. 'all' enables all available connection logging. For backwards compatibility, the most common values for the log_connections boolean are still supported (on, off, 1, 0, true, false, yes, no). Note that previously supported substrings of on, off, true, false, yes, and no are no longer supported. Author: Melanie Plageman <[email protected]> Reviewed-by: Bertrand Drouvot <[email protected]> Reviewed-by: Fujii Masao <[email protected]> Reviewed-by: Daniel Gustafsson <[email protected]> Discussion: https://postgr.es/m/flat/CAAKRu_b_smAHK0ZjrnL5GRxnAVWujEXQWpLXYzGbmpcZd3nLYw%40mail.gmail.com
2025-02-26	Re-add GUC track_wal_io_timing	Michael Paquier
	This commit is a rework of 2421e9a51d20, about which Andres Freund has raised some concerns as it is valuable to have both track_io_timing and track_wal_io_timing in some cases, as the WAL write and fsync paths can be a major bottleneck for some workloads. Hence, it can be relevant to not calculate the WAL timings in environments where pg_test_timing performs poorly while capturing some IO data under track_io_timing for the non-WAL IO paths. The opposite can be also true: it should be possible to disable the non-WAL timings and enable the WAL timings (the previous GUC setups allowed this possibility). track_wal_io_timing is added back in this commit, controlling if WAL timings should be calculated in pg_stat_io for the read, fsync and write paths, as done previously with pg_stat_wal. pg_stat_wal previously tracked only the sync and write parts (now removed), read stats is new data tracked in pg_stat_io, all three are aggregated if track_wal_io_timing is enabled. The read part matters during recovery or if a XLogReader is used. Extra note: more control over if the types of timings calculated in pg_stat_io could be done with a GUC that lists pairs of (IOObject,IOOp). Reported-by: Andres Freund <[email protected]> Author: Bertrand Drouvot <[email protected]> Co-authored-by: Michael Paquier <[email protected]> Discussion: https://postgr.es/m/3opf2wh2oljco6ldyqf7ukabw3jijnnhno6fjb4mlu6civ5h24@fcwmhsgmlmzu
2025-02-24	Move MAX_BACKENDS to procnumber.h	Andres Freund
	MAX_BACKENDS influences many things besides postmaster. I e.g. noticed that we don't have static assertions ensuring BUF_REFCOUNT_MASK is big enough for MAX_BACKENDS, adding them would require including postmaster.h in buf_internals.h which doesn't seem right. While at that, add MAX_BACKENDS_BITS, as that's useful in various places for static assertions (to be added in subsequent commits). Reviewed-by: Thomas Munro <[email protected]> Discussion: https://postgr.es/m/wptizm4qt6yikgm2pt52xzyv6ycmqiutloyvypvmagn7xvqkce@d4xuv3mylpg4
2025-02-24	Remove read/sync fields from pg_stat_wal and GUC track_wal_io_timing	Michael Paquier
	The four following attributes are removed from pg_stat_wal: * wal_write * wal_sync * wal_write_time * wal_sync_time a051e71e28a1 has added an equivalent of this information in pg_stat_io with more granularity as this now spreads across the backend types, IO context and IO objects. So, keeping the same information in pg_stat_wal has little benefits. Another benefit of this commit is the removal of PendingWalStats, simplifying an upcoming patch to add per-backend WAL statistics, which already support IO statistics and which have access to the write/sync stats data of WAL. The GUC track_wal_io_timing, that was used to enable or disable the aggregation of the write and sync timings for WAL, is also removed. pgstat_prepare_io_time() is simplified. Bump catalog version. Bump PGSTAT_FILE_FORMAT_ID, due to the update of PgStat_WalStats. Author: Bertrand Drouvot <[email protected]> Discussion: https://postgr.es/m/Z7RkQ0EfYaqqjgz/@ip-10-97-1-34.eu-west-3.compute.internal
2025-02-20	Add support for OAUTHBEARER SASL mechanism	Daniel Gustafsson
	This commit implements OAUTHBEARER, RFC 7628, and OAuth 2.0 Device Authorization Grants, RFC 8628. In order to use this there is a new pg_hba auth method called oauth. When speaking to a OAuth- enabled server, it looks a bit like this: $ psql 'host=example.org oauth_issuer=... oauth_client_id=...' Visit https://oauth.example.org/login and enter the code: FPQ2-M4BG Device authorization is currently the only supported flow so the OAuth issuer must support that in order for users to authenticate. Third-party clients may however extend this and provide their own flows. The built-in device authorization flow is currently not supported on Windows. In order for validation to happen server side a new framework for plugging in OAuth validation modules is added. As validation is implementation specific, with no default specified in the standard, PostgreSQL does not ship with one built-in. Each pg_hba entry can specify a specific validator or be left blank for the validator installed as default. This adds a requirement on libcurl for the client side support, which is optional to build, but the server side has no additional build requirements. In order to run the tests, Python is required as this adds a https server written in Python. Tests are gated behind PG_TEST_EXTRA as they open ports. This patch has been a multi-year project with many contributors involved with reviews and in-depth discussions: Michael Paquier, Heikki Linnakangas, Zhihong Yu, Mahendrakar Srinivasarao, Andrey Chudnovsky and Stephen Frost to name a few. While Jacob Champion is the main author there have been some levels of hacking by others. Daniel Gustafsson contributed the validation module and various bits and pieces; Thomas Munro wrote the client side support for kqueue. Author: Jacob Champion <[email protected]> Co-authored-by: Daniel Gustafsson <[email protected]> Co-authored-by: Thomas Munro <[email protected]> Reviewed-by: Daniel Gustafsson <[email protected]> Reviewed-by: Peter Eisentraut <[email protected]> Reviewed-by: Antonin Houska <[email protected]> Reviewed-by: Kashif Zeeshan <[email protected]> Discussion: https://postgr.es/m/[email protected]
2025-02-19	Invalidate inactive replication slots.	Amit Kapila
	This commit introduces idle_replication_slot_timeout GUC that allows inactive slots to be invalidated at the time of checkpoint. Because checkpoints happen checkpoint_timeout intervals, there can be some lag between when the idle_replication_slot_timeout was exceeded and when the slot invalidation is triggered at the next checkpoint. To avoid such lags, users can force a checkpoint to promptly invalidate inactive slots. Note that the idle timeout invalidation mechanism is not applicable for slots that do not reserve WAL or for slots on the standby server that are synced from the primary server (i.e., standby slots having 'synced' field 'true'). Synced slots are always considered to be inactive because they don't perform logical decoding to produce changes. The slots can become inactive for a long period if a subscriber is down due to a system error or inaccessible because of network issues. If such a situation persists, it might be more practical to recreate the subscriber rather than attempt to recover the node and wait for it to catch up which could be time-consuming. Then, external tools could create replication slots (e.g., for migrations or upgrades) that may fail to remove them if an error occurs, leaving behind unused slots that take up space and resources. Manually cleaning them up can be tedious and error-prone, and without intervention, these lingering slots can cause unnecessary WAL retention and system bloat. As the duration of idle_replication_slot_timeout is in minutes, any test using that would be time-consuming. We are planning to commit a follow up patch for tests by using the injection point framework. Author: Nisha Moond <[email protected]> Author: Bharath Rupireddy <[email protected]> Reviewed-by: Peter Smith <[email protected]> Reviewed-by: Hayato Kuroda <[email protected]> Reviewed-by: Vignesh C <[email protected]> Reviewed-by: Amit Kapila <[email protected]> Reviewed-by: Hou Zhijie <[email protected]> Reviewed-by: Bertrand Drouvot <[email protected]> Discussion: https://postgr.es/m/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com Discussion: https://postgr.es/m/OS0PR01MB5716C131A7D80DAE8CB9E88794FC2@OS0PR01MB5716.jpnprd01.prod.outlook.com
2025-02-18	Make the description of some GUCs more consistent	Michael Paquier
	This commit improves the description of a couple of GUCs, to be more consistent with the style of their surroundings: * array_nulls * enable_self_join_elimination * optimize_bounded_sort * row_security * synchronize_seqscans Author: Kyotaro Horiguchi Discussion: https://postgr.es/m/[email protected]
2025-02-17	Implement Self-Join Elimination	Alexander Korotkov
	The Self-Join Elimination (SJE) feature removes an inner join of a plain table to itself in the query tree if it is proven that the join can be replaced with a scan without impacting the query result. Self-join and inner relation get replaced with the outer in query, equivalence classes, and planner info structures. Also, the inner restrictlist moves to the outer one with the removal of duplicated clauses. Thus, this optimization reduces the length of the range table list (this especially makes sense for partitioned relations), reduces the number of restriction clauses and, in turn, selectivity estimations, and potentially improves total planner prediction for the query. This feature is dedicated to avoiding redundancy, which can appear after pull-up transformations or the creation of an EquivalenceClass-derived clause like the below. SELECT * FROM t1 WHERE x IN (SELECT t3.x FROM t1 t3); SELECT * FROM t1 WHERE EXISTS (SELECT t3.x FROM t1 t3 WHERE t3.x = t1.x); SELECT * FROM t1,t2, t1 t3 WHERE t1.x = t2.x AND t2.x = t3.x; In the future, we could also reduce redundancy caused by subquery pull-up after unnecessary outer join removal in cases like the one below. SELECT * FROM t1 WHERE x IN (SELECT t3.x FROM t1 t3 LEFT JOIN t2 ON t2.x = t1.x); Also, it can drastically help to join partitioned tables, removing entries even before their expansion. The SJE proof is based on innerrel_is_unique() machinery. We can remove a self-join when for each outer row: 1. At most, one inner row matches the join clause; 2. Each matched inner row must be (physically) the same as the outer one; 3. Inner and outer rows have the same row mark. In this patch, we use the next approach to identify a self-join: 1. Collect all merge-joinable join quals which look like a.x = b.x; 2. Add to the list above the baseretrictinfo of the inner table; 3. Check innerrel_is_unique() for the qual list. If it returns false, skip this pair of joining tables; 4. Check uniqueness, proved by the baserestrictinfo clauses. To prove the possibility of self-join elimination, the inner and outer clauses must match exactly. The relation replacement procedure is not trivial and is partly combined with the one used to remove useless left joins. Tests covering this feature were added to join.sql. Some of the existing regression tests changed due to self-join removal logic. Discussion: https://postgr.es/m/flat/64486b0b-0404-e39e-322d-0801154901f3%40postgrespro.ru Author: Andrey Lepikhov <[email protected]> Author: Alexander Kuzmenkov <[email protected]> Co-authored-by: Alexander Korotkov <[email protected]> Co-authored-by: Alena Rybakina <[email protected]> Reviewed-by: Tom Lane <[email protected]> Reviewed-by: Robert Haas <[email protected]> Reviewed-by: Andres Freund <[email protected]> Reviewed-by: Simon Riggs <[email protected]> Reviewed-by: Jonathan S. Katz <[email protected]> Reviewed-by: David Rowley <[email protected]> Reviewed-by: Thomas Munro <[email protected]> Reviewed-by: Konstantin Knizhnik <[email protected]> Reviewed-by: Heikki Linnakangas <[email protected]> Reviewed-by: Hywel Carver <[email protected]> Reviewed-by: Laurenz Albe <[email protected]> Reviewed-by: Ronan Dunklau <[email protected]> Reviewed-by: vignesh C <[email protected]> Reviewed-by: Zhihong Yu <[email protected]> Reviewed-by: Greg Stark <[email protected]> Reviewed-by: Jaime Casanova <[email protected]> Reviewed-by: Michał Kłeczek <[email protected]> Reviewed-by: Alena Rybakina <[email protected]> Reviewed-by: Alexander Korotkov <[email protected]>
2025-02-14	Describe special values in GUC descriptions more consistently.	Nathan Bossart
	Many GUCs accept special values like -1 or an empty string to disable the feature, use a system default, etc. While the documentation consistently lists these special values, the GUC descriptions do not. Many such descriptions fail to mention the special values, and those that do vary in phrasing and placement. This commit aims to bring some consistency to this area by applying the following rules: * Special values should be listed at the end of the long description. * Descriptions should use numerals (e.g., "0") instead of words (e.g., "zero"). * Special value mentions should be concise and direct (e.g., "0 disables the timeout.", "An empty string means use the operating system setting."). * Multiple special values should be listed in ascending order. Of course, there are exceptions, such as max_pred_locks_per_relation and search_path, whose special values are too complex to include. And there are cases like listen_addresses, where the meaning of an empty string is arguably too obvious to include. In those cases, I've refrained from adding special value information to the GUC description. Reviewed-by: Peter Smith <[email protected]> Reviewed-by: "David G. Johnston" <[email protected]> Reviewed-by: Daniel Gustafsson <[email protected]> Discussion: https://postgr.es/m/Z6aIy4aywxUZHAo6%40nathan
2025-02-11	Add cost-based vacuum delay time to progress views.	Nathan Bossart
	This commit adds the amount of time spent sleeping due to cost-based delay to the pg_stat_progress_vacuum and pg_stat_progress_analyze system views. A new configuration parameter named track_cost_delay_timing, which is off by default, controls whether this information is gathered. For vacuum, the reported value includes the sleep time of any associated parallel workers. However, parallel workers only report their sleep time once per second to avoid overloading the leader process. Bumps catversion. Author: Bertrand Drouvot <[email protected]> Co-authored-by: Nathan Bossart <[email protected]> Reviewed-by: Sami Imseih <[email protected]> Reviewed-by: Robert Haas <[email protected]> Reviewed-by: Masahiko Sawada <[email protected]> Reviewed-by: Masahiro Ikeda <[email protected]> Reviewed-by: Dilip Kumar <[email protected]> Reviewed-by: Sergei Kornilov <[email protected]> Discussion: https://postgr.es/m/ZmaXmWDL829fzAVX%40ip-10-97-1-34.eu-west-3.compute.internal
2025-02-11	Eagerly scan all-visible pages to amortize aggressive vacuum	Melanie Plageman
	Aggressive vacuums must scan every unfrozen tuple in order to advance the relfrozenxid/relminmxid. Because data is often vacuumed before it is old enough to require freezing, relations may build up a large backlog of pages that are set all-visible but not all-frozen in the visibility map. When an aggressive vacuum is triggered, all of these pages must be scanned. These pages have often been evicted from shared buffers and even from the kernel buffer cache. Thus, aggressive vacuums often incur large amounts of extra I/O at the expense of foreground workloads. To amortize the cost of aggressive vacuums, eagerly scan some all-visible but not all-frozen pages during normal vacuums. All-visible pages that are eagerly scanned and set all-frozen in the visibility map are counted as successful eager freezes and those not frozen are counted as failed eager freezes. If too many eager scans fail in a row, eager scanning is temporarily suspended until a later portion of the relation. The number of failures tolerated is configurable globally and per table. To effectively amortize aggressive vacuums, we cap the number of successes as well. Capping eager freeze successes also limits the amount of potentially wasted work if these pages are modified again before the next aggressive vacuum. Once we reach the maximum number of blocks successfully eager frozen, eager scanning is disabled for the remainder of the vacuum of the relation. Original design idea from Robert Haas, with enhancements from Andres Freund, Tomas Vondra, and me Reviewed-by: Robert Haas <[email protected]> Reviewed-by: Masahiko Sawada <[email protected]> Reviewed-by: Andres Freund <[email protected]> Reviewed-by: Robert Treat <[email protected]> Reviewed-by: Bilal Yavuz <[email protected]> Discussion: https://postgr.es/m/flat/CAAKRu_ZF_KCzZuOrPrOqjGVe8iRVWEAJSpzMgRQs%3D5-v84cXUg%40mail.gmail.com
2025-02-11	config: Rename "Asynchronous Behavior" to "I/O"	Andres Freund
	"I/O" seems more descriptive than "Asynchronous Behavior", given that some of the GUCs in the section don't relate to anything asynchronous. Most other abbreviations in the config sections are un-abbreviated, but "Input/Output" seems less likely to be helpful than just IO or I/O. Reviewed-by: Tom Lane <[email protected]> Discussion: https://postgr.es/m/x3tlw2jk5gm3r3mv47hwrshffyw7halpczkfbk3peksxds7bvc@lguk43z3bsyq
2025-02-11	config: Split "Worker Processes" out of "Asynchronous Behavior"	Andres Freund
	Having all the worker related GUCs in the same section as IO controlling GUCs doesn't really make sense. Create a separate section for them. Reviewed-by: Tom Lane <[email protected]> Discussion: https://postgr.es/m/x3tlw2jk5gm3r3mv47hwrshffyw7halpczkfbk3peksxds7bvc@lguk43z3bsyq
2025-02-06	Fix autovacuum_vacuum_max_threshold's GUC description.	Nathan Bossart
	Most GUCs that accept a special value to disable the feature mention it in their GUC description. This commit adds that information to autovacuum_vacuum_max_threshold's description. Oversight in commit 306dc520b9.
2025-02-05	Introduce autovacuum_vacuum_max_threshold.	Nathan Bossart
	One way autovacuum chooses tables to vacuum is by comparing the number of updated or deleted tuples with a value calculated using autovacuum_vacuum_threshold and autovacuum_vacuum_scale_factor. The threshold specifies the base value for comparison, and the scale factor specifies the fraction of the table size to add to it. This strategy ensures that smaller tables are vacuumed after fewer updates/deletes than larger tables, which is reasonable in many cases but can result in infrequent vacuums on very large tables. This is undesirable for a couple of reasons, such as very large tables incurring a huge amount of bloat between vacuums. This new parameter provides a way to set a limit on the value calculated with autovacuum_vacuum_threshold and autovacuum_vacuum_scale_factor so that very large tables are vacuumed more frequently. By default, it is set to 100,000,000 tuples, but it can be disabled by setting it to -1. It can also be adjusted for individual tables by changing storage parameters. Author: Nathan Bossart <[email protected]> Co-authored-by: Frédéric Yhuel <[email protected]> Reviewed-by: Melanie Plageman <[email protected]> Reviewed-by: Robert Haas <[email protected]> Reviewed-by: Laurenz Albe <[email protected]> Reviewed-by: Michael Banck <[email protected]> Reviewed-by: Joe Conway <[email protected]> Reviewed-by: Sami Imseih <[email protected]> Reviewed-by: David Rowley <[email protected]> Reviewed-by: wenhui qiu <[email protected]> Reviewed-by: Vinícius Abrahão <[email protected]> Reviewed-by: Robert Treat <[email protected]> Reviewed-by: Alena Rybakina <[email protected]> Discussion: https://postgr.es/m/956435f8-3b2f-47a6-8756-8c54ded61802%40dalibo.com
2025-02-02	Mention jsonlog in description of logging_collector in GUC table	Michael Paquier
	logging_collector was only mentioning stderr and csvlog, and forgot about jsonlog. Oversight in dc686681e079, that has added support for jsonlog in log_destination. While on it, the description in the GUC table is tweaked to be more consistent with the documentation and postgresql.conf.sample. Author: Umar Hayat Reviewed-by: Ashutosh Bapat, Tom Lane Discussion: https://postgr.es/m/CAD68Dp1K_vBYqBEukHw=1jF7e76t8aszGZTFL2ugi=H7r=a7MA@mail.gmail.com Backpatch-through: 13
2025-01-31	Remove obsolete restriction on the range of log_rotation_size.	Tom Lane
	When syslogger.c was first written, we didn't want to assume that all platforms have 64-bit ftello. But we've been assuming that since v13 (cf commit 799d22461), so let's use that in syslogger.c and allow log_rotation_size to range up to INT_MAX kilobytes. The old code effectively limited log_rotation_size to 2GB regardless of platform. While nobody's complained, that doesn't seem too far away from what might be thought reasonable these days. I noticed this while searching for instances of "1024L" in connection with commit 041e8b95b. These were the last such instances. (We still have instances of L-suffixed literals, but most of them are associated with wait intervals for pg_usleep or similar functions. I don't see any urgent reason to change that.)
2025-01-14	Synchronize guc_tables.c categories with vacuum docs categories	Melanie Plageman
	ca9c6a5680d consolidated most of the vacuum-related GUCs' documentation into a new subsection. af2317652d5daf8b then enforced this order in postgresql.conf.sample. This commit reorganizes the GUC groups in guc_tables.c/h to match the updated ordering in the docs. Reported-by: Álvaro Herrera Reviewed-by: Álvaro Herrera, Alena Rybakina Discussion: https://postgr.es/m/202501132046.m4mcvxxswznu%40alvherre.pgsql
2025-01-06	Allow changing autovacuum_max_workers without restarting.	Nathan Bossart
	This commit introduces a new parameter named autovacuum_worker_slots that controls how many autovacuum worker slots to reserve during server startup. Modifying this new parameter's value does require a server restart, but it should typically be set to the upper bound of what you might realistically need to set autovacuum_max_workers. With that new parameter in place, autovacuum_max_workers can now be changed with a SIGHUP (e.g., pg_ctl reload). If autovacuum_max_workers is set higher than autovacuum_worker_slots, a WARNING is emitted, and the server will only start up to autovacuum_worker_slots workers at a given time. If autovacuum_max_workers is set to a value less than the number of currently-running autovacuum workers, the existing workers will continue running, but no new workers will be started until the number of running autovacuum workers drops below autovacuum_max_workers. Reviewed-by: Sami Imseih, Justin Pryzby, Robert Haas, Andres Freund, Yogesh Sharma Discussion: https://postgr.es/m/20240410212344.GA1824549%40nathanxps13
2025-01-01	Update copyright for 2025	Bruce Momjian
	Backpatch-through: 13
2024-12-02	Deprecate MD5 passwords.	Nathan Bossart
	MD5 has been considered to be unsuitable for use as a cryptographic hash algorithm for some time. Furthermore, MD5 password hashes in PostgreSQL are vulnerable to pass-the-hash attacks, i.e., knowing the username and hashed password is sufficient to authenticate. The SCRAM-SHA-256 method added in v10 is not subject to these problems and is considered to be superior to MD5. This commit marks MD5 password support in PostgreSQL as deprecated and to be removed in a future release. The documentation now contains several deprecation notices, and CREATE ROLE and ALTER ROLE now emit deprecation warnings when setting MD5 passwords. The warnings can be disabled by setting the md5_password_warnings parameter to "off". Reviewed-by: Greg Sabino Mullane, Jim Nasby Discussion: https://postgr.es/m/ZwbfpJJol7lDWajL%40nathan
2024-11-26	Reordering DISTINCT keys to match input path's pathkeys	Richard Guo
	The ordering of DISTINCT items is semantically insignificant, so we can reorder them as needed. In fact, in the parser, we absorb the sorting semantics of the sortClause as much as possible into the distinctClause, ensuring that one clause is a prefix of the other. This can help avoid a possible need to re-sort. In this commit, we attempt to adjust the DISTINCT keys to match the input path's pathkeys. This can likewise help avoid re-sorting, or allow us to use incremental-sort to save efforts. For DISTINCT ON expressions, the parser already ensures that they match the initial ORDER BY expressions. When reordering the DISTINCT keys, we must ensure that the resulting pathkey list matches the initial distinctClause pathkeys. This introduces a new GUC, enable_distinct_reordering, which allows the optimization to be disabled if needed. Author: Richard Guo Reviewed-by: Andrei Lepikhov Discussion: https://postgr.es/m/CAMbWs48dR26cCcX0f=8bja2JKQPcU64136kHk=xekHT9xschiQ@mail.gmail.com
2024-10-24	Support configuring TLSv1.3 cipher suites	Daniel Gustafsson
	The ssl_ciphers GUC can only set cipher suites for TLSv1.2, and lower, connections. For TLSv1.3 connections a different OpenSSL API must be used. This adds a new GUC, ssl_tls13_ciphers, which can be used to configure a colon separated list of cipher suites to support when performing a TLSv1.3 handshake. Original patch by Erica Zhang with additional hacking by me. Author: Erica Zhang <[email protected]> Author: Daniel Gustafsson <[email protected]> Reviewed-by: Jacob Champion <[email protected]> Reviewed-by: Andres Freund <[email protected]> Reviewed-by: Peter Eisentraut <[email protected]> Reviewed-by: Jelte Fennema-Nio <[email protected]> Discussion: https://postgr.es/m/[email protected]
2024-10-24	Support configuring multiple ECDH curves	Daniel Gustafsson
	The ssl_ecdh_curve GUC only accepts a single value, but the TLS handshake can list multiple curves in the groups extension (the extension has been renamed to contain more than elliptic curves). This changes the GUC to accept a colon-separated list of curves. This commit also renames the GUC to ssl_groups to match the new nomenclature for the TLS extension. Original patch by Erica Zhang with additional hacking by me. Author: Erica Zhang <[email protected]> Author: Daniel Gustafsson <[email protected]> Reviewed-by: Jacob Champion <[email protected]> Reviewed-by: Andres Freund <[email protected]> Reviewed-by: Peter Eisentraut <[email protected]> Reviewed-by: Jelte Fennema-Nio <[email protected]> Discussion: https://postgr.es/m/[email protected]
2024-10-13	Use MAX_PARALLEL_WORKER_LIMIT for max_parallel_maintenance_workers	Michael Paquier
	max_parallel_maintenance_workers has been introduced in 9da0cc35284b, and used a hardcoded limit of 1024 rather than this variable. max_parallel_workers and max_parallel_workers_per_gather already used MAX_PARALLEL_WORKER_LIMIT (1024) as their upper-bound since 6599c9ac3340. Author: Matthias van de Meent Reviewed-by: Zhang Mingli Discussion: https://postgr.es/m/CAEze2WiCiJD+8Wig_wGPyn4vgdPjbnYXy2Rw+9KYi6izTMuP=w@mail.gmail.com
2024-09-04	Apply more quoting to GUC names in messages	Michael Paquier
	This is a continuation of 17974ec25946. More quotes are applied to GUC names in error messages and hints, taking care of what seems to be all the remaining holes currently in the tree for the GUCs. Author: Peter Smith Discussion: https://postgr.es/m/CAHut+Pv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w@mail.gmail.com
2024-08-30	Clarify restrict_nonsystem_relation_kind description.	Masahiko Sawada
	This change improves the description of the restrict_nonsystem_relation_kind parameter in guc_table.c and the documentation for better clarity. Backpatch to 12, where this GUC parameter was introduced. Reviewed-by: Peter Eisentraut Discussion: https://postgr.es/m/6a96f1af-22b4-4a80-8161-1f26606b9ee2%40eisentraut.org Backpatch-through: 12
2024-08-29	Message style improvements	Peter Eisentraut

2024-08-19	Mark search_path as GUC_REPORT	Tomas Vondra
	Report search_path changes to the client. Multi-tenant applications often map tenants to schemas, and use search_path to pick the tenant a given connection works with. This breaks when a connection pool (like PgBouncer), because the search_path may change unexpectedly. There are other GUCs we might want reported (e.g. various timeouts), but search_path is by far the biggest foot gun that can lead either to puzzling failures during query execution (when objects are missing or are defined differently), or even to accessing incorrect data. Many existing tools modify search_path, pg_dump being a notable example. Ideally, clients could specify which GUCs are interesting and should be subject to this reporting, but we don't support that. GUC_REPORT is what connection pools rely on for other interesting GUCs, so just use that. When this change was initially proposed in 2014, one of the concerns was impact on performance. But this was addressed by commit 2432b1a04087, which ensures we report each GUC at most once per query, no matter how many times it changed during execution. Eventually, this might be replaced / superseded by allowing doing this by making the protocol extensible in this direction, but it's unclear when (or if) that happens. Until then, we can leverage GUC_REPORT. Author: Alexander Kukushkin, Jelte Fennema-Nio Discussion: https://postgr.es/m/CAFh8B=k8s7WrcqhafmYhdN1+E5LVzZi_QaYDq8bKvrGJTAhY2Q@mail.gmail.com
2024-08-14	Remove TRACE_SORT macro	Peter Eisentraut
	The TRACE_SORT macro guarded the availability of the trace_sort GUC setting. But it has been enabled by default ever since it was introduced in PostgreSQL 8.1, and there have been no reports that someone wanted to disable it. So just remove the macro to simplify things. (For the avoidance of doubt: The trace_sort GUC is still there. This only removes the rarely-used macro guarding it.) Reviewed-by: Heikki Linnakangas <[email protected]> Discussion: https://www.postgresql.org/message-id/flat/be5f7162-7c1d-44e3-9a78-74dcaa6529f2%40eisentraut.org
2024-08-10	Allow adjusting session_authorization and role in parallel workers.	Tom Lane
	The code intends to allow GUCs to be set within parallel workers via function SET clauses, but not otherwise. However, doing so fails for "session_authorization" and "role", because the assign hooks for those attempt to set the subsidiary "is_superuser" GUC, and that call falls foul of the "not otherwise" prohibition. We can't switch to using GUC_ACTION_SAVE for this, so instead add a new GUC variable flag GUC_ALLOW_IN_PARALLEL to mark is_superuser as being safe to set anyway. (This is okay because is_superuser has context PGC_INTERNAL and thus only hard-wired calls can change it. We'd need more thought before applying the flag to other GUCs; but maybe there are other use-cases.) This isn't the prettiest fix perhaps, but other alternatives we thought of would be much more invasive. While here, correct a thinko in commit 059de3ca4: when rejecting a GUC setting within a parallel worker, we should return 0 not -1 if the ereport doesn't longjmp. (This seems to have no consequences right now because no caller cares, but it's inconsistent.) Improve the comments to try to forestall future confusion of the same kind. Despite the lack of field complaints, this seems worth back-patching. Thanks to Nathan Bossart for the idea to invent a new flag, and for review. Discussion: https://postgr.es/m/[email protected]
2024-08-10	Lower minimum maintenance_work_mem to 64kB	John Naylor
	Since the introduction of TID store, vacuum uses far less memory in the common case than in versions 16 and earlier. Invoking multiple rounds of index vacuuming in turn requires a much larger table. It'd be a good idea anyway to cover this case in regression testing, and a lower limit is less painful for slow buildfarm animals. The reason to do it now is to re-enable coverage of the bugfix in commit 83c39a1f7f. For consistency, give autovacuum_work_mem the same treatment. Suggested by Andres Freund Tested by Melanie Plageman Backpatch to v17, where TID store was introduced Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/20240722164745.fvaoh6g6zprisqgp%40awork3.anarazel.de