path: root/src
Age | Commit message | Author
2023-04-13 | Skip the 004_io_direct.pl test if a pre-flight check fails. | Thomas Munro
The test previously had a list of OSes that direct I/O was expected to work on. That worked well enough for the systems in our build farm, but didn't survive contact with the Debian build bots running on tmpfs via overlayfs. tmpfs does not support O_DIRECT, but we don't want to exclude Linux generally. The new approach is to try to create an empty file with O_DIRECT from Perl first. If that fails, we'll skip the test and report what the error was. Reported-by: Christoph Berg <[email protected]> Reviewed-by: Dagfinn Ilmari Mannsåker <[email protected]> Reviewed-by: Andrew Dunstan <[email protected]> Discussion: https://postgr.es/m/ZDYd4A78cT2ULxZZ%40msg.df7cb.de
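The commit does its pre-flight check from Perl. As a rough illustration of the same idea in C, the sketch below tries to create an empty file with O_DIRECT and reports the errno if that fails. The helper names `probe_o_direct` and `format_skip_reason` are hypothetical and do not come from the PostgreSQL tree.

```c
#define _GNU_SOURCE             /* exposes O_DIRECT in <fcntl.h> on Linux/glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to create an empty file with O_DIRECT.
 * Returns 0 if the open succeeded, otherwise the errno value (> 0).
 */
static int
probe_o_direct(const char *path)
{
#ifdef O_DIRECT
    int         fd = open(path, O_RDWR | O_CREAT | O_DIRECT, 0600);

    if (fd < 0)
        return errno;           /* the filesystem or OS rejected the request */
    close(fd);
    unlink(path);
    return 0;
#else
    (void) path;
    return ENOTSUP;             /* this platform has no O_DIRECT flag at all */
#endif
}

/* Hypothetical helper: build the message a test would print when skipping. */
static void
format_skip_reason(int err, char *buf, size_t buflen)
{
    snprintf(buf, buflen, "skipping: O_DIRECT not usable here: %s",
             strerror(err));
}
```

A caller would run the real test only when `probe_o_direct()` returns 0, and otherwise print the formatted reason and skip.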
2023-04-12 | Remove overzealous assertion from PHJ. | Thomas Munro
We can't assert that we're the only process attached to a barrier after BarrierArriveAndDetachExceptLast(). Although that'll be true almost always, a late-starting parallel worker can attach very briefly (that is, immediately detach after checking the phase) right at that moment. BarrierArriveAndDetachExceptLast() already contains an assertion like that, but it holds a spinlock preventing the race. This thinko caused a one-off failure on build farm animal chimaera. Diagnosed-by: Melanie Plageman <[email protected]> Reported-by: Tom Lane <[email protected]> Discussion: https://postgr.es/m/[email protected]
2023-04-12 | Improve error messages introduced in be87200efd9 and 0fdab27ad68 | Andres Freund
Author: Kyotaro Horiguchi <[email protected]> Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/[email protected]
2023-04-12 | Revert "Catalog NOT NULL constraints" and fallout | Alvaro Herrera
This reverts commit e056c557aef4 and minor later fixes thereof. There are a few problems in this new feature -- most notably regarding pg_upgrade behavior, but others as well. This new feature is not in any way critical on its own, so instead of scrambling to fix it we revert it and try again in early 17 with these issues in mind. Discussion: https://postgr.es/m/[email protected]
2023-04-12 | Fix parallel-safety marking when moving initplans to another node. | Tom Lane
Our policy since commit ab77a5a45 has been that a plan node having any initplans is automatically not parallel-safe. (This could be relaxed, but not today.) clean_up_removed_plan_level neglected this, and could attach initplans to a parallel-safe child plan node without clearing the plan's parallel-safe flag. That could lead to "subplan was not initialized" errors at runtime, in case an initplan referenced another one and only the referencing one got transmitted to parallel workers. The fix in clean_up_removed_plan_level is trivial enough. materialize_finished_plan also moves initplans from one node to another, but it's okay because it already copies the source node's parallel_safe flag. The other place that does this kind of thing is standard_planner's hack to inject a top-level Gather when debug_parallel_query is active. But that's actually dead code given that we're correctly enforcing the "initplans aren't parallel safe" rule, so just replace it with an Assert that there are no initplans. Also improve some related comments. Normally we'd add a regression test case for this sort of bug. The mistake itself is already reached by existing tests, but there is accidentally no visible problem. The only known test case that creates an actual failure seems too indirect and fragile to justify keeping it as a regression test (not least because it fails to fail in v11, though the bug is clearly present there too). Per report from Justin Pryzby. Back-patch to all supported branches. Discussion: https://postgr.es/m/[email protected]
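The rule described above can be sketched in a few lines of C. This is only a toy model: `PlanSketch` and `move_initplans` are hypothetical names, and the real planner structures carry far more state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for a plan node. */
typedef struct PlanSketch
{
    int         n_initplans;    /* number of initplans attached to this node */
    bool        parallel_safe;  /* may this node run in a parallel worker? */
} PlanSketch;

/*
 * Move all initplans from "parent" (a plan level being removed) to "child".
 * Under the rule that a node with any initplans is not parallel-safe, the
 * child's flag must be cleared whenever it ends up owning initplans.
 */
static void
move_initplans(PlanSketch *parent, PlanSketch *child)
{
    child->n_initplans += parent->n_initplans;
    parent->n_initplans = 0;

    if (child->n_initplans > 0)
        child->parallel_safe = false;
}
```

Omitting the final check reproduces the class of bug the commit describes: the child keeps claiming to be parallel-safe while carrying initplans.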
2023-04-12 | Fix incorrect format placeholders | Peter Eisentraut
2023-04-12 | Fix detection of unseekable files for fseek() and ftello() with MSVC | Michael Paquier
Calling fseek() or ftello() on a handle to a non-seeking device such as a pipe or a communications device is not supported. Unfortunately, MSVC's flavors of these routines, _fseeki64() and _ftelli64(), do not return an error when given a pipe as handle. Some of the logic of pg_dump and pg_restore relies on these routines to check if a handle is seekable, causing failures when passing the contents of pg_dump to pg_restore through a pipe, for example. This commit introduces wrappers for fseeko() and ftello() on MSVC so that any callers are able to properly detect the cases of non-seekable handles. This relies mainly on GetFileType(), sharing a bit of code with the MSVC port for fstat(). The code in charge of getting a file type is refactored into a new file called win32common.c, shared by win32stat.c and the new win32fseek.c. It includes the MSVC ports for fseeko() and ftello(). Like 765f5df, this is backpatched down to 14, where the fstat() implementation for MSVC is able to handle files larger than 4GB in size. Using a TAP test for that is proving to be tricky as IPC::Run handles the pipes by itself; still, I have been able to check the fix manually. Reported-by: Daniel Watzinger Author: Juan José Santamaría Flecha, Michael Paquier Discussion: https://postgr.es/m/CAC+AXB26a4EmxM2suXxPpJaGrqAdxracd7hskLg-zxtPB50h7A@mail.gmail.com Backpatch-through: 14
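For context, this is the kind of seekability probe callers rely on, shown here for a POSIX system where fseeko() and ftello() do fail on pipes. It is a sketch rather than the pg_dump code, and `stream_is_seekable` is a hypothetical name. The MSVC wrappers described above aim to make the same probe give the same answer on Windows.

```c
#define _POSIX_C_SOURCE 200809L /* for fseeko(), ftello(), fdopen(), pipe() */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: report whether a stdio stream supports seeking.
 * Each step must succeed; on a pipe at least one of them fails.
 */
static bool
stream_is_seekable(FILE *fp)
{
    off_t       pos;

    if (fseeko(fp, 0, SEEK_CUR) != 0)
        return false;
    pos = ftello(fp);
    if (pos < 0)
        return false;
    return fseeko(fp, pos, SEEK_SET) == 0;
}
```

A regular temporary file passes this check, while the read end of a pipe wrapped with fdopen() does not.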
2023-04-11 | Refine the guidelines for rmgrdesc authors. | Peter Geoghegan
Clarify the goals of the recently added guidelines for rmgrdesc authors: to avoid gratuitous inconsistencies across resource managers, and to make it reasonably easy to write a reusable custom parser. Beyond that, the guidelines leave rmgrdesc authors with a significant amount of leeway. This even includes the leeway to invent custom conventions (in cases where it's warranted). Follow-up to commit 7d8219a4. Author: Peter Geoghegan <[email protected]> Reviewed-By: Melanie Plageman <[email protected]> Discussion: https://postgr.es/m/CAH2-WzkbYuvwYKm-Y-72QEh6SPMQcAo9uONv+mR3bMGcu9E_Cg@mail.gmail.com
2023-04-11 | Fix Heap rmgr's desc output for infobits arrays. | Peter Geoghegan
Make heap desc routines that output status bit as arrays of constants avoid outputting array literals that contain superfluous punctuation characters that complicate parsing the output. Also make sure that no heap desc routine repeats the same key name (at the same nesting level), for the same reason. Arguably, these were both oversights in commit 7d8219a4. In passing, make the desc output code (which covers Heap's DELETE, UPDATE, HOT_UPDATE, LOCK, and LOCK_UPDATED record types) consistent in terms of the output order of each field. This order also matches WAL record struct order. Heap's DELETE desc output now shows the record's xmax field for the first time (just like UPDATE/HOT_UPDATE records). Author: Peter Geoghegan <[email protected]> Reviewed-By: Melanie Plageman <[email protected]> Discussion: https://postgr.es/m/CAH2-Wz=pNYtxiJ2Jx5Lj=fKo1OEZ4GE0p_kct+ugAUTqBwU46g@mail.gmail.com
2023-04-11 | Fix xl_heap_lock WAL record field's data type. | Peter Geoghegan
Make xl_heap_lock's infobits_set field of type uint8, not int8. Using int8 isn't appropriate given that the field just holds status bits. This fixes an oversight in commit 0ac5ad5134. In passing, rename the nearby TransactionId field to "xmax" to make things consistent with related records, such as xl_heap_lock_updated. Deliberately avoid a bump in XLOG_PAGE_MAGIC. No backpatch, either. Author: Peter Geoghegan <[email protected]> Discussion: https://postgr.es/m/CAH2-WzkCd3kOS8b7Rfxw7Mh1_6jvX=Nzo-CWR1VBTiOtVZkWHA@mail.gmail.com
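One general reason a signed byte is a poor container for status bits, independent of this particular record: when the top bit is set, widening the value sign-extends it and sets every higher bit as well. The sketch below uses standard fixed-width types and hypothetical helper names to show the difference.

```c
#include <assert.h>
#include <stdint.h>

/* Widening a signed byte sign-extends: the value -128 (pattern 0x80) stays -128. */
static int
widen_signed(int8_t bits)
{
    return bits;
}

/* Widening an unsigned byte zero-extends: the pattern 0x80 becomes 128. */
static int
widen_unsigned(uint8_t bits)
{
    return bits;
}
```

After widening, the signed variant has bits above bit 7 set, so masks or comparisons against wider flag constants can give surprising results. The unsigned variant does not.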
2023-04-11 | 035_standby_logical_decoding: Add missing waits for replication | Andres Freund
At least one slow buildfarm system (hoverfly) showed that the database creation was not replicated before we try to create logical replication slots on the standby, in that database. Reported-by: Noah Misch <[email protected]> Author: "Drouvot, Bertrand" <[email protected]> Discussion: https://postgr.es/m/[email protected]
2023-04-11 | Fix uninitialized variable in transformTableLikeClause() | David Rowley
process_notnull_constraints should be set to false until we discover a NOT NULL column. Discovered while running Valgrind. Discussion: https://postgr.es/m/CAApHDvoMyiZVi1KW5WVdqMRzWsWkD3F7n6QD+BbAO6WTeAWsUQ@mail.gmail.com
2023-04-11 | Improve ereports for VACUUM's BUFFER_USAGE_LIMIT option | David Rowley
There's no need to check if opt->arg is NULL since defGetString() already does that and raises an ERROR if it is. Let's just remove that check. Also, combine the two remaining ERRORs into a single check. It seems better to give an indication about what sort of values we're looking for rather than just to state that the value given isn't valid. Make BUFFER_USAGE_LIMIT uppercase in this ERROR message too. It's already upper case in one other error message, so make that consistent. Reported-by: Kyotaro Horiguchi Discussion: https://postgr.es/m/[email protected]
2023-04-11 | Clarify nbtree posting list update desc issue. | Peter Geoghegan
Per complaint from Melanie Plageman. Follow-up to commit 5d6728e5. Reported-By: Melanie Plageman <[email protected]> Discussion: https://postgr.es/m/20230411002315.oyaicmcqrq2hb3ek@liskov
2023-04-10 | Fix nbtree posting list update desc output. | Peter Geoghegan
We cannot use the generic array_desc approach with per-tuple nbtree posting list update metadata because array_desc can only deal with fixed width elements (e.g., page offset numbers). Using array_desc led to incorrect rmgr descriptions for updates from nbtree DELETE/VACUUM WAL records. To fix, add specialized code to describe the update metadata as array elements in desc output. We now iterate over the update metadata using an approach that matches related REDO routines. Also stop showing the updates offset number array separately in nbtree DELETE/VACUUM desc output. It's redundant information, since the same page offset numbers appear in the description of each individual update element. Also make some small tweaks to the way that we format arrays in all desc routines (not just nbtree desc routines) to make arrays a little less verbose. Oversight in commit 1c453cfd, which enhanced the nbtree rmgr desc routines. Author: Peter Geoghegan <[email protected]> Discussion: https://postgr.es/m/CAH2-WzkbYuvwYKm-Y-72QEh6SPMQcAo9uONv+mR3bMGcu9E_Cg@mail.gmail.com
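To illustrate why a fixed-stride array walker cannot describe such data, the sketch below iterates over variable-width elements in a byte buffer. The layout used here (a uint16 count followed by that many uint16 values) is an assumption made for the example, not the actual nbtree WAL format, and `count_update_elements` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical layout: each element is a uint16 count followed by that many
 * uint16 values, so elements differ in size and cannot be indexed with a
 * fixed stride.  Returns the number of elements, or -1 if the buffer is
 * malformed.
 */
static int
count_update_elements(const char *buf, size_t len, int *total_values)
{
    size_t      off = 0;
    int         nelems = 0;

    *total_values = 0;
    while (off + sizeof(uint16_t) <= len)
    {
        uint16_t    nvalues;

        /* memcpy avoids unaligned reads from the raw buffer */
        memcpy(&nvalues, buf + off, sizeof(nvalues));
        off += sizeof(nvalues);

        if (off + (size_t) nvalues * sizeof(uint16_t) > len)
            return -1;          /* truncated element */
        off += (size_t) nvalues * sizeof(uint16_t);

        nelems++;
        *total_values += nvalues;
    }
    return off == len ? nelems : -1;
}
```

The walker has to read each element's header to learn where the next element starts, which is the step a generic fixed-width helper has no way to perform.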
2023-04-08 | Simplify version check for SKIP clause | Daniel Gustafsson
Checking for the required versions of IO::Pty as well as IPC::Run can be achieved with a single eval call, and by using the VERSION function the comparison is guaranteed to follow the same rules as calling 'use' on the module with a version. Reported-by: Andrew Dunstan <[email protected]> Discussion: https://postgr.es/m/[email protected]
2023-04-08 | Use higher wal_level for 004_io_direct.pl. | Thomas Munro
The new direct I/O test deliberately uses a very small shared_buffers to force some disk transfers without making the data set large and slow, but ran into a problem with wal_level = minimal: log_newpage_range() pins many buffers, leading to a few intermittent "no unpinned buffers available" errors. We could presumably fix that by adjusting shared_buffers, but crake seems to be trying to tell us something interesting with these settings, so let's just avoid wal_level = minimal in this test for now. Reported-by: Andres Freund <[email protected]> Discussion: https://postgr.es/m/20230408060408.n7xdwk3mxj5oykt6%40awork3.anarazel.de
2023-04-08 | Improve indentation of multiline initialization expressions. | Tom Lane
If a variable has an initialization expression that wraps onto the next line(s), pg_bsd_indent will now indent the continuation lines one stop, instead of aligning them flush with the variable declaration. We've been holding off applying this until the last v16 CF finished, but now it's time. Thomas Munro and Tom Lane Discussion: https://postgr.es/m/[email protected]
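The layout change can be pictured with a small function. The exact whitespace pg_bsd_indent emits is not reproduced here, and the function is a made-up example; only the relative position of the continuation lines matters.

```c
#include <assert.h>

static int
sum_three(int first_value, int second_value, int third_value)
{
    /*
     * Old layout, continuation lines flush with the declaration:
     *
     *     int         total = first_value +
     *     second_value +
     *     third_value;
     */

    /* New layout: continuation lines are indented one further stop. */
    int         total = first_value +
        second_value +
        third_value;

    return total;
}
```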
2023-04-08 | Try to unbreak MSVC builds for pg_waldump | Andrew Dunstan
Remedy an omission in commit 7d8219a444.
2023-04-08 | Suppress bogus printout during new 035_standby_logical_decoding.pl test. | Tom Lane
Our convention for some time has been that successful tests shouldn't print anything on stderr. A stray "diag" call violated that, and for that matter messed up the normal TAP progress display.
2023-04-08 | Skip \password TAP test on old IPC::Run versions | Daniel Gustafsson
IPC::Run versions prior to 0.98 cause the interactive session to time out, so SKIP the test in case these versions are detected (they are within the base requirement for our TAP tests in general). Error reported by the BF and investigation by Tom Lane. Discussion: https://postgr.es/m/[email protected]
2023-04-08 | Try to unbreak MSVC builds for fuzzystrmatch | Andrew Dunstan
Commit a290378a37 neglected to add a recipe for MSVC to build the daitch_mokotoff.h file. Per buildfarm animal bowerbird.
2023-04-08 | Revert "Add support for Kerberos credential delegation" | Stephen Frost
This reverts commit 3d4fa227bce4294ce1cc214b4a9d3b7caa3f0454. Per discussion and buildfarm, this depends on APIs that seem to not be available on at least one platform (NetBSD). It should certainly be possible to rework this to be optional on that platform if necessary, but it is a bit late for that at this point. Discussion: https://postgr.es/m/[email protected]
2023-04-08 | Redesign interrupt/cancel API for regex engine. | Thomas Munro
Previously, a PostgreSQL-specific callback checked by the regex engine had a way to trigger a special error code REG_CANCEL if it detected that the next call to CHECK_FOR_INTERRUPTS() would certainly throw via ereport(). A later proposed bugfix aims to move some complex logic out of signal handlers, so that it won't run until the next CHECK_FOR_INTERRUPTS(), which makes the above design impossible unless we split CHECK_FOR_INTERRUPTS() into two phases, one to run logic and another to ereport(). We may develop such a system in the future, but for the regex code it is no longer necessary. An earlier commit moved regex memory management over to our MemoryContext system. Given that the purpose of the two-phase interrupt checking was to free memory before throwing, something we don't need to worry about anymore, it seems simpler to inject CHECK_FOR_INTERRUPTS() directly into cancelation points, and just let it throw. Since the plan is to keep PostgreSQL-specific concerns separate from the main regex engine code (with a view to being able to stay in sync with other projects), do this with a new macro INTERRUPT(), customizable in regcustom.h and defaulting to nothing. Reviewed-by: Tom Lane <[email protected]> Discussion: https://postgr.es/m/CA%2BhUKGK3PGKwcKqzoosamn36YW-fsuTdOPPF1i_rtEO%3DnEYKSg%40mail.gmail.com
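The "customizable macro defaulting to nothing" pattern looks roughly like the sketch below. The zero-argument form, the counter, and `engine_run` are assumptions made for the example. In PostgreSQL the host definition would invoke CHECK_FOR_INTERRUPTS(), which may throw instead of returning.

```c
#include <assert.h>

/* --- host customization (the role the text assigns to regcustom.h) --- */
static int  interrupt_checks = 0;   /* hypothetical counter for this demo */
#define INTERRUPT() (interrupt_checks++)

/* --- engine side: default to nothing if the host did not define it --- */
#ifndef INTERRUPT
#define INTERRUPT() ((void) 0)
#endif

/* Hypothetical long-running engine loop with one cancellation point per step. */
static int
engine_run(int nsteps)
{
    int         work = 0;

    for (int i = 0; i < nsteps; i++)
    {
        INTERRUPT();            /* a real host may throw or longjmp from here */
        work += i;
    }
    return work;
}
```

Because the engine only ever names the macro, it stays free of host-specific calls and compiles unchanged when no host definition is supplied.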
2023-04-08 | Update tsearch regex memory management. | Thomas Munro
Now that our regex engine uses palloc(), it's not necessary to set up a special memory context callback to free compiled regexes. The regex has no resources other than the memory that is already going to be freed in bulk. Reviewed-by: Tom Lane <[email protected]> Discussion: https://postgr.es/m/CA%2BhUKGK3PGKwcKqzoosamn36YW-fsuTdOPPF1i_rtEO%3DnEYKSg%40mail.gmail.com
2023-04-08 | Use MemoryContext API for regex memory management. | Thomas Munro
Previously, regex_t objects' memory was managed with malloc() and free() directly. Switch to palloc()-based memory management instead. Advantages: * memory used by cached regexes is now visible with MemoryContext observability tools * cleanup can be done automatically in certain failure modes (something that later commits will take advantage of) * cleanup can be done in bulk On the downside, there may be more fragmentation (wasted memory) due to per-regex MemoryContext objects. This is a problem shared with other cached objects in PostgreSQL and can probably be improved with later tuning. Thanks to Noah Misch for suggesting this general approach, which unblocks later work on interrupts. Suggested-by: Noah Misch <[email protected]> Reviewed-by: Tom Lane <[email protected]> Discussion: https://postgr.es/m/CA%2BhUKGK3PGKwcKqzoosamn36YW-fsuTdOPPF1i_rtEO%3DnEYKSg%40mail.gmail.com
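The "cleanup can be done in bulk" advantage is easiest to see in a miniature allocator. The sketch below is not PostgreSQL's MemoryContext implementation, and all names in it are hypothetical. It only shows how tying allocations to an owning context lets one call release everything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical chunk header; the union keeps the payload max-aligned. */
typedef union ChunkSketch
{
    union ChunkSketch *next;
    max_align_t align;
} ChunkSketch;

/* Hypothetical miniature "context": a linked list of owned chunks. */
typedef struct ContextSketch
{
    ChunkSketch *chunks;
    size_t      nchunks;
} ContextSketch;

/* Allocate "size" bytes owned by the context; returns NULL on failure. */
static void *
context_alloc(ContextSketch *cxt, size_t size)
{
    ChunkSketch *chunk = malloc(sizeof(ChunkSketch) + size);

    if (chunk == NULL)
        return NULL;
    chunk->next = cxt->chunks;
    cxt->chunks = chunk;
    cxt->nchunks++;
    return chunk + 1;           /* payload starts right after the header */
}

/* Free everything the context owns in one pass. */
static void
context_reset(ContextSketch *cxt)
{
    while (cxt->chunks != NULL)
    {
        ChunkSketch *next = cxt->chunks->next;

        free(cxt->chunks);
        cxt->chunks = next;
    }
    cxt->nchunks = 0;
}
```

Callers never free individual allocations. An error path only needs to reset or drop the context, which is the property the later interrupt work depends on.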
2023-04-08 | TAP test for logical decoding on standby | Andres Freund
Author: "Drouvot, Bertrand" <[email protected]> Author: Amit Khandekar <[email protected]> Author: Craig Ringer <[email protected]> (in an older version) Author: Andres Freund <[email protected]> Reviewed-by: "Drouvot, Bertrand" <[email protected]> Reviewed-by: Andres Freund <[email protected]> Reviewed-by: Robert Haas <[email protected]> Reviewed-by: Amit Kapila <[email protected]> Reviewed-by: Fabrízio de Royes Mello <[email protected]>
2023-04-08 | Allow logical decoding on standbys | Andres Freund
Unsurprisingly, this requires wal_level = logical to be set on the primary and standby. The infrastructure added in 26669757b6a ensures that slots are invalidated if the primary's wal_level is lowered. Creating a slot on a standby waits for an xl_running_xact record to be processed. If the primary is idle (and thus not emitting xl_running_xact records), that can take a while. To make that faster, this commit also introduces the pg_log_standby_snapshot() function. By executing it on the primary, completion of slot creation on the standby can be accelerated. Note that logical decoding on a standby does not itself enforce that required catalog rows are not removed. The user has to use physical replication slots + hot_standby_feedback or other measures to prevent that. If catalog rows required for a slot are removed, the slot is invalidated. See 6af1793954e for an overall design of logical decoding on a standby. Bumps catversion, for the addition of the pg_log_standby_snapshot() function. Author: "Drouvot, Bertrand" <[email protected]> Author: Andres Freund <[email protected]> (in an older version) Author: Amit Khandekar <[email protected]> (in an older version) Reviewed-by: Andres Freund <[email protected]> Reviewed-by: Fabrízio de Royes Mello <[email protected]> Reviewed-by: Amit Kapila <[email protected]> Reviewed-By: Robert Haas <[email protected]>
2023-04-08 | For cascading replication, wake physical and logical walsenders separately | Andres Freund
Physical walsenders can't send data until it's been flushed; logical walsenders can't decode and send data until it's been applied. On the standby, the WAL is flushed first, which will only wake up physical walsenders; and then applied, which will only wake up logical walsenders. Previously, all walsenders were awakened when the WAL was flushed. That was fine for logical walsenders on the primary; but on the standby the flushed WAL would have been not applied yet, so logical walsenders were awakened too early. Per idea from Jeff Davis and Amit Kapila. Author: "Drouvot, Bertrand" <[email protected]> Reviewed-By: Jeff Davis <[email protected]> Reviewed-By: Robert Haas <[email protected]> Reviewed-by: Amit Kapila <[email protected]> Reviewed-by: Masahiko Sawada <[email protected]> Discussion: https://postgr.es/m/CAA4eK1+zO5LUeisabX10c81LU-fWMKO4M9Wyg1cdkbW7Hqh6vQ@mail.gmail.com
2023-04-08 | Handle logical slot conflicts on standby | Andres Freund
During WAL replay on the standby, when a conflict with a logical slot is identified, invalidate such slots. There are two sources of conflicts: 1) Using the information added in 6af1793954e, logical slots are invalidated if required rows are removed 2) wal_level on the primary server is reduced to below logical Uses the infrastructure introduced in the prior commit. FIXME: add commit reference. Change InvalidatePossiblyObsoleteSlot() to use a recovery conflict to interrupt use of a slot, if called in the startup process. The new recovery conflict is added to pg_stat_database_conflicts, as confl_active_logicalslot. See 6af1793954e for an overall design of logical decoding on a standby. Bumps catversion for the addition of the pg_stat_database_conflicts column. Bumps PGSTAT_FILE_FORMAT_ID for the same reason. Author: "Drouvot, Bertrand" <[email protected]> Author: Andres Freund <[email protected]> Author: Amit Khandekar <[email protected]> (in an older version) Reviewed-by: "Drouvot, Bertrand" <[email protected]> Reviewed-by: Andres Freund <[email protected]> Reviewed-by: Robert Haas <[email protected]> Reviewed-by: Fabrízio de Royes Mello <[email protected]> Reviewed-by: Bharath Rupireddy <[email protected]> Reviewed-by: Amit Kapila <[email protected]> Reviewed-by: Alvaro Herrera <[email protected]> Discussion: https://postgr.es/m/[email protected]
2023-04-08 | Support invalidating replication slots due to horizon and wal_level | Andres Freund
Needed for logical decoding on a standby. Slots need to be invalidated because of the horizon if rows required for logical decoding are removed. If the primary's wal_level is lowered from 'logical', logical slots on the standby need to be invalidated. The new invalidation methods will be used in a subsequent commit. Logical slots that have been invalidated can be identified via the new pg_replication_slots.conflicting column. See 6af1793954e for an overall design of logical decoding on a standby. Bumps catversion for the addition of the new pg_replication_slots column. Author: "Drouvot, Bertrand" <[email protected]> Author: Andres Freund <[email protected]> Author: Amit Khandekar <[email protected]> (in an older version) Reviewed-by: "Drouvot, Bertrand" <[email protected]> Reviewed-by: Andres Freund <[email protected]> Reviewed-by: Robert Haas <[email protected]> Reviewed-by: Fabrízio de Royes Mello <[email protected]> Reviewed-by: Bharath Rupireddy <[email protected]> Reviewed-by: Amit Kapila <[email protected]> Reviewed-by: Melanie Plageman <[email protected]> Reviewed-by: Alvaro Herrera <[email protected]> Discussion: https://postgr.es/m/[email protected]
2023-04-08 | Fix underspecified sort order in inherit.sql | Andres Freund
Introduced in e056c557aef4. Per buildfarm member prion.
2023-04-08 | Prevent use of invalidated logical slot in CreateDecodingContext() | Andres Freund
Previously we had checks for this in multiple places. Support for logical decoding on standbys will add other forms of invalidation, making it worth while to centralize the checks. This slightly changes the error message for both the walsender and SQL interface. Particularly the SQL interface error was inaccurate, as the "This slot has never previously reserved WAL" portion was unreachable. Reviewed-by: "Drouvot, Bertrand" <[email protected]> Reviewed-by: Melanie Plageman <[email protected]> Discussion: https://postgr.es/m/[email protected]
2023-04-08 | Replace replication slot's invalidated_at LSN with an enum | Andres Freund
This is mainly useful because the upcoming logical-decoding-on-standby feature adds further reasons for invalidating slots, and we don't want to end up with multiple invalidated_* fields, or check different attributes. Eventually we should consider not resetting restart_lsn when invalidating a slot due to max_slot_wal_keep_size. But that's a user visible change, so left for later. Increases SLOT_VERSION, due to the changed field (with a different alignment, no less). Reviewed-by: "Drouvot, Bertrand" <[email protected]> Reviewed-by: Alvaro Herrera <[email protected]> Reviewed-by: Melanie Plageman <[email protected]> Discussion: https://postgr.es/m/[email protected]
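A single cause enum replaces what would otherwise become several invalidated_* fields. The sketch below uses hypothetical identifiers and message strings. The three causes mirror the reasons named in this log (max_slot_wal_keep_size, horizon, wal_level), but the real PostgreSQL names are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical cause enum: one field answers both "invalid?" and "why?". */
typedef enum SlotInvalCauseSketch
{
    SLOT_INVAL_NONE = 0,        /* slot is valid */
    SLOT_INVAL_WAL_REMOVED,     /* required WAL was removed */
    SLOT_INVAL_HORIZON,         /* required rows were removed */
    SLOT_INVAL_WAL_LEVEL        /* wal_level dropped below logical */
} SlotInvalCauseSketch;

static bool
slot_is_usable(SlotInvalCauseSketch cause)
{
    return cause == SLOT_INVAL_NONE;
}

/* Hypothetical human-readable reason for an error message. */
static const char *
slot_inval_reason(SlotInvalCauseSketch cause)
{
    switch (cause)
    {
        case SLOT_INVAL_NONE:
            return "not invalidated";
        case SLOT_INVAL_WAL_REMOVED:
            return "required WAL removed";
        case SLOT_INVAL_HORIZON:
            return "required rows removed";
        case SLOT_INVAL_WAL_LEVEL:
            return "wal_level too low";
    }
    return "unknown";
}
```

Adding a new reason then means adding one enum member and one switch arm, and every validity check keeps comparing against the same "none" value.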
2023-04-08 | Add io_direct setting (developer-only). | Thomas Munro
Provide a way to ask the kernel to use O_DIRECT (or local equivalent) where available for data and WAL files, to avoid or minimize kernel caching. This hurts performance currently and is not intended for end users yet. Later proposed work would introduce our own I/O clustering, read-ahead, etc to replace the facilities the kernel disables with this option. The only user-visible change, if the developer-only GUC is not used, is that this commit also removes the obscure logic that would activate O_DIRECT for the WAL when wal_sync_method=open_[data]sync and wal_level=minimal (which also requires max_wal_senders=0). Those are non-default and unlikely settings, and this behavior wasn't (correctly) documented. The same effect can be achieved with io_direct=wal. Author: Thomas Munro <[email protected]> Author: Andres Freund <[email protected]> Author: Bharath Rupireddy <[email protected]> Reviewed-by: Justin Pryzby <[email protected]> Reviewed-by: Bharath Rupireddy <[email protected]> Discussion: https://postgr.es/m/CA%2BhUKGK1X532hYqJ_MzFWt0n1zt8trz980D79WbjwnT-yYLZpg%40mail.gmail.com
2023-04-08 | Introduce PG_IO_ALIGN_SIZE and align all I/O buffers. | Thomas Munro
In order to have the option to use O_DIRECT/FILE_FLAG_NO_BUFFERING in a later commit, we need the addresses of user space buffers to be well aligned. The exact requirements vary by OS and file system (typically sectors and/or memory pages). The address alignment size is set to 4096, which is enough for currently known systems: it matches modern sectors and common memory page size. There is no standard governing O_DIRECT's requirements so we might eventually have to reconsider this with more information from the field or future systems. Aligning I/O buffers on memory pages is also known to improve regular buffered I/O performance. Three classes of I/O buffers for regular data pages are adjusted: (1) Heap buffers are now allocated with the new palloc_aligned() or MemoryContextAllocAligned() functions introduced by commit 439f6175. (2) Stack buffers now use a new struct PGIOAlignedBlock to respect PG_IO_ALIGN_SIZE, if possible with this compiler. (3) The buffer pool is also aligned in shared memory. WAL buffers were already aligned on XLOG_BLCKSZ. It's possible for XLOG_BLCKSZ to be configured smaller than PG_IO_ALIGN_SIZE and thus for O_DIRECT WAL writes to fail to be well aligned, but that's a pre-existing condition and will be addressed by a later commit. BufFiles are not yet addressed (there's no current plan to use O_DIRECT for those, but they could potentially get some incidental speedup even in plain buffered I/O operations through better alignment). If we can't align stack objects suitably using the compiler extensions we know about, we disable the use of O_DIRECT by setting PG_O_DIRECT to 0. This avoids the need to consider systems that have O_DIRECT but can't align stack objects the way we want; such systems could in theory be supported with more work but we don't currently know of any such machines, so it's easier to pretend there is no O_DIRECT support instead. That's an existing and tested class of system.
Add assertions that all buffers passed into smgrread(), smgrwrite() and smgrextend() are correctly aligned, unless PG_O_DIRECT is 0 (= stack alignment tricks may be unavailable) or the block size has been set too small to allow arrays of buffers to be all aligned. Author: Thomas Munro <[email protected]> Author: Andres Freund <[email protected]> Reviewed-by: Justin Pryzby <[email protected]> Discussion: https://postgr.es/m/CA+hUKGK1X532hYqJ_MzFWt0n1zt8trz980D79WbjwnT-yYLZpg@mail.gmail.com
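The alignment requirement and the kind of assertion described above can be sketched with standard C11 facilities. The names below are hypothetical stand-ins. PostgreSQL itself uses palloc_aligned() and struct PGIOAlignedBlock as stated in the message, and the 8192-byte block size is an assumption for the example.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define IO_ALIGN_SKETCH 4096    /* the value the text gives for the alignment */
#define BLOCK_SIZE_SKETCH 8192  /* assumed block size for this example */

/* Stack- or static-allocated block forced onto an I/O-aligned address. */
typedef struct AlignedBlockSketch
{
    alignas(IO_ALIGN_SKETCH) char data[BLOCK_SIZE_SKETCH];
} AlignedBlockSketch;

/* The check an assertion in an I/O routine could perform on its buffer. */
static bool
is_io_aligned(const void *ptr)
{
    return ((uintptr_t) ptr % IO_ALIGN_SKETCH) == 0;
}

/* Heap counterpart using C11 aligned_alloc(); release with free(). */
static void *
alloc_io_block(void)
{
    /* the size must be a multiple of the alignment, which 8192 is */
    return aligned_alloc(IO_ALIGN_SKETCH, BLOCK_SIZE_SKETCH);
}
```

An I/O wrapper could then start with `assert(is_io_aligned(buffer))`, which catches a misaligned caller during development rather than as a failed direct I/O request at run time.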
2023-04-08 | Add missing .gitignore entry. | Tom Lane
Seems an oversight in 7d8219a44. Fix before somebody commits a generated file.
2023-04-08 | Add support for Kerberos credential delegation | Stephen Frost
Support GSSAPI/Kerberos credentials being delegated to the server by a client. With this, a user authenticating to PostgreSQL using Kerberos (GSSAPI) credentials can choose to delegate their credentials to the PostgreSQL server (which can choose to accept them, or not), allowing the server to then use those delegated credentials to connect to another service, such as with postgres_fdw or dblink or theoretically any other service which is able to be authenticated using Kerberos. Both postgres_fdw and dblink are changed to allow non-superuser password-less connections but only when GSSAPI credentials have been delegated to the server by the client and GSSAPI is used to authenticate to the remote system. Authors: Stephen Frost, Peifeng Qiu Reviewed-By: David Christensen Discussion: https://postgr.es/m/CO1PR05MB8023CC2CB575E0FAAD7DF4F8A8E29@CO1PR05MB8023.namprd05.prod.outlook.com
2023-04-08 | Track IO times in pg_stat_io | Andres Freund
a9c70b46dbe and 8aaa04b32S added counting of IO operations to a new view, pg_stat_io. Now, add IO timing for reads, writes, extends, and fsyncs to pg_stat_io as well. This combines the tracking for pgBufferUsage with the tracking for pg_stat_io into a new function pgstat_count_io_op_time(). This should make it a bit easier to avoid the somewhat costly instr_time conversion done for pgBufferUsage. Author: Melanie Plageman <[email protected]> Reviewed-by: Andres Freund <[email protected]> Reviewed-by: Bertrand Drouvot <[email protected]> Discussion: https://postgr.es/m/flat/CAAKRu_ay5iKmnbXZ3DsauViF3eMxu4m1oNnJXqV_HyqYeg55Ww%40mail.gmail.com
2023-04-07 | Show more detail in nbtree rmgr descriptions. | Peter Geoghegan
Show a detailed description of the page offset number arrays that appear in certain nbtree WAL records. Also brings nbtree desc routines in line with the guidelines established by recent commit 7d8219a4. Author: Melanie Plageman <[email protected]> Reviewed-By: Peter Geoghegan <[email protected]> Discussion: https://postgr.es/m/flat/20230109215842.fktuhesvayno6o4g%40awork3.anarazel.de
2023-04-07 | For Kerberos testing, disable DNS lookups | Stephen Frost
Similar to 8dff2f224, this disables DNS lookups by the Kerberos library to look up the KDC and the realm while the Kerberos tests are running. In some environments, these lookups can take a long time and end up timing out and causing tests to fail. Further, since this isn't really our domain, we shouldn't be sending out these DNS requests during our tests.
2023-04-07 | Show more detail in heapam rmgr descriptions. | Peter Geoghegan
Add helper functions that output arrays in a standard format, and use the functions inside heapdesc routines. This allows tools like pg_walinspect to show a detailed description of the page offset number arrays for records like PRUNE and VACUUM (unless there was an FPI). Also document the conventions that desc routines should follow. Only the heapdesc routines follow the conventions for now, so they're just guidelines for the time being. Based on a suggestion from Andres Freund. Author: Melanie Plageman <[email protected]> Reviewed-By: Peter Geoghegan <[email protected]> Discussion: https://postgr.es/m/flat/20230109215842.fktuhesvayno6o4g%40awork3.anarazel.de
2023-04-07 | Fix table name clash in recently introduced test | Andres Freund
A few buildfarm animals recently started complaining about the "child" relation already existing. e056c557aef added a new child table to inherit.sql, but triggers.sql, running in the same parallel group, also uses a child table. Rename the new table to inh_child. It may be worth renaming child, parent in other tests as well, but that's work for another day. Discussion: https://postgr.es/m/[email protected]
2023-04-07 | Improve IO accounting for temp relation writes | Andres Freund
Both pgstat_database and pgBufferUsage count IO timing for reads of temporary relation blocks into local buffers. However, both failed to count write IO timing for flushes of dirty local buffers. Fix. Additionally, FlushRelationBuffers() seems to have omitted counting write IO (both count and timing) stats for both pgstat_database and pgBufferUsage. Fix. Author: Melanie Plageman <[email protected]> Reviewed-by: Andres Freund <[email protected]> Discussion: https://postgr.es/m/20230321023451.7rzy4kjj2iktrg2r%40awork3.anarazel.de
2023-04-07Test SCRAM iteration changes with psql \passwordDaniel Gustafsson
A version of this test was included in the original patch for altering SCRAM iteration count, but was omitted due to how interactive psql TAP sessions worked before being refactored. Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/[email protected]
2023-04-07Refactor background psql TAP functionsDaniel Gustafsson
This breaks out the background and interactive psql functionality into a new class, PostgreSQL::Test::BackgroundPsql. Sessions are still initiated via PostgreSQL::Test::Cluster, but once started they can be manipulated by the new helper functions, which are intended to make querying easier. A sample session for a command which can be expected to finish at a later time can be seen below.

    my $session = $node->background_psql('postgres');
    $session->query_until(qr/start/, q(
    \echo start
    CREATE INDEX CONCURRENTLY idx ON t(a);
    ));
    $session->quit;

Patch by Andres Freund with some additional hacking by me. Author: Andres Freund <[email protected]> Reviewed-by: Andrew Dunstan <[email protected]> Discussion: https://postgr.es/m/[email protected]
2023-04-07Fix underspecified sort order in test queryAlvaro Herrera
Failure introduced in e056c557aef4.
2023-04-07Catalog NOT NULL constraintsAlvaro Herrera
We now create pg_constraint rows for NOT NULL constraints with contype='n'. We propagate these constraints during operations such as adding inheritance relationships, creating and attaching partitions, and creating tables LIKE other tables. We mostly follow the well-known rules of conislocal and coninhcount that we have for CHECK constraints, with some adaptations; for example, as opposed to CHECK constraints, we don't match NOT NULL ones by name when descending a hierarchy to alter it; instead we match by column number. This means we don't require the constraint names to be identical across a hierarchy. For now, we omit these constraints for system catalogs. Maybe this is worth reconsidering. We don't support NOT VALID or DEFERRABLE clauses either; these can be added as separate features later (this patch is already large and complicated enough). This has been very long in the making. The first patch was written by Bernd Helmle in 2010 to add a new pg_constraint.contype value ('n'), which I (Álvaro) then hijacked in 2011 and 2012, until that one was killed by the realization that we ought to use contype='c' instead: manufactured CHECK constraints. However, later SQL standard development, as well as nonobvious emergent properties of that design (mostly, failure to distinguish them from "normal" CHECK constraints as well as the performance implication of having to test the CHECK expression) led us to reconsider this choice, so now the current implementation uses contype='n' again. In 2016 Vitaly Burovoy also worked on this feature[1] but found no consensus for his proposed approach, which was claimed to be closer to the letter of the standard, requiring additional pg_attribute columns to track the OID of the NOT NULL constraint for that column.
[1] https://postgr.es/m/CAKOSWNkN6HSyatuys8xZxzRCR-KL1OkHS5-b9qd9bf1Rad3PLA@mail.gmail.com Author: Álvaro Herrera <[email protected]> Author: Bernd Helmle <[email protected]> Reviewed-by: Justin Pryzby <[email protected]> Reviewed-by: Peter Eisentraut <[email protected]> Discussion: https://postgr.es/m/CACA0E642A0267EDA387AF2B%40%5B172.26.14.62%5D Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/CAKOSWNkN6HSyatuys8xZxzRCR-KL1OkHS5-b9qd9bf1Rad3PLA@mail.gmail.com Discussion: https://postgr.es/m/[email protected]
2023-04-07Doc: improve descriptions of max_[pred_]locks_per_transaction GUCs.Tom Lane
The old wording described these as being multiplied by max_connections plus max_prepared_transactions, which hasn't been exactly right for some time thanks to the addition of various auxiliary processes. Moreover, exactness here is a bit pointless given that the lock tables can expand into the initially-unallocated "slop" space in shared memory. Rather than trying to track exactly what the code is doing, let's just use the term "server processes". Likewise adjust these GUCs' description strings in guc_tables.c. Wang Wei, reviewed by Nathan Bossart and myself Discussion: https://postgr.es/m/OS3PR01MB6275BDD09C9B875C65FCC5AB9EA39@OS3PR01MB6275.jpnprd01.prod.outlook.com
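As a rough illustration of the sizing rule being described, the sketch below multiplies the per-transaction setting by a count of server processes plus prepared transactions. The function name and the example numbers (100 connections, 10 auxiliary processes) are hypothetical, and, as the message notes, the real lock tables can grow into spare shared memory, so this is an approximation rather than a hard limit.

```python
def approx_lock_table_slots(
    max_locks_per_transaction: int,
    server_processes: int,
    max_prepared_transactions: int,
) -> int:
    """Approximate number of lockable objects the shared lock table is sized for.

    server_processes is deliberately left as an input: it covers client
    backends as well as auxiliary processes, so it is not simply
    max_connections.
    """
    return max_locks_per_transaction * (server_processes + max_prepared_transactions)


# Old wording: only max_connections (100 here) is counted.
old_estimate = approx_lock_table_slots(64, 100, 0)

# Counting 10 hypothetical auxiliary processes as well gives a larger figure.
new_estimate = approx_lock_table_slots(64, 100 + 10, 0)
```

The gap between the two estimates is why the documentation now says "server processes" instead of naming max_connections.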
2023-04-07Add array_sample() and array_shuffle() functions.Tom Lane
These are useful in Monte Carlo applications. Martin Kalcher, reviewed/adjusted by Daniel Gustafsson and myself Discussion: https://postgr.es/m/[email protected]
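The intended behavior of the two functions can be approximated in Python as below. This is a sketch of the semantics only (a random permutation, and a random selection of n elements without replacement), built on the standard `random` module rather than on PostgreSQL's implementation. The helper names mirror the SQL function names purely for readability, and the explicit `rng` parameter is an addition for reproducibility.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def array_shuffle(items: Sequence[T], rng: random.Random) -> List[T]:
    """Return a new list with the same elements in random order."""
    shuffled = list(items)
    rng.shuffle(shuffled)  # in-place shuffle of the copy
    return shuffled


def array_sample(items: Sequence[T], n: int, rng: random.Random) -> List[T]:
    """Return n elements chosen at random without replacement."""
    if n < 0 or n > len(items):
        raise ValueError("sample size must be between 0 and the array length")
    return rng.sample(list(items), n)


rng = random.Random(42)  # seeded so that a Monte Carlo run can be repeated
deck = list(range(1, 11))
permutation = array_shuffle(deck, rng)
draw = array_sample(deck, 3, rng)
```

A Monte Carlo loop would call these repeatedly, for example to resample a data set or to simulate random orderings.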