summaryrefslogtreecommitdiff
path: root/src/backend/utils/adt/pg_locale.c
AgeCommit message (Collapse)Author
7 daysControl ctype behavior internally with a method table.Jeff Davis
Previously, pattern matching and case mapping behavior branched based on the provider. Refactor to use a method table, which is less error-prone. This is also a step toward multiple provider versions, which we may want to support in the future. Reviewed-by: Andreas Karlsson <[email protected]> Reviewed-by: Peter Eisentraut <[email protected]> Discussion: https://postgr.es/m/2830211e1b6e6a2e26d845780b03e125281ea17b.camel%40j-davis.com
9 daysRemove unused #include's in src/backend/utils/adt/*Peter Eisentraut
Author: Aleksander Alekseev <[email protected]> Reviewed-by: Tom Lane <[email protected]> Discussion: https://postgr.es/m/CAJ7c6TOowVbR-0NEvvDm6a_mag18krR0XJ2FKrc9DHXj7hFRtQ%40mail.gmail.com
2025-05-23Revert function to get memory context stats for processesDaniel Gustafsson
Due to concerns raised about the approach, and memory leaks found in sensitive contexts the functionality is reverted. This reverts commits 45e7e8ca9, f8c115a6c, d2a1ed172, 55ef7abf8 and 042a66291 for v18 with an intent to revisit this patch for v19. Discussion: https://postgr.es/m/[email protected]
2025-04-20Fix a few duplicate words in commentsDavid Rowley
These are all new to v18 Author: David Rowley <[email protected]> Discussion: https://postgr.es/m/CAApHDvrMcr8XD107H3NV=WHgyBcu=sx5+7=WArr-n_cWUqdFXQ@mail.gmail.com
2025-04-19Fix typos and grammar in the codeMichael Paquier
The large majority of these have been introduced by recent commits done in the v18 development cycle. Author: Alexander Lakhin <[email protected]> Discussion: https://postgr.es/m/[email protected]
2025-04-17Assert lack of hazardous buffer locks before possible catalog read.Noah Misch
Commit 0bada39c83a150079567a6e97b1a25a198f30ea3 fixed a bug of this kind, which existed in all branches for six days before detection. While the probability of reaching the trouble was low, the disruption was extreme. No new backends could start, and service restoration needed an immediate shutdown. Hence, add this to catch the next bug like it. The new check in RelationIdGetRelation() suffices to make autovacuum detect the bug in commit 243e9b40f1b2dd09d6e5bf91ebf6e822a2cd3704 that led to commit 0bada39. This also checks in a number of similar places. It replaces each Assert(IsTransactionState()) that pertained to a conditional catalog read. No back-patch for now, but a back-patch of commit 243e9b4 should back-patch this, too. A back-patch could omit the src/test/regress changes, since back branches won't gain new index columns. Reported-by: Alexander Lakhin <[email protected]> Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/[email protected]
2025-03-28Use thread-safe strftime_l() instead of strftime().Peter Eisentraut
This removes some setlocale() calls and a lot of commentary about how dangerous that is. strftime_l() is from POSIX 2008, and on Windows we use _wcsftime_l(). While here, adjust error message for strftime_l() failure: it does not in practice set errno (even though POSIX says it could), so no %m. Author: Thomas Munro <[email protected]> Reviewed-by: Heikki Linnakangas <[email protected]> Reviewed-by: Peter Eisentraut <[email protected]> Discussion: https://postgr.es/m/CA%2BhUKGJqVe0%2BPv9dvC9dSums_PXxGo9SWcxYAMBguWJUGbWz-A%40mail.gmail.com
2025-03-27Provide thread-safe pg_localeconv_r().Peter Eisentraut
This involves four different implementation strategies: 1. For Windows, we now require _configthreadlocale() to be available and work (commit f1da075d9a0), and the documentation says that the object returned by localeconv() is in thread-local memory. 2. For glibc, we translate to nl_langinfo_l() calls, because it offers the same information that way as an extension, and that API is thread-safe. 3. For macOS/*BSD, use localeconv_l(), which is thread-safe. 4. For everything else, use uselocale() to set the locale for the thread, and use a big ugly lock to defend against the returned object being concurrently clobbered. In practice this currently means only Solaris. The new call is used in pg_locale.c, replacing calls to setlocale() and localeconv(). Author: Thomas Munro <[email protected]> Reviewed-by: Heikki Linnakangas <[email protected]> Reviewed-by: Peter Eisentraut <[email protected]> Discussion: https://postgr.es/m/CA%2BhUKGJqVe0%2BPv9dvC9dSums_PXxGo9SWcxYAMBguWJUGbWz-A%40mail.gmail.com
2025-01-24Add SQL function CASEFOLD().Jeff Davis
Useful for caseless matching. Similar to LOWER(), but avoids edge-case problems with using LOWER() for caseless matching. For collations that support it, CASEFOLD() handles characters with more than two case variations or multi-character case variations. Some characters may fold to uppercase. The results of case folding are also more stable across Unicode versions than LOWER() or UPPER(). Discussion: https://postgr.es/m/a1886ddfcd8f60cb3e905c93009b646b4cfb74c5.camel%40j-davis.com Reviewed-by: Ian Lawrence Barwick
2025-01-17Support PG_UNICODE_FAST locale in the builtin collation provider.Jeff Davis
The PG_UNICODE_FAST locale uses code point sort order (fast, memcmp-based) combined with Unicode character semantics. The character semantics are based on Unicode full case mapping. Full case mapping can map a single codepoint to multiple codepoints, such as "ß" uppercasing to "SS". Additionally, it handles context-sensitive mappings like the "final sigma", and it uses titlecase mappings such as "Dž" when titlecasing (rather than plain uppercase mappings). Importantly, the uppercasing of "ß" as "SS" is specifically mentioned by the SQL standard. In Postgres, UCS_BASIC uses plain ASCII semantics for case mapping and pattern matching, so if we changed it to use the PG_UNICODE_FAST locale, it would offer better compliance with the standard. For now, though, do not change the behavior of UCS_BASIC. Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/[email protected] Reviewed-by: Peter Eisentraut, Daniel Verite
2025-01-08Control collation behavior with a method table.Jeff Davis
Previously, behavior branched based on the provider. A method table is less error-prone and more flexible. The ctype behavior will be addressed in an upcoming commit. Reviewed-by: Andreas Karlsson Discussion: https://postgr.es/m/2830211e1b6e6a2e26d845780b03e125281ea17b.camel%40j-davis.com
2025-01-08Move code for collation version into provider-specific files.Jeff Davis
Author: Andreas Karlsson Discussion: https://postgr.es/m/4548a168-62cd-457b-8d06-9ba7b985c477%40proxel.se
2025-01-01Update copyright for 2025Bruce Momjian
Backpatch-through: 13
2024-12-16Refactor string case conversion into provider-specific files.Jeff Davis
Create API entry points pg_strlower(), etc., that work with any provider and give the caller control over the destination buffer. Then, move provider-specific logic into pg_locale_builtin.c, pg_locale_icu.c, and pg_locale_libc.c as appropriate. Discussion: https://postgr.es/m/[email protected]
2024-12-03Perform provider-specific initialization in new functions.Jeff Davis
Reviewed-by: Andreas Karlsson Discussion: https://postgr.es/m/[email protected]
2024-12-03Fix unintentional behavior change in commit e9931bfb75.Jeff Davis
Prior to that commit, there was special case to use ASCII case mapping behavior for the libc provider with a single-byte encoding when that's the default collation. Commit e9931bfb75 mistakenly eliminated that special case; this commit restores it. Discussion: https://postgr.es/m/[email protected]
2024-11-27Require ucrt if using MinGW.Thomas Munro
Historically we tolerated the absence of various C runtime library features for the benefit of the MinGW tool chain, because it used ancient msvcrt.dll for a long period of time. It now uses ucrt by default (like Windows 10+, Visual Studio 2015+), and that's the only configuration we're testing. In practice, we effectively required ucrt already in PostgreSQL 17, when commit 8d9a9f03 required _create_locale etc, first available in msvcr120.dll (Visual Studio 2013, the last of the pre-ucrt series of runtimes), and for MinGW users that practically meant ucrt because it was difficult or impossible to use msvcr120.dll. That may even not have been the first such case, but old MinGW configurations had already dropped off our testing radar so we weren't paying much attention. This commit formalizes the requirement. It also removes a couple of obsolete comments that discussed msvcrt.dll limitations, and some tests of !defined(_MSC_VER) to imply msvcrt.dll. There are many more anachronisms, but it'll take some time to figure out how to remove them all. APIs affected relate to locales, UTF-8, threads, large files and more. Thanks to Peter Eisentraut for the documentation change. It's not really necessary to talk about ucrt explicitly in such a short section, since it's the default for MinGW-w64 and MSYS2. It's enough to prune references and broken links to much older tools. Reviewed-by: Peter Eisentraut <[email protected]> Discussion: https://postgr.es/m/d9e7731c-ca1b-477c-9298-fa51e135574a%40eisentraut.org
2024-11-26Clean up newlines following left parenthesesÁlvaro Herrera
Most came in during the 17 cycle, so backpatch there. Some (particularly reorderbuffer.h) are very old, but backpatching doesn't seem useful. Like commits c9d297751959, c4f113e8fef9.
2024-10-25Refactor the code to create a pg_locale_t into new function.Jeff Davis
Reviewed-by: Andreas Karlsson Discussion: https://postgr.es/m/[email protected]
2024-10-14Move libc-specific code from pg_locale.c into pg_locale_libc.c.Jeff Davis
Move implementation of pg_locale_t code for libc collations into pg_locale_libc.c. Other locale-related code, such as pg_perm_setlocale(), remains in pg_locale.c for now. Discussion: https://postgr.es/m/flat/[email protected]
2024-10-14Move ICU-specific code from pg_locale.c into pg_locale_icu.c.Jeff Davis
Discussion: https://postgr.es/m/flat/[email protected]
2024-10-05Reject non-ASCII locale names.Thomas Munro
Commit bf03cfd1 started scanning all available BCP 47 locale names on Windows. This caused an abort/crash in the Windows runtime library if the default locale name contained non-ASCII characters, because of our use of the setlocale() save/restore pattern with "char" strings. After switching to another locale with a different encoding, the saved name could no longer be understood, and setlocale() would abort. "Turkish_Türkiye.1254" is the example from recent reports, but there are other examples of countries and languages with non-ASCII characters in their names, and they appear in Windows' (old style) locale names. To defend against this: 1. In initdb, reject non-ASCII locale names given explicity on the command line, or returned by the operating system environment with setlocale(..., ""), or "canonicalized" by the operating system when we set it. 2. In initdb only, perform the save-and-restore with Windows' non-standard wchar_t variant of setlocale(), so that it is not subject to round trip failures stemming from char string encoding confusion. 3. In the backend, we don't have to worry about the save-and-restore problem because we have already vetted the defaults, so we just have to make sure that CREATE DATABASE also rejects non-ASCII names in any new databases. SET lc_XXX doesn't suffer from the problem, but the ban applies to it too because it uses check_locale(). CREATE COLLATION doesn't suffer from the problem either, but it doesn't use check_locale() so it is not included in the new ban for now, to minimize the change. Anyone who encounters the new error message should either create a new duplicated locale with an ASCII-only name using Windows Locale Builder, or consider using BCP 47 names like "tr-TR". Users already couldn't initialize a cluster with "Turkish_Türkiye.1254" on PostgreSQL 16+, but the new failure mode is an error message that explains why, instead of a crash. Back-patch to 16, where bf03cfd1 landed. Older versions are affected in theory too, but only 16 and later are causing crash reports. Reviewed-by: Andrew Dunstan <[email protected]> (the idea, not the patch) Reported-by: Haifang Wang (Centific Technologies Inc) <[email protected]> Discussion: https://postgr.es/m/PH8PR21MB3902F334A3174C54058F792CE5182%40PH8PR21MB3902.namprd21.prod.outlook.com
2024-09-24Allow length=-1 for NUL-terminated input to pg_strncoll(), etc.Jeff Davis
Like ICU, allow a length of -1 to be specified for NUL-terminated arguments to pg_strncoll(), pg_strnxfrm(), and pg_strnxfrm_prefix(). Simplifies the code and comments. Discussion: https://postgr.es/m/[email protected]
2024-09-24Tighten up make_libc_collator() and make_icu_collator().Jeff Davis
Ensure that error paths within these functions do not leak a collator, and return the result rather than using an out parameter. (Error paths in the caller may still result in a leaked collator, which will be addressed separately.) In make_libc_collator(), if the first newlocale() succeeds and the second one fails, close the first locale_t object. The function make_icu_collator() doesn't have any external callers, so change it to be static. Discussion: https://postgr.es/m/[email protected]
2024-09-12Simplify checks for deterministic collations.Jeff Davis
Remove redundant checks for locale->collate_is_c now that we always have a valid pg_locale_t. Also, remove pg_locale_deterministic() wrapper, which is no longer useful after commit e9931bfb75. Just check the field directly, consistent with other fields in pg_locale_t. Author: Andreas Karlsson Discussion: https://postgr.es/m/[email protected]
2024-09-06Remove lc_ctype_is_c().Jeff Davis
Instead always fetch the locale and look at the ctype_is_c field. hba.c relies on regexes working for the C locale without needing catalog access, which worked before due to a special case for C_COLLATION_OID in lc_ctype_is_c(). Move the special case to pg_set_regex_collation() now that lc_ctype_is_c() is gone. Author: Andreas Karlsson Discussion: https://postgr.es/m/[email protected]
2024-09-04Remove lc_collate_is_c().Jeff Davis
Instead just look up the collation and check collate_is_c field. Author: Andreas Karlsson Discussion: https://postgr.es/m/[email protected]
2024-09-03Remember last collation to speed up collation cache.Jeff Davis
This optimization is to avoid a performance regression in an upcoming patch that will remove lc_collate_is_c(). Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/[email protected]
2024-08-23thread-safety: gmtime_r(), localtime_r()Peter Eisentraut
Use gmtime_r() and localtime_r() instead of gmtime() and localtime(), for thread-safety. There are a few affected calls in libpq and ecpg's libpgtypes, which are probably effectively bugs, because those libraries already claim to be thread-safe. There is one affected call in the backend. Most of the backend otherwise uses the custom functions pg_gmtime() and pg_localtime(), which are implemented differently. While we're here, change the call in the backend to gmtime*() instead of localtime*(), since for that use time zone behavior is irrelevant, and this side-steps any questions about when time zones are initialized by localtime_r() vs localtime(). Portability: gmtime_r() and localtime_r() are in POSIX but are not available on Windows. Windows has functions gmtime_s() and localtime_s() that can fulfill the same purpose, so we add some small wrappers around them. (Note that these *_s() functions are also different from the *_s() functions in the bounds-checking extension of C11. We are not using those here.) On MinGW, you can get the POSIX-style *_r() functions by defining _POSIX_C_SOURCE appropriately before including <time.h>. This leads to a conflict at least in plpython because apparently _POSIX_C_SOURCE gets defined in some header there, and then our replacement definitions conflict with the system definitions. To avoid that sort of thing, we now always define _POSIX_C_SOURCE on MinGW and use the POSIX-style functions here. Reviewed-by: Stepan Neretin <[email protected]> Reviewed-by: Heikki Linnakangas <[email protected]> Reviewed-by: Thomas Munro <[email protected]> Discussion: https://www.postgresql.org/message-id/flat/[email protected]
2024-08-19Fix harmless LC_COLLATE[_MASK] confusion.Thomas Munro
Commit ca051d8b101 called newlocale(LC_COLLATE, ...) instead of newlocale(LC_COLLATE_MASK, ...), in code reached only on FreeBSD. They have the same value on that OS, explaining why it worked. Fix. Back-patch to 14, where ca051d8b101 landed.
2024-08-06selfuncs.c: use pg_strxfrm() instead of strxfrm().Jeff Davis
pg_strxfrm() takes a pg_locale_t, so it works properly with all providers. This improves estimates for ICU when performing linear interpolation within a histogram bin. Previously, convert_string_datum() always used strxfrm() and relied on setlocale(). That did not produce good estimates for non-default or non-libc collations. Discussion: https://postgr.es/m/[email protected]
2024-08-06Remove support for null pg_locale_t most places.Jeff Davis
Previously, passing NULL for pg_locale_t meant "use the libc provider and the server environment". Now that the database collation is represented as a proper pg_locale_t (not dependent on setlocale()), remove special cases for NULL. Leave wchar2char() and char2wchar() unchanged for now, because the callers don't always have a libc-based pg_locale_t available. Discussion: https://postgr.es/m/[email protected] Reviewed-by: Peter Eisentraut, Andreas Karlsson
2024-07-30Relax check for return value from second call of pg_strnxfrm().Jeff Davis
strxfrm() is not guaranteed to return the exact number of bytes needed to store the result; it may return a higher value. Discussion: https://postgr.es/m/[email protected] Reviewed-by: Heikki Linnakangas Backpatch-through: 16
2024-07-30Make collation not depend on setlocale().Jeff Davis
Now that the result of pg_newlocale_from_collation() is always non-NULL, then we can move the collate_is_c and ctype_is_c flags into pg_locale_t. That simplifies the logic in lc_collate_is_c() and lc_ctype_is_c(), removing the dependence on setlocale(). This commit also eliminates the multi-stage initialization of the collation cache. As long as we have catalog access, then it's now safe to call pg_newlocale_from_collation() without checking lc_collate_is_c() first. Discussion: https://postgr.es/m/[email protected] Reviewed-by: Peter Eisentraut, Andreas Karlsson
2024-07-29Do not return NULL from pg_newlocale_from_collation().Jeff Davis
Previously, pg_newlocale_from_collation() returned NULL as a special case for the DEFAULT_COLLATION_OID if the provider was libc. In that case the behavior would depend on the last call to setlocale(). Now, consistent with the other providers, it will return a pointer to default_locale, which is not dependent on setlocale(). Note: for the C and POSIX locales, the locale_t structure within the pg_locale_t will still be zero, because those locales are implemented with internal logic and do not use libc at all. lc_collate_is_c() and lc_ctype_is_c() still depend on setlocale() to determine the current locale, which will be removed in a subsequent commit. Discussion: https://postgr.es/m/[email protected] Reviewed-by: Peter Eisentraut, Andreas Karlsson
2024-07-28Fix whitespace in commit 005c6b833f.Jeff Davis
2024-07-28Refactor: make default_locale internal to pg_locale.c.Jeff Davis
Discussion: https://postgr.es/m/[email protected] Reviewed-by: Peter Eisentraut, Andreas Karlsson
2024-07-28Change collation cache to use simplehash.h.Jeff Davis
Speeds up text comparison expressions when using a collation other than the database default collation. Does not affect larger operations such as ORDER BY, because the lookup is only done once. Discussion: https://postgr.es/m/[email protected] Reviewed-by: John Naylor, Andreas Karlsson
2024-07-22Replace some strtok() with strsep()Peter Eisentraut
strtok() considers adjacent delimiters to be one delimiter, which is arguably the wrong behavior in some cases. Replace with strsep(), which has the right behavior: Adjacent delimiters create an empty token. Affected by this are parsing of: - Stored SCRAM secrets ("SCRAM-SHA-256$<iterations>:<salt>$<storedkey>:<serverkey>") - ICU collation attributes ("und@colStrength=primary;colCaseLevel=yes") for ICU older than version 54 - PG_COLORS environment variable ("error=01;31:warning=01;35:note=01;36:locus=01") - pg_regress command-line options with comma-separated list arguments (--dbname, --create-role) (currently only used pg_regress_ecpg) Reviewed-by: Kyotaro Horiguchi <[email protected]> Reviewed-by: David Steele <[email protected]> Discussion: https://www.postgresql.org/message-id/flat/[email protected]
2024-07-04Assign error codes where missing for user-facing failuresMichael Paquier
All the errors triggered in the code paths patched here would cause the backend to issue an internal_error errcode, which is a state that should be used only for "can't happen" situations. However, these code paths are reachable by the regression tests, and could be seen by users in valid cases. Some regression tests expect internal errcodes as they manipulate the backend state to cause corruption (like checksums), or use elog() because it is more convenient (like injection points), these have no need to change. This reduces the number of internal failures triggered in a check-world by more than half, while providing correct errcodes for these valid cases. Reviewed-by: Robert Haas Discussion: https://postgr.es/m/[email protected]
2024-05-17Revise GUC names quoting in messages againPeter Eisentraut
After further review, we want to move in the direction of always quoting GUC names in error messages, rather than the previous (PG16) wildly mixed practice or the intermittent (mid-PG17) idea of doing this depending on how possibly confusing the GUC name is. This commit applies appropriate quotes to (almost?) all mentions of GUC names in error messages. It partially supersedes a243569bf65 and 8d9978a7176, which had moved things a bit in the opposite direction but which then were abandoned in a partial state. Author: Peter Smith <[email protected]> Discussion: https://www.postgresql.org/message-id/flat/CAHut%2BPv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w%40mail.gmail.com
2024-05-07Remove obsolete comment.Jeff Davis
Per suggestion from Peter, the comment was not helpful, so remove it rather than fixing it. Reported-by: Peter Eisentraut Discussion: https://postgr.es/m/[email protected]
2024-03-29Use version for builtin collations.Jeff Davis
Given that the version field already exists, there's little reason not to use it. Suggestion from Peter Eisentraut. Discussion: https://postgr.es/m/[email protected] Reviewed-by: Peter Eisentraut
2024-03-19Support C.UTF-8 locale in the new builtin collation provider.Jeff Davis
The builtin C.UTF-8 locale has similar semantics to the libc locale of the same name. That is, code point sort order (fast, memcmp-based) combined with Unicode semantics for character operations such as pattern matching, regular expressions, and LOWER()/INITCAP()/UPPER(). The character semantics are based on Unicode simple case mappings. The builtin provider's C.UTF-8 offers several important advantages over libc: * faster sorting -- benefits from additional optimizations such as abbreviated keys and varstrfastcmp_c * faster case conversion, e.g. LOWER(), at least compared with some libc implementations * available on all platforms with identical semantics, and the semantics are stable, testable, and documentable within a given Postgres major version Being based on memcmp, the builtin C.UTF-8 locale does not offer natural language sort order. But it is an improvement for most use cases that might otherwise use libc's "C.UTF-8" locale, as well as many use cases that use libc's "C" locale. Discussion: https://postgr.es/m/ff4c2f2f9c8fc7ca27c1c24ae37ecaeaeaff6b53.camel%40j-davis.com Reviewed-by: Daniel Vérité, Peter Eisentraut, Jeremy Schneider
2024-03-18Fix another warning, introduced by 846311051e.Jeff Davis
Discussion: https://postgr.es/m/[email protected] Reported-by: Tom Lane
2024-03-18Address more review comments on commit 2d819a08a1.Jeff Davis
Based on comments from Peter Eisentraut. * Document CREATE DATABASE ... BUILTIN_LOCALE. * Determine required encoding based on locale name for CREATE COLLATION. Use -1 for "C" (requires catversion bump). * initdb output fixups. * Make ctype_is_c a constant true for now. * Fixups to ICU 010_create_database.pl test. Discussion: https://postgr.es/m/[email protected]
2024-03-18Fix unreachable code warning from commit 2d819a08a1.Jeff Davis
Found by Coverity. Discussion: https://postgr.es/m/[email protected] Reported-by: Tom Lane
2024-03-14Introduce "builtin" collation provider.Jeff Davis
New provider for collations, like "libc" or "icu", but without any external dependency. Initially, the only locale supported by the builtin provider is "C", which is identical to the libc provider's "C" locale. The libc provider's "C" locale has always been treated as a special case that uses an internal implementation, without using libc at all -- so the new builtin provider uses the same implementation. The builtin provider's locale is independent of the server environment variables LC_COLLATE and LC_CTYPE. Using the builtin provider, the database collation locale can be "C" while LC_COLLATE and LC_CTYPE are set to "en_US", which is impossible with the libc provider. By offering a new builtin provider, it clarifies that the semantics of a collation using this provider will never depend on libc, and makes it easier to document the behavior. Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/ff4c2f2f9c8fc7ca27c1c24ae37ecaeaeaff6b53.camel%40j-davis.com Reviewed-by: Daniel Vérité, Peter Eisentraut, Jeremy Schneider
2024-03-09Catalog changes preparing for builtin collation provider.Jeff Davis
Rename pg_collation.colliculocale to colllocale, and pg_database.daticulocale to datlocale. These names reflects that the fields will be useful for the upcoming builtin provider as well, not just for ICU. This is purely a rename; no changes to the meaning of the fields. Discussion: https://postgr.es/m/ff4c2f2f9c8fc7ca27c1c24ae37ecaeaeaff6b53.camel%40j-davis.com Reviewed-by: Peter Eisentraut
2024-03-04Remove unused #include's from backend .c filesPeter Eisentraut
as determined by include-what-you-use (IWYU) While IWYU also suggests to *add* a bunch of #include's (which is its main purpose), this patch does not do that. In some cases, a more specific #include replaces another less specific one. Some manual adjustments of the automatic result: - IWYU currently doesn't know about includes that provide global variable declarations (like -Wmissing-variable-declarations), so those includes are being kept manually. - All includes for port(ability) headers are being kept for now, to play it safe. - No changes of catalog/pg_foo.h to catalog/pg_foo_d.h, to keep the patch from exploding in size. Note that this patch touches just *.c files, so nothing declared in header files changes in hidden ways. As a small example, in src/backend/access/transam/rmgr.c, some IWYU pragma annotations are added to handle a special case there. Discussion: https://www.postgresql.org/message-id/flat/af837490-6b2f-46df-ba05-37ea6a6653fc%40eisentraut.org