Update PG17 to master prior to 1.7.0 #2325

jrgemignani · 2026-01-30T16:59:34Z

Cherry picked the following PRs into PG17 to bring it up to 1.7.0,
in preparation for switching to version 1.7.0.

9ee143a Fix upgrade script for 1.6.0 to 1.7.0 (#2320)
7bf52f6 Add RLS support and fix permission checks (#2309)
7a1ca20 Replace libcsv with pg COPY for csv loading (#2310)
c9b7417 Fix Issue 1884: Ambiguous column reference (#2306)
76531bf Upgrade Jest to v29 for node: protocol compatibility (#2307)
ea0915f Optimize vertex/edge field access with direct array indexing (#2302)
79bf3ad feat: Add 32-bit platform support for graphid type (#2286)
60eeda1 Fix and improve index.sql addendum (#2301)
2108171 Fix and improve index.sql regression test coverage (#2300)
cd325a3 Fix Issue 2289: handle empty list in IN expression (#2294)
079ae96 Revise README for Python driver updates (#2298)
b40eaaf Makefile: fix race condition on cypher_gram_def.h (#2273)
d382d53 Restrict age_load commands (#2274)
7740c00 Migrate python driver configuration to pyproject.toml (#2272)
303fcf6 Convert string to raw string to remove invalid escape sequence warning (#2267)
e0d12c1 Update grammar file for maintainability (#2270)
18e268b Fix ORDER BY alias resolution with AS in Cypher queries (#2269)
bbc9d44 Fix possible memory and file descriptors leaks (#2258)
6b304fa Adjust 'could not find rte for' ERROR message (#2266)
a6a0836 Fix Issue 2256: segmentation fault when calling coalesce function (#2259)
f61af9e Add index on id columns (#2117)
314f097 Fix issue 2245 - Creating more than 41 vlabels causes crash in drop_graph (#2248)
3404368 Fix issue 2243 - Regression in string concatenation (#2244)
8cade62 Add fast functions for checking edge uniqueness (#2227)
ea9d3ec Bump gopkg.in/yaml.v3 from 3.0.0 to 3.0.1 in /drivers/golang (#2212)
f6e79a8 Fix issue with CALL/YIELD for user defined and qualified functions. (#2217)

…pache#2217)

Fixed 2 issues with CALL/YIELD:

1. If a user defined function was in search_path, the transform_FuncCall logic would only find it if it were part of an extension.
2. If a function were qualified, the transform_cypher_call_subquery logic would mistakenly extract the schema name instead of the function name.

NOTE: transform_FuncCall should be reviewed for possible refactor.

Added regression tests.

modified: src/backend/parser/cypher_clause.c
modified: src/backend/parser/cypher_expr.c
modified: regress/expected/cypher_call.out
modified: regress/sql/cypher_call.sql
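
The second issue can be pictured with a small Python model. It is only a sketch: it assumes a qualified name arrives as an ordered list of parts such as `["my_schema", "my_func"]`, and `split_qualified_name` is a hypothetical helper, not a function in AGE.

```python
def split_qualified_name(name_parts):
    """Return (schema, function) for a possibly schema-qualified name.

    name_parts is a list such as ["my_func"] or ["my_schema", "my_func"].
    """
    if not name_parts:
        raise ValueError("empty function name")
    # The function name is always the last part; taking the first part
    # would return the schema whenever the name is qualified.
    function = name_parts[-1]
    schema = name_parts[-2] if len(name_parts) > 1 else None
    return schema, function


unqualified = split_qualified_name(["my_func"])
qualified = split_qualified_name(["my_schema", "my_func"])
```

In this model, reading the first element works for unqualified names and silently returns the schema for qualified ones, which is the kind of mistake described above.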

…2212) Bumps gopkg.in/yaml.v3 from 3.0.0 to 3.0.1. --- updated-dependencies: - dependency-name: gopkg.in/yaml.v3 dependency-version: 3.0.1 dependency-type: indirect ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Added fast functions for checking edge uniqueness. This will help improve performance for MATCH queries with paths longer than 3 but less than 11. The normal edge uniqueness function will deal with any path 11 and over.

modified: age--1.6.0--y.y.y.sql
modified: sql/agtype_graphid.sql
modified: src/backend/parser/cypher_clause.c
modified: src/backend/utils/adt/age_vle.c

Fixed issue 2243 - Regression in string concatenation using the + operator.

The issue was in versions 1.5.0 and 1.6.0, at least. It was due to using Int8GetDatum instead of Int64GetDatum for the agtype integer field in the following functions -

- get_numeric_datum_from_agtype_value
- get_string_from_agtype_value

This impacted more than what the original issue covered, but those additional cases were resolved too.

Added regression tests.

modified: regress/expected/agtype.out
modified: regress/sql/agtype.sql
modified: src/backend/utils/adt/agtype_ops.c
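
To show why the width of the conversion matters, here is a small Python model of narrowing an integer to a fixed-width signed type. It assumes that the 8-bit conversion narrows its argument to a signed 8-bit value, which depends on the PostgreSQL version in use; `wrap_signed` is a hypothetical helper, not a PostgreSQL or AGE function.

```python
def wrap_signed(value: int, bits: int) -> int:
    """Reinterpret value as a two's-complement signed integer of the given width."""
    mask = (1 << bits) - 1
    value &= mask                      # keep only the low `bits` bits
    if value >= 1 << (bits - 1):       # sign bit set: map into the negative range
        value -= 1 << bits
    return value


agtype_integer = 300
as_int8 = wrap_signed(agtype_integer, 8)    # 8-bit narrowing drops the high bits
as_int64 = wrap_signed(agtype_integer, 64)  # 64-bit conversion keeps the value
```

Under that assumption, any integer outside the range -128..127 would come out changed, which is consistent with the note that the bug affected more cases than the original report covered.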

…raph (apache#2248)

Fixed issue 2245 - Creating more than 41 vlabels causes drop_graph to fail with "label (relation) cache corrupted" and to crash on the following command.

This was due to corruption of the label_relation_cache during the HASH_DELETE process. As the issue was with a cache flush routine, it was necessary to fix them all. Here is the list of the flush functions that were fixed -

- static void flush_graph_name_cache(void)
- static void flush_graph_namespace_cache(void)
- static void flush_label_name_graph_cache(void)
- static void flush_label_graph_oid_cache(void)
- static void flush_label_relation_cache(void)
- static void flush_label_seq_name_graph_cache(void)

Added regression tests.

modified: regress/expected/catalog.out
modified: regress/sql/catalog.sql
modified: src/backend/utils/cache/ag_cache.c

- Whenever a label is created, indices on its id columns are created by default. For a vertex, a unique index is created on the id column, which also serves as a unique constraint. For an edge, a non-unique index is created on the start_id and end_id columns.
- This change is expected to improve the performance of queries that involve joins. In some performance tests, query performance improved a lot.
- The loader was updated to insert tuples into the indices as well. This slows the loader down a bit, but it was necessary.
- A bug related to command ids in the cypher_delete executor was also fixed.

…ache#2259)

Fixed issue 2256: A segmentation fault occurs when calling the coalesce function in PostgreSQL version 17. This likely predates 17 and includes other similar types of "functions".

See issues 1124 (PR 1125) and 1303 (PR 1317) for more details.

This issue is due to coalesce() being processed differently from other functions. Additionally, greatest() was found to exhibit the same behavior. They were added to the list of types to ignore during the cypher analyze phase.

A few others were added: CaseExpr, XmlExpr, ArrayExpr, & RowExpr. Although, I wasn't able to find cases where these caused crashes.

Added regression tests.

modified: regress/expected/cypher.out
modified: regress/sql/cypher.sql
modified: src/backend/parser/cypher_analyze.c

Adjusted the following type of error message. It was mentioned in issue 2263 as being incorrect, which it isn't. However, it did need some clarification added -

    ERROR: could not find rte for <column name>

Added a HINT for additional clarity -

    HINT: variable <column name> does not exist within scope of usage

For example:

    CREATE p0=(n0), (n1{k:EXISTS{WITH p0}}) RETURN 1
    ERROR: could not find rte for p0
    LINE 3: CREATE p0=(n0), (n1{k:EXISTS{WITH p0}})
    ^
    HINT: variable p0 does not exist within scope of usage

Additionally, added pstate->p_expr_kind == EXPR_KIND_INSERT_TARGET to transform_cypher_clause_as_subquery.

Updated existing regression tests. Added regression tests from issue.

modified: regress/expected/cypher_call.out
modified: regress/expected/cypher_subquery.out
modified: regress/expected/cypher_union.out
modified: regress/expected/cypher_with.out
modified: regress/expected/expr.out
modified: regress/expected/list_comprehension.out
modified: regress/expected/scan.out
modified: src/backend/parser/cypher_clause.c
modified: src/backend/parser/cypher_expr.c

- Used postgres memory allocation functions instead of standard ones.
- Wrapped main loop of csv loader in PG_TRY block for better error handling.

NOTE: This PR was partially created with AI tools and reviewed by a human.

ORDER BY clauses failed when referencing column aliases from RETURN:

    MATCH (p:Person) RETURN p.age AS age ORDER BY age DESC
    ERROR: could not find rte for age

Added SQL-99 compliant alias matching to find_target_list_entry() that checks if ORDER BY identifier matches a target list alias before attempting expression transformation. This enables standard SQL behavior for sorting by aliased columns with DESC/DESCENDING/ASC/ASCENDING.

Updated regression tests. Added regression tests.

modified: regress/expected/cypher_match.out
modified: regress/expected/expr.out
modified: regress/sql/expr.sql
modified: src/backend/parser/cypher_clause.c
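
The alias-first lookup can be sketched in Python as follows. This is an illustrative model only: the target list is represented as plain dictionaries, and `match_order_by_alias` is a hypothetical name, not the signature of find_target_list_entry().

```python
from typing import Optional


def match_order_by_alias(identifier: str, target_list: list) -> Optional[dict]:
    """Return the RETURN item whose alias equals the ORDER BY identifier, if any."""
    for entry in target_list:
        if entry.get("alias") == identifier:
            return entry
    # No alias matched: the caller falls back to transforming the
    # identifier as an ordinary expression.
    return None


# Models: RETURN p.age AS age ORDER BY age DESC
target_list = [{"expr": "p.age", "alias": "age"}]
matched = match_order_by_alias("age", target_list)
unmatched = match_order_by_alias("height", target_list)
```

Checking aliases before expression transformation is what lets `ORDER BY age` resolve to `p.age` instead of being looked up as a variable that does not exist.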

Consolidated duplicate code, added helper functions, and reviewed the grammar file for issues.

NOTE: I used an AI tool to review and cleanup the grammar file. I have reviewed all of the work it did.

Improvements:

1. Added KEYWORD_STRDUP macro to eliminate hardcoded string lengths
2. Consolidated EXPLAIN statement handling into make_explain_stmt helper
3. Extracted WITH clause validation into validate_return_item_aliases helper
4. Created make_default_return_node helper for subquery return-less logic

Benefits:

- Reduced code duplication by ~150 lines
- Improved maintainability with helper functions
- Eliminated manual string length calculations (error-prone)

All 29 existing regression tests pass

modified: src/backend/parser/cypher_gram.y

apache#2267) - Changed '\s' to r'\s'
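
For context, a minimal Python example of the difference; the pattern name and sample text are illustrative and not taken from the driver.

```python
import re

# r'\s' is a raw string: the backslash reaches the regex engine unchanged.
# A plain '\s' is an invalid string escape and triggers a warning in
# recent Python versions.
WHITESPACE = re.compile(r'\s')

parts = WHITESPACE.split("a b\tc")
```

The raw string and the explicitly escaped form `'\\s'` are the same two-character string, so the change removes the warning without changing behavior.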

- Add pyproject.toml with package configuration
- Simplify setup.py to minimal backward-compatible wrapper.
- Updated CI workflow and .gitignore.
- Resolves warning about using setup.py directly.

This PR applies restrictions to the following age_load commands -

- load_labels_from_file()
- load_edges_from_file()

They are now tied to a specific root directory and are required to have a specific file extension to eliminate any attempts to force them to access any other files.

Nothing else has changed with the actual command formats or parameters, only that they work out of the /tmp/age directory and only access files with an extension of .csv.

Added regression tests and updated the location of the csv files for those regression tests.

modified: regress/expected/age_load.out
modified: regress/sql/age_load.sql
modified: src/backend/utils/load/age_load.c
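
A restriction of this kind can be sketched in Python as below. The source does not describe how age_load.c performs the check, so this is only one possible approach; `resolve_load_path` and the constants are hypothetical names, and the sketch normalizes the path textually without resolving symbolic links.

```python
import posixpath

LOAD_ROOT = "/tmp/age"        # root directory named in the description
ALLOWED_EXTENSION = ".csv"    # extension named in the description


def resolve_load_path(file_name: str) -> str:
    """Return the normalized path under LOAD_ROOT, or raise ValueError."""
    # Join and normalize so that "../" segments and absolute paths are
    # collapsed before the prefix check.
    candidate = posixpath.normpath(posixpath.join(LOAD_ROOT, file_name))
    if not candidate.startswith(LOAD_ROOT + "/"):
        raise ValueError("path escapes the load directory: " + file_name)
    if not candidate.endswith(ALLOWED_EXTENSION):
        raise ValueError("only .csv files may be loaded: " + file_name)
    return candidate


accepted = resolve_load_path("cities.csv")
```

Normalizing before checking the prefix is the important step: checking the raw string would let `../` sequences slip past a prefix comparison.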

The file cypher_gram.c generates cypher_gram_def.h, which is directly necessary for cypher_parser.o and cypher_keywords.o and their respective .bc files. But that direct dependency is not reflected in the Makefile, which only had the indirect dependency of .o on .c. So on high parallel builds, the .h may not have been generated by bison yet. Additionally, the .bc files should have the same dependencies as the .o files, but those are lacking. Here is an example output where the .bc file fails to build, as it was running concurrently with the bison instance that was about to finalize cypher_gram_def.h: In file included from src/backend/parser/cypher_parser.c:24: clang-17 -Wno-ignored-attributes -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -O2 -I.//src/include -I.//src/include/parser -I. -I./ -I/usr/pgsql-17/include/server -I/usr/pgsql-17/include/internal -D_GNU_SOURCE -I/usr/include -I/usr/include/libxml2 -flto=thin -emit-llvm -c -o src/backend/parser/cypher_parser.bc src/backend/parser/cypher_parser.c .//src/include/parser/cypher_gram.h:65:10: fatal error: 'parser/cypher_gram_def.h' file not found 65 | #include "parser/cypher_gram_def.h" | ^~~~~~~~~~~~~~~~~~~~~~~~~~ 1 error generated. make: *** [/usr/pgsql-17/lib/pgxs/src/makefiles/../../src/Makefile.global:1085: src/backend/parser/cypher_parser.bc] Error 1 make: *** Waiting for unfinished jobs.... gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wimplicit-fallthrough=3 -Wshadow=compatible-local -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-truncation -O2 -g -fmessage-length=0 -D_FORTIFY_SOURCE=2 -fstack-protector -funwind-tables -fasynchronous-unwind-tables -fPIC -fvisibility=hidden -I.//src/include -I.//src/include/parser -I. 
-I./ -I/usr/pgsql-17/include/server -I/usr/pgsql-17/include/internal -D_GNU_SOURCE -I/usr/include -I/usr/include/libxml2 -c -o src/backend/catalog/ag_label.o src/backend/catalog/ag_label.c /usr/bin/bison -Wno-deprecated --defines=src/include/parser/cypher_gram_def.h -o src/backend/parser/cypher_gram.c src/backend/parser/cypher_gram.y Previously, cypher_parser.o was missing the dependency, so it could start before cypher_gram_def.h was available: Considering target file 'src/backend/parser/cypher_parser.o'. File 'src/backend/parser/cypher_parser.o' does not exist. Considering target file 'src/backend/parser/cypher_parser.c'. File 'src/backend/parser/cypher_parser.c' was considered already. Considering target file 'src/backend/parser/cypher_gram.c'. File 'src/backend/parser/cypher_gram.c' was considered already. Finished prerequisites of target file 'src/backend/parser/cypher_parser.o'. Must remake target 'src/backend/parser/cypher_parser.o'. As well as cypher_parser.bc, missing the dependency on cypher_gram_def.h: Considering target file 'src/backend/parser/cypher_parser.bc'. File 'src/backend/parser/cypher_parser.bc' does not exist. Considering target file 'src/backend/parser/cypher_parser.c'. File 'src/backend/parser/cypher_parser.c' was considered already. Finished prerequisites of target file 'src/backend/parser/cypher_parser.bc'. Must remake target 'src/backend/parser/cypher_parser.bc'. Now cypher_parser.o correctly depends on cypher_gram_def.h: Considering target file 'src/backend/parser/cypher_parser.o'. File 'src/backend/parser/cypher_parser.o' does not exist. Considering target file 'src/backend/parser/cypher_parser.c'. File 'src/backend/parser/cypher_parser.c' was considered already. Considering target file 'src/backend/parser/cypher_gram.c'. File 'src/backend/parser/cypher_gram.c' was considered already. Considering target file 'src/include/parser/cypher_gram_def.h'. File 'src/include/parser/cypher_gram_def.h' was considered already. 
Finished prerequisites of target file 'src/backend/parser/cypher_parser.o'. Must remake target 'src/backend/parser/cypher_parser.o'. And cypher_parser.bc correctly depends on cypher_gram_def.h as well: Considering target file 'src/backend/parser/cypher_parser.bc'. File 'src/backend/parser/cypher_parser.bc' does not exist. Considering target file 'src/backend/parser/cypher_parser.c'. File 'src/backend/parser/cypher_parser.c' was considered already. Considering target file 'src/backend/parser/cypher_gram.c'. File 'src/backend/parser/cypher_gram.c' was considered already. Considering target file 'src/include/parser/cypher_gram_def.h'. File 'src/include/parser/cypher_gram_def.h' was considered already. Finished prerequisites of target file 'src/backend/parser/cypher_parser.bc'. Must remake target 'src/backend/parser/cypher_parser.bc'.

Updated README from psycopg2 to psycopg3 (psycopg)

NOTE: This PR was created with AI tools and a human.

When evaluating 'x IN []' with an empty list, the transform_AEXPR_IN function would return NULL because no expressions were processed. This caused a 'cache lookup failed for type 0' error downstream.

This fix adds an early check for the empty list case:

- 'x IN []' returns false (nothing can be in an empty list)

Additional NOTE: Cypher does not have 'NOT IN' syntax. To check if a value is NOT in a list, use 'NOT (x IN list)'. The NOT operator will invert the false from an empty list to true as expected.

The fix returns a boolean constant directly, avoiding the NULL result that caused the type lookup failure.

Added regression tests.

modified: regress/expected/expr.out
modified: regress/sql/expr.sql
modified: src/backend/parser/cypher_expr.c
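
The intended semantics can be modeled in a few lines of Python. This is a sketch of the behavior described above, not the C code in cypher_expr.c; `cypher_in` is a hypothetical name and the model ignores Cypher's null handling.

```python
def cypher_in(value, candidates):
    """Model of 'value IN candidates' with an early exit for the empty list."""
    if len(candidates) == 0:
        # Early check: return a boolean constant instead of "no result".
        return False
    return any(value == candidate for candidate in candidates)


empty_result = cypher_in(1, [])             # models: 1 IN []
negated_result = not cypher_in(1, [])       # models: NOT (1 IN [])
member_result = cypher_in(2, [1, 2, 3])     # models: 2 IN [1, 2, 3]
```

Because the empty case yields a real boolean, wrapping it in NOT gives true, matching the note about 'NOT (x IN list)'.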

NOTE: This PR was created with AI tools and a human.

- Remove unused copy command (leftover from deleted agload_test_graph test)
- Replace broken Section 4 that referenced non-existent graph with comprehensive WHERE clause tests covering string, int, bool, and float properties with AND/OR/NOT operators
- Add EXPLAIN tests to verify index usage:
  - Section 3: Validate GIN indices (load_city_gin_idx, load_country_gin_idx) show Bitmap Index Scan for property matching
  - Section 4: Validate all expression indices (city_country_code_idx, city_id_idx, city_west_coast_idx, country_life_exp_idx) show Index Scan for WHERE clause filtering

All indices now have EXPLAIN verification confirming they are used as expected.

modified: regress/expected/index.out
modified: regress/sql/index.sql

NOTE: This PR was created with the help of AI tools and a human.

Added additional requested regression tests -

- EXPLAIN for pattern with WHERE clause
- EXPLAIN for pattern with filters on both country and city

modified: regress/expected/index.out
modified: regress/sql/index.sql

* feat: Add 32-bit platform support for graphid type

This enables AGE to work on 32-bit platforms including WebAssembly (WASM).

Problem:
- graphid is int64 (8 bytes) with PASSEDBYVALUE
- On 32-bit systems, Datum is only 4 bytes
- PostgreSQL rejects pass-by-value types larger than Datum

Solution:
- Makefile-only change (no C code modifications)
- When SIZEOF_DATUM=4 is passed to make, strip PASSEDBYVALUE from the generated SQL
- If not specified, normal 64-bit behavior is preserved (PASSEDBYVALUE kept)

This change is backward compatible:
- 64-bit systems continue using pass-by-value
- 32-bit systems now work with pass-by-reference

Motivation: PGlite (PostgreSQL compiled to WebAssembly) uses 32-bit pointers and requires this patch to run AGE.

Tested on:
- 64-bit Linux (regression tests pass)
- 32-bit WebAssembly via PGlite (all operations work)

Co-authored-by: abbuehlj <[email protected]>
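
The text transformation described in the Solution can be modeled in Python. The actual change lives in the Makefile and its mechanism is not shown in this PR text, so the snippet below is only an illustration: the CREATE TYPE statement is a simplified stand-in for the generated SQL, and `render_sql` is a hypothetical helper.

```python
import re

# Simplified stand-in for the generated type definition (assumed layout).
GRAPHID_SQL = """CREATE TYPE graphid (
  INPUT = graphid_in,
  OUTPUT = graphid_out,
  INTERNALLENGTH = 8,
  PASSEDBYVALUE,
  ALIGNMENT = float8,
  STORAGE = plain
);
"""


def render_sql(sql: str, sizeof_datum: int = 8) -> str:
    """Drop the PASSEDBYVALUE line when targeting a 4-byte Datum."""
    if sizeof_datum == 4:
        return re.sub(r"^[ \t]*PASSEDBYVALUE[ \t]*,[ \t]*\n", "", sql,
                      flags=re.MULTILINE)
    return sql  # default: 64-bit behavior, definition left untouched


sql_32bit = render_sql(GRAPHID_SQL, sizeof_datum=4)
sql_64bit = render_sql(GRAPHID_SQL)
```

Without PASSEDBYVALUE the type is passed by reference, which is why the same 8-byte graphid can be declared on a platform whose Datum is only 4 bytes.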

…2302)

NOTE: This PR was created using AI tools and a human.

Leverage deterministic key ordering from uniqueify_agtype_object() to access vertex/edge fields in O(1) instead of O(log n) binary search.

Fields are sorted by key length, giving fixed positions:
- Vertex: id(0), label(1), properties(2)
- Edge: id(0), label(1), end_id(2), start_id(3), properties(4)

Changes:
- Add field index constants and accessor macros to agtype.h
- Update age_id(), age_start_id(), age_end_id(), age_label(), age_properties() to use direct field access
- Add fill_agtype_value_no_copy() for read-only scalar extraction without memory allocation
- Add compare_agtype_scalar_containers() fast path for scalar comparison
- Update hash_agtype_value(), equals_agtype_scalar_value(), and compare_agtype_scalar_values() to use direct field access macros
- Add fast path in get_one_agtype_from_variadic_args() bypassing extract_variadic_args() for single argument case
- Add comprehensive regression test (30 tests)

Performance impact: Improves ORDER BY, hash joins, aggregations, and Cypher functions (id, start_id, end_id, label, properties) on vertices and edges.

All previous regression tests were not impacted. Additional regression test added to enhance coverage.

modified: Makefile
new file: regress/expected/direct_field_access.out
new file: regress/sql/direct_field_access.sql
modified: src/backend/utils/adt/agtype.c
modified: src/backend/utils/adt/agtype_util.c
modified: src/include/utils/agtype.h
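
The fixed positions listed above follow directly from sorting the keys by length, which a short Python check can reproduce. This is a model only: `field_positions` is a hypothetical helper, and breaking ties alphabetically is an assumption (no two keys here share a length, so it does not affect the result).

```python
def field_positions(keys):
    """Map each key to its index after sorting by key length."""
    ordered = sorted(keys, key=lambda key: (len(key), key))
    return {key: index for index, key in enumerate(ordered)}


# Input order is deliberately scrambled; the sort decides the positions.
VERTEX_FIELDS = field_positions(["properties", "label", "id"])
EDGE_FIELDS = field_positions(
    ["start_id", "properties", "id", "end_id", "label"]
)

# With fixed positions, a field is read by index instead of by key search.
vertex_values = [1125899906842625, "Person", {"name": "Ada"}]
vertex_label = vertex_values[VERTEX_FIELDS["label"]]
```

Because the layout is deterministic, the index of each field can be a compile-time constant, which is what replaces the binary search.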

Note: This PR was created with AI tools and a human.

The pg-connection-string module (dependency of pg) now uses the node: protocol prefix for built-in modules (e.g., require('node:process')). Jest 26 does not support this syntax, causing test failures.

Changes:
- Upgrade jest from ^26.6.3 to ^29.7.0
- Upgrade ts-jest from ^26.5.1 to ^29.4.6
- Upgrade @types/jest from ^26.0.20 to ^29.5.14
- Update typescript to ^4.9.5

This also resolves 19 npm audit vulnerabilities (17 moderate, 2 high) that existed in the older Jest 26 dependency tree.

modified: drivers/nodejs/package.json

Fix Issue 1884: Ambiguous column reference and invalid AGT header errors.

Note: This PR was created with AI tools and a human, or 2.

This commit addresses two related bugs that occur when using SET to store graph elements (vertices, edges, paths) as property values:

Issue 1884 - "column reference is ambiguous" error:

When a Cypher query uses the same variable in both the SET expression RHS and the RETURN clause (e.g., SET n.prop = n RETURN n), PostgreSQL would report "column reference is ambiguous" because the variable appeared in multiple subqueries without proper qualification.

Solution: The fix for this issue was already in place through the target entry naming scheme that qualifies column references.

"Invalid AGT header value" offset error:

When deserializing nested VERTEX, EDGE, or PATH values stored in properties, the system would fail with errors like "Invalid AGT header value: 0x00000041". This occurred because ag_serialize_extended_type() did not include alignment padding (padlen) in the agtentry length calculation for these types, while fill_agtype_value() uses INTALIGN() when reading, causing offset mismatch.

Solution: Modified ag_serialize_extended_type() in agtype_ext.c to include padlen in the agtentry length for VERTEX, EDGE, and PATH cases, matching the existing pattern used for INTEGER, FLOAT, and NUMERIC types:

    *agtentry = AGTENTRY_IS_AGTYPE | (padlen + (AGTENTRY_OFFLENMASK & ...));

This ensures the serialized length accounts for alignment padding, allowing correct deserialization of nested graph elements.

Appropriate regression tests were added to verify the fixes.

Co-authored by: Zainab Saad <[email protected]>

modified: regress/expected/cypher_set.out
modified: regress/sql/cypher_set.sql
modified: src/backend/parser/cypher_clause.c
modified: src/backend/utils/adt/agtype_ext.c
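
The offset mismatch can be reproduced with a toy serializer in Python. This is a deliberately simplified model, not the agtype binary format: it assumes 4-byte alignment for `intalign`, stores only a list of entry lengths, and all function names are hypothetical.

```python
def intalign(offset: int) -> int:
    """Round offset up to the next multiple of 4 (assumed int alignment)."""
    return (offset + 3) & ~3


def serialize(payloads, include_padding=True):
    """Write aligned payloads; return the buffer and the per-entry lengths."""
    buffer = bytearray()
    lengths = []
    for payload in payloads:
        padlen = intalign(len(buffer)) - len(buffer)
        buffer += b"\x00" * padlen + payload
        # The entry length must cover the padding, or later offsets drift.
        lengths.append((padlen if include_padding else 0) + len(payload))
    return bytes(buffer), lengths


def deserialize(buffer, lengths):
    """Read payloads back, aligning each start offset as the writer did."""
    payloads = []
    offset = 0
    for length in lengths:
        start = intalign(offset)
        end = offset + length
        payloads.append(buffer[start:end])
        offset = end
    return payloads


values = [b"abcde", b"VERTEX", b"xyz"]
round_trip = deserialize(*serialize(values))
broken = deserialize(*serialize(values, include_padding=False))
```

With the padding counted, the reader recovers every payload; without it, the second and third entries are read from the wrong offsets, which is the same class of failure as the invalid header error described above.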

- Commit also adds permission checks
- Resolves a critical memory spike issue on loading large file
- Use pg's COPY infrastructure (BeginCopyFrom, NextCopyFromRawFields) for 64KB buffered CSV parsing instead of libcsv
- Add byte based flush threshold (64KB) matching COPY behavior for memory safety
- Use heap_multi_insert with BulkInsertState for optimized batch inserts
- Add per batch memory context to prevent memory growth during large loads
- Remove libcsv dependency (libcsv.c, csv.h)
- Improves loading performance by 15-20%
- No previous regression tests were impacted
- Added regression tests for permissions/rls

Assisted-by AI
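
The byte-based flush threshold can be illustrated with a small Python generator. It is a sketch of the batching idea only, under the assumption that a batch is flushed once its accumulated size reaches the threshold; `batch_rows` is a hypothetical name and the rows are plain byte strings rather than heap tuples.

```python
FLUSH_THRESHOLD_BYTES = 64 * 1024  # 64KB, the threshold named above


def batch_rows(rows, threshold=FLUSH_THRESHOLD_BYTES):
    """Yield lists of rows, flushing once a batch reaches `threshold` bytes."""
    batch = []
    batch_bytes = 0
    for row in rows:
        batch.append(row)
        batch_bytes += len(row)
        if batch_bytes >= threshold:
            yield batch          # hand the batch to the bulk insert
            batch = []           # start fresh so memory use stays bounded
            batch_bytes = 0
    if batch:
        yield batch              # flush whatever is left at end of input


rows = [b"x" * 1024 for _ in range(130)]   # 130 rows of 1KB each
batch_sizes = [len(batch) for batch in batch_rows(rows)]
```

Flushing by bytes rather than by row count keeps memory bounded even when individual rows are large, which fits the stated goal of avoiding the memory spike on large files.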

- Previously, age only set ACL_SELECT and ACL_INSERT in RTEPermissionInfo, bypassing pg's privilege checking for DELETE and UPDATE operations.
- Additionally, RLS policies were not enforced because AGE uses CMD_SELECT for all Cypher queries, causing the rewriter to skip RLS policy application.

Permission fixes:
- Add ACL_DELETE permission flag for DELETE clause operations
- Add ACL_UPDATE permission flag for SET/REMOVE clause operations
- Recursively search RTEs including subqueries for permission info

RLS support:
- Implemented at executor level because age transforms all cypher queries to CMD_SELECT, so pg's rewriter never adds RLS policies for INSERT/UPDATE/DELETE operations. There isn't an appropriate rewriter hook to modify this behavior, so we do it in the executor instead.
- Add setup_wcos() to apply WITH CHECK policies at execution time for CREATE, SET, and MERGE operations
- Add setup_security_quals() and check_security_quals() to apply USING policies for UPDATE and DELETE operations
- USING policies silently filter rows (matching pg behavior)
- WITH CHECK policies raise errors on violation
- DETACH DELETE raises error if edge RLS blocks deletion to prevent dangling edges
- Add permission checks and rls in startnode/endnode functions
- Add regression tests

Assisted-by AI
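
The difference between the two policy kinds can be modeled in Python. This is a conceptual sketch of the behavior listed above, not the executor code: rows are dictionaries, policies are plain predicates, and `apply_using`, `apply_with_check`, and `PolicyViolation` are hypothetical names.

```python
class PolicyViolation(Exception):
    """Raised when a new or updated row fails a WITH CHECK policy."""


def apply_using(rows, using_policy):
    """USING: rows that fail the policy are silently filtered out."""
    return [row for row in rows if using_policy(row)]


def apply_with_check(row, check_policy):
    """WITH CHECK: a row that fails the policy raises an error."""
    if not check_policy(row):
        raise PolicyViolation("new row violates row-level security policy")
    return row


def owned_by_alice(row):
    return row["owner"] == "alice"


rows = [{"id": 1, "owner": "alice"}, {"id": 2, "owner": "bob"}]
visible_for_update = apply_using(rows, owned_by_alice)
```

In this model an UPDATE or DELETE simply never sees the second row, while an attempt to create or set a row owned by someone else fails loudly.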

- Added index creation for existing labels

Assisted-by AI

jrgemignani and others added 26 commits January 30, 2026 08:37


jrgemignani requested a review from MuhammadTahaNaveed January 30, 2026 16:59

github-actions bot added the PG17 and override-stale labels Jan 30, 2026
