Age | Commit message | Author |
|
The check is supposed to ensure NULL/empty nodename gets hashed to 0,
but (nodename == '\0') is comparing the pointer itself, not the first
character. So dereference that correctly.
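For illustration only, a minimal sketch of the difference (the hash function below is made up, not the actual Postgres-XL code):

    #include <stddef.h>

    /* Illustrative only: hash an optional node name, treating NULL/empty as 0. */
    static unsigned int
    nodename_hash(const char *nodename)
    {
        unsigned int hash = 0;

        /*
         * Wrong:  if (nodename == '\0')  -- compares the pointer against NULL
         * Right:  check the pointer, then dereference to test the first char.
         */
        if (nodename == NULL || *nodename == '\0')
            return 0;

        while (*nodename)
            hash = hash * 31 + (unsigned char) *nodename++;

        return hash;
    }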
|
|
The sq_key alone may be up to 64 bytes, so we need more than that.
We could use dynamic memory instead, but 128 bytes should be enough
both for the sq_key and the other pieces.
|
|
When processing NODE_LIST_RESULT messages, gtmpqParseSuccess() used
a static buffer, defined as "char buf[8092]". This is an issue, as
the message has variable length, and may get long enough to exceed
any hard-coded limit. While that's not very common (it requires
long paths, node names and/or many GTM sessions on the node), it
may happen, in which case the memcpy() causes a buffer overflow and
corrupts the stack.
Fixing this is simple - allocate the buffer using malloc() instead,
requesting exactly the right amount of memory. This however hits
a latent pre-existing issue in the code, because the code was doing
memcpy(&buf,...) instead of memcpy(buf,...). With static buffers
this was harmless, because (buf == &buf), so the code was working
as intended (except when there were more than 8092 bytes). With
dynamic memory this is no longer true, because (buf != &buf), and
the stack corruption was much easier to trigger (just 8 bytes).
Per report and debug info by Hengbing. Patch by Pavan and me.
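To illustrate the pointer mix-up, a simplified sketch (not the actual parser code):

    #include <stdlib.h>
    #include <string.h>

    static void
    copy_payload(const char *msg, size_t len)
    {
        /* Static buffer: (buf == &buf) as addresses, so memcpy(&buf, ...)
         * happened to work -- until len exceeded the 8092 bytes. */
        /*   char buf[8092];   memcpy(&buf, msg, len);   */

        /* Malloc-ed buffer: &buf is the address of the local pointer
         * variable itself, so memcpy(&buf, ...) overwrites the stack after
         * just a few bytes.  The copy must go through the pointer value. */
        char *buf = malloc(len);

        if (buf == NULL)
            return;
        memcpy(buf, msg, len);   /* correct: copy into the allocated block */
        /* ... use buf ... */
        free(buf);
    }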
|
|
In XL, we embed the nodename in the tablespace subdir name to ensure that
non-conflicting paths are created when multiple coordinators/datanodes are
running on the same server. The code to handle tablespace mapping in basebackup
was missing this support.
Per report and patch by Wanglin.
|
|
Per report and patch by Wanglin.
|
|
When bad protocol messages are received by the GTM proxy, let
the client know about the errors.
|
|
While in restore mode, which we use to load the schema when a new node is added to
the cluster, partition child tables should correctly inherit the
distribution properties from the parent table. This support was lacking, thus
leading to incorrect handling of such tables.
Per report by Virendra Kumar.
|
|
We missed this in commit c168cc8d58c6e0d9710ef0aba1b846b7174e0a79. So deal
with it now.
|
|
Without that the sequence won't be found correctly.
|
|
Child tables inherit the distribution property from the parent table. Even
more, XL doesn't support syntax of the form PARTITION OF .. DISTRIBUTED BY
and doesn't allow child tables to have a distribution property different from
the parent. So attaching this clause to the partition table does not make any
sense.
Per report from Virendra Kumar.
|
|
Similar to what we did in e688c0c23c962d425b82fdfad014bace4207af1d, we must not
rely on the temporary namespace on the coordinator since it may change on the
remote nodes. Instead we use the pg_my_temp_schema() function to find the
currently active temporary schema on the remote node.
|
|
We create a coordinator-only LOCAL temporary table for REFRESH MATERIALIZED
VIEW CONCURRENTLY. Since this table does not exist on the remote nodes, we must
not use explicit "ANALYZE <temptable>". Instead, just analyze it locally like
we were doing at other places.
Restore the matview test case to use REFRESH MATERIALIZED VIEW CONCURRENTLY now
that the underlying bug is fixed.
|
|
While GTM uses a long jump in case of errors, we were not careful to release
locks currently held by the executing thread. That could lead to threads
leaving a critical section still holding a lock and thus causing deadlocks.
We now properly track currently held locks in the thread-specific information
and release those locks in case of an error. Same is done for mutex locks as
well, though there is only one that gets used.
This change required using malloc-ed memory for the thread-specific info. While
due care has been taken to free the structure, we should keep an eye on it for
any possible memory leaks.
In passing, also improve handling of bad-protocol startup messages, which may
have caused deadlocks and resource starvation.
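A rough sketch of the bookkeeping idea; the structure and helper names below are hypothetical, and GTM_RWLock / GTM_RWLockRelease stand in for GTM's rwlock primitives:

    /* Hypothetical per-thread bookkeeping of rwlocks acquired by this
     * thread, so an error path (the long jump) can release whatever is
     * still held before leaving the critical section. */
    #define MAX_HELD_LOCKS 32

    typedef struct ThreadLockInfo
    {
        int         num_held;
        GTM_RWLock *held[MAX_HELD_LOCKS];
    } ThreadLockInfo;

    static void
    remember_lock(ThreadLockInfo *info, GTM_RWLock *lock)
    {
        if (info->num_held < MAX_HELD_LOCKS)
            info->held[info->num_held++] = lock;
    }

    static void
    release_all_locks(ThreadLockInfo *info)
    {
        /* Called from the error handler before the long jump unwinds. */
        while (info->num_held > 0)
            GTM_RWLockRelease(info->held[--info->num_held]);
    }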
|
|
|
|
A row description message contains the type information for the attributes
(columns) in the result. But if the type does not exist in the search_path, then the
coordinator fails to parse the type name back to the type. So the datanode must
send the schema name along with the type name.
Per report and test case by Hengbing Wang @ Microfun.
Added a new test file and a few test cases to cover this area.
|
|
We'd occasionally seen that the pooler process fails to respond to SIGQUIT and
gets stuck in a non-recoverable state. Code inspection reveals that we're not
following the model used by the rest of the background worker processes in
handling SIGQUIT. So get that fixed, with the hope that this will fix the
problem case.
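Roughly the pattern the other background processes follow for SIGQUIT (a sketch; the handler name is made up):

    #include <signal.h>
    #include <unistd.h>

    /*
     * SIGQUIT means the postmaster wants an immediate, unclean exit.  The
     * handler must not try to clean up, take locks or touch shared state;
     * it should simply terminate the process, so it can never get stuck.
     */
    static void
    pooler_quickdie(int signo)
    {
        _exit(2);
    }

    /* During pooler startup:
     *     pqsignal(SIGQUIT, pooler_quickdie);
     */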
|
|
Without this, parseTypeString() might throw an error or resolve to a wrong type
in case the type name requires quoting.
Per report by Hengbing Wang
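Presumably the type name has to be quoted before it is handed back to the parser; a sketch of that idea using the standard backend helpers (the actual patch may build the string differently, and the helper function here is made up):

    #include "postgres.h"
    #include "parser/parse_type.h"
    #include "utils/builtins.h"      /* quote_qualified_identifier() */

    /* Illustrative helper: resolve a possibly mixed-case type name received
     * from a remote node.  Both parts are quoted, so parseTypeString() sees
     * e.g. "MySchema"."My Type" rather than unquoted identifiers. */
    static Oid
    resolve_remote_type(const char *schemaname, const char *typname,
                        int32 *typmod)
    {
        Oid     typoid;
        char   *qualified = quote_qualified_identifier(schemaname, typname);

        parseTypeString(qualified, &typoid, typmod, false);
        return typoid;
    }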
|
|
|
|
Per report from Hengbing, the current implementation of PITR recovery to a
BARRIER failed to correctly stop at the given recovery_target_barrier. It seems
there are two bugs here: 1) we failed to write the XLOG record correctly, and 2)
we also failed to mark the end-of-recovery upon seeing the XLOG record during
the recovery.
Fix both these problems and also fix pg_xlogdump in passing to ensure we can
dump the BARRIER XLOG records correctly.
|
|
The system may, and very likely will, choose a different namespace for temporary
tables on different nodes. So it was erroneous to explicitly add the coordinator-side
namespace to the queries constructed for fetching stats from the remote nodes.
A regression test had been failing non-deterministically for this reason for a long
time, but only now could we fully understand the problem and fix it. We now use
pg_my_temp_schema() to derive the current temporary schema used by the remote
node instead of hardcoding that in the query using coordinator side
information.
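A simplified sketch of the query construction (the statement actually sent to the datanodes is more involved; the function name here is made up):

    #include "postgres.h"
    #include "lib/stringinfo.h"
    #include "utils/builtins.h"      /* quote_literal_cstr() */

    /*
     * Build the remote statistics query for a temporary table.  Instead of
     * interpolating the coordinator's temp namespace (which very likely
     * differs on the remote node), let the remote node resolve its own
     * temporary schema via pg_my_temp_schema().
     */
    static void
    build_remote_stat_query(StringInfo query, const char *relname)
    {
        appendStringInfo(query,
                         "SELECT s.* FROM pg_statistic s"
                         "  JOIN pg_class c ON c.oid = s.starelid"
                         " WHERE c.relname = %s"
                         "   AND c.relnamespace = pg_my_temp_schema()",
                         quote_literal_cstr(relname));
    }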
|
|
The plans now look the same as vanilla PG except for additional Remote Fast
Query Execution nodes
|
|
The new output looks correct; it was fixed as a result of our work to get
transaction handling right.
|
|
0f65a7193da4b6b0a35b6446b4c904a9f5ac9bf6
|
|
We no longer see "DROP INDEX CONCURRENTLY cannot run inside a transaction
block" if the index does not exists and we're running DROP IF EXISTS
command
|
|
|
|
Chi Gao and Hengbing Wang reported certain issues around transaction handling
and demonstrated via xlogdump how certain transactions were getting marked
committed/aborted repeatedly on a datanode. When we attempt to abort an already
committed transaction, it results in a PANIC. Upon
investigation, this uncovered a very serious yet long-standing bug in
transaction handling.
If the client is running in autocommit mode, we try to avoid starting a
transaction block on the datanode side if only one datanode is going to be
involved in the transaction. This is an optimisation to speed up short queries
touching only a single node. But when the query rewriter transforms a single
statement into multiple statements, we would still (and incorrectly) run each
statement in autocommit mode on the datanode. This can cause inconsistencies
when one statement commits but the next statement aborts. And it may also lead
to the PANIC situations if we continue to use the same global transaction
identifier for the statements.
This can also happen when the user invokes a user-defined function. If the
function has multiple statements, each statement will run in autocommit
mode if it's FQSed, thus again creating an inconsistency if a subsequent statement
in the function fails.
We now have a more elaborate mechanism to tackle autocommit and transaction
block needs. The special casing for force_autocommit is now removed, thus
making it more predictable. We also have specific conditions to check to ensure
that we don't mix up autocommit and transaction block for the same global xid.
Finally, if a query rewriter transforms a single statement into multiple
statements, we run those statements in a transaction block. Together these
changes should help us fix the problems.
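Schematically, the decision could be thought of like this (a hypothetical sketch, not the actual code):

    #include <stdbool.h>

    /* Hypothetical: a statement may run in autocommit mode on the datanode
     * only if it is the sole statement produced by the query rewriter, only
     * one node is involved, and we're not inside an explicit transaction
     * block; otherwise all pieces must share one transaction (and global
     * xid) so they commit or abort together. */
    static bool
    can_use_autocommit(int num_rewritten_stmts, int num_nodes_involved,
                       bool in_transaction_block)
    {
        return num_rewritten_stmts == 1 &&
               num_nodes_involved == 1 &&
               !in_transaction_block;
    }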
|
|
We do some special processing for RemoteSubplan with returning lists. But the
EXPLAIN plan mechanism is not adequately trained to handle that special
crafting. So for now do not try to print the target list in the EXPLAIN output.
|
|
The new message 'W' to report waited-for XIDs must not be sent to a non-XL
client since it's not capable of handling that and might just cause unpleasant
problems. In fact, we should change 'W' to something else since standard libpq
understands that message and hangs forever expecting more data. With a new
protocol message, it would have failed, thus providing a more user-friendly
error. But we're postponing that for now since we should think through the implications
of a protocol change carefully before doing that.
|
|
When formatting the log line prefix in GTM, we can't use a global variable,
because multiple threads may scribble over the same value. This is
why the timestamp was missing in some log lines - one thread did the
strftime(), but before it used the value another thread truncated the
string (which is the first step in formatting a log line).
So instead use a local (not shared by threads) variable, and pass it
to setup_formatted_log_time() explicitly.
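A sketch of the approach (the real signature and format string may differ):

    #include <sys/time.h>
    #include <time.h>

    /*
     * Format the current timestamp into a caller-supplied, thread-local
     * buffer instead of a process-wide global, so one thread cannot
     * truncate or overwrite the string another thread is still using.
     */
    static void
    setup_formatted_log_time(char *buf, size_t buflen)
    {
        struct timeval tv;
        struct tm      tm;

        gettimeofday(&tv, NULL);
        localtime_r(&tv.tv_sec, &tm);
        strftime(buf, buflen, "%Y-%m-%d %H:%M:%S", &tm);
    }

    /* Per log line, on the calling thread's own stack:
     *     char formatted_log_time[128];
     *     setup_formatted_log_time(formatted_log_time,
     *                              sizeof(formatted_log_time));
     */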
|
|
d9f45c9018ec3ec1fc11e4be2be7f9728a1799b1 attempted to refactor
release_connection() to make it more readable, but unfortunately
inverted the force_destroy check, causing regression failures.
In hindsight, the refactoring was rather arbitrary and not really
helping with the readability, so just revert to the original code
(but keep the comments, explaining what's happening).
|
|
A number of functions were defined in pgxcnode.h/pgxcnode.c, but
only ever used in poolmgr.c. Those are:
- PGXCNodeConnect - open libpq connection using conn. string
- PGXCNodePing - ping node using connection string
- PGXCNodeClose - close libpq connection
- PGXCNodeConnected - verify connection status
- PGXCNodeConnStr - build connection string
So move them to poolmgr.c and make them static, so that poolmgr
is the only part dealing with libpq connections directly.
|
|
Similarly to a39b06b0c6, this does minor cleanup in the pool manager
code by removing unused functions and adding a lot of comments, both
at the file level (explaining the concepts and basic API methods)
and for individual functions.
|
|
These comments should have been included in a39b06b0c6, but I failed
to include the file in the commit before pushing :-(
|
|
This patch improves comments in gtm_txn.c and gtm_snap.c in three
basic ways:
1) Adds global comments explaining the basics of transaction and
snapshot management APIs - underlying concepts, main methods.
2) Improves (and adds) function-level comments, explaining the
meaning of parameters, return values, and other details.
3) Tweaks the naming of several API functions, to make them more
consistent with the rest of the module.
|
|
The cleanup does two basic things:
* Functions used only in a single source file are made static (and also
removed from the header file, of course). This reduces the size of the
public GTM API.
* Unused functions (identified by the compiler thanks to making other
functions static in the previous step) are removed. The assumption is
that this code was not really tested at all, and would only make
future improvements harder.
|
|
Multiple places in the regression test mentioned issue 3520503 as a reason
for failures. Unfortunately it's not clear what the issue is about (the
comments were added in 10cf12dc51), but the reference seems obsolete
anyway as the queries seem to work fine - the results are the same as
expected on upstream.
|
|
Some of the ORDER BY clauses added to the test are no longer necessary
as the queries produce stable results anyway (all rows are the same).
So remove the unnecessary clauses, to make the test more like upstream.
|
|
The remote part of a query is planned with per-node statistics, i.e. with
only a fraction of the total number of rows. This affects the costing
and may result in somewhat unexpected plan changes.
For example, one of the plans in updatable_views changed from a hashjoin
to a nestloop due to this - the index got a bit smaller, lowering the
cost of the inner index scan enough to make the nestloop cheaper.
Instead of increasing the number of rows in the test to make it more
expensive again (which would affect the rest of the test), tweak the
random_page_cost for that one query a bit.
|
|
ANALYZE was not collecting index statistics, which may have a negative
impact, for example on selectivity estimates for expressions. This also
fixes some incorrect plan changes in updatable_views regression test.
Discussion: <[email protected]>
|
|
Since commit 93cbab90b0c6fc3fc4aa515b93057127c0ee8a1b we enforce
stricter rules on structure of partitioned tables, e.g. we do not
allow different order of columns in parent/child tables.
This was causing failures in the updatable_views tests, so fix that
by ensuring the structure actually matches exactly.
|
|
This fixes some remaining bugs in handling root->distribution, caused
by the upper-planner pathification (in PostgreSQL 9.6).
Prior to the pathification (so in PostgreSQL 9.5 and Postgres-XL 9.5),
the root->distribution was used for two purposes:
* To track distribution expected by ModifyTable (UPDATE,DELETE), so
that grouping_planner() knew how to redistribute the data.
* To communicate the resulting distribution from grouping_planner()
back to standard_planner().
This worked fine in 9.5 as grouping_planner() was only dealing with
a single remaining path (plan) when considering the redistribution,
and so it was OK to tweak root->distribution.
But since the pathification in 9.6 that is no longer true. There is
no obvious reason why all the paths would have to share the same
distribution, and we don't know which one will be the cheapest one.
So from now on root->distribution is used to track the distribution
expected by ModifyTable. Distribution for each path is available in
path->distribution if needed.
Note: We still use subroot->distribution to pass information about
distribution of subqueries, though. But we only set it after the
one cheapest path is selected.
|
|
After getting rid of the extra targetlist entries in 2d29155679, the
plan changes in updatable_views seem reasonable so accept them.
|
|
While rewriting UPDATE/DELETE commands in rewriteTargetListUD, we've
been pulling all Vars from quals, and adding them to target lists. As
multiple Vars may reference the same column, this sometimes produced
plans with duplicate targetlist entries like this one:
Update on public.t111
-> Index Scan using t1_a_idx on public.t1
Output: 100, t1.b, t1.c, t1.a, t1.a, t1.a, t1.a, t1.a, t1.a,
t1.a, t1.a, t1.ctid
-> ...
Getting rid of the duplicate entries would be simple - before adding
an entry for each Var, check that a matching entry does not exist yet.
The question however is if we actually need any of this.
The comment in rewriteTargetListUD() claims we need to add the Vars
because of "coordinator quals" - which is not really defined anywhere,
but it probably means quals evaluated at the Remote Subquery node.
But we push all quals to the remote node, so there should not be any
cases where a qual would have to be evaluated locally (or where that
would be preferable).
So just remove all the relevant code from rewriteHandler.c, which
means we produce this plan instead:
Update on public.t111
-> Index Scan using t1_a_idx on public.t1
Output: 100, t1.b, t1.c, t1.ctid
-> ...
This affects a number of plans in regression tests, but the changes
seem fine - we simply remove unnecessary target list entries.
I've also added an assert to EXPLAIN enforcing the "no quals" rule
for Remote Subquery nodes.
Discussion: <[email protected]>
|
|
Since commit fb56418d66 the snapshots are computed in thread-local
storage, but we haven't been freeing the memory (on thread exit).
As the memory is allocated in the global (TopMostMemoryContext),
this resulted in a memory leak of 64kB for each GTM connection.
One way to fix this would be to track when the thread-local storage
is used in GTM_GetTransactionSnapshot(), and allocate the memory
in TopMemoryContext (which is per-thread and released on exit).
But there's a simpler way - allocate the thread-specific storage as
part of GTM_ThreadInfo, and just redirect sn_xip from the snapshot.
This way we don't have to worry about palloc/pfree at all, and we
mostly assume that every connection will need to compute at least
one snapshot anyway.
Reported by Rob Canavan <[email protected]>, investigation and fix
by me. For more discussion see
<CAFTg0q6VC_11+c=Q=gsAcDsBrDjvuGKjzNwH4Lr8vERRDn4Ycw@mail.gmail.com>
Backpatch to Postgres-XL 9.5.
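Schematically (GTM_SnapshotData and GlobalTransactionId are the GTM types as I recall them; the field names other than sn_xip and the array size are illustrative):

    /* Keep the xip array inside the per-thread info, so it lives and dies
     * with the thread and no separate allocation in TopMostMemoryContext
     * is needed. */
    typedef struct GTM_ThreadInfo
    {
        /* ... other per-thread fields ... */
        GTM_SnapshotData    thr_snapshot;     /* snapshot computed by this thread */
        GlobalTransactionId thr_snapshot_xip[16384];
    } GTM_ThreadInfo;

    /* When computing a snapshot, just point sn_xip at the embedded array:
     *     thrinfo->thr_snapshot.sn_xip = thrinfo->thr_snapshot_xip;
     */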
|
|
pgxc_FQS_planner() was not copying queryId, so extensions relying on
it did not work properly. For example the pg_stat_statements extension
was ignoring queries executed using FQS entirely.
Backpatch to Postgres-XL 9.5.
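The gist of the fix, schematically (a fragment as it would appear inside pgxc_FQS_planner(); the surrounding code is omitted, and only standard PlannedStmt/Query fields are used):

    /* When building the PlannedStmt for a fast-query-shipped statement,
     * carry over the queryId computed for the Query; without it, extensions
     * such as pg_stat_statements cannot see the statement. */
    PlannedStmt *result = makeNode(PlannedStmt);

    result->commandType = query->commandType;
    result->queryId     = query->queryId;     /* previously missing */
    /* ... fill in the rest of the PlannedStmt as before ... */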
|
|
Some of the tests produce stable ordering even without the explicit
ORDER BY clauses, due to only using generate_series() and not any
distributed tables. So get rid of the unnecessary ORDER BY clauses.
|
|
Commit 7d55b3a318 accepted incorrect expected output for a number
of tests in this suite. The issue might have been initially masked
by the existence of another .out file for this test.
We seem to be producing the correct output, so just use expected
output from upstream. Moreover, the table (INT4_TBL) is defined as
replicated, so we don't need the explicit ORDER BY clauses as the
ordering is stable anyway. So remove them, to make the tests
a bit closer to upstream.
|
|
The value 200 is in fact incorrect, and commit 159912518 accepted it
by mistake. The query should have produced 100 (which it now does).
The plan is correct, and matches the plan produced on PostgreSQL 10
(although with Remote Subquery Scan on top).
|
|
As mentioned in 3a64cfdde3, some of the output differences (compared
to PostgreSQL 10) may be caused by XL advancing cmin more often, for
example due to splitting a single command into multiple steps.
So tweak the expected output using output from Postgres-XL 9.5r1.6.
|
|
Commit 1d14325822 randomized the choice of a starting node with
ROUNDROBIN distribution, so that the data in combocid tests are not
all placed on the first node but distributed randomly (despite using
single-row INSERTS as before).
So to stabilize the test, make the table replicated. The table only
has a single column and the test updates it, so we can't use any
other stable distribution (e.g. BY HASH).
The expected results were obtained by running the combocid.sql on
PostgreSQL 10, so there might be some cmin differences.
|