|
We create a coordinator-only LOCAL temporary table for REFRESH MATERIALIZED
VIEW CONCURRENTLY. Since this table does not exist on the remote nodes, we must
not use an explicit "ANALYZE <temptable>". Instead, just analyze it locally, as
we already do elsewhere.
Restore the matview test case to use REFRESH MATERIALIZED VIEW CONCURRENTLY now
that the underlying bug is fixed.
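A minimal sketch of the failure mode (table and matview names are hypothetical):
    -- mv_tmp stands for the coordinator-only LOCAL temp table built
    -- during REFRESH MATERIALIZED VIEW CONCURRENTLY mv;
    ANALYZE mv_tmp;  -- if shipped to the datanodes this fails, since
                     -- the table exists only on the coordinator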
|
|
A row description message contains the type information for the attributes in
the result. But if the type does not exist in the search_path, the coordinator
fails to parse the type name back to the type. So the datanode must send the
schema name along with the type name.
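A hedged illustration of the failure (schema, type and table names are made up):
    CREATE SCHEMA app;
    CREATE TYPE app.color AS ENUM ('red', 'green');
    CREATE TABLE paints (c app.color);
    SELECT c FROM paints;  -- the datanode used to send just "color";
                           -- with app missing from search_path, the
                           -- coordinator could not resolve the type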
Per report and test case by Hengbing Wang @ Microfun.
Added a new test file and a few test cases to cover this area.
|
|
Without this, parseTypeString() might throw an error or resolve to a wrong type
when the type name requires quoting.
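For instance (hypothetical type), a name that needs quoting:
    CREATE TYPE "Point 2D" AS (x float8, y float8);
    -- the datanode must send the name quoted; parsing an unquoted
    -- Point 2D back would error out or match some other type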
Per report by Hengbing Wang.
|
|
The system may, and very likely will, choose a different namespace for
temporary tables on different nodes. So it was erroneous to explicitly add the
coordinator-side namespace to the queries constructed for fetching stats from
the remote nodes.
A regression test had been failing non-deterministically for this reason for a
long time, but only now could we fully understand the problem and fix it. We
now use pg_my_temp_schema() to derive the current temporary schema used by the
remote node, instead of hardcoding it in the query using coordinator-side
information.
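A sketch of the difference (not the exact query XL builds; the table name is
illustrative):
    -- before: the coordinator-side temp schema was embedded literally,
    -- which is wrong on a node that happens to use, say, pg_temp_5
    SELECT c.relname FROM pg_class c
      JOIN pg_namespace n ON n.oid = c.relnamespace
     WHERE n.nspname = 'pg_temp_3' AND c.relname = 'tmp_t';
    -- after: each node resolves its own temporary schema
    SELECT c.relname FROM pg_class c
     WHERE c.relnamespace = pg_my_temp_schema() AND c.relname = 'tmp_t';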
|
|
The plans now look the same as in vanilla PG, except for the additional Remote
Fast Query Execution nodes.
|
|
The new output looks correct; it was fixed by our work to get transaction
handling right.
|
|
0f65a7193da4b6b0a35b6446b4c904a9f5ac9bf6
|
|
We no longer see "DROP INDEX CONCURRENTLY cannot run inside a transaction
block" if the index does not exist and we're running a DROP ... IF EXISTS
command.
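That is, a standalone statement like the following (index name is illustrative)
used to fail because of the implicit datanode-side transaction block:
    DROP INDEX CONCURRENTLY IF EXISTS no_such_index;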
|
|
Chi Gao and Hengbing Wang reported certain issues around transaction handling
and demonstrated via xlogdump how certain transactions were getting marked
committed/aborted repeatedly on a datanode. When an attempt is made to abort an
already committed transaction, it results in a PANIC. Investigating this
uncovered a very serious and long-standing bug in transaction handling.
If the client is running in autocommit mode, we try to avoid starting a
transaction block on the datanode side if only one datanode is going to be
involved in the transaction. This is an optimisation to speed up short queries
touching only a single node. But when the query rewriter transforms a single
statement into multiple statements, we would still (and incorrectly) run each
statement in autocommit mode on the datanode. This can cause inconsistencies
when one statement commits but the next statement aborts. It may also lead to
the PANIC described above if we continue to use the same global transaction
identifier for the statements.
The same thing can happen when the user invokes a user-defined function. If the
function has multiple statements, each statement will run in autocommit mode
if it's FQSed, again creating inconsistencies if a later statement in the
function fails.
We now have a more elaborate mechanism to tackle autocommit and transaction
block requirements. The special casing for force_autocommit is removed, making
the behaviour more predictable. We also check specific conditions to ensure
that we don't mix up autocommit and transaction block for the same global xid.
Finally, if the query rewriter transforms a single statement into multiple
statements, we run those statements in a transaction block. Together these
changes should fix the problems.
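A sketch of the rewriter case (table and rule names are hypothetical):
    CREATE TABLE t (id int);
    CREATE TABLE audit (id int);
    CREATE RULE t_audit AS ON INSERT TO t
        DO ALSO INSERT INTO audit VALUES (NEW.id);
    -- one client statement, two rewritten statements; previously each
    -- ran in autocommit mode on the datanode, so the first could
    -- commit while the second aborted; now both run in a single
    -- transaction block
    INSERT INTO t VALUES (1);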
|
|
We do some special processing for RemoteSubplan with returning lists. But the
EXPLAIN mechanism is not adequately equipped to handle that special crafting,
so for now do not try to print the target list in the EXPLAIN output.
|
|
Multiple places in the regression test mentioned issue 3520503 as a reason
for failures. Unfortunately it's not clear what the issue is about (the
comments were added in 10cf12dc51), but the reference seems obsolete
anyway, as the queries work fine - the results are the same as
expected on upstream.
|
|
Some of the ORDER BY clauses added to the test are no longer necessary
as the queries produce stable results anyway (all rows are the same).
So remove the unnecessary clauses, to make the test more like upstream.
|
|
The remote part of a query is planned with per-node statistics, i.e. with
only a fraction of the total number of rows. This affects the costing
and may result in somewhat unexpected plan changes.
For example, one of the plans in updatable_views changed from hashjoin
to nestloop due to this - the index got a bit smaller, lowering the
cost of the inner index scan enough to make nestloop cheaper.
Instead of increasing the number of rows in the test to make it more
expensive again (which would affect the rest of the test), tweak the
random_page_cost for that one query a bit.
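Roughly like this (the exact value used in the test may differ):
    SET random_page_cost = 5.0;  -- make the repeated inner index scan
                                 -- look costlier than the hash join
    -- ... run the one affected updatable_views query ...
    RESET random_page_cost;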
|
|
Since commit 93cbab90b0c6fc3fc4aa515b93057127c0ee8a1b we enforce
stricter rules on the structure of partitioned tables, e.g. we do not
allow a different column order in parent and child tables.
This was causing failures in the updatable_views tests, so fix that
by ensuring the structure actually matches exactly.
|
|
After getting rid of the extra targetlist entries in 2d29155679, the
plan changes in updatable_views seem reasonable, so accept them.
|
|
While rewriting UPDATE/DELETE commands in rewriteTargetListUD, we've
been pulling all Vars from quals, and adding them to target lists. As
multiple Vars may reference the same column, this sometimes produced
plans with duplicate targetlist entries like this one:
Update on public.t111
-> Index Scan using t1_a_idx on public.t1
Output: 100, t1.b, t1.c, t1.a, t1.a, t1.a, t1.a, t1.a, t1.a,
t1.a, t1.a, t1.ctid
-> ...
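A statement of roughly this shape (hypothetical, mirroring the plan above)
produces the duplicates:
    UPDATE t111 SET a = 100
     WHERE a > 0 AND a < 10 AND a <> 5;  -- each Var pulled from these
                                         -- quals added another t1.a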
Getting rid of the duplicate entries would be simple - before adding an
entry for each Var, check that a matching entry does not exist yet.
The question, however, is whether we actually need any of this.
The comment in rewriteTargetListUD() claims we need to add the Vars
because of "coordinator quals" - which is not really defined anywhere,
but it probably means quals evaluated at the Remote Subquery node.
But we push all quals to the remote node, so there should not be any
cases where a qual would have to be evaluated locally (or where that
would be preferable).
So just remove all the relevant code from rewriteHandler.c, which
means we produce this plan instead:
Update on public.t111
-> Index Scan using t1_a_idx on public.t1
Output: 100, t1.b, t1.c, t1.ctid
-> ...
This affects a number of plans in regression tests, but the changes
seem fine - we simply remove unnecessary target list entries.
I've also added an assert to EXPLAIN enforcing the "no quals" rule
for Remote Subquery nodes.
Discussion: <[email protected]>
|
|
Some of the tests produce stable ordering even without the explicit
ORDER BY clauses, because they only use generate_series() and no
distributed tables. So get rid of the unnecessary ORDER BY clauses.
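For example (a sketch), a query like this is stable on its own:
    SELECT i FROM generate_series(1, 5) AS g(i);
    -- evaluated in a single place, not scattered across datanodes,
    -- so the row order is deterministic even without ORDER BY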
|
|
Commit 7d55b3a318 accepted incorrect expected output for a number
of tests in this suite. The issue might have been initially masked
by existence of another .out file for this test.
We seem to be producing the correct output, so just use expected
output from upstream. Moreover, the table (INT4_TBL) is defined as
replicated, so we don't need the explicit ORDER BY clauses as the
ordering is stable anyway. So remove them, to make the tests
a bit closer to upstream.
|
|
The value 200 is in fact incorrect, and commit 159912518 accepted it
by mistake. The query should have produced 100 (which it now does).
The plan is correct, and matches the plan produced on PostgreSQL 10
(although with Remote Subquery Scan on top).
|
|
As mentioned in 3a64cfdde3, some of the output differences (compared
to PostgreSQL 10) may be caused by XL advancing cmin more often, for
example due to splitting a single command into multiple steps.
So tweak the expected output using output from Postgres-XL 9.5r1.6.
|
|
Commit 1d14325822 randomized the choice of a starting node with
ROUNDROBIN distribution, so that the data in combocid tests are not
all placed on the first node but distributed randomly (despite using
single-row INSERTS as before).
So to stabilize the test, make the table replicated. The table only
has a single column and the test updates it, so we can't use any
other stable distribution (e.g. BY HASH).
The expected results were obtained by running the combocid.sql on
PostgreSQL 10, so there might be some cmin differences.
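In XL syntax the stabilizing change looks roughly like this (assuming the
single-column table from the upstream test):
    CREATE TABLE combocidtest (foobar int) DISTRIBUTE BY REPLICATION;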
|
|
Commit e26a0e07d8 started ignoring distributions defined on partitions,
but omitted this place in 'rules' when accepting the warnings.
|
|
On stock PostgreSQL, CREATE INDEX also updates statistics in pg_class
(relpages and reltuples). But Postgres-XL does not do that, which may
result in plan differences when the test relies on this behavior.
This is the same issue as in cfb055553687c257dd1d1ed123356c892f48a804,
but affecting inherit regression tests. So fix it in the same way, by
doing an explicit vacuum on the tables.
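The fix amounts to an explicit vacuum (table name is illustrative):
    VACUUM patest0;  -- fills in relpages/reltuples in pg_class, which
                     -- CREATE INDEX alone does not do on Postgres-XL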
|
|
The accepted plan changes seem correct, as the only difference with
respect to upstream plans is Limit distribution. The commit diff is
a bit more complicated, because the expected plan did not reflect
the switch from Result to ProjectSet.
|
|
The plan change is expected, as it simply expands
-> Limit
to
-> Limit
-> Remote Subquery Scan
-> Limit
It wasn't accepted before probably because it was hidden by other
failures in the test suite.
|
|
When checking if a query is eligible for FQS (fast-query shipping),
disable the optimization for queries in SCROLL cursors, as FQS does
not support backward scans.
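The pattern that must not be fast-query-shipped looks like this (cursor and
table names are made up):
    BEGIN;
    DECLARE c SCROLL CURSOR FOR SELECT * FROM t;
    FETCH BACKWARD 1 FROM c;  -- requires a backward scan, which FQS
                              -- cannot provide
    CLOSE c;
    COMMIT;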
Discussion: <[email protected]>
|
|
The ATTACH PARTITION command was failing due to mismatching column
order. That in turn caused failure in sanity_check, as the table
was not dropped when dropping the parent.
|
|
The expected output did not match, likely due to some confusion when
merging upstream changes (where the query does not include the ORDER
BY clause).
The updated output matches exactly what upstream produces after
adding the ORDER BY clause.
|
|
Commit f7d1d581c950191a465b8483173f2ad69ae8fffe converted a couple of
sequences to persistent (instead of temporary), but failed to update
the expected output.
|
|
We now create views/materialised views on all nodes, unless they are temporary
objects, in which case they are created only on the local coordinator and the
datanodes. Similarly, temporary sequences are created on the local coordinator
and the datanodes.
This solves many outstanding problems in the regression results, where remote
nodes used to fail because of a non-existent type for a view or similar
issues. A few other test cases now started to work correctly and produce
output matching upstream PG, so the expected output for those test cases has
been appropriately fixed.
A couple of sequences in the rangefuncs test case have been converted into
permanent sequences, because the subsequent SQL functions refer to them and
hence fail if they do not exist on the remote coordinators.
The problem with the special RULE converting a regular table into a view goes
away with this fix, since DROP VIEW commands are now propagated to the
datanodes too.
|
|
The actual result matches the upstream result. The diffs must have been
caused by an incorrect merge, so accept those fully.
|
|
The first diff was in fact a mistake, and the actual output matched the
upstream output; it must have been missed during the merge process. The other
diff was simply an error, because XL doesn't allow updating the distribution
key column.
|
|
There were two broad categories of problems:
1. Errors due to lack of savepoint support
2. Errors and side effects due to lack of trigger support
For 1, we reorganised the test case so that it can be run without savepoints.
For 2, we mostly accepted the regression changes. Apart from the usual errors
while creating/dropping triggers, there were differences in query results
because of changes to the preceding update/insert/delete statements, whose
behaviour changes because of the lack of triggers.
|
|
Because of XL's strict requirement on column ordering and positioning, change
the test case to avoid DROP COLUMN commands. This largely defeats the point of
the test, whose sole purpose was to check that things hold up well when there
is a mismatch in column numbering. But given the current restriction, there is
no other way to make these changes.
|
|
A dummy append node with no subpaths doesn't need any adjustment for
distribution. This allows us to correctly handle UPDATE/DELETE in some
cases that were failing earlier.
|
|
- Some problems related to inherited tables fixed by ensuring column ordering.
- ORDER BY clauses added at some other places to ensure consistent row ordering.
- Changes related to TABLESAMPLE accepted, as XL returns more rows than PG.
- SAVEPOINTs removed and replaced by transaction blocks, as XL does not support
subtransactions.
- NOTICEs are not displayed in XL.
- Append is pushed down to the remote node now that we impose stricter
restrictions on inheritance.
|
|
As the table has just a single float8 column, XL automatically picks
ROUNDROBIN distribution. Commit 1d14325822 randomized selection of the
initial node, which with single-row inserts (used by the float8 tests)
effectively means random distribution, while before that all the rows
would be routed to the first node.
Some of the tests in the float8 test suite seem to be sensitive to this, in
particular this overflow test:
SELECT '' AS bad, f.f1 ^ '1e200' from FLOAT8_TBL f ORDER BY f1;
ERROR: value out of range: overflow
One of the rows (containing 1.2345678901234e-200) triggers an underflow,
so when the database hits it first, it will report this error instead:
ERROR: value out of range: underflow
The probability of hitting this is however fairly low (less than 10%),
as the executor has to touch the underflowing value first.
|
|
Remove a SAVEPOINT statement, which otherwise fails. Once that is removed, a
few other test cases work fine and the associated expected output changes are
accepted.
|
|
It was a simple case of changed row ordering: the test case requests ordering
by column 'a', but the expected output was ordered by column 'c'.
|
|
It's just an addition of a Remote Subquery Scan node on top of the regular
plan.
|
|
We don't support subtransactions and hence can't handle the exception thrown
by trying to set an invalid value. We'd already removed the exception, but the
transaction was being left in an aborted state. So fix this.
The test case still fails for some other reason which should be investigated
separately.
|
|
These changes were lost when we removed the alternate expected output files for
the test case. So these are not new differences; the same ordering is exhibited
in XL 9.5 as well.
NOTICEs are not shown by XL, so accept those differences.
|
|
Accept some diffs which look sane and in line with the upstream errors. Also
comment out a few tests which explicitly test subtransactions, something we
don't currently support.
|
|
We support neither BACKWARD scans of RemoteSubplan nor WHERE CURRENT OF. So
accept the resulting errors.
|
|
We support neither BACKWARD scans of RemoteSubplan nor WHERE CURRENT OF. So
accept the errors arising from these limitations. These test case changes are
new in XL10, hence we did not see these failures in the earlier releases of
XL.
|
|
SAVEPOINTs were used, which we don't support, so comment those out (along
with the ROLLBACK calls). That does have an impact on the test case, though:
in at least one place we were checking that the cmin goes back to 0 after
rolling back to a savepoint. But there is not much we can do about that until
we add SAVEPOINT support. Other changes include accepting diffs caused by ctid
changes, as rows come from different nodes and hence ctids are
duplicated/changed.
|
|
We do not support triggers, hence the regression differences.
|
|
As with other test cases, accept the lack of NOTICEs in the test case. This
issue needs to be investigated, but surely this is not the only test case
suffering from this behavioural change. So accept for now and revisit once
(and if) we fix the NOTICE problem.
|
|
Accept some differences: the identity column is also used as the distribution
column, and hence updates to it are disallowed in XL (we should later add
XL-specific test cases with the identity in a non-distribution column). Also
add ORDER BY to some SELECT queries to ensure consistent ordering of the
results.
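A sketch of the rejected case (names are hypothetical; the identity column
doubles as the distribution key):
    CREATE TABLE itest (a int GENERATED BY DEFAULT AS IDENTITY, b text)
        DISTRIBUTE BY HASH (a);
    UPDATE itest SET a = 10;  -- rejected: a is the distribution column
    SELECT a, b FROM itest ORDER BY a;  -- ORDER BY for stable output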
|
|
Since large objects are not supported by XL, these were mostly cosmetic
differences, not indicative of any actual bug.
|