From: Michael P. <mic...@gm...> - 2012-02-28 04:07:36
Hi Pavan,

I am able to run regressions. However, there are still issues:

1) A datanode crash

(gdb) bt
#0  0x00007f2a555afa75 in raise () from /lib/libc.so.6
#1  0x00007f2a555b35c0 in abort () from /lib/libc.so.6
#2  0x0000000000857bd6 in ExceptionalCondition (conditionName=0xa08c50 "!(((proc->xid) != ((TransactionId) 0)))", errorType=0xa08c40 "FailedAssertion", fileName=0xa08c14 "procarray.c", lineNumber=440) at assert.c:57
#3  0x000000000072fe0a in ProcArrayEndTransaction (proc=0x7f2a55135ac0, latestXid=23666) at procarray.c:440
#4  0x00000000004b96dd in AbortTransaction () at xact.c:2706
#5  0x00000000004b9e49 in AbortCurrentTransaction () at xact.c:3191
#6  0x0000000000755cff in PostgresMain (argc=2, argv=0xdd9d90, username=0xdd9c30 "michael") at postgres.c:4010
#7  0x00000000006fc1ea in BackendRun (port=0xe08fd0) at postmaster.c:3763
#8  0x00000000006fb86a in BackendStartup (port=0xe08fd0) at postmaster.c:3448
#9  0x00000000006f87b6 in ServerLoop () at postmaster.c:1539
#10 0x00000000006f7f57 in PostmasterMain (argc=7, argv=0xdd6ba0) at postmaster.c:1200
#11 0x0000000000662b39 in main (argc=7, argv=0xdd6ba0) at main.c:199
(gdb) up 3
#3  0x000000000072fe0a in ProcArrayEndTransaction (proc=0x7f2a55135ac0, latestXid=23666) at procarray.c:440
440             Assert(TransactionIdIsValid(proc->xid));
(gdb) p proc
$1 = (PGPROC *) 0x7f2a55135ac0
(gdb) p *proc
$2 = {links = {prev = 0x0, next = 0x0}, sem = {semId = 26378254, semNum = 0},
  waitStatus = 0, lxid = 0, xid = 0, xmin = 0, pid = 18594, backendId = 4,
  databaseId = 16384, roleId = 10, inCommit = 0 '\000', vacuumFlags = 0 '\000',
  recoveryConflictPending = 0 '\000', isPooler = 0 '\000', lwWaiting = 0 '\000',
  lwExclusive = 1 '\001', lwWaitLink = 0x0, waitLock = 0x0, waitProcLock = 0x0,
  waitLockMode = 0, heldLocks = 0,
  waitLatch = {is_set = 1, is_shared = 1 '\001', owner_pid = 18594},
  waitLSN = {xlogid = 0, xrecoff = 0}, syncRepState = 0,
  syncRepLinks = {prev = 0x0, next = 0x0},
  myProcLocks = {{prev = 0x7f2a55135b48, next = 0x7f2a55135b48},
    {prev = 0x7f2a55135b58, next = 0x7f2a55135b58},
    {prev = 0x7f2a55135b68, next = 0x7f2a55135b68},
    {prev = 0x7f2a55135b78, next = 0x7f2a55135b78},
    {prev = 0x7f2a55135b88, next = 0x7f2a55135b88},
    {prev = 0x7f2a55135b98, next = 0x7f2a55135b98},
    {prev = 0x7f2a55135ba8, next = 0x7f2a55135ba8},
    {prev = 0x7f2a55135bb8, next = 0x7f2a55135bb8},
    {prev = 0x7f2a55135bc8, next = 0x7f2a55135bc8},
    {prev = 0x7f2a55135bd8, next = 0x7f2a55135bd8},
    {prev = 0x7f2a55135be8, next = 0x7f2a55135be8},
    {prev = 0x7f2a55135bf8, next = 0x7f2a55135bf8},
    {prev = 0x7f2a55135c08, next = 0x7f2a55135c08},
    {prev = 0x7f2a55135c18, next = 0x7f2a55135c18},
    {prev = 0x7f2a55135c28, next = 0x7f2a55135c28},
    {prev = 0x7f2a55135c38, next = 0x7f2a55135c38}},
  subxids = {overflowed = 0 '\000', nxids = 0, xids = {0 <repeats 64 times>}}}
(gdb) p *proc->xid
Cannot access memory at address 0x0
(gdb) p proc->xid
$3 = 0

It happens in the rowtypes test case; I extracted the failing part:

BEGIN;
CREATE TABLE price (
    id SERIAL PRIMARY KEY,
    active BOOLEAN NOT NULL,
    price NUMERIC
);
CREATE TYPE price_input AS (
    id INTEGER,
    price NUMERIC
);
CREATE TYPE price_key AS (
    id INTEGER
);
CREATE FUNCTION price_key_from_table(price) RETURNS price_key AS $$
    SELECT $1.id
$$ LANGUAGE SQL;
CREATE FUNCTION price_key_from_input(price_input) RETURNS price_key AS $$
    SELECT $1.id
$$ LANGUAGE SQL;
insert into price values (1,false,42), (10,false,100), (11,true,17.99);
UPDATE price
    SET active = true, price = input_prices.price
    FROM unnest(ARRAY[(10, 123.00), (11, 99.99)]::price_input[]) input_prices
    WHERE price_key_from_table(price.*) = price_key_from_input(input_prices.*);
select * from price;
rollback; --assertion error here

2) A coordinator crash with 5 coordinators and 5 datanodes (5Co/5Dn)

postgres=# psql -X -c "create database toto" postgres
postgres-# \q
michael@boheme:~/pgsql $ psql -X -c "create database toto" postgres
CREATE DATABASE
michael@boheme:~/pgsql $ psql toto
psql (9.1beta2)
Type "help" for help.

toto=# create table aa (a int);
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!> \q

#0  0x000000000065127a in pgxc_node_remote_prepare (prepareGID=0x2076380 "T10056") at execRemote.c:1705
1705            if (connections[i]->state != DN_CONNECTION_STATE_IDLE)
(gdb) bt
#0  0x000000000065127a in pgxc_node_remote_prepare (prepareGID=0x2076380 "T10056") at execRemote.c:1705
#1  0x0000000000657008 in PrePrepare_Remote (prepareGID=0x2076380 "T10056", localNode=1 '\001', implicit=1 '\001') at execRemote.c:4561
#2  0x00000000004b91dc in PrepareTransaction () at xact.c:2353
#3  0x00000000004b8c54 in CommitTransaction () at xact.c:2005
#4  0x00000000004b9a6a in CommitTransactionCommand () at xact.c:2897
#5  0x0000000000753f8e in finish_xact_command () at postgres.c:2572
#6  0x00000000007518e9 in exec_simple_query (query_string=0x1fcd6c8 "create table aa (a int);") at postgres.c:1136
#7  0x0000000000755f10 in PostgresMain (argc=2, argv=0x1f11de8, username=0x1f11c30 "michael") at postgres.c:4155
#8  0x00000000006fc1ea in BackendRun (port=0x1f40e10) at postmaster.c:3763
#9  0x00000000006fb86a in BackendStartup (port=0x1f40e10) at postmaster.c:3448
#10 0x00000000006f87b6 in ServerLoop () at postmaster.c:1539
#11 0x00000000006f7f57 in PostmasterMain (argc=7, argv=0x1f0eba0) at postmaster.c:1200
#12 0x0000000000662b39 in main (argc=7, argv=0x1f0eba0) at main.c:199
(gdb) p i
$2 = 4
(gdb) p connections
$3 = (PGXCNodeHandle **) 0x1f2ca40
(gdb) p connections[0]
$4 = (PGXCNodeHandle *) 0x2007638
(gdb) p *connections[0]
$5 = {nodeoid = 16384, sock = 31, transaction_status = 84 'T',
  state = DN_CONNECTION_STATE_QUERY, combiner = 0x0, error = 0x0,
  outBuffer = 0x2019568 "Q", outSize = 16384, outEnd = 0,
  inBuffer = 0x201d5a8 "C", inSize = 16384, inStart = 24, inEnd = 24,
  inCursor = 24}
(gdb) p *connections[1]
$6 = {nodeoid = 16385, sock = 41, transaction_status = 84 'T',
  state = DN_CONNECTION_STATE_QUERY, combiner = 0x0, error = 0x0,
  outBuffer = 0x20215e8 "Q", outSize = 16384, outEnd = 0,
  inBuffer = 0x2025628 "C", inSize = 16384, inStart = 24, inEnd = 24,
  inCursor = 24}
(gdb) p *connections[2]
$7 = {nodeoid = 16386, sock = 42, transaction_status = 84 'T',
  state = DN_CONNECTION_STATE_QUERY, combiner = 0x0, error = 0x0,
  outBuffer = 0x2029668 "Q", outSize = 16384, outEnd = 0,
  inBuffer = 0x202d6a8 "C", inSize = 16384, inStart = 24, inEnd = 24,
  inCursor = 24}
(gdb) p *connections[3]
$8 = {nodeoid = 16387, sock = 43, transaction_status = 84 'T',
  state = DN_CONNECTION_STATE_QUERY, combiner = 0x0, error = 0x0,
  outBuffer = 0x20316e8 "Q", outSize = 16384, outEnd = 0,
  inBuffer = 0x2035728 "C", inSize = 16384, inStart = 24, inEnd = 24,
  inCursor = 24}
(gdb) p *connections[4]
Cannot access memory at address 0x100000001

On Mon, Feb 27, 2012 at 8:22 PM, Pavan Deolasee <pav...@en...> wrote:

> On Mon, Feb 27, 2012 at 12:14 PM, Pavan Deolasee
> <pav...@en...> wrote:
> > On Mon, Feb 27, 2012 at 11:46 AM, Ashutosh Bapat
> > <ash...@en...> wrote:
> >> Ah,
> >> if you see the error even after rebuilding the data-clusters, this could be
> >> something to look at.
> >>
> >
> > My observation is that this happens when something goes wrong at the
> > datanode. The problem is since we are seeing this error even without
> > the patch (though intermittently), we need to fix the general issue.
> > The patch might just be manifesting it.
> >
>
> I hope the attached patch would fix the issue for pooler connections
> (to be applied on top of the xact refactoring patch). In
> ExecRemoteUtility we are doing some funny things.. like counting the
> local coordinator and then reducing the count by 1 while running the
> for loop. I am not sure why its necessary and how it was running
> earlier. But I definitely think thats a bad code because we don't
> connect to the local coordinator ever.
>
> I have tested regression with 1 and 1 coordinators and they seem to
> work fine in my environment. I will continue to dig other errors.
>
> Thanks,
> Pavan
>
> --
> Pavan Deolasee
> EnterpriseDB http://www.enterprisedb.com
>

--
Michael Paquier
http://michael.otacoo.com
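
P.S. For illustration, here is a minimal sketch of the bounds problem that the second trace suggests: the loop index reached 4 while only connections[0] through connections[3] held valid handles. The names below (NodeHandle, ConnState, remote_coordinator_count, count_busy) are hypothetical stand-ins, not the Postgres-XC definitions, and the idea that the local coordinator is the one node missing from the array is an assumption taken from the quoted description, not something verified against the code.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the connection state and node handle
 * that appear in the gdb output; not the real Postgres-XC types. */
typedef enum { CONN_IDLE, CONN_QUERY } ConnState;

typedef struct {
    int       nodeoid;
    ConnState state;
} NodeHandle;

/* Assumption: a coordinator never connects to itself, so with N
 * coordinators in the cluster there are at most N - 1 remote handles. */
int remote_coordinator_count(int total_coordinators)
{
    return total_coordinators > 0 ? total_coordinators - 1 : 0;
}

/* Count handles that are not idle. 'count' must be the number of
 * entries actually stored in 'handles'; NULL slots are skipped
 * instead of being dereferenced. */
int count_busy(NodeHandle *const *handles, size_t count)
{
    int busy = 0;

    for (size_t i = 0; i < count; i++) {
        if (handles[i] == NULL)
            continue;
        if (handles[i]->state != CONN_IDLE)
            busy++;
    }
    return busy;
}
```

With 5 coordinators, remote_coordinator_count(5) gives 4, which matches the four readable handles in the trace. Deriving the loop bound from the same count that sized the array, rather than adjusting it inside the loop, keeps the two from drifting apart.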