postgres-xc-general Mailing List for Postgres-XC


EXECUTE DIRECT ON data_east03 'SELECT count(1) FROM foo';

 count
----------
       1
(1 row)

EXECUTE DIRECT ON data_east02 'SELECT count(1) FROM foo';

 count
------------
 100000      
(1 row)

I get the same result even if I properly shut down and restart the cluster.

PJ
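
The per-node comparison above could be scripted from a coordinator instead of logging in to each datanode. The following is only a sketch: it builds one EXECUTE DIRECT statement per datanode, in the unparenthesized form used in this message (the quoted reply below uses the parenthesized form, and which one a given release accepts is an assumption to verify), then compares the returned counts with the coordinator's count. The helper names `build_count_queries` and `find_divergent_nodes` are hypothetical, and actually running the statements (for example through psql) is left out.

```python
def build_count_queries(table, datanodes):
    """Build one EXECUTE DIRECT row-count statement per datanode.

    The statement form mirrors the one shown in this thread. Table and node
    names are inserted as-is, so pass only trusted identifiers.
    """
    return {
        node: f"EXECUTE DIRECT ON {node} 'SELECT count(1) FROM {table}';"
        for node in datanodes
    }


def find_divergent_nodes(counts, expected):
    """Return the sorted names of nodes whose row count differs from `expected`."""
    return sorted(node for node, n in counts.items() if n != expected)


if __name__ == "__main__":
    # Counts reported in this thread after the load (coordinator saw 100000).
    reported = {"data_east02": 100000, "data_east03": 1}
    for node, sql in build_count_queries("foo", reported).items():
        print(node, "->", sql)
    print("divergent:", find_divergent_nodes(reported, expected=100000))
```

With the counts from this thread, only the PRIMARY datanode (data_east03) would be flagged.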

>________________________________
> From: Nikhil Sontakke <ni...@st...>
>To: Paul Jones <pb...@cm...> 
>Cc: "pos...@li..." <pos...@li...> 
>Sent: Thursday, April 25, 2013 10:47 PM
>Subject: Re: [Postgres-xc-general] BUG in \copy ...was Re: Questions about PRIMARY and a problem
> 
>
>
>Hi Paul, 
>
>It's not recommended to log in directly to a datanode to run queries. All queries ought to be directed via coordinators to use sane XIDs. 
>
>What does the following query return when run from the coordinator?
>
>
>EXECUTE DIRECT ON (data_east03) 'SELECT count(1) FROM foo';
>
>
>Regards,
>Nikhils
>
>
>
>On Fri, Apr 26, 2013 at 3:16 AM, Paul Jones <pb...@cm...> wrote:
>
>>I think I have discovered the problem with PRIMARY and our loads.  The PRIMARY datanode does not load
>>data for tables that are DISTRIBUTE BY REPLICATION.  We are on PGXC 1.0.2.  Here is a way to
>>demonstrate the problem:
>>
>># Demonstrate PRIMARY node \copy bug.
>>
>># 3 Coordinators
>># 8 Datanodes
>>
>># Prepare some data:
>>$ awk 'BEGIN{for(i=1;i<=100000;i++) print i; }' > DATA
>>
>>$ psql -h HOST242 -U postgres
>>postgres=# select * from pgxc_node;
>>   node_name    | node_type | node_port |     node_host       | nodeis_primary | nodeis_preferred |   node_id
>>----------------+-----------+-----------+---------------------+----------------+------------------+-------------
>>coordw         | C         |      5432 | localhost           | f              | f                |   670793242
>>coorde         | C         |      5432 | HOST225.example.com | f              | f                |   329164574
>>coordc         | C         |      5432 | HOST232.example.com | f              | f                | -1588937622
>>data_east01    | D         |     25432 | HOST225.example.com | f              | f                | -2053435448
>>data_east02    | D         |     25432 | HOST226.example.com | f              | f                |    94547764
>>data_east03    | D         |     25432 | HOST230.example.com | t              | f                |   197970754
>>data_central01 | D         |     25432 | HOST231.example.com | f              | f                |   124274836
>>data_central02 | D         |     25432 | HOST232.example.com | f              | f                |  1002175669
>>data_central03 | D         |     25432 | HOST238.example.com | f              | f                | -1150964881
>>data_west01    | D         |     25432 | HOST239.example.com | f              | f                |  2129529735
>>data_west02    | D         |     25432 | HOST242.example.com | f              | f                |  -717656524
>>(11 rows)
>>
>>#################################################
>># If we create a table distributed by replication
>># and fill it with an insert, we see the same
>># row counts from a coordinator and the PRIMARY
>>#################################################
>>
>>psql -h HOST242 -U user
>>user=> create table foo (aaa int) distribute by replication;
>>CREATE TABLE
>>user=> insert into foo select generate_series(1,100000);
>>INSERT 0 100000
>>user=> select count(1) from foo;
>>count
>>--------
>>100000
>>(1 row)
>>user=> \q
>>
>>$ psql -p 25432 -h HOST230 -U user
>>psql (9.2.4, server 9.1.7)
>>
>>user=> select count(1) from foo;
>>count
>>--------
>>100000
>>(1 row)
>>
>>user=> \q
>>
>>###############################################
>># Now truncate the table and load it with \copy
>>###############################################
>>
>>$ psql -h HOST242 -U user
>>user=> truncate table foo;
>>TRUNCATE TABLE
>>user=>
>>user=> \copy foo from DATA
>>user=> select count(1) from foo;
>>count
>>--------
>>100000
>>(1 row)
>>
>>user=> \q
>>
>>#############################################
>># The count was correct from the coordinator,
>># but not in the PRIMARY node.
>>#############################################
>>
>>$ psql -p 25432 -h HOST230 -U user
>>psql (9.2.4, server 9.1.7)
>>
>>user=> select count(1) from foo;
>>count
>>-------
>>     1
>>(1 row)
>>
>>user=> \q
>>
>>#######################################
>># The count is correct in a non-PRIMARY
>>#######################################
>>
>>$ psql -p 25432 -h HOST231 -U user
>>psql (9.2.4, server 9.1.7)
>>
>>user=> select count(1) from foo;
>>count
>>--------
>>100000
>>(1 row)
>>
>>user=> \q
>>
>>
>>
>>----- Original Message -----
>>> From: Paul Jones <pb...@cm...>
>>> To: "pos...@li..." <pos...@li...>
>>> Cc:
>>> Sent: Monday, April 8, 2013 2:58 PM
>>> Subject: Re: [Postgres-xc-general] Questions about PRIMARY and a problem
>>>
>>> Thanks for everyone's explanation of PRIMARY.  It is much clearer now.
>>>
>>> I believe, then, that we may have uncovered a bug in PRIMARY.
>>>
>>> We created a new cluster (8 nodes, 3 coordinators), but with only one PRIMARY
>>> datanode.  The PRIMARY was
>>> declared the same on all 3 coordinators.
>>>
>>> A table that was declared DISTRIBUTE BY REPLICATION and loaded by \copy  did
>>> not have any rows
>>> present on the PRIMARY!  Further, other tables with FK's referring to this
>>> table had RI failures when they were loaded,
>>> even though there were complete copies of the table in all the other non-PRIMARY
>>> datanodes.
>>>
>>> When we remade the cluster without any PRIMARY, this table loaded into all
>>> datanodes and there were no RI failures.
>>>
>>> Is this a bug?   Unfortunately I won't be able to experiment with this until
>>> we finish executing our test plan, perhaps
>>> a few days.
>>>
>>> PJ
>>>
>>>
>>>
>>>
>>>> ________________________________
>>>>  From: Koichi Suzuki <koi...@gm...>
>>>> To: Andrei Martsinchyk <and...@gm...>
>>>> Cc: "pos...@li..."
>>> <pos...@li...>
>>>> Sent: Sunday, April 7, 2013 9:23 AM
>>>> Subject: Re: [Postgres-xc-general] Questions about PRIMARY and a problem
>>>>
>>>>
>>>> The primary node is useful for keeping a replicated table consistent
>>> across all the datanodes.  All writes to a replicated table go first to the
>>> primary node, so conflicts are resolved there, which prevents conflicting writes
>>> on the other datanodes.   In this sense, it may prevent some deadlocks, but it does
>>> not remove the chance of deadlocks in general.
>>>>
>>>> On the other hand, a preferred node (again a datanode) saves inter-server
>>> communication when reading a replicated table.   It does not help maintain
>>> replicated-table consistency, but it helps to gain some performance.
>>>>
>>>> Regards;
>>>> ---
>>>> Koichi Suzuki
>>>>
>>>>
>>>> 2013/4/7 Andrei Martsinchyk <and...@gm...>
>>>>
>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> 2013/4/7 Jov <am...@am...>
>>>>>
>>>>> Datanodes use the primary node to resolve writes to replicated tables, which is
>>> good, but how do coordinators solve the deadlock problem? The coordinator nodes
>>> replicate all global catalog tables across coordinators, so these are a kind of
>>> replicated table.
>>>>>>
>>>>>>
>>>>>> eg.
>>>>>>
>>>>>> Client 1 runs ALTER TABLE tb on coordinator A; it locks the local
>>> catalog data on A and waits for coordinator B.
>>>>>> Client 2 runs ALTER TABLE tb on coordinator B; it locks the local
>>> catalog data on B and waits for coordinator A.
>>>>>>
>>>>>>
>>>>>>
>>>>>> So how does XC handle this deadlock?
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> XC does not handle this; it will deadlock.
>>>>> Fortunately, the chance of concurrent DDL is much lower than the chance of
>>> concurrent replicated updates.
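
The two-client ALTER TABLE scenario quoted above forms a cycle in a wait-for graph. As a rough illustration only (the reply above says XC does not detect this case, and nothing here is XC code), a detector that somehow had lock information from every coordinator could look for such a cycle with a depth-first search. The function name `find_cycle` and the client labels are hypothetical.

```python
def find_cycle(wait_for):
    """Return one cycle in a wait-for graph as a list of nodes, or None.

    `wait_for` maps each waiter to the list of parties it is waiting on.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in wait_for.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path: a cycle closes here.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in list(wait_for):
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


if __name__ == "__main__":
    # Client 1 holds the catalog lock on A and waits for B (held by client 2),
    # while client 2 holds B and waits for A (held by client 1).
    print(find_cycle({"client1": ["client2"], "client2": ["client1"]}))
```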
>>>>>
>>>>>
>>>>>  
>>>>>
>>>>>>
>>>>>> 2013/4/6 Andrei Martsinchyk <and...@gm...>
>>>>>>
>>>>>> PRIMARY was introduced to avoid distributed deadlocks when updating
>>> replicated tables.
>>>>>>> To better understand the problem, imagine two transactions, A and
>>> B, updating the same tuple in a replicated table concurrently.
>>>>>>> Normally the coordinator sends the same commands to all datanodes at
>>> the same time, and if on some node A updates the tuple first, B will wait
>>> for the end of transaction A. If on another node B wins the race, both
>>> transactions will be waiting for each other. Such a deadlock is hard to detect,
>>> because information about locks is not sent across the network.
>>>>>>> But it is possible to avoid. The idea is to set one datanode as
>>> the primary, execute the distributed update on the primary node first, and go on to
>>> the other nodes only if the operation succeeds on the primary.
>>>>>>> With this approach, the row lock on the primary stops concurrent
>>> transactions from taking row locks on other nodes that could prevent command
>>> completion.
>>>>>>> So, to have this working properly you should
>>>>>>> 1) set only one datanode as the primary;
>>>>>>> 2) if you have multiple coordinators, set the same datanode
>>> as the primary on each of them.
>>>>>>> An obvious drawback of the approach is that it doubles the execution time of
>>> replicated updates.
>>>>>>> Note: "update" means any write access.
>>>>>>> Hope this answers 1)-3)
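
The difference between sending a replicated write to every datanode at once and sending it to the primary first can be shown with a toy model. This is a sketch under stated assumptions, not Postgres-XC code: each node is represented only by the order in which the two transactions reach it, the first arrival is assumed to hold the row lock, and the function names are hypothetical. The mutual-wait check covers only the two-transaction case described above.

```python
def wait_edges_parallel(arrivals):
    """Wait-for edges when every datanode receives the write at the same time.

    `arrivals` maps node name -> transactions in arrival order. The first
    arrival on a node holds the row lock; later arrivals wait for it.
    """
    edges = set()
    for order in arrivals.values():
        holder = order[0]
        for waiter in order[1:]:
            edges.add((waiter, holder))
    return edges


def wait_edges_primary_first(arrivals, primary):
    """Wait-for edges when the write must succeed on the primary first.

    Transactions that lose the race on the primary block there and never
    request the row lock on any other node, so only the primary matters.
    """
    order = arrivals[primary]
    holder = order[0]
    return {(waiter, holder) for waiter in order[1:]}


def has_mutual_wait(edges):
    """True if two transactions are each waiting for the other."""
    return any((holder, waiter) in edges for (waiter, holder) in edges)


if __name__ == "__main__":
    # A wins the race on data_east03, B wins it on data_east02.
    arrivals = {"data_east03": ["A", "B"], "data_east02": ["B", "A"]}
    print("parallel:", has_mutual_wait(wait_edges_parallel(arrivals)))
    print("primary first:",
          has_mutual_wait(wait_edges_primary_first(arrivals, "data_east03")))
```

In this model the parallel case produces both "B waits for A" and "A waits for B", while the primary-first case produces only the single wait on the primary.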
>>>>>>> Regarding 4), the query
>>>>>>>
>>>>>>>
>>>>>>> select nodeoids from pg_class, pgxc_class where pg_class.oid =
>>> pcrelid and relname = '<your table name>';
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> returns the list of nodes on which the specified table is
>>> distributed. I guess there are 7 of them.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2013/4/5 Paul Jones <pb...@cm...>
>>>>>>>
>>>>>>>
>>>>>>>> We are experimenting with an 8-datanode, 3-coordinator
>>> cluster and we
>>>>>>>> have some questions about the use of PRIMARY and a problem. 
>>>>>>>>
>>>>>>>> The manual explains what PRIMARY means but does not provide
>>> much detail
>>>>>>>> about when you would use it or not use it.
>>>>>>>>
>>>>>>>> 1) Can PRIMARY apply to coordinators and if so, when would
>>> you
>>>>>>>>    want it or not?
>>>>>>>>
>>>>>>>> 2) When would you use PRIMARY for datanodes or not, and
>>> would you
>>>>>>>>    ever want more than one datanode to be a primary?
>>>>>>>>
>>>>>>>> 3) Does a pgxc_node datanode entry on its own server have to
>>> be
>>>>>>>>    the FQDN server name or can it be 'localhost'?
>>>>>>>>   
>>>>>>>> 4) We have a table that is defined as DISTRIBUTE BY
>>> REPLICATION.
>>>>>>>>    It only loads on the first 7 nodes.  It will just not
>>> load on
>>>>>>>>    node 8.  There are a lot of FK references from other
>>> tables to it,
>>>>>>>>    but it itself only has a simple CHAR(11) PK, one
>>> constraint,
>>>>>>>>    and 3 indices.
>>>>>>>>   
>>>>>>>>    Has anyone seen anything like this before?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Paul Jones
>>>>>>>>
>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>> Minimize network downtime and maximize team effectiveness.
>>>>>>>> Reduce network management and security costs.Learn how to
>>> hire
>>>>>>>> the most talented Cisco Certified professionals. Visit the
>>>>>>>> Employer Resources Portal
>>>>>>>> http://www.cisco.com/web/learning/employer_resources/index.html
>>>>>>>> _______________________________________________
>>>>>>>> Postgres-xc-general mailing list
>>>>>>>> Pos...@li...
>>>>>>>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-general
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Andrei Martsinchyk
>>>>>>>
>>>>>>> StormDB - http://www.stormdb.com
>>>>>>> The Database Cloud
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Jov
>>>>>>
>>>>>> blog: http:amutu.com/blog
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Andrei Martsinchyk
>>>>>
>>>>> StormDB - http://www.stormdb.com
>>>>> The Database Cloud
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>
>-- 
>StormDB - http://www.stormdb.com
>The Database Cloud 
>
>
