postgres-xc-general Mailing List for Postgres-XC

I think I have discovered the problem with PRIMARY and our loads.  When a table that is
DISTRIBUTE BY REPLICATION is loaded with \copy, the rows do not reach the PRIMARY datanode,
although an INSERT populates it correctly.  We are on PGXC 1.0.2.  Here is a way to
demonstrate the problem:

# Demonstrate PRIMARY node \copy bug.

# 3 Coordinators
# 8 Datanodes

# Prepare some data:
$ awk 'BEGIN{for(i=1;i<=100000;i++) print i; }' > DATA

$ psql -h HOST242 -U postgres
postgres=# select * from pgxc_node;
   node_name    | node_type | node_port |     node_host       | nodeis_primary | nodeis_preferred |   node_id
----------------+-----------+-----------+---------------------+----------------+------------------+-------------
coordw         | C         |      5432 | localhost           | f              | f                |   670793242
coorde         | C         |      5432 | HOST225.example.com | f              | f                |   329164574
coordc         | C         |      5432 | HOST232.example.com | f              | f                | -1588937622
data_east01    | D         |     25432 | HOST225.example.com | f              | f                | -2053435448
data_east02    | D         |     25432 | HOST226.example.com | f              | f                |    94547764
data_east03    | D         |     25432 | HOST230.example.com | t              | f                |   197970754
data_central01 | D         |     25432 | HOST231.example.com | f              | f                |   124274836
data_central02 | D         |     25432 | HOST232.example.com | f              | f                |  1002175669
data_central03 | D         |     25432 | HOST238.example.com | f              | f                | -1150964881
data_west01    | D         |     25432 | HOST239.example.com | f              | f                |  2129529735
data_west02    | D         |     25432 | HOST242.example.com | f              | f                |  -717656524
(11 rows)
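
As a quick sanity check on this output against the advice quoted further down in this thread (exactly one datanode should be PRIMARY), here is a minimal Python sketch. The rows are copied by hand from the table above; the function name `primary_datanodes` is only an illustrative helper, not part of Postgres-XC.

```python
# (node_name, node_type, nodeis_primary) copied from the pgxc_node output above.
PGXC_NODE = [
    ("coordw", "C", False),
    ("coorde", "C", False),
    ("coordc", "C", False),
    ("data_east01", "D", False),
    ("data_east02", "D", False),
    ("data_east03", "D", True),
    ("data_central01", "D", False),
    ("data_central02", "D", False),
    ("data_central03", "D", False),
    ("data_west01", "D", False),
    ("data_west02", "D", False),
]


def primary_datanodes(rows):
    """Return the names of datanodes (node_type 'D') flagged as primary."""
    return [name for name, node_type, is_primary in rows
            if node_type == "D" and is_primary]


primaries = primary_datanodes(PGXC_NODE)
print(primaries)  # ['data_east03']
```

This only covers the view from one coordinator (HOST242). Checking that every coordinator agrees on the same primary would mean repeating the query on each of them.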

#################################################
# If we create a table distributed by replication
# and fill it with an insert, we see the same
# row counts from a coordinator and the PRIMARY
#################################################

$ psql -h HOST242 -U user
user=> create table foo (aaa int) distribute by replication;
CREATE TABLE
user=> insert into foo select generate_series(1,100000);
INSERT 0 100000
user=> select count(1) from foo;
count
--------
100000
(1 row)
user=> \q

$ psql -p 25432 -h HOST230 -U user
psql (9.2.4, server 9.1.7)

user=> select count(1) from foo;
count
--------
100000
(1 row)

user=> \q

###############################################
# Now truncate the table and load it with \copy
###############################################

$ psql -h HOST242 -U user
user=> truncate table foo;
TRUNCATE TABLE
user=> 
user=> \copy foo from DATA
user=> select count(1) from foo;
count
--------
100000
(1 row)

user=> \q

#############################################
# The count was correct from the coordinator,
# but not in the PRIMARY node.
#############################################

$ psql -p 25432 -h HOST230 -U user
psql (9.2.4, server 9.1.7)

user=> select count(1) from foo;
count
-------
     1
(1 row)

user=> \q

#######################################
# The count is correct in a non-PRIMARY
#######################################

$ psql -p 25432 -h HOST231 -U user
psql (9.2.4, server 9.1.7)

user=> select count(1) from foo;
count
--------
100000
(1 row)

user=> \q
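
The manual comparison above can be summarized with a small Python sketch that flags any datanode whose row count differs from the count seen through the coordinator. The counts are the ones reported in the sessions above; the helper name `find_divergent_nodes` is an assumption made for illustration, and in practice the counts would have to be collected by connecting to each datanode as shown.

```python
def find_divergent_nodes(counts, reference):
    """Return the node names whose row count differs from the reference count."""
    return sorted(name for name, n in counts.items() if n != reference)


# Row counts of foo after the \copy load, taken from the sessions above.
observed = {
    "data_east03": 1,          # PRIMARY datanode (HOST230)
    "data_central01": 100000,  # non-PRIMARY datanode (HOST231)
}
coordinator_count = 100000     # count reported through the coordinator on HOST242

divergent = find_divergent_nodes(observed, coordinator_count)
print(divergent)  # ['data_east03']
```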

----- Original Message -----
> From: Paul Jones <pb...@cm...>
> To: "pos...@li..." <pos...@li...>
> Cc: 
> Sent: Monday, April 8, 2013 2:58 PM
> Subject: Re: [Postgres-xc-general] Questions about PRIMARY and a problem
> 
> Thanks for everyone's explanation of PRIMARY.  It is much clearer now.
> 
> I believe, then, that we may have uncovered a bug in PRIMARY.
> 
> We created a new cluster (8 nodes, 3 coordinators), but with only one PRIMARY
> datanode.  The PRIMARY was declared the same on all 3 coordinators.
> 
> A table that was declared DISTRIBUTE BY REPLICATION and loaded by \copy did
> not have any rows present on the PRIMARY!  Further, other tables with FKs
> referring to this table had RI failures when they were loaded, even though
> there were complete copies of the table in all the other non-PRIMARY
> datanodes.
> 
> When we remade the cluster without any PRIMARY, this table loaded into all
> datanodes and there were no RI failures.
> 
> Is this a bug?  Unfortunately I won't be able to experiment with this until
> we finish executing our test plan, perhaps in a few days.
> 
> PJ
> 
> 
> 
> 
>> ________________________________
>>  From: Koichi Suzuki <koi...@gm...>
>> To: Andrei Martsinchyk <and...@gm...> 
>> Cc: "pos...@li..." <pos...@li...>
>> Sent: Sunday, April 7, 2013 9:23 AM
>> Subject: Re: [Postgres-xc-general] Questions about PRIMARY and a problem
>> 
>> 
>> The primary node is useful for keeping a replicated table in a consistent
>> state on all the datanodes.  All writes to a replicated table go to the
>> primary node first, so conflicts are resolved there, which prevents
>> conflicting writes on the other datanodes.  In this sense it may prevent some
>> deadlocks, but it does not remove the chance of deadlocks in general.
>> 
>> On the other hand, a preferred node (again a datanode) saves inter-server
>> communication when reading a replicated table.  It does not help maintain
>> replicated-table consistency, but it helps to gain some performance.
>> 
>> Regards;
>> ---
>> Koichi Suzuki
>> 
>> 
>> 
>> 
>> 
>> 2013/4/7 Andrei Martsinchyk <and...@gm...>
>> 
>> 
>>> 
>>> 
>>> 
>>> 
>>> 2013/4/7 Jov <am...@am...>
>>> 
>>>> Datanodes use the primary node to resolve writes to replicated tables,
>>>> which is good, but how does the coordinator solve the deadlock problem?
>>>> The coordinator nodes replicate all global catalog tables across
>>>> coordinators, so they are a kind of replicated table.
>>>> 
>>>> 
>>>> e.g.
>>>> 
>>>> Client 1 runs ALTER TABLE tb on coordinator node A; it locks the local
>>>> catalog data on A and waits for the other coordinator node B.
>>>> Client 2 runs ALTER TABLE tb on coordinator node B; it locks the local
>>>> catalog data on B and waits for the other coordinator node A.
>>>> 
>>>> 
>>>> 
>>>> So how does XC handle this deadlock?
>>>> 
>>>> 
>>> 
>>> 
>>> XC does not handle this; it will deadlock.
>>> Fortunately, the chance of concurrent DDL is much lower than the chance of
>>> concurrent replicated updates.
>>> 
>>> 
>>>  
>>> 
>>>> 
>>>> 2013/4/6 Andrei Martsinchyk <and...@gm...>
>>>> 
>>>>> PRIMARY was introduced to avoid distributed deadlocks when updating
>>>>> replicated tables.
>>>>> To better understand the problem, imagine two transactions, A and B,
>>>>> concurrently updating the same tuple in a replicated table.
>>>>> Normally the coordinator sends the same commands to all datanodes at the
>>>>> same time.  If A updates the tuple first on some node, B will wait there
>>>>> for the end of transaction A.  If B wins the race on another node, both
>>>>> transactions will be waiting for each other.  Such a deadlock is hard to
>>>>> detect because the information about locks is not sent across the network.
>>>>> But it is possible to avoid.  The idea is to set one datanode as the
>>>>> primary, execute a distributed update on the primary node first, and go on
>>>>> to the other nodes only if the operation succeeds on the primary.
>>>>> With this approach, the row lock on the primary stops concurrent
>>>>> transactions from taking row locks on other nodes that could prevent the
>>>>> command from completing.
>>>>> So, to have this working properly you should
>>>>> 1) set only one datanode as a primary;
>>>>> 2) if you have multiple coordinators, set the same datanode as the primary
>>>>>    on each of them.
>>>>> An obvious drawback of the approach is double the execution time for
>>>>> replicated updates.
>>>>> Note: "update" means any write access.
>>>>> Hope this answers 1)-3).
>>>>> Regarding 4), the query
>>>>> 
>>>>> 
>>>>> select nodeoids from pg_class, pgxc_class where pg_class.oid = pcrelid and relname = '<your table name>';
>>>>> 
>>>>> 
>>>>> 
>>>>> returns the list of nodes on which the specified table is distributed.
>>>>> I guess there are 7 of them.
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 2013/4/5 Paul Jones <pb...@cm...>
>>>>> 
>>>>> 
>>>>>> We are experimenting with an 8-datanode, 3-coordinator cluster and we
>>>>>> have some questions about the use of PRIMARY and a problem.
>>>>>> 
>>>>>> The manual explains what PRIMARY means but does not provide much detail
>>>>>> about when you would use it or not use it.
>>>>>> 
>>>>>> 1) Can PRIMARY apply to coordinators and if so, when would you
>>>>>>    want it or not?
>>>>>> 
>>>>>> 2) When would you use PRIMARY for datanodes or not, and would you
>>>>>>    ever want more than one datanode to be a primary?
>>>>>> 
>>>>>> 3) Does a pgxc_node datanode entry on its own server have to be
>>>>>>    the FQDN server name or can it be 'localhost'?
>>>>>>   
>>>>>> 4) We have a table that is defined as DISTRIBUTE BY REPLICATION.
>>>>>>    It only loads on the first 7 nodes.  It will just not load on
>>>>>>    node 8.  There are a lot of FK references from other tables to it,
>>>>>>    but it itself only has a simple CHAR(11) PK, one constraint,
>>>>>>    and 3 indices.
>>>>>>   
>>>>>>    Has anyone seen anything like this before?
>>>>>> 
>>>>>> Thanks,
>>>>>> Paul Jones
>>>>>> 
>>>>>> ------------------------------------------------------------------------------
>>>>>> Minimize network downtime and maximize team effectiveness.
>>>>>> Reduce network management and security costs.Learn how to 
> hire
>>>>>> the most talented Cisco Certified professionals. Visit the
>>>>>> Employer Resources Portal
>>>>>> http://www.cisco.com/web/learning/employer_resources/index.html
>>>>>> _______________________________________________
>>>>>> Postgres-xc-general mailing list
>>>>>> Pos...@li...
>>>>>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-general
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> -- 
>>>>> Andrei Martsinchyk
>>>>> 
>>>>> StormDB - http://www.stormdb.com
>>>>> The Database Cloud
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> 
>>>> -- 
>>>> Jov
>>>> 
>>>> blog: http://amutu.com/blog
>>> 
>>> 
>>> 
>>> 
>>> -- 
>>> Andrei Martsinchyk
>>> 
>>> StormDB - http://www.stormdb.com
>>> The Database Cloud
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
>> 
>> 
>> 
> 
> 
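
To make the deadlock argument in Andrei Martsinchyk's quoted explanation concrete, here is a minimal Python sketch that models two transactions racing for the same row lock on two datanodes. It is a toy model under stated assumptions, not Postgres-XC code: the node names `dn1` and `dn2`, the function names, and the idea of passing in a per-node arrival order are all made up for illustration, and the deadlock check only covers the two-transaction case described in the thread.

```python
def lock_holders(arrival_order, primary=None):
    """Map each node to the transaction that obtains the row lock there."""
    if primary is None:
        # Commands reach all nodes at once: the first arrival on each node wins.
        return {node: order[0] for node, order in arrival_order.items()}
    # Primary-first: only the transaction that wins on the primary goes on
    # to the other nodes, so it ends up holding the lock everywhere.
    winner = arrival_order[primary][0]
    return {node: winner for node in arrival_order}


def wait_for_graph(arrival_order, holders):
    """Edges (t, h): transaction t is blocked on a lock held by transaction h."""
    edges = set()
    for node, order in arrival_order.items():
        for txn in order:
            if txn != holders[node]:
                edges.add((txn, holders[node]))
    return edges


def has_deadlock(edges):
    """Two-transaction case: deadlock if two transactions wait on each other."""
    return any((b, a) in edges for a, b in edges)


# A reaches dn1 first, B reaches dn2 first.
arrivals = {"dn1": ["A", "B"], "dn2": ["B", "A"]}

without_primary = has_deadlock(wait_for_graph(arrivals, lock_holders(arrivals)))
with_primary = has_deadlock(
    wait_for_graph(arrivals, lock_holders(arrivals, primary="dn1")))
print(without_primary, with_primary)  # True False
```

With no primary, A holds the lock on `dn1` and B holds it on `dn2`, so each waits for the other. With `dn1` as primary, B simply waits for A on the primary and holds nothing elsewhere, which is why the wait-for graph has no cycle.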
