From: Koichi S. <koi...@gm...> - 2013-04-07 14:32:10
Yes, we're planning 1.1 by the end of this June, with 9.2.4. I didn't have a plan to make a 1.0.3 with 9.1.9; if there is enough demand, we may have to consider having it.

Regards;
----------
Koichi Suzuki

2013/4/5 seikath <se...@gm...>
> Hello guys,
>
> PostgreSQL released a patch for
> http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2013-1899
>
> More info here:
> http://www.theregister.co.uk/2013/04/04/postgresql_urgent_patch/
>
> Should we expect a new release of PGXC?
>
> Kind regards,
>
> Ivan
From: Koichi S. <koi...@gm...> - 2013-04-07 14:27:51
Automatic monitoring and failover have not been provided yet. As Michael suggested, yes, it's up to you at present. Sakata-san at NTT announced that they're developing an XC RA (resource agent) for Pacemaker/Heartbeat (I hope it runs with Corosync too). Failover steps can be found in pgxc_ctl (bash version); I do hope it helps people build their own failover systems. I'm about to commit the C version of pgxc_ctl, which is much faster and more flexible than the bash version.

Regards;
----------
Koichi Suzuki

2013/4/6 Michael Paquier <mic...@gm...>
> On Sat, Apr 6, 2013 at 4:34 PM, Theodotos Andreou <th...@ub...> wrote:
>> 2) Should there be an automatic failover mechanism when one of the
>> machines fails? What do you recommend? (heartbeat/pacemaker, keepalived,
>> monit, ...)
>
> This is really up to you. Heartbeat would be fine, even Corosync/RDBMS.
> Why not use the one you are most familiar with?
From: Koichi S. <koi...@gm...> - 2013-04-07 14:23:32
The primary node is useful for keeping a replicated table in a consistent state on all the datanodes. All writes to a replicated table go first to the primary node, so conflicts are resolved there, which prevents conflicting writes on the other datanodes. In this sense it may prevent some deadlocks, but it does not remove the chance of deadlocks in the general sense.

On the other hand, a preferred node (again a datanode) saves inter-server communication when reading a replicated table. It does nothing to maintain replicated-table consistency, but it helps to gain some performance.

Regards;
---
Koichi Suzuki

2013/4/7 Andrei Martsinchyk <and...@gm...>
> XC does not handle this; it will be deadlocked. Fortunately, the chance of
> concurrent DDL is much lower than the chance of concurrent replicated updates.
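A minimal sketch of how the primary and preferred datanodes described above might be declared on a coordinator. The node names, hosts, and ports are placeholders; CREATE NODE, pgxc_node, and pgxc_pool_reload() are standard Postgres-XC facilities, but check the documentation of your release before reusing this.

    -- Run on each coordinator. Only one datanode should carry PRIMARY, and it
    -- must be the same datanode on every coordinator; PREFERRED is usually the
    -- datanode local to that coordinator, to save network reads of replicated tables.
    CREATE NODE dn1 WITH (TYPE = 'datanode', HOST = 'host1', PORT = 15432, PRIMARY, PREFERRED);
    CREATE NODE dn2 WITH (TYPE = 'datanode', HOST = 'host2', PORT = 15432);

    -- Verify the flags and make the coordinator pick up the changes.
    SELECT node_name, nodeis_primary, nodeis_preferred FROM pgxc_node;
    SELECT pgxc_pool_reload();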
From: Andrei M. <and...@gm...> - 2013-04-07 06:58:50
2013/4/7 Jov <am...@am...>
> Datanodes use the primary node to serialize replicated-table writes, which is
> good, but how do coordinators solve the deadlock problem? The coordinators
> replicate all global catalog tables across coordinators, so they are a kind of
> replicated table.
>
> e.g.
> Client 1 runs ALTER TABLE tb on coordinator A; it locks the local catalog
> data on A and waits for coordinator B.
> Client 2 runs ALTER TABLE tb on coordinator B; it locks the local catalog
> data on B and waits for coordinator A.
>
> So how does XC handle this deadlock?

XC does not handle this; it will be deadlocked. Fortunately, the chance of concurrent DDL is much lower than the chance of concurrent replicated updates.

--
Andrei Martsinchyk

StormDB - http://www.stormdb.com
The Database Cloud
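The cross-coordinator DDL deadlock discussed above can be pictured as two interleaved sessions; this is only an illustration of the failure mode, with a hypothetical table name.

    -- Session 1, connected to coordinator A:
    BEGIN;
    ALTER TABLE tb ADD COLUMN c1 int;   -- locks the catalog on A, then waits for B

    -- Session 2, connected to coordinator B, started at about the same time:
    BEGIN;
    ALTER TABLE tb ADD COLUMN c2 int;   -- locks the catalog on B, then waits for A

    -- Neither statement can complete; per the answer above, XC does not detect
    -- this distributed deadlock. Issuing DDL through a single coordinator at a
    -- time avoids it.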
From: Jov <am...@am...> - 2013-04-07 05:50:37
Datanodes use the primary node to serialize replicated-table writes, which is good, but how do coordinators solve the deadlock problem? The coordinators replicate all global catalog tables across coordinators, so they are a kind of replicated table.

e.g.
Client 1 runs ALTER TABLE tb on coordinator A; it locks the local catalog data on A and waits for coordinator B.
Client 2 runs ALTER TABLE tb on coordinator B; it locks the local catalog data on B and waits for coordinator A.

So how does XC handle this deadlock?

2013/4/6 Andrei Martsinchyk <and...@gm...>
> PRIMARY was introduced to avoid distributed deadlocks when updating
> replicated tables.
> To better understand the problem, imagine two transactions A and B are
> updating the same tuple in a replicated table concurrently.
> Normally the coordinator sends the same commands to all datanodes at the
> same time, and if on some node A updates the tuple first, B will be waiting
> for the end of transaction A. If on another node B wins the race, both
> transactions will be waiting for each other. It is hard to detect such a
> deadlock, as the information about locks is not sent across the network.
> But it is possible to avoid. The idea is to set one datanode as the
> primary, execute a distributed update on the primary node first, and go on
> with the other nodes only if the operation succeeds on the primary.
> With this approach, the row lock on the primary stops concurrent
> transactions from taking row locks on other nodes that could prevent
> command completion.
> So, to have this working properly you should:
> 1) set only one datanode as the primary;
> 2) if you have multiple coordinators, set the same datanode as the
> primary on each of them.
> The obvious drawback of the approach is the doubled execution time of
> replicated updates.
> Note: "update" means any write access.
> Hope this answers 1)-3).
> Regarding 4), the query
>
> select nodeoids from pg_class, pgxc_class where pg_class.oid = pcrelid
> and relname = '<your table name>';
>
> returns the list of nodes where the specified table is distributed. I
> guess there are 7 of them.
>
> 2013/4/5 Paul Jones <pb...@cm...>
>> We are experimenting with an 8-datanode, 3-coordinator cluster and we
>> have some questions about the use of PRIMARY and a problem.
>>
>> The manual explains what PRIMARY means but does not provide much detail
>> about when you would use it or not.
>>
>> 1) Can PRIMARY apply to coordinators and if so, when would you
>> want it or not?
>>
>> 2) When would you use PRIMARY for datanodes or not, and would you
>> ever want more than one datanode to be a primary?
>>
>> 3) Does a pgxc_node datanode entry on its own server have to be
>> the FQDN server name, or can it be 'localhost'?
>>
>> 4) We have a table that is defined as DISTRIBUTE BY REPLICATION.
>> It only loads on the first 7 nodes. It will just not load on
>> node 8. There are a lot of FK references from other tables to it,
>> but it itself only has a simple CHAR(11) PK, one constraint,
>> and 3 indices.
>>
>> Has anyone seen anything like this before?
>>
>> Thanks,
>> Paul Jones
>
> --
> Andrei Martsinchyk
>
> StormDB - http://www.stormdb.com
> The Database Cloud

--
Jov
blog: http://amutu.com/blog
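For the symptom in question 4) above (a replicated table that will not load on one datanode), the catalog query Andrei gives can be paired with EXECUTE DIRECT to spot-check individual nodes from a coordinator. The table and node names below are placeholders; EXECUTE DIRECT is a Postgres-XC statement, so confirm the node names against your own pgxc_node catalog.

    -- Which nodes hold the table, according to the coordinator's catalog?
    SELECT nodeoids
      FROM pg_class, pgxc_class
     WHERE pg_class.oid = pcrelid
       AND relname = 'mytable';

    -- Map those OIDs back to node names.
    SELECT oid, node_name, node_type FROM pgxc_node;

    -- Spot-check the suspect datanode directly.
    EXECUTE DIRECT ON (dn8) 'SELECT count(*) FROM mytable';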
From: Abbas B. <abb...@en...> - 2013-04-07 05:45:46
On Sun, Apr 7, 2013 at 8:25 AM, Theodotos Andreou <th...@ub...> wrote:
> If I use pg_dump then the resulting schema that comes from "traditional"
> PostgreSQL will not have any "DISTRIBUTE BY" clause. According to the manual:
>
> "If DISTRIBUTE BY is not specified, columns with UNIQUE constraint will
> be chosen as the distribution key. If no such column is specified,
> distribution column is the first eligible column in the definition. If no
> such column is found, then the table will be distributed by ROUND ROBIN."
>
> So the default behavior for "traditional" CREATE TABLE statements is
> ROUND ROBIN, right?

No. If the user does not specify any DISTRIBUTE BY clause, the system first tries to distribute the table by hash, and for that it uses the first column that is hash-distributable. Many common types (int, float, text, etc.) can be taken as hash-distributable. However, if the table does not contain any column that can be hash-distributed, the system falls back to ROUND ROBIN.

> Makes sense to give more shared buffers to datanodes. My application
> already uses "traditional" PostgreSQL and we would like not to have to
> change its code. How can you "push down sort operations to Datanodes"?
> Do you change the code in the application or within postgres-xc?

Within postgres-xc: when the coordinator receives the query it passes it through the postgres-xc planner, which decides whether the query can be directly shipped to datanodes or not. If not, the query then passes through the standard postgres planner.

--
Abbas
Architect

EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb
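The default-distribution behaviour described above can be verified from a coordinator. This is a small sketch with a hypothetical table, assuming the usual pgxc_class locator codes ('H' = hash, 'R' = replicated, 'N' = round robin); check the catalog definition in your release.

    -- No DISTRIBUTE BY clause: XC should pick the first hash-distributable
    -- column (id here), so the table comes out hash-distributed, not round robin.
    CREATE TABLE t_default (id int, note text);

    -- pclocatortype shows the chosen strategy, pcattnum the distribution column.
    SELECT pclocatortype, pcattnum
      FROM pgxc_class
     WHERE pcrelid = 't_default'::regclass;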
From: Theodotos A. <th...@ub...> - 2013-04-07 03:26:13
On 04/06/2013 04:00 PM, Michael Paquier wrote:
> On Sat, Apr 6, 2013 at 4:34 PM, Theodotos Andreou <th...@ub...> wrote:
>> Guys hi,
>>
>> I finally managed, with the help of pgxc_ctl, to set up an HA postgres-xc
>> cluster. I have 4 nodes (coord/datanode) and 2 GTMs (active/failover) as
>> per the default setup of pgxc_ctl.
>>
>> Now I have some questions to get me started.
>>
>> 1) When using the standard CREATE TABLE sql commands, are the tables
>> distributed or replicated? I intend to copy the db schema from my live
>> postgresql system and I want to know what the default behavior is.
>
> You can specify the distribution type by using the DISTRIBUTE BY clause of
> CREATE TABLE.
> The TO NODE/GROUP clause can be used to specify a list of nodes where data
> is distributed.
> Documentation is your friend:
> http://postgres-xc.sourceforge.net/docs/1_0_2/sql-createtable.html

If I use pg_dump then the resulting schema that comes from "traditional" postgresql will not have any "DISTRIBUTE BY" clause. According to the manual:

"If DISTRIBUTE BY is not specified, columns with UNIQUE constraint will be chosen as the distribution key. If no such column is specified, distribution column is the first eligible column in the definition. If no such column is found, then the table will be distributed by ROUND ROBIN."

So the default behavior for "traditional" CREATE TABLE statements is ROUND ROBIN, right?

>> 2) Should there be an automatic failover mechanism when one of the
>> machines fails? What do you recommend? (heartbeat/pacemaker, keepalived,
>> monit, ...)
>
> This is really up to you. Heartbeat would be fine, even Corosync/RDBMS.
> Why not use the one you are most familiar with?

I have experience with heartbeat and monit (not really an HA mechanism, but with scripts you can do almost everything). But I would feel more comfortable if I saw a HOWTO from someone who has already implemented automatic failover in production.

This presentation:
http://wiki.postgresql.org/images/7/7a/Postgres-xc-sharednothing-pgopen2012.pdf
says (page 34) that "Linux-HA Japan team is actively working for the above" (HA resource agents for pacemaker). Is there any progress on that? Some code, even alpha/unstable?

>> 3) The default setup creates 4 coordinators: node1:20004, node2:20005,
>> node3:20004, node4:20005. Is it better to use a TCP/IP load balancer to
>> accept traffic at port 5432 and load balance to the 4 nodes, or to use an
>> application load balancer instead (pgpool, pgbouncer, ...)?
>
> I recall that an instance of pgbouncer can only pool connections to 1
> node. So pgpool?

Do you see any advantages/disadvantages between a TCP/IP load balancer (e.g. keepalived/IPVS) and an application-level load balancer like pgpool? Which one do you use?

>> 4) How about optimizations? Shared memory and such? Should it be split
>> by 4 since there are 4 instances on each server?
>
> Those are huge questions. First, you should set up a Datanode as you
> would set up a normal PG server.
>
> For Coordinators, shared_buffers can be reduced to a quantity sufficient to
> cache the catalog data, as a Coordinator doesn't hold any relations. Depending
> on your application, you might for example not need a high value of
> work_mem if you are able to push down sort operations to the Datanodes.

Makes sense to give more shared buffers to datanodes. My application already uses "traditional" postgresql and we would like not to have to make code changes to it. How can you "push down sort operations to Datanodes"? Change the code in the application or within postgres-xc?

> --
> Michael

Thanks for your reply Michael! Please excuse me if some of the questions appear silly. I am a humble sysadmin, and even though I have set up databases like mysql and postgres, I have little experience with the "internals" of databases. Postgres-xc is an even more complicated system than a traditional Linux RDBMS and I want to clear up my confusion a bit.
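Since a plain pg_dump schema carries no DISTRIBUTE BY clauses, one practical approach is to add them to the dump by hand and then use EXPLAIN VERBOSE on a coordinator to see how much of a query is shipped to the datanodes. Table and column names here are hypothetical, and the exact plan output depends on the Postgres-XC version.

    -- Small reference tables: replicate to every datanode.
    CREATE TABLE country (code char(2) PRIMARY KEY, name text)
        DISTRIBUTE BY REPLICATION;

    -- Large tables: hash-distribute on a frequently joined key.
    CREATE TABLE orders (order_id bigint PRIMARY KEY, country_code char(2), amount numeric)
        DISTRIBUTE BY HASH (order_id);

    -- If the ORDER BY / aggregation appears inside the remote query sent to the
    -- datanodes, that work has been pushed down rather than done on the coordinator.
    EXPLAIN VERBOSE
    SELECT country_code, sum(amount)
      FROM orders
     GROUP BY country_code
     ORDER BY country_code;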