From: Mason S. <ma...@st...> - 2012-07-31 16:02:45

On Tue, Jul 31, 2012 at 8:25 AM, Vladimir Stavrinov <vst...@gm...> wrote:
> On Tue, Jul 31, 2012 at 08:11:21AM -0400, Mason Sharp wrote:
>> If a data node has one or more sync rep standbys, it should be
>> theoretically possible to read balance those if that intelligence is
>
> "sync rep" means no "write balance"

I think you misunderstood. Tables can be either distributed or replicated across the database segments. Each segment in turn can have multiple synchronous replicas, similar to PostgreSQL's synchronous replication.

So, for your large write-heavy tables, you should distribute them amongst multiple nodes, gaining write scalability. The overhead and added latency of having replicas of each database segment is relatively small, so you need not think of that as preventing "write balance", as you say.

For tables where you want read scalability, you would want to replicate those. This is at the table level, not the database segment level. The coordinator will read balance those today.

If you also want read balancing for distributed (non-replicated) tables across the database segment replicas, that has not yet been implemented, but it is definitely doable (if your company would like to sponsor such a change, we are happy to implement it). Supporting it would involve changing the coordinator code so that it knows about database segment replicas. Up until now the project has focused on the challenges of the core database and not so much on things outside of it, like HA.

I hope that helps.

Regards,
Mason

--
Mason Sharp
StormDB - http://www.stormdb.com
The Database Cloud
Also Offering Postgres-XC Support and Services

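
To make the distinction concrete, here is a minimal sketch of the two table styles Mason describes. The table and column names are illustrative only; the DISTRIBUTE BY clause is the Postgres-XC 1.0 CREATE TABLE syntax discussed later in this thread.

    -- Write-heavy data: spread rows across datanodes by hashing one column,
    -- so inserts and updates scale with the number of datanodes.
    CREATE TABLE orders (
        order_id    bigint,
        customer_id int,
        amount      numeric
    ) DISTRIBUTE BY HASH (customer_id);

    -- Read-mostly reference data: keep a full copy on every datanode so the
    -- coordinator can read-balance and joins against it stay local.
    CREATE TABLE countries (
        code char(2) PRIMARY KEY,
        name text
    ) DISTRIBUTE BY REPLICATION;
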
From: Mason S. <ma...@st...> - 2012-07-31 12:29:35

On Mon, Jul 30, 2012 at 7:49 PM, Michael Paquier <mic...@gm...> wrote:
>> I'd like to repeat this truth alone and together with You, but what about
>> rac?
>
> All the nodes in rac are replicated.
> It provides good read scalability but sucks when write is involved.
>
>> What I mean exactly is to remove "DISTRIBUTE BY" from "CREATE TABLE"
>> statement and add distribution type to data node property as a field to
>> pgxc_node table, for example.
>
> Thanks for this precision.
> There are several cons against that:
> - it is not possible to define a distribution key based on a column
> - it is not possible to define range partitioning, column partitioning
> - the list of nodes is still needed in CREATE TABLE
> - PostgreSQL supports basic table partitioning with DDL =>
> http://www.postgresql.org/docs/9.1/static/ddl-partitioning.html. We have
> been thinking for a long time now about moving our partitioning code deeper
> into postgres and extending it for range and column partitioning. A
> node-based method would definitely close the door on such an extension
> integrated with postgres.

Vladimir, just agreeing with Michael here: the points above are important for performance.

> On Tue, Jul 31, 2012 at 1:49 AM, Vladimir Stavrinov <vst...@gm...> wrote:
>> On Mon, Jul 30, 2012 at 6:35 PM, Vladimir Stavrinov <vst...@gm...> wrote:
>>
>>> I see only one way to provide read & write LB simultaneously: if replication
>>> is done asynchronously in a background process. This way we would have a
>>> distributed database as the main stock, supplemented with a number of
>>> replicated nodes containing the complete data. In such a system read requests
>>> should go to replicated nodes only when they are up to date (at least for
>>> the requested data). Asynchronous updates in such an architecture would
>>> support LB for write requests to distributed nodes, which remain synchronous.
>
> It is written in the XC definition that it is a synchronous multi-master.
> Doing that in an asynchronous way would break that, and also this way you
> cannot guarantee at 100% that the data replicated on one node will be there
> on the other nodes.
> This is an extremely important characteristic of mission-critical
> applications.
>
>> Such an architecture allows creating a totally automated and complete LB & HA
>> cluster without any third-party helpers. If one of the distributed
>> (shard) nodes fails, it should be automatically replaced (failover)
>> with one of the up-to-date replicated nodes.
>
> This can also be achieved with postgres streaming replication, naturally
> available in XC.
> --
> Michael Paquier
> http://michael.otacoo.com

--
Mason Sharp
StormDB - http://www.stormdb.com
The Database Cloud

From: Vladimir S. <vst...@gm...> - 2012-07-31 12:25:48

On Tue, Jul 31, 2012 at 08:11:21AM -0400, Mason Sharp wrote:
> If a data node has one or more sync rep standbys, it should be
> theoretically possible to read balance those if that intelligence is

"sync rep" means no "write balance"

--
***************************
##  Vladimir Stavrinov
##  vst...@gm...
***************************

From: Mason S. <ma...@st...> - 2012-07-31 12:18:27

On Tue, Jul 31, 2012 at 5:19 AM, Vladimir Stavrinov <vst...@gm...> wrote:
> On Tue, Jul 31, 2012 at 05:35:45PM +0900, Michael Paquier wrote:
>> The main problem that I see here is that replicating data
>> asynchronously breaks MVCC.
>
> May I cite myself?
>
> When a read request comes in, it should go to a replicated node if and only if
> the requested data exists there; otherwise such a request should go to the
> distributed node where the data in question exists in any case.
>
>> So you are never sure that the data will be here or not on your
>> background nodes.
>
> If we control where the data is stored on distributed nodes, why not control
> the state of replicated nodes? In both cases we should know what data is where.

If a data node has one or more sync rep standbys, it should be theoretically possible to read balance those if that intelligence is added to the coordinator. It would not matter whether the data is in a "distributed" or a "replicated" table. If it were asynchronous, there would be more tracking to do in order to know whether it is safe to load balance.

In some tests we have done at StormDB, the extra overhead for sync rep is small, so you might as well use sync rep. With multiple replicas, in sync rep mode the cluster will keep working even if there is a failure on one of the replicas.

>> Ideas are of course always welcome, but if you want to add some new
>> features you will need to be more specific.
>
> I don't think what we are discussing here is simply a feature that may be added
> with a patch. The idea of moving storage control to the cluster level touches
> the basics and concept of XC.

Yeah, that sounds different from what is done here. I am not sure I understand what your requirements are and what exactly you need. If it is about HA, there are a lot of basics in place to build HA from that will probably meet your needs.

--
Mason Sharp
StormDB - http://www.stormdb.com
The Database Cloud

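
As a side note for readers of this thread, the sync rep state Mason mentions can be checked directly on the datanode acting as primary. This assumes a standby has already been registered in synchronous_standby_names; the view below is standard PostgreSQL and is also available on an XC datanode.

    -- sync_state should read 'sync' for a synchronous standby;
    -- 'async' means commits do not wait for it.
    SELECT application_name, state, sync_state
    FROM pg_stat_replication;
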
From: Vladimir S. <vst...@gm...> - 2012-07-31 10:33:23

On Mon, Jul 30, 2012 at 10:21:06PM +0900, Michael Paquier wrote:
>> is real case)? And as I see, to do the opposite operation, i.e. adding a
>> data node, we need to use this CREATE/DROP/RENAME TABLE technique again?
>> It doesn't look like HA.
>
> In 1.0, yes. And this is only necessary for hash/modulo/round robin tables.

No, it is necessary for replicated tables too. Moreover, in summary, it is necessary in any case of adding or removing nodes, with any type of table, replicated or distributed. This is an experimental fact. The reason is that you cannot use the ALTER TABLE statement to change the node list. The result is that if a single node fails, a cluster of any configuration stops working. In the case of a replicated node you can still do SELECT, but not INSERT or UPDATE. And you can't simply remove the failed node in this case either. Eventually, any change in the number of nodes leads to recreating the whole cluster from scratch: you have to drop the databases and restore them from backup. That is the reality now. Sorry.

--
***************************
##  Vladimir Stavrinov
##  vst...@gm...
***************************

From: Vladimir S. <vst...@gm...> - 2012-07-31 09:19:34

On Tue, Jul 31, 2012 at 05:35:45PM +0900, Michael Paquier wrote:
> The main problem that I see here is that replicating data
> asynchronously breaks MVCC.

May I cite myself?

When a read request comes in, it should go to a replicated node if and only if the requested data exists there; otherwise such a request should go to the distributed node where the data in question exists in any case.

> So you are never sure that the data will be here or not on your
> background nodes.

If we control where the data is stored on distributed nodes, why not control the state of replicated nodes? In both cases we should know what data is where.

> Ideas are of course always welcome, but if you want to add some new
> features you will need to be more specific.

I don't think what we are discussing here is simply a feature that may be added with a patch. The idea of moving storage control to the cluster level touches the basics and concept of XC.

--
***************************
##  Vladimir Stavrinov
##  vst...@gm...
***************************

From: Michael P. <mic...@gm...> - 2012-07-31 08:35:57

On Tue, Jul 31, 2012 at 5:19 PM, Vladimir Stavrinov <vst...@gm...> wrote:
>> All the nodes in rac are replicated.
>
> Is the same true for mysql cluster? Would You like to say that only XC
> is write scalable?

Sorry, I didn't use a correct formulation. mysql can perform write scalability; it uses the same basics as XC. I was thinking about RAC and postgreSQL-based clusters only.

>> There are several cons against that:
>> - it is not possible to define a distribution key based on a column
>
> I believe some other methods to decide where to store new
> incoming data exist or may be created. At least round-robin. Another
> is based on LB criteria: you choose the node under least load.

For round-robin, this is definitely true, and a feature like this would be welcome in XC. This is not really dependent on whether the distribution strategy is decided at node level or table level.

>> - it is not possible to define range partitioning, column partitioning
>
> Is it so necessary for cluster solutions with distributed databases?

A user might be able to push data of one column or another to a node. That might be helpful for security reasons. But there is no rush in implementing such things; I am just mentioning that your solution definitely closes the door on possible extensions.

>> It is written in the XC definition that it is a synchronous multi-master.
>> Doing that in an asynchronous way would break that, and also this way you
>
> No! You didn't read carefully what I wrote. We have classic
> distributed XC as the core of our system. It contains all the complete data at
> every moment and it is a write-scalable synchronous multi-master as usual.
> But then we can supplement it with extra replicated nodes that
> will be updated asynchronously in a low-priority background process in
> order to keep the cluster write scalable. When a read request comes
> in, it should go to a replicated node if and only if the requested data
> exists there; otherwise such a request should go to the distributed node
> where the data in question exists in any case.

The main problem that I see here is that replicating data asynchronously breaks MVCC. So you are never sure that the data will be here or not on your background nodes.

>>> Such an architecture allows creating a totally automated and complete LB & HA
>>> cluster without any third-party helpers. If one of the distributed
>>> (shard) nodes fails, it should be automatically replaced (failover)
>>> with one of the up-to-date replicated nodes.
>>
>> This can also be achieved with postgres streaming replication, naturally
>> available in XC.
>
> Certainly You mean a postgres standby server as a method of duplicating a
> distributed node. We have already discussed this topic: it is one of a
> number of external HA solutions. But I wrote something else above.
> I mean here that an existing replicated node, which currently serves read
> requests from the application, can take over the role of any distributed
> node in case it fails. And I suppose this failover procedure should be
> automated, started on the event of failure and executed in real time.
>
> OK, I see all I wrote here in this thread is far from the current XC state
> as well as from Your thoughts at all. So, You may consider all this as
> my unreachable dreams.

Dreams come true. They just need to be worked on. I would advise you to propose a feature design, then patches, and not only general ideas. Ideas are of course always welcome, but if you want to add some new features you will need to be more specific.

--
Michael Paquier
http://michael.otacoo.com

From: Vladimir S. <vst...@gm...> - 2012-07-31 08:19:53

> All the nodes in rac are replicated.

Is the same true for mysql cluster? Would You like to say that only XC is write scalable?

> There are several cons against that:
> - it is not possible to define a distribution key based on a column

I believe some other methods to decide where to store new incoming data exist or may be created. At least round-robin. Another is based on LB criteria: you choose the node under least load.

> - it is not possible to define range partitioning, column partitioning

Is it so necessary for cluster solutions with distributed databases?

> - the list of nodes is still needed in CREATE TABLE

In this case, when we need to add a new data node, we have to apply the CREATE/DROP/RENAME technique to every distributed table. But this is almost equivalent to creating the cluster from scratch. Indeed, it is better to create a dump, drop the database and restore it from backup. So it looks like XC is not XC, i.e. is not extensible. That is why I think all storage control should be moved to the cluster level.

> It is written in the XC definition that it is a synchronous multi-master.
> Doing that in an asynchronous way would break that, and also this way you

No! You didn't read carefully what I wrote. We have classic distributed XC as the core of our system. It contains all the complete data at every moment and it is a write-scalable synchronous multi-master as usual. But then we can supplement it with extra replicated nodes that will be updated asynchronously in a low-priority background process in order to keep the cluster write scalable. When a read request comes in, it should go to a replicated node if and only if the requested data exists there; otherwise such a request should go to the distributed node where the data in question exists in any case.

>> Such an architecture allows creating a totally automated and complete LB & HA
>> cluster without any third-party helpers. If one of the distributed
>> (shard) nodes fails, it should be automatically replaced (failover)
>> with one of the up-to-date replicated nodes.
>
> This can also be achieved with postgres streaming replication, naturally
> available in XC.

Certainly You mean a postgres standby server as a method of duplicating a distributed node. We have already discussed this topic: it is one of a number of external HA solutions. But I wrote something else above. I mean here that an existing replicated node, which currently serves read requests from the application, can take over the role of any distributed node in case it fails. And I suppose this failover procedure should be automated, started on the event of failure and executed in real time.

OK, I see all I wrote here in this thread is far from the current XC state as well as from Your thoughts at all. So, You may consider all this as my unreachable dreams.

From: Michael P. <mic...@gm...> - 2012-07-30 23:49:56

> I'd like to repeat this truth alone and together with You, but what about
> rac?

All the nodes in rac are replicated. It provides good read scalability but sucks when write is involved.

> What I mean exactly is to remove "DISTRIBUTE BY" from "CREATE TABLE"
> statement and add distribution type to data node property as a field to
> pgxc_node table, for example.

Thanks for this precision. There are several cons against that:
- it is not possible to define a distribution key based on a column
- it is not possible to define range partitioning, column partitioning
- the list of nodes is still needed in CREATE TABLE
- PostgreSQL supports basic table partitioning with DDL => http://www.postgresql.org/docs/9.1/static/ddl-partitioning.html. We have been thinking for a long time now about moving our partitioning code deeper into postgres and extending it for range and column partitioning. A node-based method would definitely close the door on such an extension integrated with postgres.

On Tue, Jul 31, 2012 at 1:49 AM, Vladimir Stavrinov <vst...@gm...> wrote:
> On Mon, Jul 30, 2012 at 6:35 PM, Vladimir Stavrinov <vst...@gm...> wrote:
>
>> I see only one way to provide read & write LB simultaneously: if replication
>> is done asynchronously in a background process. This way we would have a
>> distributed database as the main stock, supplemented with a number of
>> replicated nodes containing the complete data. In such a system read requests
>> should go to replicated nodes only when they are up to date (at least for
>> the requested data). Asynchronous updates in such an architecture would
>> support LB for write requests to distributed nodes, which remain synchronous.

It is written in the XC definition that it is a synchronous multi-master. Doing that in an asynchronous way would break that, and also this way you cannot guarantee at 100% that the data replicated on one node will be there on the other nodes. This is an extremely important characteristic of mission-critical applications.

> Such an architecture allows creating a totally automated and complete LB & HA
> cluster without any third-party helpers. If one of the distributed
> (shard) nodes fails, it should be automatically replaced (failover)
> with one of the up-to-date replicated nodes.

This can also be achieved with postgres streaming replication, naturally available in XC.

--
Michael Paquier
http://michael.otacoo.com

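
For readers following the partitioning link above, here is a minimal sketch of what that DDL-level partitioning looks like in stock PostgreSQL 9.1, i.e. inheritance plus CHECK constraints (table and column names are illustrative):

    -- Parent table holds no data of its own.
    CREATE TABLE measurements (
        city_id  int NOT NULL,
        logdate  date NOT NULL,
        peaktemp int
    );

    -- One child table per range; the CHECK constraint lets the planner skip
    -- non-matching children when constraint_exclusion is enabled.
    CREATE TABLE measurements_2012_07 (
        CHECK (logdate >= DATE '2012-07-01' AND logdate < DATE '2012-08-01')
    ) INHERITS (measurements);

    -- Queries against the parent transparently include the matching children.
    SELECT count(*) FROM measurements WHERE logdate >= DATE '2012-07-15';
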
From: Vladimir S. <vst...@gm...> - 2012-07-30 16:49:26

On Mon, Jul 30, 2012 at 6:35 PM, Vladimir Stavrinov <vst...@gm...> wrote:
> I see only one way to provide read & write LB simultaneously: if replication
> is done asynchronously in a background process. This way we would have a
> distributed database as the main stock, supplemented with a number of
> replicated nodes containing the complete data. In such a system read requests
> should go to replicated nodes only when they are up to date (at least for
> the requested data). Asynchronous updates in such an architecture would
> support LB for write requests to distributed nodes, which remain synchronous.

Such an architecture allows creating a totally automated and complete LB & HA cluster without any third-party helpers. If one of the distributed (shard) nodes fails, it should be automatically replaced (failover) with one of the up-to-date replicated nodes.

From: Vladimir S. <vst...@gm...> - 2012-07-30 14:35:24

On Mon, Jul 30, 2012 at 11:01:25PM +0900, Michael Paquier wrote:
> Well, honestly, there are some ways to provide what you are looking for.
> However, there is currently no cluster product that can fully provide that
> with a complete multi-master structure.

I see only one way to provide read & write LB simultaneously: if replication is done asynchronously in a background process. This way we would have a distributed database as the main stock, supplemented with a number of replicated nodes containing the complete data. In such a system read requests should go to replicated nodes only when they are up to date (at least for the requested data). Asynchronous updates in such an architecture would support LB for write requests to distributed nodes, which remain synchronous.

--
***************************
##  Vladimir Stavrinov
##  vst...@gm...
***************************

From: Vladimir S. <vst...@gm...> - 2012-07-30 14:18:26

On Mon, Jul 30, 2012 at 11:01:25PM +0900, Michael Paquier wrote:
> Well, honestly, there are some ways to provide what you are looking for.
> However, there is currently no cluster product that can fully provide that
> with a complete multi-master structure.

I'd like to repeat this truth alone and together with You, but what about rac?

> No. Not directly. However, you can also make all the tables on a specific node
> with the same type! If I understood right, a distribution type at node level
> means that all the tables on this node have the same type, which is the type
> of the node.

What I mean exactly is to remove "DISTRIBUTE BY" from the "CREATE TABLE" statement and add the distribution type to the data node properties, as a field of the pgxc_node table, for example.

--
***************************
##  Vladimir Stavrinov
##  vst...@gm...
***************************

From: Michael P. <mic...@gm...> - 2012-07-30 14:01:43

On 2012/07/30, at 22:52, Vladimir Stavrinov <vst...@gm...> wrote:
> On Mon, Jul 30, 2012 at 10:21:06PM +0900, Michael Paquier wrote:
>> - LB: There is automatic load balancing between Datanodes and Coordinators by design.
>> Load balancing at Coordinator level has to be managed by an external tool.
>
> For read requests to replicated tables and write requests to distributed
> tables it is clear, it is true. But for write requests to replicated
> tables there is no LB. And for read requests to distributed tables we
> have quasi LB, where requests for different data may go to different
> nodes, while requests for the same data still go to the same node.

Yes, replicated tables need to be used for master tables, the tables referred to a lot and changed a little. Well, honestly, there are some ways to provide what you are looking for. However, there is currently no cluster product that can fully provide that with a complete multi-master structure.

>> Do You think about implementing different data node types (instead of
>> tables), i.e. "distributed" and "replicated" nodes?
>
>> Well, the only extension that XC adds is that, and it allows to perform either
>> read and/or write scalability in a multi-master symmetric cluster, so that's a
>> good deal!
>
> I am not sure I understood You right. Does this mean You support the
> idea of moving the distribution type from table level to node level?

No. Not directly. However, you can also make all the tables on a specific node with the same type! If I understood right, a distribution type at node level means that all the tables on this node have the same type, which is the type of the node.

From: Vladimir S. <vst...@gm...> - 2012-07-30 13:53:11

On Mon, Jul 30, 2012 at 10:21:06PM +0900, Michael Paquier wrote:
> - LB: There is automatic load balancing between Datanodes and Coordinators by design.
> Load balancing at Coordinator level has to be managed by an external tool.

For read requests to replicated tables and write requests to distributed tables it is clear, it is true. But for write requests to replicated tables there is no LB. And for read requests to distributed tables we have quasi LB, where requests for different data may go to different nodes, while requests for the same data still go to the same node.

>> Do You think about implementing different data node types (instead of
>> tables), i.e. "distributed" and "replicated" nodes?
>
> Well, the only extension that XC adds is that, and it allows to perform either
> read and/or write scalability in a multi-master symmetric cluster, so that's a
> good deal!

I am not sure I understood You right. Does this mean You support the idea of moving the distribution type from table level to node level?

--
***************************
##  Vladimir Stavrinov
##  vst...@gm...
***************************

From: Michael P. <mic...@gm...> - 2012-07-30 13:21:16

On Mon, Jul 30, 2012 at 7:03 PM, Vladimir Stavrinov <vst...@gm...> wrote:
> On Fri, Jul 20, 2012 at 06:18:22PM +0900, Michael Paquier wrote:
>> Like postgreSQL, you can attach a slave node to a datanode and then
>> perform a failover on it.
>> After the master node fails for one reason or another, you will need to
>> promote the slave waiting behind.
>> Something like pg_ctl promote -D $DN_FOLDER is enough.
>> This is for the Datanode side.
>> Then what you need to do is update the node catalogs on each
>> Coordinator to allow them to redirect to the newly promoted node.
>> Let's suppose that the node that failed was called datanodeN (you
>> need the same node name for master and slave).
>> In order to do that, issue "ALTER NODE datanodeN WITH (HOST =
>> '$new_ip', PORT = $NEW_PORT); SELECT pgxc_pool_reload();"
>> Do that on each Coordinator and then the promoted slave will be
>> visible to each Coordinator and will be a part of the cluster.
>
> If You don't do this every day there are chances You make an error. How
> much time does it take in this case? As I wrote above, it is not XC's own HA
> feature, but rather external cluster infrastructure. As such it is better
> to use the above-mentioned tandem drbd + corosync + pacemaker - at least it
> gets failover automated.

I do not mean to perform such operations manually. It was just to illustrate how to do it. Like PostgreSQL, XC provides to the user the necessary interface to perform failover and HA operations easily and externally. Then the architect is free to use the HA utilities he wishes to perform any HA operation. In your case, a layer based on pacemaker would work. However, XC needs to be able to adapt to a maximum number of HA applications and monitoring utilities. The current interface fills this goal.

>> In 1.0 you can do that with this kind of thing (if you want to remove data from datanodeN):
>> CREATE TABLE new_table TO NODE (datanode1, ... datanode(N-1), datanode(N+1), datanodeP) AS SELECT * from old_table;
>> DROP TABLE old_table;
>> ALTER TABLE new_table RENAME TO old_table;
>> Once you are sure that the datanode you want to remove has no unique data
>> (don't care about replicated...), perform a DROP NODE on each Coordinator,
>> then pgxc_pool_reload() and the node will be removed correctly.
>
> Looks fine! What if there are thousands of such tables to be relocated (it
> is a real case)? And as I see, to do the opposite operation, i.e. adding a
> data node, we need to use this CREATE/DROP/RENAME TABLE technique again?
> It doesn't look like HA.

In 1.0, yes. And this is only necessary for hash/modulo/round robin tables.

>> Please note that I am working on a patch able to do such stuff
>> automatically... Will be committed soon.
>
> It is hopeful news.

The patch is already committed in the master branch. So you can do it in a simple command.

>>> DISTRIBUTE BY REPLICATION) XC itself has neither HA nor LB (at least
>>> for writes) capabilities?
>>
>> Basically it has both, I know some guys who are already building an
>> HA/LB solution based on that...
>
> What do You mean?

I mean:
- HA: XC provides the necessary interface to allow other external tools to perform operations, as for postgres.
- LB: There is automatic load balancing between Datanodes and Coordinators by design. Load balancing at Coordinator level has to be managed by an external tool.

> As we saw above, HA is external and LB is a question of either read or write.
> Yes, we have only one variant of such a solution:
> when all tables are replicated we have "internal" HA and "read" LB. But
> such a solution is implemented in many other technologies apart from XC.
> But as far as I understand, the main feature of XC is what is named
> "write-scalable, synchronous multi-master" symmetric.
>
> OK, I still hope I made the right decision choosing XC as a cluster
> solution. But now, summarizing the problems discussed, I have a further
> question: Why did You implement distribution types at table level?

In a cluster, what is important is to limit the amount of data exchanged between nodes to reach good performance. In order to accomplish that, you need control of table joins. In XC, maximizing performance simply means sending as many joins as possible to remote nodes, reducing the amount of data exchanged between nodes by that much. There are multiple ways to control data joins, like caching the data at Coordinator level for reuse, which is what pgpool-II does. But in that case how do you manage prepared plans or write operations? This is hardly compatible with multi-master. Hence, the control is given to the tables. That explains why distribution is controlled like this.

> It is very complex to use and is not transparent. For example, when You
> need to install a third-party application You need to revise all of its sql
> scripts to add a DISTRIBUTE BY statement if You don't want the defaults. What
> do You think about implementing different data node types (instead of
> tables), i.e. "distributed" and "replicated" nodes?

Well, the only extension that XC adds is that, and it allows to perform either read and/or write scalability in a multi-master symmetric cluster, so that's a good deal!

--
Michael Paquier
http://michael.otacoo.com

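
Collecting the commands Michael quotes above into one place, a rough sketch of the 1.0-era procedures (node names, host and port values are placeholders, not a tested script):

    -- 1. Failover: after promoting the standby on the datanode host
    --    (pg_ctl promote -D $DN_FOLDER), repoint every Coordinator:
    ALTER NODE datanodeN WITH (HOST = 'new_host', PORT = 15432);
    SELECT pgxc_pool_reload();

    -- 2. Removing a datanode in 1.0: move its data away table by table.
    CREATE TABLE new_table TO NODE (datanode1, datanode2) AS SELECT * FROM old_table;
    DROP TABLE old_table;
    ALTER TABLE new_table RENAME TO old_table;

    -- Once the node holds no unique data, on each Coordinator:
    DROP NODE datanodeN;
    SELECT pgxc_pool_reload();
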
From: Vladimir S. <vst...@gm...> - 2012-07-30 10:03:39

On Fri, Jul 20, 2012 at 06:18:22PM +0900, Michael Paquier wrote:
> Like postgreSQL, you can attach a slave node to a datanode and then perform a failover on it.
> After the master node fails for one reason or another, you will need to promote the slave waiting behind.
> Something like pg_ctl promote -D $DN_FOLDER is enough.
> This is for the Datanode side.
> Then what you need to do is update the node catalogs on each Coordinator to allow them to redirect to the
> newly promoted node.
> Let's suppose that the node that failed was called datanodeN (you need the same node name for master and slave).
> In order to do that, issue "ALTER NODE datanodeN WITH (HOST = '$new_ip', PORT = $NEW_PORT); SELECT pgxc_pool_reload();"
> Do that on each Coordinator and then the promoted slave will be visible to each Coordinator and will be a part of
> the cluster.

If You don't do this every day there are chances You make an error. How much time does it take in this case? As I wrote above, it is not XC's own HA feature, but rather external cluster infrastructure. As such it is better to use the above-mentioned tandem drbd + corosync + pacemaker - at least it gets failover automated.

> In 1.0 you can do that with this kind of thing (if you want to remove data from datanodeN):
> CREATE TABLE new_table TO NODE (datanode1, ... datanode(N-1), datanode(N+1), datanodeP) AS SELECT * from old_table;
> DROP TABLE old_table;
> ALTER TABLE new_table RENAME TO old_table;
> Once you are sure that the datanode you want to remove has no unique data (don't care about replicated...), perform a
> DROP NODE on each Coordinator,
> then pgxc_pool_reload() and the node will be removed correctly.

Looks fine! What if there are thousands of such tables to be relocated (it is a real case)? And as I see, to do the opposite operation, i.e. adding a data node, we need to use this CREATE/DROP/RENAME TABLE technique again? It doesn't look like HA.

> Please note that I am working on a patch able to do such stuff automatically... Will be committed soon.

It is hopeful news.

>> DISTRIBUTE BY REPLICATION) XC itself has neither HA nor LB (at least
>> for writes) capabilities?
>
> Basically it has both, I know some guys who are already building an HA/LB solution based on that...

What do You mean? As we saw above, HA is external and LB is a question of either read or write. Yes, we have only one variant of such a solution: when all tables are replicated we have "internal" HA and "read" LB. But such a solution is implemented in many other technologies apart from XC. But as far as I understand, the main feature of XC is what is named "write-scalable, synchronous multi-master" symmetric.

OK, I still hope I made the right decision choosing XC as a cluster solution. But now, summarizing the problems discussed, I have a further question: Why did You implement distribution types at table level? It is very complex to use and is not transparent. For example, when You need to install a third-party application You need to revise all of its sql scripts to add a DISTRIBUTE BY statement if You don't want the defaults. What do You think about implementing different data node types (instead of tables), i.e. "distributed" and "replicated" nodes?

--
***************************
##  Vladimir Stavrinov
##  vst...@gm...
***************************

From: Joshua D. D. <jd...@co...> - 2012-07-27 16:09:57

Hello,

That would be very helpful. Thank you for offering.

Sincerely,

jD

From: Michael P. <mic...@gm...> - 2012-07-27 09:08:47

On 2012/07/27, at 17:42, Benjamin Henrion <bh...@ud...> wrote:
> Hi,
>
> Does anybody have Virtual Box images to test postgres-xc out of the box?

I don't have any VirtualBox images with XC preinstalled, sorry. Somebody here, perhaps?

> On the Getting Started page, there is:
>
> "Although you can install Postgres-XC cluster in single Linux
> operating system, we advise you to install on multiple Linux virtual
> machines using Virtual Box or VMWare. The number of virtual machines
> depends on the Postgres-XC configuration. Please take a look at
> configuration section in this page."
>
> If nobody has some, I could spend some time on making some Vbox images.

Why not. It would be helpful for everybody for sure.

Thanks,
Michael

From: Benjamin H. <bh...@ud...> - 2012-07-27 08:42:25

Hi,

Does anybody have Virtual Box images to test postgres-xc out of the box?

On the Getting Started page, there is:

"Although you can install Postgres-XC cluster in single Linux
operating system, we advise you to install on multiple Linux virtual
machines using Virtual Box or VMWare. The number of virtual machines
depends on the Postgres-XC configuration. Please take a look at
configuration section in this page."

If nobody has some, I could spend some time on making some Vbox images.

Best,

--
Benjamin Henrion <bhenrion at ffii.org>
FFII Brussels - +32-484-566109 - +32-2-3500762
"In July 2005, after several failed attempts to legalise software
patents in Europe, the patent establishment changed its strategy.
Instead of explicitly seeking to sanction the patentability of
software, they are now seeking to create a central European patent
court, which would establish and enforce patentability rules in their
favor, without any possibility of correction by competing courts or
democratically elected legislators."

From: Koichi S. <koi...@gm...> - 2012-07-26 06:56:08

pgcrypto looks to work fine. Obviously, it is not recommended to encrypt the distribution key, since we don't have good optimization for that. Here's my result. Enjoy;
----
Koichi

----8<------------------------8<------------
postgres=# create extension pgcrypto;
CREATE EXTENSION
postgres=# create table test (id int, name bytea);
CREATE TABLE
postgres=# insert into test select 1, encrypt(convert_to('Suzuki', 'UTF8'), 'pass'::bytea, 'aes');
INSERT 0 1
postgres=# insert into test select 2, pgp_sym_encrypt('Sato', 'pass');
INSERT 0 1
postgres=# insert into test select 3, pgp_sym_encrypt('Sato', 'pass');
INSERT 0 1
postgres=# select * from test;
 id | name
----+------------------------------------------------------------------
  1 | \x42a2c9e81526133e031336b8a09a3ae8
  2 | \xc30d0407030285360772296ea94e78d2350126b86572a89a4f054fb95eb0b8ab86b214e66571469460a469b740f26c3465e19f17500703e75e06378bc4b75b37f053befad44a
  3 | \xc30d040703024a9931a54cde6b6f60d235011bdbf887c8a0cb2e122ed034101ef64b3d5fb7d25cbff91dd82661de665651d92c0c1ee7d6672139ef7e59b767bb90612ab71bef
(3 rows)

postgres=# SELECT id, convert_from(decrypt(name, 'pass'::bytea, 'aes'), 'UTF8') FROM test where id = 1;
 id | convert_from
----+--------------
  1 | Suzuki
(1 row)

postgres=# SELECT id, pgp_sym_decrypt(name,'pass') FROM test WHERE id IN (2,3);
 id | pgp_sym_decrypt
----+-----------------
  3 | Sato
  2 | Sato
(2 rows)

postgres=# create table test1 (id varchar, name bytea);
CREATE TABLE
postgres=# insert into test1 values (pgp_sym_encrypt('1', 'pass'), pgp_sym_encrypt('Suzuki', 'pass'));
INSERT 0 1
postgres=# select * from test1;
 id | name
----+------------------------------------------------------------------
 \xc30d0407030233ffc6b915c7908d62d23201980e846bcfbf772c43ae7bb5c9476da716fe80dd2149e81aaa2353767bb5402af7fa54eb8b498ea21de58a58e8ca697229 | \xc30d040703020169a4462ef5f63d65d2370108b2e0dea578142f63e416989a95e19663605c7f035b54074d6f190e704e7bf75b651c6dfa6091667a8242fcd91ae60384bf0ec0c4c6
(1 row)

postgres=# select pgp_sym_decrypt(id::bytea, 'pass'), pgp_sym_decrypt(name::bytea, 'pass') from test1;
 pgp_sym_decrypt | pgp_sym_decrypt
-----------------+-----------------
 1               | Suzuki
(1 row)

postgres=# select pgp_sym_decrypt(id::bytea, 'pass'), pgp_sym_decrypt(name::bytea, 'pass') from test1 where pgp_sym_decrypt(id::bytea, 'pass') = '1';
 pgp_sym_decrypt | pgp_sym_decrypt
-----------------+-----------------
 1               | Suzuki
(1 row)

postgres=#
---->8------------------------>8------------

2012/7/26 Aris Setyawan <ari...@gm...>:
> Hi All,
>
> Is "Additional Supplied Modules" supported in XC cluster (hstore,
> pgcrypto, etc..)?
>
> -thanks

From: Koichi S. <koi...@gm...> - 2012-07-26 06:37:24

Yes, namely yes. May need to share the test. Here's my test of hstore. It looks to run. The example was from http://d.hatena.ne.jp/rudi/20120330/1333120115 (originally in Japanese). Of course, the "products" table is distributed by hash on id.

Regards;
---
Koichi

--------------8<-------------------------8<-----------------
[koichi@willey:pgxc]$ psql -p 20004 postgres
psql (PGXC 1.1devel, based on PG 9.2devel)
Type "help" for help.

postgres=# create extension hstore;
WARNING:  => is deprecated as an operator name
DETAIL:  This name may be disallowed altogether in future versions of PostgreSQL.
CREATE EXTENSION
postgres=# CREATE TABLE products (id serial PRIMARY KEY, name varchar, attrubutes hstore);
NOTICE:  CREATE TABLE will create implicit sequence "products_id_seq" for serial column "products.id"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "products_pkey" for table "products"
CREATE TABLE
postgres=# INSERT INTO products (name, attributes) VALUES
postgres-# ('Geel Love: A Novel',
postgres(#  'author => "Katherine Dunn",
postgres'#   pages => 368,
postgres'#   category => fiction'
postgres(# );
ERROR:  column "attributes" of relation "products" does not exist
LINE 1: INSERT INTO products (name, attributes) VALUES
                                    ^
postgres=# drop table products
postgres-# ;
DROP TABLE
postgres=# CREATE TABLE products (
postgres(#   id serial PRIMARY KEY,
postgres(#   name varchar,
postgres(#   attributes hstore
postgres(# );
NOTICE:  CREATE TABLE will create implicit sequence "products_id_seq" for serial column "products.id"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "products_pkey" for table "products"
CREATE TABLE
postgres=# INSERT INTO products (name, attributes) VALUES (
postgres(# 'Geek Love: A Novel',
postgres(# 'author => "Katherine Dunn",
postgres'#  pages => 368,
postgres'#  category => fiction'
postgres(# );
INSERT 0 1
postgres=# SELECT name as device
postgres-# FROM products
postgres-# WHERE attributes->'category' = 'fiction';
       device
--------------------
 Geek Love: A Novel
(1 row)

postgres=# SELECT name, attributes->'pages'
postgres-# FROM products
postgres-# WHERE attributes ? 'pages';
        name        | ?column?
--------------------+----------
 Geek Love: A Novel | 368
(1 row)

postgres=# CREATE INDEX product_manufacturer
postgres-# ON products ((products.attributes->'manufacturer'));
CREATE INDEX
postgres=#
--------->8--------------------------->8-------------------------

2012/7/26 Michael Paquier <mic...@gm...>:
> On Thu, Jul 26, 2012 at 3:09 PM, Aris Setyawan <ari...@gm...> wrote:
>> Hi All,
>>
>> Is "Additional Supplied Modules" supported in XC cluster (hstore,
>> pgcrypto, etc..)?
>
> Normally yes. I tested some of them and they should work. There might be
> exceptions though.
> --
> Michael Paquier
> http://michael.otacoo.com

From: Michael P. <mic...@gm...> - 2012-07-26 06:34:05

On Thu, Jul 26, 2012 at 3:30 PM, Aris Setyawan <ari...@gm...> wrote:
>> Normally yes. I tested some of them and they should work. There might be
>> exceptions though.
>
> Can I rely on the current 1.0 XC documentation to find out whether a
> module is supported?

Yes, you can. There may be some mistakes in the documentation, as there are not so many people maintaining and correcting it. So if you find anything that does not correspond, feel free to report it on this thread.

Thanks,
--
Michael Paquier
http://michael.otacoo.com

From: Michael P. <mic...@gm...> - 2012-07-26 06:23:42

On Thu, Jul 26, 2012 at 3:09 PM, Aris Setyawan <ari...@gm...> wrote:
> Hi All,
>
> Is "Additional Supplied Modules" supported in XC cluster (hstore,
> pgcrypto, etc..)?

Normally yes. I tested some of them and they should work. There might be exceptions though.

--
Michael Paquier
http://michael.otacoo.com

From: Aris S. <ari...@gm...> - 2012-07-26 06:09:39

Hi All,

Are the "Additional Supplied Modules" supported in an XC cluster (hstore, pgcrypto, etc.)?

-thanks

From: Koichi S. <ko...@in...> - 2012-07-24 00:40:37

It does not seem reasonable to ask for a single license for all the Postgres-XC documents. The license could differ from document to document, especially for those written by individuals.

As Joshua mentioned, maybe we can require a specific license for web site or Wiki contents, and contributors to such contents should agree to this. Creative Commons looks okay. ShareAlike, maybe no objection. Then commercial or non-commercial? As I mentioned before, I'm not sure what corner cases "commercial" implies. Fair use is allowed unconditionally. So I think we can begin with non-commercial.

Any more inputs?
---
Koichi Suzuki

On Tue, 24 Jul 2012 07:44:35 +0900
Michael Paquier <mic...@gm...> wrote:
> On Tue, Jul 24, 2012 at 7:37 AM, Joshua D. Drake <jd...@co...> wrote:
>> On 07/23/2012 03:34 PM, Michael Paquier wrote:
>>
>>>> Agreed, which is why I suggested keeping our hands clean but forcing
>>>> the good community-citizen approach with attribution.
>>>
>>> On this point I kind of agree: everything that is labelled as
>>> "Postgres-XC development group" should be based on the same license as
>>> the code, to facilitate all the things.
>>> However, docs written by people not using the Postgres-XC development group
>>> name on their docs but a personal name or company name can carry the
>>> license they want, and if other people want to pick up those documents
>>> they need to contact the authors. This is for example the case of my own
>>> presentation documents. Those docs are under non-commercial as I use
>>> my company name and my own name on them.
>>
>> There is no way to force any author to release any document under any
>> license except in the instance where the author would like to contribute
>> that documentation directly to the Postgres-XC development group. In that
>> case (say a patch submission, or acceptance on the website or wiki) we can
>> force a specific license. Otherwise, we are powerless.
>
> So we agree here.
>
>> So Michael, I believe with that your concerns are addressed, yes?
>
> Yes.
> --
> Michael Paquier
> http://michael.otacoo.com
