From: Michael P. <mic...@gm...> - 2012-10-26 13:15:56
On Fri, Oct 26, 2012 at 9:55 PM, Vladimir Stavrinov <vst...@gm...> wrote:
> That's right, this is it. This is the result of your concept: instead of
> one cluster we should build two clusters.

You have the same notion with PG itself: you create two database servers if you use a slave with a master, so I do not see your point, and people live with and appreciate such a robust solution. If you are so wary of using slaves, you could also use archive files for recovery, or take periodic dumps of each Datanode if you do not want to lose data, then replay them on a new node if necessary.

--
Michael Paquier
http://michael.otacoo.com
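A minimal, cron-able sketch of the periodic-dump approach described above. The hostnames, ports, database name, and backup path are hypothetical placeholders, not part of the original mail; each Datanode is a plain Postgres instance, so standard pg_dump works against it.

    #!/bin/sh
    # Dump each Datanode in pg_dump custom format; restore later with
    # pg_restore onto a replacement node if one fails.
    for node in dn1:15432 dn2:15433; do
        host=${node%%:*}
        port=${node##*:}
        pg_dump -h "$host" -p "$port" -U postgres -F c mydb \
            > "/backup/${host}_$(date +%Y%m%d).dump"
    done
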
From: Vladimir S. <vst...@gm...> - 2012-10-26 12:55:56
On Thu, Oct 25, 2012 at 4:21 AM, Michael Paquier <mic...@gm...> wrote:
> This looks like a possible solution for achieving load balancing easily
> at the Coordinator level. You could also publish a small utility for the XC
> community based on your experience. That is only a suggestion to help

It is not a utility, it is cluster infrastructure and configuration.

> It is btw recommended to have a standby node behind the one that failed, if
> the Datanode that failed cannot be recovered for one reason or another.

That's right, this is it. This is the result of your concept: instead of one cluster we should build two clusters.
From: Vladimir S. <vst...@gm...> - 2012-10-26 12:46:53
On Thu, Oct 25, 2012 at 4:13 AM, Michael Paquier <mic...@gm...> wrote:
> 1) It is not our goal to oblige the users to use an HA solution or another,

Sounds fine. But where are those users? Who wants a cluster without HA? Everybody who hears the word "cluster" implies "HA".

> Postgres code with XC. One of the reasons explaining that XC is able to keep
> up with Postgres code pace easily is that we avoid implementing solutions in
> core that might impact unnecessarily its interactions with Postgres.

You are heroes. How long can you keep up this "code pace" on such a hard road? This paradigm prevents you from implementing not only HA but a lot of other things that are necessary for a cluster. I have never seen this type of fork. I believe at some point you will either become part of Postgres or break away completely and go your own way. The only question is when? And the best answer is "right now".

>> Manageability - I want to manage a cluster easily (add node, remove node,
>> spare nodes, monitoring, ...). It cannot be simple enough.
>
> Sure. I don't know about any utilities able to do that, but if you could
> build a utility like this running on top of XC and sell it, well, you might
> be able to make some money if XC becomes popular, which is not really the
> case now ;)

There is no problem with adding or removing nodes. But after that we have to do something with the data contained in those nodes. In other words, this is a data manipulation issue. And it is not about a "utility like this running on top of XC". It should be implemented internally.
From: Nikhil S. <ni...@st...> - 2012-10-26 12:23:04
On Fri, Oct 26, 2012 at 5:15 PM, Vladimir Stavrinov <vst...@gm...> wrote:
> On Thu, Oct 25, 2012 at 1:40 AM, Paulo Pires <pj...@ub...> wrote:
>> Summing up, I've found Postgres-XC to be quite easy to install and
>> configure in a 3 coordinators + 3 data-nodes setup (GTM all over them and
>> GTM-Proxy handling HA). A little Google and command line did the trick
>> in *a couple hours*!
>
> In Debian you can install this package in a few seconds.
>
>> Now, the only downside for me is that Postgres-XC doesn't have a
>> built-in way of load-balancing between coordinators. If the coordinator
>
> It is not a problem. The problem is the necessity to have a standby for
> every data node.

Why is that a problem? The standby can run on nodes which are part of your cluster itself.

> But be aware: with this solution we have HA only for the LB, but not for
> the datanodes themselves.

HA for datanodes can be provided by using standby nodes as well.

> That is what we have without HA. And that is why you must have a standby
> for every data node. In other words, you should build extra
> infrastructure the size of the entire cluster.

It looks like you like the MySQL Cluster product a lot and are trying to force-fit PGXC within its parameters. So going back to your favorite MySQL Cluster product: the group has to contain at least two nodes for redundancy, right? Why is that OK, while having a replica is not OK or not similar? PGXC can/will certainly provide read capabilities from replicas in coming versions.

Regards,
Nikhils
--
StormDB - http://www.stormdb.com
The Database Cloud
Postgres-XC Support and Service
From: Nikhil S. <ni...@st...> - 2012-10-26 12:07:19
> But where is the war? It is simply a question. With low priority you have
> neither knowledge nor HA itself. But if every XC is accompanied by HA
> then it is high priority. And the question is: what is true here?

Vladimir, I guess you are getting the impression that PGXC has de-emphasized HA; that's certainly not the case. For a distributed database, the HA aspects are really important. As you have mentioned elsewhere, there needs to be a solution in place with something like Corosync/Pacemaker, and it has been looked into.

Regards,
Nikhils
--
StormDB - http://www.stormdb.com
The Database Cloud
Postgres-XC Support and Service
From: Vladimir S. <vst...@gm...> - 2012-10-26 11:56:02
On Fri, Oct 26, 2012 at 08:42:05PM +0900, Michael Paquier wrote:
>>> He spoke about priorities, not lack of knowledge. You're playing with
>>
>> What is the difference?
>
> Easy, easy. This is a space of peace.

But where is the war? It is simply a question. With low priority you have neither knowledge nor HA itself. But if every XC is accompanied by HA then it is high priority. And the question is: what is true here?

--
***************************
## Vladimir Stavrinov
## vst...@gm...
***************************
From: Vladimir S. <vst...@gm...> - 2012-10-26 11:45:45
On Thu, Oct 25, 2012 at 1:40 AM, Paulo Pires <pj...@ub...> wrote:
> Summing up, I've found Postgres-XC to be quite easy to install and
> configure in a 3 coordinators + 3 data-nodes setup (GTM all over them and
> GTM-Proxy handling HA). A little Google and command line did the trick
> in *a couple hours*!

In Debian you can install this package in a few seconds.

> Now, the only downside for me is that Postgres-XC doesn't have a
> built-in way of load-balancing between coordinators. If the coordinator

It is not a problem. The problem is the necessity to have a standby for every data node.

> 1) Define a DNS FQDN like coordinator.mydomain pointing to an IP
> (i.e., 10.0.0.1)
> 2) Point my app to work with that FQDN
> 3) On every coordinator, configure keepalived with one shared IP
> (10.0.0.1)
> 4) Install haproxy on every coordinator and have it load-balance with
> the other coordinators

First, haproxy is superfluous here: keepalived can do everything itself, and better. Second, putting it on any XC node is a bad idea. In any case I prefer a full cluster solution with corosync/pacemaker. This way we can put under single cluster control not only the database, but all other parts of the system, i.e. web servers and application servers. But be aware: with this solution we have HA only for the LB, but not for the datanodes themselves.

> My only doubt is, if you get a data-node offline and then bring it up,
> will the data in that data-node be synchronized?

My congratulations. You have arrived at the point we have been discussing for a long time in a neighboring thread. Data from this node, if it has no replica on other nodes, is not available any more, but your application does not know which data is available and which is not. You can easily imagine the consequences. That is the moment when the downtime starts. That is what we have without HA. And that is why you must have a standby for every data node. In other words, you should build extra infrastructure the size of the entire cluster.
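A minimal keepalived.conf sketch of the claim above that keepalived alone can cover both the floating IP and the load balancing, via its built-in IPVS support. All addresses, the interface name, and the Coordinator port are hypothetical placeholders.

    vrrp_instance VI_COORD {
        state MASTER
        interface eth0
        virtual_router_id 51
        priority 100
        virtual_ipaddress {
            10.0.0.1
        }
    }

    # Kernel-level (IPVS) load balancing across Coordinators,
    # so no separate haproxy process is needed.
    virtual_server 10.0.0.1 5432 {
        delay_loop 6
        lb_algo rr
        lb_kind NAT
        protocol TCP
        real_server 10.0.0.11 5432 {
            TCP_CHECK {
                connect_timeout 3
            }
        }
        real_server 10.0.0.12 5432 {
            TCP_CHECK {
                connect_timeout 3
            }
        }
    }
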
From: Michael P. <mic...@gm...> - 2012-10-26 11:42:16
On Fri, Oct 26, 2012 at 4:53 PM, Vladimir Stavrinov <vst...@gm...> wrote:
> On Fri, Oct 26, 2012 at 08:50:09AM +0100, Paulo Pires wrote:
>> He spoke about priorities, not lack of knowledge. You're playing with
>
> What is the difference?

Easy, easy. This is a space of peace. Thanks in advance for respecting each other and the people reading this mailing list.

--
Michael Paquier
http://michael.otacoo.com
From: Vladimir S. <vst...@gm...> - 2012-10-26 07:54:08
On Fri, Oct 26, 2012 at 08:50:09AM +0100, Paulo Pires wrote:
> He spoke about priorities, not lack of knowledge. You're playing with

What is the difference?

--
***************************
## Vladimir Stavrinov
## vst...@gm...
***************************
From: Paulo P. <pj...@ub...> - 2012-10-26 07:50:25
On 26/10/12 07:56, Vladimir Stavrinov wrote:
> On Thu, Oct 25, 2012 at 10:41:05AM +0300, Andrei Martsinchyk wrote:
>> XC is for those who want more TPS per dollar; under the
>> circumstances HA is definitely not a first priority. If you
>
> Paulo, recently you asked me:
>
> "Do you know anyone putting up a database cluster without
> HA/Clustering/LB?"
>
> Here they are. Ask Andrei to introduce you to them. Then you can tell us
> an impressive story about the numerous people for whom Postgres-XC was
> invented.

He spoke about priorities, not lack of knowledge. You're playing with words and that just sucks, man!

--
Paulo Pires
From: Vladimir S. <vst...@gm...> - 2012-10-26 06:56:52
On Thu, Oct 25, 2012 at 10:41:05AM +0300, Andrei Martsinchyk wrote:
> XC is for those who want more TPS per dollar; under the
> circumstances HA is definitely not a first priority. If you

Paulo, recently you asked me:

"Do you know anyone putting up a database cluster without HA/Clustering/LB?"

Here they are. Ask Andrei to introduce you to them. Then you can tell us an impressive story about the numerous people for whom Postgres-XC was invented.

--
***************************
## Vladimir Stavrinov
## vst...@gm...
***************************
From: Paulo P. <pj...@ub...> - 2012-10-25 07:43:52
On 25/10/12 08:37, Vladimir Stavrinov wrote:
> On Thu, Oct 25, 2012 at 2:05 AM, Vladimir Stavrinov
> <vst...@gm...> wrote:
>> On Wed, Oct 24, 2012 at 11:18:59PM +0300, Andrei Martsinchyk wrote:
>>> one of those solutions. Everybody wins. If XC integrates one
>>> approach it will lose flexibility in this area.
>>
>> and gain many more users.
>
> OK. Paulo doesn't want more users, because he doesn't like easy ways and
> simple things. But we all want flexibility. Flexibility is a good thing,
> and here is an example.

I didn't say "I don't want more users". I just believe, based on my experience, that subjects as advanced as the ones we're discussing don't come easy. And they shouldn't, in the sense that people should really learn/know about what they're doing regarding clustering, HA, etc.!

> We have a cluster consisting of 4 nodes. Nodes are organized in groups. All
> data is distributed between groups and every group contains identical
> data, i.e. replicas. In this case, with such a model, we have 3 options:
>
> 1. Read scalability only, with 4 replicas in a group.
> 2. Read and write scalability, with 2 replicas per group.
> 3. Write scalability only, with 1 replica per group.
>
> It is obvious: with more nodes we have more options, i.e. more
> flexibility. What we have here is a trade-off between read and write
> scalability. And we don't need "CREATE TABLE ... DISTRIBUTE BY ..." for
> this. I think it is enough for most cases.

--
Paulo Pires
From: Andrei M. <and...@gm...> - 2012-10-25 07:41:18
I feel like the discussion is senseless. Everything costs its price. If you need HA, you pay with performance. If you need both HA and performance, you pay for more powerful hardware. XC is for those who want more TPS per dollar; under the circumstances HA is definitely not a first priority. If you know how to implement an HA solution that does not affect performance, please tell us. There are a lot of useful features (like the ability to start when the server starts, scheduled backups, failover to a standby system) which are out of the core. If you want any of these you need to set them up, or have someone do that for you. If you do not need them you can go without them pretty well.

2012/10/25 Vladimir Stavrinov <vst...@gm...>
> On Thu, Oct 25, 2012 at 12:18 AM, Andrei Martsinchyk
> <and...@gm...> wrote:
>> I think your test was incorrect. It works.
>
> No, it is exactly what this thread started from and what is indicated in
> its subject. See the very first answer of a developer: it is not even a
> bug, it is by design. Sounds like an anecdote, but it is true.
>
>> performance scalability. They could use XC as is. If there is demand for
>> HA on the market, other developers may create XC-based solutions, more or
>> less
>
> Do you really have a question about this? I think High Availability is
> priority number one, because we are not very happy sitting in a
> Rolls-Royce that can not move.

Nice. A Rolls-Royce requires a road, fuel, a driver, service. If you do not provide all these, you will be sitting in a car that can not move. Why did you purchase it then?

--
Andrei Martsinchyk
StormDB - http://www.stormdb.com
The Database Cloud
From: Vladimir S. <vst...@gm...> - 2012-10-25 07:38:04
On Thu, Oct 25, 2012 at 2:05 AM, Vladimir Stavrinov <vst...@gm...> wrote:
> On Wed, Oct 24, 2012 at 11:18:59PM +0300, Andrei Martsinchyk wrote:
>> one of those solutions. Everybody wins. If XC integrates one
>> approach it will lose flexibility in this area.
>
> and gain many more users.

OK. Paulo doesn't want more users, because he doesn't like easy ways and simple things. But we all want flexibility. Flexibility is a good thing, and here is an example.

We have a cluster consisting of 4 nodes. Nodes are organized in groups. All data is distributed between groups and every group contains identical data, i.e. replicas. In this case, with such a model, we have 3 options:

1. Read scalability only, with 4 replicas in a group.
2. Read and write scalability, with 2 replicas per group.
3. Write scalability only, with 1 replica per group.

It is obvious: with more nodes we have more options, i.e. more flexibility. What we have here is a trade-off between read and write scalability. And we don't need "CREATE TABLE ... DISTRIBUTE BY ..." for this. I think it is enough for most cases.
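For comparison, the per-table choice XC actually offers looks roughly like the sketch below. The table and node names are made up, and the exact TO NODE spelling varies slightly between XC releases. Note that option 2 above, distributing across groups while replicating within each group, has no single-table equivalent here; that is precisely the gap being argued.

    -- Read scalability: a full copy of the table on every node.
    CREATE TABLE catalog (id int, name text)
        DISTRIBUTE BY REPLICATION TO NODE (dn1, dn2, dn3, dn4);

    -- Write scalability: rows hash-spread, a single copy of each row.
    CREATE TABLE events (id int, payload text)
        DISTRIBUTE BY HASH (id) TO NODE (dn1, dn2, dn3, dn4);
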
From: Vladimir S. <vst...@gm...> - 2012-10-25 07:01:15
On Thu, Oct 25, 2012 at 12:18 AM, Andrei Martsinchyk <and...@gm...> wrote:
> I think your test was incorrect. It works.

No, it is exactly what this thread started from and what is indicated in its subject. See the very first answer of a developer: it is not even a bug, it is by design. Sounds like an anecdote, but it is true.

> performance scalability. They could use XC as is. If there is demand for HA
> on the market, other developers may create XC-based solutions, more or less

Do you really have a question about this? I think High Availability is priority number one, because we are not very happy sitting in a Rolls-Royce that can not move.
From: Ashutosh B. <ash...@en...> - 2012-10-25 06:32:40
On Thu, Oct 25, 2012 at 5:43 AM, Michael Paquier <mic...@gm...> wrote:
> On Thu, Oct 25, 2012 at 5:41 AM, David Hofstee <pg...@c0...> wrote:
>> Hi,
>>
>> I've been reading the 'ERROR: Failed to get pooled connections' thread
>> about what XC should and should not do. I opted to start a new thread
>> (instead of replying) about how I would like XC to be.
>>
>> Some background. I work for a SaaS company (mostly dev, some ops) which
>> has to be online 24/7. We are now running apache/tomcat/mysql for each
>> set of customers on about 30 nodes and we want to centralize and make our
>> application more robust, efficient and simple. It basically means
>> creating layers: LB, webservers, application servers, database cluster.
>> Some easy parts are already done (haproxy, nginx). Our 'platform' is
>> pretty complex and I have so many tasks, I prefer to *not* dig into
>> details. We are now discussing the db issue (mysql cluster is not that
>> great).
>>
>> My dream DB cluster:
>>
>> Scalability - that means read and write scalability. XC should do that
>> right now. Nice.
>>
>> High availability - a node can go offline and it should not hinder
>> availability (only processing capacity).
>>
>> Maintainability - Since maintenance/change is our primary cause of
>> downtime, it should be possible to kill a node and add it later. This can
>> be because the VM is being moved, the OS is updated/upgraded, etc. Also,
>> think about how a cluster is updated from major version to major version
>> (let's say 9.x to 10.x). Maybe that is not an issue (but I don't know
>> about it yet).
>>
>> Simplicity - It would be nice if the default package+config file is all I
>> need. If it is too complex I cannot go on holidays. Some points:
>>
>> - I read that '...even the stock postgresql.conf configuration file
>>   is pretty conservative and users tweak it as per their
>>   requirements...'. For me that translates as 'if you are new to
>>   Postgres it works badly'. Not simple (for e.g. some of our dev-ers).
>> - For HA, '...Like Postgres, you need an external application to
>>   provide it'. When using a cluster I think HA is very often wanted. I
>>   need to explain all this to every ops-colleague of mine and some are
>>   not very accurate. Not simple again.
>
> XC is a fork of Postgres and we try to share the same philosophy as the
> parent project about being really conservative on the things that should
> or should not be added in core.
> For example, let's take the case of HA. It is of course possible to
> implement an HA solution directly in the core of XC, but there are 2
> things that would go against that:
> 1) It is not our goal to oblige the users to use an HA solution or
> another, and I do not believe that it is the role of core people to
> integrate directly in XC core a solution that might be good for a certain
> type of application, without caring about the other types of applications.
> Postgres is popular because it leaves all the users free to use what they
> want, and depending on the application people want to use with XC, they
> might prefer one HA solution or another.
> 2) If in the future Postgres integrates a native HA solution (I do not
> believe it will be the case as the community is really conservative, but
> let's assume), and if XC had at some point integrated an HA solution
> directly in its core, we would certainly have to drop the XC solution and
> rely on the Postgres solution, as XC is a fork of Postgres. This would be
> a waste of time for the core people who integrated the HA solution, and
> for the people merging Postgres code with XC. One of the reasons
> explaining that XC is able to keep up with Postgres code pace easily is
> that we avoid implementing solutions in core that might impact
> unnecessarily its interactions with Postgres.

+10. I totally agree with Michael here. We would like to keep XC's footprint as small as possible. XC adds the features for distributed computing that are not present in PG; the rest of the features come from PG. At the same time, we lack resources, and hence choose only the few things that look important from XC's perspective.

>> Quick setup - I want to set up an NxM cluster quickly (N times
>> duplication for HA, M times distributed writes for performance). I
>> prefer to set up a single node with a given config file, add nodes and
>> be ready to go. Maybe an hour in case of disaster recovery?
>
> There are already tools for that, like this one written in Ruby:
> https://sourceforge.net/projects/postgres-xc/files/misc/pgxc_config_v0_9_3.tar.gz/download
> It has not been maintained since 0.9.3, as it is honestly not a part of
> core. You might have a look at it.
>
>> Manageability - I want to manage a cluster easily (add node, remove node,
>> spare nodes, monitoring, ...). It cannot be simple enough.
>
> Sure. I don't know about any utilities able to do that, but if you could
> build a utility like this running on top of XC and sell it, well, you
> might be able to make some money if XC becomes popular, which is not
> really the case now ;)
>
>> Backup - I'm not familiar with running backups on Postgres, but we
>> currently run a blocking backup on the mysql, for consistency, and it
>> causes issues. We use Bacula on a file level. Which brings up a question:
>> How do you back up a cluster (if you don't know which nodes are hot)?
>
> In the case of XC, you might directly take a dump from a Coordinator with
> pg_dump, and then restore the dump file with pg_restore. You might want
> to use archive files.
> There are many ways to accomplish that, like in Postgres. The only
> difference in the case of XC is that you need to do that for each node,
> as the architecture is shared-nothing.
>
>> Logging - Yes...
>>
>> Some may respond that things are not that simple. I know. But I still
>> want it to be simple. It would make PGXC a no-brainer for everyone.
>> Thanks for listening and keep up the good work! I appreciate it.
>
> There are already utilities implemented for Postgres that can work
> natively with XC; for logging you might want to use log analyzers like
> pgbadger.
> You should have a look at that first for each thing you want to do, then
> evaluate the effort necessary to achieve each of your goals.
>
> Thanks,
> --
> Michael Paquier
> http://michael.otacoo.com

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Enterprise Postgres Company
From: Vladimir S. <vst...@gm...> - 2012-10-25 00:40:33
On Wed, Oct 24, 2012 at 11:27:25PM +0100, Paulo Pires wrote:
> FYI, there is technology that deprecates the need to reboot a machine
> following a kernel update, such as ksplice (bought by Oracle a couple
> of years ago).

There is such a Debian package, but it is not commonly used.

> I believe you can add new machinery (new coordinators, new data-nodes)
> and deprecate old hardware. Am I being too simplistic thinking this way?
> Anyway, changing a cluster's hardware every two years seems overkill to
> me. But of course, it depends on your app's growth

We are not speaking about upgrades here; it is about scalability, remember?

> Yes, internal is (supposedly) easier or, as you say, "transparent" - I'd
> use the word "seamless". But you'll need to learn it and take care of it
> somehow, the same way you'd do with external solutions such as haproxy
> or keepalived. I don't think HA/Clustering/LB is for the faint of heart.
> Either you know what you're doing, or you leave this matter alone! You'll
> save your sanity in the medium term..

If you know how an automobile works, it doesn't mean you want to build one just for your own use. But in our context, remember again, extra complexity means not only extra software, but extra infrastructure, i.e. extra hardware as well. I am using corosync, pacemaker, ipvs, ldirectord, drbd and keepalived. But here we are discussing a database cluster, and it needs a different approach. I want to use some of these tools for distributing requests between coordinators and for failover of the ipvs distribution point and the GTM. But I don't want standby data nodes. All nodes should be under load, and there should be enough redundancy to survive the loss of any one node. Health monitoring and failover should be done internally by XC in this case.

> I don't understand why you keep citing MySQL as an example. *Don't take
> me wrong here*, but if you feel it to be the right tool, just go with it

I have already explained this here twice: it is not the right tool, because it is an in-memory database. But it has the right clustering model, and that is why I cite it here as a good exemplar.

> and leave the ones who think the same about Postgres-XC alone.

That is a good tool for closing any discussion about anything.

> Do you know anyone putting up a database cluster without
> HA/Clustering/LB knowledge? If you do, please ask them to stop.

This question is not for me. Look at the quotes above.

> If at least this was a "who has more users" competition, that would
> make sense. The best tools I use in my day-to-day job didn't come
> easy! I don't agree with you on this, at all.

But I agree with you on this point. It is not about an "easy way" or "more users", though. I don't think we would lose flexibility with a clustering model where the distribution scheme is defined at the cluster level. I believe it can still include distribution at the table level, so it may be a question of default settings. Well-designed complex things are easy to use with default settings, yet still provide enough flexibility.

> I *only* had to change my biggest app's DDL (which is generated by some
> Java JPA tool) in order to test DISTRIBUTE BY. But I'm good with 100%
> replication.. for now. In the end I made *zero* changes!

I don't see how this story helps in a production environment.

***************************
### Vladimir Stavrinov
### vst...@gm...
***************************
From: Michael P. <mic...@gm...> - 2012-10-25 00:21:32
On Thu, Oct 25, 2012 at 6:40 AM, Paulo Pires <pj...@ub...> wrote:
> Hi,
>
> Summing up, I've found Postgres-XC to be quite easy to install and
> configure in a 3 coordinators + 3 data-nodes setup (GTM all over them and
> GTM-Proxy handling HA). A little Google and command line did the trick
> in *a couple hours*!
>
> Now, the only downside for me is that Postgres-XC doesn't have a
> built-in way of load-balancing between coordinators. If the coordinator
> your app is pointing to goes down, your app goes down - your application
> can target all of them, but in my experience, your application will
> *always* target one host. So, ATM my solution is:
> 1) Define a DNS FQDN like coordinator.mydomain pointing to an IP
> (i.e., 10.0.0.1)
> 2) Point my app to work with that FQDN
> 3) On every coordinator, configure keepalived with one shared IP
> (10.0.0.1)
> 4) Install haproxy on every coordinator and have it load-balance with
> the other coordinators
>
> This way, keepalived will always choose the first coordinator (based on
> its priority) and then haproxy (running on that machine) will
> load-balance with the others. If this coordinator goes down, the second
> host in the keepalived priority list will replace it, and not only is it
> a valid coordinator, but it will also be able to load-balance with the
> other coordinators.

This looks like a possible solution for achieving load balancing easily at the Coordinator level. You could also publish a small utility for the XC community based on your experience. That is only a suggestion to help the community; please understand that I am not forcing you to publish anything, of course.

> My only doubt is, if you get a data-node offline and then bring it up,
> will the data in that data-node be synchronized?

If a Datanode goes offline for a certain reason, all the transactions that should have run on it will fail at the Coordinator level, so normally there are no worries here about data synchronization. It is btw recommended to have a standby node behind the one that failed, if the Datanode that failed cannot be recovered for one reason or another.

> And that's it. I'm in no way a DB expert and I felt quite confused by
> reading the previous thread. But as a developer, Postgres-XC has been a
> huge upgrade for me. (Now, if only RETURNING was to be implemented, mr.
> Abbas ;-))

+1. Looking forward to seeing this feature ;-o

> Sorry for being a little off-topic, but wanted to share my _little_
> experience with this wonderful piece of software.

Thanks, I am convinced it is helpful for a lot of people.

--
Michael Paquier
http://michael.otacoo.com
From: Michael P. <mic...@gm...> - 2012-10-25 00:13:55
On Thu, Oct 25, 2012 at 5:41 AM, David Hofstee <pg...@c0...> wrote:
> Hi,
>
> I've been reading the 'ERROR: Failed to get pooled connections' thread
> about what XC should and should not do. I opted to start a new thread
> (instead of replying) about how I would like XC to be.
>
> Some background. I work for a SaaS company (mostly dev, some ops) which
> has to be online 24/7. We are now running apache/tomcat/mysql for each
> set of customers on about 30 nodes and we want to centralize and make our
> application more robust, efficient and simple. It basically means
> creating layers: LB, webservers, application servers, database cluster.
> Some easy parts are already done (haproxy, nginx). Our 'platform' is
> pretty complex and I have so many tasks, I prefer to *not* dig into
> details. We are now discussing the db issue (mysql cluster is not that
> great).
>
> My dream DB cluster:
>
> Scalability - that means read and write scalability. XC should do that
> right now. Nice.
>
> High availability - a node can go offline and it should not hinder
> availability (only processing capacity).
>
> Maintainability - Since maintenance/change is our primary cause of
> downtime, it should be possible to kill a node and add it later. This can
> be because the VM is being moved, the OS is updated/upgraded, etc. Also,
> think about how a cluster is updated from major version to major version
> (let's say 9.x to 10.x). Maybe that is not an issue (but I don't know
> about it yet).
>
> Simplicity - It would be nice if the default package+config file is all I
> need. If it is too complex I cannot go on holidays. Some points:
>
> - I read that '...even the stock postgresql.conf configuration file
>   is pretty conservative and users tweak it as per their
>   requirements...'. For me that translates as 'if you are new to
>   Postgres it works badly'. Not simple (for e.g. some of our dev-ers).
> - For HA, '...Like Postgres, you need an external application to
>   provide it'. When using a cluster I think HA is very often wanted. I
>   need to explain all this to every ops-colleague of mine and some are
>   not very accurate. Not simple again.

XC is a fork of Postgres and we try to share the same philosophy as the parent project about being really conservative on the things that should or should not be added in core.

For example, let's take the case of HA. It is of course possible to implement an HA solution directly in the core of XC, but there are 2 things that would go against that:

1) It is not our goal to oblige the users to use an HA solution or another, and I do not believe that it is the role of core people to integrate directly in XC core a solution that might be good for a certain type of application, without caring about the other types of applications. Postgres is popular because it leaves all the users free to use what they want, and depending on the application people want to use with XC, they might prefer one HA solution or another.

2) If in the future Postgres integrates a native HA solution (I do not believe it will be the case as the community is really conservative, but let's assume), and if XC had at some point integrated an HA solution directly in its core, we would certainly have to drop the XC solution and rely on the Postgres solution, as XC is a fork of Postgres. This would be a waste of time for the core people who integrated the HA solution, and for the people merging Postgres code with XC.

One of the reasons explaining that XC is able to keep up with Postgres code pace easily is that we avoid implementing solutions in core that might impact unnecessarily its interactions with Postgres.

> Quick setup - I want to set up an NxM cluster quickly (N times
> duplication for HA, M times distributed writes for performance). I
> prefer to set up a single node with a given config file, add nodes and
> be ready to go. Maybe an hour in case of disaster recovery?

There are already tools for that, like this one written in Ruby:
https://sourceforge.net/projects/postgres-xc/files/misc/pgxc_config_v0_9_3.tar.gz/download
It has not been maintained since 0.9.3, as it is honestly not a part of core. You might have a look at it.

> Manageability - I want to manage a cluster easily (add node, remove node,
> spare nodes, monitoring, ...). It cannot be simple enough.

Sure. I don't know about any utilities able to do that, but if you could build a utility like this running on top of XC and sell it, well, you might be able to make some money if XC becomes popular, which is not really the case now ;)

> Backup - I'm not familiar with running backups on Postgres, but we
> currently run a blocking backup on the mysql, for consistency, and it
> causes issues. We use Bacula on a file level. Which brings up a question:
> How do you back up a cluster (if you don't know which nodes are hot)?

In the case of XC, you might directly take a dump from a Coordinator with pg_dump, and then restore the dump file with pg_restore. You might want to use archive files. There are many ways to accomplish that, like in Postgres. The only difference in the case of XC is that you need to do that for each node, as the architecture is shared-nothing.

> Logging - Yes...
>
> Some may respond that things are not that simple. I know. But I still
> want it to be simple. It would make PGXC a no-brainer for everyone.
> Thanks for listening and keep up the good work! I appreciate it.

There are already utilities implemented for Postgres that can work natively with XC; for logging you might want to use log analyzers like pgbadger. You should have a look at that first for each thing you want to do, then evaluate the effort necessary to achieve each of your goals.

Thanks,
--
Michael Paquier
http://michael.otacoo.com
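A short sketch of the Coordinator-level dump-and-restore path mentioned in the backup answer above. The host, port, and database names are placeholders, not part of the original mail.

    # Logical backup taken through one Coordinator, in custom format...
    pg_dump -h coord1 -p 5432 -F c -f /backup/mydb.dump mydb
    # ...and restored later, e.g. into a rebuilt or replacement cluster.
    pg_restore -h coord1 -p 5432 -d mydb /backup/mydb.dump
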
From: Paulo P. <pj...@ub...> - 2012-10-24 22:27:38
On 10/24/12 11:05 PM, Vladimir Stavrinov wrote:
> On Wed, Oct 24, 2012 at 11:18:59PM +0300, Andrei Martsinchyk wrote:
>> That is the reason to buy the latest iPhone. Some servers run for years
>> without even a reboot. Usually people are replacing servers only if
>> they really need to do that.
>
> What about security patches for the kernel? For years without a reboot?

FYI, there is technology that deprecates the need to reboot a machine following a kernel update, such as ksplice (bought by Oracle a couple of years ago).

> And it is not the only reason to upgrade a kernel. As for replacing, yes,
> it is true, but this moment inevitably comes when new software eats more
> resources while the number of users increases; yet I never heard anybody
> call that a scaling process.
>
>> Nobody upgrades daily. I think it is not a lot of trouble to
>> recreate the cluster once per few years.
>
> Once per few years you can build a totally new system on brand-new
> technology.

I believe you can add new machinery (new coordinators, new data-nodes) and deprecate old hardware. Am I being too simplistic thinking this way? Anyway, changing a cluster's hardware every two years seems overkill to me. But of course, it depends on your app's growth.

> Cluster scalability implies the possibility to scale it at any moment,
> for example (but not only) when new customers or partners come with new
> demand to a fast-paced company with increasing load. It is by design. It
> is exactly what a scalable cluster exists for: you can scale (expand) the
> existing system instead of building a new one.
>
>> Why does it double the hardware park? Multiple components may share the
>> same hardware.
>
> As usual, this is far from reality. It is not a common approach
> acceptable for most companies. What you are talking about looks like an
> approach for clouds or any other service providers where hardware may be
> shared by their customers.
>
>> An HA solution means extra complexity, either external or internal.
>
> But it makes a difference. External should be built and managed by users,
> while internal is a complete and transparent solution provided by the
> authors.

Yes, internal is (supposedly) easier or, as you say, "transparent" - I'd use the word "seamless". But you'll need to learn it and take care of it somehow, the same way you'd do with external solutions such as haproxy or keepalived. I don't think HA/Clustering/LB is for the faint of heart. Either you know what you're doing, or you leave this matter alone! You'll save your sanity in the medium term..

> With MySQL Cluster there is nothing for users to do about HA at all; it
> just already "exists".

I don't understand why you keep citing MySQL as an example. *Don't take me wrong here*, but if you feel it to be the right tool, just go with it and leave the ones who think the same about Postgres-XC alone.

>> There are people out there who do not want that complexity; they
>> are happy with just performance scalability. They could use XC as
>
> Will they be happy with data loss and downtime? Who are they?

Do you know anyone putting up a database cluster without HA/Clustering/LB knowledge? If you do, please ask them to stop.

>> one of those solutions. Everybody wins. If XC integrates one
>> approach it will lose flexibility in this area.
>
> and gain many more users.

If at least this was a "who has more users" competition, that would make sense. The best tools I use in my day-to-day job didn't come easy! I don't agree with you on this, at all.

>> I did not quite understand what you mean here. There are a lot of
>> things important for system design along the whole hardware and
>> software stack. The more that is known to developers, the better the
>> result will be. One may design a database on XC without knowing anything
>> about it at all, with pure SQL, and the database will work. But a much
>> better result can be achieved if the database is designed consciously.
>> Number of nodes does not matter for distribution planning, btw.
>
> Again: all of this is not about transparency. You are talking, perhaps,
> about installing a single application on a fresh XC. But what if you
> install a third-party application on an existing XC already running
> multiple applications? What if those databases are distributed in
> different ways? What if, because of this, you cannot use all nodes for
> the new application? In this case you must rewrite all "CREATE TABLE"
> statements to distribute tables to concrete nodes in a concrete way. In
> this case the developer doesn't help, and that is not what is called
> "transparency."

I *only* had to change my biggest app's DDL (which is generated by some Java JPA tool) in order to test DISTRIBUTE BY. But I'm good with 100% replication.. for now. In the end I made *zero* changes!

--
Paulo Pires
Ubiwhere
From: Vladimir S. <vst...@gm...> - 2012-10-24 22:14:34
On Wed, Oct 24, 2012 at 11:18:59PM +0300, Andrei Martsinchyk wrote:
> I think your test was incorrect. It works.

It is so simple that it is hard to get anything wrong. You can easily reproduce it on 1.0.0 with a simple SELECT request. I will repeat it on 1.0.1 meanwhile.

***************************
### Vladimir Stavrinov
### vst...@gm...
***************************
From: Vladimir S. <vst...@gm...> - 2012-10-24 22:05:28
On Wed, Oct 24, 2012 at 11:18:59PM +0300, Andrei Martsinchyk wrote:
> That is the reason to buy the latest iPhone. Some servers run for years
> without even a reboot. Usually people are replacing servers only if
> they really need to do that.

What about security patches for the kernel? For years without a reboot? And that is not the only reason to upgrade a kernel. As for replacing, yes, it is true, but this moment inevitably comes when new software eats more resources while the number of users increases; yet I never heard anybody call that a scaling process.

> Nobody upgrades daily. I think it is not a lot of trouble to
> recreate the cluster once per few years.

Once per few years you can build a totally new system on brand-new technology. Cluster scalability implies the possibility to scale it at any moment, for example (but not only) when new customers or partners come with new demand to a fast-paced company with increasing load. It is by design. It is exactly what a scalable cluster exists for: you can scale (expand) the existing system instead of building a new one.

> Why does it double the hardware park? Multiple components may share the
> same hardware.

As usual, this is far from reality. It is not a common approach acceptable for most companies. What you are talking about looks like an approach for clouds or any other service providers where hardware may be shared by their customers.

> An HA solution means extra complexity, either external or internal.

But it makes a difference. External should be built and managed by users, while internal is a complete and transparent solution provided by the authors. With MySQL Cluster there is nothing for users to do about HA at all; it just already "exists".

> There are people out there who do not want that complexity; they
> are happy with just performance scalability. They could use XC as

Will they be happy with data loss and downtime? Who are they?

> one of those solutions. Everybody wins. If XC integrates one
> approach it will lose flexibility in this area.

and gain many more users.

> I did not quite understand what you mean here. There are a lot of
> things important for system design along the whole hardware and
> software stack. The more that is known to developers, the better the
> result will be. One may design a database on XC without knowing anything
> about it at all, with pure SQL, and the database will work. But a much
> better result can be achieved if the database is designed consciously.
> Number of nodes does not matter for distribution planning, btw.

Again: all of this is not about transparency. You are talking, perhaps, about installing a single application on a fresh XC. But what if you install a third-party application on an existing XC already running multiple applications? What if those databases are distributed in different ways? What if, because of this, you cannot use all nodes for the new application? In this case you must rewrite all "CREATE TABLE" statements to distribute tables to concrete nodes in a concrete way. In this case the developer doesn't help, and that is not what is called "transparency."

***************************
### Vladimir Stavrinov
### vst...@gm...
***************************
From: Paulo P. <pj...@ub...> - 2012-10-24 21:40:34
Hi,

Summing up, I've found Postgres-XC to be quite easy to install and configure in a 3 coordinators + 3 data-nodes setup (GTM all over them and GTM-Proxy handling HA). A little Google and command line did the trick in *a couple hours*!

Now, the only downside for me is that Postgres-XC doesn't have a built-in way of load-balancing between coordinators. If the coordinator your app is pointing to goes down, your app goes down - your application can target all of them, but in my experience, your application will *always* target one host. So, ATM my solution is:

1) Define a DNS FQDN like coordinator.mydomain pointing to an IP (i.e., 10.0.0.1)
2) Point my app to work with that FQDN
3) On every coordinator, configure keepalived with one shared IP (10.0.0.1)
4) Install haproxy on every coordinator and have it load-balance with the other coordinators

This way, keepalived will always choose the first coordinator (based on its priority) and then haproxy (running on that machine) will load-balance with the others. If this coordinator goes down, the second host in the keepalived priority list will replace it, and not only is it a valid coordinator, but it will also be able to load-balance with the other coordinators.

My only doubt is, if you get a data-node offline and then bring it up, will the data in that data-node be synchronized?

And that's it. I'm in no way a DB expert and I felt quite confused by reading the previous thread. But as a developer, Postgres-XC has been a huge upgrade for me. (Now, if only RETURNING was to be implemented, mr. Abbas ;-)).

Sorry for being a little off-topic, but wanted to share my _little_ experience with this wonderful piece of software.

Cheers,
PP

On 10/24/12 9:41 PM, David Hofstee wrote:
> [...]

--
Paulo Pires
Ubiwhere
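A rough haproxy.cfg fragment matching step 4 of the setup above. The shared IP, coordinator addresses, and names are hypothetical placeholders.

    listen pgxc_coordinators
        bind 10.0.0.1:5432
        mode tcp
        balance roundrobin
        # 'check' drops a coordinator from rotation when its port stops answering
        server coord1 10.0.0.11:5432 check
        server coord2 10.0.0.12:5432 check
        server coord3 10.0.0.13:5432 check
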
From: David H. <pg...@c0...> - 2012-10-24 20:59:50
Hi,

I've been reading the 'ERROR: Failed to get pooled connections' thread about what XC should and should not do. I opted to start a new thread (instead of replying) about how I would like XC to be.

Some background. I work for a SaaS company (mostly dev, some ops) which has to be online 24/7. We are now running apache/tomcat/mysql for each set of customers on about 30 nodes and we want to centralize and make our application more robust, efficient and simple. It basically means creating layers: LB, webservers, application servers, database cluster. Some easy parts are already done (haproxy, nginx). Our 'platform' is pretty complex and I have so many tasks, I prefer to _not_ dig into details. We are now discussing the db issue (mysql cluster is not that great).

My dream DB cluster:

Scalability - that means read and write scalability. XC should do that right now. Nice.

High availability - a node can go offline and it should not hinder availability (only processing capacity).

Maintainability - Since maintenance/change is our primary cause of downtime, it should be possible to kill a node and add it later. This can be because the VM is being moved, the OS is updated/upgraded, etc. Also, think about how a cluster is updated from major version to major version (let's say 9.x to 10.x). Maybe that is not an issue (but I don't know about it yet).

Simplicity - It would be nice if the default package+config file is all I need. If it is too complex I cannot go on holidays. Some points:

* I read that _'...even the stock postgresql.conf configuration file is pretty conservative and users tweak it as per their requirements...'_. For me that translates as 'if you are new to Postgres it works badly'. Not simple (for e.g. some of our dev-ers).
* For HA, _'...Like Postgres, you need an external application to provide it'_. When using a cluster I think HA is very often wanted. I need to explain all this to every ops-colleague of mine and some are not very accurate. Not simple again.

Quick setup - I want to set up an NxM cluster quickly (N times duplication for HA, M times distributed writes for performance). I prefer to set up a single node with a given config file, add nodes and be ready to go. Maybe an hour in case of disaster recovery?

Manageability - I want to manage a cluster easily (add node, remove node, spare nodes, monitoring, ...). It cannot be simple enough.

Backup - I'm not familiar with running backups on Postgres, but we currently run a blocking backup on the mysql, for consistency, and it causes issues. We use Bacula on a file level. Which brings up a question: How do you back up a cluster (if you don't know which nodes are hot)?

Logging - Yes...

Some may respond that things are not that simple. I know. But I still want it to be simple. It would make PGXC a no-brainer for everyone. Thanks for listening and keep up the good work! I appreciate it.

David H.
From: Vladimir S. <vst...@gm...> - 2012-10-24 20:49:28
On Wed, Oct 24, 2012 at 01:00:51PM -0400, Jim Mlodgenski wrote:
> The default will be to distribute by HASH if it has some sort of valid

My congratulations! I thought so too ... before I tested it. But my surprise was to find the same data on every node. Moreover, despite the redundancy, XC stops working if one node fails. But it doesn't matter, because the more important thing is that in any case, for every table, you should choose either read or write scalability, rewriting "CREATE TABLE" accordingly, while MySQL Cluster provides both at the same time for all tables, without any headache about distribution schemas, i.e. all data is replicated and distributed at the same time. The only essential difference that prevents considering MySQL Cluster as an alternative to XC is that, as I mentioned earlier, it is an in-memory database and as such is limited in size, while XC has no such limit.

Though be aware this is all about 1.0.0. I have not tested all of these features against 1.0.1 yet.

***************************
### Vladimir Stavrinov
### vst...@gm...
***************************
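One way to reproduce this kind of check is to count rows per Datanode from a Coordinator. A sketch with made-up node and table names; the EXECUTE DIRECT syntax differs slightly between XC versions.

    -- Create a table without DISTRIBUTE BY so the default applies,
    -- then look at where the rows actually landed.
    CREATE TABLE t (id int, v text);
    INSERT INTO t SELECT g, 'row ' || g FROM generate_series(1, 1000) g;

    EXECUTE DIRECT ON (dn1) 'SELECT count(*) FROM t';
    EXECUTE DIRECT ON (dn2) 'SELECT count(*) FROM t';
    -- Roughly half the rows on each node means hash distribution;
    -- 1000 rows on every node means the table is replicated.
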