From: Paulo P. <pj...@ub...> - 2012-10-31 14:26:12

Can't wait to get my hands on the yet-to-be-finished-and-opened-to-the-world new features! It was a good read, thanks!

PP

On 31/10/12 05:25, Koichi Suzuki wrote:
> Hi,
>
> My presentation material at PGconf.EU 2012 is available at
> http://wiki.postgresql.org/images/4/44/Pgxc_HA_20121024.pdf
> or
> http://wiki.postgresql.org/wiki/PostgreSQL_Conference_Europe_Talks_2012
>
> Enjoy.
> ----------
> Koichi Suzuki

--
Paulo Pires
From: Michael P. <mic...@gm...> - 2012-10-31 05:48:00

On Wed, Oct 31, 2012 at 2:25 PM, Koichi Suzuki <koi...@gm...> wrote:
> Hi,
>
> My presentation material at PGconf.EU 2012 is available at
> http://wiki.postgresql.org/images/4/44/Pgxc_HA_20121024.pdf
> or
> http://wiki.postgresql.org/wiki/PostgreSQL_Conference_Europe_Talks_2012

Thx.

--
Michael Paquier
http://michael.otacoo.com
From: Koichi S. <koi...@gm...> - 2012-10-31 05:26:04

Hi,

My presentation material at PGconf.EU 2012 is available at
http://wiki.postgresql.org/images/4/44/Pgxc_HA_20121024.pdf
or
http://wiki.postgresql.org/wiki/PostgreSQL_Conference_Europe_Talks_2012

Enjoy.
----------
Koichi Suzuki
From: Mason S. <ma...@st...> - 2012-10-29 17:38:26

On Fri, Oct 26, 2012 at 12:50 PM, Vladimir Stavrinov <vst...@gm...> wrote:
> On Fri, Oct 26, 2012 at 11:36:24AM -0400, Mason Sharp wrote:
>
>> > It is not the same. What about write? Then You should teach Your
>> > application where to read and where to write. What about transparency?
>>
>> You can get more write scalability with more data nodes.
>
> How it is related with cite above? It is not about scalability, it is
> about transparency. No matter how much nodes for writes we have. But if
> You have separate points for read and write, then You loose transparency.

In XC you just write to a coordinator; it interacts with the data nodes where the actual user data is.

>> It took years until PostgreSQL itself had built-in replication.
>
> I don't blame You don't implement something yet. We have something other
> point of controversy here. The discussion in these two close threads is
> about what "must have" and what should not. It is about priority and
> concept.
>
>> I think most on here also feel strongly about having HA,
>
> But this discussion shows something else. All my opponents (and I still
> saw no one supporter) here very strongly insists, that HA is not high
> priority for XC or should not be implemented in the core. But my
> endlessly repeated question "Who wants cluster without HA?" got never
> answered in any form. So it is become mystery question.

I think everyone is trying to be helpful, answer questions and provide opinions. It is not that HA is not possible; HA can be provided outside of the core and has been done by multiple people in different ways. There are probably varying opinions on how best to do this, but perhaps the next step is to come up with best practices with the current state of the code.

>> it is just not built into the core at present.
>
> "at present" means it will built or possible in the future. Good news.
> It is first ray of light in the darkness or the light at the end of the
> tunnel. And it is more important that it come from "Architect and
> development leader" of XC. Though, it is characteristically, I am not
> surprised, something like this I expected. Thanks.

More than one person contributed to the original architecture. The project leader of Postgres-XC is Koichi Suzuki.

> --
> ***************************
> ## Vladimir Stavrinov
> ## vst...@gm...
> ***************************

--
Mason Sharp
StormDB - http://www.stormdb.com
The Database Cloud
Postgres-XC Support and Services
From: Vladimir S. <vst...@gm...> - 2012-10-29 17:38:25

On Sat, Oct 27, 2012 at 04:23:12PM -0400, Mason Sharp wrote:
> In XC you just write to a coordinator, it interacts with the data
> nodes where the actual user data is.

Right. But we are talking here about the standby, do You remember? Would You say it is accessible via a coordinator too?

> opinions. It is not that HA is not possible; HA can be provided
> outside of the core and has been done by multiple people in different
> ways. There are probably varying opinions on how best to do this, but

I didn't mean "HA is not possible". But if You say "HA is not implemented in core at present", that means it may be implemented there in the future. And if the question "Who wants XC without HA?" gets no answer because no XC is used without HA, i.e. it is a "must have", then the next question should be: "Why should HA be outside XC?"

> More than one person contributed to the original architecture. The
> project leader of Postgres-XC is Koichi Suzuki.

I know. My comment here was not about architecture. It is about people's mentality.

***************************
### Vladimir Stavrinov
### vst...@gm...
***************************
From: Vladimir S. <vst...@gm...> - 2012-10-29 11:56:29

On Sun, Oct 28, 2012 at 5:35 AM, Shavais Zarathustra <sh...@gm...> wrote:
> Well, the point would be to get a replacement server going, for the server
> that died, with all the software installed and the configuration set up,
> after which my hope has been that we'd be able to reinitialize the database
> on that host and perform some kind of recovery process to get it back up and
> working within the cluster. But maybe that requires some of the HA features
> that you're talking about that XC doesn't have working yet?

With HA there will be no down time, so You will have enough time to recover the failed node. Without HA You would have to recreate the cluster from scratch from backup. In both cases a virtual machine does not help much.

> clustering stuff, together with Oracle's database clustering, which was all

I heard a story where a whole bank crashed on RAC. Even HA did not help.

> of a brave new/old world for me, with all this poor man's Open Source stuff,

"poor man's"? Great!

> Well, the hardware they have at these pseudo-cloud datacenters is all

What You are describing here and below is cloud infrastructure that itself has the scalability and HA that a cluster must have too. So why do You want one inside the other? You lose efficiency and money.

>> logs should be handled on every node, it is not so simple.
>
> Yeah, I was thinking this was probably the case. So what I'm not sure of is
> what you do after your datanode has been recovered as far as you can get it
> recovered using the usual single database recovery techniques - how do you

Without HA, at this point the down time starts again. And if You succeed in recovering to some point in time where this node is consistent with the cluster, then You will be happy; otherwise You will recreate Your cluster from scratch from backup again.

> Unix Admin "is only as good as their backups". That's certainly the truth.

No doubt, definitely! Backup always and everywhere. But with backup You can only recover Your system to some point in the past. So you get both joys in this case too: down time and data loss. Backup is not an alternative to HA, and vice versa: we need them both.

> But I'm not concerned about the security of my DBA role, in fact I've been

One developer boasted to me how he could make a database user become the unix root user and shut down the system. The answer to my horror was something similar to what we are reading here: security becomes the victim of speed. And it was a very serious and responsible institution where this database was running.

> need a throat to cut before I can cut it. The risk of a crash is small and
> tolerable, but if I'm not convinced I'll be able to handle the load - that's
> a show stopper.

I don't understand such philosophy. Look into a data center: it is filled with lots of rack mount servers, either owned or rented by customers. Most of them have neither scalability nor HA, and they are happy with it. But this is quite another story: they have no cluster at all. If You need a cluster, it means You are doing something that requires HA. What data are You processing that requires scalability? Is it garbage You are willing to lose? What are those business processes that create Your heavy load? Are they nonsense that can tolerate down time? Please tell me, do You have a cluster that is running without HA? Or do you know of such?

> But it's seems like, for the most part, the important scalability features
> are there at this point, right?

I hope it is true.

> So very next on the list I would think would be HA.

Totally endorse.

> And it sounds like the XC devs are working fairly feverishly on it.

Based on what they are writing here, I am not sure about this.
From: Nikhil S. <ni...@st...> - 2012-10-28 06:03:23

> Without HA, we might someday go out of business - without some claim to
> scalability, we can't get into business to begin with.

Rightly said.

> Can you (1) do a full dump, then (2) kill, drop and rebuild the
> cluster, and then (3) restore the entire cluster using pg_restore (or psql
> .. < dumpfile) through a coordinator? This would be a last resort,
> obviously, since I'd lose all the data on every datanode since the last
> full dump, but if I know I can do that, at least I know I have that option.

Yeah, this is possible and it works. Once I tested this by modifying pg_hba.conf to disallow all application connections. Then I did a dump of global objects like users, roles etc., followed by dumps of each of the databases that I had in my XC cluster. One can also use pg_dumpall if the current cluster size is not too large. And all this was done by pointing to a single coordinator, yet everything was consistent at the cluster level. A subsequent pg_restore/psql to populate a new cluster worked pretty well.

> as long as I'm going through a coordinator, the effects of the sql
> statements should replicate or distribute across the cluster depending on
> how the table was set up.. right?

Yes, the above is correct.

> I can probably create a temporary table on a single datanode, through a
> coordinator, just by telling it to distribute that table, and only list the
> one datanode I want it on, right? Then I can do a data-only restore, of
> just that table, then from there I can use it through a coordinator, and
> affect whatever other tables I need to.

Yeah, the above is very much possible. See Michael's blog on this:
http://michael.otacoo.com/postgresql-2/pgxc-data-distribution-in-a-subset-of-nodes/

> Hmm, so I wonder what I actually would do if a datanode went down, or if
> the gtm server went down.

The GTM server can be configured with a GTM standby, and you can fail over to it in case of issues with the GTM. A datanode can be configured with synchronous/asynchronous replicas, and again one can fail over to them in case of issues. Using an HA framework and some PGXC tools that are being worked on will help automate this in the coming days.

HTH,
Nikhils

--
StormDB - http://www.stormdb.com
The Database Cloud
Postgres-XC Support and Service
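The whole-cluster dump/restore procedure Nikhil describes can be sketched roughly as below. This is an illustrative outline only, not a command sequence from the thread: the host names (`coord1`, `newcoord1`), port, and database names are placeholders.

```shell
#!/bin/sh
# Sketch: consistent dump of an XC cluster through one coordinator,
# then restore into a freshly initialized cluster. Assumes application
# connections have already been blocked via pg_hba.conf, as Nikhil did.
COORD=coord1        # any one coordinator (placeholder host)
NEWCOORD=newcoord1  # coordinator of the new cluster (placeholder host)
PORT=5432

# 1. Dump global objects (roles, tablespaces) once.
pg_dumpall -h "$COORD" -p "$PORT" --globals-only > globals.sql

# 2. Dump each database through the same coordinator so the result
#    is consistent at the cluster level.
for db in appdb reportsdb; do
    pg_dump -h "$COORD" -p "$PORT" -Fc "$db" > "$db.dump"
done

# 3. Restore via a single coordinator of the new cluster; the
#    coordinator redistributes rows across datanodes on load.
psql -h "$NEWCOORD" -p "$PORT" -f globals.sql postgres
for db in appdb reportsdb; do
    createdb   -h "$NEWCOORD" -p "$PORT" "$db"
    pg_restore -h "$NEWCOORD" -p "$PORT" -d "$db" "$db.dump"
done
```

As a smaller-scale alternative, a single `pg_dumpall` against one coordinator covers both steps 1 and 2 when the cluster is not too large.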
From: Shavais Z. <sh...@gm...> - 2012-10-28 01:36:06

On Fri, Oct 26, 2012 at 4:01 PM, Vladimir Stavrinov <vst...@gm...> wrote:
> On Fri, Oct 26, 2012 at 12:36:25PM -0700, Roger Mayes wrote:
>
>> they ever were to, I can press a button and have another server up,
>> of the exact same configuration, in minutes flat, restored,
>> potentially, to an image created the previous night. We can
>
> This is good for standalone system, but with cluster those images of all
> nodes should be synchronized.

Well, the point would be to get a replacement server going, for the server that died, with all the software installed and the configuration set up, after which my hope has been that we'd be able to reinitialize the database on that host and perform some kind of recovery process to get it back up and working within the cluster. But maybe that requires some of the HA features that you're talking about that XC doesn't have working yet?

In the 90's I was a "master's certified" Oracle DBA / Unix Admin, with special training on "tuning the whole system", blah blah blah. I set up and managed various HA clusters, like HP's MC ServiceGuard and Sequent's HA clustering stuff, together with Oracle's database clustering, which was all based on shared SCSI, and FDDI and all that. Ancient history, yes, but I'm not completely without any clue about HA concepts in general; it's just kind of a brave new/old world for me, with all this poor man's Open Source stuff, and with rented cloud servers and so on.

> More over cluster inside virtual machines is something exotic. If all
> Your nodes are running on the same hardware host, what for You need
> cluster?

Well, the hardware they have at these pseudo-cloud datacenters is all sitting on a combination of copper and fiber backplanes that connect a number of CPUs, drives, power supplies, memory boards, etc. They try to eliminate single points of failure except for the backplanes; they can do more-or-less transparent hot swapping of drives, power supplies, maybe even CPU boards, depending on which phase of their whole setup your hosts are running on. (They're always upgrading and moving their stuff forward, and once in a while they'll coordinate with you to move your hosts off of older hardware onto their newer stuff.)

We have QOS guarantees for a certain amount of processing capacity, IO bandwidth, and network bandwidth and so on per host, depending on how we configure them, and our experience so far with these virtual hosts has been that the TPS and bandwidth levels are pretty consistent. It's measurable; we occasionally run some (sometimes very quick and dirty) benchmarks to make sure, because once or twice (in 3 years) we've caught them in a misconfiguration in their equipment, after some maintenance they did, that was limiting us incorrectly. But they have a maximum amount of processing, memory, and IO bandwidth that they can give us per host, and sometimes we need more than that in order to handle the response to something like, say, a Facebook post by Taylor Swift. (Boom - millions of hits in the space of a few minutes. Unbelievable. Quite a spectacular thing to watch.)

[ Can you imagine being so famous that with a one-line Facebook wall post, you can single-handedly crash (well, load down to the point of being essentially halted) any single-host setup? hahaha jeezuz h. It's a good thing I'm not, I think it would Drive Me Nutz. Look out, don't have a bad facial expression on for 1 second where any camera might happen to catch you. Millions of teenagers will have their idyllic image of you shattered. The men in white coats would come for me after about a week. (They probably should have long since dragged me away, actually, but thankfully I'm not famous enough for them to know it. I'm whistling dixie at the moon, waving my arms maniacally and dancing by under their radar, lol. http://www.youtube.com/watch?v=GUfS8LyeUyM http://www.youtube.com/watch?v=5yO_P0ZmuBc http://www.youtube.com/watch?v=uq-gYOrU8bA YouTube. Ok, it's awash in visual, semantic and synaptic spam, but it's infinitely better than the old juke box, and for practically no cost at all. We truly do live in the "age of miracles and wonders". Hey, what better place than here to share bizarre play lists with a select group of complete strangers? I remind you of a song? Bring it on.) ]

Anyway, ideally what we'd like is a nominal setup that we run most of the time, which we can expand as quickly and as transparently as possible to a much larger setup that we run for a few hours or a few days at a time, and then shrink back to our "business as usual" setup. But if we have to, we can have a bunch of nominally configured hosts that we can scale up (in terms of the number of CPUs, the memory size, the network and IO bandwidth, etc.) in advance of a marketing blitz (i.e., one Taylor Swift Facebook post, lol), and then back down afterwards, and if it costs us an hour or so of down time in each direction, that's not too big of a deal for us.

> Run standalone system without virtual machine and You've got
> more capacity. Or simply pay for more resources for single virtual
> machine. The same if Your XC nodes are running on different hardware
> nodes: You will use small part of its resources for which You have paid.
> What for do You need virtual machines there? They are needed for Your
> provider to share resources with his customers, but not for You.
>
> I am running XC on virtual machines but for testing and debugging only.
>
>> HA is important for us, scalability is definitely the more pressing
>> matter: Without HA, we might someday go out of business - without
>> some claim to scalability, we can't get into business to begin
>> with.
>
> I really enjoy with Your maxim. Your philosophy is applicable for
> everything existing in this wonderful world. We can't loose something we
> don't have. So first we want to have, then we don't want to loose. And
> it is not about priorities, it is about "be or not to be". There even
> nice song exists about this. But what You wrote below proves: You need
> HA first of all.
>
>> Can you do log shipping, hot backups, and recover a cluster to a
>> point in time? If not what is the quickest/best backup/recovery
>> procedure? Whatever it is, it is something I'll need to get
>> scripted and working (I mean I'll write
>
> logs should be handled on every node, it is not so simple.

Yeah, I was thinking this was probably the case. So what I'm not sure of is what you do after your datanode has been recovered as far as you can get it recovered using the usual single-database recovery techniques - how do you then get it back into the cluster, and get it up to speed and running in the cluster, etc. I'm imagining it can be done? Hopefully it's just a matter of reading some docs and doing some experimentation.

But I don't think that proves we need HA more than we need scalability, or even as much. We can launch without any working recovery plan if we have to. But it does us no good to launch if we can't handle the load. If I'm wearing my old DBA hat, I know I'm slitting my throat saying that - a DBA / Unix Admin "is only as good as their backups". That's certainly the truth. But I'm not concerned about the security of my DBA role; in fact I've been trying hard to cast it off for ages, actually. But as a business man - I need a throat to cut before I can cut it. The risk of a crash is small and tolerable, but if I'm not convinced I'll be able to handle the load - that's a show stopper.

But it seems like, for the most part, the important scalability features are there at this point, right? So the very next thing on the list, I would think, would be HA. And it sounds like the XC devs are working fairly feverishly on it.
From: Vladimir S. <vst...@gm...> - 2012-10-26 23:01:18

On Fri, Oct 26, 2012 at 12:36:25PM -0700, Roger Mayes wrote:
> they ever were to, I can press a button and have another server up,
> of the exact same configuration, in minutes flat, restored,
> potentially, to an image created the previous night. We can

This is good for a standalone system, but with a cluster those images of all nodes should be synchronized. Moreover, a cluster inside virtual machines is something exotic. If all Your nodes are running on the same hardware host, what do You need a cluster for? Run a standalone system without a virtual machine and You've got more capacity. Or simply pay for more resources for a single virtual machine. The same applies if Your XC nodes are running on different hardware nodes: You will use only a small part of the resources for which You have paid. What do You need virtual machines there for? They are needed for Your provider to share resources with his customers, but not for You. I am running XC on virtual machines, but for testing and debugging only.

> HA is important for us, scalability is definitely the more pressing
> matter: Without HA, we might someday go out of business - without
> some claim to scalability, we can't get into business to begin
> with.

I really enjoy Your maxim. Your philosophy is applicable to everything existing in this wonderful world. We can't lose something we don't have. So first we want to have, then we don't want to lose. And it is not about priorities, it is about "to be or not to be". There is even a nice song about this. But what You wrote below proves: You need HA first of all.

> Can you do log shipping, hot backups, and recover a cluster to a
> point in time? If not what is the quickest/best backup/recovery
> procedure? Whatever it is, it is something I'll need to get
> scripted and working (I mean I'll write

Logs should be handled on every node; it is not so simple.

> Can you (1) do a full dump, then (2) kill, drop and rebuild the
> cluster, and then (3) restore the entire cluster using pg_restore
> (or psql .. < dumpfile) through a coordinator? This would be a

Yes; without this it would become inviable. The only questions are the size of the database and the level of load. Depending on these, the backup procedure may impact production operation. I do such things by piping the dump directly over the network, without any temporary files on intermediate storage.

> found myself doing things with dump files, like pulling pieces of
> them into temporary tables and then deleting and/or updating rows
> in existing tables based on the data in the dump-loaded temporary
> tables. I'm imagining that these general recovery scenarios are
> not particularly complicated by the fact that I'm working with an
> xc cluster - as long as I'm going through a coordinator, the
> effects of the sql statements should replicate or distribute across
> the cluster depending on how the table was set up.. right?

You are right. This is usual data manipulation with SQL, and as long as XC's claim of transparency holds, it should work.

> I can probably create a temporary table on a single datanode,
> through a coordinator, just by telling it to distribute that table,
> and only list the one datanode I want it on, right? Then I can do

No difference, sure.

> Hmm, so I wonder what I actually would do if a datanode went down,
> or if the gtm server went down. Obviously I wouldn't want to lose
> all the data in the other nodes. I wonder how complicated it would
> be and how long it would take

At this point HA comes to help. But it is not an alternative to backup. We should always do usual backups.

> to get things back up and running again. I guess I'd better
> familiarize myself with the docs on backup and recovery.
> Are they up to date with pgxc 1.0.1?

Yes, You can see this at the top of every page. But the name of the link pointing to it is wrong.

***************************
### Vladimir Stavrinov
### vst...@gm...
***************************
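The "piping the dump directly over the network" approach Vladimir mentions can look roughly like the following. This is a generic sketch, not a command from the thread; the host names (`coord1`, `newcoord1`, `backuphost`), database name, and backup path are placeholders.

```shell
#!/bin/sh
# Sketch: dump a database through a coordinator and stream it elsewhere
# without any intermediate file on local storage.

# Stream a compressed plain-text dump to a remote backup host:
pg_dump -h coord1 -p 5432 appdb | gzip | \
    ssh backuphost 'cat > /backup/appdb.sql.gz'

# Or clone the database straight into another cluster in one pipeline,
# letting the target coordinator distribute rows across its datanodes:
pg_dump -h coord1 -p 5432 appdb | psql -h newcoord1 -p 5432 appdb
```

The trade-off he alludes to applies here: while the dump is running, the source coordinator carries the extra read load, so on a busy cluster this is best scheduled off-peak.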
From: Roger M. <rog...@gm...> - 2012-10-26 19:36:34

While catching up with this thread I've been thinking about what the lack of current HA capability means for my company's project. The virtual cloud server hosting we use is quite robust; they almost never have any visible hardware problems, and if they ever were to, I can press a button and have another server up, of the exact same configuration, in minutes flat, restored, potentially, to an image created the previous night. We can probably have 99% up time without HA, so while HA is important for us, scalability is definitely the more pressing matter: Without HA, we might someday go out of business - without some claim to scalability, we can't get into business to begin with.

I guess for us the important thing is to be able to recover quickly in the (hopefully very unlikely) event of a crash, and to be able to restore data from daily exports if someone accidentally deletes or changes something important. So here are my questions -

Can you do log shipping, hot backups, and recover a cluster to a point in time? If not, what is the quickest/best backup/recovery procedure? Whatever it is, it is something I'll need to get scripted and working (I mean I'll write scripts and test and debug them) for us, at some point before we go live.

Can you (1) do a full dump, then (2) kill, drop and rebuild the cluster, and then (3) restore the entire cluster using pg_restore (or psql .. < dumpfile) through a coordinator? This would be a last resort, obviously, since I'd lose all the data on every datanode since the last full dump, but if I know I can do that, at least I know I have that option.

Sometimes I've found myself doing things with dump files, like pulling pieces of them into temporary tables and then deleting and/or updating rows in existing tables based on the data in the dump-loaded temporary tables. I'm imagining that these general recovery scenarios are not particularly complicated by the fact that I'm working with an XC cluster - as long as I'm going through a coordinator, the effects of the SQL statements should replicate or distribute across the cluster depending on how the table was set up.. right?

I can probably create a temporary table on a single datanode, through a coordinator, just by telling it to distribute that table, and only list the one datanode I want it on, right? Then I can do a data-only restore of just that table, then from there I can use it through a coordinator, and affect whatever other tables I need to.

Hmm, so I wonder what I actually would do if a datanode went down, or if the GTM server went down. Obviously I wouldn't want to lose all the data in the other nodes. I wonder how complicated it would be and how long it would take to get things back up and running again. I guess I'd better familiarize myself with the docs on backup and recovery. Are they up to date with pgxc 1.0.1?

-rm

On Fri, Oct 26, 2012 at 10:54 AM, David Hofstee <pg...@c0...> wrote:
> 1. No cluster without HA option; I agree.
> 2. Integrate XC into PG; In the future I would like to think of a
> single PG instance as a 1-node cluster-able db.
>
> I think PGXC is the best thing that is happening. PGXC deserves to be the
> most usable in the world too (instead of mysql). Gtx,
>
> David
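Roger's idea of pinning a table to a single datanode through a coordinator (which Nikhil later confirms) might be expressed with Postgres-XC's `DISTRIBUTE BY ... TO NODE` clause, along the following lines. This is only an illustrative sketch: the host, database, table, and node names are invented, and the exact `TO NODE` spelling should be checked against the CREATE TABLE page of the specific Postgres-XC release in use.

```shell
#!/bin/sh
# Sketch: create a staging table that lives on one datanode only,
# via any coordinator, then load it with a data-only restore.
psql -h coord1 -p 5432 appdb <<'SQL'
-- Table exists only on datanode1; still queryable through any
-- coordinator like a normal table.
CREATE TABLE restore_staging (
    id   integer,
    data text
) DISTRIBUTE BY ROUND ROBIN TO NODE datanode1;
SQL

# Then a data-only restore of just that table into it (archive format
# dump assumed; file name is a placeholder):
pg_restore -h coord1 -p 5432 -d appdb --data-only -t restore_staging appdb.dump
```

From there, ordinary `INSERT ... SELECT` / `UPDATE ... FROM` statements issued through a coordinator can repair the real (distributed or replicated) tables, exactly as in the single-instance workflow Roger describes.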
From: David H. <pg...@c0...> - 2012-10-26 17:54:35
|
1. No cluster without HA option; I agree.
2. Integrate XC into PG; in the future I would like to think of a single PG instance as a 1-node cluster-able DB.

I think PGXC is the best thing that is happening. PGXC deserves to be the most usable in the world too (instead of mysql). Gtx, David Vladimir Stavrinov schreef op 2012-10-26 14:46: > On Thu, Oct 25, 2012 at 4:13 AM, Michael Paquier > <mic...@gm...> wrote: > >> 1) It is not our goal to oblige the users to user an HA solution or another, > > Sounds fine. Where are those users? Who wants cluster without HA? > Everybody when hears word "cluster" implies "HA" > >> Postgres code with XC. One of the reasons explaining that XC is able to keep up with Postgres code pace easily is that we avoid to implement solutions in core that might impact unnecessarily its interactions with Postgres. > > You are heroes. How long You can continue "code pace" on this hard > way? This paradigm prevents You do not implement not only HA but lot > of other things that is necessary for cluster. I never saw this type > of fork. I believe at some point You will either become a part of > Postgres or totally come off and go Your own way. The only question > is when? And best answer is "right now". |
From: Vladimir S. <vst...@gm...> - 2012-10-26 16:50:42
|
On Fri, Oct 26, 2012 at 11:36:24AM -0400, Mason Sharp wrote: > > It is not the same. What about write? Then You should teach Your > > application where to read and where to write. What about transparency? > > You can get more write scalability with more data nodes. How is that related to the quote above? It is not about scalability, it is about transparency. It does not matter how many nodes we have for writes. But if You have separate points for read and write, then You lose transparency. > It took years until PostgreSQL itself had built-in replication. I don't blame You for not implementing something yet. We have a somewhat different point of controversy here. The discussion in these two closely related threads is about what is a "must have" and what is not. It is about priority and concept. > I think most on here also feel strongly about having HA, But this discussion shows something else. All my opponents here (and I have still not seen one supporter) insist very strongly that HA is not a high priority for XC, or should not be implemented in the core. But my endlessly repeated question "Who wants a cluster without HA?" has never been answered in any form. So it has become a mystery question. > it is just not built into the core at present. "At present" means it will be built or become possible in the future. Good news. It is the first ray of light in the darkness, the light at the end of the tunnel. And it is all the more important that it comes from the "Architect and development leader" of XC. Though, characteristically, I am not surprised; something like this I expected. Thanks. -- *************************** ## Vladimir Stavrinov ## vst...@gm... *************************** |
From: Mason S. <ma...@st...> - 2012-10-26 15:36:56
|
On Fri, Oct 26, 2012 at 10:59 AM, Vladimir Stavrinov <vst...@gm...> wrote: > On Fri, Oct 26, 2012 at 08:03:54PM +0530, Nikhil Sontakke wrote: > >> Using a standby is not an external solution. I wrote that PGXC is >> an ongoing project and will surely use them for reads > > It is not the same. What about write? Then You should teach Your > application where to read and where to write. What about transparency? You can get more write scalability with more data nodes. The project has made a lot of progress since its inception and for a long time a lot of the focus needed to be on just making sure the core database worked. :-) I suspect read balancing for data node standbys will happen at some point. As for HA, I think on this mailing list different ideas were discussed to achieve HA. Koichi Suzuki just did a presentation this week in Prague about HA. Someone else mentioned they are using pgbouncer. We are using Corosync/Pacemaker and have two synchronous replicas per data node. It took years until PostgreSQL itself had built-in replication. Perhaps hooks will be made within XC to help with HA, perhaps there will be an add-on that can be downloaded, perhaps basic data node failover from coordinators could be done (but other components would need failover too). I think most on here also feel strongly about having HA, it is just not built into the core at present. -- Mason Sharp StormDB - http://www.stormdb.com The Database Cloud Postgres-XC Support and Services |
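Mason's setup above (two synchronous replicas per datanode, driven by Corosync/Pacemaker) rests on stock streaming replication. Below is a hedged sketch of the datanode-side settings; the standby names, hosts, and port are hypothetical, and since XC 1.0 tracks 9.1-era PostgreSQL, note that only the highest-priority connected standby listed in `synchronous_standby_names` is synchronous at any moment - the second acts as a fallback, not a second simultaneous sync replica.

```
# postgresql.conf on the datanode master (names are hypothetical)
wal_level = hot_standby
max_wal_senders = 3
synchronous_commit = on
synchronous_standby_names = 'dn1_rep1, dn1_rep2'

# recovery.conf on each replica, pointing back at the master
standby_mode = 'on'
primary_conninfo = 'host=dn1-master port=15432 application_name=dn1_rep1'
```

Pacemaker's job is then only promotion and IP failover; the replication itself is plain Postgres.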
From: Vladimir S. <vst...@gm...> - 2012-10-26 14:59:40
|
On Fri, Oct 26, 2012 at 08:03:54PM +0530, Nikhil Sontakke wrote: > Using a standby is not an external solution. I wrote that PGXC is > an ongoing project and will surely use them for reads It is not the same. What about writes? Then You should teach Your application where to read and where to write. What about transparency? No, a standby is a crutch, a surrogate. It is not for a cluster, it is for a standalone database. If You have a cluster, You don't need a standby. A cluster is the best replacement for a standby. -- *************************** ## Vladimir Stavrinov ## vst...@gm... *************************** |
From: Nikhil S. <ni...@st...> - 2012-10-26 14:34:24
|
> > It looks like you like the Mysql Cluster product a lot and are > > trying to force fit PGXC within its parameters. So going back to > > your favorite Mysql Cluster product. The group has to contain at > > Again! I have wrote a lot here about mysql. Please, read before write. > I don't like mysql, but even enemies may be right. Where this > intolerant fanaticism from here where I expect intelligent people? > Especially I like "Go away!". You don't want discussion? Then what are > You doing here? > > Calm down Vladimir. No one is demonstrating any fanaticism here. I was just pointing out that the group in a Mysql cluster appears similar to a PG server and its replica, just that :) > > least two nodes for redundancy, right? Why is that ok and having a > > replica not ok or not similar? PGXC can/will certainly provide read > > The difference You are pointed out on Yourself in Your next sentences: In > an cluster all redundant data are under work load, they are available for > read and write, i.e. they are working. But in any external solutions > they are sleeping. > > Using a standby is not an external solution. I wrote that PGXC is an ongoing project and will surely use them for reads in the future. Why is that difficult to understand? Anyways, I guess enough on this from my side. Regards, Nikhils -- StormDB - http://www.stormdb.com The Database Cloud Postgres-XC Support and Service |
From: Vladimir S. <vst...@gm...> - 2012-10-26 14:22:15
|
On Fri, Oct 26, 2012 at 05:52:33PM +0530, Nikhil Sontakke wrote: > Why is that a problem? The standby can run on nodes which are part > of your cluster itself. It has already been answered many times: because it means extra management, extra resources, and dead weight. > HA for datanodes can be provided by using standby nodes as well. The same as above. > It looks like you like the Mysql Cluster product a lot and are > trying to force fit PGXC within its parameters. So going back to > your favorite Mysql Cluster product. The group has to contain at Again! I have written a lot here about mysql. Please read before You write. I don't like mysql, but even enemies may be right. Where does this intolerant fanaticism come from, here where I expect intelligent people? Especially I like "Go away!". You don't want discussion? Then what are You doing here? > least two nodes for redundancy, right? Why is that ok and having a > replica not ok or not similar? PGXC can/will certainly provide read The difference You have pointed out Yourself in Your next sentences: in a cluster all redundant data are under work load, they are available for read and write, i.e. they are working. But in any external solution they are sleeping. > capabilities from replicas in coming versions. -- *************************** ## Vladimir Stavrinov ## vst...@gm... *************************** |
From: Joseph G. <jos...@or...> - 2012-10-26 14:04:13
|
On 26 October 2012 23:06, Nikhil Sontakke <ni...@st...> wrote: > >> >> But where is war? It is simply question. With low priority You have no >> neither knowledge nor HA itself. But if every XC accompanied with HA >> then it is high priority. And question is what is true here? >> > > Vladimir, I guess you are getting the impression that PGXC has de-emphasized > HA, that's certainly not the case. > > For a distributed database, the HA aspects are really important. As you have > mentioned elsewhere there needs to be a solution in place with something > like CoroSync/PaceMaker and it's been looked into. > > Regards, > Nikhils > -- > StormDB - http://www.stormdb.com > The Database Cloud > Postgres-XC Support and Service > For those interested, I have been playing with something similar (you can probably see my previous discussion on the list). I have been building a prototype using external scripting that allows PG-XC to use the built-in streaming replication to provide HA for datanodes. This has great HA properties, but it can't currently distribute read queries to the slaves nicely. I have been evaluating how to do this, but after looking at the GTM etc. I have decided it's beyond my limited knowledge of PG/PG-XC for now. The basic setup uses pgbouncer in front of PG-XC on a virtual IP, so the path a query takes looks something like this: virtual-ip -> pgbouncer primary -> coordinators -> virtual-ip -> datanode master. 
The virtual IP in front of the datanode pair fails over automatically, and repmgr then instructs the slave to become writeable. There is also a secondary pgbouncer server that fails over automatically too; this allows clients to just reconnect when anything bad happens. This causes a very slight service disruption but overall is pretty OK... considering that for anything to happen a physical server has to fail. Ideally I would like to integrate the failover detection and management into the coordinator cluster, along with being able to service read queries from my datanode slaves. However, I am quite happy with this setup and am able to scale write capacity with ease with a fully HA setup (minus a disconnect on something bad happening, which is OK). Joseph. -- CTO | Orion Virtualisation Solutions | www.orionvm.com.au Phone: 1300 56 99 52 | Mobile: 0428 754 846 |
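The query path Joseph describes (virtual-ip -> pgbouncer -> coordinators -> virtual-ip -> datanode master) can be sketched as a pgbouncer configuration fragment. All host names, the virtual IP, and the database name here are hypothetical, not taken from his setup; note also that a single pgbouncer database entry points at one coordinator host, so spreading load across several coordinators still needs something in front of them (DNS round-robin, a TCP balancer, or per-application entries).

```ini
; pgbouncer.ini on the host holding the client-facing virtual IP.
; A standby pgbouncer host takes over the VIP on failure (e.g. via
; Pacemaker), so clients only need to reconnect.
[databases]
mydb = host=coord1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 10.0.0.10   ; the virtual IP (hypothetical)
listen_port = 6432
auth_type = md5
pool_mode = session       ; coordinators expect session-level state
```

Clients then connect to `10.0.0.10:6432` regardless of which physical pgbouncer or coordinator is actually serving them.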
From: Vladimir S. <vst...@gm...> - 2012-10-26 13:42:50
|
On Fri, Oct 26, 2012 at 05:36:52PM +0530, Nikhil Sontakke wrote: > Vladimir, I guess you are getting the impression that PGXC has > de-emphasized HA, that's certainly not the case. Very interesting! Everybody here tries to convince me of that, and yet it is not just an impression. -- *************************** ## Vladimir Stavrinov ## vst...@gm... *************************** |
From: Vladimir S. <vst...@gm...> - 2012-10-26 13:31:48
|
On Fri, Oct 26, 2012 at 10:15:45PM +0900, Michael Paquier wrote: > And you have the same notion with PG itself, you create 2 database The only small difference: PG is not a cluster. > servers if you use a slave with a master, so I do not see your > point, and people live and appreciate such a robust solution.. If > you are so of using slaves, you could also use archive files for a > recovery, or take periodic dumps of each Datanode if you do not > want to lose data, then replay it on a new node if necessary. All these things need extra resources, while the cluster itself should have enough redundancy for fail-over. The difference is that the redundant data in a cluster are under work load, i.e. they are working, while all the external solutions are dead weight. -- *************************** ## Vladimir Stavrinov ## vst...@gm... *************************** |
From: Michael P. <mic...@gm...> - 2012-10-26 13:15:56
|
On Fri, Oct 26, 2012 at 9:55 PM, Vladimir Stavrinov <vst...@gm...> wrote: > > That's right, this is it. This is result of Your concept: instead of > one cluster we should build two clusters. > And you have the same notion with PG itself: you create 2 database servers if you use a slave with a master, so I do not see your point, and people live with and appreciate such a robust solution. If you are so wary of using slaves, you could also use archive files for recovery, or take periodic dumps of each Datanode if you do not want to lose data, then replay them on a new node if necessary. -- Michael Paquier http://michael.otacoo.com |
From: Vladimir S. <vst...@gm...> - 2012-10-26 12:55:56
|
On Thu, Oct 25, 2012 at 4:21 AM, Michael Paquier <mic...@gm...> wrote: > This looks like a possible solution trying to achieve load balancing easily > at Coordinator level. You could also publish a small utility for the XC > community based in your experience. That is only a suggestion to help It is not a utility, it is cluster infrastructure and configuration. > It is btw recommended to have a standby node behind the one that failed if > the Datanode that failed cannot be recovered for a reason or another. That's right, this is it. This is the result of Your concept: instead of one cluster we should build two clusters. |
From: Vladimir S. <vst...@gm...> - 2012-10-26 12:46:53
|
On Thu, Oct 25, 2012 at 4:13 AM, Michael Paquier <mic...@gm...> wrote: > 1) It is not our goal to oblige the users to use an HA solution or another, Sounds fine. Where are those users? Who wants a cluster without HA? Everybody who hears the word "cluster" implies "HA". > Postgres code with XC. One of the reasons explaining that XC is able to keep > up with Postgres code pace easily is that we avoid to implement solutions in > core that might impact unnecessarily its interactions with Postgres. You are heroes. How long can You continue this "code pace" on such a hard road? This paradigm prevents You from implementing not only HA but a lot of other things that are necessary for a cluster. I have never seen this type of fork. I believe at some point You will either become a part of Postgres or totally come off and go Your own way. The only question is when? And the best answer is "right now". >> Managability - I want to manage a cluster easily (add node, remove node, >> spare nodes, monitoring, ...). It cannot be simple enough. > > Sure. I don't know about any utilities able to do that, but if you could > build a utility like this running on top of XC and sell it, well you might > be able to make some money if XC becomes popular, what is not really the > case now ;) There is no problem with adding or removing nodes. But after that we must do something with the data contained in those nodes. In other words, this is a data manipulation issue. And it is not about a "utility like this running on top of XC". It should be implemented internally. |
From: Nikhil S. <ni...@st...> - 2012-10-26 12:23:04
|
On Fri, Oct 26, 2012 at 5:15 PM, Vladimir Stavrinov <vst...@gm...>wrote: > On Thu, Oct 25, 2012 at 1:40 AM, Paulo Pires <pj...@ub...> wrote: > > > Summing, I've found Postgres-XC to be quite easy to install and > > configure in a 3 coordinators + 3 data-nodes (GTM all over them and > > GTM-Proxy handling HA). A little Google and command-line did the trick > > in *a couple hours*! > > In Debian You can install this package in a few seconds. > > > Now, the only downside for me is that Postgres-XC doesn't have a > > built-in way of load-balancing between coordinators. If the coordinator > > It is not a problem. The problem is necessity to have standby for > every data node. > > Why is that a problem? The standby can run on nodes which are part of your cluster itself. > But be > aware: with this solution we have HA only for LB, but not for > datanodes itself. > > HA for datanodes can be provided by using standby nodes as well. > That is what we have without HA. And that is why You must > have standby for every data node. In other word You should build extra > infrastructure in size of entire cluster. > > It looks like you like the Mysql Cluster product a lot and are trying to force fit PGXC within its parameters. So going back to your favorite Mysql Cluster product. The group has to contain at least two nodes for redundancy, right? Why is that ok and having a replica not ok or not similar? PGXC can/will certainly provide read capabilities from replicas in coming versions. Regards, Nikhils -- StormDB - http://www.stormdb.com The Database Cloud Postgres-XC Support and Service |
From: Nikhil S. <ni...@st...> - 2012-10-26 12:07:19
|
> But where is war? It is simply question. With low priority You have no > neither knowledge nor HA itself. But if every XC accompanied with HA > then it is high priority. And question is what is true here? > > Vladimir, I guess you are getting the impression that PGXC has de-emphasized HA, that's certainly not the case. For a distributed database, the HA aspects are really important. As you have mentioned elsewhere there needs to be a solution in place with something like CoroSync/PaceMaker and it's been looked into. Regards, Nikhils -- StormDB - http://www.stormdb.com The Database Cloud Postgres-XC Support and Service |
From: Vladimir S. <vst...@gm...> - 2012-10-26 11:56:02
|
On Fri, Oct 26, 2012 at 08:42:05PM +0900, Michael Paquier wrote: > > He spoke about priorities, not lack of knowledge. You're playing with > > What is difference? > > Easy, easy. This is a space of peace. But where is the war? It is simply a question. With a low priority You have neither the knowledge nor HA itself. But if every XC came accompanied with HA, then it would be a high priority. And the question is: what is true here? -- *************************** ## Vladimir Stavrinov ## vst...@gm... *************************** |