From: Vladimir S. <vst...@gm...> - 2012-10-26 11:45:45

On Thu, Oct 25, 2012 at 1:40 AM, Paulo Pires <pj...@ub...> wrote:

> Summing up, I've found Postgres-XC to be quite easy to install and
> configure in a 3 coordinators + 3 datanodes setup (GTM on all of them and
> GTM-Proxy handling HA). A little Google and command line did the trick
> in *a couple of hours*!

On Debian you can install the package in a few seconds.

> Now, the only downside for me is that Postgres-XC doesn't have a
> built-in way of load-balancing between coordinators. If the coordinator

That is not the problem. The problem is the necessity of having a standby for every datanode.

> 1) Define a DNS FQDN like coordinator.mydomain pointing to an IP
> (i.e., 10.0.0.1)
> 2) Point my app to work with that FQDN
> 3) On every coordinator, configure keepalived with one shared IP
> (10.0.0.1)
> 4) Install haproxy on every coordinator and have it load-balance with
> the other coordinators

First, haproxy is redundant here - keepalived can do all of this by itself, and better. Second, putting it on any XC node is a bad idea. In any case I prefer a full cluster solution with corosync/pacemaker. That way we can put not only the database but all the other parts of the system - web servers and application servers - under a single cluster control. But be aware: with this solution we have HA only for the load balancer, not for the datanodes themselves.

> My only doubt is: if a datanode goes offline and is then brought back up,
> will the data on that datanode be synchronized?

My congratulations. You have arrived at the point we have been discussing for a long time in a neighboring thread. Data from that node, if it has no replica on other nodes, is no longer available, but your application doesn't know which data is available and which is not. You can easily imagine the consequences. That is the moment the downtime starts. That is what we have without HA. And that is why you must have a standby for every datanode. In other words, you have to build extra infrastructure the size of the entire cluster.
From: Michael P. <mic...@gm...> - 2012-10-26 11:42:16

On Fri, Oct 26, 2012 at 4:53 PM, Vladimir Stavrinov <vst...@gm...> wrote:

> On Fri, Oct 26, 2012 at 08:50:09AM +0100, Paulo Pires wrote:
>
>> He spoke about priorities, not lack of knowledge. You're playing with
>
> What is the difference?

Easy, easy. This is a space of peace. Thanks in advance for respecting each other and the people reading this mailing list.
--
Michael Paquier
http://michael.otacoo.com
From: Vladimir S. <vst...@gm...> - 2012-10-26 07:54:08

On Fri, Oct 26, 2012 at 08:50:09AM +0100, Paulo Pires wrote:

> He spoke about priorities, not lack of knowledge. You're playing with

What is the difference?

--
***************************
## Vladimir Stavrinov
## vst...@gm...
***************************
From: Paulo P. <pj...@ub...> - 2012-10-26 07:50:25

On 26/10/12 07:56, Vladimir Stavrinov wrote:

> On Thu, Oct 25, 2012 at 10:41:05AM +0300, Andrei Martsinchyk wrote:
>
>> XC is for those who want more TPS per dollar; under those
>> circumstances HA is definitely not the first priority. If you
>
> Paulo, recently you asked me:
>
> "Do you know anyone putting up a database cluster without
> HA/Clustering/LB?"
>
> Here they are. Ask Andrei to introduce you to them. Then you can tell us
> an impressive story about the numerous people for whom Postgres-XC was
> invented.

He spoke about priorities, not lack of knowledge. You're playing with words and that just sucks, man!

--
Paulo Pires
From: Vladimir S. <vst...@gm...> - 2012-10-26 06:56:52

On Thu, Oct 25, 2012 at 10:41:05AM +0300, Andrei Martsinchyk wrote:

> XC is for those who want more TPS per dollar; under those
> circumstances HA is definitely not the first priority. If you

Paulo, recently you asked me:

"Do you know anyone putting up a database cluster without
HA/Clustering/LB?"

Here they are. Ask Andrei to introduce you to them. Then you can tell us an impressive story about the numerous people for whom Postgres-XC was invented.

--
***************************
## Vladimir Stavrinov
## vst...@gm...
***************************
From: Paulo P. <pj...@ub...> - 2012-10-25 07:43:52

On 25/10/12 08:37, Vladimir Stavrinov wrote:

> On Thu, Oct 25, 2012 at 2:05 AM, Vladimir Stavrinov
> <vst...@gm...> wrote:
>
>> On Wed, Oct 24, 2012 at 11:18:59PM +0300, Andrei Martsinchyk wrote:
>>
>>> one of those solutions. Everybody wins. If XC integrates one
>>> approach it will lose flexibility in this area.
>>
>> and gain many more users.
>
> OK. Paulo doesn't want more users, because he doesn't like easy ways and
> simple things. But we all want flexibility. Flexibility is a good thing,
> and here is an example.

I didn't say "I don't want more users". I just believe, based on my experience, that subjects as advanced as the ones we're discussing don't come easy. And they shouldn't, in the sense that people should really learn/know what they're doing regarding clustering, HA, etc.!

> We have a cluster consisting of 4 nodes. Nodes are organized in groups.
> All data is distributed between groups, and every group contains
> identical data, i.e. replicas. With such a model we have 3 options:
>
> 1. Read scalability only, with 4 replicas per group.
> 2. Read and write scalability, with 2 replicas per group.
> 3. Write scalability only, with 1 replica per group.
>
> It is obvious: with more nodes we have more options, i.e. more
> flexibility. It comes down to a trade-off between read and write
> scalability. And for this we don't need "CREATE TABLE ... DISTRIBUTE
> BY ...". I think it is enough for most cases.

--
Paulo Pires
From: Andrei M. <and...@gm...> - 2012-10-25 07:41:18

I feel like the discussion is senseless. Everything has its price. If you need HA, you pay with performance. If you need both HA and performance, you pay for more powerful hardware. XC is for those who want more TPS per dollar; under those circumstances HA is definitely not the first priority. If you know how to implement an HA solution that does not affect performance, please tell us.

There are a lot of useful features (like the ability to start when the server starts, scheduled backups, failover to a standby system) which are outside the core. If you want any of these, you need to set them up or have someone do that for you. If you do not need them, you can get along without them pretty well.

2012/10/25 Vladimir Stavrinov <vst...@gm...>

> On Thu, Oct 25, 2012 at 12:18 AM, Andrei Martsinchyk
> <and...@gm...> wrote:
>
>> I think your test was incorrect. It works.
>
> No, it is exactly what this thread started from and what is indicated in
> its subject. See the very first answer from a developer: it is not even
> a bug, it is by design. Sounds like an anecdote, but it is true.
>
>> performance scalability. They could use XC as is. If there is demand
>> for HA on the market, other developers may create XC-based solutions,
>> more or less
>
> Do you really have a question about this? I think high availability is
> priority number one, because we are not very happy sitting in a
> Rolls-Royce that cannot move.

Nice. A Rolls-Royce requires a road, fuel, a driver, service. If you do not provide all of these, you will be sitting in a car that cannot move. Why did you purchase it, then?

--
Andrei Martsinchyk
StormDB - http://www.stormdb.com
The Database Cloud
From: Vladimir S. <vst...@gm...> - 2012-10-25 07:38:04

On Thu, Oct 25, 2012 at 2:05 AM, Vladimir Stavrinov <vst...@gm...> wrote:

> On Wed, Oct 24, 2012 at 11:18:59PM +0300, Andrei Martsinchyk wrote:
>
>> one of those solutions. Everybody wins. If XC integrates one
>> approach it will lose flexibility in this area.
>
> and gain many more users.

OK. Paulo doesn't want more users, because he doesn't like easy ways and simple things. But we all want flexibility. Flexibility is a good thing, and here is an example.

We have a cluster consisting of 4 nodes. Nodes are organized in groups. All data is distributed between groups, and every group contains identical data, i.e. replicas. With such a model we have 3 options:

1. Read scalability only, with 4 replicas per group.
2. Read and write scalability, with 2 replicas per group.
3. Write scalability only, with 1 replica per group.

It is obvious: with more nodes we have more options, i.e. more flexibility. It comes down to a trade-off between read and write scalability. And for this we don't need "CREATE TABLE ... DISTRIBUTE BY ...". I think it is enough for most cases.
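
For readers who want to map the model sketched above onto what Postgres-XC actually provides, the closest building blocks are node groups plus per-table distribution clauses. A minimal sketch, assuming four hypothetical datanodes dn1..dn4 are already registered, and following the CREATE NODE GROUP / DISTRIBUTE BY ... TO GROUP syntax of the XC 1.0 series (check your release's SQL reference for the exact form):

    -- Option 2 above: two groups of two nodes each. Replicating a table
    -- inside a group scales reads within it; hash-distributing another
    -- table across a group scales writes across its nodes.
    CREATE NODE GROUP group_a WITH (dn1, dn2);
    CREATE NODE GROUP group_b WITH (dn3, dn4);

    -- Full copy on every node of group_a: read scaling, no write scaling.
    CREATE TABLE customers (
        id   integer PRIMARY KEY,
        name text
    ) DISTRIBUTE BY REPLICATION TO GROUP group_a;

    -- One slice per node of group_b: write scaling for this table.
    CREATE TABLE events (
        id          bigint,
        customer_id integer,
        payload     text
    ) DISTRIBUTE BY HASH (customer_id) TO GROUP group_b;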
From: Vladimir S. <vst...@gm...> - 2012-10-25 07:01:15

On Thu, Oct 25, 2012 at 12:18 AM, Andrei Martsinchyk <and...@gm...> wrote:

> I think your test was incorrect. It works.

No, it is exactly what this thread started from and what is indicated in its subject. See the very first answer from a developer: it is not even a bug, it is by design. Sounds like an anecdote, but it is true.

> performance scalability. They could use XC as is. If there is demand
> for HA on the market, other developers may create XC-based solutions,
> more or less

Do you really have a question about this? I think high availability is priority number one, because we are not very happy sitting in a Rolls-Royce that cannot move.
From: Ashutosh B. <ash...@en...> - 2012-10-25 06:32:40

On Thu, Oct 25, 2012 at 5:43 AM, Michael Paquier <mic...@gm...> wrote:

> On Thu, Oct 25, 2012 at 5:41 AM, David Hofstee <pg...@c0...> wrote:
>
>> I've been reading the 'ERROR: Failed to get pooled connections' thread
>> about what XC should and should not do. I opted to start a new thread
>> (instead of replying) about how I would like XC to be.
>> [... full message quoted below in this thread ...]
>
> XC is a fork of Postgres and we try to share the same philosophy as the
> parent project about being really conservative on the things that should
> or should not be added in core.
> [... full reply quoted below in this thread ...]
> One of the reasons XC is able to keep up with the Postgres code pace
> easily is that we avoid implementing solutions in core that might
> unnecessarily impact its interactions with Postgres.

+10. I totally agree with Michael here. We would like to keep XC's footprint as small as possible. XC will add features for distributed computing that are not present in PG; the rest of the features come from PG. At the same time, we are short on resources, and hence choose only the few things that look important from XC's perspective.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Enterprise Postgres Company
From: Vladimir S. <vst...@gm...> - 2012-10-25 00:40:33

On Wed, Oct 24, 2012 at 11:27:25PM +0100, Paulo Pires wrote:

> FYI there is technology that removes the need to reboot a machine
> following a kernel update, such as ksplice (bought by Oracle a couple
> of years ago).

There is such a Debian package, but it is not commonly used.

> I believe you can add new machinery (new coordinators, new datanodes)
> and deprecate old hardware. Am I being too simplistic thinking this way?
> Anyway, changing a cluster's hardware every two years seems overkill to
> me. But of course, it depends on your app's growth.

We are not talking about upgrades here; it is about scalability, remember?

> Yes, internal is (supposedly) easier or, as you say, "transparent" - I'd
> use the word "seamless". But you'll need to learn it and take care of it
> somehow, the same way you'd do with external solutions, such as haproxy
> or keepalived. I don't think HA/Clustering/LB is for the faint of heart.
> Either you know what you're doing, or leave this matter alone! You'll
> save your sanity in the medium term..

Knowing how an automobile works doesn't mean you want to build one just for your own use. But in our context, remember again, extra complexity means not only extra software but extra infrastructure, i.e. extra hardware as well. I am using corosync, pacemaker, ipvs, ldirectord, drbd and keepalived. But here we are discussing a database cluster, and that needs a different approach. I want to use some of these tools for distributing requests between coordinators and for failover of the ipvs distribution point and the GTM. But I don't want standby datanodes. All nodes should be under load, and there should be enough redundancy to survive the loss of any one node. Health monitoring and failover should be done internally by XC in this case.

> I don't understand why you keep citing MySQL as an example. *Don't take
> me wrong here*, but if you feel it to be the right tool, just go with it

I've already explained this here twice: it is not the right tool, because it is an in-memory database. But it has the right clustering model, and that is why I cite it here as a good exemplar.

> and leave the ones who think the same about Postgres-XC alone.

That is a good tool for closing any discussion about anything.

> Do you know anyone putting up a database cluster without
> HA/Clustering/LB knowledge? If you do, please ask them to stop.

This question is not for me. See the quotes above.

> If at least this was a "who has more users" competition, that would
> make sense. The best tools I use in my day-to-day job didn't come
> easy! I don't agree with you on this, at all.

But I agree with you on this point. It is not about an "easy way" or "more users", though. I don't think we would lose flexibility with a clustering model where the distribution scheme is defined at the cluster level. I believe it can still include distribution at the table level, so it may simply be a matter of default settings. Well-designed complex things are easy to use with their defaults but still provide enough flexibility.

> I *only* had to change my biggest app's DDL (which is generated by some
> Java JPA tool) in order to test DISTRIBUTE BY. But I'm good with 100%
> replication.. for now. In the end I made *zero* changes!

I don't see how this story helps in a production environment.

***************************
### Vladimir Stavrinov
### vst...@gm...
***************************
From: Michael P. <mic...@gm...> - 2012-10-25 00:21:32

On Thu, Oct 25, 2012 at 6:40 AM, Paulo Pires <pj...@ub...> wrote:

> Hi,
>
> Summing up, I've found Postgres-XC to be quite easy to install and
> configure in a 3 coordinators + 3 datanodes setup (GTM on all of them
> and GTM-Proxy handling HA). A little Google and command line did the
> trick in *a couple of hours*!
>
> Now, the only downside for me is that Postgres-XC doesn't have a
> built-in way of load-balancing between coordinators. If the coordinator
> your app is pointing to goes down, your app goes down - your application
> can target all of them, but in my experience, your application will
> *always* target a host. So, ATM my solution is:
> 1) Define a DNS FQDN like coordinator.mydomain pointing to an IP
> (i.e., 10.0.0.1)
> 2) Point my app to work with that FQDN
> 3) On every coordinator, configure keepalived with one shared IP
> (10.0.0.1)
> 4) Install haproxy on every coordinator and have it load-balance with
> the other coordinators
>
> This way, keepalived will always choose the first coordinator (based on
> its priority) and then haproxy (running on that machine) will
> load-balance with the others. If this coordinator goes down, the second
> host in the keepalived priority list will replace it, and not only is it
> a valid coordinator, it will also be able to load-balance with the other
> coordinators.

This looks like a possible solution for achieving load balancing easily at the Coordinator level. You could also publish a small utility for the XC community based on your experience. That is only a suggestion to help the community; please understand that I am not forcing you to publish anything, of course.

> My only doubt is: if a datanode goes offline and is then brought back
> up, will the data on that datanode be synchronized?

If the Datanode goes offline for whatever reason, all the transactions that should have run on it will fail at the Coordinator level, so normally there are no worries here about data synchronization. It is, by the way, recommended to have a standby behind each Datanode in case the one that failed cannot be recovered for one reason or another.

> And that's it. I'm in no way a DB expert and I felt quite confused
> reading the previous thread. But as a developer, Postgres-XC has been a
> huge upgrade for me. (Now, if only RETURNING ID was to be implemented,
> mr. Abbas ;-))

+1. Looking forward to seeing this feature ;-o

> Sorry for being a little off-topic, but I wanted to share my _little_
> experience with this wonderful piece of software.

Thanks, I am convinced it is helpful for a lot of people.
--
Michael Paquier
http://michael.otacoo.com
From: Michael P. <mic...@gm...> - 2012-10-25 00:13:55

On Thu, Oct 25, 2012 at 5:41 AM, David Hofstee <pg...@c0...> wrote:

> Hi,
>
> I've been reading the 'ERROR: Failed to get pooled connections' thread
> about what XC should and should not do. I opted to start a new thread
> (instead of replying) about how I would like XC to be.
> [... background snipped; quoted in full below in this thread ...]
>
> My dream DB cluster:
>
> Scalability - that means read and write scalability. XC should do that
> right now. Nice.
>
> High availability - a node can go offline and it should not hinder
> availability (only processing capacity).
>
> Maintainability - Since maintenance/change is our primary cause of
> downtime, it should be possible to kill a node and add it later. This
> can be because the VM is being moved, the OS is updated/upgraded, etc.
> Also, think about how a cluster is updated from major version to major
> version (let's say 9.x to 10.x). Maybe that is not an issue (but I don't
> know about it yet).
>
> Simplicity - It would be nice if the default package+config file is all
> I need. If it is too complex I cannot go on holidays. Some points:
>
> - I read that '...even the stock postgresql.conf configuration file
>   is pretty conservative and users tweak it as per their
>   requirements...'. For me that translates as 'if you are new to
>   Postgres it works badly'. Not simple (for e.g. some of our devs).
> - For HA: '...Like Postgres, you need an external application to
>   provide it'. When using a cluster I think HA is very often wanted. I
>   need to explain all this to every ops colleague of mine and some are
>   not very accurate. Not simple again.

XC is a fork of Postgres and we try to share the same philosophy as the parent project about being really conservative on the things that should or should not be added in core.

For example, let's take the case of HA. It is of course possible to implement an HA solution directly in the core of XC, but there are 2 things that speak against that:

1) It is not our goal to oblige users to use one HA solution or another, and I do not believe it is the role of core people to integrate directly into XC core a solution that might be good for a certain type of application without caring about the other types. Postgres is popular because it leaves all users free to use what they want, and depending on the application people want to run on XC, they might prefer one HA solution or another.

2) If in the future Postgres integrates a native HA solution (I do not believe it will be the case as the community is really conservative, but let's assume), and if XC had at some point integrated an HA solution directly in its core, we would certainly have to drop the XC solution and rely on the Postgres solution, as XC is a fork of Postgres. This would be a waste of time for the core people who integrated the HA solution, and for the people merging Postgres code into XC. One of the reasons XC is able to keep up with the Postgres code pace easily is that we avoid implementing solutions in core that might unnecessarily impact its interactions with Postgres.

> Quick setup - I want to set up an NxM cluster quickly (N times
> duplication for HA, M times distributed writes for performance). I
> prefer to set up a single node with a given config file, add nodes and
> be ready to go. Maybe an hour in case of disaster recovery?

There are already tools for that, like this one written in Ruby:
https://sourceforge.net/projects/postgres-xc/files/misc/pgxc_config_v0_9_3.tar.gz/download
It has not been maintained since 0.9.3, as this is honestly not a part of core. You might have a look at it.

> Manageability - I want to manage a cluster easily (add node, remove
> node, spare nodes, monitoring, ...). It cannot be simple enough.

Sure. I don't know about any utilities able to do that, but if you could build a utility like this running on top of XC and sell it, well, you might be able to make some money if XC becomes popular, which is not really the case now ;)

> Backup - I'm not familiar with running backups on Postgres, but we
> currently run a blocking backup on the mysql, for consistency, and it
> causes issues. We use Bacula on a file level. Which brings up a
> question: How do you backup a cluster (if you don't know which nodes
> are hot)?

In the case of XC, you can directly take a dump from a Coordinator with pg_dump, and then restore the dump file with pg_restore. You might want to use archive files. There are many ways to accomplish that, like in Postgres. The only difference in the case of XC is that you need to do that for each node, as the architecture is shared-nothing.

> Logging - Yes...
>
> Some may respond that things are not that simple. I know. But I still
> want it to be simple. It would make PGXC a no-brainer for everyone.
> Thanks for listening and keep up the good work! I appreciate it.

There are already utilities implemented for Postgres that work natively with XC; for logging you might want to use log analyzers like pgbadger. You should have a look at that first for each thing you want to do, then evaluate the effort necessary to achieve each of your goals.

Thanks,
--
Michael Paquier
http://michael.otacoo.com
From: Paulo P. <pj...@ub...> - 2012-10-24 22:27:38

On 10/24/12 11:05 PM, Vladimir Stavrinov wrote:

> On Wed, Oct 24, 2012 at 11:18:59PM +0300, Andrei Martsinchyk wrote:
>
>> That is the reason to buy the latest iPhone. Some servers run for years
>> without even a reboot. Usually people replace servers only if they
>> really need to do that.
>
> What about security patches for the kernel? For years without a reboot?

FYI there is technology that removes the need to reboot a machine following a kernel update, such as ksplice (bought by Oracle a couple of years ago).

> And that is not the only reason to upgrade a kernel. As for replacing,
> yes it is true, but this moment inevitably comes when new software eats
> more resources while the number of users increases - yet I never heard
> anybody call that a scaling process.
>
>> Nobody upgrades daily. I think it is not a lot of trouble to
>> recreate a cluster once every few years.
>
> Once every few years you can build a totally new system on brand-new
> technology.

I believe you can add new machinery (new coordinators, new datanodes) and deprecate old hardware. Am I being too simplistic thinking this way? Anyway, changing a cluster's hardware every two years seems overkill to me. But of course, it depends on your app's growth.

> Cluster scalability implies the possibility to scale at any moment, for
> example (but not only) when new customers or partners arrive with new
> demand for a fast-paced company with increasing load. It is by design.
> That is exactly what a scalable cluster exists for: you can scale
> (expand) the existing system instead of building a new one.
>
>> Why does it double the hardware park? Multiple components may share
>> the same hardware.
>
> As usual, this is far from reality. It is not a common approach
> acceptable to most companies. What you are talking about looks like an
> approach for clouds or other service providers where hardware may be
> shared by their customers.
>
>> An HA solution means extra complexity, whether it is external or
>> internal.
>
> But it makes a difference. External must be built and managed by users,
> while internal is a complete and transparent solution provided by the
> authors.

Yes, internal is (supposedly) easier or, as you say, "transparent" - I'd use the word "seamless". But you'll need to learn it and take care of it somehow, the same way you'd do with external solutions such as haproxy or keepalived. I don't think HA/Clustering/LB is for the faint of heart. Either you know what you're doing, or leave this matter alone! You'll save your sanity in the medium term..

> With mysql cluster there is nothing for users to do about HA at all; it
> just already "exists".

I don't understand why you keep citing MySQL as an example. *Don't take me wrong here*, but if you feel it to be the right tool, just go with it and leave the ones who think the same about Postgres-XC alone.

>> There are people out there who do not want that complexity; they
>> are happy with just performance scalability. They could use XC as
>
> Will they be happy with data loss and downtime? Who are they?

Do you know anyone putting up a database cluster without HA/Clustering/LB knowledge? If you do, please ask them to stop.

>> one of those solutions. Everybody wins. If XC integrates one
>> approach it will lose flexibility in this area.
>
> and gain many more users.

If at least this was a "who has more users" competition, that would make sense. The best tools I use in my day-to-day job didn't come easy! I don't agree with you on this, at all.

>> I did not quite understand what you mean here. There are a lot of
>> things important for system design along the whole hardware and
>> software stack. The more that is known to developers, the better the
>> result will be. One may design a database on XC without knowing
>> anything about it at all, with pure SQL, and the database will work.
>> But a much better result can be achieved if the database is designed
>> consciously. The number of nodes does not matter for distribution
>> planning, btw.
>
> Again: all of this is not about transparency. You are talking perhaps
> about installing a single application on a fresh XC. But what if you
> install a third-party application on an existing XC already running
> multiple applications? What if those databases are distributed in
> different ways? What if, because of this, you cannot use all nodes for
> the new application? In that case you must rewrite all "CREATE TABLE"
> statements to distribute tables to concrete nodes in a concrete way. In
> that case the developer doesn't help, and that is not what is called
> "transparency."

I *only* had to change my biggest app's DDL (which is generated by some Java JPA tool) in order to test DISTRIBUTE BY. But I'm good with 100% replication.. for now. In the end I made *zero* changes!

--
Paulo Pires
Ubiwhere
From: Vladimir S. <vst...@gm...> - 2012-10-24 22:14:34

On Wed, Oct 24, 2012 at 11:18:59PM +0300, Andrei Martsinchyk wrote:

> I think your test was incorrect. It works.

It is so simple that it is hard to get anything wrong. You can easily reproduce it on 1.0.0 with a simple SELECT request. I will repeat it on 1.0.1 in the meantime.

***************************
### Vladimir Stavrinov
### vst...@gm...
***************************
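
For context, the kind of reproduction being argued about here looks roughly like the sketch below - a hypothetical table, not a confirmed test case; whether the final SELECT actually fails with a datanode down is exactly the point under dispute in this thread:

    -- Hash-distribute a table so its rows are spread across datanodes.
    CREATE TABLE probe (
        id integer,
        v  text
    ) DISTRIBUTE BY HASH (id);
    INSERT INTO probe SELECT i, 'row ' || i FROM generate_series(1, 1000) i;

    -- Stop one datanode outside of SQL, then run a query that needs
    -- rows from every node:
    SELECT count(*) FROM probe;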
From: Vladimir S. <vst...@gm...> - 2012-10-24 22:05:28

On Wed, Oct 24, 2012 at 11:18:59PM +0300, Andrei Martsinchyk wrote:

> That is the reason to buy the latest iPhone. Some servers run for years
> without even a reboot. Usually people replace servers only if they
> really need to do that.

What about security patches for the kernel? For years without a reboot? And that is not the only reason to upgrade a kernel. As for replacing, yes it is true, but this moment inevitably comes when new software eats more resources while the number of users increases - yet I never heard anybody call that a scaling process.

> Nobody upgrades daily. I think it is not a lot of trouble to
> recreate a cluster once every few years.

Once every few years you can build a totally new system on brand-new technology. Cluster scalability implies the possibility to scale at any moment, for example (but not only) when new customers or partners arrive with new demand for a fast-paced company with increasing load. It is by design. That is exactly what a scalable cluster exists for: you can scale (expand) the existing system instead of building a new one.

> Why does it double the hardware park? Multiple components may share the
> same hardware.

As usual, this is far from reality. It is not a common approach acceptable to most companies. What you are talking about looks like an approach for clouds or other service providers where hardware may be shared by their customers.

> An HA solution means extra complexity, whether it is external or
> internal.

But it makes a difference. External must be built and managed by users, while internal is a complete and transparent solution provided by the authors. With mysql cluster there is nothing for users to do about HA at all; it just already "exists".

> There are people out there who do not want that complexity; they
> are happy with just performance scalability. They could use XC as

Will they be happy with data loss and downtime? Who are they?

> one of those solutions. Everybody wins. If XC integrates one
> approach it will lose flexibility in this area.

and gain many more users.

> I did not quite understand what you mean here. There are a lot of
> things important for system design along the whole hardware and
> software stack. The more that is known to developers, the better the
> result will be. One may design a database on XC without knowing
> anything about it at all, with pure SQL, and the database will work.
> But a much better result can be achieved if the database is designed
> consciously. The number of nodes does not matter for distribution
> planning, btw.

Again: all of this is not about transparency. You are talking perhaps about installing a single application on a fresh XC. But what if you install a third-party application on an existing XC already running multiple applications? What if those databases are distributed in different ways? What if, because of this, you cannot use all nodes for the new application? In that case you must rewrite all "CREATE TABLE" statements to distribute tables to concrete nodes in a concrete way. In that case the developer doesn't help, and that is not what is called "transparency."

***************************
### Vladimir Stavrinov
### vst...@gm...
***************************
From: Paulo P. <pj...@ub...> - 2012-10-24 21:40:34

Hi,

Summing up, I've found Postgres-XC to be quite easy to install and configure in a 3 coordinators + 3 datanodes setup (GTM on all of them and GTM-Proxy handling HA). A little Google and command line did the trick in *a couple of hours*!

Now, the only downside for me is that Postgres-XC doesn't have a built-in way of load-balancing between coordinators. If the coordinator your app is pointing to goes down, your app goes down - your application can target all of them, but in my experience, your application will *always* target a host. So, ATM my solution is:
1) Define a DNS FQDN like coordinator.mydomain pointing to an IP (i.e., 10.0.0.1)
2) Point my app to work with that FQDN
3) On every coordinator, configure keepalived with one shared IP (10.0.0.1)
4) Install haproxy on every coordinator and have it load-balance with the other coordinators

This way, keepalived will always choose the first coordinator (based on its priority) and then haproxy (running on that machine) will load-balance with the others. If this coordinator goes down, the second host in the keepalived priority list will replace it, and not only is it a valid coordinator, it will also be able to load-balance with the other coordinators.

My only doubt is: if a datanode goes offline and is then brought back up, will the data on that datanode be synchronized?

And that's it. I'm in no way a DB expert and I felt quite confused reading the previous thread. But as a developer, Postgres-XC has been a huge upgrade for me. (Now, if only RETURNING ID was to be implemented, mr. Abbas ;-)).

Sorry for being a little off-topic, but I wanted to share my _little_ experience with this wonderful piece of software.

Cheers,
PP

On 10/24/12 9:41 PM, David Hofstee wrote:

> Hi,
>
> I've been reading the 'ERROR: Failed to get pooled connections' thread
> about what XC should and should not do. I opted to start a new thread
> (instead of replying) about how I would like XC to be.
> [... full message quoted below in this thread ...]

--
Paulo Pires
Ubiwhere
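
For anyone wanting to reproduce a topology like the one Paulo describes, XC registers cluster members through SQL. A rough sketch, with hypothetical node names, hosts, and ports, following the CREATE NODE syntax of the XC 1.0 series; a similar set of statements is needed on every coordinator so each one knows about the others:

    -- On coord1: declare the other coordinators and the three datanodes.
    CREATE NODE coord2 WITH (TYPE = 'coordinator', HOST = '10.0.0.12', PORT = 5432);
    CREATE NODE coord3 WITH (TYPE = 'coordinator', HOST = '10.0.0.13', PORT = 5432);
    CREATE NODE dn1 WITH (TYPE = 'datanode', HOST = '10.0.0.21', PORT = 15432);
    CREATE NODE dn2 WITH (TYPE = 'datanode', HOST = '10.0.0.22', PORT = 15432);
    CREATE NODE dn3 WITH (TYPE = 'datanode', HOST = '10.0.0.23', PORT = 15432);

    -- Tell the connection pooler to pick up the new node list.
    SELECT pgxc_pool_reload();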
From: David H. <pg...@c0...> - 2012-10-24 20:59:50

Hi,

I've been reading the 'ERROR: Failed to get pooled connections' thread about what XC should and should not do. I opted to start a new thread (instead of replying) about how I would like XC to be.

Some background. I work for a SaaS company (mostly dev, some ops) which has to be online 24/7. We are now running apache/tomcat/mysql for each set of customers on about 30 nodes, and we want to centralize and make our application more robust, efficient and simple. It basically means creating layers: LB, web servers, application servers, database cluster. Some easy parts are already done (haproxy, nginx). Our 'platform' is pretty complex and I have so many tasks, I prefer to _not_ dig into details. We are now discussing the db issue (mysql cluster is not that great).

My dream DB cluster:

Scalability - that means read and write scalability. XC should do that right now. Nice.

High availability - a node can go offline and it should not hinder availability (only processing capacity).

Maintainability - Since maintenance/change is our primary cause of downtime, it should be possible to kill a node and add it later. This can be because the VM is being moved, the OS is updated/upgraded, etc. Also, think about how a cluster is updated from major version to major version (let's say 9.x to 10.x). Maybe that is not an issue (but I don't know about it yet).

Simplicity - It would be nice if the default package+config file is all I need. If it is too complex I cannot go on holidays. Some points:

* I read that _'...even the stock postgresql.conf configuration file is pretty conservative and users tweak it as per their requirements...'_. For me that translates as 'if you are new to Postgres it works badly'. Not simple (for e.g. some of our devs).
* For HA: _'...Like Postgres, you need an external application to provide it'_. When using a cluster I think HA is very often wanted. I need to explain all this to every ops colleague of mine and some are not very accurate. Not simple again.

Quick setup - I want to set up an NxM cluster quickly (N times duplication for HA, M times distributed writes for performance). I prefer to set up a single node with a given config file, add nodes and be ready to go. Maybe an hour in case of disaster recovery?

Manageability - I want to manage a cluster easily (add node, remove node, spare nodes, monitoring, ...). It cannot be simple enough.

Backup - I'm not familiar with running backups on Postgres, but we currently run a blocking backup on the mysql, for consistency, and it causes issues. We use Bacula on a file level. Which brings up a question: How do you backup a cluster (if you don't know which nodes are hot)?

Logging - Yes...

Some may respond that things are not that simple. I know. But I still want it to be simple. It would make PGXC a no-brainer for everyone. Thanks for listening and keep up the good work! I appreciate it.

David H.
From: Vladimir S. <vst...@gm...> - 2012-10-24 20:49:28

On Wed, Oct 24, 2012 at 01:00:51PM -0400, Jim Mlodgenski wrote:

> The default will be to distribute by HASH if it has some sort of valid

My congratulations! I thought so too ... before testing it. To my surprise, I found the same data on every node. Moreover, despite the redundancy, XC stops working if one node fails. But no matter; the more important thing is that in any case, for every table you must choose either read or write scalability, rewriting "CREATE TABLE" accordingly, while mysql cluster provides both at the same time for all tables, without any headache about distribution schemas - i.e., all data is replicated and distributed at the same time. The only essential difference that prevents considering mysql cluster as an alternative to XC is, as I mentioned earlier, that it is an in-memory database and as such is limited in size, while XC has no such limit. Though be aware this is all about 1.0.0; I have not tested all of these features against 1.0.1 yet.

***************************
### Vladimir Stavrinov
### vst...@gm...
***************************
From: Andrei M. <and...@gm...> - 2012-10-24 20:19:06

2012/10/24 Vladimir Stavrinov <vst...@gm...>

> On Wed, Oct 24, 2012 at 06:25:56PM +0300, Andrei Martsinchyk wrote:
>
>> I guess you got familiar with other solutions out there and are trying
>> to find something similar in XC. But XC is different. The main goal
>> of XC is scalability, not HA.
>
> Despite its name or goal, XC is a distributed database only.
>
>> But it looks like we understand "scalability" differently too.
>
> The difference is that you narrow its meaning.
>
>> What would a classic database owner do if he is not satisfied with
>> the performance of his database? He would move to better hardware!
>> That is basically what we mean by "scalability".
>
> If you purchase more powerful hardware to replace the old one, no matter
> whether it is a database server or your desktop machine, that is not
> scalability; it is rather an upgrade, or stepping up to a happy future.

That is the reason to buy the latest iPhone. Some servers run for years without even a reboot. Usually people replace servers only if they really need to do that.

>> However in case of a classic single-server DBMS you would notice that
>> hardware cost grows exponentially. With XC you may scale linearly - if
>> you run XC, for example, on an 8 node cluster you may add 8 more and
>> get 2 times more TPS.
>> [... full explanation quoted below in this thread ...]
>
> Thank you for the long explanation, but it is unnecessary. I was aware
> of it when I wrote ... But it changes nothing.
>
>> You mentioned adding nodes online. That feature is not *yet*
>> implemented in XC. I would not call it "scalability" though. I
>> would call it flexibility.
>
> That is a very polite definition, if we remember that the alternative is
> recreating the entire cluster from scratch.

Nobody upgrades daily. I think it is not a lot of trouble to recreate a cluster once every few years.

>> That approach is not good for HA: redundancy is needed for HA, and XC
>> is not redundant - if you lose one node you lose part of the data. XC
>> will still live in that case and it would even be able to serve some
>> queries. But a query that needs the lost
>
> No, it stops working at all. (To be sure: this was tested against 1.0.0,
> not 1.0.1.)

I think your test was incorrect. It works.

>> node would fail. However XC supports Postgres replication; you may
>> configure replicas of your datanodes and switch to a slave if the
>> master fails. Currently an external solution is required to build that
>> kind of system. I do not think this is a problem. Nobody needs a pure
>> DBMS anyway; at least a frontend is needed. XC is a good brick to
>> build a system that perfectly fulfills customer requirements.
>
> I already wrote: any external solution doubles the hardware park and
> adds complexity to the system.

Why would it double the hardware park? Multiple components may share the same hardware. An HA solution means extra complexity, whether it is external or internal. There are people out there who do not want that complexity; they are happy with just performance scalability. They could use XC as is. If there is demand for HA on the market, other developers may create XC-based solutions, more or less integrated. Consumers may choose one of those solutions. Everybody wins. If XC integrates one approach it will lose flexibility in this area.

>> And about transparency. An application sees XC as a generic DBMS and
>> can access it using generic SQL. Even CREATE TABLE without a
>> DISTRIBUTE BY clause is supported. But like with any other DBMS
>
> In this case by default it will be "BY REPLICATION", and as a result it
> loses the main XC feature: write scalability.

The criteria are pretty complex. However, HASH distribution takes priority.

>> the database architect must know DBMS internals well and use the
>> provided
>
> But he cannot know how many nodes you have or will have, what other
> databases are running there, and how existing data is already
> distributed. DBMS internals are not a transparency-related issue at all,
> because there is always a difference in what you are writing your
> application for: mysql, postgresql, oracle, or all of them.

I did not quite understand what you mean here. There are a lot of things important for system design along the whole hardware and software stack. The more that is known to developers, the better the result will be. One may design a database on XC without knowing anything about it at all, with pure SQL, and the database will work. But a much better result can be achieved if the database is designed consciously. The number of nodes does not matter for distribution planning, btw.

>> tools, like SQL extensions, to tune up a specific database for an
>> application. XC is capable of achieving much better than linear
>> performance when it is optimized.
>
> That is acceptable in specific cases, and should be considered
> customization. But in most cases we need a common solution.

--
Andrei Martsinchyk
StormDB - http://www.stormdb.com
The Database Cloud
From: Jim M. <ji...@gm...> - 2012-10-24 17:00:57

On Wed, Oct 24, 2012 at 12:53 PM, Vladimir Stavrinov <vst...@gm...> wrote:

> On Wed, Oct 24, 2012 at 11:42:43AM -0400, Jim Mlodgenski wrote:
>
>> That's not actually the case. XC will automatically distribute the
>> table even if the DISTRIBUTE BY clause is not in the CREATE TABLE
>
> In this case by default it will be "BY REPLICATION", and as a result it
> loses the main XC feature: write scalability.

The default will be to distribute by HASH if there is some sort of valid column to use. If there is no way to determine which column to use, it will fall back to a round-robin distribution. It never uses "BY REPLICATION" by default.
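
To make the default behavior Jim describes concrete, here is a short sketch of the three outcomes a table can end up with; the tables and columns are hypothetical, and the clauses follow XC 1.0-era DDL:

    -- No DISTRIBUTE BY clause: XC picks a distribution itself. With a
    -- usable key column (here the primary key) this behaves like
    -- DISTRIBUTE BY HASH (id); with no candidate column it falls back
    -- to round-robin - never to replication.
    CREATE TABLE orders (
        id    bigint PRIMARY KEY,
        total numeric
    );

    -- Explicit hash distribution: rows are spread across datanodes by
    -- the hashed column, which is what gives write scalability.
    CREATE TABLE line_items (
        order_id bigint,
        sku      text,
        qty      integer
    ) DISTRIBUTE BY HASH (order_id);

    -- Explicit replication: every datanode keeps a full copy, which
    -- favors reads and joins at the cost of write scalability.
    CREATE TABLE currencies (
        code text PRIMARY KEY,
        rate numeric
    ) DISTRIBUTE BY REPLICATION;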
From: Vladimir S. <vst...@gm...> - 2012-10-24 16:53:45

On Wed, Oct 24, 2012 at 11:42:43AM -0400, Jim Mlodgenski wrote:

> That's not actually the case. XC will automatically distribute the
> table even if the DISTRIBUTE BY clause is not in the CREATE TABLE

In this case by default it will be "BY REPLICATION", and as a result it loses the main XC feature: write scalability.

--
***************************
## Vladimir Stavrinov
## vst...@gm...
***************************
From: Vladimir S. <vst...@gm...> - 2012-10-24 16:50:43
|
On Wed, Oct 24, 2012 at 06:25:56PM +0300, Andrei Martsinchyk wrote:

> I guess you got familiar with other solutions out there and are
> trying to find something similar in XC. But XC is different. The
> main goal of XC is scalability, not HA.

Despite its name and stated goal, XC is only a distributed database.

> But it looks like we understand "scalability" differently too.

The difference is that you narrow its meaning.

> What would a classic database owner do if he is not satisfied with
> the performance of his database? He would move to better hardware!
> That is basically what we mean by "scalability".

If you purchase more powerful hardware to replace the old one, no
matter whether it is a database server or your desktop machine, that
is not scalability; it is rather an upgrade, or stepping up to a
happy future.

> However, in the case of a classic single-server DBMS you would
> notice that hardware cost grows exponentially. With XC you may
> scale linearly: if you run XC on, for example, an 8-node cluster,
> you may add 8 more nodes and get 2 times more TPS. That is because
> XC is able to intelligently split your data across your nodes. If
> you have one huge table on N nodes you can write data N times
> faster, since each particular row goes to one node and each node
> processes 1/Nth of the total requests. Reads scale as well: if you
> search by key, each node searches only its local part of the data,
> which is N times smaller than the entire table, and all nodes
> search in parallel. Moreover, if the search key is the same as the
> distribution key, only one node searches (the one where the rows
> must be located), which is perfect when there are multiple
> concurrent searchers.

Thank you for the long explanation, but it is unnecessary. I was
aware of this when I wrote ... But it changes nothing.

> You mentioned adding nodes online. That feature is not *yet*
> implemented in XC. I would not call it "scalability" though. I
> would call it flexibility.

That is a very polite term, if we remember that the alternative is
recreating the entire cluster from scratch.

> That approach is not good for HA: redundancy is needed for HA, and
> XC is not redundant; if you lose one node you lose part of the
> data. XC will still live in that case and would even be able to
> serve some queries. But a query that needs the lost

No, it stops working entirely. (To be sure: this was tested against
1.0.0, but not 1.0.1.)

> node would fail. However, XC supports Postgres replication; you may
> configure replicas of your datanodes and switch to the slave if the
> master fails. Currently an external solution is required to build
> that kind of system. I do not think this is a problem. Nobody needs
> a pure DBMS anyway; at least a frontend is needed. XC is a good
> brick to build a system that perfectly fulfills customer
> requirements.

I already wrote: any external solution doubles the hardware park and
adds complexity to the system.

> And about transparency. An application sees XC as a generic DBMS
> and can access it using generic SQL. Even CREATE TABLE without a
> DISTRIBUTE BY clause is supported. But as with any other DBMS,

In this case the default is "BY REPLICATION", and as a result it
loses the main XC feature: write scalability.

> the database architect must know the DBMS internals well and use
> the provided

But he cannot know how many nodes you have or will have, what other
databases are running there, or how the existing data is already
distributed. DBMS internals are not a transparency issue at all,
because there is always a difference depending on what you are
writing your application for: MySQL, PostgreSQL, Oracle, or all of
them.

> tools, like SQL extensions, to tune up a specific database for an
> application. XC is capable of much better than linear performance
> when it is optimized.

It is acceptable in specific cases and should be considered
customization. But in most cases we need a common solution.

--

***************************
##  Vladimir Stavrinov
##  vst...@gm...
***************************

 |
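To make the disputed DDL concrete, here is a minimal sketch of the
two table layouts being argued about, assuming Postgres-XC 1.0-era
syntax; the table names are illustrative only, not from the thread:

    -- Replicated table: every datanode holds a full copy. Reads can
    -- be served by any single node, but every write must be applied
    -- on all of them, so there is no write scaling.
    CREATE TABLE catalog (
        id   integer PRIMARY KEY,
        name text
    ) DISTRIBUTE BY REPLICATION;

    -- Hash-distributed table: each row lives on exactly one
    -- datanode, so writes spread across nodes, but losing a node
    -- loses that slice of the data unless the node has a standby.
    CREATE TABLE orders (
        id       integer PRIMARY KEY,
        customer integer,
        total    numeric
    ) DISTRIBUTE BY HASH (id);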
From: Jim M. <ji...@gm...> - 2012-10-24 15:42:53
|
On Wed, Oct 24, 2012 at 11:13 AM, Vladimir Stavrinov
<vst...@gm...> wrote:
> On Wed, Oct 24, 2012 at 07:40:33PM +0530, Nikhil Sontakke wrote:
>
>> "While many standard MySQL schemas and applications can work using
>> MySQL Cluster, it is also true that unmodified applications and
>> database schemas may be slightly incompatible or have suboptimal
>> performance when run using MySQL Cluster"
>
> I was aware of this when I wrote the previous message.
>
>> So transparency might come at a cost in the case of MySQL cluster
>> as well.
>
> Those are rare and specific cases, and an absolutely different
> thing from what we have with XC. In XC we must take care about
> "CREATE TABLE ... DISTRIBUTE BY ..." EVERYWHERE and ALWAYS.

That's not actually the case. XC will automatically distribute the
table even if the DISTRIBUTE BY clause is not in the CREATE TABLE
statement. It uses the primary key and foreign key information to
determine a distribution key if one is not provided. In many cases
this is perfectly acceptable and completely transparent to the
application. I've moved several websites over to XC without ever
needing to touch the DDL.

>> In general Postgres has all along believed that the user is more
>> intelligent and will take the pains to understand the nuances of
>> their use case and configure the database accordingly. That's why
>
> Again, these are different things. It is not configuration of the
> database; it is rewriting installation SQL scripts. Imagine that
> you need to install a third-party application. What about upgrades?
> And what about a lot of such applications? No, it is not acceptable
> for production.
>
> This is an example of the core of my claims here: you don't think
> about real life and production environments.
>
>> perhaps even the stock postgresql.conf configuration file is
>> pretty conservative and users tweak it as per their requirements.
>
> Editing the configuration file postgresql.conf is a good idea, but
> rewriting installation SQL scripts every time is a very bad idea.
>
>> Impossibility to extend the cluster online means it is not
>> scalable.
>>
>> As you rightly mention below, this is indeed a "young" project and
>> IMHO it's maturing along proper lines.
>
> Good news. The news is that you agree with me on something.
>
>> Again: it should not be an external tool, it should be an
>> internal, integral, essential feature.
>>
>> Some people will say exactly the opposite. Why add something
>
> I haven't heard that.
>
>> minimal internal support. Like for example the Corosync/Pacemaker
>> LinuxHA product maybe along with some of the tools that Suzuki san
>
> That is exactly what I am using. But it is not an alternative to an
> internal solution.
>
>> applications, the XC cluster continues to function. As long as
>> datanodes are equipped with replication and an HA strategy is in
>> place to handle datanodes going down and failing over to a
>> promoted standby, then again the cluster continues to function.
>
> Good. But the bad thing is that with any external solution you have
> to double your hardware park for the data nodes, because only half
> of them will be under workload. This is the essential and main
> reason why the solution should be internal. The next one is the
> manageability and complexity of the whole system.
>
>> Here seems to be the fundamental difference between mysql cluster
>> and PGXC. Everything appears to be "replicated" in MySQL cluster
>> and all nodes are mirror images of each other. In PGXC, data can
>> be partitioned across nodes as well. It is for this that we
>> provide the flexibility to the user via the DISTRIBUTE BY clause.
>
> It only seems so, but it is not true. All data are distributed
> between groups of data nodes. Replicas exist inside a group only.
>
>> AIUI, all Mysql nodes are images of each other. While that's good
>> for reads, that is not so good for writes, no?
>
> No, see above.
>
>> Data node addition is a work in progress in XC currently.
>
> I saw it already:
>
> http://postgres-xc.sourceforge.net/roadmap.html
>
> But that is a question of priority.
>
> --
>
> ***************************
> ##  Vladimir Stavrinov
> ##  vst...@gm...
> ***************************

 |
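A minimal sketch of the behaviour Jim describes, assuming
Postgres-XC 1.0-era defaults; the table is illustrative only:

    -- No DISTRIBUTE BY clause: the coordinator picks a distribution
    -- column itself, using primary key and foreign key information,
    -- so unmodified third-party DDL keeps working.
    CREATE TABLE users (
        id    integer PRIMARY KEY,
        email text
    );

    -- Roughly equivalent to writing the clause out explicitly:
    --   CREATE TABLE users (...) DISTRIBUTE BY HASH (id);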
From: Andrei M. <and...@gm...> - 2012-10-24 15:26:07
|
Hi Vladimir,

I guess you got familiar with other solutions out there and are
trying to find something similar in XC. But XC is different. The main
goal of XC is scalability, not HA. But it looks like we understand
"scalability" differently too.

What would a classic database owner do if he is not satisfied with
the performance of his database? He would move to better hardware!
That is basically what we mean by "scalability". However, in the case
of a classic single-server DBMS you would notice that hardware cost
grows exponentially. With XC you may scale linearly: if you run XC
on, for example, an 8-node cluster, you may add 8 more nodes and get
2 times more TPS. That is because XC is able to intelligently split
your data across your nodes. If you have one huge table on N nodes
you can write data N times faster, since each particular row goes to
one node and each node processes 1/Nth of the total requests. Reads
scale as well: if you search by key, each node searches only its
local part of the data, which is N times smaller than the entire
table, and all nodes search in parallel. Moreover, if the search key
is the same as the distribution key, only one node searches (the one
where the rows must be located), which is perfect when there are
multiple concurrent searchers.

You mentioned adding nodes online. That feature is not *yet*
implemented in XC. I would not call it "scalability" though. I would
call it flexibility. That approach is not good for HA: redundancy is
needed for HA, and XC is not redundant; if you lose one node you lose
part of the data. XC will still live in that case and would even be
able to serve some queries. But a query that needs the lost node
would fail. However, XC supports Postgres replication; you may
configure replicas of your datanodes and switch to the slave if the
master fails. Currently an external solution is required to build
that kind of system. I do not think this is a problem. Nobody needs a
pure DBMS anyway; at least a frontend is needed. XC is a good brick
to build a system that perfectly fulfills customer requirements.

And about transparency. An application sees XC as a generic DBMS and
can access it using generic SQL. Even CREATE TABLE without a
DISTRIBUTE BY clause is supported. But as with any other DBMS, the
database architect must know the DBMS internals well and use the
provided tools, like SQL extensions, to tune up a specific database
for an application. XC is capable of much better than linear
performance when it is optimized.

2012/10/24 Vladimir Stavrinov <vst...@gm...>

> On Wed, Oct 24, 2012 at 08:08:32PM +0900, Michael Paquier wrote:
>
> > Sure, XC naturally provides transparency and scalability thanks
> > to its architecture.
>
> What does XC provide? My two rhetorical questions above imply the
> answer "NO". The necessity to adapt the application means the
> cluster is not transparent. The impossibility to extend the cluster
> online means it is not scalable.
>
> Moreover, these two issues are interrelated, because you have to
> rewrite the "CREATE TABLE" statement every time you expand (read:
> recreate) your cluster. But the issue looks much worse if a node
> containing tables with different distribution schemes fails. This
> is an uncontrollable model.
>
> > Load balancing can be provided between Coordinators and Datanodes
> > depending on applications, or at Coordinator level.
>
> It should not depend on the application; it should be a global
> function of the cluster.
>
> > For HA, Koichi is currently working on some tools to provide
> > that,
>
> Again: it should not be an external tool, it should be an internal,
> integral, essential feature.
>
> > I am not sure you can that easily compare XC and mysql cluster,
> > both share the same architectures, but one of the main
>
> I don't know what is "the same" there, but in functionality they
> are totally different. MySQL cluster has a precise and clear
> clustering model:
>
> 1. If some nodes fail, the cluster continues to work as long as at
> least one healthy node remains in every group.
>
> 2. No "CREATE TABLE ... DISTRIBUTE BY ..." statement. You just
> define the number of replicas at the configuration level. Yes, for
> now only one option is available that makes sense, with two
> replicas, but it is enough.
>
> 3. Read and write scalability (i.e. LB) at the same time for all
> tables (i.e. at the cluster level).
>
> 4. You can add a data node online, i.e. without restarting (not to
> mention "recreating", as with XC) the cluster. Yes, only new data
> will go to the new node in this case. But you can totally
> redistribute it with a restart.
>
> So it is a full-fledged cluster; that is not true for XC, and it's
> a pity.
>
> > differences coming to my mind is that XC is far more flexible in
> > terms of license (BSD and not GPL), and like PostgreSQL, no
> > company has control of its code, unlike the mysql products that
> > Oracle relies on.
>
> Yes, and this is why I am persuading all developers to migrate to
> PostgreSQL. But that is off topic here, where we are discussing
> functionality, not licence issues.
>
> Be tolerant of my criticism; I wouldn't say you made a bad thing. I
> was amazed when I first read "write-scalable, synchronous
> multi-master, transparent PostgreSQL cluster" in your description,
> which I copied completely and exactly into the description of my
> Debian package, but I was notably disappointed after my first test
> showed that this is at odds with reality. That would not be so bad
> in itself, since it is a young project, but it is much worse that
> this discussion shows there is something wrong with your priorities
> and fundamental approach.
>
> --
>
> ***************************
> ##  Vladimir Stavrinov
> ##  vst...@gm...
> ***************************

--
Andrei Martsinchyk
StormDB - http://www.stormdb.com
The Database Cloud

 |
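A minimal sketch of the read-scaling behaviour Andrei describes,
assuming Postgres-XC 1.0-era syntax; the table and values are
illustrative only:

    CREATE TABLE events (
        user_id integer,
        payload text
    ) DISTRIBUTE BY HASH (user_id);

    -- Search key matches the distribution key: the coordinator can
    -- route the query to the single datanode that holds rows with
    -- user_id = 42.
    SELECT * FROM events WHERE user_id = 42;

    -- Search key does not match: every datanode scans its own 1/Nth
    -- slice of the table, in parallel.
    SELECT * FROM events WHERE payload LIKE 'error%';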