postgres-xc-developers Mailing List for Postgres-XC (Page 123)

postgres-xc-developers — Postgres-XC hackers and developers

[Postgres-xc-developers] Moving to the new master branch

From: Michael P. <mic...@gm...> - 2011-05-31 06:11:12

Hi all,

Over the last two weeks, a branch called PGXC-TrialMaster has been created
to realign the XC code with the PostgreSQL master branch.
The goal is to make it easy to merge the XC code with future
releases of PostgreSQL.
Currently, the Trial branch sits at the intersection of the Postgres master
and Postgres 9.0 stable branches.
Therefore, at some point (next release 0.9.5, beginning of the sync streaming
replication implementation), this branch will be merged with the Postgres
master up to the 9.1 stable branch.
A second branch based on 9.0 stable may also be created.

The current master will be renamed to keep the history of releases up
to now (0.9~0.9.4).
Also, before moving to the new master, it would be better to merge the
barrier commits into the Trial branch.

Regarding regression tests, the Trial branch achieves the same results as
the master branch.
As for DBT-1, I was able to run a test with more or less the
same results as the current master branch with 1 loader machine.
However, DBT-1 is not a good indicator, as count sometimes returns 0 rows,
resulting in a high number of errors sent back to the application.

So, does anyone think we should postpone the master change?
Or do you think it is OK to do it now?
What would remain is to merge the barrier code into TrialMaster.

Regards,
-- 
Michael Paquier
http://michael.otacoo.com

[Postgres-xc-developers] Shipping single node queries involving GROUP BY

From: Ashutosh B. <ash...@en...> - 2011-05-30 08:29:26

Hi All,
In my first commit, I disabled the complete shipping of queries
involving GROUP BY to the datanode, even when only a single datanode is
involved. This was done because we do not finalise aggregates at the
datanodes. Hence, when we ship queries involving aggregates, the results we
get back are in transition states. There is a facility to aggregate those
transition results at the coordinator through the RemoteQuery node, but the
RemoteQuery node cannot do that for grouped results. It would be clumsy to
add grouping to the RemoteQuery node, and it would involve code duplication.

Instead, if we can indicate, while shipping the query or otherwise, that the
datanode needs to finalize the results itself, we can ship these
queries fully to the datanodes. Is there a way to send some
more information to the datanodes along with the query we send?

This will also help EXEC DIRECT queries. As of now, even if queries with
aggregates are executed directly on a datanode (using EXEC DIRECT), they do
not give correct results, for the same reason as above.
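
The difference between transition states and finalized results can be
sketched outside the database. The following is a minimal illustration in
Python, not Postgres-XC code: it assumes an avg() aggregate whose transition
state is a (sum, count) pair per group, and the function names and sample
rows are hypothetical.

```python
def datanode_transition(rows):
    """Per-group transition state (sum, count), as an unfinalized datanode would return."""
    state = {}
    for group, value in rows:
        total, count = state.get(group, (0.0, 0))
        state[group] = (total + value, count + 1)
    return state


def combine(states):
    """Coordinator step: merge the transition states of several datanodes, group by group."""
    merged = {}
    for state in states:
        for group, (total, count) in state.items():
            merged_total, merged_count = merged.get(group, (0.0, 0))
            merged[group] = (merged_total + total, merged_count + count)
    return merged


def finalize(state):
    """Final function of avg(): turn each (sum, count) pair into an average."""
    return {group: total / count for group, (total, count) in state.items()}


# Hypothetical rows of (group key, value) held by two datanodes.
node1 = datanode_transition([("a", 10.0), ("b", 30.0)])
node2 = datanode_transition([("a", 20.0)])

multi_node = finalize(combine([node1, node2]))  # grouped merge, then finalize
single_node = finalize(node1)                   # finalize on the datanode itself

print(node1)        # transition states, not usable as query results
print(multi_node)
print(single_node)
```

With a single datanode there is nothing to combine, so finalizing on the
datanode is enough; with several datanodes the grouped merge has to happen
before finalization.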

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Enterprise Postgres Company

Re: [Postgres-xc-developers] Perhaps you may be interested by this bug...

From: xiong w. <wan...@gm...> - 2011-05-29 02:55:25

Hi Michael,

2011/5/27 Michael Paquier <mic...@gm...>:
> This is a nice fix.
> It worked perfectly, I just pushed it to the repository after usual checks.
>
> I just completely forgot to write your name in the commit.
> Sorry :(

It doesn't matter. :)

Regards,
Benny

>
> On Fri, May 27, 2011 at 4:10 PM, xiong wang <wan...@gm...> wrote:
>>
>> Hi Michael,
>>
>> Enclosed is a patch fixing the bug you submitted.
>>
>> Best regards,
>> Benny
>>
>> 2011/5/26 Michael Paquier <mic...@gm...>:
>> > Hi Benny,
>> >
>> > How are you?
>> > I heard you graduated and began work, congratulations. How is work?
>> >
>> > I found an interesting bug with JDBC driver, a driver in java for
>> > postgresql.
>> > When using it with Postgres-XC for multi insert like:
>> > create table aa (a int);
>> > insert into aa values (1),(2),(3);
>> >
>> > JDBC makes XC react as if table is replicated even if the table is
>> > distributed.
>> > I am not asking you at all to solve it or anything, as I'll try to do it
>> > myself, but I thought you may be interested.
>> >
>> > https://sourceforge.net/tracker/?func=detail&aid=3307846&group_id=311227&atid=1310232
>> >
>> > Regards,
>> > --
>> > Michael Paquier
>> > http://michael.otacoo.com
>> >
>
>
>
> --
> Michael Paquier
> http://michael.otacoo.com
>

Re: [Postgres-xc-developers] Perhaps you may be interested by this bug...

From: Michael P. <mic...@gm...> - 2011-05-27 08:18:24

This is a nice fix.
It worked perfectly, I just pushed it to the repository after usual checks.

I just completely forgot to write your name in the commit.
Sorry :(

On Fri, May 27, 2011 at 4:10 PM, xiong wang <wan...@gm...> wrote:

> Hi Michael,
>
> Enclosed is a patch fixing the bug you submitted.
>
> Best regards,
> Benny
>
> 2011/5/26 Michael Paquier <mic...@gm...>:
> > Hi Benny,
> >
> > How are you?
> > I heard you graduated and began work, congratulations. How is work?
> >
> > I found an interesting bug with JDBC driver, a driver in java for
> > postgresql.
> > When using it with Postgres-XC for multi insert like:
> > create table aa (a int);
> > insert into aa values (1),(2),(3);
> >
> > JDBC makes XC react as if table is replicated even if the table is
> > distributed.
> > I am not asking you at all to solve it or anything, as I'll try to do it
> > myself, but I thought you may be interested.
> >
> > https://sourceforge.net/tracker/?func=detail&aid=3307846&group_id=311227&atid=1310232
> >
> > Regards,
> > --
> > Michael Paquier
> > http://michael.otacoo.com
> >
>



-- 
Michael Paquier
http://michael.otacoo.com

Re: [Postgres-xc-developers] Perhaps you may be interested by this bug...

From: xiong w. <wan...@gm...> - 2011-05-27 07:10:24

Attachments: jdbc_muti_insert.patch

Hi Michael,

Enclosed is a patch fixing the bug you submitted.

Best regards,
Benny

2011/5/26 Michael Paquier <mic...@gm...>:
> Hi Benny,
>
> How are you?
> I heard you graduated and began work, congratulations. How is work?
>
> I found an interesting bug with JDBC driver, a driver in java for
> postgresql.
> When using it with Postgres-XC for multi insert like:
> create table aa (a int);
> insert into aa values (1),(2),(3);
>
> JDBC makes XC react as if table is replicated even if the table is
> distributed.
> I am not asking you at all to solve it or anything, as I'll try to do it
> myself, but I thought you may be interested.
> https://sourceforge.net/tracker/?func=detail&aid=3307846&group_id=311227&atid=1310232
>
> Regards,
> --
> Michael Paquier
> http://michael.otacoo.com
>

Re: [Postgres-xc-developers] [Postgres-xc-committers] Postgres-XC branch, master, updated. v0.9.4-70-g49b66c7

From: Michael P. <mic...@gm...> - 2011-05-26 02:15:58

Just a comment on this thread.
If you want to reply to a commit message:
1) Please trim the end of the message if the commit is very long
2) Move the thread to the XC hackers mailing list only:
pos...@li...
The commit mailing list is only for Git commits, and we shouldn't use it
for development discussions.

On Wed, May 25, 2011 at 7:20 PM, Abbas Butt <abb...@te...> wrote:

> On Wed, May 25, 2011 at 5:54 AM, Koichi Suzuki <ko...@in...>wrote:
>
>> Hi,
>>
>> Current code utilizes the existing hash-generation mechanism, and I think
>> this is basically the right thing to do. By using this, we can pick almost
>> any column (I'm not sure about geometric types and composite types, would
>> like to test) for hash distribution.
>>
>> Points are: 1) Is a distribution column stable enough? --- This is the
>> user's choice, and most float attributes are not stable. 2) Can we
>> reproduce the same hash value from the same input value?
>>
>> Mason's point is 2). It will be better to handle this from a more general
>> view. Anyway, I think the current implementation is simple and general
>> enough. We need a separate means to determine whether a specified column
>> is good to select as a distribution column. This should apply not only to
>> built-in types but also to user-defined types, and needs some design and
>> implementation effort.
>>
>> At present, we may notify users that it is not recommended and may be
>> prohibited in the future.
>>
>
> Agreed.
>
>
>>
>> We can introduce a new catalog table or extend pg_type to describe what
>> types are allowed as distribution key.
>> ---
>> Koichi
>> # Geometric types' element values are float, and they're not adequate to
>> use as distribution key.
>>
>
> I initially thought about adding geometric types too, but then decided to
> leave them for some time later.
>
>
-- 
Michael Paquier
http://michael.otacoo.com

Re: [Postgres-xc-developers] [Postgres-xc-committers] Postgres-XC branch, master, updated. v0.9.4-70-g49b66c7

From: Abbas B. <abb...@te...> - 2011-05-25 10:20:47

On Wed, May 25, 2011 at 5:54 AM, Koichi Suzuki <ko...@in...> wrote:

> Hi,
>
> Current code utilizes the existing hash-generation mechanism, and I think
> this is basically the right thing to do. By using this, we can pick almost
> any column (I'm not sure about geometric types and composite types, would
> like to test) for hash distribution.
>
> Points are: 1) Is a distribution column stable enough? --- This is the
> user's choice, and most float attributes are not stable. 2) Can we
> reproduce the same hash value from the same input value?
>
> Mason's point is 2). It will be better to handle this from a more general
> view. Anyway, I think the current implementation is simple and general
> enough. We need a separate means to determine whether a specified column
> is good to select as a distribution column. This should apply not only to
> built-in types but also to user-defined types, and needs some design and
> implementation effort.
>
> At present, we may notify users that it is not recommended and may be
> prohibited in the future.
>

Agreed.


>
> We can introduce a new catalog table or extend pg_type to describe what
> types are allowed as distribution key.
> ---
> Koichi
> # Geometric types' element values are float, and they're not adequate to
> use as distribution key.
>

I initially thought about adding geometric types too, but then decided to
leave them for some time later.


>
> On Tue, 24 May 2011 09:03:29 -0400
> Mason <ma...@us...> wrote:
>
> > On Tue, May 24, 2011 at 8:08 AM, Abbas Butt
> > <ga...@us...> wrote:
> > > Project "Postgres-XC".
> > >
> > > The branch, master has been updated
> > >       via  49b66c77343ae1e9921118e0c902b1528f1cc2ae (commit)
> > >      from  87a62879ab3492e3dd37d00478ffa857639e2b85 (commit)
> > >
> > >
> > > - Log -----------------------------------------------------------------
> > > commit 49b66c77343ae1e9921118e0c902b1528f1cc2ae
> > > Author: Abbas <abb...@en...>
> > > Date:   Tue May 24 17:06:30 2011 +0500
> > >
> > >    This patch adds support for the following data types to be used as
> distribution key
> > >
> > >    INT8, INT2, OID, INT4, BOOL, INT2VECTOR, OIDVECTOR
> > >    CHAR, NAME, TEXT, BPCHAR, BYTEA, VARCHAR
> > >    FLOAT4, FLOAT8, NUMERIC, CASH
> > >    ABSTIME, RELTIME, DATE, TIME, TIMESTAMP, TIMESTAMPTZ, INTERVAL,
> TIMETZ
> > >
> >
> > I am not sure some of these data types are a good idea to use for
> > distributing on.  Float is inexact and seems problematic
> >
> > I just did a quick test:
> >
> > mds=# create table float1 (a float, b float) distribute by hash (a);
> > CREATE TABLE
> >
> > mds=# insert into float1 values (2.0/3, 2);
> > INSERT 0 1
> >
> > mds=# select * from float1;
> >          a         | b
> > -------------------+---
> >  0.666666666666667 | 2
> > (1 row)
> >
> > Then, I copy and paste the output of a:
> >
> > mds=# select * from float1 where a = 0.666666666666667;
> >  a | b
> > ---+---
> > (0 rows)
> >
> > Looking at the plan it tries to take advantage of partitioning:
> >
> > mds=# explain select * from float1 where a = 0.666666666666667;
> >                             QUERY PLAN
> > -------------------------------------------------------------------
> >  Data Node Scan (Node Count [1])  (cost=0.00..0.00 rows=0 width=0)
> > (1 row)
> >
> > I think we should remove support for floats as a possible distribution
> > type; users may get themselves into trouble.
> >
> >
> > There may be similar issues with the timestamp data types:
> >
> > mds=# create table timestamp1 (a timestamp, b int) distribute by hash(a);
> > CREATE TABLE
> > mds=# insert into timestamp1 values (now(), 1);
> > INSERT 0 1
> > mds=# select * from timestamp1;
> >              a              | b
> > ----------------------------+---
> >  2011-05-24 08:51:21.597551 | 1
> > (1 row)
> >
> > mds=# select * from timestamp1 where a = '2011-05-24 08:51:21.597551';
> >  a | b
> > ---+---
> > (0 rows)
> >
> >
> > As far as BOOL goes, I suppose it may be ok, but of course there are
> > only two possible values. I would block it, or at the very least if
> > the user leaves off the distribution clause, I would not consider BOOL
> > columns and look at other columns as better partitioning candidates.
> >
> > In any event, I am very glad to see the various INT types, CHAR,
> > VARCHAR, TEXT, NUMERIC and DATE supported. I am not so sure how useful
> > some of the others are.
> >
> > Thanks,
> >
> > Mason
> >
> >
> ------------------------------------------------------------------------------
> > vRanger cuts backup time in half-while increasing security.
> > With the market-leading solution for virtual backup and recovery,
> > you get blazing-fast, flexible, and affordable data protection.
> > Download your free trial now.
> > http://p.sf.net/sfu/quest-d2dcopy1
> > _______________________________________________
> > Postgres-xc-developers mailing list
> > Pos...@li...
> > https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
> >

Re: [Postgres-xc-developers] [Postgres-xc-committers] Postgres-XC branch, master, updated. v0.9.4-70-g49b66c7

From: Koichi S. <ko...@in...> - 2011-05-25 01:28:10

I think float and float-based types, as they currently behave, are not adequate as distribution columns. On the other hand, I think it is not a good idea to exclude them with hard-coded logic. We should consider future extension, and using pg_type or a new catalog will be a good idea.
---
Koichi
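
One way to picture the catalog-driven approach suggested above, as opposed
to hard-coded logic, is a lookup table that can be extended at run time. The
sketch below is only an illustration in Python: the table contents, the
default for unknown types, and the names DISTRIBUTABLE, register_type and
can_distribute_by are assumptions, not part of pg_type or of Postgres-XC.
Marking the float types as not allowed is one possible policy that follows
the opinion voiced in this thread, not a decision.

```python
# Hypothetical catalog: type name -> may it be used as a distribution key?
DISTRIBUTABLE = {
    "int4": True,
    "int8": True,
    "text": True,
    "varchar": True,
    "float4": False,
    "float8": False,
}


def can_distribute_by(type_name, catalog=DISTRIBUTABLE):
    """Look the type up in the catalog; unregistered types are rejected by default."""
    return catalog.get(type_name, False)


def register_type(type_name, allowed, catalog=DISTRIBUTABLE):
    """Add or change a catalog row, e.g. for a user-defined type."""
    catalog[type_name] = allowed


# A user-defined type (hypothetical name) becomes usable without code changes.
register_type("my_money", True)

print(can_distribute_by("int4"))
print(can_distribute_by("float8"))
print(can_distribute_by("point"))
print(can_distribute_by("my_money"))
```

The point of the design is that the policy lives in data: changing which
types are allowed means changing a row, not the planner code.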

On Tue, 24 May 2011 09:57:54 -0400
Mason Sharp <mas...@gm...> wrote:

> On Tue, May 24, 2011 at 9:40 AM, Abbas Butt <abb...@te...> wrote:
> >
> >
> > On Tue, May 24, 2011 at 6:03 PM, Mason <ma...@us...>
> > wrote:
> >>
> >> On Tue, May 24, 2011 at 8:08 AM, Abbas Butt
> >> <ga...@us...> wrote:
> >> > Project "Postgres-XC".
> >> >
> >> > The branch, master has been updated
> >> >       via  49b66c77343ae1e9921118e0c902b1528f1cc2ae (commit)
> >> >      from  87a62879ab3492e3dd37d00478ffa857639e2b85 (commit)
> >> >
> >> >
> >> > - Log -----------------------------------------------------------------
> >> > commit 49b66c77343ae1e9921118e0c902b1528f1cc2ae
> >> > Author: Abbas <abb...@en...>
> >> > Date:   Tue May 24 17:06:30 2011 +0500
> >> >
> >> >    This patch adds support for the following data types to be used as
> >> > distribution key
> >> >
> >> >    INT8, INT2, OID, INT4, BOOL, INT2VECTOR, OIDVECTOR
> >> >    CHAR, NAME, TEXT, BPCHAR, BYTEA, VARCHAR
> >> >    FLOAT4, FLOAT8, NUMERIC, CASH
> >> >    ABSTIME, RELTIME, DATE, TIME, TIMESTAMP, TIMESTAMPTZ, INTERVAL,
> >> > TIMETZ
> >> >
> >>
> >> I am not sure some of these data types are a good idea to use for
> >> distributing on.  Float is inexact and seems problematic
> >>
> >> I just did a quick test:
> >>
> >> mds=# create table float1 (a float, b float) distribute by hash (a);
> >> CREATE TABLE
> >>
> >> mds=# insert into float1 values (2.0/3, 2);
> >> INSERT 0 1
> >>
> >> mds=# select * from float1;
> >>         a         | b
> >> -------------------+---
> >>  0.666666666666667 | 2
> >> (1 row)
> >>
> >> Then, I copy and paste the output of a:
> >>
> >> mds=# select * from float1 where a = 0.666666666666667;
> >>  a | b
> >> ---+---
> >> (0 rows)
> >>
> >
> > float is a tricky type. Leaving XC aside, this test case will produce the
> > same results in plain Postgres, for this reason:
> > the column does not actually contain 0.666666666666667; what psql is
> > showing us is only an approximation of what is stored there.
> > select * from float1 where a = 2.0/3; would however work.
> > Secondly, suppose we have the same test case with data type float4.
> > Now both
> > select * from float1 where a = 0.666666666666667; and
> > select * from float1 where a = 2.0/3;
> > would return no rows in both PG and XC.
> > The reason is that PG treats real numbers as float8 by default, and the
> > float8 value does not compare equal to the stored float4 value.
> > select * from float1 where a = cast (2.0/3 as float4);
> > would therefore work.
> > Any user willing to use float types has to be aware of these strange
> > behaviors, and knowing them, he/she may benefit from being able to use
> > one as a distribution key.
> 
> 
> I don't think it is a good idea that they have to know that they
> should change all of their application code and add casting to make
> sure it works like they want. I think people are just going to get
> themselves into trouble. I strongly recommend disabling distribution
> support for some of these data types.
> 
> Thanks,
> 
> Mason
> 
> 
> 
> >
> >>
> >> Looking at the plan it tries to take advantage of partitioning:
> >>
> >> mds=# explain select * from float1 where a = 0.666666666666667;
> >>                            QUERY PLAN
> >> -------------------------------------------------------------------
> >>  Data Node Scan (Node Count [1])  (cost=0.00..0.00 rows=0 width=0)
> >> (1 row)
> >>
> >> I think we should remove support for floats as a possible distribution
> >> type; users may get themselves into trouble.
> >>
> >>
> >> There may be similar issues with the timestamp data types:
> >>
> >> mds=# create table timestamp1 (a timestamp, b int) distribute by hash(a);
> >> CREATE TABLE
> >> mds=# insert into timestamp1 values (now(), 1);
> >> INSERT 0 1
> >> mds=# select * from timestamp1;
> >>             a              | b
> >> ----------------------------+---
> >>  2011-05-24 08:51:21.597551 | 1
> >> (1 row)
> >>
> >> mds=# select * from timestamp1 where a = '2011-05-24 08:51:21.597551';
> >>  a | b
> >> ---+---
> >> (0 rows)
> >>
> >>
> >> As far as BOOL goes, I suppose it may be ok, but of course there are
> >> only two possible values. I would block it, or at the very least if
> >> the user leaves off the distribution clause, I would not consider BOOL
> >> columns and look at other columns as better partitioning candidates.
> >>
> >> In any event, I am very glad to see the various INT types, CHAR,
> >> VARCHAR, TEXT, NUMERIC and DATE supported. I am not so sure how useful
> >> some of the others are.
> >>
> >> Thanks,
> >>
> >> Mason
> >>
> >>
> >> ------------------------------------------------------------------------------
> >> vRanger cuts backup time in half-while increasing security.
> >> With the market-leading solution for virtual backup and recovery,
> >> you get blazing-fast, flexible, and affordable data protection.
> >> Download your free trial now.
> >> http://p.sf.net/sfu/quest-d2dcopy1
> >> _______________________________________________
> >> Postgres-xc-committers mailing list
> >> Pos...@li...
> >> https://lists.sourceforge.net/lists/listinfo/postgres-xc-committers

Re: [Postgres-xc-developers] [Postgres-xc-committers] Postgres-XC branch, master, updated. v0.9.4-70-g49b66c7

From: Koichi S. <ko...@in...> - 2011-05-25 01:11:32

Hi,

Current code utilizes the existing hash-generation mechanism, and I think this is basically the right thing to do. By using this, we can pick almost any column (I'm not sure about geometric types and composite types, would like to test) for hash distribution.

Points are: 1) Is a distribution column stable enough? --- This is the user's choice, and most float attributes are not stable. 2) Can we reproduce the same hash value from the same input value?

Mason's point is 2). It will be better to handle this from a more general view. Anyway, I think the current implementation is simple and general enough. We need a separate means to determine whether a specified column is good to select as a distribution column. This should apply not only to built-in types but also to user-defined types, and needs some design and implementation effort.

At present, we may notify users that it is not recommended and may be prohibited in the future.

We can introduce a new catalog table or extend pg_type to describe what types are allowed as distribution key.
---
Koichi
# Geometric types' element values are float, and they're not adequate to use as distribution key.
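
Point 2) can be illustrated with a toy router. This is a sketch under stated
assumptions, not XC's hash function: it hashes the 8-byte float8
representation with zlib.crc32 and takes the result modulo a hypothetical
node count of 4. The names NODE_COUNT, float8_bytes and node_for are
made up for the example.

```python
import struct
import zlib

NODE_COUNT = 4  # hypothetical number of datanodes


def float8_bytes(value):
    """8-byte IEEE 754 double representation of the value."""
    return struct.pack("<d", value)


def node_for(value):
    """Route a value to a node by hashing its bytes (illustrative hash only)."""
    return zlib.crc32(float8_bytes(value)) % NODE_COUNT


stored = 2.0 / 3             # value computed at insert time
literal = 0.666666666666667  # rounded text from the psql output in this thread

# The same input value always reproduces the same hash, hence the same node.
same_input_same_node = node_for(stored) == node_for(2.0 / 3)

# The rounded literal is a different double, so its bytes differ.
same_bytes = float8_bytes(stored) == float8_bytes(literal)

print(same_input_same_node)
print(same_bytes)
```

Because the literal has a different byte representation, nothing guarantees
that a lookup by it is routed to the node that holds the row; whether it is
depends on the hash.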

On Tue, 24 May 2011 09:03:29 -0400
Mason <ma...@us...> wrote:

> On Tue, May 24, 2011 at 8:08 AM, Abbas Butt
> <ga...@us...> wrote:
> > Project "Postgres-XC".
> >
> > The branch, master has been updated
> >       via  49b66c77343ae1e9921118e0c902b1528f1cc2ae (commit)
> >      from  87a62879ab3492e3dd37d00478ffa857639e2b85 (commit)
> >
> >
> > - Log -----------------------------------------------------------------
> > commit 49b66c77343ae1e9921118e0c902b1528f1cc2ae
> > Author: Abbas <abb...@en...>
> > Date:   Tue May 24 17:06:30 2011 +0500
> >
> >    This patch adds support for the following data types to be used as distribution key
> >
> >    INT8, INT2, OID, INT4, BOOL, INT2VECTOR, OIDVECTOR
> >    CHAR, NAME, TEXT, BPCHAR, BYTEA, VARCHAR
> >    FLOAT4, FLOAT8, NUMERIC, CASH
> >    ABSTIME, RELTIME, DATE, TIME, TIMESTAMP, TIMESTAMPTZ, INTERVAL, TIMETZ
> >
> 
> I am not sure some of these data types are a good idea to use for
> distributing on.  Float is inexact and seems problematic
> 
> I just did a quick test:
> 
> mds=# create table float1 (a float, b float) distribute by hash (a);
> CREATE TABLE
> 
> mds=# insert into float1 values (2.0/3, 2);
> INSERT 0 1
> 
> mds=# select * from float1;
>          a         | b
> -------------------+---
>  0.666666666666667 | 2
> (1 row)
> 
> Then, I copy and paste the output of a:
> 
> mds=# select * from float1 where a = 0.666666666666667;
>  a | b
> ---+---
> (0 rows)
> 
> Looking at the plan it tries to take advantage of partitioning:
> 
> mds=# explain select * from float1 where a = 0.666666666666667;
>                             QUERY PLAN
> -------------------------------------------------------------------
>  Data Node Scan (Node Count [1])  (cost=0.00..0.00 rows=0 width=0)
> (1 row)
> 
> I think we should remove support for floats as a possible distribution
> type; users may get themselves into trouble.
> 
> 
> There may be similar issues with the timestamp data types:
> 
> mds=# create table timestamp1 (a timestamp, b int) distribute by hash(a);
> CREATE TABLE
> mds=# insert into timestamp1 values (now(), 1);
> INSERT 0 1
> mds=# select * from timestamp1;
>              a              | b
> ----------------------------+---
>  2011-05-24 08:51:21.597551 | 1
> (1 row)
> 
> mds=# select * from timestamp1 where a = '2011-05-24 08:51:21.597551';
>  a | b
> ---+---
> (0 rows)
> 
> 
> As far as BOOL goes, I suppose it may be ok, but of course there are
> only two possible values. I would block it, or at the very least if
> the user leaves off the distribution clause, I would not consider BOOL
> columns and look at other columns as better partitioning candidates.
> 
> In any event, I am very glad to see the various INT types, CHAR,
> VARCHAR, TEXT, NUMERIC and DATE supported. I am not so sure how useful
> some of the others are.
> 
> Thanks,
> 
> Mason
> 

Re: [Postgres-xc-developers] [Postgres-xc-committers] Postgres-XC branch, master, updated. v0.9.4-70-g49b66c7

From: Mason S. <mas...@gm...> - 2011-05-24 13:58:05

On Tue, May 24, 2011 at 9:40 AM, Abbas Butt <abb...@te...> wrote:
>
>
> On Tue, May 24, 2011 at 6:03 PM, Mason <ma...@us...>
> wrote:
>>
>> On Tue, May 24, 2011 at 8:08 AM, Abbas Butt
>> <ga...@us...> wrote:
>> > Project "Postgres-XC".
>> >
>> > The branch, master has been updated
>> >       via  49b66c77343ae1e9921118e0c902b1528f1cc2ae (commit)
>> >      from  87a62879ab3492e3dd37d00478ffa857639e2b85 (commit)
>> >
>> >
>> > - Log -----------------------------------------------------------------
>> > commit 49b66c77343ae1e9921118e0c902b1528f1cc2ae
>> > Author: Abbas <abb...@en...>
>> > Date:   Tue May 24 17:06:30 2011 +0500
>> >
>> >    This patch adds support for the following data types to be used as
>> > distribution key
>> >
>> >    INT8, INT2, OID, INT4, BOOL, INT2VECTOR, OIDVECTOR
>> >    CHAR, NAME, TEXT, BPCHAR, BYTEA, VARCHAR
>> >    FLOAT4, FLOAT8, NUMERIC, CASH
>> >    ABSTIME, RELTIME, DATE, TIME, TIMESTAMP, TIMESTAMPTZ, INTERVAL,
>> > TIMETZ
>> >
>>
>> I am not sure some of these data types are a good idea to use for
>> distributing on.  Float is inexact and seems problematic
>>
>> I just did a quick test:
>>
>> mds=# create table float1 (a float, b float) distribute by hash (a);
>> CREATE TABLE
>>
>> mds=# insert into float1 values (2.0/3, 2);
>> INSERT 0 1
>>
>> mds=# select * from float1;
>>         a         | b
>> -------------------+---
>>  0.666666666666667 | 2
>> (1 row)
>>
>> Then, I copy and paste the output of a:
>>
>> mds=# select * from float1 where a = 0.666666666666667;
>>  a | b
>> ---+---
>> (0 rows)
>>
>
> float is a tricky type. Leaving XC aside, this test case will produce the
> same results in plain Postgres, for this reason:
> the column does not actually contain 0.666666666666667; what psql is
> showing us is only an approximation of what is stored there.
> select * from float1 where a = 2.0/3; would however work.
> Secondly, suppose we have the same test case with data type float4.
> Now both
> select * from float1 where a = 0.666666666666667; and
> select * from float1 where a = 2.0/3;
> would return no rows in both PG and XC.
> The reason is that PG treats real numbers as float8 by default, and the
> float8 value does not compare equal to the stored float4 value.
> select * from float1 where a = cast (2.0/3 as float4);
> would therefore work.
> Any user willing to use float types has to be aware of these strange
> behaviors, and knowing them, he/she may benefit from being able to use
> one as a distribution key.


I don't think it is a good idea that they have to know that they
should change all of their application code and add casting to make
sure it works like they want. I think people are just going to get
themselves into trouble. I strongly recommend disabling distribution
support for some of these data types.

Thanks,

Mason



>
>>
>> Looking at the plan it tries to take advantage of partitioning:
>>
>> mds=# explain select * from float1 where a = 0.666666666666667;
>>                            QUERY PLAN
>> -------------------------------------------------------------------
>>  Data Node Scan (Node Count [1])  (cost=0.00..0.00 rows=0 width=0)
>> (1 row)
>>
>> I think we should remove support for floats as a possible distribution
>> type; users may get themselves into trouble.
>>
>>
>> There may be similar issues with the timestamp data types:
>>
>> mds=# create table timestamp1 (a timestamp, b int) distribute by hash(a);
>> CREATE TABLE
>> mds=# insert into timestamp1 values (now(), 1);
>> INSERT 0 1
>> mds=# select * from timestamp1;
>>             a              | b
>> ----------------------------+---
>>  2011-05-24 08:51:21.597551 | 1
>> (1 row)
>>
>> mds=# select * from timestamp1 where a = '2011-05-24 08:51:21.597551';
>>  a | b
>> ---+---
>> (0 rows)
>>
>>
>> As far as BOOL goes, I suppose it may be ok, but of course there are
>> only two possible values. I would block it, or at the very least if
>> the user leaves off the distribution clause, I would not consider BOOL
>> columns and look at other columns as better partitioning candidates.
>>
>> In any event, I am very glad to see the various INT types, CHAR,
>> VARCHAR, TEXT, NUMERIC and DATE supported. I am not so sure how useful
>> some of the others are.
>>
>> Thanks,
>>
>> Mason
>>
>>
>

Re: [Postgres-xc-developers] [Postgres-xc-committers] Postgres-XC branch, master, updated. v0.9.4-70-g49b66c7

From: Abbas B. <abb...@te...> - 2011-05-24 13:40:17

On Tue, May 24, 2011 at 6:03 PM, Mason <ma...@us...> wrote:

> On Tue, May 24, 2011 at 8:08 AM, Abbas Butt
> <ga...@us...> wrote:
> > Project "Postgres-XC".
> >
> > The branch, master has been updated
> >       via  49b66c77343ae1e9921118e0c902b1528f1cc2ae (commit)
> >      from  87a62879ab3492e3dd37d00478ffa857639e2b85 (commit)
> >
> >
> > - Log -----------------------------------------------------------------
> > commit 49b66c77343ae1e9921118e0c902b1528f1cc2ae
> > Author: Abbas <abb...@en...>
> > Date:   Tue May 24 17:06:30 2011 +0500
> >
> >    This patch adds support for the following data types to be used as distribution key
> >
> >    INT8, INT2, OID, INT4, BOOL, INT2VECTOR, OIDVECTOR
> >    CHAR, NAME, TEXT, BPCHAR, BYTEA, VARCHAR
> >    FLOAT4, FLOAT8, NUMERIC, CASH
> >    ABSTIME, RELTIME, DATE, TIME, TIMESTAMP, TIMESTAMPTZ, INTERVAL, TIMETZ
> >
>
> I am not sure some of these data types are a good idea to use for
> distributing on.  Float is inexact and seems problematic.
>
> I just did a quick test:
>
> mds=# create table float1 (a float, b float) distribute by hash (a);
> CREATE TABLE
>
> mds=# insert into float1 values (2.0/3, 2);
> INSERT 0 1
>
> mds=# select * from float1;
>         a         | b
> -------------------+---
>  0.666666666666667 | 2
> (1 row)
>
> Then, I copy and paste the output of a:
>
> mds=# select * from float1 where a = 0.666666666666667;
>  a | b
> ---+---
> (0 rows)
>
>
Float is a tricky type. Leaving XC aside, this test case produces the same
result in plain Postgres, for the following reason.
The column does not actually contain 0.666666666666667; what psql shows us is
only an approximation of what is stored there.

select * from float1 where a = 2.0/3; would, however, work.

Secondly, suppose we have the same test case with data type float4.

Now both
select * from float1 where a = 0.666666666666667; and
select * from float1 where a = 2.0/3;
would return no rows in both PG and XC.
The reason is that PG treats real-number literals as float8 by default, and a
float4 value widened to float8 does not compare equal to the float8 value.
select * from float1 where a = cast (2.0/3 as float4);
would therefore work.

Any user who wants to use float types has to be aware of these quirks; a user
who knows them may still benefit from being able to use a float column as a
distribution key.
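
The behaviour described above is ordinary IEEE 754 arithmetic, so it can be
reproduced outside the database. Below is a minimal sketch in Python, assuming
psql printed the float8 value with 15 significant digits (which the output
above suggests). The helper to_float4 is a hypothetical name used only for this
illustration; it emulates storing a value in a float4 column by rounding to
single precision.

```python
import struct

stored = 2.0 / 3              # what a float8 column would hold
pasted = 0.666666666666667    # the 15-digit text copied from the psql output


def to_float4(x: float) -> float:
    """Round x to IEEE 754 single precision, then widen it back to double."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# The stored value prints with the same 15 significant digits as the pasted text...
print(f"{stored:.15g}")
# ...but the two doubles are not equal, so "a = 0.666666666666667" finds nothing.
print(stored == pasted)
# A float4 value widened to float8 differs from the float8 result of 2.0/3.
print(to_float4(stored) == stored)
# Comparing float4 against float4, as the explicit cast does, matches again.
print(to_float4(stored) == to_float4(2.0 / 3))
```

The same reasoning applies whether or not the table is distributed; the
distribution key only adds the question of which node is asked.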


> Looking at the plan it tries to take advantage of partitioning:
>
> mds=# explain select * from float1 where a = 0.666666666666667;
>                             QUERY PLAN
> -------------------------------------------------------------------
>  Data Node Scan (Node Count [1])  (cost=0.00..0.00 rows=0 width=0)
> (1 row)
>
> I think we should remove support for floats as a possible distribution
> type; users may get themselves into trouble.
>
>
> There may be similar issues with the timestamp data types:
>
> mds=# create table timestamp1 (a timestamp, b int) distribute by hash(a);
> CREATE TABLE
> mds=# insert into timestamp1 values (now(), 1);
> INSERT 0 1
> mds=# select * from timestamp1;
>             a              | b
> ----------------------------+---
>  2011-05-24 08:51:21.597551 | 1
> (1 row)
>
> mds=# select * from timestamp1 where a = '2011-05-24 08:51:21.597551';
>  a | b
> ---+---
> (0 rows)
>
>
> As far as BOOL goes, I suppose it may be ok, but of course there are
> only two possible values. I would block it, or at the very least if
> the user leaves off the distribution clause, I would not consider BOOL
> columns and look at other columns as better partitioning candidates.
>
> In any event, I am very glad to see the various INT types, CHAR,
> VARCHAR, TEXT, NUMERIC and DATE supported. I am not so sure how useful
> some of the others are.
>
> Thanks,
>
> Mason
>
>

Re: [Postgres-xc-developers] [Postgres-xc-committers] Postgres-XC branch, master, updated. v0.9.4-70-g49b66c7

From: Mason <ma...@us...> - 2011-05-24 13:03:38

On Tue, May 24, 2011 at 8:08 AM, Abbas Butt
<ga...@us...> wrote:
> Project "Postgres-XC".
>
> The branch, master has been updated
>       via  49b66c77343ae1e9921118e0c902b1528f1cc2ae (commit)
>      from  87a62879ab3492e3dd37d00478ffa857639e2b85 (commit)
>
>
> - Log -----------------------------------------------------------------
> commit 49b66c77343ae1e9921118e0c902b1528f1cc2ae
> Author: Abbas <abb...@en...>
> Date:   Tue May 24 17:06:30 2011 +0500
>
>    This patch adds support for the following data types to be used as distribution key
>
>    INT8, INT2, OID, INT4, BOOL, INT2VECTOR, OIDVECTOR
>    CHAR, NAME, TEXT, BPCHAR, BYTEA, VARCHAR
>    FLOAT4, FLOAT8, NUMERIC, CASH
>    ABSTIME, RELTIME, DATE, TIME, TIMESTAMP, TIMESTAMPTZ, INTERVAL, TIMETZ
>

I am not sure some of these data types are a good idea to use for
distributing on.  Float is inexact and seems problematic.

I just did a quick test:

mds=# create table float1 (a float, b float) distribute by hash (a);
CREATE TABLE

mds=# insert into float1 values (2.0/3, 2);
INSERT 0 1

mds=# select * from float1;
         a         | b
-------------------+---
 0.666666666666667 | 2
(1 row)

Then, I copy and paste the output of a:

mds=# select * from float1 where a = 0.666666666666667;
 a | b
---+---
(0 rows)

Looking at the plan it tries to take advantage of partitioning:

mds=# explain select * from float1 where a = 0.666666666666667;
                            QUERY PLAN
-------------------------------------------------------------------
 Data Node Scan (Node Count [1])  (cost=0.00..0.00 rows=0 width=0)
(1 row)

I think we should remove support for floats as a possible distribution
type; users may get themselves into trouble.
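
To make the concern concrete, here is a minimal sketch in Python of how a
hash-distributed lookup might pick a node. It is only an illustration: node_for
is a hypothetical helper, and zlib.crc32 over the 8-byte float8 encoding is a
stand-in, not the hash function Postgres-XC actually uses.

```python
import struct
import zlib


def node_for(value: float, node_count: int) -> int:
    """Hypothetical router: hash the float8 bytes, then take the modulus."""
    return zlib.crc32(struct.pack("<d", value)) % node_count


inserted = 2.0 / 3            # value computed at INSERT time
literal = 0.666666666666667   # value typed into the WHERE clause

# The two values look alike when printed but have different byte encodings,
# so any byte-level hash is free to send them to different nodes.
print(struct.pack("<d", inserted).hex())
print(struct.pack("<d", literal).hex())
print(node_for(inserted, 4), node_for(literal, 4))
```

Even when both values happen to land on the same node, the equality test on
that node still fails, so the single-node plan cannot return the row.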

There may be similar issues with the timestamp data types:

mds=# create table timestamp1 (a timestamp, b int) distribute by hash(a);
CREATE TABLE
mds=# insert into timestamp1 values (now(), 1);
INSERT 0 1
mds=# select * from timestamp1;
             a              | b
----------------------------+---
 2011-05-24 08:51:21.597551 | 1
(1 row)

mds=# select * from timestamp1 where a = '2011-05-24 08:51:21.597551';
 a | b
---+---
(0 rows)

As far as BOOL goes, I suppose it may be ok, but of course there are
only two possible values. I would block it, or at the very least if
the user leaves off the distribution clause, I would not consider BOOL
columns and look at other columns as better partitioning candidates.
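
The skew from a two-valued key can be sketched in a few lines of Python. The
placement rule here (Python's built-in hash, then modulo) and the helper
rows_per_node are assumptions for illustration only, not how Postgres-XC places
rows.

```python
from collections import Counter


def rows_per_node(values, node_count):
    """Count rows per node under a hypothetical hash-then-modulo placement."""
    return Counter(hash(v) % node_count for v in values)


# 5000 rows whose distribution key is a BOOL column
rows = [True, False, True, True, False] * 1000
placement = rows_per_node(rows, 8)

# With only two distinct key values, at most two of the eight nodes hold data.
print(len(placement))
print(dict(placement))
```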

In any event, I am very glad to see the various INT types, CHAR,
VARCHAR, TEXT, NUMERIC and DATE supported. I am not so sure how useful
some of the others are.

Thanks,

Mason

[Postgres-xc-developers] Showing table name, its distribution and distribution columns

From: Koichi S. <koi...@gm...> - 2011-05-24 02:01:53

I uploaded a script that shows each table's name, its distribution type, and
the name of its distribution column to
https://sourceforge.net/apps/mediawiki/postgres-xc/index.php?title=TIPS

With the following statement:

----
 -- Tables distributed on a column (hash, modulo)
 SELECT pg_class.relname relation,
        pgxc_class.pclocatortype distribution,
        pg_attribute.attname attribute
    FROM pg_class, pgxc_class, pg_attribute
    WHERE pg_class.oid = pgxc_class.pcrelid
          and pg_class.oid = pg_attribute.attrelid
          and pgxc_class.pcattnum = pg_attribute.attnum
 UNION
 -- Tables with no distribution column (replicated, round-robin);
 -- no join to pg_attribute is needed here
 SELECT pg_class.relname relation,
        pgxc_class.pclocatortype distribution,
        'none' attribute
    FROM pg_class, pgxc_class
    WHERE pg_class.oid = pgxc_class.pcrelid
          and pgxc_class.pcattnum = 0
    ;
----

you get a result like:

  relation   | distribution | attribute
-------------+--------------+-----------
 table_five  | M            | a
 table_four  | H            | a
 table_one   | H            | a
 table_seven | R            | none
 table_six   | N            | none
 table_three | H            | a
 table_two   | H            | a
(7 rows)

M: modulo, H: hash, R: replicate, N: round-robin.  Attribute name is
represented as "none" for N and R distribution.
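
If the result is consumed by a script, the one-letter codes can be expanded
with a small lookup. The following is a minimal sketch in Python; the mapping
is taken from the legend above, while LOCATOR_NAMES and describe are
hypothetical names introduced for this example.

```python
# Legend from the message above: M, H, R and N locator codes
LOCATOR_NAMES = {
    "M": "modulo",
    "H": "hash",
    "R": "replicate",
    "N": "round-robin",
}


def describe(relation: str, locator: str, attribute: str) -> str:
    """Render one (relation, distribution, attribute) row as readable text."""
    name = LOCATOR_NAMES.get(locator, f"unknown ({locator})")
    if attribute == "none":
        return f"{relation}: {name}"
    return f"{relation}: {name} on column {attribute}"


# A few rows copied from the sample result above
rows = [
    ("table_five", "M", "a"),
    ("table_four", "H", "a"),
    ("table_seven", "R", "none"),
    ("table_six", "N", "none"),
]
for row in rows:
    print(describe(*row))
```
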
----------
Koichi Suzuki