From: Juned K. <jkh...@gm...> - 2014-05-01 10:02:29

I am not sure, but maybe the datanode_archlog directory does not exist in my case, or maybe it has a different name. However, the size of the /home/postgres/pgxc/nodes/dn_master directory is *42G* and /home/postgres/pgxc/nodes/gtm_pxy is *31G*. Please suggest.

On Thu, May 1, 2014 at 12:50 PM, Michael Paquier <mic...@gm...> wrote:
> On Mon, Apr 28, 2014 at 8:31 PM, Juned Khan <jkh...@gm...> wrote:
> > "/home/postgres/pgxc/nodes/datanode_archlog/000000010000001C0000006A": No
> > space left on device (28)
> This means that the partition containing the folder
> /home/postgres/pgxc/nodes/datanode_archlog/ is full. Run df and monitor
> the size of the partition/disk that you are using for your archive files.
> --
> Michael

--
Thanks,
Juned Khan
iNextrix Technologies Pvt Ltd.
www.inextrix.com
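For reference, the directory sizes reported above, and whatever is filling them, can be checked with du and df; a minimal sketch, assuming the paths mentioned in this thread:

    df -h /home/postgres/pgxc/nodes                       # free space on the partition
    du -sh /home/postgres/pgxc/nodes/*                    # per-directory totals (dn_master, gtm_pxy, ...)
    du -sh /home/postgres/pgxc/nodes/dn_master/pg_xlog    # WAL tends to dominate when archiving keeps failing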
From: Michael P. <mic...@gm...> - 2014-05-01 07:20:49

On Mon, Apr 28, 2014 at 8:31 PM, Juned Khan <jkh...@gm...> wrote:
> "/home/postgres/pgxc/nodes/datanode_archlog/000000010000001C0000006A": No
> space left on device (28)

This means that the partition containing the folder /home/postgres/pgxc/nodes/datanode_archlog/ is full. Run df and monitor the size of the partition/disk that you are using for your archive files.
--
Michael
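The check Michael suggests, run on the archive host (db01 in this thread), would look like this; the paths are the ones quoted above:

    df -h /home/postgres/pgxc/nodes/datanode_archlog    # free space on the partition holding the archives
    du -sh /home/postgres/pgxc/nodes/datanode_archlog   # total size of the archived WAL segments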
From: Juned K. <jkh...@gm...> - 2014-05-01 06:32:54

Hi Koichi,

Yes, my gtm.control file exists, but it contains no data. Yes, it used to work earlier; this problem happened suddenly. 123G of space is still available on the server. I didn't set up anything in the recovery.conf file manually.

On Thu, May 1, 2014 at 11:52 AM, 鈴木 幸市 <ko...@in...> wrote:
> Could you share the gtm.control file, which is placed in the same
> directory as gtm.conf? I'm afraid this file has been removed or
> corrupted. Did it use to work before this problem happened? If so, did
> you meet any other issues in operation? Do you really have enough file
> space at the datanode? Xlog archiving consumes a lot of file space. This
> is no different from PostgreSQL.
>
> Did you set archive_cleanup_command in your recovery.conf file to clean
> up old archive logs? The pg_archivecleanup utility will help.
>
> Regards;
> ---
> Koichi Suzuki
>
> On 2014/05/01 15:00, Juned Khan <jkh...@gm...> wrote:
>
> Can anyone please help me with how to solve this issue?
>
> On Mon, Apr 28, 2014 at 5:01 PM, Juned Khan <jkh...@gm...> wrote:
>> In the logs I found this:
>> DETAIL: The failed archive command was: rsync
>> pg_xlog/000000010000001C0000006A
>> postgres@db01:/home/postgres/pgxc/nodes/datanode_archlog/000000010000001C0000006A
>> rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]:
>> Broken pipe (32)
>> rsync: write failed on
>> "/home/postgres/pgxc/nodes/datanode_archlog/000000010000001C0000006A": No
>> space left on device (28)
>> rsync error: error in file IO (code 11) at receiver.c(322) [receiver=3.0.9]
>> rsync: connection unexpectedly closed (28 bytes received so far) [sender]
>> rsync error: error in rsync protocol data stream (code 12) at io.c(605)
>> [sender=3.0.9]
>> LOG: archive command failed with exit code 12
>>
>> On Mon, Apr 28, 2014 at 4:51 PM, Juned Khan <jkh...@gm...> wrote:
>>> Hi all,
>>>
>>> I set up pgxc, but today when I tried to fire one manual query I got the
>>> message below. I am not able to show table descriptions either.
>>>
>>> database=# \d
>>> ERROR: could not access status of transaction 0
>>> DETAIL: Could not write to file "pg_clog/000B" at offset 40960: No
>>> space left on device.
>>>
>>> Sometimes it shows the table description in the format below.
>>>
>>> database=# \dt
>>> <table border="1">
>>> <caption>List of relations</caption>
>>> <tr>
>>> <th align="center">Schema</th>
>>> <th align="center">Name</th>
>>> <th align="center">Type</th>
>>> <th align="center">Owner</th>
>>> </tr>
>>>
>>> --
>>> Thanks,
>>> Juned Khan
>>> <http://www.inextrix.com/>
>>
>> --
>> Thanks,
>> Juned Khan
>> <http://www.inextrix.com/>
>
> --
> Thanks,
> Juned Khan
> iNextrix Technologies Pvt Ltd.
> www.inextrix.com

--
Thanks,
Juned Khan
<http://www.inextrix.com/>
From: 鈴木 幸市 <ko...@in...> - 2014-05-01 06:22:17

Could you share the gtm.control file, which is placed in the same directory as gtm.conf? I'm afraid this file has been removed or corrupted. Did it use to work before this problem happened? If so, did you meet any other issues in operation? Do you really have enough file space at the datanode? Xlog archiving consumes a lot of file space. This is no different from PostgreSQL.

Did you set archive_cleanup_command in your recovery.conf file to clean up old archive logs? The pg_archivecleanup utility will help.

Regards;
---
Koichi Suzuki

On 2014/05/01 15:00, Juned Khan <jkh...@gm...> wrote:

Can anyone please help me with how to solve this issue?

On Mon, Apr 28, 2014 at 5:01 PM, Juned Khan <jkh...@gm...> wrote:

In the logs I found this:

DETAIL: The failed archive command was: rsync pg_xlog/000000010000001C0000006A postgres@db01:/home/postgres/pgxc/nodes/datanode_archlog/000000010000001C0000006A
rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]: Broken pipe (32)
rsync: write failed on "/home/postgres/pgxc/nodes/datanode_archlog/000000010000001C0000006A": No space left on device (28)
rsync error: error in file IO (code 11) at receiver.c(322) [receiver=3.0.9]
rsync: connection unexpectedly closed (28 bytes received so far) [sender]
rsync error: error in rsync protocol data stream (code 12) at io.c(605) [sender=3.0.9]
LOG: archive command failed with exit code 12

On Mon, Apr 28, 2014 at 4:51 PM, Juned Khan <jkh...@gm...> wrote:

Hi all,

I set up pgxc, but today when I tried to fire one manual query I got the message below. I am not able to show table descriptions either.

database=# \d
ERROR: could not access status of transaction 0
DETAIL: Could not write to file "pg_clog/000B" at offset 40960: No space left on device.

Sometimes it shows the table description in the format below.

database=# \dt
<table border="1">
<caption>List of relations</caption>
<tr>
<th align="center">Schema</th>
<th align="center">Name</th>
<th align="center">Type</th>
<th align="center">Owner</th>
</tr>

--
Thanks,
Juned Khan
<http://www.inextrix.com/>

--
Thanks,
Juned Khan
iNextrix Technologies Pvt Ltd.
www.inextrix.com
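A minimal sketch of the cleanup Koichi describes, assuming the archive directory used elsewhere in this thread (adjust the path to your layout); this line goes in the standby's recovery.conf:

    archive_cleanup_command = 'pg_archivecleanup /home/postgres/pgxc/nodes/datanode_archlog %r'

The same utility can also prune old archives by hand, keeping everything from a given segment onward, e.g.:

    pg_archivecleanup /home/postgres/pgxc/nodes/datanode_archlog 000000010000001C0000006A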
From: Juned K. <jkh...@gm...> - 2014-05-01 06:00:21

Can anyone please help me with how to solve this issue?

On Mon, Apr 28, 2014 at 5:01 PM, Juned Khan <jkh...@gm...> wrote:
> In the logs I found this:
> DETAIL: The failed archive command was: rsync
> pg_xlog/000000010000001C0000006A
> postgres@db01:/home/postgres/pgxc/nodes/datanode_archlog/000000010000001C0000006A
> rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]:
> Broken pipe (32)
> rsync: write failed on
> "/home/postgres/pgxc/nodes/datanode_archlog/000000010000001C0000006A": No
> space left on device (28)
> rsync error: error in file IO (code 11) at receiver.c(322) [receiver=3.0.9]
> rsync: connection unexpectedly closed (28 bytes received so far) [sender]
> rsync error: error in rsync protocol data stream (code 12) at io.c(605)
> [sender=3.0.9]
> LOG: archive command failed with exit code 12
>
> On Mon, Apr 28, 2014 at 4:51 PM, Juned Khan <jkh...@gm...> wrote:
>> Hi all,
>>
>> I set up pgxc, but today when I tried to fire one manual query I got the
>> message below. I am not able to show table descriptions either.
>>
>> database=# \d
>> ERROR: could not access status of transaction 0
>> DETAIL: Could not write to file "pg_clog/000B" at offset 40960: No space
>> left on device.
>>
>> Sometimes it shows the table description in the format below.
>>
>> database=# \dt
>> <table border="1">
>> <caption>List of relations</caption>
>> <tr>
>> <th align="center">Schema</th>
>> <th align="center">Name</th>
>> <th align="center">Type</th>
>> <th align="center">Owner</th>
>> </tr>
>>
>> --
>> Thanks,
>> Juned Khan
>> <http://www.inextrix.com/>

--
Thanks,
Juned Khan
iNextrix Technologies Pvt Ltd.
www.inextrix.com
From: 鈴木 幸市 <ko...@in...> - 2014-05-01 04:06:29

Again, XC's first goal is read/write scale-out and parallel transaction processing among the nodes. An HA feature should be a separate solution combined with other failure handling, with Corosync, for example.

Thank you;
---
Koichi Suzuki

On 2014/05/01 12:02, Tim Uckun <tim...@gm...> wrote:

If pgxc could distribute the shards in a redundant fashion across the storage nodes, I think it would go a very long way toward making pgxc a very good HA solution. If a storage node goes down, other nodes can serve the same shards, and when a node is brought back up, shards can be redistributed to it. Elasticsearch and other data stores use a similar strategy.

On Thu, May 1, 2014 at 1:11 PM, 鈴木 幸市 <ko...@in...> wrote:

It is just like a PostgreSQL slave does not work as a master without an explicit promote operation.

XC provides features for HA configuration, but it does not provide an HA feature by itself. This is also similar to PG. An HA feature cannot be configured only within XC. It needs much more, for example, hardware, network, and storage monitoring, failover, and virtual IPs. At present, this is why XC (and, I believe, PG as well) provides the fundamental features but not a total HA solution.

This can change in the future, though.

Pgxc_ctl provides a simple way to start everything and monitor whether everything is working, as well as a simple manual failover mechanism.

Regards;
---
Koichi Suzuki

On 2014/04/30 23:54, Thibault Marquand <thi...@ec...> wrote:

> Hi,
> I am setting up a High Availability (HA) architecture and I use PG-XC.
> Thank you very much for your work; it works fine, and yours is the only
> active project for synchronous multi-master replication.
>
> I don't understand why GTM proxies and the GTM standby don't
> reconnect/promote themselves in case of a GTM crash.
> At the moment, I use a shell script to do this, but I am not enthusiastic
> about it because of the maintenance trouble it will cause in the future.
> Is there any reason for this lack of automation?
>
> Still about the HA feature: I realized that when the whole cluster
> starts, if the GTM doesn't work, neither does the GTM standby. Obviously
> the database doesn't work in that case.
> I also made a script to handle this situation and change the GTM standby
> config in order to transform it into a GTM before starting (not using the
> promote command, only changing the configuration file). Maybe I missed
> something to make it work? If not, could it be a new feature for a future
> release?
>
> You can find my script in my puppet module for postgres-xc:
> https://forge.puppetlabs.com/echoes/postgres_xc or
> https://github.com/echoes-tech/puppet-postgres_xc.
>
> Best regards,
> Thibault Marquand
From: Tim U. <tim...@gm...> - 2014-05-01 03:02:32

If pgxc could distribute the shards in a redundant fashion across the storage nodes, I think it would go a very long way toward making pgxc a very good HA solution. If a storage node goes down, other nodes can serve the same shards, and when a node is brought back up, shards can be redistributed to it. Elasticsearch and other data stores use a similar strategy.

On Thu, May 1, 2014 at 1:11 PM, 鈴木 幸市 <ko...@in...> wrote:
> It is just like a PostgreSQL slave does not work as a master without an
> explicit promote operation.
>
> XC provides features for HA configuration, but it does not provide an HA
> feature by itself. This is also similar to PG. An HA feature cannot be
> configured only within XC. It needs much more, for example, hardware,
> network, and storage monitoring, failover, and virtual IPs.
>
> At present, this is why XC (and, I believe, PG as well) provides the
> fundamental features but not a total HA solution.
>
> This can change in the future, though.
>
> Pgxc_ctl provides a simple way to start everything and monitor whether
> everything is working, as well as a simple manual failover mechanism.
>
> Regards;
> ---
> Koichi Suzuki
>
> On 2014/04/30 23:54, Thibault Marquand <thi...@ec...> wrote:
>
> > Hi,
> > I am setting up a High Availability (HA) architecture and I use PG-XC.
> > Thank you very much for your work; it works fine, and yours is the only
> > active project for synchronous multi-master replication.
> >
> > I don't understand why GTM proxies and the GTM standby don't
> > reconnect/promote themselves in case of a GTM crash.
> > At the moment, I use a shell script to do this, but I am not
> > enthusiastic about it because of the maintenance trouble it will cause
> > in the future. Is there any reason for this lack of automation?
> >
> > Still about the HA feature: I realized that when the whole cluster
> > starts, if the GTM doesn't work, neither does the GTM standby.
> > Obviously the database doesn't work in that case.
> > I also made a script to handle this situation and change the GTM
> > standby config in order to transform it into a GTM before starting (not
> > using the promote command, only changing the configuration file). Maybe
> > I missed something to make it work? If not, could it be a new feature
> > for a future release?
> >
> > You can find my script in my puppet module for postgres-xc:
> > https://forge.puppetlabs.com/echoes/postgres_xc or
> > https://github.com/echoes-tech/puppet-postgres_xc.
> >
> > Best regards,
> > Thibault Marquand
From: 鈴木 幸市 <ko...@in...> - 2014-05-01 01:11:36

It is just like a PostgreSQL slave does not work as a master without an explicit promote operation.

XC provides features for HA configuration, but it does not provide an HA feature by itself. This is also similar to PG. An HA feature cannot be configured only within XC. It needs much more, for example, hardware, network, and storage monitoring, failover, and virtual IPs. At present, this is why XC (and, I believe, PG as well) provides the fundamental features but not a total HA solution.

This can change in the future, though.

Pgxc_ctl provides a simple way to start everything and monitor whether everything is working, as well as a simple manual failover mechanism.

Regards;
---
Koichi Suzuki

On 2014/04/30 23:54, Thibault Marquand <thi...@ec...> wrote:

> Hi,
> I am setting up a High Availability (HA) architecture and I use PG-XC.
> Thank you very much for your work; it works fine, and yours is the only
> active project for synchronous multi-master replication.
>
> I don't understand why GTM proxies and the GTM standby don't
> reconnect/promote themselves in case of a GTM crash.
> At the moment, I use a shell script to do this, but I am not enthusiastic
> about it because of the maintenance trouble it will cause in the future.
> Is there any reason for this lack of automation?
>
> Still about the HA feature: I realized that when the whole cluster
> starts, if the GTM doesn't work, neither does the GTM standby. Obviously
> the database doesn't work in that case.
> I also made a script to handle this situation and change the GTM standby
> config in order to transform it into a GTM before starting (not using the
> promote command, only changing the configuration file). Maybe I missed
> something to make it work? If not, could it be a new feature for a future
> release?
>
> You can find my script in my puppet module for postgres-xc:
> https://forge.puppetlabs.com/echoes/postgres_xc or
> https://github.com/echoes-tech/puppet-postgres_xc.
>
> Best regards,
> Thibault Marquand
From: Thibault M. <thi...@ec...> - 2014-04-30 15:12:17

Hi,

I am setting up a High Availability (HA) architecture and I use PG-XC. Thank you very much for your work; it works fine, and yours is the only active project for synchronous multi-master replication.

I don't understand why GTM proxies and the GTM standby don't reconnect/promote themselves in case of a GTM crash. At the moment, I use a shell script to do this, but I am not enthusiastic about it because of the maintenance trouble it will cause in the future. Is there any reason for this lack of automation?

Still about the HA feature: I realized that when the whole cluster starts, if the GTM doesn't work, neither does the GTM standby. Obviously the database doesn't work in that case. I also made a script to handle this situation and change the GTM standby config in order to transform it into a GTM before starting (not using the promote command, only changing the configuration file). Maybe I missed something to make it work? If not, could it be a new feature for a future release?

You can find my script in my puppet module for postgres-xc:
https://forge.puppetlabs.com/echoes/postgres_xc or
https://github.com/echoes-tech/puppet-postgres_xc.

Best regards,
Thibault Marquand
From: 鈴木 幸市 <ko...@in...> - 2014-04-30 08:43:45

I'm afraid you need to share how to reproduce the problem in order to fix the issue. Could you share the following?

1) The cluster configuration; if possible, the pgxc_ctl configuration file.
2) The DDL and DML you used to reproduce the issue.

The addition you made is outside the loop in question. Internally, we had a similar problem between -O2 and non-O2 optimization, which was caused by missing memory initialization. -O2 or higher optimization requires stricter memory initialization. Because there are so many candidates for the cause, I believe it is important to share how to reproduce the issue.

Thank you.
---
Koichi Suzuki

On 2014/04/26 15:06, Aaron Jackson <aja...@re...> wrote:

I honestly have no idea what the optimizer is doing; however, I have isolated the behavior down to a simple change that eliminates the problem under -O2 optimization.

    for (index = 0; index < arrayP->numProcs; index++)
    {
        if (arrayP->pgprocnos[index] == proc->pgprocno)
        {
            /* Keep the PGPROC array sorted. See notes above */
            memmove(&arrayP->pgprocnos[index], &arrayP->pgprocnos[index + 1],
                    (arrayP->numProcs - index - 1) * sizeof(int));
            arrayP->pgprocnos[arrayP->numProcs - 1] = -1;   /* for debugging */
            arrayP->numProcs--;
            LWLockRelease(ProcArrayLock);
            return;
        }
    }

    /* Ooops */
    LWLockRelease(ProcArrayLock);
    elog(LOG, "ProcArrayRemove(post-test): %p", &index);
    elog(LOG, "failed to find proc %p in ProcArray", proc);
}

The *only* change I made is to log the pointer to the index after the loop. I tried many things, but it was necessary to do an operation that forced the evaluation of index's address.

Hope this helps,
Aaron

________________________________
From: Aaron Jackson [aja...@re...]
Sent: Friday, April 25, 2014 4:26 PM
To: pos...@li...
Subject: Re: [Postgres-xc-general] failed to find proc - increasing numProcs

It's quite possible I'm missing something obvious, but here is how I've modified procarray.c - the idea was to capture the values that were failing in order to understand why.

    void
    ProcArrayRemove(PGPROC *proc, TransactionId latestXid)
    {
        ProcArrayStruct *arrayP = procArray;
        int         index;
        int         _xNumProcs;
        int         _xIndex;

        ....

        for (index = 0; (_xIndex = index) < (_xNumProcs = arrayP->numProcs); index++)
        {
            if (arrayP->pgprocnos[index] == proc->pgprocno)
            {
                /* Keep the PGPROC array sorted. See notes above */
                memmove(&arrayP->pgprocnos[index], &arrayP->pgprocnos[index + 1],
                        (arrayP->numProcs - index - 1) * sizeof(int));
                arrayP->pgprocnos[arrayP->numProcs - 1] = -1;   /* for debugging */
                arrayP->numProcs--;
                LWLockRelease(ProcArrayLock);
                return;
            }
        }

        /* Ooops */
        LWLockRelease(ProcArrayLock);
        elog(LOG, "ProcArrayRemove(post-test): %d | %d | %d | %d",
             _xIndex, _xNumProcs, arrayP->numProcs, _xIndex < _xNumProcs);
        elog(LOG, "failed to find proc %p in ProcArray", proc);
    }

With CFLAGS="" this works as expected. Once I set CFLAGS="-O2" (or anything similar) it falls apart. For example, the fall-through case triggered and showed the following:

    ProcArrayRemove(post-test): 1 | 9 | 9 | 1

which means the loop test should have succeeded. I could take this one step further and cache the result of the for loop; however, I can tell you from prior experience that _xIndex < _xNumProcs evaluated as FALSE. I'm really not sure what the compiler is doing to draw that conclusion from 1 < 9.

Aaron

________________________________
From: Aaron Jackson [aja...@re...]
Sent: Friday, April 25, 2014 3:05 PM
To: 鈴木 幸市
Cc: pos...@li...
Subject: Re: [Postgres-xc-general] failed to find proc - increasing numProcs

CFLAGS="-O2"
gcc version 4.8.1 (Ubuntu/Linaro 4.8.1-10ubuntu9)

The failed evaluation occurs on line 421 of backend/storage/ipc/procarray.c; the test portion of the clause fails. I'm not entirely sure why gcc specifically fails here, but if I were taking an educated guess, it would be that arrayP->numProcs is effectively volatile while the result of the test was optimized and cached. I've used several techniques (none of which I like) to fool gcc into believing the value is volatile and discarding the cached value of arrayP->numProcs. It concerns me more because ProcArrayLock should be held during this sequence.

Aaron

________________________________
From: 鈴木 幸市 [ko...@in...]
Sent: Sunday, April 13, 2014 7:55 PM
To: Aaron Jackson
Cc: pos...@li...
Subject: Re: [Postgres-xc-general] failed to find proc - increasing numProcs

Thank you, Aaron, for the detailed analysis. As long as the issue is just for XC, we need a fix for it to work correctly regardless of the compiler optimization. Did you locate where such a wrong evaluation takes place? And what compilation option did you use? They are very helpful.

Best;
---
Koichi Suzuki

On 2014/04/12 11:40, Aaron Jackson <aja...@re...> wrote:

It appears that the problem is a compiler optimization issue. I narrowed the issue down to the loop at the end of the ProcArrayRemove method. I'm not entirely sure why, but the compiler generated code that evaluates the test block of the loop improperly. Since changing the compiler options, the problem has been resolved.

Aaron

________________________________
From: Aaron Jackson [aja...@re...]
Sent: Friday, April 11, 2014 1:07 AM
To: pos...@li...
Subject: Re: [Postgres-xc-general] failed to find proc - increasing numProcs

I forgot to mention that if I injected a context switch (sleep(0) did the trick, as did an elog statement) during the test in ProcArrayRemove, it no longer failed. Hopefully that will help in understanding the reasons why that may have triggered the ProcArrayRemove to succeed.
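Aaron mentions using techniques to make gcc treat the value as volatile. A minimal sketch of that idea (one possible workaround, not the project's actual fix) is to force the loop bound to be re-read from memory on every iteration:

    /* Re-read numProcs from memory on each iteration instead of letting
     * the optimizer cache the comparison; the removal logic is unchanged. */
    volatile int *numProcsP = &arrayP->numProcs;

    for (index = 0; index < *numProcsP; index++)
    {
        if (arrayP->pgprocnos[index] == proc->pgprocno)
        {
            /* ... unchanged removal logic ... */
        }
    }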
From: Aaron J. <aja...@re...> - 2014-04-29 16:39:21

When I load data into my table "detail" with COPY, the table loads at a rate of about 56k rows per second. The data is distributed on a key to achieve this rate of insert (the row width is 678). However, when I do the following:

INSERT INTO DETAIL SELECT 123 as Id, ... FROM DETAIL WHERE Id = 500;

I see the write performance drop to only 2.5k rows per second. The total data set loaded from Id = 500 is 200k rows and takes about 7s to load into the coordinator, so I can attribute almost all of the time (about 80 seconds) directly to the insert.

Insert on detail  (cost=0.00..10.00 rows=1000 width=678) (actual time=79438.038..79438.038 rows=0 loops=1)
  Node/s: node_pgs01_1, node_pgs01_2, node_pgs02_1, node_pgs02_2
  Node expr: productid
  ->  Data Node Scan on detail "_REMOTE_TABLE_QUERY_"  (cost=0.00..10.00 rows=1000 width=678) (actual time=3.917..2147.231 rows=200000 loops=1)
        Node/s: node_pgs01_1, node_pgs01_2, node_pgs02_1, node_pgs02_2

IMO, it seems like an insert like this should approach the performance of a COPY. Am I missing something, or can you recommend a different approach?
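Given the rates reported here (about 56k rows/s via COPY versus 2.5k rows/s via INSERT ... SELECT), one possible workaround is to stage the clone through psql's \copy; this is a sketch only, with hypothetical column names, since the real column list is elided above:

    \copy (SELECT 123 AS id, col_a, col_b FROM detail WHERE id = 500) TO '/tmp/detail_500.csv' CSV
    \copy detail FROM '/tmp/detail_500.csv' CSV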
From: Koichi S. <koi...@gm...> - 2014-04-29 12:10:08

Your configuration is wrong. gtm_host and gtm_port in gtm_proxy.conf should be the same as the host and port in gtm.conf. Because there are too many pitfalls like this in configuring PGXC manually, I strongly advise beginning with pgxc_ctl, which does all of this on your behalf, unless you'd like to learn about the PGXC architecture and configuration in more depth.

Regards;
---
Koichi Suzuki

2014-04-25 15:55 GMT+09:00 张紫宇 <zha...@gm...>:
> I installed pgxc 1.2.1 and I wanted to start gtm and gtm_proxy on the same
> server, but gtm_proxy didn't work.
>
> I did it as such:
> sudo su
> mkdir /usr/local/pgsql/data_gtm
> mkdir /usr/local/pgsql/data_gtm_proxy
> chown l /usr/local/pgsql/data_gtm
> chown l /usr/local/pgsql/data_gtm_proxy
> su l
> initgtm -Z gtm -D /usr/local/pgsql/data_gtm
> initgtm -Z gtm_proxy -D /usr/local/pgsql/data_gtm_proxy
> gtm -D /usr/local/pgsql/data_gtm &
> gtm_proxy -D /usr/local/pgsql/data_gtm_proxy
>
> On the last step, it returns "CST -FATAL: can not connect to GTM
> LOCATION: ConnectGTM, proxy_main.c:3344".
> Following it into gtm_proxy, I found errno 111 (Connection refused) in
> function connectFailureMessage, called from GTMPQconnectPoll,
> connectGTMStart, PQconnectGTMStart, PQconnectGTM, ConnectGTM,
> RegisterProxy, BaseInit, and main.
>
> My OS is Ubuntu 12.04 amd64, and I also tested it on CentOS 6; I
> installed pgxc 1.2.1 on both of them, but they both get the same error. I
> found a mail,
> "http://sourceforge.net/p/postgres-xc/mailman/message/30755193/"; our
> situations are exactly the same.
>
> I followed and tried every page I could find on the net but still can't
> solve it. Can you please tell me what I can do? Any help here would be
> really appreciated.
>
> gtm.conf and gtm_proxy.conf are as follows:
>
> gtm.conf:
> # ----------------------
> # GTM configuration file
> # ----------------------
> #
> # This file must be placed on gtm working directory
> # specified by -D command line option of gtm or gtm_ctl. The
> # configuration file name must be "gtm.conf"
> #
> # This file consists of lines of the form
> #
> # name = value
> #
> # (The "=" is optional.) Whitespace may be used. Comments are
> # introduced with "#" anywhere on a line. The complete list of
> # parameter names and allowed values can be found in the
> # Postgres-XC documentation.
> #
> # The commented-out settings shown in this file represent the default
> # values.
> #
> # Re-commenting a setting is NOT sufficient to revert it to the default
> # value.
> #
> # You need to restart the server.
>
> #------------------------------------------------------------------------------
> # GENERAL PARAMETERS
> #------------------------------------------------------------------------------
> nodename = 'one'                # Specifies the node name.
>                                 # (changes requires restart)
> #listen_addresses = '*'         # Listen addresses of this GTM.
>                                 # (changes requires restart)
> port = 6666                     # Port number of this GTM.
>                                 # (changes requires restart)
>
> #startup = ACT                  # Start mode. ACT/STANDBY.
>
> #------------------------------------------------------------------------------
> # GTM STANDBY PARAMETERS
> #------------------------------------------------------------------------------
> # Those parameters are effective when GTM is activated as a standby server
> #active_host = ''               # Listen address of active GTM.
>                                 # (changes requires restart)
> #active_port =                  # Port number of active GTM.
>                                 # (changes requires restart)
>
> #---------------------------------------
> # OTHER OPTIONS
> #---------------------------------------
> #keepalives_idle = 0            # Keepalives_idle parameter.
> #keepalives_interval = 0        # Keepalives_interval parameter.
> #keepalives_count = 0           # Keepalives_count internal parameter.
> #log_file = 'gtm.log'           # Log file name
> #log_min_messages = WARNING     # log_min_messages. Default WARNING.
>                                 # Valid value: DEBUG, DEBUG5, DEBUG4, DEBUG3,
>                                 # DEBUG2, DEBUG1, INFO, NOTICE, WARNING,
>                                 # ERROR, LOG, FATAL, PANIC
> #synchronous_backup = off       # If backup to standby is synchronous
>
> gtm_proxy.conf:
> #-----------------------------
> # GTM Proxy configuration file
> #-----------------------------
> #
> # This file must be placed on gtm working directory
> # specified by -D command line option of gtm_proxy or gtm_ctl.
> # The configuration file name must be "gtm_proxy.conf"
> #
> # This file consists of lines of the form
> #
> # name = value
> #
> # (The "=" is optional.) Whitespace may be used. Comments are
> # introduced with "#" anywhere on a line. The complete list of
> # parameter names and allowed values can be found in the
> # Postgres-XC documentation.
> #
> # The commented-out settings shown in this file represent the default
> # values.
> #
> # Re-commenting a setting is NOT sufficient to revert it to the default
> # value.
> #
> # You need to restart the server.
>
> #------------------------------------------------------------------------------
> # GENERAL PARAMETERS
> #------------------------------------------------------------------------------
> nodename = 'one'                # Specifies the node name.
>                                 # (changes requires restart)
> #listen_addresses = '*'         # Listen addresses of this GTM.
>                                 # (changes requires restart)
> port = 6666                     # Port number of this GTM.
>                                 # (changes requires restart)
>
> #------------------------------------------------------------------------------
> # GTM PROXY PARAMETERS
> #------------------------------------------------------------------------------
> #worker_threads = 1             # Number of the worker thread of this
>                                 # GTM proxy
>                                 # (changes requires restart)
>
> #------------------------------------------------------------------------------
> # GTM CONNECTION PARAMETERS
> #------------------------------------------------------------------------------
> # Those parameters are used to connect to a GTM server
> gtm_host = 'localhost'          # Listen address of the active GTM.
>                                 # (changes requires restart)
> gtm_port = 6668                 # Port number of the active GTM.
>                                 # (changes requires restart)
>
> #------------------------------------------------------------------------------
> # Behavior at GTM communication error
> #------------------------------------------------------------------------------
> #gtm_connect_retry_interval = 0 # How long (in secs) to wait until the next
>                                 # retry to connect to GTM.
>
> #------------------------------------------------------------------------------
> # Other options
> #------------------------------------------------------------------------------
> #keepalives_idle = 0            # Keepalives_idle parameter.
> #keepalives_interval = 0        # Keepalives_interval parameter.
> #keepalives_count = 0           # Keepalives_count internal parameter.
> #log_file = 'gtm_proxy.log'     # Log file name
> #log_min_messages = WARNING     # log_min_messages. Default WARNING.
>                                 # Valid value: DEBUG, DEBUG5, DEBUG4, DEBUG3,
>                                 # DEBUG2, DEBUG1, INFO, NOTICE, WARNING,
>                                 # ERROR, LOG, FATAL, PANIC.
>
> --Ronian
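Concretely, the mismatch Koichi points out can be fixed along these lines; the proxy's own listen port below is illustrative, but it must differ from GTM's port when both run on the same server:

    # gtm.conf (the GTM itself)
    port = 6666

    # gtm_proxy.conf
    port     = 6667          # the proxy's own listen port
    gtm_host = 'localhost'   # where the active GTM listens
    gtm_port = 6666          # must match "port" in gtm.conf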
From: Aaron J. <aja...@re...> - 2014-04-29 06:16:52

That was my thinking as well. I indirectly executed the queries independently. Basically, each query takes about 5-10 seconds on each of the 4 data nodes - specifically, I used psql -h <datanode> -p <port> to time the individual data node performance. So you figure, worst case = 10 seconds x 4 nodes = 40 seconds of aggregate time for a serial request. But I'm seeing 65 seconds, which means there's some other overhead that I'm missing. The 65-second aggregate is also the reason I asked whether the requests were parallel or serial, because it *feels* serial, though it could be other factors. I'll retest and update.

________________________________
From: Ashutosh Bapat [ash...@en...]
Sent: Tuesday, April 29, 2014 1:05 AM
To: Aaron Jackson
Cc: amul sul; pos...@li...
Subject: Re: [Postgres-xc-general] Data Node Scan Performance

Hi Aaron,

Can you please take the timing of executing "EXECUTE DIRECT <query to the datanode>" on some datanode? I suspect that the delay you are seeing is added by the sheer communication between coordinator and datanode. Some of that would be libpq overhead and some of it will be network overhead.

On Tue, Apr 29, 2014 at 10:58 AM, Aaron Jackson <aja...@re...> wrote:

Interesting,

So I wonder why I am seeing query times that are more than the sum of the total times required to perform the process without the coordinator. For example, let's say the query was 'SELECT 500 as Id, Foo, Bar from MyTable WHERE Id = 186' - I could perform this query on all 4 nodes and they would take no more than 10 seconds to run individually. However, when performed against the coordinator, this same query takes 65 seconds. That's more than the total aggregate of all data nodes.

Any thoughts - is it completely attributed to the coordinator?

________________________________________
From: amul sul [sul...@ya...]
Sent: Tuesday, April 29, 2014 12:23 AM
To: Aaron Jackson; pos...@li...
Subject: Re: [Postgres-xc-general] Data Node Scan Performance

> On Tuesday, 29 April 2014 10:38 AM, Aaron Jackson <aja...@re...> wrote:
> my question is, does the coordinator execute the data node scan serially
> or in parallel - and if it's serial,
> is there any thought around how to make it parallel?

IMO, the scan on the data nodes happens independently, i.e., in parallel; the scan results are collected at the coordinator and returned to the client.

Referring to a distributed table using something other than the distribution key (in your case Q instead of K) carries a small penalty.

Regards,
Amul Sul

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
From: Ashutosh B. <ash...@en...> - 2014-04-29 06:05:17

Hi Aaron,

Can you please take the timing of executing "EXECUTE DIRECT <query to the datanode>" on some datanode? I suspect that the delay you are seeing is added by the sheer communication between coordinator and datanode. Some of that would be libpq overhead and some of it will be network overhead.

On Tue, Apr 29, 2014 at 10:58 AM, Aaron Jackson <aja...@re...> wrote:
> Interesting,
>
> So I wonder why I am seeing query times that are more than the sum of the
> total times required to perform the process without the coordinator. For
> example, let's say the query was 'SELECT 500 as Id, Foo, Bar from MyTable
> WHERE Id = 186' - I could perform this query on all 4 nodes and they
> would take no more than 10 seconds to run individually. However, when
> performed against the coordinator, this same query takes 65 seconds.
> That's more than the total aggregate of all data nodes.
>
> Any thoughts - is it completely attributed to the coordinator?
>
> ________________________________________
> From: amul sul [sul...@ya...]
> Sent: Tuesday, April 29, 2014 12:23 AM
> To: Aaron Jackson; pos...@li...
> Subject: Re: [Postgres-xc-general] Data Node Scan Performance
>
> IMO, the scan on the data nodes happens independently, i.e., in parallel;
> the scan results are collected at the coordinator and returned to the
> client.
>
> Referring to a distributed table using something other than the
> distribution key (in your case Q instead of K) carries a small penalty.
>
> Regards,
> Amul Sul

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
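A sketch of the timing test Ashutosh suggests, using one of the node names that appears earlier in the thread (the query itself is illustrative):

    \timing on
    EXECUTE DIRECT ON (node_pgs01_1) 'SELECT count(*) FROM mytable WHERE id = 186';

Comparing this per-node timing against the same query run through the coordinator separates datanode execution time from coordinator and communication overhead.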
From: amul s. <sul...@ya...> - 2014-04-29 06:01:50

On Tuesday, 29 April 2014 10:58 AM, Aaron Jackson <aja...@re...> wrote:
> Interesting,
> Any thoughts - is it completely attributed to the coordinator?

I am not sure, but in your example MyTable is distributed on Foo and searched on Id. If you could somehow add the distribution key to the WHERE condition (e.g. .... WHERE Id = 186 AND Foo = xyz), the planner would get help locating the candidate tuples. In the current situation it scans all the datanodes and combines the results at the coordinator.

Does ID uniquely identify rows in MyTable? If so, can't you distribute on ID?

Regards,
Amul Sul
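If the table can indeed be distributed on ID, as suggested here, the DDL would look like this sketch (column names are hypothetical):

    CREATE TABLE mytable (
        id  bigint,
        foo text,
        bar text
    ) DISTRIBUTE BY HASH (id);

Lookups by id would then be routed to a single datanode instead of being scanned on all of them.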
From: Aaron J. <aja...@re...> - 2014-04-29 05:30:21

Interesting,

So I wonder why I am seeing query times that are more than the sum of the total times required to perform the process without the coordinator. For example, let's say the query was 'SELECT 500 as Id, Foo, Bar from MyTable WHERE Id = 186' - I could perform this query on all 4 nodes and they would take no more than 10 seconds to run individually. However, when performed against the coordinator, this same query takes 65 seconds. That's more than the total aggregate of all data nodes.

Any thoughts - is it completely attributed to the coordinator?

________________________________________
From: amul sul [sul...@ya...]
Sent: Tuesday, April 29, 2014 12:23 AM
To: Aaron Jackson; pos...@li...
Subject: Re: [Postgres-xc-general] Data Node Scan Performance

> On Tuesday, 29 April 2014 10:38 AM, Aaron Jackson <aja...@re...> wrote:
> my question is, does the coordinator execute the data node scan serially
> or in parallel - and if it's serial,
> is there any thought around how to make it parallel?

IMO, the scan on the data nodes happens independently, i.e., in parallel; the scan results are collected at the coordinator and returned to the client.

Referring to a distributed table using something other than the distribution key (in your case Q instead of K) carries a small penalty.

Regards,
Amul Sul
From: amul s. <sul...@ya...> - 2014-04-29 05:23:42

> On Tuesday, 29 April 2014 10:38 AM, Aaron Jackson <aja...@re...> wrote:
> my question is, does the coordinator execute the data node scan serially
> or in parallel - and if it's serial,
> is there any thought around how to make it parallel?

IMO, the scan on the data nodes happens independently, i.e., in parallel; the scan results are collected at the coordinator and returned to the client.

Referring to a distributed table using something other than the distribution key (in your case Q instead of K) carries a small penalty.

Regards,
Amul Sul
From: Aaron J. <aja...@re...> - 2014-04-29 05:16:09

Also, if it helps, this is actually an operational query, not a data warehouse type query. In this case, we have an object that owns a considerable amount of data below it (1:N:M relationships, where rows in table A have children in table B and grandchildren in table C). The operation I want to perform is a scalable clone of the data. The actual query looks more like the following:

INSERT INTO MyTable SELECT 500 as Id, Foo, Bar from MyTable WHERE Id = $1

For testing purposes (I'm simply tinkering), the data is distributed on Foo and indexed on Id. So in this case I'm copying the "values" from the old entity to the new entity. The plan appears to require the data to be retrieved into the coordinator and then dispersed back down into the data nodes. An optimization for this specific problem would be to push the 'INSERT INTO ... SELECT' directly down into the data nodes, since there isn't any inherent benefit to the coordinator consuming the data. The distribution key is identical, so the data will be sent right back to the data node it came from.

Any other thoughts on how to make this performant? If not, I'll go back to my tinkering table.

________________________________
From: Aaron Jackson [aja...@re...]
Sent: Tuesday, April 29, 2014 12:06 AM
To: pos...@li...
Subject: [Postgres-xc-general] Data Node Scan Performance

I have a table that I've distributed by some key K. When I want to query by some other dimension Q, the coordinator explain plan indicates that it does a Data Node Scan on *table* "_REMOTE_TABLE_QUERY"

Now what I've noticed is that if I have 4 nodes, the coordinator-based scan may take 65 seconds; however, the individual data nodes usually finish within 5-10 seconds. The individual explain plans from each data node reveal nothing. So my question is: does the coordinator execute the data node scan serially or in parallel - and if it's serial, is there any thought around how to make it parallel? In the event it is already parallel, is the time differential I'm seeing simply attributed to the coordinator gathering results in preparation to return to the requesting client?

Thanks
From: Aaron J. <aja...@re...> - 2014-04-29 05:07:32

I have a table that I've distributed by some key K. When I want to query by some other dimension Q, the coordinator explain plan indicates that it does a Data Node Scan on *table* "_REMOTE_TABLE_QUERY"

Now what I've noticed is that if I have 4 nodes, the coordinator-based scan may take 65 seconds; however, the individual data nodes usually finish within 5-10 seconds. The individual explain plans from each data node reveal nothing. So my question is: does the coordinator execute the data node scan serially or in parallel - and if it's serial, is there any thought around how to make it parallel? In the event it is already parallel, is the time differential I'm seeing simply attributed to the coordinator gathering results in preparation to return to the requesting client?

Thanks
From: L <zha...@gm...> - 2014-04-29 02:58:21

-------- Original Message --------
Subject: GTM Proxy can't start
Date: Fri, 25 Apr 2014 14:55:01 +0800
From: 张紫宇 <zha...@gm...>
To: pos...@li...

I installed pgxc 1.2.1 and I wanted to start gtm and gtm_proxy on the same server, but gtm_proxy didn't work.

I did it as such:

sudo su
mkdir /usr/local/pgsql/data_gtm
mkdir /usr/local/pgsql/data_gtm_proxy
chown l /usr/local/pgsql/data_gtm
chown l /usr/local/pgsql/data_gtm_proxy
su l
initgtm -Z gtm -D /usr/local/pgsql/data_gtm
initgtm -Z gtm_proxy -D /usr/local/pgsql/data_gtm_proxy
gtm -D /usr/local/pgsql/data_gtm &
gtm_proxy -D /usr/local/pgsql/data_gtm_proxy

On the last step, it returns "CST -FATAL: can not connect to GTM LOCATION: ConnectGTM, proxy_main.c:3344". Following it into gtm_proxy, I found errno 111 (Connection refused) in function connectFailureMessage, called from GTMPQconnectPoll, connectGTMStart, PQconnectGTMStart, PQconnectGTM, ConnectGTM, RegisterProxy, BaseInit, and main.

My OS is Ubuntu 12.04 amd64, and I also tested it on CentOS 6; I installed pgxc 1.2.1 on both of them, but they both get the same error. I found a mail, "http://sourceforge.net/p/postgres-xc/mailman/message/30755193/"; our situations are exactly the same.

I followed and tried every page I could find on the net but still can't solve it. Can you please tell me what I can do? Any help here would be really appreciated.

gtm.conf and gtm_proxy.conf are as follows:

gtm.conf:

# ----------------------
# GTM configuration file
# ----------------------
#
# This file must be placed on gtm working directory
# specified by -D command line option of gtm or gtm_ctl. The
# configuration file name must be "gtm.conf"
#
# This file consists of lines of the form
#
# name = value
#
# (The "=" is optional.) Whitespace may be used. Comments are
# introduced with "#" anywhere on a line. The complete list of
# parameter names and allowed values can be found in the
# Postgres-XC documentation.
#
# The commented-out settings shown in this file represent the default
# values.
#
# Re-commenting a setting is NOT sufficient to revert it to the default
# value.
#
# You need to restart the server.

#------------------------------------------------------------------------------
# GENERAL PARAMETERS
#------------------------------------------------------------------------------
nodename = 'one'                # Specifies the node name.
                                # (changes requires restart)
#listen_addresses = '*'         # Listen addresses of this GTM.
                                # (changes requires restart)
port = 6666                     # Port number of this GTM.
                                # (changes requires restart)

#startup = ACT                  # Start mode. ACT/STANDBY.

#------------------------------------------------------------------------------
# GTM STANDBY PARAMETERS
#------------------------------------------------------------------------------
# Those parameters are effective when GTM is activated as a standby server
#active_host = ''               # Listen address of active GTM.
                                # (changes requires restart)
#active_port =                  # Port number of active GTM.
                                # (changes requires restart)

#---------------------------------------
# OTHER OPTIONS
#---------------------------------------
#keepalives_idle = 0            # Keepalives_idle parameter.
#keepalives_interval = 0        # Keepalives_interval parameter.
#keepalives_count = 0           # Keepalives_count internal parameter.
#log_file = 'gtm.log'           # Log file name
#log_min_messages = WARNING     # log_min_messages. Default WARNING.
                                # Valid value: DEBUG, DEBUG5, DEBUG4, DEBUG3,
                                # DEBUG2, DEBUG1, INFO, NOTICE, WARNING,
                                # ERROR, LOG, FATAL, PANIC
#synchronous_backup = off       # If backup to standby is synchronous

gtm_proxy.conf:

#-----------------------------
# GTM Proxy configuration file
#-----------------------------
#
# This file must be placed on gtm working directory
# specified by -D command line option of gtm_proxy or gtm_ctl.
# The configuration file name must be "gtm_proxy.conf"
#
# This file consists of lines of the form
#
# name = value
#
# (The "=" is optional.) Whitespace may be used. Comments are
# introduced with "#" anywhere on a line. The complete list of
# parameter names and allowed values can be found in the
# Postgres-XC documentation.
#
# The commented-out settings shown in this file represent the default
# values.
#
# Re-commenting a setting is NOT sufficient to revert it to the default
# value.
#
# You need to restart the server.

#------------------------------------------------------------------------------
# GENERAL PARAMETERS
#------------------------------------------------------------------------------
nodename = 'one'                # Specifies the node name.
                                # (changes requires restart)
#listen_addresses = '*'         # Listen addresses of this GTM.
                                # (changes requires restart)
port = 6666                     # Port number of this GTM.
                                # (changes requires restart)

#------------------------------------------------------------------------------
# GTM PROXY PARAMETERS
#------------------------------------------------------------------------------
#worker_threads = 1             # Number of the worker thread of this
                                # GTM proxy
                                # (changes requires restart)

#------------------------------------------------------------------------------
# GTM CONNECTION PARAMETERS
#------------------------------------------------------------------------------
# Those parameters are used to connect to a GTM server
gtm_host = 'localhost'          # Listen address of the active GTM.
                                # (changes requires restart)
gtm_port = 6668                 # Port number of the active GTM.
                                # (changes requires restart)

#------------------------------------------------------------------------------
# Behavior at GTM communication error
#------------------------------------------------------------------------------
#gtm_connect_retry_interval = 0 # How long (in secs) to wait until the next
                                # retry to connect to GTM.

#------------------------------------------------------------------------------
# Other options
#------------------------------------------------------------------------------
#keepalives_idle = 0            # Keepalives_idle parameter.
#keepalives_interval = 0        # Keepalives_interval parameter.
#keepalives_count = 0           # Keepalives_count internal parameter.
#log_file = 'gtm_proxy.log'     # Log file name
#log_min_messages = WARNING     # log_min_messages. Default WARNING.
                                # Valid value: DEBUG, DEBUG5, DEBUG4, DEBUG3,
                                # DEBUG2, DEBUG1, INFO, NOTICE, WARNING,
                                # ERROR, LOG, FATAL, PANIC.

--Ronian
From: Juned K. <jkh...@gm...> - 2014-04-28 11:31:19

In the logs I found this:

DETAIL: The failed archive command was: rsync pg_xlog/000000010000001C0000006A postgres@db01:/home/postgres/pgxc/nodes/datanode_archlog/000000010000001C0000006A
rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]: Broken pipe (32)
rsync: write failed on "/home/postgres/pgxc/nodes/datanode_archlog/000000010000001C0000006A": No space left on device (28)
rsync error: error in file IO (code 11) at receiver.c(322) [receiver=3.0.9]
rsync: connection unexpectedly closed (28 bytes received so far) [sender]
rsync error: error in rsync protocol data stream (code 12) at io.c(605) [sender=3.0.9]
LOG: archive command failed with exit code 12

On Mon, Apr 28, 2014 at 4:51 PM, Juned Khan <jkh...@gm...> wrote:
> Hi all,
>
> I set up pgxc, but today when I tried to fire one manual query I got the
> message below. I am not able to show table descriptions either.
>
> database=# \d
> ERROR: could not access status of transaction 0
> DETAIL: Could not write to file "pg_clog/000B" at offset 40960: No space
> left on device.
>
> Sometimes it shows the table description in the format below.
>
> database=# \dt
> <table border="1">
> <caption>List of relations</caption>
> <tr>
> <th align="center">Schema</th>
> <th align="center">Name</th>
> <th align="center">Type</th>
> <th align="center">Owner</th>
> </tr>
>
> --
> Thanks,
> Juned Khan
> <http://www.inextrix.com/>

--
Thanks,
Juned Khan
iNextrix Technologies Pvt Ltd.
www.inextrix.com
From: Juned K. <jkh...@gm...> - 2014-04-28 11:22:00
|
Hi all,

I set up pgxc, but today when I tried to fire one manual query I got the
message below. I am also not able to show table descriptions.

database=# \d
ERROR:  could not access status of transaction 0
DETAIL:  Could not write to file "pg_clog/000B" at offset 40960: No space
left on device.

Sometimes it shows the table description in the format below:

database=# \dt
<table border="1">
<caption>List of relations</caption>
<tr>
<th align="center">Schema</th>
<th align="center">Name</th>
<th align="center">Type</th>
<th align="center">Owner</th>
</tr>

--
Thanks,
Juned Khan

<http://www.inextrix.com/>
|
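Side note: the HTML <table> output is a separate issue from the disk-full error. psql prints results as HTML when its HTML output mode is on (toggled with \H, or enabled with the -H command-line switch), which is presumably what happened here. To get the normal layout back:

    database=# \pset format aligned

or simply toggle HTML mode off again with \H.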
From: Aaron J. <aja...@re...> - 2014-04-26 06:07:08
|
I honestly have no idea what the optimizer is doing; however, I have isolated the behavior down to a simple change that eliminates the problem at the -O2 optimization level.

    for (index = 0; index < arrayP->numProcs; index++)
    {
        if (arrayP->pgprocnos[index] == proc->pgprocno)
        {
            /* Keep the PGPROC array sorted. See notes above */
            memmove(&arrayP->pgprocnos[index], &arrayP->pgprocnos[index + 1],
                    (arrayP->numProcs - index - 1) * sizeof(int));
            arrayP->pgprocnos[arrayP->numProcs - 1] = -1; /* for debugging */
            arrayP->numProcs--;
            LWLockRelease(ProcArrayLock);
            return;
        }
    }

    /* Ooops */
    LWLockRelease(ProcArrayLock);
    elog(LOG, "ProcArrayRemove(post-test): %p", &index);
    elog(LOG, "failed to find proc %p in ProcArray", proc);
}

The *only* change I made is to log the pointer to index after the loop. I tried many things, but it was necessary to perform an operation that forces the evaluation of index's address.

Hope this helps,
Aaron

________________________________
From: Aaron Jackson [aja...@re...]
Sent: Friday, April 25, 2014 4:26 PM
To: pos...@li...
Subject: Re: [Postgres-xc-general] failed to find proc - increasing numProcs

It's quite possible I'm missing something obvious, but here is how I've modified procarray.c. The idea was to capture the failing values in order to understand why the test fails.

    void
    ProcArrayRemove(PGPROC *proc, TransactionId latestXid)
    {
        ProcArrayStruct *arrayP = procArray;
        int         index;
        int         _xNumProcs;
        int         _xIndex;

        ...

        for (index = 0; (_xIndex = index) < (_xNumProcs = arrayP->numProcs); index++)
        {
            if (arrayP->pgprocnos[index] == proc->pgprocno)
            {
                /* Keep the PGPROC array sorted. See notes above */
                memmove(&arrayP->pgprocnos[index], &arrayP->pgprocnos[index + 1],
                        (arrayP->numProcs - index - 1) * sizeof(int));
                arrayP->pgprocnos[arrayP->numProcs - 1] = -1; /* for debugging */
                arrayP->numProcs--;
                LWLockRelease(ProcArrayLock);
                return;
            }
        }

        /* Ooops */
        LWLockRelease(ProcArrayLock);
        elog(LOG, "ProcArrayRemove(post-test): %d | %d | %d | %d",
             _xIndex, _xNumProcs, arrayP->numProcs, _xIndex < _xNumProcs);
        elog(LOG, "failed to find proc %p in ProcArray", proc);
    }

With CFLAGS="" this works as expected. Once I set CFLAGS="-O2" (or anything similar) it falls apart. For example, the fall-through case triggered and showed the following:

    ProcArrayRemove(post-test): 1 | 9 | 9 | 1

This means the loop test should have succeeded. I could take this one step further and cache the result of the for-loop test; however, I can tell you from prior experience that _xIndex < _xNumProcs evaluated as FALSE. I'm really not sure what the compiler is doing to draw that conclusion from 1 < 9.

Aaron

________________________________
From: Aaron Jackson [aja...@re...]
Sent: Friday, April 25, 2014 3:05 PM
To: 鈴木 幸市
Cc: pos...@li...
Subject: Re: [Postgres-xc-general] failed to find proc - increasing numProcs

CFLAGS="-O2"
gcc version 4.8.1 (Ubuntu/Linaro 4.8.1-10ubuntu9)

The failed evaluation occurs on line 421 of backend/storage/ipc/procarray.c; the test portion of the clause fails. I'm not entirely sure why gcc specifically fails here, but if I were taking an educated guess, it would be that arrayP->numProcs is effectively volatile and the result of the test was optimized and cached. I've used several techniques (none of which I like) to fool gcc into believing the value is volatile and discarding the cached value of arrayP->numProcs. It concerns me more because ProcArrayLock should be held during this sequence.

Aaron

________________________________
From: 鈴木 幸市 [ko...@in...]
Sent: Sunday, April 13, 2014 7:55 PM
To: Aaron Jackson
Cc: pos...@li...
Subject: Re: [Postgres-xc-general] failed to find proc - increasing numProcs

Thank you Aaron for the detailed analysis. As long as the issue is just for XC, we need a fix for it to work correctly regardless of the compiler optimization. Did you locate where such a wrong evaluation takes place? And what compilation options did you use? They would be very helpful.

Best;
---
Koichi Suzuki

2014/04/12 11:40、Aaron Jackson <aja...@re...<mailto:aja...@re...>> のメール:

It appears the problem is a compiler optimization issue. I narrowed it down to the loop at the end of the ProcArrayRemove method. I'm not entirely sure why, but the compiler generated code that evaluates the test block of the loop improperly. Since changing the compiler options, the problem has been resolved.

Aaron

________________________________
From: Aaron Jackson [aja...@re...<mailto:aja...@re...>]
Sent: Friday, April 11, 2014 1:07 AM
To: pos...@li...<mailto:pos...@li...>
Subject: Re: [Postgres-xc-general] failed to find proc - increasing numProcs

I forgot to mention that if I injected a context switch (sleep(0) did the trick, as did an elog statement) during the test in ProcArrayRemove, it no longer failed. Hopefully that will help in understanding why that may have caused ProcArrayRemove to succeed.

------------------------------------------------------------------------------
Put Bad Developers to Shame
Dominate Development with Jenkins Continuous Integration
Continuously Automate Build, Test & Deployment
Start a new project now. Try Jenkins in the cloud.
http://p.sf.net/sfu/13600_Cloudbees
_______________________________________________
Postgres-xc-general mailing list
Pos...@li...
https://lists.sourceforge.net/lists/listinfo/postgres-xc-general
|
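For anyone who wants to check what the optimizer actually emitted for that loop, a generic sketch for a gcc toolchain follows; the paths are placeholders, and compiling the file standalone would additionally need the same -I and -D options the Postgres-XC makefiles use:

    # Emit optimized assembly for the translation unit (schematic only):
    gcc -O2 -S -o procarray.s backend/storage/ipc/procarray.c

    # Or disassemble the function in an already-built binary:
    gdb /path/to/bin/postgres -batch -ex 'disassemble ProcArrayRemove'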
From: Aaron J. <aja...@re...> - 2014-04-25 21:27:24
|
It's quite possible I'm missing something obvious, but here is how I've modified procarray.c. The idea was to capture the failing values in order to understand why the test fails.

    void
    ProcArrayRemove(PGPROC *proc, TransactionId latestXid)
    {
        ProcArrayStruct *arrayP = procArray;
        int         index;
        int         _xNumProcs;
        int         _xIndex;

        ...

        for (index = 0; (_xIndex = index) < (_xNumProcs = arrayP->numProcs); index++)
        {
            if (arrayP->pgprocnos[index] == proc->pgprocno)
            {
                /* Keep the PGPROC array sorted. See notes above */
                memmove(&arrayP->pgprocnos[index], &arrayP->pgprocnos[index + 1],
                        (arrayP->numProcs - index - 1) * sizeof(int));
                arrayP->pgprocnos[arrayP->numProcs - 1] = -1; /* for debugging */
                arrayP->numProcs--;
                LWLockRelease(ProcArrayLock);
                return;
            }
        }

        /* Ooops */
        LWLockRelease(ProcArrayLock);
        elog(LOG, "ProcArrayRemove(post-test): %d | %d | %d | %d",
             _xIndex, _xNumProcs, arrayP->numProcs, _xIndex < _xNumProcs);
        elog(LOG, "failed to find proc %p in ProcArray", proc);
    }

With CFLAGS="" this works as expected. Once I set CFLAGS="-O2" (or anything similar) it falls apart. For example, the fall-through case triggered and showed the following:

    ProcArrayRemove(post-test): 1 | 9 | 9 | 1

This means the loop test should have succeeded. I could take this one step further and cache the result of the for-loop test; however, I can tell you from prior experience that _xIndex < _xNumProcs evaluated as FALSE. I'm really not sure what the compiler is doing to draw that conclusion from 1 < 9.

Aaron

________________________________
From: Aaron Jackson [aja...@re...]
Sent: Friday, April 25, 2014 3:05 PM
To: 鈴木 幸市
Cc: pos...@li...
Subject: Re: [Postgres-xc-general] failed to find proc - increasing numProcs

CFLAGS="-O2"
gcc version 4.8.1 (Ubuntu/Linaro 4.8.1-10ubuntu9)

The failed evaluation occurs on line 421 of backend/storage/ipc/procarray.c; the test portion of the clause fails. I'm not entirely sure why gcc specifically fails here, but if I were taking an educated guess, it would be that arrayP->numProcs is effectively volatile and the result of the test was optimized and cached. I've used several techniques (none of which I like) to fool gcc into believing the value is volatile and discarding the cached value of arrayP->numProcs. It concerns me more because ProcArrayLock should be held during this sequence.

Aaron

________________________________
From: 鈴木 幸市 [ko...@in...]
Sent: Sunday, April 13, 2014 7:55 PM
To: Aaron Jackson
Cc: pos...@li...
Subject: Re: [Postgres-xc-general] failed to find proc - increasing numProcs

Thank you Aaron for the detailed analysis. As long as the issue is just for XC, we need a fix for it to work correctly regardless of the compiler optimization. Did you locate where such a wrong evaluation takes place? And what compilation options did you use? They would be very helpful.

Best;
---
Koichi Suzuki

2014/04/12 11:40、Aaron Jackson <aja...@re...<mailto:aja...@re...>> のメール:

It appears the problem is a compiler optimization issue. I narrowed it down to the loop at the end of the ProcArrayRemove method. I'm not entirely sure why, but the compiler generated code that evaluates the test block of the loop improperly. Since changing the compiler options, the problem has been resolved.

Aaron

________________________________
From: Aaron Jackson [aja...@re...<mailto:aja...@re...>]
Sent: Friday, April 11, 2014 1:07 AM
To: pos...@li...<mailto:pos...@li...>
Subject: Re: [Postgres-xc-general] failed to find proc - increasing numProcs

I forgot to mention that if I injected a context switch (sleep(0) did the trick, as did an elog statement) during the test in ProcArrayRemove, it no longer failed. Hopefully that will help in understanding why that may have caused ProcArrayRemove to succeed.

------------------------------------------------------------------------------
Put Bad Developers to Shame
Dominate Development with Jenkins Continuous Integration
Continuously Automate Build, Test & Deployment
Start a new project now. Try Jenkins in the cloud.
http://p.sf.net/sfu/13600_Cloudbees
_______________________________________________
Postgres-xc-general mailing list
Pos...@li...
https://lists.sourceforge.net/lists/listinfo/postgres-xc-general
|
From: Aaron J. <aja...@re...> - 2014-04-25 20:10:14
|
CFLAGS="-O2" gcc version 4.8.1 (Ubuntu/Linaro 4.8.1-10ubuntu9) The failed evaluation occurs on line 421 of backend/storage/ipc/procarray.c The test portion of the clause fails. I'm not entirely sure why gcc specifically fails, but if I were taking an educated guess, it would be that arrayP->numProcs was volatile and the resultant value of the test was optimized and cached. I've used several techniques (none of which I like) to fool gcc into believing the value is volatile and discarding the value of arrayP->numProcs. It concerns me more because the ProcArrayLock should be locked during this sequence. Aaron ________________________________ From: 鈴木 幸市 [ko...@in...] Sent: Sunday, April 13, 2014 7:55 PM To: Aaron Jackson Cc: pos...@li... Subject: Re: [Postgres-xc-general] failed to find proc - increasing numProcs Thank you Aaron for the detailed analysis. As long as the issue is just for XC, we need a fix for it to work correctly regardless the compiler optimization. Did to locate where such wrong estimation takes place? And what compilation option did you use? They are very helpful. Best; --- Koichi Suzuki 2014/04/12 11:40、Aaron Jackson <aja...@re...<mailto:aja...@re...>> のメール: It appears that problem is a compiler optimization issue. I narrowed the issue down to the loop at the end of the ProcArrayRemove method. I'm not entirely sure why, but the compiler generated code that evaluates the test block of the loop improperly. Since changing the compiler options, the problem has been resolved. Aaron ________________________________ From: Aaron Jackson [aja...@re...<mailto:aja...@re...>] Sent: Friday, April 11, 2014 1:07 AM To: pos...@li...<mailto:pos...@li...> Subject: Re: [Postgres-xc-general] failed to find proc - increasing numProcs I forgot to mention that if I injected a context switch (sleep(0) did the trick as did an elog statement) during the test in the ProcArrayRemove, that it no longer failed. Hopefully that will help in understanding the reasons why that may have triggered the ProcArrayRemove to succeed. ------------------------------------------------------------------------------ Put Bad Developers to Shame Dominate Development with Jenkins Continuous Integration Continuously Automate Build, Test & Deployment Start a new project now. Try Jenkins in the cloud. http://p.sf.net/sfu/13600_Cloudbees_______________________________________________ Postgres-xc-general mailing list Pos...@li... https://lists.sourceforge.net/lists/listinfo/postgres-xc-general |