From: Michael P. <mic...@gm...> - 2011-03-06 23:55:20

Hi,

Thanks for your input. I am always very careful with patches that play with the error handling on nodes. But... I had a look at your patch, and I agree that there is indeed a problem with error handling when launching utilities or DDL.

In the current code, ExecRemoteUtility does something like this:
1) Send the query to the datanodes
2) Check for errors (at this point connections can be sent back to the pooler)
3) Send the query to the coordinators
4) Check for errors, once again (at this point connections can safely, once again, be sent back to the pooler)

There is indeed a logic problem with the error handling for datanodes. If an error occurs when sending the query to the datanodes, or when checking the error messages for the remote state, we may send back to the pooler clean datanode connections together with coordinator connections that still have messages on them. This may create data inconsistency through dirty messages on coordinator-coordinator connections.

Let me have a closer look at this patch; I think I'll commit it.

On Fri, Mar 4, 2011 at 5:13 PM, xiong wang <wan...@gm...> wrote:
> Dears,
>
> In PGXC, when there is an error from the datanodes, the coordinators
> involved in the executing statement's state, and the data they sent to
> the executing coordinator, have no chance to be processed at all. The
> patch fixes such a problem. I hope it may be useful.
>
> Regards,
> Benny

Thanks,
--
Michael Paquier
http://michaelpq.users.sourceforge.net
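The ordering problem described above can be illustrated with a minimal standalone sketch (plain C; every function below is a hypothetical stub, not Postgres-XC code): responses from every node group are consumed before any connection goes back to the pooler, and a connection that may still carry unread messages is discarded rather than pooled.

#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stubs standing in for the real remote-query machinery. */
static bool send_query(const char *nodes, const char *query) { printf("send \"%s\" to %s\n", query, nodes); return true; }
static bool check_responses(const char *nodes) { printf("check responses from %s\n", nodes); return true; }
static void return_to_pooler(const char *nodes) { printf("return %s connections to pooler\n", nodes); }
static void discard_connections(const char *nodes) { printf("discard %s connections\n", nodes); }

/*
 * Sketch of the safer ordering: send to both node groups first, then check
 * both, and only release connections once every pending message has been
 * consumed.  On any failure, drop the connections instead of pooling them.
 */
static bool exec_remote_utility_sketch(const char *query)
{
    bool ok = true;

    ok &= send_query("datanodes", query);
    ok &= send_query("coordinators", query);

    /* Consume responses from every group before touching the pooler. */
    ok &= check_responses("datanodes");
    ok &= check_responses("coordinators");

    if (ok)
    {
        return_to_pooler("datanodes");
        return_to_pooler("coordinators");
    }
    else
    {
        /* Connections may still carry messages: never pool them dirty. */
        discard_connections("datanodes");
        discard_connections("coordinators");
    }
    return ok;
}

int main(void)
{
    return exec_remote_utility_sketch("CREATE TABLE t (a int)") ? 0 : 1;
}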
From: Michael <mic...@gm...> - 2011-03-06 09:39:34

(2011/03/05 15:08), Abbas Butt wrote:
> By using this are you saying that we make a separate regression sql
> and expected directories for XC?
Of course not. Instead of using a "make installcheck" command each time, I thought this patch might bring more flexibility when doing tests. I just meant that, as we are focusing on stability, it would be nice to be able to launch regression tests with a simple "pg_regress" command with chosen options from a chosen directory, since we are going to use a lot of schedules depending on the cases we try to reproduce.

Also, with the current pg_regress you can freely set up the output and input directories, but there is no option to set a directory for the "expected" and "results" folders.
--
Michael
michael.otacoo.com <http://michael.otacoo.com>
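Purely as an illustration (the --testdir option and variable below are hypothetical, not the attached patch), a directory option of this kind is usually wired into a pg_regress-style driver through getopt_long, along these lines:

#include <getopt.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical option: a base directory containing the "sql/" and "expected/" folders. */
static char testdir[1024] = ".";

int main(int argc, char *argv[])
{
    static struct option long_options[] = {
        {"testdir", required_argument, NULL, 1},
        {NULL, 0, NULL, 0}
    };
    int c;

    while ((c = getopt_long(argc, argv, "", long_options, NULL)) != -1)
    {
        if (c == 1)
        {
            strncpy(testdir, optarg, sizeof(testdir) - 1);
            testdir[sizeof(testdir) - 1] = '\0';
        }
    }

    /* The runner would then read <testdir>/sql and diff against <testdir>/expected. */
    printf("sql dir:      %s/sql\n", testdir);
    printf("expected dir: %s/expected\n", testdir);
    return 0;
}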
From: Abbas B. <abb...@te...> - 2011-03-05 06:09:08

By using this, are you saying that we should make separate regression "sql" and "expected" directories for XC?

On Fri, Mar 4, 2011 at 2:45 PM, Michael Paquier <mic...@gm...> wrote:
> Hi all,
>
> Please find attached a small patch adding an option in pg_regress that
> allows customizing the directory where the "expected" and "sql" folders are.
> I felt that it was missing in the current implementation if you want to use
> a customized set of SQL queries.
>
> Perhaps it may be better to give it to the PostgreSQL community...
>
> Regards,
> --
> Michael Paquier
> http://michaelpq.users.sourceforge.net
From: Abbas B. <abb...@te...> - 2011-03-05 06:07:23

Good catch.

On Fri, Mar 4, 2011 at 1:18 PM, xiong wang <wan...@gm...> wrote:
> Dears,
> I am sorry that I missed the attachment.
>
> Regards,
> Benny
>
> 2011/3/4 xiong wang <wan...@gm...>:
> > Dears,
> >
> > The original process for COPY FROM in CSV HEADER mode in PGXC can be
> > described as follows:
> > 1. The Coordinator reads the header and then throws it away;
> > 2. The Coordinator reads the data and distributes it to the datanodes
> >    in the same mode (CSV HEADER).
> > 3. The datanodes still try to throw away a header, but the data coming
> >    from the Coordinator has no header, only data. Therefore, the first
> >    row of the data sent to each datanode is ignored. The number of
> >    ignored rows equals the number of datanodes involved.
> >
> > The patch changes the way the Coordinator sends data to the datanodes.
> > The Coordinator now always sends to the datanodes with no HEADER, so
> > the datanodes never need to skip a header that has already been thrown
> > away by the Coordinator.
> >
> > Regards,
> > Benny
From: Koichi S. <koi...@gm...> - 2011-03-04 10:24:38

This is a serious restriction of the current implementation. I discussed it with Michael and Abbas and we have the following ideas:
1) Write new distribution functions for numeric and character (including varchar) types;
2) Allow user-defined distribution functions;
3) Allow tables to be distributed to a subset of the datanodes.

I assume your point includes two more things:
a) choosing any column as the distribution key, and
b) choosing the PRIMARY KEY as the default distribution key, which may be a combination of any number of columns.
We may be able to implement these if we restrict the PRIMARY KEY columns to one of integer, numeric or character.

Regards;
----------
Koichi Suzuki

2011/3/4 rahua1 <ra...@16...>:
> Hi, all
>
> When creating a table with a primary key:
>
> h=# create table test(a int, b varchar(5), primary key(b,a));
>
> Postgres-XC always selects the first column of the primary key as the hash
> distribution key when there is no "distribute by" clause, so an error occurs:
> ERROR:  Column b is not a hash distributable data type.
>
> Is this a bug in Postgres-XC? Should Postgres-XC be smart enough to select
> another column that can be used as the hash key?
>
> Any reply will be greatly appreciated!!
>
> Best regards to all!!
>
> 2011-03-04
> ________________________________
> rahua1
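A minimal, self-contained sketch (hypothetical types and names, not Postgres-XC code) of the behaviour rahua1 is asking about: walk the primary-key columns in order and pick the first one whose type is hash-distributable, instead of unconditionally taking the first column.

#include <stdbool.h>
#include <stdio.h>

/* Simplified stand-ins for catalog type information. */
typedef enum { COL_INT, COL_NUMERIC, COL_CHAR, COL_VARCHAR, COL_OTHER } ColType;

typedef struct {
    const char *name;
    ColType     type;
} PkColumn;

/* Assume only integer types are hash-distributable here; numeric/char would
 * need the new distribution functions discussed above. */
static bool is_hash_distributable(ColType t)
{
    return t == COL_INT;
}

/* Return the first hashable primary-key column, or NULL if none qualifies. */
static const PkColumn *choose_distribution_key(const PkColumn *cols, int ncols)
{
    for (int i = 0; i < ncols; i++)
        if (is_hash_distributable(cols[i].type))
            return &cols[i];
    return NULL;
}

int main(void)
{
    /* PRIMARY KEY (b, a) from the example: b is varchar, a is int. */
    PkColumn pk[] = { {"b", COL_VARCHAR}, {"a", COL_INT} };
    const PkColumn *key = choose_distribution_key(pk, 2);

    if (key)
        printf("distribute by hash(%s)\n", key->name);
    else
        printf("fall back to round robin / replication\n");
    return 0;
}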
From: Michael P. <mic...@gm...> - 2011-03-04 09:45:06

Hi all,

Please find attached a small patch adding an option in pg_regress that allows customizing the directory where the "expected" and "sql" folders are. I felt that it was missing in the current implementation if you want to use a customized set of SQL queries.

Perhaps it may be better to give it to the PostgreSQL community...

Regards,
--
Michael Paquier
http://michaelpq.users.sourceforge.net
From: rahua1 <ra...@16...> - 2011-03-04 09:07:00

Hi, all

When creating a table with a primary key:

h=# create table test(a int, b varchar(5), primary key(b,a));

Postgres-XC always selects the first column of the primary key as the hash distribution key when there is no "distribute by" clause, so an error occurs:
ERROR:  Column b is not a hash distributable data type.

Is this a bug in Postgres-XC? Should Postgres-XC be smart enough to select another column that can be used as the hash key?

Any reply will be greatly appreciated!!

Best regards to all!!

2011-03-04
rahua1
From: xiong w. <wan...@gm...> - 2011-03-04 08:18:48

Dears,

I am sorry that I missed the attachment.

Regards,
Benny

2011/3/4 xiong wang <wan...@gm...>:
> Dears,
>
> The original process for COPY FROM in CSV HEADER mode in PGXC can be
> described as follows:
> 1. The Coordinator reads the header and then throws it away;
> 2. The Coordinator reads the data and distributes it to the datanodes
>    in the same mode (CSV HEADER).
> 3. The datanodes still try to throw away a header, but the data coming
>    from the Coordinator has no header, only data. Therefore, the first
>    row of the data sent to each datanode is ignored. The number of
>    ignored rows equals the number of datanodes involved.
>
> The patch changes the way the Coordinator sends data to the datanodes.
> The Coordinator now always sends to the datanodes with no HEADER, so
> the datanodes never need to skip a header that has already been thrown
> away by the Coordinator.
>
> Regards,
> Benny
From: xiong w. <wan...@gm...> - 2011-03-04 08:13:19

Dears,

In PGXC, when there is an error from the datanodes, the coordinators involved in the executing statement's state, and the data they sent to the executing coordinator, have no chance to be processed at all. The patch fixes such a problem. I hope it may be useful.

Regards,
Benny
From: xiong w. <wan...@gm...> - 2011-03-04 07:56:36

Dears,

The original process for COPY FROM in CSV HEADER mode in PGXC can be described as follows:
1. The Coordinator reads the header and then throws it away;
2. The Coordinator reads the data and distributes it to the datanodes in the same mode (CSV HEADER).
3. The datanodes still try to throw away a header, but the data coming from the Coordinator has no header, only data. Therefore, the first row of the data sent to each datanode is ignored. The number of ignored rows equals the number of datanodes involved.

The patch changes the way the Coordinator sends data to the datanodes. The Coordinator now always sends to the datanodes with no HEADER, so the datanodes never need to skip a header that has already been thrown away by the Coordinator.

Regards,
Benny
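As a self-contained illustration of the fix described above (plain C with hypothetical helper names, not the actual COPY code path): the coordinator consumes the header line once and forwards only data rows, so the per-datanode COPY never carries a HEADER option.

#include <stdio.h>

/* Hypothetical forwarding hook: in the real system this would append the
 * row to the COPY data stream of the datanode chosen by the distribution key. */
static void forward_to_datanode(const char *row)
{
    printf("-> datanode: %s", row);
}

/*
 * Coordinator-side sketch: with CSV HEADER, skip the first line locally and
 * forward the remaining rows.  The datanode-side COPY is issued without
 * HEADER, so no datanode drops a data row by mistake.
 */
static void coordinator_copy_from(FILE *in, int csv_header)
{
    char line[4096];
    int first = 1;

    while (fgets(line, sizeof(line), in) != NULL)
    {
        if (first && csv_header)
        {
            first = 0;          /* header consumed once, on the coordinator */
            continue;
        }
        first = 0;
        forward_to_datanode(line);
    }
}

int main(void)
{
    /* Feed the sketch from stdin, e.g.: printf "a,b\n1,2\n3,4\n" | ./a.out */
    coordinator_copy_from(stdin, 1);
    return 0;
}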
From: Michael P. <mic...@gm...> - 2011-03-04 04:50:21

Hi,

You are definitely right. It is not really easy to understand which base we are using for the HA implementation. So this could be done in the following order:
1) a patch for the separate configuration file
2) a patch for the mirroring feature
3) a patch for XCM
4) a patch for the interface between XCM and Postgres-XC

Please note that for the moment, and for stabilization purposes, it is planned not to merge the HA experimental features into the master branch. For the next release, what may be done is one tarball based on the ha_support branch and one based on the master branch.

We are currently making a difficult merge with 9.0.3, and this has to be stabilized first with clean regression tests before envisaging a merge between the HA functionality (ha_support) and the master branch. Even then, it may be better in the long term to keep the two branches separate. However, the current ha_support branch may have its name changed to master, and the current master branch may have its name changed to... let's say "core".

Regards,

On Wed, Mar 2, 2011 at 10:46 PM, Mason <ma...@us...> wrote:
> I just wanted to make a comment.
>
> The ha_support branch has a lot of extensive changes, including node
> management and monitoring. When considering merging components into
> the main branch, it may make sense to break it out into multiple
> patches and post them to the developer list to make them easier to
> review in digestible chunks. For example:
>
> - a patch that just defines nodes in a separate file outside of
>   postgresql.conf
> - a patch for other node management enhancements
> - a patch for monitoring enhancements
> - a patch for HA
>
> Regards,
>
> Mason

--
Michael Paquier
http://michaelpq.users.sourceforge.net
From: Mason <ma...@us...> - 2011-03-02 13:46:54

I just wanted to make a comment.

The ha_support branch has a lot of extensive changes, including node management and monitoring. When considering merging components into the main branch, it may make sense to break it out into multiple patches and post them to the developer list to make them easier to review in digestible chunks. For example:

- a patch that just defines nodes in a separate file outside of postgresql.conf
- a patch for other node management enhancements
- a patch for monitoring enhancements
- a patch for HA

Regards,

Mason
From: xiong w. <wan...@gm...> - 2011-02-28 06:15:43

Hi Michael,

The attachments include the following files:
1. The patch that fixes the bugs about rules that you reported.
2. The test cases for multi-row INSERT.
3. The expected output files for insert.sql.
4. The test cases for multi-row INSERT with rules.

There are still bugs with rules in PGXC, therefore I only attached the test cases for rules, without the corresponding expected file.

Regards,
Benny

2011/2/22 Michael Paquier <mic...@gm...>:
> Hi,
>
> Here is a little bit of feedback about the rule crash.
> I fixed an issue I found with rules myself this morning.
>
> So based on that I ran a couple of tests with your patch.
> 1) Case "do instead nothing": works well
> [...]
> 2) With an insert rule: "do also"
> [...]
> It looks like the query is not run on the right table.
> In RewriteInsertStmt, only one piece of locator information is used when
> rewriting the query: only the locator information of the table the rule is
> applied to is taken into account.
>
> For example, in my case queries are not rewritten for table bb but only for
> table aa. It may be possible to also take into account the table bb defined
> in the rules when building the lists of values.
>
> If the others have any ideas about how this could be done smoothly,
> all ideas are welcome.
>
> I think you should modify RewriteInsertStmt to also take into account the
> rules that have been fired on this query.
> I suppose this information is visible in the parse tree, as it works well
> for a single INSERT value.
>
> I attach a modified version of the patch you sent.
> It does exactly the same thing as your first version.
>
> Regards,
> --
> Michael Paquier
> http://michaelpq.users.sourceforge.net
From: xabc1000 <xab...@16...> - 2011-02-25 01:08:04

Hi,

When a table with a foreign key is created, a crash is encountered. After debugging, I found that the pointer "cxt->rel" in the function "checkLocalFKConstraints" was NULL.

What is the purpose of "checkLocalFKConstraints", and what is the purpose of the following code?

foreach(attritem, fkconstraint->fk_attrs)
{
    char *attrname = (char *) strVal(lfirst(attritem));

    if (strcmp(cxt->rel->rd_locator_info->partAttrName, attrname) == 0)
    {
        /* Found the ordinal position in constraint */
        break;
    }
    pos++;
}

If someone can help me, I would be very grateful.

Yours,
xcix

2011-02-25
xabc1000
From: Suzuki H. <hir...@in...> - 2011-02-24 14:58:33

Thank you for your kind answer.

> I have not yet studied the latest docs for GTM Standby in detail, but
> I think something could be done if message ids are assigned, and
> recent messages are buffered, such that when a GTM-Standby is
> promoted, it can compare the last message id with the others and they
> can sync up any missing messages. Perhaps this process can be done in
> such a way that cluster-wide it is known in which order the
> GTM-standbys will be contacted (ignoring the fact that broadcasting
> with acknowledgements could also be coded one day).

This is the answer that I expected. I think that this is a general solution: "... message ids are assigned, and recent messages are buffered ...". I have used the same idea in some systems I made. I asked because I could not find any API for this purpose in the XCM document.

(Actually, Koichi-san said before, "GTM must move faster. I want to avoid context switches as much as possible, etc." I thought that he worried a lot about speed. Therefore, I wanted to know what method HA-GTM uses.)

> Anyway, whatever the exact technique that is chosen, I do not think
> this is an unsurmountable issue.

Yes, I think so. It is not difficult.

> Also, for more background, code is currently being written such that
> even if there is no GTM standby, a restarted GTM can rebuild its state
> from information from the other nodes, but failover could be quicker
> with a dedicated GTM standby.

Great. I think that is a big technical challenge.

Thanks a lot.
From: Mason S. <mas...@gm...> - 2011-02-24 13:33:25

On Thu, Feb 24, 2011 at 2:15 AM, Suzuki Hironobu <hir...@in...> wrote:
> Thank you for your kind response.
>
>> We're assuring every message has been reached by receiving responses,
>> except for very few cases. One of them is reporting a failure to
>> xcwatcher. Here, because the failure will be reported from another
>> source sooner or later, we don't care whether each report reaches
>> xcwatcher. In very critical cases, xcwatcher will find no connection
>> to the monitoring agent, or the monitoring agent will detect its local
>> component failure. When we use UDP, we always have backups and we
>> limit this use so that it does not affect database integrity within
>> the cluster.
>
> I understand that xcwatchers are able to find almost all failures.
>
>> We're assuring every message has been reached by receiving responses,
>> except for very few cases.
>
> I'm interested in GTM, because it is the SPOF of XC.
> I'm especially interested in these "very few cases".
>
> My questions were very simple.
> In the case below:
>>> For example:
>>> (1) gtm-standby1 receives a message from gtm-act,
>>> (2) gtm-act crashes!
>>> (3) gtm-standby2 never receives it.
>>> This is a typical case, and there are many similar cases.
>
> First: I think that there is a possibility that this happens, though it
> is very rare. Is my thought correct?
>
> Second: If my thought is correct, can gtm-standby2 receive the last
> message that did not reach it, after xcwatcher detects the failure?
> Or, if my thought is not correct, how are all messages sent perfectly?
>
> The most fundamental question is:
> How is the consistency of the data between gtm-act and two or more
> gtm-standbys kept?
> (I think that the consistency of data among GTMs is a necessary
> condition for HA-GTM.)

I have not yet studied the latest docs for GTM Standby in detail, but I think something could be done if message ids are assigned and recent messages are buffered, such that when a GTM-Standby is promoted, it can compare the last message id with the others and they can sync up any missing messages. Perhaps this process can be done in such a way that it is known cluster-wide in which order the GTM-standbys will be contacted (ignoring the fact that broadcasting with acknowledgements could also be coded one day).

Also, for more background, code is currently being written such that even if there is no GTM standby, a restarted GTM can rebuild its state from information from the other nodes, but failover could be quicker with a dedicated GTM standby. Also, GTM can currently save its state when shut down gracefully. Such state info could theoretically be sent over from the promoted standby to the other ones if there is a problem. Similarly, this info could be sent over when spinning up a new GTM Standby dynamically.

Anyway, whatever the exact technique that is chosen, I do not think this is an insurmountable issue.

Regards,

Mason

> Of course, if only one GTM-standby runs, the problem is easy.
>
> Regards,
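A small self-contained sketch of the buffering idea Mason describes (purely illustrative; the struct and function names are made up, not GTM code): every status update gets a monotonically increasing id and is kept in a ring buffer, so a newly promoted standby can replay to a peer everything after the peer's last acknowledged id.

#include <stdint.h>
#include <stdio.h>

#define RINGSIZE 8   /* keep only the most recent updates */

typedef struct {
    uint64_t id;
    char     payload[64];
} GtmMsg;

static GtmMsg   ring[RINGSIZE];
static uint64_t next_id = 1;

/* Record an outgoing status update with a new message id. */
static uint64_t buffer_message(const char *payload)
{
    GtmMsg *slot = &ring[next_id % RINGSIZE];

    slot->id = next_id;
    snprintf(slot->payload, sizeof(slot->payload), "%s", payload);
    return next_id++;
}

/* Replay to a peer every buffered message newer than its last known id. */
static void catch_up_peer(uint64_t peer_last_id)
{
    for (uint64_t id = peer_last_id + 1; id < next_id; id++)
    {
        GtmMsg *slot = &ring[id % RINGSIZE];

        if (slot->id != id)
        {
            printf("gap: message %llu already evicted, full resync needed\n",
                   (unsigned long long) id);
            return;
        }
        printf("resend %llu: %s\n", (unsigned long long) slot->id, slot->payload);
    }
}

int main(void)
{
    buffer_message("BEGIN gxid 1000");
    buffer_message("COMMIT gxid 1000");
    buffer_message("BEGIN gxid 1001");

    /* A standby that last saw id 1 gets ids 2 and 3 replayed. */
    catch_up_peer(1);
    return 0;
}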
From: Suzuki H. <hir...@in...> - 2011-02-24 07:14:51

Thank you for your kind response.

> We're assuring every message has been reached by receiving responses,
> except for very few cases. One of them is reporting a failure to
> xcwatcher. Here, because the failure will be reported from another
> source sooner or later, we don't care whether each report reaches
> xcwatcher. In very critical cases, xcwatcher will find no connection
> to the monitoring agent, or the monitoring agent will detect its local
> component failure. When we use UDP, we always have backups and we
> limit this use so that it does not affect database integrity within
> the cluster.

I understand that xcwatchers are able to find almost all failures.

> We're assuring every message has been reached by receiving responses,
> except for very few cases.

I'm interested in GTM, because it is the SPOF of XC. I'm especially interested in these "very few cases".

My questions were very simple. In the case below:

>> For example:
>> (1) gtm-standby1 receives a message from gtm-act,
>> (2) gtm-act crashes!
>> (3) gtm-standby2 never receives it.
>> This is a typical case, and there are many similar cases.

First: I think that there is a possibility that this happens, though it is very rare. Is my thought correct?

Second: If my thought is correct, can gtm-standby2 receive the last message that did not reach it, after xcwatcher detects the failure? Or, if my thought is not correct, how are all messages sent perfectly?

The most fundamental question is: how is the consistency of the data between gtm-act and two or more gtm-standbys kept? (I think that the consistency of data among GTMs is a necessary condition for HA-GTM.)

Of course, if only one GTM-standby runs, the problem is easy.

Regards,
From: Koichi S. <koi...@gm...> - 2011-02-23 08:05:41

Hi,

----------
Koichi Suzuki

2011/2/23 Suzuki Hironobu <hir...@in...>:
> Thank you, and this is the final question, maybe.
>
>>>> Now it is under development (too early to publish) and is similar
>>>> to streaming replication. GTM-ACT sends its updates (each
>
> At least version 9.0's SR is asynchronous.
> By the way,

Yes. There is a plan to make it synchronous and we are waiting for it. If it doesn't come early, we may make a local extension.

>> Data transmission is synchronous. GTM-Standby has threads which
>> correspond to each GTM-ACT thread. Because GTM-ACT threads
>> correspond to GTM-Proxy worker threads, GTM-ACT basically just copies
>> messages from GTM-Proxy to GTM-Standby, and GTM-Standby can recreate
>> the transaction status.
>
> I heard that Postgres-XC does not use a reliable communication protocol.
> Then, under very critical conditions when gtm-act (or other components)
> crashes, I think that there is a possibility that the gtm-standby's
> binary does not correspond.
> (This is a critical timing issue.)

We're assuring every message has been reached by receiving responses, except for very few cases. One of them is reporting a failure to xcwatcher. Here, because the failure will be reported from another source sooner or later, we don't care whether each report reaches xcwatcher. In very critical cases, xcwatcher will find no connection to the monitoring agent, or the monitoring agent will detect its local component failure. When we use UDP, we always have backups and we limit this use so that it does not affect database integrity within the cluster.

Communication among GTM, GTM-Proxy, GTM-Standby, Coordinators and Datanodes/Mirrors is reliable.

> For example:
> (1) gtm-standby1 receives a message from gtm-act,
> (2) gtm-act crashes!
> (3) gtm-standby2 never receives it.
> This is a typical case, and there are many similar cases.
> Therefore, I think that the data consistency among GTMs (gtm-act and
> gtm-standbys) is not guaranteed.
> Is that true? Or are there any mechanisms to avoid this problem?
>
>> In the case of intermittent failure, typically in the network, we can
>> expect many things.
>>
>> Some transactions may fail but others may be successful. I think
>> 2PC will maintain database integrity across the whole database. I
>> believe this is what we should enforce. One thing we should be careful
>> about, for example, is the case where different coordinators observe
>> different (intermittent) failures for different datanode mirrors. We
>> have to be careful to keep the "primary" datanode mirror consistent to
>> maintain data integrity between mirrors.
>
> I wish you success.

Thank you;
---
Koichi Suzuki
From: Suzuki H. <hir...@in...> - 2011-02-23 06:24:47

Thank you, and this is the final question, maybe.

>>> Now it is under development (too early to publish) and is similar
>>> to streaming replication. GTM-ACT sends its updates (each

At least version 9.0's SR is asynchronous.

By the way,

> Data transmission is synchronous. GTM-Standby has threads which
> correspond to each GTM-ACT thread. Because GTM-ACT threads
> correspond to GTM-Proxy worker threads, GTM-ACT basically just copies
> messages from GTM-Proxy to GTM-Standby, and GTM-Standby can recreate
> the transaction status.

I heard that Postgres-XC does not use a reliable communication protocol. Then, under very critical conditions when gtm-act (or other components) crashes, I think that there is a possibility that the gtm-standby's binary does not correspond. (This is a critical timing issue.)

For example:
(1) gtm-standby1 receives a message from gtm-act,
(2) gtm-act crashes!
(3) gtm-standby2 never receives it.
This is a typical case, and there are many similar cases. Therefore, I think that the data consistency among GTMs (gtm-act and gtm-standbys) is not guaranteed. Is that true? Or are there any mechanisms to avoid this problem?

> In the case of intermittent failure, typically in the network, we can
> expect many things.
>
> Some transactions may fail but others may be successful. I think
> 2PC will maintain database integrity across the whole database. I
> believe this is what we should enforce. One thing we should be careful
> about, for example, is the case where different coordinators observe
> different (intermittent) failures for different datanode mirrors. We
> have to be careful to keep the "primary" datanode mirror consistent to
> maintain data integrity between mirrors.

I wish you success.
From: Michael P. <mic...@gm...> - 2011-02-23 04:07:41

On Wed, Feb 23, 2011 at 12:59 PM, xiong wang <wan...@gm...> wrote:
> Dears,
> There's an error when I drop a database.
>
> postgres=# create database test;
> CREATE DATABASE
> postgres=# drop database test;
> ERROR:  Clean connections not completed
>
> Regards,
> Benny

I am able to reproduce that. This error seems to occur only when you create and drop a database while connected to the database "postgres".

-- test
You are now connected to database "template1".
template1=# create database dbt1;
CREATE DATABASE
template1=# drop database dbt1;
DROP DATABASE
template1=# create database dbt1;
CREATE DATABASE
template1=# drop database dbt1;
DROP DATABASE

--
Michael Paquier
http://michaelpq.users.sourceforge.net
From: xiong w. <wan...@gm...> - 2011-02-23 03:59:35

Dears,

There's an error when I drop a database.

postgres=# create database test;
CREATE DATABASE
postgres=# drop database test;
ERROR:  Clean connections not completed

Regards,
Benny
From: Koichi S. <koi...@gm...> - 2011-02-22 08:04:10

Hi,

Please find my response inline...

----------
Koichi Suzuki

2011/2/22 Suzuki Hironobu <hir...@in...>:
> Thanks for your quick response.
>
>>> I'd like to know more details.
>>> (1) How is the data replicated between gtm and the gtm-standby(s)?
>>
>> It is now under development (too early to publish) and is similar
>> to streaming replication. GTM-ACT sends its updates (each
>> transaction status change, typically) to the SBY. GTM-SBY can connect
>> to GTM-ACT at any time. GTM-ACT then accepts the GTM-SBY connection
>> and begins to ship each transaction status update.
>
> It will show good performance.
>
> Is data transmission among GTMs synchronous or asynchronous?
> And if it is asynchronous,
> is there a mechanism that synchronizes the gtm-standbys
> when the master GTM crashes?

Data transmission is synchronous. GTM-Standby has threads which correspond to each GTM-ACT thread. Because GTM-ACT threads correspond to GTM-Proxy worker threads, GTM-ACT basically just copies messages from GTM-Proxy to GTM-Standby, and GTM-Standby can recreate the transaction status. In fact, GTM-Standby's binary will be the same as GTM-ACT's.

So far, we don't back up GTM-SBY status to stable storage. When GTM-SBY crashes, we can get another GTM-SBY connected to GTM-ACT. Cascaded GTM-SBY and multiple GTM-SBYs could be options for the future.

>>> (2) What failures do you assume? Only crash failures? Or more?
>>> And what kind of failure detector does xcwatcher have?
>>> Theoretically, is it eventually strong? Eventually perfect?
>>
>> Hardware crashes and software crashes. Xcwatcher is a bit traditional:
>> network communication and process monitoring. The difference is that
>> any Postgres-XC component
>> (GTM/GTM-SBY/GTM-PXY/Coordinator/Datanode/Mirror) can report the
>> failure of other components it communicates with to xcwatcher through
>> XCM. Xcwatcher distributes this update (not only failures, but also
>> start/stop, raise to ACT, etc.) to all the servers. On the other
>> hand, XCM evaluates the failure and advises xcwatcher what to do, as
>> written in the document.
>
> I understand.
> I wanted to know how you think about failures,
> because the words "timeout" and/or "omission" are not found in the
> document.

Component (not server hardware) failure is detected by issuing a "monitoring" command and checking the response. Postgres-XC components are allowed to report other components' failures to xcwatcher through the XCM module. Hardware failure detection is still primitive; it is based on response timeouts. As I wrote, we may want to combine this with other hardware monitoring provided by general-purpose HA middleware.

>>> (3) As a possibility, I think that the Postgres-XC components could
>>> divide into several partitions due to network failure, etc. Is that
>>> correct?
>>
>> When xcwatcher fails to monitor servers or components, it treats them
>> as failed and tries to stop them just in case. When sufficient
>> components are not recognized by xcwatcher, it will stop the whole
>> cluster to enforce data integrity among datanodes/mirrors.
>>
>> All these actions can be monitored through the xcwatcher log by
>> external tools.
>>
>> When xcwatcher itself fails, Postgres-XC can continue to run for a
>> while. Operators can restart xcwatcher, even on a different server.
>> In this case, xcwatcher collects the current cluster status through
>> XCM (from all the servers involved) to rebuild the global cluster
>> status.
>>
>> I think they can be combined with the hardware monitoring capability
>> provided by many HA middleware products.
>
> I was misunderstanding it a little.
> I thought that we could construct a dependable system with XCM only.
>
> I think the setup of HA middleware can become difficult when it has to
> handle difficult situations (for example, intermittent network
> failures).
> However, it might be easy when assuming (perfect) crash failures only.

In the case of intermittent failure, typically in the network, we can expect many things.

Some transactions may fail but others may be successful. I think 2PC will maintain database integrity across the whole database. I believe this is what we should enforce. One thing we should be careful about, for example, is the case where different coordinators observe different (intermittent) failures for different datanode mirrors. We have to be careful to keep the "primary" datanode mirror consistent to maintain data integrity between mirrors.

> I am looking forward to trying XCM.

It's very simple. Enjoy.
From: Suzuki H. <hir...@in...> - 2011-02-22 05:35:19

Thanks for your quick response.

>> I'd like to know more details.
>> (1) How is the data replicated between gtm and the gtm-standby(s)?
>
> It is now under development (too early to publish) and is similar
> to streaming replication. GTM-ACT sends its updates (each
> transaction status change, typically) to the SBY. GTM-SBY can connect
> to GTM-ACT at any time. GTM-ACT then accepts the GTM-SBY connection
> and begins to ship each transaction status update.

It will show good performance.

Is data transmission among GTMs synchronous or asynchronous? And if it is asynchronous, is there a mechanism that synchronizes the gtm-standbys when the master GTM crashes?

>> (2) What failures do you assume? Only crash failures? Or more?
>> And what kind of failure detector does xcwatcher have?
>> Theoretically, is it eventually strong? Eventually perfect?
>
> Hardware crashes and software crashes. Xcwatcher is a bit traditional:
> network communication and process monitoring. The difference is that
> any Postgres-XC component
> (GTM/GTM-SBY/GTM-PXY/Coordinator/Datanode/Mirror) can report the
> failure of other components it communicates with to xcwatcher through
> XCM. Xcwatcher distributes this update (not only failures, but also
> start/stop, raise to ACT, etc.) to all the servers. On the other
> hand, XCM evaluates the failure and advises xcwatcher what to do, as
> written in the document.

I understand. I wanted to know how you think about failures, because the words "timeout" and/or "omission" are not found in the document.

>> (3) As a possibility, I think that the Postgres-XC components could
>> divide into several partitions due to network failure, etc. Is that
>> correct?
>
> When xcwatcher fails to monitor servers or components, it treats them
> as failed and tries to stop them just in case. When sufficient
> components are not recognized by xcwatcher, it will stop the whole
> cluster to enforce data integrity among datanodes/mirrors.
>
> All these actions can be monitored through the xcwatcher log by
> external tools.
>
> When xcwatcher itself fails, Postgres-XC can continue to run for a
> while. Operators can restart xcwatcher, even on a different server.
> In this case, xcwatcher collects the current cluster status through
> XCM (from all the servers involved) to rebuild the global cluster
> status.
>
> I think they can be combined with the hardware monitoring capability
> provided by many HA middleware products.

I was misunderstanding it a little. I thought that we could construct a dependable system with XCM only.

I think the setup of HA middleware can become difficult when it has to handle difficult situations (for example, intermittent network failures). However, it might be easy when assuming (perfect) crash failures only.

I am looking forward to trying XCM.
From: Michael P. <mic...@gm...> - 2011-02-22 04:07:23

Hi,

Here is a little bit of feedback about the rule crash. I fixed an issue I found with rules myself this morning.

So based on that I ran a couple of tests with your patch.

1) Case "do instead nothing": works well
dbt1=# create table aa (a int, b int);
CREATE TABLE
dbt1=# create table bb (a int, b int) distribute by replication;
CREATE TABLE
dbt1=# create rule aa_ins as on insert to aa do instead nothing;
CREATE RULE
dbt1=# insert into aa values (1,2),(2,3);
INSERT 0 0
dbt1=# select * from bb;
 a | b
---+---
(0 rows)
dbt1=# select * from aa;
 a | b
---+---
(0 rows)
This case works well.

2) With an insert rule: "do also"
dbt1=# create table aa (a int, b int);
CREATE TABLE
dbt1=# create table bb (a int, b int) distribute by replication;
CREATE TABLE
dbt1=# create rule bb_ins as on insert to aa do also insert into bb values (new.a,new.b);
CREATE RULE
dbt1=# insert into aa values (1,2),(2,3);
dbt1=# execute direct on node 1 'select * from aa';
 a | b
---+---
 1 | 2
 1 | 2
 2 | 3
 1 | 2
 2 | 3
(5 rows)
dbt1=# execute direct on node 2 'select * from aa';
 a | b
---+---
 2 | 3
 1 | 2
 2 | 3
 1 | 2
 2 | 3
(5 rows)

It looks like the query is not run on the right table. In RewriteInsertStmt, only one piece of locator information is used when rewriting the query: only the locator information of the table the rule is applied to is taken into account.

For example, in my case queries are not rewritten for table bb but only for table aa. It may be possible to also take into account the table bb defined in the rules when building the lists of values.

If the others have any ideas about how this could be done smoothly, all ideas are welcome.

I think you should modify RewriteInsertStmt to also take into account the rules that have been fired on this query. I suppose this information is visible in the parse tree, as it works well for a single INSERT value.

I attach a modified version of the patch you sent. It does exactly the same thing as your first version.

Regards,
--
Michael Paquier
http://michaelpq.users.sourceforge.net
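A rough self-contained sketch of the direction suggested above (hypothetical structs, not the real RewriteInsertStmt): after rule expansion, each resulting INSERT is split per node according to the locator information of its own target table, rather than reusing the locator of the original table for every product query.

#include <stdio.h>

#define NUM_NODES 2

/* Hypothetical, heavily simplified locator information. */
typedef enum { LOC_HASH, LOC_REPLICATED } LocatorType;

typedef struct {
    const char *relname;
    LocatorType loctype;
} Locator;

typedef struct {
    int a;
    int b;
} Row;

/* Distribute one row of an INSERT according to *that* table's locator. */
static void route_row(const Locator *loc, Row row)
{
    if (loc->loctype == LOC_REPLICATED)
    {
        for (int n = 0; n < NUM_NODES; n++)
            printf("node %d: insert into %s values (%d,%d)\n",
                   n + 1, loc->relname, row.a, row.b);
    }
    else
    {
        int n = row.a % NUM_NODES;  /* hash on the distribution column */

        printf("node %d: insert into %s values (%d,%d)\n",
               n + 1, loc->relname, row.a, row.b);
    }
}

int main(void)
{
    /* aa is hash-distributed, bb (the rule's target) is replicated. */
    Locator aa = {"aa", LOC_HASH};
    Locator bb = {"bb", LOC_REPLICATED};
    Row values[] = { {1, 2}, {2, 3} };

    /* The rewriter produces one INSERT per table; each uses its own locator. */
    for (int i = 0; i < 2; i++)
    {
        route_row(&aa, values[i]);
        route_row(&bb, values[i]);
    }
    return 0;
}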
From: Koichi S. <ko...@in...> - 2011-02-22 00:24:28

Thanks for the quick response.

(2011年02月22日 06:50), Suzuki Hironobu wrote:
> Hi,
>
>> The XCM module has been added to Postgres-XC (ha_support branch so far).
>> I added the following file to the SourceForge development web site:
>>
>> XCM_Module_Document_20110221.pdf
>
> I have finished reading this document now.
> I think that XCM is a great idea.
>
>> Misc is created to store temporary materials intended to be part of
>> further releases.
>
> I'd like to know more details.
> (1) How is the data replicated between gtm and the gtm-standby(s)?

It is now under development (too early to publish) and is similar to streaming replication. GTM-ACT sends its updates (each transaction status change, typically) to the SBY. GTM-SBY can connect to GTM-ACT at any time. GTM-ACT then accepts the GTM-SBY connection and begins to ship each transaction status update.

> (2) What failures do you assume? Only crash failures? Or more?
> And what kind of failure detector does xcwatcher have?
> Theoretically, is it eventually strong? Eventually perfect?

Hardware crashes and software crashes. Xcwatcher is a bit traditional: network communication and process monitoring. The difference is that any Postgres-XC component (GTM/GTM-SBY/GTM-PXY/Coordinator/Datanode/Mirror) can report the failure of other components it communicates with to xcwatcher through XCM. Xcwatcher distributes this update (not only failures, but also start/stop, raise to ACT, etc.) to all the servers. On the other hand, XCM evaluates the failure and advises xcwatcher what to do, as written in the document.

> (3) As a possibility, I think that the Postgres-XC components could
> divide into several partitions due to network failure, etc. Is that
> correct?

When xcwatcher fails to monitor servers or components, it treats them as failed and tries to stop them just in case. When sufficient components are not recognized by xcwatcher, it will stop the whole cluster to enforce data integrity among datanodes/mirrors.

All these actions can be monitored through the xcwatcher log by external tools.

When xcwatcher itself fails, Postgres-XC can continue to run for a while. Operators can restart xcwatcher, even on a different server. In this case, xcwatcher collects the current cluster status through XCM (from all the servers involved) to rebuild the global cluster status.

I think they can be combined with the hardware monitoring capability provided by many HA middleware products.

Thank you very much for your interest and involvement.
---
Koichi

> I cannot go to the conference on Friday.
> Please teach me if there is time.
>
> Regards,