From: Koichi S. <koi...@gm...> - 2014-05-25 18:04:42
I see. You have a good use case for read-only transactions. Because of the nature of log shipping and sharding/clustering, it is not simple to provide read-only transactions in XC, for two essential reasons:

1. The delay in WAL playback may differ from slave to slave, which makes providing a consistent database view extremely difficult.

2. At present, a slave calculates transaction snapshots from the WAL, and the current code does not allow missing XIDs: if the WAL stream contains many missing XIDs, there will be a memory leak and eventually a crash from OOM. In XC this snapshot calculation is therefore disabled, and the database view on a slave may be inconsistent. Please note that this does not affect recovery and promotion.

Read-only scalability is obviously a candidate for our TODO list. Based on the discussion at the PGCon cluster summit, we will open up our roadmap discussion this week and ask for input on features, performance, and quality on our general/developer mailing lists. I hope you can post your use case and requirements to that discussion.

Best Regards;
---
Koichi Suzuki

2014-05-25 5:23 GMT-07:00 ZhangJulian <jul...@ou...>:
> Hi Koichi,
>
> Thanks for the explanation.
> [...]
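The per-slave replay delay in point 1 can be observed on a stock PostgreSQL 9.x hot standby with the built-in replication functions. A minimal illustration of the measurement (not XC-specific code; the reported lag is approximate):

    -- On each hot-standby slave: approximate WAL replay lag.
    SELECT now() - pg_last_xact_replay_timestamp() AS replay_lag;

    -- Received vs. replayed WAL positions (9.x function names).
    SELECT pg_last_xlog_receive_location() AS received,
           pg_last_xlog_replay_location()  AS replayed;

Comparing these values across slaves shows exactly the per-slave variation that makes a consistent cluster-wide read view hard to assemble.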
From: ZhangJulian <jul...@ou...> - 2014-05-25 12:23:59
Hi Koichi,

Thanks for the explanation.

We have a system which has some OLTP applications and some REPORT applications, and the REPORT side can bear some inconsistency. We do not want the REPORT system to affect the stability of the OLTP system, so read/write separation is applicable in this scenario.

From your advice, I feel I should limit the use case to a smaller scenario: for example, even when the GUC is enabled, only SELECT statements under autocommit=true would be routed to the slaves.

From the other mail thread, the community has planned some other approaches to achieve a similar goal. Because our team does not have much development experience, we plan to train ourselves on this task even if it is not adopted by the community.

We may ask you for help if we have questions. Thanks in advance!

Thanks
Julian

> Date: Fri, 23 May 2014 08:52:28 -0400
> Subject: Re: [Postgres-xc-general] Do you think the new feature is
> meaningful? - Read/Write Separation
> From: koi...@gm...
> To: jul...@ou...
> CC: pos...@li...
> [...]
From: Koichi S. <koi...@gm...> - 2014-05-24 23:03:06
2014-05-24 17:10 GMT-04:00 Josh Berkus <jo...@ag...>:
> Koichi,
>
>> 1. To allow async.: when a node fails, fall back the whole cluster to the
>> latest consistent state, such as one pointed to by a barrier. I can
>> provide some detailed thoughts on this if there is interest.
>
> This is not interesting to me. If I have to accept major data loss for
> a single node failure, then I can use solutions which do not require a GTM.
>
>> 2. Allow a copy of each shard on another node, at the planner/executor level.
>
> Yes. This should be at the executor level, in my opinion. All writes
> go to all shards and do not complete until they all succeed or the shard
> times out (and then is marked disabled).
>
> What to do with reads is more nuanced. If we load-balance reads, then
> we are increasing throughput of the cluster. If we send each read to
> all duplicate shards, then we are improving response times while
> decreasing throughput. I think that deserves some testing.

The planner also needs more work here, to choose the path where pushdown works best. And to handle conflicting writes arriving at different coordinators, we may need to define a node priority that decides where a write goes first.

>> 3. Implement another replication scheme better suited to XC, using BDR
>> just for distributed tables, for example.
>
> This has the same problems as solution #1.

We can implement better synchronization suited to XC's needs. Also, only the shards can be replicated, to reduce the overhead. I think this has better potential than streaming replication.

Regards;
---
Koichi Suzuki

>> At present, XC uses a hash value of the node name to determine each row's
>> location for distributed tables. For ideas 2 and 3, we need to add
>> some infrastructure to make this allocation more flexible.
>
> Yes. We would need a shard ID which is separate from the node name.
>
> --
> Josh Berkus
> PostgreSQL Experts Inc.
> http://pgexperts.com
From: Josh B. <jo...@ag...> - 2014-05-24 21:10:47
Koichi,

> 1. To allow async.: when a node fails, fall back the whole cluster to the
> latest consistent state, such as one pointed to by a barrier. I can
> provide some detailed thoughts on this if there is interest.

This is not interesting to me. If I have to accept major data loss for a single node failure, then I can use solutions which do not require a GTM.

> 2. Allow a copy of each shard on another node, at the planner/executor level.

Yes. This should be at the executor level, in my opinion. All writes go to all shards and do not complete until they all succeed or the shard times out (and then is marked disabled).

What to do with reads is more nuanced. If we load-balance reads, then we are increasing throughput of the cluster. If we send each read to all duplicate shards, then we are improving response times while decreasing throughput. I think that deserves some testing.

> 3. Implement another replication scheme better suited to XC, using BDR
> just for distributed tables, for example.

This has the same problems as solution #1.

> At present, XC uses a hash value of the node name to determine each row's
> location for distributed tables. For ideas 2 and 3, we need to add
> some infrastructure to make this allocation more flexible.

Yes. We would need a shard ID which is separate from the node name.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
From: Koichi S. <koi...@gm...> - 2014-05-24 20:04:56
At present, XC advises making replicas with synchronous replication; pgxc_ctl configures slaves this way. I understand that this is not for performance, and we may need some other solution here. To begin with, here are a couple of ideas:

1. To allow async.: when a node fails, fall back the whole cluster to the latest consistent state, such as one pointed to by a barrier. I can provide some detailed thoughts on this if there is interest.

2. Allow a copy of each shard on another node, at the planner/executor level.

3. Implement another replication scheme better suited to XC, using BDR just for distributed tables, for example.

At present, XC uses a hash value of the node name to determine each row's location for distributed tables. For ideas 2 and 3, we need to add some infrastructure to make this allocation more flexible.

Further input is welcome.

Thank you.
---
Koichi Suzuki

2014-05-24 14:53 GMT-04:00 Josh Berkus <jo...@ag...>:
> All:
>
> So, in addition to the stability issues raised at the PostgresXC summit,
> I need to raise something which is a deficiency of both XC and XL [...]
From: Josh B. <jo...@ag...> - 2014-05-24 18:53:13
All:

So, in addition to the stability issues raised at the PostgresXC summit, I need to raise something which is a deficiency of both XC and XL and should be (in my opinion) our #2 priority after stability. And that's node/shard redundancy.

Right now, if a single node fails, the cluster is frozen for writes ... and fails some reads ... until the node is replaced by the user from a replica. It's also not clear that we *can* actually replace a node from a replica, because the replica will be async rep, and thus not at exactly the same GXID as the rest of the cluster. This makes XC a low-availability solution.

The answer for this is to do the same thing which every other clustering system has done: write each shard to multiple locations. Default would be two. If each shard is present on two different nodes, then losing a node is just a performance problem, not a downtime event.

Thoughts?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
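As a rough sketch of what this proposal would mean at the DDL level: XC's existing syntax pins each row of a hash-distributed table to exactly one datanode, and shard redundancy amounts to a second copy per shard. The REDUNDANCY clause below is hypothetical syntax for illustration, not anything XC implements:

    -- Existing XC behavior: each row lands on exactly one of the listed nodes.
    CREATE TABLE detail (id int, payload text)
        DISTRIBUTE BY HASH (id) TO NODE (dn1, dn2, dn3);

    -- Hypothetical extension sketching the proposal: keep every shard on two
    -- nodes; a write completes only when both copies succeed, or when one
    -- copy times out and is marked disabled.
    CREATE TABLE detail (id int, payload text)
        DISTRIBUTE BY HASH (id) TO NODE (dn1, dn2, dn3) REDUNDANCY 2;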
From: Koichi S. <koi...@gm...> - 2014-05-24 16:09:36
Sorry for the late response. What version are you using? 1.2.1 includes several fixes for GTM connectivity.
---
Koichi Suzuki

2014-05-22 12:28 GMT-04:00 Aaron Jackson <aja...@re...>:
> Given my past experience with compiler issues, I'm a little hesitant to even
> report this. [...]
From: Koichi S. <koi...@gm...> - 2014-05-23 12:52:35
Hello;

Find my reply inline.

Thank you;
---
Koichi Suzuki

2014-05-22 23:49 GMT-04:00 ZhangJulian <jul...@ou...>:
> Hi Koichi,
>
> Thanks for your comments!
>
> 1. The pgxc_node issue.
> I feel the pgxc_node catalog in the data nodes has no use currently, right?
> In the current codebase, if a coordinator slave or a data node slave is
> promoted, an ALTER NODE statement must be executed in all the coordinators,
> since the pgxc_node table is a local table in each node.
> Assuming the feature is applied, the ALTER NODE/CREATE NODE syntax will also
> be updated to update the master and slaves together. Once a coordinator
> slave or a data node slave is promoted, the information in the other
> coordinators and the promoted coordinator could be updated as in the
> previous behavior.

I understand your goal, and it sounds attractive to have such master-slave info inside the database. Maybe we need a better idea, one which survives slave promotion.

> 2. The data between the master and the slave may not be consistent at all
> times.
> This is a common issue on PostgreSQL and other non-cluster database
> platforms. There are many users who use a master-slave infrastructure to
> achieve read/write separation. If users turn the feature on, they should
> know the risk.

The use case should be limited. The transaction has to be read-only. We cannot route on a statement-by-statement basis, and even with transaction-level routing we may suffer from such inconsistency. I'm afraid this may not be widely understood. Given this, synchronizing WAL playback across slaves is an essential issue for providing read transactions on slaves. This was discussed at the cluster summit at PGCon this Tuesday.

> 3. The GXID issue.
> It is too complex for me; I cannot understand it thoroughly. :) But if users
> can bear the data being inconsistent for a short time, it will not be an
> issue, right?

The GXID issue is about providing "atomic visibility" among read and write distributed transactions. It is quite new and may need separate material to understand. Let me prepare a write-up describing why it is needed and what issues it solves.

This kind of thing is essential to provide a consistent database view in the cluster.

Please allow me a bit of time to provide background information on this.

> Thanks
> Julian
>
>> Date: Thu, 22 May 2014 09:21:28 -0400
>> Subject: Re: [Postgres-xc-general] Do you think the new feature is
>> meaningful? - Read/Write Separation
>> From: koi...@gm...
>> To: jul...@ou...
>> CC: pos...@li...
>> [...]
From: ZhangJulian <jul...@ou...> - 2014-05-23 03:49:11
Hi Koichi,

Thanks for your comments!

1. The pgxc_node issue.
I feel the pgxc_node catalog in the data nodes has no use currently, right?
In the current codebase, if a coordinator slave or a data node slave is promoted, an ALTER NODE statement must be executed in all the coordinators, since the pgxc_node table is a local table in each node.
Assuming the feature is applied, the ALTER NODE/CREATE NODE syntax will also be updated to update the master and slaves together. Once a coordinator slave or a data node slave is promoted, the information in the other coordinators and the promoted coordinator could be updated as in the previous behavior.

2. The data between the master and the slave may not be consistent at all times.
This is a common issue on PostgreSQL and other non-cluster database platforms. There are many users who use a master-slave infrastructure to achieve read/write separation. If users turn the feature on, they should know the risk.

3. The GXID issue.
It is too complex for me; I cannot understand it thoroughly. :) But if users can bear the data being inconsistent for a short time, it will not be an issue, right?

Thanks
Julian

> Date: Thu, 22 May 2014 09:21:28 -0400
> Subject: Re: [Postgres-xc-general] Do you think the new feature is
> meaningful? - Read/Write Separation
> From: koi...@gm...
> To: jul...@ou...
> CC: pos...@li...
> [...]
From: Aaron J. <aja...@re...> - 2014-05-22 16:28:27
Given my past experience with compiler issues, I'm a little hesitant to even report this. That said, I have a three-node cluster, each with a coordinator, data node and gtm proxy. I have a standalone gtm instance without a slave. Often, when I come in after the servers have been up for a while, I'm greeted with a variety of issues.

There are several warnings in the coordinator and data node logs that read "Do not have a GTM snapshot available" - I've discarded these as mostly benign for the moment.

The coordinator is much worse..

30770 | 2014-05-22 15:53:06 UTC | ERROR: current transaction is aborted, commands ignored until end of transaction block
30770 | 2014-05-22 15:53:06 UTC | STATEMENT: DISCARD ALL
4560 | 2014-05-22 15:54:30 UTC | LOG: failed to connect to Datanode
4560 | 2014-05-22 15:54:30 UTC | LOG: failed to connect to Datanode
4560 | 2014-05-22 15:54:30 UTC | WARNING: can not connect to node 16390
30808 | 2014-05-22 15:54:30 UTC | LOG: failed to acquire connections

Usually, I reset the coordinator and datanode and the world is happy again. However, it makes me somewhat concerned that I'm seeing these kinds of failures on a daily basis. I wouldn't rule out the compiler again, as it's been the reason for previous failures, but has anyone else seen anything like this?

Aaron
From: Koichi S. <koi...@gm...> - 2014-05-22 13:21:35
Hello;

Thanks a lot for the idea. Please find my comments inline.

I hope you consider them and move forward to make your goal more feasible.

Regards;
---
Koichi Suzuki

2014-05-22 4:19 GMT-04:00 ZhangJulian <jul...@ou...>:
> Hi All,
>
> I plan to implement it as the idea below.
> 1. Add a new GUC to the coordinator configuration, which controls whether
> the READ/WRITE separation feature is ON or OFF.
> 2. Extend the catalog table pgxc_node by adding new columns: slave1_host,
> slave1_port, slave1_id, slave2_host, slave2_port, slave2_id. Suppose at most
> two slaves are supported.

I don't think this is a good idea. If we have this info in the catalog, it will all go to the slaves by WAL shipping and will be used when a slave is promoted.

This information is not valid when the master is gone and one of the slaves is promoted.

> 3. A read-only transaction, or the leading read-only part of a transaction,
> will be routed to a slave node to execute.

With current WAL shipping, we have to expect some variation in when a transaction or statement update becomes visible on the slave. Even with synchronous replication, there is a slight delay after the WAL record is received before it is replayed and becomes available to the hot standby. There is even a chance that such an update is visible on the slave before it is visible at the master.

Therefore, use cases for current hot standby have to tolerate such differences. I don't think your example allows for these WAL-shipping replication characteristics.

Moreover, the current hot standby implementation assumes the slave will receive every XID in updates. It does not assume there could be missing XIDs, and this assumption is used to generate the snapshot that enforces update visibility.

In XC, because of the nature of GXIDs, some GXIDs may be missing at some slaves.

At present, because we didn't have sufficient resources, snapshot generation is disabled.

In addition to this, a local snapshot may not work. We need global XIDs (GXIDs) to get consistent results.

For such reasons, it is not simple to provide a consistent database view from slaves.

I discussed this at the PGCon cluster summit this Tuesday, and I'm afraid this needs much more analysis, research and design.

>
> For example,
> begin;
> select ....; ==> go to slave node
> select ....; ==> go to slave node
> insert ....; ==> go to master node
> select ....; ==> go to master node, since it may visit the row inserted by
> the previous insert statement
> end;
>
> By this, in a cluster:
> some coordinators can be configured to support the OLTP system, whose
> queries will be routed to the master data nodes;
> other coordinators can be configured to support the report system, whose
> queries will be routed to the slave data nodes;
> the different workloads are then applied to different coordinators and data
> nodes, so they can be isolated.
>
> Do you think it is valuable? Do you have any advice?
>
> Thanks
> Julian
From: ZhangJulian <jul...@ou...> - 2014-05-22 08:19:43
Hi All,

I plan to implement it as the idea below.
1. Add a new GUC to the coordinator configuration, which controls whether the READ/WRITE separation feature is ON or OFF.
2. Extend the catalog table pgxc_node by adding new columns: slave1_host, slave1_port, slave1_id, slave2_host, slave2_port, slave2_id. Suppose at most two slaves are supported.
3. A read-only transaction, or the leading read-only part of a transaction, will be routed to a slave node to execute.

For example,
begin;
select ....; ==> go to slave node
select ....; ==> go to slave node
insert ....; ==> go to master node
select ....; ==> go to master node, since it may visit the row inserted by the previous insert statement
end;

By this, in a cluster:
some coordinators can be configured to support the OLTP system, whose queries will be routed to the master data nodes;
other coordinators can be configured to support the report system, whose queries will be routed to the slave data nodes;
the different workloads are then applied to different coordinators and data nodes, so they can be isolated.

Do you think it is valuable? Do you have any advice?

Thanks
Julian
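For concreteness, point 2 of the proposal would look roughly like the sketch below. This is hypothetical DDL for illustration only: pgxc_node is a system catalog, these columns do not exist in XC, and, as the replies in this thread point out, keeping slave info in a WAL-shipped catalog is itself problematic:

    -- Hypothetical illustration of point 2 (not real XC DDL): extra columns
    -- on the pgxc_node catalog describing up to two slaves per node.
    ALTER TABLE pgxc_node
        ADD COLUMN slave1_host name,
        ADD COLUMN slave1_port integer,
        ADD COLUMN slave1_id   oid,
        ADD COLUMN slave2_host name,
        ADD COLUMN slave2_port integer,
        ADD COLUMN slave2_id   oid;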
From: Ashutosh B. <ash...@en...> - 2014-05-21 05:00:38
Hi Aaron,

From the plan you have given, we can see that the INSERT is happening on the coordinator, inserting one row at a time. Although the INSERT statement is prepared on the datanode, each EXECUTE incurs the libpq and execution overheads on the datanode. What should ideally happen is that all the rows destined for the same datanode are staged in some sort of file and bulk-inserted (using the COPY protocol). But this is not implemented yet, because:
1. We do not have the resources to implement it.
2. We do not have global statistics at the coordinator to estimate how many rows the SELECT is going to return, and hence cannot decide whether to use one insert at a time (for a small number of rows) or bulk insert (for a large number of rows).

On Tue, Apr 29, 2014 at 10:08 PM, Aaron Jackson <aja...@re...> wrote:
> When I load data into my table "detail" with COPY, the table loads at a
> rate of about 56k rows per second. The data is distributed on a key to
> achieve this rate of insert (width is 678). However, when I do the
> following:
>
> INSERT INTO DETAIL SELECT 123 as Id, ... FROM DETAIL WHERE Id = 500;
>
> I see the write performance drop to only 2.5K rows per second. The
> total data set loaded from Id = 500 is 200k rows and takes about 7s to load
> into the data coordinator. So, I can attribute almost all of the time
> (about 80 seconds) directly to the insert.
>
> Insert on detail (cost=0.00..10.00 rows=1000 width=678) (actual time=79438.038..79438.038 rows=0 loops=1)
>   Node/s: node_pgs01_1, node_pgs01_2, node_pgs02_1, node_pgs02_2
>   Node expr: productid
>   -> Data Node Scan on detail "_REMOTE_TABLE_QUERY_" (cost=0.00..10.00 rows=1000 width=678) (actual time=3.917..2147.231 rows=200000 loops=1)
>        Node/s: node_pgs01_1, node_pgs01_2, node_pgs02_1, node_pgs02_2
>
> IMO, it seems like an insert like this should approach the performance
> of a COPY. Am I missing something or can you recommend a different
> approach?

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
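Given that explanation, one workaround is to route the bulk load through COPY yourself instead of INSERT ... SELECT. A rough sketch under stated assumptions: the two-column table is a hypothetical stand-in for the real "detail" table, server-side COPY to a file requires superuser, and whether XC's coordinator takes the fast path for both statements is worth verifying:

    -- Hypothetical stand-in for the real table.
    CREATE TABLE detail (id int, payload text) DISTRIBUTE BY HASH (id);

    -- Stage the SELECT result to a file, then bulk-load it with COPY, which
    -- uses the fast path that row-at-a-time INSERT ... SELECT misses.
    COPY (SELECT 123 AS id, payload FROM detail WHERE id = 500)
        TO '/tmp/detail_123.csv' CSV;
    COPY detail FROM '/tmp/detail_123.csv' CSV;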
From: Koichi S. <koi...@gm...> - 2014-05-21 01:53:31
Thank you Josh. This could be because of the transmission overhead. In XC, the file to copy is first transferred to each target datanode before being handled by the copy handler.

Regards;
---
Koichi Suzuki

2014-05-20 12:23 GMT-04:00 Josh Berkus <jo...@ag...>:
> On 04/29/2014 12:38 PM, Aaron Jackson wrote:
> [...]
From: Josh B. <jo...@ag...> - 2014-05-20 16:23:58
On 04/29/2014 12:38 PM, Aaron Jackson wrote:
> When I load data into my table "detail" with COPY, the table loads at a rate
> of about 56k rows per second. [...]

Well, COPY is much faster on vanilla Postgres, for a variety of optimization reasons. I don't see why PostgresXC would be different.

Admittedly, the 20X differential is higher than on single-node Postgres, so that seems worth investigating.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
From: Mason S. <ms...@tr...> - 2014-05-17 16:47:33
Hi Dorian,

On Sat, May 17, 2014 at 3:25 AM, Dorian Hoxha <dor...@gm...> wrote:
> Postgres-XL Released: Scale-out PostgreSQL Cluster
>
> http://www.postgresql.org/about/news/1523/

Yes, Koichi Suzuki asked me to explain more about Postgres-XL at the Postgres-XC meeting. I will put together some slides to highlight the enhancements and differences. I will also give a very brief update at the Clustering Summit.

> On Sat, May 17, 2014 at 12:33 AM, Josh Berkus <jo...@ag...> wrote:
>> [...]

--
Mason Sharp
TransLattice - http://www.translattice.com
Distributed and Clustered Database Solutions
From: Dorian H. <dor...@gm...> - 2014-05-17 07:26:21
Postgres-XL Released: Scale-out PostgreSQL Cluster

http://www.postgresql.org/about/news/1523/

On Sat, May 17, 2014 at 12:33 AM, Josh Berkus <jo...@ag...> wrote:
> All:
>
> The PostgresXC Meeting and the Clustering Summit at pgCon next week have
> been moved to University Center Room 205 [...]
From: Josh B. <jo...@ag...> - 2014-05-16 22:51:16
All:

The PostgresXC Meeting and the Clustering Summit at pgCon next week have been moved to University Center Room 205 in order to accommodate a larger-than-expected group.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
From: Aaron J. <aja...@re...> - 2014-05-16 14:24:35
Yes, all nodes were running fine. However, I didn't account for the fact that I rebuilt the server to add PAM support. When I did that, I reset autoconf and it built with my apparently braindead version of gcc-4.8. So, the issue was resolved once I rebuilt with gcc-4.7 and redistributed the proper binary.

Thanks, and sorry for the goose chase.

________________________________
From: Pavan Deolasee [pav...@gm...]
Sent: Monday, May 12, 2014 11:22 AM
To: Aaron Jackson
Cc: pos...@li...
Subject: Re: [Postgres-xc-general] Failed to get pooled connections - overnight
[...]
From: Pavan D. <pav...@gm...> - 2014-05-12 16:22:12
Sent from my iPhone

> On 12-May-2014, at 8:51 pm, Aaron Jackson <aja...@re...> wrote:
>
> This morning I came in and connected to my coordinator. I issued a query to
> count table A and this succeeded. I then asked it to count table B and it
> failed with "Failed to get pooled connections" [...]

Did you check if all the nodes are running fine?

Thanks,
Pavan
From: Aaron J. <aja...@re...> - 2014-05-12 15:21:53
|
This morning I came in and connected to my coordinator. I issued a query to count table A and this succeeded. I then asked it to count table B and it failed with "Failed to get pooled connections" - I did an explain on both tables and this is what it told me:

explain select count(*) from tableA;

 Aggregate (cost=2.50..2.51 rows=1 width=0)
   -> Data Node Scan on "__REMOTE_GROUP_QUERY__" (cost=0.00..0.00 rows=1000 width=0)
      Node/s: node_pgs01_1, node_pgs02_1, node_pgs03_1
(3 rows)

explain select count(*) from tableB;

 Aggregate (cost=2.50..2.51 rows=1 width=0)
   -> Data Node Scan on "__REMOTE_GROUP_QUERY__" (cost=0.00..0.00 rows=1000 width=0)
      Node/s: node_pgs01_1, node_pgs01_1, node_pgs02_1, node_pgs01_1, node_pgs03_1, node_pgs01_1
(3 rows)

I've seen this twice now, so I figured that maybe the pool needed to be reloaded... so I issued pgxc_pool_reload() but that did not help. Restarting the coordinator did not change anything, so it appears that the metadata for table B is bad? No nodes have been added or removed since this table was created.

Any thoughts?

Aaron |
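The repeated node_pgs01_1 entries in tableB's node list suggest the table's distribution metadata itself is worth inspecting; that list is driven by pgxc_class. A minimal sketch, assuming the XC 1.1-era catalog columns (the table names are the ones from this mail, lower-cased by identifier folding):

SELECT c.relname, pc.pclocatortype, pc.nodeoids
FROM pgxc_class pc
JOIN pg_class c ON c.oid = pc.pcrelid
WHERE c.relname IN ('tablea', 'tableb');
-- nodeoids lists the pgxc_node OIDs the table is distributed across; a
-- repeated OID here would match the repeated node in the EXPLAIN output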
From: Koichi S. <koi...@gm...> - 2014-05-07 14:51:46
|
In principle, yes. But in practice, no.
---
Koichi Suzuki

2014-05-07 21:00 GMT+09:00 Dorian Hoxha <dor...@gm...>:
> If it makes sense (I don't know the internals), for certain workloads, when
> only 1-node transactions are required, maybe the GTM could be eliminated? |
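The single-node-transaction case Dorian describes can at least be arranged in XC DDL by pinning a table to one datanode; a minimal sketch, assuming the XC 1.1+ TO NODE clause and a hypothetical node name:

CREATE TABLE orders (
    id      int,
    payload text
) DISTRIBUTE BY REPLICATION TO NODE (node_pgs01_1);
-- the whole table lives on a single datanode, so a transaction touching only
-- this table runs on one node; as Koichi notes, though, it still asks GTM for
-- its GXID and snapshot today, which is what "in practice, no" refers to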
From: 鈴木 幸市 <ko...@in...> - 2014-05-07 14:48:08
|
This is quite a new topic. Google says Spanner depends upon clock synchronization of 10ms precision worldwide. They're trying to improve the precision to 1ms, so I think it is not impossible to have this clock synchronization available. Anyway, it is quite new and I don't want to dig too deeply into it now. I'd like to concentrate on transaction consistency based upon transaction IDs.

Regards;
---
Koichi Suzuki

2014/05/07 22:18、Jonathan Yue <jy...@ya...> のメール:

> IMHO, synchronizing the clocks of all nodes is very hard and complicated. If any
> node has probability, p, of a bad clock, then the whole system has Np probability.
> Also, responsibility goes to the time servers and network connections.
>
> I am new to the list, sorry if my point is not pertinent.
>
> Jonathan |
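Before betting on clock-based visibility, it is easy to measure how far apart the node clocks actually are; a minimal sketch from a coordinator, assuming XC's EXECUTE DIRECT and hypothetical node names:

EXECUTE DIRECT ON (node_pgs01_1) 'SELECT clock_timestamp()';
EXECUTE DIRECT ON (node_pgs02_1) 'SELECT clock_timestamp()';
-- the spread between results, after subtracting round-trip time, gives a rough
-- bound on inter-node skew; a Spanner-style scheme has to wait out at least
-- this uncertainty before treating a commit timestamp as globally ordered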
From: Jonathan Y. <jy...@ya...> - 2014-05-07 13:19:04
|
IMHO, synchronizing the clocks of all nodes is very hard and complicated. If any node has probability, p, of a bad clock, then the whole system has roughly Np probability. Also, responsibility goes to the time servers and network connections.

I am new to the list, sorry if my point is not pertinent.

Jonathan

> On May 7, 2014, at 5:00 AM, Dorian Hoxha <dor...@gm...> wrote:
>
> If it makes sense (I don't know the internals), for certain workloads, when
> only 1-node transactions are required, maybe the GTM could be eliminated? |
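Jonathan's Np figure is the standard small-p approximation; spelling the step out (an editorial addition, not from the original mail):

P(at least one of N clocks is bad) = 1 - (1 - p)^N <= Np, with approximate equality when Np << 1.

For example, p = 0.001 and N = 50 gives 1 - 0.999^50 ~= 0.049, essentially Np = 0.05, so even modest clusters make per-node clock reliability a first-order concern.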
From: Dorian H. <dor...@gm...> - 2014-05-07 12:00:41
|
Currently there is no open-source/free alternative (InfiniSQL looks ~dead; CUBRID doesn't have multi-node statements and has middleware) that provides global transactions; most commercial offerings are in-memory (VoltDB, MemSQL, NuoDB).

Even NoSQL DBs that don't have transactions run many components (MongoDB, the hipster DB, has mongod, replica servers, config servers, mongos, balancer; likewise HBase/Hypertable). Only Couchbase gets by with fewer (no transactions; it is consistent, while Riak/Cassandra are eventually consistent).

Only TransLattice looks to be one node type. So having 2 components is actually pretty good.

If it makes sense (I don't know the internals), for certain workloads, when only 1-node transactions are required, maybe the GTM could be eliminated?

On Wed, May 7, 2014 at 10:21 AM, 鈴木 幸市 <ko...@in...> wrote:

> Please understand that GTM is not a performance bottleneck. It may need a
> dedicated network segment and a server, but GTM's load average is quite
> low. I understand this ends up with bad workload balance and this can be
> a concern for some people. Without GTM, each node may have to do much
> more calculation and exchange much more data to get a snapshot and to
> determine if a given row can be vacuumed/vacuum frozen. I don't know how
> effective this could be.
>
> Regards;
> ---
> Koichi Suzuki
>
> 2014/05/07 17:10、ZhangJulian <jul...@ou...> のメール:
>
> Of course it is attractive. The performance bottleneck, GTM, could be
> removed. And the cluster would be comprised of only one type of unified
> component. :)
>
> Thanks
> Julian
>
> ------------------------------
> From: ko...@in...
> To: jul...@ou...
> CC: koi...@gm...; dor...@gm...;
> pos...@li...
> Subject: Re: [Postgres-xc-general] Pgxc_ctl Primer draft
> Date: Wed, 7 May 2014 08:00:12 +0000
>
> Right. Maybe I can find how to calculate this without GTM and GXID.
> Anyway, I think we should keep track of the root XID and local XIDs. I'm now
> designing how to do this. Hope we can share the outcome soon.
> The algorithm could be complicated, but the cluster configuration may look
> significantly simpler.
>
> What do you think: is providing global MVCC without GTM/GXID attractive?
>
> Regards;
> ---
> Koichi Suzuki
>
> 2014/05/07 16:51、ZhangJulian <jul...@ou...> のメール:
>
> Oh, yes.
> The oldest GXID must be in the snapshot of the oldest alive GXID. So if we
> can know the oldest alive GXID, we can derive the oldest GXID which is still
> referred.
>
> ------------------------------
> From: ko...@in...
> To: jul...@ou...
> CC: koi...@gm...; dor...@gm...;
> pos...@li...
> Subject: Re: [Postgres-xc-general] Pgxc_ctl Primer draft
> Date: Wed, 7 May 2014 04:00:25 +0000
>
> Oldest alive GXID is not correct. We need the referred oldest GXID, which is
> the oldest GXID that appears in any snapshot being used. Please consider
> that in the case of a long, repeatable-read transaction, the lifetime of a
> snapshot can be very long.
>
> Regards;
> ---
> Koichi Suzuki
>
> 2014/05/07 12:25、ZhangJulian <jul...@ou...> のメール:
>
> I said 'time' as the clock value.
> You had considered more than I had known.
>
> For the VACUUM, as I understand it, if some data that could be vacuumed
> is not vacuumed in time, that is OK. So if we collect the oldest alive
> GXID, even if it is smaller than the current accurate value, it can still
> guide VACUUM. Am I right?
>
> Thanks
> Julian
>
> ------------------------------
> From: ko...@in...
> To: jul...@ou...
> CC: koi...@gm...; dor...@gm...;
> pos...@li...
> Subject: Re: [Postgres-xc-general] Pgxc_ctl Primer draft
> Date: Wed, 7 May 2014 02:40:43 +0000
>
> What do you mean by "time-based policy": is it based on (G)XID, or a real
> clock value? To my view, it depends upon what "time" we depend on.
>
> Yes, I'm now studying whether we can use a real "clock" value for this. In
> this case, we may not need GTM if the clock is accurate enough among the
> servers involved.
>
> If you mean not to use a global "snapshot", and if that is feasible, we may
> not need GTM. If we associate each local transaction with its "root"
> transaction, which is the transaction the application generated directly, we
> can maintain visibility by calculating the "snapshot" each time it is
> needed, by collecting it from all the other nodes.
>
> We need to consider "vacuum". I've not found a good way to determine
> if some "deleted" rows can be removed from the database and if some
> "live" row's xmin value can be frozen.
>
> Regards;
> ---
> Koichi Suzuki
>
> 2014/05/07 11:19、ZhangJulian <jul...@ou...> のメール:
>
> Is it possible to do the row visibility check based on a time-based
> policy? That is,
>
> 1. Each data node maintains a data structure: gtid - start time - end
> time. Only the gtids modifying data on the current data node are contained.
> 2. Each data node maintains the oldest alive gtid, which may not be
> updated synchronously.
> 3. GTM is only responsible for generating a sequence of GTIDs, each of
> which is only an integer value.
> 4. The time on different data nodes may not be consistent, but I think in
> some scenarios the application can bear the little difference.
>
> Are there any potential issues?
>
> Thanks
>
> > Date: Sun, 4 May 2014 19:36:20 +0900
> > From: koi...@gm...
> > To: dor...@gm...
> > CC: pos...@li...
> > Subject: Re: [Postgres-xc-general] Pgxc_ctl Primer draft
> >
> > As discussed at last year's XC-day, GTM proxy should be integrated
> > as a postmaster backend. Maybe GTM can be too. Coordinator/Datanode
> > can also be integrated into one.
> >
> > Apparently, this is the direction we should take. At first, there
> > was no such good experience to start with. Before version 1.0, we
> > determined that the datanode and the coordinator can share the same
> > binary. It is true that we started with the idea to provide
> > cluster-wide MVCC, and now we have found the next direction.
> >
> > With this integration, when we start with only one node, we don't need
> > GTM, which looks identical to standalone PG. When we add a server,
> > at present we do need GTM. Only accumulating local transactions in
> > the nodes cannot maintain cluster-wide database consistency.
> >
> > I'm still investigating an idea of how to get rid of GTM. We need to do
> > the following:
> >
> > 1) Provide cluster-wide MVCC,
> > 2) Provide a good means to determine which rows can be vacuumed.
> >
> > My current idea is: if we associate any local XID with the root
> > transaction (the transaction which the application created), we may be
> > able to provide cluster-wide MVCC by calculating a cluster-wide snapshot
> > when needed. I don't know how efficient it is, and I don't have a good
> > idea how to determine if a given row can be vacuumed.
> >
> > This is the current situation.
> >
> > Hope to have much more input on this.
> >
> > Anyway, hope my draft helps people who are trying to use Postgres-XC.
> >
> > Best;
> > ---
> > Koichi Suzuki
> >
> >
> > 2014-05-04 19:05 GMT+09:00 Dorian Hoxha <dor...@gm...>:
> > > Probably even the gtm-proxy needs to be merged with datanode+coordinator from
> > > what i read.
> > >
> > > If you make only local transactions (inside 1 datanode) + not using global
> > > sequences, will there be no traffic to the GTM for that transaction ?
> > >
> > >
> > > On Sun, May 4, 2014 at 6:24 AM, Michael Paquier <mic...@gm...>
> > > wrote:
> > >>
> > >> On Sun, May 4, 2014 at 12:59 AM, Dorian Hoxha <dor...@gm...>
> > >> wrote:
> > >> >> You just need commodity INTEL server runnign Linux.
> > >> > Are INTEL cpu required ? If not INTEL can be removed ? (also running
> > >> > typo)
> > >> Not really... I agree to what you mean here.
> > >>
> > >> >> For datawarehouse
> > >> >> applications, you may need separate patch which devides complexed query
> > >> >> into smaller
> > >> >> chunks which run in datanodes in parallel. StormDB will provide such
> > >> >> patche.
> > >> >
> > >> > Wasn't stormdb bought by another company ? Is there an opensource
> > >> > alternative ? Fix the "patche" typo ?
> > >> >
> > >> > A way to make it simpler is by merging coordinator and datanode into 1
> > >> > and
> > >> > making it possible for a 'node' to not hold data (be a coordinator
> > >> > only),
> > >> > like in elastic-search, but you probably already know that.
> > >> +1. This would alleviate data transfer in cross-node joins where
> > >> Coordinator and Datanodes are on separate servers. You could always
> > >> have both nodes on the same server with the XC of now... But that's
> > >> double the number of nodes to monitor.
> > >>
> > >> > What exact things does the gtm-proxy do? For example, a single row
> > >> > insert
> > >> > wouldn't need the gtm (coordinator just inserts it to the right
> > >> > data-node) (assuming no sequences, since for that the gtm is needed)?
> > >> Grouping messages between Coordinator/Datanode and GTM to reduce
> > >> packet interference and improve performance.
> > >>
> > >> > If multiple tables are sharded on the same key (example: user_id). Will
> > >> > all
> > >> > the rows from the same user in different tables be in the same
> > >> > data-node ?
> > >> Yep. The node choice algorithm is based on the data type of the key.
> > >> --
> > >> Michael |
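Most of this thread comes down to every node agreeing on one MVCC snapshot. Stock PostgreSQL already exposes what a snapshot is, which makes the discussion concrete; txid_current_snapshot() is standard PostgreSQL, and in XC the GTM's job is essentially to hand every node the same three pieces:

SELECT txid_current_snapshot();
-- returns xmin:xmax:xip_list, e.g. 100:105:100,102
--   xmin: oldest transaction still in progress; everything below it is decided
--   xmax: first unassigned txid; everything at or above it is invisible
--   xip_list: in-progress txids in between
-- a row version is visible iff its creating transaction committed and is not
-- in this set; the "referred oldest GXID" Koichi insists on is the smallest
-- xmin across every snapshot still held anywhere in the cluster, and vacuum
-- may only reclaim row versions already dead to that bound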