From: Ashutosh B. <ash...@en...> - 2014-05-29 03:53:17
On Thu, May 29, 2014 at 6:41 AM, 鈴木 幸市 <ko...@in...> wrote:

> On 2014/05/29 1:27, Josh Berkus <jo...@ag...> wrote:
> >
> > On 05/27/2014 07:33 PM, Koichi Suzuki wrote:
> >> We can reduce the additional latency by performing prepare and commit
> >> in parallel; I mean, sending the command to all the target remote nodes
> >> first and then receiving their responses afterwards.
> >>
> >> As I supposed, an alternative is to use BDR, which has very small
> >> overhead. We can detect conflicting writes among transactions, and if
> >> the transactions do not conflict, we can apply their writes in
> >> parallel, not in a single thread as we do in streaming replication.
> >
> > Mind you, like BDR itself, we'll still need to figure out how to handle
> > DDL.
>
> Thank you for the info. I believe this issue is shared with other use
> cases such as Slony. Is there any other discussion of how to handle this?
>
> >> This needs some more work, and I think it is worth spending some time on.
> >
> > Yes. Otherwise we have two unpalatable choices:
> >
> > - Massive data loss (roll back to barrier) every time we lose a node, or
> > - Doubling write latency (at least)
>
> In the case of statement-based redundancy, we need to determine which node
> to write to first (at least), and this choice should be the same on all the
> nodes; it is needed to handle conflicting writes consistently. This means
> the first write has to be done synchronously and the rest can be done
> asynchronously. Because most of the I/O work is done at prepare/commit, I
> hope this does not impact the overall throughput or latency badly.

I am doubtful as to whether the eventual consistency scheme will work in
XC. If we commit a change to two nodes and do not wait for the third node,
and the third node is not able to apply the change, we will have an
inconsistency on the third node. Depending upon the nature of the failure
to write to that node, this inconsistency may not be repairable.
If the two nodes fail, the data that we retrieve from the third node could
be inconsistent, so there is probably no point in keeping the third copy.
That means the cluster can tolerate N failures, where N is the number of
synchronous copies made.

> Regards;
> ---
> Koichi Suzuki
>
> > --
> > Josh Berkus
> > PostgreSQL Experts Inc.
> > http://pgexperts.com
> >
> > _______________________________________________
> > Postgres-xc-general mailing list
> > Pos...@li...
> > https://lists.sourceforge.net/lists/listinfo/postgres-xc-general

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
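[Editor's note: the parallel prepare/commit idea Koichi describes at the top of the thread can be sketched roughly as below. This is an illustrative sketch only, not XC's actual coordinator code; the `prepare` stub, the node names, and the transaction GID are hypothetical stand-ins for sending `PREPARE TRANSACTION` to remote datanodes.]

```python
# Sketch of "parallel prepare": instead of sending PREPARE TRANSACTION to
# each datanode and waiting for its reply before contacting the next one,
# the coordinator sends to all nodes first and then collects the responses.
from concurrent.futures import ThreadPoolExecutor

def prepare(node, gid):
    # Stand-in for sending "PREPARE TRANSACTION '<gid>'" to a remote node
    # and reading its reply; here we simply simulate a successful vote.
    return (node, True)

def parallel_prepare(nodes, gid):
    # Phase 1 of two-phase commit, issued concurrently: total latency is
    # roughly one round trip to the slowest node, not the sum of all the
    # round trips, which is the latency reduction discussed in the thread.
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        votes = dict(pool.map(lambda n: prepare(n, gid), nodes))
    return all(votes.values()), votes

ok, votes = parallel_prepare(["datanode1", "datanode2", "datanode3"], "txn_42")
# Phase 2: commit only if every node voted yes; otherwise roll back.
print("COMMIT PREPARED" if ok else "ROLLBACK PREPARED")
```

The same fan-out/gather shape applies to the commit phase, so both 2PC round trips can be bounded by the slowest node rather than the node count.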