From: Amit K. <ami...@en...> - 2014-02-18 18:50:46
On 13 February 2014 11:54, Ashutosh Bapat <ash...@en...> wrote:

> One more solution would be to use cursors for replicated tables. The idea
> is to open cursors on all the copies of the table and append to the query
> an ORDER BY clause on all the columns. Thus we are sure that the current
> position of each of these cursors points to the same row on all the copies.
> While fetching a row from a replicated table, we fetch from all the cursors
> and choose only one row for the data processing. While updating or deleting
> we send UPDATE or DELETE with WHERE CURRENT OF. The downside of this
> approach is that, if there are coordinator quals, we will end up locking
> more rows than necessary, increasing the probability of deadlock, but at
> least there won't be a mandatory restriction of having a primary or unique
> key and we won't break backward compatibility.
>
> If there are two identical rows, we might mix the updates from different
> nodes, but then who knew which of them corresponded across the nodes to
> start with.

Locking all rows doesn't look good, especially because we are looking for a permanent long-term solution. If we can come up with some other solution that avoids this, we had better avoid this compromise. For a replicated table with 10000 rows, all concurrent updates will be serialized even if they are updating different rows.

The other thing is the datanode performance impact of ORDER BY on all columns, especially with many large columns. I had also mentioned ORDER BY in approach A. I am not sure whether there is some kind of optimization in the sort, such as: if it finds the rows are unique on the first n columns, it does not compare the rest of the columns.

I think declaring cursors is a cool idea in general for DMLs, but it requires refactoring of DML planning, and it also requires ORDER BY. There is a concurrent update issue, #398, for which we do require a refactor of DML handling. While doing that it will become clearer whether declaring a cursor is really beneficial or whether it's not feasible.

For ORDER BY, again, for the long term, we should have a primary key or an internal unique key so that rows can be ordered on that single column as against all columns. So again, we are still better off with a new system column.

As regards approach C, if we find a way to generate a new unique row id independently, then the task of generating the rowid will be pretty lightweight. We won't require any other table to store it or generate it. The coordinator will generate it at each insert (both FQS and non-FQS), or maybe the datanodes themselves can find a way to generate a new rowid which is always the same regardless of the datanode. A combination of gxid, timestamp and cmd id can be used to construct a unique rowid at the coordinator.

I think one action plan can be:

1. Use Mason's patch and tweak it so that it needs very little modification later on if and when we add the system rowid column.
2. Check in the patch but let it not error out if the primary key is not there. This way we would at least make the replicated tables with primary keys work without data issues, but continue to work as it is now for tables without a primary key.
3. Lastly, support the new system row id implementation, and do an incremental change in Mason's checked-in changes to use this id instead of the primary key.

> On Thu, Feb 13, 2014 at 9:45 AM, Koichi Suzuki <koi...@gm...> wrote:
>
>> Hi,
>>
>> I tested the patch and found that primary key is mandatory. We need
>> to modify regression test considerably to give each replicated table
>> primary keys.
>>
>> I think this patch helps but I'm not afraid this is good, especially
>> when we try to take XC features back to PG.
>>
>> Did you post another patch to use all column values if primary key is
>> not available?
>>
>> I think better way is as follows:
>>
>> 1) If primary key is defined, use it,
>> 2) If not, create a primary key as system column, the size should be
>> 64bit.
>> 3) If primary key is added to a replicated table, remove system primary
>> key.
>>
>> The value of primary key can be obtained as follows:
>>
>> 1) add new column to pgxc_class catalog to represent maximum value of
>> the system primary key,
>> 2) when first "insert" is done to the primary node, system primary key
>> value is taken from 1) and 1) is updated. The value is returned to
>> the coordinator to be propagated to other nodes.
>> 3) when subsequent "insert" is being done, system primary key value is
>> added to the column value. In this case, each datanode updates 1)
>> column value if it is larger than the current maximum value.
>>
>> 3) is important to change primary node to another. This is needed to
>> carry over the primary node to another.
>>
>> ALTER TABLE should take care of them.
>>
>> Other issues are:
>>
>> 4) pg_dump/pg_dumpall should not include this system column value,
>> 5) cluster may need to handle this too to repack system primary key
>> value (not now but at least in 1.3 or later).
>>
>> Regards;
>> ---
>> Koichi Suzuki
>>
>> 2013-11-02 9:26 GMT+09:00 Mason Sharp <ms...@tr...>:
>> > Please see attached patch that tries to address the issue of XC using
>> > CTID for replicated updates and deletes when it is evaluated at a
>> > coordinator instead of being pushed down.
>> >
>> > The problem here is that CTID could be referring to a different tuple
>> > altogether on a different data node, which is what happened for one of
>> > our Postgres-XC support customers, leading to data issues.
>> >
>> > Instead, the patch looks for a primary key or unique index (with the
>> > primary key preferred) and uses those values instead of CTID.
>> >
>> > The patch could be improved further. Extra parameters are set even if
>> > not used in the execution of the prepared statement sent down to the
>> > data nodes.
>> >
>> > Regards,
>> >
>> > --
>> > Mason Sharp
>> >
>> > TransLattice - http://www.translattice.com
>> > Distributed and Clustered Database Solutions
>
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company
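The CTID problem described in Mason's message can be reproduced in miniature. The sketch below is only an illustration and not Postgres-XC code: it assumes two in-memory SQLite databases as stand-ins for two datanodes, and uses SQLite's `rowid` as a stand-in for CTID (both are physical row identifiers that depend on how each copy was loaded). The table `t`, its columns `k` and `val`, and the helper names are made up for this example.

```python
import sqlite3


def make_node(rows):
    """Create an in-memory 'datanode' holding one copy of the replicated table."""
    conn = sqlite3.connect(":memory:")
    # k is deliberately NOT declared as INTEGER PRIMARY KEY, so that the
    # physical rowid is assigned by insertion order and is independent of k.
    conn.execute("CREATE TABLE t (k INTEGER, val TEXT)")
    conn.executemany("INSERT INTO t (k, val) VALUES (?, ?)", rows)
    return conn


def update_by_physical_id(nodes, key, new_val):
    """Pick the physical id on the first node, then reuse it on every node."""
    (rid,) = nodes[0].execute(
        "SELECT rowid FROM t WHERE k = ?", (key,)
    ).fetchone()
    for node in nodes:
        node.execute("UPDATE t SET val = ? WHERE rowid = ?", (new_val, rid))


def update_by_key(nodes, key, new_val):
    """Identify the row by its logical key on every node."""
    for node in nodes:
        node.execute("UPDATE t SET val = ? WHERE k = ?", (new_val, key))


def contents(node):
    return sorted(node.execute("SELECT k, val FROM t").fetchall())


rows = [(1, "a"), (2, "b"), (3, "c")]

# Same logical rows, loaded in a different physical order on the second node.
node_a, node_b = make_node(rows), make_node([rows[2], rows[0], rows[1]])
update_by_physical_id([node_a, node_b], 2, "X")
print("physical id:", contents(node_a), contents(node_b))

node_a, node_b = make_node(rows), make_node([rows[2], rows[0], rows[1]])
update_by_key([node_a, node_b], 2, "X")
print("logical key:", contents(node_a), contents(node_b))
```

In this model, the update keyed on the physical identifier changes a different logical row on the second node, while the update keyed on `k` leaves both copies identical. That is the motivation for using a primary key, a unique index, or a system row id instead of CTID.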
From: Masataka S. <pg...@gm...> - 2014-02-17 08:49:07
My mailer may have malfunctioned. Please ignore the second mail.
From: Masataka S. <pg...@gm...> - 2014-02-17 08:24:12
Hi,

I got a report that the error message "GTM error, could not obtain snapshot" is still produced. I looked deeper into the proxy and found another bug and some miscellaneous errors.

Below is a summary of them; the patches are attached. All suggestions are welcome.

01_fix_proxy_connid.patch:
A proxy assigns a connection ID to each connection from a backend. The connection ID is sent to the GTM when the backend requires a new transaction ID, when the backend disconnects, and so on. If a backend disconnects, the GTM releases the transactions related to that backend. On the other hand, a proxy renumbers connection IDs to fill a gap in the connection array. But the GTM is not notified of this, and it causes the release of a transaction that is not related to the disconnected backend.
The patch makes the connection ID static by separating the connection ID from the index of the connection array.

01_fix_proxy_ereport_again.patch:
I fixed this issue before (when the GTM proxy received an error response to a command, it reported the error using a connection unrelated to the command), but that patch had a mistake. This patch fixes it the right way.

01_fix_proxy_redundancy.patch:
The GTM proxy has some redundant code. This patch eliminates it.

Regards.
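The connection ID bug addressed by 01_fix_proxy_connid.patch can be illustrated with a toy model. This is a sketch in Python, not the GTM proxy's C code; the class names `Gtm` and `Proxy`, the `stable_ids` switch and the backend names are hypothetical. With `stable_ids=False` the connection ID is the slot index and slots are compacted on disconnect, as the message describes; with `stable_ids=True` the ID is assigned once and kept separate from the array index.

```python
import itertools


class Gtm:
    """Tracks which connection ID opened each transaction."""

    def __init__(self):
        self._next_xid = itertools.count(100)
        self.open_txns = {}  # xid -> connection ID that opened it

    def begin(self, conn_id):
        xid = next(self._next_xid)
        self.open_txns[xid] = conn_id
        return xid

    def backend_disconnected(self, conn_id):
        # Release every transaction recorded under this connection ID.
        for xid in [x for x, c in self.open_txns.items() if c == conn_id]:
            del self.open_txns[xid]


class Proxy:
    def __init__(self, gtm, stable_ids):
        self.gtm = gtm
        self.stable_ids = stable_ids
        self._ids = itertools.count()
        self.slots = []  # connection array: (backend name, assigned ID)

    def connect(self, backend):
        self.slots.append((backend, next(self._ids)))

    def _index(self, backend):
        return [name for name, _ in self.slots].index(backend)

    def conn_id(self, backend):
        idx = self._index(backend)
        # Buggy mode: the ID *is* the array index and changes on compaction.
        return self.slots[idx][1] if self.stable_ids else idx

    def begin(self, backend):
        return self.gtm.begin(self.conn_id(backend))

    def disconnect(self, backend):
        idx = self._index(backend)
        self.gtm.backend_disconnected(self.conn_id(backend))
        last = self.slots.pop()
        if idx < len(self.slots):
            self.slots[idx] = last  # fill the gap; the GTM is never told


def c_transaction_survives(stable_ids):
    gtm = Gtm()
    proxy = Proxy(gtm, stable_ids)
    for name in ("A", "B", "C"):
        proxy.connect(name)
    xid_c = proxy.begin("C")
    proxy.disconnect("A")  # C is moved into A's slot
    proxy.connect("D")     # D lands in the slot C used to occupy
    proxy.disconnect("D")  # the GTM releases whatever it holds for D's ID
    return xid_c in gtm.open_txns


print("index as ID :", c_transaction_survives(stable_ids=False))
print("stable ID   :", c_transaction_survives(stable_ids=True))
```

In the index-based mode, backend D inherits the ID under which the GTM recorded C's transaction, so D's disconnect releases a transaction that belongs to C. Keeping the ID independent of the slot position avoids that.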
From: 鈴木 幸市 <ko...@in...> - 2014-02-17 07:13:59
1.1 is not affected by this. The bug was introduced during the planner code change to deal with the PG planner code change (mainly from automatically updatable views).

Regards;
---
Koichi Suzuki

On 2014/02/15 3:54, David E. Wheeler <da...@ju...> wrote:

> On Feb 13, 2014, at 8:19 PM, 鈴木 幸市 <ko...@in...> wrote:
>
>> Not really although there will be no feature addition in 1.1. I will keep maintaining 1.1 and 1.0 for a while. As I wrote in the note, 1.2 needs at least one improvement of update/delete in replicated tables. It will be good to begin 1.2 work after GA unless you’d like to practice on it.
>
> So, does that mean that there might be a 1.1.1 release with bug fixes? Because this feels like a bug to me.
>
> Thanks,
>
> David
From: 鈴木 幸市 <ko...@in...> - 2014-02-17 07:11:06
On 2014/02/14 18:35, Andrei Martsinchyk <and...@gm...> wrote:

> 2014-02-14 9:04 GMT+02:00 Masataka Saito <pg...@gm...>:
>
>> Thank you for your clever suggestion.
>>
>>> - Make Cancel more selective and affect only specific query. That means an ID for each query to introduce, that should be known to client and way to deliver it.
>>> - Introduce procedure of changing backend key. Old cancel won't affect such backend.
>>
>> I prefer the 2nd idea. But these ideas seem to require touching the libpq infrastructure and, if I understand correctly, it is used not only for inter-node communication but also for communication between a coordinator and a frontend. Unless we can separate them, I think it is better not to change it.
>
> XC is already extending the PG client-server protocol and uses the extension in inter-node communication. The suggested feature does not have to be available to external clients and therefore does not need to be supported by libpq.
>
>>> - Before starting new query, check if there is pending cancel and remove it. It sounds ridiculous "cancel cancel" but may work, if queries and cancels are issued synchronously from single source.
>>
>> I'm afraid the hypothesis is wrong. As I suggested first, the cancel and the subsequent request are not serialized at the target node. It means that even if the query started with no pending cancel, it could be interrupted by the cancel request.
>
> I am not sure how exactly a Cancel request is handled. If the server creates a session and sends back an acknowledgement before PQcancel returns, it is synchronous enough. The node sends the next command after PQcancel returns, so the respective session has either already placed the interrupt request or can be found in the Proc array. Either can be cleaned. If the Cancel is not synchronous enough, OK - just another bad idea, ignore it.

Unfortunately, it does not happen. So far, cancel is not synchronous. It could take effect after the backend receives the next statement. This is what Masataka's patch is improving. The wait duration, at present 10 milliseconds, could be a new GUC parameter. At least, this looks to work fine with our buildfarm.

Regards;
---
Koichi Suzuki

>> Regards.
>>
>> On 14 February 2014 14:06, Andrei Martsinchyk <and...@gm...> wrote:
>>>
>>> You are right, the temp objects are a problem.
>>> On the one hand, if we run a long query and there was an error on one node, we want to cancel it on the others to avoid unnecessary waiting. On the other hand, the query may be near its natural end and the cancel may be late and hit the next query.
>>> Just throwing out ideas:
>>> - Make Cancel more selective and affect only specific query. That means an ID for each query to introduce, that should be known to client and way to deliver it.
>>> - Introduce procedure of changing backend key. Old cancel won't affect such backend.
>>> - Before starting new query, check if there is pending cancel and remove it. It sounds ridiculous "cancel cancel" but may work, if queries and cancels are issued synchronously from single source.
>>>
>>> On 14.02.2014 4:07, "Koichi Suzuki" <koi...@gm...> wrote:
>>>
>>>> I misunderstood the implication. Anyway, the additional wait is separate
>>>> from your suggestion.
>>>>
>>>> Disconnecting the connection as you suggested will bring another
>>>> problem, such as TEMPORARY objects in the subsequent queries. We do
>>>> not support TEMPORARY objects but I believe we should be consistent on
>>>> this for future releases.
>>>>
>>>> Thoughts?
>>>> ---
>>>> Koichi Suzuki
>>>>
>>>> 2014-02-14 2:30 GMT+09:00 Andrei Martsinchyk <and...@gm...>:
>>>>> Hello,
>>>>>
>>>>> Postgres establishes a separate connection to deliver the Cancel command
>>>>> to the target session.
>>>>> On a heavily loaded node it may take fairly long. A longer sleep would
>>>>> help out, but it means longer recovery after an error.
>>>>> A better solution is to remove the canceled connection from the pool and
>>>>> therefore not use it to handle subsequent queries.
>>>>>
>>>>> 2014-02-13 11:10 GMT+02:00 Koichi Suzuki <koi...@gm...>:
>>>>>>
>>>>>> I think it hits the point. I tested this patch several times and it
>>>>>> seems to work fine. The delay time (at present 10ms) is short enough
>>>>>> and it is applied only when we need to cancel a statement.
>>>>>>
>>>>>> We should check this into all the master and STABLE branches, replacing
>>>>>> the magic number with some meaningful name.
>>>>>>
>>>>>> Any thoughts?
>>>>>> ---
>>>>>> Koichi Suzuki
>>>>>>
>>>>>> 2014-01-24 18:25 GMT+09:00 Masataka Saito <pg...@gm...>:
>>>>>>> Hello,
>>>>>>>
>>>>>>> As I've been exasperated by random failures, I'm willing to whip the
>>>>>>> cause of the issue.
>>>>>>>
>>>>>>> This issue is related to the cancel of a failed query.
>>>>>>> When a datanode reports an error of a query, a coordinator sends a
>>>>>>> cancel request to non-idle nodes, waits for the nodes to get ready and
>>>>>>> requests the nodes to roll back the transaction.
>>>>>>>
>>>>>>> Where's the problem? Consider the next case.
>>>>>>> 1. Datanode A (PID 1) reports an error to coordinator A. ([1] 'E' message)
>>>>>>> 2. Coordinator A receives [1] and reports an error to a frontend. ([2] 'E' message)
>>>>>>> 3. Coordinator A starts the aborting process and it thinks datanode A (PID 1) is not idle.
>>>>>>> 4. Coordinator A sends a cancel request about PID 1 to datanode A (PID 2). ([3] cancel message)
>>>>>>> 5. Datanode A (PID 1) reports ready to coordinator A. ([4] 'Z' message)
>>>>>>> 6. Coordinator A receives [4] and sends "ROLLBACK TRANSACTION" immediately. ([5] 'Q' message)
>>>>>>> 7. Datanode A (PID 1) receives [5] and starts processing the query.
>>>>>>> 8. Datanode A (PID 2) receives [3].
>>>>>>> 9. Datanode A (PID 2) notifies PID 1 of [3].
>>>>>>> 10. Datanode A (PID 1) cancels processing [5] and reports an error to coordinator A. ([6] 'E' message)
>>>>>>> 11. Coordinator A receives [6] and reports an error to a frontend. ([7] 'E' message)
>>>>>>>
>>>>>>> [7] makes unexpected output and a test fails.
>>>>>>>
>>>>>>> To take an extreme case, it could even happen that the query following
>>>>>>> [5] is cancelled by [3].
>>>>>>>
>>>>>>> As far as I know, there's no way to know when the cancel request gets
>>>>>>> processed, so I think we cannot avoid waiting an empirical duration
>>>>>>> after cancelling, as in the attached patch.
>>>>>>>
>>>>>>> Does anyone have another cool idea to solve this issue?
>>>>>>>
>>>>>>> Regards.
>>>>>
>>>>> --
>>>>> Andrei Martsinchyk
>>>>>
>>>>> StormDB - http://www.stormdb.com
>>>>> The Database Cloud
>
> --
> Andrei Martsinchyk
>
> StormDB - http://www.stormdb.com
> The Database Cloud
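The race in Masataka's numbered sequence, and the effect of the wait discussed in this thread, can be replayed with a small deterministic timeline. This is a simplified sketch, not Postgres-XC code: the function name, the millisecond constants and the rule that a cancel reaching an idle backend is simply dropped are all modelling assumptions made for illustration.

```python
ROLLBACK_MS = 8  # assumed time the backend needs to run ROLLBACK
READY_AT_MS = 1  # assumed time at which the backend reports ReadyForQuery


def rollback_outcome(grace_ms, cancel_delay_ms):
    """Replay one abort sequence and report what happens to ROLLBACK.

    The coordinator sends the cancel at t=0 over a separate connection; it
    reaches the backend at t=cancel_delay_ms. After ReadyForQuery the
    coordinator waits grace_ms and then sends ROLLBACK.
    """
    start = READY_AT_MS + grace_ms
    events = sorted([
        (cancel_delay_ms, "cancel"),
        (start, "start"),
        (start + ROLLBACK_MS, "finish"),
    ])
    running = False
    outcome = "completed"
    for _time, kind in events:
        if kind == "start":
            running = True
        elif kind == "finish":
            running = False
        elif kind == "cancel" and running:
            # The late cancel hits the wrong statement (steps 8 to 10).
            outcome = "cancelled"
            running = False
        # A cancel that arrives while the backend is idle is dropped
        # (modelling assumption).
    return outcome


for grace_ms, cancel_delay_ms in [(0, 5), (10, 5), (10, 15)]:
    print(
        f"grace={grace_ms}ms cancel_delay={cancel_delay_ms}ms ->",
        rollback_outcome(grace_ms, cancel_delay_ms),
    )
```

Under these assumptions, a 10 ms wait protects ROLLBACK when the cancel is delivered within 5 ms, but not when delivery takes 15 ms on a loaded node. This matches the concern raised in the thread that a fixed wait is a heuristic rather than a guarantee.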
From: David E. W. <da...@ju...> - 2014-02-14 18:54:21
On Feb 13, 2014, at 8:19 PM, 鈴木 幸市 <ko...@in...> wrote:

> Not really although there will be no feature addition in 1.1. I will keep maintaining 1.1 and 1.0 for a while. As I wrote in the note, 1.2 needs at least one improvement of update/delete in replicated tables. It will be good to begin 1.2 work after GA unless you’d like to practice on it.

So, does that mean that there might be a 1.1.1 release with bug fixes? Because this feels like a bug to me.

Thanks,

David
From: Abbas B. <abb...@en...> - 2014-02-14 12:02:23
On Fri, Feb 14, 2014 at 1:36 PM, Koichi Suzuki <koi...@gm...> wrote:

> I understood the situation. So there are three areas we need to fix.
>
> 1) Trigger,

I studied the following emails and associated bug reports:

(a) Mason's email, subject "[postgres-xc:bugs] #454 Update triggers on replicated tables may corrupt data"
(b) Amit's email, subject "[Postgres-xc-core] UPDATE queries on replicated tables"
(c) Amit's bug report: #402 Issue with updates on replicated tables

and concluded that although this bug was observed when Amit was working on triggers, and Mason reported that the data corruption was observed with update triggers defined on replicated tables, it actually has nothing to do with triggers. It's a general problem in updates/deletes to replicated tables. What exact problem with triggers do you have in mind that comes under this bug?

> 2) FQS-able updates/deletes.

Nothing needs to be done to handle this case. This case is already working fine and would not be impacted by whatever we do to fix the non-FQS case.

> I'm afraid we've not concluded how to solve these two yet.
>
> 3) Non-FQS cursor may work. It will be nice if 1) and 2) solution
> can be shared here too.
>
> Abbas, could you summarize these three cases?
>
> Regards;
> ---
> Koichi Suzuki
>
> 2014-02-14 16:02 GMT+09:00 Ashutosh Bapat <ash...@en...>:
> >
> > On Fri, Feb 14, 2014 at 12:22 PM, Abbas Butt <abb...@en...> wrote:
> >>
> >> The case that we are trying to solve here is when the user statement is
> >> not ship-able and has to be evaluated at the coordinator. If it was
> >> ship-able, there is no problem in that case, it would get shipped like
> >> you are suggesting. However if it is not ship-able, then we will have a
> >> step in the query plan to first select the row to be updated and then a
> >> step to update that row (update being a two step process). What Ashutosh
> >> is suggesting is to have a cursor with an ORDER BY for all rows and the
> >> quals that the user query had, and then update the row using WCO.
> >> The reason for using cursors is to base the update on where the cursor
> >> currently points to, rather than the ctid of the row, which could be
> >> different on the datanodes.
> >
> > Very well explained. Thank you. Hope that clears the doubts.
> >
> >> On Fri, Feb 14, 2014 at 11:22 AM, Koichi Suzuki <koi...@gm...> wrote:
> >>>
> >>> If we can use the same ORDER BY clause, I don't understand why we need
> >>> cursor. We can just ship statements.
> >>> ---
> >>> Koichi Suzuki
> >>>
> >>> 2014-02-14 15:20 GMT+09:00 Abbas Butt <abb...@en...>:
> >>> >
> >>> > On Fri, Feb 14, 2014 at 10:58 AM, Koichi Suzuki <koi...@gm...> wrote:
> >>> >>
> >>> >> 2014-02-14 14:55 GMT+09:00 Abbas Butt <abb...@en...>:
> >>> >> >
> >>> >> > On Fri, Feb 14, 2014 at 10:48 AM, Ashutosh Bapat <ash...@en...> wrote:
> >>> >> >>
> >>> >> >> On Fri, Feb 14, 2014 at 7:25 AM, Abbas Butt <abb...@en...> wrote:
> >>> >> >>>
> >>> >> >>> On Thu, Feb 13, 2014 at 11:24 AM, Ashutosh Bapat <ash...@en...> wrote:
> >>> >> >>>>
> >>> >> >>>> One more solution would be to use cursors for replicated tables.
> >>> >> >>>> The idea is to open cursors on all the copies of the table and
> >>> >> >>>> append to the query an ORDER BY clause on all the columns. Thus we
> >>> >> >>>> are sure that the current position of each of these cursors points
> >>> >> >>>> to the same row on all the copies. While fetching a row from a
> >>> >> >>>> replicated table, we fetch from all the cursors and choose only
> >>> >> >>>> one row for the data processing. While updating or deleting we
> >>> >> >>>> send UPDATE or DELETE with WHERE CURRENT OF. The downside of this
> >>> >> >>>> approach is that, if there are coordinator quals, we will end up
> >>> >> >>>> locking more rows than necessary, increasing the probability of
> >>> >> >>>> deadlock, but at least there won't be a mandatory restriction of
> >>> >> >>>> having a primary or unique key and we won't break backward
> >>> >> >>>> compatibility.
> >>> >> >>>>
> >>> >> >>>> If there are two identical rows, we might mix the updates from
> >>> >> >>>> different nodes, but then who knew which of them corresponded
> >>> >> >>>> across the nodes to start with.
> >>> >> >>>
> >>> >> >>> Thanks for the suggestion but we currently do not support WCO and
> >>> >> >>> we were thinking of fixing this issue before we declare 1.2 beta
> >>> >> >>> generally available.
> >>> >> >>
> >>> >> >> Abbas, WCO doesn't work from the coordinator, but there is no
> >>> >> >> reason why it shouldn't work at the datanode. So internally between
> >>> >> >> coordinator and the datanode, we can always use WCO.
> >>> >> >
> >>> >> > True, thanks for the clarification.
> >>> >>
> >>> >> Again, there is no guarantee that all cursors for a replicated table
> >>> >> return rows in the same order. It is as dangerous as ctid.
> >>> >
> >>> > Could you please explain a little further, how would a query that has
> >>> > all table columns in the ORDER BY clause return rows in different order
> >>> > when run on the datanodes?
--
Abbas
Architect

Ph: 92.334.5100153
Skype ID: gabbasb
www.enterprisedb.com <http://www.enterprisedb.com/>

Follow us on Twitter
@EnterpriseDB

Visit EnterpriseDB for tutorials, webinars, whitepapers and more <http://www.enterprisedb.com/resources-community> |
From: Andrei M. <and...@gm...> - 2014-02-14 09:35:26
|
2014-02-14 9:04 GMT+02:00 Masataka Saito <pg...@gm...>:

> Thank you for your clever suggestion.
>
> > - Make Cancel more selective and affect only specific query. That means
> > an ID for each query to introduce, that should be known to client and way
> > to deliver it.
> > - Introduce procedure of changing backend key. Old cancel won't affect
> > such backend.
>
> I prefer the 2nd idea. But these ideas seem to require touching libpq
> infrastructure and if I understand correctly, they are used not only
> the inter node communication but also a coordinator and a frontend
> communication. Unless we can separate them, I think better not to
> change it.

XC already extends the PG client-server protocol and uses the extension in inter-node communication. The suggested feature does not have to be available to external clients, so libpq does not need to support it.

> > - Before starting new query, check if there is pending cancel and remove
> > it. It sounds ridiculous "cancel cancel" but may work, if queries and
> > cancels are issued synchronously from single source.
>
> I'm afraid of the wrong hypothesis. As I suggested first, cancel and
> subsequent request are not serialized at the target node. It means
> that if the query started with no pending cancel, it could be
> interrupted by cancel request.

I am not sure exactly how the Cancel request is handled. If the server creates a session and sends back an acknowledgement before PQcancel returns, it is synchronous enough. The node sends the next command after PQcancel returns, so the respective session has either already placed the interrupt request or can be found in the Proc array. Either can be cleaned up. If the Cancel is not synchronous enough, OK, this is just another bad idea; ignore it.

> Regards.
>
> On 14 February 2014 14:06, Andrei Martsinchyk
> <and...@gm...> wrote:
> >
> > You are right, the temp objects are problem.
> > On the one hand if we run a long query and there was an error on one
> > node we want to cancel it on others to avoid unnecessary waiting. On the
> > other hand the query may be near its natural end and the cancel may be
> > late and hit the next query.
> > Just throwing out ideas:
> > - Make Cancel more selective and affect only specific query. That means
> > an ID for each query to introduce, that should be known to client and way
> > to deliver it.
> > - Introduce procedure of changing backend key. Old cancel won't affect
> > such backend.
> > - Before starting new query, check if there is pending cancel and remove
> > it. It sounds ridiculous "cancel cancel" but may work, if queries and
> > cancels are issued synchronously from single source.
> >
> > On 14.02.2014 at 4:07, "Koichi Suzuki" <koi...@gm...> wrote:
> >
> >> I misunderstand the implication. Anyway additional wait is separate
> >> from your suggestion.
> >>
> >> Disconnecting the connection as you suggested will bring another
> >> problem such as TEMPORARY object in the subsequent queries. We do
> >> not support TEMPORARY object but I believe we should be consistent on
> >> this for future releases.
> >>
> >> Thoughts?
> >> ---
> >> Koichi Suzuki
> >>
> >> 2014-02-14 2:30 GMT+09:00 Andrei Martsinchyk <and...@gm...>:
> >> > Hello,
> >> >
> >> > Postgres establishes separate connection to deliver Cancel command to the
> >> > target session.
> >> > On a heavily loaded node it may take fairly long. Longer sleep would help
> >> > out, but it means longer recovery after an error.
> >> > Better solution is to remove canceled connection from the pool and therefore
> >> > do not use it to handle subsequent queries.
> >> >
> >> > 2014-02-13 11:10 GMT+02:00 Koichi Suzuki <koi...@gm...>:
> >> >>
> >> >> I think it hits the point. I tested this patch several times and it
> >> >> seems to work fine. The delay time (at present 10ms) is short enough
> >> >> and it is applied only when we need to cancel a statement.
> >> >>
> >> >> We should check this into all the master and STABLE branches improving
> >> >> magic number with some meaningful name.
> >> >>
> >> >> Any thoughts?
> >> >> ---
> >> >> Koichi Suzuki
> >> >>
> >> >> 2014-01-24 18:25 GMT+09:00 Masataka Saito <pg...@gm...>:
> >> >> > Hello,
> >> >> >
> >> >> > As I've been exasperated by random failures, I'm willing to whip the
> >> >> > cause of the issue.
> >> >> >
> >> >> > This issue is related to cancel of the failed query.
> >> >> > When a datanode reports an error of a query, a coordinator sends a cancel
> >> >> > request to non-idle nodes, waits the node to get ready and requests nodes to
> >> >> > rollback the transaction.
> >> >> >
> >> >> > Where's the problem? Consider the next case.
> >> >> > 1. Datanode A (PID 1) reports an error to coordinator A. ([1] 'E' message)
> >> >> > 2. Coordinator A receives [1] and reports an error to a frontend. ([2] 'E' message)
> >> >> > 3. Coordinator A starts aborting process and it thinks datanode A (PID 1) is not idle.
> >> >> > 4. Coordinator A sends a cancel request about PID 1 to datanode A (PID 2). ([3] cancel message)
> >> >> > 5. Datanode A (PID 1) reports ready to coordinator A. ([4] 'Z' message)
> >> >> > 6. Coordinator A receives [4] and sends "ROLLBACK TRANSACTION" immediately. ([5] 'Q' message)
> >> >> > 7. Datanode A (PID 1) receives [5] and starts processing the query.
> >> >> > 8. Datanode A (PID 2) receives [3].
> >> >> > 9. Datanode A (PID 2) notify PID 1 of [3].
> >> >> > 10. Datanode A (PID 1) cancel processing [5] and reports an error to Coordinator A. ([6] 'E' message)
> >> >> > 11. Coordinator A receives [6] and reports an error to a frontend. ([7] 'E' message)
> >> >> >
> >> >> > [7] makes unexpected output and a test fails.
> >> >> >
> >> >> > Saying an extreme thing, it could occur that the next query of [5] is
> >> >> > cancelled by [3].
> >> >> >
> >> >> > As far as I know, there's no way to know when to the cancel request get to
> >> >> > be processed, I think we can't not wait an experimental duration after
> >> >> > cancelling like the attached patch.
> >> >> >
> >> >> > Does anyone have another cool idea to solve this issue?
> >> >> >
> >> >> > Regards.

--
Andrei Martsinchyk

StormDB - http://www.stormdb.com
The Database Cloud |
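The "change the backend key" idea discussed above could look roughly like the sketch below. It is only an illustration in Python, not Postgres-XC or libpq code: `Backend`, `rotate_key` and `handle_cancel` are hypothetical names invented here. The one thing taken from PostgreSQL itself is that a cancel request identifies its target by a process ID plus a secret key; the assumption added here is that the key can be replaced between statements.

```python
import secrets


class Backend:
    """Toy stand-in for a datanode session (hypothetical, for illustration)."""

    def __init__(self, pid):
        self.pid = pid
        self.cancel_key = secrets.randbits(32)
        self.interrupted = False

    def rotate_key(self):
        """Hypothetical 'change backend key' step; returns the new key."""
        old_key = self.cancel_key
        while self.cancel_key == old_key:
            self.cancel_key = secrets.randbits(32)
        return self.cancel_key

    def handle_cancel(self, pid, key):
        """Honor a cancel only if it carries the current pid and key."""
        if pid == self.pid and key == self.cancel_key:
            self.interrupted = True
            return True
        return False


backend = Backend(pid=1)
stale_key = backend.cancel_key      # key used by the cancel for the failed query
backend.rotate_key()                # done before the next command is sent
print(backend.handle_cancel(1, stale_key))           # late cancel is ignored
print(backend.handle_cancel(1, backend.cancel_key))  # a fresh cancel still works
```

With this scheme a cancel that arrives late is dropped instead of interrupting the next statement, at the cost of one extra exchange to distribute the new key.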
From: Koichi S. <koi...@gm...> - 2014-02-14 08:36:10
|
I understand the situation now. So there are three areas we need to fix:

1) Triggers,
2) FQS-able updates/deletes. I'm afraid we have not yet concluded how to solve these two.
3) Non-FQS updates/deletes, where a cursor may work. It would be nice if the solution for 1) and 2) could be shared here too.

Abbas, could you summarize these three cases?

Regards;
---
Koichi Suzuki

2014-02-14 16:02 GMT+09:00 Ashutosh Bapat <ash...@en...>:
>
> On Fri, Feb 14, 2014 at 12:22 PM, Abbas Butt <abb...@en...>
> wrote:
>>
>> The case that we are trying to solve here is when the user statement is
>> not ship-able and has to be evaluated at the coordinator. If it was
>> ship-able, there is no problem in that case, it would get shipped like you
>> are suggesting. However if it is not ship-able, then we will have a step in
>> the query plan to first select the row to be updated and then a step to
>> update that row (update being a two step process). What Ashutosh is
>> suggesting is to have a cursor with an order by for all rows and the quals
>> that the user query had, and then update the row using WCO.
>> The reason of using cursors is to base the update on where the cursor
>> currently points to, rather than the ctid of the row, which could be
>> different on the datanodes.
>>
>
> Very well explained. Thank you. Hope that clears the doubts.
>
>>
>> On Fri, Feb 14, 2014 at 11:22 AM, Koichi Suzuki <koi...@gm...>
>> wrote:
>>>
>>> If we can use the same ORDER BY clause, I don't understand why we need
>>> cursor. We can just ship statements.
>>> --- >>> Koichi Suzuki >>> >>> >>> 2014-02-14 15:20 GMT+09:00 Abbas Butt <abb...@en...>: >>> > >>> > >>> > >>> > On Fri, Feb 14, 2014 at 10:58 AM, Koichi Suzuki <koi...@gm...> >>> > wrote: >>> >> >>> >> 2014-02-14 14:55 GMT+09:00 Abbas Butt <abb...@en...>: >>> >> > >>> >> > >>> >> > >>> >> > On Fri, Feb 14, 2014 at 10:48 AM, Ashutosh Bapat >>> >> > <ash...@en...> wrote: >>> >> >> >>> >> >> >>> >> >> >>> >> >> >>> >> >> On Fri, Feb 14, 2014 at 7:25 AM, Abbas Butt >>> >> >> <abb...@en...> >>> >> >> wrote: >>> >> >>> >>> >> >>> >>> >> >>> >>> >> >>> >>> >> >>> On Thu, Feb 13, 2014 at 11:24 AM, Ashutosh Bapat >>> >> >>> <ash...@en...> wrote: >>> >> >>>> >>> >> >>>> One more solution would be to use cursors for replicated tables. >>> >> >>>> The >>> >> >>>> idea is to open cursors on all the copies of the table and append >>> >> >>>> the >>> >> >>>> query >>> >> >>>> with an ORDER BY clause on all the columns. Thus we are sure that >>> >> >>>> the >>> >> >>>> current of each of these cursors point to same row on all the >>> >> >>>> copies. >>> >> >>>> While >>> >> >>>> fetching a row from a replicated table, we fetch from all the >>> >> >>>> cursors >>> >> >>>> and >>> >> >>>> choose only one row for the data processing. While updating or >>> >> >>>> deleting we >>> >> >>>> send UPDATE or DELETE with WHERE CURRENT OF. The down side of >>> >> >>>> this >>> >> >>>> approach >>> >> >>>> is that, if there are coordinator quals, we will end up locking >>> >> >>>> more >>> >> >>>> rows >>> >> >>>> than necessary, increasing the probability of the deadlock but at >>> >> >>>> least >>> >> >>>> there won't be a necessary restriction of having primary or >>> >> >>>> unique >>> >> >>>> key and >>> >> >>>> we won't break backward compatibility. 
>>> >> >>>> >>> >> >>>> If there two identical rows, we might mix the update from >>> >> >>>> different >>> >> >>>> nodes, but then who knew which of them were corresponded across >>> >> >>>> the >>> >> >>>> nodes to >>> >> >>>> start with. >>> >> >>> >>> >> >>> >>> >> >>> Thanks for the suggestion but we currently do not support WCO and >>> >> >>> we >>> >> >>> were >>> >> >>> thinking of fixing this issue before we declare 1.2 beta is >>> >> >>> generally >>> >> >>> available. >>> >> >>> >>> >> >> >>> >> >> >>> >> >> Abbas, WCO doesn't work from the coordinator, but there is no >>> >> >> reason >>> >> >> why >>> >> >> it shouldn't work at the datanode. So internally between >>> >> >> coordinator >>> >> >> and the >>> >> >> datanode, we can always use WCO. >>> >> > >>> >> > >>> >> > True, Thanks for the clarification. >>> >> >>> >> Again, there are no guarantee that all cursors for a replicated table >>> >> returns rows in the same order. It is as dangerous as ctid. >>> > >>> > >>> > Could you please explain a little further, how would a query that has >>> > all >>> > table columns in the ORDER BY clause return rows in different order >>> > when run >>> > on the datanodes? >>> > >>> > >>> >> >>> >> >>> >> > >>> >> >> >>> >> >> >>> >> >>>> >>> >> >>>> >>> >> >>>> >>> >> >>>> On Thu, Feb 13, 2014 at 9:45 AM, Koichi Suzuki >>> >> >>>> <koi...@gm...> >>> >> >>>> wrote: >>> >> >>>>> >>> >> >>>>> Hi, >>> >> >>>>> >>> >> >>>>> I tested the patch and found that primary key is mandatory. We >>> >> >>>>> need >>> >> >>>>> to modify regression test considerably to give each replicated >>> >> >>>>> table >>> >> >>>>> primary keys. >>> >> >>>>> >>> >> >>>>> I think this patch helps but I'm not afraid this is good, >>> >> >>>>> especially >>> >> >>>>> when we try to take XC features back to PG. >>> >> >>>>> >>> >> >>>>> Did you post another patch to use all column values if primary >>> >> >>>>> key >>> >> >>>>> is >>> >> >>>>> not available? 
>>> >> >>>>> >>> >> >>>>> I think better way is as follows: >>> >> >>>>> >>> >> >>>>> 1) If primary key is defined, use it, >>> >> >>>>> 2) If not, create a primary key as system column, the size >>> >> >>>>> should be >>> >> >>>>> 64bit. >>> >> >>>>> 3) If primary key is added to a replicated table, remove system >>> >> >>>>> primary >>> >> >>>>> key. >>> >> >>>>> >>> >> >>>>> The value of primary key can be obtained as follows: >>> >> >>>>> >>> >> >>>>> 1) add new column to pgxc_class catalog to represent maximum >>> >> >>>>> value >>> >> >>>>> of >>> >> >>>>> the system primary key, >>> >> >>>>> 2) when first "insert" is done to the primary node, system >>> >> >>>>> primary >>> >> >>>>> key >>> >> >>>>> value is taken from 1) and 1) is updated. The value is returned >>> >> >>>>> to >>> >> >>>>> the coordinator to be propagated to other nodes. >>> >> >>>>> 3) when subsequent "insert" is being done, system primary key >>> >> >>>>> value >>> >> >>>>> is >>> >> >>>>> added to the column value. In this case, each datanode updates >>> >> >>>>> 1) >>> >> >>>>> column value if it is larger than the current maximum value. >>> >> >>>>> >>> >> >>>>> 3) is important to change primary node to another. This is >>> >> >>>>> needed >>> >> >>>>> to >>> >> >>>>> carry over the primary node to another. >>> >> >>>>> >>> >> >>>>> ALTER TABLE should take care of them. >>> >> >>>>> >>> >> >>>>> Other issues are: >>> >> >>>>> >>> >> >>>>> 4) pg_dump/pg_dumpall should not include this system column >>> >> >>>>> value, >>> >> >>>>> 5) cluster may need to handle this too to repack system primary >>> >> >>>>> key >>> >> >>>>> value (not now but at least in 1.3 or later). 
>>> >> >>>>>
>>> >> >>>>> Regards;
>>> >> >>>>> ---
>>> >> >>>>> Koichi Suzuki
>>> >> >>>>>
>>> >> >>>>> 2013-11-02 9:26 GMT+09:00 Mason Sharp <ms...@tr...>:
>>> >> >>>>> > Please see attached patch that tries to address the issue of XC using CTID
>>> >> >>>>> > for replicated updates and deletes when it is evaluated at a coordinator
>>> >> >>>>> > instead of being pushed down.
>>> >> >>>>> >
>>> >> >>>>> > The problem here is that CTID could be referring to a different tuple
>>> >> >>>>> > altogether on a different data node, which is what happened for one of our
>>> >> >>>>> > Postgres-XC support customers, leading to data issues.
>>> >> >>>>> >
>>> >> >>>>> > Instead, the patch looks for a primary key or unique index (with the primary
>>> >> >>>>> > key preferred) and uses those values instead of CTID.
>>> >> >>>>> >
>>> >> >>>>> > The patch could be improved further. Extra parameters are set even if not
>>> >> >>>>> > used in the execution of the prepared statement sent down to the data
>>> >> >>>>> > nodes.
>>> >> >>>>> >
>>> >> >>>>> > Regards,
>>> >> >>>>> >
>>> >> >>>>> > --
>>> >> >>>>> > Mason Sharp
>>> >> >>>>> >
>>> >> >>>>> > TransLattice - http://www.translattice.com
>>> >> >>>>> > Distributed and Clustered Database Solutions
|
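The difference between addressing a row by its physical position (the role CTID plays) and by an ORDER BY over all columns, as in the cursor proposal quoted above, can be illustrated with a small Python sketch. This is not Postgres-XC code: `open_ordered_cursor` and `fetch_aligned` are hypothetical names, each replica is modelled as a plain list of tuples, and the sketch assumes every datanode sorts with the same rules (for example the same collation).

```python
def open_ordered_cursor(replica_rows):
    """Stand-in for a datanode cursor with ORDER BY on all columns."""
    return iter(sorted(replica_rows))


def fetch_aligned(replicas):
    """Fetch one row from every cursor per step and check that they agree."""
    cursors = [open_ordered_cursor(rows) for rows in replicas]
    for fetched in zip(*cursors):
        if len(set(fetched)) != 1:
            raise ValueError(f"replicas diverge: {fetched}")
        yield fetched[0]


# Same logical content, different physical order on each datanode.
node1 = [(2, "b"), (1, "a"), (3, "c")]
node2 = [(1, "a"), (3, "c"), (2, "b")]

print(node1[0] == node2[0])                  # physical position 0 differs
print(list(fetch_aligned([node1, node2])))   # ordered cursors stay in step
```

Two fully identical rows would sort next to each other and remain interchangeable, which matches the remark in the proposal that nobody could tell such rows apart across nodes anyway.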
From: Masataka S. <pg...@gm...> - 2014-02-14 07:04:11
|
Thank you for your clever suggestion.

> - Make Cancel more selective and affect only specific query. That means an ID for each query to introduce, that should be known to client and way to deliver it.
> - Introduce procedure of changing backend key. Old cancel won't affect such backend.

I prefer the second idea. But these ideas seem to require touching the libpq infrastructure and, if I understand correctly, it is used not only for inter-node communication but also for communication between a coordinator and a frontend. Unless we can separate the two, I think it is better not to change it.

> - Before starting new query, check if there is pending cancel and remove it. It sounds ridiculous "cancel cancel" but may work, if queries and cancels are issued synchronously from single source.

I'm afraid that hypothesis is wrong. As I suggested at first, a cancel and the subsequent request are not serialized at the target node. That means that even if a query starts with no pending cancel, it could still be interrupted by the cancel request.

Regards.

On 14 February 2014 14:06, Andrei Martsinchyk <and...@gm...> wrote:
>
> You are right, the temp objects are problem.
> On the one hand if we run a long query and there was an error on one node we want to cancel it on others to avoid unnecessary waiting. On the other hand the query may be near its natural end and the cancel may be late and hit the next query.
> Just throwing out ideas:
> - Make Cancel more selective and affect only specific query. That means an ID for each query to introduce, that should be known to client and way to deliver it.
> - Introduce procedure of changing backend key. Old cancel won't affect such backend.
> - Before starting new query, check if there is pending cancel and remove it. It sounds ridiculous "cancel cancel" but may work, if queries and cancels are issued synchronously from single source.
>
> On 14.02.2014 at 4:07, "Koichi Suzuki" <koi...@gm...> wrote:
>
>> I misunderstand the implication.
Anyway additional wait is separate
>> from your suggestion.
>>
>> Disconnecting the connection as you suggested will bring another
>> problem such as TEMPORARY object in the subsequent queries. We do
>> not support TEMPORARY object but I believe we should be consistent on
>> this for future releases.
>>
>> Thoughts?
>> ---
>> Koichi Suzuki
>>
>> 2014-02-14 2:30 GMT+09:00 Andrei Martsinchyk <and...@gm...>:
>> > Hello,
>> >
>> > Postgres establishes separate connection to deliver Cancel command
>> > to the target session.
>> > On a heavily loaded node it may take fairly long. Longer sleep would
>> > help out, but it means longer recovery after an error.
>> > Better solution is to remove canceled connection from the pool and
>> > therefore do not use it to handle subsequent queries.
>> >
>> > 2014-02-13 11:10 GMT+02:00 Koichi Suzuki <koi...@gm...>:
>> >>
>> >> I think it hits the point. I tested this patch several times and it
>> >> seems to work fine. The delay time (at present 10ms) is short enough
>> >> and it is applied only when we need to cancel a statement.
>> >>
>> >> We should check this into all the master and STABLE branches
>> >> improving magic number with some meaningful name.
>> >>
>> >> Any thoughts?
>> >> ---
>> >> Koichi Suzuki
>> >>
>> >> 2014-01-24 18:25 GMT+09:00 Masataka Saito <pg...@gm...>:
>> >> > Hello,
>> >> >
>> >> > As I've been exasperated by random failures, I'm willing to whip
>> >> > the cause of the issue.
>> >> >
>> >> > This issue is related to cancel of the failed query.
>> >> > When a datanode reports an error of a query, a coordinator sends a
>> >> > cancel request to non-idle nodes, waits the node to get ready and
>> >> > requests nodes to rollback the transaction.
>> >> >
>> >> > Where's the problem? Consider the next case.
>> >> > 1. Datanode A (PID 1) reports an error to coordinator A. ([1] 'E' message)
>> >> > 2. Coordinator A receives [1] and reports an error to a frontend. ([2] 'E' message)
>> >> > 3. Coordinator A starts aborting process and it thinks datanode A (PID 1) is not idle.
>> >> > 4. Coordinator A sends a cancel request about PID 1 to datanode A (PID 2). ([3] cancel message)
>> >> > 5. Datanode A (PID 1) reports ready to coordinator A. ([4] 'Z' message)
>> >> > 6. Coordinator A receives [4] and sends "ROLLBACK TRANSACTION" immediately. ([5] 'Q' message)
>> >> > 7. Datanode A (PID 1) receives [5] and starts processing the query.
>> >> > 8. Datanode A (PID 2) receives [3].
>> >> > 9. Datanode A (PID 2) notify PID 1 of [3].
>> >> > 10. Datanode A (PID 1) cancel processing [5] and reports an error to Coordinator A. ([6] 'E' message)
>> >> > 11. Coordinator A receives [6] and reports an error to a frontend. ([7] 'E' message)
>> >> >
>> >> > [7] makes unexpected output and a test fails.
>> >> >
>> >> > Saying an extreme thing, it could occur that the next query of [5]
>> >> > is cancelled by [3].
>> >> >
>> >> > As far as I know, there's no way to know when to the cancel request
>> >> > get to be processed, I think we can't not wait an experimental
>> >> > duration after cancelling like the attached patch.
>> >> >
>> >> > Does anyone have another cool idea to solve this issue?
>> >> >
>> >> > Regards.
>> >
>> > --
>> > Andrei Martsinchyk
>> >
>> > StormDB - http://www.stormdb.com
>> > The Database Cloud
|
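The race in the numbered steps above can be reproduced with a small deterministic model. This is only a sketch under stated assumptions, not Postgres-XC code: `Backend`, `run_scenario`, `cancel_latency` and `wait_after_cancel` are hypothetical names, time is counted in abstract ticks, and a cancel is assumed to interrupt whatever statement is running when it arrives and to be ignored by an idle session.

```python
from typing import List, Optional


class Backend:
    """Toy stand-in for the datanode session called PID 1 in the steps above."""

    def __init__(self) -> None:
        self.current: Optional[str] = None   # statement being executed, if any
        self.log: List[str] = []

    def start(self, sql: str) -> None:
        self.current = sql

    def finish(self) -> None:
        if self.current is not None:
            self.log.append(f"OK: {self.current}")
            self.current = None

    def deliver_cancel(self) -> None:
        # Assumption: a cancel hits whatever is running when it arrives
        # and is ignored by an idle session.
        if self.current is not None:
            self.log.append(f"ERROR: canceled {self.current}")
            self.current = None
        else:
            self.log.append("cancel ignored (idle)")


def run_scenario(cancel_latency: int, wait_after_cancel: int,
                 rollback_ticks: int = 2) -> List[str]:
    """The cancel is sent at tick 0 and arrives after cancel_latency ticks;
    ROLLBACK is sent after wait_after_cancel ticks and runs for rollback_ticks."""
    backend = Backend()
    rollback_start = wait_after_cancel
    rollback_end = rollback_start + rollback_ticks
    for tick in range(max(cancel_latency, rollback_end) + 1):
        if tick == rollback_start:
            backend.start("ROLLBACK TRANSACTION")
        if tick == cancel_latency:
            backend.deliver_cancel()
        if tick == rollback_end:
            backend.finish()
    return backend.log


if __name__ == "__main__":
    print(run_scenario(cancel_latency=1, wait_after_cancel=0))  # no wait
    print(run_scenario(cancel_latency=1, wait_after_cancel=3))  # wait > latency
```

With no wait the late cancel lands on the ROLLBACK, which is the failure in steps 7 to 11. Waiting longer than the delivery latency lets the cancel arrive while the session is idle. The model also shows why a fixed delay is fragile: it only helps while it exceeds the real latency, which on a loaded node is not bounded.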
From: Ashutosh B. <ash...@en...> - 2014-02-14 07:02:51
|
On Fri, Feb 14, 2014 at 12:22 PM, Abbas Butt <abb...@en...> wrote:

> The case that we are trying to solve here is when the user statement is
> not ship-able and has to be evaluated at the coordinator. If it was
> ship-able, there is no problem in that case, it would get shipped like you
> are suggesting. However if it is not ship-able, then we will have a step in
> the query plan to first select the row to be updated and then a step to
> update that row (update being a two step process). What Ashutosh is
> suggesting is to have a cursor with an order by for all rows and the quals
> that the user query had, and then update the row using WCO.
> The reason of using cursors is to base the update on where the cursor
> currently points to, rather than the ctid of the row, which could be
> different on the datanodes.

Very well explained. Thank you. Hope that clears the doubts.

> On Fri, Feb 14, 2014 at 11:22 AM, Koichi Suzuki <koi...@gm...> wrote:
>
>> If we can use the same ORDER BY clause, I don't understand why we need
>> cursor. We can just ship statements.
>> ---
>> Koichi Suzuki
>>
>> 2014-02-14 15:20 GMT+09:00 Abbas Butt <abb...@en...>:
>> > On Fri, Feb 14, 2014 at 10:58 AM, Koichi Suzuki <koi...@gm...> wrote:
>> >>
>> >> 2014-02-14 14:55 GMT+09:00 Abbas Butt <abb...@en...>:
>> >> > On Fri, Feb 14, 2014 at 10:48 AM, Ashutosh Bapat <ash...@en...> wrote:
>> >> >>
>> >> >> On Fri, Feb 14, 2014 at 7:25 AM, Abbas Butt <abb...@en...> wrote:
>> >> >>>
>> >> >>> On Thu, Feb 13, 2014 at 11:24 AM, Ashutosh Bapat <ash...@en...> wrote:
>> >> >>>>
>> >> >>>> One more solution would be to use cursors for replicated tables.
>> >> >>>> The idea is to open cursors on all the copies of the table and
>> >> >>>> append the query with an ORDER BY clause on all the columns. Thus
>> >> >>>> we are sure that the current of each of these cursors point to
>> >> >>>> same row on all the copies. While fetching a row from a replicated
>> >> >>>> table, we fetch from all the cursors and choose only one row for
>> >> >>>> the data processing. While updating or deleting we send UPDATE or
>> >> >>>> DELETE with WHERE CURRENT OF. The down side of this approach is
>> >> >>>> that, if there are coordinator quals, we will end up locking more
>> >> >>>> rows than necessary, increasing the probability of the deadlock
>> >> >>>> but at least there won't be a necessary restriction of having
>> >> >>>> primary or unique key and we won't break backward compatibility.
>> >> >>>>
>> >> >>>> If there two identical rows, we might mix the update from
>> >> >>>> different nodes, but then who knew which of them were corresponded
>> >> >>>> across the nodes to start with.
>> >> >>>
>> >> >>> Thanks for the suggestion but we currently do not support WCO and
>> >> >>> we were thinking of fixing this issue before we declare 1.2 beta is
>> >> >>> generally available.
>> >> >>
>> >> >> Abbas, WCO doesn't work from the coordinator, but there is no reason
>> >> >> why it shouldn't work at the datanode. So internally between
>> >> >> coordinator and the datanode, we can always use WCO.
>> >> >
>> >> > True, Thanks for the clarification.
>> >>
>> >> Again, there are no guarantee that all cursors for a replicated table
>> >> returns rows in the same order. It is as dangerous as ctid.
>> >
>> > Could you please explain a little further, how would a query that has
>> > all table columns in the ORDER BY clause return rows in different order
>> > when run on the datanodes?
>> >
>> >> >>>> On Thu, Feb 13, 2014 at 9:45 AM, Koichi Suzuki <koi...@gm...> wrote:
>> >> >>>>>
>> >> >>>>> Hi,
>> >> >>>>>
>> >> >>>>> I tested the patch and found that primary key is mandatory. We
>> >> >>>>> need to modify regression test considerably to give each
>> >> >>>>> replicated table primary keys.
>> >> >>>>>
>> >> >>>>> I think this patch helps but I'm not afraid this is good,
>> >> >>>>> especially when we try to take XC features back to PG.
>> >> >>>>>
>> >> >>>>> Did you post another patch to use all column values if primary
>> >> >>>>> key is not available?
>> >> >>>>>
>> >> >>>>> I think better way is as follows:
>> >> >>>>>
>> >> >>>>> 1) If primary key is defined, use it,
>> >> >>>>> 2) If not, create a primary key as system column, the size should
>> >> >>>>> be 64bit.
>> >> >>>>> 3) If primary key is added to a replicated table, remove system
>> >> >>>>> primary key.
>> >> >>>>>
>> >> >>>>> The value of primary key can be obtained as follows:
>> >> >>>>>
>> >> >>>>> 1) add new column to pgxc_class catalog to represent maximum
>> >> >>>>> value of the system primary key,
>> >> >>>>> 2) when first "insert" is done to the primary node, system
>> >> >>>>> primary key value is taken from 1) and 1) is updated. The value
>> >> >>>>> is returned to the coordinator to be propagated to other nodes.
>> >> >>>>> 3) when subsequent "insert" is being done, system primary key
>> >> >>>>> value is added to the column value. In this case, each datanode
>> >> >>>>> updates 1) column value if it is larger than the current maximum
>> >> >>>>> value.
>> >> >>>>>
>> >> >>>>> 3) is important to change primary node to another. This is
>> >> >>>>> needed to carry over the primary node to another.
>> >> >>>>>
>> >> >>>>> ALTER TABLE should take care of them.
>> >> >>>>>
>> >> >>>>> Other issues are:
>> >> >>>>>
>> >> >>>>> 4) pg_dump/pg_dumpall should not include this system column value,
>> >> >>>>> 5) cluster may need to handle this too to repack system primary
>> >> >>>>> key value (not now but at least in 1.3 or later).
>> >> >>>>>
>> >> >>>>> Regards;
>> >> >>>>> ---
>> >> >>>>> Koichi Suzuki
>> >> >>>>>
>> >> >>>>> 2013-11-02 9:26 GMT+09:00 Mason Sharp <ms...@tr...>:
>> >> >>>>> > Please see attached patch that tries to address the issue of XC
>> >> >>>>> > using CTID for replicated updates and deletes when it is
>> >> >>>>> > evaluated at a coordinator instead of being pushed down.
>> >> >>>>> >
>> >> >>>>> > The problem here is that CTID could be referring to a different
>> >> >>>>> > tuple altogether on a different data node, which is what
>> >> >>>>> > happened for one of our Postgres-XC support customers, leading
>> >> >>>>> > to data issues.
>> >> >>>>> >
>> >> >>>>> > Instead, the patch looks for a primary key or unique index
>> >> >>>>> > (with the primary key preferred) and uses those values instead
>> >> >>>>> > of CTID.
>> >> >>>>> >
>> >> >>>>> > The patch could be improved further. Extra parameters are set
>> >> >>>>> > even if not used in the execution of the prepared statement
>> >> >>>>> > sent down to the data nodes.
>> >> >>>>> >
>> >> >>>>> > Regards,
>> >> >>>>> >
>> >> >>>>> > --
>> >> >>>>> > Mason Sharp
>> >> >>>>> >
>> >> >>>>> > TransLattice - http://www.translattice.com
>> >> >>>>> > Distributed and Clustered Database Solutions

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
|
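The system primary key scheme that Koichi outlines in the quoted message (a per-table maximum kept in the catalog, the primary node allocating the value, the other nodes raising their maximum when they see a larger key) might be sketched as follows. This is an illustration under assumptions, not the proposed implementation: `ReplicaCatalog`, `allocate_and_insert`, `insert_with_key` and `replicated_insert` are hypothetical names, and the catalog column is modelled as a plain integer.

```python
class ReplicaCatalog:
    """One datanode's copy of a replicated table plus the proposed
    per-table maximum of the system primary key (hypothetical model)."""

    def __init__(self):
        self.max_system_key = 0   # stand-in for the new pgxc_class column
        self.rows = {}            # system key -> row

    def allocate_and_insert(self, row):
        # Primary node: take the next value from the stored maximum and record it.
        self.max_system_key += 1
        self.rows[self.max_system_key] = row
        return self.max_system_key

    def insert_with_key(self, key, row):
        # Other nodes: store the key chosen by the primary and raise the
        # local maximum if the incoming key is larger.
        self.rows[key] = row
        self.max_system_key = max(self.max_system_key, key)


def replicated_insert(primary, others, row):
    """Coordinator side: insert on the primary first, then propagate the key."""
    key = primary.allocate_and_insert(row)
    for node in others:
        node.insert_with_key(key, row)
    return key


if __name__ == "__main__":
    n1, n2, n3 = ReplicaCatalog(), ReplicaCatalog(), ReplicaCatalog()
    replicated_insert(n1, [n2, n3], ("a",))
    replicated_insert(n1, [n2, n3], ("a",))   # identical row, distinct key
    replicated_insert(n2, [n1, n3], ("b",))   # primary switched to n2
    print(n1.rows == n2.rows == n3.rows, n3.max_system_key)
```

Because every node tracks the maximum, any node can take over as primary without reusing a key, which appears to be the point of the third step. Two identical user rows also get distinct keys, so they can be matched across nodes.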
From: Abbas B. <abb...@en...> - 2014-02-14 06:52:59
|
The case we are trying to solve here is when the user statement is not
shippable and has to be evaluated at the coordinator. If it were
shippable there would be no problem; it would simply get shipped as you
are suggesting. When it is not shippable, the query plan has one step to
select the row to be updated and a second step to update that row (the
update being a two-step process). What Ashutosh is suggesting is to open
a cursor with an ORDER BY on all the columns, plus the quals that the
user query had, and then update the row using WHERE CURRENT OF (WCO).
The reason for using cursors is to base the update on where the cursor
currently points, rather than on the ctid of the row, which could be
different on the datanodes.

On Fri, Feb 14, 2014 at 11:22 AM, Koichi Suzuki <koi...@gm...> wrote:

> If we can use the same ORDER BY clause, I don't understand why we need
> cursor. We can just ship statements.
> ---
> Koichi Suzuki

--
Abbas
Architect

Ph: 92.334.5100153
Skype ID: gabbasb
www.enterprisedb.com

Follow us on Twitter
@EnterpriseDB

Visit EnterpriseDB for tutorials, webinars, whitepapers and more
|
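To make the ctid point concrete, here is a minimal sketch of the cursor idea from this thread. It is a toy model with assumed names (`ordered_positions`, `update_where`), not XC planner code: each datanode is a Python list whose index plays the role of ctid, the cursor is simulated by walking positions in ORDER BY all-columns order, and every node is assumed to sort identically.

```python
def ordered_positions(heap):
    """Heap positions in ORDER BY <all columns> order (stand-in for a cursor)."""
    return sorted(range(len(heap)), key=lambda pos: heap[pos])


def update_where(nodes, qual, change):
    """Walk one cursor per node in lockstep and update the current row on each."""
    cursors = [ordered_positions(heap) for heap in nodes]
    updated = 0
    for positions in zip(*cursors):            # FETCH from every cursor
        row = nodes[0][positions[0]]           # one copy feeds the coordinator quals
        if qual(row):
            for heap, pos in zip(nodes, positions):
                heap[pos] = change(heap[pos])  # UPDATE ... WHERE CURRENT OF
            updated += 1
    return updated


# Same logical rows, different physical placement on each datanode.
node1 = [("bob", 30), ("alice", 25), ("carol", 41)]
node2 = [("alice", 25), ("carol", 41), ("bob", 30)]

if __name__ == "__main__":
    print(node1[0], node2[0])   # same physical slot, different rows
    update_where([node1, node2], lambda r: r[1] > 28, lambda r: (r[0], r[1] + 1))
    print(sorted(node1) == sorted(node2))
```

The same slot number names different rows on the two nodes, which is the ctid hazard. Walking the sorted order keeps the copies aligned only as long as all nodes produce the same ordering, and two fully identical rows remain interchangeable, as already noted in the thread.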
From: Andrei M. <and...@gm...> - 2014-02-14 06:41:35
|
It is not an issue in PG itself but in the way XC uses the feature,
which differs somewhat from the intended use.

On 14.02.2014 7:21, "Koichi Suzuki" <koi...@gm...> wrote:

> It seems to be an issue of PG itself, doesn't it?
> ---
> Koichi Suzuki
>
> 2014-02-14 14:06 GMT+09:00 Andrei Martsinchyk <and...@gm...>:
> > You are right, the temp objects are problem.
> > On the one hand if we run a long query and there was an error on one
> > node we want to cancel it on others to avoid unnecessary waiting. On
> > the other hand the query may be near its natural end and the cancel
> > may be late and hit the next query.
> > Just throwing out ideas:
> > - Make Cancel more selective and affect only specific query. That
> >   means an ID for each query to introduce, that should be known to
> >   client and way to deliver it.
> > - Introduce procedure of changing backend key. Old cancel won't
> >   affect such backend.
> > - Before starting new query, check if there is pending cancel and
> >   remove it. It sounds ridiculous "cancel cancel" but may work, if
> >   queries and cancels are issued synchronously from single source.
> >
> > On 14.02.2014 4:07, "Koichi Suzuki" <koi...@gm...> wrote:
> >
> >> I misunderstand the implication. Anyway additional wait is separate
> >> from your suggestion.
> >>
> >> Disconnecting the connection as you suggested will bring another
> >> problem such as TEMPORARY object in the subsequent queries. We do
> >> not support TEMPORARY object but I believe we should be consistent on
> >> this for future releases.
> >>
> >> Thoughts?
> >> ---
> >> Koichi Suzuki
|
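The first idea in the quoted list ("Make Cancel more selective") could look roughly like the sketch below. Everything here is hypothetical: the stock PostgreSQL cancel request identifies a backend (process ID plus secret key), not a statement, so the per-query ID, the `Session` class and its methods are assumptions made only to illustrate the idea.

```python
class Session:
    """Toy backend session that tags each statement with a query ID."""

    def __init__(self):
        self.query_id = 0      # ID of the most recently started statement
        self.running = None    # statement text while executing, else None
        self.events = []

    def start(self, sql):
        self.query_id += 1
        self.running = sql
        return self.query_id   # the client keeps this to address a cancel

    def finish(self):
        self.events.append(f"done #{self.query_id}")
        self.running = None

    def cancel(self, target_query_id):
        # Only interrupt when the cancel names the statement that is running now.
        if self.running is not None and target_query_id == self.query_id:
            self.events.append(f"canceled #{self.query_id}")
            self.running = None
            return True
        self.events.append(f"stale cancel for #{target_query_id} ignored")
        return False


if __name__ == "__main__":
    s = Session()
    failed = s.start("UPDATE t SET x = 1")
    s.finish()                          # the statement ended before the cancel arrived
    s.start("ROLLBACK TRANSACTION")
    s.cancel(failed)                    # late cancel no longer hits the ROLLBACK
    print(s.events)
```

A late cancel aimed at the earlier statement is dropped instead of interrupting the ROLLBACK, so no fixed wait is needed. The cost, as the message says, is that the ID has to be introduced and delivered to the client.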
From: Ahsan H. <ahs...@en...> - 2014-02-14 06:39:03
|
On Fri, Feb 14, 2014 at 6:46 AM, Abbas Butt <abb...@en...> wrote:

> On Fri, Feb 14, 2014 at 6:09 AM, Koichi Suzuki <koi...@gm...> wrote:
>
>> I don't think it works because there're no guarantee each cursor
>> against different datanode returns rows in the same order.
>
> Why would a query with an ORDER BY clause on all the columns not return
> the rows in the same order on each datanode?

hmm...it should. Could they be affected by locale setting?

>> Regards;
>> ---
>> Koichi Suzuki
>>
>> 2014-02-13 15:50 GMT+09:00 Ashutosh Bapat <ash...@en...>:
>> > WCO on datanodes should be working fine, so that shouldn't be a problem.
>> >
>> > On Thu, Feb 13, 2014 at 12:13 PM, 鈴木 幸市 <ko...@in...> wrote:
>> >>
>> >> Are you sure that this does not come up with any bad side effects to
>> >> support WCO in 1.3?
>> >> ---
>> >> Koichi Suzuki
>> >>
>> >> On 2014/02/13 15:24, Ashutosh Bapat <ash...@en...> wrote:
>> >>
>> >> One more solution would be to use cursors for replicated tables. The
>> >> idea is to open cursors on all the copies of the table and append the
>> >> query with an ORDER BY clause on all the columns. Thus we are sure
>> >> that the current of each of these cursors point to same row on all
>> >> the copies. While fetching a row from a replicated table, we fetch
>> >> from all the cursors and choose only one row for the data processing.
>> >> While updating or deleting we send UPDATE or DELETE with WHERE
>> >> CURRENT OF. The down side of this approach is that, if there are
>> >> coordinator quals, we will end up locking more rows than necessary,
>> >> increasing the probability of the deadlock but at least there won't
>> >> be a necessary restriction of having primary or unique key and we
>> >> won't break backward compatibility.
>> >>
>> >> If there two identical rows, we might mix the update from different
>> >> nodes, but then who knew which of them were corresponded across the
>> >> nodes to start with.
>> >> >> >> >> >> On Thu, Feb 13, 2014 at 9:45 AM, Koichi Suzuki <koi...@gm...> >> >> wrote: >> >>> >> >>> Hi, >> >>> >> >>> I tested the patch and found that primary key is mandatory. We need >> >>> to modify regression test considerably to give each replicated table >> >>> primary keys. >> >>> >> >>> I think this patch helps but I'm not afraid this is good, especially >> >>> when we try to take XC features back to PG. >> >>> >> >>> Did you post another patch to use all column values if primary key is >> >>> not available? >> >>> >> >>> I think better way is as follows: >> >>> >> >>> 1) If primary key is defined, use it, >> >>> 2) If not, create a primary key as system column, the size should be >> >>> 64bit. >> >>> 3) If primary key is added to a replicated table, remove system >> primary >> >>> key. >> >>> >> >>> The value of primary key can be obtained as follows: >> >>> >> >>> 1) add new column to pgxc_class catalog to represent maximum value of >> >>> the system primary key, >> >>> 2) when first "insert" is done to the primary node, system primary key >> >>> value is taken from 1) and 1) is updated. The value is returned to >> >>> the coordinator to be propagated to other nodes. >> >>> 3) when subsequent "insert" is being done, system primary key value is >> >>> added to the column value. In this case, each datanode updates 1) >> >>> column value if it is larger than the current maximum value. >> >>> >> >>> 3) is important to change primary node to another. This is needed to >> >>> carry over the primary node to another. >> >>> >> >>> ALTER TABLE should take care of them. >> >>> >> >>> Other issues are: >> >>> >> >>> 4) pg_dump/pg_dumpall should not include this system column value, >> >>> 5) cluster may need to handle this too to repack system primary key >> >>> value (not now but at least in 1.3 or later). 
>> >>> >> >>> Regards; >> >>> --- >> >>> Koichi Suzuki >> >>> >> >>> >> >>> 2013-11-02 9:26 GMT+09:00 Mason Sharp <ms...@tr...>: >> >>> > Please see attached patch that tries to address the issue of XC >> using >> >>> > CTID >> >>> > for replicated updates and deletes when it is evaluated at a >> >>> > coordinator >> >>> > instead of being pushed down. >> >>> > >> >>> > The problem here is that CTID could be referring to a different >> tuple >> >>> > altogether on a different data node, which is what happened for one >> of >> >>> > our >> >>> > Postgres-XC support customers, leading to data issues. >> >>> > >> >>> > Instead, the patch looks for a primary key or unique index (with the >> >>> > primary >> >>> > key preferred) and uses those values instead of CTID. >> >>> > >> >>> > The patch could be improved further. Extra parameters are set even >> if >> >>> > not >> >>> > used in the execution of the prepared statement sent down to the >> data >> >>> > nodes. >> >>> > >> >>> > Regards, >> >>> > >> >>> > >> >>> > -- >> >>> > Mason Sharp >> >>> > >> >>> > TransLattice - http://www.translattice.com >> >>> > Distributed and Clustered Database Solutions >> >>> > >> >>> > >> >>> > >> ------------------------------------------------------------------------------ >> >>> > November Webinars for C, C++, Fortran Developers >> >>> > Accelerate application performance with scalable programming models. >> >>> > Explore >> >>> > techniques for threading, error checking, porting, and tuning. Get >> the >> >>> > most >> >>> > from the latest Intel processors and coprocessors. See abstracts and >> >>> > register >> >>> > >> >>> > >> http://pubads.g.doubleclick.net/gampad/clk?id=60136231&iu=/4140/ostg.clktrk >> >>> > _______________________________________________ >> >>> > Postgres-xc-developers mailing list >> >>> > Pos...@li... 
>> >>> > https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers >> >>> > >> >>> >> >>> >> >>> >> ------------------------------------------------------------------------------ >> >>> Android apps run on BlackBerry 10 >> >>> Introducing the new BlackBerry 10.2.1 Runtime for Android apps. >> >>> Now with support for Jelly Bean, Bluetooth, Mapview and more. >> >>> Get your Android app in front of a whole new audience. Start now. >> >>> >> >>> >> http://pubads.g.doubleclick.net/gampad/clk?id=124407151&iu=/4140/ostg.clktrk >> >>> _______________________________________________ >> >>> Postgres-xc-developers mailing list >> >>> Pos...@li... >> >>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers >> >> >> >> >> >> >> >> >> >> -- >> >> Best Wishes, >> >> Ashutosh Bapat >> >> EnterpriseDB Corporation >> >> The Postgres Database Company >> >> >> >> >> ------------------------------------------------------------------------------ >> >> Android apps run on BlackBerry 10 >> >> Introducing the new BlackBerry 10.2.1 Runtime for Android apps. >> >> Now with support for Jelly Bean, Bluetooth, Mapview and more. >> >> Get your Android app in front of a whole new audience. Start now. >> >> >> >> >> http://pubads.g.doubleclick.net/gampad/clk?id=124407151&iu=/4140/ostg.clktrk_______________________________________________ >> >> Postgres-xc-developers mailing list >> >> Pos...@li... >> >> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers >> >> >> >> >> > >> > >> > >> > -- >> > Best Wishes, >> > Ashutosh Bapat >> > EnterpriseDB Corporation >> > The Postgres Database Company >> >> >> ------------------------------------------------------------------------------ >> Android apps run on BlackBerry 10 >> Introducing the new BlackBerry 10.2.1 Runtime for Android apps. >> Now with support for Jelly Bean, Bluetooth, Mapview and more. 
>> Get your Android app in front of a whole new audience. Start now. >> >> http://pubads.g.doubleclick.net/gampad/clk?id=124407151&iu=/4140/ostg.clktrk >> _______________________________________________ >> Postgres-xc-developers mailing list >> Pos...@li... >> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers >> > > > > -- > -- > *Abbas* > Architect > > Ph: 92.334.5100153 > Skype ID: gabbasb > www.enterprisedb.co <http://www.enterprisedb.com/>m<http://www.enterprisedb.com/> > > *Follow us on Twitter* > @EnterpriseDB > > Visit EnterpriseDB for tutorials, webinars, whitepapers<http://www.enterprisedb.com/resources-community>and more<http://www.enterprisedb.com/resources-community> > > > ------------------------------------------------------------------------------ > Android apps run on BlackBerry 10 > Introducing the new BlackBerry 10.2.1 Runtime for Android apps. > Now with support for Jelly Bean, Bluetooth, Mapview and more. > Get your Android app in front of a whole new audience. Start now. > > http://pubads.g.doubleclick.net/gampad/clk?id=124407151&iu=/4140/ostg.clktrk > _______________________________________________ > Postgres-xc-developers mailing list > Pos...@li... > https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers > > -- Ahsan Hadi Snr Director Product Development EnterpriseDB Corporation The Enterprise Postgres Company Phone: +92-51-8358874 Mobile: +92-333-5162114 Website: www.enterprisedb.com EnterpriseDB Blog: http://blogs.enterprisedb.com/ Follow us on Twitter: http://www.twitter.com/enterprisedb This e-mail message (and any attachment) is intended for the use of the individual or entity to whom it is addressed. 
This message contains information from EnterpriseDB Corporation that may be privileged, confidential, or exempt from disclosure under applicable law. If you are not the intended recipient or authorized to receive this for the intended recipient, any use, dissemination, distribution, retention, archiving, or copying of this communication is strictly prohibited. If you have received this e-mail in error, please notify the sender immediately by reply e-mail and delete this message. |
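The locale question above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the two key functions are hypothetical stand-ins for a "C" collation (code-point order) and a dictionary-like, case-insensitive collation, and the rows are invented. It does not reproduce PostgreSQL's actual collation code; it only shows that the n-th row of an "ORDER BY all columns" scan can differ between nodes whose sort rules differ.

```python
# Hypothetical illustration: identical replicated rows sorted under two
# different collation rules, as might happen if datanodes were initialized
# with different locale settings.

rows = [("apple", 1), ("Banana", 2), ("cherry", 3)]

def order_by_all_columns(rows, collate):
    """Emulate ORDER BY col1, col2 using `collate` for the text column."""
    return sorted(rows, key=lambda r: (collate(r[0]), r[1]))

def c_locale(s):
    # Code-point comparison: uppercase letters sort before lowercase ones.
    return s

def dictionary_like(s):
    # Case-insensitive comparison first, ties broken by code point.
    return (s.lower(), s)

node1 = order_by_all_columns(rows, c_locale)
node2 = order_by_all_columns(rows, dictionary_like)

# Positions at which the n-th fetched row differs between the two nodes.
mismatches = [i for i, (a, b) in enumerate(zip(node1, node2)) if a != b]
```

If `mismatches` is non-empty, a positional cursor on one node is not looking at the same logical row as the cursor at the same position on the other node.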
From: Koichi S. <koi...@gm...> - 2014-02-14 06:22:54
|
If we can use the same ORDER BY clause, I don't understand why we need a cursor. We can just ship statements.
---
Koichi Suzuki

2014-02-14 15:20 GMT+09:00 Abbas Butt <abb...@en...>:
> On Fri, Feb 14, 2014 at 10:58 AM, Koichi Suzuki <koi...@gm...> wrote:
>> 2014-02-14 14:55 GMT+09:00 Abbas Butt <abb...@en...>:
>> > On Fri, Feb 14, 2014 at 10:48 AM, Ashutosh Bapat <ash...@en...> wrote:
>> >> On Fri, Feb 14, 2014 at 7:25 AM, Abbas Butt <abb...@en...> wrote:
>> >>> Thanks for the suggestion but we currently do not support WCO and we
>> >>> were thinking of fixing this issue before we declare the 1.2 beta
>> >>> generally available.
>> >>
>> >> Abbas, WCO doesn't work from the coordinator, but there is no reason
>> >> why it shouldn't work at the datanode. So internally, between the
>> >> coordinator and the datanode, we can always use WCO.
>> >
>> > True, thanks for the clarification.
>>
>> Again, there is no guarantee that all cursors for a replicated table
>> return rows in the same order. It is as dangerous as ctid.
>
> Could you please explain a little further: how would a query that has all
> table columns in the ORDER BY clause return rows in a different order when
> run on the datanodes?
|
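One possible reading of "just ship statements" is that the coordinator sends each datanode an ordinary UPDATE whose WHERE clause matches the old values of every column, rather than addressing the row by cursor position or ctid. The sketch below assumes that reading; the function name, table name and column names are hypothetical, and it is not taken from any Postgres-XC patch. It uses `IS NOT DISTINCT FROM` so that NULL column values still match.

```python
def build_update(table, columns, set_column):
    """Build a parameterized UPDATE that identifies the target row by the
    old values of every column instead of by ctid.

    Identifiers are interpolated as-is, so they must be trusted or quoted
    by the caller; this is a sketch, not production SQL generation.
    """
    if set_column not in columns:
        raise ValueError(f"unknown column: {set_column}")
    # $1 carries the new value; $2, $3, ... carry each column's old value.
    predicates = [
        f"{col} IS NOT DISTINCT FROM ${i}"
        for i, col in enumerate(columns, start=2)
    ]
    return (
        f"UPDATE {table} SET {set_column} = $1 WHERE "
        + " AND ".join(predicates)
    )

sql = build_update("t", ["a", "b"], "b")
```

As noted earlier in the thread, two fully identical rows cannot be told apart this way: such a statement would update both of them on every node.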
From: Abbas B. <abb...@en...> - 2014-02-14 06:20:48
|
On Fri, Feb 14, 2014 at 10:58 AM, Koichi Suzuki <koi...@gm...> wrote:
> Again, there is no guarantee that all cursors for a replicated table
> return rows in the same order. It is as dangerous as ctid.

Could you please explain a little further: how would a query that has all table columns in the ORDER BY clause return rows in a different order when run on the datanodes?

--
Abbas
Architect
|
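The cursor proposal under discussion (one cursor per copy of the replicated table, each ordered by all columns, fetched in lockstep) can be sketched as follows. This is a simulation under assumptions, not Postgres-XC code: Python iterators stand in for datanode cursors, `sorted` stands in for ORDER BY on all columns, rows are assumed to contain no NULLs, and the function name is hypothetical. The added check shows how a coordinator could at least detect the divergence that is being debated.

```python
from itertools import zip_longest

_MISSING = object()  # marks a cursor that ran out of rows early

def lockstep_fetch(replicas):
    """Yield one row per position after checking that all replicas agree.

    `replicas` is a list of row lists, one per datanode copy.
    """
    # One "cursor" per copy, each scanning in ORDER BY all-columns order.
    cursors = [iter(sorted(rows)) for rows in replicas]
    for position, fetched in enumerate(
        zip_longest(*cursors, fillvalue=_MISSING)
    ):
        first = fetched[0]
        if any(row is _MISSING or row != first for row in fetched):
            raise RuntimeError(f"replicas diverge at position {position}")
        # Only one copy of the row is used for further processing.
        yield first
```

For example, `list(lockstep_fetch([[(2, "b"), (1, "a")], [(1, "a"), (2, "b")]]))` yields the rows in the same order from both copies, while a copy with a missing row raises an error instead of silently pairing different rows.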
From: Koichi S. <koi...@gm...> - 2014-02-14 05:58:42
|
2014-02-14 14:55 GMT+09:00 Abbas Butt <abb...@en...>: > > > > On Fri, Feb 14, 2014 at 10:48 AM, Ashutosh Bapat > <ash...@en...> wrote: >> >> >> >> >> On Fri, Feb 14, 2014 at 7:25 AM, Abbas Butt <abb...@en...> >> wrote: >>> >>> >>> >>> >>> On Thu, Feb 13, 2014 at 11:24 AM, Ashutosh Bapat >>> <ash...@en...> wrote: >>>> >>>> One more solution would be to use cursors for replicated tables. The >>>> idea is to open cursors on all the copies of the table and append the query >>>> with an ORDER BY clause on all the columns. Thus we are sure that the >>>> current of each of these cursors point to same row on all the copies. While >>>> fetching a row from a replicated table, we fetch from all the cursors and >>>> choose only one row for the data processing. While updating or deleting we >>>> send UPDATE or DELETE with WHERE CURRENT OF. The down side of this approach >>>> is that, if there are coordinator quals, we will end up locking more rows >>>> than necessary, increasing the probability of the deadlock but at least >>>> there won't be a necessary restriction of having primary or unique key and >>>> we won't break backward compatibility. >>>> >>>> If there two identical rows, we might mix the update from different >>>> nodes, but then who knew which of them were corresponded across the nodes to >>>> start with. >>> >>> >>> Thanks for the suggestion but we currently do not support WCO and we were >>> thinking of fixing this issue before we declare 1.2 beta is generally >>> available. >>> >> >> >> Abbas, WCO doesn't work from the coordinator, but there is no reason why >> it shouldn't work at the datanode. So internally between coordinator and the >> datanode, we can always use WCO. > > > True, Thanks for the clarification. Again, there are no guarantee that all cursors for a replicated table returns rows in the same order. It is as dangerous as ctid. 
> >> >> >>>> >>>> >>>> >>>> On Thu, Feb 13, 2014 at 9:45 AM, Koichi Suzuki <koi...@gm...> >>>> wrote: >>>>> >>>>> Hi, >>>>> >>>>> I tested the patch and found that primary key is mandatory. We need >>>>> to modify regression test considerably to give each replicated table >>>>> primary keys. >>>>> >>>>> I think this patch helps but I'm not afraid this is good, especially >>>>> when we try to take XC features back to PG. >>>>> >>>>> Did you post another patch to use all column values if primary key is >>>>> not available? >>>>> >>>>> I think better way is as follows: >>>>> >>>>> 1) If primary key is defined, use it, >>>>> 2) If not, create a primary key as system column, the size should be >>>>> 64bit. >>>>> 3) If primary key is added to a replicated table, remove system primary >>>>> key. >>>>> >>>>> The value of primary key can be obtained as follows: >>>>> >>>>> 1) add new column to pgxc_class catalog to represent maximum value of >>>>> the system primary key, >>>>> 2) when first "insert" is done to the primary node, system primary key >>>>> value is taken from 1) and 1) is updated. The value is returned to >>>>> the coordinator to be propagated to other nodes. >>>>> 3) when subsequent "insert" is being done, system primary key value is >>>>> added to the column value. In this case, each datanode updates 1) >>>>> column value if it is larger than the current maximum value. >>>>> >>>>> 3) is important to change primary node to another. This is needed to >>>>> carry over the primary node to another. >>>>> >>>>> ALTER TABLE should take care of them. >>>>> >>>>> Other issues are: >>>>> >>>>> 4) pg_dump/pg_dumpall should not include this system column value, >>>>> 5) cluster may need to handle this too to repack system primary key >>>>> value (not now but at least in 1.3 or later). 
>>>>> Regards;
>>>>> ---
>>>>> Koichi Suzuki
>>>>>
>>>>> 2013-11-02 9:26 GMT+09:00 Mason Sharp <ms...@tr...>:
>>>>> > Please see attached patch that tries to address the issue of XC using CTID
>>>>> > for replicated updates and deletes when it is evaluated at a coordinator
>>>>> > instead of being pushed down.
>>>>> >
>>>>> > The problem here is that CTID could be referring to a different tuple
>>>>> > altogether on a different data node, which is what happened for one of our
>>>>> > Postgres-XC support customers, leading to data issues.
>>>>> >
>>>>> > Instead, the patch looks for a primary key or unique index (with the primary
>>>>> > key preferred) and uses those values instead of CTID.
>>>>> >
>>>>> > The patch could be improved further. Extra parameters are set even if not
>>>>> > used in the execution of the prepared statement sent down to the data nodes.
>>>>> >
>>>>> > Regards,
>>>>> >
>>>>> > --
>>>>> > Mason Sharp
>>>>> >
>>>>> > TransLattice - http://www.translattice.com
>>>>> > Distributed and Clustered Database Solutions
|
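The CTID problem described in Mason's quoted message can be illustrated with a toy model. This is only a sketch: the replicas are plain Python lists, the `(block, offset)` tuples stand in for CTIDs, and the helper names are made up for illustration. Nothing here is Postgres-XC code.

```python
# Toy model: each replica is a list of (ctid, row) pairs. The ctid is the
# physical position, which depends on each node's local history.

def delete_by_ctid(replica, ctid):
    """Delete whichever row happens to live at this physical position."""
    return [(c, r) for c, r in replica if c != ctid]


def delete_by_key(replica, key):
    """Delete the row whose primary key matches, wherever it is stored."""
    return [(c, r) for c, r in replica if r["id"] != key]


def ids(replica):
    """Sorted primary keys still present on a replica."""
    return sorted(r["id"] for _, r in replica)


node1 = [((0, 1), {"id": 10, "v": "a"}), ((0, 2), {"id": 20, "v": "b"})]
# Same logical contents, different physical layout.
node2 = [((0, 1), {"id": 20, "v": "b"}), ((0, 2), {"id": 10, "v": "a"})]

# The coordinator read the row with id=10 from node1, where its ctid is (0, 1).
target_ctid, target_key = (0, 1), 10

by_ctid = [delete_by_ctid(n, target_ctid) for n in (node1, node2)]
by_key = [delete_by_key(n, target_key) for n in (node1, node2)]

print([ids(n) for n in by_ctid])  # [[20], [10]]: the copies diverge
print([ids(n) for n in by_key])   # [[20], [20]]: the copies agree
```

Identifying the row by key values is what makes the delete land on the same logical row on every node, which is also why the approach needs a primary key or unique index to exist.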
From: Koichi S. <koi...@gm...> - 2014-02-14 05:56:36
|
The only catch is that we need to disable implicit 2PC (enforce_two_phase_commit) to use temporary objects. That is fine within the regression tests but may not be acceptable in more realistic deployments.

In XC, applications do not issue 2PC statements themselves; the coordinator does this when a transaction involves more than one node (implicit 2PC). A temporary object cannot survive beyond its session, which is why temporary objects are not allowed with 2PC. Because an implicit 2PC transaction never spans sessions either, I believe XC can support this.

We should continue to discuss the other issues.

Regards;
---
Koichi Suzuki

2014-02-14 14:45 GMT+09:00 Michael Paquier <mic...@gm...>:
> On Fri, Feb 14, 2014 at 11:07 AM, Koichi Suzuki <koi...@gm...> wrote:
>> Disconnecting the connection as you suggested will bring another
>> problem such as TEMPORARY object in the subsequent queries. We do
>> not support TEMPORARY object but I believe we should be consistent on
>> this for future releases.
> AFAIK, temporary object are supported, no? There is even a parameter
> allowing to enforce autocommit instead of 2PC when temp objects are
> used. This is dangerous, OK, but that's a trade-off and user is
> normally aware of that when using temp tables. Actually, temporary
> tables are useful even without 2PC with for example a join pushed down
> to a unique remote node: it might be better to use a temporary table
> gathering a list of keys instead of a heavy ANY clause or a array with
> many values.
> --
> Michael
|
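The interaction between implicit 2PC and temporary objects described in this message can be sketched as a small decision function. This is an assumption-laden illustration, not the coordinator's actual logic: `choose_commit`, its return strings, and the error text are hypothetical, and only the parameter name `enforce_two_phase_commit` comes from the thread.

```python
def choose_commit(nodes_written, uses_temp_objects, enforce_two_phase_commit=True):
    """Pick a commit strategy for one transaction (hypothetical sketch)."""
    if len(nodes_written) <= 1:
        # A single node can commit on its own; no 2PC is involved.
        return "one-phase"
    if uses_temp_objects:
        if enforce_two_phase_commit:
            # 2PC is required, but temporary objects cannot be prepared.
            raise RuntimeError("temporary objects cannot be used with 2PC")
        # 2PC is skipped: each node commits separately, without atomicity.
        return "one-phase-per-node"
    # More than one node and no temporary objects: implicit 2PC.
    return "implicit-2pc"


print(choose_commit(["dn1", "dn2"], uses_temp_objects=False))
print(choose_commit(["dn1", "dn2"], uses_temp_objects=True,
                    enforce_two_phase_commit=False))
```

The second call shows the trade-off raised in the quoted reply: with enforcement off, the transaction goes through, but the nodes are no longer committed atomically.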
From: Abbas B. <abb...@en...> - 2014-02-14 05:55:35
|
On Fri, Feb 14, 2014 at 10:48 AM, Ashutosh Bapat <ash...@en...> wrote:

> On Fri, Feb 14, 2014 at 7:25 AM, Abbas Butt <abb...@en...> wrote:
>
>> On Thu, Feb 13, 2014 at 11:24 AM, Ashutosh Bapat <ash...@en...> wrote:
>>
>>> One more solution would be to use cursors for replicated tables. The
>>> idea is to open cursors on all the copies of the table and append the query
>>> with an ORDER BY clause on all the columns. Thus we are sure that the
>>> current of each of these cursors point to same row on all the copies. While
>>> fetching a row from a replicated table, we fetch from all the cursors and
>>> choose only one row for the data processing. While updating or deleting we
>>> send UPDATE or DELETE with WHERE CURRENT OF. The down side of this approach
>>> is that, if there are coordinator quals, we will end up locking more rows
>>> than necessary, increasing the probability of the deadlock but at least
>>> there won't be a necessary restriction of having primary or unique key and
>>> we won't break backward compatibility.
>>>
>>> If there two identical rows, we might mix the update from different
>>> nodes, but then who knew which of them were corresponded across the nodes
>>> to start with.
>>
>> Thanks for the suggestion but we currently do not support WCO and we were
>> thinking of fixing this issue before we declare 1.2 beta is generally
>> available.
>
> Abbas, WCO doesn't work from the coordinator, but there is no reason why
> it shouldn't work at the datanode. So internally between coordinator and
> the datanode, we can always use WCO.

True, thanks for the clarification.

>>> On Thu, Feb 13, 2014 at 9:45 AM, Koichi Suzuki <koi...@gm...> wrote:
>>>
>>>> Hi,
>>>>
>>>> I tested the patch and found that primary key is mandatory. We need
>>>> to modify regression test considerably to give each replicated table
>>>> primary keys.
>>>>
>>>> I think this patch helps but I'm not afraid this is good, especially
>>>> when we try to take XC features back to PG.
>>>>
>>>> Did you post another patch to use all column values if primary key is
>>>> not available?
>>>>
>>>> I think better way is as follows:
>>>>
>>>> 1) If primary key is defined, use it,
>>>> 2) If not, create a primary key as system column, the size should be
>>>> 64bit.
>>>> 3) If primary key is added to a replicated table, remove system primary
>>>> key.
>>>>
>>>> The value of primary key can be obtained as follows:
>>>>
>>>> 1) add new column to pgxc_class catalog to represent maximum value of
>>>> the system primary key,
>>>> 2) when first "insert" is done to the primary node, system primary key
>>>> value is taken from 1) and 1) is updated. The value is returned to
>>>> the coordinator to be propagated to other nodes.
>>>> 3) when subsequent "insert" is being done, system primary key value is
>>>> added to the column value. In this case, each datanode updates 1)
>>>> column value if it is larger than the current maximum value.
>>>>
>>>> 3) is important to change primary node to another. This is needed to
>>>> carry over the primary node to another.
>>>>
>>>> ALTER TABLE should take care of them.
>>>>
>>>> Other issues are:
>>>>
>>>> 4) pg_dump/pg_dumpall should not include this system column value,
>>>> 5) cluster may need to handle this too to repack system primary key
>>>> value (not now but at least in 1.3 or later).
>>>>
>>>> Regards;
>>>> ---
>>>> Koichi Suzuki

--
Abbas
Architect

Ph: 92.334.5100153
Skype ID: gabbasb
www.enterprisedb.com

Follow us on Twitter
@EnterpriseDB

Visit EnterpriseDB for tutorials, webinars, whitepapers and more
|
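Koichi's quoted proposal for a system primary key (a per-table maximum kept in the catalog, allocated on the primary node and propagated to the others) can be sketched roughly as follows. This is an illustrative model only: the `Datanode` class, `max_sys_key`, and `replicated_insert` are hypothetical names, and the catalog column is reduced to an integer attribute.

```python
class Datanode:
    def __init__(self, name):
        self.name = name
        self.max_sys_key = 0   # stands in for the proposed pgxc_class column
        self.rows = {}         # system primary key -> row

    def insert_as_primary(self, row):
        """Primary node: allocate the next key locally and store the row."""
        self.max_sys_key += 1
        self.rows[self.max_sys_key] = row
        return self.max_sys_key

    def insert_with_key(self, key, row):
        """Other nodes: store the row under the propagated key and keep
        the local maximum up to date."""
        self.rows[key] = row
        self.max_sys_key = max(self.max_sys_key, key)


def replicated_insert(primary, others, row):
    """Coordinator: insert on the primary first, then propagate its key."""
    key = primary.insert_as_primary(row)
    for node in others:
        node.insert_with_key(key, row)
    return key


dn1, dn2 = Datanode("dn1"), Datanode("dn2")
k1 = replicated_insert(dn1, [dn2], {"v": "a"})
k2 = replicated_insert(dn1, [dn2], {"v": "b"})
# Change the primary node: dn2 continues from the maximum it has tracked.
k3 = replicated_insert(dn2, [dn1], {"v": "c"})
print(k1, k2, k3)            # 1 2 3
print(dn1.rows == dn2.rows)  # True
```

Because every node tracks the maximum it has seen, switching the primary node does not reuse a key, which appears to be the point of step 3 in the proposal.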
From: Michael P. <mic...@gm...> - 2014-02-14 05:50:36
|
On Fri, Feb 14, 2014 at 2:21 PM, Koichi Suzuki <koi...@gm...> wrote:
> It seems to be an issue of PG itself, doesn't it?

Partially. Personally, I am not convinced that this feature is worth the implementation complexity for PG itself. For XC it is another story, though, given its shared-nothing architecture.

Supporting temp tables in 2PC was discussed around 8.3/8.4 (?), and there have been a couple of patches AFAIK. But even with that, there are more problems to consider, for example WITH HOLD cursors or even LISTEN, which are features tightly tied to session behavior. LISTEN is not supported in XC as I recall, but cursors are...

Regards,
--
Michael
|
From: Ashutosh B. <ash...@en...> - 2014-02-14 05:48:31
|
On Fri, Feb 14, 2014 at 7:25 AM, Abbas Butt <abb...@en...> wrote:

> On Thu, Feb 13, 2014 at 11:24 AM, Ashutosh Bapat <ash...@en...> wrote:
>
>> One more solution would be to use cursors for replicated tables. The idea
>> is to open cursors on all the copies of the table and append the query with
>> an ORDER BY clause on all the columns. Thus we are sure that the current of
>> each of these cursors point to same row on all the copies. While fetching a
>> row from a replicated table, we fetch from all the cursors and choose only
>> one row for the data processing. While updating or deleting we send UPDATE
>> or DELETE with WHERE CURRENT OF. The down side of this approach is that, if
>> there are coordinator quals, we will end up locking more rows than
>> necessary, increasing the probability of the deadlock but at least there
>> won't be a necessary restriction of having primary or unique key and we
>> won't break backward compatibility.
>>
>> If there two identical rows, we might mix the update from different
>> nodes, but then who knew which of them were corresponded across the nodes
>> to start with.
>
> Thanks for the suggestion but we currently do not support WCO and we were
> thinking of fixing this issue before we declare 1.2 beta is generally
> available.

Abbas, WHERE CURRENT OF (WCO) doesn't work from the coordinator, but there is no reason why it shouldn't work at the datanode. So internally, between the coordinator and the datanode, we can always use WCO.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
|
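The cursor idea quoted in this message (one cursor per copy, ordered by all columns, advanced in lockstep, with the update applied at each cursor's current row) can be modelled in a few lines. This is a sketch under simplifying assumptions: copies are Python lists of `(key, value)` tuples, sorting the row positions stands in for `ORDER BY` on every column, and assigning by position stands in for `UPDATE ... WHERE CURRENT OF`. The function names are hypothetical.

```python
def open_cursor(replica):
    """Row positions ordered by all columns, like ORDER BY on every column."""
    return sorted(range(len(replica)), key=lambda i: replica[i])


def update_replicated(replicas, qual, new_value):
    """Walk all copies in lockstep and update matching rows in place.

    Returns (rows_fetched, rows_updated), counted per copy.
    """
    cursors = [open_cursor(r) for r in replicas]
    fetched = updated = 0
    for positions in zip(*cursors):               # FETCH from every cursor
        rows = [r[p] for r, p in zip(replicas, positions)]
        assert all(row == rows[0] for row in rows), "copies disagree"
        fetched += 1
        if qual(rows[0]):                         # coordinator qual, one copy
            for replica, pos in zip(replicas, positions):
                key, _ = replica[pos]
                replica[pos] = (key, new_value)   # UPDATE ... WHERE CURRENT OF
            updated += 1
    return fetched, updated


node1 = [(1, "x"), (2, "y"), (3, "z")]
node2 = [(3, "z"), (1, "x"), (2, "y")]            # same rows, other layout
print(update_replicated([node1, node2], lambda row: row[0] == 2, "Y"))  # (3, 1)
print(node1, node2)
```

The returned counts make the stated downside visible: all three rows are fetched on each copy (and would be locked if the cursor were opened for update) even though the coordinator qual selects only one of them.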
From: Michael P. <mic...@gm...> - 2014-02-14 05:45:09
|
On Fri, Feb 14, 2014 at 11:07 AM, Koichi Suzuki <koi...@gm...> wrote:
> Disconnecting the connection as you suggested will bring another
> problem such as TEMPORARY object in the subsequent queries. We do
> not support TEMPORARY object but I believe we should be consistent on
> this for future releases.

AFAIK, temporary objects are supported, no? There is even a parameter that enforces autocommit instead of 2PC when temp objects are used. This is dangerous, OK, but that's a trade-off, and the user is normally aware of it when using temp tables.

Actually, temporary tables are useful even without 2PC, for example with a join pushed down to a single remote node: it might be better to use a temporary table gathering a list of keys instead of a heavy ANY clause or an array with many values.
--
Michael
|
From: Koichi S. <koi...@gm...> - 2014-02-14 05:21:18
|
It seems to be an issue of PG itself, doesn't it?
---
Koichi Suzuki

2014-02-14 14:06 GMT+09:00 Andrei Martsinchyk <and...@gm...>:
> You are right, the temp objects are problem.
> On the one hand if we run a long query and there was an error on one node we
> want to cancel it on others to avoid unnecessary waiting. On the other hand
> the query may be near its natural end and the cancel may be late and hit the
> next query.
> Just throwing out ideas:
> - Make Cancel more selective and affect only specific query. That means an
> ID for each query to introduce, that should be known to client and way to
> deliver it.
> - Introduce procedure of changing backend key. Old cancel won't affect such
> backend.
> - Before starting new query, check if there is pending cancel and remove it.
> It sounds ridiculous "cancel cancel" but may work, if queries and cancels
> are issued synchronously from single source.
>
> On 14.02.2014 4:07, user "Koichi Suzuki" <koi...@gm...> wrote:
>
>> I misunderstand the implication. Anyway additional wait is separate
>> from your suggestion.
>>
>> Disconnecting the connection as you suggested will bring another
>> problem such as TEMPORARY object in the subsequent queries. We do
>> not support TEMPORARY object but I believe we should be consistent on
>> this for future releases.
>>
>> Thoughts?
>> ---
>> Koichi Suzuki
>>
>> 2014-02-14 2:30 GMT+09:00 Andrei Martsinchyk <and...@gm...>:
>> > Hello,
>> >
>> > Postgres establishes separate connection to deliver Cancel command to the
>> > target session.
>> > On a heavily loaded node it may take fairly long. Longer sleep would help
>> > out, but it means longer recovery after an error.
>> > Better solution is to remove canceled connection from the pool and therefore
>> > do not use it to handle subsequent queries.
>> >
>> > 2014-02-13 11:10 GMT+02:00 Koichi Suzuki <koi...@gm...>:
>> >>
>> >> I think it hits the point. I tested this patch several times and it
>> >> seems to work fine. The delay time (at present 10ms) is short enough
>> >> and it is applied only when we need to cancel a statement.
>> >>
>> >> We should check this into all the master and STABLE branches improving
>> >> magic number with some meaningful name.
>> >>
>> >> Any thoughts?
>> >> ---
>> >> Koichi Suzuki
>> >>
>> >> 2014-01-24 18:25 GMT+09:00 Masataka Saito <pg...@gm...>:
>> >> > Hello,
>> >> >
>> >> > As I've been exasperated by random failures, I'm willing to whip the cause
>> >> > of the issue.
>> >> >
>> >> > This issue is related to cancel of the failed query.
>> >> > When a datanode reports an error of a query, a coordinator sends a cancel
>> >> > request to non-idle nodes, waits the node to get ready and requests nodes to
>> >> > rollback the transaction.
>> >> >
>> >> > Where's the problem? Consider the next case.
>> >> > 1. Datanode A (PID 1) reports an error to coordinator A. ([1] 'E' message)
>> >> > 2. Coordinator A receives [1] and reports an error to a frontend. ([2] 'E' message)
>> >> > 3. Coordinator A starts aborting process and it thinks datanode A (PID 1) is not idle.
>> >> > 4. Coordinator A sends a cancel request about PID 1 to datanode A (PID 2). ([3] cancel message)
>> >> > 5. Datanode A (PID 1) reports ready to coordinator A. ([4] 'Z' message)
>> >> > 6. Coordinator A receives [4] and sends "ROLLBACK TRANSACTION" immediately. ([5] 'Q' message)
>> >> > 7. Datanode A (PID 1) receives [5] and starts processing the query.
>> >> > 8. Datanode A (PID 2) receives [3].
>> >> > 9. Datanode A (PID 2) notify PID 1 of [3].
>> >> > 10. Datanode A (PID 1) cancel processing [5] and reports an error to Coordinator A. ([6] 'E' message)
>> >> > 11. Coordinator A receives [6] and reports an error to a frontend. ([7] 'E' message)
>> >> >
>> >> > [7] makes unexpected output and a test fails.
>> >> >
>> >> > Saying an extreme thing, it could occur that the next query of [5] is
>> >> > cancelled by [3].
>> >> >
>> >> > As far as I know, there's no way to know when to the cancel request get to
>> >> > be processed, I think we can't not wait an experimental duration after
>> >> > cancelling like the attached patch.
>> >> >
>> >> > Does anyone have another cool idea to solve this issue?
>> >> >
>> >> > Regards.
|
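The race in the eleven quoted steps comes down to the order in which the backend sees the cancel relative to the ROLLBACK. The following replay is a deliberately simplified sketch: one backend, a hypothetical event vocabulary (`start`, `fail`, `finish`, `cancel`), and a cancel that always hits whatever statement is in flight. It is not how the datanode is implemented.

```python
def replay(events):
    """Replay events against one backend; return what it reports back."""
    running, log = None, []
    for kind, sql in events:
        if kind == "start":
            running = sql
        elif running is None:
            continue                    # backend is idle: the event is a no-op
        elif kind == "finish":
            log.append((running, "ok"))
            running = None
        elif kind == "fail":
            log.append((running, "error"))
            running = None
        elif kind == "cancel":
            log.append((running, "canceled"))
            running = None
    return log


failed = ("start", "INSERT ...")
rollback = ("start", "ROLLBACK TRANSACTION")

# Cancel arrives while the backend is idle, before the ROLLBACK: harmless.
early = [failed, ("fail", None), ("cancel", None), rollback, ("finish", None)]
# Cancel is delivered late (steps 7-10): it hits the ROLLBACK instead.
late = [failed, ("fail", None), rollback, ("cancel", None), ("finish", None)]

print(replay(early))  # [('INSERT ...', 'error'), ('ROLLBACK TRANSACTION', 'ok')]
print(replay(late))   # [('INSERT ...', 'error'), ('ROLLBACK TRANSACTION', 'canceled')]
```

A fixed sleep after sending the cancel, as in the attached patch, only makes the `late` ordering less likely; it does not rule it out on a heavily loaded node.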
From: Andrei M. <and...@gm...> - 2014-02-14 05:06:53
|
You are right, temp objects are a problem.
On the one hand, if we run a long query and there is an error on one node, we want to cancel it on the others to avoid unnecessary waiting. On the other hand, the query may be near its natural end, so the cancel may arrive late and hit the next query.
Just throwing out ideas:
- Make Cancel more selective so that it affects only a specific query. That means introducing an ID for each query, which must be known to the client, and a way to deliver it.
- Introduce a procedure for changing the backend key. An old cancel won't affect such a backend.
- Before starting a new query, check whether there is a pending cancel and remove it. A "cancel cancel" sounds ridiculous, but it may work if queries and cancels are issued synchronously from a single source.

On 14.02.2014 4:07, user "Koichi Suzuki" <koi...@gm...> wrote:

> I misunderstand the implication. Anyway additional wait is separate
> from your suggestion.
>
> Disconnecting the connection as you suggested will bring another
> problem such as TEMPORARY object in the subsequent queries. We do
> not support TEMPORARY object but I believe we should be consistent on
> this for future releases.
>
> Thoughts?
> ---
> Koichi Suzuki
>
> 2014-02-14 2:30 GMT+09:00 Andrei Martsinchyk <and...@gm...>:
> > Hello,
> >
> > Postgres establishes separate connection to deliver Cancel command to the
> > target session.
> > On a heavily loaded node it may take fairly long. Longer sleep would help
> > out, but it means longer recovery after an error.
> > Better solution is to remove canceled connection from the pool and therefore
> > do not use it to handle subsequent queries.
> >
> > 2014-02-13 11:10 GMT+02:00 Koichi Suzuki <koi...@gm...>:
> >>
> >> I think it hits the point. I tested this patch several times and it
> >> seems to work fine. The delay time (at present 10ms) is short enough
> >> and it is applied only when we need to cancel a statement.
> >>
> >> We should check this into all the master and STABLE branches improving
> >> magic number with some meaningful name.
> >>
> >> Any thoughts?
> >> ---
> >> Koichi Suzuki
|
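The first idea in this message, a Cancel that targets one specific query, might look roughly like this. It is a sketch only: the `Backend` class and the per-query IDs are hypothetical, and the existing cancel mechanism discussed in the thread carries no such identifier, which is why a late cancel can hit the next statement.

```python
class Backend:
    def __init__(self):
        self.current = None     # (query_id, sql) of the statement in flight
        self.log = []

    def start(self, query_id, sql):
        """Begin running a statement tagged with a coordinator-chosen ID."""
        self.current = (query_id, sql)

    def finish(self):
        """Complete the statement in flight, if there is one."""
        if self.current is not None:
            self.log.append((self.current[1], "ok"))
            self.current = None

    def cancel(self, query_id):
        """Cancel only if the targeted query is still the one running."""
        if self.current is not None and self.current[0] == query_id:
            self.log.append((self.current[1], "canceled"))
            self.current = None
            return True
        return False            # stale cancel: ignore it


backend = Backend()
backend.start(1, "INSERT ...")
backend.finish()                       # query 1 ends on its own
backend.start(2, "ROLLBACK TRANSACTION")
print(backend.cancel(1))               # False: the late cancel is ignored
backend.finish()
print(backend.log)
```

The cost noted in the message still applies: the ID has to be generated, made known to whoever issues the cancel, and delivered along with it.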