From: Ashutosh B. <ash...@en...> - 2012-07-11 06:08:10
The refactoring patch is good to be committed.

On Wed, Jul 11, 2012 at 10:33 AM, Michael Paquier <mic...@gm...> wrote:

> Updated patch is attached.
> I attach once again the patch for remote copy, still pending review.
>
> On Tue, Jul 10, 2012 at 6:41 PM, Ashutosh Bapat <ash...@en...> wrote:
>
>> 1. Please name the variables local_hashalgorithm instead of
>> hashalgorithm_loc ("loc" is also used to mean the location info, e.g.
>> GetRelationLocInfo()).
>>
> Done.
>
>> 2. Please rename IsHashDistributable to IsTypeHashDistributable(). Same
>> case with IsModuloDistributable().
>>
> Done.
>
>> 3. In function prologues, add information about input and output
>> variables, esp. in the case of GetRelationDistributionItems().
>>
> Done.
>
>> 4. This is not a change in this patch, but it would be good if you can
>> accommodate it. At line 1102 there is a switch statement that has an
>> action for only a single case, so it had better be replaced with an "if".
>>
> Done.
>
>> 5. SortRelationDistributionNodes had better be a macro, as it does
>> nothing but call qsort().
>>
> Don't agree on that. I feel it is clearer to leave it as an external
> function, as it is used afterwards in a more flexible way by
> redistribution.
>
>> 6. The following comment doesn't make much sense, please remove it. The
>> executor state at the time of table creation and at the time of querying
>> can be completely different. There is no connection.
>> 1218         * We should use session data because Executor uses it as
>> well to run
>> 1219         * commands on nodes.
>>
> Done.
>
>> 7. In GetRelationDistributionNodes(), the node sorting function is called
>> in three places. Instead, you should just build the nodeoid array at
>> these three places and call the sorting function once at the end.
>> Otherwise, if we need to add another "if" case to that function to get
>> the array of nodes in some other way, one has to remember to add the call
>> that sorts the nodes array; that can be avoided by calling the sort
>> function at the end.
>> BuildRelationDistributionNodes() sorts the nodeoids inside it, but you
>> can take that call out of this function.
>>
> Done. Simplifies code.
>
>> 8. Probably not your code, but the function
>> BuildRelationDistributionNodes() does a repalloc() for every new nodeoid
>> it finds. Each repalloc() is costly. Instead, we can allocate memory
>> large enough to contain all members of the list passed. If there are
>> nodes repeated (which is less likely), we will waste a few bytes, but
>> that won't be as expensive as calling repalloc().
>>
>
>> 9. All the renamed functions are marked as "extern"; do you really need
>> them so? Also, I don't understand why these functions are located in
>> heap.c?
>>
> Yes and yes. Do you remember this patch is a base for redistribution?
> Our code has an essential dependency on a static structure inside heap.c
> classifying the tuple attributes, called SysAtt. This dependency is really
> important because thanks to it we can check whether a chosen distribution
> column is a system column or not. In case it is a system column, we
> return an error.
> This is sufficient reason to keep this code inside heap.c and heap.h.
>
>> I hope regression is sane.
>>
> They are, and testing regressions is one of the first things to do when
> reviewing a patch, I believe.
>
> We should honestly move faster on those small reviews, and discuss
> the core of redistribution, which is the real purpose here.
> Thanks.
> --
> Michael Paquier
> http://michael.otacoo.com
>

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Enterprise Postgres Company
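For reference, review points 7 and 8 above (sort once at the end; allocate once instead of calling repalloc() per node) could look roughly like the sketch below. This is only an illustration under stated assumptions, not the actual BuildRelationDistributionNodes() code: build_node_array() and cmp_oid() are hypothetical names, the input is a plain array rather than a List, Oid is a local stand-in typedef, and malloc() stands in for palloc() so the snippet compiles outside the backend.

```c
#include <assert.h>
#include <stdlib.h>

typedef unsigned int Oid;	/* stand-in for the backend's Oid type */

/* qsort() comparator for Oid values (hypothetical helper) */
static int
cmp_oid(const void *a, const void *b)
{
	Oid			x = *(const Oid *) a;
	Oid			y = *(const Oid *) b;

	return (x > y) - (x < y);
}

/*
 * Hypothetical helper: return a sorted array holding the distinct
 * entries of 'input'.  The array is allocated once, sized for the worst
 * case (no duplicates), instead of being grown for every new node.
 * The number of entries kept is stored in *out_count.
 */
static Oid *
build_node_array(const Oid *input, int ninput, int *out_count)
{
	Oid		   *nodes;
	int			count = 0;
	int			i;
	int			j;

	assert(out_count != NULL);
	*out_count = 0;
	if (ninput <= 0)
		return NULL;

	/* Single allocation; wastes a few bytes if nodes are repeated. */
	nodes = malloc(ninput * sizeof(Oid));
	if (nodes == NULL)
		return NULL;

	for (i = 0; i < ninput; i++)
	{
		int			seen = 0;

		/* Skip node OIDs that were already collected. */
		for (j = 0; j < count; j++)
		{
			if (nodes[j] == input[i])
			{
				seen = 1;
				break;
			}
		}
		if (!seen)
			nodes[count++] = input[i];
	}

	/* One sort at the end, whichever way the array was filled. */
	qsort(nodes, count, sizeof(Oid), cmp_oid);

	*out_count = count;
	return nodes;
}
```

The trade-off is the one described in point 8: the array may be slightly larger than needed when nodes are repeated, in exchange for a single allocation. Keeping the qsort() call as the last step means a new way of filling the array cannot forget to sort it.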