From: Ashutosh B. <ash...@en...> - 2012-07-16 12:20:15
I have finally finished reviewing this patch. One of the problems with this patch is extensibility. At some point in the future (probably the very near future), we should allow other table-rewriting operations, adding a column for example, to be done together with redistribution, to reduce the total time ALTER TABLE takes when both are requested at once. Basically, we should allow all the table rewriting to be done in a single pass, as PostgreSQL's ALTER TABLE does. The method used here leaves no room for adding such a combined operation later. This means that when it comes to supporting that capability, we will have to rewrite the whole thing (leaving maybe the transformation and node manipulation aside). That's why I would like to see this fixed in ATRewriteTable, where we have a sequence for every row: 1. get the row, 2. rewrite the row, 3. write the new row.

Anyway, I have the following comments on the patch itself:

1. Need a better name for PgxcClassAlter().
2. In ATController, why do you need to move ATCheckCmd below ATPrepCmd?
3. The comments on lines 2933 and 2941 can be worded as "Perform pre-catalog-update redistribution operations" and "Perform post-catalog-update redistribution operations".
4. Why can't redistribution be run in a transaction block? Does the redistribution run as a transaction? What happens if the server crashes while redistribution is in progress, at any of its stages?
5. Do you want to rename BuildDistribCommands() to BuildReDistribCommands()?
6. In function BuildDistribCommands(), what does the variable new_oids signify? I think it signifies the new node OIDs; if so, please name it accordingly.
7. The names of function tree_build_entry() and its minions look pretty generic; we need a prefix for these functions, like pgxc_redist_ or something similar. BTW, what does "tree" indicate there? Nothing like a tree is being built; it's just a list of commands.
8. Why are you using repalloc in tree_add_single_command()? You could rather create a list and append to that list.
9. We don't need two separate functions, Pre and Post - Update(); it can be a single function which takes Pre/Post as a flag and runs the relevant commands. BTW, just PreUpdate does not show its real meaning; it should be something like PreCatalogUpdate().
10. Why does every catalog change to pgxc_class invalidate the cache? We should do it only once for a given command.
11. The functions add_oid_list, delete_oid_list etc. use OID arrays, so why use the suffix _list() in those functions? I do not like the frequent repalloc that happens in both these functions. Worst is the movement of array elements in delete_node_list(). Any mistake here would be disastrous. PostgreSQL code is very generous in using memory to keep things simple. You can use lists or bitmaps if you want to save space, but not repalloc.
12. Instead of a distrib directory holding a single file, you can use the locator directory with the distrib.c file (better if you could use a name like at_distrib or redistrib, since the files really belong to ALTER TABLE).
13. In makeRemoteCopyOptions you are using palloc0(), which sets all the memory to 0, so bools automatically get the value false and lists are NIL; you don't need to set those explicitly.
14. I do not see any tests for the sanity of views, rules and other objects which depend upon the table after the table is redistributed using this method. Maybe it's a good idea to provide a prologue at the beginning of the testcase specifying how the testcase is laid out.

On Mon, Jul 16, 2012 at 4:56 PM, Ashutosh Bapat <ash...@en...> wrote:

>> In this case you do not need any code! You could also do a simple CREATE
>> TABLE AS to redistribute the table as you wish to a new table, drop the
>> old table, and rename the new table with the old name. This could also be
>> done with 1.0.
>> You should also have a look at my latest patch, it already includes the
>> maximum optimizations possible for replicated table redistribution,
>> particularly distrib.c. Just by looking at that you will see that a node
>> level control is more than necessary.
>>
>
> This method would change the OID of the table and thus invalidate all the
> view definitions, rules etc. depending on this table. We don't want that to
> happen. The function would not change the OID, but would write the data in
> the table's space itself.
>
>> --
>> Michael Paquier
>> http://michael.otacoo.com
>
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Enterprise Postgres Company

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Enterprise Postgres Company
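
P.S. To illustrate comment 13 above, here is a minimal standalone sketch of why the explicit assignments are redundant. It is not code from the patch: CopyOptionsSketch and its fields are hypothetical names made up for the illustration, and calloc() stands in for palloc0() so that the snippet compiles outside the PostgreSQL tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for the options struct built by
 * makeRemoteCopyOptions(); the field names are illustrative only.
 */
typedef struct CopyOptionsSketch
{
    bool    is_binary;      /* false once the memory is zeroed */
    bool    has_header;     /* false once the memory is zeroed */
    void   *column_list;    /* NULL plays the role of NIL here */
    char   *delimiter;      /* NULL until the caller sets it */
} CopyOptionsSketch;

/*
 * calloc() is an assumption standing in for palloc0(): both return
 * zero-filled memory. On the platforms PostgreSQL supports, an all-zero
 * pointer is a NULL pointer.
 */
CopyOptionsSketch *
make_copy_options_sketch(void)
{
    CopyOptionsSketch *opts = calloc(1, sizeof(CopyOptionsSketch));

    /* No "opts->is_binary = false;" or "opts->column_list = NULL;" needed. */
    return opts;
}

/* Returns true when every field still holds its zero-initialized default. */
bool
copy_options_are_default(const CopyOptionsSketch *opts)
{
    assert(opts != NULL);
    return !opts->is_binary &&
           !opts->has_header &&
           opts->column_list == NULL &&
           opts->delimiter == NULL;
}
```

A caller would only assign the fields that differ from these defaults, which keeps the constructor short and makes the non-default settings stand out.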