From: Andrei M. <and...@gm...> - 2011-05-24 17:18:59
Hi Abbas,

I looked at the code and see that for some data types compute_hash() returns not a hash code but the original value:

+		case INT8OID:
+			/* This gives added advantage that
+			 * a = 8446744073709551359
+			 * and a = 8446744073709551359::int8 both work*/
+			return DatumGetInt32(value);
+		case INT2OID:
+			return DatumGetInt16(value);
+		case OIDOID:
+			return DatumGetObjectId(value);
+		case INT4OID:
+			return DatumGetInt32(value);
+		case BOOLOID:
+			return DatumGetBool(value);

That is not a critical error, and it gives slightly faster calculation, but it may cause poor distribution when, for example, the distribution column contains only even or only odd values: some nodes may then hold many rows while others hold none. I suggest using the hashintX functions here.

And another point: OIDs are generated on the data nodes, so does it make sense to allow hashing on them here, where the value is supposed to come from the coordinator?

2011/5/24 Abbas Butt <ga...@us...>
> Project "Postgres-XC".
>
> The branch, master has been updated
>        via  49b66c77343ae1e9921118e0c902b1528f1cc2ae (commit)
>       from  87a62879ab3492e3dd37d00478ffa857639e2b85 (commit)
>
>
> - Log -----------------------------------------------------------------
> commit 49b66c77343ae1e9921118e0c902b1528f1cc2ae
> Author: Abbas <abb...@en...>
> Date:   Tue May 24 17:06:30 2011 +0500
>
>     This patch adds support for the following data types to be used as
>     distribution key
>
>     INT8, INT2, OID, INT4, BOOL, INT2VECTOR, OIDVECTOR
>     CHAR, NAME, TEXT, BPCHAR, BYTEA, VARCHAR
>     FLOAT4, FLOAT8, NUMERIC, CASH
>     ABSTIME, RELTIME, DATE, TIME, TIMESTAMP, TIMESTAMPTZ, INTERVAL, TIMETZ
>
>     A new function compute_hash is added in the system which is used to
>     compute hash of any of the supported data types.
>     The computed hash is used in the function GetRelationNodes to
>     find the targeted data node.
>
>     EXPLAIN for RemoteQuery has been modified to show the number of
>     data nodes targeted for a certain query. This is essential
>     to spot bugs in the optimizer in case it is targeting all nodes
>     by mistake.
>
>     In case of optimisations where comparison with a constant leads
>     the optimiser to point to a single data node, there were a couple
>     of mistakes in examine_conditions_walker.
>     First it was not supporting RelabelType, which represents a "dummy"
>     type coercion between two binary compatible datatypes.
>     This was resulting in the optimization not working for varchar
>     type for example.
>     Secondly it was not catering for the case where the user specifies the
>     condition such that the constant expression is written towards LHS and
>     the variable towards the RHS of the = operator, i.e. 23 = a
>
>     A number of test cases have been added in regression to make sure
>     further enhancements do not break this functionality.
>
>     This change has a sizeable impact on current regression tests in the
>     following manner.
>
>     1. horology test case crashes the server and has been commented out in
>        serial_schedule.
>     2. In money test case the planner optimizer wrongly kicks in to optimize
>        this query
>        SELECT m = '$123.01' FROM money_data;
>        to point to a single data node.
>     3. There were a few un-necessary EXPLAINs in create_index test case.
>        Since we have added support in EXPLAIN to show the number of
>        data nodes targeted for RemoteQuery, this test case was producing
>        output dependent on the cluster configuration.
>     4.
In guc test case > DROP ROLE temp_reset_user; > results in > ERROR: permission denied to drop role > > diff --git a/src/backend/access/hash/hashfunc.c > b/src/backend/access/hash/hashfunc.c > index 577873b..22766c5 100644 > --- a/src/backend/access/hash/hashfunc.c > +++ b/src/backend/access/hash/hashfunc.c > @@ -28,6 +28,13 @@ > > #include "access/hash.h" > > +#ifdef PGXC > +#include "catalog/pg_type.h" > +#include "utils/builtins.h" > +#include "utils/timestamp.h" > +#include "utils/date.h" > +#include "utils/nabstime.h" > +#endif > > /* Note: this is used for both "char" and boolean datatypes */ > Datum > @@ -521,3 +528,91 @@ hash_uint32(uint32 k) > /* report the result */ > return UInt32GetDatum(c); > } > + > +#ifdef PGXC > +/* > + * compute_hash() -- Generaic hash function for all datatypes > + * > + */ > + > +Datum > +compute_hash(Oid type, Datum value, int *pErr) > +{ > + Assert(pErr); > + > + *pErr = 0; > + > + if (value == NULL) > + { > + *pErr = 1; > + return 0; > + } > + > + switch(type) > + { > + case INT8OID: > + /* This gives added advantage that > + * a = 8446744073709551359 > + * and a = 8446744073709551359::int8 both work*/ > + return DatumGetInt32(value); > + case INT2OID: > + return DatumGetInt16(value); > + case OIDOID: > + return DatumGetObjectId(value); > + case INT4OID: > + return DatumGetInt32(value); > + case BOOLOID: > + return DatumGetBool(value); > + > + case CHAROID: > + return DirectFunctionCall1(hashchar, value); > + case NAMEOID: > + return DirectFunctionCall1(hashname, value); > + case INT2VECTOROID: > + return DirectFunctionCall1(hashint2vector, value); > + > + case VARCHAROID: > + case TEXTOID: > + return DirectFunctionCall1(hashtext, value); > + > + case OIDVECTOROID: > + return DirectFunctionCall1(hashoidvector, value); > + case FLOAT4OID: > + return DirectFunctionCall1(hashfloat4, value); > + case FLOAT8OID: > + return DirectFunctionCall1(hashfloat8, value); > + > + case ABSTIMEOID: > + return DatumGetAbsoluteTime(value); > + case RELTIMEOID: > + return DatumGetRelativeTime(value); > + case CASHOID: > + return DirectFunctionCall1(hashint8, value); > + > + case BPCHAROID: > + return DirectFunctionCall1(hashbpchar, value); > + case BYTEAOID: > + return DirectFunctionCall1(hashvarlena, value); > + > + case DATEOID: > + return DatumGetDateADT(value); > + case TIMEOID: > + return DirectFunctionCall1(time_hash, value); > + case TIMESTAMPOID: > + return DirectFunctionCall1(timestamp_hash, value); > + case TIMESTAMPTZOID: > + return DirectFunctionCall1(timestamp_hash, value); > + case INTERVALOID: > + return DirectFunctionCall1(interval_hash, value); > + case TIMETZOID: > + return DirectFunctionCall1(timetz_hash, value); > + > + case NUMERICOID: > + return DirectFunctionCall1(hash_numeric, value); > + default: > + *pErr = 1; > + return 0; > + } > +} > + > +#endif > diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c > index 613d5ff..714190f 100644 > --- a/src/backend/commands/copy.c > +++ b/src/backend/commands/copy.c > @@ -1645,14 +1645,14 @@ CopyTo(CopyState cstate) > } > > #ifdef PGXC > - if (IS_PGXC_COORDINATOR && cstate->rel_loc) > + if (IS_PGXC_COORDINATOR && cstate->rel_loc) > { > cstate->processed = DataNodeCopyOut( > - GetRelationNodes(cstate->rel_loc, NULL, > RELATION_ACCESS_READ), > + GetRelationNodes(cstate->rel_loc, 0, > UNKNOWNOID, RELATION_ACCESS_READ), > cstate->connections, > cstate->copy_file); > } > - else > + else > { > #endif > > @@ -2417,15 +2417,18 @@ CopyFrom(CopyState cstate) > #ifdef PGXC > if (IS_PGXC_COORDINATOR && 
cstate->rel_loc) > { > - Datum *dist_col_value = NULL; > + Datum dist_col_value; > + Oid dist_col_type = UNKNOWNOID; > > if (cstate->idx_dist_by_col >= 0 && > !nulls[cstate->idx_dist_by_col]) > - dist_col_value = > &values[cstate->idx_dist_by_col]; > + { > + dist_col_value = > values[cstate->idx_dist_by_col]; > + dist_col_type = > attr[cstate->idx_dist_by_col]->atttypid; > + } > > if (DataNodeCopyIn(cstate->line_buf.data, > cstate->line_buf.len, > - > GetRelationNodes(cstate->rel_loc, (long *)dist_col_value, > - > RELATION_ACCESS_INSERT), > + > GetRelationNodes(cstate->rel_loc, dist_col_value, dist_col_type, > RELATION_ACCESS_INSERT), > cstate->connections)) > ereport(ERROR, > > (errcode(ERRCODE_CONNECTION_EXCEPTION), > @@ -4037,7 +4040,8 @@ DoInsertSelectCopy(EState *estate, TupleTableSlot > *slot) > HeapTuple tuple; > Datum *values; > bool *nulls; > - Datum *dist_col_value = NULL; > + Datum dist_col_value; > + Oid dist_col_type; > MemoryContext oldcontext; > CopyState cstate; > > @@ -4082,6 +4086,11 @@ DoInsertSelectCopy(EState *estate, TupleTableSlot > *slot) > cstate->fe_msgbuf = makeStringInfo(); > attr = cstate->tupDesc->attrs; > > + if (cstate->idx_dist_by_col >= 0) > + dist_col_type = > attr[cstate->idx_dist_by_col]->atttypid; > + else > + dist_col_type = UNKNOWNOID; > + > /* Get info about the columns we need to process. */ > cstate->out_functions = (FmgrInfo *) > palloc(cstate->tupDesc->natts * sizeof(FmgrInfo)); > foreach(lc, cstate->attnumlist) > @@ -4152,12 +4161,14 @@ DoInsertSelectCopy(EState *estate, TupleTableSlot > *slot) > > /* Get dist column, if any */ > if (cstate->idx_dist_by_col >= 0 && !nulls[cstate->idx_dist_by_col]) > - dist_col_value = &values[cstate->idx_dist_by_col]; > + dist_col_value = values[cstate->idx_dist_by_col]; > + else > + dist_col_type = UNKNOWNOID; > > /* Send item to the appropriate data node(s) (buffer) */ > if (DataNodeCopyIn(cstate->fe_msgbuf->data, > cstate->fe_msgbuf->len, > - GetRelationNodes(cstate->rel_loc, (long > *)dist_col_value, RELATION_ACCESS_INSERT), > + GetRelationNodes(cstate->rel_loc, > dist_col_value, dist_col_type, RELATION_ACCESS_INSERT), > cstate->connections)) > ereport(ERROR, > (errcode(ERRCODE_CONNECTION_EXCEPTION), > diff --git a/src/backend/commands/explain.c > b/src/backend/commands/explain.c > index a361186..fe74569 100644 > --- a/src/backend/commands/explain.c > +++ b/src/backend/commands/explain.c > @@ -851,8 +851,28 @@ ExplainNode(Plan *plan, PlanState *planstate, > case T_WorkTableScan: > #ifdef PGXC > case T_RemoteQuery: > + { > + RemoteQuery *remote_query = (RemoteQuery *) > plan; > + int pnc, nc; > + > + pnc = 0; > + nc = 0; > + if (remote_query->exec_nodes != NULL) > + { > + if > (remote_query->exec_nodes->primarynodelist != NULL) > + { > + pnc = > list_length(remote_query->exec_nodes->primarynodelist); > + appendStringInfo(es->str, " > (Primary Node Count [%d])", pnc); > + } > + if > (remote_query->exec_nodes->nodelist) > + { > + nc = > list_length(remote_query->exec_nodes->nodelist); > + appendStringInfo(es->str, " > (Node Count [%d])", nc); > + } > + } > #endif > - ExplainScanTarget((Scan *) plan, es); > + ExplainScanTarget((Scan *) plan, es); > + } > break; > case T_BitmapIndexScan: > { > diff --git a/src/backend/optimizer/plan/createplan.c > b/src/backend/optimizer/plan/createplan.c > index b6252a3..c03938d 100644 > --- a/src/backend/optimizer/plan/createplan.c > +++ b/src/backend/optimizer/plan/createplan.c > @@ -2418,9 +2418,7 @@ create_remotequery_plan(PlannerInfo *root, Path > *best_path, > 
scan_plan->exec_nodes->baselocatortype = > rel_loc_info->locatorType; > else > scan_plan->exec_nodes->baselocatortype = '\0'; > - scan_plan->exec_nodes = GetRelationNodes(rel_loc_info, > - > NULL, > - > RELATION_ACCESS_READ); > + scan_plan->exec_nodes = GetRelationNodes(rel_loc_info, 0, > UNKNOWNOID, RELATION_ACCESS_READ); > copy_path_costsize(&scan_plan->scan.plan, best_path); > > /* PGXCTODO - get better estimates */ > @@ -5024,8 +5022,7 @@ create_remotedelete_plan(PlannerInfo *root, Plan > *topplan) > fstep->sql_statement = pstrdup(buf->data); > fstep->combine_type = COMBINE_TYPE_SAME; > fstep->read_only = false; > - fstep->exec_nodes = GetRelationNodes(rel_loc_info, NULL, > - > RELATION_ACCESS_UPDATE); > + fstep->exec_nodes = GetRelationNodes(rel_loc_info, 0, > UNKNOWNOID, RELATION_ACCESS_UPDATE); > } > else > { > diff --git a/src/backend/pgxc/locator/locator.c > b/src/backend/pgxc/locator/locator.c > index 0ab157d..33fe8ac 100644 > --- a/src/backend/pgxc/locator/locator.c > +++ b/src/backend/pgxc/locator/locator.c > @@ -41,7 +41,7 @@ > > #include "catalog/pgxc_class.h" > #include "catalog/namespace.h" > - > +#include "access/hash.h" > > /* > * PGXCTODO For prototype, relations use the same hash mapping table. > @@ -206,7 +206,32 @@ char *pColName; > bool > IsHashDistributable(Oid col_type) > { > - if (col_type == INT4OID || col_type == INT2OID) > + if(col_type == INT8OID > + || col_type == INT2OID > + || col_type == OIDOID > + || col_type == INT4OID > + || col_type == BOOLOID > + || col_type == CHAROID > + || col_type == NAMEOID > + || col_type == INT2VECTOROID > + || col_type == TEXTOID > + || col_type == OIDVECTOROID > + || col_type == FLOAT4OID > + || col_type == FLOAT8OID > + || col_type == ABSTIMEOID > + || col_type == RELTIMEOID > + || col_type == CASHOID > + || col_type == BPCHAROID > + || col_type == BYTEAOID > + || col_type == VARCHAROID > + || col_type == DATEOID > + || col_type == TIMEOID > + || col_type == TIMESTAMPOID > + || col_type == TIMESTAMPTZOID > + || col_type == INTERVALOID > + || col_type == TIMETZOID > + || col_type == NUMERICOID > + ) > return true; > > return false; > @@ -296,7 +321,32 @@ RelationLocInfo *rel_loc_info; > bool > IsModuloDistributable(Oid col_type) > { > - if (col_type == INT4OID || col_type == INT2OID) > + if(col_type == INT8OID > + || col_type == INT2OID > + || col_type == OIDOID > + || col_type == INT4OID > + || col_type == BOOLOID > + || col_type == CHAROID > + || col_type == NAMEOID > + || col_type == INT2VECTOROID > + || col_type == TEXTOID > + || col_type == OIDVECTOROID > + || col_type == FLOAT4OID > + || col_type == FLOAT8OID > + || col_type == ABSTIMEOID > + || col_type == RELTIMEOID > + || col_type == CASHOID > + || col_type == BPCHAROID > + || col_type == BYTEAOID > + || col_type == VARCHAROID > + || col_type == DATEOID > + || col_type == TIMEOID > + || col_type == TIMESTAMPOID > + || col_type == TIMESTAMPTZOID > + || col_type == INTERVALOID > + || col_type == TIMETZOID > + || col_type == NUMERICOID > + ) > return true; > > return false; > @@ -409,13 +459,13 @@ GetRoundRobinNode(Oid relid) > * The returned List is a copy, so it should be freed when finished. 
> */ > ExecNodes * > -GetRelationNodes(RelationLocInfo *rel_loc_info, long *partValue, > - RelationAccessType accessType) > +GetRelationNodes(RelationLocInfo *rel_loc_info, Datum valueForDistCol, Oid > typeOfValueForDistCol, RelationAccessType accessType) > { > ListCell *prefItem; > ListCell *stepItem; > ExecNodes *exec_nodes; > - > + long hashValue; > + int nError; > > if (rel_loc_info == NULL) > return NULL; > @@ -480,10 +530,10 @@ GetRelationNodes(RelationLocInfo *rel_loc_info, long > *partValue, > break; > > case LOCATOR_TYPE_HASH: > - > - if (partValue != NULL) > + hashValue = compute_hash(typeOfValueForDistCol, > valueForDistCol, &nError); > + if (nError == 0) > /* in prototype, all partitioned tables use > same map */ > - exec_nodes->nodelist = lappend_int(NULL, > get_node_from_hash(hash_range_int(*partValue))); > + exec_nodes->nodelist = lappend_int(NULL, > get_node_from_hash(hash_range_int(hashValue))); > else > if (accessType == RELATION_ACCESS_INSERT) > /* Insert NULL to node 1 */ > @@ -494,9 +544,10 @@ GetRelationNodes(RelationLocInfo *rel_loc_info, long > *partValue, > break; > > case LOCATOR_TYPE_MODULO: > - if (partValue != NULL) > + hashValue = compute_hash(typeOfValueForDistCol, > valueForDistCol, &nError); > + if (nError == 0) > /* in prototype, all partitioned tables use > same map */ > - exec_nodes->nodelist = lappend_int(NULL, > get_node_from_modulo(compute_modulo(*partValue))); > + exec_nodes->nodelist = lappend_int(NULL, > get_node_from_modulo(compute_modulo(hashValue))); > else > if (accessType == RELATION_ACCESS_INSERT) > /* Insert NULL to node 1 */ > @@ -750,7 +801,6 @@ RelationLocInfo * > GetRelationLocInfo(Oid relid) > { > RelationLocInfo *ret_loc_info = NULL; > - char *namespace; > > Relation rel = relation_open(relid, AccessShareLock); > > diff --git a/src/backend/pgxc/plan/planner.c > b/src/backend/pgxc/plan/planner.c > index 2448a74..4873f19 100644 > --- a/src/backend/pgxc/plan/planner.c > +++ b/src/backend/pgxc/plan/planner.c > @@ -43,20 +43,23 @@ > #include "utils/lsyscache.h" > #include "utils/portal.h" > #include "utils/syscache.h" > - > +#include "utils/numeric.h" > +#include "access/hash.h" > +#include "utils/timestamp.h" > +#include "utils/date.h" > > /* > * Convenient format for literal comparisons > * > - * PGXCTODO - make constant type Datum, handle other types > */ > typedef struct > { > - Oid relid; > - RelationLocInfo *rel_loc_info; > - Oid attrnum; > - char *col_name; > - long constant; /* assume long PGXCTODO - > should be Datum */ > + Oid relid; > + RelationLocInfo *rel_loc_info; > + Oid attrnum; > + char *col_name; > + Datum constValue; > + Oid constType; > } Literal_Comparison; > > /* > @@ -471,15 +474,12 @@ get_base_var(Var *var, XCWalkerContext *context) > static void > get_plan_nodes_insert(PlannerInfo *root, RemoteQuery *step) > { > - Query *query = root->parse; > - RangeTblEntry *rte; > - RelationLocInfo *rel_loc_info; > - Const *constant; > - ListCell *lc; > - long part_value; > - long *part_value_ptr = NULL; > - Expr *eval_expr = NULL; > - > + Query *query = root->parse; > + RangeTblEntry *rte; > + RelationLocInfo *rel_loc_info; > + Const *constant; > + ListCell *lc; > + Expr *eval_expr = NULL; > > step->exec_nodes = NULL; > > @@ -568,7 +568,7 @@ get_plan_nodes_insert(PlannerInfo *root, RemoteQuery > *step) > if (!lc) > { > /* Skip rest, handle NULL */ > - step->exec_nodes = GetRelationNodes(rel_loc_info, > NULL, RELATION_ACCESS_INSERT); > + step->exec_nodes = GetRelationNodes(rel_loc_info, > 0, UNKNOWNOID, RELATION_ACCESS_INSERT); > 
return; > } > > @@ -650,21 +650,11 @@ get_plan_nodes_insert(PlannerInfo *root, RemoteQuery > *step) > } > > constant = (Const *) checkexpr; > - > - if (constant->consttype == INT4OID || > - constant->consttype == INT2OID || > - constant->consttype == INT8OID) > - { > - part_value = (long) constant->constvalue; > - part_value_ptr = &part_value; > - } > - /* PGXCTODO - handle other data types */ > } > } > > /* single call handles both replicated and partitioned types */ > - step->exec_nodes = GetRelationNodes(rel_loc_info, part_value_ptr, > - > RELATION_ACCESS_INSERT); > + step->exec_nodes = GetRelationNodes(rel_loc_info, > constant->constvalue, constant->consttype, RELATION_ACCESS_INSERT); > > if (eval_expr) > pfree(eval_expr); > @@ -1047,6 +1037,28 @@ examine_conditions_walker(Node *expr_node, > XCWalkerContext *context) > { > Expr *arg1 = linitial(opexpr->args); > Expr *arg2 = lsecond(opexpr->args); > + RelabelType *rt; > + Expr *targ; > + > + if (IsA(arg1, RelabelType)) > + { > + rt = arg1; > + arg1 = rt->arg; > + } > + > + if (IsA(arg2, RelabelType)) > + { > + rt = arg2; > + arg2 = rt->arg; > + } > + > + /* Handle constant = var */ > + if (IsA(arg2, Var)) > + { > + targ = arg1; > + arg1 = arg2; > + arg2 = targ; > + } > > /* Look for a table */ > if (IsA(arg1, Var)) > @@ -1134,7 +1146,8 @@ examine_conditions_walker(Node *expr_node, > XCWalkerContext *context) > lit_comp->relid = > column_base->relid; > lit_comp->rel_loc_info = > rel_loc_info1; > lit_comp->col_name = > column_base->colname; > - lit_comp->constant = > constant->constvalue; > + lit_comp->constValue = > constant->constvalue; > + lit_comp->constType = > constant->consttype; > > > context->conditions->partitioned_literal_comps = lappend( > > context->conditions->partitioned_literal_comps, > @@ -1742,9 +1755,7 @@ get_plan_nodes_walker(Node *query_node, > XCWalkerContext *context) > if (rel_loc_info->locatorType != LOCATOR_TYPE_HASH && > rel_loc_info->locatorType != LOCATOR_TYPE_MODULO) > /* do not need to determine partitioning expression > */ > - context->query_step->exec_nodes = > GetRelationNodes(rel_loc_info, > - > NULL, > - > context->accessType); > + context->query_step->exec_nodes = > GetRelationNodes(rel_loc_info, 0, UNKNOWNOID, context->accessType); > > /* Note replicated table usage for determining safe queries > */ > if (context->query_step->exec_nodes) > @@ -1800,9 +1811,7 @@ get_plan_nodes_walker(Node *query_node, > XCWalkerContext *context) > { > Literal_Comparison *lit_comp = (Literal_Comparison > *) lfirst(lc); > > - test_exec_nodes = GetRelationNodes( > - lit_comp->rel_loc_info, > &(lit_comp->constant), > - RELATION_ACCESS_READ); > + test_exec_nodes = > GetRelationNodes(lit_comp->rel_loc_info, lit_comp->constValue, > lit_comp->constType, RELATION_ACCESS_READ); > > test_exec_nodes->tableusagetype = table_usage_type; > if (context->query_step->exec_nodes == NULL) > @@ -1828,9 +1837,7 @@ get_plan_nodes_walker(Node *query_node, > XCWalkerContext *context) > parent_child = (Parent_Child_Join *) > > linitial(context->conditions->partitioned_parent_child); > > - context->query_step->exec_nodes = > GetRelationNodes(parent_child->rel_loc_info1, > - > NULL, > - > context->accessType); > + context->query_step->exec_nodes = > GetRelationNodes(parent_child->rel_loc_info1, 0, UNKNOWNOID, > context->accessType); > context->query_step->exec_nodes->tableusagetype = > table_usage_type; > } > > @@ -3378,8 +3385,6 @@ GetHashExecNodes(RelationLocInfo *rel_loc_info, > ExecNodes **exec_nodes, const Ex > Expr *checkexpr; > Expr 
*eval_expr = NULL; > Const *constant; > - long part_value; > - long *part_value_ptr = NULL; > > eval_expr = (Expr *) eval_const_expressions(NULL, (Node *)expr); > checkexpr = get_numeric_constant(eval_expr); > @@ -3389,17 +3394,8 @@ GetHashExecNodes(RelationLocInfo *rel_loc_info, > ExecNodes **exec_nodes, const Ex > > constant = (Const *) checkexpr; > > - if (constant->consttype == INT4OID || > - constant->consttype == INT2OID || > - constant->consttype == INT8OID) > - { > - part_value = (long) constant->constvalue; > - part_value_ptr = &part_value; > - } > - > /* single call handles both replicated and partitioned types */ > - *exec_nodes = GetRelationNodes(rel_loc_info, part_value_ptr, > - > RELATION_ACCESS_INSERT); > + *exec_nodes = GetRelationNodes(rel_loc_info, constant->constvalue, > constant->consttype, RELATION_ACCESS_INSERT); > if (eval_expr) > pfree(eval_expr); > > diff --git a/src/backend/pgxc/pool/execRemote.c > b/src/backend/pgxc/pool/execRemote.c > index 75aca21..76e3eef 100644 > --- a/src/backend/pgxc/pool/execRemote.c > +++ b/src/backend/pgxc/pool/execRemote.c > @@ -1061,7 +1061,8 @@ BufferConnection(PGXCNodeHandle *conn) > RemoteQueryState *combiner = conn->combiner; > MemoryContext oldcontext; > > - Assert(conn->state == DN_CONNECTION_STATE_QUERY && combiner); > + if (combiner == NULL || conn->state != DN_CONNECTION_STATE_QUERY) > + return; > > /* > * When BufferConnection is invoked CurrentContext is related to > other > @@ -3076,9 +3077,8 @@ get_exec_connections(RemoteQueryState *planstate, > if (!isnull) > { > RelationLocInfo *rel_loc_info = > GetRelationLocInfo(exec_nodes->relid); > - ExecNodes *nodes = > GetRelationNodes(rel_loc_info, > - > (long *) &partvalue, > - > exec_nodes->accesstype); > + /* PGXCTODO what is the type of > partvalue here*/ > + ExecNodes *nodes = > GetRelationNodes(rel_loc_info, partvalue, UNKNOWNOID, > exec_nodes->accesstype); > if (nodes) > { > nodelist = nodes->nodelist; > diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c > index 415fc47..6d7939b 100644 > --- a/src/backend/tcop/postgres.c > +++ b/src/backend/tcop/postgres.c > @@ -670,18 +670,18 @@ pg_analyze_and_rewrite(Node *parsetree, const char > *query_string, > querytree_list = pg_rewrite_query(query); > > #ifdef PGXC > - if (IS_PGXC_COORDINATOR && !IsConnFromCoord()) > - { > - ListCell *lc; > - > - foreach(lc, querytree_list) > - { > - Query *query = (Query *) lfirst(lc); > - > - if (query->sql_statement == NULL) > - query->sql_statement = pstrdup(query_string); > - } > - } > + if (IS_PGXC_COORDINATOR && !IsConnFromCoord()) > + { > + ListCell *lc; > + > + foreach(lc, querytree_list) > + { > + Query *query = (Query *) lfirst(lc); > + > + if (query->sql_statement == NULL) > + query->sql_statement = > pstrdup(query_string); > + } > + } > #endif > > TRACE_POSTGRESQL_QUERY_REWRITE_DONE(query_string); > @@ -1043,7 +1043,7 @@ exec_simple_query(const char *query_string) > > querytree_list = pg_analyze_and_rewrite(parsetree, > query_string, > > NULL, 0); > - > + > plantree_list = pg_plan_queries(querytree_list, 0, NULL); > > /* Done with the snapshot used for parsing/planning */ > diff --git a/src/include/access/hash.h b/src/include/access/hash.h > index d5899f4..4aaffaa 100644 > --- a/src/include/access/hash.h > +++ b/src/include/access/hash.h > @@ -353,4 +353,8 @@ extern OffsetNumber _hash_binsearch_last(Page page, > uint32 hash_value); > extern void hash_redo(XLogRecPtr lsn, XLogRecord *record); > extern void hash_desc(StringInfo buf, uint8 xl_info, char *rec); > > 
+#ifdef PGXC > +extern Datum compute_hash(Oid type, Datum value, int *pErr); > +#endif > + > #endif /* HASH_H */ > diff --git a/src/include/pgxc/locator.h b/src/include/pgxc/locator.h > index 9f669d9..9ee983c 100644 > --- a/src/include/pgxc/locator.h > +++ b/src/include/pgxc/locator.h > @@ -100,8 +100,7 @@ extern char ConvertToLocatorType(int disttype); > extern char *GetRelationHashColumn(RelationLocInfo *rel_loc_info); > extern RelationLocInfo *GetRelationLocInfo(Oid relid); > extern RelationLocInfo *CopyRelationLocInfo(RelationLocInfo *src_info); > -extern ExecNodes *GetRelationNodes(RelationLocInfo *rel_loc_info, long > *partValue, > - RelationAccessType accessType); > +extern ExecNodes *GetRelationNodes(RelationLocInfo *rel_loc_info, Datum > valueForDistCol, Oid typeOfValueForDistCol, RelationAccessType accessType); > extern bool IsHashColumn(RelationLocInfo *rel_loc_info, char > *part_col_name); > extern bool IsHashColumnForRelId(Oid relid, char *part_col_name); > extern int GetRoundRobinNode(Oid relid); > diff --git a/src/test/regress/expected/create_index_1.out > b/src/test/regress/expected/create_index_1.out > index 52fdc91..ab3807c 100644 > --- a/src/test/regress/expected/create_index_1.out > +++ b/src/test/regress/expected/create_index_1.out > @@ -174,15 +174,10 @@ SELECT count(*) FROM point_tbl p WHERE p.f1 ~= '(-5, > -12)'; > SET enable_seqscan = OFF; > SET enable_indexscan = ON; > SET enable_bitmapscan = ON; > -EXPLAIN (COSTS OFF) > -SELECT * FROM fast_emp4000 > - WHERE home_base @ '(200,200),(2000,1000)'::box > - ORDER BY (home_base[0])[0]; > - QUERY PLAN > ----------------- > - Data Node Scan > -(1 row) > - > +--EXPLAIN (COSTS OFF) > +--SELECT * FROM fast_emp4000 > +-- WHERE home_base @ '(200,200),(2000,1000)'::box > +-- ORDER BY (home_base[0])[0]; > SELECT * FROM fast_emp4000 > WHERE home_base @ '(200,200),(2000,1000)'::box > ORDER BY (home_base[0])[0]; > @@ -190,40 +185,25 @@ SELECT * FROM fast_emp4000 > ----------- > (0 rows) > > -EXPLAIN (COSTS OFF) > -SELECT count(*) FROM fast_emp4000 WHERE home_base && > '(1000,1000,0,0)'::box; > - QUERY PLAN > ----------------- > - Data Node Scan > -(1 row) > - > +--EXPLAIN (COSTS OFF) > +--SELECT count(*) FROM fast_emp4000 WHERE home_base && > '(1000,1000,0,0)'::box; > SELECT count(*) FROM fast_emp4000 WHERE home_base && > '(1000,1000,0,0)'::box; > count > ------- > 1 > (1 row) > > -EXPLAIN (COSTS OFF) > -SELECT count(*) FROM fast_emp4000 WHERE home_base IS NULL; > - QUERY PLAN > ----------------- > - Data Node Scan > -(1 row) > - > +--EXPLAIN (COSTS OFF) > +--SELECT count(*) FROM fast_emp4000 WHERE home_base IS NULL; > SELECT count(*) FROM fast_emp4000 WHERE home_base IS NULL; > count > ------- > 138 > (1 row) > > -EXPLAIN (COSTS OFF) > -SELECT * FROM polygon_tbl WHERE f1 ~ '((1,1),(2,2),(2,1))'::polygon > - ORDER BY (poly_center(f1))[0]; > - QUERY PLAN > ----------------- > - Data Node Scan > -(1 row) > - > +--EXPLAIN (COSTS OFF) > +--SELECT * FROM polygon_tbl WHERE f1 ~ '((1,1),(2,2),(2,1))'::polygon > +-- ORDER BY (poly_center(f1))[0]; > SELECT * FROM polygon_tbl WHERE f1 ~ '((1,1),(2,2),(2,1))'::polygon > ORDER BY (poly_center(f1))[0]; > id | f1 > @@ -231,14 +211,9 @@ SELECT * FROM polygon_tbl WHERE f1 ~ > '((1,1),(2,2),(2,1))'::polygon > 1 | ((2,0),(2,4),(0,0)) > (1 row) > > -EXPLAIN (COSTS OFF) > -SELECT * FROM circle_tbl WHERE f1 && circle(point(1,-2), 1) > - ORDER BY area(f1); > - QUERY PLAN > ----------------- > - Data Node Scan > -(1 row) > - > +--EXPLAIN (COSTS OFF) > +--SELECT * FROM circle_tbl WHERE f1 && 
circle(point(1,-2), 1) > +-- ORDER BY area(f1); > SELECT * FROM circle_tbl WHERE f1 && circle(point(1,-2), 1) > ORDER BY area(f1); > f1 > @@ -269,9 +244,9 @@ LINE 1: SELECT count(*) FROM gcircle_tbl WHERE f1 && > '<(500,500),500... > ^ > EXPLAIN (COSTS OFF) > SELECT count(*) FROM point_tbl WHERE f1 <@ box '(0,0,100,100)'; > - QUERY PLAN > ----------------- > - Data Node Scan > + QUERY PLAN > +--------------------------------- > + Data Node Scan (Node Count [1]) > (1 row) > > SELECT count(*) FROM point_tbl WHERE f1 <@ box '(0,0,100,100)'; > @@ -282,9 +257,9 @@ SELECT count(*) FROM point_tbl WHERE f1 <@ box > '(0,0,100,100)'; > > EXPLAIN (COSTS OFF) > SELECT count(*) FROM point_tbl WHERE box '(0,0,100,100)' @> f1; > - QUERY PLAN > ----------------- > - Data Node Scan > + QUERY PLAN > +--------------------------------- > + Data Node Scan (Node Count [1]) > (1 row) > > SELECT count(*) FROM point_tbl WHERE box '(0,0,100,100)' @> f1; > @@ -295,9 +270,9 @@ SELECT count(*) FROM point_tbl WHERE box > '(0,0,100,100)' @> f1; > > EXPLAIN (COSTS OFF) > SELECT count(*) FROM point_tbl WHERE f1 <@ polygon > '(0,0),(0,100),(100,100),(50,50),(100,0),(0,0)'; > - QUERY PLAN > ----------------- > - Data Node Scan > + QUERY PLAN > +--------------------------------- > + Data Node Scan (Node Count [1]) > (1 row) > > SELECT count(*) FROM point_tbl WHERE f1 <@ polygon > '(0,0),(0,100),(100,100),(50,50),(100,0),(0,0)'; > @@ -308,9 +283,9 @@ SELECT count(*) FROM point_tbl WHERE f1 <@ polygon > '(0,0),(0,100),(100,100),(50, > > EXPLAIN (COSTS OFF) > SELECT count(*) FROM point_tbl WHERE f1 <@ circle '<(50,50),50>'; > - QUERY PLAN > ----------------- > - Data Node Scan > + QUERY PLAN > +--------------------------------- > + Data Node Scan (Node Count [1]) > (1 row) > > SELECT count(*) FROM point_tbl WHERE f1 <@ circle '<(50,50),50>'; > @@ -321,9 +296,9 @@ SELECT count(*) FROM point_tbl WHERE f1 <@ circle > '<(50,50),50>'; > > EXPLAIN (COSTS OFF) > SELECT count(*) FROM point_tbl p WHERE p.f1 << '(0.0, 0.0)'; > - QUERY PLAN > ----------------- > - Data Node Scan > + QUERY PLAN > +--------------------------------- > + Data Node Scan (Node Count [1]) > (1 row) > > SELECT count(*) FROM point_tbl p WHERE p.f1 << '(0.0, 0.0)'; > @@ -334,9 +309,9 @@ SELECT count(*) FROM point_tbl p WHERE p.f1 << '(0.0, > 0.0)'; > > EXPLAIN (COSTS OFF) > SELECT count(*) FROM point_tbl p WHERE p.f1 >> '(0.0, 0.0)'; > - QUERY PLAN > ----------------- > - Data Node Scan > + QUERY PLAN > +--------------------------------- > + Data Node Scan (Node Count [1]) > (1 row) > > SELECT count(*) FROM point_tbl p WHERE p.f1 >> '(0.0, 0.0)'; > @@ -347,9 +322,9 @@ SELECT count(*) FROM point_tbl p WHERE p.f1 >> '(0.0, > 0.0)'; > > EXPLAIN (COSTS OFF) > SELECT count(*) FROM point_tbl p WHERE p.f1 <^ '(0.0, 0.0)'; > - QUERY PLAN > ----------------- > - Data Node Scan > + QUERY PLAN > +--------------------------------- > + Data Node Scan (Node Count [1]) > (1 row) > > SELECT count(*) FROM point_tbl p WHERE p.f1 <^ '(0.0, 0.0)'; > @@ -360,9 +335,9 @@ SELECT count(*) FROM point_tbl p WHERE p.f1 <^ '(0.0, > 0.0)'; > > EXPLAIN (COSTS OFF) > SELECT count(*) FROM point_tbl p WHERE p.f1 >^ '(0.0, 0.0)'; > - QUERY PLAN > ----------------- > - Data Node Scan > + QUERY PLAN > +--------------------------------- > + Data Node Scan (Node Count [1]) > (1 row) > > SELECT count(*) FROM point_tbl p WHERE p.f1 >^ '(0.0, 0.0)'; > @@ -373,9 +348,9 @@ SELECT count(*) FROM point_tbl p WHERE p.f1 >^ '(0.0, > 0.0)'; > > EXPLAIN (COSTS OFF) > SELECT count(*) FROM point_tbl p WHERE p.f1 ~= 
'(-5, -12)'; > - QUERY PLAN > ----------------- > - Data Node Scan > + QUERY PLAN > +--------------------------------- > + Data Node Scan (Node Count [1]) > (1 row) > > SELECT count(*) FROM point_tbl p WHERE p.f1 ~= '(-5, -12)'; > @@ -774,7 +749,7 @@ CREATE INDEX hash_f8_index ON hash_f8_heap USING hash > (random float8_ops); > -- > CREATE TABLE func_index_heap (f1 text, f2 text); > CREATE UNIQUE INDEX func_index_index on func_index_heap (textcat(f1,f2)); > -ERROR: Cannot locally enforce a unique index on round robin distributed > table. > +ERROR: Unique index of partitioned table must contain the hash/modulo > distribution column. > INSERT INTO func_index_heap VALUES('ABC','DEF'); > INSERT INTO func_index_heap VALUES('AB','CDEFG'); > INSERT INTO func_index_heap VALUES('QWE','RTY'); > @@ -788,7 +763,7 @@ INSERT INTO func_index_heap VALUES('QWERTY'); > DROP TABLE func_index_heap; > CREATE TABLE func_index_heap (f1 text, f2 text); > CREATE UNIQUE INDEX func_index_index on func_index_heap ((f1 || f2) > text_ops); > -ERROR: Cannot locally enforce a unique index on round robin distributed > table. > +ERROR: Unique index of partitioned table must contain the hash/modulo > distribution column. > INSERT INTO func_index_heap VALUES('ABC','DEF'); > INSERT INTO func_index_heap VALUES('AB','CDEFG'); > INSERT INTO func_index_heap VALUES('QWE','RTY'); > diff --git a/src/test/regress/expected/float4_1.out > b/src/test/regress/expected/float4_1.out > index 432d159..f50147d 100644 > --- a/src/test/regress/expected/float4_1.out > +++ b/src/test/regress/expected/float4_1.out > @@ -125,16 +125,6 @@ SELECT 'nan'::numeric::float4; > NaN > (1 row) > > -SELECT '' AS five, * FROM FLOAT4_TBL; > - five | f1 > -------+------------- > - | 1004.3 > - | 1.23457e+20 > - | 0 > - | -34.84 > - | 1.23457e-20 > -(5 rows) > - > SELECT '' AS five, * FROM FLOAT4_TBL ORDER BY f1; > five | f1 > ------+------------- > @@ -257,13 +247,14 @@ SELECT '' AS five, f.f1, @f.f1 AS abs_f1 FROM > FLOAT4_TBL f ORDER BY f1; > UPDATE FLOAT4_TBL > SET f1 = FLOAT4_TBL.f1 * '-1' > WHERE FLOAT4_TBL.f1 > '0.0'; > +ERROR: Partition column can't be updated in current version > SELECT '' AS five, * FROM FLOAT4_TBL ORDER BY f1; > - five | f1 > -------+-------------- > - | -1.23457e+20 > - | -1004.3 > - | -34.84 > - | -1.23457e-20 > - | 0 > + five | f1 > +------+------------- > + | -34.84 > + | 0 > + | 1.23457e-20 > + | 1004.3 > + | 1.23457e+20 > (5 rows) > > diff --git a/src/test/regress/expected/float8_1.out > b/src/test/regress/expected/float8_1.out > index 65fe187..8ce7930 100644 > --- a/src/test/regress/expected/float8_1.out > +++ b/src/test/regress/expected/float8_1.out > @@ -381,6 +381,7 @@ SELECT '' AS five, * FROM FLOAT8_TBL ORDER BY f1; > UPDATE FLOAT8_TBL > SET f1 = FLOAT8_TBL.f1 * '-1' > WHERE FLOAT8_TBL.f1 > '0.0'; > +ERROR: Partition column can't be updated in current version > SELECT '' AS bad, f.f1 ^ '1e200' from FLOAT8_TBL f ORDER BY f1; > ERROR: value out of range: overflow > SELECT '' AS bad, f.f1 ^ '1e200' from FLOAT8_TBL f ORDER BY f1; > @@ -396,17 +397,17 @@ ERROR: cannot take logarithm of zero > SELECT '' AS bad, ln(f.f1) from FLOAT8_TBL f where f.f1 < '0.0'; > ERROR: cannot take logarithm of a negative number > SELECT '' AS bad, exp(f.f1) from FLOAT8_TBL f ORDER BY f1; > -ERROR: value out of range: underflow > +ERROR: value out of range: overflow > SELECT '' AS bad, f.f1 / '0.0' from FLOAT8_TBL f; > ERROR: division by zero > SELECT '' AS five, * FROM FLOAT8_TBL ORDER BY f1; > - five | f1 > -------+----------------------- > - | 
-1.2345678901234e+200 > - | -1004.3 > - | -34.84 > - | -1.2345678901234e-200 > - | 0 > + five | f1 > +------+---------------------- > + | -34.84 > + | 0 > + | 1.2345678901234e-200 > + | 1004.3 > + | 1.2345678901234e+200 > (5 rows) > > -- test for over- and underflow > diff --git a/src/test/regress/expected/foreign_key_1.out > b/src/test/regress/expected/foreign_key_1.out > index 7eccdc6..3cb7d17 100644 > --- a/src/test/regress/expected/foreign_key_1.out > +++ b/src/test/regress/expected/foreign_key_1.out > @@ -773,9 +773,9 @@ INSERT INTO FKTABLE VALUES(43); -- should > fail > ERROR: insert or update on table "fktable" violates foreign key > constraint "fktable_ftest1_fkey" > DETAIL: Key (ftest1)=(43) is not present in table "pktable". > UPDATE FKTABLE SET ftest1 = ftest1; -- should succeed > +ERROR: Partition column can't be updated in current version > UPDATE FKTABLE SET ftest1 = ftest1 + 1; -- should fail > -ERROR: insert or update on table "fktable" violates foreign key > constraint "fktable_ftest1_fkey" > -DETAIL: Key (ftest1)=(43) is not present in table "pktable". > +ERROR: Partition column can't be updated in current version > DROP TABLE FKTABLE; > -- This should fail, because we'd have to cast numeric to int which is > -- not an implicit coercion (or use numeric=numeric, but that's not part > @@ -787,34 +787,22 @@ DROP TABLE PKTABLE; > -- On the other hand, this should work because int implicitly promotes to > -- numeric, and we allow promotion on the FK side > CREATE TABLE PKTABLE (ptest1 numeric PRIMARY KEY); > -ERROR: Column ptest1 is not a hash distributable data type > +NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index > "pktable_pkey" for table "pktable" > INSERT INTO PKTABLE VALUES(42); > -ERROR: relation "pktable" does not exist > -LINE 1: INSERT INTO PKTABLE VALUES(42); > - ^ > CREATE TABLE FKTABLE (ftest1 int REFERENCES pktable); > -ERROR: relation "pktable" does not exist > -- Check it actually works > INSERT INTO FKTABLE VALUES(42); -- should succeed > -ERROR: relation "fktable" does not exist > -LINE 1: INSERT INTO FKTABLE VALUES(42); > - ^ > +ERROR: insert or update on table "fktable" violates foreign key > constraint "fktable_ftest1_fkey" > +DETAIL: Key (ftest1)=(42) is not present in table "pktable". > INSERT INTO FKTABLE VALUES(43); -- should fail > -ERROR: relation "fktable" does not exist > -LINE 1: INSERT INTO FKTABLE VALUES(43); > - ^ > +ERROR: insert or update on table "fktable" violates foreign key > constraint "fktable_ftest1_fkey" > +DETAIL: Key (ftest1)=(43) is not present in table "pktable". 
> UPDATE FKTABLE SET ftest1 = ftest1; -- should succeed > -ERROR: relation "fktable" does not exist > -LINE 1: UPDATE FKTABLE SET ftest1 = ftest1; > - ^ > +ERROR: Partition column can't be updated in current version > UPDATE FKTABLE SET ftest1 = ftest1 + 1; -- should fail > -ERROR: relation "fktable" does not exist > -LINE 1: UPDATE FKTABLE SET ftest1 = ftest1 + 1; > - ^ > +ERROR: Partition column can't be updated in current version > DROP TABLE FKTABLE; > -ERROR: table "fktable" does not exist > DROP TABLE PKTABLE; > -ERROR: table "pktable" does not exist > -- Two columns, two tables > CREATE TABLE PKTABLE (ptest1 int, ptest2 inet, PRIMARY KEY(ptest1, > ptest2)); > NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index > "pktable_pkey" for table "pktable" > diff --git a/src/test/regress/expected/money_1.out > b/src/test/regress/expected/money_1.out > new file mode 100644 > index 0000000..6a15792 > --- /dev/null > +++ b/src/test/regress/expected/money_1.out > @@ -0,0 +1,186 @@ > +-- > +-- MONEY > +-- > +CREATE TABLE money_data (m money); > +INSERT INTO money_data VALUES ('123'); > +SELECT * FROM money_data; > + m > +--------- > + $123.00 > +(1 row) > + > +SELECT m + '123' FROM money_data; > + ?column? > +---------- > + $246.00 > +(1 row) > + > +SELECT m + '123.45' FROM money_data; > + ?column? > +---------- > + $246.45 > +(1 row) > + > +SELECT m - '123.45' FROM money_data; > + ?column? > +---------- > + -$0.45 > +(1 row) > + > +SELECT m * 2 FROM money_data; > + ?column? > +---------- > + $246.00 > +(1 row) > + > +SELECT m / 2 FROM money_data; > + ?column? > +---------- > + $61.50 > +(1 row) > + > +-- All true > +SELECT m = '$123.00' FROM money_data; > + ?column? > +---------- > + t > +(1 row) > + > +SELECT m != '$124.00' FROM money_data; > + ?column? > +---------- > + t > +(1 row) > + > +SELECT m <= '$123.00' FROM money_data; > + ?column? > +---------- > + t > +(1 row) > + > +SELECT m >= '$123.00' FROM money_data; > + ?column? > +---------- > + t > +(1 row) > + > +SELECT m < '$124.00' FROM money_data; > + ?column? > +---------- > + t > +(1 row) > + > +SELECT m > '$122.00' FROM money_data; > + ?column? > +---------- > + t > +(1 row) > + > +-- All false > +SELECT m = '$123.01' FROM money_data; > + ?column? > +---------- > +(0 rows) > + > +SELECT m != '$123.00' FROM money_data; > + ?column? > +---------- > + f > +(1 row) > + > +SELECT m <= '$122.99' FROM money_data; > + ?column? > +---------- > + f > +(1 row) > + > +SELECT m >= '$123.01' FROM money_data; > + ?column? > +---------- > + f > +(1 row) > + > +SELECT m > '$124.00' FROM money_data; > + ?column? > +---------- > + f > +(1 row) > + > +SELECT m < '$122.00' FROM money_data; > + ?column? 
> +---------- > + f > +(1 row) > + > +SELECT cashlarger(m, '$124.00') FROM money_data; > + cashlarger > +------------ > + $124.00 > +(1 row) > + > +SELECT cashsmaller(m, '$124.00') FROM money_data; > + cashsmaller > +------------- > + $123.00 > +(1 row) > + > +SELECT cash_words(m) FROM money_data; > + cash_words > +------------------------------------------------- > + One hundred twenty three dollars and zero cents > +(1 row) > + > +SELECT cash_words(m + '1.23') FROM money_data; > + cash_words > +-------------------------------------------------------- > + One hundred twenty four dollars and twenty three cents > +(1 row) > + > +DELETE FROM money_data; > +INSERT INTO money_data VALUES ('$123.45'); > +SELECT * FROM money_data; > + m > +--------- > + $123.45 > +(1 row) > + > +DELETE FROM money_data; > +INSERT INTO money_data VALUES ('$123.451'); > +SELECT * FROM money_data; > + m > +--------- > + $123.45 > +(1 row) > + > +DELETE FROM money_data; > +INSERT INTO money_data VALUES ('$123.454'); > +SELECT * FROM money_data; > + m > +--------- > + $123.45 > +(1 row) > + > +DELETE FROM money_data; > +INSERT INTO money_data VALUES ('$123.455'); > +SELECT * FROM money_data; > + m > +--------- > + $123.46 > +(1 row) > + > +DELETE FROM money_data; > +INSERT INTO money_data VALUES ('$123.456'); > +SELECT * FROM money_data; > + m > +--------- > + $123.46 > +(1 row) > + > +DELETE FROM money_data; > +INSERT INTO money_data VALUES ('$123.459'); > +SELECT * FROM money_data; > + m > +--------- > + $123.46 > +(1 row) > + > diff --git a/src/test/regress/expected/prepared_xacts_2.out > b/src/test/regress/expected/prepared_xacts_2.out > index e456200..307ffad 100644 > --- a/src/test/regress/expected/prepared_xacts_2.out > +++ b/src/test/regress/expected/prepared_xacts_2.out > @@ -6,7 +6,7 @@ > -- isn't really needed ... stopping and starting the postmaster would > -- be enough, but we can't even do that here. > -- create a simple table that we'll use in the tests > -CREATE TABLE pxtest1 (foobar VARCHAR(10)); > +CREATE TABLE pxtest1 (foobar VARCHAR(10)) distribute by replication; > INSERT INTO pxtest1 VALUES ('aaa'); > -- Test PREPARE TRANSACTION > BEGIN; > diff --git a/src/test/regress/expected/reltime_1.out > b/src/test/regress/expected/reltime_1.out > new file mode 100644 > index 0000000..83f61f9 > --- /dev/null > +++ b/src/test/regress/expected/reltime_1.out > @@ -0,0 +1,109 @@ > +-- > +-- RELTIME > +-- > +CREATE TABLE RELTIME_TBL (f1 reltime); > +INSERT INTO RELTIME_TBL (f1) VALUES ('@ 1 minute'); > +INSERT INTO RELTIME_TBL (f1) VALUES ('@ 5 hour'); > +INSERT INTO RELTIME_TBL (f1) VALUES ('@ 10 day'); > +INSERT INTO RELTIME_TBL (f1) VALUES ('@ 34 year'); > +INSERT INTO RELTIME_TBL (f1) VALUES ('@ 3 months'); > +INSERT INTO RELTIME_TBL (f1) VALUES ('@ 14 seconds ago'); > +-- badly formatted reltimes > +INSERT INTO RELTIME_TBL (f1) VALUES ('badly formatted reltime'); > +ERROR: invalid input syntax for type reltime: "badly formatted reltime" > +LINE 1: INSERT INTO RELTIME_TBL (f1) VALUES ('badly formatted reltim... 
> + ^ > +INSERT INTO RELTIME_TBL (f1) VALUES ('@ 30 eons ago'); > +ERROR: invalid input syntax for type reltime: "@ 30 eons ago" > +LINE 1: INSERT INTO RELTIME_TBL (f1) VALUES ('@ 30 eons ago'); > + ^ > +-- test reltime operators > +SELECT '' AS six, * FROM RELTIME_TBL ORDER BY f1; > + six | f1 > +-----+--------------- > + | @ 14 secs ago > + | @ 1 min > + | @ 5 hours > + | @ 10 days > + | @ 3 mons > + | @ 34 years > +(6 rows) > + > +SELECT '' AS five, * FROM RELTIME_TBL > + WHERE RELTIME_TBL.f1 <> reltime '@ 10 days' ORDER BY f1; > + five | f1 > +------+--------------- > + | @ 14 secs ago > + | @ 1 min > + | @ 5 hours > + | @ 3 mons > + | @ 34 years > +(5 rows) > + > +SELECT '' AS three, * FROM RELTIME_TBL > + WHERE RELTIME_TBL.f1 <= reltime '@ 5 hours' ORDER BY f1; > + three | f1 > +-------+--------------- > + | @ 14 secs ago > + | @ 1 min > + | @ 5 hours > +(3 rows) > + > +SELECT '' AS three, * FROM RELTIME_TBL > + WHERE RELTIME_TBL.f1 < reltime '@ 1 day' ORDER BY f1; > + three | f1 > +-------+--------------- > + | @ 14 secs ago > + | @ 1 min > + | @ 5 hours > +(3 rows) > + > +SELECT '' AS one, * FROM RELTIME_TBL > + WHERE RELTIME_TBL.f1 = reltime '@ 34 years' ORDER BY f1; > + one | f1 > +-----+---------- > + | 34 years > +(1 row) > + > +SELECT '' AS two, * FROM RELTIME_TBL > + WHERE RELTIME_TBL.f1 >= reltime '@ 1 month' ORDER BY f1; > + two | f1 > +-----+------------ > + | @ 3 mons > + | @ 34 years > +(2 rows) > + > +SELECT '' AS five, * FROM RELTIME_TBL > + WHERE RELTIME_TBL.f1 > reltime '@ 3 seconds ago' ORDER BY f1; > + five | f1 > +------+------------ > + | @ 1 min > + | @ 5 hours > + | @ 10 days > + | @ 3 mons > + | @ 34 years > +(5 rows) > + > +SELECT '' AS fifteen, r1.*, r2.* > + FROM RELTIME_TBL r1, RELTIME_TBL r2 > + WHERE r1.f1 > r2.f1 > + ORDER BY r1.f1, r2.f1; > + fifteen | f1 | f1 > +---------+------------+--------------- > + | @ 1 min | @ 14 secs ago > + | @ 5 hours | @ 14 secs ago > + | @ 5 hours | @ 1 min > + | @ 10 days | @ 14 secs ago > + | @ 10 days | @ 1 min > + | @ 10 days | @ 5 hours > + | @ 3 mons | @ 14 secs ago > + | @ 3 mons | @ 1 min > + | @ 3 mons | @ 5 hours > + | @ 3 mons | @ 10 days > + | @ 34 years | @ 14 secs ago > + | @ 34 years | @ 1 min > + | @ 34 years | @ 5 hours > + | @ 34 years | @ 10 days > + | @ 34 years | @ 3 mons > +(15 rows) > + > diff --git a/src/test/regress/expected/triggers_1.out > b/src/test/regress/expected/triggers_1.out > index 5528c66..a9f83ec 100644 > --- a/src/test/regress/expected/triggers_1.out > +++ b/src/test/regress/expected/triggers_1.out > @@ -717,30 +717,30 @@ ERROR: Postgres-XC does not support TRIGGER yet > DETAIL: The feature is not currently supported > \set QUIET false > UPDATE min_updates_test SET f1 = f1; > -UPDATE 2 > -UPDATE min_updates_test SET f2 = f2 + 1; > ERROR: Partition column can't be updated in current version > +UPDATE min_updates_test SET f2 = f2 + 1; > +UPDATE 2 > UPDATE min_updates_test SET f3 = 2 WHERE f3 is null; > UPDATE 1 > UPDATE min_updates_test_oids SET f1 = f1; > -UPDATE 2 > -UPDATE min_updates_test_oids SET f2 = f2 + 1; > ERROR: Partition column can't be updated in current version > +UPDATE min_updates_test_oids SET f2 = f2 + 1; > +UPDATE 2 > UPDATE min_updates_test_oids SET f3 = 2 WHERE f3 is null; > UPDATE 1 > \set QUIET true > SELECT * FROM min_updates_test ORDER BY 1,2,3; > f1 | f2 | f3 > ----+----+---- > - a | 1 | 2 > - b | 2 | 2 > + a | 2 | 2 > + b | 3 | 2 > (2 rows) > > SELECT * FROM min_updates_test_oids ORDER BY 1,2,3; > f1 | f2 | f3 > ----+----+---- > - a | 1 | 2 > - b | 2 | 2 > + a 
| 2 | 2 > + b | 3 | 2 > (2 rows) > > DROP TABLE min_updates_test; > diff --git a/src/test/regress/expected/tsearch_1.out > b/src/test/regress/expected/tsearch_1.out > index e8c35d4..4d1f1b1 100644 > --- a/src/test/regress/expected/tsearch_1.out > +++ b/src/test/regress/expected/tsearch_1.out > @@ -801,7 +801,7 @@ SELECT COUNT(*) FROM test_tsquery WHERE keyword > 'new > & york'; > (1 row) > > CREATE UNIQUE INDEX bt_tsq ON test_tsquery (keyword); > -ERROR: Cannot locally enforce a unique index on round robin distributed > table. > +ERROR: Unique index of partitioned table must contain the hash/modulo > distribution column. > SET enable_seqscan=OFF; > SELECT COUNT(*) FROM test_tsquery WHERE keyword < 'new & york'; > count > @@ -1054,6 +1054,7 @@ SELECT count(*) FROM test_tsvector WHERE a @@ > to_tsquery('345&qwerty'); > (0 rows) > > UPDATE test_tsvector SET t = null WHERE t = '345 qwerty'; > +ERROR: Partition column can't be updated in current version > SELECT count(*) FROM test_tsvector WHERE a @@ to_tsquery('345&qwerty'); > count > ------- > diff --git a/src/test/regress/expected/xc_distkey.out > b/src/test/regress/expected/xc_distkey.out > new file mode 100644 > index 0000000..d050b27 > --- /dev/null > +++ b/src/test/regress/expected/xc_distkey.out > @@ -0,0 +1,618 @@ > +-- XC Test cases to verify that all supported data types are working as > distribution key > +-- Also verifies that the comaparison with a constant for equality is > optimized. > +create table ch_tab(a char) distribute by modulo(a); > +insert into ch_tab values('a'); > +select hashchar('a'); > + hashchar > +----------- > + 463612535 > +(1 row) > + > +create table nm_tab(a name) distribute by modulo(a); > +insert into nm_tab values('abbas'); > +select hashname('abbas'); > + hashname > +----------- > + 605752656 > +(1 row) > + > +create table nu_tab(a numeric(10,5)) distribute by modulo(a); > +insert into nu_tab values(123.456); > +insert into nu_tab values(789.412); > +select * from nu_tab order by a; > + a > +----------- > + 123.45600 > + 789.41200 > +(2 rows) > + > +select * from nu_tab where a = 123.456; > + a > +----------- > + 123.45600 > +(1 row) > + > +select * from nu_tab where 789.412 = a; > + a > +----------- > + 789.41200 > +(1 row) > + > +explain select * from nu_tab where a = 123.456; > + QUERY PLAN > +------------------------------------------------------------------- > + Data Node Scan (Node Count [1]) (cost=0.00..0.00 rows=0 width=0) > +(1 row) > + > +explain select * from nu_tab where 789.412 = a; > + QUERY PLAN > +------------------------------------------------------------------- > + Data Node Scan (Node Count [1]) (cost=0.00..0.00 rows=0 width=0) > +(1 row) > + > +create table tx_tab(a text) distribute by modulo(a); > +insert into tx_tab values('hello world'); > +insert into tx_tab values('Did the quick brown fox jump over the lazy > dog?'); > +select * from tx_tab order by a; > + a > +------------------------------------------------- > + Did the quick brown fox jump over the lazy dog? > + hello world > +(2 rows) > + > +select * from tx_tab where a = 'hello world'; > + a > +------------- > + hello world > +(1 row) > + > +select * from tx_tab where a = 'Did the quick brown fox jump over the lazy > dog?'; > + a > +------------------------------------------------- > + Did the quick brown fox jump over the lazy dog? 
> +(1 row) > + > +select * from tx_tab where 'hello world' = a; > + a > +------------- > + hello world > +(1 row) > + > +select * from tx_tab where 'Did the quick brown fox jump over the lazy > dog?' = a; > + a > +------------------------------------------------- > + Did the quick brown fox jump over the lazy dog? > +(1 row) > + > +explain select * from tx_tab where a = 'hello world'; > + QUERY PLAN > +------------------------------------------------------------------- > + Data Node Scan (Node Count [1]) (cost=0.00..0.00 rows=0 width=0) > +(1 row) > + > +explain select * from tx_tab where a = 'Did the quick brown fox jump over > the lazy dog?'; > + QUERY PLAN > +------------------------------------------------------------------- > + Data Node Scan (Node Count [1]) (cost=0.00..0.00 rows=0 width=0) > +(1 row) > + > +create ta... [truncated message content] |
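Andrei's note above suggests replacing the raw-value returns with the hashintX functions. A minimal sketch of what those compute_hash() branches could look like, assuming the hashint2/hashint4/hashint8/hashoid/hashchar support routines that hashfunc.c already provides; this illustrates the suggestion and is not the committed code:

	switch (type)
	{
		/*
		 * Sketch: feed the integer-like types through a mixing hash
		 * instead of returning the raw value.  With the raw value, a
		 * key column holding only even numbers sends every row to the
		 * same node as soon as the value is reduced modulo the node
		 * count; a real hash spreads such keys across nodes.
		 */
		case INT8OID:
			return DirectFunctionCall1(hashint8, value);
		case INT2OID:
			return DirectFunctionCall1(hashint2, value);
		case OIDOID:
			return DirectFunctionCall1(hashoid, value);
		case INT4OID:
			return DirectFunctionCall1(hashint4, value);
		case BOOLOID:
			/* hashfunc.c uses hashchar for both "char" and boolean */
			return DirectFunctionCall1(hashchar, value);
		/* ... remaining cases as in the patch ... */
	}

Note that hashint8 is written to hash logically equal int2/int4/int8 values to the same result, so the cross-type comparison called out in the patch comment (a = 8446744073709551359 versus a = 8446744073709551359::int8) would keep landing on the same node.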
From: Mason S. <mas...@gm...> - 2011-05-24 13:58:05
On Tue, May 24, 2011 at 9:40 AM, Abbas Butt <abb...@te...> wrote:

> On Tue, May 24, 2011 at 6:03 PM, Mason <ma...@us...> wrote:
>>
>> On Tue, May 24, 2011 at 8:08 AM, Abbas Butt
>> <ga...@us...> wrote:
>> > Project "Postgres-XC".
>> >
>> > The branch, master has been updated
>> >        via  49b66c77343ae1e9921118e0c902b1528f1cc2ae (commit)
>> >       from  87a62879ab3492e3dd37d00478ffa857639e2b85 (commit)
>> >
>> >
>> > - Log -----------------------------------------------------------------
>> > commit 49b66c77343ae1e9921118e0c902b1528f1cc2ae
>> > Author: Abbas <abb...@en...>
>> > Date:   Tue May 24 17:06:30 2011 +0500
>> >
>> >     This patch adds support for the following data types to be used as
>> >     distribution key
>> >
>> >     INT8, INT2, OID, INT4, BOOL, INT2VECTOR, OIDVECTOR
>> >     CHAR, NAME, TEXT, BPCHAR, BYTEA, VARCHAR
>> >     FLOAT4, FLOAT8, NUMERIC, CASH
>> >     ABSTIME, RELTIME, DATE, TIME, TIMESTAMP, TIMESTAMPTZ, INTERVAL,
>> >     TIMETZ
>> >
>>
>> I am not sure some of these data types are a good idea to use for
>> distributing on. Float is inexact and seems problematic.
>>
>> I just did a quick test:
>>
>> mds=# create table float1 (a float, b float) distribute by hash (a);
>> CREATE TABLE
>>
>> mds=# insert into float1 values (2.0/3, 2);
>> INSERT 0 1
>>
>> mds=# select * from float1;
>>          a         | b
>> -------------------+---
>>  0.666666666666667 | 2
>> (1 row)
>>
>> Then, I copy and paste the output of a:
>>
>> mds=# select * from float1 where a = 0.666666666666667;
>>  a | b
>> ---+---
>> (0 rows)
>>
>
> float is a tricky type. Leaving XC aside, this test case produces the same
> result in plain Postgres, for this reason: the column does not actually
> contain 0.666666666666667; what psql shows us is only an approximation of
> what is stored there.
> select * from float1 where a = 2.0/3; would however work.
> Secondly, suppose we have the same test case with data type float4.
> Now both
> select * from float1 where a = 0.666666666666667; and
> select * from float1 where a = 2.0/3;
> would return no rows, both in PG and XC.
> The reason is that PG treats real-number literals as float8 by default, and
> a float8 literal does not compare equal to the stored float4.
> select * from float1 where a = cast (2.0/3 as float4);
> would therefore work.
> Any user willing to use float types has to be aware of these behaviors,
> and knowing them, he or she may still benefit from being able to use float
> as a distribution key.

I don't think it is a good idea that users have to know they should change
all of their application code and add casting to make it work the way they
want. I think people are just going to get themselves into trouble. I
strongly recommend disabling distribution support for some of these data
types.

Thanks,

Mason

>> Looking at the plan it tries to take advantage of partitioning:
>>
>> mds=# explain select * from float1 where a = 0.666666666666667;
>>                             QUERY PLAN
>> -------------------------------------------------------------------
>>  Data Node Scan (Node Count [1])  (cost=0.00..0.00 rows=0 width=0)
>> (1 row)
>>
>> I think we should remove support for floats as a possible distribution
>> type; users may get themselves into trouble.
>>
>>
>> There may be similar issues with the timestamp data types:
>>
>> mds=# create table timestamp1 (a timestamp, b int) distribute by hash(a);
>> CREATE TABLE
>> mds=# insert into timestamp1 values (now(), 1);
>> INSERT 0 1
>> mds=# select * from timestamp1;
>>              a              | b
>> ----------------------------+---
>>  2011-05-24 08:51:21.597551 | 1
>> (1 row)
>>
>> mds=# select * from timestamp1 where a = '2011-05-24 08:51:21.597551';
>>  a | b
>> ---+---
>> (0 rows)
>>
>>
>> As far as BOOL goes, I suppose it may be ok, but of course there are
>> only two possible values. I would block it, or at the very least if
>> the user leaves off the distribution clause, I would not consider BOOL
>> columns and look at other columns as better partitioning candidates.
>>
>> In any event, I am very glad to see the various INT types, CHAR,
>> VARCHAR, TEXT, NUMERIC and DATE supported. I am not so sure how useful
>> some of the others are.
>>
>> Thanks,
>>
>> Mason
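One concrete shape Mason's recommendation could take is to trim the type list in the patch's IsHashDistributable() (and its twin IsModuloDistributable()) down to exact, equality-stable types. The keep-list below is illustrative only, not committed code; it keeps the types Mason singles out as clearly useful, and where to draw the line for the borderline ones (OID, BYTEA, the vector types) would be a project decision:

	/*
	 * Sketch: allow only exact types as distribution keys.  FLOAT4,
	 * FLOAT8, the time/timestamp family and the two-valued BOOL fall
	 * through to the default and are rejected.
	 */
	bool
	IsHashDistributable(Oid col_type)
	{
		switch (col_type)
		{
			case INT2OID:
			case INT4OID:
			case INT8OID:
			case NUMERICOID:
			case CHAROID:
			case NAMEOID:
			case BPCHAROID:
			case VARCHAROID:
			case TEXTOID:
			case DATEOID:
				return true;
			default:
				return false;
		}
	}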
From: Abbas B. <abb...@te...> - 2011-05-24 13:40:17
On Tue, May 24, 2011 at 6:03 PM, Mason <ma...@us...> wrote:
> On Tue, May 24, 2011 at 8:08 AM, Abbas Butt
> <ga...@us...> wrote:
> > Project "Postgres-XC".
> >
> > The branch, master has been updated
> >        via  49b66c77343ae1e9921118e0c902b1528f1cc2ae (commit)
> >       from  87a62879ab3492e3dd37d00478ffa857639e2b85 (commit)
> >
> > - Log -----------------------------------------------------------------
> > commit 49b66c77343ae1e9921118e0c902b1528f1cc2ae
> > Author: Abbas <abb...@en...>
> > Date:   Tue May 24 17:06:30 2011 +0500
> >
> >     This patch adds support for the following data types to be used as
> >     distribution key
> >
> >     INT8, INT2, OID, INT4, BOOL, INT2VECTOR, OIDVECTOR
> >     CHAR, NAME, TEXT, BPCHAR, BYTEA, VARCHAR
> >     FLOAT4, FLOAT8, NUMERIC, CASH
> >     ABSTIME, RELTIME, DATE, TIME, TIMESTAMP, TIMESTAMPTZ, INTERVAL, TIMETZ
> >
>
> I am not sure some of these data types are a good idea to distribute
> on. Float is inexact and seems problematic.
>
> I just did a quick test:
>
> mds=# create table float1 (a float, b float) distribute by hash (a);
> CREATE TABLE
>
> mds=# insert into float1 values (2.0/3, 2);
> INSERT 0 1
>
> mds=# select * from float1;
>          a         | b
> -------------------+---
>  0.666666666666667 | 2
> (1 row)
>
> Then, I copy and paste the output of a:
>
> mds=# select * from float1 where a = 0.666666666666667;
>  a | b
> ---+---
> (0 rows)
>

float is a tricky type. XC aside, this test case produces the same result
in plain PostgreSQL, and for the same reason: the column does not actually
contain 0.666666666666667; what psql shows us is only an approximation of
what is stored there.
select * from float1 where a = 2.0/3; would, however, work.
Secondly, suppose we run the same test case with data type float4. Now both
select * from float1 where a = 0.666666666666667; and
select * from float1 where a = 2.0/3;
return no rows, in both PG and XC.
The reason is that PG treats real-number literals as float8 by default, and
the float8 value does not compare equal to the stored float4 value.
select * from float1 where a = cast (2.0/3 as float4);
would therefore work.
Any user willing to use float types has to be aware of these quirks, and
knowing them he/she may still benefit from being able to use float as a
distribution key.

> Looking at the plan it tries to take advantage of partitioning:
>
> mds=# explain select * from float1 where a = 0.666666666666667;
>                             QUERY PLAN
> -------------------------------------------------------------------
>  Data Node Scan (Node Count [1])  (cost=0.00..0.00 rows=0 width=0)
> (1 row)
>
> I think we should remove support for floats as a possible distribution
> type; users may get themselves into trouble.
>
>
> There may be similar issues with the timestamp data types:
>
> mds=# create table timestamp1 (a timestamp, b int) distribute by hash(a);
> CREATE TABLE
> mds=# insert into timestamp1 values (now(), 1);
> INSERT 0 1
> mds=# select * from timestamp1;
>              a              | b
> ----------------------------+---
>  2011-05-24 08:51:21.597551 | 1
> (1 row)
>
> mds=# select * from timestamp1 where a = '2011-05-24 08:51:21.597551';
>  a | b
> ---+---
> (0 rows)
>
> As far as BOOL goes, I suppose it may be ok, but of course there are
> only two possible values. I would block it, or at the very least, if
> the user leaves off the distribution clause, I would not consider BOOL
> columns and look at other columns as better partitioning candidates.
>
> In any event, I am very glad to see the various INT types, CHAR,
> VARCHAR, TEXT, NUMERIC and DATE supported. I am not so sure how useful
> some of the others are.
>
> Thanks,
>
> Mason
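Abbas's float4 point can be checked directly. A minimal psql sketch of it,
relying only on stock PostgreSQL comparison semantics; the table name
float4tab is illustrative:

    create table float4tab (a float4, b int) distribute by hash (a);
    insert into float4tab values (2.0/3, 2);

    -- both of these return no rows: the literal and the expression are
    -- evaluated as float8, and the stored float4 value, once widened to
    -- float8 for the comparison, is not equal to them
    select * from float4tab where a = 0.666666666666667;
    select * from float4tab where a = 2.0/3;

    -- casting the constant to float4 makes both sides the same value,
    -- so this returns the row
    select * from float4tab where a = cast (2.0/3 as float4);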
From: Mason <ma...@us...> - 2011-05-24 13:03:39
On Tue, May 24, 2011 at 8:08 AM, Abbas Butt
<ga...@us...> wrote:
> Project "Postgres-XC".
>
> The branch, master has been updated
>        via  49b66c77343ae1e9921118e0c902b1528f1cc2ae (commit)
>       from  87a62879ab3492e3dd37d00478ffa857639e2b85 (commit)
>
> - Log -----------------------------------------------------------------
> commit 49b66c77343ae1e9921118e0c902b1528f1cc2ae
> Author: Abbas <abb...@en...>
> Date:   Tue May 24 17:06:30 2011 +0500
>
>     This patch adds support for the following data types to be used as
>     distribution key
>
>     INT8, INT2, OID, INT4, BOOL, INT2VECTOR, OIDVECTOR
>     CHAR, NAME, TEXT, BPCHAR, BYTEA, VARCHAR
>     FLOAT4, FLOAT8, NUMERIC, CASH
>     ABSTIME, RELTIME, DATE, TIME, TIMESTAMP, TIMESTAMPTZ, INTERVAL, TIMETZ
>

I am not sure some of these data types are a good idea to distribute on.
Float is inexact and seems problematic.

I just did a quick test:

mds=# create table float1 (a float, b float) distribute by hash (a);
CREATE TABLE

mds=# insert into float1 values (2.0/3, 2);
INSERT 0 1

mds=# select * from float1;
         a         | b
-------------------+---
 0.666666666666667 | 2
(1 row)

Then, I copy and paste the output of a:

mds=# select * from float1 where a = 0.666666666666667;
 a | b
---+---
(0 rows)

Looking at the plan it tries to take advantage of partitioning:

mds=# explain select * from float1 where a = 0.666666666666667;
                            QUERY PLAN
-------------------------------------------------------------------
 Data Node Scan (Node Count [1])  (cost=0.00..0.00 rows=0 width=0)
(1 row)

I think we should remove support for floats as a possible distribution
type; users may get themselves into trouble.

There may be similar issues with the timestamp data types:

mds=# create table timestamp1 (a timestamp, b int) distribute by hash(a);
CREATE TABLE
mds=# insert into timestamp1 values (now(), 1);
INSERT 0 1
mds=# select * from timestamp1;
             a              | b
----------------------------+---
 2011-05-24 08:51:21.597551 | 1
(1 row)

mds=# select * from timestamp1 where a = '2011-05-24 08:51:21.597551';
 a | b
---+---
(0 rows)

As far as BOOL goes, I suppose it may be ok, but of course there are
only two possible values. I would block it, or at the very least, if
the user leaves off the distribution clause, I would not consider BOOL
columns and would look at other columns as better partitioning
candidates.

In any event, I am very glad to see the various INT types, CHAR,
VARCHAR, TEXT, NUMERIC and DATE supported. I am not so sure how useful
some of the others are.

Thanks,

Mason
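Among the date/time types in the supported list, DATE is the least
surprising hash key, because a date literal denotes exactly one storable
value, with no sub-second precision to lose in the text output. An
illustrative sketch (the events table is not from the patch):

    create table events (d date, payload text) distribute by hash (d);
    insert into events values ('2011-05-24', 'commit day');

    -- the literal round-trips exactly, so the lookup avoids the
    -- precision trap seen with float and timestamp above
    select * from events where d = '2011-05-24';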
From: Pavan D. <pa...@us...> - 2011-05-24 12:12:40
Project "Postgres-XC". The branch, PGXC-TrialMaster has been updated via dd8e15cc5988aaa71b519724ab3d59e3e82f42e5 (commit) via df91a4341c34cfb5c63fec787e5602ac5e1bdc6d (commit) via 351a1751b7ee8c17d080fe0de9c9bef4bdbc653d (commit) via 7d2a58c8e4cafbcfe48741317813783603b8fb3f (commit) via b931435761b21f27aef8aca7e7e319bd0bee3a3a (commit) via ba0c6b460db82dbd38b5d2cb2d86c9ee36d3adc3 (commit) via 7a810c69a82a7d5990e922ee653b2301b1f91f2b (commit) via 11115542296b6b3eb7a6e9ec07cc4b3d87d44f87 (commit) via 632865878512957d7d65aec3ffb0e596c587f064 (commit) via d14afc6f37bf712e014059a687a02f609322d932 (commit) via d627d77cb86e4ce844b547bd88b5e2d060d6ed5a (commit) via 3d8241213435d7e6e1026de9ba9cf2ad8f6ff258 (commit) via 486318e5a5ed1299488d682ece3b08d40aa3629e (commit) via 52164f4446b11e3dfcb3d209ef527467f1fe9045 (commit) via 4d6a850dac9ba1c9cdf828fd07e739327a2aa878 (commit) via 57ff3183c379a4629b9d215731daf1bb84fdb36d (commit) via e389cedc177bd2f6c3192aa765477d9e9685b121 (commit) via 6c5bea9a0905bcb16a77249bfcc1978df0c571c1 (commit) via 86a985e81a831c78e02548033a63e685439c875e (commit) via 5373463dbee4a82c9080bcc21d1a734d112d165b (commit) via 0faf06811096886d28d5e91e8050b00d3bd8548a (commit) via 585d094996f822269d7e7c7eeab53dbb34369330 (commit) via 198c04c231615a9e81f274831f11bc2deca222e3 (commit) via ffd210505e1a331a8c6980a4032178a42e6f797a (commit) via 169a44eaa6ccca0ada3be43605e6a0c2ca4bd7e9 (commit) via f15bab9584259095b535beb6d47124808b26598c (commit) via 3d971742e7b1f74414603fa854416c7988d8d43c (commit) via c45032009fab9d2ca7c8f60d119d9084dfd18d98 (commit) via f81a5c58113391b7920238eedf713df0a7da36be (commit) via 6704a5b9e10344ee5accc99daf6accbf12fd5667 (commit) via 5f94f595e330101d47be4b6d0dec62cf3d5f2971 (commit) via c95a353c086776d54ea8c0fed8d89ff4628b2ed7 (commit) via 55948f23432759cc06299cd640d41a18bc7b6219 (commit) via f1841ef74cd880500114196f27e1adae59969665 (commit) via bb093447c90f611b55db8aafea10daa77c022f1d (commit) via 873b6c2937c9fc7ecec8943200521ac1fb5a4bd5 (commit) via 83fa5f092a16a54d3981acc6b9c487d4473d97dc (commit) via f400f82341a28b9bf8266ad829650b922a9df85d (commit) via 0fa0474cf15efd2cfea4ebeabf32f15aba13ecc1 (commit) via deb67bfdb6fa6e6b9a8e6984ac84ce77f2b93fe0 (commit) via d1e193d252c93c317674429bfe9a2119857cd2e9 (commit) via 1743b7631e870f97fe3b03d87a5c5bc34ab0a15c (commit) via 7b238dccef00c4f2dc86743dcacbac65e0b1b3e7 (commit) via 94f06fee218c0c36325deff3069acda920fbeba9 (commit) via e3ba278036f21aa3d512c7fa5087ce89767e0caf (commit) via 6650014f3bdfe7d7dc864fb05921d2f32f1918ab (commit) via e6ac8b0b1495941c0a5a82507235e1edbef0e884 (commit) via 81c026427de962fb825088814228fae7da0b71ed (commit) via 2eae75aeecd8532a922cc9195f63edc59931b90f (commit) via 59675139ae3c1dff0d1189f1a415d586bf0852f5 (commit) via b1db2304ea053effb9160bbaafbdd3256ff58497 (commit) via 100e7976889fcc29a674423b8dac9c9ffd5a64cb (commit) via 66cdb22ee68d07c2d7db94ed27903f311e568fdf (commit) via a335e0db6295f992bae8ea62be485a10ea6fdc6c (commit) via f9b53553de919e79edab5469989d1d23112da4e4 (commit) via d3cd801d005812ec499a91c03818c7f32abb4331 (commit) via ee06f1c1455b396332980c3b309f5e37280de706 (commit) via fbfe94557ad94095b40b79ca1a3b1a2e879e1418 (commit) via d2fe66518e3aa72196b44a0e8ebfa745e3e698b4 (commit) via 3fa6d176d5551c1e86e89b80e781b1f27e063a78 (commit) via d5835d5366c7abec7daca0d674159136cc0fb4b9 (commit) via db52bff897a1ce4b2a99266a4aed9b141eca3974 (commit) via f8fb49d57f04fd0e1930847340c1c0f7897d2438 (commit) via 23fa07e339ebbba8e7cf6d57538b12bfc17f66ed (commit) via 
7ba1c7cab398bc8e0cc55e8449558a843d940ad9 (commit) via 7fbd5823dfeace7cfa11d968691d6c3c351eccdd (commit) via ce7b5737c56440152b9b06db88a4a36eb0ae5425 (commit) via debc8202c2802adb4c16b51db0d7f80819137308 (commit) via 39a94aecfcb03ddff964aa84cf27fd28070a90c0 (commit) via 147d0b4a7d172d1f36c61f091c67a0c2efa15986 (commit) via 12a24a40765d64691ec2ac7d5dbcc561b3806b9f (commit) via 1e1dc8653b58622fd2531defc585facc114df0f5 (commit) via 7dccc3dca57147acbce13cd6721022ec94671e4b (commit) via 4f5cb7d0335b4cb6de1cbfe24eace05401c74128 (commit) via 9881d357467dd087c32843857c22a08c2f306550 (commit) via b944a8f1673433c65516b71457f17c1e69fa79b3 (commit) via 6dd5db3a89920c3a226f2aef2eb4bc332d5b1607 (commit) via 8345fd2a6c90ea4ae9363a9527260d69eb73d15f (commit) via aef28684d3a06fd6a47f11a3ad44b9db3cee8e22 (commit) via 6b28e34230944f89006b630ba77944f4cbe89a13 (commit) via 47b8ada14365c528f7069ae4e1972d0de64a8ae0 (commit) via f623dedea3c43a0ad2fc6790ce458e12820e2663 (commit) via d2e1f1e8ab97b6377026e1f9aaed3a1905b8fde7 (commit) via 6c3c74605bcb2f11556aa36e7894b90efaf8bd38 (commit) via 9b2d9869a824ee899a9c4893d78c175025b06e7a (commit) via 7d1401cc9cf8b398a59a2c8eb616ebfabf1d9fcb (commit) via 6d9651b0ac536c48eb2aae27dda7caed2520ae4e (commit) via 11f7f78e067b043e7cca85f4f4d6624c2b8994cc (commit) via 7fc7e783709b3854361e2d51cfd0a689ded0176f (commit) via 4399065fde1b20d31582ccad7256d3abd247e35c (commit) via 986115564b64fea51b427e07cf8d718cff8273cb (commit) via 73fe8a9d3c13ca13d85dfa86aa643e4da8b08649 (commit) via c1c1bb130168681174c8bc2c5ac1e6ba51ed0f9e (commit) via 45d885b0e00e4d58247bfcc4ffc0a2173f7353ee (commit) via f3d43f2194c9a52eb256f4ffcb164bd2f45b72ca (commit) via 30d5d5a9d9fcfe347d017d50aedf1b724224058f (commit) via 91c0375894a91fb5694bea26118b33542f33919a (commit) via bd5d82f011becbc06a363d4425ce3d9cc7992884 (commit) via 6967c5ad3698f67ba8ee437bbc23c392714b9d66 (commit) via 91b01409d3d99abcfb0ee09132a2d9e4d41ee923 (commit) via d8e138378d5f81fadc683149d49315dec94957a3 (commit) via 3054e8dc3d2a4b08e147f7433c6e5ee465bce6bd (commit) via 4fab03b86306594c1e43be0d8beb35889a7da59a (commit) via ad968a480b13dbaaed744bb49042ede1d746a797 (commit) via b05b69d335ff22bac802782a16183b9fc97dab12 (commit) via 3a13cf00e625ebab6518dfbe545549225e72e5ad (commit) via 03ede3b6648742aa25378a7da57c94dfd60e7f94 (commit) via 296a4804b2c2d75e904ce620071b6aa1747f18f5 (commit) via d4e2e55ecbcae1b3e2797d50c310bd5f3f1219fb (commit) via c056288a515a3c320152b64e31d841b49e241a5e (commit) via 94ea17b04ef17529a66de9fe951c2095c4ff1aae (commit) via 3970d69b53711f0575683343c869e5e6dfd84c5c (commit) via 71ba7f047af5d7dbdbb5287146802b07b5970d82 (commit) via 3ea4ee8f50c165af648cf7d7c8ff71a5b81d2d50 (commit) via 54302b0b32f35c34ad29e58ac9aa163ff4cd4d08 (commit) via 1c94e275e7b0add9545fcd5a3efe7ca175f22747 (commit) via 8180e6e744070aa983041caad246ae8d29f7f9e8 (commit) via 62d3e5a87be00d3998fddc357a7fad9acd90e061 (commit) via 8fd6387446eadfea0a406ac0ab7f41fa89e2eb84 (commit) via 715edf7186ba9eac3b573714cba8801212e6bb52 (commit) via 95ac89486dec6f31ed0840ed7167349f4273c243 (commit) via 60ccc433a461862a6e8d8c1c88a60c7c78068be0 (commit) via 6cecdfd3b30e09fcabddfd00d3086b9f3de62cbc (commit) via 5de13fab3f9d7986534edbb08f6488124990f45f (commit) via b265af4375d1868ef2b199c536b2fa679e35539d (commit) via 05384c6a05e81cbacfde52605672a2d87acbcd6a (commit) via e4e773b5a1b5f731f293c3b8a481df18829ce27f (commit) via dbd5de39c35a19e086e4680271674dbaa4141b67 (commit) via 56648b6ea8d0f36b33336fc6f36a4be9da2f07d2 (commit) via 8a3fec65c055440b594468c98dfae68e2e9d99ef 
(commit) via 6c9638d7f6a65affb70813ee74220744c4e18d56 (commit) via 7bc7601b37435d81b945fd89d71e8a3b1ccea05f (commit) via 50197c86bd6223364ed17cb8e26d68bc3829bddf (commit) via 6433e94cae17fe65a317e2d3285cbadea0900164 (commit) via b2a59774c29aab0fd57ee5e55bd910ced7fc59bf (commit) via 00ae1abffcb95c547d86fa8131af7aeebb1de05a (commit) via 9a7116eeea0a3ce1fae43c67448d80a0bfaae352 (commit) via 0a8769638a6ab9aea6bfb25b4d1a073511df1d4a (commit) via 6f6a4d6e2adfc21ce374f83887272f5c524f1a34 (commit) via da6c8b453d35630d21cb6e965a9f80e0e409283d (commit) via 6763c5088b9edcd9811fe23eb22599d5515307d3 (commit) via d6374c413c0bb0fea6b2e1537e742e2ae137ead0 (commit) via 0fb29e5111a2de924e5839e8ea84979e4105f5f4 (commit) via 7fa926a961fb04ef1dd28dc6c460530007d6c2c3 (commit) via 96d31539572e3a2cf670862a538cd2bc514fd876 (commit) via 83eda6d7ea0bcfa8f94bf7a607ecf5afdc0c390c (commit) via 1ba8046dc870ba2c8cf56f4a6d2ac7509cc00dc2 (commit) via 041b7e3253263be70d8bb36fe58d2345b81d7772 (commit) via 5910df676a86ce99231d385e03652fd2d2867666 (commit) via 2a37acfee98d8c711ce6d35e20a736db425500bb (commit) via 4b7399b7722c3eabdb247bf61d7a22ff9e237743 (commit) via c68e8b698c6ba20c57a7525d12fd4c93a6d7cefe (commit) via 6549ab4bddf29ae58bd06443f2701f454276efeb (commit) via a40049bcdcfa96c97e49eeba5598f8a18bb2d5d3 (commit) via bae7c55d569be3185bc940d2c6ca7b54ea32093b (commit) via cbd4f1b71dd935f3459e84678218870bc8bd6ab8 (commit) from 07a3dce533cd254d470ea1fbacfc1fe92d42e40d (commit) - Log ----------------------------------------------------------------- commit dd8e15cc5988aaa71b519724ab3d59e3e82f42e5 Author: Pavan Deolasee <pav...@gm...> Date: Tue May 24 16:58:38 2011 +0530 Fix a merge issue introduced while picking PGXC-commits post 9.0.3 merge from the master branch diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c index 5a3945e..99ecd14 100644 --- a/src/backend/parser/parse_func.c +++ b/src/backend/parser/parse_func.c @@ -1581,55 +1581,6 @@ LookupAggNameTypeNames(List *aggname, List *argtypes, bool noError) return oid; } -/* - * pg_get_expr() is a system function that exposes the expression - * deparsing functionality in ruleutils.c to users. Very handy, but it was - * later realized that the functions in ruleutils.c don't check the input - * rigorously, assuming it to come from system catalogs and to therefore - * be valid. That makes it easy for a user to crash the backend by passing - * a maliciously crafted string representation of an expression to - * pg_get_expr(). - * - * There's a lot of code in ruleutils.c, so it's not feasible to add - * water-proof input checking after the fact. Even if we did it once, it - * would need to be taken into account in any future patches too. - * - * Instead, we restrict pg_rule_expr() to only allow input from system - * catalogs. This is a hack, but it's the most robust and easiest - * to backpatch way of plugging the vulnerability. - * - * This is transparent to the typical usage pattern of - * "pg_get_expr(systemcolumn, ...)", but will break "pg_get_expr('foo', - * ...)", even if 'foo' is a valid expression fetched earlier from a - * system catalog. Hopefully there aren't many clients doing that out there. 
- */ -void -check_pg_get_expr_args(ParseState *pstate, Oid fnoid, List *args) -{ - Node *arg; - - /* if not being called for pg_get_expr, do nothing */ - if (fnoid != F_PG_GET_EXPR && fnoid != F_PG_GET_EXPR_EXT) - return; - - /* superusers are allowed to call it anyway (dubious) */ - if (superuser()) - return; - - /* - * The first argument must be a Var referencing one of the allowed - * system-catalog columns. It could be a join alias Var or subquery - * reference Var, though, so we need a recursive subroutine to chase - * through those possibilities. - */ - Assert(list_length(args) > 1); - arg = (Node *) linitial(args); - - if (!check_pg_get_expr_arg(pstate, arg, 0)) - ereport(ERROR, - (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), - errmsg("argument to pg_get_expr() must come from system catalogs"))); -} #ifdef PGXC /* @@ -1726,85 +1677,3 @@ IsParseFuncImmutable(ParseState *pstate, List *targs, List *funcname, bool func_ } #endif -static bool -check_pg_get_expr_arg(ParseState *pstate, Node *arg, int netlevelsup) -{ - if (arg && IsA(arg, Var)) - { - Var *var = (Var *) arg; - RangeTblEntry *rte; - AttrNumber attnum; - - netlevelsup += var->varlevelsup; - rte = GetRTEByRangeTablePosn(pstate, var->varno, netlevelsup); - attnum = var->varattno; - - if (rte->rtekind == RTE_JOIN) - { - /* Recursively examine join alias variable */ - if (attnum > 0 && - attnum <= list_length(rte->joinaliasvars)) - { - arg = (Node *) list_nth(rte->joinaliasvars, attnum - 1); - return check_pg_get_expr_arg(pstate, arg, netlevelsup); - } - } - else if (rte->rtekind == RTE_SUBQUERY) - { - /* Subselect-in-FROM: examine sub-select's output expr */ - TargetEntry *ste = get_tle_by_resno(rte->subquery->targetList, - attnum); - ParseState mypstate; - - if (ste == NULL || ste->resjunk) - elog(ERROR, "subquery %s does not have attribute %d", - rte->eref->aliasname, attnum); - arg = (Node *) ste->expr; - - /* - * Recurse into the sub-select to see what its expr refers to. - * We have to build an additional level of ParseState to keep in - * step with varlevelsup in the subselect. - */ - MemSet(&mypstate, 0, sizeof(mypstate)); - mypstate.parentParseState = pstate; - mypstate.p_rtable = rte->subquery->rtable; - /* don't bother filling the rest of the fake pstate */ - - return check_pg_get_expr_arg(&mypstate, arg, 0); - } - else if (rte->rtekind == RTE_RELATION) - { - switch (rte->relid) - { - case IndexRelationId: - if (attnum == Anum_pg_index_indexprs || - attnum == Anum_pg_index_indpred) - return true; - break; - - case AttrDefaultRelationId: - if (attnum == Anum_pg_attrdef_adbin) - return true; - break; - - case ProcedureRelationId: - if (attnum == Anum_pg_proc_proargdefaults) - return true; - break; - - case ConstraintRelationId: - if (attnum == Anum_pg_constraint_conbin) - return true; - break; - - case TypeRelationId: - if (attnum == Anum_pg_type_typdefaultbin) - return true; - break; - } - } - } - - return false; -} commit df91a4341c34cfb5c63fec787e5602ac5e1bdc6d Author: Ashutosh Bapat <ash...@en...> Date: Thu May 19 14:45:02 2011 +0530 While copying the message from datanode to a slot, copy it within the memory context of the slot. Fix some compiler warnings. 
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c index 9f6adff..87302b4 100644 --- a/src/backend/executor/execTuples.c +++ b/src/backend/executor/execTuples.c @@ -466,64 +466,6 @@ ExecStoreMinimalTuple(MinimalTuple mtup, return slot; } -#ifdef PGXC -/* -------------------------------- - * ExecStoreDataRowTuple - * - * Store a buffer in DataRow message format into the slot. - * - * -------------------------------- - */ -TupleTableSlot * -ExecStoreDataRowTuple(char *msg, size_t len, int node, TupleTableSlot *slot, - bool shouldFree) -{ - /* - * sanity checks - */ - Assert(msg != NULL); - Assert(len > 0); - Assert(slot != NULL); - Assert(slot->tts_tupleDescriptor != NULL); - - /* - * Free any old physical tuple belonging to the slot. - */ - if (slot->tts_shouldFree) - heap_freetuple(slot->tts_tuple); - if (slot->tts_shouldFreeMin) - heap_free_minimal_tuple(slot->tts_mintuple); - if (slot->tts_shouldFreeRow) - pfree(slot->tts_dataRow); - - /* - * Drop the pin on the referenced buffer, if there is one. - */ - if (BufferIsValid(slot->tts_buffer)) - ReleaseBuffer(slot->tts_buffer); - - slot->tts_buffer = InvalidBuffer; - - /* - * Store the new tuple into the specified slot. - */ - slot->tts_isempty = false; - slot->tts_shouldFree = false; - slot->tts_shouldFreeMin = false; - slot->tts_shouldFreeRow = shouldFree; - slot->tts_tuple = NULL; - slot->tts_mintuple = NULL; - slot->tts_dataRow = msg; - slot->tts_dataLen = len; - slot->tts_dataNode = node; - - /* Mark extracted state invalid */ - slot->tts_nvalid = 0; - - return slot; -} -#endif - /* -------------------------------- * ExecClearTuple * @@ -1416,3 +1358,68 @@ end_tup_output(TupOutputState *tstate) ExecDropSingleTupleTableSlot(tstate->slot); pfree(tstate); } + +#ifdef PGXC +/* -------------------------------- + * ExecStoreDataRowTuple + * + * Store a buffer in DataRow message format into the slot. + * + * -------------------------------- + */ +TupleTableSlot * +ExecStoreDataRowTuple(char *msg, size_t len, int node, TupleTableSlot *slot, + bool shouldFree) +{ + /* + * sanity checks + */ + Assert(msg != NULL); + Assert(len > 0); + Assert(slot != NULL); + Assert(slot->tts_tupleDescriptor != NULL); + + /* + * Free any old physical tuple belonging to the slot. + */ + if (slot->tts_shouldFree) + heap_freetuple(slot->tts_tuple); + if (slot->tts_shouldFreeMin) + heap_free_minimal_tuple(slot->tts_mintuple); + /* + * if msg == slot->tts_dataRow then we would + * free the dataRow in the slot loosing the contents in msg. It is safe + * to reset shouldFreeRow, since it will be overwritten just below. + */ + if (msg == slot->tts_dataRow) + slot->tts_shouldFreeRow = false; + if (slot->tts_shouldFreeRow) + pfree(slot->tts_dataRow); + + /* + * Drop the pin on the referenced buffer, if there is one. + */ + if (BufferIsValid(slot->tts_buffer)) + ReleaseBuffer(slot->tts_buffer); + + slot->tts_buffer = InvalidBuffer; + + /* + * Store the new tuple into the specified slot. 
+ */ + slot->tts_isempty = false; + slot->tts_shouldFree = false; + slot->tts_shouldFreeMin = false; + slot->tts_shouldFreeRow = shouldFree; + slot->tts_tuple = NULL; + slot->tts_mintuple = NULL; + slot->tts_dataRow = msg; + slot->tts_dataLen = len; + slot->tts_dataNode = node; + + /* Mark extracted state invalid */ + slot->tts_nvalid = 0; + + return slot; +} +#endif diff --git a/src/backend/pgxc/pool/execRemote.c b/src/backend/pgxc/pool/execRemote.c index 99b05ed..335c05f 100644 --- a/src/backend/pgxc/pool/execRemote.c +++ b/src/backend/pgxc/pool/execRemote.c @@ -49,7 +49,7 @@ #define PRIMARY_NODE_WRITEAHEAD 1024 * 1024 static bool autocommit = true; -static is_ddl = false; +static bool is_ddl = false; static bool implicit_force_autocommit = false; static PGXCNodeHandle **write_node_list = NULL; static int write_node_count = 0; @@ -420,7 +420,6 @@ create_tuple_desc(char *msg_body, size_t len) char *typname; Oid oidtypeid; int32 typemode, typmod; - uint32 n32; attnum = (AttrNumber) i; @@ -1152,6 +1151,27 @@ BufferConnection(PGXCNodeHandle *conn) } /* + * copy the datarow from combiner to the given slot, in the slot's memory + * context + */ +static void +CopyDataRowTupleToSlot(RemoteQueryState *combiner, TupleTableSlot *slot) +{ + char *msg; + MemoryContext oldcontext; + oldcontext = MemoryContextSwitchTo(slot->tts_mcxt); + msg = (char *)palloc(combiner->currentRow.msglen); + memcpy(msg, combiner->currentRow.msg, combiner->currentRow.msglen); + ExecStoreDataRowTuple(msg, combiner->currentRow.msglen, + combiner->currentRow.msgnode, slot, true); + pfree(combiner->currentRow.msg); + combiner->currentRow.msg = NULL; + combiner->currentRow.msglen = 0; + combiner->currentRow.msgnode = 0; + MemoryContextSwitchTo(oldcontext); +} + +/* * Get next data row from the combiner's buffer into provided slot * Just clear slot and return false if buffer is empty, that means end of result * set is reached @@ -1164,12 +1184,7 @@ FetchTuple(RemoteQueryState *combiner, TupleTableSlot *slot) /* If we have message in the buffer, consume it */ if (combiner->currentRow.msg) { - ExecStoreDataRowTuple(combiner->currentRow.msg, - combiner->currentRow.msglen, - combiner->currentRow.msgnode, slot, true); - combiner->currentRow.msg = NULL; - combiner->currentRow.msglen = 0; - combiner->currentRow.msgnode = 0; + CopyDataRowTupleToSlot(combiner, slot); have_tuple = true; } @@ -1189,6 +1204,10 @@ FetchTuple(RemoteQueryState *combiner, TupleTableSlot *slot) * completed. Afterwards rows will be taken from the buffer bypassing * currentRow until buffer is empty, and only after that data are read * from a connection. + * PGXCTODO: the message should be allocated in the same memory context as + * that of the slot. Are we sure of that in the call to + * ExecStoreDataRowTuple below? If one fixes this memory issue, please + * consider using CopyDataRowTupleToSlot() for the same. 
*/ if (list_length(combiner->rowBuffer) > 0) { @@ -1279,12 +1298,7 @@ FetchTuple(RemoteQueryState *combiner, TupleTableSlot *slot) /* If we have message in the buffer, consume it */ if (combiner->currentRow.msg) { - ExecStoreDataRowTuple(combiner->currentRow.msg, - combiner->currentRow.msglen, - combiner->currentRow.msgnode, slot, true); - combiner->currentRow.msg = NULL; - combiner->currentRow.msglen = 0; - combiner->currentRow.msgnode = 0; + CopyDataRowTupleToSlot(combiner, slot); have_tuple = true; } @@ -3762,7 +3776,7 @@ handle_results: natts = resultslot->tts_tupleDescriptor->natts; for (i = 0; i < natts; ++i) { - if (resultslot->tts_values[i] == NULL) + if (resultslot->tts_values[i] == (Datum) NULL) return NULL; } commit 351a1751b7ee8c17d080fe0de9c9bef4bdbc653d Author: Michael P <mic...@us...> Date: Wed May 11 18:29:55 2011 +0900 Support for single-prepared PL/PGSQL functions This commit fixes primarily problems like in bug 3138450 (cache lookup for type 0) where XC was not able to set up plpgsql parameter values because values were not correctly fetched. This commit does not yet solve the special case of multiple uses of same plpgsql datum within a SQL command. PL/PGSQL functions using subqueries are out of scope for the moment due to XC's restrictions regarding multi-prepared statements. diff --git a/src/backend/pgxc/pool/execRemote.c b/src/backend/pgxc/pool/execRemote.c index a8a1070..99b05ed 100644 --- a/src/backend/pgxc/pool/execRemote.c +++ b/src/backend/pgxc/pool/execRemote.c @@ -4057,17 +4057,45 @@ ParamListToDataRow(ParamListInfo params, char** result) StringInfoData buf; uint16 n16; int i; + int real_num_params = params->numParams; + + /* + * It is necessary to fetch parameters + * before looking at the output value. + */ + for (i = 0; i < params->numParams; i++) + { + ParamExternData *param; + + param = ¶ms->params[i]; + + if (!OidIsValid(param->ptype) && params->paramFetch != NULL) + (*params->paramFetch) (params, i + 1); + + /* + * In case parameter type is not defined, it is not necessary to include + * it in message sent to backend nodes. + */ + if (!OidIsValid(param->ptype)) + real_num_params--; + } initStringInfo(&buf); + /* Number of parameter values */ - n16 = htons(params->numParams); + n16 = htons(real_num_params); appendBinaryStringInfo(&buf, (char *) &n16, 2); /* Parameter values */ for (i = 0; i < params->numParams; i++) { - ParamExternData *param = params->params + i; + ParamExternData *param = ¶ms->params[i]; uint32 n32; + + /* If parameter has no type defined it is not necessary to include it in message */ + if (!OidIsValid(param->ptype)) + continue; + if (param->isnull) { n32 = htonl(-1); commit 7d2a58c8e4cafbcfe48741317813783603b8fb3f Author: Abbas <abb...@en...> Date: Wed May 4 13:26:10 2011 +0500 This patch fixes a problem in XC that INSERTS/UPDATES in catalog tables were not possible from psql prompt. The problem was in XC planner. XC planner should first check if all the tables in the query are catalog tables then it should invoke standard plannner. This change enables us to remove a temp fix in GetRelationLocInfo. Also a query is added in system_views.sql to add a corresponding entry in pgxc_class. RelationBuildDesc is asked to include bootstrap objetcs too while building location info. 
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index 2edaf48..083a6d8 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -155,6 +155,8 @@ CREATE SCHEMA __pgxc_datanode_schema__; create table __pgxc_coordinator_schema__.pg_prepared_xacts ( transaction xid, gid text, prepared timestamptz, owner name, database name ); +INSERT INTO pgxc_class VALUES((SELECT oid FROM pg_class WHERE relkind = 'r' AND relname = 'pg_prepared_xacts'), 'N', 0,0,0); + CREATE VIEW __pgxc_datanode_schema__.pg_prepared_xacts AS SELECT P.transaction, P.gid, P.prepared, U.rolname AS owner, D.datname AS database diff --git a/src/backend/pgxc/locator/locator.c b/src/backend/pgxc/locator/locator.c index 4116476..1eff17c 100644 --- a/src/backend/pgxc/locator/locator.c +++ b/src/backend/pgxc/locator/locator.c @@ -754,37 +754,6 @@ GetRelationLocInfo(Oid relid) Relation rel = relation_open(relid, AccessShareLock); - /* This check has been added as a temp fix for CREATE TABLE not adding entry in pgxc_class - * when run from system_views.sql - */ - if ( rel != NULL && - rel->rd_rel != NULL && - rel->rd_rel->relkind == RELKIND_RELATION && - rel->rd_rel->relname.data != NULL && - (strcmp(rel->rd_rel->relname.data, PREPARED_XACTS_TABLE) == 0) ) - { - namespace = get_namespace_name(rel->rd_rel->relnamespace); - - if (namespace != NULL && (strcmp(namespace, PGXC_COORDINATOR_SCHEMA) == 0)) - { - RelationLocInfo *dest_info; - - dest_info = (RelationLocInfo *) palloc0(sizeof(RelationLocInfo)); - - dest_info->relid = relid; - dest_info->locatorType = 'N'; - dest_info->nodeCount = NumDataNodes; - dest_info->nodeList = GetAllDataNodes(); - - relation_close(rel, AccessShareLock); - pfree(namespace); - - return dest_info; - } - - if (namespace != NULL) pfree(namespace); - } - if (rel && rel->rd_locator_info) ret_loc_info = CopyRelationLocInfo(rel->rd_locator_info); diff --git a/src/backend/pgxc/plan/planner.c b/src/backend/pgxc/plan/planner.c index 8f24bbe..2da079f 100644 --- a/src/backend/pgxc/plan/planner.c +++ b/src/backend/pgxc/plan/planner.c @@ -2895,6 +2895,12 @@ pgxc_planner(Query *query, int cursorOptions, ParamListInfo boundParams) if (query->commandType != CMD_SELECT) result->resultRelations = list_make1_int(query->resultRelation); + if (contains_only_pg_catalog (query->rtable)) + { + result = standard_planner(query, cursorOptions, boundParams); + return result; + } + if (query_step->exec_nodes == NULL) get_plan_nodes_command(query_step, root); diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c index a2a1d4d..b79e41a 100644 --- a/src/backend/utils/cache/relcache.c +++ b/src/backend/utils/cache/relcache.c @@ -889,7 +889,7 @@ RelationBuildDesc(Oid targetRelId, bool insertIt) relation->trigdesc = NULL; #ifdef PGXC - if (IS_PGXC_COORDINATOR && relation->rd_id >= FirstNormalObjectId) + if (IS_PGXC_COORDINATOR && relation->rd_id >= FirstBootstrapObjectId) RelationBuildLocator(relation); #endif /* commit b931435761b21f27aef8aca7e7e319bd0bee3a3a Author: Abbas <abb...@en...> Date: Tue May 3 22:29:51 2011 +0500 This patch makes the group by on XC work. The changes are as follows 1. The application of final function at coordinator is enabled though AggState execution. Till now final function was being applied during execution of RemoteQuery only, if there were aggregates in target list of remote query. This only worked in certain cases of aggregates (expressions involving aggregates, aggregation of join results etc. 
being some of the exceptions). With this change the way grouping works the same way as PG except a. the data comes from remote nodes in the form of tuples b. the aggregates go through three steps transition, collection (extra step to collect the data across the nodes) and finalization. 2. Till now, the collection and transition result type for some aggregates like sum, count, regr_count were different. I have added a function int8_sum__to_int8() which adds to int8 datums and converts the result into int8 datum. This function is used as collection function for these aggregates so that collection and transition functions have same result types. 3. Changed some of the alternate outputs to correct results now that grouping is working. Commented out test join, since it's crashing with grouping enabled. The test has a query which involves aggregates, group by and order by. The crash is happening because of order by and aggregates. Earlier the test didn't crash since GROUPING as disabled and the query would throw error, but now with grouping is enabled, the crash occurs. Bug id 3284321 tracks the crash. 4. Added new test xc_groupby.sql to test the grouping in XC with round robin and replicated tables with some simple aggregates like sum, count and avg. All work done by Ashutosh Bapat. diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c index be2007b..437bf20 100644 --- a/src/backend/executor/nodeAgg.c +++ b/src/backend/executor/nodeAgg.c @@ -89,6 +89,7 @@ #include "optimizer/tlist.h" #include "parser/parse_agg.h" #include "parser/parse_coerce.h" +#include "pgxc/pgxc.h" #include "utils/acl.h" #include "utils/builtins.h" #include "utils/lsyscache.h" @@ -121,6 +122,9 @@ typedef struct AggStatePerAggData /* Oids of transfer functions */ Oid transfn_oid; Oid finalfn_oid; /* may be InvalidOid */ +#ifdef PGXC + Oid collectfn_oid; /* may be InvalidOid */ +#endif /* PGXC */ /* * fmgr lookup data for transfer functions --- only valid when @@ -129,6 +133,9 @@ typedef struct AggStatePerAggData */ FmgrInfo transfn; FmgrInfo finalfn; +#ifdef PGXC + FmgrInfo collectfn; +#endif /* PGXC */ /* number of sorting columns */ int numSortCols; @@ -154,6 +161,10 @@ typedef struct AggStatePerAggData */ Datum initValue; bool initValueIsNull; +#ifdef PGXC + Datum initCollectValue; + bool initCollectValueIsNull; +#endif /* PGXC */ /* * We need the len and byval info for the agg's input, result, and @@ -165,9 +176,15 @@ typedef struct AggStatePerAggData int16 inputtypeLen, resulttypeLen, transtypeLen; +#ifdef PGXC + int16 collecttypeLen; +#endif /* PGXC */ bool inputtypeByVal, resulttypeByVal, transtypeByVal; +#ifdef PGXC + bool collecttypeByVal; +#endif /* PGXC */ /* * Stuff for evaluation of inputs. We used to just use ExecEvalExpr, but @@ -725,6 +742,55 @@ finalize_aggregate(AggState *aggstate, MemoryContext oldContext; oldContext = MemoryContextSwitchTo(aggstate->ss.ps.ps_ExprContext->ecxt_per_tuple_memory); +#ifdef PGXC + /* + * PGXCTODO: see PGXCTODO item in advance_collect_function + * this step is needed in case the transition function does not produce + * result consumable by final function and need collection function to be + * applied on transition function results. Usually results by both functions + * should be consumable by final function. + * As such this step is meant only to convert transition results into form + * consumable by final function, the step does not actually do any + * collection. 
+ */ + if (OidIsValid(peraggstate->collectfn_oid)) + { + FunctionCallInfoData fcinfo; + InitFunctionCallInfoData(fcinfo, &(peraggstate->collectfn), 2, + (void *) aggstate, NULL); + /* + * copy the initial datum since it might get changed inside the + * collection function + */ + if (peraggstate->initCollectValueIsNull) + fcinfo.arg[0] = peraggstate->initCollectValue; + else + fcinfo.arg[0] = datumCopy(peraggstate->initCollectValue, + peraggstate->collecttypeByVal, + peraggstate->collecttypeLen); + fcinfo.argnull[0] = peraggstate->initCollectValueIsNull; + fcinfo.arg[1] = pergroupstate->transValue; + fcinfo.argnull[1] = pergroupstate->transValueIsNull; + if (fcinfo.flinfo->fn_strict && + (pergroupstate->transValueIsNull || peraggstate->initCollectValueIsNull)) + { + pergroupstate->transValue = (Datum)0; + pergroupstate->transValueIsNull = true; + } + else + { + Datum newVal = FunctionCallInvoke(&fcinfo); + + /* + * set the result of collection function to the transValue so that code + * below invoking final function does not change + */ + /* PGXCTODO: worry about the memory management here? */ + pergroupstate->transValue = newVal; + pergroupstate->transValueIsNull = fcinfo.isnull; + } + } +#endif /* PGXC */ /* * Apply the agg's finalfn if one is provided, else return transValue. @@ -1546,6 +1612,10 @@ ExecInitAgg(Agg *node, EState *estate, int eflags) AclResult aclresult; Oid transfn_oid, finalfn_oid; +#ifdef PGXC + Oid collectfn_oid; + Expr *collectfnexpr; +#endif /* PGXC */ Expr *transfnexpr, *finalfnexpr; Datum textInitVal; @@ -1612,13 +1682,19 @@ ExecInitAgg(Agg *node, EState *estate, int eflags) get_func_name(aggref->aggfnoid)); peraggstate->transfn_oid = transfn_oid = aggform->aggtransfn; -#ifdef PGXC - /* For PGXC final function is executed when combining, disable it here */ - peraggstate->finalfn_oid = finalfn_oid = InvalidOid; -#else peraggstate->finalfn_oid = finalfn_oid = aggform->aggfinalfn; -#endif - +#ifdef PGXC + peraggstate->collectfn_oid = collectfn_oid = aggform->aggcollectfn; + /* + * For PGXC final and collection functions are used to combine results at coordinator, + * disable those for data node + */ + if (IS_PGXC_DATANODE) + { + peraggstate->finalfn_oid = finalfn_oid = InvalidOid; + peraggstate->collectfn_oid = collectfn_oid = InvalidOid; + } +#endif /* PGXC */ /* Check that aggregate owner has permission to call component fns */ { HeapTuple procTuple; @@ -1645,6 +1721,17 @@ ExecInitAgg(Agg *node, EState *estate, int eflags) aclcheck_error(aclresult, ACL_KIND_PROC, get_func_name(finalfn_oid)); } + +#ifdef PGXC + if (OidIsValid(collectfn_oid)) + { + aclresult = pg_proc_aclcheck(collectfn_oid, aggOwner, + ACL_EXECUTE); + if (aclresult != ACLCHECK_OK) + aclcheck_error(aclresult, ACL_KIND_PROC, + get_func_name(collectfn_oid)); + } +#endif /* PGXC */ } /* resolve actual type of transition state, if polymorphic */ @@ -1675,6 +1762,32 @@ ExecInitAgg(Agg *node, EState *estate, int eflags) finalfn_oid, &transfnexpr, &finalfnexpr); +#ifdef PGXC + if (OidIsValid(collectfn_oid)) + { + /* we expect final function expression to be NULL in call to + * build_aggregate_fnexprs below, since InvalidOid is passed for + * finalfn_oid argument. Use a dummy expression to accept that. + */ + Expr *dummyexpr; + /* + * for XC, we need to setup the collection function expression as well. + * Use the same function with invalid final function oid, and collection + * function information instead of transition function information. 
+ * PGXCTODO: we should really be adding this step inside + * build_aggregate_fnexprs() but this way it becomes easy to merge. + */ + build_aggregate_fnexprs(&aggform->aggtranstype, + 1, + aggform->aggcollecttype, + aggref->aggtype, + collectfn_oid, + InvalidOid, + &collectfnexpr, + &dummyexpr); + Assert(!dummyexpr); + } +#endif /* PGXC */ fmgr_info(transfn_oid, &peraggstate->transfn); peraggstate->transfn.fn_expr = (Node *) transfnexpr; @@ -1685,12 +1798,25 @@ ExecInitAgg(Agg *node, EState *estate, int eflags) peraggstate->finalfn.fn_expr = (Node *) finalfnexpr; } +#ifdef PGXC + if (OidIsValid(collectfn_oid)) + { + fmgr_info(collectfn_oid, &peraggstate->collectfn); + peraggstate->collectfn.fn_expr = (Node *)collectfnexpr; + } +#endif /* PGXC */ + get_typlenbyval(aggref->aggtype, &peraggstate->resulttypeLen, &peraggstate->resulttypeByVal); get_typlenbyval(aggtranstype, &peraggstate->transtypeLen, &peraggstate->transtypeByVal); +#ifdef PGXC + get_typlenbyval(aggform->aggcollecttype, + &peraggstate->collecttypeLen, + &peraggstate->collecttypeByVal); +#endif /* PGXC */ /* * initval is potentially null, so don't try to access it as a struct @@ -1706,6 +1832,23 @@ ExecInitAgg(Agg *node, EState *estate, int eflags) peraggstate->initValue = GetAggInitVal(textInitVal, aggtranstype); +#ifdef PGXC + /* + * initval for collection function is potentially null, so don't try to + * access it as a struct field. Must do it the hard way with + * SysCacheGetAttr. + */ + textInitVal = SysCacheGetAttr(AGGFNOID, aggTuple, + Anum_pg_aggregate_agginitcollect, + &peraggstate->initCollectValueIsNull); + + if (peraggstate->initCollectValueIsNull) + peraggstate->initCollectValue = (Datum) 0; + else + peraggstate->initCollectValue = GetAggInitVal(textInitVal, + aggform->aggcollecttype); +#endif /* PGXC */ + /* * If the transfn is strict and the initval is NULL, make sure input * type and transtype are the same (or at least binary-compatible), so diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c index 7bc9a11..ad6c6f5 100644 --- a/src/backend/optimizer/plan/planmain.c +++ b/src/backend/optimizer/plan/planmain.c @@ -291,12 +291,6 @@ query_planner(PlannerInfo *root, List *tlist, { List *groupExprs; -#ifdef PGXC - ereport(ERROR, - (errcode(ERRCODE_STATEMENT_TOO_COMPLEX), - (errmsg("GROUP BY clause is not yet supported")))); -#endif - groupExprs = get_sortgrouplist_exprs(parse->groupClause, parse->targetList); *num_groups = estimate_num_groups(root, diff --git a/src/backend/pgxc/plan/planner.c b/src/backend/pgxc/plan/planner.c index d386ded..8f24bbe 100644 --- a/src/backend/pgxc/plan/planner.c +++ b/src/backend/pgxc/plan/planner.c @@ -2977,17 +2977,16 @@ pgxc_planner(Query *query, int cursorOptions, ParamListInfo boundParams) } /* - * Use standard plan if we have more than one data node with either - * group by, hasWindowFuncs, or hasRecursive - */ - /* * PGXCTODO - this could be improved to check if the first * group by expression is the partitioning column, in which * case it is ok to treat as a single step. + * PGXCTODO - whatever number of nodes involved in the query, grouping, + * windowing and recursive queries take place at the coordinator. The + * corresponding planner should be able to optimize the queries such that + * most of the query is pushed to datanode, based on the kind of + * distribution the table has. 
*/ if (query->commandType == CMD_SELECT - && query_step->exec_nodes - && list_length(query_step->exec_nodes->nodelist) > 1 && (query->groupClause || query->hasWindowFuncs || query->hasRecursive)) { result = standard_planner(query, cursorOptions, boundParams); diff --git a/src/backend/pgxc/pool/execRemote.c b/src/backend/pgxc/pool/execRemote.c index 61a6263..a8a1070 100644 --- a/src/backend/pgxc/pool/execRemote.c +++ b/src/backend/pgxc/pool/execRemote.c @@ -337,6 +337,14 @@ advance_collect_function(SimpleAgg *simple_agg, FunctionCallInfoData *fcinfo) * result has not been initialized * We must copy the datum into result if it is pass-by-ref. We * do not need to pfree the old result, since it's NULL. + * PGXCTODO: in case the transition result type is different from + * collection result type, this code would not work, since we are + * assigning datum of one type to another. For this code to work the + * input and output of collection function needs to be binary + * compatible which is not. So, either check in AggregateCreate, + * that the input and output of collection function are binary + * coercible or set the initial values something non-null or change + * this code */ simple_agg->collectValue = datumCopy(fcinfo->arg[1], simple_agg->transtypeByVal, diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c index a36fb63..754da6e 100644 --- a/src/backend/utils/adt/numeric.c +++ b/src/backend/utils/adt/numeric.c @@ -2795,6 +2795,36 @@ int8_sum(PG_FUNCTION_ARGS) NumericGetDatum(oldsum), newval)); } +/* + * similar to int8_sum, except that the result is casted into int8 + */ +Datum +int8_sum_to_int8(PG_FUNCTION_ARGS) +{ + Datum result_num; + Datum numeric_arg; + + /* if both arguments are null, the result is null */ + if (PG_ARGISNULL(0) && PG_ARGISNULL(1)) + PG_RETURN_NULL(); + + /* if either of them is null, the other is the result */ + if (PG_ARGISNULL(0)) + PG_RETURN_DATUM(PG_GETARG_DATUM(1)); + + if (PG_ARGISNULL(1)) + PG_RETURN_DATUM(PG_GETARG_DATUM(0)); + + /* + * convert the first argument to numeric (second one is converted into + * numeric) + * add both the arguments using int8_sum + * convert the result into int8 using numeric_int8 + */ + numeric_arg = DirectFunctionCall1(int8_numeric, PG_GETARG_DATUM(0)); + result_num = DirectFunctionCall2(int8_sum, numeric_arg, PG_GETARG_DATUM(1)); + PG_RETURN_DATUM(DirectFunctionCall1(numeric_int8, result_num)); +} /* * Routines for avg(int2) and avg(int4). 
The transition datatype diff --git a/src/include/catalog/pg_aggregate.h b/src/include/catalog/pg_aggregate.h index 443b135..57b0b71 100644 --- a/src/include/catalog/pg_aggregate.h +++ b/src/include/catalog/pg_aggregate.h @@ -130,14 +130,14 @@ DATA(insert ( 2106 interval_accum interval_avg 0 1187 "{0 second,0 second}" )); /* sum */ #ifdef PGXC -DATA(insert ( 2107 int8_sum numeric_add - 0 1700 1700 _null_ _null_ )); -DATA(insert ( 2108 int4_sum int8_sum 1779 0 20 1700 _null_ _null_ )); -DATA(insert ( 2109 int2_sum int8_sum 1779 0 20 1700 _null_ _null_ )); -DATA(insert ( 2110 float4pl float4pl - 0 700 700 _null_ _null_ )); -DATA(insert ( 2111 float8pl float8pl - 0 701 701 _null_ _null_ )); +DATA(insert ( 2107 int8_sum numeric_add - 0 1700 1700 _null_ "0" )); +DATA(insert ( 2108 int4_sum int8_sum_to_int8 - 0 20 20 _null_ _null_ )); +DATA(insert ( 2109 int2_sum int8_sum_to_int8 - 0 20 20 _null_ _null_ )); +DATA(insert ( 2110 float4pl float4pl - 0 700 700 _null_ "0" )); +DATA(insert ( 2111 float8pl float8pl - 0 701 701 _null_ "0" )); DATA(insert ( 2112 cash_pl cash_pl - 0 790 790 _null_ _null_ )); DATA(insert ( 2113 interval_pl interval_pl - 0 1186 1186 _null_ _null_ )); -DATA(insert ( 2114 numeric_add numeric_add - 0 1700 1700 _null_ _null_ )); +DATA(insert ( 2114 numeric_add numeric_add - 0 1700 1700 _null_ "0" )); #else DATA(insert ( 2107 int8_sum - 0 1700 _null_ )); DATA(insert ( 2108 int4_sum - 0 20 _null_ )); @@ -242,8 +242,8 @@ DATA(insert ( 3527 enum_smaller - 3518 3500 _null_ )); /* count */ /* Final function is data type conversion function numeric_int8 is refernced by OID because of ambiguous defininition in pg_proc */ #ifdef PGXC -DATA(insert ( 2147 int8inc_any int8_sum 1779 0 20 1700 "0" _null_ )); -DATA(insert ( 2803 int8inc int8_sum 1779 0 20 1700 "0" _null_ )); +DATA(insert ( 2147 int8inc_any int8_sum_to_int8 - 0 20 20 "0" _null_ )); +DATA(insert ( 2803 int8inc int8_sum_to_int8 - 0 20 20 "0" _null_ )); #else DATA(insert ( 2147 int8inc_any - 0 20 "0" )); DATA(insert ( 2803 int8inc - 0 20 "0" )); @@ -353,7 +353,7 @@ DATA(insert ( 2159 numeric_accum numeric_stddev_samp 0 1231 "{0,0,0}" )); /* SQL2003 binary regression aggregates */ #ifdef PGXC -DATA(insert ( 2818 int8inc_float8_float8 int8_sum 1779 0 20 1700 "0" _null_ )); +DATA(insert ( 2818 int8inc_float8_float8 int8_sum_to_int8 - 0 20 20 "0" _null_ )); DATA(insert ( 2819 float8_regr_accum float8_regr_collect float8_regr_sxx 0 1022 1022 "{0,0,0,0,0,0}" "{0,0,0,0,0,0}" )); DATA(insert ( 2820 float8_regr_accum float8_regr_collect float8_regr_syy 0 1022 1022 "{0,0,0,0,0,0}" "{0,0,0,0,0,0}" )); DATA(insert ( 2821 float8_regr_accum float8_regr_collect float8_regr_sxy 0 1022 1022 "{0,0,0,0,0,0}" "{0,0,0,0,0,0}" )); diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h index f24e0bc..7ae0b73 100644 --- a/src/include/catalog/pg_proc.h +++ b/src/include/catalog/pg_proc.h @@ -2792,7 +2792,8 @@ DESCR("SUM(int2) transition function"); DATA(insert OID = 1841 ( int4_sum PGNSP PGUID 12 1 0 0 f f f f f i 2 0 20 "20 23" _null_ _null_ _null_ _null_ int4_sum _null_ _null_ _null_ )); DESCR("SUM(int4) transition function"); DATA(insert OID = 1842 ( int8_sum PGNSP PGUID 12 1 0 0 f f f f f i 2 0 1700 "1700 20" _null_ _null_ _null_ _null_ int8_sum _null_ _null_ _null_ )); -DESCR("SUM(int8) transition function"); +DATA(insert OID = 3037 ( int8_sum_to_int8 PGNSP PGUID 12 1 0 0 f f f f f i 2 0 20 "20 20" _null_ _null_ _null_ _null_ int8_sum_to_int8 _null_ _null_ _null_ )); +DESCR("SUM(int*) collection function"); DATA(insert OID = 1843 
( interval_accum PGNSP PGUID 12 1 0 0 f f f t f i 2 0 1187 "1187 1186" _null_ _null_ _null_ _null_ interval_accum _null_ _null_ _null_ )); DESCR("aggregate transition function"); DATA(insert OID = 1844 ( interval_avg PGNSP PGUID 12 1 0 0 f f f t f i 1 0 1186 "1187" _null_ _null_ _null_ _null_ interval_avg _null_ _null_ _null_ )); diff --git a/src/include/pgxc/execRemote.h b/src/include/pgxc/execRemote.h index 7ccef33..405325b 100644 --- a/src/include/pgxc/execRemote.h +++ b/src/include/pgxc/execRemote.h @@ -101,6 +101,12 @@ typedef struct RemoteQueryState * to initialize collecting of aggregates from the DNs */ bool initAggregates; + /* + * PGXCTODO - + * we should get rid of the simple_aggregates member, that should work + * through Agg node and grouping_planner should take care of optimizing it + * to the fullest + */ List *simple_aggregates; /* description of aggregate functions */ void *tuplesortstate; /* for merge sort */ /* Simple DISTINCT support */ diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h index 3e3637d..93d0c31 100644 --- a/src/include/utils/builtins.h +++ b/src/include/utils/builtins.h @@ -940,6 +940,7 @@ extern Datum numeric_stddev_samp(PG_FUNCTION_ARGS); extern Datum int2_sum(PG_FUNCTION_ARGS); extern Datum int4_sum(PG_FUNCTION_ARGS); extern Datum int8_sum(PG_FUNCTION_ARGS); +extern Datum int8_sum_to_int8(PG_FUNCTION_ARGS); extern Datum int2_avg_accum(PG_FUNCTION_ARGS); extern Datum int4_avg_accum(PG_FUNCTION_ARGS); #ifdef PGXC diff --git a/src/test/regress/expected/opr_sanity_1.out b/src/test/regress/expected/opr_sanity_1.out index 885cb13..bf70944 100644 --- a/src/test/regress/expected/opr_sanity_1.out +++ b/src/test/regress/expected/opr_sanity_1.out @@ -709,14 +709,9 @@ WHERE a.aggfnoid = p.oid AND OR NOT binary_coercible(pfn.prorettype, p.prorettype) OR pfn.pronargs != 1 OR NOT binary_coercible(a.aggtranstype, pfn.proargtypes[0])); - aggfnoid | proname | oid | proname -----------+------------+------+--------- - 2108 | sum | 1779 | int8 - 2109 | sum | 1779 | int8 - 2147 | count | 1779 | int8 - 2803 | count | 1779 | int8 - 2818 | regr_count | 1779 | int8 -(5 rows) + aggfnoid | proname | oid | proname +----------+---------+-----+--------- +(0 rows) -- If transfn is strict then either initval should be non-NULL, or -- input type should match transtype so that the first non-null input @@ -1120,7 +1115,10 @@ FROM pg_am am JOIN pg_opclass op ON opcmethod = am.oid WHERE am.amname <> 'gin' GROUP BY amname, amsupport, opcname, amprocfamily HAVING count(*) != amsupport OR amprocfamily IS NULL; -ERROR: GROUP BY clause is not yet supported + amname | opcname | count +--------+---------+------- +(0 rows) + SELECT amname, opcname, count(*) FROM pg_am am JOIN pg_opclass op ON opcmethod = am.oid LEFT JOIN pg_amproc p ON amprocfamily = opcfamily AND @@ -1128,7 +1126,10 @@ FROM pg_am am JOIN pg_opclass op ON opcmethod = am.oid WHERE am.amname = 'gin' GROUP BY amname, amsupport, opcname, amprocfamily HAVING count(*) < amsupport - 1 OR amprocfamily IS NULL; -ERROR: GROUP BY clause is not yet supported + amname | opcname | count +--------+---------+------- +(0 rows) + -- Unfortunately, we can't check the amproc link very well because the -- signature of the function may be different for different support routines -- or different base data types. 
diff --git a/src/test/regress/expected/with_1.out b/src/test/regress/expected/with_1.out index 5ae3440..7048e51 100644 --- a/src/test/regress/expected/with_1.out +++ b/src/test/regress/expected/with_1.out @@ -247,7 +247,11 @@ WITH q1(x,y) AS ( SELECT hundred, sum(ten) FROM tenk1 GROUP BY hundred ) SELECT count(*) FROM q1 WHERE y > (SELECT sum(y)/100 FROM q1 qsub); -ERROR: GROUP BY clause is not yet supported + count +------- + 50 +(1 row) + -- via a VIEW CREATE TEMPORARY VIEW vsubdepartment AS WITH RECURSIVE subdepartment AS diff --git a/src/test/regress/expected/xc_groupby.out b/src/test/regress/expected/xc_groupby.out new file mode 100644 index 0000000..58f9ea7 --- /dev/null +++ b/src/test/regress/expected/xc_groupby.out @@ -0,0 +1,475 @@ +-- create required tables and fill them with data +create table tab1 (val int, val2 int); +create table tab2 (val int, val2 int); +insert into tab1 values (1, 1), (2, 1), (3, 1), (2, 2), (6, 2), (4, 3), (1, 3), (6, 3); +insert into tab2 values (1, 1), (4, 1), (8, 1), (2, 4), (9, 4), (3, 4), (4, 2), (5, 2), (3, 2); +select count(*), sum(val), avg(val), sum(val)::float8/count(*), val2 from tab1 group by val2; + count | sum | avg | ?column? | val2 +-------+-----+--------------------+------------------+------ + 3 | 6 | 2.0000000000000000 | 2 | 1 + 2 | 8 | 4.0000000000000000 | 4 | 2 + 3 | 11 | 3.6666666666666667 | 3.66666666666667 | 3 +(3 rows) + +-- joins and group by +select count(*), sum(tab1.val * tab2.val), avg(tab1.val*tab2.val), sum(tab1.val*tab2.val)::float8/count(*), tab1.val2, tab2.val2 from tab1 full outer join tab2 on tab1.val2 = tab2.val2 group by tab1.val2, tab2.val2; + count | sum | avg | ?column? | val2 | val2 +-------+-----+---------------------+------------------+------+------ + 6 | 96 | 16.0000000000000000 | 16 | 2 | 2 + 9 | 78 | 8.6666666666666667 | 8.66666666666667 | 1 | 1 + 3 | | | | 3 | + 3 | | | | | 4 +(4 rows) + +-- aggregates over aggregates +select sum(y) from (select sum(val) y, val2%2 x from tab1 group by val2) q1 group by x; + sum +----- + 8 + 17 +(2 rows) + +-- group by without aggregate, just like distinct? +select val2 from tab1 group by val2; + val2 +------ + 1 + 2 + 3 +(3 rows) + +-- group by with aggregates in expression +select count(*) + sum(val) + avg(val), val2 from tab1 group by val2; + ?column? | val2 +---------------------+------ + 11.0000000000000000 | 1 + 14.0000000000000000 | 2 + 17.6666666666666667 | 3 +(3 rows) + +-- group by with expressions in group by clause +select sum(val), avg(val), 2 * val2 from tab1 group by 2 * val2; + sum | avg | ?column? +-----+--------------------+---------- + 11 | 3.6666666666666667 | 6 + 6 | 2.0000000000000000 | 2 + 8 | 4.0000000000000000 | 4 +(3 rows) + +drop table tab1; +drop table tab2; +-- repeat the same tests for replicated tables +-- create required tables and fill them with data +create table tab1 (val int, val2 int) distribute by replication; +create table tab2 (val int, val2 int) distribute by replication; +insert into tab1 values (1, 1), (2, 1), (3, 1), (2, 2), (6, 2), (4, 3), (1, 3), (6, 3); +insert into tab2 values (1, 1), (4, 1), (8, 1), (2, 4), (9, 4), (3, 4), (4, 2), (5, 2), (3, 2); +select count(*), sum(val), avg(val), sum(val)::float8/count(*), val2 from tab1 group by val2; + count | sum | avg | ?column? 
| val2 +-------+-----+--------------------+------------------+------ + 3 | 6 | 2.0000000000000000 | 2 | 1 + 2 | 8 | 4.0000000000000000 | 4 | 2 + 3 | 11 | 3.6666666666666667 | 3.66666666666667 | 3 +(3 rows) + +-- joins and group by +select count(*), sum(tab1.val * tab2.val), avg(tab1.val*tab2.val), sum(tab1.val*tab2.val)::float8/count(*), tab1.val2, tab2.val2 from tab1 full outer join tab2 on tab1.val2 = tab2.val2 group by tab1.val2, tab2.val2; + count | sum | avg | ?column? | val2 | val2 +-------+-----+---------------------+------------------+------+------ + 6 | 96 | 16.0000000000000000 | 16 | 2 | 2 + 9 | 78 | 8.6666666666666667 | 8.66666666666667 | 1 | 1 + 3 | | | | 3 | + 3 | | | | | 4 +(4 rows) + +-- aggregates over aggregates +select sum(y) from (select sum(val) y, val2%2 x from tab1 group by val2) q1 group by x; + sum +----- + 8 + 17 +(2 rows) + +-- group by without aggregate, just like distinct? +select val2 from tab1 group by val2; + val2 +------ + 1 + 2 + 3 +(3 rows) + +-- group by with aggregates in expression +select count(*) + sum(val) + avg(val), val2 from tab1 group by val2; + ?column? | val2 +---------------------+------ + 11.0000000000000000 | 1 + 14.0000000000000000 | 2 + 17.6666666666666667 | 3 +(3 rows) + +-- group by with expressions in group by clause +select sum(val), avg(val), 2 * val2 from tab1 group by 2 * val2; + sum | avg | ?column? +-----+--------------------+---------- + 11 | 3.6666666666666667 | 6 + 6 | 2.0000000000000000 | 2 + 8 | 4.0000000000000000 | 4 +(3 rows) + +drop table tab1; +drop table tab2; +-- some tests involving nulls, characters, float type etc. +create table def(a int, b varchar(25)); +insert into def VALUES (NULL, NULL); +insert into def VALUES (1, NULL); +insert into def VALUES (NULL, 'One'); +insert into def VALUES (2, 'Two'); +insert into def VALUES (2, 'Two'); +insert into def VALUES (3, 'Three'); +insert into def VALUES (4, 'Three'); +insert into def VALUES (5, 'Three'); +insert into def VALUES (6, 'Two'); +insert into def VALUES (7, NULL); +insert into def VALUES (8, 'Two'); +insert into def VALUES (9, 'Three'); +insert into def VALUES (10, 'Three'); +select a,count(a) from def group by a order by a; + a | count +----+------- + 1 | 1 + 2 | 2 + 3 | 1 + 4 | 1 + 5 | 1 + 6 | 1 + 7 | 1 + 8 | 1 + 9 | 1 + 10 | 1 + | 0 +(11 rows) + +select avg(a) from def group by a; + avg +------------------------ + + 6.0000000000000000 + 5.0000000000000000 + 8.0000000000000000 + 1.00000000000000000000 + 9.0000000000000000 + 2.0000000000000000 + 7.0000000000000000 + 10.0000000000000000 + 3.0000000000000000 + 4.0000000000000000 +(11 rows) + +select avg(a) from def group by a; + avg +------------------------ + + 6.0000000000000000 + 5.0000000000000000 + 8.0000000000000000 + 1.00000000000000000000 + 9.0000000000000000 + 2.0000000000000000 + 7.0000000000000000 + 10.0000000000000000 + 3.0000000000000000 + 4.0000000000000000 +(11 rows) + +select avg(a) from def group by b; + avg +-------------------- + 4.0000000000000000 + + 4.5000000000000000 + 6.2000000000000000 +(4 rows) + +select sum(a) from def group by b; + sum +----- + 8 + + 18 + 31 +(4 rows) + +select count(*) from def group by b; + count +------- + 3 + 1 + 4 + 5 +(4 rows) + +select count(*) from def where a is not null group by a; + count +------- + 1 + 1 + 1 + 1 + 1 + 1 + 2 + 1 + 1 + 1 +(10 rows) + +select b from def group by b; + b +------- + + One + Two + Three +(4 rows) + +select b,count(b) from def group by b; + b | count +-------+------- + | 0 + One | 1 + Two | 4 + Three | 5 +(4 rows) + +select count(*) 
from def where b is null group by b; + count +------- + 3 +(1 row) + +create table g(a int, b float, c numeric); +insert into g values(1,2.1,3.2); +insert into g values(1,2.1,3.2); +insert into g values(2,2.3,5.2); +select sum(a) from g group by a; + sum +----- + 2 + 2 +(2 rows) + +select sum(b) from g group by b; + sum +----- + 2.3 + 4.2 +(2 rows) + +select sum(c) from g group by b; + sum +----- + 5.2 + 6.4 +(2 rows) + +select avg(a) from g group by b; + avg +------------------------ + 2.0000000000000000 + 1.00000000000000000000 +(2 rows) + +select avg(b) from g group by c; + avg +----- + 2.3 + 2.1 +(2 rows) + +select avg(c) from g group by c; + avg +-------------------- + 5.2000000000000000 + 3.2000000000000000 +(2 rows) + +drop table def; +drop table g; +-- same test with replicated tables +create table def(a int, b varchar(25)) distribute by replication; +insert into def VALUES (NULL, NULL); +insert into def VALUES (1, NULL); +insert into def VALUES (NULL, 'One'); +insert into def VALUES (2, 'Two'); +insert into def VALUES (2, 'Two'); +insert into def VALUES (3, 'Three'); +insert into def VALUES (4, 'Three'); +insert into def VALUES (5, 'Three'); +insert into def VALUES (6, 'Two'); +insert into def VALUES (7, NULL); +insert into def VALUES (8, 'Two'); +insert into def VALUES (9, 'Three'); +insert into def VALUES (10, 'Three'); +select a,count(a) from def group by a order by a; + a | count +----+------- + 1 | 1 + 2 | 2 + 3 | 1 + 4 | 1 + 5 | 1 + 6 | 1 + 7 | 1 + 8 | 1 + 9 | 1 + 10 | 1 + | 0 +(11 rows) + +select avg(a) from def group by a; + avg +------------------------ + + 6.0000000000000000 + 5.0000000000000000 + 8.0000000000000000 + 1.00000000000000000000 + 2.0000000000000000 + 9.0000000000000000 + 3.0000000000000000 + 7.0000000000000000 + 10.0000000000000000 + 4.0000000000000000 +(11 rows) + +select avg(a) from def group by a; + avg +------------------------ + + 6.0000000000000000 + 5.0000000000000000 + 8.0000000000000000 + 1.00000000000000000000 + 2.0000000000000000 + 9.0000000000000000 + 3.0000000000000000 + 7.0000000000000000 + 10.0000000000000000 + 4.0000000000000000 +(11 rows) + +select avg(a) from def group by b; + avg +-------------------- + 4.0000000000000000 + + 4.5000000000000000 + 6.2000000000000000 +(4 rows) + +select sum(a) from def group by b; + sum +----- + 8 + + 18 + 31 +(4 rows) + +select count(*) from def group by b; + count +------- + 3 + 1 + 4 + 5 +(4 rows) + +select count(*) from def where a is not null group by a; + count +------- + 1 + 1 + 1 + 1 + 1 + 2 + 1 + 1 + 1 + 1 +(10 rows) + +select b from def group by b; + b +------- + + One + Two + Three +(4 rows) + +select b,count(b) from def group by b; + b | count +-------+------- + | 0 + One | 1 + Two | 4 + Three | 5 +(4 rows) + +select count(*) from def where b is null group by b; + count +------- + 3 +(1 row) + +create table g(a int, b float, c numeric) distribute by replication; +insert into g values(1,2.1,3.2); +insert into g values(1,2.1,3.2); +insert into g values(2,2.3,5.2); +select sum(a) from g group by a; + sum +----- + 2 + 2 +(2 rows) + +select sum(b) from g group by ... [truncated message content] |
From: Abbas B. <ga...@us...> - 2011-05-24 12:08:45
|
Project "Postgres-XC". The branch, master has been updated via 49b66c77343ae1e9921118e0c902b1528f1cc2ae (commit) from 87a62879ab3492e3dd37d00478ffa857639e2b85 (commit) - Log ----------------------------------------------------------------- commit 49b66c77343ae1e9921118e0c902b1528f1cc2ae Author: Abbas <abb...@en...> Date: Tue May 24 17:06:30 2011 +0500 This patch adds support for the following data types to be used as distribution key INT8, INT2, OID, INT4, BOOL, INT2VECTOR, OIDVECTOR CHAR, NAME, TEXT, BPCHAR, BYTEA, VARCHAR FLOAT4, FLOAT8, NUMERIC, CASH ABSTIME, RELTIME, DATE, TIME, TIMESTAMP, TIMESTAMPTZ, INTERVAL, TIMETZ A new function compute_hash is added in the system which is used to compute hash of a any of the supported data types. The computed hash is used in the function GetRelationNodes to find the targeted data node. EXPLAIN for RemoteQuery has been modified to show the number of data nodes targeted for a certain query. This is essential to spot bugs in the optimizer in case it is targeting all nodes by mistake. In case of optimisations where comparison with a constant leads the optimiser to point to a single data node, there were a couple of mistakes in examine_conditions_walker. First it was not supporting RelabelType, which represents a "dummy" type coercion between two binary compatible datatypes. This was resulting in the optimization not working for varchar type for example. Secondly it was not catering for the case where the user specifies the condition such that the constant expression is written towards LHS and the variable towards the RHS of the = operator. i.e. 23 = a A number of test cases have been added in regression to make sure further enhancements do not break this functionality. This change has a sizeable impact on current regression tests in the following manner. 1. horology test case crashes the server and has been commented out in serial_schedule. 2. In money test case the planner optimizer wrongly kicks in to optimize this query SELECT m = '$123.01' FROM money_data; to point to a single data node. 3. There were a few un-necessary EXPLAINs in create_index test case. Since we have added support in EXPLAIN to show the number of data nodes targeted for RemoteQuery, this test case was producing output dependent on the cluster configuration. 4. 
In guc test case DROP ROLE temp_reset_user; results in ERROR: permission denied to drop role diff --git a/src/backend/access/hash/hashfunc.c b/src/backend/access/hash/hashfunc.c index 577873b..22766c5 100644 --- a/src/backend/access/hash/hashfunc.c +++ b/src/backend/access/hash/hashfunc.c @@ -28,6 +28,13 @@ #include "access/hash.h" +#ifdef PGXC +#include "catalog/pg_type.h" +#include "utils/builtins.h" +#include "utils/timestamp.h" +#include "utils/date.h" +#include "utils/nabstime.h" +#endif /* Note: this is used for both "char" and boolean datatypes */ Datum @@ -521,3 +528,91 @@ hash_uint32(uint32 k) /* report the result */ return UInt32GetDatum(c); } + +#ifdef PGXC +/* + * compute_hash() -- Generaic hash function for all datatypes + * + */ + +Datum +compute_hash(Oid type, Datum value, int *pErr) +{ + Assert(pErr); + + *pErr = 0; + + if (value == NULL) + { + *pErr = 1; + return 0; + } + + switch(type) + { + case INT8OID: + /* This gives added advantage that + * a = 8446744073709551359 + * and a = 8446744073709551359::int8 both work*/ + return DatumGetInt32(value); + case INT2OID: + return DatumGetInt16(value); + case OIDOID: + return DatumGetObjectId(value); + case INT4OID: + return DatumGetInt32(value); + case BOOLOID: + return DatumGetBool(value); + + case CHAROID: + return DirectFunctionCall1(hashchar, value); + case NAMEOID: + return DirectFunctionCall1(hashname, value); + case INT2VECTOROID: + return DirectFunctionCall1(hashint2vector, value); + + case VARCHAROID: + case TEXTOID: + return DirectFunctionCall1(hashtext, value); + + case OIDVECTOROID: + return DirectFunctionCall1(hashoidvector, value); + case FLOAT4OID: + return DirectFunctionCall1(hashfloat4, value); + case FLOAT8OID: + return DirectFunctionCall1(hashfloat8, value); + + case ABSTIMEOID: + return DatumGetAbsoluteTime(value); + case RELTIMEOID: + return DatumGetRelativeTime(value); + case CASHOID: + return DirectFunctionCall1(hashint8, value); + + case BPCHAROID: + return DirectFunctionCall1(hashbpchar, value); + case BYTEAOID: + return DirectFunctionCall1(hashvarlena, value); + + case DATEOID: + return DatumGetDateADT(value); + case TIMEOID: + return DirectFunctionCall1(time_hash, value); + case TIMESTAMPOID: + return DirectFunctionCall1(timestamp_hash, value); + case TIMESTAMPTZOID: + return DirectFunctionCall1(timestamp_hash, value); + case INTERVALOID: + return DirectFunctionCall1(interval_hash, value); + case TIMETZOID: + return DirectFunctionCall1(timetz_hash, value); + + case NUMERICOID: + return DirectFunctionCall1(hash_numeric, value); + default: + *pErr = 1; + return 0; + } +} + +#endif diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c index 613d5ff..714190f 100644 --- a/src/backend/commands/copy.c +++ b/src/backend/commands/copy.c @@ -1645,14 +1645,14 @@ CopyTo(CopyState cstate) } #ifdef PGXC - if (IS_PGXC_COORDINATOR && cstate->rel_loc) + if (IS_PGXC_COORDINATOR && cstate->rel_loc) { cstate->processed = DataNodeCopyOut( - GetRelationNodes(cstate->rel_loc, NULL, RELATION_ACCESS_READ), + GetRelationNodes(cstate->rel_loc, 0, UNKNOWNOID, RELATION_ACCESS_READ), cstate->connections, cstate->copy_file); } - else + else { #endif @@ -2417,15 +2417,18 @@ CopyFrom(CopyState cstate) #ifdef PGXC if (IS_PGXC_COORDINATOR && cstate->rel_loc) { - Datum *dist_col_value = NULL; + Datum dist_col_value; + Oid dist_col_type = UNKNOWNOID; if (cstate->idx_dist_by_col >= 0 && !nulls[cstate->idx_dist_by_col]) - dist_col_value = &values[cstate->idx_dist_by_col]; + { + dist_col_value = 
values[cstate->idx_dist_by_col]; + dist_col_type = attr[cstate->idx_dist_by_col]->atttypid; + } if (DataNodeCopyIn(cstate->line_buf.data, cstate->line_buf.len, - GetRelationNodes(cstate->rel_loc, (long *)dist_col_value, - RELATION_ACCESS_INSERT), + GetRelationNodes(cstate->rel_loc, dist_col_value, dist_col_type, RELATION_ACCESS_INSERT), cstate->connections)) ereport(ERROR, (errcode(ERRCODE_CONNECTION_EXCEPTION), @@ -4037,7 +4040,8 @@ DoInsertSelectCopy(EState *estate, TupleTableSlot *slot) HeapTuple tuple; Datum *values; bool *nulls; - Datum *dist_col_value = NULL; + Datum dist_col_value; + Oid dist_col_type; MemoryContext oldcontext; CopyState cstate; @@ -4082,6 +4086,11 @@ DoInsertSelectCopy(EState *estate, TupleTableSlot *slot) cstate->fe_msgbuf = makeStringInfo(); attr = cstate->tupDesc->attrs; + if (cstate->idx_dist_by_col >= 0) + dist_col_type = attr[cstate->idx_dist_by_col]->atttypid; + else + dist_col_type = UNKNOWNOID; + /* Get info about the columns we need to process. */ cstate->out_functions = (FmgrInfo *) palloc(cstate->tupDesc->natts * sizeof(FmgrInfo)); foreach(lc, cstate->attnumlist) @@ -4152,12 +4161,14 @@ DoInsertSelectCopy(EState *estate, TupleTableSlot *slot) /* Get dist column, if any */ if (cstate->idx_dist_by_col >= 0 && !nulls[cstate->idx_dist_by_col]) - dist_col_value = &values[cstate->idx_dist_by_col]; + dist_col_value = values[cstate->idx_dist_by_col]; + else + dist_col_type = UNKNOWNOID; /* Send item to the appropriate data node(s) (buffer) */ if (DataNodeCopyIn(cstate->fe_msgbuf->data, cstate->fe_msgbuf->len, - GetRelationNodes(cstate->rel_loc, (long *)dist_col_value, RELATION_ACCESS_INSERT), + GetRelationNodes(cstate->rel_loc, dist_col_value, dist_col_type, RELATION_ACCESS_INSERT), cstate->connections)) ereport(ERROR, (errcode(ERRCODE_CONNECTION_EXCEPTION), diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c index a361186..fe74569 100644 --- a/src/backend/commands/explain.c +++ b/src/backend/commands/explain.c @@ -851,8 +851,28 @@ ExplainNode(Plan *plan, PlanState *planstate, case T_WorkTableScan: #ifdef PGXC case T_RemoteQuery: + { + RemoteQuery *remote_query = (RemoteQuery *) plan; + int pnc, nc; + + pnc = 0; + nc = 0; + if (remote_query->exec_nodes != NULL) + { + if (remote_query->exec_nodes->primarynodelist != NULL) + { + pnc = list_length(remote_query->exec_nodes->primarynodelist); + appendStringInfo(es->str, " (Primary Node Count [%d])", pnc); + } + if (remote_query->exec_nodes->nodelist) + { + nc = list_length(remote_query->exec_nodes->nodelist); + appendStringInfo(es->str, " (Node Count [%d])", nc); + } + } #endif - ExplainScanTarget((Scan *) plan, es); + ExplainScanTarget((Scan *) plan, es); + } break; case T_BitmapIndexScan: { diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c index b6252a3..c03938d 100644 --- a/src/backend/optimizer/plan/createplan.c +++ b/src/backend/optimizer/plan/createplan.c @@ -2418,9 +2418,7 @@ create_remotequery_plan(PlannerInfo *root, Path *best_path, scan_plan->exec_nodes->baselocatortype = rel_loc_info->locatorType; else scan_plan->exec_nodes->baselocatortype = '\0'; - scan_plan->exec_nodes = GetRelationNodes(rel_loc_info, - NULL, - RELATION_ACCESS_READ); + scan_plan->exec_nodes = GetRelationNodes(rel_loc_info, 0, UNKNOWNOID, RELATION_ACCESS_READ); copy_path_costsize(&scan_plan->scan.plan, best_path); /* PGXCTODO - get better estimates */ @@ -5024,8 +5022,7 @@ create_remotedelete_plan(PlannerInfo *root, Plan *topplan) fstep->sql_statement = 
pstrdup(buf->data); fstep->combine_type = COMBINE_TYPE_SAME; fstep->read_only = false; - fstep->exec_nodes = GetRelationNodes(rel_loc_info, NULL, - RELATION_ACCESS_UPDATE); + fstep->exec_nodes = GetRelationNodes(rel_loc_info, 0, UNKNOWNOID, RELATION_ACCESS_UPDATE); } else { diff --git a/src/backend/pgxc/locator/locator.c b/src/backend/pgxc/locator/locator.c index 0ab157d..33fe8ac 100644 --- a/src/backend/pgxc/locator/locator.c +++ b/src/backend/pgxc/locator/locator.c @@ -41,7 +41,7 @@ #include "catalog/pgxc_class.h" #include "catalog/namespace.h" - +#include "access/hash.h" /* * PGXCTODO For prototype, relations use the same hash mapping table. @@ -206,7 +206,32 @@ char *pColName; bool IsHashDistributable(Oid col_type) { - if (col_type == INT4OID || col_type == INT2OID) + if(col_type == INT8OID + || col_type == INT2OID + || col_type == OIDOID + || col_type == INT4OID + || col_type == BOOLOID + || col_type == CHAROID + || col_type == NAMEOID + || col_type == INT2VECTOROID + || col_type == TEXTOID + || col_type == OIDVECTOROID + || col_type == FLOAT4OID + || col_type == FLOAT8OID + || col_type == ABSTIMEOID + || col_type == RELTIMEOID + || col_type == CASHOID + || col_type == BPCHAROID + || col_type == BYTEAOID + || col_type == VARCHAROID + || col_type == DATEOID + || col_type == TIMEOID + || col_type == TIMESTAMPOID + || col_type == TIMESTAMPTZOID + || col_type == INTERVALOID + || col_type == TIMETZOID + || col_type == NUMERICOID + ) return true; return false; @@ -296,7 +321,32 @@ RelationLocInfo *rel_loc_info; bool IsModuloDistributable(Oid col_type) { - if (col_type == INT4OID || col_type == INT2OID) + if(col_type == INT8OID + || col_type == INT2OID + || col_type == OIDOID + || col_type == INT4OID + || col_type == BOOLOID + || col_type == CHAROID + || col_type == NAMEOID + || col_type == INT2VECTOROID + || col_type == TEXTOID + || col_type == OIDVECTOROID + || col_type == FLOAT4OID + || col_type == FLOAT8OID + || col_type == ABSTIMEOID + || col_type == RELTIMEOID + || col_type == CASHOID + || col_type == BPCHAROID + || col_type == BYTEAOID + || col_type == VARCHAROID + || col_type == DATEOID + || col_type == TIMEOID + || col_type == TIMESTAMPOID + || col_type == TIMESTAMPTZOID + || col_type == INTERVALOID + || col_type == TIMETZOID + || col_type == NUMERICOID + ) return true; return false; @@ -409,13 +459,13 @@ GetRoundRobinNode(Oid relid) * The returned List is a copy, so it should be freed when finished. 
*/ ExecNodes * -GetRelationNodes(RelationLocInfo *rel_loc_info, long *partValue, - RelationAccessType accessType) +GetRelationNodes(RelationLocInfo *rel_loc_info, Datum valueForDistCol, Oid typeOfValueForDistCol, RelationAccessType accessType) { ListCell *prefItem; ListCell *stepItem; ExecNodes *exec_nodes; - + long hashValue; + int nError; if (rel_loc_info == NULL) return NULL; @@ -480,10 +530,10 @@ GetRelationNodes(RelationLocInfo *rel_loc_info, long *partValue, break; case LOCATOR_TYPE_HASH: - - if (partValue != NULL) + hashValue = compute_hash(typeOfValueForDistCol, valueForDistCol, &nError); + if (nError == 0) /* in prototype, all partitioned tables use same map */ - exec_nodes->nodelist = lappend_int(NULL, get_node_from_hash(hash_range_int(*partValue))); + exec_nodes->nodelist = lappend_int(NULL, get_node_from_hash(hash_range_int(hashValue))); else if (accessType == RELATION_ACCESS_INSERT) /* Insert NULL to node 1 */ @@ -494,9 +544,10 @@ GetRelationNodes(RelationLocInfo *rel_loc_info, long *partValue, break; case LOCATOR_TYPE_MODULO: - if (partValue != NULL) + hashValue = compute_hash(typeOfValueForDistCol, valueForDistCol, &nError); + if (nError == 0) /* in prototype, all partitioned tables use same map */ - exec_nodes->nodelist = lappend_int(NULL, get_node_from_modulo(compute_modulo(*partValue))); + exec_nodes->nodelist = lappend_int(NULL, get_node_from_modulo(compute_modulo(hashValue))); else if (accessType == RELATION_ACCESS_INSERT) /* Insert NULL to node 1 */ @@ -750,7 +801,6 @@ RelationLocInfo * GetRelationLocInfo(Oid relid) { RelationLocInfo *ret_loc_info = NULL; - char *namespace; Relation rel = relation_open(relid, AccessShareLock); diff --git a/src/backend/pgxc/plan/planner.c b/src/backend/pgxc/plan/planner.c index 2448a74..4873f19 100644 --- a/src/backend/pgxc/plan/planner.c +++ b/src/backend/pgxc/plan/planner.c @@ -43,20 +43,23 @@ #include "utils/lsyscache.h" #include "utils/portal.h" #include "utils/syscache.h" - +#include "utils/numeric.h" +#include "access/hash.h" +#include "utils/timestamp.h" +#include "utils/date.h" /* * Convenient format for literal comparisons * - * PGXCTODO - make constant type Datum, handle other types */ typedef struct { - Oid relid; - RelationLocInfo *rel_loc_info; - Oid attrnum; - char *col_name; - long constant; /* assume long PGXCTODO - should be Datum */ + Oid relid; + RelationLocInfo *rel_loc_info; + Oid attrnum; + char *col_name; + Datum constValue; + Oid constType; } Literal_Comparison; /* @@ -471,15 +474,12 @@ get_base_var(Var *var, XCWalkerContext *context) static void get_plan_nodes_insert(PlannerInfo *root, RemoteQuery *step) { - Query *query = root->parse; - RangeTblEntry *rte; - RelationLocInfo *rel_loc_info; - Const *constant; - ListCell *lc; - long part_value; - long *part_value_ptr = NULL; - Expr *eval_expr = NULL; - + Query *query = root->parse; + RangeTblEntry *rte; + RelationLocInfo *rel_loc_info; + Const *constant; + ListCell *lc; + Expr *eval_expr = NULL; step->exec_nodes = NULL; @@ -568,7 +568,7 @@ get_plan_nodes_insert(PlannerInfo *root, RemoteQuery *step) if (!lc) { /* Skip rest, handle NULL */ - step->exec_nodes = GetRelationNodes(rel_loc_info, NULL, RELATION_ACCESS_INSERT); + step->exec_nodes = GetRelationNodes(rel_loc_info, 0, UNKNOWNOID, RELATION_ACCESS_INSERT); return; } @@ -650,21 +650,11 @@ get_plan_nodes_insert(PlannerInfo *root, RemoteQuery *step) } constant = (Const *) checkexpr; - - if (constant->consttype == INT4OID || - constant->consttype == INT2OID || - constant->consttype == INT8OID) - { - part_value = 
(long) constant->constvalue; - part_value_ptr = &part_value; - } - /* PGXCTODO - handle other data types */ } } /* single call handles both replicated and partitioned types */ - step->exec_nodes = GetRelationNodes(rel_loc_info, part_value_ptr, - RELATION_ACCESS_INSERT); + step->exec_nodes = GetRelationNodes(rel_loc_info, constant->constvalue, constant->consttype, RELATION_ACCESS_INSERT); if (eval_expr) pfree(eval_expr); @@ -1047,6 +1037,28 @@ examine_conditions_walker(Node *expr_node, XCWalkerContext *context) { Expr *arg1 = linitial(opexpr->args); Expr *arg2 = lsecond(opexpr->args); + RelabelType *rt; + Expr *targ; + + if (IsA(arg1, RelabelType)) + { + rt = arg1; + arg1 = rt->arg; + } + + if (IsA(arg2, RelabelType)) + { + rt = arg2; + arg2 = rt->arg; + } + + /* Handle constant = var */ + if (IsA(arg2, Var)) + { + targ = arg1; + arg1 = arg2; + arg2 = targ; + } /* Look for a table */ if (IsA(arg1, Var)) @@ -1134,7 +1146,8 @@ examine_conditions_walker(Node *expr_node, XCWalkerContext *context) lit_comp->relid = column_base->relid; lit_comp->rel_loc_info = rel_loc_info1; lit_comp->col_name = column_base->colname; - lit_comp->constant = constant->constvalue; + lit_comp->constValue = constant->constvalue; + lit_comp->constType = constant->consttype; context->conditions->partitioned_literal_comps = lappend( context->conditions->partitioned_literal_comps, @@ -1742,9 +1755,7 @@ get_plan_nodes_walker(Node *query_node, XCWalkerContext *context) if (rel_loc_info->locatorType != LOCATOR_TYPE_HASH && rel_loc_info->locatorType != LOCATOR_TYPE_MODULO) /* do not need to determine partitioning expression */ - context->query_step->exec_nodes = GetRelationNodes(rel_loc_info, - NULL, - context->accessType); + context->query_step->exec_nodes = GetRelationNodes(rel_loc_info, 0, UNKNOWNOID, context->accessType); /* Note replicated table usage for determining safe queries */ if (context->query_step->exec_nodes) @@ -1800,9 +1811,7 @@ get_plan_nodes_walker(Node *query_node, XCWalkerContext *context) { Literal_Comparison *lit_comp = (Literal_Comparison *) lfirst(lc); - test_exec_nodes = GetRelationNodes( - lit_comp->rel_loc_info, &(lit_comp->constant), - RELATION_ACCESS_READ); + test_exec_nodes = GetRelationNodes(lit_comp->rel_loc_info, lit_comp->constValue, lit_comp->constType, RELATION_ACCESS_READ); test_exec_nodes->tableusagetype = table_usage_type; if (context->query_step->exec_nodes == NULL) @@ -1828,9 +1837,7 @@ get_plan_nodes_walker(Node *query_node, XCWalkerContext *context) parent_child = (Parent_Child_Join *) linitial(context->conditions->partitioned_parent_child); - context->query_step->exec_nodes = GetRelationNodes(parent_child->rel_loc_info1, - NULL, - context->accessType); + context->query_step->exec_nodes = GetRelationNodes(parent_child->rel_loc_info1, 0, UNKNOWNOID, context->accessType); context->query_step->exec_nodes->tableusagetype = table_usage_type; } @@ -3378,8 +3385,6 @@ GetHashExecNodes(RelationLocInfo *rel_loc_info, ExecNodes **exec_nodes, const Ex Expr *checkexpr; Expr *eval_expr = NULL; Const *constant; - long part_value; - long *part_value_ptr = NULL; eval_expr = (Expr *) eval_const_expressions(NULL, (Node *)expr); checkexpr = get_numeric_constant(eval_expr); @@ -3389,17 +3394,8 @@ GetHashExecNodes(RelationLocInfo *rel_loc_info, ExecNodes **exec_nodes, const Ex constant = (Const *) checkexpr; - if (constant->consttype == INT4OID || - constant->consttype == INT2OID || - constant->consttype == INT8OID) - { - part_value = (long) constant->constvalue; - part_value_ptr = &part_value; - } - /* 
single call handles both replicated and partitioned types */ - *exec_nodes = GetRelationNodes(rel_loc_info, part_value_ptr, - RELATION_ACCESS_INSERT); + *exec_nodes = GetRelationNodes(rel_loc_info, constant->constvalue, constant->consttype, RELATION_ACCESS_INSERT); if (eval_expr) pfree(eval_expr); diff --git a/src/backend/pgxc/pool/execRemote.c b/src/backend/pgxc/pool/execRemote.c index 75aca21..76e3eef 100644 --- a/src/backend/pgxc/pool/execRemote.c +++ b/src/backend/pgxc/pool/execRemote.c @@ -1061,7 +1061,8 @@ BufferConnection(PGXCNodeHandle *conn) RemoteQueryState *combiner = conn->combiner; MemoryContext oldcontext; - Assert(conn->state == DN_CONNECTION_STATE_QUERY && combiner); + if (combiner == NULL || conn->state != DN_CONNECTION_STATE_QUERY) + return; /* * When BufferConnection is invoked CurrentContext is related to other @@ -3076,9 +3077,8 @@ get_exec_connections(RemoteQueryState *planstate, if (!isnull) { RelationLocInfo *rel_loc_info = GetRelationLocInfo(exec_nodes->relid); - ExecNodes *nodes = GetRelationNodes(rel_loc_info, - (long *) &partvalue, - exec_nodes->accesstype); + /* PGXCTODO what is the type of partvalue here*/ + ExecNodes *nodes = GetRelationNodes(rel_loc_info, partvalue, UNKNOWNOID, exec_nodes->accesstype); if (nodes) { nodelist = nodes->nodelist; diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c index 415fc47..6d7939b 100644 --- a/src/backend/tcop/postgres.c +++ b/src/backend/tcop/postgres.c @@ -670,18 +670,18 @@ pg_analyze_and_rewrite(Node *parsetree, const char *query_string, querytree_list = pg_rewrite_query(query); #ifdef PGXC - if (IS_PGXC_COORDINATOR && !IsConnFromCoord()) - { - ListCell *lc; - - foreach(lc, querytree_list) - { - Query *query = (Query *) lfirst(lc); - - if (query->sql_statement == NULL) - query->sql_statement = pstrdup(query_string); - } - } + if (IS_PGXC_COORDINATOR && !IsConnFromCoord()) + { + ListCell *lc; + + foreach(lc, querytree_list) + { + Query *query = (Query *) lfirst(lc); + + if (query->sql_statement == NULL) + query->sql_statement = pstrdup(query_string); + } + } #endif TRACE_POSTGRESQL_QUERY_REWRITE_DONE(query_string); @@ -1043,7 +1043,7 @@ exec_simple_query(const char *query_string) querytree_list = pg_analyze_and_rewrite(parsetree, query_string, NULL, 0); - + plantree_list = pg_plan_queries(querytree_list, 0, NULL); /* Done with the snapshot used for parsing/planning */ diff --git a/src/include/access/hash.h b/src/include/access/hash.h index d5899f4..4aaffaa 100644 --- a/src/include/access/hash.h +++ b/src/include/access/hash.h @@ -353,4 +353,8 @@ extern OffsetNumber _hash_binsearch_last(Page page, uint32 hash_value); extern void hash_redo(XLogRecPtr lsn, XLogRecord *record); extern void hash_desc(StringInfo buf, uint8 xl_info, char *rec); +#ifdef PGXC +extern Datum compute_hash(Oid type, Datum value, int *pErr); +#endif + #endif /* HASH_H */ diff --git a/src/include/pgxc/locator.h b/src/include/pgxc/locator.h index 9f669d9..9ee983c 100644 --- a/src/include/pgxc/locator.h +++ b/src/include/pgxc/locator.h @@ -100,8 +100,7 @@ extern char ConvertToLocatorType(int disttype); extern char *GetRelationHashColumn(RelationLocInfo *rel_loc_info); extern RelationLocInfo *GetRelationLocInfo(Oid relid); extern RelationLocInfo *CopyRelationLocInfo(RelationLocInfo *src_info); -extern ExecNodes *GetRelationNodes(RelationLocInfo *rel_loc_info, long *partValue, - RelationAccessType accessType); +extern ExecNodes *GetRelationNodes(RelationLocInfo *rel_loc_info, Datum valueForDistCol, Oid typeOfValueForDistCol, 
RelationAccessType accessType); extern bool IsHashColumn(RelationLocInfo *rel_loc_info, char *part_col_name); extern bool IsHashColumnForRelId(Oid relid, char *part_col_name); extern int GetRoundRobinNode(Oid relid); diff --git a/src/test/regress/expected/create_index_1.out b/src/test/regress/expected/create_index_1.out index 52fdc91..ab3807c 100644 --- a/src/test/regress/expected/create_index_1.out +++ b/src/test/regress/expected/create_index_1.out @@ -174,15 +174,10 @@ SELECT count(*) FROM point_tbl p WHERE p.f1 ~= '(-5, -12)'; SET enable_seqscan = OFF; SET enable_indexscan = ON; SET enable_bitmapscan = ON; -EXPLAIN (COSTS OFF) -SELECT * FROM fast_emp4000 - WHERE home_base @ '(200,200),(2000,1000)'::box - ORDER BY (home_base[0])[0]; - QUERY PLAN ----------------- - Data Node Scan -(1 row) - +--EXPLAIN (COSTS OFF) +--SELECT * FROM fast_emp4000 +-- WHERE home_base @ '(200,200),(2000,1000)'::box +-- ORDER BY (home_base[0])[0]; SELECT * FROM fast_emp4000 WHERE home_base @ '(200,200),(2000,1000)'::box ORDER BY (home_base[0])[0]; @@ -190,40 +185,25 @@ SELECT * FROM fast_emp4000 ----------- (0 rows) -EXPLAIN (COSTS OFF) -SELECT count(*) FROM fast_emp4000 WHERE home_base && '(1000,1000,0,0)'::box; - QUERY PLAN ----------------- - Data Node Scan -(1 row) - +--EXPLAIN (COSTS OFF) +--SELECT count(*) FROM fast_emp4000 WHERE home_base && '(1000,1000,0,0)'::box; SELECT count(*) FROM fast_emp4000 WHERE home_base && '(1000,1000,0,0)'::box; count ------- 1 (1 row) -EXPLAIN (COSTS OFF) -SELECT count(*) FROM fast_emp4000 WHERE home_base IS NULL; - QUERY PLAN ----------------- - Data Node Scan -(1 row) - +--EXPLAIN (COSTS OFF) +--SELECT count(*) FROM fast_emp4000 WHERE home_base IS NULL; SELECT count(*) FROM fast_emp4000 WHERE home_base IS NULL; count ------- 138 (1 row) -EXPLAIN (COSTS OFF) -SELECT * FROM polygon_tbl WHERE f1 ~ '((1,1),(2,2),(2,1))'::polygon - ORDER BY (poly_center(f1))[0]; - QUERY PLAN ----------------- - Data Node Scan -(1 row) - +--EXPLAIN (COSTS OFF) +--SELECT * FROM polygon_tbl WHERE f1 ~ '((1,1),(2,2),(2,1))'::polygon +-- ORDER BY (poly_center(f1))[0]; SELECT * FROM polygon_tbl WHERE f1 ~ '((1,1),(2,2),(2,1))'::polygon ORDER BY (poly_center(f1))[0]; id | f1 @@ -231,14 +211,9 @@ SELECT * FROM polygon_tbl WHERE f1 ~ '((1,1),(2,2),(2,1))'::polygon 1 | ((2,0),(2,4),(0,0)) (1 row) -EXPLAIN (COSTS OFF) -SELECT * FROM circle_tbl WHERE f1 && circle(point(1,-2), 1) - ORDER BY area(f1); - QUERY PLAN ----------------- - Data Node Scan -(1 row) - +--EXPLAIN (COSTS OFF) +--SELECT * FROM circle_tbl WHERE f1 && circle(point(1,-2), 1) +-- ORDER BY area(f1); SELECT * FROM circle_tbl WHERE f1 && circle(point(1,-2), 1) ORDER BY area(f1); f1 @@ -269,9 +244,9 @@ LINE 1: SELECT count(*) FROM gcircle_tbl WHERE f1 && '<(500,500),500... 
^ EXPLAIN (COSTS OFF) SELECT count(*) FROM point_tbl WHERE f1 <@ box '(0,0,100,100)'; - QUERY PLAN ----------------- - Data Node Scan + QUERY PLAN +--------------------------------- + Data Node Scan (Node Count [1]) (1 row) SELECT count(*) FROM point_tbl WHERE f1 <@ box '(0,0,100,100)'; @@ -282,9 +257,9 @@ SELECT count(*) FROM point_tbl WHERE f1 <@ box '(0,0,100,100)'; EXPLAIN (COSTS OFF) SELECT count(*) FROM point_tbl WHERE box '(0,0,100,100)' @> f1; - QUERY PLAN ----------------- - Data Node Scan + QUERY PLAN +--------------------------------- + Data Node Scan (Node Count [1]) (1 row) SELECT count(*) FROM point_tbl WHERE box '(0,0,100,100)' @> f1; @@ -295,9 +270,9 @@ SELECT count(*) FROM point_tbl WHERE box '(0,0,100,100)' @> f1; EXPLAIN (COSTS OFF) SELECT count(*) FROM point_tbl WHERE f1 <@ polygon '(0,0),(0,100),(100,100),(50,50),(100,0),(0,0)'; - QUERY PLAN ----------------- - Data Node Scan + QUERY PLAN +--------------------------------- + Data Node Scan (Node Count [1]) (1 row) SELECT count(*) FROM point_tbl WHERE f1 <@ polygon '(0,0),(0,100),(100,100),(50,50),(100,0),(0,0)'; @@ -308,9 +283,9 @@ SELECT count(*) FROM point_tbl WHERE f1 <@ polygon '(0,0),(0,100),(100,100),(50, EXPLAIN (COSTS OFF) SELECT count(*) FROM point_tbl WHERE f1 <@ circle '<(50,50),50>'; - QUERY PLAN ----------------- - Data Node Scan + QUERY PLAN +--------------------------------- + Data Node Scan (Node Count [1]) (1 row) SELECT count(*) FROM point_tbl WHERE f1 <@ circle '<(50,50),50>'; @@ -321,9 +296,9 @@ SELECT count(*) FROM point_tbl WHERE f1 <@ circle '<(50,50),50>'; EXPLAIN (COSTS OFF) SELECT count(*) FROM point_tbl p WHERE p.f1 << '(0.0, 0.0)'; - QUERY PLAN ----------------- - Data Node Scan + QUERY PLAN +--------------------------------- + Data Node Scan (Node Count [1]) (1 row) SELECT count(*) FROM point_tbl p WHERE p.f1 << '(0.0, 0.0)'; @@ -334,9 +309,9 @@ SELECT count(*) FROM point_tbl p WHERE p.f1 << '(0.0, 0.0)'; EXPLAIN (COSTS OFF) SELECT count(*) FROM point_tbl p WHERE p.f1 >> '(0.0, 0.0)'; - QUERY PLAN ----------------- - Data Node Scan + QUERY PLAN +--------------------------------- + Data Node Scan (Node Count [1]) (1 row) SELECT count(*) FROM point_tbl p WHERE p.f1 >> '(0.0, 0.0)'; @@ -347,9 +322,9 @@ SELECT count(*) FROM point_tbl p WHERE p.f1 >> '(0.0, 0.0)'; EXPLAIN (COSTS OFF) SELECT count(*) FROM point_tbl p WHERE p.f1 <^ '(0.0, 0.0)'; - QUERY PLAN ----------------- - Data Node Scan + QUERY PLAN +--------------------------------- + Data Node Scan (Node Count [1]) (1 row) SELECT count(*) FROM point_tbl p WHERE p.f1 <^ '(0.0, 0.0)'; @@ -360,9 +335,9 @@ SELECT count(*) FROM point_tbl p WHERE p.f1 <^ '(0.0, 0.0)'; EXPLAIN (COSTS OFF) SELECT count(*) FROM point_tbl p WHERE p.f1 >^ '(0.0, 0.0)'; - QUERY PLAN ----------------- - Data Node Scan + QUERY PLAN +--------------------------------- + Data Node Scan (Node Count [1]) (1 row) SELECT count(*) FROM point_tbl p WHERE p.f1 >^ '(0.0, 0.0)'; @@ -373,9 +348,9 @@ SELECT count(*) FROM point_tbl p WHERE p.f1 >^ '(0.0, 0.0)'; EXPLAIN (COSTS OFF) SELECT count(*) FROM point_tbl p WHERE p.f1 ~= '(-5, -12)'; - QUERY PLAN ----------------- - Data Node Scan + QUERY PLAN +--------------------------------- + Data Node Scan (Node Count [1]) (1 row) SELECT count(*) FROM point_tbl p WHERE p.f1 ~= '(-5, -12)'; @@ -774,7 +749,7 @@ CREATE INDEX hash_f8_index ON hash_f8_heap USING hash (random float8_ops); -- CREATE TABLE func_index_heap (f1 text, f2 text); CREATE UNIQUE INDEX func_index_index on func_index_heap (textcat(f1,f2)); -ERROR: Cannot locally enforce a 
unique index on round robin distributed table. +ERROR: Unique index of partitioned table must contain the hash/modulo distribution column. INSERT INTO func_index_heap VALUES('ABC','DEF'); INSERT INTO func_index_heap VALUES('AB','CDEFG'); INSERT INTO func_index_heap VALUES('QWE','RTY'); @@ -788,7 +763,7 @@ INSERT INTO func_index_heap VALUES('QWERTY'); DROP TABLE func_index_heap; CREATE TABLE func_index_heap (f1 text, f2 text); CREATE UNIQUE INDEX func_index_index on func_index_heap ((f1 || f2) text_ops); -ERROR: Cannot locally enforce a unique index on round robin distributed table. +ERROR: Unique index of partitioned table must contain the hash/modulo distribution column. INSERT INTO func_index_heap VALUES('ABC','DEF'); INSERT INTO func_index_heap VALUES('AB','CDEFG'); INSERT INTO func_index_heap VALUES('QWE','RTY'); diff --git a/src/test/regress/expected/float4_1.out b/src/test/regress/expected/float4_1.out index 432d159..f50147d 100644 --- a/src/test/regress/expected/float4_1.out +++ b/src/test/regress/expected/float4_1.out @@ -125,16 +125,6 @@ SELECT 'nan'::numeric::float4; NaN (1 row) -SELECT '' AS five, * FROM FLOAT4_TBL; - five | f1 -------+------------- - | 1004.3 - | 1.23457e+20 - | 0 - | -34.84 - | 1.23457e-20 -(5 rows) - SELECT '' AS five, * FROM FLOAT4_TBL ORDER BY f1; five | f1 ------+------------- @@ -257,13 +247,14 @@ SELECT '' AS five, f.f1, @f.f1 AS abs_f1 FROM FLOAT4_TBL f ORDER BY f1; UPDATE FLOAT4_TBL SET f1 = FLOAT4_TBL.f1 * '-1' WHERE FLOAT4_TBL.f1 > '0.0'; +ERROR: Partition column can't be updated in current version SELECT '' AS five, * FROM FLOAT4_TBL ORDER BY f1; - five | f1 -------+-------------- - | -1.23457e+20 - | -1004.3 - | -34.84 - | -1.23457e-20 - | 0 + five | f1 +------+------------- + | -34.84 + | 0 + | 1.23457e-20 + | 1004.3 + | 1.23457e+20 (5 rows) diff --git a/src/test/regress/expected/float8_1.out b/src/test/regress/expected/float8_1.out index 65fe187..8ce7930 100644 --- a/src/test/regress/expected/float8_1.out +++ b/src/test/regress/expected/float8_1.out @@ -381,6 +381,7 @@ SELECT '' AS five, * FROM FLOAT8_TBL ORDER BY f1; UPDATE FLOAT8_TBL SET f1 = FLOAT8_TBL.f1 * '-1' WHERE FLOAT8_TBL.f1 > '0.0'; +ERROR: Partition column can't be updated in current version SELECT '' AS bad, f.f1 ^ '1e200' from FLOAT8_TBL f ORDER BY f1; ERROR: value out of range: overflow SELECT '' AS bad, f.f1 ^ '1e200' from FLOAT8_TBL f ORDER BY f1; @@ -396,17 +397,17 @@ ERROR: cannot take logarithm of zero SELECT '' AS bad, ln(f.f1) from FLOAT8_TBL f where f.f1 < '0.0'; ERROR: cannot take logarithm of a negative number SELECT '' AS bad, exp(f.f1) from FLOAT8_TBL f ORDER BY f1; -ERROR: value out of range: underflow +ERROR: value out of range: overflow SELECT '' AS bad, f.f1 / '0.0' from FLOAT8_TBL f; ERROR: division by zero SELECT '' AS five, * FROM FLOAT8_TBL ORDER BY f1; - five | f1 -------+----------------------- - | -1.2345678901234e+200 - | -1004.3 - | -34.84 - | -1.2345678901234e-200 - | 0 + five | f1 +------+---------------------- + | -34.84 + | 0 + | 1.2345678901234e-200 + | 1004.3 + | 1.2345678901234e+200 (5 rows) -- test for over- and underflow diff --git a/src/test/regress/expected/foreign_key_1.out b/src/test/regress/expected/foreign_key_1.out index 7eccdc6..3cb7d17 100644 --- a/src/test/regress/expected/foreign_key_1.out +++ b/src/test/regress/expected/foreign_key_1.out @@ -773,9 +773,9 @@ INSERT INTO FKTABLE VALUES(43); -- should fail ERROR: insert or update on table "fktable" violates foreign key constraint "fktable_ftest1_fkey" DETAIL: Key (ftest1)=(43) is not 
present in table "pktable". UPDATE FKTABLE SET ftest1 = ftest1; -- should succeed +ERROR: Partition column can't be updated in current version UPDATE FKTABLE SET ftest1 = ftest1 + 1; -- should fail -ERROR: insert or update on table "fktable" violates foreign key constraint "fktable_ftest1_fkey" -DETAIL: Key (ftest1)=(43) is not present in table "pktable". +ERROR: Partition column can't be updated in current version DROP TABLE FKTABLE; -- This should fail, because we'd have to cast numeric to int which is -- not an implicit coercion (or use numeric=numeric, but that's not part @@ -787,34 +787,22 @@ DROP TABLE PKTABLE; -- On the other hand, this should work because int implicitly promotes to -- numeric, and we allow promotion on the FK side CREATE TABLE PKTABLE (ptest1 numeric PRIMARY KEY); -ERROR: Column ptest1 is not a hash distributable data type +NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "pktable_pkey" for table "pktable" INSERT INTO PKTABLE VALUES(42); -ERROR: relation "pktable" does not exist -LINE 1: INSERT INTO PKTABLE VALUES(42); - ^ CREATE TABLE FKTABLE (ftest1 int REFERENCES pktable); -ERROR: relation "pktable" does not exist -- Check it actually works INSERT INTO FKTABLE VALUES(42); -- should succeed -ERROR: relation "fktable" does not exist -LINE 1: INSERT INTO FKTABLE VALUES(42); - ^ +ERROR: insert or update on table "fktable" violates foreign key constraint "fktable_ftest1_fkey" +DETAIL: Key (ftest1)=(42) is not present in table "pktable". INSERT INTO FKTABLE VALUES(43); -- should fail -ERROR: relation "fktable" does not exist -LINE 1: INSERT INTO FKTABLE VALUES(43); - ^ +ERROR: insert or update on table "fktable" violates foreign key constraint "fktable_ftest1_fkey" +DETAIL: Key (ftest1)=(43) is not present in table "pktable". UPDATE FKTABLE SET ftest1 = ftest1; -- should succeed -ERROR: relation "fktable" does not exist -LINE 1: UPDATE FKTABLE SET ftest1 = ftest1; - ^ +ERROR: Partition column can't be updated in current version UPDATE FKTABLE SET ftest1 = ftest1 + 1; -- should fail -ERROR: relation "fktable" does not exist -LINE 1: UPDATE FKTABLE SET ftest1 = ftest1 + 1; - ^ +ERROR: Partition column can't be updated in current version DROP TABLE FKTABLE; -ERROR: table "fktable" does not exist DROP TABLE PKTABLE; -ERROR: table "pktable" does not exist -- Two columns, two tables CREATE TABLE PKTABLE (ptest1 int, ptest2 inet, PRIMARY KEY(ptest1, ptest2)); NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "pktable_pkey" for table "pktable" diff --git a/src/test/regress/expected/money_1.out b/src/test/regress/expected/money_1.out new file mode 100644 index 0000000..6a15792 --- /dev/null +++ b/src/test/regress/expected/money_1.out @@ -0,0 +1,186 @@ +-- +-- MONEY +-- +CREATE TABLE money_data (m money); +INSERT INTO money_data VALUES ('123'); +SELECT * FROM money_data; + m +--------- + $123.00 +(1 row) + +SELECT m + '123' FROM money_data; + ?column? +---------- + $246.00 +(1 row) + +SELECT m + '123.45' FROM money_data; + ?column? +---------- + $246.45 +(1 row) + +SELECT m - '123.45' FROM money_data; + ?column? +---------- + -$0.45 +(1 row) + +SELECT m * 2 FROM money_data; + ?column? +---------- + $246.00 +(1 row) + +SELECT m / 2 FROM money_data; + ?column? +---------- + $61.50 +(1 row) + +-- All true +SELECT m = '$123.00' FROM money_data; + ?column? +---------- + t +(1 row) + +SELECT m != '$124.00' FROM money_data; + ?column? +---------- + t +(1 row) + +SELECT m <= '$123.00' FROM money_data; + ?column? 
+---------- + t +(1 row) + +SELECT m >= '$123.00' FROM money_data; + ?column? +---------- + t +(1 row) + +SELECT m < '$124.00' FROM money_data; + ?column? +---------- + t +(1 row) + +SELECT m > '$122.00' FROM money_data; + ?column? +---------- + t +(1 row) + +-- All false +SELECT m = '$123.01' FROM money_data; + ?column? +---------- +(0 rows) + +SELECT m != '$123.00' FROM money_data; + ?column? +---------- + f +(1 row) + +SELECT m <= '$122.99' FROM money_data; + ?column? +---------- + f +(1 row) + +SELECT m >= '$123.01' FROM money_data; + ?column? +---------- + f +(1 row) + +SELECT m > '$124.00' FROM money_data; + ?column? +---------- + f +(1 row) + +SELECT m < '$122.00' FROM money_data; + ?column? +---------- + f +(1 row) + +SELECT cashlarger(m, '$124.00') FROM money_data; + cashlarger +------------ + $124.00 +(1 row) + +SELECT cashsmaller(m, '$124.00') FROM money_data; + cashsmaller +------------- + $123.00 +(1 row) + +SELECT cash_words(m) FROM money_data; + cash_words +------------------------------------------------- + One hundred twenty three dollars and zero cents +(1 row) + +SELECT cash_words(m + '1.23') FROM money_data; + cash_words +-------------------------------------------------------- + One hundred twenty four dollars and twenty three cents +(1 row) + +DELETE FROM money_data; +INSERT INTO money_data VALUES ('$123.45'); +SELECT * FROM money_data; + m +--------- + $123.45 +(1 row) + +DELETE FROM money_data; +INSERT INTO money_data VALUES ('$123.451'); +SELECT * FROM money_data; + m +--------- + $123.45 +(1 row) + +DELETE FROM money_data; +INSERT INTO money_data VALUES ('$123.454'); +SELECT * FROM money_data; + m +--------- + $123.45 +(1 row) + +DELETE FROM money_data; +INSERT INTO money_data VALUES ('$123.455'); +SELECT * FROM money_data; + m +--------- + $123.46 +(1 row) + +DELETE FROM money_data; +INSERT INTO money_data VALUES ('$123.456'); +SELECT * FROM money_data; + m +--------- + $123.46 +(1 row) + +DELETE FROM money_data; +INSERT INTO money_data VALUES ('$123.459'); +SELECT * FROM money_data; + m +--------- + $123.46 +(1 row) + diff --git a/src/test/regress/expected/prepared_xacts_2.out b/src/test/regress/expected/prepared_xacts_2.out index e456200..307ffad 100644 --- a/src/test/regress/expected/prepared_xacts_2.out +++ b/src/test/regress/expected/prepared_xacts_2.out @@ -6,7 +6,7 @@ -- isn't really needed ... stopping and starting the postmaster would -- be enough, but we can't even do that here. -- create a simple table that we'll use in the tests -CREATE TABLE pxtest1 (foobar VARCHAR(10)); +CREATE TABLE pxtest1 (foobar VARCHAR(10)) distribute by replication; INSERT INTO pxtest1 VALUES ('aaa'); -- Test PREPARE TRANSACTION BEGIN; diff --git a/src/test/regress/expected/reltime_1.out b/src/test/regress/expected/reltime_1.out new file mode 100644 index 0000000..83f61f9 --- /dev/null +++ b/src/test/regress/expected/reltime_1.out @@ -0,0 +1,109 @@ +-- +-- RELTIME +-- +CREATE TABLE RELTIME_TBL (f1 reltime); +INSERT INTO RELTIME_TBL (f1) VALUES ('@ 1 minute'); +INSERT INTO RELTIME_TBL (f1) VALUES ('@ 5 hour'); +INSERT INTO RELTIME_TBL (f1) VALUES ('@ 10 day'); +INSERT INTO RELTIME_TBL (f1) VALUES ('@ 34 year'); +INSERT INTO RELTIME_TBL (f1) VALUES ('@ 3 months'); +INSERT INTO RELTIME_TBL (f1) VALUES ('@ 14 seconds ago'); +-- badly formatted reltimes +INSERT INTO RELTIME_TBL (f1) VALUES ('badly formatted reltime'); +ERROR: invalid input syntax for type reltime: "badly formatted reltime" +LINE 1: INSERT INTO RELTIME_TBL (f1) VALUES ('badly formatted reltim... 
+ ^ +INSERT INTO RELTIME_TBL (f1) VALUES ('@ 30 eons ago'); +ERROR: invalid input syntax for type reltime: "@ 30 eons ago" +LINE 1: INSERT INTO RELTIME_TBL (f1) VALUES ('@ 30 eons ago'); + ^ +-- test reltime operators +SELECT '' AS six, * FROM RELTIME_TBL ORDER BY f1; + six | f1 +-----+--------------- + | @ 14 secs ago + | @ 1 min + | @ 5 hours + | @ 10 days + | @ 3 mons + | @ 34 years +(6 rows) + +SELECT '' AS five, * FROM RELTIME_TBL + WHERE RELTIME_TBL.f1 <> reltime '@ 10 days' ORDER BY f1; + five | f1 +------+--------------- + | @ 14 secs ago + | @ 1 min + | @ 5 hours + | @ 3 mons + | @ 34 years +(5 rows) + +SELECT '' AS three, * FROM RELTIME_TBL + WHERE RELTIME_TBL.f1 <= reltime '@ 5 hours' ORDER BY f1; + three | f1 +-------+--------------- + | @ 14 secs ago + | @ 1 min + | @ 5 hours +(3 rows) + +SELECT '' AS three, * FROM RELTIME_TBL + WHERE RELTIME_TBL.f1 < reltime '@ 1 day' ORDER BY f1; + three | f1 +-------+--------------- + | @ 14 secs ago + | @ 1 min + | @ 5 hours +(3 rows) + +SELECT '' AS one, * FROM RELTIME_TBL + WHERE RELTIME_TBL.f1 = reltime '@ 34 years' ORDER BY f1; + one | f1 +-----+---------- + | 34 years +(1 row) + +SELECT '' AS two, * FROM RELTIME_TBL + WHERE RELTIME_TBL.f1 >= reltime '@ 1 month' ORDER BY f1; + two | f1 +-----+------------ + | @ 3 mons + | @ 34 years +(2 rows) + +SELECT '' AS five, * FROM RELTIME_TBL + WHERE RELTIME_TBL.f1 > reltime '@ 3 seconds ago' ORDER BY f1; + five | f1 +------+------------ + | @ 1 min + | @ 5 hours + | @ 10 days + | @ 3 mons + | @ 34 years +(5 rows) + +SELECT '' AS fifteen, r1.*, r2.* + FROM RELTIME_TBL r1, RELTIME_TBL r2 + WHERE r1.f1 > r2.f1 + ORDER BY r1.f1, r2.f1; + fifteen | f1 | f1 +---------+------------+--------------- + | @ 1 min | @ 14 secs ago + | @ 5 hours | @ 14 secs ago + | @ 5 hours | @ 1 min + | @ 10 days | @ 14 secs ago + | @ 10 days | @ 1 min + | @ 10 days | @ 5 hours + | @ 3 mons | @ 14 secs ago + | @ 3 mons | @ 1 min + | @ 3 mons | @ 5 hours + | @ 3 mons | @ 10 days + | @ 34 years | @ 14 secs ago + | @ 34 years | @ 1 min + | @ 34 years | @ 5 hours + | @ 34 years | @ 10 days + | @ 34 years | @ 3 mons +(15 rows) + diff --git a/src/test/regress/expected/triggers_1.out b/src/test/regress/expected/triggers_1.out index 5528c66..a9f83ec 100644 --- a/src/test/regress/expected/triggers_1.out +++ b/src/test/regress/expected/triggers_1.out @@ -717,30 +717,30 @@ ERROR: Postgres-XC does not support TRIGGER yet DETAIL: The feature is not currently supported \set QUIET false UPDATE min_updates_test SET f1 = f1; -UPDATE 2 -UPDATE min_updates_test SET f2 = f2 + 1; ERROR: Partition column can't be updated in current version +UPDATE min_updates_test SET f2 = f2 + 1; +UPDATE 2 UPDATE min_updates_test SET f3 = 2 WHERE f3 is null; UPDATE 1 UPDATE min_updates_test_oids SET f1 = f1; -UPDATE 2 -UPDATE min_updates_test_oids SET f2 = f2 + 1; ERROR: Partition column can't be updated in current version +UPDATE min_updates_test_oids SET f2 = f2 + 1; +UPDATE 2 UPDATE min_updates_test_oids SET f3 = 2 WHERE f3 is null; UPDATE 1 \set QUIET true SELECT * FROM min_updates_test ORDER BY 1,2,3; f1 | f2 | f3 ----+----+---- - a | 1 | 2 - b | 2 | 2 + a | 2 | 2 + b | 3 | 2 (2 rows) SELECT * FROM min_updates_test_oids ORDER BY 1,2,3; f1 | f2 | f3 ----+----+---- - a | 1 | 2 - b | 2 | 2 + a | 2 | 2 + b | 3 | 2 (2 rows) DROP TABLE min_updates_test; diff --git a/src/test/regress/expected/tsearch_1.out b/src/test/regress/expected/tsearch_1.out index e8c35d4..4d1f1b1 100644 --- a/src/test/regress/expected/tsearch_1.out +++ 
b/src/test/regress/expected/tsearch_1.out @@ -801,7 +801,7 @@ SELECT COUNT(*) FROM test_tsquery WHERE keyword > 'new & york'; (1 row) CREATE UNIQUE INDEX bt_tsq ON test_tsquery (keyword); -ERROR: Cannot locally enforce a unique index on round robin distributed table. +ERROR: Unique index of partitioned table must contain the hash/modulo distribution column. SET enable_seqscan=OFF; SELECT COUNT(*) FROM test_tsquery WHERE keyword < 'new & york'; count @@ -1054,6 +1054,7 @@ SELECT count(*) FROM test_tsvector WHERE a @@ to_tsquery('345&qwerty'); (0 rows) UPDATE test_tsvector SET t = null WHERE t = '345 qwerty'; +ERROR: Partition column can't be updated in current version SELECT count(*) FROM test_tsvector WHERE a @@ to_tsquery('345&qwerty'); count ------- diff --git a/src/test/regress/expected/xc_distkey.out b/src/test/regress/expected/xc_distkey.out new file mode 100644 index 0000000..d050b27 --- /dev/null +++ b/src/test/regress/expected/xc_distkey.out @@ -0,0 +1,618 @@ +-- XC Test cases to verify that all supported data types are working as distribution key +-- Also verifies that the comaparison with a constant for equality is optimized. +create table ch_tab(a char) distribute by modulo(a); +insert into ch_tab values('a'); +select hashchar('a'); + hashchar +----------- + 463612535 +(1 row) + +create table nm_tab(a name) distribute by modulo(a); +insert into nm_tab values('abbas'); +select hashname('abbas'); + hashname +----------- + 605752656 +(1 row) + +create table nu_tab(a numeric(10,5)) distribute by modulo(a); +insert into nu_tab values(123.456); +insert into nu_tab values(789.412); +select * from nu_tab order by a; + a +----------- + 123.45600 + 789.41200 +(2 rows) + +select * from nu_tab where a = 123.456; + a +----------- + 123.45600 +(1 row) + +select * from nu_tab where 789.412 = a; + a +----------- + 789.41200 +(1 row) + +explain select * from nu_tab where a = 123.456; + QUERY PLAN +------------------------------------------------------------------- + Data Node Scan (Node Count [1]) (cost=0.00..0.00 rows=0 width=0) +(1 row) + +explain select * from nu_tab where 789.412 = a; + QUERY PLAN +------------------------------------------------------------------- + Data Node Scan (Node Count [1]) (cost=0.00..0.00 rows=0 width=0) +(1 row) + +create table tx_tab(a text) distribute by modulo(a); +insert into tx_tab values('hello world'); +insert into tx_tab values('Did the quick brown fox jump over the lazy dog?'); +select * from tx_tab order by a; + a +------------------------------------------------- + Did the quick brown fox jump over the lazy dog? + hello world +(2 rows) + +select * from tx_tab where a = 'hello world'; + a +------------- + hello world +(1 row) + +select * from tx_tab where a = 'Did the quick brown fox jump over the lazy dog?'; + a +------------------------------------------------- + Did the quick brown fox jump over the lazy dog? +(1 row) + +select * from tx_tab where 'hello world' = a; + a +------------- + hello world +(1 row) + +select * from tx_tab where 'Did the quick brown fox jump over the lazy dog?' = a; + a +------------------------------------------------- + Did the quick brown fox jump over the lazy dog? 
+(1 row) + +explain select * from tx_tab where a = 'hello world'; + QUERY PLAN +------------------------------------------------------------------- + Data Node Scan (Node Count [1]) (cost=0.00..0.00 rows=0 width=0) +(1 row) + +explain select * from tx_tab where a = 'Did the quick brown fox jump over the lazy dog?'; + QUERY PLAN +------------------------------------------------------------------- + Data Node Scan (Node Count [1]) (cost=0.00..0.00 rows=0 width=0) +(1 row) + +create table vc_tab(a varchar(255)) distribute by modulo(a); +insert into vc_tab values('abcdefghijklmnopqrstuvwxyz'); +insert into vc_tab values('A quick brown fox'); +insert into vc_tab values(NULL); +select * from vc_tab order by a; + a +---------------------------- + abcdefghijklmnopqrstuvwxyz + A quick brown fox + +(3 rows) + +select * from vc_tab where a = 'abcdefghijklmnopqrstuvwxyz'; + a +---------------------------- + abcdefghijklmnopqrstuvwxyz +(1 row) + +select * from vc_tab where a = 'A quick brown fox'; + a +------------------- + A quick brown fox +(1 row) + +-- This test a bug in examine_conditions_walker where a = constant is optimized but constant = a was not +select * from vc_tab where 'A quick brown fox' = a; + a +------------------- + A quick brown fox +(1 row) + +explain select * from vc_tab where a = 'abcdefghijklmnopqrstuvwxyz'; + QUERY PLAN +------------------------------------------------------------------- + Data Node Scan (Node Count [1]) (cost=0.00..0.00 rows=0 width=0) +(1 row) + +explain select * from vc_tab where a = 'A quick brown fox'; + QUERY PLAN +------------------------------------------------------------------- + Data Node Scan (Node Count [1]) (cost=0.00..0.00 rows=0 width=0) +(1 row) + +-- This test a bug in examine_conditions_walker where a = constant is optimized but constant = a was not +explain select * from vc_tab where 'A quick brown fox' = a; + QUERY PLAN +------------------------------------------------------------------- + Data Node Scan (Node Count [1]) (cost=0.00..0.00 rows=0 width=0) +(1 row) + +create table f8_tab(a float8) distribute by modulo(a); +insert into f8_tab values(123.456); +insert into f8_tab values(10.987654); +select * from f8_tab order by a; + a +----------- + 10.987654 + 123.456 +(2 rows) + +select * from f8_tab where a = 123.456; + a +--------- + 123.456 +(1 row) + +select * from f8_tab where a = 10.987654; + a +----------- + 10.987654 +(1 row) + +select * from f8_tab where a = 123.456::float8; + a +--------- + 123.456 +(1 row) + +select * from f8_tab where a = 10.987654::float8; + a +----------- + 10.987654 +(1 row) + +create table f4_tab(a float4) distribute by modulo(a); +insert into f4_tab values(123.456); +insert into f4_tab values(10.987654); +insert into f4_tab values(NULL); +select * from f4_tab order by a; + a +--------- + 10.9877 + 123.456 + +(3 rows) + +select * from f4_tab where a = 123.456; + a +--- +(0 rows) + +select * from f4_tab where a = 10.987654; + a +--- +(0 rows) + +select * from f4_tab where a = 123.456::float4; + a +--------- + 123.456 +(1 row) + +select * from f4_tab where a = 10.987654::float4; + a +--------- + 10.9877 +(1 row) + +create table i8_tab(a int8) distribute by modulo(a); +insert into i8_tab values(8446744073709551359); +insert into i8_tab values(78902); +insert into i8_tab values(NULL); +select * from i8_tab order by a; + a +--------------------- + 78902 + 8446744073709551359 + +(3 rows) + +select * from i8_tab where a = 8446744073709551359::int8; + a +--------------------- + 8446744073709551359 +(1 row) + +select * 
from i8_tab where a = 8446744073709551359; + a +--------------------- + 8446744073709551359 +(1 row) + +select * from i8_tab where a = 78902::int8; + a +------- + 78902 +(1 row) + +select * from i8_tab where a = 78902; + a +------- + 78902 +(1 row) + +create table i2_tab(a int2) distribute by modulo(a); +insert into i2_tab values(123); +insert into i2_tab values(456); +select * from i2_tab order by a; + a +----- + 123 + 456 +(2 rows) + +select * from i2_tab where a = 123; + a +----- + 123 +(1 row) + +select * from i2_tab where a = 456; + a +----- + 456 +(1 row) + +create table oid_tab(a oid) distribute by modulo(a); +insert into oid_tab values(23445); +insert into oid_tab values(45662); +select * from oid_tab order by a; + a +------- + 23445 + 45662 +(2 rows) + +select * from oid_tab where a = 23445; + a +------- + 23445 +(1 row) + +select * from oid_tab where a = 45662; + a +------- + 45662 +(1 row) + +create table i4_tab(a int4) distribute by modulo(a); +insert into i4_tab values(65530); +insert into i4_tab values(2147483647); +select * from i4_tab order by a; + a +------------ + 65530 + 2147483647 +(2 rows) + +select * from i4_tab where a = 65530; + a +------- + 65530 +(1 row) + +select * from i4_tab where a = 2147483647; + a +------------ + 2147483647 +(1 row) + +select * from i4_tab where 65530 = a; + a +------- + 65530 +(1 row) + +select * from i4_tab where 2147483647 = a; + a +------------ + 2147483647 +(1 row) + +explain select * from i4_tab where 65530 = a; + QUERY PLAN +------------------------------------------------------------------- + Data Node Scan (Node Count [1]) (cost=0.00..0.00 rows=0 width=0) +(1 row) + +explain select * from i4_tab where a = 2147483647; + QUERY PLAN +------------------------------------------------------------------- + Data Node Scan (Node Count [1]) (cost=0.00..0.00 rows=0 width=0) +(1 row) + +create table bo_tab(a bool) distribute by modulo(a); +insert into bo_tab values(true); +insert into bo_tab values(false); +select * from bo_tab order by a; + a +--- + f + t +(2 rows) + +select * from bo_tab where a = true; + a +--- + t +(1 row) + +select * from bo_tab where a = false; + a +--- + f +(1 row) + +create table bpc_tab(a char(35)) distribute by modulo(a); +insert into bpc_tab values('Hello World'); +insert into bpc_tab values('The quick brown fox'); +select * from bpc_tab order by a; + a +------------------------------------- + Hello World + The quick brown fox +(2 rows) + +select * from bpc_tab where a = 'Hello World'; + a +------------------------------------- + Hello World +(1 row) + +select * from bpc_tab where a = 'The quick brown fox'; + a +------------------------------------- + The quick brown fox +(1 row) + +create table byta_tab(a bytea) distribute by modulo(a); +insert into byta_tab values(E'\\000\\001\\002\\003\\004\\005\\006\\007\\010'); +insert into byta_tab values(E'\\010\\011\\012\\013\\014\\015\\016\\017\\020'); +select * from byta_tab order by a; + a +---------------------- + \x000102030405060708 + \x08090a0b0c0d0e0f10 +(2 rows) + +select * from byta_tab where a = E'\\000\\001\\002\\003\\004\\005\\006\\007\\010'; + a +---------------------- + \x000102030405060708 +(1 row) + +select * from byta_tab where a = E'\\010\\011\\012\\013\\014\\015\\016\\017\\020'; + a +---------------------- + \x08090a0b0c0d0e0f10 +(1 row) + +create table tim_tab(a time) distribute by modulo(a); +insert into tim_tab values('00:01:02.03'); +insert into tim_tab values('23:59:59.99'); +select * from tim_tab order by a; + a +------------- + 00:01:02.03 + 
23:59:59.99 +(2 rows) + +delete from tim_tab where a = '00:01:02.03'; +delete from tim_tab where a = '23:59:59.99'; +create table timtz_tab(a time with time zone) distribute by modulo(a); +insert into timtz_tab values('00:01:02.03 PST'); +insert into timtz_tab values('23:59:59.99 PST'); +select * from timtz_tab order by a; + a +---------------- + 00:01:02.03-08 + 23:59:59.99-08 +(2 rows) + +select * from timtz_tab where a = '00:01:02.03 PST'; + a +---------------- + 00:01:02.03-08 +(1 row) + +select * from timtz_tab where a = '23:59:59.99 PST'; + a +---------------- + 23:59:59.99-08 +(1 row) + +create table ts_tab(a timestamp) distribute by modulo(a); +insert into ts_tab values('May 10, 2011 00:01:02.03'); +insert into ts_tab values('August 14, 2001 23:59:59.99'); +select * from ts_tab order by a; + a +----------------------------- + Tue Aug 14 23:59:59.99 2001 + Tue May 10 00:01:02.03 2011 +(2 rows) + +select * from ts_tab where a = 'May 10, 2011 00:01:02.03'; + a +------------------------ + 2011-05-10 00:01:02.03 +(1 row) + +select * from ts_tab where a = 'August 14, 2001 23:59:59.99'; + a +------------------------ + 2001-08-14 23:59:59.99 +(1 row) + +create table in_tab(a interval) distribute by modulo(a); +insert into in_tab values('1 day 12 hours 59 min 10 sec'); +insert into in_tab values('0 day 4 hours 32 min 23 sec'); +select * from in_tab order by a; + a +---------------------------------- + @ 4 hours 32 mins 23 secs + @ 1 day 12 hours 59 mins 10 secs +(2 rows) + +select * from in_tab where a = '1 day 12 hours 59 min 10 sec'; + a +---------------- + 1 day 12:59:10 +(1 row) + +select * from in_tab where a = '0 day 4 hours 32 min 23 sec'; + a +---------- + 04:32:23 +(1 row) + +create table cash_tab(a money) distribute by modulo(a); +insert into cash_tab values('231.54'); +insert into cash_tab values('14011.50'); +select * from cash_tab order by a; + a +------------ + $231.54 + $14,011.50 +(2 rows) + +select * from cash_tab where a = '231.54'; + a +--------- + $231.54 +(1 row) + +select * from cash_tab where a = '14011.50'; + a +------------ + $14,011.50 +(1 row) + +create table atim_tab(a abstime) distribute by modulo(a); +insert into atim_tab values(abstime('May 10, 2011 00:01:02.03')); +insert into atim_tab values(abstime('Jun 23, 2001 23:59:59.99')); +select * from atim_tab order by a; + a +------------------------------ + Sat Jun 23 23:59:59 2001 PDT + Tue May 10 00:01:02 2011 PDT +(2 rows) + +select * from atim_tab where a = abstime('May 10, 2011 00:01:02.03'); + a +------------------------ + 2011-05-10 12:01:02+05 +(1 row) + +select * from atim_tab w... [truncated message content] |
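The routing machinery added by the patch above is a two-step affair:
compute_hash() reduces the distribution-column Datum to an integer, and
GetRelationNodes() maps that integer onto a node list through
hash_range_int() or compute_modulo(). Below is a minimal caller sketch
using only the signatures visible in the diff; locate_node_for_insert is
a hypothetical helper name, not part of the patch:

    #include "pgxc/locator.h"   /* GetRelationNodes, RelationLocInfo */

    /*
     * Hypothetical helper: pick the data node that an INSERT of "value"
     * (whose type OID is "valtype") would be routed to.  Per the diff,
     * for hash/modulo-distributed tables GetRelationNodes returns a
     * one-element nodelist; NULLs and unsupported types fall back to
     * the first node on INSERT.
     */
    static int
    locate_node_for_insert(RelationLocInfo *rel_loc_info,
                           Datum value, Oid valtype)
    {
        ExecNodes *nodes = GetRelationNodes(rel_loc_info, value, valtype,
                                            RELATION_ACCESS_INSERT);

        if (nodes == NULL || nodes->nodelist == NIL)
            return -1;      /* e.g. replicated table: no single target */

        return linitial_int(nodes->nodelist);
    }

Note that the List returned by GetRelationNodes is documented in the
diff as a copy, so a production caller would also be expected to free it
when finished.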
From: Koichi S. <koi...@us...> - 2011-05-24 00:36:09
|
Project "Postgres-XC". The branch, ha_support has been updated via 5e1f7db50172e18a081c9b8155399cd8e8057101 (commit) via 6de2b50702664e82792b6386bf4a5759567bd352 (commit) from 7c0fb4e4bf34f558697ee864e68b01dc05f08e81 (commit) - Log ----------------------------------------------------------------- commit 5e1f7db50172e18a081c9b8155399cd8e8057101 Merge: 6de2b50 7c0fb4e Author: Koichi Suzuki <koi...@gm...> Date: Fri May 13 18:35:59 2011 +0900 Merge branch 'ha_support' of ssh://postgres-xc.git.sourceforge.net/gitroot/postgres-xc/postgres-xc into ha_support commit 6de2b50702664e82792b6386bf4a5759567bd352 Author: Koichi Suzuki <koi...@gm...> Date: Fri May 13 18:29:37 2011 +0900 This is the rety of cancelled commit for gtm-proxy reconnect. This is the first part of the commit. With GTM-Standby, this commit works for the normal case. This also includes all the infrastructure to run "reconnect" including signal hander, backend command backup and release, signal detection and backend command recovery. Next work includes add real "reconnect" and node registration both for worker thread and main thread so that we can test "reconnect". List of modified files are as follows: modified: src/gtm/common/stringinfo.c modified: src/gtm/gtm_ctl/gtm_ctl.c modified: src/gtm/proxy/proxy_main.c modified: src/gtm/proxy/proxy_thread.c modified: src/include/gtm/gtm_proxy.h modified: src/include/gtm/libpq-be.h modified: src/include/gtm/libpq-int.h modified: src/include/gtm/stringinfo.h diff --git a/src/gtm/common/stringinfo.c b/src/gtm/common/stringinfo.c index 35e4cd8..821ca67 100644 --- a/src/gtm/common/stringinfo.c +++ b/src/gtm/common/stringinfo.c @@ -41,6 +41,41 @@ makeStringInfo(void) } /* + * dupStringInfo + * + * Get new StringInfo and copy the original to it. + */ +StringInfo +dupStringInfo(StringInfo orig) +{ + StringInfo new; + + new = makeStringInfo(); + if (!new) + return(new); + + if (orig->len > 0) + { + appendBinaryStringInfo(new, orig->data, orig->len); + new->cursor = orig->cursor; + } + return(new); +} + +/* + * copyStringInfo + * Deep copy: Data part is copied too. Cursor of the destination is + * initialized to zero. + */ +void +copyStringInfo(StringInfo to, StringInfo from) +{ + resetStringInfo(to); + appendBinaryStringInfo(to, from->data, from->len); + return; +} + +/* * initStringInfo * * Initialize a StringInfoData struct (with previously undefined contents) diff --git a/src/gtm/gtm_ctl/gtm_ctl.c b/src/gtm/gtm_ctl/gtm_ctl.c index 55c11bf..5274a53 100644 --- a/src/gtm/gtm_ctl/gtm_ctl.c +++ b/src/gtm/gtm_ctl/gtm_ctl.c @@ -593,6 +593,10 @@ do_reconnect(void) char *reconnect_point_file_nam; FILE *reconnect_point_file; +#ifdef GTM_SBY_DEBUG + write_stderr("Reconnecting to new GTM ... DEBUG MODE."); +#endif + /* * Target must beo "gtm_proxy" */ @@ -623,23 +627,29 @@ do_reconnect(void) * * Option arguments are written to newgtm file under -D directory. 
*/ - reconnect_point_file_nam = malloc(strlen(gtm_data) + 8); + reconnect_point_file_nam = malloc(strlen(gtm_data) + 9); if (reconnect_point_file_nam == NULL) { write_stderr(_("%s: No memory available.\n"), progname); exit(1); } - snprintf(reconnect_point_file_nam, strlen(gtm_data) + 7, "%s/newgtm", gtm_data); + snprintf(reconnect_point_file_nam, strlen(gtm_data) + 8, "%s/newgtm", gtm_data); reconnect_point_file = fopen(reconnect_point_file_nam, "w"); if (reconnect_point_file == NULL) { write_stderr(_("%s: Cannot open reconnect point file %s\n"), progname, reconnect_point_file_nam); exit(1); } - fprintf(reconnect_point_file, "%s", gtm_opts); + fprintf(reconnect_point_file, "%s\n", gtm_opts); fclose(reconnect_point_file); free(reconnect_point_file_nam); - if (kill((pid_t) pid, SIGUSR2) != 0) +#if 0 /* GTM_SBY_DEBUG */ + write_stderr("Now about to send SIGUSR1 to pid %ld.\n", pid); + write_stderr("Returning. This is the debug. Don't send signal actually.\n"); + return; +#endif + + if (kill((pid_t) pid, SIGUSR1) != 0) { write_stderr(_("%s: could not send promote signal (PID: %ld): %s\n"), progname, pid, strerror(errno)); @@ -860,6 +870,7 @@ do_help(void) printf(_(" %s restart -S STARTUP_MODE [-w] [-t SECS] [-D DATADIR] [-m SHUTDOWN-MODE]\n" " [-o \"OPTIONS\"]\n"), progname); printf(_(" %s status -S STARTUP_MODE [-w] [-t SECS] [-D DATADIR]\n"), progname); + printf(_(" %s reconnect [-D DATADIR] [-o \"OPTIONS\"]\n"), progname); printf(_("\nCommon options:\n")); printf(_(" -D DATADIR location of the database storage area\n")); @@ -878,6 +889,10 @@ do_help(void) printf(_("\nOptions for stop or restart:\n")); printf(_(" -m SHUTDOWN-MODE can be \"smart\", \"fast\", or \"immediate\"\n")); + printf(_("\n Options for reconnect:\n")); + printf(_(" -t NewGTMPORT Port number of new GTM.\n")); + printf(_(" -s NewGTMHost Host Name of new GTM.\n")); + printf(_("\nShutdown modes are:\n")); printf(_(" smart quit after all clients have disconnected\n")); printf(_(" fast quit directly, with proper shutdown\n")); diff --git a/src/gtm/proxy/proxy_main.c b/src/gtm/proxy/proxy_main.c index 4ab6f94..c93b140 100644 --- a/src/gtm/proxy/proxy_main.c +++ b/src/gtm/proxy/proxy_main.c @@ -59,12 +59,25 @@ int GTMProxyPortNumber; int GTMProxyWorkerThreads; char *GTMProxyDataDir; +/* GTM communication error handling options */ + +int GTMErrorWaitOpt = FALSE; /* Wait and assume XCM if TRUE */ +int GTMErrorWaitSecs = 0; /* Duration of each wait */ +int GTMErrorWaitCount = 0; /* How many durations to wait */ + char *GTMServerHost; int GTMServerPortNumber; GTM_PGXCNodeId GTMProxyID = 0; GTM_ThreadID TopMostThreadID; +/* Communication area with SIGUSR2 signal handler */ + +GTMProxy_ThreadInfo **Proxy_ThreadInfo; +short ReadyToReconnect = FALSE; +char *NewGTMServerHost; +int NewGTMServerPortNumber; + /* The socket(s) we're listening to. */ #define MAXLISTEN 64 static int ListenSocket[MAXLISTEN]; @@ -120,6 +133,7 @@ static void DeleteLockFile(const char *filename); static void RegisterProxy(void); static void UnregisterProxy(void); static GTM_Conn *ConnectGTM(void); +static void ReleaseCmdBackup(GTMProxy_CommandInfo *cmdinfo); /* * One-time initialization. 
It's called immediately after the main process @@ -205,10 +219,125 @@ BaseInit() } } +static char * +read_token(char *line, char **next) +{ + char *tok; + char *next_token; + + if (line == NULL) + { + *next = NULL; + return(NULL); + } + for (tok = line;; tok++) + { + if (*tok == 0 || *tok == '\n') + return(NULL); + if (*tok == ' ' || *tok == '\t') + continue; + else + break; + } + for (next_token = tok;; next_token++) + { + if (*next_token == 0 || *next_token == '\n') + { + *next_token = 0; + *next = NULL; + return(tok); + } + if (*next_token == ' ' || *next_token == '\t') + { + *next_token = 0; + *next = next_token + 1; + return(tok); + } + else + continue; + } + Assert(0); /* Never comes here. Keep compiler quiet. */ +} + +/* + * Returns non-zero if failed. + * We assume that the current working directory is that specified by the -D option. + */ +#define MAXLINE 1024 +#define INVALID_RECONNECT_OPTION_MSG() \ + do{ \ + ereport(ERROR, (0, errmsg("Invalid Reconnect Option"))); \ + } while(0) + +static int +GTMProxy_ReadReconnectInfo(void) +{ + + char optstr[MAXLINE]; + char *line; + FILE *optarg_file; + char *optValue; + char *option; + char *next_token; + + optarg_file = fopen("newgtm", "r"); + if (optarg_file == NULL) + { + INVALID_RECONNECT_OPTION_MSG(); + return(-1); + } + line = fgets(optstr, MAXLINE, optarg_file); + if (line == NULL) + { + INVALID_RECONNECT_OPTION_MSG(); + return(-1); + } + fclose(optarg_file); +#ifdef GTM_SBY_DEBUG + elog(LOG, "reconnect option = \"%s\"\n", optstr); +#endif + next_token = optstr; + while ((option = read_token(next_token, &next_token))) + { + if (strcmp(option, "-t") == 0) /* New GTM port */ + { + optValue = read_token(next_token, &next_token); + if (optValue == NULL) + { + INVALID_RECONNECT_OPTION_MSG(); + return(-1); + } + NewGTMServerPortNumber = atoi(optValue); + continue; + } + else if (strcmp(option, "-s") == 0) + { + optValue = read_token(next_token, &next_token); + if (optValue == NULL) + { + INVALID_RECONNECT_OPTION_MSG(); + return(-1); + } + if (NewGTMServerHost) + free(NewGTMServerHost); + NewGTMServerHost = strdup(optValue); + continue; + } + else + { + INVALID_RECONNECT_OPTION_MSG(); + return(-1); + } + } + return(0); +} + static void GTMProxy_SigleHandler(int signal) { - fprintf(stderr, "Received signal %d", signal); + int ii; + + elog(LOG, "Received signal %d", signal); switch (signal) { @@ -218,6 +347,93 @@ GTMProxy_SigleHandler(int signal) case SIGINT: case SIGHUP: break; + case SIGUSR1: /* Reconnect from gtm_ctl */ + /* + * Only the main thread can distribute SIGUSR2 to avoid lock contention + * of the thread info. If another thread receives SIGUSR1, it will proxy + * SIGUSR1 to the main thread. + */ + /* + * The mask is set to block signals. They're blocked until all the + * threads reconnect to the new GTM. + */ +#ifdef GTM_SBY_DEBUG + elog(LOG, "Accepted SIGUSR1\n"); +#endif + if (MyThreadID != TopMostThreadID) + { +#ifdef GTM_SBY_DEBUG + elog(LOG, "I'm not the main thread. Proxy the signal to the main thread."); +#endif + pthread_kill(TopMostThreadID, SIGUSR1); + return; + } + /* + * Then this is the main thread. + */ + PG_SETMASK(&BlockSig); +#ifdef GTM_SBY_DEBUG + elog(LOG, "I'm the main thread. Accepted SIGUSR1."); +#endif + /* + * Set Reconnect Info + */ + if (!ReadyToReconnect) + { + elog(LOG, "SIGUSR1 detected, but not ready to handle this. Ignored"); + PG_SETMASK(&UnBlockSig); + return; + } + elog(LOG, "SIGUSR1 detected. 
Set reconnect info for each worker thread"); + if (GTMProxy_ReadReconnectInfo() != 0) + { + /* Failed to read reconnect information from reconnect data file */ + PG_SETMASK(&UnBlockSig); + return; + } + for (ii = 0; ii < GTMProxyWorkerThreads; ii++) + { + if ((Proxy_ThreadInfo[ii] == NULL) || (Proxy_ThreadInfo[ii]->can_accept_SIGUSR2 == FALSE)) + { + elog(LOG, "Some thread is not ready to accept SIGUSR2. SIGUSR1 ignored."); + PG_SETMASK(&UnBlockSig); + } + } + for (ii = 0; ii < GTMProxyWorkerThreads; ii++) + { + /* + * Issue SIGUSR2 to all the worker threads. + * It will not be issued to the main thread. + */ + pthread_kill(Proxy_ThreadInfo[ii]->thr_id, SIGUSR2); + } + elog(LOG, "SIGUSR2 issued to all the worker threads."); + PG_SETMASK(&UnBlockSig); + return; + case SIGUSR2: /* Reconnect from the main thread */ + /* + * Main thread has nothing to do twith this signal and should not receive this. + */ + PG_SETMASK(&BlockSig); +#ifdef GTM_SBY_DEBUG + elog(LOG, "Detected SIGUSR2, thread:%ld", MyThreadID); +#endif + if (MyThreadID == TopMostThreadID) + { + /* + * This should not be reached. Just in case. + */ + elog(LOG, "SIGUSR2 received by the main thread. Ignoring."); + PG_SETMASK(&UnBlockSig); + return; + } + GetMyThreadInfo->reconnect_issued = TRUE; + if (GetMyThreadInfo->can_longjmp) + { + siglongjmp(GetMyThreadInfo->longjmp_env, 1); + } + PG_SETMASK(&UnBlockSig); + return; default: fprintf(stderr, "Unknown signal %d\n", signal); @@ -286,10 +502,12 @@ main(int argc, char *argv[]) GTMProxyPortNumber = GTM_PROXY_DEFAULT_PORT; GTMProxyWorkerThreads = GTM_PROXY_DEFAULT_WORKERS; + NewGTMServerHost = NULL; + /* * Parse the command like options and set variables */ - while ((opt = getopt(argc, argv, "h:i:p:n:D:l:s:t:")) != -1) + while ((opt = getopt(argc, argv, "h:i:p:n:D:l:s:t:w:z:")) != -1) { switch (opt) { @@ -333,6 +551,16 @@ main(int argc, char *argv[]) GTMServerPortNumber = atoi(optarg); break; + case 'w': + /* Duration to wait at GTM communication error */ + GTMErrorWaitSecs = atoi(optarg); + break; + + case 'z': + /* How many durations to wait */ + GTMErrorWaitCount = atoi(optarg); + break; + default: write_stderr("Try \"%s --help\" for more information.\n", progname); @@ -355,6 +583,19 @@ main(int argc, char *argv[]) } /* + * Validate GTM communication error handling option + */ + if (GTMErrorWaitSecs > 0 && GTMErrorWaitCount > 0) + { + GTMErrorWaitOpt = TRUE; /* Now we assume that XCM is available */ + } + else + { + GTMErrorWaitOpt = FALSE; + GTMErrorWaitSecs = 0; + GTMErrorWaitCount = 0; + } + /* * GTM accepts no non-option switch arguments. 
*/ if (optind < argc) @@ -417,10 +658,17 @@ main(int argc, char *argv[]) pqsignal(SIGQUIT, GTMProxy_SigleHandler); pqsignal(SIGTERM, GTMProxy_SigleHandler); pqsignal(SIGINT, GTMProxy_SigleHandler); + pqsignal(SIGUSR1, GTMProxy_SigleHandler); + pqsignal(SIGUSR2, GTMProxy_SigleHandler); pqinitmask(); /* + * Initialize SIGUSR2 interface area (Thread info) + */ + Proxy_ThreadInfo = palloc0(sizeof(GTMProxy_ThreadInfo *) * GTMProxyWorkerThreads); + + /* * Pre-fork so many worker threads */ @@ -429,7 +677,7 @@ main(int argc, char *argv[]) /* * XXX Start the worker thread */ - if (GTMProxy_ThreadCreate(GTMProxy_ThreadMain) == NULL) + if (GTMProxy_ThreadCreate(GTMProxy_ThreadMain, i) == NULL) { elog(ERROR, "failed to create a new thread"); return STATUS_ERROR; @@ -437,6 +685,12 @@ main(int argc, char *argv[]) } /* + * Now all the worker threads are ready and the proxy can accept SIGUSR1 to reconnect + */ + + ReadyToReconnect = TRUE; + + /* * Accept any new connections. Add for each incoming connection to one of * the pre-forked threads. */ @@ -661,6 +915,25 @@ GTMProxy_ThreadMain(void *argp) initStringInfo(&input_message); /* + * Set GTM communication error handling options. + */ + thrinfo->thr_gtm_conn->gtmErrorWaitOpt = GTMErrorWaitOpt; + thrinfo->thr_gtm_conn->gtmErrorWaitSecs = GTMErrorWaitSecs; + thrinfo->thr_gtm_conn->gtmErrorWaitCount = GTMErrorWaitCount; + + thrinfo->reconnect_issued = FALSE; + + /* + * Initialize command backup area + */ + for (ii = 0; ii < GTM_PROXY_MAX_CONNECTIONS; ii++) + { + thrinfo->thr_any_backup[ii] = FALSE; + thrinfo->thr_qtype[ii] = 0; + initStringInfo(&(thrinfo->thr_inBufData[ii])); + } + + /* * If an exception is encountered, processing resumes here so we abort the * current transaction and start a new one. * @@ -727,6 +1000,18 @@ GTMProxy_ThreadMain(void *argp) /* We can now handle ereport(ERROR) */ PG_exception_stack = &local_sigjmp_buf; + /* + * Now we're entering the thread loop. The last work is to initialize SIGUSR2 control. + */ + Disable_Longjmp(); + GetMyThreadInfo->can_accept_SIGUSR2 = TRUE; + GetMyThreadInfo->reconnect_issued = FALSE; + GetMyThreadInfo->can_longjmp = FALSE; + + /*-------------------------------------------------------------- + * Thread Loop + *------------------------------------------------------------- + */ for (;;) { gtm_ListCell *elem = NULL; @@ -820,6 +1105,56 @@ GTMProxy_ThreadMain(void *argp) thrinfo->thr_processed_commands = gtm_NIL; memset(thrinfo->thr_pending_commands, 0, sizeof (thrinfo->thr_pending_commands)); + + /* + * Each SIGUSR2 should return here, and please note that from the beginning + * of the outer loop, longjmp is disabled and the signal handler will simply return, + * so that we don't have to be bothered with the memory context. We should be + * sure to be in the MemoryContext where longjmp() is issued. 
+ */ + setjmp_again: + if (sigsetjmp(GetMyThreadInfo->longjmp_env, 1) == 0) + { + Disable_Longjmp(); + } + else + { + /* + * SIGUSR2 is detected and jumped here + */ + PG_SETMASK(&UnBlockSig); + /* + * Disconnect the current connection and re-connect to the new GTM + */ + GTMPQfinish(thrinfo->thr_gtm_conn); + sprintf(gtm_connect_string, "host=%s port=%d pgxc_node_id=%d remote_type=%d", + NewGTMServerHost, NewGTMServerPortNumber, GTMProxyID, PGXC_NODE_GTM_PROXY); + thrinfo->thr_gtm_conn = PQconnectGTM(gtm_connect_string); + + if (thrinfo->thr_gtm_conn == NULL) + elog(FATAL, "GTM connection failed."); + + /* + * Set GTM communication error handling option + */ + thrinfo->thr_gtm_conn->gtmErrorWaitOpt = GTMErrorWaitOpt; + thrinfo->thr_gtm_conn->gtmErrorWaitSecs = GTMErrorWaitSecs; + thrinfo->thr_gtm_conn->gtmErrorWaitCount = GTMErrorWaitCount; + + /* + * Initialize the command processing + */ + thrinfo->reconnect_issued = FALSE; + thrinfo->thr_processed_commands = gtm_NIL; + for (ii = 0; ii < MSG_TYPE_COUNT; ii++) + { + thrinfo->thr_pending_commands[ii] = gtm_NIL; + } + gtm_list_free_deep(thrinfo->thr_processed_commands); + thrinfo->thr_processed_commands = gtm_NIL; + goto setjmp_again; /* Get ready for another SIGUSR2 */ + } + /* * Now, read command from each of the connections that has some data to * be read. @@ -840,7 +1175,7 @@ GTMProxy_ThreadMain(void *argp) continue; } - if (thrinfo->thr_poll_fds[ii].revents & POLLIN) + if ((thrinfo->thr_any_backup[ii]) || (thrinfo->thr_poll_fds[ii].revents & POLLIN)) { /* * (3) read a command (loop blocks here) @@ -900,7 +1235,9 @@ GTMProxy_ThreadMain(void *argp) /* * Make sure everything is on wire now */ + Enable_Longjmp(); gtmpqFlush(thrinfo->thr_gtm_conn); + Disable_Longjmp(); /* * Read back the responses and put them on to the right backend @@ -917,8 +1254,10 @@ GTMProxy_ThreadMain(void *argp) */ if (cmdinfo->ci_res_index == 0) { + Enable_Longjmp(); if ((res = GTMPQgetResult(thrinfo->thr_gtm_conn)) == NULL) elog(ERROR, "GTMPQgetResult failed"); + Disable_Longjmp(); } ProcessResponse(thrinfo, cmdinfo, res); @@ -1052,9 +1391,15 @@ ProcessResponse(GTMProxy_ThreadInfo *thrinfo, GTMProxy_CommandInfo *cmdinfo, if (res->gr_status == GTM_RESULT_OK) { if (res->gr_type != TXN_BEGIN_GETGXID_MULTI_RESULT) + { + ReleaseCmdBackup(cmdinfo); elog(ERROR, "Wrong result"); + } if (cmdinfo->ci_res_index >= res->gr_resdata.grd_txn_get_multi.txn_count) + { + ReleaseCmdBackup(cmdinfo); elog(ERROR, "Too few GXIDs"); + } gxid = res->gr_resdata.grd_txn_get_multi.start_gxid + cmdinfo->ci_res_index; @@ -1080,18 +1425,25 @@ ProcessResponse(GTMProxy_ThreadInfo *thrinfo, GTMProxy_CommandInfo *cmdinfo, pq_flush(cmdinfo->ci_conn->con_port); } cmdinfo->ci_conn->con_pending_msg = MSG_TYPE_INVALID; + ReleaseCmdBackup(cmdinfo); break; case MSG_TXN_COMMIT: if (res->gr_type != TXN_COMMIT_MULTI_RESULT) + { + ReleaseCmdBackup(cmdinfo); elog(ERROR, "Wrong result"); + } /* * These are grouped messages. We send an array of GXIDs to commit * or rollback and the server sends us back an array of status * codes. 
*/ if (cmdinfo->ci_res_index >= res->gr_resdata.grd_txn_rc_multi.txn_count) + { + ReleaseCmdBackup(cmdinfo); elog(ERROR, "Too few GXIDs"); + } if (res->gr_resdata.grd_txn_rc_multi.status[cmdinfo->ci_res_index] == STATUS_OK) { @@ -1102,20 +1454,30 @@ ProcessResponse(GTMProxy_ThreadInfo *thrinfo, GTMProxy_CommandInfo *cmdinfo, pq_flush(cmdinfo->ci_conn->con_port); } else + { + ReleaseCmdBackup(cmdinfo); ereport(ERROR2, (EINVAL, errmsg("Transaction commit failed"))); + } cmdinfo->ci_conn->con_pending_msg = MSG_TYPE_INVALID; + ReleaseCmdBackup(cmdinfo); break; case MSG_TXN_ROLLBACK: if (res->gr_type != TXN_ROLLBACK_MULTI_RESULT) + { + ReleaseCmdBackup(cmdinfo); elog(ERROR, "Wrong result"); + } /* * These are grouped messages. We send an array of GXIDs to commit * or rollback and the server sends us back an array of status * codes. */ if (cmdinfo->ci_res_index >= res->gr_resdata.grd_txn_rc_multi.txn_count) + { + ReleaseCmdBackup(cmdinfo); elog(ERROR, "Too few GXIDs"); + } if (res->gr_resdata.grd_txn_rc_multi.status[cmdinfo->ci_res_index] == STATUS_OK) { @@ -1126,17 +1488,27 @@ ProcessResponse(GTMProxy_ThreadInfo *thrinfo, GTMProxy_CommandInfo *cmdinfo, pq_flush(cmdinfo->ci_conn->con_port); } else + { + ReleaseCmdBackup(cmdinfo); ereport(ERROR2, (EINVAL, errmsg("Transaction commit failed"))); + } cmdinfo->ci_conn->con_pending_msg = MSG_TYPE_INVALID; + ReleaseCmdBackup(cmdinfo); break; case MSG_SNAPSHOT_GET: if ((res->gr_type != SNAPSHOT_GET_RESULT) && (res->gr_type != SNAPSHOT_GET_MULTI_RESULT)) + { + ReleaseCmdBackup(cmdinfo); elog(ERROR, "Wrong result"); + } if (cmdinfo->ci_res_index >= res->gr_resdata.grd_txn_snap_multi.txn_count) + { + ReleaseCmdBackup(cmdinfo); elog(ERROR, "Too few GXIDs"); + } if (res->gr_resdata.grd_txn_snap_multi.status[cmdinfo->ci_res_index] == STATUS_OK) { @@ -1158,8 +1530,12 @@ ProcessResponse(GTMProxy_ThreadInfo *thrinfo, GTMProxy_CommandInfo *cmdinfo, pq_flush(cmdinfo->ci_conn->con_port); } else + { + ReleaseCmdBackup(cmdinfo); ereport(ERROR2, (EINVAL, errmsg("snapshot request failed"))); + } cmdinfo->ci_conn->con_pending_msg = MSG_TYPE_INVALID; + ReleaseCmdBackup(cmdinfo); break; case MSG_TXN_BEGIN: @@ -1185,7 +1561,10 @@ ProcessResponse(GTMProxy_ThreadInfo *thrinfo, GTMProxy_CommandInfo *cmdinfo, if ((res->gr_proxyhdr.ph_conid == InvalidGTMProxyConnID) || (res->gr_proxyhdr.ph_conid >= GTM_PROXY_MAX_CONNECTIONS) || (thrinfo->thr_all_conns[res->gr_proxyhdr.ph_conid] != cmdinfo->ci_conn)) + { + ReleaseCmdBackup(cmdinfo); elog(PANIC, "Invalid response or synchronization loss"); + } /* * These are just proxied messages.. so just forward the response @@ -1213,9 +1592,11 @@ ProcessResponse(GTMProxy_ThreadInfo *thrinfo, GTMProxy_CommandInfo *cmdinfo, break; } cmdinfo->ci_conn->con_pending_msg = MSG_TYPE_INVALID; + ReleaseCmdBackup(cmdinfo); break; default: + ReleaseCmdBackup(cmdinfo); ereport(FATAL, (EPROTO, errmsg("invalid frontend message type %d", @@ -1234,11 +1615,32 @@ static int ReadCommand(GTMProxy_ConnectionInfo *conninfo, StringInfo inBuf) { int qtype; + int rv; + int connIdx = conninfo->con_id; + int anyBackup; + int myLocalId; + myLocalId = GetMyThreadInfo->thr_localid; + anyBackup = (GetMyThreadInfo->thr_any_backup[connIdx] ? TRUE : FALSE); + + /* - * Get message type code from the frontend. + * Get message type code from the frontend. */ - qtype = pq_getbyte(conninfo->con_port); + if (!anyBackup) + { + qtype = pq_getbyte(conninfo->con_port); + GetMyThreadInfo->thr_qtype[connIdx] = qtype; + /* + * We should not update thr_any_backup here. 
This should be + * updated when the backup is consumed or command processing + * is done. + */ + } + else + { + qtype = GetMyThreadInfo->thr_qtype[connIdx]; + } if (qtype == EOF) /* frontend disconnected */ { @@ -1283,9 +1685,20 @@ ReadCommand(GTMProxy_ConnectionInfo *conninfo, StringInfo inBuf) * after the type code; we can read the message contents independently of * the type. */ - if (pq_getmessage(conninfo->con_port, inBuf, 0)) - return EOF; /* suitable message already logged */ - + if (!anyBackup) + { + if (pq_getmessage(conninfo->con_port, inBuf, 0)) + return EOF; /* suitable message already logged */ + copyStringInfo(&(GetMyThreadInfo->thr_inBufData[connIdx]), inBuf); + /* The next line should be added when we add the code to clear backup when the response is processed. */ +#if 0 + GetMyThreadInfo->thr_any_backup[connIdx] = TRUE; +#endif + } + else + { + copyStringInfo(inBuf, &(GetMyThreadInfo->thr_inBufData[connIdx])); + } return qtype; } @@ -1576,8 +1989,10 @@ GTMProxy_ProxyCommand(GTMProxy_ConnectionInfo *conninfo, GTM_Conn *gtm_conn, thrinfo->thr_processed_commands = gtm_lappend(thrinfo->thr_processed_commands, cmdinfo); /* Finish the message. */ + Enable_Longjmp(); if (gtmpqPutMsgEnd(gtm_conn)) elog(ERROR, "Error finishing the message"); + Disable_Longjmp(); return; } @@ -1843,8 +2258,10 @@ GTMProxy_ProcessPendingCommands(GTMProxy_ThreadInfo *thrinfo) } /* Finish the message. */ + Enable_Longjmp(); if (gtmpqPutMsgEnd(gtm_conn)) elog(ERROR, "Error finishing the message"); + Disable_Longjmp(); /* * Move the entire list to the processed command @@ -1881,8 +2298,10 @@ GTMProxy_ProcessPendingCommands(GTMProxy_ThreadInfo *thrinfo) } /* Finish the message. */ + Enable_Longjmp(); if (gtmpqPutMsgEnd(gtm_conn)) elog(ERROR, "Error finishing the message"); + Disable_Longjmp(); /* * Move the entire list to the processed command @@ -1921,8 +2340,10 @@ GTMProxy_ProcessPendingCommands(GTMProxy_ThreadInfo *thrinfo) } /* Finish the message. */ + Enable_Longjmp(); if (gtmpqPutMsgEnd(gtm_conn)) elog(ERROR, "Error finishing the message"); + Disable_Longjmp(); /* @@ -1960,8 +2381,10 @@ GTMProxy_ProcessPendingCommands(GTMProxy_ThreadInfo *thrinfo) } /* Finish the message. */ + Enable_Longjmp(); if (gtmpqPutMsgEnd(gtm_conn)) elog(ERROR, "Error finishing the message"); + Disable_Longjmp(); /* * Move the entire list to the processed command @@ -2421,6 +2844,25 @@ ConnectGTM(void) } /* + * Release backup command data + */ +#if 1 +static void ReleaseCmdBackup(GTMProxy_CommandInfo *cmdinfo) +{ + GTMProxy_ConnID connIdx = cmdinfo->ci_conn->con_id; + + GetMyThreadInfo->thr_any_backup[connIdx] = FALSE; + GetMyThreadInfo->thr_qtype[connIdx] = 0; + resetStringInfo(&(GetMyThreadInfo->thr_inBufData[connIdx])); +} +#else +static void ReleaseCmdBackup(GTMProxy_CommandInfo *cmdinfo) +{ + return; +} +#endif + +/* * dummy function to avoid compile error. 
*/ bool diff --git a/src/gtm/proxy/proxy_thread.c b/src/gtm/proxy/proxy_thread.c index 4139936..6aca454 100644 --- a/src/gtm/proxy/proxy_thread.c +++ b/src/gtm/proxy/proxy_thread.c @@ -27,6 +27,9 @@ GTMProxy_Threads *GTMProxyThreads = &GTMProxyThreadsData; #define GTM_PROXY_MAX_THREADS 1024 /* Max threads allowed in the GTMProxy */ #define GTMProxyThreadsFull (GTMProxyThreads->gt_thread_count == GTMProxyThreads->gt_array_size) +extern int GTMProxyWorkerThreads; +extern GTMProxy_ThreadInfo **Proxy_ThreadInfo; + /* * Add the given thrinfo structure to the global array, expanding it if * necessary @@ -126,7 +129,7 @@ GTMProxy_ThreadRemove(GTMProxy_ThreadInfo *thrinfo) * "startroutine". The thread information is returned to the calling process. */ GTMProxy_ThreadInfo * -GTMProxy_ThreadCreate(void *(* startroutine)(void *)) +GTMProxy_ThreadCreate(void *(* startroutine)(void *), int idx) { GTMProxy_ThreadInfo *thrinfo; int err; @@ -142,6 +145,11 @@ GTMProxy_ThreadCreate(void *(* startroutine)(void *)) GTM_CVInit(&thrinfo->thr_cv); /* + * Initialize communication area with SIGUSR2 signal handler (reconnect) + */ + Proxy_ThreadInfo[idx] = thrinfo; + + /* * The thread status is set to GTM_PROXY_THREAD_STARTING and will be changed by * the thread itself when it actually starts executing */ @@ -418,6 +426,13 @@ GTMProxy_ThreadRemoveConnection(GTMProxy_ThreadInfo *thrinfo, GTMProxy_Connectio } /* + * Reset command backup info + */ + thrinfo->thr_any_backup[ii] = FALSE; + thrinfo->thr_qtype[ii] = 0; + resetStringInfo(&(thrinfo->thr_inBufData[ii])); + + /* * If this is the last entry in the array ? If not, then copy the last * entry in this slot and mark the last slot an empty */ diff --git a/src/include/gtm/gtm_proxy.h b/src/include/gtm/gtm_proxy.h index 2af5ef3..6e5f916 100644 --- a/src/include/gtm/gtm_proxy.h +++ b/src/include/gtm/gtm_proxy.h @@ -110,11 +110,23 @@ typedef struct GTMProxy_ThreadInfo /* connection array */ GTMProxy_ConnectionInfo *thr_all_conns[GTM_PROXY_MAX_CONNECTIONS]; struct pollfd thr_poll_fds[GTM_PROXY_MAX_CONNECTIONS]; + + /* Command backup */ + short thr_any_backup[GTM_PROXY_MAX_CONNECTIONS]; + int thr_qtype[GTM_PROXY_MAX_CONNECTIONS]; + StringInfoData thr_inBufData[GTM_PROXY_MAX_CONNECTIONS]; + gtm_List *thr_processed_commands; gtm_List *thr_pending_commands[MSG_TYPE_COUNT]; GTM_Conn *thr_gtm_conn; + /* Reconnect Info */ + int can_accept_SIGUSR2; + int reconnect_issued; + int can_longjmp; + sigjmp_buf longjmp_env; + } GTMProxy_ThreadInfo; typedef struct GTMProxy_Threads @@ -133,7 +145,7 @@ int GTMProxy_ThreadRemove(GTMProxy_ThreadInfo *thrinfo); int GTMProxy_ThreadJoin(GTMProxy_ThreadInfo *thrinfo); void GTMProxy_ThreadExit(void); -extern GTMProxy_ThreadInfo *GTMProxy_ThreadCreate(void *(* startroutine)(void *)); +extern GTMProxy_ThreadInfo *GTMProxy_ThreadCreate(void *(* startroutine)(void *), int idx); extern GTMProxy_ThreadInfo * GTMProxy_GetThreadInfo(GTM_ThreadID thrid); extern GTMProxy_ThreadInfo *GTMProxy_ThreadAddConnection(GTMProxy_ConnectionInfo *conninfo); extern int GTMProxy_ThreadRemoveConnection(GTMProxy_ThreadInfo *thrinfo, @@ -231,4 +243,23 @@ extern GTM_ThreadID TopMostThreadID; CritSectionCount--; \ } while(0) +/* Signal Handler controller */ +#define SIGUSR2DETECTED() (GetMyThreadInfo->reconnect_issued == TRUE) +#define RECONNECT_LONGJMP() do{longjmp(GetMyThreadInfo->longjmp_env, 1);}while(0) +#if 1 +#define Disable_Longjmp() do{GetMyThreadInfo->can_longjmp = FALSE;}while(0) +#define Enable_Longjmp() \ + do{ \ + if (SIGUSR2DETECTED()) { \ + 
RECONNECT_LONGJMP(); \ + } \ + else { \ + GetMyThreadInfo->can_longjmp = TRUE; \ + } \ + } while(0) +#else +#define Disable_Longjmp() +#define Enable_Longjmp() +#endif + #endif diff --git a/src/include/gtm/libpq-be.h b/src/include/gtm/libpq-be.h index 8e9805f..a2cc004 100644 --- a/src/include/gtm/libpq-be.h +++ b/src/include/gtm/libpq-be.h @@ -73,6 +73,13 @@ typedef struct Port int keepalives_idle; int keepalives_interval; int keepalives_count; + + /* + * GTM communication error handling. See libpq-int.h for details. + */ + int connErr_WaitOpt; + int connErr_WaitSecs; + int connErr_WaitCount; } Port; /* TCP keepalives configuration. These are no-ops on an AF_UNIX socket. */ diff --git a/src/include/gtm/libpq-int.h b/src/include/gtm/libpq-int.h index 30775e3..e8495a1 100644 --- a/src/include/gtm/libpq-int.h +++ b/src/include/gtm/libpq-int.h @@ -86,6 +86,13 @@ struct gtm_conn /* Buffer for receiving various parts of messages */ PQExpBufferData workBuffer; /* expansible string */ + /* + * Options to handle GTM communication error. + */ + int gtmErrorWaitOpt; /* If true, wait for the reconnect signal, i.e. assume XCM is available */ + int gtmErrorWaitSecs; /* Duration of the wait time in seconds */ + int gtmErrorWaitCount; /* How many durations to wait */ + /* Pointer to the result of last operation */ GTM_Result *result; }; diff --git a/src/include/gtm/stringinfo.h b/src/include/gtm/stringinfo.h index d504685..81b55dc 100644 --- a/src/include/gtm/stringinfo.h +++ b/src/include/gtm/stringinfo.h @@ -146,4 +146,17 @@ extern void appendBinaryStringInfo(StringInfo str, */ extern void enlargeStringInfo(StringInfo str, int needed); +/*----------------------- + * dupStringInfo + * Get new StringInfo and copy the original to it. + */ +extern StringInfo dupStringInfo(StringInfo orig); + +/*------------------------ + * copyStringInfo + * Copy StringInfo. Deep copy: Data will be copied too. + * cursor of "to" will be initialized to zero. + */ +extern void copyStringInfo(StringInfo to, StringInfo from); + #endif /* STRINGINFO_H */ ----------------------------------------------------------------------- Summary of changes: src/gtm/common/stringinfo.c | 35 ++++ src/gtm/gtm_ctl/gtm_ctl.c | 23 ++- src/gtm/proxy/proxy_main.c | 460 +++++++++++++++++++++++++++++++++++++++++- src/gtm/proxy/proxy_thread.c | 17 ++- src/include/gtm/gtm_proxy.h | 33 +++- src/include/gtm/libpq-be.h | 7 + src/include/gtm/libpq-int.h | 7 + src/include/gtm/stringinfo.h | 13 ++ 8 files changed, 580 insertions(+), 15 deletions(-) hooks/post-receive -- Postgres-XC |
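The reconnect machinery above only allows the SIGUSR2 handler to siglongjmp while a worker thread is blocked in a GTM call; that is the window the Enable_Longjmp()/Disable_Longjmp() pairs open and close around gtmpqFlush() and GTMPQgetResult(). A condensed, self-contained sketch of that pattern, using generic names rather than the actual gtm-proxy symbols:

    #include <setjmp.h>
    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    static sigjmp_buf reconnect_env;
    static volatile sig_atomic_t can_longjmp = 0;

    /* Jump back into the main loop only inside the window the caller
     * opened; otherwise the handler simply returns. */
    static void
    on_sigusr2(int signum)
    {
        (void) signum;
        if (can_longjmp)
            siglongjmp(reconnect_env, 1);
    }

    int
    main(void)
    {
        signal(SIGUSR2, on_sigusr2);

        if (sigsetjmp(reconnect_env, 1) != 0)
        {
            /* Landed here from the handler: drop the old connection
             * and re-establish it before resuming the loop. */
            printf("reconnect requested\n");
        }

        for (;;)
        {
            can_longjmp = 1;  /* Enable_Longjmp(): blocking call may be interrupted */
            pause();          /* stands in for a blocking GTM send/receive */
            can_longjmp = 0;  /* Disable_Longjmp(): non-interruptible bookkeeping */
        }
    }

The command backup added to ReadCommand() is the other half of the story: since a jump can abandon a half-processed request, the saved qtype and input buffer let the thread replay that request against the new connection, and ReleaseCmdBackup() discards the copy once the response has been delivered.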
From: Ashutosh B. <ash...@us...> - 2011-05-19 09:16:55
|
Project "Postgres-XC". The branch, master has been updated via 87a62879ab3492e3dd37d00478ffa857639e2b85 (commit) from b170fe2d7fc4bd175c72c2e4370fab223bac24d6 (commit) - Log ----------------------------------------------------------------- commit 87a62879ab3492e3dd37d00478ffa857639e2b85 Author: Ashutosh Bapat <ash...@en...> Date: Thu May 19 14:45:02 2011 +0530 While copying the message from datanode to a slot, copy it within the memory context of the slot. Fix some compiler warnings. diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c index 5430e16..76c4ba3 100644 --- a/src/backend/executor/execTuples.c +++ b/src/backend/executor/execTuples.c @@ -1388,6 +1388,13 @@ ExecStoreDataRowTuple(char *msg, size_t len, int node, TupleTableSlot *slot, heap_freetuple(slot->tts_tuple); if (slot->tts_shouldFreeMin) heap_free_minimal_tuple(slot->tts_mintuple); + /* + * if msg == slot->tts_dataRow then we would + * free the dataRow in the slot loosing the contents in msg. It is safe + * to reset shouldFreeRow, since it will be overwritten just below. + */ + if (msg == slot->tts_dataRow) + slot->tts_shouldFreeRow = false; if (slot->tts_shouldFreeRow) pfree(slot->tts_dataRow); diff --git a/src/backend/pgxc/pool/execRemote.c b/src/backend/pgxc/pool/execRemote.c index 43d9606..75aca21 100644 --- a/src/backend/pgxc/pool/execRemote.c +++ b/src/backend/pgxc/pool/execRemote.c @@ -49,7 +49,7 @@ #define PRIMARY_NODE_WRITEAHEAD 1024 * 1024 static bool autocommit = true; -static is_ddl = false; +static bool is_ddl = false; static bool implicit_force_autocommit = false; static PGXCNodeHandle **write_node_list = NULL; static int write_node_count = 0; @@ -420,7 +420,6 @@ create_tuple_desc(char *msg_body, size_t len) char *typname; Oid oidtypeid; int32 typemode, typmod; - uint32 n32; attnum = (AttrNumber) i; @@ -1152,6 +1151,27 @@ BufferConnection(PGXCNodeHandle *conn) } /* + * copy the datarow from combiner to the given slot, in the slot's memory + * context + */ +static void +CopyDataRowTupleToSlot(RemoteQueryState *combiner, TupleTableSlot *slot) +{ + char *msg; + MemoryContext oldcontext; + oldcontext = MemoryContextSwitchTo(slot->tts_mcxt); + msg = (char *)palloc(combiner->currentRow.msglen); + memcpy(msg, combiner->currentRow.msg, combiner->currentRow.msglen); + ExecStoreDataRowTuple(msg, combiner->currentRow.msglen, + combiner->currentRow.msgnode, slot, true); + pfree(combiner->currentRow.msg); + combiner->currentRow.msg = NULL; + combiner->currentRow.msglen = 0; + combiner->currentRow.msgnode = 0; + MemoryContextSwitchTo(oldcontext); +} + +/* * Get next data row from the combiner's buffer into provided slot * Just clear slot and return false if buffer is empty, that means end of result * set is reached @@ -1164,12 +1184,7 @@ FetchTuple(RemoteQueryState *combiner, TupleTableSlot *slot) /* If we have message in the buffer, consume it */ if (combiner->currentRow.msg) { - ExecStoreDataRowTuple(combiner->currentRow.msg, - combiner->currentRow.msglen, - combiner->currentRow.msgnode, slot, true); - combiner->currentRow.msg = NULL; - combiner->currentRow.msglen = 0; - combiner->currentRow.msgnode = 0; + CopyDataRowTupleToSlot(combiner, slot); have_tuple = true; } @@ -1189,6 +1204,10 @@ FetchTuple(RemoteQueryState *combiner, TupleTableSlot *slot) * completed. Afterwards rows will be taken from the buffer bypassing * currentRow until buffer is empty, and only after that data are read * from a connection. 
+ * PGXCTODO: the message should be allocated in the same memory context as + * that of the slot. Are we sure of that in the call to + * ExecStoreDataRowTuple below? If one fixes this memory issue, please + * consider using CopyDataRowTupleToSlot() for the same. */ if (list_length(combiner->rowBuffer) > 0) { @@ -1279,12 +1298,7 @@ FetchTuple(RemoteQueryState *combiner, TupleTableSlot *slot) /* If we have message in the buffer, consume it */ if (combiner->currentRow.msg) { - ExecStoreDataRowTuple(combiner->currentRow.msg, - combiner->currentRow.msglen, - combiner->currentRow.msgnode, slot, true); - combiner->currentRow.msg = NULL; - combiner->currentRow.msglen = 0; - combiner->currentRow.msgnode = 0; + CopyDataRowTupleToSlot(combiner, slot); have_tuple = true; } @@ -3762,7 +3776,7 @@ handle_results: natts = resultslot->tts_tupleDescriptor->natts; for (i = 0; i < natts; ++i) { - if (resultslot->tts_values[i] == NULL) + if (resultslot->tts_values[i] == (Datum) NULL) return NULL; } ----------------------------------------------------------------------- Summary of changes: src/backend/executor/execTuples.c | 7 +++++ src/backend/pgxc/pool/execRemote.c | 44 +++++++++++++++++++++++------------ 2 files changed, 36 insertions(+), 15 deletions(-) hooks/post-receive -- Postgres-XC |
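The CopyDataRowTupleToSlot() change above is an instance of the usual PostgreSQL memory-context ownership idiom: switch into the context of the object that must own the data, allocate and copy there, then switch back. Reduced to its core, assuming the standard backend MemoryContext API (the function name here is made up for illustration):

    /* Assumes a PostgreSQL backend translation unit. */
    #include "postgres.h"
    #include "utils/memutils.h"

    /* Copy 'len' bytes of 'src' into 'target' so that the copy lives
     * exactly as long as that context, regardless of which context the
     * caller happens to be in. */
    static char *
    copy_into_context(MemoryContext target, const char *src, size_t len)
    {
        MemoryContext oldcontext = MemoryContextSwitchTo(target);
        char       *copy = palloc(len);

        memcpy(copy, src, len);
        MemoryContextSwitchTo(oldcontext);
        return copy;
    }

Allocating in the slot's tts_mcxt is what makes it safe for the slot to pfree the row when it is cleared; handing the combiner's buffer straight to the slot, as the old code did, tied the row's lifetime to the wrong context.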
From: Koichi S. <koi...@us...> - 2011-05-16 09:34:51
|
Project "Postgres-XC". The branch, documentation has been updated via 1dc1b90e434dbe56438dec9dfa8887de1767d457 (commit) from 58954e79e274b1280329aa61ca6af66b88c59cf6 (commit) - Log ----------------------------------------------------------------- commit 1dc1b90e434dbe56438dec9dfa8887de1767d457 Author: Koichi Suzuki <koi...@gm...> Date: Mon May 16 18:30:02 2011 +0900 This commit corrects the initial notify for each document source to mark what section is from PostgreSQL and needs review/revision for Postgres-XC. Next work is to review each section and revise it as Postgres-XC documentation. This will be done in the following order: 1. Installation/Cluster configuration/Cluster admninistration, 2. Statements, 3. Tutorials, 4. Application interface. All the source files are affected. diff --git a/doc/src/sgml/acronyms.sgmlin b/doc/src/sgml/acronyms.sgmlin index de4e1e0..8d1cab6 100644 --- a/doc/src/sgml/acronyms.sgmlin +++ b/doc/src/sgml/acronyms.sgmlin @@ -2,6 +2,7 @@ <appendix id="acronyms"> <title>Acronyms</title> + &pgnotice; <para> This is a list of acronyms commonly used in the <productname>PostgreSQL</> diff --git a/doc/src/sgml/adminpack.sgmlin b/doc/src/sgml/adminpack.sgmlin index b097000..4ec8c99 100644 --- a/doc/src/sgml/adminpack.sgmlin +++ b/doc/src/sgml/adminpack.sgmlin @@ -6,6 +6,7 @@ <indexterm zone="adminpack"> <primary>adminpack</primary> </indexterm> + &pgnotice; <para> <filename>adminpack</> provides a number of support functions which @@ -16,7 +17,7 @@ <sect2> <title>Functions implemented</title> - + &pgnotice; <para> The functions implemented by <filename>adminpack</> can only be run by a superuser. Here's a list of these functions: diff --git a/doc/src/sgml/advanced.sgmlin b/doc/src/sgml/advanced.sgmlin index 38045b5..ab7e1dd 100644 --- a/doc/src/sgml/advanced.sgmlin +++ b/doc/src/sgml/advanced.sgmlin @@ -5,6 +5,7 @@ <sect1 id="tutorial-advanced-intro"> <title>Introduction</title> + &pgnotice; <para> In the previous chapter we have covered the basics of using @@ -35,6 +36,7 @@ <indexterm zone="tutorial-views"> <primary>view</primary> </indexterm> + &pgnotice; <para> Refer back to the queries in <xref linkend="tutorial-join">. @@ -78,6 +80,7 @@ SELECT * FROM myview; <indexterm zone="tutorial-fk"> <primary>referential integrity</primary> </indexterm> + &pgnotice; <para> Recall the <classname>weather</classname> and @@ -143,6 +146,7 @@ DETAIL: Key (city)=(Berkeley) is not present in table "cities". <indexterm zone="tutorial-transactions"> <primary>transaction</primary> </indexterm> + &pgnotice; <para> <firstterm>Transactions</> are a fundamental concept of all database @@ -323,6 +327,7 @@ COMMIT; <indexterm zone="tutorial-window"> <primary>window function</primary> </indexterm> + &pgnotice; <para> A <firstterm>window function</> performs a calculation across a set of @@ -567,6 +572,7 @@ SELECT sum(salary) OVER w, avg(salary) OVER w <indexterm zone="tutorial-inheritance"> <primary>inheritance</primary> </indexterm> + &pgnotice; <para> Inheritance is a concept from object-oriented databases. It opens @@ -699,6 +705,7 @@ SELECT name, altitude <sect1 id="tutorial-conclusion"> <title>Conclusion</title> + &pgnotice; <para> <productname>PostgreSQL</productname> has many features not diff --git a/doc/src/sgml/arch-dev.sgmlin b/doc/src/sgml/arch-dev.sgmlin index b656f9b..c2bcca9 100644 --- a/doc/src/sgml/arch-dev.sgmlin +++ b/doc/src/sgml/arch-dev.sgmlin @@ -12,6 +12,7 @@ of O.Univ.Prof.Dr. Georg Gottlob and Univ.Ass. Mag. Katrin Seyr. 
</para> </note> + &pgnotice; <para> This chapter gives an overview of the internal structure of the @@ -28,6 +29,7 @@ <sect1 id="query-path"> <title>The Path of a Query</title> + &pgnotice; <para> Here we give a short overview of the stages a query has to pass in @@ -115,6 +117,7 @@ <sect1 id="connect-estab"> <title>How Connections are Established</title> + &pgnotice; <para> <productname>PostgreSQL</productname> is implemented using a @@ -154,6 +157,7 @@ <sect1 id="parser-stage"> <title>The Parser Stage</title> + &pgnotice; <para> The <firstterm>parser stage</firstterm> consists of two parts: @@ -178,6 +182,7 @@ <sect2> <title>Parser</title> + &pgnotice; <para> The parser has to check the query string (which arrives as plain @@ -241,6 +246,7 @@ <sect2> <title>Transformation Process</title> + &pgnotice; <para> The parser stage creates a parse tree using only fixed rules about @@ -283,6 +289,7 @@ <sect1 id="rule-system"> <title>The <productname>PostgreSQL</productname> Rule System</title> + &pgnotice; <para> <productname>PostgreSQL</productname> supports a powerful @@ -328,6 +335,7 @@ <sect1 id="planner-optimizer"> <title>Planner/Optimizer</title> + &pgnotice; <para> The task of the <firstterm>planner/optimizer</firstterm> is to @@ -365,6 +373,7 @@ <sect2> <title>Generating Possible Plans</title> + &pgnotice; <para> The planner/optimizer starts by generating plans for scanning each @@ -477,6 +486,7 @@ <sect1 id="executor"> <title>Executor</title> + &pgnotice; <para> The <firstterm>executor</firstterm> takes the plan created by the diff --git a/doc/src/sgml/array.sgmlin b/doc/src/sgml/array.sgmlin index bfc373a..d5617b6 100644 --- a/doc/src/sgml/array.sgmlin +++ b/doc/src/sgml/array.sgmlin @@ -6,6 +6,7 @@ <indexterm> <primary>array</primary> </indexterm> + &pgnotice; <para> <productname>PostgreSQL</productname> allows columns of a table to be @@ -22,6 +23,7 @@ <primary>array</primary> <secondary>declaration</secondary> </indexterm> + &pgnotice; <para> To illustrate the use of array types, we create this table: @@ -92,6 +94,7 @@ CREATE TABLE tictactoe ( <primary>array</primary> <secondary>constant</secondary> </indexterm> + &pgnotice; <para> To write an array value as a literal constant, enclose the element @@ -204,6 +207,7 @@ INSERT INTO sal_emp <primary>array</primary> <secondary>accessing</secondary> </indexterm> + &pgnotice; <para> Now, we can run some queries on the table. @@ -348,6 +352,7 @@ SELECT array_length(schedule, 1) FROM sal_emp WHERE name = 'Carol'; <primary>array</primary> <secondary>modifying</secondary> </indexterm> + &pgnotice; <para> An array value can be replaced completely: @@ -527,6 +532,7 @@ SELECT array_cat(ARRAY[5,6], ARRAY[[1,2],[3,4]]); <primary>array</primary> <secondary>searching</secondary> </indexterm> + &pgnotice; <para> To search for a value in an array, each value must be checked. 
@@ -591,6 +597,7 @@ SELECT * FROM <primary>array</primary> <secondary>I/O</secondary> </indexterm> + &pgnotice; <para> The external text representation of an array value consists of items that diff --git a/doc/src/sgml/backup.sgmlin b/doc/src/sgml/backup.sgmlin index e99d9dc..3767316 100644 --- a/doc/src/sgml/backup.sgmlin +++ b/doc/src/sgml/backup.sgmlin @@ -4,6 +4,7 @@ <title>Backup and Restore</title> <indexterm zone="backup"><primary>backup</></> + &pgnotice; <para> As with everything that contains valuable data, <productname>PostgreSQL</> @@ -26,6 +27,7 @@ <sect1 id="backup-dump"> <title><acronym>SQL</> Dump</title> + &pgnotice; <para> The idea behind this dump method is to generate a text file with SQL @@ -104,6 +106,7 @@ pg_dump <replaceable class="parameter">dbname</replaceable> > <replaceable cl <sect2 id="backup-dump-restore"> <title>Restoring the dump</title> + &pgnotice; <para> The text files created by <application>pg_dump</> are intended to @@ -189,6 +192,7 @@ pg_dump -h <replaceable>host1</> <replaceable>dbname</> | psql -h <replaceable>h <sect2 id="backup-dump-all"> <title>Using <application>pg_dumpall</></title> + &pgnotice; <para> <application>pg_dump</> dumps only a single database at a time, @@ -226,6 +230,7 @@ psql -f <replaceable class="parameter">infile</replaceable> postgres <sect2 id="backup-dump-large"> <title>Handling large databases</title> + &pgnotice; <para> Some operating systems have maximum file size limits that cause @@ -315,6 +320,7 @@ pg_restore -d <replaceable class="parameter">dbname</replaceable> <replaceable c <sect1 id="backup-file"> <title>File System Level Backup</title> + &pgnotice; <para> An alternative backup strategy is to directly copy the files that @@ -439,6 +445,7 @@ tar -cf backup.tar /usr/local/pgsql/data <indexterm zone="backup"> <primary>PITR</primary> </indexterm> + &pgnotice; <para> At all times, <productname>PostgreSQL</> maintains a @@ -526,6 +533,7 @@ tar -cf backup.tar /usr/local/pgsql/data <sect2 id="backup-archiving-wal"> <title>Setting up WAL archiving</title> + &pgnotice; <para> In an abstract sense, a running <productname>PostgreSQL</> system @@ -719,6 +727,7 @@ archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/ser <sect2 id="backup-base-backup"> <title>Making a Base Backup</title> + &pgnotice; <para> The procedure for making a base backup is relatively simple: @@ -923,6 +932,7 @@ SELECT pg_stop_backup(); <sect2 id="backup-pitr-recovery"> <title>Recovering using a Continuous Archive Backup</title> + &pgnotice; <para> Okay, the worst has happened and you need to recover from your backup. @@ -1111,6 +1121,7 @@ restore_command = 'cp /mnt/server/archivedir/%f %p' <indexterm zone="backup"> <primary>timelines</primary> </indexterm> + &pgnotice; <para> The ability to restore the database to a previous point in time creates @@ -1175,6 +1186,7 @@ restore_command = 'cp /mnt/server/archivedir/%f %p' <sect2 id="backup-tips"> <title>Tips and Examples</title> + &pgnotice; <para> Some tips for configuring continuous archiving are given here. 
@@ -1182,6 +1194,7 @@ restore_command = 'cp /mnt/server/archivedir/%f %p' <sect3 id="backup-standalone"> <title>Standalone hot backups</title> + &pgnotice; <para> It is possible to use <productname>PostgreSQL</>'s backup facilities to @@ -1245,6 +1258,7 @@ restore_command = 'gunzip < /mnt/server/archivedir/%f | pg_decompresslog - %p <sect3 id="backup-scripts"> <title><varname>archive_command</varname> scripts</title> + &pgnotice; <para> Many people choose to use scripts to define their @@ -1294,6 +1308,7 @@ archive_command = 'local_backup_script.sh' <sect2 id="continuous-archiving-caveats"> <title>Caveats</title> + &pgnotice; <para> At this writing, there are several limitations of the continuous archiving @@ -1375,6 +1390,7 @@ archive_command = 'local_backup_script.sh' <primary>version</primary> <secondary>compatibility</secondary> </indexterm> + &pgnotice; <para> This section discusses how to migrate your database data from one @@ -1471,6 +1487,7 @@ archive_command = 'local_backup_script.sh' <sect2 id="migration-methods-pgdump"> <title>Migrating data via <application>pg_dump</></title> + &pgnotice; <para> To dump data from one major version of <productname>PostgreSQL</> and @@ -1551,6 +1568,7 @@ psql -f backup postgres <sect2 id="migration-methods-other"> <title>Other data migration methods</title> + &pgnotice; <para> The <filename>contrib</> program diff --git a/doc/src/sgml/basenam.sgmlin b/doc/src/sgml/basenam.sgmlin index a755887..c81222f 100644 --- a/doc/src/sgml/basenam.sgmlin +++ b/doc/src/sgml/basenam.sgmlin @@ -28,6 +28,8 @@ <title>Tutorial</title> <partintro> + &pgnotice; + <para> Welcome to the <productname>PostgreSQL</productname> Tutorial. The following few chapters are intended to give a simple introduction diff --git a/doc/src/sgml/biblio.sgmlin b/doc/src/sgml/biblio.sgmlin index 859b888..47ce658 100644 --- a/doc/src/sgml/biblio.sgmlin +++ b/doc/src/sgml/biblio.sgmlin @@ -7,6 +7,7 @@ Selected references and readings for <acronym>SQL</acronym> and <productname>PostgreSQL</productname>. 
</para> + &pgnotice; <para> Some white papers and technical reports from the original diff --git a/doc/src/sgml/btree-gin.sgmlin b/doc/src/sgml/btree-gin.sgmlin index eb111bb..323f072 100644 --- a/doc/src/sgml/btree-gin.sgmlin +++ b/doc/src/sgml/btree-gin.sgmlin @@ -6,6 +6,7 @@ <indexterm zone="btree-gin"> <primary>btree_gin</primary> </indexterm> + &pgnotice; <para> <filename>btree_gin</> provides sample GIN operator classes that @@ -45,6 +46,7 @@ SELECT * FROM test WHERE a < 10; <sect2> <title>Authors</title> + &pgnotice; <para> Teodor Sigaev (<email>te...@st...</email>) and diff --git a/doc/src/sgml/btree-gist.sgmlin b/doc/src/sgml/btree-gist.sgmlin index 1d0a1e8..3782ce5 100644 --- a/doc/src/sgml/btree-gist.sgmlin +++ b/doc/src/sgml/btree-gist.sgmlin @@ -6,6 +6,7 @@ <indexterm zone="btree-gist"> <primary>btree_gist</primary> </indexterm> + &pgnotice; <para> <filename>btree_gist</> provides sample GiST operator classes that @@ -42,6 +43,7 @@ SELECT * FROM test WHERE a < 10; <sect2> <title>Authors</title> + &pgnotice; <para> Teodor Sigaev (<email>te...@st...</email>) , diff --git a/doc/src/sgml/catalogs.sgmlin b/doc/src/sgml/catalogs.sgmlin index 6825bbc..3e94841 100644 --- a/doc/src/sgml/catalogs.sgmlin +++ b/doc/src/sgml/catalogs.sgmlin @@ -5,6 +5,7 @@ <chapter id="catalogs"> <title>System Catalogs</title> + &pgnotice; <para> The system catalogs are the place where a relational database @@ -23,6 +24,7 @@ <sect1 id="catalogs-overview"> <title>Overview</title> + &pgnotice; <para> <xref linkend="catalog-table"> lists the system catalogs. @@ -279,6 +281,7 @@ <indexterm zone="catalog-pg-aggregate"> <primary>pg_aggregate</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_aggregate</structname> stores information about @@ -369,6 +372,7 @@ <indexterm zone="catalog-pg-am"> <primary>pg_am</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_am</structname> stores information about index @@ -589,6 +593,7 @@ <indexterm zone="catalog-pg-amop"> <primary>pg_amop</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_amop</structname> stores information about @@ -677,6 +682,7 @@ <indexterm zone="catalog-pg-amproc"> <primary>pg_amproc</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_amproc</structname> stores information about @@ -758,6 +764,7 @@ <indexterm zone="catalog-pg-attrdef"> <primary>pg_attrdef</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_attrdef</structname> stores column default values. 
The main information @@ -829,6 +836,7 @@ <indexterm zone="catalog-pg-attribute"> <primary>pg_attribute</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_attribute</structname> stores information about @@ -1075,6 +1083,7 @@ <indexterm zone="catalog-pg-authid"> <primary>pg_authid</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_authid</structname> contains information about @@ -1208,6 +1217,7 @@ <indexterm zone="catalog-pg-auth-members"> <primary>pg_auth_members</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_auth_members</structname> shows the membership @@ -1277,6 +1287,7 @@ <indexterm zone="catalog-pg-cast"> <primary>pg_cast</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_cast</structname> stores data type conversion @@ -1399,6 +1410,7 @@ <indexterm zone="catalog-pg-class"> <primary>pg_class</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_class</structname> catalogs tables and most @@ -1715,6 +1727,7 @@ <indexterm zone="catalog-pg-constraint"> <primary>pg_constraint</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_constraint</structname> stores check, primary @@ -1975,6 +1988,7 @@ <indexterm zone="catalog-pg-conversion"> <primary>pg_conversion</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_conversion</structname> describes the @@ -2060,6 +2074,7 @@ <indexterm zone="catalog-pg-database"> <primary>pg_database</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_database</structname> stores information about @@ -2219,6 +2234,7 @@ <indexterm zone="catalog-pg-db-role-setting"> <primary>pg_db_role_setting</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_db_role_setting</structname> records the default @@ -2279,6 +2295,7 @@ <indexterm zone="catalog-pg-default-acl"> <primary>pg_default_acl</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_default_acl</> stores initial @@ -2366,6 +2383,7 @@ <indexterm zone="catalog-pg-depend"> <primary>pg_depend</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_depend</structname> records the dependency @@ -2540,6 +2558,7 @@ <indexterm zone="catalog-pg-description"> <primary>pg_description</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_description</> stores optional descriptions @@ -2615,6 +2634,7 @@ <indexterm zone="catalog-pg-enum"> <primary>pg_enum</primary> </indexterm> + &pgnotice; <para> The <structname>pg_enum</structname> catalog contains entries @@ -2665,6 +2685,7 @@ <indexterm zone="catalog-pg-foreign-data-wrapper"> <primary>pg_foreign_data_wrapper</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_foreign_data_wrapper</structname> stores @@ -2746,6 +2767,7 @@ <indexterm zone="catalog-pg-foreign-server"> <primary>pg_foreign_server</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_foreign_server</structname> stores @@ -2835,6 +2857,7 @@ <indexterm zone="catalog-pg-index"> <primary>pg_index</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_index</structname> contains part of the information @@ -3016,6 +3039,7 @@ <indexterm zone="catalog-pg-inherits"> <primary>pg_inherits</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_inherits</> records information about @@ -3079,6 +3103,7 @@ <indexterm zone="catalog-pg-language"> <primary>pg_language</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_language</structname> 
registers @@ -3199,6 +3224,7 @@ <indexterm zone="catalog-pg-largeobject"> <primary>pg_largeobject</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_largeobject</structname> holds the data making up @@ -3279,6 +3305,7 @@ <indexterm zone="catalog-pg-largeobject-metadata"> <primary>pg_largeobject_metadata</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_largeobject_metadata</structname> @@ -3331,6 +3358,7 @@ <indexterm zone="catalog-pg-namespace"> <primary>pg_namespace</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_namespace</> stores namespaces. @@ -3391,6 +3419,7 @@ <indexterm zone="catalog-pg-opclass"> <primary>pg_opclass</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_opclass</structname> defines @@ -3498,6 +3527,7 @@ <indexterm zone="catalog-pg-operator"> <primary>pg_operator</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_operator</> stores information about operators. @@ -3639,6 +3669,7 @@ <indexterm zone="catalog-pg-opfamily"> <primary>pg_opfamily</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_opfamily</structname> defines operator families. @@ -3719,6 +3750,7 @@ <indexterm zone="catalog-pg-pltemplate"> <primary>pg_pltemplate</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_pltemplate</structname> stores @@ -3825,6 +3857,7 @@ <indexterm zone="catalog-pg-proc"> <primary>pg_proc</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_proc</> stores information about functions (or procedures). @@ -4121,6 +4154,7 @@ <indexterm zone="catalog-pg-rewrite"> <primary>pg_rewrite</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_rewrite</structname> stores rewrite rules for tables and views. @@ -4235,6 +4269,7 @@ <indexterm zone="catalog-pg-shdepend"> <primary>pg_shdepend</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_shdepend</structname> records the @@ -4388,6 +4423,7 @@ <indexterm zone="catalog-pg-shdescription"> <primary>pg_shdescription</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_shdescription</structname> stores optional @@ -4456,6 +4492,7 @@ <indexterm zone="catalog-pg-statistic"> <primary>pg_statistic</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_statistic</structname> stores @@ -4636,6 +4673,7 @@ <indexterm zone="catalog-pg-tablespace"> <primary>pg_tablespace</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_tablespace</structname> stores information @@ -4717,6 +4755,7 @@ <indexterm zone="catalog-pg-trigger"> <primary>pg_trigger</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_trigger</structname> stores triggers on tables. 
@@ -4893,6 +4932,7 @@ <indexterm zone="catalog-pg-ts-config"> <primary>pg_ts_config</primary> </indexterm> + &pgnotice; <para> The <structname>pg_ts_config</structname> catalog contains entries @@ -4964,6 +5004,7 @@ <indexterm zone="catalog-pg-ts-config-map"> <primary>pg_ts_config_map</primary> </indexterm> + &pgnotice; <para> The <structname>pg_ts_config_map</structname> catalog contains entries @@ -5031,6 +5072,7 @@ <indexterm zone="catalog-pg-ts-dict"> <primary>pg_ts_dict</primary> </indexterm> + &pgnotice; <para> The <structname>pg_ts_dict</structname> catalog contains entries @@ -5110,6 +5152,7 @@ <indexterm zone="catalog-pg-ts-parser"> <primary>pg_ts_parser</primary> </indexterm> + &pgnotice; <para> The <structname>pg_ts_parser</structname> catalog contains entries @@ -5200,6 +5243,7 @@ <indexterm zone="catalog-pg-ts-template"> <primary>pg_ts_template</primary> </indexterm> + &pgnotice; <para> The <structname>pg_ts_template</structname> catalog contains entries @@ -5269,6 +5313,7 @@ <indexterm zone="catalog-pg-type"> <primary>pg_type</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_type</structname> stores information about data @@ -5746,6 +5791,7 @@ <indexterm zone="catalog-pg-user-mapping"> <primary>pg_user_mapping</primary> </indexterm> + &pgnotice; <para> The catalog <structname>pg_user_mapping</structname> stores @@ -5801,6 +5847,7 @@ <sect1 id="views-overview"> <title>System Views</title> + &pgnotice; <para> In addition to the system catalogs, <productname>PostgreSQL</productname> @@ -5938,6 +5985,7 @@ <indexterm zone="view-pg-cursors"> <primary>pg_cursors</primary> </indexterm> + &pgnotice; <para> The <structname>pg_cursors</structname> view lists the cursors that @@ -6058,6 +6106,7 @@ <indexterm zone="view-pg-group"> <primary>pg_group</primary> </indexterm> + &pgnotice; <para> The view <structname>pg_group</structname> exists for backwards @@ -6114,6 +6163,7 @@ <indexterm zone="view-pg-indexes"> <primary>pg_indexes</primary> </indexterm> + &pgnotice; <para> The view <structname>pg_indexes</structname> provides access to @@ -6176,6 +6226,7 @@ <indexterm zone="view-pg-locks"> <primary>pg_locks</primary> </indexterm> + &pgnotice; <para> The view <structname>pg_locks</structname> provides access to @@ -6438,6 +6489,7 @@ <indexterm zone="view-pg-prepared-statements"> <primary>pg_prepared_statements</primary> </indexterm> + &pgnotice; <para> The <structname>pg_prepared_statements</structname> view displays @@ -6526,6 +6578,7 @@ <indexterm zone="view-pg-prepared-xacts"> <primary>pg_prepared_xacts</primary> </indexterm> + &pgnotice; <para> The view <structname>pg_prepared_xacts</structname> displays @@ -6614,6 +6667,7 @@ <indexterm zone="view-pg-roles"> <primary>pg_roles</primary> </indexterm> + &pgnotice; <para> The view <structname>pg_roles</structname> provides access to @@ -6741,6 +6795,7 @@ <indexterm zone="view-pg-rules"> <primary>pg_rules</primary> </indexterm> + &pgnotice; <para> The view <structname>pg_rules</structname> provides access to @@ -6801,6 +6856,7 @@ <indexterm zone="view-pg-settings"> <primary>pg_settings</primary> </indexterm> + &pgnotice; <para> The view <structname>pg_settings</structname> provides access to @@ -6943,6 +6999,7 @@ <indexterm zone="view-pg-shadow"> <primary>pg_shadow</primary> </indexterm> + &pgnotice; <para> The view <structname>pg_shadow</structname> exists for backwards @@ -7047,6 +7104,7 @@ <indexterm zone="view-pg-stats"> <primary>pg_stats</primary> </indexterm> + &pgnotice; <para> The view 
<structname>pg_stats</structname> provides access to @@ -7211,6 +7269,7 @@ <indexterm zone="view-pg-tables"> <primary>pg_tables</primary> </indexterm> + &pgnotice; <para> The view <structname>pg_tables</structname> provides access to @@ -7284,6 +7343,7 @@ <indexterm zone="view-pg-timezone-abbrevs"> <primary>pg_timezone_abbrevs</primary> </indexterm> + &pgnotice; <para> The view <structname>pg_timezone_abbrevs</structname> provides a list @@ -7331,6 +7391,7 @@ <indexterm zone="view-pg-timezone-names"> <primary>pg_timezone_names</primary> </indexterm> + &pgnotice; <para> The view <structname>pg_timezone_names</structname> provides a list @@ -7388,6 +7449,7 @@ <indexterm zone="view-pg-user"> <primary>pg_user</primary> </indexterm> + &pgnotice; <para> The view <structname>pg_user</structname> provides access to @@ -7471,6 +7533,7 @@ <indexterm zone="view-pg-user-mappings"> <primary>pg_user_mappings</primary> </indexterm> + &pgnotice; <para> The view <structname>pg_user_mappings</structname> provides access @@ -7556,6 +7619,7 @@ <indexterm zone="view-pg-views"> <primary>pg_views</primary> </indexterm> + &pgnotice; <para> The view <structname>pg_views</structname> provides access to diff --git a/doc/src/sgml/chkpass.sgmlin b/doc/src/sgml/chkpass.sgmlin index 865300f..517f906 100644 --- a/doc/src/sgml/chkpass.sgmlin +++ b/doc/src/sgml/chkpass.sgmlin @@ -14,6 +14,7 @@ and is always stored encrypted. To compare, simply compare against a clear text password and the comparison function will encrypt it before comparing. </para> + &pgnotice; <para> There are provisions in the code to report an error if the password is @@ -86,6 +87,7 @@ test=# select p = 'goodbye' from test; <sect2> <title>Author</title> + &pgnotice; <para> D'Arcy J.M. Cain (<email>da...@dr...</email>) diff --git a/doc/src/sgml/citext.sgmlin b/doc/src/sgml/citext.sgmlin index 23d53a6..fbd461a 100644 --- a/doc/src/sgml/citext.sgmlin +++ b/doc/src/sgml/citext.sgmlin @@ -6,6 +6,7 @@ <indexterm zone="citext"> <primary>citext</primary> </indexterm> + &pgnotice; <para> The <filename>citext</> module provides a case-insensitive @@ -16,6 +17,7 @@ <sect2> <title>Rationale</title> + &pgnotice; <para> The standard approach to doing case-insensitive matches @@ -70,6 +72,7 @@ SELECT * FROM tab WHERE lower(col) = LOWER(?); <sect2> <title>How to Use It</title> + &pgnotice; <para> Here's a simple example of usage: @@ -97,6 +100,8 @@ SELECT * FROM users WHERE nick = 'Larry'; <sect2> <title>String Comparison Behavior</title> + &pgnotice; + <para> In order to emulate a case-insensitive collation as closely as possible, there are <type>citext</>-specific versions of a number of the comparison @@ -167,6 +172,8 @@ SELECT * FROM users WHERE nick = 'Larry'; <itemizedlist> <listitem> + &pgnotice; + <para> <type>citext</>'s behavior depends on the <literal>LC_CTYPE</> setting of your database. How it compares @@ -219,6 +226,7 @@ SELECT * FROM users WHERE nick = 'Larry'; <sect2> <title>Author</title> + &pgnotice; <para> David E. Wheeler <email>da...@ki...</email> diff --git a/doc/src/sgml/config.sgmlin b/doc/src/sgml/config.sgmlin index 1b8e5a5..f7f198e 100644 --- a/doc/src/sgml/config.sgmlin +++ b/doc/src/sgml/config.sgmlin @@ -7,6 +7,7 @@ <primary>configuration</primary> <secondary>of the server</secondary> </indexterm> + &pgnotice; <para> There are many configuration parameters that affect the behavior of @@ -17,6 +18,7 @@ <sect1 id="config-setting"> <title>Setting Parameters</title> + &pgnotice; <para> All parameter names are case-insensitive. 
Every parameter takes a @@ -178,6 +180,7 @@ SET ENABLE_SEQSCAN TO OFF; <sect1 id="runtime-config-file-locations"> <title>File Locations</title> + &pgnotice; <para> In addition to the <filename>postgresql.conf</filename> file @@ -319,6 +322,8 @@ SET ENABLE_SEQSCAN TO OFF; <primary><varname>listen_addresses</> configuration parameter</primary> </indexterm> <listitem> + &pgnotice; + <para> Specifies the TCP/IP address(es) on which the server is to listen for connections from client applications. @@ -622,6 +627,8 @@ SET ENABLE_SEQSCAN TO OFF; </indexterm> <listitem> + &pgnotice; + <para> Maximum time to complete client authentication, in seconds. If a would-be client has not completed the authentication protocol in @@ -823,6 +830,8 @@ SET ENABLE_SEQSCAN TO OFF; <primary><varname>shared_buffers</> configuration parameter</primary> </indexterm> <listitem> + &pgnotice; + <para> Sets the amount of memory the database server uses for shared memory buffers. The default is typically 32 megabytes @@ -1032,6 +1041,8 @@ SET ENABLE_SEQSCAN TO OFF; <primary><varname>max_files_per_process</> configuration parameter</primary> </indexterm> <listitem> + &pgnotice; + <para> Sets the maximum number of simultaneously open files allowed to each server subprocess. The default is one thousand files. If the kernel is enforcing @@ -1110,6 +1121,7 @@ SET ENABLE_SEQSCAN TO OFF; <sect2 id="runtime-config-resource-vacuum-cost"> <title>Cost-Based Vacuum Delay</title> + &pgnotice; <para> During the execution of <xref linkend="sql-vacuum"> @@ -1247,6 +1259,7 @@ SET ENABLE_SEQSCAN TO OFF; <sect2 id="runtime-config-resource-background-writer"> <title>Background Writer</title> + &pgnotice; <para> There is a separate server @@ -1349,6 +1362,8 @@ SET ENABLE_SEQSCAN TO OFF; <primary><varname>effective_io_concurrency</> configuration parameter</primary> </indexterm> <listitem> + &pgnotice; + <para> Sets the number of concurrent disk I/O operations that <productname>PostgreSQL</> expects can be executed @@ -1390,6 +1405,7 @@ SET ENABLE_SEQSCAN TO OFF; <sect1 id="runtime-config-wal"> <title>Write Ahead Log</title> + &pgnotice; <para> See also <xref linkend="wal-configuration"> for details on WAL @@ -1406,6 +1422,8 @@ SET ENABLE_SEQSCAN TO OFF; <primary><varname>wal_level</> configuration parameter</primary> </indexterm> <listitem> + &pgnotice; + <para> <varname>wal_level</> determines how much information is written to the WAL. The default value is <literal>minimal</>, which writes @@ -1726,6 +1744,8 @@ SET ENABLE_SEQSCAN TO OFF; <primary><varname>checkpoint_segments</> configuration parameter</primary> </indexterm> <listitem> + &pgnotice; + <para> Maximum number of log file segments between automatic WAL checkpoints (each segment is normally 16 megabytes). 
The default @@ -1800,6 +1820,8 @@ SET ENABLE_SEQSCAN TO OFF; <primary><varname>archive_mode</> configuration parameter</primary> </indexterm> <listitem> + &pgnotice; + <para> When <varname>archive_mode</> is enabled, completed WAL segments are sent to archive storage by setting @@ -1885,6 +1907,7 @@ SET ENABLE_SEQSCAN TO OFF; <sect2 id="runtime-config-replication"> <title>Streaming Replication</title> + &pgnotice; <para> These settings control the behavior of the built-in @@ -1993,6 +2016,7 @@ SET ENABLE_SEQSCAN TO OFF; <sect2 id="runtime-config-standby"> <title>Standby Servers</title> + &pgnotice; <para> These settings control the behavior of a standby server that is @@ -2087,6 +2111,7 @@ SET ENABLE_SEQSCAN TO OFF; <sect2 id="runtime-config-query-enable"> <title>Planner Method Configuration</title> + &pgnotice; <para> These configuration parameters provide a crude method of @@ -2262,6 +2287,7 @@ SET ENABLE_SEQSCAN TO OFF; </sect2> <sect2 id="runtime-config-query-constants"> <title>Planner Cost Constants</title> + &pgnotice; <para> The <firstterm>cost</> variables described in this section are measured @@ -2414,6 +2440,7 @@ SET ENABLE_SEQSCAN TO OFF; </sect2> <sect2 id="runtime-config-query-geqo"> <title>Genetic Query Optimizer</title> + &pgnotice; <para> The genetic query optimizer (GEQO) is an algorithm that does query @@ -2561,6 +2588,7 @@ SET ENABLE_SEQSCAN TO OFF; </sect2> <sect2 id="runtime-config-query-other"> <title>Other Planner Options</title> + &pgnotice; <variablelist> @@ -2725,6 +2753,7 @@ SELECT * FROM parent WHERE key = 2400; <indexterm zone="runtime-config-logging"> <primary>server log</primary> </indexterm> + &pgnotice; <sect2 id="runtime-config-logging-where"> <title>Where To Log</title> @@ -2732,6 +2761,7 @@ SELECT * FROM parent WHERE key = 2400; <indexterm zone="runtime-config-logging-where"> <primary>where to log</primary> </indexterm> + &pgnotice; <variablelist> @@ -3029,6 +3059,7 @@ local0.* /var/log/postgresql </sect2> <sect2 id="runtime-config-logging-when"> <title>When To Log</title> + &pgnotice; <variablelist> @@ -3234,6 +3265,7 @@ local0.* /var/log/postgresql </sect2> <sect2 id="runtime-config-logging-what"> <title>What To Log</title> + &pgnotice; <variablelist> @@ -3686,6 +3718,7 @@ FROM pg_stat_activity; </sect2> <sect2 id="runtime-config-logging-csvlog"> <title>Using CSV-Format Log Output</title> + &pgnotice; <para> Including <literal>csvlog</> in the <varname>log_destination</> list @@ -3812,9 +3845,11 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; <sect1 id="runtime-config-statistics"> <title>Run-Time Statistics</title> + &pgnotice; <sect2 id="runtime-config-statistics-collector"> <title>Query and Index Statistics Collector</title> + &pgnotice; <para> These parameters control server-wide statistics collection features. 
@@ -3939,6 +3974,9 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; <sect2 id="runtime-config-statistics-monitor"> <title>Statistics Monitoring</title> + &pgnotice; + + <variablelist> <varlistentry> @@ -3979,6 +4017,7 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; <sect1 id="runtime-config-autovacuum"> <title>Automatic Vacuuming</title> + &pgnotice; <indexterm> <primary>autovacuum</primary> @@ -4215,9 +4254,12 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; <sect1 id="runtime-config-client"> <title>Client Connection Defaults</title> + &pgnotice; <sect2 id="runtime-config-client-statement"> <title>Statement Behavior</title> + &pgnotice; + <variablelist> <varlistentry id="guc-search-path" xreflabel="search_path"> @@ -4605,6 +4647,7 @@ SET XML OPTION { DOCUMENT | CONTENT }; </sect2> <sect2 id="runtime-config-client-format"> <title>Locale and Formatting</title> + &pgnotice; <variablelist> @@ -4851,6 +4894,7 @@ SET XML OPTION { DOCUMENT | CONTENT }; </sect2> <sect2 id="runtime-config-client-other"> <title>Other Defaults</title> + &pgnotice; <variablelist> @@ -4982,6 +5026,7 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir' <sect1 id="runtime-config-locks"> <title>Lock Management</title> + &pgnotice; <variablelist> @@ -5071,9 +5116,11 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir' <sect1 id="runtime-config-compatible"> <title>Version and Platform Compatibility</title> + &pgnotice; <sect2 id="runtime-config-compatible-version"> <title>Previous PostgreSQL Versions</title> + &pgnotice; <variablelist> @@ -5284,6 +5331,8 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir' <sect2 id="runtime-config-compatible-clients"> <title>Platform and Client Compatibility</title> + &pgnotice; + <variablelist> <varlistentry id="guc-transform-null-equals" xreflabel="transform_null_equals"> @@ -5339,6 +5388,7 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir' <sect1 id="runtime-config-preset"> <title>Preset Options</title> + &pgnotice; <para> The following <quote>parameters</> are read-only, and are determined @@ -5553,6 +5603,7 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir' <sect1 id="runtime-config-custom"> <title>Customized Options</title> + &pgnotice; <para> This feature was designed to allow parameters not normally known to @@ -5614,6 +5665,7 @@ plruby.use_strict = true # generates error: unknown class name <sect1 id="runtime-config-developer"> <title>Developer Options</title> + &pgnotice; <para> The following parameters are intended for work on the @@ -5961,6 +6013,7 @@ LOG: CleanUpLock: deleting: lock(0xb7acd844) id(24688,24696,0,0,0,1) </sect1> <sect1 id="runtime-config-short"> <title>Short Options</title> + &pgnotice; <para> For convenience there are also single letter command-line option diff --git a/doc/src/sgml/contacts.sgmlin b/doc/src/sgml/contacts.sgmlin index 6b15de3..038ebbc 100644 --- a/doc/src/sgml/contacts.sgmlin +++ b/doc/src/sgml/contacts.sgmlin @@ -1,9 +1,9 @@ <!-- $PostgreSQL: pgsql/doc/src/sgml/contacts.sgml,v 1.9 2006/03/10 19:10:47 momjian Exp $ --> + <appendix label="B" id="contacts"> <title>Contacts</title> - -<!-- +%pgnotice; <para> Support for <productname>PostgreSQL</productname> comes primarily from this printed documentation, the web-based mailing list archives, @@ -12,6 +12,7 @@ and the mailing lists themselves. 
<sect1 id="mailing-list"> <title>Mailing Lists</title> +%pgnotice; <para> Refer to the introduction in this manual or to the @@ -22,5 +23,6 @@ for subscription information to the no-cost mailing lists. <sect1 id="people"> <title>People</title> ---> +%pgnotice; + </appendix> diff --git a/doc/src/sgml/contrib.sgmlin b/doc/src/sgml/contrib.sgmlin index b801c40..d214df5 100644 --- a/doc/src/sgml/contrib.sgmlin +++ b/doc/src/sgml/contrib.sgmlin @@ -2,6 +2,7 @@ <appendix id="contrib"> <title>Additional Supplied Modules</title> +&pgnotice; <para> This appendix contains information regarding the modules that diff --git a/doc/src/sgml/cube.sgmlin b/doc/src/sgml/cube.sgmlin index 5da1301..75ce194 100644 --- a/doc/src/sgml/cube.sgmlin +++ b/doc/src/sgml/cube.sgmlin @@ -6,6 +6,7 @@ <indexterm zone="cube"> <primary>cube</primary> </indexterm> +&pgnotice; <para> This module implements a data type <type>cube</> for @@ -14,6 +15,7 @@ <sect2> <title>Syntax</title> +&pgnotice; <para> <xref linkend="cube-repr-table"> shows the valid external @@ -85,6 +87,7 @@ <sect2> <title>Precision</title> +&pgnotice; <para> Values are stored internally as 64-bit floating point numbers. This means @@ -94,6 +97,7 @@ <sect2> <title>Usage</title> +&pgnotice; <para> The <filename>cube</> module includes a GiST index operator class for @@ -313,6 +317,7 @@ <sect2> <title>Defaults</title> +&pgnotice; <para> I believe this union: @@ -367,6 +372,7 @@ t <sect2> <title>Notes</title> +&pgnotice; <para> For examples of usage, see the regression test <filename>sql/cube.sql</>. @@ -381,6 +387,7 @@ t <sect2> <title>Credits</title> +&pgnotice; <para> Original author: Gene Selkov, Jr. <email>sel...@mc...</email>, diff --git a/doc/src/sgml/datatype.sgmlin b/doc/src/sgml/datatype.sgmlin index 0e07cc3..ab6250d 100644 --- a/doc/src/sgml/datatype.sgmlin +++ b/doc/src/sgml/datatype.sgmlin @@ -3,6 +3,7 @@ <chapter id="datatype"> <title>Data Types</title> + <indexterm zone="datatype"> <primary>data type</primary> </indexterm> @@ -11,7 +12,7 @@ <primary>type</primary> <see>data type</see> </indexterm> - +&pgnotice; <para> <productname>PostgreSQL</productname> has a rich set of native data types available to users. 
Users can add new types to @@ -302,6 +303,7 @@ <primary>data type</primary> <secondary>numeric</secondary> </indexterm> +&pgnotice; <para> Numeric types consist of two-, four-, and eight-byte integers, @@ -422,6 +424,7 @@ <primary>int8</primary> <see>bigint</see> </indexterm> +&pgnotice; <para> The types <type>smallint</type>, <type>integer</type>, and @@ -474,6 +477,7 @@ <primary>decimal</primary> <see>numeric</see> </indexterm> +&pgnotice; <para> The type <type>numeric</type> can store numbers with up to 1000 @@ -607,6 +611,7 @@ NUMERIC <indexterm zone="datatype-float"> <primary>floating point</primary> </indexterm> +&pgnotice; <para> The data types <type>real</type> and <type>double @@ -758,6 +763,7 @@ NUMERIC <primary>sequence</primary> <secondary>and serial type</secondary> </indexterm> +&pgnotice; <para> The data types <type>serial</type> and <type>bigserial</type> @@ -833,6 +839,7 @@ ALTER SEQUENCE <replaceable class="parameter">tablename</replaceable>_<replaceab <sect1 id="datatype-money"> <title>Monetary Types</title> +&pgnotice; <para> The <type>money</type> type stores a currency amount with a fixed @@ -893,6 +900,7 @@ SELECT regexp_replace('52093.89'::money::text, '[$,]', '', 'g')::numeric; <sect1 id="datatype-character"> <title>Character Types</title> +&pgnotice; <indexterm zone="datatype-character"> <primary>character string</primary> diff --git a/doc/src/sgml/datetime.sgmlin b/doc/src/sgml/datetime.sgmlin index d072118..8e75a51 100644 --- a/doc/src/sgml/datetime.sgmlin +++ b/doc/src/sgml/datetime.sgmlin @@ -2,7 +2,7 @@ <appendix id="datetime-appendix"> <title>Date/Time Support</title> - + &pgnotice; <para> <productname>PostgreSQL</productname> uses an internal heuristic parser for all date/time input support. Dates and times are input as @@ -22,7 +22,7 @@ <sect1 id="datetime-input-rules"> <title>Date/Time Input Interpretation</title> - + &pgnotice; <para> The date/time type inputs are all decoded using the following procedure. </para> @@ -178,7 +178,7 @@ <sect1 id="datetime-keywords"> <title>Date/Time Key Words</title> - + &pgnotice; <para> <xref linkend="datetime-month-table"> shows the tokens that are recognized as names of months. @@ -344,7 +344,7 @@ <primary>time zone</primary> <secondary>input abbreviations</secondary> </indexterm> - + &pgnotice; <para> Since timezone abbreviations are not well standardized, <productname>PostgreSQL</productname> provides a means to customize @@ -455,7 +455,7 @@ <sect1 id="datetime-units-history"> <title>History of Units</title> - + &pgnotice; <para> The Julian calendar was introduced by Julius Caesar in 45 BC. It was in common use in the Western world diff --git a/doc/src/sgml/dblink.sgmlin b/doc/src/sgml/dblink.sgmlin index 3530fc1..2c3b9f1 100644 --- a/doc/src/sgml/dblink.sgmlin +++ b/doc/src/sgml/dblink.sgmlin @@ -6,7 +6,7 @@ <indexterm zone="dblink"> <primary>dblink</primary> </indexterm> - + &pgnotice; <para> <filename>dblink</> is a module which supports connections to other <productname>PostgreSQL</> databases from within a database diff --git a/doc/src/sgml/ddl.sgmlin b/doc/src/sgml/ddl.sgmlin index 86cf6ec..c265a54 100644 --- a/doc/src/sgml/ddl.sgmlin +++ b/doc/src/sgml/ddl.sgmlin @@ -2,7 +2,7 @@ <chapter id="ddl"> <title>Data Definition</title> - + &pgnotice; <para> This chapter covers how one creates the database structures that will hold one's data. 
In a relational database, the raw data is @@ -29,7 +29,7 @@ <indexterm> <primary>column</primary> </indexterm> - + &pgnotice; <para> A table in a relational database is much like a table on paper: It consists of rows and columns. The number and order of the columns @@ -173,7 +173,7 @@ DROP TABLE products; <indexterm zone="ddl-default"> <primary>default value</primary> </indexterm> - + &pgnotice; <para> A column can be assigned a default value. When a new row is created and no values are specified for some of the columns, those @@ -238,7 +238,7 @@ CREATE TABLE products ( <indexterm zone="ddl-constraints"> <primary>constraint</primary> </indexterm> - + &pgnotice; <para> Data types are a way to limit the kind of data that can be stored in a table. For many applications, however, the constraint they @@ -269,7 +269,7 @@ CREATE TABLE products ( <primary>constraint</primary> <secondary>check</secondary> </indexterm> - + &pgnotice; <para> A check constraint is the most generic constraint type. It allows you to specify that the value in a certain column must satisfy a @@ -415,7 +415,7 @@ CREATE TABLE products ( <primary>constraint</primary> <secondary>NOT NULL</secondary> </indexterm> - + &pgnotice; <para> A not-null constraint simply specifies that a column must not assume the null value. A syntax example: @@ -493,7 +493,7 @@ CREATE TABLE products ( <primary>constraint</primary> <secondary>unique</secondary> </indexterm> - + &pgnotice; <para> Unique constraints ensure that the data contained in a column or a group of columns is unique with respect to all the rows in the @@ -580,7 +580,7 @@ CREATE TABLE products ( <primary>constraint</primary> <secondary>primary key</secondary> </indexterm> - + &pgnotice; <para> Technically, a primary key constraint is simply a combination of a unique constraint and a not-null constraint. So, the following @@ -658,7 +658,7 @@ CREATE TABLE example ( <indexterm> <primary>referential integrity</primary> </indexterm> - + &pgnotice; <para> A foreign key constraint specifies that the values in a column (or a group of columns) must match the values appearing in some row @@ -877,7 +877,7 @@ CREATE TABLE order_items ( <primary>constraint</primary> <secondary>exclusion</secondary> </indexterm> - + &pgnotice; <para> Exclusion constraints ensure that if any two rows are compared on the specified columns or expressions using the specified operators, @@ -905,7 +905,7 @@ CREATE TABLE circles ( <sect1 id="ddl-system-columns"> <title>System Columns</title> - + &pgnotice; <para> Every table has several <firstterm>system columns</> that are implicitly defined by the system. 
Therefore, these names cannot be @@ -1108,7 +1108,7 @@ CREATE TABLE circles ( <primary>table</primary> <secondary>modifying</secondary> </indexterm> - + &pgnotice; <para> When you create a table and you realize that you made a mistake, or the requirements of the application change, you can drop the @@ -1164,7 +1164,7 @@ CREATE TABLE circles ( <primary>column</primary> <secondary>adding</secondary> </indexterm> - + &pgnotice; <para> To add a column, use a command like: <programlisting> @@ -1208,7 +1208,7 @@ ALTER TABLE products ADD COLUMN description text CHECK (description <> '') <primary>column</primary> <secondary>removing</secondary> </indexterm> - + &pgnotice; <para> To remove a column, use a command like: <programlisting> @@ -1235,7 +1235,7 @@ ALTER TABLE products DROP COLUMN description CASCADE; <primary>constraint</primary> <secondary>adding</secondary> </indexterm> - + &pgnotice; <para> To add a constraint, the table constraint syntax is used. For example: <programlisting> @@ -1263,7 +1263,7 @@ ALTER TABLE products ALTER COLUMN product_no SET NOT NULL; <primary>constraint</primary> <secondary>removing</secondary> </indexterm> - + &pgnotice; <para> To remove a constraint you need to know its name. If you gave it a name then that's easy. Otherwise the system assigned a @@ -1304,7 +1304,7 @@ ALTER TABLE products ALTER COLUMN product_no DROP NOT NULL; <primary>default value</primary> <secondary>changing</secondary> </indexterm> - + &pgnotice; <para> To set a new default for a column, use a command like: <programlisting> @@ -1333,7 +1333,7 @@ ALTER TABLE products ALTER COLUMN price DROP DEFAULT; <primary>column data type</primary> <secondary>changing</secondary> </indexterm> - + &pgnotice; <para> To convert a column to a different data type, use a command like: <programlisting> @@ -1362,7 +1362,7 @@ ALTER TABLE products ALTER COLUMN price TYPE numeric(10,2); <primary>column</primary> <secondary>renaming</secondary> </indexterm> - + &pgnotice; <para> To rename a column: <programlisting> @@ -1378,7 +1378,7 @@ ALTER TABLE products RENAME COLUMN product_no TO product_number; <primary>table</primary> <secondary>renaming</secondary> </indexterm> - + &pgnotice; <para> To rename a table: <programlisting> @@ -1399,7 +1399,7 @@ ALTER TABLE products RENAME TO items; <primary>permission</primary> <see>privilege</see> </indexterm> - + &pgnotice; <para> When you create a database object, you become its owner. By default, only the owner of an object can do anything with the @@ -1490,7 +1490,7 @@ REVOKE ALL ON accounts FROM PUBLIC; <indexterm zone="ddl-schemas"> <primary>schema</primary> </indexterm> - + &pgnotice; <para> A <productname>PostgreSQL</productname> database cluster contains one or more named databases. Users and groups of users are @@ -1559,7 +1559,7 @@ REVOKE ALL ON accounts FROM PUBLIC; <primary>schema</primary> <secondary>creating</secondary> </indexterm> - + &pgnotice; <para> To create a schema, use the <xref linkend="sql-createschema"> command. Give the schema a name @@ -1655,7 +1655,7 @@ CREATE SCHEMA <replaceable>schemaname</replaceable> AUTHORIZATION <replaceable>u <primary>schema</primary> <secondary>public</secondary> </indexterm> - + &pgnotice; <para> In the previous sections we created tables without specifying any schema names. By default such tables (and other objects) are @@ -1686,7 +1686,7 @@ CREATE TABLE public.products ( ... 
); <primary>name</primary> <secondary>unqualified</secondary> </indexterm> - + &pgnotice; <para> Qualified names are tedious to write, and it's often best not to wire a particular schema name into applications anyway. Therefore @@ -1798,7 +1798,7 @@ SELECT 3 OPERATOR(pg_catalog.+) 4; <primary>privilege</primary> <secondary sortas="schemas">for schemas</secondary> </indexterm> - + &pgnotice; <para> By default, users cannot access any objects in schemas they do not own. To allow that, the owner of the schema must grant the @@ -1835,7 +1835,7 @@ REVOKE CREATE ON SCHEMA public FROM PUBLIC; <primary>system catalog</primary> <secondary>schema</secondary> </indexterm> - + &pgnotice; <para> In addition to <literal>public</> and user-created schemas, each database contains a <literal>pg_catalog</> schema, which contains @@ -1867,7 +1867,7 @@ REVOKE CREATE ON SCHEMA public FROM PUBLIC; <sect2 id="ddl-schemas-patterns"> <title>Usage Patterns</title> - + &pgnotice; <para> Schemas can be used to organize your data in many ways. There are a few usage patterns that are recommended and are easily supported by @@ -1917,7 +1917,7 @@ REVOKE CREATE ON SCHEMA public FROM PUBLIC; <sect2 id="ddl-schemas-portability"> <title>Portability</title> - + &pgnotice; <para> In the SQL standard, the notion of objects in the same schema being owned by different users does not exist. Moreover, some @@ -1959,7 +1959,7 @@ REVOKE CREATE ON SCHEMA public FROM PUBLIC; <primary>table</primary> <secondary>inheritance</secondary> </indexterm> - + &pgnotice; <para> <productname>PostgreSQL</productname> implements table inheritance, which can be a useful tool for database designers. (SQL:1999 and @@ -2190,7 +2190,7 @@ VALUES ('New York', NULL, NULL, 'NY'); <sect2 id="ddl-inherit-caveats"> <title>Caveats</title> - + &pgnotice; <para> Note that not all SQL commands are able to work on inheritance hierarchies. Commands that are used for data querying, @@ -2281,7 +2281,7 @@ VALUES ('New York', NULL, NULL, 'NY'); <primary>table</primary> <secondary>partitioning</secondary> </indexterm> - + &pgnotice; <para> <productname>PostgreSQL</productname> supports basic table partitioning. This section describes why and how to implement @@ -2290,7 +2290,7 @@ VALUES ('New York', NULL, NULL, 'NY'); <sect2 id="ddl-partitioning-overview"> <title>Overview</title> - + &pgnotice; <para> Partitioning refers to splitting what is logically one large table into smaller physical pieces. @@ -2384,7 +2384,7 @@ VALUES ('New York', NULL, NULL, 'NY'); <sect2 id="ddl-partitioning-implementation"> <title>Implementing Partitioning</title> - + &pgnotice; <para> To set up a partitioned table, do the following: <orderedlist spacing="compact"> @@ -2679,7 +2679,7 @@ LANGUAGE plpgsql; <sect2 id="ddl-partitioning-managing-partitions"> <title>Managing Partitions</title> - + &pgnotice; <para> Normally the set of partitions established when initially defining the table are not intended to remain static. 
It is @@ -2750,7 +2750,7 @@ ALTER TABLE measurement_y2008m02 INHERIT measurement; <indexterm> <primary>constraint exclusion</primary> </indexterm> - + &pgnotice; <para> <firstterm>Constraint exclusion</> is a query optimization technique that improves performance for partitioned tables defined in the @@ -2841,7 +2841,7 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01'; <sect2 id="ddl-partitioning-alternatives"> <title>Alternative Partitioning Methods</title> - + &pgnotice; <para> A different approach to redirecting inserts into the appropriate partition table is to set up rules, instead of a trigger, on the @@ -2903,7 +2903,7 @@ UNION ALL SELECT * FROM measurement_y2008m01; <sect2 id="ddl-partitioning-caveats"> <title>Caveats</title> - + &pgnotice; <para> The following caveats apply to partitioned tables: <itemizedlist> @@ -2988,7 +2988,7 @@ ANALYZE measurement; <sect1 id="ddl-others"> <title>Other Database Objects</title> - + &pgnotice; <para> Tables are the central objects in a relational database structure, because they hold your data. But they are not the only objects @@ -3042,7 +3042,7 @@ ANALYZE measurement; <primary>RESTRICT</primary> <secondary sortas="DROP">with DROP</secondary> </indexterm> - + &pgnotice; <para> When you create complex database structures involving many tables with foreign key constraints, views, triggers, functions, etc. you diff --git a/doc/src/sgml/dfunc.sgmlin b/doc/src/sgml/dfunc.sgmlin index 310787a..7f6fbac 100644 --- a/doc/src/sgml/dfunc.sgmlin +++ b/doc/src/sgml/dfunc.sgmlin @@ -2,7 +2,7 @@ <sect2 id="dfunc"> <title>Compiling and Linking Dynamically-Loaded Functions</title> - + &pgnotice; <para> Before you are able to use your <productname>PostgreSQL</productname> extension functions written in diff --git a/doc/src/sgml/dict-int.sgmlin b/doc/src/sgml/dict-int.sgmlin index d19487f..23200cf 100644 --- a/doc/src/sgml/dict-int.sgmlin +++ b/doc/src/sgml/dict-int.sgmlin @@ -6,7 +6,7 @@ <indexterm zone="dict-int"> <primary>dict_int</primary> </indexterm> - + &pgnotice; <para> <filename>dict_int</> is an example of an add-on dictionary template for full-text search. The motivation for this example dictionary is to @@ -17,7 +17,7 @@ <sect2> <title>Configuration</title> - + &pgnotice; <para> The dictionary accepts two options: </para> @@ -45,7 +45,7 @@ <sect2> <title>Usage</title> - + &pgnotice; <para> Running the installation script creates a text search template <literal>intdict_template</> and a dictionary <literal>intdict</> diff --git a/doc/src/sgml/dict-xsyn.sgmlin b/doc/src/sgml/dict-xsyn.sgmlin index e1ec8e5..96a9f3d 100644 --- a/doc/src/sgml/dict-xsyn.sgmlin +++ b/doc/src/sgml/dict-xsyn.sgmlin @@ -6,7 +6,7 @@ <indexterm zone="dict-xsyn"> <primary>dict_xsyn</primary> </indexterm> - + &pgnotice; <para> <filename>dict_xsyn</> (Extended Synonym Dictionary) is an example of an add-on dictionary template for full-text search. 
This dictionary type @@ -16,7 +16,7 @@ <sect2> <title>Configuration</title> - + &pgnotice; <para> A <literal>dict_xsyn</> dictionary accepts the following options: </para> @@ -85,7 +85,7 @@ word syn1 syn2 syn3 <sect2> <title>Usage</title> - + &pgnotice; <para> Running the installation script creates a text search template <literal>xsyn_template</> and a dictionary <literal>xsyn</> diff --git a/doc/src/sgml/diskusage.sgmlin b/doc/src/sgml/diskusage.sgmlin index 0c7f544..fdf89db 100644 --- a/doc/src/sgml/diskusage.sgmlin +++ b/doc/src/sgml/diskusage.sgmlin @@ -2,7 +2,7 @@ <chapter id="diskusage"> <title>Monitoring Disk Usage</title> - + &pgnotice; <para> This chapter discusses how to monitor the disk usage of a <productname>PostgreSQL</> database system. @@ -14,7 +14,7 @@ <indexterm zone="disk-usage"> <primary>disk usage</primary> </indexterm> - + &pgnotice; <para> Each table has a primary heap disk file where most of the data is stored. If the table has any columns with potentially-wide values, @@ -112,7 +112,7 @@ ORDER BY relpages DESC; <sect1 id="disk-full"> <title>Disk Full Failure</title> - + &pgnotice; <para> The most important disk monitoring task of a database administrat... [truncated message content] |
From: Michael P. <mic...@us...> - 2011-05-12 07:39:51
Project "Postgres-XC". The branch, ha_support has been updated via 7c0fb4e4bf34f558697ee864e68b01dc05f08e81 (commit) from 4f452a336a0cea55c13b93823f39ffed547f9065 (commit) - Log ----------------------------------------------------------------- commit 7c0fb4e4bf34f558697ee864e68b01dc05f08e81 Author: Michael P <mic...@us...> Date: Thu May 12 16:29:13 2011 +0900 XC Watcher and Configurator Configurator is a utility that can be used to setup a Postgres-XC cluster automatically. Based on Ruby and a global YAML configuration file, it is in charge of parsing the global file, make postgresql.conf, indent and pg_hba.conf, and then distribute each postgreSQL file automatically to each node. XC watcher is a module using XCM flag information to gather the status of nodes in the cluster and what kind of action is necessary in case a node fails. This module takes into account Datanode, Coordinator, GTM, GTM-Proxy and GTM-Standby. Modules written by Suto Takayuki diff --git a/src/pgxc/Makefile b/src/pgxc/Makefile index 2156013..d421d0b 100644 --- a/src/pgxc/Makefile +++ b/src/pgxc/Makefile @@ -13,7 +13,7 @@ subdir = src/pgxc top_builddir = ../.. include $(top_builddir)/src/Makefile.global -DIRS = xcm pgxc_clean +DIRS = xcm pgxc_clean pgxc_config xc_watcher all install installdirs uninstall distprep clean distclean maintainer-clean: $(INSTALL_DATA) $(srcdir)/xcm/pgxc_ha.conf.sample '$(DESTDIR)$(datadir)/pgxc_ha.conf.sample' diff --git a/src/pgxc/pgxc_config/Makefile b/src/pgxc/pgxc_config/Makefile new file mode 100644 index 0000000..e3adc78 --- /dev/null +++ b/src/pgxc/pgxc_config/Makefile @@ -0,0 +1,28 @@ +#------------------------------------------------------------------------- +# +# Makefile for src/pgxc/pgxc_config +# +# Portions Copyright (c) 2011 Nippon Telegraph and Telephone Corporation +# +# $PostgreSQL$ +# +#------------------------------------------------------------------------- + +PGFILEDESC = "pgxc_config" +subdir = src/pgxc/pgxc_config +top_builddir = ../../.. 
+include $(top_builddir)/src/Makefile.global + +all: + +install: all installdirs + $(INSTALL_PROGRAM) pgxc_config$(X) '$(DESTDIR)$(bindir)'/pgxc_config$(X) + chmod 755 '$(DESTDIR)$(bindir)'/pgxc_config$(X) + +installdirs: + $(mkinstalldirs) '$(DESTDIR)$(bindir)' + +uninstall: + rm -f '$(DESTDIR)$(bindir)/pgxc_config$(X)' + +clean distclean maintainer-clean: diff --git a/src/pgxc/pgxc_config/pgxc_config b/src/pgxc/pgxc_config/pgxc_config new file mode 100755 index 0000000..748097c --- /dev/null +++ b/src/pgxc/pgxc_config/pgxc_config @@ -0,0 +1,6267 @@ +#!/usr/bin/ruby +# Copyright (c) 2011 Nippon Telegraph and Telephone Corporation + +require 'pp' +require 'yaml' +require 'fileutils' +require 'optparse' + +SEND_FAIL_MSG_RETRY_COUNT = 10 + +#------------------------------ +# Constant values +#------------------------------ +PGXC_CONFIG_WORK_DIRECTORY='~/.pgxc_config' +Version = "0.9.4" + +#------------------------------ +# module +#------------------------------ +module PgxcModule + def get_keywords(val,dest_array) + + if val.class == Hash then + val.each {|key, value| + dest_array << key + get_keywords(val[key],dest_array) + } + end + if val.class == Array then + val.each {|item| + if item.class == Hash then + get_keywords(item,dest_array) + end + } + end + end +end + +#------------------------------ +# Numeric Class +#------------------------------ +class Numeric + def to_str + self.to_s + end +end + +#------------------------------ +# PostgresXC Class +#------------------------------ +class PostgresXC + include PgxcModule + attr_accessor :value + def initialize(pgxc_conf) + if pgxc_conf == nil + raise "Postgres-xc section not found." + end + @value = pgxc_conf + validate + end + + def validate + required = {"UID"=>true, "GROUP"=>true, "POSTGRES_HOME"=>true, + "PGDATA"=>false, "coordinator"=>false, "datanode"=>false, + "XC_WATCHER"=>false, "MONITORING_AGENT"=>false, "port"=>false, "server"=>false} + yaml_keys = [] + get_keywords(@value, yaml_keys) + required.each {|k, v| + if yaml_keys.index(k) == nil and required[k] == true + raise "Postgres-xc section: '#{k}' is missing." + end + } + required_key = [] + required.each {|k, v| + required_key << k + } + yaml_keys.each {|k| + if required_key.index(k) == nil + raise "Postgres-xc section: unknown keyword '#{k}'." + end + } + end + + def UID + return @value["UID"] + end + + def GROUP + return @value["GROUP"] + end + + def POSTGRES_HOME + return @value["POSTGRES_HOME"] + end + + def XC_WATCHER_HOST + if @value["XC_WATCHER"] + return @value["XC_WATCHER"]["server"] + end + return nil + end + + def XC_WATCHER_PORT + if @value["XC_WATCHER"] + return @value["XC_WATCHER"]["port"] + end + return nil + end + + def MONITORING_AGENT_PORT + if @value["MONITORING_AGENT"] + return @value["MONITORING_AGENT"]["port"] + end + return nil + end + + def PGDATA(key) + if @value["PGDATA"] == nil + return nil + end + if @value["PGDATA"][key] != nil + return @value["PGDATA"][key] + end + return nil + end + +end + +#------------------------------ +# Server Class +#------------------------------ +class Server + include PgxcModule + attr_accessor :value + def initialize(server_conf) + if server_conf == nil + raise "Server section is missing." + end + @value = server_conf + validate + end + + def validate + required = {"name"=>true, "ip_addr"=>true} + yaml_keys = [] + get_keywords(@value, yaml_keys) + required.each {|k, v| + if yaml_keys.index(k) == nil and required[k] == true + raise "Server section: '#{k}' is missing." 
+      end
+    }
+    required_key = []
+    required.each {|k, v|
+      required_key << k
+    }
+    yaml_keys.each {|k|
+      if required_key.index(k) == nil
+        raise "Server section: unknown keyword '#{k}'."
+      end
+    }
+    # check array (include? 'name')
+    @value.each {|item|
+      if item["name"] == nil
+        raise "Server section: name value is missing."
+      end
+    }
+    # ip_addr check
+    get_name.each {|name|
+      if get_ip_addr(name) == nil
+        raise "Server section: ip_addr value is missing in 'name: #{name}'."
+      end
+    }
+  end
+
+  def server(name)
+    @value.each {|item|
+      if item["name"] == name
+        return item
+      end
+    }
+    return nil
+  end
+
+  def get_server_name(ip)
+    @value.each{|item|
+      item['ip_addr'].each {|v|
+        return item['name'] if v == ip
+      }
+    }
+    return nil
+  end
+
+  def get_name()
+    ary = []
+    @value.each {|item|
+      if item["name"] != nil
+        ary << item["name"]
+      end
+    }
+    if ary.size == 0
+      ary = nil
+    end
+    return ary
+  end
+
+  def get_ip_addr(name)
+    @value.each {|item|
+      if item["name"] == name
+        return item["ip_addr"]
+      end
+    }
+    return nil
+  end
+
+end
+
+#------------------------------
+# GTM Class
+#------------------------------
+class GTM
+  attr_accessor :value
+  include PgxcModule
+  def initialize(gtm_conf)
+    if gtm_conf == nil
+      raise "Gtm section is missing."
+    end
+    @value = gtm_conf
+    default
+    validate
+    @value.each {|v|
+      if v.has_key?("name") and v['name'] != 'default'
+        if v.has_key?('status') == false
+          v['status'] = 'R'
+        end
+      end
+    }
+  end
+
+  # Return the configured primary GTM name, or nil if none is set.
+  # (The nil return must come after the whole scan, not inside the block,
+  # or entries past the first would never be examined.)
+  def primary
+    @value.each {|v|
+      if v.has_key?('primary')
+        return v['primary']
+      end
+    }
+    return nil
+  end
+
+  def set_primary(name)
+    @value.each {|v|
+      if v.has_key?('primary')
+        v['primary'] = name
+        break
+      end
+    }
+  end
+
+  def active?(name)
+    if name().size == 1 and name()[0] == name
+      return true
+    end
+    if primary != nil
+      if primary == name
+        return true
+      end
+    end
+    return false
+  end
+
+  def set_status(name, status)
+    @value.each {|v|
+      if v['name'] == name
+        v['status'] = status
+      end
+    }
+  end
+
+  def get_status(name)
+    @value.each{|v|
+      return v['status'] if v['name'] == name
+    }
+    return nil
+  end
+
+  def validate
+    required = {"server"=>true, "name"=>true, "ip_addr"=>false, "port"=>true,
+                "data_directory"=>true, "log_filename"=>true, "first_gxid"=>true,
+                "primary"=>false, "status"=>false}
+    yaml_keys = []
+    get_keywords(@value, yaml_keys)
+    required.each {|k, v|
+      if yaml_keys.index(k) == nil and required[k] == true
+        raise "Gtm section: '#{k}' is missing."
+      end
+    }
+    required_key = []
+    required.each {|k, v|
+      required_key << k
+    }
+    yaml_keys.each {|k|
+      if required_key.index(k) == nil
+        raise "Gtm section: unknown keyword '#{k}'."
+      end
+    }
+
+    if name == nil
+      raise "Gtm section: 'name' is missing. (Excluding default)"
+    end
+
+    if name.size > 1
+      if primary == nil
+        raise "Gtm section: primary value is missing."
+      end
+      if name.size != name.uniq.size
+        raise "Gtm section: duplicate name in the list."
+      end
+      if name.index(primary) == nil
+        raise "Gtm section: primary '#{primary}' is not found in the Gtm section."
+      end
+      name.each {|name|
+        if server(name) == nil
+          raise "Gtm section: server value is missing in 'name: #{name}'."
+        end
+        if port(name) == nil
+          raise "Gtm section: port value is missing in 'name: #{name}'."
+        end
+        if data_directory(name) == nil
+          raise "Gtm section: data_directory value is missing in 'name: #{name}'."
+ end + if log_filename(name) == nil + raise "Gtm section: log_filename value is missing in 'name: #{name}'." + end + if first_gxid(name) == nil + raise "Gtm section: first_gxid value is missing in 'name: #{name}'." + end + } + end + + + end + + def server(name=nil) + server = [] + if name == nil + @value.each {|v| + if v.has_key?("server") + server << v['server'] + end + } + return server + else + @value.each {|v| + if v.has_key?("server") + return v['server'] if name == v['name'] + end + } + end + return nil + end + + def name + name = [] + @value.each {|v| + if v.has_key?('name') + name << v['name'] if v['name'] != 'default' + end + } + return nil if name.size == 0 + return name + end + + def ip_addr(name) + @value.each {|v| + if v.has_key?("name") + next if v['name'] != name + return v['ip_addr'] + end + } + return nil + end + + def port(name) + @value.each {|item| + if item['name'] == name + return item['port'] if item.has_key?('port') + end + } + if @default_port != nil + return @default_port + end + return nil + end + + def data_directory(name) + @value.each {|item| + if item['name'] == name + return item['data_directory'] if item.has_key?('data_directory') + end + } + if @default_data_directory != nil + return @default_data_directory + end + return nil + end + + def log_filename(name) + @value.each {|item| + if item['name'] == name + return item['log_filename'] if item.has_key?('log_filename') + end + } + if @default_log_filename != nil + return @default_log_filename + end + return nil + end + + def first_gxid(name) + @value.each {|item| + if item['name'] == name + return item['first_gxid'] if item.has_key?('first_gxid') + end + } + if @default_first_gxid != nil + return @default_first_gxid + end + return nil + end + + def get_default + @value.each {|item| + if item['name'] == 'default' + return item + end + } + return nil + end + + def default + default_values = get_default + if default_values != nil + @default_data_directory = default_values['data_directory'] + @default_log_filename = default_values['log_filename'] + @default_port = default_values['port'] + @default_first_gxid = default_values['first_gxid'] + end + end +end + +#------------------------------ +# GTM-PROXY Class +#------------------------------ +class GTMProxy + include PgxcModule + attr_accessor :value + def initialize(gtmp_conf) + if gtmp_conf == nil + @value = nil + return + end + @value = gtmp_conf + default + validate + # set status data + @value.each {|v| + if v['name'] != 'default' + if v.has_key?('status') == false + v['status'] = 'R' + end + end + } + end + + def set_status(name, status) + @value.each {|v| + if v['name'] == name + v['status'] = status + end + } + end + + def get_status(name) + @value.each {|v| + return v['status'] if v['name'] == name + } + return nil + end + + def validate + required = {"name"=>true, "server"=>true, "ip_addr"=>false, "gtm"=>true, + "port"=>true, "worker_number"=>true, "log_filename"=>true, "proxy_directory"=>true, + "status"=>false} + yaml_keys = [] + get_keywords(@value, yaml_keys) + required.each {|k, v| + if yaml_keys.index(k) == nil and required[k] == true + raise "Gtm-proxy section: '#{k}' is missing." + end + } + required_key = [] + required.each {|k, v| + required_key << k + } + yaml_keys.each {|k| + if required_key.index(k) == nil + raise "Gtm-proxy section: unknown keyword '#{k}'." + end + } + if get_name == nil + raise "Gtm-proxy section: 'name' is missing. (Excluding default)" + end + + # check array (include? 
'name') + @value.each {|item| + if item["name"] == nil + raise "Gtm-proxy section: name value is missing." + end + } + # check server and gtm and port and worker_number and log_filename and proxy_directory + get_name.each {|name| + if get_gtm(name) == nil + raise "Gtm-proxy section: gtm value is missing in 'name: #{name}'." + end + if get_server(name) == nil + raise "Gtm-proxy section: server value is missing in 'name: #{name}'." + end + if get_port(name) == nil + raise "Gtm-proxy section: port value is missing in 'name: #{name}'." + end + if get_worker_number(name) == nil + raise "Gtm-proxy section: worker_number value is missing in 'name: #{name}'." + end + if get_log_filename(name) == nil + raise "Gtm-proxy section: log_filename value is missing in 'name: #{name}'." + end + if get_proxy_directory(name) == nil + raise "Gtm-proxy section: proxy_directory value is missing in 'name: #{name}'." + end + } + end + + def get_name + if @value == nil + return nil + end + ary = [] + @value.each {|item| + if item["name"] != nil and item["name"] != "default" + ary << item["name"] + end + } + if ary.size == 0 + ary = nil + end + return ary + end + + def get_server(name) + if @value == nil + return nil + end + @value.each {|item| + if item["name"] == name + return item["server"] + end + } + return nil + end + + def get_ip_addr(name) + if @value == nil + return nil + end + @value.each {|item| + if item["name"] == name + return item["ip_addr"] + end + } + return nil + end + + def get_port(name) + + @value.each{|item| + if item["name"] == name + return item["port"] if item["port"] != nil + end + } + if @default_port != nil + return @default_port + end + return nil + end + + def get_worker_number(name) + @value.each {|item| + if item["name"] == name + return item["worker_number"] if item["worker_number"] != nil + end + } + if @default_worker_number != nil + return @default_worker_number + end + return nil + end + + def get_proxy_directory(name) + @value.each {|item| + if item["name"] == name + return item["proxy_directory"] if item["proxy_directory"] != nil + end + } + if @default_proxy_directory != nil + return @default_proxy_directory + end + return nil + end + + def get_log_filename(name) + @value.each {|item| + if item["name"] == name + return item["log_filename"] if item["log_filename"] != nil + end + } + if @default_log_filename != nil + return @default_log_filename + end + return nil + end + + def get_gtm(name) + @value.each {|item| + if item["name"] == name + return item["gtm"] if item["gtm"] != nil + end + } + if @default_gtm != nil + return @default_gtm + end + return nil + end + + def get_default + @value.each {|item| + if item["name"] == "default" + return item + end + } + return nil + end + + def default + default_values = get_default + if default_values != nil + @default_gtm = default_values["gtm"] + @default_port = default_values["port"] + @default_worker_number = default_values["worker_number"] + @default_log_filename = default_values["log_filename"] + @default_proxy_directory = default_values["proxy_directory"] + end + end + +end + +#------------------------------ +# Datanode Class +#------------------------------ +class Datanode + include PgxcModule + attr_accessor :value + def initialize(dn_conf) + if dn_conf == nil + raise "Datanode section is missing." 
+ end + @value = dn_conf + default + validate + # set status data + @value.each {|v| + next if v.has_key?('primary-node') + next if v.has_key?('name') and v['name'] == 'default' + if v.has_key?('mirror') + v['mirror'].each {|m| + next if m.has_key?('primary') + if m.has_key?('status') == false + m['status'] = 'R' + end + } + else + if v.has_key?('status') == false + v['status'] = 'R' + end + end + } + end + + def mirror_datanode? + get_name.each {|name| + return true if get_standby_type(name) == "mirror" + } + return false + end + + def set_status(dn_name, status, mirror=nil) + if mirror == nil + @value.each {|v| + next if v['name'] != dn_name + if v['name'] == dn_name + v['status'] = status + end + } + else + # mirror datanode + @value.each {|v| + next if v['name'] != dn_name + if v.has_key?("mirror") + v['mirror'].each {|m| + if m['name'] == mirror + m['status'] = status + end + } + end + } + end + end + + def get_status(dn_name, mirror=nil) + if mirror == nil + @value.each {|v| + next if v['name'] != dn_name + if v['name'] == dn_name + return v['status'] + end + } + else + # mirror datanode + @value.each {|v| + next if v['name'] != dn_name + if v.has_key?("mirror") + v['mirror'].each {|m| + if m['name'] == mirror + return m['status'] + end + } + end + } + end + return nil + end + + def validate + required = {"primary-node"=>true, "name"=>true, "server"=>true, "ip_addr"=>false, "datanode_number"=>true, + "gtm"=>true, "port"=>true, "data_directory"=>false, "standby"=> false, "mirror"=>false, "primary"=>false, "mirror_number"=>false, "status"=>false} + yaml_keys = [] + get_keywords(@value, yaml_keys) + required.each {|k, v| + if yaml_keys.index(k) == nil and required[k] == true + raise "Datanode section: '#{k}' is missing." + end + } + required_key = [] + required.each {|k, v| + required_key << k + } + yaml_keys.each {|k| + if required_key.index(k) == nil + raise "Datanode section: unknown keyword '#{k}'." + end + } + if get_name == nil + raise "Datanode section: 'name' is missing. (Excluding default)" + end + + # check array (include? 'name') + @value.each {|item| + next if item.has_key?("primary-node") + if item["name"] == nil + raise "Datanode section: name value is missing." + end + } + # check standby mirror data + standby_type = ["mirror", "none", nil] + get_name.each {|name| + type = get_standby_type(name) + if type != nil and standby_type.index(type) == nil + raise "Datanode section: standy is invalid value. 'standby: #{type}'." + end + if standby_type.index(type) == nil + raise "Datanode section: standby value is missing in 'name: #{name}'." + end + if type == "mirror" + if get_mirror_data(name)["mirror"] == nil + raise "Datanode section: mirror value is missing in 'name: #{name}'." + end + end + } + + # check primary_node + if get_name.index(primary) == nil + raise "Datanode section: primary_node '#{primary}' is not found in the datanode section." + end + + # check datanode_number and server and gtm and port + get_name.each {|name| + if get_datanode_number(name) == nil + raise "Datanode section: datanode_number value is missing in 'name: #{name}'." + end + # standby : mirror + if get_standby_type(name) == "mirror" + # check server + if get_mirror_servers(name) == nil + raise "Datanode section: server value is missing in 'name: #{name}'." + end + if get_mirror_servers(name).size < 2 + raise "Datanode section: mirror is required two or more server settings in 'name: #{name}'." 
+ end + # check gtm, port + get_mirror_name(name).each {|mirror_name| + if get_mirror_gtm(name, mirror_name) == nil + raise "Datanode section: gtm value is missing in 'name: #{name}'." + end + if get_mirror_port(name, mirror_name) == nil + raise "Datanode section: port value is missing in 'name: #{name}'." + end + } + else + if get_server(name) == nil + raise "Datanode section: server value is missing in 'name: #{name}'." + end + if get_gtm(name) == nil + raise "Datanode section: gtm value is missing in 'name: #{name}'." + end + if get_port(name) == nil + raise "Datanode section: port value is missing in 'name: #{name}'." + end + end + } + # check datanode name + if get_name.size != get_name.uniq.size + raise "Datanode section: duplicate datanode name in the list." + end + # check datanode_number + a = [] + get_name.each {|name| + a << get_datanode_number(name) + } + if a.size != a.uniq.size + raise "Datanode section: duplicate datanode_number value in the list." + end + a.sort! + i = 1 + a.each {|v| + if v != i + raise "Datanode section: datanode_number should start with 1. (next datanode_number add +1)." + end + i += 1 + } + + # check mirror number and mirror primary + # standby : mirror + get_name.each {|name| + if get_standby_type(name) == "mirror" + mirror_number = [] + get_mirror_name(name).each {|mirror_name| + num = get_mirror_number(name, mirror_name) + if num == nil + raise "Datanode mirror section: mirror_number values is missing in 'name: #{mirror_name}'." + end + mirror_number << num + } + if mirror_number.size != mirror_number.uniq.size + raise "Datanode mirror section: duplicate mirror_number value in the list." + end + mirror_number.sort! + i = 1 + mirror_number.each {|v| + if v != i + raise "Datanode mirror section: mirror_number should start with 1. (next mirror_number add +1)." + end + i += 1 + } + # check mirror primary + if get_mirror_primary_name(name) == nil + raise "Datanode mirror section: primary is missing in 'name: #{name}'." + end + if get_mirror_name(name).index(get_mirror_primary_name(name)) == nil + raise "Datanode mirror section: 'primary: #{get_mirror_primary_name(name)}' is not found in the mirror section." 
+ end + end + } + + end + + def primary + @value.each {|item| + if item.has_key?("primary-node") + return item["primary-node"] + end + } + return nil + end + + def count + ary = [] + @value.each {|item| + if item["name"] != nil and item["name"] != "default" + ary << item["name"] + end + } + return ary.size + end + + def primary?(name, mirror=nil) + if mirror == nil + return true if primary() == name + else # mirror primary + return true if get_mirror_primary_name(name) == mirror + end + return false + end + + def get_mirror_number(name, mirror) + if get_mirror_data(name) == nil + return nil + end + if get_mirror_data(name)["mirror"] == nil + return nil + end + get_mirror_data(name)["mirror"].each {|item| + if item["name"] == mirror + return item["mirror_number"] + end + } + return nil + end + + def get_name + ary = [] + @value.each {|item| + if item["name"] != nil and item["name"] != "default" + ary << item["name"] + end + } + if ary.size == 0 + ary = nil + end + return ary + end + + def get_mirror_name(name) + mirror_name = [] + return nil if get_mirror_data(name) == nil + return nil if get_mirror_data(name)["mirror"] == nil + get_mirror_data(name)["mirror"].each {|item| + if item["name"] != nil + mirror_name << item["name"] + end + } + return nil if mirror_name.size == 0 + return mirror_name + end + + def get_mirror_primary_name(name) + return nil if get_mirror_data(name) == nil + return nil if get_mirror_data(name)["mirror"] == nil + get_mirror_data(name)["mirror"].each {|item| + if item["primary"] != nil + return item["primary"] + end + } + return nil + end + + def get_server(name) + @value.each {|item| + if item["name"] == name + return item["server"] + end + } + return nil + end + + def get_mirror_servers(name) + server = [] + if get_mirror_data(name) == nil + return nil + end + if get_mirror_data(name)["mirror"] == nil + return nil + end + get_mirror_data(name)["mirror"].each {|item| + if item["server"] != nil + server << item["server"] + end + } + return nil if server.size == 0 + return server + end + + def get_mirror_server(name, mirror) + if get_mirror_data(name) == nil + return nil + end + if get_mirror_data(name)["mirror"] == nil + return nil + end + get_mirror_data(name)["mirror"].each {|item| + if item["name"] == mirror + return item["server"] + end + } + return nil + end + + def get_ip_addr(name) + @value.each {|item| + if item["name"] == name + return item["ip_addr"] + end + } + return nil + end + + def get_mirror_ip_addr(name, mirror) + if get_mirror_data(name) == nil + return nil + end + if get_mirror_data(name)["mirror"] == nil + return nil + end + get_mirror_data(name)["mirror"].each {|item| + return item["ip_addr"] if item["name"] == mirror + } + return nil + end + + def get_mirror_svr_ip_addr(name) + if get_mirror_data(name) == nil + return nil + end + if get_mirror_data(name)["mirror"] == nil + return nil + end + mirror_svr_ip_addr = [] + get_mirror_data(name)["mirror"].each {|item| + if item["server"] != nil + mirror_svr_ip_addr << {"server"=>item["server"], "ip_addr"=>item["ip_addr"]} + end + } + if mirror_svr_ip_addr.size == 0 + return nil + end + return mirror_svr_ip_addr + end + + def get_gtm(name) + @value.each {|item| + if item["name"] == name + return item["gtm"] if item["gtm"] != nil + end + } + if @default_gtm != nil + return @default_gtm + end + return nil + end + + def get_mirror_gtm(name, mirror_name) + if get_mirror_data(name) == nil + return nil + end + if get_mirror_data(name)["mirror"] == nil + return nil + end + get_mirror_data(name)["mirror"].each 
{|item| + if item["name"] == mirror_name + return item["gtm"] if item["gtm"] != nil + end + } + if @default_gtm != nil + return @default_gtm + end + return nil + end + + def get_datanode_number(name) + @value.each{|item| + if item["name"] == name + return item["datanode_number"] + end + } + return nil + end + + def get_default + @value.each {|item| + if item["name"] == "default" + return item + end + } + return nil + end + + def get_port(name) + @value.each {|item| + if item["name"] == name + return item["port"] if item["port"] != nil + end + } + if @default_port != nil + return @default_port + end + return nil + end + + def get_mirror_port(name, mirror_name) + if get_mirror_data(name) != nil + if get_mirror_data(name)["mirror"] != nil + get_mirror_data(name)["mirror"].each {|item| + if item["name"] == mirror_name + return item["port"] if item["port"] != nil + end + } + end + end + if @default_port != nil + return @default_port + end + return nil + end + + def get_data_directory(name) + @value.each{|item| + if item["name"] == name + return item["data_directory"] if item["data_directory"] != nil + end + } + if @default_data_directory != nil + return @default_data_directory + end + return nil + end + + def get_mirror_datanode_directory(name, mirror) + if get_mirror_data(name) != nil + if get_mirror_data(name)["mirror"] != nil + get_mirror_data(name)["mirror"].each {|item| + if item["name"] == mirror + return item["data_directory"] if item["data_directory"] != nil + end + } + end + end + if @default_data_directory != nil + return @default_data_directory + end + return nil + end + + def default + default_values = get_default + if default_values != nil + @default_gtm = default_values["gtm"] + @default_port = default_values["port"] + @default_data_directory = default_values["data_directory"] + end + end + + def get_mirror_data(name) + @value.each {|item| + if item["name"] == name + return item + end + } + return nil + end + + def get_standby_type(name) + return get_mirror_data(name) ? get_mirror_data(name)["standby"] : nil + end +end + +#------------------------------ +# Coordinator Class +#------------------------------ +class Coordinator + include PgxcModule + attr_accessor :value + def initialize(coord_conf) + if coord_conf == nil + raise "Coordinator section is missing." + end + @value = coord_conf + default + validate + # set status data + @value.each {|v| + if v['name'] != 'default' + if v.has_key?('status') == false + v['status'] = 'R' + end + end + } + end + + def set_status(name, status) + @value.each {|v| + if v['name'] == name + v['status'] = status + end + } + end + + def get_status(name) + @value.each{|v| + return v['status'] if v['name'] == name + } + return nil + end + + def count + ary = [] + @value.each {|item| + if item["name"] != nil and item["name"] != "default" + ary << item["name"] + end + } + return ary.size + end + + def validate + required = {"name"=>true, "server"=>true, "ip_addr"=>false, "preferred_data_nodes"=>true, "coordinator_number"=>true, + "gtm"=>true, "port"=>true, "data_directory"=>false, "status"=>false} + yaml_keys = [] + get_keywords(@value, yaml_keys) + required.each {|k, v| + if yaml_keys.index(k) == nil and required[k] == true + raise "Coordinator section: '#{k}' is missing." + end + } + required_key = [] + required.each {|k, v| + required_key << k + } + yaml_keys.each {|k| + if required_key.index(k) == nil + raise "Coordinator section: unknown keyword '#{k}'" + end + } + if get_name == nil + raise "Coordinator section: 'name' is missing. 
(Excluding default)" + end + + # check array (include? 'name') + @value.each {|item| + next if item.has_key?("primary-node") + if item["name"] == nil + raise "Coordinator section: name value is missing." + end + } + # check server and gtm and preferred_data_nodes + get_name.each {|name| + if get_coordinator_number(name) == nil + raise "Coordinator section: coordinator_number value is missing in 'name: #{name}'." + end + if get_server(name) == nil + raise "Coordinator section: server value is missing in 'name: #{name}'." + end + if get_gtm(name) == nil + raise "Coordinator section: gtm value is missing in 'name: #{name}'." + end + if get_preferred_data_nodes(name) == nil + raise "Coordinator section: preferred_data_nodes value is missing in 'name: #{name}'." + end + if get_port(name) == nil + raise "Coordinator section: port value is missing in 'name: #{name}'." + end + } + # check coordinator_number + a = [] + get_name.each {|name| + a << get_coordinator_number(name) + } + if a.size != a.uniq.size + raise "Coordinator section: duplicate coordinator_number value in the list." + end + a.sort! + i = 1 + a.each {|v| + if v != i + raise "Coordinator section: coordinator_number should start with 1. (next coordinator_number add +1)." + end + i += 1 + } + end + + def get_name + ary = [] + @value.each {|item| + if item["name"] != nil and item["name"] != "default" + ary << item["name"] + end + } + if ary.size == 0 + ary = nil + end + return ary + end + + def get_server(name) + @value.each {|item| + if item["name"] == name + return item["server"] + end + } + return nil + end + + def get_ip_addr(name) + @value.each {|item| + if item["name"] == name + return item["ip_addr"] + end + } + return nil + end + + def get_gtm(name) + @value.each {|item| + if item["name"] == name + return item["gtm"] if item["gtm"] != nil + end + } + if @default_gtm != nil + return @default_gtm + end + return nil + end + + def get_preferred_data_nodes(name) + @value.each {|item| + if item["name"] == name + return item["preferred_data_nodes"] if item["preferred_data_nodes"] != nil + end + } + if @default_preferred_data_nodes != nil + return @default_preferred_data_nodes + end + return nil + end + + def get_coordinator_number(name) + @value.each{|item| + if item["name"] == name + return item["coordinator_number"] + end + } + return nil + end + + def get_default + @value.each {|item| + if item["name"] == "default" + return item + end + } + return nil + end + + def get_port(name) + @value.each {|item| + if item["name"] == name + return item["port"] if item["port"] != nil + end + } + if @default_port != nil + return @default_port + end + return nil + end + + def get_data_directory(name) + @value.each{|item| + if item["name"] == name + return item["data_directory"] if item["data_directory"] != nil + end + } + if @default_data_directory != nil + return @default_data_directory + end + return nil + end + + def default + default_values = get_default + if default_values != nil + @default_gtm = default_values["gtm"] + @default_port = default_values["port"] + @default_data_directory = default_values["data_directory"] + @default_default_preferred_data_nodes = default_values["preferred_data_nodes"] + end + end + +end + +#------------------------------ +# PGSQLConf Class +#------------------------------ +class PGSQLConf + include PgxcModule + attr_accessor :value, :coordinator, :datanode + def initialize(pgsql_conf) + @coordinator = @datanode = nil + if pgsql_conf == nil + raise "Postgresql.conf section not found." 
+ end + @value = pgsql_conf + if pgsql_conf.class == Hash + @coordinator = pgsql_conf['coordinator'] + @datanode = pgsql_conf['datanode'] + else + raise "Postgresql.conf section: syntax error." + end + validate + end + + def validate + required = {"coordinator"=>true, "datanode"=>true, "name"=>true, + "data_node_users"=>true, "coordinator_users"=>true} + yaml_keys = [] + get_keywords(@value, yaml_keys) + required.each {|k, v| + if yaml_keys.index(k) == nil and required[k] == true + raise "Postgresql.conf section: '#{k}' is missing." + end + } + if @coordinator.class != Array + raise "Postgresql.conf coordinator section: syntax error." + end + if @datanode.class != Array and @datanode != nil + raise "Postgresql.conf datanode section: syntax error." + end + # check coordinator_users, data_node_users + if get_coordinator_name == nil and get_coordinator_default == nil + raise "Postgresql.conf coordinator section: syntax error." + end + ['coordinator_users','data_node_users'].each {|k| + if get_coordinator_default != nil + if get_coordinator_default[k] == nil + if get_coordinator_name == nil + raise "Postgresql.conf coordinator section: 'name' is missing." + else + get_coordinator_name.each {|name| + if get_coordinator_value(name)[k] == nil + raise "Postgresql.conf coordinator section: '#{k}' is missing in 'name: #{name}." + end + } + end + end + end + } + end + + def get_datanode_name + if @datanode == nil + return nil + end + ary = [] + @datanode.each {|item| + if item["name"] != nil and item["name"] != "default" + ary << item["name"] + end + } + if ary.size == 0 + ary = nil + end + return ary + end + + def get_coordinator_name + if @coordinator == nil + return nil + end + ary = [] + @coordinator.each {|item| + if item["name"] != nil and item["name"] != "default" + ary << item["name"] + end + } + if ary.size == 0 + ary = nil + end + return ary + end + + def get_coordinator_value(name) + @coordinator.each {|item| + if item["name"] == name + return item + end + } + return nil + end + + def get_coordinator_string_value(name) + opt = "" + param = get_coordinator_value(name) + if param != nil + param.each {|key, value| + if key != "name" and key != "default" + opt = "#{opt}#{key} = #{value}\n" + end + } + end + return opt + end + + def get_datanode_value(name) + if @datanode == nil + return nil + end + @datanode.each {|item| + if item["name"] == name + return item + end + } + return nil + end + + def get_datanode_string_value(name) + opt = "" + param = get_datanode_value(name) + if param != nil + param.each {|key, value| + if key != "name" and key != "default" + opt = "#{opt}#{key} = #{value}\n" + end + } + end + return opt + end + + def get_coordinator_default + @coordinator.each {|item| + if item["name"] == "default" + return item + end + } + return nil + end + + def get_coordinator_default_string_value(name) + opt = "" + param = get_coordinator_default + if param != nil + param.each {|key, value| + if key != "name" and key != "default" + opt = "#{opt}#{key} = #{value}\n" + end + } + end + return opt + end + + def get_datanode_default + if @datanode == nil + return nil + end + @datanode.each {|item| + if item["name"] == "default" + return item + end + } + return nil + end + + def get_datanode_default_string_value(name) + opt = "" + param = get_datanode_default + if param != nil + get_datanode_default.each {|key, value| + if key != "name" and key != "default" + opt = "#{opt}#{key} = #{value}\n" + end + } + end + return opt + end + +end + +#------------------------------ +# RPM Class 
+#------------------------------ +class RPM + include PgxcModule + attr_accessor :value, :server, :gtm, :coordinator + def initialize(rpm_conf) + if rpm_conf == nil + raise "Rpm section is missing." + end + @gtm = @coordinator = nil + if rpm_conf.class == Hash + @value = rpm_conf + @gtm = @value['gtm'] + @coordinator = @value['coordinator'] + else + raise "Rpm section: syntax error." + end + validate + end + + def validate + required = {"gtm"=>true, "coordinator"=>true} + yaml_keys = [] + get_keywords(@value, yaml_keys) + required.each {|k, v| + if yaml_keys.index(k) == nil and required[k] == true + raise "Rpm section: '#{k}' is missing." + end + } + required_key = [] + required.each {|k, v| + required_key << k + } + yaml_keys.each {|k| + if required_key.index(k) == nil + raise "Rpm section: unknown keyword '#{k}'." + end + } + end +end + +#------------------------------ +# PGHBAConf Class +#------------------------------ +class PGHBAConf + include PgxcModule + attr_accessor :value + def initialize(pg_hba_conf) + @value = @coordinator = @datanode = nil + if pg_hba_conf == nil + return + end + @value = pg_hba_conf + @coordinator = @datanode = nil + if pg_hba_conf.class == Hash + @coordinator = pg_hba_conf['coordinator'] + @datanode = pg_hba_conf['datanode'] + else + raise "Pg_hba.conf section: syntax error." + end + validate + end + + def validate + required = {"coordinator"=>true, "datanode"=>true, "name"=>true, "content"=>true} + yaml_keys = [] + get_keywords(@value, yaml_keys) + required.each {|k, v| + if yaml_keys.index(k) == nil and required[k] == true + raise "Pg_hba.conf section: '#{k}' is missing." + end + } + required_key = [] + required.each {|k, v| + required_key << k + } + yaml_keys.each {|k| + if required_key.index(k) == nil + raise "Pg_hba.conf section: unknown keyword '#{k}'." + end + } + # check default + if has_datanode_default? == true + if get_datanode_default_contents == nil + raise "Pg_hba.conf section: 'name: default' no contents were found." + end + end + if has_coordinator_default? == true + if get_coordinator_default_contents == nil + raise "Pg_hba.conf section: 'name: default' no contents were found." + end + end + end + + def get_datanode_name + if @datanode == nil + return nil + end + ary = [] + @datanode.each {|item| + if item["name"] != nil and item["name"] != "default" + ary << item["name"] + end + } + if ary.size == 0 + ary = nil + end + return ary + end + + def get_coordinator_name + if @coordinator == nil + return nil + end + ary = [] + @coordinator.each {|item| + if item["name"] != nil and item["name"] != "default" + ary << item["name"] + end + } + if ary.size == 0 + ary = nil + end + return ary + end + + def has_datanode_default? + if @datanode == nil + return false + end + @datanode.each {|item| + if item["name"] == "default" + return true + end + } + return false + end + + def has_coordinator_default? 
+ if @coordinator == nil + return false + end + @coordinator.each {|item| + if item["name"] == "default" + return true + end + } + return false + end + + def get_coordinator_contents(name) + if @coordinator == nil + return nil + end + @coordinator.each {|item| + if item["name"] == name + return item["content"] + end + } + return nil + end + + def get_datanode_contents(name) + if @datanode == nil + return nil + end + @datanode.each {|item| + if item["name"] == name + return item["content"] + end + } + return nil + end + + def get_coordinator_default_contents + if @coordinator == nil + return nil + end + @coordinator.each {|item| + if item["name"] == "default" + return item["content"] + end + } + return nil + end + + def get_datanode_default_contents + if @datanode == nil + return nil + end + @datanode.each {|item| + if item["name"] == "default" + return item["content"] + end + } + return nil + end + +end + +#------------------------------ +# PGIDENTConf Class +#------------------------------ +class PGIDENTConf + include PgxcModule + attr_accessor :value + def initialize(pg_ident_conf) + @value = @coordinator = @datanode = nil + if pg_ident_conf == nil + return + end + @value = pg_ident_conf + if pg_ident_conf.class == Hash + @coordinator = pg_ident_conf['coordinator'] + @datanode = pg_ident_conf['datanode'] + else + raise "Pg_ident.conf section: syntax error." + end + validate + end + + def validate + required = {"coordinator"=>false, "datanode"=>false, "name"=>false, "content"=>false} + yaml_keys = [] + get_keywords(@value, yaml_keys) + required.each {|k, v| + if yaml_keys.index(k) == nil and required[k] == true + raise "Pg_ident.conf section: '#{k}' is missing." + end + } + required_key = [] + required.each {|k, v| + required_key << k + } + yaml_keys.each {|k| + if required_key.index(k) == nil + raise "Pg_ident.conf section: unknown keyword '#{k}'." + end + } + # check default + if has_datanode_default? == true + if get_datanode_default_contents == nil + raise "Pg_ident.conf section: contents not found." + end + end + if has_coordinator_default? == true + if get_coordinator_default_contents == nil + raise "Pg_ident.conf section: contents not found." + end + end + # check coordinator + if @coordinator != nil + coordinator_item = get_coordinator_name + if coordinator_item == nil and has_coordinator_default? == false + raise "Pg_ident.conf section: name value is missing." + end + if coordinator_item != nil + coordinator_item.each {|name| + if get_coordinator_contents(name) == nil + raise "Pg_ident.conf section: contents not found." + end + } + end + end + # check datanode + if @datanode != nil + datanode_item = get_datanode_name + if datanode_item == nil and has_datanode_default? == false + raise "Pg_ident.conf section: name value is missing." + end + if datanode_item != nil + datanode_item.each {|name| + if get_datanode_contents(name) == nil + raise "Pg_ident.conf section: contents not found." + end + } + end + end + end + + def get_datanode_name + if @datanode == nil + return nil + end + ary = [] + @datanode.each {|item| + if item["name"] != nil and item["name"] != "default" + ary << item["name"] + end + } + if ary.size == 0 + ary = nil + end + return ary + end + + def get_coordinator_name + if @coordinator == nil + return nil + end + ary = [] + @coordinator.each {|item| + if item["name"] != nil and item["name"] != "default" + ary << item["name"] + end + } + if ary.size == 0 + ary = nil + end + return ary + end + + def has_datanode_default? 
+ if @datanode == nil + return false + end + @datanode.each {|item| + if item["name"] == "default" + return true + end + } + return false + end + + def has_coordinator_default? + if @coordinator == nil + return false + end + @coordinator.each {|item| + if item["name"] == "default" + return true + end + } + return false + end + + def get_coordinator_contents(name) + if @coordinator == nil + return nil + end + @coordinator.each {|item| + if item["name"] == name + return item["content"] + end + } + return nil + end + + def get_datanode_contents(name) + if @datanode == nil + return nil + end + @datanode.each {|item| + if item["name"] == name + return item["content"] + end + } + return nil + end + + def get_coordinator_default_contents + if @coordinator == nil + return nil + end + @coordinator.each {|item| + if item["name"] == "default" + return item["content"] + end + } + return nil + end + + def get_datanode_default_contents + if @datanode == nil + return nil + end + @datanode.each {|item| + if item["name"] == "default" + return item["content"] + end + } + return nil + end + +end + +#------------------------------ +# PGXC config Class +#------------------------------ +class PGXCCONF + attr_accessor :value, :postgresxc, :server, :gtm, :gtmproxy, :datanode, :coordinator, :pgsqlconf, :rpm, :pghbaconf, :pgidentconf + def initialize(yaml) + @value = YAML.load_file(yaml) + @postgresxc = PostgresXC.new(@value['postgres-xc']) + @server = Server.new(@value['server']) + @gtm = GTM.new(@value['gtm']) + @gtmproxy = GTMProxy.new(@value['gtm-proxy']) + @datanode = Datanode.new(@value['datanode']) + @coordinator = Coordinator.new(@value['coordinator']) + @pgsqlconf = PGSQLConf.new(@value['postgresql.conf']) + @rpm = RPM.new(@value['rpm']) + @pghbaconf = PGHBAConf.new(@value['pg_hba.conf']) + @pgidentconf = PGIDENTConf.new(@value['pg_ident.conf']) + validate + end + + def to_file(path_filename) + f = File.open(File.expand_path("#{path_filename}"), "w") + f.puts @value.to_yaml + f.close + end + + def validate + # top level section keyword check + required = { "postgres-xc"=>true, "server"=>true, "gtm"=>true, "gtm-proxy"=>false, + "datanode"=>true, "coordinator"=>true, "rpm"=>true, + "postgresql.conf"=>true, "pg_hba.conf"=>false, "pg_ident.conf"=>false} + yaml_keys = [] + @value.each {|k, v| + yaml_keys << k + } + required.each {|k, v| + if yaml_keys.index(k) == nil and required[k] == true + raise "'#{k}' section is missing." + end + } + required_key = [] + required.each {|k, v| + required_key << k + } + yaml_keys.each {|k| + if required_key.index(k) == nil + raise "Unknown section '#{k}'." + end + } + + # Mirror ? + mirror_type = @datanode.mirror_datanode? + if mirror_type + # check xc_watcher section + if @postgresxc.value.has_key?('XC_WATCHER') == false + raise "XcWatcher section is missing." + end + if @postgresxc.XC_WATCHER_HOST == nil + raise "XcWatcher section: 'server' is missing." + end + if @postgresxc.XC_WATCHER_PORT == nil + raise "XcWatcher section: 'port' is missing." + end + # check xc_watcher_server + if @server.server(@postgresxc.XC_WATCHER_HOST) == nil + raise "XcWatcher '#{@postgresxc.XC_WATCHER_HOST}' is not found in the server section." + end + # check monitoring agent + if @postgresxc.value.has_key?('MONITORING_AGENT') == false + raise "Monitoring Agent section is missing." + end + if @postgresxc.MONITORING_AGENT_PORT == nil + raise "Monitoring Agent: 'port' is missing." 
+ end + end + + + # check gtm server + @gtm.server.each {|v| + if @server.server(v) == nil + raise "Gtm '#{@gtm.server}' is not found in the server section." + end + } + @gtm.name.each {|v| + if @gtm.ip_addr(v) != nil + result = false + @server.get_ip_addr(@gtm.server(v)).each {|ip| + if ip == @gtm.ip_addr(v) + result = true + break + end + } + if result == false + raise "Gtm 'ip_addr: #{@gtm.ip_addr(v)}' is not found in the server section." + end + else + if @server.get_ip_addr(@gtm.server(v)).size != 1 + raise "Gtm 'ip_addr' value is missing." + end + end + } + + # check gtm-proxy section + # check server + if @gtmproxy.value != nil + @gtmproxy.get_name.each {|name| + if @server.server(@gtmproxy.get_server(name)) == nil + raise "Gtm-proxy 'server: #{@gtmproxy.get_server(name)}' is not found in the server section." + end + } + # check ip_addr + @gtmproxy.get_name.each {|name| + gtmp_ip = @gtmproxy.get_ip_addr(name) + if gtmp_ip != nil + if @server.get_ip_addr(@gtmproxy.get_server(name)).index(gtmp_ip) == nil + raise "Gtm-proxy 'name: #{name}' and 'ip_addr: #{gtmp_ip}' are not found in the server section." + end + else + if @server.get_ip_addr(@gtmproxy.get_server(name)).size != 1 + raise "Gtm-proxy 'ip_addr' is not included in 'name: #{name}'." + end + end + } + #check gtm + @gtmproxy.get_name.each {|name| + gtm_name = @gtmproxy.get_gtm(name) + if @gtm.name.index(gtm_name) + next + else + raise "Gtm-proxy 'gtm: #{gtm_name}' is not found in the gtm section." + end + } + end + + # check datanode section + # check server + @datanode.get_name.each {|name| + if @datanode.get_standby_type(name) == "mirror" + @datanode.get_mirror_servers(name).each {|svr| + if @server.server(svr) == nil + raise "Datanode 'server: #{svr}' is not found in the server section." + end + } + else + if @server.server(@datanode.get_server(name)) == nil + raise "Datanode 'server: #{@datanode.get_server(name)}' is not found in the server section." + end + end + } + # check ip_addr + @datanode.get_name.each {|name| + if @datanode.get_standby_type(name) == "mirror" + mirror_name = @datanode.get_mirror_name(name) + datanode_svr_ip = @datanode.get_mirror_svr_ip_addr(name) + datanode_ip = [] + datanode_svr_ip.each {|dn_ip| + next if dn_ip['ip_addr'] == nil + datanode_ip << dn_ip['ip_addr'] + } + if datanode_svr_ip != nil + if datanode_ip.size != datanode_ip.uniq.size + raise "Datanode 'name:#{name}, ip_addr: or server:' value is duplicated." + end + datanode_svr_ip.each {|svr_ip| + if svr_ip["ip_addr"] != nil + if @server.get_ip_addr(svr_ip["server"]).index(svr_ip["ip_addr"]) == nil + raise "Datanode 'name:#{name}, ip_addr: #{svr_ip['ip_addr']}' value is not found in the server section." + end + else + if @server.get_ip_addr(svr_ip["server"]).size != 1 + raise "Datanode 'ip_addr' is not included in 'name: #{name}'." + end + end + } + else + raise "Datanode 'name:#{name}' is server value is not found." + end + else + datanode_ip = @datanode.get_ip_addr(name) + if datanode_ip != nil + if @server.get_ip_addr(@datanode.get_server(name)).index(datanode_ip) == nil + raise "Datanode 'name:#{name}, ip_addr: #{datanode_ip}' value is not found in the server section." + end + else + if @server.get_ip_addr(@datanode.get_server(name)).size != 1 + raise "Datanode 'ip_addr' is not included in 'name: #{name}'." 
+ end + end + end + } + # check gtm + gtm_proxy_name = @gtmproxy.get_name + if gtm_proxy_name == nil + gtm_proxy_name = [] + end + @datanode.get_name.each {|name| + if @datanode.get_standby_type(name) == "mirror" + @datanode.get_mirror_name(name).each {|mirror_nm| + gtm_name = @datanode.get_mirror_gtm(name,mirror_nm) + if @gtm.name == gtm_name or gtm_proxy_name.index(gtm_name) != nil + next + else + raise "Datanode 'gtm: #{gtm_name}' is not found in gtm or gtm-proxy section." + end + } + else + gtm_name = @datanode.get_gtm(name) + if @gtm.name == gtm_name or gtm_proxy_name.index(gtm_name) != nil + next + else + raise "Datanode 'gtm: #{gtm_name}' is not found in gtm or gtm-proxy section." + end + end + } + # check datanode data_direcotry + @datanode.get_name.each {|name| + if @datanode.get_standby_type(name) == "mirror" + @datanode.get_mirror_name(name).each {|mirror_nm| + if get_mirror_datanode_data_directory(name,mirror_nm) == nil + raise "Datanode data_direcotry value is not found in 'datanode: #{name}, mirror: #{mirror_nm}'." + end + } + else + if get_datanode_data_directory(name) == nil + raise "Datanode data_direcotry value is not found in 'datanode: #{name}'." + end + end + } + + # check coordinatro section + # check server + @coordinator.get_name.each {|name| + if @server.server(@coordinator.get_server(name)) == nil + raise "Coordinator 'server: #{@coordinator.get_server(name)}' is not found in the server section." + end + } + # check ip_addr + @coordinator.get_name.each {|name| + coordinator_ip = @coordinator.get_ip_addr(name) + if coordinator_ip != nil + if @server.get_ip_addr(@coordinator.get_server(name)).index(coordinator_ip) == nil + raise "Coordinator 'name: #{name}, ip_addr: #{coordinator_ip}' is not found in the server section." + end + else + if @server.get_ip_addr(@coordinator.get_server(name)).size != 1 + raise "Coordinator 'ip_addr' is not included in 'name: #{name}'." + end + end + } + # check gtm + gtm_proxy_name = @gtmproxy.get_name + if gtm_proxy_name == nil + gtm_proxy_name = [] + end + @coordinator.get_name.each {|name| + gtm_name = @coordinator.get_gtm(name) + if @gtm.name == gtm_name or gtm_proxy_name.index(gtm_name) != nil + next + else + raise "Coordinator 'gtm: #{gtm_name}' value is not found in gtm or gtm-proxy section." + end + } + # check preferred_data_nodes + @coordinator.get_name.each {|name| + @coordinator.get_preferred_data_nodes(name).each {|dn_name| + if @datanode.get_name.index(dn_name) != nil + next + else + raise "Coordinator 'preferred_data_nodes: #{dn_name}' is not found in the datanode section" + end + } + } + # check coordinator data_direcotry + @coordinator.get_name.each {|name| + if get_coordinator_data_directory(name) == nil + raise "Coordinator data_direcotry is not found in 'coordinator: #{name}'" + end + } + # datanode and coordinator + # check pg_ident.conf + if @pgidentconf.get_datanode_name != nil + @pgidentconf.get_datanode_name.each {|name| + if @datanode.get_name.index(name) == nil + raise "Pg_ident.conf section: '#{name}' value is not found datanode section." + end + } + end + if @pgidentconf.get_coordinator_name... [truncated message content] |
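Each of the section classes above (Datanode, Coordinator, RPM, PGHBAConf, PGIDENTConf) repeats the same two-pass keyword validation: collect the keys actually present in the YAML section, raise if any required key is absent, then raise if any present key is not in the known set. A compact self-contained sketch of that pattern, written in C rather than the tool's Ruby (the key table, section_keys and validate_section are illustrative stand-ins, not part of the tool):

    #include <stdio.h>
    #include <string.h>

    typedef struct { const char *key; int required; } KeySpec;

    /* Known keywords for one hypothetical section; required ones flagged. */
    static const KeySpec spec[] = {
        { "name", 1 }, { "server", 1 }, { "ip_addr", 0 },
        { "gtm", 1 }, { "port", 1 }, { "data_directory", 0 },
    };

    static int has_key(const char *k, const char **keys, int n)
    {
        for (int i = 0; i < n; i++)
            if (strcmp(keys[i], k) == 0)
                return 1;
        return 0;
    }

    /* Pass 1: every required key present.  Pass 2: every present key known. */
    static int validate_section(const char **keys, int nkeys)
    {
        int nspec = (int) (sizeof(spec) / sizeof(spec[0]));

        for (int i = 0; i < nspec; i++)
            if (spec[i].required && !has_key(spec[i].key, keys, nkeys)) {
                fprintf(stderr, "section: '%s' is missing.\n", spec[i].key);
                return -1;
            }
        for (int i = 0; i < nkeys; i++) {
            int known = 0;
            for (int j = 0; j < nspec; j++)
                if (strcmp(keys[i], spec[j].key) == 0) { known = 1; break; }
            if (!known) {
                fprintf(stderr, "section: unknown keyword '%s'.\n", keys[i]);
                return -1;
            }
        }
        return 0;
    }

    int main(void)
    {
        /* Keys as they might come out of one parsed YAML section. */
        const char *section_keys[] = { "name", "server", "port", "gtm" };
        return validate_section(section_keys, 4) ? 1 : 0;
    }

The two passes mirror the Ruby classes exactly: missing required keys are reported before unknown keys, so a misspelled mandatory keyword surfaces as "missing" rather than "unknown".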
From: Michael P. <mic...@us...> - 2011-05-12 03:58:07
|
Project "Postgres-XC". The branch, ha_support has been updated via 4f452a336a0cea55c13b93823f39ffed547f9065 (commit) from 95fbb1a7742ef7cb0698875dbb3ad758499c21c7 (commit) - Log ----------------------------------------------------------------- commit 4f452a336a0cea55c13b93823f39ffed547f9065 Author: Michael P <mic...@us...> Date: Thu May 12 12:54:18 2011 +0900 Fix for GXID feed This fix is already done on master branch, but it became also necessary here for HA test purposes. diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c index 1f3c09f..a619fd4 100644 --- a/src/backend/access/transam/xact.c +++ b/src/backend/access/transam/xact.c @@ -1751,13 +1751,14 @@ CommitTransaction(bool contact_gtm) bool PrepareLocalCoord = false; bool PreparePGXCNodes = false; char implicitgid[256]; - TransactionId xid = GetCurrentTransactionId(); + TransactionId xid = InvalidTransactionId; if (IS_PGXC_COORDINATOR && !IsConnFromCoord() && contact_gtm) PreparePGXCNodes = PGXCNodeIsImplicit2PC(&PrepareLocalCoord); if (PrepareLocalCoord || PreparePGXCNodes) { + xid = GetCurrentTransactionId(); sprintf(implicitgid, "T%d", xid); /* Build Implicit 2PC Data for implicit PREPARE */ ----------------------------------------------------------------------- Summary of changes: src/backend/access/transam/xact.c | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) hooks/post-receive -- Postgres-XC |
From: Michael P. <mic...@us...> - 2011-05-12 03:52:41
|
Project "Postgres-XC". The branch, ha_support has been deleted was cf27a768d5e2f74c88d029a03d857a8d57040e34 ----------------------------------------------------------------------- cf27a768d5e2f74c88d029a03d857a8d57040e34 Fix for GXID feed ----------------------------------------------------------------------- hooks/post-receive -- Postgres-XC |
From: Koichi S. <koi...@us...> - 2011-05-12 02:01:18
|
Project "Postgres-XC". The branch, documentation has been updated via 58954e79e274b1280329aa61ca6af66b88c59cf6 (commit) from 0222ee52bb3fd9bedc71ee86169ad5009137fecc (commit) - Log ----------------------------------------------------------------- commit 58954e79e274b1280329aa61ca6af66b88c59cf6 Author: Koichi Suzuki <koi...@gm...> Date: Thu May 12 10:54:23 2011 +0900 This is the second commit of Postgres-XC documentation. Contents of the commit is as follows: 1) For sections just from PostgreSQL and not reviewd yet, added notice that this need further review and revision. Notice will be found in the file pgnotice.sgmlin. 2) First revision work on "installation". This is still WIP. To include notice in 1) almost all the files are modified. So far, we can generate html and man with this commit. diff --git a/doc/src/sgml/pgnotice.sgmlin b/doc/src/sgml/pgnotice.sgmlin new file mode 100644 index 0000000..515d384 --- /dev/null +++ b/doc/src/sgml/pgnotice.sgmlin @@ -0,0 +1,6 @@ +<!## XC> +<para> +Notice: At present, this section is just taken from PostgreSQL documentation +and is subject to revision for Postgres-XC. +</para> +<!## end> ----------------------------------------------------------------------- Summary of changes: doc/src/sgml/pgnotice.sgmlin | 6 ++++++ 1 files changed, 6 insertions(+), 0 deletions(-) create mode 100644 doc/src/sgml/pgnotice.sgmlin hooks/post-receive -- Postgres-XC |
From: Michael P. <mic...@us...> - 2011-05-11 09:37:56
|
Project "Postgres-XC". The branch, master has been updated via b170fe2d7fc4bd175c72c2e4370fab223bac24d6 (commit) from 1580ed848d0577c52949c52fe2cec867b5ee1746 (commit) - Log ----------------------------------------------------------------- commit b170fe2d7fc4bd175c72c2e4370fab223bac24d6 Author: Michael P <mic...@us...> Date: Wed May 11 18:29:55 2011 +0900 Support for single-prepared PL/PGSQL functions This commit fixes primarily problems like in bug 3138450 (cache lookup for type 0) where XC was not able to set up plpgsql parameter values because values were not correctly fetched. This commit does not yet solve the special case of multiple uses of same plpgsql datum within a SQL command. PL/PGSQL functions using subqueries are out of scope for the moment due to XC's restrictions regarding multi-prepared statements. diff --git a/src/backend/pgxc/pool/execRemote.c b/src/backend/pgxc/pool/execRemote.c index 47a07f0..43d9606 100644 --- a/src/backend/pgxc/pool/execRemote.c +++ b/src/backend/pgxc/pool/execRemote.c @@ -4057,17 +4057,45 @@ ParamListToDataRow(ParamListInfo params, char** result) StringInfoData buf; uint16 n16; int i; + int real_num_params = params->numParams; + + /* + * It is necessary to fetch parameters + * before looking at the output value. + */ + for (i = 0; i < params->numParams; i++) + { + ParamExternData *param; + + param = ¶ms->params[i]; + + if (!OidIsValid(param->ptype) && params->paramFetch != NULL) + (*params->paramFetch) (params, i + 1); + + /* + * In case parameter type is not defined, it is not necessary to include + * it in message sent to backend nodes. + */ + if (!OidIsValid(param->ptype)) + real_num_params--; + } initStringInfo(&buf); + /* Number of parameter values */ - n16 = htons(params->numParams); + n16 = htons(real_num_params); appendBinaryStringInfo(&buf, (char *) &n16, 2); /* Parameter values */ for (i = 0; i < params->numParams; i++) { - ParamExternData *param = params->params + i; + ParamExternData *param = ¶ms->params[i]; uint32 n32; + + /* If parameter has no type defined it is not necessary to include it in message */ + if (!OidIsValid(param->ptype)) + continue; + if (param->isnull) { n32 = htonl(-1); ----------------------------------------------------------------------- Summary of changes: src/backend/pgxc/pool/execRemote.c | 32 ++++++++++++++++++++++++++++++-- 1 files changed, 30 insertions(+), 2 deletions(-) hooks/post-receive -- Postgres-XC |
From: Michael P. <mic...@us...> - 2011-05-11 07:05:19
|
Project "Postgres-XC". The branch, ha_support has been updated via cf27a768d5e2f74c88d029a03d857a8d57040e34 (commit) from 3cf91c1fec8c9756b0f1b417a35d4e02b4ad427a (commit) - Log ----------------------------------------------------------------- commit cf27a768d5e2f74c88d029a03d857a8d57040e34 Author: Michael P <mic...@us...> Date: Wed May 11 16:01:21 2011 +0900 Fix for GXID feed This fix is already done on master branch, but it became also necessary here for HA test purposes. diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c index 1f3c09f..a619fd4 100644 --- a/src/backend/access/transam/xact.c +++ b/src/backend/access/transam/xact.c @@ -1751,13 +1751,14 @@ CommitTransaction(bool contact_gtm) bool PrepareLocalCoord = false; bool PreparePGXCNodes = false; char implicitgid[256]; - TransactionId xid = GetCurrentTransactionId(); + TransactionId xid = InvalidTransactionId; if (IS_PGXC_COORDINATOR && !IsConnFromCoord() && contact_gtm) PreparePGXCNodes = PGXCNodeIsImplicit2PC(&PrepareLocalCoord); if (PrepareLocalCoord || PreparePGXCNodes) { + xid = GetCurrentTransactionId(); sprintf(implicitgid, "T%d", xid); /* Build Implicit 2PC Data for implicit PREPARE */ ----------------------------------------------------------------------- Summary of changes: src/backend/access/transam/xact.c | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) hooks/post-receive -- Postgres-XC |
From: Michael P. <mic...@us...> - 2011-05-11 07:03:51
|
Project "Postgres-XC". The branch, ha has been deleted was b426d4842f40c91e170378876e0c8107691c4272 ----------------------------------------------------------------------- b426d4842f40c91e170378876e0c8107691c4272 Fix for GXID feed ----------------------------------------------------------------------- hooks/post-receive -- Postgres-XC |
From: Koichi S. <koi...@us...> - 2011-05-10 09:06:59
|
Project "Postgres-XC documentation". The branch, master has been updated via 85d141426729b3b5089465ff6632741c624edc48 (commit) from 87e4cff06ed2bd2871ae0a35189e7b633343d6d2 (commit) - Log ----------------------------------------------------------------- commit 85d141426729b3b5089465ff6632741c624edc48 Author: Koichi Suzuki <koi...@gm...> Date: Tue May 10 18:06:23 2011 +0900 This commit adds one administartion file as below. This file contains initial list of issues for stabilization which is subject to update. new file: Stabilization.odf diff --git a/progress/Stabilization.odf b/progress/Stabilization.odf new file mode 100755 index 0000000..b0a8ca8 Binary files /dev/null and b/progress/Stabilization.odf differ ----------------------------------------------------------------------- Summary of changes: progress/Stabilization.odf | Bin 0 -> 30782 bytes 1 files changed, 0 insertions(+), 0 deletions(-) create mode 100755 progress/Stabilization.odf hooks/post-receive -- Postgres-XC documentation |
From: Koichi S. <koi...@us...> - 2011-05-09 00:39:14
|
Project "Postgres-XC". The branch, ha_support has been updated via 3cf91c1fec8c9756b0f1b417a35d4e02b4ad427a (commit) from 95fbb1a7742ef7cb0698875dbb3ad758499c21c7 (commit) - Log ----------------------------------------------------------------- commit 3cf91c1fec8c9756b0f1b417a35d4e02b4ad427a Author: Koichi Suzuki <koi...@gm...> Date: Mon May 9 09:10:03 2011 +0900 This is the first commit of GTM-Proxy "reconnect" feature to promoted GTM. Design docs will be found in http://sourceforge.net/apps/mediawiki/postgres-xc/index.php?title=Reconnect_GTM-Proxy. Please note that this code has to be tested for a while. Modified files are as follows: modified: src/gtm/common/gtm_serialize.c Just added file header. modified: src/gtm/common/gtm_serialize_debug.c Just added file header. modified: src/gtm/common/gtm_utils.c modified: src/gtm/common/stringinfo.c modified: src/gtm/main/gtm_standby.c modified: src/gtm/proxy/proxy_main.c modified: src/gtm/proxy/proxy_thread.c modified: src/include/gtm/gtm_proxy.h modified: src/include/gtm/libpq-be.h modified: src/include/gtm/libpq-int.h modified: src/include/gtm/stringinfo.h diff --git a/src/gtm/common/gtm_serialize.c b/src/gtm/common/gtm_serialize.c index 85c7233..2328bd6 100644 --- a/src/gtm/common/gtm_serialize.c +++ b/src/gtm/common/gtm_serialize.c @@ -1,8 +1,29 @@ - +/*----------------------------------------------------------------------- + * + * gtm_serialize.c + * + * Serialize/deserialize internal information of GTM. This is used to + * backup GTM internal information to GTM-Standby. + * + * When backup, GTM temporarily concentrates to backup all the + * internal information to GTM-ACT, by blocking all the command + * handlings. Because it does not take long, it will not bring + * significant influence to the overall cluster response. + * + * Because this function must be loaded into various modules such as + * postgres, gtm, pgxc_clean, etc., we cannot cimply depond upon GTM's + * palloc. It has to be ready to use any context-dependent memory + * allocation systems, each mcxt.c or corresponding module should + * provide memory allocation/reallocation/initialization/free + * functions. For details, see include/gtm/gen_alloc.h. + * + * Copyright (c) 2011, Nippon Telegraph and Telephone Corporation + * + *----------------------------------------------------------------------- + */ #include "gtm/gtm_c.h" #include "gtm/elog.h" -// #include "gtm/palloc.h" #include "gtm/gtm.h" #include "gtm/gtm_txn.h" #include "gtm/gtm_seq.h" @@ -20,6 +41,12 @@ //#include "gtm/gtm_list.h" //#include "gtm/memutils.h" +/* ===================================================== + * + * Snapshot data handling + * + * ==================================================== + */ /* ----------------------------------------------------- * Get a serialized size of GTM_SnapshotData structure * Corrected snapshort serialize data calculation. @@ -29,11 +56,12 @@ /* * Serialize of snapshot_data * - * sn_xmin ---> sn_xmax ---> sn_recent_global_xmin - * - * ---> sn_xcnt ---> GXID * sn_xcnt - * |<--- sn_xip -->| + * sn_xmin, sn_xmax, sn_recent_global_xmin, sn_xcmt, + * (GXID * sn_xcnt). If sn_xcnt is zero, the last item + * is not included in the message. * + * All the data will be sent in internal representation. + * no conversion to the network format will be done. 
*/ size_t gtm_get_snapshotdata_size(GTM_SnapshotData *data) @@ -154,7 +182,6 @@ gtm_deserialize_snapshotdata(GTM_SnapshotData *data, const char *buf, size_t buf return len; } - /* ----------------------------------------------------- * Get a serialized size ofGTM_TransactionInfo structure * @@ -505,6 +532,14 @@ gtm_get_transactions_size(GTM_Transactions *data) return len; } +/* --------------------------------------------------------------------- + * + * Serialize all the transaction information in GTM. + * + * Please note that only "in_use" slots are selected to backup. + * + * --------------------------------------------------------------------- + */ size_t gtm_serialize_transactions(GTM_Transactions *data, char *buf, size_t buflen) { diff --git a/src/gtm/common/gtm_serialize_debug.c b/src/gtm/common/gtm_serialize_debug.c index e1dd854..d41e117 100644 --- a/src/gtm/common/gtm_serialize_debug.c +++ b/src/gtm/common/gtm_serialize_debug.c @@ -1,4 +1,13 @@ - +/*-------------------------------------------------------------------- + * + * gtm_serialize_debug.c + * + * Log serialize/deserialize information mainly for debug. + * + * Copyright (c) 2011, Nippon Telegraph and Telephone Corporation + * + *--------------------------------------------------------------------- + */ #include "gtm/gtm_c.h" #include "gtm/elog.h" #include "gtm/palloc.h" diff --git a/src/gtm/common/gtm_utils.c b/src/gtm/common/gtm_utils.c index a25ae6a..f889554 100644 --- a/src/gtm/common/gtm_utils.c +++ b/src/gtm/common/gtm_utils.c @@ -1,3 +1,16 @@ +/*---------------------------------------------------------------------- + * + * gtm_utils.c + * + * Collection of miscelleneous functions for GTM/GTM-Proxy/GTM-Standby. + * + * At present, we have only failure-report routine. + * + * Copyright (c) 2011, Nippon Telegraph and Telephone Corporation + * + *--------------------------------------------------------------------- + */ + #include "gtm/gtm_utils.h" #include "gtm/elog.h" @@ -13,6 +26,12 @@ void gtm_report_failure(GTM_Conn *failed_conn) { + /* + * Todo: invoke "xcm_putevent" command to report the failure of + * the communication buddy. Be careful that the buddy ipaddress + * is "normalized" here and in xcm_putevent, we need to identify + * the component and "id" used in XCM module. + */ /* FIXME: report_xcwatch_gtm_failure() */ elog(LOG, "Calling report_xcwatch_gtm_failure()..."); return; diff --git a/src/gtm/common/stringinfo.c b/src/gtm/common/stringinfo.c index 35e4cd8..122584c 100644 --- a/src/gtm/common/stringinfo.c +++ b/src/gtm/common/stringinfo.c @@ -8,9 +8,14 @@ * * Portions Copyright (c) 1996-2009, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California - * Portions Copyright (c) 2010-2011 Nippon Telegraph and Telephone Corporation + * Portions Copyright (c) 2010-2011, Nippon Telegraph and Telephone Corporation * - * $PostgreSQL: pgsql/src/backend/lib/stringinfo.c,v 1.49 2008/01/01 19:45:49 momjian Exp $ + * $Postgres-xc: postgres-xc/src/gtm/common/stringinfo.c + * + * History: + * Added dupStringInfo() and copyStringInfo(), to backup incoming + * command in GTM-Proxy. The backup is used to re-issue outstanding + * commands to GTM when it is promoted. May 5th, 2011, K.Suzuki * *------------------------------------------------------------------------- */ @@ -41,6 +46,41 @@ makeStringInfo(void) } /* + * dupStringInfo + * + * Get new StringInfo and copy the original one to it. Please note that + * "cursor" is copied too. 
+ */ +StringInfo +dupStringInfo(StringInfo orig) +{ + StringInfo new; + + new = makeStringInfo(); + if (!new) + return(new); + if (orig->len > 0) + { + appendBinaryStringInfo(new, orig->data, orig->len); + new->cursor = orig->cursor; + } + return(new); +} + +/* + * copyStringInfo + * + * Deep copy StringInfo. In the destination, cursor will remain "zero". + */ +void +copyStringInfo(StringInfo to, StringInfo from) +{ + resetStringInfo(to); + appendBinaryStringInfo(to, from->data, from->len); + return; +} + +/* * initStringInfo * * Initialize a StringInfoData struct (with previously undefined contents) diff --git a/src/gtm/main/gtm_standby.c b/src/gtm/main/gtm_standby.c index 4e9393b..6908324 100644 --- a/src/gtm/main/gtm_standby.c +++ b/src/gtm/main/gtm_standby.c @@ -1,3 +1,20 @@ +/*---------------------------------------------------------------------- + * + * gtm_standby.c + * + * gtm_standby.c mainly takes care of backing-up GTM internal + * information to GTM-Standby. Backup information is used to + * fail-over GTM. When active GTM is detected to be down, it will be + * terminated and "promote" event will be sent to GTM-Standby. Then + * it takes over GTM functionality. + * + * Copyright (c) 2011 Nippon Telegraph and Telephone Corporation + * + * $Postgres-xc: postgres-xc/src/gtm/main/gtm_standby.c + * + *---------------------------------------------------------------------- + */ + #include "gtm/gtm_standby.h" #include "gtm/elog.h" @@ -158,7 +175,8 @@ gtm_standby_restore_gxid() GTMTransactions.gt_transactions_array[i].gti_xmin = txn.gt_transactions_array[i].gti_xmin; GTMTransactions.gt_transactions_array[i].gti_isolevel = txn.gt_transactions_array[i].gti_isolevel; GTMTransactions.gt_transactions_array[i].gti_readonly = txn.gt_transactions_array[i].gti_readonly; - GTMTransactions.gt_transactions_array[i].gti_backend_id = txn.gt_transactions_array[i].gti_backend_id; + GTMTransactions.gt_transactions_array[i].gti_backend_id + = txn.gt_transactions_array[i].gti_backend_id; /* data node */ GTMTransactions.gt_transactions_array[i].gti_datanodecount diff --git a/src/gtm/proxy/proxy_main.c b/src/gtm/proxy/proxy_main.c index 4ab6f94..d2490e8 100644 --- a/src/gtm/proxy/proxy_main.c +++ b/src/gtm/proxy/proxy_main.c @@ -59,12 +59,40 @@ int GTMProxyPortNumber; int GTMProxyWorkerThreads; char *GTMProxyDataDir; +/* + * Added options to control how to deal with GTM communication error. + * + * Because we now have GTM-Standby which fails over when GTM fails and + * in this case, we should keep active transaction as is, wait for + * "reconnect" operation for a while. + */ +int GTMErrorWaitOpt = FALSE; /* Wait (for reconnect) if TRUE */ +int GTMErrorWaitSecs = 0; /* How long for each turn? */ +int GTMErrorWaitCount = 0; /* How many turns to wait? */ + char *GTMServerHost; int GTMServerPortNumber; GTM_PGXCNodeId GTMProxyID = 0; GTM_ThreadID TopMostThreadID; + +/* + * The following variables indicates if "reconnect" is received and + * where new GTM is. + * + * gtm_ctl will issue SIGUSR1 to notify the reconnect, together with + * data file located in GTM_Standby working directory. Then SIGUSR1 + * will be passed (if neede) to the main thread. Main thread will + * re-distribute this event as SIGUSR2 to all the child threads so + * that they can reconnect to new active GTM. + */ + +GTMProxy_ThreadInfo **Proxy_ThreadInfo; +short ReadyToReconnect = FALSE; +char *NewGTMServerHost; +int NewGTMServerPortNumber; + /* The socket(s) we're listening to. 
*/ #define MAXLISTEN 64 static int ListenSocket[MAXLISTEN]; @@ -205,9 +233,133 @@ BaseInit() } } +/* + * Token reader to parse -o option supplied by gtm_ctn reconnect. + * Please note that input string will be modified to indicate the end + * of each token. + */ +static char * +read_token(char *line, char **next) +{ + char *tok; + char *next_token; + + /* Skip leading spaces and find first char of the token. */ + if (line == NULL) + { + *next = NULL; + return(NULL); + } + for (tok = line;; tok++) + { + if (*tok == 0 || *tok == '\n') + { + *next = NULL; + return(NULL); + } + if (*tok == ' ' || *tok == '\t') + continue; + else + break; + } + /* Find the end of the token */ + for (next_token = tok;; next_token++) + { + if (*next_token == 0 || *next_token == '\n') + { + *next = NULL; + return(tok); + } + if (*next_token == ' ' || *next_token == '\t') + { + *next_token = 0; + *next = next_token + 1; + return(tok); + } + else + continue; + } + Assert(0); /* Never comes here. Keep compiler quiet. */ +} + +/* + * gtm -o option analyzer. + * + * Will return non-zero if error found. Here we assume that current + * working directory is that specifiec by -D option. + * + * -o option will be kept in "newgtm" file at -D directory, in the + case of reconnect operation. + */ +static int +GTMProxy_ReadReconnectInfo(void) +{ +#define MAXLINE 1024 +#define INVALID_RECONNECT_OPTION_MSG() \ + do {ereport(ERROR, (0, errmsg("Invalid Reconnect Option")));} while(0) + + char optstr[MAXLINE]; + char *line; + FILE *optarg_file; + char *optValue; + char *option; + char *next_token; + + optarg_file = fopen("newgtm", "r"); + if (optarg_file == NULL) + { + INVALID_RECONNECT_OPTION_MSG(); + return(-1); + } + line = fgets(optstr, MAXLINE, optarg_file); + if (line == NULL) + { + INVALID_RECONNECT_OPTION_MSG(); + return(-1); + } + fclose (optarg_file); + + next_token = optstr; + while((option = read_token(next_token, &next_token))) + { + if (strcmp(option, "-t") == 0) /* New GTM port */ + { + optValue = read_token(next_token, &next_token); + if (optValue == NULL) + { + INVALID_RECONNECT_OPTION_MSG(); + return(-1); + } + NewGTMServerPortNumber = atoi(optValue); + continue; + } + else if (strcmp(option, "-s") == 0) + { + optValue = read_token(next_token, &next_token); + if (optValue == NULL) + { + INVALID_RECONNECT_OPTION_MSG(); + return(-1); + } + if (NewGTMServerHost) + free(NewGTMServerHost); + NewGTMServerHost = strdup(optValue); + continue; + } + else + { + INVALID_RECONNECT_OPTION_MSG(); + return(-1); + } + } + return(0); +} + static void GTMProxy_SigleHandler(int signal) { + int ii; + fprintf(stderr, "Received signal %d", signal); switch (signal) @@ -218,7 +370,86 @@ GTMProxy_SigleHandler(int signal) case SIGINT: case SIGHUP: break; - + case SIGUSR1: /* Reconnect operation from gtm_ctl */ + /* + * Only the main thread can distribute SIGUSR2 to avoid + * lock contention of the thread info. If other thread + * receives SIGUSR1, it will issue SIGUSR1 to the main + * thread. + */ + /* + * The mask is set to block signals. They're blocked + * until all the thread reconnect to the new GTM. + */ + if (MyThreadID != TopMostThreadID) + { + pthread_kill(TopMostThreadID, SIGUSR1); + return; + } + /* + * Then I'm the main thread. + */ + PG_SETMASK(&BlockSig); + /* + * Setup Reconnect Info. + */ + if (!ReadyToReconnect) + { + elog(LOG, "SIGUSR1 detected, but not ready to handle this. Ignored."); + PG_SETMASK(&UnBlockSig); + return; + } + elog(LOG, "SIGUSR1 etected. 
Set reconnect info for each worker thread."); + if (GTMProxy_ReadReconnectInfo() != 0) + { + /* Failed to read reconnect information (-o option) from reconnect data file */ + PG_SETMASK(&UnBlockSig); + return; + } + for (ii = 0; ii < GTMProxyWorkerThreads; ii++) + { + if ((Proxy_ThreadInfo[ii] == NULL) || (Proxy_ThreadInfo[ii]->can_accept_SIGUSR2 == FALSE)) + { + elog(LOG, "Some thread is not ready to accept SIGUSR2. SIGUSR1 ignored."); + PG_SETMASK(&UnBlockSig); + return; + } + } + for (ii = 0; ii < GTMProxyWorkerThreads; ii++) + { + /* + * Issue SIGUSR2 to all the worker threads. + * It will not be issued to the main thread. + */ + pthread_kill(Proxy_ThreadInfo[ii]->thr_id, SIGUSR2); + } + elog(LOG, "Reconnect info setup done for each worker thread."); + PG_SETMASK(&UnBlockSig); + return; + case SIGUSR2: + /* + * This signal must be issued by the main thread to notify + * reconnect to new GTM. Main thread has already received + * SIGUSR1 to notify this and should not receive SIGUSR2. + */ + PG_SETMASK(&BlockSig); + if (MyThreadID == TopMostThreadID) + { + /* + * This should not be reached. Just in case. + */ + elog(LOG, "Main thread received SIGUSR2. Ignoring."); + PG_SETMASK(&UnBlockSig); + return; + } + GetMyThreadInfo->reconnect_issued = TRUE; + if (GetMyThreadInfo->can_longjmp) + { + /* PG_SETMASK(&UnBlockSig) needed at the exit of setjmp() */ + siglongjmp(GetMyThreadInfo->longjmp_env, 1); + } + PG_SETMASK(&UnBlockSig); + return; default: fprintf(stderr, "Unknown signal %d\n", signal); return; @@ -286,10 +517,12 @@ main(int argc, char *argv[]) GTMProxyPortNumber = GTM_PROXY_DEFAULT_PORT; GTMProxyWorkerThreads = GTM_PROXY_DEFAULT_WORKERS; + NewGTMServerHost = NULL; + /* * Parse the command like options and set variables */ - while ((opt = getopt(argc, argv, "h:i:p:n:D:l:s:t:")) != -1) + while ((opt = getopt(argc, argv, "h:i:p:n:D:l:s:t:w:z:")) != -1) { switch (opt) { @@ -328,6 +561,16 @@ main(int argc, char *argv[]) GTMServerHost = strdup(optarg); break; + case 'w': + /* How long to wait for SIGUSR1 to receive reconnect event? */ + GTMErrorWaitSecs = atoi(optarg); + break; + + case 'z': + /* How many tuns wait for SIGUSR1 to reconnect? */ + GTMErrorWaitCount = atoi(optarg); + break; + case 't': /* GTM server port number */ GTMServerPortNumber = atoi(optarg); @@ -355,6 +598,22 @@ main(int argc, char *argv[]) } /* + * If valid value is specified, then GTM-Proxy assumes XCM module is available. + * + * Anyway, keep dependency as minimum as possible. + */ + if (GTMErrorWaitSecs <= 0 || GTMErrorWaitCount <= 0) + { + GTMErrorWaitOpt = FALSE; + if (GTMErrorWaitSecs != 0 || GTMErrorWaitCount <= 0) + { + write_stderr("Invalid Options of waiting reconnect. Ignored.\n"); + GTMErrorWaitSecs = 0; + GTMErrorWaitCount = 0; + } + } + + /* * GTM accepts no non-option switch arguments. 
*/ if (optind < argc) @@ -417,10 +676,17 @@ main(int argc, char *argv[]) pqsignal(SIGQUIT, GTMProxy_SigleHandler); pqsignal(SIGTERM, GTMProxy_SigleHandler); pqsignal(SIGINT, GTMProxy_SigleHandler); + pqsignal(SIGUSR2, GTMProxy_SigleHandler); + pqsignal(SIGUSR1, GTMProxy_SigleHandler); pqinitmask(); /* + * Initialize Thread Infor area + */ + Proxy_ThreadInfo = palloc0(sizeof(GTMProxy_ThreadInfo *) * GTMProxyWorkerThreads); + + /* * Pre-fork so many worker threads */ @@ -429,7 +695,7 @@ main(int argc, char *argv[]) /* * XXX Start the worker thread */ - if (GTMProxy_ThreadCreate(GTMProxy_ThreadMain) == NULL) + if (GTMProxy_ThreadCreate(GTMProxy_ThreadMain, i) == NULL) { elog(ERROR, "failed to create a new thread"); return STATUS_ERROR; @@ -437,6 +703,11 @@ main(int argc, char *argv[]) } /* + * Now all the worker threads are ready and the proxy can accept SIGUSR1 to reconnect + */ + ReadyToReconnect = TRUE; + + /* * Accept any new connections. Add for each incoming connection to one of * the pre-forked threads. */ @@ -647,10 +918,17 @@ GTMProxy_ThreadMain(void *argp) sprintf(gtm_connect_string, "host=%s port=%d pgxc_node_id=%d remote_type=%d", GTMServerHost, GTMServerPortNumber, GTMProxyID, PGXC_NODE_GTM_PROXY); +#ifdef GTM_SBY_DEBUG + fprintf(stderr, "Connecting to GTm by PQconnectGTM(\"s\")\n", gtm_connect_string); +#endif + thrinfo->thr_gtm_conn = PQconnectGTM(gtm_connect_string); if (thrinfo->thr_gtm_conn == NULL) elog(FATAL, "GTM connection failed"); +#ifdef GTM_SBY_DEBUG + fprintf(stderr, "Connecting to GTM successfully\n"); +#endif /* * Get the input_message in the TopMemoryContext so that we don't need to @@ -661,6 +939,13 @@ GTMProxy_ThreadMain(void *argp) initStringInfo(&input_message); /* + * Set options to wait when GTM communication error is detected. + */ + thrinfo->thr_gtm_conn->gtmErrorWaitOpt = GTMErrorWaitOpt; + thrinfo->thr_gtm_conn->gtmErrorWaitSecs = GTMErrorWaitSecs; + thrinfo->thr_gtm_conn->gtmErrorWaitCount = GTMErrorWaitCount; + + /* * If an exception is encountered, processing resumes here so we abort the * current transaction and start a new one. * @@ -727,10 +1012,15 @@ GTMProxy_ThreadMain(void *argp) /* We can now handle ereport(ERROR) */ PG_exception_stack = &local_sigjmp_buf; + /* We can now handle SIGUSR2 as well */ + BLOCK_LONGJMP; /* Signal handler just returns, no longjmp() */ + GetMyThreadInfo->can_accept_SIGUSR2 = TRUE; + for (;;) { gtm_ListCell *elem = NULL; GTM_Result *res = NULL; + int ii; /* * Release storage left over from prior query cycle, and create a new @@ -824,6 +1114,53 @@ GTMProxy_ThreadMain(void *argp) * Now, read command from each of the connections that has some data to * be read. */ + /* + * Each SIGUSR2 signal should return here. Please note that + * from the beginning of the outer loop to here, longjmp is + * blocked and signal handler will simply return so that we + * don't have to be bothered with the memory context. + */ + setjmp_again: + if (sigsetjmp(GetMyThreadInfo->longjmp_env, 1) == 0) + { + BLOCK_LONGJMP; + } + else + { + /* + * Longjmp during read/write wait from/to GTM + */ + PG_SETMASK(&UnBlockSig); + /* + * Now reconnect to the new GTM + */ + sprintf(gtm_connect_string, + "host=%s port=%d pgxc_node_id=%d remote_type=%d", + NewGTMServerHost, NewGTMServerPortNumber, GTMProxyID, PGXC_NODE_GTM_PROXY); + thrinfo->thr_gtm_conn = PQconnectGTM(gtm_connect_string); + + if (thrinfo->thr_gtm_conn == NULL) + elog(FATAL, "GTM reconnect failed."); + + /* + * Set options to wait when GTM communication error is detected. 
+ */ + thrinfo->thr_gtm_conn->gtmErrorWaitOpt = GTMErrorWaitOpt; + thrinfo->thr_gtm_conn->gtmErrorWaitSecs = GTMErrorWaitSecs; + thrinfo->thr_gtm_conn->gtmErrorWaitCount = GTMErrorWaitCount; + + /* + * Initialize command processing. + */ + thrinfo->reconnect_issued = FALSE; + thrinfo->thr_processed_commands = gtm_NIL; + for (ii = 0; ii < MSG_TYPE_COUNT; ii++) + { + thrinfo->thr_pending_commands[ii] = gtm_NIL; + } + goto setjmp_again; /* Get ready for another SIGUSR2 */ + } + for (ii = 0; ii < thrinfo->thr_conn_count; ii++) { GTMProxy_ConnectionInfo *conninfo = thrinfo->thr_all_conns[ii]; @@ -836,16 +1173,25 @@ GTMProxy_ThreadMain(void *argp) * to the remove_list and cleanup at the end of this round of * cleanup. */ + /* + * === KS WIP: === + * Need to delay the clearance of GTMProxy_ConnectionInfo after + * finishing to handle other commands so that it can come back + * and do the same thing to the new GTM. + */ GTMProxy_HandleDisconnect(thrinfo->thr_conn, thrinfo->thr_gtm_conn); + thrinfo->thr_poll_fds[ii].revents &= ~POLLHUP; continue; } - if (thrinfo->thr_poll_fds[ii].revents & POLLIN) + /* The second condition works after the reconnect is done */ + if (thrinfo->thr_poll_fds[ii].revents & POLLIN || thrinfo->thr_qtype[ii]) { /* * (3) read a command (loop blocks here) */ qtype = ReadCommand(thrinfo->thr_conn, &input_message); + DetectSIGNAL_Sync; /* Do not longjmp while handling backend messages. */ switch(qtype) { @@ -892,6 +1238,8 @@ GTMProxy_ThreadMain(void *argp) * one round of messages and the GTM server should flush all the * pending responses after seeing this message. */ + Begin_DetectSIGNAL_Async; + if (gtmpqPutMsgStart('F', true, thrinfo->thr_gtm_conn) || gtmpqPutInt(MSG_DATA_FLUSH, sizeof (GTM_MessageType), thrinfo->thr_gtm_conn) || gtmpqPutMsgEnd(thrinfo->thr_gtm_conn)) @@ -902,9 +1250,15 @@ GTMProxy_ThreadMain(void *argp) */ gtmpqFlush(thrinfo->thr_gtm_conn); + End_DetectSIGNAL_Async; + /* * Read back the responses and put them on to the right backend * connection. + * + * We should not longjmp() while returning successful responses. + * In this case, we should also clear input commnad. TBD. + * We need this code in ProcessResponse(). */ gtm_foreach(elem, thrinfo->thr_processed_commands) { @@ -917,8 +1271,12 @@ GTMProxy_ThreadMain(void *argp) */ if (cmdinfo->ci_res_index == 0) { + Begin_DetectSIGNAL_Async; + if ((res = GTMPQgetResult(thrinfo->thr_gtm_conn)) == NULL) elog(ERROR, "GTMPQgetResult failed"); + + End_DetectSIGNAL_Async; } ProcessResponse(thrinfo, cmdinfo, res); @@ -928,7 +1286,7 @@ GTMProxy_ThreadMain(void *argp) thrinfo->thr_processed_commands = gtm_NIL; /* - * Now clean up disconnected connections + * Now clean up disconnected connections and backup commnads */ for (ii = 0; ii < thrinfo->thr_conn_count; ii++) { @@ -1221,6 +1579,12 @@ ProcessResponse(GTMProxy_ThreadInfo *thrinfo, GTMProxy_CommandInfo *cmdinfo, errmsg("invalid frontend message type %d", cmdinfo->ci_mtype))); } + /* + * Clean input backup + */ + pfree(thrinfo->thr_inBuf[cmdinfo->ci_conn->con_id]); + thrinfo->thr_inBuf[cmdinfo->ci_conn->con_id] = NULL; + thrinfo->thr_qtype[cmdinfo->ci_conn->con_id] = 0; /* Initialize with invalid qtype */ } /* ---------------- @@ -1234,17 +1598,36 @@ static int ReadCommand(GTMProxy_ConnectionInfo *conninfo, StringInfo inBuf) { int qtype; + int rv; + int connIdx = conninfo->con_id; + int anyBackup; + + anyBackup = (GetMyThreadInfo->thr_qtype[connIdx] ? TRUE : FALSE); /* * Get message type code from the frontend. 
*/ - qtype = pq_getbyte(conninfo->con_port); + if (anyBackup) + { + qtype = GetMyThreadInfo->thr_qtype[connIdx]; + } + else + { + qtype = pq_getbyte(conninfo->con_port); + GetMyThreadInfo->thr_any_backup[connIdx] = TRUE; + GetMyThreadInfo->thr_qtype[connIdx] = qtype; + } if (qtype == EOF) /* frontend disconnected */ { ereport(COMMERROR, (EPROTO, errmsg("unexpected EOF on client connection"))); + if (SIGUSR2DETECTED) + { + UNBLOCK_LONGJMP; + RECONNECT_LONGJMP; + } return qtype; } @@ -1283,7 +1666,17 @@ ReadCommand(GTMProxy_ConnectionInfo *conninfo, StringInfo inBuf) * after the type code; we can read the message contents independently of * the type. */ - if (pq_getmessage(conninfo->con_port, inBuf, 0)) + if (anyBackup) + { + copyStringInfo(inBuf, GetMyThreadInfo->thr_inBuf[connIdx]); + rv = 0; + } + else + { + rv = pq_getmessage(conninfo->con_port, inBuf, 0); + GetMyThreadInfo->thr_inBuf[connIdx] = dupStringInfo(inBuf); + } + if (rv) return EOF; /* suitable message already logged */ return qtype; @@ -1576,9 +1969,13 @@ GTMProxy_ProxyCommand(GTMProxy_ConnectionInfo *conninfo, GTM_Conn *gtm_conn, thrinfo->thr_processed_commands = gtm_lappend(thrinfo->thr_processed_commands, cmdinfo); /* Finish the message. */ + Begin_DetectSIGNAL_Async; + if (gtmpqPutMsgEnd(gtm_conn)) elog(ERROR, "Error finishing the message"); + End_DetectSIGNAL_Async; + return; } @@ -1649,9 +2046,13 @@ static void GTMProxy_ProxyPGXCNodeCommand(GTMProxy_ConnectionInfo *conninfo,GTM_ thrinfo->thr_processed_commands = gtm_lappend(thrinfo->thr_processed_commands, cmdinfo); /* Finish the message. */ + Begin_DetectSIGNAL_Async; + if (gtmpqPutMsgEnd(gtm_conn)) elog(ERROR, "Error finishing the message"); + End_DetectSIGNAL_Async; + return; } @@ -1752,6 +2153,8 @@ static void GTMProxy_HandleDisconnect(GTMProxy_ConnectionInfo *conninfo, GTM_Conn *gtm_conn) { GTM_ProxyMsgHeader proxyhdr; + sigjmp_buf backup_jmp_buf; + int longjmp_enabled = FALSE; /* Mark node as disconnected if it is a postmaster backend */ @@ -1778,9 +2181,16 @@ GTMProxy_HandleDisconnect(GTMProxy_ConnectionInfo *conninfo, GTM_Conn *gtm_conn) } /* Finish the message. */ + Begin_DetectSIGNAL_Async; + if (gtmpqPutMsgEnd(gtm_conn)) elog(ERROR, "Error finishing the message"); + End_DetectSIGNAL_Async; + + GetMyThreadInfo->can_longjmp = longjmp_enabled; + memcpy(&(GetMyThreadInfo->longjmp_env), &backup_jmp_buf, sizeof(sigjmp_buf)); + conninfo->con_disconnected = true; if (conninfo->con_port->sock > 0) StreamClose(conninfo->con_port->sock); @@ -1843,9 +2253,13 @@ GTMProxy_ProcessPendingCommands(GTMProxy_ThreadInfo *thrinfo) } /* Finish the message. */ + Begin_DetectSIGNAL_Async; + if (gtmpqPutMsgEnd(gtm_conn)) elog(ERROR, "Error finishing the message"); + End_DetectSIGNAL_Async; + /* * Move the entire list to the processed command */ @@ -1881,9 +2295,13 @@ GTMProxy_ProcessPendingCommands(GTMProxy_ThreadInfo *thrinfo) } /* Finish the message. */ + Begin_DetectSIGNAL_Async; + if (gtmpqPutMsgEnd(gtm_conn)) elog(ERROR, "Error finishing the message"); + End_DetectSIGNAL_Async; + /* * Move the entire list to the processed command */ @@ -1921,9 +2339,12 @@ GTMProxy_ProcessPendingCommands(GTMProxy_ThreadInfo *thrinfo) } /* Finish the message. */ + Begin_DetectSIGNAL_Async; + if (gtmpqPutMsgEnd(gtm_conn)) elog(ERROR, "Error finishing the message"); + End_DetectSIGNAL_Async; /* * Move the entire list to the processed command @@ -1960,9 +2381,13 @@ GTMProxy_ProcessPendingCommands(GTMProxy_ThreadInfo *thrinfo) } /* Finish the message. 
*/ + Begin_DetectSIGNAL_Async; + if (gtmpqPutMsgEnd(gtm_conn)) elog(ERROR, "Error finishing the message"); + End_DetectSIGNAL_Async; + /* * Move the entire list to the processed command */ diff --git a/src/gtm/proxy/proxy_thread.c b/src/gtm/proxy/proxy_thread.c index 4139936..b4fad91 100644 --- a/src/gtm/proxy/proxy_thread.c +++ b/src/gtm/proxy/proxy_thread.c @@ -27,6 +27,9 @@ GTMProxy_Threads *GTMProxyThreads = &GTMProxyThreadsData; #define GTM_PROXY_MAX_THREADS 1024 /* Max threads allowed in the GTMProxy */ #define GTMProxyThreadsFull (GTMProxyThreads->gt_thread_count == GTMProxyThreads->gt_array_size) +extern int GTMProxyWorkerThreads; +extern GTMProxy_ThreadInfo **Proxy_ThreadInfo; + /* * Add the given thrinfo structure to the global array, expanding it if * necessary @@ -126,7 +129,7 @@ GTMProxy_ThreadRemove(GTMProxy_ThreadInfo *thrinfo) * "startroutine". The thread information is returned to the calling process. */ GTMProxy_ThreadInfo * -GTMProxy_ThreadCreate(void *(* startroutine)(void *)) +GTMProxy_ThreadCreate(void *(* startroutine)(void *), int idx) { GTMProxy_ThreadInfo *thrinfo; int err; @@ -142,6 +145,11 @@ GTMProxy_ThreadCreate(void *(* startroutine)(void *)) GTM_CVInit(&thrinfo->thr_cv); /* + * Initialize communication area with SIGUSR2 signal handler (reconnect) + */ + Proxy_ThreadInfo[idx] = thrinfo; + + /* * The thread status is set to GTM_PROXY_THREAD_STARTING and will be changed by * the thread itself when it actually starts executing */ diff --git a/src/include/gtm/gtm_proxy.h b/src/include/gtm/gtm_proxy.h index 2af5ef3..8ddf54c 100644 --- a/src/include/gtm/gtm_proxy.h +++ b/src/include/gtm/gtm_proxy.h @@ -44,7 +44,7 @@ typedef struct GTMProxy_ConnectionInfo struct GTMProxy_ThreadInfo *con_thrinfo; bool con_authenticated; bool con_disconnected; - GTMProxy_ConnID con_id; + GTMProxy_ConnID con_id; /* Index for thr_all_conns */ GTM_MessageType con_pending_msg; GlobalTransactionId con_txid; @@ -110,11 +110,20 @@ typedef struct GTMProxy_ThreadInfo /* connection array */ GTMProxy_ConnectionInfo *thr_all_conns[GTM_PROXY_MAX_CONNECTIONS]; struct pollfd thr_poll_fds[GTM_PROXY_MAX_CONNECTIONS]; - gtm_List *thr_processed_commands; - gtm_List *thr_pending_commands[MSG_TYPE_COUNT]; + short thr_any_backup[GTM_PROXY_MAX_CONNECTIONS]; + int thr_qtype[GTM_PROXY_MAX_CONNECTIONS]; + StringInfo thr_inBuf[GTM_PROXY_MAX_CONNECTIONS]; + gtm_List *thr_processed_commands; + gtm_List *thr_pending_commands[MSG_TYPE_COUNT]; GTM_Conn *thr_gtm_conn; + /* Reconnect Info */ + int can_accept_SIGUSR2; + int reconnect_issued; + int can_longjmp; + sigjmp_buf longjmp_env; + } GTMProxy_ThreadInfo; typedef struct GTMProxy_Threads @@ -133,7 +142,7 @@ int GTMProxy_ThreadRemove(GTMProxy_ThreadInfo *thrinfo); int GTMProxy_ThreadJoin(GTMProxy_ThreadInfo *thrinfo); void GTMProxy_ThreadExit(void); -extern GTMProxy_ThreadInfo *GTMProxy_ThreadCreate(void *(* startroutine)(void *)); +extern GTMProxy_ThreadInfo *GTMProxy_ThreadCreate(void *(* startroutine)(void *), int idx); extern GTMProxy_ThreadInfo * GTMProxy_GetThreadInfo(GTM_ThreadID thrid); extern GTMProxy_ThreadInfo *GTMProxy_ThreadAddConnection(GTMProxy_ConnectionInfo *conninfo); extern int GTMProxy_ThreadRemoveConnection(GTMProxy_ThreadInfo *thrinfo, @@ -231,4 +240,17 @@ extern GTM_ThreadID TopMostThreadID; CritSectionCount--; \ } while(0) +/* + * Macros to control the behavior of the SIGUSR2 signal handler + */ +#define BLOCK_LONGJMP do{GetMyThreadInfo->can_longjmp = FALSE;}while(0) +#define UNBLOCK_LONGJMP do{GetMyThreadInfo->can_longjmp = TRUE;}while(0) 
+#define SIGUSR2DETECTED (GetMyThreadInfo->reconnect_issued) +#define RECONNECT_LONGJMP do{siglongjmp(GetMyThreadInfo->longjmp_env, 1);}while(0) +#define Begin_DetectSIGNAL_Async UNBLOCK_LONGJMP +#define End_DetectSIGNAL_Async do{BLOCK_LONGJMP; if(SIGUSR2DETECTED) RECONNECT_LONGJMP;}while(0) +#define DetectSIGNAL_Sync do{if(SIGUSR2DETECTED) RECONNECT_LONGJMP;}while(0) +#define Begin_GTM_CritSection BLOCK_LONGJMP +#define End_GTM_CritSection DetectSIGNAL_Sync + #endif diff --git a/src/include/gtm/libpq-be.h b/src/include/gtm/libpq-be.h index 8e9805f..dd42ba3 100644 --- a/src/include/gtm/libpq-be.h +++ b/src/include/gtm/libpq-be.h @@ -73,6 +73,13 @@ typedef struct Port int keepalives_idle; int keepalives_interval; int keepalives_count; + + /* + * Connection error option + */ + int connErr_waitOpt; + int connErr_waitSecs; + int connErr_waitCount; } Port; /* TCP keepalives configuration. These are no-ops on an AF_UNIX socket. */ diff --git a/src/include/gtm/libpq-int.h b/src/include/gtm/libpq-int.h index 30775e3..9cbdb3d 100644 --- a/src/include/gtm/libpq-int.h +++ b/src/include/gtm/libpq-int.h @@ -86,6 +86,14 @@ struct gtm_conn /* Buffer for receiving various parts of messages */ PQExpBufferData workBuffer; /* expansible string */ + /* + * Options to wait when the GTM connection is reset. + * We should wait for SIGUSR1/2 to reconnect, not just return an error. + */ + int gtmErrorWaitOpt; /* Whether we should wait. When TRUE, we assume XCM is available. */ + int gtmErrorWaitSecs; /* How long to wait when the connection is closed from the peer */ + int gtmErrorWaitCount; /* How many times to wait when the connection is closed from the peer */ + /* Pointer to the result of last operation */ GTM_Result *result; }; diff --git a/src/include/gtm/stringinfo.h b/src/include/gtm/stringinfo.h index d504685..9970fdb 100644 --- a/src/include/gtm/stringinfo.h +++ b/src/include/gtm/stringinfo.h @@ -146,4 +146,17 @@ extern void appendBinaryStringInfo(StringInfo str, */ extern void enlargeStringInfo(StringInfo str, int needed); +/*------------------------ + * dupStringInfo + * Get a new StringInfo and copy the original one to it. + * Please note that "cursor" is copied too. + */ +extern StringInfo dupStringInfo(StringInfo orig); + +/*------------------------ + * copyStringInfo + * Deep copy a StringInfo. The data part is copied as well. + */ +extern void copyStringInfo(StringInfo to, StringInfo from); + #endif /* STRINGINFO_H */ ----------------------------------------------------------------------- Summary of changes: src/gtm/common/gtm_serialize.c | 49 ++++- src/gtm/common/gtm_serialize_debug.c | 11 +- src/gtm/common/gtm_utils.c | 19 ++ src/gtm/common/stringinfo.c | 44 ++++- src/gtm/main/gtm_standby.c | 20 ++- src/gtm/proxy/proxy_main.c | 439 +++++++++++++++++++++++++++++++++- src/gtm/proxy/proxy_thread.c | 10 +- src/include/gtm/gtm_proxy.h | 30 ++- src/include/gtm/libpq-be.h | 7 + src/include/gtm/libpq-int.h | 8 + src/include/gtm/stringinfo.h | 13 + 11 files changed, 627 insertions(+), 23 deletions(-) hooks/post-receive -- Postgres-XC |
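The Begin_DetectSIGNAL_Async / End_DetectSIGNAL_Async pair above brackets blocking GTM I/O so that the SIGUSR2 (reconnect) handler may jump out of the worker loop only while such a call is in flight; everywhere else the handler merely records the request, which DetectSIGNAL_Sync honors at the next safe point. The following minimal sketch shows the same pattern in isolation. It is not the proxy's code, just a self-contained illustration: the flag and the window logic mirror the patch, but the program, its handler, and its loop are hypothetical (send it SIGUSR2 with kill -USR2 to exercise it).

#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static sigjmp_buf reconnect_env;
static volatile sig_atomic_t reconnect_issued = 0;
static volatile sig_atomic_t can_longjmp = 0;

/*
 * SIGUSR2 handler: always record the request; jump out immediately
 * only inside a declared "async window" around a blocking call.
 */
static void
sigusr2_handler(int signo)
{
	(void) signo;
	reconnect_issued = 1;
	if (can_longjmp)
		siglongjmp(reconnect_env, 1);
}

int
main(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = sigusr2_handler;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGUSR2, &sa, NULL);

	if (sigsetjmp(reconnect_env, 1) != 0)
	{
		/* Arrived via the handler: reset state, "reconnect", resume loop. */
		can_longjmp = 0;
		reconnect_issued = 0;
		printf("reconnect requested; reinitializing\n");
	}

	for (;;)
	{
		/* Sync detection point (cf. DetectSIGNAL_Sync). */
		if (reconnect_issued)
			siglongjmp(reconnect_env, 1);

		can_longjmp = 1;		/* cf. Begin_DetectSIGNAL_Async */
		sleep(1);				/* a blocking send/recv would go here */
		can_longjmp = 0;		/* cf. End_DetectSIGNAL_Async, which ... */
		if (reconnect_issued)	/* ... also re-checks the flag */
			siglongjmp(reconnect_env, 1);
	}
}

The key design point is that the jump is permitted only around calls that hold no locks and leave no partial state, which is exactly why the patch wraps each gtmpqPutMsgEnd/gtmpqFlush/GTMPQgetResult site individually.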
From: Pavan D. <pa...@us...> - 2011-05-04 11:18:07
|
Project "Postgres-XC". The branch, pgxc-barrier has been updated via 467b6dc1d5af91bcabbb2b17f640b4cf89e112f9 (commit) via 79bbbd3830261d1ad1f9ac0e0bd033bd2274b438 (commit) via 0e2f36787d7eb0eed9a71b5beb567d09133c15e8 (commit) via 1c63e1870b95ddad3fdee284a0744f31add39919 (commit) via 43edae4ce7bc8a2533e49778dbb567829ec45e4e (commit) via 90c5e49fd6b6469d6e878570ce9287e1b0978723 (commit) via 443c6a8362539d4beee8206012971cf8210502de (commit) via 2610315b335ac97ed88ff532c58239c3a8685206 (commit) via e12ca2d640d6c9d09aefeb05bbce1fba01e9549a (commit) via e734d31c84a0a557bdfb172ee097a4b46d3868a7 (commit) via 719a5c822021427f7fd0eda354842b4b6b7ac381 (commit) via 8d0c5128fe837f140e932f1b67c318fa2788d98d (commit) via de75b4603d9a13ed45a1b73d855c48a098fc7371 (commit) via 8671a6ea9e630d8f4f60126a79f084ada4eb6280 (commit) via 820571e184fb6ae4dd4e63f14724afab283112e2 (commit) via 5e7b541eb0ae25bc673523fb95c52880337ecd41 (commit) via 5d185a9f610af8043b6b528bc619151fea71ba9b (commit) via 8ab933b6a78a3f0eee922a2dd84afacbba342c09 (commit) via 0fd714273601a16501bb3599c60ad0aa46e1f839 (commit) via cac1e04b3eeedbe4d6240dbb00c1a326e65f7f97 (commit) via dfa58f3a944a7b1fe4c3f97281c9b1516f0b69a7 (commit) via 04202cbdd151776cb44eefac7eeecdf05467cca7 (commit) via 2764f7b5379e3eb66d038a74aa0031e29a0de72c (commit) via 3b9ec2e609dbe7d745f88e28d779e46d62abf3d8 (commit) via b17fd66a279122e8cb26fd6e7b557b212c26a10d (commit) via 48d58f472b2b3f6c1a15601d6ee413a0ae17ca3d (commit) via 2cd2aaa6dc83b46c296aa40331ef2d28243aea93 (commit) via ac25b331910b55892b3bfcecde7255471bd3622c (commit) via 1ca81ca5d9f7c7d257b87654e9c50a3913fe6cfe (commit) via bf5d482ef08bf914c4562fe3ce33fc7cc54666e6 (commit) via 175ff49e7d19a4205de0a200ba78c86a1fd46a6c (commit) via 7f0ec79b75ca3836569d6a60cedc55665d74ec20 (commit) via dc2762989aaf0946219444d81b12222ac3b553a0 (commit) via 988a22ba73575b836ea2842c17fc375305fdc266 (commit) via 0e849c14308e999ac79c9ba982487149d80fffd8 (commit) via 45c60e566d02f2f83d7e079d5f28a5526f177c50 (commit) via 690b8663292919c383c4e6c35e3df776ee346b45 (commit) via 1916490d851c62a8ed14807228c2fe5032a90cad (commit) via 92db090c02780f57caed42290d4e59895f41cff4 (commit) via 4d46aea62369eb13f8e7d874e1ab987021b827b1 (commit) via 17cebe041ac8f58d9accd0db678c5e1d5ebfcb14 (commit) via 2fc3bd26ba3e47ec9ccebbd6cac45cb418be7fd4 (commit) via 000557b15b4468cdf6ee1f9e5d8da977c42942ce (commit) via b53f7e7bdc5cc827e332f655c73ac7334e9a162d (commit) via e137a355b191645ddaf921b5cee2264252a76e8d (commit) via 43675e5f56beb98419514bd23997bc178e005a6f (commit) via 48986620f70d0220524e715061b76824ac9684c6 (commit) via 9a329381083f831459db12cdd6616552c4571c5b (commit) via 7ebac29189aaa205c18e21580ef88f98ac262c43 (commit) via 0fd3d4109912f1685bbd8ee657df2e92a4ec69f7 (commit) via 219e41d7324c86ceb537df13214774b70051a833 (commit) via 3e585734f4fa26b6356529f51d7a478871df0301 (commit) via 8b018fd20850ec0753fdfbef024b9a957efaeb0a (commit) via 913fba843d425a786196bcffc965e5ceea75e55d (commit) via 112ad257947a5cc60b7c598880d335a5b9a351c1 (commit) via e1946160fe64042e76b5252c66b6f6fb5da6b85d (commit) via 098a076729929a8ecf2c6eecd5cc6de63628e882 (commit) via 894ec20ad348f6230e1a796b7435d62f6867eab8 (commit) via f738280c29b0e50348a8b0940679595534220e24 (commit) via 91466a5b79ebd8aea0b6313704654b0a1b7818ac (commit) via 0429a889372c6db0b4b1253624601c42410cc108 (commit) via 124814afe8603b4ec933f6f7b0b646a2a36f53cd (commit) via baa0d2e4589347558b6fd894d431c1ff42151492 (commit) via b9c601ba5509e9c90f1d4d184a514ce47ed7397d (commit) via 
46f524e5a6430735c6549afbbe0ea61ab8a4cd49 (commit) via 886f9bbe99120dc751a7d9110521a7cd6cc884d9 (commit) via 82c7049c243ba0849d93b6f73e1b4b96a64f4d9c (commit) via f61d2df4e202e715877b676bbb194552588b43bd (commit) via 09be104bfe29525f3423b3b07a9f88b943c8845e (commit) via 75b80f6575e6c29dc07d2e8cd803a8c2cb309bde (commit) via 6654fe0c6fac62e0351b3a409d26861789e9906b (commit) via 4f1d506fd83393fbbe98797fa49e862b5b17f19e (commit) via 796bf6e4b113d356b7d725daa71d62835faf9c68 (commit) via ca673e251cd252aa997c9ac51b8f90d2788e8bac (commit) via 76f1d9cde238be026007d00a6af192b7f7ac4ce5 (commit) via 8f2f50bd72fd570de84518eaf39abbf464de7019 (commit) via c696488cee189743e90303c52954f9e4d791e18c (commit) via d12c17cc28433adedc44e63f5eb0f34b79c8523f (commit) via d4b2b6421ffb3d443a2f9bf923b9615f4c5752fb (commit) via 6fd26e495ab4f8503042d0cb92aa1fae1c9a4342 (commit) via d759db01c235ccce001e86fbc7e6c211d63cb489 (commit) via 829cc54f2997a28979812ea057b084a4c3d5e8e9 (commit) via c46b3fd73fb59f81f7887365c3399a1abd77f6b3 (commit) via 2b2cc8ef94673ee9d68c6b36cd940c98b0960273 (commit) via 6b7c2fee9839c195d5d19be2d1e8457b7125fbd7 (commit) via 0c8315711375436158ddf9cc5950582a864deb67 (commit) via 80576863fad916d25cb4b0c4af42d0ebcaac990a (commit) via edaff5cb25e0aa8c21b5406133c2cbea53ef7b0c (commit) via 4b03e05fb307a45134a3113c5364bafcf7d29bc6 (commit) via 29637b7f7d32ea593f9d2689ec44a0c12511a274 (commit) via aa88067b97ebc92c9ab0a9995ad95e1f988b8df0 (commit) via 725940e25699179ded9d518b756e87577c892e5a (commit) via 199cedb034caa11c185356ef473b442d67069d10 (commit) via 0ff5c459cb0ef20c04dc3675020c93812b2dd8dd (commit) via fc40359f8466b1b03b295a046246aa56e164eb19 (commit) via d5fae33cce224854b83d2e2c0da3e031a7c98bd7 (commit) via 4f6e09f20eca268e7282810c0d8ecbf83f15a43f (commit) via 286c8b5588c6159104f44f81f253400f20f0ceb9 (commit) via bcce6a86720fb9ad81aa8ac0d84355aa27654504 (commit) via 0e51d7efa87f0d0011cee8927118080514d96228 (commit) via 5e0c62f09c4584c9918dca06e5a5770a2e7bd472 (commit) via 9265ae6c81b0bf499ab010736125f7ffca812c0e (commit) via 894c0a4564c20ec339b6d9a098af8e19d439b81a (commit) via 75bff5c22b1d721af3ef88268c5c15624ed0b1d3 (commit) via e9b0fdc80dd5822b3b737fdd1c97891a8287c524 (commit) via 6bc68eefd4558dedc4228335f3f693727dbf47bc (commit) via 0bd6f5672708a47b72c76a642ce335f1d7663d52 (commit) via b8593ff58077e29c21c5f1ed50e622b0cf882b8c (commit) via 171d5e3b22a1b5b15ad09712fe829d8bf61e3d13 (commit) via 33b9889d7cc947a702f2d4ff7cb147abdd606666 (commit) via f8d599dc12fe06669f4aadc5dddda3fc2c1109b8 (commit) via 490c08dd37fa44af57cd3a2b3e931ef4f3a94853 (commit) via 55433e8be6a7b51568e4d727605b45e911c19f75 (commit) via 9e3f4ae484fd7d8403fe59fecc95d065c3c839bf (commit) via 0f0f1798b83b01d3323cd0a9e63e5069e4fb1d4c (commit) via 476f70fbc5292378da7d7e70283b2f44aa0d9d05 (commit) via 7aaee8d89ad4f9dd211042e65ecad68edbc1e9bc (commit) via c739272f73a860a1c1a07ea085c47e8d4e292420 (commit) via 0115aacecb33438da33b51e7761498de6a854ba3 (commit) via e62b497522fbf7b3ac361b3aef0e8ca23cdda675 (commit) via bce5904c777b305c7706146077c2a692e7dad52e (commit) via 2f96f59952551d1644c3176e5a23483308e9c810 (commit) via 861f5131ddadd10fcd88f3653862601cddf0b255 (commit) via 0461384764ef25c8248e8d22ec09a23d3cd608a3 (commit) via 95854dc24142a7ee751c05d658f52d84fc0649dc (commit) via e0df08e39ab62378dcb7d11e259c5e1474104e0f (commit) via 09a00e32319f4af6f2cab879ed1fae866164eca5 (commit) via baed9de74d766b401d088918e809dc9e51b313f7 (commit) via 565200f09a592715c10cbbec578c6b74a25e2bc2 (commit) via d6beeda6a68cce6f74d179e3520443be2a29bb4d 
(commit) via fe063aa6a39f12ce33d546f6a0730c67bee9bee8 (commit) via 6683a32a11ff2380ec743db49a52bc297acd14cb (commit) via aadbdfe18024455d5e18cf9007de5b8a3f5135dc (commit) via f9690c125d5657cdf8b4c2d9e441d4fe62edf223 (commit) via 60fd3944c0015203d62ae4bb49f209bc0a3b827b (commit) via 3af192ad45f2e21a74d032faa4612259606aa266 (commit) via ab949a7cb237c6966a19204b23aea6c931928961 (commit) via 11b8a7940017006e07f7448c9a42e20b11655ba0 (commit) via 49107f629e3628a89ec4848e167ef9c8e6d2a4e7 (commit) via 901cb86e90d8d286dd0ca3b4c337ed512b28021c (commit) via 4200394d48340e4a304f4f5f6ac3c4668445f6e4 (commit) via fa1747011efd3841353894a0500e31411b4f4efa (commit) via 637f46cda04fee2d9d4bd78c1c9f4f07373aece5 (commit) via 405874a1f0d9b6634349bcee5165cd7e305ba81a (commit) via 94b048a92d21bd787294cabd3a144cf69badd419 (commit) via acef8b6dde2c598a727e2eaf2cca5b66db0b592f (commit) via df5e5dc0ea901385e7be446ca6d9ebec7e58cb12 (commit) via 93781551114d9fe5826459caaf8aa03d63bac001 (commit) via fa427af446193741501c72e6e026ac493a125137 (commit) via a6a956e4d1303161750161c0629adb1b4514d345 (commit) via 33342d8d44bb2c36239712d9e7318df0776ce018 (commit) via 9ecde28ad6b6ed00b7588928480bf0bd8a741421 (commit) via 062ecec9ab0e36f17f31624307470b8662dcfd80 (commit) via 7eb3c73d3750b900c2a8d1ce446c97f2db425ca1 (commit) from 1e74fe24bd50b8e1c8244e2e8f999e10964484dd (commit) - Log ----------------------------------------------------------------- commit 467b6dc1d5af91bcabbb2b17f640b4cf89e112f9 Author: Pavan Deolasee <pav...@gm...> Date: Wed May 4 16:43:02 2011 +0530 WAL log the barrier creation activity on the local coordinator. Also fix some of the bugs in the recovery code. This code was not tested previously and there were some changes after the 9.0 merge diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index df603a6..20489ce 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -4371,6 +4371,14 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI, xlogfname, recoveryStopAfter ? "after" : "before", timestamptz_to_str(recoveryStopTime)); + else if (recoveryTarget == RECOVERY_TARGET_BARRIER) + snprintf(buffer, sizeof(buffer), + "%s%u\t%s\t%s %s\n", + (srcfd < 0) ? "" : "\n", + parentTLI, + xlogfname, + recoveryStopAfter ? 
"after" : "before", + recoveryTargetBarrierId); else snprintf(buffer, sizeof(buffer), "%s%u\t%s\tno recovery target specified\n", @@ -5246,7 +5254,7 @@ readRecoveryCommandFile(void) #ifdef PGXC else if (strcmp(tok1, "recovery_barrier_id") == 0) { - recoveryTarget = true; + recoveryTarget = RECOVERY_TARGET_BARRIER; recoveryTargetBarrierId = pstrdup(tok2); } #endif @@ -5467,7 +5475,7 @@ recoveryStopsHere(XLogRecord *record, bool *includeThis) { bool stopsHere; #ifdef PGXC - bool stopsAtThisBarrier; + bool stopsAtThisBarrier = false; char *recordBarrierId; #endif uint8 record_info; @@ -5481,25 +5489,34 @@ recoveryStopsHere(XLogRecord *record, bool *includeThis) if (record->xl_rmid != RM_XACT_ID) #endif return false; + record_info = record->xl_info & ~XLR_INFO_MASK; - if (record_info == XLOG_XACT_COMMIT) + if (record->xl_rmid == RM_XACT_ID) { - xl_xact_commit *recordXactCommitData; + if (record_info == XLOG_XACT_COMMIT) + { + xl_xact_commit *recordXactCommitData; - recordXactCommitData = (xl_xact_commit *) XLogRecGetData(record); - recordXtime = recordXactCommitData->xact_time; - } - else if (record_info == XLOG_XACT_ABORT) - { - xl_xact_abort *recordXactAbortData; + recordXactCommitData = (xl_xact_commit *) XLogRecGetData(record); + recordXtime = recordXactCommitData->xact_time; + } + else if (record_info == XLOG_XACT_ABORT) + { + xl_xact_abort *recordXactAbortData; - recordXactAbortData = (xl_xact_abort *) XLogRecGetData(record); - recordXtime = recordXactAbortData->xact_time; + recordXactAbortData = (xl_xact_abort *) XLogRecGetData(record); + recordXtime = recordXactAbortData->xact_time; + } } #ifdef PGXC - else if (record_info == XLOG_BARRIER_CREATE) + else if (record->xl_rmid == RM_BARRIER_ID) { - recordBarrierId = (char *) XLogRecGetData(record); + if (record_info == XLOG_BARRIER_CREATE) + { + recordBarrierId = (char *) XLogRecGetData(record); + ereport(DEBUG2, + (errmsg("processing barrier xlog record for %s", recordBarrierId))); + } } #endif else @@ -5528,8 +5545,14 @@ recoveryStopsHere(XLogRecord *record, bool *includeThis) *includeThis = recoveryTargetInclusive; } #ifdef PGXC - else if (recoveryTargetBarrierId) + else if (recoveryTarget == RECOVERY_TARGET_BARRIER) { + if ((record->xl_rmid != RM_BARRIER_ID) || + (record_info != XLOG_BARRIER_CREATE)) + return false; + + ereport(DEBUG2, + (errmsg("checking if barrier record matches the target barrier"))); if (strcmp(recoveryTargetBarrierId, recordBarrierId) == 0) stopsAtThisBarrier = true; } @@ -5857,6 +5880,10 @@ StartupXLOG(void) ereport(LOG, (errmsg("starting point-in-time recovery to %s", timestamptz_to_str(recoveryTargetTime)))); + else if (recoveryTarget == RECOVERY_TARGET_BARRIER) + ereport(LOG, + (errmsg("starting point-in-time recovery to barrier %s", + (recoveryTargetBarrierId)))); else ereport(LOG, (errmsg("starting archive recovery"))); diff --git a/src/backend/pgxc/barrier/barrier.c b/src/backend/pgxc/barrier/barrier.c index 3e1d7cc..1b44f36 100644 --- a/src/backend/pgxc/barrier/barrier.c +++ b/src/backend/pgxc/barrier/barrier.c @@ -414,6 +414,18 @@ ExecuteBarrier(const char *id) /* * Also WAL log the BARRIER locally and flush the WAL buffers to disk */ + { + XLogRecData rdata[1]; + XLogRecPtr recptr; + + rdata[0].data = (char *) id; + rdata[0].len = strlen(id) + 1; + rdata[0].buffer = InvalidBuffer; + rdata[0].next = NULL; + + recptr = XLogInsert(RM_BARRIER_ID, XLOG_BARRIER_CREATE, rdata); + XLogFlush(recptr); + } } /* diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h index 44badd2..b152c36 100644 --- 
a/src/include/access/xlog.h +++ b/src/include/access/xlog.h @@ -184,7 +184,8 @@ typedef enum { RECOVERY_TARGET_UNSET, RECOVERY_TARGET_XID, - RECOVERY_TARGET_TIME + RECOVERY_TARGET_TIME, + RECOVERY_TARGET_BARRIER } RecoveryTargetType; extern XLogRecPtr XactLastRecEnd; commit 79bbbd3830261d1ad1f9ac0e0bd033bd2274b438 Author: Pavan Deolasee <pav...@gm...> Date: Tue Apr 26 19:54:09 2011 +0530 Rearrange the 2PC commit code so that we can commit the local transaction after releasing the barrier lock. diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 6cc9aa3..df603a6 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -185,7 +185,6 @@ static RecoveryTargetType recoveryTarget = RECOVERY_TARGET_UNSET; static bool recoveryTargetInclusive = true; static TransactionId recoveryTargetXid; static TimestampTz recoveryTargetTime; -static TimestampTz recoveryLastXTime = 0; static char *recoveryTargetBarrierId; /* options taken from recovery.conf for XLOG streaming */ diff --git a/src/backend/pgxc/pool/execRemote.c b/src/backend/pgxc/pool/execRemote.c index d9a18b4..0af1288 100644 --- a/src/backend/pgxc/pool/execRemote.c +++ b/src/backend/pgxc/pool/execRemote.c @@ -1977,7 +1977,7 @@ finish: * This avoids any additional interaction with GTM when making a 2PC transaction. */ void -PGXCNodeCommitPrepared(char *gid, bool isTopLevel) +PGXCNodeCommitPrepared(char *gid) { int res = 0; int res_gtm = 0; @@ -2073,17 +2073,11 @@ finish: * If remote connection is a Coordinator type, the commit prepared has to be done locally * if and only if the Coordinator number was in the node list received from GTM. */ - if (operation_local || IsConnFromCoord()) - { - PreventTransactionChain(isTopLevel, "COMMIT PREPARED"); + if (operation_local) FinishPreparedTransaction(gid, true); - } - /* - * Release the barrier lock now so that pending barriers can get moving - */ LWLockRelease(BarrierLock); - return; + return; } /* @@ -2128,9 +2122,11 @@ finish: /* * Rollback prepared transaction on Datanodes involved in the current transaction + * + * Return whether or not a local operation is required. */ -void -PGXCNodeRollbackPrepared(char *gid, bool isTopLevel) +bool +PGXCNodeRollbackPrepared(char *gid) { int res = 0; int res_gtm = 0; @@ -2205,17 +2201,7 @@ finish: (errcode(ERRCODE_INTERNAL_ERROR), errmsg("Could not rollback prepared transaction on Datanodes"))); - /* - * Local coordinator rollbacks if involved in PREPARE - * If remote connection is a Coordinator type, the commit prepared has to be done locally also. - * This works for both Datanodes and Coordinators. - */ - if (operation_local || IsConnFromCoord()) - { - PreventTransactionChain(isTopLevel, "ROLLBACK PREPARED"); - FinishPreparedTransaction(gid, false); - } - return; + return operation_local; } diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c index 9bb7f3d..7c9af6d 100644 --- a/src/backend/tcop/utility.c +++ b/src/backend/tcop/utility.c @@ -59,6 +59,7 @@ #ifdef PGXC #include "pgxc/barrier.h" +#include "pgxc/execRemote.h" #include "pgxc/locator.h" #include "pgxc/pgxc.h" #include "pgxc/planner.h" @@ -456,32 +457,58 @@ standard_ProcessUtility(Node *parsetree, break; case TRANS_STMT_COMMIT_PREPARED: + PreventTransactionChain(isTopLevel, "COMMIT PREPARED"); + PreventCommandDuringRecovery("COMMIT PREPARED"); #ifdef PGXC /* * If a COMMIT PREPARED message is received from another Coordinator, * Don't send it down to Datanodes. 
+ * + * XXX We call FinishPreparedTransaction inside + * PGXCNodeCommitPrepared if we are doing a local + * operation. This is convenient because we want to + * hold on to the BarrierLock until the local transaction + * is committed too. + * */ if (IS_PGXC_COORDINATOR && !IsConnFromCoord()) - PGXCNodeCommitPrepared(stmt->gid, isTopLevel); -#else - PreventTransactionChain(isTopLevel, "COMMIT PREPARED"); - PreventCommandDuringRecovery("COMMIT PREPARED"); + PGXCNodeCommitPrepared(stmt->gid); + else if (IsConnFromCoord()) + { + /* + * A local Coordinator always commits if involved in Prepare. + * 2PC file is created and flushed if a DDL has been involved in the transaction. + * If remote connection is a Coordinator type, the commit prepared has to be done locally + * if and only if the Coordinator number was in the node list received from GTM. + */ +#endif FinishPreparedTransaction(stmt->gid, true); +#ifdef PGXC + } #endif break; case TRANS_STMT_ROLLBACK_PREPARED: + PreventTransactionChain(isTopLevel, "ROLLBACK PREPARED"); + PreventCommandDuringRecovery("ROLLBACK PREPARED"); #ifdef PGXC /* * If a ROLLBACK PREPARED message is received from another Coordinator, * Don't send it down to Datanodes. */ if (IS_PGXC_COORDINATOR && !IsConnFromCoord()) - PGXCNodeRollbackPrepared(stmt->gid, isTopLevel); -#else - PreventTransactionChain(isTopLevel, "ROLLBACK PREPARED"); - FinishPreparedTransaction(gid, false); - PreventCommandDuringRecovery("ROLLBACK PREPARED"); + operation_local = PGXCNodeRollbackPrepared(stmt->gid); + /* + * Local coordinator rolls back if involved in PREPARE + * If remote connection is a Coordinator type, the commit prepared has to be done locally also. + * This works for both Datanodes and Coordinators. + */ + if (operation_local || IsConnFromCoord()) + { +#endif + FinishPreparedTransaction(stmt->gid, false); +#ifdef PGXC + } #endif break; diff --git a/src/include/pgxc/execRemote.h b/src/include/pgxc/execRemote.h index 004984a..630ece9 100644 --- a/src/include/pgxc/execRemote.h +++ b/src/include/pgxc/execRemote.h @@ -121,8 +121,8 @@ extern void PGXCNodeBegin(void); extern void PGXCNodeCommit(bool bReleaseHandles); extern int PGXCNodeRollback(void); extern bool PGXCNodePrepare(char *gid); -extern void PGXCNodeRollbackPrepared(char *gid, bool isTopLevel); -extern void PGXCNodeCommitPrepared(char *gid, bool isTopLevel); +extern bool PGXCNodeRollbackPrepared(char *gid); +extern void PGXCNodeCommitPrepared(char *gid); extern bool PGXCNodeIsImplicit2PC(bool *prepare_local_coord); extern int PGXCNodeImplicitPrepare(GlobalTransactionId prepare_xid, char *gid); extern void PGXCNodeImplicitCommitPrepared(GlobalTransactionId prepare_xid, commit 0e2f36787d7eb0eed9a71b5beb567d09133c15e8 Merge: 1e74fe2 1c63e18 Author: Pavan Deolasee <pav...@gm...> Date: Mon Apr 25 17:22:01 2011 +0530 Merge branch 'PGXC-master' into pgxc-barrier Conflicts: src/backend/access/transam/xlog.c src/backend/parser/gram.y src/backend/pgxc/pool/execRemote.c src/backend/tcop/utility.c diff --cc src/backend/access/transam/rmgr.c index 70f2e42,c706e97..1b9998d --- a/src/backend/access/transam/rmgr.c +++ b/src/backend/access/transam/rmgr.c @@@ -20,11 -20,11 +20,13 @@@ #include "commands/dbcommands.h" #include "commands/sequence.h" #include "commands/tablespace.h" +#ifdef PGXC +#include "pgxc/barrier.h" +#endif #include "storage/freespace.h" + #include "storage/standby.h" + #include "utils/relmapper.h" - const RmgrData RmgrTable[RM_MAX_ID + 1] = { {"XLOG", xlog_redo, xlog_desc, NULL, NULL, NULL}, {"Transaction", xact_redo, xact_desc, 
NULL, NULL, NULL}, diff --cc src/backend/access/transam/xlog.c index 1f3b218,cf5dc74..6cc9aa3 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@@ -38,9 -39,10 +39,11 @@@ #include "funcapi.h" #include "libpq/pqsignal.h" #include "miscadmin.h" +#include "pgxc/barrier.h" #include "pgstat.h" #include "postmaster/bgwriter.h" + #include "replication/walreceiver.h" + #include "replication/walsender.h" #include "storage/bufmgr.h" #include "storage/fd.h" #include "storage/ipc.h" @@@ -166,9 -184,12 +185,14 @@@ static RecoveryTargetType recoveryTarge static bool recoveryTargetInclusive = true; static TransactionId recoveryTargetXid; static TimestampTz recoveryTargetTime; +static TimestampTz recoveryLastXTime = 0; +static char *recoveryTargetBarrierId; + /* options taken from recovery.conf for XLOG streaming */ + static bool StandbyMode = false; + static char *PrimaryConnInfo = NULL; + static char *TriggerFile = NULL; + /* if recoveryStopsHere returns true, it saves actual stop xid/time here */ static TransactionId recoveryStopXid; static TimestampTz recoveryStopTime; @@@ -4925,17 -5237,33 +5240,40 @@@ readRecoveryCommandFile(void if (!parse_bool(tok2, &recoveryTargetInclusive)) ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), - errmsg("parameter \"recovery_target_inclusive\" requires a Boolean value"))); - ereport(LOG, + errmsg("parameter \"%s\" requires a Boolean value", "recovery_target_inclusive"))); + ereport(DEBUG2, (errmsg("recovery_target_inclusive = %s", tok2))); } +#ifdef PGXC + else if (strcmp(tok1, "recovery_barrier_id") == 0) + { + recoveryTarget = true; + recoveryTargetBarrierId = pstrdup(tok2); + } +#endif + else if (strcmp(tok1, "standby_mode") == 0) + { + if (!parse_bool(tok2, &StandbyMode)) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("parameter \"%s\" requires a Boolean value", "standby_mode"))); + ereport(DEBUG2, + (errmsg("standby_mode = '%s'", tok2))); + } + else if (strcmp(tok1, "primary_conninfo") == 0) + { + PrimaryConnInfo = pstrdup(tok2); + ereport(DEBUG2, + (errmsg("primary_conninfo = '%s'", + PrimaryConnInfo))); + } + else if (strcmp(tok1, "trigger_file") == 0) + { + TriggerFile = pstrdup(tok2); + ereport(DEBUG2, + (errmsg("trigger_file = '%s'", + TriggerFile))); + } else ereport(FATAL, (errmsg("unrecognized recovery parameter \"%s\"", @@@ -5235,21 -5552,10 +5584,21 @@@ recoveryStopsHere(XLogRecord *record, b } if (recoveryStopAfter) - recoveryLastXTime = recordXtime; + SetLatestXTime(recordXtime); } +#ifdef PGXC + else if (stopsAtThisBarrier) + { + recoveryStopTime = recordXtime; + ereport(LOG, + (errmsg("recovery stopping at barrier %s, time %s", + recoveryTargetBarrierId, + timestamptz_to_str(recoveryStopTime)))); + return true; + } +#endif else - recoveryLastXTime = recordXtime; + SetLatestXTime(recordXtime); return stopsHere; } diff --cc src/backend/parser/gram.y index 6df0582,1b43d6e..d412a10 --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@@ -422,10 -448,9 +449,10 @@@ static TypeName *TableFuncTypeName(Lis %type <list> window_clause window_definition_list opt_partition_clause %type <windef> window_definition over_clause window_specification + opt_frame_clause frame_extent frame_bound %type <str> opt_existing_window_name - %type <ival> opt_frame_clause frame_extent frame_bound /* PGXC_BEGIN */ +%type <str> opt_barrier_id %type <distby> OptDistributeBy /* PGXC_END */ @@@ -438,7 -476,7 +478,8 @@@ */ /* ordinary key words in alphabetical order */ - /* PGXC - added REPLICATION, 
DISTRIBUTE, MODULO, BARRIER and HASH */ -/* PGXC - added DISTRIBUTE, DIRECT, HASH, REPLICATION, ROUND ROBIN, COORDINATOR, CLEAN, MODULO, NODE */ ++/* PGXC - added DISTRIBUTE, DIRECT, HASH, REPLICATION, ROUND ROBIN, ++ * COORDINATOR, CLEAN, MODULO, NODE, BARRIER */ %token <keyword> ABORT_P ABSOLUTE_P ACCESS ACTION ADD_P ADMIN AFTER AGGREGATE ALL ALSO ALTER ALWAYS ANALYSE ANALYZE AND ANY ARRAY AS ASC ASSERTION ASSIGNMENT ASYMMETRIC AT AUTHORIZATION @@@ -638,9 -694,10 +697,11 @@@ stmt | AlterUserSetStmt | AlterUserStmt | AnalyzeStmt + | BarrierStmt | CheckPointStmt + /* PGXC_BEGIN */ | CleanConnStmt + /* PGXC_END*/ | ClosePortalStmt | ClusterStmt | CommentStmt @@@ -10314,7 -11014,7 +11040,8 @@@ ColLabel: IDENT { $$ = $1; /* "Unreserved" keywords --- available for use as any kind of name. */ - /* PGXC - added DISTRIBUTE, HASH, REPLICATION, MODULO, BARRIER */ -/* PGXC - added DISTRIBUTE, DIRECT, HASH, REPLICATION, ROUND ROBIN, COORDINATOR, CLEAN, MODULO, NODE */ ++/* PGXC - added DISTRIBUTE, DIRECT, HASH, REPLICATION, ROUND ROBIN, ++ * COORDINATOR, CLEAN, MODULO, NODE, BARRIER */ unreserved_keyword: ABORT_P | ABSOLUTE_P diff --cc src/backend/tcop/utility.c index 6b3c138,e4f33c5..9bb7f3d --- a/src/backend/tcop/utility.c +++ b/src/backend/tcop/utility.c @@@ -400,10 -461,23 +462,11 @@@ standard_ProcessUtility(Node *parsetree * Don't send it down to Datanodes. */ if (IS_PGXC_COORDINATOR && !IsConnFromCoord()) - operation_local = PGXCNodeCommitPrepared(stmt->gid); -#endif + PGXCNodeCommitPrepared(stmt->gid, isTopLevel); +#else PreventTransactionChain(isTopLevel, "COMMIT PREPARED"); + PreventCommandDuringRecovery("COMMIT PREPARED"); -#ifdef PGXC - /* - * A local Coordinator always commits if involved in Prepare. - * 2PC file is created and flushed if a DDL has been involved in the transaction. - * If remote connection is a Coordinator type, the commit prepared has to be done locally - * if and only if the Coordinator number was in the node list received from GTM. - */ - if (operation_local || IsConnFromCoord()) - { -#endif FinishPreparedTransaction(stmt->gid, true); -#ifdef PGXC - } #endif break; @@@ -414,10 -488,22 +477,11 @@@ * Don't send it down to Datanodes. */ if (IS_PGXC_COORDINATOR && !IsConnFromCoord()) - operation_local = PGXCNodeRollbackPrepared(stmt->gid); -#endif + PGXCNodeRollbackPrepared(stmt->gid, isTopLevel); +#else PreventTransactionChain(isTopLevel, "ROLLBACK PREPARED"); + FinishPreparedTransaction(gid, false); + PreventCommandDuringRecovery("ROLLBACK PREPARED"); -#ifdef PGXC - /* - * Local coordinator rollbacks if involved in PREPARE - * If remote connection is a Coordinator type, the commit prepared has to be done locally also. - * This works for both Datanodes and Coordinators. - */ - if (operation_local || IsConnFromCoord()) - { -#endif - FinishPreparedTransaction(stmt->gid, false); -#ifdef PGXC - } #endif break; diff --cc src/include/nodes/parsenodes.h index 7398649,c571667..6ebf30b --- a/src/include/nodes/parsenodes.h +++ b/src/include/nodes/parsenodes.h @@@ -2201,21 -2293,12 +2293,25 @@@ typedef struct VacuumStm List *va_cols; /* list of column names, or NIL for all */ } VacuumStmt; +#ifdef PGXC +/* + * ---------------------- + * Barrier Statement + */ +typedef struct BarrierStmt +{ + NodeTag type; + const char *id; /* User supplied barrier id, if any */ +} BarrierStmt; + +#endif + /* ---------------------- * Explain Statement + * + * The "query" field is either a raw parse tree (SelectStmt, InsertStmt, etc) + * or a Query node if parse analysis has been done. 
Note that rewriting and + * planning of the query are always postponed until execution of EXPLAIN. * ---------------------- */ typedef struct ExplainStmt diff --cc src/include/parser/kwlist.h index 6993eee,c740911..124eca5 --- a/src/include/parser/kwlist.h +++ b/src/include/parser/kwlist.h @@@ -52,12 -52,9 +52,12 @@@ PG_KEYWORD("asymmetric", ASYMMETRIC, RE PG_KEYWORD("at", AT, UNRESERVED_KEYWORD) PG_KEYWORD("authorization", AUTHORIZATION, TYPE_FUNC_NAME_KEYWORD) PG_KEYWORD("backward", BACKWARD, UNRESERVED_KEYWORD) +#ifdef PGXC +PG_KEYWORD("barrier", BARRIER, UNRESERVED_KEYWORD) +#endif PG_KEYWORD("before", BEFORE, UNRESERVED_KEYWORD) PG_KEYWORD("begin", BEGIN_P, UNRESERVED_KEYWORD) - PG_KEYWORD("between", BETWEEN, TYPE_FUNC_NAME_KEYWORD) + PG_KEYWORD("between", BETWEEN, COL_NAME_KEYWORD) PG_KEYWORD("bigint", BIGINT, COL_NAME_KEYWORD) PG_KEYWORD("binary", BINARY, TYPE_FUNC_NAME_KEYWORD) PG_KEYWORD("bit", BIT, COL_NAME_KEYWORD) diff --cc src/include/pgxc/execRemote.h index a50c40c,c2cd922..004984a --- a/src/include/pgxc/execRemote.h +++ b/src/include/pgxc/execRemote.h @@@ -117,11 -117,11 +118,11 @@@ typedef struct RemoteQueryStat /* Multinode Executor */ extern void PGXCNodeBegin(void); - extern void PGXCNodeCommit(void); + extern void PGXCNodeCommit(bool bReleaseHandles); extern int PGXCNodeRollback(void); extern bool PGXCNodePrepare(char *gid); -extern bool PGXCNodeRollbackPrepared(char *gid); -extern bool PGXCNodeCommitPrepared(char *gid); +extern void PGXCNodeRollbackPrepared(char *gid, bool isTopLevel); +extern void PGXCNodeCommitPrepared(char *gid, bool isTopLevel); extern bool PGXCNodeIsImplicit2PC(bool *prepare_local_coord); extern int PGXCNodeImplicitPrepare(GlobalTransactionId prepare_xid, char *gid); extern void PGXCNodeImplicitCommitPrepared(GlobalTransactionId prepare_xid, diff --cc src/include/storage/lwlock.h index 398ebd9,22fb5a7..997d86a --- a/src/include/storage/lwlock.h +++ b/src/include/storage/lwlock.h @@@ -67,9 -67,11 +67,12 @@@ typedef enum LWLockI AutovacuumLock, AutovacuumScheduleLock, SyncScanLock, + RelationMappingLock, + AsyncCtlLock, + AsyncQueueLock, #ifdef PGXC AnalyzeProcArrayLock, + BarrierLock, #endif /* Individual lock IDs end here */ FirstBufMappingLock, ----------------------------------------------------------------------- Summary of changes: .gitignore | 24 + COPYRIGHT | 18 +- GNUmakefile.in | 89 +- Makefile | 2 +- README.git | 14 + aclocal.m4 | 2 +- config/Makefile | 7 +- config/ac_func_accept_argtypes.m4 | 5 +- config/c-compiler.m4 | 27 +- config/c-library.m4 | 28 +- config/config.guess | 215 +- config/config.sub | 100 +- config/docbook.m4 | 31 +- config/general.m4 | 19 +- config/install-sh | 528 +- config/missing | 2 +- config/mkinstalldirs | 152 - config/perl.m4 | 35 +- config/prep_buildtree | 11 +- config/programs.m4 | 42 +- config/python.m4 | 8 +- config/tcl.m4 | 2 +- configure |13506 ++++++----- configure.in | 138 +- contrib/Makefile | 21 +- contrib/README | 12 + contrib/adminpack/.gitignore | 1 + contrib/adminpack/Makefile | 2 +- contrib/adminpack/adminpack.c | 4 +- contrib/adminpack/adminpack.sql.in | 2 +- contrib/adminpack/uninstall_adminpack.sql | 2 +- contrib/auto_explain/Makefile | 2 +- contrib/auto_explain/auto_explain.c | 74 +- contrib/btree_gin/.gitignore | 3 + contrib/btree_gin/Makefile | 2 +- contrib/btree_gin/btree_gin.c | 9 +- contrib/btree_gin/btree_gin.sql.in | 2 +- contrib/btree_gin/expected/bytea.out | 2 + contrib/btree_gin/sql/bytea.sql | 2 + contrib/btree_gin/uninstall_btree_gin.sql | 2 +- contrib/btree_gist/.gitignore | 3 + 
contrib/btree_gist/Makefile | 2 +- contrib/btree_gist/btree_bit.c | 4 +- contrib/btree_gist/btree_bytea.c | 4 +- contrib/btree_gist/btree_cash.c | 17 +- contrib/btree_gist/btree_date.c | 16 +- contrib/btree_gist/btree_float4.c | 17 +- contrib/btree_gist/btree_float8.c | 17 +- contrib/btree_gist/btree_gist.c | 2 +- contrib/btree_gist/btree_gist.h | 2 +- contrib/btree_gist/btree_gist.sql.in | 2 +- contrib/btree_gist/btree_inet.c | 17 +- contrib/btree_gist/btree_int2.c | 17 +- contrib/btree_gist/btree_int4.c | 17 +- contrib/btree_gist/btree_int8.c | 17 +- contrib/btree_gist/btree_interval.c | 17 +- contrib/btree_gist/btree_macaddr.c | 18 +- contrib/btree_gist/btree_numeric.c | 2 +- contrib/btree_gist/btree_oid.c | 17 +- contrib/btree_gist/btree_text.c | 2 +- contrib/btree_gist/btree_time.c | 16 +- contrib/btree_gist/btree_ts.c | 16 +- contrib/btree_gist/btree_utils_num.c | 2 +- contrib/btree_gist/btree_utils_num.h | 2 +- contrib/btree_gist/btree_utils_var.c | 9 +- contrib/btree_gist/btree_utils_var.h | 2 +- contrib/btree_gist/uninstall_btree_gist.sql | 2 +- contrib/chkpass/.gitignore | 1 + contrib/chkpass/Makefile | 2 +- contrib/chkpass/chkpass.c | 2 +- contrib/chkpass/chkpass.sql.in | 3 +- contrib/chkpass/uninstall_chkpass.sql | 2 +- contrib/citext/.gitignore | 3 + contrib/citext/Makefile | 2 +- contrib/citext/citext.c | 2 +- contrib/citext/citext.sql.in | 2 +- contrib/citext/expected/citext.out | 15 +- contrib/citext/expected/citext_1.out | 15 +- contrib/citext/sql/citext.sql | 3 - contrib/citext/uninstall_citext.sql | 2 +- contrib/contrib-global.mk | 2 +- contrib/cube/.cvsignore | 2 - contrib/cube/.gitignore | 5 + contrib/cube/CHANGES | 2 +- contrib/cube/Makefile | 12 +- contrib/cube/cube.c | 4 +- contrib/cube/cube.sql.in | 2 +- contrib/cube/cubedata.h | 2 +- contrib/cube/cubeparse.y | 2 +- contrib/cube/cubescan.l | 2 +- contrib/cube/uninstall_cube.sql | 2 +- contrib/dblink/.gitignore | 3 + contrib/dblink/Makefile | 2 +- contrib/dblink/dblink.c | 1152 +- contrib/dblink/dblink.h | 5 +- contrib/dblink/dblink.sql.in | 21 +- contrib/dblink/expected/dblink.out | 94 +- contrib/dblink/sql/dblink.sql | 43 + contrib/dblink/uninstall_dblink.sql | 10 +- contrib/dict_int/.gitignore | 3 + contrib/dict_int/Makefile | 2 +- contrib/dict_int/dict_int.c | 4 +- contrib/dict_int/dict_int.sql.in | 2 +- contrib/dict_int/uninstall_dict_int.sql | 2 +- contrib/dict_xsyn/.gitignore | 3 + contrib/dict_xsyn/Makefile | 2 +- contrib/dict_xsyn/dict_xsyn.c | 114 +- contrib/dict_xsyn/dict_xsyn.sql.in | 2 +- contrib/dict_xsyn/expected/dict_xsyn.out | 130 +- contrib/dict_xsyn/sql/dict_xsyn.sql | 41 +- contrib/dict_xsyn/uninstall_dict_xsyn.sql | 2 +- contrib/earthdistance/.gitignore | 3 + contrib/earthdistance/Makefile | 4 +- contrib/earthdistance/earthdistance.c | 2 +- contrib/earthdistance/earthdistance.sql.in | 2 +- contrib/earthdistance/uninstall_earthdistance.sql | 2 +- contrib/fuzzystrmatch/.gitignore | 1 + contrib/fuzzystrmatch/Makefile | 2 +- contrib/fuzzystrmatch/dmetaphone.c | 6 +- contrib/fuzzystrmatch/fuzzystrmatch.c | 4 +- contrib/fuzzystrmatch/fuzzystrmatch.sql.in | 2 +- contrib/fuzzystrmatch/uninstall_fuzzystrmatch.sql | 2 +- contrib/hstore/.gitignore | 3 + contrib/hstore/Makefile | 5 +- contrib/hstore/crc32.c | 2 +- contrib/hstore/crc32.h | 2 +- contrib/hstore/expected/hstore.out | 798 +- contrib/hstore/hstore.h | 188 +- contrib/hstore/hstore.sql.in | 326 +- contrib/hstore/hstore_compat.c | 369 + contrib/hstore/hstore_gin.c | 116 +- contrib/hstore/hstore_gist.c | 117 +- contrib/hstore/hstore_io.c | 847 +- 
contrib/hstore/hstore_op.c | 1288 +- contrib/hstore/sql/hstore.sql | 186 + contrib/hstore/uninstall_hstore.sql | 58 +- contrib/intagg/Makefile | 6 +- contrib/intagg/int_aggregate.sql | 2 +- contrib/intagg/uninstall_int_aggregate.sql | 2 +- contrib/intarray/.gitignore | 3 + contrib/intarray/Makefile | 2 +- contrib/intarray/_int.h | 2 +- contrib/intarray/_int.sql.in | 2 +- contrib/intarray/_int_bool.c | 32 +- contrib/intarray/_int_gin.c | 4 +- contrib/intarray/_int_gist.c | 2 +- contrib/intarray/_int_op.c | 2 +- contrib/intarray/_int_tool.c | 2 +- contrib/intarray/_intbig_gist.c | 2 +- contrib/intarray/bench/create_test.pl | 2 +- contrib/intarray/uninstall__int.sql | 2 +- contrib/isn/.gitignore | 1 + contrib/isn/EAN13.h | 2 +- contrib/isn/ISBN.h | 2 +- contrib/isn/ISMN.h | 2 +- contrib/isn/ISSN.h | 2 +- contrib/isn/Makefile | 2 +- contrib/isn/UPC.h | 2 +- contrib/isn/isn.c | 6 +- contrib/isn/isn.h | 6 +- contrib/isn/isn.sql.in | 2 +- contrib/isn/uninstall_isn.sql | 2 +- contrib/lo/.gitignore | 1 + contrib/lo/Makefile | 2 +- contrib/lo/lo.c | 2 +- contrib/lo/lo.sql.in | 2 +- contrib/lo/lo_test.sql | 6 +- contrib/lo/uninstall_lo.sql | 2 +- contrib/ltree/.gitignore | 3 + contrib/ltree/Makefile | 2 +- contrib/ltree/_ltree_gist.c | 2 +- contrib/ltree/_ltree_op.c | 2 +- contrib/ltree/crc32.c | 2 +- contrib/ltree/crc32.h | 2 +- contrib/ltree/lquery_op.c | 2 +- contrib/ltree/ltree.h | 2 +- contrib/ltree/ltree.sql.in | 2 +- contrib/ltree/ltree_gist.c | 2 +- contrib/ltree/ltree_io.c | 2 +- contrib/ltree/ltree_op.c | 2 +- contrib/ltree/ltreetest.sql | 2 +- contrib/ltree/ltxtquery_io.c | 6 +- contrib/ltree/ltxtquery_op.c | 2 +- contrib/ltree/uninstall_ltree.sql | 2 +- contrib/oid2name/.gitignore | 1 + contrib/oid2name/Makefile | 5 +- contrib/oid2name/oid2name.c | 22 +- contrib/pageinspect/.gitignore | 1 + contrib/pageinspect/Makefile | 2 +- contrib/pageinspect/btreefuncs.c | 2 +- contrib/pageinspect/fsmfuncs.c | 4 +- contrib/pageinspect/heapfuncs.c | 8 +- contrib/pageinspect/pageinspect.sql.in | 2 +- contrib/pageinspect/rawpage.c | 4 +- contrib/pageinspect/uninstall_pageinspect.sql | 2 +- contrib/passwordcheck/Makefile | 19 + contrib/passwordcheck/passwordcheck.c | 148 + contrib/pg_archivecleanup/.gitignore | 1 + contrib/pg_archivecleanup/Makefile | 18 + contrib/pg_archivecleanup/pg_archivecleanup.c | 320 + contrib/pg_buffercache/.gitignore | 1 + contrib/pg_buffercache/Makefile | 2 +- contrib/pg_buffercache/pg_buffercache.sql.in | 2 +- contrib/pg_buffercache/pg_buffercache_pages.c | 2 +- .../pg_buffercache/uninstall_pg_buffercache.sql | 2 +- contrib/pg_freespacemap/.gitignore | 1 + contrib/pg_freespacemap/Makefile | 2 +- contrib/pg_freespacemap/pg_freespacemap.c | 2 +- contrib/pg_freespacemap/pg_freespacemap.sql.in | 2 +- .../pg_freespacemap/uninstall_pg_freespacemap.sql | 2 +- contrib/pg_standby/.gitignore | 1 + contrib/pg_standby/Makefile | 8 +- contrib/pg_standby/pg_standby.c | 16 +- contrib/pg_stat_statements/.gitignore | 1 + contrib/pg_stat_statements/Makefile | 2 +- contrib/pg_stat_statements/pg_stat_statements.c | 177 +- .../pg_stat_statements/pg_stat_statements.sql.in | 12 +- .../uninstall_pg_stat_statements.sql | 2 +- contrib/pg_trgm/.gitignore | 3 + contrib/pg_trgm/Makefile | 2 +- contrib/pg_trgm/pg_trgm.sql.in | 2 +- contrib/pg_trgm/trgm.h | 2 +- contrib/pg_trgm/trgm_gin.c | 2 +- contrib/pg_trgm/trgm_gist.c | 2 +- contrib/pg_trgm/trgm_op.c | 2 +- contrib/pg_trgm/uninstall_pg_trgm.sql | 2 +- contrib/pg_upgrade/.gitignore | 1 + contrib/pg_upgrade/IMPLEMENTATION | 100 + 
contrib/pg_upgrade/Makefile | 26 + contrib/pg_upgrade/TESTING | 68 + contrib/pg_upgrade/check.c | 643 + contrib/pg_upgrade/controldata.c | 568 + contrib/pg_upgrade/dump.c | 100 + contrib/pg_upgrade/exec.c | 313 + contrib/pg_upgrade/file.c | 471 + contrib/pg_upgrade/function.c | 265 + contrib/pg_upgrade/info.c | 514 + contrib/pg_upgrade/option.c | 346 + contrib/pg_upgrade/page.c | 178 + contrib/pg_upgrade/pg_upgrade.c | 415 + contrib/pg_upgrade/pg_upgrade.h | 404 + contrib/pg_upgrade/relfilenode.c | 230 + contrib/pg_upgrade/server.c | 339 + contrib/pg_upgrade/tablespace.c | 90 + contrib/pg_upgrade/util.c | 273 + contrib/pg_upgrade/version.c | 93 + contrib/pg_upgrade/version_old_8_3.c | 701 + contrib/pg_upgrade_support/Makefile | 19 + contrib/pg_upgrade_support/pg_upgrade_support.c | 124 + contrib/pgbench/.gitignore | 1 + contrib/pgbench/Makefile | 11 +- contrib/pgbench/pgbench.c | 1113 +- contrib/pgcrypto/.gitignore | 3 + contrib/pgcrypto/Makefile | 2 +- contrib/pgcrypto/blf.c | 2 +- contrib/pgcrypto/blf.h | 2 +- contrib/pgcrypto/crypt-blowfish.c | 2 +- contrib/pgcrypto/crypt-des.c | 4 +- contrib/pgcrypto/crypt-gensalt.c | 2 +- contrib/pgcrypto/crypt-md5.c | 4 +- contrib/pgcrypto/expected/3des.out | 2 + contrib/pgcrypto/expected/blowfish.out | 2 + contrib/pgcrypto/expected/cast5.out | 2 + contrib/pgcrypto/expected/crypt-blowfish.out | 2 +- contrib/pgcrypto/expected/des.out | 2 + contrib/pgcrypto/expected/init.out | 2 + contrib/pgcrypto/expected/pgp-armor.out | 46 +- contrib/pgcrypto/expected/pgp-decrypt.out | 40 +- contrib/pgcrypto/expected/pgp-encrypt.out | 2 + contrib/pgcrypto/expected/pgp-pubkey-encrypt.out | 2 + contrib/pgcrypto/expected/rijndael.out | 2 + contrib/pgcrypto/fortuna.c | 2 +- contrib/pgcrypto/fortuna.h | 2 +- contrib/pgcrypto/imath.c | 4 +- contrib/pgcrypto/imath.h | 4 +- contrib/pgcrypto/internal-sha2.c | 2 +- contrib/pgcrypto/internal.c | 2 +- contrib/pgcrypto/mbuf.c | 2 +- contrib/pgcrypto/mbuf.h | 2 +- contrib/pgcrypto/md5.c | 4 +- contrib/pgcrypto/md5.h | 4 +- contrib/pgcrypto/openssl.c | 2 +- contrib/pgcrypto/pgcrypto.c | 2 +- contrib/pgcrypto/pgcrypto.h | 2 +- contrib/pgcrypto/pgcrypto.sql.in | 2 +- contrib/pgcrypto/pgp-armor.c | 2 +- contrib/pgcrypto/pgp-cfb.c | 2 +- contrib/pgcrypto/pgp-compress.c | 2 +- contrib/pgcrypto/pgp-decrypt.c | 2 +- contrib/pgcrypto/pgp-encrypt.c | 2 +- contrib/pgcrypto/pgp-info.c | 2 +- contrib/pgcrypto/pgp-mpi-internal.c | 2 +- contrib/pgcrypto/pgp-mpi-openssl.c | 2 +- contrib/pgcrypto/pgp-mpi.c | 2 +- contrib/pgcrypto/pgp-pgsql.c | 2 +- contrib/pgcrypto/pgp-pubdec.c | 2 +- contrib/pgcrypto/pgp-pubenc.c | 6 +- contrib/pgcrypto/pgp-pubkey.c | 2 +- contrib/pgcrypto/pgp-s2k.c | 2 +- contrib/pgcrypto/pgp.c | 2 +- contrib/pgcrypto/pgp.h | 2 +- contrib/pgcrypto/px-crypt.c | 2 +- contrib/pgcrypto/px-crypt.h | 2 +- contrib/pgcrypto/px-hmac.c | 2 +- contrib/pgcrypto/px.c | 2 +- contrib/pgcrypto/px.h | 2 +- contrib/pgcrypto/random.c | 2 +- contrib/pgcrypto/rijndael.c | 4 +- contrib/pgcrypto/rijndael.h | 4 +- contrib/pgcrypto/sha1.c | 6 +- contrib/pgcrypto/sha1.h | 6 +- contrib/pgcrypto/sha2.c | 8 +- contrib/pgcrypto/sha2.h | 4 +- contrib/pgcrypto/sql/3des.sql | 2 + contrib/pgcrypto/sql/blowfish.sql | 2 + contrib/pgcrypto/sql/cast5.sql | 2 + contrib/pgcrypto/sql/des.sql | 2 + contrib/pgcrypto/sql/init.sql | 3 + contrib/pgcrypto/sql/pgp-armor.sql | 2 + contrib/pgcrypto/sql/pgp-encrypt.sql | 2 + contrib/pgcrypto/sql/pgp-pubkey-encrypt.sql | 2 + contrib/pgcrypto/sql/rijndael.sql | 2 + contrib/pgcrypto/uninstall_pgcrypto.sql | 2 +- 
contrib/pgrowlocks/.gitignore | 1 + contrib/pgrowlocks/Makefile | 2 +- contrib/pgrowlocks/pgrowlocks.c | 2 +- contrib/pgrowlocks/pgrowlocks.sql.in | 2 +- contrib/pgrowlocks/uninstall_pgrowlocks.sql | 2 +- contrib/pgstattuple/.gitignore | 1 + contrib/pgstattuple/Makefile | 2 +- contrib/pgstattuple/pgstatindex.c | 2 +- contrib/pgstattuple/pgstattuple.c | 12 +- contrib/pgstattuple/pgstattuple.sql.in | 2 +- contrib/pgstattuple/uninstall_pgstattuple.sql | 2 +- contrib/seg/.cvsignore | 2 - contrib/seg/.gitignore | 5 + contrib/seg/Makefile | 12 +- contrib/seg/seg.c | 4 +- contrib/seg/seg.sql.in | 2 +- contrib/seg/segdata.h | 2 +- contrib/seg/uninstall_seg.sql | 2 +- contrib/spi/.gitignore | 5 + contrib/spi/Makefile | 6 +- contrib/spi/autoinc.c | 2 +- contrib/spi/autoinc.sql.in | 2 +- contrib/spi/insert_username.c | 2 +- contrib/spi/insert_username.sql.in | 2 +- contrib/spi/moddatetime.c | 2 +- contrib/spi/moddatetime.sql.in | 2 +- contrib/spi/refint.c | 2 +- contrib/spi/refint.sql.in | 2 +- contrib/spi/timetravel.c | 2 +- contrib/spi/timetravel.sql.in | 2 +- contrib/sslinfo/.gitignore | 1 + contrib/sslinfo/Makefile | 2 +- contrib/sslinfo/sslinfo.c | 2 +- contrib/sslinfo/sslinfo.sql.in | 2 +- contrib/sslinfo/uninstall_sslinfo.sql | 2 +- contrib/start-scripts/freebsd | 14 +- contrib/start-scripts/linux | 28 +- contrib/start-scripts/osx/PostgreSQL | 10 +- contrib/tablefunc/.gitignore | 3 + contrib/tablefunc/Makefile | 4 +- contrib/tablefunc/tablefunc.c | 4 +- contrib/tablefunc/tablefunc.h | 4 +- contrib/tablefunc/tablefunc.sql.in | 2 +- contrib/tablefunc/uninstall_tablefunc.sql | 2 +- contrib/test_parser/.gitignore | 3 + contrib/test_parser/Makefile | 2 +- contrib/test_parser/test_parser.c | 4 +- contrib/test_parser/test_parser.sql.in | 2 +- contrib/test_parser/uninstall_test_parser.sql | 2 +- contrib/tsearch2/.gitignore | 3 + contrib/tsearch2/Makefile | 2 +- contrib/tsearch2/expected/tsearch2.out | 56 +- contrib/tsearch2/expected/tsearch2_1.out | 56 +- contrib/tsearch2/tsearch2.c | 13 +- contrib/tsearch2/tsearch2.sql.in | 2 +- contrib/tsearch2/uninstall_tsearch2.sql | 2 +- contrib/unaccent/.gitignore | 3 + contrib/unaccent/Makefile | 23 + contrib/unaccent/expected/unaccent.out | 65 + contrib/unaccent/sql/unaccent.sql | 22 + contrib/unaccent/unaccent.c | 320 + contrib/unaccent/unaccent.rules | 187 + contrib/unaccent/unaccent.sql.in | 34 + contrib/unaccent/uninstall_unaccent.sql | 11 + contrib/uuid-ossp/Makefile | 2 +- contrib/uuid-ossp/uninstall_uuid-ossp.sql | 2 +- contrib/uuid-ossp/uuid-ossp.c | 4 +- contrib/uuid-ossp/uuid-ossp.sql.in | 2 +- contrib/vacuumlo/.gitignore | 1 + contrib/vacuumlo/Makefile | 5 +- contrib/vacuumlo/vacuumlo.c | 18 +- contrib/xml2/.gitignore | 3 + contrib/xml2/Makefile | 2 +- contrib/xml2/expected/xml2.out | 66 +- contrib/xml2/expected/xml2_1.out | 14 +- contrib/xml2/pgxml.sql.in | 2 +- contrib/xml2/uninstall_pgxml.sql | 2 +- contrib/xml2/xpath.c | 168 +- contrib/xml2/xslt_proc.c | 16 +- doc/Makefile | 98 +- doc/bug.template | 2 +- doc/src/Makefile | 14 +- doc/src/sgml/.gitignore | 32 + doc/src/sgml/Makefile | 293 +- doc/src/sgml/README.links | 2 +- doc/src/sgml/acronyms.sgml | 30 +- doc/src/sgml/adminpack.sgml | 2 +- doc/src/sgml/advanced.sgml | 7 +- doc/src/sgml/arch-dev.sgml | 58 +- doc/src/sgml/array.sgml | 2 +- doc/src/sgml/auto-explain.sgml | 91 +- doc/src/sgml/backup.sgml | 1100 +- doc/src/sgml/biblio.sgml | 20 +- doc/src/sgml/bki.sgml | 20 +- doc/src/sgml/btree-gin.sgml | 12 +- doc/src/sgml/btree-gist.sgml | 10 +- doc/src/sgml/catalogs.sgml | 844 +- 
doc/src/sgml/charset.sgml | 75 +- doc/src/sgml/chkpass.sgml | 6 +- doc/src/sgml/citext.sgml | 53 +- doc/src/sgml/client-auth.sgml | 445 +- doc/src/sgml/config.sgml | 1204 +- doc/src/sgml/contacts.sgml | 2 +- doc/src/sgml/contrib-spi.sgml | 6 +- doc/src/sgml/contrib.sgml | 10 +- doc/src/sgml/cube.sgml | 107 +- doc/src/sgml/cvs.sgml | 271 - doc/src/sgml/datatype.sgml | 292 +- doc/src/sgml/datetime.sgml | 2 +- doc/src/sgml/dblink.sgml | 1023 +- doc/src/sgml/ddl.sgml | 111 +- doc/src/sgml/dfunc.sgml | 4 +- doc/src/sgml/dict-int.sgml | 2 +- doc/src/sgml/dict-xsyn.sgml | 55 +- doc/src/sgml/diskusage.sgml | 73 +- doc/src/sgml/dml.sgml | 21 +- doc/src/sgml/docguide.sgml | 147 +- doc/src/sgml/earthdistance.sgml | 6 +- doc/src/sgml/ecpg.sgml | 1178 +- doc/src/sgml/errcodes.sgml | 14 +- doc/src/sgml/extend.sgml | 69 +- doc/src/sgml/external-projects.sgml | 8 +- doc/src/sgml/features.sgml | 2 +- doc/src/sgml/filelist.sgml | 16 +- doc/src/sgml/fixrtf | 2 +- doc/src/sgml/func.sgml | 1470 +- doc/src/sgml/fuzzystrmatch.sgml | 50 +- doc/src/sgml/generate_history.pl | 2 +- doc/src/sgml/geqo.sgml | 23 +- doc/src/sgml/gin.sgml | 45 +- doc/src/sgml/gist.sgml | 24 +- doc/src/sgml/high-availability.sgml | 1512 ++- doc/src/sgml/history.sgml | 2 +- doc/src/sgml/hstore.sgml | 436 +- doc/src/sgml/indexam.sgml | 180 +- doc/src/sgml/indices.sgml | 38 +- doc/src/sgml/info.sgml | 2 +- doc/src/sgml/information_schema.sgml | 273 +- doc/src/sgml/i... [truncated message content] |
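The barrier work in the commits above adds three user-visible pieces: a BARRIER statement in the grammar, an XLOG_BARRIER_CREATE record WAL-logged on the local coordinator, and a recovery_barrier_id token in recovery.conf that makes replay stop at the named barrier. A hedged usage sketch follows; recovery_barrier_id is the token parsed in readRecoveryCommandFile() above, while the exact CREATE BARRIER spelling, the sample id, and the archive path are assumptions, since the gram.y hunks appear here only in part.

-- On a coordinator, at a moment when the whole cluster is consistent
-- (the id is user supplied; syntax assumed from the BarrierStmt grammar):
CREATE BARRIER 'bar_20110504';

# Later, in each node's recovery.conf, to stop replay at that barrier
# (restore_command and the archive layout are illustrative only):
restore_command = 'cp /archive/%f "%p"'
recovery_barrier_id = 'bar_20110504'

Because every node logs the same barrier id, recovering each node to that record yields a cluster-wide consistent point, which is the whole purpose of holding BarrierLock across the 2PC commit path rearranged above.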
From: Abbas B. <ga...@us...> - 2011-05-04 11:11:47
|
Project "Postgres-XC". The branch, master has been updated via 1580ed848d0577c52949c52fe2cec867b5ee1746 (commit) from baa29ae9435f03d3495828c78d89c6b235e61bb0 (commit) - Log ----------------------------------------------------------------- commit 1580ed848d0577c52949c52fe2cec867b5ee1746 Author: Abbas <abb...@en...> Date: Wed May 4 13:26:10 2011 +0500 This patch fixes a problem in XC that INSERTS/UPDATES in catalog tables were not possible from psql prompt. The problem was in XC planner. XC planner should first check if all the tables in the query are catalog tables then it should invoke standard plannner. This change enables us to remove a temp fix in GetRelationLocInfo. Also a query is added in system_views.sql to add a corresponding entry in pgxc_class. RelationBuildDesc is asked to include bootstrap objetcs too while building location info. diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index a039578..2d7607b 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -155,6 +155,8 @@ CREATE SCHEMA __pgxc_datanode_schema__; create table __pgxc_coordinator_schema__.pg_prepared_xacts ( transaction xid, gid text, prepared timestamptz, owner name, database name ); +INSERT INTO pgxc_class VALUES((SELECT oid FROM pg_class WHERE relkind = 'r' AND relname = 'pg_prepared_xacts'), 'N', 0,0,0); + CREATE VIEW __pgxc_datanode_schema__.pg_prepared_xacts AS SELECT P.transaction, P.gid, P.prepared, U.rolname AS owner, D.datname AS database diff --git a/src/backend/pgxc/locator/locator.c b/src/backend/pgxc/locator/locator.c index acab6d7..0ab157d 100644 --- a/src/backend/pgxc/locator/locator.c +++ b/src/backend/pgxc/locator/locator.c @@ -754,37 +754,6 @@ GetRelationLocInfo(Oid relid) Relation rel = relation_open(relid, AccessShareLock); - /* This check has been added as a temp fix for CREATE TABLE not adding entry in pgxc_class - * when run from system_views.sql - */ - if ( rel != NULL && - rel->rd_rel != NULL && - rel->rd_rel->relkind == RELKIND_RELATION && - rel->rd_rel->relname.data != NULL && - (strcmp(rel->rd_rel->relname.data, PREPARED_XACTS_TABLE) == 0) ) - { - namespace = get_namespace_name(rel->rd_rel->relnamespace); - - if (namespace != NULL && (strcmp(namespace, PGXC_COORDINATOR_SCHEMA) == 0)) - { - RelationLocInfo *dest_info; - - dest_info = (RelationLocInfo *) palloc0(sizeof(RelationLocInfo)); - - dest_info->relid = relid; - dest_info->locatorType = 'N'; - dest_info->nodeCount = NumDataNodes; - dest_info->nodeList = GetAllDataNodes(); - - relation_close(rel, AccessShareLock); - pfree(namespace); - - return dest_info; - } - - if (namespace != NULL) pfree(namespace); - } - if (rel && rel->rd_locator_info) ret_loc_info = CopyRelationLocInfo(rel->rd_locator_info); diff --git a/src/backend/pgxc/plan/planner.c b/src/backend/pgxc/plan/planner.c index ed006e7..2448a74 100644 --- a/src/backend/pgxc/plan/planner.c +++ b/src/backend/pgxc/plan/planner.c @@ -2894,6 +2894,12 @@ pgxc_planner(Query *query, int cursorOptions, ParamListInfo boundParams) if (query->commandType != CMD_SELECT) result->resultRelations = list_make1_int(query->resultRelation); + if (contains_only_pg_catalog (query->rtable)) + { + result = standard_planner(query, cursorOptions, boundParams); + return result; + } + if (query_step->exec_nodes == NULL) get_plan_nodes_command(query_step, root); diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c index 1dfbe38..e861a91 100644 --- a/src/backend/utils/cache/relcache.c +++ 
b/src/backend/utils/cache/relcache.c @@ -888,7 +888,7 @@ RelationBuildDesc(Oid targetRelId, bool insertIt) relation->trigdesc = NULL; #ifdef PGXC - if (IS_PGXC_COORDINATOR && relation->rd_id >= FirstNormalObjectId) + if (IS_PGXC_COORDINATOR && relation->rd_id >= FirstBootstrapObjectId) RelationBuildLocator(relation); #endif /* ----------------------------------------------------------------------- Summary of changes: src/backend/catalog/system_views.sql | 2 ++ src/backend/pgxc/locator/locator.c | 31 ------------------------------- src/backend/pgxc/plan/planner.c | 6 ++++++ src/backend/utils/cache/relcache.c | 2 +- 4 files changed, 9 insertions(+), 32 deletions(-) hooks/post-receive -- Postgres-XC |
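The planner hunk above relies on contains_only_pg_catalog(), whose definition is not included in this mail. For orientation, here is a plausible shape for such a helper, assuming it simply checks that every plain relation in the query's range table lives in the pg_catalog namespace; this is a hypothetical reconstruction under that assumption, not the committed function.

#include "postgres.h"
#include "nodes/parsenodes.h"
#include "catalog/pg_namespace.h"
#include "utils/lsyscache.h"

/*
 * Hypothetical sketch: return true only if every plain relation in the
 * range table belongs to pg_catalog, in which case the query can safely
 * be handed to standard_planner() instead of being shipped to the
 * data nodes.
 */
static bool
contains_only_pg_catalog(List *rtable)
{
	ListCell   *item;

	foreach(item, rtable)
	{
		RangeTblEntry *rte = (RangeTblEntry *) lfirst(item);

		if (rte->rtekind != RTE_RELATION)
			continue;			/* subqueries etc. would need their own walk */
		if (get_rel_namespace(rte->relid) != PG_CATALOG_NAMESPACE)
			return false;
	}
	return true;
}

The design intent, per the commit message, is that a purely catalog-local statement never needs XC's distributed planning, so falling back to the stock planner both fixes the psql-prompt case and lets the GetRelationLocInfo() temp fix go away.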
From: Koichi S. <koi...@us...> - 2011-05-04 04:15:57
|
Project "Postgres-XC". The branch, ha_support has been updated via 95fbb1a7742ef7cb0698875dbb3ad758499c21c7 (commit) from a3170920c368411b1b9a9e261088a21009fcf74e (commit) - Log ----------------------------------------------------------------- commit 95fbb1a7742ef7cb0698875dbb3ad758499c21c7 Author: Koichi Suzuki <koi...@gm...> Date: Wed May 4 13:07:34 2011 +0900 This commit fixes problem in node registration, combined with GTM-Standby. In the previous code, in MSG_NODE_REGISTER command, hostname was optional, depending upon type of the node to register. It is misleading. Because hostname embedded in the message is obtained using the same function, host name is now mandatory in this message for simplification. Second fix is selecting transaction slot from GTM inside database in GTM-Standby. Original code tested if GXID is invalid. Instead, we should test gti_in_use, which is also used to vacant slot. Third fix is adding status of each sequence in backing up by GTM-Standby. In this way, now GTM-Stanby can be connected to GTM-ACT normally. Next fix will include tweak in source code of GTM, including adding comments, and GTM-Proxy reconnect to promoted GTM-Standby. The affected files are as follows: modified: src/gtm/client/gtm_client.c modified: src/gtm/common/gtm_serialize.c modified: src/gtm/main/gtm_seq.c modified: src/gtm/main/gtm_standby.c modified: src/gtm/recovery/register.c diff --git a/src/gtm/client/gtm_client.c b/src/gtm/client/gtm_client.c index f22ebe4..d4b98f3 100644 --- a/src/gtm/client/gtm_client.c +++ b/src/gtm/client/gtm_client.c @@ -1239,6 +1239,12 @@ int node_register2(GTM_Conn *conn, GTM_PGXCNodeType type, const char *host, GTM time_t finish_time; GTM_PGXCNodeId proxynum = 0; + /* + * We should be very careful about the format of the message. + * Host name and its length is needed only when registering + * GTM Proxy. + * In other case, they must not be included in the message. + */ if (gtmpqPutMsgStart('C', true, conn) || /* Message Type */ gtmpqPutInt(MSG_NODE_REGISTER, sizeof (GTM_MessageType), conn) || diff --git a/src/gtm/common/gtm_serialize.c b/src/gtm/common/gtm_serialize.c index 9eeaf05..85c7233 100644 --- a/src/gtm/common/gtm_serialize.c +++ b/src/gtm/common/gtm_serialize.c @@ -569,7 +569,17 @@ gtm_serialize_transactions(GTM_Transactions *data, char *buf, size_t buflen) for (i=0 ; i<GTM_MAX_GLOBAL_TRANSACTIONS ; i++) { + /* + * The following code is quiestionable. #if 0 ... part is the + * original. To select valid (used) slot from the transaction + * array, we should test gti_in_use instead, which is used to + * find vacant slot. 
+ */ +#if 0 if ( data->gt_transactions_array[i].gti_gxid != InvalidGlobalTransactionId ) +#else + if ( data->gt_transactions_array[i].gti_in_use == TRUE ) +#endif txn_count++; } diff --git a/src/gtm/main/gtm_seq.c b/src/gtm/main/gtm_seq.c index 8195b39..0726fff 100644 --- a/src/gtm/main/gtm_seq.c +++ b/src/gtm/main/gtm_seq.c @@ -404,14 +404,14 @@ int GTM_SeqAlter(GTM_SequenceKey seqkey, */ int GTM_SeqRestore(GTM_SequenceKey seqkey, - GTM_Sequence increment_by, - GTM_Sequence minval, - GTM_Sequence maxval, - GTM_Sequence startval, - GTM_Sequence curval, - int32 state, - bool cycle, - bool called) + GTM_Sequence increment_by, + GTM_Sequence minval, + GTM_Sequence maxval, + GTM_Sequence startval, + GTM_Sequence curval, + int32 state, + bool cycle, + bool called) { GTM_SeqInfo *seqinfo = NULL; int errcode = 0; @@ -1597,6 +1597,6 @@ GTM_RestoreSeqInfo(int ctlfd) } GTM_SeqRestore(&seqkey, increment_by, minval, maxval, startval, curval, - state, cycle, called); + state, cycle, called); } } diff --git a/src/gtm/main/gtm_standby.c b/src/gtm/main/gtm_standby.c index ee9ee58..4e9393b 100644 --- a/src/gtm/main/gtm_standby.c +++ b/src/gtm/main/gtm_standby.c @@ -106,14 +106,14 @@ gtm_standby_restore_sequence() for (i=0 ; i<num_seq ; i++) { GTM_SeqRestore(seq_list[i]->gs_key, - seq_list[i]->gs_increment_by, - seq_list[i]->gs_min_value, - seq_list[i]->gs_max_value, - seq_list[i]->gs_init_value, - seq_list[i]->gs_value, - seq_list[i]->gs_state, - seq_list[i]->gs_cycle, - seq_list[i]->gs_called); + seq_list[i]->gs_increment_by, + seq_list[i]->gs_min_value, + seq_list[i]->gs_max_value, + seq_list[i]->gs_init_value, + seq_list[i]->gs_value, + seq_list[i]->gs_state, + seq_list[i]->gs_cycle, + seq_list[i]->gs_called); } elog(LOG, "Restoring sequences done."); diff --git a/src/gtm/recovery/register.c b/src/gtm/recovery/register.c index c5d72d8..1ee30ce 100644 --- a/src/gtm/recovery/register.c +++ b/src/gtm/recovery/register.c @@ -456,6 +456,15 @@ ProcessPGXCNodeRegister(Port *myport, StringInfo message) * In the case a proxy registering itself, the remote address * is directly taken from socket. */ + /* + * The following block of code seems to be wrong. The #if 0 ... part + * is the original. It seems that all the nodes include host + * information in the protocol message. It is simple and not + * misleading. The result is very reasonable and seems to work in + * all the cases as far as tested. Just in case, I left the + * original code for future help. May 4th, 2011, K.Suzuki + */ +#if 0 if (myport->remote_type == PGXC_NODE_GTM_PROXY && !myport->is_postmaster) { @@ -464,6 +473,11 @@ ProcessPGXCNodeRegister(Port *myport, StringInfo message) } else ipaddress = remote_host; +#else + strlen = pq_getmsgint(message, sizeof (GTM_StrLen)); + ipaddress = (char *)pq_getmsgbytes(message, strlen); + +#endif /* Read Port Number */ memcpy(&port, pq_getmsgbytes(message, sizeof (GTM_PGXCNodePort)), ----------------------------------------------------------------------- Summary of changes: src/gtm/client/gtm_client.c | 6 ++++++ src/gtm/common/gtm_serialize.c | 10 ++++++++++ src/gtm/main/gtm_seq.c | 18 +++++++++--------- src/gtm/main/gtm_standby.c | 16 ++++++++-------- src/gtm/recovery/register.c | 14 ++++++++++++++ 5 files changed, 47 insertions(+), 17 deletions(-) hooks/post-receive -- Postgres-XC |
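With the host name now mandatory, both sides of MSG_NODE_REGISTER agree on a single wire layout for the field: a length word followed by the raw bytes, with the NUL terminator restored only in memory on the receiving side. A sketch of that pattern (the helper names put_node_host/get_node_host are invented for illustration; gtmpqPutInt, gtmpqPutnchar, pq_getmsgint and pq_getmsgbytes are the calls the diff relies on):

    /* Sender side: buffer "<length><bytes>" into the current message. */
    static int
    put_node_host(GTM_Conn *conn, const char *host)
    {
        GTM_StrLen hlen = (GTM_StrLen) strlen(host);

        if (gtmpqPutInt(hlen, sizeof (GTM_StrLen), conn) ||
            gtmpqPutnchar(host, hlen, conn))
            return EOF;             /* could not buffer the field */
        return 0;
    }

    /* Receiver side: the mirror image, as in ProcessPGXCNodeRegister. */
    static char *
    get_node_host(StringInfo message)
    {
        GTM_StrLen  hlen = pq_getmsgint(message, sizeof (GTM_StrLen));
        const char *raw  = pq_getmsgbytes(message, hlen);
        char       *host = (char *) palloc(hlen + 1);

        memcpy(host, raw, hlen);
        host[hlen] = '\0';          /* terminator exists only in memory */
        return host;
    }

Keeping the field unconditional removes the per-node-type branching that made the old message format easy to get wrong.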
From: Abbas B. <ga...@us...> - 2011-05-03 17:33:45
|
Project "Postgres-XC". The branch, master has been updated via baa29ae9435f03d3495828c78d89c6b235e61bb0 (commit) from a877e0769b83719b446d15a243e14b05700f975a (commit) - Log ----------------------------------------------------------------- commit baa29ae9435f03d3495828c78d89c6b235e61bb0 Author: Abbas <abb...@en...> Date: Tue May 3 22:29:51 2011 +0500 This patch makes the group by on XC work. The changes are as follows 1. The application of final function at coordinator is enabled though AggState execution. Till now final function was being applied during execution of RemoteQuery only, if there were aggregates in target list of remote query. This only worked in certain cases of aggregates (expressions involving aggregates, aggregation of join results etc. being some of the exceptions). With this change the way grouping works the same way as PG except a. the data comes from remote nodes in the form of tuples b. the aggregates go through three steps transition, collection (extra step to collect the data across the nodes) and finalization. 2. Till now, the collection and transition result type for some aggregates like sum, count, regr_count were different. I have added a function int8_sum__to_int8() which adds to int8 datums and converts the result into int8 datum. This function is used as collection function for these aggregates so that collection and transition functions have same result types. 3. Changed some of the alternate outputs to correct results now that grouping is working. Commented out test join, since it's crashing with grouping enabled. The test has a query which involves aggregates, group by and order by. The crash is happening because of order by and aggregates. Earlier the test didn't crash since GROUPING as disabled and the query would throw error, but now with grouping is enabled, the crash occurs. Bug id 3284321 tracks the crash. 4. Added new test xc_groupby.sql to test the grouping in XC with round robin and replicated tables with some simple aggregates like sum, count and avg. All work done by Ashutosh Bapat. diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c index ade57f7..a347f48 100644 --- a/src/backend/executor/nodeAgg.c +++ b/src/backend/executor/nodeAgg.c @@ -89,6 +89,7 @@ #include "optimizer/tlist.h" #include "parser/parse_agg.h" #include "parser/parse_coerce.h" +#include "pgxc/pgxc.h" #include "utils/acl.h" #include "utils/builtins.h" #include "utils/lsyscache.h" @@ -121,6 +122,9 @@ typedef struct AggStatePerAggData /* Oids of transfer functions */ Oid transfn_oid; Oid finalfn_oid; /* may be InvalidOid */ +#ifdef PGXC + Oid collectfn_oid; /* may be InvalidOid */ +#endif /* PGXC */ /* * fmgr lookup data for transfer functions --- only valid when @@ -129,6 +133,9 @@ typedef struct AggStatePerAggData */ FmgrInfo transfn; FmgrInfo finalfn; +#ifdef PGXC + FmgrInfo collectfn; +#endif /* PGXC */ /* number of sorting columns */ int numSortCols; @@ -154,6 +161,10 @@ typedef struct AggStatePerAggData */ Datum initValue; bool initValueIsNull; +#ifdef PGXC + Datum initCollectValue; + bool initCollectValueIsNull; +#endif /* PGXC */ /* * We need the len and byval info for the agg's input, result, and @@ -165,9 +176,15 @@ typedef struct AggStatePerAggData int16 inputtypeLen, resulttypeLen, transtypeLen; +#ifdef PGXC + int16 collecttypeLen; +#endif /* PGXC */ bool inputtypeByVal, resulttypeByVal, transtypeByVal; +#ifdef PGXC + bool collecttypeByVal; +#endif /* PGXC */ /* * Stuff for evaluation of inputs. 
We used to just use ExecEvalExpr, but @@ -725,6 +742,55 @@ finalize_aggregate(AggState *aggstate, MemoryContext oldContext; oldContext = MemoryContextSwitchTo(aggstate->ss.ps.ps_ExprContext->ecxt_per_tuple_memory); +#ifdef PGXC + /* + * PGXCTODO: see PGXCTODO item in advance_collect_function + * this step is needed in case the transition function does not produce + * a result consumable by the final function and the collection function + * needs to be applied to the transition results. Usually the results of + * both functions should be consumable by the final function. + * As such this step is meant only to convert transition results into a form + * consumable by the final function; the step does not actually do any + * collection. + */ + if (OidIsValid(peraggstate->collectfn_oid)) + { + FunctionCallInfoData fcinfo; + InitFunctionCallInfoData(fcinfo, &(peraggstate->collectfn), 2, + (void *) aggstate, NULL); + /* + * copy the initial datum since it might get changed inside the + * collection function + */ + if (peraggstate->initCollectValueIsNull) + fcinfo.arg[0] = peraggstate->initCollectValue; + else + fcinfo.arg[0] = datumCopy(peraggstate->initCollectValue, + peraggstate->collecttypeByVal, + peraggstate->collecttypeLen); + fcinfo.argnull[0] = peraggstate->initCollectValueIsNull; + fcinfo.arg[1] = pergroupstate->transValue; + fcinfo.argnull[1] = pergroupstate->transValueIsNull; + if (fcinfo.flinfo->fn_strict && + (pergroupstate->transValueIsNull || peraggstate->initCollectValueIsNull)) + { + pergroupstate->transValue = (Datum)0; + pergroupstate->transValueIsNull = true; + } + else + { + Datum newVal = FunctionCallInvoke(&fcinfo); + + /* + * store the result of the collection function into transValue so that + * the code below invoking the final function does not need to change + */ + /* PGXCTODO: worry about the memory management here? */ + pergroupstate->transValue = newVal; + pergroupstate->transValueIsNull = fcinfo.isnull; + } + } +#endif /* PGXC */ /* * Apply the agg's finalfn if one is provided, else return transValue. 
@@ -1546,6 +1612,10 @@ ExecInitAgg(Agg *node, EState *estate, int eflags) AclResult aclresult; Oid transfn_oid, finalfn_oid; +#ifdef PGXC + Oid collectfn_oid; + Expr *collectfnexpr; +#endif /* PGXC */ Expr *transfnexpr, *finalfnexpr; Datum textInitVal; @@ -1612,12 +1682,19 @@ ExecInitAgg(Agg *node, EState *estate, int eflags) get_func_name(aggref->aggfnoid)); peraggstate->transfn_oid = transfn_oid = aggform->aggtransfn; -#ifdef PGXC - /* For PGXC final function is executed when combining, disable it here */ - peraggstate->finalfn_oid = finalfn_oid = InvalidOid; -#else peraggstate->finalfn_oid = finalfn_oid = aggform->aggfinalfn; -#endif +#ifdef PGXC + peraggstate->collectfn_oid = collectfn_oid = aggform->aggcollectfn; + /* + * For PGXC final and collection functions are used to combine results at coordinator, + * disable those for data node + */ + if (IS_PGXC_DATANODE) + { + peraggstate->finalfn_oid = finalfn_oid = InvalidOid; + peraggstate->collectfn_oid = collectfn_oid = InvalidOid; + } +#endif /* PGXC */ /* Check that aggregate owner has permission to call component fns */ { HeapTuple procTuple; @@ -1644,6 +1721,17 @@ ExecInitAgg(Agg *node, EState *estate, int eflags) aclcheck_error(aclresult, ACL_KIND_PROC, get_func_name(finalfn_oid)); } + +#ifdef PGXC + if (OidIsValid(collectfn_oid)) + { + aclresult = pg_proc_aclcheck(collectfn_oid, aggOwner, + ACL_EXECUTE); + if (aclresult != ACLCHECK_OK) + aclcheck_error(aclresult, ACL_KIND_PROC, + get_func_name(collectfn_oid)); + } +#endif /* PGXC */ } /* resolve actual type of transition state, if polymorphic */ @@ -1674,6 +1762,32 @@ ExecInitAgg(Agg *node, EState *estate, int eflags) finalfn_oid, &transfnexpr, &finalfnexpr); +#ifdef PGXC + if (OidIsValid(collectfn_oid)) + { + /* we expect final function expression to be NULL in call to + * build_aggregate_fnexprs below, since InvalidOid is passed for + * finalfn_oid argument. Use a dummy expression to accept that. + */ + Expr *dummyexpr; + /* + * for XC, we need to setup the collection function expression as well. + * Use the same function with invalid final function oid, and collection + * function information instead of transition function information. + * PGXCTODO: we should really be adding this step inside + * build_aggregate_fnexprs() but this way it becomes easy to merge. 
+ */ + build_aggregate_fnexprs(&aggform->aggtranstype, + 1, + aggform->aggcollecttype, + aggref->aggtype, + collectfn_oid, + InvalidOid, + &collectfnexpr, + &dummyexpr); + Assert(!dummyexpr); + } +#endif /* PGXC */ fmgr_info(transfn_oid, &peraggstate->transfn); peraggstate->transfn.fn_expr = (Node *) transfnexpr; @@ -1684,12 +1798,25 @@ ExecInitAgg(Agg *node, EState *estate, int eflags) peraggstate->finalfn.fn_expr = (Node *) finalfnexpr; } +#ifdef PGXC + if (OidIsValid(collectfn_oid)) + { + fmgr_info(collectfn_oid, &peraggstate->collectfn); + peraggstate->collectfn.fn_expr = (Node *)collectfnexpr; + } +#endif /* PGXC */ + get_typlenbyval(aggref->aggtype, &peraggstate->resulttypeLen, &peraggstate->resulttypeByVal); get_typlenbyval(aggtranstype, &peraggstate->transtypeLen, &peraggstate->transtypeByVal); +#ifdef PGXC + get_typlenbyval(aggform->aggcollecttype, + &peraggstate->collecttypeLen, + &peraggstate->collecttypeByVal); +#endif /* PGXC */ /* * initval is potentially null, so don't try to access it as a struct @@ -1705,6 +1832,23 @@ ExecInitAgg(Agg *node, EState *estate, int eflags) peraggstate->initValue = GetAggInitVal(textInitVal, aggtranstype); +#ifdef PGXC + /* + * initval for collection function is potentially null, so don't try to + * access it as a struct field. Must do it the hard way with + * SysCacheGetAttr. + */ + textInitVal = SysCacheGetAttr(AGGFNOID, aggTuple, + Anum_pg_aggregate_agginitcollect, + &peraggstate->initCollectValueIsNull); + + if (peraggstate->initCollectValueIsNull) + peraggstate->initCollectValue = (Datum) 0; + else + peraggstate->initCollectValue = GetAggInitVal(textInitVal, + aggform->aggcollecttype); +#endif /* PGXC */ + /* * If the transfn is strict and the initval is NULL, make sure input * type and transtype are the same (or at least binary-compatible), so diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c index 0d5b831..9e884cb 100644 --- a/src/backend/optimizer/plan/planmain.c +++ b/src/backend/optimizer/plan/planmain.c @@ -298,12 +298,6 @@ query_planner(PlannerInfo *root, List *tlist, { List *groupExprs; -#ifdef PGXC - ereport(ERROR, - (errcode(ERRCODE_STATEMENT_TOO_COMPLEX), - (errmsg("GROUP BY clause is not yet supported")))); -#endif - groupExprs = get_sortgrouplist_exprs(parse->groupClause, parse->targetList); *num_groups = estimate_num_groups(root, diff --git a/src/backend/pgxc/plan/planner.c b/src/backend/pgxc/plan/planner.c index dd570f4..ed006e7 100644 --- a/src/backend/pgxc/plan/planner.c +++ b/src/backend/pgxc/plan/planner.c @@ -2976,17 +2976,16 @@ pgxc_planner(Query *query, int cursorOptions, ParamListInfo boundParams) } /* - * Use standard plan if we have more than one data node with either - * group by, hasWindowFuncs, or hasRecursive - */ - /* * PGXCTODO - this could be improved to check if the first * group by expression is the partitioning column, in which * case it is ok to treat as a single step. + * PGXCTODO - whatever number of nodes involved in the query, grouping, + * windowing and recursive queries take place at the coordinator. The + * corresponding planner should be able to optimize the queries such that + * most of the query is pushed to datanode, based on the kind of + * distribution the table has. 
*/ if (query->commandType == CMD_SELECT - && query_step->exec_nodes - && list_length(query_step->exec_nodes->nodelist) > 1 && (query->groupClause || query->hasWindowFuncs || query->hasRecursive)) { result = standard_planner(query, cursorOptions, boundParams); diff --git a/src/backend/pgxc/pool/execRemote.c b/src/backend/pgxc/pool/execRemote.c index c04a98c..47a07f0 100644 --- a/src/backend/pgxc/pool/execRemote.c +++ b/src/backend/pgxc/pool/execRemote.c @@ -337,6 +337,14 @@ advance_collect_function(SimpleAgg *simple_agg, FunctionCallInfoData *fcinfo) * result has not been initialized * We must copy the datum into result if it is pass-by-ref. We * do not need to pfree the old result, since it's NULL. + * PGXCTODO: in case the transition result type is different from + * the collection result type, this code would not work, since we are + * assigning a datum of one type to another. For this code to work the + * input and output of the collection function need to be binary + * compatible, which they are not. So, either check in AggregateCreate + * that the input and output of the collection function are binary + * coercible, or set the initial values to something non-null, or change + * this code */ simple_agg->collectValue = datumCopy(fcinfo->arg[1], simple_agg->transtypeByVal, diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c index eef4e3f..09fde7e 100644 --- a/src/backend/utils/adt/numeric.c +++ b/src/backend/utils/adt/numeric.c @@ -2796,6 +2796,36 @@ int8_sum(PG_FUNCTION_ARGS) NumericGetDatum(oldsum), newval)); } +/* + * similar to int8_sum, except that the result is cast into int8 + */ +Datum +int8_sum_to_int8(PG_FUNCTION_ARGS) +{ + Datum result_num; + Datum numeric_arg; + + /* if both arguments are null, the result is null */ + if (PG_ARGISNULL(0) && PG_ARGISNULL(1)) + PG_RETURN_NULL(); + + /* if either of them is null, the other is the result */ + if (PG_ARGISNULL(0)) + PG_RETURN_DATUM(PG_GETARG_DATUM(1)); + + if (PG_ARGISNULL(1)) + PG_RETURN_DATUM(PG_GETARG_DATUM(0)); + + /* + * convert the first argument to numeric (the second one is converted + * into numeric inside int8_sum), + * add both arguments using int8_sum, + * convert the result into int8 using numeric_int8 + */ + numeric_arg = DirectFunctionCall1(int8_numeric, PG_GETARG_DATUM(0)); + result_num = DirectFunctionCall2(int8_sum, numeric_arg, PG_GETARG_DATUM(1)); + PG_RETURN_DATUM(DirectFunctionCall1(numeric_int8, result_num)); +} /* * Routines for avg(int2) and avg(int4). 
The transition datatype diff --git a/src/include/catalog/pg_aggregate.h b/src/include/catalog/pg_aggregate.h index 1dbf7b4..80fb20f 100644 --- a/src/include/catalog/pg_aggregate.h +++ b/src/include/catalog/pg_aggregate.h @@ -139,14 +139,14 @@ DATA(insert ( 2106 interval_accum interval_collect interval_avg 0 1187 1187 "{0 /* sum */ #ifdef PGXC -DATA(insert ( 2107 int8_sum numeric_add - 0 1700 1700 _null_ _null_ )); -DATA(insert ( 2108 int4_sum int8_sum 1779 0 20 1700 _null_ _null_ )); -DATA(insert ( 2109 int2_sum int8_sum 1779 0 20 1700 _null_ _null_ )); -DATA(insert ( 2110 float4pl float4pl - 0 700 700 _null_ _null_ )); -DATA(insert ( 2111 float8pl float8pl - 0 701 701 _null_ _null_ )); +DATA(insert ( 2107 int8_sum numeric_add - 0 1700 1700 _null_ "0" )); +DATA(insert ( 2108 int4_sum int8_sum_to_int8 - 0 20 20 _null_ _null_ )); +DATA(insert ( 2109 int2_sum int8_sum_to_int8 - 0 20 20 _null_ _null_ )); +DATA(insert ( 2110 float4pl float4pl - 0 700 700 _null_ "0" )); +DATA(insert ( 2111 float8pl float8pl - 0 701 701 _null_ "0" )); DATA(insert ( 2112 cash_pl cash_pl - 0 790 790 _null_ _null_ )); DATA(insert ( 2113 interval_pl interval_pl - 0 1186 1186 _null_ _null_ )); -DATA(insert ( 2114 numeric_add numeric_add - 0 1700 1700 _null_ _null_ )); +DATA(insert ( 2114 numeric_add numeric_add - 0 1700 1700 _null_ "0" )); #endif #ifdef PGXC //DATA(insert ( 2107 int8_sum - 0 1700 _null_ )); @@ -254,8 +254,8 @@ DATA(insert ( 3527 enum_smaller enum_smaller - 3518 3500 3500 _null_ _null_ ) /* count */ /* The final function is the data type conversion function numeric_int8; it is referenced by OID because of an ambiguous definition in pg_proc */ #ifdef PGXC -DATA(insert ( 2147 int8inc_any int8_sum 1779 0 20 1700 "0" _null_ )); -DATA(insert ( 2803 int8inc int8_sum 1779 0 20 1700 "0" _null_ )); +DATA(insert ( 2147 int8inc_any int8_sum_to_int8 - 0 20 20 "0" _null_ )); +DATA(insert ( 2803 int8inc int8_sum_to_int8 - 0 20 20 "0" _null_ )); #endif #ifdef PGXC //DATA(insert ( 2147 int8inc_any - 0 20 "0" )); @@ -372,7 +372,7 @@ DATA(insert ( 2159 numeric_accum numeric_collect numeric_stddev_samp 0 1231 1231 /* SQL2003 binary regression aggregates */ #ifdef PGXC -DATA(insert ( 2818 int8inc_float8_float8 int8_sum 1779 0 20 1700 "0" _null_ )); +DATA(insert ( 2818 int8inc_float8_float8 int8_sum_to_int8 - 0 20 20 "0" _null_ )); DATA(insert ( 2819 float8_regr_accum float8_regr_collect float8_regr_sxx 0 1022 1022 "{0,0,0,0,0,0}" "{0,0,0,0,0,0}" )); DATA(insert ( 2820 float8_regr_accum float8_regr_collect float8_regr_syy 0 1022 1022 "{0,0,0,0,0,0}" "{0,0,0,0,0,0}" )); DATA(insert ( 2821 float8_regr_accum float8_regr_collect float8_regr_sxy 0 1022 1022 "{0,0,0,0,0,0}" "{0,0,0,0,0,0}" )); diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h index 0a5e1f2..55f5606 100644 --- a/src/include/catalog/pg_proc.h +++ b/src/include/catalog/pg_proc.h @@ -2792,7 +2792,8 @@ DESCR("SUM(int2) transition function"); DATA(insert OID = 1841 ( int4_sum PGNSP PGUID 12 1 0 0 f f f f f i 2 0 20 "20 23" _null_ _null_ _null_ _null_ int4_sum _null_ _null_ _null_ )); DESCR("SUM(int4) transition function"); DATA(insert OID = 1842 ( int8_sum PGNSP PGUID 12 1 0 0 f f f f f i 2 0 1700 "1700 20" _null_ _null_ _null_ _null_ int8_sum _null_ _null_ _null_ )); -DESCR("SUM(int8) transition function"); +DATA(insert OID = 3037 ( int8_sum_to_int8 PGNSP PGUID 12 1 0 0 f f f f f i 2 0 20 "20 20" _null_ _null_ _null_ _null_ int8_sum_to_int8 _null_ _null_ _null_ )); +DESCR("SUM(int*) collection function"); DATA(insert OID = 1843 ( interval_accum PGNSP PGUID 
12 1 0 0 f f f t f i 2 0 1187 "1187 1186" _null_ _null_ _null_ _null_ interval_accum _null_ _null_ _null_ )); DESCR("aggregate transition function"); DATA(insert OID = 1844 ( interval_avg PGNSP PGUID 12 1 0 0 f f f t f i 1 0 1186 "1187" _null_ _null_ _null_ _null_ interval_avg _null_ _null_ _null_ )); diff --git a/src/include/pgxc/execRemote.h b/src/include/pgxc/execRemote.h index c2cd922..39933ed 100644 --- a/src/include/pgxc/execRemote.h +++ b/src/include/pgxc/execRemote.h @@ -101,6 +101,12 @@ typedef struct RemoteQueryState * to initialize collecting of aggregates from the DNs */ bool initAggregates; + /* + * PGXCTODO - + * we should get rid of the simple_aggregates member, that should work + * through Agg node and grouping_planner should take care of optimizing it + * to the fullest + */ List *simple_aggregates; /* description of aggregate functions */ void *tuplesortstate; /* for merge sort */ /* Simple DISTINCT support */ diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h index 3cd0975..16c8ddc 100644 --- a/src/include/utils/builtins.h +++ b/src/include/utils/builtins.h @@ -935,6 +935,7 @@ extern Datum numeric_stddev_samp(PG_FUNCTION_ARGS); extern Datum int2_sum(PG_FUNCTION_ARGS); extern Datum int4_sum(PG_FUNCTION_ARGS); extern Datum int8_sum(PG_FUNCTION_ARGS); +extern Datum int8_sum_to_int8(PG_FUNCTION_ARGS); extern Datum int2_avg_accum(PG_FUNCTION_ARGS); extern Datum int4_avg_accum(PG_FUNCTION_ARGS); #ifdef PGXC diff --git a/src/test/regress/expected/opr_sanity_1.out b/src/test/regress/expected/opr_sanity_1.out index 885cb13..bf70944 100644 --- a/src/test/regress/expected/opr_sanity_1.out +++ b/src/test/regress/expected/opr_sanity_1.out @@ -709,14 +709,9 @@ WHERE a.aggfnoid = p.oid AND OR NOT binary_coercible(pfn.prorettype, p.prorettype) OR pfn.pronargs != 1 OR NOT binary_coercible(a.aggtranstype, pfn.proargtypes[0])); - aggfnoid | proname | oid | proname -----------+------------+------+--------- - 2108 | sum | 1779 | int8 - 2109 | sum | 1779 | int8 - 2147 | count | 1779 | int8 - 2803 | count | 1779 | int8 - 2818 | regr_count | 1779 | int8 -(5 rows) + aggfnoid | proname | oid | proname +----------+---------+-----+--------- +(0 rows) -- If transfn is strict then either initval should be non-NULL, or -- input type should match transtype so that the first non-null input @@ -1120,7 +1115,10 @@ FROM pg_am am JOIN pg_opclass op ON opcmethod = am.oid WHERE am.amname <> 'gin' GROUP BY amname, amsupport, opcname, amprocfamily HAVING count(*) != amsupport OR amprocfamily IS NULL; -ERROR: GROUP BY clause is not yet supported + amname | opcname | count +--------+---------+------- +(0 rows) + SELECT amname, opcname, count(*) FROM pg_am am JOIN pg_opclass op ON opcmethod = am.oid LEFT JOIN pg_amproc p ON amprocfamily = opcfamily AND @@ -1128,7 +1126,10 @@ FROM pg_am am JOIN pg_opclass op ON opcmethod = am.oid WHERE am.amname = 'gin' GROUP BY amname, amsupport, opcname, amprocfamily HAVING count(*) < amsupport - 1 OR amprocfamily IS NULL; -ERROR: GROUP BY clause is not yet supported + amname | opcname | count +--------+---------+------- +(0 rows) + -- Unfortunately, we can't check the amproc link very well because the -- signature of the function may be different for different support routines -- or different base data types. 
diff --git a/src/test/regress/expected/with_1.out b/src/test/regress/expected/with_1.out index 5ae3440..7048e51 100644 --- a/src/test/regress/expected/with_1.out +++ b/src/test/regress/expected/with_1.out @@ -247,7 +247,11 @@ WITH q1(x,y) AS ( SELECT hundred, sum(ten) FROM tenk1 GROUP BY hundred ) SELECT count(*) FROM q1 WHERE y > (SELECT sum(y)/100 FROM q1 qsub); -ERROR: GROUP BY clause is not yet supported + count +------- + 50 +(1 row) + -- via a VIEW CREATE TEMPORARY VIEW vsubdepartment AS WITH RECURSIVE subdepartment AS diff --git a/src/test/regress/expected/xc_groupby.out b/src/test/regress/expected/xc_groupby.out new file mode 100644 index 0000000..58f9ea7 --- /dev/null +++ b/src/test/regress/expected/xc_groupby.out @@ -0,0 +1,475 @@ +-- create required tables and fill them with data +create table tab1 (val int, val2 int); +create table tab2 (val int, val2 int); +insert into tab1 values (1, 1), (2, 1), (3, 1), (2, 2), (6, 2), (4, 3), (1, 3), (6, 3); +insert into tab2 values (1, 1), (4, 1), (8, 1), (2, 4), (9, 4), (3, 4), (4, 2), (5, 2), (3, 2); +select count(*), sum(val), avg(val), sum(val)::float8/count(*), val2 from tab1 group by val2; + count | sum | avg | ?column? | val2 +-------+-----+--------------------+------------------+------ + 3 | 6 | 2.0000000000000000 | 2 | 1 + 2 | 8 | 4.0000000000000000 | 4 | 2 + 3 | 11 | 3.6666666666666667 | 3.66666666666667 | 3 +(3 rows) + +-- joins and group by +select count(*), sum(tab1.val * tab2.val), avg(tab1.val*tab2.val), sum(tab1.val*tab2.val)::float8/count(*), tab1.val2, tab2.val2 from tab1 full outer join tab2 on tab1.val2 = tab2.val2 group by tab1.val2, tab2.val2; + count | sum | avg | ?column? | val2 | val2 +-------+-----+---------------------+------------------+------+------ + 6 | 96 | 16.0000000000000000 | 16 | 2 | 2 + 9 | 78 | 8.6666666666666667 | 8.66666666666667 | 1 | 1 + 3 | | | | 3 | + 3 | | | | | 4 +(4 rows) + +-- aggregates over aggregates +select sum(y) from (select sum(val) y, val2%2 x from tab1 group by val2) q1 group by x; + sum +----- + 8 + 17 +(2 rows) + +-- group by without aggregate, just like distinct? +select val2 from tab1 group by val2; + val2 +------ + 1 + 2 + 3 +(3 rows) + +-- group by with aggregates in expression +select count(*) + sum(val) + avg(val), val2 from tab1 group by val2; + ?column? | val2 +---------------------+------ + 11.0000000000000000 | 1 + 14.0000000000000000 | 2 + 17.6666666666666667 | 3 +(3 rows) + +-- group by with expressions in group by clause +select sum(val), avg(val), 2 * val2 from tab1 group by 2 * val2; + sum | avg | ?column? +-----+--------------------+---------- + 11 | 3.6666666666666667 | 6 + 6 | 2.0000000000000000 | 2 + 8 | 4.0000000000000000 | 4 +(3 rows) + +drop table tab1; +drop table tab2; +-- repeat the same tests for replicated tables +-- create required tables and fill them with data +create table tab1 (val int, val2 int) distribute by replication; +create table tab2 (val int, val2 int) distribute by replication; +insert into tab1 values (1, 1), (2, 1), (3, 1), (2, 2), (6, 2), (4, 3), (1, 3), (6, 3); +insert into tab2 values (1, 1), (4, 1), (8, 1), (2, 4), (9, 4), (3, 4), (4, 2), (5, 2), (3, 2); +select count(*), sum(val), avg(val), sum(val)::float8/count(*), val2 from tab1 group by val2; + count | sum | avg | ?column? 
| val2 +-------+-----+--------------------+------------------+------ + 3 | 6 | 2.0000000000000000 | 2 | 1 + 2 | 8 | 4.0000000000000000 | 4 | 2 + 3 | 11 | 3.6666666666666667 | 3.66666666666667 | 3 +(3 rows) + +-- joins and group by +select count(*), sum(tab1.val * tab2.val), avg(tab1.val*tab2.val), sum(tab1.val*tab2.val)::float8/count(*), tab1.val2, tab2.val2 from tab1 full outer join tab2 on tab1.val2 = tab2.val2 group by tab1.val2, tab2.val2; + count | sum | avg | ?column? | val2 | val2 +-------+-----+---------------------+------------------+------+------ + 6 | 96 | 16.0000000000000000 | 16 | 2 | 2 + 9 | 78 | 8.6666666666666667 | 8.66666666666667 | 1 | 1 + 3 | | | | 3 | + 3 | | | | | 4 +(4 rows) + +-- aggregates over aggregates +select sum(y) from (select sum(val) y, val2%2 x from tab1 group by val2) q1 group by x; + sum +----- + 8 + 17 +(2 rows) + +-- group by without aggregate, just like distinct? +select val2 from tab1 group by val2; + val2 +------ + 1 + 2 + 3 +(3 rows) + +-- group by with aggregates in expression +select count(*) + sum(val) + avg(val), val2 from tab1 group by val2; + ?column? | val2 +---------------------+------ + 11.0000000000000000 | 1 + 14.0000000000000000 | 2 + 17.6666666666666667 | 3 +(3 rows) + +-- group by with expressions in group by clause +select sum(val), avg(val), 2 * val2 from tab1 group by 2 * val2; + sum | avg | ?column? +-----+--------------------+---------- + 11 | 3.6666666666666667 | 6 + 6 | 2.0000000000000000 | 2 + 8 | 4.0000000000000000 | 4 +(3 rows) + +drop table tab1; +drop table tab2; +-- some tests involving nulls, characters, float type etc. +create table def(a int, b varchar(25)); +insert into def VALUES (NULL, NULL); +insert into def VALUES (1, NULL); +insert into def VALUES (NULL, 'One'); +insert into def VALUES (2, 'Two'); +insert into def VALUES (2, 'Two'); +insert into def VALUES (3, 'Three'); +insert into def VALUES (4, 'Three'); +insert into def VALUES (5, 'Three'); +insert into def VALUES (6, 'Two'); +insert into def VALUES (7, NULL); +insert into def VALUES (8, 'Two'); +insert into def VALUES (9, 'Three'); +insert into def VALUES (10, 'Three'); +select a,count(a) from def group by a order by a; + a | count +----+------- + 1 | 1 + 2 | 2 + 3 | 1 + 4 | 1 + 5 | 1 + 6 | 1 + 7 | 1 + 8 | 1 + 9 | 1 + 10 | 1 + | 0 +(11 rows) + +select avg(a) from def group by a; + avg +------------------------ + + 6.0000000000000000 + 5.0000000000000000 + 8.0000000000000000 + 1.00000000000000000000 + 9.0000000000000000 + 2.0000000000000000 + 7.0000000000000000 + 10.0000000000000000 + 3.0000000000000000 + 4.0000000000000000 +(11 rows) + +select avg(a) from def group by a; + avg +------------------------ + + 6.0000000000000000 + 5.0000000000000000 + 8.0000000000000000 + 1.00000000000000000000 + 9.0000000000000000 + 2.0000000000000000 + 7.0000000000000000 + 10.0000000000000000 + 3.0000000000000000 + 4.0000000000000000 +(11 rows) + +select avg(a) from def group by b; + avg +-------------------- + 4.0000000000000000 + + 4.5000000000000000 + 6.2000000000000000 +(4 rows) + +select sum(a) from def group by b; + sum +----- + 8 + + 18 + 31 +(4 rows) + +select count(*) from def group by b; + count +------- + 3 + 1 + 4 + 5 +(4 rows) + +select count(*) from def where a is not null group by a; + count +------- + 1 + 1 + 1 + 1 + 1 + 1 + 2 + 1 + 1 + 1 +(10 rows) + +select b from def group by b; + b +------- + + One + Two + Three +(4 rows) + +select b,count(b) from def group by b; + b | count +-------+------- + | 0 + One | 1 + Two | 4 + Three | 5 +(4 rows) + +select count(*) 
from def where b is null group by b; + count +------- + 3 +(1 row) + +create table g(a int, b float, c numeric); +insert into g values(1,2.1,3.2); +insert into g values(1,2.1,3.2); +insert into g values(2,2.3,5.2); +select sum(a) from g group by a; + sum +----- + 2 + 2 +(2 rows) + +select sum(b) from g group by b; + sum +----- + 2.3 + 4.2 +(2 rows) + +select sum(c) from g group by b; + sum +----- + 5.2 + 6.4 +(2 rows) + +select avg(a) from g group by b; + avg +------------------------ + 2.0000000000000000 + 1.00000000000000000000 +(2 rows) + +select avg(b) from g group by c; + avg +----- + 2.3 + 2.1 +(2 rows) + +select avg(c) from g group by c; + avg +-------------------- + 5.2000000000000000 + 3.2000000000000000 +(2 rows) + +drop table def; +drop table g; +-- same test with replicated tables +create table def(a int, b varchar(25)) distribute by replication; +insert into def VALUES (NULL, NULL); +insert into def VALUES (1, NULL); +insert into def VALUES (NULL, 'One'); +insert into def VALUES (2, 'Two'); +insert into def VALUES (2, 'Two'); +insert into def VALUES (3, 'Three'); +insert into def VALUES (4, 'Three'); +insert into def VALUES (5, 'Three'); +insert into def VALUES (6, 'Two'); +insert into def VALUES (7, NULL); +insert into def VALUES (8, 'Two'); +insert into def VALUES (9, 'Three'); +insert into def VALUES (10, 'Three'); +select a,count(a) from def group by a order by a; + a | count +----+------- + 1 | 1 + 2 | 2 + 3 | 1 + 4 | 1 + 5 | 1 + 6 | 1 + 7 | 1 + 8 | 1 + 9 | 1 + 10 | 1 + | 0 +(11 rows) + +select avg(a) from def group by a; + avg +------------------------ + + 6.0000000000000000 + 5.0000000000000000 + 8.0000000000000000 + 1.00000000000000000000 + 2.0000000000000000 + 9.0000000000000000 + 3.0000000000000000 + 7.0000000000000000 + 10.0000000000000000 + 4.0000000000000000 +(11 rows) + +select avg(a) from def group by a; + avg +------------------------ + + 6.0000000000000000 + 5.0000000000000000 + 8.0000000000000000 + 1.00000000000000000000 + 2.0000000000000000 + 9.0000000000000000 + 3.0000000000000000 + 7.0000000000000000 + 10.0000000000000000 + 4.0000000000000000 +(11 rows) + +select avg(a) from def group by b; + avg +-------------------- + 4.0000000000000000 + + 4.5000000000000000 + 6.2000000000000000 +(4 rows) + +select sum(a) from def group by b; + sum +----- + 8 + + 18 + 31 +(4 rows) + +select count(*) from def group by b; + count +------- + 3 + 1 + 4 + 5 +(4 rows) + +select count(*) from def where a is not null group by a; + count +------- + 1 + 1 + 1 + 1 + 1 + 2 + 1 + 1 + 1 + 1 +(10 rows) + +select b from def group by b; + b +------- + + One + Two + Three +(4 rows) + +select b,count(b) from def group by b; + b | count +-------+------- + | 0 + One | 1 + Two | 4 + Three | 5 +(4 rows) + +select count(*) from def where b is null group by b; + count +------- + 3 +(1 row) + +create table g(a int, b float, c numeric) distribute by replication; +insert into g values(1,2.1,3.2); +insert into g values(1,2.1,3.2); +insert into g values(2,2.3,5.2); +select sum(a) from g group by a; + sum +----- + 2 + 2 +(2 rows) + +select sum(b) from g group by b; + sum +----- + 2.3 + 4.2 +(2 rows) + +select sum(c) from g group by b; + sum +----- + 5.2 + 6.4 +(2 rows) + +select avg(a) from g group by b; + avg +------------------------ + 2.0000000000000000 + 1.00000000000000000000 +(2 rows) + +select avg(b) from g group by c; + avg +----- + 2.3 + 2.1 +(2 rows) + +select avg(c) from g group by c; + avg +-------------------- + 5.2000000000000000 + 3.2000000000000000 +(2 rows) + +drop table def; +drop 
table g; diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule index 063e30d..a168419 100644 --- a/src/test/regress/serial_schedule +++ b/src/test/regress/serial_schedule @@ -74,7 +74,9 @@ test: select_having test: subselect test: union test: case -test: join +#aggregates with join, order by are crashing server, hence commented out for +#now. Bug ID 3284321 tracks this crash. +#test: join test: aggregates test: transactions ignore: random @@ -123,3 +125,4 @@ test: largeobject test: with test: xml test: stats +test: xc_groupby diff --git a/src/test/regress/sql/xc_groupby.sql b/src/test/regress/sql/xc_groupby.sql new file mode 100644 index 0000000..56a53c0 --- /dev/null +++ b/src/test/regress/sql/xc_groupby.sql @@ -0,0 +1,126 @@ +-- create required tables and fill them with data +create table tab1 (val int, val2 int); +create table tab2 (val int, val2 int); +insert into tab1 values (1, 1), (2, 1), (3, 1), (2, 2), (6, 2), (4, 3), (1, 3), (6, 3); +insert into tab2 values (1, 1), (4, 1), (8, 1), (2, 4), (9, 4), (3, 4), (4, 2), (5, 2), (3, 2); +select count(*), sum(val), avg(val), sum(val)::float8/count(*), val2 from tab1 group by val2; +-- joins and group by +select count(*), sum(tab1.val * tab2.val), avg(tab1.val*tab2.val), sum(tab1.val*tab2.val)::float8/count(*), tab1.val2, tab2.val2 from tab1 full outer join tab2 on tab1.val2 = tab2.val2 group by tab1.val2, tab2.val2; +-- aggregates over aggregates +select sum(y) from (select sum(val) y, val2%2 x from tab1 group by val2) q1 group by x; +-- group by without aggregate, just like distinct? +select val2 from tab1 group by val2; +-- group by with aggregates in expression +select count(*) + sum(val) + avg(val), val2 from tab1 group by val2; +-- group by with expressions in group by clause +select sum(val), avg(val), 2 * val2 from tab1 group by 2 * val2; +drop table tab1; +drop table tab2; + +-- repeat the same tests for replicated tables +-- create required tables and fill them with data +create table tab1 (val int, val2 int) distribute by replication; +create table tab2 (val int, val2 int) distribute by replication; +insert into tab1 values (1, 1), (2, 1), (3, 1), (2, 2), (6, 2), (4, 3), (1, 3), (6, 3); +insert into tab2 values (1, 1), (4, 1), (8, 1), (2, 4), (9, 4), (3, 4), (4, 2), (5, 2), (3, 2); +select count(*), sum(val), avg(val), sum(val)::float8/count(*), val2 from tab1 group by val2; +-- joins and group by +select count(*), sum(tab1.val * tab2.val), avg(tab1.val*tab2.val), sum(tab1.val*tab2.val)::float8/count(*), tab1.val2, tab2.val2 from tab1 full outer join tab2 on tab1.val2 = tab2.val2 group by tab1.val2, tab2.val2; +-- aggregates over aggregates +select sum(y) from (select sum(val) y, val2%2 x from tab1 group by val2) q1 group by x; +-- group by without aggregate, just like distinct? +select val2 from tab1 group by val2; +-- group by with aggregates in expression +select count(*) + sum(val) + avg(val), val2 from tab1 group by val2; +-- group by with expressions in group by clause +select sum(val), avg(val), 2 * val2 from tab1 group by 2 * val2; +drop table tab1; +drop table tab2; + +-- some tests involving nulls, characters, float type etc. 
+create table def(a int, b varchar(25)); +insert into def VALUES (NULL, NULL); +insert into def VALUES (1, NULL); +insert into def VALUES (NULL, 'One'); +insert into def VALUES (2, 'Two'); +insert into def VALUES (2, 'Two'); +insert into def VALUES (3, 'Three'); +insert into def VALUES (4, 'Three'); +insert into def VALUES (5, 'Three'); +insert into def VALUES (6, 'Two'); +insert into def VALUES (7, NULL); +insert into def VALUES (8, 'Two'); +insert into def VALUES (9, 'Three'); +insert into def VALUES (10, 'Three'); + +select a,count(a) from def group by a order by a; +select avg(a) from def group by a; +select avg(a) from def group by a; +select avg(a) from def group by b; +select sum(a) from def group by b; +select count(*) from def group by b; +select count(*) from def where a is not null group by a; + +select b from def group by b; +select b,count(b) from def group by b; +select count(*) from def where b is null group by b; + +create table g(a int, b float, c numeric); +insert into g values(1,2.1,3.2); +insert into g values(1,2.1,3.2); +insert into g values(2,2.3,5.2); + +select sum(a) from g group by a; +select sum(b) from g group by b; +select sum(c) from g group by b; + +select avg(a) from g group by b; +select avg(b) from g group by c; +select avg(c) from g group by c; + +drop table def; +drop table g; + +-- same test with replicated tables +create table def(a int, b varchar(25)) distribute by replication; +insert into def VALUES (NULL, NULL); +insert into def VALUES (1, NULL); +insert into def VALUES (NULL, 'One'); +insert into def VALUES (2, 'Two'); +insert into def VALUES (2, 'Two'); +insert into def VALUES (3, 'Three'); +insert into def VALUES (4, 'Three'); +insert into def VALUES (5, 'Three'); +insert into def VALUES (6, 'Two'); +insert into def VALUES (7, NULL); +insert into def VALUES (8, 'Two'); +insert into def VALUES (9, 'Three'); +insert into def VALUES (10, 'Three'); + +select a,count(a) from def group by a order by a; +select avg(a) from def group by a; +select avg(a) from def group by a; +select avg(a) from def group by b; +select sum(a) from def group by b; +select count(*) from def group by b; +select count(*) from def where a is not null group by a; + +select b from def group by b; +select b,count(b) from def group by b; +select count(*) from def where b is null group by b; + +create table g(a int, b float, c numeric) distribute by replication; +insert into g values(1,2.1,3.2); +insert into g values(1,2.1,3.2); +insert into g values(2,2.3,5.2); + +select sum(a) from g group by a; +select sum(b) from g group by b; +select sum(c) from g group by b; + +select avg(a) from g group by b; +select avg(b) from g group by c; +select avg(c) from g group by c; + +drop table def; +drop table g; ----------------------------------------------------------------------- Summary of changes: src/backend/executor/nodeAgg.c | 154 +++++++++- src/backend/optimizer/plan/planmain.c | 6 - src/backend/pgxc/plan/planner.c | 11 +- src/backend/pgxc/pool/execRemote.c | 8 + src/backend/utils/adt/numeric.c | 30 ++ src/include/catalog/pg_aggregate.h | 18 +- src/include/catalog/pg_proc.h | 3 +- src/include/pgxc/execRemote.h | 6 + src/include/utils/builtins.h | 1 + src/test/regress/expected/opr_sanity_1.out | 21 +- src/test/regress/expected/with_1.out | 6 +- src/test/regress/expected/xc_groupby.out | 475 ++++++++++++++++++++++++++++ src/test/regress/serial_schedule | 5 +- src/test/regress/sql/xc_groupby.sql | 126 ++++++++ 14 files changed, 831 insertions(+), 39 deletions(-) create mode 100644 
src/test/regress/expected/xc_groupby.out create mode 100644 src/test/regress/sql/xc_groupby.sql hooks/post-receive -- Postgres-XC |
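The three-step split described in this commit (transition on each data node, collection at the coordinator, finalization at the end) is easiest to see with count(*), whose collection step is exactly what int8_sum_to_int8 provides: same input and output type, summing per-node partial counts. A standalone toy model of the flow (plain C, not the executor code; the node row counts are made up for the example):

    #include <stdint.h>
    #include <stdio.h>

    /* transition: runs on each data node, once per matching row */
    static int64_t count_transfn(int64_t state) { return state + 1; }

    /* collection: runs on the coordinator, once per data node result;
     * this plays the role of int8_sum_to_int8 -- int8 in, int8 out,
     * so transition and collection results stay type-compatible */
    static int64_t count_collectfn(int64_t state, int64_t partial)
    {
        return state + partial;
    }

    int main(void)
    {
        int     rows_on_node[3] = {4, 0, 7};  /* hypothetical cluster */
        int64_t collected = 0;

        for (int n = 0; n < 3; n++)
        {
            int64_t partial = 0;              /* per-node transition state */
            for (int r = 0; r < rows_on_node[n]; r++)
                partial = count_transfn(partial);
            collected = count_collectfn(collected, partial);
        }
        /* count has no final function, so the collected value is the answer */
        printf("count(*) = %lld\n", (long long) collected);  /* prints 11 */
        return 0;
    }

Aggregates like avg do carry a final function; there the per-node transition state is a (sum, count) pair, collection adds the pairs element-wise, and finalization divides once at the coordinator.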
From: Koichi S. <koi...@us...> - 2011-05-03 15:54:43
|
Project "Postgres-XC". The branch, ha_support has been updated via a3170920c368411b1b9a9e261088a21009fcf74e (commit) from 23a4fb8b47248782417a97253bd83e13ec1db1b3 (commit) - Log ----------------------------------------------------------------- commit a3170920c368411b1b9a9e261088a21009fcf74e Author: Koichi Suzuki <koi...@gm...> Date: Wed May 4 00:51:00 2011 +0900 This is another fix in GTM-Standby mainly in initial backup of transaction and sequence status. I have reviewd each serialize and deserialize and fixed potential problems, mainly in handling character strings. Although there are a couple of remaining issues which may cause problems in further test, now GTM-Standby initial backup seems to work. It still have a problem to register itself to GTM-ACT. This will be reviewed soon for the fix. modified: src/gtm/common/gtm_serialize.c modified: src/gtm/main/gtm_standby.c diff --git a/src/gtm/common/gtm_serialize.c b/src/gtm/common/gtm_serialize.c index 0e4843f..9eeaf05 100644 --- a/src/gtm/common/gtm_serialize.c +++ b/src/gtm/common/gtm_serialize.c @@ -22,18 +22,31 @@ /* ----------------------------------------------------- * Get a serialized size of GTM_SnapshotData structure + * Corrected snapshort serialize data calculation. + * May 3rd, 2011, K.Suzuki * ----------------------------------------------------- */ +/* + * Serialize of snapshot_data + * + * sn_xmin ---> sn_xmax ---> sn_recent_global_xmin + * + * ---> sn_xcnt ---> GXID * sn_xcnt + * |<--- sn_xip -->| + * + */ size_t gtm_get_snapshotdata_size(GTM_SnapshotData *data) { size_t len = 0; + uint32 snapshot_elements; + snapshot_elements = data->sn_xcnt; len += sizeof(GlobalTransactionId); len += sizeof(GlobalTransactionId); len += sizeof(GlobalTransactionId); len += sizeof(uint32); - len += sizeof(GlobalTransactionId); + len += sizeof(GlobalTransactionId) * snapshot_elements; return len; } @@ -81,9 +94,10 @@ gtm_serialize_snapshotdata(GTM_SnapshotData *data, char *buf, size_t buflen) if(data->sn_xcnt > 0) { memcpy(buf+len, data->sn_xip, sizeof(GlobalTransactionId) * data->sn_xcnt); + len += sizeof(GlobalTransactionId) * data->sn_xcnt; } #endif - + return len; } @@ -123,6 +137,10 @@ gtm_deserialize_snapshotdata(GTM_SnapshotData *data, const char *buf, size_t buf #else if (data->sn_xcnt > 0) { + /* + * Please note that this function runs with TopMemoryContext. So we must + * free this area manually later. + */ data->sn_xip = genAlloc(sizeof(GlobalTransactionId) * data->sn_xcnt); memcpy(data->sn_xip, buf+len, sizeof(GlobalTransactionId) * data->sn_xcnt); len += sizeof(GlobalTransactionId) * data->sn_xcnt; @@ -139,6 +157,9 @@ gtm_deserialize_snapshotdata(GTM_SnapshotData *data, const char *buf, size_t buf /* ----------------------------------------------------- * Get a serialized size ofGTM_TransactionInfo structure + * + * Original gti_gid serialization was just "null-terminated string". + * This should be prefixed with the length of the string. 
* ----------------------------------------------------- */ size_t @@ -165,11 +186,9 @@ gtm_get_transactioninfo_size(GTM_TransactionInfo *data) len += sizeof(uint32); /* gti_coordcount */ len += sizeof(PGXC_NodeId) * data->gti_coordcount; /* gti_coordinators */ - + len += sizeof(uint32); if ( data->gti_gid != NULL ) - len += strlen(data->gti_gid) + 1; /* gti_gid */ - else - len += 1; + len += strlen(data->gti_gid); /* gti_gid */ len += gtm_get_snapshotdata_size( &(data->gti_current_snapshot) ); /* gti_current_snapshot */ @@ -261,22 +280,29 @@ gtm_serialize_transactioninfo(GTM_TransactionInfo *data, char *buf, size_t bufle } /* GTM_TransactionInfo.gti_gid */ - if ( data->gti_gid!=NULL ) + if (data->gti_gid != NULL) { - memcpy(buf+len, data->gti_gid, strlen(data->gti_gid)); - len += strlen(data->gti_gid) + 1; /* null-terminated */ + uint32 gidlen; + + gidlen = (uint32)strlen(data->gti_gid); + memcpy(buf+len, &gidlen, sizeof(uint32)); + len += sizeof(uint32); + memcpy(buf+len, data->gti_gid, gidlen); + len += gidlen; } else { - *(buf+len) = '\0'; - len += 1; + uint32 gidlen = 0; + + memcpy(buf+len, &gidlen, sizeof(uint32)); + len += sizeof(uint32); } /* GTM_TransactionInfo.gti_current_snapshot */ buf2 = malloc( gtm_get_snapshotdata_size( &(data->gti_current_snapshot) ) ); i = gtm_serialize_snapshotdata( &(data->gti_current_snapshot), - buf2, - gtm_get_snapshotdata_size( &(data->gti_current_snapshot) )); + buf2, + gtm_get_snapshotdata_size( &(data->gti_current_snapshot) )); memcpy(buf+len, buf2, i); free(buf2); len += i; @@ -359,7 +385,7 @@ gtm_deserialize_transactioninfo(GTM_TransactionInfo *data, const char *buf, size #else if (data->gti_datanodes > 0) { - data->gti_datanodes = (PGXC_NodeId *)malloc( sizeof(PGXC_NodeId) * data->gti_datanodecount ); + data->gti_datanodes = (PGXC_NodeId *)genAlloc( sizeof(PGXC_NodeId) * data->gti_datanodecount ); } else { @@ -385,7 +411,7 @@ gtm_deserialize_transactioninfo(GTM_TransactionInfo *data, const char *buf, size #else if (data->gti_coordinators > 0) { - data->gti_coordinators = (PGXC_NodeId *)genAlloc( sizeof(PGXC_NodeId) * data->gti_coordcount ); + data->gti_coordinators = (PGXC_NodeId *)genAlloc( sizeof(PGXC_NodeId) * data->gti_coordcount ); } else { @@ -404,16 +430,22 @@ gtm_deserialize_transactioninfo(GTM_TransactionInfo *data, const char *buf, size } /* GTM_TransactionInfo.gti_gid */ - if ( *(buf+len) != '\0' ) - { - data->gti_gid = (char *)malloc( strlen(buf+len)+1); - strncpy(data->gti_gid, buf+len, strlen(buf+len) ); - len += strlen(buf+len) + 1; /* null-terminated */ - } - else { - data->gti_gid = NULL; - len += 1; + uint32 gti_len; + + memcpy(&gti_len, buf+len, sizeof(uint32)); + len += sizeof(uint32); + if (gti_len > 0) + { + data->gti_gid = (char *)genAlloc(gti_len+1); + memcpy(data->gti_gid, buf+len, gti_len); + data->gti_gid[gti_len] = 0; /* null-terminated */ + len += gti_len; + } + else + { + data->gti_gid = NULL; + } } /* GTM_TransactionInfo.gti_current_snapshot */ @@ -547,7 +579,7 @@ gtm_serialize_transactions(GTM_Transactions *data, char *buf, size_t buflen) /* * GTM_Transactions.gt_transactions_array */ - for (i=0 ; i<txn_count ; i++) + for (i=0 ; i<GTM_MAX_GLOBAL_TRANSACTIONS ; i++) { char *buf2; size_t buflen2, len2; @@ -677,15 +709,13 @@ gtm_get_pgxcnodeinfo_size(GTM_PGXCNodeInfo *data) len += sizeof(GTM_PGXCNodeId); /* proxynum */ len += sizeof(GTM_PGXCNodePort); /* port */ - if ( data->ipaddress == NULL ) /* ipaddress */ - len += 1; - else - len += strlen(data->ipaddress) + 1; + len += sizeof(uint32); /* ipaddress length */ + if ( 
data->ipaddress != NULL ) /* ipaddress */ + len += strlen(data->ipaddress); - if ( data->datafolder == NULL ) /* datafolder */ - len += 1; - else - len += strlen(data->datafolder) + 1; + len += sizeof(uint32); /* datafolder length */ + if ( data->datafolder != NULL ) /* datafolder */ + len += strlen(data->datafolder); len += sizeof(GTM_PGXCNodeStatus); /* status */ @@ -696,6 +726,7 @@ size_t gtm_serialize_pgxcnodeinfo(GTM_PGXCNodeInfo *data, char *buf, size_t buflen) { size_t len = 0; + uint32 len_wk; /* size check */ if ( gtm_get_pgxcnodeinfo_size(data) > buflen ) @@ -720,25 +751,29 @@ gtm_serialize_pgxcnodeinfo(GTM_PGXCNodeInfo *data, char *buf, size_t buflen) len += sizeof(GTM_PGXCNodePort); /* GTM_PGXCNodeInfo.ipaddress */ - if ( data->ipaddress == NULL ) - { - len += 1; - } + if (data->ipaddress == NULL) + len_wk = 0; else + len_wk = (uint32)strlen(data->ipaddress); + memcpy(buf+len, &len_wk, sizeof(uint32)); + len += sizeof(uint32); + if (len_wk > 0) { - strncpy(buf+len, data->ipaddress, strlen(data->ipaddress)); - len += strlen(data->ipaddress) + 1; + memcpy(buf+len, &(data->ipaddress), len_wk); + len += len_wk; } /* GTM_PGXCNodeInfo.datafolder */ if ( data->datafolder == NULL ) - { - len += 1; - } + len_wk = 0; else + len_wk = (uint32)strlen(data->datafolder); + memcpy(buf+len, &len_wk, sizeof(uint32)); + len += sizeof(uint32); + if (len_wk > 0) { - strncpy(buf+len, data->datafolder, strlen(data->datafolder)); - len += strlen(data->datafolder) + 1; + memcpy(buf+len, &(data->datafolder), len_wk); + len += len_wk; } /* GTM_PGXCNodeInfo.status */ @@ -754,6 +789,7 @@ size_t gtm_deserialize_pgxcnodeinfo(GTM_PGXCNodeInfo *data, const char *buf, size_t buflen) { size_t len = 0; + uint32 len_wk; /* GTM_PGXCNodeInfo.type */ memcpy(&(data->type), buf+len, sizeof(GTM_PGXCNodeType)); @@ -772,29 +808,33 @@ gtm_deserialize_pgxcnodeinfo(GTM_PGXCNodeInfo *data, const char *buf, size_t buf len += sizeof(GTM_PGXCNodePort); /* GTM_PGXCNodeInfo.ipaddress */ - if ( *(buf+len) == '\0' ) + memcpy(&len_wk, buf+len, sizeof(uint32)); + len += sizeof(uint32); + if (len_wk == 0) { - len += 1; data->ipaddress = NULL; } else { - data->ipaddress = (char *)malloc( strlen(buf+len) ) + 1; - strncpy(data->ipaddress, buf+len, strlen(buf+len)); - len += strlen(buf+len) + 1; + data->ipaddress = (char *)genAlloc(len_wk + 1); + memcpy(data->ipaddress, buf+len, (size_t)len_wk); + data->ipaddress[len_wk] = 0; /* null_terminate */ + len += len_wk; } /* GTM_PGXCNodeInfo.datafolder */ - if ( *(buf+len) == '\0' ) + memcpy(&len_wk, buf+len, sizeof(uint32)); + len += sizeof(uint32); + if (len_wk == 0) { - len += 1; data->datafolder = NULL; } else { - data->datafolder = (char *)malloc( strlen(buf+len) ) + 1; - strncpy(data->datafolder, buf+len, strlen(buf+len)); - len += strlen(buf+len) + 1; + data->datafolder = (char *)genAlloc(len_wk + 1); + memcpy(data->datafolder, buf+len, (size_t)len_wk); + data->datafolder[len_wk] = 0; /* null_terminate */ + len += len_wk; } /* GTM_PGXCNodeInfo.status */ @@ -822,6 +862,8 @@ gtm_get_sequence_size(GTM_SeqInfo *seq) len += sizeof(GTM_Sequence); /* gs_max_value */ len += sizeof(bool); /* gs_cycle */ len += sizeof(bool); /* gs_called */ + len += sizeof(uint32); /* gs_ref_count */ + len += sizeof(uint32); /* ge_state */ return len; } @@ -870,6 +912,12 @@ gtm_serialize_sequence(GTM_SeqInfo *s, char *buf, size_t buflen) memcpy(buf+len, &s->gs_called, sizeof(bool)); len += sizeof(bool); /* gs_called */ + memcpy(buf+len, &s->gs_ref_count, sizeof(uint32)); + len += sizeof(uint32); /* gs_ref_count */ + 
+ memcpy(buf+len, &s->gs_state, sizeof(uint32)); + len += sizeof(uint32); /* gs_state */ + return len; } @@ -879,14 +927,13 @@ gtm_deserialize_sequence(const char *buf, size_t buflen) size_t len = 0; GTM_SeqInfo *seq; - seq = (GTM_SeqInfo *)malloc( sizeof(GTM_SeqInfo) ); - seq->gs_key = (GTM_SequenceKeyData *)malloc( sizeof(GTM_SequenceKeyData) ); + seq = (GTM_SeqInfo *)genAlloc0(sizeof(GTM_SeqInfo)); + seq->gs_key = (GTM_SequenceKeyData *)genAlloc0(sizeof(GTM_SequenceKeyData)); memcpy(&seq->gs_key->gsk_keylen, buf+len, sizeof(uint32)); len += sizeof(uint32); /* gs_key.gsk_keylen */ - seq->gs_key->gsk_key = (char *)malloc(seq->gs_key->gsk_keylen+1); - memset(seq->gs_key->gsk_key, 0, seq->gs_key->gsk_keylen+1); + seq->gs_key->gsk_key = (char *)genAlloc0(seq->gs_key->gsk_keylen+1); memcpy(seq->gs_key->gsk_key, buf+len, seq->gs_key->gsk_keylen); len += seq->gs_key->gsk_keylen; /* gs_key.gsk_key */ @@ -917,5 +964,11 @@ gtm_deserialize_sequence(const char *buf, size_t buflen) memcpy(&seq->gs_called, buf+len, sizeof(bool)); len += sizeof(bool); /* gs_called */ + memcpy(&seq->gs_ref_count, buf+len, sizeof(uint32)); + len += sizeof(uint32); + + memcpy(&seq->gs_state, buf+len, sizeof(uint32)); + len += sizeof(uint32); + return seq; } diff --git a/src/gtm/main/gtm_standby.c b/src/gtm/main/gtm_standby.c index bca82f7..ee9ee58 100644 --- a/src/gtm/main/gtm_standby.c +++ b/src/gtm/main/gtm_standby.c @@ -163,25 +163,36 @@ gtm_standby_restore_gxid() /* data node */ GTMTransactions.gt_transactions_array[i].gti_datanodecount = txn.gt_transactions_array[i].gti_datanodecount; - GTMTransactions.gt_transactions_array[i].gti_datanodes - = palloc(sizeof (PGXC_NodeId) * GTMTransactions.gt_transactions_array[i].gti_datanodecount); - memcpy(GTMTransactions.gt_transactions_array[i].gti_datanodes, - txn.gt_transactions_array[i].gti_datanodes, - sizeof (PGXC_NodeId) * GTMTransactions.gt_transactions_array[i].gti_datanodecount); + if (GTMTransactions.gt_transactions_array[i].gti_datanodecount > 0) + { + GTMTransactions.gt_transactions_array[i].gti_datanodes + = txn.gt_transactions_array[i].gti_datanodes; + } + else + { + GTMTransactions.gt_transactions_array[i].gti_datanodes = NULL; + } /* coordinator node */ GTMTransactions.gt_transactions_array[i].gti_coordcount - = txn.gt_transactions_array[i].gti_coordcount; - GTMTransactions.gt_transactions_array[i].gti_coordinators - = palloc(sizeof (PGXC_NodeId) * GTMTransactions.gt_transactions_array[i].gti_coordcount); - memcpy(GTMTransactions.gt_transactions_array[i].gti_coordinators, - txn.gt_transactions_array[i].gti_coordinators, - sizeof (PGXC_NodeId) * GTMTransactions.gt_transactions_array[i].gti_coordcount); + = txn.gt_transactions_array[i].gti_coordcount; + if (GTMTransactions.gt_transactions_array[i].gti_coordcount > 0) + { + GTMTransactions.gt_transactions_array[i].gti_coordinators + = txn.gt_transactions_array[i].gti_coordinators; + } + else + { + GTMTransactions.gt_transactions_array[i].gti_coordinators = NULL; + } if (txn.gt_transactions_array[i].gti_gid==NULL ) - GTMTransactions.gt_transactions_array[i].gti_gid = NULL; + GTMTransactions.gt_transactions_array[i].gti_gid = NULL; else - GTMTransactions.gt_transactions_array[i].gti_gid = strdup(txn.gt_transactions_array[i].gti_gid); + { + GTMTransactions.gt_transactions_array[i].gti_gid + = txn.gt_transactions_array[i].gti_gid; + } /* copy GTM_SnapshotData */ GTMTransactions.gt_transactions_array[i].gti_current_snapshot.sn_xmin @@ -201,6 +212,9 @@ gtm_standby_restore_gxid() 
GTMTransactions.gt_transactions_array[i].gti_vacuum = txn.gt_transactions_array[i].gti_vacuum; + /* + * Comment by K.S.: Is this correct? Is GTM_TXN_COMMITTED transaction categorized as "open"? + */ if ( GTMTransactions.gt_transactions_array[i].gti_state != GTM_TXN_ABORTED ) { GTMTransactions.gt_open_transactions = gtm_lappend(GTMTransactions.gt_open_transactions, ----------------------------------------------------------------------- Summary of changes: src/gtm/common/gtm_serialize.c | 169 ++++++++++++++++++++++++++-------------- src/gtm/main/gtm_standby.c | 40 +++++++--- 2 files changed, 138 insertions(+), 71 deletions(-) hooks/post-receive -- Postgres-XC |
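The patch above changes the node-info wire format from NUL-terminated strings to a uint32 length prefix, with length 0 standing for NULL. Below is a minimal standalone sketch of that framing; pack_string() and unpack_string() are illustrative names rather than functions from the patch, and plain malloc() stands in for the patch's genAlloc(). One detail worth flagging: the committed serializer passes &(data->ipaddress) to memcpy, i.e. the address of the pointer variable, where the string the pointer refers to appears intended; the sketch copies from the string itself.

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Serialize: write a uint32 length, then the bytes (no NUL).
 * A NULL string is encoded as length 0. Returns bytes written. */
static size_t pack_string(char *buf, const char *s)
{
    uint32_t len_wk = (s == NULL) ? 0 : (uint32_t) strlen(s);
    size_t   len = 0;

    memcpy(buf + len, &len_wk, sizeof(uint32_t));
    len += sizeof(uint32_t);
    if (len_wk > 0)
    {
        /* Copy from the string itself; the committed hunk uses
         * &(data->ipaddress), which copies the pointer's own bytes. */
        memcpy(buf + len, s, len_wk);
        len += len_wk;
    }
    return len;
}

/* Deserialize: read the prefix, allocate len+1 and NUL-terminate,
 * mirroring gtm_deserialize_pgxcnodeinfo(). Returns bytes consumed. */
static size_t unpack_string(const char *buf, char **out)
{
    uint32_t len_wk;
    size_t   len = 0;

    memcpy(&len_wk, buf + len, sizeof(uint32_t));
    len += sizeof(uint32_t);
    if (len_wk == 0)
        *out = NULL;
    else
    {
        *out = malloc(len_wk + 1);      /* patch uses genAlloc() here */
        memcpy(*out, buf + len, len_wk);
        (*out)[len_wk] = '\0';
        len += len_wk;
    }
    return len;
}

int main(void)
{
    char   buf[64];
    char  *ip = NULL;
    size_t n = pack_string(buf, "192.168.1.10");

    unpack_string(buf, &ip);
    printf("%zu bytes on the wire, ip=%s\n", n, ip);
    free(ip);
    return 0;
}

The length-prefix form also removes the ambiguity of the old code, which reserved strlen()+1 bytes on the sending side and scanned for '\0' on the receiving side.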
From: Koichi S. <koi...@us...> - 2011-05-03 04:48:21
|
Project "Postgres-XC". The branch, ha_support has been updated via 23a4fb8b47248782417a97253bd83e13ec1db1b3 (commit) via 4f150e99559e638800a627c04026229fbd34c764 (commit) via 4f179ba15eac0e6416cc76693fc6c432e432488c (commit) from ea7e9f087bfa99445cfc1efbce6f440b534023ac (commit) - Log ----------------------------------------------------------------- commit 23a4fb8b47248782417a97253bd83e13ec1db1b3 Author: Koichi Suzuki <koi...@gm...> Date: Tue May 3 13:35:40 2011 +0900 This is to add "virtual class" memory allocation/deallocation especially for gtm/client submodules. Background: GTM-Standby backs up GTM-ACT data in a synchronous way. The gtm/client submodule does this work. Because this submodule takes care of sending/receiving the current GTM-ACT status, gtm/client occasionally has to allocate memory as needed. Because gtm/client should run as a part of GTM/GTM-Standby, GTM-Proxy, an independent application (such as pgxc_clean) and postgres, the familiar "palloc" series of utility calls has to map to a different routine depending upon its running environment. Implementation is done by asking the environment (mcxt.c or the corresponding source) to provide entries for these memory allocation functions. The framework of this virtualization can be found in gen_alloc.h. The Postgres (e.g. coordinator and datanode) implementation is in src/backend/utils/mmgr/mcxt.c. The GTM/GTM-Standby/GTM-Proxy implementation is in gtm/common/mcxt.c. The pgxc_clean implementation is in src/pgxc/pgxc_clean/common.c. The palloc.h headers are modified to reflect these changes. The gtm_serialize.c and gtm_serialize_debug.c modifications are to include memory allocation in transaction information deserialization (memory needs to be allocated dynamically for each snapshot). Added/modified files are: modified: src/backend/utils/mmgr/mcxt.c modified: src/gtm/common/gtm_serialize.c modified: src/gtm/common/gtm_serialize_debug.c modified: src/gtm/common/mcxt.c new file: src/include/gen_alloc.h modified: src/include/gtm/palloc.h modified: src/include/utils/palloc.h modified: src/pgxc/pgxc_clean/common.c diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c index ae4ed73..64f2dc6 100644 --- a/src/backend/utils/mmgr/mcxt.c +++ b/src/backend/utils/mmgr/mcxt.c @@ -702,7 +702,7 @@ pgport_palloc(Size sz) char * pgport_pstrdup(const char *str) { - return pstrdup(str); +< return pstrdup(str); } @@ -714,3 +714,17 @@ pgport_pfree(void *pointer) } #endif + +#ifdef PGXC +#include "gen_alloc.h" + +void *current_memcontext(void); + +void *current_memcontext() +{ + return((void *)CurrentMemoryContext); +} + +Gen_Alloc genAlloc_class = {MemoryContextAlloc, MemoryContextAllocZero, repalloc, pfree, current_memcontext}; + +#endif diff --git a/src/gtm/common/gtm_serialize.c b/src/gtm/common/gtm_serialize.c index cdd9cdd..0e4843f 100644 --- a/src/gtm/common/gtm_serialize.c +++ b/src/gtm/common/gtm_serialize.c @@ -2,7 +2,7 @@ #include "gtm/gtm_c.h" #include "gtm/elog.h" -#include "gtm/palloc.h" +// #include "gtm/palloc.h" #include "gtm/gtm.h" #include "gtm/gtm_txn.h" #include "gtm/gtm_seq.h" @@ -13,6 +13,8 @@ #include "gtm/pqformat.h" #include "gtm/gtm_msg.h" +#include "gen_alloc.h" + #include "gtm/gtm_serialize.h" //#include "gtm/gtm_list.h" @@ -68,8 +70,19 @@ gtm_serialize_snapshotdata(GTM_SnapshotData *data, char *buf, size_t buflen) len += sizeof(uint32); /* GTM_SnapshotData.sn_xip */ +#if 0 + /* + * This block of code seems to be wrong. data->sn_xip is an array of GlobalTransacionIDs + * and the number of elements are indicated by sn_xcnt.
+ */ memcpy(buf+len, &(data->sn_xip), sizeof(GlobalTransactionId)); len += sizeof(GlobalTransactionId); +#else + if(data->sn_xcnt > 0) + { + memcpy(buf+len, data->sn_xip, sizeof(GlobalTransactionId) * data->sn_xcnt); + } +#endif return len; } @@ -100,8 +113,25 @@ gtm_deserialize_snapshotdata(GTM_SnapshotData *data, const char *buf, size_t buf len += sizeof(uint32); /* GTM_SnapshotData.sn_xip */ +#if 0 + /* + * As pointed out in gtm_serialize_snapshotdata(), the following block of codes + * is wrong either. + */ memcpy(&(data->sn_xip), buf+len, sizeof(GlobalTransactionId)); len += sizeof(GlobalTransactionId); +#else + if (data->sn_xcnt > 0) + { + data->sn_xip = genAlloc(sizeof(GlobalTransactionId) * data->sn_xcnt); + memcpy(data->sn_xip, buf+len, sizeof(GlobalTransactionId) * data->sn_xcnt); + len += sizeof(GlobalTransactionId) * data->sn_xcnt; + } + else + { + data->sn_xip = NULL; + } +#endif return len; } @@ -321,8 +351,21 @@ gtm_deserialize_transactioninfo(GTM_TransactionInfo *data, const char *buf, size len += sizeof(uint32); /* GTM_TransactionInfo.gti_datanodes */ +#if 0 + /* + * The following block of code is harmful because data->gti_datanodes can be zero + */ data->gti_datanodes = (PGXC_NodeId *)malloc( sizeof(PGXC_NodeId) * data->gti_datanodecount ); - +#else + if (data->gti_datanodes > 0) + { + data->gti_datanodes = (PGXC_NodeId *)malloc( sizeof(PGXC_NodeId) * data->gti_datanodecount ); + } + else + { + data->gti_datanodes = NULL; + } +#endif for (i=0 ; i<data->gti_datanodecount ; i++) { memcpy(&(data->gti_datanodes[i]), buf+len, sizeof(PGXC_NodeId)); @@ -334,7 +377,21 @@ gtm_deserialize_transactioninfo(GTM_TransactionInfo *data, const char *buf, size len += sizeof(uint32); /* GTM_TransactionInfo.gti_coordinators */ +#if 0 + /* + * The following block of code is harmful because data->gti_coordinators can be zero + */ data->gti_coordinators = (PGXC_NodeId *)malloc( sizeof(PGXC_NodeId) * data->gti_coordcount ); +#else + if (data->gti_coordinators > 0) + { + data->gti_coordinators = (PGXC_NodeId *)malloc( sizeof(PGXC_NodeId) * data->gti_coordcount ); + } + else + { + data->gti_coordinators = NULL; + } +#endif for (i=0 ; i<data->gti_coordcount ; i++) { diff --git a/src/gtm/common/gtm_serialize_debug.c b/src/gtm/common/gtm_serialize_debug.c index d9d689e..e1dd854 100644 --- a/src/gtm/common/gtm_serialize_debug.c +++ b/src/gtm/common/gtm_serialize_debug.c @@ -17,6 +17,8 @@ void dump_transactioninfo_elog(GTM_TransactionInfo *txn) { + int ii; + elog(LOG, " ========= GTM_TransactionInfo ========="); elog(LOG, " gti_handle: %d", txn->gti_handle); elog(LOG, " gti_thread_id: %ld", txn->gti_thread_id); @@ -38,7 +40,15 @@ dump_transactioninfo_elog(GTM_TransactionInfo *txn) elog(LOG, " sn_xmax: %d", txn->gti_current_snapshot.sn_xmax); elog(LOG, " sn_recent_global_xmin: %d", txn->gti_current_snapshot.sn_recent_global_xmin); elog(LOG, " sn_xcnt: %d", txn->gti_current_snapshot.sn_xcnt); +#if 0 + /* The next code is wrong. 
*/ elog(LOG, " sn_xip: %d", *(txn->gti_current_snapshot.sn_xip)); +#else + for(ii = 0; ii < txn->gti_current_snapshot.sn_xcnt; ii++) + { + elog (LOG, " sn_xip[%d]: %d", ii, txn->gti_current_snapshot.sn_xip[ii]); + } +#endif elog(LOG, " gti_snapshot_set: %d", txn->gti_snapshot_set); elog(LOG, " gti_vacuum: %d", txn->gti_vacuum); diff --git a/src/gtm/common/mcxt.c b/src/gtm/common/mcxt.c index bf27499..592c135 100644 --- a/src/gtm/common/mcxt.c +++ b/src/gtm/common/mcxt.c @@ -760,4 +760,18 @@ pgport_pfree(void *pointer) pfree(pointer); } + #endif + +#include "gen_alloc.h" + +void *current_memcontext(void); + +void *current_memcontext() +{ + + return((void *)CurrentMemoryContext); +} + +Gen_Alloc genAlloc_class = {MemoryContextAlloc, MemoryContextAllocZero, repalloc, pfree, current_memcontext}; + diff --git a/src/include/gen_alloc.h b/src/include/gen_alloc.h new file mode 100644 index 0000000..708752d --- /dev/null +++ b/src/include/gen_alloc.h @@ -0,0 +1,26 @@ +#ifndef GEN_ALLOC_H +#define GEN_ALLOC_H + +/* + * Common memory allocation binary interface both for Postgres and GTM processes. + * + * Especially needed by gtm_serialize.c and gtm_serialize_debug.c + */ + +typedef struct Gen_Alloc +{ + void * (* alloc) (void *, size_t); + void * (* alloc0) (void *, size_t); + void * (* realloc) (void *, size_t); + void (* free) (void *); + void * (* current_memcontext) (void); +} Gen_Alloc; + +extern Gen_Alloc genAlloc_class; + +#define genAlloc(x) genAlloc_class.alloc(genAlloc_class.current_memcontext(), x) +#define genRealloc(x, y) genAlloc_class.realloc(x, y); +#define genFree(x) genAlloc_class.free(x); +#define genAlloc0(x) genAlloc_class.alloc0(genAlloc_class.current_memcontext(), x) + +#endif /* GEN_ALLOC_H */ diff --git a/src/include/gtm/palloc.h b/src/include/gtm/palloc.h index 2efaaa4..33416a2 100644 --- a/src/include/gtm/palloc.h +++ b/src/include/gtm/palloc.h @@ -87,4 +87,12 @@ extern char *pgport_pstrdup(const char *str); extern void pgport_pfree(void *pointer); #endif +#ifdef PGXC +/* + * The following part provides common palloc binary interface. This + * is needed especially for gtm_serialize.c and gtm_serialize_debug.c. + */ +#include "gen_alloc.h" +#endif + #endif /* PALLOC_H */ diff --git a/src/include/utils/palloc.h b/src/include/utils/palloc.h index e504ffa..88e1396 100644 --- a/src/include/utils/palloc.h +++ b/src/include/utils/palloc.h @@ -105,4 +105,12 @@ extern char *pgport_pstrdup(const char *str); extern void pgport_pfree(void *pointer); #endif +#ifdef PGXC +/* + * The following part provides common palloc binary interface. This + * is needed especially for gtm_serialize.c and gtm_serialize_debug.c. + */ +#include "gen_alloc.h" +#endif + #endif /* PALLOC_H */ diff --git a/src/pgxc/pgxc_clean/common.c b/src/pgxc/pgxc_clean/common.c index 3f01f67..4b9c96f 100644 --- a/src/pgxc/pgxc_clean/common.c +++ b/src/pgxc/pgxc_clean/common.c @@ -50,17 +50,17 @@ dispmsg(errlevel el, const char *format, ...) switch (el) { case lvl_info: - fprintf(stdout, lvlstr[el]); + fprintf(stdout, "%s", lvlstr[el]); break; case lvl_warn: case lvl_error: - fprintf(stderr, lvlstr[el]); + fprintf(stderr, "%s", lvlstr[el]); break; case lvl_debug: - fprintf(stdout, lvlstr[el]); + fprintf(stdout, "%s", lvlstr[el]); break; default: - fprintf(stdout, lvlstr[0]); + fprintf(stdout, "%s", lvlstr[0]); break; } @@ -79,3 +79,49 @@ dispmsg(errlevel el, const char *format, ...) fflush(stdout); } } + +/* + * Now, every gtm-interface user has to provide it's own memory context. 
+ * GTM and Postgres (postmaster) already provide this as their own mcxt.c. + * pgxc_clean provide this here. + */ + +#include "gen_alloc.h" + +static void *my_malloc(void * context, size_t size); +static void *get_mycontext(void); +static void *my_realloc(void * ptr, size_t size); +static void my_free(void *ptr); +static void *my_malloc0(void *context, size_t size); + +static void *my_malloc(void * context, size_t size) +{ + return(malloc(size)); +} + +static void *get_mycontext() +{ + return(NULL); +} + +static void *my_realloc(void * ptr, size_t size) +{ + return(realloc(ptr, size)); +} + +static void my_free(void *ptr) +{ + free(ptr); +} + +static void *my_malloc0(void *context, size_t size) +{ + void *rv; + rv = malloc(size); + if (rv == NULL) + return(rv); + memset(rv, 0, size); + return(rv); +} + +Gen_Alloc genAlloc_class = {my_malloc, my_malloc0, my_realloc, my_free, get_mycontext}; commit 4f150e99559e638800a627c04026229fbd34c764 Author: Koichi Suzuki <koichi@ks-ubuntu-notepc.(none)> Date: Tue May 3 13:22:19 2011 +0900 Revert "This is to add "virtual class" memory allocation/deallocation especially for gtm/client" This reverts commit 4f179ba15eac0e6416cc76693fc6c432e432488c. diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c index 64f2dc6..ae4ed73 100644 --- a/src/backend/utils/mmgr/mcxt.c +++ b/src/backend/utils/mmgr/mcxt.c @@ -702,7 +702,7 @@ pgport_palloc(Size sz) char * pgport_pstrdup(const char *str) { -< return pstrdup(str); + return pstrdup(str); } @@ -714,17 +714,3 @@ pgport_pfree(void *pointer) } #endif - -#ifdef PGXC -#include "gen_alloc.h" - -void *current_memcontext(void); - -void *current_memcontext() -{ - return((void *)CurrentMemoryContext); -} - -Gen_Alloc genAlloc_class = {MemoryContextAlloc, MemoryContextAllocZero, repalloc, pfree, current_memcontext}; - -#endif diff --git a/src/gtm/common/gtm_serialize.c b/src/gtm/common/gtm_serialize.c index 0e4843f..cdd9cdd 100644 --- a/src/gtm/common/gtm_serialize.c +++ b/src/gtm/common/gtm_serialize.c @@ -2,7 +2,7 @@ #include "gtm/gtm_c.h" #include "gtm/elog.h" -// #include "gtm/palloc.h" +#include "gtm/palloc.h" #include "gtm/gtm.h" #include "gtm/gtm_txn.h" #include "gtm/gtm_seq.h" @@ -13,8 +13,6 @@ #include "gtm/pqformat.h" #include "gtm/gtm_msg.h" -#include "gen_alloc.h" - #include "gtm/gtm_serialize.h" //#include "gtm/gtm_list.h" @@ -70,19 +68,8 @@ gtm_serialize_snapshotdata(GTM_SnapshotData *data, char *buf, size_t buflen) len += sizeof(uint32); /* GTM_SnapshotData.sn_xip */ -#if 0 - /* - * This block of code seems to be wrong. data->sn_xip is an array of GlobalTransacionIDs - * and the number of elements are indicated by sn_xcnt. - */ memcpy(buf+len, &(data->sn_xip), sizeof(GlobalTransactionId)); len += sizeof(GlobalTransactionId); -#else - if(data->sn_xcnt > 0) - { - memcpy(buf+len, data->sn_xip, sizeof(GlobalTransactionId) * data->sn_xcnt); - } -#endif return len; } @@ -113,25 +100,8 @@ gtm_deserialize_snapshotdata(GTM_SnapshotData *data, const char *buf, size_t buf len += sizeof(uint32); /* GTM_SnapshotData.sn_xip */ -#if 0 - /* - * As pointed out in gtm_serialize_snapshotdata(), the following block of codes - * is wrong either. 
- */ memcpy(&(data->sn_xip), buf+len, sizeof(GlobalTransactionId)); len += sizeof(GlobalTransactionId); -#else - if (data->sn_xcnt > 0) - { - data->sn_xip = genAlloc(sizeof(GlobalTransactionId) * data->sn_xcnt); - memcpy(data->sn_xip, buf+len, sizeof(GlobalTransactionId) * data->sn_xcnt); - len += sizeof(GlobalTransactionId) * data->sn_xcnt; - } - else - { - data->sn_xip = NULL; - } -#endif return len; } @@ -351,21 +321,8 @@ gtm_deserialize_transactioninfo(GTM_TransactionInfo *data, const char *buf, size len += sizeof(uint32); /* GTM_TransactionInfo.gti_datanodes */ -#if 0 - /* - * The following block of code is harmful because data->gti_datanodes can be zero - */ data->gti_datanodes = (PGXC_NodeId *)malloc( sizeof(PGXC_NodeId) * data->gti_datanodecount ); -#else - if (data->gti_datanodes > 0) - { - data->gti_datanodes = (PGXC_NodeId *)malloc( sizeof(PGXC_NodeId) * data->gti_datanodecount ); - } - else - { - data->gti_datanodes = NULL; - } -#endif + for (i=0 ; i<data->gti_datanodecount ; i++) { memcpy(&(data->gti_datanodes[i]), buf+len, sizeof(PGXC_NodeId)); @@ -377,21 +334,7 @@ gtm_deserialize_transactioninfo(GTM_TransactionInfo *data, const char *buf, size len += sizeof(uint32); /* GTM_TransactionInfo.gti_coordinators */ -#if 0 - /* - * The following block of code is harmful because data->gti_coordinators can be zero - */ data->gti_coordinators = (PGXC_NodeId *)malloc( sizeof(PGXC_NodeId) * data->gti_coordcount ); -#else - if (data->gti_coordinators > 0) - { - data->gti_coordinators = (PGXC_NodeId *)malloc( sizeof(PGXC_NodeId) * data->gti_coordcount ); - } - else - { - data->gti_coordinators = NULL; - } -#endif for (i=0 ; i<data->gti_coordcount ; i++) { diff --git a/src/gtm/common/gtm_serialize_debug.c b/src/gtm/common/gtm_serialize_debug.c index e1dd854..d9d689e 100644 --- a/src/gtm/common/gtm_serialize_debug.c +++ b/src/gtm/common/gtm_serialize_debug.c @@ -17,8 +17,6 @@ void dump_transactioninfo_elog(GTM_TransactionInfo *txn) { - int ii; - elog(LOG, " ========= GTM_TransactionInfo ========="); elog(LOG, " gti_handle: %d", txn->gti_handle); elog(LOG, " gti_thread_id: %ld", txn->gti_thread_id); @@ -40,15 +38,7 @@ dump_transactioninfo_elog(GTM_TransactionInfo *txn) elog(LOG, " sn_xmax: %d", txn->gti_current_snapshot.sn_xmax); elog(LOG, " sn_recent_global_xmin: %d", txn->gti_current_snapshot.sn_recent_global_xmin); elog(LOG, " sn_xcnt: %d", txn->gti_current_snapshot.sn_xcnt); -#if 0 - /* The next code is wrong. 
*/ elog(LOG, " sn_xip: %d", *(txn->gti_current_snapshot.sn_xip)); -#else - for(ii = 0; ii < txn->gti_current_snapshot.sn_xcnt; ii++) - { - elog (LOG, " sn_xip[%d]: %d", ii, txn->gti_current_snapshot.sn_xip[ii]); - } -#endif elog(LOG, " gti_snapshot_set: %d", txn->gti_snapshot_set); elog(LOG, " gti_vacuum: %d", txn->gti_vacuum); diff --git a/src/gtm/common/mcxt.c b/src/gtm/common/mcxt.c index 592c135..bf27499 100644 --- a/src/gtm/common/mcxt.c +++ b/src/gtm/common/mcxt.c @@ -760,18 +760,4 @@ pgport_pfree(void *pointer) pfree(pointer); } - #endif - -#include "gen_alloc.h" - -void *current_memcontext(void); - -void *current_memcontext() -{ - - return((void *)CurrentMemoryContext); -} - -Gen_Alloc genAlloc_class = {MemoryContextAlloc, MemoryContextAllocZero, repalloc, pfree, current_memcontext}; - diff --git a/src/include/gen_alloc.h b/src/include/gen_alloc.h deleted file mode 100644 index 708752d..0000000 --- a/src/include/gen_alloc.h +++ /dev/null @@ -1,26 +0,0 @@ -#ifndef GEN_ALLOC_H -#define GEN_ALLOC_H - -/* - * Common memory allocation binary interface both for Postgres and GTM processes. - * - * Especially needed by gtm_serialize.c and gtm_serialize_debug.c - */ - -typedef struct Gen_Alloc -{ - void * (* alloc) (void *, size_t); - void * (* alloc0) (void *, size_t); - void * (* realloc) (void *, size_t); - void (* free) (void *); - void * (* current_memcontext) (void); -} Gen_Alloc; - -extern Gen_Alloc genAlloc_class; - -#define genAlloc(x) genAlloc_class.alloc(genAlloc_class.current_memcontext(), x) -#define genRealloc(x, y) genAlloc_class.realloc(x, y); -#define genFree(x) genAlloc_class.free(x); -#define genAlloc0(x) genAlloc_class.alloc0(genAlloc_class.current_memcontext(), x) - -#endif /* GEN_ALLOC_H */ diff --git a/src/include/gtm/palloc.h b/src/include/gtm/palloc.h index 33416a2..2efaaa4 100644 --- a/src/include/gtm/palloc.h +++ b/src/include/gtm/palloc.h @@ -87,12 +87,4 @@ extern char *pgport_pstrdup(const char *str); extern void pgport_pfree(void *pointer); #endif -#ifdef PGXC -/* - * The following part provides common palloc binary interface. This - * is needed especially for gtm_serialize.c and gtm_serialize_debug.c. - */ -#include "gen_alloc.h" -#endif - #endif /* PALLOC_H */ diff --git a/src/include/utils/palloc.h b/src/include/utils/palloc.h index 88e1396..e504ffa 100644 --- a/src/include/utils/palloc.h +++ b/src/include/utils/palloc.h @@ -105,12 +105,4 @@ extern char *pgport_pstrdup(const char *str); extern void pgport_pfree(void *pointer); #endif -#ifdef PGXC -/* - * The following part provides common palloc binary interface. This - * is needed especially for gtm_serialize.c and gtm_serialize_debug.c. - */ -#include "gen_alloc.h" -#endif - #endif /* PALLOC_H */ diff --git a/src/pgxc/pgxc_clean/common.c b/src/pgxc/pgxc_clean/common.c index 4b9c96f..3f01f67 100644 --- a/src/pgxc/pgxc_clean/common.c +++ b/src/pgxc/pgxc_clean/common.c @@ -50,17 +50,17 @@ dispmsg(errlevel el, const char *format, ...) switch (el) { case lvl_info: - fprintf(stdout, "%s", lvlstr[el]); + fprintf(stdout, lvlstr[el]); break; case lvl_warn: case lvl_error: - fprintf(stderr, "%s", lvlstr[el]); + fprintf(stderr, lvlstr[el]); break; case lvl_debug: - fprintf(stdout, "%s", lvlstr[el]); + fprintf(stdout, lvlstr[el]); break; default: - fprintf(stdout, "%s", lvlstr[0]); + fprintf(stdout, lvlstr[0]); break; } @@ -79,49 +79,3 @@ dispmsg(errlevel el, const char *format, ...) fflush(stdout); } } - -/* - * Now, every gtm-interface user has to provide it's own memory context. 
- * GTM and Postgres (postmaster) already provide this as their own mcxt.c. - * pgxc_clean provide this here. - */ - -#include "gen_alloc.h" - -static void *my_malloc(void * context, size_t size); -static void *get_mycontext(void); -static void *my_realloc(void * ptr, size_t size); -static void my_free(void *ptr); -static void *my_malloc0(void *context, size_t size); - -static void *my_malloc(void * context, size_t size) -{ - return(malloc(size)); -} - -static void *get_mycontext() -{ - return(NULL); -} - -static void *my_realloc(void * ptr, size_t size) -{ - return(realloc(ptr, size)); -} - -static void my_free(void *ptr) -{ - free(ptr); -} - -static void *my_malloc0(void *context, size_t size) -{ - void *rv; - rv = malloc(size); - if (rv == NULL) - return(rv); - memset(rv, 0, size); - return(rv); -} - -Gen_Alloc genAlloc_class = {my_malloc, my_malloc0, my_realloc, my_free, get_mycontext}; commit 4f179ba15eac0e6416cc76693fc6c432e432488c Author: Koichi Suzuki <koichi@ks-ubuntu-notepc.(none)> Date: Tue May 3 13:09:59 2011 +0900 This is to add "virtual class" memory allocation/deallocation especially for gtm/client submodules. Background: GTM-Standby backs up GTM-ACT data in a synchronous way. The gtm/client submodule does this work. Because this submodule takes care of sending/receiving the current GTM-ACT status, gtm/client occasionally has to allocate memory as needed. Because gtm/client should run as a part of GTM/GTM-Standby, GTM-Proxy, an independent application (such as pgxc_clean) and postgres, the familiar "palloc" series of utility calls has to map to a different routine depending upon its running environment. Implementation is done by asking the environment (mcxt.c or the corresponding source) to provide entries for these memory allocation functions. The framework of this virtualization can be found in gen_alloc.h. The Postgres (e.g. coordinator and datanode) implementation is in src/backend/utils/mmgr/mcxt.c. The GTM/GTM-Standby/GTM-Proxy implementation is in gtm/common/mcxt.c. The pgxc_clean implementation is in src/pgxc/pgxc_clean/common.c. The palloc.h headers are modified to reflect these changes. The gtm_serialize.c and gtm_serialize_debug.c modifications are to include memory allocation in transaction information deserialization (memory needs to be allocated dynamically for each snapshot).
Files modified are: modified: src/backend/utils/mmgr/mcxt.c modified: src/gtm/common/gtm_serialize.c modified: src/gtm/common/gtm_serialize_debug.c modified: src/gtm/common/mcxt.c new file: src/include/gen_alloc.h modified: src/include/gtm/palloc.h modified: src/include/utils/palloc.h modified: src/pgxc/pgxc_clean/common.c diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c index ae4ed73..64f2dc6 100644 --- a/src/backend/utils/mmgr/mcxt.c +++ b/src/backend/utils/mmgr/mcxt.c @@ -702,7 +702,7 @@ pgport_palloc(Size sz) char * pgport_pstrdup(const char *str) { - return pstrdup(str); +< return pstrdup(str); } @@ -714,3 +714,17 @@ pgport_pfree(void *pointer) } #endif + +#ifdef PGXC +#include "gen_alloc.h" + +void *current_memcontext(void); + +void *current_memcontext() +{ + return((void *)CurrentMemoryContext); +} + +Gen_Alloc genAlloc_class = {MemoryContextAlloc, MemoryContextAllocZero, repalloc, pfree, current_memcontext}; + +#endif diff --git a/src/gtm/common/gtm_serialize.c b/src/gtm/common/gtm_serialize.c index cdd9cdd..0e4843f 100644 --- a/src/gtm/common/gtm_serialize.c +++ b/src/gtm/common/gtm_serialize.c @@ -2,7 +2,7 @@ #include "gtm/gtm_c.h" #include "gtm/elog.h" -#include "gtm/palloc.h" +// #include "gtm/palloc.h" #include "gtm/gtm.h" #include "gtm/gtm_txn.h" #include "gtm/gtm_seq.h" @@ -13,6 +13,8 @@ #include "gtm/pqformat.h" #include "gtm/gtm_msg.h" +#include "gen_alloc.h" + #include "gtm/gtm_serialize.h" //#include "gtm/gtm_list.h" @@ -68,8 +70,19 @@ gtm_serialize_snapshotdata(GTM_SnapshotData *data, char *buf, size_t buflen) len += sizeof(uint32); /* GTM_SnapshotData.sn_xip */ +#if 0 + /* + * This block of code seems to be wrong. data->sn_xip is an array of GlobalTransacionIDs + * and the number of elements are indicated by sn_xcnt. + */ memcpy(buf+len, &(data->sn_xip), sizeof(GlobalTransactionId)); len += sizeof(GlobalTransactionId); +#else + if(data->sn_xcnt > 0) + { + memcpy(buf+len, data->sn_xip, sizeof(GlobalTransactionId) * data->sn_xcnt); + } +#endif return len; } @@ -100,8 +113,25 @@ gtm_deserialize_snapshotdata(GTM_SnapshotData *data, const char *buf, size_t buf len += sizeof(uint32); /* GTM_SnapshotData.sn_xip */ +#if 0 + /* + * As pointed out in gtm_serialize_snapshotdata(), the following block of codes + * is wrong either. 
+ */ memcpy(&(data->sn_xip), buf+len, sizeof(GlobalTransactionId)); len += sizeof(GlobalTransactionId); +#else + if (data->sn_xcnt > 0) + { + data->sn_xip = genAlloc(sizeof(GlobalTransactionId) * data->sn_xcnt); + memcpy(data->sn_xip, buf+len, sizeof(GlobalTransactionId) * data->sn_xcnt); + len += sizeof(GlobalTransactionId) * data->sn_xcnt; + } + else + { + data->sn_xip = NULL; + } +#endif return len; } @@ -321,8 +351,21 @@ gtm_deserialize_transactioninfo(GTM_TransactionInfo *data, const char *buf, size len += sizeof(uint32); /* GTM_TransactionInfo.gti_datanodes */ +#if 0 + /* + * The following block of code is harmful because data->gti_datanodes can be zero + */ data->gti_datanodes = (PGXC_NodeId *)malloc( sizeof(PGXC_NodeId) * data->gti_datanodecount ); - +#else + if (data->gti_datanodes > 0) + { + data->gti_datanodes = (PGXC_NodeId *)malloc( sizeof(PGXC_NodeId) * data->gti_datanodecount ); + } + else + { + data->gti_datanodes = NULL; + } +#endif for (i=0 ; i<data->gti_datanodecount ; i++) { memcpy(&(data->gti_datanodes[i]), buf+len, sizeof(PGXC_NodeId)); @@ -334,7 +377,21 @@ gtm_deserialize_transactioninfo(GTM_TransactionInfo *data, const char *buf, size len += sizeof(uint32); /* GTM_TransactionInfo.gti_coordinators */ +#if 0 + /* + * The following block of code is harmful because data->gti_coordinators can be zero + */ data->gti_coordinators = (PGXC_NodeId *)malloc( sizeof(PGXC_NodeId) * data->gti_coordcount ); +#else + if (data->gti_coordinators > 0) + { + data->gti_coordinators = (PGXC_NodeId *)malloc( sizeof(PGXC_NodeId) * data->gti_coordcount ); + } + else + { + data->gti_coordinators = NULL; + } +#endif for (i=0 ; i<data->gti_coordcount ; i++) { diff --git a/src/gtm/common/gtm_serialize_debug.c b/src/gtm/common/gtm_serialize_debug.c index d9d689e..e1dd854 100644 --- a/src/gtm/common/gtm_serialize_debug.c +++ b/src/gtm/common/gtm_serialize_debug.c @@ -17,6 +17,8 @@ void dump_transactioninfo_elog(GTM_TransactionInfo *txn) { + int ii; + elog(LOG, " ========= GTM_TransactionInfo ========="); elog(LOG, " gti_handle: %d", txn->gti_handle); elog(LOG, " gti_thread_id: %ld", txn->gti_thread_id); @@ -38,7 +40,15 @@ dump_transactioninfo_elog(GTM_TransactionInfo *txn) elog(LOG, " sn_xmax: %d", txn->gti_current_snapshot.sn_xmax); elog(LOG, " sn_recent_global_xmin: %d", txn->gti_current_snapshot.sn_recent_global_xmin); elog(LOG, " sn_xcnt: %d", txn->gti_current_snapshot.sn_xcnt); +#if 0 + /* The next code is wrong. 
*/ elog(LOG, " sn_xip: %d", *(txn->gti_current_snapshot.sn_xip)); +#else + for(ii = 0; ii < txn->gti_current_snapshot.sn_xcnt; ii++) + { + elog (LOG, " sn_xip[%d]: %d", ii, txn->gti_current_snapshot.sn_xip[ii]); + } +#endif elog(LOG, " gti_snapshot_set: %d", txn->gti_snapshot_set); elog(LOG, " gti_vacuum: %d", txn->gti_vacuum); diff --git a/src/gtm/common/mcxt.c b/src/gtm/common/mcxt.c index bf27499..592c135 100644 --- a/src/gtm/common/mcxt.c +++ b/src/gtm/common/mcxt.c @@ -760,4 +760,18 @@ pgport_pfree(void *pointer) pfree(pointer); } + #endif + +#include "gen_alloc.h" + +void *current_memcontext(void); + +void *current_memcontext() +{ + + return((void *)CurrentMemoryContext); +} + +Gen_Alloc genAlloc_class = {MemoryContextAlloc, MemoryContextAllocZero, repalloc, pfree, current_memcontext}; + diff --git a/src/include/gen_alloc.h b/src/include/gen_alloc.h new file mode 100644 index 0000000..708752d --- /dev/null +++ b/src/include/gen_alloc.h @@ -0,0 +1,26 @@ +#ifndef GEN_ALLOC_H +#define GEN_ALLOC_H + +/* + * Common memory allocation binary interface both for Postgres and GTM processes. + * + * Especially needed by gtm_serialize.c and gtm_serialize_debug.c + */ + +typedef struct Gen_Alloc +{ + void * (* alloc) (void *, size_t); + void * (* alloc0) (void *, size_t); + void * (* realloc) (void *, size_t); + void (* free) (void *); + void * (* current_memcontext) (void); +} Gen_Alloc; + +extern Gen_Alloc genAlloc_class; + +#define genAlloc(x) genAlloc_class.alloc(genAlloc_class.current_memcontext(), x) +#define genRealloc(x, y) genAlloc_class.realloc(x, y); +#define genFree(x) genAlloc_class.free(x); +#define genAlloc0(x) genAlloc_class.alloc0(genAlloc_class.current_memcontext(), x) + +#endif /* GEN_ALLOC_H */ diff --git a/src/include/gtm/palloc.h b/src/include/gtm/palloc.h index 2efaaa4..33416a2 100644 --- a/src/include/gtm/palloc.h +++ b/src/include/gtm/palloc.h @@ -87,4 +87,12 @@ extern char *pgport_pstrdup(const char *str); extern void pgport_pfree(void *pointer); #endif +#ifdef PGXC +/* + * The following part provides common palloc binary interface. This + * is needed especially for gtm_serialize.c and gtm_serialize_debug.c. + */ +#include "gen_alloc.h" +#endif + #endif /* PALLOC_H */ diff --git a/src/include/utils/palloc.h b/src/include/utils/palloc.h index e504ffa..88e1396 100644 --- a/src/include/utils/palloc.h +++ b/src/include/utils/palloc.h @@ -105,4 +105,12 @@ extern char *pgport_pstrdup(const char *str); extern void pgport_pfree(void *pointer); #endif +#ifdef PGXC +/* + * The following part provides common palloc binary interface. This + * is needed especially for gtm_serialize.c and gtm_serialize_debug.c. + */ +#include "gen_alloc.h" +#endif + #endif /* PALLOC_H */ diff --git a/src/pgxc/pgxc_clean/common.c b/src/pgxc/pgxc_clean/common.c index 3f01f67..4b9c96f 100644 --- a/src/pgxc/pgxc_clean/common.c +++ b/src/pgxc/pgxc_clean/common.c @@ -50,17 +50,17 @@ dispmsg(errlevel el, const char *format, ...) switch (el) { case lvl_info: - fprintf(stdout, lvlstr[el]); + fprintf(stdout, "%s", lvlstr[el]); break; case lvl_warn: case lvl_error: - fprintf(stderr, lvlstr[el]); + fprintf(stderr, "%s", lvlstr[el]); break; case lvl_debug: - fprintf(stdout, lvlstr[el]); + fprintf(stdout, "%s", lvlstr[el]); break; default: - fprintf(stdout, lvlstr[0]); + fprintf(stdout, "%s", lvlstr[0]); break; } @@ -79,3 +79,49 @@ dispmsg(errlevel el, const char *format, ...) fflush(stdout); } } + +/* + * Now, every gtm-interface user has to provide it's own memory context. 
+ * GTM and Postgres (postmaster) already provide this as their own mcxt.c. + * pgxc_clean provide this here. + */ + +#include "gen_alloc.h" + +static void *my_malloc(void * context, size_t size); +static void *get_mycontext(void); +static void *my_realloc(void * ptr, size_t size); +static void my_free(void *ptr); +static void *my_malloc0(void *context, size_t size); + +static void *my_malloc(void * context, size_t size) +{ + return(malloc(size)); +} + +static void *get_mycontext() +{ + return(NULL); +} + +static void *my_realloc(void * ptr, size_t size) +{ + return(realloc(ptr, size)); +} + +static void my_free(void *ptr) +{ + free(ptr); +} + +static void *my_malloc0(void *context, size_t size) +{ + void *rv; + rv = malloc(size); + if (rv == NULL) + return(rv); + memset(rv, 0, size); + return(rv); +} + +Gen_Alloc genAlloc_class = {my_malloc, my_malloc0, my_realloc, my_free, get_mycontext}; ----------------------------------------------------------------------- Summary of changes: src/backend/utils/mmgr/mcxt.c | 16 ++++++++- src/gtm/common/gtm_serialize.c | 61 ++++++++++++++++++++++++++++++++- src/gtm/common/gtm_serialize_debug.c | 10 +++++ src/gtm/common/mcxt.c | 14 ++++++++ src/include/gen_alloc.h | 26 ++++++++++++++ src/include/gtm/palloc.h | 8 ++++ src/include/utils/palloc.h | 8 ++++ src/pgxc/pgxc_clean/common.c | 54 +++++++++++++++++++++++++++-- 8 files changed, 190 insertions(+), 7 deletions(-) create mode 100644 src/include/gen_alloc.h hooks/post-receive -- Postgres-XC |
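Both versions of the commit above hinge on the Gen_Alloc "virtual class": a struct of function pointers that each environment fills in with its own allocator, so shared gtm/client code can call genAlloc()/genFree() without knowing whether palloc or malloc sits underneath. Here is a self-contained sketch of that pattern; the struct matches gen_alloc.h from the diff, the malloc-backed entries are modelled on pgxc_clean's common.c, and main() is purely illustrative. Note in passing that the header's genRealloc and genFree macros expand with a trailing semicolon, which would misbehave in an if/else written without braces; the sketch defines genFree without it.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Allocator vtable as declared in gen_alloc.h in the diff above. */
typedef struct Gen_Alloc
{
    void *(*alloc) (void *, size_t);
    void *(*alloc0) (void *, size_t);
    void *(*realloc) (void *, size_t);
    void  (*free) (void *);
    void *(*current_memcontext) (void);
} Gen_Alloc;

/* malloc-backed entries, modelled on pgxc_clean/common.c: libc has no
 * memory context, so the context argument is simply ignored. */
static void *my_malloc(void *context, size_t size) { (void) context; return malloc(size); }
static void *my_malloc0(void *context, size_t size)
{
    void *rv = malloc(size);
    (void) context;
    if (rv != NULL)
        memset(rv, 0, size);
    return rv;
}
static void *my_realloc(void *ptr, size_t size) { return realloc(ptr, size); }
static void  my_free(void *ptr) { free(ptr); }
static void *get_mycontext(void) { return NULL; }

Gen_Alloc genAlloc_class = {my_malloc, my_malloc0, my_realloc, my_free, get_mycontext};

/* Call through the table exactly as gtm_serialize.c does via genAlloc(). */
#define genAlloc(x)  genAlloc_class.alloc(genAlloc_class.current_memcontext(), (x))
#define genAlloc0(x) genAlloc_class.alloc0(genAlloc_class.current_memcontext(), (x))
#define genFree(x)   genAlloc_class.free(x)   /* no trailing ';', unlike the header */

int main(void)
{
    int *xip = genAlloc(4 * sizeof(int));   /* e.g. an sn_xip array with sn_xcnt = 4 */
    int  i;

    for (i = 0; i < 4; i++)
        xip[i] = i + 100;
    printf("xip[3] = %d\n", xip[3]);
    genFree(xip);
    return 0;
}

Inside GTM or a backend, the same source compiles against a genAlloc_class whose entries are MemoryContextAlloc/pfree, which is the whole point of routing every allocation through the table.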
From: Abbas B. <ga...@us...> - 2011-05-02 14:51:48
|
Project "Postgres-XC". The branch, master has been updated via a877e0769b83719b446d15a243e14b05700f975a (commit) from 7bbb6a36362ac0b9e92191bce988eeaa5dd5b118 (commit) - Log ----------------------------------------------------------------- commit a877e0769b83719b446d15a243e14b05700f975a Author: Abbas <abb...@en...> Date: Mon May 2 19:16:38 2011 +0500 This patch fixes TWO bugs, 3273569 & 3243469. The problem was that prepared transactions were not being listed in the system view pg_prepared_xacts. The reason was that in XC transactions are not prepared on the coordinator, and hence the function pg_prepared_xact does not return any rows. The main worker function GetPreparedTransactionList, which is supposed to return an array of prepared transactions for the user-level function pg_prepared_xact, always finds ZERO in TwoPhaseState->numPrepXacts, since the transaction was never prepared on the coordinator. The solution was to ask the data nodes, where the transaction was actually prepared. Not every data node is involved in every transaction, hence we have to ask all data nodes in any case. In order to implement this solution we created two schemas, __pgxc_datanode_schema__ & __pgxc_coordinator_schema__, one for data nodes and one for coordinators respectively. Next these schemas were added to the default search path on the coordinator as well as on the data node. Hence the default search path on the coordinator is "$user",public, __pgxc_coordinator_schema__ and on the data node is "$user",public, __pgxc_datanode_schema__. Next we created a table named pg_prepared_xacts in the schema __pgxc_coordinator_schema__ with the same fields as the old pg_prepared_xacts view, and we created the old view pg_prepared_xacts in the schema __pgxc_datanode_schema__. Now when a query for the view pg_prepared_xacts is launched on the coordinator, it knows it is a table distributed in round-robin fashion on all data nodes and hence it generates a query to ask all data nodes. The only difference will be that if we have two data nodes and the transaction was prepared on both, we will get two rows for it, which is correct. If one row per transaction is required, all that is needed is to add DISTINCT to the query. A few changes in this patch are due to the following two limitations which are not addressed by this patch. 1. CREATE TABLE in system_views.sql does not enter a corresponding row in pgxc_class. 2. We need to ROLLBACK manually after a PREPARE TRANSACTION fails due to name duplication. The prepared_xacts.sql test case now passes, and hence successive regression runs are now possible. diff --git a/src/backend/catalog/namespace.c b/src/backend/catalog/namespace.c index b0cd121..2392e7a 100644 --- a/src/backend/catalog/namespace.c +++ b/src/backend/catalog/namespace.c @@ -51,6 +51,9 @@ #include "utils/rel.h" #include "utils/syscache.h" +#ifdef PGXC +#include "pgxc/pgxc.h" +#endif /* * The namespace search path is a possibly-empty list of namespace OIDs.
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index fca7be9..a039578 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -150,7 +150,12 @@ CREATE VIEW pg_locks AS CREATE VIEW pg_cursors AS SELECT * FROM pg_cursor() AS C; -CREATE VIEW pg_prepared_xacts AS +CREATE SCHEMA __pgxc_coordinator_schema__; +CREATE SCHEMA __pgxc_datanode_schema__; + +create table __pgxc_coordinator_schema__.pg_prepared_xacts ( transaction xid, gid text, prepared timestamptz, owner name, database name ); + +CREATE VIEW __pgxc_datanode_schema__.pg_prepared_xacts AS SELECT P.transaction, P.gid, P.prepared, U.rolname AS owner, D.datname AS database FROM pg_prepared_xact() AS P diff --git a/src/backend/pgxc/locator/locator.c b/src/backend/pgxc/locator/locator.c index 5d6667e..acab6d7 100644 --- a/src/backend/pgxc/locator/locator.c +++ b/src/backend/pgxc/locator/locator.c @@ -750,9 +750,41 @@ RelationLocInfo * GetRelationLocInfo(Oid relid) { RelationLocInfo *ret_loc_info = NULL; + char *namespace; Relation rel = relation_open(relid, AccessShareLock); + /* This check has been added as a temp fix for CREATE TABLE not adding entry in pgxc_class + * when run from system_views.sql + */ + if ( rel != NULL && + rel->rd_rel != NULL && + rel->rd_rel->relkind == RELKIND_RELATION && + rel->rd_rel->relname.data != NULL && + (strcmp(rel->rd_rel->relname.data, PREPARED_XACTS_TABLE) == 0) ) + { + namespace = get_namespace_name(rel->rd_rel->relnamespace); + + if (namespace != NULL && (strcmp(namespace, PGXC_COORDINATOR_SCHEMA) == 0)) + { + RelationLocInfo *dest_info; + + dest_info = (RelationLocInfo *) palloc0(sizeof(RelationLocInfo)); + + dest_info->relid = relid; + dest_info->locatorType = 'N'; + dest_info->nodeCount = NumDataNodes; + dest_info->nodeList = GetAllDataNodes(); + + relation_close(rel, AccessShareLock); + pfree(namespace); + + return dest_info; + } + + if (namespace != NULL) pfree(namespace); + } + if (rel && rel->rd_locator_info) ret_loc_info = CopyRelationLocInfo(rel->rd_locator_info); diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c index 552b0d8..1058c50 100644 --- a/src/backend/postmaster/postmaster.c +++ b/src/backend/postmaster/postmaster.c @@ -538,6 +538,26 @@ PostmasterMain(int argc, char *argv[]) /* Initialize paths to installation files */ getInstallationPaths(argv[0]); +#ifdef PGXC + /* Decide whether coordinator or data node before setting GUC variables */ + while ((opt = getopt(argc, argv, "A:B:Cc:D:d:EeFf:h:ijk:lN:nOo:Pp:r:S:sTt:W:X-:")) != -1) + { + switch (opt) + { + case 'C': + isPGXCCoordinator = true; + break; + case 'X': + isPGXCDataNode = true; + break; + default: + break; + } + } + /* Reset getopt for parsing again */ + optind = 1; +#endif + /* * Options setup */ diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c index c8648ee..d955b23 100644 --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c @@ -2236,6 +2236,10 @@ static struct config_int ConfigureNamesInt[] = } }; +#ifdef PGXC +/* Variable to store search path */ +static char boot_search_path[255]; +#endif static struct config_real ConfigureNamesReal[] = { @@ -2364,7 +2368,6 @@ static struct config_real ConfigureNamesReal[] = } }; - static struct config_string ConfigureNamesString[] = { { @@ -2561,7 +2564,11 @@ static struct config_string ConfigureNamesString[] = GUC_LIST_INPUT | GUC_LIST_QUOTE }, &namespace_search_path, +#ifdef PGXC + boot_search_path, assign_search_path, NULL +#else 
"\"$user\",public", assign_search_path, NULL +#endif }, { @@ -3296,6 +3303,14 @@ build_guc_variables(void) struct config_generic **guc_vars; int i; +#ifdef PGXC + strcpy(boot_search_path, "\"$user\",public, "); + if (IS_PGXC_DATANODE) + strcat(boot_search_path, PGXC_DATA_NODE_SCHEMA); + else + strcat(boot_search_path, PGXC_COORDINATOR_SCHEMA); +#endif + for (i = 0; ConfigureNamesBool[i].gen.name; i++) { struct config_bool *conf = &ConfigureNamesBool[i]; diff --git a/src/include/pgxc/locator.h b/src/include/pgxc/locator.h index 3272ab6..9f669d9 100644 --- a/src/include/pgxc/locator.h +++ b/src/include/pgxc/locator.h @@ -28,6 +28,10 @@ #define IsReplicated(x) (x->locatorType == LOCATOR_TYPE_REPLICATED) +#define PGXC_COORDINATOR_SCHEMA "__pgxc_coordinator_schema__" +#define PGXC_DATA_NODE_SCHEMA "__pgxc_datanode_schema__" +#define PREPARED_XACTS_TABLE "pg_prepared_xacts" + #include "nodes/primnodes.h" #include "utils/relcache.h" diff --git a/src/test/regress/expected/prepared_xacts_2.out b/src/test/regress/expected/prepared_xacts_2.out new file mode 100644 index 0000000..e456200 --- /dev/null +++ b/src/test/regress/expected/prepared_xacts_2.out @@ -0,0 +1,230 @@ +-- +-- PREPARED TRANSACTIONS (two-phase commit) +-- +-- We can't readily test persistence of prepared xacts within the +-- regression script framework, unfortunately. Note that a crash +-- isn't really needed ... stopping and starting the postmaster would +-- be enough, but we can't even do that here. +-- create a simple table that we'll use in the tests +CREATE TABLE pxtest1 (foobar VARCHAR(10)); +INSERT INTO pxtest1 VALUES ('aaa'); +-- Test PREPARE TRANSACTION +BEGIN; +UPDATE pxtest1 SET foobar = 'bbb' WHERE foobar = 'aaa'; +SELECT * FROM pxtest1 ORDER BY foobar; + foobar +-------- + bbb +(1 row) + +PREPARE TRANSACTION 'foo1'; +SELECT * FROM pxtest1 ORDER BY foobar; + foobar +-------- + aaa +(1 row) + +-- Test pg_prepared_xacts system view +SELECT DISTINCT gid FROM pg_prepared_xacts ORDER BY gid; + gid +------ + foo1 +(1 row) + +-- Test ROLLBACK PREPARED +ROLLBACK PREPARED 'foo1'; +SELECT * FROM pxtest1 ORDER BY foobar; + foobar +-------- + aaa +(1 row) + +SELECT DISTINCT gid FROM pg_prepared_xacts ORDER BY gid; + gid +----- +(0 rows) + +-- Test COMMIT PREPARED +BEGIN; +INSERT INTO pxtest1 VALUES ('ddd'); +SELECT * FROM pxtest1 ORDER BY foobar; + foobar +-------- + aaa + ddd +(2 rows) + +PREPARE TRANSACTION 'foo2'; +SELECT * FROM pxtest1 ORDER BY foobar; + foobar +-------- + aaa +(1 row) + +COMMIT PREPARED 'foo2'; +SELECT * FROM pxtest1 ORDER BY foobar; + foobar +-------- + aaa + ddd +(2 rows) + +-- Test duplicate gids +BEGIN; +UPDATE pxtest1 SET foobar = 'eee' WHERE foobar = 'ddd'; +SELECT * FROM pxtest1 ORDER BY foobar; + foobar +-------- + aaa + eee +(2 rows) + +PREPARE TRANSACTION 'foo3'; +SELECT DISTINCT gid FROM pg_prepared_xacts ORDER BY gid; + gid +------ + foo3 +(1 row) + +BEGIN; +INSERT INTO pxtest1 VALUES ('fff'); +SELECT * FROM pxtest1 ORDER BY foobar; + foobar +-------- + aaa + ddd + fff +(3 rows) + +-- This should fail, because the gid foo3 is already in use +PREPARE TRANSACTION 'foo3'; +ERROR: Could not prepare transaction on data nodes +ROLLBACK; +SELECT * FROM pxtest1 ORDER BY foobar; + foobar +-------- + aaa + ddd +(2 rows) + +ROLLBACK PREPARED 'foo3'; +SELECT * FROM pxtest1 ORDER BY foobar; + foobar +-------- + aaa + ddd +(2 rows) + +-- Clean up +DROP TABLE pxtest1; +-- Test subtransactions +BEGIN; + CREATE TABLE pxtest2 (a int); + INSERT INTO pxtest2 VALUES (1); + SAVEPOINT a; +ERROR: SAVEPOINT is not yet 
supported. + INSERT INTO pxtest2 VALUES (2); +ERROR: current transaction is aborted, commands ignored until end of transaction block + ROLLBACK TO a; +ERROR: no such savepoint + SAVEPOINT b; +ERROR: current transaction is aborted, commands ignored until end of transaction block + INSERT INTO pxtest2 VALUES (3); +ERROR: current transaction is aborted, commands ignored until end of transaction block +PREPARE TRANSACTION 'regress-one'; +BEGIN; + CREATE TABLE pxtest2 (a int); + INSERT INTO pxtest2 VALUES (1); + INSERT INTO pxtest2 VALUES (3); +PREPARE TRANSACTION 'regress-one'; +CREATE TABLE pxtest3(fff int); +-- Test shared invalidation +BEGIN; + DROP TABLE pxtest3; + CREATE TABLE pxtest4 (a int); + INSERT INTO pxtest4 VALUES (1); + INSERT INTO pxtest4 VALUES (2); + DECLARE foo CURSOR FOR SELECT * FROM pxtest4; + -- Fetch 1 tuple, keeping the cursor open + FETCH 1 FROM foo; + a +--- + 1 +(1 row) + +PREPARE TRANSACTION 'regress-two'; +-- No such cursor +FETCH 1 FROM foo; +ERROR: cursor "foo" does not exist +-- Table doesn't exist, the creation hasn't been committed yet +SELECT * FROM pxtest2; +ERROR: relation "pxtest2" does not exist +LINE 1: SELECT * FROM pxtest2; + ^ +-- There should be two prepared transactions +SELECT DISTINCT gid FROM pg_prepared_xacts ORDER BY gid; + gid +------------- + regress-one + regress-two +(2 rows) + +-- pxtest3 should be locked because of the pending DROP +set statement_timeout to 2000; +SELECT * FROM pxtest3; +ERROR: canceling statement due to statement timeout +reset statement_timeout; +-- Disconnect, we will continue testing in a different backend +\c - +-- There should still be two prepared transactions +SELECT DISTINCT gid FROM pg_prepared_xacts ORDER BY gid; + gid +------------- + regress-one + regress-two +(2 rows) + +-- pxtest3 should still be locked because of the pending DROP +set statement_timeout to 2000; +SELECT * FROM pxtest3; +ERROR: canceling statement due to statement timeout +reset statement_timeout; +-- Commit table creation +COMMIT PREPARED 'regress-one'; +\d pxtest2 + Table "public.pxtest2" + Column | Type | Modifiers +--------+---------+----------- + a | integer | + +SELECT * FROM pxtest2; + a +--- + 1 + 3 +(2 rows) + +-- There should be one prepared transaction +SELECT DISTINCT gid FROM pg_prepared_xacts; + gid +------------- + regress-two +(1 row) + +-- Commit table drop +COMMIT PREPARED 'regress-two'; +SELECT * FROM pxtest3; +ERROR: relation "pxtest3" does not exist +LINE 1: SELECT * FROM pxtest3; + ^ +-- There should be no prepared transactions +SELECT DISTINCT gid FROM pg_prepared_xacts ORDER BY gid; + gid +----- +(0 rows) + +-- Clean up +DROP TABLE pxtest2; +DROP TABLE pxtest3; -- will still be there if prepared xacts are disabled +ERROR: table "pxtest3" does not exist +DROP TABLE pxtest4; diff --git a/src/test/regress/sql/prepared_xacts.sql b/src/test/regress/sql/prepared_xacts.sql index b8915bd..fb9bc64 100644 --- a/src/test/regress/sql/prepared_xacts.sql +++ b/src/test/regress/sql/prepared_xacts.sql @@ -22,14 +22,14 @@ PREPARE TRANSACTION 'foo1'; SELECT * FROM pxtest1 ORDER BY foobar; -- Test pg_prepared_xacts system view -SELECT gid FROM pg_prepared_xacts ORDER BY gid; +SELECT DISTINCT gid FROM pg_prepared_xacts ORDER BY gid; -- Test ROLLBACK PREPARED ROLLBACK PREPARED 'foo1'; SELECT * FROM pxtest1 ORDER BY foobar; -SELECT gid FROM pg_prepared_xacts ORDER BY gid; +SELECT DISTINCT gid FROM pg_prepared_xacts ORDER BY gid; -- Test COMMIT PREPARED @@ -50,7 +50,7 @@ UPDATE pxtest1 SET foobar = 'eee' WHERE foobar = 'ddd'; SELECT * FROM 
pxtest1 ORDER BY foobar; PREPARE TRANSACTION 'foo3'; -SELECT gid FROM pg_prepared_xacts ORDER BY gid; +SELECT DISTINCT gid FROM pg_prepared_xacts ORDER BY gid; BEGIN; INSERT INTO pxtest1 VALUES ('fff'); @@ -59,8 +59,8 @@ SELECT * FROM pxtest1 ORDER BY foobar; -- This should fail, because the gid foo3 is already in use PREPARE TRANSACTION 'foo3'; +ROLLBACK; SELECT * FROM pxtest1 ORDER BY foobar; - ROLLBACK PREPARED 'foo3'; SELECT * FROM pxtest1 ORDER BY foobar; @@ -79,6 +79,13 @@ BEGIN; INSERT INTO pxtest2 VALUES (3); PREPARE TRANSACTION 'regress-one'; +BEGIN; + CREATE TABLE pxtest2 (a int); + INSERT INTO pxtest2 VALUES (1); + INSERT INTO pxtest2 VALUES (3); +PREPARE TRANSACTION 'regress-one'; + + CREATE TABLE pxtest3(fff int); -- Test shared invalidation @@ -99,7 +106,7 @@ FETCH 1 FROM foo; SELECT * FROM pxtest2; -- There should be two prepared transactions -SELECT gid FROM pg_prepared_xacts ORDER BY gid; +SELECT DISTINCT gid FROM pg_prepared_xacts ORDER BY gid; -- pxtest3 should be locked because of the pending DROP set statement_timeout to 2000; @@ -110,7 +117,7 @@ reset statement_timeout; \c - -- There should still be two prepared transactions -SELECT gid FROM pg_prepared_xacts ORDER BY gid; +SELECT DISTINCT gid FROM pg_prepared_xacts ORDER BY gid; -- pxtest3 should still be locked because of the pending DROP set statement_timeout to 2000; @@ -123,16 +130,17 @@ COMMIT PREPARED 'regress-one'; SELECT * FROM pxtest2; -- There should be one prepared transaction -SELECT gid FROM pg_prepared_xacts; +SELECT DISTINCT gid FROM pg_prepared_xacts; -- Commit table drop COMMIT PREPARED 'regress-two'; SELECT * FROM pxtest3; -- There should be no prepared transactions -SELECT gid FROM pg_prepared_xacts ORDER BY gid; +SELECT DISTINCT gid FROM pg_prepared_xacts ORDER BY gid; -- Clean up DROP TABLE pxtest2; DROP TABLE pxtest3; -- will still be there if prepared xacts are disabled DROP TABLE pxtest4; + ----------------------------------------------------------------------- Summary of changes: src/backend/catalog/namespace.c | 3 ++ src/backend/catalog/system_views.sql | 7 ++++- src/backend/pgxc/locator/locator.c | 32 ++++++++++++++++++++ src/backend/postmaster/postmaster.c | 20 ++++++++++++ src/backend/utils/misc/guc.c | 17 ++++++++++- src/include/pgxc/locator.h | 4 ++ .../{prepared_xacts.out => prepared_xacts_2.out} | 27 ++++++++++++----- src/test/regress/sql/prepared_xacts.sql | 24 ++++++++++----- 8 files changed, 116 insertions(+), 18 deletions(-) copy src/test/regress/expected/{prepared_xacts.out => prepared_xacts_2.out} (81%) hooks/post-receive -- Postgres-XC |
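One implementation detail in the patch above deserves a sketch: postmaster.c pre-scans argv for -C/-X so the node's role (coordinator vs. data node) is known before build_guc_variables() assembles the default search_path, then rewinds getopt with optind = 1 for the normal option pass. Below is a reduced standalone illustration of that two-pass pattern, recognizing only the C and X flags; the real option string carries the full postmaster set, and opterr = 0 is an addition here so the cut-down sketch stays quiet about flags it does not know.

#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

static bool isPGXCCoordinator = false;
static bool isPGXCDataNode = false;

int main(int argc, char *argv[])
{
    int opt;

    opterr = 0;   /* sketch-only: suppress getopt diagnostics */

    /* First pass: learn the node role before anything that depends on
     * it runs (in the patch, before the GUC tables are built, so the
     * right default search_path schema can be appended). */
    while ((opt = getopt(argc, argv, "CX")) != -1)
    {
        if (opt == 'C')
            isPGXCCoordinator = true;
        else if (opt == 'X')
            isPGXCDataNode = true;
    }
    optind = 1;   /* rewind so the normal option pass can parse again */

    /* Second pass (full option handling) would run here. */
    printf("coordinator=%d datanode=%d\n", isPGXCCoordinator, isPGXCDataNode);
    return 0;
}

The two-pass structure keeps the role flags from interfering with normal option handling: the pre-scan deliberately sets only the two booleans and leaves every other option for the regular pass.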