-rw-r--r--  doc-xc/src/sgml/arch-dev.sgmlin              |  14
-rw-r--r--  doc-xc/src/sgml/func.sgmlin                  |   6
-rw-r--r--  doc-xc/src/sgml/installation.sgmlin          |   4
-rw-r--r--  doc-xc/src/sgml/ref/create_aggregate.sgmlin  |  10
-rw-r--r--  doc-xc/src/sgml/runtime.sgmlin               | 145
-rw-r--r--  doc-xc/src/sgml/wal.sgmlin                   |   4
-rw-r--r--  doc-xc/src/sgml/xaggr.sgmlin                 |   2
7 files changed, 106 insertions, 79 deletions
diff --git a/doc-xc/src/sgml/arch-dev.sgmlin b/doc-xc/src/sgml/arch-dev.sgmlin
index 20d3293076..6a01c41fb6 100644
--- a/doc-xc/src/sgml/arch-dev.sgmlin
+++ b/doc-xc/src/sgml/arch-dev.sgmlin
@@ -834,7 +834,7 @@
     local network. We encourage installing Postgres-XC on a local
     Gigabit network with minimal latency, that is, with as few
     switches as possible in the connections among the GTM, Coordinators and
-    data nodes.
+    Datanodes.
</para>
<sect3>
@@ -1054,10 +1054,10 @@
<title>Coordinator And Datanode Connection</title>
<para>
- The number of connection between Coordinator and data node may
+    The number of connections between a Coordinator and a Datanode may
     increase from time to time. This may leave unused connections and
     waste system resources. Repeating a real connect and disconnect
-    requires data node backend initialization which increases latency
+    requires Datanode backend initialization, which increases latency
     and also wastes system resources.
</para>
@@ -1074,19 +1074,19 @@
     Because each backend consumes considerable resources for locks and other
     control information, and only a few of such connections
     are active at a given time, it is not a good idea to hold such
-    unused connection between Coordinator and data node.
+    unused connections between a Coordinator and a Datanode.
</para>
<para>
     To improve this, Postgres-XC is equipped with a connection pooler
-    between Coordinator and data node. When a Coordinator backend
-    requires connection to a data node, the pooler looks for
+    between the Coordinator and Datanodes. When a Coordinator backend
+    requires a connection to a Datanode, the pooler looks for an
     appropriate connection in the pool. If an available one exists,
     the pooler assigns it to the Coordinator backend. When the
     connection is no longer needed, the Coordinator backend returns
     the connection to the pooler. The pooler does not disconnect the
     connection. It keeps the connection in the pool for later reuse,
-    keeping data node backend running.
+    keeping the Datanode backend running.
</para>
</sect2>
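
The pooler described in this hunk is tuned on the Coordinator side through postgresql.conf. A minimal sketch follows; the parameter names and values shown (pooler_port, min_pool_size, max_pool_size), the data directory, and the defaults are assumptions to verify against the configuration chapter of your release, not something taken from this patch.

$ # Append illustrative pooler settings to a Coordinator's postgresql.conf
$ # (parameter names and values below are assumptions)
$ cat >> /usr/local/pgsql/coord1/postgresql.conf <<'EOF'
pooler_port = 6667       # port used by the pool manager process
min_pool_size = 1        # connections kept per Datanode even when idle
max_pool_size = 100      # upper bound on pooled connections
EOF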
diff --git a/doc-xc/src/sgml/func.sgmlin b/doc-xc/src/sgml/func.sgmlin
index 40fac7f1a7..648cda2f6d 100644
--- a/doc-xc/src/sgml/func.sgmlin
+++ b/doc-xc/src/sgml/func.sgmlin
@@ -14850,11 +14850,11 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
<para>
The object size functions pg_database_size, pg_indexes_size, pg_relation_size,
pg_table_size, and pg_total_relation_size return the cumulative size
- from all the data nodes. For e.g., pg_relation_size returns the sum of disk
- space used up by the specified fork at all the data nodes where the table is
+    from all the Datanodes. For example, pg_relation_size returns the sum of disk
+    space used up by the specified fork at all the Datanodes where the table is
     distributed or replicated. If the table is replicated on 3 nodes, the size
     will be 3 times that of an individual node. If you need to retrieve the local
- results from a particular Coordinator or data node, you should issue these
+ results from a particular Coordinator or Datanode, you should issue these
function calls explicitly through <type>EXECUTE DIRECT</> statement. All other
system functions run locally at the Coordinator, unless explicitly specified
otherwise in this document.
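
To make the distinction above concrete, here is a hedged sketch using psql. The node name dn1 and the table my_table are illustrative, and the exact EXECUTE DIRECT spelling should be checked against the SQL reference for your release.

$ # Cluster-wide size: the sum over every Datanode holding the table
$ psql -d postgres -c "SELECT pg_relation_size('my_table');"
$ # Size on a single node only, via EXECUTE DIRECT (syntax assumed for this release)
$ psql -d postgres -c "EXECUTE DIRECT ON (dn1) 'SELECT pg_relation_size(''my_table'')';"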
diff --git a/doc-xc/src/sgml/installation.sgmlin b/doc-xc/src/sgml/installation.sgmlin
index 735fbdc274..40ccd548a2 100644
--- a/doc-xc/src/sgml/installation.sgmlin
+++ b/doc-xc/src/sgml/installation.sgmlin
@@ -2550,8 +2550,8 @@ postgres -X -D /usr/local/pgsql/Datanode
</programlisting>
This will start the Datanode. <option>-X</>
specifies <command>postgres</> to start as a
- Datanode. <option>-D</> specifies the data directory of the
- data node. You can specify other options of standalone <command>postgres</>.
+ Datanode. <option>-D</> specifies the data directory of the
+ Datanode. You can specify other options of standalone <command>postgres</>.
</para>
<para>
Please note that you should issue <command>postgres</> command at
diff --git a/doc-xc/src/sgml/ref/create_aggregate.sgmlin b/doc-xc/src/sgml/ref/create_aggregate.sgmlin
index e2fa36af14..2de3015d87 100644
--- a/doc-xc/src/sgml/ref/create_aggregate.sgmlin
+++ b/doc-xc/src/sgml/ref/create_aggregate.sgmlin
@@ -165,19 +165,19 @@ CREATE AGGREGATE <replaceable class="PARAMETER">name</replaceable> (
<listitem>
<para>
     Three-phased aggregation is used when the process of aggregation is divided
-    between Coordinator and data nodes. In this mode, each
-    <productname>Postgres-XC</productname> data node involved in the query carries
+    between the Coordinator and Datanodes. In this mode, each
+    <productname>Postgres-XC</productname> Datanode involved in the query carries
     out the first phase, named the transition phase. This phase is similar to the first
     phase of the two-phased aggregation mode discussed above, except that every
-    data node applies this phase on the rows available at the data node. The
+    Datanode applies this phase to the rows available on that Datanode. The
     result of the transition phase is then transferred to the Coordinator node.
     The second phase, called the collection phase, takes place on the Coordinator.
<productname>Postgres-XC</productname> Coordinator node creates a temporary variable
of data type <replaceable class="PARAMETER">stype</replaceable>
to hold the current internal state of the collection phase. For every input
- from the data node (result of transition phase on that node), the collection
+ from the Datanode (result of transition phase on that node), the collection
function is invoked with the current collection state value and the new
- transition value (obtained from the data node) to calculate a new
+ transition value (obtained from the Datanode) to calculate a new
internal collection state value. After all the transition values from data
nodes have been processed, in the third or finalization phase the final
function is invoked once to calculate the aggregate's return
diff --git a/doc-xc/src/sgml/runtime.sgmlin b/doc-xc/src/sgml/runtime.sgmlin
index 2621413c02..2fcc635744 100644
--- a/doc-xc/src/sgml/runtime.sgmlin
+++ b/doc-xc/src/sgml/runtime.sgmlin
@@ -78,7 +78,7 @@
&xconly;
<para>
You should initialize <firstterm>database cluster</firstterm> for
- each <firstterm>Coordinator</> and <firstterm>data node</>.
+ each <firstterm>Coordinator</> and <firstterm>Datanode</>.
</para>
<!## end>
&common;
@@ -94,15 +94,21 @@
linkend="app-initdb">,<indexterm><primary>initdb</></> which is
<!## PG>
installed with <productname>PostgreSQL</productname>. The desired
+ file system location of your database cluster is indicated by the
+ <option>-D</option> option, for example:
+<screen>
+<prompt>$</> <userinput>initdb -D /usr/local/pgsql/data</userinput>
+</screen>
<!## end>
<!## XC>
installed with <productname>Postgres-XC</productname>. The desired
-<!## end>
file system location of your database cluster is indicated by the
- <option>-D</option> option, for example:
+      <option>-D</option> option. You also need to define a node name for
+      the node being initialized, for example:
<screen>
-<prompt>$</> <userinput>initdb -D /usr/local/pgsql/data</userinput>
+<prompt>$</> <userinput>initdb -D /usr/local/pgsql/data --nodename foo</userinput>
</screen>
+<!## end>
<!## PG>
Note that you must execute this command while logged into the
<productname>PostgreSQL</productname> user account, which is
@@ -125,23 +131,33 @@
the environment variable <envar>PGDATA</envar>.
<indexterm><primary><envar>PGDATA</envar></primary></indexterm>
</para>
- <!## XC>
+<!## XC>
<para>
If you configure multiple <firstterm>Coordinator</>
- and/or <firstterm>data node</>, you cannot
+ and/or <firstterm>Datanode</>, you cannot
share <envar>PGDATA</envar> among them and you must
- specify<firstterm>data directory</> explicitly.
+      specify the <firstterm>data directory</> explicitly.
</para>
- <!## end>
+ <para>
+      <option>--nodename</option> is mandatory for all nodes at initialization time.
+ </para>
+<!## end>
</tip>
&common;
<para>
Alternatively, you can run <command>initdb</command> via
the <xref linkend="app-pg-ctl">
program<indexterm><primary>pg_ctl</></> like so:
+<!## PG>
<screen>
<prompt>$</> <userinput>pg_ctl -D /usr/local/pgsql/data initdb</userinput>
</screen>
+<!## end>
+<!## XC>
+<screen>
+<prompt>$</> <userinput>pg_ctl -D /usr/local/pgsql/data -o '--nodename foo' initdb</userinput>
+</screen>
+<!## end>
This may be more intuitive if you are
using <command>pg_ctl</command> for starting and stopping the
server (see <xref linkend="server-start">), so
@@ -166,7 +182,7 @@
root# <userinput>mkdir /usr/local/pgsql/data</userinput>
root# <userinput>chown postgres /usr/local/pgsql/data</userinput>
root# <userinput>su postgres</userinput>
-postgres$ <userinput>initdb -D /usr/local/pgsql/data</userinput>
+postgres$ <userinput>initdb -D /usr/local/pgsql/data --nodename foo</userinput>
</screen>
</para>
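
Putting the new --nodename requirement together, here is a minimal sketch of initializing one Coordinator and one Datanode. The commands are the ones shown in this hunk; the directory names and node names are only illustrative.

$ # Each cluster element needs its own data directory and a unique node name
$ initdb -D /usr/local/pgsql/coord1 --nodename coord1
$ initdb -D /usr/local/pgsql/dn1 --nodename dn1
$ # Equivalent form through pg_ctl, as in the hunk above
$ pg_ctl -D /usr/local/pgsql/dn1 -o '--nodename dn1' initdb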
@@ -312,7 +328,7 @@ postgres$ <userinput>initdb -D /usr/local/pgsql/data</userinput>
</para>
<para>
- Both <filename>Coordinator</> and <filename>Datanode</> have their
+ Both Coordinator and Datanode have their
own databases, essentially <productname>PostgreSQL</> databases.
They are separate and you should initialize them separately.
</para>
@@ -324,7 +340,7 @@ postgres$ <userinput>initdb -D /usr/local/pgsql/data</userinput>
<para>
GTM provides global transaction management feature to all the
other components in <productname>Postgres-XC</> database cluster.
- Because <filename>GTM</> handles transaction requirements from all
+ Because GTM handles transaction requirements from all
the Coordinators and Datanodes, it is highly advised to run this
in a separate server.
</para>
@@ -340,16 +356,16 @@ postgres$ <userinput>initdb -D /usr/local/pgsql/data</userinput>
<listitem>
<para>
- Because <filename>GTM</> receives all the request to begin/end
+      Because GTM receives all the requests to begin/end
       transactions and to refer to sequence values, you should
-      run <filename>GTM</> in a separate server. If you
-      run <filename>GTM</> in the same server as Datanode or
+      run GTM on a separate server. If you
+      run GTM on the same server as a Datanode or
       Coordinator, it will become harder to keep the workload reasonably
       balanced.
</para>
<para>
- Then, you should determine <filename>GTM</>'s working directory.
- Please create this directory before you run <filename>GTM</>.
+ Then, you should determine GTM's working directory.
+ Please create this directory before you run GTM.
</para>
</listitem>
</varlistentry>
@@ -359,7 +375,7 @@ postgres$ <userinput>initdb -D /usr/local/pgsql/data</userinput>
<listitem>
<para>
Next, you should determine listen address and port
- of <filename>GTM</>.
+ of GTM.
       The listen address can be either the IP address or the host name on which
       GTM receives requests from other components,
       typically <filename>GTM-Proxy</filename>.
@@ -371,12 +387,12 @@ postgres$ <userinput>initdb -D /usr/local/pgsql/data</userinput>
<term>GTM id</term>
<listitem>
<para>
- You have a chance to run more than one <filename>GTM</>s in
+      You may run more than one GTM in
       a single <productname>Postgres-XC</> cluster.
-      For example, if you need a backup of <filename>GTM</> in
+      For example, if you need a backup of GTM in a
       high-availability environment, you need to run
-      two <filename>GTM</>s.
-      You should give unique GTM id to each of such <filename>GTM</>s.
+      two GTMs.
+      You should give a unique GTM id to each such GTM.
       GTM id values begin with one.
</para>
</listitem>
@@ -385,19 +401,31 @@ postgres$ <userinput>initdb -D /usr/local/pgsql/data</userinput>
</variablelist>
<para>
- When they're determined, you can start GTM as follows:
+ When this is determined, you can initialize GTM with the command <xref
+ linkend="app-initgtm">,
+ for example:
+<screen>
+<prompt>$</> <userinput>initgtm -Z gtm -D /usr/local/pgsql/data_gtm</userinput>
+</screen>
+ </para>
+
+ <para>
+     All the parameters related to GTM can be modified in <filename>gtm.conf</filename>,
+     located in the data folder initialized by <command>initgtm</command>.
+ </para>
+
+ <para>
+ Then you can start GTM as follows:
<!-- Check precise parameters -->
<screen>
-$ <userinput>gtm -D /usr/local/pgsql/gtm -i 1 -h localhost -p 20001</userinput>
+<prompt>$</> <userinput>gtm -D /usr/local/pgsql/data_gtm</userinput>
</screen>
- where <option>-D</> option specifies <filename>GTM</>'s working
- directory, <option>-i</> option specifies <filename>GTM</>'s id
- number, <option>-h</> specifies listen address and <option>-p</> specifies port number.
+    where the <option>-D</> option specifies the working directory of GTM.
</para>
<para>
- Or you can start <filename>GTM</> using <filename>gtm_ctl</> like:
+ Alternatively, GTM can be started using <command>gtm_ctl</>, for example:
<screen>
-$ <userinput>gtm_ctl -S gtm start -D /usr/local/pgsql/gtm -o "-i 1 -h gtm -p 20001"</userinput>
+<prompt>$</> <userinput>gtm_ctl -Z gtm start -D /usr/local/pgsql/data_gtm</userinput>
</screen>
</para>
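
Collecting the GTM steps from this hunk into one sequence gives the sketch below. The commands match the hunk; the final stop subcommand is assumed by analogy with pg_ctl rather than taken from this patch.

$ # Initialize the GTM working directory, then start it through gtm_ctl
$ initgtm -Z gtm -D /usr/local/pgsql/data_gtm
$ # Tune listen address/port in data_gtm/gtm.conf before starting, if needed
$ gtm_ctl -Z gtm start -D /usr/local/pgsql/data_gtm
$ gtm_ctl -Z gtm stop -D /usr/local/pgsql/data_gtm    # stop subcommand assumed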
@@ -406,14 +434,9 @@ $ <userinput>gtm_ctl -S gtm start -D /usr/local/pgsql/gtm -o "-i 1 -h gtm -p 200
<title>Starting GTM-Proxy</title>
&xconly;
<para>
- To be honest, you don't have to run <filename>GTM-Proxy</> if you
- just want to test <productname>Postgres-XC</>.
- On the other hand, because <filename>GTM</> has to handle so many
- requirements and deliver responses to them, <filename>GTM</>'s
- workload becomes an issue.
- Because <filename>GTM-Proxy</> groups requirements and response
- from <filename>Coordinators</> and <filename>Datanode</>, it is
- important to keep <filename>GTM</> workload in reasonable level.
+      GTM-Proxy is not a mandatory component of a Postgres-XC cluster, but
+      it can be used to group messages between GTM and the cluster nodes,
+      reducing workload and the number of packets exchanged over the network.
</para>
<para>
@@ -426,22 +449,32 @@ $ <userinput>gtm_ctl -S gtm start -D /usr/local/pgsql/gtm -o "-i 1 -h gtm -p 200
</para>
<para>
+      Then, you first need to initialize GTM-Proxy with <command>initgtm</command>,
+ for example:
+<screen>
+<prompt>$</> <userinput>initgtm -Z gtm_proxy -D /usr/local/pgsql/data_gtm_proxy</userinput>
+</screen>
+ </para>
+
+ <para>
+     All the parameters related to GTM-Proxy can be modified in <filename>gtm_proxy.conf</filename>,
+     located in the data folder initialized by <command>initgtm</command>.
+ </para>
+
+ <para>
Then, you can start <filename>GTM-Proxy</> like:
<screen>
-$ <userinput>gtm_proxy -D /usr/local/pgsql/gtm_proxy -i 1 -h server1 -p 20002 -n 2 -s gtm -t 20001</userinput>
+<prompt>$</> <userinput>gtm_proxy -D /usr/local/pgsql/data_gtm_proxy</userinput>
</screen>
where <option>-D</> specifies <filename>GTM-Proxy</>'s working
- directory, <option>-i</> is its listen address, <option>-p</> is
- port number, <option>-n</> is number of worker
- thread, <option>-s</> is <filename>GTM</>'s listen address, and
- <option>-t</> is <filename>GTM</>'s listen port.
+ directory.
</para>
<para>
- Or you can start <filename>GTM-Proxy</> using <filename>gtm_ctl</>
+ Alternatively, you can start GTM-Proxy using <filename>gtm_ctl</>
as follows:
<screen>
-$ <userinput>gtm_ctl start -S gtm_proxy -i 1 -D /usr/local/pgsql/gtm_proxy -i 1 -o "-1 1 -h server1 -p 20002 -n 2 -s gtm -t 20001"</userinput>
+<prompt>$</> <userinput>gtm_ctl start -Z gtm_proxy -D /usr/local/pgsql/data_gtm_proxy</userinput>
</screen>
</para>
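
The same pattern, collected for GTM-Proxy. The commands come from this hunk; the gtm_host/gtm_port parameter names mentioned in the comment are assumptions about gtm_proxy.conf, not something this patch shows.

$ initgtm -Z gtm_proxy -D /usr/local/pgsql/data_gtm_proxy
$ # Point the proxy at GTM by editing data_gtm_proxy/gtm_proxy.conf first
$ # (e.g. gtm_host / gtm_port -- parameter names assumed, check your release)
$ gtm_ctl start -Z gtm_proxy -D /usr/local/pgsql/data_gtm_proxy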
@@ -502,7 +535,7 @@ $ <userinput>gtm_ctl start -S gtm_proxy -i 1 -D /usr/local/pgsql/gtm_proxy -i 1
<term>pgxc_node_name</term>
<listitem>
<para>
- <filename>GTM</> needs to identify each Datanode, as specified by
+ GTM needs to identify each Datanode, as specified by
this parameter.
The value should be unique and start with one.
</para>
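
A one-line sketch of setting the parameter this entry describes. The node name dn1 and the data directory are illustrative, and depending on the release this value may already have been written by initdb --nodename.

$ # Give the Datanode a unique identity known to GTM (illustrative value)
$ echo "pgxc_node_name = 'dn1'" >> /usr/local/pgsql/dn1/postgresql.conf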
@@ -580,7 +613,7 @@ $ <userinput>gtm_ctl start -S gtm_proxy -i 1 -D /usr/local/pgsql/gtm_proxy -i 1
<term>pgxc_node_name</term>
<listitem>
<para>
- <filename>GTM</> needs to identify each Datanode, as specified by
+ GTM needs to identify each Datanode, as specified by
this parameter.
</para>
</listitem>
@@ -696,15 +729,12 @@ $ <userinput>gtm_ctl start -S gtm_proxy -i 1 -D /usr/local/pgsql/gtm_proxy -i 1
<para>
You can start a Datanode as follows:
<screen>
-$ <userinput>postgres -X -D /usr/local/pgsql/Datanode -i</userinput>
+<prompt>$</> <userinput>postgres -X -D /usr/local/pgsql/data</userinput>
</screen>
<option>-X</> specifies <command>postgres</> should run as a
- Datanode. <option>-i</> specifies <command>postgres</> to
- accept connection from TCP/IP connections.
- </para>
-
- <para>
- You should start all the Datanodes you configured.
+    Datanode. You may need to specify the <option>-i</> option so that <command>postgres</>
+    accepts TCP/IP connections, and to edit <filename>pg_hba.conf</filename>
+    if the cluster spans several servers.
</para>
</sect2>
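
A start sequence for one Datanode, combining the options discussed in this hunk. The path is illustrative, and the pg_ctl -Z switch in the alternative form is an assumption for this Postgres-XC release.

$ # Foreground, accepting TCP/IP connections as described above
$ postgres -X -D /usr/local/pgsql/dn1 -i
$ # Or detached with a log file (-Z datanode is assumed; verify with pg_ctl --help)
$ pg_ctl start -Z datanode -D /usr/local/pgsql/dn1 -l dn1.log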
@@ -715,15 +745,12 @@ $ <userinput>postgres -X -D /usr/local/pgsql/Datanode -i</userinput>
<para>
You can start a Coordinator as follows:
<screen>
-$ <userinput>postgres -C -D /usr/local/pgsql/Datanode -i</userinput>
+<prompt>$</> <userinput>postgres -C -D /usr/local/pgsql/coordinator</userinput>
</screen>
<option>-C</> specifies <command>postgres</> should run as a
- Coordinator. <option>-i</> specifies <command>postgres</> to
- accept connection from TCP/IP connections.
- </para>
-
- <para>
- You should start all the Coordinators you configured.
+    Coordinator. You may need to specify the <option>-i</> option so that <command>postgres</>
+    accepts TCP/IP connections, and to edit <filename>pg_hba.conf</filename>
+    if the cluster spans several servers.
</para>
</sect2>
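
The matching sketch for a Coordinator. Again the directory is illustrative and the pg_ctl -Z value is an assumption to check against your release.

$ postgres -C -D /usr/local/pgsql/coordinator -i
$ # Or, assumed pg_ctl form:
$ pg_ctl start -Z coordinator -D /usr/local/pgsql/coordinator -l coord1.log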
diff --git a/doc-xc/src/sgml/wal.sgmlin b/doc-xc/src/sgml/wal.sgmlin
index 06e1f0d1e6..556b7bb49a 100644
--- a/doc-xc/src/sgml/wal.sgmlin
+++ b/doc-xc/src/sgml/wal.sgmlin
@@ -769,7 +769,7 @@
other Coordinators to ensure that there are no in-progress two-phase
commits in the cluster. At that point, a barrier <acronym>WAL</acronym>
record along with the user-given or system-generated BARRIER identifier
- is writtent to the <acronym>WAL</acronym> stream of all data nodes and
+    is written to the <acronym>WAL</acronym> stream of all Datanodes and
the Coordinators.
</para>
@@ -777,7 +777,7 @@
     A user can create as many barriers as she wants to. At the time of
     point-in-time recovery, the same barrier id must be specified in the
     <filename>recovery.conf</filename> files of all the Coordinators and
-    data nodes. When every node in the cluster recovers to the same barrier
+    Datanodes. When every node in the cluster recovers to the same barrier
     id, a cluster-wide consistent state is reached. It is important that
     the recovery be started from a backup taken before the barrier
was generated. If no matching barrier record is found, either because
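
For the barrier mechanism this file describes, a hedged end-to-end sketch follows. CREATE BARRIER is the Postgres-XC statement mentioned by the surrounding docs, the barrier id and paths are illustrative, and the recovery.conf parameter name shown (recovery_target_barrier) is an assumption to verify in the recovery documentation for your release.

$ # Create a named barrier from any Coordinator
$ psql -d postgres -c "CREATE BARRIER 'before_upgrade';"
$ # At point-in-time recovery, reference the same id on every Coordinator and Datanode
$ # (parameter name assumed)
$ echo "recovery_target_barrier = 'before_upgrade'" >> /usr/local/pgsql/dn1/recovery.conf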
diff --git a/doc-xc/src/sgml/xaggr.sgmlin b/doc-xc/src/sgml/xaggr.sgmlin
index 2eecc93b2f..22d18d259e 100644
--- a/doc-xc/src/sgml/xaggr.sgmlin
+++ b/doc-xc/src/sgml/xaggr.sgmlin
@@ -124,7 +124,7 @@ SELECT sum(a) FROM test_complex;
    In <productname>Postgres-XC</productname>, a user can provide a collection
    function if distributed aggregation is expected to improve performance. The
    collection function essentially combines the state transition results produced
-    at different data nodes. Without a final function the result produced by the
+    at different Datanodes. Without a final function, the result produced by the
    collection function is the result of the aggregate. The above definition of the
    aggregate <function>sum</> for the complex number data type can be modified to have a
    collection function as follows