From: Phil S. <phi...@ne...> - 2012-04-20 18:20:02
Michael,
Just a quick note to let you know I resolved the connection pool issue.
It wasn't a permissions or firewall issue as no firewalls are running
between any of the three VMs.
I turned up the logging on both coord/datanode VMs as per your
suggestion but nothing of any significance was noted.
In attempts to troubleshoot the issue, I then updated the node
initialization, removing the remote nodes on each VM. Once I did this,
the CREATE DATABASE SQL ran clean but the database only got created on
the local node, which made sense, since only the local datanode was
configured in the pgxc_node catalog.
Seeing it was probably network related, one of my SA co-workers took a
thorough look at the network configs on each VM and found that one of
the VMs had an additional NIC configured called virbr0. This apparently
occurred when VMTools was installed on this host (another SA was testing
out the VMTool suite a while back and used this VM to run his tests).
I disabled and removed the virbr0 NIC. I then had to edit the
pg_hba.conf hosts on coord1, data1, coord2 and data2 as follows:
host all all 192.168.38.0/24 trust
I had originally added it as 'host all all 192.168.38.0/32 trust', but
the /32 was causing issues where Postgres-XC couldn't find the hosts
properly.
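For reference, the reason /32 misbehaved is that a /32 mask matches exactly one address, while /24 covers the whole subnet. A quick illustration using Python's ipaddress module (just a demonstration of CIDR semantics, not anything Postgres-XC itself runs):

```python
import ipaddress

# pg_hba.conf CIDR semantics: /32 matches a single host, /24 a whole subnet.
single_host = ipaddress.ip_network("192.168.38.0/32")
subnet = ipaddress.ip_network("192.168.38.0/24")

# A hypothetical coordinator VM address on the same subnet.
peer = ipaddress.ip_address("192.168.38.101")

print(peer in single_host)  # False: /32 only matches 192.168.38.0 itself
print(peer in subnet)       # True: /24 matches 192.168.38.0-192.168.38.255
```

So the /32 entry only trusted the single address 192.168.38.0, and every other VM on the subnet was rejected.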
Once I did all this and updated the node initialization to include the
remote hosts on each VM, the CREATE DATABASE SQL ran clean.
I have subsequently created three test DBs, numerous login roles and
group roles, and loaded pg_dump backups into each test DB on one of the
db VMs. When I then accessed the second db VM, all the objects created
on the remote db VM were present there, and all the data loaded on the
remote db VM had been replicated there as well.
Postgres-XC is working great so far!!
Thanks very much for taking the time to respond to and help with this
issue. It was greatly appreciated.
Phil.
On 4/16/12 7:24 PM, Michael Paquier wrote:
>
>
> On Mon, Apr 16, 2012 at 11:34 PM, Phil Somers <phi...@ne...
> <mailto:phi...@ne...>> wrote:
>
> Michael,
>
> Thanks very much for your speedy reply.
>
> After reading your email, I realized I had forgotten to set the
> listen_addresses parameter to '*' on the coordinators and
> datanodes. I shut down the whole postgres-xc environment, updated
> the postgresql.conf files on the coordinators and datanodes and
> fired everything back up without issue, using the -o "-i" option
> for the coordinator/datanodes startup.
>
> After doing all this, I am still getting connection issues with
> datanode 0 when trying to create a database on coordinator1.
>
> I provided the output of the pgxc_node table as I am thinking
> maybe there are problems with the entries in this table?
>
> Any other suggestions you may have with this issue would be
> greatly appreciated.
>
> Phil
>
>
>
> Here are the details:
>
> -----------------------------------------------------------------------------------------------------------------
>
>
> shutdown gtm, coord1/data1, coord2/data2
> ----------------------------------------
>
> updated postgresql.conf on dbhost1 for both coord1 and data1:
> ------------------------------------------------------------
> - listen_addresses = '*'
>
> updated postgresql.conf on dbhost2 for both coord2 and data2:
> ------------------------------------------------------------
> - listen_addresses = '*'
>
>
> started coordinator and datanodes on each db host adding the -o
> "-i" as per your recommendation:
> -----------------------------------------------------------------------------------------------
> dbhost1:
> pg_ctl start -D /home/postgres/datanode/data -l
> /home/postgres/datanode/log/datanode1.log -o "-i -p 15432" -Z datanode
> pg_ctl start -D /home/postgres/coordinator/data -l
> /home/postgres/coordinator/log/coordinator1.log -o "-i" -Z coordinator
>
> dbhost2:
> pg_ctl start -D /home/postgres/datanode/data -l
> /home/postgres/datanode/log/datanode2.log -o "-i -p 15432" -Z datanode
> pg_ctl start -D /home/postgres/coordinator/data -l
> /home/postgres/coordinator/log/coordinator2.log -o "-i" -Z coordinator
>
>
> checked to make sure postgres was running on dbhost1:
> -----------------------------------------------------
> $ ps -ef | grep postgres
>
> root 1005 587 0 10:22 ? 00:00:00 sshd: postgres [priv]
> 1058 1083 1005 0 10:22 ? 00:00:01 sshd: postgres@pts/0
> 1058 1222 1 0 10:47 pts/0 00:00:00 /home/postgres/db_home/bin/postgres -X -D /home/postgres/datanode/data -i -p 15432
> 1058 1224 1222 0 10:47 ? 00:00:00 postgres: writer process
> 1058 1225 1222 0 10:47 ? 00:00:00 postgres: wal writer process
> 1058 1226 1222 0 10:47 ? 00:00:00 postgres: autovacuum launcher process
> 1058 1227 1222 0 10:47 ? 00:00:00 postgres: stats collector process
> 1058 1234 1 0 10:48 pts/0 00:00:00 /home/postgres/db_home/bin/postgres -C -D /home/postgres/coordinator/data -i
> 1058 1236 1234 0 10:48 ? 00:00:00 postgres: pooler process
> 1058 1237 1234 0 10:48 ? 00:00:00 postgres: writer process
> 1058 1238 1234 0 10:48 ? 00:00:00 postgres: wal writer process
> 1058 1239 1234 0 10:48 ? 00:00:00 postgres: autovacuum launcher process
> 1058 1240 1234 0 10:48 ? 00:00:00 postgres: stats collector process
> 1058 1401 1084 0 10:53 pts/0 00:00:00 grep --color=auto postgres
>
>
> psql onto dbhost1 coord1 and ran 'create database TEST';
> ------------------------------------------------------
>
>
> psql -U postgres -d postgres
>
> Password for user postgresql:
> psql (9.1.2)
> Type "help" for help.
>
> postgres=# create database TEST;
>
> ERROR: Failed to get pooled connections
>
>
>
> coord1 log output:
> -----------------
>
> LOG: database system was shut down at 2012-04-16 11:00:24 ADT
> LOG: database system is ready to accept connections
> LOG: autovacuum launcher started
>
> LOG: failed to connect to data node
> WARNING: can not connect to datanode 0
> LOG: failed to acquire connections
> STATEMENT: create database TEST;
> ERROR: Failed to get pooled connections
> STATEMENT: create database TEST;
>
>
>
> psql output for pgxc_node (output is exactly same on both coord1
> and coord2):
> ----------------------------------------------------------------------------
>
> postgres=# select * from pgxc_node;
> node_name | node_type | node_port | node_host      | nodeis_primary | nodeis_preferred
> -----------+-----------+-----------+----------------+----------------+------------------
> coord2    | C         | 5432      | 192.168.38.100 | f              | f
> data2     | D         | 15432     | 192.168.38.100 | f              | f
> coord1    | C         | 5432      | 192.168.38.101 | f              | f
> data1     | D         | 15432     | 192.168.38.101 | f              | f
> (4 rows)
>
> Just by looking at that, I can't really tell what the issue is, but I
> am pretty sure it is a permission problem.
> Could it be a firewall issue? You could try shutting it down on node 2
> once and see what happens... Are you then able to create a database correctly?
> If this doesn't work, you should turn on log_connections in the
> postgresql.conf of each node and have a look at the logs. That will for
> sure help to spot your problem.
> --
> Michael Paquier
> http://michael.otacoo.com
From: Michael P. <mic...@gm...> - 2012-04-20 00:59:36
I don't really know what you are looking for, but based on your example,
here is a simplified version:
create table org (id int primary key, name varchar(50) not null);
create table poc (id int primary key, firstname varchar(50) not null,
lastname varchar(50) not null);
create table link (org_id int, poc_id int, function varchar(2), primary key
(org_id,poc_id));
Depending on the distribution type of each table, the key to performance is
to get queries that are shipped completely to the Datanodes (remote nodes)
without having to materialize data from multiple Datanodes on the
Coordinators (where the application connects).
Your query is this one; I just changed the primary keys to integers:
select poc.firstname, poc.lastname, org.id, link.function from poc,org,
link where org.name = 'whatever' and org.id = link.org_id and link.poc_id =
poc.id;
For example, if I create all the tables below as replicated:
create table org (id int primary key, name varchar(50) not null) distribute
by replication;
create table poc (id int primary key, firstname varchar(50) not null,
lastname varchar(50) not null) distribute by replication;
create table link (org_id int, poc_id int, function varchar(2), primary key
(org_id,poc_id)) distribute by replication;
Your SELECT query becomes like this:
postgres=# explain select poc.firstname, poc.lastname, org.id,
link.function from poc,org, link where org.name = 'whatever' and org.id =
link.org_id and link.poc_id = poc.id;
QUERY PLAN
----------------------------------------------------------------------------
Data Node Scan on "__REMOTE_FQS_QUERY__" (cost=0.00..0.00 rows=0 width=0)
Node/s: dn1
(2 rows)
This means that the query is shipped completely to a single node, which is
expected, as the data of all the tables is replicated on every node.
If your tables become distributed:
create table org (id int primary key, name varchar(50) not null) distribute
by hash(id);
create table poc (id int primary key, firstname varchar(50) not null,
lastname varchar(50) not null) distribute by hash(id);
create table link (org_id int, poc_id int, function varchar(2), primary key
(org_id,poc_id)) distribute by hash(org_id);
postgres=# explain select poc.firstname, poc.lastname, org.id,
link.function from poc,org, link where org.name = 'whatever' and org.id =
link.org_id and link.poc_id = poc.id;
QUERY PLAN
--------------------------------------------------------------------------
Nested Loop (cost=0.00..0.03 rows=1 width=252)
Join Filter: (link.org_id = org.id)
-> Nested Loop (cost=0.00..0.01 rows=1 width=252)
Join Filter: (poc.id = link.poc_id)
-> Data Node Scan on poc (cost=0.00..0.00 rows=1000 width=240)
Node/s: dn1, dn2
-> Data Node Scan on link (cost=0.00..0.00 rows=1000 width=20)
Node/s: dn1, dn2
-> Data Node Scan on org (cost=0.00..0.00 rows=1000 width=4)
Node/s: dn1, dn2
(10 rows)
Here a nested loop is needed, meaning the data must first be gathered from
the remote nodes and materialized on the Coordinator before the results are
returned to the client. This is extremely bad for performance.
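To see why the join cannot run node-locally here, consider a toy sketch of row placement. The two-node layout and the modulo placement are assumptions for illustration only, not Postgres-XC's actual hash function:

```python
# Toy placement: key % NUM_NODES stands in for XC's hash + modulo.
NUM_NODES = 2

def node_for(key):
    return f"dn{key % NUM_NODES + 1}"

# link is distributed by hash(org_id), poc by hash(id).
link_rows = [(1, 10), (1, 11), (1, 12)]  # (org_id, poc_id)

for org_id, poc_id in link_rows:
    link_node = node_for(org_id)  # where the link row lives
    poc_node = node_for(poc_id)   # where its matching poc row lives
    print(f"link({org_id},{poc_id}) on {link_node}, "
          f"poc({poc_id}) on {poc_node}, collocated={link_node == poc_node}")
```

Because a link row and its matching poc row can land on different nodes, the predicate link.poc_id = poc.id cannot be pushed down, and the Coordinator has to pull rows from both Datanodes and join them itself.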
So the key to performance is to find the combination of table distributions
that allows the maximum number of queries to be shipped entirely to the
remote nodes.
That does not mean it is necessary to replicate all the tables. Replication
is good for reads, but every write operation then has to run on all the
nodes where the table's data is located, lowering performance if your
application does a lot of writes, so the secret is to find a good balance.
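That read/write trade-off can be made concrete with a toy cost model. The node count and per-operation costs are illustrative assumptions, not measurements from XC:

```python
# Toy cost model: with replication, a read can be served by any one node but
# a write must be applied on every replica; with hash distribution, a
# single-key read or write touches only the node owning that key.
NUM_NODES = 4

def nodes_touched(op, distribution):
    if distribution == "replicated" and op == "write":
        return NUM_NODES  # every replica must apply the write
    return 1  # replicated reads and single-key hash ops hit one node

for dist in ("replicated", "hash"):
    for op in ("read", "write"):
        print(f"{dist:10s} {op}: {nodes_touched(op, dist)} node(s)")
```

Replicating everything therefore makes a write-heavy workload pay roughly an N-fold write cost, which is why the balance between replicated and distributed tables matters.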
In XC, you can also locate data on only a subset of the nodes, which can
further help tune your applications.
http://postgres-xc.sourceforge.net/docs/1_0/sql-createtable.html
CREATE TABLE supports an extension, TO GROUP/NODE, to place the data of a
table on only a portion of the nodes.
Regards,
On Thu, Apr 19, 2012 at 9:30 PM, Michael Vitale <mic...@ar...> wrote:
> Thank you for your response. Can I just give a simple schema example
> and query resulting from it and see if it would suffer in a cluster
> solution using the primary keys, which are system generated names (GUIDs)?
>
> Table ORG (ORG_HANDLE VARCHAR(50) NOT NULL, ORG_NAME VARCHAR2(150) NOT
> NULL);
> primary key: ORG_HANDLE
>
> TABLE POC (POC_HANDLE VARCHAR(50), FIRST_NAME VARCHAR(50) NOT NULL,
> LAST_NAME VARCHAR(50) NOT NULL);
> primary key: POC_HANDLE
>
> Table ORG_POC_LINK (ORG_HANDLE VARCHAR(50) NOT NULL, POC_FUNCTION
> VARCHAR(2) NOT NULL, POC_HANDLE VARCHAR(50) NOT NULL);
> primary key: ORG_HANDLE,POC_FUNCTION,POC_HANDLE
>
>
> Query:
> select POC.FIRST_NAME, POC.LAST_NAME, ORG.HANDLE, OPL.POC_FUNCTION FROM
> POC POC, ORG ORG, ORG_POC_LINK OPL
> WHERE ORG.ORG_NAME = 'whatever' and ORG.ORG_HANDLE = OPL.ORG_HANDLE and
> OPL.POC_HANDLE = POC.POC_HANDLE
> ------------------------------
> *From:* Ashutosh Bapat [ash...@en...]
> *Sent:* Thursday, April 19, 2012 8:05 AM
> *To:* Michael Vitale
> *Cc:* pos...@li...;
> pos...@li...
> *Subject:* Re: [Postgres-xc-general] the cluster cost for normalized
> tables
>
> Hi Michael,
> The distribution of data depends upon the distribution strategy used. In
> Postgres-XC, we distribute data based on the hash/modulo of the given
> column. It's usually advisable to choose the same distribution for the
> tables which have equi-joins on their distribution columns.
>
> Choosing the right distribution for the tables involved is an art. We need
> the knowledge of table definitions and set of queries to decide the exact
> distribution. If the queries are such that they join on collocated data,
> the performance is greatly improved.
>
> On Thu, Apr 19, 2012 at 4:56 PM, Michael Vitale <mic...@ar...> wrote:
>
>> Hi you most honorable cluster folks!
>>
>> Our company is moving from Oracle to PostgreSQL. We initially thought we
>> would be moving to MySQL Cluster, but an investigation of how clustering
>> works in MySQL Cluster revealed that performance would suffer substantially
>> since it is predicated on keys that segregate SQL-requested data to
>> specific nodes and not to all or most of the nodes. A highly normalized
>> database would suffer in this situation where a result set would normally
>> consist of rows gathered from most, if not all, of the back-end nodes.
>>
>> Do you all have the same problem with Clustered PostgreSQL (Postgres-XC)?
>>
>> Respectfully Yours,
>>
>> Michael Vitale
>> ARIN DBA
>> mic...@ar...
>> 703-227-9885
>>
>>
>>
>> ------------------------------------------------------------------------------
>> For Developers, A Lot Can Happen In A Second.
>> Boundary is the first to Know...and Tell You.
>> Monitor Your Applications in Ultra-Fine Resolution. Try it FREE!
>> http://p.sf.net/sfu/Boundary-d2dvs2
>> _______________________________________________
>> Postgres-xc-general mailing list
>> Pos...@li...
>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-general
>>
>>
>
>
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Enterprise Postgres Company
>
>
>
>
--
Michael Paquier
http://michael.otacoo.com