From: Lionel F. <lio...@gm...> - 2011-03-08 13:09:23
Thanks for your valuable answers, my test cluster is now working (worth
mentioning I was additionally misled by an unexpected /etc/hosts.allow
setup...). Other notes below.

Regards

2011/3/6 Michael Paquier <mic...@gm...>

> Hi Lionel,
>
> Just to complete my colleague's answers a little bit...
>
> On Sat, Mar 5, 2011 at 3:36 PM, Abbas Butt <abb...@te...> wrote:
>
>>> My actual setup is:
>>> pgxc1 for GTM, Coordinator and datanode
>>> pgxc2 for datanode only
>>
>> You mean you will have 2 computers, one running GTM, the Coordinator and
>> the 1st datanode, and the other running the 2nd datanode. If yes then
>> this would be fine.
>>
>> BTW what Linux distribution will you be using?

RedHat Enterprise 5.4 for now.

>>> 1. General: Is a coordinator needed for each node, or can one
>>> coordinator 'to rule them all' be set up?
>>
>> You can use one Coordinator with 100 Datanodes if you desire.
>
> It may be better if the ratio Coordinator/Datanode is close to 1, but we
> also found that if you set one Coordinator and one Datanode on the same
> machine, the Coordinator was using 30% of the resources and the Datanode
> 70%. With such numbers, a ratio of 0.5 is also possible.

Yes, given your numbers, I'm aiming to test 1 Coordinator for 3 Datanodes,
summing up to ~20 machines (if possible with our infrastructure folks). The
breaking point may be the bandwidth (after the number of available hosts,
of course).

>>> 2. Configuration: What should be the differences between
>>> postgresql.conf in /datanode and /coordinator, if there are any?
>>
>> [...]
>
> You also have pooler connection parameters to set, but I forget all the
> names.

No issue, I'll try by myself. And tweaking these means I need to be ready
to run benchmarks, which is not the current phase.

> For a Datanode, you have to take care of the GTM connection parameters
> and pgxc_node_id (used to register on GTM).
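For anyone finding this thread later, here is a minimal sketch of those
Datanode-side settings as I understood them. Only pgxc_node_id is named
above; gtm_host, gtm_port and the example values are my assumptions, so
verify them against the documentation of your Postgres-XC version:

```conf
# Datanode postgresql.conf (sketch -- parameter names other than
# pgxc_node_id are assumptions, check your Postgres-XC version's docs)
gtm_host = 'pgxc1'      # host running the GTM (assumed name)
gtm_port = 6666         # GTM listen port (assumed name and value)
pgxc_node_id = 2        # unique id used to register this node on GTM
```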
> In your cluster you can have a Coordinator 1 and a Datanode 1, as the
> difference between node types is made when registering nodes on GTM.

>>> Are some portions ignored for a specific function (e.g. coordinator vs
>>> datanode config, etc.)? In this case, can pg_hba.conf & postgresql.conf
>>> be shared on the same server (maybe using symbolic links...)?
>
> No, you have to set up postgresql.conf and pg_hba.conf for each node
> separately, as each Coordinator and Datanode uses a different data folder.

I'm keeping the same pg_hba.conf setup in all dirs/hosts for the moment;
as for the others, I understood what you meant once the cluster was
finally running...

>>> 3. GTM: I get on the second node a "WARNING: Do not have a GTM snapshot
>>> available"; can this be related to the previous config files/setup?
>>
>> This error means that you didn't set up the GTM connection parameters
>> correctly.

>>> 4. Administration: are there ways of getting the online status of the
>>> nodes from one node to another?
>>
>> This is currently under development. I can provide you details later.
>
> Those experimental functionalities are located on a separate branch
> called ha_support in the Git repo.

It would be of great interest in terms of supportability from a
company-wide perspective. I'll give them a try if they're close to alpha
status :)

> We are also thinking about adding some catalog extensions to allow the
> Coordinator to keep an eye on Datanodes, as such a view process is linked
> to the connection pooling process.

Great idea.

> This is just a thought though. Now, as we are focusing on code stability
> for the core, this is not a high priority. But as we merged with
> PostgreSQL 9.0, it may be possible in the near future to use XC with Hot
> Standby nodes. Current streaming replication is not synchronous, so its
> usage is limited in current XC.
>
> If you have any other questions, don't hesitate.

I never do :)

Regards
Lionel F.