From: Lionel F. <lio...@gm...> - 2011-03-08 13:09:23
Thanks for your valuable answers, my test cluster is now working (worth
mentioning I was additionally misled by an unexpected /etc/hosts.allow
setup...). Other notes below.

Regards

2011/3/6 Michael Paquier <mic...@gm...>

> Hi Lionel,
>
> Just to complete my colleague's answers a little bit...
>
> On Sat, Mar 5, 2011 at 3:36 PM, Abbas Butt <abb...@te...> wrote:
>
>>> My actual setup is:
>>> pgxc1 for GTM, Coordinator and datanode
>>> pgxc2 for datanode only
>>
>> You mean you will have 2 computers, one running GTM, the Coordinator and
>> the 1st datanode, and the other running the 2nd datanode. If yes then
>> this would be fine.
>>
>> BTW what Linux distribution will you be using?

RedHat Enterprise 5.4 for now.

>>> 1. General: Is a coordinator needed for each node, or can one
>>> coordinator 'to rule them all' be set up?
>>
>> You can use one Coordinator with 100 Datanodes if you desire.
>
> It may be better if the ratio Coordinator/Datanode is close to 1, but we
> also found that if you set one Coordinator and one Datanode on the same
> machine, the Coordinator was using 30% of the resources and the Datanode
> 70%. With such numbers, a ratio of 0.5 is also possible.

Yes, given your numbers, I'm aiming to test 1 Coordinator for 3 Datanodes,
summing up to ~20 machines (if possible with our infrastructure folks). The
breaking point may be the bandwidth (after the number of available hosts,
of course).

>>> 2. Configuration: What should be the differences between
>>> postgresql.conf in /datanode and /coordinator, if there are any?
>>
>> [...]
>
> You also have pooler connection parameters to set, but I forget all the
> names.

No issue, I'll try by myself. And tweaking these means I need to be ready
to run benchmarks, which is not the current phase.

> For a Datanode, you have to take care of the GTM connection parameters
> and pgxc_node_id (used to register on GTM).
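For anyone finding this thread later, here is a minimal sketch of those
Datanode-side settings as I understood them. Only pgxc_node_id is named
above; gtm_host, gtm_port and the example values are my assumptions, so
verify them against the documentation of your Postgres-XC version:

```conf
# Datanode postgresql.conf (sketch -- parameter names other than
# pgxc_node_id are assumptions, check your Postgres-XC version's docs)
gtm_host = 'pgxc1'      # host running the GTM (assumed name)
gtm_port = 6666         # GTM listen port (assumed name and value)
pgxc_node_id = 2        # unique id used to register this node on GTM
```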
> In your cluster you can have a Coordinator 1 and a Datanode 1, as the
> difference between node types is made when registering nodes on GTM.

>>> Are some portions ignored for a specific function (e.g. coordinator vs
>>> datanode config, etc.)? In this case, can pg_hba.conf & postgresql.conf
>>> be shared on the same server (maybe using symbolic links...)?
>
> No, you have to set up postgresql.conf and pg_hba.conf for each node
> separately, as each Coordinator and Datanode uses a different data folder.

I'm keeping the same pg_hba.conf setup in all dirs/hosts for the moment;
as for the others, I understood what you meant once the cluster was
finally running...

>>> 3. GTM: I get on the second node a "WARNING: Do not have a GTM snapshot
>>> available"; can this be related to the previous config files/setup?
>>
>> This error means that you didn't set up the GTM connection parameters
>> correctly.

>>> 4. Administration: are there ways of getting the online status of the
>>> nodes from one node to another?
>>
>> This is currently under development. I can provide you details later.
>
> Those experimental functionalities are located on a separate branch
> called ha_support in the Git repo.

It would be of great interest in terms of supportability from a
company-wide perspective. I'll give them a try if they're close to alpha
status :)

> We are also thinking about adding some catalog extensions to allow the
> Coordinator to keep an eye on Datanodes, as such a view process is linked
> to the connection pooling process.

Great idea.

> This is just a thought though. Now, as we are focusing on code stability
> for the core, this is not a high priority. But as we merged with
> PostgreSQL 9.0, it may be possible in the near future to use XC with Hot
> Standby nodes. Current streaming replication is not synchronous, so its
> usage is limited in current XC.
>
> If you have any other questions, don't hesitate.

I never do :)

Regards
Lionel F.