From: Koichi S. <koi...@gm...> - 2013-02-19 02:51:24
This is related to the postgresql.conf parameters max_coordinators and
max_datanodes. The default value of each is 16, so you should raise them.
They are specific to XC.

Regards;
----------
Koichi Suzuki


2013/2/19 Arni Sumarlidason <Arn...@md...>:
> The issue seems to be related to the size of the cluster. When attempting to
> initialize 20 nodes, nodes 18, 19, and 20 consistently failed (reproducible). I
> tried a cluster of 17 with fewer errors. I tried 10 with success.
>
> Best,
>
> From: Arni Sumarlidason
> Sent: Monday, February 18, 2013 8:14 PM
> To: 'koi...@gm...'; 'Postgres-XC Developers'
> Cc: pos...@li...; mic...@gm...
> Subject: RE: [Postgres-xc-general] pgxc: snapshot
>
> Mr. Koichi,
>
> You are right, PGADMIN was the source of these warnings.
>
> Do you have any idea what would cause the following error:
>
> # Cache lookup failed for node when executing CREATE NODE
>
> Lastly, I believe there are typos on lines 789, 797 in the pgxc script.
>
> Best regards,
>
> -----Original Message-----
> From: koi...@gm... [mailto:koi...@gm...] On Behalf
> Of Koichi Suzuki
> Sent: Monday, February 18, 2013 12:52 AM
> To: Arni Sumarlidason; Postgres-XC Developers
> Subject: Re: [Postgres-xc-general] pgxc: snapshot
>
> I tried the stress test. Autovacuum seems to work well. I also ran
> vacuum analyze verbose from psql directly and it worked without
> problem during this stress test. I ran vacuumdb as well. All
> worked without problem.
>
> I noticed that you used pgAdmin. Unfortunately, pgAdmin has not been
> tuned to work with XC. I know some pgAdmin features work well
> but others don't. Could you try to run vacuum from psql or
> vacuumdb? If they don't work, please let me know.
>
> Best Regards;
> ----------
> Koichi Suzuki
>
> 2013/2/18 Koichi Suzuki <ko...@in...>:
>> Nice to hear that pgxc_ctl helps.
>>
>> As to the warning, I will try to reproduce the problem and fix it.
>> I need to find time for it, so please bear with me. The test
>> will run many small transactions, which will cause autovacuum to be
>> launched, as attached. This test was built as a Datanode slave stress
>> test. I think it may also work as an autovacuum launch test. I will test
>> it with four coordinators, four datanodes, and four gtm_proxies. The whole
>> test will take a couple of hours on five six-core Xeon servers (one
>> for GTM).
>>
>> Do you think this makes sense to reproduce your problem?
>>
>> I will run it both on master and REL1_0_STABLE.
>>
>> Regards;
>> ---
>> Koichi
>>
>> On Sat, 16 Feb 2013 19:32:11 +0000
>> Arni Sumarlidason <Arn...@md...> wrote:
>>
>>> Koichi, and others,
>>>
>>> I spun up some fresh VMs and ran your script with the identical outcome:
>>> GTM Snapshot warnings from the autovacuum launcher.
>>> Please advise.
>>>
>>> Thank you for your script, it does make life easier!!
>>>
>>> Best,
>>>
>>> -----Original Message-----
>>> From: Koichi Suzuki [mailto:ko...@in...]
>>> Sent: Friday, February 15, 2013 4:11 AM
>>> To: Arni Sumarlidason
>>> Cc: Michael Paquier; koi...@gm...;
>>> pos...@li...
>>> Subject: Re: [Postgres-xc-general] pgxc: snapshot
>>>
>>> If you're not sure about the configuration, please try pgxc_ctl
>>> available at
>>>
>>> git://github.com/koichi-szk/PGXC-Tools.git
>>>
>>> This is a bash script (I'm rewriting it in C now), so it will help you
>>> understand how to configure XC.
>>>
>>> Regards;
>>> ---
>>> Koichi Suzuki
>>>
>>> On Fri, 15 Feb 2013 04:22:49 +0000
>>> Arni Sumarlidason <Arn...@md...> wrote:
>>>
>>> > Thank you both for fast response!!
>>> >
>>> > RE: Koichi Suzuki
>>> > I downloaded the git this afternoon.
>>> >
>>> > RE: Michael Paquier
>>> >
>>> > - Confirm it is from the datanode's log.
>>> >
>>> > - Both coord & datanode connect via the same gtm_proxy on
>>> > localhost
>>> >
>>> > These are my simplified configs, the only change I make on each
>>> > node is the nodename,
>>> >
>>> > PG_HBA
>>> > local   all   all                     trust
>>> > host    all   all   127.0.0.1/32      trust
>>> > host    all   all   ::1/128           trust
>>> > host    all   all   10.100.170.0/24   trust
>>> >
>>> > COORD
>>> > pgxc_node_name = 'coord01'
>>> > listen_addresses = '*'
>>> > port = 5432
>>> > max_connections = 200
>>> >
>>> > gtm_port = 6666
>>> > gtm_host = 'localhost'
>>> > pooler_port = 6670
>>> >
>>> > shared_buffers = 32MB
>>> > work_mem = 1MB
>>> > maintenance_work_mem = 16MB
>>> > max_stack_depth = 2MB
>>> >
>>> > log_timezone = 'US/Eastern'
>>> > datestyle = 'iso, mdy'
>>> > timezone = 'US/Eastern'
>>> > lc_messages = 'en_US.UTF-8'
>>> > lc_monetary = 'en_US.UTF-8'
>>> > lc_numeric = 'en_US.UTF-8'
>>> > lc_time = 'en_US.UTF-8'
>>> > default_text_search_config = 'pg_catalog.english'
>>> >
>>> > DATA
>>> > pgxc_node_name = 'data01'
>>> > listen_addresses = '*'
>>> > port = 5433
>>> > max_connections = 200
>>> >
>>> > gtm_port = 6666
>>> > gtm_host = 'localhost'
>>> >
>>> > shared_buffers = 32MB
>>> > work_mem = 1MB
>>> > maintenance_work_mem = 16MB
>>> > max_stack_depth = 2MB
>>> >
>>> > log_timezone = 'US/Eastern'
>>> > datestyle = 'iso, mdy'
>>> > timezone = 'US/Eastern'
>>> > lc_messages = 'en_US.UTF-8'
>>> > lc_monetary = 'en_US.UTF-8'
>>> > lc_numeric = 'en_US.UTF-8'
>>> > lc_time = 'en_US.UTF-8'
>>> > default_text_search_config = 'pg_catalog.english'
>>> >
>>> > PROXY
>>> > Nodename = 'proxy01'
>>> > listen_addresses = '*'
>>> > port = 6666
>>> > gtm_host = '10.100.170.10'
>>> > gtm_port = 6666
>>> >
>>> > best,
>>> >
>>> > Arni
>>> >
>>> > From: Michael Paquier [mailto:mic...@gm...]
>>> > Sent: Thursday, February 14, 2013 11:06 PM
>>> > To: Arni Sumarlidason
>>> > Cc: pos...@li...
>>> > Subject: Re: [Postgres-xc-general] pgxc: snapshot
>>> >
>>> > On Fri, Feb 15, 2013 at 12:57 PM, Arni Sumarlidason
>>> > <Arn...@md...<mailto:Arn...@md...>> wrote:
>>> > Hi Everyone!
>>> >
>>> > I am getting these errors, "Warning: do not have a gtm snapshot
>>> > available"[1]. After researching I found posts about the autovacuum
>>> > causing these errors; is this fixed or a work in progress? Also, I am
>>> > seeing them without the CONTEXT: automatic vacuum message too. Is this
>>> > something to worry about? Cluster seems to be functioning normally.
>>> >
>>> > Vacuum and analyze from pgadmin looks like this,
>>> > INFO: vacuuming "public.table"
>>> > INFO: "table": found 0 removable, 0 nonremovable row versions in 0
>>> > pages
>>> > DETAIL: 0 dead row versions cannot be removed yet.
>>> > CPU 0.00s/0.00u sec elapsed 0.00 sec.
>>> > INFO: analyzing "public.table"
>>> > INFO: "table": scanned 0 of 0 pages, containing 0 live rows and 0
>>> > dead rows; 0 rows in sample, 0 estimated total rows
>>> > Total query runtime: 15273 ms.
>>> >
>>> > Should we use execute direct to perform maintenance?
>>> > No. Isn't this happening on a Datanode?
>>> > Be sure first to set gtm_host and gtm_port in postgresql.conf of all
>>> > the nodes, Coordinator and Datanode included. GXID and snapshots are
>>> > fetched of course on the Coordinator for a normal transaction run, but
>>> > also on all the nodes for autovacuum.
>>> > --
>>> > Michael
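
For reference, the suggestion at the top of this message (raising max_coordinators and max_datanodes above their default of 16) might look like the fragment below in postgresql.conf. This is only a sketch: the value 32 is an illustrative assumption chosen to leave room for the 20-node cluster mentioned in the thread, and it is assumed here that the same settings are applied on every Coordinator and Datanode and that a restart is needed for them to take effect.

```
# postgresql.conf (Postgres-XC specific parameters)
# Both default to 16; 32 is a hypothetical value picked for a 20-node cluster.
max_coordinators = 32    # upper limit on Coordinators known to this node
max_datanodes = 32       # upper limit on Datanodes known to this node
```

After changing the values, retrying the CREATE NODE statements that failed for nodes beyond the old limit should show whether this was the cause.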