From: Nirmal S. <sha...@gm...> - 2014-02-03 22:07:52
Hi All,

I tried with log_statement enabled on all the nodes and the coordinator, and I got this:

--This is the coordinator log

*LOG: duration: 8807.961 ms* statement: select coalesce(fgpc.date_id,fgcd.date_id) date_id, fgpc.m_ad_grp_pub_key m_ad_grp_pub_key, fgpc.m_kw_pub_key m_kw_pub_key, kws.expr_names, kws.expr_values, kws.m_ad_grp_semid, sum(fgpc.m_imps) m_imps, sum(fgpc.m_clicks) m_clicks, sum(fgpc.m_cost) m_cost, sum(fgpc.m_conv_1pc) m_conv_1pc, sum(fgpc.m_conv_mpc) m_conv_mpc, avg(fgpc.m_cnv_rate_1pc) m_cnv_rate_1pc, avg(fgpc.m_cnv_rate_mpc) m_cnv_rate_mpc, avg(fgpc.m_avg_cpc) m_avg_cpc, avg(fgpc.m_max_cpc) m_max_cpc, avg(fgpc.m_firstpage_cpc) m_firstpage_cpc, avg(fgpc.m_topofpage_cpc) m_topofpage_cpc, avg(fgpc.m_avg_cpm) m_avg_cpm, avg(fgpc.m_max_cpm) m_max_cpm, avg(fgpc.m_max_cpa_pct) m_max_cpa_pct, avg(fgpc.m_avg_pos) m_avg_pos, avg(fgpc.m_lowest_pos) m_lowest_pos, avg(fgpc.m_highest_pos) m_highest_pos, avg(fgpc.m_quality_score) m_quality_score, avg(fgpc.m_view_thru_conv) m_view_thru_conv, sum(fgcd.m_revenue) m_revenue, sum(fgcd.m_conversions) m_conversions, sum(coalesce(kws.m_new_kw_bid, kws.m_kw_bid)) m_kw_total_bid, max(coalesce(kws.m_new_kw_bid, kws.m_kw_bid)) m_kw_max_bid, min(coalesce(kws.m_new_kw_bid, kws.m_kw_bid)) m_kw_min_bid from bidw.fact_msn_kw_perf_daily fgpc full outer join bidw.fact_msn_kw_conversion_daily fgcd on fgpc.m_ad_grp_pub_key = fgcd.m_ad_grp_pub_key and fgpc.m_kw_pub_key = fgcd.m_kw_pub_key and fgpc.date_id = fgpc.date_id join biods.msn_keyword_sup kws on fgpc.m_kw_pub_key = kws.m_kw_pub_key and fgpc.m_ad_grp_pub_key = kws.m_ad_grp_pub_key where coalesce(fgpc.date_id,fgcd.date_id) between 20131201 and 20140119 group by coalesce(fgpc.date_id,fgcd.date_id),fgpc.m_ad_grp_pub_key,fgpc.m_kw_pub_key,kws.expr_names,kws.expr_values,kws.m_ad_grp_semid

*And this is the log info from all the data nodes' log files:*

*LOG: duration: 8387.136 ms* statement: SELECT COALESCE((l.a_1)::bigint, l.a_23), l.a_2, l.a_3, r.a_1, r.a_2, r.a_3, sum(l.a_4), sum(l.a_5), sum(l.a_6), sum(l.a_7), sum(l.a_8), pg_catalog.numeric_avg(avg(l.a_9)), pg_catalog.numeric_avg(avg(l.a_10)), pg_catalog.numeric_avg(avg(l.a_11)), pg_catalog.numeric_avg(avg(l.a_12)), pg_catalog.numeric_avg(avg(l.a_13)), pg_catalog.numeric_avg(avg(l.a_14)), pg_catalog.numeric_avg(avg(l.a_15)), pg_catalog.numeric_avg(avg(l.a_16)), pg_catalog.numeric_avg(avg(l.a_17)), pg_catalog.numeric_avg(avg(l.a_18)), pg_catalog.numeric_avg(avg(l.a_19)), pg_catalog.numeric_avg(avg(l.a_20)), pg_catalog.numeric_avg(avg(l.a_21)), pg_catalog.numeric_avg(avg(l.a_22)), sum(l.a_24), sum(l.a_25), sum(COALESCE(r.a_4, r.a_5)), max(COALESCE(r.a_4, r.a_5)), min(COALESCE(r.a_4, r.a_5)) FROM ((SELECT l.a_1, l.a_2, l.a_3, l.a_4, l.a_5, l.a_6, l.a_7, l.a_8, l.a_9, l.a_10, l.a_11, l.a_12, l.a_13, l.a_14, l.a_15, l.a_16, l.a_17, l.a_18, l.a_19, l.a_20, l.a_21, l.a_22, r.a_1, r.a_2, r.a_3 FROM ((SELECT fgpc.date_id, fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, fgpc.m_imps, fgpc.m_clicks, fgpc.m_cost, fgpc.m_conv_1pc, fgpc.m_conv_mpc, fgpc.m_cnv_rate_1pc, fgpc.m_cnv_rate_mpc, fgpc.m_avg_cpc, fgpc.m_max_cpc, fgpc.m_firstpage_cpc, fgpc.m_topofpage_cpc, fgpc.m_avg_cpm, fgpc.m_max_cpm, fgpc.m_max_cpa_pct, fgpc.m_avg_pos, fgpc.m_lowest_pos, fgpc.m_highest_pos, fgpc.m_quality_score, fgpc.m_view_thru_conv FROM ONLY bidw.fact_msn_kw_perf_daily fgpc WHERE true) l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13, a_14, a_15, a_16, a_17, a_18, a_19, a_20, a_21, a_22) LEFT JOIN (SELECT fgcd.date_id, fgcd.m_revenue, fgcd.m_conversions, fgcd.m_ad_grp_pub_key, fgcd.m_kw_pub_key FROM ONLY bidw.fact_msn_kw_conversion_daily fgcd WHERE true) r(a_1, a_2, a_3, a_4, a_5) ON (((l.a_1 = l.a_1) AND (l.a_2 = r.a_4) AND (l.a_3 = r.a_5)))) WHERE ((COALESCE((l.a_1)::bigint, r.a_1) >= 20131201) AND (COALESCE((l.a_1)::bigint, r.a_1) <= 20140119))) l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13, a_14, a_15, a_16, a_17, a_18, a_19, a_20, a_21, a_22, a_23, a_24, a_25) JOIN (SELECT kws.expr_names, kws.expr_values, kws.m_ad_grp_semid, kws.m_new_kw_bid, kws.m_kw_bid, kws.m_kw_pub_key, kws.m_ad_grp_pub_key FROM ONLY biods.msn_keyword_sup kws WHERE true) r(a_1, a_2, a_3, a_4, a_5, a_6, a_7) ON (true)) WHERE ((l.a_2 = r.a_7) AND (l.a_3 = r.a_6)) GROUP BY 1, 2, 3, 4, 5, 6

So as per the query log everything looks fine, i.e. the coordinator is working the way it should. *But then why does the statement below take 23 seconds? (test.sql contains the same query shown above.)*

[postgres@sv4-pgxc-db04 test]$ time psql -d adchemy11100 -f "test.sql" > /dev/null

*real 0m23.394s*
user 0m15.900s
sys 0m0.645s

Please advise.

Nirmal

On Sat, Feb 1, 2014 at 1:47 PM, Mason Sharp <ms...@tr...> wrote:

> On Sat, Feb 1, 2014 at 2:40 PM, Nirmal Sharma <sha...@gm...> wrote:
>
>> Hi Mason,
>>
>> This is the actual query that I was running.
>>
>> select coalesce(fgpc.date_id,fgcd.date_id)
>> date_id,
>> fgpc.m_ad_grp_pub_key m_ad_grp_pub_key,
>> fgpc.m_kw_pub_key m_kw_pub_key,
>> kws.expr_names,
>> kws.expr_values,
>> kws.m_ad_grp_semid,
>> sum(fgpc.m_imps) m_imps,
>> sum(fgpc.m_clicks) m_clicks,
>> sum(fgpc.m_cost) m_cost,
>> sum(fgpc.m_conv_1pc) m_conv_1pc,
>> sum(fgpc.m_conv_mpc) m_conv_mpc,
>> avg(fgpc.m_cnv_rate_1pc) m_cnv_rate_1pc,
>> avg(fgpc.m_cnv_rate_mpc) m_cnv_rate_mpc,
>> avg(fgpc.m_avg_cpc) m_avg_cpc,
>> avg(fgpc.m_max_cpc) m_max_cpc,
>> avg(fgpc.m_firstpage_cpc) m_firstpage_cpc,
>> avg(fgpc.m_topofpage_cpc) m_topofpage_cpc,
>> avg(fgpc.m_avg_cpm) m_avg_cpm,
>> avg(fgpc.m_max_cpm) m_max_cpm,
>> avg(fgpc.m_max_cpa_pct) m_max_cpa_pct,
>> avg(fgpc.m_avg_pos) m_avg_pos,
>> avg(fgpc.m_lowest_pos) m_lowest_pos,
>> avg(fgpc.m_highest_pos) m_highest_pos,
>> avg(fgpc.m_quality_score) m_quality_score,
>> avg(fgpc.m_view_thru_conv) m_view_thru_conv,
>> sum(fgcd.m_revenue) m_revenue,
>> sum(fgcd.m_conversions) m_conversions,
>> sum(coalesce(kws.m_new_kw_bid, kws.m_kw_bid)) m_kw_total_bid,
>> max(coalesce(kws.m_new_kw_bid, kws.m_kw_bid)) m_kw_max_bid,
>> min(coalesce(kws.m_new_kw_bid, kws.m_kw_bid)) m_kw_min_bid
>> from
>> bidw.fact_msn_kw_perf_daily fgpc
>> full outer join bidw.fact_msn_kw_conversion_daily fgcd on
>> fgpc.m_ad_grp_pub_key = fgcd.m_ad_grp_pub_key and fgpc.m_kw_pub_key =
>> fgcd.m_kw_pub_key and fgpc.date_id = fgpc.date_id
>> join biods.msn_keyword_sup kws on
>> fgpc.m_kw_pub_key = kws.m_kw_pub_key and fgpc.m_ad_grp_pub_key =
>> kws.m_ad_grp_pub_key
>> where
>> coalesce(fgpc.date_id,fgcd.date_id) between
>> 20131201 and 20140119
>> group by
>> coalesce(fgpc.date_id,fgcd.date_id),fgpc.m_ad_grp_pub_key,fgpc.m_kw_pub_key,kws.expr_names,kws.expr_values,kws.m_ad_grp_semid
>> ;
>>
>> *This is the explain plan for the same.*
>> explain analyze verbose select ......
>> ....
>> >> * Data Node Scan on "__REMOTE_GROUP_QUERY__" (cost=0.00..2.50 rows=1000 >> width=908) (actual time=1672.149..8281.918 rows=605575 loops=1)* >> Output: (COALESCE((fgpc.date_id)::bigint, fgcd.date_id)), >> fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, kws.expr_names, kws.expr_values, >> kws.m_ad_grp_semid, (sum(fgpc.m_imps)), (sum(fgpc.m_clicks)), >> (sum(fgpc.m_cost)), (sum(fgpc.m_conv_1pc)), (sum(fgpc.m_conv_mpc)), >> (avg(fgpc.m_cnv_rate_1pc)), (avg(fgpc.m_cnv_ra >> te_mpc)), (avg(fgpc.m_avg_cpc)), (avg(fgpc.m_max_cpc)), >> (avg(fgpc.m_firstpage_cpc)), (avg(fgpc.m_topofpage_cpc)), >> (avg(fgpc.m_avg_cpm)), (avg(fgpc.m_max_cpm)), (avg(fgpc.m_max_cpa_pct)), >> (avg(fgpc.m_avg_pos)), (avg(fgpc.m_lowest_pos)), (avg(fgpc.m_highest_pos)), >> (avg(fgpc.m_quality_score)), (avg(fgpc.m_view_thr >> u_conv)), (sum(fgcd.m_revenue)), (sum(fgcd.m_conversions)), >> (sum(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), >> (max(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), >> (min(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))) >> *Node/s: d11, d12, d13, d14, d15, d16* >> Remote query: SELECT COALESCE((l.a_1)::bigint, l.a_23), l.a_2, l.a_3, >> r.a_1, r.a_2, r.a_3, sum(l.a_4), sum(l.a_5), sum(l.a_6), sum(l.a_7), >> sum(l.a_8), pg_catalog.numeric_avg(avg(l.a_9)), >> pg_catalog.numeric_avg(avg(l.a_10)), pg_catalog.numeric_avg(avg(l.a_11)), >> pg_catalog.numeric_avg(avg(l.a_12)), pg_catalog. >> numeric_avg(avg(l.a_13)), pg_catalog.numeric_avg(avg(l.a_14)), >> pg_catalog.numeric_avg(avg(l.a_15)), pg_catalog.numeric_avg(avg(l.a_16)), >> pg_catalog.numeric_avg(avg(l.a_17)), pg_catalog.numeric_avg(avg(l.a_18)), >> pg_catalog.numeric_avg(avg(l.a_19)), pg_catalog.numeric_avg(avg(l.a_20)), >> pg_catalog.numeric_avg(avg( >> l.a_21)), pg_catalog.numeric_avg(avg(l.a_22)), sum(l.a_24), sum(l.a_25), >> sum(COALESCE(r.a_4, r.a_5)), max(COALESCE(r.a_4, r.a_5)), >> min(COALESCE(r.a_4, r.a_5)) FROM ((SELECT l.a_1, l.a_2, l.a_3, l.a_4, >> l.a_5, l.a_6, l.a_7, l.a_8, l.a_9, l.a_10, l.a_11, l.a_12, l.a_13, l.a_14, >> l.a_15, l.a_16, l.a_17, l.a_18, l.a_ >> 19, l.a_20, l.a_21, l.a_22, r.a_1, r.a_2, r.a_3 FROM ((SELECT >> fgpc.date_id, fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, fgpc.m_imps, >> fgpc.m_clicks, fgpc.m_cost, fgpc.m_conv_1pc, fgpc.m_conv_mpc, >> fgpc.m_cnv_rate_1pc, fgpc.m_cnv_rate_mpc, fgpc.m_avg_cpc, fgpc.m_max_cpc, >> fgpc.m_firstpage_cpc, fgpc.m_topofpage_cpc, f >> gpc.m_avg_cpm, fgpc.m_max_cpm, fgpc.m_max_cpa_pct, fgpc.m_avg_pos, >> fgpc.m_lowest_pos, fgpc.m_highest_pos, fgpc.m_quality_score, >> fgpc.m_view_thru_conv FROM ONLY bidw.fact_msn_kw_perf_daily fgpc WHERE >> true) l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12, >> a_13, a_14, a_15, a_16, a_17, a_18, a_19, >> a_20, a_21, a_22) LEFT JOIN (SELECT fgcd.date_id, fgcd.m_revenue, >> fgcd.m_conversions, fgcd.m_ad_grp_pub_key, fgcd.m_kw_pub_key FROM ONLY >> bidw.fact_msn_kw_conversion_daily fgcd WHERE true) r(a_1, a_2, a_3, a_4, >> a_5) ON (((l.a_1 = l.a_1) AND (l.a_2 = r.a_4) AND (l.a_3 = r.a_5)))) WHERE >> ((COALESCE((l.a_1)::bigint, >> r.a_1) >= 20131201) AND (COALESCE((l.a_1)::bigint, r.a_1) <= 20140119))) >> l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13, >> a_14, a_15, a_16, a_17, a_18, a_19, a_20, a_21, a_22, a_23, a_24, a_25) >> JOIN (SELECT kws.expr_names, kws.expr_values, kws.m_ad_grp_semid, >> kws.m_new_kw_bid, kws.m_kw_bi >> d, kws.m_kw_pub_key, kws.m_ad_grp_pub_key FROM ONLY biods.msn_keyword_sup >> kws WHERE true) r(a_1, a_2, a_3, a_4, a_5, a_6, a_7) ON (true)) WHERE >> ((l.a_2 = r.a_7) AND (l.a_3 = r.a_6)) GROUP BY 1, 
2, 3, 4, 5, 6
>> * Total runtime: 8378.080 ms*
>> (5 rows)
>>
>> *This is the actual time taken by the query:*
>>
>> [postgres@sv4-pgxc-db04 test]$ time psql -d mydb -f "test.sql" > a.out
>>
>> *real 0m23.533s*
>> user 0m15.705s
>> sys 0m0.748s
>>
>> Now I don't know why it is taking that much time.
>
> Try adding LIMIT with different amounts, for example, to see how that impacts time.
>
> Also, try enabling statement logging (log_statement = all in postgresql.conf) on the data nodes to see how long it takes on each node.
>
> Also, the statement was rewritten in XC, with relations converted into SELECTs, so try running the rewritten version directly to see how long it takes.
>
> Thanks,
>
> Mason
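[A note for readers of this thread: the gap between the 8.4 s "Total runtime" in the logs and the 23 s wall-clock run is exactly the part EXPLAIN ANALYZE does not measure, namely shipping and rendering the ~605,575 result rows on the client. A minimal psql sketch to separate the two, assuming test.sql holds the query above:]

    \timing on
    -- End-to-end: coordinator execution plus transfer of every result row to psql.
    \i test.sql
    -- Server side only: wrap the same statement in EXPLAIN ANALYZE; its
    -- "Total runtime" excludes row transfer and client-side formatting.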
From: Sandeep G. <gup...@gm...> - 2014-02-03 18:20:53
Hi Ashutosh,

For us the app + autovacuum combination is quite harmful. We are not able to run the application because the index creation gets aborted in the middle. The datanodes crash. We could somehow restart the datanodes and restart the index creation, but my feeling is it will happen quite often.

I have a related question: is there any way to know if a command has failed? Usually we fire a command using psql and move on to the next command. Is there any way to know whether the previous command failed or was a success?

Thanks.
Sandeep

On Mon, Feb 3, 2014 at 1:13 PM, Sandeep Gupta <gup...@gm...> wrote:

> Hi Ashutosh, Koichi,
>
> Initially my feeling was that this was a postgres bug. That is why I
> posted it in the postgres community. However, I now feel that it is due to
> the changes made in XC.
>
> I have started the same test on standalone postgres. So far it hasn't
> crashed. My feeling is that it won't. If in case it does, I will report
> accordingly.
>
> As requested, I started the test with the verbose log on. Attached are the
> log files for the coordinator and the datanodes.
> There are several redundant messages that get printed, such as "checkpoint
> too often". Please use some filters etc. to view the log file. I thought it
> was best to send across the whole file.
>
> To clarify, I create a very large table (using copy) and then repeatedly
> create and drop an index. I understand this is not the actual workload, but
> that was the only way to reproduce the error.
>
> The other complication is that in the real system we get two kinds of
> errors, "tuple not found" and this deadlock. I feel that they are connected.
>
> Let me know if the log files help, or if there are any other suggestions
> you may have.
>
> -Sandeep
>
> On Sun, Feb 2, 2014 at 11:49 PM, Ashutosh Bapat <ash...@en...> wrote:
>
>> Hi Sandeep,
>> Can you please check if a similar error happens on vanilla PG. It may be an
>> application + auto-vacuum error, which can happen in PG as well and might
>> be harmless. It's auto-vacuum being cancelled. Auto-vacuum will run again
>> during the next iteration.
>>
>> On Fri, Jan 31, 2014 at 11:21 PM, Sandeep Gupta <gup...@gm...> wrote:
>>
>>> Hi,
>>>
>>> I was debugging an outstanding issue with pgxc (
>>> http://sourceforge.net/mailarchive/forum.php?thread_name=CABEZHFtr_YoWb22UAnPGQz8M5KqpwzbviYiAgq_%3DY...@ma...&forum_name=postgres-xc-general).
>>>
>>> I couldn't reproduce that error. But I do get this error.
>>>
>>> LOG: database system is ready to accept connections
>>> LOG: autovacuum launcher started
>>> LOG: sending cancel to blocking autovacuum PID 17222
>>> DETAIL: Process 13896 waits for AccessExclusiveLock on relation 16388
>>> of database 12626.
>>> STATEMENT: drop index mdn
>>> ERROR: canceling autovacuum task
>>> CONTEXT: automatic analyze of table "postgres.public.la_directednetwork"
>>> PreAbort Remote
>>>
>>> It seems to be a deadlock issue and may be related to the earlier
>>> problem as well.
>>> Please let me know your comments.
>>>
>>> -Sandeep
>>
>> --
>> Best Wishes,
>> Ashutosh Bapat
>> EnterpriseDB Corporation
>> The Postgres Database Company
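[On the "did the previous command fail" question above: stock psql, which XC's client is, already covers this. A minimal shell sketch; the file and database names here are placeholders:]

    # Abort at the first failing statement; psql then exits with status 3
    # (0 = success, 1 = psql's own fatal error, 2 = connection lost).
    psql -v ON_ERROR_STOP=1 -d mydb -f commands.sql
    echo $?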
From: Sandeep G. <gup...@gm...> - 2014-02-03 18:13:23
LOG: 00000: database system was shut down at 2014-02-03 11:42:41 EST LOCATION: StartupXLOG, xlog.c:6348 LOG: 00000: database system is ready to accept connections LOCATION: reaper, postmaster.c:2560 LOG: 00000: autovacuum launcher started LOCATION: AutoVacLauncherMain, autovacuum.c:407 WARNING: 01000: unexpected EOF on datanode connection LOCATION: pgxc_node_receive, pgxcnode.c:463 ERROR: XX000: failed to send PREPARE TRANSACTION command to the node 16384 LOCATION: pgxc_node_remote_prepare, execRemote.c:1629 STATEMENT: create INDEX mdn on la_directednetwork(head) WARNING: 01000: unexpected EOF on datanode connection LOCATION: pgxc_node_receive, pgxcnode.c:463 LOG: 00000: Failed to ABORT at node 16384 Detail: unexpected EOF on datanode connection LOCATION: pgxc_node_remote_abort, execRemote.c:2039 LOG: 00000: Failed to ABORT an implicitly PREPARED transaction status - 7 LOCATION: pgxc_node_remote_abort, execRemote.c:2070 ERROR: 42704: index "mdn" does not exist LOCATION: DropErrorMsgNonExistent, tablecmds.c:746 STATEMENT: drop index mdn WARNING: 01000: Unexpected data on connection, cleaning. LOCATION: acquire_connection, poolmgr.c:2141 LOG: 08006: failed to connect to Datanode LOCATION: grow_pool, poolmgr.c:2259 WARNING: 01000: can not connect to node 16384 LOCATION: acquire_connection, poolmgr.c:2153 LOG: 53000: failed to acquire connections LOCATION: pool_recvfds, poolcomm.c:623 STATEMENT: create INDEX mdn on la_directednetwork(head) ERROR: 53000: Failed to get pooled connections LOCATION: get_handles, pgxcnode.c:1969 STATEMENT: create INDEX mdn on la_directednetwork(head) ERROR: 42704: index "mdn" does not exist LOCATION: DropErrorMsgNonExistent, tablecmds.c:746 STATEMENT: drop index mdn LOG: 08006: failed to connect to Datanode LOCATION: grow_pool, poolmgr.c:2259 WARNING: 01000: can not connect to node 16384 LOCATION: acquire_connection, poolmgr.c:2153 LOG: 53000: failed to acquire connections LOCATION: pool_recvfds, poolcomm.c:623 STATEMENT: create INDEX mdn on la_directednetwork(head) ERROR: 53000: Failed to get pooled connections LOCATION: get_handles, pgxcnode.c:1969 STATEMENT: create INDEX mdn on la_directednetwork(head) ERROR: 42704: index "mdn" does not exist LOCATION: DropErrorMsgNonExistent, tablecmds.c:746 STATEMENT: drop index mdn LOG: 08006: failed to connect to Datanode LOCATION: grow_pool, poolmgr.c:2259 WARNING: 01000: can not connect to node 16384 LOCATION: acquire_connection, poolmgr.c:2153 LOG: 53000: failed to acquire connections LOCATION: pool_recvfds, poolcomm.c:623 STATEMENT: create INDEX mdn on la_directednetwork(head) ERROR: 53000: Failed to get pooled connections LOCATION: get_handles, pgxcnode.c:1969 STATEMENT: create INDEX mdn on la_directednetwork(head) ERROR: 42704: index "mdn" does not exist LOCATION: DropErrorMsgNonExistent, tablecmds.c:746 STATEMENT: drop index mdn LOG: 08006: failed to connect to Datanode LOCATION: grow_pool, poolmgr.c:2259 WARNING: 01000: can not connect to node 16384 LOCATION: acquire_connection, poolmgr.c:2153 LOG: 53000: failed to acquire connections LOCATION: pool_recvfds, poolcomm.c:623 STATEMENT: create INDEX mdn on la_directednetwork(head) ERROR: 53000: Failed to get pooled connections LOCATION: get_handles, pgxcnode.c:1969 STATEMENT: create INDEX mdn on la_directednetwork(head) ERROR: 42704: index "mdn" does not exist LOCATION: DropErrorMsgNonExistent, tablecmds.c:746 STATEMENT: drop index mdn LOG: 08006: failed to connect to Datanode LOCATION: grow_pool, poolmgr.c:2259 WARNING: 01000: can not connect to node 16384 LOCATION: 
acquire_connection, poolmgr.c:2153 LOG: 53000: failed to acquire connections LOCATION: pool_recvfds, poolcomm.c:623 STATEMENT: create INDEX mdn on la_directednetwork(head) ERROR: 53000: Failed to get pooled connections LOCATION: get_handles, pgxcnode.c:1969 STATEMENT: create INDEX mdn on la_directednetwork(head) ERROR: 42704: index "mdn" does not exist LOCATION: DropErrorMsgNonExistent, tablecmds.c:746 STATEMENT: drop index mdn LOG: 08006: failed to connect to Datanode LOCATION: grow_pool, poolmgr.c:2259 WARNING: 01000: can not connect to node 16384 LOCATION: acquire_connection, poolmgr.c:2153 LOG: 53000: failed to acquire connections LOCATION: pool_recvfds, poolcomm.c:623 STATEMENT: create INDEX mdn on la_directednetwork(head) ERROR: 53000: Failed to get pooled connections LOCATION: get_handles, pgxcnode.c:1969 STATEMENT: create INDEX mdn on la_directednetwork(head) ERROR: 42704: index "mdn" does not exist LOCATION: DropErrorMsgNonExistent, tablecmds.c:746 STATEMENT: drop index mdn LOG: 08006: failed to connect to Datanode LOCATION: grow_pool, poolmgr.c:2259 WARNING: 01000: can not connect to node 16384 LOCATION: acquire_connection, poolmgr.c:2153 LOG: 53000: failed to acquire connections LOCATION: pool_recvfds, poolcomm.c:623 STATEMENT: create INDEX mdn on la_directednetwork(head) ERROR: 53000: Failed to get pooled connections LOCATION: get_handles, pgxcnode.c:1969 STATEMENT: create INDEX mdn on la_directednetwork(head) ERROR: 42704: index "mdn" does not exist LOCATION: DropErrorMsgNonExistent, tablecmds.c:746 STATEMENT: drop index mdn LOG: 08006: failed to connect to Datanode LOCATION: grow_pool, poolmgr.c:2259 WARNING: 01000: can not connect to node 16384 LOCATION: acquire_connection, poolmgr.c:2153 LOG: 53000: failed to acquire connections LOCATION: pool_recvfds, poolcomm.c:623 STATEMENT: create INDEX mdn on la_directednetwork(head) ERROR: 53000: Failed to get pooled connections LOCATION: get_handles, pgxcnode.c:1969 STATEMENT: create INDEX mdn on la_directednetwork(head) ERROR: 42704: index "mdn" does not exist LOCATION: DropErrorMsgNonExistent, tablecmds.c:746 STATEMENT: drop index mdn LOG: 08006: failed to connect to Datanode LOCATION: grow_pool, poolmgr.c:2259 WARNING: 01000: can not connect to node 16384 LOCATION: acquire_connection, poolmgr.c:2153 LOG: 53000: failed to acquire connections LOCATION: pool_recvfds, poolcomm.c:623 STATEMENT: create INDEX mdn on la_directednetwork(head) ERROR: 53000: Failed to get pooled connections LOCATION: get_handles, pgxcnode.c:1969 STATEMENT: create INDEX mdn on la_directednetwork(head) ERROR: 42704: index "mdn" does not exist LOCATION: DropErrorMsgNonExistent, tablecmds.c:746 STATEMENT: drop index mdn WARNING: 25P01: there is no transaction in progress LOCATION: EndTransactionBlock, xact.c:4086 LOG: 08006: failed to connect to Datanode LOCATION: grow_pool, poolmgr.c:2259 WARNING: 01000: can not connect to node 16384 LOCATION: acquire_connection, poolmgr.c:2153 LOG: 53000: failed to acquire connections LOCATION: pool_recvfds, poolcomm.c:623 STATEMENT: CHECKPOINT ERROR: 53000: Failed to get pooled connections LOCATION: get_handles, pgxcnode.c:1969 STATEMENT: CHECKPOINT |
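[A note on the repeating "Failed to get pooled connections" loop above: after a datanode failure, the coordinator's connection pool can keep returning errors until it is refreshed. A sketch of the usual manual step, assuming the standard XC pooler function; whether it resolves this particular crash is untested here:]

    -- On the coordinator, once the failed datanode is back up:
    SELECT pgxc_pool_reload();  -- drops and rebuilds pooled connections to all nodes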
From: Ashutosh B. <ash...@en...> - 2014-02-03 04:52:56
Can you please check if there is increase in disk i/o as the number of rows processed increases. I do not see any problem with the planner. But because of huge result from datanode and not enough RAM, coordinator might be choosing to store it on the disk. On Mon, Feb 3, 2014 at 12:02 AM, Nirmal Sharma <sha...@gm...>wrote: > Yes you are absolutely right. > If I run the same query directly on nodes then it runs very fast. It is > running slow when I run from coordinator. How am I going to resolve this > tuple handling on coordinator? > Please advise. > > Sent from my iPad > > On Feb 2, 2014, at 9:32 AM, Mason Sharp <ms...@tr...> wrote: > > > > > On Sun, Feb 2, 2014 at 12:49 AM, Nirmal Sharma <sha...@gm...>wrote: > >> This is the explain plan for the query with limit 10000. >> >> >> Limit (cost=0.00..2.50 rows=1000 width=908) (actual >> time=1586.926..1836.081 rows=10000 loops=1) >> Output: (COALESCE((fgpc.date_id)::bigint, fgcd.date_id)), >> fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, kws.expr_names, kws.expr_values, >> kws.m_ad_grp_semid, (sum(fgpc.m_imps)), (sum(fgpc.m_clicks)), >> (sum(fgpc.m_cost)), (sum(fgpc.m_conv_1pc)), (sum(fgpc.m_conv_mpc)), (avg(f >> gpc.m_cnv_rate_1pc)), (avg(fgpc.m_cnv_rate_mpc)), (avg(fgpc.m_avg_cpc)), >> (avg(fgpc.m_max_cpc)), (avg(fgpc.m_firstpage_cpc)), >> (avg(fgpc.m_topofpage_cpc)), (avg(fgpc.m_avg_cpm)), (avg(fgpc.m_max_cpm)), >> (avg(fgpc.m_max_cpa_pct)), (avg(fgpc.m_avg_pos)), (avg(fgpc.m_lowest_pos >> )), (avg(fgpc.m_highest_pos)), (avg(fgpc.m_quality_score)), >> (avg(fgpc.m_view_thru_conv)), (sum(fgcd.m_revenue)), >> (sum(fgcd.m_conversions)), (sum(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), >> (max(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), >> (min(COALESCE(kws.m_new_kw_bid, kw >> s.m_kw_bid))) >> -> Data Node Scan on "__REMOTE_LIMIT_QUERY__" (cost=0.00..2.50 >> rows=1000 width=908) (actual time=1586.924..1834.118 rows=10000 loops=1) >> Output: (COALESCE((fgpc.date_id)::bigint, fgcd.date_id)), >> fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, kws.expr_names, kws.expr_values, >> kws.m_ad_grp_semid, (sum(fgpc.m_imps)), (sum(fgpc.m_clicks)), >> (sum(fgpc.m_cost)), (sum(fgpc.m_conv_1pc)), (sum(fgpc.m_conv_mpc)), >> (avg(fgpc.m_cnv_rate_1pc)), (avg(fgpc.m_cnv_rate_mpc)), >> (avg(fgpc.m_avg_cpc)), (avg(fgpc.m_max_cpc)), (avg(fgpc.m_firstpage_cpc)), >> (avg(fgpc.m_topofpage_cpc)), (avg(fgpc.m_avg_cpm)), (avg(fgpc.m_max_cpm)), >> (avg(fgpc.m_max_cpa_pct)), (avg(fgpc.m_avg_pos)), (avg(fgpc.m_lowe >> st_pos)), (avg(fgpc.m_highest_pos)), (avg(fgpc.m_quality_score)), >> (avg(fgpc.m_view_thru_conv)), (sum(fgcd.m_revenue)), >> (sum(fgcd.m_conversions)), (sum(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), >> (max(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), >> (min(COALESCE(kws.m_new_kw_b >> id, kws.m_kw_bid))) >> Node/s: d11, d12, d13, d14, d15, d16 >> Remote query: SELECT COALESCE((l.a_1)::bigint, l.a_23), l.a_2, >> l.a_3, r.a_1, r.a_2, r.a_3, sum(l.a_4), sum(l.a_5), sum(l.a_6), sum(l.a_7), >> sum(l.a_8), pg_catalog.numeric_avg(avg(l.a_9)), >> pg_catalog.numeric_avg(avg(l.a_10)), pg_catalog.numeric_avg(avg(l.a_11)), pg >> _catalog.numeric_avg(avg(l.a_12)), pg_catalog.numeric_avg(avg(l.a_13)), >> pg_catalog.numeric_avg(avg(l.a_14)), pg_catalog.numeric_avg(avg(l.a_15)), >> pg_catalog.numeric_avg(avg(l.a_16)), pg_catalog.numeric_avg(avg(l.a_17)), >> pg_catalog.numeric_avg(avg(l.a_18)), pg_catalog.nume >> ric_avg(avg(l.a_19)), pg_catalog.numeric_avg(avg(l.a_20)), >> pg_catalog.numeric_avg(avg(l.a_21)), pg_catalog.numeric_avg(avg(l.a_22)), >> sum(l.a_24), sum(l.a_25), 
sum(COALESCE(r.a_4, r.a_5)), max(COALESCE(r.a_4, >> r.a_5)), min(COALESCE(r.a_4, r.a_5)) FROM ((SELECT l.a_1, l.a_2, >> l.a_3, l.a_4, l.a_5, l.a_6, l.a_7, l.a_8, l.a_9, l.a_10, l.a_11, l.a_12, >> l.a_13, l.a_14, l.a_15, l.a_16, l.a_17, l.a_18, l.a_19, l.a_20, l.a_21, >> l.a_22, r.a_1, r.a_2, r.a_3 FROM ((SELECT fgpc.date_id, >> fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, fgpc.m_imps, fgpc.m_clicks, >> fgpc.m_cost, fgpc.m_conv_1pc, fgpc.m_conv_mpc, fgpc.m_cnv_rate_1pc, >> fgpc.m_cnv_rate_mpc, fgpc.m_avg_cpc, fgpc.m_max_cpc, fgpc.m_firstpage_cpc, >> fgpc.m_topofpage_cpc, fgpc.m_avg_cpm, fgpc.m_max_cpm, fgpc.m_max_cpa_pct, >> fgpc.m_avg_pos, fgpc.m_lowest_pos, fgpc.m_highest_pos, >> fgpc.m_quality_score, fgpc.m_view_thru_conv FROM ONLY >> bidw.fact_msn_kw_perf_daily fgpc WHERE true) l(a_1, a_2, a_3, a_4, a_5, >> a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13, a_14, a_15, a_16, a_17, a_18, >> a_19, a_20, a_21, a_22) LEFT JOIN (SELECT fgcd.date_id, fgcd.m_revenue, >> fgcd.m_conversions, fgcd.m_ad_grp_pub_key, fgcd.m_kw_pub_key FROM ONLY >> bidw.fact_msn_kw_conversion_daily fgcd WHERE true) r(a_1, a_2, a_3, a_4, >> a_5) ON (((l.a_1 = l.a_1) AND (l.a_2 = r.a_4) AND (l.a_3 = r.a_5)))) WHERE >> ((COALESCE((l.a_1)::bigint, r.a_1) >= 20131201) AND ( >> COALESCE((l.a_1)::bigint, r.a_1) <= 20140119))) l(a_1, a_2, a_3, a_4, >> a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13, a_14, a_15, a_16, a_17, >> a_18, a_19, a_20, a_21, a_22, a_23, a_24, a_25) JOIN (SELECT >> kws.expr_names, kws.expr_values, kws.m_ad_grp_semid, kws.m_new_kw_bi >> d, kws.m_kw_bid, kws.m_kw_pub_key, kws.m_ad_grp_pub_key FROM ONLY >> biods.msn_keyword_sup kws WHERE true) r(a_1, a_2, a_3, a_4, a_5, a_6, a_7) >> ON (true)) WHERE ((l.a_2 = r.a_7) AND (l.a_3 = r.a_6)) GROUP BY 1, 2, 3, 4, >> 5, 6 LIMIT 10000::bigint >> Total runtime: 2194.762 ms >> (7 rows) >> >> > If you run the generated query on the nodes directly (through EXECUTE > DIRECT) is the time similarly slow? If so, then it points to the query > rewrite that is the problem. If it is fast, then it may mean an issue in > tuple handling on the coordinator. > > > -- > Mason Sharp > > TransLattice - http://www.translattice.com > Distributed and Clustered Database Solutions > > > > > ------------------------------------------------------------------------------ > WatchGuard Dimension instantly turns raw network data into actionable > security intelligence. It gives you real-time visual feedback on key > security issues and trends. Skip the complicated setup - simply import > a virtual appliance and go from zero to informed in seconds. > > http://pubads.g.doubleclick.net/gampad/clk?id=123612991&iu=/4140/ostg.clktrk > _______________________________________________ > Postgres-xc-general mailing list > Pos...@li... > https://lists.sourceforge.net/lists/listinfo/postgres-xc-general > > -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company |
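[Ashutosh's spill-to-disk theory above can be checked directly with stock PostgreSQL facilities that the 9.2-based XC releases inherit; the setting value is illustrative:]

    -- postgresql.conf on the coordinator: log every temporary file with its size
    --   log_temp_files = 0
    -- After running the query, see whether temp usage grew:
    SELECT temp_files, temp_bytes
    FROM pg_stat_database
    WHERE datname = current_database();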
From: Ashutosh B. <ash...@en...> - 2014-02-03 04:49:50
Hi Sandeep,

Can you please check if a similar error happens on vanilla PG? It may be an application + auto-vacuum error, which can happen in PG as well and might be harmless. It's auto-vacuum being cancelled. Auto-vacuum will run again during the next iteration.

On Fri, Jan 31, 2014 at 11:21 PM, Sandeep Gupta <gup...@gm...> wrote:

> Hi,
>
> I was debugging an outstanding issue with pgxc (
> http://sourceforge.net/mailarchive/forum.php?thread_name=CABEZHFtr_YoWb22UAnPGQz8M5KqpwzbviYiAgq_%3DY...@ma...&forum_name=postgres-xc-general).
>
> I couldn't reproduce that error. But I do get this error.
>
> LOG: database system is ready to accept connections
> LOG: autovacuum launcher started
> LOG: sending cancel to blocking autovacuum PID 17222
> DETAIL: Process 13896 waits for AccessExclusiveLock on relation 16388 of
> database 12626.
> STATEMENT: drop index mdn
> ERROR: canceling autovacuum task
> CONTEXT: automatic analyze of table "postgres.public.la_directednetwork"
> PreAbort Remote
>
> It seems to be a deadlock issue and may be related to the earlier problem
> as well.
> Please let me know your comments.
>
> -Sandeep

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
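[If the cancellations themselves need to be ruled out while the create/drop index loop runs, autovacuum can be switched off for just the table named in the log. A sketch using standard PostgreSQL storage parameters; how reliably this propagates across XC nodes is an assumption:]

    -- Keep autovacuum/autoanalyze away from the contended table during the test:
    ALTER TABLE public.la_directednetwork SET (autovacuum_enabled = false);
    -- ... run the CREATE INDEX / DROP INDEX workload ...
    ALTER TABLE public.la_directednetwork SET (autovacuum_enabled = true);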
From: 鈴木 幸市 <ko...@in...> - 2014-02-03 02:58:35
I'm afraid this is caused by a different reason. Sorry for the late response.

I'm afraid this is an XC-specific problem, not a PG one. It's helpful if you set log_error_verbosity to VERBOSE, which will let you know what source code is involved in such an error.

Best;
---
Koichi Suzuki

# It is not a good idea to post Postgres-XC issues to the Postgres ML.

On 2014/02/03 10:58, Sandeep Gupta <gup...@gm...> wrote:

Hi Koichi,

I can try pgxc_ctl as well, but I am not sure how it will help with the current issue I am facing and the error I was facing a couple of months back with the "tuple not found" error: http://postgresql.1045698.n5.nabble.com/quot-Tuple-not-found-error-quot-during-Index-creation-td5782462.html

I will post a followup to the "tuple not found" error as well. The problem with debugging the "tuple not found" error was that we couldn't reproduce it. I can do so now with some consistency, but I am still unsure of any short-term and long-term fixes. Any advice on this would be very helpful.

Thanks
-Sandeep

On Sun, Feb 2, 2014 at 5:02 PM, 鈴木 幸市 <ko...@in...> wrote:

You need to import the catalog from an existing coordinator/datanode, depending on what node you are adding. You should run pg_dumpall and psql while the node being added is in a specific mode. The pgxc_ctl source code will show you what it does for adding/removing nodes.

The pgxc_ctl source code is in the contrib/pgxc_ctl directory, and the following functions may help:

1) add_coordinatorMaster(), add_coordinatorSlave(), remove_coordinatorMaster(), and remove_coordinatorSlave() in coord_cmd.c,
2) add_datanodeMaster(), add_datanodeSlave(), remove_datanodeMaster() and remove_datanodeSlave() in datanode_cmd.c, and
3) add_gtmSlave(), add_gtmProxy(), remove_gtmSlave(), remove_gtmProxy() and reconnect_gtm_proxy() in gtm_cmd.c

Good luck.
---
Koichi Suzuki

On 2014/02/02 3:01, Sandeep Gupta <gup...@gm...> wrote:

Hi Koichi,

Thank you for looking into this. I did set up the pgxc manually. I have a script that performs:

1. initdb and initgtm for the coordinator and gtm respectively
2. make changes in the config file of gtm to set up the port numbers
3. launch gtm and launch the coordinator
4. then I ssh into the remote machine and launch 4 datanode instances (ports configured appropriately)
5. finally, I add the datanodes to the coordinator followed by pgxc_reload

I will take a look into pgxc_ctl. I would say that the deadlock happens 1 out of 10 times. Not sure if that is helpful.

-Sandeep

On Sat, Feb 1, 2014 at 3:22 AM, Koichi Suzuki <koi...@gm...> wrote:

Did you configure the XC cluster manually? Then could you share how you did?

To save your effort, pgxc_ctl provides a simpler way to configure and run an XC cluster. It is a contrib module and the document will be found at http://postgres-xc.sourceforge.net/docs/1_1/pgxc-ctl.html

Regards;
---
Koichi Suzuki

2014-02-01 Sandeep Gupta <gup...@gm...>:
> Hi,
>
> I was debugging an outstanding issue with pgxc
> (http://sourceforge.net/mailarchive/forum.php?thread_name=CABEZHFtr_YoWb22UAnPGQz8M5KqpwzbviYiAgq_%3DY...@ma...&forum_name=postgres-xc-general).
>
> I couldn't reproduce that error. But I do get this error.
>
> LOG: database system is ready to accept connections
> LOG: autovacuum launcher started
> LOG: sending cancel to blocking autovacuum PID 17222
> DETAIL: Process 13896 waits for AccessExclusiveLock on relation 16388 of
> database 12626.
> STATEMENT: drop index mdn
> ERROR: canceling autovacuum task
> CONTEXT: automatic analyze of table "postgres.public.la_directednetwork"
> PreAbort Remote
>
> It seems to be a deadlock issue and may be related to the earlier problem
> as well.
> Please let me know your comments.
>
> -Sandeep
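[The setting Koichi recommends is a stock PostgreSQL GUC; a minimal postgresql.conf sketch for the coordinator and the datanodes:]

    # postgresql.conf
    log_error_verbosity = verbose   # adds the SQLSTATE code and source file/line
                                    # (as seen in the log excerpts above) to each entry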
From: Sandeep G. <gup...@gm...> - 2014-02-03 01:58:35
Hi Koichi,

I can try pgxc_ctl as well, but I am not sure how it will help with the current issue I am facing and the error I was facing a couple of months back with the "tuple not found" error: http://postgresql.1045698.n5.nabble.com/quot-Tuple-not-found-error-quot-during-Index-creation-td5782462.html

I will post a followup to the "tuple not found" error as well. The problem with debugging the "tuple not found" error was that we couldn't reproduce it. I can do so now with some consistency, but I am still unsure of any short-term and long-term fixes. Any advice on this would be very helpful.

Thanks
*-Sandeep*

On Sun, Feb 2, 2014 at 5:02 PM, 鈴木 幸市 <ko...@in...> wrote:

> You need to import the catalog from an existing coordinator/datanode,
> depending on what node you are adding. You should run pg_dumpall and psql
> while the node being added is in a specific mode. The pgxc_ctl source code
> will show you what it does for adding/removing nodes.
>
> The pgxc_ctl source code is in the contrib/pgxc_ctl directory and the
> following functions may help:
>
> 1) add_coordinatorMaster(), add_coordinatorSlave(), remove_coordinatorMaster(), and remove_coordinatorSlave() in coord_cmd.c,
> 2) add_datanodeMaster(), add_datanodeSlave(), remove_datanodeMaster() and remove_datanodeSlave() in datanode_cmd.c, and
> 3) add_gtmSlave(), add_gtmProxy(), remove_gtmSlave(), remove_gtmProxy() and reconnect_gtm_proxy() in gtm_cmd.c
>
> Good luck.
> ---
> Koichi Suzuki
>
> On 2014/02/02 3:01, Sandeep Gupta <gup...@gm...> wrote:
>
> Hi Koichi,
>
> Thank you for looking into this. I did set up the pgxc manually. I have
> a script that performs:
>
> 1. initdb and initgtm for the coordinator and gtm respectively
> 2. make changes in the config file of gtm to set up the port numbers
> 3. launch gtm and launch the coordinator
> 4. then I ssh into the remote machine and launch 4 datanode instances (ports configured appropriately)
> 5. finally, I add the datanodes to the coordinator followed by pgxc_reload
>
> I will take a look into pgxc_ctl. I would say that the deadlock happens 1
> out of 10 times. Not sure if that is helpful.
>
> -Sandeep
>
> On Sat, Feb 1, 2014 at 3:22 AM, Koichi Suzuki <koi...@gm...> wrote:
>
>> Did you configure the XC cluster manually? Then could you share how you did?
>>
>> To save your effort, pgxc_ctl provides a simpler way to configure and
>> run an XC cluster. It is a contrib module and the document will be
>> found at http://postgres-xc.sourceforge.net/docs/1_1/pgxc-ctl.html
>>
>> Regards;
>> ---
>> Koichi Suzuki
>>
>> 2014-02-01 Sandeep Gupta <gup...@gm...>:
>> > Hi,
>> >
>> > I was debugging an outstanding issue with pgxc
>> > (http://sourceforge.net/mailarchive/forum.php?thread_name=CABEZHFtr_YoWb22UAnPGQz8M5KqpwzbviYiAgq_%3DY...@ma...&forum_name=postgres-xc-general).
>> >
>> > I couldn't reproduce that error. But I do get this error.
>> >
>> > LOG: database system is ready to accept connections
>> > LOG: autovacuum launcher started
>> > LOG: sending cancel to blocking autovacuum PID 17222
>> > DETAIL: Process 13896 waits for AccessExclusiveLock on relation 16388
>> > of database 12626.
>> > STATEMENT: drop index mdn
>> > ERROR: canceling autovacuum task
>> > CONTEXT: automatic analyze of table "postgres.public.la_directednetwork"
>> > PreAbort Remote
>> >
>> > It seems to be a deadlock issue and may be related to the earlier
>> > problem as well.
>> > Please let me know your comments.
>> >
>> > -Sandeep
From: 鈴木 幸市 <ko...@in...> - 2014-02-03 01:03:04
You need to import the catalog from an existing coordinator/datanode, depending on what node you are adding. You should run pg_dumpall and psql while the node being added is in a specific mode. The pgxc_ctl source code will show you what it does for adding/removing nodes.

The pgxc_ctl source code is in the contrib/pgxc_ctl directory, and the following functions may help:

1) add_coordinatorMaster(), add_coordinatorSlave(), remove_coordinatorMaster(), and remove_coordinatorSlave() in coord_cmd.c,
2) add_datanodeMaster(), add_datanodeSlave(), remove_datanodeMaster() and remove_datanodeSlave() in datanode_cmd.c, and
3) add_gtmSlave(), add_gtmProxy(), remove_gtmSlave(), remove_gtmProxy() and reconnect_gtm_proxy() in gtm_cmd.c

Good luck.
---
Koichi Suzuki

On 2014/02/02 3:01, Sandeep Gupta <gup...@gm...> wrote:

Hi Koichi,

Thank you for looking into this. I did set up the pgxc manually. I have a script that performs:

1. initdb and initgtm for the coordinator and gtm respectively
2. make changes in the config file of gtm to set up the port numbers
3. launch gtm and launch the coordinator
4. then I ssh into the remote machine and launch 4 datanode instances (ports configured appropriately)
5. finally, I add the datanodes to the coordinator followed by pgxc_reload

I will take a look into pgxc_ctl. I would say that the deadlock happens 1 out of 10 times. Not sure if that is helpful.

-Sandeep

On Sat, Feb 1, 2014 at 3:22 AM, Koichi Suzuki <koi...@gm...> wrote:

Did you configure the XC cluster manually? Then could you share how you did?

To save your effort, pgxc_ctl provides a simpler way to configure and run an XC cluster. It is a contrib module and the document will be found at http://postgres-xc.sourceforge.net/docs/1_1/pgxc-ctl.html

Regards;
---
Koichi Suzuki

2014-02-01 Sandeep Gupta <gup...@gm...>:
> Hi,
>
> I was debugging an outstanding issue with pgxc
> (http://sourceforge.net/mailarchive/forum.php?thread_name=CABEZHFtr_YoWb22UAnPGQz8M5KqpwzbviYiAgq_%3DY...@ma...&forum_name=postgres-xc-general).
>
> I couldn't reproduce that error. But I do get this error.
>
> LOG: database system is ready to accept connections
> LOG: autovacuum launcher started
> LOG: sending cancel to blocking autovacuum PID 17222
> DETAIL: Process 13896 waits for AccessExclusiveLock on relation 16388 of
> database 12626.
> STATEMENT: drop index mdn
> ERROR: canceling autovacuum task
> CONTEXT: automatic analyze of table "postgres.public.la_directednetwork"
> PreAbort Remote
>
> It seems to be a deadlock issue and may be related to the earlier problem
> as well.
> Please let me know your comments.
>
> -Sandeep
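[A rough shell sketch of the catalog-import step described above; the node name, host, and ports are hypothetical, and pgxc_ctl automates the real sequence:]

    # Dump object definitions from an existing coordinator and replay them
    # on the node being added:
    pg_dumpall -p 5432 -s > catalog.sql
    psql -p 5455 -d postgres -f catalog.sql
    # Register the new node on each existing coordinator, then refresh the pooler:
    psql -p 5432 -d postgres -c "CREATE NODE dn5 WITH (TYPE = 'datanode', HOST = 'host5', PORT = 5455)"
    psql -p 5432 -d postgres -c "SELECT pgxc_pool_reload()"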
From: Nirmal S. <sha...@gm...> - 2014-02-02 18:32:52
Yes you are absolutely right. If I run the same query directly on nodes then it runs very fast. It is running slow when I run from coordinator. How am I going to resolve this tuple handling on coordinator? Please advise. Sent from my iPad > On Feb 2, 2014, at 9:32 AM, Mason Sharp <ms...@tr...> wrote: > > > > >> On Sun, Feb 2, 2014 at 12:49 AM, Nirmal Sharma <sha...@gm...> wrote: >> This is the explain plan for the query with limit 10000. >> >> >> Limit (cost=0.00..2.50 rows=1000 width=908) (actual time=1586.926..1836.081 rows=10000 loops=1) >> Output: (COALESCE((fgpc.date_id)::bigint, fgcd.date_id)), fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, kws.expr_names, kws.expr_values, kws.m_ad_grp_semid, (sum(fgpc.m_imps)), (sum(fgpc.m_clicks)), (sum(fgpc.m_cost)), (sum(fgpc.m_conv_1pc)), (sum(fgpc.m_conv_mpc)), (avg(f >> gpc.m_cnv_rate_1pc)), (avg(fgpc.m_cnv_rate_mpc)), (avg(fgpc.m_avg_cpc)), (avg(fgpc.m_max_cpc)), (avg(fgpc.m_firstpage_cpc)), (avg(fgpc.m_topofpage_cpc)), (avg(fgpc.m_avg_cpm)), (avg(fgpc.m_max_cpm)), (avg(fgpc.m_max_cpa_pct)), (avg(fgpc.m_avg_pos)), (avg(fgpc.m_lowest_pos >> )), (avg(fgpc.m_highest_pos)), (avg(fgpc.m_quality_score)), (avg(fgpc.m_view_thru_conv)), (sum(fgcd.m_revenue)), (sum(fgcd.m_conversions)), (sum(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), (max(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), (min(COALESCE(kws.m_new_kw_bid, kw >> s.m_kw_bid))) >> -> Data Node Scan on "__REMOTE_LIMIT_QUERY__" (cost=0.00..2.50 rows=1000 width=908) (actual time=1586.924..1834.118 rows=10000 loops=1) >> Output: (COALESCE((fgpc.date_id)::bigint, fgcd.date_id)), fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, kws.expr_names, kws.expr_values, kws.m_ad_grp_semid, (sum(fgpc.m_imps)), (sum(fgpc.m_clicks)), (sum(fgpc.m_cost)), (sum(fgpc.m_conv_1pc)), (sum(fgpc.m_conv_mpc)), >> (avg(fgpc.m_cnv_rate_1pc)), (avg(fgpc.m_cnv_rate_mpc)), (avg(fgpc.m_avg_cpc)), (avg(fgpc.m_max_cpc)), (avg(fgpc.m_firstpage_cpc)), (avg(fgpc.m_topofpage_cpc)), (avg(fgpc.m_avg_cpm)), (avg(fgpc.m_max_cpm)), (avg(fgpc.m_max_cpa_pct)), (avg(fgpc.m_avg_pos)), (avg(fgpc.m_lowe >> st_pos)), (avg(fgpc.m_highest_pos)), (avg(fgpc.m_quality_score)), (avg(fgpc.m_view_thru_conv)), (sum(fgcd.m_revenue)), (sum(fgcd.m_conversions)), (sum(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), (max(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), (min(COALESCE(kws.m_new_kw_b >> id, kws.m_kw_bid))) >> Node/s: d11, d12, d13, d14, d15, d16 >> Remote query: SELECT COALESCE((l.a_1)::bigint, l.a_23), l.a_2, l.a_3, r.a_1, r.a_2, r.a_3, sum(l.a_4), sum(l.a_5), sum(l.a_6), sum(l.a_7), sum(l.a_8), pg_catalog.numeric_avg(avg(l.a_9)), pg_catalog.numeric_avg(avg(l.a_10)), pg_catalog.numeric_avg(avg(l.a_11)), pg >> _catalog.numeric_avg(avg(l.a_12)), pg_catalog.numeric_avg(avg(l.a_13)), pg_catalog.numeric_avg(avg(l.a_14)), pg_catalog.numeric_avg(avg(l.a_15)), pg_catalog.numeric_avg(avg(l.a_16)), pg_catalog.numeric_avg(avg(l.a_17)), pg_catalog.numeric_avg(avg(l.a_18)), pg_catalog.nume >> ric_avg(avg(l.a_19)), pg_catalog.numeric_avg(avg(l.a_20)), pg_catalog.numeric_avg(avg(l.a_21)), pg_catalog.numeric_avg(avg(l.a_22)), sum(l.a_24), sum(l.a_25), sum(COALESCE(r.a_4, r.a_5)), max(COALESCE(r.a_4, r.a_5)), min(COALESCE(r.a_4, r.a_5)) FROM ((SELECT l.a_1, l.a_2, >> l.a_3, l.a_4, l.a_5, l.a_6, l.a_7, l.a_8, l.a_9, l.a_10, l.a_11, l.a_12, l.a_13, l.a_14, l.a_15, l.a_16, l.a_17, l.a_18, l.a_19, l.a_20, l.a_21, l.a_22, r.a_1, r.a_2, r.a_3 FROM ((SELECT fgpc.date_id, fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, fgpc.m_imps, fgpc.m_clicks, >> fgpc.m_cost, fgpc.m_conv_1pc, fgpc.m_conv_mpc, 
fgpc.m_cnv_rate_1pc, fgpc.m_cnv_rate_mpc, fgpc.m_avg_cpc, fgpc.m_max_cpc, fgpc.m_firstpage_cpc, fgpc.m_topofpage_cpc, fgpc.m_avg_cpm, fgpc.m_max_cpm, fgpc.m_max_cpa_pct, fgpc.m_avg_pos, fgpc.m_lowest_pos, fgpc.m_highest_pos, >> fgpc.m_quality_score, fgpc.m_view_thru_conv FROM ONLY bidw.fact_msn_kw_perf_daily fgpc WHERE true) l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13, a_14, a_15, a_16, a_17, a_18, a_19, a_20, a_21, a_22) LEFT JOIN (SELECT fgcd.date_id, fgcd.m_revenue, >> fgcd.m_conversions, fgcd.m_ad_grp_pub_key, fgcd.m_kw_pub_key FROM ONLY bidw.fact_msn_kw_conversion_daily fgcd WHERE true) r(a_1, a_2, a_3, a_4, a_5) ON (((l.a_1 = l.a_1) AND (l.a_2 = r.a_4) AND (l.a_3 = r.a_5)))) WHERE ((COALESCE((l.a_1)::bigint, r.a_1) >= 20131201) AND ( >> COALESCE((l.a_1)::bigint, r.a_1) <= 20140119))) l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13, a_14, a_15, a_16, a_17, a_18, a_19, a_20, a_21, a_22, a_23, a_24, a_25) JOIN (SELECT kws.expr_names, kws.expr_values, kws.m_ad_grp_semid, kws.m_new_kw_bi >> d, kws.m_kw_bid, kws.m_kw_pub_key, kws.m_ad_grp_pub_key FROM ONLY biods.msn_keyword_sup kws WHERE true) r(a_1, a_2, a_3, a_4, a_5, a_6, a_7) ON (true)) WHERE ((l.a_2 = r.a_7) AND (l.a_3 = r.a_6)) GROUP BY 1, 2, 3, 4, 5, 6 LIMIT 10000::bigint >> Total runtime: 2194.762 ms >> (7 rows) >> > > If you run the generated query on the nodes directly (through EXECUTE DIRECT) is the time similarly slow? If so, then it points to the query rewrite that is the problem. If it is fast, then it may mean an issue in tuple handling on the coordinator. > > > -- > Mason Sharp > > TransLattice - http://www.translattice.com > Distributed and Clustered Database Solutions > > |
From: Mason S. <ms...@tr...> - 2014-02-02 17:32:11
On Sun, Feb 2, 2014 at 12:49 AM, Nirmal Sharma <sha...@gm...>wrote: > This is the explain plan for the query with limit 10000. > > > Limit (cost=0.00..2.50 rows=1000 width=908) (actual > time=1586.926..1836.081 rows=10000 loops=1) > Output: (COALESCE((fgpc.date_id)::bigint, fgcd.date_id)), > fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, kws.expr_names, kws.expr_values, > kws.m_ad_grp_semid, (sum(fgpc.m_imps)), (sum(fgpc.m_clicks)), > (sum(fgpc.m_cost)), (sum(fgpc.m_conv_1pc)), (sum(fgpc.m_conv_mpc)), (avg(f > gpc.m_cnv_rate_1pc)), (avg(fgpc.m_cnv_rate_mpc)), (avg(fgpc.m_avg_cpc)), > (avg(fgpc.m_max_cpc)), (avg(fgpc.m_firstpage_cpc)), > (avg(fgpc.m_topofpage_cpc)), (avg(fgpc.m_avg_cpm)), (avg(fgpc.m_max_cpm)), > (avg(fgpc.m_max_cpa_pct)), (avg(fgpc.m_avg_pos)), (avg(fgpc.m_lowest_pos > )), (avg(fgpc.m_highest_pos)), (avg(fgpc.m_quality_score)), > (avg(fgpc.m_view_thru_conv)), (sum(fgcd.m_revenue)), > (sum(fgcd.m_conversions)), (sum(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), > (max(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), > (min(COALESCE(kws.m_new_kw_bid, kw > s.m_kw_bid))) > -> Data Node Scan on "__REMOTE_LIMIT_QUERY__" (cost=0.00..2.50 > rows=1000 width=908) (actual time=1586.924..1834.118 rows=10000 loops=1) > Output: (COALESCE((fgpc.date_id)::bigint, fgcd.date_id)), > fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, kws.expr_names, kws.expr_values, > kws.m_ad_grp_semid, (sum(fgpc.m_imps)), (sum(fgpc.m_clicks)), > (sum(fgpc.m_cost)), (sum(fgpc.m_conv_1pc)), (sum(fgpc.m_conv_mpc)), > (avg(fgpc.m_cnv_rate_1pc)), (avg(fgpc.m_cnv_rate_mpc)), > (avg(fgpc.m_avg_cpc)), (avg(fgpc.m_max_cpc)), (avg(fgpc.m_firstpage_cpc)), > (avg(fgpc.m_topofpage_cpc)), (avg(fgpc.m_avg_cpm)), (avg(fgpc.m_max_cpm)), > (avg(fgpc.m_max_cpa_pct)), (avg(fgpc.m_avg_pos)), (avg(fgpc.m_lowe > st_pos)), (avg(fgpc.m_highest_pos)), (avg(fgpc.m_quality_score)), > (avg(fgpc.m_view_thru_conv)), (sum(fgcd.m_revenue)), > (sum(fgcd.m_conversions)), (sum(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), > (max(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), > (min(COALESCE(kws.m_new_kw_b > id, kws.m_kw_bid))) > Node/s: d11, d12, d13, d14, d15, d16 > Remote query: SELECT COALESCE((l.a_1)::bigint, l.a_23), l.a_2, > l.a_3, r.a_1, r.a_2, r.a_3, sum(l.a_4), sum(l.a_5), sum(l.a_6), sum(l.a_7), > sum(l.a_8), pg_catalog.numeric_avg(avg(l.a_9)), > pg_catalog.numeric_avg(avg(l.a_10)), pg_catalog.numeric_avg(avg(l.a_11)), pg > _catalog.numeric_avg(avg(l.a_12)), pg_catalog.numeric_avg(avg(l.a_13)), > pg_catalog.numeric_avg(avg(l.a_14)), pg_catalog.numeric_avg(avg(l.a_15)), > pg_catalog.numeric_avg(avg(l.a_16)), pg_catalog.numeric_avg(avg(l.a_17)), > pg_catalog.numeric_avg(avg(l.a_18)), pg_catalog.nume > ric_avg(avg(l.a_19)), pg_catalog.numeric_avg(avg(l.a_20)), > pg_catalog.numeric_avg(avg(l.a_21)), pg_catalog.numeric_avg(avg(l.a_22)), > sum(l.a_24), sum(l.a_25), sum(COALESCE(r.a_4, r.a_5)), max(COALESCE(r.a_4, > r.a_5)), min(COALESCE(r.a_4, r.a_5)) FROM ((SELECT l.a_1, l.a_2, > l.a_3, l.a_4, l.a_5, l.a_6, l.a_7, l.a_8, l.a_9, l.a_10, l.a_11, l.a_12, > l.a_13, l.a_14, l.a_15, l.a_16, l.a_17, l.a_18, l.a_19, l.a_20, l.a_21, > l.a_22, r.a_1, r.a_2, r.a_3 FROM ((SELECT fgpc.date_id, > fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, fgpc.m_imps, fgpc.m_clicks, > fgpc.m_cost, fgpc.m_conv_1pc, fgpc.m_conv_mpc, fgpc.m_cnv_rate_1pc, > fgpc.m_cnv_rate_mpc, fgpc.m_avg_cpc, fgpc.m_max_cpc, fgpc.m_firstpage_cpc, > fgpc.m_topofpage_cpc, fgpc.m_avg_cpm, fgpc.m_max_cpm, fgpc.m_max_cpa_pct, > fgpc.m_avg_pos, fgpc.m_lowest_pos, fgpc.m_highest_pos, > fgpc.m_quality_score, 
fgpc.m_view_thru_conv FROM ONLY > bidw.fact_msn_kw_perf_daily fgpc WHERE true) l(a_1, a_2, a_3, a_4, a_5, > a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13, a_14, a_15, a_16, a_17, a_18, > a_19, a_20, a_21, a_22) LEFT JOIN (SELECT fgcd.date_id, fgcd.m_revenue, > fgcd.m_conversions, fgcd.m_ad_grp_pub_key, fgcd.m_kw_pub_key FROM ONLY > bidw.fact_msn_kw_conversion_daily fgcd WHERE true) r(a_1, a_2, a_3, a_4, > a_5) ON (((l.a_1 = l.a_1) AND (l.a_2 = r.a_4) AND (l.a_3 = r.a_5)))) WHERE > ((COALESCE((l.a_1)::bigint, r.a_1) >= 20131201) AND ( > COALESCE((l.a_1)::bigint, r.a_1) <= 20140119))) l(a_1, a_2, a_3, a_4, a_5, > a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13, a_14, a_15, a_16, a_17, a_18, > a_19, a_20, a_21, a_22, a_23, a_24, a_25) JOIN (SELECT kws.expr_names, > kws.expr_values, kws.m_ad_grp_semid, kws.m_new_kw_bid, kws.m_kw_bid, kws.m_kw_pub_key, kws.m_ad_grp_pub_key FROM ONLY > biods.msn_keyword_sup kws WHERE true) r(a_1, a_2, a_3, a_4, a_5, a_6, a_7) > ON (true)) WHERE ((l.a_2 = r.a_7) AND (l.a_3 = r.a_6)) GROUP BY 1, 2, 3, 4, > 5, 6 LIMIT 10000::bigint
> Total runtime: 2194.762 ms
> (7 rows)

If you run the generated query on the nodes directly (through EXECUTE DIRECT), is the time similarly slow? If so, it points to the query rewrite as the problem. If it is fast, it may mean an issue in tuple handling on the coordinator.

--
Mason Sharp

TransLattice - http://www.translattice.com
Distributed and Clustered Database Solutions
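[Mason's EXECUTE DIRECT check looks roughly like this, using node d11 from the plan above; substitute the full "Remote query" text from the EXPLAIN output for the abbreviated query string:]

    -- Run the rewritten statement on one datanode, bypassing the coordinator's
    -- tuple handling, and time it from psql:
    \timing on
    EXECUTE DIRECT ON (d11) 'SELECT count(*) FROM ONLY bidw.fact_msn_kw_perf_daily';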
From: Nirmal S. <sha...@gm...> - 2014-02-02 05:49:44
This is the explain plan for the query with limit 10000. Limit (cost=0.00..2.50 rows=1000 width=908) (actual time=1586.926..1836.081 rows=10000 loops=1) Output: (COALESCE((fgpc.date_id)::bigint, fgcd.date_id)), fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, kws.expr_names, kws.expr_values, kws.m_ad_grp_semid, (sum(fgpc.m_imps)), (sum(fgpc.m_clicks)), (sum(fgpc.m_cost)), (sum(fgpc.m_conv_1pc)), (sum(fgpc.m_conv_mpc)), (avg(f gpc.m_cnv_rate_1pc)), (avg(fgpc.m_cnv_rate_mpc)), (avg(fgpc.m_avg_cpc)), (avg(fgpc.m_max_cpc)), (avg(fgpc.m_firstpage_cpc)), (avg(fgpc.m_topofpage_cpc)), (avg(fgpc.m_avg_cpm)), (avg(fgpc.m_max_cpm)), (avg(fgpc.m_max_cpa_pct)), (avg(fgpc.m_avg_pos)), (avg(fgpc.m_lowest_pos )), (avg(fgpc.m_highest_pos)), (avg(fgpc.m_quality_score)), (avg(fgpc.m_view_thru_conv)), (sum(fgcd.m_revenue)), (sum(fgcd.m_conversions)), (sum(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), (max(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), (min(COALESCE(kws.m_new_kw_bid, kw s.m_kw_bid))) -> Data Node Scan on "__REMOTE_LIMIT_QUERY__" (cost=0.00..2.50 rows=1000 width=908) (actual time=1586.924..1834.118 rows=10000 loops=1) Output: (COALESCE((fgpc.date_id)::bigint, fgcd.date_id)), fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, kws.expr_names, kws.expr_values, kws.m_ad_grp_semid, (sum(fgpc.m_imps)), (sum(fgpc.m_clicks)), (sum(fgpc.m_cost)), (sum(fgpc.m_conv_1pc)), (sum(fgpc.m_conv_mpc)), (avg(fgpc.m_cnv_rate_1pc)), (avg(fgpc.m_cnv_rate_mpc)), (avg(fgpc.m_avg_cpc)), (avg(fgpc.m_max_cpc)), (avg(fgpc.m_firstpage_cpc)), (avg(fgpc.m_topofpage_cpc)), (avg(fgpc.m_avg_cpm)), (avg(fgpc.m_max_cpm)), (avg(fgpc.m_max_cpa_pct)), (avg(fgpc.m_avg_pos)), (avg(fgpc.m_lowe st_pos)), (avg(fgpc.m_highest_pos)), (avg(fgpc.m_quality_score)), (avg(fgpc.m_view_thru_conv)), (sum(fgcd.m_revenue)), (sum(fgcd.m_conversions)), (sum(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), (max(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), (min(COALESCE(kws.m_new_kw_b id, kws.m_kw_bid))) Node/s: d11, d12, d13, d14, d15, d16 Remote query: SELECT COALESCE((l.a_1)::bigint, l.a_23), l.a_2, l.a_3, r.a_1, r.a_2, r.a_3, sum(l.a_4), sum(l.a_5), sum(l.a_6), sum(l.a_7), sum(l.a_8), pg_catalog.numeric_avg(avg(l.a_9)), pg_catalog.numeric_avg(avg(l.a_10)), pg_catalog.numeric_avg(avg(l.a_11)), pg _catalog.numeric_avg(avg(l.a_12)), pg_catalog.numeric_avg(avg(l.a_13)), pg_catalog.numeric_avg(avg(l.a_14)), pg_catalog.numeric_avg(avg(l.a_15)), pg_catalog.numeric_avg(avg(l.a_16)), pg_catalog.numeric_avg(avg(l.a_17)), pg_catalog.numeric_avg(avg(l.a_18)), pg_catalog.nume ric_avg(avg(l.a_19)), pg_catalog.numeric_avg(avg(l.a_20)), pg_catalog.numeric_avg(avg(l.a_21)), pg_catalog.numeric_avg(avg(l.a_22)), sum(l.a_24), sum(l.a_25), sum(COALESCE(r.a_4, r.a_5)), max(COALESCE(r.a_4, r.a_5)), min(COALESCE(r.a_4, r.a_5)) FROM ((SELECT l.a_1, l.a_2, l.a_3, l.a_4, l.a_5, l.a_6, l.a_7, l.a_8, l.a_9, l.a_10, l.a_11, l.a_12, l.a_13, l.a_14, l.a_15, l.a_16, l.a_17, l.a_18, l.a_19, l.a_20, l.a_21, l.a_22, r.a_1, r.a_2, r.a_3 FROM ((SELECT fgpc.date_id, fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, fgpc.m_imps, fgpc.m_clicks, fgpc.m_cost, fgpc.m_conv_1pc, fgpc.m_conv_mpc, fgpc.m_cnv_rate_1pc, fgpc.m_cnv_rate_mpc, fgpc.m_avg_cpc, fgpc.m_max_cpc, fgpc.m_firstpage_cpc, fgpc.m_topofpage_cpc, fgpc.m_avg_cpm, fgpc.m_max_cpm, fgpc.m_max_cpa_pct, fgpc.m_avg_pos, fgpc.m_lowest_pos, fgpc.m_highest_pos, fgpc.m_quality_score, fgpc.m_view_thru_conv FROM ONLY bidw.fact_msn_kw_perf_daily fgpc WHERE true) l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13, a_14, a_15, a_16, a_17, a_18, a_19, a_20, 
a_21, a_22) LEFT JOIN (SELECT fgcd.date_id, fgcd.m_revenue, fgcd.m_conversions, fgcd.m_ad_grp_pub_key, fgcd.m_kw_pub_key FROM ONLY bidw.fact_msn_kw_conversion_daily fgcd WHERE true) r(a_1, a_2, a_3, a_4, a_5) ON (((l.a_1 = l.a_1) AND (l.a_2 = r.a_4) AND (l.a_3 = r.a_5)))) WHERE ((COALESCE((l.a_1)::bigint, r.a_1) >= 20131201) AND ( COALESCE((l.a_1)::bigint, r.a_1) <= 20140119))) l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13, a_14, a_15, a_16, a_17, a_18, a_19, a_20, a_21, a_22, a_23, a_24, a_25) JOIN (SELECT kws.expr_names, kws.expr_values, kws.m_ad_grp_semid, kws.m_new_kw_bi d, kws.m_kw_bid, kws.m_kw_pub_key, kws.m_ad_grp_pub_key FROM ONLY biods.msn_keyword_sup kws WHERE true) r(a_1, a_2, a_3, a_4, a_5, a_6, a_7) ON (true)) WHERE ((l.a_2 = r.a_7) AND (l.a_3 = r.a_6)) GROUP BY 1, 2, 3, 4, 5, 6 LIMIT 10000::bigint Total runtime: 2194.762 ms (7 rows) On Sat, Feb 1, 2014 at 7:44 PM, Koichi Suzuki <koi...@gm...> wrote: > Could you share "explain" result to see how plan works fine. > > Regards; > --- > Koichi Suzuki > > > 2014-02-02 Nirmal Sharma <sha...@gm...>: > > Hi, > > > > These are the timings for adding limit with different amount. > > With these timings you can see see that here the bottleneck is > coordinator > > (i.e. retrieving data from various nodes to coordinator ). > > I just want to ask whether its normal or not? > > > > ---for limit 1000 > > [postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out > > > > real 0m1.935s > > user 0m0.051s > > sys 0m0.002s > > > > ---for limit 10000 > > [postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out > > > > real 0m2.724s > > user 0m0.481s > > sys 0m0.023s > > > > --for limit 100000 > > [postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out > > > > real 0m12.102s > > user 0m3.139s > > sys 0m0.146s > > > > --for limit 200000 > > [postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out > > > > real 0m13.078s > > user 0m5.507s > > sys 0m0.316s > > > > ---for limit 400000 > > [postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out > > > > real 0m18.820s > > user 0m10.482s > > sys 0m0.659s > > > > ---for limit 600000 > > [postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out > > > > real 0m23.478s > > user 0m15.631s > > sys 0m0.940s > > [postgres@sv4-pgxc-db04 test]$ > > > > > > I will also enable the statement log and try again and will send the > output > > soon. > > > > Nirmal > > > > > > On Sat, Feb 1, 2014 at 1:47 PM, Mason Sharp <ms...@tr...> > wrote: > >> > >> > >> > >> > >> On Sat, Feb 1, 2014 at 2:40 PM, Nirmal Sharma <sha...@gm...> > >> wrote: > >>> > >>> Hi Mason, > >>> > >>> This is the actual query that i was running. 
> >>> > >>> select coalesce(fgpc.date_id,fgcd.date_id) > >>> date_id, > >>> fgpc.m_ad_grp_pub_key > m_ad_grp_pub_key, > >>> fgpc.m_kw_pub_key m_kw_pub_key, > >>> kws.expr_names, > >>> kws.expr_values, > >>> kws.m_ad_grp_semid, > >>> sum(fgpc.m_imps) m_imps, > >>> sum(fgpc.m_clicks) m_clicks, > >>> sum(fgpc.m_cost) m_cost, > >>> sum(fgpc.m_conv_1pc) m_conv_1pc, > >>> sum(fgpc.m_conv_mpc) m_conv_mpc, > >>> avg(fgpc.m_cnv_rate_1pc) > m_cnv_rate_1pc, > >>> avg(fgpc.m_cnv_rate_mpc) > m_cnv_rate_mpc, > >>> avg(fgpc.m_avg_cpc) m_avg_cpc, > >>> avg(fgpc.m_max_cpc) m_max_cpc, > >>> avg(fgpc.m_firstpage_cpc) > >>> m_firstpage_cpc, > >>> avg(fgpc.m_topofpage_cpc) > >>> m_topofpage_cpc, > >>> avg(fgpc.m_avg_cpm) m_avg_cpm, > >>> avg(fgpc.m_max_cpm) m_max_cpm, > >>> avg(fgpc.m_max_cpa_pct) m_max_cpa_pct, > >>> avg(fgpc.m_avg_pos) m_avg_pos, > >>> avg(fgpc.m_lowest_pos) m_lowest_pos, > >>> avg(fgpc.m_highest_pos) m_highest_pos, > >>> avg(fgpc.m_quality_score) > >>> m_quality_score, > >>> avg(fgpc.m_view_thru_conv) > >>> m_view_thru_conv, > >>> sum(fgcd.m_revenue) m_revenue, > >>> sum(fgcd.m_conversions) m_conversions, > >>> sum(coalesce(kws.m_new_kw_bid, > >>> kws.m_kw_bid)) m_kw_total_bid, > >>> max(coalesce(kws.m_new_kw_bid, > >>> kws.m_kw_bid)) m_kw_max_bid, > >>> min(coalesce(kws.m_new_kw_bid, > >>> kws.m_kw_bid)) m_kw_min_bid > >>> from > >>> bidw.fact_msn_kw_perf_daily fgpc > >>> full outer join bidw.fact_msn_kw_conversion_daily fgcd on > >>> fgpc.m_ad_grp_pub_key = fgcd.m_ad_grp_pub_key and fgpc.m_kw_pub_key = > >>> fgcd.m_kw_pub_key and fgpc.date_id = fgpc.date_id > >>> join biods.msn_keyword_sup kws > on > >>> fgpc.m_kw_pub_key = kws.m_kw_pub_key and fgpc.m_ad_grp_pub_key = > >>> kws.m_ad_grp_pub_key > >>> where > >>> coalesce(fgpc.date_id,fgcd.date_id) between > >>> 20131201 and 20140119 > >>> group by > >>> > >>> > coalesce(fgpc.date_id,fgcd.date_id),fgpc.m_ad_grp_pub_key,fgpc.m_kw_pub_key,kws.expr_names,kws.expr_values,kws.m_ad_grp_semid > >>> ; > >>> > >>> This is the explain plan for the same. > >>> explain analyze verbose select ...... > >>> .... > >>> > >>> Data Node Scan on "__REMOTE_GROUP_QUERY__" (cost=0.00..2.50 rows=1000 > >>> width=908) (actual time=1672.149..8281.918 rows=605575 loops=1) > >>> Output: (COALESCE((fgpc.date_id)::bigint, fgcd.date_id)), > >>> fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, kws.expr_names, > kws.expr_values, > >>> kws.m_ad_grp_semid, (sum(fgpc.m_imps)), (sum(fgpc.m_clicks)), > >>> (sum(fgpc.m_cost)), (sum(fgpc.m_conv_1pc)), (sum(fgpc.m_conv_mpc)), > >>> (avg(fgpc.m_cnv_rate_1pc)), (avg(fgpc.m_cnv_ra > >>> te_mpc)), (avg(fgpc.m_avg_cpc)), (avg(fgpc.m_max_cpc)), > >>> (avg(fgpc.m_firstpage_cpc)), (avg(fgpc.m_topofpage_cpc)), > >>> (avg(fgpc.m_avg_cpm)), (avg(fgpc.m_max_cpm)), > (avg(fgpc.m_max_cpa_pct)), > >>> (avg(fgpc.m_avg_pos)), (avg(fgpc.m_lowest_pos)), > (avg(fgpc.m_highest_pos)), > >>> (avg(fgpc.m_quality_score)), (avg(fgpc.m_view_thr > >>> u_conv)), (sum(fgcd.m_revenue)), (sum(fgcd.m_conversions)), > >>> (sum(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), > >>> (max(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), > >>> (min(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))) > >>> Node/s: d11, d12, d13, d14, d15, d16 > >>> Remote query: SELECT COALESCE((l.a_1)::bigint, l.a_23), l.a_2, > l.a_3, > >>> r.a_1, r.a_2, r.a_3, sum(l.a_4), sum(l.a_5), sum(l.a_6), sum(l.a_7), > >>> sum(l.a_8), pg_catalog.numeric_avg(avg(l.a_9)), > >>> pg_catalog.numeric_avg(avg(l.a_10)), > pg_catalog.numeric_avg(avg(l.a_11)), > >>> pg_catalog.numeric_avg(avg(l.a_12)), pg_catalog. 
> >>> numeric_avg(avg(l.a_13)), pg_catalog.numeric_avg(avg(l.a_14)), > >>> pg_catalog.numeric_avg(avg(l.a_15)), > pg_catalog.numeric_avg(avg(l.a_16)), > >>> pg_catalog.numeric_avg(avg(l.a_17)), > pg_catalog.numeric_avg(avg(l.a_18)), > >>> pg_catalog.numeric_avg(avg(l.a_19)), > pg_catalog.numeric_avg(avg(l.a_20)), > >>> pg_catalog.numeric_avg(avg( > >>> l.a_21)), pg_catalog.numeric_avg(avg(l.a_22)), sum(l.a_24), > sum(l.a_25), > >>> sum(COALESCE(r.a_4, r.a_5)), max(COALESCE(r.a_4, r.a_5)), > >>> min(COALESCE(r.a_4, r.a_5)) FROM ((SELECT l.a_1, l.a_2, l.a_3, l.a_4, > l.a_5, > >>> l.a_6, l.a_7, l.a_8, l.a_9, l.a_10, l.a_11, l.a_12, l.a_13, l.a_14, > l.a_15, > >>> l.a_16, l.a_17, l.a_18, l.a_ > >>> 19, l.a_20, l.a_21, l.a_22, r.a_1, r.a_2, r.a_3 FROM ((SELECT > >>> fgpc.date_id, fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, fgpc.m_imps, > >>> fgpc.m_clicks, fgpc.m_cost, fgpc.m_conv_1pc, fgpc.m_conv_mpc, > >>> fgpc.m_cnv_rate_1pc, fgpc.m_cnv_rate_mpc, fgpc.m_avg_cpc, > fgpc.m_max_cpc, > >>> fgpc.m_firstpage_cpc, fgpc.m_topofpage_cpc, f > >>> gpc.m_avg_cpm, fgpc.m_max_cpm, fgpc.m_max_cpa_pct, fgpc.m_avg_pos, > >>> fgpc.m_lowest_pos, fgpc.m_highest_pos, fgpc.m_quality_score, > >>> fgpc.m_view_thru_conv FROM ONLY bidw.fact_msn_kw_perf_daily fgpc WHERE > true) > >>> l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13, > a_14, > >>> a_15, a_16, a_17, a_18, a_19, > >>> a_20, a_21, a_22) LEFT JOIN (SELECT fgcd.date_id, fgcd.m_revenue, > >>> fgcd.m_conversions, fgcd.m_ad_grp_pub_key, fgcd.m_kw_pub_key FROM ONLY > >>> bidw.fact_msn_kw_conversion_daily fgcd WHERE true) r(a_1, a_2, a_3, > a_4, > >>> a_5) ON (((l.a_1 = l.a_1) AND (l.a_2 = r.a_4) AND (l.a_3 = r.a_5)))) > WHERE > >>> ((COALESCE((l.a_1)::bigint, > >>> r.a_1) >= 20131201) AND (COALESCE((l.a_1)::bigint, r.a_1) <= > 20140119))) > >>> l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13, > a_14, > >>> a_15, a_16, a_17, a_18, a_19, a_20, a_21, a_22, a_23, a_24, a_25) JOIN > >>> (SELECT kws.expr_names, kws.expr_values, kws.m_ad_grp_semid, > >>> kws.m_new_kw_bid, kws.m_kw_bi > >>> d, kws.m_kw_pub_key, kws.m_ad_grp_pub_key FROM ONLY > biods.msn_keyword_sup > >>> kws WHERE true) r(a_1, a_2, a_3, a_4, a_5, a_6, a_7) ON (true)) WHERE > >>> ((l.a_2 = r.a_7) AND (l.a_3 = r.a_6)) GROUP BY 1, 2, 3, 4, 5, 6 > >>> Total runtime: 8378.080 ms > >>> (5 rows) > >>> > >>> > >>> > >>> This is the actual time taken by the query: > >>> > >>> [postgres@sv4-pgxc-db04 test]$ time psql -d mydb -f "test.sql" > a.out > >>> > >>> real 0m23.533s > >>> user 0m15.705s > >>> sys 0m0.748s > >>> > >>> Now i dont know why is it taking that much time. > >> > >> > >> Try adding LIMIT with different amounts for example to see how that > >> impacts time. > >> > >> Also, try enabling statement logging (log_statement = all in > >> postgresql.conf) on the data nodes to see how long it takes on each > node. > >> > >> Also, the statement was rewritten in XC, with relations converted into > >> SELECTs, so try running the rewritten version directly to see how long > it > >> takes. > >> > >> Thanks, > >> > >> Mason > >> > > > |
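The per-node share of a plan like this can be checked independently by pointing psql straight at one data node and replaying the text shown after "Remote query:". A minimal sketch, assuming the data node listens on port 15432 on the same host and the remote query has been saved to remote_query.sql (both the port and the file name are placeholders):

  # run on the data node itself, bypassing the coordinator
  psql -p 15432 -d myDB -f remote_query.sql > /dev/null

Comparing that time against the coordinator-side total isolates how much of the runtime is spent shipping and merging rows rather than scanning and aggregating them.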
From: Koichi S. <koi...@gm...> - 2014-02-02 03:44:48
Could you share "explain" result to see how plan works fine. Regards; --- Koichi Suzuki 2014-02-02 Nirmal Sharma <sha...@gm...>: > Hi, > > These are the timings for adding limit with different amount. > With these timings you can see see that here the bottleneck is coordinator > (i.e. retrieving data from various nodes to coordinator ). > I just want to ask whether its normal or not? > > ---for limit 1000 > [postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out > > real 0m1.935s > user 0m0.051s > sys 0m0.002s > > ---for limit 10000 > [postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out > > real 0m2.724s > user 0m0.481s > sys 0m0.023s > > --for limit 100000 > [postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out > > real 0m12.102s > user 0m3.139s > sys 0m0.146s > > --for limit 200000 > [postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out > > real 0m13.078s > user 0m5.507s > sys 0m0.316s > > ---for limit 400000 > [postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out > > real 0m18.820s > user 0m10.482s > sys 0m0.659s > > ---for limit 600000 > [postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out > > real 0m23.478s > user 0m15.631s > sys 0m0.940s > [postgres@sv4-pgxc-db04 test]$ > > > I will also enable the statement log and try again and will send the output > soon. > > Nirmal > > > On Sat, Feb 1, 2014 at 1:47 PM, Mason Sharp <ms...@tr...> wrote: >> >> >> >> >> On Sat, Feb 1, 2014 at 2:40 PM, Nirmal Sharma <sha...@gm...> >> wrote: >>> >>> Hi Mason, >>> >>> This is the actual query that i was running. >>> >>> select coalesce(fgpc.date_id,fgcd.date_id) >>> date_id, >>> fgpc.m_ad_grp_pub_key m_ad_grp_pub_key, >>> fgpc.m_kw_pub_key m_kw_pub_key, >>> kws.expr_names, >>> kws.expr_values, >>> kws.m_ad_grp_semid, >>> sum(fgpc.m_imps) m_imps, >>> sum(fgpc.m_clicks) m_clicks, >>> sum(fgpc.m_cost) m_cost, >>> sum(fgpc.m_conv_1pc) m_conv_1pc, >>> sum(fgpc.m_conv_mpc) m_conv_mpc, >>> avg(fgpc.m_cnv_rate_1pc) m_cnv_rate_1pc, >>> avg(fgpc.m_cnv_rate_mpc) m_cnv_rate_mpc, >>> avg(fgpc.m_avg_cpc) m_avg_cpc, >>> avg(fgpc.m_max_cpc) m_max_cpc, >>> avg(fgpc.m_firstpage_cpc) >>> m_firstpage_cpc, >>> avg(fgpc.m_topofpage_cpc) >>> m_topofpage_cpc, >>> avg(fgpc.m_avg_cpm) m_avg_cpm, >>> avg(fgpc.m_max_cpm) m_max_cpm, >>> avg(fgpc.m_max_cpa_pct) m_max_cpa_pct, >>> avg(fgpc.m_avg_pos) m_avg_pos, >>> avg(fgpc.m_lowest_pos) m_lowest_pos, >>> avg(fgpc.m_highest_pos) m_highest_pos, >>> avg(fgpc.m_quality_score) >>> m_quality_score, >>> avg(fgpc.m_view_thru_conv) >>> m_view_thru_conv, >>> sum(fgcd.m_revenue) m_revenue, >>> sum(fgcd.m_conversions) m_conversions, >>> sum(coalesce(kws.m_new_kw_bid, >>> kws.m_kw_bid)) m_kw_total_bid, >>> max(coalesce(kws.m_new_kw_bid, >>> kws.m_kw_bid)) m_kw_max_bid, >>> min(coalesce(kws.m_new_kw_bid, >>> kws.m_kw_bid)) m_kw_min_bid >>> from >>> bidw.fact_msn_kw_perf_daily fgpc >>> full outer join bidw.fact_msn_kw_conversion_daily fgcd on >>> fgpc.m_ad_grp_pub_key = fgcd.m_ad_grp_pub_key and fgpc.m_kw_pub_key = >>> fgcd.m_kw_pub_key and fgpc.date_id = fgpc.date_id >>> join biods.msn_keyword_sup kws on >>> fgpc.m_kw_pub_key = kws.m_kw_pub_key and fgpc.m_ad_grp_pub_key = >>> kws.m_ad_grp_pub_key >>> where >>> coalesce(fgpc.date_id,fgcd.date_id) between >>> 20131201 and 20140119 >>> group by >>> >>> coalesce(fgpc.date_id,fgcd.date_id),fgpc.m_ad_grp_pub_key,fgpc.m_kw_pub_key,kws.expr_names,kws.expr_values,kws.m_ad_grp_semid >>> ; >>> >>> This is the explain plan for the same. 
>>> explain analyze verbose select ...... >>> .... >>> >>> Data Node Scan on "__REMOTE_GROUP_QUERY__" (cost=0.00..2.50 rows=1000 >>> width=908) (actual time=1672.149..8281.918 rows=605575 loops=1) >>> Output: (COALESCE((fgpc.date_id)::bigint, fgcd.date_id)), >>> fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, kws.expr_names, kws.expr_values, >>> kws.m_ad_grp_semid, (sum(fgpc.m_imps)), (sum(fgpc.m_clicks)), >>> (sum(fgpc.m_cost)), (sum(fgpc.m_conv_1pc)), (sum(fgpc.m_conv_mpc)), >>> (avg(fgpc.m_cnv_rate_1pc)), (avg(fgpc.m_cnv_ra >>> te_mpc)), (avg(fgpc.m_avg_cpc)), (avg(fgpc.m_max_cpc)), >>> (avg(fgpc.m_firstpage_cpc)), (avg(fgpc.m_topofpage_cpc)), >>> (avg(fgpc.m_avg_cpm)), (avg(fgpc.m_max_cpm)), (avg(fgpc.m_max_cpa_pct)), >>> (avg(fgpc.m_avg_pos)), (avg(fgpc.m_lowest_pos)), (avg(fgpc.m_highest_pos)), >>> (avg(fgpc.m_quality_score)), (avg(fgpc.m_view_thr >>> u_conv)), (sum(fgcd.m_revenue)), (sum(fgcd.m_conversions)), >>> (sum(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), >>> (max(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), >>> (min(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))) >>> Node/s: d11, d12, d13, d14, d15, d16 >>> Remote query: SELECT COALESCE((l.a_1)::bigint, l.a_23), l.a_2, l.a_3, >>> r.a_1, r.a_2, r.a_3, sum(l.a_4), sum(l.a_5), sum(l.a_6), sum(l.a_7), >>> sum(l.a_8), pg_catalog.numeric_avg(avg(l.a_9)), >>> pg_catalog.numeric_avg(avg(l.a_10)), pg_catalog.numeric_avg(avg(l.a_11)), >>> pg_catalog.numeric_avg(avg(l.a_12)), pg_catalog. >>> numeric_avg(avg(l.a_13)), pg_catalog.numeric_avg(avg(l.a_14)), >>> pg_catalog.numeric_avg(avg(l.a_15)), pg_catalog.numeric_avg(avg(l.a_16)), >>> pg_catalog.numeric_avg(avg(l.a_17)), pg_catalog.numeric_avg(avg(l.a_18)), >>> pg_catalog.numeric_avg(avg(l.a_19)), pg_catalog.numeric_avg(avg(l.a_20)), >>> pg_catalog.numeric_avg(avg( >>> l.a_21)), pg_catalog.numeric_avg(avg(l.a_22)), sum(l.a_24), sum(l.a_25), >>> sum(COALESCE(r.a_4, r.a_5)), max(COALESCE(r.a_4, r.a_5)), >>> min(COALESCE(r.a_4, r.a_5)) FROM ((SELECT l.a_1, l.a_2, l.a_3, l.a_4, l.a_5, >>> l.a_6, l.a_7, l.a_8, l.a_9, l.a_10, l.a_11, l.a_12, l.a_13, l.a_14, l.a_15, >>> l.a_16, l.a_17, l.a_18, l.a_ >>> 19, l.a_20, l.a_21, l.a_22, r.a_1, r.a_2, r.a_3 FROM ((SELECT >>> fgpc.date_id, fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, fgpc.m_imps, >>> fgpc.m_clicks, fgpc.m_cost, fgpc.m_conv_1pc, fgpc.m_conv_mpc, >>> fgpc.m_cnv_rate_1pc, fgpc.m_cnv_rate_mpc, fgpc.m_avg_cpc, fgpc.m_max_cpc, >>> fgpc.m_firstpage_cpc, fgpc.m_topofpage_cpc, f >>> gpc.m_avg_cpm, fgpc.m_max_cpm, fgpc.m_max_cpa_pct, fgpc.m_avg_pos, >>> fgpc.m_lowest_pos, fgpc.m_highest_pos, fgpc.m_quality_score, >>> fgpc.m_view_thru_conv FROM ONLY bidw.fact_msn_kw_perf_daily fgpc WHERE true) >>> l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13, a_14, >>> a_15, a_16, a_17, a_18, a_19, >>> a_20, a_21, a_22) LEFT JOIN (SELECT fgcd.date_id, fgcd.m_revenue, >>> fgcd.m_conversions, fgcd.m_ad_grp_pub_key, fgcd.m_kw_pub_key FROM ONLY >>> bidw.fact_msn_kw_conversion_daily fgcd WHERE true) r(a_1, a_2, a_3, a_4, >>> a_5) ON (((l.a_1 = l.a_1) AND (l.a_2 = r.a_4) AND (l.a_3 = r.a_5)))) WHERE >>> ((COALESCE((l.a_1)::bigint, >>> r.a_1) >= 20131201) AND (COALESCE((l.a_1)::bigint, r.a_1) <= 20140119))) >>> l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13, a_14, >>> a_15, a_16, a_17, a_18, a_19, a_20, a_21, a_22, a_23, a_24, a_25) JOIN >>> (SELECT kws.expr_names, kws.expr_values, kws.m_ad_grp_semid, >>> kws.m_new_kw_bid, kws.m_kw_bi >>> d, kws.m_kw_pub_key, kws.m_ad_grp_pub_key FROM ONLY biods.msn_keyword_sup >>> kws WHERE true) r(a_1, 
a_2, a_3, a_4, a_5, a_6, a_7) ON (true)) WHERE >>> ((l.a_2 = r.a_7) AND (l.a_3 = r.a_6)) GROUP BY 1, 2, 3, 4, 5, 6 >>> Total runtime: 8378.080 ms >>> (5 rows) >>> >>> >>> >>> This is the actual time taken by the query: >>> >>> [postgres@sv4-pgxc-db04 test]$ time psql -d mydb -f "test.sql" > a.out >>> >>> real 0m23.533s >>> user 0m15.705s >>> sys 0m0.748s >>> >>> Now i dont know why is it taking that much time. >> >> >> Try adding LIMIT with different amounts for example to see how that >> impacts time. >> >> Also, try enabling statement logging (log_statement = all in >> postgresql.conf) on the data nodes to see how long it takes on each node. >> >> Also, the statement was rewritten in XC, with relations converted into >> SELECTs, so try running the rewritten version directly to see how long it >> takes. >> >> Thanks, >> >> Mason >> > |
From: Nirmal S. <sha...@gm...> - 2014-02-01 23:27:46
Hi,

These are the timings for adding a limit with different amounts. From these timings you can see that the bottleneck here is the coordinator (i.e. retrieving data from the various nodes to the coordinator). I just want to ask whether this is normal or not?

---for limit 1000
[postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out

real 0m1.935s
user 0m0.051s
sys 0m0.002s

---for limit 10000
[postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out

real 0m2.724s
user 0m0.481s
sys 0m0.023s

---for limit 100000
[postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out

real 0m12.102s
user 0m3.139s
sys 0m0.146s

---for limit 200000
[postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out

real 0m13.078s
user 0m5.507s
sys 0m0.316s

---for limit 400000
[postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out

real 0m18.820s
user 0m10.482s
sys 0m0.659s

---for limit 600000
[postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out

real 0m23.478s
user 0m15.631s
sys 0m0.940s
[postgres@sv4-pgxc-db04 test]$

I will also enable the statement log and try again and will send the output soon.

Nirmal

On Sat, Feb 1, 2014 at 1:47 PM, Mason Sharp <ms...@tr...> wrote:
>
>
>
> On Sat, Feb 1, 2014 at 2:40 PM, Nirmal Sharma <sha...@gm...> wrote:
>
>> Hi Mason,
>>
>> This is the actual query that i was running.
>>
>> select coalesce(fgpc.date_id,fgcd.date_id) date_id,
>> fgpc.m_ad_grp_pub_key m_ad_grp_pub_key,
>> fgpc.m_kw_pub_key m_kw_pub_key,
>> kws.expr_names,
>> kws.expr_values,
>> kws.m_ad_grp_semid,
>> sum(fgpc.m_imps) m_imps,
>> sum(fgpc.m_clicks) m_clicks,
>> sum(fgpc.m_cost) m_cost,
>> sum(fgpc.m_conv_1pc) m_conv_1pc,
>> sum(fgpc.m_conv_mpc) m_conv_mpc,
>> avg(fgpc.m_cnv_rate_1pc) m_cnv_rate_1pc,
>> avg(fgpc.m_cnv_rate_mpc) m_cnv_rate_mpc,
>> avg(fgpc.m_avg_cpc) m_avg_cpc,
>> avg(fgpc.m_max_cpc) m_max_cpc,
>> avg(fgpc.m_firstpage_cpc) m_firstpage_cpc,
>> avg(fgpc.m_topofpage_cpc) m_topofpage_cpc,
>> avg(fgpc.m_avg_cpm) m_avg_cpm,
>> avg(fgpc.m_max_cpm) m_max_cpm,
>> avg(fgpc.m_max_cpa_pct) m_max_cpa_pct,
>> avg(fgpc.m_avg_pos) m_avg_pos,
>> avg(fgpc.m_lowest_pos) m_lowest_pos,
>> avg(fgpc.m_highest_pos) m_highest_pos,
>> avg(fgpc.m_quality_score) m_quality_score,
>> avg(fgpc.m_view_thru_conv) m_view_thru_conv,
>> sum(fgcd.m_revenue) m_revenue,
>> sum(fgcd.m_conversions) m_conversions,
>> sum(coalesce(kws.m_new_kw_bid, kws.m_kw_bid)) m_kw_total_bid,
>> max(coalesce(kws.m_new_kw_bid, kws.m_kw_bid)) m_kw_max_bid,
>> min(coalesce(kws.m_new_kw_bid, kws.m_kw_bid)) m_kw_min_bid
>> from bidw.fact_msn_kw_perf_daily fgpc
>> full outer join bidw.fact_msn_kw_conversion_daily fgcd on
>> fgpc.m_ad_grp_pub_key = fgcd.m_ad_grp_pub_key and fgpc.m_kw_pub_key =
>> fgcd.m_kw_pub_key and fgpc.date_id = fgpc.date_id
>> join biods.msn_keyword_sup kws on
>> fgpc.m_kw_pub_key = kws.m_kw_pub_key and fgpc.m_ad_grp_pub_key =
>> kws.m_ad_grp_pub_key
>> where coalesce(fgpc.date_id,fgcd.date_id) between
>> 20131201 and 20140119
>> group by
>> coalesce(fgpc.date_id,fgcd.date_id),fgpc.m_ad_grp_pub_key,fgpc.m_kw_pub_key,kws.expr_names,kws.expr_values,kws.m_ad_grp_semid
>> ;
>>
>> *This is the explain plan for the same.*
>> explain analyze verbose select ......
>> ....
>> >> * Data Node Scan on "__REMOTE_GROUP_QUERY__" (cost=0.00..2.50 rows=1000 >> width=908) (actual time=1672.149..8281.918 rows=605575 loops=1)* >> Output: (COALESCE((fgpc.date_id)::bigint, fgcd.date_id)), >> fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, kws.expr_names, kws.expr_values, >> kws.m_ad_grp_semid, (sum(fgpc.m_imps)), (sum(fgpc.m_clicks)), >> (sum(fgpc.m_cost)), (sum(fgpc.m_conv_1pc)), (sum(fgpc.m_conv_mpc)), >> (avg(fgpc.m_cnv_rate_1pc)), (avg(fgpc.m_cnv_ra >> te_mpc)), (avg(fgpc.m_avg_cpc)), (avg(fgpc.m_max_cpc)), >> (avg(fgpc.m_firstpage_cpc)), (avg(fgpc.m_topofpage_cpc)), >> (avg(fgpc.m_avg_cpm)), (avg(fgpc.m_max_cpm)), (avg(fgpc.m_max_cpa_pct)), >> (avg(fgpc.m_avg_pos)), (avg(fgpc.m_lowest_pos)), (avg(fgpc.m_highest_pos)), >> (avg(fgpc.m_quality_score)), (avg(fgpc.m_view_thr >> u_conv)), (sum(fgcd.m_revenue)), (sum(fgcd.m_conversions)), >> (sum(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), >> (max(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), >> (min(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))) >> *Node/s: d11, d12, d13, d14, d15, d16* >> Remote query: SELECT COALESCE((l.a_1)::bigint, l.a_23), l.a_2, l.a_3, >> r.a_1, r.a_2, r.a_3, sum(l.a_4), sum(l.a_5), sum(l.a_6), sum(l.a_7), >> sum(l.a_8), pg_catalog.numeric_avg(avg(l.a_9)), >> pg_catalog.numeric_avg(avg(l.a_10)), pg_catalog.numeric_avg(avg(l.a_11)), >> pg_catalog.numeric_avg(avg(l.a_12)), pg_catalog. >> numeric_avg(avg(l.a_13)), pg_catalog.numeric_avg(avg(l.a_14)), >> pg_catalog.numeric_avg(avg(l.a_15)), pg_catalog.numeric_avg(avg(l.a_16)), >> pg_catalog.numeric_avg(avg(l.a_17)), pg_catalog.numeric_avg(avg(l.a_18)), >> pg_catalog.numeric_avg(avg(l.a_19)), pg_catalog.numeric_avg(avg(l.a_20)), >> pg_catalog.numeric_avg(avg( >> l.a_21)), pg_catalog.numeric_avg(avg(l.a_22)), sum(l.a_24), sum(l.a_25), >> sum(COALESCE(r.a_4, r.a_5)), max(COALESCE(r.a_4, r.a_5)), >> min(COALESCE(r.a_4, r.a_5)) FROM ((SELECT l.a_1, l.a_2, l.a_3, l.a_4, >> l.a_5, l.a_6, l.a_7, l.a_8, l.a_9, l.a_10, l.a_11, l.a_12, l.a_13, l.a_14, >> l.a_15, l.a_16, l.a_17, l.a_18, l.a_ >> 19, l.a_20, l.a_21, l.a_22, r.a_1, r.a_2, r.a_3 FROM ((SELECT >> fgpc.date_id, fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, fgpc.m_imps, >> fgpc.m_clicks, fgpc.m_cost, fgpc.m_conv_1pc, fgpc.m_conv_mpc, >> fgpc.m_cnv_rate_1pc, fgpc.m_cnv_rate_mpc, fgpc.m_avg_cpc, fgpc.m_max_cpc, >> fgpc.m_firstpage_cpc, fgpc.m_topofpage_cpc, f >> gpc.m_avg_cpm, fgpc.m_max_cpm, fgpc.m_max_cpa_pct, fgpc.m_avg_pos, >> fgpc.m_lowest_pos, fgpc.m_highest_pos, fgpc.m_quality_score, >> fgpc.m_view_thru_conv FROM ONLY bidw.fact_msn_kw_perf_daily fgpc WHERE >> true) l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12, >> a_13, a_14, a_15, a_16, a_17, a_18, a_19, >> a_20, a_21, a_22) LEFT JOIN (SELECT fgcd.date_id, fgcd.m_revenue, >> fgcd.m_conversions, fgcd.m_ad_grp_pub_key, fgcd.m_kw_pub_key FROM ONLY >> bidw.fact_msn_kw_conversion_daily fgcd WHERE true) r(a_1, a_2, a_3, a_4, >> a_5) ON (((l.a_1 = l.a_1) AND (l.a_2 = r.a_4) AND (l.a_3 = r.a_5)))) WHERE >> ((COALESCE((l.a_1)::bigint, >> r.a_1) >= 20131201) AND (COALESCE((l.a_1)::bigint, r.a_1) <= 20140119))) >> l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13, >> a_14, a_15, a_16, a_17, a_18, a_19, a_20, a_21, a_22, a_23, a_24, a_25) >> JOIN (SELECT kws.expr_names, kws.expr_values, kws.m_ad_grp_semid, >> kws.m_new_kw_bid, kws.m_kw_bi >> d, kws.m_kw_pub_key, kws.m_ad_grp_pub_key FROM ONLY biods.msn_keyword_sup >> kws WHERE true) r(a_1, a_2, a_3, a_4, a_5, a_6, a_7) ON (true)) WHERE >> ((l.a_2 = r.a_7) AND (l.a_3 = r.a_6)) GROUP BY 1, 
2, 3, 4, 5, 6 >> * Total runtime: 8378.080 ms* >> (5 rows) >> >> >> >> *This is the actual time taken by the query:* >> >> [postgres@sv4-pgxc-db04 test]$ time psql -d mydb -f "test.sql" > a.out >> >> *real 0m23.533s* >> user 0m15.705s >> sys 0m0.748s >> >> Now i dont know why is it taking that much time. >> > > Try adding LIMIT with different amounts for example to see how that > impacts time. > > Also, try enabling statement logging (log_statement = all in > postgresql.conf) on the data nodes to see how long it takes on each node. > > Also, the statement was rewritten in XC, with relations converted into > SELECTs, so try running the rewritten version directly to see how long it > takes. > > Thanks, > > Mason > > |
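The LIMIT sweep above can be collected in one pass with a small script. A rough sketch, assuming a template file test.sql.tpl that holds the query with a __LIMIT__ placeholder (both names are hypothetical):

  # time the same query at increasing LIMITs
  for n in 1000 10000 100000 200000 400000 600000; do
      sed "s/__LIMIT__/$n/" test.sql.tpl > test.sql
      echo "limit $n:"
      time psql -d myDB -f test.sql > a.out
  done

The near-linear growth of the real time with the number of returned rows is what points at coordinator-side data transfer rather than data node execution.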
From: Mason S. <ms...@tr...> - 2014-02-01 21:47:11
On Sat, Feb 1, 2014 at 2:40 PM, Nirmal Sharma <sha...@gm...>wrote: > Hi Mason, > > This is the actual query that i was running. > > select coalesce(fgpc.date_id,fgcd.date_id) > date_id, > fgpc.m_ad_grp_pub_key m_ad_grp_pub_key, > fgpc.m_kw_pub_key m_kw_pub_key, > kws.expr_names, > kws.expr_values, > kws.m_ad_grp_semid, > sum(fgpc.m_imps) m_imps, > sum(fgpc.m_clicks) m_clicks, > sum(fgpc.m_cost) m_cost, > > sum(fgpc.m_conv_1pc) m_conv_1pc, > sum(fgpc.m_conv_mpc) m_conv_mpc, > avg(fgpc.m_cnv_rate_1pc) m_cnv_rate_1pc, > avg(fgpc.m_cnv_rate_mpc) m_cnv_rate_mpc, > avg(fgpc.m_avg_cpc) m_avg_cpc, > avg(fgpc.m_max_cpc) m_max_cpc, > avg(fgpc.m_firstpage_cpc) m_firstpage_cpc, > avg(fgpc.m_topofpage_cpc) m_topofpage_cpc, > avg(fgpc.m_avg_cpm) m_avg_cpm, > avg(fgpc.m_max_cpm) m_max_cpm, > avg(fgpc.m_max_cpa_pct) m_max_cpa_pct, > avg(fgpc.m_avg_pos) m_avg_pos, > avg(fgpc.m_lowest_pos) m_lowest_pos, > avg(fgpc.m_highest_pos) m_highest_pos, > avg(fgpc.m_quality_score) m_quality_score, > avg(fgpc.m_view_thru_conv) > m_view_thru_conv, > sum(fgcd.m_revenue) m_revenue, > sum(fgcd.m_conversions) m_conversions, > sum(coalesce(kws.m_new_kw_bid, > kws.m_kw_bid)) m_kw_total_bid, > max(coalesce(kws.m_new_kw_bid, > kws.m_kw_bid)) m_kw_max_bid, > min(coalesce(kws.m_new_kw_bid, > kws.m_kw_bid)) m_kw_min_bid > from > bidw.fact_msn_kw_perf_daily fgpc > full outer join bidw.fact_msn_kw_conversion_daily fgcd on > fgpc.m_ad_grp_pub_key = fgcd.m_ad_grp_pub_key and fgpc.m_kw_pub_key = > fgcd.m_kw_pub_key and fgpc.date_id = fgpc.date_id > join biods.msn_keyword_sup kws on > fgpc.m_kw_pub_key = kws.m_kw_pub_key and fgpc.m_ad_grp_pub_key = > kws.m_ad_grp_pub_key > where > coalesce(fgpc.date_id,fgcd.date_id) between > 20131201 and 20140119 > group by > > coalesce(fgpc.date_id,fgcd.date_id),fgpc.m_ad_grp_pub_key,fgpc.m_kw_pub_key,kws.expr_names,kws.expr_values,kws.m_ad_grp_semid > ; > > *This is the explain plan for the same.* > explain analyze verbose select ...... > .... > > * Data Node Scan on "__REMOTE_GROUP_QUERY__" (cost=0.00..2.50 rows=1000 > width=908) (actual time=1672.149..8281.918 rows=605575 loops=1)* > Output: (COALESCE((fgpc.date_id)::bigint, fgcd.date_id)), > fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, kws.expr_names, kws.expr_values, > kws.m_ad_grp_semid, (sum(fgpc.m_imps)), (sum(fgpc.m_clicks)), > (sum(fgpc.m_cost)), (sum(fgpc.m_conv_1pc)), (sum(fgpc.m_conv_mpc)), > (avg(fgpc.m_cnv_rate_1pc)), (avg(fgpc.m_cnv_ra > te_mpc)), (avg(fgpc.m_avg_cpc)), (avg(fgpc.m_max_cpc)), > (avg(fgpc.m_firstpage_cpc)), (avg(fgpc.m_topofpage_cpc)), > (avg(fgpc.m_avg_cpm)), (avg(fgpc.m_max_cpm)), (avg(fgpc.m_max_cpa_pct)), > (avg(fgpc.m_avg_pos)), (avg(fgpc.m_lowest_pos)), (avg(fgpc.m_highest_pos)), > (avg(fgpc.m_quality_score)), (avg(fgpc.m_view_thr > u_conv)), (sum(fgcd.m_revenue)), (sum(fgcd.m_conversions)), > (sum(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), > (max(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), > (min(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))) > *Node/s: d11, d12, d13, d14, d15, d16* > Remote query: SELECT COALESCE((l.a_1)::bigint, l.a_23), l.a_2, l.a_3, > r.a_1, r.a_2, r.a_3, sum(l.a_4), sum(l.a_5), sum(l.a_6), sum(l.a_7), > sum(l.a_8), pg_catalog.numeric_avg(avg(l.a_9)), > pg_catalog.numeric_avg(avg(l.a_10)), pg_catalog.numeric_avg(avg(l.a_11)), > pg_catalog.numeric_avg(avg(l.a_12)), pg_catalog. 
> numeric_avg(avg(l.a_13)), pg_catalog.numeric_avg(avg(l.a_14)), > pg_catalog.numeric_avg(avg(l.a_15)), pg_catalog.numeric_avg(avg(l.a_16)), > pg_catalog.numeric_avg(avg(l.a_17)), pg_catalog.numeric_avg(avg(l.a_18)), > pg_catalog.numeric_avg(avg(l.a_19)), pg_catalog.numeric_avg(avg(l.a_20)), > pg_catalog.numeric_avg(avg( > l.a_21)), pg_catalog.numeric_avg(avg(l.a_22)), sum(l.a_24), sum(l.a_25), > sum(COALESCE(r.a_4, r.a_5)), max(COALESCE(r.a_4, r.a_5)), > min(COALESCE(r.a_4, r.a_5)) FROM ((SELECT l.a_1, l.a_2, l.a_3, l.a_4, > l.a_5, l.a_6, l.a_7, l.a_8, l.a_9, l.a_10, l.a_11, l.a_12, l.a_13, l.a_14, > l.a_15, l.a_16, l.a_17, l.a_18, l.a_ > 19, l.a_20, l.a_21, l.a_22, r.a_1, r.a_2, r.a_3 FROM ((SELECT > fgpc.date_id, fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, fgpc.m_imps, > fgpc.m_clicks, fgpc.m_cost, fgpc.m_conv_1pc, fgpc.m_conv_mpc, > fgpc.m_cnv_rate_1pc, fgpc.m_cnv_rate_mpc, fgpc.m_avg_cpc, fgpc.m_max_cpc, > fgpc.m_firstpage_cpc, fgpc.m_topofpage_cpc, f > gpc.m_avg_cpm, fgpc.m_max_cpm, fgpc.m_max_cpa_pct, fgpc.m_avg_pos, > fgpc.m_lowest_pos, fgpc.m_highest_pos, fgpc.m_quality_score, > fgpc.m_view_thru_conv FROM ONLY bidw.fact_msn_kw_perf_daily fgpc WHERE > true) l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12, > a_13, a_14, a_15, a_16, a_17, a_18, a_19, > a_20, a_21, a_22) LEFT JOIN (SELECT fgcd.date_id, fgcd.m_revenue, > fgcd.m_conversions, fgcd.m_ad_grp_pub_key, fgcd.m_kw_pub_key FROM ONLY > bidw.fact_msn_kw_conversion_daily fgcd WHERE true) r(a_1, a_2, a_3, a_4, > a_5) ON (((l.a_1 = l.a_1) AND (l.a_2 = r.a_4) AND (l.a_3 = r.a_5)))) WHERE > ((COALESCE((l.a_1)::bigint, > r.a_1) >= 20131201) AND (COALESCE((l.a_1)::bigint, r.a_1) <= 20140119))) > l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13, > a_14, a_15, a_16, a_17, a_18, a_19, a_20, a_21, a_22, a_23, a_24, a_25) > JOIN (SELECT kws.expr_names, kws.expr_values, kws.m_ad_grp_semid, > kws.m_new_kw_bid, kws.m_kw_bi > d, kws.m_kw_pub_key, kws.m_ad_grp_pub_key FROM ONLY biods.msn_keyword_sup > kws WHERE true) r(a_1, a_2, a_3, a_4, a_5, a_6, a_7) ON (true)) WHERE > ((l.a_2 = r.a_7) AND (l.a_3 = r.a_6)) GROUP BY 1, 2, 3, 4, 5, 6 > * Total runtime: 8378.080 ms* > (5 rows) > > > > *This is the actual time taken by the query:* > > [postgres@sv4-pgxc-db04 test]$ time psql -d mydb -f "test.sql" > a.out > > *real 0m23.533s* > user 0m15.705s > sys 0m0.748s > > Now i dont know why is it taking that much time. > Try adding LIMIT with different amounts for example to see how that impacts time. Also, try enabling statement logging (log_statement = all in postgresql.conf) on the data nodes to see how long it takes on each node. Also, the statement was rewritten in XC, with relations converted into SELECTs, so try running the rewritten version directly to see how long it takes. Thanks, Mason |
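Mason's logging suggestion corresponds to a postgresql.conf fragment like the following on each data node (log_min_duration_statement is an optional extra, not part of his instructions):

  log_statement = 'all'
  log_min_duration_statement = 0   # also record each statement's duration

After editing the file and reloading the configuration (for example with pg_ctl reload), the data node logs will show the rewritten statement that each node receives, together with how long it ran there.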
From: Nirmal S. <sha...@gm...> - 2014-02-01 19:40:28
Hi Mason, This is the actual query that i was running. select coalesce(fgpc.date_id,fgcd.date_id) date_id, fgpc.m_ad_grp_pub_key m_ad_grp_pub_key, fgpc.m_kw_pub_key m_kw_pub_key, kws.expr_names, kws.expr_values, kws.m_ad_grp_semid, sum(fgpc.m_imps) m_imps, sum(fgpc.m_clicks) m_clicks, sum(fgpc.m_cost) m_cost, sum(fgpc.m_conv_1pc) m_conv_1pc, sum(fgpc.m_conv_mpc) m_conv_mpc, avg(fgpc.m_cnv_rate_1pc) m_cnv_rate_1pc, avg(fgpc.m_cnv_rate_mpc) m_cnv_rate_mpc, avg(fgpc.m_avg_cpc) m_avg_cpc, avg(fgpc.m_max_cpc) m_max_cpc, avg(fgpc.m_firstpage_cpc) m_firstpage_cpc, avg(fgpc.m_topofpage_cpc) m_topofpage_cpc, avg(fgpc.m_avg_cpm) m_avg_cpm, avg(fgpc.m_max_cpm) m_max_cpm, avg(fgpc.m_max_cpa_pct) m_max_cpa_pct, avg(fgpc.m_avg_pos) m_avg_pos, avg(fgpc.m_lowest_pos) m_lowest_pos, avg(fgpc.m_highest_pos) m_highest_pos, avg(fgpc.m_quality_score) m_quality_score, avg(fgpc.m_view_thru_conv) m_view_thru_conv, sum(fgcd.m_revenue) m_revenue, sum(fgcd.m_conversions) m_conversions, sum(coalesce(kws.m_new_kw_bid, kws.m_kw_bid)) m_kw_total_bid, max(coalesce(kws.m_new_kw_bid, kws.m_kw_bid)) m_kw_max_bid, min(coalesce(kws.m_new_kw_bid, kws.m_kw_bid)) m_kw_min_bid from bidw.fact_msn_kw_perf_daily fgpc full outer join bidw.fact_msn_kw_conversion_daily fgcd on fgpc.m_ad_grp_pub_key = fgcd.m_ad_grp_pub_key and fgpc.m_kw_pub_key = fgcd.m_kw_pub_key and fgpc.date_id = fgpc.date_id join biods.msn_keyword_sup kws on fgpc.m_kw_pub_key = kws.m_kw_pub_key and fgpc.m_ad_grp_pub_key = kws.m_ad_grp_pub_key where coalesce(fgpc.date_id,fgcd.date_id) between 20131201 and 20140119 group by coalesce(fgpc.date_id,fgcd.date_id),fgpc.m_ad_grp_pub_key,fgpc.m_kw_pub_key,kws.expr_names,kws.expr_values,kws.m_ad_grp_semid ; *This is the explain plan for the same.* explain analyze verbose select ...... .... * Data Node Scan on "__REMOTE_GROUP_QUERY__" (cost=0.00..2.50 rows=1000 width=908) (actual time=1672.149..8281.918 rows=605575 loops=1)* Output: (COALESCE((fgpc.date_id)::bigint, fgcd.date_id)), fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, kws.expr_names, kws.expr_values, kws.m_ad_grp_semid, (sum(fgpc.m_imps)), (sum(fgpc.m_clicks)), (sum(fgpc.m_cost)), (sum(fgpc.m_conv_1pc)), (sum(fgpc.m_conv_mpc)), (avg(fgpc.m_cnv_rate_1pc)), (avg(fgpc.m_cnv_ra te_mpc)), (avg(fgpc.m_avg_cpc)), (avg(fgpc.m_max_cpc)), (avg(fgpc.m_firstpage_cpc)), (avg(fgpc.m_topofpage_cpc)), (avg(fgpc.m_avg_cpm)), (avg(fgpc.m_max_cpm)), (avg(fgpc.m_max_cpa_pct)), (avg(fgpc.m_avg_pos)), (avg(fgpc.m_lowest_pos)), (avg(fgpc.m_highest_pos)), (avg(fgpc.m_quality_score)), (avg(fgpc.m_view_thr u_conv)), (sum(fgcd.m_revenue)), (sum(fgcd.m_conversions)), (sum(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), (max(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), (min(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))) *Node/s: d11, d12, d13, d14, d15, d16* Remote query: SELECT COALESCE((l.a_1)::bigint, l.a_23), l.a_2, l.a_3, r.a_1, r.a_2, r.a_3, sum(l.a_4), sum(l.a_5), sum(l.a_6), sum(l.a_7), sum(l.a_8), pg_catalog.numeric_avg(avg(l.a_9)), pg_catalog.numeric_avg(avg(l.a_10)), pg_catalog.numeric_avg(avg(l.a_11)), pg_catalog.numeric_avg(avg(l.a_12)), pg_catalog. 
numeric_avg(avg(l.a_13)), pg_catalog.numeric_avg(avg(l.a_14)), pg_catalog.numeric_avg(avg(l.a_15)), pg_catalog.numeric_avg(avg(l.a_16)), pg_catalog.numeric_avg(avg(l.a_17)), pg_catalog.numeric_avg(avg(l.a_18)), pg_catalog.numeric_avg(avg(l.a_19)), pg_catalog.numeric_avg(avg(l.a_20)), pg_catalog.numeric_avg(avg( l.a_21)), pg_catalog.numeric_avg(avg(l.a_22)), sum(l.a_24), sum(l.a_25), sum(COALESCE(r.a_4, r.a_5)), max(COALESCE(r.a_4, r.a_5)), min(COALESCE(r.a_4, r.a_5)) FROM ((SELECT l.a_1, l.a_2, l.a_3, l.a_4, l.a_5, l.a_6, l.a_7, l.a_8, l.a_9, l.a_10, l.a_11, l.a_12, l.a_13, l.a_14, l.a_15, l.a_16, l.a_17, l.a_18, l.a_ 19, l.a_20, l.a_21, l.a_22, r.a_1, r.a_2, r.a_3 FROM ((SELECT fgpc.date_id, fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, fgpc.m_imps, fgpc.m_clicks, fgpc.m_cost, fgpc.m_conv_1pc, fgpc.m_conv_mpc, fgpc.m_cnv_rate_1pc, fgpc.m_cnv_rate_mpc, fgpc.m_avg_cpc, fgpc.m_max_cpc, fgpc.m_firstpage_cpc, fgpc.m_topofpage_cpc, f gpc.m_avg_cpm, fgpc.m_max_cpm, fgpc.m_max_cpa_pct, fgpc.m_avg_pos, fgpc.m_lowest_pos, fgpc.m_highest_pos, fgpc.m_quality_score, fgpc.m_view_thru_conv FROM ONLY bidw.fact_msn_kw_perf_daily fgpc WHERE true) l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13, a_14, a_15, a_16, a_17, a_18, a_19, a_20, a_21, a_22) LEFT JOIN (SELECT fgcd.date_id, fgcd.m_revenue, fgcd.m_conversions, fgcd.m_ad_grp_pub_key, fgcd.m_kw_pub_key FROM ONLY bidw.fact_msn_kw_conversion_daily fgcd WHERE true) r(a_1, a_2, a_3, a_4, a_5) ON (((l.a_1 = l.a_1) AND (l.a_2 = r.a_4) AND (l.a_3 = r.a_5)))) WHERE ((COALESCE((l.a_1)::bigint, r.a_1) >= 20131201) AND (COALESCE((l.a_1)::bigint, r.a_1) <= 20140119))) l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13, a_14, a_15, a_16, a_17, a_18, a_19, a_20, a_21, a_22, a_23, a_24, a_25) JOIN (SELECT kws.expr_names, kws.expr_values, kws.m_ad_grp_semid, kws.m_new_kw_bid, kws.m_kw_bi d, kws.m_kw_pub_key, kws.m_ad_grp_pub_key FROM ONLY biods.msn_keyword_sup kws WHERE true) r(a_1, a_2, a_3, a_4, a_5, a_6, a_7) ON (true)) WHERE ((l.a_2 = r.a_7) AND (l.a_3 = r.a_6)) GROUP BY 1, 2, 3, 4, 5, 6 * Total runtime: 8378.080 ms* (5 rows) *This is the actual time taken by the query:* [postgres@sv4-pgxc-db04 test]$ time psql -d mydb -f "test.sql" > a.out *real 0m23.533s* user 0m15.705s sys 0m0.748s Now i dont know why is it taking that much time. Nirmal On Sat, Feb 1, 2014 at 11:16 AM, Mason Sharp <ms...@tr...>wrote: > > > > On Sat, Feb 1, 2014 at 2:13 PM, Nirmal Sharma <sha...@gm...>wrote: > >> My query uses aggregates and joins and it looks like this : >> >> Select >> Sum(...), >> Sum(..), >> Avg(..), >> ... >> .... >> .. >> From tableA a inner join tableB on a.col1 =b.col1 >> Inner join tableC on a.col1=c.col1 >> >> All the 3 tables are distributed on hash(col1) . >> >> I have 1 coordinator , 6 nodes, 1 GTM. >> >> When I run this query , it takes total 23 sec. >> But when I run the same query on each and individual nodes then it takes >> 4 sec on each and every nodes. >> So since it's cluster , it should ideally take 4 sec + some overhead time >> to combine data from each node on coordinator ( max 2 more sec) but I don't >> understand why it is taking 23 sec when runs from coordinator. >> > > > Can you please add an EXPLAIN in from of your SELECT to look at the plan? > If you only use 2 tables instead of 3, does it behave more as expected? > > |
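A convenient way to compare per-node behavior without leaving the coordinator is EXECUTE DIRECT, which ships a statement to one named node. A small sketch using the node names from the plan above:

  -- run a statement on a single data node from the coordinator session
  EXECUTE DIRECT ON (d11) 'SELECT count(*) FROM bidw.fact_msn_kw_perf_daily';

Repeating this for d12 through d16 gives a quick feel for whether any single node is the slow one, independent of the merge step on the coordinator.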
From: Mason S. <ms...@tr...> - 2014-02-01 19:33:56
On Sat, Feb 1, 2014 at 11:46 AM, Nirmal <sha...@gm...> wrote: > Hi Koichi, > > My tables are not replicated. They all are distributed the way you > explained. > For example, total record in one table is 600000 and i have 6 nodes so > each and every node has got 100000 records. > > Now the issue is that when I am running my query directly on data node it > comes up in 5 sec and it is taking the same time on each and every node so > it should take the same time if i run the query through coordinator but > somehow instead on 5sec it's taking 22 sec. So somehow the query execution > on nodes are happening correctly but data movement from nodes to > coordinator is taking a lot of time. > > > Please advise. > > What does your query look like? A single table? A join? Using aggregates? Thanks, Mason |
From: Mason S. <ms...@tr...> - 2014-02-01 19:17:00
On Sat, Feb 1, 2014 at 2:13 PM, Nirmal Sharma <sha...@gm...>wrote: > My query uses aggregates and joins and it looks like this : > > Select > Sum(...), > Sum(..), > Avg(..), > ... > .... > .. > From tableA a inner join tableB on a.col1 =b.col1 > Inner join tableC on a.col1=c.col1 > > All the 3 tables are distributed on hash(col1) . > > I have 1 coordinator , 6 nodes, 1 GTM. > > When I run this query , it takes total 23 sec. > But when I run the same query on each and individual nodes then it takes 4 > sec on each and every nodes. > So since it's cluster , it should ideally take 4 sec + some overhead time > to combine data from each node on coordinator ( max 2 more sec) but I don't > understand why it is taking 23 sec when runs from coordinator. > Can you please add an EXPLAIN in from of your SELECT to look at the plan? If you only use 2 tables instead of 3, does it behave more as expected? |
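Mason's suggestion spelled out: prepend EXPLAIN to the statement. A hypothetical two-table version of the join (table and column names are placeholders):

  EXPLAIN (ANALYZE, VERBOSE)
  SELECT sum(a.val)
  FROM tableA a
  JOIN tableB b ON a.col1 = b.col1;

VERBOSE is what makes the coordinator print the Node/s list and the rewritten Remote query, which shows how much of the work was pushed down to the data nodes.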
From: Nirmal S. <sha...@gm...> - 2014-02-01 19:13:39
My query uses aggregates and joins, and it looks like this:

Select
Sum(...),
Sum(..),
Avg(..),
...
....
..
From tableA a inner join tableB b on a.col1 = b.col1
Inner join tableC c on a.col1 = c.col1

All 3 tables are distributed on hash(col1).

I have 1 coordinator, 6 nodes, 1 GTM.

When I run this query, it takes 23 sec in total.
But when I run the same query on each individual node, it takes 4 sec on each and every node.
Since it's a cluster, it should ideally take 4 sec + some overhead time to combine data from each node on the coordinator (max 2 more sec), but I don't understand why it is taking 23 sec when run from the coordinator.

Nirmal

Sent from my iPad

> On Feb 1, 2014, at 11:02 AM, Mason Sharp <ms...@tr...> wrote:
>
>
>> On Sat, Feb 1, 2014 at 11:46 AM, Nirmal <sha...@gm...> wrote:
>> Hi Koichi,
>>
>> My tables are not replicated. They all are distributed the way you explained.
>> For example, total record in one table is 600000 and i have 6 nodes so each and every node has got 100000 records.
>>
>> Now the issue is that when I am running my query directly on data node it comes up in 5 sec and it is taking the same time on each and every node so it should take the same time if i run the query through coordinator but somehow instead on 5sec it's taking 22 sec. So somehow the query execution on nodes are happening correctly but data movement from nodes to coordinator is taking a lot of time.
>
>
>> Please advise.
>
> What does your query look like? A single table? A join? Using aggregates?
>
> Thanks,
>
> Mason
>
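The layout described here would be created with DDL along these lines (column types are assumptions):

  CREATE TABLE tableA (col1 bigint, val numeric) DISTRIBUTE BY HASH (col1);
  CREATE TABLE tableB (col1 bigint, val numeric) DISTRIBUTE BY HASH (col1);
  CREATE TABLE tableC (col1 bigint, val numeric) DISTRIBUTE BY HASH (col1);

Because all three tables hash on the join key, the joins themselves can be evaluated entirely on each data node; the open question in this thread is the cost of the final merge on the coordinator.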
From: Nirmal S. <sha...@gm...> - 2014-02-01 18:53:46
So if the coordinator time or total query time is the sum of time taken by query on each and every node then what is the use of having cluster. The whole point of having the cluster is to divide the work across different data node and reduce the query time by almost the no. Of data nodes. I thought the overall time taken by query/coordinator is the max of time taken by all the nodes ( max( dn1,dn2....) instead of sum of time taken by all the nodes sum(dn1,dn2....) Please let me know if my understanding is incorrect. Nirmal Sent from my iPad > On Feb 1, 2014, at 10:19 AM, Sandeep Gupta <gup...@gm...> wrote: > > Nirmal, > > Coordinator time is a function of sum of the output from each datanode. PGXC shows performance when the datanodes output small amount of data compared to original size. > > -Sandeep > > > >> On Sat, Feb 1, 2014 at 11:46 AM, Nirmal <sha...@gm...> wrote: >> Hi Koichi, >> >> My tables are not replicated. They all are distributed the way you explained. >> For example, total record in one table is 600000 and i have 6 nodes so each and every node has got 100000 records. >> >> Now the issue is that when I am running my query directly on data node it comes up in 5 sec and it is taking the same time on each and every node so it should take the same time if i run the query through coordinator but somehow instead on 5sec it's taking 22 sec. So somehow the query execution on nodes are happening correctly but data movement from nodes to coordinator is taking a lot of time. >> >> Please advise. >> >> Sent from my iPhone >> >> > On Feb 1, 2014, at 12:17 AM, Koichi Suzuki <koi...@gm...> wrote: >> > >> > It is not a good way to replicate all the tables for write >> > scalability. The best way is to distribute transaction tables (very >> > frequently written ones) and replicate master tables (less frequently >> > written and frequently joined with transaction tables). >> > >> > Example is our DBT-1 benchmark. Slide 12 of >> > http://postgres-xc.sourceforge.net/misc-docs/20120614_PGXC_Tutorial_global.pdf >> > shows how we designed DBT-1 table distribution for XC. I hope this >> > helps. >> > >> > Regards; >> > --- >> > Koichi Suzuki >> > >> > >> > 2014-02-01 Nirmal Sharma <sha...@gm...>: >> >> Hi, >> >> >> >> I created a pgxc cluster with one coordinator, one GTM and 6 data nodes (all >> >> on same big machine ). >> >> To test the performance, i ran one query through coordinator on my data >> >> which is evenly distributed on all the nodes and it took total 25 sec to >> >> complete. >> >> >> >> And then i ran the same query on datanodes directly and it took 5 sec on >> >> each and every datanodes. >> >> >> >> Since query execution happens parallely on data nodes so ideally even if i >> >> run the query through coordinator, it should not take more than 5-8 sec max >> >> but i dont understand why is it taking 25 sec. >> >> >> >> Can somebody help me.? >> >> Do i need to make some changes to my cluster configuration? >> >> >> >> >> >> Regards >> >> Nirmal >> >> >> >> ------------------------------------------------------------------------------ >> >> WatchGuard Dimension instantly turns raw network data into actionable >> >> security intelligence. It gives you real-time visual feedback on key >> >> security issues and trends. Skip the complicated setup - simply import >> >> a virtual appliance and go from zero to informed in seconds. 
>> >> http://pubads.g.doubleclick.net/gampad/clk?id=123612991&iu=/4140/ostg.clktrk >> >> _______________________________________________ >> >> Postgres-xc-general mailing list >> >> Pos...@li... >> >> https://lists.sourceforge.net/lists/listinfo/postgres-xc-general >> >> >> >> ------------------------------------------------------------------------------ >> WatchGuard Dimension instantly turns raw network data into actionable >> security intelligence. It gives you real-time visual feedback on key >> security issues and trends. Skip the complicated setup - simply import >> a virtual appliance and go from zero to informed in seconds. >> http://pubads.g.doubleclick.net/gampad/clk?id=123612991&iu=/4140/ostg.clktrk >> _______________________________________________ >> Postgres-xc-general mailing list >> Pos...@li... >> https://lists.sourceforge.net/lists/listinfo/postgres-xc-general > |
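One way to confirm this kind of even spread from the coordinator is XC's hidden xc_node_id system column; a sketch with a placeholder table name:

  -- rows per data node for a distributed table
  SELECT xc_node_id, count(*) FROM mytable GROUP BY xc_node_id;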
From: Sandeep G. <gup...@gm...> - 2014-02-01 18:19:40
Nirmal, Coordinator time is a function of sum of the output from each datanode. PGXC shows performance when the datanodes output small amount of data compared to original size. -Sandeep On Sat, Feb 1, 2014 at 11:46 AM, Nirmal <sha...@gm...> wrote: > Hi Koichi, > > My tables are not replicated. They all are distributed the way you > explained. > For example, total record in one table is 600000 and i have 6 nodes so > each and every node has got 100000 records. > > Now the issue is that when I am running my query directly on data node it > comes up in 5 sec and it is taking the same time on each and every node so > it should take the same time if i run the query through coordinator but > somehow instead on 5sec it's taking 22 sec. So somehow the query execution > on nodes are happening correctly but data movement from nodes to > coordinator is taking a lot of time. > > Please advise. > > Sent from my iPhone > > > On Feb 1, 2014, at 12:17 AM, Koichi Suzuki <koi...@gm...> > wrote: > > > > It is not a good way to replicate all the tables for write > > scalability. The best way is to distribute transaction tables (very > > frequently written ones) and replicate master tables (less frequently > > written and frequently joined with transaction tables). > > > > Example is our DBT-1 benchmark. Slide 12 of > > > http://postgres-xc.sourceforge.net/misc-docs/20120614_PGXC_Tutorial_global.pdf > > shows how we designed DBT-1 table distribution for XC. I hope this > > helps. > > > > Regards; > > --- > > Koichi Suzuki > > > > > > 2014-02-01 Nirmal Sharma <sha...@gm...>: > >> Hi, > >> > >> I created a pgxc cluster with one coordinator, one GTM and 6 data nodes > (all > >> on same big machine ). > >> To test the performance, i ran one query through coordinator on my data > >> which is evenly distributed on all the nodes and it took total 25 sec to > >> complete. > >> > >> And then i ran the same query on datanodes directly and it took 5 sec on > >> each and every datanodes. > >> > >> Since query execution happens parallely on data nodes so ideally even > if i > >> run the query through coordinator, it should not take more than 5-8 sec > max > >> but i dont understand why is it taking 25 sec. > >> > >> Can somebody help me.? > >> Do i need to make some changes to my cluster configuration? > >> > >> > >> Regards > >> Nirmal > >> > >> > ------------------------------------------------------------------------------ > >> WatchGuard Dimension instantly turns raw network data into actionable > >> security intelligence. It gives you real-time visual feedback on key > >> security issues and trends. Skip the complicated setup - simply import > >> a virtual appliance and go from zero to informed in seconds. > >> > http://pubads.g.doubleclick.net/gampad/clk?id=123612991&iu=/4140/ostg.clktrk > >> _______________________________________________ > >> Postgres-xc-general mailing list > >> Pos...@li... > >> https://lists.sourceforge.net/lists/listinfo/postgres-xc-general > >> > > > ------------------------------------------------------------------------------ > WatchGuard Dimension instantly turns raw network data into actionable > security intelligence. It gives you real-time visual feedback on key > security issues and trends. Skip the complicated setup - simply import > a virtual appliance and go from zero to informed in seconds. 
> > http://pubads.g.doubleclick.net/gampad/clk?id=123612991&iu=/4140/ostg.clktrk > _______________________________________________ > Postgres-xc-general mailing list > Pos...@li... > https://lists.sourceforge.net/lists/listinfo/postgres-xc-general > |
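A rough back-of-the-envelope check with the plan numbers quoted above in the thread (rows=605575, width=908, where width is the planner's estimated row size in bytes) illustrates the point:

  605575 rows x 908 bytes/row ≈ 550 MB

That is roughly half a gigabyte the six data nodes must ship to the coordinator before the full, un-LIMITed result can be returned, which fits the pattern of the runtime growing with the LIMIT.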
From: Sandeep G. <gup...@gm...> - 2014-02-01 18:02:01
Hi Koichi,

Thank you for looking into this. I did set up pgxc manually. I have a script that performs:

1. initdb and initgtm for the coordinator and gtm respectively
2. make changes in the config file of gtm to set up the port numbers
3. launch gtm and launch the coordinator
4. Then I ssh into the remote machine and launch 4 datanode instances (ports configured appropriately)
5. Finally, I add the datanodes to the coordinator, followed by pgxc_reload

I will take a look at pgxc_ctl. I would say that the deadlock happens 1 out of 10 times. Not sure if that is helpful.

-Sandeep

On Sat, Feb 1, 2014 at 3:22 AM, Koichi Suzuki <koi...@gm...> wrote:
> Did you configure XC cluster manually? Then could you share how you did?
>
> To save your effort, pgxc_ctl provides simpler way to configure and
> run XC cluster. It is a contrib module and the document will be
> found at http://postgres-xc.sourceforge.net/docs/1_1/pgxc-ctl.html
>
> Regards;
> ---
> Koichi Suzuki
>
>
> 2014-02-01 Sandeep Gupta <gup...@gm...>:
> > Hi,
> >
> > I was debugging an outstanding issue with pgxc
> > (http://sourceforge.net/mailarchive/forum.php?thread_name=CABEZHFtr_YoWb22UAnPGQz8M5KqpwzbviYiAgq_%3DY...@ma...&forum_name=postgres-xc-general).
> >
> > I couldn't reproduce that error. But I do get this error.
> >
> >
> > LOG: database system is ready to accept connections
> > LOG: autovacuum launcher started
> > LOG: sending cancel to blocking autovacuum PID 17222
> > DETAIL: Process 13896 waits for AccessExclusiveLock on relation 16388 of
> > database 12626.
> > STATEMENT: drop index mdn
> > ERROR: canceling autovacuum task
> > CONTEXT: automatic analyze of table "postgres.public.la_directednetwork"
> > PreAbort Remote
> >
> >
> > It seems to be a deadlock issue and may be related to the earlier problem as
> > well.
> > Please let me know your comments.
> >
> > -Sandeep
> >
> >
> > ------------------------------------------------------------------------------
> > WatchGuard Dimension instantly turns raw network data into actionable
> > security intelligence. It gives you real-time visual feedback on key
> > security issues and trends. Skip the complicated setup - simply import
> > a virtual appliance and go from zero to informed in seconds.
> > http://pubads.g.doubleclick.net/gampad/clk?id=123612991&iu=/4140/ostg.clktrk
> > _______________________________________________
> > Postgres-xc-general mailing list
> > Pos...@li...
> > https://lists.sourceforge.net/lists/listinfo/postgres-xc-general
> >
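For reference, step 5 of such a script presumably boils down to statements like these on the coordinator (node names, host, and ports are placeholders; the reload function's full name is pgxc_pool_reload):

  -- register each data node with the coordinator, then refresh the pooler
  CREATE NODE dn1 WITH (TYPE = 'datanode', HOST = 'remotehost', PORT = 15432);
  CREATE NODE dn2 WITH (TYPE = 'datanode', HOST = 'remotehost', PORT = 15433);
  SELECT pgxc_pool_reload();

If any of these steps is skipped or run in a different order, the node catalogs can end up inconsistent, which is one plausible source of intermittent failures.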
From: Nirmal <sha...@gm...> - 2014-02-01 16:46:57
|
Hi Koichi, My tables are not replicated. They all are distributed the way you explained. For example, total record in one table is 600000 and i have 6 nodes so each and every node has got 100000 records. Now the issue is that when I am running my query directly on data node it comes up in 5 sec and it is taking the same time on each and every node so it should take the same time if i run the query through coordinator but somehow instead on 5sec it's taking 22 sec. So somehow the query execution on nodes are happening correctly but data movement from nodes to coordinator is taking a lot of time. Please advise. Sent from my iPhone > On Feb 1, 2014, at 12:17 AM, Koichi Suzuki <koi...@gm...> wrote: > > It is not a good way to replicate all the tables for write > scalability. The best way is to distribute transaction tables (very > frequently written ones) and replicate master tables (less frequently > written and frequently joined with transaction tables). > > Example is our DBT-1 benchmark. Slide 12 of > http://postgres-xc.sourceforge.net/misc-docs/20120614_PGXC_Tutorial_global.pdf > shows how we designed DBT-1 table distribution for XC. I hope this > helps. > > Regards; > --- > Koichi Suzuki > > > 2014-02-01 Nirmal Sharma <sha...@gm...>: >> Hi, >> >> I created a pgxc cluster with one coordinator, one GTM and 6 data nodes (all >> on same big machine ). >> To test the performance, i ran one query through coordinator on my data >> which is evenly distributed on all the nodes and it took total 25 sec to >> complete. >> >> And then i ran the same query on datanodes directly and it took 5 sec on >> each and every datanodes. >> >> Since query execution happens parallely on data nodes so ideally even if i >> run the query through coordinator, it should not take more than 5-8 sec max >> but i dont understand why is it taking 25 sec. >> >> Can somebody help me.? >> Do i need to make some changes to my cluster configuration? >> >> >> Regards >> Nirmal >> >> ------------------------------------------------------------------------------ >> WatchGuard Dimension instantly turns raw network data into actionable >> security intelligence. It gives you real-time visual feedback on key >> security issues and trends. Skip the complicated setup - simply import >> a virtual appliance and go from zero to informed in seconds. >> http://pubads.g.doubleclick.net/gampad/clk?id=123612991&iu=/4140/ostg.clktrk >> _______________________________________________ >> Postgres-xc-general mailing list >> Pos...@li... >> https://lists.sourceforge.net/lists/listinfo/postgres-xc-general >> |
From: Koichi S. <koi...@gm...> - 2014-02-01 16:29:31
What steps did you take to crash and add servers? I need to know what you've done. If you've done server addition manually, it is not simple and you have to follow many steps. pgxc_ctl provides simpler way for this. It is a contrib module and the documentation will be found at http://postgres-xc.sourceforge.net/docs/1_1/pgxc-ctl.html Source code of pgxc_ctl may also be helpful to know what steps you should follow if you do it manually. Regards; --- Koichi Suzuki 2014-02-01 Cristian <pub...@gm...>: > Hi i installed and configured postgres-xc in a cluster with 3 servers as > follows: > > 1) server s1: coordinator , 1 gmt-proxy , 1 datanode > 2) server s2: coordinator , 1 gmt-proxy , 1 datanode > 3) gmt > > All is working until i simulate a crash for a server. > When i add again the coordinator .... gmt gives many errors and all seams > very slow. > > I made many attempts, it seams that if a server crashs , i need to remove > node from cluster manually else give problems, but now give a bit of > confusion.. too many strange behaviours. > > Maybe i m wrong in some procedure, so i want ask the correct procedure in > the case a datanode or coordinator or gmtproxy crash (or all of them). > > > > ------------------------------------------------------------------------------ > WatchGuard Dimension instantly turns raw network data into actionable > security intelligence. It gives you real-time visual feedback on key > security issues and trends. Skip the complicated setup - simply import > a virtual appliance and go from zero to informed in seconds. > http://pubads.g.doubleclick.net/gampad/clk?id=123612991&iu=/4140/ostg.clktrk > _______________________________________________ > Postgres-xc-general mailing list > Pos...@li... > https://lists.sourceforge.net/lists/listinfo/postgres-xc-general > |
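The pgxc_ctl workflow Koichi refers to looks roughly like this (a sketch only; see the linked documentation for pgxc_ctl.conf and the exact command set):

  pgxc_ctl init all      # initialize GTM, coordinators, and data nodes from the config file
  pgxc_ctl start all     # start every component
  pgxc_ctl monitor all   # check that each component is running

The tool also has add/remove subcommands for individual coordinators and data nodes, which makes failure and recovery scenarios much less error-prone than hand-run steps.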
From: Cristian <pub...@gm...> - 2014-02-01 14:28:16
Hi, I installed and configured Postgres-XC in a cluster with 3 servers as follows:

1) server s1: coordinator, 1 gtm-proxy, 1 datanode
2) server s2: coordinator, 1 gtm-proxy, 1 datanode
3) gtm

All is working until I simulate a crash of one server.
When I add the coordinator again, the GTM gives many errors and everything seems very slow.

I made many attempts; it seems that if a server crashes, I need to remove the node from the cluster manually or else it gives problems, but by now there is a bit of confusion.. too many strange behaviours.

Maybe I am wrong in some procedure, so I want to ask the correct procedure for the case where a datanode or coordinator or gtm-proxy crashes (or all of them).
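Part of the manual procedure being asked about is refreshing the surviving nodes' catalogs once the failed component is rebuilt; a sketch, with the node name, host, and port as placeholders:

  -- on each remaining coordinator: drop the stale entry, re-register, reload the pooler
  DROP NODE coord2;
  CREATE NODE coord2 WITH (TYPE = 'coordinator', HOST = 's2', PORT = 5432);
  SELECT pgxc_pool_reload();

This is only one piece of the full recovery story (the node's data has to be restored too), and pgxc_ctl, mentioned in the reply above, automates most of it.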
From: Koichi S. <koi...@gm...> - 2014-02-01 08:22:15
Did you configure XC cluster manually? Then could you share how you did? To save your effort, pgxc_ctl provides simpler way to configure and run XC cluster. It is a contrib module and the document will be found at http://postgres-xc.sourceforge.net/docs/1_1/pgxc-ctl.html Regards; --- Koichi Suzuki 2014-02-01 Sandeep Gupta <gup...@gm...>: > Hi, > > I was debugging an outstanding issue with pgxc > (http://sourceforge.net/mailarchive/forum.php?thread_name=CABEZHFtr_YoWb22UAnPGQz8M5KqpwzbviYiAgq_%3DY...@ma...&forum_name=postgres-xc-general). > > I couldn't reproduce that error. But I do get this error. > > > LOG: database system is ready to accept connections > LOG: autovacuum launcher started > LOG: sending cancel to blocking autovacuum PID 17222 > DETAIL: Process 13896 waits for AccessExclusiveLock on relation 16388 of > database 12626. > STATEMENT: drop index mdn > ERROR: canceling autovacuum task > CONTEXT: automatic analyze of table "postgres.public.la_directednetwork" > PreAbort Remote > > > It seems to be a deadlock issue and may be related to the earlier problem as > well. > Please let me know your comments. > > -Sandeep > > > ------------------------------------------------------------------------------ > WatchGuard Dimension instantly turns raw network data into actionable > security intelligence. It gives you real-time visual feedback on key > security issues and trends. Skip the complicated setup - simply import > a virtual appliance and go from zero to informed in seconds. > http://pubads.g.doubleclick.net/gampad/clk?id=123612991&iu=/4140/ostg.clktrk > _______________________________________________ > Postgres-xc-general mailing list > Pos...@li... > https://lists.sourceforge.net/lists/listinfo/postgres-xc-general > |
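When this kind of cancellation shows up repeatedly, a standard diagnostic is to look at pg_locks while the DROP INDEX is waiting; a hypothetical check for the table named in the log quoted above:

  -- which backends hold or await locks on the table autovacuum was analyzing?
  SELECT pid, mode, granted
  FROM pg_locks
  WHERE relation = 'la_directednetwork'::regclass;

In stock PostgreSQL the cancel message itself is ordinarily harmless (autovacuum simply retries later); the open question in this thread is whether the XC coordinator handles the cancelled remote transaction cleanly.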