|
From: Nirmal S. <sha...@gm...> - 2014-01-31 23:38:37
|
Hi,

I created a pgxc cluster with one coordinator, one GTM and 6 data nodes (all on the same big machine). To test the performance, I ran one query through the coordinator on my data, which is evenly distributed across all the nodes, and it took 25 sec in total to complete.

Then I ran the same query on the datanodes directly, and it took 5 sec on each datanode.

Since query execution happens in parallel on the data nodes, ideally the query should not take more than 5-8 sec even when run through the coordinator, so I don't understand why it is taking 25 sec.

Can somebody help me? Do I need to make some changes to my cluster configuration?

Regards
Nirmal
|
|
From: Koichi S. <koi...@gm...> - 2014-02-01 08:17:13
|
It is not a good idea to replicate all the tables if you want write scalability. The best way is to distribute transaction tables (very frequently written ones) and replicate master tables (less frequently written and frequently joined with transaction tables).

An example is our DBT-1 benchmark. Slide 12 of http://postgres-xc.sourceforge.net/misc-docs/20120614_PGXC_Tutorial_global.pdf shows how we designed the DBT-1 table distribution for XC. I hope this helps.

Regards;
---
Koichi Suzuki
|
|
From: Nirmal <sha...@gm...> - 2014-02-01 16:46:57
|
Hi Koichi,

My tables are not replicated. They are all distributed the way you explained. For example, one table has 600000 records in total and I have 6 nodes, so each node has 100000 records.

Now the issue is that when I run my query directly on a data node it comes back in 5 sec, and it takes the same time on every node, so it should take about the same time if I run the query through the coordinator, but instead of 5 sec it's taking 22 sec. So the query execution on the nodes seems to be happening correctly, but data movement from the nodes to the coordinator is taking a lot of time.

Please advise.
|
|
From: Mason S. <ms...@tr...> - 2014-02-01 19:33:56
|
What does your query look like? A single table? A join? Using aggregates?

Thanks,
Mason
|
|
From: Sandeep G. <gup...@gm...> - 2014-02-01 18:19:40
|
Nirmal,

Coordinator time is a function of the sum of the output from each datanode. PGXC performs well when the datanodes output a small amount of data compared to the original size.

-Sandeep
|
|
From: Nirmal S. <sha...@gm...> - 2014-02-01 18:53:46
|
So if the coordinator time, or total query time, is the sum of the time taken by the query on each node, then what is the use of having a cluster? The whole point of having the cluster is to divide the work across the data nodes and reduce the query time by almost a factor of the number of data nodes.

I thought the overall time taken by the query/coordinator is the max of the time taken by all the nodes, max(dn1, dn2, ...), instead of the sum of the time taken by all the nodes, sum(dn1, dn2, ...).

Please let me know if my understanding is incorrect.

Nirmal
|
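The max-versus-sum question in this exchange can be illustrated with a toy model. This is only a sketch under stated assumptions, not how Postgres-XC actually accounts for time: the function name, the linear per-row cost and the 10-microsecond figure are all hypothetical. It assumes the datanodes run in parallel (so their execution contributes the max), while the coordinator has to handle every returned row (so the output contributes a sum).

```python
def estimate_coordinator_seconds(node_seconds, rows_per_node, seconds_per_row):
    """Toy model: datanodes run in parallel, then every returned row is handled once."""
    parallel_part = max(node_seconds)   # the slowest datanode bounds the parallel phase
    total_rows = sum(rows_per_node)     # the coordinator sees the sum of all outputs
    return parallel_part + total_rows * seconds_per_row


# Six datanodes taking 5 s each; the per-row cost is a made-up illustrative value.
PER_ROW = 10e-6
small = estimate_coordinator_seconds([5.0] * 6, [10] * 6, PER_ROW)
large = estimate_coordinator_seconds([5.0] * 6, [100_000] * 6, PER_ROW)
print(f"few rows returned:  {small:.4f} s")
print(f"many rows returned: {large:.4f} s")
```

In this model both readings hold at once: execution time scales with the slowest node, while the result-handling time grows with the total number of rows the datanodes send back.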
|
From: Nirmal S. <sha...@gm...> - 2014-02-01 19:13:39
|
My query uses aggregates and joins, and it looks like this:
Select
Sum(...),
Sum(..),
Avg(..),
...
....
..
From tableA a inner join tableB b on a.col1 = b.col1
Inner join tableC c on a.col1 = c.col1
All 3 tables are distributed on hash(col1).
I have 1 coordinator, 6 nodes, 1 GTM.
When I run this query, it takes 23 sec in total.
But when I run the same query on each individual node, it takes 4 sec on every node.
Since it's a cluster, it should ideally take 4 sec plus some overhead to combine the data from each node on the coordinator (2 more sec at most), but I don't understand why it takes 23 sec when run from the coordinator.
Nirmal
|
|
From: Mason S. <ms...@tr...> - 2014-02-01 19:17:00
|
Can you please add an EXPLAIN in front of your SELECT to look at the plan? If you only use 2 tables instead of 3, does it behave more as expected?
|
|
From: Nirmal S. <sha...@gm...> - 2014-02-01 19:40:28
|
Hi Mason,
This is the actual query that I was running.
select coalesce(fgpc.date_id, fgcd.date_id) date_id,
       fgpc.m_ad_grp_pub_key m_ad_grp_pub_key,
       fgpc.m_kw_pub_key m_kw_pub_key,
       kws.expr_names,
       kws.expr_values,
       kws.m_ad_grp_semid,
       sum(fgpc.m_imps) m_imps,
       sum(fgpc.m_clicks) m_clicks,
       sum(fgpc.m_cost) m_cost,
       sum(fgpc.m_conv_1pc) m_conv_1pc,
       sum(fgpc.m_conv_mpc) m_conv_mpc,
       avg(fgpc.m_cnv_rate_1pc) m_cnv_rate_1pc,
       avg(fgpc.m_cnv_rate_mpc) m_cnv_rate_mpc,
       avg(fgpc.m_avg_cpc) m_avg_cpc,
       avg(fgpc.m_max_cpc) m_max_cpc,
       avg(fgpc.m_firstpage_cpc) m_firstpage_cpc,
       avg(fgpc.m_topofpage_cpc) m_topofpage_cpc,
       avg(fgpc.m_avg_cpm) m_avg_cpm,
       avg(fgpc.m_max_cpm) m_max_cpm,
       avg(fgpc.m_max_cpa_pct) m_max_cpa_pct,
       avg(fgpc.m_avg_pos) m_avg_pos,
       avg(fgpc.m_lowest_pos) m_lowest_pos,
       avg(fgpc.m_highest_pos) m_highest_pos,
       avg(fgpc.m_quality_score) m_quality_score,
       avg(fgpc.m_view_thru_conv) m_view_thru_conv,
       sum(fgcd.m_revenue) m_revenue,
       sum(fgcd.m_conversions) m_conversions,
       sum(coalesce(kws.m_new_kw_bid, kws.m_kw_bid)) m_kw_total_bid,
       max(coalesce(kws.m_new_kw_bid, kws.m_kw_bid)) m_kw_max_bid,
       min(coalesce(kws.m_new_kw_bid, kws.m_kw_bid)) m_kw_min_bid
from bidw.fact_msn_kw_perf_daily fgpc
full outer join bidw.fact_msn_kw_conversion_daily fgcd
  on fgpc.m_ad_grp_pub_key = fgcd.m_ad_grp_pub_key
 and fgpc.m_kw_pub_key = fgcd.m_kw_pub_key
 and fgpc.date_id = fgpc.date_id
join biods.msn_keyword_sup kws
  on fgpc.m_kw_pub_key = kws.m_kw_pub_key
 and fgpc.m_ad_grp_pub_key = kws.m_ad_grp_pub_key
where coalesce(fgpc.date_id, fgcd.date_id) between 20131201 and 20140119
group by coalesce(fgpc.date_id, fgcd.date_id),
         fgpc.m_ad_grp_pub_key,
         fgpc.m_kw_pub_key,
         kws.expr_names,
         kws.expr_values,
         kws.m_ad_grp_semid
;
*This is the EXPLAIN plan for the same query.*
explain analyze verbose select ......
....
* Data Node Scan on "__REMOTE_GROUP_QUERY__" (cost=0.00..2.50 rows=1000
width=908) (actual time=1672.149..8281.918 rows=605575 loops=1)*
Output: (COALESCE((fgpc.date_id)::bigint, fgcd.date_id)),
fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, kws.expr_names, kws.expr_values,
kws.m_ad_grp_semid, (sum(fgpc.m_imps)), (sum(fgpc.m_clicks)),
(sum(fgpc.m_cost)), (sum(fgpc.m_conv_1pc)), (sum(fgpc.m_conv_mpc)),
(avg(fgpc.m_cnv_rate_1pc)), (avg(fgpc.m_cnv_rate_mpc)),
(avg(fgpc.m_avg_cpc)), (avg(fgpc.m_max_cpc)),
(avg(fgpc.m_firstpage_cpc)), (avg(fgpc.m_topofpage_cpc)),
(avg(fgpc.m_avg_cpm)), (avg(fgpc.m_max_cpm)), (avg(fgpc.m_max_cpa_pct)),
(avg(fgpc.m_avg_pos)), (avg(fgpc.m_lowest_pos)), (avg(fgpc.m_highest_pos)),
(avg(fgpc.m_quality_score)), (avg(fgpc.m_view_thru_conv)),
(sum(fgcd.m_revenue)), (sum(fgcd.m_conversions)),
(sum(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))),
(max(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))),
(min(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid)))
*Node/s: d11, d12, d13, d14, d15, d16*
Remote query: SELECT COALESCE((l.a_1)::bigint, l.a_23), l.a_2, l.a_3,
r.a_1, r.a_2, r.a_3, sum(l.a_4), sum(l.a_5), sum(l.a_6), sum(l.a_7),
sum(l.a_8), pg_catalog.numeric_avg(avg(l.a_9)),
pg_catalog.numeric_avg(avg(l.a_10)), pg_catalog.numeric_avg(avg(l.a_11)),
pg_catalog.numeric_avg(avg(l.a_12)), pg_catalog.numeric_avg(avg(l.a_13)),
pg_catalog.numeric_avg(avg(l.a_14)),
pg_catalog.numeric_avg(avg(l.a_15)), pg_catalog.numeric_avg(avg(l.a_16)),
pg_catalog.numeric_avg(avg(l.a_17)), pg_catalog.numeric_avg(avg(l.a_18)),
pg_catalog.numeric_avg(avg(l.a_19)), pg_catalog.numeric_avg(avg(l.a_20)),
pg_catalog.numeric_avg(avg(l.a_21)), pg_catalog.numeric_avg(avg(l.a_22)),
sum(l.a_24), sum(l.a_25),
sum(COALESCE(r.a_4, r.a_5)), max(COALESCE(r.a_4, r.a_5)),
min(COALESCE(r.a_4, r.a_5)) FROM ((SELECT l.a_1, l.a_2, l.a_3, l.a_4,
l.a_5, l.a_6, l.a_7, l.a_8, l.a_9, l.a_10, l.a_11, l.a_12, l.a_13, l.a_14,
l.a_15, l.a_16, l.a_17, l.a_18, l.a_19, l.a_20, l.a_21, l.a_22, r.a_1,
r.a_2, r.a_3 FROM ((SELECT fgpc.date_id,
fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, fgpc.m_imps, fgpc.m_clicks,
fgpc.m_cost, fgpc.m_conv_1pc, fgpc.m_conv_mpc, fgpc.m_cnv_rate_1pc,
fgpc.m_cnv_rate_mpc, fgpc.m_avg_cpc, fgpc.m_max_cpc, fgpc.m_firstpage_cpc,
fgpc.m_topofpage_cpc, fgpc.m_avg_cpm, fgpc.m_max_cpm, fgpc.m_max_cpa_pct,
fgpc.m_avg_pos,
fgpc.m_lowest_pos, fgpc.m_highest_pos, fgpc.m_quality_score,
fgpc.m_view_thru_conv FROM ONLY bidw.fact_msn_kw_perf_daily fgpc WHERE
true) l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12,
a_13, a_14, a_15, a_16, a_17, a_18, a_19,
a_20, a_21, a_22) LEFT JOIN (SELECT fgcd.date_id, fgcd.m_revenue,
fgcd.m_conversions, fgcd.m_ad_grp_pub_key, fgcd.m_kw_pub_key FROM ONLY
bidw.fact_msn_kw_conversion_daily fgcd WHERE true) r(a_1, a_2, a_3, a_4,
a_5) ON (((l.a_1 = l.a_1) AND (l.a_2 = r.a_4) AND (l.a_3 = r.a_5)))) WHERE
((COALESCE((l.a_1)::bigint,
r.a_1) >= 20131201) AND (COALESCE((l.a_1)::bigint, r.a_1) <= 20140119)))
l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13,
a_14, a_15, a_16, a_17, a_18, a_19, a_20, a_21, a_22, a_23, a_24, a_25)
JOIN (SELECT kws.expr_names, kws.expr_values, kws.m_ad_grp_semid,
kws.m_new_kw_bid, kws.m_kw_bid, kws.m_kw_pub_key, kws.m_ad_grp_pub_key
FROM ONLY biods.msn_keyword_sup
kws WHERE true) r(a_1, a_2, a_3, a_4, a_5, a_6, a_7) ON (true)) WHERE
((l.a_2 = r.a_7) AND (l.a_3 = r.a_6)) GROUP BY 1, 2, 3, 4, 5, 6
* Total runtime: 8378.080 ms*
(5 rows)
*This is the actual time taken by the query:*
[postgres@sv4-pgxc-db04 test]$ time psql -d mydb -f "test.sql" > a.out
*real 0m23.533s*
user 0m15.705s
sys 0m0.748s
Now I don't know why it is taking that much time.
Nirmal
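The numbers in this message can be put side by side with a little arithmetic. The sketch below is one possible reading, not a diagnosis confirmed anywhere in the thread. It relies on two general facts: EXPLAIN ANALYZE runs the query but does not send the result rows to the client, and the `user` figure reported by `time` is CPU time consumed by the timed process (here psql), not by the server. The helper name `parse_time_field` is hypothetical.

```python
def parse_time_field(text):
    """Convert a `time` field such as '0m23.533s' into seconds."""
    minutes, seconds = text.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


real = parse_time_field("0m23.533s")   # wall-clock time of the whole psql run
user = parse_time_field("0m15.705s")   # user-mode CPU time of the psql process
sys_ = parse_time_field("0m0.748s")    # kernel-mode CPU time of the psql process
server_ms = 8378.080                   # "Total runtime" reported by EXPLAIN ANALYZE

outside_server = real - server_ms / 1000.0   # time not covered by the EXPLAIN runtime
client_cpu = user + sys_                     # CPU burned on the client side

print(f"outside EXPLAIN ANALYZE runtime: {outside_server:.3f} s")
print(f"psql client CPU (user + sys):    {client_cpu:.3f} s")
```

Roughly 15 of the 23.5 seconds fall outside the measured server runtime, and psql itself used a similar amount of CPU. That would be consistent with much of the extra time going into receiving and formatting the 605575 result rows on the client, though the thread does not establish this.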
|
|
From: Mason S. <ms...@tr...> - 2014-02-01 21:47:11
|
Try adding LIMIT with different amounts for example to see how that impacts time.

Also, try enabling statement logging (log_statement = all in postgresql.conf) on the data nodes to see how long it takes on each node.

Also, the statement was rewritten in XC, with relations converted into SELECTs, so try running the rewritten version directly to see how long it takes.

Thanks,
Mason
|
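The LIMIT experiment suggested above could be scripted along the following lines. This is a minimal sketch, assuming the query text can simply take a trailing LIMIT clause; `with_limit`, `time_limits` and the `run_query` callable are hypothetical names, and `run_query` stands in for whatever actually executes SQL against the coordinator (for example a thin wrapper around a database driver or around psql).

```python
import time


def with_limit(sql, limit):
    """Return the query with a trailing LIMIT clause appended."""
    body = sql.strip().rstrip(";")
    return f"{body} LIMIT {int(limit)};"


def time_limits(run_query, sql, limits):
    """Time run_query once per LIMIT value; run_query is any callable taking SQL text."""
    timings = {}
    for limit in limits:
        start = time.perf_counter()
        run_query(with_limit(sql, limit))
        timings[limit] = time.perf_counter() - start
    return timings
```

Calling `time_limits(run_query, query_text, [1000, 10000, 100000])` returns a dictionary mapping each LIMIT to elapsed seconds. If the elapsed time grows with the number of rows returned, that points at result transfer and output handling rather than at query execution on the datanodes.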
|
From: Nirmal S. <sha...@gm...> - 2014-02-03 22:07:52
|
Hi All,
I tried with log_statement enabled on all the nodes and the coordinator, and I
got this:
--This is the coordinator log
*LOG: duration: 8807.961 ms* statement: select
coalesce(fgpc.date_id,fgcd.date_id) date_id,
fgpc.m_ad_grp_pub_key
m_ad_grp_pub_key,
fgpc.m_kw_pub_key m_kw_pub_key,
kws.expr_names,
kws.expr_values,
kws.m_ad_grp_semid,
sum(fgpc.m_imps) m_imps,
sum(fgpc.m_clicks) m_clicks,
sum(fgpc.m_cost) m_cost,
sum(fgpc.m_conv_1pc) m_conv_1pc,
sum(fgpc.m_conv_mpc) m_conv_mpc,
avg(fgpc.m_cnv_rate_1pc)
m_cnv_rate_1pc,
avg(fgpc.m_cnv_rate_mpc)
m_cnv_rate_mpc,
avg(fgpc.m_avg_cpc) m_avg_cpc,
avg(fgpc.m_max_cpc) m_max_cpc,
avg(fgpc.m_firstpage_cpc)
m_firstpage_cpc,
avg(fgpc.m_topofpage_cpc)
m_topofpage_cpc,
avg(fgpc.m_avg_cpm) m_avg_cpm,
avg(fgpc.m_max_cpm) m_max_cpm,
avg(fgpc.m_max_cpa_pct)
m_max_cpa_pct,
avg(fgpc.m_avg_pos) m_avg_pos,
avg(fgpc.m_lowest_pos)
m_lowest_pos,
avg(fgpc.m_highest_pos)
m_highest_pos,
avg(fgpc.m_quality_score)
m_quality_score,
avg(fgpc.m_view_thru_conv)
m_view_thru_conv,
sum(fgcd.m_revenue) m_revenue,
sum(fgcd.m_conversions)
m_conversions,
sum(coalesce(kws.m_new_kw_bid,
kws.m_kw_bid)) m_kw_total_bid,
max(coalesce(kws.m_new_kw_bid,
kws.m_kw_bid)) m_kw_max_bid,
min(coalesce(kws.m_new_kw_bid,
kws.m_kw_bid)) m_kw_min_bid
from
bidw.fact_msn_kw_perf_daily fgpc
full outer join bidw.fact_msn_kw_conversion_daily fgcd
on fgpc.m_ad_grp_pub_key = fgcd.m_ad_grp_pub_key and fgpc.m_kw_pub_key =
fgcd.m_kw_pub_key and fgpc.date_id = fgpc.date_id
join biods.msn_keyword_sup kws
on fgpc.m_kw_pub_key = kws.m_kw_pub_key and fgpc.m_ad_grp_pub_key =
kws.m_ad_grp_pub_key
where
coalesce(fgpc.date_id,fgcd.date_id) between
20131201 and 20140119
group by
coalesce(fgpc.date_id,fgcd.date_id),fgpc.m_ad_grp_pub_key,fgpc.m_kw_pub_key,kws.expr_names,kws.expr_values,kws.m_ad_grp_semid
*And this is the log info from all the data nodes' log files:*
*LOG: duration: 8387.136 ms* statement: SELECT COALESCE((l.a_1)::bigint,
l.a_23), l.a_2, l.a_3, r.a_1, r.a_2, r.a_3, sum(l.a_4), sum(l.a_5),
sum(l.a_6), sum(l.a_7), sum(l.a_8), pg_catalog.numeric_avg(avg(l.a_9)),
pg_catalog.numeric_avg(avg(l.a_10)), pg_catalog.numeric_avg(avg(l.a_11)),
pg_catalog.numeric_avg(avg(l.a_12)), pg_catalog.numeric_avg(avg(l.a_13)),
pg_catalog.numeric_avg(avg(l.a_14)), pg_catalog.numeric_avg(avg(l.a_15)),
pg_catalog.numeric_avg(avg(l.a_16)), pg_catalog.numeric_avg(avg(l.a_17)),
pg_catalog.numeric_avg(avg(l.a_18)), pg_catalog.numeric_avg(avg(l.a_19)),
pg_catalog.numeric_avg(avg(l.a_20)), pg_catalog.numeric_avg(avg(l.a_21)),
pg_catalog.numeric_avg(avg(l.a_22)), sum(l.a_24), sum(l.a_25),
sum(COALESCE(r.a_4, r.a_5)), max(COALESCE(r.a_4, r.a_5)),
min(COALESCE(r.a_4, r.a_5)) FROM ((SELECT l.a_1, l.a_2, l.a_3, l.a_4,
l.a_5, l.a_6, l.a_7, l.a_8, l.a_9, l.a_10, l.a_11, l.a_12, l.a_13, l.a_14,
l.a_15, l.a_16, l.a_17, l.a_18, l.a_19, l.a_20, l.a_21, l.a_22, r.a_1,
r.a_2, r.a_3 FROM ((SELECT fgpc.date_id, fgpc.m_ad_grp_pub_key,
fgpc.m_kw_pub_key, fgpc.m_imps, fgpc.m_clicks, fgpc.m_cost,
fgpc.m_conv_1pc, fgpc.m_conv_mpc, fgpc.m_cnv_rate_1pc, fgpc.m_cnv_rate_mpc,
fgpc.m_avg_cpc, fgpc.m_max_cpc, fgpc.m_firstpage_cpc, fgpc.m_topofpage_cpc,
fgpc.m_avg_cpm, fgpc.m_max_cpm, fgpc.m_max_cpa_pct, fgpc.m_avg_pos,
fgpc.m_lowest_pos, fgpc.m_highest_pos, fgpc.m_quality_score,
fgpc.m_view_thru_conv FROM ONLY bidw.fact_msn_kw_perf_daily fgpc WHERE
true) l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12,
a_13, a_14, a_15, a_16, a_17, a_18, a_19, a_20, a_21, a_22) LEFT JOIN
(SELECT fgcd.date_id, fgcd.m_revenue, fgcd.m_conversions,
fgcd.m_ad_grp_pub_key, fgcd.m_kw_pub_key FROM ONLY
bidw.fact_msn_kw_conversion_daily fgcd WHERE true) r(a_1, a_2, a_3, a_4,
a_5) ON (((l.a_1 = l.a_1) AND (l.a_2 = r.a_4) AND (l.a_3 = r.a_5)))) WHERE
((COALESCE((l.a_1)::bigint, r.a_1) >= 20131201) AND
(COALESCE((l.a_1)::bigint, r.a_1) <= 20140119))) l(a_1, a_2, a_3, a_4, a_5,
a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13, a_14, a_15, a_16, a_17, a_18,
a_19, a_20, a_21, a_22, a_23, a_24, a_25) JOIN (SELECT kws.expr_names,
kws.expr_values, kws.m_ad_grp_semid, kws.m_new_kw_bid, kws.m_kw_bid,
kws.m_kw_pub_key, kws.m_ad_grp_pub_key FROM ONLY biods.msn_keyword_sup kws
WHERE true) r(a_1, a_2, a_3, a_4, a_5, a_6, a_7) ON (true)) WHERE ((l.a_2 =
r.a_7) AND (l.a_3 = r.a_6)) GROUP BY 1, 2, 3, 4, 5, 6
So as per the query log, everything looks fine, i.e. the coordinator is
working the way it should.
*But then why does the statement below take 23 sec? (test.sql contains
the same query that is shown above.)*
[postgres@sv4-pgxc-db04 test]$ time psql -d adchemy11100 -f "test.sql" > /dev/null
*real 0m23.394s*
user 0m15.900s
sys 0m0.645s
Please advise.
Nirmal
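One way to read the numbers above: the user and sys figures printed by `time` are CPU time consumed by the psql client process itself, not by the coordinator or the data nodes, which run as separate server processes. The sketch below simply splits the reported wall-clock time into client CPU and everything else. The helper name to_seconds is made up for this illustration, and the split is approximate because client work and server work can overlap.

```python
import re


def to_seconds(stamp):
    """Convert a bash `time` value such as '0m23.394s' to seconds."""
    minutes, seconds = re.fullmatch(r"(\d+)m([\d.]+)s", stamp).groups()
    return int(minutes) * 60 + float(seconds)


# Values copied from the `time psql ...` output in this message.
real = to_seconds("0m23.394s")
user = to_seconds("0m15.900s")
sys_ = to_seconds("0m0.645s")

client_cpu = user + sys_      # CPU burned inside psql (parsing, formatting rows)
off_cpu = real - client_cpu   # psql not on CPU: mostly waiting for the server
datanode_s = 8387.136 / 1000  # duration from the data node log line

print(f"psql CPU time:      {client_cpu:.3f} s")
print(f"psql off-CPU time:  {off_cpu:.3f} s")
print(f"data node duration: {datanode_s:.3f} s")
```

Read this way, roughly 16.5 of the 23.4 seconds are spent in the client, and the remainder is in the same range as the 8.4 seconds logged by the data nodes.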
On Sat, Feb 1, 2014 at 1:47 PM, Mason Sharp <ms...@tr...> wrote:
>
>
>
> On Sat, Feb 1, 2014 at 2:40 PM, Nirmal Sharma <sha...@gm...>wrote:
>
>> Hi Mason,
>>
>> This is the actual query that i was running.
>>
>> select coalesce(fgpc.date_id,fgcd.date_id)
>> date_id,
>> fgpc.m_ad_grp_pub_key m_ad_grp_pub_key,
>> fgpc.m_kw_pub_key m_kw_pub_key,
>> kws.expr_names,
>> kws.expr_values,
>> kws.m_ad_grp_semid,
>> sum(fgpc.m_imps) m_imps,
>> sum(fgpc.m_clicks) m_clicks,
>> sum(fgpc.m_cost) m_cost,
>>
>> sum(fgpc.m_conv_1pc) m_conv_1pc,
>> sum(fgpc.m_conv_mpc) m_conv_mpc,
>> avg(fgpc.m_cnv_rate_1pc) m_cnv_rate_1pc,
>> avg(fgpc.m_cnv_rate_mpc) m_cnv_rate_mpc,
>> avg(fgpc.m_avg_cpc) m_avg_cpc,
>> avg(fgpc.m_max_cpc) m_max_cpc,
>> avg(fgpc.m_firstpage_cpc)
>> m_firstpage_cpc,
>> avg(fgpc.m_topofpage_cpc)
>> m_topofpage_cpc,
>> avg(fgpc.m_avg_cpm) m_avg_cpm,
>> avg(fgpc.m_max_cpm) m_max_cpm,
>> avg(fgpc.m_max_cpa_pct) m_max_cpa_pct,
>> avg(fgpc.m_avg_pos) m_avg_pos,
>> avg(fgpc.m_lowest_pos) m_lowest_pos,
>> avg(fgpc.m_highest_pos) m_highest_pos,
>> avg(fgpc.m_quality_score)
>> m_quality_score,
>> avg(fgpc.m_view_thru_conv)
>> m_view_thru_conv,
>> sum(fgcd.m_revenue) m_revenue,
>> sum(fgcd.m_conversions) m_conversions,
>> sum(coalesce(kws.m_new_kw_bid,
>> kws.m_kw_bid)) m_kw_total_bid,
>> max(coalesce(kws.m_new_kw_bid,
>> kws.m_kw_bid)) m_kw_max_bid,
>> min(coalesce(kws.m_new_kw_bid,
>> kws.m_kw_bid)) m_kw_min_bid
>> from
>> bidw.fact_msn_kw_perf_daily fgpc
>> full outer join bidw.fact_msn_kw_conversion_daily fgcd on
>> fgpc.m_ad_grp_pub_key = fgcd.m_ad_grp_pub_key and fgpc.m_kw_pub_key =
>> fgcd.m_kw_pub_key and fgpc.date_id = fgpc.date_id
>> join biods.msn_keyword_sup kws on
>> fgpc.m_kw_pub_key = kws.m_kw_pub_key and fgpc.m_ad_grp_pub_key =
>> kws.m_ad_grp_pub_key
>> where
>> coalesce(fgpc.date_id,fgcd.date_id) between
>> 20131201 and 20140119
>> group by
>>
>> coalesce(fgpc.date_id,fgcd.date_id),fgpc.m_ad_grp_pub_key,fgpc.m_kw_pub_key,kws.expr_names,kws.expr_values,kws.m_ad_grp_semid
>> ;
>>
>> *This is the explain plan for the same.*
>> explain analyze verbose select ......
>> ....
>>
>> * Data Node Scan on "__REMOTE_GROUP_QUERY__" (cost=0.00..2.50 rows=1000
>> width=908) (actual time=1672.149..8281.918 rows=605575 loops=1)*
>> Output: (COALESCE((fgpc.date_id)::bigint, fgcd.date_id)),
>> fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, kws.expr_names, kws.expr_values,
>> kws.m_ad_grp_semid, (sum(fgpc.m_imps)), (sum(fgpc.m_clicks)),
>> (sum(fgpc.m_cost)), (sum(fgpc.m_conv_1pc)), (sum(fgpc.m_conv_mpc)),
>> (avg(fgpc.m_cnv_rate_1pc)), (avg(fgpc.m_cnv_rate_mpc)),
>> (avg(fgpc.m_avg_cpc)), (avg(fgpc.m_max_cpc)),
>> (avg(fgpc.m_firstpage_cpc)), (avg(fgpc.m_topofpage_cpc)),
>> (avg(fgpc.m_avg_cpm)), (avg(fgpc.m_max_cpm)), (avg(fgpc.m_max_cpa_pct)),
>> (avg(fgpc.m_avg_pos)), (avg(fgpc.m_lowest_pos)), (avg(fgpc.m_highest_pos)),
>> (avg(fgpc.m_quality_score)), (avg(fgpc.m_view_thru_conv)),
>> (sum(fgcd.m_revenue)), (sum(fgcd.m_conversions)),
>> (sum(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))),
>> (max(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))),
>> (min(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid)))
>> *Node/s: d11, d12, d13, d14, d15, d16*
>> Remote query: SELECT COALESCE((l.a_1)::bigint, l.a_23), l.a_2, l.a_3,
>> r.a_1, r.a_2, r.a_3, sum(l.a_4), sum(l.a_5), sum(l.a_6), sum(l.a_7),
>> sum(l.a_8), pg_catalog.numeric_avg(avg(l.a_9)),
>> pg_catalog.numeric_avg(avg(l.a_10)), pg_catalog.numeric_avg(avg(l.a_11)),
>> pg_catalog.numeric_avg(avg(l.a_12)),
>> pg_catalog.numeric_avg(avg(l.a_13)), pg_catalog.numeric_avg(avg(l.a_14)),
>> pg_catalog.numeric_avg(avg(l.a_15)), pg_catalog.numeric_avg(avg(l.a_16)),
>> pg_catalog.numeric_avg(avg(l.a_17)), pg_catalog.numeric_avg(avg(l.a_18)),
>> pg_catalog.numeric_avg(avg(l.a_19)), pg_catalog.numeric_avg(avg(l.a_20)),
>> pg_catalog.numeric_avg(avg(l.a_21)),
>> pg_catalog.numeric_avg(avg(l.a_22)), sum(l.a_24), sum(l.a_25),
>> sum(COALESCE(r.a_4, r.a_5)), max(COALESCE(r.a_4, r.a_5)),
>> min(COALESCE(r.a_4, r.a_5)) FROM ((SELECT l.a_1, l.a_2, l.a_3, l.a_4,
>> l.a_5, l.a_6, l.a_7, l.a_8, l.a_9, l.a_10, l.a_11, l.a_12, l.a_13, l.a_14,
>> l.a_15, l.a_16, l.a_17, l.a_18, l.a_19,
>> l.a_20, l.a_21, l.a_22, r.a_1, r.a_2, r.a_3 FROM ((SELECT
>> fgpc.date_id, fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, fgpc.m_imps,
>> fgpc.m_clicks, fgpc.m_cost, fgpc.m_conv_1pc, fgpc.m_conv_mpc,
>> fgpc.m_cnv_rate_1pc, fgpc.m_cnv_rate_mpc, fgpc.m_avg_cpc, fgpc.m_max_cpc,
>> fgpc.m_firstpage_cpc, fgpc.m_topofpage_cpc,
>> fgpc.m_avg_cpm, fgpc.m_max_cpm, fgpc.m_max_cpa_pct, fgpc.m_avg_pos,
>> fgpc.m_lowest_pos, fgpc.m_highest_pos, fgpc.m_quality_score,
>> fgpc.m_view_thru_conv FROM ONLY bidw.fact_msn_kw_perf_daily fgpc WHERE
>> true) l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12,
>> a_13, a_14, a_15, a_16, a_17, a_18, a_19,
>> a_20, a_21, a_22) LEFT JOIN (SELECT fgcd.date_id, fgcd.m_revenue,
>> fgcd.m_conversions, fgcd.m_ad_grp_pub_key, fgcd.m_kw_pub_key FROM ONLY
>> bidw.fact_msn_kw_conversion_daily fgcd WHERE true) r(a_1, a_2, a_3, a_4,
>> a_5) ON (((l.a_1 = l.a_1) AND (l.a_2 = r.a_4) AND (l.a_3 = r.a_5)))) WHERE
>> ((COALESCE((l.a_1)::bigint,
>> r.a_1) >= 20131201) AND (COALESCE((l.a_1)::bigint, r.a_1) <= 20140119)))
>> l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13,
>> a_14, a_15, a_16, a_17, a_18, a_19, a_20, a_21, a_22, a_23, a_24, a_25)
>> JOIN (SELECT kws.expr_names, kws.expr_values, kws.m_ad_grp_semid,
>> kws.m_new_kw_bid, kws.m_kw_bid,
>> kws.m_kw_pub_key, kws.m_ad_grp_pub_key FROM ONLY biods.msn_keyword_sup
>> kws WHERE true) r(a_1, a_2, a_3, a_4, a_5, a_6, a_7) ON (true)) WHERE
>> ((l.a_2 = r.a_7) AND (l.a_3 = r.a_6)) GROUP BY 1, 2, 3, 4, 5, 6
>> * Total runtime: 8378.080 ms*
>> (5 rows)
>>
>>
>>
>> *This is the actual time taken by the query:*
>>
>> [postgres@sv4-pgxc-db04 test]$ time psql -d mydb -f "test.sql" > a.out
>>
>> *real 0m23.533s*
>> user 0m15.705s
>> sys 0m0.748s
>>
>> Now I don't know why it is taking that much time.
>>
>
> Try adding LIMIT with different amounts for example to see how that
> impacts time.
>
> Also, try enabling statement logging (log_statement = all in
> postgresql.conf) on the data nodes to see how long it takes on each node.
>
> Also, the statement was rewritten in XC, with relations converted into
> SELECTs, so try running the rewritten version directly to see how long it
> takes.
>
> Thanks,
>
> Mason
>
>
|
|
From: Nirmal S. <sha...@gm...> - 2014-02-01 23:27:46
|
Hi,

These are the timings after adding LIMIT with different amounts.
From these timings you can see that the bottleneck here is the coordinator
(i.e. retrieving data from the various nodes to the coordinator).
I just want to ask whether this is normal or not.

---for limit 1000
[postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out

real 0m1.935s
user 0m0.051s
sys 0m0.002s

---for limit 10000
[postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out

real 0m2.724s
user 0m0.481s
sys 0m0.023s

---for limit 100000
[postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out

real 0m12.102s
user 0m3.139s
sys 0m0.146s

---for limit 200000
[postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out

real 0m13.078s
user 0m5.507s
sys 0m0.316s

---for limit 400000
[postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out

real 0m18.820s
user 0m10.482s
sys 0m0.659s

---for limit 600000
[postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out

real 0m23.478s
user 0m15.631s
sys 0m0.940s

I will also enable statement logging, try again, and send the output soon.

Nirmal
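The LIMIT timings above can be split the same way for every run: user plus sys is CPU time of the psql client, and the rest of the wall-clock time is time psql spent off the CPU, mostly waiting for rows. The sketch below is plain arithmetic on the six reported measurements. The per-row figure assumes each run actually returned as many rows as its LIMIT, which holds here only because the unlimited query returns 605575 rows.

```python
# (limit, real, user, sys) in seconds, copied from the `time` output above.
runs = [
    (1000, 1.935, 0.051, 0.002),
    (10000, 2.724, 0.481, 0.023),
    (100000, 12.102, 3.139, 0.146),
    (200000, 13.078, 5.507, 0.316),
    (400000, 18.820, 10.482, 0.659),
    (600000, 23.478, 15.631, 0.940),
]

breakdown = []
for limit, real, user, sys_ in runs:
    cpu = user + sys_                   # CPU time used by psql itself
    wait = real - cpu                   # wall-clock time psql was not on CPU
    micros_per_row = cpu / limit * 1e6  # client CPU per returned row
    breakdown.append((limit, cpu, wait, micros_per_row))
    print(
        f"{limit:>7} rows  client cpu={cpu:7.3f}s  "
        f"wait={wait:6.3f}s  cpu/row={micros_per_row:5.1f}us"
    )
```

From 100000 rows upward the waiting part stays between about 7 and 9 seconds while the client CPU part keeps growing with the row count, which suggests that a large share of the 23 seconds is spent in psql handling the result rather than in the coordinator. If that reading is right, psql's unaligned, tuples-only output (`-A -t`) or a `FETCH_COUNT` setting may reduce the client-side cost; this has not been tested against this cluster.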
|
From: Koichi S. <koi...@gm...> - 2014-02-02 03:44:48
|
Could you share the "explain" result so that we can see how the plan works?

Regards;
---
Koichi Suzuki
|
From: Nirmal S. <sha...@gm...> - 2014-02-02 05:49:44
|
This is the explain plan for the query with limit 10000.
Limit (cost=0.00..2.50 rows=1000 width=908) (actual
time=1586.926..1836.081 rows=10000 loops=1)
Output: (COALESCE((fgpc.date_id)::bigint, fgcd.date_id)),
fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, kws.expr_names, kws.expr_values,
kws.m_ad_grp_semid, (sum(fgpc.m_imps)), (sum(fgpc.m_clicks)),
(sum(fgpc.m_cost)), (sum(fgpc.m_conv_1pc)), (sum(fgpc.m_conv_mpc)),
(avg(fgpc.m_cnv_rate_1pc)), (avg(fgpc.m_cnv_rate_mpc)), (avg(fgpc.m_avg_cpc)),
(avg(fgpc.m_max_cpc)), (avg(fgpc.m_firstpage_cpc)),
(avg(fgpc.m_topofpage_cpc)), (avg(fgpc.m_avg_cpm)), (avg(fgpc.m_max_cpm)),
(avg(fgpc.m_max_cpa_pct)), (avg(fgpc.m_avg_pos)), (avg(fgpc.m_lowest_pos)),
(avg(fgpc.m_highest_pos)), (avg(fgpc.m_quality_score)),
(avg(fgpc.m_view_thru_conv)), (sum(fgcd.m_revenue)),
(sum(fgcd.m_conversions)), (sum(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))),
(max(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))),
(min(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid)))
-> Data Node Scan on "__REMOTE_LIMIT_QUERY__" (cost=0.00..2.50
rows=1000 width=908) (actual time=1586.924..1834.118 rows=10000 loops=1)
Output: (COALESCE((fgpc.date_id)::bigint, fgcd.date_id)),
fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, kws.expr_names, kws.expr_values,
kws.m_ad_grp_semid, (sum(fgpc.m_imps)), (sum(fgpc.m_clicks)),
(sum(fgpc.m_cost)), (sum(fgpc.m_conv_1pc)), (sum(fgpc.m_conv_mpc)),
(avg(fgpc.m_cnv_rate_1pc)), (avg(fgpc.m_cnv_rate_mpc)),
(avg(fgpc.m_avg_cpc)), (avg(fgpc.m_max_cpc)), (avg(fgpc.m_firstpage_cpc)),
(avg(fgpc.m_topofpage_cpc)), (avg(fgpc.m_avg_cpm)), (avg(fgpc.m_max_cpm)),
(avg(fgpc.m_max_cpa_pct)), (avg(fgpc.m_avg_pos)), (avg(fgpc.m_lowest_pos)),
(avg(fgpc.m_highest_pos)), (avg(fgpc.m_quality_score)),
(avg(fgpc.m_view_thru_conv)), (sum(fgcd.m_revenue)),
(sum(fgcd.m_conversions)), (sum(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))),
(max(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))),
(min(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid)))
Node/s: d11, d12, d13, d14, d15, d16
Remote query: SELECT COALESCE((l.a_1)::bigint, l.a_23), l.a_2,
l.a_3, r.a_1, r.a_2, r.a_3, sum(l.a_4), sum(l.a_5), sum(l.a_6), sum(l.a_7),
sum(l.a_8), pg_catalog.numeric_avg(avg(l.a_9)),
pg_catalog.numeric_avg(avg(l.a_10)), pg_catalog.numeric_avg(avg(l.a_11)),
pg_catalog.numeric_avg(avg(l.a_12)), pg_catalog.numeric_avg(avg(l.a_13)),
pg_catalog.numeric_avg(avg(l.a_14)), pg_catalog.numeric_avg(avg(l.a_15)),
pg_catalog.numeric_avg(avg(l.a_16)), pg_catalog.numeric_avg(avg(l.a_17)),
pg_catalog.numeric_avg(avg(l.a_18)),
pg_catalog.numeric_avg(avg(l.a_19)), pg_catalog.numeric_avg(avg(l.a_20)),
pg_catalog.numeric_avg(avg(l.a_21)), pg_catalog.numeric_avg(avg(l.a_22)),
sum(l.a_24), sum(l.a_25), sum(COALESCE(r.a_4, r.a_5)), max(COALESCE(r.a_4,
r.a_5)), min(COALESCE(r.a_4, r.a_5)) FROM ((SELECT l.a_1, l.a_2,
l.a_3, l.a_4, l.a_5, l.a_6, l.a_7, l.a_8, l.a_9, l.a_10, l.a_11, l.a_12,
l.a_13, l.a_14, l.a_15, l.a_16, l.a_17, l.a_18, l.a_19, l.a_20, l.a_21,
l.a_22, r.a_1, r.a_2, r.a_3 FROM ((SELECT fgpc.date_id,
fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, fgpc.m_imps, fgpc.m_clicks,
fgpc.m_cost, fgpc.m_conv_1pc, fgpc.m_conv_mpc, fgpc.m_cnv_rate_1pc,
fgpc.m_cnv_rate_mpc, fgpc.m_avg_cpc, fgpc.m_max_cpc, fgpc.m_firstpage_cpc,
fgpc.m_topofpage_cpc, fgpc.m_avg_cpm, fgpc.m_max_cpm, fgpc.m_max_cpa_pct,
fgpc.m_avg_pos, fgpc.m_lowest_pos, fgpc.m_highest_pos,
fgpc.m_quality_score, fgpc.m_view_thru_conv FROM ONLY
bidw.fact_msn_kw_perf_daily fgpc WHERE true) l(a_1, a_2, a_3, a_4, a_5,
a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13, a_14, a_15, a_16, a_17, a_18,
a_19, a_20, a_21, a_22) LEFT JOIN (SELECT fgcd.date_id, fgcd.m_revenue,
fgcd.m_conversions, fgcd.m_ad_grp_pub_key, fgcd.m_kw_pub_key FROM ONLY
bidw.fact_msn_kw_conversion_daily fgcd WHERE true) r(a_1, a_2, a_3, a_4,
a_5) ON (((l.a_1 = l.a_1) AND (l.a_2 = r.a_4) AND (l.a_3 = r.a_5)))) WHERE
((COALESCE((l.a_1)::bigint, r.a_1) >= 20131201) AND (
COALESCE((l.a_1)::bigint, r.a_1) <= 20140119))) l(a_1, a_2, a_3, a_4, a_5,
a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13, a_14, a_15, a_16, a_17, a_18,
a_19, a_20, a_21, a_22, a_23, a_24, a_25) JOIN (SELECT kws.expr_names,
kws.expr_values, kws.m_ad_grp_semid, kws.m_new_kw_bid,
kws.m_kw_bid, kws.m_kw_pub_key, kws.m_ad_grp_pub_key FROM ONLY
biods.msn_keyword_sup kws WHERE true) r(a_1, a_2, a_3, a_4, a_5, a_6, a_7)
ON (true)) WHERE ((l.a_2 = r.a_7) AND (l.a_3 = r.a_6)) GROUP BY 1, 2, 3, 4,
5, 6 LIMIT 10000::bigint
Total runtime: 2194.762 ms
(7 rows)
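For a rough sense of how much data the coordinator has to hand to the client, the plan's width figure can be multiplied by the row counts. In PostgreSQL's EXPLAIN output, width is the planner's estimated average row size in bytes, not a measurement, and the cost and rows estimates in this plan look like defaults, so the result below should be treated as an order-of-magnitude guess only. The helper name estimated_megabytes is made up for this illustration.

```python
# Figures taken from the plans in this thread; width is a planner estimate.
width_bytes = 908     # estimated average row width reported in the plan
rows_limited = 10000  # actual rows with LIMIT 10000
rows_full = 605575    # actual rows reported for the query without LIMIT


def estimated_megabytes(rows, width):
    """Rows times estimated row width, expressed in megabytes (10**6 bytes)."""
    return rows * width / 1e6


limited_mb = estimated_megabytes(rows_limited, width_bytes)
full_mb = estimated_megabytes(rows_full, width_bytes)

print(f"LIMIT 10000:   about {limited_mb:.1f} MB")
print(f"without LIMIT: about {full_mb:.1f} MB")
```

If the estimate is anywhere near the real row size, the unlimited query returns several hundred megabytes through a single coordinator connection, which would fit the observation that elapsed time grows with the number of rows returned.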
On Sat, Feb 1, 2014 at 7:44 PM, Koichi Suzuki <koi...@gm...> wrote:
> Could you share "explain" result to see how plan works fine.
>
> Regards;
> ---
> Koichi Suzuki
>
>
> 2014-02-02 Nirmal Sharma <sha...@gm...>:
> > Hi,
> >
> > These are the timings for adding limit with different amount.
> > With these timings you can see see that here the bottleneck is
> coordinator
> > (i.e. retrieving data from various nodes to coordinator ).
> > I just want to ask whether its normal or not?
> >
> > ---for limit 1000
> > [postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out
> >
> > real 0m1.935s
> > user 0m0.051s
> > sys 0m0.002s
> >
> > ---for limit 10000
> > [postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out
> >
> > real 0m2.724s
> > user 0m0.481s
> > sys 0m0.023s
> >
> > --for limit 100000
> > [postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out
> >
> > real 0m12.102s
> > user 0m3.139s
> > sys 0m0.146s
> >
> > --for limit 200000
> > [postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out
> >
> > real 0m13.078s
> > user 0m5.507s
> > sys 0m0.316s
> >
> > ---for limit 400000
> > [postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out
> >
> > real 0m18.820s
> > user 0m10.482s
> > sys 0m0.659s
> >
> > ---for limit 600000
> > [postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out
> >
> > real 0m23.478s
> > user 0m15.631s
> > sys 0m0.940s
> > [postgres@sv4-pgxc-db04 test]$
> >
> >
> > I will also enable the statement log and try again and will send the
> output
> > soon.
> >
> > Nirmal
> >
> >
> > On Sat, Feb 1, 2014 at 1:47 PM, Mason Sharp <ms...@tr...>
> wrote:
> >>
> >>
> >>
> >>
> >> On Sat, Feb 1, 2014 at 2:40 PM, Nirmal Sharma <sha...@gm...>
> >> wrote:
> >>>
> >>> Hi Mason,
> >>>
> >>> This is the actual query that i was running.
> >>>
> >>> select coalesce(fgpc.date_id,fgcd.date_id)
> >>> date_id,
> >>> fgpc.m_ad_grp_pub_key
> m_ad_grp_pub_key,
> >>> fgpc.m_kw_pub_key m_kw_pub_key,
> >>> kws.expr_names,
> >>> kws.expr_values,
> >>> kws.m_ad_grp_semid,
> >>> sum(fgpc.m_imps) m_imps,
> >>> sum(fgpc.m_clicks) m_clicks,
> >>> sum(fgpc.m_cost) m_cost,
> >>> sum(fgpc.m_conv_1pc) m_conv_1pc,
> >>> sum(fgpc.m_conv_mpc) m_conv_mpc,
> >>> avg(fgpc.m_cnv_rate_1pc)
> m_cnv_rate_1pc,
> >>> avg(fgpc.m_cnv_rate_mpc)
> m_cnv_rate_mpc,
> >>> avg(fgpc.m_avg_cpc) m_avg_cpc,
> >>> avg(fgpc.m_max_cpc) m_max_cpc,
> >>> avg(fgpc.m_firstpage_cpc)
> >>> m_firstpage_cpc,
> >>> avg(fgpc.m_topofpage_cpc)
> >>> m_topofpage_cpc,
> >>> avg(fgpc.m_avg_cpm) m_avg_cpm,
> >>> avg(fgpc.m_max_cpm) m_max_cpm,
> >>> avg(fgpc.m_max_cpa_pct) m_max_cpa_pct,
> >>> avg(fgpc.m_avg_pos) m_avg_pos,
> >>> avg(fgpc.m_lowest_pos) m_lowest_pos,
> >>> avg(fgpc.m_highest_pos) m_highest_pos,
> >>> avg(fgpc.m_quality_score)
> >>> m_quality_score,
> >>> avg(fgpc.m_view_thru_conv)
> >>> m_view_thru_conv,
> >>> sum(fgcd.m_revenue) m_revenue,
> >>> sum(fgcd.m_conversions) m_conversions,
> >>> sum(coalesce(kws.m_new_kw_bid,
> >>> kws.m_kw_bid)) m_kw_total_bid,
> >>> max(coalesce(kws.m_new_kw_bid,
> >>> kws.m_kw_bid)) m_kw_max_bid,
> >>> min(coalesce(kws.m_new_kw_bid,
> >>> kws.m_kw_bid)) m_kw_min_bid
> >>> from
> >>> bidw.fact_msn_kw_perf_daily fgpc
> >>> full outer join bidw.fact_msn_kw_conversion_daily fgcd on
> >>> fgpc.m_ad_grp_pub_key = fgcd.m_ad_grp_pub_key and fgpc.m_kw_pub_key =
> >>> fgcd.m_kw_pub_key and fgpc.date_id = fgpc.date_id
> >>> join biods.msn_keyword_sup kws
> on
> >>> fgpc.m_kw_pub_key = kws.m_kw_pub_key and fgpc.m_ad_grp_pub_key =
> >>> kws.m_ad_grp_pub_key
> >>> where
> >>> coalesce(fgpc.date_id,fgcd.date_id) between
> >>> 20131201 and 20140119
> >>> group by
> >>>
> >>>
> coalesce(fgpc.date_id,fgcd.date_id),fgpc.m_ad_grp_pub_key,fgpc.m_kw_pub_key,kws.expr_names,kws.expr_values,kws.m_ad_grp_semid
> >>> ;
> >>>
> >>> This is the explain plan for the same.
> >>> explain analyze verbose select ......
> >>> ....
> >>>
> >>> Data Node Scan on "__REMOTE_GROUP_QUERY__" (cost=0.00..2.50 rows=1000
> >>> width=908) (actual time=1672.149..8281.918 rows=605575 loops=1)
> >>> Output: (COALESCE((fgpc.date_id)::bigint, fgcd.date_id)),
> >>> fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, kws.expr_names,
> kws.expr_values,
> >>> kws.m_ad_grp_semid, (sum(fgpc.m_imps)), (sum(fgpc.m_clicks)),
> >>> (sum(fgpc.m_cost)), (sum(fgpc.m_conv_1pc)), (sum(fgpc.m_conv_mpc)),
> >>> (avg(fgpc.m_cnv_rate_1pc)), (avg(fgpc.m_cnv_ra
> >>> te_mpc)), (avg(fgpc.m_avg_cpc)), (avg(fgpc.m_max_cpc)),
> >>> (avg(fgpc.m_firstpage_cpc)), (avg(fgpc.m_topofpage_cpc)),
> >>> (avg(fgpc.m_avg_cpm)), (avg(fgpc.m_max_cpm)),
> (avg(fgpc.m_max_cpa_pct)),
> >>> (avg(fgpc.m_avg_pos)), (avg(fgpc.m_lowest_pos)),
> (avg(fgpc.m_highest_pos)),
> >>> (avg(fgpc.m_quality_score)), (avg(fgpc.m_view_thr
> >>> u_conv)), (sum(fgcd.m_revenue)), (sum(fgcd.m_conversions)),
> >>> (sum(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))),
> >>> (max(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))),
> >>> (min(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid)))
> >>> Node/s: d11, d12, d13, d14, d15, d16
> >>> Remote query: SELECT COALESCE((l.a_1)::bigint, l.a_23), l.a_2,
> l.a_3,
> >>> r.a_1, r.a_2, r.a_3, sum(l.a_4), sum(l.a_5), sum(l.a_6), sum(l.a_7),
> >>> sum(l.a_8), pg_catalog.numeric_avg(avg(l.a_9)),
> >>> pg_catalog.numeric_avg(avg(l.a_10)),
> pg_catalog.numeric_avg(avg(l.a_11)),
> >>> pg_catalog.numeric_avg(avg(l.a_12)), pg_catalog.
> >>> numeric_avg(avg(l.a_13)), pg_catalog.numeric_avg(avg(l.a_14)),
> >>> pg_catalog.numeric_avg(avg(l.a_15)),
> pg_catalog.numeric_avg(avg(l.a_16)),
> >>> pg_catalog.numeric_avg(avg(l.a_17)),
> pg_catalog.numeric_avg(avg(l.a_18)),
> >>> pg_catalog.numeric_avg(avg(l.a_19)),
> pg_catalog.numeric_avg(avg(l.a_20)),
> >>> pg_catalog.numeric_avg(avg(
> >>> l.a_21)), pg_catalog.numeric_avg(avg(l.a_22)), sum(l.a_24),
> sum(l.a_25),
> >>> sum(COALESCE(r.a_4, r.a_5)), max(COALESCE(r.a_4, r.a_5)),
> >>> min(COALESCE(r.a_4, r.a_5)) FROM ((SELECT l.a_1, l.a_2, l.a_3, l.a_4,
> l.a_5,
> >>> l.a_6, l.a_7, l.a_8, l.a_9, l.a_10, l.a_11, l.a_12, l.a_13, l.a_14,
> l.a_15,
> >>> l.a_16, l.a_17, l.a_18, l.a_
> >>> 19, l.a_20, l.a_21, l.a_22, r.a_1, r.a_2, r.a_3 FROM ((SELECT
> >>> fgpc.date_id, fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, fgpc.m_imps,
> >>> fgpc.m_clicks, fgpc.m_cost, fgpc.m_conv_1pc, fgpc.m_conv_mpc,
> >>> fgpc.m_cnv_rate_1pc, fgpc.m_cnv_rate_mpc, fgpc.m_avg_cpc,
> fgpc.m_max_cpc,
> >>> fgpc.m_firstpage_cpc, fgpc.m_topofpage_cpc, f
> >>> gpc.m_avg_cpm, fgpc.m_max_cpm, fgpc.m_max_cpa_pct, fgpc.m_avg_pos,
> >>> fgpc.m_lowest_pos, fgpc.m_highest_pos, fgpc.m_quality_score,
> >>> fgpc.m_view_thru_conv FROM ONLY bidw.fact_msn_kw_perf_daily fgpc WHERE
> true)
> >>> l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13,
> a_14,
> >>> a_15, a_16, a_17, a_18, a_19,
> >>> a_20, a_21, a_22) LEFT JOIN (SELECT fgcd.date_id, fgcd.m_revenue,
> >>> fgcd.m_conversions, fgcd.m_ad_grp_pub_key, fgcd.m_kw_pub_key FROM ONLY
> >>> bidw.fact_msn_kw_conversion_daily fgcd WHERE true) r(a_1, a_2, a_3,
> a_4,
> >>> a_5) ON (((l.a_1 = l.a_1) AND (l.a_2 = r.a_4) AND (l.a_3 = r.a_5))))
> WHERE
> >>> ((COALESCE((l.a_1)::bigint,
> >>> r.a_1) >= 20131201) AND (COALESCE((l.a_1)::bigint, r.a_1) <=
> 20140119)))
> >>> l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13,
> a_14,
> >>> a_15, a_16, a_17, a_18, a_19, a_20, a_21, a_22, a_23, a_24, a_25) JOIN
> >>> (SELECT kws.expr_names, kws.expr_values, kws.m_ad_grp_semid,
> >>> kws.m_new_kw_bid, kws.m_kw_bi
> >>> d, kws.m_kw_pub_key, kws.m_ad_grp_pub_key FROM ONLY
> biods.msn_keyword_sup
> >>> kws WHERE true) r(a_1, a_2, a_3, a_4, a_5, a_6, a_7) ON (true)) WHERE
> >>> ((l.a_2 = r.a_7) AND (l.a_3 = r.a_6)) GROUP BY 1, 2, 3, 4, 5, 6
> >>> Total runtime: 8378.080 ms
> >>> (5 rows)
> >>>
> >>>
> >>>
> >>> This is the actual time taken by the query:
> >>>
> >>> [postgres@sv4-pgxc-db04 test]$ time psql -d mydb -f "test.sql" > a.out
> >>>
> >>> real 0m23.533s
> >>> user 0m15.705s
> >>> sys 0m0.748s
> >>>
> >>> Now I don't know why it is taking that much time.
> >>
> >>
> >> Try adding LIMIT with different amounts for example to see how that
> >> impacts time.
> >>
> >> Also, try enabling statement logging (log_statement = all in
> >> postgresql.conf) on the data nodes to see how long it takes on each node.
> >>
> >> Also, the statement was rewritten in XC, with relations converted into
> >> SELECTs, so try running the rewritten version directly to see how long it
> >> takes.
> >>
> >> Thanks,
> >>
> >> Mason
> >>
> >
>
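For reference, the LIMIT timings quoted above (400000 rows in 18.820 s, 600000 rows in 23.478 s wall-clock) can be turned into a rough marginal cost per returned row. The sketch below is only arithmetic on those two figures; the function name `per_row_us` is made up for illustration, and it assumes `awk` is available. Note also that the `user` figure printed by `time` is CPU spent in the psql client process itself, so some of the wall-clock time may be client-side output formatting rather than coordinator work.

```shell
#!/usr/bin/env bash
# Marginal wall-clock cost per extra row, from two (rows, seconds) samples.
# The figures used below are the "real" times reported in the thread for
# LIMIT 400000 and LIMIT 600000. per_row_us is a hypothetical helper name.
per_row_us() {
  # args: rows1 secs1 rows2 secs2; prints microseconds per additional row
  awk -v r1="$1" -v t1="$2" -v r2="$3" -v t2="$4" \
    'BEGIN { printf "%.2f\n", (t2 - t1) / (r2 - r1) * 1e6 }'
}

per_row_us 400000 18.820 600000 23.478
```

If the per-row figure stays roughly constant across more LIMIT values, the extra time is probably proportional to result size (transfer, buffering, formatting) rather than to query execution on the datanodes.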
|
|
From: Mason S. <ms...@tr...> - 2014-02-02 17:32:11
|
On Sun, Feb 2, 2014 at 12:49 AM, Nirmal Sharma <sha...@gm...>wrote: > This is the explain plan for the query with limit 10000. > > > Limit (cost=0.00..2.50 rows=1000 width=908) (actual > time=1586.926..1836.081 rows=10000 loops=1) > Output: (COALESCE((fgpc.date_id)::bigint, fgcd.date_id)), > fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, kws.expr_names, kws.expr_values, > kws.m_ad_grp_semid, (sum(fgpc.m_imps)), (sum(fgpc.m_clicks)), > (sum(fgpc.m_cost)), (sum(fgpc.m_conv_1pc)), (sum(fgpc.m_conv_mpc)), (avg(f > gpc.m_cnv_rate_1pc)), (avg(fgpc.m_cnv_rate_mpc)), (avg(fgpc.m_avg_cpc)), > (avg(fgpc.m_max_cpc)), (avg(fgpc.m_firstpage_cpc)), > (avg(fgpc.m_topofpage_cpc)), (avg(fgpc.m_avg_cpm)), (avg(fgpc.m_max_cpm)), > (avg(fgpc.m_max_cpa_pct)), (avg(fgpc.m_avg_pos)), (avg(fgpc.m_lowest_pos > )), (avg(fgpc.m_highest_pos)), (avg(fgpc.m_quality_score)), > (avg(fgpc.m_view_thru_conv)), (sum(fgcd.m_revenue)), > (sum(fgcd.m_conversions)), (sum(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), > (max(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), > (min(COALESCE(kws.m_new_kw_bid, kw > s.m_kw_bid))) > -> Data Node Scan on "__REMOTE_LIMIT_QUERY__" (cost=0.00..2.50 > rows=1000 width=908) (actual time=1586.924..1834.118 rows=10000 loops=1) > Output: (COALESCE((fgpc.date_id)::bigint, fgcd.date_id)), > fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, kws.expr_names, kws.expr_values, > kws.m_ad_grp_semid, (sum(fgpc.m_imps)), (sum(fgpc.m_clicks)), > (sum(fgpc.m_cost)), (sum(fgpc.m_conv_1pc)), (sum(fgpc.m_conv_mpc)), > (avg(fgpc.m_cnv_rate_1pc)), (avg(fgpc.m_cnv_rate_mpc)), > (avg(fgpc.m_avg_cpc)), (avg(fgpc.m_max_cpc)), (avg(fgpc.m_firstpage_cpc)), > (avg(fgpc.m_topofpage_cpc)), (avg(fgpc.m_avg_cpm)), (avg(fgpc.m_max_cpm)), > (avg(fgpc.m_max_cpa_pct)), (avg(fgpc.m_avg_pos)), (avg(fgpc.m_lowe > st_pos)), (avg(fgpc.m_highest_pos)), (avg(fgpc.m_quality_score)), > (avg(fgpc.m_view_thru_conv)), (sum(fgcd.m_revenue)), > (sum(fgcd.m_conversions)), (sum(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), > 
(max(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))), > (min(COALESCE(kws.m_new_kw_b > id, kws.m_kw_bid))) > Node/s: d11, d12, d13, d14, d15, d16 > Remote query: SELECT COALESCE((l.a_1)::bigint, l.a_23), l.a_2, > l.a_3, r.a_1, r.a_2, r.a_3, sum(l.a_4), sum(l.a_5), sum(l.a_6), sum(l.a_7), > sum(l.a_8), pg_catalog.numeric_avg(avg(l.a_9)), > pg_catalog.numeric_avg(avg(l.a_10)), pg_catalog.numeric_avg(avg(l.a_11)), pg > _catalog.numeric_avg(avg(l.a_12)), pg_catalog.numeric_avg(avg(l.a_13)), > pg_catalog.numeric_avg(avg(l.a_14)), pg_catalog.numeric_avg(avg(l.a_15)), > pg_catalog.numeric_avg(avg(l.a_16)), pg_catalog.numeric_avg(avg(l.a_17)), > pg_catalog.numeric_avg(avg(l.a_18)), pg_catalog.nume > ric_avg(avg(l.a_19)), pg_catalog.numeric_avg(avg(l.a_20)), > pg_catalog.numeric_avg(avg(l.a_21)), pg_catalog.numeric_avg(avg(l.a_22)), > sum(l.a_24), sum(l.a_25), sum(COALESCE(r.a_4, r.a_5)), max(COALESCE(r.a_4, > r.a_5)), min(COALESCE(r.a_4, r.a_5)) FROM ((SELECT l.a_1, l.a_2, > l.a_3, l.a_4, l.a_5, l.a_6, l.a_7, l.a_8, l.a_9, l.a_10, l.a_11, l.a_12, > l.a_13, l.a_14, l.a_15, l.a_16, l.a_17, l.a_18, l.a_19, l.a_20, l.a_21, > l.a_22, r.a_1, r.a_2, r.a_3 FROM ((SELECT fgpc.date_id, > fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, fgpc.m_imps, fgpc.m_clicks, > fgpc.m_cost, fgpc.m_conv_1pc, fgpc.m_conv_mpc, fgpc.m_cnv_rate_1pc, > fgpc.m_cnv_rate_mpc, fgpc.m_avg_cpc, fgpc.m_max_cpc, fgpc.m_firstpage_cpc, > fgpc.m_topofpage_cpc, fgpc.m_avg_cpm, fgpc.m_max_cpm, fgpc.m_max_cpa_pct, > fgpc.m_avg_pos, fgpc.m_lowest_pos, fgpc.m_highest_pos, > fgpc.m_quality_score, fgpc.m_view_thru_conv FROM ONLY > bidw.fact_msn_kw_perf_daily fgpc WHERE true) l(a_1, a_2, a_3, a_4, a_5, > a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13, a_14, a_15, a_16, a_17, a_18, > a_19, a_20, a_21, a_22) LEFT JOIN (SELECT fgcd.date_id, fgcd.m_revenue, > fgcd.m_conversions, fgcd.m_ad_grp_pub_key, fgcd.m_kw_pub_key FROM ONLY > bidw.fact_msn_kw_conversion_daily fgcd WHERE true) r(a_1, a_2, a_3, a_4, > a_5) ON (((l.a_1 = l.a_1) AND 
(l.a_2 = r.a_4) AND (l.a_3 = r.a_5)))) WHERE > ((COALESCE((l.a_1)::bigint, r.a_1) >= 20131201) AND ( > COALESCE((l.a_1)::bigint, r.a_1) <= 20140119))) l(a_1, a_2, a_3, a_4, a_5, > a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13, a_14, a_15, a_16, a_17, a_18, > a_19, a_20, a_21, a_22, a_23, a_24, a_25) JOIN (SELECT kws.expr_names, > kws.expr_values, kws.m_ad_grp_semid, kws.m_new_kw_bi > d, kws.m_kw_bid, kws.m_kw_pub_key, kws.m_ad_grp_pub_key FROM ONLY > biods.msn_keyword_sup kws WHERE true) r(a_1, a_2, a_3, a_4, a_5, a_6, a_7) > ON (true)) WHERE ((l.a_2 = r.a_7) AND (l.a_3 = r.a_6)) GROUP BY 1, 2, 3, 4, > 5, 6 LIMIT 10000::bigint > Total runtime: 2194.762 ms > (7 rows) > > If you run the generated query on the nodes directly (through EXECUTE DIRECT) is the time similarly slow? If so, then it points to the query rewrite that is the problem. If it is fast, then it may mean an issue in tuple handling on the coordinator. -- Mason Sharp TransLattice - http://www.translattice.com Distributed and Clustered Database Solutions |
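A minimal sketch of how the EXECUTE DIRECT check could be scripted, one statement per datanode. The node names d11 to d16 come from the "Node/s:" line of the plan; the helper name `build_direct_sql` and the `count(*)` query are placeholders, not the real remote query. The statement form `EXECUTE DIRECT ON (node) 'query'` is the one I recall from the Postgres-XC 1.x reference, so please check the EXECUTE DIRECT page for your release before relying on it.

```shell
#!/usr/bin/env bash
# Build one EXECUTE DIRECT statement per datanode for a given query.
# build_direct_sql is a hypothetical helper; the query passed in below is a
# placeholder standing in for the "Remote query" text from the plan.
build_direct_sql() {
  local node="$1" query="$2" escaped
  # Double any single quotes so the query survives as a SQL string literal.
  escaped=$(printf '%s' "$query" | sed "s/'/''/g")
  printf "EXECUTE DIRECT ON (%s) '%s';\n" "$node" "$escaped"
}

for node in d11 d12 d13 d14 d15 d16; do
  build_direct_sql "$node" "SELECT count(*) FROM bidw.fact_msn_kw_perf_daily"
done
```

The generated statements can then be piped into psql on the coordinator (for example `... | psql -d myDB`, with `\timing` enabled) and the per-node times compared against the time for the same query run normally through the coordinator.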
|
From: Nirmal S. <sha...@gm...> - 2014-02-02 18:32:52
|
Yes you are absolutely right. If I run the same query directly on nodes then it runs very fast. It is running slow when I run from coordinator. How am I going to resolve this tuple handling on coordinator? Please advise. |
|
From: Ashutosh B. <ash...@en...> - 2014-02-03 04:52:56
|
Can you please check if there is increase in disk i/o as the number of rows processed increases. I do not see any problem with the planner. But because of huge result from datanode and not enough RAM, coordinator might be choosing to store it on the disk. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company |
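One rough way to do the disk I/O check on Linux is to compare the "sectors written" counter in /proc/diskstats before and after the query. This is only a sketch: the helper name `sectors_written` and the device name `sda` are assumptions, the sample values are made up, and on a machine shared by the coordinator and all six datanodes the counter will include writes from every process, not just the coordinator.

```shell
#!/usr/bin/env bash
# Sum the "sectors written" counter (field 10 of /proc/diskstats on Linux)
# for one block device. Input is read from stdin so the function can be
# tried on a captured sample. sectors_written is a hypothetical helper name.
sectors_written() {
  local dev="$1"
  awk -v dev="$dev" '$3 == dev { total += $10 } END { print total + 0 }'
}

# Example on a captured sample (values are made up for illustration).
sample='   8       0 sda 100 0 800 50 200 0 4096 70 0 90 120
   8       1 sda1 90 0 700 40 180 0 4000 60 0 80 100'
printf '%s\n' "$sample" | sectors_written sda
```

In practice one would run `sectors_written sda < /proc/diskstats` before and after the query at each LIMIT value and look at whether the difference grows with the number of rows returned.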