Thread: [Postgres-xc-general] Looks like co-ordinator is my perf bottleneck....somebody help...

[Postgres-xc-general] Looks like co-ordinator is my perf bottleneck....somebody help...

From: Nirmal S. <sha...@gm...> - 2014-01-31 23:38:37

Hi,

I created a pgxc cluster with one coordinator, one GTM and 6 data nodes
(all on the same big machine).
To test the performance, I ran one query through the coordinator on my data,
which is evenly distributed across all the nodes, and it took 25 sec in
total to complete.

Then I ran the same query on the data nodes directly, and it took 5 sec on
each data node.

Since query execution happens in parallel on the data nodes, ideally the
query should not take more than 5-8 sec even when run through the
coordinator, so I don't understand why it is taking 25 sec.

Can somebody help me?
Do I need to make some changes to my cluster configuration?


Regards
Nirmal

Re: [Postgres-xc-general] Looks like co-ordinator is my perf bottleneck....somebody help...

From: Koichi S. <koi...@gm...> - 2014-02-01 08:17:13

It is not a good way to replicate all the tables for write
scalability.   The best way is to distribute transaction tables (very
frequently written ones) and replicate master tables (less frequently
written and frequently joined with transaction tables).

An example is our DBT-1 benchmark.  Slide 12 of
http://postgres-xc.sourceforge.net/misc-docs/20120614_PGXC_Tutorial_global.pdf
shows how we designed DBT-1 table distribution for XC.  I hope this
helps.

Regards;
---
Koichi Suzuki



Re: [Postgres-xc-general] Looks like co-ordinator is my perf bottleneck....somebody help...

From: Nirmal <sha...@gm...> - 2014-02-01 16:46:57

Hi Koichi,

My tables are not replicated. They are all distributed the way you explained.
For example, one table has 600000 records in total and I have 6 nodes, so each node has 100000 records.

Now the issue is that when I run my query directly on a data node it completes in 5 sec, and it takes the same time on every node, so it should take about the same time if I run the query through the coordinator. Instead of 5 sec, it is taking 22 sec. So the query execution on the nodes seems to be happening correctly, but data movement from the nodes to the coordinator is taking a lot of time.

Please advise.


Re: [Postgres-xc-general] Looks like co-ordinator is my perf bottleneck....somebody help...

From: Mason S. <ms...@tr...> - 2014-02-01 19:33:56

What does your query look like? A single table? A join? Using aggregates?

Thanks,

Mason

Re: [Postgres-xc-general] Looks like co-ordinator is my perf bottleneck....somebody help...

From: Sandeep G. <gup...@gm...> - 2014-02-01 18:19:40

Nirmal,

Coordinator time is a function of the sum of the output from each datanode.
PGXC performs well when the datanodes output a small amount of data
compared to the original size.

-Sandeep
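
To make the distinction concrete: the point above is about the volume of rows the datanodes return, not about adding up their execution times. A rough sketch of that cost model in Python follows. The function name and the per-row cost are made-up assumptions for illustration only; they are not measured from Postgres-XC.

```python
def estimate_coordinator_time(node_seconds, node_rows, per_row_cost):
    """Rough model: datanodes run in parallel, so execution contributes the
    slowest node's time; every returned row then passes through the single
    coordinator, so that part grows with the SUM of rows from all nodes."""
    parallel_part = max(node_seconds)
    serial_part = per_row_cost * sum(node_rows)
    return parallel_part + serial_part


# Six nodes at 5 s each, as in the report. per_row_cost is a hypothetical value.
few_rows = estimate_coordinator_time([5.0] * 6, [10] * 6, per_row_cost=0.00003)
many_rows = estimate_coordinator_time([5.0] * 6, [100_000] * 6, per_row_cost=0.00003)
print(f"small result: {few_rows:.1f} s, large result: {many_rows:.1f} s")
```

Under this model the execution part stays at the slowest node's time, while the part that scales with the total number of returned rows is what can dominate for large result sets.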




Re: [Postgres-xc-general] Looks like co-ordinator is my perf bottleneck....somebody help...

From: Nirmal S. <sha...@gm...> - 2014-02-01 18:53:46

So if the coordinator time or total query time is the sum of the time taken by the query on every node, then what is the use of having a cluster? The whole point of having the cluster is to divide the work across the data nodes and reduce the query time by almost the number of data nodes.
I thought the overall time taken by the query/coordinator was the max of the time taken by all the nodes, max(dn1, dn2, ...), instead of the sum of the time taken by all the nodes, sum(dn1, dn2, ...).

Please let me know if my understanding is incorrect.

Nirmal


Re: [Postgres-xc-general] Looks like co-ordinator is my perf bottleneck....somebody help...

From: Nirmal S. <sha...@gm...> - 2014-02-01 19:13:39

My query uses aggregates and joins, and it looks like this:

Select
Sum(...),
Sum(..),
Avg(..),
...
....
..
From tableA a inner join tableB b on a.col1 = b.col1
              inner join tableC c on a.col1 = c.col1

All 3 tables are distributed on hash(col1).

I have 1 coordinator, 6 nodes, 1 GTM.

When I run this query, it takes 23 sec in total.
But when I run the same query on each individual node, it takes 4 sec on every node.
So since it's a cluster, it should ideally take 4 sec plus some overhead to combine the data from each node on the coordinator (max 2 more sec), but I don't understand why it takes 23 sec when run from the coordinator.

Nirmal


Re: [Postgres-xc-general] Looks like co-ordinator is my perf bottleneck....somebody help...

From: Mason S. <ms...@tr...> - 2014-02-01 19:17:00



Can you please add an EXPLAIN in front of your SELECT to look at the plan?
If you only use 2 tables instead of 3, does it behave more as expected?

Re: [Postgres-xc-general] Looks like co-ordinator is my perf bottleneck....somebody help...

From: Nirmal S. <sha...@gm...> - 2014-02-01 19:40:28

Hi Mason,

This is the actual query that I was running.

select coalesce(fgpc.date_id, fgcd.date_id) date_id,
       fgpc.m_ad_grp_pub_key m_ad_grp_pub_key,
       fgpc.m_kw_pub_key m_kw_pub_key,
       kws.expr_names,
       kws.expr_values,
       kws.m_ad_grp_semid,
       sum(fgpc.m_imps) m_imps,
       sum(fgpc.m_clicks) m_clicks,
       sum(fgpc.m_cost) m_cost,
       sum(fgpc.m_conv_1pc) m_conv_1pc,
       sum(fgpc.m_conv_mpc) m_conv_mpc,
       avg(fgpc.m_cnv_rate_1pc) m_cnv_rate_1pc,
       avg(fgpc.m_cnv_rate_mpc) m_cnv_rate_mpc,
       avg(fgpc.m_avg_cpc) m_avg_cpc,
       avg(fgpc.m_max_cpc) m_max_cpc,
       avg(fgpc.m_firstpage_cpc) m_firstpage_cpc,
       avg(fgpc.m_topofpage_cpc) m_topofpage_cpc,
       avg(fgpc.m_avg_cpm) m_avg_cpm,
       avg(fgpc.m_max_cpm) m_max_cpm,
       avg(fgpc.m_max_cpa_pct) m_max_cpa_pct,
       avg(fgpc.m_avg_pos) m_avg_pos,
       avg(fgpc.m_lowest_pos) m_lowest_pos,
       avg(fgpc.m_highest_pos) m_highest_pos,
       avg(fgpc.m_quality_score) m_quality_score,
       avg(fgpc.m_view_thru_conv) m_view_thru_conv,
       sum(fgcd.m_revenue) m_revenue,
       sum(fgcd.m_conversions) m_conversions,
       sum(coalesce(kws.m_new_kw_bid, kws.m_kw_bid)) m_kw_total_bid,
       max(coalesce(kws.m_new_kw_bid, kws.m_kw_bid)) m_kw_max_bid,
       min(coalesce(kws.m_new_kw_bid, kws.m_kw_bid)) m_kw_min_bid
from bidw.fact_msn_kw_perf_daily fgpc
full outer join bidw.fact_msn_kw_conversion_daily fgcd
  on fgpc.m_ad_grp_pub_key = fgcd.m_ad_grp_pub_key
 and fgpc.m_kw_pub_key = fgcd.m_kw_pub_key
 and fgpc.date_id = fgpc.date_id
join biods.msn_keyword_sup kws
  on fgpc.m_kw_pub_key = kws.m_kw_pub_key
 and fgpc.m_ad_grp_pub_key = kws.m_ad_grp_pub_key
where coalesce(fgpc.date_id, fgcd.date_id) between 20131201 and 20140119
group by coalesce(fgpc.date_id, fgcd.date_id),
         fgpc.m_ad_grp_pub_key,
         fgpc.m_kw_pub_key,
         kws.expr_names,
         kws.expr_values,
         kws.m_ad_grp_semid
;

*This is the explain plan for the same.*
explain analyze verbose select ......
....

* Data Node Scan on "__REMOTE_GROUP_QUERY__"  (cost=0.00..2.50 rows=1000
width=908) (actual time=1672.149..8281.918 rows=605575 loops=1)*
   Output: (COALESCE((fgpc.date_id)::bigint, fgcd.date_id)),
fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, kws.expr_names, kws.expr_values,
kws.m_ad_grp_semid, (sum(fgpc.m_imps)), (sum(fgpc.m_clicks)),
(sum(fgpc.m_cost)), (sum(fgpc.m_conv_1pc)), (sum(fgpc.m_conv_mpc)),
(avg(fgpc.m_cnv_rate_1pc)), (avg(fgpc.m_cnv_rate_mpc)),
(avg(fgpc.m_avg_cpc)), (avg(fgpc.m_max_cpc)),
(avg(fgpc.m_firstpage_cpc)), (avg(fgpc.m_topofpage_cpc)),
(avg(fgpc.m_avg_cpm)), (avg(fgpc.m_max_cpm)), (avg(fgpc.m_max_cpa_pct)),
(avg(fgpc.m_avg_pos)), (avg(fgpc.m_lowest_pos)), (avg(fgpc.m_highest_pos)),
(avg(fgpc.m_quality_score)), (avg(fgpc.m_view_thru_conv)),
(sum(fgcd.m_revenue)), (sum(fgcd.m_conversions)),
(sum(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))),
(max(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))),
(min(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid)))
   *Node/s: d11, d12, d13, d14, d15, d16*
   Remote query: SELECT COALESCE((l.a_1)::bigint, l.a_23), l.a_2, l.a_3,
r.a_1, r.a_2, r.a_3, sum(l.a_4), sum(l.a_5), sum(l.a_6), sum(l.a_7),
sum(l.a_8), pg_catalog.numeric_avg(avg(l.a_9)),
pg_catalog.numeric_avg(avg(l.a_10)), pg_catalog.numeric_avg(avg(l.a_11)),
pg_catalog.numeric_avg(avg(l.a_12)),
pg_catalog.numeric_avg(avg(l.a_13)), pg_catalog.numeric_avg(avg(l.a_14)),
pg_catalog.numeric_avg(avg(l.a_15)), pg_catalog.numeric_avg(avg(l.a_16)),
pg_catalog.numeric_avg(avg(l.a_17)), pg_catalog.numeric_avg(avg(l.a_18)),
pg_catalog.numeric_avg(avg(l.a_19)), pg_catalog.numeric_avg(avg(l.a_20)),
pg_catalog.numeric_avg(avg(l.a_21)),
pg_catalog.numeric_avg(avg(l.a_22)), sum(l.a_24), sum(l.a_25),
sum(COALESCE(r.a_4, r.a_5)), max(COALESCE(r.a_4, r.a_5)),
min(COALESCE(r.a_4, r.a_5)) FROM ((SELECT l.a_1, l.a_2, l.a_3, l.a_4,
l.a_5, l.a_6, l.a_7, l.a_8, l.a_9, l.a_10, l.a_11, l.a_12, l.a_13, l.a_14,
l.a_15, l.a_16, l.a_17, l.a_18, l.a_19,
l.a_20, l.a_21, l.a_22, r.a_1, r.a_2, r.a_3 FROM ((SELECT fgpc.date_id,
fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, fgpc.m_imps, fgpc.m_clicks,
fgpc.m_cost, fgpc.m_conv_1pc, fgpc.m_conv_mpc, fgpc.m_cnv_rate_1pc,
fgpc.m_cnv_rate_mpc, fgpc.m_avg_cpc, fgpc.m_max_cpc, fgpc.m_firstpage_cpc,
fgpc.m_topofpage_cpc,
fgpc.m_avg_cpm, fgpc.m_max_cpm, fgpc.m_max_cpa_pct, fgpc.m_avg_pos,
fgpc.m_lowest_pos, fgpc.m_highest_pos, fgpc.m_quality_score,
fgpc.m_view_thru_conv FROM ONLY bidw.fact_msn_kw_perf_daily fgpc WHERE
true) l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12,
a_13, a_14, a_15, a_16, a_17, a_18, a_19,
a_20, a_21, a_22) LEFT JOIN (SELECT fgcd.date_id, fgcd.m_revenue,
fgcd.m_conversions, fgcd.m_ad_grp_pub_key, fgcd.m_kw_pub_key FROM ONLY
bidw.fact_msn_kw_conversion_daily fgcd WHERE true) r(a_1, a_2, a_3, a_4,
a_5) ON (((l.a_1 = l.a_1) AND (l.a_2 = r.a_4) AND (l.a_3 = r.a_5)))) WHERE
((COALESCE((l.a_1)::bigint,
 r.a_1) >= 20131201) AND (COALESCE((l.a_1)::bigint, r.a_1) <= 20140119)))
l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13,
a_14, a_15, a_16, a_17, a_18, a_19, a_20, a_21, a_22, a_23, a_24, a_25)
JOIN (SELECT kws.expr_names, kws.expr_values, kws.m_ad_grp_semid,
kws.m_new_kw_bid, kws.m_kw_bid,
kws.m_kw_pub_key, kws.m_ad_grp_pub_key FROM ONLY biods.msn_keyword_sup
kws WHERE true) r(a_1, a_2, a_3, a_4, a_5, a_6, a_7) ON (true)) WHERE
((l.a_2 = r.a_7) AND (l.a_3 = r.a_6)) GROUP BY 1, 2, 3, 4, 5, 6
* Total runtime: 8378.080 ms*
(5 rows)
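
One detail in the query and plan above may be worth a second look. The join to the conversion table compares fgpc.date_id with itself (it shows up as l.a_1 = l.a_1 in the remote query), which is true for every non-null date. If fgcd.date_id was intended, which is only a guess, each perf row currently matches conversion rows from every date that shares the same keys. A minimal sketch of the effect, using Python's built-in sqlite3 and two hypothetical toy tables (perf and conv, not the real schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE perf (date_id INTEGER, kw_key INTEGER, clicks INTEGER);
    CREATE TABLE conv (date_id INTEGER, kw_key INTEGER, revenue INTEGER);
    INSERT INTO perf VALUES (20131201, 1, 10), (20131202, 1, 20);
    INSERT INTO conv VALUES (20131201, 1, 100), (20131202, 1, 200);
""")

# Date condition compares perf.date_id with itself, as in the thread's query.
self_compare = conn.execute("""
    SELECT count(*) FROM perf p
    LEFT JOIN conv c ON p.kw_key = c.kw_key AND p.date_id = p.date_id
""").fetchone()[0]

# Date condition compares the two tables (the assumed intent).
cross_table = conn.execute("""
    SELECT count(*) FROM perf p
    LEFT JOIN conv c ON p.kw_key = c.kw_key AND p.date_id = c.date_id
""").fetchone()[0]

print(self_compare, cross_table)
```

With the self-comparison, each perf row pairs with every conv row that has the same key, so the join produces more rows and any sums over them would be inflated. Whether this matters for the real data depends on how many dates share the same keys.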



*This is the actual time taken by the query:*

[postgres@sv4-pgxc-db04 test]$ time psql -d mydb -f "test.sql" > a.out

*real 0m23.533s*
user 0m15.705s
sys 0m0.748s

Now I don't know why it is taking that much time.

Nirmal
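
A rough way to read the numbers above: the user and sys lines from `time` report the CPU time of the psql client process itself, not of the coordinator or the datanodes. A small Python sketch of the arithmetic, using only the figures quoted in this message (the variable names are illustrative):

```python
# Figures copied from the EXPLAIN ANALYZE and `time` output in this message.
explain_total_ms = 8378.080   # "Total runtime" reported by EXPLAIN ANALYZE
wall_clock_s = 23.533         # "real" from `time psql ...`
client_user_s = 15.705        # "user": CPU seconds used by psql itself
client_sys_s = 0.748          # "sys": kernel CPU seconds used by psql

server_side_s = explain_total_ms / 1000.0
outside_plan_s = wall_clock_s - server_side_s
client_cpu_s = client_user_s + client_sys_s

print(f"time not covered by the plan: {outside_plan_s:.3f} s")
print(f"psql client CPU time:         {client_cpu_s:.3f} s")
```

The two figures are of similar size, which suggests (but does not prove) that much of the extra wall-clock time goes into psql receiving and formatting the 605575 wide rows rather than into query execution. The measurements come from separate runs, so this is a hint to follow up on, not a diagnosis.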



Re: [Postgres-xc-general] Looks like co-ordinator is my perf bottleneck....somebody help...

From: Mason S. <ms...@tr...> - 2014-02-01 21:47:11

On Sat, Feb 1, 2014 at 2:40 PM, Nirmal Sharma <sha...@gm...>wrote:

> Hi Mason,
>
> This is the actual query that i was running.
>
> select                           coalesce(fgpc.date_id,fgcd.date_id)
> date_id,
>                                  fgpc.m_ad_grp_pub_key m_ad_grp_pub_key,
>                                  fgpc.m_kw_pub_key m_kw_pub_key,
>                                  kws.expr_names,
>                                  kws.expr_values,
>                                  kws.m_ad_grp_semid,
>                                  sum(fgpc.m_imps) m_imps,
>                                  sum(fgpc.m_clicks) m_clicks,
>                                  sum(fgpc.m_cost) m_cost,
>
>                                  sum(fgpc.m_conv_1pc) m_conv_1pc,
>                                  sum(fgpc.m_conv_mpc) m_conv_mpc,
>                                  avg(fgpc.m_cnv_rate_1pc) m_cnv_rate_1pc,
>                                  avg(fgpc.m_cnv_rate_mpc) m_cnv_rate_mpc,
>                                  avg(fgpc.m_avg_cpc) m_avg_cpc,
>                                  avg(fgpc.m_max_cpc) m_max_cpc,
>                                  avg(fgpc.m_firstpage_cpc) m_firstpage_cpc,
>                                  avg(fgpc.m_topofpage_cpc) m_topofpage_cpc,
>                                  avg(fgpc.m_avg_cpm) m_avg_cpm,
>                                  avg(fgpc.m_max_cpm) m_max_cpm,
>                                  avg(fgpc.m_max_cpa_pct) m_max_cpa_pct,
>                                  avg(fgpc.m_avg_pos) m_avg_pos,
>                                  avg(fgpc.m_lowest_pos) m_lowest_pos,
>                                  avg(fgpc.m_highest_pos) m_highest_pos,
>                                  avg(fgpc.m_quality_score) m_quality_score,
>                                  avg(fgpc.m_view_thru_conv)
> m_view_thru_conv,
>                                  sum(fgcd.m_revenue) m_revenue,
>                                  sum(fgcd.m_conversions) m_conversions,
>                                  sum(coalesce(kws.m_new_kw_bid,
> kws.m_kw_bid)) m_kw_total_bid,
>                                  max(coalesce(kws.m_new_kw_bid,
> kws.m_kw_bid)) m_kw_max_bid,
>                                  min(coalesce(kws.m_new_kw_bid,
> kws.m_kw_bid)) m_kw_min_bid
>                         from
>                              bidw.fact_msn_kw_perf_daily  fgpc
>              full outer join bidw.fact_msn_kw_conversion_daily fgcd  on
> fgpc.m_ad_grp_pub_key = fgcd.m_ad_grp_pub_key and fgpc.m_kw_pub_key =
> fgcd.m_kw_pub_key and fgpc.date_id = fgpc.date_id
>              join            biods.msn_keyword_sup kws                 on
>   fgpc.m_kw_pub_key = kws.m_kw_pub_key and fgpc.m_ad_grp_pub_key =
> kws.m_ad_grp_pub_key
>              where
>                        coalesce(fgpc.date_id,fgcd.date_id)  between
> 20131201 and 20140119
>             group by
>
>  coalesce(fgpc.date_id,fgcd.date_id),fgpc.m_ad_grp_pub_key,fgpc.m_kw_pub_key,kws.expr_names,kws.expr_values,kws.m_ad_grp_semid
> ;
>
> *This is the explain plan for the same.*
> explain analyze verbose select ......
> ....
>
> * Data Node Scan on "__REMOTE_GROUP_QUERY__"  (cost=0.00..2.50 rows=1000
> width=908) (actual time=1672.149..8281.918 rows=605575 loops=1)*
>    Output: (COALESCE((fgpc.date_id)::bigint, fgcd.date_id)),
> fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, kws.expr_names, kws.expr_values,
> kws.m_ad_grp_semid, (sum(fgpc.m_imps)), (sum(fgpc.m_clicks)),
> (sum(fgpc.m_cost)), (sum(fgpc.m_conv_1pc)), (sum(fgpc.m_conv_mpc)),
> (avg(fgpc.m_cnv_rate_1pc)), (avg(fgpc.m_cnv_ra
> te_mpc)), (avg(fgpc.m_avg_cpc)), (avg(fgpc.m_max_cpc)),
> (avg(fgpc.m_firstpage_cpc)), (avg(fgpc.m_topofpage_cpc)),
> (avg(fgpc.m_avg_cpm)), (avg(fgpc.m_max_cpm)), (avg(fgpc.m_max_cpa_pct)),
> (avg(fgpc.m_avg_pos)), (avg(fgpc.m_lowest_pos)), (avg(fgpc.m_highest_pos)),
> (avg(fgpc.m_quality_score)), (avg(fgpc.m_view_thr
> u_conv)), (sum(fgcd.m_revenue)), (sum(fgcd.m_conversions)),
> (sum(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))),
> (max(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))),
> (min(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid)))
>    *Node/s: d11, d12, d13, d14, d15, d16*
>    Remote query: SELECT COALESCE((l.a_1)::bigint, l.a_23), l.a_2, l.a_3,
> r.a_1, r.a_2, r.a_3, sum(l.a_4), sum(l.a_5), sum(l.a_6), sum(l.a_7),
> sum(l.a_8), pg_catalog.numeric_avg(avg(l.a_9)),
> pg_catalog.numeric_avg(avg(l.a_10)), pg_catalog.numeric_avg(avg(l.a_11)),
> pg_catalog.numeric_avg(avg(l.a_12)), pg_catalog.
> numeric_avg(avg(l.a_13)), pg_catalog.numeric_avg(avg(l.a_14)),
> pg_catalog.numeric_avg(avg(l.a_15)), pg_catalog.numeric_avg(avg(l.a_16)),
> pg_catalog.numeric_avg(avg(l.a_17)), pg_catalog.numeric_avg(avg(l.a_18)),
> pg_catalog.numeric_avg(avg(l.a_19)), pg_catalog.numeric_avg(avg(l.a_20)),
> pg_catalog.numeric_avg(avg(
> l.a_21)), pg_catalog.numeric_avg(avg(l.a_22)), sum(l.a_24), sum(l.a_25),
> sum(COALESCE(r.a_4, r.a_5)), max(COALESCE(r.a_4, r.a_5)),
> min(COALESCE(r.a_4, r.a_5)) FROM ((SELECT l.a_1, l.a_2, l.a_3, l.a_4,
> l.a_5, l.a_6, l.a_7, l.a_8, l.a_9, l.a_10, l.a_11, l.a_12, l.a_13, l.a_14,
> l.a_15, l.a_16, l.a_17, l.a_18, l.a_
> 19, l.a_20, l.a_21, l.a_22, r.a_1, r.a_2, r.a_3 FROM ((SELECT
> fgpc.date_id, fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, fgpc.m_imps,
> fgpc.m_clicks, fgpc.m_cost, fgpc.m_conv_1pc, fgpc.m_conv_mpc,
> fgpc.m_cnv_rate_1pc, fgpc.m_cnv_rate_mpc, fgpc.m_avg_cpc, fgpc.m_max_cpc,
> fgpc.m_firstpage_cpc, fgpc.m_topofpage_cpc, f
> gpc.m_avg_cpm, fgpc.m_max_cpm, fgpc.m_max_cpa_pct, fgpc.m_avg_pos,
> fgpc.m_lowest_pos, fgpc.m_highest_pos, fgpc.m_quality_score,
> fgpc.m_view_thru_conv FROM ONLY bidw.fact_msn_kw_perf_daily fgpc WHERE
> true) l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12,
> a_13, a_14, a_15, a_16, a_17, a_18, a_19,
> a_20, a_21, a_22) LEFT JOIN (SELECT fgcd.date_id, fgcd.m_revenue,
> fgcd.m_conversions, fgcd.m_ad_grp_pub_key, fgcd.m_kw_pub_key FROM ONLY
> bidw.fact_msn_kw_conversion_daily fgcd WHERE true) r(a_1, a_2, a_3, a_4,
> a_5) ON (((l.a_1 = l.a_1) AND (l.a_2 = r.a_4) AND (l.a_3 = r.a_5)))) WHERE
> ((COALESCE((l.a_1)::bigint,
>  r.a_1) >= 20131201) AND (COALESCE((l.a_1)::bigint, r.a_1) <= 20140119)))
> l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13,
> a_14, a_15, a_16, a_17, a_18, a_19, a_20, a_21, a_22, a_23, a_24, a_25)
> JOIN (SELECT kws.expr_names, kws.expr_values, kws.m_ad_grp_semid,
> kws.m_new_kw_bid, kws.m_kw_bid,
> kws.m_kw_pub_key, kws.m_ad_grp_pub_key FROM ONLY biods.msn_keyword_sup
> kws WHERE true) r(a_1, a_2, a_3, a_4, a_5, a_6, a_7) ON (true)) WHERE
> ((l.a_2 = r.a_7) AND (l.a_3 = r.a_6)) GROUP BY 1, 2, 3, 4, 5, 6
> * Total runtime: 8378.080 ms*
> (5 rows)
>
>
>
> *This is the actual time taken by the query:*
>
> [postgres@sv4-pgxc-db04 test]$ time psql -d mydb -f "test.sql" > a.out
>
> *real 0m23.533s*
> user 0m15.705s
> sys 0m0.748s
>
> Now I don't know why it is taking that much time.
>

Try adding LIMIT with different row counts, for example, to see how that
affects the elapsed time.

Also, try enabling statement logging (log_statement = all in
postgresql.conf) on the data nodes to see how long it takes on each node.

Also, the statement was rewritten in XC, with relations converted into
SELECTs, so try running the rewritten version directly to see how long it
takes.

Thanks,

Mason

Re: [Postgres-xc-general] Looks like co-ordinator is my perf bottleneck....somebody help...

From: Nirmal S. <sha...@gm...> - 2014-02-03 22:07:52

Hi All,

I enabled log_statement on the coordinator and on all the data nodes, and I
got this:

--This is the coordinator log
*LOG:  duration: 8807.961 ms*  statement: select
coalesce(fgpc.date_id,fgcd.date_id) date_id,
                                         fgpc.m_ad_grp_pub_key
m_ad_grp_pub_key,
                                         fgpc.m_kw_pub_key m_kw_pub_key,
                                         kws.expr_names,
                                         kws.expr_values,
                                         kws.m_ad_grp_semid,
                                         sum(fgpc.m_imps) m_imps,
                                         sum(fgpc.m_clicks) m_clicks,
                                         sum(fgpc.m_cost) m_cost,

                                         sum(fgpc.m_conv_1pc) m_conv_1pc,
                                         sum(fgpc.m_conv_mpc) m_conv_mpc,
                                         avg(fgpc.m_cnv_rate_1pc)
m_cnv_rate_1pc,
                                         avg(fgpc.m_cnv_rate_mpc)
m_cnv_rate_mpc,
                                         avg(fgpc.m_avg_cpc) m_avg_cpc,
                                         avg(fgpc.m_max_cpc) m_max_cpc,
                                         avg(fgpc.m_firstpage_cpc)
m_firstpage_cpc,
                                         avg(fgpc.m_topofpage_cpc)
m_topofpage_cpc,
                                         avg(fgpc.m_avg_cpm) m_avg_cpm,
                                         avg(fgpc.m_max_cpm) m_max_cpm,
                                         avg(fgpc.m_max_cpa_pct)
m_max_cpa_pct,
                                         avg(fgpc.m_avg_pos) m_avg_pos,
                                         avg(fgpc.m_lowest_pos)
m_lowest_pos,
                                         avg(fgpc.m_highest_pos)
m_highest_pos,
                                         avg(fgpc.m_quality_score)
m_quality_score,
                                         avg(fgpc.m_view_thru_conv)
m_view_thru_conv,
                                         sum(fgcd.m_revenue) m_revenue,
                                         sum(fgcd.m_conversions)
m_conversions,
                                         sum(coalesce(kws.m_new_kw_bid,
kws.m_kw_bid)) m_kw_total_bid,
                                         max(coalesce(kws.m_new_kw_bid,
kws.m_kw_bid)) m_kw_max_bid,
                                         min(coalesce(kws.m_new_kw_bid,
kws.m_kw_bid)) m_kw_min_bid
                                from
                                     bidw.fact_msn_kw_perf_daily  fgpc
                     full outer join bidw.fact_msn_kw_conversion_daily fgcd
 on fgpc.m_ad_grp_pub_key = fgcd.m_ad_grp_pub_key and fgpc.m_kw_pub_key =
fgcd.m_kw_pub_key and fgpc.date_id = fgpc.date_id
                     join            biods.msn_keyword_sup kws
    on   fgpc.m_kw_pub_key = kws.m_kw_pub_key and fgpc.m_ad_grp_pub_key =
kws.m_ad_grp_pub_key
                     where
                               coalesce(fgpc.date_id,fgcd.date_id)  between
20131201 and 20140119
                    group by

 coalesce(fgpc.date_id,fgcd.date_id),fgpc.m_ad_grp_pub_key,fgpc.m_kw_pub_key,kws.expr_names,kws.expr_values,kws.m_ad_grp_semid


*And this is the log info from the data nodes' log files:*

*LOG:  duration: 8387.136 ms*  statement: SELECT COALESCE((l.a_1)::bigint,
l.a_23), l.a_2, l.a_3, r.a_1, r.a_2, r.a_3, sum(l.a_4), sum(l.a_5),
sum(l.a_6), sum(l.a_7), sum(l.a_8), pg_catalog.numeric_avg(avg(l.a_9)),
pg_catalog.numeric_avg(avg(l.a_10)), pg_catalog.numeric_avg(avg(l.a_11)),
pg_catalog.numeric_avg(avg(l.a_12)), pg_catalog.numeric_avg(avg(l.a_13)),
pg_catalog.numeric_avg(avg(l.a_14)), pg_catalog.numeric_avg(avg(l.a_15)),
pg_catalog.numeric_avg(avg(l.a_16)), pg_catalog.numeric_avg(avg(l.a_17)),
pg_catalog.numeric_avg(avg(l.a_18)), pg_catalog.numeric_avg(avg(l.a_19)),
pg_catalog.numeric_avg(avg(l.a_20)), pg_catalog.numeric_avg(avg(l.a_21)),
pg_catalog.numeric_avg(avg(l.a_22)), sum(l.a_24), sum(l.a_25),
sum(COALESCE(r.a_4, r.a_5)), max(COALESCE(r.a_4, r.a_5)),
min(COALESCE(r.a_4, r.a_5)) FROM ((SELECT l.a_1, l.a_2, l.a_3, l.a_4,
l.a_5, l.a_6, l.a_7, l.a_8, l.a_9, l.a_10, l.a_11, l.a_12, l.a_13, l.a_14,
l.a_15, l.a_16, l.a_17, l.a_18, l.a_19, l.a_20, l.a_21, l.a_22, r.a_1,
r.a_2, r.a_3 FROM ((SELECT fgpc.date_id, fgpc.m_ad_grp_pub_key,
fgpc.m_kw_pub_key, fgpc.m_imps, fgpc.m_clicks, fgpc.m_cost,
fgpc.m_conv_1pc, fgpc.m_conv_mpc, fgpc.m_cnv_rate_1pc, fgpc.m_cnv_rate_mpc,
fgpc.m_avg_cpc, fgpc.m_max_cpc, fgpc.m_firstpage_cpc, fgpc.m_topofpage_cpc,
fgpc.m_avg_cpm, fgpc.m_max_cpm, fgpc.m_max_cpa_pct, fgpc.m_avg_pos,
fgpc.m_lowest_pos, fgpc.m_highest_pos, fgpc.m_quality_score,
fgpc.m_view_thru_conv FROM ONLY bidw.fact_msn_kw_perf_daily fgpc WHERE
true) l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12,
a_13, a_14, a_15, a_16, a_17, a_18, a_19, a_20, a_21, a_22) LEFT JOIN
(SELECT fgcd.date_id, fgcd.m_revenue, fgcd.m_conversions,
fgcd.m_ad_grp_pub_key, fgcd.m_kw_pub_key FROM ONLY
bidw.fact_msn_kw_conversion_daily fgcd WHERE true) r(a_1, a_2, a_3, a_4,
a_5) ON (((l.a_1 = l.a_1) AND (l.a_2 = r.a_4) AND (l.a_3 = r.a_5)))) WHERE
((COALESCE((l.a_1)::bigint, r.a_1) >= 20131201) AND
(COALESCE((l.a_1)::bigint, r.a_1) <= 20140119))) l(a_1, a_2, a_3, a_4, a_5,
a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13, a_14, a_15, a_16, a_17, a_18,
a_19, a_20, a_21, a_22, a_23, a_24, a_25) JOIN (SELECT kws.expr_names,
kws.expr_values, kws.m_ad_grp_semid, kws.m_new_kw_bid, kws.m_kw_bid,
kws.m_kw_pub_key, kws.m_ad_grp_pub_key FROM ONLY biods.msn_keyword_sup kws
WHERE true) r(a_1, a_2, a_3, a_4, a_5, a_6, a_7) ON (true)) WHERE ((l.a_2 =
r.a_7) AND (l.a_3 = r.a_6)) GROUP BY 1, 2, 3, 4, 5, 6


So according to the query logs everything looks fine, i.e. the coordinator is
working the way it should.


*But then why does the statement below take 23 sec? (test.sql contains the
same query that is shown above.)*

[postgres@sv4-pgxc-db04 test]$ time psql -d adchemy11100 -f "test.sql" >
/dev/null

*real 0m23.394s*
user 0m15.900s
sys 0m0.645s


Please advise.

Nirmal
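
[Editorial sketch] One way to read the numbers in the message above is to
separate what `time` charges to the psql client from what the servers logged.
The `user` and `sys` figures printed by `time` are CPU seconds consumed by the
psql process itself; the coordinator and data nodes run as separate server
processes and are not included in them. The small Python sketch below only
redoes that arithmetic on the figures posted above. The variable and function
names are made up for illustration, and the result is an accounting of the
posted numbers, not a diagnosis.

```python
# Figures copied from the message above; nothing here is measured anew.
coordinator_ms = 8807.961   # "duration" logged on the coordinator
datanode_ms = 8387.136      # "duration" logged on the data nodes
real_s, user_s, sys_s = 23.394, 15.900, 0.645  # output of `time psql ...`


def client_cpu(user: float, sys_cpu: float) -> float:
    """CPU seconds charged to the psql process (user + sys)."""
    return user + sys_cpu


def waiting_time(real: float, user: float, sys_cpu: float) -> float:
    """Wall-clock seconds during which psql was not using the CPU."""
    return real - client_cpu(user, sys_cpu)


cpu = client_cpu(user_s, sys_s)
rest = waiting_time(real_s, user_s, sys_s)
coordinator_overhead_ms = coordinator_ms - datanode_ms

print(f"psql CPU time:           {cpu:.3f} s ({cpu / real_s:.0%} of wall clock)")
print(f"wall clock minus psql:   {rest:.3f} s")
print(f"coordinator - data node: {coordinator_overhead_ms:.3f} ms")
```

Read this way, roughly 16.5 s of the 23.4 s wall clock is CPU time inside
psql, while the coordinator's logged duration is only about 0.4 s longer than
the data nodes'. If that reading holds, one possible check (an assumption, not
something tried in this thread) would be to rerun the same command with
psql's unaligned output mode (`-A`) and compare the `user` time.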




Re: [Postgres-xc-general] Looks like co-ordinator is my perf bottleneck....somebody help...

From: Nirmal S. <sha...@gm...> - 2014-02-01 23:27:46

Hi,

These are the timings after adding LIMIT with different row counts.
From these timings you can see that the bottleneck here is the coordinator
(i.e. retrieving data from the various nodes to the coordinator).
I just want to ask whether that is normal or not.

---for limit 1000
[postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out

real 0m1.935s
user 0m0.051s
sys 0m0.002s

---for limit 10000
[postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out

real 0m2.724s
user 0m0.481s
sys 0m0.023s

--for limit 100000
[postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out

real 0m12.102s
user 0m3.139s
sys 0m0.146s

--for limit 200000
[postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out

real 0m13.078s
user 0m5.507s
sys 0m0.316s

---for limit 400000
[postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out

real 0m18.820s
user 0m10.482s
sys 0m0.659s

---for limit 600000
[postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out

real 0m23.478s
user 0m15.631s
sys 0m0.940s
[postgres@sv4-pgxc-db04 test]$


I will also enable statement logging, try again, and send the output
soon.

Nirmal
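
[Editorial sketch] The LIMIT timings above can be broken down the same way for
every run: `user + sys` is CPU time spent in the psql client, and the rest of
`real` is time psql spent waiting (for the servers, the network, or the disk).
The Python sketch below tabulates that split and the client CPU cost per
returned row. It uses only the figures posted above; the names `runs` and
`breakdown` are hypothetical, and the split is an interpretation of the `time`
output rather than a measurement of the coordinator.

```python
# (LIMIT, real, user, sys) in seconds, copied from the timings above.
runs = [
    (1_000, 1.935, 0.051, 0.002),
    (10_000, 2.724, 0.481, 0.023),
    (100_000, 12.102, 3.139, 0.146),
    (200_000, 13.078, 5.507, 0.316),
    (400_000, 18.820, 10.482, 0.659),
    (600_000, 23.478, 15.631, 0.940),
]


def breakdown(run):
    """Split one run into psql CPU time and time psql spent waiting."""
    limit, real, user, sys_cpu = run
    cpu = user + sys_cpu
    return {
        "limit": limit,
        "client_cpu": cpu,                # CPU seconds inside psql
        "waiting": real - cpu,            # wall clock not spent on psql CPU
        "us_per_row": cpu / limit * 1e6,  # client CPU microseconds per row
    }


rows = [breakdown(r) for r in runs]

print(f"{'limit':>8} {'client_cpu':>11} {'waiting':>8} {'us/row':>7}")
for row in rows:
    print(f"{row['limit']:>8} {row['client_cpu']:>11.3f} "
          f"{row['waiting']:>8.3f} {row['us_per_row']:>7.1f}")
```

On these figures the waiting part stays below 9 s for every run, while the
client CPU part grows with the number of rows returned. That pattern would be
consistent with much of the extra time being spent by psql handling the result
set, though the thread itself does not confirm this.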



Re: [Postgres-xc-general] Looks like co-ordinator is my perf bottleneck....somebody help...

From: Koichi S. <koi...@gm...> - 2014-02-02 03:44:48

Could you share the "explain" output so that we can see how the plan works?

Regards;
---
Koichi Suzuki


2014-02-02 Nirmal Sharma <sha...@gm...>:
> Hi,
>
> These are the timings for adding limit with different amount.
> With these timings you can see see that here the bottleneck is coordinator
> (i.e. retrieving data from various nodes to coordinator ).
> I just want to ask whether its normal or not?
>
> ---for limit 1000
> [postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out
>
> real 0m1.935s
> user 0m0.051s
> sys 0m0.002s
>
> ---for limit 10000
> [postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out
>
> real 0m2.724s
> user 0m0.481s
> sys 0m0.023s
>
> --for limit 100000
> [postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out
>
> real 0m12.102s
> user 0m3.139s
> sys 0m0.146s
>
> --for limit 200000
> [postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out
>
> real 0m13.078s
> user 0m5.507s
> sys 0m0.316s
>
> ---for limit 400000
> [postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out
>
> real 0m18.820s
> user 0m10.482s
> sys 0m0.659s
>
> ---for limit 600000
> [postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out
>
> real 0m23.478s
> user 0m15.631s
> sys 0m0.940s
> [postgres@sv4-pgxc-db04 test]$
>
>
> I will also enable the statement log and try again and will send the output
> soon.
>
> Nirmal
>
>
> On Sat, Feb 1, 2014 at 1:47 PM, Mason Sharp <ms...@tr...> wrote:
>>
>>
>>
>>
>> On Sat, Feb 1, 2014 at 2:40 PM, Nirmal Sharma <sha...@gm...>
>> wrote:
>>>
>>> Hi Mason,
>>>
>>> This is the actual query that i was running.
>>>
>>> select                           coalesce(fgpc.date_id,fgcd.date_id)
>>> date_id,
>>>                                  fgpc.m_ad_grp_pub_key m_ad_grp_pub_key,
>>>                                  fgpc.m_kw_pub_key m_kw_pub_key,
>>>                                  kws.expr_names,
>>>                                  kws.expr_values,
>>>                                  kws.m_ad_grp_semid,
>>>                                  sum(fgpc.m_imps) m_imps,
>>>                                  sum(fgpc.m_clicks) m_clicks,
>>>                                  sum(fgpc.m_cost) m_cost,
>>>                                  sum(fgpc.m_conv_1pc) m_conv_1pc,
>>>                                  sum(fgpc.m_conv_mpc) m_conv_mpc,
>>>                                  avg(fgpc.m_cnv_rate_1pc) m_cnv_rate_1pc,
>>>                                  avg(fgpc.m_cnv_rate_mpc) m_cnv_rate_mpc,
>>>                                  avg(fgpc.m_avg_cpc) m_avg_cpc,
>>>                                  avg(fgpc.m_max_cpc) m_max_cpc,
>>>                                  avg(fgpc.m_firstpage_cpc)
>>> m_firstpage_cpc,
>>>                                  avg(fgpc.m_topofpage_cpc)
>>> m_topofpage_cpc,
>>>                                  avg(fgpc.m_avg_cpm) m_avg_cpm,
>>>                                  avg(fgpc.m_max_cpm) m_max_cpm,
>>>                                  avg(fgpc.m_max_cpa_pct) m_max_cpa_pct,
>>>                                  avg(fgpc.m_avg_pos) m_avg_pos,
>>>                                  avg(fgpc.m_lowest_pos) m_lowest_pos,
>>>                                  avg(fgpc.m_highest_pos) m_highest_pos,
>>>                                  avg(fgpc.m_quality_score)
>>> m_quality_score,
>>>                                  avg(fgpc.m_view_thru_conv)
>>> m_view_thru_conv,
>>>                                  sum(fgcd.m_revenue) m_revenue,
>>>                                  sum(fgcd.m_conversions) m_conversions,
>>>                                  sum(coalesce(kws.m_new_kw_bid,
>>> kws.m_kw_bid)) m_kw_total_bid,
>>>                                  max(coalesce(kws.m_new_kw_bid,
>>> kws.m_kw_bid)) m_kw_max_bid,
>>>                                  min(coalesce(kws.m_new_kw_bid,
>>> kws.m_kw_bid)) m_kw_min_bid
>>>                         from
>>>                              bidw.fact_msn_kw_perf_daily  fgpc
>>>              full outer join bidw.fact_msn_kw_conversion_daily fgcd  on
>>> fgpc.m_ad_grp_pub_key = fgcd.m_ad_grp_pub_key and fgpc.m_kw_pub_key =
>>> fgcd.m_kw_pub_key and fgpc.date_id = fgpc.date_id
>>>              join            biods.msn_keyword_sup kws                 on
>>> fgpc.m_kw_pub_key = kws.m_kw_pub_key and fgpc.m_ad_grp_pub_key =
>>> kws.m_ad_grp_pub_key
>>>              where
>>>                        coalesce(fgpc.date_id,fgcd.date_id)  between
>>> 20131201 and 20140119
>>>             group by
>>>
>>> coalesce(fgpc.date_id,fgcd.date_id),fgpc.m_ad_grp_pub_key,fgpc.m_kw_pub_key,kws.expr_names,kws.expr_values,kws.m_ad_grp_semid
>>> ;
>>>
>>> This is the explain plan for the same.
>>> explain analyze verbose select ......
>>> ....
>>>
>>>  Data Node Scan on "__REMOTE_GROUP_QUERY__"  (cost=0.00..2.50 rows=1000
>>> width=908) (actual time=1672.149..8281.918 rows=605575 loops=1)
>>>    Output: (COALESCE((fgpc.date_id)::bigint, fgcd.date_id)),
>>> fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, kws.expr_names, kws.expr_values,
>>> kws.m_ad_grp_semid, (sum(fgpc.m_imps)), (sum(fgpc.m_clicks)),
>>> (sum(fgpc.m_cost)), (sum(fgpc.m_conv_1pc)), (sum(fgpc.m_conv_mpc)),
>>> (avg(fgpc.m_cnv_rate_1pc)), (avg(fgpc.m_cnv_ra
>>> te_mpc)), (avg(fgpc.m_avg_cpc)), (avg(fgpc.m_max_cpc)),
>>> (avg(fgpc.m_firstpage_cpc)), (avg(fgpc.m_topofpage_cpc)),
>>> (avg(fgpc.m_avg_cpm)), (avg(fgpc.m_max_cpm)), (avg(fgpc.m_max_cpa_pct)),
>>> (avg(fgpc.m_avg_pos)), (avg(fgpc.m_lowest_pos)), (avg(fgpc.m_highest_pos)),
>>> (avg(fgpc.m_quality_score)), (avg(fgpc.m_view_thr
>>> u_conv)), (sum(fgcd.m_revenue)), (sum(fgcd.m_conversions)),
>>> (sum(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))),
>>> (max(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))),
>>> (min(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid)))
>>>    Node/s: d11, d12, d13, d14, d15, d16
>>>    Remote query: SELECT COALESCE((l.a_1)::bigint, l.a_23), l.a_2, l.a_3,
>>> r.a_1, r.a_2, r.a_3, sum(l.a_4), sum(l.a_5), sum(l.a_6), sum(l.a_7),
>>> sum(l.a_8), pg_catalog.numeric_avg(avg(l.a_9)),
>>> pg_catalog.numeric_avg(avg(l.a_10)), pg_catalog.numeric_avg(avg(l.a_11)),
>>> pg_catalog.numeric_avg(avg(l.a_12)), pg_catalog.
>>> numeric_avg(avg(l.a_13)), pg_catalog.numeric_avg(avg(l.a_14)),
>>> pg_catalog.numeric_avg(avg(l.a_15)), pg_catalog.numeric_avg(avg(l.a_16)),
>>> pg_catalog.numeric_avg(avg(l.a_17)), pg_catalog.numeric_avg(avg(l.a_18)),
>>> pg_catalog.numeric_avg(avg(l.a_19)), pg_catalog.numeric_avg(avg(l.a_20)),
>>> pg_catalog.numeric_avg(avg(
>>> l.a_21)), pg_catalog.numeric_avg(avg(l.a_22)), sum(l.a_24), sum(l.a_25),
>>> sum(COALESCE(r.a_4, r.a_5)), max(COALESCE(r.a_4, r.a_5)),
>>> min(COALESCE(r.a_4, r.a_5)) FROM ((SELECT l.a_1, l.a_2, l.a_3, l.a_4, l.a_5,
>>> l.a_6, l.a_7, l.a_8, l.a_9, l.a_10, l.a_11, l.a_12, l.a_13, l.a_14, l.a_15,
>>> l.a_16, l.a_17, l.a_18, l.a_
>>> 19, l.a_20, l.a_21, l.a_22, r.a_1, r.a_2, r.a_3 FROM ((SELECT
>>> fgpc.date_id, fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, fgpc.m_imps,
>>> fgpc.m_clicks, fgpc.m_cost, fgpc.m_conv_1pc, fgpc.m_conv_mpc,
>>> fgpc.m_cnv_rate_1pc, fgpc.m_cnv_rate_mpc, fgpc.m_avg_cpc, fgpc.m_max_cpc,
>>> fgpc.m_firstpage_cpc, fgpc.m_topofpage_cpc, f
>>> gpc.m_avg_cpm, fgpc.m_max_cpm, fgpc.m_max_cpa_pct, fgpc.m_avg_pos,
>>> fgpc.m_lowest_pos, fgpc.m_highest_pos, fgpc.m_quality_score,
>>> fgpc.m_view_thru_conv FROM ONLY bidw.fact_msn_kw_perf_daily fgpc WHERE true)
>>> l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13, a_14,
>>> a_15, a_16, a_17, a_18, a_19,
>>> a_20, a_21, a_22) LEFT JOIN (SELECT fgcd.date_id, fgcd.m_revenue,
>>> fgcd.m_conversions, fgcd.m_ad_grp_pub_key, fgcd.m_kw_pub_key FROM ONLY
>>> bidw.fact_msn_kw_conversion_daily fgcd WHERE true) r(a_1, a_2, a_3, a_4,
>>> a_5) ON (((l.a_1 = l.a_1) AND (l.a_2 = r.a_4) AND (l.a_3 = r.a_5)))) WHERE
>>> ((COALESCE((l.a_1)::bigint,
>>>  r.a_1) >= 20131201) AND (COALESCE((l.a_1)::bigint, r.a_1) <= 20140119)))
>>> l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13, a_14,
>>> a_15, a_16, a_17, a_18, a_19, a_20, a_21, a_22, a_23, a_24, a_25) JOIN
>>> (SELECT kws.expr_names, kws.expr_values, kws.m_ad_grp_semid,
>>> kws.m_new_kw_bid, kws.m_kw_bid,
>>> kws.m_kw_pub_key, kws.m_ad_grp_pub_key FROM ONLY biods.msn_keyword_sup
>>> kws WHERE true) r(a_1, a_2, a_3, a_4, a_5, a_6, a_7) ON (true)) WHERE
>>> ((l.a_2 = r.a_7) AND (l.a_3 = r.a_6)) GROUP BY 1, 2, 3, 4, 5, 6
>>>  Total runtime: 8378.080 ms
>>> (5 rows)
>>>
>>>
>>>
>>> This is the actual time taken by the query:
>>>
>>> [postgres@sv4-pgxc-db04 test]$ time psql -d mydb -f "test.sql" > a.out
>>>
>>> real 0m23.533s
>>> user 0m15.705s
>>> sys 0m0.748s
>>>
>>> Now I don't know why it is taking that much time.
>>
>>
>> Try adding LIMIT with different amounts for example to see how that
>> impacts time.
>>
>> Also, try enabling statement logging (log_statement = all in
>> postgresql.conf) on the data nodes to see how long it takes on each node.
>>
>> Also, the statement was rewritten in XC, with relations converted into
>> SELECTs, so try running the rewritten version directly to see how long it
>> takes.
>>
>> Thanks,
>>
>> Mason
>>
>

Re: [Postgres-xc-general] Looks like co-ordinator is my perf bottleneck....somebody help...

From: Nirmal S. <sha...@gm...> - 2014-02-02 05:49:44

This is the explain plan for the query with limit 10000.


Limit  (cost=0.00..2.50 rows=1000 width=908) (actual
time=1586.926..1836.081 rows=10000 loops=1)
   Output: (COALESCE((fgpc.date_id)::bigint, fgcd.date_id)),
fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, kws.expr_names, kws.expr_values,
kws.m_ad_grp_semid, (sum(fgpc.m_imps)), (sum(fgpc.m_clicks)),
(sum(fgpc.m_cost)), (sum(fgpc.m_conv_1pc)), (sum(fgpc.m_conv_mpc)),
(avg(fgpc.m_cnv_rate_1pc)), (avg(fgpc.m_cnv_rate_mpc)), (avg(fgpc.m_avg_cpc)),
(avg(fgpc.m_max_cpc)), (avg(fgpc.m_firstpage_cpc)),
(avg(fgpc.m_topofpage_cpc)), (avg(fgpc.m_avg_cpm)), (avg(fgpc.m_max_cpm)),
(avg(fgpc.m_max_cpa_pct)), (avg(fgpc.m_avg_pos)), (avg(fgpc.m_lowest_pos)),
(avg(fgpc.m_highest_pos)), (avg(fgpc.m_quality_score)),
(avg(fgpc.m_view_thru_conv)), (sum(fgcd.m_revenue)),
(sum(fgcd.m_conversions)), (sum(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))),
(max(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))),
(min(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid)))
   ->  Data Node Scan on "__REMOTE_LIMIT_QUERY__"  (cost=0.00..2.50
rows=1000 width=908) (actual time=1586.924..1834.118 rows=10000 loops=1)
         Output: (COALESCE((fgpc.date_id)::bigint, fgcd.date_id)),
fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, kws.expr_names, kws.expr_values,
kws.m_ad_grp_semid, (sum(fgpc.m_imps)), (sum(fgpc.m_clicks)),
(sum(fgpc.m_cost)), (sum(fgpc.m_conv_1pc)), (sum(fgpc.m_conv_mpc)),
(avg(fgpc.m_cnv_rate_1pc)), (avg(fgpc.m_cnv_rate_mpc)),
(avg(fgpc.m_avg_cpc)), (avg(fgpc.m_max_cpc)), (avg(fgpc.m_firstpage_cpc)),
(avg(fgpc.m_topofpage_cpc)), (avg(fgpc.m_avg_cpm)), (avg(fgpc.m_max_cpm)),
(avg(fgpc.m_max_cpa_pct)), (avg(fgpc.m_avg_pos)), (avg(fgpc.m_lowest_pos)),
(avg(fgpc.m_highest_pos)), (avg(fgpc.m_quality_score)),
(avg(fgpc.m_view_thru_conv)), (sum(fgcd.m_revenue)),
(sum(fgcd.m_conversions)), (sum(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))),
(max(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))),
(min(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid)))
         Node/s: d11, d12, d13, d14, d15, d16
         Remote query: SELECT COALESCE((l.a_1)::bigint, l.a_23), l.a_2,
l.a_3, r.a_1, r.a_2, r.a_3, sum(l.a_4), sum(l.a_5), sum(l.a_6), sum(l.a_7),
sum(l.a_8), pg_catalog.numeric_avg(avg(l.a_9)),
pg_catalog.numeric_avg(avg(l.a_10)), pg_catalog.numeric_avg(avg(l.a_11)),
pg_catalog.numeric_avg(avg(l.a_12)), pg_catalog.numeric_avg(avg(l.a_13)),
pg_catalog.numeric_avg(avg(l.a_14)), pg_catalog.numeric_avg(avg(l.a_15)),
pg_catalog.numeric_avg(avg(l.a_16)), pg_catalog.numeric_avg(avg(l.a_17)),
pg_catalog.numeric_avg(avg(l.a_18)), pg_catalog.numeric_avg(avg(l.a_19)),
pg_catalog.numeric_avg(avg(l.a_20)),
pg_catalog.numeric_avg(avg(l.a_21)), pg_catalog.numeric_avg(avg(l.a_22)),
sum(l.a_24), sum(l.a_25), sum(COALESCE(r.a_4, r.a_5)), max(COALESCE(r.a_4,
r.a_5)), min(COALESCE(r.a_4, r.a_5)) FROM ((SELECT l.a_1, l.a_2,
 l.a_3, l.a_4, l.a_5, l.a_6, l.a_7, l.a_8, l.a_9, l.a_10, l.a_11, l.a_12,
l.a_13, l.a_14, l.a_15, l.a_16, l.a_17, l.a_18, l.a_19, l.a_20, l.a_21,
l.a_22, r.a_1, r.a_2, r.a_3 FROM ((SELECT fgpc.date_id,
fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, fgpc.m_imps, fgpc.m_clicks,
fgpc.m_cost, fgpc.m_conv_1pc, fgpc.m_conv_mpc, fgpc.m_cnv_rate_1pc,
fgpc.m_cnv_rate_mpc, fgpc.m_avg_cpc, fgpc.m_max_cpc, fgpc.m_firstpage_cpc,
fgpc.m_topofpage_cpc, fgpc.m_avg_cpm, fgpc.m_max_cpm, fgpc.m_max_cpa_pct,
fgpc.m_avg_pos, fgpc.m_lowest_pos, fgpc.m_highest_pos,
fgpc.m_quality_score, fgpc.m_view_thru_conv FROM ONLY
bidw.fact_msn_kw_perf_daily fgpc WHERE true) l(a_1, a_2, a_3, a_4, a_5,
a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13, a_14, a_15, a_16, a_17, a_18,
a_19, a_20, a_21, a_22) LEFT JOIN (SELECT fgcd.date_id, fgcd.m_revenue,
fgcd.m_conversions, fgcd.m_ad_grp_pub_key, fgcd.m_kw_pub_key FROM ONLY
bidw.fact_msn_kw_conversion_daily fgcd WHERE true) r(a_1, a_2, a_3, a_4,
a_5) ON (((l.a_1 = l.a_1) AND (l.a_2 = r.a_4) AND (l.a_3 = r.a_5)))) WHERE
((COALESCE((l.a_1)::bigint, r.a_1) >= 20131201) AND (
COALESCE((l.a_1)::bigint, r.a_1) <= 20140119))) l(a_1, a_2, a_3, a_4, a_5,
a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13, a_14, a_15, a_16, a_17, a_18,
a_19, a_20, a_21, a_22, a_23, a_24, a_25) JOIN (SELECT kws.expr_names,
kws.expr_values, kws.m_ad_grp_semid, kws.m_new_kw_bid, kws.m_kw_bid,
kws.m_kw_pub_key, kws.m_ad_grp_pub_key FROM ONLY
biods.msn_keyword_sup kws WHERE true) r(a_1, a_2, a_3, a_4, a_5, a_6, a_7)
ON (true)) WHERE ((l.a_2 = r.a_7) AND (l.a_3 = r.a_6)) GROUP BY 1, 2, 3, 4,
5, 6 LIMIT 10000::bigint
 Total runtime: 2194.762 ms
(7 rows)
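
As a rough cross-check of where the wall-clock time goes, the sketch below
tabulates the `time psql` figures quoted further down in this message. The
`user` and `sys` values reported by `time` are CPU time consumed by the psql
process itself (the coordinator and datanode backends are separate processes
and are not counted), so their share of `real` is a hint at how much of the
elapsed time is client-side work such as formatting and writing the result.
The function names are illustrative only, and the interpretation is a
hypothesis to test (for example by rerunning with unaligned output,
`psql -A`), not a diagnosis.

```python
# (limit, real_s, user_s, sys_s) copied from the `time psql` runs quoted in this thread.
TIMINGS = [
    (1000, 1.935, 0.051, 0.002),
    (10000, 2.724, 0.481, 0.023),
    (100000, 12.102, 3.139, 0.146),
    (200000, 13.078, 5.507, 0.316),
    (400000, 18.820, 10.482, 0.659),
    (600000, 23.478, 15.631, 0.940),
]


def client_cpu_share(real_s, user_s, sys_s):
    """Fraction of wall-clock time that was CPU time of the psql process."""
    return (user_s + sys_s) / real_s


def summarize(timings):
    """Return (limit, real seconds, client CPU share rounded to 2 places) per run."""
    return [
        (limit, real_s, round(client_cpu_share(real_s, user_s, sys_s), 2))
        for limit, real_s, user_s, sys_s in timings
    ]


if __name__ == "__main__":
    for limit, real_s, share in summarize(TIMINGS):
        print(f"LIMIT {limit:>6}: real {real_s:7.3f}s, psql CPU share {share:.2f}")
```

If the share keeps growing with the row count, part of the gap between the
EXPLAIN ANALYZE runtime and the `time psql` runtime may lie in the client
rather than in the coordinator.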



On Sat, Feb 1, 2014 at 7:44 PM, Koichi Suzuki <koi...@gm...> wrote:

> Could you share the "explain" result so we can see how the plan works?
>
> Regards;
> ---
> Koichi Suzuki
>
>
> 2014-02-02 Nirmal Sharma <sha...@gm...>:
> > Hi,
> >
> > These are the timings for adding LIMIT with different amounts.
> > From these timings you can see that the bottleneck here is the coordinator
> > (i.e. retrieving data from the various nodes to the coordinator).
> > I just want to ask whether this is normal or not.
> >
> > ---for limit 1000
> > [postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out
> >
> > real 0m1.935s
> > user 0m0.051s
> > sys 0m0.002s
> >
> > ---for limit 10000
> > [postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out
> >
> > real 0m2.724s
> > user 0m0.481s
> > sys 0m0.023s
> >
> > --for limit 100000
> > [postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out
> >
> > real 0m12.102s
> > user 0m3.139s
> > sys 0m0.146s
> >
> > --for limit 200000
> > [postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out
> >
> > real 0m13.078s
> > user 0m5.507s
> > sys 0m0.316s
> >
> > ---for limit 400000
> > [postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out
> >
> > real 0m18.820s
> > user 0m10.482s
> > sys 0m0.659s
> >
> > ---for limit 600000
> > [postgres@sv4-pgxc-db04 test]$ time psql -d myDB -f "test.sql" > a.out
> >
> > real 0m23.478s
> > user 0m15.631s
> > sys 0m0.940s
> > [postgres@sv4-pgxc-db04 test]$
> >
> >
> > I will also enable the statement log and try again and will send the
> > output soon.
> >
> > Nirmal
> >
> >
> > On Sat, Feb 1, 2014 at 1:47 PM, Mason Sharp <ms...@tr...> wrote:
> >>
> >>
> >>
> >>
> >> On Sat, Feb 1, 2014 at 2:40 PM, Nirmal Sharma <sha...@gm...>
> >> wrote:
> >>>
> >>> Hi Mason,
> >>>
> >>> This is the actual query that i was running.
> >>>
> >>> select                           coalesce(fgpc.date_id,fgcd.date_id)
> >>> date_id,
> >>>                                  fgpc.m_ad_grp_pub_key
> m_ad_grp_pub_key,
> >>>                                  fgpc.m_kw_pub_key m_kw_pub_key,
> >>>                                  kws.expr_names,
> >>>                                  kws.expr_values,
> >>>                                  kws.m_ad_grp_semid,
> >>>                                  sum(fgpc.m_imps) m_imps,
> >>>                                  sum(fgpc.m_clicks) m_clicks,
> >>>                                  sum(fgpc.m_cost) m_cost,
> >>>                                  sum(fgpc.m_conv_1pc) m_conv_1pc,
> >>>                                  sum(fgpc.m_conv_mpc) m_conv_mpc,
> >>>                                  avg(fgpc.m_cnv_rate_1pc)
> m_cnv_rate_1pc,
> >>>                                  avg(fgpc.m_cnv_rate_mpc)
> m_cnv_rate_mpc,
> >>>                                  avg(fgpc.m_avg_cpc) m_avg_cpc,
> >>>                                  avg(fgpc.m_max_cpc) m_max_cpc,
> >>>                                  avg(fgpc.m_firstpage_cpc)
> >>> m_firstpage_cpc,
> >>>                                  avg(fgpc.m_topofpage_cpc)
> >>> m_topofpage_cpc,
> >>>                                  avg(fgpc.m_avg_cpm) m_avg_cpm,
> >>>                                  avg(fgpc.m_max_cpm) m_max_cpm,
> >>>                                  avg(fgpc.m_max_cpa_pct) m_max_cpa_pct,
> >>>                                  avg(fgpc.m_avg_pos) m_avg_pos,
> >>>                                  avg(fgpc.m_lowest_pos) m_lowest_pos,
> >>>                                  avg(fgpc.m_highest_pos) m_highest_pos,
> >>>                                  avg(fgpc.m_quality_score)
> >>> m_quality_score,
> >>>                                  avg(fgpc.m_view_thru_conv)
> >>> m_view_thru_conv,
> >>>                                  sum(fgcd.m_revenue) m_revenue,
> >>>                                  sum(fgcd.m_conversions) m_conversions,
> >>>                                  sum(coalesce(kws.m_new_kw_bid,
> >>> kws.m_kw_bid)) m_kw_total_bid,
> >>>                                  max(coalesce(kws.m_new_kw_bid,
> >>> kws.m_kw_bid)) m_kw_max_bid,
> >>>                                  min(coalesce(kws.m_new_kw_bid,
> >>> kws.m_kw_bid)) m_kw_min_bid
> >>>                         from
> >>>                              bidw.fact_msn_kw_perf_daily  fgpc
> >>>              full outer join bidw.fact_msn_kw_conversion_daily fgcd  on
> >>> fgpc.m_ad_grp_pub_key = fgcd.m_ad_grp_pub_key and fgpc.m_kw_pub_key =
> >>> fgcd.m_kw_pub_key and fgpc.date_id = fgpc.date_id
> >>>              join            biods.msn_keyword_sup kws
> on
> >>> fgpc.m_kw_pub_key = kws.m_kw_pub_key and fgpc.m_ad_grp_pub_key =
> >>> kws.m_ad_grp_pub_key
> >>>              where
> >>>                        coalesce(fgpc.date_id,fgcd.date_id)  between
> >>> 20131201 and 20140119
> >>>             group by
> >>>
> >>>
> coalesce(fgpc.date_id,fgcd.date_id),fgpc.m_ad_grp_pub_key,fgpc.m_kw_pub_key,kws.expr_names,kws.expr_values,kws.m_ad_grp_semid
> >>> ;
> >>>
> >>> This is the explain plan for the same.
> >>> explain analyze verbose select ......
> >>> ....
> >>>
> >>>  Data Node Scan on "__REMOTE_GROUP_QUERY__"  (cost=0.00..2.50 rows=1000
> >>> width=908) (actual time=1672.149..8281.918 rows=605575 loops=1)
> >>>    Output: (COALESCE((fgpc.date_id)::bigint, fgcd.date_id)),
> >>> fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, kws.expr_names,
> kws.expr_values,
> >>> kws.m_ad_grp_semid, (sum(fgpc.m_imps)), (sum(fgpc.m_clicks)),
> >>> (sum(fgpc.m_cost)), (sum(fgpc.m_conv_1pc)), (sum(fgpc.m_conv_mpc)),
> >>> (avg(fgpc.m_cnv_rate_1pc)), (avg(fgpc.m_cnv_ra
> >>> te_mpc)), (avg(fgpc.m_avg_cpc)), (avg(fgpc.m_max_cpc)),
> >>> (avg(fgpc.m_firstpage_cpc)), (avg(fgpc.m_topofpage_cpc)),
> >>> (avg(fgpc.m_avg_cpm)), (avg(fgpc.m_max_cpm)),
> (avg(fgpc.m_max_cpa_pct)),
> >>> (avg(fgpc.m_avg_pos)), (avg(fgpc.m_lowest_pos)),
> (avg(fgpc.m_highest_pos)),
> >>> (avg(fgpc.m_quality_score)), (avg(fgpc.m_view_thr
> >>> u_conv)), (sum(fgcd.m_revenue)), (sum(fgcd.m_conversions)),
> >>> (sum(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))),
> >>> (max(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid))),
> >>> (min(COALESCE(kws.m_new_kw_bid, kws.m_kw_bid)))
> >>>    Node/s: d11, d12, d13, d14, d15, d16
> >>>    Remote query: SELECT COALESCE((l.a_1)::bigint, l.a_23), l.a_2,
> l.a_3,
> >>> r.a_1, r.a_2, r.a_3, sum(l.a_4), sum(l.a_5), sum(l.a_6), sum(l.a_7),
> >>> sum(l.a_8), pg_catalog.numeric_avg(avg(l.a_9)),
> >>> pg_catalog.numeric_avg(avg(l.a_10)),
> pg_catalog.numeric_avg(avg(l.a_11)),
> >>> pg_catalog.numeric_avg(avg(l.a_12)), pg_catalog.
> >>> numeric_avg(avg(l.a_13)), pg_catalog.numeric_avg(avg(l.a_14)),
> >>> pg_catalog.numeric_avg(avg(l.a_15)),
> pg_catalog.numeric_avg(avg(l.a_16)),
> >>> pg_catalog.numeric_avg(avg(l.a_17)),
> pg_catalog.numeric_avg(avg(l.a_18)),
> >>> pg_catalog.numeric_avg(avg(l.a_19)),
> pg_catalog.numeric_avg(avg(l.a_20)),
> >>> pg_catalog.numeric_avg(avg(
> >>> l.a_21)), pg_catalog.numeric_avg(avg(l.a_22)), sum(l.a_24),
> sum(l.a_25),
> >>> sum(COALESCE(r.a_4, r.a_5)), max(COALESCE(r.a_4, r.a_5)),
> >>> min(COALESCE(r.a_4, r.a_5)) FROM ((SELECT l.a_1, l.a_2, l.a_3, l.a_4,
> l.a_5,
> >>> l.a_6, l.a_7, l.a_8, l.a_9, l.a_10, l.a_11, l.a_12, l.a_13, l.a_14,
> l.a_15,
> >>> l.a_16, l.a_17, l.a_18, l.a_
> >>> 19, l.a_20, l.a_21, l.a_22, r.a_1, r.a_2, r.a_3 FROM ((SELECT
> >>> fgpc.date_id, fgpc.m_ad_grp_pub_key, fgpc.m_kw_pub_key, fgpc.m_imps,
> >>> fgpc.m_clicks, fgpc.m_cost, fgpc.m_conv_1pc, fgpc.m_conv_mpc,
> >>> fgpc.m_cnv_rate_1pc, fgpc.m_cnv_rate_mpc, fgpc.m_avg_cpc,
> fgpc.m_max_cpc,
> >>> fgpc.m_firstpage_cpc, fgpc.m_topofpage_cpc, f
> >>> gpc.m_avg_cpm, fgpc.m_max_cpm, fgpc.m_max_cpa_pct, fgpc.m_avg_pos,
> >>> fgpc.m_lowest_pos, fgpc.m_highest_pos, fgpc.m_quality_score,
> >>> fgpc.m_view_thru_conv FROM ONLY bidw.fact_msn_kw_perf_daily fgpc WHERE
> true)
> >>> l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13,
> a_14,
> >>> a_15, a_16, a_17, a_18, a_19,
> >>> a_20, a_21, a_22) LEFT JOIN (SELECT fgcd.date_id, fgcd.m_revenue,
> >>> fgcd.m_conversions, fgcd.m_ad_grp_pub_key, fgcd.m_kw_pub_key FROM ONLY
> >>> bidw.fact_msn_kw_conversion_daily fgcd WHERE true) r(a_1, a_2, a_3,
> a_4,
> >>> a_5) ON (((l.a_1 = l.a_1) AND (l.a_2 = r.a_4) AND (l.a_3 = r.a_5))))
> WHERE
> >>> ((COALESCE((l.a_1)::bigint,
> >>>  r.a_1) >= 20131201) AND (COALESCE((l.a_1)::bigint, r.a_1) <=
> 20140119)))
> >>> l(a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9, a_10, a_11, a_12, a_13,
> a_14,
> >>> a_15, a_16, a_17, a_18, a_19, a_20, a_21, a_22, a_23, a_24, a_25) JOIN
> >>> (SELECT kws.expr_names, kws.expr_values, kws.m_ad_grp_semid,
> >>> kws.m_new_kw_bid, kws.m_kw_bi
> >>> d, kws.m_kw_pub_key, kws.m_ad_grp_pub_key FROM ONLY
> biods.msn_keyword_sup
> >>> kws WHERE true) r(a_1, a_2, a_3, a_4, a_5, a_6, a_7) ON (true)) WHERE
> >>> ((l.a_2 = r.a_7) AND (l.a_3 = r.a_6)) GROUP BY 1, 2, 3, 4, 5, 6
> >>>  Total runtime: 8378.080 ms
> >>> (5 rows)
> >>>
> >>>
> >>>
> >>> This is the actual time taken by the query:
> >>>
> >>> [postgres@sv4-pgxc-db04 test]$ time psql -d mydb -f "test.sql" > a.out
> >>>
> >>> real 0m23.533s
> >>> user 0m15.705s
> >>> sys 0m0.748s
> >>>
> >>> Now i dont know why is it taking that much time.
> >>
> >>
> >> Try adding LIMIT with different amounts for example to see how that
> >> impacts time.
> >>
> >> Also, try enabling statement logging (log_statement = all in
> >> postgresql.conf) on the data nodes to see how long it takes on each node.
> >>
> >> Also, the statement was rewritten in XC, with relations converted into
> >> SELECTs, so try running the rewritten version directly to see how long it
> >> takes.
> >>
> >> Thanks,
> >>
> >> Mason
> >>
> >
>

Re: [Postgres-xc-general] Looks like co-ordinator is my perf bottleneck....somebody help...

From: Mason S. <ms...@tr...> - 2014-02-02 17:32:11

On Sun, Feb 2, 2014 at 12:49 AM, Nirmal Sharma <sha...@gm...> wrote:

> This is the explain plan for the query with limit 10000.
>
>
If you run the generated query directly on the nodes (through EXECUTE
DIRECT), is it similarly slow? If so, that points to the query rewrite
as the problem. If it is fast, the issue may be in tuple handling on the
coordinator.
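
A minimal sketch of how that test could be scripted: the Python helper below
(hypothetical, not part of Postgres-XC) only generates the SQL text for one
EXECUTE DIRECT statement per datanode, using the node names from the
"Node/s:" line of the plan. It assumes the documented form
`EXECUTE DIRECT ON (nodename) 'query'`, in which the query is passed as a
string literal and embedded single quotes therefore have to be doubled.
Check the documentation of the installed version for restrictions before
relying on it. The example query is a placeholder; the "Remote query" text
from the plan would be substituted for it.

```python
# Node names taken from the "Node/s:" line of the EXPLAIN output in this thread.
NODES = ["d11", "d12", "d13", "d14", "d15", "d16"]


def execute_direct(node, query):
    """Wrap `query` in an EXECUTE DIRECT statement for a single datanode.

    The query is embedded as a SQL string literal, so single quotes are doubled.
    """
    escaped = query.replace("'", "''")
    return f"EXECUTE DIRECT ON ({node}) '{escaped}';"


def build_script(nodes, query):
    """Build a psql script that times the same query on each node in turn."""
    lines = ["\\timing on"]  # psql meta-command: print elapsed time per statement
    lines += [execute_direct(node, query) for node in nodes]
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder query; replace with the "Remote query" text from the plan.
    print(build_script(NODES, "SELECT count(*) FROM bidw.fact_msn_kw_perf_daily"))
```

The generated script can be saved to a file and run with `psql -f` against
the coordinator, which gives one timing per datanode to compare with the
coordinator-level run.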


-- 
Mason Sharp

TransLattice - http://www.translattice.com
Distributed and Clustered Database Solutions

Re: [Postgres-xc-general] Looks like co-ordinator is my perf bottleneck....somebody help...

From: Nirmal S. <sha...@gm...> - 2014-02-02 18:32:52

Yes, you are absolutely right.
If I run the same query directly on the nodes, it runs very fast. It is slow only when I run it through the coordinator. How can I resolve this tuple handling issue on the coordinator?
Please advise.


> On Feb 2, 2014, at 9:32 AM, Mason Sharp <ms...@tr...> wrote:
> 
> 
> 
> 
>> On Sun, Feb 2, 2014 at 12:49 AM, Nirmal Sharma <sha...@gm...> wrote:
>> This is the explain plan for the query with limit 10000.
>> 
>> 
> 
> If you run the generated query on the nodes directly (through EXECUTE DIRECT) is the time similarly slow? If so, then it points to the query rewrite that is the problem. If it is fast, then it may mean an issue in tuple handling on the coordinator.
> 
>  
> -- 
> Mason Sharp
> 
> TransLattice - http://www.translattice.com
> Distributed and Clustered Database Solutions
> 
>

Re: [Postgres-xc-general] Looks like co-ordinator is my perf bottleneck....somebody help...

From: Ashutosh B. <ash...@en...> - 2014-02-03 04:52:56

Can you please check whether disk I/O increases as the number of rows
processed increases? I do not see any problem with the planner. But because
the result from the datanodes is huge and there may not be enough RAM, the
coordinator might be choosing to store it on disk.
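
To get a feel for how large that result might be, the sketch below multiplies
the two figures from the EXPLAIN ANALYZE output quoted earlier in the thread:
the actual row count (rows=605575) and the planner's estimated average row
width in bytes (width=908). The width is only an estimate (the same plan
estimated 1000 rows), so the outcome is an order-of-magnitude guess, and the
function names are illustrative.

```python
ROWS = 605575      # "actual ... rows=605575" from the quoted EXPLAIN ANALYZE output
WIDTH_BYTES = 908  # planner's estimated average row width ("width=908")


def estimated_result_bytes(rows, width_bytes):
    """Rough size of the result set the coordinator has to receive and buffer."""
    return rows * width_bytes


def to_mib(num_bytes):
    """Convert bytes to MiB (2**20 bytes)."""
    return num_bytes / 2**20


if __name__ == "__main__":
    size = estimated_result_bytes(ROWS, WIDTH_BYTES)
    print(f"about {size} bytes, or {to_mib(size):.1f} MiB")
```

If the estimate is anywhere near right, the result is in the hundreds of
megabytes, so comparing it with the memory limit the coordinator applies
before spilling to disk (work_mem is the usual candidate in PostgreSQL,
which is an assumption here) would be a reasonable next check.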


On Mon, Feb 3, 2014 at 12:02 AM, Nirmal Sharma <sha...@gm...> wrote:

> Yes you are absolutely right.
> If I run the same query directly on nodes then it runs very fast. It is
> running slow when I run from coordinator. How am I going to resolve this
> tuple handling on coordinator?
> Please advise.
>
>
> On Feb 2, 2014, at 9:32 AM, Mason Sharp <ms...@tr...> wrote:
>
>
>
>
> On Sun, Feb 2, 2014 at 12:49 AM, Nirmal Sharma <sha...@gm...> wrote:
>
>> This is the explain plan for the query with limit 10000.
>>
>>
> If you run the generated query on the nodes directly (through EXECUTE
> DIRECT) is the time similarly slow? If so, then it points to the query
> rewrite that is the problem. If it is fast, then it may mean an issue in
> tuple handling on the coordinator.
>
>
> --
> Mason Sharp
>
> TransLattice - http://www.translattice.com
> Distributed and Clustered Database Solutions
>
>


-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company