Five Tuning Tips For Your Data Warehouse

The document provides five tuning tips for data warehouses: 1. Partition tables for improved query performance through pruning and partition elimination. 2. Use data segment compression to reduce the space required for segments by eliminating repeated column values within blocks. 3. Make effective use of PGA memory to improve query performance. 4. Be aware that temporal data can affect optimizer statistics and performance. 5. Identify where queries are spending time to optimize performance bottlenecks.

Uploaded by

sray2001
Copyright
© Attribution Non-Commercial (BY-NC)

Five Tuning Tips For Your Data Warehouse

Jeff Moss
My First Presentation

• Yes, my very first presentation
  – For BIRT SIG
  – For UKOUG
• Useful advice from friends and colleagues
  – Use graphics where appropriate
  – Find a friendly or familiar face in the audience
  – Imagine your audience is naked!
  – …but, like Oracle, be careful when combining advice!

Be Careful Combining Advice!

• Thanks for the opportunity, Mark!
Agenda

• My background
• Five tips
  – Partition for success
  – Squeeze your data with data segment compression
  – Make the most of your PGA memory
  – Beware of temporal data affecting the optimizer
  – Find out where your query is at
• Questions
My Background

• Independent consultant
• 13 years of Oracle experience
• Blog: http://oramossoracle.blogspot.com/
• Focused on warehousing / VLDB since 1998
• First project
  – UK Music Sales Data Mart
  – Produces BBC Radio 1 Top 40 chart and many more
  – 2 billion row sales fact table
  – 1 Tb total database size
• Currently working with Eon UK (Powergen)
  – 4Tb Production Warehouse, 8Tb total storage
  – Oracle Product Stack
What Is Partitioning ?

• “Partitioning addresses key issues in supporting very large tables and indexes by letting you decompose them into smaller and more manageable pieces called partitions.” Oracle Database Concepts Manual, 10gR2
• Introduced in Oracle 8.0
• Numerous improvements since
• Subpartitioning adds another level of decomposition
• Partitions and Subpartitions are logical containers
Partition To Tablespace Mapping

• Partitions map to tablespaces
  – A partition can only be in one tablespace
  – A tablespace can hold many partitions
  – Highest granularity is one tablespace per partition
  – Lowest granularity is one tablespace for all the partitions
• Tablespace volatility
  – Read / Write
  – Read Only

[Figure: monthly partitions mapped to quarterly tablespaces; each tablespace is marked either Read / Write or Read Only]

Partitions                            Tablespace
P_JAN_2005, P_FEB_2005, P_MAR_2005    T_Q1_2005
P_APR_2005, P_MAY_2005, P_JUN_2005    T_Q2_2005
P_JUL_2005, P_AUG_2005, P_SEP_2005    T_Q3_2005
P_OCT_2005, P_NOV_2005, P_DEC_2005    T_Q4_2005
P_JAN_2006, P_FEB_2006, P_MAR_2006    T_Q1_2006
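
The mapping in the figure can be sketched in DDL. This is a minimal example only: the SALES_FACT table and its columns are hypothetical, and it assumes the tablespaces T_Q1_2005 and T_Q2_2005 already exist.

```sql
-- Hypothetical fact table: monthly range partitions grouped into quarterly tablespaces.
CREATE TABLE sales_fact (
  sales_date   DATE          NOT NULL,
  customer_id  NUMBER        NOT NULL,
  sales        NUMBER(12,2)
)
PARTITION BY RANGE (sales_date) (
  PARTITION p_jan_2005 VALUES LESS THAN (TO_DATE('01-02-2005','DD-MM-YYYY')) TABLESPACE t_q1_2005,
  PARTITION p_feb_2005 VALUES LESS THAN (TO_DATE('01-03-2005','DD-MM-YYYY')) TABLESPACE t_q1_2005,
  PARTITION p_mar_2005 VALUES LESS THAN (TO_DATE('01-04-2005','DD-MM-YYYY')) TABLESPACE t_q1_2005,
  PARTITION p_apr_2005 VALUES LESS THAN (TO_DATE('01-05-2005','DD-MM-YYYY')) TABLESPACE t_q2_2005
);

-- Once a quarter is closed, its tablespace can be switched to read only.
ALTER TABLESPACE t_q1_2005 READ ONLY;
```

Grouping three months per tablespace is just one point between the two granularity extremes listed above.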


Why Partition ? - Performance

• Improved query performance
  – Pruning or elimination
  – Partition wise joins
• Read only partitions
  – Quicker checkpointing
  – Quicker backup
  – Quicker recovery
  – …but it depends on the mapping of partition : tablespace : datafile

SELECT SUM(sales)
FROM part_tab
WHERE sales_date BETWEEN TO_DATE('01-JAN-2005','DD-MON-YYYY')
AND TO_DATE('30-JUN-2005','DD-MON-YYYY')

[Figure: Sales Fact Table with monthly partitions JAN to DEC; the date range in the query covers only the JAN to JUN partitions]

* Oracle 10gR2 Data Warehousing Manual


Why Partition ? - Manageability

• Archiving
  – Use a rolling window approach
  – ALTER TABLE … ADD/SPLIT/DROP PARTITION…
• Easier ETL Processing
  – Build a new dataset in a staging table
  – Add indexes and constraints
  – Collect statistics
  – Then swap the staging table for a partition on the target
    • ALTER TABLE…EXCHANGE PARTITION…
• Easier Maintenance
  – Table partition move, e.g. to compress data
  – Local Index partition rebuild
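
The rolling window and staging-table steps above might look like the sketch below. The names SALES_FACT, SALES_FACT_STAGE, the partition names and the tablespace are hypothetical, and the example assumes a range-partitioned table with no MAXVALUE partition and no indexes to keep in step.

```sql
-- 1. Add the partition for the new month.
ALTER TABLE sales_fact
  ADD PARTITION p_may_2005 VALUES LESS THAN (TO_DATE('01-06-2005','DD-MM-YYYY'))
  TABLESPACE t_q2_2005;

-- 2. Build the new data set offline in a staging table with the same column layout.
CREATE TABLE sales_fact_stage AS
SELECT * FROM sales_fact WHERE 1 = 0;
-- Load SALES_FACT_STAGE here, then add indexes and constraints and collect statistics.

-- 3. Swap the staging table with the empty partition.
ALTER TABLE sales_fact
  EXCHANGE PARTITION p_may_2005 WITH TABLE sales_fact_stage
  WITHOUT VALIDATION;

-- 4. Age out the oldest month to complete the rolling window.
ALTER TABLE sales_fact DROP PARTITION p_jan_2005;
```

WITHOUT VALIDATION skips the check that every staged row belongs in the partition, so it is only appropriate when the ETL process already guarantees that.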
Why Partition ? - Scalability

• Partition is generally consistent and predictable
  – Assuming an appropriate partitioning key is used
  – …and data has an even distribution across the key
• Read only approach
  – Scalable backups - read only tablespaces are ignored
  – …so partitions in those tablespaces are ignored
• Pruning allows consistent query performance
Why Partition ? - Availability

• Offline data impact minimised
  – …depending on granularity
  – Quicker recovery
  – Pruned data not missed
  – EXCHANGE PARTITION
    • Allows offline build
    • Quick swap over

[Figure: partitions P_JAN_2005 to P_MAR_2006 mapped to tablespaces T_Q1_2005 to T_Q1_2006, each marked Read / Write or Read Only]


Fact Table Partitioning

Partitioned by Load Date

Partition   Tran Date     Customer     Load Date
January     07-JAN-2005   Customer 1   09-JAN-2005
January     15-JAN-2005   Customer 2   17-JAN-2005
February    22-JAN-2005   Customer 3   01-FEB-2005
February    02-FEB-2005   Customer 4   05-FEB-2005
February    26-FEB-2005   Customer 5   28-FEB-2005
March       06-MAR-2005   Customer 2   07-MAR-2005
March       12-MAR-2005   Customer 3   15-MAR-2005
April       21-JAN-2005   Customer 7   04-APR-2005
April       09-APR-2005   Customer 9   10-APR-2005

• Easier ETL Processing
• Each load deals with only 1 partition
• No use to end user queries!
• Can’t prune – Full scans!

Partitioned by Transaction Date

Partition   Tran Date     Customer     Load Date
January     07-JAN-2005   Customer 1   09-JAN-2005
January     15-JAN-2005   Customer 2   17-JAN-2005
January     21-JAN-2005   Customer 7   04-APR-2005
January     22-JAN-2005   Customer 3   01-FEB-2005
February    02-FEB-2005   Customer 4   05-FEB-2005
February    26-FEB-2005   Customer 5   28-FEB-2005
March       06-MAR-2005   Customer 2   07-MAR-2005
March       12-MAR-2005   Customer 3   15-MAR-2005
April       09-APR-2005   Customer 9   10-APR-2005

• Harder ETL Processing
• But still uses EXCHANGE PARTITION
• Useful to end user queries
• Allows full pruning capability
Watch out for…

• Partition exchange and table statistics [1]
  – Partition stats updated
  – …but Global stats are NOT!
  – Affects queries accessing multiple partitions
  – Solution
    • Gather stats on staging table prior to EXCHANGE
    • Gather stats on partitioned table using GLOBAL

[1] Jonathan Lewis: Cost-Based Oracle Fundamentals, Chapter 2
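
A sketch of the two-step solution using DBMS_STATS. The table and partition names (SALES_FACT, SALES_FACT_STAGE, P_MAY_2005) are hypothetical, and the sampling options are left at their defaults.

```sql
BEGIN
  -- Gather statistics on the staging table before the exchange,
  -- so that they become the partition statistics after the swap.
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => USER,
    tabname => 'SALES_FACT_STAGE',
    cascade => TRUE);
END;
/

ALTER TABLE sales_fact
  EXCHANGE PARTITION p_may_2005 WITH TABLE sales_fact_stage
  WITHOUT VALIDATION;

BEGIN
  -- Refresh only the table-level (global) statistics on the partitioned table.
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname     => USER,
    tabname     => 'SALES_FACT',
    granularity => 'GLOBAL',
    cascade     => FALSE);
END;
/
```

On a very large fact table the GLOBAL gather still has to read or sample the whole table, so the cost of this step is worth checking before adding it to every load.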


Partitioning Feature: Characteristic Reason Matrix

Characteristics: Performance, Manageability, Scalability, Availability

Feature                            Characteristics ticked
Read Only Partitions               ✓ ✓ ✓
Pruning (Partition Elimination)    ✓ ✓ ✓
Partition wise joins               ✓ ✓
Parallel DML                       ✓
Archiving                          ✓ ✓ ✓
Exchange Partition                 ✓ ✓ ✓ ✓
Partition Truncation               ✓ ✓ ✓
Local Indexes                      ✓ ✓ ✓ ✓
What Is Data Segment Compression ?

• Compresses data by eliminating intra block repeated column values
• Reduces the space required for a segment
  – …but only if there are appropriate repeats!
• Self contained
• Lossless algorithm

Where Can Data Segment Compression Be Used ?

• Can be used with a number of segment types
  – Heap & Nested Tables
  – Range or List Partitions
  – Materialized Views
• Can’t be used with
  – Subpartitions
  – Hash Partitions
  – Indexes – but they have row level compression
  – IOT
  – External Tables
  – Tables that are part of a Cluster
  – LOBs
How Does Segment Compression Work ?

Uncompressed rows

ID    DESCRIPTION                    CONTACT TYPE   OUTCOME   FOLLOWUP
100   Call to discuss bill amount    TEL            NO        YES
101   Call to discuss new product    MAIL           NO        N/A
102   Call to discuss new product    TEL            YES       N/A

Database Block

Symbol Table
1   100                            6    101
2   Call to discuss bill amount    7    Call to discuss new product
3   TEL                            8    MAIL
4   NO                             9    N/A
5   YES                            10   102

Row Data Area
1    2   3   4   5
6    7   8   4   9
10   7   3   5   9
Pros & Cons

• Pros
  – Saves space
    • Reduces LIO / PIO
    • Speeds up backup/recovery
    • Improves query response time
  – Transparent
    • To readers
    • …and writers
  – Decreases time to perform some DML
    • Deletes should be quicker
    • Bulk inserts may be quicker
• Cons
  – Increases CPU load
  – Can only be used on Direct Path operations
    • CTAS
    • Serial Inserts using INSERT /*+ APPEND */
    • Parallel Inserts (PDML)
    • ALTER TABLE…MOVE…
    • Direct Path SQL*Loader
  – Increases time to perform some DML
    • Bulk inserts may be slower
    • Updates are slower
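
The direct path operations listed under Cons can be sketched as follows. BIG_TABLE_COMP is a hypothetical name, and the example assumes an existing heap table called BIG_TABLE.

```sql
-- Create a compressed copy with a direct-path CTAS.
CREATE TABLE big_table_comp COMPRESS
AS SELECT * FROM big_table;

-- Direct-path (APPEND) inserts into a COMPRESS table are compressed;
-- conventional inserts into the same table are not.
INSERT /*+ APPEND */ INTO big_table_comp
SELECT * FROM big_table;
COMMIT;

-- Compress an existing table by moving it.
-- Indexes on the table become UNUSABLE afterwards and need a rebuild.
ALTER TABLE big_table MOVE COMPRESS;
```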
Ordering Your Data For Maximum Benefits

• Colocate data to maximise compression benefits

1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5   Uniformly distributed
1 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4 5 5 5 5   Colocated

• For maximum compression
  – Minimise the total space required by the segment
  – Identify the most “compressible” column(s)
• For optimal access
  – We know how the data is to be queried
  – Order the data by
    • Access path columns
    • Then the next most “compressible” column(s)
Get Max Compression Order Package

PROCEDURE mgmt_p_get_max_compress_order
Argument Name                  Type                    In/Out Default?
------------------------------ ----------------------- ------ --------
P_TABLE_OWNER                  VARCHAR2                IN     DEFAULT
P_TABLE_NAME                   VARCHAR2                IN
P_PARTITION_NAME               VARCHAR2                IN     DEFAULT
P_SAMPLE_SIZE                  NUMBER                  IN     DEFAULT
P_PREFIX_COLUMN1               VARCHAR2                IN     DEFAULT
P_PREFIX_COLUMN2               VARCHAR2                IN     DEFAULT
P_PREFIX_COLUMN3               VARCHAR2                IN     DEFAULT

BEGIN
  mgmt_p_get_max_compress_order(p_table_owner => 'AE_MGMT'
                               ,p_table_name  => 'BIG_TABLE'
                               ,p_sample_size => 10000);
END;
/

Running mgmt_p_get_max_compress_order...
----------------------------------------------------------------------------------------------------
Table          : BIG_TABLE
Sample Size    : 10000
Unique Run ID  : 25012006232119
ORDER BY Prefix:
----------------------------------------------------------------------------------------------------
Creating MASTER Table : TEMP_MASTER_25012006232119
Creating COLUMN Table 1: COL1
Creating COLUMN Table 2: COL2
Creating COLUMN Table 3: COL3
----------------------------------------------------------------------------------------------------
The output below lists each column in the table and the number of blocks/rows and space
used when the table data is ordered by only that column, or in the case where a prefix
has been specified, where the table data is ordered by the prefix and then that column.
From this one can determine if there is a specific ORDER BY which can be applied to
to the data in order to maximise compression within the table whilst, in the case of a
a prefix being present, ordering data as efficiently as possible for the most common
access path(s).
----------------------------------------------------------------------------------------------------
NAME                           COLUMN                         BLOCKS       ROWS         SPACE_GB
============================== ============================== ============ ============ ========
TEMP_COL_001_25012006232119    COL1                                    290        10000    .0022
TEMP_COL_002_25012006232119    COL2                                    345        10000    .0026
TEMP_COL_003_25012006232119    COL3                                    555        10000    .0042
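
In the sample run above, ordering by COL1 needs the fewest blocks (290). Applying that result could look like the sketch below; BIG_TABLE_SORTED is a hypothetical name, and the example assumes the report's ranking holds for the full table, not just the 10,000-row sample.

```sql
-- Rebuild the data ordered by the most compressible column, packed and compressed.
CREATE TABLE big_table_sorted
  PCTFREE 0
  COMPRESS
AS
SELECT *
FROM   big_table
ORDER  BY col1;

-- Compare the space used by the original and the ordered, compressed copy.
SELECT segment_name, blocks, bytes/1024/1024 AS size_mb
FROM   user_segments
WHERE  segment_name IN ('BIG_TABLE', 'BIG_TABLE_SORTED');
```

If queries rely on a particular access path, the ORDER BY would lead with those columns and use COL1 only as the next sort key, as described on the ordering slide.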
Data Warehousing Specifics

• Star Schema compresses better than Normalized
  – More redundant data
• Focus on…
  – Fact Tables and Summaries in Star Schema
  – Transaction tables in Normalized Schema
• Performance Impact [1]
  – Space Savings
    • Star schema: 67%
    • Normalized: 24%
  – Query Elapsed Times
    • Star schema: 16.5%
    • Normalized: 10%

[1] Table Compression in Oracle 9iR2: A Performance Analysis


Things To Watch Out For

• DROP COLUMN is awkward
  – ORA-39726: Unsupported add/drop column operation on compressed tables
  – Uncompress the table and try again - still gives ORA-39726!
• After UPDATEs data is uncompressed
  – Performance impact
  – Row migration
• Use appropriate physical design settings
  – PCTFREE 0 - pack each block
  – Large blocksize - reduce overhead / increase repeats per block
PGA Memory: What For ?

• Sorts
  – Standard sorts [SORT]
  – Buffer [BUFFER]
  – Group By [GROUP BY (SORT)]
  – Connect By [CONNECT-BY (SORT)]
  – Rollup [ROLLUP (SORT)]
  – Window [WINDOW (SORT)]
• Hash Joins [HASH-JOIN]
• Indexes
  – Maintenance [IDX MAINTENANCE SOR]
  – Bitmap Merge [BITMAP MERGE]
  – Bitmap Create [BITMAP CREATE]
• Write Buffers [LOAD WRITE BUFFERS]

[ ] = V$SQL_WORKAREA.OPERATION_TYPE

[Figure: a serial process on a dedicated server; its PGA holds cursors, variables and the sort area]
PGA Memory Management: Manual

• The “old” way of doing things
  – Still available though – even in 10g R2
• Configuring
  – ALTER SESSION SET WORKAREA_SIZE_POLICY=MANUAL;
  – Initialisation parameter: WORKAREA_SIZE_POLICY=MANUAL
• Set memory parameters yourself
  – HASH_AREA_SIZE
  – SORT_AREA_SIZE
  – SORT_AREA_RETAINED_SIZE
  – BITMAP_MERGE_AREA_SIZE
  – CREATE_BITMAP_AREA_SIZE
• Optimal values depend on the type of work [1]
  – One size does not fit all!

[1] Richmond Shee: If Your Memory Serves You Right
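
For example, a batch session doing one large sort or hash join might switch to manual sizing like this. The sizes are purely illustrative, not recommendations; suitable values depend on the workload and on the memory available to the process.

```sql
-- Take this session out of automatic work area management.
ALTER SESSION SET workarea_size_policy = MANUAL;

-- Sizes are in bytes; the figures below are illustrative only.
ALTER SESSION SET sort_area_size          = 104857600;  -- 100 Mb
ALTER SESSION SET sort_area_retained_size = 104857600;  -- 100 Mb
ALTER SESSION SET hash_area_size          = 209715200;  -- 200 Mb
```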


PGA Memory Management: Automatic

• The “new” way from 9i R1
  – Default OFF in 9i R1/R2
• Enabled by setting at session/instance level:
  – WORKAREA_SIZE_POLICY=AUTO
  – PGA_AGGREGATE_TARGET > 0
  – Default ON since 10g R1
• Oracle dynamically manages the available memory to suit the workload
  – But of course, it’s not perfect!

Jože Senegačnik - Advanced Management Of Working Areas In Oracle 9i/10g, presented at UKOUG 2005
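
A minimal sketch of enabling automatic management at instance level and then checking how the instance is coping. The 3000M target is an arbitrary example value, and the statements assume a user with ALTER SYSTEM privilege and access to V$PGASTAT.

```sql
-- Enable automatic PGA management for the instance.
ALTER SYSTEM SET pga_aggregate_target = 3000M;
ALTER SYSTEM SET workarea_size_policy = AUTO;

-- Check the target, current allocation and signs of an undersized target.
SELECT name, value, unit
FROM   v$pgastat
WHERE  name IN ('aggregate PGA target parameter',
                'total PGA allocated',
                'cache hit percentage',
                'over allocation count');
```

A rising over allocation count suggests the target is too small for the workload.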
Auto PGA Parameters: Pre 10gR2

• WORKAREA_SIZE_POLICY
  – Set to AUTO
• PGA_AGGREGATE_TARGET
  – The target for summed PGA across all processes
  – Can be exceeded if too small
    • Over Allocation
• _PGA_MAX_SIZE
  – Target maximum PGA size for a single process
  – Default is a fixed value of 200Mb
  – Hidden / Undocumented Parameter
    • Usual caveats apply
Auto PGA Parameters : Pre 10gR2

• _SMM_MAX_SIZE
  – Limit for a single workarea operation for one process
  – Derived Default
    • LEAST(5% of PGA_AGGREGATE_TARGET, 50% of _PGA_MAX_SIZE)
    • Hits limit of 100Mb
      – When PGA_AGGREGATE_TARGET is >= 2000Mb
      – And _PGA_MAX_SIZE is left at default of 200Mb
  – Hidden / Undocumented Parameter
    • Usual caveats apply
Auto PGA Parameters : Pre 10gR2

• _SMM_PX_MAX_SIZE
  – Limit for all the parallel slaves of a single workarea operation
  – Derived Default
    • 30% of PGA_AGGREGATE_TARGET
  – Hidden / Undocumented Parameter
    • Usual caveats apply
  – Parallel slaves still limited
    • _SMM_MAX_SIZE
  – Impacts only when…
    • Degree Of Parallelism > CEILING(_SMM_PX_MAX_SIZE / _SMM_MAX_SIZE)

[Figure: PGA_AGGREGATE_TARGET = 3000Mb, _PGA_MAX_SIZE = 200Mb, _SMM_MAX_SIZE = 100Mb, _SMM_PX_MAX_SIZE = 900Mb. Sessions 1 to 8 are each shown with 100Mb and with 75Mb; sessions 9 to 12 are shown with 75Mb.]
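
The pre-10gR2 derived defaults can be worked through as plain arithmetic. The query below only models the rules as stated on these slides; it does not read the hidden parameters, and the column aliases are made up for the illustration. The inputs are the figure's values: a 3000Mb target and the default 200Mb _PGA_MAX_SIZE.

```sql
SELECT dop,
       LEAST(0.05 * pat_mb, 0.5 * pga_max_mb)   AS smm_max_size_mb,
       0.30 * pat_mb                            AS smm_px_max_size_mb,
       -- Each slave gets its share of the PX limit, capped by _SMM_MAX_SIZE.
       LEAST(LEAST(0.05 * pat_mb, 0.5 * pga_max_mb),
             0.30 * pat_mb / dop)               AS mb_per_slave
FROM  (SELECT 3000 AS pat_mb, 200 AS pga_max_mb FROM dual),
      (SELECT 8 AS dop FROM dual UNION ALL
       SELECT 12 FROM dual);
```

With these inputs _SMM_MAX_SIZE works out at 100Mb and _SMM_PX_MAX_SIZE at 900Mb, so each slave gets 100Mb at a degree of parallelism of 8 and 75Mb at 12, matching the figure.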
10gR2 Improvements

• _SMM_MAX_SIZE now the driver
  – More advanced algorithm

PGA_AGGREGATE_TARGET    _SMM_MAX_SIZE
<= 500Mb                20% * PGA_AGGREGATE_TARGET
500Mb – 1000Mb          100Mb
1000Mb +                10% * PGA_AGGREGATE_TARGET

  – _PGA_MAX_SIZE = 2 * _SMM_MAX_SIZE
• Parallel operations
  – _SMM_PX_MAX_SIZE = 50% * PGA_AGGREGATE_TARGET
  – When DOP <= 5 then _smm_max_size is used
  – When DOP > 5 then _smm_px_max_size / DOP is used

Jože Senegačnik - Advanced Management Of Working Areas In Oracle 9i/10g, presented at UKOUG 2005
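
The 10gR2 rules can be tabulated the same way. Again this is only a model of the rules as presented here, with made-up column aliases and three arbitrary sample targets, one per band.

```sql
SELECT pat_mb,
       smm_max_size_mb,
       2 * smm_max_size_mb  AS pga_max_size_mb,
       0.50 * pat_mb        AS smm_px_max_size_mb
FROM  (SELECT pat_mb,
              CASE
                WHEN pat_mb <= 500  THEN 0.20 * pat_mb
                WHEN pat_mb <= 1000 THEN 100
                ELSE                     0.10 * pat_mb
              END AS smm_max_size_mb
       FROM  (SELECT 400 AS pat_mb FROM dual UNION ALL
              SELECT 800  FROM dual UNION ALL
              SELECT 3000 FROM dual));
```

For a 3000Mb target this gives a 300Mb limit per work area and 1500Mb across the slaves of one parallel operation, compared with 100Mb and 900Mb under the pre-10gR2 rules.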
PGA Target Advisor
select trunc(pga_target_for_estimate/1024/1024) pga_target_for_estimate
, to_char(pga_target_factor * 100,'999.9') ||'%' pga_target_factor
, trunc(bytes_processed/1024/1024) bytes_processed
, trunc(estd_extra_bytes_rw/1024/1024) estd_extra_bytes_rw
, to_char(estd_pga_cache_hit_percentage,'999') ||
'%' estd_pga_cache_hit_percentage
, estd_overalloc_count
from v$pga_target_advice
/

PGA Target For PGA Tgt Estimated Extra Estimated PGA Estimated
Estimate Mb Factor Bytes Processed Bytes Read/Written Cache Hit % Overallocation Count
-------------- ------- ---------------- ------------------ --------------- --------------------
5,376 12.5% 5,884,017 7,279,799 45% 113
10,752 25.0% 5,884,017 3,593,510 62% 8
21,504 50.0% 5,884,017 3,140,993 65% 0
32,256 75.0% 5,884,017 3,104,894 65% 0
43,008 100.0% 5,884,017 2,300,826 72% 0
51,609 120.0% 5,884,017 2,189,160 73% 0
60,211 140.0% 5,884,017 2,189,160 73% 0
68,812 160.0% 5,884,017 2,189,160 73% 0
77,414 180.0% 5,884,017 2,189,160 73% 0
86,016 200.0% 5,884,017 2,189,160 73% 0
129,024 300.0% 5,884,017 2,189,160 73% 0
172,032 400.0% 5,884,017 2,189,160 73% 0
258,048 600.0% 5,884,017 2,189,160 73% 0
Beware Of Temporal Data Affecting The Optimizer

• Slowly Changing Dimensions
  – Cover ranges of time
  – “From” and “To” DATE columns define applicability
  – Need BETWEEN operator to retrieve rows for a reporting point in time

SELECT * FROM d_customer
WHERE TO_DATE('15/01/2005','DD/MM/YYYY') BETWEEN valid_from AND valid_to

Month 1 (1st Jan, 2004)

CUSTOMER
CUSTOMER_ID   NAME           CUSTOMER_TYPE
487438        Jeff Moss      SME

D_CUSTOMER
CUSTOMER_ID   NAME           CUSTOMER_TYPE   VALID_FROM   VALID_TO
487438        Jeff Moss      SME             01/01/2004

Month 2 (1st Feb, 2004)

CUSTOMER
CUSTOMER_ID   NAME           CUSTOMER_TYPE
487438        Jeff Moss      I&C
839398        Mark Rittman   SME

D_CUSTOMER
CUSTOMER_ID   NAME           CUSTOMER_TYPE   VALID_FROM   VALID_TO
487438        Jeff Moss      SME             01/01/2004   31/01/2004
487438        Jeff Moss      I&C             01/02/2004
839398        Mark Rittman   SME             01/02/2004
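
A point-in-time lookup against the Month 2 data might be written as below. It assumes the blank VALID_TO of a current row is stored as NULL, which the slide does not state; a far-future end date would work without the NVL.

```sql
-- Which version of each customer applied on 15th Jan, 2004?
SELECT customer_id, name, customer_type
FROM   d_customer
WHERE  TO_DATE('15/01/2004','DD/MM/YYYY')
       BETWEEN valid_from
           AND NVL(valid_to, TO_DATE('31/12/9999','DD/MM/YYYY'));
```

Against the Month 2 rows this returns only the SME version of customer 487438, because the other two rows start on 01/02/2004.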
Dependent Predicates

• When multiple predicates exist, individual selectivities are combined using standard probability math [1]:
  – P1 AND P2
    S(P1 & P2) = S(P1) * S(P2)
  – P1 OR P2
    S(P1 | P2) = S(P1) + S(P2) - [S(P1) * S(P2)]
• Only valid if the predicates are independent otherwise…
  – Incorrect selectivity estimate
  – Incorrect cardinality estimate
  – Potentially suboptimal execution plan
• BETWEEN is multiple predicates!
• Also known as Correlated Columns [2]

[1] Wolfgang Breitling, Fallacies Of The Cost Based Optimizer
[2] Jonathan Lewis, Cost-Based Oracle Fundamentals, Chapter 6
Some Test Tables…

• Consider these 3 test tables…
• 12 records in an SCD type table
• Key, Non Key Attr and From are the same in all three tables; only To differs

                                 To, by table
Key  Non Key Attr  From          TEST_12_DISTINCT_TD  TEST_2_DISTINCT_TD  TEST_1_DISTINCT_TD
1    Jeff          01-Jan-2005   31-Jan-2005          30-Jun-2005         31-Dec-2005
2    Mark          01-Feb-2005   28-Feb-2005          30-Jun-2005         31-Dec-2005
3    Doug          01-Mar-2005   31-Mar-2005          30-Jun-2005         31-Dec-2005
4    Niall         01-Apr-2005   30-Apr-2005          30-Jun-2005         31-Dec-2005
5    Tom           01-May-2005   31-May-2005          30-Jun-2005         31-Dec-2005
6    Jonathan      01-Jun-2005   30-Jun-2005          30-Jun-2005         31-Dec-2005
7    Lisa          01-Jul-2005   31-Jul-2005          31-Dec-2005         31-Dec-2005
8    Cary          01-Aug-2005   31-Aug-2005          31-Dec-2005         31-Dec-2005
9    Mogens        01-Sep-2005   30-Sep-2005          31-Dec-2005         31-Dec-2005
10   Anjo          01-Oct-2005   31-Oct-2005          31-Dec-2005         31-Dec-2005
11   Larry         01-Nov-2005   30-Nov-2005          31-Dec-2005         31-Dec-2005
12   Pete          01-Dec-2005   31-Dec-2005          31-Dec-2005         31-Dec-2005
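
To see how far the independence assumption can drift, the query below applies the S(P1) * S(P2) rule to exact row counts and compares the result with the true answer. It is only an illustration: the optimizer works from column statistics rather than exact counts, so its estimate will not match these numbers. The column names FROM_DATE and TO_DATE are those used in the deck's queries, and the aliases are made up.

```sql
SELECT COUNT(*)                                                 AS num_rows,
       AVG(CASE WHEN t.from_date <= p.d THEN 1 ELSE 0 END)      AS s_p1,
       AVG(CASE WHEN t.to_date   >= p.d THEN 1 ELSE 0 END)      AS s_p2,
       COUNT(*)
         * AVG(CASE WHEN t.from_date <= p.d THEN 1 ELSE 0 END)
         * AVG(CASE WHEN t.to_date   >= p.d THEN 1 ELSE 0 END)  AS independent_estimate,
       SUM(CASE WHEN p.d BETWEEN t.from_date AND t.to_date
                THEN 1 ELSE 0 END)                              AS actual_rows
FROM   test_12_distinct_td t,
       (SELECT TO_DATE('09-OCT-2005','DD-MON-YYYY') AS d FROM dual) p;
```

For TEST_12_DISTINCT_TD and 09-OCT-2005, ten of the twelve rows satisfy the FROM_DATE predicate and three satisfy the TO_DATE predicate, so the independence rule gives 12 * 10/12 * 3/12 = 2.5 rows, while only one row actually qualifies.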
Optimizer Gets Incorrect Cardinality

select * from test_1_distinct_td
where to_date('09-OCT-2005','DD-MON-YYYY') between from_date and to_date;

       KEY NON_KEY_AT FROM_DATE TO_DATE
---------- ---------- --------- ---------
         1 Jeff       01-JAN-05 31-DEC-05
         2 Mark       01-FEB-05 31-DEC-05
         3 Doug       01-MAR-05 31-DEC-05
         4 Niall      01-APR-05 31-DEC-05
         5 Tom        01-MAY-05 31-DEC-05
         6 Jonathan   01-JUN-05 31-DEC-05
         7 Lisa       01-JUL-05 31-DEC-05
         8 Cary       01-AUG-05 31-DEC-05
         9 Mogens     01-SEP-05 31-DEC-05
        10 Anjo       01-OCT-05 31-DEC-05

10 rows selected.

Execution Plan
----------------------------------------------------------------------------------------
| Id | Operation         | Name               | Rows | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT  |                    |   11 |   264 |     3   (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL | TEST_1_DISTINCT_TD |   11 |   264 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------
…And Again

select * from test_2_distinct_td
where to_date('09-OCT-2005','DD-MON-YYYY') between from_date and to_date;

       KEY NON_KEY_AT FROM_DATE TO_DATE
---------- ---------- --------- ---------
         7 Lisa       01-JUL-05 31-DEC-05
         8 Cary       01-AUG-05 31-DEC-05
         9 Mogens     01-SEP-05 31-DEC-05
        10 Anjo       01-OCT-05 31-DEC-05

4 rows selected.

Execution Plan
----------------------------------------------------------------------------------------
| Id | Operation         | Name               | Rows | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT  |                    |   11 |   264 |     3   (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL | TEST_2_DISTINCT_TD |   11 |   264 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------
…And Again

select * from test_12_distinct_td
where to_date('09-OCT-2005','DD-MON-YYYY') between from_date and to_date;

       KEY NON_KEY_AT FROM_DATE TO_DATE
---------- ---------- --------- ---------
        10 Anjo       01-OCT-05 31-OCT-05

1 row selected.

Execution Plan
-----------------------------------------------------------------------------------------
| Id | Operation         | Name                | Rows | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT  |                     |    4 |    96 |     3   (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL | TEST_12_DISTINCT_TD |    4 |    96 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Workarounds

• Ignore it
  – If your query still gets the right plan of course!
• Hints
  – Force the optimizer to do as you tell it
• Stored outlines
• Adjust statistics held against the table
  – Affects any SQL that accesses that object
• Optimizer Profile (10g)
  – Offline Optimisation [1]
• Dynamic sampling level 4 or above
  – Samples “single table predicates that reference 2 or more columns”
  – Takes extra time during the parse – minimal but often worth it

[1] Jonathan Lewis: Cost-Based Oracle Fundamentals, Chapter 2
Dynamic Sampling With A Hint

select /*+ dynamic_sampling(test_1_distinct_td,4) */ *
from test_1_distinct_td
where to_date('09-OCT-2005','DD-MON-YYYY') between from_date and to_date;

       KEY NON_KEY_AT FROM_DATE TO_DATE
---------- ---------- --------- ---------
         1 Jeff       01-JAN-05 31-DEC-05
         2 Mark       01-FEB-05 31-DEC-05
         3 Doug       01-MAR-05 31-DEC-05
         4 Niall      01-APR-05 31-DEC-05
         5 Tom        01-MAY-05 31-DEC-05
         6 Jonathan   01-JUN-05 31-DEC-05
         7 Lisa       01-JUL-05 31-DEC-05
         8 Cary       01-AUG-05 31-DEC-05
         9 Mogens     01-SEP-05 31-DEC-05
        10 Anjo       01-OCT-05 31-DEC-05

10 rows selected.

Execution Plan
----------------------------------------------------------------------------------------
| Id | Operation         | Name               | Rows | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT  |                    |   10 |   240 |     3   (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL | TEST_1_DISTINCT_TD |   10 |   240 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------
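
Where the SQL text cannot be changed to add a hint, the same sampling level can be set for the session instead. A sketch, assuming the session has a PLAN_TABLE available for EXPLAIN PLAN:

```sql
-- Apply dynamic sampling level 4 to every statement parsed in this session.
ALTER SESSION SET optimizer_dynamic_sampling = 4;

EXPLAIN PLAN FOR
SELECT *
FROM   test_1_distinct_td
WHERE  TO_DATE('09-OCT-2005','DD-MON-YYYY') BETWEEN from_date AND to_date;

-- Show the plan, including the estimated Rows column.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Unlike the hint, the session setting affects every statement parsed afterwards, so the extra parse time is paid more widely.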
Find Out Where Your Query Is At

• Data Warehouses are big, big, BIG!
  – Big on rows
  – Big on disk storage
  – Big on hardware
  – Big SQL statements issued
    • Lots of data to scan, join and sort
    • Many operations
    • Long running
• So where is my long running query at ?
  – No solid answers here, just food for thought…
A “Big” Query Execution Plan

✓ Sorts
✓ Aggregations
✓ Hash joins
✓ Merge joins
✓ Table scans
✓ Materialized View scans
✓ Analytics
✓ Parallel Query
✓ Pruning
✓ Temp Space Use

| Id | Operation                   | Name                      |  Rows | Bytes | TempSpc | Cost (%CPU) |
---------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT            |                           |     1 |   124 |         |  49722 (10) |
|  1 | PX COORDINATOR              |                           |       |       |         |             |
|  2 | PX SEND QC (RANDOM)         | :TQ20006                  |     1 |   124 |         |  49722 (10) |
|  3 | HASH JOIN                   |                           |     1 |   124 |         |  49722 (10) |
|  4 | BUFFER SORT                 |                           |       |       |         |             |
|  5 | PX RECEIVE                  |                           |  207K | 9510K |         |   25982 (9) |
|  6 | PX SEND BROADCAST           | :TQ20000                  |  207K | 9510K |         |   25982 (9) |
|  7 | VIEW                        |                           |  207K | 9510K |         |   25982 (9) |
|  8 | WINDOW SORT                 |                           |  207K |   10M |     26M |   25982 (9) |
|  9 | MERGE JOIN                  |                           |  207K |   10M |         |   25976 (9) |
| 10 | TABLE ACCESS BY INDEX ROWID | AML_T_ANALYSIS_DATE       |     1 |    22 |         |       2 (0) |
| 11 | INDEX UNIQUE SCAN           | AML_I_ANL_PK              |     1 |       |         |       0 (0) |
| 12 | SORT AGGREGATE              |                           |     1 |     9 |         |             |
| 13 | PX COORDINATOR              |                           |       |       |         |             |
| 14 | PX SEND QC (RANDOM)         | :TQ10000                  |     1 |     9 |         |             |
| 15 | SORT AGGREGATE              |                           |     1 |     9 |         |             |
| 16 | PX BLOCK ITERATOR           |                           |     1 |     9 |         |       2 (0) |
| 17 | TABLE ACCESS FULL           | AML_T_ANALYSIS_DATE       |     1 |     9 |         |       2 (0) |
| 18 | FILTER                      |                           |       |       |         |             |
| 19 | FILTER                      |                           |       |       |         |             |
| 20 | TABLE ACCESS FULL           | AML_T_BILLING_ACCOUNT_DIM |   82M | 2371M |         |    5457 (5) |
| 21 | HASH JOIN                   |                           |   18M | 1340M |         |  23704 (10) |
| 22 | HASH JOIN                   |                           |   10M |  500M |         |  17005 (11) |
| 23 | PX RECEIVE                  |                           |   10M |  265M |         |  11304 (14) |
| 24 | PX SEND HASH                | :TQ20003                  |   10M |  265M |         |  11304 (14) |
| 25 | BUFFER SORT                 |                           |     1 |   124 |         |             |
| 26 | VIEW                        | AML_V_MD_CUH_SID          |   10M |  265M |         |  11304 (14) |
| 27 | HASH JOIN                   |                           |   10M |  337M |         |  11304 (14) |
| 28 | PX RECEIVE                  |                           |   17M |  310M |         |   5228 (18) |
| 29 | PX SEND HASH                | :TQ20001                  |   17M |  310M |         |   5228 (18) |
| 30 | PX BLOCK ITERATOR           |                           |   17M |  310M |         |   5228 (18) |
| 31 | TABLE ACCESS FULL           | AML_T_MEASURE_DIM         |   17M |  310M |         |   5228 (18) |
| 32 | PX RECEIVE                  |                           |   34M |  461M |         |   5958 (10) |
| 33 | PX SEND HASH                | :TQ20002                  |   34M |  461M |         |   5958 (10) |
| 34 | PX BLOCK ITERATOR           |                           |   34M |  461M |         |   5958 (10) |
| 35 | TABLE ACCESS FULL           | AML_T_CUSTOMER_DIM        |   34M |  461M |         |   5958 (10) |
| 36 | PX RECEIVE                  |                           |   55M | 1212M |         |    5562 (3) |
| 37 | PX SEND HASH                | :TQ20004                  |   55M | 1212M |         |    5562 (3) |
| 38 | PX BLOCK ITERATOR           |                           |   55M | 1212M |         |    5562 (3) |
| 39 | TABLE ACCESS FULL           | AML_T_CUSTOMER_DIM        |   55M | 1212M |         |    5562 (3) |
| 40 | PX RECEIVE                  |                           |   94M | 2516M |         |    6483 (5) |
| 41 | PX SEND HASH                | :TQ20005                  |   94M | 2516M |         |    6483 (5) |
| 42 | PX BLOCK ITERATOR           |                           |   94M | 2516M |         |    6483 (5) |
| 43 | MAT_VIEW ACCESS FULL        | AML_M_CD_BAD              |   94M | 2516M |         |    6483 (5) |
V$ Views To The Rescue ?

• V$SESSION – Identify your session
• V$SQL_PLAN – Get the execution plan operations
• V$SQL_WORKAREA – Get all the work areas which will be required
• V$SESSION_LONGOPS – Get information on long plan operations
• V$SQL_WORKAREA_ACTIVE – Get the work area(s) being used right now

Columns of interest in each view:

V$SESSION              SID, SERIAL#, PROGRAM, USERNAME, SQL_ID, SQL_CHILD_NUMBER, SQL_ADDRESS, SQL_HASH_VALUE
V$SQL_PLAN             SQL_ID, CHILD_NUMBER, ADDRESS, HASH_VALUE, OPERATION, ID, PARENT_ID
V$SQL_WORKAREA         SQL_ID, CHILD_NUMBER, WORKAREA_ADDRESS, OPERATION_ID, OPERATION_TYPE
V$SQL_WORKAREA_ACTIVE  SQL_ID, SQL_HASH_VALUE, WORKAREA_ADDRESS, OPERATION_ID, OPERATION_TYPE, POLICY, SID, QCSID, ACTIVE_TIME
V$SESSION_LONGOPS      SID, SERIAL#, OPNAME, TARGET, MESSAGE, SQL_ID, SQL_ADDRESS, SQL_HASH_VALUE, ELAPSED_SECONDS
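
The views can be combined along the lines below. This is a sketch rather than the author's get_query_progress.sql script: `:sid` is a bind variable you supply, the queries assume 10g (where these views expose SQL_ID), and for a parallel query the active work areas belong to the slave sessions, which are linked to the coordinator through QCSID.

```sql
-- Long-running operations still in progress for one session.
SELECT s.sid, s.serial#, s.sql_id,
       l.opname, l.target, l.sofar, l.totalwork,
       l.elapsed_seconds, l.time_remaining, l.message
FROM   v$session s
JOIN   v$session_longops l
       ON  l.sid     = s.sid
       AND l.serial# = s.serial#
WHERE  s.sid = :sid
AND    l.sofar < l.totalwork;

-- Work areas active right now for that session, mapped back to plan operations.
SELECT w.operation_id, p.operation, p.options, p.object_name,
       w.operation_type, w.policy,
       ROUND(w.actual_mem_used/1024/1024) AS mem_mb,
       ROUND(w.tempseg_size/1024/1024)    AS temp_mb,
       w.number_passes
FROM   v$sql_workarea_active w
JOIN   v$session s
       ON  s.sid = w.sid
LEFT JOIN v$sql_plan p
       ON  p.sql_id       = w.sql_id
       AND p.child_number = s.sql_child_number
       AND p.id           = w.operation_id
WHERE  w.sid = :sid
ORDER  BY w.operation_id;
```

The second query uses an outer join to V$SQL_PLAN so that a work area is still listed when the session's current cursor is not the one that owns it.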
Demonstration
Problems

• V$SQL_PLAN Bug
  – Service Request: 4990863.992
  – Broken in 10gR1, Works in 10gR2
  – PARENT_ID corruption
    • Can’t link rows in this view to their parents as the values are corrupted due to this bug
    • Shows up in TEMP TABLE TRANSFORMATION operations
• Multiple Work Areas can be active…or None
• Some operations are not shown in Long ops
• V$SESSION sql_id may not be the executing cursor
  – E.g. for refreshing Materialized View

* Test case for bug: http://www.oramoss.demon.co.uk/Code/test_error_v_sql_plan.sql


Questions ?
References: Papers

• Table Compression in Oracle 9iR2: A Performance Analysis
• Table Compression in Oracle 9iR2: An Oracle White Paper
• “Fallacies Of The Cost Based Optimizer”, Wolfgang Breitling
• “Scaling To Infinity, Partitioning In Oracle Data Warehouses”, Tim Gorman
• Advanced Management Of Working Areas In Oracle 9i/10g, UKOUG 2005, Jože Senegačnik
• Oracle9i Memory Management: Easier Than Ever, Oracle Open World 2002, Sushil Kumar
• Working with Automatic PGA, Christo Kutrovsky
• Optimising Oracle9i Instance Memory, Ramaswamy, Ramesh
• Oracle Metalink Note 223730.1: Automatic PGA Memory Management in 9i
• Oracle Metalink Note 147806.1: Oracle9i New Feature: Automated SQL Execution Memory Management
• Oracle Metalink Note 148346.1: Oracle9i Monitoring Automated SQL Execution Memory Management
• Memory Management and Latching Improvements in Oracle Database 9i and 10g, Oracle Open World 2005, Tanel Põder
• If Your Memory Serves You Right…, IOUG Live! 2004, April 2004, Toronto, Canada, Richmond Shee
• Decision Speed: Table Compression In Action
References: Online Presentation / Code

• http://www.oramoss.demon.co.uk/presentations/fivetuningtipsforyourdatawarehouse.ppt
• http://www.oramoss.demon.co.uk/Code/mgmt_p_get_max_compression_order.prc
• http://www.oramoss.demon.co.uk/Code/test_dml_performance_delete.sql
• http://www.oramoss.demon.co.uk/Code/test_dml_performance_insert.sql
• http://www.oramoss.demon.co.uk/Code/test_dml_performance_update.sql
• http://www.oramoss.demon.co.uk/Code/test_error_v_sql_plan.sql
• http://www.oramoss.demon.co.uk/Code/run_big_query.sql
• http://www.oramoss.demon.co.uk/Code/run_big_query_parallel.sql
• http://www.oramoss.demon.co.uk/Code/get_query_progress.sql
