Practical Partitioning v2
Large Data
Partitioning
• Partitioning Overview
• Indexing
• Managing statistics
• Compression
• Purging
• Backing up
Partitioning
• Facts
– Divide and Conquer
– Many Types
• Range
• List
• Hash
• Interval (11g)
• Reference (11g)
• Composite
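As a rough illustration of three of these methods, the DDL below sketches range, list, and interval partitioning. The table and column names (sales_range, sales_list, sales_interval, sold_on, region, amount) are hypothetical and do not come from the slides; the interval example assumes 11g or later.

```sql
-- Range: each partition holds rows below an upper bound.
create table sales_range
( sold_on date,
  amount  number
)
partition by range (sold_on)
( partition p2010 values less than (to_date('01-jan-2011','dd-mon-yyyy')),
  partition pmax  values less than (maxvalue)
);

-- List: each partition holds an explicit set of key values.
create table sales_list
( region varchar2(2),
  amount number
)
partition by list (region)
( partition p_east  values ('NY','NJ'),
  partition p_other values (default)
);

-- Interval (11g): range partitioning where new partitions are
-- created automatically as data arrives beyond the last bound.
create table sales_interval
( sold_on date,
  amount  number
)
partition by range (sold_on)
interval (numtoyminterval(1,'month'))
( partition p0 values less than (to_date('01-jan-2011','dd-mon-yyyy'))
);
```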
Partitioning
Fifteen Years of Development
• Facts
– Is not mostly about performance
• Especially with OLTP
• With OLTP – you must be careful not to impede
performance!
– Is mostly about administration
– Is an extra cost option to Enterprise Edition
Partitioning
[Figure: Availability, Administration, Performance; Fast=True]
Partitioning
• Facts
– Increases Availability of data
• Each partition is independent
• Some users may never even notice some data was
unavailable due to partition elimination
• Both downtime and time to recover are reduced
(smaller sets of data to recover)
Part1.sql
ops$tkyte%ORA11GR2> CREATE TABLE emp
2 ( empno int,
3 ename varchar2(20)
4 )
5 PARTITION BY HASH (empno)
6 ( partition part_1 tablespace p1,
7 partition part_2 tablespace p2
8 )
9 /
Table created.
14 rows created.
Part1.sql
ops$tkyte%ORA11GR2> select part1, part2
2 from (
3 select empno || ', ' || ename part1, row_number() over (order by empno) rn1
4 from emp partition(part_1)
5 ) A FULL OUTER JOIN (
6 select empno || ', ' || ename part2, row_number() over (order by empno) rn2
7 from emp partition(part_2)
8 ) B on ( a.rn1 = b.rn2 )
9 /
PART1           PART2
--------------- ---------------
7369, SMITH     7521, WARD
7499, ALLEN     7566, JONES
7654, MARTIN    7788, SCOTT
7698, BLAKE     7844, TURNER
7782, CLARK     7900, JAMES
7839, KING      7902, FORD
7876, ADAMS
7934, MILLER
8 rows selected.
Part1.sql
ops$tkyte%ORA11GR2> alter tablespace p1 offline;
Tablespace altered.
     EMPNO ENAME
---------- --------------------
      7844 TURNER
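The slide omits the query that produced this row. A query of roughly the following shape, which is an assumption rather than the original statement, would return it: EMPNO 7844 is stored in part_2 (see the partition listing above), so the optimizer never has to touch the offline tablespace p1.

```sql
-- Assumed query: the predicate on the partition key lets Oracle
-- eliminate part_1, whose tablespace is offline.
select * from emp where empno = 7844;

-- By contrast, a query that must read every partition is expected
-- to fail while p1 is offline (for example with ORA-00376).
select * from emp;

-- Bring the tablespace back when done.
alter tablespace p1 online;
```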
Partitioning
• Facts
– Reduced Administrative Burden
• Performing operations on small objects is
o Easier
o Faster (each individual operation is; total time
might increase)
o Less resource intensive
Partitioning
SQL> create table big_table1
2 ( ID, OWNER, OBJECT_NAME, SUBOBJECT_NAME,
3 OBJECT_ID, DATA_OBJECT_ID,
4 OBJECT_TYPE, CREATED, LAST_DDL_TIME,
5 TIMESTAMP, STATUS, TEMPORARY,
6 GENERATED, SECONDARY )
7 tablespace big1
8 as
9 select ID, OWNER, OBJECT_NAME, SUBOBJECT_NAME,
10 OBJECT_ID, DATA_OBJECT_ID,
11 OBJECT_TYPE, CREATED, LAST_DDL_TIME,
12 TIMESTAMP, STATUS, TEMPORARY,
13 GENERATED, SECONDARY
14 from big_table.big_table;
Table created. (10,000,000 rows)
Partitioning
SQL> create table big_table2
2 ( ID, OWNER, OBJECT_NAME, SUBOBJECT_NAME,
3 OBJECT_ID, DATA_OBJECT_ID,
4 OBJECT_TYPE, CREATED, LAST_DDL_TIME,
5 TIMESTAMP, STATUS, TEMPORARY,
6 GENERATED, SECONDARY )
7 partition by hash(id)
8 (partition part_1 tablespace big2,
9 partition part_2 tablespace big2,
10 partition part_3 tablespace big2,
11 partition part_4 tablespace big2,
12 partition part_5 tablespace big2,
13 partition part_6 tablespace big2,
14 partition part_7 tablespace big2,
15 partition part_8 tablespace big2
16 )
17 as
18 select ID, OWNER, OBJECT_NAME, SUBOBJECT_NAME,
19 OBJECT_ID, DATA_OBJECT_ID,
20 OBJECT_TYPE, CREATED, LAST_DDL_TIME,
21 TIMESTAMP, STATUS, TEMPORARY,
22 GENERATED, SECONDARY
23 from big_table.big_table;
Table created.
Partitioning
SQL> select b.tablespace_name,
2 mbytes_alloc,
3 mbytes_free
4 from ( select round(sum(bytes)/1024/1024) mbytes_free,
5 tablespace_name
6 from dba_free_space
7 group by tablespace_name ) a,
8 ( select round(sum(bytes)/1024/1024) mbytes_alloc,
9 tablespace_name
10 from dba_data_files
11 group by tablespace_name ) b
12 where a.tablespace_name (+) = b.tablespace_name
13 and b.tablespace_name in ('BIG1','BIG2')
14 /
SQL> begin
2 for x in ( select partition_name
3 from user_tab_partitions
4 where table_name = 'BIG_TABLE2' )
5 loop
6 execute immediate
7 'alter table big_table2 move partition ' ||
8 x.partition_name;
9 end loop;
10 end;
11 /
PL/SQL procedure successfully completed.
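The point of the loop above is that each ALTER TABLE ... MOVE PARTITION needs only enough free space for one partition at a time, while reorganizing the unpartitioned copy needs room for a complete second copy of the table. The sketch below shows the unpartitioned counterpart and a follow-up step; the index rebuild loop is an addition that matters only if big_table2 has local indexes, which the slides do not show.

```sql
-- Unpartitioned: one operation, needs free space for the whole table.
alter table big_table1 move;

-- Partitioned: after each partition move, its local index partitions
-- (if any exist) become UNUSABLE and must be rebuilt.
begin
   for x in ( select index_name, partition_name
                from user_ind_partitions
               where status = 'UNUSABLE' )
   loop
      execute immediate
         'alter index ' || x.index_name ||
         ' rebuild partition ' || x.partition_name;
   end loop;
end;
/
```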
Partitioning
• Composite Partitioning
– Range-Range (11gR1)
– Range-List (9i)
– Range-Hash (8i)
GLOBAL INDEX
Table created.
Local Indexes
ops$tkyte%ORA11GR2> create index local_prefixed on partitioned_table (a,b) local;
Index created.
Execution Plan
----------------------------------------------------------
Plan hash value: 1622054381
--------------------------------------------------------------------------------
| Id  | Operation                          | Name              | Pstart| Pstop |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |                   |       |       |
|   1 |  PARTITION RANGE SINGLE            |                   |     1 |     1 |
|   2 |   TABLE ACCESS BY LOCAL INDEX ROWID| PARTITIONED_TABLE |     1 |     1 |
|*  3 |    INDEX RANGE SCAN                | LOCAL_PREFIXED    |     1 |     1 |
--------------------------------------------------------------------------------
Note
-----
- dynamic sampling used for this statement (level=2)
Local Indexes
ops$tkyte%ORA11GR2> drop index local_prefixed;
Index dropped.
Execution Plan
----------------------------------------------------------
Plan hash value: 904532382
--------------------------------------------------------------------------------
| Id  | Operation                          | Name              | Pstart| Pstop |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |                   |       |       |
|   1 |  PARTITION RANGE SINGLE            |                   |     1 |     1 |
|*  2 |   TABLE ACCESS BY LOCAL INDEX ROWID| PARTITIONED_TABLE |     1 |     1 |
|*  3 |    INDEX RANGE SCAN                | LOCAL_NONPREFIXED |     1 |     1 |
--------------------------------------------------------------------------------
2 - filter("A"=1)
3 - access("B"=2)
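The slides do not show the table or the query behind these two plans. The following is an assumed reconstruction that is consistent with the predicates shown (filter on A, access on B) and with pruning to a single partition; the partition names, partition bounds, and the DATA column are hypothetical.

```sql
create table partitioned_table
( a    int,
  b    int,
  data char(20)
)
partition by range (a)
( partition part_1 values less than (2),
  partition part_2 values less than (3)
);

-- Prefixed: the partition key (a) is on the leading edge of the index.
create index local_prefixed on partitioned_table (a,b) local;

-- Nonprefixed: the partition key is not the leading column.
create index local_nonprefixed on partitioned_table (b) local;

-- Either index can serve this query, because a = 1 prunes to one
-- partition before the index is probed.
select * from partitioned_table where a = 1 and b = 2;
```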
Local Indexes - Uniqueness
• New in 11gR2
• You can index only “part” of a table
– Maybe just the most current data needs an index
– Older data would be full scanned
• Query plans can be generated that take this into
consideration
Index only what you need
ops$tkyte%ORA11GR2> CREATE TABLE t
2 (
3 dt date,
4 x int,
5 y varchar2(30)
6 )
7 PARTITION BY RANGE (dt)
8 (
9 PARTITION part1 VALUES LESS THAN (to_date('01-jan-2010','dd-mon-yyyy')) ,
10 PARTITION part2 VALUES LESS THAN (to_date('01-jan-2011','dd-mon-yyyy')) ,
11 PARTITION junk VALUES LESS THAN (MAXVALUE)
12 )
13 /
Table created.
Index created.
Index altered.
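The statements behind "Index created." and "Index altered." are not shown. One way to index only part of the table, offered as an assumption about what the demo did, is to create the local index UNUSABLE and then rebuild just the partition that needs it. The index name t_idx is hypothetical.

```sql
-- Create the local index without building any index partition.
create index t_idx on t(x) local unusable;

-- Build the index for the current-data partition only.
alter index t_idx rebuild partition part2;

-- Check which index partitions are usable.
select partition_name, status
  from user_ind_partitions
 where index_name = 'T_IDX';
```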
Index only what you need
ops$tkyte%ORA11GR2> set autotrace traceonly explain
ops$tkyte%ORA11GR2> select * from t where x = 42;
--------------------------------------------------------------
| Id  | Operation                            | Pstart| Pstop |
--------------------------------------------------------------
|   0 | SELECT STATEMENT                     |       |       |
|   1 |  VIEW                                |       |       |
|   2 |   UNION-ALL                          |       |       |
|   3 |    PARTITION RANGE SINGLE            |     2 |     2 |
|   4 |     TABLE ACCESS BY LOCAL INDEX ROWID|     2 |     2 |
|*  5 |      INDEX RANGE SCAN                |     2 |     2 |
|   6 |    PARTITION RANGE OR                |KEY(OR)|KEY(OR)|
|*  7 |     TABLE ACCESS FULL                |KEY(OR)|KEY(OR)|
--------------------------------------------------------------
• Discuss Statistics
– Local
– Global
Gathering Statistics
Strategy For New Databases
• Create tables
• Optionally Run (or explain) queries on empty tables
– Prime / Seed the optimizer
• Enable incremental statistics
– For large partitioned tables
• Load data
• Gather statistics
– Use the defaults
• Create indexes (if required!)
Gathering Statistics
Incremental Statistics
• One of the biggest problems with large tables is keeping the
schema statistics up to date and accurate
• This is particularly challenging in a Data Warehouse where
tables continue to grow and so the statistics gathering time
and resources grow proportionately
• To address this problem, 11.1 introduced the concept of
incremental statistics for partitioned objects
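• This means that statistics are gathered only for recently modified
partitions, and global statistics are derived from the partition-level
data instead of rescanning the whole table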
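A minimal sketch of turning this on for one table, assuming a partitioned table named T in the current schema (the name is illustrative). Incremental maintenance also relies on the default settings for sampling and granularity, which fits the advice to use the defaults.

```sql
begin
   -- Ask DBMS_STATS to maintain global statistics incrementally.
   dbms_stats.set_table_prefs( user, 'T', 'INCREMENTAL', 'TRUE' );

   -- Gather with defaults; only changed partitions are rescanned
   -- and global statistics are derived from partition-level data.
   dbms_stats.gather_table_stats( user, 'T' );
end;
/

-- Confirm the preference.
select dbms_stats.get_prefs( 'INCREMENTAL', user, 'T' ) from dual;
```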
Gathering Statistics
The Concept of Synopses
[Figure: block lifecycle under compression. Block regions: overhead, free space, uncompressed data, compressed data. Inserts are uncompressed; when block usage reaches PCTFREE, compression is triggered; later inserts are again uncompressed until the threshold is reached again.]
OLTP Table Compression
[Figure: a local symbol table per block; more data fits in each block.]
Using OLTP Table Compression
• Challenge:
– Want to minimize storage
– Do not want to use the Advanced Compression option, for
whatever reason
– In OLTP so no direct path options
– Backup friendly
Applying compression with Partitioning
• A current online, read-write tablespace that gets backed up like every
other normal tablespace in our system. The audit trail information in this
tablespace is not compressed, and it is constantly inserted into.
• A series of tablespaces for last year, the year before, and so on. These
are all read-only and might even be on slow, cheap media. In the event of
a media failure, we just need to restore from backup. We would
occasionally pick a year at random from our backup sets to ensure they
are still restorable (tapes go bad sometimes).
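One way this layout might be maintained, sketched with hypothetical names (table audit_trail, partition fy_2010, tablespace audit_2010, local index audit_trail_idx): once a year is closed, its partition is moved into its own tablespace with basic compression, its local index partition is rebuilt, and the tablespace is made read-only. Basic compression takes effect during direct-path operations such as a partition move, so the current read-write partition stays uncompressed.

```sql
-- Rewrite last year's partition compressed, in its own tablespace.
alter table audit_trail
   move partition fy_2010
   tablespace audit_2010
   compress;

-- The move leaves the matching local index partition UNUSABLE.
alter index audit_trail_idx rebuild partition fy_2010;

-- Freeze the tablespace; it needs to be backed up only once more.
alter tablespace audit_2010 read only;
```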
Purging
Purging
• Challenge:
– Keep N years or months (whatever the period) of data online
– Have the data be constantly available
– Purge old data
– Add new data
– Support efficient indexing scheme (keeping availability
in mind)
– Support efficient storage (use indexes on current data
mostly)
Sliding Windows of Data
• We’ll walk through how to
– Detach the old data: the oldest partition is either
dropped or exchanged with an empty table to permit
archiving of the old data.
– Attach the new data: once the new data is loaded and
processed, the table it is in is exchanged with an empty
partition in the partitioned table, turning the newly loaded
table into a partition of the larger partitioned table.
Sliding Window
ops$tkyte@ORA11GR2> CREATE TABLE partitioned
2 ( timestamp date,
3 id int
4 )
5 PARTITION BY RANGE (timestamp)
6 (
7 PARTITION fy_2004 VALUES LESS THAN
8 ( to_date('01-jan-2005','dd-mon-yyyy') ) ,
9 PARTITION fy_2005 VALUES LESS THAN
10 ( to_date('01-jan-2006','dd-mon-yyyy') )
11 )
12 /
Table created.
To archive to
Sliding Window
Data to be “loaded”
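The slides omit the statements between the table creation and the index status listings. The sequence below is a sketch of how such a sliding window is commonly done, not necessarily the original script. It assumes a local index on ID and a global index on TIMESTAMP, with staging tables fy_2004 (to archive to) and fy_2006 (data to be loaded) named to match the index names in the listings; the indexed columns are assumptions.

```sql
create index partitioned_idx_local on partitioned(id) local;
-- A plain index on a partitioned table is a global index.
create index partitioned_idx_global on partitioned(timestamp);

-- Empty table to receive the oldest partition's data.
create table fy_2004 ( timestamp date, id int );
create index fy_2004_idx on fy_2004(id);

-- Table holding the newly loaded data, indexed after the load.
create table fy_2006 ( timestamp date, id int );
create index fy_2006_idx on fy_2006(id);

-- Detach: swap the old partition with the empty table, then drop it.
alter table partitioned
   exchange partition fy_2004 with table fy_2004
   including indexes without validation;
alter table partitioned drop partition fy_2004;

-- Attach: add an empty partition and swap the loaded table into it.
-- WITHOUT VALIDATION assumes the loaded rows already fit the bounds.
alter table partitioned
   add partition fy_2006
   values less than ( to_date('01-jan-2007','dd-mon-yyyy') );
alter table partitioned
   exchange partition fy_2006 with table fy_2006
   including indexes without validation;

-- The global index is now UNUSABLE and must be rebuilt.
alter index partitioned_idx_global rebuild;
```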
Sliding Window
INDEX_NAME                     STATUS
------------------------------ --------
FY_2006_IDX                    VALID
FY_2004_IDX                    VALID
PARTITIONED_IDX_GLOBAL         UNUSABLE
PARTITIONED_IDX_LOCAL          N/A
Same here
Sliding Window
INDEX_NAME                     STATUS
------------------------------ --------
FY_2006_IDX                    VALID
FY_2004_IDX                    VALID
PARTITIONED_IDX_GLOBAL         VALID
PARTITIONED_IDX_LOCAL          N/A
6 rows selected.
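The listing where PARTITIONED_IDX_GLOBAL stays VALID is what one would expect when the partition operations maintain the global index as they go. A sketch of that variant, assuming staging tables named fy_2004 and fy_2006 (names inferred from the index names in the listing); it trades longer-running DDL for continuous availability of the global index.

```sql
alter table partitioned
   exchange partition fy_2004 with table fy_2004
   including indexes without validation
   update global indexes;

alter table partitioned
   drop partition fy_2004
   update global indexes;

alter table partitioned
   add partition fy_2006
   values less than ( to_date('01-jan-2007','dd-mon-yyyy') );

alter table partitioned
   exchange partition fy_2006 with table fy_2006
   including indexes without validation
   update global indexes;

-- The global index should still report VALID.
select index_name, status from user_indexes;
```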
Backing Up
• Don’t back up indexes
– Even in a read/write environment
– They might represent 50-60% of your database volume
– They are as easy to recreate in parallel/nologging as they
would be to restore
• Easier perhaps
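To make the trade-off concrete, recreating an index in parallel/nologging instead of restoring it might look like the following. The table, column, index name, and degree of parallelism are illustrative assumptions.

```sql
-- Rebuild from the table data, in parallel and with minimal redo.
create index big_table_idx on big_table(object_name)
   parallel 8
   nologging;

-- Reset the attributes so normal queries and DML are unaffected.
alter index big_table_idx noparallel;
alter index big_table_idx logging;
```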
Q&A