Datawarehouse Concept


Oracle9i Data Warehousing Guide Release 2 (9.2) Part Number A96520-01


1 Data Warehousing Concepts


This chapter provides an overview of the Oracle data warehousing implementation. It includes:

- What is a Data Warehouse?
- Data Warehouse Architectures

Note that this book is meant as a supplement to standard texts about data warehousing. This book focuses on Oracle-specific material and does not reproduce in detail material of a general nature. Two standard texts are:

- The Data Warehouse Toolkit by Ralph Kimball (John Wiley and Sons, 1996)
- Building the Data Warehouse by William Inmon (John Wiley and Sons, 1996)

What is a Data Warehouse?


A data warehouse is a relational database that is designed for query and analysis rather than for transaction processing. It usually contains historical data derived from transaction data, but it can include data from other sources. It separates analysis workload from transaction workload and enables an organization to consolidate data from several sources.

In addition to a relational database, a data warehouse environment includes an extraction, transportation, transformation, and loading (ETL) solution, an online analytical processing (OLAP) engine, client analysis tools, and other applications that manage the process of gathering data and delivering it to business users.

See Also: Chapter 10, "Overview of Extraction, Transformation, and Loading"

A common way of introducing data warehousing is to refer to the characteristics of a data warehouse as set forth by William Inmon:

- Subject Oriented
- Integrated
- Nonvolatile
- Time Variant

Subject Oriented
Data warehouses are designed to help you analyze data. For example, to learn more about your company's sales data, you can build a warehouse that concentrates on sales. Using this warehouse, you can answer questions like "Who was our best customer for this item last year?" This ability to define a data warehouse by subject matter, sales in this case, makes the data warehouse subject oriented.

Integrated
Integration is closely related to subject orientation. Data warehouses must put data from disparate sources into a consistent format. They must resolve such problems as naming conflicts and inconsistencies among units of measure. When they achieve this, they are said to be integrated.
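As a rough illustration of what resolving naming conflicts and unit inconsistencies can look like, the following sketch assumes two hypothetical source systems that describe the same kind of record with different column names and units. The field names (cust_no, customer_id, weight_lb, weight_kg) are invented for this example and do not come from the guide.

```python
# Hypothetical records from two source systems describing the same kind of shipment.
# Source A uses "cust_no" and pounds; source B uses "customer_id" and kilograms.
source_a = [{"cust_no": 101, "weight_lb": 22.0}]
source_b = [{"customer_id": 102, "weight_kg": 5.0}]

LB_TO_KG = 0.45359237  # international pound expressed in kilograms


def integrate(rows_a, rows_b):
    """Map both sources onto one naming scheme and one unit of measure."""
    unified = []
    for row in rows_a:
        unified.append({"customer_id": row["cust_no"],
                        "weight_kg": round(row["weight_lb"] * LB_TO_KG, 3)})
    for row in rows_b:
        unified.append({"customer_id": row["customer_id"],
                        "weight_kg": round(row["weight_kg"], 3)})
    return unified


warehouse_rows = integrate(source_a, source_b)
```

After the call, every row in warehouse_rows uses the same column names and the same unit, whichever source it came from.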

Nonvolatile
Nonvolatile means that, once entered into the warehouse, data should not change. This is logical because the purpose of a warehouse is to enable you to analyze what has occurred.

Time Variant
In order to discover trends in business, analysts need large amounts of data. This is very much in contrast to online transaction processing (OLTP) systems, where performance requirements demand that historical data be moved to an archive. A data warehouse's focus on change over time is what is meant by the term time variant.

Contrasting OLTP and Data Warehousing Environments


Figure 1-1 illustrates key differences between an OLTP system and a data warehouse.

Figure 1-1 Contrasting OLTP and Data Warehousing Environments

One major difference between the types of system is that data warehouses are not usually in third normal form (3NF), a type of data normalization common in OLTP environments. Data warehouses and OLTP systems have very different requirements. Here are some examples of differences between typical data warehouses and OLTP systems:

Workload

Data warehouses are designed to accommodate ad hoc queries. You might not know the workload of your data warehouse in advance, so a data warehouse should be optimized to perform well for a wide variety of possible query operations. OLTP systems support only predefined operations. Your applications might be specifically tuned or designed to support only these operations.

Data modifications

A data warehouse is updated on a regular basis by the ETL process (run nightly or weekly) using bulk data modification techniques. The end users of a data warehouse do not directly update the data warehouse. In OLTP systems, end users routinely issue individual data modification statements to the database. The OLTP database is always up to date, and reflects the current state of each business transaction.

Schema design

Data warehouses often use denormalized or partially denormalized schemas (such as a star schema) to optimize query performance. OLTP systems often use fully normalized schemas to optimize update/insert/delete performance, and to guarantee data consistency.

Typical operations

A typical data warehouse query scans thousands or millions of rows. For example, "Find the total sales for all customers last month." A typical OLTP operation accesses only a handful of records. For example, "Retrieve the current order for this customer."

Historical data

Data warehouses usually store many months or years of data. This is to support historical analysis. OLTP systems usually store data from only a few weeks or months. The OLTP system stores only historical data as needed to successfully meet the requirements of the current transaction.
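The difference in typical operations can be sketched with two queries against the same table. This is only an illustration: Python's built-in sqlite3 module stands in for an Oracle database, and the orders table with its columns is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, "
             "order_month TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(1, 10, "2002-01", 50.0), (2, 11, "2002-01", 75.0),
     (3, 10, "2002-02", 20.0), (4, 12, "2002-02", 30.0)],
)

# OLTP-style operation: fetch a handful of rows for one customer.
current_order = conn.execute(
    "SELECT order_id, amount FROM orders WHERE customer_id = ? "
    "ORDER BY order_id DESC LIMIT 1", (10,)).fetchone()

# Warehouse-style query: scan every row for a period and aggregate.
total_sales = conn.execute(
    "SELECT SUM(amount) FROM orders WHERE order_month = ?", ("2002-01",)).fetchone()[0]
```

The first query touches one customer's rows; the second has to read every row for the month, which is why the two kinds of system are tuned so differently.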

Data Warehouse Architectures


Data warehouses and their architectures vary depending upon the specifics of an organization's situation. Three common architectures are:

- Data Warehouse Architecture (Basic)
- Data Warehouse Architecture (with a Staging Area)
- Data Warehouse Architecture (with a Staging Area and Data Marts)

Data Warehouse Architecture (Basic)


Figure 1-2 shows a simple architecture for a data warehouse. End users directly access data derived from several source systems through the data warehouse.

Figure 1-2 Architecture of a Data Warehouse

In Figure 1-2, the metadata and raw data of a traditional OLTP system is present, as is an additional type of data, summary data. Summaries are very valuable in data warehouses because they pre-compute long operations in advance. For example, a typical data warehouse query is to retrieve something like August sales. A summary in Oracle is called a materialized view.
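The idea of pre-computing a summary can be sketched as follows. This is an assumption-laden stand-in rather than Oracle syntax: SQLite (used here through Python's sqlite3 module) has no materialized views, so an ordinary table built from an aggregate query plays that role, and the sales table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_month TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("2002-07", 10.0), ("2002-08", 40.0), ("2002-08", 60.0)])

# Pre-compute the aggregation once, ahead of query time.
conn.execute("CREATE TABLE sales_by_month AS "
             "SELECT sale_month, SUM(amount) AS total_amount "
             "FROM sales GROUP BY sale_month")

# "August sales" now reads one summary row instead of scanning the detail rows.
august_from_summary = conn.execute(
    "SELECT total_amount FROM sales_by_month WHERE sale_month = '2002-08'").fetchone()[0]
august_from_detail = conn.execute(
    "SELECT SUM(amount) FROM sales WHERE sale_month = '2002-08'").fetchone()[0]
```

Both queries return the same total. Unlike a real materialized view, the summary table in this sketch is not refreshed when the detail table changes.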

Data Warehouse Architecture (with a Staging Area)


In Figure 1-2, you need to clean and process your operational data before putting it into the warehouse. You can do this programmatically, although most data warehouses use a staging area instead. A staging area simplifies building summaries and general warehouse management. Figure 1-3 illustrates this typical architecture.

Figure 1-3 Architecture of a Data Warehouse with a Staging Area

Data Warehouse Architecture (with a Staging Area and Data Marts)


Although the architecture in Figure 1-3 is quite common, you may want to customize your warehouse's architecture for different groups within your organization. You can do this by adding data marts, which are systems designed for a particular line of business. Figure 1-4 illustrates an example where purchasing, sales, and inventories are separated. In this example, a financial analyst might want to analyze historical data for purchases and sales.

Figure 1-4 Architecture of a Data Warehouse with a Staging Area and Data Marts

Note: Data marts are an important part of many warehouses, but they are not the focus of this book.

See Also: Data Mart Suites documentation for further information regarding data marts

Copyright © 1996, 2002 Oracle Corporation. All Rights Reserved.

Part II Logical Design


This section deals with the issues in logical design in a data warehouse. It contains the following chapter:

- Logical Design in Data Warehouses

2 Logical Design in Data Warehouses


This chapter tells you how to design a data warehousing environment and includes the following topics:

- Logical Versus Physical Design in Data Warehouses
- Creating a Logical Design
- Data Warehousing Schemas
- Data Warehousing Objects

Logical Versus Physical Design in Data Warehouses


Your organization has decided to build a data warehouse. You have defined the business requirements and agreed upon the scope of your application, and created a conceptual design. Now you need to translate your requirements into a system deliverable. To do so, you create the logical and physical design for the data warehouse. You then define:

- The specific data content
- Relationships within and between groups of data
- The system environment supporting your data warehouse
- The data transformations required
- The frequency with which data is refreshed

The logical design is more conceptual and abstract than the physical design. In the logical design, you look at the logical relationships among the objects. In the physical design, you look at the most effective way of storing and retrieving the objects as well as handling them from a transportation and backup/recovery perspective.

Orient your design toward the needs of the end users. End users typically want to perform analysis and look at aggregated data, rather than at individual transactions. However, end users might not know what they need until they see it. In addition, a well-planned design allows for growth and changes as the needs of users change and evolve. By beginning with the logical design, you focus on the information requirements and save the implementation details for later.

Creating a Logical Design


A logical design is conceptual and abstract. You do not deal with the physical implementation details yet. You deal only with defining the types of information that you need.

One technique you can use to model your organization's logical information requirements is entity-relationship modeling. Entity-relationship modeling involves identifying the things of importance (entities), the properties of these things (attributes), and how they are related to one another (relationships).

The process of logical design involves arranging data into a series of logical relationships called entities and attributes. An entity represents a chunk of information. In relational databases, an entity often maps to a table. An attribute is a component of an entity that helps define the uniqueness of the entity. In relational databases, an attribute maps to a column.

To be sure that your data is consistent, you need to use unique identifiers. A unique identifier is something you add to tables so that you can differentiate between the same item when it appears in different places. In a physical design, this is usually a primary key.

While entity-relationship diagramming has traditionally been associated with highly normalized models such as OLTP applications, the technique is still useful for data warehouse design in the form of dimensional modeling. In dimensional modeling, instead of seeking to discover atomic units of information (such as entities and attributes) and all of the relationships between them, you identify which information belongs to a central fact table and which information belongs to its associated dimension tables. You identify business subjects or fields of data, define relationships between business subjects, and name the attributes for each subject.

See Also: Chapter 9, "Dimensions" for further information regarding dimensions

Your logical design should result in (1) a set of entities and attributes corresponding to fact tables and dimension tables and (2) a model of operational data from your source into subject-oriented information in your target data warehouse schema.

You can create the logical design using a pen and paper, or you can use a design tool such as Oracle Warehouse Builder (specifically designed to support modeling the ETL process) or Oracle Designer (a general purpose modeling tool).

See Also: Oracle Designer and Oracle Warehouse Builder documentation sets

Data Warehousing Schemas


A schema is a collection of database objects, including tables, views, indexes, and synonyms. You can arrange schema objects in the schema models designed for data warehousing in a variety of ways. Most data warehouses use a dimensional model.

The model of your source data and the requirements of your users help you design the data warehouse schema. You can sometimes get the source model from your company's enterprise data model and reverse-engineer the logical data model for the data warehouse from this. The physical implementation of the logical data warehouse model may require some changes to adapt it to your system parameters: size of machine, number of users, storage capacity, type of network, and software.

Star Schemas
The star schema is the simplest data warehouse schema. It is called a star schema because the diagram resembles a star, with points radiating from a center. The center of the star consists of one or more fact tables and the points of the star are the dimension tables, as shown in Figure 2-1.

Figure 2-1 Star Schema

The most natural way to model a data warehouse is as a star schema. Only one join establishes the relationship between the fact table and any one of the dimension tables. A star schema optimizes performance by keeping queries simple and providing fast response time. All the information about each level is stored in one row.

Note: Oracle Corporation recommends that you choose a star schema unless you have a clear reason not to.
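A minimal star schema might look like the sketch below. It uses Python's built-in sqlite3 module in place of an Oracle database, and the tables (a sales fact table with products and customers dimensions) and their columns are hypothetical examples chosen for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products  (product_id  INTEGER PRIMARY KEY, product_name  TEXT, category TEXT);
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, customer_name TEXT, country  TEXT);
-- Fact table: one foreign key per dimension plus a numeric measure.
CREATE TABLE sales (
    product_id  INTEGER REFERENCES products(product_id),
    customer_id INTEGER REFERENCES customers(customer_id),
    amount      REAL
);
INSERT INTO products  VALUES (1, 'Widget', 'Hardware'), (2, 'Manual', 'Books');
INSERT INTO customers VALUES (1, 'Acme', 'US'), (2, 'Globex', 'DE');
INSERT INTO sales     VALUES (1, 1, 100.0), (1, 2, 50.0), (2, 1, 25.0);
""")

# Each dimension is reached from the fact table through exactly one join.
sales_by_category = conn.execute("""
    SELECT p.category, c.country, SUM(s.amount)
    FROM sales s
    JOIN products  p ON p.product_id  = s.product_id
    JOIN customers c ON c.customer_id = s.customer_id
    GROUP BY p.category, c.country
    ORDER BY p.category, c.country
""").fetchall()
```

Because every descriptive attribute of a product or customer sits in a single dimension row, the query needs no further joins to group by category or country.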

Other Schemas
Some schemas in data warehousing environments use third normal form rather than star schemas. Another schema that is sometimes useful is the snowflake schema, which is a star schema with normalized dimensions in a tree structure.

See Also: Chapter 17, "Schema Modeling Techniques" for further information regarding star and snowflake schemas in data warehouses and Oracle9i Database Concepts for further conceptual material

Data Warehousing Objects


Fact tables and dimension tables are the two types of objects commonly used in dimensional data warehouse schemas.

Fact tables are the large tables in your warehouse schema that store business measurements. Fact tables typically contain facts and foreign keys to the dimension tables. Fact tables represent data, usually numeric and additive, that can be analyzed and examined. Examples include sales, cost, and profit.

Dimension tables, also known as lookup or reference tables, contain the relatively static data in the warehouse. Dimension tables store the information you normally use to constrain queries. Dimension tables are usually textual and descriptive and you can use them as the row headers of the result set. Examples are customers or products.

Fact Tables
A fact table typically has two types of columns: those that contain numeric facts (often called measurements), and those that are foreign keys to dimension tables. A fact table contains either detail-level facts or facts that have been aggregated. Fact tables that contain aggregated facts are often called summary tables. A fact table usually contains facts with the same level of aggregation. Though most facts are additive, they can also be semi-additive or non-additive. Additive facts can be aggregated by simple arithmetical addition. A common example of this is sales. Non-additive facts cannot be added at all. An example of this is averages. Semi-additive facts can be aggregated along some of the dimensions and not along others. An example of this is inventory levels, where you cannot tell what a level means simply by looking at it.

Creating a New Fact Table

You must define a fact table for each star schema. From a modeling standpoint, the primary key of the fact table is usually a composite key that is made up of all of its foreign keys.
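The distinction between additive, non-additive, and semi-additive facts described in this section can be made concrete with a few invented rows. The data and field names below are hypothetical; the sketch only shows why each kind of fact aggregates differently.

```python
# Hypothetical daily facts for one product in two stores.
facts = [
    {"day": 1, "store": "A", "sales": 10.0, "units": 2, "inventory": 100},
    {"day": 1, "store": "B", "sales": 30.0, "units": 3, "inventory": 40},
    {"day": 2, "store": "A", "sales": 20.0, "units": 5, "inventory": 95},
    {"day": 2, "store": "B", "sales": 40.0, "units": 10, "inventory": 30},
]

# Additive: sales can be summed across every dimension.
total_sales = sum(f["sales"] for f in facts)

# Non-additive: an average price cannot be summed or averaged again;
# it has to be recomputed from its additive components.
avg_price_per_row = [f["sales"] / f["units"] for f in facts]
naive_average = sum(avg_price_per_row) / len(avg_price_per_row)
true_average = total_sales / sum(f["units"] for f in facts)

# Semi-additive: inventory can be summed across stores for one day,
# but summing it across days has no business meaning.
inventory_day_2 = sum(f["inventory"] for f in facts if f["day"] == 2)
```

Here the average of the per-row averages (5.75) differs from the true average price (5.0), which is the practical reason averages are usually stored as their additive components instead.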

Dimension Tables
A dimension is a structure, often composed of one or more hierarchies, that categorizes data. Dimensional attributes help to describe the dimensional value. They are normally descriptive, textual values. Several distinct dimensions, combined with facts, enable you to answer business questions. Commonly used dimensions are customers, products, and time.

Dimension data is typically collected at the lowest level of detail and then aggregated into higher level totals that are more useful for analysis. These natural rollups or aggregations within a dimension table are called hierarchies.

Hierarchies

Hierarchies are logical structures that use ordered levels as a means of organizing data. A hierarchy can be used to define data aggregation. For example, in a time dimension, a hierarchy might aggregate data from the month level to the quarter level to the year level. A hierarchy can also be used to define a navigational drill path and to establish a family structure.

Within a hierarchy, each level is logically connected to the levels above and below it. Data values at lower levels aggregate into the data values at higher levels. A dimension can be composed of more than one hierarchy. For example, in the product dimension, there might be two hierarchies: one for product categories and one for product suppliers.

Dimension hierarchies also group levels from general to granular. Query tools use hierarchies to enable you to drill down into your data to view different levels of granularity. This is one of the key benefits of a data warehouse.

When designing hierarchies, you must consider the relationships in business structures. For example, a divisional multilevel sales organization.

Hierarchies impose a family structure on dimension values. For a particular level value, a value at the next higher level is its parent, and values at the next lower level are its children. These familial relationships enable analysts to access data quickly.
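The month-to-quarter-to-year rollup mentioned above can be sketched in a few lines. The sales figures are invented, and the quarter_of helper is a hypothetical name introduced only for this illustration.

```python
from collections import defaultdict

# Hypothetical monthly sales keyed by (year, month).
monthly_sales = {(2002, 1): 10.0, (2002, 2): 20.0, (2002, 3): 30.0,
                 (2002, 4): 40.0, (2003, 1): 5.0}


def quarter_of(month):
    """Return the quarter (1-4) that a month (1-12) belongs to."""
    return (month - 1) // 3 + 1


# Roll up month -> quarter: each month has exactly one parent quarter.
quarterly_sales = defaultdict(float)
for (year, month), amount in monthly_sales.items():
    quarterly_sales[(year, quarter_of(month))] += amount

# Roll up quarter -> year: each quarter has exactly one parent year.
yearly_sales = defaultdict(float)
for (year, _quarter), amount in quarterly_sales.items():
    yearly_sales[year] += amount
```

Note that the yearly totals are derived from the quarterly totals, not from the monthly detail. That is possible only because each child value has exactly one parent at the next level.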

Levels
A level represents a position in a hierarchy. For example, a time dimension might have a hierarchy that represents data at the month, quarter, and year levels. Levels range from general to specific, with the root level as the highest or most general level. The levels in a dimension are organized into one or more hierarchies.

Level Relationships

Level relationships specify top-to-bottom ordering of levels from most general (the root) to most specific information. They define the parent-child relationship between the levels in a hierarchy.

Hierarchies are also essential components in enabling more complex rewrites. For example, the database can aggregate existing sales revenue on a quarterly basis to a yearly aggregation when the dimensional dependencies between quarter and year are known.

Typical Dimension Hierarchy

Figure 2-2 illustrates a dimension hierarchy based on customers.

Figure 2-2 Typical Levels in a Dimension Hierarchy

See Also: Chapter 9, "Dimensions" and Chapter 22, "Query Rewrite" for further information regarding hierarchies

Unique Identifiers
Unique identifiers are specified for one distinct record in a dimension table. Artificial unique identifiers are often used to avoid the potential problem of unique identifiers changing. Unique identifiers are represented with the # character. For example, #customer_id.

Relationships
Relationships guarantee business integrity. An example is that if a business sells something, there is obviously a customer and a product. Designing a relationship between the sales information in the fact table and the dimension tables products and customers enforces the business rules in databases.

Example of Data Warehousing Objects and Their Relationships


Figure 2-3 illustrates a common example of a sales fact table and dimension tables customers, products, promotions, times, and channels.

Figure 2-3 Typical Data Warehousing Objects


Part III Physical Design


This section deals with the physical design of a data warehouse. It contains the following chapters:

- Physical Design in Data Warehouses
- Hardware and I/O Considerations in Data Warehouses
- Parallelism and Partitioning in Data Warehouses
- Indexes
- Integrity Constraints
- Materialized Views
- Dimensions


3 Physical Design in Data Warehouses


This chapter describes the physical design of a data warehousing environment, and includes the following topics:

- Moving from Logical to Physical Design
- Physical Design

Moving from Logical to Physical Design

Logical design is what you draw with a pen and paper or design with Oracle Warehouse Builder or Designer before building your warehouse. Physical design is the creation of the database with SQL statements.

During the physical design process, you convert the data gathered during the logical design phase into a description of the physical database structure. Physical design decisions are mainly driven by query performance and database maintenance aspects. For example, choosing a partitioning strategy that meets common query requirements enables Oracle to take advantage of partition pruning, a way of narrowing a search before performing it.

See Also: Chapter 5, "Parallelism and Partitioning in Data Warehouses" for further information regarding partitioning

Oracle9i Database Concepts for further conceptual material regarding all design matters

Physical Design
During the logical design phase, you defined a model for your data warehouse consisting of entities, attributes, and relationships. The entities are linked together using relationships. Attributes are used to describe the entities. The unique identifier (UID) distinguishes between one instance of an entity and another.

Figure 3-1 offers you a graphical way of looking at the different ways of thinking about logical and physical designs.

Figure 3-1 Logical Design Compared with Physical Design

During the physical design process, you translate the expected schemas into actual database structures. At this time, you have to map:

- Entities to tables
- Relationships to foreign key constraints
- Attributes to columns
- Primary unique identifiers to primary key constraints
- Unique identifiers to unique key constraints
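The mapping listed above can be sketched as table definitions. This is an illustration under stated assumptions: it runs on SQLite through Python's sqlite3 module in place of Oracle, the customers and sales tables are hypothetical, and SQLite enforces foreign keys only after the PRAGMA shown in the code is issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE customers (                      -- entity -> table
    customer_id INTEGER PRIMARY KEY,          -- primary unique identifier -> primary key
    email       TEXT UNIQUE,                  -- unique identifier -> unique key
    cust_name   TEXT                          -- attribute -> column
);
CREATE TABLE sales (
    sale_id     INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL
        REFERENCES customers(customer_id),    -- relationship -> foreign key
    amount      REAL
);
INSERT INTO customers VALUES (1, 'a@example.com', 'Acme');
INSERT INTO sales VALUES (1, 1, 99.0);
""")

# A sale that points at a customer that does not exist violates the relationship.
try:
    conn.execute("INSERT INTO sales VALUES (2, 42, 10.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The rejected insert shows the relationship from the logical model being enforced by the database as a foreign key constraint.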

Physical Design Structures


Once you have converted your logical design to a physical one, you will need to create some or all of the following structures:

- Tablespaces
- Tables and Partitioned Tables
- Views
- Integrity Constraints
- Dimensions

Some of these structures require disk space. Others exist only in the data dictionary. Additionally, the following structures may be created for performance improvement:

- Indexes and Partitioned Indexes
- Materialized Views

Tablespaces
A tablespace consists of one or more datafiles, which are physical structures within the operating system you are using. A datafile is associated with only one tablespace. From a design perspective, tablespaces are containers for physical design structures.

Tablespaces need to be separated by differences. For example, tables should be separated from their indexes and small tables should be separated from large tables. Tablespaces should also represent logical business units if possible. Because a tablespace is the coarsest granularity for backup and recovery or the transportable tablespaces mechanism, the logical business design affects availability and maintenance operations.

See Also: Chapter 4, "Hardware and I/O Considerations in Data Warehouses" for further information regarding tablespaces

Tables and Partitioned Tables


Tables are the basic unit of data storage. They are the container for the expected amount of raw data in your data warehouse.

Using partitioned tables instead of nonpartitioned ones addresses the key problem of supporting very large data volumes by allowing you to decompose them into smaller and more manageable pieces. The main design criterion for partitioning is manageability, though you will also see performance benefits in most cases because of partition pruning or intelligent parallel processing. For example, you might choose a partitioning strategy based on a sales transaction date and a monthly granularity. If you have four years' worth of data, you can delete a month's data as it becomes older than four years with a single, quick DDL statement and load new data while only affecting 1/48th of the complete table. Business questions regarding the last quarter will only affect three months, which is equivalent to three partitions, or 3/48ths of the total volume.
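The fractions in the partitioning example follow from simple arithmetic, worked through below. The prune function is a toy stand-in for partition pruning, written only to show the idea; it is not how Oracle implements it, and the month keys are invented.

```python
# Four years of data, one partition per month.
YEARS_RETAINED = 4
partitions = YEARS_RETAINED * 12

# Share of the table touched by a monthly load or purge: one partition.
monthly_share = 1 / partitions

# Share touched by a query on the last quarter: three partitions.
quarter_share = 3 / partitions


def prune(partition_keys, wanted_months):
    """Toy partition pruning: keep only the partitions a query needs."""
    return [key for key in partition_keys if key in wanted_months]


all_months = [f"{year}-{month:02d}" for year in range(1999, 2003) for month in range(1, 13)]
last_quarter = prune(all_months, {"2002-10", "2002-11", "2002-12"})
```

With 48 monthly partitions, a last-quarter query reads 3 of them, or 6.25 percent of the table, and the remaining 45 are never touched.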

Partitioning large tables improves performance because each partitioned piece is more manageable. Typically, you partition based on transaction dates in a data warehouse. For example, each month, one month's worth of data can be assigned its own partition.

Data Segment Compression

You can save disk space by compressing heap-organized tables. A typical type of heap-organized table you should consider for data segment compression is partitioned tables.

To reduce disk use and memory use (specifically, the buffer cache), you can store tables and partitioned tables in a compressed format inside the database. This often leads to a better scaleup for read-only operations. Data segment compression can also speed up query execution. There is, however, a cost in CPU overhead.

Data segment compression should be used with highly redundant data, such as tables with many foreign keys. You should avoid compressing tables with much update or other DML activity. Although compressed tables or partitions are updatable, there is some overhead in updating these tables, and high update activity may work against compression by causing some space to be wasted.

See Also: Chapter 5, "Parallelism and Partitioning in Data Warehouses" and Chapter 14, "Maintaining the Data Warehouse" for information regarding data segment compression and partitioned tables

Views
A view is a tailored presentation of the data contained in one or more tables or other views. A view takes the output of a query and treats it as a table. Views do not require any space in the database.

See Also: Oracle9i Database Concepts
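A small sketch can show why a view needs no storage for its rows: it is re-evaluated against the base table each time it is queried. SQLite, through Python's sqlite3 module, stands in for Oracle here, and the sales table and east_sales view are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (region TEXT, amount REAL);
INSERT INTO sales VALUES ('East', 10.0), ('West', 20.0), ('East', 5.0);
-- The view stores only its defining query, not the rows it returns.
CREATE VIEW east_sales AS SELECT amount FROM sales WHERE region = 'East';
""")

before = conn.execute("SELECT SUM(amount) FROM east_sales").fetchone()[0]

# A change to the base table is visible through the view immediately.
conn.execute("INSERT INTO sales VALUES ('East', 1.0)")
after = conn.execute("SELECT SUM(amount) FROM east_sales").fetchone()[0]
```

The second total reflects the new row without any refresh step, in contrast to a stored summary.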

Integrity Constraints
Integrity constraints are used to enforce business rules associated with your database and to prevent having invalid information in the tables. Integrity constraints in data warehousing differ from constraints in OLTP environments. In OLTP environments, they primarily prevent the insertion of invalid data into a record, which is not a big problem in data warehousing environments because accuracy has already been guaranteed. In data warehousing environments, constraints are only used for query rewrite. NOT NULL constraints are particularly common in data warehouses. Under some specific circumstances, constraints need space in the database. These constraints are in the form of the underlying unique index.

See Also: Chapter 7, "Integrity Constraints" and Chapter 22, "Query Rewrite"

Indexes and Partitioned Indexes


Indexes are optional structures associated with tables or clusters. In addition to the classical B-tree indexes, bitmap indexes are very common in data warehousing environments. Bitmap indexes are optimized index structures for set-oriented operations. Additionally, they are necessary for some optimized data access methods such as star transformations.

Indexes are just like tables in that you can partition them, although the partitioning strategy is not dependent upon the table structure. Partitioning indexes makes it easier to manage the warehouse during refresh and improves query performance.

See Also: Chapter 6, "Indexes" and Chapter 14, "Maintaining the Data Warehouse"

Materialized Views
Materialized views are query results that have been stored in advance so long-running calculations are not necessary when you actually execute your SQL statements. From a physical design point of view, materialized views resemble tables or partitioned tables and behave like indexes.

See Also: Chapter 8, "Materialized Views"

Dimensions
A dimension is a schema object that defines hierarchical relationships between columns or column sets. A hierarchical relationship is a functional dependency from one level of a hierarchy to the next one. A dimension is a container of logical relationships and does not require any space in the database. A typical dimension is city, state (or province), region, and country.

See Also: Chapter 9, "Dimensions"


4 Hardware and I/O Considerations in Data Warehouses


This chapter explains some of the hardware and I/O issues in a data warehousing environment and includes the following topics:

- Overview of Hardware and I/O Considerations in Data Warehouses
- RAID Configurations

Overview of Hardware and I/O Considerations in Data Warehouses


Data warehouses are normally very concerned with I/O performance. This is in contrast to OLTP systems, where the potential bottleneck depends on user workload and application access patterns. When a system is constrained by I/O capabilities, it is I/O bound, or has an I/O bottleneck. When a system is constrained by having limited CPU resources, it is CPU bound, or has a CPU bottleneck.

Database architects frequently use RAID (Redundant Arrays of Inexpensive Disks) systems to overcome I/O bottlenecks and to provide higher availability. RAID can be implemented in several levels, ranging from 0 to 7. Many hardware vendors have enhanced these basic levels to lessen the impact of some of the original restrictions at a given RAID level. The most common RAID levels are discussed later in this chapter.

Why Stripe the Data?


To avoid I/O bottlenecks during parallel processing or concurrent query access, all tablespaces accessed by parallel operations should be striped. Striping divides the data of a large table into small portions and stores them on separate datafiles on separate disks. As shown in Figure 4-1, tablespaces should always stripe over at least as many devices as CPUs. In this example, there are four CPUs, two controllers, and five devices containing tablespaces.

Figure 4-1 Striping Objects Over at Least as Many Devices as CPUs

See Also: Oracle9i Database Concepts for further details about disk striping

You should stripe tablespaces for tables, indexes, rollback segments, and temporary tablespaces. You must also spread the devices over controllers, I/O channels, and internal buses. To make striping effective, you must make sure that enough controllers and other I/O components are available to support the bandwidth of parallel data movement into and out of the striped tablespaces.

You can use RAID systems or you can perform striping manually through careful data file allocation to tablespaces.

The striping of data across physical drives has several consequences besides balancing I/O. One additional advantage is that logical files can be created that are larger than the maximum size usually supported by an operating system. There are disadvantages however. Striping means that it is no longer possible to locate a single datafile on a specific physical drive. This can cause the loss of some application tuning capabilities. Also, it can cause database recovery to be more time-consuming. If a single physical disk in a RAID array needs recovery, all the disks that are part of that logical RAID device must be involved in the recovery.

Auto atic Striping

' tomatic stripin! is s all* flexible and eas* to mana!e" It s pports man* scenarios s ch as m ltiple sers r nnin! se4 entiall* or as sin!le sers r nnin! in parallel" Two main advanta!es make a tomatic stripin! preferable to man al stripin!/ nless the s*stem is ver* small or availabilit* is the main concern#

For parallel scan operations -s ch as f ll table scan or fast f ll scan3/ operatin! s*stem stripin! increases the n mber of disk seeks" (evertheless/ this is lar!el* offset b* the lar!e I@O si5e -DB_BLOCK_SI ! I "ULTIBLOCK_#!$D_COUNT3/ which sho ld enable this operation to reach the maxim m I@O thro !hp t for *o r platform" This maxim m is in !eneral limited b* the n mber of controllers or I@O b ses of the platform/ not b* the n mber of disks - nless *o have a small confi! ration or are sin! lar!e disks3" For index probes -for example/ within a nested loop :oin or parallel index ran!e scan3/ operatin! s*stem stripin! enables *o to avoid hot spots b* evenl* distrib tin! I@O across the disks"

Oracle Corporation recommends using a large stripe size of at least 64 KB. Stripe size must be at least as large as the I/O size. If stripe size is larger than I/O size by a factor of two or four, then trade-offs may arise. The large stripe size can be advantageous because it lets the system perform more sequential operations on each disk; it decreases the number of seeks on disk. Another advantage of large stripe sizes is that more users can work on the system without affecting each other. The disadvantage is that large stripes reduce the I/O parallelism, so fewer disks are simultaneously active. If you encounter problems, increase the I/O size of scan operations (for example, from 64 KB to 128 KB), instead of changing the stripe size. The maximum I/O size is platform-specific (in a range, for example, of 64 KB to 1 MB).

With automatic striping, from a performance standpoint, the best layout is to stripe data, indexes, and temporary tablespaces across all the disks of your platform. This layout is also appropriate when you have little information about system usage. To increase availability, it may be more practical to stripe over fewer disks to prevent a single disk failure from affecting the entire data warehouse. However, for better performance, it is crucial to stripe all objects over multiple disks. In this way, maximum I/O performance (both in terms of throughput and in number of I/Os per second) can be reached when one object is accessed by a parallel operation. If multiple objects are accessed at the same time (as in a multiuser configuration), striping automatically limits the contention.
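The sizing rules in this section can be checked with simple arithmetic. The following Python sketch is illustrative only: the function names and the sample values (8 KB blocks and 8 blocks for each multiblock read) are assumptions made for the example, not values taken from this guide.

```python
KB = 1024

def scan_io_size(db_block_size, multiblock_read_count):
    """I/O size of a scan in bytes: block size times blocks per read."""
    return db_block_size * multiblock_read_count

def check_stripe(stripe_size, io_size, min_stripe=64 * KB):
    """Return a list of problems with a proposed stripe size (empty if none)."""
    problems = []
    if stripe_size < min_stripe:
        problems.append("stripe size is below the recommended 64 KB minimum")
    if stripe_size < io_size:
        problems.append("stripe size is smaller than the I/O size")
    return problems

# Assumed sample values: 8 KB blocks, 8 blocks per multiblock read.
io_size = scan_io_size(8 * KB, 8)
print(io_size // KB, "KB per scan I/O")
print(check_stripe(64 * KB, io_size))   # no problems reported
print(check_stripe(32 * KB, io_size))   # violates both rules
```

A stripe size that fails either check would force a single scan I/O to span more than one stripe unit, which is the situation the recommendation is meant to avoid.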

Manual Striping

You can use manual striping on all platforms. To do this, add multiple files to each tablespace, with each file on a separate disk. If you use manual striping correctly, your system's performance improves significantly. However, you should be aware of several drawbacks that can adversely affect performance if you do not stripe correctly.

When using manual striping, the degree of parallelism (DOP) is more a function of the number of disks than of the number of CPUs. First, it is necessary to have one server process for each datafile to drive all the disks and limit the risk of experiencing I/O bottlenecks. Second, manual striping is very sensitive to datafile size skew, which can affect the scalability of parallel scan operations. Third, manual striping requires more planning and set-up effort than automatic striping.

Note: Oracle Corporation recommends that you choose automatic striping unless you have a clear reason not to.
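The effect of datafile size skew can be estimated with a rough model. The Python sketch below assumes one server process for each datafile and identical throughput on every disk; both the model and the function name are simplifying assumptions introduced for illustration.

```python
def scan_efficiency(datafile_sizes_mb):
    """Fraction of the ideal speedup when one process scans each datafile.

    The scan ends when the largest file has been read, so elapsed time is
    proportional to the maximum size, while the ideal time is proportional
    to the mean size.
    """
    if not datafile_sizes_mb:
        raise ValueError("at least one datafile is required")
    mean = sum(datafile_sizes_mb) / len(datafile_sizes_mb)
    return mean / max(datafile_sizes_mb)

even = scan_efficiency([1000, 1000, 1000, 1000])
skewed = scan_efficiency([2500, 500, 500, 500])
print(even, skewed)
```

In this model, four equally sized files give an efficiency of 1.0, while the skewed layout leaves three of the four processes idle for most of the scan.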

Local and Global Striping

Local striping, which applies only to partitioned tables and indexes, is a form of non-overlapping, disk-to-partition striping. Each partition has its own set of disks and files, as illustrated in Figure 4-2. Disk access does not overlap, nor do files.

An advantage of local striping is that if one disk fails, it does not affect other partitions. Moreover, you still have some striping even if you have data in only one partition.

A disadvantage of local striping is that you need many disks to implement it: each partition requires multiple disks of its own. Another major disadvantage is that when partitions are reduced to a few or even a single partition, the system retains limited I/O bandwidth. As a result, local striping is not optimal for parallel operations. For this reason, consider local striping only if your main concern is availability, rather than parallel execution.

Figure 4-2 Local Striping

Text description of the illustration dwhsg100.gif

Global striping, illustrated in Figure 4-3, entails overlapping disks and partitions.

Figure 4-3 Global Striping

Text description of the illustration dwhsg101.gif

Global striping is advantageous if you have partition pruning and need to access data in only one partition. Spreading the data in that partition across many disks improves performance for parallel execution operations. A disadvantage of global striping is that if one disk fails, all partitions are affected if the disks are not mirrored.

See Also: Oracle9i Database Concepts for information on disk striping and partitioning. For MPP systems, see your operating system specific Oracle documentation regarding the advisability of disabling disk affinity when using operating system striping
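The trade-off between local and global striping can be made concrete with a small model. In the Python sketch below, the partition names, disk counts, and function names are hypothetical; the layouts simply mirror the definitions given in this section.

```python
def local_layout(partitions, disks_per_partition):
    """Local striping: each partition gets its own, non-overlapping disks."""
    layout = {}
    next_disk = 0
    for name in partitions:
        layout[name] = set(range(next_disk, next_disk + disks_per_partition))
        next_disk += disks_per_partition
    return layout

def global_layout(partitions, total_disks):
    """Global striping: every partition is spread over all disks."""
    return {name: set(range(total_disks)) for name in partitions}

def affected_partitions(layout, failed_disk):
    """Partitions that lose data if one unmirrored disk fails."""
    return sorted(name for name, disks in layout.items() if failed_disk in disks)

parts = ["q1", "q2", "q3", "q4"]
local = local_layout(parts, disks_per_partition=2)   # 8 disks in total
glob = global_layout(parts, total_disks=8)

print(affected_partitions(local, failed_disk=3))   # only one partition
print(affected_partitions(glob, failed_disk=3))    # every partition
print(len(local["q1"]), len(glob["q1"]))           # disks serving a scan of q1
```

With the same eight disks, a failure touches one partition under local striping and all four under global striping, while a scan pruned to a single partition is served by two disks locally and by eight globally.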

Analyzing Striping

Two considerations arise when analyzing striping issues for your applications. First, consider the cardinality of the relationships among the objects in a storage system. Second, consider what you can optimize in your striping effort: full table scans, general tablespace availability, partition scans, or some combinations of these goals. Cardinality and optimization are discussed in the following section.

Cardinality of Storage Object Relationships

To analyze striping, consider the relationships illustrated in Figure 4-4.

Figure 4-4 Cardinality of Relationships

Text description of the illustration dwhsg098.gif

Figure 4-4 shows the cardinality of the relationships among objects in a typical Oracle storage system. For every table there may be:

p partitions, shown in Figure 4-4 as a one-to-many relationship
s partitions for every tablespace, shown in Figure 4-4 as a many-to-one relationship
f files for every tablespace, shown in Figure 4-4 as a one-to-many relationship
m files to n devices, shown in Figure 4-4 as a many-to-many relationship

Striping Goals

You can stripe an object across devices to achieve one of three goals:

Goal 1: To optimize full table scans, place a table on many devices.
Goal 2: To optimize availability, restrict the tablespace to a few devices.
Goal 3: To optimize partition scans, achieve intra-partition parallelism by placing each partition on many devices.

To attain both Goals 1 and 2 (having the table reside on many devices, with the highest possible availability), maximize the number of partitions p and minimize the number of partitions for each tablespace s.

To maximize Goal 1 but with minimal intra-partition parallelism, place each partition in its own tablespace. Do not use striped files, and use one file for each tablespace.

To minimize the formula for Goal 2 and thereby maximize availability, set f and n equal to 1. Conversely, when you minimize availability, you maximize intra-partition parallelism. Goal 3 conflicts with Goal 2 because you cannot simultaneously maximize the formula for Goal 3 and minimize the formula for Goal 2. You must compromise to achieve some of the benefits of both goals.

Striping Goal 1: Optimize Full Table Scans

Having a table reside on many devices ensures scalable full table scans. To calculate the optimal number of devices for each table, use this formula:

Text description of the illustration dwhsg033.gif

You can do this by having t partitions, with every partition in its own tablespace, if every tablespace has one file, and these files are not striped.

Text description of the illustration dwhsg068.gif

If the table is not partitioned, but is in one tablespace in one file, stripe it over n devices.

Text description of the illustration dwhsg069.gif

There are a maximum of t partitions, every partition in its own tablespace, f files in each tablespace, each tablespace on a striped device:

Text description of the illustration dwhsg070.gif

Striping Goal 2: Optimize Availability

Restricting each tablespace to a small number of devices and having as many partitions as possible helps you achieve high availability.

Text description of the illustration dwhsg066.gif

Availability is maximized when f = n = m = 1 and p is much greater than 1.

Striping Goal 3: Optimize Partition Scans

Achieving intra-partition parallelism is advantageous because partition scans are scalable. To do this, place each partition on many devices.

Text description of the illustration dwhsg067.gif

Partitions can reside in a tablespace that can have many files. You can have either a striped file or many files for each tablespace.

RAID Configurations

RAID systems, also called disk arrays, can be hardware- or software-based systems. The difference between the two is how CPU processing of I/O requests is handled. In software-based RAID systems, the operating system or an application level handles the I/O request, while in hardware-based RAID systems, disk controllers handle I/O requests.

RAID usage is transparent to Oracle. All the features specific to a given RAID configuration are handled by the operating system and Oracle does not need to worry about them.

Primary logical database structures have different access patterns during read and write operations. Therefore, different RAID implementations will be better suited for these structures. The purpose of this chapter is to discuss some of the basic decisions you must make when designing the physical layout of your data warehouse implementation. It is not meant as a replacement for operating system and storage documentation or a consultant's analysis of your I/O requirements.

See Also: Oracle9i Database Performance Tuning Guide and Reference for more information regarding RAID

There are advantages and disadvantages to using RAID, and those depend on the RAID level under consideration and the specific system in question. The most common configurations in data warehouses are:

RAID 0 (Striping)
RAID 1 (Mirroring)
RAID 0+1 (Striping and Mirroring)
RAID 5

RAID 0 (Striping)

RAID 0 is a non-redundant disk array, so there will be data loss with any disk failure. If something on the disk becomes corrupted, you cannot restore or recalculate that data. RAID 0 provides the best write throughput performance because it never updates redundant information. Read throughput is also quite good, but you can improve it by combining RAID 0 with RAID 1.

Oracle does not recommend using RAID 0 systems without RAID 1 because the loss of one disk in the array will affect the complete system and make it unavailable. RAID 0 systems are used mainly in environments where performance and capacity are the primary concerns rather than availability.

RAID 1 (Mirroring)

RAID 1 provides full data redundancy by complete mirroring of all files. If a disk failure occurs, the mirrored copy is used to transparently service the request. RAID 1 mirroring requires twice as much disk space as there is data. In general, RAID 1 is most useful for systems where complete redundancy of data is required and disk space is not an issue. For large datafiles or systems with less disk space, RAID 1 may not be feasible, because it requires twice as much disk space as there is data. Writes under RAID 1 are no faster and no slower than usual. Reading data can be faster than on a single disk because the system can choose to read the data from the disk that can respond faster.

RAID 0+1 (Striping and Mirroring)

RAID 0+1 offers the best performance of all RAID systems, but costs the most because you double the number of drives. Basically, it combines the performance of RAID 0 and the fault tolerance of RAID 1. You should consider RAID 0+1 for datafiles with high write rates, for example, table datafiles, and online and archived redo log files.

Striping, Mirroring, and Media Recovery

Striping affects media recovery. Loss of a disk usually means loss of access to all objects stored on that disk. If all datafiles in a database are striped over all disks, then loss of any disk stops the entire database. Furthermore, you may need to restore all these database files from backups, even if each file has only a small fraction of its total data stored on the failed disk.

Often, the same system that provides striping also provides mirroring. With the declining price of disks, mirroring can provide an effective supplement to, but not a substitute for, backups and log archives. Mirroring can help your system recover from disk failures more quickly than using a backup, but mirroring is not as robust. Mirroring does not protect against software faults and other problems against which an independent backup would protect your system.

You can effectively use mirroring if you are able to reload read-only data from the original source tapes. If you have a disk failure, restoring data from backups can involve lengthy downtime, whereas restoring from a mirrored disk enables your system to get back online quickly or even stay online while the crashed disk is replaced and resynchronized.

RAID 5

RAID 5 systems provide redundancy for the original data while storing parity information as well. The parity information is striped over all disks in the system to avoid a single disk as a bottleneck during write operations. The I/O throughput of RAID 5 systems depends upon the implementation and the striping size. For a typical RAID 5 system, the throughput is normally lower than RAID 0+1 configurations. In particular, the performance for high concurrent write operations such as parallel load can be poor.

Many vendors use memory (as battery-backed cache) in front of the disks to increase throughput and to become comparable to RAID 0+1. Contact your disk array vendor for specific details.
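The disk space cost of the configurations described above can be compared with a short calculation. The Python sketch below is an illustration, not vendor guidance: the function name and sample sizes are assumptions, and the RAID 5 figure uses the usual definition in which parity consumes the equivalent of one disk.

```python
def usable_capacity(level, disks, disk_size_gb):
    """Usable capacity in GB for the RAID levels discussed in this section."""
    if level == "0":
        return disks * disk_size_gb            # striping only, no redundancy
    if level in ("1", "0+1"):
        if disks % 2:
            raise ValueError("mirroring needs an even number of disks")
        return disks // 2 * disk_size_gb       # every block is stored twice
    if level == "5":
        if disks < 3:
            raise ValueError("RAID 5 needs at least three disks")
        return (disks - 1) * disk_size_gb      # one disk's worth of parity
    raise ValueError("unsupported RAID level: " + level)

for level in ("0", "0+1", "5"):
    print(level, usable_capacity(level, disks=8, disk_size_gb=100))
```

For eight 100 GB disks, the sketch reports the full 800 GB for RAID 0, half of that for RAID 0+1, and 700 GB for RAID 5, which is why RAID 5 is often considered when disk space is a constraint despite its lower write throughput.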

The Importance of Specific Analysis

A data warehouse's requirements are at many levels, and resolving a problem at one level can cause problems with another. For example, resolving a problem with query performance during the ETL process can affect load performance. You cannot simply maximize query performance at the expense of an unrealistic load time. If you do, your implementation will fail. In addition, a particular process is dependent upon the warehouse's architecture. If you decide to change something in your system, it can cause performance to become unacceptable in another part of the warehousing process. An example of this is switching from using database files to flat files during the loading process. Flat files can have different read performance.

This chapter is not meant as a replacement for operating system and storage documentation. Your system's requirements will require detailed analysis prior to implementation. Only a detailed data warehouse architecture and I/O analysis will help you when deciding hardware and I/O strategies.

See Also: Oracle9i Database Performance Tuning Guide and Reference for details regarding how to analyze I/O requirements


5 Parallelism and Partitioning in Data Warehouses

Data warehouses often contain large tables and require techniques both for managing these large tables and for providing good query performance across these large tables. This chapter discusses two key methodologies for addressing these needs: parallelism and partitioning.

These topics are discussed:

Overview of Parallel Execution
Granules of Parallelism
Partitioning Design Considerations
Miscellaneous Partition Operations

Note: Parallel execution is available only with the Oracle9i Enterprise Edition.

Overview of Parallel Execution

Parallel execution dramatically reduces response time for data-intensive operations on large databases typically associated with decision support systems (DSS) and data warehouses. You can also implement parallel execution on certain types of online transaction processing (OLTP) and hybrid systems. Parallel execution is sometimes called parallelism. Simply expressed, parallelism is the idea of breaking down a task so that, instead of one process doing all of the work in a query, many processes do part of the work at the same time. An example of this is when four processes handle four different quarters in a year instead of one process handling all four quarters by itself. The improvement in performance can be quite high. In this case, each quarter will be a partition, a smaller and more manageable unit of an index or table.

See Also: Oracle9i Database Concepts for further conceptual information regarding parallel execution
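The four-quarters example can be sketched outside the database. The following Python code only illustrates the idea of splitting one aggregate into independent pieces: the sales figures are made up, the names are hypothetical, and worker threads stand in for parallel execution servers.

```python
from concurrent.futures import ThreadPoolExecutor

# Made-up sales amounts; each quarter plays the role of one partition.
SALES_BY_QUARTER = {
    "Q1": [120, 80, 45],
    "Q2": [200, 10],
    "Q3": [75, 75, 75],
    "Q4": [300],
}

def total_for_quarter(quarter):
    """Work done by one worker: aggregate a single quarter."""
    return sum(SALES_BY_QUARTER[quarter])

def parallel_total(workers=4):
    """Run one task per quarter, then combine the partial results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(total_for_quarter, SALES_BY_QUARTER))
    return sum(partial_sums)

print(parallel_total())
```

Each worker touches only its own quarter, so no coordination is needed until the final step that adds the four partial sums together.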

When to Implement Parallel Execution

The most common use of parallel execution is in DSS and data warehousing environments. Complex queries, such as those involving joins of several tables or searches of very large tables, are often best executed in parallel.

Parallel execution is useful for many types of operations that access significant amounts of data. Parallel execution improves processing for:

Large table scans and joins
Creation of large indexes
Partitioned index scans
Bulk inserts, updates, and deletes
Aggregations and copying

You can also use parallel execution to access object types within an Oracle database. For example, use parallel execution to access LOBs (large objects).

Parallel execution benefits systems that have all of the following characteristics:

Symmetric multi-processors (SMP), clusters, or massively parallel systems
Sufficient I/O bandwidth
Underutilized or intermittently used CPUs (for example, systems where CPU usage is typically less than 30%)
Sufficient memory to support additional memory-intensive processes such as sorts, hashing, and I/O buffers

If your system lacks any of these characteristics, parallel execution might not significantly improve performance. In fact, parallel execution can reduce system performance on overutilized systems or systems with small I/O bandwidth.

See Also: Chapter 21, "Using Parallel Execution" for further information regarding parallel execution requirements

Granules of Parallelism

Different parallel operations use different types of parallelism. The optimal physical database layout depends on the parallel operations that are most prevalent in your application, or even on the necessity of using partitions.

The basic unit of work in parallelism is called a granule. Oracle divides the operation being parallelized (for example, a table scan, table update, or index creation) into granules. Parallel execution processes execute the operation one granule at a time. The number of granules and their size correlates with the degree of parallelism (DOP). It also affects how well the work is balanced across query server processes. You cannot enforce a specific granule strategy, because Oracle makes this decision internally.

Block Range Granules

Block range granules are the basic unit of most parallel operations, even on partitioned tables. Therefore, from an Oracle perspective, the degree of parallelism is not related to the number of partitions.

Block range granules are ranges of physical blocks from a table. The number and the size of the granules are computed during runtime by Oracle to optimize and balance the work distribution for all affected parallel execution servers. The number and size of granules are dependent upon the size of the object and the DOP. Block range granules do not depend on static preallocation of tables or indexes. During the computation of the granules, Oracle takes the DOP into account and tries to assign granules from different datafiles to each of the parallel execution servers to avoid contention whenever possible. Additionally, Oracle considers the disk affinity of the granules on MPP systems to take advantage of the physical proximity between parallel execution servers and disks.

When block range granules are used predominantly for parallel access to a table or index, administrative considerations (such as recovery or using partitions for deleting portions of data) might influence partition layout more than performance considerations.

Partition Granules

When Oracle uses partition granules, a query server process works on an entire partition or subpartition of a table or index. Because partition granules are statically determined by the structure of the table or index when a table or index is created, partition granules do not give you the flexibility in parallelizing an operation that block granules do. The maximum allowable DOP is the number of partitions. This might limit the utilization of the system and the load balancing across parallel execution servers.

When Oracle uses partition granules for parallel access to a table or index, you should use a relatively large number of partitions (ideally, three times the DOP), so that Oracle can effectively balance work across the query server processes.

Partition granules are the basic unit of parallel index range scans and of parallel operations that modify multiple partitions of a partitioned table or index. These operations include parallel creation of partitioned indexes, and parallel creation of partitioned tables.

See Also: Oracle9i Database Concepts for information on disk striping and partitioning
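The reason for recommending more partitions than the DOP can be seen in a simple scheduling model. The Python sketch below is an assumption-laden illustration: it hands whole partitions to the least-loaded server in a greedy fashion, which is not a description of how Oracle assigns granules internally, and the partition sizes are invented.

```python
import heapq

def assign_partitions(partition_sizes, dop):
    """Greedily assign whole partitions (one granule each) to query servers.

    The largest remaining partition always goes to the least-loaded server.
    Returns the sorted per-server load.
    """
    if dop < 1 or not partition_sizes:
        raise ValueError("need a positive DOP and at least one partition")
    servers = min(dop, len(partition_sizes))   # DOP cannot exceed partition count
    loads = [(0, i) for i in range(servers)]
    heapq.heapify(loads)
    for size in sorted(partition_sizes, reverse=True):
        load, i = heapq.heappop(loads)
        heapq.heappush(loads, (load + size, i))
    return sorted(load for load, _ in loads)

def imbalance(loads):
    """Busiest server's load divided by the average load (1.0 is perfect)."""
    return max(loads) / (sum(loads) / len(loads))

few = assign_partitions([10, 7, 5, 2], dop=4)                            # 4 partitions
many = assign_partitions([4, 3, 3, 3, 2, 2, 2, 1, 1, 1, 1, 1], dop=4)    # 12 partitions
print(few, imbalance(few))
print(many, imbalance(many))
```

Both calls distribute the same total amount of data over four servers. With one partition for each server, the slowest server determines the elapsed time; with three times as many partitions, the smaller pieces can be combined into nearly equal loads.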

Partitioning Design Considerations

In conjunction with parallel execution, partitioning can improve performance in data warehouses. The following are the main design considerations for partitioning:

Types of Partitioning
Partition Pruning
Partition-Wise Joins

Types of Partitioning

This section describes the partitioning features that significantly enhance data access and improve overall application performance. This is especially true for applications that access tables and indexes with millions of rows and many gigabytes of data.

Partitioned tables and indexes facilitate administrative operations by enabling these operations to work on subsets of data. For example, you can add a new partition, organize an existing partition, or drop a partition and cause less than a second of interruption to a read-only application.

Using the partitioning methods described in this section can help you tune SQL statements to avoid unnecessary index and table scans (using partition pruning). You can also improve the performance of massive join operations when large amounts of data (for example, several million rows) are joined together by using partition-wise joins. Finally, partitioning data greatly improves manageability of very large databases and dramatically reduces the time required for administrative tasks such as backup and restore.

Granularity can be easily added to or removed from the partitioning scheme by splitting partitions. Thus, if a table's data is skewed to fill some partitions more than others, the ones that contain more data can be split to achieve a more even distribution. Partitioning also allows one to swap partitions with a table. By being able to easily add, remove, or swap a large amount of data quickly, swapping can be used to keep a large amount of data that is being loaded inaccessible until loading is completed, or can be used as a way to stage data between different phases of use. Some examples are current day's transactions or online archives.

See Also: Oracle9i Database Concepts for an introduction to the ideas behind partitioning

Partitioning Methods

Oracle offers four partitioning methods:

Range Partitioning
Hash Partitioning
List Partitioning
Composite Partitioning

Each partitioning method has different advantages and design considerations. Thus, each method is more appropriate for a particular situation.

Range Partitioning

Range partitioning maps data to partitions based on ranges of partition key values that you establish for each partition. It is the most common type of partitioning and is often used with dates. For example, you might want to partition sales data into monthly partitions.

Range partitioning maps rows to partitions based on ranges of column values. Range partitioning is defined by the partitioning specification for a table or index in PARTITION BY RANGE (column_list) and by the partitioning specifications for each individual partition in VALUES LESS THAN (value_list), where column_list is an ordered list of columns that determines the partition to which a row or an index entry belongs. These columns are called the partitioning columns. The values in the partitioning columns of a particular row constitute that row's partitioning key.

value_list is an ordered list of values for the columns in the column list. Each value must be either a literal or a TO_DATE or RPAD function with constant arguments. Only the VALUES LESS THAN clause is allowed. This clause specifies a non-inclusive upper bound for the partitions. All partitions, except the first, have an implicit low value specified by the VALUES LESS THAN literal on the previous partition. Any binary values of the partition key equal to or higher than this literal are added to the next higher partition. The highest partition is the one for which the MAXVALUE literal is defined. The keyword MAXVALUE represents a virtual infinite value that sorts higher than any other value for the data type, including the null value.

The following statement creates a table sales_range that is range partitioned on the sales_date field:

CREATE TABLE sales_range
(salesman_id   NUMBER(5),
 salesman_name VARCHAR2(30),
 sales_amount  NUMBER(10),
 sales_date    DATE)
COMPRESS
PARTITION BY RANGE(sales_date)
(
 PARTITION sales_jan2000 VALUES LESS THAN(TO_DATE('02/01/2000','MM/DD/YYYY')),
 PARTITION sales_feb2000 VALUES LESS THAN(TO_DATE('03/01/2000','MM/DD/YYYY')),
 PARTITION sales_mar2000 VALUES LESS THAN(TO_DATE('04/01/2000','MM/DD/YYYY')),
 PARTITION sales_apr2000 VALUES LESS THAN(TO_DATE('05/01/2000','MM/DD/YYYY'))
);

Note: This table was created with the COMPRESS keyword, thus all partitions inherit this attribute.

See Also: Oracle9i SQL Reference for partitioning syntax and the Oracle9i Database Administrator's Guide for more examples
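The VALUES LESS THAN rule amounts to a search over the ordered list of upper bounds. The Python sketch below models that lookup for the sales_range table; it assumes monthly bounds on the first day of the following month, and the function name is hypothetical. It illustrates the mapping rule only and is not Oracle's implementation.

```python
from bisect import bisect_right
from datetime import date

# (partition name, non-inclusive upper bound), in ascending bound order.
PARTITIONS = [
    ("sales_jan2000", date(2000, 2, 1)),
    ("sales_feb2000", date(2000, 3, 1)),
    ("sales_mar2000", date(2000, 4, 1)),
    ("sales_apr2000", date(2000, 5, 1)),
]

def partition_for(sales_date, partitions=PARTITIONS):
    """Return the name of the partition whose range contains sales_date."""
    bounds = [upper for _, upper in partitions]
    # bisect_right skips every bound <= key, so a key equal to a bound
    # lands in the next higher partition (the bound is non-inclusive).
    index = bisect_right(bounds, sales_date)
    if index == len(partitions):
        raise ValueError("key is beyond the highest partition bound")
    return partitions[index][0]

print(partition_for(date(2000, 1, 15)))   # inside the first range
print(partition_for(date(2000, 2, 1)))    # equal to a bound: next partition
```

Because no partition in this table is bounded by MAXVALUE, a date on or after May 1, 2000 has no home and the sketch raises an error, which mirrors the need for a MAXVALUE partition when keys can exceed the last bound.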

Hash Partitioning

Hash partitioning maps data to partitions based on a hashing algorithm that Oracle applies to a partitioning key that you identify. The hashing algorithm evenly distributes rows among partitions, giving partitions approximately the same size. Hash partitioning is the ideal method for distributing data evenly across devices. Hash partitioning is a good and easy-to-use alternative to range partitioning when data is not historical and there is no obvious column or column list where logical range partition pruning can be advantageous.

Oracle uses a linear hashing algorithm and, to prevent data from clustering within specific partitions, you should define the number of partitions by a power of two (for example, 2, 4, 8).

The following statement creates a table sales_hash, which is hash partitioned on the salesman_id field:

CREATE TABLE sales_hash
(salesman_id   NUMBER(5),
 salesman_name VARCHAR2(30),
 sales_amount  NUMBER(10),
 week_no       NUMBER(2))
PARTITION BY HASH(salesman_id)
PARTITIONS 4;

See Also: Oracle9i SQL Reference for partitioning syntax and the Oracle9i Database Administrator's Guide for more examples

Note: You cannot define alternate hashing algorithms for partitions.
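The power-of-two recommendation follows from how linear hashing addresses buckets. The Python sketch below uses a textbook linear-hashing address rule and treats the key itself as the hash value; both are assumptions made for illustration, because this guide does not document Oracle's internal hash function.

```python
from collections import Counter

def linear_hash_bucket(hash_value, partitions):
    """Map a hash value to a partition using a textbook linear-hashing rule."""
    power = 1
    while power < partitions:      # smallest power of two >= partitions
        power *= 2
    bucket = hash_value % power
    if bucket >= partitions:       # that bucket does not exist yet: fold back
        bucket -= power // 2
    return bucket

def distribution(partitions, keys=range(8000)):
    """Rows per partition when the key itself is used as the hash value."""
    counts = Counter(linear_hash_bucket(k, partitions) for k in keys)
    return [counts[p] for p in range(partitions)]

print(distribution(8))   # power of two: even spread
print(distribution(6))   # not a power of two: two partitions hold double
```

With six partitions, the hash values that would address partitions 6 and 7 fold back onto partitions 2 and 3, so those two partitions receive twice as many rows as the others.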

List Partitioning
List partitioning enables you to explicitly control how rows map to partitions. You do this by specifying a list of discrete values for the partitioning column in the description for each partition. This is different from range partitioning, where a range of values is associated with a partition, and from hash partitioning, where you have no control of the row-to-partition mapping. The advantage of list partitioning is that you can group and organize unordered and unrelated sets of data in a natural way. The following example creates a list partitioned table grouping states according to their sales regions:

CREATE TABLE sales_list
(salesman_id   NUMBER(5),
 salesman_name VARCHAR2(30),
 sales_state   VARCHAR2(20),
 sales_amount  NUMBER(10),
 sales_date    DATE)
PARTITION BY LIST(sales_state)
(
 PARTITION sales_west VALUES('California', 'Hawaii') COMPRESS,
 PARTITION sales_east VALUES('New York', 'Virginia', 'Florida'),
 PARTITION sales_central VALUES('Texas', 'Illinois')
);

Partition sales_west is furthermore created as a single compressed partition within sales_list. For details about partitioning and compression, see "Partitioning and Data Segment Compression".

An additional capability with list partitioning is that you can use a default partition, so that all rows that do not map to any other partition do not generate an error. For example, modifying the previous example, you can create a default partition as follows:

CREATE TABLE sales_list
(salesman_id   NUMBER(5),
 salesman_name VARCHAR2(30),
 sales_state   VARCHAR2(20),
 sales_amount  NUMBER(10),
 sales_date    DATE)
PARTITION BY LIST(sales_state)
(
 PARTITION sales_west VALUES('California', 'Hawaii'),
 PARTITION sales_east VALUES('New York', 'Virginia', 'Florida'),
 PARTITION sales_central VALUES('Texas', 'Illinois'),
 PARTITION sales_other VALUES(DEFAULT)
);

See Also: Oracle9i SQL Reference for partitioning syntax, "Partitioning and Data Segment Compression" for information regarding data segment compression, and the Oracle9i Database Administrator's Guide for more examples
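The role of the default partition can be modeled as a lookup with a fallback. The Python sketch below reuses the partition names and state values from the sales_list examples; the function name and the error behavior are assumptions chosen to mirror the description above.

```python
# Discrete values accepted by each partition of sales_list.
PARTITION_VALUES = {
    "sales_west": {"California", "Hawaii"},
    "sales_east": {"New York", "Virginia", "Florida"},
    "sales_central": {"Texas", "Illinois"},
}

def list_partition_for(state, default_partition=None):
    """Return the partition for a state, or the default partition if given."""
    for name, values in PARTITION_VALUES.items():
        if state in values:
            return name
    if default_partition is None:
        # Without a default partition, an unmapped value is rejected.
        raise ValueError(f"no partition accepts {state!r}")
    return default_partition

print(list_partition_for("Hawaii"))
print(list_partition_for("Oregon", default_partition="sales_other"))
```

The first table definition corresponds to calling the function without a default, where an unlisted state is an error; the second corresponds to passing sales_other, which accepts every value that no other partition lists.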

Composite Partitioning

Composite partitioning combines range and hash or list partitioning. Oracle first distributes data into partitions according to boundaries established by the partition ranges. Then, for range-hash partitioning, Oracle uses a hashing algorithm to further divide the data into subpartitions within each range partition. For range-list partitioning, Oracle divides the data into subpartitions within each range partition based on the explicit list you chose.

Index Partitioning

You can choose whether or not to inherit the partitioning strategy of the underlying tables. You can create both local and global indexes on a table partitioned by range, hash, or composite methods. Local indexes inherit the partitioning attributes of their related tables. For example, if you create a local index on a composite table, Oracle automatically partitions the local index using the composite method.

Oracle supports only range partitioning for global partitioned indexes. You cannot partition global indexes using the hash or composite partitioning methods.

See Also: Chapter 6, "Indexes"

Performance Issues for Range, List, Hash, and Composite Partitioning

This section describes performance issues for:

When to Use Range Partitioning
When to Use Hash Partitioning
When to Use List Partitioning
When to Use Composite Range-Hash Partitioning
When to Use Composite Range-List Partitioning

When to Use Range Partitioning

Range partitioning is a convenient method for partitioning historical data. The boundaries of range partitions define the ordering of the partitions in the tables or indexes.

Range partitioning organizes data by time intervals on a column of type DATE. Thus, most SQL statements accessing range partitions focus on timeframes. An example of this is a SQL statement similar to "select data from a particular period in time." In such a scenario, if each partition represents data for one month, the query "find data of month 98-DEC" needs to access only the December partition of year 98. This reduces the amount of data scanned to a fraction of the total data available, an optimization method called partition pruning.

Range partitioning is also ideal when you periodically load new data and purge old data. It is easy to add or drop partitions.

It is common to keep a rolling window of data, for example keeping the past 36 months' worth of data online. Range partitioning simplifies this process. To add data from a new month, you load it into a separate table, clean it, index it, and then add it to the range-partitioned table using the EXCHANGE PARTITION statement, all while the original table remains online. Once you add the new partition, you can drop the trailing month with the DROP PARTITION statement. The alternative to using the DROP PARTITION statement can be to archive the partition and make it read only, but this works only when your partitions are in separate tablespaces.

In conclusion, consider using range partitioning when:

Very large tables are frequently scanned by a range predicate on a good partitioning column, such as ORDER_DATE or PURCHASE_DATE. Partitioning the table on that column enables partition pruning.

You want to maintain a rolling window of data.

You cannot complete administrative operations, such as backup and restore, on large tables in an allotted time frame, but you can divide them into smaller logical pieces based on the partition range column.

The followin! example creates the table sales for a period of two *ears/ 0111 and A888/ and partitions it b* ran!e accordin! to the col mn s_salesdate to separate the data into ei!ht 4 arters/ each correspondin! to a partition"
CREATE TABLE sales
  (s_productid  NUMBER,
   s_saledate   DATE,
   s_custid     NUMBER,
   s_totalprice NUMBER)
PARTITION BY RANGE(s_saledate)
 (PARTITION sal99q1 VALUES LESS THAN (TO_DATE('01-APR-1999', 'DD-MON-YYYY')),
  PARTITION sal99q2 VALUES LESS THAN (TO_DATE('01-JUL-1999', 'DD-MON-YYYY')),
  PARTITION sal99q3 VALUES LESS THAN (TO_DATE('01-OCT-1999', 'DD-MON-YYYY')),
  PARTITION sal99q4 VALUES LESS THAN (TO_DATE('01-JAN-2000', 'DD-MON-YYYY')),
  PARTITION sal00q1 VALUES LESS THAN (TO_DATE('01-APR-2000', 'DD-MON-YYYY')),
  PARTITION sal00q2 VALUES LESS THAN (TO_DATE('01-JUL-2000', 'DD-MON-YYYY')),
  PARTITION sal00q3 VALUES LESS THAN (TO_DATE('01-OCT-2000', 'DD-MON-YYYY')),
  PARTITION sal00q4 VALUES LESS THAN (TO_DATE('01-JAN-2001', 'DD-MON-YYYY')));
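
As a rough illustration of the rolling window described earlier, the following sketch adds a new quarter to the sales table and then drops the oldest one. The partition name sal01q1 and the staging table sales_stage_01q1 are assumptions made for this sketch; the staging table is assumed to have the same column layout as sales and to be already loaded and cleaned.

```sql
-- Create an empty partition for the new quarter (hypothetical name sal01q1)
ALTER TABLE sales ADD PARTITION sal01q1
  VALUES LESS THAN (TO_DATE('01-APR-2001', 'DD-MON-YYYY'));

-- Swap the prepared staging table (hypothetical name) into the new partition
ALTER TABLE sales EXCHANGE PARTITION sal01q1 WITH TABLE sales_stage_01q1;

-- Drop the trailing quarter to keep the window at a fixed length
ALTER TABLE sales DROP PARTITION sal99q1;
```

EXCHANGE PARTITION also accepts clauses such as INCLUDING INDEXES and WITHOUT VALIDATION; whether they are appropriate depends on how the staging table was indexed and checked.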

When to Use Hash Partitioning


The way Oracle distributes data in hash partitions does not correspond to a business or a logical view of the data, as it does in range partitioning. Consequently, hash partitioning is not an effective way to manage historical data. However, hash partitions share some performance characteristics with range partitions. For example, partition pruning is available, although it is limited to equality and IN-list predicates. You can also use partition-wise joins, parallel index access, and parallel DML.

See Also: "Partition-Wise Joins"

As a general rule, use hash partitioning for these purposes:

To improve the availability and manageability of large tables or to enable parallel DML in tables that do not store historical data.
To avoid data skew among partitions. Hash partitioning is an effective means of distributing data because Oracle hashes the data into a number of partitions, each of which can reside on a separate device. Thus, data is evenly spread over a sufficient number of devices to maximize I/O throughput. Similarly, you can use hash partitioning to distribute data evenly among the nodes of an MPP platform that uses Oracle Real Application Clusters.
If it is important to use partition pruning and partition-wise joins according to a partitioning key that is mostly constrained by a distinct value or value list.

Note: In hash partitioning, partition pruning uses only equality or IN-list predicates.

If you add or merge a hashed partition, Oracle automatically rearranges the rows to reflect the change in the number of partitions and subpartitions. The hash function that Oracle uses is especially designed to limit the cost of this reorganization. Instead of reshuffling all the rows in the table, Oracle uses an "add partition" logic that splits one and only one of the existing hashed partitions. Conversely, Oracle coalesces a partition by merging two existing hashed partitions.

Although the hash function's use of "add partition" logic dramatically improves the manageability of hash partitioned tables, it means that the hash function can cause a skew if the number of partitions of a hash partitioned table, or the number of subpartitions in each partition of a composite table, is not a power of two. In the worst case, the largest partition can be twice the size of the smallest. So for optimal performance, create a number of partitions and subpartitions for each partition that is a power of two. For example, 2, 4, 8, 16, 32, 64, 128, and so on.

The following example creates four hashed partitions for the table sales_hash using the column s_productid as the partition key:
CREATE TABLE sales_hash
  (s_productid  NUMBER,
   s_saledate   DATE,
   s_custid     NUMBER,
   s_totalprice NUMBER)
PARTITION BY HASH(s_productid)
PARTITIONS 4;

Specify partition names if you want to choose the names of the partitions. Otherwise, Oracle automatically generates internal names for the partitions. Also, you can use the STORE IN clause to assign hash partitions to tablespaces in a round-robin manner.

See Also: Oracle9i SQL Reference for partitioning syntax and the Oracle9i Database Administrator's Guide for more examples
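
The two options just mentioned can be sketched as follows. This is only an illustration: the table names sales_hash_named and sales_hash_rr, the partition names p1 through p4, and the tablespaces ts1 through ts4 are hypothetical and are assumed to exist or be free to use.

```sql
-- Option 1: name each hash partition and place it explicitly
CREATE TABLE sales_hash_named
  (s_productid  NUMBER,
   s_saledate   DATE,
   s_custid     NUMBER,
   s_totalprice NUMBER)
PARTITION BY HASH (s_productid)
  (PARTITION p1 TABLESPACE ts1,
   PARTITION p2 TABLESPACE ts2,
   PARTITION p3 TABLESPACE ts3,
   PARTITION p4 TABLESPACE ts4);

-- Option 2: let Oracle name the partitions and assign them
-- to the listed tablespaces in a round-robin manner
CREATE TABLE sales_hash_rr
  (s_productid  NUMBER,
   s_saledate   DATE,
   s_custid     NUMBER,
   s_totalprice NUMBER)
PARTITION BY HASH (s_productid)
PARTITIONS 4 STORE IN (ts1, ts2);
```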

When to Use List Partitioning


You should use list partitioning when you want to specifically map rows to partitions based on discrete values. Unlike range and hash partitioning, multi-column partition keys are not supported for list partitioning. If a table is partitioned by list, the partitioning key can only consist of a single column of the table.
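
A minimal sketch of a list-partitioned table follows. The table name sales_by_state, the column s_state, and the partition names are assumptions made for this illustration; the point is only that each partition is defined by an explicit list of discrete values on a single column.

```sql
CREATE TABLE sales_by_state
  (s_productid  NUMBER,
   s_saledate   DATE,
   s_custid     NUMBER,
   s_totalprice NUMBER,
   s_state      VARCHAR2(2))
PARTITION BY LIST (s_state)          -- single-column partitioning key
  (PARTITION region_northwest VALUES ('OR', 'WA'),
   PARTITION region_southwest VALUES ('AZ', 'UT', 'NM'),
   PARTITION region_southeast VALUES ('FL', 'GA'));
```

With this definition, a row whose s_state value is not in any list is rejected, so the lists need to cover every value you expect to load.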

When to Use Composite Range-Hash Partitioning


Composite range-hash partitioning offers the benefits of both range and hash partitioning. With composite range-hash partitioning, Oracle first partitions by range. Then, within each range, Oracle creates subpartitions and distributes data within them using the same hashing algorithm it uses for hash partitioned tables.

Data placed in composite partitions is logically ordered only by the boundaries that define the range level partitions. The partitioning of data within each partition has no logical organization beyond the identity of the partition to which the subpartitions belong. Consequently, tables and local indexes partitioned using the composite range-hash method:

Support historical data at the partition level
Support the use of subpartitions as units of parallelism for parallel operations such as PDML or space management and backup and recovery
Are eligible for partition pruning and partition-wise joins on the range and hash dimensions

Using Composite Range-Hash Partitioning


Use the composite range-hash partitioning method for tables and local indexes if:

Partitions must have a logical meaning to efficiently support historical data
The contents of a partition can be spread across multiple tablespaces, devices, or nodes (of an MPP system)
You require both partition pruning and partition-wise joins even when the pruning and join predicates use different columns of the partitioned table
You require a degree of parallelism that is greater than the number of partitions for backup, recovery, and parallel operations

Most large tables in a data warehouse should use range partitioning. Composite partitioning should be used for very large tables or for data warehouses with a well-defined need for these conditions. When using the composite method, Oracle stores each subpartition on a different segment. Thus, the subpartitions may have properties that differ from the properties of the table or from the partition to which the subpartitions belong.

The following example partitions the table sales_range_hash by range on the column s_saledate to create four partitions that order data by time. Then, within each range partition, the data is further subdivided into 8 subpartitions by hash on the column s_productid:
CREATE TABLE sales_range_hash(
  s_productid  NUMBER,
  s_saledate   DATE,
  s_custid     NUMBER,
  s_totalprice NUMBER)
   PARTITION BY RANGE (s_saledate)
   SUBPARTITION BY HASH (s_productid) SUBPARTITIONS 8
  (PARTITION sal99q1 VALUES LESS THAN (TO_DATE('01-APR-1999', 'DD-MON-YYYY')),
   PARTITION sal99q2 VALUES LESS THAN (TO_DATE('01-JUL-1999', 'DD-MON-YYYY')),
   PARTITION sal99q3 VALUES LESS THAN (TO_DATE('01-OCT-1999', 'DD-MON-YYYY')),
   PARTITION sal99q4 VALUES LESS THAN (TO_DATE('01-JAN-2000', 'DD-MON-YYYY')));

Each hashed subpartition contains sales data for a single quarter, distributed by hash on the product ID. The total number of subpartitions is 4x8 or 32.

In addition to this syntax, you can create subpartitions by using a subpartition template. This makes it easier to name subpartitions and to control the tablespace in which each one is placed. The following statement illustrates this:
CREATE TABLE sales_range_hash(
  s_productid  NUMBER,
  s_saledate   DATE,
  s_custid     NUMBER,
  s_totalprice NUMBER)
   PARTITION BY RANGE (s_saledate)
   SUBPARTITION BY HASH (s_productid)
   SUBPARTITION TEMPLATE(
    SUBPARTITION sp1 TABLESPACE tbs1,
    SUBPARTITION sp2 TABLESPACE tbs2,
    SUBPARTITION sp3 TABLESPACE tbs3,
    SUBPARTITION sp4 TABLESPACE tbs4,
    SUBPARTITION sp5 TABLESPACE tbs5,
    SUBPARTITION sp6 TABLESPACE tbs6,
    SUBPARTITION sp7 TABLESPACE tbs7,
    SUBPARTITION sp8 TABLESPACE tbs8)
  (PARTITION sal99q1 VALUES LESS THAN (TO_DATE('01-APR-1999', 'DD-MON-YYYY')),
   PARTITION sal99q2 VALUES LESS THAN (TO_DATE('01-JUL-1999', 'DD-MON-YYYY')),
   PARTITION sal99q3 VALUES LESS THAN (TO_DATE('01-OCT-1999', 'DD-MON-YYYY')),
   PARTITION sal99q4 VALUES LESS THAN (TO_DATE('01-JAN-2000', 'DD-MON-YYYY')));

In this example, every partition has the same number of subpartitions. A sample mapping for sal99q1 is illustrated in Table 5-1. Similar mappings exist for sal99q2 through sal99q4.

Table 5-1 Subpartition Mapping

Subpartition    Tablespace
sal99q1_sp1     tbs1
sal99q1_sp2     tbs2
sal99q1_sp3     tbs3
sal99q1_sp4     tbs4
sal99q1_sp5     tbs5
sal99q1_sp6     tbs6
sal99q1_sp7     tbs7
sal99q1_sp8     tbs8

See Also: Oracle9i SQL Reference for details regarding syntax and restrictions

When to Use Composite Range-List Partitioning

Composite range-list partitioning offers the benefits of both range and list partitioning. With composite range-list partitioning, Oracle first partitions by range. Then, within each range, Oracle creates subpartitions and distributes data within them to organize sets of data in a natural way as assigned by the list.

Data placed in composite partitions is logically ordered only by the boundaries that define the range level partitions.

Using Composite Range-List Partitioning


Use the composite range-list partitioning method for tables and local indexes if:

Subpartitions have a logical grouping defined by the user
The contents of a partition can be spread across multiple tablespaces, devices, or nodes (of an MPP system)
You require both partition pruning and partition-wise joins even when the pruning and join predicates use different columns of the partitioned table
You require a degree of parallelism that is greater than the number of partitions for backup, recovery, and parallel operations

Most large tables in a data warehouse should use range partitioning. Composite partitioning should be used for very large tables or for data warehouses with a well-defined need for these conditions. When using the composite method, Oracle stores each subpartition on a different segment. Thus, the subpartitions may have properties that differ from the properties of the table or from the partition to which the subpartitions belong.

This statement creates a table quarterly_regional_sales that is range partitioned on the txn_date field and list subpartitioned on state.
CREATE TABLE quarterly_regional_sales
(deptno NUMBER,
 item_no VARCHAR2(20),
 txn_date DATE,
 txn_amount NUMBER,
 state VARCHAR2(2))
PARTITION BY RANGE (txn_date)
SUBPARTITION BY LIST (state)
(
-- One range partition per quarter; each lists its own state subpartitions
PARTITION q1_1999 VALUES LESS THAN(TO_DATE('1-APR-1999','DD-MON-YYYY'))
(SUBPARTITION q1_1999_northwest VALUES ('OR', 'WA'),
 SUBPARTITION q1_1999_southwest VALUES ('AZ', 'UT', 'NM'),
 SUBPARTITION q1_1999_northeast VALUES ('NY', 'VM', 'NJ'),
 SUBPARTITION q1_1999_southeast VALUES ('FL', 'GA'),
 SUBPARTITION q1_1999_northcentral VALUES ('SD', 'WI'),
 SUBPARTITION q1_1999_southcentral VALUES ('OK', 'TX')),
PARTITION q2_1999 VALUES LESS THAN(TO_DATE('1-JUL-1999','DD-MON-YYYY'))
(SUBPARTITION q2_1999_northwest VALUES ('OR', 'WA'),
 SUBPARTITION q2_1999_southwest VALUES ('AZ', 'UT', 'NM'),
 SUBPARTITION q2_1999_northeast VALUES ('NY', 'VM', 'NJ'),
 SUBPARTITION q2_1999_southeast VALUES ('FL', 'GA'),
 SUBPARTITION q2_1999_northcentral VALUES ('SD', 'WI'),
 SUBPARTITION q2_1999_southcentral VALUES ('OK', 'TX')),
PARTITION q3_1999 VALUES LESS THAN (TO_DATE('1-OCT-1999','DD-MON-YYYY'))
(SUBPARTITION q3_1999_northwest VALUES ('OR', 'WA'),
 SUBPARTITION q3_1999_southwest VALUES ('AZ', 'UT', 'NM'),
 SUBPARTITION q3_1999_northeast VALUES ('NY', 'VM', 'NJ'),
 SUBPARTITION q3_1999_southeast VALUES ('FL', 'GA'),
 SUBPARTITION q3_1999_northcentral VALUES ('SD', 'WI'),
 SUBPARTITION q3_1999_southcentral VALUES ('OK', 'TX')),
PARTITION q4_1999 VALUES LESS THAN (TO_DATE('1-JAN-2000','DD-MON-YYYY'))
(SUBPARTITION q4_1999_northwest VALUES ('OR', 'WA'),
 SUBPARTITION q4_1999_southwest VALUES ('AZ', 'UT', 'NM'),
 SUBPARTITION q4_1999_northeast VALUES ('NY', 'VM', 'NJ'),
 SUBPARTITION q4_1999_southeast VALUES ('FL', 'GA'),
 SUBPARTITION q4_1999_northcentral VALUES ('SD', 'WI'),
 SUBPARTITION q4_1999_southcentral VALUES ('OK', 'TX')));

You can create subpartitions in a composite partitioned table using a subpartition template. A subpartition template simplifies the specification of subpartitions by not requiring that a subpartition descriptor be specified for every partition in the table. Instead, you describe subpartitions only once in a template, then apply that subpartition template to every partition in the table. The following statement illustrates an example where you can choose the subpartition name and tablespace locations:
CREATE TABLE quarterly_regional_sales
(deptno NUMBER,
 item_no VARCHAR2(20),
 txn_date DATE,
 txn_amount NUMBER,
 state VARCHAR2(2))
PARTITION BY RANGE (txn_date)
SUBPARTITION BY LIST (state)
SUBPARTITION TEMPLATE(
SUBPARTITION northwest VALUES ('OR', 'WA') TABLESPACE ts1,
SUBPARTITION southwest VALUES ('AZ', 'UT', 'NM') TABLESPACE ts2,
SUBPARTITION northeast VALUES ('NY', 'VM', 'NJ') TABLESPACE ts3,
SUBPARTITION southeast VALUES ('FL', 'GA') TABLESPACE ts4,
SUBPARTITION northcentral VALUES ('SD', 'WI') TABLESPACE ts5,
SUBPARTITION southcentral VALUES ('OK', 'TX') TABLESPACE ts6)
(PARTITION q1_1999 VALUES LESS THAN(TO_DATE('1-APR-1999','DD-MON-YYYY')),
 PARTITION q2_1999 VALUES LESS THAN(TO_DATE('1-JUL-1999','DD-MON-YYYY')),
 PARTITION q3_1999 VALUES LESS THAN(TO_DATE('1-OCT-1999','DD-MON-YYYY')),
 PARTITION q4_1999 VALUES LESS THAN(TO_DATE('1-JAN-2000','DD-MON-YYYY')));

See Also: Oracle9i SQL Reference for details regarding syntax and restrictions

Partitioning and Data Segment Compression


You can compress several partitions or a complete partitioned heap-organized table. You do this by either defining a complete partitioned table as being compressed, or by defining it on a per-partition level. Partitions without a specific declaration inherit the attribute from the table definition or, if nothing is specified on table level, from the tablespace definition.

The decision whether a partition should be compressed or stay uncompressed follows the same rules as for a nonpartitioned table. However, due to the capability of range and composite partitioning to separate data logically into distinct partitions, such a partitioned table is an ideal candidate for compressing parts of the data (partitions) that are mainly read-only. It is, for example, beneficial in all rolling window operations as a kind of intermediate stage before aging out old data. With data segment compression, you can keep more old data online, minimizing the burden of additional storage consumption.

You can also change any existing uncompressed table partition later on, add new compressed and uncompressed partitions, or change the compression attribute as part of any partition maintenance operation that requires data movement, such as MERGE PARTITION, SPLIT PARTITION, or MOVE PARTITION. The partitions can contain data or can be empty.

The access and maintenance of a partially or fully compressed partitioned table are the same as for a fully uncompressed partitioned table. Everything that applies to fully uncompressed partitioned tables is also valid for partially or fully compressed partitioned tables.

See Also: Chapter 3, "Physical Design in Data Warehouses" for a generic discussion of data segment compression, Chapter 14, "Maintaining the Data Warehouse" for a sample rolling window operation with a range-partitioned table, and Oracle9i Database Performance Tuning Guide and Reference for an example of calculating the compression ratio

Data Segment Compression and Bitmap Indexes

If you want to use data segment compression on partitioned tables with bitmap indexes, you need to do the following before you introduce the compression attribute for the first time:

1. Mark bitmap indexes unusable.
2. Set the compression attribute.
3. Rebuild the indexes.
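
The three steps above might look roughly like the following for the range-partitioned sales table shown earlier. This is a sketch under stated assumptions: sales_cust_bix is a hypothetical local bitmap index on sales, and compression is introduced by moving the partition sal99q1.

```sql
-- 1. Mark the (hypothetical) local bitmap index unusable
ALTER INDEX sales_cust_bix UNUSABLE;

-- 2. Introduce the compression attribute, here by moving one partition
ALTER TABLE sales MOVE PARTITION sal99q1 COMPRESS;

-- 3. Rebuild the unusable local index partitions;
--    repeat this statement for every partition of the table
ALTER TABLE sales MODIFY PARTITION sal99q1 REBUILD UNUSABLE LOCAL INDEXES;
ALTER TABLE sales MODIFY PARTITION sal99q2 REBUILD UNUSABLE LOCAL INDEXES;
```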

The first time you make a compressed partition part of an already existing, fully uncompressed partitioned table, you must either drop all existing bitmap indexes or mark them UNUSABLE prior to adding a compressed partition. This must be done irrespective of whether any partition contains any data. It is also independent of the operation that causes one or more compressed partitions to become part of the table. This does not apply to a partitioned table having B-tree indexes only.

This rebuilding of the bitmap index structures is necessary to accommodate the potentially higher number of rows stored for each data block with data segment compression enabled and must be done only for the first time. All subsequent operations, whether they affect compressed or uncompressed partitions, or change the compression attribute, behave identically for uncompressed, partially compressed, or fully compressed partitioned tables.

To avoid the recreation of any bitmap index structure, Oracle recommends creating every partitioned table with at least one compressed partition whenever you plan to partially or fully compress the partitioned table in the future. This compressed partition can stay empty or can even be dropped after the partitioned table is created.

Having a partitioned table with compressed partitions can lead to slightly larger bitmap index structures for the uncompressed partitions. The bitmap index structures for the compressed partitions, however, are in most cases smaller than the corresponding bitmap index structure before data segment compression. This depends heavily on the achieved compression rates.

Note: Oracle will raise an error if compression is introduced to an object for the first time and there are usable bitmap index segments.

Example of Data Segment Compression and Partitioning

The following statement moves and compresses an already existing partition sales_q1_1998 of table sales:
ALTER TABLE sales
MOVE PARTITION sales_q1_1998 TABLESPACE ts_arch_q1_1998 COMPRESS;

If you use the MOVE statement, the local indexes for partition sales_q1_1998 become unusable. You have to rebuild them afterward, as follows:
ALTER TABLE sales
MODIFY PARTITION sales_q1_1998 REBUILD UNUSABLE LOCAL INDEXES;

The following statement merges two existing partitions into a new, compressed partition, residing in a separate tablespace. The local bitmap indexes have to be rebuilt after the merge:
ALTER TABLE sales MERGE PARTITIONS sales_q1_1998, sales_q2_1998
INTO PARTITION sales_1_1998 TABLESPACE ts_arch_1_1998
COMPRESS UPDATE GLOBAL INDEXES;

See Also: Oracle9i Database Performance Tuning Guide and Reference for details regarding how to estimate the compression ratio when using data segment compression

Partition Pruning

Partition pruning is an essential performance feature for data warehouses. In partition pruning, the cost-based optimizer analyzes FROM and WHERE clauses in SQL statements to eliminate unneeded partitions when building the partition access list. This enables Oracle to perform operations only on those partitions that are relevant to the SQL statement. Oracle prunes partitions when you use range, LIKE, equality, and IN-list predicates on the range or list partitioning columns, and when you use equality and IN-list predicates on the hash partitioning columns.

Partition pruning dramatically reduces the amount of data retrieved from disk and shortens processing time, improving query performance and resource utilization. If you partition the index and table on different columns (with a global, partitioned index), partition pruning also eliminates index partitions even when the partitions of the underlying table cannot be eliminated.

On composite partitioned objects, Oracle can prune at both the range partition level and at the hash or list subpartition level using the relevant predicates. Refer to the table sales_range_hash earlier, partitioned by range on the column s_saledate and subpartitioned by hash on column s_productid, and consider the following example:
SELECT * FROM sales_range_hash
WHERE s_saledate BETWEEN (TO_DATE('01-JUL-1999', 'DD-MON-YYYY')) AND
  (TO_DATE('01-OCT-1999', 'DD-MON-YYYY')) AND s_productid = 1200;

Oracle uses the predicate on the partitioning columns to perform partition pruning as follows:

When using range partitioning, Oracle accesses only partitions sal99q3 and sal99q4. The range starts in sal99q3, and its inclusive upper bound, 01-OCT-1999, is the first value stored in sal99q4.
When using hash subpartitioning, Oracle accesses only the one subpartition in each partition that stores the rows with s_productid=1200. The mapping between the subpartition and the predicate is calculated based on Oracle's internal hash distribution function.

Pruning Using DATE Columns

In the earlier partition pruning example, the date value was fully specified as four digits for the year using the TO_DATE function, just as it was in the underlying table's range partitioning description. While this is the recommended format for specifying date values, the optimizer can prune partitions using the predicates on s_saledate when you use other formats, as in the following example:
SELECT * FROM sales_range_hash
WHERE s_saledate BETWEEN TO_DATE('01-JUL-99', 'DD-MON-RR') AND
  TO_DATE('01-OCT-99', 'DD-MON-RR') AND s_productid = 1200;

Although this uses the DD-MON-RR format, which is not the same as the base partition, the optimizer can still prune properly.

If you execute an EXPLAIN PLAN statement on the query, the PARTITION_START and PARTITION_STOP columns of the output table do not specify which partitions Oracle is accessing. Instead, you see the keyword KEY for both columns. The keyword KEY for both columns means that partition pruning occurs at run-time. It can also affect the execution plan, because the information about the pruned partitions is missing compared with the same statement written with the same TO_DATE format as the partitioned table definition.

Avoiding I/O Bottlenecks

To avoid I/O bottlenecks, when Oracle is not scanning all partitions because some have been eliminated by pruning, spread each partition over several devices. On MPP systems, spread those devices over multiple nodes.
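
To look at the PARTITION_START and PARTITION_STOP columns discussed under "Pruning Using DATE Columns", a query along the following lines can be used. This is a minimal sketch that assumes a PLAN_TABLE already exists in the current schema; the statement ID prune_demo is an arbitrary label chosen for this example.

```sql
-- Record the plan for the DD-MON-RR version of the query
EXPLAIN PLAN SET STATEMENT_ID = 'prune_demo' FOR
SELECT * FROM sales_range_hash
WHERE s_saledate BETWEEN TO_DATE('01-JUL-99', 'DD-MON-RR') AND
  TO_DATE('01-OCT-99', 'DD-MON-RR') AND s_productid = 1200;

-- Show which partitions each plan step touches
SELECT id, operation, options, object_name,
       partition_start, partition_stop
FROM plan_table
WHERE statement_id = 'prune_demo'
ORDER BY id;
```

Running the same two statements with the four-digit-year version of the query gives a second plan to compare against.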

Partition-Wise Joins

Partition-wise joins reduce query response time by minimizing the amount of data exchanged among parallel execution servers when joins execute in parallel. This significantly reduces response time and improves the use of both CPU and memory resources. In Oracle Real Application Clusters environments, partition-wise joins also avoid or at least limit the data traffic over the interconnect, which is the key to achieving good scalability for massive join operations.

Partition-wise joins can be full or partial. Oracle decides which type of join to use.

Full Partition-Wise Joins

A full partition-wise join divides a large join into smaller joins between a pair of partitions from the two joined tables. To use this feature, you must equipartition both tables on their join keys. For example, consider a large join between a sales table and a customer table on the column customerid. The query "find the records of all customers who bought more than 100 articles in Quarter 3 of 1999" is a typical example of a SQL statement performing such a join. The following is an example of this:
SELECT c.cust_last_name, COUNT(*)
FROM sales s, customers c
WHERE s.cust_id = c.cust_id
     AND s.time_id BETWEEN TO_DATE('01-JUL-1999', 'DD-MON-YYYY') AND
     (TO_DATE('01-OCT-1999', 'DD-MON-YYYY'))
  GROUP BY c.cust_last_name HAVING
  COUNT(*) > 100;

This large join is typical in data warehousing environments. The entire customer table is joined with one quarter of the sales data. In large data warehouse applications, this might mean joining millions of rows. The join method to use in that case is obviously a hash join. You can reduce the processing time for this hash join even more if both tables are equipartitioned on the customerid column. This enables a full partition-wise join.

When you execute a full partition-wise join in parallel, the granule of parallelism, as described under "Granules of Parallelism", is a partition. As a result, the degree of parallelism is limited to the number of partitions. For example, you require at least 16 partitions to set the degree of parallelism of the query to 16.

You can use various partitioning methods to equipartition both tables on the column customerid with 16 partitions. These methods are described in these subsections.

Hash-Hash

This is the simplest method: the customers and sales tables are both partitioned by hash into 16 partitions, on the s_customerid and c_customerid columns. This partitioning method enables full partition-wise join when the tables are joined on s_customerid and c_customerid, both representing the same customer identification number. Because you are using the same hash function to distribute the same information (customer ID) into the same number of hash partitions, you can join the equivalent partitions. They are storing the same values.

In serial, this join is performed between pairs of matching hash partitions, one at a time. When one partition pair has been joined, the join of another partition pair begins. The join completes when the 16 partition pairs have been processed.

Note: A pair of matching hash partitions is defined as one partition with the same partition number from each table. For example, with full partition-wise joins we join partition 0 of sales with partition 0 of customers, partition 1 of sales with partition 1 of customers, and so on.
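
A minimal sketch of the hash-hash layout follows, assuming simplified versions of the two tables. Only the join columns s_customerid and c_customerid come from the text; the remaining columns are hypothetical. What matters for the full partition-wise join is that both tables use hash partitioning on the join key with the same number of partitions.

```sql
-- Both tables: hash on the customer ID, 16 partitions each
CREATE TABLE customers
  (c_customerid NUMBER,
   c_name       VARCHAR2(40))        -- hypothetical column
PARTITION BY HASH (c_customerid)
PARTITIONS 16;

CREATE TABLE sales
  (s_customerid NUMBER,
   s_salesdate  DATE,
   s_totalprice NUMBER)              -- hypothetical column
PARTITION BY HASH (s_customerid)
PARTITIONS 16;
```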

Parallel execution of a full partition-wise join is a straightforward parallelization of the serial execution. Instead of joining one partition pair at a time, 16 partition pairs are joined in parallel by the 16 query servers. Figure 5-1 illustrates the parallel execution of a full partition-wise join.

Figure 5-1 Parallel Execution of a Full Partition-Wise Join

In Figure 5-1, assume that the degree of parallelism and the number of partitions are the same, in other words, 16 for both. Defining more partitions than the degree of parallelism may improve load balancing and limit possible skew in the execution. If you have more partitions than query servers, when one query server completes the join of one pair of partitions, it requests that the query coordinator give it another pair to join. This process repeats until all pairs have been processed. This method enables the load to be balanced dynamically when the number of partition pairs is greater than the degree of parallelism, for example, 64 partitions with a degree of parallelism of 16.

Note: To guarantee an equal work distribution, the number of partitions should always be a multiple of the degree of parallelism.

In Oracle Real Application Clusters environments running on shared-nothing or MPP platforms, placing partitions on nodes is critical to achieving good scalability. To avoid remote I/O, both matching partitions should have affinity to the same node. Partition pairs should be spread over all nodes to avoid bottlenecks and to use all CPU resources available on the system.

Nodes can host multiple pairs when there are more pairs than nodes. For example, with an 8-node system and 16 partition pairs, each node receives two pairs.

See Also: Oracle9i Real Application Clusters Concepts for more information on data affinity

(Composite-Hash)-Hash

This method is a variation of the hash-hash method. The sales table is a typical example of a table storing historical data. For all the reasons mentioned under the heading "When to Use Range Partitioning", range is the logical initial partitioning method.

For example, assume you want to partition the sales table into eight partitions by range on the column s_salesdate. Also assume you have two years and that each partition represents a quarter. Instead of using range partitioning, you can use composite partitioning to enable a full partition-wise join while preserving the partitioning on s_salesdate. Partition the sales table by range on s_salesdate and then subpartition each partition by hash on s_customerid using 16 subpartitions for each partition, for a total of 128 subpartitions. The customers table can still use hash partitioning with 16 partitions.

When you use the method just described, a full partition-wise join works similarly to the one created by the hash-hash method. The join is still divided into 16 smaller joins between hash partition pairs from both tables. The difference is that now each hash partition in the sales table is composed of a set of 8 subpartitions, one from each range partition.

Figure 5-2 illustrates how the hash partitions are formed in the sales table. Each cell represents a subpartition. Each row corresponds to one range partition, for a total of 8 range partitions. Each range partition has 16 subpartitions. Each column corresponds to one hash partition for a total of 16 hash partitions; each hash partition has 8 subpartitions. Note that hash partitions can be defined only if all partitions have the same number of subpartitions, in this case, 16.

Hash partitions are implicit in a composite table. However, Oracle does not record them in the data dictionary, and you cannot manipulate them with DDL commands as you can range partitions.

Figure 5-2 Range and Hash Partitions of a Composite Table

(Composite-Hash)-Hash partitioning is effective because it lets you combine pruning (on s_salesdate) with a full partition-wise join (on customerid). In the previous example query, pruning is achieved by scanning only the subpartitions corresponding to Q3 of 1999, in other words, row number 3 in Figure 5-2. Oracle then joins these subpartitions with the customer table, using a full partition-wise join.

All characteristics of the hash-hash partition-wise join apply to the composite-hash partition-wise join. In particular, for this example, these two points are common to both methods:

The degree of parallelism for this full partition-wise join cannot exceed 16. Even though the sales table has 128 subpartitions, it has only 16 hash partitions.
The rules for data placement on MPP systems apply here. The only difference is that a hash partition is now a collection of subpartitions. You must ensure that all these subpartitions are placed on the same node as the matching hash partition from the other table. For example, in Figure 5-2, store hash partition 9 of the sales table, shown by the eight circled subpartitions, on the same node as hash partition 9 of the customers table.
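
The composite layout of the sales table described above could be declared roughly as follows. This is a sketch only: the partition names and the s_totalprice column are assumptions, while the eight quarterly range partitions and the 16 hash subpartitions on s_customerid follow the text.

```sql
CREATE TABLE sales
  (s_customerid NUMBER,
   s_salesdate  DATE,
   s_totalprice NUMBER)              -- hypothetical column
PARTITION BY RANGE (s_salesdate)
SUBPARTITION BY HASH (s_customerid) SUBPARTITIONS 16
 (PARTITION sal99q1 VALUES LESS THAN (TO_DATE('01-APR-1999', 'DD-MON-YYYY')),
  PARTITION sal99q2 VALUES LESS THAN (TO_DATE('01-JUL-1999', 'DD-MON-YYYY')),
  PARTITION sal99q3 VALUES LESS THAN (TO_DATE('01-OCT-1999', 'DD-MON-YYYY')),
  PARTITION sal99q4 VALUES LESS THAN (TO_DATE('01-JAN-2000', 'DD-MON-YYYY')),
  PARTITION sal00q1 VALUES LESS THAN (TO_DATE('01-APR-2000', 'DD-MON-YYYY')),
  PARTITION sal00q2 VALUES LESS THAN (TO_DATE('01-JUL-2000', 'DD-MON-YYYY')),
  PARTITION sal00q3 VALUES LESS THAN (TO_DATE('01-OCT-2000', 'DD-MON-YYYY')),
  PARTITION sal00q4 VALUES LESS THAN (TO_DATE('01-JAN-2001', 'DD-MON-YYYY')));
```

This yields 8 x 16 = 128 subpartitions, and the customers table keeps its plain hash partitioning with 16 partitions on c_customerid.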

(Composite-List)-List

The (Composite-List)-List method resembles that for (Composite-Hash)-Hash partition-wise joins.

Composite-Composite (Hash/List Dimension)

If needed, you can also partition the customer table by the composite method. For example, you partition it by range on a postal code column to enable pruning based on postal code. You then subpartition it by hash on customerid using the same number of partitions (16) to enable a partition-wise join on the hash dimension.

Range-Range and List-List

You can also join range partitioned tables with range partitioned tables and list partitioned tables with list partitioned tables in a partition-wise manner, but this is relatively uncommon. This is more complex to implement because you must know the distribution of the data before performing the join. Furthermore, if you do not correctly identify the partition bounds so that you have partitions of equal size, data skew during the execution may result.

The basic principle for using range-range and list-list is the same as for using hash-hash: you must equipartition both tables. This means that the number of partitions must be the same and the partition bounds must be identical. For example, assume that you know in advance that you have 10 million customers, and that the values for customerid vary from 1 to 10,000,000. In other words, you have 10 million possible different values. To create 16 partitions, you can range partition both tables, sales on s_customerid and customers on c_customerid. You should define partition bounds for both tables in order to generate partitions of the same size. In this example, partition bounds should be defined as 625001, 1250001, 1875001, ..., 10000001, so that each partition contains 625000 rows.
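
As an illustration of these bounds, the customers table could be declared as follows. The partition names p01 through p16 and the c_name column are hypothetical; each bound is 625000 higher than the previous one, starting at 625001. The sales table would need the same 16 bounds on s_customerid to be equipartitioned with it.

```sql
CREATE TABLE customers
  (c_customerid NUMBER,
   c_name       VARCHAR2(40))        -- hypothetical column
PARTITION BY RANGE (c_customerid)
 (PARTITION p01 VALUES LESS THAN (625001),
  PARTITION p02 VALUES LESS THAN (1250001),
  PARTITION p03 VALUES LESS THAN (1875001),
  PARTITION p04 VALUES LESS THAN (2500001),
  PARTITION p05 VALUES LESS THAN (3125001),
  PARTITION p06 VALUES LESS THAN (3750001),
  PARTITION p07 VALUES LESS THAN (4375001),
  PARTITION p08 VALUES LESS THAN (5000001),
  PARTITION p09 VALUES LESS THAN (5625001),
  PARTITION p10 VALUES LESS THAN (6250001),
  PARTITION p11 VALUES LESS THAN (6875001),
  PARTITION p12 VALUES LESS THAN (7500001),
  PARTITION p13 VALUES LESS THAN (8125001),
  PARTITION p14 VALUES LESS THAN (8750001),
  PARTITION p15 VALUES LESS THAN (9375001),
  PARTITION p16 VALUES LESS THAN (10000001));
```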

Range-Composite, Composite-Composite (Range Dimension)

Finally, you can also subpartition one or both tables on another column. Therefore, the range-composite and composite-composite methods on the range dimension are also valid for enabling a full partition-wise join on the range dimension.

Partial Partition-Wise Joins

Oracle can perform partial partition-wise joins only in parallel. Unlike full partition-wise joins, partial partition-wise joins require you to partition only one table on the join key, not both tables. The partitioned table is referred to as the reference table. The other table may or may not be partitioned. Partial partition-wise joins are more common than full partition-wise joins.

To execute a partial partition-wise join, Oracle dynamically repartitions the other table based on the partitioning of the reference table. Once the other table is repartitioned, the execution is similar to a full partition-wise join.

The performance advantage that partial partition-wise joins have over joins in non-partitioned tables is that the reference table is not moved during the join operation. Parallel joins between non-partitioned tables require both input tables to be redistributed on the join key. This redistribution operation involves exchanging rows between parallel execution servers. This is a CPU-intensive operation that can lead to excessive interconnect traffic in Oracle Real Application Clusters environments. Partitioning large tables on a join key, either a foreign or primary key, prevents this redistribution every time the table is joined on that key. Of course, if you choose a foreign key to partition the table, which is the most common scenario, select a foreign key that is involved in many queries.

To illustrate partial partition-wise joins, consider the previous sales/customers example. Assume that customers is not partitioned or is partitioned on a column other than c_customerid. Because sales is often joined with customers on customerid, and because this join dominates our application workload, partition sales on s_customerid to enable a partial partition-wise join every time customers and sales are joined. As in full partition-wise join, you have several alternatives:

Hash/List

The simplest method to enable a partial partition-wise join is to partition sales by hash on s_customerid. The number of partitions determines the maximum degree of parallelism, because the partition is the smallest granule of parallelism for partial partition-wise join operations.

The parallel execution of a partial partition-wise join is illustrated in Figure 5-3, which assumes that both the degree of parallelism and the number of partitions of sales are 16. The execution involves two sets of query servers: one set, labeled set 1 in Figure 5-3, scans the customers table in parallel. The granule of parallelism for the scan operation is a range of blocks.

Rows from customers that are selected by the first set, in this case all rows, are redistributed to the second set of query servers by hashing customerid. For example, all rows in customers that could have matching rows in partition P1 of sales are sent to query server 1 in the second set. Rows received by the second set of query servers are joined with the rows from the corresponding partitions in sales. Query server number 1 in the second set joins all customers rows that it receives with partition P1 of sales.

Figure 5-3 Partial Partition-Wise Join

Note: This section is based on range-hash, but it also applies for range-list partial partition-wise joins.
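The two-set execution described above can be modeled in a few lines of Python. This is a minimal sketch, not Oracle's implementation: the modulo hash in partition_of, the four-partition setup, and the sample rows are all assumptions made for illustration (the guide's example uses 16 partitions and Oracle's own internal hash function).

```python
from collections import defaultdict

NUM_PARTITIONS = 4  # assumption for brevity; the guide's example uses 16


def partition_of(customerid):
    # Stand-in hash function (assumption); Oracle's hash function is internal.
    return customerid % NUM_PARTITIONS


def partial_partition_wise_join(customers, sales_partitions):
    # Set 1: scan customers and redistribute each row to the query server
    # that owns the sales partition where matching rows could live.
    inbox = defaultdict(list)
    for cust in customers:
        inbox[partition_of(cust["customerid"])].append(cust)

    # Set 2: server p joins the rows it received with sales partition p only.
    # The sales rows (the reference table) are never moved.
    result = []
    for p, sales_rows in enumerate(sales_partitions):
        by_id = {c["customerid"]: c for c in inbox[p]}
        for sale in sales_rows:
            cust = by_id.get(sale["customerid"])
            if cust is not None:
                result.append((cust["name"], sale["amount"]))
    return result


customers = [
    {"customerid": 1, "name": "Ann"},
    {"customerid": 2, "name": "Bob"},
    {"customerid": 5, "name": "Cy"},
]
sales = [
    {"customerid": 1, "amount": 10},
    {"customerid": 5, "amount": 7},
    {"customerid": 2, "amount": 3},
    {"customerid": 1, "amount": 4},
]

# sales is already hash partitioned on its customer id column.
sales_partitions = [[] for _ in range(NUM_PARTITIONS)]
for sale in sales:
    sales_partitions[partition_of(sale["customerid"])].append(sale)

joined = partial_partition_wise_join(customers, sales_partitions)
print(sorted(joined))
```

Only the customers rows cross between server sets, which is the saving the text attributes to partial partition-wise joins.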

Considerations for full partition-wise joins also apply to partial partition-wise joins:

The degree of parallelism does not need to equal the number of partitions. In Figure 5-3, the query executes with two sets of 16 query servers. In this case, Oracle assigns 1 partition to each query server of the second set. Again, the number of partitions should always be a multiple of the degree of parallelism.

In Oracle Real Application Clusters environments on shared-nothing platforms (MPPs), each hash partition of sales should preferably have affinity to only one node in order to avoid remote I/Os. Also, spread partitions over all nodes to avoid bottlenecks and use all CPU resources available on the system. A node can host multiple partitions when there are more partitions than nodes.

See Also: Oracle9i Real Application Clusters Concepts for more information on data affinity

Composite

As with full partition-wise joins, the prime partitioning method for the sales table is to use the range method on column s_salesdate. This is because sales is a typical example of a table that stores historical data. To enable a partial partition-wise join while preserving this range partitioning, subpartition sales by hash on column s_customerid using 16 subpartitions for each partition. Pruning and partial partition-wise joins can be used together if a query joins customers and sales and if the query has a selection predicate on s_salesdate.

When sales is composite, the granule of parallelism for a partial partition-wise join is a hash partition and not a subpartition. Refer to Figure 5-2 for an illustration of a hash partition in a composite table. Again, the number of hash partitions should be a multiple of the degree of parallelism. Also, on an MPP system, ensure that each hash partition has affinity to a single node. In the previous example, the eight subpartitions composing a hash partition should have affinity to the same node.

Note: This section is based on range-hash, but it also applies for range-list partial partition-wise joins.

Range

Finally, you can use range partitioning on s_customerid to enable a partial partition-wise join. This works similarly to the hash method, but a side effect of range partitioning is that the resulting data distribution could be skewed if the size of the partitions differs. Moreover, this method is more complex to implement because it requires prior knowledge of the values of the partitioning column that is also a join key.

Benefits of Partition-Wise Joins

Partition-wise joins offer benefits described in this section:

Reduction of Communications Overhead
Reduction of Memory Requirements

Reduction of Communications Overhead

When executed in parallel, partition-wise joins reduce communications overhead. This is because, in the default case, parallel execution of a join operation by a set of parallel execution servers requires the redistribution of each table on the join column into disjoint subsets of rows. These disjoint subsets of rows are then joined pair-wise by a single parallel execution server.

Oracle can avoid redistributing the partitions because the two tables are already partitioned on the join column. This enables each parallel execution server to join a pair of matching partitions. This improved performance from using parallel execution is even more noticeable in Oracle Real Application Clusters configurations with internode parallel execution. Partition-wise joins dramatically reduce interconnect traffic. This feature is key for large DSS configurations that use Oracle Real Application Clusters. Currently, most Oracle Real Application Clusters platforms, such as MPP and SMP clusters, provide limited interconnect bandwidths compared with their processing powers. Ideally, interconnect bandwidth should be comparable to disk bandwidth, but this is seldom the case. As a result, most join operations in Oracle Real Application Clusters experience high interconnect latencies without parallel execution of partition-wise joins.

Reduction of Memory Requirements

Partition-wise joins require less memory than the equivalent join operation of the complete data set of the tables being joined. In the case of serial joins, the join is performed on one pair of matching partitions at a time. If data is evenly distributed across partitions (that is, there is no skew), the memory requirement is divided by the number of partitions. In the parallel case, memory requirements depend on the number of partition pairs that are joined in parallel. For example, if the degree of parallelism is 20 and the number of partitions is 100, 5 times less memory is required because only 20 joins of two partitions are performed at the same time. The fact that partition-wise joins require less memory has a direct effect on performance. For example, the join probably does not need to write blocks to disk during the build phase of a hash join.

Performance Considerations for Parallel Partition-Wise Joins

The cost-based optimizer weighs the advantages and disadvantages when deciding whether or not to use partition-wise joins.

In range partitioning where partition sizes differ, data skew increases response time; some parallel execution servers take longer than others to finish their joins. Oracle recommends the use of hash (sub)partitioning to enable partition-wise joins because hash partitioning, if the number of partitions is a power of two, limits the risk of skew.

The number of partitions used for partition-wise joins should, if possible, be a multiple of the number of query servers. With a degree of parallelism of 16, for example, you can have 16, 32, or even 64 partitions. If the number of partitions is not a multiple of the number of query servers, some parallel execution servers are used less than others. For example, if there are 17 evenly distributed partition pairs and 16 parallel execution servers, only one server works on the last join while the other servers wait. This is because, in the beginning of the execution, each parallel execution server works on a different partition pair. At the end of this first phase, only one pair is left. Thus, a single parallel execution server joins this remaining pair while all other parallel execution servers are idle.

Sometimes, parallel joins can cause remote I/Os. For example, on Oracle Real Application Clusters environments running on MPP configurations, if a pair of matching partitions is not collocated on the same node, a partition-wise join requires extra internode communication due to remote I/O. This is because Oracle must transfer at least one partition to the node where the join is performed. In this case, it is better to explicitly redistribute the data than to use a partition-wise join.
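The multiple-of-parallelism recommendation and the earlier memory example both come down to simple arithmetic. The following Python sketch works through the two cases from the text. The function names are hypothetical, and the model assumes evenly sized partition pairs with each server joining one pair at a time.

```python
import math


def join_rounds(partition_pairs, dop):
    """Return (rounds, idle servers in the last round).

    Assumes evenly sized pairs and one pair per server per round.
    """
    rounds = math.ceil(partition_pairs / dop)
    busy_in_last_round = partition_pairs - (rounds - 1) * dop
    return rounds, dop - busy_in_last_round


def memory_reduction_factor(partitions, dop):
    """How many times less memory is needed than joining all partitions at once,
    assuming only min(dop, partitions) pairs are joined at the same time."""
    return partitions / min(dop, partitions)


print(join_rounds(17, 16))               # 17 pairs on 16 servers
print(join_rounds(32, 16))               # a multiple of the degree of parallelism
print(memory_reduction_factor(100, 20))  # the memory example from the text
```

With 17 pairs, the second round keeps one server busy and leaves 15 idle; with 32 pairs, both rounds use all 16 servers.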

Miscellaneous Partition Operations

The following partition operations are needed on a regular basis:

Adding Partitions
Dropping Partitions
Exchanging Partitions
Moving Partitions
Splitting and Merging Partitions
Truncating Partitions
Coalescing Partitions

Adding Partitions

Different types of partitions require slightly different syntax when being added. Basic topics are:

Adding a Partition to a Range-Partitioned Table
Adding a Partition to a Hash-Partitioned Table
Adding a Partition to a List-Partitioned Table

Adding a Partition to a Range-Partitioned Table

Use the ALTER TABLE ... ADD PARTITION statement to add a new partition to the "high" end (the point after the last existing partition). To add a partition at the beginning or in the middle of a table, use the SPLIT PARTITION clause. For example, consider the table, sales, which contains data for the current month in addition to the previous 12 months. On January 1, 1999, you add a partition for January, which is stored in tablespace tsx.

ALTER TABLE sales
      ADD PARTITION jan96 VALUES LESS THAN ('01-FEB-1999')
      TABLESPACE tsx;

You cannot add a partition to a range-partitioned table that has a MAXVALUE partition, but you can split the MAXVALUE partition. By doing so, you effectively create a new partition defined by the values that you specify, and a second partition that remains the MAXVALUE partition. Local and global indexes associated with the range-partitioned table remain usable.

Adding a Partition to a Hash-Partitioned Table

When you add a partition to a hash-partitioned table, Oracle populates the new partition with rows rehashed from an existing partition (selected by Oracle) as determined by the hash function. The following statements show two ways of adding a hash partition to table scubagear. Choosing the first statement adds a new hash partition whose partition name is system generated, and which is placed in the table's default tablespace. The second statement also adds a new hash partition, but that partition is explicitly named p_named and is created in tablespace gear5.

ALTER TABLE scubagear ADD PARTITION;

ALTER TABLE scubagear
      ADD PARTITION p_named TABLESPACE gear5;
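One way to picture how a new hash partition can be filled from a single existing partition is a generic linear-hashing scheme, sketched below in Python. This is an assumption made for illustration, not Oracle's actual hash function, which the guide does not describe; the function name hash_partition and the use of plain integers as hash values are hypothetical.

```python
def hash_partition(hash_value, num_partitions):
    """Linear-hashing style partition choice (a generic model, not Oracle's
    internal function): hash into the next power of two, then fold values
    that land past the last existing partition back onto a lower one."""
    next_pow2 = 1
    while next_pow2 < num_partitions:
        next_pow2 *= 2
    bucket = hash_value % next_pow2
    if bucket >= num_partitions:
        bucket -= next_pow2 // 2
    return bucket


hash_values = range(80)
before = [hash_partition(h, 4) for h in hash_values]  # 4 partitions
after = [hash_partition(h, 5) for h in hash_values]   # after adding a 5th

# Which (old partition, new partition) moves did adding the partition cause?
moved = {(b, a) for b, a in zip(before, after) if b != a}
sizes_after = [after.count(p) for p in range(5)]
print(moved, sizes_after)
```

In this model only rows from one existing partition move to the new one, and partition sizes are uneven until the count reaches the next power of two, which is consistent with the guide's remark elsewhere that a power-of-two partition count limits the risk of skew.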

Adding a Partition to a List-Partitioned Table

The following statement illustrates adding a new partition to a list-partitioned table. In this example, physical attributes and NOLOGGING are specified for the partition being added.

ALTER TABLE q1_sales_by_region
   ADD PARTITION q1_nonmainland VALUES ('HI', 'PR')
      STORAGE (INITIAL 20K NEXT 20K) TABLESPACE tbs_3
      NOLOGGING;

Any value in the set of literal values that describe the partition being added must not exist in any of the other partitions of the table. You cannot add a partition to a list-partitioned table that has a default partition, but you can split the default partition. By doing so, you effectively create a new partition defined by the values that you specify, and a second partition that remains the default partition. Local and global indexes associated with the list-partitioned table remain usable.

Dropping Partitions

You can drop partitions from range, composite, list, or composite range-list partitioned tables. For hash-partitioned tables, or hash subpartitions of range-hash partitioned tables, you must perform a coalesce operation instead.

Dropping a Table Partition

Use one of the following statements to drop a table partition or subpartition:

ALTER TABLE ... DROP PARTITION to drop a table partition
ALTER TABLE ... DROP SUBPARTITION to drop a subpartition of a range-list partitioned table

A typical example of dropping a partition containing data and referential integrity objects is as follows:

ALTER TABLE sales
   DISABLE CONSTRAINT dname_sales1;
ALTER TABLE sales DROP PARTITION dec98;
ALTER TABLE sales
   ENABLE CONSTRAINT dname_sales1;

In this example, you disable the integrity constraints, issue the ALTER TABLE ... DROP PARTITION statement, then enable the integrity constraints. This method is most appropriate for large tables where the partition being dropped contains a significant percentage of the total data in the table.

See Also: Oracle9i Database Administrator's Guide for more detailed examples

Exchanging Partitions

You can convert a partition (or subpartition) into a nonpartitioned table, and a nonpartitioned table into a partition (or subpartition) of a partitioned table by exchanging their data segments. You can also convert a hash-partitioned table into a partition of a range-hash partitioned table, or convert the partition of the range-hash partitioned table into a hash-partitioned table. Similarly, you can convert a list-partitioned table into a partition of a range-list partitioned table, or convert the partition of the range-list partitioned table into a list-partitioned table.

A typical example of exchanging into a nonpartitioned table follows. In this example, table stocks can be range, hash, or list partitioned.

ALTER TABLE stocks
    EXCHANGE PARTITION p3 WITH TABLE stock_table_3;

See Also: Oracle9i Database Administrator's Guide for more detailed examples

Moving Partitions

Use the MOVE PARTITION clause to move a partition. For example, to move the most active partition to a tablespace that resides on its own disk (in order to balance I/O) and to not log the action, issue the following statement:

ALTER TABLE parts MOVE PARTITION depot2
     TABLESPACE ts094 NOLOGGING;

This statement always drops the partition's old segment and creates a new segment, even if you do not specify a new tablespace.

See Also: Oracle9i Database Administrator's Guide for more detailed examples

Splitting and Merging Partitions

The SPLIT PARTITION clause of the ALTER TABLE or ALTER INDEX statement is used to redistribute the contents of a partition into two new partitions. Consider doing this when a partition becomes too large and causes backup, recovery, or maintenance operations to take a long time to complete. You can also use the SPLIT PARTITION clause to redistribute the I/O load. This clause cannot be used for hash partitions or subpartitions.

A typical example is to split a range-partitioned table as follows:

ALTER TABLE vet_cats SPLIT PARTITION
      fee_katy AT (100) INTO ( PARTITION
      fee_katy1 ..., PARTITION fee_katy2 ...);
ALTER INDEX JAF1 REBUILD PARTITION fee_katy1;
ALTER INDEX JAF1 REBUILD PARTITION fee_katy2;
ALTER INDEX VET REBUILD PARTITION vet_parta;
ALTER INDEX VET REBUILD PARTITION vet_partb;

See Also: Oracle9i Database Administrator's Guide for more detailed examples

Use the ALTER TABLE ... MERGE PARTITIONS statement to merge the contents of two partitions into one partition. The two original partitions are dropped, as are any corresponding local indexes.

You cannot use this statement for a hash-partitioned table or for hash subpartitions of a range-hash partitioned table.

The following statement merges two subpartitions of a table partitioned using the range-list method into a new subpartition located in tablespace tbs_west:

ALTER TABLE quarterly_regional_sales
   MERGE SUBPARTITIONS q1_1999_northwest, q1_1999_southwest
      INTO SUBPARTITION q1_1999_west
         TABLESPACE tbs_west;

Truncating Partitions

Use the ALTER TABLE ... TRUNCATE PARTITION statement to remove all rows from a table partition. Truncating a partition is similar to dropping a partition, except that the partition is emptied of its data, but not physically dropped.

You cannot truncate an index partition. However, if there are local indexes defined for the table, the ALTER TABLE TRUNCATE PARTITION statement truncates the matching partition in each local index.

The following example illustrates a partition that contains data and has referential integrity constraints:

ALTER TABLE sales
    DISABLE CONSTRAINT dname_sales1;
ALTER TABLE sales TRUNCATE PARTITION dec94;
ALTER TABLE sales
    ENABLE CONSTRAINT dname_sales1;

In this example, you disable the integrity constraints, issue the ALTER TABLE ... TRUNCATE PARTITION statement, then re-enable the integrity constraints. This method is most appropriate for large tables where the partition being truncated contains a significant percentage of the total data in the table.

See Also: Oracle9i Database Administrator's Guide for more detailed examples

Coalescing Partitions

Coalescing partitions is a way of reducing the number of partitions in a hash-partitioned table, or the number of subpartitions in a range-hash partitioned table. When a hash partition is coalesced, its contents are redistributed into one or more remaining partitions determined by the hash function. The specific partition that is coalesced is selected by Oracle, and is dropped after its contents have been redistributed.

The following statement illustrates a typical case of reducing by one the number of partitions in a table:

ALTER TABLE ouu1
     COALESCE PARTITION;

See Also: Oracle9i Database Administrator's Guide for more detailed examples

Copyright © 1996, 2002 Oracle Corporation. All Rights Reserved.


6 Indexes

This chapter describes how to use indexes in a data warehousing environment and discusses the following types of index:

Bitmap Indexes
B-tree Indexes
Local Indexes Versus Global Indexes

See Also: Oracle9i Database Concepts for general information regarding indexing

Bitmap Indexes

Bitmap indexes are widely used in data warehousing environments. The environments typically have large amounts of data and ad hoc queries, but a low level of concurrent DML transactions. For such applications, bitmap indexing provides:

Reduced response time for large classes of ad hoc queries
Reduced storage requirements compared to other indexing techniques
Dramatic performance gains even on hardware with a relatively small number of CPUs or a small amount of memory
Efficient maintenance during parallel DML and loads

Fully indexing a large table with a traditional B-tree index can be prohibitively expensive in terms of space because the indexes can be several times larger than the data in the table. Bitmap indexes are typically only a fraction of the size of the indexed data in the table.

Note: Bitmap indexes are available only if you have purchased the Oracle9i Enterprise Edition. See Oracle9i Database New Features for more information about the features available in Oracle9i and the Oracle9i Enterprise Edition.

An index provides pointers to the rows in a table that contain a given key value. A regular index stores a list of rowids for each key corresponding to the rows with that key value. In a bitmap index, a bitmap for each key value replaces a list of rowids.

Each bit in the bitmap corresponds to a possible rowid, and if the bit is set, it means that the row with the corresponding rowid contains the key value. A mapping function converts the bit position to an actual rowid, so that the bitmap index provides the same functionality as a regular index. If the number of different key values is small, bitmap indexes save space.

Bitmap indexes are most effective for queries that contain multiple conditions in the WHERE clause. Rows that satisfy some, but not all, conditions are filtered out before the table itself is accessed. This improves response time, often dramatically.

Benefits for Data Warehousing Applications

Bitmap indexes are primarily intended for data warehousing applications where users query the data rather than update it. They are not suitable for OLTP applications with large numbers of concurrent transactions modifying the data.

Parallel query and parallel DML work with bitmap indexes as they do with traditional indexes. Bitmap indexing also supports parallel create indexes and concatenated indexes.

See Also: Chapter 17, "Schema Modeling Techniques" for further information about using bitmap indexes in data warehousing environments

Cardinality

The advantages of using bitmap indexes are greatest for columns in which the ratio of the number of distinct values to the number of rows in the table is under 1%. We refer to this ratio as the degree of cardinality. A gender column, which has only two distinct values (male and female), is ideal for a bitmap index. However, data warehouse administrators also build bitmap indexes on columns with higher cardinalities.

For example, on a table with one million rows, a column with 10,000 distinct values is a candidate for a bitmap index. A bitmap index on this column can outperform a B-tree index, particularly when this column is often queried in conjunction with other indexed columns. In fact, in a typical data warehouse environment, a bitmap index can be considered for any non-unique column.

B-tree indexes are most effective for high-cardinality data: that is, for data with many possible values, such as customer_name or phone_number. In a data warehouse, B-tree indexes should be used only for unique columns or other columns with very high cardinalities (that is, columns that are almost unique). The majority of indexes in a data warehouse should be bitmap indexes.

In ad hoc queries and similar situations, bitmap indexes can dramatically improve query performance. AND and OR conditions in the WHERE clause of a query can be resolved quickly by performing the corresponding Boolean operations directly on the bitmaps before converting the resulting bitmap to rowids. If the resulting number of rows is small, the query can be answered quickly without resorting to a full table scan.

Example 6-1 Bitmap Index

The following shows a portion of a company's customers table.

SELECT cust_id, cust_gender, cust_marital_status, cust_income_level
FROM customers;

CUST_ID    C CUST_MARITAL_STATUS  CUST_INCOME_LEVEL
---------- - -------------------- ---------------------
...
        70 F                      D: 70,000 - 89,999
        80 F married              H: 150,000 - 169,999
        90 M single               H: 150,000 - 169,999
       100 F                      I: 170,000 - 189,999
       110 F married              C: 50,000 - 69,999
       120 M single               F: 110,000 - 129,999
       130 M                      J: 190,000 - 249,999
       140 M married              G: 130,000 - 149,999
...

Because cust_gender, cust_marital_status, and cust_income_level are all low-cardinality columns (there are only three possible values for marital status, two possible values for gender, and 12 for income level), bitmap indexes are ideal for these columns. Do not create a bitmap index on cust_id because this is a unique column. Instead, a unique B-tree index on this column provides the most efficient representation and retrieval.

Table 6-1 illustrates the bitmap index for the cust_gender column in this example. It consists of two separate bitmaps, one for each gender.

Table 6-1 Sample Bitmap Index

               gender='M'   gender='F'
cust_id 70     0            1
cust_id 80     0            1
cust_id 90     1            0
cust_id 100    0            1
cust_id 110    0            1
cust_id 120    1            0
cust_id 130    1            0
cust_id 140    1            0

Each entry (or bit) in the bitmap corresponds to a single row of the customers table. The value of each bit depends upon the values of the corresponding row in the table. For instance, the bitmap cust_gender='F' contains a one as its first bit because the gender is F in the first row of the customers table. The bitmap cust_gender='F' has a zero for its third bit because the gender of the third row is not F.
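The construction of these bitmaps can be sketched in Python. This is a simplified model for illustration only: the function name build_bitmap_index is hypothetical, bits are kept in plain lists, and list positions stand in for rowids (a real bitmap index stores compressed bitmaps and maps bit positions to rowids).

```python
def build_bitmap_index(column):
    """Build one bitmap (a list of 0/1 values) per distinct key value.

    Bit i of a key's bitmap is 1 when row i holds that key.
    """
    index = {}
    for row, value in enumerate(column):
        bits = index.setdefault(value, [0] * len(column))
        bits[row] = 1
    return index


# cust_gender for cust_id 70 through 140, as listed in Example 6-1.
cust_gender = ["F", "F", "M", "F", "F", "M", "M", "M"]

gender_index = build_bitmap_index(cust_gender)
print("gender='M':", gender_index["M"])
print("gender='F':", gender_index["F"])
```

The two printed bitmaps correspond to the two columns of Table 6-1.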

An analyst investigating demographic trends of the company's customers might ask, "How many of our married customers have an income level of G or H?" This corresponds to the following SQL query:

SELECT COUNT(*) FROM customers
WHERE cust_marital_status = 'married'
AND cust_income_level IN ('H: 150,000 - 169,999', 'G: 130,000 - 149,999');

Bitmap indexes can efficiently process this query by merely counting the number of ones in the bitmap illustrated in Figure 6-1. The result set will be found by using bitmap or merge operations without the necessity of a conversion to rowids. To identify additional specific customer attributes that satisfy the criteria, use the resulting bitmap to access the table after a bitmap to rowid conversion.

Figure 6-1 Executing a Query Using Bitmap Indexes
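The Boolean evaluation of this query can be sketched in Python over the eight rows shown in Example 6-1. This is an illustrative model only: the helper name bitmap is hypothetical, Python integers serve as bitmaps, and the count applies to the eight listed rows rather than the full table.

```python
def bitmap(column, key):
    """Return an integer whose bit i is set when row i equals key."""
    bits = 0
    for row, value in enumerate(column):
        if value == key:
            bits |= 1 << row
    return bits


# The eight rows shown in Example 6-1 (None marks an empty marital status).
cust_id = [70, 80, 90, 100, 110, 120, 130, 140]
marital = [None, "married", "single", None, "married", "single", None, "married"]
income = [
    "D: 70,000 - 89,999", "H: 150,000 - 169,999", "H: 150,000 - 169,999",
    "I: 170,000 - 189,999", "C: 50,000 - 69,999", "F: 110,000 - 129,999",
    "J: 190,000 - 249,999", "G: 130,000 - 149,999",
]

married = bitmap(marital, "married")
level_h = bitmap(income, "H: 150,000 - 169,999")
level_g = bitmap(income, "G: 130,000 - 149,999")

# WHERE status = 'married' AND income IN (H, G): OR the two income bitmaps,
# then AND the result with the marital status bitmap.
result = married & (level_h | level_g)

# COUNT(*) needs only the number of set bits; no rows are fetched.
count = bin(result).count("1")

# Fetching other attributes requires converting set bits back to rows.
matching_ids = [cust_id[i] for i in range(len(cust_id)) if (result >> i) & 1]
print(count, matching_ids)
```

The count is obtained from the bitmaps alone, while the final list comprehension plays the role of the bitmap to rowid conversion.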

Bitmap Indexes and Nulls

Unlike most other types of indexes, bitmap indexes include rows that have NULL values. Indexing of nulls can be useful for some types of SQL statements, such as queries with the aggregate function COUNT.

Example 6-2 Bitmap Index

SELECT COUNT(*) FROM customers WHERE cust_marital_status IS NULL;

This query uses a bitmap index on cust_marital_status. Note that this query would not be able to use a B-tree index, because a single-column B-tree index does not contain entries for NULL keys.

SELECT COUNT(*) FROM employees;

Any bitmap index can be used for this query because all table rows are indexed, including those that have NULL data. If nulls were not indexed, the optimizer would be able to use indexes only on columns with NOT NULL constraints.

Bitmap Indexes on Partitioned Tables

You can create bitmap indexes on partitioned tables, but they must be local to the partitioned table; they cannot be global indexes. Global bitmap indexes are supported only on nonpartitioned tables.

See Also: "Index Partitioning"

Bitmap Join Indexes

In addition to a bitmap index on a single table, you can create a bitmap join index, which is a bitmap index for the join of two or more tables. A bitmap join index is a space efficient way of reducing the volume of data that must be joined by performing restrictions in advance. For each value in a column of a table, a bitmap join index stores the rowids of corresponding rows in one or more other tables. In a data warehousing environment, the join condition is an equi-inner join between the primary key column or columns of the dimension tables and the foreign key column or columns in the fact table.

Bitmap join indexes are much more efficient in storage than materialized join views, an alternative for materializing joins in advance. This is because the materialized join views do not compress the rowids of the fact tables.

Example 6-3 Bitmap Join Index: Example 1

Using the example in "Bitmap Index", create a bitmap join index with the following sales table:

SELECT time_id, cust_id, amount FROM sales;

TIME_ID   CUST_ID    AMOUNT
--------- ---------- ----------
01-JAN-98      29700       2291
01-JAN-98       3380        114
01-JAN-98      67830        553
01-JAN-98     179330          0
01-JAN-98     127520        195
01-JAN-98      33030        280
...

CREATE BITMAP INDEX sales_cust_gender_bjix
ON sales(customers.cust_gender)
FROM sales, customers
WHERE sales.cust_id = customers.cust_id
LOCAL;

The following query shows how to use this bitmap join index and illustrates its bitmap pattern:

SELECT sales.time_id, customers.cust_gender, sales.amount
FROM sales, customers
WHERE sales.cust_id = customers.cust_id;

TIME_ID   C AMOUNT
--------- - ----------
01-JAN-98 M       2291
01-JAN-98 F        114
01-JAN-98 M        553
01-JAN-98 M          0
01-JAN-98 M        195
01-JAN-98 M        280
01-JAN-98 M         32
...

Table 6-2 illustrates the bitmap join index in this example:

Table 6-2 Sample Bitmap Join Index

                  cust_gender='M'   cust_gender='F'
sales record 1    1                 0
sales record 2    0                 1
sales record 3    1                 0
sales record 4    1                 0
sales record 5    1                 0
sales record 6    1                 0
sales record 7    1                 0
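The idea behind Table 6-2 is that each bit refers to a row of the fact table, while the key value comes from the dimension table. A minimal Python sketch follows. It is illustrative only: the function name build_bitmap_join_index is hypothetical, list positions stand in for sales rowids, and the gender lookup is inferred by pairing the six listed sales rows with the joined output in the same order, which is an assumption.

```python
def build_bitmap_join_index(fact_foreign_keys, dimension_lookup):
    """For each dimension value, build a bitmap over the fact table rows.

    Bit i is 1 when fact row i joins to a dimension row with that value.
    """
    index = {}
    for row, key in enumerate(fact_foreign_keys):
        value = dimension_lookup[key]  # the join, resolved once at build time
        bits = index.setdefault(value, [0] * len(fact_foreign_keys))
        bits[row] = 1
    return index


# cust_id values of the six sales rows listed in Example 6-3.
sales_cust_id = [29700, 3380, 67830, 179330, 127520, 33030]

# customers.cust_gender by cust_id (inferred from the joined output; assumption).
customer_gender = {
    29700: "M", 3380: "F", 67830: "M",
    179330: "M", 127520: "M", 33030: "M",
}

join_index = build_bitmap_join_index(sales_cust_id, customer_gender)
print("cust_gender='M':", join_index["M"])
print("cust_gender='F':", join_index["F"])
```

A query that restricts sales by customer gender can then read the bitmap directly instead of joining to the customers table.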

You can create other bitmap join indexes using more than one column or more than one table, as shown in these examples.

Example 6-4 Bitmap Join Index: Example 2

You can create a bitmap join index on more than one column, as in the following example, which uses customers(gender, marital_status):

CREATE BITMAP INDEX sales_cust_gender_ms_bjix
ON sales(customers.cust_gender, customers.cust_marital_status)
FROM sales, customers
WHERE sales.cust_id = customers.cust_id
LOCAL NOLOGGING;

Example 6-5 Bitmap Join Index: Example 3

You can create a bitmap join index on more than one table, as in the following, which uses customers(gender) and products(category):

CREATE BITMAP INDEX sales_c_gender_p_cat_bjix
ON sales(customers.cust_gender, products.prod_category)
FROM sales, customers, products
WHERE sales.cust_id = customers.cust_id
AND sales.prod_id = products.prod_id
LOCAL NOLOGGING;

Example 6-6 Bitmap Join Index: Example 4

You can create a bitmap join index on more than one table, in which the indexed column is joined to the indexed table by using another table. For example, we can build an index on countries.country_name, even though the countries table is not joined directly to the sales table. Instead, the countries table is joined to the customers table, which is joined to the sales table. This type of schema is commonly called a snowflake schema.

CREATE BITMAP INDEX sales_c_gender_p_cat_bjix
ON sales(customers.cust_gender, products.prod_category)
FROM sales, customers, products
WHERE sales.cust_id = customers.cust_id
AND sales.prod_id = products.prod_id
LOCAL NOLOGGING;

Bitmap Join Index Restrictions

Join results must be stored; therefore, bitmap join indexes have the following restrictions:

Parallel DML is currently only supported on the fact table. Parallel DML on one of the participating dimension tables will mark the index as unusable.
Only one table can be updated concurrently by different transactions when using the bitmap join index.
No table can appear twice in the join.
You cannot create a bitmap join index on an index-organized table or a temporary table.
The columns in the index must all be columns of the dimension tables.
The dimension table join columns must be either primary key columns or have unique constraints.
If a dimension table has a composite primary key, each column in the primary key must be part of the join.

See Also: Oracle9i SQL Reference for further details

B-tree Indexes
A B-tree index is organized like an upside-down tree. The bottom level of the index holds the actual data values and pointers to the corresponding rows, much as the index in a book has a page number associated with each index entry.

See Also: Oracle9i Database Concepts for an explanation of B-tree structures

In general, use B-tree indexes when you know that your typical query refers to the indexed column and retrieves a few rows. In these queries, it is faster to find the rows by looking at the index. However, using the book index analogy, if you plan to look at every single topic in a book, you might not want to look in the index for the topic and then look up the page. It might be faster to read through every chapter in the book. Similarly, if you are retrieving most of the rows in a table, it might not make sense to look up the index to find the table rows. Instead, you might want to read or scan the table.

B-tree indexes are most commonly used in a data warehouse to index unique or near-unique keys. In many cases, it may not be necessary to index these columns in a data warehouse, because unique constraints can be maintained without an index, and because typical data warehouse queries may not work better with such indexes. Bitmap indexes should be more common than B-tree indexes in most data warehouse environments.

Local Indexes Versus Global Indexes

B-tree indexes on partitioned tables can be global or local. With Oracle8i and earlier releases, Oracle recommended that global indexes not be used in data warehouse environments because a partition DDL statement (for example, ALTER TABLE ... DROP PARTITION) would invalidate the entire index, and rebuilding the index is expensive. In Oracle9i, global indexes can be maintained without Oracle marking them as unusable after DDL. This enhancement makes global indexes more effective for data warehouse environments.
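As an illustration of the Oracle9i behavior described above, partition DDL can carry an UPDATE GLOBAL INDEXES clause so that global indexes are maintained as part of the statement. This is a sketch; the partition name sales_q1_1998 is hypothetical.

```sql
-- Drop an old partition and maintain global indexes in the same statement,
-- so they are not marked UNUSABLE. Partition name is an assumption.
ALTER TABLE sales DROP PARTITION sales_q1_1998 UPDATE GLOBAL INDEXES;
```

The trade-off is that the DDL itself takes longer, because index maintenance happens during the statement rather than in a later rebuild.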

However, local indexes will be more common than global indexes. Global indexes should be used when there is a specific requirement which cannot be met by local indexes (for example, a unique index on a non-partitioning key, or a performance requirement). Bitmap indexes on partitioned tables are always local.

See Also: "Types of Partitioning" for further details
Copyright © 1996, 2002 Oracle Corporation. All Rights Reserved.


7 Integrity Constraints
This chapter describes integrity constraints, and discusses:

Why Integrity Constraints are Useful in a Data Warehouse
Overview of Constraint States
Typical Data Warehouse Integrity Constraints

Why Integrity Constraints are Useful in a Data Warehouse

Integrity constraints provide a mechanism for ensuring that data conforms to guidelines specified by the database administrator. The most common types of constraints include:
UNIQUE constraints: To ensure that a given column is unique
NOT NULL constraints: To ensure that no null values are allowed
FOREIGN KEY constraints: To ensure that two keys share a primary key to foreign key relationship

Constraints can be used for these purposes in a data warehouse:

Data cleanliness
Constraints verify that the data in the data warehouse conforms to a basic level of data consistency and correctness, preventing the introduction of dirty data.

Query optimization
The Oracle database utilizes constraints when optimizing SQL queries. Although constraints can be useful in many aspects of query optimization, constraints are particularly important for query rewrite of materialized views.

Unlike data in many relational database environments, data in a data warehouse is typically added or modified under controlled circumstances during the extraction, transformation, and loading (ETL) process. Multiple users normally do not update the data warehouse directly, as they do in an OLTP system.

See Also: Chapter 10, "Overview of Extraction, Transformation, and Loading"

Many significant constraint features have been introduced for data warehousing. Readers familiar with Oracle's constraint functionality in Oracle7 and Oracle8 should take special note of the functionality described in this chapter. In fact, many Oracle7-based and Oracle8-based data warehouses lacked constraints because of concerns about constraint performance. Newer constraint functionality addresses these concerns.

Overview of Constraint States

To understand how best to use constraints in a data warehouse, you should first understand the basic purposes of constraints. Some of these purposes are:

Enforcement
In order to use a constraint for enforcement, the constraint must be in the ENABLE state. An enabled constraint ensures that all data modifications upon a given table (or tables) satisfy the conditions of the constraints. Data modification operations which produce data that violates the constraint fail with a constraint violation error.

Validation
To use a constraint for validation, the constraint must be in the VALIDATE state. If the constraint is validated, then all data that currently resides in the table satisfies the constraint. Note that validation is independent of enforcement. Although the typical constraint in an operational system is both enabled and validated, any constraint could be validated but not enabled or vice versa (enabled but not validated). These latter two cases are useful for data warehouses.

Belief
In some cases, you will know that the conditions for a given constraint are true, so you do not need to validate or enforce the constraint. However, you may wish for the constraint to be present anyway to improve query optimization and performance. When you use a constraint in this way, it is called a belief or RELY constraint, and the constraint must be in the RELY state. The RELY state provides you with a mechanism for telling Oracle9i that a given constraint is believed to be true. Note that the RELY state only affects constraints that have not been validated.
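One way to see which of these states each constraint is in is to query the data dictionary. The following sketch assumes a table named SALES owned by the current user; the STATUS, VALIDATED, and RELY columns of USER_CONSTRAINTS correspond to enforcement, validation, and belief respectively.

```sql
-- Inspect the enforcement, validation, and belief state of each constraint.
-- STATUS:    ENABLED or DISABLED
-- VALIDATED: VALIDATED or NOT VALIDATED
-- RELY:      RELY, or null when the constraint is not in the RELY state
SELECT constraint_name, constraint_type, status, validated, rely
FROM user_constraints
WHERE table_name = 'SALES';
```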

Typical Data Warehouse Integrity Constraints

This section assumes that you are familiar with the typical use of constraints. That is, constraints that are both enabled and validated. For data warehousing, many users have discovered that such constraints may be prohibitively costly to build and maintain. The topics discussed are:

UNIQUE Constraints in a Data Warehouse
FOREIGN KEY Constraints in a Data Warehouse
RELY Constraints
Integrity Constraints and Parallelism
Integrity Constraints and Partitioning
View Constraints

UNIQUE Constraints in a Data Warehouse

A UNIQUE constraint is typically enforced using a UNIQUE index. However, in a data warehouse whose tables can be extremely large, creating a unique index can be costly both in processing time and in disk space.

Suppose that a data warehouse contains a table sales, which includes a column sales_id. sales_id uniquely identifies a single sales transaction, and the data warehouse administrator must ensure that this column is unique within the data warehouse. One way to create the constraint is as follows:

ALTER TABLE sales ADD CONSTRAINT sales_unique
  UNIQUE(sales_id);

By default, this constraint is both enabled and validated. Oracle implicitly creates a unique index on sales_id to support this constraint. However, this index can be problematic in a data warehouse for three reasons:

The unique index can be very large, because the sales table can easily have millions or even billions of rows.
The unique index is rarely used for query execution. Most data warehousing queries do not have predicates on unique keys, so creating this index will probably not improve performance.
If sales is partitioned along a column other than sales_id, the unique index must be global. This can detrimentally affect all maintenance operations on the sales table.

A unique index is required for unique constraints to ensure that each individual row modified in the sales table satisfies the UNIQUE constraint.

For data warehousing tables, an alternative mechanism for unique constraints is illustrated in the following statement:

ALTER TABLE sales ADD CONSTRAINT sales_unique
  UNIQUE (sales_id) DISABLE VALIDATE;

This statement creates a unique constraint, but, because the constraint is disabled, a unique index is not required. This approach can be advantageous for many data warehousing environments because the constraint now ensures uniqueness without the cost of a unique index.

However, there are trade-offs for the data warehouse administrator to consider with DISABLE VALIDATE constraints. Because this constraint is disabled, no DML statements that modify the unique column are permitted against the sales table. You can use one of two strategies for modifying this table in the presence of a constraint:

Use DDL to add data to this table (such as exchanging partitions). See the example in Chapter 14, "Maintaining the Data Warehouse".
Before modifying this table, drop the constraint. Then, make all necessary data modifications. Finally, re-create the disabled constraint. Re-creating the constraint is more efficient than re-creating an enabled constraint. However, this approach does not guarantee that data added to the sales table while the constraint has been dropped is unique.
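The second strategy could be sketched as follows. The staging table sales_staging is hypothetical, and the APPEND hint is only one possible way to perform the modification step.

```sql
-- 1. Drop the DISABLE VALIDATE constraint so that DML is permitted again.
ALTER TABLE sales DROP CONSTRAINT sales_unique;

-- 2. Make the data modifications (sales_staging is an assumed staging table).
INSERT /*+ APPEND */ INTO sales
SELECT * FROM sales_staging;
COMMIT;

-- 3. Re-create the constraint in the disabled, validated state.
ALTER TABLE sales ADD CONSTRAINT sales_unique
  UNIQUE (sales_id) DISABLE VALIDATE;
```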

FOREIGN KEY Constraints in a Data Warehouse

In a star schema data warehouse, FOREIGN KEY constraints validate the relationship between the fact table and the dimension tables. A sample constraint might be:

ALTER TABLE sales ADD CONSTRAINT sales_time_fk
  FOREIGN KEY (sales_time_id) REFERENCES time (time_id)
  ENABLE VALIDATE;

However, in some situations, you may choose to use a different state for the FOREIGN KEY constraints, in particular, the ENABLE NOVALIDATE state. A data warehouse administrator might use an ENABLE NOVALIDATE constraint when either:

The tables contain data that currently disobeys the constraint, but the data warehouse administrator wishes to create a constraint for future enforcement.
An enforced constraint is required immediately.

Suppose that the data warehouse loaded new data into the fact tables every day, but refreshed the dimension tables only on the weekend. During the week, the dimension tables and fact tables may in fact disobey the FOREIGN KEY constraints. Nevertheless, the data warehouse administrator might wish to maintain the enforcement of this constraint to prevent any changes that might affect the FOREIGN KEY constraint outside of the ETL process. Thus, you can create the FOREIGN KEY constraints every night, after performing the ETL process, as shown here:

ALTER TABLE sales ADD CONSTRAINT sales_time_fk
  FOREIGN KEY (sales_time_id) REFERENCES time (time_id)
  ENABLE NOVALIDATE;

ENABLE NOVALIDATE can quickly create an enforced constraint, even when the constraint is believed to be true. Suppose that the ETL process verifies that a FOREIGN KEY constraint is true. Rather than have the database re-verify this FOREIGN KEY constraint, which would require time and database resources, the data warehouse administrator could instead create a FOREIGN KEY constraint using ENABLE NOVALIDATE.

RELY Constraints

The ETL process commonly verifies that certain constraints are true. For example, it can validate all of the foreign keys in the data coming into the fact table. This means that you can trust it to provide clean data, instead of implementing constraints in the data warehouse. You create a RELY constraint as follows:

ALTER TABLE sales ADD CONSTRAINT sales_time_fk
  FOREIGN KEY (sales_time_id) REFERENCES time (time_id)
  RELY DISABLE NOVALIDATE;

RELY constraints, even though they are not used for data validation, can:

Enable more sophisticated query rewrites for materialized views. See Chapter 22, "Query Rewrite" for further details.
Enable other data warehousing tools to retrieve information regarding constraints directly from the Oracle data dictionary.

Creating a RELY constraint is inexpensive and does not impose any overhead during DML or load. Because the constraint is not being validated, no data processing is necessary to create it.

Integrity Constraints and Parallelism

All constraints can be validated in parallel. When validating constraints on very large tables, parallelism is often necessary to meet performance goals. The degree of parallelism for a given constraint operation is determined by the default degree of parallelism of the underlying table.
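Because the degree of parallelism comes from the table, one plausible sequence is to set the table's default degree first and then validate the constraint. This is a sketch; the degree of 8 is an arbitrary illustrative value, and sales_time_fk is assumed to exist already in a non-validated state.

```sql
-- Set the default degree of parallelism on the underlying table
-- (8 is an illustrative value, not a recommendation).
ALTER TABLE sales PARALLEL 8;

-- Validate an existing constraint; the validation can then run in parallel.
ALTER TABLE sales MODIFY CONSTRAINT sales_time_fk ENABLE VALIDATE;
```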

Integrity Constraints and Partitioning

You can create and maintain constraints before you partition the data. Later chapters discuss the significance of partitioning for data warehousing. Partitioning can improve constraint management just as it does to management of many other operations. For example, Chapter 14, "Maintaining the Data Warehouse" provides a scenario creating UNIQUE and FOREIGN KEY constraints on a separate staging table, and these constraints are maintained during the EXCHANGE PARTITION statement.

View Constraints
You can create constraints on views. The only type of constraint supported on a view is a RELY constraint.

This type of constraint is useful when queries typically access views instead of base tables, and the DBA thus needs to define the data relationships between views rather than tables. View constraints are particularly useful in OLAP environments, where they may enable more sophisticated rewrites for materialized views.
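A view constraint is declared, not enforced, so it is written in the RELY DISABLE NOVALIDATE state. The following is a minimal sketch; the view name, constraint name, and the calendar_month_desc column are hypothetical.

```sql
-- Declare a primary key relationship on a view.
-- The constraint is never checked by the database; it only informs
-- the optimizer and tools that time_id is believed to be unique.
CREATE VIEW time_view
  (time_id, calendar_month_desc,
   CONSTRAINT time_view_pk PRIMARY KEY (time_id) RELY DISABLE NOVALIDATE)
AS SELECT time_id, calendar_month_desc FROM time;
```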

See Also: Chapter 8, "Materialized Views" and Chapter 22, "Query Rewrite"

8 Materialized Views
This chapter introduces you to the use of materialized views and discusses:

Overview of Data Warehousing with Materialized Views
Types of Materialized Views
Creating Materialized Views
Registering Existing Materialized Views
Partitioning and Materialized Views
Materialized Views in OLAP Environments
Choosing Indexes for Materialized Views
Invalidating Materialized Views
Security Issues with Materialized Views
Altering Materialized Views
Dropping Materialized Views
Analyzing Materialized View Capabilities

Overview of Data Warehousing with Materialized Views

Typically, data flows from one or more online transaction processing (OLTP) databases into a data warehouse on a monthly, weekly, or daily basis. The data is normally processed in a staging file before being added to the data warehouse. Data warehouses commonly range in size from tens of gigabytes to a few terabytes. Usually, the vast majority of the data is stored in a few very large fact tables.

One technique employed in data warehouses to improve performance is the creation of summaries. Summaries are special kinds of aggregate views that improve query execution times by precalculating expensive joins and aggregation operations prior to execution and storing the results in a table in the database. For example, you can create a table to contain the sums of sales by region and by product.

The summaries or aggregates that are referred to in this book and in literature on data warehousing are created in Oracle using a schema object called a materialized view. Materialized views can perform a number of roles, such as improving query performance or providing replicated data.

Prior to Oracle8i, organizations using summaries spent a significant amount of time and effort creating summaries manually, identifying which summaries to create, indexing the summaries, updating them, and advising their users on which ones to use. The introduction of summary management in Oracle8i eased the workload of the database administrator and meant the user no longer needed to be aware of the summaries that had been defined. The database administrator creates one or more materialized views, which are the equivalent of a summary. The end user queries the tables and views at the detail data level. The query rewrite mechanism in the Oracle server automatically rewrites the SQL query to use the summary tables. This mechanism reduces response time for returning results from the query. Materialized views within the data warehouse are transparent to the end user or to the database application.

Although materialized views are usually accessed through the query rewrite mechanism, an end user or database application can construct queries that directly access the summaries. However, serious consideration should be given to whether users should be allowed to do this because any change to the summaries will affect the queries that reference them.

Materialized Views for Data Warehouses

In data warehouses, you can use materialized views to precompute and store aggregated data such as the sum of sales. Materialized views in these environments are often referred to as summaries, because they store summarized data. They can also be used to precompute joins with or without aggregations. A materialized view eliminates the overhead associated with expensive joins and aggregations for a large or important class of queries.
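As a concrete illustration, a summary of sales by product category could be defined along these lines. This is a sketch only: the materialized view name and the amount_sold column are assumptions, and the build and refresh options shown are one choice among several.

```sql
-- Precompute a join plus aggregation and make it available to query rewrite.
-- sales_by_category_mv and amount_sold are hypothetical names.
CREATE MATERIALIZED VIEW sales_by_category_mv
  BUILD IMMEDIATE
  REFRESH COMPLETE ON DEMAND
  ENABLE QUERY REWRITE
AS
SELECT p.prod_category,
       SUM(s.amount_sold) AS total_sold,
       COUNT(*) AS row_count
FROM sales s, products p
WHERE s.prod_id = p.prod_id
GROUP BY p.prod_category;
```

A query that groups sales by prod_category against the detail tables is then a candidate for rewrite against this summary.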

Materialized Views for Distributed Computing

In distributed environments, you can use materialized views to replicate data at distributed sites and to synchronize updates done at those sites with conflict resolution methods. The materialized views as replicas provide local access to data that otherwise would have to be accessed from remote sites. Materialized views are also useful in remote data marts.

See Also: Oracle9i Replication and Oracle9i Heterogeneous Connectivity Administrator's Guide for details on distributed and mobile computing

Materialized Views for Mobile Computing

You can also use materialized views to download a subset of data from central servers to mobile clients, with periodic refreshes and updates between clients and the central servers.

This chapter focuses on the use of materialized views in data warehouses.

See Also: Oracle9i Replication and Oracle9i Heterogeneous Connectivity Administrator's Guide for details on distributed and mobile computing

The Need for Materialized Views

You can use materialized views in data warehouses to increase the speed of queries on very large databases. Queries to large databases often involve joins between tables, aggregations such as SUM, or both. These operations are expensive in terms of time and processing power. The type of materialized view you create determines how the materialized view is refreshed and used by query rewrite.

You can use materialized views in a number of ways, and you can use almost identical syntax to perform a number of roles. For example, a materialized view can replicate data, a process formerly achieved by using the CREATE SNAPSHOT statement. Now CREATE MATERIALIZED VIEW is a synonym for CREATE SNAPSHOT.

Materialized views improve query performance by precalculating expensive join and aggregation operations on the database prior to execution and storing the results in the database. The query optimizer automatically recognizes when an existing materialized view can and should be used to satisfy a request. It then transparently rewrites the request to use the materialized view. Queries go directly to the materialized view and not to the underlying detail tables. In general, rewriting queries to use materialized views rather than detail tables improves response. Figure 8-1 illustrates how query rewrite works.

Figure 8-1 Transparent Query Rewrite

When using query rewrite, create materialized views that satisfy the largest number of queries. For example, if you identify 20 queries that are commonly applied to the detail or fact tables, then you might be able to satisfy them with five or six well-written materialized views. A materialized view definition can include any number of aggregations (SUM, COUNT(x), COUNT(*), COUNT(DISTINCT x), AVG, VARIANCE, STDDEV, MIN, and MAX). It can also include any number of joins. If you are unsure of which materialized views to create, Oracle provides a set of advisory procedures in the DBMS_OLAP package to help in designing and evaluating materialized views for query rewrite. These functions are also known as the Summary Advisor or the Advisor. Note that the OLAP Summary Advisor is different. See Oracle9i OLAP User's Guide for further details regarding the OLAP Summary Advisor.

If a materialized view is to be used by query rewrite, it must be stored in the same database as the fact or detail tables on which it relies. A materialized view can be partitioned, and you can define a materialized view on a partitioned table. You can also define one or more indexes on the materialized view.

Unlike indexes, materialized views can be accessed directly using a SELECT statement.

Note: The techniques shown in this chapter illustrate how to use materialized views in data warehouses. Materialized views can also be used by Oracle Replication. See Oracle9i Replication for further information.

Components of Summary Management

Summary management consists of:

Mechanisms to define materialized views and dimensions.
A refresh mechanism to ensure that all materialized views contain the latest data.
A query rewrite capability to transparently rewrite a query to use a materialized view.
A collection of materialized view analysis and advisory functions and procedures in the DBMS_OLAP package. Collectively, these functions are called the Summary Advisor, and are also available as part of Oracle Enterprise Manager.

See Also: Chapter 16, "Summary Advisor" and Oracle9i OLAP User's Guide for OLAP-related schemas

Many large decision support system (DSS) databases have schemas that do not closely resemble a conventional data warehouse schema, but that still require joins and aggregates. The use of summary management features imposes no schema restrictions, and can enable some existing DSS database applications to improve performance without the need to redesign the database or the application.

Figure 8-2 illustrates the use of summary management in the warehousing cycle. After the data has been transformed, staged, and loaded into the detail data in the warehouse, you can invoke the summary management process. First, use the Advisor to plan how you will use summaries. Then, create summaries and design how queries will be rewritten.

Figure 8-2 Overview of Summary Management

Understanding the summary management process during the earliest stages of data warehouse design can yield large dividends later in the form of higher performance, lower summary administration costs, and reduced storage requirements.

Data Warehousing Terminology

Some basic data warehousing terms are defined as follows:

Dimension tables describe the business entities of an enterprise, represented as hierarchical, categorical information such as time, departments, locations, and products. Dimension tables are sometimes called lookup or reference tables.

Dimension tables usually change slowly over time and are not modified on a periodic schedule. They are used in long-running decision support queries to aggregate the data returned from the query into appropriate levels of the dimension hierarchy.

Hierarchies describe the business relationships and common access patterns in the database. An analysis of the dimensions, combined with an understanding of the typical work load, can be used to create materialized views.

See Also: Chapter 9, "Dimensions"

Fact tables describe the business transactions of an enterprise. Fact tables are sometimes called detail tables.

The vast majority of data in a data warehouse is stored in a few very large fact tables that are updated periodically with data from one or more operational OLTP databases.

Fact tables include facts (also called measures) such as sales, units, and inventory.

A simple measure is a numeric or character column of one table such as fact.sales.
A computed measure is an expression involving measures of one table, for example, fact.revenues - fact.expenses.
A multitable measure is a computed measure defined on multiple tables, for example, fact_a.revenues - fact_b.expenses.

Fact tables also contain one or more foreign keys that organize the business transactions by the relevant business entities such as time, product, and market. In most cases, these foreign keys are non-null, form a unique compound key of the fact table, and each foreign key joins with exactly one row of a dimension table.

A materialized view is a precomputed table comprising aggregated and joined data from fact and possibly from dimension tables. Among builders of data warehouses, a materialized view is also known as a summary.

Materialized View Schema Design

Summary management can perform many useful functions, including query rewrite and materialized view refresh, even if your data warehouse design does not follow these guidelines. However, you will realize significantly greater query execution performance and materialized view refresh performance benefits and you will require fewer materialized views if your schema design complies with these guidelines.

A materialized view definition includes any number of aggregates, as well as any number of joins. In several ways, a materialized view behaves like an index:

The purpose of a materialized view is to increase query execution performance.
The existence of a materialized view is transparent to SQL applications, so that a DBA can create or drop materialized views at any time without affecting the validity of SQL applications.
A materialized view consumes storage space.
The contents of the materialized view must be updated when the underlying detail tables are modified.

Schemas and Dimension Tables
In the case of normalized or partially normalized dimension tables (a dimension that is stored in more than one table), identify how these tables are joined. Note whether the joins between the dimension tables can guarantee that each child-side row joins with one and only one parent-side row. In the case of denormalized dimensions, determine whether the child-side columns uniquely determine the parent-side (or attribute) columns. These relationships can be enabled with constraints, using the NOVALIDATE and RELY options if the relationships represented by the constraints are guaranteed by other means. Note that if the joins between fact and dimension tables do not support the parent-child relationship described previously, you still gain significant performance advantages from defining the dimension with the CREATE DIMENSION statement. Another alternative, subject to some restrictions, is to use outer joins in the materialized view definition (that is, in the CREATE MATERIALIZED VIEW statement).

You must not create dimensions in any schema that does not satisfy these relationships. Incorrect results can be returned from queries otherwise.

See Also: Chapter 9, "Dimensions" and Oracle9i OLAP User's Guide for OLAP-related schemas

Materialized View Schema Design Guidelines
Before starting to define and use the various components of summary management, you should review your schema design to abide by the following guidelines wherever possible.

Guidelines 1 and 2 are more important than guideline 3. If your schema design does not follow guidelines 1 and 2, it does not then matter whether it follows guideline 3. Guidelines 1, 2, and 3 affect both query rewrite performance and materialized view refresh performance.

Schema Guideline: Description

Guideline 1, Dimensions: Dimensions should either be denormalized (each dimension contained in one table) or the joins between tables in a normalized or partially normalized dimension should guarantee that each child-side row joins with exactly one parent-side row. The benefits of maintaining this condition are described in "Creating Dimensions". You can enforce this condition by adding FOREIGN KEY and NOT NULL constraints on the child-side join keys and PRIMARY KEY constraints on the parent-side join keys.

Guideline 2, Dimensions: If dimensions are denormalized or partially denormalized, hierarchical integrity must be maintained between the key columns of the dimension table. Each child key value must uniquely identify its parent key value, even if the dimension table is denormalized. Hierarchical integrity in a denormalized dimension can be verified by calling the VALIDATE_DIMENSION procedure of the DBMS_OLAP package.

Guideline 3, Dimensions: Fact and dimension tables should similarly guarantee that each fact table row joins with exactly one dimension table row. This condition must be declared, and optionally enforced, by adding FOREIGN KEY and NOT NULL constraints on the fact key column(s) and PRIMARY KEY constraints on the dimension key column(s), or by using outer joins. In a data warehouse, constraints are typically enabled with the NOVALIDATE and RELY clauses to avoid constraint enforcement performance overhead. See Oracle9i SQL Reference for further details.

Guideline 4, Incremental Loads: Incremental loads of your detail data should be done using the SQL*Loader direct-path option, or any bulk loader utility that uses Oracle's direct-path interface. This includes INSERT ... AS SELECT with the APPEND or PARALLEL hints, where the hints cause the direct loader log to be used during the insert. See Oracle9i SQL Reference and "Types of Materialized Views".

Guideline 5, Partitions: Range/composite partition your tables by a monotonically increasing time column if possible (preferably of type DATE).

Guideline 6, Dimensions: After each load and before refreshing your materialized view, use the VALIDATE_DIMENSION procedure of the DBMS_OLAP package to incrementally verify dimensional integrity.

Guideline 7, Time Dimensions: If a time dimension appears in the materialized view as a time column, partition and index the materialized view in the same manner as you have the fact tables.

If you are concerned with the time required to enable constraints and whether any constraints might be violated, use the ENABLE NOVALIDATE with the RELY clause to turn on constraint checking without validating any of the existing constraints. The risk with this approach is that incorrect query results could occur if any constraints are broken. Therefore, as the designer, you must determine how clean the data is and whether the risk of wrong results is too great.

Loading Data
A popular and efficient way to load data into a warehouse or data mart is to use SQL*Loader with the DIRECT or PARALLEL option or to use another loader tool that uses the Oracle direct-path API.

See Also: Oracle9i Database Utilities for the restrictions and considerations when using SQL*Loader with the DIRECT or PARALLEL keywords

Loading strategies can be classified as one-phase or two-phase. In one-phase loading, data is loaded directly into the target table, quality assurance tests are performed, and errors are resolved by performing DML operations prior to refreshing materialized views. If a large number of deletions are possible, then storage utilization can be adversely affected, but temporary space requirements and load time are minimized. The DML that may be required after one-phase loading causes multitable aggregate materialized views to become unusable in the safest rewrite integrity level.

In a two-phase loading process:

Data is first loaded into a temporary table in the warehouse.
Quality assurance procedures are applied to the data.
Referential integrity constraints on the target table are disabled, and the local index in the target partition is marked unusable.
The data is copied from the temporary area into the appropriate partition of the target table using INSERT AS SELECT with the PARALLEL or APPEND hint.
The temporary table is dropped.
The constraints are enabled, usually with the NOVALIDATE option.
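The later steps of the two-phase process could be sketched as follows, assuming the data has already been loaded into a temporary table and checked. The names sales_temp, sales_2002_q1, and sales_time_fk are hypothetical, and the index rebuild step is an assumption added here; it is not one of the steps listed above.

```sql
-- Disable referential integrity on the target table and mark the
-- local index partitions of the target partition unusable.
ALTER TABLE sales DISABLE CONSTRAINT sales_time_fk;
ALTER TABLE sales MODIFY PARTITION sales_2002_q1 UNUSABLE LOCAL INDEXES;

-- Allow DML to proceed while those index partitions are unusable.
ALTER SESSION SET SKIP_UNUSABLE_INDEXES = TRUE;

-- Copy the verified rows into the target partition with a direct-path insert.
INSERT /*+ APPEND */ INTO sales PARTITION (sales_2002_q1)
SELECT * FROM sales_temp;
COMMIT;

-- Drop the temporary table.
DROP TABLE sales_temp;

-- Assumed extra step: rebuild the local index partitions marked unusable.
ALTER TABLE sales MODIFY PARTITION sales_2002_q1 REBUILD UNUSABLE LOCAL INDEXES;

-- Re-enable the constraint without re-checking existing rows.
ALTER TABLE sales ENABLE NOVALIDATE CONSTRAINT sales_time_fk;
```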

Immediately after loading the detail data and updating the indexes on the detail data, the database can be opened for operation, if desired. You can disable query rewrite at the system level by issuing an ALTER SYSTEM SET QUERY_REWRITE_ENABLED = false statement until all the materialized views are refreshed. If QUERY_REWRITE_INTEGRITY is set to stale_tolerated, access to the materialized view can be allowed at the session level to any users who do not require the materialized views to reflect the data from the latest load by issuing an ALTER SESSION SET QUERY_REWRITE_ENABLED = true statement. This scenario does not apply when QUERY_REWRITE_INTEGRITY is either enforced or trusted because the system ensures in these modes that only materialized views with updated data participate in a query rewrite.
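As a minimal sketch of the sequence just described (the placement of the refresh step and the decision to re-enable rewrite system-wide afterwards are assumptions about a typical load job, not something the text prescribes):

```sql
-- Before the load: stop query rewrite for everyone
ALTER SYSTEM SET QUERY_REWRITE_ENABLED = FALSE;

-- ... load detail data, update indexes, refresh materialized views ...

-- A session that can tolerate stale materialized views may opt back in early
ALTER SESSION SET QUERY_REWRITE_INTEGRITY = STALE_TOLERATED;
ALTER SESSION SET QUERY_REWRITE_ENABLED = TRUE;

-- After all materialized views are refreshed: re-enable rewrite system-wide
ALTER SYSTEM SET QUERY_REWRITE_ENABLED = TRUE;
```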

Overview of Materialized View Management Tasks


The motivation for using materialized views is to improve performance, but the overhead associated with materialized view management can become a significant system management problem. When reviewing or evaluating some of the necessary materialized view management activities, consider some of the following:

Identifying what materialized views to create initially
Indexing the materialized views
Ensuring that all materialized views and materialized view indexes are refreshed properly each time the database is updated
Checking which materialized views have been used
Determining how effective each materialized view has been on workload performance
Measuring the space being used by materialized views
Determining which new materialized views should be created
Determining which existing materialized views should be dropped
Archiving old detail and materialized view data that is no longer useful

After the initial effort of creating and populating the data warehouse or data mart, the major administration overhead is the update process, which involves:

Periodic extraction of incremental changes from the operational systems
Transforming the data
Verifying that the incremental changes are correct, consistent, and complete
Bulk-loading the data into the warehouse
Refreshing indexes and materialized views so that they are consistent with the detail data

The update process must generally be performed within a limited period of time known as the update window. The update window depends on the update frequency (such as daily or weekly) and the nature of the business. For a daily update frequency, an update window of two to six hours might be typical. You need to know your update window for the following activities:

Loading the detail data
Updating or rebuilding the indexes on the detail data
Performing quality assurance tests on the data
Refreshing the materialized views
Updating the indexes on the materialized views

Types of Materialized Views

The SELECT clause in the materialized view creation statement defines the data that the materialized view is to contain. Only a few restrictions limit what can be specified. Any number of tables can be joined together. However, they cannot be remote tables if you wish to take advantage of query rewrite. Besides tables, other elements such as views, inline views (subqueries in the FROM clause of a SELECT statement), subqueries, and materialized views can all be joined or referenced in the SELECT clause. The types of materialized views are:

Materialized Views with Aggregates
Materialized Views Containing Only Joins
Nested Materialized Views

Materialized Views with Aggregates

In data warehouses, materialized views normally contain aggregates as shown in Example 8-1. For fast refresh to be possible, the SELECT list must contain all of the GROUP BY columns (if present), and there must be a COUNT(*) and a COUNT(column) on any aggregated columns. Also, materialized view logs must be present on all tables referenced in the query that defines the materialized view. The valid aggregate functions are: SUM, COUNT(x), COUNT(*), AVG, VARIANCE, STDDEV, MIN, and MAX, and the expression to be aggregated can be any SQL value expression.

See Also: "Restrictions on Fast Refresh on Materialized Views with Aggregates"

Fast refresh for a materialized view containing joins and aggregates is possible after any type of DML to the base tables (direct load or conventional INSERT, UPDATE, or DELETE). It can be defined to be refreshed ON COMMIT or ON DEMAND. A REFRESH ON COMMIT materialized view will be refreshed automatically when a transaction that does DML to one of the materialized view's detail tables commits. The time taken to complete the commit may be slightly longer than usual when this method is chosen. This is because the refresh operation is performed as part of the commit process. Therefore, this method may not be suitable if many users are concurrently changing the tables upon which the materialized view is based.

Here are some examples of materialized views with aggregates. Note that materialized view logs are only created because this materialized view will be fast refreshed.

Example 8-1 Creating a Materialized View: Example 1

CREATE MATERIALIZED VIEW LOG ON products
WITH SEQUENCE, ROWID
(prod_id, prod_name, prod_desc, prod_subcategory, prod_subcat_desc,
 prod_category, prod_cat_desc, prod_weight_class, prod_unit_of_measure,
 prod_pack_size, supplier_id, prod_status, prod_list_price, prod_min_price)
INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW LOG ON sales
WITH SEQUENCE, ROWID
(prod_id, cust_id, time_id, channel_id, promo_id, quantity_sold, amount_sold)
INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW product_sales_mv
PCTFREE 0 TABLESPACE demo
STORAGE (INITIAL 8k NEXT 8k PCTINCREASE 0)
BUILD IMMEDIATE
REFRESH FAST
ENABLE QUERY REWRITE
AS SELECT p.prod_name, SUM(amount_sold) AS dollar_sales,
COUNT(*) AS cnt, COUNT(amount_sold) AS cnt_amt
FROM sales s, products p
WHERE s.prod_id = p.prod_id
GROUP BY prod_name;

Example 8-1 creates a materialized view product_sales_mv that computes total number and value of sales for a product. It is derived by joining the tables sales and products on the column prod_id. The materialized view is populated with data immediately because the build method is immediate and it is available for use by query rewrite. In this example, the default refresh method is FAST, which is allowed because the appropriate materialized view logs have been created on tables products and sales.

Example 8-2 Creating a Materialized View: Example 2

CREATE MATERIALIZED VIEW product_sales_mv
  PCTFREE 0 TABLESPACE demo
  STORAGE (INITIAL 16k NEXT 16k PCTINCREASE 0)
  BUILD DEFERRED
  REFRESH COMPLETE ON DEMAND
  ENABLE QUERY REWRITE
  AS
  SELECT
   p.prod_name,
     SUM(amount_sold) AS dollar_sales
      FROM sales s, products p
      WHERE s.prod_id = p.prod_id
      GROUP BY p.prod_name;

Example 8-2 creates a materialized view product_sales_mv that computes the sum of sales by prod_name. It is derived by joining the tables sales and products on the column prod_id. The materialized view does not initially contain any data, because the build method is DEFERRED. A complete refresh is required for the first refresh of a build deferred materialized view. When it is refreshed and once populated, this materialized view can be used by query rewrite.

Example 8-3 Creating a Materialized View: Example 3

CREATE MATERIALIZED VIEW LOG ON sales
WITH SEQUENCE, ROWID
(prod_id, cust_id, time_id, channel_id, promo_id, quantity_sold, amount_sold)
INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW sum_sales
  PARALLEL
  BUILD IMMEDIATE
  REFRESH FAST ON COMMIT
  AS
  SELECT s.prod_id, s.time_id,
         COUNT(*) AS count_grp,
         SUM(s.amount_sold) AS sum_dollar_sales,
         COUNT(s.amount_sold) AS count_dollar_sales,
         SUM(s.quantity_sold) AS sum_quantity_sales,
         COUNT(s.quantity_sold) AS count_quantity_sales
  FROM sales s
  GROUP BY s.prod_id, s.time_id;

Example 8-3 creates a materialized view that contains aggregates on a single table. Because the materialized view log has been created, the materialized view is fast refreshable. If DML is applied against the sales table, then the changes will be reflected in the materialized view when the commit is issued.

Requirements for Using Materialized Views with Aggregates

Table 8-1 illustrates the aggregate requirements for materialized views.

Table 8-1 Requirements for Materialized Views with Aggregates

If aggregate X is present, aggregate Y is required and aggregate Z is optional.

X | Y | Z
COUNT(expr) | - | -
SUM(expr) | COUNT(expr) | -
AVG(expr) | COUNT(expr) | SUM(expr)
STDDEV(expr) | COUNT(expr) SUM(expr) | SUM(expr * expr)
VARIANCE(expr) | COUNT(expr) SUM(expr) | SUM(expr * expr)

Note that COUNT(*) must always be present. Oracle recommends that you include the optional aggregates in column Z in the materialized view in order to obtain the most efficient and accurate fast refresh of the aggregates.
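To make the VARIANCE row of Table 8-1 concrete, the following sketch defines an aggregate materialized view that carries the required COUNT(expr) and SUM(expr) plus the optional SUM(expr * expr). The view name sales_stats_mv and its column aliases are hypothetical, and the sketch assumes the materialized view log on sales from Example 8-3 already exists.

```sql
-- Hypothetical view; assumes the sales log from Example 8-3
-- (WITH SEQUENCE, ROWID (...) INCLUDING NEW VALUES) is in place.
CREATE MATERIALIZED VIEW sales_stats_mv
  BUILD IMMEDIATE
  REFRESH FAST ON DEMAND
  AS
  SELECT s.prod_id,
         COUNT(*)                            AS cnt,         -- always required
         COUNT(s.amount_sold)                AS cnt_amt,     -- required (Y)
         SUM(s.amount_sold)                  AS sum_amt,     -- required (Y)
         SUM(s.amount_sold * s.amount_sold)  AS sum_sq_amt,  -- optional (Z)
         VARIANCE(s.amount_sold)             AS var_amt      -- aggregate X
  FROM sales s
  GROUP BY s.prod_id;
```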

Materialized Views Containing Only Joins

Some materialized views contain only joins and no aggregates, such as in Example 8-4, where a materialized view is created that joins the sales table to the times and customers tables. The advantage of creating this type of materialized view is that expensive joins will be precalculated. Fast refresh for a materialized view containing only joins is possible after any type of DML to the base tables (direct-path or conventional INSERT, UPDATE, or DELETE). A materialized view containing only joins can be defined to be refreshed ON COMMIT or ON DEMAND. If it is ON COMMIT, the refresh is performed at commit time of the transaction that does DML on the materialized view's detail table. Oracle does not allow self-joins in materialized join views. If you specify REFRESH FAST, Oracle performs further verification of the query definition to ensure that fast refresh can be performed if any of the detail tables change. These additional checks are:

A materialized view log must be present for each detail table.
The rowids of all the detail tables must appear in the SELECT list of the materialized view query definition.
If there are no outer joins, you may have arbitrary selections and joins in the WHERE clause. However, if there are outer joins, the WHERE clause cannot have any selections. Further, if there are outer joins, all the joins must be connected by ANDs and must use the equality (=) operator.
If there are outer joins, unique constraints must exist on the join columns of the inner table. For example, if you are joining the fact table and a dimension table and the join is an outer join with the fact table being the outer table, there must exist unique constraints on the join columns of the dimension table.

If some of these restrictions are not met, you can create the materialized view as REFRESH FORCE to take advantage of fast refresh when it is possible. If one of the tables did not meet all of the criteria, but the other tables did, the materialized view would still be fast refreshable with respect to the other tables for which all the criteria are met. A materialized view log should contain the rowid of the master table. It is not necessary to add other columns. To speed up refresh, you should create indexes on the materialized view's columns that store the rowids of the fact table.

Example 8-4 Materialized View Containing Only Joins

CREATE MATERIALIZED VIEW LOG ON sales
  WITH ROWID;

CREATE MATERIALIZED VIEW LOG ON times
  WITH ROWID;

CREATE MATERIALIZED VIEW LOG ON customers
  WITH ROWID;

CREATE MATERIALIZED VIEW detail_sales_mv
       PARALLEL BUILD IMMEDIATE
       REFRESH FAST
       AS
       SELECT
       s.rowid "sales_rid", t.rowid "times_rid", c.rowid "customers_rid",
       c.cust_id, c.cust_last_name, s.amount_sold,
       s.quantity_sold, s.time_id
       FROM sales s, times t, customers c
       WHERE  s.cust_id = c.cust_id(+) AND
              s.time_id = t.time_id(+);

In this example, to perform a fast refresh, UNIQUE constraints should exist on c.cust_id and t.time_id. You should also create indexes on the columns sales_rid, times_rid, and customers_rid, as illustrated in the following. This will improve the refresh performance.

CREATE INDEX mv_ix_salesrid
  ON detail_sales_mv("sales_rid");

Alternatively, if the previous example did not include the columns times_rid and customers_rid, and if the refresh method was REFRESH FORCE, then this materialized view would be fast refreshable only if the sales table was updated but not if the tables times or customers were updated.

CREATE MATERIALIZED VIEW detail_sales_mv
       PARALLEL
       BUILD IMMEDIATE
       REFRESH FORCE
       AS
       SELECT
       s.rowid "sales_rid",
       c.cust_id, c.cust_last_name, s.amount_sold,
       s.quantity_sold, s.time_id
       FROM sales s, times t, customers c
       WHERE s.cust_id = c.cust_id(+) AND
             s.time_id = t.time_id(+);

Nested Materialized Views

A nested materialized view is a materialized view whose definition is based on another materialized view. A nested materialized view can reference other relations in the database in addition to referencing materialized views.

Why Use Nested Materialized Views?

In a data warehouse, you typically create many aggregate views on a single join (for example, rollups along different dimensions). Incrementally maintaining these distinct materialized aggregate views can take a long time, because the underlying join has to be performed many times. Using nested materialized views, you can create multiple single-table materialized views based on a joins-only materialized view and the join is performed just once. In addition, optimizations can be performed for this class of single-table aggregate materialized view and thus refresh is very efficient.

Example 8-5 Nested Materialized View

You can create a nested materialized view on materialized views that contain joins only or joins and aggregates. All the underlying objects (materialized views or tables) on which the materialized view is defined must have a materialized view log. All the underlying objects are treated as if they were tables. All the existing options for materialized views can be used, with the exception of ON COMMIT REFRESH, which is not supported for a nested materialized view that contains joins and aggregates. Using the tables and their columns from the sh sample schema, the following materialized views illustrate how nested materialized views can be created.

/* create the materialized view logs */
CREATE MATERIALIZED VIEW LOG ON sales
  WITH ROWID;
CREATE MATERIALIZED VIEW LOG ON customers
  WITH ROWID;
CREATE MATERIALIZED VIEW LOG ON times
  WITH ROWID;

/* create materialized view join_sales_cust_time as fast refreshable at
   COMMIT time */
CREATE MATERIALIZED VIEW join_sales_cust_time
REFRESH FAST ON COMMIT AS
SELECT c.cust_id, c.cust_last_name, s.amount_sold, t.time_id,
       t.day_number_in_week, s.rowid srid, t.rowid trid, c.rowid crid
FROM sales s, customers c, times t
WHERE s.time_id = t.time_id AND
      s.cust_id = c.cust_id;

To create a nested materialized view on the table join_sales_cust_time, you would have to create a materialized view log on the table. Because this will be a single-table aggregate materialized view on join_sales_cust_time, you need to log all the necessary columns and use the INCLUDING NEW VALUES clause.

/* create materialized view log on join_sales_cust_time */
CREATE MATERIALIZED VIEW LOG ON join_sales_cust_time
WITH ROWID (cust_last_name, day_number_in_week, amount_sold)
INCLUDING NEW VALUES;

/* create the single-table aggregate materialized view sum_sales_cust_time on
   join_sales_cust_time as fast refreshable at COMMIT time */
CREATE MATERIALIZED VIEW sum_sales_cust_time
  REFRESH FAST ON COMMIT
  AS
  SELECT COUNT(*) cnt_all, SUM(amount_sold) sum_sales, COUNT(amount_sold)
         cnt_sales, cust_last_name, day_number_in_week
  FROM join_sales_cust_time
  GROUP BY cust_last_name, day_number_in_week;

This schema can be diagrammatically represented as in Figure 8-3.

Figure 8-3 Nested Materialized View Schema

Nesting Materialized Views with Joins and Aggregates

You can nest materialized views with joins and aggregates, but the ON DEMAND clause is necessary for FAST REFRESH. Some types of nested materialized views cannot be fast refreshed. Use EXPLAIN_MVIEW to identify those types of materialized views. Because you have to invoke the refresh functions manually, ordering has to be taken into account. This is because the refresh for a materialized view that is built on other materialized views will use the current state of the other materialized views, whether they are fresh or not. You can find the dependent materialized views for a particular object using the PL/SQL procedure GET_MV_DEPENDENCIES in the DBMS_MVIEW package.

Nested Materialized View Usage Guidelines

You should keep the following in mind when deciding whether to use nested materialized views:

If you want to use fast refresh, you should fast refresh all the materialized views along any chain. It makes little sense to define a fast refreshable materialized view on top of a materialized view that must be refreshed with a complete refresh.
If you want the highest-level materialized view to be fresh with respect to the detail tables, you need to ensure that all materialized views in a tree are refreshed in the correct dependency order before refreshing the highest-level. Oracle does not provide support for automatic refreshing of intermediate materialized views in a nested hierarchy. If the materialized views under the highest-level materialized view are stale, refreshing only the highest-level will succeed, but makes it fresh only with respect to its underlying materialized view, not the detail tables at the base of the tree.
When refreshing materialized views, you need to ensure that all materialized views in a tree are refreshed. If you only refresh the highest-level materialized view, the materialized views under it will be stale and you must explicitly refresh them.

Restrictions When Using Nested Materialized Views

The following restrictions exist on the way you can nest materialized views:

Fast refresh for ON COMMIT is not supported for a higher-level materialized view that contains joins and aggregates.
DBMS_MVIEW.REFRESH APIs will not automatically refresh nested materialized views unless explicitly specified. Thus, if monthly_sales_mv is based on sales_mv, you have to refresh sales_mv first, followed by monthly_sales_mv. Oracle does not automatically refresh monthly_sales_mv when you refresh sales_mv or vice versa.
If you have a table costs with a materialized view cost_mv based on it, you cannot then create a prebuilt materialized view on table costs. The result would make cost_mv a nested materialized view and this method of conversion is not supported.
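The ordering rule above can be sketched as a small PL/SQL block. The materialized views sales_mv and monthly_sales_mv are the illustrative names used in the text and are assumed to exist; DBMS_OUTPUT is used only to show the dependency list and assumes a SQL*Plus session.

```sql
SET SERVEROUTPUT ON

DECLARE
  deplist VARCHAR2(4000);
BEGIN
  -- List the materialized views that depend on sales_mv
  DBMS_MVIEW.GET_MV_DEPENDENCIES('sales_mv', deplist);
  DBMS_OUTPUT.PUT_LINE('Dependents of sales_mv: ' || deplist);

  -- Refresh bottom-up: the underlying view first, then the view built on it
  DBMS_MVIEW.REFRESH('sales_mv');
  DBMS_MVIEW.REFRESH('monthly_sales_mv');
END;
/
```

Refreshing in the opposite order would succeed, but monthly_sales_mv would then reflect the stale contents of sales_mv.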

Creating Materialized Views

A materialized view can be created with the CREATE MATERIALIZED VIEW statement or using Oracle Enterprise Manager. Example 8-6 creates the materialized view cust_sales_mv.

Example 8-6 Creating a Materialized View

CREATE MATERIALIZED VIEW cust_sales_mv
PCTFREE 0 TABLESPACE demo
STORAGE (INITIAL 16k NEXT 16k PCTINCREASE 0)
PARALLEL
BUILD IMMEDIATE
REFRESH COMPLETE
ENABLE QUERY REWRITE
AS
SELECT  c.cust_last_name,
     SUM(amount_sold) AS sum_amount_sold
     FROM customers c, sales s
     WHERE s.cust_id = c.cust_id
     GROUP BY c.cust_last_name;

It is not uncommon in a data warehouse to have already created summary or aggregation tables, and you might not wish to repeat this work by building a new materialized view. In this case, the table that already exists in the database can be registered as a prebuilt materialized view. This technique is described in "Registering Existing Materialized Views". Once you have selected the materialized views you want to create, follow these steps for each materialized view.

1. Design the materialized view. Existing user-defined materialized views do not require this step. If the materialized view contains many rows, then, if appropriate, the materialized view should be partitioned (if possible) and should match the partitioning of the largest or most frequently updated detail or fact table (if possible). Refresh performance benefits from partitioning, because it can take advantage of parallel DML capabilities.
2. Use the CREATE MATERIALIZED VIEW statement to create and, optionally, populate the materialized view. If a user-defined materialized view already exists, then use the ON PREBUILT TABLE clause in the CREATE MATERIALIZED VIEW statement. Otherwise, use the BUILD IMMEDIATE clause to populate the materialized view immediately, or the BUILD DEFERRED clause to populate the materialized view later. A BUILD DEFERRED materialized view is disabled for use by query rewrite until the first REFRESH, after which it will be automatically enabled, provided the ENABLE QUERY REWRITE clause has been specified.

See Also: Oracle9i SQL Reference for descriptions of the SQL statements CREATE MATERIALIZED VIEW, ALTER MATERIALIZED VIEW, and DROP MATERIALIZED VIEW

Naming Materialized Views

The name of a materialized view must conform to standard Oracle naming conventions. However, if the materialized view is based on a user-defined prebuilt table, then the name of the materialized view must exactly match that table name. If you already have a naming convention for tables and indexes, you might consider extending this naming scheme to the materialized views so that they are easily identifiable. For example, instead of naming the materialized view sum_of_sales, it could be called sum_of_sales_mv to denote that this is a materialized view and not a table or view.
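As a hedged illustration of the prebuilt-table naming rule, the sketch below registers an existing summary table as a materialized view. The table sum_sales_tab and its columns are hypothetical; the point is that the materialized view takes exactly the table's name and that the column aliases in the defining query match the table's columns.

```sql
-- Hypothetical summary table that already exists in the warehouse
CREATE TABLE sum_sales_tab AS
  SELECT s.prod_id,
         SUM(s.amount_sold)   AS dollar_sales,
         SUM(s.quantity_sold) AS unit_sales
  FROM sales s
  GROUP BY s.prod_id;

-- Register it: the materialized view name must equal the table name
CREATE MATERIALIZED VIEW sum_sales_tab
  ON PREBUILT TABLE WITHOUT REDUCED PRECISION
  ENABLE QUERY REWRITE
  AS
  SELECT s.prod_id,
         SUM(s.amount_sold)   AS dollar_sales,
         SUM(s.quantity_sold) AS unit_sales
  FROM sales s
  GROUP BY s.prod_id;
```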

Storage And Data Segment Compression

Unless the materialized view is based on a user-defined prebuilt table, it requires and occupies storage space inside the database. Therefore, the storage needs for the materialized view should be specified in terms of the tablespace where it is to reside and the size of the extents. If you do not know how much space the materialized view will require, then the DBMS_OLAP.ESTIMATE_SIZE package, which is described in Chapter 16, "Summary Advisor", can estimate the number of bytes required to store this uncompressed materialized view. This information can then assist the design team in determining the tablespace in which the materialized view should reside.

You should use data segment compression with highly redundant data, such as tables with many foreign keys. This is particularly useful for materialized views created with the ROLLUP clause. Data segment compression reduces disk use and memory use (specifically, the buffer cache), often leading to a better scaleup for read-only operations. Data segment compression can also speed up query execution.

See Also: Oracle9i SQL Reference for a complete description of STORAGE semantics, Oracle9i Database Performance Tuning Guide and Reference, and Chapter 5, "Parallelism and Partitioning in Data Warehouses" for data segment compression examples

Build Methods

Two build methods are available for creating the materialized view, as shown in Table 8-2. If you select BUILD IMMEDIATE, the materialized view definition is added to the schema objects in the data dictionary, and then the fact or detail tables are scanned according to the SELECT expression and the results are stored in the materialized view. Depending on the size of the tables to be scanned, this build process can take a considerable amount of time. An alternative approach is to use the BUILD DEFERRED clause, which creates the materialized view without data, thereby enabling it to be populated at a later date using the DBMS_MVIEW.REFRESH package described in Chapter 14, "Maintaining the Data Warehouse".

Table 8-2 Build Methods

Build Method | Description
BUILD IMMEDIATE | Create the materialized view and then populate it with data
BUILD DEFERRED | Create the materialized view definition but do not populate it with data

Enabling Query Rewrite

Before creating a materialized view, you can verify what types of query rewrite are possible by calling the procedure DBMS_MVIEW.EXPLAIN_MVIEW. Once the materialized view has been created, you can use DBMS_MVIEW.EXPLAIN_REWRITE to find out if (or why not) it will rewrite a specific query. Even though a materialized view is defined, it will not automatically be used by the query rewrite facility. You must set the QUERY_REWRITE_ENABLED initialization parameter to TRUE before using query rewrite. You also must specify the ENABLE QUERY REWRITE clause if the materialized view is to be considered available for rewriting queries.

If this clause is omitted or specified as DISABLE QUERY REWRITE when the materialized view is created, the materialized view can subsequently be enabled for query rewrite with the ALTER MATERIALIZED VIEW statement. If you define a materialized view as BUILD DEFERRED, it is not eligible for query rewrite until it is populated with data.
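A short sketch of these steps, using cust_sales_mv from Example 8-6. It assumes the MV_CAPABILITIES_TABLE output table has already been created in the current schema (the utlxmv.sql script supplied with the database does this), and it sets the parameter at session level purely for illustration.

```sql
-- Turn the rewrite facility on for this session
ALTER SESSION SET QUERY_REWRITE_ENABLED = TRUE;

-- Enable rewrite on a view created without ENABLE QUERY REWRITE
ALTER MATERIALIZED VIEW cust_sales_mv ENABLE QUERY REWRITE;

-- Ask Oracle which rewrite capabilities the view has
-- (assumes MV_CAPABILITIES_TABLE exists, for example via utlxmv.sql)
EXECUTE DBMS_MVIEW.EXPLAIN_MVIEW('cust_sales_mv');

SELECT capability_name, possible, msgtxt
FROM   mv_capabilities_table
WHERE  capability_name LIKE 'REWRITE%';
```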

Query Rewrite Restrictions

Query rewrite is not possible with all materialized views. If query rewrite is not occurring when expected, DBMS_MVIEW.EXPLAIN_REWRITE can help provide reasons why a specific query is not eligible for rewrite. Also, check to see if your materialized view satisfies all of the following conditions.

Materialized View Restrictions

You should keep in mind the following restrictions:

The defining query of the materialized view cannot contain any non-repeatable expressions (ROWNUM, SYSDATE, non-repeatable PL/SQL functions, and so on).
The query cannot contain any references to RAW or LONG RAW datatypes or object REFs.
If the defining query of the materialized view contains set operators (UNION, MINUS, and so on), rewrite will use them for full text match rewrite only.
If the materialized view was registered as PREBUILT, the precision of the columns must agree with the precision of the corresponding SELECT expressions unless overridden by the WITH REDUCED PRECISION clause.
If the materialized view contains the same table more than once, it is possible to do a general rewrite, provided the query has the same aliases for the duplicate tables as the materialized view.

General Query Rewrite Restrictions

You should keep in mind the following restrictions:

If a query has both local and remote tables, only local tables will be considered for potential rewrite.
Neither the detail tables nor the materialized view can be owned by SYS.
SELECT and GROUP BY lists, if present, must be the same in the query of the materialized view.
Aggregate functions must occur only as the outermost part of the expression. That is, aggregates such as AVG(AVG(x)) or AVG(x) + AVG(x) are not allowed.
CONNECT BY clauses are not allowed.

Refresh Options

When you define a materialized view, you can specify two refresh options: how to refresh and what type of refresh. If unspecified, the defaults are assumed as ON DEMAND and FORCE. The two refresh execution modes are: ON COMMIT and ON DEMAND. Depending on the materialized view you create, some of the options may not be available. Table 8-3 describes the refresh modes.

Table 8-3 Refresh Modes

Refresh Mode | Description
ON COMMIT | Refresh occurs automatically when a transaction that modified one of the materialized view's detail tables commits. This can be specified as long as the materialized view is fast refreshable (in other words, not complex). The ON COMMIT privilege is necessary to use this mode
ON DEMAND | Refresh occurs when a user manually executes one of the available refresh procedures contained in the DBMS_MVIEW package (REFRESH, REFRESH_ALL_MVIEWS, REFRESH_DEPENDENT)
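For ON DEMAND refresh, the procedures named in Table 8-3 might be invoked as in the following sketch. The materialized view names come from the earlier examples; the method codes shown ('C' for complete, 'F' for fast, '?' for force) are passed positionally as the second argument to REFRESH, and the variable failures is a local name introduced here only to receive the OUT parameter of REFRESH_DEPENDENT.

```sql
DECLARE
  failures BINARY_INTEGER;  -- receives the number of failed refreshes
BEGIN
  -- Refresh single materialized views with an explicit method
  DBMS_MVIEW.REFRESH('cust_sales_mv', 'C');    -- complete
  DBMS_MVIEW.REFRESH('sum_sales', 'F');        -- fast
  DBMS_MVIEW.REFRESH('detail_sales_mv', '?');  -- force: fast if possible

  -- Refresh every materialized view that depends on the sales table
  DBMS_MVIEW.REFRESH_DEPENDENT(failures, 'sales', 'C');
END;
/
```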

When a materialized view is maintained using the ON COMMIT method, the time required to complete the commit may be slightly longer than usual. This is because the refresh operation is performed as part of the commit process. Therefore this method may not be suitable if many users are concurrently changing the tables upon which the materialized view is based.

If you anticipate performing insert, update or delete operations on tables referenced by a materialized view concurrently with the refresh of that materialized view, and that materialized view includes joins and aggregation, Oracle recommends you use ON COMMIT fast refresh rather than ON DEMAND fast refresh.

If you think the materialized view did not refresh, check the alert log or trace file. If a materialized view fails during refresh at COMMIT time, you must explicitly invoke the refresh procedure using the DBMS_MVIEW package after addressing the errors specified in the trace files. Until this is done, the materialized view will no longer be refreshed automatically at commit time.

You can specify how you want your materialized views to be refreshed from the detail tables by selecting one of four options: COMPLETE, FAST, FORCE, and NEVER. Table 8-4 describes the refresh options.

Table 8-4 Refresh Options

Refresh Option | Description
COMPLETE | Refreshes by recalculating the materialized view's defining query
FAST | Applies incremental changes to refresh the materialized view using the information logged in the materialized view logs, or from a SQL*Loader direct-path or a partition maintenance operation
FORCE | Applies FAST refresh if possible; otherwise, it applies COMPLETE refresh
NEVER | Indicates that the materialized view will not be refreshed with the Oracle refresh mechanisms

Whether the fast refresh option is available depends upon the type of materialized view. You can call the procedure DBMS_MVIEW.EXPLAIN_MVIEW to determine whether fast refresh is possible.

General Restrictions on Fast Refresh

The defining query of the materialized view is restricted as follows:

The materialized view must not contain references to non-repeating expressions like SYSDATE and ROWNUM.
The materialized view must not contain references to RAW or LONG RAW data types.

Restrictions on Fast Refresh on Materialized Views with Joins Only

Defining queries for materialized views with joins only and no aggregates have the following restrictions on fast refresh:

All restrictions from "General Restrictions on Fast Refresh".
They cannot have GROUP BY clauses or aggregates.
If the WHERE clause of the query contains outer joins, then unique constraints must exist on the join columns of the inner join table.
If there are no outer joins, you can have arbitrary selections and joins in the WHERE clause. However, if there are outer joins, the WHERE clause cannot have any selections. Furthermore, if there are outer joins, all the joins must be connected by ANDs and must use the equality (=) operator.
Rowids of all the tables in the FROM list must appear in the SELECT list of the query.
Materialized view logs must exist with rowids for all the base tables in the FROM list of the query.

1estrictions on ,ast 1e0resh on )ateriali4ed !ie(s (ith Aggregates %efinin! 4 eries for materiali5ed views with :oins and a!!re!ates have the followin! restrictions on fast refresh#

'll restrictions from 9Keneral +estrictions on Fast +efresh9"

Fast refresh is s pported for both ON CO""IT and ON D!"$ND materiali5ed views/ however the followin! restrictions appl*#

'll tables in the materiali5ed view m st have materiali5ed view lo!s/ and the materiali5ed view lo!s m st# Contain all col mns from the table referenced in the materiali5ed view" Specif* with #OFID and INCLUDIN( N!F +$LU!S" Specif* the S!LU!NC! cla se if the table is expected to have a mix of inserts@direct)loads/ deletes/ and pdates" Onl* SU"/ COUNT/ $+(/ STDD!+/ +$#I$NC!/ "IN and "$. are s pported for fast refresh" COUNT)G* m st be specified" For each a!!re!ate $(()e?pr*/ the correspondin! COUNT)e?pr* m st be present" If +$#I$NC!)e?pr* or STDD!+)e?pr3 is specified/ COUNT)e?pr* and SU")e?pr* m st be specified" Oracle recommends that SU")e?pr&Ge?pr* be specified" See Table H)0 for f rther details" The S!L!CT list m st contain all (#OU% B' col mns" If the materiali5ed view has one of the followin!/ then fast refresh is s pported onl* on conventional %ML inserts and direct loads" Materiali5ed views with "IN or "$. a!!re!ates Materiali5ed views which have SU")e?pr* b t no COUNT)e?pr* Materiali5ed views witho t COUNT)G*

S ch a materiali5ed view is called an insert)onl* materiali5ed view"

The COMPATIBILITY parameter must be set to 9.0 if the materialized aggregate view has inline views, outer joins, self joins or grouping sets and FAST REFRESH is specified during creation. Note that all other requirements for fast refresh specified previously must also be satisfied.
Materialized views with named views or subqueries in the FROM clause can be fast refreshed provided the views can be completely merged. For information on which views will merge, refer to the Oracle9i Database Performance Tuning Guide and Reference.
If there are no outer joins, you may have arbitrary selections and joins in the WHERE clause.

Materialized aggregate views with outer joins are fast refreshable after conventional DML and direct loads, provided only the outer table has been modified. Also, unique constraints must exist on the join columns of the inner join table. If there are outer joins, all the joins must be connected by ANDs and must use the equality (=) operator.
For materialized views with CUBE, ROLLUP, Grouping Sets, or concatenation of them, the following restrictions apply:
    The SELECT list should contain a grouping distinguisher that can either be a GROUPING_ID function on all GROUP BY expressions or GROUPING functions, one for each GROUP BY expression. For example, if the GROUP BY clause of the materialized view is "GROUP BY CUBE(a, b)", then the SELECT list should contain either "GROUPING_ID(a, b)" or "GROUPING(a) AND GROUPING(b)" for the materialized view to be fast refreshable.
    GROUP BY should not result in any duplicate groupings. For example, "GROUP BY a, ROLLUP(a, b)" is not fast refreshable because it results in duplicate groupings "(a), (a, b), AND (a)".
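The aggregate rules above can be sketched as follows. This is a minimal illustration that assumes the sh sample schema; the names prod_sales_agg_mv and prod_chan_cube_mv are hypothetical. Whether fast refresh is actually accepted also depends on the release and on the COMPATIBILITY setting, and the log statement should be skipped if a log already exists on sales.

```sql
-- The log specifies SEQUENCE, ROWID, every column referenced by the
-- materialized views below, and INCLUDING NEW VALUES
CREATE MATERIALIZED VIEW LOG ON sales
WITH SEQUENCE, ROWID (prod_id, channel_id, amount_sold)
INCLUDING NEW VALUES;

-- Plain aggregate view: COUNT(*) is present, and SUM(expr) is paired
-- with COUNT(expr); all GROUP BY columns are in the SELECT list
CREATE MATERIALIZED VIEW prod_sales_agg_mv    -- hypothetical name
BUILD IMMEDIATE
REFRESH FAST ON DEMAND
AS
SELECT s.prod_id,
       SUM(s.amount_sold)   AS dollar_sales,
       COUNT(s.amount_sold) AS cnt_amount,
       COUNT(*)             AS cnt_all
FROM   sales s
GROUP BY s.prod_id;

-- CUBE variant: GROUPING_ID over all GROUP BY expressions acts as the
-- grouping distinguisher, and the CUBE yields no duplicate groupings
CREATE MATERIALIZED VIEW prod_chan_cube_mv    -- hypothetical name
BUILD IMMEDIATE
REFRESH FAST ON DEMAND
AS
SELECT s.prod_id, s.channel_id,
       GROUPING_ID(s.prod_id, s.channel_id) AS gid,
       SUM(s.amount_sold)   AS dollar_sales,
       COUNT(s.amount_sold) AS cnt_amount,
       COUNT(*)             AS cnt_all
FROM   sales s
GROUP BY CUBE(s.prod_id, s.channel_id);
```

Dropping COUNT(s.amount_sold) or COUNT(*) from either query would, per the rules above, make the view insert-only rather than fully fast refreshable.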

Restrictions on Fast Refresh on Materialized Views With the UNION ALL Operator

Materialized views with the UNION ALL set operator support the REFRESH FAST option if the following conditions are satisfied:

The defining query must have the UNION ALL operator at the top level. The UNION ALL operator cannot be embedded inside a subquery, with one exception: The UNION ALL can be in a subquery in the FROM clause provided the defining query is of the form SELECT * FROM (view or subquery with UNION ALL) as in the following example:

CREATE VIEW view_with_unionall_mv
AS
(SELECT c.rowid crid, c.cust_id, 2 umarker
 FROM customers c
 WHERE c.cust_last_name = 'Smith'
 UNION ALL
 SELECT c.rowid crid, c.cust_id, 3 umarker
 FROM customers c
 WHERE c.cust_last_name = 'Jones');

CREATE MATERIALIZED VIEW unionall_inside_view_mv
REFRESH FAST ON DEMAND
AS
SELECT * FROM view_with_unionall_mv;

Note that the view view_with_unionall_mv satisfies all requirements for fast refresh.

Each query block in the UNION ALL query must satisfy the requirements of a fast refreshable materialized view with aggregates or a fast refreshable materialized view with joins. The appropriate materialized view logs must be created on the tables as required for the corresponding type of fast refreshable materialized view. Note that Oracle also allows the special case of a single table materialized view with joins only, provided the ROWID column has been included in the SELECT list and in the materialized view log. This is shown in the defining query of the view view_with_unionall_mv.

The SELECT list of each query must include a maintenance column, called a UNION ALL marker. The UNION ALL column must have a distinct constant numeric or string value in each UNION ALL branch. Further, the marker column must appear in the same ordinal position in the SELECT list of each query block.
Some features such as outer joins, insert-only aggregate materialized view queries and remote tables are not supported for materialized views with UNION ALL.
Partition Change Tracking-based refresh is not supported for UNION ALL materialized views.
The compatibility initialization parameter must be set to 9.2.0 to create a fast refreshable materialized view with UNION ALL.

ORDER BY Clause


An ORDER BY clause is allowed in the CREATE MATERIALIZED VIEW statement. It is used only during the initial creation of the materialized view. It is not used during a full refresh or a fast refresh.

To improve the performance of queries against large materialized views, store the rows in the materialized view in the order specified in the ORDER BY clause. This initial ordering provides physical clustering of the data. If indexes are built on the columns by which the materialized view is ordered, accessing the rows of the materialized view using the index often reduces the time for disk I/O due to the physical clustering.

The ORDER BY clause is not considered part of the materialized view definition. As a result, there is no difference in the manner in which Oracle detects the various types of materialized views (for example, materialized join views with no aggregates). For the same reason, query rewrite is not affected by the ORDER BY clause. This feature is similar to the CREATE TABLE ... ORDER BY capability that exists in Oracle.
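As a rough sketch of the ordering described above, the following statements assume the sh sample schema; the materialized view and index names are hypothetical. The ORDER BY affects only how the rows are laid out when the view is first populated.

```sql
-- Rows are stored in prod_id order at creation time only;
-- later refreshes do not reapply the ordering
CREATE MATERIALIZED VIEW sales_by_prod_ordered_mv   -- hypothetical name
BUILD IMMEDIATE
REFRESH COMPLETE ON DEMAND
AS
SELECT s.prod_id,
       SUM(s.amount_sold) AS dollar_sales,
       COUNT(*)           AS cnt_all
FROM   sales s
GROUP BY s.prod_id
ORDER BY s.prod_id;

-- An index on the ordering column can benefit from the physical clustering
CREATE INDEX sales_by_prod_ordered_ix               -- hypothetical name
  ON sales_by_prod_ordered_mv (prod_id);
```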

Materialized View Logs


Materialized view logs are required if you want to use fast refresh. They are defined using a CREATE MATERIALIZED VIEW LOG statement on the base table that is to be changed. They are not created on the materialized view. For fast refresh of materialized views, the definition of the materialized view logs must specify the ROWID clause. In addition, for aggregate materialized views, it must also contain every column in the table referenced in the materialized view, the INCLUDING NEW VALUES clause and the SEQUENCE clause.

An example of a materialized view log is shown as follows where one is created on the table sales.

CREATE MATERIALIZED VIEW LOG ON sales
WITH ROWID
(prod_id, cust_id, time_id, channel_id, promo_id, quantity_sold, amount_sold)
INCLUDING NEW VALUES;

Oracle recommends that the keyword SEQUENCE be included in your materialized view log statement unless you are sure that you will never perform a mixed DML operation (a combination of INSERT, UPDATE, or DELETE operations on multiple tables).

The boundary of a mixed DML operation is determined by whether the materialized view is ON COMMIT or ON DEMAND.

For ON COMMIT, the mixed DML statements occur within the same transaction because the refresh of the materialized view will occur upon commit of this transaction.
For ON DEMAND, the mixed DML statements occur between refreshes.

The following example of a materialized view log illustrates where one is created on the table sales that includes the SEQUENCE keyword:

CREATE MATERIALIZED VIEW LOG ON sales
WITH SEQUENCE, ROWID
(prod_id, cust_id, time_id, channel_id, promo_id,
 quantity_sold, amount_sold)
INCLUDING NEW VALUES;
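One way to confirm which options a log was created with, and to request a fast refresh of an ON DEMAND materialized view, is sketched below. It assumes the USER_MVIEW_LOGS dictionary view exposes the ROWIDS, SEQUENCE, and INCLUDE_NEW_VALUES columns in your release, and the materialized view name passed to the refresh call is hypothetical.

```sql
-- Inspect the options recorded for the log on SALES
SELECT master, log_table, rowids, sequence, include_new_values
FROM   user_mview_logs
WHERE  master = 'SALES';

-- Request a fast ('F') refresh of an ON DEMAND materialized view
BEGIN
  DBMS_MVIEW.REFRESH('PROD_SALES_AGG_MV', 'F');   -- hypothetical view name
END;
/
```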

Using Oracle Enterprise Manager


A materialized view can also be created using Oracle Enterprise Manager by selecting the materialized view object type. There is no difference in the information required if this approach is used. However, you must complete three property sheets and you must ensure that the option Enable Query Rewrite on the General sheet is selected.

See Also: Oracle Enterprise Manager Configuration Guide and Chapter 16, "Summary Advisor" for further information

Using Materialized Views with NLS Parameters

When using certain materialized views, you must ensure that your NLS parameters are the same as when you created the materialized view. Materialized views with this restriction are as follows:

Expressions that may return different values, depending on NLS parameter settings. For example, (date > "01/02/03") or (rate <= "2.150") are NLS parameter dependent expressions.
Equijoins where one side of the join is character data. The result of this equijoin depends on collation and this can change on a session basis, giving an incorrect result in the case of query rewrite or an inconsistent materialized view after a refresh operation.
Expressions that generate internal conversion to character data in the SELECT list of a materialized view, or inside an aggregate of a materialized aggregate view. This restriction does not apply to expressions that involve only numeric data, for example, a+b where a and b are numeric fields.
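To make the first case concrete, the sketch below contrasts an implicit conversion with an explicit one. It assumes the sh sample schema; the specific date and format mask are illustrative choices, not values taken from this guide.

```sql
-- Check the session settings that such expressions depend on
SELECT parameter, value
FROM   nls_session_parameters
WHERE  parameter IN ('NLS_DATE_FORMAT', 'NLS_NUMERIC_CHARACTERS', 'NLS_SORT');

-- NLS dependent: the string is converted using the session NLS_DATE_FORMAT,
-- so the same text can denote different dates in different sessions
--   WHERE s.time_id > '01/02/03'

-- Not NLS dependent: the format mask fixes the interpretation
SELECT s.prod_id, SUM(s.amount_sold) AS dollar_sales
FROM   sales s
WHERE  s.time_id > TO_DATE('01/02/2003', 'DD/MM/YYYY')
GROUP BY s.prod_id;
```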

Registering Existing Materialized Views


Some data warehouses have implemented materialized views in ordinary user tables. Although this solution provides the performance benefits of materialized views, it does not:

Provide query rewrite to all SQL applications
Enable materialized views defined in one application to be transparently accessed in another application
Generally support fast parallel or fast materialized view refresh

Because of these limitations, and because existing materialized views can be extremely large and expensive to rebuild, you should register your existing materialized view tables with Oracle whenever possible. You can register a user-defined materialized view with the CREATE MATERIALIZED VIEW ... ON PREBUILT TABLE statement. Once registered, the materialized view can be used for query rewrites or maintained by one of the refresh methods, or both.

The contents of the table must reflect the materialization of the defining query at the time you register it as a materialized view, and each column in the defining query must correspond to a column in the table that has a matching datatype. However, you can specify WITH REDUCED PRECISION to allow the precision of columns in the defining query to be different from that of the table columns.

The table and the materialized view must have the same name, but the table retains its identity as a table and can contain columns that are not referenced in the defining query of the materialized view. These extra columns are known as unmanaged columns. If rows are inserted during a refresh operation, each unmanaged column of the row is set to its default value. Therefore, the unmanaged columns cannot have NOT NULL constraints unless they also have default values.

Materialized views based on prebuilt tables are eligible for selection by query rewrite provided the parameter QUERY_REWRITE_INTEGRITY is set to at least the level of stale_tolerated or trusted.

See Also: Chapter 22, "Query Rewrite" for details about integrity levels

When you drop a materialized view that was created on a prebuilt table, the table still exists; only the materialized view is dropped.

When a prebuilt table is registered as a materialized view and query rewrite is desired, the parameter QUERY_REWRITE_INTEGRITY must be set to at least stale_tolerated because, when it is created, the materialized view is marked as unknown. Therefore, only stale integrity modes can be used.

The following example illustrates the two steps required to register a user-defined table. First, the table is created, then the materialized view is defined using exactly the same name as the table. This materialized view sum_sales_tab is eligible for use in query rewrite.

CREATE TABLE sum_sales_tab
  PCTFREE 0  TABLESPACE demo
   STORAGE (INITIAL 16k NEXT 16k PCTINCREASE 0)
    AS
    SELECT s.prod_id,
       SUM(amount_sold) AS dollar_sales,
       SUM(quantity_sold) AS unit_sales
         FROM sales s GROUP BY s.prod_id;

CREATE MATERIALIZED VIEW sum_sales_tab
ON PREBUILT TABLE WITHOUT REDUCED PRECISION
ENABLE QUERY REWRITE
AS
SELECT s.prod_id,
  SUM(amount_sold) AS dollar_sales,
  SUM(quantity_sold) AS unit_sales
  FROM sales s GROUP BY s.prod_id;
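For a session to use the registered view in query rewrite, the integrity level described above has to be set. A minimal sketch, assuming rewrite is controlled at the session level and that the sh sample schema is in use:

```sql
-- Allow query rewrite and accept the "unknown" state of a prebuilt table
ALTER SESSION SET QUERY_REWRITE_ENABLED = TRUE;
ALTER SESSION SET QUERY_REWRITE_INTEGRITY = STALE_TOLERATED;

-- A query of this shape is a candidate for rewrite against sum_sales_tab
SELECT s.prod_id, SUM(s.amount_sold) AS dollar_sales
FROM   sales s
GROUP BY s.prod_id;

-- Dropping the materialized view removes only the registration;
-- the underlying table sum_sales_tab remains
DROP MATERIALIZED VIEW sum_sales_tab;
```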

You could have compressed the sum_sales_tab table to save space. See "Storage And Data Segment Compression" for details regarding data segment compression.

In some cases, user-defined materialized views are refreshed on a schedule that is longer than the update cycle. For example, a monthly materialized view might be updated only at the end of each month, and the materialized view values always refer to complete time periods. Reports written directly against these materialized views implicitly select only data that is not in the current (incomplete) time period. If a user-defined materialized view already contains a time dimension:

It should be registered and then fast refreshed each update cycle.
You can create a view that selects the complete time period of interest.
The reports should be modified to refer to the view instead of referring directly to the user-defined materialized view.

If the user-defined materialized view does not contain a time dimension, then:

Create a new materialized view that does include the time dimension (if possible).
The view should aggregate over the time column in the new materialized view.
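A view that exposes only complete time periods might look like the following sketch. The materialized view monthly_sales_mv and its columns month_start, prod_id, and dollar_sales are hypothetical, as is the view name; the idea is simply to exclude the month that is still in progress.

```sql
-- Reports query this view instead of the materialized view itself
CREATE OR REPLACE VIEW monthly_sales_complete_v AS   -- hypothetical name
SELECT month_start, prod_id, dollar_sales
FROM   monthly_sales_mv                              -- hypothetical materialized view
WHERE  month_start < TRUNC(SYSDATE, 'MM');           -- months before the current one
```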

Partitioning and Materialized Views


Because of the large volume of data held in a data warehouse, partitioning is an extremely useful option when designing a database.

Partitioning the fact tables improves scalability, simplifies system administration, and makes it possible to define local indexes that can be efficiently rebuilt. Partitioning the fact tables also improves the opportunity of fast refreshing the materialized view when the partition maintenance operation occurs.

Partitioning a materialized view also has benefits for refresh, because the refresh procedure can use parallel DML to maintain the materialized view.

See Also: Chapter 5, "Parallelism and Partitioning in Data Warehouses" for further details about partitioning

Partition Change Tracking


It is possible and advantageous to track freshness to a finer grain than the entire materialized view. The ability to identify which rows in a materialized view are affected by a certain detail table partition is known as Partition Change Tracking (PCT). When one or more of the detail tables are partitioned, it may be possible to identify the specific rows in the materialized view that correspond to a modified detail partition(s); those rows become stale when a partition is modified while all other rows remain fresh.

Partition Change Tracking can be used to identify which materialized view rows correspond to a particular detail table. Partition Change Tracking is also used to support fast refresh after partition maintenance operations on detail tables. For instance, if a detail table partition is truncated or dropped, the affected rows in the materialized view are identified and deleted.

Identifying which materialized view rows are fresh or stale, rather than considering the entire materialized view as stale, allows query rewrite to use those rows that are fresh while in QUERY_REWRITE_INTEGRITY=ENFORCED or TRUSTED modes.

To support PCT, a materialized view must satisfy the following requirements:

At least one of the detail tables referenced by the materialized view must be partitioned.
Partitioned tables must use either range or composite partitioning.
The partition key must consist of only a single column.
The materialized view must contain either the partition key column or a partition marker of the detail table. See Oracle9i Supplied PL/SQL Packages and Types Reference for details regarding the DBMS_MVIEW.PMARKER function.
If you use a GROUP BY clause, the partition key column or the partition marker must be present in the GROUP BY clause.
Data modifications can only occur on the partitioned table.
The COMPATIBILITY initialization parameter must be a minimum of 9.0.0.0.0.
Partition Change Tracking is not supported for a materialized view that refers to views, remote tables, or outer joins.
Partition Change Tracking-based refresh is not supported for UNION ALL materialized views.

Partition change tracking requires sufficient information in the materialized view to be able to correlate each materialized view row back to its corresponding detail row in the source partitioned detail table. This can be accomplished by including the detail table partition key columns in the select list and, if GROUP BY is used, in the GROUP BY list. Depending on the desired level of aggregation and the distinct cardinalities of the partition key columns, this has the unfortunate effect of significantly increasing the cardinality of the materialized view. For example, say a popular metric is the revenue generated by a product during a given year. If the sales table were partitioned by time_id, it would be a required field in the SELECT clause and the GROUP BY clause of the materialized view. If there were 1000 different products sold each day, it would substantially increase the number of rows in the materialized view.

Partition Marker

In many cases, the advantages of PCT will be offset by this restriction for highly aggregated materialized views. The DBMS_MVIEW.PMARKER function is designed to significantly reduce the cardinality of the materialized view (see Example 8-7 for an example). The function returns a partition identifier that uniquely identifies the partition for a specified row within a specified partition table. The DBMS_MVIEW.PMARKER function is used instead of the partition key column in the SELECT and GROUP BY clauses.

Unlike the general case of a PL/SQL function in a materialized view, use of the DBMS_MVIEW.PMARKER does not prevent rewrite with that materialized view even when the rewrite mode is QUERY_REWRITE_INTEGRITY=enforced.

Example 8-7 Partition Change Tracking

The following example uses the sh sample schema and the three detail tables sales, products, and times to create two materialized views. For this example, sales is a partitioned table using the time_id column and products is partitioned by the prod_category column. times is not a partitioned table.

The first materialized view is for the yearly sales revenue for each product.

The second materialized view is for monthly customer sales. As customers tend to purchase in bulk, sales average just two orders for each customer per month. Therefore, the impact of including the time_id in the materialized view will not unacceptably increase the number of rows stored. However, most orders are large and contain many different products. With approximately 1000 different products sold each day, including the time_id in the materialized view would substantially increase the cardinality. This materialized view uses the DBMS_MVIEW.PMARKER function.

The detail tables must have materialized view logs for FAST REFRESH.

CREATE MATERIALIZED VIEW LOG ON SALES WITH ROWID
   (prod_id, time_id, quantity_sold, amount_sold)
   INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW LOG ON PRODUCTS WITH ROWID
   (prod_id, prod_name, prod_desc)
   INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW LOG ON TIMES WITH ROWID
   (time_id, calendar_month_name, calendar_year)
   INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW cust_mth_sales_mv
BUILD DEFERRED REFRESH FAST ON DEMAND
ENABLE QUERY REWRITE
AS
  SELECT s.time_id, p.prod_id, SUM(s.quantity_sold), SUM(s.amount_sold),
         p.prod_name, t.calendar_month_name, COUNT(*),
         COUNT(s.quantity_sold), COUNT(s.amount_sold)
  FROM sales s, products p, times t
  WHERE  s.time_id = t.time_id AND s.prod_id = p.prod_id
  GROUP BY t.calendar_month_name, p.prod_id, p.prod_name, s.time_id;

cust_mth_sales_mv includes the partition key column from table sales (time_id) in both its SELECT and GROUP BY lists. This enables PCT on table sales for materialized view cust_mth_sales_mv. However, the GROUP BY and SELECT lists include PRODUCTS.PROD_ID rather than the partition key column (PROD_CATEGORY) of the products table. Therefore, PCT is not enabled on table products for this materialized view. In other words, any partition maintenance operation to the sales table will allow a PCT fast refresh of cust_mth_sales_mv. However, PCT fast refresh is not possible after any kind of modification to the products table. To correct this, the GROUP BY and SELECT lists must include column PRODUCTS.PROD_CATEGORY. Following a partition maintenance operation, such as a drop partition, a PCT fast refresh should be performed on any materialized view that is referencing the table upon which the partition operations are undertaken.

Example 8-8 Creating a Materialized View

CREATE MATERIALIZED VIEW prod_yr_sales_mv
BUILD DEFERRED
REFRESH FAST ON DEMAND
ENABLE QUERY REWRITE
AS
    SELECT DBMS_MVIEW.PMARKER(s.rowid),
           DBMS_MVIEW.PMARKER(p.rowid),
           s.prod_id, SUM(s.amount_sold), SUM(s.quantity_sold),
           p.prod_name, t.calendar_year, COUNT(*),
           COUNT(s.amount_sold), COUNT(s.quantity_sold)
    FROM   sales s, products p, times t
    WHERE  s.time_id = t.time_id AND
           s.prod_id = p.prod_id
    GROUP BY DBMS_MVIEW.PMARKER (s.rowid),
             DBMS_MVIEW.PMARKER (p.rowid),
             t.calendar_year, s.prod_id, p.prod_name;

prod_yr_sales_mv includes the DBMS_MVIEW.PMARKER function on the sales and products tables in both its SELECT and GROUP BY lists. This enables partition change tracking on both the sales table and the products table with significantly less cardinality impact than grouping by the respective partition key columns. In this example, the desired level of aggregation for the prod_yr_sales_mv is to group by times.calendar_year. Using the DBMS_MVIEW.PMARKER function, the materialized view cardinality is increased only by a factor of the number of partitions in the sales table times the number of partitions in the products table. This would generally be significantly less than the cardinality impact of including the respective partition key columns.

A subsequent INSERT statement adds a new row to the sales_part3 partition of table sales. At this point, because cust_mth_sales_mv and prod_yr_sales_mv have partition change tracking available on table sales, Oracle can determine that those rows in these materialized views corresponding to sales_part3 are stale, while all other rows in these materialized views are unchanged in their freshness state. An INSERT INTO products statement is not tracked for materialized view cust_mth_sales_mv. Therefore, cust_mth_sales_mv becomes completely stale when the products table is modified in this way.
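The maintenance-then-refresh sequence described above can be sketched as follows. This assumes the partition name sales_part3 used in the text, that the statements are run against a test copy of the sh schema (TRUNCATE PARTITION discards data), and that a fast ('F') refresh request is how a PCT refresh is invoked in this release.

```sql
-- A BUILD DEFERRED materialized view is empty until its first refresh,
-- which has to be a complete ('C') refresh
BEGIN
  DBMS_MVIEW.REFRESH('PROD_YR_SALES_MV', 'C');
END;
/

-- Partition maintenance on the detail table (test data only)
ALTER TABLE sales TRUNCATE PARTITION sales_part3;

-- Fast refresh request after the maintenance operation
BEGIN
  DBMS_MVIEW.REFRESH('PROD_YR_SALES_MV', 'F');
END;
/

-- Review the recorded state of both example views
SELECT mview_name, staleness, last_refresh_type
FROM   user_mviews
WHERE  mview_name IN ('CUST_MTH_SALES_MV', 'PROD_YR_SALES_MV');
```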

Partitioning a Materialized View


Partitioning a materialized view involves defining the materialized view with the standard Oracle partitioning clauses, as illustrated in the following example. This statement creates a materialized view called part_sales_mv, which uses three partitions, can be fast refreshed, and is eligible for query rewrite.

CREATE MATERIALIZED VIEW part_sales_mv
PARALLEL


9 Dimensions
The following sections will help you create and manage a data warehouse:

What are Dimensions?
Creating Dimensions
Viewing Dimensions
Using Dimensions with Constraints
Validating Dimensions
Altering Dimensions
Deleting Dimensions
Using the Dimension Wizard

What are Dimensions?


A dimension is a structure that categorizes data in order to enable users to answer business questions. Commonly used dimensions are customers, products, and time. For example, each sales channel of a clothing retailer might gather and store data regarding sales and reclamations of its clothing assortment. The retail chain management can build a data warehouse to analyze the sales of its products across all stores over time and help answer questions such as:

What is the effect of promoting one product on the sale of a related product that is not promoted?
What are the sales of a product before and after a promotion?
How does a promotion affect the various distribution channels?

The data in the retailer's data warehouse system has two important components: dimensions and facts. The dimensions are products, customers, promotions, channels, and time. One approach for identifying your dimensions is to review your reference tables, such as a product table that contains everything about a product, or a promotion table containing all information about promotions. The facts are sales (units sold) and profits. A data warehouse contains facts about the sales of each product on a daily basis.

A typical relational implementation for such a data warehouse is a Star Schema. The fact information is stored in the so-called fact table, whereas the dimensional information is stored in the so-called dimension tables. In our example, each sales transaction record is uniquely defined by customer, product, sales channel, promotion, and day (time).

See Also: Chapter 17, "Schema Modeling Techniques" for further details

In Oracle9i, the dimensional information itself is stored in a dimension table. In addition, the database object dimension helps to organize and group dimensional information into hierarchies. This represents natural 1:n relationships between columns or column groups (the levels of a hierarchy) that cannot be represented with constraint conditions. Going up a level in the hierarchy is called rolling up the data and going down a level in the hierarchy is called drilling down the data. In the retailer example:

Within the time dimension, months roll up to quarters, quarters roll up to years, and years roll up to all years.
Within the product dimension, products roll up to subcategories, subcategories roll up to categories, and categories roll up to all products.
Within the customer dimension, customers roll up to city. Then cities roll up to state. Then states roll up to country. Then countries roll up to subregion. Finally, subregions roll up to region, as shown in Figure 9-1.

Figure 9-1 Sample Rollup for a Customer Dimension

Data analysis typically starts at higher levels in the dimensional hierarchy and gradually drills down if the situation warrants such analysis.

Dimensions do not have to be defined, but spending time creating them can yield significant benefits, because they help query rewrite perform more complex types of rewrite. They are mandatory if you use the Summary Advisor (a GUI tool for materialized view management) to recommend which materialized views to create, drop, or retain.

See Also: Chapter 22, "Query Rewrite" for further details regarding query rewrite and Chapter 16, "Summary Advisor" for further details regarding the Summary Advisor

You must not create dimensions in any schema that does not satisfy these relationships. Incorrect results can be returned from queries otherwise.

Creating Dimensions
Before you can create a dimension object, the dimension tables must exist in the database, containing the dimension data. For example, if you create a customer dimension, one or more tables must exist that contain the city, state, and country information. In a star schema data warehouse, these dimension tables already exist. It is therefore a simple task to identify which ones will be used.

Now you can draw the hierarchies of a dimension as shown in Figure 9-1. For example, city is a child of state (because you can aggregate city-level data up to state), and state is a child of country. This hierarchical information will be stored in the database object dimension.

In the case of normalized or partially normalized dimension representation (a dimension that is stored in more than one table), identify how these tables are joined. Note whether the joins between the dimension tables can guarantee that each child-side row joins with one and only one parent-side row. In the case of denormalized dimensions, determine whether the child-side columns uniquely determine the parent-side (or attribute) columns. These constraints can be enabled with the NOVALIDATE and RELY clauses if the relationships represented by the constraints are guaranteed by other means.

You create a dimension using either the CREATE DIMENSION statement or the Dimension Wizard in Oracle Enterprise Manager. Within the CREATE DIMENSION statement, use the LEVEL clause to identify the names of the dimension levels.

See Also: Oracle9i SQL Reference for a complete description of the CREATE DIMENSION statement

This customer dimension contains a single hierarchy with a geographical rollup, with arrows drawn from the child level to the parent level, as shown in Figure 9-1.

Each arrow in this graph indicates that for any child there is one and only one parent. For example, each city must be contained in exactly one state and each state must be contained in exactly one country. States that belong to more than one country, or that belong to no country, violate hierarchical integrity. Hierarchical integrity is necessary for the correct operation of management functions for materialized views that include aggregates.

For example, you can declare a dimension products_dim, which contains levels product, subcategory, and category:

CREATE DIMENSION products_dim
       LEVEL product           IS (products.prod_id)
       LEVEL subcategory       IS (products.prod_subcategory)
       LEVEL category          IS (products.prod_category) ...

Each level in the dimension must correspond to one or more columns in a table in the database. Thus, level product is identified by the column prod_id in the products table and level subcategory is identified by a column called prod_subcategory in the same table.

In this example, the database tables are denormalized and all the columns exist in the same table. However, this is not a prerequisite for creating dimensions. "Using Normalized Dimension Tables" shows how to create a dimension customers_dim that has a normalized schema design using the JOIN KEY clause.

The next step is to declare the relationship between the levels with the HIERARCHY statement and give that hierarchy a name. A hierarchical relationship is a functional dependency from one level of a hierarchy to the next level in the hierarchy. Using the level names defined previously, the CHILD OF relationship denotes that each child's level value is associated with one and only one parent level value. The following statements declare a hierarchy prod_rollup and define the relationship between products, subcategory, and category.
 HIERARCHY prod_rollup
 (product         CHILD OF
  subcategory     CHILD OF
  category)
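Because CHILD OF only declares the dependency, it can be worth checking the data directly. The queries below are a minimal sketch against the products table of the sh sample schema; both should return no rows if each child value has exactly one parent.

```sql
-- Subcategories that appear under more than one category (1:n violated)
SELECT prod_subcategory,
       COUNT(DISTINCT prod_category) AS parent_count
FROM   products
GROUP BY prod_subcategory
HAVING COUNT(DISTINCT prod_category) > 1;

-- Products whose parent level values are missing
SELECT prod_id
FROM   products
WHERE  prod_subcategory IS NULL
   OR  prod_category IS NULL;
```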

In addition to the 1:n hierarchical relationships, dimensions also include 1:1 attribute relationships between the hierarchy levels and their dependent, determined dimension attributes. For example the dimension times_dim, as defined in Oracle9i Sample Schemas, has columns fiscal_month_desc, fiscal_month_name, and days_in_fiscal_month. Their relationship is defined as follows:

LEVEL fis_month   IS TIMES.FISCAL_MONTH_DESC
...
ATTRIBUTE fis_month DETERMINES
      (fiscal_month_name, days_in_fiscal_month)

The ATTRIBUTE ... DETERMINES clause relates fis_month to fiscal_month_name and days_in_fiscal_month. Note that this is a unidirectional determination. It is only guaranteed that for a specific fiscal_month_desc value, for example, 1999-11, you will find exactly one matching value for fiscal_month_name, for example, November, and for days_in_fiscal_month, for example, 28. You cannot determine a specific fiscal_month_desc based on the fiscal_month_name, which is November for every fiscal year.

In this example, suppose a query were issued that queried by fiscal_month_name instead of fiscal_month_desc. Because this 1:1 relationship exists between the attribute and the level, an already aggregated materialized view containing fiscal_month_desc can be joined back to the dimension information and used to identify the data.

See Also: Chapter 22, "Query Rewrite" for further details of using dimensional information
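The one-way nature of the determination can be seen by querying the times table of the sh sample schema. This is an illustrative check rather than a required step: the first query should return no rows if the attribute relationship holds, while the second is expected to return rows because a month name repeats across fiscal years.

```sql
-- Level to attribute: each fiscal_month_desc should map to one name
-- and one day count
SELECT fiscal_month_desc
FROM   times
GROUP BY fiscal_month_desc
HAVING COUNT(DISTINCT fiscal_month_name) > 1
    OR COUNT(DISTINCT days_in_fiscal_month) > 1;

-- Attribute to level: a month name maps to many fiscal_month_desc values
SELECT fiscal_month_name,
       COUNT(DISTINCT fiscal_month_desc) AS month_desc_count
FROM   times
GROUP BY fiscal_month_name
HAVING COUNT(DISTINCT fiscal_month_desc) > 1;
```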

A sample dimension definition follows:

CREATE DIMENSION products_dim
        LEVEL product           IS (products.prod_id)
        LEVEL subcategory       IS (products.prod_subcategory)
        LEVEL category          IS (products.prod_category)
        HIERARCHY prod_rollup (
                product         CHILD OF
                subcategory     CHILD OF
                category
        )
        ATTRIBUTE product DETERMINES
        (products.prod_name, products.prod_desc,
         prod_weight_class, prod_unit_of_measure,
         prod_pack_size, prod_status, prod_list_price, prod_min_price)
        ATTRIBUTE subcategory DETERMINES
        (prod_subcategory, prod_subcat_desc)
        ATTRIBUTE category DETERMINES
        (prod_category, prod_cat_desc);

The desi!n/ creation/ and maintenance of dimensions is part of the desi!n/ creation/ and maintenance of *o r data wareho se schema" Once the dimension has been created/ check that it meets these re4 irements#

There must be a 1:n relationship between a parent and children. A parent can have one or more children, but a child can have only one parent.

There must be a 1:1 attribute relationship between hierarchy levels and their dependent dimension attributes. For example, if there is a column fiscal_month_desc, then a possible attribute relationship would be fiscal_month_desc to fiscal_month_name.

If the columns of a parent level and child level are in different relations, then the connection between them also requires a 1:n join relationship. Each row of the child table must join with one and only one row of the parent table. This relationship is stronger than referential integrity alone, because it requires that the child join key must be non-null, that referential integrity must be maintained from the child join key to the parent join key, and that the parent join key must be unique.

You must ensure (using database constraints if necessary) that the columns of each hierarchy level are non-null and that hierarchical integrity is maintained.

The hierarchies of a dimension can overlap or be disconnected from each other. However, the columns of a hierarchy level cannot be associated with more than one dimension.

Join relationships that form cycles in the dimension graph are not supported. For example, a hierarchy level cannot be joined to itself either directly or indirectly.

Note: The information stored with a dimension object is only declarative.

The previously discussed relationships are not enforced with the creation of a dimension object. You should validate any dimension definition with the DBMS_OLAP.VALIDATE_DIMENSION procedure, as discussed in "Validating Dimensions".
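The first and fourth requirements can also be checked directly with ordinary queries. The following sketch is an illustration rather than part of the guide; it assumes the products table used by the products_dim example above. The first query looks for a subcategory that rolls up to more than one category, and the second counts rows with a null level column. Both should come back empty or zero for a well-formed hierarchy.

```sql
-- Sketch only. Assumption: table PRODUCTS as used by products_dim.
-- 1:n check: a child (subcategory) must have exactly one parent (category).
SELECT prod_subcategory,
       COUNT(DISTINCT prod_category) AS parent_count
FROM   products
GROUP  BY prod_subcategory
HAVING COUNT(DISTINCT prod_category) > 1;

-- Non-null check: no level column of the hierarchy may be null.
SELECT COUNT(*) AS rows_with_null_level
FROM   products
WHERE  prod_id IS NULL
   OR  prod_subcategory IS NULL
   OR  prod_category IS NULL;
```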

Multiple Hierarchies

A single dimension definition can contain multiple hierarchies. Suppose our retailer wants to track the sales of certain items over time. The first step is to define the time dimension over which sales will be tracked. Figure 9-2 illustrates a dimension times_dim with two time hierarchies.

Figure 9-2 times_dim Dimension with Two Time Hierarchies

From the illustration, you can construct the hierarchy of the denormalized times_dim dimension's CREATE DIMENSION statement as follows. The complete CREATE DIMENSION statement as well as the CREATE TABLE statement are shown in Oracle9i Sample Schemas.

CREATE DIMENSION times_dim
   LEVEL day         IS TIMES.TIME_ID
   LEVEL month       IS TIMES.CALENDAR_MONTH_DESC
   LEVEL quarter     IS TIMES.CALENDAR_QUARTER_DESC
   LEVEL year        IS TIMES.CALENDAR_YEAR
   LEVEL fis_week    IS TIMES.WEEK_ENDING_DAY
   LEVEL fis_month   IS TIMES.FISCAL_MONTH_DESC
   LEVEL fis_quarter IS TIMES.FISCAL_QUARTER_DESC
   LEVEL fis_year    IS TIMES.FISCAL_YEAR
   HIERARCHY cal_rollup    (
             day     CHILD OF
             month   CHILD OF
             quarter CHILD OF
             year
   )
   HIERARCHY fis_rollup    (
             day         CHILD OF
             fis_week    CHILD OF
             fis_month   CHILD OF
             fis_quarter CHILD OF
             fis_year
   ) <attribute determination clauses>...

Using Normalized Dimension Tables

The tables used to define a dimension may be normalized or denormalized and the individual hierarchies can be normalized or denormalized. If the levels of a hierarchy come from the same table, it is called a fully denormalized hierarchy. For example, cal_rollup in the times_dim dimension is a denormalized hierarchy. If levels of a hierarchy come from different tables, such a hierarchy is either a fully or partially normalized hierarchy. This section shows how to define a normalized hierarchy.

Suppose the tracking of a customer's location is done by city, state, and country. This data is stored in the tables customers and countries. The customer dimension customers_dim is partially normalized because the data entities cust_id and country_id are taken from different tables. The clause JOIN KEY within the dimension definition specifies how to join together the levels in the hierarchy. The dimension statement is partially shown in the following. The complete CREATE DIMENSION statement as well as the CREATE TABLE statement are shown in Oracle9i Sample Schemas.

CREATE DIMENSION customers_dim
       LEVEL customer  IS (customers.cust_id)
       LEVEL city      IS (customers.cust_city)
       LEVEL state     IS (customers.cust_state_province)
       LEVEL country   IS (countries.country_id)
       LEVEL subregion IS (countries.country_subregion)
       LEVEL region    IS (countries.country_region)
       HIERARCHY geog_rollup (
                 customer        CHILD OF
                 city            CHILD OF
                 state           CHILD OF
                 country         CHILD OF
                 subregion       CHILD OF
                 region
       JOIN KEY (customers.country_id) REFERENCES country
       ) ...attribute determination clause;
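The JOIN KEY clause only declares the join; it does not verify it. As a hedged illustration (not from the guide), the queries below test the conditions a 1:n join relationship requires, using the customers and countries tables named above: the child join key must be non-null and must match a parent row, and the parent join key must be unique.

```sql
-- Sketch only. Assumption: tables CUSTOMERS and COUNTRIES as used by customers_dim.
-- Child rows whose join key is null or has no matching parent row.
SELECT c.cust_id, c.country_id
FROM   customers c
WHERE  c.country_id IS NULL
   OR  NOT EXISTS (SELECT 1
                   FROM   countries n
                   WHERE  n.country_id = c.country_id);

-- Parent join key values that are not unique.
SELECT country_id, COUNT(*) AS occurrences
FROM   countries
GROUP  BY country_id
HAVING COUNT(*) > 1;
```

If both queries return no rows, each customer row joins with one and only one row of countries.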

Viewing Dimensions

Dimensions can be viewed through one of two methods:

Using The DEMO_DIM Package
Using Oracle Enterprise Manager

Using The DEMO_DIM Package

Two procedures allow you to display the dimensions that have been defined. First, the file smdim.sql, located under $ORACLE_HOME/rdbms/demo, must be executed to provide the DEMO_DIM package, which includes:

DEMO_DIM.PRINT_DIM to print a specific dimension
DEMO_DIM.PRINT_ALLDIMS to print all dimensions accessible to a user

The DEMO_DIM.PRINT_DIM procedure has only one parameter: the name of the dimension to display. The following example shows how to display the dimension TIMES_DIM.

SET SERVEROUTPUT ON;
EXECUTE DEMO_DIM.PRINT_DIM ('TIMES_DIM');

To display all of the dimensions that have been defined, call the procedure DEMO_DIM.PRINT_ALLDIMS without any parameters, as illustrated in the following.

EXECUTE DBMS_OUTPUT.ENABLE(10000);
EXECUTE DEMO_DIM.PRINT_ALLDIMS;

Regardless of which procedure is called, the output format is identical. A sample display is shown here.

DIMENSION SH.PROMO_DIM
LEVEL CATEGORY IS SH.PROMOTIONS.PROMO_CATEGORY
LEVEL PROMO IS SH.PROMOTIONS.PROMO_ID
LEVEL SUBCATEGORY IS SH.PROMOTIONS.PROMO_SUBCATEGORY
HIERARCHY PROMO_ROLLUP ( PROMO
CHILD OF SUBCATEGORY
CHILD OF CATEGORY)
ATTRIBUTE CATEGORY DETERMINES SH.PROMOTIONS.PROMO_CATEGORY
ATTRIBUTE PROMO DETERMINES SH.PROMOTIONS.PROMO_BEGIN_DATE
ATTRIBUTE PROMO DETERMINES SH.PROMOTIONS.PROMO_COST
ATTRIBUTE PROMO DETERMINES SH.PROMOTIONS.PROMO_END_DATE
ATTRIBUTE PROMO DETERMINES SH.PROMOTIONS.PROMO_NAME
ATTRIBUTE SUBCATEGORY DETERMINES SH.PROMOTIONS.PROMO_SUBCATEGORY

Using Oracle Enterprise Manager

All of the dimensions that exist in the data warehouse can be viewed using Oracle Enterprise Manager. Select the Dimension object from within the Schema icon to display all of the dimensions. Select a specific dimension to graphically display its hierarchy, levels, and any attributes that have been defined.

See Also: Oracle Enterprise Manager Administrator's Guide and "Using the Dimension Wizard" for details regarding creating and using dimensions

Using Dimensions with Constraints

Constraints play an important role with dimensions. Full referential integrity is sometimes enabled in data warehouses, but not always. This is because operational databases normally have full referential integrity and you can ensure that the data flowing into your warehouse never violates the already established integrity rules.

Oracle recommends that constraints be enabled and, if validation time is a concern, then the NOVALIDATE clause should be used as follows:

ALTER TABLE time ENABLE NOVALIDATE CONSTRAINT pk_time;

Primary and foreign keys should be implemented also. Referential integrity constraints and NOT NULL constraints on the fact tables provide information that query rewrite can use to extend the usefulness of materialized views.

In addition, you should use the RELY clause to inform query rewrite that it can rely upon the constraints being correct as follows:

ALTER TABLE time MODIFY CONSTRAINT pk_time RELY;

This information is also used for query rewrite.

See Also: Chapter 22, "Query Rewrite" for further details
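The same clauses apply to the foreign key on the fact table side. The following is a sketch under stated assumptions: the fact table sales, its column time_key, and the constraint name fk_sales_time are hypothetical names chosen for illustration, and the referenced primary key pk_time is assumed to be already marked RELY as shown above.

```sql
-- Sketch only. SALES, TIME_KEY and FK_SALES_TIME are hypothetical names.
-- The constraint is enabled for new rows but existing rows are not checked
-- (NOVALIDATE); RELY tells query rewrite to trust it anyway.
ALTER TABLE sales
  ADD CONSTRAINT fk_sales_time
  FOREIGN KEY (time_key) REFERENCES time (time_key)
  RELY ENABLE NOVALIDATE;

-- Inspect the resulting constraint states in the data dictionary.
SELECT constraint_name, constraint_type, status, validated, rely
FROM   user_constraints
WHERE  table_name IN ('SALES', 'TIME');
```

Using NOVALIDATE shifts the responsibility for data correctness to the load process, so it is only appropriate when the incoming data is known to respect the rule.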

Validating Dimensions

The information of a dimension object is declarative only and not enforced by the database. If the relationships described by the dimensions are incorrect, incorrect results could occur. Therefore, you should verify the relationships specified by CREATE DIMENSION using the DBMS_OLAP.VALIDATE_DIMENSION procedure periodically.

This procedure is easy to use and has only five parameters:

Dimension name
Owner name
Set to TRUE to check only the new rows for tables of this dimension
Set to TRUE to verify that all columns are not null
Unique run ID obtained by calling the DBMS_OLAP.CREATE_ID procedure. The ID is used to identify the result of each run

The following example validates the dimension TIME_FN in the grocery schema:

VARIABLE RID NUMBER;
EXECUTE DBMS_OLAP.CREATE_ID(:RID);
EXECUTE DBMS_OLAP.VALIDATE_DIMENSION ('TIME_FN', 'GROCERY', FALSE, TRUE, :RID);

If the VALIDATE_DIMENSION procedure encounters any errors, they are placed in a system table. The table can be accessed from the view SYSTEM.MVIEW_EXCEPTIONS. Querying this view will identify the exceptions that were found. For example:

SELECT * FROM SYSTEM.MVIEW_EXCEPTIONS
WHERE RUNID = :RID;

RUNID OWNER    TABLE_NAME  DIMENSION_NAME RELATIONSHIP BAD_ROWID
----- -------- ----------- -------------- ------------ ------------------
678   GROCERY  MONTH       TIME_FN        FOREIGN KEY  AAAAuwAAJAAAARwAAA

However, rather than query this view, it may be better to query the rowid of the invalid row to retrieve the actual row that has violated the constraint. In this example, the dimension TIME_FN is checking a table called month. It has found a row that violates the constraints. Using the rowid, you can see exactly which row in the month table is causing the problem, as in the following:

SELECT * FROM month
WHERE rowid IN (SELECT bad_rowid
                FROM SYSTEM.MVIEW_EXCEPTIONS
                WHERE RUNID = :RID);

MONTH    QUARTER FISCAL_QTR YEAR FULL_MONTH_NAME MONTH_NUMB
-------- ------- ---------- ---- --------------- ----------
  199903   19981      19981 1998 March                    3

Finally, to remove results from the system table for the current run:

EXECUTE DBMS_OLAP.PURGE_RESULTS(:RID);

Altering Dimensions

You can modify the dimension using the ALTER DIMENSION statement. You can add or drop a level, hierarchy, or attribute from the dimension using this command.

Referring to the time dimension in Figure 9-2, you can remove the attribute fis_year, drop the hierarchy fis_rollup, or remove the level fis_year. In addition, you can add a new level called f_year as in the following:

ALTER DIMENSION times_dim DROP ATTRIBUTE fis_year;
ALTER DIMENSION times_dim DROP HIERARCHY fis_rollup;
ALTER DIMENSION times_dim DROP LEVEL fis_year;
ALTER DIMENSION times_dim ADD LEVEL f_year IS times.fiscal_year;

If you try to remove anything with further dependencies inside the dimension, Oracle rejects the altering of the dimension. A dimension becomes invalid if you change any schema object that the dimension is referencing. For example, if the table on which the dimension is defined is altered, the dimension becomes invalid.

To check the status of a dimension, view the contents of the column invalid in the ALL_DIMENSIONS data dictionary view. To revalidate the dimension, use the COMPILE option as follows:

ALTER DIMENSION times_dim COMPILE;
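The status check mentioned above can be written as a short dictionary query. This is a minimal sketch; it assumes only the ALL_DIMENSIONS view and its invalid column, as named in the text.

```sql
-- Lists the accessible dimensions that are currently marked invalid
-- and are therefore candidates for ALTER DIMENSION ... COMPILE.
SELECT owner, dimension_name, invalid
FROM   all_dimensions
WHERE  invalid = 'Y';
```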

Dimensions can also be modified using Oracle Enterprise Manager.

See Also: Oracle Enterprise Manager Administrator's Guide

Deleting Dimensions

A dimension is removed using the DROP DIMENSION statement. For example:

DROP DIMENSION times_dim;

Dimensions can also be deleted using Oracle Enterprise Manager.

See Also: Oracle Enterprise Manager Administrator's Guide

Using the Dimension Wizard

An alternative method for creating and viewing dimensions is to use Oracle Enterprise Manager, which graphically displays the dimension definition, thus making it easier to see the hierarchy. A dimension wizard is provided to facilitate easy definition of the dimension object.

The Dimension Wizard is automatically invoked whenever a request is made to create a dimension object in Oracle Enterprise Manager. You are then guided step by step through the information required for a dimension.

A dimension created using the Wizard can contain any of the attributes described in "Creating Dimensions", such as join keys, multiple hierarchies, and attributes. You might prefer to use the Wizard because it graphically displays the hierarchical relationships as they are being constructed. When it is time to describe the hierarchy, the Wizard automatically displays a default hierarchy based on the column values, which you can subsequently amend.

See Also: Oracle Enterprise Manager Administrator's Guide

Managing the Dimension Object

The dimension object is located within the Warehouse section for a database. Selecting a specific dimension results in 5 sheets of information becoming available. The General Property sheet shown in Figure 9-3 displays the dimension definition in a graphical form.

Figure 9-3 Dimension General Property Sheet

The levels in the dimension can either be shown on the General Property sheet, or by selecting the Levels property sheet, levels can be deleted, displayed or new ones defined for this dimension as illustrated in Figure 9-4.

Figure 9-4 Dimension Levels Property Sheet

By selecting the level name from the list on the left of the property sheet, the columns used for this level are displayed in the Selected Columns window in the lower half of the property sheet.

Levels can be added or removed by pressing the New or Delete buttons but they cannot be modified.

A similar property sheet to that for Levels is provided for the attributes in the dimension and is selected by clicking on the Attributes tab.

One of the main advantages of using Oracle Enterprise Manager to define the dimension is that the hierarchies can be easily displayed. Figure 9-5 illustrates the Hierarchy property sheet.

Figure 9-5 Dimension Hierarchy Property Sheet

In Figure 9-5, you can see that the hierarchy called CAL_ROLLUP contains four levels where the top level is year, followed by quarter, month, and day.

You can add or remove hierarchies by pressing the New or Delete buttons but they cannot be modified.

Creating a Dimension

An alternative to writing the CREATE DIMENSION statement is to invoke the Dimension wizard, which guides you through 6 steps to create a dimension.

Step 1

First, you must define which type of dimension object is to be defined. If a time dimension is required, selecting the time dimension type ensures that your dimension is recognized as a time dimension that has specific types of hierarchies and attributes.

Step 2

Specify the name of your dimension and into which schema it should reside by selecting from the drop down list of schemas.

Step 3

The levels in the dimension are defined in Step 3 as shown in Figure 9-6.

Figure 9-6 Dimension Wizard: Define Levels

First, give the level a name and then select the table from where the columns which define this level are located. Now, select one or more columns from the available list and using the > key move them into the Selected Columns area. Your level will now appear in the list on the left side of the property sheet.

To define another level, click the New button, or, if all the levels have been defined, click the Next button to proceed to the next step. If a mistake is made when defining a level, simply click the Delete button to remove it and start again.

Step 4

The levels in the dimension can also have attributes. Give the attribute a name and then select the level on which this attribute is to be defined and using the > button move it into the Selected Levels column. Now choose the column from the drop down list for this attribute.

Levels can be added or removed by pressing the New or Delete buttons but they cannot be modified.

Step 5

A hierarchy is defined as illustrated in Figure 9-7.

Figure 9-7 Dimension Wizard: Define Hierarchies

First, give the hierarchy a name and then select the levels to be used in this hierarchy and move them to the Selected Levels column using the > button.

The level name at the top of the list defines the top of the hierarchy. Use the up and down buttons to move the levels into the required order. Note that each level will indent so you can see the relationships between the levels.

Step 6

Finally, the Summary screen is displayed as shown in Figure 9-8 where a graphical representation of the dimension is shown on the left side of the property sheet and on the right side the CREATE DIMENSION statement is shown. Clicking on the Finish button will create the dimension.

Figure 9-8 Dimension Wizard: Summary Screen


Part IV Managing the Warehouse Environment

This section deals with the tasks for managing a data warehouse. It contains the following chapters:

Overview of Extraction, Transformation, and Loading
Extraction in Data Warehouses
Transportation in Data Warehouses
Loading and Transformation
Maintaining the Data Warehouse
Change Data Capture
Summary Advisor


10 Overview of Extraction, Transformation, and Loading

This chapter discusses the process of extracting, transporting, transforming, and loading data in a data warehousing environment:

Overview of ETL
ETL Tools

Overview of ETL
You need to load your data warehouse regularly so that it can serve its purpose of facilitating business analysis. To do this, data from one or more operational systems needs to be extracted and copied into the warehouse. The process of extracting data from source systems and bringing it into the data warehouse is commonly called ETL, which stands for extraction, transformation, and loading. The acronym ETL is perhaps too simplistic, because it omits the transportation phase and implies that each of the other phases of the process is distinct. We refer to the entire process, including data loading, as ETL. You should understand that ETL refers to a broad process, and not three well-defined steps.

The methodology and tasks of ETL have been well known for many years, and are not necessarily unique to data warehouse environments: a wide variety of proprietary applications and database systems are the IT backbone of any enterprise. Data has to be shared between applications or systems, trying to integrate them, giving at least two applications the same picture of the world. This data sharing was mostly addressed by mechanisms similar to what we now call ETL.

Data warehouse environments face the same challenge with the additional burden that they not only have to exchange but to integrate, rearrange and consolidate data over many systems, thereby providing a new unified information base for business intelligence. Additionally, the data volume in data warehouse environments tends to be very large.

What happens during the ETL process? During extraction, the desired data is identified and extracted from many different sources, including database systems and applications. Very often, it is not possible to identify the specific subset of interest, therefore more data than necessary has to be extracted, so the identification of the relevant data will be done at a later point in time. Depending on the source system's capabilities (for example, operating system resources), some transformations may take place during this extraction process. The size of the extracted data varies from hundreds of kilobytes up to gigabytes, depending on the source system and the business situation. The same is true for the time delta between two (logically) identical extractions: the time span may vary between days/hours and minutes to near real-time. Web server log files for example can easily become hundreds of megabytes in a very short period of time.

After extracting data, it has to be physically transported to the target system or an intermediate system for further processing. Depending on the chosen way of transportation, some transformations can be done during this process, too. For example, a SQL statement which directly accesses a remote target through a gateway can concatenate two columns as part of the SELECT statement.

The emphasis in many of the examples in this section is scalability. Many long-time users of Oracle are experts in programming complex data transformation logic using PL/SQL. These chapters suggest alternatives for many such data manipulation operations, with a particular emphasis on implementations that take advantage of Oracle's new SQL functionality, especially for ETL and the parallel query infrastructure.
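As a small illustration of a transformation performed during transportation, the sketch below concatenates two columns while reading from a remote system. It is not taken from the guide: the database link source_db and the column names are assumptions made for the example.

```sql
-- Sketch only. SOURCE_DB is a hypothetical database link (or gateway link);
-- the customer name columns are assumed to exist on the remote table.
-- The concatenation happens as part of the SELECT that moves the data.
SELECT cust_id,
       cust_first_name || ' ' || cust_last_name AS cust_full_name
FROM   customers@source_db;
```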

ETL Tools

Designing and maintaining the ETL process is often considered one of the most difficult and resource-intensive portions of a data warehouse project. Many data warehousing projects use ETL tools to manage this process. Oracle Warehouse Builder (OWB), for example, provides ETL capabilities and takes advantage of inherent database abilities. Other data warehouse builders create their own ETL tools and processes, either inside or outside the database.

Besides the support of extraction, transformation, and loading, there are some other tasks that are important for a successful ETL implementation as part of the daily operations of the data warehouse and its support for further enhancements. Besides the support for designing a data warehouse and the data flow, these tasks are typically addressed by ETL tools such as OWB.

Oracle9i is not an ETL tool and does not provide a complete solution for ETL. However, Oracle9i does provide a rich set of capabilities that can be used by both ETL tools and customized ETL solutions. Oracle9i offers techniques for transporting data between Oracle databases, for transforming large volumes of data, and for quickly loading new data into a data warehouse.

Daily Operations

The successive loads and transformations must be scheduled and processed in a specific order. Depending on the success or failure of the operation or parts of it, the result must be tracked and subsequent, alternative processes might be started. The control of the progress as well as the definition of a business workflow of the operations are typically addressed by ETL tools such as OWB.

Evolution of the Data Warehouse

As the data warehouse is a living IT system, sources and targets might change. Those changes must be maintained and tracked through the lifespan of the system without overwriting or deleting the old ETL process flow information. To build and keep a level of trust about the information in the warehouse, the process flow of each individual record in the warehouse can be reconstructed at any point in time in the future in an ideal case.


11 Extraction in Data Warehouses

This chapter discusses extraction, which is the process of taking data from an operational system and moving it to your warehouse or staging system. The chapter discusses:

Overview of Extraction in Data Warehouses
Introduction to Extraction Methods in Data Warehouses
Data Warehousing Extraction Examples

Overview of Extraction in Data Warehouses


Extraction is the operation of extracting data from a source system for further use in a data warehouse environment. This is the first step of the ETL process. After the extraction, this data can be transformed and loaded into the data warehouse.

The source systems for a data warehouse are typically transaction processing applications. For example, one of the source systems for a sales analysis data warehouse might be an order entry system that records all of the current order activities.

Designing and creating the extraction process is often one of the most time-consuming tasks in the ETL process and, indeed, in the entire data warehousing process. The source systems might be very complex and poorly documented, and thus determining which data needs to be extracted can be difficult. The data has to be extracted normally not only once, but several times in a periodic manner to supply all changed data to the warehouse and keep it up-to-date. Moreover, the source system typically cannot be modified, nor can its performance or availability be adjusted, to accommodate the needs of the data warehouse extraction process.

These are important considerations for extraction and ETL in general. This chapter, however, focuses on the technical considerations of having different kinds of sources and extraction methods. It assumes that the data warehouse team has already identified the data that will be extracted, and discusses common techniques used for extracting data from source databases.

Designing this process means making decisions about the following two main aspects:

Which extraction method do I choose?
This influences the source system, the transportation process, and the time needed for refreshing the warehouse.

How do I provide the extracted data for further processing?
This influences the transportation method, and the need for cleaning and transforming the data.

Introduction to Extraction Methods in Data Warehouses

The extraction method you should choose is highly dependent on the source system and also on the business needs in the target data warehouse environment. Very often, there is no possibility to add additional logic to the source systems to enhance an incremental extraction of data, because of the performance impact or the increased workload on these systems. Sometimes the customer is not even allowed to add anything to an out-of-the-box application system.

The estimated amount of the data to be extracted and the stage in the ETL process (initial load or maintenance of data) may also impact the decision of how to extract, from a logical and a physical perspective. Basically, you have to decide how to extract data logically and physically.

Logical Extraction Methods

There are two kinds of logical extraction:

Full Extraction
Incremental Extraction

Full Extraction

The data is extracted completely from the source system. Since this extraction reflects all the data currently available on the source system, there's no need to keep track of changes to the data source since the last successful extraction. The source data will be provided as-is and no additional logical information (for example, timestamps) is necessary on the source site. An example for a full extraction may be an export file of a distinct table or a remote SQL statement scanning the complete source table.

Incremental Extraction

At a specific point in time, only the data that has changed since a well-defined event back in history will be extracted. This event may be the last time of extraction or a more complex business event like the last booking day of a fiscal period. To identify this delta change there must be a possibility to identify all the changed information since this specific time event. This information can be either provided by the source data itself like an application column, reflecting the last-changed timestamp or a change table where an appropriate additional mechanism keeps track of the changes besides the originating transactions. In most cases, using the latter method means adding extraction logic to the source system.

Many data warehouses do not use any change-capture techniques as part of the extraction process. Instead, entire tables from the source systems are extracted to the data warehouse or staging area, and these tables are compared with a previous extract from the source system to identify the changed data. This approach may not have significant impact on the source systems, but it clearly can place a considerable burden on the data warehouse processes, particularly if the data volumes are large.

Oracle's Change Data Capture mechanism can extract and maintain such delta information.

See Also: Chapter 15, "Change Data Capture" for further details about the Change Data Capture framework
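The comparison of a full extract with a previous extract can be expressed with a set operation. The following is a minimal sketch, assuming two hypothetical staging tables with identical column lists, orders_extract_curr and orders_extract_prev; neither name comes from the guide.

```sql
-- Sketch only. Both staging tables are hypothetical and must have the
-- same columns in the same order.
-- Rows that are new or changed since the previous extract:
SELECT * FROM orders_extract_curr
MINUS
SELECT * FROM orders_extract_prev;

-- Rows that were deleted or changed on the source since the previous extract:
SELECT * FROM orders_extract_prev
MINUS
SELECT * FROM orders_extract_curr;
```

Both statements scan the two extracts completely, which is the burden on the warehouse side that the text describes for large data volumes.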

Physical Extraction Methods

Depending on the chosen logical extraction method and the capabilities and restrictions on the source side, the extracted data can be physically extracted by two mechanisms. The data can either be extracted online from the source system or from an offline structure. Such an offline structure might already exist or it might be generated by an extraction routine.

There are the following methods of physical extraction:

Online Extraction
Offline Extraction

Online Extraction

The data is extracted directly from the source system itself. The extraction process can connect directly to the source system to access the source tables themselves or to an intermediate system that stores the data in a preconfigured manner (for example, snapshot logs or change tables). Note that the intermediate system is not necessarily physically different from the source system.

With online extractions, you need to consider whether the distributed transactions are using original source objects or prepared source objects.

Offline Extraction

The data is not extracted directly from the source system but is staged explicitly outside the original source system. The data already has an existing structure (for example, redo logs, archive logs or transportable tablespaces) or was created by an extraction routine.

You should consider the following structures:

Flat files
Data in a defined, generic format. Additional information about the source object is necessary for further processing.

Dump files
Oracle-specific format. Information about the containing objects is included.

Redo and archive logs
Information is in a special, additional dump file.

Transportable tablespaces
A powerful way to extract and move large volumes of data between Oracle databases. A more detailed example of using this feature to extract and transport data is provided in Chapter 12, "Transportation in Data Warehouses". Oracle Corporation recommends that you use transportable tablespaces whenever possible, because they can provide considerable advantages in performance and manageability over other extraction techniques.

See Also: Oracle9i Database Utilities for more information on using dump and flat files and Oracle9i Supplied PL/SQL Packages and Types Reference for details regarding LogMiner

Change Data Capture


An important consideration for extraction is incremental extraction, also called Change Data Capture. If a data warehouse extracts data from an operational system on a nightly basis, then the data warehouse requires only the data that has changed since the last extraction (that is, the data that has been modified in the past 24 hours).

When it is possible to efficiently identify and extract only the most recently changed data, the extraction process (as well as all downstream operations in the ETL process) can be much more efficient, because it must extract a much smaller volume of data. Unfortunately, for many source systems, identifying the recently modified data may be difficult or intrusive to the operation of the system. Change Data Capture is typically the most challenging technical issue in data extraction.

Because change data capture is often desirable as part of the extraction process and it might not be possible to use Oracle's Change Data Capture mechanism, this section describes several techniques for implementing a self-developed change capture on Oracle source systems:

Timestamps
Partitioning
Triggers

These techniques are based upon the characteristics of the source systems, or may require modifications to the source systems. Thus, each of these techniques must be carefully evaluated by the owners of the source system prior to implementation.

Each of these techniques can work in conjunction with the data extraction technique discussed previously. For example, timestamps can be used whether the data is being unloaded to a file or accessed through a distributed query.

See Also: Chapter 15, "Change Data Capture" for further details

Timestamps

The tables in some operational systems have timestamp columns. The timestamp specifies the time and date that a given row was last modified. If the tables in an operational system have columns containing timestamps, then the latest data can easily be identified using the timestamp columns. For example, the following query might be useful for extracting today's data from an orders table:
SELECT * FROM orders
WHERE TRUNC(CAST(order_date AS date),'dd') = TRUNC(SYSDATE,'dd');
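The timestamp technique can be generalized from "today's data" to "everything since the last extraction". The following is a minimal sketch only, written in Python with the standard-library sqlite3 module as a stand-in for an Oracle source system. The last_modified column, the sample rows, and the extract_changed_rows helper are assumptions made for illustration and do not come from the guide.

```python
import sqlite3

# In-memory SQLite database standing in for the operational source system.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, amount REAL, last_modified TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, 10.0, "2002-03-01 09:15:00"),
        (2, 25.5, "2002-03-02 14:30:00"),
        (3, 7.25, "2002-03-03 08:05:00"),
    ],
)


def extract_changed_rows(conn, last_extraction):
    """Return rows modified strictly after the previous extraction time.

    Timestamps are ISO-8601 strings, so string comparison matches
    chronological order.
    """
    cur = conn.execute(
        "SELECT order_id, amount, last_modified FROM orders "
        "WHERE last_modified > ? ORDER BY last_modified",
        (last_extraction,),
    )
    return cur.fetchall()


# Extract everything changed since the previous run, then remember the
# newest timestamp seen so the next run starts from there.
changed = extract_changed_rows(conn, "2002-03-02 00:00:00")
new_watermark = max(row[2] for row in changed)
```

Storing the newest timestamp seen, rather than the wall-clock time of the run, is one common design choice: it avoids skipping rows that were modified while the extraction itself was running.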

If the timestamp information is not available in an operational source system, you will not always be able to modify the system to include timestamps. Such modification would require, first, modifying the operational system's tables to include a new timestamp column and then creating a trigger to update the timestamp column following every operation that modifies a given row.

See Also: "Triggers"

Partitioning

Some source systems might use Oracle range partitioning, such that the source tables are partitioned along a date key, which allows for easy identification of new data. For example, if you are extracting from an orders table, and the orders table is partitioned by week, then it is easy to identify the current week's data.

Triggers

Triggers can be created in operational systems to keep track of recently updated records. They can then be used in conjunction with timestamp columns to identify the exact time and date when a given row was last modified. You do this by creating a trigger on each source table that requires change data capture. Following each DML statement that is executed on the source table, this trigger updates the timestamp column with the current time. Thus, the timestamp column provides the exact time and date when a given row was last modified.

A similar internalized trigger-based technique is used for Oracle materialized view logs. These logs are used by materialized views to identify changed data, and these logs are accessible to end users. A materialized view log can be created on each source table requiring change data capture. Then, whenever any modifications are made to the source table, a record is inserted into the materialized view log indicating which rows were modified. If you want to use a trigger-based mechanism, use change data capture.

Materialized view logs rely on triggers, but they provide an advantage in that the creation and maintenance of this change-data system is largely managed by Oracle.

However, Oracle recommends the usage of synchronous Change Data Capture for trigger-based change capture, since CDC provides an externalized interface for accessing the change information and provides a framework for maintaining the distribution of this information to various clients.

Trigger-based techniques affect performance on the source systems, and this impact should be carefully considered prior to implementation on a production source system.
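To make the trigger-based idea concrete, the following sketch records every change to a source table in a separate change-log table, loosely analogous to the way a materialized view log records which rows were modified. It is an illustration only, written in Python with sqlite3 rather than Oracle. The orders_change_log table, the trigger names, and the one-letter operation codes are hypothetical and do not reflect Oracle's internal log format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Source table plus a hypothetical change-log table filled by triggers.
conn.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL);

CREATE TABLE orders_change_log (
    change_seq INTEGER PRIMARY KEY AUTOINCREMENT,
    order_id   INTEGER NOT NULL,
    operation  TEXT NOT NULL
);

CREATE TRIGGER orders_ai AFTER INSERT ON orders
BEGIN
    INSERT INTO orders_change_log (order_id, operation)
    VALUES (NEW.order_id, 'I');
END;

CREATE TRIGGER orders_au AFTER UPDATE ON orders
BEGIN
    INSERT INTO orders_change_log (order_id, operation)
    VALUES (NEW.order_id, 'U');
END;

CREATE TRIGGER orders_ad AFTER DELETE ON orders
BEGIN
    INSERT INTO orders_change_log (order_id, operation)
    VALUES (OLD.order_id, 'D');
END;
""")

# Ordinary DML against the source table; the triggers do the bookkeeping.
conn.execute("INSERT INTO orders VALUES (1, 10.0)")
conn.execute("INSERT INTO orders VALUES (2, 20.0)")
conn.execute("UPDATE orders SET amount = 12.5 WHERE order_id = 1")
conn.execute("DELETE FROM orders WHERE order_id = 2")

# The extraction process reads the log instead of scanning the source table.
change_log = conn.execute(
    "SELECT order_id, operation FROM orders_change_log ORDER BY change_seq"
).fetchall()
```

Every DML statement now performs an extra insert into the log, which is the source-system overhead the text warns about.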

Data Warehousing Extraction Examples


You can extract data in two ways:

Extraction Using Data Files
Extraction Via Distributed Operations

Extraction Using Data Files


Most database systems provide mechanisms for exporting or unloading data from the internal database format into flat files. Extracts from mainframe systems often use COBOL programs, but many databases, as well as third-party software vendors, provide export or unload utilities.

Data extraction does not necessarily mean that entire database structures are unloaded in flat files. In many cases, it may be appropriate to unload entire database tables or objects. In other cases, it may be more appropriate to unload only a subset of a given table such as the changes on the source system since the last extraction or the results of joining multiple tables together. Different extraction techniques vary in their capabilities to support these two scenarios.

When the source system is an Oracle database, several alternatives are available for extracting data into files:


12 Transportation in Data Warehouses


The following topics provide information about transporting data into a data warehouse:

Overview of Transportation in Data Warehouses
Introduction to Transportation Mechanisms in Data Warehouses

Overview of Transportation in Data Warehouses

Transportation is the operation of moving data from one system to another system. In a data warehouse environment, the most common requirements for transportation are in moving data from:

A source system to a staging database or a data warehouse database
A staging database to a data warehouse
A data warehouse to a data mart

Transportation is often one of the simpler portions of the ETL process, and can be integrated with other portions of the process. For example, as shown in Chapter 11, "Extraction in Data Warehouses", distributed query technology provides a mechanism for both extracting and transporting data.

Introduction to Transportation Mechanisms in Data Warehouses


You have three basic choices for transporting data in warehouses:

Transportation Using Flat Files
Transportation Through Distributed Operations
Transportation Using Transportable Tablespaces

Transportation Using Flat Files


The most common method for transporting data is by the transfer of flat files, using mechanisms such as FTP or other remote file system access protocols. Data is unloaded or exported from the source system into flat files using techniques discussed in Chapter 11, "Extraction in Data Warehouses", and is then transported to the target platform using FTP or similar mechanisms.

Because source systems and data warehouses often use different operating systems and database systems, using flat files is often the simplest way to exchange data between heterogeneous systems with minimal transformations. However, even when transporting data between homogeneous systems, flat files are often the most efficient and most easy-to-manage mechanism for data transfer.
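The unload, transfer, and reload cycle described above can be sketched as follows. This is an illustration under stated assumptions, not the guide's procedure: two in-memory SQLite databases stand in for the source system and the warehouse, the file name sales.dat and the sample rows are invented, and the actual transfer step (FTP or similar) is only indicated by a comment.

```python
import csv
import os
import sqlite3
import tempfile

# Source system: a small sales table to unload.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE sales (prod_id INTEGER, cust_id INTEGER, amount_sold REAL)"
)
source.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(13, 987, 1232.16), (14, 2273, 36.99)],
)

flat_file = os.path.join(tempfile.mkdtemp(), "sales.dat")

# Step 1: unload the table into a pipe-delimited flat file.
with open(flat_file, "w", newline="") as f:
    writer = csv.writer(f, delimiter="|")
    writer.writerows(
        source.execute(
            "SELECT prod_id, cust_id, amount_sold FROM sales ORDER BY prod_id"
        )
    )

# Step 2: the file would be moved to the target platform here (FTP or a
# similar mechanism); in this sketch both sides share one file system.

# Step 3: load the flat file into the target database. The declared column
# types let SQLite convert the text fields back to numbers.
target = sqlite3.connect(":memory:")
target.execute(
    "CREATE TABLE sales (prod_id INTEGER, cust_id INTEGER, amount_sold REAL)"
)
with open(flat_file, newline="") as f:
    reader = csv.reader(f, delimiter="|")
    target.executemany("INSERT INTO sales VALUES (?, ?, ?)", reader)

loaded = target.execute(
    "SELECT prod_id, cust_id, amount_sold FROM sales ORDER BY prod_id"
).fetchall()
```

Because the file carries only delimited text, neither side needs to know anything about the other's storage format, which is what makes the approach workable between heterogeneous systems.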

Transportation Through Distributed Operations


Distributed queries, either with or without gateways, can be an effective mechanism for extracting data. These mechanisms also transport the data directly to the target systems, thus providing both extraction and transportation in a single step. Depending on the tolerable impact on time and system resources, these mechanisms can be well suited for both extraction and transformation.

As opposed to flat file transportation, the success or failure of the transportation is recognized immediately with the result of the distributed query or transaction.

See Also: Chapter 11, "Extraction in Data Warehouses" for further details

Transportation Using Transportable Tablespaces


Oracle8i introduced an important mechanism for transporting data: transportable tablespaces. This feature is the fastest way for moving large volumes of data between two Oracle databases.

Previous to Oracle8i, the most scalable data transportation mechanisms relied on moving flat files containing raw data. These mechanisms required that data be unloaded or exported into files from the source database. Then, after transportation, these files were loaded or imported into the target database. Transportable tablespaces entirely bypass the unload and reload steps.

Using transportable tablespaces, Oracle data files (containing table data, indexes, and almost every other Oracle database object) can be directly transported from one database to another. Furthermore, like import and export, transportable tablespaces provide a mechanism for transporting metadata in addition to transporting data.

Transportable tablespaces have some notable limitations: source and target systems must be running Oracle8i (or higher), must be running the same operating system, must use the same character set, and, prior to Oracle9i, must use the same block size. Despite these limitations, transportable tablespaces can be an invaluable data transportation technique in many warehouse environments.

The most common applications of transportable tablespaces in data warehouses are in moving data from a staging database to a data warehouse, or in moving data from a data warehouse to a data mart.

See Also: Oracle9i Database Concepts for more information on transportable tablespaces

Transportable Tablespaces Example

Suppose that you have a data warehouse containing sales data, and several data marts that are refreshed monthly. Also suppose that you are going to move one month of sales data from the data warehouse to the data mart.

Step 1: Place the Data to be Transported into its own Tablespace

The current month's data must be placed into a separate tablespace in order to be transported. In this example, you have a tablespace ts_temp_sales, which will hold a copy of the current month's data. Using the CREATE TABLE ... AS SELECT statement, the current month's data can be efficiently copied to this tablespace:

CREATE TABLE temp_jan_sales NOLOGGING
TABLESPACE ts_temp_sales AS
SELECT * FROM sales
WHERE time_id BETWEEN '31-DEC-1999' AND '01-FEB-2000';

Following this operation, the tablespace ts_temp_sales is set to read-only:

ALTER TABLESPACE ts_temp_sales READ ONLY;

A tablespace cannot be transported unless there are no active transactions modifying the tablespace. Setting the tablespace to read-only enforces this.

The tablespace ts_temp_sales may be a tablespace that has been especially created to temporarily store data for use by the transportable tablespace features. Following "Step 3: Copy the Datafiles and Export File to the Target System", this tablespace can be set to read/write, and, if desired, the table temp_jan_sales can be dropped, or the tablespace can be re-used for other transportations or for other purposes.

In a given transportable tablespace operation, all of the objects in a given tablespace are transported. Although only one table is being transported in this example, the tablespace ts_temp_sales could contain multiple tables. For example, perhaps the data mart is refreshed not only with the new month's worth of sales transactions, but also with a new copy of the customer table. Both of these tables could be transported in the same tablespace. Moreover, this tablespace could also contain other database objects such as indexes, which would also be transported.

Additionally, in a given transportable-tablespace operation, multiple tablespaces can be transported at the same time. This makes it easier to move very large volumes of data between databases. Note, however, that the transportable tablespace feature can only transport a set of tablespaces which contain a complete set of database objects without dependencies on other tablespaces. For example, an index cannot be transported without its table, nor can a partition be transported without the rest of the table. You can use the DBMS_TTS package to check that a tablespace is transportable.

See Also: Oracle9i Supplied PL/SQL Packages and Types Reference for detailed information about the DBMS_TTS package

In this step, we have copied the January sales data into a separate tablespace; however, in some cases, it may be possible to leverage the transportable tablespace feature without even moving data to a separate tablespace. If the sales table has been partitioned by month in the data warehouse and if each partition is in its own tablespace, then it may be possible to directly transport the tablespace containing the January data. Suppose the January partition, sales_jan2000, is located in the tablespace ts_sales_jan2000. Then the tablespace ts_sales_jan2000 could potentially be transported, rather than creating a temporary copy of the January sales data in ts_temp_sales.

However, the same conditions must be satisfied in order to transport the tablespace ts_sales_jan2000 as are required for the specially created tablespace. First, this tablespace must be set to READ ONLY. Second, because a single partition of a partitioned table cannot be transported without the remainder of the partitioned table also being transported, it is necessary to exchange the January partition into a separate table (using the ALTER TABLE statement) to transport the January data. The EXCHANGE operation is very quick, but the January data will no longer be a part of the underlying sales table, and thus may be unavailable to users until this data is exchanged back into the sales table after the export of the metadata. The January data can be exchanged back into the sales table after you complete step 3.

Step 2: Export the Metadata


The Export utility is used to export the metadata describing the objects contained in the transported tablespace. For our example scenario, the Export command could be:

EXP TRANSPORT_TABLESPACE=y
    TABLESPACES=ts_temp_sales
    FILE=jan_sales.dmp

This operation will generate an export file, jan_sales.dmp. The export file will be small, because it contains only metadata. In this case, the export file will contain information describing the table temp_jan_sales, such as the column names, column datatype, and all other information that the target Oracle database will need in order to access the objects in ts_temp_sales.

Step 3: Copy the Datafiles and Export File to the Target System

Copy the data files that make up ts_temp_sales, as well as the export file jan_sales.dmp, to the data mart platform, using any transportation mechanism for flat files. Once the datafiles have been copied, the tablespace ts_temp_sales can be set to READ WRITE mode if desired.

Step 4: Import the Metadata

Once the files have been copied to the data mart, the metadata should be imported into the data mart:

IMP TRANSPORT_TABLESPACE=y DATAFILES='/db/tempjan.f'
    TABLESPACES=ts_temp_sales
    FILE=jan_sales.dmp

At this point, the tablespace ts_temp_sales and the table temp_jan_sales are accessible in the data mart. You can incorporate this new data into the data mart's tables in one of two ways. The first is to insert the data from the temp_jan_sales table into the data mart's sales table:

INSERT /*+ APPEND */ INTO sales SELECT * FROM temp_jan_sales;

Following this operation, you can delete the temp_jan_sales table (and even the entire ts_temp_sales tablespace).

Alternatively, if the data mart's sales table is partitioned by month, then the new transported tablespace and the temp_jan_sales table can become a permanent part of the data mart. The temp_jan_sales table can become a partition of the data mart's sales table:

ALTER TABLE sales ADD PARTITION sales_00jan VALUES
  LESS THAN (TO_DATE('01-feb-2000','dd-mon-yyyy'));
ALTER TABLE sales EXCHANGE PARTITION sales_00jan
  WITH TABLE temp_jan_sales
  INCLUDING INDEXES WITH VALIDATION;

Other Uses of Transportable Tablespaces

The previous example illustrates a typical scenario for transporting data in a data warehouse. However, transportable tablespaces can be used for many other purposes. In a data warehousing environment, transportable tablespaces should be viewed as a utility (much like Import/Export or SQL*Loader), whose purpose is to move large volumes of data between Oracle databases. When used in conjunction with parallel data movement operations such as the CREATE TABLE ... AS SELECT and INSERT ... AS SELECT statements, transportable tablespaces provide an important mechanism for quickly transporting data for many purposes.

Copyright © 1996, 2002 Oracle Corporation. All Rights Reserved.


13 Loading and Transformation


This chapter helps you create and manage a data warehouse, and discusses:

Overview of Loading and Transformation in Data Warehouses
Loading Mechanisms
Transformation Mechanisms
Loading and Transformation Scenarios

Overview of Loading and Transformation in Data Warehouses


Data transformations are often the most complex and, in terms of processing time, the most costly part of the ETL process. They can range from simple data conversions to extremely complex data scrubbing techniques. Many, if not all, data transformations can occur within an Oracle9i database, although transformations are often implemented outside of the database (for example, on flat files) as well.

This chapter introduces techniques for implementing scalable and efficient data transformations within Oracle9i. The examples in this chapter are relatively simple. Real-world data transformations are often considerably more complex. However, the transformation techniques introduced in this chapter meet the majority of real-world data transformation requirements, often with more scalability and less programming than alternative approaches.

This chapter does not seek to illustrate all of the typical transformations that would be encountered in a data warehouse, but to demonstrate the types of fundamental technology that can be applied to implement these transformations and to provide guidance in how to choose the best techniques.

Transformation Flow


From an architectural perspective, you can transform your data in two ways:

Multistage Data Transformation
Pipelined Data Transformation

Multistage Data Transformation

The data transformation logic for most data warehouses consists of multiple steps. For example, in transforming new records to be inserted into a sales table, there may be separate logical transformation steps to validate each dimension key. Figure 13-1 offers a graphical way of looking at the transformation logic.

Figure 13-1 Multistage Data Transformation

When using Oracle9i as a transformation engine, a common strategy is to implement each different transformation as a separate SQL operation and to create a separate, temporary staging table (such as the tables new_sales_step1 and new_sales_step2 in Figure 13-1) to store the incremental results for each step. This load-then-transform strategy also provides a natural checkpointing scheme to the entire transformation process, which enables the process to be more easily monitored and restarted. However, a disadvantage to multistaging is that the space and time requirements increase.

It may also be possible to combine many simple logical transformations into a single SQL statement or single PL/SQL procedure. Doing so may provide better performance than performing each step independently, but it may also introduce difficulties in modifying, adding, or dropping individual transformations, as well as recovering from failed transformations.
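The multistage strategy can be sketched as a chain of staging tables, each produced by one SQL operation that validates a single dimension key. The example below is illustrative only and uses Python's sqlite3 module rather than Oracle. The staging table names follow Figure 13-1, while the new_sales_raw table, the two dimension tables, and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Dimension tables and the raw, unvalidated load (all hypothetical).
conn.execute("CREATE TABLE products (prod_id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE customers (cust_id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE new_sales_raw (prod_id INTEGER, cust_id INTEGER, amount REAL)"
)
conn.executemany("INSERT INTO products VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO customers VALUES (?)", [(100,), (200,)])
conn.executemany(
    "INSERT INTO new_sales_raw VALUES (?, ?, ?)",
    [
        (1, 100, 10.0),  # valid
        (9, 100, 20.0),  # unknown product key
        (2, 999, 30.0),  # unknown customer key
        (2, 200, 40.0),  # valid
    ],
)

# Step 1: keep only rows whose product key exists in the dimension.
conn.execute(
    "CREATE TABLE new_sales_step1 AS "
    "SELECT r.* FROM new_sales_raw r "
    "WHERE EXISTS (SELECT 1 FROM products p WHERE p.prod_id = r.prod_id)"
)

# Step 2: from the step 1 result, keep only rows with a known customer key.
conn.execute(
    "CREATE TABLE new_sales_step2 AS "
    "SELECT s.* FROM new_sales_step1 s "
    "WHERE EXISTS (SELECT 1 FROM customers c WHERE c.cust_id = s.cust_id)"
)

step1_count = conn.execute("SELECT COUNT(*) FROM new_sales_step1").fetchone()[0]
step2_rows = conn.execute(
    "SELECT prod_id, cust_id, amount FROM new_sales_step2 ORDER BY prod_id"
).fetchall()
```

Each staging table acts as a checkpoint: if step 2 fails, it can be rerun from new_sales_step1 without repeating step 1, at the cost of the extra storage the text mentions.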

Pipelined Data Transformation

With the introduction of Oracle9i, Oracle's database capabilities have been significantly enhanced to address specifically some of the tasks in ETL environments. The ETL process flow can be changed dramatically and the database becomes an integral part of the ETL solution.

The new functionality renders some of the former necessary process steps obsolete whilst some others can be remodeled to enhance the data flow and the data transformation to become more scalable and non-interruptive. The task shifts from a serial transform-then-load process (with most of the tasks done outside the database) or load-then-transform process, to an enhanced transform-while-loading.

Oracle9i offers a wide variety of new capabilities to address all the issues and tasks relevant in an ETL scenario. It is important to understand that the database offers toolkit functionality rather than trying to address a one-size-fits-all solution. The underlying database has to enable the most appropriate ETL process flow for a specific customer need, and not dictate or constrain it from a technical perspective. Figure 13-2 illustrates the new functionality, which is discussed throughout later sections.

Figure 13-2 Pipelined Data Transformation

Loading Mechanisms

You can use the following mechanisms for loading a warehouse:

SQL*Loader
External Tables
OCI and Direct-Path APIs
Export/Import

SQL*Loader

Before any data transformations can occur within the database, the raw data must become accessible for the database. One approach is to load it into the database. Chapter 12, "Transportation in Data Warehouses", discusses several techniques for transporting data to an Oracle data warehouse. Perhaps the most common technique for transporting data is by way of flat files.

SQL*Loader is used to move data from flat files into an Oracle data warehouse. During this data load, SQL*Loader can also be used to implement basic data transformations. When using direct-path SQL*Loader, basic data manipulation, such as datatype conversion and simple NULL handling, can be automatically resolved during the data load. Most data warehouses use direct-path loading for performance reasons.

Oracle's conventional-path loader provides broader capabilities for data transformation than a direct-path loader: SQL functions can be applied to any column as those values are being loaded. This provides a rich capability for transformations during the data load. However, the conventional-path loader is slower than the direct-path loader. For these reasons, the conventional-path loader should be considered primarily for loading and transforming smaller amounts of data.

See Also: Oracle9i Database Utilities for more information on SQL*Loader

The following is a simple example of a SQL*Loader controlfile to load data into the sales table of the sh sample schema from an external file sh_sales.dat. The external flat file sh_sales.dat consists of sales transaction data, aggregated on a daily level. Not all columns of this external file are loaded into sales. This external file will also be used as source for loading the second fact table of the sh sample schema, which is done using an external table.

The following shows the controlfile (sh_sales.ctl) to load the sales table:

LOAD DATA
INFILE sh_sales.dat
APPEND INTO TABLE sales
FIELDS TERMINATED BY "|"
( PROD_ID, CUST_ID, TIME_ID, CHANNEL_ID, PROMO_ID,
  QUANTITY_SOLD, AMOUNT_SOLD)

It can be loaded with the following command:

$  sqlldr sh/sh control=sh_sales.ctl direct=true

External Tables

Another approach for handling external data sources is using external tables. Oracle9i's external table feature enables you to use external data as a virtual table that can be queried and joined directly and in parallel without requiring the external data to be first loaded in the database. You can then use SQL, PL/SQL, and Java to access the external data.

External tables enable the pipelining of the loading phase with the transformation phase. The transformation process can be merged with the loading process without any interruption of the data streaming. It is no longer necessary to stage the data inside the database for further processing inside the database, such as comparison or transformation. For example, the conversion functionality of a conventional load can be used for a direct-path INSERT AS SELECT statement in conjunction with the SELECT from an external table.

The main difference between external tables and regular tables is that externally organized tables are read-only. No DML operations (UPDATE/INSERT/DELETE) are possible and no indexes can be created on them.

Oracle9i's external tables are a complement to the existing SQL*Loader functionality, and are especially useful for environments where the complete external source has to be joined with existing database objects and transformed in a complex manner, or where the external data volume is large and used only once. SQL*Loader, on the other hand, might still be the better choice for loading of data where additional indexing of the staging table is necessary. This is true for operations where the data is used in independent complex transformations or the data is only partially used in further processing.

See Also: Oracle9i SQL Reference for a complete description of external table syntax and restrictions and Oracle9i Database Utilities for usage examples

You can create an external table named sales_transactions_ext, representing the structure of the complete sales transaction data, represented in the external file sh_sales.dat. The product department is especially interested in a cost analysis on product and time. We thus create a fact table named costs in the sales history schema. The operational source data is the same as for the sales fact table. However, because we are not investigating every dimensional information that is provided, the data in the costs fact table has a coarser granularity than in the sales fact table; for example, all different distribution channels are aggregated.

We cannot load the data into the costs fact table without applying the previously mentioned aggregation of the detailed information, due to the suppression of some of the dimensions.

Oracle's external table framework offers a solution to this. Unlike SQL*Loader, where you would have to load the data before applying the aggregation, you can combine the loading and transformation within a single SQL DML statement, as shown in the following. You do not have to stage the data temporarily before inserting into the target table.

The Oracle object directories must already exist, and point to the directory containing the sh_sales.dat file as well as the directory containing the bad and log files.
CREATE TABLE sales_transactions_ext
(
  PROD_ID NUMBER(6),
  CUST_ID NUMBER,
  TIME_ID DATE,
  CHANNEL_ID CHAR(1),
  PROMO_ID NUMBER(6),
  QUANTITY_SOLD NUMBER(3),
  AMOUNT_SOLD NUMBER(10,2),
  UNIT_COST NUMBER(10,2),
  UNIT_PRICE NUMBER(10,2)
)
ORGANIZATION external
(
  TYPE oracle_loader
  DEFAULT DIRECTORY data_file_dir
  ACCESS PARAMETERS
  (
    RECORDS DELIMITED BY NEWLINE CHARACTERSET US7ASCII
    BADFILE log_file_dir:'sh_sales.bad_xt'
    LOGFILE log_file_dir:'sh_sales.log_xt'
    FIELDS TERMINATED BY "|" LDRTRIM
  )
  location
  (
    'sh_sales.dat'
  )
)REJECT LIMIT UNLIMITED;

The external table can now be used from within the database, accessing some columns of the external data only, grouping the data, and inserting it into the costs fact table:

INSERT /*+ APPEND */ INTO COSTS
(
  TIME_ID,
  PROD_ID,
  UNIT_COST,
  UNIT_PRICE
)
SELECT
  TIME_ID,
  PROD_ID,
  SUM(UNIT_COST),
  SUM(UNIT_PRICE)
FROM sales_transactions_ext
GROUP BY time_id, prod_id;

OCI and Direct-Path APIs


OCI and direct-path APIs are frequently used when the transformation and computation are done outside the database and there is no need for flat file staging.

Export/Import

Export and import are used when the data is inserted as is into the target system. No large volumes of data should be handled and no complex extractions are possible.

See Also: Chapter 11, "Extraction in Data Warehouses" for further information

Transformation Mechanisms


You have the following choices for transforming data inside the database:

Transformation Using SQL
Transformation Using PL/SQL
Transformation Using Table Functions

Transformation Using SQL


Once data is loaded into an Oracle9i database, data transformations can be executed using SQL operations. There are four basic techniques for implementing SQL data transformations within Oracle9i:

CREATE TABLE ... AS SELECT And INSERT /*+APPEND*/ AS SELECT
Transformation Using UPDATE
Transformation Using MERGE
Transformation Using Multitable INSERT

CREATE TABLE ... AS SELECT And INSERT /*+APPEND*/ AS SELECT

The CREATE TABLE ... AS SELECT statement (CTAS) is a powerful tool for manipulating large sets of data. As shown in the following example, many data transformations can be expressed in standard SQL, and CTAS provides a mechanism for efficiently executing a SQL query and storing the results of that query in a new database table. The INSERT /*+APPEND*/ ... AS SELECT statement offers the same capabilities with existing database tables.

In a data warehouse environment, CTAS is typically run in parallel using NOLOGGING mode for best performance.

A simple and common type of data transformation is data substitution. In a data substitution transformation, some or all of the values of a single column are modified. For example, our sales table has a channel_id column. This column indicates whether a given sales transaction was made by a company's own sales force (a direct sale) or by a distributor (an indirect sale).

You may receive data from multiple source systems for your data warehouse. Suppose that one of those source systems processes only direct sales, and thus the source system does not know indirect sales channels. When the data warehouse initially receives sales data from this system, all sales records have a NULL value for the sales.channel_id field. These NULL values must be set to the proper key value. You can do this efficiently by supplying the key value in the SELECT list of the statement that inserts into the target sales table.

The structure of source table sales_activity_direct, followed by the INSERT statement, is as follows:
SQL> DESC sales_activity_direct
Name            Null?    Type
------------    -----    ----------------
SALES_DATE               DATE
PRODUCT_ID               NUMBER
CUSTOMER_ID              NUMBER
PROMOTION_ID             NUMBER
AMOUNT                   NUMBER
QUANTITY                 NUMBER

INSERT /*+ APPEND NOLOGGING PARALLEL */
INTO sales
SELECT product_id, customer_id, TRUNC(sales_date), 'S',
  promotion_id, quantity, amount
FROM  sales_activity_direct;
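The same data substitution can be tried out on a small scale. The sketch below mirrors the statement above in Python's sqlite3 module, used here only as a runnable stand-in: SQLite has no APPEND, NOLOGGING, or PARALLEL hints, and its date() function takes the place of Oracle's TRUNC on a date. The column order of the target sales table and the sample row are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Source table from the direct-sales system; it has no channel column.
conn.execute(
    "CREATE TABLE sales_activity_direct ("
    "sales_date TEXT, product_id INTEGER, customer_id INTEGER, "
    "promotion_id INTEGER, amount REAL, quantity INTEGER)"
)
# Target fact table (column order assumed for this sketch).
conn.execute(
    "CREATE TABLE sales ("
    "prod_id INTEGER, cust_id INTEGER, time_id TEXT, channel_id TEXT, "
    "promo_id INTEGER, quantity_sold INTEGER, amount_sold REAL)"
)
conn.execute(
    "INSERT INTO sales_activity_direct VALUES "
    "('2002-03-01 16:42:10', 13, 987, 999, 1232.16, 1)"
)

# Substitute the constant key 'S' for the missing channel and strip the
# time-of-day portion of the sale timestamp while inserting.
conn.execute(
    "INSERT INTO sales "
    "(prod_id, cust_id, time_id, channel_id, promo_id, quantity_sold, amount_sold) "
    "SELECT product_id, customer_id, date(sales_date), 'S', "
    "       promotion_id, quantity, amount "
    "FROM sales_activity_direct"
)

rows = conn.execute("SELECT * FROM sales").fetchall()
```

Because the substitution happens in the SELECT list, no separate pass over the target table is needed to fill in the channel key afterwards.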

Transformation Using UPDATE

Another technique for implementing a data substitution is to use an UPDATE statement to modify the sales.channel_id column. An UPDATE will provide the correct result. However, if the data substitution transformations require that a very large percentage of the rows (or all of the rows) be modified, then it may be more efficient to use a CTAS statement than an UPDATE.

Transformation Using MERGE

Oracle's merge functionality extends SQL, by introducing the SQL keyword MERGE, in order to provide the ability to update or insert a row conditionally into a table or out of line single table views. Conditions are specified in the ON clause. This is, besides pure bulk loading, one of the most common operations in data warehouse synchronization.

Prior to Oracle9i, merges were expressed either as a sequence of DML statements or as PL/SQL loops operating on each row. Both of these approaches suffer from deficiencies in performance and usability. The new merge functionality overcomes these deficiencies with a new SQL statement. This syntax has been proposed as part of the upcoming SQL standard.

When to Use Merge


There are several benefits of the new MERGE statement as compared with the two other existing approaches.

The entire operation can be expressed much more simply as a single SQL statement.
You can parallelize statements transparently.
You can use bulk DML.
Performance will improve because your statements will require fewer scans of the source table.

Merge Examples

The following discusses various implementations of a merge. The examples assume that new data for the dimension table products is propagated to the data warehouse and has to be either inserted or updated. The table products_delta has the same structure as products.

Example 1 Merge Operation Using SQL in Oracle9i


MERGE INTO products t
USING products_delta s
ON (t.prod_id=s.prod_id)
WHEN MATCHED THEN
UPDATE SET
  t.prod_list_price=s.prod_list_price,
  t.prod_min_price=s.prod_min_price
WHEN NOT MATCHED THEN
INSERT
  (prod_id, prod_name, prod_desc,
   prod_subcategory, prod_subcat_desc, prod_category,
   prod_cat_desc, prod_status, prod_list_price, prod_min_price)
VALUES
  (s.prod_id, s.prod_name, s.prod_desc,
   s.prod_subcategory, s.prod_subcat_desc,
   s.prod_category, s.prod_cat_desc,
   s.prod_status, s.prod_list_price, s.prod_min_price);

Example 2 Merge Operation Using SQL Prior to Oracle9i

A regular join between source products_delta and target products:

UPDATE products t
SET (prod_name, prod_desc, prod_subcategory, prod_subcat_desc, prod_category,
     prod_cat_desc, prod_status, prod_list_price, prod_min_price) =
    (SELECT prod_name, prod_desc, prod_subcategory, prod_subcat_desc,
            prod_category, prod_cat_desc, prod_status, prod_list_price,
            prod_min_price
     FROM products_delta s WHERE s.prod_id=t.prod_id);

An antijoin between source products_delta and target products:

INSERT INTO products t
SELECT * FROM products_delta s
WHERE s.prod_id NOT IN
  (SELECT prod_id FROM products);

The advantage of this approach is its simplicity and lack of new language extensions. The disadvantage is its performance. It requires an extra scan and a join of both the products_delta and the products tables.
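By contrast, the MERGE statement of Example 1 decides between update and insert in a single pass over the delta rows. The following is only a minimal Python sketch of those semantics, not Oracle code: the function name merge_products and the sample rows are hypothetical, and tables are modeled as dictionaries keyed by prod_id.

```python
def merge_products(products, products_delta):
    """Model the MERGE of Example 1 in one pass over the delta rows."""
    merged = {pid: dict(row) for pid, row in products.items()}
    for pid, src in products_delta.items():
        if pid in merged:
            # WHEN MATCHED THEN UPDATE: only the two price columns change
            merged[pid]["prod_list_price"] = src["prod_list_price"]
            merged[pid]["prod_min_price"] = src["prod_min_price"]
        else:
            # WHEN NOT MATCHED THEN INSERT: the whole source row is added
            merged[pid] = dict(src)
    return merged


# Hypothetical sample data
products = {
    1: {"prod_name": "Shirt", "prod_list_price": 20.0, "prod_min_price": 15.0},
}
products_delta = {
    1: {"prod_name": "Shirt v2", "prod_list_price": 22.0, "prod_min_price": 16.0},
    2: {"prod_name": "Socks", "prod_list_price": 5.0, "prod_min_price": 4.0},
}
result = merge_products(products, products_delta)
```

Note that the matched row keeps its original prod_name, because the UPDATE branch of Example 1 sets only the list and minimum prices.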

Example 3 Pre-9i Merge Using PL/SQL

CREATE OR REPLACE PROCEDURE merge_proc
IS
  CURSOR cur IS
  SELECT prod_id, prod_name, prod_desc, prod_subcategory, prod_subcat_desc,
         prod_category, prod_cat_desc, prod_status, prod_list_price,
         prod_min_price
  FROM products_delta;
  crec cur%rowtype;
BEGIN
  OPEN cur;
  LOOP
    FETCH cur INTO crec;
    EXIT WHEN cur%notfound;
    -- try the update first
    UPDATE products SET
          prod_name = crec.prod_name, prod_desc = crec.prod_desc,
          prod_subcategory = crec.prod_subcategory,
          prod_subcat_desc = crec.prod_subcat_desc,
          prod_category = crec.prod_category,
          prod_cat_desc = crec.prod_cat_desc,
          prod_status = crec.prod_status,
          prod_list_price = crec.prod_list_price,
          prod_min_price = crec.prod_min_price
      WHERE crec.prod_id = prod_id;
    -- no row was updated, so insert the delta row
    IF SQL%notfound THEN
    INSERT INTO products
    (prod_id, prod_name, prod_desc, prod_subcategory,
     prod_subcat_desc, prod_category,
     prod_cat_desc, prod_status, prod_list_price, prod_min_price)
    VALUES
    (crec.prod_id, crec.prod_name, crec.prod_desc, crec.prod_subcategory,
     crec.prod_subcat_desc, crec.prod_category,
     crec.prod_cat_desc, crec.prod_status, crec.prod_list_price,
     crec.prod_min_price);
    END IF;
  END LOOP;
  CLOSE cur;
END merge_proc;
/

Transformation Using Multitable INSERT

Many times, external data sources have to be segregated based on logical attributes for insertion into different target objects. It is also common in data warehouse environments to fan out the same source data into several target objects. Multitable inserts provide a new SQL statement for these kinds of transformations, where data can end up either in several targets or in exactly one target, depending on the business transformation rules. This insertion can be done conditionally based on business rules or unconditionally.

It offers the benefits of the INSERT ... SELECT statement when multiple tables are involved as targets. In doing so, it avoids the drawbacks of the alternatives available to you using functionality prior to Oracle9i. You either had to deal with n independent INSERT ... SELECT statements, thus processing the same source data n times and increasing the transformation workload n times. Alternatively, you had to choose a procedural approach with a per-row determination of how to handle the insertion. This solution lacked direct access to high-speed access paths available in SQL.

As with the existing INSERT ... SELECT statement, the new statement can be parallelized and used with the direct-load mechanism for faster performance.

Example 13-1 Unconditional Insert

The following statement aggregates the transactional sales information, stored in sales_activity_direct, on a daily basis and inserts into both the sales and the costs fact table for the current day.

INSERT ALL
   INTO sales VALUES (product_id, customer_id, today, 'S', promotion_id,
                      quantity_per_day, amount_per_day)
   INTO costs VALUES (product_id, today, product_cost, product_price)
SELECT TRUNC(s.sales_date) AS today,
   s.product_id, s.customer_id, s.promotion_id,
   SUM(s.amount_sold) AS amount_per_day, SUM(s.quantity) quantity_per_day,
   p.product_cost, p.product_price
   FROM sales_activity_direct s, product_information p
   WHERE s.product_id = p.product_id
   AND trunc(sales_date)=trunc(sysdate)
   GROUP BY trunc(sales_date), s.product_id,
            s.customer_id, s.promotion_id, p.product_cost, p.product_price;

Example 13-2 Conditional ALL Insert

The following statement inserts a row into the sales and cost tables for all sales transactions with a valid promotion and stores the information about multiple identical orders of a customer in a separate table cum_sales_activity. It is possible two rows will be inserted for some sales transactions, and none for others.

INSERT ALL
WHEN promotion_id IN (SELECT promo_id FROM promotions) THEN
   INTO sales VALUES (product_id, customer_id, today, 'S', promotion_id,
                      quantity_per_day, amount_per_day)
   INTO costs VALUES (product_id, today, product_cost, product_price)
WHEN num_of_orders > 1 THEN
   INTO cum_sales_activity VALUES (today, product_id, customer_id,
                                   promotion_id, quantity_per_day,
                                   amount_per_day, num_of_orders)
SELECT TRUNC(s.sales_date) AS today, s.product_id, s.customer_id,
       s.promotion_id, SUM(s.amount) AS amount_per_day, SUM(s.quantity)
       quantity_per_day, COUNT(*) num_of_orders,
       p.product_cost, p.product_price
FROM sales_activity_direct s, product_information p
WHERE s.product_id = p.product_id
AND TRUNC(sales_date) = TRUNC(sysdate)
GROUP BY TRUNC(sales_date), s.product_id, s.customer_id,
         s.promotion_id, p.product_cost, p.product_price;

Example 13-3 Conditional FIRST Insert

The following statement inserts into an appropriate shipping manifest according to the total quantity and the weight of a product order. High-value orders are an exception: they are sent by express, unless the order already qualifies for large freight shipping. The statement assumes the existence of appropriate tables large_freight_shipping, express_shipping, and default_shipping.

INSERT FIRST
   WHEN (sum_quantity_sold > 10 AND prod_weight_class < 5) OR
        (sum_quantity_sold > 5 AND prod_weight_class > 5) THEN
      INTO large_freight_shipping VALUES
          (time_id, cust_id, prod_id, prod_weight_class, sum_quantity_sold)
   WHEN sum_amount_sold > 1000 THEN
      INTO express_shipping VALUES
          (time_id, cust_id, prod_id, prod_weight_class,
           sum_amount_sold, sum_quantity_sold)
   ELSE
      INTO default_shipping VALUES
          (time_id, cust_id, prod_id, sum_quantity_sold)
SELECT s.time_id, s.cust_id, s.prod_id, p.prod_weight_class,
       SUM(amount_sold) AS sum_amount_sold,
       SUM(quantity_sold) AS sum_quantity_sold
FROM sales s, products p
WHERE s.prod_id = p.prod_id
AND s.time_id = TRUNC(sysdate)
GROUP BY s.time_id, s.cust_id, s.prod_id, p.prod_weight_class;

Example 13-4 Mixed Conditional and Unconditional Insert

The following example inserts new customers into the customers table and stores all new customers with a cust_credit_limit of 4500 or higher in an additional, separate table for further promotions.

INSERT FIRST
  WHEN cust_credit_limit >= 4500 THEN
     INTO customers
     INTO customers_special VALUES (cust_id, cust_credit_limit)
  ELSE
     INTO customers
SELECT * FROM customers_new;
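The difference between INSERT ALL and INSERT FIRST in the conditional examples above can be sketched outside the database. The Python below is only an illustration under stated assumptions: the function multitable_insert, the row layout, and the sample orders are hypothetical, and targets are modeled as plain lists. The thresholds are taken from Example 13-3.

```python
def multitable_insert(rows, branches, else_target=None, first=False):
    """Route rows into named target lists.

    branches is an ordered list of (predicate, [target names]) pairs.
    first=False models INSERT ALL (every true WHEN branch fires);
    first=True models INSERT FIRST (only the first true branch fires).
    The ELSE target receives rows for which no WHEN branch is true.
    """
    targets = {}
    for row in rows:
        matched = False
        for predicate, names in branches:
            if predicate(row):
                matched = True
                for name in names:
                    targets.setdefault(name, []).append(row["id"])
                if first:
                    break
        if not matched and else_target is not None:
            targets.setdefault(else_target, []).append(row["id"])
    return targets


def is_large_freight(r):
    return (r["qty"] > 10 and r["weight"] < 5) or (r["qty"] > 5 and r["weight"] > 5)


branches = [
    (is_large_freight, ["large_freight_shipping"]),
    (lambda r: r["amount"] > 1000, ["express_shipping"]),
]
orders = [
    {"id": 1, "qty": 12, "weight": 3, "amount": 2000},  # large and high value
    {"id": 2, "qty": 2, "weight": 1, "amount": 1500},   # high value only
    {"id": 3, "qty": 1, "weight": 1, "amount": 10},     # neither
]
routed_first = multitable_insert(orders, branches, "default_shipping", first=True)
routed_all = multitable_insert(orders, branches, "default_shipping", first=False)
```

With first=True, order 1 lands only in large_freight_shipping. With first=False it also lands in express_shipping, which is the behavior to expect from INSERT ALL.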

Transformation Using PL/SQL

In a data warehouse environment, you can use procedural languages such as PL/SQL to implement complex transformations in the Oracle9i database. Whereas CTAS operates on entire tables and emphasizes parallelism, PL/SQL provides a row-based approach and can accommodate very sophisticated transformation rules. For example, a PL/SQL procedure could open multiple cursors and read data from multiple source tables, combine this data using complex business rules, and finally insert the transformed data into one or more target tables. It would be difficult or impossible to express the same sequence of operations using standard SQL statements.

Using a procedural language, a specific transformation (or number of transformation steps) within a complex ETL processing can be encapsulated, reading data from an intermediate staging area and generating a new table object as output. A previously generated transformation input table and a subsequent transformation will consume the table generated by this specific transformation. Alternatively, these encapsulated transformation steps within the complete ETL process can be integrated seamlessly, thus streaming sets of rows between each other without the necessity of intermediate staging. You can use Oracle9i's table functions to implement such behavior.

Transformation Using Table Functions

Oracle9i's table functions provide the support for pipelined and parallel execution of transformations implemented in PL/SQL, C, or Java. Scenarios as mentioned earlier can be done without requiring the use of intermediate staging tables, which interrupt the data flow through various transformation steps.

What is a Table Function?

A table function is defined as a function that can produce a set of rows as output. Additionally, table functions can take a set of rows as input. Prior to Oracle9i, PL/SQL functions:

Could not take cursors as input
Could not be parallelized or pipelined

Starting with Oracle9i, functions are not limited in these ways. Table functions extend database functionality by allowing:

Multiple rows to be returned from a function
Results of SQL subqueries (that select multiple rows) to be passed directly to functions
Functions to take cursors as input
Functions to be parallelized
Result sets to be returned incrementally for further processing as soon as they are created. This is called incremental pipelining

Table functions can be defined in PL/SQL using a native PL/SQL interface, or in Java or C using the Oracle Data Cartridge Interface (ODCI).

See Also: PL/SQL User's Guide and Reference for further information and Oracle9i Data Cartridge Developer's Guide

Figure 13-3 illustrates a typical aggregation where you input a set of rows and output a set of rows, in that case, after performing a SUM operation.

Figure 13-3 Table Function Example

The pseudocode for this operation would be similar to:

INSERT INTO out SELECT * FROM ("Table Function"(SELECT * FROM in));

The table function takes the result of the SELECT on In as input and delivers a set of records in a different format as output for a direct insertion into Out.

Additionally, a table function can fan out data within the scope of an atomic transaction. This can be used for many occasions like an efficient logging mechanism or a fan out for other independent transformations. In such a scenario, a single staging table will be needed.

Figure 13-4 Pipelined Parallel Transformation with Fanout

The pseudocode for this would be similar to:

INSERT INTO target SELECT * FROM (tf2(SELECT *
FROM (tf1(SELECT * FROM source))));

This will insert into target and, as part of tf1, into Stage Table 1 within the scope of an atomic transaction.

INSERT INTO target SELECT * FROM tf3(SELECT * FROM stage_table1);
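The streaming and fan-out behavior described by the pseudocode can be imitated with Python generators. This is a loose analogy rather than Oracle behavior: tf1 and tf2 here are hypothetical Python functions, the staging table is a list, and the sample rows are invented for illustration.

```python
def tf1(source, stage_table):
    """First stage: record each row id in a staging list, then pass the row on."""
    for row in source:
        stage_table.append(row["prod_id"])  # fan-out side effect
        yield dict(row, prod_status=row["prod_status"].lower())


def tf2(rows):
    """Second stage: keep only obsolete products."""
    for row in rows:
        if row["prod_status"] == "obsolete":
            yield row


source = [
    {"prod_id": 1, "prod_status": "OBSOLETE"},
    {"prod_id": 2, "prod_status": "available"},
    {"prod_id": 3, "prod_status": "Obsolete"},
]
stage_table1 = []
pipeline = tf2(tf1(source, stage_table1))

first = next(pipeline)                  # pulls exactly one row through both stages
staged_after_first = list(stage_table1)
target = [first] + list(pipeline)       # drain the remaining rows
```

After the first row is pulled, only one id has reached the staging list, which is the point of pipelining: downstream stages consume rows as soon as they are produced, with no intermediate table between tf1 and tf2.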

Example 13-5 Table Functions Fundamentals

The following examples demonstrate the fundamentals of table functions, without the usage of complex business rules implemented inside those functions. They are chosen for demonstration purposes only, and are all implemented in PL/SQL.

Table functions return sets of records and can take cursors as input. Besides the Sales History schema, you have to set up the following database objects before using the examples:

REM object types
CREATE TYPE product_t AS OBJECT (
    prod_id              NUMBER(6),
    prod_name            VARCHAR2(50),
    prod_desc            VARCHAR2(4000),
    prod_subcategory     VARCHAR2(50),
    prod_subcat_desc     VARCHAR2(2000),
    prod_category        VARCHAR2(50),
    prod_cat_desc        VARCHAR2(2000),
    prod_weight_class    NUMBER(2),
    prod_unit_of_measure VARCHAR2(20),
    prod_pack_size       VARCHAR2(30),
    supplier_id          NUMBER(6),
    prod_status          VARCHAR2(20),
    prod_list_price      NUMBER(8,2),
    prod_min_price       NUMBER(8,2)
);
/
CREATE TYPE product_t_table AS TABLE OF product_t;
/
COMMIT;

REM package of all cursor types
REM we have to handle the input cursor type and the output cursor collection
REM type
CREATE OR REPLACE PACKAGE cursor_PKG as
  TYPE product_t_rec IS RECORD (
     prod_id              NUMBER(6),
     prod_name            VARCHAR2(50),
     prod_desc            VARCHAR2(4000),
     prod_subcategory     VARCHAR2(50),
     prod_subcat_desc     VARCHAR2(2000),
     prod_category        VARCHAR2(50),
     prod_cat_desc        VARCHAR2(2000),
     prod_weight_class    NUMBER(2),
     prod_unit_of_measure VARCHAR2(20),
     prod_pack_size       VARCHAR2(30),
     supplier_id          NUMBER(6),
     prod_status          VARCHAR2(20),
     prod_list_price      NUMBER(8,2),
     prod_min_price       NUMBER(8,2));
  TYPE product_t_rectab IS TABLE OF product_t_rec;
  TYPE strong_refcur_t IS REF CURSOR RETURN product_t_rec;
  TYPE refcur_t IS REF CURSOR;
END;
/

REM artificial help table, used to demonstrate figure 13-4
CREATE TABLE obsolete_products_errors (prod_id NUMBER, msg VARCHAR2(2000));

The following example demonstrates simple filtering; it shows all obsolete products except the prod_category Boys. The table function returns the result set as a set of records and uses a weakly typed ref cursor as input.

CREATE OR REPLACE FUNCTION obsolete_products(cur cursor_pkg.refcur_t)
   RETURN product_t_table
IS
    prod_id              NUMBER(6);
    prod_name            VARCHAR2(50);
    prod_desc            VARCHAR2(4000);
    prod_subcategory     VARCHAR2(50);
    prod_subcat_desc     VARCHAR2(2000);
    prod_category        VARCHAR2(50);
    prod_cat_desc        VARCHAR2(2000);
    prod_weight_class    NUMBER(2);
    prod_unit_of_measure VARCHAR2(20);
    prod_pack_size       VARCHAR2(30);
    supplier_id          NUMBER(6);
    prod_status          VARCHAR2(20);
    prod_list_price      NUMBER(8,2);
    prod_min_price       NUMBER(8,2);
  sales NUMBER:=0;
  objset product_t_table := product_t_table();
  i NUMBER := 0;
BEGIN
   LOOP
     -- Fetch from cursor variable
     FETCH cur INTO prod_id, prod_name, prod_desc, prod_subcategory,
       prod_subcat_desc, prod_category, prod_cat_desc, prod_weight_class,
       prod_unit_of_measure, prod_pack_size, supplier_id, prod_status,
       prod_list_price, prod_min_price;
     EXIT WHEN cur%NOTFOUND; -- exit when last row is fetched
     IF prod_status='obsolete' AND prod_category != 'Boys' THEN
       -- append to collection
       i:=i+1;
       objset.extend;
       objset(i):=product_t(prod_id, prod_name, prod_desc, prod_subcategory,
         prod_subcat_desc, prod_category, prod_cat_desc, prod_weight_class,
         prod_unit_of_measure, prod_pack_size, supplier_id, prod_status,
         prod_list_price, prod_min_price);
     END IF;
   END LOOP;
   CLOSE cur;
   RETURN objset;
END;
/

You can use the table function in a SQL statement to show the results. Here we use additional SQL functionality for the output.

SELECT DISTINCT UPPER(prod_category), prod_status
FROM TABLE(obsolete_products(CURSOR(SELECT * FROM products)));

UPPER(PROD_CATEGORY)       PROD_STATUS
--------------------       -----------
GIRLS                      obsolete
MEN                        obsolete

2 rows selected.

The following example implements the same filtering as the first one. The main differences between the two are:

This example uses a strongly typed REF cursor as input and can be parallelized based on the objects of the strongly typed cursor, as shown in one of the following examples.
The table function returns the result set incrementally as soon as records are created.

REM Same example, pipelined implementation
REM strong ref cursor (input type is defined)
REM a table function without a strongly typed input ref cursor cannot be
REM parallelized
REM
CREATE OR REPLACE FUNCTION obsolete_products_pipe(cur cursor_pkg.strong_refcur_t)
RETURN product_t_table
PIPELINED
PARALLEL_ENABLE (PARTITION cur BY ANY) IS
    prod_id              NUMBER(6);
    prod_name            VARCHAR2(50);
    prod_desc            VARCHAR2(4000);
    prod_subcategory     VARCHAR2(50);
    prod_subcat_desc     VARCHAR2(2000);
    prod_category        VARCHAR2(50);
    prod_cat_desc        VARCHAR2(2000);
    prod_weight_class    NUMBER(2);
    prod_unit_of_measure VARCHAR2(20);
    prod_pack_size       VARCHAR2(30);
    supplier_id          NUMBER(6);
    prod_status          VARCHAR2(20);
    prod_list_price      NUMBER(8,2);
    prod_min_price       NUMBER(8,2);
  sales NUMBER:=0;
BEGIN
  LOOP
    -- Fetch from cursor variable
    FETCH cur INTO prod_id, prod_name, prod_desc, prod_subcategory,
      prod_subcat_desc, prod_category, prod_cat_desc, prod_weight_class,
      prod_unit_of_measure, prod_pack_size, supplier_id, prod_status,
      prod_list_price, prod_min_price;
    EXIT WHEN cur%NOTFOUND; -- exit when last row is fetched
    IF prod_status='obsolete' AND prod_category !='Boys' THEN
      -- return the row to the caller immediately
      PIPE ROW (product_t(prod_id, prod_name, prod_desc, prod_subcategory,
        prod_subcat_desc, prod_category, prod_cat_desc, prod_weight_class,
        prod_unit_of_measure, prod_pack_size, supplier_id, prod_status,
        prod_list_price, prod_min_price));
    END IF;
  END LOOP;
  CLOSE cur;
  RETURN;
END;
/

You can use the table function as follows:

SELECT DISTINCT prod_category,
       DECODE(prod_status, 'obsolete', 'NO LONGER AVAILABLE', 'N/A')
FROM TABLE(obsolete_products_pipe(CURSOR(SELECT * FROM products)));

PROD_CATEGORY    DECODE(PROD_STATUS,
-------------    -------------------
Girls            NO LONGER AVAILABLE
Men              NO LONGER AVAILABLE

2 rows selected.

We now change the degree of parallelism for the input table products and issue the same statement again:

ALTER TABLE products PARALLEL 4;

The session statistics show that the statement has been parallelized:

SELECT * FROM V$PQ_SESSTAT WHERE statistic='Queries Parallelized';

STATISTIC              LAST_QUERY  SESSION_TOTAL
--------------------   ----------  -------------
Queries Parallelized            1              3

1 row selected.

Table functions can also fan out results into persistent table structures. This is demonstrated in the next example. The function returns all obsolete products except those of a specific prod_category (default Men), which was set to status obsolete by error. The prod_ids detected as wrong are stored in a separate table structure, and the result set consists of all other obsolete product categories. The example furthermore demonstrates how normal variables can be used in conjunction with table functions:

CREATE OR REPLACE FUNCTION obsolete_products_dml(cur cursor_pkg.strong_refcur_t,
  prod_cat VARCHAR2 DEFAULT 'Men') RETURN product_t_table
PIPELINED
PARALLEL_ENABLE (PARTITION cur BY ANY) IS
    PRAGMA AUTONOMOUS_TRANSACTION;
    prod_id              NUMBER(6);
    prod_name            VARCHAR2(50);
    prod_desc            VARCHAR2(4000);
    prod_subcategory     VARCHAR2(50);
    prod_subcat_desc     VARCHAR2(2000);
    prod_category        VARCHAR2(50);
    prod_cat_desc        VARCHAR2(2000);
    prod_weight_class    NUMBER(2);
    prod_unit_of_measure VARCHAR2(20);
    prod_pack_size       VARCHAR2(30);
    supplier_id          NUMBER(6);
    prod_status          VARCHAR2(20);
    prod_list_price      NUMBER(8,2);
    prod_min_price       NUMBER(8,2);
  sales NUMBER:=0;
BEGIN
  LOOP
    -- Fetch from cursor variable
    FETCH cur INTO prod_id, prod_name, prod_desc, prod_subcategory,
      prod_subcat_desc, prod_category, prod_cat_desc, prod_weight_class,
      prod_unit_of_measure, prod_pack_size, supplier_id, prod_status,
      prod_list_price, prod_min_price;
    EXIT WHEN cur%NOTFOUND; -- exit when last row is fetched
    IF prod_status='obsolete' THEN
      IF prod_category=prod_cat THEN
        -- fan out: log the wrongly obsoleted product
        INSERT INTO obsolete_products_errors VALUES
          (prod_id, 'correction: category '||UPPER(prod_cat)||' still available');
      ELSE
        PIPE ROW (product_t(prod_id, prod_name, prod_desc, prod_subcategory,
          prod_subcat_desc, prod_category, prod_cat_desc, prod_weight_class,
          prod_unit_of_measure, prod_pack_size, supplier_id, prod_status,
          prod_list_price, prod_min_price));
      END IF;
    END IF;
  END LOOP;
  COMMIT;
  CLOSE cur;
  RETURN;
END;
/

The following query shows all obsolete product groups except the prod_category Men, which was wrongly set to status obsolete.

SELECT DISTINCT prod_category, prod_status
FROM TABLE(obsolete_products_dml(CURSOR(SELECT * FROM products)));

PROD_CATEGORY           PROD_STATUS
-------------           -----------
Boys                    obsolete
Girls                   obsolete

2 rows selected.

As you can see, there are some products of the prod_category Men that were obsoleted by accident:

SELECT DISTINCT msg FROM obsolete_products_errors;

MSG
----------------------------------------
correction: category MEN still available

1 row selected.

Taking advantage of the second input variable changes the result set as follows:

SELECT DISTINCT prod_category, prod_status
FROM TABLE(obsolete_products_dml(CURSOR(SELECT * FROM products), 'Boys'));

PROD_CATEGORY    PROD_STATUS
-------------    -----------
Girls            obsolete
Men              obsolete

2 rows selected.

SELECT DISTINCT msg FROM obsolete_products_errors;

MSG
-----------------------------------------
correction: category BOYS still available

1 row selected.

Because table functions can be used like a normal table, they can be nested, as shown in the following:

SELECT DISTINCT prod_category, prod_status
FROM TABLE(obsolete_products_dml(CURSOR(SELECT *
        FROM TABLE(obsolete_products_pipe(CURSOR(SELECT * FROM products))))));

PROD_CATEGORY       PROD_STATUS
-------------       -----------
Girls               obsolete

1 row selected.

Because the table function obsolete_products_pipe filters out all products of the prod_category Boys, our result no longer includes products of the prod_category Boys. The prod_category Men is still set to be obsolete by accident.

SELECT DISTINCT msg FROM obsolete_products_errors;

MSG
----------------------------------------
correction: category MEN still available

The biggest advantage of Oracle9i ETL is its toolkit functionality, where you can combine any of the functionality discussed previously to improve and speed up your ETL processing. For example, you can take an external table as input, join it with an existing table, and use it as input for a parallelized table function to process complex business logic. This table function can be used as the input source for a MERGE operation, thus streaming the new information for the data warehouse, provided in a flat file, through the complete ETL process within one single statement.

Loading and Transformation Scenarios

The following sections offer examples of typical loading and transformation tasks:

Parallel Load Scenario
Key Lookup Scenario
Exception Handling Scenario
Pivoting Scenarios

Parallel Load Scenario

This section presents a case study illustrating how to create, load, index, and analyze a large data warehouse fact table with partitions in a typical star schema. This example uses SQL*Loader to explicitly stripe data over 30 disks.

The example 120 GB table is named facts.
The system is a 10-CPU shared memory computer with more than 100 disk drives.
Thirty disks (4 GB each) are used for base table data, 10 disks for indexes, and 30 disks for temporary space. Additional disks are needed for rollback segments, control files, log files, possible staging area for loader flat files, and so on.
The facts table is partitioned by month into 12 partitions. To facilitate backup and recovery, each partition is stored in its own tablespace.
Each partition is spread evenly over 10 disks, so a scan accessing few partitions or a single partition can proceed with full parallelism. Thus there can be intra-partition parallelism when queries restrict data access by partition pruning.

Each disk has been further subdivided using an operating system utility into 4 operating system files with names like /dev/D1.1, /dev/D1.2, ... , /dev/D30.4.
Four tablespaces are allocated on each group of 10 disks. To better balance I/O and parallelize tablespace creation (because Oracle writes each block in a datafile when it is added to a tablespace), it is best if each of the four tablespaces on each group of 10 disks has its first datafile on a different disk. Thus the first tablespace has /dev/D1.1 as its first datafile, the second tablespace has /dev/D4.2 as its first datafile, and so on, as illustrated in Figure 13-5.

Figure 13-5 Datafile Layout for Parallel Load Example

Step 1: Create the Tablespaces and Add Datafiles in Parallel

The following is the command to create a tablespace named TSfacts1. Other tablespaces are created with analogous commands. On a 10-CPU machine, it should be possible to run all 12 CREATE TABLESPACE statements together. Alternatively, it might be better to run them in two batches of 6 (two from each of the three groups of disks).

CREATE TABLESPACE TSfacts1
DATAFILE '/dev/D1.1'  SIZE 1024M REUSE,
         '/dev/D2.1'  SIZE 1024M REUSE,
         '/dev/D3.1'  SIZE 1024M REUSE,
         '/dev/D4.1'  SIZE 1024M REUSE,
         '/dev/D5.1'  SIZE 1024M REUSE,
         '/dev/D6.1'  SIZE 1024M REUSE,
         '/dev/D7.1'  SIZE 1024M REUSE,
         '/dev/D8.1'  SIZE 1024M REUSE,
         '/dev/D9.1'  SIZE 1024M REUSE,
         '/dev/D10.1' SIZE 1024M REUSE
DEFAULT STORAGE (INITIAL 100M NEXT 100M PCTINCREASE 0);
...
CREATE TABLESPACE TSfacts2
DATAFILE '/dev/D4.2'  SIZE 1024M REUSE,
         '/dev/D5.2'  SIZE 1024M REUSE,
         '/dev/D6.2'  SIZE 1024M REUSE,
         '/dev/D7.2'  SIZE 1024M REUSE,
         '/dev/D8.2'  SIZE 1024M REUSE,
         '/dev/D9.2'  SIZE 1024M REUSE,
         '/dev/D10.2' SIZE 1024M REUSE,
         '/dev/D1.2'  SIZE 1024M REUSE,
         '/dev/D2.2'  SIZE 1024M REUSE,
         '/dev/D3.2'  SIZE 1024M REUSE
DEFAULT STORAGE (INITIAL 100M NEXT 100M PCTINCREASE 0);
...
CREATE TABLESPACE TSfacts4
DATAFILE '/dev/D10.4' SIZE 1024M REUSE,
         '/dev/D1.4'  SIZE 1024M REUSE,
         '/dev/D2.4'  SIZE 1024M REUSE,
         '/dev/D3.4'  SIZE 1024M REUSE,
         '/dev/D4.4'  SIZE 1024M REUSE,
         '/dev/D5.4'  SIZE 1024M REUSE,
         '/dev/D6.4'  SIZE 1024M REUSE,
         '/dev/D7.4'  SIZE 1024M REUSE,
         '/dev/D8.4'  SIZE 1024M REUSE,
         '/dev/D9.4'  SIZE 1024M REUSE
DEFAULT STORAGE (INITIAL 100M NEXT 100M PCTINCREASE 0);
...
CREATE TABLESPACE TSfacts12
DATAFILE '/dev/D30.4' SIZE 1024M REUSE,
         '/dev/D21.4' SIZE 1024M REUSE,
         '/dev/D22.4' SIZE 1024M REUSE,
         '/dev/D23.4' SIZE 1024M REUSE,
         '/dev/D24.4' SIZE 1024M REUSE,
         '/dev/D25.4' SIZE 1024M REUSE,
         '/dev/D26.4' SIZE 1024M REUSE,
         '/dev/D27.4' SIZE 1024M REUSE,
         '/dev/D28.4' SIZE 1024M REUSE,
         '/dev/D29.4' SIZE 1024M REUSE
DEFAULT STORAGE (INITIAL 100M NEXT 100M PCTINCREASE 0);
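The rotation of first datafiles across disks follows a regular pattern that can be generated rather than typed by hand. The Python sketch below reproduces the datafile order of the listings above; the function name datafiles is hypothetical, and the rotation step of 3 disks is an assumption inferred from the starting disks shown for TSfacts1, TSfacts2, TSfacts4, and TSfacts12 (the listing for TSfacts3 and the other tablespaces is not given in the text).

```python
DISKS_PER_GROUP = 10
TABLESPACES_PER_GROUP = 4
START_STEP = 3  # assumption: inferred from the D1, D4, ..., D10 starting disks


def datafiles(ts_number):
    """Return the ordered datafile names for tablespace TSfacts<ts_number>."""
    group, slot = divmod(ts_number - 1, TABLESPACES_PER_GROUP)
    base = group * DISKS_PER_GROUP   # first disk of the group, minus one
    start = slot * START_STEP        # rotation offset inside the group
    piece = slot + 1                 # which of the 4 files on each disk
    return [
        f"/dev/D{base + (start + i) % DISKS_PER_GROUP + 1}.{piece}"
        for i in range(DISKS_PER_GROUP)
    ]


layout = {n: datafiles(n) for n in range(1, 13)}
```

Because every tablespace in a group starts on a different disk, creating the four tablespaces of a group concurrently begins writing on four different devices.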

Extent sizes in the STORAGE clause should be multiples of the multiblock read size, where blocksize * MULTIBLOCK_READ_COUNT = multiblock read size.

INITIAL and NEXT should normally be set to the same value. In the case of parallel load, make the extent size large enough to keep the number of extents reasonable, and to avoid excessive overhead and serialization due to bottlenecks in the data dictionary. When PARALLEL=TRUE is used for parallel loader, the INITIAL extent is not used. In this case you can override the INITIAL extent size specified in the tablespace default storage clause with the value specified in the loader control file, for example, 64KB.

Tables or indexes can have an unlimited number of extents, provided you have set the COMPATIBLE initialization parameter to match the current release number, and use the MAXEXTENTS keyword on the CREATE or ALTER statement for the tablespace or object. In practice, however, a limit of 10,000 extents for each object is reasonable. Because a table or index can have an unlimited number of extents, set the PCTINCREASE parameter to zero to have extents of equal size.

Note: If possible, do not allocate extents faster than about 2 or 3 for each minute. Thus, each process should get an extent that lasts for 3 to 5 minutes. Normally, such an extent is at least 50 MB for a large object. Too small an extent size incurs significant overhead, which affects performance and scalability of parallel operations. The largest possible extent size for a 4 GB disk evenly divided into 4 partitions is 1 GB. 100 MB extents should perform well. Each partition will have 100 extents. You can then customize the default storage parameters for each object created in the tablespace, if needed.

Step 2: Create the Partitioned Table

We create a partitioned table with 12 partitions, each in its own tablespace. The table contains multiple dimensions and multiple measures. The partitioning column is named dim_2 and is a date. There are other columns as well.
CREATE TABLE facts (dim_1 NUMBER, dim_2 DATE, ...
  meas_1 NUMBER, meas_2 NUMBER, ... )
PARALLEL
PARTITION BY RANGE (dim_2)
(PARTITION jan95 VALUES LESS THAN ('02-01-1995') TABLESPACE TSfacts1,
 PARTITION feb95 VALUES LESS THAN ('03-01-1995') TABLESPACE TSfacts2,
 ...
 PARTITION dec95 VALUES LESS THAN ('01-01-1996') TABLESPACE TSfacts12);
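The twelve partition clauses follow a fixed pattern: each month's upper bound is the first day of the following month, and partition n goes into tablespace TSfacts n. As a convenience, they could be generated with a small script. The Python below is only a sketch; the helper names month_partitions and partition_clauses are hypothetical, and the date strings use the same MM-DD-YYYY layout as the statement above.

```python
from datetime import date

MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]


def month_partitions(year, tablespace_prefix="TSfacts"):
    """Return (partition name, exclusive upper bound, tablespace) per month."""
    parts = []
    for month in range(1, 13):
        name = f"{MONTHS[month - 1]}{year % 100:02d}"
        # first day of the following month; December rolls over to next year
        upper = date(year + month // 12, month % 12 + 1, 1)
        parts.append((name, upper.strftime("%m-%d-%Y"), f"{tablespace_prefix}{month}"))
    return parts


def partition_clauses(year):
    """Render the PARTITION clauses in the style of the CREATE TABLE statement."""
    return [
        f"PARTITION {name} VALUES LESS THAN ('{upper}') TABLESPACE {ts}"
        for name, upper, ts in month_partitions(year)
    ]


clauses = partition_clauses(1995)
```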

Step 3: Load the Partitions in Parallel

This section describes four alternative approaches to loading partitions in parallel. The different approaches to loading help you manage the ramifications of the PARALLEL=TRUE keyword of SQL*Loader that controls whether individual partitions are loaded in parallel. The PARALLEL keyword entails the following restrictions:

Indexes cannot be defined.
You must set a small initial extent, because each loader session gets a new extent when it begins, and it does not use any existing space associated with the object.
Space fragmentation issues arise.

However, regardless of the setting of this keyword, if you have one loader process for each partition, you are still effectively loading into the table in parallel.

Example 13-6 Loading Partitions in Parallel Case 1

In this approach, assume 12 input files are partitioned in the same way as your table. You have one input file for each partition of the table to be loaded. You start 12 SQL*Loader sessions concurrently in parallel, entering statements like these:

SQLLDR DATA=jan95.dat DIRECT=TRUE CONTROL=jan95.ctl
SQLLDR DATA=feb95.dat DIRECT=TRUE CONTROL=feb95.ctl
. . .
SQLLDR DATA=dec95.dat DIRECT=TRUE CONTROL=dec95.ctl

In the example, the keyword PARALLEL=TRUE is not set. A separate control file for each partition is necessary because the control file must specify the partition into which the loading should be done. It contains a statement such as the following:

LOAD INTO facts partition(jan95)

The advantage of this approach is that local indexes are maintained by SQL*Loader. You still get parallel loading, but on a partition level, without the restrictions of the PARALLEL keyword.

A disadvantage is that you must manually partition the input prior to loading.

Example 13-7 Loading Partitions in Parallel Case 2

In another common approach, assume an arbitrary number of input files that are not partitioned in the same way as the table. You can adopt a strategy of performing parallel load for each input file individually. Thus if there are seven input files, you can start seven SQL*Loader sessions, using statements like the following:

SQLLDR DATA=file1.dat DIRECT=TRUE PARALLEL=TRUE

Oracle partitions the input data so that it goes into the correct partitions. In this case all the loader sessions can share the same control file, so there is no need to mention it in the statement.

The keyword PARALLEL=TRUE must be used, because each of the seven loader sessions can write into every partition. In Case 1, every loader session would write into only one partition, because the data was partitioned prior to loading. Hence all the PARALLEL keyword restrictions are in effect.

In this case, Oracle attempts to spread the data evenly across all the files in each of the 12 tablespaces; however, an even spread of data is not guaranteed. Moreover, there could be I/O contention during the load when the loader processes are attempting to write to the same device simultaneously.

Example 13-8 Loading Partitions in Parallel Case 3

In this example, you want precise control over the load. To achieve this, you must partition the input data in the same way as the datafiles are partitioned in Oracle.

This example uses 10 processes loading into 30 disks. To accomplish this, you must split the input into 120 files beforehand. The 10 processes will load the first partition in parallel on the first 10 disks, then the second partition in parallel on the second 10 disks, and so on through the 12th partition. You then run the following commands concurrently as background processes:

SQLLDR DATA=jan95.file1.dat DIRECT=TRUE PARALLEL=TRUE FILE=/dev/D1.1
...
SQLLDR DATA=jan95.file10.dat DIRECT=TRUE PARALLEL=TRUE FILE=/dev/D10.1
WAIT;
...
SQLLDR DATA=dec95.file1.dat DIRECT=TRUE PARALLEL=TRUE FILE=/dev/D30.4
...
SQLLDR DATA=dec95.file10.dat DIRECT=TRUE PARALLEL=TRUE FILE=/dev/D29.4

For Oracle Real Application Clusters, divide the loader sessions evenly among the nodes. The datafile being read should always reside on the same node as the loader session. The keyword PARALLEL=TRUE must be used, because multiple loader sessions can write into the same partition. Hence all the restrictions entailed by the PARALLEL keyword are in effect. An advantage of this approach, however, is that it guarantees that all of the data is precisely balanced, exactly reflecting your partitioning.

Note: Although this example shows parallel load used with partitioned tables, the two features can be used independent of one another.

Example 13-9 Loading Partitions in Parallel Case 4

For this approach, all partitions must be in the same tablespace. You need to have the same number of input files as datafiles in the tablespace, but you do not need to partition the input the same way in which the table is partitioned.

For example, if all 30 devices were in the same tablespace, then you would arbitrarily partition your input data into 30 files, then start 30 SQL*Loader sessions in parallel. The statement starting up the first session would be similar to the following:

SQLLDR DATA=file1.dat DIRECT=TRUE PARALLEL=TRUE FILE=/dev/D1
. . .
SQLLDR DATA=file30.dat DIRECT=TRUE PARALLEL=TRUE FILE=/dev/D30

The advantage of this approach is that, as in Case 3, you have control over the exact placement of datafiles because you use the FILE keyword. However, you are not required to partition the input data by value because Oracle does that for you. A disadvantage is that this approach requires all the partitions to be in the same tablespace. This minimizes availability.

Example 13-10 Loading External Data

This is probably the most basic use of external tables where the data volume is large and no transformations are applied to the external data. The load process is performed as follows:

1. You create the external table. Most likely, the table will be declared as parallel to perform the load in parallel. Oracle will dynamically perform load balancing between the parallel execution servers involved in the query.

2. Once the external table is created (remember that this only creates the metadata in the dictionary), data can be converted, moved and loaded into the database using either a PARALLEL CREATE TABLE AS SELECT or a PARALLEL INSERT statement.

CREATE TABLE products_ext
 (prod_id NUMBER, prod_name VARCHAR2(50), ...,
  price NUMBER(6,2), discount NUMBER(6,2))
ORGANIZATION EXTERNAL
(
  DEFAULT DIRECTORY (stage_dir)
  ACCESS PARAMETERS
  ( RECORDS FIXED 30
    BADFILE 'bad/bad_products_ext'
    LOGFILE 'log/log_products_ext'
    ( prod_id POSITION (1-8) CHAR,
      prod_name POSITION (*,+50) CHAR,
      prod_desc  POSITION (*,+200) CHAR,
      . . .)
  REMOTE_LOCATION ('new/new_prod1.txt','new/new_prod2.txt'))
PARALLEL 5
REJECT LIMIT 200;
-- load it in the database using a parallel insert
ALTER SESSION ENABLE PARALLEL DML;
INSERT INTO products SELECT * FROM products_ext;

In this example, stage_dir is a directory where the external flat files reside.

Note that loading data in parallel can be performed in Oracle9i by using SQL*Loader. But external tables are probably easier to use and the parallel load is automatically coordinated. Unlike SQL*Loader, dynamic load balancing between parallel execution servers will be performed as well because there will be intra-file parallelism. The latter implies that the user will not have to manually split input files before starting the parallel load. This will be accomplished dynamically.

Key Lookup Scenario


Another simple transformation is a key lookup. For example, suppose that sales transaction data has been loaded into a retail data warehouse. Although the data warehouse's sales table contains a product_id column, the sales transaction data extracted from the source system contains Uniform Price Codes (UPC) instead of product IDs. Therefore, it is necessary to transform the UPC codes into product IDs before the new sales transaction data can be inserted into the sales table. In order to execute this transformation, a lookup table must relate the product_id values to the UPC codes. This table might be the product dimension table, or perhaps another table in the data warehouse that has been created specifically to support this transformation. For this example, we assume that there is a table named product, which has a product_id and an upc_code column. This data substitution transformation can be implemented using the following CTAS statement:

CREATE TABLE temp_sales_step2
   NOLOGGING PARALLEL AS
   SELECT
      sales_transaction_id,
      product.product_id sales_product_id,
      sales_customer_id,
      sales_time_id,
      sales_channel_id,
      sales_quantity_sold,
      sales_dollar_amount
   FROM  temp_sales_step1, product
   WHERE temp_sales_step1.upc_code = product.upc_code;

This CTAS statement will convert each valid UPC code to a valid product_id value. If the ETL process has guaranteed that each UPC code is valid, then this statement alone may be sufficient to implement the entire transformation.

Exception Handling Scenario


In the preceding example, if you must also handle new sales data that does not have valid UPC codes, you can use an additional CTAS statement to identify the invalid rows:

CREATE TABLE temp_sales_step1_invalid NOLOGGING PARALLEL AS
   SELECT * FROM temp_sales_step1
   WHERE temp_sales_step1.upc_code NOT IN (SELECT upc_code FROM product);

This invalid data is now stored in a separate table, temp_sales_step1_invalid, and can be handled separately by the ETL process. Another way to handle invalid data is to modify the original CTAS to use an outer join:

CREATE TABLE temp_sales_step2
   NOLOGGING PARALLEL AS
   SELECT
        sales_transaction_id,
        product.product_id sales_product_id,
        sales_customer_id,
        sales_time_id,
        sales_channel_id,
        sales_quantity_sold,
        sales_dollar_amount
   FROM  temp_sales_step1, product
   WHERE temp_sales_step1.upc_code = product.upc_code (+);

Using this outer join, the sales transactions that originally contained invalid UPC codes will be assigned a product_id of NULL. These transactions can be handled later. Additional approaches to handling invalid UPC codes exist. Some data warehouses may choose to insert null-valued product_id values into their sales table, while other data warehouses may not allow any new data from the entire batch to be inserted into the sales table until all invalid UPC codes have been addressed. The correct approach is determined by the business requirements of the data warehouse. Regardless of the specific requirements, exception handling can be addressed by the same basic SQL techniques as transformations.
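The lookup and the two exception-handling variants can be tried out on a small scale. The following is only a sketch: it uses Python's standard sqlite3 module as a stand-in for Oracle, ANSI join syntax instead of the (+) operator, and made-up sample rows (the UPC strings and IDs are assumptions, not data from this guide).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (product_id INTEGER, upc_code TEXT);
    CREATE TABLE temp_sales_step1 (sales_transaction_id INTEGER, upc_code TEXT);
    -- hypothetical sample data: transaction 3 carries an unknown UPC
    INSERT INTO product VALUES (10, 'UPC-A'), (20, 'UPC-B');
    INSERT INTO temp_sales_step1 VALUES (1, 'UPC-A'), (2, 'UPC-B'), (3, 'UPC-X');
""")

# Key lookup with an inner join: rows without a matching UPC are dropped.
valid = conn.execute("""
    SELECT s.sales_transaction_id, p.product_id
    FROM temp_sales_step1 s JOIN product p ON s.upc_code = p.upc_code
    ORDER BY s.sales_transaction_id
""").fetchall()

# Exception handling, variant 1: collect the rows whose UPC is unknown.
invalid = conn.execute("""
    SELECT sales_transaction_id FROM temp_sales_step1
    WHERE upc_code NOT IN (SELECT upc_code FROM product)
""").fetchall()

# Exception handling, variant 2: outer join keeps every row and
# leaves product_id as NULL (None in Python) when the lookup fails.
outer = conn.execute("""
    SELECT s.sales_transaction_id, p.product_id
    FROM temp_sales_step1 s LEFT OUTER JOIN product p ON s.upc_code = p.upc_code
    ORDER BY s.sales_transaction_id
""").fetchall()

print(valid, invalid, outer)
```

One point worth keeping in mind with the NOT IN variant: if the subquery can return a NULL upc_code, standard SQL semantics make NOT IN match no rows at all, so the lookup column should be free of NULLs or the query should filter them out.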

Pivoting Scenarios

A data warehouse can receive data from many different sources. Some of these source systems may not be relational databases and may store data in very different formats from the data warehouse. For example, suppose that you receive a set of sales records from a nonrelational database having the form:

product_id, customer_id, weekly_start_date, sales_sun, sales_mon, sales_tue,
  sales_wed, sales_thu, sales_fri, sales_sat

The input table looks like this:

SELECT * FROM sales_input_table;

PRODUCT_ID CUSTOMER_ID WEEKLY_ST  SALES_SUN  SALES_MON  SALES_TUE  SALES_WED  SALES_THU  SALES_FRI  SALES_SAT
---------- ----------- --------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
       111         222 01-OCT-00        100        200        300        400        500        600        700
       222         333 08-OCT-00        200        300        400        500        600        700        800
       333         444 15-OCT-00        300        400        500        600        700        800        900

In your data warehouse, you would want to store the records in a more typical relational form in a fact table sales of the Sales History sample schema:

prod_id, cust_id, time_id, amount_sold

Note: A number of constraints on the sales table have been disabled for purposes of this example, because the example ignores a number of table columns for the sake of brevity.

Thus, you need to build a transformation such that each record in the input stream is converted into seven records for the data warehouse's sales table. This operation is commonly referred to as pivoting, and Oracle offers several ways to do this. The result of the previous example will resemble the following:

SELECT prod_id, cust_id, time_id, amount_sold FROM sales;

   PROD_ID    CUST_ID   TIME_ID   AMOUNT_SOLD
---------- ----------   --------- -----------
       111        222   01-OCT-00         100
       111        222   02-OCT-00         200
       111        222   03-OCT-00         300
       111        222   04-OCT-00         400
       111        222   05-OCT-00         500
       111        222   06-OCT-00         600
       111        222   07-OCT-00         700
       222        333   08-OCT-00         200
       222        333   09-OCT-00         300
       222        333   10-OCT-00         400
       222        333   11-OCT-00         500
       222        333   12-OCT-00         600
       222        333   13-OCT-00         700
       222        333   14-OCT-00         800
       333        444   15-OCT-00         300
       333        444   16-OCT-00         400
       333        444   17-OCT-00         500
       333        444   18-OCT-00         600
       333        444   19-OCT-00         700
       333        444   20-OCT-00         800
       333        444   21-OCT-00         900

Examples of Pre-Oracle9i Pivoting

The pre-Oracle9i ways of pivoting, using CTAS (or parallel INSERT AS SELECT) or PL/SQL, are shown in this section.

Example 1 Pre-Oracle9i Pivoting Using a CTAS Statement

CREATE TABLE temp_sales_step2 NOLOGGING PARALLEL AS
   SELECT product_id, customer_id, time_id, amount_sold
   FROM
   -- one branch per weekday; each branch rescans sales_input_table
   (SELECT product_id, customer_id, weekly_start_date time_id,
           sales_sun amount_sold FROM sales_input_table
    UNION ALL
    SELECT product_id, customer_id, weekly_start_date+1 time_id,
           sales_mon amount_sold FROM sales_input_table
    UNION ALL
    SELECT product_id, customer_id, weekly_start_date+2 time_id,
           sales_tue amount_sold FROM sales_input_table
    UNION ALL
    SELECT product_id, customer_id, weekly_start_date+3 time_id,
           sales_wed amount_sold FROM sales_input_table
    UNION ALL
    SELECT product_id, customer_id, weekly_start_date+4 time_id,
           sales_thu amount_sold FROM sales_input_table
    UNION ALL
    SELECT product_id, customer_id, weekly_start_date+5 time_id,
           sales_fri amount_sold FROM sales_input_table
    UNION ALL
    SELECT product_id, customer_id, weekly_start_date+6 time_id,
           sales_sat amount_sold FROM sales_input_table);

Like all CTAS operations, this operation can be fully parallelized. However, the CTAS approach also requires seven separate scans of the data, one for each day of the week. Even with parallelism, the CTAS approach can be time-consuming.

Example 2 Pre-Oracle9i Pivoting Using PL/SQL

PL/SQL offers an alternative implementation. A basic PL/SQL block to implement a pivoting operation is shown in the following statement:

DECLARE
   CURSOR c1 is
      SELECT product_id, customer_id, weekly_start_date, sales_sun,
      sales_mon, sales_tue, sales_wed, sales_thu, sales_fri, sales_sat
      FROM sales_input_table;
BEGIN
   FOR crec IN c1 LOOP
      INSERT INTO sales (prod_id, cust_id, time_id, amount_sold)
      VALUES (crec.product_id, crec.customer_id, crec.weekly_start_date,
              crec.sales_sun );
      INSERT INTO sales (prod_id, cust_id, time_id, amount_sold)
      VALUES (crec.product_id, crec.customer_id, crec.weekly_start_date+1,
              crec.sales_mon );
      INSERT INTO sales (prod_id, cust_id, time_id, amount_sold)
      VALUES (crec.product_id, crec.customer_id, crec.weekly_start_date+2,
              crec.sales_tue );
      INSERT INTO sales (prod_id, cust_id, time_id, amount_sold)
      VALUES (crec.product_id, crec.customer_id, crec.weekly_start_date+3,
              crec.sales_wed );
      INSERT INTO sales (prod_id, cust_id, time_id, amount_sold)
      VALUES (crec.product_id, crec.customer_id, crec.weekly_start_date+4,
              crec.sales_thu );
      INSERT INTO sales (prod_id, cust_id, time_id, amount_sold)
      VALUES (crec.product_id, crec.customer_id, crec.weekly_start_date+5,
              crec.sales_fri );
      INSERT INTO sales (prod_id, cust_id, time_id, amount_sold)
      VALUES (crec.product_id, crec.customer_id, crec.weekly_start_date+6,
              crec.sales_sat );
   END LOOP;
   COMMIT;
END;

This PL/SQL procedure can be modified to provide even better performance. Array inserts can accelerate the insertion phase of the procedure. Further performance can be gained by parallelizing this transformation operation, particularly if the temp_sales_step1 table is partitioned, using techniques similar to the parallelization of data unloading described in Chapter 11, "Extraction in Data Warehouses". The primary advantage of this PL/SQL procedure over a CTAS approach is that it requires only a single scan of the data.

Example of Oracle9i Pivoting

Oracle9i offers a faster way of pivoting your data by using a multitable insert. The following example uses the multitable insert syntax to insert into the demo table sh.sales some data from an input table with a different structure. The multitable insert statement looks like this:

INSERT ALL
      INTO sales (prod_id, cust_id, time_id, amount_sold)
      VALUES (product_id, customer_id, weekly_start_date, sales_sun)
      INTO sales (prod_id, cust_id, time_id, amount_sold)
      VALUES (product_id, customer_id, weekly_start_date+1, sales_mon)
      INTO sales (prod_id, cust_id, time_id, amount_sold)
      VALUES (product_id, customer_id, weekly_start_date+2, sales_tue)
      INTO sales (prod_id, cust_id, time_id, amount_sold)
      VALUES (product_id, customer_id, weekly_start_date+3, sales_wed)
      INTO sales (prod_id, cust_id, time_id, amount_sold)
      VALUES (product_id, customer_id, weekly_start_date+4, sales_thu)
      INTO sales (prod_id, cust_id, time_id, amount_sold)
      VALUES (product_id, customer_id, weekly_start_date+5, sales_fri)
      INTO sales (prod_id, cust_id, time_id, amount_sold)
      VALUES (product_id, customer_id, weekly_start_date+6, sales_sat)
SELECT product_id, customer_id, weekly_start_date, sales_sun,
      sales_mon, sales_tue, sales_wed, sales_thu, sales_fri, sales_sat
FROM sales_input_table;

This statement only scans the source table once and then inserts the appropriate data for each day.
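To make the single-scan idea concrete outside Oracle, here is a minimal sketch in Python with the standard sqlite3 module. It is not the multitable insert itself (SQLite has no INSERT ALL); it only mimics the same data flow, reading each weekly row once and fanning it out into seven daily rows. The helper name pivot_week and the ISO date format are assumptions made for the illustration; the sample row is the first row of the input table shown earlier, with 01-OCT-00 read as 1 October 2000.

```python
import sqlite3
from datetime import date, timedelta


def pivot_week(row):
    """Turn one weekly input row into seven (prod, cust, day, amount) rows."""
    product_id, customer_id, week_start, *daily_amounts = row
    start = date.fromisoformat(week_start)
    return [
        (product_id, customer_id, (start + timedelta(days=offset)).isoformat(), amount)
        for offset, amount in enumerate(daily_amounts)
    ]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_input_table (product_id, customer_id, weekly_start_date, "
    "sales_sun, sales_mon, sales_tue, sales_wed, sales_thu, sales_fri, sales_sat)"
)
conn.execute("CREATE TABLE sales (prod_id, cust_id, time_id, amount_sold)")
conn.execute(
    "INSERT INTO sales_input_table VALUES "
    "(111, 222, '2000-10-01', 100, 200, 300, 400, 500, 600, 700)"
)

# One scan of the source; each fetched row fans out into seven inserts.
source_rows = conn.execute("SELECT * FROM sales_input_table").fetchall()
for row in source_rows:
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", pivot_week(row))

pivoted = conn.execute(
    "SELECT prod_id, cust_id, time_id, amount_sold FROM sales ORDER BY time_id"
).fetchall()
for record in pivoted:
    print(record)
```

The contrast with the CTAS example is that the source is read once here, whereas the UNION ALL form reads it once per weekday.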

Copyright © 1996, 2002 Oracle Corporation. All Rights Reserved.


14 Maintaining the Data Warehouse


This chapter discusses how to load and refresh a data warehouse, and discusses:

Using Partitioning to Improve Data Warehouse Refresh
Optimizing DML Operations During Refresh
Refreshing Materialized Views
Using Materialized Views with Partitioned Tables

Using Partitioning to Improve Data Warehouse Refresh

ETL (Extraction, Transformation and Loading) is done on a scheduled basis to reflect changes made to the original source system. During this step, you physically insert the new, clean data into the production data warehouse schema, and take all of the other steps necessary (such as building indexes, validating constraints, taking backups) to make this new data available to the end users. Once all of this data has been loaded into the data warehouse, the materialized views have to be updated to reflect the latest data.

The partitioning scheme of the data warehouse is often crucial in determining the efficiency of refresh operations in the data warehouse load process. In fact, the load process is often the primary consideration in choosing the partitioning scheme of data warehouse tables and indexes.

The partitioning scheme of the largest data warehouse tables (for example, the fact table in a star schema) should be based upon the loading paradigm of the data warehouse.

Most data warehouses are loaded with new data on a regular schedule. For example, every night, week, or month, new data is brought into the data warehouse. The data being loaded at the end of the week or month typically corresponds to the transactions for the week or month. In this very common scenario, the data warehouse is being loaded by time. This suggests that the data warehouse tables should be partitioned on a date column. In our data warehouse example, suppose the new data is loaded into the sales table every month. Furthermore, the sales table has been partitioned by month. These steps show how the load process will proceed to add the data for a new month (January 2001) to the table sales.

1. Place the new data into a separate table, sales_01_2001. This data can be directly loaded into sales_01_2001 from outside the data warehouse, or this data can be the result of previous data transformation operations that have already occurred in the data warehouse. sales_01_2001 has the exact same columns, datatypes, and so forth, as the sales table. Gather statistics on the sales_01_2001 table.

2. Create indexes and add constraints on sales_01_2001. Again, the indexes and constraints on sales_01_2001 should be identical to the indexes and constraints on sales. Indexes can be built in parallel and should use the NOLOGGING and the COMPUTE STATISTICS options. For example:

CREATE BITMAP INDEX sales_01_2001_customer_id_bix
  ON sales_01_2001(customer_id)
      TABLESPACE sales_idx NOLOGGING PARALLEL 8 COMPUTE STATISTICS;

Apply all constraints to the sales_01_2001 table that are present on the sales table. This includes referential integrity constraints. A typical constraint would be:

ALTER TABLE sales_01_2001 ADD CONSTRAINT sales_customer_id
      FOREIGN KEY (customer_id)
      REFERENCES customer(customer_id) ENABLE NOVALIDATE;

If the partitioned table sales has a primary or unique key that is enforced with a global index structure, ensure that the constraint on sales_pk_jan01 is validated without the creation of an index structure, as in the following:

ALTER TABLE sales_01_2001 ADD CONSTRAINT sales_pk_jan01
PRIMARY KEY (sales_transaction_id) DISABLE VALIDATE;

The creation of the constraint with the ENABLE clause would cause the creation of a unique index, which does not match a local index structure of the partitioned table. You must not have any index structure built on the nonpartitioned table to be exchanged for existing global indexes of the partitioned table. The exchange command would fail.

3. Add the sales_01_2001 table to the sales table.

In order to add this new data to the sales table, we need to do two things. First, we need to add a new partition to the sales table. We will use the ALTER TABLE ... ADD PARTITION statement. This will add an empty partition to the sales table:

ALTER TABLE sales ADD PARTITION sales_01_2001
VALUES LESS THAN (TO_DATE('01-FEB-2001', 'DD-MON-YYYY'));

Then, we can add our newly created table to this partition using the EXCHANGE PARTITION operation. This will exchange the new, empty partition with the newly loaded table.

ALTER TABLE sales EXCHANGE PARTITION sales_01_2001 WITH TABLE sales_01_2001
INCLUDING INDEXES WITHOUT VALIDATION UPDATE GLOBAL INDEXES;

The EXCHANGE operation will preserve the indexes and constraints that were already present on the sales_01_2001 table. For unique constraints (such as the unique constraint on sales_transaction_id), you can use the UPDATE GLOBAL INDEXES clause, as shown previously. This will automatically maintain your global index structures as part of the partition maintenance operation and keep them accessible throughout the whole process. If there were only foreign-key constraints, the exchange operation would be instantaneous.

The benefits of this partitioning technique are significant. First, the new data is loaded with minimal resource utilization. The new data is loaded into an entirely separate table, and the index processing and constraint processing are applied only to the new partition. If the sales table was 50 GB and had 12 partitions, then a new month's worth of data contains approximately 4 GB. Only the new month's worth of data needs to be indexed. None of the indexes on the remaining 46 GB of data needs to be modified at all. This partitioning scheme additionally ensures that the load processing time is directly proportional to the amount of new data being loaded, not to the total size of the sales table.

Second, the new data is loaded with minimal impact on concurrent queries. All of the operations associated with data loading are occurring on a separate sales_01_2001 table. Therefore, none of the existing data or indexes of the sales table is affected during this data refresh process. The sales table and its indexes remain entirely untouched throughout this refresh process.

Third, in case of the existence of any global indexes, those are incrementally maintained as part of the exchange command. This maintenance does not affect the availability of the existing global index structures.

The exchange operation can be viewed as a publishing mechanism. Until the data warehouse administrator exchanges the sales_01_2001 table into the sales table, end users cannot see the new data. Once the exchange has occurred, then any end user query accessing the sales table will immediately be able to see the sales_01_2001 data.

Partitioning is useful not only for adding new data but also for removing and archiving data. Many data warehouses maintain a rolling window of data. For example, the data warehouse stores the most recent 36 months of sales data. Just as a new partition can be added to the sales table (as described earlier), an old partition can be quickly (and independently) removed from the sales table. These two benefits (reduced resources utilization and minimal end-user impact) are just as pertinent to removing a partition as they are to adding a partition.

Removing data from a partitioned table does not necessarily mean that the old data is physically deleted from the database. There are two alternatives for removing old data from a partitioned table:

You can physically delete all data from the database by dropping the partition containing the old data, thus freeing the allocated space:

ALTER TABLE sales DROP PARTITION sales_01_1998;

You can exchange the old partition with an empty table of the same structure; this empty table is created equivalent to steps 1 and 2 described in the load process. Assuming the new empty table stub is named sales_archive_01_1998, the following SQL statement will 'empty' partition sales_01_1998:

ALTER TABLE sales EXCHANGE PARTITION sales_01_1998 WITH TABLE
sales_archive_01_1998 INCLUDING INDEXES WITHOUT VALIDATION UPDATE GLOBAL INDEXES;

Note that the old data still exists, as the exchanged, nonpartitioned table sales_archive_01_1998.

If the partitioned table was set up in a way that every partition is stored in a separate tablespace, you can archive (or transport) this table using Oracle's transportable tablespace framework before dropping the actual data (the tablespace). See "Transportation Using Transportable Tablespaces" for further details regarding transportable tablespaces.

In some situations, you might not want to drop the old data immediately, but keep it as part of the partitioned table; although the data is no longer of main interest, there are still potential queries accessing this old, read-only data. You can use Oracle's data compression to minimize the space usage of the old data. We also assume that at least one compressed partition is already part of the partitioned table.

See Also: Chapter 3, "Physical Design in Data Warehouses" for a generic discussion of data segment compression and Chapter 5, "Parallelism and Partitioning in Data Warehouses" for partitioning and data segment compression

Refresh Scenarios

A typical scenario might not only need to compress old data, but also to merge several old partitions to reflect the granularity for a later backup of several merged partitions. Let's assume that a backup (partition) granularity is on a quarterly base for any quarter, where the oldest month is more than 36 months behind the most recent month. In this case, we are therefore compressing and merging sales_01_1998, sales_02_1998, and sales_03_1998 into a new, compressed partition sales_q1_1998.

1. Create the new merged partition in parallel in another tablespace. The partition will be compressed as part of the MERGE operation:

ALTER TABLE sales MERGE PARTITION sales_01_1998, sales_02_1998, sales_03_1998
INTO PARTITION sales_q1_1998 TABLESPACE archive_q1_1998 COMPRESS UPDATE
GLOBAL INDEXES PARALLEL 4;

2. The partition MERGE operation invalidates the local indexes for the new merged partition. We therefore have to rebuild them:

ALTER TABLE sales MODIFY PARTITION sales_q1_1998 REBUILD UNUSABLE LOCAL
INDEXES;

Alternatively, you can choose to create the new compressed data segment outside the partitioned table and exchange it back. The performance and the temporary space consumption are identical for both methods:

1. Create an intermediate table to hold the new merged information. The following statement inherits all NOT NULL constraints from the origin table by default:

CREATE TABLE sales_q1_1998_out TABLESPACE archive_q1_1998 NOLOGGING COMPRESS
PARALLEL 4 AS SELECT * FROM sales
-- Q1 covers January through March; the upper bound is exclusive
WHERE time_id >=  TO_DATE('01-JAN-1998','dd-mon-yyyy')
AND time_id < TO_DATE('01-APR-1998','dd-mon-yyyy');

2. Create the equivalent index structure for table sales_q1_1998_out as for the existing table sales.

3. Prepare the existing table sales for the exchange with the new compressed table sales_q1_1998_out. Because the table to be exchanged contains data actually covered in three partitions, we have to create one matching partition, having the range boundaries we are looking for. You simply have to drop two of the existing partitions. Note that you have to drop the lower two partitions sales_01_1998 and sales_02_1998; the lower boundary of a range partition is always defined by the upper (exclusive) boundary of the previous partition:

ALTER TABLE sales DROP PARTITION sales_01_1998;
ALTER TABLE sales DROP PARTITION sales_02_1998;

4. You can now exchange table sales_q1_1998_out with partition sales_03_1998.

ALTER TABLE sales EXCHANGE PARTITION sales_03_1998
WITH TABLE sales_q1_1998_out INCLUDING INDEXES WITHOUT VALIDATION
UPDATE GLOBAL INDEXES;

Unlike what the name of the partition suggests, its boundaries cover Q1-1998.
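The boundary rule behind this step (a range partition's lower bound is the previous partition's exclusive upper bound) can be modelled in a few lines. The following Python sketch is only an illustration of that rule, not of Oracle internals; the partition sales_12_1997 and the helper names partition_for and drop_partition are assumptions introduced for the example.

```python
from bisect import bisect_right
from datetime import date

# (partition name, exclusive upper bound), ordered as VALUES LESS THAN bounds.
# sales_12_1997 is a hypothetical predecessor partition added for illustration.
partitions = [
    ("sales_12_1997", date(1998, 1, 1)),
    ("sales_01_1998", date(1998, 2, 1)),
    ("sales_02_1998", date(1998, 3, 1)),
    ("sales_03_1998", date(1998, 4, 1)),
]


def partition_for(parts, day):
    """Return the partition whose upper bound is the first one greater than day."""
    bounds = [upper for _, upper in parts]
    index = bisect_right(bounds, day)
    if index == len(parts):
        raise ValueError(f"{day} is beyond the highest partition bound")
    return parts[index][0]


def drop_partition(parts, name):
    """Dropping a partition removes its bound; the next partition absorbs the range."""
    return [part for part in parts if part[0] != name]


before = partition_for(partitions, date(1998, 1, 15))

remaining = drop_partition(partitions, "sales_01_1998")
remaining = drop_partition(remaining, "sales_02_1998")
after = partition_for(remaining, date(1998, 1, 15))

print(before, "->", after)
```

After the two lower partitions are removed from the model, a January date resolves to sales_03_1998, which is why that partition can receive the whole quarter in the exchange.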

Both methods apply to slightly different business scenarios: Using the MERGE PARTITION approach invalidates the local index structures for the affected partition, but it keeps all data accessible all the time. Any attempt to access the affected partition through one of the unusable index structures raises an error. The limited availability time is approximately the time for re-creating the local bitmap index structures. In most cases this can be neglected, since this part of the partitioned table shouldn't be touched too often.

The CTAS approach, however, reduces the unavailability of any index structures to close to zero, but there is a specific time window where the partitioned table does not have all the data, because we dropped two partitions. The limited availability time is approximately the time for exchanging the table. Depending on the existence and number of global indexes, this time window varies. Without any existing global indexes, this time window is a matter of a fraction of a second to a few seconds.

Note: Before you add a single or multiple compressed partitions to a partitioned table for the very first time, all local bitmap indexes must be either dropped or marked unusable. After the first compressed partition is added, no additional actions are necessary for all subsequent operations involving compressed partitions. It is irrelevant how the compressed partitions are added to the partitioned table.

See Also: Chapter 5, "Parallelism and Partitioning in Data Warehouses" for further details about partitioning and data segment compression

This example is a simplification of the data warehouse rolling window load scenario. Real-world data warehouse refresh characteristics are always more complex. However, the advantages of this rolling window approach are not diminished in more complex scenarios.

Scenarios for Using Partitioning for Refreshing Data Warehouses


This section contains two typical scenarios.

Refresh Scenario 1

Data is loaded daily. However, the data warehouse contains two years of data, so that partitioning by day might not be desired.

Solution: Partition by week or month (as appropriate). Use INSERT to add the new data to an existing partition. The INSERT operation only affects a single partition, so the benefits described previously remain intact. The INSERT operation could occur while the partition remains a part of the table. Inserts into a single partition can be parallelized:

INSERT /*+ APPEND*/ INTO sales PARTITION (sales_01_2001)
SELECT * FROM new_sales;

The indexes of this sales partition will be maintained in parallel as well. An alternative is to use the EXCHANGE operation. You can do this by exchanging the sales_01_2001 partition of the sales table and then using an INSERT operation. You might prefer this technique when dropping and rebuilding indexes is more efficient than maintaining them.

Refresh Scenario 2

New data feeds, although consisting primarily of data for the most recent day, week, and month, also contain some data from previous time periods.

Solution 1: Use parallel SQL operations (such as CREATE TABLE ... AS SELECT) to separate the new data from the data in previous time periods. Process the old data separately using other techniques.

New data feeds are not solely time based. You can also feed new data into a data warehouse with data from multiple operational systems on a business need basis. For example, the sales data from direct channels may come into the data warehouse separately from the data from indirect channels. For business reasons, it may furthermore make sense to keep the direct and indirect data in separate partitions.

Solution 2: Oracle supports composite range list partitioning. The primary partitioning strategy of the sales table could be range partitioning based on time_id as shown in the example. However, the subpartitioning is a list based on the channel attribute. Each subpartition can now be loaded independently of each other (for each distinct channel) and added in a rolling window operation as discussed before. The partitioning strategy addresses the business needs in the most optimal manner.

Optimizing DML Operations During Refresh


You can optimize DML performance through the following techniques:

Implementing an Efficient MERGE Operation
Maintaining Referential Integrity
Purging Data

Implementing an Efficient MERGE Operation


Commonly, the data that is extracted from a source system is not simply a list of new records that needs to be inserted into the data warehouse. Instead, this new data set is a combination of new records as well as modified records. For example, suppose that most of the data extracted from the OLTP systems will be new sales transactions. These records will be inserted into the warehouse's sales table, but some records may reflect modifications of previous transactions, such as returned merchandise or transactions that were incomplete or incorrect when initially loaded into the data warehouse. These records require updates to the sales table.

As a typical scenario, suppose that there is a table called new_sales that contains both inserts and updates that will be applied to the sales table. When designing the entire data warehouse load process, it was determined that the new_sales table would contain records with the following semantics:

If a given sales_transaction_id of a record in new_sales already exists in sales, then update the sales table by adding the sales_dollar_amount and sales_quantity_sold values from the new_sales table to the existing row in the sales table.

Otherwise, insert the entire new record from the new_sales table into the sales table.

This UPDATE-ELSE-INSERT operation is often called a merge. A merge can be executed using one SQL statement in Oracle9i, though it required two earlier.

Example 14-1 Merging Prior to Oracle9i

The first SQL statement updates the appropriate rows in the sales tables, while the second SQL statement inserts the rows:
UPDATE
  (SELECT
   s.sales_qua
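For comparison, the single-statement form that Oracle9i allows can be sketched as follows. This is an illustration of the UPDATE-ELSE-INSERT semantics stated above, not the guide's own example; it assumes that sales and new_sales both carry exactly the three columns named in the text.

```sql
-- Assumes both tables have sales_transaction_id, sales_quantity_sold,
-- and sales_dollar_amount; extend the column lists for a real schema.
MERGE INTO sales s
USING new_sales n
ON (s.sales_transaction_id = n.sales_transaction_id)
WHEN MATCHED THEN
  -- The transaction already exists: add the new amounts to the stored row.
  UPDATE SET
    s.sales_quantity_sold = s.sales_quantity_sold + n.sales_quantity_sold,
    s.sales_dollar_amount = s.sales_dollar_amount + n.sales_dollar_amount
WHEN NOT MATCHED THEN
  -- The transaction is new: insert the whole record.
  INSERT (sales_transaction_id, sales_quantity_sold, sales_dollar_amount)
  VALUES (n.sales_transaction_id, n.sales_quantity_sold, n.sales_dollar_amount);
```

A single statement scans new_sales once, whereas the two-statement approach reads it once for the update and again for the insert.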


15 Change Data Capture


Change Data Capture efficiently identifies and captures data that has been added to, updated, or removed from, Oracle relational tables, and makes the change data available for use by applications. Change Data Capture is provided as an Oracle database server component with Oracle9i. This chapter introduces Change Data Capture in the following sections:

About Change Data Capture
Installation and Implementation
Security
Columns in a Change Table
Change Data Capture Views
Synchronous Mode of Data Capture
Publishing Change Data
Managing Change Tables and Subscriptions
Subscribing to Change Data
Export and Import Considerations

See Also: Oracle9i Supplied PL/SQL Packages and Types Reference for more information about the Change Data Capture publish and subscribe PL/SQL packages.

About Change Data Capture


Oftentimes, data warehousing involves the extraction and transportation of relational data from one or more source databases, into the data warehouse for analysis. Change Data Capture quickly identifies and processes only the data that has changed, not entire tables, and makes the change data available for further use.

Without Change Data Capture, database extraction is a cumbersome process in which you move the entire contents of tables into flat files, and then load the files into the data warehouse. This ad hoc approach is expensive in a number of ways.

Change Data Capture does not depend on intermediate flat files to stage the data outside of the relational database. It captures the change data resulting from INSERT, UPDATE, and DELETE operations made to user tables. The change data is then stored in a database object called a change table, and the change data is made available to applications in a controlled way.

Table 15-1 describes the advantages of performing database extraction with Change Data Capture.

Table 15-1 Database Extraction With and Without Change Data Capture

Extraction
  With Change Data Capture: Database extraction from INSERT, UPDATE, and DELETE operations occurs immediately, at the same time the changes occur to the source tables.
  Without Change Data Capture: Database extraction is marginal at best for INSERT operations, and problematic for UPDATE and DELETE operations, because the data is no longer in the table.

Staging
  With Change Data Capture: Stages data directly to relational tables; there is no need to use flat files.
  Without Change Data Capture: The entire contents of tables are moved into flat files.

Interface
  With Change Data Capture: Provides an easy-to-use publish and subscribe interface using DBMS_LOGMNR_CDC_PUBLISH and DBMS_LOGMNR_CDC_SUBSCRIBE packages.
  Without Change Data Capture: Error prone and manpower intensive to administer.

Cost
  With Change Data Capture: Supplied with the Oracle9i (and later) database server. Reduces overhead cost by simplifying the extraction of change data.
  Without Change Data Capture: Expensive because you must write and maintain the capture software yourself, or purchase it from a third-party vendor.

A Change Data Capture system is based on the interaction of a publisher and subscribers to capture and distribute change data, as described in the next section.

Publish and Subscribe Model


Most Change Data Capture systems have one publisher that captures and publishes change data for any number of Oracle source tables. There can be multiple subscribers accessing the change data. Change Data Capture provides PL/SQL packages to accomplish the publish and subscribe tasks.

Publisher

The publisher is usually a database administrator (DBA) who is in charge of creating and maintaining schema objects that make up the Change Data Capture system. The publisher performs these tasks:

Determines the relational tables (called source tables) from which the data warehouse application is interested in capturing change data.
Uses the Oracle supplied package, DBMS_LOGMNR_CDC_PUBLISH, to set up the system to capture data from one or more source tables.
Publishes the change data in the form of change tables.
Allows controlled access to subscribers by using the SQL GRANT and REVOKE statements to grant and revoke the SELECT privilege on change tables for users and roles.
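As a rough sketch of the second and third publisher tasks, the call below creates a change table for a source table in the system-generated SYNC_SET change set. The schema names cdc_pub and hr_app, the change table name, and the column list are hypothetical, and the parameter list is written from memory of the Oracle9i supplied-packages reference rather than taken from this chapter, so verify it before use.

```sql
-- Sketch only: names are hypothetical and the parameter list should be
-- checked against the DBMS_LOGMNR_CDC_PUBLISH package documentation.
BEGIN
  DBMS_LOGMNR_CDC_PUBLISH.CREATE_CHANGE_TABLE(
    owner             => 'cdc_pub',        -- schema that will own the change table
    change_table_name => 'employee_ct',
    change_set_name   => 'SYNC_SET',       -- synchronous change set
    source_schema     => 'hr_app',
    source_table      => 'EMPLOYEE',
    column_type_list  => 'empno NUMBER, ename VARCHAR2(30), deptno NUMBER',
    capture_values    => 'both',           -- keep old and new values for updates
    rs_id             => 'y',              -- control columns to include
    row_id            => 'n',
    user_id           => 'n',
    timestamp         => 'n',
    object_id         => 'n',
    source_colmap     => 'n',
    target_colmap     => 'y',
    options_string    => NULL);
END;
/
```

Leaving a sensitive column such as salary out of column_type_list is one way to publish a restricted change table on the same source table.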

Subscribers

The subscribers, usually applications, are consumers of the published change data. Subscribers subscribe to one or more sets of columns in source tables. Subscribers perform the following tasks:

Use the Oracle supplied package, DBMS_LOGMNR_CDC_SUBSCRIBE, to subscribe to source tables for controlled access to the published change data for analysis.
Extend the subscription window and create a new subscriber view when the subscriber is ready to receive a set of change data.
Use SELECT statements to retrieve change data from the subscriber views.
Drop the subscriber view and purge the subscription window when finished processing a block of changes.
Drop the subscription when the subscriber no longer needs its change data.
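The subscriber tasks listed above could be strung together roughly as follows. This is a sketch under stated assumptions: the source schema hr_app, the EMPLOYEE columns, and the description text are hypothetical, and the procedure and parameter names are recalled from the Oracle9i DBMS_LOGMNR_CDC_SUBSCRIBE documentation and should be confirmed there.

```sql
-- Sketch only: verify procedure signatures before use. Run with
-- SET SERVEROUTPUT ON in SQL*Plus to see the row count.
DECLARE
  subhandle  NUMBER;
  viewname   VARCHAR2(40);
  row_count  NUMBER;
BEGIN
  -- Subscribe to a set of columns in one source table.
  DBMS_LOGMNR_CDC_SUBSCRIBE.GET_SUBSCRIPTION_HANDLE(
    change_set          => 'SYNC_SET',
    description         => 'Change data for EMPLOYEE',
    subscription_handle => subhandle);
  DBMS_LOGMNR_CDC_SUBSCRIBE.SUBSCRIBE(
    subscription_handle => subhandle,
    source_schema       => 'hr_app',
    source_table        => 'EMPLOYEE',
    column_list         => 'empno, ename');
  DBMS_LOGMNR_CDC_SUBSCRIBE.ACTIVATE_SUBSCRIPTION(
    subscription_handle => subhandle);

  -- Extend the window and create a subscriber view over the new changes.
  DBMS_LOGMNR_CDC_SUBSCRIBE.EXTEND_WINDOW(
    subscription_handle => subhandle);
  DBMS_LOGMNR_CDC_SUBSCRIBE.PREPARE_SUBSCRIBER_VIEW(
    subscription_handle => subhandle,
    source_schema       => 'hr_app',
    source_table        => 'EMPLOYEE',
    view_name           => viewname);

  -- Read change data through the generated view (name known only at run time).
  EXECUTE IMMEDIATE 'SELECT COUNT(*) FROM ' || viewname INTO row_count;
  DBMS_OUTPUT.PUT_LINE('Change rows in window: ' || row_count);

  -- Finished with this block of changes: drop the view and purge the window.
  DBMS_LOGMNR_CDC_SUBSCRIBE.DROP_SUBSCRIBER_VIEW(
    subscription_handle => subhandle,
    source_schema       => 'hr_app',
    source_table        => 'EMPLOYEE');
  DBMS_LOGMNR_CDC_SUBSCRIBE.PURGE_WINDOW(
    subscription_handle => subhandle);

  -- Only when the change data is no longer needed at all:
  DBMS_LOGMNR_CDC_SUBSCRIBE.DROP_SUBSCRIPTION(
    subscription_handle => subhandle);
END;
/
```

In practice the extend, read, drop, and purge steps repeat in a loop, and DROP_SUBSCRIPTION runs once at the end of the subscriber's life.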

Example of a Change Data Capture System


The Change Data Capture system captures the effects of DML statements, including INSERT, DELETE, and UPDATE, when they are performed on the source table. As these operations are performed, the change data is captured and published to corresponding change tables.

To capture change data, the publisher creates and administers change tables, which are special database tables that capture change data from a source table. For example, for each source table for which you want to capture data, the publisher creates a corresponding change table. Change Data Capture ensures that none of the updates are missed or duplicated.

Each subscriber has its own view of the change data. This makes it possible for multiple subscribers to simultaneously subscribe to the same change table without interfering with one another.

Figure 15-1 shows the publish and subscribe model in a Change Data Capture system.

Figure 15-1 Publish and Subscribe Model in a Change Data Capture System

For example, assume that the change tables in Figure 15-1 contain all of the changes that occurred between Monday and Friday, and also assume that:

Subscriber 1 is viewing and processing data from Tuesday.
Subscriber 2 is viewing and processing data from Wednesday to Thursday.

Subscribers 1 and 2 each have a unique subscription window that contains a block of transactions. Change Data Capture manages the subscription window for each subscriber by creating a subscriber view that returns a range of transactions of interest to that subscriber. The subscriber accesses the change data by performing SELECT statements on the subscriber view that was generated by Change Data Capture.

When a subscriber needs to read additional change data, the subscriber makes procedure calls to extend the window and to create a new subscriber view. Each subscriber can walk through the data at its own pace, while Change Data Capture manages the data storage. As each subscriber finishes processing the data in its subscription window, it calls procedures to drop the subscriber view and purge the contents of the subscription window. Extending and purging windows is necessary to prevent the change table from growing indefinitely, and to prevent the subscriber from seeing the same data again.

Thus, Change Data Capture provides the following benefits for subscribers:

Guarantees that each subscriber sees all of the changes, does not miss any changes, and does not see the same change data more than once.
Keeps track of multiple subscribers and gives each subscriber shared access to change data.
Handles all of the storage management, automatically removing data from change tables when it is no longer required by any of the subscribers.

Components and Terminology for Synchronous Change Data Capture


This section describes the Change Data Capture components shown in Figure 15-2. The publisher is responsible for all of the components shown in Figure 15-2, except for the subscriber views. The publisher creates and maintains all of the schema objects that make up the Change Data Capture system, and publishes change data so that subscribers can use it.

Subscribers are the consumers of change data and are granted controlled access to the change data by the publisher. Subscribers subscribe to one or more columns in source tables.

With synchronous data capture, the change data is generated as data manipulation language (DML) operations are made to the source table. Every time a DML operation occurs on a source table, a record of that operation is written to the change table.

Figure 15-2 Components in a Synchronous Change Data Capture System

The following subsections describe Change Data Capture components in more detail.

Source System

A source system is a production database that contains source tables for which Change Data Capture will capture changes.

Source Table

A source table is a database table that resides on the source system that contains the data you want to capture. Changes made to the source table are immediately reflected in the change table.

Change Source

A change source represents a source system. There is a system-generated change source named SYNC_SOURCE.

Change Set

A change set represents the collection of change tables. There is a system-generated change set named SYNC_SET.

Change Table

A change table contains the change data resulting from DML statements made to a single source table. A change table consists of two things: the change data itself, which is stored in a database table, and the system metadata necessary to maintain the change table. A given change table can capture changes from only one source table. In addition to published columns, the change table contains control columns that are managed by Change Data Capture. See "Columns in a Change Table" for more information.

Publication

A publication provides a way for publishers to publish multiple change tables on the same source table, and control subscriber access to the published change data. For example, Publication A consists of a change table that contains all the columns from the EMPLOYEE source table, while Publication B contains all the columns except the salary column from the EMPLOYEE source table. Because each change table is a separate publication, the publisher can implement security on the salary column by allowing only selected subscribers to access Publication A.

Subscriber View

A subscriber view is a view created by Change Data Capture that returns all of the rows in the subscription window. In Figure 15-2, the subscribers have created two views: one on columns 7 and 8 of Source Table 3 and one on columns 4, 6, and 8 of Source Table 4. The columns included in the view are based on the actual columns that the subscribers subscribed to in the source table.

Subscription Window

A subscription window defines the time range of change rows that the subscriber can currently see. The oldest row in the window is the low watermark; the newest row in the window is the high watermark. Each subscriber has a subscription window.

Installation and Implementation


Change Data Capture comes pre-packaged with the appropriate Oracle9i drivers already installed with which you can implement synchronous data capture. In addition, note that Change Data Capture uses Java. Therefore, when you install the Oracle9i database server, ensure that Java is enabled.

Change Data Capture installs systemwide triggers on the CREATE TABLE, ALTER TABLE, and DROP TABLE statements. If system triggers are disabled on the database instance, Change Data Capture will not function correctly. Therefore, you should never disable system triggers.

To remove Change Data Capture from the database, the SQL script rmcdc.sql is provided in the admin directory. This will remove the system triggers that CDC installs on the CREATE TABLE, ALTER TABLE, and DROP TABLE statements. In addition, rmcdc.sql removes all Java classes used by Change Data Capture. Note that after rmcdc.sql is called, CDC will no longer operate on the system. If the system administrator decides to remove the Java Virtual Machine from a database instance, rmcdc.sql must be called before rmjvm is called.

To re-install Change Data Capture, the SQL script initcdc.sql is provided in the admin directory. It creates the CDC system triggers and Java classes that are required by Change Data Capture.

Change Data Capture Restriction on Direct-Path INSERT


Change Data Capture does not support the direct-path INSERT statement (and, by association, the multi_table_insert statement) feature in parallel DML mode.

When you create a change table, Change Data Capture creates triggers on the source table. Because a direct-path INSERT disables all database triggers, any rows inserted into the source table using the SQL statement for direct-path INSERT in parallel DML mode will not be captured in the change table.

Similarly, Change Data Capture cannot capture the inserted rows from multitable insert operations because the SQL multi_table_insert statement in parallel DML mode uses direct-path INSERT. Also, note that the multitable insert operation does not return an error message to indicate that the triggers used by Change Data Capture did not fire.

See Also: Oracle9i SQL Reference for more information regarding multitable inserts, direct-path INSERT, and triggers
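To make the restriction concrete, the sketch below contrasts a conventional insert with a direct-path insert in parallel DML mode. The tables hr_app.employee and hr_app.employee_stage are hypothetical, and the comments restate the behavior described above rather than a tested result.

```sql
-- Conventional-path insert: the triggers that Change Data Capture created
-- on the source table fire, so the new rows can reach the change table.
INSERT INTO hr_app.employee
SELECT * FROM hr_app.employee_stage;
COMMIT;

-- Direct-path insert in parallel DML mode: per the restriction above, the
-- triggers do not fire, no error is raised, and the rows are not captured.
ALTER SESSION ENABLE PARALLEL DML;
INSERT /*+ APPEND PARALLEL(employee, 4) */ INTO hr_app.employee
SELECT * FROM hr_app.employee_stage;
COMMIT;
```

If a source table must be bulk loaded this way, the loaded rows have to reach the warehouse by some other route, because subscribers will never see them.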

Security

You grant privileges for a change table separately from the privileges you grant for a source table. For example, a subscriber that has privileges to perform a SELECT operation on a source table might not have privileges to perform a SELECT operation on a change table.

The publisher controls subscribers' access to change data by using the SQL GRANT and REVOKE statements to grant and revoke the SELECT privilege on change tables for users and roles. The publisher must grant the SELECT privilege before a user or application can subscribe to the change table.
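A minimal sketch of this access control, assuming a hypothetical change table cdc_pub.employee_ct, a subscriber account analyst_app, and a role cdc_readers:

```sql
-- Grant directly to one subscriber account.
GRANT SELECT ON cdc_pub.employee_ct TO analyst_app;

-- Or grant through a role so several subscribers can share the privilege.
CREATE ROLE cdc_readers;
GRANT SELECT ON cdc_pub.employee_ct TO cdc_readers;
GRANT cdc_readers TO analyst_app;

-- Withdraw the direct grant when the subscriber should no longer read
-- this change table.
REVOKE SELECT ON cdc_pub.employee_ct FROM analyst_app;
```

Note that none of these statements touch privileges on the source table itself, which are managed separately.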


16 Summary Advisor

This chapter illustrates how to use the Summary Advisor, a tool for choosing and understanding materialized views. The chapter contains:

Overview of the Summary Advisor in the DBMS_OLAP Package
Using the Summary Advisor
Estimating Materialized View Size
Is a Materialized View Being Used?
Summary Advisor Wizard

Overview of the Summary Advisor in the DBMS_OLAP Package


Materialized views provide high performance for complex, data-intensive queries. The Summary Advisor helps you achieve this performance benefit by choosing the proper set of materialized views for a given workload. In general, as the number of materialized views and space allocated to materialized views is increased, query performance improves. But the additional materialized views have some cost: they consume additional storage space and must be refreshed, which increases maintenance time. The Summary Advisor considers these costs and makes the most cost-effective trade-offs when recommending the creation of new materialized views and evaluating the performance of existing materialized views.

To help you select from among the many possible materialized views in your schema, Oracle provides a collection of materialized view analysis and advisory functions and procedures in the DBMS_OLAP package. Collectively, these functions are called the Summary Advisor, and they are callable from any PL/SQL program. Figure 16-1 shows how the Summary Advisor recommends materialized views from a hypothetical or user-defined workload or one obtained from the SQL cache, or Oracle Trace. You can run the Summary Advisor from Oracle Enterprise Manager or by invoking the DBMS_OLAP package. You must have Java enabled to use the Summary Advisor.

All data and results generated by the Summary Advisor are stored in a set of tables referred to as the Summary Advisor repository. These tables are owned by SYSTEM and start with MVIEW$_ADV_*. Only DBAs can access these tables directly, but other users can access the data relevant to them using a set of read-only views. These views start with MVIEW_. Thus, the table MVIEW$_ADV_WORKLOAD stores the workload of all users, but a user accesses his workload through the MVIEW_WORKLOAD view.

Figure 16-1 Materialized Views and the Summary Advisor

Using the Summary Advisor or the DBMS_OLAP package, you can:

Estimate the size of a materialized view
Recommend a materialized view
Recommend materialized views based on collected workload information
Report actual utilization of materialized views based on collected workload
Define a filter to use against a workload
Load and validate a workload
Purge filters, workloads, and results
Generate a unique identifier (for example, run ID, filter ID, or workload ID)

All of these tasks can be performed independently of one another. However, sometimes you need to use several procedures from the DBMS_OLAP package to complete a task. For example, to recommend a set of materialized views based on a workload, you have to first load the workload and then generate the set of recommendations.

Before you can use any of these procedures, you must create a unique identifier for the data they are about to create. This number is obtained by calling the procedure CREATE_ID and the unique number is known subsequently as a run ID, workload ID or filter ID depending on the procedure it is given.

The identifier is used to store the Advisor artifacts in the repository. Each activity in the Advisor requires a unique identifier to distinguish it from other objects. For example, when you add a filter item, you associate the item with a filter ID. When you load a workload, the data gets stored using the unique workload ID. In addition, when you run RECOMMEND_MVIEW_STRATEGY or EVALUATE_MVIEW_STRATEGY, a unique ID is associated with the run.

Because the ID is just a unique number, Oracle uses the same CREATE_ID function to acquire the value. It is only when a specific operation is performed (such as a load workload) that the ID is identified as a workload ID.

You can use the Summary Advisor with or without a workload, but better results are achieved if a workload is provided. This can be supplied by:

The user
Oracle Trace
The current SQL cache contents

Once the workload is loaded into the Advisor workload repository or at the time the materialized view recommendations are generated, a filter can be applied to the workload to restrict what is analyzed. This provides the ability to generate different sets of recommendations based on different workload scenarios.

These filters are created using the procedure ADD_FILTER_ITEM. You can create any number of filters, and use more than one at a time to filter a workload. See "Using Filters with the Summary Advisor" for further details.

The Summary Advisor uses four types of schema objects, some of which are defined in the user's schema and some are in the system schema:

User schema

For both V-tables and workload tables, before the workload is available to the recommendation process, it must be loaded into the advisor workload repository.

V-tables
V-tables are generated by Oracle Trace for storing results of formatting server-collected trace. Please note that these V-tables are different from the V$ tables.

Workload tables
Workload tables are user tables that store workload information, and can reside in any schema.

System schema

Result tables
Result tables are internal tables that store both intermediate and final results from all Summary Advisor components.

Read-only views
Read-only views allow you to access recommendations, filters and workloads. These views are MVIEW_RECOMMENDATIONS, MVIEW_EVALUATIONS, MVIEW_FILTER, and MVIEW_WORKLOAD.

Whenever the Summary Advisor is run, the results, with the exception of estimated size, are placed in internal tables, which can be accessed from read-only views in the database. These results can be queried, so you do not have to keep running the Advisor process.

If you want to view the results of the last materialized view recommendation, you can issue the following statement:
SELECT MVIEW_OWNER, MVIEW_NAME, RECOMMENDED_ACTION,
       PCT_PERFORMANCE_GAIN,
       BENEFIT_TO_COST_RATIO
FROM SYSTEM.MVIEW_RECOMMENDATIONS
WHERE RUNID= (SELECT MAX(RUNID) FROM SYSTEM.MVIEW_RECOMMENDATIONS)
  ORDER BY RECOMMENDATION_NUMBER ASC;

The advisory functions and procedures of the DBMS_OLAP package require you to gather structural statistics about fact and dimension table cardinalities, and the distinct cardinalities of every dimension level column, JOIN KEY column, and fact table key column. You do this by loading your data warehouse, then gathering either exact or estimated statistics with the DBMS_STATS package or the ANALYZE TABLE statement. Because gathering statistics is time-consuming and extreme statistical accuracy is not required, it is generally preferable to estimate statistics.

Using information from the system workload table, schema metadata and statistical information generated by the DBMS_STATS package, the Advisor engine generates summary recommendations and summary usage evaluations and stores the results in result tables.

To use the Summary Advisor with a workload, some or all of the following steps must be followed:

1. Optionally obtain an identifier number as a filter ID and define one or more filter items.
2. Obtain an identifier number as a workload ID and load a workload. If a filter was defined in step 1, then it can be used during the operation to refine the SQL statements as they are collected from the workload source.
3. Load the workload.
4. Call the procedure RECOMMEND_MVIEW_STRATEGY to generate the recommendations.

These steps can be repeated several times with different workloads to see the effect on the materialized views.
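Put together in SQL*Plus, the steps above might look like the sketch below. The 20 percent statistics sample, the SH.SALES fact table, the SH.MY_WORKLOAD workload table, and the storage and retention figures are assumptions, and the RECOMMEND_MVIEW_STRATEGY argument list (run ID, workload ID, filter ID, storage in bytes, retention percentage, retention list, fact table filter) is written from memory of the package reference and should be checked there.

```sql
-- Estimated statistics are generally sufficient for the Advisor.
EXECUTE DBMS_STATS.GATHER_TABLE_STATS(ownname => 'SH', tabname => 'SALES', estimate_percent => 20);

VARIABLE work_id NUMBER;
VARIABLE run_id NUMBER;

-- Steps 2 and 3: obtain a workload ID and load a user-defined workload
-- without a filter (step 1 is optional and skipped here).
EXECUTE DBMS_OLAP.CREATE_ID(:work_id);
EXECUTE DBMS_OLAP.LOAD_WORKLOAD_USER(:work_id, DBMS_OLAP.WORKLOAD_NEW, DBMS_OLAP.FILTER_NONE, 'SH', 'MY_WORKLOAD');

-- Step 4: obtain a run ID and generate recommendations.
-- Assumed arguments: no filter, 100000 bytes of storage, retain 100 percent.
EXECUTE DBMS_OLAP.CREATE_ID(:run_id);
EXECUTE DBMS_OLAP.RECOMMEND_MVIEW_STRATEGY(:run_id, :work_id, NULL, 100000, 100, NULL, NULL);

-- Inspect the recommendations produced by this run.
SELECT MVIEW_OWNER, MVIEW_NAME, RECOMMENDED_ACTION
FROM SYSTEM.MVIEW_RECOMMENDATIONS
WHERE RUNID = :run_id
ORDER BY RECOMMENDATION_NUMBER;
```

Repeating the block with a different workload table and a fresh pair of IDs gives a second set of recommendations to compare against the first.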

Using the Summary Advisor

The following sections will help you use the Advisor:

Identifier Numbers
Workload Management
Loading a User-Defined Workload
Loading a Trace Workload
Loading a SQL Cache Workload
Validating a Workload
Removing a Workload
Using Filters with the Summary Advisor
Removing a Filter
Recommending Materialized Views
Summary Data Report
When Recommendations are No Longer Required
Stopping the Recommendation Process
Summary Advisor Sample Sessions
Summary Advisor and Missing Statistics
Summary Advisor Privileges and ORA-30446

Identifier Numbers

Most of the DBMS_OLAP procedures require a unique identifier as one of their parameters. You obtain this by calling the procedure CREATE_ID, which is illustrated in the following section.

DBMS_OLAP.CREATE_ID Procedure

Table 16-1 DBMS_OLAP.CREATE_ID Procedure Parameters

Parameter  Datatype  Description
id         NUMBER    The unique identifier that can be used to create a filter, load a workload, or create an analysis

With a SQL utility such as SQL*Plus, do the following:

1. Declare an output variable to receive the new identifier.
VARIABLE MY_ID NUMBER;
2. Call the CREATE_ID function to generate a new identifier.
EXECUTE DBMS_OLAP.CREATE_ID(:MY_ID);

Workload Management


The Advisor performs best when a workload based on usage is available. The Advisor Workload Repository is capable of storing multiple workloads, so that the different uses of a real-world data warehousing environment can be viewed over a long period of time and across the life cycle of database instance startup and shutdown.

To facilitate wider use of the Summary Advisor, three types of workload are supported:

Current contents of the SQL cache
Oracle Trace collection
User-specified workload

When the workload is loaded using the appropriate load_workload procedure, it is stored in a new workload repository in the SYSTEM schema called MVIEW_WORKLOAD whose format is shown in Table 16-2. A specific workload can be removed by calling the PURGE_WORKLOAD routine and passing it a valid workload ID. To remove all workloads for the current user, call PURGE_WORKLOAD and pass the constant value DBMS_OLAP.WORKLOAD_ALL.
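The two ways of removing workloads just described can be sketched in SQL*Plus as follows, assuming that :MY_ID holds a workload ID previously returned by CREATE_ID and used for a load.

```sql
-- Remove one workload, identified by its workload ID.
EXECUTE DBMS_OLAP.PURGE_WORKLOAD(:MY_ID);

-- Remove every workload that belongs to the current user.
EXECUTE DBMS_OLAP.PURGE_WORKLOAD(DBMS_OLAP.WORKLOAD_ALL);
```

After the second call, the MVIEW_WORKLOAD view should show no rows for the current user until a new workload is loaded.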

Table 16-2 MVIEW_WORKLOAD Table

Column        Datatype      Description
APPLICATION   VARCHAR2(30)  Optional application name for the query
CARDINALITY   NUMBER        Total cardinality of all of tables in query
WORKLOADID    NUMBER        Workload id identifying a unique sampling
FREQUENCY     NUMBER        Number of times query executed
IMPORT_TIME   DATE          Date at which item was collected
LASTUSE       DATE          Last date of execution
OWNER         VARCHAR2(30)  User who last executed query
PRIORITY      NUMBER        User-supplied ranking of query
QUERY         LONG          Query text
QUERYID       NUMBER        Id number identifying a unique query
RESPONSETIME  NUMBER        Execution time in seconds
RESULTSIZE    NUMBER        Total bytes selected by the query

Once the workload has been collected using the appropriate LOAD_WORKLOAD routine, there is also a filter mechanism that may be applied. This lets you specify the portion of workload that is to be loaded into the repository. You can also use the same filter mechanism to restrict workload-based summary recommendation and evaluation to a subset of the queries contained in the workload repository. Once the workload has been loaded, the Summary Advisor is run by calling the procedure RECOMMEND_MVIEW_STRATEGY. A major benefit of this approach is that it is easy to model different workloads by simply modifying the frequency column, removing some SQL queries, or adding new queries.

Summary Advisor can retrieve workload information from the SQL cache as well as Oracle Trace. If the collected data was retrieved from a server with the instance parameter cursor_sharing set to SIMILAR or FORCE, then user queries with embedded literal values will be converted to a statement that contains system-generated bind variables.

Note: Oracle Trace will be deprecated in a future release.

In Oracle9i, it is not possible to retrieve the bind-variable data in order to reconstruct the statement in the form originally submitted by the user. This will, in turn, cause Summary Advisor to not consider the query for rewrite and potentially miss a critical statement in the user's workload. As a work-around, if the Advisor will be used to recommend materialized views, then the server should set the instance parameter CURSOR_SHARING to EXACT.
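One way to apply this work-around is sketched below. It assumes that the workload will be gathered from the SQL cache after the change takes effect, and that the account has the ALTER SYSTEM privilege.

```sql
-- Check the current setting (SQL*Plus command).
SHOW PARAMETER cursor_sharing

-- Keep literals in statement text for the whole instance so that the
-- SQL cache holds statements in the form the users submitted them.
ALTER SYSTEM SET cursor_sharing = EXACT;

-- Alternatively, limit the change to the current session only.
ALTER SESSION SET cursor_sharing = EXACT;
```

Statements that were already cached with system-generated bind variables stay that way, so the workload should be collected only after the affected queries have run again.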

Loading a User-Defined Workload


A user-defined workload is loaded using the procedure LOAD_WORKLOAD_USER. The workload_id is obtained by calling the procedure CREATE_ID. The value of the flags parameter determines whether the workload is considered to be new, should be used to overwrite an existing workload, or should be appended to an existing workload. The optional filter_id can be supplied to specify the filter that is to be used against this workload, where the filter would have been defined using the ADD_FILTER_ITEM procedure.

DBMS_OLAP.LOAD_WORKLOAD_USER Procedure

Table 16-3 DBMS_OLAP.LOAD_WORKLOAD_USER Procedure Parameters

Parameter    Datatype  Description
workload_id  NUMBER    The required workload id that was returned by the create_id call
flags        NUMBER    Can take one of the following values:
                       DBMS_OLAP.WORKLOAD_OVERWRITE
                       The load routine will explicitly remove any existing queries from the workload that are owned by the specified collection ID
                       DBMS_OLAP.WORKLOAD_APPEND
                       The load routine preserves any existing queries in the workload. Any queries collected by the load operation will be appended to the end of the specified workload
                       DBMS_OLAP.WORKLOAD_NEW
                       The load routine assumes there are no existing queries in the workload. If it finds an existing workload element, the call will fail with an error
                       Note: the flags have the same behavior irrespective of the LOAD_WORKLOAD operation
filter_id    NUMBER    Specify filter for the workload to be loaded
owner_name   VARCHAR2  The schema that contains the user supplied table or view
table_name   VARCHAR2  The table or view name containing valid workload data

The actual workload is defined in a separate table and the two parameters owner_name and table_name describe where it is stored. There is no restriction on which schema the workload resides in, the name for the table, or how many of these user-defined tables exist. The only restriction is that the format of the user table must correspond to the USER_WORKLOAD table, as described in Table 16-4:

Table 16-4 USER_WORKLOAD

Column        Datatype                                  Optional/Required  Description
QUERY         Can be any VARCHAR or LONG type.          Required           SQL statement
              All character types are supported
OWNER         VARCHAR2(30)                              Required           User who last executed query
APPLICATION   VARCHAR2(30)                              Optional           Application name for the query
FREQUENCY     NUMBER                                    Optional           Number of times query executed
LASTUSE       DATE                                      Optional           Last date of execution
PRIORITY      NUMBER                                    Optional           User-supplied ranking of query
RESPONSETIME  NUMBER                                    Optional           Execution time in seconds
RESULTSIZE    NUMBER                                    Optional           Total bytes selected by the query
SQL_ADDR      NUMBER                                    Optional           Cache address
SQL_HASH      NUMBER                                    Optional           Cache hash value

The following is an example of loading a user workload.

1. Declare an output variable to receive the new identifier.
VARIABLE MY_ID NUMBER;
2. Call the CREATE_ID function to generate a new identifier.
EXECUTE DBMS_OLAP.CREATE_ID(:MY_ID);
3. Insert into the MY_WORKLOAD tables the queries you want advice on.
INSERT INTO advisor_user_workload VALUES
(
 'SELECT SUM(s.quantity_sold)
  FROM sales s, products p
  WHERE s.prod_id = p.prod_id AND p.prod_category = ''Boys''
  GROUP BY p.prod_category', 'SH', 'app1', 10, NULL, 5, NULL,
 NULL)
4. Load the workload from a target table or view.
EXECUTE DBMS_OLAP.LOAD_WORKLOAD_USER(:MY_ID,
 DBMS_OLAP.WORKLOAD_NEW,
   DBMS_OLAP.FILTER_NONE, 'SH', 'MY_WORKLOAD');

Loading a Trace Workload


Alternatively, you can collect a Trace workload from Oracle Enterprise Manager to gather dynamic information about your query workload, which can be used by an advisory function. If Oracle Trace is available, consider using it to collect materialized view usage. Doing so enables you to see which materialized views are in use. It also lets the Advisor detect any unusual query requests from users that would result in recommending some different materialized views.

A workload collected by Oracle Trace is loaded using the procedure LOAD_WORKLOAD_TRACE. You obtain workload_id by calling the procedure CREATE_ID. The value of the flags parameter determines whether the workload is considered new, should be used to overwrite an existing workload, or should be appended to an existing workload. The optional filter ID can be supplied to specify the filter that is to be used against this workload. In addition, you can specify an application name to describe this workload and give every query a default priority. The application name is simply a tag that enables you to classify the workload query. The name can later be used to filter the workload during a RECOMMEND_MVIEW_STRATEGY or EVALUATE_MVIEW_STRATEGY operation.

The priority is an important piece of information. It tells the Advisor how important the query is to the business. When recommendations are formed, the priority will determine its value and will cause the Advisor to make decisions that favor higher ranking queries.

If the owner_name parameter is not defined, then the procedure will expect to find the formatted trace tables in the schema for the current user.

DBMS_OLAP.LOAD_WORKLOAD_TRACE Procedure

Table 16-5 DBMS_OLAP.LOAD_WORKLOAD_TRACE Procedure Parameters

Parameter     Datatype   Description
workload_id   NUMBER     The required id that was returned by the CREATE_ID call
flags         NUMBER     Can take one of the following values:
                         DBMS_OLAP.WORKLOAD_OVERWRITE
                           The load routine will explicitly remove any existing queries from the workload that are owned by the specified collection ID
                         DBMS_OLAP.WORKLOAD_APPEND
                           The load routine preserves any existing queries in the workload. Any queries collected by the load operation will be appended to the end of the specified workload
                         DBMS_OLAP.WORKLOAD_NEW
                           The load routine assumes there are no existing queries in the workload. If it finds an existing workload element, the call will fail with an error
                         Note: the flags have the same behavior irrespective of the LOAD_WORKLOAD operation
filter_id     NUMBER     Specify filter for the workload to be loaded
application   VARCHAR2   The default business application name. This value will be used for a query if one is not found in the target workload
priority      NUMBER     The default business priority to be assigned to every query in the target workload
owner_name    VARCHAR2   The schema that contains the Oracle Trace data. If omitted, the current user will be used

Oracle Trace collects two types of data. One is a duration event, which causes a data item to be collected twice: once at the start of the operation and once at the end of the operation. The duration of the data item is the difference between the start and end of the operation. For example, execution time is collected as a duration event. It first collects the clock time when the operation starts. Then it collects the clock time when the operation ends. Execution time is calculated by subtracting the start time from the end time.

A point event is a static data item that doesn't change over time. For example, an owner name is a static data item that would be the same at the start and the end of an operation.

To collect, analyze and load the summary event set, you must do the following:

1. Set six initialization parameters to collect data using Oracle Trace. Enabling these parameters incurs some additional overhead at database connection, but is otherwise transparent.
ORACLE_TRACE_COLLECTION_NAME = oraclesm or oraclee
   ORACLEE is the Oracle Expert collection, which contains Summary Advisor data and additional data that is only used by Oracle Expert.
   ORACLESM is the Summary Advisor collection that contains only Summary Advisor data and is the preferred collection type.
ORACLE_TRACE_COLLECTION_PATH = location of collection files
ORACLE_TRACE_COLLECTION_SIZE = 0
ORACLE_TRACE_ENABLE = TRUE
ORACLE_TRACE_FACILITY_NAME = oraclesm or oraclee
ORACLE_TRACE_FACILITY_PATH = location of trace facility files

See Also: Oracle9i Database Performance Tuning Guide and Reference for further information regarding these parameters

2. Run the Oracle Trace Manager, specify a collection name, and select the SUMMARY_EVENT set. Oracle Trace Manager reads information from the associated configuration file and registers events to be logged with Oracle. While collection is enabled, the workload information defined in the event set gets written to a flat log file.

3. When collection is complete, Oracle Trace automatically formats the Oracle Trace log file into a set of relations, which have the predefined synonyms beginning with V_192216243_. Alternatively, the collection file, which usually has an extension of .CDF, can be formatted manually using the otrcfmt utility, as shown in this example:

otrcfmt collection_name.cdf user/password@database

The trace data can be formatted in any schema. The LOAD_WORKLOAD_TRACE call lets you specify the location of the data.

4. Run the GATHER_TABLE_STATS procedure of the DBMS_STATS package or ANALYZE ... ESTIMATE STATISTICS to collect cardinality statistics on all fact tables, dimension tables, and key columns (any column that appears in a dimension LEVEL clause or JOIN clause of a CREATE DIMENSION statement).

5. Run the CREATE_ID
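The statistics and load calls described in this section can be sketched in SQL*Plus as follows. This is only an illustration: the application tag 'trace_app', the priority 10, and the SH schema are assumptions, and the positional argument order simply follows the parameter order listed in Table 16-5.

```sql
-- Collect cardinality statistics on a fact table (repeat for each fact and dimension table).
EXECUTE DBMS_STATS.GATHER_TABLE_STATS('SH', 'SALES');

-- Obtain a new workload identifier.
VARIABLE MY_ID NUMBER;
EXECUTE DBMS_OLAP.CREATE_ID(:MY_ID);

-- Load the formatted Oracle Trace data as a new workload, unfiltered,
-- tagging every query with an assumed application name and default priority.
EXECUTE DBMS_OLAP.LOAD_WORKLOAD_TRACE(:MY_ID, DBMS_OLAP.WORKLOAD_NEW, DBMS_OLAP.FILTER_NONE, 'trace_app', 10, 'SH');
```

Using DBMS_OLAP.WORKLOAD_NEW means the call fails if the identifier already has queries attached, which is a reasonable safeguard when the identifier has just been created.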


Part V Warehouse Performance

This section deals with ways to improve your data warehouse's performance, and contains the following chapters:

Schema Modeling Techniques
SQL for Aggregation in Data Warehouses
SQL for Analysis in Data Warehouses
OLAP and Data Mining
Using Parallel Execution
Query Rewrite

Copyright © 1996, 2002 Oracle Corporation. All Rights Reserved.


17 Schema Modeling Techniques

The following topics provide information about schemas in a data warehouse:

Schemas in Data Warehouses
Third Normal Form
Star Schemas
Optimizing Star Queries

Schemas in Data Warehouses

A schema is a collection of database objects, including tables, views, indexes, and synonyms. There is a variety of ways of arranging schema objects in the schema models designed for data warehousing. One data warehouse schema model is a star schema. The Sales History sample schema (the basis for most of the examples in this book) uses a star schema. However, there are other schema models that are commonly used for data warehouses. The most prevalent of these schema models is the third normal form (3NF) schema. Additionally, some data warehouse schemas are neither star schemas nor 3NF schemas, but instead share characteristics of both schemas; these are referred to as hybrid schema models.

The Oracle9i database is designed to support all data warehouse schemas. Some features may be specific to one schema model (such as the star transformation feature, described in "Using Star Transformation", which is specific to star schemas). However, the vast majority of Oracle's data warehousing features are equally applicable to star schemas, 3NF schemas, and hybrid schemas. Key data warehousing capabilities such as partitioning (including the rolling window load technique), parallelism, materialized views, and analytic SQL are implemented in all schema models.

The determination of which schema model should be used for a data warehouse should be based upon the requirements and preferences of the data warehouse project team. Comparing the merits of the alternative schema models is outside of the scope of this book; instead, this chapter will briefly introduce each schema model and suggest how Oracle can be optimized for those environments.

Third Normal Form

Although this guide primarily uses star schemas in its examples, you can also use the third normal form for your data warehouse implementation.

Third normal form modeling is a classical relational-database modeling technique that minimizes data redundancy through normalization. When compared to a star schema, a 3NF schema typically has a larger number of tables due to this normalization process. For example, in Figure 17-1, the orders and order items tables contain information similar to that in the sales table of the star schema in Figure 17-2.

3NF schemas are typically chosen for large data warehouses, especially environments with significant data-loading requirements that are used to feed data marts and execute long-running queries.

The main advantages of 3NF schemas are that they:

Provide a neutral schema design, independent of any application or data-usage considerations
May require less data transformation than less normalized schemas such as star schemas

Figure 17-1 presents a graphical representation of a third normal form schema.

Figure 17-1 Third Normal Form Schema

Optimizing Third Normal Form Queries

Queries on 3NF schemas are often very complex and involve a large number of tables. The performance of joins between large tables is thus a primary consideration when using 3NF schemas.

One particularly important feature for 3NF schemas is partition-wise joins. The largest tables in a 3NF schema should be partitioned to enable partition-wise joins. The most common partitioning technique in these environments is composite range-hash partitioning for the largest tables, with the most common join key chosen as the hash-partitioning key.

Parallelism is often heavily utilized in 3NF environments, and parallelism should typically be enabled in these environments.
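As an illustration of the composite range-hash technique described above, a large 3NF table could be declared roughly as follows. The table orders_demo, its columns, the partition bounds, and the subpartition count of 8 are all hypothetical; only the partitioning clauses are the point of the sketch.

```sql
-- Hypothetical large 3NF table: range-partitioned by date,
-- hash-subpartitioned on the most common join key (cust_id).
CREATE TABLE orders_demo (
  order_id     NUMBER,
  cust_id      NUMBER,
  order_date   DATE,
  order_total  NUMBER(10,2)
)
PARTITION BY RANGE (order_date)
SUBPARTITION BY HASH (cust_id) SUBPARTITIONS 8
( PARTITION orders_1999 VALUES LESS THAN (TO_DATE('01-JAN-2000', 'DD-MON-YYYY')),
  PARTITION orders_2000 VALUES LESS THAN (TO_DATE('01-JAN-2001', 'DD-MON-YYYY'))
)
PARALLEL;
```

For a full partition-wise join, the table on the other side of the join would need to be hash partitioned or subpartitioned on the same key with the same number of partitions.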

Star Schemas

The star schema is perhaps the simplest data warehouse schema. It is called a star schema because the entity-relationship diagram of this schema resembles a star, with points radiating from a central table. The center of the star consists of a large fact table and the points of the star are the dimension tables.

A star schema is characterized by one or more very large fact tables that contain the primary information in the data warehouse, and a number of much smaller dimension tables (or lookup tables), each of which contains information about the entries for a particular attribute in the fact table.

A star query is a join between a fact table and a number of dimension tables. Each dimension table is joined to the fact table using a primary key to foreign key join, but the dimension tables are not joined to each other. The cost-based optimizer recognizes star queries and generates efficient execution plans for them.

A typical fact table contains keys and measures. For example, in the sh sample schema, the fact table, sales, contains the measures quantity_sold, amount, and cost, and the keys cust_id, time_id, prod_id, channel_id, and promo_id. The dimension tables are customers, times, products, channels, and promotions. The product dimension table, for example, contains information about each product number that appears in the fact table.

A star join is a primary key to foreign key join of the dimension tables to a fact table.

The main advantages of star schemas are that they:

Provide a direct and intuitive mapping between the business entities being analyzed by end users and the schema design.
Provide highly optimized performance for typical star queries.
Are widely supported by a large number of business intelligence tools, which may anticipate or even require that the data warehouse schema contain dimension tables.

Star schemas are used for both simple data marts and very large data warehouses.

Figure 17-2 presents a graphical representation of a star schema.

Figure 17-2 Star Schema

Snowflake Schemas

The snowflake schema is a more complex data warehouse model than a star schema, and is a type of star schema. It is called a snowflake schema because the diagram of the schema resembles a snowflake.

Snowflake schemas normalize dimensions to eliminate redundancy. That is, the dimension data has been grouped into multiple tables instead of one large table. For example, a product dimension table in a star schema might be normalized into a products table, a product_category table, and a product_manufacturer table in a snowflake schema. While this saves space, it increases the number of dimension tables and requires more foreign key joins. The result is more complex queries and reduced query performance. Figure 17-3 presents a graphical representation of a snowflake schema.

Figure 17-3 Snowflake Schema

Note: Oracle Corporation recommends you choose a star schema over a snowflake schema unless you have a clear reason not to.

Optimizing Star Queries

You should consider the following when using star queries:

Tuning Star Queries
Using Star Transformation

Tuning Star Queries

To get the best possible performance for star queries, it is important to follow some basic guidelines:

A bitmap index should be built on each of the foreign key columns of the fact table or tables.
The initialization parameter STAR_TRANSFORMATION_ENABLED should be set to true. This enables an important optimizer feature for star queries. It is set to false by default for backward compatibility.
The cost-based optimizer should be used. This does not apply solely to star schemas: all data warehouses should always use the cost-based optimizer.

When a data warehouse satisfies these conditions, the majority of the star queries running in the data warehouse will use a query execution strategy known as the star transformation. The star transformation provides very efficient query performance for star queries.
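The three guidelines above might be applied to the sh sample schema along the following lines. This is a sketch under assumptions: the index names mirror those that appear in the execution plans later in this chapter, LOCAL assumes that sales is a partitioned table, and an installed sample schema may already contain these indexes, in which case the CREATE statements are unnecessary.

```sql
-- Guideline 1: one bitmap index per foreign key column of the fact table.
CREATE BITMAP INDEX sales_time_bix    ON sales (time_id)    LOCAL NOLOGGING;
CREATE BITMAP INDEX sales_cust_bix    ON sales (cust_id)    LOCAL NOLOGGING;
CREATE BITMAP INDEX sales_channel_bix ON sales (channel_id) LOCAL NOLOGGING;

-- Guideline 2: enable the star transformation for the current session.
ALTER SESSION SET star_transformation_enabled = TRUE;

-- Guideline 3: the cost-based optimizer needs statistics to cost its plans.
EXECUTE DBMS_STATS.GATHER_TABLE_STATS('SH', 'SALES');
```

Setting the parameter at the session level keeps the experiment local; setting it in the initialization parameter file applies it to the whole instance.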

Using Star Transformation

The star transformation is a powerful optimization technique that relies upon implicitly rewriting (or transforming) the SQL of the original star query. The end user never needs to know any of the details about the star transformation. Oracle's cost-based optimizer automatically chooses the star transformation where appropriate.

The star transformation is a cost-based query transformation aimed at executing star queries efficiently. Oracle processes a star query using two basic phases. The first phase retrieves exactly the necessary rows from the fact table (the result set). Because this retrieval utilizes bitmap indexes, it is very efficient. The second phase joins this result set to the dimension tables. An example of an end user query is: "What were the sales and profits for the grocery department of stores in the west and southwest sales districts over the last three quarters?" This is a simple star query.

Note: Bitmap indexes are available only if you have purchased the Oracle9i Enterprise Edition. In Oracle9i Standard Edition, bitmap indexes and star transformation are not available.

Star Transformation with a Bitmap Index

A prerequisite of the star transformation is that there be a single-column bitmap index on every join column of the fact table. These join columns include all foreign key columns.

For example, the sales table of the sh sample schema has bitmap indexes on the time_id, channel_id, cust_id, prod_id, and promo_id columns.

Consider the following star query:


SELECT ch.channel_class, c.cust_city, t.calendar_quarter_desc,
   SUM(s.amount_sold) sales_amount
FROM sales s, times t, customers c, channels ch
WHERE s.time_id = t.time_id
AND   s.cust_id = c.cust_id
AND   s.channel_id = ch.channel_id
AND   c.cust_state_province = 'CA'
AND   ch.channel_desc in ('Internet','Catalog')
AND   t.calendar_quarter_desc IN ('1999-Q1','1999-Q2')
GROUP BY ch.channel_class, c.cust_city, t.calendar_quarter_desc;

Oracle processes this query in two phases. In the first phase, Oracle uses the bitmap indexes on the foreign key columns of the fact table to identify and retrieve only the necessary rows from the fact table. That is, Oracle will retrieve the result set from the fact table using essentially the following query:

SELECT ... FROM sales
WHERE time_id IN
  (SELECT time_id FROM times
   WHERE calendar_quarter_desc IN('1999-Q1','1999-Q2'))
   AND cust_id IN
  (SELECT cust_id FROM customers WHERE cust_state_province='CA')
   AND channel_id IN
  (SELECT channel_id FROM channels WHERE channel_desc
   IN('Internet','Catalog'));

This is the transformation step of the algorithm, because the original star query has been transformed into this subquery representation. This method of accessing the fact table leverages the strengths of Oracle's bitmap indexes. Intuitively, bitmap indexes provide a set-based processing scheme within a relational database. Oracle has implemented very fast methods for doing set operations such as AND (an intersection in standard set-based terminology), OR (a set-based union), MINUS, and COUNT.

In this star query, a bitmap index on time_id is used to identify the set of all rows in the fact table corresponding to sales in 1999-Q1. This set is represented as a bitmap (a string of 1's and 0's that indicates which rows of the fact table are members of the set).

A similar bitmap is retrieved for the fact table rows corresponding to sales from 1999-Q2. The bitmap OR operation is used to combine this set of Q1 sales with the set of Q2 sales.

Additional set operations will be done for the customer dimension and the channel dimension. At this point in the star query processing, there are three bitmaps. Each bitmap corresponds to a separate dimension table, and each bitmap represents the set of rows of the fact table that satisfy that individual dimension's constraints.

These three bitmaps are combined into a single bitmap using the bitmap AND operation. This final bitmap represents the set of rows in the fact table that satisfy all of the constraints on the dimension tables. This is the result set, the exact set of rows from the fact table needed to evaluate the query. Note that none of the actual data in the fact table has been accessed. All of these operations rely solely on the bitmap indexes and the dimension tables. Because of the bitmap indexes' compressed data representations, the bitmap set-based operations are extremely efficient.

Once the result set is identified, the bitmap is used to access the actual data from the sales table. Only those rows that are required for the end user's query are retrieved from the fact table. At this point, Oracle has effectively joined all of the dimension tables to the fact table using bitmap indexes. This technique provides excellent performance because Oracle is joining all of the dimension tables to the fact table with one logical join operation, rather than joining each dimension table to the fact table independently.

The second phase of this query is to join these rows from the fact table (the result set) to the dimension tables. Oracle will use the most efficient method for accessing and joining the dimension tables. Many dimensions are very small, and table scans are typically the most efficient access method for these dimension tables. For large dimension tables, table scans may not be the most efficient access method. In the previous example, a bitmap index on product.department can be used to quickly identify all of those products in the grocery department. Oracle's cost-based optimizer automatically determines which access method is most appropriate for a given dimension table, based upon the cost-based optimizer's knowledge about the sizes and data distributions of each dimension table.

The specific join method (as well as indexing method) for each dimension table will likewise be intelligently determined by the cost-based optimizer. A hash join is often the most efficient algorithm for joining the dimension tables. The final answer is returned to the user once all of the dimension tables have been joined. The query technique of retrieving only the matching rows from one table and then joining to another table is commonly known as a semi-join.

Execution Plan for a Star Transformation with a Bitmap Index

The following typical execution plan might result from "Star Transformation with a Bitmap Index":
SELECT STATEMENT
 SORT GROUP BY
  HASH JOIN
   TABLE ACCESS FULL                          CHANNELS
   HASH JOIN
    TABLE ACCESS FULL                         CUSTOMERS
    HASH JOIN
     TABLE ACCESS FULL                        TIMES
     PARTITION RANGE ITERATOR
      TABLE ACCESS BY LOCAL INDEX ROWID       SALES
       BITMAP CONVERSION TO ROWIDS
        BITMAP AND
         BITMAP MERGE
          BITMAP KEY ITERATION
           BUFFER SORT
            TABLE ACCESS FULL                 CUSTOMERS
           BITMAP INDEX RANGE SCAN            SALES_CUST_BIX
         BITMAP MERGE
          BITMAP KEY ITERATION
           BUFFER SORT
            TABLE ACCESS FULL                 CHANNELS
           BITMAP INDEX RANGE SCAN            SALES_CHANNEL_BIX
         BITMAP MERGE
          BITMAP KEY ITERATION
           BUFFER SORT
            TABLE ACCESS FULL                 TIMES
           BITMAP INDEX RANGE SCAN            SALES_TIME_BIX

In this plan, the fact table is accessed through a bitmap access path based on a bitmap AND of three merged bitmaps. The three bitmaps are generated by the BITMAP MERGE row source being fed bitmaps from row source trees underneath it. Each such row source tree consists of a BITMAP KEY ITERATION row source which fetches values from the subquery row source tree, which in this example is a full table access. For each such value, the BITMAP KEY ITERATION row source retrieves the bitmap from the bitmap index. After the relevant fact table rows have been retrieved using this access path, they are joined with the dimension tables and temporary tables to produce the answer to the query.

Star Transformation with a Bitmap Join Index

In addition to bitmap indexes, you can use a bitmap join index during star transformations. Assume you have the following additional index structure:

CREATE BITMAP INDEX sales_c_state_bjix
ON sales(customers.cust_state_province)
FROM sales, customers
WHERE sales.cust_id = customers.cust_id
LOCAL NOLOGGING COMPUTE STATISTICS;

The processing of the same star query using the bitmap join index is similar to the previous example. The only difference is that Oracle will utilize the join index, instead of a single-table bitmap index, to access the customer data in the first phase of the star query.

Execution Plan for a Star Transformation with a Bitmap Join Index

The following typical execution plan might result from "Execution Plan for a Star Transformation with a Bitmap Join Index":

SELECT STATEMENT
 SORT GROUP BY
  HASH JOIN
   TABLE ACCESS FULL                          CHANNELS
   HASH JOIN
    TABLE ACCESS FULL                         CUSTOMERS
    HASH JOIN
     TABLE ACCESS FULL                        TIMES
     PARTITION RANGE ALL
      TABLE ACCESS BY LOCAL INDEX ROWID       SALES
       BITMAP CONVERSION TO ROWIDS
        BITMAP AND
         BITMAP INDEX SINGLE VALUE            SALES_C_STATE_BJIX
         BITMAP MERGE
          BITMAP KEY ITERATION
           BUFFER SORT
            TABLE ACCESS FULL                 CHANNELS
           BITMAP INDEX RANGE SCAN            SALES_CHANNEL_BIX
         BITMAP MERGE
          BITMAP KEY ITERATION
           BUFFER SORT
            TABLE ACCESS FULL                 TIMES
           BITMAP INDEX RANGE SCAN            SALES_TIME_BIX

The difference between this plan as compared to the previous one is that the inner part of the bitmap index scan for the customer dimension has no subselect. This is because the join predicate information on customer.cust_state_province can be satisfied with the bitmap join index sales_c_state_bjix.

How Oracle Chooses to Use Star Transformation

The star transformation is a cost-based transformation in the following sense. The optimizer generates and saves the best plan it can produce without the transformation. If the transformation is enabled, the optimizer then tries to apply it to the query and, if applicable, generates the best plan using the transformed query. Based on a comparison of the cost estimates between the best plans for the two versions of the query, the optimizer will then decide whether to use the best plan for the transformed or untransformed version.

If the query requires accessing a large percentage of the rows in the fact table, it might be better to use a full table scan and not use the transformations. However, if the constraining predicates on the dimension tables are sufficiently selective that only a small portion of the fact table needs to be retrieved, the plan based on the transformation will probably be superior.

Note that the optimizer generates a subquery for a dimension table only if it decides that it is reasonable to do so based on a number of criteria. There is no guarantee that subqueries will be generated for all dimension tables. The optimizer may also decide, based on the properties of the tables and the query, that the transformation does not merit being applied to a particular query. In this case the best regular plan will be used.
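Because the choice is made by the optimizer, one way to find out whether a particular star query was transformed is to inspect its plan. The sketch below reuses the star query from this chapter and assumes that a PLAN_TABLE already exists in the current schema and that the DBMS_XPLAN package is available; the plan you see may differ from the ones printed above, depending on statistics and data volumes.

```sql
ALTER SESSION SET star_transformation_enabled = TRUE;

EXPLAIN PLAN FOR
SELECT ch.channel_class, c.cust_city, t.calendar_quarter_desc,
       SUM(s.amount_sold) sales_amount
FROM   sales s, times t, customers c, channels ch
WHERE  s.time_id = t.time_id
AND    s.cust_id = c.cust_id
AND    s.channel_id = ch.channel_id
AND    c.cust_state_province = 'CA'
AND    ch.channel_desc IN ('Internet', 'Catalog')
AND    t.calendar_quarter_desc IN ('1999-Q1', '1999-Q2')
GROUP BY ch.channel_class, c.cust_city, t.calendar_quarter_desc;

-- Show the plan just explained.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

BITMAP KEY ITERATION and BITMAP AND steps beneath the access to SALES indicate that the transformation was applied; a plain full scan of SALES indicates that the untransformed plan was judged cheaper.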

Star Transformation Restrictions

Star transformation is not supported for tables with any of the following characteristics:

Queries with a table hint that is incompatible with a bitmap access path
Queries that contain bind variables
Tables with too few bitmap indexes. There must be a bitmap index on a fact table column for the optimizer to generate a subquery for it.
Remote fact tables. However, remote dimension tables are allowed in the subqueries that are generated.
Anti-joined tables
Tables that are already used as a dimension table in a subquery
Tables that are really unmerged views, which are not view partitions

The star transformation may not be chosen by the optimizer for the following cases:

Tables that have a good single-table access path
Tables that are too small for the transformation to be worthwhile

In addition, temporary tables will not be used by star transformation under the following conditions:

The database is in read-only mode
The star query is part of a transaction that is in serializable mode


18 SQL for Aggregation in Data Warehouses

This chapter discusses aggregation of SQL, a basic aspect of data warehousing. It contains these topics:

Overview of SQL for Aggregation in Data Warehouses
ROLLUP Extension to GROUP BY
CUBE Extension to GROUP BY
GROUPING Functions
GROUPING SETS Expression
Composite Columns
Concatenated Groupings
Considerations when Using Aggregation
Computation Using the WITH Clause

Overview of SQL for Aggregation in Data Warehouses

Aggregation is a fundamental part of data warehousing. To improve aggregation performance in your warehouse, Oracle provides the following extensions to the GROUP BY clause:

CUBE and ROLLUP extensions to the GROUP BY clause
Three GROUPING functions
GROUPING SETS expression

The CUBE, ROLLUP, and GROUPING SETS extensions to SQL make querying and reporting easier and faster. ROLLUP calculates aggregations such as SUM, COUNT, MAX, MIN, and AVG at increasing levels of aggregation, from the most detailed up to a grand total. CUBE is an extension similar to ROLLUP, enabling a single statement to calculate all possible combinations of aggregations. CUBE can generate the information needed in cross-tabulation reports with a single query.

CUBE, ROLLUP, and the GROUPING SETS extension let you specify exactly the groupings of interest in the GROUP BY clause. This allows efficient analysis across multiple dimensions without performing a CUBE operation. Computing a full cube creates a heavy processing load, so replacing cubes with grouping sets can significantly increase performance. CUBE, ROLLUP, and grouping sets produce a single result set that is equivalent to a UNION ALL of differently grouped rows.

To enhance performance, CUBE, ROLLUP, and GROUPING SETS can be parallelized: multiple processes can simultaneously execute all of these statements. These capabilities make aggregate calculations more efficient, thereby enhancing database performance and scalability.

The three GROUPING functions help you identify the group each row belongs to and enable sorting subtotal rows and filtering results.

See Also: Oracle9i SQL Reference for further details
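The difference between ROLLUP and GROUPING SETS can be sketched against the sh sample schema as follows. The choice of columns (channel_desc and country_id) is an assumption made for illustration; any two dimension attributes would serve.

```sql
-- ROLLUP: detail rows per (channel, country), a subtotal per channel,
-- and a grand total. GROUPING() returns 1 on rows where the column
-- has been aggregated away, which distinguishes subtotal rows.
SELECT ch.channel_desc,
       c.country_id,
       SUM(s.amount_sold)        AS sales_amount,
       GROUPING(ch.channel_desc) AS channel_rolled_up,
       GROUPING(c.country_id)    AS country_rolled_up
FROM   sales s, customers c, channels ch
WHERE  s.cust_id = c.cust_id
AND    s.channel_id = ch.channel_id
GROUP BY ROLLUP (ch.channel_desc, c.country_id);

-- GROUPING SETS: only the two groupings of interest,
-- totals per channel and totals per country, with no detail rows.
SELECT ch.channel_desc,
       c.country_id,
       SUM(s.amount_sold) AS sales_amount
FROM   sales s, customers c, channels ch
WHERE  s.cust_id = c.cust_id
AND    s.channel_id = ch.channel_id
GROUP BY GROUPING SETS ((ch.channel_desc), (c.country_id));
```

The second query returns the same rows as a UNION ALL of two separate GROUP BY queries, but is written as a single statement.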

Analyzing Across Multiple Dimensions

One of the key concepts in decision support systems is multidimensional analysis: examining the enterprise from all necessary combinations of dimensions. We use the term dimension to mean any category used in specifying questions. Among the most commonly specified dimensions are time, geography, product, department, and distribution channel, but the potential dimensions are as endless as the varieties of enterprise activity. The events or entities associated with a particular set of dimension values are usually referred to as facts. The facts might be sales in units or local currency, profits, customer counts, production volumes, or anything else worth tracking.

Here are some examples of multidimensional requests:

Show total sales across all products at increasing aggregation levels for a geography dimension, from state to country to region, for 1999 and 2000.
Create a cross-tabular analysis of our operations showing expenses by territory in South America for 1999 and 2000. Include all possible subtotals.
List the top 10 sales representatives in Asia according to 2000 sales revenue for automotive products, and rank their commissions.

All these requests involve multiple dimensions. Many multidimensional questions require aggregated data and comparisons of data sets, often across time, geography or budgets.

To visualize data that has many dimensions, analysts commonly use the analogy of a data cube, that is, a space where facts are stored at the intersection of n dimensions. Figure 18-1 shows a data cube and how it can be used differently by various groups. The cube stores sales data organized by the dimensions of product, market, sales, and time. Note that this is only a metaphor: the actual data is physically stored in normal tables. The cube data consists of both detail and aggregated data.

Figure 18-1 Logical Cubes and Views by Different Users

You can retrieve slices of data from the cube. These correspond to cross-tabular reports such as the one shown in Table 18-1. Regional managers might study the data by comparing slices of the cube applicable to different markets. In contrast, product managers might compare slices that apply to different products. An ad hoc user might work with a wide variety of constraints, working in a subset cube.

Answering multidimensional questions often involves accessing and querying huge quantities of data, sometimes in millions of rows. Because the flood of detailed data generated by large organizations cannot be interpreted at the lowest level, aggregated views of the information are essential. Aggregations, such as sums and counts, across many dimensions are vital to multidimensional analyses. Therefore, analytical tasks require convenient and efficient data aggregation.

Optimized Performance

Not only multidimensional issues, but all types of processing can benefit from enhanced aggregation facilities. Transaction processing, financial and manufacturing systems--all of these generate large numbers of production reports needing substantial system resources. Improved efficiency when creating these reports will reduce system load. In fact, any computer process that aggregates data from details to higher levels will benefit from optimized aggregation performance.

Oracle9i extensions provide aggregation features and bring many benefits, including:

Simplified programming requiring less SQL code for many tasks
Quicker and more efficient query processing
Reduced client processing loads and network traffic because aggregation work is shifted to servers
Opportunities for caching aggregations because similar queries can leverage existing work

An Aggregate Scenario
To illustrate the use of the GROUP BY extension, this chapter uses the sh data of the sample schema. All the examples refer to data from this scenario. The hypothetical company has sales across the world and tracks sales by both dollars and quantities information. Because there are many rows of data, the queries shown here typically have tight constraints on their WHERE clauses to limit the results to a small number of rows.

Example 18-1 Simple Cross-Tabular Report With Subtotals

Table 18-1 is a sample cross-tabular report showing the total sales by country_id and channel_desc for the US and UK through the Internet and direct sales in September 2000.

Table 18-1 Simple Cross-Tabular Report With Subtotals



19 SQL for Analysis in Data Warehouses

The following topics provide information about how to improve analytical SQL queries in a data warehouse:

Overview of SQL for Analysis in Data Warehouses
Ranking Functions
Windowing Aggregate Functions
Reporting Aggregate Functions
LAG/LEAD Functions
FIRST/LAST Functions
Linear Regression Functions
Inverse Percentile Functions
Hypothetical Rank and Distribution Functions
WIDTH_BUCKET Function
User-Defined Aggregate Functions
CASE Expressions

Overview of SQL for Analysis in Data Warehouses

Oracle has enhanced SQL's analytical processing capabilities by introducing a new family of analytic SQL functions. These analytic functions enable you to calculate:

Rankings and percentiles
Moving window calculations
Lag/lead analysis
First/last analysis
Linear regression statistics

Ranking functions include cumulative distributions, percent rank, and N-tiles. Moving window calculations allow you to find moving and cumulative aggregations, such as sums and averages. Lag/lead analysis enables direct inter-row references so you can calculate period-to-period changes. First/last analysis enables you to find the first or last value in an ordered group.

Other enhancements to SQL include the CASE expression. CASE expressions provide if-then logic useful in many situations.

To enhance performance, analytic functions can be parallelized: multiple processes can simultaneously execute all of these statements. These capabilities make calculations easier and more efficient, thereby enhancing database performance, scalability, and simplicity.

See Also: Oracle9i SQL Reference for further details

Analytic functions are classified as described in Table 19-1.

Table 19-1 Analytic Functions and Their Uses

Type | Used For
Ranking | Calculating ranks, percentiles, and n-tiles of the values in a result set.
Windowing | Calculating cumulative and moving aggregates. Works with these functions: SUM, AVG, MIN, MAX, COUNT, VARIANCE, STDDEV, FIRST_VALUE, LAST_VALUE, and new statistical functions
Reporting | Calculating shares, for example, market share. Works with these functions: SUM, AVG, MIN, MAX, COUNT (with/without DISTINCT), VARIANCE, STDDEV, RATIO_TO_REPORT, and new statistical functions
LAG/LEAD | Finding a value in a row a specified number of rows from a current row.
FIRST/LAST | First or last value in an ordered group.
Linear Regression | Calculating linear regression and other statistics (slope, intercept, and so on).
Inverse Percentile | The value in a data set that corresponds to a specified percentile.
Hypothetical Rank and Distribution | The rank or percentile that a row would have if inserted into a specified data set.

To perform these operations, the analytic functions add several new elements to SQL processing. These elements build on existing SQL to allow flexible and powerful calculation expressions. With just a few exceptions, the analytic functions have these new elements. The processing flow is represented in Figure 19-1.

Figure 19-1 Processing Order


The essential concepts used in analytic functions are:

Processing order

Query processing using analytic functions takes place in three stages. First, all joins, WHERE, GROUP BY and HAVING clauses are performed. Second, the result set is made available to the analytic functions, and all their calculations take place. Third, if the query has an ORDER BY clause at its end, the ORDER BY is processed to allow for precise output ordering. The processing order is shown in Figure 19-1.

Result set partitions

The analytic functions allow users to divide query result sets into groups of rows called partitions. Note that the term partitions used with analytic functions is unrelated to Oracle's table partitions feature. Throughout this chapter, the term partitions refers to only the meaning related to analytic functions. Partitions are created after the groups defined with GROUP BY clauses, so they are available to any aggregate results such as sums and averages. Partition divisions may be based upon any desired columns or expressions. A query result set may be partitioned into just one partition holding all the rows, a few large partitions, or many small partitions holding just a few rows each.

Window

For each row in a partition, you can define a sliding window of data. This window determines the range of rows used to perform the calculations for the current row. Window sizes can be based on either a physical number of rows or a logical interval such as time. The window has a starting row and an ending row. Depending on its definition, the window may move at one or both ends. For instance, a window defined for a cumulative sum function would have its starting row fixed at the first row of its partition, and its ending row would slide from the starting point all the way to the last row of the partition. In contrast, a window defined for a moving average would have both its starting and end points slide so that they maintain a constant physical or logical range.

A window can be set as large as all the rows in a partition or just a sliding window of one row within a partition. When a window is near a border, the function returns results for only the available rows, rather than warning you that the results are not what you want.

When using window functions, the current row is included during calculations, so you should only specify (n-1) when you are dealing with n items.

Current row

Each calculation performed with an analytic function is based on a current row within a partition. The current row serves as the reference point determining the start and end of the window. For instance, a centered moving average calculation could be defined with a window that holds the current row, the six preceding rows, and the following six rows. This would create a sliding window of 13 rows, as shown in Figure 19-2.

Figure 19-2 Sliding Window Example
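The interplay of partitions, windows, and the current row can be sketched in a few lines of Python. This is a minimal sketch of the concepts only, not of Oracle's implementation; the function windowed, its parameters, and the sample rows are all hypothetical. Window bounds are physical row offsets, and None stands for an edge pinned to the partition border.

```python
def windowed(rows, partition_key, order_key, value, preceding, following, agg):
    """Apply `agg` over a sliding window around each current row.

    `preceding` and `following` are physical row offsets relative to the
    current row; None means the edge is fixed at the partition border.
    """
    partitions = {}
    for row in rows:
        partitions.setdefault(partition_key(row), []).append(row)

    out = []
    for part in partitions.values():
        part = sorted(part, key=order_key)
        for i, current in enumerate(part):
            start = 0 if preceding is None else max(0, i - preceding)
            end = len(part) if following is None else min(len(part), i + following + 1)
            # The slice always contains the current row; near a border it
            # simply holds fewer rows and no warning is raised.
            window = [value(r) for r in part[start:end]]
            out.append((current, agg(window)))
    return out


# Hypothetical rows: (channel, month number, sales).
rows = [
    ("Internet", 1, 10), ("Internet", 2, 20), ("Internet", 3, 30),
    ("Catalog", 1, 5), ("Catalog", 2, 15),
]
by_channel = lambda r: r[0]
by_month = lambda r: r[1]
sales = lambda r: r[2]

# Cumulative sum: start fixed at the first row, end slides with the current row.
cumulative = windowed(rows, by_channel, by_month, sales, None, 0, sum)
# Centered moving average: both ends slide (1 preceding, 1 following).
moving = windowed(rows, by_channel, by_month, sales, 1, 1,
                  lambda w: sum(w) / len(w))

for row, total in cumulative:
    print(row, total)
```

A window of one preceding and one following row spans three rows including the current one, which mirrors the (n-1) remark above: to cover n items you specify offsets that add up to n-1.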


Ranking Functions

A ranking function computes the rank of a record compared to other records in the dataset based on the values of a set of measures. The types of ranking function are:

RANK and DENSE_RANK
CUME_DIST and PERCENT_RANK
NTILE
ROW_NUMBER

RANK and DENSE_RANK

The RANK and DENSE_RANK functions allow you to rank items in a group, for example, finding the top three products sold in California last year. There are two functions that perform ranking, as shown by the following syntax:

RANK ( ) OVER ( [query_partition_clause] order_by_clause )
DENSE_RANK ( ) OVER ( [query_partition_clause] order_by_clause )

The difference between RANK and DENSE_RANK is that DENSE_RANK leaves no gaps in ranking sequence when there are ties. That is, if you were ranking a competition using DENSE_RANK and had three people tie for second place, you would say that all three were in second place and that the next person came in third. The RANK function would also give three people in second place, but the next person would be in fifth place.

The following are some relevant points about RANK:

Ascending is the default sort order, which you may want to change to descending.

The expressions in the optional PARTITION BY clause divide the query result set into groups within which the RANK function operates. That is, RANK gets reset whenever the group changes. In effect, the value expressions of the PARTITION BY clause define the reset boundaries.

If the PARTITION BY clause is missing, then ranks are computed over the entire query result set.

The ORDER BY clause specifies the measures (<value expression>s) on which ranking is done and defines the order in which rows are sorted in each group (or partition). Once the data is sorted within each partition, ranks are given to each row starting from 1.

The NULLS FIRST | NULLS LAST clause indicates the position of NULLs in the ordered sequence, either first or last in the sequence. The order of the sequence would make NULLs compare either high or low with respect to non-NULL values. If the sequence were in ascending order, then NULLS FIRST implies that NULLs are smaller than all other non-NULL values and NULLS LAST implies they are larger than non-NULL values. It is the opposite for descending order. See the example in "Treatment of NULLs".

If the NULLS FIRST | NULLS LAST clause is omitted, then the ordering of the null values depends on the ASC or DESC arguments. Null values are considered larger than any other values. If the ordering sequence is ASC, then nulls will appear last; nulls will appear first otherwise. Nulls are considered equal to other nulls and, therefore, the order in which nulls are presented is non-deterministic.
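The gap behavior that separates RANK from DENSE_RANK is easy to check with a small Python sketch. The function name and the sample finishing times are hypothetical, and the sketch covers a single partition only; to mimic PARTITION BY you would call it once per group, which is what resets the ranks.

```python
def rank_and_dense_rank(scores, descending=False):
    """Return (score, rank, dense_rank) tuples in sorted order."""
    ordered = sorted(scores, reverse=descending)
    result = []
    rank = dense = 0
    previous = object()  # sentinel that compares unequal to every score
    for position, score in enumerate(ordered, start=1):
        if score != previous:
            rank = position  # RANK jumps to the row position, leaving gaps
            dense += 1       # DENSE_RANK advances by exactly one
            previous = score
        result.append((score, rank, dense))
    return result


# Hypothetical finishing times: three competitors tie for second place.
for score, rank, dense in rank_and_dense_rank([52, 50, 55, 52, 52]):
    print(score, rank, dense)
```

With ascending order, the default noted above, the three tied values share rank 2 under both functions; the next value receives 5 from RANK and 3 from DENSE_RANK.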

Ranking Order

The following example shows how the [ASC | DESC] option changes the ranking order.

Example 19-1 Ranking Order

SELECT channel_desc,
   TO_CHAR(SUM(amount_sold), '9,999,999,999') SALES$,
   RANK() OVER (ORDER BY SUM(amount_sold) ) AS default_rank,
   RANK() OVER (ORDER BY SUM(amount_sold) DESC NULLS LAST) AS custom_rank
FROM sales, products, customers, times, channels
WHERE sales.prod_id=products.prod_id AND
   sales.cust_id=customers.cust_id AND
   sales.time_id=times.time_id AND
   sales.channel_id=channels.channel_id AND
   times.calendar_month_desc IN ('2000-09', '2000-10')
    AND country_id='US'
GROUP BY channel_desc;

CHANNEL_DESC         SALES$         DEFAULT_RANK CUSTOM_RANK
-------------------- -------------- ------------ -----------
Direct Sales              5,744,263            5           1
Internet                  3,625,993            4           2
Catalog                   1,858,386            3           3
Partners                  1,500,213            2           4
Tele Sales                  604,656            1           5

While the data in this result is ordered on the measure SALES$, in general, it is not guaranteed by the RANK function that the data will be sorted on the measures. If you want the data to be sorted on SALES$ in your result, you must specify it explicitly with an ORDER BY clause, at the end of the SELECT statement.

Ranking on Multiple Expressions

Ranking functions need to resolve ties between values in the set. If the first expression cannot resolve ties, the second expression is used to resolve ties and so on. For example, here is a query ranking four of the sales channels over two months based on their dollar sales, breaking ties with the unit sales. (Note that the TRUNC function is used here only to create tie values for this query.)

Example 19-2 Ranking On Multiple Expressions

SELECT channel_desc, calendar_month_desc,
   TO_CHAR(TRUNC(SUM(amount_sold),-6), '9,999,999,999') SALES$,
   TO_CHAR(SUM(quantity_sold), '9,999,999,999') SALES_Count,
   RANK() OVER (ORDER BY trunc(SUM(amount_sold), -6) DESC,
                SUM(quantity_sold) DESC) AS col_rank
FROM sales, products, customers, times, channels
WHERE sales.prod_id=products.prod_id AND
   sales.cust_id=customers.cust_id AND
   sales.time_id=times.time_id AND
   sales.channel_id=channels.channel_id AND
   times.calendar_month_desc IN ('2000-09', '2000-10') AND
   channels.channel_desc<>'Tele Sales'
GROUP BY channel_desc, calendar_month_desc;

CHANNEL_DESC         CALENDAR SALES$         SALES_COUNT     COL_RANK
-------------------- -------- -------------- -------------- ---------
Direct Sales         2000-10      10,000,000        192,551         1
Direct Sales         2000-09       9,000,000        176,950         2
Internet             2000-10       6,000,000        123,153         3
Internet             2000-09       6,000,000        113,006         4
Catalog              2000-10       3,000,000         59,782         5
Catalog              2000-09       3,000,000         54,857         6
Partners             2000-10       2,000,000         50,773         7
Partners             2000-09       2,000,000         46,220         8

The sales_count column breaks the ties for three pairs of values.

RANK and DENSE_RANK Difference

The difference between RANK and DENSE_RANK functions is illustrated as follows:

Example 19-3 RANK and DENSE_RANK

SELECT channel_desc, calendar_month_desc,
   TO_CHAR(TRUNC(SUM(amount_sold),-6), '9,999,999,999') SALES$,
   RANK() OVER (ORDER BY trunc(SUM(amount_sold),-6) DESC)
      AS RANK,
   DENSE_RANK() OVER (ORDER BY TRUNC(SUM(amount_sold),-6) DESC)
      AS DENSE_RANK
FROM sales, products, customers, times, channels
WHERE sales.prod_id=products.prod_id AND
   sales.cust_id=customers.cust_id AND
   sales.time_id=times.time_id AND
   sales.channel_id=channels.channel_id AND
   times.calendar_month_desc IN ('2000-09', '2000-10') AND
   channels.channel_desc<>'Tele Sales'
GROUP BY channel_desc, calendar_month_desc;

CHANNEL_DESC         CALENDAR SALES$              RANK DENSE_RANK
-------------------- -------- -------------- --------- ----------
Direct Sales         2000-10


20 OLAP and Data Mining

In large data warehouse environments, many different types of analysis can occur. In addition to SQL queries, you may also apply more advanced analytical operations to your data. Two major types of such analysis are OLAP (On-Line Analytic Processing) and data mining. Rather than having a separate OLAP or data mining engine, Oracle has integrated OLAP and data mining capabilities directly into the database server. Oracle OLAP and Oracle Data Mining are options to the Oracle9i Database. This chapter provides a brief introduction to these technologies, and more detail can be found in these products' respective documentation.

The following topics provide an introduction to Oracle's OLAP and data mining capabilities:

OLAP
Data Mining

See Also: Oracle9i OLAP User's Guide for further information regarding OLAP and Oracle Data Mining documentation for further information regarding data mining

OLAP

Oracle9i OLAP adds the query performance and calculation capability previously found only in multidimensional databases to Oracle's relational platform. In addition, it provides a Java OLAP API that is appropriate for the development of internet-ready analytical applications. Unlike other combinations of OLAP and RDBMS technology, Oracle9i OLAP is not a multidimensional database using bridges to move data from the relational data store to a multidimensional data store. Instead, it is truly an OLAP-enabled relational database. As a result, Oracle9i provides the benefits of a multidimensional database along with the scalability, accessibility, security, manageability, and high availability of the Oracle9i database. The Java OLAP API, which is specifically designed for internet-based analytical applications, offers productive data access.

See Also: Oracle9i OLAP User's Guide for further information regarding OLAP

Benefits of OLAP and RDBMS Integration

Basing an OLAP system directly on the Oracle server offers the following benefits:

Scalability
Availability
Manageability
Backup and Recovery
Security

Scalability

Oracle9i OLAP is highly scalable. In today's environment, there is tremendous growth along three dimensions of analytic applications: number of users, size of data, complexity of analyses. There are more users of analytical applications, and they need access to more data to perform more sophisticated analysis and target marketing. For example, a telephone company might want a customer dimension to include detail such as all telephone numbers as part of an application that is used to analyze customer turnover. This would require support for multi-million row dimension tables and very large volumes of fact data. Oracle9i can handle very large data sets using parallel execution and partitioning, as well as offering support for advanced hardware and clustering.

Availability

Oracle9i includes many features that support high availability. One of the most significant is partitioning, which allows management of precise subsets of tables and indexes, so that management operations affect only small pieces of these data structures. By partitioning tables and indexes, data management processing time is reduced, thus minimizing the time data is unavailable. Another feature supporting high availability is transportable tablespaces. With transportable tablespaces, large data sets, including tables and indexes, can be added with almost no processing to other databases. This enables extremely rapid data loading and updates.

Manageability

Oracle enables you to precisely control resource utilization. The Database Resource Manager, for example, provides a mechanism for allocating the resources of a data warehouse among different sets of end-users. Consider an environment where the marketing department and the sales department share an OLAP system. Using the Database Resource Manager, you could specify that the marketing department receive at least 60 percent of the CPU resources of the machines, while the sales department receive 40 percent of the CPU resources. You can also further specify limits on the total number of active sessions, and the degree of parallelism of individual queries for each department.

Another resource management facility is the progress monitor, which gives end users and administrators the status of long-running operations. Oracle9i maintains statistics describing the percent-complete of these operations. Oracle Enterprise Manager enables you to view a bar-graph display of these operations showing what percent complete they are. Moreover, any other tool or any database administrator can also retrieve progress information directly from the Oracle data server, using system views.

Backup and Recovery

Oracle provides a server-managed infrastructure for backup, restore, and recovery tasks that enables simpler, safer operations at terabyte scale. Some of the highlights are:

Details related to backup, restore, and recovery operations are maintained by the server in a recovery catalog and automatically used as part of these operations. This reduces administrative burden and minimizes the possibility of human errors.

Backup and recovery operations are fully integrated with partitioning. Individual partitions, when placed in their own tablespaces, can be backed up and restored independently of the other partitions of a table.

Oracle includes support for incremental backup and recovery using Recovery Manager, enabling operations to be completed efficiently within times proportional to the amount of changes, rather than the overall size of the database.

The backup and recovery technology is highly scalable, and provides tight interfaces to industry-leading media management subsystems. This provides for efficient operations that can scale up to handle very large volumes of data. Open Platforms for more hardware options & enterprise-level platforms.

See Also: Oracle9i Recovery Manager User's Guide for further details

Security

Just as the demands of real-world transaction processing required Oracle to develop robust features for scalability, manageability, and backup and recovery, they led Oracle to create industry-leading security features. The security features in Oracle have reached the highest levels of U.S. government certification for database trustworthiness. Oracle's fine grained access control feature enables cell-level security for OLAP users. Fine grained access control works with minimal burden on query processing, and it enables efficient centralized security management.

Data Mining

Oracle enables data mining inside the database for performance and scalability. Some of the capabilities are:

An API that provides programmatic control and application integration
Analytical capabilities with OLAP and statistical functions in the database
Multiple algorithms: Naïve Bayes, decision trees, clustering, and association rules
Real-time and batch scoring modes
Multiple prediction types
Association insights

See Also: Oracle Data Mining documentation for more information

Enabling Data Mining Applications

Oracle9i Data Mining provides a Java API to exploit the data mining functionality that is embedded within the Oracle9i database. By delivering complete programmatic control of the database in data mining, Oracle Data Mining (ODM) delivers powerful, scalable modeling and real-time scoring. This enables e-businesses to incorporate predictions and classifications in all processes and decision points throughout the business cycle.

ODM is designed to meet the challenges of vast amounts of data, delivering accurate insights completely integrated into e-business applications. This integrated intelligence enables the automation and decision speed that e-businesses require in order to compete today.

Predictions and Insights

Oracle Data Mining uses data mining algorithms to sift through the large volumes of data generated by e-businesses to produce, evaluate, and deploy predictive models. It also enriches mission critical applications in CRM, manufacturing control, inventory management, customer service and support, Web portals, wireless devices and other fields with context-specific recommendations and predictive monitoring of critical processes. ODM delivers real-time answers to questions such as:

Which N items is person A most likely to buy or like?
What is the likelihood that this product will be returned for repair?

Mining Within the Database Architecture

Oracle Data Mining performs all the phases of data mining within the database. In each data mining phase, this architecture results in significant improvements including performance, automation, and integration.

Data Preparation

Data preparation can create new tables or views of existing data. Both options perform faster than moving data to an external data mining utility and offer the programmer the option of snapshots or real-time updates.

Oracle Data Mining provides utilities for complex, data mining-specific tasks. Binning improves model build time and model performance, so ODM provides a utility for user-defined binning. ODM accepts data in either single record format or in transactional format and performs mining on transactional formats. Single record format is most common in applications, so ODM provides a utility for transforming single record format.

Associated analysis for preparatory data exploration and model evaluation is extended by Oracle's statistical functions and OLAP capabilities. Because these also operate within the database, they can all be incorporated into a seamless application that shares database objects. This allows for more functional and faster applications.

Model Building

Oracle Data Mining provides four algorithms: Naïve Bayes, Decision Tree, Clustering, and Association Rules. These algorithms address a broad spectrum of business problems, ranging from predicting the future likelihood of a customer purchasing a given product, to understanding which products are likely to be purchased together in a single trip to the grocery store. All model building takes place inside the database. Once again, the data does not need to move outside the database in order to build the model, and therefore the entire data-mining process is accelerated.

Model Evaluation

Models are stored in the database and directly accessible for evaluation, reporting, and further analysis by a wide variety of tools and application functions. ODM provides APIs for calculating traditional confusion matrixes and lift charts. It stores the models, the underlying data, and these analysis results together in the database to allow further analysis, reporting and application specific model management.

Scoring

Oracle Data Mining provides both batch and real-time scoring. In batch mode, ODM takes a table as input. It scores every record, and returns a scored table as a result. In real-time mode, parameters for a single record are passed in and the scores are returned in a Java object.

In both modes, ODM can deliver a variety of scores. It can return a rating or probability of a specific outcome. Alternatively it can return a predicted outcome and the probability of that outcome occurring. Some examples follow.

How likely is this event to end in outcome A?
Which outcome is most likely to result from this event?
What is the probability of each possible outcome for this event?
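The confusion matrix mentioned under Model Evaluation is a simple tally of actual outcomes against predicted ones. The following Python sketch shows the general idea only; it does not use or imitate the ODM API, and the function names, labels, and sample data are hypothetical.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return Counter(zip(actual, predicted))


def accuracy(matrix):
    """Share of records whose predicted label equals the actual label."""
    total = sum(matrix.values())
    correct = sum(n for (a, p), n in matrix.items() if a == p)
    return correct / total if total else 0.0


# Hypothetical batch scoring result: will a product be returned for repair?
actual = ["return", "keep", "keep", "return", "keep"]
predicted = ["return", "keep", "return", "keep", "keep"]

matrix = confusion_matrix(actual, predicted)
for (a, p), n in sorted(matrix.items()):
    print(f"actual={a:6} predicted={p:6} count={n}")
print("accuracy:", accuracy(matrix))
```

The off-diagonal cells, where actual and predicted labels differ, show which kinds of mistakes a model makes, which a single accuracy figure hides.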

Java API

The Oracle Data Mining API lets you build analytical models and deliver real-time predictions in any application that supports Java. The API is based on the emerging JSR-073 standard.

Copyright © 1996, 2002 Oracle Corporation. All Rights Reserved.


21 Using Parallel Execution

This chapter covers tuning in a parallel execution environment and discusses:

Introduction to Parallel Execution Tuning
Types of Parallelism
Initializing and Tuning Parameters for Parallel Execution
Tuning General Parameters for Parallel Execution
Monitoring and Diagnosing Parallel Execution Performance
Affinity and Parallel Operations
Miscellaneous Parallel Execution Tuning Tips

Introduction to Parallel Execution Tuning

Parallel execution dramatically reduces response time for data-intensive operations on large databases typically associated with decision support systems (DSS) and data warehouses. You can also implement parallel execution on certain types of online transaction processing (OLTP) and hybrid systems. Parallel execution improves processing for:

Queries requiring large table scans, joins, or partitioned index scans
Creation of large indexes
Creation of large tables (including materialized views)
Bulk inserts, updates, merges, and deletes

You can also use parallel execution to access object types within an Oracle database. For example, you can use parallel execution to access large objects (LOBs).

Parallel execution benefits systems with all of the following characteristics:

Symmetric multiprocessors (SMPs), clusters, or massively parallel systems
Sufficient I/O bandwidth
Underutilized or intermittently used CPUs (for example, systems where CPU usage is typically less than 30%)
Sufficient memory to support additional memory-intensive processes, such as sorts, hashing, and I/O buffers

If your system lacks any of these characteristics, parallel execution might not significantly improve performance. In fact, parallel execution may reduce system performance on overutilized systems or systems with small I/O bandwidth.

When to Implement Parallel Execution

Parallel execution provides the greatest performance improvements in DSS and data warehousing environments. OLTP systems also benefit from parallel execution, but usually only during batch processing.

During the day, most OLTP systems should probably not use parallel execution. During off-hours, however, parallel execution can effectively process high-volume batch operations. For example, a bank might use parallelized batch programs to perform millions of updates to apply interest to accounts.

Operations That Can Be Parallelized

The Oracle server can use parallel execution for any of the following:

Access methods
For example, table scans, index full scans, and partitioned index range scans.

Join methods
For example, nested loop, sort merge, hash, and star transformation.

DDL statements
CREATE TABLE AS SELECT, CREATE INDEX, REBUILD INDEX, REBUILD INDEX PARTITION, and MOVE SPLIT COALESCE PARTITION

DML statements
For example, INSERT AS SELECT, updates, deletes, and MERGE operations.

Miscellaneous SQL operations
For example, GROUP BY, NOT IN, SELECT DISTINCT, UNION, UNION ALL, CUBE, and ROLLUP, as well as aggregate and table functions.

The Parallel Execution Server Pool

When an instance starts up, Oracle creates a pool of parallel execution servers which are available for any parallel operation. The initialization parameter PARALLEL_MIN_SERVERS specifies the number of parallel execution servers that Oracle creates at instance startup.

When executing a parallel operation, the parallel execution coordinator obtains parallel execution servers from the pool and assigns them to the operation. If necessary, Oracle can create additional parallel execution servers for the operation. These parallel execution servers remain with the operation throughout job execution, then become available for other operations. After the statement has been processed completely, the parallel execution servers return to the pool.

Note: The parallel execution coordinator and the parallel execution servers can only service one statement at a time. A parallel execution coordinator cannot coordinate, for example, a parallel query and a parallel DML statement at the same time.
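The pool life cycle described above (create at startup, acquire for an operation, grow if necessary, return on completion) can be modeled with a toy Python class. This is a sketch under stated assumptions, not Oracle code: the class ServerPool, the server names, and the upper bound max_servers are hypothetical, and the behavior when the pool is exhausted is assumed purely for illustration.

```python
class ServerPool:
    """Toy model of a parallel execution server pool (not Oracle code)."""

    def __init__(self, min_servers, max_servers):
        # min_servers plays the role of PARALLEL_MIN_SERVERS. max_servers is
        # an assumed upper bound; the text above does not describe one.
        self.max_servers = max_servers
        self.idle = [f"P{n:03d}" for n in range(min_servers)]
        self.created = min_servers

    def acquire(self, dop):
        """Hand out up to `dop` servers, creating new ones if necessary."""
        granted = []
        while len(granted) < dop:
            if self.idle:
                granted.append(self.idle.pop())
            elif self.created < self.max_servers:
                granted.append(f"P{self.created:03d}")
                self.created += 1
            else:
                break  # assumption: the caller proceeds with fewer servers
        return granted

    def release(self, servers):
        """Return servers to the pool once the statement has completed."""
        self.idle.extend(servers)


pool = ServerPool(min_servers=2, max_servers=4)
first = pool.acquire(3)   # two idle servers plus one newly created
second = pool.acquire(2)  # only one more server can be created
print(first, second)
pool.release(first)
print(len(pool.idle), "servers idle after release")
```

Servers held by the first statement are unavailable to the second until they are released, which mirrors the note that a server services only one statement at a time.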

When a user issues a SQL statement, the optimizer decides whether to execute the operations in parallel and determines the degree of parallelism (DOP) for each operation. You can specify the number of parallel execution servers required for an operation in various ways.

If the optimizer targets the statement for parallel processing, the following sequence of events takes place:

1. The SQL statement's foreground process becomes a parallel execution coordinator.
2. The parallel execution coordinator obtains as many parallel execution servers as needed (determined by the DOP) from the server pool or creates new parallel execution servers as needed.
3. Oracle executes the statement as a sequence of operations. Each operation is performed in parallel, if possible.
4. When statement processing is completed, the coordinator returns any resulting data to the user process th
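The coordinator-and-servers sequence can be sketched as a scatter/gather pattern in Python. This is an analogy only, assuming a simple sum as the operation: the function parallel_sum is hypothetical, worker threads stand in for parallel execution servers, and Python threads illustrate the structure rather than real CPU parallelism.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_sum(values, dop):
    """Sum `values` with `dop` workers coordinated by the calling thread."""
    if dop < 1:
        raise ValueError("dop must be at least 1")
    # Step 1: the caller acts as the coordinator.
    # Step 2: split the work into at most `dop` contiguous chunks.
    size = -(-len(values) // dop)  # ceiling division
    chunks = [values[i:i + size] for i in range(0, len(values), size)] if size else []
    # Step 3: each worker performs its share of the operation.
    with ThreadPoolExecutor(max_workers=dop) as servers:
        partials = list(servers.map(sum, chunks))
    # Step 4: the coordinator combines the partial results for the caller;
    # leaving the `with` block releases the workers.
    return sum(partials)


print(parallel_sum(list(range(1, 101)), dop=4))
```

The result does not depend on the degree of parallelism; only the way the work is divided changes.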


** Duer+ 1e(rite
This chapter disc sses how Oracle rewrites 4 eries" It contains#

Overview of F er* +ewrite 6nablin! F er* +ewrite How Oracle +ewrites F eries Special Cases for F er* +ewrite %id F er* +ewrite Occ r& %esi!n Considerations for Improvin! F er* +ewrite Capabilities

Overview of Query Rewrite

One of the major benefits of creating and maintaining materialized views is the ability to take advantage of query rewrite, which transforms a SQL statement expressed in terms of tables or views into a statement accessing one or more materialized views that are defined on the detail tables. The transformation is transparent to the end user or application, requiring no intervention and no reference to the materialized view in the SQL statement. Because query rewrite is transparent, materialized views can be added or dropped just like indexes without invalidating the SQL in the application code.

Before the query is rewritten, it is subjected to several checks to determine whether it is a candidate for query rewrite. If the query fails any of the checks, then the query is applied to the detail tables rather than the materialized view. This can be costly in terms of response time and processing power.

The Oracle optimizer uses two different methods to recognize when to rewrite a query in terms of one or more materialized views. The first method is based on matching the SQL text of the query with the SQL text of the materialized view definition. If the first method fails, the optimizer uses the more general method in which it compares joins, selections, data columns, grouping columns, and aggregate functions between the query and a materialized view.

Query rewrite operates on queries and subqueries in the following types of SQL statements:

SELECT
CREATE TABLE ... AS SELECT
INSERT INTO ... SELECT

It also operates on subqueries in the set operators UNION, UNION ALL, INTERSECT, and MINUS, and subqueries in DML statements such as INSERT, DELETE, and UPDATE.

Several factors affect whether or not a given query is rewritten to use one or more materialized views:

Enabling or disabling query rewrite:
    by the CREATE or ALTER statement for individual materialized views
    by the initialization parameter QUERY_REWRITE_ENABLED
    by the REWRITE and NOREWRITE hints in SQL statements
Rewrite integrity levels
Dimensions and constraints

There is also an explain rewrite procedure which will advise whether query rewrite is possible on a query and if so, which materialized views will be used.
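As a minimal sketch of the explain rewrite procedure, the block below calls DBMS_MVIEW.EXPLAIN_REWRITE and then reads back its messages. It assumes that the output table REWRITE_TABLE already exists in the current schema (Oracle supplies the script utlxrw.sql to create it) and that the materialized view sum_sales_pscat_week_mv from this chapter has been created. The statement identifier 'demo_1' and the sample query are illustrative choices, not taken from the original text.

```sql
-- Sketch only: assumes REWRITE_TABLE exists (created by utlxrw.sql)
-- and that SUM_SALES_PSCAT_WEEK_MV exists in the current schema.
BEGIN
  DBMS_MVIEW.EXPLAIN_REWRITE(
    'SELECT p.prod_subcategory, SUM(s.amount_sold) ' ||
    'FROM sales s, products p ' ||
    'WHERE s.prod_id = p.prod_id ' ||
    'GROUP BY p.prod_subcategory',   -- query to analyze (illustrative)
    'SUM_SALES_PSCAT_WEEK_MV',       -- materialized view to test against
    'demo_1');                       -- hypothetical statement identifier
END;
/

-- Read the messages that explain why rewrite is or is not possible
SELECT message
FROM   rewrite_table
WHERE  statement_id = 'demo_1'
ORDER BY sequence;
```

Whether this particular query is reported as rewritable depends on the constraints and dimensions declared in the schema; the messages state the reason either way.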

Cost-Based Rewrite

Query rewrite is available with cost-based optimization. Oracle optimizes the input query with and without rewrite and selects the least costly alternative. The optimizer rewrites a query by rewriting one or more query blocks, one at a time.

If the rewrite logic has a choice between multiple materialized views to rewrite a query block, it will select the one which can result in reading in the least amount of data. After a materialized view has been picked for a rewrite, the optimizer performs the rewrite, and then tests whether the rewritten query can be rewritten further with another materialized view. This process continues until no further rewrites are possible. Then the rewritten query is optimized and the original query is optimized. The optimizer compares these two optimizations and selects the least costly alternative.

Since optimization is based on cost, it is important to collect statistics both on tables involved in the query and on the tables representing materialized views. Statistics are fundamental measures, such as the number of rows in a table, that are used to calculate the cost of a rewritten query. They are created by using the DBMS_STATS package.

Queries that contain inline or named views are also candidates for query rewrite. When a query contains a named view, the view name is used to do the matching between a materialized view and the query. When a query contains an inline view, the inline view can be merged into the query before matching between a materialized view and the query occurs.

In addition, if the inline view's text definition exactly matches with that of an inline view present in any eligible materialized view, general rewrite may be possible. This is because, whenever a materialized view contains exactly identical inline view text to the one present in a query, query rewrite treats such an inline view like a named view or a table.

Figure 22-1 presents a graphical view of the cost-based approach used during the rewrite process.

Figure 22-1 The Query Rewrite Process

When Does Oracle Rewrite a Query?

A query is rewritten only when a certain number of conditions are met:

Query rewrite must be enabled for the session.
A materialized view must be enabled for query rewrite.
The rewrite integrity level should allow the use of the materialized view. For example, if a materialized view is not fresh and query rewrite integrity is set to enforced, then the materialized view will not be used.
Either all or part of the results requested by the query must be obtainable from the precomputed result stored in the materialized view.

To determine this, the optimizer may depend on some of the data relationships declared by the user using constraints and dimensions. Such data relationships include hierarchies, referential integrity, uniqueness of key data, and so on.

Sample Schema and Materialized Views

The following sections use an example schema and a few materialized views to illustrate how the optimizer uses data relationships to rewrite queries. Oracle's sh sample schema consists of these tables:

COSTS, COUNTRIES, CUSTOMERS, PRODUCTS, PROMOTIONS, TIMES, CHANNELS, SALES

See Also: Oracle9i Sample Schemas for details regarding the sh sample schema

Examples of Materialized Views for Query Rewrite

The query rewrite examples in this chapter mainly refer to the following materialized views. Note that those materialized views do not necessarily represent the most efficient implementation for the sh sample schema. Instead, they are a base for demonstrating Oracle's rewrite capabilities. Further examples demonstrating specific functionality can be found in the specific context.

The following materialized views contain joins and aggregates:
CREATE MATERIALIZED VIEW sum_sales_pscat_week_mv
  ENABLE QUERY REWRITE
  AS
SELECT p.prod_subcategory, t.week_ending_day,
       SUM(s.amount_sold) AS sum_amount_sold
FROM   sales s, products p, times t
WHERE  s.time_id=t.time_id
AND    s.prod_id=p.prod_id
GROUP BY p.prod_subcategory, t.week_ending_day;

CREATE MATERIALIZED VIEW sum_sales_prod_week_mv
  ENABLE QUERY REWRITE
  AS
SELECT p.prod_id, t.week_ending_day, s.cust_id,
       SUM(s.amount_sold) AS sum_amount_sold
FROM   sales s, products p, times t
WHERE  s.time_id=t.time_id
AND    s.prod_id=p.prod_id
GROUP BY p.prod_id, t.week_ending_day, s.cust_id;

CREATE MATERIALIZED VIEW sum_sales_pscat_month_city_mv
  ENABLE QUERY REWRITE
  AS
SELECT p.prod_subcategory, t.calendar_month_desc, c.cust_city,
       SUM(s.amount_sold) AS sum_amount_sold,
       COUNT(s.amount_sold) AS count_amount_sold
FROM   sales s, products p, times t, customers c
WHERE  s.time_id=t.time_id
AND    s.prod_id=p.prod_id
AND    s.cust_id=c.cust_id
GROUP BY p.prod_subcategory, t.calendar_month_desc, c.cust_city;
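To make the idea of transparent rewrite concrete, the sketch below pairs a query whose text matches the defining query of sum_sales_pscat_week_mv with the kind of statement the optimizer could substitute for it. The second statement is only an illustration of the effect. Applications never write it themselves, and the optimizer chooses it only if the rewritten form is estimated to be cheaper.

```sql
-- Query as written by the application: it references only detail tables
-- and uses the same text as the defining query of sum_sales_pscat_week_mv.
SELECT p.prod_subcategory, t.week_ending_day,
       SUM(s.amount_sold) AS sum_amount_sold
FROM   sales s, products p, times t
WHERE  s.time_id = t.time_id
AND    s.prod_id = p.prod_id
GROUP BY p.prod_subcategory, t.week_ending_day;

-- Illustration of what a rewritten form could look like:
-- the precomputed rows are read directly from the materialized view.
SELECT prod_subcategory, week_ending_day, sum_amount_sold
FROM   sum_sales_pscat_week_mv;
```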

The following materialized views contain joins only:

CREATE MATERIALIZED VIEW join_sales_time_product_mv
  ENABLE QUERY REWRITE
  AS
SELECT p.prod_id, p.prod_name, t.time_id, t.week_ending_day,
       s.channel_id, s.promo_id, s.cust_id,
       s.amount_sold
FROM   sales s, products p, times t
WHERE  s.time_id=t.time_id
AND    s.prod_id = p.prod_id;

CREATE MATERIALIZED VIEW join_sales_time_product_oj_mv
  ENABLE QUERY REWRITE
  AS
SELECT p.prod_id, p.prod_name, t.time_id, t.week_ending_day,
       s.channel_id, s.promo_id, s.cust_id,
       s.amount_sold
FROM   sales s, products p, times t
WHERE  s.time_id=t.time_id
AND    s.prod_id=p.prod_id(+);

You must collect statistics on the materialized views so that the optimizer can determine whether to rewrite the queries. You can do this either on a per-object basis or for all newly created objects without statistics. On a per-object basis, shown for join_sales_time_product_mv:

EXECUTE DBMS_STATS.GATHER_TABLE_STATS ('SH','JOIN_SALES_TIME_PRODUCT_MV', estimate_percent=>20, block_sample=>TRUE, cascade=>TRUE);

For all newly created objects without statistics, on schema level:

EXECUTE DBMS_STATS.GATHER_SCHEMA_STATS('SH', options => 'GATHER EMPTY', estimate_percent=>20, block_sample=>TRUE, cascade=>TRUE);

See Also: Oracle9i Supplied PL/SQL Packages and Types Reference for further information about using the DBMS_STATS package to maintain statistics

Enabling Query Rewrite

Several steps must be followed to enable query rewrite:

1. Individual materialized views must have the ENABLE QUERY REWRITE clause.
2. The initialization parameter QUERY_REWRITE_ENABLED must be set to true.
3. Cost-based optimization must be used either by setting the initialization parameter OPTIMIZER_MODE to all_rows or first_rows, or by analyzing the tables and setting OPTIMIZER_MODE to choose.
4. The initialization parameter OPTIMIZER_FEATURES_ENABLE should be left unset for query rewrite to be possible. However, if it is given a value, then it must be set to at least 8.1.6 or query rewrite and explain rewrite will not be possible.

If step 1 has not been completed, a materialized view will never be eligible for query rewrite. ENABLE QUERY REWRITE can be specified either when the materialized view is created, as illustrated here, or with the ALTER MATERIALIZED VIEW statement.

CREATE MATERIALIZED VIEW join_sales_time_product_mv
  ENABLE QUERY REWRITE
  AS
SELECT p.prod_id, p.prod_name, t.time_id, t.week_ending_day,
       s.channel_id, s.promo_id, s.cust_id,
       s.amount_sold
FROM   sales s, products p, times t
WHERE  s.time_id=t.time_id
AND    s.prod_id = p.prod_id;

You can use the initialization parameter QUERY_REWRITE_ENABLED to disable query rewrite for all materialized views, or to enable it again for all materialized views that are individually enabled. However, the QUERY_REWRITE_ENABLED parameter cannot enable query rewrite for materialized views that have disabled it with the CREATE or ALTER statement.

The NOREWRITE hint disables query rewrite in a SQL statement, overriding the QUERY_REWRITE_ENABLED parameter, and the REWRITE hint (when used with mv_name) restricts the eligible materialized views to those named in the hint.
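The three levels of control described above (session parameter, per-view clause, and statement hint) can be sketched as follows. The materialized view names come from this chapter. The queries themselves are illustrative, and the sketch assumes the session has the privileges needed to alter the materialized view.

```sql
-- Session level: allow rewrite for all individually enabled views
ALTER SESSION SET QUERY_REWRITE_ENABLED = TRUE;

-- View level: switch rewrite off and on again for one materialized view
ALTER MATERIALIZED VIEW join_sales_time_product_mv DISABLE QUERY REWRITE;
ALTER MATERIALIZED VIEW join_sales_time_product_mv ENABLE QUERY REWRITE;

-- Statement level: never rewrite this query
SELECT /*+ NOREWRITE */ p.prod_subcategory, SUM(s.amount_sold)
FROM   sales s, products p
WHERE  s.prod_id = p.prod_id
GROUP BY p.prod_subcategory;

-- Statement level: consider only the named materialized view for rewrite
SELECT /*+ REWRITE(sum_sales_pscat_week_mv) */
       p.prod_subcategory, t.week_ending_day, SUM(s.amount_sold)
FROM   sales s, products p, times t
WHERE  s.time_id = t.time_id
AND    s.prod_id = p.prod_id
GROUP BY p.prod_subcategory, t.week_ending_day;
```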

Initialization Parameters for Query Rewrite

Query rewrite requires the following initialization parameter settings:

OPTIMIZER_MODE = all_rows, first_rows, or choose
QUERY_REWRITE_ENABLED = true
COMPATIBLE = 8.1.0 (or greater)

The QUERY_REWRITE_INTEGRITY parameter is optional, but must be set to stale_tolerated, trusted, or enforced if it is specified (see "Accuracy of Query Rewrite"). It defaults to enforced if it is undefined.

Because the integrity level is set by default to enforced, all constraints must be validated. Therefore, if you use ENABLE NOVALIDATE, certain types of query rewrite might not work. To enable query rewrite in this environment, you should set your integrity level to a lower level of granularity such as trusted or stale_tolerated.
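As a rough sketch, the rewrite-related settings can be changed for one session and then inspected as shown below. The query against V$PARAMETER is an assumption about how you might check the values; it requires access to that view and is not part of the original text.

```sql
-- Lower the integrity level for this session so that constraints
-- declared with ENABLE NOVALIDATE do not by themselves block rewrite.
ALTER SESSION SET QUERY_REWRITE_INTEGRITY = TRUSTED;

-- Inspect the parameters discussed in this section
-- (assumes SELECT access to V$PARAMETER).
SELECT name, value
FROM   v$parameter
WHERE  name IN ('optimizer_mode',
                'query_rewrite_enabled',
                'query_rewrite_integrity',
                'compatible');
```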


Glossary

additive
Describes a fact (or measure) that can be summarized through addition. An additive fact is the most common type of fact. Examples include sales, cost, and profit. Contrast with nonadditive and semi-additive. See Also: fact

advisor
See: Summary Advisor.

aggregate
Summarized data. For example, unit sales of a particular product could be aggregated by day, month, quarter and yearly sales.

aggregation
The process of consolidating data values into a single value. For example, sales data could be collected on a daily basis and then be aggregated to the week level, the week data could be aggregated to the month level, and so on. The data can then be referred to as aggregate data. Aggregation is synonymous with summarization, and aggregate data is synonymous with summary data.

ancestor
A value at any level higher than a given value in a hierarchy. For example, in a Time dimension, the value 1999 might be the ancestor of the values Q1-99 and Jan-99. See Also: hierarchy and level

attribute
A descriptive characteristic of one or more levels. For example, the product dimension for a clothing manufacturer might contain a level called item, one of whose attributes is color. Attributes represent logical groupings that enable end users to select data based on like characteristics. Note that in relational modeling, an attribute is defined as a characteristic of an entity. In Oracle9i, an attribute is a column in a dimension that characterizes elements of a single level.

cardinality
From an OLTP perspective, this refers to the number of rows in a table. From a data warehousing perspective, this typically refers to the number of distinct values in a column. For most data warehouse DBAs, a more important issue is the degree of cardinality. See Also: degree of cardinality

child
A value at the level under a given value in a hierarchy. For example, in a Time dimension, the value Jan-99 might be the child of the value Q1-99. A value can be a child for more than one parent if the child value belongs to multiple hierarchies. See Also: hierarchy, level, parent

cleansing
The process of resolving inconsistencies and fixing the anomalies in source data, typically as part of the ETL process. See Also: ETL

Common Warehouse Metadata (CWM)
A repository standard used by Oracle data warehousing and decision support. The CWM repository schema is a standalone product that other products can share; each product owns only the objects within the CWM repository that it creates.

cross product
A procedure for combining the elements in multiple sets. For example, given two columns, each element of the first column is matched with every element of the second column. A simple example is illustrated as follows:

Col1   Col2   Cross Product
----   ----   -------------
a      c      ac
b      d      ad
              bc
              bd

Cross products are performed when grouping sets are concatenated, as described in Chapter 18, "SQL for Aggregation in Data Warehouses".

data mart
A data warehouse that is designed for a particular line of business, such as sales, marketing, or finance. In a dependent data mart, the data can be derived from an enterprise-wide data warehouse. In an independent data mart, data can be collected directly from sources. See Also: data warehouse

data source
A database, application, repository, or file that contributes data to a warehouse.

data warehouse
A relational database that is designed for query and analysis rather than transaction processing. A data warehouse usually contains historical data that is derived from transaction data, but it can include data from other sources. It separates analysis workload from transaction workload and enables a business to consolidate data from several sources. In addition to a relational database, a data warehouse environment often consists of an ETL solution, an OLAP engine, client analysis tools, and other applications that manage the process of gathering data and delivering it to business users. See Also: ETL and online analytical processing (OLAP)

degree of cardinality
The number of unique values of a column divided by the total number of rows in the table. This is particularly important when deciding which indexes to build. You typically want to use bitmap indexes on low degree of cardinality columns and B-tree indexes on high degree of cardinality columns. As a general rule, a cardinality of under 1% makes a good candidate for a bitmap index.
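The ratio in this definition can be computed directly. The sketch below does so for the channel_id column of the sh.sales table; the choice of column is illustrative. NULLIF guards against division by zero on an empty table.

```sql
-- Degree of cardinality = distinct values / total rows
SELECT COUNT(DISTINCT channel_id)            AS distinct_values,
       COUNT(*)                              AS total_rows,
       COUNT(DISTINCT channel_id)
         / NULLIF(COUNT(*), 0)               AS degree_of_cardinality
FROM   sales;
```

A result below 0.01 (that is, under 1%) would mark the column as a bitmap index candidate under the rule of thumb stated above.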

denormalize
The process of allowing redundancy in a table. Contrast with normalize.

derived fact (or measure)
A fact (or measure) that is generated from existing data using a mathematical operation or a data transformation. Examples include averages, totals, percentages, and differences.

detail
See: fact table.

detail table
See: fact table.

dimension
The term dimension is commonly used in two ways:

A general term for any characteristic that is used to specify the members of a data set. The 3 most common dimensions in sales-oriented data warehouses are time, geography, and product. Most dimensions have hierarchies.
An object defined in a database to enable queries to navigate dimensions. In Oracle9i, a dimension is a database object that defines hierarchical (parent/child) relationships between pairs of column sets. In Oracle Express, a dimension is a database object that consists of a list of values.

dimension table
Dimension tables describe the business entities of an enterprise, represented as hierarchical, categorical information such as time, departments, locations, and products. Dimension tables are sometimes called lookup or reference tables.

dimension value
One element in the list that makes up a dimension. For example, a computer company might have dimension values in the product dimension called LAPPC and DESKPC. Values in the geography dimension might include Boston and Paris. Values in the time dimension might include MAY96 and JAN97.

drill
To navigate from one item to a set of related items. Drilling typically involves navigating up and down through the levels in a hierarchy. When selecting data, you can expand or collapse a hierarchy by drilling down or up in it, respectively. See Also: drill down and drill up

drill down
To expand the view to include child values that are associated with parent values in the hierarchy. See Also: drill and drill up

drill up
To collapse the list of descendant values that are associated with a parent value in the hierarchy.

element
An object or process. For example, a dimension is an object, a mapping is a process, and both are elements.

entity
Entity is used in database modeling. In relational databases, it typically maps to a table.

ETL
Extraction, transformation, and loading. ETL refers to the methods involved in accessing and manipulating source data and loading it into a data warehouse. The order in which these processes are performed varies. Note that ETT (extraction, transformation, transportation) and ETM (extraction, transformation, move) are sometimes used instead of ETL. See Also: data warehouse, extraction, transformation, transportation

extraction
The process of taking data out of a source as part of an initial phase of ETL. See Also: ETL

fact
Data, usually numeric and additive, that can be examined and analyzed. Examples include sales, cost, and profit. Fact and measure are synonymous; fact is more commonly used with relational environments, measure is more commonly used with multidimensional environments. See Also: derived fact (or measure)

fact table
A table in a star schema that contains facts. A fact table typically has two types of columns: those that contain facts and those that are foreign keys to dimension tables. The primary key of a fact table is usually a composite key that is made up of all of its foreign keys. A fact table might contain either detail level facts or facts that have been aggregated (fact tables that contain aggregated facts are often instead called summary tables). A fact table usually contains facts with the same level of aggregation.

fast refresh
An operation that applies only the data changes to a materialized view, thus eliminating the need to rebuild the materialized view from scratch.

file-to-table mapping
Maps data from flat files to tables in the warehouse.

hierarchy
A logical structure that uses ordered levels as a means of organizing data. A hierarchy can be used to define data aggregation; for example, in a time dimension, a hierarchy might be used to aggregate data from the Month level to the Quarter level to the Year level. Hierarchies can be defined in Oracle9i as part of the dimension object. A hierarchy can also be used to define a navigational drill path, regardless of whether the levels in the hierarchy represent aggregated totals. See Also: dimension and level

level
A position in a hierarchy. For example, a time dimension might have a hierarchy that represents data at the Month, Quarter, and Year levels. See Also: hierarchy

level value table
A database table that stores the values or data for the levels you created as part of your dimensions and hierarchies.

mapping
The definition of the relationship and data flow between source and target objects.

materialized view
A pre-computed table comprising aggregated or joined data from fact and possibly dimension tables. Also known as a summary or aggregate table.

measure
See: fact.

metadata
Data that describes data and other structures, such as objects, business rules, and processes. For example, the schema design of a data warehouse is typically stored in a repository as metadata, which is used to generate scripts used to build and populate the data warehouse. A repository contains metadata. Examples include: for data, the definition of a source to target transformation that is used to generate and populate the data warehouse; for information, definitions of tables, columns and associations that are stored inside a relational modeling tool; for business rules, discount by 10 percent after selling 1,000 items.

model
An object that represents something to be made. A representative style, plan, or design. Metadata that defines the structure of the data warehouse.

nonadditive
Describes a fact (or measure) that cannot be summarized through addition. An example is Average. Contrast with additive and semi-additive.

normalize
In a relational database, the process of removing redundancy in data by separating the data into multiple tables. Contrast with denormalize.

OLAP
See: online analytical processing (OLAP).

online analytical processing (OLAP)
OLAP functionality is characterized by dynamic, multidimensional analysis of historical data, which supports activities such as the following:

Calculating across dimensions and through hierarchies
Analyzing trends
Drilling up and down through hierarchies
Rotating to change the dimensional orientation

OLAP tools can run against a multidimensional database or interact directly with a relational database.

OLTP
See: online transaction processing (OLTP).

online transaction processing (OLTP)
Online transaction processing. OLTP systems are optimized for fast and reliable transaction handling. Compared to data warehouse systems, most OLTP interactions will involve a relatively small number of rows, but a larger group of tables.

parallelism
Breaking down a task so that several processes do part of the work. When multiple CPUs each do their portion simultaneously, very large performance gains are possible.

parallel execution
Breaking down a task so that several processes do part of the work. When multiple CPUs each do their portion simultaneously, very large performance gains are possible.

parent
A value at the level above a given value in a hierarchy. For example, in a Time dimension, the value Q1-99 might be the parent of the value Jan-99. See Also: child, hierarchy, level

partition
Very large tables and indexes can be difficult and time-consuming to work with. To improve manageability, you can break your tables and indexes into smaller pieces called partitions.

pivoting
A transformation where each record in an input stream is converted to many records in the appropriate table in the data warehouse. This is particularly important when taking data from nonrelational databases.

publisher
Usually a database administrator who is in charge of creating and maintaining schema objects that make up the Change Data Capture system.

refresh
The mechanism whereby materialized views are changed to reflect new data.

schema
A collection of related database objects. Relational schemas are grouped by database user ID and include tables, views, and other objects. Whenever possible, a sample schema called sh is used throughout this Guide. See Also: snowflake schema and star schema

semi-additive
Describes a fact (or measure) that can be summarized through addition along some, but not all, dimensions. Examples include headcount and on hand stock. Contrast with additive and nonadditive.

slice and dice
This is an informal term referring to data retrieval and manipulation. We can picture a data warehouse as a cube of data, where each axis of the cube represents a dimension. To "slice" the data is to retrieve a piece (a slice) of the cube by specifying measures and values for some or all of the dimensions. When we retrieve a data slice, we may also move and reorder its columns and rows as if we had diced the slice into many small pieces. A system with good slicing and dicing makes it easy to navigate through large amounts of data.

snowflake schema
A type of star schema in which the dimension tables are partly or fully normalized. See Also: schema and star schema

source
A database, application, file, or other storage facility from which the data in a data warehouse is derived.

source system
A database, application, file, or other storage facility from which the data in a data warehouse is derived.

staging area
A place where data is processed before entering the warehouse.

staging file
A file used when data is processed before entering the warehouse.

star query
A join between a fact table and a number of dimension tables. Each dimension table is joined to the fact table using a primary key to foreign key join, but the dimension tables are not joined to each other.

star schema
A relational schema whose design represents a multidimensional data model. The star schema consists of one or more fact tables and one or more dimension tables that are related through foreign keys. See Also: schema and snowflake schema

subject area
A classification system that represents or distinguishes parts of an organization or areas of knowledge. A data mart is often developed to support a subject area such as sales, marketing, or geography. See Also: data mart

subscribers
Consumers of the published change data. These are normally applications.

summary
See: materialized view.

Summary Advisor
The Summary Advisor recommends which materialized views to retain, create, and drop. It helps database administrators manage materialized views. It is a GUI in Oracle Enterprise Manager, and has similar capabilities to the DBMS_OLAP package.

target
Holds the intermediate or final results of any part of the ETL process. The target of the entire ETL process is the data warehouse. See Also: data warehouse and ETL

third normal form (3NF)
A classical relational database modeling technique that minimizes data redundancy through normalization.

third normal form schema
A schema that uses the same kind of normalization as typically found in an OLTP system. Third normal form schemas are sometimes chosen for large data warehouses, especially environments with significant data loading requirements that are used to feed data marts and execute long-running queries. See Also: snowflake schema and star schema

transformation
The process of manipulating data. Any manipulation beyond copying is a transformation. Examples include cleansing, aggregating, and integrating data from multiple sources.

transportation
The process of moving copied or transformed data from a source to a data warehouse. See Also: transformation

unique identifier
An identifier whose purpose is to differentiate between the same item when it appears in more than one place.

update window
The length of time available for updating a warehouse. For example, you might have 8 hours at night to update your warehouse.

update frequency
How often a data warehouse is updated with new information. For example, a warehouse might be updated nightly from an OLTP system.

validation
The process of verifying metadata definitions and configuration parameters.

versioning
The ability to create new versions of a data warehouse project for new requirements and changes.

Copyright © 1996, 2002 Oracle Corporation. All Rights Reserved.
