1.6 Efficient Data Cube Computation & Indexing OLAP


SRI KRISHNA COLLEGE OF ENGINEERING AND TECHNOLOGY

M.Tech. Computer Science and Engineering

21CSI501 DATA WAREHOUSING AND MINING

MODULE 1

1.6 DATA WAREHOUSE IMPLEMENTATION

Efficient data cube computation – Indexing OLAP data

Faculty - Dr.D.Prabha
DATA WAREHOUSE IMPLEMENTATION

• Data warehouses contain huge volumes of data.
• Business users rely on the data warehouse to make decisions from historical data.
• To support decision making, their queries must be answered within seconds.
• It is therefore crucial for data warehouse systems to support highly efficient cube computation techniques.
Efficient Data Cube Computation

• Multidimensional data analysis requires efficient computation of aggregations across many sets of dimensions.
• In SQL, these aggregations are referred to as group-by's.
• The group-by that combines all of the dimensions corresponds to the base cuboid.
• Each group-by can be represented by a cuboid, and the set of group-by's forms a lattice of cuboids defining a data cube.
Compute Cube Operator and Curse of Dimensionality
Compute cube operator:
It computes aggregates over all subsets of the dimensions specified in the operation.
Syntax:
compute cube cubename
Example
Suppose we define a data cube for an electronics store, "Best Electronics".
Dimensions:
• City
• Item
• Year

Measure:
• Sales_in_dollars
Example: Compute cube operator
The statement "compute cube sales":
• Explicitly instructs the system to compute the sales aggregate cuboids for all subsets of the set {item, city, year}.
• Generates a lattice of cuboids making up a 3-D data cube, "sales".
• Each cuboid in the lattice corresponds to one subset, so there are 2^3 = 8 cuboids in total.
BASE CUBOID                                  APEX CUBOID
Group-by on all three dimensions; can        Group-by is empty; contains the
return total sales for any combination       total sum of all sales.
of the three dimensions.

Least generalized (most specific) of         Most generalized (least specific)
the cuboids.                                 of the cuboids.

Reached by exploring downward from the       Reached by exploring upward from the
apex cuboid: drilling down within the        base cuboid: rolling up (drilling up)
data cube.                                   within the data cube.

SQL-LIKE SYNTAX (DMQL-style, not standard SQL):
define cube sales_cube [city, item, year] : sum(sales_in_dollars)

compute cube sales_cube
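
To make the effect of "compute cube sales_cube" concrete, the following is a minimal Python sketch of what the operator has to produce: one sum(sales_in_dollars) group-by per subset of {city, item, year}. The sample rows, the function name compute_cube and the dictionary layout are assumptions made for illustration; a real warehouse would use far more efficient algorithms than this brute-force loop.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sample rows: (city, item, year, sales_in_dollars)
ROWS = [
    ("V", "TV", 2020, 100.0),
    ("V", "Phone", 2020, 50.0),
    ("T", "TV", 2020, 70.0),
    ("T", "TV", 2021, 30.0),
]
DIMENSIONS = ("city", "item", "year")


def compute_cube(rows, dimensions):
    """Return {group-by dimensions: {cell key: summed measure}} for every subset."""
    cube = {}
    for k in range(len(dimensions) + 1):
        for dims in combinations(range(len(dimensions)), k):
            cuboid = defaultdict(float)
            for row in rows:
                key = tuple(row[i] for i in dims)  # values of the grouped dimensions
                cuboid[key] += row[-1]             # the measure is the last field
            cube[tuple(dimensions[i] for i in dims)] = dict(cuboid)
    return cube


sales_cube = compute_cube(ROWS, DIMENSIONS)
print(len(sales_cube))               # number of cuboids in the lattice
print(sales_cube[()])                # apex cuboid: empty group-by, grand total
print(sales_cube[("city",)])         # 1-D cuboid: total sales per city
```

The empty tuple key is the apex cuboid and the key ("city", "item", "year") is the base cuboid; every other key is an intermediate cuboid in the lattice.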

Compute cube operator
Advantages
• Computes all of the cuboids for the cube in advance.
• Online analytical processing needs to access different cuboids for different queries.
• Precomputation leads to fast response times.

Disadvantages
• The required storage space may explode if all of the cuboids in the data cube are precomputed.
Consider the following two cases for an n-dimensional cube.
Case 1: Dimensions have no hierarchies
• The total number of cuboids for an n-dimensional cube is $2^n$.
Case 2: Dimensions have hierarchies
• The total number of cuboids for an n-dimensional cube is:

$$\text{Total number of cuboids} = \prod_{i=1}^{n} (L_i + 1)$$

where $L_i$ is the number of levels associated with dimension i. The 1 added to each $L_i$ accounts for the virtual top level, all.
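
The two counting rules can be checked with a few lines of Python. This is only an illustrative sketch; the function names and the sample level counts (10 dimensions with 4 levels each) are assumptions, not figures from the slides.

```python
from math import prod


def cuboids_without_hierarchies(n):
    """Case 1: each of the n dimensions is either grouped on or not."""
    return 2 ** n


def cuboids_with_hierarchies(levels):
    """Case 2: product of (L_i + 1); the +1 is the virtual top level 'all'."""
    return prod(level_count + 1 for level_count in levels)


print(cuboids_without_hierarchies(3))        # 8 cuboids for {city, item, year}
print(cuboids_with_hierarchies([1, 1, 1]))   # 8: one level per dimension reduces to 2^n
print(cuboids_with_hierarchies([4] * 10))    # 9765625 for 10 dimensions of 4 levels each
```

With a single level per dimension the product formula reduces to $2^n$, so Case 1 is a special case of Case 2.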
Curse of dimensionality

• When dimensions have multiple levels of concept hierarchy, the storage requirements become even more excessive. This problem is referred to as the curse of dimensionality.
• The size of each cuboid also depends on the cardinality of its dimensions.
• Cardinality: the number of distinct values in each dimension.
• Because many cuboids are large, usually only some of the possible cuboids are materialized.
Types of Materialization
No Materialization:
• Do not precompute any of the "non-base" cuboids.
• Leads to computing expensive multidimensional aggregates on the fly, which can be extremely slow.

Full Materialization:
• Precompute all of the cuboids.
• The resulting lattice of computed cuboids is called the full cube.
• Requires a huge amount of memory space to store all of the precomputed cuboids.
Partial Materialization:
• Selectively compute a proper subset of the whole set of possible cuboids.
• The resulting lattice of computed cuboids is called a subcube.

Factors to consider:
• Identify the subset of cuboids or subcubes to materialize.
• Exploit the materialized cuboids or subcubes during query processing.
• Efficiently update the materialized cuboids or subcubes during load and refresh.
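
One way to picture how materialized cuboids are exploited during query processing is the selection rule sketched below: a query can be answered from any materialized cuboid whose dimensions include all of the query's dimensions, and the smallest such cuboid is the cheapest to scan. The function name pick_cuboid, the cuboid sizes and the dimension names are hypothetical, and concept hierarchies are ignored to keep the sketch short.

```python
def pick_cuboid(query_dims, materialized):
    """Pick the smallest materialized cuboid that can answer the query.

    query_dims   -- frozenset of dimension names used by the query
    materialized -- {frozenset of dimension names: estimated row count}
    Returns the chosen cuboid's dimensions, or None if no cuboid qualifies.
    """
    candidates = [
        (size, dims) for dims, size in materialized.items() if query_dims <= dims
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda candidate: candidate[0])[1]


# Hypothetical partial materialization with assumed sizes.
MATERIALIZED = {
    frozenset({"city", "item", "year"}): 1000,  # base cuboid
    frozenset({"city", "item"}): 200,
    frozenset({"item"}): 10,
}

print(pick_cuboid(frozenset({"city"}), MATERIALIZED))    # answered from {city, item}
print(pick_cuboid(frozenset({"item"}), MATERIALIZED))    # answered from {item}
print(pick_cuboid(frozenset({"branch"}), MATERIALIZED))  # None: dimension not in the cube
```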
ICEBERG CUBE:
• A data cube that stores only those cube cells whose aggregate value (e.g., count) is above some minimum support threshold.
SHELL CUBE:
• Precomputes the cuboids for only a small number of dimensions of the data cube.
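
The iceberg idea can be sketched in Python as follows, assuming the usual iceberg condition of a minimum count per cell. The sample rows, the "*" marker for an aggregated-away dimension and the function name iceberg_cube are assumptions for illustration. The sketch counts every cell and then filters, which is the naive approach; practical iceberg algorithms prune cells during computation instead.

```python
from collections import Counter
from itertools import combinations

# Hypothetical fact rows: (city, item)
ROWS = [("V", "TV"), ("V", "TV"), ("V", "Phone"), ("T", "TV")]
DIMENSIONS = ("city", "item")


def iceberg_cube(rows, dimensions, min_count):
    """Keep only the cube cells whose row count is at least min_count."""
    n = len(dimensions)
    counts = Counter()
    for row in rows:
        for k in range(n + 1):
            for dims in combinations(range(n), k):
                # "*" marks a dimension that is aggregated away in this cell.
                cell = tuple(row[i] if i in dims else "*" for i in range(n))
                counts[cell] += 1
    return {cell: count for cell, count in counts.items() if count >= min_count}


cells = iceberg_cube(ROWS, DIMENSIONS, min_count=2)
for cell, count in sorted(cells.items()):
    print(cell, count)
```

With min_count=2, cells such as ("T", "TV") that occur only once are dropped, so only the well-supported part of the cube (the tip of the iceberg) is stored.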
Indexing of OLAP Data

• To facilitate efficient data access, data warehouses support index structures and materialized views.
OLAP data can be indexed by:
• Bitmap indexing
• Join indexing
Bitmap Indexing
• It allows quick searching in data cubes.
• The bitmap index is an alternative representation of the record_ID (RID) list.
• In the bitmap index for a given attribute, there is a distinct bit vector, Bv, for each value v in the attribute's domain.
• If the attribute has the value v for a given row, then the bit representing that value is set to 1 in the corresponding row of the bitmap index, and all other bits for that row are set to 0.
Advantages of Bitmap Indexing
• Especially useful for low-cardinality domains.
• Leads to a significant reduction in space, since each value is represented by a single bit per row.

Example
In the ABC Electronics data warehouse, suppose the dimension item at the top level has four values (types): "home entertainment" (H), "computer" (C), "phone" (P) and "security" (S). Each value is represented by one bit vector in the item bitmap index table.
Suppose that the cube is stored as a relational table with the dimensions item and city. The base table then maps to one bitmap index table per dimension.
Base Table              Item Bitmap Index Table
RID  Item  City         RID  H  C  P  S
R1   H     V            R1   1  0  0  0
R2   C     V            R2   0  1  0  0
R3   P     V            R3   0  0  1  0
R4   S     V            R4   0  0  0  1
R5   H     T            R5   1  0  0  0
R6   C     T            R6   0  1  0  0
R7   P     T            R7   0  0  1  0
R8   S     T            R8   0  0  0  1

Base Table              City Bitmap Index Table
RID  Item  City         RID  V  T
R1   H     V            R1   1  0
R2   C     V            R2   1  0
R3   P     V            R3   1  0
R4   S     V            R4   1  0
R5   H     T            R5   0  1
R6   C     T            R6   0  1
R7   P     T            R7   0  1
R8   S     T            R8   0  1
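
The base table and its two bitmap indices can be reproduced with the short Python sketch below, which stores each bit vector Bv as an integer and answers a two-condition query with a single bitwise AND. The function names and the choice of Python integers as bit vectors are assumptions for illustration; production systems use compressed bitmap structures.

```python
# The base table from the example: (RID, item, city)
BASE_TABLE = [
    ("R1", "H", "V"), ("R2", "C", "V"), ("R3", "P", "V"), ("R4", "S", "V"),
    ("R5", "H", "T"), ("R6", "C", "T"), ("R7", "P", "T"), ("R8", "S", "T"),
]


def build_bitmap_index(rows, column):
    """Return {value: bit vector}; bit i is 1 when row i holds that value."""
    index = {}
    for position, row in enumerate(rows):
        value = row[column]
        index[value] = index.get(value, 0) | (1 << position)
    return index


def matching_rids(rows, bits):
    """Translate a bit vector back into the list of matching record IDs."""
    return [row[0] for position, row in enumerate(rows) if (bits >> position) & 1]


item_index = build_bitmap_index(BASE_TABLE, 1)
city_index = build_bitmap_index(BASE_TABLE, 2)

# Query: item = H AND city = T, evaluated as one bitwise AND.
hits = item_index["H"] & city_index["T"]
print(matching_rids(BASE_TABLE, hits))  # ['R5']
```

Selection conditions joined by AND, OR and NOT become bit operations on the vectors, which is why bitmap indices suit low-cardinality attributes.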

Join Indexing
• A join index registers the joinable rows of two relations.
• Join index records can identify joinable tuples without performing costly join operations.
• It is useful for maintaining the relationship between a foreign key and its matching primary keys from the joinable relation.
• The star schema model of data warehouses makes join indexing attractive, because the linkage between a fact table and its corresponding dimension tables consists of the fact table's foreign key and the dimension table's primary key.
• Join indexing maintains relationships between the attribute values of a dimension and the corresponding rows in the fact table.
• Join indices may span multiple dimensions to form composite join indices.
Example
For ABC Electronics, consider the star schema "sales_star [time, item, branch, location] : dollars_sold = sum (sales_in_dollars)". The join indices below record the relationships between the sales fact table and the location and item dimension tables.
Join index table for location/sales     Join index table for item/sales
LOCATION      SALES_KEY                 ITEM      SALES_KEY
Main street   T57                       Sony-TV   T57
Main street   T238                      Sony-TV   T459
Main street   T884                      ....      ....
....          ....

Join index table linking location and item to sales
LOCATION      ITEM      SALES_KEY
Main street   ....      ....
Main street   Sony-TV   T57
Main street   ....      ....
....          ....      ....
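
The join index tables above can be sketched in Python as dictionaries that map a dimension value (or a pair of values, for the composite index) to the matching sales keys. Only the links visible in the tables are taken from the example; the placeholder values "other-item" and "other-location", the function names, and the simplification that each fact row already carries its dimension values are assumptions for illustration.

```python
from collections import defaultdict

# Simplified sales fact rows: (sales_key, location, item).
# "other-item" and "other-location" are hypothetical placeholders for
# values that the example tables leave elided.
SALES = [
    ("T57", "Main street", "Sony-TV"),
    ("T238", "Main street", "other-item"),
    ("T459", "other-location", "Sony-TV"),
    ("T884", "Main street", "other-item"),
]


def build_join_index(fact_rows, column):
    """Map each dimension value to the sales keys of its joinable fact rows."""
    index = defaultdict(list)
    for row in fact_rows:
        index[row[column]].append(row[0])
    return dict(index)


def build_composite_join_index(fact_rows, columns):
    """Same idea, keyed on a combination of dimension values."""
    index = defaultdict(list)
    for row in fact_rows:
        index[tuple(row[c] for c in columns)].append(row[0])
    return dict(index)


location_index = build_join_index(SALES, 1)
item_index = build_join_index(SALES, 2)
composite_index = build_composite_join_index(SALES, (1, 2))

print(location_index["Main street"])                 # ['T57', 'T238', 'T884']
print(item_index["Sony-TV"])                         # ['T57', 'T459']
print(composite_index[("Main street", "Sony-TV")])   # ['T57']
```

A lookup in the precomputed index replaces the join between the dimension table and the fact table at query time.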
[Figure: Linkages between the sales fact table and the location and item dimension tables. Location "Main Street" links to sales tuples T57, T238 and T884; item "Sony_TV" links to sales tuples T57 and T459.]
• To further speed up query processing, the join indexing and bitmap indexing methods can be integrated to form bitmapped join indices.
