Data Warehousing & DATA MINING (SE-409) : Lecture-4
Online Analytical Processing (OLAP)
Huma Ayub
Software Engineering department
• An enterprise-wide fall in profit.
– What were the quarterly sales during last year?
• Profit down by a large percentage consistently during last quarter only. Rest is OK.
– What were the quarterly sales at regional level during last year?
• Analysis is directional
– Drill Down [details, e.g. Year -> month -> week]
– Roll Up
– Pivot
(More in subsequent slides)
Ahsan Abdullah 4
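The three directions of analysis can be illustrated with a small sketch. The fact rows, column names, and the aggregate() helper below are hypothetical, made up for illustration; they are not from the lecture or from any OLAP product.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, month, week, region, sales).
FACTS = [
    (2023, "Jan", "w1", "North", 100),
    (2023, "Jan", "w2", "North", 120),
    (2023, "Jan", "w1", "South", 80),
    (2023, "Feb", "w5", "North", 90),
    (2023, "Feb", "w5", "South", 110),
]

COLUMNS = ("year", "month", "week", "region")


def aggregate(facts, group_by):
    """Sum sales for each combination of the columns named in group_by."""
    idx = [COLUMNS.index(c) for c in group_by]
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[i] for i in idx)
        totals[key] += row[4]  # row[4] is the sales measure
    return dict(totals)


# Drill down: start coarse, then add finer levels (year -> month -> week).
by_year = aggregate(FACTS, ["year"])
by_month = aggregate(FACTS, ["year", "month"])
by_week = aggregate(FACTS, ["year", "month", "week"])

# Roll up: the reverse direction, dropping the finer levels again.
rolled_up = aggregate(FACTS, ["year"])

# Pivot: view the same measure along a different dimension.
by_region = aggregate(FACTS, ["region"])
```

Each step changes only the group_by list, which is why this style of analysis is hard to cover with a fixed set of predefined queries.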
Challenges…
• Not feasible to write predefined queries.
– Fails to remain user-driven (becomes programmer-driven).
Challenges
• Contradiction
– Want to compute answers in advance, but don't
know the questions
• Solution
– Compute answers to “all” possible “queries”. But
how?
“All” possible queries (level aggregates)
[Figure: hierarchy of level aggregates, from ALL at the top down to the Zone level (Defense ... Gulberg).]
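One way to read "compute answers to all possible queries" is to precompute the total for every combination of dimensions. The sketch below assumes three hypothetical dimensions and made-up fact rows; precompute_all() is an illustrative helper, not an API of any OLAP tool.

```python
from collections import defaultdict
from itertools import combinations

DIMENSIONS = ("product", "zone", "quarter")

# Hypothetical facts: (product, zone, quarter, sales).
FACTS = [
    ("Bread", "Defense", "Q1", 8),
    ("Bread", "Gulberg", "Q1", 5),
    ("Eggs", "Defense", "Q2", 45),
    ("Eggs", "Gulberg", "Q1", 13),
]


def precompute_all(facts):
    """Return {dimension_names: {key: total}} for every subset of dimensions."""
    answers = {}
    for r in range(len(DIMENSIONS) + 1):
        for dims in combinations(range(len(DIMENSIONS)), r):
            totals = defaultdict(int)
            for row in facts:
                totals[tuple(row[i] for i in dims)] += row[-1]
            names = tuple(DIMENSIONS[i] for i in dims)
            answers[names] = dict(totals)
    return answers


ANSWERS = precompute_all(FACTS)

# A "query" is now a lookup instead of a scan over the facts.
grand_total = ANSWERS[()][()]
sales_by_product = ANSWERS[("product",)]
```

With n dimensions there are 2^n such groupings (8 here), and more once each dimension has several hierarchy levels, so the number of stored answers grows quickly.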
OLAP: Facts & Dimensions
Where does OLAP fit in?
[Figure: Transaction Data -> Data Loading (ELT) -> Data Cube (MOLAP) -> OLAP -> Presentation Tools -> Reports -> Decision Maker]
OLTP vs. OLAP
Feature                         OLTP                             OLAP
Level of data                   Detailed                         Aggregated
Amount of data per transaction  Small                            Large
Views                           Pre-defined [Programmer]         User-defined
Typical write operation         Update, insert, delete           Bulk insert
“Age” of data                   Current (60-90 days)             Historical 5-10 years and also current [Active DW]
Number of users                 High                             Low-Med
Tables                          Flat tables [Highly normalized]  Multi-Dimensional tables
Database size                   Med (10^9 B – 10^12 B)           High (10^12 B – 10^15 B)
Query Optimizing                Requires experience              Already “optimized”
Data availability               High                             Low-Med
OLAP FASMI Test
Fast: Delivers information to the user at a fairly constant rate.
Most queries answered in under five seconds.
Multidimensional OLAP (MOLAP)
OLAP Implementations
1. MOLAP: OLAP implemented with a multi-dimensional
data structure.
MOLAP Implementations
OLAP has historically been implemented using a
multi-dimensional data structure or “cube”.
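A minimal sketch of what a multi-dimensional data structure can look like: a dense array in which every cell's position is computed from its coordinates. The DenseCube class, the axis members, and the city names are assumptions for illustration; real MOLAP engines use their own (often compressed) storage formats.

```python
class DenseCube:
    """Hypothetical dense cube stored as one flat list in row-major order."""

    def __init__(self, *axes):
        self.axes = [list(a) for a in axes]
        # Map each member name to its position along its axis.
        self.index = [{m: i for i, m in enumerate(a)} for a in self.axes]
        size = 1
        for a in self.axes:
            size *= len(a)
        self.cells = [0] * size  # one slot per possible combination

    def _offset(self, coords):
        # Row-major address: fold the coordinates into a single position.
        offset = 0
        for pos, member in enumerate(coords):
            offset = offset * len(self.axes[pos]) + self.index[pos][member]
        return offset

    def add(self, coords, value):
        self.cells[self._offset(coords)] += value

    def get(self, coords):
        return self.cells[self._offset(coords)]


PRODUCTS = ["Bread", "Eggs", "Butter"]
WEEKS = ["w1", "w2", "w3", "w4"]
CITIES = ["Lahore", "Karachi"]  # hypothetical geography members

cube = DenseCube(PRODUCTS, WEEKS, CITIES)
cube.add(("Eggs", "w2", "Lahore"), 45)
cube.add(("Eggs", "w2", "Lahore"), 5)
```

Lookup needs no search or join, only arithmetic on the coordinates. The cost is that all 3 x 4 x 2 = 24 cells are allocated even though only one holds data.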
MOLAP Implementations
• No standard query language for querying MOLAP
– No SQL!
Aggregations in MOLAP
Sales volume as a function of (i) product, (ii) time,
and (iii) geography
[Figure: sales-volume cube. Product axis: Bread 8, Eggs 45, Butter 13, Jam 12, Juice 10. Time axis: w1 to w6. Dimension hierarchies: Product (Category -> Product), Geography (Division -> District -> City), Time (Quarter -> Month -> Week -> Day).]
MOLAP evaluation
Drawbacks of MOLAP:
MOLAP Implementation issues
Maintenance issue: Every data item received
must be aggregated into every cube (assuming
“to-date” summaries are maintained). This is a lot of
work.
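The maintenance cost can be sketched by counting how many summary cells one incoming data item touches. The apply_fact() helper, the "ALL" marker, and the sample facts are hypothetical; the sketch assumes one level per dimension, so each dimension is either kept or rolled up to ALL.

```python
from itertools import product as cartesian

ALL = "ALL"  # assumed marker for "rolled up over this dimension"


def apply_fact(summaries, coords, value):
    """Add one incoming fact to every summary cell that covers it.

    Returns the number of cells that had to be updated.
    """
    touched = 0
    # Each dimension either keeps its member or is replaced by ALL.
    for key in cartesian(*[(member, ALL) for member in coords]):
        summaries[key] = summaries.get(key, 0) + value
        touched += 1
    return touched


summaries = {}
n1 = apply_fact(summaries, ("Bread", "w1", "Lahore"), 8)
n2 = apply_fact(summaries, ("Eggs", "w1", "Lahore"), 45)
```

With three dimensions a single fact updates 2^3 = 8 cells; with deeper hierarchies per dimension the count per fact is higher still.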
Partitioned Cubes
• To overcome the space limitation of MOLAP, the cube is
partitioned.
Partitioned Cubes: What Do They Look Like?
[Figure: cube with axes Time, Product, and Geography, partitioned into Men’s clothing, Children clothing, and Bed linen.]
Example: Joining the store cube and the list price cube
along the product dimension, to calculate the sale price
without redundant storage of the sale price data.
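The join in this example can be sketched as follows. The cube contents are made up, and the sketch assumes that the store cube holds units sold per (store, product) and that the derived value is units multiplied by list price; the lecture does not spell out these details.

```python
# Hypothetical store cube: units sold, keyed by (store, product).
STORE_CUBE = {
    ("Store A", "Bread"): 8,
    ("Store A", "Eggs"): 45,
    ("Store B", "Bread"): 10,
}

# Hypothetical list price cube: one price per product, stored once.
LIST_PRICE_CUBE = {"Bread": 50, "Eggs": 12}


def join_on_product(store_cube, price_cube):
    """Join the two cubes along the product dimension."""
    result = {}
    for (store, product), units in store_cube.items():
        # The price is looked up at query time, not copied into every cell.
        result[(store, product)] = units * price_cube[product]
    return result


SALES_VALUE = join_on_product(STORE_CUBE, LIST_PRICE_CUBE)
```

Because the price lives only in the small list price cube, a price change is a single update rather than a rewrite of every store cell.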