Multidimensional
Databases
Prof. Navneet Goyal
Computer Science Department
BITS, Pilani
Database Evolution
• Flat files
• Hierarchical and Network
• Relational
• Distributed Relational
• Multidimensional
December 7, 2021 Dr. Navneet Goyal, BITS, Pilani 2
Why Multi-Dimensional Databases?
• No single "best" data structure for all
applications within an enterprise
• Organizations have abandoned the search
for the HOLY GRAIL of globally accepted
database
• Select the most appropriate data structure
on a case-by-case basis from a palette of
standard database structures
• Multidimensional Databases for OLAP?
Why Multi-Dimensional Databases?
• From econometric research conducted at MIT in
the 1960s, the multidimensional database has
matured into the database engine of choice for
data analysis applications
• Inherent ability to integrate and analyze large
volumes of enterprise data
• Offers a good conceptual fit with the way end-
users visualize business data
– Most business people already think about their
businesses in multidimensional terms
– Managers tend to ask questions about product sales in
different markets over specific time periods
Multidimensional Database
Spreadsheets – A 2D database?
Functionalities
What about a stack of similar
spreadsheets for different times?
Limitations?
We cannot easily relate data in different
sheets
Multidimensional Database
An MDDB is a computer software system
designed to allow for the efficient and
convenient storage and retrieval of
large volumes of data that are
1. Intimately related &
2. Stored, viewed and analyzed from different
perspectives
These perspectives are called Dimensions
A Motivating Example
An automobile manufacturer wants to increase
sales volumes by examining sales data collected
throughout the organization. The evaluation
would require viewing historical sales volume
figures from multiple dimensions such as
Sales volume by model
Sales volume by color
Sales volume by dealer
Sales volume over time
Relational Structure
Multidimensional Array Structure
Sales Volumes (MODEL by COLOR)
MODEL    | Blue | Red | White
Mini Van | 6    | 5   | 4
Coupe    | 3    | 5   | 5
Sedan    | 4    | 3   | 2
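As a rough illustration of the array structure above, the sketch below stores the same nine volumes in a nested Python list and also flattens them into relational-style records. The names `MODELS`, `COLORS`, `SALES`, `volume` and `to_relational` are made up for this example; they are not part of any MDDB product.

```python
# Positions along each dimension, in the order shown on the slide.
MODELS = ["Mini Van", "Coupe", "Sedan"]
COLORS = ["Blue", "Red", "White"]

# SALES[m][c] holds the volume for MODELS[m] painted COLORS[c].
SALES = [
    [6, 5, 4],  # Mini Van
    [3, 5, 5],  # Coupe
    [4, 3, 2],  # Sedan
]


def volume(model, color):
    """Look up one cell by its position along each dimension."""
    return SALES[MODELS.index(model)][COLORS.index(color)]


def to_relational(sales):
    """Flatten the array into (model, color, volume) records."""
    return [
        (model, color, sales[i][j])
        for i, model in enumerate(MODELS)
        for j, color in enumerate(COLORS)
    ]
```

In the array form the dimension positions carry the meaning of each cell; in the flattened form every record has to repeat its model and color.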
RDBMS vs. MDD
• Multidimensional array structure represents a
higher level of organization than the relational
table
• Perspectives are embedded directly into the
structure in the multidimensional model
• All possible combinations of perspectives containing a
specific attribute (the color BLUE, for example) line up along
the dimension position for that attribute.
• In the relational model, perspectives are placed
in fields; the table structure itself tells us nothing
about field contents.
RDBMS vs. MDD
• MDD makes data browsing and manipulation
intuitive to the end-user
• Any data manipulation action possible with a
MDD is also possible using relational
technology
• Substantial cognitive advantages in query
formulation
• Substantial computational performance
advantages in query processing when using
MDD
Multidimensional Representation
[Figure: Sales Volumes cube with three dimensions: MODEL (Mini Van, Coupe, Sedan), DEALERSHIP (Carr, Gleason, Clyde) and COLOR (Blue, Red, White)]
Viewing Data - An Example
[Figure: Sales Volumes cube with dimensions MODEL, DEALERSHIP and COLOR]
Assume that each dimension has 10 positions, as shown in
the cube above
Viewing Data - An Example
• How many records would there be in a relational table?
•Implications for viewing data from an end-user
standpoint?
SALES VOLUMES FOR ALL DEALERSHIPS
MODEL    | COLOR | DEALERSHIP | VOLUME
MINI VAN | BLUE  | CLYDE      | 2
MINI VAN | BLUE  | GLEASON    | 2
MINI VAN | BLUE  | CARR       | 2
MINI VAN | RED   | CLYDE      | 1
MINI VAN | WHITE | GLEASON    | 3
...
RECORD NUMBER.... 998
RECORD NUMBER.... 999
RECORD NUMBER.... 1000
Performance Advantages
• Volume figure when car type = SEDAN,
color=BLUE, & dealer=GLEASON?
• RDBMS – all 1000 records might need to be
searched to find the right record
• MDB has more ‘knowledge’ about where the
data lies
• Max. of 30 position searches (10 along each of the 3 dimensions)
• Average case: about 15 position searches vs. 500 record reads
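The counting argument above can be sketched in a few lines. This is only an illustration under the slide's assumption of 10 positions per dimension and an unindexed relational scan; the position labels (`model0`, `color0`, ...) and both function names are hypothetical.

```python
from itertools import product

# Hypothetical position labels: 10 positions per dimension, as the slide assumes.
MODELS = [f"model{i}" for i in range(10)]
COLORS = [f"color{i}" for i in range(10)]
DEALERS = [f"dealer{i}" for i in range(10)]

# Relational form: one record per (model, color, dealer) combination.
RECORDS = list(product(MODELS, COLORS, DEALERS))


def relational_probes(target):
    """Records examined by an unindexed scan until the target is found."""
    for count, record in enumerate(RECORDS, start=1):
        if record == target:
            return count
    raise KeyError(target)


def array_probes(target):
    """Positions examined when each dimension is searched on its own."""
    model, color, dealer = target
    return (
        (MODELS.index(model) + 1)
        + (COLORS.index(color) + 1)
        + (DEALERS.index(dealer) + 1)
    )


worst = ("model9", "color9", "dealer9")
```

For the last combination the scan touches all 1000 records, while the per-dimension search touches 10 + 10 + 10 = 30 positions.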
Performance Advantages
• Total Sales across all colors and dealers when
model = SEDAN?
• RDBMS – all 1000 records must be searched to
get the answer
• MDB – Sum the contents of one 10x10 ‘slice’
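A minimal sketch of the slice argument, assuming a 10x10x10 cube held as nested lists and filled with made-up volumes (the random values and all names here are illustrative only):

```python
import random

N = 10  # positions per dimension, as assumed on the earlier slide
random.seed(0)  # made-up volumes, only for illustration

# cube[m][c][d] is the volume for model m, color c, dealership d.
cube = [
    [[random.randint(0, 9) for _ in range(N)] for _ in range(N)]
    for _ in range(N)
]

# The same data as flat (model, color, dealer, volume) records.
records = [
    (m, c, d, cube[m][c][d])
    for m in range(N)
    for c in range(N)
    for d in range(N)
]


def total_for_model(cube, m):
    """Sum one 10x10 slice: all colors and dealerships for model m."""
    return sum(sum(row) for row in cube[m])


def total_for_model_relational(records, m):
    """Same total from flat records: every record is inspected."""
    return sum(v for (model, _color, _dealer, v) in records if model == m)
```

Both functions return the same total; the array version only visits the 100 cells of one slice, while the flat version filters all 1000 records.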
Performance Advantages
• Data manipulation that requires a minute in an
RDBMS may require only a few seconds in an MDB
• For such queries, MDBs can be an order of magnitude
faster than RDBMSs
• Performance benefits are greatest for queries that
generate cross-tab views of data
• The performance advantages offered by
multidimensional technology facilitate the
development of interactive decision support
applications like OLAP that can be impractical
in a relational environment.
RDBMS vs. MDB
• Any data manipulation action possible
with a multidimensional database is also
possible using relational technology
• MDBs however offer several advantages
like:
– Ease of data presentation and navigation
– Ease of maintenance
– Performance
Ease of Data Presentation & Navigation
• Intuitive spreadsheet-like data views are
the natural output of MDBs
• Obtaining the same views in a relational
environment requires either complex
SQL or a SQL generator against an RDB to
convert the table outputs into a more
intuitive format
• Top N queries were not expressible in early
standard SQL; they need vendor extensions (e.g. LIMIT, TOP)
or the later FETCH FIRST n ROWS ONLY clause
Ease of Maintenance
• Ease of maintenance because data is
stored as it is viewed
• No additional overhead is required to
translate user queries into requests for
data
• To provide same intuitiveness, RDBs use
indexes and sophisticated joins which
require significant maintenance and
storage
Performance
• Performance of MDBs can be matched by
RDBs through database tuning
• Not possible to tune the database for all
possible ad hoc queries
• Tuning requires resources of an
expensive DB specialist
• Aggregate navigators are helping RDBs
to catch up with MDBs as far as
aggregation queries are concerned
Adding Dimension - An Example
[Figure: the Sales Volumes cube (MODEL x DEALERSHIP x COLOR) shown three times, once each for JANUARY, FEBRUARY and MARCH]
When is MDD (In)appropriate?
First, consider situation 1
PERSONNEL
LAST NAME | EMPLOYEE# | EMPLOYEE AGE
SMITH     | 01        | 21
REGAN     | 12        | 19
FOX       | 31        | 63
WELD      | 14        | 31
KELLY     | 54        | 27
LINK      | 03        | 56
KRANZ     | 41        | 45
LUCAS     | 33        | 41
WEISS     | 23        | 19
When is MDD (In)appropriate?
Now consider situation 2
SALES VOLUMES FOR GLEASON DEALERSHIP
MODEL        | COLOR | VOLUME
MINI VAN     | BLUE  | 6
MINI VAN     | RED   | 5
MINI VAN     | WHITE | 4
SPORTS COUPE | BLUE  | 3
SPORTS COUPE | RED   | 5
SPORTS COUPE | WHITE | 5
SEDAN        | BLUE  | 4
SEDAN        | RED   | 3
SEDAN        | WHITE | 2
1. Set up an MDD structure for situation 1, with LAST NAME
and EMPLOYEE# as dimensions, and AGE as the measurement.
2. Set up an MDD structure for situation 2, with MODEL and
COLOR as dimensions, and SALES VOLUME as the measurement.
When is MDD (In)appropriate?
MDD Structures for the Situations
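Sales Volumes (MODEL by COLOR)
MODEL    | Blue | Red | White
Mini Van | 6    | 5   | 4
Coupe    | 3    | 5   | 5
Sedan    | 4    | 3   | 2

Employee Age (LAST NAME by EMPLOYEE #)
LAST NAME | 31 | 41 | 23 | 01 | 14 | 54 | 03 | 12 | 33
Smith     | -  | -  | -  | 21 | -  | -  | -  | -  | -
Regan     | -  | -  | -  | -  | -  | -  | -  | 19 | -
Fox       | 63 | -  | -  | -  | -  | -  | -  | -  | -
Weld      | -  | -  | -  | -  | 31 | -  | -  | -  | -
Kelly     | -  | -  | -  | -  | -  | 27 | -  | -  | -
Link      | -  | -  | -  | -  | -  | -  | 56 | -  | -
Kranz     | -  | 45 | -  | -  | -  | -  | -  | -  | -
Lucas     | -  | -  | -  | -  | -  | -  | -  | -  | 41
Weiss     | -  | -  | 19 | -  | -  | -  | -  | -  | -

Note the difference in sparsity between the two MDD representations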
When is MDD (In)appropriate?
Our sales volume dataset has a great number
of meaningful interrelationships
Interrelationships more meaningful than
individual data elements themselves.
The greater the number of inherent interrelationships
between the elements of a dataset, the more likely it is
that a study of those interrelationships will yield business
information of value to the company.
Highly interrelated dataset types should be placed in a
multidimensional data structure for greatest
ease of access and analysis
When is MDD (In)appropriate?
• No last name matches more
than one emp #, and no emp #
matches more than one last name
• In contrast, there is a sales figure
associated with every combination of
model and color, resulting in a
completely filled 3x3 matrix
• Performance suffers for the employee data
(worst case: RDB 9 record reads vs. MDB 18 position searches)
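The contrast between the two situations can be checked with a short sketch. The records are copied from the two tables shown earlier; the helper names `fill_ratio` and `worst_case_probes` are invented here, and the probe counts assume simple linear searches as in the earlier performance slides.

```python
# Situation 1: (last name, employee #, age) from the PERSONNEL table.
PERSONNEL = [
    ("SMITH", "01", 21), ("REGAN", "12", 19), ("FOX", "31", 63),
    ("WELD", "14", 31), ("KELLY", "54", 27), ("LINK", "03", 56),
    ("KRANZ", "41", 45), ("LUCAS", "33", 41), ("WEISS", "23", 19),
]

# Situation 2: (model, color, volume) for the Gleason dealership.
SALES = [
    ("MINI VAN", "BLUE", 6), ("MINI VAN", "RED", 5), ("MINI VAN", "WHITE", 4),
    ("SPORTS COUPE", "BLUE", 3), ("SPORTS COUPE", "RED", 5),
    ("SPORTS COUPE", "WHITE", 5),
    ("SEDAN", "BLUE", 4), ("SEDAN", "RED", 3), ("SEDAN", "WHITE", 2),
]


def fill_ratio(records):
    """Fraction of cells occupied when the first two fields become dimensions."""
    rows = {r[0] for r in records}
    cols = {r[1] for r in records}
    return len(records) / (len(rows) * len(cols))


def worst_case_probes(records):
    """(relational scan, array position searches) in the worst case."""
    rows = {r[0] for r in records}
    cols = {r[1] for r in records}
    return len(records), len(rows) + len(cols)
```

With these inputs the employee array is 9 x 9 with only 9 of 81 cells filled, while the sales array is 3 x 3 and completely filled.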
When is MDD (In)appropriate?
• The relative performance advantages of
storing multidimensional data in a
multidimensional array increase as the
size of the dataset increases
• The relative performance disadvantages
of storing non-multidimensional data in
a multidimensional array increase as the
size of the dataset increases.
• NO inherent value of storing Non-
multidimensional data (employee data)
in multidimensional arrays
When is MDD Appropriate?
The greater the number of inherent interrelationships
between the elements of a dataset, the more likely it is
that a study of those interrelationships will yield business
information of value to the company.
• Most companies have limited time and
resources to devote to analyzing data
• It therefore becomes critical that these highly
interrelated dataset types be placed in a
multidimensional data structure for greatest
ease of access and analysis.
When is MDD Appropriate?
Examples of applications that are suited for
multidimensional technology:
1. Financial Analysis and Reporting
2. Budgeting
3. Promotion Tracking
4. Quality Assurance and Quality Control
5. Product Profitability
MDD Features - Rotation
Sales Volumes
View #1: Model x Color
MODEL    | Blue | Red | White
Mini Van | 6    | 5   | 4
Coupe    | 3    | 5   | 5
Sedan    | 4    | 3   | 2
(ROTATE 90°)
View #2: Color x Model
COLOR | Mini Van | Coupe | Sedan
Blue  | 6        | 3     | 4
Red   | 5        | 5     | 3
White | 4        | 5     | 2
•Also referred to as “data slicing.”
•Each rotation yields a different slice or two dimensional table
of data.
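For a two-dimensional array, the rotation shown above amounts to a transpose. A minimal sketch, with names chosen only for this example:

```python
MODELS = ["Mini Van", "Coupe", "Sedan"]
COLORS = ["Blue", "Red", "White"]

# View #1: rows are models, columns are colors.
VIEW_1 = [
    [6, 5, 4],  # Mini Van
    [3, 5, 5],  # Coupe
    [4, 3, 2],  # Sedan
]


def rotate(view):
    """Swap the two dimensions (a transpose); cell values are unchanged."""
    return [list(column) for column in zip(*view)]


# View #2: rows are colors, columns are models.
VIEW_2 = rotate(VIEW_1)
```

This sketch builds a new list for clarity; the slides' point is that an MDB can present the rotated view without rearranging the stored data.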
MDD Features - Rotation
[Figure: six views of the Sales Volumes cube, each reached by a 90° rotation]
View #1: MODEL by COLOR, DEALERSHIP in depth
View #2: COLOR by MODEL, DEALERSHIP in depth
View #3: COLOR by DEALERSHIP, MODEL in depth
View #4: DEALERSHIP by COLOR, MODEL in depth
View #5: DEALERSHIP by MODEL, COLOR in depth
View #6: MODEL by DEALERSHIP, COLOR in depth
MDD Features - Rotation
• All six views can be obtained by simple
rotation
• In MDBs rotations are simple as no
rearrangement of data is required
• Rotation is also referred to as “data slicing”
• No. of views (n! for n dimensions)
• 2D – 2
• 3D – 6
• 4D – 24
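The view counts above match the number of ways to order the dimensions. A small sketch to check this (the function name `views` and the TIME dimension label are illustrative):

```python
from itertools import permutations
from math import factorial


def views(dimensions):
    """Every ordering of the dimensions is one view of the same array."""
    return list(permutations(dimensions))


two_d = views(["MODEL", "COLOR"])
three_d = views(["MODEL", "COLOR", "DEALERSHIP"])
four_d = views(["MODEL", "COLOR", "DEALERSHIP", "TIME"])
```

The lengths of `two_d`, `three_d` and `four_d` are 2, 6 and 24, i.e. n! for n = 2, 3, 4.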
MDD Features - Ranging
• How do sales volumes of models painted
with the new metallic blue compare with
the sales of normal blue models?
• The user knows that only Sports Coupe
and Mini Van models have received the
new paint treatment
• Also the user knows that only 2 dealers
viz, Carr and Clyde have unconstrained
supply of these models
MDD Features - Ranging
[Figure: the Sales Volumes cube reduced to MODEL (Mini Van, Coupe), DEALERSHIP (Carr, Clyde) and COLOR (Normal Blue, Metal Blue)]
• The end user selects the desired positions along each dimension.
• Also referred to as "data dicing."
• The data is scoped down to a subset grouping
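Ranging can be sketched as selecting positions along each dimension and keeping only the matching cells. The cube below is keyed by (model, dealership, color); the full color list and the cell values are placeholders, not data from the slides.

```python
from itertools import product

MODELS = ["Mini Van", "Coupe", "Sedan"]
DEALERS = ["Carr", "Gleason", "Clyde"]
# Assumed color positions; the slides only name the two blues for this query.
COLORS = ["Normal Blue", "Metal Blue", "Red", "White"]

# Placeholder volumes: each cell simply gets a running number.
CUBE = {cell: n for n, cell in enumerate(product(MODELS, DEALERS, COLORS))}


def dice(cube, models, dealers, colors):
    """Keep only cells whose position on every dimension was selected."""
    return {
        (m, d, c): v
        for (m, d, c), v in cube.items()
        if m in models and d in dealers and c in colors
    }


subset = dice(
    CUBE,
    models=["Mini Van", "Coupe"],
    dealers=["Carr", "Clyde"],
    colors=["Normal Blue", "Metal Blue"],
)
```

The result is a 2 x 2 x 2 sub-array that has the same shape of key as the parent, so it can be passed to `dice` again or used in further computations.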
MDD Features - Ranging
• The reduced array can now be rotated
and used in computations in the same
way as the parent array
• Referred to as “Data Dicing” as data is
scoped down to a subset grouping
• Complex SQL query is required in RDB
• Performance is better in MDB as less
resource consuming searches are
required
MDD Features – Roll-Up & Drill-Down
• Users want different views of the same data
• E.g., sales volume by model vs. sales volume by
dealership
• Many times views are similar
• E.g., sales volume by dealership vs. sales volume by district
• Natural relationship between Sales Volumes at the
DEALERSHIP level and Sales Volumes at the
DISTRICT level
• Sales Volumes for all the dealerships in a district
sum to the Sales Volumes for that district
MDD Features – Roll-Up & Drill-Down
• Multidimensional database technology is
specially designed to facilitate the handling
of these natural relationships
• Define two related aggregates on the same
dimension
• One aggregation is dealership and the other
district
• District is at a higher level of aggregation
than dealership
MDD Features - Roll-Ups & Drill Downs
ORGANIZATION DIMENSION
REGION     | Midwest
DISTRICT   | Chicago, St. Louis, Gary
DEALERSHIP | Clyde, Gleason, Carr, Levi, Lucas, Bolton
• The figure presents a definition of a hierarchy within
the organization dimension.
• Aggregations are perceived as being part of the same dimension.
• Moving up and moving down levels in a hierarchy is referred
to as “roll-up” and “drill-down.”
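A possible sketch of roll-up and drill-down over this hierarchy follows. The extracted slide lists the levels but not which dealership belongs to which district, so the pairing in `DISTRICT_OF` is an assumption, and the dealership volumes are made up.

```python
# Assumed parent links (two dealerships per district, in listed order).
DISTRICT_OF = {
    "Clyde": "Chicago", "Gleason": "Chicago",
    "Carr": "St. Louis", "Levi": "St. Louis",
    "Lucas": "Gary", "Bolton": "Gary",
}
REGION_OF = {"Chicago": "Midwest", "St. Louis": "Midwest", "Gary": "Midwest"}

# Made-up dealership-level sales volumes.
DEALER_SALES = {
    "Clyde": 7, "Gleason": 5, "Carr": 6, "Levi": 4, "Lucas": 3, "Bolton": 8,
}


def roll_up(values, parent_of):
    """Sum child values into their parent level."""
    totals = {}
    for child, value in values.items():
        parent = parent_of[child]
        totals[parent] = totals.get(parent, 0) + value
    return totals


def drill_down(parent, values, parent_of):
    """Return the child-level values that make up one parent."""
    return {c: v for c, v in values.items() if parent_of[c] == parent}


district_sales = roll_up(DEALER_SALES, DISTRICT_OF)
region_sales = roll_up(district_sales, REGION_OF)
```

Because each level is derived from the one below it, the dealership figures in a district always sum to the district figure, as the earlier slide states.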
MDD Features:
Drill-Down Through a Dimension
[Figure: Sales Volumes cube (MODEL x COLOR) whose organization dimension lists the DEALERSHIP level (Clyde, Gleason, Carr, Levi, Lucas, Bolton), the DISTRICT level (Gary, St. Louis, Chicago) and the REGION level (Midwest)]
Queries
• High degree of structure in
MDB makes the query
language very simple and
efficient
• Query language is intuitive
• Output is immediately useful
to end user
Queries: Example
• Display sales volume by model
for each dealership
PRINT TOTAL.(SALES_VOLUME KEEP MODEL DEALERSHIP)
             | DEALERSHIP
MODEL        | CLYDE | GLEASON | CARR
MINI VAN     | 7     | 5       | 6
SPORTS COUPE | 4     | 6       | 8
SEDAN        | 3     | 8       | 12
Trends emerge and comparisons are easily made
Queries: Example
• Corresponding SQL
SELECT MODEL, DEALERSHIP, SUM(SALES_VOLUME)
FROM SALES_VOLUME
GROUP BY MODEL, DEALERSHIP
ORDER BY MODEL, DEALERSHIP
MODEL | DEALERSHIP | SUM(SALES_VOLUME)
MINI VAN | CLYDE | 7
MINI VAN | GLEASON | 5
MINI VAN | CARR | 6
SPORTS COUPE| CLYDE | 4
SPORTS COUPE| GLEASON | 6
SPORTS COUPE| CARR | 8
SEDAN | CLYDE | 3
SEDAN | GLEASON | 8
SEDAN | CARR | 12
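To try the SQL above, the sketch below runs it against an in-memory SQLite table and then pivots the flat result into a cross-tab. The rows inserted are the totals from the slide (one row per group), since the underlying detail rows are not given; `cross_tab` is a hypothetical helper.

```python
import sqlite3

# One row per (model, dealership), using the totals shown on the slide.
# A real table would usually hold many detail rows per group.
ROWS = [
    ("MINI VAN", "CLYDE", 7), ("MINI VAN", "GLEASON", 5), ("MINI VAN", "CARR", 6),
    ("SPORTS COUPE", "CLYDE", 4), ("SPORTS COUPE", "GLEASON", 6),
    ("SPORTS COUPE", "CARR", 8),
    ("SEDAN", "CLYDE", 3), ("SEDAN", "GLEASON", 8), ("SEDAN", "CARR", 12),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SALES_VOLUME (MODEL TEXT, DEALERSHIP TEXT, SALES_VOLUME INTEGER)"
)
conn.executemany("INSERT INTO SALES_VOLUME VALUES (?, ?, ?)", ROWS)

result = conn.execute(
    "SELECT MODEL, DEALERSHIP, SUM(SALES_VOLUME) "
    "FROM SALES_VOLUME "
    "GROUP BY MODEL, DEALERSHIP "
    "ORDER BY MODEL, DEALERSHIP"
).fetchall()


def cross_tab(rows):
    """Pivot (model, dealership, total) rows into {model: {dealership: total}}."""
    table = {}
    for model, dealership, total in rows:
        table.setdefault(model, {})[dealership] = total
    return table


pivot = cross_tab(result)
```

Note that `ORDER BY` sorts alphabetically, so the rows come back in a different order than the slide shows; the extra pivot step is what turns the flat result into the model-by-dealership layout.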
Queries: Example
Use report writer in addition to SQL and we get
MINI VAN
CLYDE 7
GLEASON 5
CARR 6
SPORTS COUPE
CLYDE 4
GLEASON 6
CARR 8
SEDAN
CLYDE 3
GLEASON 8
CARR 12
MDD Features:
Multidimensional Computations
• Well equipped to handle demanding mathematical
functions.
• Can treat arrays like cells in spreadsheets. For
example, in a budget analysis situation, one can
divide the ACTUAL array by the BUDGET array and
subtract 1 to compute the VARIANCE array.
• Applications based on multidimensional database
technology typically have one dimension defined
as a "business measurements" dimension.
• Integrates computational tools very tightly with
the database structure.
MDD Features:
Multidimensional Computations
Sales Volumes (MODEL by BUSINESS MEASUREMENTS)
MODEL         | Actual | Budget | Variance
Mini Van      | 11     | 10     | 0.1
Coupe         | 16     | 12     | 0.33
Sedan         | 8      | 10     | -0.2
(not labeled) | 16     | 16     | 0.0
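The figures above are consistent with variance computed as (actual - budget) / budget, applied element by element. A short sketch using the three labeled rows (the function name is illustrative):

```python
MODELS = ["Mini Van", "Coupe", "Sedan"]
ACTUAL = [11, 16, 8]
BUDGET = [10, 12, 10]


def variance(actual, budget):
    """Element-wise relative variance: (actual - budget) / budget."""
    return [round((a - b) / b, 2) for a, b in zip(actual, budget)]


VARIANCE = variance(ACTUAL, BUDGET)
```

The whole array is treated like a single spreadsheet cell: one expression produces the full VARIANCE array.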
The Time Dimension
• TIME as a predefined hierarchy for rolling-up
and drilling-down across days, weeks, months,
years and special periods, such as fiscal years.
– Eliminates the effort required to build sophisticated
hierarchies every time a database is set up.
– Extra performance advantages
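A built-in time hierarchy can be imitated with the standard `datetime` module, as in this sketch. The daily volumes are made up, the `LEVELS` table is an assumption of this example, and fiscal years and other special periods are left out.

```python
from collections import defaultdict
from datetime import date

# Made-up daily sales volumes.
DAILY = {
    date(2021, 1, 4): 3,
    date(2021, 1, 5): 2,
    date(2021, 2, 1): 4,
    date(2021, 12, 7): 1,
    date(2022, 1, 3): 5,
}

# Each level maps a day to the key of the period that contains it.
LEVELS = {
    "day": lambda d: d,
    "week": lambda d: tuple(d.isocalendar())[:2],  # (ISO year, ISO week)
    "month": lambda d: (d.year, d.month),
    "year": lambda d: d.year,
}


def roll_up(daily, level):
    """Aggregate daily values to the requested level of the time hierarchy."""
    key = LEVELS[level]
    totals = defaultdict(int)
    for day, value in daily.items():
        totals[key(day)] += value
    return dict(totals)
```

Because the calendar rules live in one place, the same hierarchy can be reused for every database instead of being rebuilt each time.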
Contrasting Relational Model and
MD Model
Criteria | Relational Model | Multidimensional Databases
Focus | Data integrity of each piece of data | Facilitate exploration of interrelationships between dimensions
Organization structure | One-dimensional array | Multi-dimensional arrays
Perspectives | Embedded in fields | Embedded directly in MDDB structure
Computational power for query processing | Joining tables often required; computationally expensive | Structure designed for OLAP; computationally cheap
Cognitive issues in querying data | Cumbersome | Intuitive
Query languages | SQL or SQL front-ends, such as QBE | Point-and-click emphasis; no standardized language
Management of time dimension | Not well suited | Well suited
RDBMS vs. MDDB
• Do I still use an RDBMS for my DW?
• MDDBs store data in a hypercube, i.e., a
multidimensional array
• RDBMSs store data as tables with rows and
columns that do not map directly to the
multidimensional view that users have of
data
• EDW – RDBMS
• Data Marts - MDDB
RDBMS vs. MDDB: Trade-Offs
• SIZE
– MDDBs limited by size
• Mid-1990s: 10 GB caused problems
• Today: 100 GB is OK
– Large DWs are still better served by relational
front-ends running against high performance
and scalable RDBMS
• VOLATILITY
– Highly volatile data are better handled by
RDBMS
– MDDBs take long to load and update
RDBMS vs. MDDB: Trade-Offs
• AGGREGATE STRATEGY
– MDDBs support aggregates better
– RDBMSs are catching up with the help of
Aggregate Navigators
• INVESTMENT PROTECTION
– Most organizations already have made significant
investments in relational technology and skill sets
– Continued use for another purpose (DW) provides
additional ROI and lowers technical risk of failure
– MDDBs – need to acquire new software and train
staff to use it
RDBMS vs. MDDB: Trade-Offs
TYPE OF USERS
– Power users prefer the range of
functionalities available in MOLAP tools
– Users that require broad views of enterprise
data require access to the DW and are therefore
better served by a ROLAP tool
INTEGRATED ARCHITECTURE
• DB vendors have integrated their
multidimensional and relational database
products
• Multidimensional Front-end tools
• If queries require data that are not
available in MDDB, the tools retrieve the
data from the larger RDB
• Known as “DRILL-THROUGH”
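One way to picture drill-through is a lookup that tries the multidimensional store first and falls back to the relational store. The sketch below is only a toy model: the `MDDB` dictionary, the `sales` table and its rows, and the `sales_volume` function are all invented for illustration and do not reflect any vendor's implementation.

```python
import sqlite3

# Detail rows in the larger relational store (made-up data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (model TEXT, dealership TEXT, volume INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("SEDAN", "CARR", 5), ("SEDAN", "CARR", 7), ("SEDAN", "CLYDE", 3)],
)

# The multidimensional store holds only some pre-aggregated cells.
MDDB = {("SEDAN", "CARR"): 12}


def sales_volume(model, dealership):
    """Return (total, source), drilling through to the RDB when needed."""
    cell = (model, dealership)
    if cell in MDDB:
        return MDDB[cell], "MDDB"
    row = conn.execute(
        "SELECT COALESCE(SUM(volume), 0) FROM sales "
        "WHERE model = ? AND dealership = ?",
        cell,
    ).fetchone()
    return row[0], "RDB"
```

The caller sees a single query interface; whether the answer came from the cube or from the relational tables is a detail of the front-end tool.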
Q&A
Thank You