
Prof. Navneet Goyal Computer Science Department BITS, Pilani

1) Multidimensional databases evolved from flat files and hierarchical/network databases to relational and distributed relational databases. 2) Multidimensional databases store and analyze large volumes of enterprise data from different perspectives called dimensions, allowing managers to ask questions about business data like product sales over time and markets. 3) Multidimensional databases arrange data in an intuitive multidimensional array structure, unlike relational databases which require joins to relate different tables, providing faster performance for analysis tasks.

Copyright
© Attribution Non-Commercial (BY-NC)

Multidimensional Databases

Prof. Navneet Goyal
Computer Science Department
BITS, Pilani

Database Evolution

• Flat files
• Hierarchical and Network
• Relational
• Distributed Relational
• Multidimensional

December 7, 2021 Dr. Navneet Goyal, BITS, Pilani 2


Why Multi-Dimensional Databases?
• No single "best" data structure for all
applications within an enterprise
• Organizations have abandoned the search
for the HOLY GRAIL of globally accepted
database
• Select the most appropriate data structure
on a case-by-case basis from a palette of
standard database structures
• Multidimensional Databases for OLAP?



Why Multi-Dimensional Databases?
• From econometric research conducted at MIT in
the 1960s, the multidimensional database has
matured into the database engine of choice for
data analysis applications
• Inherent ability to integrate and analyze large
volumes of enterprise data
• Offers a good conceptual fit with the way end-
users visualize business data
– Most business people already think about their
businesses in multidimensional terms
– Managers tend to ask questions about product sales in
different markets over specific time periods



Multidimensional Database

• Spreadsheets: a 2D database?
• Functionalities
• What about a stack of similar
spreadsheets for different times?
• Limitations?
– We cannot relate data in different
sheets easily

Multidimensional Database

An MDDB is a computer software system
designed to allow for the efficient and
convenient storage and retrieval of
large volumes of data that is
1. Intimately related &
2. Stored, viewed and analyzed from different
perspectives
These perspectives are called Dimensions



A Motivating Example
An automobile manufacturer wants to increase
sales volumes by examining sales data collected
throughout the organization. The evaluation
would require viewing historical sales volume
figures from multiple dimensions such as
• Sales volume by model
• Sales volume by color
• Sales volume by dealer
• Sales volume over time



Relational Structure



Multidimensional Array Structure

Sales Volumes (rows: MODEL, columns: COLOR)

MODEL    | Blue | Red | White
Mini Van | 6    | 5   | 4
Coupe    | 3    | 5   | 5
Sedan    | 4    | 3   | 2
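As a minimal sketch of the array above, the table can be held as a nested Python list with one position list per dimension. The names `MODELS`, `COLORS`, `SALES`, `volume`, and `by_color` are illustrative choices, not part of the original slides.

```python
# Sales Volumes array from the slide: rows are MODEL positions,
# columns are COLOR positions.
MODELS = ["Mini Van", "Coupe", "Sedan"]
COLORS = ["Blue", "Red", "White"]
SALES = [
    [6, 5, 4],  # Mini Van
    [3, 5, 5],  # Coupe
    [4, 3, 2],  # Sedan
]


def volume(model, color):
    """Look up one cell by its position along each dimension."""
    return SALES[MODELS.index(model)][COLORS.index(color)]


def by_color(color):
    """All cells for one color line up along a single column."""
    c = COLORS.index(color)
    return [row[c] for row in SALES]


print(volume("Coupe", "Red"))  # 5
print(by_color("Blue"))        # [6, 3, 4]
```

The point of the sketch is that the perspectives (model, color) are part of the structure itself, so a cell is addressed by position rather than found by matching field values.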



RDBMS vs. MDD
• Multidimensional array structure represents a
higher level of organization than the relational
table
• Perspectives are embedded directly into the
structure in the multidimensional model
• All possible combinations of perspectives containing a
specific attribute (the color BLUE, for example) line up along
the dimension position for that attribute.
• Perspectives are placed in fields in the
relational model - tells us nothing about field
contents.



RDBMS vs. MDD
• MDD makes data browsing and manipulation
intuitive to the end-user
• Any data manipulation action possible with a
MDD is also possible using relational
technology
• Substantial cognitive advantages in query
formulation
• Substantial computational performance
advantages in query processing when using
MDD



RDBMS vs. MDD



Multidimensional Representation

[Figure: Sales Volumes cube with dimensions MODEL (Mini Van, Coupe, Sedan), COLOR (Blue, Red, White), and DEALERSHIP (Clyde, Gleason, Carr)]



Viewing Data - An Example

[Figure: Sales Volumes cube with dimensions MODEL, COLOR, and DEALERSHIP]

Assume that each dimension has 10 positions, as shown in
the cube above



Viewing Data - An Example
• How many records would there be in a relational table?
(10 x 10 x 10 = 1000, one per combination of positions)
• Implications for viewing data from an end-user
standpoint?

SALES VOLUMES FOR ALL DEALERSHIPS
MODEL    | COLOR | DEALERSHIP | VOLUME
MINI VAN | BLUE  | CLYDE      | 2
MINI VAN | BLUE  | GLEASON    | 2
MINI VAN | BLUE  | CARR       | 2
MINI VAN | RED   | CLYDE      | 1
MINI VAN | WHITE | GLEASON    | 3
...
RECORD NUMBER 998
RECORD NUMBER 999
RECORD NUMBER 1000



Performance Advantages
• Volume figure when car type = SEDAN,
color = BLUE, and dealer = GLEASON?
• RDBMS – without an index, all 1000 records might need to be
searched to find the right record
• MDB has more ‘knowledge’ about where the
data lies
• Max. of 30 position searches (10 positions along each of 3 dimensions)
• Average case: roughly 15 vs. 500
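The worst-case counts on this slide can be reproduced with a small simulation. This is only a sketch under the slide's assumptions (10 positions per dimension, an unindexed relational table scanned record by record); the position names and both functions are hypothetical.

```python
from itertools import product

N = 10  # positions per dimension, as assumed on the slide
MODELS = [f"model{i}" for i in range(N)]
COLORS = [f"color{i}" for i in range(N)]
DEALERS = [f"dealer{i}" for i in range(N)]

# Relational view: one record per (model, color, dealer) combination.
# The volume of 1 is placeholder data.
records = [(m, c, d, 1) for m, c, d in product(MODELS, COLORS, DEALERS)]


def scan_records(model, color, dealer):
    """Unindexed table scan; returns the number of records examined."""
    for examined, (m, c, d, _volume) in enumerate(records, start=1):
        if (m, c, d) == (model, color, dealer):
            return examined
    return len(records)


def scan_positions(model, color, dealer):
    """Search each dimension's position list separately;
    returns the number of positions examined."""
    return (
        (MODELS.index(model) + 1)
        + (COLORS.index(color) + 1)
        + (DEALERS.index(dealer) + 1)
    )


worst = (MODELS[-1], COLORS[-1], DEALERS[-1])
print(scan_records(*worst), scan_positions(*worst))  # 1000 30
```

A relational engine with a suitable index would not scan all 1000 records, so the comparison applies to the unindexed case the slide describes.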



Performance Advantages
• Total Sales across all colors and dealers when
model = SEDAN?
• RDBMS – all 1000 records must be searched to
get the answer
• MDB – Sum the contents of one 10x10 ‘slice’
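A rough illustration of the slice sum, assuming the cube is stored as a nested list indexed as `cube[model][color][dealer]` (a layout chosen for this sketch, not stated in the slides) and filled with placeholder values:

```python
N = 10  # positions per dimension, as on the earlier slide

# cube[model][color][dealer] holds a sales volume.
# Every cell is 1 here (placeholder data).
cube = [[[1 for _dealer in range(N)] for _color in range(N)] for _model in range(N)]


def total_for_model(model_index):
    """Sum one N x N slice; returns (total, cells_read)."""
    model_slice = cube[model_index]
    cells = [value for row in model_slice for value in row]
    return sum(cells), len(cells)


total, cells_read = total_for_model(2)
print(total, cells_read)  # 100 100
```

Only the 100 cells of the selected slice are read, compared with the 1000 records an unindexed relational scan would examine.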



Performance Advantages
• Data manipulation that requires a minute in
an RDBMS may require only a few seconds in an MDB
• For such analytical queries, MDBs can be roughly an order
of magnitude faster than RDBMSs
• Performance benefits are greatest for queries that
generate cross-tab views of data
• The performance advantages offered by
multidimensional technology facilitate the
development of interactive decision support
applications like OLAP that can be impractical
in a relational environment.



RDBMS vs. MDB
• Any data manipulation action possible
with a multidimensional database is also
possible using relational technology
• MDBs however offer several advantages
like:
– Ease of data presentation and navigation
– Ease of maintenance
– Performance



Ease of Data Presentation & Navigation

• Intuitive spreadsheet-like data views are the
natural output of MDBs
• Obtaining the same views in a relational
environment requires either complex
SQL or an SQL generator against an RDB to
convert the table outputs into a more
intuitive format
• Top-N queries were awkward to express in older SQL;
current SQL handles them with a row-limiting
clause or window functions



Ease of Maintenance
• Ease of maintenance because data is
stored as it is viewed
• No additional overhead is required to
translate user queries into requests for
data
• To provide same intuitiveness, RDBs use
indexes and sophisticated joins which
require significant maintenance and
storage



Performance
• Performance of MDBs can be matched by
RDBs through database tuning
• Not possible to tune the database for all
possible ad hoc queries
• Tuning requires resources of an
expensive DB specialist
• Aggregate navigators are helping RDBs
to catch up with MDBs as far as
aggregation queries are concerned



Adding Dimension - An Example

[Figure: three Sales Volumes cubes, one each for JANUARY, FEBRUARY, and MARCH; each cube has dimensions MODEL (Mini Van, Coupe, Sedan), COLOR (Blue, Red, White), and DEALERSHIP (Clyde, Gleason, Carr)]



When is MDD (In)appropriate?

First, consider situation 1

PERSONNEL

LAST NAME | EMPLOYEE# | EMPLOYEE AGE
SMITH     | 01        | 21
REGAN     | 12        | 19
FOX       | 31        | 63
WELD      | 14        | 31
KELLY     | 54        | 27
LINK      | 03        | 56
KRANZ     | 41        | 45
LUCAS     | 33        | 41
WEISS     | 23        | 19



When is MDD (In)appropriate?

Now consider situation 2

SALES VOLUMES FOR GLEASON DEALERSHIP

MODEL        | COLOR | VOLUME
MINI VAN     | BLUE  | 6
MINI VAN     | RED   | 5
MINI VAN     | WHITE | 4
SPORTS COUPE | BLUE  | 3
SPORTS COUPE | RED   | 5
SPORTS COUPE | WHITE | 5
SEDAN        | BLUE  | 4
SEDAN        | RED   | 3
SEDAN        | WHITE | 2

1. Set up a MDD structure for situation 1, with LAST NAME
and EMPLOYEE# as dimensions, and AGE as the measurement.
2. Set up a MDD structure for situation 2, with MODEL and
COLOR as dimensions, and SALES VOLUME as the measurement.



When is MDD (In)appropriate?

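MDD Structures for the Situations

Sales Volumes (rows: MODEL, columns: COLOR)

MODEL    | Blue | Red | White
Mini Van | 6    | 5   | 4
Coupe    | 3    | 5   | 5
Sedan    | 4    | 3   | 2

Employee Age (rows: LAST NAME, columns: EMPLOYEE #; "-" marks an empty cell)

LAST NAME | 31 | 41 | 23 | 01 | 14 | 54 | 03 | 12 | 33
Smith     | -  | -  | -  | 21 | -  | -  | -  | -  | -
Regan     | -  | -  | -  | -  | -  | -  | -  | 19 | -
Fox       | 63 | -  | -  | -  | -  | -  | -  | -  | -
Weld      | -  | -  | -  | -  | 31 | -  | -  | -  | -
Kelly     | -  | -  | -  | -  | -  | 27 | -  | -  | -
Link      | -  | -  | -  | -  | -  | -  | 56 | -  | -
Kranz     | -  | 45 | -  | -  | -  | -  | -  | -  | -
Lucas     | -  | -  | -  | -  | -  | -  | -  | -  | 41
Weiss     | -  | -  | 19 | -  | -  | -  | -  | -  | -

Note the difference in sparsity between the two MDD representations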
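The sparsity difference can be put in numbers. The sketch below stores only the filled cells of each structure and divides by the size of the full two-dimensional array; the `density` helper is an illustrative name, and the data is taken from the two situations above.

```python
# Situation 2: every (model, color) pair has a value.
sales = {
    ("Mini Van", "Blue"): 6, ("Mini Van", "Red"): 5, ("Mini Van", "White"): 4,
    ("Coupe", "Blue"): 3, ("Coupe", "Red"): 5, ("Coupe", "White"): 5,
    ("Sedan", "Blue"): 4, ("Sedan", "Red"): 3, ("Sedan", "White"): 2,
}

# Situation 1: each last name pairs with exactly one employee number.
ages = {
    ("Smith", "01"): 21, ("Regan", "12"): 19, ("Fox", "31"): 63,
    ("Weld", "14"): 31, ("Kelly", "54"): 27, ("Link", "03"): 56,
    ("Kranz", "41"): 45, ("Lucas", "33"): 41, ("Weiss", "23"): 19,
}


def density(cells):
    """Filled cells divided by the size of the full 2-D array."""
    rows = {row for row, _col in cells}
    cols = {col for _row, col in cells}
    return len(cells) / (len(rows) * len(cols))


print(density(sales))           # 1.0
print(round(density(ages), 3))  # 0.111
```

The sales array is completely filled (9 of 9 cells), while the personnel array uses only 9 of its 81 cells.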



When is MDD (In)appropriate?
• Our sales volume dataset has a great number
of meaningful interrelationships
• Interrelationships are more meaningful than
the individual data elements themselves.
– The greater the number of inherent interrelationships
between the elements of a dataset, the more likely it is
that a study of those interrelationships will yield business
information of value to the company.
• Highly interrelated dataset types should be placed in a
multidimensional data structure for greatest
ease of access and analysis



When is MDD (In)appropriate?

• No last name matches more
than one emp # and no emp #
matches more than one last name
• In contrast, there is a sales figure
associated with every combination of
model and color, resulting in a
completely filled 3x3 matrix
• For the personnel data, performance suffers: a lookup may
need up to 18 position searches (9 + 9) in the MDB vs. at
most 9 records in the RDB



When is MDD (In)appropriate?
• The relative performance advantages of
storing multidimensional data in a
multidimensional array increase as the
size of the dataset increases
• The relative performance disadvantages
of storing non-multidimensional data in
a multidimensional array increase as the
size of the dataset increases.
• NO inherent value of storing Non-
multidimensional data (employee data)
in multidimensional arrays



When is MDD Appropriate?
The greater the number of inherent interrelationships
between the elements of a dataset, the more likely it is
that a study of those interrelationships will yield business
information of value to the company.
• Most companies have limited time and
resources to devote to analyzing data
• It therefore becomes critical that these highly
interrelated dataset types be placed in a
multidimensional data structure for greatest
ease of access and analysis.



When is MDD Appropriate?
Examples of applications that are suited for
multidimensional technology:

1. Financial Analysis and Reporting
2. Budgeting
3. Promotion Tracking
4. Quality Assurance and Quality Control
5. Product Profitability



MDD Features - Rotation

Sales Volumes

View #1: Model x Color
MODEL    | Blue | Red | White
Mini Van | 6    | 5   | 4
Coupe    | 3    | 5   | 5
Sedan    | 4    | 3   | 2

(ROTATE 90°)

View #2: Color x Model
COLOR | Mini Van | Coupe | Sedan
Blue  | 6        | 3     | 4
Red   | 5        | 5     | 3
White | 4        | 5     | 2

• Also referred to as “data slicing.”
• Each rotation yields a different slice or two-dimensional table
of data.
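For a two-dimensional array, the rotation from View #1 to View #2 amounts to a transpose. The following sketch uses the slide's values; the `rotate` helper is an illustrative name. It builds a new list for clarity, whereas an MDB engine, per the slides, would only change which dimension is presented along which axis.

```python
MODELS = ["Mini Van", "Coupe", "Sedan"]
COLORS = ["Blue", "Red", "White"]

# View #1: rows are MODEL positions, columns are COLOR positions.
view1 = [
    [6, 5, 4],  # Mini Van
    [3, 5, 5],  # Coupe
    [4, 3, 2],  # Sedan
]


def rotate(view):
    """Swap the row and column dimensions of a 2-D view."""
    return [list(column) for column in zip(*view)]


# View #2: rows are COLOR positions, columns are MODEL positions.
view2 = rotate(view1)
print(view2)  # [[6, 3, 4], [5, 5, 3], [4, 5, 2]]
```

Rotating a second time returns the original view.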



MDD Features - Rotation

[Figure: six views (View #1 through View #6) of the Sales Volumes cube, each obtained by a 90° rotation that changes which of MODEL, COLOR, and DEALERSHIP lies along each axis]



MDD Features - Rotation
• All six views can be obtained by simple
rotation
• In MDBs rotations are simple, as no
rearrangement of data is required
• Rotation is also referred to as “data slicing”
• No. of views
– 2D: 2
– 3D: 6
– 4D: 24 (n! for n dimensions)

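The view counts follow from treating each view as one ordering of the dimensions, so n dimensions give n! views. A small sketch, in which TIME stands in as an assumed fourth dimension and `views` is an illustrative helper:

```python
from itertools import permutations


def views(dimensions):
    """Each ordering of the dimensions along the axes is one view."""
    return list(permutations(dimensions))


print(len(views(["MODEL", "COLOR"])))                        # 2
print(len(views(["MODEL", "COLOR", "DEALERSHIP"])))          # 6
print(len(views(["MODEL", "COLOR", "DEALERSHIP", "TIME"])))  # 24
```
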
MDD Features - Ranging

• How does the sales volume of models painted
with the new metallic blue compare with
the sales of normal blue models?
• The user knows that only Sports Coupe
and Mini Van models have received the
new paint treatment
• The user also knows that only 2 dealers,
viz. Carr and Clyde, have an unconstrained
supply of these models



MDD Features - Ranging

[Figure: the Sales Volumes cube reduced to MODEL (Mini Van, Coupe), COLOR (Normal Blue, Metal Blue), and DEALERSHIP (Carr, Clyde)]

• The end user selects the desired positions along each dimension.
• Also referred to as "data dicing."
• The data is scoped down to a subset grouping
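Ranging can be sketched as a filter that keeps only the selected positions on every dimension. The cube below is hypothetical: the slide's figure shows no numbers, so the volumes are placeholder integers, and the `dice` helper and the position lists are assumptions made for this example.

```python
from itertools import product

MODELS = ["Mini Van", "Coupe", "Sedan"]
COLORS = ["Normal Blue", "Metal Blue", "Red", "White"]
DEALERS = ["Clyde", "Gleason", "Carr"]

# Placeholder volumes (hypothetical): each cell just gets a running number.
cube = {key: i for i, key in enumerate(product(MODELS, COLORS, DEALERS))}


def dice(cube, models, colors, dealers):
    """Keep only the cells whose position on every dimension was selected."""
    return {
        (m, c, d): v
        for (m, c, d), v in cube.items()
        if m in models and c in colors and d in dealers
    }


sub = dice(
    cube,
    models={"Mini Van", "Coupe"},
    colors={"Normal Blue", "Metal Blue"},
    dealers={"Carr", "Clyde"},
)
print(len(cube), len(sub))  # 36 8
```

The reduced array has the same shape of key as the parent, so it can be passed to the same rotation or aggregation code.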



MDD Features - Ranging

• The reduced array can now be rotated
and used in computations in the same
way as the parent array
• Referred to as “Data Dicing” as data is
scoped down to a subset grouping
• A complex SQL query is required in an RDB
• Performance is better in an MDB as fewer
resource-consuming searches are
required



MDD Features – Roll-Up & Drill-Down

• Users want different views of the same data
• For example, sales volume by model vs. sales volume by
dealership
• Many times views are similar, e.g.,
sales volume by dealership vs. sales volume by district
• Natural relationship between Sales Volumes at the
DEALERSHIP level and Sales Volumes at the
DISTRICT level
• Sales Volumes for all the dealerships in a district
sum to the Sales Volumes for that district



MDD Features – Roll-Up & Drill-Down

• Multidimensional database technology is
specially designed to facilitate the handling
of these natural relationships
• Define two related aggregates on the same
dimension
• One aggregation is dealership and the other
district
• District is at a higher level of aggregation
than dealership



MDD Features - Roll-Ups & Drill Downs
ORGANIZATION DIMENSION

REGION     | Midwest
DISTRICT   | Chicago, St. Louis, Gary
DEALERSHIP | Clyde, Gleason, Carr, Levi, Lucas, Bolton

• The figure presents a definition of a hierarchy within
the organization dimension.
• Aggregations are perceived as being part of the same dimension.
• Moving up and moving down levels in a hierarchy is referred
to as “roll-up” and “drill-down.”
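A minimal sketch of roll-up and drill-down over such a hierarchy follows. Two things here are assumptions rather than facts from the slides: the assignment of dealerships to districts (guessed from the left-to-right order in the figure) and the dealership-level sales numbers, which are hypothetical.

```python
# Assumed mapping for illustration; the slide lists the levels, but the
# dealership-to-district assignment is a guess from left-to-right order.
DISTRICT_OF = {
    "Clyde": "Chicago", "Gleason": "Chicago",
    "Carr": "St. Louis", "Levi": "St. Louis",
    "Lucas": "Gary", "Bolton": "Gary",
}
REGION_OF = {"Chicago": "Midwest", "St. Louis": "Midwest", "Gary": "Midwest"}

# Hypothetical dealership-level sales volumes.
dealership_sales = {
    "Clyde": 14, "Gleason": 19, "Carr": 26,
    "Levi": 10, "Lucas": 7, "Bolton": 9,
}


def roll_up(values, parent_of):
    """Aggregate values one level up the hierarchy."""
    totals = {}
    for child, value in values.items():
        parent = parent_of[child]
        totals[parent] = totals.get(parent, 0) + value
    return totals


def drill_down(parent, values, parent_of):
    """Return the lower-level values that make up one parent."""
    return {c: v for c, v in values.items() if parent_of[c] == parent}


district_sales = roll_up(dealership_sales, DISTRICT_OF)
region_sales = roll_up(district_sales, REGION_OF)
print(district_sales)  # {'Chicago': 33, 'St. Louis': 36, 'Gary': 16}
print(region_sales)    # {'Midwest': 85}
print(drill_down("Chicago", dealership_sales, DISTRICT_OF))  # {'Clyde': 14, 'Gleason': 19}
```

Because both levels belong to the same dimension, the district totals are derived from the dealership cells rather than stored as unrelated data.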



MDD Features - Roll-Ups & Drill Downs



MDD Features:
Drill-Down Through a Dimension

[Figure: Sales Volumes cube with MODEL and COLOR axes and an organization axis holding all three levels: DEALERSHIP (Clyde, Gleason, Carr, Levi, Lucas, Bolton), DISTRICT (Gary, St. Louis, Chicago), and REGION (Midwest)]



Queries
• High degree of structure in
MDB makes the query
language very simple and
efficient
• Query language is intuitive
• Output is immediately useful
to end user



Queries: Example

• Display sales volume by model
for each dealership
PRINT TOTAL.(SALES_VOLUME KEEP MODEL DEALERSHIP)

DEALERSHIP
MODEL        | CLYDE | GLEASON | CARR
MINI VAN     | 7     | 5       | 6
SPORTS COUPE | 4     | 6       | 8
SEDAN        | 3     | 8       | 12

Trends emerge and comparisons are easily made



Queries: Example
• Corresponding SQL
SELECT MODEL, DEALERSHIP, SUM(SALES_VOLUME)
FROM SALES_VOLUME
GROUP BY MODEL, DEALERSHIP
ORDER BY MODEL, DEALERSHIP

MODEL        | DEALERSHIP | SUM(SALES_VOLUME)
MINI VAN     | CLYDE      | 7
MINI VAN     | GLEASON    | 5
MINI VAN     | CARR       | 6
SPORTS COUPE | CLYDE      | 4
SPORTS COUPE | GLEASON    | 6
SPORTS COUPE | CARR       | 8
SEDAN        | CLYDE      | 3
SEDAN        | GLEASON    | 8
SEDAN        | CARR       | 12
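To show the extra step a relational front end has to perform, the sketch below pivots the nine flat result rows from this slide into the cross-tab layout that the multidimensional query prints directly. The `cross_tab` and `render` helpers are illustrative names, not part of any product mentioned in the slides.

```python
# Flat result rows as returned by the GROUP BY query on the slide.
rows = [
    ("MINI VAN", "CLYDE", 7), ("MINI VAN", "GLEASON", 5), ("MINI VAN", "CARR", 6),
    ("SPORTS COUPE", "CLYDE", 4), ("SPORTS COUPE", "GLEASON", 6), ("SPORTS COUPE", "CARR", 8),
    ("SEDAN", "CLYDE", 3), ("SEDAN", "GLEASON", 8), ("SEDAN", "CARR", 12),
]


def cross_tab(rows):
    """Pivot (model, dealership, total) rows into {model: {dealership: total}}."""
    table = {}
    for model, dealership, total in rows:
        table.setdefault(model, {})[dealership] = total
    return table


def render(table):
    """Format the pivoted table with dealerships as column headings."""
    dealers = list(next(iter(table.values())))
    lines = [f"{'MODEL':<14}" + "".join(f"{d:>9}" for d in dealers)]
    for model, cells in table.items():
        lines.append(f"{model:<14}" + "".join(f"{cells[d]:>9}" for d in dealers))
    return "\n".join(lines)


table = cross_tab(rows)
print(render(table))
```

The pivot is simple for nine rows, but it is work the relational route needs on top of the query, which is the contrast the slides draw.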



Queries: Example
Use report writer in addition to SQL and we get
MINI VAN
CLYDE 7
GLEASON 5
CARR 6
SPORTS COUPE
CLYDE 4
GLEASON 6
CARR 8
SEDAN
CLYDE 3
GLEASON 8
CARR 12



MDD Features:
Multidimensional Computations
• Well equipped to handle demanding mathematical
functions.
• Can treat arrays like cells in spreadsheets. For
example, in a budget analysis situation, one can
combine the ACTUAL array and the BUDGET array to
compute the VARIANCE array, here (ACTUAL - BUDGET) / BUDGET.
• Applications based on multidimensional database
technology typically have one dimension defined
as a "business measurements" dimension.
• Integrates computational tools very tightly with
the database structure.



MDD Features:
Multidimensional Computations

Sales Volumes (columns: BUSINESS MEASUREMENTS)

MODEL       | Actual | Budget | Variance
Mini Van    | 11     | 10     | 0.1
Coupe       | 16     | 12     | 0.33
Sedan       | 8      | 10     | -0.2
(unlabeled) | 16     | 16     | 0.0
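The figures in this table are consistent with variance = (actual - budget) / budget, applied cell by cell. A short sketch of that array-level computation for the three labeled rows, with `variance` as an illustrative helper name:

```python
ACTUAL = {"Mini Van": 11, "Coupe": 16, "Sedan": 8}
BUDGET = {"Mini Van": 10, "Coupe": 12, "Sedan": 10}


def variance(actual, budget):
    """Cell-by-cell (actual - budget) / budget, rounded to two decimals."""
    return {
        key: round((actual[key] - budget[key]) / budget[key], 2)
        for key in actual
    }


VARIANCE = variance(ACTUAL, BUDGET)
print(VARIANCE)  # {'Mini Van': 0.1, 'Coupe': 0.33, 'Sedan': -0.2}
```

The whole VARIANCE array is produced by one expression over two arrays, much as a spreadsheet formula combines cells.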



The Time Dimension

• TIME as a predefined hierarchy for rolling-up
and drilling-down across days, weeks, months,
years and special periods, such as fiscal years.
– Eliminates the effort required to build sophisticated
hierarchies every time a database is set up.
– Extra performance advantages



Contrasting Relational Model and
MD Model

Criteria | Relational Model | Multidimensional Databases
Focus | Data integrity of each piece of data | Facilitate exploration of interrelationships between dimensions
Organization structure | One-dimensional array | Multi-dimensional arrays
Perspectives | Embedded in fields | Embedded directly in MDDB structure
Computational power for query processing | Joining tables often required; computationally expensive | Structure designed for OLAP; computationally cheap
Cognitive issues in querying data | Cumbersome | Intuitive
Query languages | SQL or SQL front-ends, such as QBE | Point-and-click emphasis; no standardized language
Management of time dimension | Not well suited | Well suited



RDBMS vs. MDDB

• Do I still use an RDBMS for my DW?
• MDDBs store data in a hypercube, i.e., a
multidimensional array
• RDBMSs store data as tables with rows and
columns that do not map directly to the
multidimensional view that users have of
data
• EDW – RDBMS
• Data Marts – MDDB



RDBMS vs. MDDB: Trade-Offs
• SIZE
– MDDBs limited by size
• Mid – 1990s 10GB caused problems
• Today – 100GB is OK
– Large DWs are still better served by relational
front-ends running against high performance
and scalable RDBMS
• VOLATILITY
– Highly volatile data are better handled by
RDBMS
– MDDBs take long to load and update



RDBMS vs. MDDB: Trade-Offs
• AGGREGATE STRATEGY
– MDDBs support aggregates better
– RDBMSs are catching up with the help of
Aggregate Navigators
• INVESTMENT PROTECTION
– Most organizations already have made significant
investments in relational technology and skill sets
– Continued use for another purpose (DW) provides
additional ROI and lowers technical risk of failure
– MDDBs – need to acquire new software and train
staff to use it



RDBMS vs. MDDB: Trade-Offs

TYPE OF USERS
– Power users prefer the range of
functionalities available in MOLAP tools
– Users that require broad views of enterprise
data require access to DW and therefore
better served by a ROLAP tool



INTEGRATED ARCHITECTURE

• DB vendors have integrated their
multidimensional and relational database
products
• Multidimensional front-end tools
• If queries require data that are not
available in the MDDB, the tools retrieve the
data from the larger RDB
• Known as “DRILL-THROUGH”



Q&A
Thank You
