
Data Cube Technology

Data cubes allow data to be modeled and viewed in multiple dimensions to enable interactive analysis. A data cube stores measures from a fact table that can be analyzed based on dimensions like time, products, or locations. Common operations on data cubes include rolling up, drilling down, slicing, and dicing to summarize and explore the data from different levels.

Uploaded by

engineershaiwal

From Tables and Spreadsheets to Data Cubes

◼ Data warehouse and OLAP tools are based on a multidimensional data model, which views data in the form of a data cube
◼ A data cube, such as sales, allows data to be modeled and viewed in multiple dimensions
◼ Dimension tables, such as item (item_name, brand, type) or time (day, week, month, quarter, year)
◼ Fact table contains measures (such as dollars_sold) and keys to each of the related dimension tables

Han: Data Cubes 1


Data Cube Terminology

◼ A data cube supports viewing and modelling of a variable (or a set of variables) of interest. Measures are used to report the values of that variable with respect to a given set of dimensions.
◼ A fact table stores measures as well as keys representing relationships to various dimensions.
◼ Dimensions are perspectives with respect to which an organization wants to keep records.
◼ A star schema defines a fact table and its associated dimensions.



Conceptual Modeling of Data Warehouses
◼ Modeling data warehouses: dimensions & measures
◼ Star schema: A fact table in the middle connected to a set of dimension tables
◼ Snowflake schema: A refinement of the star schema where some dimensional hierarchy is normalized into a set of smaller dimension tables, forming a shape similar to a snowflake
◼ Fact constellations: Multiple fact tables share dimension tables, viewed as a collection of stars, therefore called galaxy schema or fact constellation
Example of Star Schema
◼ Sales Fact Table
    ◼ Keys: time_key, item_key, branch_key, location_key
    ◼ Measures: units_sold, dollars_sold, avg_sales
◼ time: time_key, day, day_of_the_week, month, quarter, year
◼ item: item_key, item_name, brand, type, supplier_type
◼ branch: branch_key, branch_name, branch_type
◼ location: location_key, street, city, province_or_state, country
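The star schema above can be tried out with a small in-memory database. The following is a minimal sketch using Python's built-in sqlite3 module, reduced to the sales fact table and the time and item dimensions with only some of their columns. All row values and the helper names `demo_connection` and `dollars_by_brand_and_quarter` are assumptions made for illustration; they do not come from the slides.

```python
import sqlite3


def demo_connection():
    """Build a reduced star schema in memory and load hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE "time" (
            time_key INTEGER PRIMARY KEY, day INTEGER, month INTEGER,
            quarter TEXT, year INTEGER
        );
        CREATE TABLE item (
            item_key INTEGER PRIMARY KEY, item_name TEXT, brand TEXT, type TEXT
        );
        -- The fact table holds keys to the dimensions plus the measures.
        CREATE TABLE sales (
            time_key INTEGER REFERENCES "time"(time_key),
            item_key INTEGER REFERENCES item(item_key),
            units_sold INTEGER,
            dollars_sold REAL
        );
    """)
    # Hypothetical sample data, invented for this sketch.
    conn.executemany('INSERT INTO "time" VALUES (?, ?, ?, ?, ?)', [
        (1, 15, 1, "Q1", 2024),
        (2, 20, 4, "Q2", 2024),
    ])
    conn.executemany("INSERT INTO item VALUES (?, ?, ?, ?)", [
        (1, "TV-A", "Acme", "TV"),
        (2, "PC-B", "Acme", "PC"),
        (3, "TV-C", "Zenith", "TV"),
    ])
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", [
        (1, 1, 2, 800.0),
        (1, 2, 1, 1200.0),
        (2, 1, 1, 400.0),
        (2, 3, 3, 900.0),
    ])
    return conn


def dollars_by_brand_and_quarter(conn):
    """Join the fact table to two dimensions and sum one measure."""
    return conn.execute("""
        SELECT i.brand, t.quarter, SUM(s.dollars_sold)
        FROM sales AS s
        JOIN "time" AS t ON t.time_key = s.time_key
        JOIN item AS i ON i.item_key = s.item_key
        GROUP BY i.brand, t.quarter
        ORDER BY i.brand, t.quarter
    """).fetchall()


for row in dollars_by_brand_and_quarter(demo_connection()):
    print(row)
```

Every analytical query follows the same pattern: start from the fact table, join along the keys to whichever dimensions are needed, and group by dimension attributes.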



A Concept Hierarchy: Dimension (location)

◼ all: all
◼ region: Europe ... North_America
◼ country: Germany ... Spain (Europe); Canada ... Mexico (North_America)
◼ city: Frankfurt ... (Germany); Vancouver ... Toronto (Canada)
◼ office: L. Chan ... M. Wind
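One way to make the location hierarchy concrete is a child-to-parent map that is climbed one level at a time. This is a minimal sketch: the names `PARENT`, `LEVELS`, `roll_up` and `sales_by_level` are hypothetical, the sales figures in the usage line are invented, and the office level is left out because the slide text does not show which city each office belongs to.

```python
# Child -> parent links for the levels city < country < region < all.
PARENT = {
    "Frankfurt": "Germany",
    "Vancouver": "Canada",
    "Toronto": "Canada",
    "Germany": "Europe",
    "Spain": "Europe",
    "Canada": "North_America",
    "Mexico": "North_America",
    "Europe": "all",
    "North_America": "all",
}

LEVELS = ["city", "country", "region", "all"]


def roll_up(value, from_level, to_level):
    """Climb the hierarchy from from_level to the coarser to_level."""
    steps = LEVELS.index(to_level) - LEVELS.index(from_level)
    if steps < 0:
        raise ValueError("to_level must not be finer than from_level")
    for _ in range(steps):
        value = PARENT[value]
    return value


def sales_by_level(city_sales, to_level):
    """Re-aggregate city-level totals at a coarser level of the hierarchy."""
    totals = {}
    for city, amount in city_sales.items():
        key = roll_up(city, "city", to_level)
        totals[key] = totals.get(key, 0) + amount
    return totals


# Hypothetical city-level sales, rolled up to country level.
print(sales_by_level({"Frankfurt": 10, "Vancouver": 5, "Toronto": 7}, "country"))
```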



View of Warehouses and Hierarchies

Specification of hierarchies
◼ Schema hierarchy
    day < {month < quarter; week} < year
◼ Set_grouping hierarchy
    {1..10} < inexpensive
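Both kinds of hierarchy can be sketched in a few lines of Python. The schema hierarchy is derived from a date, and the set-grouping hierarchy is a lookup from value ranges to labels. The function names, the label formats such as "2024-Q2", and the use of ISO week numbering are assumptions of this sketch; only the `{1..10} < inexpensive` group comes from the slide, so any other price returns `None`.

```python
from datetime import date


def time_levels(d):
    """Return the value of d at each level of day < {month < quarter; week} < year."""
    iso_year, iso_week, _ = d.isocalendar()
    return {
        "day": d.isoformat(),
        "month": f"{d.year}-{d.month:02d}",
        "quarter": f"{d.year}-Q{(d.month - 1) // 3 + 1}",
        # Weeks can straddle month boundaries, so week is a separate branch
        # that does not roll up into month or quarter.
        "week": f"{iso_year}-W{iso_week:02d}",
        "year": d.year,
    }


# Set-grouping hierarchy: explicit value sets mapped to a higher-level label.
PRICE_GROUPS = [
    (range(1, 11), "inexpensive"),  # {1..10} < inexpensive
]


def price_group(price):
    """Return the group label for an integer price, or None if no group matches."""
    for values, label in PRICE_GROUPS:
        if price in values:
            return label
    return None


print(time_levels(date(2024, 5, 17)))
print(price_group(7))
```

The week branch is the reason the schema hierarchy is a lattice rather than a single chain: a day belongs to exactly one month and exactly one week, but a week does not belong to exactly one month.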



Multidimensional Data
◼ Sales volume as a function of product, month, and region
◼ Dimensions: Product, Location, Time
◼ Hierarchical summarization paths
    ◼ Product: Industry > Category > Product
    ◼ Location: Region > Country > City > Office
    ◼ Time: Year > {Quarter > Month; Week} > Day
[Figure: sales cube; surviving axis labels: Product, Month]
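A Sample Data Cube
[Figure: three-dimensional sales cube with a sum cell along each axis]
◼ Date: 1Qtr, 2Qtr, 3Qtr, 4Qtr, sum
◼ Product: TV, PC, VCR, sum
◼ Country: U.S.A, Canada, Mexico, sum
◼ Highlighted cell: total annual sales of TV in U.S.A.
◼ Apex cell: All, All, All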
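The sum cells of the sample cube, including the apex cell (All, All, All), can be produced by aggregating the base facts over every subset of the dimensions. The sketch below does this by brute force. The fact rows and the names `DIMS` and `compute_cube` are hypothetical, and practical cube engines use far more efficient algorithms than a scan per cuboid.

```python
from itertools import combinations

DIMS = ("date", "product", "country")


def compute_cube(facts, dims=DIMS, measure="sales"):
    """Sum the measure for every subset of dims (all 2**n cuboids).

    A dimension that has been aggregated away is written as "All".
    """
    cube = {}
    for k in range(len(dims) + 1):
        for kept in combinations(dims, k):
            for row in facts:
                cell = tuple(row[d] if d in kept else "All" for d in dims)
                cube[cell] = cube.get(cell, 0) + row[measure]
    return cube


# Hypothetical base facts, invented for this sketch.
FACTS = [
    {"date": "1Qtr", "product": "TV", "country": "U.S.A", "sales": 10},
    {"date": "2Qtr", "product": "TV", "country": "U.S.A", "sales": 20},
    {"date": "1Qtr", "product": "PC", "country": "Canada", "sales": 5},
    {"date": "3Qtr", "product": "TV", "country": "Mexico", "sales": 7},
]

cube = compute_cube(FACTS)
print(cube[("All", "TV", "U.S.A")])  # total annual sales of TV in U.S.A.
print(cube[("All", "All", "All")])   # apex cell: grand total
```

With three dimensions there are 2**3 = 8 cuboids, ranging from the base cuboid (no dimension aggregated) to the apex cuboid (all dimensions aggregated).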


Browsing a Data Cube

◼ Visualization
◼ OLAP capabilities
◼ Interactive manipulation
Typical OLAP Operations

◼ Roll up (drill-up): summarize data
    ◼ by climbing up a hierarchy or by dimension reduction
◼ Drill down (roll down): reverse of roll-up
    ◼ from higher-level summary to lower-level summary or detailed data, or introducing new dimensions
◼ Slice and dice:
    ◼ project and select
◼ Pivot (rotate):
    ◼ reorient the cube for visualization, e.g. 3D to a series of 2D planes
◼ Other operations
    ◼ drill across: involving (across) more than one fact table
    ◼ …
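The operations listed above can be illustrated on a handful of fact rows held in plain Python dictionaries. This is a sketch under stated assumptions: the rows in `FACTS` and the function names `aggregate`, `slice_cube`, `dice_cube` and `pivot` are hypothetical, and a real OLAP server would run these operations against precomputed cuboids instead of scanning rows.

```python
from collections import defaultdict

# Hypothetical fact rows at the finest level (city, quarter, product).
FACTS = [
    {"city": "Vancouver", "country": "Canada", "quarter": "Q1", "product": "TV", "units_sold": 3},
    {"city": "Toronto", "country": "Canada", "quarter": "Q1", "product": "TV", "units_sold": 4},
    {"city": "Toronto", "country": "Canada", "quarter": "Q2", "product": "PC", "units_sold": 2},
    {"city": "Frankfurt", "country": "Germany", "quarter": "Q1", "product": "PC", "units_sold": 5},
]


def aggregate(facts, dims, measure="units_sold"):
    """Group by dims and sum the measure; this is the core of roll-up and drill-down."""
    totals = defaultdict(int)
    for row in facts:
        totals[tuple(row[d] for d in dims)] += row[measure]
    return dict(totals)


def slice_cube(facts, dim, value):
    """Slice: select on a single dimension."""
    return [row for row in facts if row[dim] == value]


def dice_cube(facts, criteria):
    """Dice: select on several dimensions; criteria maps dimension -> allowed values."""
    return [
        row for row in facts
        if all(row[dim] in allowed for dim, allowed in criteria.items())
    ]


def pivot(cells):
    """Pivot: swap the two axes of a 2-D result."""
    return {(b, a): total for (a, b), total in cells.items()}


by_city = aggregate(FACTS, ["city", "quarter"])        # detailed view (drill-down target)
by_country = aggregate(FACTS, ["country", "quarter"])  # roll-up: climb city -> country
by_quarter = aggregate(FACTS, ["quarter"])             # roll-up: dimension reduction
q1_only = slice_cube(FACTS, "quarter", "Q1")
canada_tv = dice_cube(FACTS, {"country": {"Canada"}, "product": {"TV"}})
print(by_country)
print(pivot(by_country))
```

Drill-down is the same `aggregate` call run with a finer set of dimensions, which is why it needs access to the detailed data and cannot be derived from an already rolled-up result.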



A Star-Net Query Model
[Figure: star-net with one radial line per dimension; each circle on a line is called a footprint]
◼ Customer Orders: CONTRACTS, ORDER
◼ Shipping Method: AIR-EXPRESS, TRUCK
◼ Time: ANNUALLY, QTRLY, DAILY
◼ Product: PRODUCT LINE, PRODUCT ITEM, PRODUCT GROUP
◼ Location: CITY, COUNTRY, REGION
◼ Organization: SALES PERSON, DISTRICT, DIVISION
◼ Customer
◼ Promotion
Discovery-Driven Exploration of Data Cubes
◼ Hypothesis-driven: exploration by user, huge search space
◼ Discovery-driven (Sarawagi et al.’98)
    ◼ pre-compute measures indicating exceptions, to guide the user in the data analysis at all levels of aggregation
    ◼ Exception: a cell value significantly different from the value anticipated, based on a statistical model
    ◼ Visual cues such as background color are used to reflect the degree of exception of each cell
    ◼ Computation of exception indicators (model fitting and computing SelfExp, InExp, and PathExp values) can be overlapped with cube construction
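The idea of an exception as a cell that deviates from a model-based anticipated value can be sketched on a single two-dimensional slice. The code below is a deliberately simplified stand-in and not the SelfExp, InExp or PathExp computation of Sarawagi et al.: it assumes a plain additive model (row mean + column mean - grand mean), standardizes the residuals, and flags cells beyond a threshold. The function name `flag_exceptions`, the threshold of 2.0, and the sample grid are all assumptions of this sketch.

```python
from statistics import mean, pstdev


def flag_exceptions(table, threshold=2.0):
    """Flag cells of a complete 2-D grid that deviate from an additive model.

    table maps (row, col) -> value. Returns {cell: standardized residual}
    for cells whose absolute standardized residual exceeds threshold.
    """
    rows = sorted({r for r, _ in table})
    cols = sorted({c for _, c in table})
    grand = mean(table.values())
    row_mean = {r: mean(table[(r, c)] for c in cols) for r in rows}
    col_mean = {c: mean(table[(r, c)] for r in rows) for c in cols}

    # Anticipated value under the additive model, and the residual from it.
    residual = {
        (r, c): table[(r, c)] - (row_mean[r] + col_mean[c] - grand)
        for r in rows
        for c in cols
    }
    spread = pstdev(residual.values())
    if spread == 0:
        return {}
    return {
        cell: res / spread
        for cell, res in residual.items()
        if abs(res / spread) > threshold
    }


# Hypothetical 4x4 slice: every cell is 10 except one unusually high cell.
grid = {(f"r{i}", f"c{j}"): 10 for i in range(4) for j in range(4)}
grid[("r0", "c0")] = 50
print(flag_exceptions(grid))
```

In a browsing tool the returned score would drive the visual cue, for example a darker background color for a larger absolute score.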
Examples: Discovery-Driven Data Cubes



Software to Work with Data Cubes

◼ http://www.bi-verdict.com/
◼ http://www.bi-verdict.com/fileadmin/FreeAnalyses/Comment_OLAP_revival.htm



Summary
◼ Data warehouse
    ◼ A subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management’s decision-making process
◼ A multi-dimensional model of a data warehouse
    ◼ Star schema, snowflake schema, fact constellations
    ◼ A data cube allows measures to be viewed with respect to a given set of dimensions
◼ OLAP operations: drilling, rolling, slicing, dicing and pivoting



◼ Advantages of data cubes:
    ◼ Multi-dimensional analysis: Data cubes enable multi-dimensional analysis of business data, allowing users to view data from different perspectives and levels of detail.
    ◼ Interactivity: Data cubes provide interactive access to large amounts of data, allowing users to easily navigate and manipulate the data to support their analysis.
    ◼ Speed and efficiency: Data cubes are
    ◼ Data aggregation: Data cubes support complex calculations and data aggregation, enabling users to quickly and easily summarize large amounts of data.
    ◼ Improved decision-making: Data cubes provide a clear and comprehensive view of business data, enabling improved decision-making and business intelligence.
    ◼ Accessibility: Data cubes can be accessed from a variety of devices and platforms, making it easy for users to access and analyze business
    ◼ Gives a summarized view of data.
    ◼ Stores large amounts of data in a simple way.
    ◼ Data cube operations provide quick and better analysis.
    ◼ Improves query performance.
◼ Disadvantages of data cubes:
    ◼ Complexity: OLAP systems can be complex to set up and maintain, requiring specialized technical expertise.
    ◼ Data size limitations: OLAP systems can struggle with very large data sets and may require extensive data aggregation or summarization.
    ◼ Performance issues: OLAP systems can be slow when dealing with large amounts of data, especially when running complex queries or calculations.
    ◼ Data integrity: Inconsistent data definitions and data quality issues can affect the accuracy of OLAP analysis.
    ◼ Cost: OLAP technology can be expensive, especially for enterprise-level solutions, due to the need for specialized hardware and software.
    ◼ Inflexibility: OLAP systems may not easily accommodate changing business needs and may require significant effort to modify or extend.
