
Unit - 3 Data Cube Technology

Data cubes organize multidimensional data into a matrix structure to enable efficient online analytical processing (OLAP). Data is represented by multiple dimensions and attributes. OLAP allows analysts to quickly perform operations such as roll-up, drill-down, slicing, dicing, and pivoting on data cubes to extract and analyze data from different perspectives. Key benefits of OLAP include consistent information and calculations, the ability to create "what if" scenarios, and powerful visualization of data.

Uploaded by

Binay Yadav

Data Cube Technology

Data Cube or OLAP (online analytical processing) approach in Data Mining


A data cube is a grouping of data in a multidimensional matrix. In data warehousing, we generally
deal with multidimensional data models, because the data is described by multiple dimensions
and multiple attributes. This multidimensional data is represented in a data cube, since a cube can
represent a high-dimensional space. The data cube shows pictorially how the different attributes of the
data are arranged in the data model. Below is the diagram of a general data cube.

The example above is a 3D cube with the attributes branch (A, B, C, D), item type (home,
entertainment, computer, phone, security), and year (1997, 1998, 1999).
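
As a minimal sketch of this idea, a 3D cube can be modeled in Python as a mapping from one coordinate per dimension to a measure. The dimension values come from the example; the stored measure is a generated placeholder, not data from the figure, and the variable names are assumptions made for illustration.

```python
from itertools import product

# Dimension values taken from the example cube.
branches = ["A", "B", "C", "D"]
item_types = ["home", "entertainment", "computer", "phone", "security"]
years = [1997, 1998, 1999]

# Each cell is addressed by one value per dimension: (branch, item type, year).
# The measure stored here is a placeholder counter, not real sales data.
cube = {
    cell: measure
    for measure, cell in enumerate(product(branches, item_types, years), start=1)
}

num_cells = len(cube)                    # 4 branches * 5 item types * 3 years
first_cell = cube[("A", "home", 1997)]   # look up a single cell by its coordinates
```

Every combination of branch, item type, and year identifies exactly one cell, which is what makes the structure a cube rather than a flat list of records.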

Data cube classification:


The data cube can be classified into two categories:
• Multidimensional data cube: It stores large amounts of data in a multidimensional array. It
improves efficiency by keeping an index on each dimension, so a multidimensional data cube can
retrieve data quickly.
• Relational data cube: It stores large amounts of data in relational tables. Each relational table
represents the dimensions of the data cube. It is slower than a multidimensional data cube.

Advantages of data cubes:

• They give a summarised view of data.
• They store large amounts of data in a simple way.
• Data cube operations provide quick and better analysis.
• They improve query performance.

What is OLAP?
Online Analytical Processing (OLAP) is a category of software that allows users to analyze information
from multiple database systems at the same time. It is a technology that enables analysts to extract and view
business data from different points of view.
Analysts frequently need to group, aggregate, and join data. These operations are resource intensive.
With OLAP, data can be pre-calculated and pre-aggregated, making analysis faster.
OLAP databases are divided into one or more cubes. The cubes are designed in such a way that creating and
viewing reports becomes easy.

OLAP cube:
At the core of the OLAP concept is the OLAP cube, a data structure optimized for very quick data
analysis.

OLAP Cube

Er.Binay Yadav Page 1



Data cube operations (or basic analytical operations of OLAP):

Data cube operations are used to manipulate data to meet the needs of users. These operations help to
select particular data for analysis. There are five main operations, listed below.
Roll-up: Roll-up is also known as “consolidation” or “aggregation”. The roll-up operation can be
performed in two ways:
1. Reducing dimensions
2. Climbing up a concept hierarchy. A concept hierarchy is a system of grouping things based on their
order or level.
This operation aggregates similar data attributes that share the same dimension. For
example, if the data cube displays the daily income of a customer, we can use a roll-up operation to
find the customer's monthly income.
Consider the following diagram

Roll-up operation in OLAP

• In this example, the cities New Jersey and Los Angeles are rolled up into the country USA.
• The sales figures of New Jersey and Los Angeles are 440 and 1560 respectively. They become 2000
after roll-up.
• In this aggregation process, data in the location hierarchy moves up from city to country.
• In the roll-up process, one or more dimensions are removed. In this example, the city
dimension is removed.
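
The roll-up described above can be sketched in a few lines of Python. The sales figures and city names are the ones from the example; the function name `roll_up` and the dictionary layout are assumptions made for illustration.

```python
from collections import defaultdict

# City-level sales from the example. The city-to-country mapping plays the
# role of the concept hierarchy that is climbed from city to country.
city_sales = {"New Jersey": 440, "Los Angeles": 1560}
city_to_country = {"New Jersey": "USA", "Los Angeles": "USA"}

def roll_up(sales, hierarchy):
    """Aggregate a measure from a lower hierarchy level to the next level up."""
    totals = defaultdict(int)
    for member, value in sales.items():
        totals[hierarchy[member]] += value
    return dict(totals)

country_sales = roll_up(city_sales, city_to_country)
print(country_sales)  # {'USA': 2000}
```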

Drill-down: In drill-down, data is fragmented into smaller parts. It is the opposite of the roll-up
process. It can be done by
• Moving down the concept hierarchy
• Increasing a dimension


Drill-down operation in OLAP


Consider the diagram above
• Quarter Q1 is drilled down to the months January, February, and March. The corresponding sales
are also recorded.
• In this example, the month dimension is added.
This operation is the reverse of the roll-up operation. It allows us to take particular information and
then subdivide it further for finer-granularity analysis. It zooms into more detail. For example, if
India is a value in a country column and we wish to see villages in India, then the drill-down
operation splits India into states, districts, towns, cities, and villages, and then displays the required
information.
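
A drill-down from quarter to month might look like the following sketch. The monthly figures are hypothetical, since the numbers in the original diagram are not available, and the helper name `drill_down` is an assumption.

```python
# Hypothetical monthly sales; these are illustrative values only.
monthly_sales = {"January": 150, "February": 100, "March": 150, "April": 200}

# Concept hierarchy for the time dimension: month -> quarter.
month_to_quarter = {
    "January": "Q1", "February": "Q1", "March": "Q1", "April": "Q2",
}

def drill_down(detail, hierarchy, parent):
    """Return the lower-level members, with their values, that make up `parent`."""
    return {
        member: value
        for member, value in detail.items()
        if hierarchy[member] == parent
    }

q1_by_month = drill_down(monthly_sales, month_to_quarter, "Q1")
q1_total = sum(q1_by_month.values())  # rolling the months back up gives the Q1 figure
```

The sketch assumes the cube keeps the finest level of detail (months), so that the quarter view is a roll-up of it and the drill-down simply exposes the underlying members.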

Slicing:
Here, one dimension is selected, and a new sub-cube is created.
The following diagram explains how the slice operation is performed:

Slice operation in OLAP

• The Time dimension is sliced with Q1 as the filter.
• A new cube is created altogether.
This operation filters out the unnecessary portions. Suppose that in a particular dimension the user
doesn’t need everything for analysis, but only a particular attribute value. For example, with
country = “Jamaica”, the slice displays data only for Jamaica and does not display the other countries
present in the country list.
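
The following sketch illustrates a slice on a tiny cube. The cell values, the dimension names in `DIMENSIONS`, and the function `slice_cube` are all hypothetical and chosen only to mirror the Q1 and Jamaica examples.

```python
# Hypothetical cells keyed by (quarter, item type, country).
DIMENSIONS = ("time", "item", "country")
cube = {
    ("Q1", "phone", "Jamaica"): 10,
    ("Q1", "computer", "India"): 20,
    ("Q2", "phone", "Jamaica"): 30,
    ("Q2", "computer", "India"): 40,
}

def slice_cube(cells, dimension, value):
    """Fix one dimension to a single value and drop that dimension from the keys."""
    axis = DIMENSIONS.index(dimension)
    return {
        key[:axis] + key[axis + 1:]: measure
        for key, measure in cells.items()
        if key[axis] == value
    }

q1_slice = slice_cube(cube, "time", "Q1")
jamaica_slice = slice_cube(cube, "country", "Jamaica")
```

Because the sliced dimension has only one value left, the result has one dimension fewer than the original cube.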


Dicing:
This operation is similar to a slice. The difference is that in a dice you select two or more
dimensions, which results in the creation of a sub-cube.

Dice operation in OLAP

This operation performs a multidimensional cut: it not only cuts along one dimension but can also go
to another dimension and cut a certain range of it. As a result, it looks like a sub-cube taken out of the
whole cube (as depicted in the figure). For example, the user wants to see the annual salary of
Jharkhand state employees.
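
A dice can be sketched as a filter over two or more dimensions at once. As before, the cell values, `DIMENSIONS`, and the function `dice_cube` are hypothetical names and data used only for illustration.

```python
# Hypothetical cells keyed by (quarter, item type, country).
DIMENSIONS = ("time", "item", "country")
cube = {
    ("Q1", "phone", "Jamaica"): 10,
    ("Q1", "computer", "India"): 20,
    ("Q2", "phone", "India"): 30,
    ("Q3", "phone", "Jamaica"): 40,
}

def dice_cube(cells, **allowed):
    """Keep the cells whose value on every named dimension is in the allowed set."""
    axes = {DIMENSIONS.index(dim): set(values) for dim, values in allowed.items()}
    return {
        key: measure
        for key, measure in cells.items()
        if all(key[axis] in values for axis, values in axes.items())
    }

# Cut a range on the time dimension and a single value on the item dimension.
sub_cube = dice_cube(cube, time={"Q1", "Q2"}, item={"phone"})
```

Unlike the slice, the dice keeps all dimensions in the result; it only narrows the range of values along the selected ones.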

Pivot:
In pivot, you rotate the data axes to provide an alternative presentation of the data.
In the following example, the pivot is based on item types.

Pivot operation in OLAP

This operation is very important for viewing data. It transforms the data cube only in terms of how it
is viewed; it doesn’t change the data present in the data cube. For example, if the user is comparing
year versus branch, then by using the pivot operation the user can change the viewpoint and compare
branch versus item type.
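
For a two-dimensional view, a pivot amounts to swapping the row and column axes. The sketch below uses hypothetical values and an assumed helper named `pivot`; it rotates a year-by-branch view into a branch-by-year view.

```python
# Hypothetical 2D view of the cube: rows are years, columns are branches.
view = {
    1997: {"A": 10, "B": 20},
    1998: {"A": 30, "B": 40},
}

def pivot(table):
    """Swap the row and column axes without changing any cell value."""
    rotated = {}
    for row, columns in table.items():
        for column, value in columns.items():
            rotated.setdefault(column, {})[row] = value
    return rotated

by_branch = pivot(view)       # rows are now branches, columns are years
round_trip = pivot(by_branch) # rotating twice restores the original view
```

The cell values are untouched; only the orientation of the presentation changes, which matches the statement that a pivot does not alter the data in the cube.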

Advantages of OLAP
• OLAP is a platform for all types of business activities, including planning, budgeting, reporting,
and analysis.
• Information and calculations are consistent in an OLAP cube. This is a crucial benefit.
• Users can quickly create and analyze “what if” scenarios.
• Users can easily search the OLAP database for broad or specific terms.
• OLAP provides the building blocks for business modeling tools, data mining tools, and performance
reporting tools.
• It allows users to slice and dice cube data by various dimensions, measures, and filters.

• It is good for analyzing time series.
• Finding clusters and outliers is easy with OLAP.
• It is a powerful online analytical processing system for visualization that provides fast response
times.
Disadvantages of OLAP
• OLAP requires organizing data into a star or snowflake schema. These schemas are complicated to
implement and administer.
• You cannot have a large number of dimensions in a single OLAP cube.
• Transactional data cannot be accessed with an OLAP system.
• Any modification to an OLAP cube requires a full update of the cube. This is a time-consuming
process.
Types of OLAP systems
OLAP Hierarchical Structure

Types of OLAP Systems


Type of OLAP: Explanation
• Relational OLAP (ROLAP): ROLAP is an extended RDBMS along with multidimensional data
mapping to perform the standard relational operations.
• Multidimensional OLAP (MOLAP): MOLAP implements operations on multidimensional data.
• Hybrid Online Analytical Processing (HOLAP): In the HOLAP approach, the aggregated totals are
stored in a multidimensional database while the detailed data is stored in the relational database.
This offers both the data efficiency of the ROLAP model and the performance of the MOLAP model.
• Desktop OLAP (DOLAP): In Desktop OLAP, a user downloads a part of the data from the
database locally, or onto their desktop, and analyzes it. DOLAP is relatively cheap to deploy, as it
offers very few functionalities compared to other OLAP systems.
• Web OLAP (WOLAP): Web OLAP is an OLAP system accessible via a web browser. WOLAP has a
three-tiered architecture consisting of three components: client, middleware, and a database server.
• Mobile OLAP: Mobile OLAP helps users to access and analyze OLAP data using their mobile
devices.
• Spatial OLAP (SOLAP): SOLAP is created to facilitate management of both spatial and non-spatial
data in a Geographic Information System (GIS).
ROLAP
ROLAP works with data that exists in a relational database. Fact and dimension tables are stored as
relational tables. It also allows multidimensional analysis of data and is the fastest-growing type of OLAP.
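
To make the relational layout concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table and column names (`dim_location`, `fact_sales`, and so on) are hypothetical, and SQLite merely stands in for whatever RDBMS a ROLAP tool would use; the figures reuse the roll-up example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One dimension table and one fact table, both stored as ordinary relations.
conn.executescript("""
    CREATE TABLE dim_location (
        location_id INTEGER PRIMARY KEY,
        city        TEXT,
        country     TEXT
    );
    CREATE TABLE fact_sales (
        location_id INTEGER REFERENCES dim_location,
        amount      INTEGER
    );
    INSERT INTO dim_location VALUES (1, 'New Jersey', 'USA'), (2, 'Los Angeles', 'USA');
    INSERT INTO fact_sales VALUES (1, 440), (2, 1560);
""")

# The multidimensional view is computed on demand with SQL: join the fact
# table to its dimension and aggregate at the country level.
rows = conn.execute("""
    SELECT d.country, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_location AS d ON d.location_id = f.location_id
    GROUP BY d.country
""").fetchall()
conn.close()
```

Here nothing is pre-aggregated: each analytical question becomes a join and a `GROUP BY` at query time.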
Advantages of ROLAP model:
• High data efficiency. Query performance and the access language are optimized specifically for
multidimensional data analysis.
• Scalability. This type of OLAP system can manage large volumes of data, even when the data is
steadily increasing.

Drawbacks of ROLAP model:

• Demand for higher resources. ROLAP needs high utilization of manpower, software, and hardware
resources.
• Aggregate data limitations. ROLAP tools use SQL for all calculations on aggregate data. However,
SQL is not well suited to every kind of computation, which limits what ROLAP tools can calculate.
• Slow query performance. Query performance in this model is slow when compared with MOLAP.
MOLAP
Multidimensional OLAP (MOLAP) is the classical form of OLAP, which facilitates data analysis by using a
multidimensional data cube. Data is pre-computed, pre-summarized, and stored in the MOLAP cube (a major
difference from ROLAP).
MOLAP uses array-based multidimensional storage engines to display multidimensional views of data.
Basically, it uses an OLAP cube.
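
The array-based, pre-computed storage described above can be sketched as follows. This is only an illustration under simple assumptions: a small dense two-dimensional array held in nested Python lists, hypothetical sales values, and assumed names such as `branch_index` and `sales`.

```python
# Dimension members; each dictionary acts as the index for one axis of the array.
branches = ["A", "B"]
years = [1997, 1998, 1999]
branch_index = {branch: i for i, branch in enumerate(branches)}
year_index = {year: i for i, year in enumerate(years)}

# Dense 2D array of hypothetical sales: cells[branch][year].
cells = [
    [10, 20, 30],  # branch A
    [40, 50, 60],  # branch B
]

# Aggregates are computed once, when the cube is built, and then stored.
total_by_branch = [sum(row) for row in cells]
total_by_year = [sum(column) for column in zip(*cells)]
grand_total = sum(total_by_branch)

def sales(branch, year):
    """Direct array lookup through the per-dimension indexes."""
    return cells[branch_index[branch]][year_index[year]]
```

Queries against the stored totals need no scan or join at query time, which is the main contrast with the SQL-on-demand approach of ROLAP.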
MOLAP Advantages
Below are the advantages of MOLAP:
• MOLAP can manage, analyze, and store considerable amounts of multidimensional data.
• Fast query performance due to optimized storage, indexing, and caching.
• Smaller data sizes compared to a relational database.
• Automated computation of higher-level aggregates of the data.
• It helps users to analyze larger, less-defined data.
• MOLAP is easier to use, which makes it a suitable model for inexperienced users.
• MOLAP cubes are built for fast data retrieval and are optimal for slicing and dicing operations.
• All calculations are pre-generated when the cube is created.
Disadvantages of MOLAP
Following are the disadvantages of MOLAP:
• One major weakness of MOLAP is that it is less scalable than ROLAP, as it handles only a limited
amount of data.
• MOLAP also introduces data redundancy, and it is resource intensive.
• Processing in MOLAP solutions may be lengthy, particularly on large data volumes.
• MOLAP products may face issues while updating and querying models when there are more than
ten dimensions.
• MOLAP is not capable of containing detailed data.
• Storage utilization can be low if the data set is highly sparse.
• Because it can handle only a limited amount of data, it is impossible to include a large amount of
data in the cube itself.

Hybrid OLAP
Hybrid OLAP (HOLAP) is a mixture of both ROLAP and MOLAP. It offers the fast computation of MOLAP
and the higher scalability of ROLAP. HOLAP uses two databases:
1. Aggregated or computed data is stored in a multidimensional OLAP cube.
2. Detailed information is stored in a relational database.
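
The two-store arrangement can be sketched in Python as below. This is an assumption-laden illustration rather than how any particular HOLAP product works: SQLite stands in for the relational database, a plain dictionary stands in for the multidimensional cube, and the names `country_cube`, `total_for`, and `details_for` are hypothetical. The figures reuse the roll-up example.

```python
import sqlite3

# Detailed rows live in a relational table (the ROLAP side).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sales (city TEXT, country TEXT, amount INTEGER)")
db.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("New Jersey", "USA", 440), ("Los Angeles", "USA", 1560)],
)

# Aggregated totals are pre-computed into an in-memory cube (the MOLAP side).
country_cube = dict(
    db.execute("SELECT country, SUM(amount) FROM sales GROUP BY country")
)

def total_for(country):
    """Summary queries are answered from the pre-aggregated cube."""
    return country_cube[country]

def details_for(country):
    """Detail queries fall through to the relational store."""
    query = "SELECT city, amount FROM sales WHERE country = ? ORDER BY city"
    return db.execute(query, (country,)).fetchall()
```

A call such as `total_for("USA")` never touches the relational table, while `details_for("USA")` reads the individual city rows that the cube does not hold.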
Benefits of Hybrid OLAP:
• This kind of OLAP helps to economize on disk space, and it also remains compact, which helps to
avoid issues related to access speed and convenience.
• HOLAP uses cube technology, which allows faster performance for all types of data.
• ROLAP data is updated instantly, and HOLAP users have access to this real-time data.
MOLAP brings cleaning and conversion of data, thereby improving data relevance. This brings the
best of both worlds.
Drawbacks of Hybrid OLAP:
• Greater complexity level: The major drawback of HOLAP systems is that they must support both
ROLAP and MOLAP tools and applications, which makes them very complicated.
• Potential overlaps: There is a higher chance of overlap, especially in the functionalities of the two.
