Unit II OLAP

Online Analytical Processing (OLAP) is a technology that enables complex analysis of large business databases without impacting transactional systems, utilizing optimized databases for heavy read workloads. Semantic modeling simplifies data querying by abstracting underlying structures, making it user-friendly, and is primarily used in analytics and business intelligence. The document also discusses various OLAP server types (ROLAP, MOLAP, HOLAP), their operations, advantages, and disadvantages, as well as the differences between OLAP and OLTP systems.

Uploaded by

Neha Shaw

Online analytical processing (OLAP)

Online analytical processing (OLAP) is a technology that organizes large business databases and supports
complex analysis. It can be used to perform complex analytical queries without negatively affecting
transactional systems.

The databases that a business uses to store all its transactions and records are called online transaction
processing (OLTP) databases. These databases usually have records that are entered one at a time. Often
they contain a great deal of information that is valuable to the organization. The databases that are used
for OLTP, however, were not designed for analysis. Therefore, retrieving answers from these databases is
costly in terms of time and effort. OLAP systems were designed to help extract this business intelligence
information from the data in a highly performant way. This is because OLAP databases are optimized for
heavy read, low write workloads.

Semantic modeling
A semantic data model is a conceptual model that describes the meaning of the data elements it contains.
Organizations often have their own terms for things, sometimes with synonyms, or even different
meanings for the same term. For example, an inventory database might track a piece of equipment with
an asset ID and a serial number, but a sales database might refer to the serial number as the asset ID.
There is no simple way to relate these values without a model that describes the relationship.

Semantic modeling provides a level of abstraction over the database schema, so that users don't need to
know the underlying data structures. This makes it easier for end users to query data without performing
aggregates and joins over the underlying schema. Also, columns are usually renamed to more user-friendly
names, so that the context and meaning of the data are more obvious.

Semantic modeling is predominantly used for read-heavy scenarios, such as analytics and business
intelligence (OLAP), as opposed to more write-heavy transactional data processing (OLTP). This is mostly
due to the nature of a typical semantic layer:

• Aggregation behaviors are set so that reporting tools display them properly.
• Business logic and calculations are defined.
• Time-oriented calculations are included.
• Data is often integrated from multiple sources.

Traditionally, the semantic layer is placed over a data warehouse for these reasons.
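
The abstraction described above can be sketched in a few lines. This is only an illustration, not how any particular product implements a semantic layer: the table `fact_sales`, its cryptic column names, and the `SEMANTIC_MODEL` dictionary are all hypothetical names invented for this example.

```python
import sqlite3

# Hypothetical semantic model: friendly names mapped to physical columns
# and to pre-defined aggregation behavior.
SEMANTIC_MODEL = {
    "table": "fact_sales",
    "dimensions": {"Region": "rgn_cd", "Quarter": "qtr"},
    "measures": {"Total Sales": "SUM(sls_amt)", "Order Count": "COUNT(*)"},
}

def build_query(model, dimension, measure):
    """Translate a friendly (dimension, measure) request into SQL."""
    dim_col = model["dimensions"][dimension]
    agg = model["measures"][measure]
    return (
        f'SELECT {dim_col} AS "{dimension}", {agg} AS "{measure}" '
        f'FROM {model["table"]} GROUP BY {dim_col} ORDER BY {dim_col}'
    )

# Small in-memory database standing in for the warehouse (made-up rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_sales (rgn_cd TEXT, qtr TEXT, sls_amt REAL)")
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?)",
    [("East", "Q1", 100.0), ("East", "Q2", 150.0), ("West", "Q1", 80.0)],
)

# The user asks for "Total Sales" by "Region" without knowing the schema.
rows = conn.execute(build_query(SEMANTIC_MODEL, "Region", "Total Sales")).fetchall()
print(rows)
```

The user only ever sees the names "Region" and "Total Sales"; the join and aggregation details stay inside the model.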

There are two primary types of semantic models:

• Tabular. Uses relational modeling constructs (model, tables, columns). Internally, metadata is
inherited from OLAP modeling constructs (cubes, dimensions, measures). Code and script use OLAP
metadata.
• Multidimensional. Uses traditional OLAP modeling constructs (cubes, dimensions, measures).

Multi Dimensional Data Model

The multidimensional data model is a method of organizing data in a database so that its contents
are well arranged and easy to assemble.
It lets users pose analytical questions about market or business trends, whereas a relational
database exposes data through queries over tables. Because the data is already organized for
analysis, users receive answers to such requests comparatively quickly.
OLAP (online analytical processing) and data warehousing use multidimensional databases to show
users multiple dimensions of the data.
The model represents data in the form of data cubes, which allow data to be modeled and viewed
from many dimensions and perspectives. A cube is defined by dimensions and facts and is
represented by a fact table. Facts are numerical measures, and the fact table contains the names of
the facts (the measures) together with keys to each of the related dimension tables.

Multidimensional Data Representation

Working on a Multidimensional Data Model

The multidimensional data model is built by following a fixed sequence of stages.
Every project that builds a multidimensional data model should work through the following
stages :
Stage 1 : Assembling data from the client : In the first stage, correct data is collected from the
client. Typically, software professionals explain to the client the range of data that can be
obtained with the selected technology, and then collect the complete data in detail.
Stage 2 : Grouping different segments of the system : In the second stage, all the data is
recognized and classified into the sections it belongs to, which makes the model easier to apply
step by step.
Stage 3 : Noticing the different proportions : The third stage forms the basis of the system's
design. In this stage, the main factors are identified from the user’s point of view. These factors
are also known as “Dimensions”.
Stage 4 : Preparing the actual-time factors and their respective qualities : In the fourth stage,
the factors identified in the previous stage are used to identify their related qualities. These
qualities are also known as “attributes” in the database.
Stage 5 : Finding the actuality of factors which are listed previously and their qualities : In
the fifth stage, the facts (the “actuality”) are separated and differentiated from the factors
collected so far. These facts play a significant role in the arrangement of a multidimensional
data model.
Stage 6 : Building the Schema to place the data, with respect to the information collected
from the steps above : In the sixth stage, a schema is built on the basis of the data collected in
the previous stages.
For Example :
1. Let us take the example of a firm. The revenue of a firm can be analyzed on the basis of
different factors such as the geographical location of the firm’s workplace, the products of the
firm, the advertisements done, the time taken for a product to flourish, etc.

Example 1

2. Let us take the example of the data of a factory which sells products per quarter in Bangalore.
The data is represented in the table given below :
2D factory data

In the above given presentation, the factory’s sales for Bangalore are shown along the time
dimension, which is organized into quarters, and the item dimension, which is sorted according to
the kind of item sold. The facts here are represented in rupees (in thousands).
Now suppose we want to view the sales data in three dimensions, according to item, time and
location (like Kolkata, Delhi, Mumbai). The three-dimensional data can still be laid out as a
two-dimensional table. Here is the table :

3D data representation as 2D

This data can be represented in the form of three dimensions conceptually, which is shown in the
image below :
3D data representation
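
To make the relationship between the 3D cube and its 2D layout concrete, here is a minimal sketch. The sales figures and item names are invented for illustration (they are not the values from the tables above), and the dictionary-of-tuples representation is just one simple way to hold a cube in memory.

```python
# Hypothetical sales figures (in thousands of rupees); not taken from the source tables.
# Each cell of the cube is addressed by three dimension values.
cube = {
    ("Kolkata", "Keyboard", "Q1"): 10,
    ("Kolkata", "Mouse", "Q1"): 4,
    ("Delhi", "Keyboard", "Q1"): 7,
    ("Delhi", "Keyboard", "Q2"): 9,
    ("Mumbai", "Mouse", "Q2"): 6,
}

def table_for_location(cube, location):
    """Return the 2D (item x quarter) table for one location."""
    table = {}
    for (loc, item, quarter), sales in cube.items():
        if loc == location:
            table.setdefault(item, {})[quarter] = sales
    return table

# Viewing the 3D cube as a series of 2D tables, one per location.
for location in ("Kolkata", "Delhi", "Mumbai"):
    print(location, table_for_location(cube, location))

delhi = table_for_location(cube, "Delhi")
```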

Advantages of Multi Dimensional Data Model

The following are the advantages of a multi-dimensional data model :

• A multi-dimensional data model is easy to handle.
• It is easy to maintain.
• For analytical queries, its performance is better than that of normal databases (e.g. relational
databases).
• The representation of data is better than in traditional databases. That is because
multi-dimensional databases are multi-viewed and carry different types of factors.
• It is workable on complex systems and applications, contrary to simple one-dimensional
database systems.
• Its compatibility is a benefit for projects that have limited maintenance staff.

Disadvantages of Multi Dimensional Data Model

The following are the disadvantages of a Multi Dimensional Data Model :

• The multi-dimensional data model is somewhat complicated in nature, and it requires
professionals to recognize and examine the data in the database.
• When the system caches data while a multi-dimensional data model is in use, the working of the
system is greatly affected.
• Because of this complexity, the databases are generally dynamic in design.
• The path to achieving the end product is complicated most of the time.
• Because the model is complex, it involves a large number of databases, which leaves the system
highly exposed when there is a security breach.

Types of OLAP Servers


We have four types of OLAP servers −
• Relational OLAP (ROLAP)
• Multidimensional OLAP (MOLAP)
• Hybrid OLAP (HOLAP)
• Specialized SQL Servers

Relational OLAP
ROLAP servers are placed between a relational back-end server and client front-end tools. To store and
manage warehouse data, ROLAP uses a relational or extended-relational DBMS.
ROLAP includes the following −

• Implementation of aggregation navigation logic.
• Optimization for each DBMS back end.
• Additional tools and services.

Multidimensional OLAP
MOLAP uses array-based multidimensional storage engines for multidimensional views of data. With
multidimensional data stores, the storage utilization may be low if the data set is sparse. Therefore, many
MOLAP servers use two levels of data storage representation to handle dense and sparse data sets.
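
The storage-utilization point can be illustrated with a small sketch. The cube shape and values below are hypothetical, and real MOLAP engines use far more elaborate chunking and compression schemes; this only contrasts a dense array (one slot per possible cell) with a sparse map (populated cells only).

```python
# Hypothetical cube shape: 4 items x 4 quarters x 3 locations = 48 cells.
ITEMS, QUARTERS, LOCATIONS = 4, 4, 3

# Only a few cells actually hold data (a sparse data set).
facts = {(0, 0, 0): 120, (1, 2, 0): 75, (3, 3, 2): 40}

# Dense representation: one slot per possible cell, mostly empty.
dense = [[[0] * LOCATIONS for _ in range(QUARTERS)] for _ in range(ITEMS)]
for (item, quarter, loc), value in facts.items():
    dense[item][quarter][loc] = value

total_cells = ITEMS * QUARTERS * LOCATIONS
utilization = len(facts) / total_cells  # fraction of slots that hold real data

def lookup(item, quarter, loc):
    """Sparse representation: store only populated cells, default to 0."""
    return facts.get((item, quarter, loc), 0)

print(f"{len(facts)} of {total_cells} cells populated ({utilization:.1%})")
```

With only 3 of 48 cells populated, the dense array wastes most of its space, which is why a second, sparse representation is useful for such regions of the cube.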

Hybrid OLAP
Hybrid OLAP is a combination of both ROLAP and MOLAP. It offers the higher scalability of ROLAP and
the faster computation of MOLAP. HOLAP servers can store large volumes of detailed
information, while the aggregations are stored separately in a MOLAP store.

Specialized SQL Servers


Specialized SQL servers provide advanced query language and query processing support for SQL
queries over star and snowflake schemas in a read-only environment.
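
For readers unfamiliar with star schemas, the following is a minimal sketch of the kind of query such servers are built to answer. It uses Python's built-in `sqlite3` module purely as a stand-in; the table names (`fact_sales`, `dim_item`, `dim_time`) and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables describe the "who/what/when" of each fact.
CREATE TABLE dim_item (item_id INTEGER PRIMARY KEY, item_name TEXT);
CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, quarter TEXT);
-- The central fact table holds measures plus keys to each dimension.
CREATE TABLE fact_sales (
    item_id INTEGER REFERENCES dim_item(item_id),
    time_id INTEGER REFERENCES dim_time(time_id),
    amount  REAL
);
INSERT INTO dim_item VALUES (1, 'Mobile'), (2, 'Modem');
INSERT INTO dim_time VALUES (1, 'Q1'), (2, 'Q2');
INSERT INTO fact_sales VALUES (1, 1, 500), (1, 2, 300), (2, 1, 200), (1, 1, 100);
""")

# A typical star-join: fact table joined to its dimensions, then aggregated.
star_query = """
SELECT i.item_name, t.quarter, SUM(f.amount)
FROM fact_sales AS f
JOIN dim_item AS i ON i.item_id = f.item_id
JOIN dim_time AS t ON t.time_id = f.time_id
GROUP BY i.item_name, t.quarter
ORDER BY i.item_name, t.quarter
"""
result = conn.execute(star_query).fetchall()
print(result)
```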

OLAP Operations
Since OLAP servers are based on a multidimensional view of data, we will discuss OLAP operations on
multidimensional data.
Here is the list of OLAP operations −

• Roll-up
• Drill-down
• Slice and dice
• Pivot (rotate)
Roll-up
Roll-up performs aggregation on a data cube in any of the following ways −

• By climbing up a concept hierarchy for a dimension
• By dimension reduction

The following diagram illustrates how roll-up works.

• Roll-up is performed by climbing up a concept hierarchy for the dimension location.
• Initially the concept hierarchy was "street < city < province < country".
• On rolling up, the data is aggregated by ascending the location hierarchy from the level of city to
the level of country.
• The data is grouped into countries rather than cities.
• When roll-up is performed by dimension reduction, one or more dimensions are removed from the
data cube.
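
Both forms of roll-up can be sketched over a small in-memory cube. The cell values and the `CITY_TO_COUNTRY` mapping below are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
from collections import defaultdict

# Hypothetical cube cells: (city, quarter) -> units sold.
cube = {
    ("Toronto", "Q1"): 10,
    ("Vancouver", "Q1"): 20,
    ("Chicago", "Q1"): 30,
    ("New York", "Q1"): 40,
    ("Toronto", "Q2"): 5,
}

# One level of the location concept hierarchy: city < country.
CITY_TO_COUNTRY = {"Toronto": "Canada", "Vancouver": "Canada",
                   "Chicago": "USA", "New York": "USA"}

def roll_up_location(cube):
    """Climb the location hierarchy from city to country."""
    out = defaultdict(int)
    for (city, quarter), units in cube.items():
        out[(CITY_TO_COUNTRY[city], quarter)] += units
    return dict(out)

def roll_up_drop_time(cube):
    """Dimension reduction: aggregate the time dimension away."""
    out = defaultdict(int)
    for (city, _quarter), units in cube.items():
        out[city] += units
    return dict(out)

print(roll_up_location(cube))   # cells are now keyed by country
print(roll_up_drop_time(cube))  # cells no longer carry a time dimension
```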
Drill-down
Drill-down is the reverse operation of roll-up. It is performed in either of the following ways −

• By stepping down a concept hierarchy for a dimension
• By introducing a new dimension

The following diagram illustrates how drill-down works −

• Drill-down is performed by stepping down a concept hierarchy for the dimension time.
• Initially the concept hierarchy was "day < month < quarter < year".
• On drilling down, the time dimension is descended from the level of quarter to the level of month.
• When drill-down is performed by introducing a new dimension, one or more dimensions are added to
the data cube.
• It navigates the data from less detailed data to highly detailed data.
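
A minimal sketch of drilling down the time hierarchy from quarter to month follows. It assumes the detailed (month-level) data is still available, since a summary alone cannot be expanded; the figures and the `MONTH_TO_QUARTER` mapping are hypothetical.

```python
# One level of the time concept hierarchy: month < quarter.
MONTH_TO_QUARTER = {"Jan": "Q1", "Feb": "Q1", "Mar": "Q1",
                    "Apr": "Q2", "May": "Q2", "Jun": "Q2"}

# Hypothetical base data stored at month level.
monthly_sales = {"Jan": 100, "Feb": 150, "Mar": 50,
                 "Apr": 200, "May": 80, "Jun": 120}

def quarterly_view(monthly):
    """The less detailed view the user starts from."""
    view = {}
    for month, amount in monthly.items():
        quarter = MONTH_TO_QUARTER[month]
        view[quarter] = view.get(quarter, 0) + amount
    return view

def drill_down(monthly, quarter):
    """Step down from one quarter to its constituent months."""
    return {m: a for m, a in monthly.items() if MONTH_TO_QUARTER[m] == quarter}

quarters = quarterly_view(monthly_sales)
q1_months = drill_down(monthly_sales, "Q1")
print(quarters)
print(q1_months)
```

The monthly values for Q1 add up to the Q1 total in the quarterly view, which is what makes drill-down the inverse of roll-up.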
Slice
The slice operation performs a selection on one particular dimension of a given cube and provides a new
sub-cube. Consider the following diagram that shows how slice works.
• Here slice is performed for the dimension "time" using the criterion time = "Q1".
• It forms a new sub-cube by fixing that one dimension to a single value; the remaining dimensions are
kept.
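
The slice for time = "Q1" can be sketched as a simple filter. The cube contents below are hypothetical; the point is that one dimension is fixed to a single value and therefore disappears from the keys of the resulting sub-cube.

```python
# Hypothetical cube cells: (location, quarter, item) -> units sold.
cube = {
    ("Toronto", "Q1", "Mobile"): 10,
    ("Toronto", "Q2", "Mobile"): 12,
    ("Vancouver", "Q1", "Modem"): 7,
    ("Chicago", "Q3", "Phone"): 9,
}

def slice_cube(cube, quarter):
    """Fix time = quarter and drop the time dimension from the keys."""
    return {(loc, item): units
            for (loc, q, item), units in cube.items() if q == quarter}

q1 = slice_cube(cube, "Q1")
print(q1)
```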
Dice
Dice performs a selection on two or more dimensions of a given cube and provides a new sub-cube.
Consider the following diagram that shows the dice operation.
The dice operation on the cube based on the following selection criteria involves three dimensions.

• (location = "Toronto" or "Vancouver")
• (time = "Q1" or "Q2")
• (item = "Mobile" or "Modem")
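
The three selection criteria above translate directly into a filter over the cube. The cell values in this sketch are hypothetical; unlike slice, all three dimensions remain in the resulting sub-cube, each restricted to a subset of its values.

```python
# Hypothetical cube cells: (location, quarter, item) -> units sold.
cube = {
    ("Toronto", "Q1", "Mobile"): 10,
    ("Toronto", "Q3", "Mobile"): 12,   # excluded: Q3 is not selected
    ("Vancouver", "Q2", "Modem"): 7,
    ("Vancouver", "Q1", "Phone"): 3,   # excluded: Phone is not selected
    ("Chicago", "Q1", "Modem"): 9,     # excluded: Chicago is not selected
}

def dice(cube, locations, quarters, items):
    """Keep only cells whose value on every dimension is in the allowed set."""
    return {key: units for key, units in cube.items()
            if key[0] in locations and key[1] in quarters and key[2] in items}

sub_cube = dice(cube,
                locations={"Toronto", "Vancouver"},
                quarters={"Q1", "Q2"},
                items={"Mobile", "Modem"})
print(sub_cube)
```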
Pivot
The pivot operation is also known as rotation. It rotates the data axes in view in order to provide an
alternative presentation of data. Consider the following diagram that shows the pivot operation.
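
For a two-dimensional view, pivoting amounts to swapping rows and columns. A minimal sketch with hypothetical values:

```python
# Hypothetical 2D view: rows are items, columns are locations.
view = {
    "Mobile": {"Toronto": 10, "Vancouver": 20},
    "Modem": {"Toronto": 5, "Vancouver": 8},
}

def pivot(view):
    """Swap the row and column axes of a 2D view; the values are unchanged."""
    rotated = {}
    for row_key, columns in view.items():
        for col_key, value in columns.items():
            rotated.setdefault(col_key, {})[row_key] = value
    return rotated

rotated = pivot(view)  # rows are now locations, columns are items
print(rotated)
```

Pivoting twice returns the original presentation, which shows that the operation changes only the layout and not the data.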
OLAP vs OLTP

Sr.No. | Data Warehouse (OLAP) | Operational Database (OLTP)
1 | Involves historical processing of information. | Involves day-to-day processing.
2 | OLAP systems are used by knowledge workers such as executives, managers and analysts. | OLTP systems are used by clerks, DBAs, or database professionals.
3 | Useful in analyzing the business. | Useful in running the business.
4 | It focuses on Information out. | It focuses on Data in.
5 | Based on Star Schema, Snowflake Schema, and Fact Constellation Schema. | Based on Entity Relationship Model.
6 | Contains historical data. | Contains current data.
7 | Provides summarized and consolidated data. | Provides primitive and highly detailed data.
8 | Provides summarized and multidimensional view of data. | Provides detailed and flat relational view of data.
9 | Number of users is in hundreds. | Number of users is in thousands.
10 | Number of records accessed is in millions. | Number of records accessed is in tens.
11 | Database size is from 100 GB to 1 TB. | Database size is from 100 MB to 1 GB.
12 | Highly flexible. | Provides high performance.
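
The "data in" versus "information out" contrast can be illustrated with two kinds of statements against the same table. This is only a sketch using Python's built-in `sqlite3` module; the `orders` table and its rows are hypothetical, and a real deployment would keep the two workloads in separate systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, region TEXT, year INTEGER, amount REAL)"
)

# OLTP-style work: many small writes and single-row lookups ("data in").
conn.executemany(
    "INSERT INTO orders (region, year, amount) VALUES (?, ?, ?)",
    [("East", 2022, 100.0), ("East", 2023, 140.0),
     ("West", 2022, 60.0), ("West", 2023, 90.0)],
)
one_order = conn.execute(
    "SELECT region, amount FROM orders WHERE order_id = ?", (2,)
).fetchone()

# OLAP-style work: a read-only scan that summarizes history ("information out").
yearly_totals = conn.execute(
    "SELECT year, SUM(amount) FROM orders GROUP BY year ORDER BY year"
).fetchall()

print(one_order)
print(yearly_totals)
```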

Differences between MOLAP, ROLAP, and HOLAP

MOLAP

MOLAP is an abbreviation for Multi-dimensional Online Analytical Processing. In this type
of analytical processing, multi-dimensional databases (MDDBs) are used to store data. This
data is later used for analysis. MOLAP works on data that has been pre-computed and
pre-aggregated: the data cubes in the MDDBs carry values that have already been calculated.
This increases the speed of querying data.

The architecture of MOLAP consists of three main components:

• Database server: This exists in the data layer.
• MOLAP server: This consists of the MOLAP engine in the application layer.
• Front-end tool: This is usually the client desktop in the presentation layer.

The MOLAP engine in the application layer collects data from the databases in the data
layer. It then loads data cubes into the multi-dimensional databases. When the user makes a
query, data moves in a proprietary format from the MDDBs to the client desktop in the
presentation layer. This enables users to view data in multiple dimensions.
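
The idea of pre-computing a cube at load time so that queries become lookups can be sketched as follows. This is a simplified illustration, not how a real MOLAP engine stores data: the detail rows are hypothetical, and `ALL` is an assumed marker meaning "aggregated over this dimension".

```python
from itertools import product

# Hypothetical detail rows loaded from the data layer: (item, quarter, amount).
detail_rows = [("Mobile", "Q1", 500), ("Mobile", "Q2", 300),
               ("Modem", "Q1", 200), ("Mobile", "Q1", 100)]

ALL = "*"  # assumed marker meaning "aggregated over this dimension"

def build_cube(rows):
    """Pre-compute totals for every combination of (item or ALL, quarter or ALL)."""
    cube = {}
    for item, quarter, amount in rows:
        # Each row contributes to 4 cells: exact, per item, per quarter, grand total.
        for key in product((item, ALL), (quarter, ALL)):
            cube[key] = cube.get(key, 0) + amount
    return cube

cube = build_cube(detail_rows)  # done once, at load time

# Queries become dictionary lookups instead of scans over the detail rows.
mobile_q1 = cube[("Mobile", "Q1")]
mobile_total = cube[("Mobile", ALL)]
grand_total = cube[(ALL, ALL)]
print(mobile_q1, mobile_total, grand_total)
```

The trade-off is visible even at this scale: every extra dimension multiplies the number of pre-computed cells, which is one reason the volume of data a MOLAP store can hold is limited.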

Advantages

• It performs well with operations such as slice and dice.
• Users can use it to perform complex calculations.
• It consists of pre-computed data that can be indexed fast.

Disadvantages

• It can only store a limited volume of data.
• The data available for analysis depends on requirements that were set in advance.
This limits data analysis and navigation.

ROLAP

ROLAP is an abbreviation for Relational Online Analytical Processing. In this type of
analytical processing, data is stored in a relational database, where it is arranged in rows
and columns. Data is presented to end-users in a multi-dimensional form.

There are three main components in a ROLAP model:

1. Database server: This exists in the data layer. This consists of data that is loaded into
the ROLAP server.
2. ROLAP server: This consists of the ROLAP engine that exists in the application
layer.
3. Front-end tool: This is the client desktop that exists in the presentation layer.

Let’s briefly look at how ROLAP works. When a user makes a complex query, the ROLAP
server fetches data from the RDBMS server. The ROLAP engine then creates data
cubes dynamically, and the user views the data from a multi-dimensional point of view.

Unlike MOLAP, where the multi-dimensional view is static, ROLAP provides a dynamic
multi-dimensional view. Because this view is computed at query time, ROLAP is slower
than MOLAP.
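
The dynamic behavior described above can be sketched as SQL generated at query time. The example uses Python's built-in `sqlite3` module as a stand-in for the RDBMS; the `sales` table, its rows, and the `rolap_query` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (item TEXT, quarter TEXT, location TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [("Mobile", "Q1", "Toronto", 500.0), ("Mobile", "Q2", "Toronto", 300.0),
     ("Modem", "Q1", "Vancouver", 200.0), ("Mobile", "Q1", "Vancouver", 100.0)],
)

# Column names cannot be bound as SQL parameters, so they are checked
# against a fixed whitelist before being placed in the statement.
ALLOWED_DIMENSIONS = {"item", "quarter", "location"}

def rolap_query(conn, dimensions):
    """Build and run a GROUP BY for the requested dimensions at query time."""
    unknown = set(dimensions) - ALLOWED_DIMENSIONS
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")
    cols = ", ".join(dimensions)
    sql = f"SELECT {cols}, SUM(amount) FROM sales GROUP BY {cols} ORDER BY {cols}"
    return conn.execute(sql).fetchall()

# Nothing is pre-computed: each view is aggregated from the rows on demand.
by_item = rolap_query(conn, ["item"])
by_item_quarter = rolap_query(conn, ["item", "quarter"])
print(by_item)
print(by_item_quarter)
```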

Advantages

• It can handle huge volumes of data.
• A ROLAP model can store data efficiently.
• ROLAP utilizes a relational database. This enables the model to integrate the ROLAP
server with an RDBMS (relational database management system).

Disadvantages

• Performance is slow, especially when the volume of data is huge.
• ROLAP is constrained by the limitations of SQL. For example, complex calculations are
difficult to express in SQL.
HOLAP

This is an abbreviation for Hybrid Online Analytical Processing. This type of analytical
processing solves the limitations of MOLAP and ROLAP and combines their attributes. Data
in the database is divided into two parts: specialized storage and relational storage.
Integrating these two aspects addresses issues relating to performance and scalability.
HOLAP stores huge volumes of data in a relational database and keeps aggregations in a
MOLAP server.

The HOLAP model consists of a server that can support both ROLAP and MOLAP. It has
a complex architecture that requires frequent maintenance. Queries made in the HOLAP
model involve both the multi-dimensional database and the relational database. The front-end
tool presents data either directly from the database management system or through the
intermediate MOLAP.
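
The split between the two stores can be sketched as a simple routing rule: summary queries are answered from pre-computed aggregates, and detail queries go to the relational store. The table, rows, and function names below are hypothetical, and a real HOLAP server makes this decision internally rather than through separate functions.

```python
import sqlite3

# Relational store: holds the detailed rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (item TEXT, quarter TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("Mobile", "Q1", 500.0), ("Mobile", "Q1", 100.0), ("Modem", "Q1", 200.0)],
)

# Multidimensional store: aggregations pre-computed once from the detail rows.
aggregates = {
    item: total
    for item, total in conn.execute("SELECT item, SUM(amount) FROM sales GROUP BY item")
}

def total_for_item(item):
    """Summary query: answered from the pre-computed aggregate store."""
    return aggregates[item]

def detail_for_item(item):
    """Detail query: falls through to the relational store."""
    return conn.execute(
        "SELECT quarter, amount FROM sales WHERE item = ? ORDER BY amount", (item,)
    ).fetchall()

print(total_for_item("Mobile"))
print(detail_for_item("Mobile"))
```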

Processing OLAP Queries


After you build your OLAP query and apply filters, computations, sorts, and any other adjustments to further refine
your request, you need to process it. Processing your query may take a few moments if your query is complex, or if
the data in linked report sections needs to be refreshed.

To process an OLAP query, select Tools, then Process Query, and then the desired option.

Since a document can contain multiple queries, the Process drop-down list has three processing options:
• Process Current—Processes the current object. In some cases more than one query may be processed, for
example, if a report references results sets from multiple queries. Process Current is the default selection
when using the toolbar button.
• Process All—Processes all the queries in the document.
• Process Custom—Opens the Process Custom dialog box so that you can indicate which queries to process
by selecting a query’s check box.

Interactive Reporting sends the query to the database and retrieves the data to the OLAPQuery section.
While the data is being retrieved, the Status bar displays a dynamic count indicating the rate and progress of
server data processing and network transfer.
