PGDM BA 04 - Data Mining

Online Analytical Processing (OLAP) is a set of tools for data analysis that enables businesses to make informed decisions by processing data from multiple sources using a multidimensional model. OLAP systems facilitate faster decision-making, support non-technical users, and provide an integrated view of data across various business units. There are different types of OLAP servers, including ROLAP, MOLAP, and HOLAP, each with its own advantages and limitations, and OLAP operations such as roll-up, drill down, slice, dice, and pivot allow for detailed data analysis.

OLAP Servers

Online Analytical Processing (OLAP) refers to a set of software tools used to analyze
data in support of business decisions. OLAP provides a platform for gaining
insights from data retrieved from multiple database systems at the same time. It
is based on a multidimensional data model, which enables users to extract and view
data from various perspectives. A multidimensional database is used to store OLAP
data. Many Business Intelligence (BI) applications rely on OLAP technology.

Why is OLAP important?


Online analytical processing (OLAP) helps organizations process and benefit from a growing amount
of digital information. Some benefits of OLAP include the following.

Faster decision making


Businesses use OLAP to make quick and accurate decisions to remain competitive in a fast-paced
economy. Performing analytical queries on multiple relational databases is time-consuming because
the computer system searches through multiple data tables. On the other hand, OLAP systems
precalculate and integrate data so business analysts can generate reports faster when needed.

Non-technical user support


OLAP systems make complex data analysis easier for non-technical business users. Business users
can create complex analytical calculations and generate reports instead of learning how to operate
databases.

Integrated data view


OLAP provides a unified platform for marketing, finance, production, and other business units.
Managers and decision makers can see the bigger picture and effectively solve problems. They can
perform what-if analysis, which shows the impact of decisions taken by one department on other
areas of the business.

What is OLAP architecture?


Online analytical processing (OLAP) systems store multidimensional data by representing
information in more than two dimensions, or categories. Two-dimensional data involves columns and
rows, but multidimensional data has multiple characteristics. For example, multidimensional data for
product sales might consist of the following dimensions:

 Product type
 Location
 Time
Data engineers build a multidimensional OLAP system that consists of the following elements.

Data warehouse
A data warehouse collects information from different sources, including applications, files, and
databases. It processes the information using various tools so that the data is ready for analytical
purposes. For example, the data warehouse might collect information from a relational database that
stores data in tables of rows and columns.

ETL tools
Extract, transform, and load (ETL) tools are database processes that automatically retrieve, change,
and prepare the data to a format fit for analytical purposes. Data warehouses use ETL to convert
and standardize information from various sources before making it available to OLAP tools.
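
As a rough illustration of the extract, transform, and load sequence described above, the following Python sketch standardizes records from two hypothetical sources. The source layouts, the field names, and the `extract`, `transform`, and `load` helpers are assumptions made for this example and do not describe any particular ETL product.

```python
# Hypothetical raw records from two sources with different layouts.
CRM_ROWS = [{"product": " laptop", "city": "london", "amount": "1200.50"}]
SHOP_ROWS = [("phone", "Tokyo ", 799)]


def extract():
    """Collect raw records from every source, tagging each with its origin."""
    for row in CRM_ROWS:
        yield "crm", row
    for row in SHOP_ROWS:
        yield "shop", row


def transform(source, row):
    """Convert one raw record into a common, cleaned format."""
    if source == "crm":
        product, city, amount = row["product"], row["city"], row["amount"]
    else:
        product, city, amount = row
    return {
        "product": product.strip().title(),
        "city": city.strip().title(),
        "amount": float(amount),
    }


def load(records, warehouse):
    """Append the standardized records to the target store (a list here)."""
    warehouse.extend(records)
    return warehouse


warehouse = load((transform(s, r) for s, r in extract()), [])
```

After the run, every record in `warehouse` has the same three fields regardless of which source it came from, which is the property OLAP tools depend on.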

OLAP server
An OLAP server is the underlying machine that powers the OLAP system. It uses ETL tools to
transform information from the relational databases and prepare it for OLAP operations.

OLAP database
An OLAP database is a separate database that connects to the data warehouse. Data engineers
sometimes use an OLAP database to prevent the data warehouse from being burdened by OLAP
analysis. They also use an OLAP database to make it easier to create OLAP data models.

OLAP cubes
A data cube is a model representing a multidimensional array of information. While it’s easier to
visualize it as a three-dimensional data model, most data cubes have more than three dimensions.
An OLAP cube, or hypercube, is the term for data cubes in an OLAP system. OLAP cubes are rigid
because you can't change the dimensions and underlying data once the cube is modeled. For example, if
you add the warehouse dimension to a cube with product, location, and time dimensions, you have
to remodel the entire cube.
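
One way to picture a cube in code is as a mapping from coordinate tuples to an aggregated measure. The sketch below is a simplified model under that assumption; the sample facts and the `build_cube` helper are hypothetical. It shows why adding a dimension forces a rebuild: every cell key changes shape.

```python
from collections import defaultdict

# Hypothetical sales facts: (product, location, month, warehouse, amount).
FACTS = [
    ("Laptop", "London", "2024-01", "W1", 1200.0),
    ("Laptop", "London", "2024-01", "W2", 800.0),
    ("Phone", "Tokyo", "2024-02", "W1", 700.0),
]


def build_cube(facts, dimension_indexes, measure_index=-1):
    """Aggregate the measure into cells keyed by the chosen dimensions."""
    cube = defaultdict(float)
    for fact in facts:
        key = tuple(fact[i] for i in dimension_indexes)
        cube[key] += fact[measure_index]
    return dict(cube)


# Three dimensions: product, location, time.
cube_3d = build_cube(FACTS, (0, 1, 2))

# Adding the warehouse dimension changes every cell key, so the cube is
# rebuilt from the source facts rather than patched in place.
cube_4d = build_cube(FACTS, (0, 1, 2, 3))
```

In the three-dimensional cube the two London laptop sales share one cell; in the four-dimensional cube they occupy separate cells.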

OLAP analytic tools


Business analysts use OLAP tools to interact with the OLAP cube. They perform operations such as
slicing, dicing, and pivoting to gain deeper insights into specific information within the OLAP cube.

How does OLAP work?



An online analytical processing (OLAP) system works by collecting, organizing, aggregating, and
analyzing data using the following steps:

1. The OLAP server collects data from multiple data sources, including relational databases and data
warehouses.
2. Then, the extract, transform, and load (ETL) tools clean, aggregate, precalculate, and store data in
an OLAP cube according to the number of dimensions specified.
3. Business analysts use OLAP tools to query and generate reports from the multidimensional data in
the OLAP cube.
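
The three steps above can be sketched in a few lines of Python. This is a toy model, not how a production OLAP server is implemented: the sample rows and the `ALL` marker (meaning "aggregated over this dimension") are assumptions for the example. The point is that totals are computed once, ahead of time, so a report becomes a lookup.

```python
from collections import defaultdict
from itertools import product

ALL = "*"  # hypothetical marker meaning "aggregated over this dimension"

# Step 1: rows collected from the sources: (product, city, year, amount).
rows = [
    ("Laptop", "London", 2023, 100.0),
    ("Laptop", "Tokyo", 2023, 150.0),
    ("Phone", "London", 2024, 80.0),
]

# Step 2: precalculate totals for every combination of detail and ALL.
cube = defaultdict(float)
for *dims, amount in rows:
    for mask in product((False, True), repeat=len(dims)):
        key = tuple(ALL if hide else value for value, hide in zip(dims, mask))
        cube[key] += amount

# Step 3: a report is now a lookup, not a scan of the source rows.
laptop_total = cube[("Laptop", ALL, ALL)]
london_total = cube[(ALL, "London", ALL)]
grand_total = cube[(ALL, ALL, ALL)]
```

Precomputing every combination trades storage for query speed, which is the same trade-off the MOLAP section below describes.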

OLAP uses Multidimensional Expressions (MDX) to query the OLAP cube. MDX is a query language, like
SQL, that provides a set of instructions for querying and manipulating multidimensional data.

Types of OLAP servers:


The three major types of OLAP servers are as follows:
 ROLAP
 MOLAP
 HOLAP
Relational OLAP (ROLAP):
Relational On-Line Analytical Processing (ROLAP) is primarily used for data stored in
a relational database, where both the base data and dimension tables are stored as
relational tables. ROLAP servers bridge the gap between the relational
back-end server and the client’s front-end tools. ROLAP servers store and manage
warehouse data using an RDBMS, and OLAP middleware supplies the analytical
functions that the RDBMS lacks.
Benefits:
 It is compatible with data warehouses and OLTP systems.
 The data size limitation of ROLAP technology is determined by the underlying
RDBMS. As a result, ROLAP does not limit the amount of data that can be stored.
Limitations:
 SQL functionality is constrained.
 It’s difficult to keep aggregate tables up to date.
Multidimensional OLAP (MOLAP):
Multidimensional On-Line Analytical Processing (MOLAP) supports multidimensional
views of data through array-based multidimensional storage engines. MOLAP stores
data on disk in a specialized multidimensional array structure and relies on the
arrays’ random-access capability to answer OLAP queries. Dimension instances
determine array elements, and the data or measured value associated with each cell
is typically stored in the corresponding array element. The multidimensional array
is typically stored in a linear allocation based on a nested traversal of the axes
in some predetermined order.
However, unlike ROLAP, which stores only records with non-zero facts, MOLAP defines
all array elements. As a result, the arrays tend to be sparse, with empty elements
occupying a large portion of them, and storage utilization may be low if the data
set is sparse. Because both storage and retrieval costs matter when evaluating
online performance, MOLAP systems typically include provisions for handling sparse
arrays, such as advanced indexing and hashing to locate data during queries. MOLAP
cubes are ideal for slicing and dicing data and can perform complex calculations.
All calculations are pre-generated when the cube is created.
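
The linear allocation and the sparsity problem can both be seen in a small sketch. It assumes a row-major layout (the last axis varies fastest) as one possible "predetermined order"; the member lists and the `linear_offset` and `cell_index` helpers are hypothetical.

```python
def linear_offset(coords, shape):
    """Map array coordinates to a flat offset; the last axis varies fastest."""
    offset = 0
    for index, size in zip(coords, shape):
        if not 0 <= index < size:
            raise IndexError("coordinate out of range")
        offset = offset * size + index
    return offset


# Hypothetical dimension members; their positions act as array coordinates.
products = ["Laptop", "Phone"]
cities = ["London", "Tokyo", "New York"]
months = ["Jan", "Feb"]
shape = (len(products), len(cities), len(months))

# Dense storage: every cell exists, even when it holds no sale.
cells = [0.0] * (shape[0] * shape[1] * shape[2])


def cell_index(product, city, month):
    """Translate dimension members into the flat position of their cell."""
    coords = (products.index(product), cities.index(city), months.index(month))
    return linear_offset(coords, shape)


cells[cell_index("Laptop", "London", "Jan")] = 1200.0
cells[cell_index("Phone", "Tokyo", "Feb")] = 700.0

# Share of the array that holds no data.
empty_share = cells.count(0.0) / len(cells)
```

Locating a cell is simple arithmetic, which is why array storage gives fast random access. With only two sales recorded, however, most of the twelve cells stay empty, which is the sparsity cost described above.
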
Benefits:
 Suitable for slicing and dicing operations.
 Outperforms ROLAP when data is dense.
 Capable of performing complex calculations.
Limitations:
 It is difficult to change the dimensions without re-aggregating.
 Since all calculations are performed when the cube is built, a large amount of data
cannot be stored in the cube itself.

Hybrid OLAP (HOLAP):


Hybrid On-Line Analytical Processing (HOLAP) combines ROLAP and MOLAP.
HOLAP offers greater scalability than MOLAP and faster computation than
ROLAP. HOLAP servers are capable of
storing large amounts of detailed data. On the one hand, HOLAP benefits from
ROLAP’s greater scalability. HOLAP, on the other hand, makes use of cube
technology for faster performance and summary-type information. Because detailed
data is stored in a relational database, cubes are smaller than in MOLAP.
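
The split between relational detail and cube summaries might look like the following sketch. It uses Python's built-in `sqlite3` module as a stand-in for the relational store and a plain dictionary as a stand-in for the cube; the table layout and the `total_for` and `detail_for` helpers are assumptions made for illustration.

```python
import sqlite3

# Detailed rows stay in a relational store (the ROLAP side).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, city TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("Laptop", "London", 1200.0),
        ("Laptop", "London", 300.0),
        ("Phone", "Tokyo", 700.0),
    ],
)

# Summaries are precomputed into a small in-memory cube (the MOLAP side).
summary_cube = dict(
    conn.execute("SELECT product, SUM(amount) FROM sales GROUP BY product")
)


def total_for(product):
    """Summary query: answered from the precomputed cube."""
    return summary_cube[product]


def detail_for(product):
    """Detail query: answered from the relational table."""
    return conn.execute(
        "SELECT city, amount FROM sales WHERE product = ? ORDER BY amount",
        (product,),
    ).fetchall()
```

The cube holds one cell per product, while the individual transactions remain in the table, so the cube stays small even as detail rows grow.
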
Benefits:
 HOLAP combines the benefits of MOLAP and ROLAP.
 It provides quick access at all aggregation levels.
Limitations:
 Because it supports both MOLAP and ROLAP servers, HOLAP architecture is
extremely complex.
 There is a greater likelihood of overlap between the ROLAP and MOLAP components, particularly in their functionality.
Other types of OLAP include:
 Web OLAP (WOLAP): WOLAP refers to an OLAP application that can be
accessed through a web browser. In contrast to traditional client/server
OLAP applications, WOLAP has a three-tiered architecture consisting of
a client, middleware, and a database server.
 Desktop OLAP (DOLAP): DOLAP is an abbreviation for desktop online analytical
processing. With DOLAP, the user downloads the data from the source and works
with it on their desktop or laptop. Functionality is limited in comparison to
other OLAP applications, but DOLAP is less expensive.
 Mobile OLAP (MOLAP): Mobile OLAP refers to OLAP functionality delivered on
wireless or mobile devices. The user accesses and works with data on a mobile
device. The abbreviation is shared with multidimensional OLAP, described above.
 Spatial OLAP (SOLAP): SOLAP combines the capabilities of Geographic
Information Systems (GIS) and OLAP into a single user interface. SOLAP was
created to handle data that can be alphanumeric, image, or vector. This allows for
the quick and easy exploration of data stored in a spatial database.

What is data modeling in OLAP?

Data modeling is the representation of data in data warehouses or online analytical processing
(OLAP) databases. Data modeling is essential in relational online analytical processing
(ROLAP) because ROLAP analyzes data straight from the relational database, which stores
multidimensional data as a star or snowflake schema.

Star schema

The star schema consists of a fact table and multiple dimension tables. The fact table is a data
table that contains numerical values related to a business process, and each dimension table
contains values that describe an attribute referenced in the fact table. The fact table refers to
dimension tables through foreign keys, which are identifiers that match the corresponding rows in the
dimension tables.

In a star schema, a fact table connects to several dimension tables so the data model looks like a
star. The following is an example of a fact table for product sales:

 Product ID
 Location ID
 Salesperson ID
 Sales amount

The product ID tells the database system to retrieve information from the product dimension
table, which might look as follows:

 Product ID
 Product name
 Product type
 Product cost

Likewise, the location ID points to a location dimension table, which could consist of the
following:

 Location ID
 Country
 City

The salesperson table might look as follows:

 Salesperson ID
 First name
 Last name
 Email
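
The example tables above can be expressed as a small relational schema. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the snake_case table and column names, the column types, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY, product_name TEXT,
    product_type TEXT, product_cost REAL);
CREATE TABLE location (
    location_id INTEGER PRIMARY KEY, country TEXT, city TEXT);
CREATE TABLE salesperson (
    salesperson_id INTEGER PRIMARY KEY, first_name TEXT,
    last_name TEXT, email TEXT);
-- Fact table: one foreign key per dimension plus the numeric measure.
CREATE TABLE sales (
    product_id INTEGER REFERENCES product(product_id),
    location_id INTEGER REFERENCES location(location_id),
    salesperson_id INTEGER REFERENCES salesperson(salesperson_id),
    sales_amount REAL);
INSERT INTO product VALUES
    (1, 'Laptop A', 'Laptop', 900.0), (2, 'Phone B', 'Phone', 400.0);
INSERT INTO location VALUES (1, 'UK', 'London'), (2, 'Japan', 'Tokyo');
INSERT INTO salesperson VALUES (1, 'Ada', 'Smith', 'ada@example.com');
INSERT INTO sales VALUES
    (1, 1, 1, 1200.0), (2, 1, 1, 500.0), (2, 2, 1, 450.0);
""")

# Follow the location foreign key to summarize the measure by country.
sales_by_country = conn.execute("""
    SELECT l.country, SUM(s.sales_amount)
    FROM sales AS s
    JOIN location AS l ON l.location_id = s.location_id
    GROUP BY l.country
    ORDER BY l.country
""").fetchall()
```

Each analytical question becomes a join from the fact table to one or more dimension tables followed by an aggregation.
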

Snowflake schema

The snowflake schema is an extension of the star schema. Some dimension tables might lead to
one or more secondary dimension tables. This results in a snowflake-like shape when the
dimension tables are put together.

For example, the product dimension table might contain the following fields:

 Product ID
 Product name
 Product type ID
 Product cost

The product type ID connects to another dimension table as shown in the following example:

 Product type ID
 Type name
 Version
 Variant
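
In relational terms, the secondary dimension table adds one more join between the fact table and the descriptive attribute. The following sketch again uses Python's `sqlite3` module; the table and column names and the sample rows are assumptions for the example, and only the product branch of the schema is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Secondary dimension table reached through the product dimension.
CREATE TABLE product_type (
    product_type_id INTEGER PRIMARY KEY, type_name TEXT,
    version TEXT, variant TEXT);
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY, product_name TEXT,
    product_type_id INTEGER REFERENCES product_type(product_type_id),
    product_cost REAL);
CREATE TABLE sales (
    product_id INTEGER REFERENCES product(product_id),
    sales_amount REAL);
INSERT INTO product_type VALUES
    (10, 'Laptop', 'v2', 'Pro'), (20, 'Phone', 'v1', 'Mini');
INSERT INTO product VALUES
    (1, 'Laptop A', 10, 900.0), (2, 'Laptop B', 10, 700.0),
    (3, 'Phone C', 20, 400.0);
INSERT INTO sales VALUES (1, 1200.0), (2, 950.0), (3, 450.0);
""")

# Two joins are needed: fact -> product -> product_type.
sales_by_type = conn.execute("""
    SELECT t.type_name, SUM(s.sales_amount)
    FROM sales AS s
    JOIN product AS p ON p.product_id = s.product_id
    JOIN product_type AS t ON t.product_type_id = p.product_type_id
    GROUP BY t.type_name
    ORDER BY t.type_name
""").fetchall()
```

Compared with the star schema, the type attributes are stored once per type instead of once per product, at the cost of the extra join.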

What are OLAP operations?

Business analysts perform several basic analytical operations with a multidimensional online
analytical processing (MOLAP) cube.

Roll up

In roll up, the online analytical processing (OLAP) system summarizes the data for specific
attributes by moving up a concept hierarchy. In other words, it shows less-detailed data. For
example, you might view product sales by location, such as New York, California, London, and Tokyo.
A roll-up operation would provide a
view of the sales data based on countries, such as the US, the UK, and Japan.
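
A minimal sketch of this roll-up, assuming the sales cells are held in a dictionary and the hierarchy is a simple location-to-country mapping. The sales figures and the `roll_up` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical concept hierarchy: location -> country.
LOCATION_TO_COUNTRY = {
    "New York": "US",
    "California": "US",
    "London": "UK",
    "Tokyo": "Japan",
}

# Hypothetical detail-level sales per location.
sales_by_location = {
    "New York": 120.0,
    "California": 80.0,
    "London": 90.0,
    "Tokyo": 60.0,
}


def roll_up(cells, parent_of):
    """Sum each cell into its parent level in the hierarchy."""
    totals = defaultdict(float)
    for member, value in cells.items():
        totals[parent_of[member]] += value
    return dict(totals)


sales_by_country = roll_up(sales_by_location, LOCATION_TO_COUNTRY)
```

The four location cells collapse into three country cells, with the two US locations summed together.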

Drill down

Drill down is the opposite of the roll-up operation. Business analysts move downward in the
concept hierarchy and extract the details they require. For example, they can move from viewing
sales data by years to visualizing it by months.
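
Drilling down is only possible when the detailed cells are still available. The sketch below assumes month-level cells keyed by `(year, month)`; the figures and the `drill_down` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical detail-level cells keyed by (year, month).
sales_by_month = {
    (2023, 11): 40.0,
    (2023, 12): 60.0,
    (2024, 1): 55.0,
    (2024, 2): 45.0,
}

# The summarized view an analyst starts from.
sales_by_year = defaultdict(float)
for (year, _month), amount in sales_by_month.items():
    sales_by_year[year] += amount


def drill_down(cells, year):
    """Return the month-level cells that make up one year's total."""
    return {month: amount for (y, month), amount in cells.items() if y == year}


detail_2024 = drill_down(sales_by_month, 2024)
```

The monthly values returned for a year add up to that year's total in the summarized view.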

Slice

Data engineers use the slice operation to create a two-dimensional view from the OLAP cube
by fixing one dimension to a single value.
For example, a MOLAP cube organizes data according to products, cities, and months. By slicing the
cube, data engineers can create a spreadsheet-like table consisting of products and cities for a
specific month.
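
As a sketch, a slice can be modeled as filtering the cube on one axis and dropping that axis from the cell keys. The sample cube and the `slice_cube` helper are assumptions for the example.

```python
# Hypothetical cube keyed by (product, city, month).
cube = {
    ("Laptop", "London", "Jan"): 10.0,
    ("Laptop", "Tokyo", "Jan"): 7.0,
    ("Phone", "London", "Jan"): 4.0,
    ("Phone", "London", "Feb"): 6.0,
}


def slice_cube(cube, axis, value):
    """Fix one dimension to a single value and drop it from the keys."""
    return {
        key[:axis] + key[axis + 1:]: amount
        for key, amount in cube.items()
        if key[axis] == value
    }


# Products by cities for January only.
january = slice_cube(cube, axis=2, value="Jan")
```

The result is keyed by `(product, city)` pairs, which corresponds to the spreadsheet-like table mentioned above.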

Dice

Data engineers use the dice operation to create a smaller subcube from an OLAP cube. They
select the required values on two or more dimensions and build a smaller cube from the original
hypercube.
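
A dice can be sketched as a filter that keeps only the cells whose coordinates fall within chosen sets of members. Unlike a slice, the result keeps all of its dimensions. The sample cube and the `dice` helper are hypothetical.

```python
# Hypothetical cube keyed by (product, city, month).
cube = {
    ("Laptop", "London", "Jan"): 10.0,
    ("Laptop", "Tokyo", "Jan"): 7.0,
    ("Phone", "London", "Feb"): 6.0,
    ("Phone", "Paris", "Mar"): 3.0,
}


def dice(cube, allowed):
    """Keep cells whose coordinates are in the allowed set for each axis.

    `allowed` maps an axis position to the set of members to keep; axes
    that are not listed are left unrestricted.
    """
    return {
        key: amount
        for key, amount in cube.items()
        if all(key[axis] in members for axis, members in allowed.items())
    }


# Restrict the city and month dimensions; the product axis is unrestricted.
subcube = dice(cube, {1: {"London"}, 2: {"Jan", "Feb"}})
```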

Pivot

The pivot operation involves rotating the OLAP cube along one of its dimensions to get a
different perspective on the multidimensional data model. For example, a three-dimensional
OLAP cube has the following dimensions on the respective axes:

 X-axis—product
 Y-axis—location
 Z-axis—time

Upon a pivot, the OLAP cube has the following configuration:

 X-axis—location
 Y-axis—time
 Z-axis—product
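
If cube cells are keyed by coordinate tuples, the rotation above amounts to reordering the positions in each key. The sample cells and the `pivot` helper in this sketch are hypothetical.

```python
# Hypothetical cube keyed by (product, location, time).
cube = {
    ("Laptop", "London", "Jan"): 10.0,
    ("Phone", "Tokyo", "Feb"): 6.0,
}


def pivot(cube, order):
    """Reorder the axes; order[i] is the old axis placed at new position i."""
    return {tuple(key[i] for i in order): amount for key, amount in cube.items()}


# New axis order: location, time, product.
rotated = pivot(cube, order=(1, 2, 0))
```

The values are unchanged; only the orientation of the axes differs.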

How does OLAP compare with other data analytics methods?

Data mining

Data mining is analytics technology that processes large volumes of historical data to find
patterns and insights. Business analysts use data-mining tools to discover relationships within the
data and make accurate predictions of future trends.

OLAP and data mining

Online analytical processing (OLAP) is a database analysis technology that involves querying,
extracting, and studying summarized data. On the other hand, data mining involves looking
deeply into unprocessed information. For example, marketers could use data-mining tools to
analyze user behaviors from records of every website visit. They might then use OLAP software
to inspect those behaviors from various angles, such as duration, device, country, language, and
browser type.

OLTP
Online transaction processing (OLTP) is a data technology that stores information quickly and
reliably in a database. Data engineers use OLTP tools to store transactional data, such as
financial records, service subscriptions, and customer feedback, in a relational database. OLTP
systems involve creating, updating, and deleting records in relational tables.

OLAP and OLTP

OLTP is well suited to handling and storing multiple streams of transactions in databases. However, it
is not designed to run complex analytical queries. Therefore, business analysts use an OLAP
system to analyze multidimensional data. For example, data scientists connect an OLTP database
to a cloud-based OLAP cube to perform compute-intensive queries on historical data.

How does AWS help with OLAP?


AWS provides various managed cloud databases to help organizations store data and perform
online analytical processing (OLAP) operations. Data analysts use AWS databases to build secure
databases that align with their organization's requirements. Organizations migrate their business
data to AWS databases because of their affordability and scalability.

 Amazon Redshift is a cloud data warehouse designed specifically for online analytical processing.
 Amazon Relational Database Service (Amazon RDS) is a relational database with OLAP
functionality. Data engineers use Amazon RDS with Oracle OLAP to perform complex queries on
dimensional cubes.
 Amazon Aurora is a MySQL- and PostgreSQL-compatible cloud relational database. It is optimized
for running complex OLAP workloads.
