
Bi Unit 2

Uploaded by

vidyaghodageri
Copyright
© © All Rights Reserved

UNIT 2

INTRODUCTION TO OLTP AND OLAP

ONLINE TRANSACTION PROCESSING (OLTP)

Definition: “OLTP is a class of systems that manage transaction-oriented applications.”

• They are mainly concerned with the entry, storage and retrieval of data.
• They support online transactions and query processing.
ex: supermarkets, banking, airlines, insurance etc.
• They cover the day-to-day operations of an organization, like purchasing,
inventory, manufacturing, payroll, accounting etc.
ex: a point of sale at a supermarket is an OLTP system.
• OLTP systems include transactions on databases like
1) INSERT (in a supermarket, a record of the final purchase is added to the database)
2) UPDATE (the price of a product is raised from 100 INR to 200 INR)
3) DELETE (if a product is no longer in demand, the store removes it from the
shelf and hence from the database)
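
The three transaction types above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table layout, column names and sample values are assumptions (the notes do not give them), and for brevity all three statements act on a single product table.

```python
import sqlite3

# In-memory database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ProductMaster ("
    "productId INTEGER PRIMARY KEY, productName TEXT, unitPrice REAL)"
)

# INSERT: a new record is added to the database.
with conn:  # commits on success, rolls back on error
    conn.execute(
        "INSERT INTO ProductMaster (productId, productName, unitPrice) "
        "VALUES (?, ?, ?)",
        (1, "Notebook", 100.0),
    )

# UPDATE: the price of the product is raised from 100 INR to 200 INR.
with conn:
    conn.execute(
        "UPDATE ProductMaster SET unitPrice = ? WHERE productId = ?",
        (200.0, 1),
    )

price_after_update = conn.execute(
    "SELECT unitPrice FROM ProductMaster WHERE productId = 1"
).fetchone()[0]

# DELETE: the product is withdrawn, so its record is removed.
with conn:
    conn.execute("DELETE FROM ProductMaster WHERE productId = ?", (1,))

rows_after_delete = conn.execute(
    "SELECT COUNT(*) FROM ProductMaster"
).fetchone()[0]
```

Each statement touches one row and is committed at once, which is typical of the short transactions an OLTP system handles.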

OLTP
• Consider a supermarket database that consists of the following tables to manage its
data about products, employees, inventory supplies etc.
• Transactions table
• ProductMaster table
• EmployeeDetails table
• InventorySupplies table
• Suppliers table

Fig: 3.1 Schema of ProductMaster table


Fig: 3.2 Sample records of ProductMaster table


Considering the sample supermarket database, the queries that an OLTP system can process
are:

• Search for a particular customer’s record
• Retrieve the product description and unit price of a particular product
• Filter all products with a unit price >= 25
• Filter all products supplied by a particular supplier
• Search for and display the record of a particular supplier
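
Some of the queries listed above could look as follows in SQL, run here through Python's sqlite3 module. The schema is an assumption, since the ProductMaster figure is not reproduced in these notes; the column names and sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and sample data.
conn.executescript("""
CREATE TABLE Suppliers (supplierId INTEGER PRIMARY KEY, supplierName TEXT);
CREATE TABLE ProductMaster (
    productId INTEGER PRIMARY KEY,
    description TEXT,
    unitPrice REAL,
    supplierId INTEGER REFERENCES Suppliers(supplierId)
);
INSERT INTO Suppliers VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO ProductMaster VALUES
    (10, 'Pen', 10.0, 1),
    (11, 'Backpack', 40.0, 1),
    (12, 'Lamp', 25.0, 2);
""")

# Retrieve the description and unit price of a particular product.
one_product = conn.execute(
    "SELECT description, unitPrice FROM ProductMaster WHERE productId = ?",
    (11,),
).fetchone()

# Filter all products with a unit price >= 25.
priced_25_up = [row[0] for row in conn.execute(
    "SELECT description FROM ProductMaster "
    "WHERE unitPrice >= 25 ORDER BY productId"
)]

# Filter all products supplied by a particular supplier.
from_supplier_1 = [row[0] for row in conn.execute(
    "SELECT description FROM ProductMaster "
    "WHERE supplierId = ? ORDER BY productId",
    (1,),
)]
```

All three are simple lookups or filters over current data, with no aggregation over history.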
Advantages of an OLTP system:
• Simplicity:
It is typically designed for use by clerks, cashiers, clients etc.
• Efficiency:
It allows users to read, write and delete data quickly.
• Fast query processing:
It responds to user actions immediately and supports transaction
processing on demand.
• Security:
OLTP systems require concurrency control (locking) and recovery
mechanisms (logging).
Disadvantage:
• OLTP system data is not suitable for decision making:
The current data produced by OLTP systems is not easily used in decision
making.

Considering the sample supermarket database, the queries that OLTP systems
cannot answer are:
• Which new product should the supermarket introduce?
• Should the new product be specific to a few customer segments?
• How much discount should the supermarket offer at its year-end sale?
• Should different discounts be given to different customer segments?
• How can the most consistent salesperson be identified, depending on various parameters?

All the above questions require analysis that OLTP is unable to provide.

OLAP: ONLINE ANALYTICAL PROCESSING

• OLAP deals with historical or archival data, and it is characterized by a relatively
low volume of transactions.

• In OLAP, data is held in dimensional form.

• Hence OLAP tools are based on multidimensional data models that view data in the form
of a data cube.

• The queries needed for these systems are often very complex and involve aggregations.

• Applications of OLAP are planning, budgeting, sales forecasting, sales
reporting, business process management etc.

• ex: if we collect the last 10 years of data about flight reservations, it may give useful
information like the peak time of travel and what kinds of people are travelling in the various
classes available (Economy/Business).

Consider a supermarket called “AllGoods”.


OLAP
1. One-dimensional data: the data that is viewed from one particular perspective.

• Ex: in fig 3.4, we are looking at salesAmount data from one perspective called section.

• Similarly, in fig 3.5 we are looking at salesAmount data from one perspective
called ProductCategoryName.

• And similarly, in fig 3.6 we are looking at salesAmount data from one perspective
called YearQuarter.
2. Two-dimensional data: the data that is viewed or plotted using two
perspectives. Here the dimensions are thought of as a kind of coordinate system.

• In table 3.7, the salesAmount data is plotted along two dimensions
called yearQuarter and productCategoryName.

• yearQuarter is on the vertical axis and productCategoryName is on the horizontal axis.

3. Three-dimensional data: the data that is viewed or plotted using three perspectives.

• Consider fig 3.8: here the data is plotted from three perspectives called
productCategoryName, section and YearQuarter.

• Hence an analyst can now easily look for the section which recorded the
maximum accessories sales in Q2.

• It is also possible to go beyond the third dimension, depending on what kind of data is stored
and what kind of queries are required from OLAP systems.
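
The one-, two- and three-dimensional views described above can be sketched in plain Python by grouping the same salesAmount records on one, two or three dimensions. The records below are invented for illustration; they are not the values from the missing figures.

```python
from collections import defaultdict

# Hypothetical records: (section, productCategoryName, yearQuarter, salesAmount)
sales = [
    ("Kids", "Accessories", "Q1", 100),
    ("Kids", "Accessories", "Q2", 150),
    ("Men", "Accessories", "Q2", 120),
    ("Men", "Clothing", "Q1", 300),
    ("Women", "Clothing", "Q2", 250),
]

SECTION, CATEGORY, QUARTER = 0, 1, 2  # positions of the dimensions in a record


def aggregate(records, *dims):
    """Sum salesAmount grouped by the chosen dimension positions."""
    totals = defaultdict(int)
    for record in records:
        key = tuple(record[d] for d in dims)
        totals[key] += record[3]
    return dict(totals)


by_section = aggregate(sales, SECTION)                     # one dimension
by_quarter_category = aggregate(sales, QUARTER, CATEGORY)  # two dimensions
cube = aggregate(sales, CATEGORY, SECTION, QUARTER)        # three dimensions

# Which section recorded the maximum accessories sales in Q2?
q2_accessories = {
    key[1]: amount
    for key, amount in cube.items()
    if key[0] == "Accessories" and key[2] == "Q2"
}
best_section = max(q2_accessories, key=q2_accessories.get)
```

Adding a fourth dimension would only mean adding one more field to each record and one more position to the grouping key.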


Queries that an OLAP system can process:

Considering fig 3.8, which plots 3-D data by productCategoryName, section and
YearQuarter, OLAP can answer the following questions:

• What will be the future sales trend for accessories in the Kids section?

• Given customers’ buying patterns, will it be profitable to launch product XYZ in the
Kids section?

• What impact will a 5% increase in the price of products have on customers?

Advantages of OLAP systems

• Multidimensional data representation

• Consistency of information

• “What if” analysis

• Single platform for all information and business needs like planning,
budgeting, forecasting, reporting and analysis

• Fast and interactive ad hoc exploration

DIFFERENT OLAP ARCHITECTURES

Different OLAP architectures are:

1. Multidimensional OLAP ( MOLAP)

2. Relational OLAP (ROLAP)

3. Hybrid OLAP (HOLAP)

1) MOLAP

• Here data is stored as a multidimensional cube, held in multidimensional arrays.


Advantages of MOLAP:

• Fast data retrieval

• Optimal for slicing and dicing

• It can perform complex calculations which are pre-generated when the cube is created.

Disadvantages of MOLAP:

• It can handle only a limited amount of data; large amounts of data cannot be included in
the cube.

• Cube technology is often proprietary, hence additional investment in human and
capital resources is required.
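
A toy version of the MOLAP idea, assuming a tiny cube held as a nested Python list: cells are found by array position, a summary is pre-generated when the cube is built, and a slice fixes one dimension. Real MOLAP engines use their own storage formats; the dimension members and numbers here are hypothetical.

```python
# Dimension members (hypothetical); list positions serve as array indexes.
categories = ["Accessories", "Clothing"]
sections = ["Kids", "Men"]
quarters = ["Q1", "Q2"]

# cube[c][s][q] holds the salesAmount for one cell of the cube.
cube = [
    [[100, 150], [80, 120]],   # Accessories: Kids (Q1, Q2), Men (Q1, Q2)
    [[200, 210], [300, 310]],  # Clothing:    Kids (Q1, Q2), Men (Q1, Q2)
]

# Aggregate generated once, when the cube is built: total per category and quarter.
totals_by_category_quarter = [
    [sum(cube[c][s][q] for s in range(len(sections)))
     for q in range(len(quarters))]
    for c in range(len(categories))
]


def cell(category, section, quarter):
    """Look up one cell directly by array position."""
    return cube[categories.index(category)][sections.index(section)][
        quarters.index(quarter)
    ]


def slice_quarter(quarter):
    """Slice: fix the quarter and keep a 2-D category x section view."""
    q = quarters.index(quarter)
    return [
        [cube[c][s][q] for s in range(len(sections))]
        for c in range(len(categories))
    ]
```

The sketch also hints at the size limitation noted above: the array needs a cell for every combination of dimension members, whether or not any sales exist for it.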

2) ROLAP:

• Here the data is stored in relational databases.

• Slice and dice operations are usually implemented by adding a “WHERE” clause to
SQL statements.

Advantages:

• It can handle large amounts of data.

• It can make use of functionality already available in relational databases.

Disadvantages:

• It is difficult to perform complex calculations using SQL.

• Performance slows as data size increases.
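
A minimal sketch of slice and dice in ROLAP style, using Python's sqlite3 module as the relational store. The Sales table and its rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical relational table holding the sales data.
conn.executescript("""
CREATE TABLE Sales (
    section TEXT, productCategoryName TEXT, yearQuarter TEXT, salesAmount REAL
);
INSERT INTO Sales VALUES
    ('Kids', 'Accessories', 'Q1', 100),
    ('Kids', 'Accessories', 'Q2', 150),
    ('Men',  'Accessories', 'Q2', 120),
    ('Men',  'Clothing',    'Q2', 310);
""")

# Slice: fix one dimension with a WHERE clause and aggregate over the rest.
slice_q2 = conn.execute(
    "SELECT section, SUM(salesAmount) FROM Sales "
    "WHERE yearQuarter = 'Q2' GROUP BY section ORDER BY section"
).fetchall()

# Dice: restrict two or more dimensions at once.
dice = conn.execute(
    "SELECT SUM(salesAmount) FROM Sales "
    "WHERE yearQuarter = 'Q2' AND productCategoryName = 'Accessories'"
).fetchone()[0]
```

Nothing is precomputed here: every query scans and aggregates the rows again, which is why performance tends to drop as the table grows.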


3) HOLAP :

• It combines the best parts of ROLAP and MOLAP.

• It makes use of the greater scalability of ROLAP and the faster performance and
summary-type information of MOLAP.

• It stores time-based information in the MOLAP cube, and conditions-based or
older information in the ROLAP data store.

Disadvantage:

• Greater implementation and maintenance cost

OLTP V/S OLAP

• OLTP helps in the execution and storage of day-to-day transactions in alignment with business
strategy.

• These day-to-day transactions are stored in a commercial RDBMS.

• The data from multiple transactional systems is then brought into an enterprise
data warehouse after extraction, cleansing (error detection and rectification) and
transformation (conversion from the legacy or host format to the data warehouse format).

• This data is used for analytics, finding patterns and trends, and decision making,
which brings efficiency to the operations of an organization.
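
The extraction, cleansing and transformation steps can be sketched as below. Everything here is an assumption made for illustration: the two source systems, their field names and the target warehouse format. The cleansing step only detects bad rows and sets them aside; rectifying them is left out.

```python
from datetime import datetime

# Hypothetical extracts from two transactional systems with different formats.
pos_rows = [
    {"sku": "A1", "amount": "100.50", "date": "31/01/2024"},
    {"sku": "A2", "amount": "", "date": "01/02/2024"},  # amount is missing
]
web_rows = [
    {"product": "a1", "total": 20.0, "ordered_on": "2024-02-03"},
]


def transform_pos(row):
    """Convert a point-of-sale row to the warehouse format.

    Returns None when the row fails cleansing (bad amount or date).
    """
    try:
        amount = float(row["amount"])
        day = datetime.strptime(row["date"], "%d/%m/%Y").date()
    except ValueError:
        return None
    return {
        "productCode": row["sku"].upper(),
        "salesAmount": amount,
        "saleDate": day.isoformat(),
    }


def transform_web(row):
    """Convert a web-shop row to the same warehouse format."""
    return {
        "productCode": row["product"].upper(),
        "salesAmount": float(row["total"]),
        "saleDate": row["ordered_on"],
    }


warehouse = []  # stands in for the enterprise data warehouse
rejected = []   # rows that failed cleansing, kept for later rectification
for row in pos_rows:
    clean = transform_pos(row)
    if clean is None:
        rejected.append(row)
    else:
        warehouse.append(clean)
warehouse.extend(transform_web(row) for row in web_rows)
```

After loading, rows from both systems share one format, so they can be analyzed together.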
DATA MODEL FOR OLTP:

• An OLTP system usually adopts an ER (entity relationship) data model.

• The relationships between entities (tables) signify relationships between data.

• The building blocks of the ER model are entities, attributes and relationships.

Consider the ER data model example in fig 3.5:

• It consists of three entities

employee (employeeid primary key)

employeeAddress (employeeid foreign key)

employeePayHistory (employeeid foreign key)

• It consists of two relationships

(1:M cardinality) between employee and employeeAddress

(1:M cardinality) between employee and employeePayHistory.
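
The three entities and two 1:M relationships can be expressed as tables with a primary key and foreign keys. A sketch using sqlite3 follows; the entity names and employeeid come from the example above, while the remaining columns and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (
    employeeid INTEGER PRIMARY KEY,
    name TEXT                      -- hypothetical attribute
);
CREATE TABLE employeeAddress (
    addressid INTEGER PRIMARY KEY, -- hypothetical attribute
    employeeid INTEGER NOT NULL REFERENCES employee(employeeid),
    city TEXT                      -- hypothetical attribute
);
CREATE TABLE employeePayHistory (
    payid INTEGER PRIMARY KEY,     -- hypothetical attribute
    employeeid INTEGER NOT NULL REFERENCES employee(employeeid),
    rate REAL                      -- hypothetical attribute
);
INSERT INTO employee VALUES (1, 'Asha');
INSERT INTO employeeAddress VALUES (1, 1, 'Pune'), (2, 1, 'Mumbai');
INSERT INTO employeePayHistory VALUES (1, 1, 500.0), (2, 1, 550.0);
""")

# 1:M cardinality: one employee row, many address rows.
address_count = conn.execute(
    "SELECT COUNT(*) FROM employeeAddress WHERE employeeid = 1"
).fetchone()[0]

# A child row that points at a missing employee violates the foreign key.
try:
    conn.execute("INSERT INTO employeeAddress VALUES (3, 99, 'Delhi')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```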

DATA MODELS FOR OLAP

• OLAP systems adopt a multidimensional data model like the star schema or
snowflake schema.

• Dimension:

a) A dimension is a perspective or an entity with respect to which an organization wants to
keep records.
ex: time, product, customer, employee

b) Each dimension has a table associated with it, called the dimension
table.

c) Each dimension table has attributes like productName, ProductCategory,
UnitPrice etc.

• Facts:

These are the numerical measures or quantities by which the relationships
between dimensions are analyzed.

ex: TotalSales (sales amount in dollars), Quantity (number of units sold),
Discount (amount of discount offered in dollars).

1) Star Model:

• This has a central fact table which is connected to dimension tables surrounding it.

• Each dimension is represented by only one table and each table has a set of attributes.

2) Snowflake model:

• This has a central fact table which is connected to the dimension tables surrounding it.

• Here dimensions are normalized into multiple related tables.

• It is used when a dimension table is relatively big in size.

ex: the product dimension is further normalized into a productcategory dimension.
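
A small star schema can be sketched with sqlite3: one central fact table whose rows reference two dimension tables. The table names (DimProduct, DimTime, FactSales), the key columns and the sample rows are assumptions; the measures TotalSales and Quantity follow the examples given under Facts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: one table per dimension (star model).
CREATE TABLE DimProduct (
    productKey INTEGER PRIMARY KEY, productName TEXT, productCategory TEXT
);
CREATE TABLE DimTime (timeKey INTEGER PRIMARY KEY, yearQuarter TEXT);

-- Central fact table: foreign keys to the dimensions plus numerical measures.
CREATE TABLE FactSales (
    productKey INTEGER REFERENCES DimProduct(productKey),
    timeKey INTEGER REFERENCES DimTime(timeKey),
    TotalSales REAL,
    Quantity INTEGER
);

INSERT INTO DimProduct VALUES (1, 'Cap', 'Accessories'), (2, 'Shirt', 'Clothing');
INSERT INTO DimTime VALUES (1, 'Q1'), (2, 'Q2');
INSERT INTO FactSales VALUES (1, 1, 100, 10), (1, 2, 150, 15), (2, 2, 300, 6);
""")

# Analyze the facts by a dimension attribute: join the fact table to the dimension.
sales_by_category = conn.execute(
    "SELECT p.productCategory, SUM(f.TotalSales), SUM(f.Quantity) "
    "FROM FactSales f JOIN DimProduct p ON p.productKey = f.productKey "
    "GROUP BY p.productCategory ORDER BY p.productCategory"
).fetchall()
```

In a snowflake version of the same sketch, productCategory would move out of DimProduct into its own category table, and the query would need one more join.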

ROLE OF OLAP TOOLS IN BI ARCHITECTURE

• BI architecture consists of a variety of applications.

• Data is extracted from multiple databases scattered around the enterprise,
cleansed, transformed and loaded into a common business data warehouse.

• The OLAP system then extracts information from the data warehouse and stores it in a
multidimensional database using ROLAP or MOLAP, which in turn stores the data in
a cube.

• Users can now use query and reporting tools, and analysis and data mining tools, over
this data.

• OLAP can now produce roll-up reports, drill-down reports, drill-through reports,
aggregations, summaries and pivot tables on varied views of the data.
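
Roll-up and drill-down can be illustrated on a time hierarchy (year, then quarter). This is a plain-Python sketch over made-up figures, not the output of any particular OLAP tool.

```python
from collections import defaultdict

# Hypothetical facts at the finest grain: (year, quarter, salesAmount)
facts = [
    (2023, "Q1", 100), (2023, "Q2", 150),
    (2024, "Q1", 120), (2024, "Q2", 180),
]


def roll_up(rows):
    """Roll up: summarize quarterly figures to the year level."""
    totals = defaultdict(int)
    for year, _quarter, amount in rows:
        totals[year] += amount
    return dict(totals)


def drill_down(rows, year):
    """Drill down: expand one year's total back into its quarters."""
    return {quarter: amount for y, quarter, amount in rows if y == year}


yearly = roll_up(facts)
quarters_2024 = drill_down(facts, 2024)
```

A roll-up report would show the yearly totals, and a drill-down report would let the user open one year to see the quarterly detail behind it.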
SHOULD OLAP BE PERFORMED DIRECTLY ON OPERATIONAL
DATABASES?

• OLTP and OLAP systems are designed for different purposes.

• OLTP systems are designed to query operational databases, whereas OLAP needs a
data warehouse in which data is integrated from various sources.

• Performing OLAP queries on operational databases will degrade the
performance of operational tasks.

• OLTP systems support locking and logging. OLAP requires only read-only access for
aggregation and summarization. Hence applying locking and logging to OLAP
will reduce the throughput of the OLAP system.

GETTING STARTED WITH BUSINESS INTELLIGENCE

Using analytical information for decision support

• In the past, business executives used numerical information to support their decisions.
The IT applications that provided such numerical information were called analytical
applications.

• Later, BI provided a set of concepts and processes that allowed business executives
to take informed decisions.

• BI made decision making faster, more reliable, consistent and highly team oriented.

What are informed decisions and why are they required?

• Informed decisions are based on facts, and facts alone, not on gut feeling. Hence
the chances of their being correct are higher.

• It is easy to communicate facts to stakeholders.

• When similar facts are presented to a large set of decision makers, it is likely that they
will arrive at the same conclusion.

• Hence this type of decision making leads to business benefits.

Information sources before the dawn of BI

1) Market research :

• This helps in better understanding the marketplace in which the business is operating.
• It includes understanding the customers, competitors, products, changing
market dynamics etc.

• It answers questions like:

Will the launch of product X in region A be successful?

Will customers be receptive to product X?

Should we discontinue product Y?

2) Statistical data

• It includes revealing hidden patterns, spotting trends etc. through proven
mathematical techniques for understanding raw data.

• ex: variance in production rate, correlation of sales with campaigns, cluster analysis
of shopping patterns etc.

• This helps decision makers to see new opportunities or innovate products and services.
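
Two of the measures named above, variance and correlation, can be computed with a few lines of Python. The production and sales figures are invented, and the Pearson correlation is written out by hand so that the formula is visible.

```python
from math import sqrt
from statistics import mean, pvariance

# Hypothetical weekly figures.
production_rate = [10, 12, 14, 16, 18]
campaign_spend = [1, 2, 3, 4, 5]
sales = [12, 14, 16, 18, 20]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    mean_x, mean_y = mean(xs), mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Population variance of the production rate.
rate_variance = pvariance(production_rate)

# Correlation of sales with campaign spend; close to 1 for this made-up data,
# because sales rise by a fixed amount for every unit of spend.
sales_campaign_corr = pearson(campaign_spend, sales)
```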

3) Management reporting:

• The IT teams within organizations prepare ad hoc reports using specialized tools.

4) Market survey:

• A third-party agency is employed to conduct consumer surveys, competitive
analysis etc.

• They use benchmark data to perform the SWOT analysis.

BUSINESS INTELLIGENCE DEFINED

• In 1989, Howard Dresner, later of the Gartner Group, popularized the term BI.

• Definition:

“BI is a set of concepts and methodologies to improve decision making in
business through use of facts and fact based systems”

• BI uses a set of technologies and tools like

data extraction (Informatica / IBM DataStage / Ab Initio)

data analysis (SAS / IBM SPSS)

data reporting (IBM Cognos / BusinessObjects)


• BI transforms raw data into meaningful information

raw data → meaningful information → knowledge discovery → beneficial insights
→ impactful decisions → business benefits

• Business benefits include increased productivity, increased profits, reduced costs,
improved operations etc.

• Features of BI :

1) Fact-based decision making:

• Decisions made using BI are based purely on facts and history.

2) Single version of truth:

• When the same piece of data is available at more than one place, all copies agree wholly and in
every respect.

3) 360 degree perspective on your business:

• BI enables looking at the business from various perspectives, which helps each person in
the project/program team to look at the data from their own role and find attributes that help in
decision making.

4) Virtual team members on the same page:

• BI provides the same facts to decision makers / stakeholders / executives who work on
common projects / business goals / purposes and who are spread across geographic
locations.

Visibility into Enterprise performance

fig 4.1: types of decisions supported by BI


1) Strategic level :

• BI helps in making long-term decisions that affect the entire organization.

• ex: for the goodfood restaurant chain, it answers questions like: where should the next 5
restaurants be located?

2) Tactical level :

• These decisions are made more frequently than strategic decisions.

• They affect single unit(s) / department(s).

• ex: What are the right months to redeem customer loyalty points?

3) Operational level:

• These decisions are made more frequently than tactical decisions.

• The impact is restricted to a single unit / department / function.

• These decisions help carry out the day-to-day operations of the business.

• ex: What menu item needs to be dropped this week to handle bad weather?

Evolution of BI and Role of DSS, EIS,MIS and Digital Dashboards

• MIS: management information was provided to decision makers by the IT team using
an MIS (management information system).

• Generating reports in an MIS involved various phases like requirement gathering,
analysis, design of a new schema to combine data from several sources, programming to
read data, populating the new schema and then generating reports.

Some challenges of the MIS approach are:

• Long delay between the request for and the delivery of reports.

• Multiple versions of truth (data or facts).

• Too many versions of the data, which cannot serve any new requirement and hence need to
be discarded.

• By the time the report reaches the executives, their requirements might have changed, leading to
dissatisfaction with the service.
Solutions to the above challenges were:

1) New tools to connect to heterogeneous databases

2) Multidimensional DBMS and hardware solutions to handle queries faster, and
new reporting tools

3) OLAP and data integration with middleware

BI solutions are a product of all three of the above developments/solutions.

Some BI solutions are:

1) Ad hoc reporting systems:

• These reporting tools combine data from multiple sources, store metadata and report
specifications for faster re-runs, and deliver reports in multiple formats like pdf, doc
or xls.

• They meet the requirements of individual decision makers in terms of data set collection
and frequency.

• They analyze the information systems employed in the operational activities of an
organization.

2) Decision support systems (DSS) :

• A DSS is an information system that supports business decision-making activities
(operational decisions). Also called knowledge-based systems, they support the decisions
required to run day-to-day operations.

• They use graphics to present information from multiple sources.

• ex: inventory, POS systems etc.

3) Executive information systems (EIS):

• An EIS includes powerful reporting and analytical tools.

• It helps to integrate and coordinate business processes.

• It supports senior management in making strategic decisions by providing internal and
external data.

• An EIS uses KPIs to measure business/function/project performance.


Difference between ERP and BI

ERP | BI

ERP is for data gathering, aggregation, search, update etc. | BI is for data retrieval

It is an operational / transactional / OLTP system | It is an OLAP system

Supports capture, storage and flow of data across multiple units of the organization | Supports data integration from internal and external sources, transforms the data and stores it in a business data warehouse

Supports pre-built reports that meet transactional needs | Advanced reporting and visualization. Supports dynamic reports like drill up, drill down, drill across, pivots etc.

Little or no support for the analytical needs of the organization | Supports analytical needs

Is data warehouse synonymous with BI?

• BI is the front end and the data warehouse is the back end.

• The data warehouse stores the data and BI converts this data into meaningful information.

• BI contains all the tools required for analytics and reporting.

• BI includes marketing research, analytics, reporting, dashboards, data warehouse,
data mining etc.
Why BI is needed at virtually all levels

1) There is too much data, but little insight

• The volume, variety and velocity of data are growing by leaps and bounds.
Hence managing this data is a challenge.

• Data comes from varied sources and has different schemas.

• Hence it is necessary to integrate this data and store it in a format from
which meaningful information can be derived and used for decision making.

2) There is a need to expand BI from boardroom to front lines :

• BI needs to be integrated at the operational process level due to changing market
conditions.

3) Structured and unstructured data needs to converge

• Unstructured data like emails, text messages and memos needs to be blended with
structured data in order to support better decision making.

• ex: adding comments / suggestions from customers into BI applications to help market
segment analysis

BI is for Past , Present and Future

• Consider fig 4.3.

• BI considers past, present and future contexts and scenarios.

• BI has a standard set of reports that support short-term decisions and answer
questions like:

What happened? When and where did it happen?

• The statistical analysis capabilities of BI allow digging deep into current and past data
and answer questions like: Why is this happening? Why do customers prefer a particular
brand over another?

• BI also helps in forecasting and predictive modeling and answers questions like: What if
the trend continues? What is likely to happen next? What will be in demand?
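
The question "what if the trend continues?" can be illustrated with the simplest possible forecast: fit a straight line to past values and extend it by one period. This is only a sketch on invented numbers; real forecasting and predictive modeling use richer techniques.

```python
def fit_trend(values):
    """Least-squares line through the points (0, values[0]), (1, values[1]), ..."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    slope = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(xs, values)
    ) / sum((x - mean_x) ** 2 for x in xs)
    intercept = mean_y - slope * mean_x
    return slope, intercept


quarterly_sales = [100, 110, 120, 130]  # hypothetical past data
slope, intercept = fit_trend(quarterly_sales)

# Extend the fitted line one quarter beyond the data.
next_quarter = intercept + slope * len(quarterly_sales)
```

Here the past data grows by 10 per quarter, so the line predicts roughly 140 for the next quarter, provided the trend continues.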

The BI value chain

• The BI value chain is depicted as

Transformation → Storage → Delivery

• Data from different OLTP/transactional systems is collected, cleansed (error
detection and rectification) and stored in the enterprise data warehouse.

• Before being stored, the data is also converted into a unified format supported by the
data warehouse. This is called transformation. The data/information is then
delivered.

Introduction to business analytics

• Business analytics requires a high volume of high-quality data.

• It helps businesses optimize existing processes, better understand customer
behavior, recognize opportunities and spot problems before they happen.

• An analytic application is defined as packaged BI for a particular domain or business
problem.

• Business analytics includes domains like
marketing analytics, customer analytics,
retail sales analytics, financial services analytics,
supply chain analytics, transportation analytics etc.


Difference between BI and BA

Business Intelligence | Business Analytics

Answers the questions:

• what happened? | • why did it happen?
• when did it happen? | • will it happen again?
• who is accountable for what happened? | • what will happen if we change x?
• how many? | • what is the best that can happen?
• how often? |
• where did it happen? |

Makes use of:

• reporting (KPI, metrics) | • statistical/quantitative analysis
• automated monitoring/alerting | • data mining
• dashboards/scoreboards | • predictive modeling
• OLAP (cubes, slice & dice, drilling) | • multivariate testing
• ad hoc query | • extracting learning out of business data
