
Business Analytics Book

Business intelligence (BI) is a broad set of technologies and processes used to gather, store, analyze and provide access to data to help business users make better decisions. Key components of BI include data warehousing, business analytics, business performance management and user interfaces like dashboards and reports. BI helps organizations monitor performance, identify opportunities and make informed decisions by providing insights from integrated, organized data on areas like sales, customers and operations.

Module I

Business Intelligence - Definition, Need, Use & Components


Business Analytics – Introduction, Components, Types
Business Intelligence v/s Business Analytics
Transaction Processing v/s Analytic Processing
Business Intelligence
• Any business organization needs to continually monitor its business environment
and its own performance, and then rapidly adjust its future plans
• This includes monitoring the industry, the competitors, the suppliers, and the
customers
• Customized reports need to be designed to deliver the required information to every
executive.
• These reports can be converted into customized dashboards that deliver the
information rapidly and in easy-to-grasp formats
Business Intelligence
• Business intelligence is a broad set of information technology (IT) solutions that includes tools for
gathering, analyzing, and reporting information to the users about performance of the organization and its
environment
• Consider a retail business chain that sells many kinds of goods and services around the world, online and
in physical stores.
• It generates data about sales, purchases, and expenses from multiple locations and time frames.
• Analyzing this data could help identify fast-selling items, regional-selling items, seasonal items, fast-
growing customer segments, and so on. It might also help generate ideas about what products sell together,
which people tend to buy which products, and so on.
• These insights and intelligence can help design better promotion plans, product bundles, and store layouts,
which in turn lead to a better-performing business.
Need for BI
• Helps the Leadership in making informed and better decisions

• Identifying new business opportunities & analysing gaps in the current processes
• Helps in creating accurate reports by extracting data right from the data source
• Saves time otherwise needed in organizing data manually

• Real time reporting for efficient management

• Helps in forecasting
Features of BI
• Ranking reports
• What-If analysis
• Executive dashboards
• Interactive reports
• Geospatial Mapping
• Operational reports
• Open Integration
• Security features
Framework of BI
BI Components

• Data warehouse
• Business Analytics
• Business Performance Management
• User Interface
Data Warehousing
• A data warehouse (DW) is an organized collection of integrated, subject-oriented databases designed for
decision support functions
• DW is organized at the right level of granularity to provide clean enterprise-wide data in a standardized format for
reports, queries, and analysis
• DW is physically and functionally separate from an operational and transactional database

• DW supports business reporting and data mining activities

• It can facilitate distributed access to up-to-date business knowledge for departments and functions, thus improving
business efficiency and customer service
• DW enables a consolidated view of corporate data, all cleaned and organized.

• DW thus provides better and timely information. It simplifies data access and allows end users to perform extensive
analysis.
Business Analytics
• Business analytics (BA) refers to the skills, technologies, and practices for continuous iterative
exploration and investigation of past business performance to gain insight and drive business
planning
• Business analytics focuses on developing new insights and understanding of business performance
based on data and statistical methods
• Business analytics makes extensive use of analytical modeling and numerical analysis, including
explanatory and predictive modeling, and fact-based management to drive decision making
• Business analytics can answer questions like why is this happening, what if these trends continue,
what will happen next (predict), and what is the best outcome that can happen (optimize)
• BA components include – Reporting and queries; Advanced Analytics; Data, Text and Web mining
Business Performance Management
• Business Performance Management (BPM), otherwise termed as Corporate Performance Management (CPM)
or Enterprise Performance Management, is tuned toward optimization of overall business performance and
achievement of business goals
• It enables an organization to enhance the management of their business performance through the aid of reports,
analytics, Key Performance Indicators, etc. that help them measure and monitor efficiency and success of their
business activities
• The optimisation of comprehensive performance of an organisation is the main aim of BPM

• BPM includes the following processes – Budgeting, Planning & Forecasting; Business Modeling; Scorecard;
Dashboarding; Financial, statutory & management reporting; Risk management; Predictive analysis; Internal
Controls
User Interface
Dashboards

• Various information is organised & presented in a manner which is easy to understand with the help of a dashboard
• Various trends, exceptions & organisational performance measures
(KPIs) are presented by these dashboards

Visualisation Tools

• These range from multidimensional cube presentations to virtual reality
• They include technology similar to Geographical Information Systems
Working of BI
Advantages of BI
• Employees Authorisation

• Link various employees for competent & successful processing of data

• Ease of teamwork & allocation


• Communicating BI to entire organisation

• Evaluate & improve Inputs

• Improved association

• Reduced training requirements

• Ease of Reporting
Disadvantages of BI
• Loss of historical data

• High Cost attached

• Difficulties in implementation

• Time consuming

• Data privacy issues


BI Applications
• CRM
– Maximize the return on marketing campaigns
– Improve customer retention (churn analysis)
– Maximize customer value (cross-, up-selling)
– Identify and delight highly-valued customers
– Manage brand image
• Healthcare and Wellness
– Diagnose disease in patients
– Treatment effectiveness
– Wellness management
– Manage fraud and abuse
– Public health management
• Education
– Student Enrolment (Recruitment and Retention)
– Course offerings
– Fund-raising from Alumni and other donors
BI Applications
• Retail
– Optimize inventory levels at different locations
– Improve store layout and sales promotions
– Optimize logistics for seasonal effects
– Minimize losses due to limited shelf life
• Banking
– Automate the loan application process
– Detect fraudulent transactions
– Maximize customer value (cross-, up-selling)
– Optimize cash reserves with forecasting
• Financial Services
– Predict changes in bond and stock prices
– Assess the effect of events on market movements
– Identify and prevent fraudulent activities in trading
BI Applications
• Insurance
– Forecast claim costs for better business planning
– Determine optimal rate plans
– Optimize marketing to specific customers
– Identify and prevent fraudulent claim activities
• Manufacturing
– Discover novel patterns to improve product quality
– Predict/prevent machinery failures
• Telecom
– Churn management
– Marketing and product creation
– Network failure management
– Fraud Management
Business Analytics
• Business analytics (BA) is the iterative, methodical exploration of an organization's
data, with an emphasis on statistical analysis
• Business analytics is used by companies that use data-driven decision-making
• It makes extensive use of data, statistical and quantitative analysis, explanatory &
predictive modeling, and fact based management to drive decision making
• Analytics may be used as input for human decisions or may drive fully automated
decisions
Components of BA
• Data Aggregation
• Data Mining
• Data Visualisation
• Association & Sequence Identification
• Optimisation
• Predictive Analytics
• Text Mining
• Forecasting
Types & Techniques of BA
Descriptive Analytics
• Descriptive analytics is perhaps the most basic kind of analytics, yet it remains the most important and widely used
• It deals with uncovering the truth about the business by analyzing historical data, and it reveals a good deal of
factual information
• This is where grouping of data, descriptive statistics, and a number of visualization techniques
come in handy
• For example, finding the frequency, mean, median, mode, maximum, and minimum values of a subject in
different scenarios helps uncover a lot of information
• This allows the leadership to understand what has happened until now and gives a brief glimpse of what
could happen next.
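The descriptive measures listed above can be sketched in a few lines of Python. This is only an illustration: the `daily_units` figures are made-up values, not data from any real business, and the standard-library `statistics` module is just one convenient way to compute them.

```python
from collections import Counter
from statistics import mean, median, mode

# Hypothetical daily unit sales for one product (illustrative values only).
daily_units = [12, 15, 12, 18, 20, 12, 15]

summary = {
    "frequency": Counter(daily_units),  # how often each value occurs
    "mean": mean(daily_units),
    "median": median(daily_units),
    "mode": mode(daily_units),
    "minimum": min(daily_units),
    "maximum": max(daily_units),
}

# Print each descriptive measure on its own line.
for name, value in summary.items():
    print(f"{name}: {value}")
```

Running the same summary per region or per season is what "in different scenarios" would look like in practice.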
Types & Techniques of BA
Diagnostic Analytics
• This form of analytics deals with finding the reasons for whatever has happened in the
business so far
• Methodologies such as segmentation come in handy here: patterns are detected in the
data to give better insight into the situation the company is in
• For example, running analytics on a company’s customer base can identify the
different types of customers it has been dealing with and single out the specific kind
of customers that might have been holding back the company’s growth.
Types & Techniques of BA
Predictive Analytics
• This is the branch of analytics that deals with the future.
• Here, again based on historical data, a range of sophisticated statistical and machine learning methodologies are used to
understand what can happen in the future, given certain conditions or the pace at which the current scenario is moving
• This is done by identifying patterns in the data, figuring out the important drivers and features, and finding their relation to the
objective that we are trying to predict
• Time is not a driver in these methods. When time is involved, a particular kind of predictive analytics
known as forecasting is performed
• Forecasting refers to predicting a value over a fixed period of time where time also acts as a driver, i.e., it plays a role in deciding what the
predicted value is going to be.
• Sometimes a very specific type of prediction is also performed, such as text mining, where predictions are made from text to create products that
aid business operations and help increase profits
• In predictive analytics, advanced machine learning and deep learning algorithms are developed, and sometimes statistical models are
also created
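A minimal sketch of forecasting, where time itself is the driver: fit a straight-line trend to past values by ordinary least squares and extend it forward. The function names and the `monthly_sales` figures are assumptions made for illustration, and a linear trend is only one of many possible forecasting models.

```python
def fit_linear_trend(values):
    """Fit y = intercept + slope * t by least squares, with t = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend")
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    cov = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    var = sum((t - t_mean) ** 2 for t in range(n))
    slope = cov / var
    intercept = y_mean - slope * t_mean
    return intercept, slope


def forecast(values, periods_ahead):
    """Extend the fitted trend line for the next `periods_ahead` time steps."""
    intercept, slope = fit_linear_trend(values)
    n = len(values)
    return [intercept + slope * (n + k) for k in range(periods_ahead)]


# Hypothetical monthly sales (illustrative values only).
monthly_sales = [100, 110, 120, 130, 140]
next_two = forecast(monthly_sales, 2)
print(next_two)
```

Here the only input to the prediction is the position in time, which is what separates forecasting from the other predictive methods described above.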
Types & Techniques of BA
Prescriptive Analytics
• This is the most advanced form of analytics: here we try not only to predict but also to find the course of action
that is best suited to reach the objective
• While predictive analytics tells us what will happen, prescriptive analytics tells us
how to avoid the predicted outcome (when that outcome is not in the interests of the company)
• Different strategies are devised and tried out to compare their outcomes. This is where
optimization and simulation methodologies are put to use. Compared to the previously mentioned forms
of analytics, this is a new and developing form
• Advanced machine learning and deep learning methodologies are often used in this type of analytics; they allow us to
create different scenarios and find the best course of action.
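The idea of trying out different strategies and picking the best one can be sketched as a tiny optimization. Everything here is assumed for illustration: `predicted_demand` stands in for whatever predictive model is available, and the unit cost and candidate prices are invented numbers.

```python
def predicted_demand(price):
    """Hypothetical predictive model: demand falls linearly as price rises."""
    return max(0.0, 500 - 20 * price)


def profit(price, unit_cost=5.0):
    """Outcome of one strategy: margin per unit times predicted units sold."""
    return (price - unit_cost) * predicted_demand(price)


# Candidate strategies: prices from 5.0 to 25.0 in steps of 0.5.
candidate_prices = [p / 2 for p in range(10, 51)]

# Prescriptive step: evaluate every scenario and recommend the best one.
best_price = max(candidate_prices, key=profit)
print(best_price, profit(best_price))
```

The predictive model only says what demand would be at a given price; the search over candidates is what turns that prediction into a recommended action.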
Important BA Tools
• SQL
– It is among the most important tools, as SQL queries allow the user to easily filter out and create subsets of an otherwise large
dataset
– With a relevant subset of the data in hand, the analyst can quickly start cleaning the data and then creating
models from it

• Tableau/ QlikView/ PowerBI
– Among the most important tools for report generation through visualization. Tableau allows the user to quickly create
interesting, complex, and detailed graphs that can magnify the impact of a report
– The good aspect of this tool is that it is easy to use and requires less data preparation in order to get the desired output.

• BIRT
– Another useful report-based tool that allows us to create graphs and dashboards. However, it is more complex than Tableau, as the
user needs to have a decent knowledge of Java to make the most out of it.
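To illustrate the SQL point in the list above, here is a small sketch of filtering a subset out of a larger table. It uses Python's built-in `sqlite3` module with an in-memory database; the `sales` table, its columns, and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("North", "Pen", 40.0),
        ("North", "Soap", 45.0),
        ("South", "Pen", 80.0),
        ("South", "Soap", 90.0),
    ],
)

# Filter a subset: rows for one region whose amount exceeds a threshold.
subset = conn.execute(
    "SELECT product, amount FROM sales "
    "WHERE region = ? AND amount > ? ORDER BY amount",
    ("South", 50.0),
).fetchall()
conn.close()

print(subset)
```

The analyst would then carry on with cleaning and modelling on `subset` rather than on the full table.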
Important BA Tools
• Python
– One of the most advanced tools, Python allows the user to perform a wide range of tasks
– Python can be used for everything from basic steps such as data cleaning to complex aspects of analytics, including the development of various
kinds of models.
– The development of highly complex machine learning and deep learning models is particularly effective with this tool. Python can also
create reports and has libraries for visualization, but it is up to the user whether to use them or dedicated visualization tools

• R
– This statistical tool, created “by the statisticians for the statisticians”, allows a business analyst to perform all the descriptive and inferential
statistics along with the development of statistical models
– Compared to Python, it has a somewhat steeper learning curve, but this eventually pays off, as it has a large community of users and is respected
in the corporate world as well as in academia

• MS Excel
– One of the most basic yet most widely used and effective tools
– Its importance in Business Analytics can be understood through the difference between a sword and a
needle: a small, simple tool does jobs for which a heavier one is ill-suited
Important BA Tools
• SPSS Modeler (Clementine)
– A data mining tool by SPSS Inc. (IBM)
– Has an intuitive GUI, and its point-and-click modelling capabilities are very comprehensive

• KXEN
– One of the few tools to drive automated analytics
– Can work with very large amounts of data
– Its drawback is the complexity of understanding the results

• WEKA
– Waikato Environment for Knowledge Analysis is popular machine learning software
– It is written in Java and is open-source software
– It contains a GUI for interacting with data files and produces visual results and graphs
Advantages of BA
• Improving the decision making process

• Better alignment with strategy

• Realising cost efficiency


• Speeding up of decision making

• Improving competitiveness

• Synchronised financial and operational strategy

• Providing a single, unified view of enterprise information

• Potential increase in revenues


BI v/s BA
• While BI and BA serve similar purposes, and the terms may be used interchangeably, these practices differ in
their fundamental focus
• Business intelligence focuses on descriptive analytics, combining data gathering, data storage, and
knowledge management with data analysis to evaluate past data and provide new perspectives on currently
known information
• Business analytics focuses on predictive analytics, using data mining, modelling, and machine learning to
determine the likelihood of future outcomes
• Essentially, business intelligence answers the questions, “What happened?” and “What needs to change?” and
business analytics answers the questions, “Why is this happening?”, “What if this trend continues?”, “What will
happen next?”, and “What will happen if we change something?” Business analytics and business intelligence
solutions tend to overlap in structure and purpose
BI v/s BA
• Let’s take an example: you sell T-shirts through an online store. Business intelligence provides helpful
reports of the past and current state of your business. BI tells you that sales of your blue hood T-shirts have
spiked in the past three weeks. As a result, you decide to make more blue hood T-shirts to keep up with
demand
• Business analytics asks, “Why did sales of blue hood T-shirts spike?” By mining your website data, you
learn that a majority of traffic has come from a post by an actor who wore your T-shirt. This insight helps
you decide to send complimentary T-shirts to a few other prominent actors in the vicinity for shoots
• You use the previous sales information to anticipate how many T-shirts you will need to make and how
many supplies you will need to order to keep up with demand if the actors were to share posts while
wearing your T-shirts
Data Science
• Data science is the domain of study that deals with vast volumes of data using modern tools and
techniques to find unseen patterns, derive meaningful information, and make business decisions
• Data science uses complex machine learning algorithms to build predictive models
• Data science or data-driven science enables better decision making, predictive analysis, and pattern
discovery. It lets you:
– Find the leading cause of a problem by asking the right questions
– Perform exploratory study on the data
– Model the data using various algorithms 
– Communicate and visualize the results via graphs, dashboards, etc.
Data Science
• In practice, data science is already helping the airline industry predict disruptions in travel to
alleviate the pain for both airlines and passengers. With the help of data science, airlines can
optimize operations in many ways, including:
– Plan routes and decide whether to schedule direct or connecting flights
– Build predictive analytics models to forecast flight delays
– Offer personalized promotional offers based on customers’ booking patterns
– Decide which class of planes to purchase for better overall performance

• Data science is the art & science of acquiring knowledge through data
• It involves principles, processes and techniques for understanding phenomena via the analysis of
data
Data Science
• Components of Data Science – Organising, Packaging & Delivering the data
• Advantages
– Helps management with better & faster decisions

– Empowers the decision makers with solid data and outlines a path to achieve business goals
– Can anticipate new challenges & opportunities through the power of data

– Spotting the trend & capitalizing on it before the competition


– Setting the guidelines for best practices

– Rigorously testing the decisions until they achieve perfection

• Applications - Internet Search, Digital Advertisements, Recommender Systems


Online Analytical Processing (OLAP)
• OLAP is a method of analysing data in a multi-dimensional format, often across
multiple time periods, with the aim of uncovering the business information
concealed within data
• OLAP performs multidimensional analysis of business data and provides the
capability for complex calculations, trend analysis, and sophisticated data modeling
• It is the foundation for many kinds of business applications for Business
Performance Management, Planning, Budgeting, Forecasting, Financial Reporting,
Analysis, Simulation Models, Knowledge Discovery, and Data Warehouse Reporting
• OLAP enables end-users to perform ad hoc analysis of data in multiple dimensions,
thereby providing the insight and understanding they need for better decision
making
OLAP Features
• Fast
• Analysis
• Shared
• Multidimensional
• Information


OLAP Example
• Let us consider the data of a supermarket store, “AllGoods” store, for the year “2001”
• This data, as captured by the OLTP system, is under the following column headings: Section,
ProductCategoryName, YearQuarter, and SalesAmount.
• We have a total of 32 records/rows.
• The Section column can have one value from amongst “Men”, “Women”, “Kid”, and “Infant”.
• The ProductCategoryName column can have either the value “Accessories” or the value “Clothing”.
• The YearQuarter column can have one value from amongst “Q1”, “Q2”, “Q3”, and “Q4”.
• The SalesAmount column records the sales figures for each Section, ProductCategoryName, and
YearQuarter
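The shape of this data set can be sketched as follows: 4 sections × 2 product categories × 4 quarters gives the 32 rows mentioned above. The SalesAmount values below are placeholders, not the actual AllGoods figures, and `roll_up` is a hypothetical helper showing how an OLAP-style total along one dimension could be computed.

```python
from collections import defaultdict
from itertools import product

sections = ["Men", "Women", "Kid", "Infant"]
categories = ["Accessories", "Clothing"]
quarters = ["Q1", "Q2", "Q3", "Q4"]

# One row per (Section, ProductCategoryName, YearQuarter) combination.
# SalesAmount is a placeholder value, not a figure from the AllGoods data.
rows = [
    {"Section": s, "ProductCategoryName": c, "YearQuarter": q, "SalesAmount": 100.0}
    for s, c, q in product(sections, categories, quarters)
]


def roll_up(records, dimension):
    """Total SalesAmount for each value of one dimension."""
    totals = defaultdict(float)
    for record in records:
        totals[record[dimension]] += record["SalesAmount"]
    return dict(totals)


sales_by_quarter = roll_up(rows, "YearQuarter")
sales_by_section = roll_up(rows, "Section")
print(len(rows), sales_by_quarter)
```

Swapping the `dimension` argument is the simplest form of looking at the same facts along a different axis.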
Data Models for OLAP
Following are 3 chief types of multidimensional schemas
each having its unique advantages.
• Star Schema
• Snowflake Schema
• Galaxy Schema
Star Schema
• In a star schema in a data warehouse, the center of the star has one
fact table, which is linked to a number of associated
dimension tables
• It is known as a star schema because its structure
resembles a star. The star schema data
model is the simplest type of data
warehouse schema
Star Schema
• Every dimension in a star schema is represented by only one dimension table.

• The dimension table should contain the set of attributes.

• The dimension table is joined to the fact table using a foreign key

• The dimension tables are not joined to each other

• The fact table contains keys and measures

• The Star schema is easy to understand and provides optimal disk usage.

• The schema is widely supported by BI Tools
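A minimal sketch of a star schema, using Python's built-in `sqlite3` module: one fact table whose foreign keys point at two dimension tables that are not joined to each other. The table names, column names, and rows are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: one table per dimension, holding its attributes.
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_quarter (quarter_id INTEGER PRIMARY KEY, quarter TEXT);

-- Fact table: foreign keys to each dimension plus the measure.
CREATE TABLE fact_sales (
    product_id   INTEGER REFERENCES dim_product(product_id),
    quarter_id   INTEGER REFERENCES dim_quarter(quarter_id),
    sales_amount REAL
);

INSERT INTO dim_product VALUES (1, 'Clothing'), (2, 'Accessories');
INSERT INTO dim_quarter VALUES (1, 'Q1'), (2, 'Q2');
INSERT INTO fact_sales VALUES (1, 1, 100.0), (1, 2, 150.0), (2, 1, 30.0);
""")

# Join the fact table to one dimension and aggregate the measure.
totals = conn.execute("""
    SELECT p.category, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
conn.close()

print(totals)
```

Every query follows the same pattern: start from the fact table and join outward to whichever dimensions the question needs.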


Snowflake Schema
• Snowflake Schema in data warehouse is
a logical arrangement of tables in a
multidimensional database such that
the ER diagram resembles a snowflake
shape
• A Snowflake Schema is an extension of
a Star Schema, and it adds additional
dimensions
• The dimension tables are normalized
which splits data into additional tables
Galaxy Schema
• A Galaxy Schema contains
two fact tables that share
dimension tables between them
• It is also called Fact
Constellation Schema
• The schema is viewed as a
collection of stars, hence the
name Galaxy Schema
Types of OLAP
ROLAP
• Relational - OLAP works with data that exist
in a relational database. Facts and dimension
tables are stored as relational tables
• It also allows multidimensional analysis of
data and is the fastest-growing type of OLAP
• ROLAP products provide GUIs and generate
SQL execution plans that typically remove
end-users from the SQL writing process
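The way a ROLAP front end can spare end-users from writing SQL may be sketched like this: the user picks dimensions, and a helper builds the GROUP BY query. `build_rollup_query`, the `sales` table, and its columns are hypothetical; real ROLAP products generate far more elaborate SQL.

```python
import sqlite3

# Only these column names may appear in generated SQL (guards against injection).
ALLOWED_COLUMNS = {"section", "category", "quarter"}


def build_rollup_query(dimensions):
    """Build a GROUP BY query that totals `amount` over the chosen dimensions."""
    if not dimensions:
        raise ValueError("choose at least one dimension")
    unknown = set(dimensions) - ALLOWED_COLUMNS
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")
    cols = ", ".join(dimensions)
    return f"SELECT {cols}, SUM(amount) FROM sales GROUP BY {cols} ORDER BY {cols}"


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (section TEXT, category TEXT, quarter TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("Men", "Clothing", "Q1", 100.0),
        ("Men", "Clothing", "Q2", 120.0),
        ("Women", "Clothing", "Q1", 200.0),
    ],
)

query = build_rollup_query(["section"])
result = conn.execute(query).fetchall()
conn.close()

print(query)
print(result)
```

The data never leaves the relational tables; only the query text changes as the user picks different dimensions.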
ROLAP
Advantages of ROLAP model:
• High data efficiency. It offers high data efficiency because query performance and access language are
optimized particularly for the multidimensional data analysis.
• Scalability. This type of OLAP system offers scalability for managing large volumes of data, and even when the
data is steadily increasing.
Drawbacks of ROLAP model:
• Demand for higher resources. ROLAP needs high utilization of manpower, software, and hardware resources.
• Aggregate data limitations. ROLAP tools use SQL for all calculations on aggregate data. However, there are
no set limits for handling computations.
• Slow query performance. Query performance in this model is slow when compared with MOLAP
MOLAP
• MOLAP stands for Multidimensional
Online Analytical
Processing. MOLAP is the most
used storage type
• It is designed to offer maximum
query performance to the users
• The storage is in proprietary
formats & not in relational
database
MOLAP
Advantages:
• Since the data is stored on the OLAP server in optimized format, queries (even complex calculations) are faster than
ROLAP
• The data is compressed so it takes up less space
• And because the data is stored on the OLAP server, you don’t need to keep the connection to the relational database
• Cube browsing is fastest using MOLAP
Disadvantages:
• It doesn’t support real-time analysis, i.e., newly inserted data will not be available for analysis until the cube is processed
• It is not scalable. It can handle only a limited amount of data at a time, and this capacity cannot be increased dynamically
• Processing in MOLAP can be lengthy when data is in large volumes
• The storage is not utilized efficiently if the data set being analyzed is sparse
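The trade-off described above (fast queries on precomputed data, but no visibility of new rows until the cube is processed) can be sketched with a toy in-memory cube. `SimpleCube` is a hypothetical class written for illustration and does not reflect how any particular MOLAP server stores its data.

```python
from collections import defaultdict


class SimpleCube:
    """Toy cube: aggregates are precomputed only when process() runs."""

    def __init__(self):
        self._pending = []                # rows loaded but not yet processed
        self._cells = defaultdict(float)  # precomputed cells and roll-ups

    def load(self, section, quarter, amount):
        self._pending.append((section, quarter, amount))

    def process(self):
        for section, quarter, amount in self._pending:
            # Store the detail cell plus every roll-up; "*" means "all values".
            for key in ((section, quarter), (section, "*"), ("*", quarter), ("*", "*")):
                self._cells[key] += amount
        self._pending.clear()

    def query(self, section="*", quarter="*"):
        # A lookup, not a scan: the answer was computed during process().
        return self._cells.get((section, quarter), 0.0)


cube = SimpleCube()
cube.load("Men", "Q1", 100.0)
cube.load("Women", "Q1", 50.0)
cube.process()
before = cube.query(quarter="Q1")

cube.load("Kid", "Q1", 25.0)       # new data arrives after processing
stale = cube.query(quarter="Q1")   # the new row is not visible yet

cube.process()
fresh = cube.query(quarter="Q1")   # visible only after reprocessing
print(before, stale, fresh)
```

Storing every roll-up is also why a sparse data set wastes space: cells are kept for combinations that carry very little data.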
HOLAP
• Hybrid OLAP is a mixture of both ROLAP and MOLAP
• It offers the fast computation of MOLAP and the higher scalability of ROLAP
• HOLAP uses two databases
– Aggregated or computed data is stored in a multidimensional OLAP cube
– Detailed information is stored in a relational database.
HOLAP
Benefits of Hybrid OLAP:
• This kind of OLAP helps to economize disk space, and it remains compact, which helps to avoid issues
related to access speed and convenience.
• HOLAP uses cube technology, which allows faster performance for all types of data.
• ROLAP data is updated instantly, and HOLAP users have access to this real-time data. MOLAP
brings cleaning and conversion of data, thereby improving data relevance. This brings the best of both worlds.
Drawbacks of Hybrid OLAP:
• Greater complexity level: The major drawback of HOLAP systems is that they support both ROLAP and MOLAP
tools and applications, which makes them very complicated.
• Potential overlaps: There are higher chances of overlap, especially in their functionalities.
Advantages of OLAP
• OLAP is a platform for all types of business activities, including planning, budgeting, reporting, and analysis
• Information and calculations are consistent in an OLAP cube. This is a crucial benefit
• Quickly create and analyze "What if" scenarios
• Easily search the OLAP database for broad or specific terms
• OLAP provides the building blocks for business modeling tools, data mining tools, and performance reporting
tools
• Allows users to slice and dice cube data by various dimensions, measures, and filters
• It is good for analyzing time series
• Finding clusters and outliers is easy with OLAP
• It is a powerful online analytical processing and visualization system which provides faster response times
Disadvantages of OLAP
• OLAP requires organizing data into a star or snowflake schema.
These schemas are complicated to implement and administer
• You cannot have a large number of dimensions in a single OLAP cube
• Transactional data cannot be accessed with an OLAP system.
• Any modification in an OLAP cube needs a full update of the cube.
This is a time-consuming process
Online Transaction Processing (OLTP)
• OLTP, or online transactional processing, enables the real-time execution of large numbers of
database transactions by large numbers of people, typically over the internet
• OLTP is basically focused on query processing, maintaining data integrity in multi-access
environments as well as effectiveness that is measured by the total number of transactions per second
• In OLTP, the emphasis is on fast processing, because OLTP databases are read, written, and updated
frequently
• If a transaction fails, built-in system logic ensures data integrity

• Classic examples of OLTP systems are order entry, retail sales, and financial transaction systems
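The point that a failed transaction must leave the data intact can be sketched with Python's built-in `sqlite3` module. The `accounts` table, the `transfer` helper, and the balances are hypothetical; the CHECK constraint stands in for whatever integrity rules a real OLTP system enforces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id TEXT PRIMARY KEY, balance REAL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100.0), ("B", 20.0)])
conn.commit()


def transfer(connection, source, target, amount):
    """Move `amount` between accounts; both updates commit together or not at all."""
    try:
        with connection:  # commits on success, rolls back if an exception escapes
            connection.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, target),
            )
            connection.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "A", "B", 30.0)       # succeeds
failed = transfer(conn, "A", "B", 500.0)  # would overdraw A, so it is rolled back
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
conn.close()

print(ok, failed, balances)
```

In the failed transfer the credit to B had already been applied before the debit was rejected; the rollback is what undoes it.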
OLTP Characteristics
Short Response time

Small Transactions

Data Maintenance Operations

Large User Populations

High Concurrency

Large Data Volumes

High Availability

Lifecycle related Data Usage


ER Model
The first mathematical-theory-driven model for data management was designed by Ed Codd of IBM in 1970.
• A relational database is composed of a set of relations (data tables), which can be joined using shared attributes.
A “data table” is a collection of instances (or records), with a key attribute to uniquely identify each instance.
• Data tables can be JOINed using the shared “key” attributes to create larger temporary tables, which can be
queried to fetch information across tables. Joins can be simple, as between two tables, or
complex, with AND, OR, UNION or INTERSECTION, and more operations.
• High-level commands in Structured Query Language (SQL) can be used to perform joins, selection, and
organizing of records
Relational data models flow from conceptual models, to logical models to physical implementations. Data can be
conceived of as being about entities, and relationships among entities
ER Model
Column Name         Data type & length  Constraint   Description
ProductID           Character, 7        Primary Key  It is not null & unique
ProductName         Character, 35       Not null     Name of the product must be specified
ProductDescription  Character, 50       Not null     A brief description of the product
UnitPrice           Numeric 8,2         -            The price per unit of the product
QtyInStock          Numeric 5           -            The units of the product in stock

ProductID  ProductName   ProductDescription  UnitPrice  QtyInStock
P101       Glucon D      Energy drink        120.50     250
P102       Boost         Energy drink        135.00     200
P103       Dettol        Antiseptic liquid   90.00      500
P104       Pears         Toilet Soap         45.00      350
P105       Luxor         Pen                 40.00      50
P106       Maggie Sauce  Tomato Sauce        54.00      75
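The table definition and sample rows above could be expressed in SQL roughly as follows, here run through Python's built-in `sqlite3` module. The table name `Product` is an assumption (the slide does not name the table), and SQLite accepts the declared lengths without enforcing them, so this sketch only demonstrates the key and not-null constraints.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Product (
        ProductID          CHAR(7)  PRIMARY KEY NOT NULL,  -- not null and unique
        ProductName        CHAR(35) NOT NULL,
        ProductDescription CHAR(50) NOT NULL,
        UnitPrice          NUMERIC(8,2),
        QtyInStock         NUMERIC(5)
    )
""")
conn.executemany(
    "INSERT INTO Product VALUES (?, ?, ?, ?, ?)",
    [
        ("P101", "Glucon D", "Energy drink", 120.50, 250),
        ("P102", "Boost", "Energy drink", 135.00, 200),
        ("P103", "Dettol", "Antiseptic liquid", 90.00, 500),
        ("P104", "Pears", "Toilet Soap", 45.00, 350),
        ("P105", "Luxor", "Pen", 40.00, 50),
        ("P106", "Maggie Sauce", "Tomato Sauce", 54.00, 75),
    ],
)

# The primary key rejects a second row with an existing ProductID.
try:
    conn.execute("INSERT INTO Product VALUES ('P101', 'Duplicate', 'Same key', 1.0, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# The NOT NULL constraint rejects a row without a product name.
try:
    conn.execute("INSERT INTO Product VALUES ('P107', NULL, 'No name', 1.0, 1)")
    null_rejected = False
except sqlite3.IntegrityError:
    null_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Product").fetchone()[0]
conn.close()

print(duplicate_rejected, null_rejected, row_count)
```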
Implementing the Relational data model
• Once the logical data model has been created, it is easy to translate it into a
physical data model, which can then be implemented using any publicly
available DBMS
• Every entity should be implemented by creating a database table
• Every table will have a specific data field (key) that uniquely identifies
each record (or row) in that table
• Each master table or database relation should have programs to create, read,
update, and delete the records
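The create, read, update, and delete programs mentioned in the last point can be sketched as four small functions over one table. The function names and the reduced three-column `Product` table are assumptions for illustration; Python's built-in `sqlite3` module stands in for whichever DBMS is actually used.

```python
import sqlite3


def create_product(conn, product_id, name, unit_price):
    """Create: insert one new record."""
    conn.execute(
        "INSERT INTO Product VALUES (?, ?, ?)", (product_id, name, unit_price)
    )


def read_product(conn, product_id):
    """Read: fetch one record by its key, or None if it does not exist."""
    return conn.execute(
        "SELECT ProductID, ProductName, UnitPrice FROM Product WHERE ProductID = ?",
        (product_id,),
    ).fetchone()


def update_price(conn, product_id, unit_price):
    """Update: change one field of the record identified by the key."""
    conn.execute(
        "UPDATE Product SET UnitPrice = ? WHERE ProductID = ?",
        (unit_price, product_id),
    )


def delete_product(conn, product_id):
    """Delete: remove the record identified by the key."""
    conn.execute("DELETE FROM Product WHERE ProductID = ?", (product_id,))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Product ("
    "ProductID TEXT PRIMARY KEY, ProductName TEXT NOT NULL, UnitPrice REAL)"
)

create_product(conn, "P105", "Luxor", 40.0)
created = read_product(conn, "P105")
update_price(conn, "P105", 42.0)
updated = read_product(conn, "P105")
delete_product(conn, "P105")
deleted = read_product(conn, "P105")
conn.close()

print(created, updated, deleted)
```

Each function locates its record through the key field, which is why every table needs one.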
Advantages of OLTP
• OLTP offers accurate forecasts for revenue and expense.
• It provides a solid foundation for a stable business/organization due to timely modification of all transactions.
• OLTP makes transactions much easier for customers.
• It broadens the client base for an organization by speeding up and simplifying individual processes.
• OLTP provides support for bigger databases.
• OLTP suits tasks that the system performs frequently and that touch only a small number of
records
• It is used when tasks need consistency and concurrency, which in turn supports greater availability.
• It is designed typically for use by clerks, cashiers, etc.
• It efficiently allows its users to read, write, and delete data quickly
Disadvantages of OLTP
• If the OLTP system faces hardware failures, online transactions are severely affected
• OLTP systems allow multiple users to access and change the same data at the same time, which
can often create unexpected situations
• If the server hangs for even a few seconds, it can affect a large number of transactions
• OLTP requires a lot of staff working in groups in order to maintain inventory
• OLTP makes the database much more susceptible to hackers and intruders
• Server failure may lead to wiping out large amounts of data from the database
