Chapter BI4

This document discusses business intelligence and data warehousing. It describes how databases can be used to improve business performance and decision making. A data warehouse is a central repository of integrated data from one or more disparate sources that is used for reporting and analysis. The document outlines some common business intelligence tools like data mining, classification, clustering, and association rule mining that can analyze large amounts of data in a data warehouse to find patterns and relationships to help businesses make more informed decisions.

Uploaded by

eyob yohannes

Chapter four

Business Intelligence
Lecture outline
• Databases to Improve Business Performance and Decision Making
• Business Intelligence
• Data Warehouses
• Association Rule Mining
• Classification
• Clustering
• Others
Using Databases to Improve Business
Performance and Decision Making
• Databases provide information that helps the company run the business more efficiently
• Databases help managers and employees make better decisions
• Tools for analyzing and accessing vast quantities of data:
– Data warehousing
– Multidimensional data analysis
– Data mining
Using Databases to Improve Business
Performance and Decision Making
 Businesses use their databases to keep track of
basic transactions, such as paying suppliers,
processing orders, serving customers, and paying
employees.
 If a company wants to know which product is the
most popular or who is its most profitable
customer, the answer lies in the data.
A Good Data Warehouse
is a pre-requisite for
Business Decision Making
WHY & WHAT
DATA WAREHOUSE
MOTIVATION
“We are drowning in information but starved for knowledge.”
John Naisbitt
A producer wants to know….
• Which are our lowest/highest margin customers?
• Who are my customers and what products are they buying?
• What is the most effective distribution channel?
• What product promotions have the biggest impact on revenue?
• Which customers are most likely to go to the competition?
• What impact will new products/services have on revenue and margins?
Data, Data Everywhere, yet ...
• I can’t find the data I need
– data is scattered over the network
– many versions, subtle differences
• I can’t get the data I need
– need an expert to get the data
• I can’t understand the data I found
– available data poorly documented
• I can’t use the data I found
– results are unexpected
– data needs to be transformed from one form to another
Data Mining works with Data Warehouse Data
• The data warehouse provides the enterprise with memory
• Data mining provides the enterprise with intelligence
• Data mining helps to extract and analyze such information
What Is Data Mining? A Definition

Knowledge Discovery in Databases
The non-trivial extraction of implicit, previously unknown and potentially useful knowledge from data in large data repositories
Alternative names
• Knowledge Discovery (mining) in Databases (KDD)
• Knowledge extraction
• Data/pattern analysis
• Business Intelligence, etc.
Problem Behind…..
Heterogeneous Information Sources
“Heterogeneities are everywhere”
[Figure: personal databases, scientific databases, medical data, World Wide Web]
They have
• Different interfaces
• Different data representations
• Duplicate and inconsistent information
Problem
Data Management in Large Enterprises
• Application-driven development of operational systems resulted in vertical fragmentation of informational systems.
[Figure: separate systems for Sales Administration, Finance, Manufacturing, ..., covering functions such as Sales Planning, Stock Mngmt, Suppliers, Debt Mngmt, Inventory Mngmt]
Ultimate Goal
Unified Access to Data
[Figure: an integration system on top of the World Wide Web, digital libraries, scientific databases, and personal databases]
The integration system
• Collects and combines information
• Provides integrated view, uniform user interface
• Supports sharing
Best Solution
The Warehousing Approach
• An approach where the information is integrated in advance and stored in a warehouse for direct querying and analysis
[Figure: each source feeds an extractor/monitor; the integration system (with metadata) loads the data warehouse; clients query the warehouse]
Data Warehousing -- a process
• It is a technique for assembling and managing data from various sources for the purpose of answering business questions, thus making decisions that were not previously possible
• Process of constructing (and using) a data warehouse
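The warehousing approach (extract from each source, integrate in advance, store for direct querying) can be sketched in a few lines of Python. Everything in this sketch is hypothetical: the two source formats, the field names, and the in-memory SQLite table `sales` are assumptions made for illustration, not part of the lecture.

```python
import sqlite3

# Two hypothetical operational sources with different data representations.
sales_system = [
    {"cust": "C1", "product": "printer", "amount_usd": 120.0},
    {"cust": "C2", "product": "computer", "amount_usd": 800.0},
]
web_shop = [
    ("C1", "COMPUTER", "750,00"),  # upper-case names, decimal comma
    ("C3", "Scanner", "90,50"),
]

def extract_sales_system(rows):
    """Extractor for source 1: already close to the warehouse schema."""
    for r in rows:
        yield r["cust"], r["product"].lower(), float(r["amount_usd"])

def extract_web_shop(rows):
    """Extractor for source 2: normalise case and the decimal comma."""
    for cust, product, amount in rows:
        yield cust, product.lower(), float(amount.replace(",", "."))

def build_warehouse():
    """Integrate all sources in advance into one queryable table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE sales (customer TEXT, product TEXT, amount REAL)")
    for extractor, source in ((extract_sales_system, sales_system),
                              (extract_web_shop, web_shop)):
        db.executemany("INSERT INTO sales VALUES (?, ?, ?)", extractor(source))
    db.commit()
    return db

def revenue_by_product(db):
    """A business question answered directly against the warehouse."""
    query = ("SELECT product, SUM(amount) FROM sales "
             "GROUP BY product ORDER BY product")
    return dict(db.execute(query).fetchall())

warehouse = build_warehouse()
print(revenue_by_product(warehouse))
```

The point of the design is that the per-source cleaning happens once, at load time, so every later query sees one consistent representation.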
Data warehouse contd..
What is a data warehouse?
• Huge database system that stores and manages data required to analyze historical and current transactions
• Quick and efficient way to access large amounts of data
• Often uses a process called data mining to find patterns and relationships among data
• Uses multidimensional databases
Components of a Data Warehouse
Why a Data Warehouse?

The Warehousing Approach
• Information integrated in advance
• Stored in warehouse for direct querying and analysis
[Figure: clients query the data warehouse; an integration system with metadata loads it through one extractor/monitor per source]
Business intelligence and data mining
• Once data have been captured and organized in data warehouses, they are available for further analysis.
• A series of tools enables users to analyze these data to see new patterns, relationships, and insights that are useful for guiding decision making.
BI cont’d
Definition
• According to Adelman et al. (2002), BI is a term that encompasses a broad range of analytical software and solutions for gathering, consolidating, analyzing and providing access to information in a way that is supposed to let an enterprise's users make better business decisions.
• Stackowiak et al. (2007) define business intelligence as the process of taking large amounts of data, analyzing that data, and presenting a high-level set of reports that condense the essence of that data into the basis of business actions, enabling management to make fundamental daily business decisions.
• Business intelligence has also been described as a “business management term used to describe applications and technologies which are used to gather, provide access to, and analyze data and information about an enterprise, in order to help them make better informed business decisions.”
Cont’d
• These tools for consolidating, analyzing, and providing access to vast amounts of data to help users make better business decisions are often referred to as business intelligence (BI).
• Business intelligence provides firms with the capability to amass information; develop knowledge about customers, competitors, and internal operations; and change decision-making behavior to achieve higher profitability and other business goals.
BI cont’d

 how business intelligence works

BI cont’d
Traditional BI systems consist of a back-end database, a front-end user interface,
software that processes the information to produce the business intelligence itself,
and a reporting system.
The capabilities of BI include decision support, online analytical processing,
statistical analysis, forecasting, and data mining.
Table 3,1 Current BI Techniques
TECHNIQUE | DESCRIPTION
Predictive modeling | Predict value for a specific data item attribute
Association, correlation, causality analysis | Identify relationships between attributes
Classification | Determine to which class a data item belongs
Clustering and outlier analysis | Partition a set into classes, whereby items with similar characteristics are grouped together
Model visualization | Making discovered knowledge easily understood using charts, plots, histograms, ...
Application area of BI
Manufacturers, electronic commerce businesses, banks, telecommunication providers, airlines, retailers, health systems, financial services, bioinformatics and hotels use BI for customer support, market research, segmenting, product profitability, inventory and distribution analysis, statistical analysis, fraud detection, etc.
BI cont’d
Why BI?
• Customers are the most critical aspect of a company's success.
• It is very important that firms have information on their preferences. Firms must quickly adapt to their changing demands.
• Business Intelligence enables firms to gather information on the trends in the marketplace and come up with innovative products or services in anticipation of customers' changing demands.
• With BI, firms can identify their most profitable customers and the underlying reasons for those customers’ loyalty
• Analyze click-stream data to improve e-commerce strategies
• Determine what combinations of products and service lines customers are likely to purchase and when
• Etc.
Data mining
• Data mining is more discovery driven.
• Data mining provides insights into corporate data that cannot be obtained with OLAP by finding hidden patterns and relationships in large databases and inferring rules from them to predict future behavior.
• The patterns and rules are used to guide decision making and forecast the effect of those decisions.
• The types of information obtainable from data mining are listed in Table 4,1
BI cont’d
• Data Mining
• Finds hidden patterns and relationships in large databases
and infers rules from them to predict future behavior
• Types of information obtainable from data mining
• Associations: Occurrences linked to single event
• Classifications: Patterns describing a group an item belongs to
• Clusters: Discovering as yet unclassified groupings
• Forecasting: Uses series of values to forecast future values
Basic Concepts: Association Rules
An association rule is a rule of the form X → Y, with the interestingness measures support and confidence, where X and Y are simple or complex statements.
A simple statement is a statement formed from a single attribute (say age, buy or sex) and a value, related by a relational operator.
TID | ITEMSET
1 | Computer, printer, scanner, antivirus
2 | Computer, printer, scanner
3 | Computer, antivirus
4 | Antivirus, scanner
Buy(X, “Computer”) → Buy(X, “Printer”) [sup = 50%, conf = 66.67%]
This means that a person X who buys a computer also buys a printer. The support of 50% says that a computer and a printer are bought together in 50% of all transactions in the data set (2 of 4). The confidence of 66.67% says that, of the transactions that include a computer, 66.67% also include a printer (2 of 3).
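The support and confidence figures for the computer/printer rule can be checked with a short Python sketch. The function names `support` and `confidence` are illustrative choices, not taken from the slides; the transactions are the four rows of the table above.

```python
# The four transactions from the table, keyed by TID.
transactions = {
    1: {"computer", "printer", "scanner", "antivirus"},
    2: {"computer", "printer", "scanner"},
    3: {"computer", "antivirus"},
    4: {"antivirus", "scanner"},
}

def support(itemset, db):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for items in db.values() if itemset <= items)
    return hits / len(db)

def confidence(antecedent, consequent, db):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(both, db) / support(antecedent, db)

sup = support({"computer", "printer"}, transactions)
conf = confidence({"computer"}, {"printer"}, transactions)
print(f"support = {sup:.2%}, confidence = {conf:.2%}")
# prints: support = 50.00%, confidence = 66.67%
```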
Market Basket Analysis…
• Analyzes customer buying habits by finding associations between different items that customers place in their “shopping baskets”.
• The discovery of interesting correlations can help retailers develop marketing strategies by giving insight into which items are frequently purchased together by customers.
• This information leads to increased sales by helping retailers do selective marketing and plan their shelf space.
• Basket data: retail organizations, e.g., supermarkets, collect and store massive amounts of sales data, called basket data. A record consists of
– transaction date
– items bought
• Alternatively, basket data may consist of items bought by a customer over a period.
BI continued
• Association Rules
– Market Baskets
– Frequent Itemsets
– A-priori Algorithm
• The Market-Basket Model
– A large set of items, e.g., things sold in a supermarket.
– A large set of baskets, each of which is a small set of the items, e.g., the things one customer buys on one day.
BI continued
Example
Items = {milk, coke, pepsi, beer, juice}.
Support threshold = 3 baskets.
B1 = {m, c, b}    B2 = {m, p, j}
B3 = {m, b}       B4 = {c, j}
B5 = {m, p, b}    B6 = {m, c, b, j}
B7 = {c, b, j}    B8 = {b, c}
Frequent itemsets: {m}, {c}, {b}, {j}, {m, b}, {c, b}, {j, c}.
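The frequent itemsets in this example can be reproduced with a small level-wise search. This is a simplified sketch in the spirit of the A-priori idea (larger candidates are built only from items that survive the previous level), not a full A-priori implementation; the function name and the `min_count` parameter are assumptions made for illustration.

```python
from itertools import combinations

# The eight baskets from the example (m=milk, c=coke, p=pepsi, b=beer, j=juice).
baskets = [
    {"m", "c", "b"}, {"m", "p", "j"}, {"m", "b"}, {"c", "j"},
    {"m", "p", "b"}, {"m", "c", "b", "j"}, {"c", "b", "j"}, {"b", "c"},
]

def frequent_itemsets(baskets, min_count):
    """Level-wise search: size-k candidates use only items that still
    occur in some frequent itemset of size k-1."""
    items = sorted(set().union(*baskets))
    result = {}
    k = 1
    while items:
        level = {}
        for candidate in combinations(items, k):
            count = sum(1 for b in baskets if set(candidate) <= b)
            if count >= min_count:
                level[candidate] = count
        if not level:
            break  # no frequent itemset of size k, so none of size k+1
        result.update(level)
        items = sorted({i for itemset in level for i in itemset})
        k += 1
    return result

found = frequent_itemsets(baskets, min_count=3)
for itemset, count in found.items():
    print(set(itemset), count)
```

Note that {j, c} appears in exactly 3 baskets, so it is frequent only when the threshold is read as "at least 3".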
Why Association Rule Mining
• Do you buy a printer when you buy a computer while visiting ABC company?
• Do you often use Google Drive when you use Gmail?
• Do you order tea when you order bread?
• Given a database of transactions, where each transaction is a list of items purchased by a customer in a visit
• Find all rules that correlate the presence of one set of items (itemset) with that of another set of items
Why Association Rule Mining
• Support
– Simplest question: find sets of items that appear “frequently” in the baskets.
– Support for itemset I = the number of baskets containing all items in I.
– Given a support threshold s, sets of items that appear in at least s baskets are called frequent itemsets.
Association mining from frequent Pattern
• The rule A → B holds in the transaction set D with support s, where s is the percentage of transactions in D that contain A ∪ B (i.e., the union of itemsets A and B, or say, both A and B).
• This is taken to be the probability P(A ∪ B) = (number of transactions in D containing both A and B) / (total number of transactions in D).
• Support shows the probability that all the predicates in A and B are fulfilled together.
Association mining from frequent Pattern
• The rule A → B has confidence c in the transaction set D, where c is the percentage of transactions in D containing A that also contain B.
• This is taken to be the conditional probability P(B|A) = support(A ∪ B) / support(A).
• Confidence measures how often the predicates in B are fulfilled when the predicates in A are fulfilled.
• I.e.,
support(A → B) = P(A ∪ B)
confidence(A → B) = P(B|A)
Association Rule- Basic Concepts
• Association rule form:
– Antecedent → Consequent [support, confidence]
• Examples:
– buys(x, “computer”) → buys(x, “financial Mgt. software”) [0.5%, 60%]
– age(x, “30..39”) ^ income(x, “50000”) → buys(x, “car”) [1%, 75%]
– buys(x, “shoe”) → buys(x, “sock”) [60%, 80%]
– major(x, “MBA”) ^ takes(x, “Managerial Economics”) → grade(x, “A”) [50%, 75%]
Association Rule- Basic Concepts
Example
B1 = {m, c, b}    B2 = {m, p, j}
B3 = {m, b}       B4 = {c, j}
B5 = {m, p, b}    B6 = {m, c, b, j}
B7 = {c, b, j}    B8 = {b, c}
An association rule: {m, b} → c.
Confidence = 2/4 = 50%: four baskets contain both m and b (B1, B3, B5, B6), and two of them (B1, B6) also contain c.
Association Rule- Basic Concepts
TID | Items
1 | I1, I3, I2
2 | I1, I2, I3, I4
3 | I5, I2, I3, I6
4 | I1, I5, I2, I3
5 | I1, I5, I2, I6, I3
• Support count (σ)
– Frequency of occurrence of an itemset
– E.g. σ({I1, I2, I3}) = 4 (transactions 1, 2, 4 and 5; transaction 3 lacks I1)
• Support
– Fraction of transactions that contain an itemset
– E.g. s({I1, I2, I3}) = 4/5
• Frequent Itemset
– An itemset whose support is greater than or equal to a minsup threshold
A frequent (or large) itemset is an itemset whose number of occurrences is at least a threshold s. The notation L is used to indicate a large or frequent itemset, and I is used to indicate a specific target itemset.
Classification
 Classification recognizes patterns that describe the group
to which an item belongs by examining existing items
that have been classified and by inferring a set of rules.
 For example, businesses such as credit card or
telephone companies worry about the loss of steady
customers.
 Classification helps discover the characteristics of
customers who are likely to leave and can provide a
model to help managers predict who those customers
are so that the managers can devise special campaigns
to retain such customers.
Classification Task
• Classification task: a two-step process
– Model construction
– Model usage
• Model construction: describing a set of predetermined classes
– Each tuple/sample is assumed to belong to a predefined class, as determined by the class label attribute.
– The set of tuples used for model construction is the training set.
– The model is represented as classification rules, decision trees, or mathematical formulae.
Classification Task
Model Construction: Training Data → Classification Algorithms → Classifier (Model)
Training data:
NAME | RANK | YEARS | TENURED
Damail | Assistant Prof | 5 | no
Yordanos | Assistant Prof | 9 | yes
Zuriash | Professor | 12 | yes
Moha | Associate Prof | 8 | yes
Dawit | Assistant Prof | 7 | no
Aman | Associate Prof | 8 | yes
Classifier (model):
IF rank = ‘professor’ OR years > 7 THEN tenured = ‘yes’
PART 1: General Introduction- Classification Task
Model Usage in Prediction: Testing Data → Classifier → Unseen Data
Testing data:
NAME | RANK | YEARS | TENURED
Kedir | Assistant Prof | 4 | no
Abebe | Associate Prof | 8 | yes
Kebede | Professor | 9 | yes
Alima | Assistant Prof | 9 | yes
Unseen data: (Merga, Professor, 8). Tenured?
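The two steps can be illustrated with the tenure data from the slides. In this sketch the rule is hard-coded exactly as it appears on the model construction slide rather than learned by an algorithm; the function names `classify` and `accuracy` are illustrative assumptions.

```python
# (name, rank, years, tenured) tuples copied from the slides.
training = [
    ("Damail", "Assistant Prof", 5, "no"),
    ("Yordanos", "Assistant Prof", 9, "yes"),
    ("Zuriash", "Professor", 12, "yes"),
    ("Moha", "Associate Prof", 8, "yes"),
    ("Dawit", "Assistant Prof", 7, "no"),
    ("Aman", "Associate Prof", 8, "yes"),
]
testing = [
    ("Kedir", "Assistant Prof", 4, "no"),
    ("Abebe", "Associate Prof", 8, "yes"),
    ("Kebede", "Professor", 9, "yes"),
    ("Alima", "Assistant Prof", 9, "yes"),
]

def classify(rank, years):
    """IF rank = 'professor' OR years > 7 THEN tenured = 'yes'."""
    return "yes" if rank.lower() == "professor" or years > 7 else "no"

def accuracy(rows):
    """Share of rows whose known label matches the rule's prediction."""
    correct = sum(1 for _, rank, years, label in rows
                  if classify(rank, years) == label)
    return correct / len(rows)

# Step 2a: estimate accuracy on the labelled testing data.
print(accuracy(training), accuracy(testing))
# Step 2b: use the model on the unseen tuple (Merga, Professor, 8).
print(classify("Professor", 8))
```

The rule agrees with every training and testing tuple, and it predicts that Merga is tenured.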
Bayesian Theorem: for prediction
• Let X be a data sample (“evidence”) whose class label is unknown.
• Let H be a hypothesis that X belongs to class C.
• Classification is to determine P(H|X), known as the posterior probability: the probability that the hypothesis holds given the observed data sample X.
• P(H) (prior or a priori probability): the initial probability of H.
– E.g., X will buy a computer, regardless of age, income, …
• P(X): the probability that the sample data is observed. It is the prior probability of X, also called the marginal probability.
• P(X|H) (the likelihood): the probability of observing the sample X, given that the hypothesis holds.
• Given training data X, the posterior probability of a hypothesis H, P(H|X), follows Bayes' theorem:
P(H|X) = P(X|H) P(H) / P(X)
Naïve Bayesian Classifier: Training Dataset
Classes:
C1: buys_comp = ‘yes’
C2: buys_comp = ‘no’
Data sample to classify:
X = (age <= 30, income = medium, student = yes, credit_rating = fair)
Task: classify X using the Bayesian classifier.
age | income | student | credit_rating | buys_computer
<=30 | high | no | fair | no
<=30 | high | no | excellent | no
31…40 | high | no | fair | yes
>40 | medium | no | fair | yes
>40 | low | yes | fair | yes
>40 | low | yes | excellent | no
31…40 | low | yes | excellent | yes
<=30 | medium | no | fair | no
<=30 | low | yes | fair | yes
>40 | medium | yes | fair | yes
<=30 | medium | yes | excellent | yes
31…40 | medium | no | excellent | yes
31…40 | high | yes | fair | yes
>40 | medium | no | excellent | no
Naïve Bayesian Classifier: An Example
We need to maximize P(X|Ci)P(Ci), for i=1,2.

P(Ci): P(buys_Comp= “yes”) = 9/14 = 0.643

P(buys_Comp= “no”) = 5/14= 0.357

Compute P(X|Ci) for each class:

P(age = “<=30” | buys_Comp= “yes”) = 2/9 = 0.222

P(age = “<= 30” | buys_Comp= “no”) = 3/5 = 0.6

P(income = “medium” | buys_Comp= “yes”) = 4/9 = 0.444

P(income = “medium” | buys_Comp= “no”) = 2/5 = 0.4

P(student = “yes” | buys_Comp= “yes”) = 6/9 = 0.667

P(student = “yes” | buys_Comp= “no”) = 1/5 = 0.2

P(credit_rating = “fair” | buys_Comp= “yes”) = 6/9 = 0.667

P(credit_rating = “fair” | buys_Comp= “no”) = 2/5 = 0.4
Naïve Bayesian Classifier: An Example(1)
• X = (age <= 30, income = medium, student = yes, credit_rating = fair)
• P(X|Ci):
P(X|buys_Comp= “yes”) = 0.222 x 0.444 x 0.667 x 0.667 = 0.044
P(X|buys_Comp= “no”) = 0.6 x 0.4 x 0.2 x 0.4 = 0.019
• P(X|Ci)P(Ci):
P(X|buys_Comp= “yes”) * P(buys_Comp= “yes”) = 0.044 x 0.643 = 0.028
P(X|buys_Comp= “no”) * P(buys_Comp= “no”) = 0.019 x 0.357 = 0.007
• Therefore, X belongs to class (“buys_Comp= yes”)
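The hand calculation can be checked by counting directly over the 14 training tuples. This is a minimal sketch of a naive Bayesian classifier without any smoothing (an attribute value never seen with a class gives a probability of 0); the function names are illustrative, and the age range is written `31..40` in ASCII.

```python
from collections import Counter, defaultdict

# (age, income, student, credit_rating, buys_computer) from the training table.
rows = [
    ("<=30", "high", "no", "fair", "no"),
    ("<=30", "high", "no", "excellent", "no"),
    ("31..40", "high", "no", "fair", "yes"),
    (">40", "medium", "no", "fair", "yes"),
    (">40", "low", "yes", "fair", "yes"),
    (">40", "low", "yes", "excellent", "no"),
    ("31..40", "low", "yes", "excellent", "yes"),
    ("<=30", "medium", "no", "fair", "no"),
    ("<=30", "low", "yes", "fair", "yes"),
    (">40", "medium", "yes", "fair", "yes"),
    ("<=30", "medium", "yes", "excellent", "yes"),
    ("31..40", "medium", "no", "excellent", "yes"),
    ("31..40", "high", "yes", "fair", "yes"),
    (">40", "medium", "no", "excellent", "no"),
]

def train(rows):
    """Count class labels and, per class, each attribute value."""
    class_counts = Counter(r[-1] for r in rows)
    # value_counts[label][attribute_index][value] = number of tuples
    value_counts = defaultdict(lambda: defaultdict(Counter))
    for *values, label in rows:
        for i, v in enumerate(values):
            value_counts[label][i][v] += 1
    return class_counts, value_counts

def score(x, label, class_counts, value_counts):
    """P(X|Ci) * P(Ci) under the naive independence assumption."""
    p = class_counts[label] / sum(class_counts.values())
    for i, v in enumerate(x):
        p *= value_counts[label][i][v] / class_counts[label]
    return p

class_counts, value_counts = train(rows)
x = ("<=30", "medium", "yes", "fair")
scores = {c: score(x, c, class_counts, value_counts) for c in class_counts}
print({c: round(s, 3) for c, s in scores.items()})
print("predicted class:", max(scores, key=scores.get))
```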
Naive Bayesian Classifier Example 2: play tennis?

Outlook Temperature Humidity Windy Class

sunny hot high false N
sunny hot high true N
overcast hot high false P
rain mild high false P
rain cool normal false P
rain cool normal true N
overcast cool normal true P
sunny mild high false N
sunny cool normal false P
rain mild normal false P
sunny mild normal true P
overcast mild high true P
overcast hot normal false P
rain mild high true N
Naive Bayesian Classifier Example
Tuples with class P (9):
Outlook Temperature Humidity Windy Class
overcast hot high false P
rain mild high false P
rain cool normal false P
overcast cool normal true P
sunny cool normal false P
rain mild normal false P
sunny mild normal true P
overcast mild high true P
overcast hot normal false P

Tuples with class N (5):
Outlook Temperature Humidity Windy Class
sunny hot high false N
sunny hot high true N
rain cool normal true N
sunny mild high false N
rain mild high true N
Naive Bayesian Classifier Example
• Given the training set, we compute the probabilities:
Outlook | P | N
sunny | 2/9 | 3/5
overcast | 4/9 | 0
rain | 3/9 | 2/5
Temperature | P | N
hot | 2/9 | 2/5
mild | 4/9 | 2/5
cool | 3/9 | 1/5
Humidity | P | N
high | 3/9 | 4/5
normal | 6/9 | 1/5
Windy | P | N
true | 3/9 | 3/5
false | 6/9 | 2/5
• We also have the class probabilities
– Prob(P) = 9/14
– Prob(N) = 5/14
Naive Bayesian Classifier Example
• To classify a new sample X:
– outlook = sunny
– temperature = cool
– humidity = high
– windy = false
• Prob(P|X) ∝ Prob(P)*Prob(sunny|P)*Prob(cool|P)*Prob(high|P)*Prob(false|P) = 9/14*2/9*3/9*3/9*6/9 = 0.0106
• Prob(N|X) ∝ Prob(N)*Prob(sunny|N)*Prob(cool|N)*Prob(high|N)*Prob(false|N) = 5/14*3/5*1/5*4/5*2/5 = 0.0137
• Since 0.0137 > 0.0106, X takes class label N
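The two products can be evaluated exactly with Python's `fractions` module, using only the probabilities from the table. The variable and function names are illustrative; the final normalisation step (dividing by the sum of the two scores) is an addition that turns the scores into posterior probabilities and is not shown on the slides.

```python
from fractions import Fraction as F

# Class priors and the conditional probabilities needed for X.
prior = {"P": F(9, 14), "N": F(5, 14)}
cond = {
    "P": {"sunny": F(2, 9), "cool": F(3, 9), "high": F(3, 9), "false": F(6, 9)},
    "N": {"sunny": F(3, 5), "cool": F(1, 5), "high": F(4, 5), "false": F(2, 5)},
}
x = ["sunny", "cool", "high", "false"]

def unnormalised(label):
    """Prob(label) times the product of Prob(value | label) over X."""
    p = prior[label]
    for value in x:
        p *= cond[label][value]
    return p

scores = {c: unnormalised(c) for c in prior}
total = sum(scores.values())
for c, s in scores.items():
    # exact fraction, decimal score, and normalised posterior
    print(c, s, round(float(s), 4), round(float(s / total), 3))
```

The exact scores are 2/189 for P and 12/875 for N, so N wins with a normalised posterior of roughly 0.56.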
What is Cluster Analysis?

Finding groups of objects such that the objects in a group will be similar (or
related) to one another and different from (or unrelated to) the objects in other
groups

[Figure: intra-cluster distances are minimized; inter-cluster distances are maximized]
Clustering

 Clustering works in a manner similar to classification when

no groups have yet been defined.
 A data mining tool can discover different groupings within
data, such as finding affinity groups for bank cards or
partitioning a database into groups of customers based on
demographics and types of personal investments
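As an illustration of discovering groupings that were not defined in advance, here is a tiny one-dimensional k-means sketch. The slides do not name a clustering algorithm, so the choice of k-means, the customer ages, and the starting centroids are all assumptions made for this example.

```python
def k_means_1d(values, centroids, iterations=10):
    """Tiny k-means on numbers: assign each value to its nearest
    centroid, then move each centroid to the mean of its cluster."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break  # assignments are stable
        centroids = updated
    return centroids, clusters

# Hypothetical customer ages; no group labels are given in advance.
ages = [22, 25, 27, 41, 44, 46, 63, 67]
centroids, clusters = k_means_1d(ages, centroids=[20.0, 40.0, 60.0])
print(clusters)
print([round(c, 1) for c in centroids])
```

Each pass shrinks the distances inside a cluster, which is the "intra-cluster distances are minimized" goal stated in the definition of cluster analysis.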
Exercise
• The following sample dataset is taken from the ABC spare part shop database. Consider 60% and 80% as the minimum support and minimum confidence, respectively.
ID | Spare part type
1 | Tyre, Innertube, Seatbelts, Brake, FuelLine, FuelFilter
2 | Tyre, Innertube, Seatbelts, FuelLine, FuelFilter, Fuel Tank
3 | FuelLine, FuelFilter, Fuel Tank
4 | FuelLine, FuelFilter, Fuel Tank, Foot Brake
5 | FuelLine, FuelFilter, Airbags
1. Find the frequent itemsets at each level.
2. Generate the strong rules.
3. Based on your findings, give advice to salespeople or managers to improve the business.
What is Machine Learning?

Machine learning allows

computers to learn and
infer from data.
for successful BI: Align Business and IT for
the Long Haul

No ratings yet
Data Analyst RoadMap
1 page
Value Nets (Review and Analysis of Bovet and Martha's Book)
From Everand
Value Nets (Review and Analysis of Bovet and Martha's Book)
BusinessNews Publishing
No ratings yet

Chapter BI4

Uploaded by

Chapter BI4

Uploaded by

Chapter four

I can’t get the data I need

I can’t understand the data I found

I can’t use the data I found

Data Mining provides the Enterprise with intelligence

Sales Administration Finance Manufacturing ...

· Collects and combines information

Extractor/ Extractor/ Extractor/

Source Source Source

It is a technique for assembling and

Process of constructing (and using) a

What is a data warehouse?

Quick and efficient Often uses a process called

The Warehousing Approach

Stored in warehouse

• how business intelligence works

Predictive modeling Predict value for a specific data item attribute

Association, correlation, causality analysis Identify relationships between attributes

Making discovered knowledge easily

Basket Data: Retail organizations, e.g., supermarkets, collect and store

massive amounts of sales data, called basket data. A record consists of

• Do you often use Google Drive when you use Gmail?

• Do you order tea when you order bread?

• Given a database of transactions, where each transaction is

• This is taken to be the probability, P(A ∪ B) =

• Support shows the probability that all the predicates in A and

• Confidence measures how often the predicates in B are fulfilled if

• age(x, “30..39”) ^ income(x, “50000”) → buys(x, “car”) [1%,

• buys(x, “shoe”) → buys(x, “sock”) [60%, 80%]

• major(x, “MBA”) ^ takes(x, “Managerial Economics”) →

A frequent (or large) itemset is an itemset whose number of occurrences is
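The support and confidence measures described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the slides: the basket data below is invented for the example (it borrows the bread and tea items from the question above), and the function names `support` and `confidence` are hypothetical helpers.

```python
from typing import FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(
    transactions: List[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> float:
    """Among transactions containing A, the fraction that also contain B."""
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    n_a = sum(1 for t in transactions if a <= t)
    n_both = sum(1 for t in transactions if both <= t)
    return n_both / n_a if n_a else 0.0


# Hypothetical basket data (assumption: made up for illustration only).
baskets = [
    frozenset({"bread", "tea"}),
    frozenset({"bread", "tea", "milk"}),
    frozenset({"bread"}),
    frozenset({"tea"}),
    frozenset({"bread", "tea"}),
]

# Rule: buys(x, "bread") -> buys(x, "tea")
rule_support = support(baskets, {"bread", "tea"})
rule_confidence = confidence(baskets, {"bread"}, {"tea"})
print(f"support = {rule_support:.2f}, confidence = {rule_confidence:.2f}")
```

On this toy data, three of the five baskets contain both items and four contain bread, so the rule has support 0.60 and confidence 0.75.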

• Classification Task: A Two-Step Process

NAME RANK YEARS TENURED Classifier

Model Usage in Prediction:

• Let X be a data sample (“evidence”): class label is unknown

P(Ci): P(buys_Comp= “yes”) = 9/14 = 0.643

Compute P(X|Ci) for each class:

P(age = “<=30” | buys_Comp= “yes”) = 2/9 = 0.222

P(income = “medium” | buys_Comp= “yes”) = 4/9 = 0.444

P(student = “yes” | buys_Comp= “yes”) = 6/9 = 0.667

P(credit_rating = “fair” | buys_Comp= “yes”) = 6/9 = 0.667

• X = (age <= 30, income = medium, student = yes,

• P(X|Ci)*P(Ci) : P(X|buys_Comp= “yes”) * P(buys_Comp= “yes”)

• P(X|buys_Comp= “no”) * P(buys_Comp= “no”)
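As a rough check of the arithmetic, the naive Bayes score P(X|Ci) * P(Ci) for the “yes” class can be computed directly from the fractions listed above. The sketch assumes the usual naive Bayes independence assumption, so the conditional probabilities are simply multiplied. The helper name `naive_bayes_score` is hypothetical. The “no” class probabilities do not appear in this excerpt, so only the “yes” score is computed here.

```python
from fractions import Fraction
from math import prod
from typing import Iterable


def naive_bayes_score(prior: Fraction, likelihoods: Iterable[Fraction]) -> Fraction:
    """Return P(X|Ci) * P(Ci), treating the attributes as conditionally independent."""
    return prior * prod(likelihoods)


# Values copied from the slide for class buys_Comp = "yes".
prior_yes = Fraction(9, 14)      # P(buys_Comp = "yes")
likelihoods_yes = [
    Fraction(2, 9),              # P(age = "<=30" | yes)
    Fraction(4, 9),              # P(income = "medium" | yes)
    Fraction(6, 9),              # P(student = "yes" | yes)
    Fraction(6, 9),              # P(credit_rating = "fair" | yes)
]

score_yes = naive_bayes_score(prior_yes, likelihoods_yes)
print(f"P(X|yes) * P(yes) = {float(score_yes):.3f}")
```

The product comes to about 0.028. The sample X is assigned to whichever class gives the larger score once the same calculation is done for the “no” class.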

Outlook Temperature Humidity Windy Class

Outlook Temperature Humidity Windy Class

• Given the training set, we compute the probabilities:

• Clustering works in a manner similar to classification when

Machine learning allows
