DMBI - For MBA - Unit V
By
Mr.R.Rajesh MBA,M.Com,M.Phil
Assistant Professor of MBA Department
Ganadipathy Tulsi’s Jain Engineering College
BI & DM Applications in various sectors
Business Intelligence solutions can be applied to virtually any
process within a company that uses data. BI provides data
for use or decision making in:
Software. IT departments or third-party companies use
data from BI for developing applications.
Digital Marketing. Marketing teams need to track the
progress of campaigns, measure results, get valuable
feedback.
Finance. Finance teams get financial and operational data,
save time on modeling scenarios and forecasting results.
Executives can understand the numbers from every
customer, product, and process on a daily basis and make
major strategic decisions.
Sales - get meaningful insights to increase revenue &
sales.
Human Resources - get data on hiring campaigns,
combine data from different departments, and measure the
impact on the organization’s performance.
Travel and Hospitality - analyze customers’ online big data
from the travel industry to make better decisions on service
offerings.
Manufacturing - improving operations, optimizing pricing
strategies, streamlining fulfillment processes.
Retail/eCommerce - get the overview on business
performance on different levels.
Account Management Teams - Monitor account
performance, Revenue tracking and Regulatory reporting.
Operations - Ticket backlog, SLAs, call centre statistics,
technical performance indicators and many more…
Retailing
The retail industry is now paying due attention to BI platforms,
particularly in the areas of product, customer and functional
acumen.
Factors that have compelled retailers to implement BI
software include intensified competition, pressure to raise
profits, widespread credit card use, the popularity of loyalty
cards, and radio frequency identification (RFID).
Business Intelligence in Retail – Purpose
Allows the retailer to gain high quality information through
the use of BI tools like data warehousing, data mining, and
online analytical processing.
BI allows organizations to predict the behavior of their
competitors, suppliers, customers, technologies,
acquisitions, markets, products and services.
BI helps retailers use data sources such as Point of Sale
transactions and social media, which give unprecedented
access to the customers’ mind.
Retailers are also using retail management technologies
like Self Checkout POS, RFID and Cloud Computing, which
provide them with a real-time, integrated and collaborative
information system.
BI further helps the retailer keep a vigilant eye on business
activities by estimating long- and short-term demand,
flagging low inventory, and monitoring factors that
influence customer buying decisions.
Business Intelligence in Retail – Example of a Data Capture Map
[Figure: data capture map with panels titled Marketing and Inventory Optimization; (I) and (E) are tags used on the slide. Labels shown:]
• Loyalty Scheme Data (I/E)
• Track Consumer Preferences
• Personal Promotional Offer
• Real Time Sales Data (I/E)
• Price Decisions
• Layout – heat maps, customer buying habits (I)
• Layout – Shoppers platform acceptance (E)
• Consumer Insight to Substitution acceptance
• Store Layout – Maximization in minimal space (I/E)
• Stock Deployment
• Replenishment
• Procurement
• SKU Analyzation
• SKU Deployment
• Logistics Optimization
• Service Level Optimization
Business Intelligence in Retail – BI Example of
Price Decisions
Methodology - How AMAZON uses BI
Personalized Recommendation System:
Uses a comprehensive collaborative filtering engine (CFE)
-BI allows Amazon to analyze previous purchases to suggest new
items
-Uses your recommendations to suggest new purchases to others
who bought similar items
-Pulls information from your searches and wish list to recommend new
purchases
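The idea of suggesting items "to others who bought similar items" can be sketched with a simple co-occurrence count. This is only a minimal illustration of item-based collaborative filtering, not Amazon's actual engine; the function names and the purchase histories below are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence(orders):
    """Count how often each pair of items appears in the same customer's history."""
    counts = defaultdict(lambda: defaultdict(int))
    for items in orders.values():
        for a, b in combinations(sorted(set(items)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def recommend(customer, orders, top_n=3):
    """Suggest items that were often bought together with the customer's items."""
    owned = set(orders[customer])
    counts = build_cooccurrence(orders)
    scores = defaultdict(int)
    for item in owned:
        for other, n in counts[item].items():
            if other not in owned:
                scores[other] += n
    # Highest score first; ties are broken alphabetically for a stable result.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical purchase histories (illustrative data only).
orders = {
    "ann": ["kindle", "case", "charger"],
    "bob": ["kindle", "case"],
    "cat": ["kindle", "charger", "lamp"],
    "dan": ["kindle"],
}
print(recommend("dan", orders))
```

A production recommender would also weight by recency, search and wish-list signals as the bullets above describe; the sketch only uses past purchases.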
Book Recommendations from Kindle Highlighting:
-Uses social networking services to send Kindle highlighted notes to
others for book discussions
-Uses highlight function to determine what other books you might
like
One-Click Ordering / Price Optimization
-Auto-fills shipping and payment methods to allow for quick
purchase.
-Sets prices based on website activity, competitors' prices,
product availability, item preferences, order history, and expected profit
margin. Product prices change every 10 minutes as data is analyzed.
Methodology - How AMAZON uses BI
Anticipatory Shipping Model:
-Uses big data to predict when you are likely to order the same
product again and pre-stages the items at a nearby distribution
center (DC) so they are ready to ship
-Uses predictive analytics to increase product sales by
suggesting it's time to buy or creating personalized sales for
pre-staged items
Supply Chain Optimization:
-Links to manufacturers to track their inventory
-Uses analytics to determine warehouse closest to customers
-Uses graph theory to help decide best delivery schedule,
route and product groupings to reduce
CHALLENGES
Amazon has been making big investments in big data analysis. It will continue to be an industry disruptor, but will face challenges:
- Expanded Go stores
- Drone delivery
- VR real world shopping experience
Business Intelligence in Retail – Benefits of BI
The benefits associated with BI adoption in the retail sector
include accurate decision making, efficient service delivery and
competitive advantage.
These benefits include
- Better customer focus
- Ability to anticipate changes in the market earlier
- Ability to manage prices better
- More efficient service delivery
- More robust forecasting of future trends
- More efficient use of resources
- Improved sharing of inter-department knowledge
- Easier to manage costs
- Strengthens strategic planning
- Better quality of information for improved
decision making
Business intelligence in banking sector
KPIs in retail banking
KPIs in retail banking may include any factors that are linked
to the performance of a retail bank.
Risk Management
Probability of loan default and expected recovery of loan
default – Important for loan pricing
Early detection and prevention of credit card fraud
Analyzing credit portfolios, enabling banks to quickly identify
potential delinquency cases
Determine overall financial health
Information about volatility in current economic environment
Accurately estimating the risk of customer loans based on: The
financial assets and earning capacity of the borrower
The prevailing economic climate
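Why default probability and expected recovery matter for loan pricing can be seen from the standard expected-loss relation: expected loss = probability of default x (1 - recovery rate) x exposure. The sketch below is a worked example under that assumption; the function name and the loan figures are hypothetical and not taken from any bank's model.

```python
def expected_loss(prob_default, recovery_rate, exposure):
    """Expected loss = PD x (1 - recovery rate) x exposure at default."""
    if not 0.0 <= prob_default <= 1.0 or not 0.0 <= recovery_rate <= 1.0:
        raise ValueError("probabilities and rates must lie in [0, 1]")
    return prob_default * (1.0 - recovery_rate) * exposure


# Hypothetical loan: 100,000 lent, 4% chance of default, 40% recovered on default.
loan = {"exposure": 100_000.0, "prob_default": 0.04, "recovery_rate": 0.40}
loss = expected_loss(loan["prob_default"], loan["recovery_rate"], loan["exposure"])
print(round(loss, 2))
```

The interest margin on the loan has to cover at least this expected loss, which is why both inputs feed into pricing.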
Need of business intelligence and Data warehousing
Customer segmentation
Required to define, based on profitability, the amount of service
and attention to be provided to each customer
Better understand customer needs and sentiments regarding
banking
Effectively tailored products and services for a segment
Effective customer profiling according to the segment
Determine profitability across branches and products
Identify and develop new cross-sell and up-sell opportunities
and marketing campaigns accordingly
Regulatory requirement
Regulatory requirements indicated by the RBI for preparation
of Off-site Monitoring Surveillance (OSMOS) Reports on a
regular basis in electronic format
Asset Liability Management (ALM) guidelines for banks being
implemented by the RBI w.e.f. April 1, 1999
Regulatory requirement of filing of statutory returns such as
the one under Section 42 of the Reserve Bank of India Act,
1934
Need for timely submission of Balance Sheets and Profit & Loss
Accounts
Need for Inter-Branch Reconciliation of Accounts within a
definite time frame
Business intelligence in banking sector
Data Mining
Used to identify motives of individuals when shopping,
likewise when locating criminals
Discovers hidden patterns and relationships throughout
large amounts of information
Tactical Analysis
Creating models that represent a crime or crimes that can
be connected to identify cases to locate suspects
Breaks down crime based on day and time and other
variables
Techniques Utilized
Behavioral Analysis
Predict future crime based on relationships or behavior of
criminals
Using past criminal records to categorize performance
Use of Dashboards
Visual display of the most important information necessary to
achieve one or more objectives; consolidated and
arranged on a single screen so the information can be
monitored at a single glance.
- Dashboard Insight.com
Ability to locate & prevent crime (utilizing maps and
charts) in real-time
[Figure: Chromosome, Gene expression]
Affymetrix Microarrays
[Figure: microarray chip, labelled 1.28 cm and 50 µm]
~10^7 oligonucleotides,
half Perfectly Match mRNA (PM),
half have one Mismatch (MM)
Gene expression computed from PM and MM
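One simple way to turn PM and MM probe intensities into a single expression value is the average difference over the probe pairs for a gene. The slide does not say which formula is used, so this is an assumption for illustration; the function name and the intensity values are hypothetical.

```python
from statistics import mean


def average_difference(pm, mm):
    """Mean of (PM - MM) across the probe pairs of one gene."""
    if len(pm) != len(mm) or not pm:
        raise ValueError("need the same non-zero number of PM and MM values")
    return mean(p - m for p, m in zip(pm, mm))


# Hypothetical intensities for four probe pairs of one gene.
pm = [1200, 980, 1500, 1100]
mm = [300, 400, 350, 250]
print(average_difference(pm, mm))
```

The mismatch probe acts as a per-pair background estimate, so subtracting it removes part of the non-specific signal.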
Microarray Potential Applications
• New and better molecular diagnostics
• New molecular targets for therapy
– few new drugs, large pipeline, …
• Outcome depends on genetic signature
– best treatment?
• Fundamental Biological Discovery
– finding and refining biological pathways
• Personalized medicine ?!
Microarray Data Mining Challenges
• Avoiding false positives, due to
– too few records (samples), usually < 100
– too many columns (genes), usually > 1,000
• Model needs to be robust in presence of noise
• For reliability need large gene sets; for
diagnostics or drug targets, need small gene
sets
• Estimate class probability
• Model needs to be explainable to biologists
CATs: Clementine Application Templates
Preparation
Key Ideas
• Capture the complete process
• Cross-validation loop with feature selection inside
• Randomization to select significant genes
• Internal iterative feature selection loop
• For each class, separate selection of optimal
gene sets
• Neural nets – robust in presence of noise
• Bagging of neural nets
Microarray Classification
[Figure: flow diagram with boxes labelled Train data, Feature and Parameter Selection]
Classification: External X-val
[Figure: flow diagram with boxes labelled Gene Data, Train data, Feature and Parameter Selection, Train, Final Results]
Measuring false positives with randomization
Gene    Class   Rand Class
178     1       2
105     1       1
4174    2       1
7133    2       2
Randomize 500 times
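The randomization idea can be sketched as follows: shuffle the class labels, recompute the gene's T-value, and count how often a shuffled labelling scores at least as high as the real one. This is a minimal sketch, assuming a Welch-style T-value as the score; the function names and the expression values are hypothetical.

```python
import math
import random
from statistics import mean, variance


def t_value(values, labels):
    """T-value of the mean difference between class 1 and class 2."""
    a = [v for v, c in zip(values, labels) if c == 1]
    b = [v for v, c in zip(values, labels) if c == 2]
    diff = mean(a) - mean(b)
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    if se == 0:
        # No spread inside either class: any mean gap is maximally extreme.
        return 0.0 if diff == 0 else math.copysign(float("inf"), diff)
    return diff / se


def randomization_p_value(values, labels, n_rounds=500, seed=0):
    """Share of label shuffles whose |T| is at least the observed |T|."""
    rng = random.Random(seed)
    observed = abs(t_value(values, labels))
    shuffled = list(labels)
    hits = 0
    for _ in range(n_rounds):
        rng.shuffle(shuffled)  # plays the role of the "Rand Class" column
        if abs(t_value(values, shuffled)) >= observed:
            hits += 1
    return hits / n_rounds


# Hypothetical expression values: 5 samples of class 1, then 5 of class 2.
labels = [1] * 5 + [2] * 5
informative = [5.1, 4.8, 5.3, 5.0, 4.9, 2.1, 2.4, 1.9, 2.2, 2.0]
noise = [3.0, 3.4, 2.9, 3.1, 3.3, 3.2, 2.8, 3.5, 3.0, 3.1]
print(randomization_p_value(informative, labels))
print(randomization_p_value(noise, labels))
```

A gene whose score is rarely matched under shuffled labels is unlikely to be a false positive; a gene that is matched most of the time carries no real class signal.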
Gene Reduction improves
Classification
• Most learning algorithms look for non-linear
combinations of features and can easily find
many spurious combinations given a small number of
records and a large number of genes
• Classification accuracy improves if we first
reduce the number of genes by a linear method, e.g.
T-values of the mean difference
• Heuristic: select an equal number of genes from each class
• Then apply a favorite machine learning
algorithm
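The gene-reduction step can be sketched as ranking genes by the T-value of the mean difference for each class and keeping the same number of top genes per class. This is a minimal sketch under those assumptions; the function names, gene names and expression values are hypothetical.

```python
from statistics import mean, variance


def t_value(values, labels, cls):
    """T-value of the mean difference: samples in `cls` versus all others."""
    inside = [v for v, c in zip(values, labels) if c == cls]
    outside = [v for v, c in zip(values, labels) if c != cls]
    se = (variance(inside) / len(inside) + variance(outside) / len(outside)) ** 0.5
    # Degenerate case (no within-class spread) is scored 0 here for simplicity.
    return (mean(inside) - mean(outside)) / se if se > 0 else 0.0


def select_genes(expression, labels, per_class):
    """For each class, keep the `per_class` genes with the highest T-value."""
    selected = {}
    for cls in sorted(set(labels)):
        ranked = sorted(
            expression,
            key=lambda g: t_value(expression[g], labels, cls),
            reverse=True,
        )
        selected[cls] = ranked[:per_class]
    return selected


# Hypothetical expression values for four genes over six samples.
labels = ["ALL", "ALL", "ALL", "AML", "AML", "AML"]
expression = {
    "g1": [9.0, 8.5, 9.2, 2.0, 2.5, 1.8],  # high in ALL
    "g2": [1.0, 1.4, 0.9, 7.5, 8.0, 7.7],  # high in AML
    "g3": [5.0, 5.2, 4.9, 5.1, 5.0, 4.8],  # roughly flat
    "g4": [6.0, 6.5, 5.8, 4.0, 4.4, 3.9],  # moderately high in ALL
}
print(select_genes(expression, labels, per_class=1))
```

The reduced gene list is then passed to whichever learning algorithm is used for classification.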
Iterative Wrapper approach to
selecting the best gene set
• Test models using 1, 2, 3, …, 10, 20, 30, 40, ...,
100 top genes with cross-validation.
• Heuristic 1: evaluate errors from each class;
select the number of genes from each class that
minimizes error for that class
• For randomized algorithms, average 10+
cross-validation runs!
• Select gene set with lowest average error
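The wrapper loop can be sketched as: for each candidate gene count, run several cross-validations with gene selection repeated inside every fold, average the error, and keep the count with the lowest average. This is a simplified sketch: it uses one gene count for all classes and a nearest-centroid classifier as a stand-in for the neural nets, and the data generator, function names and parameters are hypothetical.

```python
import random
from statistics import mean, variance


def t_value(values, labels, cls):
    inside = [v for v, c in zip(values, labels) if c == cls]
    outside = [v for v, c in zip(values, labels) if c != cls]
    se = (variance(inside) / len(inside) + variance(outside) / len(outside)) ** 0.5
    # Degenerate case (no within-class spread) is scored 0 here for simplicity.
    return (mean(inside) - mean(outside)) / se if se > 0 else 0.0


def select_genes(rows, labels, per_class):
    """Indices of the top `per_class` genes for each class, ranked by T-value."""
    columns = list(zip(*rows))
    chosen = []
    for cls in sorted(set(labels)):
        ranked = sorted(range(len(columns)),
                        key=lambda g: -t_value(columns[g], labels, cls))
        chosen += ranked[:per_class]
    return sorted(set(chosen))


def nearest_centroid(train_rows, train_labels, genes, row):
    """Stand-in classifier: predict the class whose mean profile is closest."""
    best, best_dist = None, float("inf")
    for cls in sorted(set(train_labels)):
        members = [r for r, c in zip(train_rows, train_labels) if c == cls]
        dist = sum((row[g] - mean(m[g] for m in members)) ** 2 for g in genes)
        if dist < best_dist:
            best, best_dist = cls, dist
    return best


def cv_error(rows, labels, per_class, folds=5, seed=0):
    """Cross-validated error rate with gene selection inside each fold."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    errors = 0
    for f in range(folds):
        test = order[f::folds]
        train = [i for i in order if i not in test]
        tr_rows = [rows[i] for i in train]
        tr_labels = [labels[i] for i in train]
        genes = select_genes(tr_rows, tr_labels, per_class)  # training data only
        errors += sum(nearest_centroid(tr_rows, tr_labels, genes, rows[i]) != labels[i]
                      for i in test)
    return errors / len(rows)


def best_gene_count(rows, labels, candidates, runs=10):
    """Average several cross-validation runs and return the best gene count."""
    avg = {k: mean(cv_error(rows, labels, k, seed=s) for s in range(runs))
           for k in candidates}
    return min(candidates, key=lambda k: avg[k]), avg


def make_data(n_per_class=10, n_genes=30, seed=42):
    """Hypothetical synthetic data: gene 0 is high in class A, gene 1 in class B."""
    rng = random.Random(seed)
    rows, labels = [], []
    for cls, marker in (("A", 0), ("B", 1)):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            row[marker] += 5.0
            rows.append(row)
            labels.append(cls)
    return rows, labels


rows, labels = make_data()
best, avg = best_gene_count(rows, labels, [1, 2, 5, 10], runs=5)
print(best, avg)
```

Selecting genes inside each fold matters: choosing them on the full data first would leak test information and make the cross-validated error look better than it is.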
Clementine stream for subset
selection by x-validation
Microarrays: ALL/AML Example
• Leukemia: Acute Lymphoblastic (ALL) vs Acute
Myeloid (AML), Golub et al, Science, v.286,
1999
– 72 examples (38 train, 34 test), about 7,000 genes
– well-studied (CAMDA-2000), good test example
Gene subset selection: one cross-validation
[Figure: average error for 10-fold cross-validation (y-axis, 0% to 30%) versus genes per class (x-axis: 1, 2, 3, 4, 5, 10, 20, 30, 40)]
Gene subset selection:
multiple cross-validation runs
For ALL/AML data, 10 genes per class had the
lowest error (<1%)
ALL/AML: Results on the test data
• Genes selected and model trained on Train set
ONLY!
• Best Net with 10 top genes per class (20
overall) was applied to the test data (34
samples):
– 33 correct predictions (97% accuracy),
– 1 error on sample 66
• Actual Class AML, Net prediction: ALL
• other methods consistently misclassify sample 66 --
misclassified by a pathologist?
Pediatric Brain Tumour Data
• 92 samples, 5 classes (MED, EPD, JPA,
MGL, RHB) from U. of Chicago Children’s
Hospital
• Outer cross-validation with gene selection
inside the loop
• Ranking by absolute T-test value (selects top
positive and negative genes)
• Select best genes by adjusted error for each
class
• Bagging of 100 neural nets
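Bagging can be sketched as training many models on bootstrap resamples of the training set and letting them vote. To keep the example short, a nearest-centroid model stands in for each neural net; that substitution, the function names and the two-gene expression values are all hypothetical.

```python
import random
from collections import Counter
from statistics import mean


def train_centroids(rows, labels):
    """Stand-in base learner: one mean expression profile per class."""
    model = {}
    for cls in set(labels):
        members = [r for r, c in zip(rows, labels) if c == cls]
        model[cls] = [mean(col) for col in zip(*members)]
    return model


def predict_one(model, row):
    """Predict the class whose centroid is closest to the sample."""
    return min(sorted(model),
               key=lambda cls: sum((x - m) ** 2 for x, m in zip(row, model[cls])))


def bagged_models(rows, labels, n_models=100, seed=0):
    """Train one model per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(rows)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(train_centroids([rows[i] for i in idx],
                                      [labels[i] for i in idx]))
    return models


def bagged_predict(models, row):
    """Majority vote over all models; ties are broken alphabetically."""
    votes = Counter(predict_one(m, row) for m in models)
    return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))[0][0]


# Hypothetical two-gene expression values for two of the tumour classes.
rows = [[5.0, 1.0], [5.5, 0.8], [4.8, 1.2], [1.0, 5.0], [0.8, 5.4], [1.3, 4.7]]
labels = ["MED"] * 3 + ["JPA"] * 3
models = bagged_models(rows, labels, n_models=100)
print(bagged_predict(models, [5.2, 1.1]), bagged_predict(models, [0.9, 5.1]))
```

Averaging over many resampled models reduces the variance of an unstable learner, which is the reason for bagging randomly initialised neural nets on small sample sets.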
Selecting Best Gene Set
• Minimizing the combined error for all classes is not optimal