Week 5 Big Data Application in Business
Outline
• What is Big Data?
• Objectives for Big Data and Examples
• Key Components of Big Data System
• Applications of Big Data Analytics in Different Industries
• Big Data Analytics – Statistics Models
• Examples Using Big Data Analytics
• Necessary Steps in Big Data Analytics
• Reference
What is Big Data?
Source: Analytics: The real-world use of big data in financial services (ibm.com)
https://www.ibm.com/downloads/cas/E4BWZ1PY
Four characteristics of Big data
Source: https://en.m.wikipedia.org/wiki/Petabyte
What is Big Data?
Hadoop
▪ An open-source framework written in Java that allows
distributed processing of large datasets across clusters
of computers using simple programming models
▪ Stores large volumes of any kind of data
▪ Provides enormous processing power and can
handle a virtually limitless number of concurrent tasks
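The "simple programming model" most often associated with Hadoop is MapReduce. The following is a minimal single-machine sketch of that idea in plain Python, not Hadoop code: the function names and the sample log lines are illustrative assumptions, and each text chunk stands in for data held on a separate machine in the cluster.

```python
from collections import defaultdict

def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}

# Hypothetical log lines; each string plays the role of one node's data.
chunks = ["login error login", "error timeout", "login ok"]
pairs = [pair for chunk in chunks for pair in map_phase(chunk)]
word_counts = reduce_phase(shuffle(pairs))
print(word_counts)
```

In a real cluster the map and reduce functions run in parallel on many machines, while the framework handles the grouping step and any machine failures.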
What is Big Data?
Benefit of Big Data Application to Management
• Not just storing or managing massive volumes of data
• Able to analyze structured and unstructured data (e.g.
voice, text, log files, images or video) in a very
cost-effective way
• A relatively complete picture of customers and operations
What is Big Data?
Examples:
• Banks: Analyzing log files to better understand
multi-channel customer interactions.
• Hotels: Analyzing customer lines with video
analytics.
• Insurance: Analyzing voice data from call center
recordings to predict customer satisfaction.
Big Data Analytics
• Source: http://www.sas.com/resources/asset/Big-Data-in-Big-Companies.pdf
Objectives for Big Data
Cost Reduction from Big Data Technologies
▪ Storage cost reduction
▪ Rough estimate of the cost of storing one terabyte of
structured data for a year:
Objectives for Big Data
Operation Cost Reduction from Big Data Technologies
Example: UPS
• UPS needs to capture and track a variety of package movements and
transactions every day. On average, it tracks data on 16 million packages per
day for 9 million customers, handles 39.5 million tracking requests from customers per
day, and collects data from telematics sensors in over 46,000 vehicles.
• UPS acquired big data technologies and online map data for a project
to reconfigure its drivers’ route structures.
• The project saved more than 8.4 million gallons of fuel by cutting 85 million
miles off daily routes.
• UPS also estimates that saving just one daily mile driven per driver saves the
company $30 million, so the big data project helps UPS save significant fuel costs.
Source: http://www.sas.com/resources/asset/Big-Data-in-Big-Companies.pdf
Objectives for Big Data
Time Reduction from Big Data
• Example: Macy’s merchandising group: https://youtu.be/eZeTkK65RQM
• Macy’s, Inc. is one of the nation’s premier omnichannel retailers and
operates more than 800 department and specialty stores.
• Big data analytics helped its department store chain reduce the
time to optimize pricing of its 73 million items for sale from over 27
hours to just over 1 hour.
• Big data technology, e.g. a Hadoop cluster, also helped Macy’s save
70% of its hardware costs.
Source: http://www.sas.com/resources/asset/Big-Data-in-Big-Companies.pdf
Objectives for Big Data
Developing New Product / Service Offerings
> Most online firms use big data to develop many products and
services
• LinkedIn’s People You May Know, Groups You May Like,
Jobs You May Be Interested In
• Google uses big data algorithms for search, ad
placement and its self-driving car
Source: http://www.sas.com/resources/asset/Big-Data-in-Big-Companies.pdf
Objectives for Big Data
Supporting Business Decisions
Example: Bank of America
• Using big data technology to analyze:
• unstructured customer data from website click log files,
transaction records, bankers’ notes, and voice
recordings from call centers, and
• structured data, e.g. transaction data, customer
demographic data and survey data, to improve its
understanding of customer preferences and purchasing
behavior
Source: http://www.sas.com/resources/asset/Big-Data-in-Big-Companies.pdf
Objectives for Big Data
Developing New Product / Service Offerings
Example: Caesars Entertainment
• Caesars Entertainment is one of the world's most diversified casino-entertainment
providers and a leader in the application of data analytics
in the areas of customer loyalty, marketing, and service.
• It collects real-time customer data from its Total Rewards loyalty program, web
clickstreams, and real-time play on slot machines.
• Its objective in implementing big data tools is to respond in real time for
customer marketing and service.
• It has acquired Hadoop clusters and both open-source and commercial analytics
software, and has added data scientists to its analytics group.
• It uses big data analytics to analyze mobile data and is experimenting
with targeted real-time offers to mobile devices.
Source: http://www.sas.com/resources/asset/Big-Data-in-Big-Companies.pdf
Objectives for Big Data
Supporting Internal Business Decisions
Example: Healthcare service centers and hospitals
• Analytical focus on unstructured data, e.g. customer
attitudes in recorded voice files from call centers
• “Natural language processing” software together with Hadoop
and NoSQL storage can turn the voice data into text and
store the data sets for further analysis and reporting
• Learning video (https://youtu.be/7t75CNC34vU)
Source: http://www.sas.com/resources/asset/Big-Data-in-Big-Companies.pdf
Key Components of Big Data System
Platform Infrastructure
Data
Application code
Business View
Source: http://www.sas.com/resources/asset/Big-Data-in-Big-Companies.pdf
Key Components of Big Data System
Data Storage System
Hadoop - open-source programming environment
that supports big data processing through
distributed storage and processing on clusters of
computers
Key Components of Big Data System
Platform Infrastructure
• To integrate, manage, and apply sophisticated computational
processing to the data.
• Data:
✓ Structured and unstructured datasets such as web logs, images,
videos, social media, Docs and PDFs
• Application, Functions, and Services:
✓ To manipulate, process and analyze the data
✓ Example: count all the customers who like Facebook on social
media using a text mining application
• Business View:
✓ Enables raw data to be restructured into a statistical model, a flat
file, a relational table or a cube for additional analysis
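The text mining example in the Application layer can be sketched as a simple keyword match over social media posts. This is a minimal sketch under stated assumptions: the posts, the customer identifiers and the phrase pattern are all hypothetical, and a real text mining application would use far richer language processing.

```python
import re

# Hypothetical social media posts, already linked to customer identifiers.
posts = [
    {"customer": "c1", "text": "I like Facebook for keeping in touch"},
    {"customer": "c2", "text": "Not a fan of social media"},
    {"customer": "c1", "text": "Really LIKE Facebook groups"},
    {"customer": "c3", "text": "I like Facebook and Instagram"},
]

# Assumed rule: a post counts if it contains the phrase "like Facebook".
pattern = re.compile(r"\blike\s+facebook\b", re.IGNORECASE)

# A set keeps each customer once, even if several of their posts match.
customers_who_like = {p["customer"] for p in posts if pattern.search(p["text"])}
like_count = len(customers_who_like)
print(like_count, sorted(customers_who_like))
```

The resulting count or customer list is the kind of output that the Business View layer could then reshape into a flat file or relational table.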
Key Components of Big Data Stack
Powerful Presentation and Data Visualization
• Powerful data visualization allows users to view information in an
intuitive, graphical way.
• Examples: Google Charts and Tableau
• Tableau: a professional solution for visualizing AI, Big Data and
Machine Learning apps
Videos on Big data analytics
❑ Big data analytics in stock price volatility
https://youtube.com/watch?v=UapNhA7wQ7M&feature=share
❑ Big data in retail
https://youtube.com/watch?v=qwBS_IXw3JQ&feature=share
❑ DHL Data analytics
https://youtube.com/watch?v=00wOf3xEQD4&feature=share
❑ Big data analytics in finance
https://youtube.com/watch?v=HPvepzVTQgA&feature=share
❑ Big data analytics in banks
https://youtube.com/watch?v=LUWacLuCBzo&feature=share
Application of Big Data Analytics in Different
Industries
• Banking: Gain analytical insight from large volumes of
unstructured data to support site analysis and
customer segmentation
• Insurance: Big data analytics explores all customer data to
discover market trends, understand customers’ product
needs and service preferences, and detect fraud
• Life Science: Predictive analytics is used to build more
intelligent, automated solutions that improve the speed and
efficiency of the clinical research process.
Application of Big Data Analytics in Different
Industries
• Market research: sample selection, site selection for
face-to-face surveys, respondent analysis, etc.
• Business: Marketing management, Retail management,
Advertising, Risk Management, Customer relationship
management, etc.
• Public sector: public policy, election studies, demographic
studies, environmental studies, crime detection
• Retail: Retail analytics can perform retail site analysis,
improve customer relationships, better understand
customers’ buying preferences, predict market trends
and boost profitability.
Application of Big Data Analytics in Different
Industries
Health Care
• Predictive analytics can analyze large amounts of structured or
unstructured information, e.g. patient records, health plans, insurance
information, voice data from call centers and other types of
information, so that health care providers can provide lifesaving diagnoses
and treatments to improve population health
• Understand the clinical and nonclinical factors that affect
readmissions.
• Predict and prevent avoidable readmissions.
• Identify patients who have a higher risk of infection to optimize
discharge planning.
Big Data Analytics – Statistics Models
Major statistics models
• Logistic regression / logit model, Classification
trees, Factor analysis, Cluster analysis
• Classification / Segmentation
• Association analysis, Correlation analysis, etc.
Big Data Analytics – Statistics Models
Logistic regression model
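The logistic regression (logit) model named on this slide is usually written as follows. The symbols are standard textbook notation and are assumed here rather than taken from the original slide: $p$ is the probability of the outcome of interest, $x_1, \dots, x_k$ are the predictor variables and $\beta_0, \dots, \beta_k$ are the coefficients to be estimated.

```latex
p = P(Y = 1 \mid x_1, \dots, x_k)
  = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)}},
\qquad
\ln\!\left(\frac{p}{1 - p}\right) = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k
```

The second form shows why the model is also called the logit model: the log-odds of the outcome are a linear function of the predictors.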
Big Data Analytics – Statistics Models
Segmentation analysis
Once segments have been identified, an analyst can try
to understand the similarities and differences in segments
Big Data Analytics – Statistics Models
Segmentation analysis
Basic steps in segmentation analysis:
• Variables selection
• Data preparation
• Select a measure of similarity
• measure the distance between two records
• Select the type of clustering method
• rules for forming clusters
• to measure the distance between two clusters
• Decide the number of customer segments / clusters
• Interpret the segment solution
• Profiling the segments
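The steps above can be sketched with k-means, one common clustering method. This is a minimal illustration under stated assumptions: the two variables (age and monthly spend), the six customer records, the choice of Euclidean distance and the choice of two clusters are all hypothetical, and the naive starting centres are used only to keep the example deterministic.

```python
import math

def standardize(rows):
    """Data preparation: rescale each variable to mean 0 and std 1."""
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    stds = [math.sqrt(sum((v - m) ** 2 for v in col) / len(col)) or 1.0
            for col, m in zip(columns, means)]
    return [tuple((v - m) / s for v, m, s in zip(row, means, stds))
            for row in rows]

def distance(a, b):
    """Similarity measure: Euclidean distance between two records."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign(points, centroids):
    """Give each record the index of its nearest cluster centre."""
    return [min(range(len(centroids)), key=lambda j: distance(p, centroids[j]))
            for p in points]

def k_means(points, k, iterations=20):
    centroids = list(points[:k])  # naive start: the first k records
    for _ in range(iterations):
        labels = assign(points, centroids)
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # move the centre to the mean of its members
                centroids[j] = tuple(sum(v) / len(members) for v in zip(*members))
    return assign(points, centroids)

# Variable selection: (age, monthly spend) for six hypothetical customers.
customers = [(25, 200), (27, 220), (24, 180), (55, 900), (58, 950), (60, 880)]
segments = k_means(standardize(customers), k=2)

# Profiling: describe each segment on the original, unscaled variables.
for segment in sorted(set(segments)):
    members = [c for c, s in zip(customers, segments) if s == segment]
    avg_age = sum(m[0] for m in members) / len(members)
    avg_spend = sum(m[1] for m in members) / len(members)
    print(f"segment {segment}: n={len(members)}, "
          f"avg age={avg_age:.1f}, avg spend={avg_spend:.1f}")
```

Standardizing first matters because spend is measured on a much larger scale than age and would otherwise dominate the distance calculation.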
Big Data Analytics – Statistics Models
Association analysis
• Searches for relationships between items
• Also known as market basket analysis (MBA)
• Gives results in rule form
• If a customer has a credit card, then 10% of the
time they also buy trust funds
• 60% of all shoppers buy insurance when
they also purchase retirement plans
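Rules of this kind are usually scored with support, confidence and lift. The sketch below computes the three measures for a handful of hypothetical customer baskets; the products and purchase patterns are invented for illustration and do not reproduce the percentages quoted above.

```python
# Each set is the products held by one hypothetical customer.
baskets = [
    {"credit card", "trust fund"},
    {"credit card"},
    {"retirement plan", "insurance"},
    {"retirement plan", "insurance", "credit card"},
    {"retirement plan"},
]

def support(itemset):
    """Share of all baskets that contain every item in the itemset."""
    return sum(1 for basket in baskets if itemset <= basket) / len(baskets)

def confidence(antecedent, consequent):
    """Of the baskets with the antecedent, the share that also have the consequent."""
    return support(antecedent | consequent) / support(antecedent)

def lift(antecedent, consequent):
    """How much more likely the consequent is when the antecedent is present."""
    return confidence(antecedent, consequent) / support(consequent)

rule = ({"retirement plan"}, {"insurance"})
print(f"support    = {support(rule[0] | rule[1]):.2f}")
print(f"confidence = {confidence(*rule):.2f}")
print(f"lift       = {lift(*rule):.2f}")
```

A lift above 1 suggests the two products are bought together more often than chance alone would predict, which is what makes a rule useful for cross-selling.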
Big Data Analytics – Statistics Models
Applications of Association analysis
Classification analysis
Examples - Using Big Data Analytics
Classification analysis
• Study the characteristics of customers and assign them to a
certain class
• The task is characterized by
• Well-defined classes
• Examples
• Classifying credit card applicants into diamond, platinum, or
gold card groups
• Assigning customers to predefined customer groups
Examples - Using Big Data Analytics
Association / MBA Analysis
• To determine which things go together / which products
should be bundled for sale
• Can be used to plan the arrangement of items on store shelves
• To identify cross-selling opportunities
Example
• Determine which items are bought together in a shopping cart at
the supermarket
Examples - Using Big Data Analytics
Logistic regression
Fraud detection
• Predict which insurance claims, cellular phone calls,
or credit-card purchases are likely to be fraudulent
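As a rough illustration of fraud prediction, the sketch below fits a one-variable logistic regression by plain gradient descent. Everything in it is an assumption made for the example: the claim amounts, the fraud labels, the learning rate and the number of iterations. Real fraud models use many more variables and a statistics package rather than hand-written fitting code.

```python
import math

def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical claim amounts (in thousands) and whether each was fraudulent.
amounts = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5]
is_fraud = [0, 0, 0, 0, 1, 1, 1, 1]

# Centre the predictor so the intercept and slope are easier to fit.
mean_amount = sum(amounts) / len(amounts)
centered = [a - mean_amount for a in amounts]

weight, bias = 0.0, 0.0
learning_rate = 0.1
for _ in range(2000):
    # Prediction error for every claim under the current coefficients.
    errors = [sigmoid(weight * x + bias) - y for x, y in zip(centered, is_fraud)]
    # Gradient descent step on the average log-loss.
    weight -= learning_rate * sum(e * x for e, x in zip(errors, centered)) / len(centered)
    bias -= learning_rate * sum(errors) / len(centered)

def fraud_probability(amount):
    """Estimated probability that a claim of this size is fraudulent."""
    return sigmoid(weight * (amount - mean_amount) + bias)

for amount in (1.0, 2.5, 4.0):
    print(f"claim {amount}: estimated fraud probability {fraud_probability(amount):.2f}")
```

Claims whose estimated probability exceeds a chosen cut-off, for example 0.5, would be flagged for manual review.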
Examples - Using Big Data Analytics
Cluster / Segmentation analysis
Customer segmentation in telecommunications
• Direct marketing
• Widely used in direct mail
• Customer acquisition/ segmentation
• For profiling good customers, performing market
segmentation
• Customer retention
• Predict good customers who are likely to leave
Examples - Using Big Data Analytics
Cluster / Segmentation analysis
Customer Segmentation in banks
• For grouping similar customers into respective
categories
• Corporate and Retail Customers, Demographic, Behavioral
(Product Holdings/Usage) and Needs
• Customer segments are determined by a segmentation
model based on different variables identified by
management.
• e.g. demographic data of customers (job position,
income, sex, age, marital status), transaction data (last
month's credit card transaction records), home address of
customers
Necessary Steps in Big Data Analytics
Data
Modelling
Evaluation
Necessary Steps in Big Data Analytics
Identify project objectives and scope
• Determine business objectives
• understand what the client really wants to purchase
• Evaluate current situation and data accuracy and availability
• list all available datasets: transaction, customer
demographic, voice, etc.
• Set up analysis goals
• describe the outputs of the project that enable the
achievement of the business objectives
• Prepare project plans
• plan for achieving the business goals
Necessary Steps in Big Data Analytics
Data navigation
• Build up a data dictionary
• Variable selection criteria
• Describe data
• format of data, quantity, fields
• Explore data
• distributions
• Verify data quality
• data completeness, missing values definitions
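The describe, explore and verify steps can be sketched as a small profiling routine. The records and field names below are hypothetical, and `None` is assumed to be the agreed definition of a missing value.

```python
import statistics

# Hypothetical customer records; None marks a missing value.
records = [
    {"customer_id": 1, "age": 34, "income": 52000},
    {"customer_id": 2, "age": None, "income": 61000},
    {"customer_id": 3, "age": 45, "income": None},
    {"customer_id": 4, "age": 29, "income": 48000},
]

def profile(rows, field):
    """Summarize one field: completeness plus basic distribution measures."""
    values = [row[field] for row in rows if row[field] is not None]
    return {
        "count": len(values),
        "missing": len(rows) - len(values),
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
    }

age_profile = profile(records, "age")
income_profile = profile(records, "income")
print("age:", age_profile)
print("income:", income_profile)
```

Summaries like these feed the data dictionary and show which fields need cleaning before modelling.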
Necessary Steps in Big Data Analytics
Data preparation
• Select data
• Clean and standardize data
• Improve the data quality
• Build up new data sets
• compute new variables
• Integrate data
• combine data from multiple sources
• Format data
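A minimal sketch of these preparation steps, using two small hypothetical sources (a customer file and a transaction file): the field names, the cleaning rules and the derived `total_spend` variable are assumptions chosen for illustration.

```python
# Hypothetical raw extracts from two source systems.
customers = [
    {"customer_id": 1, "name": " alice ", "income": "52000"},
    {"customer_id": 2, "name": "BOB", "income": ""},
]
transactions = [
    {"customer_id": 1, "amount": 120.0},
    {"customer_id": 1, "amount": 80.0},
    {"customer_id": 2, "amount": 40.0},
]

def clean_customer(row):
    """Clean and standardize: trim names, fix casing, parse income as a number."""
    income = float(row["income"]) if row["income"].strip() else None
    return {
        "customer_id": row["customer_id"],
        "name": row["name"].strip().title(),
        "income": income,
    }

cleaned = [clean_customer(row) for row in customers]

# Compute a new variable: total spend per customer.
total_spend = {}
for t in transactions:
    total_spend[t["customer_id"]] = total_spend.get(t["customer_id"], 0.0) + t["amount"]

# Integrate: combine the cleaned customer data with the derived variable.
prepared = [{**c, "total_spend": total_spend.get(c["customer_id"], 0.0)} for c in cleaned]
for row in prepared:
    print(row)
```

The output is one analysis-ready record per customer, which is the usual input format for the modelling step.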
Necessary Steps in Big Data Analytics
Modeling
• Identify and select appropriate models
• Design tests
• Set up criteria to test the quality and validity of
selected models
• Build model
• Document the details of the models, e.g. the
parameter values of the chosen models
Necessary Steps in Big Data Analytics
Evaluation
• Evaluate results
➢assess the degree to which the model meets the
business objectives
➢test the model on test applications in the real
application
• Back-test the models
• Review the process and improve
➢check whether any major factors were
overlooked
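One common way to evaluate a classification model is to compare its predictions with known outcomes on held-out records. The sketch below computes accuracy, precision and recall; the actual and predicted labels are hypothetical, and the choice of these three measures is an assumption rather than something prescribed by the slide.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for a yes/no (1/0) outcome."""
    pairs = list(zip(actual, predicted))
    return {
        "tp": sum(1 for a, p in pairs if a == 1 and p == 1),
        "fp": sum(1 for a, p in pairs if a == 0 and p == 1),
        "fn": sum(1 for a, p in pairs if a == 1 and p == 0),
        "tn": sum(1 for a, p in pairs if a == 0 and p == 0),
    }

def evaluate(actual, predicted):
    """Summarize model quality on held-out data."""
    c = confusion_counts(actual, predicted)
    return {
        "accuracy": (c["tp"] + c["tn"]) / len(actual),
        "precision": c["tp"] / (c["tp"] + c["fp"]),
        "recall": c["tp"] / (c["tp"] + c["fn"]),
    }

# Hypothetical held-out records: 1 = customer left, 0 = customer stayed.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = evaluate(actual, predicted)
print(metrics)
```

Whether such scores are good enough depends on the business objectives set at the start of the project, for example the cost of missing a customer who is about to leave.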
Necessary Steps in Big Data Analytics
Implementation
• Plan implementation
➢summarize implementation strategy including
necessary steps and how to perform them
• Plan monitoring and maintenance
➢summarize monitoring and maintenance strategy
• Generate analytical report
• Review project results
• Knowledge sharing with business users
Failure of Big Data Analytics Project
Reference Book on Big Data Case Study
Reference
> Big Data Analytics: What it is and why it matters
(https://www.sas.com/en_us/insights/analytics/big-data-analytics.html#technical)
> Big data analytics in today’s world
(https://www.sas.com/en_us/insights/analytics/big-data-analytics.html#todaysworld)
> Big Data in Big Companies
(http://www.sas.com/resources/asset/Big-Data-in-Big-Companies.pdf)
> Big Data Case Study Collection
(https://www.bernardmarr.com/img/bigdata-case-studybook_final.pdf)
> Getting Started with Statistics Concepts
(http://www.statsoft.com/Textbook/Elementary-Statistics-Concepts)
> Getting Started with SAS Enterprise Miner: Setting Up an Enterprise Miner Project
(https://www.youtube.com/watch?v=489wJm2X0TY)
> What is Logistic Regression?
(https://www.statisticssolutions.com/what-is-logistic-regression/)