Lesson 5 Analytics Methods
References:
IngramMicroAdvisor
KDnuggets
October 18 1
By the end of this lesson, you should know:
• Categories of analytics methods.
• Methodology for data analytics.
• Popular analytics methods.
• Choosing analytical methods.
RECAP: Purpose of data analytics
• Support decision-making.
• Provide an advantage over competitors.
• Give insight into the future.
RECAP: Health care
Data → VALUE!
Four Types of Analytics
• Prescriptive – This type of analysis reveals what actions should be
taken. This is the most valuable kind of analysis and usually results in
rules and recommendations for next steps.
• Predictive – An analysis of likely scenarios of what might happen. The
deliverable is usually a predictive forecast.
• Diagnostic – A look at past performance to determine what
happened and why. The result of the analysis is often an analytic
dashboard.
• Descriptive – What is happening now based on incoming data. To
mine the analytics, you typically use a real-time dashboard and/or
email reports.
Prescriptive Analytics
Prescriptive analytics is highly valuable but still rarely used. According to
Gartner, 13 percent of organizations are using predictive analytics but only
3 percent are using prescriptive analytics. Where analytics in general sheds
light on a subject, prescriptive analytics gives you a laser-like focus to
answer specific questions.
For example, in the health care industry, you can better manage the patient
population by using prescriptive analytics to measure the number of patients
who are clinically obese, then add filters for factors like diabetes and LDL
cholesterol levels to determine where to focus treatment. The same
prescriptive model can be applied to almost any industry, target group, or
problem.
Predictive Analytics
Predictive analytics uses data to identify past patterns and predict the
future.
For example, some companies are using predictive analytics for sales
lead scoring. Some companies have gone one step further and use
predictive analytics for the entire sales process, analysing lead source,
number of communications, types of communications, social media,
documents, CRM data, etc. Properly tuned predictive analytics can be
used to support sales, marketing, or other types of complex
forecasts.
Diagnostic Analytics
Diagnostic analytics are used for discovery or to determine why
something happened.
For example, for a social media marketing campaign, you can use
descriptive analytics to assess the number of posts, mentions,
followers, fans, page views, reviews, pins, etc. There can be thousands
of online mentions that can be distilled into a single view to see what
worked in your past campaigns and what didn’t.
Descriptive Analytics
Descriptive analytics are valuable for uncovering patterns that offer
insight.
What do we search for in data analytics?
• Correlation
• A technique for investigating the relationship between two quantitative,
continuous variables, for example, age and blood pressure.
• Pattern
• A repetitive characteristic.
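As a rough illustration of the correlation idea above, the sketch below computes the Pearson correlation coefficient between age and blood pressure. The function name and the sample values are made up for illustration only; they are not real measurements and are not part of the lesson material.

```python
from math import sqrt

def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (numerator of the coefficient).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical values, chosen only to show a strong positive relationship.
ages = [25, 35, 45, 55, 65]
systolic_bp = [118, 121, 127, 133, 140]

r = pearson_correlation(ages, systolic_bp)
print(round(r, 3))
```

A value close to 1 suggests the two variables rise together, a value close to -1 suggests one falls as the other rises, and a value near 0 suggests no linear relationship.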
Methodology for analytics, data mining, and data science projects
Business understanding
• Understand the problem to be solved. This may require multiple
iterations before an acceptable solution formulation appears.
Data understanding
• Data is the raw material from which the solution will be built.
Data preparation
• Often, data is not in the required form, so it must be converted into
a form that helps yield better results.
Modelling
• The primary stage where data mining techniques are applied to the
data.
Evaluation
• The aim is to assess the data mining results and to gain confidence that
the results are valid and reliable.
• Stakeholders would like to know whether the proposed model is going to do
more good than harm, or whether it would be catastrophic.
• Evaluating results of data mining includes both quantitative and
qualitative assessments.
• These evaluation techniques are statistical in nature and thus not
covered in this course.
Deployment
• Data mining results are put into real use in order to realise some
return on investment. This involves implementing the proposed
model.
• Observations from this stage may require an iteration back to the
Business Understanding stage. There, improvements and refinements
to the model are made.
Popular analytics methods
• Classification and class probability estimations
• Regression
• Similarity matching
• Clustering
• Co-occurrence grouping
• Profiling
Case: MegaTelCo
The company has a major problem with customer retention in its
wireless business. In the mid-Atlantic region, 20% of cell phone
customers leave as soon as their contracts expire, and lately it has been
getting increasingly difficult to acquire new customers.
The cell phone market has become saturated. Telco companies are
battling to attract each other’s customers while retaining their own.
Customers switching from one company to another is called “churn”,
and it is expensive all around.
Classification and class probability estimation
• Goal: To predict which class an individual belongs to.
• Question: Among all the customers of MegaTelCo, which are likely
to respond to a given offer?
• Individual is a customer.
• Classes are “will respond” and “will not respond”.
• Classification task: A data mining model predicts which class an
individual belongs to.
• Class probability estimation task: Instead of predicting which class an
individual belongs to, the model estimates the probability that the
individual belongs to each class. The probability is reported as a score
value.
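The difference between the two tasks can be sketched with a simple nearest-neighbour approach. This is only one possible technique, chosen here for brevity; the lesson does not prescribe it. The customer features (monthly spend, months as a customer), the `history` data, and the function names are all hypothetical.

```python
from math import dist

# Hypothetical labelled history:
# (monthly_spend, months_as_customer) -> did the customer respond to a past offer?
history = [
    ((20.0, 3), False),
    ((25.0, 6), False),
    ((30.0, 5), False),
    ((60.0, 24), True),
    ((65.0, 30), True),
    ((70.0, 36), True),
]

def response_probability(customer, k=3):
    """Class probability estimation: share of responders among the k nearest customers."""
    nearest = sorted(history, key=lambda row: dist(row[0], customer))[:k]
    return sum(1 for _, responded in nearest if responded) / k

def classify(customer, threshold=0.5):
    """Classification: turn the probability score into a class label."""
    if response_probability(customer) >= threshold:
        return "will respond"
    return "will not respond"

print(response_probability((62.0, 28)), classify((62.0, 28)))
print(response_probability((45.0, 15)), classify((45.0, 15)))
```

The score gives more information than the label alone: two customers may both be labelled “will not respond” while having quite different scores, which matters when deciding whom to target first.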
[Figure: class probability estimation if an offer is made. Labels: probability 80%, WILL RESPOND; probability 5%, WILL NOT RESPOND.]
Regression
• Goal: To predict or estimate, for each individual, the numerical value
of some variable for that individual.
• Question: How much will a given customer use the service?
• Task: Predict the “service usage” property (variable) for a particular
individual typically by looking at other similar individuals in the
population and their historical usage.
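A minimal sketch of a regression task follows, assuming a single input variable and a straight-line (least squares) fit. The choice of input (last month's minutes), the numbers, and the function names are assumptions made for illustration; real usage models would normally draw on many more variables.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical historical usage of other customers:
# minutes used last month -> minutes used this month.
last_month = [100, 200, 300, 400]
this_month = [120, 210, 330, 400]

slope, intercept = fit_line(last_month, this_month)

def predict_usage(minutes_last_month):
    """Estimate a numerical value (service usage) for one individual."""
    return intercept + slope * minutes_last_month

print(predict_usage(250))
```

Unlike classification, the output here is a number on a continuous scale rather than one of a fixed set of classes.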
[Figure: a customer who WILL RESPOND. HOW much service would she use? A LOT, or A LITTLE?]
Similarity matching
• Underlies other data mining tasks, such as classification, regression
and clustering.
• Goal: To identify similar individuals based on data known about them.
• One of the most popular methods for making product recommendations
(finding people who are similar to you when purchasing items).
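One common way to measure similarity, shown here only as a sketch, is cosine similarity between purchase vectors. The customer names, product categories, and purchase counts are hypothetical, and other distance measures would serve equally well.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction, 0.0 means nothing in common."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical purchase counts per customer for four product categories.
purchases = {
    "alice": [5, 0, 3, 0],
    "bob": [4, 0, 4, 1],
    "carol": [0, 6, 0, 2],
}

def most_similar(name):
    """Find the other customer whose purchases look most like this customer's."""
    others = (other for other in purchases if other != name)
    return max(others, key=lambda other: cosine_similarity(purchases[name], purchases[other]))

print(most_similar("alice"))
```

A recommender could then suggest to a customer the items bought by the most similar customer that she has not bought yet.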
Clustering
• Goal: To group individuals in a population together by their similarity,
but not driven by any specific purpose.
• Question: Do our customers form natural groups or segments?
• Useful in preliminary domain exploration, where the natural groups
found may later suggest other data mining tasks or approaches.
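As a sketch of how natural groups might be found, the example below uses a plain k-means procedure. k-means is only one of several clustering techniques and is an assumption here, as are the usage figures (call minutes, text messages) and the choice of starting centroids.

```python
from math import dist

def assign(points, centroids):
    """Put each point into the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters

def k_means(points, centroids, iterations=10):
    """Alternate between assigning points and moving centroids to cluster means."""
    for _ in range(iterations):
        clusters = assign(points, centroids)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return assign(points, centroids), centroids

# Hypothetical monthly usage per customer: (call minutes, text messages).
usage = [(300, 10), (320, 15), (280, 5), (20, 200), (30, 220), (10, 180)]

# Start from two arbitrary customers as initial centroids.
clusters, centroids = k_means(usage, [usage[0], usage[3]])
print(centroids)
```

No target is supplied: the groups emerge from the data alone, and it is up to the analyst to interpret them, for example as heavy callers and heavy texters.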
[Figure: example customer clusters: texts occasionally; calls for long hours; texts frequently; seldom calls or texts.]
Co-occurrence grouping
• Also known as frequent itemset mining, association rule discovery,
market-basket analysis.
• Goal: To find associations between individuals based on transactions
involving them.
• Question: What items are commonly purchased together?
• Task: Identify similarity of objects based on their “appearing”
together in transactions.
• Example: people who bought X also bought Y.
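The “bought X also bought Y” idea can be sketched by counting how often pairs of items appear in the same transaction. The baskets below are hypothetical, and the `confidence` helper is a simplified illustration rather than a full association rule mining algorithm.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions (market baskets).
baskets = [
    {"pizza", "noodles", "cola"},
    {"pizza", "noodles"},
    {"pizza", "salad"},
    {"noodles", "cola"},
]

# Count every pair of items that appears together in a basket.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

def confidence(x, y):
    """Share of baskets containing x that also contain y."""
    with_x = [b for b in baskets if x in b]
    return sum(1 for b in with_x if y in b) / len(with_x)

print(pair_counts.most_common(2))
print(confidence("pizza", "noodles"))
```

Pairs with high counts and high confidence are candidates for rules such as “offer Y to people who bought X”.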
Hungry people who bought PIZZA also bought NOODLES; therefore, always offer NOODLES to someone who bought PIZZA.
Profiling
• Also known as “behaviour description”.
• Goal: To characterise the typical behaviour of an individual, group or
population.
• Question: What is the typical cell phone usage of this customer
segment?
• Task: Requires a complex description of night and weekend airtime
averages, international usage, roaming charges, text minutes etc.
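A very simple profile can be sketched as the mean and standard deviation of past usage, with new observations flagged when they fall far from that norm. The single usage measure, the monthly figures, and the three-standard-deviation cut-off are assumptions for illustration; real profiles are usually far richer.

```python
from statistics import mean, stdev

def build_profile(history):
    """Summarise typical behaviour as the mean and spread of past values."""
    return {"mean": mean(history), "stdev": stdev(history)}

def fits_profile(profile, value, max_deviations=3.0):
    """True if the value lies within max_deviations standard deviations of the mean."""
    return abs(value - profile["mean"]) <= max_deviations * profile["stdev"]

# Hypothetical airtime minutes for three past months.
monthly_minutes = [410, 395, 402]
profile = build_profile(monthly_minutes)

print(fits_profile(profile, 405))   # a month similar to the past
print(fits_profile(profile, 1500))  # a month far outside the usual range
```

A value that does not fit the profile is not proof of a problem, but it is a reasonable trigger for an alert or a closer look.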
Jane is a student. This is her service usage profile recorded by her telco.
[Figure: monthly usage for January, February, March, and April. April is a mismatch: it does not fit the profile. ALERT!]
Jack is a lecturer. This is his purchase profile recorded by his credit card company.
[Figure: monthly purchases for January, February, March, and April. April is a mismatch: it does not fit the profile. FRAUD!]
Which analytical method?
• A data analyst must often propose one or more analytical methods to
solve a business problem, which can be tricky. One way to start is to
decide whether the business problem calls for a supervised or an
unsupervised data mining method by asking whether the question has
a specific target or purpose for the grouping.
Q1: Do our customers naturally fall into different groups?
Q2: Can we find groups of customers who have particularly high likelihoods of
cancelling their service soon after their contracts expire?
Is there a target?
• Task: To predict the target.
• Will a customer leave when her contract expires?
• Requires labelled data.
• Hence, use supervised methods on the target.
• Classification, regression
NEXT thing that you need to know: Classification or Regression?
• Will this customer purchase service S1 if given incentive X?
• Which service package (S1, S2 or none) will a customer likely purchase if
given incentive X?
• How much will this customer use the service?