IBA Module2


MODULE 2

The First Step in Information Management

The Foundation of Analytics: Descriptive, Predictive and Prescriptive

Polling Questions
▪ What type of statistical analyses do you use or plan to use (can choose multiple answers)?
− Descriptive
− Predictive
− Prescriptive
− I don’t use any of these
− I don’t know the difference between these
▪ How frequently do you use statistical analyses in your work?
− I don’t currently do any type of statistical analysis
− Less than once a week
− Once or a few times a week
− At least once a day

Topics for Discussion
▪ Overview of statistical analysis process
  − Forming a hypothesis
  − Identifying appropriate sources
  − Proving/Disproving the hypothesis
▪ Types of data analysis
  − Descriptive data analytics
  − Predictive data analytics
  − Prescriptive data analytics
▪ How these types compare within the analytic environment
▪ Key takeaways and suggested resources

The Process of Statistical Analysis
When we have resource constraints, statistical analysis enables us to make quantitative inferences based on the amount of information we can analyze (a sample).
▪ Form Hypotheses
  − Null: Nothing special
  − Alternative: Something unique, an actionable finding, etc.
▪ Identify Data Source
  − Don’t go overboard!
  − Collect your own, OR
  − Use secondary data
▪ Prove/Disprove Hypothesis
  − Is Type I or Type II error worse?
  − Choose confidence level
  − Reject/not reject null

Step 1: Forming a Hypothesis
▪ In statistical analysis, we have two hypotheses:
  − Null hypothesis: Claims that any irregularities in the sample are due to chance
  − Alternative hypothesis: Claims that irregularities in the sample are due to non-random causes (and would therefore reflect the population)
▪ What are you really looking to discover/prove?
  − Experiment 1:
    ▪ Null: There is no difference in the amount sold when comparing salespeople who did and did not receive training.
    ▪ Alternative: There is a difference in the amount sold when comparing salespeople who did and did not receive training.
  − Experiment 2:
    ▪ Null: The salespeople who received training do not sell more on average than the salespeople who did not receive training.
    ▪ Alternative: Salespeople who received the training sell more on average than those who did not receive the training.

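The two experiments differ in the direction of the test. As a sketch, writing \mu_T and \mu_N (notation assumed here, not taken from the slides) for the mean amount sold by trained and untrained salespeople, Experiment 1 corresponds to a two-tailed test and Experiment 2 to a one-tailed test:

```latex
\begin{align*}
\text{Experiment 1:} \quad & H_0: \mu_T = \mu_N, \quad H_1: \mu_T \neq \mu_N \\
\text{Experiment 2:} \quad & H_0: \mu_T \le \mu_N, \quad H_1: \mu_T > \mu_N
\end{align*}
```
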
Step 2: Identifying Appropriate Sources
▪ Remember, you don’t need Big Data for every decision!
▪ Sometimes, knowing what data you don’t need is just as important as knowing what you do need. Keep your end decision in mind.
▪ Potential sources of data:
  − Primary data − collect new data
    ▪ Who to include: Random sample, stratified random sample, etc.
    ▪ How many to include: Sample size calculators online (free)
    ▪ Determine the level of measurement needed for your desired analysis: categorical (nominal), ordinal, interval, ratio
    ▪ As necessary, design a control group
  − Secondary data − utilize existing data
    ▪ Census records, syndicated data, government data, etc.
▪ Consider your data needs, data cleanliness, cost, etc., when determining appropriate sources.

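Online sample size calculators typically implement a formula along the following lines. This is a minimal sketch, assuming a simple random sample used to estimate a proportion from a large population; the function name and default values are illustrative and not from the slides.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(confidence=0.95, margin_of_error=0.05, p=0.5):
    """Minimum sample size to estimate a proportion (assumes a large population).

    p=0.5 is the most conservative guess for the unknown proportion.
    """
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (z ** 2) * p * (1 - p) / margin_of_error ** 2
    return math.ceil(n)


# Example: 95% confidence with a 5-point margin of error.
print(sample_size_for_proportion(0.95, 0.05))
```

Tightening the margin of error or raising the confidence level increases the required sample size, which is why the end decision should drive these settings.
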
Step 3: Proving/Disproving the Hypothesis
▪ Establish a confidence level prior to analysis.
▪ Confidence levels:
  1. Determine how significant a difference/irregularity must be for you to prove/disprove your alternative hypothesis.
  2. Determine how confident you can be in your decision.
▪ Even with a high confidence level, you aren’t always right:
  − Type I error: You reject the null hypothesis but shouldn’t have.
  − Type II error: You do not reject the null hypothesis but should have.
  − How to decrease the likelihood of these errors: change the confidence level (which trades one error type against the other), increase sample size (be aware of effect size), etc.
▪ Determine which type of error is more detrimental to your investigation and set up your study accordingly.

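The two error types can be stated as conditional probabilities. The symbols \alpha and \beta below follow common statistical convention; \beta and power are not named on the slides and are added here only for illustration.

```latex
\begin{align*}
\alpha &= P(\text{reject } H_0 \mid H_0 \text{ is true}) = 1 - \text{confidence level} \\
\beta &= P(\text{do not reject } H_0 \mid H_0 \text{ is false}) \\
\text{power} &= 1 - \beta
\end{align*}
```

For example, a 95% confidence level corresponds to \alpha = 0.05, the accepted probability of a Type I error.
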
Step 3: Proving/Disproving the Hypothesis
▪ Confidence level = 95%
▪ Alpha = 0.05

Group Statistics (QPctQ3)
Training      N    Mean      Std. Deviation  Std. Error Mean
No training   74   102.643   9.95482         1.15722
Training      74   106.3889  9.83445         1.14323

Independent Samples Test
Levene's Test for Equality of Variances: F = 0.029, Sig. = 0.865
t-test for Equality of Means:
                             t       df       Sig. (2-tailed)  Mean Difference  Std. Error Difference  95% CI Lower  95% CI Upper
Equal variances assumed      -2.303  146      0.023            -3.74595         1.6267                 -6.96086      -0.53103
Equal variances not assumed  -2.303  145.978  0.023            -3.74595         1.6267                 -6.96087      -0.53102

[Chart: Percent of 3rd Quarter Quota Sold by Trained vs. Untrained Salespeople (No training vs. Training)]

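The t statistic in the output above can be approximately reproduced from the group statistics alone. The sketch below assumes the equal-variances (pooled) form of the independent-samples t-test; the function and variable names are illustrative, and small differences from the table come from rounding in the reported means.

```python
import math

# Group statistics copied from the table above.
n_no, mean_no, sd_no = 74, 102.643, 9.95482    # No training
n_tr, mean_tr, sd_tr = 74, 106.3889, 9.83445   # Training


def pooled_t(n1, m1, s1, n2, m2, s2):
    """Independent-samples t statistic, assuming equal variances."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    diff = m1 - m2
    return diff / std_err, df, diff, std_err


t_stat, df, diff, std_err = pooled_t(n_no, mean_no, sd_no, n_tr, mean_tr, sd_tr)
print(round(t_stat, 3), df, round(diff, 4), round(std_err, 4))
```

Because the reported two-tailed significance (0.023) is below alpha (0.05), the null hypothesis of no difference is rejected at the 95% confidence level.
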
Types of Data Analysis
▪ Descriptive
  − Aims to help uncover valuable insight from the data being analyzed
  − Answers the question “What happened?”
▪ Predictive
  − Helps forecast behavior of people and markets
  − Answers the question “What could happen?”
▪ Prescriptive
  − Suggests conclusions or actions that may be taken based on the analysis
  − Answers the question “What should be done?”

Descriptive Data Analytics
▪ Though the simplest type, it is used most often.
▪ Two types of descriptive analysis:
  1. Measures of central tendency (tells us about the middle)
    ▪ Mean − the average
    ▪ Median − the midpoint of the responses
    ▪ Mode − the response with the highest frequency
  2. Measures of dispersion
    ▪ Range − the min, the max and the distance between the two
    ▪ Variance − the average squared distance of the points from the mean
    ▪ Standard Deviation − the most common/standard way of expressing the spread of data

[Chart: Mean, Median and Mode Amounts of Items Purchased: Mean = 6.5, Median = 2, Mode = 1]

Customer_ID  Items Purchased  Amount Spent
29304        1                $1.09
28308        3                $44.43
19962        21               $218.58
30281        1                $73.02

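The measures above can be computed with Python's standard `statistics` module. This sketch uses the four "Items Purchased" values from the table; treating them as a full population (rather than a sample) for the variance and standard deviation is an assumption made here for illustration.

```python
import statistics

# "Items Purchased" column from the customer table above.
items_purchased = [1, 3, 21, 1]

# Measures of central tendency.
mean = statistics.mean(items_purchased)
median = statistics.median(items_purchased)
mode = statistics.mode(items_purchased)

# Measures of dispersion (population versions; use variance()/stdev() for a sample).
value_range = max(items_purchased) - min(items_purchased)
variance = statistics.pvariance(items_purchased)
std_dev = statistics.pstdev(items_purchased)

print(mean, median, mode, value_range, variance, round(std_dev, 2))
```

The single large order (21 items) pulls the mean well above the median and mode, which is why reporting more than one measure is useful.
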
Predictive Analysis

Predictive Data Analytics
▪ Predictive analysis is sometimes mistaken as relevant only to predicting future events.
  − However, in cases such as sentiment analysis, existing data (e.g., the text of a tweet) is used to predict unknown data (whether the tweet is positive or negative).
▪ Several of the models that can be used for predictive analysis are:
  − Forecasting
  − Simulation
  − Regression
  − Classification
  − Clustering

Predictive: Forecasting
▪ Forecasting:
  − Moving average technique: use the mean of prior periods to predict the next
    ▪ The mean of periods 1−4 = forecast for period 5
    ▪ The mean of periods 2−5 = forecast for period 6
  − Exponential smoothing technique: similar, but more recent data points are weighted more heavily due to relevance
  − Regression techniques
▪ Use caution in forecasting: the longer the forecasted time period, the less accurate the projections.

[Chart: Net Income of Store C Projected 2017-2020; x-axis: years 2006 to 2022; y-axis: net income, $0 to $25,000]

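The two smoothing techniques can be sketched in a few lines. The function names, the smoothing factor `alpha`, and the net income figures below are hypothetical; the Store C values from the chart are not recoverable from the slides.

```python
def moving_average_forecast(history, window=4):
    """Forecast the next period as the mean of the last `window` periods."""
    if len(history) < window:
        raise ValueError("history must contain at least `window` periods")
    return sum(history[-window:]) / window


def exponential_smoothing_forecast(history, alpha=0.5):
    """Forecast the next period with simple exponential smoothing.

    A larger alpha puts more weight on the most recent observations.
    """
    if not history:
        raise ValueError("history must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = history[0]
    for value in history[1:]:
        smoothed = alpha * value + (1 - alpha) * smoothed
    return smoothed


# Hypothetical yearly net income figures (not taken from the slides).
net_income = [10_000, 12_000, 15_000, 14_000, 18_000]
print(moving_average_forecast(net_income, window=4))
print(exponential_smoothing_forecast(net_income, alpha=0.5))
```

Both functions forecast only one period ahead; repeatedly feeding forecasts back in as history compounds the error, which is one reason accuracy drops for longer horizons.
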
Predictive: Simulation
▪ Simulation
  − Queuing models: used to predict wait time and queue length
    ▪ Results can be used to create staff schedules in a way that reduces inefficiencies, etc.
  − Discrete event model: used in special situations when queuing cannot be used
    ▪ Results can be used to identify bottlenecks, etc.
  − Monte Carlo simulations: used to identify probable outcomes of a scenario based on many possible outcomes (uses random number generation and many iterations of the scenario).
    ▪ Results can be used to predict the likelihood of profitability within the first two years, etc.

Predictive: Queuing Model Example
[Figure: queuing model results for Scenario 1 and Scenario 2]

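As an illustration of how a queuing model predicts wait time and queue length, the sketch below uses the textbook single-server M/M/1 formulas. The choice of M/M/1 (random arrivals, one server, exponential service times) and the two scenarios are assumptions; they are not the scenarios from the original figure, which did not survive extraction.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state metrics for an M/M/1 queue (rates in customers per hour)."""
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: arrivals must be slower than service")
    utilization = arrival_rate / service_rate
    avg_queue_length = utilization ** 2 / (1 - utilization)
    # Little's law: average wait in queue = average queue length / arrival rate.
    avg_wait_in_queue = avg_queue_length / arrival_rate
    return {
        "utilization": utilization,
        "avg_queue_length": avg_queue_length,
        "avg_wait_in_queue_hours": avg_wait_in_queue,
    }


# Hypothetical comparison: same demand, slower vs. faster service.
scenario_1 = mm1_metrics(arrival_rate=8, service_rate=10)
scenario_2 = mm1_metrics(arrival_rate=8, service_rate=12)
print(scenario_1)
print(scenario_2)
```

Comparing scenarios this way shows how a modest increase in service capacity can sharply reduce waiting, which is the kind of result used to build staff schedules.
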
Predictive: Monte Carlo Simulation Example

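A Monte Carlo estimate of the likelihood of profitability within the first two years might look like the sketch below. Every input (normally distributed monthly revenue, fixed monthly cost, startup cost) is a hypothetical assumption chosen for illustration; the original example figure is not recoverable.

```python
import random


def probability_profitable(n_trials=10_000, months=24, mean_revenue=12_000.0,
                           sd_revenue=3_000.0, monthly_cost=10_000.0,
                           startup_cost=20_000.0, seed=42):
    """Estimate P(cumulative profit > 0 after `months`) by repeated random trials."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    profitable_trials = 0
    for _ in range(n_trials):
        cumulative_profit = -startup_cost
        for _ in range(months):
            # Draw one month's revenue at random and subtract the fixed cost.
            cumulative_profit += rng.gauss(mean_revenue, sd_revenue) - monthly_cost
        if cumulative_profit > 0:
            profitable_trials += 1
    return profitable_trials / n_trials


print(probability_profitable())
```

Increasing `n_trials` makes the estimate more stable, while changing the revenue or cost assumptions shows how sensitive the outcome is to each input.
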
Predictive: Regression
▪ Regression − generally speaking, used to understand the relationship between independent and dependent variables
▪ Types of regression models:
  − Logistic: used when the dependent variable is categorical (e.g., will customers shop at your store or a competitor?)
  − Linear: used to identify a linear relationship between the dependent variable and at least one independent variable (e.g., daily store revenue predicted by the number of customers entering the store)
  − Step-wise: used to identify a relationship between dependent/independent variables. This is done by adding/removing variables based on how those variables impact the overall strength of the model.

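For the linear case with a single independent variable, the fitted line can be computed directly with ordinary least squares. This is a minimal sketch; the function name and the customer and revenue figures are hypothetical and only echo the store example above.

```python
def fit_simple_linear_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: customers entering the store vs. daily revenue.
customers = [100, 150, 200, 250, 300]
revenue = [2100, 3050, 4200, 4900, 6100]
slope, intercept = fit_simple_linear_regression(customers, revenue)
print(f"Predicted revenue for 220 customers: {intercept + slope * 220:.2f}")
```

The slope estimates the additional revenue associated with each extra customer, under the assumption that the relationship is roughly linear over the observed range.
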
Predictive: Classification & Clustering
▪ Classification: used to assign objects to one of several categories
  − Sentiment analysis of social media postings
▪ Clustering: another method of forming groups
  − Intragroup differences are minimized
  − Intergroup differences are maximized
  − Commonly used to create and better understand customer groups

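One common clustering technique is k-means, which alternates between assigning points to the nearest group center and recomputing each center. The slides do not name a specific algorithm, so k-means is an assumption here, as are the one-dimensional setup, the evenly spaced starting centers, and the customer spend figures.

```python
def kmeans_1d(values, k=2, max_iter=100):
    """Group one-dimensional values into k clusters with basic k-means."""
    distinct = sorted(set(values))
    if k < 1 or k > len(distinct):
        raise ValueError("k must be between 1 and the number of distinct values")
    # Deterministic start: centers spread evenly across the sorted distinct values.
    if k == 1:
        centers = [float(distinct[len(distinct) // 2])]
    else:
        centers = [float(distinct[i * (len(distinct) - 1) // (k - 1)]) for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest center.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: each center moves to the mean of its members.
        new_centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters


# Hypothetical customer spend amounts forming two customer groups.
centers, clusters = kmeans_1d([12, 15, 18, 210, 225, 240], k=2)
print(centers, clusters)
```

Minimizing the distance of each point to its own center is what keeps intragroup differences small, while well-separated centers keep intergroup differences large.
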
Prescriptive Analysis

Prescriptive Data Analytics
▪ Decisions can be formulated from descriptive and predictive analysis
  − If I need to cut a product and I know that product C is least preferred and least profitable, I will cut product C.
▪ However, prescriptive analytics explicitly tells you the decisions that should be made. This can be done using a variety of techniques:
  − Linear programming
  − Integer programming
  − Mixed integer programming
  − Nonlinear programming

Prescriptive: Linear Programming Example

Problem setup:
                   Product A  Product B  Product C  Product D  Product E
Quantity to Order
Profit per Unit    $5         $3         $20        $50        $200       Total Profit: $ -

                   Product A  Product B  Product C  Product D  Product E  Used  Available
Storage Space      0.05       0.5        1          5          10               1000
Selling Effort     0.25       5          0.5        2          7                500
Minimum Order      100        15         20         60         5

Solution:
                   Product A  Product B  Product C  Product D  Product E
Quantity to Order  100        15         490        60         5
Profit per Unit    $5         $3         $20        $50        $200       Total Profit: $14,345.00

                   Product A  Product B  Product C  Product D  Product E  Used   Available
Storage Space      0.05       0.5        1          5          10         852.5  1000
Selling Effort     0.25       5          0.5        2          7          500    500
Minimum Order      100        15         20         60         5

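The reported solution can be checked against the constraints with a few lines of arithmetic. The sketch below does not solve the linear program; it only verifies the figures in the solution table, using illustrative variable and function names. A solver would be needed to find the optimal quantities from the problem setup.

```python
# Data copied from the tables above (Products A to E).
profit_per_unit = [5, 3, 20, 50, 200]
storage_per_unit = [0.05, 0.5, 1, 5, 10]
effort_per_unit = [0.25, 5, 0.5, 2, 7]
minimum_order = [100, 15, 20, 60, 5]
storage_available = 1000
effort_available = 500

reported_solution = [100, 15, 490, 60, 5]


def dot(a, b):
    """Sum of element-wise products of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def evaluate(quantities, tol=1e-9):
    """Return total profit, resources used, and whether all constraints hold."""
    total_profit = dot(profit_per_unit, quantities)
    storage_used = dot(storage_per_unit, quantities)
    effort_used = dot(effort_per_unit, quantities)
    feasible = (
        storage_used <= storage_available + tol
        and effort_used <= effort_available + tol
        and all(q >= m for q, m in zip(quantities, minimum_order))
    )
    return total_profit, storage_used, effort_used, feasible


print(evaluate(reported_solution))
```

Selling effort is fully used (500 of 500) while storage has slack (852.5 of 1000), so selling effort is the binding constraint in this solution.
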
Comparing the Three Types of Data Analytics
▪ Descriptive analysis is most common.
  − It is best practice to perform descriptive analyses before predictive/prescriptive analyses.
    ▪ Understand that distribution, variance, skew, etc., may exclude certain models
▪ How to know which type of analysis to pursue:
  − How much time do you have?
  − What resources are available to you?
  − How accurate is your data? How accurate do you need the model/analysis to be?
  − How popular/accepted is the model you are considering?
    ▪ Don’t subscribe to “that’s how we’ve always done it,” but remember to use a model that stakeholders will accept.

Key Takeaways and Suggested Resources
▪ Gaining meaningful insights from data requires planning, technical awareness and consistency.
▪ Statistical analysis isn’t a replacement for your own logic (don’t go on statistical autopilot).
▪ Utilize available resources (blogs, podcasts, articles, webinars and online courses) to learn more.
  − Look for APPLIED statistics topics
▪ Big data is not always required.
▪ A basic understanding of the statistical analysis process goes a long way!

Suggested resources:
▪ Guide: When Predictive Models Fail (searchdatamanagement.techtarget.com/ezine/Business-Information/When-predictive-analytics-models-produce-false-outcomes)
▪ Book: Statistics in Plain English, Timothy C. Urdan
▪ Podcast: Not So Standard Deviations (https://soundcloud.com/nssd-podcast)

QUESTION FOR DISCUSSION: How do the following types of analytics compare with one another?
▪ Descriptive
▪ Predictive
▪ Prescriptive

LEARNING ACTIVITIES

▪ Search the internet for a case study highlighting the significance of any of the types of analytics.
▪ Analyze the case.