DSC4213 - Analytics Tools For Consulting

This document provides an overview of an analytical tools course taught by Professor WANG Tong. It introduces the instructor and teaching assistant. The course covers various quantitative modeling tools and their application in business consulting contexts. Topics include linear regression, dynamic pricing, choice modeling, classification, forecasting, and simulation. Students will complete homework assignments, projects and a group presentation for assessment. The document concludes with an agenda for the introductory session, which includes reviewing simple and multiple linear regression models.

Uploaded by

Sabina Tan

DSC4213 Analytical Tools for Consulting

Session 1 Introduction

Prof. WANG Tong


Dept. of Decision Sciences
NUS Business School
About the Prof

• WANG Tong
– Associate Professor, Dept. of Decision Sciences
– Office: Biz 1 (Mochtar Riady Building) 8-68
– Email: [email protected]
– Web: http://www.bschool.nus.edu.sg/staff/bizwt/

• Andrew LIM
– Professor, Dept. of Decision Sciences
– Office: Biz 1 (Mochtar Riady Building) 8-70
– Email: [email protected]

Aug 2016 DSC4213 Session 1 - Prof. WANG Tong 2


About DSC4213

• The objective: analytical tools + consulting
– to integrate the quantitative tools that have been taught in various business courses
– to explore how the tools can be used in a diverse array of business applications
– to enhance your appreciation of the power (and limitations) of quantitative modeling in business/consulting applications

• Please be
– participative
– open-minded
– ready to criticize
– innovative



Topics Overview
Session | Lecture Topic | Case/Game
1 | Review of Linear Regression |
2 | Dynamic Pricing Part I: Data Analytics | Case: The Retailer
3 | Dynamic Pricing Part II: Price Optimization Strategies | Game: retailer2.net
4 | Modeling Choice: the Logit model | Case: New York Health Club
5 | Classification and Logistic Regression | Data: James Harden and NBA
6 | [Hari Raya] No class, take-home project 1 |
7 | Simulation with @Risk | TBA
8 | Forecasting with Regression and Simulation | Football Ticket Sales Forecast
9 | Decision Tree Modeling | TBA
10 | Rating and Ranking I | TBA
11 | Rating and Ranking II | TBA
12 | Project Presentations |
13 | Project Presentations |



Assessment

• 100% Continuous Assessment


– Participation (10%)
• Mandatory attendance (Be on time!)
• Contribution to the learning experience
– Group homework and presentation (30%)
• About five group homework assignments (“consulting reports”)
• Random, informal in-class presentations
– Individual take-home projects (30%)
• Specified data and questions
• Open-book but individual
– Group Project and Presentation (30%)
• Form project groups (3~4 members per group)
• Find a proper consulting topic and apply analytical tools
• Presentation and report
• More details later



More Logistics

• Follow IVLE for lesson plan, materials, updates, and grades

• Hard copies of cases to be distributed later

• Slides will be online after class

• Q?



Plan for today

• Linear Regression
– Simple and multiple linear regression model
– Least squares estimation
– Model assessment
– Model selection
• Other Considerations in Regression Model
– Qualitative predictors
– Introducing nonlinearity: interaction terms, polynomial terms, log transformation
• Practical Issues
– Multicollinearity
– Heteroscedasticity
– Outliers and high leverage points



Simple Linear Regression

• Regression: a simple but fundamental tool in analytical consulting


• Simple Linear Regression: a linear model with one predictor/covariate/feature

  $Y = \beta_0 + \beta_1 X + \varepsilon$

– Coefficients β0 and β1 are the intercept and slope
– ε is the error term with zero mean



Estimation by Least Squares

• To estimate the β’s and predict Y on the basis of X = x

• Define the Residual Sum of Squares (RSS)

  $\mathrm{RSS} = \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2$

• Least Squares Estimation: choose the coefficient estimates to minimize RSS
– The regression line always goes through the point of means (x̄, ȳ)
– The best linear model for the data, in the sense of in-sample RSS
– Minimum variance among all unbiased linear estimators
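
The closed-form least squares solution for one predictor can be sketched in a few lines. This is a minimal illustration in plain Python rather than R; the function names are made up for this sketch, and the bill and tip figures are toy numbers, not the tips dataset.

```python
def fit_simple_ols(x, y):
    """Least squares estimates (b0, b1) for the model y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx             # slope
    b0 = y_bar - b1 * x_bar    # intercept: puts the line through (x_bar, y_bar)
    return b0, b1


def residual_sum_of_squares(x, y, b0, b1):
    """RSS of the line b0 + b1 * x on the data (x, y)."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


# Toy data, assumed for illustration only.
total_bill = [10.0, 15.0, 20.0, 25.0, 30.0]
tip = [1.5, 2.5, 3.0, 3.5, 5.0]

b0, b1 = fit_simple_ols(total_bill, tip)
rss_at_fit = residual_sum_of_squares(total_bill, tip, b0, b1)
# A line with a slightly different slope should have a larger RSS.
rss_perturbed = residual_sum_of_squares(total_bill, tip, b0, b1 + 0.01)
print(f"b0={b0:.3f}, b1={b1:.3f}, RSS={rss_at_fit:.3f}")
```

Comparing `rss_at_fit` with `rss_perturbed` is a quick sanity check that the fitted line is the in-sample RSS minimizer.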



Accuracy of Least Squares Estimate

• If we further assume the εi are
– independent of each other and independent of X
– of the same variance σ² (homoscedasticity)
– normally distributed
• We can estimate σ² (σ̂ is the residual standard error)

• And obtain the standard errors of the estimates (their variability across different samples)

• And confidence intervals (95%)
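
As a rough sketch of these three steps for the slope of a simple regression, the snippet below computes the residual standard error, the standard error of the slope, and a 95% confidence interval. The data are toy numbers (not the tips dataset), the function name is illustrative, and the critical value 3.182 is the 97.5% quantile of the t distribution with 3 degrees of freedom taken from a standard t table.

```python
import math


def slope_standard_error(x, y):
    """Return (b1, se_b1) for the simple regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sigma_hat = math.sqrt(rss / (n - 2))   # residual standard error
    se_b1 = sigma_hat / math.sqrt(sxx)     # standard error of the slope
    return b1, se_b1


# Toy data, assumed for illustration only.
total_bill = [10.0, 15.0, 20.0, 25.0, 30.0]
tip = [1.5, 2.5, 3.0, 3.5, 5.0]

b1, se_b1 = slope_standard_error(total_bill, tip)

# t critical value for n - 2 = 3 degrees of freedom (from a t table);
# for large samples it approaches 1.96.
T_CRIT = 3.182
ci_low = b1 - T_CRIT * se_b1
ci_high = b1 + T_CRIT * se_b1
print(f"b1={b1:.3f}, SE={se_b1:.3f}, 95% CI=({ci_low:.3f}, {ci_high:.3f})")
```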



Hypothesis Testing on coefficients

• With standard errors, we can test hypotheses
– H0: β1 = 0 (there is no relationship between X and Y)
– HA: β1 ≠ 0
• by calculating the t-statistic (n-p-1 degrees of freedom)

  $t = \hat{\beta}_1 / \mathrm{SE}(\hat{\beta}_1)$

• and obtaining the corresponding p-value

[Figure: t distribution with n-2 degrees of freedom; reject H0 in either tail beyond ±t(n-2, α/2), with α/2 = .025 in each tail; do not reject H0 in between]

• Interpretation of the p-value
– The probability, if the null hypothesis were true, of observing a t-statistic at least as extreme as the one in the data
– The lower the p-value, the stronger the evidence against the null
– Lower p-value => Higher statistical significance
– SIGNIFICANCE DOES NOT IMPLY THE STRENGTH OF RELATIONSHIP
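
To make the t-statistic and its p-value concrete, here is a small sketch in plain Python. It integrates the Student t density numerically with Simpson's rule, which is a stand-in chosen for this example so that it needs no statistics library; the slope estimate 0.16, standard error 0.02, and sample size n = 5 are assumed values for illustration.

```python
import math


def t_density(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return norm * (1 + t * t / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_stat, df, steps=2000):
    """P(|T| >= |t_stat|) under H0, using Simpson's rule on [0, |t_stat|].

    steps must be even for Simpson's rule.
    """
    upper = abs(t_stat)
    h = upper / steps
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 == 1 else 2
        total += weight * t_density(i * h, df)
    area_zero_to_t = total * h / 3
    # The density is symmetric, so each tail is 0.5 minus the area from 0 to |t|.
    return max(0.0, 2 * (0.5 - area_zero_to_t))


# Assumed values: simple regression (p = 1) on n = 5 observations.
beta1_hat = 0.16
se_beta1 = 0.02
t_stat = beta1_hat / se_beta1
p_value = two_sided_p_value(t_stat, df=5 - 1 - 1)
print(f"t = {t_stat:.2f}, two-sided p-value = {p_value:.4f}")
```

A small p-value here says only that a slope of zero is hard to reconcile with the data; the size of the slope itself is what describes the strength of the relationship.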



Simple Linear Regression in R

• Example: the tips dataset


– Regress tip on total_bill



Assessing Overall Quality of the Model

• Measure of variation
– TSS: total sum of squares (variation of y around its mean)

– RgSS: regression sum of squares (variation explained by the regression model)

– RSS: residual sum of squares (variation attributable to other factors)

TSS = RgSS + RSS
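
The identity TSS = RgSS + RSS can be checked numerically. The sketch below fits a simple regression by least squares and computes the three sums of squares; the data are toy numbers and the function name is illustrative.

```python
def variation_decomposition(x, y):
    """Return (TSS, RgSS, RSS) for the least squares regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]
    tss = sum((yi - y_bar) ** 2 for yi in y)                 # total variation
    rgss = sum((fi - y_bar) ** 2 for fi in fitted)           # explained by the model
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # left in the residuals
    return tss, rgss, rss


# Toy data, assumed for illustration only.
total_bill = [10.0, 15.0, 20.0, 25.0, 30.0]
tip = [1.5, 2.5, 3.0, 3.5, 5.0]

tss, rgss, rss = variation_decomposition(total_bill, tip)
print(f"TSS={tss:.2f}, RgSS={rgss:.2f}, RSS={rss:.2f}")
```

The decomposition holds exactly only for the least squares fit with an intercept; for any other line the cross term does not vanish.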



Assessing Overall Quality of the Model

[Figure: scatter plot of y against X with the fitted line; at a point xi, the total deviation yi – ȳ is split into the explained part ŷi – ȳ and the residual yi – ŷi]

Assessing Overall Quality of the Model

• R-squared (R²): fraction of variation explained by the predictors

  $R^2 = \mathrm{RgSS} / \mathrm{TSS} = 1 - \mathrm{RSS} / \mathrm{TSS}$

– R² is always between 0 and 1

– R² is COR[Y, Ŷ]², and is equal to COR[X, Y]² in simple regression
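
The link between R-squared and correlation in simple regression can be verified directly. This sketch computes R-squared as 1 - RSS/TSS and compares it with the squared Pearson correlation of X and Y; the data are toy numbers and the helper names are illustrative.

```python
import math


def pearson_correlation(a, b):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    sab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    saa = sum((ai - a_bar) ** 2 for ai in a)
    sbb = sum((bi - b_bar) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)


def r_squared_simple(x, y):
    """R-squared of the least squares regression of y on x, plus fitted values."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    tss = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - rss / tss, fitted


# Toy data, assumed for illustration only.
total_bill = [10.0, 15.0, 20.0, 25.0, 30.0]
tip = [1.5, 2.5, 3.0, 3.5, 5.0]

r2, fitted = r_squared_simple(total_bill, tip)
cor_xy = pearson_correlation(total_bill, tip)
cor_y_fitted = pearson_correlation(tip, fitted)
print(f"R2={r2:.4f}, COR[X,Y]^2={cor_xy ** 2:.4f}, COR[Y,Y_hat]^2={cor_y_fitted ** 2:.4f}")
```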

[Figure: three scatter plots of Y against X, showing r² = 1, 0 < r² < 1, and r² = 0]

Assessing Overall Quality of the Model

• Does the whole model explain anything at all?


– H0: β1 = β2 = ... = βp = 0
– HA: at least one β ≠ 0

• F statistic and p-value
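
The F statistic compares explained variation per predictor with unexplained variation per residual degree of freedom. A minimal sketch follows, using assumed sums of squares (TSS = 6.7, RSS = 0.3, n = 5, p = 1) rather than output from the tips dataset; turning F into a p-value needs the F distribution, which is left to a statistics package.

```python
def f_statistic(tss, rss, n, p):
    """Overall F statistic: ((TSS - RSS) / p) / (RSS / (n - p - 1))."""
    explained_per_predictor = (tss - rss) / p
    unexplained_per_df = rss / (n - p - 1)
    return explained_per_predictor / unexplained_per_df


# Assumed values for illustration: one predictor, five observations.
f_value = f_statistic(tss=6.7, rss=0.3, n=5, p=1)
print(f"F = {f_value:.1f} on 1 and 3 degrees of freedom")
```

With a single predictor, F equals the square of the slope's t-statistic, so the overall test and the test on β1 agree.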

[Figure: F distribution starting at 0; do not reject H0 to the left of the critical value, reject H0 in the right tail with α = .05]

Multiple Linear Regression

• A linear model with multiple (p) predictors

  $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p + \varepsilon$

• Interpretation of the parameters
– β0 is the intercept (i.e. the average value for Y if all the X’s are zero)
– βj is the slope for variable Xj, i.e., the average increase in Y when Xj is increased by one and all other X’s are held constant
– But predictors usually change together!



Estimating Multiple Linear Regression

• Least squares estimation is still valid
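
With several predictors, minimizing RSS leads to the normal equations (X'X) b = X'y. The sketch below solves them with a small Gaussian elimination routine in plain Python; this is one possible illustration, not how R's lm computes its fit. The data are synthetic and noise-free, generated from assumed coefficients (1.0, 0.1, 0.5) so that the estimates can be checked against them.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes a is square and non-singular (no perfectly collinear predictors).
    """
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]   # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                 # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_ols(rows, y):
    """rows: one list of predictor values per observation. Returns [b0, b1, ..., bp]."""
    design = [[1.0] + list(row) for row in rows]   # prepend the intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic (total_bill, size) pairs and exact tips from assumed coefficients.
rows = [(10.0, 1), (15.0, 2), (20.0, 2), (25.0, 3), (30.0, 4), (12.0, 2)]
tips = [1.0 + 0.1 * bill + 0.5 * size for bill, size in rows]

coefficients = fit_ols(rows, tips)
print([round(c, 6) for c in coefficients])
```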



Multiple Linear Regression in R

• Regress tip on total_bill and size



Assessing Overall Quality of the Model

• R² can be calculated similarly

[Figure: fitted regression plane ŷ = b0 + b1x1 + b2x2 over the x1 and x2 axes, with an observed yi and its fitted value ŷi at (x1i, x2i)]



Assessing Overall Quality of the Model

• R² never decreases when a new X variable is added to the model

• Adjusted R²:

  $\bar{R}^2 = 1 - \dfrac{\mathrm{RSS} / (n - p - 1)}{\mathrm{TSS} / (n - 1)}$

– used to correct for the fact that adding non-relevant variables will still reduce the residual sum of squares
– provides a better comparison between multiple regression models with different numbers of independent variables
– penalizes excessive use of unimportant independent variables
– smaller than R²
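
A short sketch of the adjustment, written in terms of R-squared, the sample size n, and the number of predictors p. The R-squared values 0.900 and 0.901 are hypothetical, chosen to show a case where adding a fourth predictor raises R-squared slightly but lowers adjusted R-squared.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R-squared: 1 - (1 - R2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Hypothetical comparison on n = 50 observations.
three_predictors = adjusted_r_squared(0.900, n=50, p=3)
four_predictors = adjusted_r_squared(0.901, n=50, p=4)
print(f"p=3: {three_predictors:.4f}, p=4: {four_predictors:.4f}")
```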



Model Selection

• Deciding which variables to include in the model


– Compare all subsets (2^p of them)
– Forward selection: start from none, iteratively add the variable that improves R² most
– Backward selection: start from all, iteratively remove the least significant variable
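
Forward selection as described above can be sketched as a greedy loop. The slide does not give a stopping rule, so this example simply stops after `max_vars` variables, which is an assumption of the sketch; the least squares solver, the candidate names x1 to x3, and the data are all illustrative.

```python
def solve_linear_system(a, b):
    """Gaussian elimination with partial pivoting; assumes a is non-singular."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def r_squared(columns, y):
    """R-squared of the least squares fit of y on the given predictor columns."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, d)) for d in design]
    y_bar = sum(y) / n
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    tss = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - rss / tss


def forward_selection(candidates, y, max_vars):
    """Greedily add the candidate that raises R-squared most, up to max_vars."""
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < max_vars:
        best = max(
            remaining,
            key=lambda name: r_squared([candidates[s] for s in selected + [name]], y),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Synthetic data: y depends on x1 and x2 only; x3 is an irrelevant candidate.
candidates = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "x2": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
    "x3": [0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0],
}
y = [3.0 + 2.0 * a + 0.5 * b for a, b in zip(candidates["x1"], candidates["x2"])]

chosen = forward_selection(candidates, y, max_vars=2)
print(chosen)
```

In practice a stopping rule based on adjusted R², a p-value threshold, or a validation set would replace the fixed `max_vars`, since raw R² never decreases as variables are added.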



Qualitative Predictors

• Code categorical predictors as indicator variables (dummy variables)


– Two categories: sex = {Male, Female} => sexMale = 1 if Male and 0 if Female
– More categories: day = {Thur, Fri, Sat, Sun}
  => daySat = 1 if Sat and 0 otherwise
     daySun = 1 if Sun and 0 otherwise
     dayThur = 1 if Thur and 0 otherwise

• Can we simply code day={0,1,2,3}?
– An n-category predictor => n – 1 dummy variables (one is kept as the baseline category)
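
Dummy coding with a baseline category can be sketched as follows. Coding day as 0, 1, 2, 3 would impose an ordering and equal spacing between days, which indicator variables avoid. The function name and the sample values are illustrative; the baseline Fri matches the slide, where daySat, daySun, and dayThur are the generated columns.

```python
def dummy_code(values, prefix, baseline):
    """Return {column_name: 0/1 list} for every level except the baseline."""
    if baseline not in values:
        raise ValueError("baseline must be one of the observed levels")
    levels = sorted(set(values) - {baseline})
    return {
        f"{prefix}{level}": [1 if v == level else 0 for v in values]
        for level in levels
    }


# Illustrative observations of the day variable.
day = ["Thur", "Fri", "Sat", "Sun", "Sat"]
dummies = dummy_code(day, prefix="day", baseline="Fri")

for name, column in dummies.items():
    print(name, column)
```

A row with all dummies equal to 0 belongs to the baseline category, so each coefficient is read as a difference from Fri.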



Qualitative Predictors

• Interpretation: the coefficient of sexMale is the expected difference in tips from a male customer as opposed to a female customer, after controlling for the effect of total_bill and size

[Figure: Y against X, with separate fitted lines for Female and Male]



Qualitative Predictors

• Is this the same as having two regressions for Male and Female separately?

[Figure: Y against X, comparing one regression with a sex dummy vs. separate fitted lines for Female and Male]



Qualitative Predictors

• Multi-level factors



Interaction

• In the previous models, the effect of total_bill (β1) is independent of other predictors (e.g., size)

• What if the effect on Y of increasing X1 depends on another X2?
– With larger size, there could be stronger or weaker impact from total_bill
– Smokers pay more tips (as percentage of total_bill)

• In statistics it is referred to as an interaction effect (synergy, complementarity)
– Mathematically,

– Which is equivalent to



Interaction

• Without interaction

[Figure: tip against total_bill, with fitted lines for smokerNo and smokerYes]



Interaction

• With interaction

[Figure: tip against total_bill, with fitted lines for smokerNo and smokerYes]



Interaction

• Interpretation of interaction effect
– β4 is significantly different from 0 (very low p-value)
– β4 = -0.064 means that on average, smokers tip 6.4% (of the total bill) less than non-smokers, after controlling for total bill amount and table size
– Overall effect of being a smoker:
• β4: 6.4% (of the total bill) less
• β3: $1.16 more

[Figure: tip against total_bill, with fitted lines for smokerNo and smokerYes]

• R² is improved after including the interaction term
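
The two smoker coefficients pull in opposite directions, so the net effect depends on the bill. The sketch below combines them, assuming β3 is the coefficient of the smoker dummy and β4 the coefficient of the total_bill × smoker interaction, as the bullets above suggest; the function name is illustrative.

```python
BETA_SMOKER = 1.16          # beta3 from the slide: $1.16 more for smokers
BETA_INTERACTION = -0.064   # beta4 from the slide: 6.4% of the bill less


def smoker_effect(total_bill):
    """Expected tip difference, smoker minus non-smoker, at a given bill."""
    return BETA_SMOKER + BETA_INTERACTION * total_bill


# Bill at which the two effects cancel out.
break_even_bill = -BETA_SMOKER / BETA_INTERACTION

for bill in (10.0, break_even_bill, 30.0):
    print(f"total_bill={bill:6.3f}  smoker effect={smoker_effect(bill):+.3f}")
```

Under these estimates smokers are expected to tip more on bills below about $18 and less on larger bills, holding table size fixed.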



Nonlinear Terms

• A simple way of introducing nonlinearity

– Quadratic term

– Logarithm transformation



Practical Issues

• Multicollinearity
• Non-constant variance of error terms (Heteroscedasticity)
• Dependence of the error terms (Durbin-Watson Test)
• Outliers



Selling Seasonal Products

• Challenges in selling fashion products


– Short selling season
• Several weeks (and getting shorter and shorter)
• Obsolescence after the season
– Volatile market conditions
• Difficult to predict the trend in advance
• Erratic demand: “hit” items sell out quickly while “dogs” sell slowly
• Lack of experience or historical data
– Rigid supply chains
• Long procurement lead time (several months) due to outsourcing, etc.
• Product line and quantity commitment well in advance of the selling season

• Other seasonal products


– Toys, IT Products



Markdown

• Once the season begins, the only thing we can control is price

• Markdown: dynamically reduce price to boost demand and increase revenue

“Retailers hate markdowns. Discount an item too late, and stores are stuck with truckloads of inventory. Too early, and they lose profits as people snap up items thrown on the bargain table prematurely …”

Wall Street Journal, Aug. 7, 2001



Markdown

• Markdown is getting increasingly prevalent


– Average discount in US department stores: 6% in 1967 vs. 20% in 1997
– Average discount in US specialty stores: 10% in 1967 vs. 28% in 1997
– In 1998, 72% of all fashion items were sold at a discount

• Why mark down?


– Perishability: product value decreases over time
– Limited capacity/cash: to make room for next batch
– Salesforce incentive, financial concern, …

• Markdown vs. Promotion



The Economics Behind Markdown

• Segmentation by time => the dynamic nature of the choice process


• Price Skimming: riding down the demand curve

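[Figure: demand curve with Demand on the vertical axis and Price on the horizontal axis; price points P13 through P1 are marked along the Price axis starting from 0]

Perfect Price Discrimination!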



The Economics Behind Markdown

• Why doesn't everybody just wait until the end?
– Fashionability: pay more to be the first
– Time of Use: winter clothes, toys for Christmas, gifts for Valentine’s Day
– Obsolescence: consumer electronic goods, computer hardware
– Quality deterioration: bread, fresh juice
– Availability: cheaper later, but may not be available
– Price uncertainty: future price is unknown to the customers

Segmentation by Time and Risk!



The Retailer Case/Game

• A stock of 2000 units of a single fashion item, no replenishment

• Demand is hard to predict, limited historical data available

• Pricing
– Four allowable prices: $60 (full price), $54 (90%), $48 (80%), $36 (60%)
– Start with $60 in Week 1, then mark down over time (no price increase)

• 15-week selling season


– Leftover items are sold to discounters for $25 per unit (salvage value)

• Costs are paid (hence they are sunk)

• GOAL: maximize revenue by determining the timing and magnitude of markdowns



The Retailer Case/Game

• Two key steps
– Estimation: build the demand model
• Study the historical data
• Choose a demand model and estimate parameters
– Optimize prices

Sales history, item 2:
Week | Qty | Price | Sales
1 | 2000 | 60 | 115
2 | 1885 | 60 | 105
3 | 1780 | 60 | 136
4 | 1644 | 60 | 115
5 | 1529 | 60 | 73
6 | 1456 | 60 | 102
7 | 1354 | 54 | 58
8 | 1296 | 54 | 187
9 | 1109 | 54 | 198
10 | 911 | 54 | 196
11 | 715 | 54 | 132
12 | 583 | 54 | 60
13 | 523 | 54 | 119
14 | 404 | 54 | 131
15 | 273 | 54 | 215
16 | 58 | |

Data → Information → Model → Prescription
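
To make the objective concrete, the revenue of one season can be computed from a price path and the weekly sales. The sketch below encodes the game rules stated earlier (four allowable prices, start at $60, markdowns only, $25 salvage, 2000 units) and applies them to the item 2 history in the table; the function and constant names are illustrative, and the 15-week season length is not enforced.

```python
ALLOWED_PRICES = (60, 54, 48, 36)   # full price, then 90%, 80%, 60%
SALVAGE_VALUE = 25                  # per leftover unit after the season
INITIAL_STOCK = 2000


def season_revenue(prices, sales):
    """In-season revenue plus salvage revenue for one price path."""
    if len(prices) != len(sales):
        raise ValueError("prices and sales must cover the same weeks")
    if any(p not in ALLOWED_PRICES for p in prices):
        raise ValueError("price is not one of the allowable prices")
    if prices[0] != ALLOWED_PRICES[0]:
        raise ValueError("week 1 must be sold at full price")
    if any(later > earlier for earlier, later in zip(prices, prices[1:])):
        raise ValueError("prices may only be marked down, never raised")
    leftover = INITIAL_STOCK - sum(sales)
    if leftover < 0:
        raise ValueError("cannot sell more than the initial stock")
    in_season = sum(p * q for p, q in zip(prices, sales))
    return in_season + leftover * SALVAGE_VALUE


# Item 2 history from the table: six weeks at $60, then nine weeks at $54.
prices = [60] * 6 + [54] * 9
sales = [115, 105, 136, 115, 73, 102, 58, 187, 198, 196, 132, 60, 119, 131, 215]

revenue = season_revenue(prices, sales)
print(revenue)  # 110194
```

Optimizing the markdown path then amounts to choosing `prices` to maximize this quantity, with `sales` replaced by the predictions of an estimated demand model.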



Next time …

• The Retailer Case: Data Analysis

• Homework
– Do self-study on how to use StatTools to do regression
• http://www.palisade.com/GuidedTour/EN/StatTools/
– Read the retailer case and understand the context of fashion retailing
– Explore the retailer dataset and prepare a one-or-two-page summary on
• What are the major factors affecting sales/revenue of a product?
• How to measure the impact of each factor?
• How much uncertainty in sales remains after taking the factors into account?
• How to build a demand model based on your answers to the above?
