This document describes using double machine learning to estimate the price elasticity of demand from wine sales data. It uses various machine learning algorithms like lasso, elastic net, deep learning, random forests, and gradient boosting within the double machine learning framework. It finds that gradient boosting performs best, estimating the price elasticity to be around -1.19. The document concludes linear models may not be appropriate for this data and that nonparametric deep learning and gradient boosting methods learn the data better.


Machine Learning and Applied Econometrics
An Application: Double Machine Learning for Price Elasticity


Double Machine Learning for Price
Elasticity of Demand Function
• This presentation is in part based on:
– Alexandre Belloni, Victor Chernozhukov, and Christian Hansen, "High-Dimensional Methods and Inference on Structural and Treatment Effects," Journal of Economic Perspectives 28:2 (29-50), Spring 2014.
– Victor Chernozhukov, Denis Chetverikov, Mert Demirer, Esther Duflo, Christian Hansen, Whitney Newey, and James Robins, "Double/Debiased Machine Learning for Treatment and Structural Parameters," Econometrics Journal 21:1, 2018.

Structural and Treatment Effects
• The Model
  $Y = f(D, Z) + u, \quad E[u \mid Z, D] = 0$
  $D = h(Z) + v, \quad E[v \mid Z] = 0$
– D is the target variable of interest (e.g., price) or the treatment variable (typically, D = 0 or 1)
– Z is the set of exogenous covariates or control variables (instruments, confounders); it may be high-dimensional.
• Partial Linear Model: $f(D, Z) = \theta D + g(Z)$
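
A one-step derivation (standard in this literature, though not spelled out on the slide) shows why residualizing both Y and D on Z identifies θ in the Partial Linear Model:

```latex
% In the partial linear model Y = \theta D + g(Z) + u with D = h(Z) + v,
% the conditions E[u | Z, D] = 0 and E[v | Z] = 0 give
E[Y \mid Z] = \theta\, h(Z) + g(Z)
\;\Longrightarrow\;
Y - E[Y \mid Z] = \theta\,\bigl(D - E[D \mid Z]\bigr) + u .
% Theta is therefore the coefficient from regressing the Y-residual on the
% D-residual, which is what the Frisch-Waugh-Lovell slide below exploits.
```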

Structural and Treatment Effects
• If D is a numeric structural variable:
  $\theta = \partial Y / \partial D$
• If D = 1 or 0:
– Average Treatment Effect (ATE):
  $\theta = E[\, f(1, Z) - f(0, Z) \,]$
– Average Treatment Effect for the Treated (ATT):
  $\theta = E[\, f(1, Z) - f(0, Z) \mid D = 1 \,]$

Structural and Treatment Effects
• Based on the Partial Linear Model:
– Frisch-Waugh-Lovell Theorem: $\hat\theta = \theta$, using the residuals
  $\hat u = Y - \hat\theta D - \hat g(Z)$
  $\tilde u = Y - \hat g(Z) = \theta v + u$, if $g$ and $h$ are linear
  $\tilde v = D - \hat h(Z)$
– Machine Learning: estimate $g(Z)$ and $h(Z)$
– OLS: $\hat\theta = \tilde v' \tilde u \,/\, \tilde v' \tilde v$
  • With regularized ML estimates of $g$ and $h$, this estimate is biased and inefficient!
– De-biased: $\check\theta = \tilde v' \tilde u \,/\, \tilde v' D$; in general $\check\theta \ne \hat\theta$ (the two estimators are compared numerically in the sketch below)
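
A minimal numerical sketch of the naive and de-biased residual estimators, using simulated data and scikit-learn's lasso in place of the deck's H2O models (the simulation design and variable names are purely illustrative):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p, theta = 5000, 50, -1.2            # true structural coefficient
Z = rng.normal(size=(n, p))
v = rng.normal(size=n)
D = Z[:, 0] + 0.5 * Z[:, 1] + v         # D = h(Z) + v
u = rng.normal(size=n)
Y = theta * D + Z[:, 0] - Z[:, 2] + u   # Y = theta*D + g(Z) + u

# ML estimates of the nuisance functions (no sample splitting yet;
# cross-fitting, introduced on the next slides, removes overfitting bias).
ell_hat = Lasso(alpha=0.1).fit(Z, Y).predict(Z)  # proxy for E[Y|Z]
h_hat = Lasso(alpha=0.1).fit(Z, D).predict(Z)    # proxy for E[D|Z]
u_til = Y - ell_hat                              # residualized outcome
v_til = D - h_hat                                # residualized treatment

theta_ols = v_til @ u_til / (v_til @ v_til)      # residual-on-residual OLS
theta_db = v_til @ u_til / (v_til @ D)           # de-biased denominator v'D
print(theta_ols, theta_db)                       # both should be near -1.2
```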

Structural and Treatment Effects
• Based on the Partial Linear Model:
– Sample Splitting
  • {1, …, N} = set of all observations
  • I1 = main sample: a set of observation indices of size n, used to estimate θ; e.g., n = N/2
  • I2 = auxiliary sample: a set of observation indices of size N − n, used to estimate g
  • I1 and I2 form a random partition of the set {1, …, N}
– Cross Fitting on {I1, I2} and {I2, I1}

Structural and Treatment Effects
• Cross Fitting on {I1, I2} and {I2, I1}
– Machine Learning:
  $\hat g_1(Z)$ and $\hat h_1(Z)$ on $(I_1, I_2)$
  $\hat g_2(Z)$ and $\hat h_2(Z)$ on $(I_2, I_1)$
– De-Biased Estimator:
  $\hat\theta_1 = \hat\theta(I_1, I_2), \quad \hat\theta_2 = \hat\theta(I_2, I_1), \quad \hat\theta = \tfrac{1}{2}(\hat\theta_1 + \hat\theta_2)$
– $\hat\theta$ is $\sqrt{N}$-consistent and approximately centered normal (Chernozhukov et al., 2017)

Structural and Treatment Effects
• Extensions
– Based on sample splitting {1, …, N} = {I1, I2}, the de-biased estimator may be obtained from the pooled data and ML residuals:
  $\hat\theta = [\hat v_1\ \hat v_2]' [\hat u_1\ \hat u_2] \,/\, [\hat v_1\ \hat v_2]' [D_1\ D_2]$
– Cross fitting can be k-fold, e.g., k = 2, 5, 10 (see the sketch below)
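
A compact sketch of the pooled k-fold estimator, again with scikit-learn standing in for H2O (gradient boosting is used here to mirror the GBM results later in the deck; function and variable names are illustrative):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import KFold

def dml_plm(Y, D, Z, k=5, seed=0):
    """Pooled k-fold cross-fitted estimate of theta in Y = theta*D + g(Z) + u."""
    u_til = np.empty_like(Y)  # out-of-fold outcome residuals
    v_til = np.empty_like(D)  # out-of-fold treatment residuals
    for train, test in KFold(k, shuffle=True, random_state=seed).split(Z):
        gY = GradientBoostingRegressor(n_estimators=50, max_depth=5)
        gD = GradientBoostingRegressor(n_estimators=50, max_depth=5)
        gY.fit(Z[train], Y[train])   # learn E[Y|Z] on the other folds
        gD.fit(Z[train], D[train])   # learn E[D|Z] on the other folds
        u_til[test] = Y[test] - gY.predict(Z[test])
        v_til[test] = D[test] - gD.predict(Z[test])
    return v_til @ u_til / (v_til @ D)  # pooled de-biased estimator
```

With Y = log quantity and D = log price, the returned value is the cross-fitted elasticity estimate.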

Example: Table Wine Sales in Vancouver BC

• Total Weekly Sales of Imported and Domestic Table Wine in Vancouver, BC, Canada, from the week ending April 4, 2009 to the week ending May 28, 2011 (372,228 sales)
– Irregularly-spaced time series
– Data Source: American Association of Wine Economists

Example: Table Wine Sales in Vancouver BC

• 372,228 observations of 17 variables in an Excel spreadsheet:
– SKU #, Product Long Name, Store Category Major Name, Store Category Sub Name, Store Category Minor Name, Current Display Price, Bottled Location Code, Bottle Location Desc, Domestic/Import Indicator, VQA Indicator, Product Sweetness Code, Product Sweetness Desc, Alcohol Percent, Julian Week No, Week Ending Date, Total Weekly Selling Unit, Total Weekly Volume Litre

Table Wine Sales in Vancouver BC
Double Machine Learning of Price Elasticity

$Y = \theta D + g(Z) + u, \quad E[u \mid Z, D] = 0$
$D = m(Z) + v, \quad E[v \mid Z] = 0$

• Y = log of quantity (Total Weekly Selling Unit, in bottles)
• D = log of price (Current Display Price, in Canadian $)
• Z = { What = Store Category Minor Name (Red/White), Where = Store Category Sub Name (Countries), Loc = Bottled Location Code, Alc = Alcohol Percent, Age = Julian Week No, … }
• θ = price elasticity; in this log-log specification, θ = ∂ log Q / ∂ log P (a data-construction sketch follows below)
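
A hedged sketch of building Y, D, and Z from the spreadsheet with pandas, reusing the dml_plm sketch above (the file name is hypothetical; the column names are taken from the variable list above and assumed to match the file exactly):

```python
import numpy as np
import pandas as pd

df = pd.read_excel("vancouver_table_wine.xlsx")  # hypothetical file name

Y = np.log(df["Total Weekly Selling Unit"])      # log quantity
D = np.log(df["Current Display Price"])          # log price
Z = pd.get_dummies(                              # one-hot encode categoricals
    df[["Store Category Minor Name", "Store Category Sub Name",
        "Bottled Location Code", "Alcohol Percent", "Julian Week No"]],
    columns=["Store Category Minor Name", "Store Category Sub Name",
             "Bottled Location Code"],
)
theta_hat = dml_plm(Y.to_numpy(), D.to_numpy(), Z.to_numpy(dtype=float))
```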

Table Wine Sales in Vancouver BC
Double Machine Learning of Price Elasticity

• GLM (Lasso)
K-fold CF   Y (Val. MSE)   D (Val. MSE)   θ (Price Elast.)
2           2.126          0.320          -1.238
5           2.126          0.320          -1.238
10          2.126          0.320          -1.238

• GLM (Elastic Net)
K-fold CF   Y (Val. MSE)   D (Val. MSE)   θ (Price Elast.)
2           2.129          0.321          -1.228
5           2.127          0.321          -1.232
10          2.127          0.320          -1.233

Table Wine Sales in Vancouver BC
Double Machine Learning of Price Elasticity

• DL (20, 20)
K-fold CF   Y (Val. MSE)   D (Val. MSE)   θ (Price Elast.)
2           1.977          0.273          -1.261
5           1.984          0.273          -1.271
10          1.983          0.274          -1.131

• DL (20, 10, 5)
K-fold CF   Y (Val. MSE)   D (Val. MSE)   θ (Price Elast.)
2           1.966          0.273          -1.279
5           1.982          0.274          -1.124
10          1.973          0.273          -1.245

Table Wine Sales in Vancouver BC
Double Machine Learning of Price Elasticity

• DRF (50 trees, max depth = 20)
K-fold CF   Y (Val. MSE)   D (Val. MSE)   θ (Price Elast.)
2           2.126          0.320          -1.129
5           2.130          0.318          -1.135
10          2.129          0.318          -1.136

• GBM (50 trees, max depth = 5)
K-fold CF   Y (Val. MSE)   D (Val. MSE)   θ (Price Elast.)
2           1.943          0.266          -1.192
5           1.944          0.266          -1.192
10          1.941          0.265          -1.193

Table Wine Sales in Vancouver BC
Double Machine Learning of Price Elasticity

• Conclusion
– The linear models (Lasso and Elastic Net) do not fit this dataset well, so their price elasticity estimate of about -1.23 may not be reliable.
– The nonparametric Deep Learning Neural Networks and Gradient Boosting Machine learn this dataset better (lower validation MSE).
– The Gradient Boosting Machine, applied within the partial linear model framework, estimates the price elasticity at about -1.19.
– All computations were done with the R package H2O:
  • Darren Cook, Practical Machine Learning with H2O, O'Reilly Media, Inc., 2017.
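
For reference, a minimal sketch of fitting one of the GBM nuisance models with H2O's Python API, mirroring the 50-tree, depth-5, 5-fold settings above (the slides used the R interface; the file and column names here are illustrative):

```python
import h2o
from h2o.estimators import H2OGradientBoostingEstimator

h2o.init()
wine = h2o.import_file("vancouver_table_wine.csv")  # illustrative file name
features = ["Store Category Minor Name", "Store Category Sub Name",
            "Bottled Location Code", "Alcohol Percent", "Julian Week No"]

# Keeping the cross-validation predictions yields the out-of-fold
# fitted values needed to form the DML residuals.
gbm_y = H2OGradientBoostingEstimator(ntrees=50, max_depth=5, nfolds=5,
                                     keep_cross_validation_predictions=True)
# "log_units" is assumed to be a precomputed log-quantity column.
gbm_y.train(x=features, y="log_units", training_frame=wine)
y_oof = gbm_y.cross_validation_holdout_predictions()

# An analogous model for the log-price column would give the D residuals.
```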