Machine Learning and Applied Econometrics
An Application: Double Machine Learning for Price Elasticity
4/22/2019
Structural and Treatment Effects
• The Model
    Y = f(D, Z) + u,  E(u | Z, D) = 0
    D = h(Z) + v,  E(v | Z) = 0
  – D is the target variable of interest (e.g., price) or the treatment variable (typically, D = 0 or 1)
  – Z is the set of exogenous covariates or control variables (instruments, confounders), which may be high-dimensional.
• Partial Linear Model: f(D, Z) = θD + g(Z)
• If D is a numeric structural variable:
    θ = ∂Y/∂D
• If D = 1 or 0:
  – Average Treatment Effect (ATE):
    θ = E[ f(1, Z) - f(0, Z) ]
  – Average Treatment Effect for the Treated (ATT):
    θ = E[ f(1, Z) - f(0, Z) | D = 1 ]
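As a numeric illustration of these two estimands, a minimal simulation can be used. Everything below is an assumption for illustration (the structural function f, the confounded treatment rule, and all numbers), not part of the wine application:

```python
import numpy as np

# Hypothetical structural function: f(D, Z) = (2 + Z)*D + Z**2,
# so the unit-level effect is f(1, Z) - f(0, Z) = 2 + Z.
rng = np.random.default_rng(0)
Z = rng.normal(size=100_000)

def f(D, Z):
    return (2.0 + Z) * D + Z**2

# Assumed confounded treatment: units with larger Z are more likely treated.
D = (rng.uniform(size=Z.shape) < 1.0 / (1.0 + np.exp(-Z))).astype(float)

# ATE = E[f(1, Z) - f(0, Z)]: average the contrast over everyone.
ate = np.mean(f(1, Z) - f(0, Z))

# ATT = E[f(1, Z) - f(0, Z) | D = 1]: the same contrast, among the treated only.
att = np.mean(f(1, Z[D == 1]) - f(0, Z[D == 1]))

print(ate, att)  # ATE near 2; ATT larger, because the treated have larger Z
```

The gap between the two estimands comes entirely from the confounding: conditioning on D = 1 selects units with higher Z, and hence a larger unit-level effect 2 + Z.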
• Based on the Partial Linear Model:
  – Frisch-Waugh-Lovell Theorem for θ̂:
      û = Y - θ̂D - ĝ(Z)
      ũ = Y - ℓ(Z), where ℓ(Z) = θh(Z) + g(Z); ũ = u + θv if g and h are linear
      ṽ = D - h(Z) = v
  – Machine Learning: learn g(Z) and h(Z)
  – OLS: θ̂ = ṽ'ũ / ṽ'ṽ
    • This estimate is biased and inefficient!
  – De-biased: θ̌ = ṽ'ũ / ṽ'D, in general
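The residual-on-residual recipe above can be sketched numerically. Everything in this toy example is assumed for illustration, not taken from the slides: the data-generating process, the true θ = 0.5, and cubic-polynomial least squares standing in for the machine learners.

```python
import numpy as np

# Toy partial linear model (assumed DGP):
#   Y = theta*D + g(Z) + u,   D = h(Z) + v,   true theta = 0.5
rng = np.random.default_rng(1)
n, theta = 5_000, 0.5
Z = rng.uniform(-2.0, 2.0, size=n)
v = rng.normal(scale=0.5, size=n)
u = rng.normal(scale=0.5, size=n)
D = np.cos(Z) + v              # h(Z) = cos(Z)
Y = theta * D + np.sin(Z) + u  # g(Z) = sin(Z)

# Stand-in "machine learners": cubic-polynomial least squares for the
# conditional means E[Y|Z] (= theta*h(Z) + g(Z)) and E[D|Z] (= h(Z)).
ell_hat = np.polyval(np.polyfit(Z, Y, 3), Z)
h_hat = np.polyval(np.polyfit(Z, D, 3), Z)

u_tilde = Y - ell_hat  # residualized outcome (approx. theta*v + u)
v_tilde = D - h_hat    # residualized treatment (approx. v)

theta_ols = (v_tilde @ u_tilde) / (v_tilde @ v_tilde)  # FWL / partialling-out
theta_db = (v_tilde @ u_tilde) / (v_tilde @ D)         # de-biased form

print(theta_ols, theta_db)  # both close to the true theta = 0.5
```

With an unregularized least-squares learner fit on the full sample, ṽ is exactly orthogonal to ĥ(Z), so the two ratios coincide here; the denominators ṽ'ṽ and ṽ'D differ once g and h come from regularized learners such as the Lasso, which is where the de-biased form matters.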
  – Sample Splitting
    • {1, ..., N} = set of all observations
    • I1 = main sample: a set of observation indices of size n, used to estimate θ; e.g., n = N/2.
    • I2 = auxiliary sample: the remaining observations, of size N - n, used to estimate g (and h).
    • I1 and I2 form a random partition of the set {1, ..., N}.
  – Cross Fitting on {I1, I2} and {I2, I1}
  – Machine Learning:
      ĝ1(Z) and ĥ1(Z) on (I1, I2): nuisances fit on I2, residuals formed on I1
      ĝ2(Z) and ĥ2(Z) on (I2, I1): nuisances fit on I1, residuals formed on I2
  – De-Biased Estimator:
      θ̌ = [ θ̌1(I1, I2) + θ̌2(I2, I1) ] / 2
  – θ̌ is √N-consistent and approximately centered normal (Chernozhukov et al., 2017)
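A minimal sketch of this two-fold cross-fitting scheme, under the same assumed toy data-generating process (true θ = 0.5) with cubic polynomials standing in for the machine learners:

```python
import numpy as np

# Assumed toy DGP: Y = theta*D + sin(Z) + u, D = cos(Z) + v, true theta = 0.5.
rng = np.random.default_rng(2)
n, theta = 4_000, 0.5
Z = rng.uniform(-2.0, 2.0, size=n)
D = np.cos(Z) + rng.normal(scale=0.5, size=n)
Y = theta * D + np.sin(Z) + rng.normal(scale=0.5, size=n)

idx = rng.permutation(n)
I1, I2 = idx[: n // 2], idx[n // 2 :]  # random partition of {1, ..., N}

def theta_check(main, aux):
    """Nuisances fit on the auxiliary sample; estimator evaluated on the main sample."""
    ell_fit = np.polyfit(Z[aux], Y[aux], 3)  # stand-in learner for E[Y|Z]
    h_fit = np.polyfit(Z[aux], D[aux], 3)    # stand-in learner for E[D|Z]
    u = Y[main] - np.polyval(ell_fit, Z[main])
    v = D[main] - np.polyval(h_fit, Z[main])
    return (v @ u) / (v @ D[main])           # de-biased estimator on this fold

# Swap the roles of the two folds and average the de-biased estimates.
theta_cf = 0.5 * (theta_check(I1, I2) + theta_check(I2, I1))
print(theta_cf)  # close to the true theta = 0.5
```

Fitting the nuisance functions on one fold and evaluating the estimator on the other removes the overfitting bias that arises when the same observations are used for both steps; averaging the two estimates restores full-sample efficiency.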
• Extensions
  – Based on sample splitting {1, ..., N} = {I1, I2}, the de-biased estimator may be obtained from the pooled data and ML residuals:
      θ̌ = (v̂1, v̂2)'(û1, û2) / (v̂1, v̂2)'(D1, D2)
  – Cross fitting can be k-fold, e.g., k = 2, 5, 10.
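The pooled k-fold variant can be sketched the same way. Again, the DGP, the true θ = 0.5, k = 5, and the polynomial stand-in learners are all assumptions for illustration:

```python
import numpy as np

# Assumed toy DGP: Y = theta*D + sin(Z) + u, D = cos(Z) + v, true theta = 0.5.
rng = np.random.default_rng(3)
n, theta, k = 6_000, 0.5, 5
Z = rng.uniform(-2.0, 2.0, size=n)
D = np.cos(Z) + rng.normal(scale=0.5, size=n)
Y = theta * D + np.sin(Z) + rng.normal(scale=0.5, size=n)

# k-fold version: each fold's residuals use nuisances fit on its complement.
folds = np.array_split(rng.permutation(n), k)
u_hat = np.empty(n)
v_hat = np.empty(n)
for fold in folds:
    aux = np.setdiff1d(np.arange(n), fold)  # all observations outside the fold
    u_hat[fold] = Y[fold] - np.polyval(np.polyfit(Z[aux], Y[aux], 3), Z[fold])
    v_hat[fold] = D[fold] - np.polyval(np.polyfit(Z[aux], D[aux], 3), Z[fold])

# Pooled de-biased estimator: stack the cross-fitted residuals from every
# fold and form a single ratio (v1,...,vk)'(u1,...,uk) / (v1,...,vk)'(D1,...,Dk).
theta_pooled = (v_hat @ u_hat) / (v_hat @ D)
print(theta_pooled)  # close to the true theta = 0.5
```

Compared with averaging per-fold estimates, the pooled ratio weights each fold by its realized residual variation, which is convenient when folds are small.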
Example: Table Wine Sales in Vancouver BC
Table Wine Sales in Vancouver BC
Double Machine Learning of Price Elasticity
    Y = θD + g(Z) + u,  E(u | Z, D) = 0
    D = m(Z) + v,  E(v | Z) = 0
• GLM (Lasso)
  K-fold CF   Y (Val. MSE)   D (Val. MSE)   θ̌ (Price Elast.)
      2          2.126           0.320           -1.238
      5          2.126           0.320           -1.238
     10          2.126           0.320           -1.238
• DL (20, 20)
  K-fold CF   Y (Val. MSE)   D (Val. MSE)   θ̌ (Price Elast.)
      2          1.977           0.273           -1.261
      5          1.984           0.273           -1.271
     10          1.983           0.274           -1.131
• DL (20, 10, 5)
  K-fold CF   Y (Val. MSE)   D (Val. MSE)   θ̌ (Price Elast.)
      2          1.966           0.273           -1.279
      5          1.982           0.274           -1.124
     10          1.973           0.273           -1.245
• Conclusion
  – The linear regression model may not explain and validate this dataset well; thus, the Lasso price elasticity estimate of about -1.24 may not be reliable.
  – The nonparametric Deep Learning neural networks and the Gradient Boosting Machine perform better in learning this dataset.
  – The price elasticity from the Gradient Boosting Machine, applied within the partial linear model framework, is about -1.19.
  – All computations are done with the R package H2O:
    • Darren Cook, Practical Machine Learning with H2O, O'Reilly Media, Inc., 2017.