First Assignment

The document outlines a homework assignment focused on nonlinear econometrics for finance, specifically analyzing household finance and medical costs using linear regression techniques in Python. It includes various problems that require generating histograms, performing regression analysis, testing hypotheses, and making predictions based on given data. Additionally, it covers the review of methods related to sample variance and its properties, including bias, consistency, and asymptotic normality.

Uploaded by

aurarolee

Nonlinear econometrics for finance

HOMEWORK 1
(Review of linear econometrics and review of
methods)

Problem 1 (Linear econometrics). (60 points) Household finance is a
growing field in finance. Rising health costs are not just impacting
households’ finances; they are affecting an array of decisions, including
the decision to change (or retire from) an occupation which provides
favorable health insurance subsidies. For a cross section of individuals,
the file “insurance.csv” provides the following information:
• age: age of primary beneficiary of health insurance
• sex: gender of primary beneficiary of health insurance
• bmi: body mass index, a measure of a person’s weight relative to height.
It is defined as bmi = kg/m^2, where kg is the person’s weight in kilograms
and m^2 is the square of the person’s height in meters. A bmi between 18.5
and 24.9 is considered healthy; a higher value is considered “overweight”.
• children: number of children covered by health insurance
• smoker: whether the primary beneficiary is a smoker or not
• region: the primary beneficiary’s residential area in the US (northeast,
southeast, northwest, southwest)
• charges: medical costs billed to health insurance.
Given this information, you need to perform linear regression in Python to
understand the drivers of medical costs.

(1) (3 points) Generate a histogram of the medical costs and compute
descriptive statistics (mean, median, standard deviation, minimum,
maximum). Is the distribution symmetric? Why or why not, in your view?

(2) (3 points) Take a logarithmic transformation of the medical costs. Plot
the histogram of the log-costs. What do you notice now? How would
you explain the change?
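
A minimal sketch of the computations in items (1) and (2), assuming only NumPy is available. Since the contents of “insurance.csv” are not reproduced here, the `charges` array below is simulated from a lognormal distribution purely as a stand-in; with the real file, the `charges` column would take its place. The helper name `describe` and all parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for the `charges` column (assumption: simulated, NOT the real data).
charges = rng.lognormal(mean=9.0, sigma=0.9, size=1000)

def describe(x):
    """Return the descriptive statistics requested in item (1), plus skewness."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    # Moment-based skewness: zero for a symmetric distribution.
    skew = np.mean(centered**3) / np.mean(centered**2) ** 1.5
    return {
        "mean": x.mean(),
        "median": np.median(x),
        "std": x.std(ddof=1),
        "min": x.min(),
        "max": x.max(),
        "skewness": skew,
    }

raw_stats = describe(charges)
log_stats = describe(np.log(charges))

# Histogram counts and bin edges; a plotting library can draw these directly.
counts, edges = np.histogram(charges, bins=30)
log_counts, log_edges = np.histogram(np.log(charges), bins=30)

for name, stats in (("charges", raw_stats), ("log(charges)", log_stats)):
    print(name, {k: round(float(v), 3) for k, v in stats.items()})
```

Comparing the mean with the median, and the skewness before and after taking logs, is one simple way to back up a visual judgment about symmetry.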

Begin by excluding all categorical variables (sex, smoker and region).

(3) (4 points) Run a regression of the log-costs on the non-categorical
explanatory variables:

$$\log(\text{cost}_i) = \theta_0 + \theta_1\,\text{age}_i + \theta_2\,\text{bmi}_i + \theta_3\,\text{children}_i + \varepsilon_i,$$

where $\varepsilon_i$ is an error term.

(4) (3 points) Give an economic interpretation of the estimated coefficients
in the regression above. What does the model say about the determinants
of medical costs?

(5) (4 points) We want to test whether the coefficient $\theta_2$ for bmi is
statistically significant. Test the hypothesis using the relevant test
statistic. Does bmi have more or less explanatory power than age?

(6) (3 points) We want to test whether the coefficient $\theta_2$ for bmi is
statistically significant. Test the hypothesis using the relevant p-value.

(7) (5 points) Test the single linear restriction $\theta_1 = 3\theta_2$ using
the relevant test statistic.

(8) (3 points) Test the single linear restriction $\theta_1 = 3\theta_2$ using
the relevant p-value.

(9) (5 points) Test the multiple linear restriction $\theta_1 = 0.04$ and
$\theta_2 = 0$ using the relevant test statistic.

(10) (3 points) Test the multiple linear restriction $\theta_1 = 0.04$ and
$\theta_2 = 0$ using the relevant p-value.

(11) (4 points) Using the estimated model, predict medical costs for a
50-year-old person with bmi = 36 and 4 children. Is the prediction lower or
higher than the mean of the distribution of the medical costs? (Recall
that the regression gives you a prediction for the log of the medical
costs (say, log(y)), not for the medical costs (say, y). Hence, after you
find the prediction for the log of the medical costs, you need to make
a transformation to find a prediction for the medical costs themselves.
Hint: if log(y) is normal, y is lognormal. What is E(y) for a lognormal
random variable?)
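
The mechanics behind items (3) to (11) can be sketched as follows, assuming only NumPy and the standard library. The data are simulated (the real “insurance.csv” is not used), the coefficient vector `true_theta` is hypothetical, standard errors are the classical homoskedastic ones, and p-values use the asymptotic normal and chi-square approximations rather than exact finite-sample distributions. A regression library would normally report the same quantities; the point here is only to show where they come from.

```python
import math
import numpy as np

rng = np.random.default_rng(1)
n = 500
# Simulated regressors and log-costs (assumption: NOT the real insurance data).
age = rng.integers(18, 65, size=n).astype(float)
bmi = rng.normal(30.0, 6.0, size=n)
children = rng.integers(0, 6, size=n).astype(float)
true_theta = np.array([7.0, 0.035, 0.01, 0.10])  # hypothetical values
X = np.column_stack([np.ones(n), age, bmi, children])
log_cost = X @ true_theta + rng.normal(0.0, 0.8, size=n)

# OLS: theta_hat = (X'X)^{-1} X'y, with the classical (homoskedastic) covariance.
XtX_inv = np.linalg.inv(X.T @ X)
theta_hat = XtX_inv @ X.T @ log_cost
resid = log_cost - X @ theta_hat
k = X.shape[1]
sigma2_hat = resid @ resid / (n - k)
cov = sigma2_hat * XtX_inv
se = np.sqrt(np.diag(cov))

def normal_two_sided_p(z):
    """Two-sided p-value under the asymptotic N(0, 1) approximation."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Items (5)-(6): t-statistic and p-value for H0: theta_2 = 0.
t_bmi = theta_hat[2] / se[2]
p_bmi = normal_two_sided_p(t_bmi)

# Items (7)-(8): single restriction theta_1 - 3*theta_2 = 0, written as r'theta = 0.
r = np.array([0.0, 1.0, -3.0, 0.0])
t_restr = (r @ theta_hat) / math.sqrt(r @ cov @ r)
p_restr = normal_two_sided_p(t_restr)

# Items (9)-(10): joint restriction R theta = q; Wald statistic is chi-square(2) under H0.
R = np.array([[0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]])
q = np.array([0.04, 0.0])
diff = R @ theta_hat - q
wald = diff @ np.linalg.solve(R @ cov @ R.T, diff)
p_wald = math.exp(-wald / 2.0)  # chi-square(2) survival function is exp(-w/2)

# Item (11): prediction in levels using E(y) = exp(mu + sigma^2 / 2) for lognormal y.
x_new = np.array([1.0, 50.0, 36.0, 4.0])
pred_log = x_new @ theta_hat
pred_cost = math.exp(pred_log + sigma2_hat / 2.0)

print("theta_hat:", np.round(theta_hat, 4))
print("t(bmi) =", round(t_bmi, 3), "p =", round(p_bmi, 4))
print("t(theta1 = 3*theta2) =", round(t_restr, 3), "p =", round(p_restr, 4))
print("Wald =", round(wald, 3), "p =", round(p_wald, 4))
print("predicted cost:", round(pred_cost, 2))
```

Note that the level prediction exceeds exp(predicted log-cost) because of the sigma^2 / 2 term, which is the adjustment the hint in item (11) points to.
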
Now, take the categorical variables into account using dummy variables
(https://en.wikipedia.org/wiki/Dummy_variable_(statistics)).
(12) (3 points) How much more (or less) do males spend relative to females
(controlling for all other variables)?
(13) (3 points) How much more (or less) do smokers spend relative to
non-smokers (controlling for all other variables)?
(14) (3 points) In which region are medical costs higher (controlling for all
other variables)?
(15) (3 points) What is the difference in medical costs between the northeast
and the southwest (controlling for all other variables)?
(16) (4 points) Are the coefficients associated with the dummies individually
statistically significant?
(17) (4 points) Using your model, predict medical costs for a 50-year-old
male smoker with bmi = 36 who lives in the southwest and has 4
children.
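
One way to build the dummy variables for items (12) to (17) is sketched below, again on simulated data and with NumPy only. The helper `dummies`, the choice of baseline categories (female, non-smoker, northeast), and the coefficient vector `beta` are all assumptions made for illustration; any other baseline would work as long as one category per variable is omitted.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400
# Simulated categorical columns (assumption: NOT the real insurance data).
sex = rng.choice(["female", "male"], size=n)
smoker = rng.choice(["no", "yes"], size=n, p=[0.8, 0.2])
region = rng.choice(["northeast", "northwest", "southeast", "southwest"], size=n)
age = rng.integers(18, 65, size=n).astype(float)

def dummies(values, baseline):
    """One 0/1 column per category except `baseline` (avoids the dummy-variable trap)."""
    levels = [lv for lv in sorted(set(values.tolist())) if lv != baseline]
    cols = np.column_stack([(values == lv).astype(float) for lv in levels])
    return cols, levels

d_sex, sex_levels = dummies(sex, baseline="female")
d_smoker, smoker_levels = dummies(smoker, baseline="no")
d_region, region_levels = dummies(region, baseline="northeast")

X = np.column_stack([np.ones(n), age, d_sex, d_smoker, d_region])
names = ["const", "age"] + sex_levels + smoker_levels + region_levels

# Hypothetical coefficients used only to generate log-costs for this illustration.
beta = np.array([7.0, 0.035, -0.05, 1.5, -0.05, -0.15, -0.12])
log_cost = X @ beta + rng.normal(0.0, 0.5, size=n)
beta_hat, *_ = np.linalg.lstsq(X, log_cost, rcond=None)

# With a log dependent variable, a dummy coefficient b implies a
# 100 * (exp(b) - 1) percent difference relative to the baseline category.
pct_smoker = 100.0 * (np.exp(beta_hat[names.index("yes")]) - 1.0)
# Northeast is the baseline here, so northeast minus southwest (in logs)
# is minus the southwest coefficient.
ne_vs_sw = -beta_hat[names.index("southwest")]

for name, b in zip(names, beta_hat):
    print(f"{name:>10s}: {b: .4f}")
print("smokers vs non-smokers: %.1f%%" % pct_smoker)
print("northeast minus southwest (log points): %.4f" % ne_vs_sw)
```

Omitting one category per variable keeps the design matrix full rank when an intercept is included; each remaining coefficient is then read relative to the omitted category.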

Problem 2 (Review of methods). (30 points) Assume an iid sample
$\{x_1, x_2, \ldots, x_T\}$ from some distribution with expected value $\mu$
and variance $\sigma^2$. A natural estimator for the true variance (i.e.,
$\sigma^2$) of the random variable which generates the data is the sample
variance, namely

$$s_x^2 = \frac{1}{T}\sum_{t=1}^{T}(x_t - \bar{X})^2,$$

where $\bar{X}$ denotes the sample mean, i.e., $\bar{X} = \frac{1}{T}\sum_{t=1}^{T} x_t$.

First, let us focus on the finite-T (or finite-sample) properties of $s_x^2$:

(1) (4 points) Show that the sample variance $s_x^2$ is biased for the true
variance $\sigma^2$.
(2) (3 points) How would you correct the bias?
(3) (3 points) What is the bias of the infeasible variance estimator
$s_{x,\text{inf}}^2 = \frac{1}{T}\sum_{t=1}^{T}(x_t - \mu)^2$? Why am I calling
this estimator infeasible?
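
A quick Monte Carlo check can complement the analytical answers to items (1) to (3). The sketch below assumes NumPy and draws from a Uniform(0, 1) distribution (an arbitrary choice, with mean 1/2 and variance 1/12); the sample size `T`, the number of replications `n_rep`, and the variable names are illustrative only. It compares the average of each estimator across simulated samples with the true variance.

```python
import numpy as np

rng = np.random.default_rng(3)
T = 5             # deliberately small so that any bias is visible
n_rep = 200_000   # number of simulated samples
# Uniform(0, 1) draws: mu = 1/2 and sigma^2 = 1/12 (distribution choice is an assumption).
mu, sigma2 = 0.5, 1.0 / 12.0
x = rng.uniform(0.0, 1.0, size=(n_rep, T))

xbar = x.mean(axis=1, keepdims=True)
s2 = np.mean((x - xbar) ** 2, axis=1)     # divides by T, centers at the sample mean
s2_corrected = s2 * T / (T - 1)           # divides by T - 1 instead
s2_inf = np.mean((x - mu) ** 2, axis=1)   # divides by T, centers at the true mean

print("sigma^2          :", round(sigma2, 5))
print("mean of s2       :", round(float(s2.mean()), 5))
print("(T-1)/T * sigma^2:", round((T - 1) / T * sigma2, 5))
print("mean of corrected:", round(float(s2_corrected.mean()), 5))
print("mean of s2_inf   :", round(float(s2_inf.mean()), 5))
```

A simulation of this kind does not replace the proof, but it is a useful sanity check on the algebra.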

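Now, let us turn to the large-T (or infinite-sample or asymptotic) properties
of $s_x^2$. Write the following:

$$
\begin{aligned}
s_x^2 &= \frac{1}{T}\sum_{t=1}^{T}(x_t - \bar{X})^2 \\
&= \frac{1}{T}\sum_{t=1}^{T}\big((x_t - \mu) - (\bar{X} - \mu)\big)^2 \\
&= \underbrace{\frac{1}{T}\sum_{t=1}^{T}(x_t - \mu)^2}_{\text{(a)}} - \underbrace{2(\bar{X} - \mu)\,\frac{1}{T}\sum_{t=1}^{T}(x_t - \mu)}_{\text{(b)}} + \underbrace{(\bar{X} - \mu)^2}_{\text{(c)}} \qquad (1)
\end{aligned}
$$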

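Now, subtract $\sigma^2$ from the left-hand side and from the right-hand side
of Eq. (1) and standardize by $\sqrt{T}$ to obtain:

$$
\sqrt{T}\,(s_x^2 - \sigma^2) = \underbrace{\frac{\sum_{t=1}^{T}\big((x_t - \mu)^2 - \sigma^2\big)}{\sqrt{T}}}_{\text{(a*)}} - \underbrace{2(\bar{X} - \mu)\,\frac{1}{\sqrt{T}}\sum_{t=1}^{T}(x_t - \mu)}_{\text{(b*)}} + \underbrace{\sqrt{T}\,(\bar{X} - \mu)^2}_{\text{(c*)}} \qquad (2)
$$
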
(4) (4 points) Show that $s_x^2$ is consistent for $\sigma^2$ by applying the
LLN to (a), (b) and (c) in Eq. (1).
(5) (4 points) Show that $\sqrt{T}\,(s_x^2 - \sigma^2)$ is asymptotically normal
by applying the LLN, the CLT and Slutsky’s theorem to (a*), (b*) and (c*)
in Eq. (2).
Notice that consistency is a statement about sample averages, like $s_x^2$,
converging (as $T \to \infty$) to expected values. Asymptotic normality is a
statement about demeaned (by $\sigma^2$, in our example) and standardized
(by $\sqrt{T}$, in our example) sample averages, like
$\sqrt{T}\,(s_x^2 - \sigma^2)$, converging (as $T \to \infty$) to a mean-zero
normal distribution.

(6) (6 points) Use my sample Python code from Lecture 1 to write code
which shows consistency of $s_x^2$. You should draw your observations from
a random variable which is neither exponential nor normal.

(7) (6 points) Use my sample Python code from Lecture 1 to write code
which shows asymptotic normality of $\sqrt{T}\,(s_x^2 - \sigma^2)$. You should
draw your observations from a random variable which is neither exponential
nor normal.
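
The Lecture 1 code is not reproduced in this document, so the following is only an independent sketch of what such simulations might look like, assuming NumPy and a Uniform(0, 1) data distribution (neither exponential nor normal). The function name `sample_variance`, the sample sizes, and the number of replications are hypothetical. For the standardization step, the sketch uses the variance of $(x_t - \mu)^2$, which for Uniform(0, 1) equals 1/80 - 1/144 = 1/180.

```python
import numpy as np

rng = np.random.default_rng(4)
sigma2 = 1.0 / 12.0   # variance of Uniform(0, 1), the assumed data distribution

def sample_variance(x):
    """s_x^2 with divisor T, computed along the last axis."""
    return np.mean((x - x.mean(axis=-1, keepdims=True)) ** 2, axis=-1)

# Consistency: s_x^2 should approach sigma^2 as T grows.
for T in (10, 100, 10_000, 1_000_000):
    s2 = sample_variance(rng.uniform(size=T))
    print(f"T = {T:>9d}   s2 = {s2:.5f}   |s2 - sigma2| = {abs(s2 - sigma2):.5f}")

# Asymptotic normality: simulate sqrt(T) * (s_x^2 - sigma^2) many times.
T, n_rep = 400, 10_000
z = np.sqrt(T) * (sample_variance(rng.uniform(size=(n_rep, T))) - sigma2)

# For Uniform(0, 1): Var((x - mu)^2) = E(x - mu)^4 - sigma^4 = 1/80 - 1/144 = 1/180.
avar = 1.0 / 80.0 - 1.0 / 144.0
z_std = z / np.sqrt(avar)
print("mean of standardized draws:", round(float(z_std.mean()), 3))
print("std of standardized draws :", round(float(z_std.std()), 3))
print("share within +/-1.96      :", round(float(np.mean(np.abs(z_std) < 1.96)), 3))
```

Plotting a histogram of `z_std` against a standard normal density would give the visual version of the same check.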
