Assignment 03 AK


ECON 333 D100 Statistical Analysis of Economic Data

Spring 2024

Assignment 03

1. A researcher carefully selected a simple random sample of 176 men and got the following estimate
for his linear regression model (W is the wage rate, in $/hour, and A is the worker’s age, in years):

$\hat{W} = 12.21 + 0.24A$
$R^2 = 0.038$

a. What does that number 0.24 in front of A mean?

It’s the marginal effect of A on $\hat{W}$, that is, the estimated regression predicts that the
average wage will be 24¢/hr higher for a man who is one year older.

b. What does that number 12.21 mean?

Short answer: Nothing, really…

Formally, it’s the predicted average wage for a newborn man (A = 0), before his umbilical cord
falls off. Which, of course, is not really meaningful when applied to the real world, where
nobody signs a work agreement before they turn one year old.

One could try to assign it some meaning if they want. For instance, one could think about
this number as the one determining the overall level of wages (how high, vertically, the
regression line is) and then use that interpretation to organize their thinking about the
labour market. Or something like it :)
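
The readings of the slope and the intercept in parts (a) and (b) can be checked with a few lines of R. This is only a sketch: the function name `w_hat` is made up for illustration, and the two coefficients are copied from the estimated equation in the question.

```r
# Estimated regression from the question: W-hat = 12.21 + 0.24 * A
# (w_hat is a hypothetical helper name, not part of the assignment)
w_hat <- function(age) 12.21 + 0.24 * age

# Slope: difference in predicted wage between two men one year apart
slope_check <- w_hat(41) - w_hat(40)

# Intercept: predicted wage at A = 0, far outside any realistic sample of workers
intercept_check <- w_hat(0)

print(c(slope = slope_check, intercept = intercept_check))
```

The one-year difference is the same whichever pair of adjacent ages you plug in, which is what "marginal effect" means in a linear model.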

c. What does that number 0.038 in the second line mean? What are its units of measurement?

It’s the coefficient of determination (aka ‘overall fit’ or simply ‘R-squared’). The common
interpretations of $R^2$ (many people take these as one thing expressed in various ways to
apply to different questions) are:

• It measures the part of the variation in the dependent variable that is predictable
from the independent variable.
• It measures how well the estimated regression line fits the data.
• It measures how accurate the predictions (that are based on the estimated regression
line) of the estimated model will be.

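$R^2$ has no units of measurement: it is a proportion between 0 and 1. Here, $R^2 = 0.038$
means that age accounts for about 3.8% of the sample variation in wages.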
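
To see why R-squared carries no units, here is a small sketch on simulated data. The data are made up for illustration only (they are not the researcher's sample), and the variable names and the noise level are assumptions. It computes R-squared by hand as 1 - SSR/SST and then shows that switching wages from dollars to cents leaves it unchanged.

```r
# Simulated data, for illustration only (NOT the researcher's sample)
set.seed(1)
age  <- runif(176, min = 20, max = 65)
wage <- 12.21 + 0.24 * age + rnorm(176, sd = 10)   # wage in $/hour

fit <- lm(wage ~ age)

# R-squared by hand: share of the variation in wage that the line accounts for
ssr <- sum(resid(fit)^2)                # unexplained (residual) variation
sst <- sum((wage - mean(wage))^2)       # total variation
r2_manual <- 1 - ssr / sst

# Same regression with wage measured in cents/hour
r2_dollars <- summary(fit)$r.squared
r2_cents   <- summary(lm(I(100 * wage) ~ age))$r.squared

print(c(manual = r2_manual, dollars = r2_dollars, cents = r2_cents))
```

Both SSR and SST are measured in squared wage units, so the units cancel in the ratio.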

2. Use the spreadsheet Growth (see the assignment instructions), which contains data on average
growth rates from 1960 through 1995 for 63 countries, along with variables that are potentially
related to growth. A detailed description of the original data is given in Growth_Description (PDF
file in the assignment instructions). In this exercise, you will investigate the relationship between
growth and trade.

a. Construct a scatterplot of the average annual growth rate (growth) on the average trade share
(tradeshare) - copy & paste your R output here. Does there appear to be a relationship
between the variables? Answer in 1-2 sentences.

> plot(Asmt03$tradeshare, Asmt03$growth, main="Rate of Growth and Trade",
+      xlab="Trade Share ", ylab="Growth Rate ")

It seems that there is a (weak) positive relationship between the two variables. Or some people
may look at it and say, “I see no relationship, especially if I ignore the obvious outlier in the
top-right corner.”

The relationship you see depends on how hard you look, and in my case, it is biased because I
learned before from the international trade & globalization researchers that trade is good for
growth and have already done the estimation of the regression line. The truth is we are looking
for expected patterns if we already have some preconceptions about what they should be (and so
we see the expected patterns even if they are not really there). One should be careful making
judgments about patterns when a relationship is weak.

b. Using all observations, run a regression of growth on tradeshare. Copy & paste your R
output here (see page 3 if you are not sure what I ask for here).

> print(summary(lm(Asmt03$growth~Asmt03$tradeshare)))

Call:
lm(formula = Asmt03$growth ~ Asmt03$tradeshare)

Residuals:
    Min      1Q  Median      3Q     Max
-4.2305 -0.8643  0.1445  1.0649  3.4251

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
(Intercept)         0.5272     0.4353   1.211  0.23058
Asmt03$tradeshare   2.2352     0.6871   3.253  0.00186 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.584 on 61 degrees of freedom
Multiple R-squared:  0.1478,    Adjusted R-squared:  0.1339
F-statistic: 10.58 on 1 and 61 DF,  p-value: 0.001863

c. Use the regression to predict the growth rate for a country with a trade share of 0.4 and for
another with a trade share equal to 0.8.

$\widehat{Growth} = 0.5272 + 2.2352 \times TradeShare$

$\widehat{Growth} = 0.5272 + 2.2352 \times 0.4 \approx 1.4\%$

$\widehat{Growth} = 0.5272 + 2.2352 \times 0.8 \approx 2.3\%$
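
The same arithmetic can be done in R. This sketch types in the two coefficients from the regression output in part (b) rather than refitting the model, so it runs without the data; the variable names are just for illustration.

```r
# Coefficients copied from the regression output in part (b)
b0 <- 0.5272   # intercept
b1 <- 2.2352   # slope on tradeshare

# Trade shares we want predictions for
trade_share <- c(0.4, 0.8)

# Predicted average annual growth rates, in percent
predicted_growth <- b0 + b1 * trade_share
print(round(predicted_growth, 1))
```

If you have the fitted model stored in an object, `predict()` with a `newdata` data frame does the same job, provided the model was fitted with `data = Asmt03` and plain variable names in the formula rather than with `Asmt03$...` terms.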

d. Make a histogram for growth (the dependent variable in your regression). Copy & paste your
R output (command line & graph) here. Does it appear roughly normally distributed?

> hist(Asmt03$growth)

It is roughly normally distributed: the bulk of the growth rates are in the middle (between 1%
and 3%), there are tails that go off to both sides, and one can imagine a rough bell shape
enveloping the histogram.

e. I think that if one wanted to run a regression with the data in assignment 2 (1229 records of
average hourly earnings, degrees, sex, and age), they would not care to check whether the
dependent variable there is normally distributed, but here in Growth data set, we do want to
look at a histogram. Why is it important to check that distribution here, but one could safely
ignore it there (in CPS96_15)?

Because the sample size in Growth is relatively small (n = 63).

The sample size in CPS96_15 was definitely large (n = 1229), so we could rely on the Central
Limit Theorem to accept the regression estimates (their t-values, p-values, F-statistic, etc.)
and use them for hypothesis tests, confidence intervals, and so on. We are usually fine with
n = 63 as a ‘large enough’ sample, too. Remember, I suggested as a rule of thumb in class that
n > 50 is large enough to rely on the Central Limit Theorem unless there is something ‘special’
about your data. Well, there is something special here: the sample is rather large compared to
the population. There are fewer than 200 countries around the world, so the sample is about 1/3
of the population. Think about what it means if you are into thinking deeper :)

f. Construct a scatterplot of the average annual growth rate (growth) on the measure of
education (yearsschool) - copy & paste your R output here. Does there appear to be a
relationship between the variables? Answer in 1-2 sentences.

> plot(Asmt03$yearsschool, Asmt03$growth, main="Rate of Growth and Education",
+      xlab="Average number of years of schooling in 1960 ", ylab="Growth Rate ")

There seems to be a relationship. And it appears stronger than the one we saw in part (a), with
the growth rate vs. trade share. What exactly would you see as a relationship? That depends :)

‘Mechanically’ (I call it that because that’s what R will do by default), one would probably detect
a positive correlation (yellow line).

It seems to me, however, that the relationship is probably more interesting than that. It’s
non-linear (orange line), and it suggests that we should think about stuff :) For instance, one
obvious (to an economist) interpretation of this non-linear relationship is that there are
diminishing marginal returns to education (with respect to economic growth). In fact, I did hear
from Economic Development people that they attribute the high growth rates in Southeast Asia
over the last several decades to the ‘smart’ decisions of governments there to invest a lot in
education, and especially to invest in primary and secondary education rather than
post-secondary education (like universities).

g. Using all observations, run a regression of growth on yearsschool. Copy & paste your R
output here (see page 3 if you are not sure what I ask for here).

> print(summary(lm(Asmt03$growth~Asmt03$yearsschool)))

Call:
lm(formula = Asmt03$growth ~ Asmt03$yearsschool)

Residuals:
    Min      1Q  Median      3Q     Max
-3.6777 -1.1330 -0.1487  0.9534  4.4342

Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)         0.72247    0.36790   1.964  0.05412 .
Asmt03$yearsschool  0.26529    0.07737   3.429  0.00109 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.571 on 61 degrees of freedom
Multiple R-squared:  0.1616,    Adjusted R-squared:  0.1478
F-statistic: 11.76 on 1 and 61 DF,  p-value: 0.001093

It may be interesting to compare this to the regression in part (b) and think what the differences
(in coefficients’ standard errors, p-values, the regression R-squared) mean :)
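
One way to start that comparison is to put the reported numbers side by side. The sketch below only retypes the estimates, standard errors and R-squared values from the two outputs in parts (b) and (g), then recomputes the t-values (estimate divided by standard error) and the two-sided p-values with 61 degrees of freedom. The data frame and column names are made up for illustration.

```r
# Numbers copied from the two regression outputs, parts (b) and (g)
fits <- data.frame(
  regressor = c("tradeshare", "yearsschool"),
  estimate  = c(2.2352, 0.26529),
  std_error = c(0.6871, 0.07737),
  r_squared = c(0.1478, 0.1616)
)

# t-value = estimate / standard error
fits$t_value <- fits$estimate / fits$std_error

# Two-sided p-value from the t distribution with n - 2 = 61 degrees of freedom
fits$p_value <- 2 * pt(-abs(fits$t_value), df = 61)

print(fits)
```

Note that the two slope estimates are not directly comparable in size, because trade share and years of schooling are measured in different units; the t-values, p-values and R-squared are.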

=====================================================

This is approximately what your R regression output (copy-pasted from R to here) will look like:
