
Peer-graded Assignment: Test Exercise 2 – Peter Schuld

Econometrics: Methods and Applications

Peer-graded Assignment: Test Exercise 2

Goals and skills being used:

• Experience the process of practical application of multiple regression.
• Get hands-on experience with performing multiple regression.
• Give correct interpretation of regression outcomes.

Questions
This test exercise is of an applied nature and uses data that are available in the data file
TestExer2. The exercise is based on Exercise 3.14 of ‘Econometric Methods with
Applications in Business and Economics’. The question of interest is whether the study
results of students in Economics can be predicted from the scores on entrance tests taken
before they start their studies. More precisely, you are asked to investigate whether verbal
and mathematical entrance tests predict freshman grades of students in Economics. Data
are available for 609 students on the following variables:

• FGPA: Freshman grade point average (scale 0-4)
• SATV: Score on SAT Verbal test (scale 0-10)
• SATM: Score on SAT Mathematics test (scale 0-10)
• FEM: Gender dummy (1 for females, 0 for males)

25 December 2022



(a) (i) Regress FGPA on a constant and SATV. Report the coefficient of SATV and
its standard error and p-value
(give your answers with 3 decimals).

Data Generating Process is yi = α + β * xi + εi


Coefficient of SATV (β)   Standard error   p-value
0.06309                   0.02766          0.0229

Rounded to 3 decimals: β = 0.063, standard error = 0.028, p-value = 0.023.
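
The reported p-value can be checked from the coefficient and its standard error alone. The following is a minimal sketch in R, assuming the p-value is two-sided and based on a t-distribution with n − 2 = 607 degrees of freedom; the variable names are illustrative and do not come from the original analysis.

```r
# Values as reported for the regression of FGPA on a constant and SATV
b  <- 0.06309   # estimated coefficient of SATV
se <- 0.02766   # standard error of the coefficient
n  <- 609       # number of students
k  <- 2         # estimated parameters: constant and slope

# t-statistic for the null hypothesis that the coefficient is zero
t_value <- b / se

# Two-sided p-value from the t-distribution with n - k degrees of freedom
p_value <- 2 * pt(-abs(t_value), df = n - k)

round(c(coefficient = b, std_error = se, t = t_value, p = p_value), 3)
```

The t-statistic is about 2.28, which is the value used again in part (d).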

(a) (ii) Determine a 95% confidence interval (with 3 decimals)
for the effect on FGPA of an increase by 1 point in SATV.

Factors affecting the width of the confidence interval (CI) include
the sample size, the variability in the sample, and the confidence
level. All else being the same, a larger sample produces a narrower
confidence interval, greater variability in the sample produces a
wider confidence interval, and a higher confidence level produces a
wider confidence interval. [Wikipedia]
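
Since part (a)(ii) asks about the effect of a one-point increase in SATV, one natural reading is an interval for the slope β itself. The following is a minimal sketch in R, assuming only the estimate and standard error reported in (a)(i) and 607 degrees of freedom; it is an illustration, not the calculation carried out in this document.

```r
b        <- 0.06309    # estimated coefficient of SATV, from (a)(i)
se       <- 0.02766    # standard error of the coefficient, from (a)(i)
df_resid <- 609 - 2    # observations minus estimated parameters

# 97.5% quantile of the t-distribution; close to 1.96 for large df
t_crit <- qt(0.975, df = df_resid)

# 95% confidence interval for the slope
ci_slope <- b + c(-1, 1) * t_crit * se
round(ci_slope, 3)
```

With these inputs the interval runs from roughly 0.009 to 0.117 and does not contain zero, in line with the reported p-value of 0.0229.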


Point estimate from the fitted regression

yhat = α + β * x*, with estimates α = 2.44173 and β = 0.06309, evaluated at x* = 1.0
yhat = 2.44173 + 0.06309 * 1.0
yhat = 2.50482 (point estimate)

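Confidence and Prediction Intervals

Confidence interval (CI) for the mean response and prediction interval
(PI) for an individual response of the regression model yi = β0 + β1 * xi + εi
at x = x* are given, respectively, as

CI = yhat ± t α/2 * s * √(1/n + (x* - x mean)² / Σ (xi - x mean)²)
PI = yhat ± t α/2 * s * √(1 + 1/n + (x* - x mean)² / Σ (xi - x mean)²)

where s is the standard error of estimate and t α/2 is the critical value
of the t-distribution with n−2 degrees of freedom.

The prediction interval is substantially wider than the confidence
interval, reflecting the additional uncertainty about an individual y at a
given x* compared with the average response at x*.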

Symbol              Value     Description
x*                  1.0       SATV value at which yhat is evaluated
x mean              5.565     Mean SATV value (see summary statistic above)
n                   609       Number of observations
DF                  607       Degrees of freedom (n − 2)
s                   0.02766   Standard error of estimate
t α/2               1.96      Critical t value (upper tail probability 0.025)
(x* - x mean)²      20.839    = (1 – 5.565)²
Σ (xi - x mean)²    274.888   Calculated with R (see below)
yhat                2.50482   Point estimate

Note on t α/2: t α/2 = t 0.05/2 = t 0.025 is the 97.5% quantile of a
t-distribution with n−2 degrees of freedom (DF). At high DF (here DF = 607)
it is practically identical to the critical value for Z (standard normal
distribution) at significance level 0.025. The t-distribution has a bell
shape, and for values of n greater than approximately 30 it is quite similar
to the standard normal distribution.

Σ (xi - x mean)² was calculated with R in a new column:

df['x_deviation_squared'] <- (df$SATV - mean(df$SATV))^2
sum(df$x_deviation_squared)

CI = yhat ± 1.96 * 0.02766 * √(1 + 1/609 + 20.839/274.888)

CI upper = 2.50482 + 1.96 * 0.02766 * 1.038003417
CI upper = 2.561093902
CI lower = 2.50482 - 1.96 * 0.02766 * 1.038003417
CI lower = 2.448546099
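
The arithmetic above can be reproduced in a few lines of R. This is a minimal sketch that plugs in the values exactly as reported in this document (including the standard error 0.02766 and the critical value 1.96) and also computes the corresponding factor without the leading 1 for comparison; the variable names are illustrative.

```r
# Values taken from the table and calculation above
yhat   <- 2.50482    # point estimate at x* = 1.0
s      <- 0.02766    # standard error used in the calculation above
t_crit <- 1.96       # large-sample critical value
n      <- 609        # number of observations
dev_sq <- 20.839     # (x* - x mean)^2
sxx    <- 274.888    # sum of squared deviations of SATV from its mean

# Square-root factors: without and with the additional 1
ci_factor <- sqrt(1 / n + dev_sq / sxx)       # mean response
pi_factor <- sqrt(1 + 1 / n + dev_sq / sxx)   # individual response

# Interval bounds using the factor with the additional 1
bounds <- yhat + c(-1, 1) * t_crit * s * pi_factor
round(bounds, 3)
```

The factor with the additional 1 is always the larger of the two, which is why the interval for an individual response is wider than the interval for the mean response.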

Answer
CI = [2.449, 2.561]


(b) Answer questions (a-i) and (a-ii) also for the regression of
FGPA on a constant, SATV, SATM, and FEM.

Regress FGPA on a constant and SATM

Data Generating Process is yi = α + β * xi + εi


Coefficient of SATM (β)   Standard error   p-value
0.15067                   0.03075          0.00000123

yhat = α + β * x*, with estimates α = 1.85133 and β = 0.15067, evaluated at x* = 1.0
yhat = 1.85133 + 0.15067 * 1.0
yhat = 2.002 (point estimate)


Symbol            Value       Description
β                 0.15067
x*                1.0
x mean            6.248       Mean SATM value (see summary statistic above)
(x* - x mean)²    27.541504

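Use R to calculate the CI

predict(mod1, newdata=avstudent, interval='prediction') # 95% interval by default

This does not give the expected result: the prediction intervals from
predict() are much larger, because predict() builds them from the residual
standard error of the regression rather than from the standard error of the
coefficient. Therefore, I use the rule of thumb
CI = [yhat – 2 * SE, yhat + 2 * SE].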

Answer
CI = [1.9405, 2.064]
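
The rule-of-thumb interval can be reproduced as follows. This is a minimal sketch in R using the point estimate and standard error reported above; the comparison with the normal quantile from qnorm is an added illustration and not part of the original answer.

```r
yhat <- 2.002     # point estimate at x* = 1.0
se   <- 0.03075   # standard error reported for SATM

# Rule of thumb: point estimate plus or minus two standard errors
rule_of_thumb <- yhat + c(-1, 1) * 2 * se

# Same construction with the 97.5% standard normal quantile (about 1.96)
normal_based <- yhat + c(-1, 1) * qnorm(0.975) * se

rbind(rule_of_thumb, normal_based)
```

Using the multiplier 2 instead of about 1.96 makes the interval slightly wider, so the rule of thumb is marginally conservative.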

Regress FGPA on a constant and FEM

Data Generating Process is yi = α + β * xi + εi

Coefficient of FEM (β)   Standard error   p-value
0.16659                  0.03771          0.0000118


x mean            0.3875      Mean FEM value (see summary statistic above)
Answer
CI = [0.312, 0.463]

(c) Determine the (4 × 4) correlation matrix of FGPA, SATV, SATM,
and FEM. Use these correlations to explain
the differences between the outcomes in parts (a) and (b).

In R, the function cor is used to calculate the 4 × 4 correlation matrix.

There is only a small correlation between FGPA and SATV (+0.09).
In contrast, the correlation between FGPA and SATM (+0.20) and
the correlation between FGPA and the dummy variable FEM (+0.18) are
higher.
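
In a simple regression with one explanatory variable, the t-statistic of the slope is determined by the correlation r and the sample size: t = r * √(n − 2) / √(1 − r²). The following minimal sketch in R applies this identity to the rounded correlations quoted above and compares the result with the t-statistics implied by the reported coefficients and standard errors. Because the correlations are rounded to two decimals, only approximate agreement should be expected.

```r
n <- 609
# Correlations with FGPA as quoted above (rounded to two decimals)
r <- c(SATV = 0.09, SATM = 0.20, FEM = 0.18)

# t-statistic of the slope in a simple regression, written in terms of r
t_from_r <- r * sqrt(n - 2) / sqrt(1 - r^2)

# t-statistics implied by the reported coefficients and standard errors
t_reported <- c(SATV = 0.06309 / 0.02766,
                SATM = 0.15067 / 0.03075,
                FEM  = 0.16659 / 0.03771)

round(rbind(t_from_r, t_reported), 2)
```

This shows why even the weak correlation between FGPA and SATV gives a t-value just above 2 with 609 observations, while the stronger correlations for SATM and FEM give much larger t-values.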

(d) (i) Perform an F-test on the significance (at the 5% level) of the
effect of SATV on FGPA, based on the regression in part (b) and
another regression.

Note: Use the F-test in terms of SSR or R² and use 6 decimals
in your computations. The relevant critical value is 3.9.

(ii) Check numerically that F = t².


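Answer
(i) F-statistic = 5.201, which exceeds the critical value of 3.9, so the
effect of SATV on FGPA is significant at the 5% level.

(ii) Check numerically that F = t²

t-value = 2.28
(t-value)² = 5.1984, which agrees with F = 5.201 up to rounding of the t-value.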
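
The check F = t² can also be done without rounding the t-value first. This is a minimal sketch in R, assuming that the t-value of 2.28 is the ratio of the SATV coefficient to its standard error from part (a)(i); that link is an inference from the reported numbers and is not stated in the answer.

```r
# Assumption: t is the SATV coefficient divided by its standard error
t_value    <- 0.06309 / 0.02766
f_from_t   <- t_value^2
f_reported <- 5.201   # F-statistic reported in the answer
f_crit     <- 3.9     # critical value given in the assignment

c(t = t_value, t_squared = f_from_t, F = f_reported)

# TRUE means the effect is significant at the 5% level
f_reported > f_crit
```

With the unrounded ratio, t² differs from the reported F only in the third decimal.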
