Econ G2 Final

The document contains two tasks related to econometrics analysis. Task 1 evaluates a multiple regression model to determine wages. It finds education and training have a significant positive impact on wages, while age does not. Task 2 uses logistic regression to analyze factors affecting student retention. It finds the orientation program increases the odds of returning, but grade point average is only marginally significant.

Uploaded by

Thái Tran

NATIONAL ECONOMICS UNIVERSITY

BACHELOR OF FINTECH PROGRAM

GROUP REPORT
COURSE: ECONOMETRICS
Lecturer: Nguyen Manh The
Group 2
Nguyen Thien Duc
Nguyen Tien Dat
Nguyen Ngoc Ban
Tran Thi Hoa
Le Thu Ngan

Hanoi | November 21st 2021


Table of contents

TASK 1: Multiple regression: Determinants of wages
1. Econometric model that is evaluated on the data of mult_reg.xlsx
2. Some basic statistics
3. Diagnostic test
4. Test for significance of each variable
5. Test for the significance of model
6. Summary
TASK 2: Lakeland: Retention to leave
1. The logistic regression equation relating X1 and X2 to Y
2. Interpretation of E(Y) when X2 = 0
3. Compute the estimated logit by using both independent variables and software
4. Conduct a test for overall significance by using α = 0.05
5. Determine whether each independent variable is significant by using α = 0.05
6. Use the estimated logit computed in part (c) to estimate the probability that students with a 2.5 grade point average who did not attend the orientation program will return to Lakeland for their sophomore year. And estimate the probability for students with a 2.5 grade point average who attended the orientation program
* When GPA = 2.5, Program = 0
* When GPA = 2.5, Program = 1
7. The estimated odds ratio for the orientation program and its interpretation
8. Recommendation on making the orientation program a required activity

TASK 1:  Multiple regression: Determinants of wages
1. Econometric model that is evaluated on the data of mult_reg.xlsx
Wages = β1 + β2*Age + β3*Educ + β4*Training1 + β5*Training2 (Model)

Estimated function:
Wages_hat = 6.91198 + 0.10004*Age + 1.04524*Educ + 0.50819*Training1 + 0.86564*Training2
Interpretation of each coefficient
Holding the other variables constant, if Age rises by one unit, the wage level increases by 0.10004 units.
Holding the other variables constant, if Educ rises by one unit, the wage level increases by 1.04524 units.
Holding the other variables constant, if Training1 rises by one unit, the wage level increases by 0.50819 units.
Holding the other variables constant, if Training2 rises by one unit, the wage level increases by 0.86564 units.

2. Some basic statistics


Variables    Mean       Standard deviation
Wages        31.14476   5.579512
Age          27.796     3.363482
Educ         14.61      2.041135
Training1    4.502      1.548804
Training2    4.4976     1.565265

In the sample used to estimate this model, in which Wages is explained by Age, Educ, Training1 and Training2, the mean of Wages is 31.14476 and the standard deviation of Wages is 5.579512.
The illustration below is the histogram of the model.

We calculate the skewness and the kurtosis of the model: the skewness is 0.1092786 and the kurtosis is 2.72303. We also run the Jarque-Bera test, which gives the following result.

Since the p-value of the Jarque-Bera test is greater than 0.05, we do not reject the null hypothesis of normality and conclude that the model is normally distributed.
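
The Jarque-Bera decision can be checked by hand from the reported skewness and kurtosis. The sketch below is a minimal illustration, not the software output used in the report: it assumes the sample size is n = 500 (taken from the degrees of freedom 500 - 4 used in the F-test of this report) and uses the standard asymptotic form JB = n/6 * (S^2 + (K - 3)^2 / 4), which is chi-square distributed with 2 degrees of freedom under normality. The function name `jarque_bera` is introduced here for illustration only.

```python
import math


def jarque_bera(n, skewness, kurtosis):
    """Return the Jarque-Bera statistic and its asymptotic chi-square(2) p-value."""
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # For a chi-square distribution with 2 degrees of freedom,
    # the upper-tail probability is exactly exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


# n = 500 is an assumption inferred from the F-test degrees of freedom (500 - 4).
jb_stat, jb_p = jarque_bera(500, 0.1092786, 2.72303)
print(f"JB = {jb_stat:.4f}, p-value = {jb_p:.4f}")
```

Under these assumptions the statistic is roughly 2.59 and the p-value roughly 0.27, which agrees with the conclusion that normality is not rejected at α = 0.05.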

3. Diagnostic test

In the full model, only Educ is individually significant (at α = 0.001); Age, Training1 and Training2 are not significant at α = 0.05. We therefore suspect that this model has a multicollinearity problem.
We use the variance inflation factor (VIF) to check for multicollinearity in the model:
VIF_j = 1/(1 - R_j^2)
where R_j^2 is the R-squared from regressing explanatory variable j on the other explanatory variables.
The VIF of Training1 is the largest of the four variables and is greater than 10, so we remove Training1 from the model.

We estimate model 1, which drops the variable Training1. We then calculate the VIF of each variable in model 1; the result below shows that Age, Educ and Training2 do not suffer from multicollinearity.
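
To see why 10 is a natural cut-off for the VIF, it helps to map a few auxiliary R-squared values to their VIF. The sketch below only illustrates the formula: the R-squared values are hypothetical and are not taken from mult_reg.xlsx, and the helper name `vif_from_r_squared` is introduced here for illustration.

```python
def vif_from_r_squared(r_squared):
    """VIF_j = 1 / (1 - R_j^2) for an auxiliary R-squared in [0, 1)."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("R-squared must lie in [0, 1)")
    return 1.0 / (1.0 - r_squared)


# Hypothetical auxiliary R-squared values, chosen only to illustrate the scale.
for r2 in (0.0, 0.5, 0.8, 0.9, 0.95):
    print(f"R_j^2 = {r2:.2f} -> VIF_j = {vif_from_r_squared(r2):.2f}")
```

A VIF above 10 therefore means that more than 90% of the variation in that explanatory variable is explained by the other explanatory variables.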

We also use the Breusch-Pagan test to detect heteroscedasticity in model 1.

The p-value of the Breusch-Pagan test is greater than 0.05, so we do not reject the null hypothesis of homoscedasticity: there is no evidence of heteroscedasticity in model 1.
We therefore use model 1 for the further tests.
The regression function of model 1 is:
Wages = β1 + β2*Age + β3*Educ + β4*Training2
Estimated function:
Wages_hat = β1_hat + β2_hat*Age + β3_hat*Educ + β4_hat*Training2
We expect the variables Age, Educ and Training2 to affect the dependent variable Wages positively.

4. Test for significance of each variable:


After correcting for multicollinearity, we use model 1:
Wages_hat = 6.95163 + 0.09988* Age + 1.04591*Educ + 1.36434*Training2

             Test for β2          Test for β3       Test for β4
Hypothesis   H0: β2 = 0           H0: β3 = 0        H0: β4 = 0
             H1: β2 ≠ 0           H1: β3 ≠ 0        H1: β4 ≠ 0
p-value      0.129741 (>0.05)     2e-16 (<0.05)     2e-16 (<0.05)
Conclusion   Do not reject H0     Reject H0         Reject H0

Conclusion: At α = 0.05, Age is not significant, Educ and Training2 are significant.
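
As a quick sanity check on the model 1 estimates, an OLS regression with an intercept should predict the sample mean of Wages when every regressor is set to its sample mean. The sketch below plugs the means from the basic statistics table into the estimated function of model 1; the function name `predict_wage` is introduced here for illustration, and small differences are expected because the reported coefficients are rounded.

```python
# Estimated coefficients of model 1, as reported in the text.
INTERCEPT = 6.95163
B_AGE = 0.09988
B_EDUC = 1.04591
B_TRAINING2 = 1.36434


def predict_wage(age, educ, training2):
    """Fitted wage from model 1 for the given regressor values."""
    return INTERCEPT + B_AGE * age + B_EDUC * educ + B_TRAINING2 * training2


# Sample means taken from the basic statistics table.
wage_at_means = predict_wage(age=27.796, educ=14.61, training2=4.4976)
print(f"Predicted wage at the sample means: {wage_at_means:.5f}")
print("Reported sample mean of Wages:      31.14476")
```

The prediction agrees with the reported mean of Wages to about three decimal places, which suggests the coefficients and the summary statistics are consistent with each other.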
5. Test for the significance of model
H0: β2 = β3 = β4 = 0
H1: H0 is false

F = (ESS/(k-1)) / (RSS/(n-k)) = 78.02

f0.05(4-1, 500-4) = f0.05(3; 496) = 2.622879

Since F > f0.05(3; 496), we reject H0: model 1 is significant overall.
Is apprentice training more efficient than education in raising the level of wages?
H0: β4 – β3 ≤ 0
H1: β4 – β3 > 0
t = (β4_hat – β3_hat) / se(β4_hat – β3_hat) = 1.83823
where se(β4_hat – β3_hat) = sqrt(var(β4_hat) + var(β3_hat) – 2*cov(β4_hat, β3_hat))

And t0.05(n-k) = t0.05(496) = 1.647932

As the t-value > t0.05(n-k), we reject H0 and conclude that β4 > β3.
6. Summary
We can now answer the two research questions set at the beginning. According to the tests of significance of each variable, Age is not a significant determinant of wages at α = 0.05, while Educ and Training2 are. In addition, the test of β4 > β3 shows that the apprentice training variable (Training2 in this model) is more efficient in raising the level of wages than the education variable.

TASK 2: Lakeland: Retention to leave


1. The logistic regression equation relating X1 and X2 to Y.

P(Y=1) = e^(β1 + β2*GPA + β3*Program) / (1 + e^(β1 + β2*GPA + β3*Program))

2. Interpretation of E(Y) when X2 =0


When X2 = 0 (Program = 0),
P(Y=1) = e^(β1 + β2*GPA) / (1 + e^(β1 + β2*GPA))
=> The logit form: ln(p/(1-p)) = β1 + β2*GPA

When the student did not attend the orientation program, a one-unit increase in GPA multiplies the odds that the student returns to Lakeland for the sophomore year by e^β2.

3. Compute the estimated logit by using both independent variables and
software

ln(p/(1-p)) = -6.8926 + 2.5388*GPA + 1.5608*Program

where p is the probability that the student returns to Lakeland for the sophomore year.
4. Conduct a test for overall significance by using α = 0.05.
Hypothesis : H0 : Model is not significant
H1 : Model is significant
Null deviance: 128.207 on 99 degrees of freedom
Residual deviance: 80.338 on 97 degrees of freedom

Chi-square: χ2 = 128.207 - 80.338 = 47.869 on 2 degrees of freedom > χ2(2, 0.05) = 5.99146
So we reject H0: the model is significant overall at α = 0.05.
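
The deviance comparison above can be reproduced with a few lines of arithmetic. The sketch below is a minimal illustration that relies on one simplification: for a chi-square distribution with exactly 2 degrees of freedom the upper-tail probability is exp(-x/2), so the critical value and the p-value can be computed without a statistics library. The variable names are introduced here for illustration.

```python
import math

# Deviances and degrees of freedom as reported in the text.
null_deviance, null_df = 128.207, 99
residual_deviance, residual_df = 80.338, 97
alpha = 0.05

lr_stat = null_deviance - residual_deviance
df = null_df - residual_df

# Closed forms below are valid only for a chi-square with 2 degrees of freedom.
critical_value = -2.0 * math.log(alpha)
p_value = math.exp(-lr_stat / 2.0)

print(f"chi-square = {lr_stat:.3f} on {df} df")
print(f"critical value at alpha = {alpha}: {critical_value:.5f}")
print(f"p-value = {p_value:.3e}")
print("reject H0" if lr_stat > critical_value else "do not reject H0")
```

With more than 2 degrees of freedom the closed form no longer applies and a chi-square table or statistical software would be needed.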

5. Determine whether each independent variable is significant by using α = 0.05

Hypothesis 1: H0: β2 = 0
H1: β2 ≠ 0
At α = 0.05, GPA is significant: the result shows that GPA is significant even at α = 0.001, so its p-value is below 0.05 and we reject H0.
Hypothesis 2: H0: β3 = 0
H1: β3 ≠ 0

At α = 0.05, Program is significant: the result shows that Program is significant at α = 0.01, so its p-value is below 0.05 and we reject H0.

6. Use the estimated logit computed in part (c) to estimate the probability
that students with a 2.5 grade point average who did not attend the
orientation program will return to Lakeland for their sophomore year.
And estimate the probability for students with a 2.5 grade point average
who attended the orientation program.

* When GPA = 2.5, Program = 0

P = e^(-6.8926 + 2.5388*2.5 + 1.5608*0) / (1 + e^(-6.8926 + 2.5388*2.5 + 1.5608*0)) = 0.3669
=> The probability that students with a 2.5 grade point average who did not attend the orientation program will return to Lakeland for their sophomore year is 36.69%.

* When GPA = 2.5, Program = 1
P = e^(-6.8926 + 2.5388*2.5 + 1.5608*1) / (1 + e^(-6.8926 + 2.5388*2.5 + 1.5608*1)) = 0.7340

=> The probability that students with a 2.5 grade point average who attended the orientation program will return to Lakeland for their sophomore year is 73.40%.
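
The two probabilities can be reproduced directly from the estimated logit. The sketch below is a minimal illustration using the coefficients reported in the text; the function name `return_probability` is introduced here for illustration, and it uses the form 1 / (1 + e^(-z)), which is algebraically the same as e^z / (1 + e^z).

```python
import math

# Estimated logit coefficients, as reported in the text.
B1, B2, B3 = -6.8926, 2.5388, 1.5608


def return_probability(gpa, program):
    """Estimated probability of returning for the sophomore year."""
    logit = B1 + B2 * gpa + B3 * program
    return 1.0 / (1.0 + math.exp(-logit))


p_no_program = return_probability(gpa=2.5, program=0)
p_program = return_probability(gpa=2.5, program=1)
print(f"GPA = 2.5, Program = 0: {p_no_program:.4f}")
print(f"GPA = 2.5, Program = 1: {p_program:.4f}")
```

The same function can be called with other GPA values to see how the estimated retention probability changes along the grade scale.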

7. The estimated odds ratio for the orientation program and its interpretation
ln(p/(1-p)) = -6.8926 + 2.5388*GPA + 1.5608*Program

Holding GPA constant, the odds that a student who attended the orientation program returns to Lakeland for the sophomore year are e^1.5608 = 4.76 times the odds for a student who did not attend the orientation program.
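
The odds ratio can be obtained in two equivalent ways: by exponentiating the Program coefficient, or by dividing the odds implied by the two estimated probabilities at the same GPA. The sketch below checks that both routes agree; it uses the coefficients reported in the text, and the helper names `probability` and `odds` are introduced here for illustration.

```python
import math

# Estimated logit coefficients, as reported in the text.
B1, B2, B3 = -6.8926, 2.5388, 1.5608


def probability(gpa, program):
    """Estimated probability of returning, from the estimated logit."""
    return 1.0 / (1.0 + math.exp(-(B1 + B2 * gpa + B3 * program)))


def odds(p):
    """Convert a probability into odds p / (1 - p)."""
    return p / (1.0 - p)


# Route 1: exponentiate the Program coefficient.
odds_ratio = math.exp(B3)

# Route 2: ratio of the odds at the same GPA, with and without the program.
ratio_from_probabilities = odds(probability(2.5, 1)) / odds(probability(2.5, 0))

print(f"exp(B3)            = {odds_ratio:.2f}")
print(f"odds(1) / odds(0)  = {ratio_from_probabilities:.2f}")
```

The ratio does not depend on the GPA value chosen, because the GPA term cancels when the two odds are divided.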

8. Recommendation on making the orientation program a required activity


Holding GPA constant, attending the orientation program multiplies the odds that the student returns to Lakeland for the sophomore year by 4.76, so we recommend making the orientation program a required activity.
