
Assignment A (Hand In)

Uploaded by

sophia.costello
Copyright
© All Rights Reserved

Question 1:

a.) EViews:

Completed table:
          Age < 30    Age >= 30
Male      411         443
Female    270         292

b.) EViews:

Completed table: to 3 decimal places


          AHE (sample mean)   S_AHE (sample standard deviation of AHE)   Share with a Bachelor degree   n (number of observations)
Male      22.554              13.077                                     46.721%                        854
Female    19.369              11.232                                     59.786%                        562

c.) $E(AHE \mid Female = 0) = 22.55427$ (1)

$E(AHE \mid Female = 1) = 19.36897$ (2)

(1) - (2) = 3.1853
Therefore, the estimated wage gap = US$3.19/hour (2dp)

d.) Standard error:

$$SE(\overline{AHE}_m) = \frac{13.07690}{\sqrt{854}} = 0.4474823059\ldots$$

90% Confidence interval (two-sided, so the standard normal critical value is 1.645):

$$22.55427 \pm (1.645 \times 0.4474823059\ldots)$$

$$[21.81816\ldots,\ 23.29037\ldots]$$

Therefore, the 90% confidence interval for the population mean of wages for male workers is [21.818, 23.290]
to 3 decimal places.

e.) Standard error:

$$SE(\overline{AHE}_f) = \frac{11.23198}{\sqrt{562}} = 0.4737924804\ldots$$

90% Confidence interval (two-sided, so the standard normal critical value is 1.645):

$$19.36897 \pm (1.645 \times 0.4737924804\ldots)$$

$$[18.58958\ldots,\ 20.14835\ldots]$$

Therefore, the 90% confidence interval for the population mean of wages for female workers is [18.590,
20.148] to 3 decimal places.
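
The interval calculations in (d) and (e) can be reproduced with a short script. This is a minimal sketch rather than the EViews procedure: it assumes the large-sample normal approximation, takes the sample means, standard deviations and observation counts reported in (b) and (c), and obtains the two-sided critical value from the standard normal quantile function. The helper name `mean_confidence_interval` is hypothetical and introduced only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(mean, sd, n, level=0.90):
    """Return (standard error, (lower, upper)) for a large-sample CI of a mean."""
    se = sd / sqrt(n)
    # Two-sided interval: (1 - level) / 2 of the probability sits in each tail.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return se, (mean - z * se, mean + z * se)


# Inputs copied from the EViews output quoted in parts (c), (d) and (e).
se_m, ci_m = mean_confidence_interval(22.55427, 13.07690, 854)
se_f, ci_f = mean_confidence_interval(19.36897, 11.23198, 562)

print(f"Male:   SE = {se_m:.4f}, 90% CI = [{ci_m[0]:.3f}, {ci_m[1]:.3f}]")
print(f"Female: SE = {se_f:.4f}, 90% CI = [{ci_f[0]:.3f}, {ci_f[1]:.3f}]")
```

The quantile for a two-sided 90% interval is about 1.645, which gives intervals of roughly [21.818, 23.290] for male workers and [18.590, 20.148] for female workers.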

f.) Null & Alternative hypothesis: $\Delta$ = gender wage gap (difference between AHE for males & females)
$H_0: \Delta = 2$
$H_1: \Delta > 2$

g.) T-statistic (the hypothesised gap of 2 under H0 is subtracted from the estimated gap):

$$t = \frac{(22.55427 - 19.36897) - 2}{\sqrt{\dfrac{13.07690^2}{854} + \dfrac{11.23198^2}{562}}} = \frac{1.18530}{0.65170\ldots} = 1.81876\ldots$$

= 1.819 (3dp)

Critical value t* = 1.645 at the significance level of 5%

Given a large sample, the t-statistic asymptotically follows a normal distribution. Hence, we can use the corresponding critical values as seen in the z-score table and graph to find t* for this question.

h.) Is H0 rejected?
In a one-sided right-tail test, we reject H0 if t ≥ t*

1.819 ≥ 1.645
I.e., 1.819 lies in the right-hand side purple rejection region in the graph
Therefore, we reject the null hypothesis at the 5% significance level

Do I agree with the claim of the feminist activist?

I agree with the claim of the feminist activist: the above test rejects the null hypothesis in favour of the
alternative that this gap is greater than $2/hour. Additionally, because we are testing at a 5% significance level,
there is only a 5% probability of rejecting the null hypothesis when it is true, so this conclusion can be drawn
with a high level of confidence.
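
The one-sided test in (f) to (h) can be checked with the sketch below. It assumes independent samples with unequal variances and the large-sample normal approximation, and it uses the summary statistics quoted earlier in this question. The function name `diff_in_means_t` and its `null_diff` parameter are hypothetical names introduced for illustration. The key point is that the value of the gap under H0 is subtracted in the numerator: with a null of 0 the statistic is about 4.888, while against the stated null of 2 it is about 1.819.

```python
from math import sqrt
from statistics import NormalDist


def diff_in_means_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b, null_diff=0.0):
    """Return (t-statistic, standard error) for H0: mu_a - mu_b = null_diff."""
    se = sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    return (mean_a - mean_b - null_diff) / se, se


male = (22.55427, 13.07690, 854)    # mean, SD, n from the EViews output
female = (19.36897, 11.23198, 562)

t_zero, se_gap = diff_in_means_t(*male, *female, null_diff=0.0)
t_two, _ = diff_in_means_t(*male, *female, null_diff=2.0)

critical = NormalDist().inv_cdf(0.95)       # one-sided test at the 5% level
p_value = 1 - NormalDist().cdf(t_two)       # right-tail p-value for H0: gap = 2

print(f"SE of the gap       = {se_gap:.4f}")
print(f"t against gap = 0   = {t_zero:.3f}")
print(f"t against gap = 2   = {t_two:.3f}")
print(f"critical value      = {critical:.3f}")
print(f"one-sided p-value   = {p_value:.3f}")
print("reject H0" if t_two >= critical else "fail to reject H0")
```

Under these assumptions the null of a $2/hour gap is still rejected at the 5% level, with a one-sided p-value of about 0.034.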

Question 2:
a.)
EViews:

Completed table: to 3 decimal places


Variable                           Sample mean   Standard deviation   Minimum   Maximum
Education (ed)                     13.829        1.814                12.000    18.000
High school test score (bytest)    51.002        8.819                28.950    71.360
Ethnicity (black)                  0.193         0.394                0.000     1.000

b.) Completed table: to 3 decimal places EViews:


                               Answer
E(ed | bytest < 30)            12.000
E(ed | 30 <= bytest < 40)      12.591
E(ed | 40 <= bytest < 50)      13.213
E(ed | 50 <= bytest < 60)      14.182
E(ed | 60 <= bytest < 70)      15.141
E(ed | 70 <= bytest < 80)      16.667

The above table indicates that the expected value of ed (in years) rises with bytest (in points) in every
score band, i.e. the two variables are positively associated. E.g., the lowest bytest scores correspond to an
expected value equal to the minimum years of education (12.000), and the highest correspond to an expected
value of 16.667, just over 1 year lower than the maximum (18.000).
c.) EViews:
OLS Regression (compact format):
$\widehat{ed} = 8.836 + 0.098 \times bytest$
se: (0.133) (0.003)
t-stat: (66.478) (36.752)
p-value: (0.000) (0.000)
$R^2 = 0.227$
n = 3796

d.) Interpretation of the coefficients: the estimated intercept ($\hat{\beta}_0$) tells us that if the high school test score
(bytest) is 0, then the estimated years of education (ed) will be 8.836 years (3dp). Note that a score of 0 lies
well below the sample minimum of 28.950, so this is an extrapolation. Additionally, the
estimated slope ($\hat{\beta}_1$) tells us that if the high school test score (bytest) increases by one point, the
estimated years of education (ed) will increase on average by 0.098 years (3dp).

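Interpretation of the p-values: the p-values of $\hat{\beta}_0$ and $\hat{\beta}_1$ are both 0.000 (3dp). This means that both
coefficients are statistically significantly different from zero at any conventional significance level (e.g., 1%, 5% or 10%);
in particular, bytest has a statistically significant association with ed.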

e.) Interpretation of the R-squared measure: R2 measures the proportion of the variation in ed that is
explained by the model. In the above model R2 = 0.227 (3dp). This indicates that the point score
obtained in the high school test (bytest) accounts for approximately 22.7% of variation in the years of
education (ed).

Even though this percentage seems somewhat low, it is still a meaningful share of the variation in
ed, which is likely contingent on several different variables, e.g., socio-economic status. Additionally, as
noted above, the intercept and slope coefficients are statistically significant. Therefore, this
model is still useful, as it allows us to draw conclusions about the relationship between the
variables.
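
As a side calculation, in a regression with a single regressor the R-squared equals the squared sample correlation between the dependent variable and the regressor. The symbols ESS, TSS and SSR (explained, total and residual sums of squares) and $r_{ed,bytest}$ (the sample correlation) are notation introduced here, not taken from the EViews output.

$$R^2 = \frac{ESS}{TSS} = 1 - \frac{SSR}{TSS} = r_{ed,bytest}^2 \quad\Rightarrow\quad r_{ed,bytest} = \sqrt{0.227} \approx 0.476$$

The positive root is taken because the estimated slope is positive, so the reported R-squared corresponds to a moderate positive correlation between ed and bytest.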

f.) EViews (Black = 1):

OLS Regression (compact format), Black = 1:
$\widehat{ed} = 9.070 + 0.097 \times bytest$
se: (0.289) (0.007)
t-stat: (31.359) (14.681)
p-value: (0.000) (0.000)
$R^2 = 0.226$
n = 731

EViews (Black = 0):

OLS Regression (compact format), Black = 0:
$\widehat{ed} = 8.571 + 0.102 \times bytest$
se: (0.160) (0.003)
t-stat: (53.474) (32.882)
p-value: (0.000) (0.000)
$R^2 = 0.220$
n = 3065

g.) Method: to complete this question, I substituted the test scores for the term ‘bytest’ in
$\widehat{ed} = 9.070 + 0.097 \times bytest$ when black = 1, and in
$\widehat{ed} = 8.571 + 0.102 \times bytest$ when black = 0.

Completed table: to 3 decimal places

Test score (bytest)   Predicted education ($\widehat{ed}$) if black   Predicted education ($\widehat{ed}$) if non-black
35                    12.465                                          12.141
45                    13.435                                          13.161
55                    14.405                                          14.181
65                    15.375                                          15.201

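Discussion of results: for both black and non-black individuals, predicted education ($\widehat{ed}$) rises
with bytest. At each test score in the table, predicted education is higher when black = 1, indicating that the same
test scores (in points) are associated with higher levels of education (in years) for black individuals
(compared to their non-black counterparts). This difference comes from the higher intercept (9.070 vs 8.571); the
slope is in fact slightly smaller when black = 1 (0.097 vs 0.102), so the gap narrows from 0.324 years at a
score of 35 to 0.174 years at a score of 65.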
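
The substitution described in the method for (g) can be written as a small script. This is a sketch that uses the rounded coefficients from the two compact-format regressions in (f); the names `MODELS` and `predict_ed` are hypothetical and introduced for illustration only.

```python
# (intercept, slope) pairs copied from the subgroup regressions in part (f).
MODELS = {
    "black": (9.070, 0.097),
    "non-black": (8.571, 0.102),
}


def predict_ed(group, bytest):
    """Predicted years of education for a given group and test score."""
    intercept, slope = MODELS[group]
    return intercept + slope * bytest


for score in (35, 45, 55, 65):
    black = predict_ed("black", score)
    non_black = predict_ed("non-black", score)
    print(f"bytest = {score}: black = {black:.3f}, "
          f"non-black = {non_black:.3f}, gap = {black - non_black:.3f}")

# Test score at which the two fitted lines would cross.
crossover = (9.070 - 8.571) / (0.102 - 0.097)
print(f"fitted lines cross at bytest = {crossover:.1f}")
```

With these rounded coefficients the two fitted lines would only cross at a test score of about 99.8, which is above the sample maximum of 71.360, so within the observed range the predicted value is higher when black = 1.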

h.) Is the relationship between test scores & completed education causal? Causality means that a specific
action leads to a specific, measurable consequence, i.e., a causal relation between two events exists if
the occurrence of the first causes the other. In considering whether test scores and completed
education have a causal relationship, we must remember that an association (i.e., correlation) between
two variables does not mean that one causes the other. We have determined that these variables
are positively correlated; now we must consider whether they are causally related.

The least squares assumptions for causal inference include…


1. The error term ui has a conditional mean of 0: E(ui|Xi) = 0. In observational data, such as the data
we are currently dealing with, Xi (bytest) is not randomly assigned. Instead, the best we can hope
for is that bytest is as if randomly assigned (i.e. E(ui|Xi) = 0). For this assumption to be met, all the
other factors affecting ed should be unrelated to bytest. This means that given a value of bytest,
the mean of these other factors should equal zero.

However, this is not the case in the given data, as the factors that impact ed also impact bytest. For
example, the variable black explains 1.081% of variation in ed, but also explains 9.902% of variation
in bytest. This indicates that the variable black, which impacts ed, is not unrelated to bytest,
therefore violating this first assumption.
2. The pairs (Xi, Yi) should be independent and identically distributed across observations. I.e. the (bytest, ed)
pairs should be i.i.d. This assumption is fulfilled by random sampling, and therefore is met in this case.

3. Large outliers are unlikely. This involves an assumption of finite kurtosis for both Xi (bytest) and Yi
(ed), which is plausible in this case. For example, bytest is capped, as the best you can do in a
standardised test is full marks, and the worst is no marks. Additionally, ed is capped within the
range of possibility, which in this case, is 12-18 years. Because these variables have finite ranges,
they also both adhere to the assumption that Xi and Yi have nonzero finite fourth moments.

Even though this data adheres to the second and third least squares assumptions, we cannot conclude that the
relationship between test scores and completed education uncovered in (c) and (f) is causal, because this
relationship appears to violate the first assumption.
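
The cost of violating the first assumption can be made explicit with the standard large-sample expression for the OLS slope when the regressor is correlated with the error term. This is a textbook result stated as a sketch, not something estimated in this assignment; the symbols $\rho_{Xu}$ (correlation between bytest and the error term), $\sigma_u$ and $\sigma_X$ (their standard deviations) are notation introduced here.

$$\hat{\beta}_1 \xrightarrow{p} \beta_1 + \rho_{Xu}\,\frac{\sigma_u}{\sigma_X}$$

If $E(u_i \mid X_i) = 0$, then $\rho_{Xu} = 0$ and the slope converges to the causal effect $\beta_1$. If factors that are related to bytest are left in $u_i$, then $\rho_{Xu} \neq 0$ and the estimated slope does not converge to the causal effect, however large the sample.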
