Problem Set 1 (Sas9891)
Question 1
a. Omitting a variable leads to omitted-variable bias. Suppose that the variable v in question has
a non-zero coefficient (i.e. it affects the dependent variable y) and is correlated
with at least one independent variable w (i.e. $\mathrm{Cov}(v, w) \neq 0$).
Removing v from the regression will cause the estimators on all independent variables correlated
with v to be biased. Let w be such a variable. The extent of the bias is $|\beta_v \delta_w|$, where
$\delta_w$ is the slope from regressing v on w, and its direction is upward if $\rho_{v,y}$ and
$\rho_{v,w}$ have the same sign. Otherwise, the direction is downward.
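The direction of the bias can be checked numerically. The sketch below uses made-up, noise-free data (the values of w and v and the coefficients 2.0 and 3.0 are assumptions for illustration, not taken from the problem set) and compares the slope on w when v is omitted with the true coefficient on w.

```python
from statistics import mean

def slope(x, y):
    """OLS slope from a simple regression of y on x (with an intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Hypothetical data: v is positively related to w.
w = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
v = [0.5, 1.5, 1.0, 2.5, 2.0, 3.5]

# Hypothetical true coefficients; y is generated without noise.
beta_w, beta_v = 2.0, 3.0
y = [1.0 + beta_w * wi + beta_v * vi for wi, vi in zip(w, v)]

delta_w = slope(w, v)        # slope from regressing the omitted v on w
short_slope = slope(w, y)    # coefficient on w when v is left out
bias = short_slope - beta_w  # equals beta_v * delta_w in this noise-free case

print(f"delta_w = {delta_w:.4f}, bias = {bias:.4f}")
```

Here both beta_v and the relationship between v and w are positive, so the coefficient on w is overstated; flipping the sign of either one would push the bias downward.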
While both phenomena can undermine the validity and reliability of regression, I believe that
omitted-variable bias is a more severe concern. Compared to superfluous variables that only
make the estimators less precise, omitted-variable bias can change the direction and
magnitude of estimators, which can be quite misleading.
Question 2
i. Since we don't have a separate variable for the verbal SAT score, we cannot directly test
this statement. UNCERTAIN.
ii.
$H_0: \beta_{math} \leq 0$
$H_1: \beta_{math} > 0$
Assuming that df > 30 and that the observations follow a normal distribution, we perform a
one-tailed t-test at the 5% significance level ($\alpha = 0.05$), with critical value $z_\alpha = 1.645$.
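The critical value quoted here can be reproduced with the Python standard library. This is a small sketch that relies on the normal approximation to the t distribution, which is the assumption already made for df > 30.

```python
from statistics import NormalDist

alpha = 0.05
# One-tailed critical value: the point with 5% of the standard normal mass above it.
z_alpha = NormalDist().inv_cdf(1 - alpha)
print(round(z_alpha, 3))  # 1.645
```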
$H_1: \beta_{sat} \neq \beta_{math} \iff \beta_{sat} - \beta_{math} \neq 0$
The method for calculating $E(\hat\beta_{sat} - \hat\beta_{math})$ is trivial. However, calculating
$\mathrm{Var}(\hat\beta_{sat} - \hat\beta_{math})$ is not possible as the number of observations is unknown. UNCERTAIN.
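For reference, the variance of a difference of two estimators follows a standard identity. It is written here with hats to mark estimators, which is a notational assumption:

```latex
\[
\operatorname{Var}(\hat\beta_{sat} - \hat\beta_{math})
  = \operatorname{Var}(\hat\beta_{sat})
  + \operatorname{Var}(\hat\beta_{math})
  - 2\,\operatorname{Cov}(\hat\beta_{sat}, \hat\beta_{math})
\]
```

The two variance terms are the squared standard errors of the individual coefficients. The covariance term is not part of a standard coefficient table, so it has to come from the estimated variance-covariance matrix of the regression.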
Question 3
a. Yes, the percentage of students eligible for the federally funded school lunch program is
proportional to the percentage of students living in poverty. Therefore, it can be used as a
proxy variable.
b. In the first regression, the effect of expend on math10 is the sum of the direct effect of
expend on math10 and the indirect effect that operates through lnchprg. In the second
regression, on the other hand, it is only the direct effect of expend on math10. Therefore, the
estimators are different.
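This direct-plus-indirect split corresponds to a standard algebraic identity between the two sets of OLS estimates. In the sketch below, the tilde marks the regression without lnchprg and the hat marks the regression with it; this notation is an assumption introduced here, not taken from the problem set:

```latex
\[
\tilde\beta_{expend}
  = \underbrace{\hat\beta_{expend}}_{\text{direct effect}}
  + \underbrace{\hat\beta_{lnchprg}\,\tilde\delta}_{\text{indirect effect}},
\qquad
\tilde\delta = \text{slope from regressing } lnchprg \text{ on } expend .
\]
```

The two expenditure coefficients coincide only if $\hat\beta_{lnchprg} = 0$ or $\tilde\delta = 0$.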
Question 4
b.
$\widehat{psoda} = 0.9563196 + 0.1149882 \cdot prpblck + 0.0000016 \cdot income$
Sample size: 407−9=398
$R^2 = 0.05952$, so the independent variables do not explain the dependent variable very
well.
Both variables are statistically significant ($p = 1.26 \times 10^{-5}$ and $p = 1.22 \times 10^{-5}$, respectively).
Subhani 3
The model suggests that, all else being equal, a one-percentage-point increase in the
proportion of Black residents (an increase of 0.01 in prpblck) raises the price of a medium
soda in that ZIP code by about $0.0011 (0.11 cents). The effect is not economically large, as
it is negligible in monetary terms.
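The dollar figure follows from scaling the prpblck coefficient by the size of the change. A minimal sketch of that arithmetic, assuming prpblck is measured as a proportion between 0 and 1 so that one percentage point is 0.01:

```python
coef_prpblck = 0.1149882   # estimated coefficient on prpblck in the levels regression
delta_prpblck = 0.01       # one percentage point, with prpblck treated as a proportion

change_dollars = coef_prpblck * delta_prpblck
change_cents = 100 * change_dollars
print(f"{change_dollars:.4f} dollars = {change_cents:.2f} cents")
```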
c.
$\widehat{psoda} = 1.03740 + 0.06493 \cdot prpblck$
The estimated discrimination effect is smaller after removing income.
$\rho_{prpblck,income} = -0.43$
$\rho_{psoda,income} = 0.13$
Because $\rho_{prpblck,income}$ and $\rho_{psoda,income}$ have opposite signs, $\rho_{prpblck,income} \times \rho_{psoda,income} < 0$.
Therefore, omitting income from the regression biases the coefficient on prpblck downward,
which is why the estimated discrimination effect is smaller.
d.
$\widehat{\log(psoda)} = -0.79377 + 0.12158 \cdot prpblck + 0.07651 \cdot \log(income)$
For every 1% increase in income, psoda increases by about 0.07651%.
Because $0 \leq prpblck \leq 1$, taking its log would produce negative values (and is undefined
at 0), making the coefficient harder to interpret.
e. If prpblck increases by 0.2, psoda will be multiplied by $\exp(0.12158 \times 0.2) \approx 1.025$,
i.e. an increase of about 2.5%.
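The calculation in (e) can be reproduced directly. The sketch below computes the exact multiplicative change implied by the log(psoda) regression and, for comparison, the usual linear approximation 100 · β · Δ.

```python
import math

coef_prpblck = 0.12158   # coefficient on prpblck in the log(psoda) regression
delta = 0.2              # increase in prpblck

factor = math.exp(coef_prpblck * delta)   # exact multiplicative change in psoda
exact_pct = 100 * (factor - 1)            # exact percentage change
approx_pct = 100 * coef_prpblck * delta   # linear approximation
print(f"factor = {factor:.4f}, exact = {exact_pct:.2f}%, approx = {approx_pct:.2f}%")
```

The two figures are close because the change in log(psoda) is small; the gap widens for larger changes in prpblck.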
f. prpblck decreases.
h. Yes, including both variables, which are highly correlated, leads to multicollinearity.
Question 5
a. For every 1% increase in expendA, voteA increases by $\beta_1 / 100$ (assuming $\beta_1 > 0$).
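As a rough numerical check, the sketch below assumes the model is level-log in spending, i.e. voteA = β0 + β1·log(expendA) + ..., which is what the β1/100 interpretation implies, and plugs in the β1 estimate reported in part (d).

```python
import math

beta_1 = 6.08332      # estimate from part (d); assumed to multiply log(expendA)
pct_increase = 1.0    # a 1% increase in expendA

# Approximation used in the answer: beta_1 / 100 per 1% increase.
approx_change = beta_1 / 100 * pct_increase
# Exact change implied by the log specification.
exact_change = beta_1 * math.log(1 + pct_increase / 100)
print(f"approx = {approx_change:.4f}, exact = {exact_change:.4f}")
```

For a change as small as 1% the approximation and the exact value agree to about three decimal places.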
b.
$H_0: \beta_1 = -\beta_2 \iff \beta_1 + \beta_2 = 0$
c.
Yes, A’s expenditure affects the outcome, as its coefficient is statistically significant. B’s
expenditure also affects the outcome, but in the opposite direction to A’s. Yes, we can use
these results to test the null hypothesis in (b) with a t-test.
d.
$H_0: \beta_1 + \beta_2 = 0$
$H_1: \beta_1 + \beta_2 \neq 0$
$\beta_1 = 6.08332$
$\beta_2 = -6.61542$
$\therefore \beta_1 + \beta_2 = -0.5321$
$\mathrm{Var}(\beta_1) = (\sigma_{\beta_1} \times \sqrt{n})^2 = (0.38215 \times \sqrt{169})^2 = 24.6805$
$\mathrm{Var}(\beta_2) = (\sigma_{\beta_2} \times \sqrt{n})^2 = (0.37882 \times \sqrt{169})^2 = 24.2523$
$95\%\ \mathrm{CI} = -0.5321 \pm 1.96 \cdot \frac{7.1367}{\sqrt{169}} = [-1.6081, 0.5440]$
Because $0 \in [-1.6081, 0.5440]$, we cannot reject the null hypothesis at the 5% significance level.
Therefore, whether A’s expenditure cancels out B’s expenditure is inconclusive.
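The interval can be recomputed from the numbers quoted in this answer. In the sketch below the value 7.1367 is taken as given from the text, since its derivation is not shown, and 1.96 is the two-sided 5% critical value under a normal approximation.

```python
import math

beta_1, beta_2 = 6.08332, -6.61542
n = 169
scaled_sd = 7.1367   # value used in the answer; taken as given, not rederived here
z = 1.96             # two-sided 5% critical value (normal approximation)

estimate = beta_1 + beta_2
se = scaled_sd / math.sqrt(n)              # standard error of beta_1 + beta_2
lower, upper = estimate - z * se, estimate + z * se
reject = not (lower <= 0 <= upper)         # reject H0 only if 0 lies outside the CI
print(f"[{lower:.4f}, {upper:.4f}], reject H0: {reject}")
```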