Home Work 1: Group Member Student Name ID Contribution
GROUP MEMBER
4.
1. E(u_t) = 0: the expected value of the error term u_t in the regression model is 0. In other
words, the mean of the errors across all observations is 0. This assumption implies that
the errors do not exhibit any systematic bias.
2. Var(u_t) = σ² < ∞: the variance of the error term u_t in the regression model is constant and
finite for all values of the independent variable x_t (homoscedasticity). This is needed for the
OLS estimators to be efficient and for the usual standard errors to be valid.
3. cov(u_i, u_j) = 0 for i ≠ j: the errors u_i and u_j in the regression model are not linearly
correlated with each other. In other words, the error at observation i is uncorrelated with the
error at observation j.
4. cov(u_t, x_t) = 0: there is no correlation between the error term u_t and the
corresponding independent variable x_t. This ensures that the errors in the model are
unrelated to the independent variable.
5. u_t ~ N(0, σ²): the error term u_t in the regression model follows a normal distribution with
a mean of 0 and constant variance σ².
The first four assumptions are needed to prove that the OLS estimators are
BLUE (Best Linear Unbiased Estimators), as stated by the Gauss-Markov theorem. They ensure
that the estimators are unbiased and have the smallest variance among all linear unbiased
estimators. If any of them is violated, the OLS estimators may become biased or inefficient,
leading to unreliable results.
The fifth assumption states that the disturbance term u_t follows a normal distribution. This
assumption is made to support statistical inference from the sample to the population. It allows
the application of the usual statistical tests, which rely on the normality of the disturbances,
so that the OLS estimates can be used to test hypotheses about the coefficients. If the
disturbances do not follow a normal distribution, the test statistics may not follow their assumed
distributions and inferences may be inaccurate, especially in small samples. Making this
assumption implies that the test statistics will follow a t-distribution
(provided that the other assumptions also hold).
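The unbiasedness claim can be illustrated with a small Monte Carlo experiment. The sketch below is only an illustration under assumed conditions: the parameter values ALPHA, BETA and SIGMA, the fixed regressor values, and the helper name ols_slope are all hypothetical and do not come from the exercise. Errors are drawn with mean zero, constant variance, no correlation across observations, and independently of x, so the average of the estimated slopes should be close to the true slope.

```python
import random
import statistics


def ols_slope(x, y):
    """Slope estimate from a simple OLS regression of y on x."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical "true" parameters, chosen only for illustration.
ALPHA, BETA, SIGMA = 1.0, 2.0, 1.0

rng = random.Random(42)                  # fixed seed for reproducibility
x = [float(i) for i in range(1, 31)]     # regressor values held fixed across samples

slopes = []
for _ in range(2000):
    # Errors satisfy the classical assumptions by construction.
    u = [rng.gauss(0.0, SIGMA) for _ in x]
    y = [ALPHA + BETA * xi + ui for xi, ui in zip(x, u)]
    slopes.append(ols_slope(x, y))

mean_slope = statistics.fmean(slopes)
print(f"true slope = {BETA}, average estimate = {mean_slope:.4f}")
```

Each individual estimate differs from BETA because of sampling variability, but the estimates are centred on the true value.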
5.
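(3.39)
y_t = α + β x_t + u_t
Because the model is linear in the parameters, it can be estimated using OLS.
(3.40)
y_t = e^α x_t^β e^{u_t}  ⇔  ln(y_t) = α + β ln(x_t) + u_t
After taking logarithms, the model is linear in the parameters, so it can also be estimated using OLS.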
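The log transformation of (3.40) can be checked numerically. The following is a minimal sketch on simulated data: the values of ALPHA, BETA and SIGMA and the helper ols_simple are assumptions made for illustration, not part of the exercise. Data are generated from the multiplicative model, and OLS is then applied to ln(y) and ln(x).

```python
import math
import random


def ols_simple(x, y):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical "true" parameters, chosen only for illustration.
ALPHA, BETA, SIGMA = 0.5, 1.2, 0.1

rng = random.Random(0)
x = [rng.uniform(1.0, 10.0) for _ in range(2000)]
u = [rng.gauss(0.0, SIGMA) for _ in range(2000)]

# Generate y from the multiplicative model y = e^alpha * x^beta * e^u.
y = [math.exp(ALPHA) * xi ** BETA * math.exp(ui) for xi, ui in zip(x, u)]

# Regress ln(y) on ln(x): the transformed model is linear in alpha and beta.
alpha_hat, beta_hat = ols_simple([math.log(v) for v in x],
                                 [math.log(v) for v in y])
print(f"alpha_hat = {alpha_hat:.3f}, beta_hat = {beta_hat:.3f}")
```

The estimates should land close to the assumed ALPHA and BETA, which is what one would expect if the transformed model is a valid linear regression.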
9.
In regression analysis, hypotheses are stated and tested in terms of the actual (population)
coefficients, not the estimated values.
Hypothesis testing in statistics is generally concerned with making inferences about
population parameters based on sample data. The estimated coefficients are derived from a
specific sample and are subject to sampling variability.
Hypothesis testing accounts for this variability using the sampling distribution of the
estimators, which allows researchers to infer properties of the actual coefficients in the population.
By testing the actual coefficients, the results can be generalized to the entire population.
The estimated coefficients are already known once the sample has been used, so a hypothesis
about them would say nothing beyond that specific sample.
Hypotheses are tested concerning the actual values of the coefficients because the goal is
to infer properties about the population, not just describe the specific sample at hand. The
estimated values serve as the basis for this inference but are not the direct focus of the hypothesis
test.
CHAPTER 2 WOOLDRIDGE TEXT 7E
Exercise 4
(i).
When cigs = 0, the predicted birth weight is 119.77 ounces.
When cigs = 20, the predicted birth weight is 109.49 ounces.
Compared with not smoking at all, the predicted birth weight drops by 10.28 ounces (about 8.6%)
when the mother smokes 20 cigarettes a day.
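The arithmetic in (i) can be reproduced with a few lines of Python. This is a minimal sketch that assumes the fitted equation bwght = 119.77 - 0.514 cigs, the same equation that appears in part (iii); the function name predicted_bwght is only illustrative.

```python
def predicted_bwght(cigs):
    """Fitted birth weight in ounces from bwght = 119.77 - 0.514 * cigs."""
    return 119.77 - 0.514 * cigs


w0 = predicted_bwght(0)       # mother does not smoke
w20 = predicted_bwght(20)     # mother smokes 20 cigarettes a day
drop = w0 - w20               # difference in ounces
pct_drop = 100 * drop / w0    # difference relative to the non-smoking prediction

print(f"cigs=0: {w0:.2f} oz, cigs=20: {w20:.2f} oz, "
      f"drop: {drop:.2f} oz ({pct_drop:.1f}%)")
```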
(ii).
No, this regression does not necessarily capture a causal relationship. Other factors may
influence both smoking behavior and birth weight, so the estimated relationship may be only
correlational. To establish causation, a more thorough analysis that accounts for such
confounding variables is needed.
(iii).
When the predicted birth weight is 125 ounces, we have the equation
125 = 119.77 - 0.514 cigs
cigs ≈ -10.18
Comment: This result suggests that the mother would have to smoke approximately -10
cigarettes per day. This outcome is unrealistic because the number of cigarettes smoked cannot
be negative. Furthermore, a birth weight of 125 ounces is greater than 119.77 ounces, the
maximum predicted weight, which occurs at cigs = 0. Consequently, this model is unable to predict
a birth weight of 125 ounces. This demonstrates the drawbacks of attempting to predict something
as complicated as birth weight using a simple linear regression model with just one
explanatory variable (cigarettes).
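The inversion in (iii) can be sketched in code as well. The snippet assumes the same fitted equation as above; the names INTERCEPT, SLOPE and cigs_for_target are illustrative only.

```python
INTERCEPT = 119.77   # predicted birth weight (ounces) when cigs = 0
SLOPE = -0.514       # change in predicted birth weight per cigarette per day


def cigs_for_target(target_bwght):
    """Solve target = INTERCEPT + SLOPE * cigs for cigs."""
    return (target_bwght - INTERCEPT) / SLOPE


cigs_needed = cigs_for_target(125)
print(f"cigs needed for 125 oz: {cigs_needed:.2f}")

# Because the slope is negative, any target above the intercept
# requires a negative (and therefore impossible) value of cigs.
print("feasible:", cigs_needed >= 0)
```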
(iv). The proportion of women in the sample who do not smoke while pregnant is 0.85 (85%). This
means that the majority of the women in the sample have cigs = 0, and the results strongly reflect
those of mothers who do not smoke. For every one of these women the regression equation predicts
the same birth weight, 119.77 ounces, and it cannot produce a higher prediction for anyone. This
helps explain the finding in (iii): a predicted birth weight of 125 ounces is unattainable for any
non-negative value of cigs.
Exercise 5
(i). The intercept implies that when income is zero, predicted consumption is -$124.84. This
prediction is unrealistic, since consumption cannot be negative, and it shows that the model may
not predict consumption accurately at very low income levels. However, viewed on an annual
basis, -$124.84 is relatively close to zero.
(ii). With income = $30,000, the equation gives
Predicted consumption = -124.84 + 0.853(30,000) = $25,465.16
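As a quick check of the calculation in (ii), the sketch below evaluates the fitted consumption function at the two income levels discussed. The function name predicted_cons is an assumption for illustration; the coefficients are those used in the answer.

```python
def predicted_cons(inc):
    """Fitted annual consumption in dollars from cons = -124.84 + 0.853 * inc."""
    return -124.84 + 0.853 * inc


cons_at_zero = predicted_cons(0)         # the intercept discussed in (i)
cons_at_30k = predicted_cons(30_000)     # the prediction computed in (ii)

print(f"inc = 0:      {cons_at_zero:,.2f}")
print(f"inc = 30,000: {cons_at_30k:,.2f}")
```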
(iii).
Exercise 6
(i). The coefficient on log(dist) is the elasticity of price with respect to dist: the
percentage change in housing price for a 1% change in the distance from the incinerator. Yes, the
direction of the effect is as expected: living nearer to an incinerator lowers housing prices,
while living farther away raises them.
(ii). Simple regression fails to provide an unbiased estimate of the ceteris paribus elasticity of
price with respect to dist. The city's decision to locate the incinerator farther from
higher-priced neighborhoods introduces a positive correlation between log(dist) and housing
quality, an omitted variable that influences housing prices. This violates assumption SLR.4,
which requires that the error term have zero mean conditional on the explanatory variable (and
hence be uncorrelated with it). As a result, the OLS estimates are biased and fail to isolate the
true ceteris paribus effect of dist on price.
(iii). House size, number of bathrooms, lot size, home age, and neighborhood quality (such as
school quality) are examples of other factors affecting price that, as noted in (ii), could be
correlated with dist or log(dist).