Revision 235
2. What test is used to test the significance of the parameters of the linear regression
model?
a) Chi Square test
b) ANOVA test
c) T-test
d) Wald test
7. In linear regression modeling, we should run many hypothesis tests to check the
statistical significance of the estimated parameters.
a) True
b) False
8. In a linear regression model with one predictor, … parameters are estimated.
a) 1
b) 2
c) 3
d) 4
9. In a linear regression model with one predictor, we should check the assumption of
multicollinearity.
a) True
b) False
10. Which OLS assumption is unnecessary to check when you build a two-variable
model?
a) No Autocorrelation
b) No Multicollinearity
c) No Heteroscedasticity
d) Linearity assumption
Time series
•A time series is a set of observations on the values that a variable takes at different
times.
•The major assumption underlying time series analysis is stationarity.
Cross-section data
•Cross-section data are data on one or more variables collected at the same point in
time.
Pooled data
•In pooled, or combined, data are elements of both time series and cross-section data.
•The major assumption underlying pooled data analysis is the nonexistence of serial
autocorrelation.
2- A salesperson for a large car brand wants to determine whether there is a relationship
between an individual's income and the price they pay for a car. As such, the
individual's "income" is the independent variable and the "price" they pay for a car
is the dependent variable. The salesperson wants to use this information to
determine which cars to offer potential customers in new areas where average
income is known.
The following is the output of the analysis, read it carefully and answer the questions:
a) Comment on the goodness of fit of the model and write the interpretation of the
goodness-of-fit measure.
R square is the goodness-of-fit measure of the linear regression model. The closer
the value of R square is to 1, the better the model fits the data. In the
current model, R square equals 0.762, which indicates that the model fits the data
well.
R square is interpreted as follows: 76.2% of the variability in Y "the price
customers pay for a car" is explained by X "income".
b) What is the null hypothesis of ANOVA test? Comment on the significance of the
given model
H0: B2 = 0 (the slope is zero; the intercept B1 is not part of this test)
In the current model, the p-value associated with the F-test "ANOVA test" is far below
0.05, which indicates that we reject H0 at the 0.05 level of significance (in other
words, at the 95% level of confidence). Hence we conclude that the model is statistically
significant, i.e., the results can be generalized to the target population and are not
limited to the sample.
c) Write the fitted model and Interpret the estimated values of the parameters.
The fitted model:
Ŷi = 8266.786 + 0.564 Xi
B1: the intercept is 8266.786. It is the mean value of Y when X equals 0, i.e., people
with an income of zero would on average pay a price of 8266.79 $ for a car.
B2: the slope is 0.564. It is the change in Y associated with a unit change in X: as
income increases by 1 unit, the price the customer pays for a car increases by
0.564 $ on average.
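As a rough illustration of how the fitted model in part (c) could be used by the salesperson, the Python sketch below plugs an income into Ŷi = 8266.786 + 0.564 Xi. The helper name predicted_price and the example incomes are hypothetical and do not come from the original output.

```python
# Coefficients taken from the fitted model in part (c).
INTERCEPT = 8266.786  # B1: mean price when income is 0
SLOPE = 0.564         # B2: change in price per 1-unit change in income


def predicted_price(income):
    """Return the fitted car price for a given income (hypothetical helper)."""
    return INTERCEPT + SLOPE * income


# Hypothetical incomes, chosen only for illustration.
for income in (0, 20000, 20001):
    print(income, round(predicted_price(income), 3))

# Two predictions one income unit apart differ by exactly the slope.
slope_check = predicted_price(20001) - predicted_price(20000)
```

The difference stored in slope_check mirrors the interpretation of B2: one extra unit of income is associated with a 0.564 $ higher fitted price.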
3- Explain the properties of OLS estimators.
Since we predict the value of Y from the knowledge of X, X should be independent of the
error term, which is the unexplained part of Y. Knowing X reduces this unexplained
part, so the two terms should not be correlated in any way. If they are correlated,
then as X increases the error increases (or decreases), which means that
the difference between Y and the estimated Y is not stable across the levels of X.
6- In the following figures, comment on the type of the relationship (linear or
nonlinear), and the direction of the relationship?
(A) (B)
(C) (D)
8- Discuss the OLS assumption of “No Autocorrelation”: explain why the
violation of this assumption may affect the results of the predictive model,
name the test used for checking autocorrelation and how its results are
judged, and list the potential remedies for this violation.
1. Autocorrelation does not bias the OLS coefficient estimates, but it makes
them inefficient and biases their estimated variances (standard errors).
This results in invalid inferences drawn from the model, leading
to misleading conclusions about relationships between variables.
2. Run the Durbin-Watson test to check for autocorrelation. The output shows a
calculated value of DW, which is compared with two reference
values: DWlow and DWupp.
If DW < DWlow, there is positive autocorrelation.
If DW > 4 - DWlow, there is negative autocorrelation.
If DWupp < DW < 4 - DWupp, there is no evidence of autocorrelation;
otherwise the test is inconclusive.
3. A potential remedy is to use lagged predictors.
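The Durbin-Watson check can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard definition of the statistic (sum of squared successive differences of the residuals divided by the sum of squared residuals) and the usual bounds rule, in which negative autocorrelation is flagged when DW > 4 - DWlow. The residual series and the bounds 1.1 and 1.4 are hypothetical placeholders, not values from a real DW table.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    of the residuals divided by the sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


def dw_decision(dw, dw_low, dw_upp):
    """Apply the bounds rule; dw_low and dw_upp would come from DW tables."""
    if dw < dw_low:
        return "positive autocorrelation"
    if dw > 4 - dw_low:
        return "negative autocorrelation"
    if dw_upp < dw < 4 - dw_upp:
        return "no autocorrelation"
    return "inconclusive"


# Hypothetical residuals: long runs of the same sign vs. alternating signs.
runs = [1, 1, 1, -1, -1, -1]
alternating = [1, -1, 1, -1, 1, -1]

# Placeholder bounds for illustration only (not taken from a DW table).
DW_LOW, DW_UPP = 1.1, 1.4

for name, res in (("runs", runs), ("alternating", alternating)):
    dw = durbin_watson(res)
    print(name, round(dw, 3), dw_decision(dw, DW_LOW, DW_UPP))
```

Residuals that stay on the same side of zero for several periods pull DW toward 0, while residuals that flip sign every period push it toward 4; values near 2 indicate no autocorrelation.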
3. Another criterion is the condition index, which should not exceed 30.
4. Remedies: stepwise regression, principal component regression, ridge
regression, factor analysis.
▪ The adjusted R-square = 0.937 shows that the model has a high degree of
goodness of fit.
▪ The significant F-test in the ANOVA table states that the model is
statistically significant at the 0.05 level of significance.
▪ The predictor “X” has a statistically significant effect on Y (sig. = 0.013), and
its effect is positive and equals 0.397; that is, when X increases by 1 unit, Y
increases by 0.397 units on average, after controlling for the effect of Z.
▪ The predictor “Z” has a statistically significant effect on Y (sig. = 0.020), and
its effect is positive and equals 0.379; that is, when Z increases by 1 unit, Y
increases by 0.379 units on average, after controlling for the effect of X.
▪ The constant is significant as well and its value is 1.66, which means that when X
and Z equal zero, the mean value of Y is 1.66.
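To make the phrase "after controlling for the other predictor" concrete, the sketch below evaluates the fitted equation implied by the reported coefficients (constant 1.66, X coefficient 0.397, Z coefficient 0.379). The function name fitted_y and the input values are hypothetical; only the three coefficients come from the output discussed above.

```python
# Coefficients reported in the regression output discussed above.
CONSTANT = 1.66
B_X = 0.397
B_Z = 0.379


def fitted_y(x, z):
    """Fitted mean of Y for given X and Z (hypothetical helper)."""
    return CONSTANT + B_X * x + B_Z * z


# Hold Z fixed at an arbitrary value and raise X by one unit.
effect_of_x = fitted_y(3, 5) - fitted_y(2, 5)

# Hold X fixed at an arbitrary value and raise Z by one unit.
effect_of_z = fitted_y(2, 6) - fitted_y(2, 5)

print(round(fitted_y(0, 0), 2), round(effect_of_x, 3), round(effect_of_z, 3))
```

Whatever fixed value is chosen for the other predictor, the one-unit difference equals the corresponding coefficient, which is what each partial effect expresses.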