Revision 235

The document contains a series of questions and answers related to linear regression, including model significance tests, types of data, properties of OLS estimators, and assumptions of OLS. It covers various statistical concepts such as goodness of fit, hypothesis testing, and multicollinearity. Additionally, it discusses the implications of adding predictors to regression models and the interpretation of regression outputs.


Choose the correct answer

1. A linear regression model is a model linear in variables and nonlinear in parameters


a) True
b) False

2. What test is used to test the significance of the parameters of the linear regression
model?
a) Chi Square test
b) ANOVA test
c) T- test
d) Wald test

3. 𝑌 = 𝛽0 + 𝛽1 𝑋1 + 𝛽2 𝑋2³ — this model is linear in


a) Parameters
b) Variables
c) Both parameters and variables
d) None of them

4. In Econometrics, researchers study only linear models


a) True
b) False

5. Which of the following models is a two-variable model?


a) 𝑌 = 𝛽0 + 𝛽1 𝑋1 + 𝛽2 𝑋2³
b) 𝑌𝑖 = 𝛽1 + 𝛽2 (1/𝑋𝑖) + 𝑢𝑖
c) 𝑌𝑖 = ln 𝛽1 + 𝛽2 𝑋1𝑖 + 𝛽3 𝑋2𝑖 + 𝑢𝑖
d) 𝑌 = 𝛽0 + 𝛽1 𝑋1 + ln 𝛽2 𝑋2

6. We use …… to choose among competing models.


a) The determination coefficient
b) The value of F-Statistic
c) The number of predictors
d) The correlation coefficient

7. In linear regression modeling, we should run many hypothesis tests to check the
statistical significance of the estimated parameters.
a) True
b) False

8. In a linear regression model with one predictor, … parameters are estimated.
a) 1
b) 2
c) 3
d) 4

9. In a linear regression model with one predictor, we should check the assumption of
multicollinearity.
a) True
b) False

10. Which OLS assumption is unnecessary to check when you build a two-variable model?
a) No Autocorrelation
b) No Multicollinearity
c) No Heteroscedasticity
d) Linearity assumption

Answer the following Questions:


1- What are the different types of data? Write the major assumption aligned with each
type.

• Time series: a set of observations on the values that a variable takes at different times. The major assumption underlying time series analysis is stationarity.

• Cross-sectional data: data on one or more variables collected at the same point in time, such as censuses and surveys. The major assumption underlying cross-sectional analysis is homogeneity, i.e., heterogeneity due to size or scale effects must be accounted for.

• Pooled (combined) data: data containing elements of both time series and cross-section data. The major assumption underlying pooled data analysis is the absence of serial autocorrelation.

2- A salesperson for a large car brand wants to determine whether there is a relationship
between an individual's income and the price they pay for a car. As such, the
individual's "income" is the independent variable and the "price" they pay for a car

is the dependent variable. The salesperson wants to use this information to
determine which cars to offer potential customers in new areas where average
income is known.
The following is the output of the analysis, read it carefully and answer the questions:
a) Comment on the goodness of fit of the model and write the interpretation of the
goodness-of-fit measure.
R square is the goodness-of-fit measure of the linear regression model: the closer R square is to 1, the better the model fits the data. In the current model, R square equals 0.762, which indicates that the model fits the data very well.
R square is interpreted as follows: 76.2% of the variability in Y, the price customers can pay for a car, is explained by X, income.
b) What is the null hypothesis of ANOVA test? Comment on the significance of the
given model
H0: β2 = 0, i.e., the slope coefficient equals zero and the model has no explanatory power.
In the current model, the p-value associated with the F-test (the ANOVA test) is far below 0.05, which indicates that we can reject H0 at the 0.05 level of significance (in other words, at the 0.95 level of confidence). Hence we conclude that the model is statistically significant, i.e., the results can be generalized to the target population and are not limited to the sample.
c) Write the fitted model and Interpret the estimated values of the parameters.
The fitted model:
𝑌̂𝑖 = 8266.786 + 0.564 𝑋𝑖
B1, the intercept, equals 8266.786: it is the mean value of Y when X equals 0, i.e., a customer with zero income would, on average, pay a price of about $8266.79 for a car.
B2, the slope, equals 0.564: it is the change in Y associated with a one-unit change in X, i.e., for every additional $1 of income, the price a customer can afford increases by about $0.56.

3- Explain the properties of OLS estimators.

• An estimator, say the OLS estimator 𝛽̂2, is said to be a best linear unbiased estimator (BLUE) of 𝛽2 if the following hold:
1. It is linear, that is, a linear function of a random variable, such as the dependent variable Y in the regression model.
2. It is unbiased, that is, its average or expected value, E(𝛽̂2), is equal to the true value, 𝛽2.
3. It has minimum variance in the class of all such linear unbiased estimators; an unbiased estimator with the least variance is known as an efficient estimator.
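The "unbiased" property can be illustrated with a short simulation (an illustrative sketch, not part of the original answer; the true parameter values are arbitrary): across many samples, the average OLS slope estimate lands close to the true slope 𝛽2.

```python
# Simulate many samples and check that the mean OLS slope estimate is
# close to the true slope beta2 (unbiasedness).
import numpy as np

rng = np.random.default_rng(1)
beta1, beta2 = 2.0, 0.5                  # true intercept and slope
estimates = []
for _ in range(2000):
    x = rng.uniform(0, 10, 50)
    y = beta1 + beta2 * x + rng.normal(0, 1, 50)
    # OLS slope with one predictor: cov(x, y) / var(x)
    b2_hat = np.cov(x, y, bias=True)[0, 1] / np.var(x)
    estimates.append(b2_hat)

print(np.mean(estimates))                # close to the true value 0.5
```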

4- Discuss the following OLS assumption:

Since we predict the value of Y from our knowledge of X, X should be independent of the error term, which is the unknown part of Y; the error term shrinks as knowledge of X grows, so the two terms should not be correlated in any way. If they were correlated, then as X increases the error would systematically increase (or perhaps decrease), meaning the difference between Y and the estimated Y would not be stable across the different levels of X.

5- Explain the expected effect of adding a new predictor to a regression model.

6- In the following figures, comment on the type of the relationship (linear or nonlinear) and the direction of the relationship.

[Four scatterplots shown here as panels (A), (B), (C), and (D)]

A- the relationship is linear and positive


B- the relationship is linear and negative
C- the relationship is nonlinear
D- there is no relationship between X and Y

7- Explain Cook’s distance measure as a measure of influence?

8- Discuss the OLS assumption of “No Autocorrelation”: explain why violating this assumption may affect the results of the predictive model, state the test used for checking autocorrelation and how to judge its results, and write the potential remedies for this violation.
1. Autocorrelation leads to biased estimates of the parameters' variances, distorting our understanding of the relationships between variables. This results in invalid inferences drawn from the model and misleading conclusions.
2. Run the Durbin-Watson test to check for autocorrelation. The output shows a calculated DW value, which is compared against two tabulated reference values, dL and dU:
If DW < dL, there is positive autocorrelation.
If DW > 4 − dL, there is negative autocorrelation.
(Values between the bounds are inconclusive; a DW value near 2 indicates no autocorrelation.)
3. As for the potential remedy: use lagged predictors.

9- Discuss the OLS assumption of “No Multicollinearity”: explain why violating this assumption may affect the results of the predictive model, state the test used for checking multicollinearity and how to judge its results, and write the potential remedies for this violation.

1. High multicollinearity leads to insignificant predictor coefficients despite a very high 𝑅², and to unstable regression coefficients. This combination can itself serve as an indicator of collinearity.
2. When VIF > 10, this is an indication of collinearity.
3. Another criterion is the condition index, which should not exceed 30.
4. Remedies: stepwise regression, principal component regression, ridge regression, factor analysis.

10- Comment on the following model:

▪ The adjusted R-square = 0.937 shows that the model has a high degree of goodness of fit.
▪ The significant F-test in the ANOVA table states that the model is statistically significant at the 0.05 level of significance.
▪ The predictor “X” has a statistically significant effect on Y (sig. = 0.013), and its effect is positive and equals 0.397; that is, when X increases by 1 unit, Y will increase by 0.397 units, after controlling for the effect of Z.
▪ The predictor “Z” has a statistically significant effect on Y (sig. = 0.020), and its effect is positive and equals 0.379; that is, when Z increases by 1 unit, Y will increase by 0.379 units, after controlling for the effect of X.
▪ The constant is significant as well and its value is 1.66, which means that when X and Z equal zero, the mean value of Y would be 1.66.
