Chapter 13
Multiple Regression
True False
2. If a regression model's F test statistic is Fcalc = 43.82, we could say that the
explained variance is approximately 44 percent.
True False
3. In a regression, the model with the best fit is preferred over all other models.
True False
True False
5. A predictor whose pairwise correlation with Y is near zero can still have a
significant t-value in a multiple regression when other predictors are included.
True False
True False
7. R2adj can exceed R2 if there are several weak predictors.
True False
True False
9. In a multiple regression with 3 predictors in a sample of 25 U.S. cities, we would
use F3, 21 in a test of overall significance.
True False
10. Evans' Rule says that if n = 50 you need at least 5 predictors to have a good
model.
True False
11. The model Y = β0 + β1X + β2X^2 cannot be estimated by Excel because of the
nonlinear term.
True False
12. The random error term in a regression model reflects all factors omitted from the
model.
True False
13. If the probability plot of residuals resembles a straight line, the residuals show a
fairly good fit to the normal distribution.
True False
14. Confidence intervals for Y may be unreliable when the residuals are not normally
distributed.
True False
15. A negative estimated coefficient in a regression usually indicates a weak
predictor.
True False
16. For a certain firm, the regression equation Bonus = 2,000 + 257 Experience +
0.046 Salary describes employee bonuses with a standard error of 125. John has 10
years' experience, earns $50,000, and earned a bonus of $7,000. John is an
outlier.
True False
17. There is one residual for each predictor in the regression model.
True False
18. If R2 and R2adj differ greatly, we should probably add a few predictors to improve
the fit.
True False
19. The effect of a binary predictor is to shift the regression intercept.
True False
20. A parsimonious model is one with many weak predictors but a few strong ones.
True False
21. The F statistic and its p-value give a global test of significance for a multiple
regression.
True False
22. In a regression model of student grades, we would code the nine categories of
business courses taken (ACC, FIN, ECN, MGT, MKT, MIS, ORG, POM, QMM) by
including nine binary (0 or 1) predictors in the regression.
True False
23. A disadvantage of Excel's Data Analysis regression tool is that it expects the
independent variables to be in a block of contiguous columns so you must delete
a column if you want to eliminate a predictor from the model.
True False
24. A disadvantage of Excel's regression is that it does not give as much accuracy in
the estimated regression coefficients as a package like MINITAB.
True False
25. Nonnormality of the residuals from a regression can best be detected by looking
at the residual plots against the fitted Y values.
True False
26. A high variance inflation factor (VIF) indicates a significant predictor in the
regression.
True False
27. Autocorrelation may be detected by looking at a plot of the residuals against
time.
True False
True False
29. Plotting the residuals against a binary predictor (X = 0, 1) reveals nothing about
heteroscedasticity.
True False
30. The regression equation Bonus = 2,812 + 27 Experience + 0.046 Salary says that
Experience is the most significant predictor of Bonus.
True False
31. A multiple regression with 60 observations should not have 13 predictors.
True False
32. A regression of Y using four independent variables X1, X2, X3, X4 could also have
up to four nonlinear terms (Xj^2) and six simple interaction terms (XjXk) if you have
enough observations to justify them.
True False
33. When autocorrelation is present, the estimates of the coefficients will be
unbiased.
True False
34. If the residuals in your regression are nonnormal, a larger sample size might help
improve the reliability of confidence intervals for Y.
True False
35. Multicollinearity can be detected from t tests of the predictor variables.
True False
36. When multicollinearity is present, the regression model is of no use for making
predictions.
True False
37. Autocorrelation of the residuals may affect the reliability of the t values for the
estimated coefficients of the predictors X1, X2, . . . , Xk.
True False
True False
39. Statisticians who work with cross-sectional data generally do not anticipate
autocorrelation.
True False
40. The ill effects of heteroscedasticity might be mitigated by redefining totals (e.g.,
total number of homicides) as relative values (e.g., homicide rate per 100,000
population).
True False
41. Nonnormal residuals lead to biased estimates of the coefficients in a regression
model.
True False
True False
43. Heteroscedasticity exists when all the errors (residuals) have the same variance.
True False
True False
45. A squared predictor is used to test for nonlinearity in the predictor's relationship
to Y.
True False
46. Nonnormality of residuals is not usually considered a major problem unless there
are outliers.
True False
47. In the fitted regression Y = 12 + 3X1 - 5X2 + 27X3 + 2X4 the most significant
predictor is X3.
True False
48. Given that the fitted regression is Y = 76.40 -6.388X1 + 0.870X2, the standard error
of b1 is 1.453, and n = 63. At α = .05, we can conclude that X1 is a significant
predictor of Y.
True False
49. Unlike other predictors, a binary predictor has a t-value that is either 0 or 1.
True False
50. The t-test shows the ratio of an estimated coefficient to its standard error.
True False
51. In a multiple regression with five predictors in a sample of 56 U.S. cities, we would
use F5, 50 in a test of overall significance.
True False
52. In a multiple regression with six predictors in a sample of 67 U.S. cities, what
would be the critical value for an F-test of overall significance at α = .05?
A. 2.29
B. 2.25
C. 2.37
D. 2.18
53. In a multiple regression with five predictors in a sample of 56 U.S. cities, what
would be the critical value for an F-test of overall significance at α = .05?
A. 2.45
B. 2.37
C. 2.40
D. 2.56
54. When predictor variables are strongly related to each other, the __________ of the
regression estimates is questionable.
A. logic
B. fit
C. parsimony
D. stability
55. A test is conducted in 22 cities to see if giving away free transit system maps will
increase the number of bus riders. In a regression analysis, the dependent variable
Y is the increase in bus riders (in thousands of persons) from the start of the test
until its conclusion. The independent variables are X1 = the number (in thousands)
of free maps distributed and a binary variable X2 = 1 if the city has free downtown
parking, 0 otherwise. The estimated regression equation is
In city 3, the observed Y value is 7.3 and X1 = 140 and X2 = 0. The residual for city
3 (in thousands) is:
A. 6.15.
B. 1.15.
C. 4.83.
D. 1.57.
56. If X2 is a binary predictor in Y = β0 + β1X1 + β2X2, then which statement is most
nearly correct?
57. The unexplained sum of squares measures variation in the dependent variable Y
about the:
B. estimated Y values.
D. Y-intercept.
58. Which of the following is not true of the standard error of the regression?
59. A multiple regression analysis with two independent variables yielded the
following results in the ANOVA table: SS(Total) = 798, SS(Regression) = 738,
SS(Error) = 60. The multiple correlation coefficient is:
A. .2742
B. .0752
C. .9248
D. .9617
60. A fitted multiple regression equation is Y = 12 + 3X1 - 5X2 + 7X3 + 2X4. When X1
increases 2 units and X2 increases 2 units as well, while X3 and X4 remain
unchanged, what change would you expect in your estimate of Y?
A. Decrease by 2
B. Decrease by 4
C. Increase by 2
D. No change in Y
61. A fitted multiple regression equation is Y = 28 + 5X1 - 4X2 + 7X3 + 2X4. When X1
increases 2 units and X2 increases 2 units as well, while X3 and X4 remain
unchanged, what change would you expect in your estimate of Y?
A. Increase by 2
B. Decrease by 4
C. Increase by 4
D. No change in Y
62. Which is not a name often given to an independent variable that takes on just two
values (0 or 1) according to whether or not a given characteristic is absent or
present?
A. Absent variable
B. Binary variable
C. Dummy variable
63. Using a sample of 63 observations, a dependent variable Y is regressed against
two variables X1 and X2 to obtain the fitted regression equation Y = 76.40 -
6.388X1 + 0.870X2. The standard error of b1 is 3.453 and the standard error of b2 is
0.611. At α = .05, we could:
A. .3995.
B. .6005.
C. .6654.
D. .8822.
66. Refer to the following regression results. The dependent variable is Abort (the
number of abortions per 1000 women of childbearing age). The regression was
estimated using data for the 50 U.S. states with these predictors: EdSpend =
public K-12 school expenditure per capita, Age = median age of population,
Unmar = percent of total births by unmarried women, Infmor = infant mortality
rate in deaths per 1000 live births.
A. 1605.7.
B. 0.9134.
C. 89.66.
A. 3177.17.
B. 301.19.
C. 17.71.
D. impossible to determine.
70. A Realtor is trying to predict the selling price of houses in Greenville (in thousands
of dollars) as a function of Size (measured in thousands of square feet) and
whether or not there is a fireplace (FP is 0 if there is no fireplace, 1 if there is a
fireplace). Part of the regression output is provided below, based on a sample of
20 homes. Some of the information has been omitted.
A. 9.5.
B. 13.8.
C. 122.5.
D. 1442.6.
71. A Realtor is trying to predict the selling price of houses in Greenville (in thousands
of dollars) as a function of Size (measured in thousands of square feet) and
whether or not there is a fireplace (FP is 0 if there is no fireplace, 1 if there is a
fireplace). The regression output is provided below. Some of the information has
been omitted.
A. 20
B. 18
C. 3
D. 2
72. A Realtor is trying to predict the selling price of houses in Greenville (in thousands
of dollars) as a function of Size (measured in thousands of square feet) and
whether or not there is a fireplace (FP is 0 if there is no fireplace, 1 if there is a
fireplace). The regression output is provided below. Some of the information has
been omitted.
B. A fireplace adds around $6476 to the selling price of the average house.
C. A large house with no fireplace will sell for more than a small house with a
fireplace.
74. A log transformation might be appropriate to alleviate which problem(s)?
A. Heteroscedastic residuals
B. Multicollinearity
C. Autocorrelated residuals
75. A useful guideline in determining the extent of collinearity in a multiple regression
model is:
A. Sturge's Rule.
B. Klein's Rule.
C. Occam's Rule.
D. Pearson's Rule.
76. In a multiple regression all of the following are true regarding residuals except:
B. they are the differences between observed and predicted values of the
response variable.
77. The residual plot below suggests which violation(s) of regression assumptions?
A. Autocorrelation
B. Heteroscedasticity
C. Nonnormality
D. Multicollinearity
78. Which is not a standard criterion for assessing a regression model?
A. Logic of causation
B. Overall fit
C. Degree of collinearity
D. Binary predictors
79. If the standard error is 12, a quick prediction interval for Y is:
A. 15.
B. 24.
C. 19.
84. In the following regression, which are the two best predictors?
A. NumCyl, HpMax
B. Intercept, NumCyl
C. NumCyl, Domestic
D. ManTran, Width
85. In the following regression (n = 91), which coefficients differ from zero in a two-
tailed test at α = .05?
A. NumCyl, HPMax
B. Intercept, ManTran
D. Intercept, Domestic
86. Based on the following regression ANOVA table, what is the R2?
A. 0.1336
B. 0.6005
C. 0.3995
88. The relationship of Y to four other variables was established as Y = 12 + 3X1 - 5X2
+ 7X3 + 2X4. When X1 increases 5 units and X2 increases 3 units, while X3 and X4
remain unchanged, what change would you expect in your estimate of Y?
A. Decrease by 15
B. Increase by 15
C. No change
D. Increase by 5
89. Does the picture below show strong evidence of heteroscedasticity against the
predictor Wheelbase?
A. Yes
B. No
90. Which is not a correct way to find the coefficient of determination?
A. SSR/SSE
B. SSR/SST
C. 1 - SSE/SST
91. If SSR = 3600, SSE = 1200, and SST = 4800, then R2 is:
A. .5000
B. .7500
C. .3333
D. .2500
92. Which statement is incorrect?
B. The R2 statistic can only increase (or stay the same) when you add more
predictors to a regression.
C. If the F-statistic is insignificant, the t-statistics for the predictors also are
insignificant at the same α.
D. If there is a binary predictor (X = 0, 1) in the model, the residuals may not sum
to zero.
96. If you rerun a regression, omitting a predictor X5, which would be unlikely?
C. The remaining estimated β's will change if X5 was collinear with other
predictors.
A. It is a continuous distribution.
100. The regression equation Salary = 25,000 + 3200 YearsExperience + 1400
YearsCollege describes employee salaries at Axolotl Corporation. The standard
error is 2600. John has 10 years' experience and 4 years of college. His salary is
$66,500. What is John's standardized residual?
A. -1.250
B. -0.240
C. +0.870
D. +1.500
101. The regression equation Salary = 28,000 + 2700 YearsExperience + 1900
YearsCollege describes employee salaries at Ramjac Corporation. The standard
error is 2400. Mary has 10 years' experience and 4 years of college. Her salary is
$58,350. What is Mary's standardized residual (approximately)?
A. -1.150
B. +2.007
C. -1.771
D. +1.400
102. Which Excel function will give the p-value for overall significance if a regression
has 75 observations and 5 predictors and gives an F test statistic Fcalc = 3.67?
A. =F.INV(.05, 5, 75)
B. =F.DIST(3.67, 4, 74)
C. =F.DIST.RT(3.67, 5, 69)
D. =F.DIST(.05, 4, 70)
103. The ScamMore Energy Company is attempting to predict natural gas
consumption for the month of January. A random sample of 50 homes was used
to fit a regression of gas usage (in CCF) using as predictors Temperature = the
thermostat setting (degrees Fahrenheit) and Occupants = the number of
household occupants. They obtained the following results:
In testing each coefficient for a significant difference from zero (two-tailed test at
α = .10), which is the most reasonable conclusion about the predictors?
104. In a regression with 60 observations and 7 predictors, there will be _____
residuals.
A. 60
B. 59
C. 52
D. 6
105. A regression with 72 observations and 9 predictors violates:
A. Evans' Rule.
B. Klein's Rule.
C. Doane's Rule.
D. Sturges' Rule.
106. The F-test for ANOVA in a regression model with 4 predictors and 47
observations would have how many degrees of freedom?
A. (3, 44)
B. (4, 46)
C. (4, 42)
D. (3, 43)
107. In a regression with 7 predictors and 62 observations, degrees of freedom for a
t-test for each coefficient would use how many degrees of freedom?
A. 61
B. 60
C. 55
D. 54
Essay Questions
108. Using state data (n = 50) for the year 2000, a statistics student calculated a matrix
of correlation coefficients for selected variables describing state averages on the
two main scholastic aptitude tests (ACT and SAT). (a) In the spaces provided, write
the two-tailed critical values of the correlation coefficient for α = .05 and α = .01
respectively. Show how you derived these critical values. (b) Mark with * all
correlations that are significant at α = .05, and mark with ** those that are
significant at α = .01. (c) Why might you expect a negative correlation between
ACT% and SAT%? (d) Why might you expect a positive correlation between SATQ
and SATV? Explain your reasoning. (e) Why is the matrix empty above the
diagonal?
109. Using data for a large sample of cars (n = 93), a statistics student calculated a
matrix of correlation coefficients for selected variables describing each car. (a) In
the spaces provided, write the two-tailed critical values of the correlation
coefficient for α = .05 and α = .01 respectively. Show how you derived these
critical values. (b) Mark with * all correlations that are significant at α = .05, and
mark with ** those that are significant at α = .01. (c) Why might you expect a
negative correlation between Weight and HwyMPG? (d) Why might you expect a
positive correlation between HPMax and Length? Explain your reasoning. (e) Why
is the matrix empty above the diagonal?
110. Analyze the regression below (n = 50 U.S. states) using the concepts you have
learned about multiple regression. Circle things of interest and write comments in
the margin. Make a prediction for Poverty for a state with Dropout = 15,
TeenMom = 12, Unem = 4, and Age65% = 12 (show your work). The variables are
Poverty = percentage below the poverty level; Dropout = percent of adult
population that did not finish high school; TeenMom = percent of total births by
teenage mothers; Unem = unemployment rate, civilian labor force; and Age65% =
percent of population aged 65 and over.
111. Analyze the regression results below (n = 33 cars in 1993) using the concepts you
have learned about multiple regression. Circle things of interest and write
comments in the margin. Make a prediction for CityMPG for a car with EngSize =
2.5, ManTran = 1, Length = 184, Wheelbase = 104, Weight = 3000, and Domestic
= 0 (show your work). The variables are CityMPG = city MPG (miles per gallon by
EPA rating); EngSize = engine size (liters); ManTran = 1 if manual transmission
available, 0 otherwise; Length = vehicle length (inches); Wheelbase = vehicle
wheelbase (inches); Weight = vehicle weight (pounds); Domestic = 1 if U.S.
manufacturer, 0 otherwise.
Chapter 13 Multiple Regression Answer Key
TRUE
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-01 Use a fitted multiple regression equation to make predictions.
Topic: Multiple Regression
2. If a regression model's F test statistic is Fcalc = 43.82, we could say that the
explained variance is approximately 44 percent.
FALSE
The R2 statistic (not the F statistic) shows the percent of explained variation.
AACSB: Analytic
Blooms: Understand
Difficulty: 1 Easy
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
3. In a regression, the model with the best fit is preferred over all other models.
FALSE
Occam's Razor says that complexity is justified only if it is necessary for a good
model.
AACSB: Analytic
Blooms: Understand
Difficulty: 2 Medium
Learning Objective: 13-01 Use a fitted multiple regression equation to make predictions.
Topic: Multiple Regression
TRUE
AACSB: Analytic
Blooms: Understand
Difficulty: 2 Medium
Learning Objective: 13-01 Use a fitted multiple regression equation to make predictions.
Topic: Multiple Regression
5. A predictor whose pairwise correlation with Y is near zero can still have a
significant t-value in a multiple regression when other predictors are included.
TRUE
The t-statistic for a predictor depends on which other predictors are in the
model.
AACSB: Analytic
Blooms: Understand
Difficulty: 2 Medium
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance
TRUE
At least one predictor coefficient will differ from zero at the same α used in the
F test.
AACSB: Analytic
Blooms: Understand
Difficulty: 1 Easy
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
7. R2adj can exceed R2 if there are several weak predictors.
FALSE
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
FALSE
Binary predictors behave like any other except they look weird on a scatter plot.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-05 Incorporate a categorical variable into a multiple regression model.
Topic: Categorical Predictors
9. In a multiple regression with 3 predictors in a sample of 25 U.S. cities, we would
use F3, 21 in a test of overall significance.
TRUE
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
10. Evans' Rule says that if n = 50 you need at least 5 predictors to have a good
model.
FALSE
On the contrary, Evans' Rule is intended to prevent having too many predictors.
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-01 Use a fitted multiple regression equation to make predictions.
Topic: Assessing Overall Fit
11. The model Y = β0 + β1X + β2X^2 cannot be estimated by Excel because of the
nonlinear term.
FALSE
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-09 Explain the role of data conditioning and data transformations.
Topic: Tests for Nonlinearity and Interaction
12. The random error term in a regression model reflects all factors omitted from
the model.
TRUE
The errors are assumed normally distributed with zero mean and constant
variance.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-01 Use a fitted multiple regression equation to make predictions.
Topic: Multiple Regression
13. If the probability plot of residuals resembles a straight line, the residuals show a
fairly good fit to the normal distribution.
TRUE
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions
14. Confidence intervals for Y may be unreliable when the residuals are not
normally distributed.
TRUE
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-04 Interpret confidence intervals for regression coefficients.
Topic: Violations of Assumptions
15. A negative estimated coefficient in a regression usually indicates a weak
predictor.
FALSE
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance
16. For a certain firm, the regression equation Bonus = 2,000 + 257 Experience +
0.046 Salary describes employee bonuses with a standard error of 125. John has
10 years' experience, earns $50,000, and earned a bonus of $7,000. John is an
outlier.
FALSE
AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
Learning Objective: 13-08 Identify unusual residuals and high leverage observations.
Topic: Violations of Assumptions
17. There is one residual for each predictor in the regression model.
FALSE
There are k predictors, but there are n residuals e1, e2, . . . , en.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-01 Use a fitted multiple regression equation to make predictions.
Topic: Multiple Regression
18. If R2 and R2adj differ greatly, we should probably add a few predictors to
improve the fit.
FALSE
Evidence of unnecessary predictors can be seen when R2adj is much smaller than
R2.
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
19. The effect of a binary predictor is to shift the regression intercept.
TRUE
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-05 Incorporate a categorical variable into a multiple regression model.
Topic: Categorical Predictors
20. A parsimonious model is one with many weak predictors but a few strong
ones.
FALSE
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Multiple Regression
21. The F statistic and its p-value give a global test of significance for a multiple
regression.
TRUE
The F-test tells whether or not at least some predictors are significant.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
22. In a regression model of student grades, we would code the nine categories of
business courses taken (ACC, FIN, ECN, MGT, MKT, MIS, ORG, POM, QMM) by
including nine binary (0 or 1) predictors in the regression.
FALSE
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-05 Incorporate a categorical variable into a multiple regression model.
Topic: Categorical Predictors
23. A disadvantage of Excel's Data Analysis regression tool is that it expects the
independent variables to be in a block of contiguous columns so you must
delete a column if you want to eliminate a predictor from the model.
TRUE
AACSB: Technology
Blooms: Apply
Difficulty: 1 Easy
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Multiple Regression
24. A disadvantage of Excel's regression is that it does not give as much accuracy in
the estimated regression coefficients as a package like MINITAB.
FALSE
AACSB: Technology
Blooms: Understand
Difficulty: 1 Easy
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Multiple Regression
25. Nonnormality of the residuals from a regression can best be detected by
looking at the residual plots against the fitted Y values.
FALSE
Use a probability plot to check for nonnormality (a residual plot tests for
heteroscedasticity).
AACSB: Analytic
Blooms: Understand
Difficulty: 2 Medium
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions
26. A high variance inflation factor (VIF) indicates a significant predictor in the
regression.
FALSE
A high VIF indicates that a predictor is related to the other predictors in the
model.
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-06 Detect multicollinearity and assess its effects.
Topic: Multicollinearity
27. Autocorrelation may be detected by looking at a plot of the residuals against
time.
TRUE
Too many or too few crossings of the zero axis suggest nonrandomness.
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions
TRUE
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions
29. Plotting the residuals against a binary predictor (X = 0, 1) reveals nothing about
heteroscedasticity.
FALSE
You can still spot wider or narrower spread at the two points X = 0 and X = 1.
AACSB: Analytic
Blooms: Remember
Difficulty: 3 Hard
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions
30. The regression equation Bonus = 2,812 + 27 Experience + 0.046 Salary says
that Experience is the most significant predictor of Bonus.
FALSE
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance
31. A multiple regression with 60 observations should not have 13 predictors.
TRUE
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
32. A regression of Y using four independent variables X1, X2, X3, X4 could also have
up to four nonlinear terms (Xj^2) and six simple interaction terms (XjXk) if you
have enough observations to justify them.
TRUE
We must count all the possible squares and two-way combinations of four
predictors.
AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
Learning Objective: 13-09 Explain the role of data conditioning and data transformations.
Topic: Tests for Nonlinearity and Interaction
33. When autocorrelation is present, the estimates of the coefficients will be
unbiased.
TRUE
There is no bias in the OLS estimates, though variances and t-tests may be
affected.
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions
34. If the residuals in your regression are nonnormal, a larger sample size might
help improve the reliability of confidence intervals for Y.
TRUE
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions
35. Multicollinearity can be detected from t tests of the predictor variables.
FALSE
The t-tests only indicate significance (we use VIFs to detect multicollinearity).
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-06 Detect multicollinearity and assess its effects.
Topic: Multicollinearity
36. When multicollinearity is present, the regression model is of no use for making
predictions.
FALSE
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-06 Detect multicollinearity and assess its effects.
Topic: Multicollinearity
37. Autocorrelation of the residuals may affect the reliability of the t values for the
estimated coefficients of the predictors X1, X2, . . . , Xk.
TRUE
Autocorrelation can affect the variances of the estimators, hence their t-values.
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions
TRUE
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions
39. Statisticians who work with cross-sectional data generally do not anticipate
autocorrelation.
TRUE
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions
40. The ill effects of heteroscedasticity might be mitigated by redefining totals (e.g.,
total number of homicides) as relative values (e.g., homicide rate per 100,000
population).
TRUE
Large magnitude ranges for X's and Y (the "size" problem) can induce
heteroscedasticity.
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-09 Explain the role of data conditioning and data transformations.
Topic: Violations of Assumptions
41. Nonnormal residuals lead to biased estimates of the coefficients in a regression
model.
FALSE
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions
TRUE
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-06 Detect multicollinearity and assess its effects.
Topic: Multicollinearity
43. Heteroscedasticity exists when all the errors (residuals) have the same variance.
FALSE
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions
TRUE
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-06 Detect multicollinearity and assess its effects.
Topic: Multicollinearity
45. A squared predictor is used to test for nonlinearity in the predictor's
relationship to Y.
TRUE
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-09 Explain the role of data conditioning and data transformations.
Topic: Tests for Nonlinearity and Interaction
TRUE
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions
47. In the fitted regression Y = 12 + 3X1 - 5X2 + 27X3 + 2X4 the most significant
predictor is X3.
FALSE
We must have the t-statistics (not just the coefficients) to assess each
predictor's significance.
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance
48. Given that the fitted regression is Y = 76.40 -6.388X1 + 0.870X2, the standard
error of b1 is 1.453, and n = 63. At α = .05, we can conclude that X1 is a
significant predictor of Y.
TRUE
tcalc = (-6.388)/(1.453) = -4.396, which is < t.025 = -2.000 for d.f. = 60 in a two-
tailed test.
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance
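A quick way to verify the t-test in question 48 without a printed table is a short
Python/scipy sketch (not part of the original test bank); the figures are taken from
the question above.
    # Sketch: two-tailed t-test for a single regression coefficient (question 48).
    from scipy import stats

    b1, se_b1 = -6.388, 1.453            # coefficient and its standard error
    n, k = 63, 2                          # observations and predictors
    df = n - k - 1                        # 60 degrees of freedom
    t_calc = b1 / se_b1                   # about -4.396
    t_crit = stats.t.ppf(0.975, df)       # about 2.000 for alpha = .05, two-tailed
    print(t_calc, t_crit, abs(t_calc) > t_crit)   # True: X1 is significant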
49. Unlike other predictors, a binary predictor has a t-value that is either 0 or 1.
FALSE
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-05 Incorporate a categorical variable into a multiple regression model.
Topic: Categorical Predictors
50. The t-test shows the ratio of an estimated coefficient to its standard error.
TRUE
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance
51. In a multiple regression with five predictors in a sample of 56 U.S. cities, we
would use F5, 50 in a test of overall significance.
TRUE
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
52. In a multiple regression with six predictors in a sample of 67 U.S. cities, what
would be the critical value for an F-test of overall significance at α = .05?
A. 2.29
B. 2.25
C. 2.37
D. 2.18
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
53. In a multiple regression with five predictors in a sample of 56 U.S. cities, what
would be the critical value for an F-test of overall significance at α = .05?
A. 2.45
B. 2.37
C. 2.40
D. 2.56
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
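For questions 52 and 53 the critical values come from the F distribution with
d.f. = (k, n - k - 1). A minimal Python/scipy sketch (not part of the original test
bank) that reproduces them:
    # Sketch: right-tail F critical values for the overall test at alpha = .05.
    from scipy import stats

    # Question 52: k = 6 predictors, n = 67, so d.f. = (6, 60)
    print(stats.f.ppf(0.95, 6, 67 - 6 - 1))   # about 2.25
    # Question 53: k = 5 predictors, n = 56, so d.f. = (5, 50)
    print(stats.f.ppf(0.95, 5, 56 - 5 - 1))   # about 2.40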
54. When predictor variables are strongly related to each other, the __________ of
the regression estimates is questionable.
A. logic
B. fit
C. parsimony
D. stability
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-06 Detect multicollinearity and assess its effects.
Topic: Multicollinearity
55. A test is conducted in 22 cities to see if giving away free transit system maps
will increase the number of bus riders. In a regression analysis, the dependent
variable Y is the increase in bus riders (in thousands of persons) from the start
of the test until its conclusion. The independent variables are X1 = the number
(in thousands) of free maps distributed and a binary variable X2 = 1 if the city
has free downtown parking, 0 otherwise. The estimated regression equation is
In city 3, the observed Y value is 7.3 and X1 = 140 and X2 = 0. The residual for
city 3 (in thousands) is:
A. 6.15.
B. 1.15.
C. 4.83.
D. 1.57.
Estimated Y = 1.32 + .0345(140) - 1.45(0) = 6.15, so the residual is (7.3 - 6.15) = 1.15.
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-01 Use a fitted multiple regression equation to make predictions.
Topic: Assessing Overall Fit
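A small Python sketch (not in the original test bank) confirming the residual
arithmetic for city 3 in question 55:
    # Sketch: residual = observed Y minus fitted Y for city 3.
    b0, b1, b2 = 1.32, 0.0345, -1.45      # coefficients quoted in the answer
    y_obs, x1, x2 = 7.3, 140, 0
    y_fit = b0 + b1 * x1 + b2 * x2        # 6.15
    print(y_obs - y_fit)                  # 1.15 (answer B)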
56. If X2 is a binary predictor in Y = β0 + β1X1 + β2X2, then which statement is most
nearly correct?
AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
Learning Objective: 13-05 Incorporate a categorical variable into a multiple regression model.
Topic: Categorical Predictors
57. The unexplained sum of squares measures variation in the dependent variable
Y about the:
B. estimated Y values.
D. Y-intercept.
We are trying to explain variation in the response variable around its mean.
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
58. Which of the following is not true of the standard error of the regression?
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-04 Interpret confidence intervals for regression coefficients.
Topic: Confidence Intervals for Y
59. A multiple regression analysis with two independent variables yielded the
following results in the ANOVA table: SS(Total) = 798, SS(Regression) = 738,
SS(Error) = 60. The multiple correlation coefficient is:
A. .2742
B. .0752
C. .9248
D. .9617
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
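The multiple correlation coefficient is the square root of R2 = SSR/SST; a brief
Python check (not in the original test bank) using the question 59 sums of squares:
    # Sketch: R-squared and R from the ANOVA sums of squares.
    import math

    ss_total, ss_regression, ss_error = 798, 738, 60
    r_squared = ss_regression / ss_total      # 0.9248
    r_multiple = math.sqrt(r_squared)         # 0.9617 (answer D)
    print(r_squared, r_multiple)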
60. A fitted multiple regression equation is Y = 12 + 3X1 - 5X2 + 7X3 + 2X4. When X1
increases 2 units and X2 increases 2 units as well, while X3 and X4 remain
unchanged, what change would you expect in your estimate of Y?
A. Decrease by 2
B. Decrease by 4
C. Increase by 2
D. No change in Y
AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
Learning Objective: 13-01 Use a fitted multiple regression equation to make predictions.
Topic: Multiple Regression
61. A fitted multiple regression equation is Y = 28 + 5X1 - 4X2 + 7X3 + 2X4. When X1
increases 2 units and X2 increases 2 units as well, while X3 and X4 remain
unchanged, what change would you expect in your estimate of Y?
A. Increase by 2
B. Decrease by 4
C. Increase by 4
D. No change in Y
AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
Learning Objective: 13-01 Use a fitted multiple regression equation to make predictions.
Topic: Multiple Regression
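For questions 60, 61, and 88, the change in the estimate of Y is just the sum of each
coefficient times the change in its predictor; a tiny Python sketch (not in the original
test bank) using the question 61 figures:
    # Sketch: change in fitted Y = sum of (coefficient x change in predictor).
    coefs   = [5, -4, 7, 2]      # b1..b4 from question 61
    changes = [2,  2, 0, 0]      # X1 and X2 each rise 2 units; X3, X4 unchanged
    print(sum(b * d for b, d in zip(coefs, changes)))   # +2 (increase by 2)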
62. Which is not a name often given to an independent variable that takes on just
two values (0 or 1) according to whether or not a given characteristic is absent
or present?
A. Absent variable
B. Binary variable
C. Dummy variable
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-05 Incorporate a categorical variable into a multiple regression model.
Topic: Categorical Predictors
63. Using a sample of 63 observations, a dependent variable Y is regressed against
two variables X1 and X2 to obtain the fitted regression equation Y = 76.40 -
6.388X1 + 0.870X2. The standard error of b1 is 3.453 and the standard error of b2
is 0.611. At α = .05, we could:
For β1 we have tcalc = (-6.388)/(3.453) = -1.849, which is less than t.05 = -1.671 for
d.f. = 60 in a left-tailed test. For β2 we have tcalc = (0.870)/(0.611) = +1.424, which
does not exceed t.05 = +1.671 for d.f. = 60 in a right-tailed test. For a two-tailed
test, t.025 = 2.000, so neither coefficient would differ significantly from zero at
α = .05. Evans' Rule is not violated because n/k = 63/3 = 21.
AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance
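The one-tailed and two-tailed critical values quoted in the answer to question 63 can
be checked with a short Python/scipy sketch (not part of the original test bank):
    # Sketch: t statistics for b1 and b2 versus critical values with d.f. = 60.
    from scipy import stats

    df = 63 - 2 - 1
    t_b1 = -6.388 / 3.453                 # about -1.849
    t_b2 = 0.870 / 0.611                  # about +1.424
    print(stats.t.ppf(0.95, df))          # 1.671 (one-tailed, alpha = .05)
    print(stats.t.ppf(0.975, df))         # 2.000 (two-tailed, alpha = .05)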
64. Refer to this ANOVA table from a regression:
AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
65. Refer to this ANOVA table from a regression:
A. .3995.
B. .6005.
C. .6654.
D. .8822.
AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
66. Refer to the following regression results. The dependent variable is Abort (the
number of abortions per 1000 women of childbearing age). The regression was
estimated using data for the 50 U.S. states with these predictors: EdSpend =
public K-12 school expenditure per capita, Age = median age of population,
Unmar = percent of total births by unmarried women, Infmor = infant mortality
rate in deaths per 1000 live births.
For Infmor, tcalc = (-3.7848)/(1.0173) = -3.720, which is < t.025 = -2.014 for d.f. =
45.
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance
67. Refer to the following correlation matrix that was part of a regression analysis.
The dependent variable was Abort (the number of abortions per 1000 women
of childbearing age). The regression was estimated using data for the 50 U.S.
states with these predictors: EdSpend = public K-12 school expenditure per
capita, Age = median age of population, Unmar = percent of total births by
unmarried women, Infmor = infant mortality rate in deaths per 1000 live births.
Correlation Matrix
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-06 Detect multicollinearity and assess its effects.
Topic: Multicollinearity
68. Part of a regression output is provided below. Some of the information has
been omitted.
A. 1605.7.
B. 0.9134.
C. 89.66.
AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
69. Part of a regression output is provided below. Some of the information has
been omitted.
A. 3177.17.
B. 301.19.
C. 17.71.
D. impossible to determine.
AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
70. A Realtor is trying to predict the selling price of houses in Greenville (in
thousands of dollars) as a function of Size (measured in thousands of square
feet) and whether or not there is a fireplace (FP is 0 if there is no fireplace, 1 if
there is a fireplace). Part of the regression output is provided below, based on a
sample of 20 homes. Some of the information has been omitted.
A. 9.5.
B. 13.8.
C. 122.5.
D. 1442.6.
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance
71. A Realtor is trying to predict the selling price of houses in Greenville (in
thousands of dollars) as a function of Size (measured in thousands of square
feet) and whether or not there is a fireplace (FP is 0 if there is no fireplace, 1 if
there is a fireplace). The regression output is provided below. Some of the
information has been omitted.
A. 20
B. 18
C. 3
D. 2
AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
72. A Realtor is trying to predict the selling price of houses in Greenville (in
thousands of dollars) as a function of Size (measured in thousands of square
feet) and whether or not there is a fireplace (FP is 0 if there is no fireplace, 1 if
there is a fireplace). The regression output is provided below. Some of the
information has been omitted.
Fcalc = MSR/MSE = (1588.6)/(17.717) = 89.66, which exceeds F.05 = 3.59 for d.f. =
(2, 17).
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
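A short Python/scipy sketch (not in the original test bank) reproducing the F test in
the answer to question 72:
    # Sketch: overall F test from the mean squares quoted in the answer.
    from scipy import stats

    msr, mse = 1588.6, 17.717
    f_calc = msr / mse                       # about 89.66
    f_crit = stats.f.ppf(0.95, 2, 17)        # about 3.59 for d.f. = (2, 17)
    print(f_calc, f_crit, f_calc > f_crit)   # True: regression is significant overall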
73. A Realtor is trying to predict the selling price of houses in Greenville (in
thousands of dollars) as a function of Size (measured in thousands of square
feet) and whether or not there is a fireplace (FP is 0 if there is no fireplace, 1 if
there is a fireplace). Part of the regression output is provided below, based on a
sample of 20 homes. Some of the information has been omitted.
B. A fireplace adds around $6476 to the selling price of the average house.
C. A large house with no fireplace will sell for more than a small house with a
fireplace.
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance
74. A log transformation might be appropriate to alleviate which problem(s)?
A. Heteroscedastic residuals
B. Multicollinearity
C. Autocorrelated residuals
By reducing data magnitudes, the log transform may help equalize variances.
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions
A. Sturge's Rule.
B. Klein's Rule.
C. Occam's Rule.
D. Pearson's Rule.
Klein's Rule suggests severe collinearity if any r exceeds the multiple correlation
coefficient.
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-06 Detect multicollinearity and assess its effects.
Topic: Multicollinearity
76. In a multiple regression all of the following are true regarding residuals except:
B. they are the differences between observed and predicted values of the
response variable.
Residuals help in all these except to detect multicollinearity (we need VIFs for
that task).
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-08 Identify unusual residuals and high leverage observations.
Topic: Violations of Assumptions
77. The residual plot below suggests which violation(s) of regression assumptions?
A. Autocorrelation
B. Heteroscedasticity
C. Nonnormality
D. Multicollinearity
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions
78. Which is not a standard criterion for assessing a regression model?
A. Logic of causation
B. Overall fit
C. Degree of collinearity
D. Binary predictors
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-01 Use a fitted multiple regression equation to make predictions.
Topic: Multiple Regression
79. If the standard error is 12, a quick prediction interval for Y is:
A. 15.
B. 24.
C. 19.
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-04 Interpret confidence intervals for regression coefficients.
Topic: Confidence Intervals for Y
The larger the VIFs, the more we suspect that the predictors are multicollinear.
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-06 Detect multicollinearity and assess its effects.
Topic: Multicollinearity
81. Which statement best describes this regression (Y = highway miles per gallon in
91 cars)?
The p-value for the F-test indicates significance, but the quick prediction
interval is Y ± 2(4.019) or Y ± 8 mpg, which would not permit a very precise
prediction.
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
82. Based on these regression results, in your judgment which statement is most
nearly correct (Y = highway miles per gallon in 91 cars)?
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
83. In the following regression, which are the three best predictors?
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance
84. In the following regression, which are the two best predictors?
A. NumCyl, HpMax
B. Intercept, NumCyl
C. NumCyl, Domestic
D. ManTran, Width
Absolute t-statistics indicate a ranking, so find tcalc = (Coef)/(Std Err) for each
predictor.
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance
85. In the following regression (n = 91), which coefficients differ from zero in a two-
tailed test at α = .05?
A. NumCyl, HPMax
B. Intercept, ManTran
D. Intercept, Domestic
If the confidence interval includes zero, the predictor is not significant in a two-
tailed test.
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance
86. Based on the following regression ANOVA table, what is the R2?
A. 0.1336
B. 0.6005
C. 0.3995
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
87. In the following regression, which statement best describes the degree of
multicollinearity?
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-06 Detect multicollinearity and assess its effects.
Topic: Multicollinearity
88. The relationship of Y to four other variables was established as Y = 12 + 3X1 -
5X2 + 7X3 + 2X4. When X1 increases 5 units and X2 increases 3 units, while X3
and X4 remain unchanged, what change would you expect in your estimate of
Y?
A. Decrease by 15
B. Increase by 15
C. No change
D. Increase by 5
AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
Learning Objective: 13-01 Use a fitted multiple regression equation to make predictions.
Topic: Predictor Significance
89. Does the picture below show strong evidence of heteroscedasticity against the
predictor Wheelbase?
A. Yes
B. No
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions
90. Which is not a correct way to find the coefficient of determination?
A. SSR/SSE
B. SSR/SST
C. 1 - SSE/SST
R2 = SSR/SST or R2 = 1 - SSE/SST.
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
91. If SSR = 3600, SSE = 1200, and SST = 4800, then R2 is:
A. .5000
B. .7500
C. .3333
D. .2500
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
92. Which statement is incorrect?
B. The R2 statistic can only increase (or stay the same) when you add more
predictors to a regression.
C. If the F-statistic is insignificant, the t-statistics for the predictors also are
insignificant at the same α.
Positive autocorrelation results in too few crossings of the zero point on the
axis (cycles).
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions
93. Which statement about leverage is incorrect?
2(k + 1)/n = 2(4 + 1)/40 = .25, so hi = .15 would not indicate high leverage.
AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
Learning Objective: 13-08 Identify unusual residuals and high leverage observations.
Topic: Violations of Assumptions
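The quick leverage rule used in the answer to question 93 flags any leverage
statistic h(i) that exceeds 2(k + 1)/n; a minimal Python sketch (not in the original
test bank) with those numbers:
    # Sketch: quick rule for high leverage, h(i) > 2(k + 1)/n.
    def high_leverage(h_i, k, n):
        return h_i > 2 * (k + 1) / n

    print(2 * (4 + 1) / 40)                 # threshold 0.25 for k = 4, n = 40
    print(high_leverage(0.15, 4, 40))       # False: h(i) = .15 is not high leverage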
94. Which statement is incorrect?
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-05 Incorporate a categorical variable into a multiple regression model.
Topic: Categorical Predictors
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions
96. If you rerun a regression, omitting a predictor X5, which would be unlikely?
C. The remaining estimated β's will change if X5 was collinear with other
predictors.
AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
97. In a multiple regression, which is an incorrect statement about the residuals?
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-06 Detect multicollinearity and assess its effects.
Topic: Multicollinearity
A. It is a continuous distribution.
In ANOVA we use d.f. = (k, n - k - 1). The value of α does not affect d.f.
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
99. Which of the following would be most useful in checking the normality
assumption of the errors in a regression model?
AACSB: Analytic
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 13-07 Analyze residuals to check for violations of residual assumptions.
Topic: Violations of Assumptions
100. The regression equation Salary = 25,000 + 3200 YearsExperience + 1400
YearsCollege describes employee salaries at Axolotl Corporation. The standard
error is 2600. John has 10 years' experience and 4 years of college. His salary is
$66,500. What is John's standardized residual?
A. -1.250
B. -0.240
C. +0.870
D. +1.500
AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
Learning Objective: 13-08 Identify unusual residuals and high leverage observations.
Topic: Violations of Assumptions
101. The regression equation Salary = 28,000 + 2700 YearsExperience + 1900
YearsCollege describes employee salaries at Ramjac Corporation. The standard
error is 2400. Mary has 10 years' experience and 4 years of college. Her salary is
$58,350. What is Mary's standardized residual (approximately)?
A. -1.150
B. +2.007
C. -1.771
D. +1.400
Mary's predicted salary is 28,000 + 2700 (10) + 1900 (4) = 62,600, so her
standardized residual is (58,350 - 62,600)/(2400) = -1.771 (she is somewhat
underpaid according to the fitted regression).
AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
Learning Objective: 13-08 Identify unusual residuals and high leverage observations.
Topic: Violations of Assumptions
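A brief Python sketch (not in the original test bank) confirming Mary's standardized
residual in question 101 (question 100 follows the same pattern):
    # Sketch: standardized residual = (observed - predicted) / standard error.
    predicted = 28000 + 2700 * 10 + 1900 * 4     # 62,600
    observed, std_error = 58350, 2400
    print((observed - predicted) / std_error)     # about -1.771 (answer C)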
102. Which Excel function will give the p-value for overall significance if a regression
has 75 observations and 5 predictors and gives an F test statistic Fcalc = 3.67?
A. =F.INV(.05, 5, 75)
B. =F.DIST(3.67, 4, 74)
C. =F.DIST.RT(3.67, 5, 69)
D. =F.DIST(.05, 4, 70)
In pre-2010 versions of Excel the function was =FDIST(3.67, 5, 69) for d.f. = (k, n
- k - 1).
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
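Excel's =F.DIST.RT(Fcalc, k, n - k - 1) returns the right-tail area of the F
distribution; the same p-value can be checked in Python with scipy (not part of the
original test bank):
    # Sketch: right-tail p-value for the overall F test in question 102.
    from scipy import stats

    f_calc, k, n = 3.67, 5, 75
    p_value = stats.f.sf(f_calc, k, n - k - 1)   # same quantity as =F.DIST.RT(3.67, 5, 69)
    print(p_value)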
103. The ScamMore Energy Company is attempting to predict natural gas
consumption for the month of January. A random sample of 50 homes was
used to fit a regression of gas usage (in CCF) using as predictors Temperature
= the thermostat setting (degrees Fahrenheit) and Occupants = the number of
household occupants. They obtained the following results:
In testing each coefficient for a significant difference from zero (two-tailed test
at α = .10), which is the most reasonable conclusion about the predictors?
Find the test statistic tcalc = (Coef)/(StdErr) for each predictor and compare with
t.05 = 1.678 for d.f. = n - k - 1 = 50 - 2 - 1 = 47.
AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance
104. In a regression with 60 observations and 7 predictors, there will be _____
residuals.
A. 60
B. 59
C. 52
D. 6
There are 60 residuals e1, e2, . . . , e60 (one residual for each observation).
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 13-01 Use a fitted multiple regression equation to make predictions.
Topic: Assessing Overall Fit
A. Evans' Rule.
B. Klein's Rule.
C. Doane's Rule.
D. Sturges' Rule.
Evans' Rule suggests n/k ≥ 10, but in this example n/k = 72/9 = 8.
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
106. The F-test for ANOVA in a regression model with 4 predictors and 47
observations would have how many degrees of freedom?
A. (3, 44)
B. (4, 46)
C. (4, 42)
D. (3, 43)
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-02 Interpret the R2 and perform an F test for overall significance.
Topic: Assessing Overall Fit
107. In a regression with 7 predictors and 62 observations, degrees of freedom for a
t-test for each coefficient would use how many degrees of freedom?
A. 61
B. 60
C. 55
D. 54
d.f. = n - k - 1 = 62 - 7 - 1 = 54.
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 13-03 Test individual predictors for significance.
Topic: Predictor Significance
Essay Questions
108. Using state data (n = 50) for the year 2000, a statistics student calculated a
matrix of correlation coefficients for selected variables describing state averages
on the two main scholastic aptitude tests (ACT and SAT). (a) In the spaces
provided, write the two-tailed critical values of the correlation coefficient for
α = .05 and α = .01 respectively. Show how you derived these critical values. (b)
Mark with * all correlations that are significant at α = .05, and mark with **
those that are significant at α = .01. (c) Why might you expect a negative
correlation between ACT% and SAT%? (d) Why might you expect a positive
correlation between SATQ and SATV? Explain your reasoning. (e) Why is the
matrix empty above the diagonal?
(a) As explained in Chapter 12, for d.f. = n - 2 = 50 - 2 = 48, the critical values
of Student's t for a two-tailed test for zero correlation are t.025 = 2.011 and t.005
= 2.682. The critical values of the correlation coefficient are:
(b) No correlation in the first column (ACT) is significant at either α, but all the
other correlations differ significantly from zero at either value of α. (c) An
inverse correlation between ACT% and SAT% might be expected because
students in a given state usually take one or the other, but not both (depending
on what their state universities prefer). (d) If the tests measure general ability,
test-takers who score well on SATQ tend also to score well on SATV. (e) Entries
above the diagonal are redundant, so they are omitted.
Feedback:
(a) As explained in Chapter 12, for d.f. = n - 2 = 50 - 2 = 48, the critical values
of Student's t for a two-tailed test for zero correlation are t.025 = 2.011 and t.005 =
2.682. The critical values of the correlation coefficient are:
(b) No correlation in the first column (ACT) is significant at either α, while all
other correlations differ significantly from zero at either value of α. (c) An
inverse correlation between ACT% and SAT% might be expected because students in a
given state usually take one or the other, but not both (students may not know
that requirements follow a pattern by region). (d) If the tests measure general
ability, test-takers who score well on SATQ would tend also to score well on
SATV. (e) Entries above the diagonal are redundant, so they are omitted.
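The critical values of the correlation coefficient referred to above (the table itself
did not survive in this copy) follow from r = t / sqrt(t^2 + d.f.); a small Python
sketch (not in the original test bank):
    # Sketch: critical correlation from the critical t, r = t / sqrt(t^2 + d.f.).
    import math

    def r_critical(t_crit, df):
        return t_crit / math.sqrt(t_crit ** 2 + df)

    print(r_critical(2.011, 48))   # about .279 for alpha = .05, n = 50 (question 108)
    print(r_critical(2.682, 48))   # about .361 for alpha = .01
    print(r_critical(1.986, 91))   # about .204 for alpha = .05, n = 93 (question 109)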
(a) As explained in Chapter 12, for d.f. = n - 2 = 93 - 2 = 91, the critical values of
Student's t for a two-tailed test are t.025 = 1.986 and t.005 = 2.631. The critical
values of the correlation coefficient are:
Given the large sample, it would also be reasonable to use z.025 = 1.960 (giving
r.05 = .202) or z.005 = 2.576 (giving r.01 = .261). However, none of the sample
correlations is close to the decision point. (b) All the correlations are significant
at either value of α. (c) An inverse correlation between Weight and HwyMPG is
expected because larger cars have more mass that must be accelerated and
moved. (d) Longer cars require bigger engines, so HPMax and Length are
correlated. In fact, many measurable aspects of a car are correlated. (e) Entries
above the diagonal are redundant, so they are omitted.
Feedback: The regression is significant overall (F = 18.74, p < .0001). All the
predictors are significant at α = .05 (p-values less than .05). TeenMom and
Unem are the best predictors, while Age65% and DropOut are barely
significant. The intercept is not meaningful since no state has all these
predictors equal to zero. Regarding leverage, we can apply the quick rule to
check for any leverage value greater than 2(k + 1)/n = 2(5)/50 = .20. By this criterion,
only AK (leverage .434) has unusual leverage. We would want to check each
predictor to see which X values are unusual for Alaska, but this is not possible
without the raw data. There are no outliers in the Studentized residual column,
although there are three unusual ones: AK (t = -2.251), IN (t = -2.129), and NM
(t = +2.829). Autocorrelation is not an issue since these are not time-series
observations (and, in any event, the residual plot against observation order
crosses the zero centerline 22 times, which is not far from what would be
expected for 50 observations). The residual plot against predicted Y has no
pattern (suggesting homoscedasticity) and the residual probability plot is linear
(suggesting normality). Overall, there are no serious problems. The fitted
(estimated) regression equation is: Poverty = - 5.3546 + 0.2065 Dropout +
0.4238 TeenMom + 1.1081 Unem + 0.3469 Age65%, so the predicted value of
the dependent variable Poverty for a state with Dropout = 15, TeenMom = 12,
Unem = 4, and Age65% = 12 is: Poverty = - 5.3546 + 0.2065(15) + 0.4238(12) +
1.1081(4) + 0.3469(12) = 11.42. This prediction question is to see whether the
student knows how to interpret the regression coefficients and use them
correctly. The given values of the predictors are very close to their respective
means, so the prediction actually corresponds well to an "average" state.
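A compact Python sketch (not part of the original test bank) reproducing the Poverty
prediction at the end of this feedback:
    # Sketch: prediction from the fitted equation in question 110.
    intercept = -5.3546
    coefs = {'Dropout': 0.2065, 'TeenMom': 0.4238, 'Unem': 1.1081, 'Age65%': 0.3469}
    x_new = {'Dropout': 15, 'TeenMom': 12, 'Unem': 4, 'Age65%': 12}
    poverty = intercept + sum(coefs[v] * x_new[v] for v in coefs)
    print(round(poverty, 2))     # about 11.42 percent below the poverty level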
Feedback: The regression is significant overall (F = 20.09, p < .0001). There are
four strong predictors. Weight and Wheelbase are highly significant at α = .01
(p-values less than .01), while EngSize and Domestic are significant at α = .05
(p-values less than .05). The other two predictors, Length and ManTran, are
not significant at the customary levels, although their t-values (at least 1.00 in
absolute magnitude) suggest that they may be contributing to the regression
(that is, if they are omitted, the R2adj would probably decline). The intercept is
not meaningful since no car has all these predictors equal to zero (e.g., Weight
= 0 is impossible). Regarding leverage, we can apply the quick rule to check for
any leverage value greater than 2(k + 1)/n = 2(7)/33 = .424. By this criterion, only the
Ford AeroStar (leverage .583) has unusual leverage. We would want to check
the values of each independent variable in the regression to see which one(s)
is(are) unusual. However, this is not possible without having the raw data. There
are no outliers in the Studentized residual column, although observation 15
(Honda Civic, t = 2.862) is unusual. If we refer to the Studentized deleted
residual, observation 15 (Honda Civic, t = 3.392) is in fact an outlier. Its actual
mileage (42 mpg) is much better than predicted (34.1 mpg). Autocorrelation is
not an issue since these are not time-series observations. The residual plot
against predicted Y has no pattern (suggesting homoscedasticity) and the
residual probability plot is linear (suggesting normality). Regarding
multicollinearity, the VIFs are rather large, suggesting lack of independence
among predictors. Since none of the VIFs exceeds 10, most students will
conclude that there is no serious problem with multicollinearity. It is a fact that
many car measurements are correlated, which is a simple characteristic of the
data. However, experimentation might be needed to see whether their
contributions are truly necessary. The unexpected positive signs of EngineSize
and Wheelbase may be symptomatic of intercorrelation among the predictors.
Overall, there are no serious problems aside from one possible outlier. Nothing
should be done since this outlier is simply part of the data set. However, it
might be prudent to verify the MPG for observation 15 to make sure it is not a
typo. The fitted (estimated) regression equation is CityMPG = 34.27 + 3.824
EngSize - 2.014 ManTran - 0.08573 Length + 0.5420 Wheelbase - 0.01909
Weight - 4.285 Domestic, so the predicted value of the response variable
CityMPG for a car with EngSize = 2.5, ManTran = 1, Length = 184, Wheelbase =
104, Weight = 3000, and Domestic = 0 is CityMPG = 34.27 + 3.824(2.5) -
2.014(1) - 0.08573(184) + 0.5420(104) - 0.01909(3000) - 4.285(0) = 34.27 + 9.56 -
2.01 - 15.77 + 56.37 - 57.27 - 0 = 25.14. The given values of the predictors are
very close to their respective means, so the prediction actually corresponds well
to an "average" car. Note that the prediction is strongly affected by the two
terms involving Wheelbase and Weight.
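The CityMPG prediction at the end of this feedback can be checked the same way; a
short Python sketch (not in the original test bank):
    # Sketch: prediction from the fitted equation in question 111.
    intercept = 34.27
    coefs = {'EngSize': 3.824, 'ManTran': -2.014, 'Length': -0.08573,
             'Wheelbase': 0.5420, 'Weight': -0.01909, 'Domestic': -4.285}
    x_new = {'EngSize': 2.5, 'ManTran': 1, 'Length': 184,
             'Wheelbase': 104, 'Weight': 3000, 'Domestic': 0}
    city_mpg = intercept + sum(coefs[v] * x_new[v] for v in coefs)
    print(round(city_mpg, 2))    # about 25.14 mpg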