Reading 07-Correlation and Regression
A second limitation to the use of regression results specific to investment contexts is that public
knowledge of regression relationships may negate their future usefulness. Suppose, for
example, an analyst discovers that stocks with a certain characteristic have historically had very
high returns. If other analysts discover and act upon this relationship, then the prices of stocks
with that characteristic will be bid up. The knowledge of the relationship may result in the
relation no longer holding in the future.
Finally, if the regression assumptions listed in Section 3.2 are violated, hypothesis tests and
predictions based on linear regression will not be valid. Although there are tests for violations
of regression assumptions, often uncertainty exists as to whether an assumption has been
violated. This limitation will be discussed in detail in the reading on multiple regression.
SUMMARY
◾ A scatter plot shows graphically the relationship between two variables. If the points on
the scatter plot cluster together in a straight line, the two variables have a strong linear
relation.
◾ The sample correlation coefficient for two variables X and Y is $r = \dfrac{\mathrm{Cov}(X, Y)}{s_x s_y}$.
◾ If two variables have a very strong linear relation, then the absolute value of their
correlation will be close to 1. If two variables have a weak linear relation, then the
absolute value of their correlation will be close to 0.
http://e.pub/jthbhvadxmncfyrxzzot.vbk/OEBPS/CFA0014-R03-7-print-1539097160.xh... 2018/10/9
◾ The squared value of the correlation coefficient for two variables quantifies the
percentage of the variance of one variable that is explained by the other. If the correlation
coefficient is positive, the two variables are directly related; if the correlation coefficient
is negative, the two variables are inversely related.
◾ If we have n observations for two variables, we can test whether the population
correlation between the two variables is equal to 0 by using a t-test. This test statistic has
a t-distribution with n − 2 degrees of freedom if the null hypothesis of 0 correlation is
true.
◾ Even one outlier can greatly affect the correlation between two variables. Analysts should
examine a scatter plot for the variables to determine whether outliers might affect a
particular correlation.
◾ The dependent variable in a linear regression is the variable that the regression model
tries to explain. The independent variables are the variables that a regression model uses
to explain the dependent variable.
◾ If there is one independent variable in a linear regression and there are n observations on
the dependent and independent variables, the regression model is Yi = b0 + b1Xi + εi, i = 1,
…, n, where Yi is the dependent variable, Xi is the independent variable, and εi is the error
term. In this model, the coefficient b0 is the intercept. The intercept is the predicted value
of the dependent variable when the independent variable has a value of zero. In this
model, the coefficient b1 is the slope of the regression line. If the value of the independent
variable increases by one unit, then the model predicts that the value of the dependent
variable will increase by b1 units.
◾ The assumptions of the classic normal linear regression model are the following:
• A linear relation exists between the dependent variable and the independent
variable.
• The variance of the error term is the same for all observations (homoskedasticity).
◾ The estimated parameters in a linear regression model minimize the sum of the squared
regression residuals.
◾ The standard error of estimate measures how well the regression model fits the data. If
the SEE is small, the model fits well.
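As an illustration of the two points above, the following minimal Python sketch fits a one-variable regression by least squares and computes the SEE. The data are hypothetical and the function names (`fit_simple_ols`, `sum_squared_residuals`, `standard_error_of_estimate`) are invented for this example. The closed-form slope and intercept are the standard least-squares solution for one independent variable, and the SEE is assumed to use n − 2 degrees of freedom.

```python
import math

def fit_simple_ols(xs, ys):
    """Return (b0_hat, b1_hat) that minimize the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b1_hat = s_xy / s_xx
    b0_hat = mean_y - b1_hat * mean_x
    return b0_hat, b1_hat

def sum_squared_residuals(xs, ys, b0, b1):
    """Sum of squared differences between actual and fitted values."""
    return sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys))

def standard_error_of_estimate(xs, ys, b0, b1):
    """SEE = sqrt(sum of squared residuals / (n - 2))."""
    return math.sqrt(sum_squared_residuals(xs, ys, b0, b1) / (len(xs) - 2))

# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0_hat, b1_hat = fit_simple_ols(x, y)
best_ssr = sum_squared_residuals(x, y, b0_hat, b1_hat)
see = standard_error_of_estimate(x, y, b0_hat, b1_hat)

# Moving either estimate away from its least-squares value raises the sum of squares.
worse_intercept = sum_squared_residuals(x, y, b0_hat + 0.1, b1_hat)
worse_slope = sum_squared_residuals(x, y, b0_hat, b1_hat + 0.1)
print(b0_hat, b1_hat, best_ssr, see)
```

A small SEE relative to the spread of the dependent variable indicates that the fitted line tracks these data closely.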
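Printed by: yanzhao yang <[email protected]>. Printing is for personal, private use only. No part of this book may be reproduced or transmitted without the publisher's prior permission. Violators will be prosecuted.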
◾ The coefficient of determination measures the fraction of the total variation in the
dependent variable that is explained by the independent variable. In a linear regression
with one independent variable, the simplest way to compute the coefficient of
determination is to square the correlation of the dependent and independent variables.
◾ To test whether the population value of a regression coefficient, $b_1$, is equal to a particular
hypothesized value, $B_1$, we must know the estimated coefficient, $\hat{b}_1$, the standard error of
the estimated coefficient, $s_{\hat{b}_1}$, and the critical value for the t-distribution at the chosen
level of significance, $t_c$. The test statistic for this hypothesis is $(\hat{b}_1 - B_1)/s_{\hat{b}_1}$. If the
absolute value of this statistic is greater than $t_c$, then we reject the null hypothesis that
$b_1 = B_1$.
◾ In the regression model $Y_i = b_0 + b_1 X_i + \varepsilon_i$, if we know the estimated parameters, $\hat{b}_0$ and
$\hat{b}_1$, for any value of the independent variable, $X$, then the predicted value of the dependent
variable $Y$ is $\hat{Y} = \hat{b}_0 + \hat{b}_1 X$.
◾ The prediction interval for a regression equation for a particular predicted value of the
dependent variable is $\hat{Y} \pm t_c s_f$, where $s_f$ is the square root of the estimated variance of the
prediction error and $t_c$ is the critical level for the t-statistic at the chosen significance
level. This computation specifies a 100(1 − α) percent confidence interval. For example, if α
= 0.05, then this computation yields a 95 percent confidence interval.
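A minimal Python sketch of this interval is given below. It assumes the standard expression for the variance of the prediction error in a one-variable regression, s_f² = s²[1 + 1/n + (X − X̄)²/((n − 1)s_x²)], where s is the SEE and s_x² is the sample variance of the independent variable. That expression is not restated in this summary, so treat it as an assumption here. All input values and the function name `prediction_interval` are hypothetical.

```python
import math

def prediction_interval(b0_hat, b1_hat, x, see, n, mean_x, var_x, t_critical):
    """Return (lower, upper) bounds of Y-hat +/- t_c * s_f for a one-variable regression."""
    y_hat = b0_hat + b1_hat * x
    # Assumed formula: s_f^2 = s^2 * [1 + 1/n + (x - mean_x)^2 / ((n - 1) * var_x)]
    s_f = see * math.sqrt(1 + 1 / n + (x - mean_x) ** 2 / ((n - 1) * var_x))
    return y_hat - t_critical * s_f, y_hat + t_critical * s_f

# Hypothetical inputs: 25 observations, so 23 degrees of freedom;
# 2.069 is the two-tailed 5 percent critical t-value for 23 df.
at_mean = prediction_interval(1.0, 2.0, 3.0, 0.5, 25, 3.0, 4.0, 2.069)
far_from_mean = prediction_interval(1.0, 2.0, 9.0, 0.5, 25, 3.0, 4.0, 2.069)
print(at_mean)
print(far_from_mean)
```

Under this formula the interval is narrowest when X equals its sample mean and widens as X moves away from it.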
REFERENCES
Buetow, Gerald W., Robert R. Johnson, and David E. Runkle. 2000. “The Inconsistency of
Returns-Based Style Analysis.” Journal of Portfolio Management 26 (3): 61–77.
Campbell, John Y., Karine Serfaty-de Medeiros, and Luis M. Viceira. 2010. “Global Currency
Hedging.” Journal of Finance 65 (1): 87–121.
Chan, Louis K. C., Stephen G. Dimmock, and Josef Lakonishok. 2009. “Benchmarking Money
Manager Performance: Issues and Evidence.” Review of Financial Studies 22 (11): 4553–99.
Daniel, Wayne W., and James C. Terrell. 1995. Business Statistics for Management and
Economics. 7th ed. Boston: Houghton-Mifflin.
Dybvig, Philip H., and Stephen A. Ross. 1985a. “Differential Information and Performance
Measurement Using a Security Market Line.” Journal of Finance 40 (2): 383–99.
Dybvig, Philip H., and Stephen A. Ross. 1985b. “The Analytics of Performance Measurement
Using a Security Market Line.” Journal of Finance 40 (2): 401–16.
Genre, Veronique, Geoff Kenny, Aidan Meyler, and Allan Timmermann. 2013. “Combining
Expert Forecasts: Can Anything Beat the Simple Average?” International Journal of Forecasting
29 (1): 108–21.
Greene, William H. 2018. Econometric Analysis. 8th ed. Upper Saddle River, NJ: Prentice-Hall.
Keane, Michael P., and David E. Runkle. 1990. “Testing the Rationality of Price Forecasts:
New Evidence from Panel Data.” American Economic Review 80 (4): 714–35.
Nelson, David C., Robert B. Moskow, Tiffany Lee, and Gregg Valentine. 2003. Food Investor’s
Handbook. New York: Credit Suisse First Boston.
Sonkin, Paul D., and Paul Johnson. 2017. Pitch the Perfect Investment. New York: Wiley.
PRACTICE PROBLEMS
© 2016 CFA Institute. All rights reserved.
1. The following table shows the sample correlations between the monthly returns for four
different mutual funds and the S&P 500. The correlations are based on 36 monthly
observations. The funds are as follows:
Test the null hypothesis that each of these correlations, individually, is equal to zero
against the alternative hypothesis that it is not equal to zero. Use a 5 percent significance
level.
2. Julie Moon is an energy analyst examining electricity, oil, and natural gas consumption in
different regions over different seasons. She ran a regression explaining the variation in
energy consumption as a function of temperature. The total variation of the dependent
variable was 140.58, the explained variation was 60.16, and the unexplained variation
was 80.42. She had 60 monthly observations.
B. What was the sample correlation between energy consumption and temperature?
3. You are examining the results of a regression estimation that attempts to explain the unit
sales growth of a business you are researching. The analysis of variance output for the
regression is given in the table below. The regression was based on five observations (n =
5).
A. How many independent variables are in the regression to which the ANOVA
refers?
C. Calculate the sample variance of the dependent variable using information in the
above table.
D. Define Regression SS and explain how its value of 88 is obtained in terms of other
quantities reported in the above table.
F. Explain how the value of the F-statistic of 36.667 is obtained in terms of other
quantities reported in the above table.
4. An economist collected the monthly returns for KDL’s portfolio and a diversified stock
index. The data collected are shown below:
The economist calculated the correlation between the two returns and found it to be
0.996. The regression results with the KDL return as the dependent variable and the index
return as the independent variable are given as follows:
Regression Statistics
Multiple R 0.996
R-squared 0.992
Standard error 2.861
Observations 6
When reviewing the results, Andrea Fusilier suspected that they were unreliable. She
found that the returns for Month 2 should have been 7.21 percent and 6.49 percent,
instead of the large values shown in the first table. Correcting these values resulted in a
revised correlation of 0.824 and the revised regression results shown as follows:
Regression Statistics
Multiple R 0.824
R-squared 0.678
Standard error 2.062
Observations 6
Regression Statistics
Multiple R 0.8623
R-squared 0.7436
Standard error 0.0213
Observations 24
A. 0.8261.
B. 0.7436.
C. 0.8623.
6. Suppose that you deleted several of the observations that had small residual values. If you
re-estimated the regression equation using this reduced sample, what would likely happen
to the standard error of the estimate and the R-squared?
A. −0.7436.
B. 0.7436.
C. 0.8623.
A. You look up the F-value in a table. The F depends on the numerator and
denominator degrees of freedom.
B. Divide the “Mean Square” for the regression by the “Mean Square” of the
residuals.
C. The F-value is equal to the reciprocal of the t-value for the slope coefficient.
9. If the ratio of net income to sales for a restaurant is 5 percent, what is the predicted ratio
of cash flow from operations to sales?
10. Is the relationship between the ratio of cash flow from operations to sales and the ratio of
net income to sales significant at the 5 percent level?
B. No, because the p-values of the intercept and slope are less than 0.05.
C. Yes, because the p-values for F and t for the slope coefficient are less than 0.05.
Golub directs his assistant, Jill Batten, to study the relationships between Stellar monthly
common stock returns versus the previous month’s percent change in the US Consumer Price
Index for Energy (CPIENG), and Stellar monthly common stock returns versus the previous
month’s percent change in the US Producer Price Index for Crude Energy Materials (PPICEM).
Golub wants Batten to run both a correlation and a linear regression analysis. In response,
Batten compiles the summary statistics shown in Exhibit 1 for the 248 months between January
1980 and August 2000. All of the data are in decimal form, where 0.01 indicates a 1 percent
return. Batten also runs a regression analysis using Stellar monthly returns as the dependent
variable and the monthly change in CPIENG as the independent variable. Exhibit 2 displays the
results of this regression model.
                     Monthly Return          Lagged Monthly Change
                     Stellar Common Stock    CPIENG      PPICEM
Mean                 0.0123                  0.0023      0.0042
Standard Deviation   0.0717                  0.0160      0.0534
Regression Statistics
Multiple R 0.1452
R-squared 0.0211
Standard error of the estimate 0.0710
Observations 248
11. Batten wants to determine whether the sample correlation between the Stellar and
CPIENG variables (−0.1452) is statistically significant. The critical value for the test
statistic at the 0.05 level of significance is approximately 1.96. Batten should conclude
that the statistical relationship between Stellar and CPIENG is:
A. significant, because the calculated test statistic has a lower absolute value than the
critical value for the test statistic.
B. significant, because the calculated test statistic has a higher absolute value than the
critical value for the test statistic.
C. not significant, because the calculated test statistic has a higher absolute value than
the critical value for the test statistic.
12. Did Batten’s regression analyze cross-sectional or time-series data, and what was the
expected value of the error term from that regression?
13. Based on the regression, which used data in decimal form, if the CPIENG decreases by
1.0 percent, what is the expected return on Stellar common stock during the next period?
14. Based on Batten’s regression model, the coefficient of determination indicates that:
15. For Batten’s regression model, the standard error of the estimate shows that the standard
deviation of:
16. For the analysis run by Batten, which of the following is an incorrect conclusion from the
regression output?
B. In the month after the CPIENG declines, Stellar’s common stock is expected to
exhibit a positive return.
Liu provides a number of statistics in Exhibit 1. She also estimates a simple regression to
investigate the effect of the debt ratio on a company’s short interest ratio. The results of this
simple regression, including the analysis of variance (ANOVA), are shown in Exhibit 2.
Regression Statistics
Multiple R 0.3054
R-squared 0.0933
Standard error of estimate 2.7905
Observations 50
Liu is considering three interpretations of these results for her report on the relationship
between debt ratios and short interest ratios:
Interpretation 1 Companies’ higher debt ratios cause lower short interest ratios.
Interpretation 2 Companies’ higher short interest ratios cause higher debt ratios.
Interpretation 3 Companies with higher debt ratios tend to have lower short interest
ratios.
She is especially interested in using her estimation results to predict the short interest ratio for
MQD Corporation, which has a debt ratio of 0.40.
17. Based on Exhibits 1 and 2, if Liu were to graph the 50 observations, the scatterplot
summarizing this relation would be best described as:
A. horizontal.
B. upward sloping.
C. downward sloping.
A. −9.2430.
B. −0.1886.
C. 8.4123.
19. Based on Exhibit 1, the correlation between the debt ratio and the short interest ratio is
closest to:
A. −0.3054.
B. 0.0933.
C. 0.3054.
20. Which of the interpretations best describes Liu’s findings for her report?
A. Interpretation 1
B. Interpretation 2
C. Interpretation 3
A. intercept.
B. debt ratio.
22. Based on Exhibit 2, the degrees of freedom for the t-test of the slope coefficient in this
regression are:
A. 48.
B. 49.
C. 50.
23. The upper bound for the 95% confidence interval for the coefficient on the debt ratio in
the regression is closest to:
A. −1.0199.
B. −0.3947.
C. 1.4528.
24. Which of the following should Liu conclude from these results shown in Exhibit 2?
C. The debt ratio explains 30.54% of the variation in the short interest ratio.
25. Based on Exhibit 2, the short interest ratio expected for MQD Corporation is closest to:
A. 3.8339.
B. 5.4975.
C. 6.2462.
26. Based on Liu’s regression results in Exhibit 2, the F-statistic for testing whether the slope
coefficient is equal to zero is closest to:
A. −2.2219.
B. 3.5036.
C. 4.9367.
SOLUTIONS
1. The critical t-value for n − 2 = 34 df, using a 5 percent significance level and a two-tailed
test, is 2.032. First, take the smallest correlation in the table, the correlation between Fund
3 and Fund 4, and see if it is significantly different from zero. Its calculated t-value is
$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}} = \frac{0.3102\sqrt{36-2}}{\sqrt{1-0.3102^{2}}} = 1.903$$
This correlation is not significantly different from zero. If we take the next lowest
correlation, between Fund 2 and Fund 3, this correlation of 0.4156 has a calculated
t-value of 2.664. So this correlation is significantly different from zero at the 5 percent
level of significance. All of the other correlations in the table (besides the 0.3102) are
greater than 0.4156, so they too are significantly different from zero.
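The arithmetic in this solution can be checked with a short Python sketch. The function name `correlation_t_stat` is invented for illustration; the correlations, the sample size of 36, and the critical value of 2.032 are taken from the solution text.

```python
import math

def correlation_t_stat(r, n):
    """t-statistic for H0: population correlation = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

T_CRITICAL = 2.032  # two-tailed, 5 percent significance, 34 df (from the solution)
N_OBS = 36

t_low = correlation_t_stat(0.3102, N_OBS)   # smallest correlation in the table
t_next = correlation_t_stat(0.4156, N_OBS)  # next-lowest correlation

for r, t in ((0.3102, t_low), (0.4156, t_next)):
    verdict = "reject H0" if abs(t) > T_CRITICAL else "fail to reject H0"
    print(f"r = {r}: t = {t:.3f}, {verdict}")
```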
2.
B. For a linear regression with one independent variable, the absolute value of
correlation between the independent variable and the dependent variable equals the
square root of the coefficient of determination, so the correlation is √0.4279 =
0.6542. (The correlation will have the same sign as the slope coefficient.)
$$\left(\sum_{i=1}^{n}\frac{\left(Y_i-\hat{b}_0-\hat{b}_1X_i\right)^{2}}{n-2}\right)^{1/2}=\left(\frac{\text{Unexplained variation}}{n-2}\right)^{1/2}=\sqrt{\frac{80.42}{60-2}}=1.178$$
$$\sum_{i=1}^{n}\frac{\left(Y_i-\bar{Y}\right)^{2}}{n-1}=\frac{\text{Total variation}}{n-1}=\frac{140.58}{60-1}=2.3827$$
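The quantities in Problem 2 all follow from the three variation figures and the sample size given in the problem. The sketch below reproduces them in Python; the variable names are chosen for illustration only.

```python
import math

# Figures given in Problem 2.
total_variation = 140.58
explained_variation = 60.16
unexplained_variation = 80.42
n = 60

# Coefficient of determination: explained share of total variation.
r_squared = explained_variation / total_variation

# With one independent variable, |correlation| is the square root of R-squared.
abs_correlation = math.sqrt(r_squared)

# Standard error of estimate uses n - 2 degrees of freedom.
see = math.sqrt(unexplained_variation / (n - 2))

# Sample variance of the dependent variable uses n - 1 degrees of freedom.
sample_variance = total_variation / (n - 1)

print(round(r_squared, 4), round(abs_correlation, 4),
      round(see, 3), round(sample_variance, 4))
```

The sign of the correlation cannot be recovered from these figures alone; it matches the sign of the slope coefficient.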
3.
A. The degrees of freedom for the regression is the number of slope parameters in the
regression, which is the same as the number of independent variables in the
regression. Because regression df = 1, we conclude that there is one independent
variable in the regression.
B. Total SS is the sum of the squared deviations of the dependent variable Y about its
mean.
C. The sample variance of the dependent variable is the total SS divided by its degrees
of freedom (n − 1 = 5 − 1 = 4 as given). Thus the sample variance of the dependent
variable is 95.2/4 = 23.8.
D. The Regression SS is the part of total sum of squares explained by the regression.
Regression SS equals the sum of the squared differences between predicted values
of Y and the sample mean of Y: $\sum_{i=1}^{n}\left(\hat{Y}_i-\bar{Y}\right)^{2}$. In terms of other values in
the table, Regression SS is equal to Total SS minus Residual SS: 95.2 − 7.2 = 88.
E. The F-statistic tests whether all the slope coefficients in a linear regression are
equal to 0.
F. The calculated value of F in the table is equal to the Regression MSS divided by
the Residual MSS: 88/2.4 = 36.667.
G. Yes. The significance of 0.00904 given in the table is the p-value of the test (the
smallest level at which we can reject the null hypothesis). This value of 0.00904 is
less than the specified significance level of 0.05, so we reject the null hypothesis.
The regression equation has significant explanatory power.
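The ANOVA relationships used in parts A, C, D, and F can be sketched in Python as follows. The inputs (Total SS of 95.2, Residual SS of 7.2, n = 5, one independent variable) come from the solution text; the variable names are illustrative.

```python
n = 5               # observations
k = 1               # independent variables (regression df)
total_ss = 95.2
residual_ss = 7.2

# Part D: Regression SS is what remains of Total SS after Residual SS.
regression_ss = total_ss - residual_ss

# Mean squares divide each sum of squares by its degrees of freedom.
regression_ms = regression_ss / k
residual_ms = residual_ss / (n - k - 1)

# Part F: the F-statistic is the ratio of the two mean squares.
f_stat = regression_ms / residual_ms

# Part C: sample variance of the dependent variable.
sample_variance = total_ss / (n - 1)

print(regression_ss, residual_ms, round(f_stat, 3), sample_variance)
```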
4. The Month 2 data point is an outlier, lying far away from the other data values. Because
this outlier was caused by a data entry error, correcting the outlier improves the validity
and reliability of the regression. In this case, the true correlation is reduced from 0.996 to
0.824. The revised R-squared is substantially lower (0.678 versus 0.992). The
significance of the regression is also lower, as can be seen in the decline of the F-value
from 500.79 to 8.44 and the decline in the t-statistic of the slope coefficient from 22.379
to 2.905.
The total sum of squares and regression sum of squares were greatly exaggerated in the
incorrect analysis. With the correction, the slope coefficient changes from 1.069 to 0.623.
This change is important. When the index moves up or down, the original model indicates
that the portfolio return goes up or down by 1.069 times as much, while the revised model
indicates that the portfolio return goes up or down by only 0.623 times as much. In this
example, incorrect data entry caused the outlier. Had it been a valid observation, not
caused by a data error, then the analyst would have had to decide whether the results were
more reliable including or excluding the outlier.
6. C is correct. Deleting observations with small residuals will degrade the strength of the
regression, resulting in an increase in the standard error and a decrease in R-squared.
7. C is correct. For a regression with one independent variable, the correlation is the same as
the Multiple R with the sign of the slope coefficient. Because the slope coefficient is
positive, the correlation is 0.8623.
9. C is correct. To make a prediction using the regression model, multiply the slope
coefficient by the forecast of the independent variable and add the result to the intercept.
10. C is correct. The p-value is the smallest level of significance at which the null hypotheses
concerning the slope coefficient can be rejected. In this case the p-value is less than 0.05,
and thus the regression of the ratio of cash flow from operations to sales on the ratio of
net income to sales is significant at the 5 percent level.
Because the absolute value of t = −2.3017 is greater than 1.96, the correlation coefficient
is statistically significant. For a regression with one independent variable, the t-value (and
significance) for the slope coefficient (which is −2.3014) should equal the t-value (and
significance) of the correlation coefficient. The slight difference between these two
t-values is caused by rounding error.
12. A is correct because the data are time series, and the expected value of the error term,
E(ε), is 0.
13. C is correct. From the regression equation, Expected return = 0.0138 + (−0.6486)(−0.01) =
0.0138 + 0.006486 = 0.0203, or 2.03 percent.
14. C is correct. R-squared is the coefficient of determination. In this case, it shows that 2.11
percent of the variability in Stellar’s returns is explained by changes in CPIENG.
15. A is correct, because the standard error of the estimate is the standard deviation of the
regression residuals.
16. C is the correct response, because it is a false statement. The slope and intercept are both
statistically significant.
17. C is correct because the slope coefficient (Exhibit 2) and the cross-product (Exhibit 1) are
negative.
19. A is correct. The correlation coefficient equals the covariance between variables X and Y
divided by the product of the standard deviations of variables X and Y, as follows:
$$\frac{-9.2430/49}{\sqrt{2.2225/49}\,\sqrt{412.2042/49}}=\frac{-0.1886327}{0.2130\times 2.9004}=-0.3054.$$
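The same calculation can be sketched in Python. The three sums and the divisor of 49 (n − 1 with 50 observations) are the figures used in this solution; the variable names, including the generic `x` and `y` labels for the two sums of squared deviations, are assumptions made for illustration.

```python
import math

n = 50
sum_cross_products = -9.2430   # sum of cross-products of deviations from the means
sum_sq_dev_x = 2.2225          # sum of squared deviations, first variable
sum_sq_dev_y = 412.2042        # sum of squared deviations, second variable

covariance = sum_cross_products / (n - 1)
std_x = math.sqrt(sum_sq_dev_x / (n - 1))
std_y = math.sqrt(sum_sq_dev_y / (n - 1))

correlation = covariance / (std_x * std_y)
print(round(covariance, 7), round(std_x, 4), round(std_y, 4), round(correlation, 4))
```

Because n − 1 appears in both the numerator and the denominator, the correlation depends only on the three sums.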
20. C is correct. Conclusions cannot be drawn regarding causation, only about association.
21. C is correct. Liu explains the short interest ratio using the debt ratio.
22. A is correct. The degrees of freedom are the number of observations minus the number of
parameters estimated, which equals two in this case (the intercept and the slope
coefficient). The number of degrees of freedom is 50 − 2 = 48.
23. B is correct. The calculation for the confidence interval is −4.1589 ± (2.011 × 1.8718).
The upper bound is −0.3947. The 2.011 is the critical t-value for the 5% level of
significance (2.5% in one tail) for 48 degrees of freedom.
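As a quick check of this interval, the following Python sketch uses the slope estimate, standard error, and critical t-value quoted in the solution. The function name `slope_confidence_interval` is hypothetical.

```python
def slope_confidence_interval(b1_hat, std_error, t_critical):
    """Return (lower, upper) bounds of b1_hat +/- t_critical * std_error."""
    margin = t_critical * std_error
    return b1_hat - margin, b1_hat + margin

# Values from the solution: slope -4.1589, standard error 1.8718,
# critical t of 2.011 for 48 degrees of freedom at the 5% level.
lower, upper = slope_confidence_interval(-4.1589, 1.8718, 2.011)
print(round(lower, 4), round(upper, 4))
```

Because both bounds are negative, the interval excludes zero, which is consistent with rejecting the hypothesis of a zero slope at the 5% level.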
24. B is correct. The t-statistic is −2.2219, which is outside of the bounds created by the
critical t-values of ± 2.011 for a two-tailed test with a 5% significance level. The 2.011 is
the critical t-value for the 5% level of significance (2.5% in one tail) for 48 degrees of
freedom.
25. A is correct because Predicted value = 5.4975 + (−4.1589 × 0.40) = 5.4975 − 1.6636 =
3.8339.
26. C is correct because $F = \dfrac{\text{Mean regression sum of squares}}{\text{Mean squared error}} = \dfrac{38.4404}{7.7867} = 4.9367$.
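In a regression with one independent variable, the F-statistic equals the square of the t-statistic for the slope coefficient, which gives a useful cross-check. The sketch below uses the mean squares from this solution and the slope estimate and standard error quoted in Solution 23; the variable names are illustrative, and the small gap between the two results reflects rounding in the reported inputs.

```python
mean_regression_ss = 38.4404
mean_squared_error = 7.7867
f_stat = mean_regression_ss / mean_squared_error

# Slope coefficient and its standard error, as quoted in Solution 23.
slope = -4.1589
slope_std_error = 1.8718
t_stat = slope / slope_std_error

print(round(f_stat, 4), round(t_stat, 4), round(t_stat ** 2, 4))
```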
NOTES
1 Examples in this reading were updated in 2014 by Professor Sanjiv Sabherwal of the University of Texas,
Arlington.
2 Later, we show that variables with a correlation of 0 can have a strong nonlinear relation.
3 The use of n − 1 in the denominator is a technical point; it ensures that the sample covariance is an unbiased
estimate of population covariance.