Lecture 2
INTRODUCTORY STATISTICS FOR FINANCIAL ANALYSIS
Week 9
Chapter 8 – Multiple Linear Regression
Using Dummy Variables
Financial analysts often need to use qualitative variables as independent variables in a regression.
A dummy variable is a qualitative variable that takes on a value of 1 if a particular condition is true and 0 if that condition is false.
– used to account for qualitative variables such as male or female, month-of-the-year effects, etc.
It may reflect an inherent property of the data (e.g., belonging to an industry or a region). For example, a
company either belongs to the health care industry (dummy variable = 1) or it does not (dummy variable = 0).
It may be an identified characteristic of the data. We may introduce such a binary variable by a
condition that is either true or false. For example, the date may be before 2008 (prior to the onset of
the financial crisis, dummy variable = 0) or after 2008 (after the onset of the financial crisis, dummy
variable = 1).
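As a minimal sketch of how such a dummy enters a regression, the Python snippet below builds the post-2008 indicator and fits an OLS model with NumPy. The data, the names (`years`, `x`, `post_crisis`, `fit_ols`), the coefficient values, and the choice to code the year 2008 itself as 0 are illustrative assumptions, not taken from the lecture.

```python
import numpy as np

def fit_ols(X, y):
    """OLS coefficients for y = X @ b + e (X already contains an intercept column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Hypothetical annual data: the series shifts down by 1.5 after 2008.
years = np.arange(2000, 2020)
post_crisis = (years > 2008).astype(float)   # dummy: 1 if after 2008, else 0
x = np.linspace(-1.0, 1.0, years.size)       # hypothetical quantitative regressor
y = 2.0 + 0.5 * x - 1.5 * post_crisis        # noise-free, so the fit is exact

# Design matrix: intercept, quantitative regressor, dummy variable.
X = np.column_stack([np.ones_like(x), x, post_crisis])
b0, b1, b_dummy = fit_ols(X, y)
print(f"intercept={b0:.2f}, slope={b1:.2f}, dummy={b_dummy:.2f}")
```

The coefficient on the dummy is the shift in the intercept when the condition is true; here it recovers the assumed post-2008 shift of -1.5.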
Alternatively, it may be constructed from some characteristic of the data. The dummy variable would
reflect a condition that is either true or false. Examples would include satisfying a condition, such as
Institute, CFA. Quantitative Investment Analysis, 4th Edition. John Wiley & Sons P&T, 20200907.
VIOLATIONS OF REGRESSION ASSUMPTIONS
Inference based on an estimated regression model rests on certain assumptions being met.
So far, we have assumed that the variance of the error term in a regression is constant
across observations (that is, the errors are homoskedastic).
Heteroskedasticity occurs when the variance of the errors differs across observations. Heteroskedasticity:
• does not affect the consistency of the regression parameter estimators
• causes the F-test for overall significance to be unreliable
• makes t-tests for the significance of individual regression coefficients unreliable, because it
introduces bias into the estimators of the standard errors of the regression coefficients
Homoskedastic vs Heteroskedastic Variance
[Figure, two panels: "Regression with homoskedasticity" and "Regression with heteroskedasticity"]
Homoskedastic: There is no systematic relationship between the value of the independent variable and the regression residuals (the
vertical distance between a plotted point and the fitted regression line).
Heteroskedastic: A systematic relationship is visually apparent. On average, the regression residuals grow much larger as the size
of the independent variable increases.
Types of Heteroskedasticity
Testing for Heteroskedasticity
The Breusch–Pagan test consists of regressing the squared residuals from the
estimated regression equation on the independent variables in the regression.
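The auxiliary regression described above can be sketched in a few lines of NumPy. The sketch assumes the usual form of the test statistic, n × R² from the auxiliary regression, compared against a chi-square critical value with degrees of freedom equal to the number of independent variables. The function names and the deterministic data set are hypothetical and chosen only so the two cases are easy to tell apart.

```python
import numpy as np

def ols_residuals_and_r2(X, y):
    """Fit OLS of y on X (intercept column included); return (residuals, R^2)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return resid, 1.0 - ss_res / ss_tot

def breusch_pagan_stat(x, y):
    """n * R^2 from regressing squared residuals on the independent variables."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    X = np.column_stack([np.ones(len(y)), x])
    resid, _ = ols_residuals_and_r2(X, y)            # original regression
    _, r2_aux = ols_residuals_and_r2(X, resid ** 2)  # auxiliary regression
    return len(y) * r2_aux

# Hypothetical, deterministic data: 100 x-levels, each observed twice.
x = np.repeat(np.linspace(1.0, 10.0, 100), 2)
sign = np.tile([1.0, -1.0], 100)
err_hetero = sign * x                                # error size grows with x
err_homo = sign * np.tile([1.0, 1.0, 2.0, 2.0], 50)  # error size unrelated to x

CHI2_CRIT_1DF_5PCT = 3.841  # chi-square critical value, 1 df, 5% level
bp_hetero = breusch_pagan_stat(x, 1.0 + 2.0 * x + err_hetero)
bp_homo = breusch_pagan_stat(x, 1.0 + 2.0 * x + err_homo)
print(bp_hetero > CHI2_CRIT_1DF_5PCT, bp_homo > CHI2_CRIT_1DF_5PCT)
```

With these assumed inputs the statistic is far above the critical value when the error size grows with x and far below it when the error size is unrelated to x.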
Correcting for Heteroskedasticity
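One commonly used correction keeps the OLS coefficients but replaces the usual standard errors with heteroskedasticity-robust (White) standard errors. The sketch below assumes that approach in its simplest (HC0) form; the function name and the data are hypothetical, and other corrections such as generalized least squares are not shown.

```python
import numpy as np

def ols_classic_and_robust_se(X, y):
    """Return OLS coefficients with classic and White (HC0) standard errors."""
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    coef = xtx_inv @ X.T @ y
    resid = y - X @ coef
    # Classic formula: assumes one common error variance for all observations.
    sigma2 = (resid @ resid) / (n - p)
    se_classic = np.sqrt(np.diag(sigma2 * xtx_inv))
    # White (HC0) sandwich: each observation keeps its own squared residual.
    meat = X.T @ (X * (resid ** 2)[:, None])
    se_robust = np.sqrt(np.diag(xtx_inv @ meat @ xtx_inv))
    return coef, se_classic, se_robust

# Hypothetical data in which the error size grows with x.
x = np.repeat(np.linspace(1.0, 10.0, 100), 2)
err = np.tile([1.0, -1.0], 100) * x
y = 1.0 + 2.0 * x + err
X = np.column_stack([np.ones_like(x), x])

coef, se_classic, se_robust = ols_classic_and_robust_se(X, y)
print("coefficients:", coef)
print("classic SE:  ", se_classic)
print("robust SE:   ", se_robust)
```

The coefficient estimates are identical under both approaches; only the standard errors, and therefore the t-statistics, change.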
Serial Correlation
When regression errors are correlated across observations, we say that they are serially
correlated (or autocorrelated).
• Positive serial correlation is serial correlation in which a positive error for one
observation increases the chance of a positive error for another observation.
• Negative serial correlation is serial correlation in which a positive error for one observation
increases the chance of a negative error for another observation, and a negative error for one
observation increases the chance of a positive error for another.
Testing for Serial Correlation
The Durbin-Watson statistic is used to test for serial correlation.
• When the Durbin-Watson (DW) statistic is less than d_L, we reject the null hypothesis of no positive
serial correlation.
• When the DW statistic falls between d_L and d_U, the test results are inconclusive.
• When the DW statistic is greater than d_U, we fail to reject the null hypothesis of no positive
serial correlation.
Durbin Watson statistic
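The statistic is the sum of squared changes in successive residuals divided by the sum of squared residuals: DW = Σ(e_t − e_{t−1})² / Σ e_t². The sketch below computes it and applies the decision rule for the null hypothesis of no positive serial correlation. The residual series and the bounds `D_L` and `D_U` are placeholders for illustration; in practice the bounds come from a Durbin-Watson table for the given sample size and number of regressors.

```python
import numpy as np

def durbin_watson(residuals):
    """DW = sum of squared successive differences / sum of squared residuals."""
    e = np.asarray(residuals, dtype=float)
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))

def dw_decision(dw, d_l, d_u):
    """Decision rule for H0: no positive serial correlation."""
    if dw < d_l:
        return "reject H0"
    if dw <= d_u:
        return "inconclusive"
    return "fail to reject H0"

# Hypothetical residual series.
smooth = [1, 2, 3, 2, 1, -1, -2, -3, -2, -1]  # neighbours tend to share a sign
alternating = [1, -1] * 5                     # neighbours always flip sign

D_L, D_U = 1.0, 1.5  # placeholder bounds, NOT taken from a DW table

for name, resid in [("smooth", smooth), ("alternating", alternating)]:
    dw = durbin_watson(resid)
    print(name, round(dw, 3), dw_decision(dw, D_L, D_U))
```

A DW value well below 2 points toward positive serial correlation, while a value well above 2 (as in the alternating series) points toward negative serial correlation, which this one-sided rule does not test.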
Correcting for Serial Correlation
Two alternative remedial steps when a regression has significant serial
correlation:
Multicollinearity
– does not affect the consistency of the OLS estimates of the regression coefficients
– estimates become extremely imprecise and unreliable
• The classic symptom of multicollinearity is a high R² (and a significant F-statistic) even though
the t-statistics on the estimated slope coefficients are not significant.
• The most direct solution to multicollinearity is excluding one or more of the regression variables.
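The classic symptom and the most direct remedy can be illustrated with a small constructed example. In the sketch below, `x2` is almost an exact copy of `x1`. All names, coefficient values, and the deterministic "noise" pattern are assumptions made for illustration.

```python
import numpy as np

def ols_summary(X, y):
    """Return (coefficients, t-statistics, R^2) for OLS; X includes the intercept."""
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    coef = xtx_inv @ X.T @ y
    resid = y - X @ coef
    sigma2 = (resid @ resid) / (n - p)
    t_stats = coef / np.sqrt(np.diag(sigma2 * xtx_inv))
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return coef, t_stats, r2

n = 200
x1 = np.linspace(-1.0, 1.0, n)
x2 = x1 + 0.01 * np.tile([1.0, -1.0], n // 2)          # almost a copy of x1
noise = 0.2 * np.tile([1.0, 1.0, -1.0, -1.0], n // 4)  # deterministic stand-in for errors
y = 1.0 + x1 + x2 + noise
ones = np.ones(n)

# Full model with both highly correlated regressors.
_, t_full, r2_full = ols_summary(np.column_stack([ones, x1, x2]), y)
# Reduced model after excluding x2.
_, t_reduced, r2_reduced = ols_summary(np.column_stack([ones, x1]), y)

print("full:    R^2 =", round(r2_full, 3), " slope t-stats =", np.round(t_full[1:], 2))
print("reduced: R^2 =", round(r2_reduced, 3), " slope t-stat  =", np.round(t_reduced[1:], 2))
```

In this constructed case the full model has a high R² but neither slope has a t-statistic above 2 in absolute value. After dropping `x2`, R² is almost unchanged and the remaining slope becomes strongly significant.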
Problems in Linear Regression & their Solutions