CFA Level 2

Multiple Linear Regression Overview: Involves modeling the relationship between a dependent variable and two or more independent variables. Unlike simple linear regression, which involves only one independent variable, multiple linear regression can provide more complex and potentially more accurate models for prediction, portfolio construction, or understanding security returns. However, incorrect use can lead to misleading results and poor predictions.

Model Specification and Analysis Process:

The analyst defines the dependent variable and selects relevant independent variables.

Decisions include the model's functional form and its purpose (prediction or understanding a relationship).

Software tools are typically used for model estimation and producing statistics. Examples include
Excel, Python (with libraries like scipy.stats, statsmodels, sklearn), R, SAS, and STATA.

The main tasks involve specifying the model and interpreting the software output.
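For instance, here is a minimal sketch of the estimation step in Python with statsmodels (one of the libraries named above); the variable names and data are simulated placeholders, not from the curriculum:

```python
# Minimal multiple regression estimation sketch (hypothetical data).
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 60  # e.g., 60 monthly observations

# Two hypothetical independent variables and a dependent variable
X = pd.DataFrame({"X1": rng.normal(size=n), "X2": rng.normal(size=n)})
y = 0.5 + 1.2 * X["X1"] - 0.8 * X["X2"] + rng.normal(scale=0.3, size=n)

X = sm.add_constant(X)      # adds the intercept column b0
model = sm.OLS(y, X).fit()  # ordinary least squares estimation
print(model.summary())      # coefficients, t-stats, R², F-stat, AIC/BIC
```

The summary output contains most of the statistics discussed in the rest of these notes.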

Regression Equation: Represented as Yi = b0 + b1X1i + b2X2i + b3X3i + … + bkXki + εi for i = 1 to n.


Here, Y is the dependent variable, Xs are independent variables, b0 is the intercept, and b1 to bk are
slope coefficients, indicating the impact of each independent variable on Y.

Key Assumptions: There are five crucial assumptions in multiple regression: linearity,
homoskedasticity, independence of errors, normality, and independence of independent variables.
Diagnostic tools like scatterplots and residual plots help verify these assumptions.

Applications: Used in financial analysis for explaining relationships, testing theories, or forecasting.
The regression process requires careful selection of variables, model testing, examining fit, and
adjusting as necessary.

Purpose and Application of Multiple Linear Regression:

Addresses investment problems involving multiple factors rather than a single factor. In the complex
investment world, using multiple explanatory variables is essential for an accurate explanation or
forecast.

Used in various scenarios, such as portfolio managers understanding stock returns through factors
like size, value, profitability, and investment aggressiveness; financial advisers predicting financial
distress using variables like leverage and market share; analysts assessing the impact of country risk
factors on equity returns.

Regression Analysis Process:


The process begins with specifying the model, including selecting the dependent and independent
variables. These variables can be continuous (like returns) or discrete (like an indicator for a takeover
target).

Traditional regression models are used for continuous dependent variables, while logistic regression
is employed for discrete dependent variables.

The choice of independent variables may vary, ranging from financial metrics to categorical variables
like industry sectors.

After specification, the model is estimated and analyzed to check that it meets the underlying assumptions and goodness-of-fit criteria.

The final model, once tested and validated, can be used for identifying relationships, testing theories,
or forecasting.

Differentiating Between Continuous and Discrete Dependent Variables:

The approach varies based on the nature of the dependent variable. Continuous variables typically
use traditional regression models, while discrete variables may require logistic regression.

Importance in Investment Analysis:

Multiple regression is crucial for a nuanced understanding of financial relationships and forecasting
in the complex investment landscape. It's a key tool in testing theories and identifying significant
relationships between various financial variables.

Objective of Multiple Linear Regression:

The goal is to explain the variation of a dependent variable (Y) using the variations in a set of
independent variables (X1, X2, ..., Xk). This is an extension of simple regression, which uses only one
independent variable (X) to explain the variation in Y.

Regression Equations:

Simple regression equation: Yi = b0 + b1Xi + εi (for i = 1 to n).

Multiple regression equation: Yi = b0 + b1X1i + b2X2i + b3X3i + … + bkXki + εi (for i = 1 to n).

In multiple regression, terms with independent variables (X1, X2, ..., Xk) form the deterministic part
of the model, while the εi term represents the stochastic or random part.

Interpretation of Coefficients:

Slope coefficients (b1, b2, ..., bk) must be interpreted carefully. Each coefficient bj is a partial slope: it measures the impact on Y of a one-unit change in Xj, holding all other independent variables constant.

Example: b2 measures the change in Y for a one-unit change in X2, with the other independent variables held constant.

Practical Example:

A regression model for monthly excess returns of a bond index (RET) against changes in government
bond yields (BY) and investment-grade credit spreads (CS).

Regression equation: RET = 0.0023 − 5.0585BY − 2.1901CS, based on 60 monthly observations.

Interpretations:

If the changes in BY and CS are both zero, the expected monthly excess return on the bond index is the intercept, 0.0023, or 0.23% (about 2.8% per year).

A one-unit increase in BY decreases RET by 5.0585 (holding CS constant); with both variables in decimal form, a 0.01 (one percentage point) rise in government bond yields lowers the excess return by about 5.06%, indicating an empirical duration of 5.0585 for the bond index.

A one-unit increase in CS decreases RET by 2.1901 (holding BY constant).

Example calculation: For changes of 0.005 in BY and 0.001 in CS, the expected excess return on the bond index is 0.0023 − 5.0585(0.005) − 2.1901(0.001) = −0.0252, or −2.52%.
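This arithmetic as a quick sketch, using the coefficients from the text:

```python
# Bond index example: expected excess return for given factor changes.
b0, b_by, b_cs = 0.0023, -5.0585, -2.1901
d_by, d_cs = 0.005, 0.001  # assumed changes in bond yields and credit spreads

ret = b0 + b_by * d_by + b_cs * d_cs
print(round(ret, 4))  # -0.0252, i.e., an expected excess return of about -2.52%
```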

Summary:

Assumptions of Multiple Linear Regression:

Linearity: The relationship between the dependent variable and the independent variables is linear.

Homoskedasticity: The variance of the regression residuals is constant across all observations.

Independence of Errors: Observations are independent, implying uncorrelated regression residuals.

Normality: Regression residuals should follow a normal distribution.

Independence of Independent Variables: These variables should not be random and should not have
an exact linear relationship among themselves.

Importance of Assumptions:

These assumptions are critical for valid statistical inferences using ordinary least squares (OLS) in
multiple linear regression. Violations can be detected using diagnostic tools and should be addressed
to ensure accurate model interpretation.

Model Estimation and Residual Analysis:

The model is expressed as Yi = b0 + b1X1i + b2X2i + b3X3i + … + bkXki + εi, estimated over n
observations.
Regression software can be used for estimation, producing residual plots to check for assumption
violations.

Example regression: ABC stock's monthly excess returns analyzed using the Fama–French three-factor model.

Using Diagnostic Plots:

Scatterplot Matrix: Helps visualize relationships between variables and detect outliers.

Residual vs. Predicted Value Plot: Checks for homoskedasticity and independence of errors. Outliers
can be identified here.

Residuals vs. Factors Plot: Assesses potential assumption violations in relation to independent
variables.

Normal Q-Q Plot: Compares the distribution of residuals against a normal distribution, checking for
normality. Outliers are identified if they deviate significantly from the diagonal line.

Practical Application:

Using Python and R code, the model can be estimated and residual plots generated for analysis.
Outliers and assumption violations can be identified and addressed to ensure the model's accuracy.
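A sketch of two of these diagnostic plots in Python, assuming `model` is a fitted statsmodels OLS result as in the earlier snippet and matplotlib is available:

```python
# Residual diagnostics sketch: residuals vs. predicted values and a normal Q-Q plot.
import matplotlib.pyplot as plt
import statsmodels.api as sm

resid = model.resid
fitted = model.fittedvalues

fig, axes = plt.subplots(1, 2, figsize=(10, 4))

# Residuals vs. predicted values: a funnel shape suggests heteroskedasticity;
# a visible pattern suggests correlated errors; isolated points are outliers.
axes[0].scatter(fitted, resid)
axes[0].axhline(0, linestyle="--")
axes[0].set_xlabel("Predicted value")
axes[0].set_ylabel("Residual")

# Normal Q-Q plot: residuals far from the 45-degree line indicate non-normality.
sm.qqplot(resid, line="45", fit=True, ax=axes[1])
plt.tight_layout()
plt.show()
```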

Chapter 2

Goodness of Fit in Multiple Regression:

R-squared (R²): Measures how much variation in the dependent variable is explained by the
independent variables. In multiple regression, R² always increases or remains the same with the
addition of more independent variables.

Limitations of R²: It does not indicate statistical significance of coefficients, biases in estimates, or
quality of model fit. High R² can be misleading due to overfitting.

Adjusted R-squared (Adjusted R²):

Adjusts R² for the number of independent variables relative to the number of observations,
preventing automatic increase with the addition of variables.

Calculated as Adjusted R² = 1 - [(Sum of squares error/(n-k-1)) / (Sum of squares total/(n-1))].

Increases when a newly added variable's coefficient has an absolute t-statistic greater than 1.0, and decreases otherwise.

Overfitting: Occurs when the model is too complex for the data, possibly leading to unreliable coefficients.

ANOVA (Analysis of Variance):

Used to evaluate the model's fit, comparing the sum of squares due to regression with the total sum
of squares.

ANOVA table shows degrees of freedom, sum of squares, mean squares, F-statistic, and significance
of F-statistic.

Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC):

AIC and BIC help compare model quality, assessing model parsimony.

AIC = n * ln(Sum of squares error/n) + 2 * (k + 1), where n is the sample size and k is the number of
independent variables.

BIC = n * ln(Sum of squares error/n) + ln(n) * (k + 1), placing a higher penalty for model complexity.

AIC is preferred when the model is used for prediction; BIC is preferred when the goal is evaluating goodness of fit.
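As a sketch, all three measures can be computed directly from the formulas above; the SSE and SST values below are hypothetical:

```python
# Goodness-of-fit measures from their definitions.
import math

def adjusted_r2(sse, sst, n, k):
    return 1 - (sse / (n - k - 1)) / (sst / (n - 1))

def aic(sse, n, k):
    return n * math.log(sse / n) + 2 * (k + 1)

def bic(sse, n, k):
    return n * math.log(sse / n) + math.log(n) * (k + 1)

# Hypothetical model: n = 60 observations, k = 3 independent variables.
print(adjusted_r2(sse=8.0, sst=20.0, n=60, k=3))  # ~0.579
print(aic(sse=8.0, n=60, k=3), bic(sse=8.0, n=60, k=3))  # lower is better
```

Note that statsmodels reports AIC and BIC from the full log-likelihood, so its values differ from these SSE-based versions by a constant that depends only on n; model rankings on the same data are unaffected.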

Practical Application:

Different models can be evaluated for their explanatory power using these statistical measures. For
example, comparing models with varying numbers of factors (like in portfolio excess returns
regression) to determine the best model based on R², Adjusted R², AIC, and BIC.

For investment decisions, these metrics guide the selection of the most appropriate model – either
focusing on prediction accuracy (AIC) or simplicity and fit (BIC).

Interpreting Goodness-of-Fit Statistics:

In the given research scenario, the portfolio manager must evaluate models based on R², Adjusted R², AIC, and BIC to determine which model (CAPEX only vs. CAPEX and ADV) provides a better fit or is more appropriate for the research objective.

Interpretation of Coefficients in Multiple Regression:

Intercept: Expected value of the dependent variable when all independent variables are zero.

Slope Coefficients: Expected change in the dependent variable for a one-unit change in a given
independent variable, with all other variables held constant.

Testing Individual Coefficients:

Conducted with t-tests, as in simple regression. The test statistic is t = (b̂j − Bj)/sb̂j, where sb̂j is the standard error of the estimated coefficient, with n − k − 1 degrees of freedom.

Null hypothesis (H0): bj = Bj, where bj is the coefficient of the jth variable and Bj is the hypothesized value.

Alternative hypothesis (Ha): bj ≠ Bj.

For significance testing, typically H0: bj = 0 and Ha: bj ≠ 0.
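A sketch of this test with hypothetical estimates, using scipy.stats (named earlier) for the p-value:

```python
# t-test for a single slope coefficient (hypothetical estimates).
from scipy import stats

b_hat, se_b = 0.85, 0.32  # estimated coefficient and its standard error
B_null = 0.0              # hypothesized value (here, a significance test)
n, k = 60, 3              # observations and independent variables

t_stat = (b_hat - B_null) / se_b
p_value = 2 * stats.t.sf(abs(t_stat), df=n - k - 1)  # two-tailed
print(t_stat, p_value)  # reject H0 at 5% if p_value < 0.05
```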

Joint Hypothesis Testing:

Used to test the significance of a subset of variables in a multiple regression model.

Compares an unrestricted model (with all variables) to a restricted model (with fewer variables).

Null hypothesis: The coefficients of the variables excluded in the restricted model are zero.

F-test is used to compare the two models.

Comparing Nested Models:

Uses the F-statistic: F = [(Sum of squares error of restricted model − Sum of squares error of unrestricted model) / q] / [Sum of squares error of unrestricted model / (n − k − 1)].

q is the number of restrictions (number of variables omitted in the restricted model).

Example Application:

Testing whether two additional factors in a five-factor model (Factors 4 and 5) are necessary in
explaining portfolio excess returns. An F-test compares the restricted model (using only Factors 1, 2,
and 3) against the unrestricted model (using all five factors).

The null hypothesis that Factors 4 and 5 are not significant is not rejected if the F-statistic is below a
critical value.
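A sketch of this nested-model comparison with hypothetical sums of squares; the degrees of freedom match the five-factor example (q = 2 omitted factors):

```python
# Nested-model F-test: restricted (3 factors) vs. unrestricted (5 factors).
from scipy import stats

sse_restricted, sse_unrestricted = 12.5, 10.0  # hypothetical SSEs
n, k, q = 60, 5, 2  # k: variables in the unrestricted model; q: restrictions

f_stat = ((sse_restricted - sse_unrestricted) / q) / (
    sse_unrestricted / (n - k - 1)
)
f_crit = stats.f.ppf(0.95, dfn=q, dfd=n - k - 1)  # 5% critical value
print(f_stat > f_crit)  # True: reject H0 that the omitted coefficients are zero
```

The overall goodness-of-fit test described next is the special case in which the restricted model contains only the intercept, so q = k.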

Goodness-of-Fit Test:

Tests the significance of the entire regression equation.

Null hypothesis: All slope coefficients are zero.

F-statistic: Mean square regression (MSR) divided by mean square error (MSE).

Degrees of freedom: k in the numerator (number of independent variables) and n-k-1 in the
denominator.

Evaluating Model Fit:

Adjusted R², AIC, BIC, t-statistics, and F-tests are used to assess model fit.

No single metric is definitive; a combination is often used to determine the best model.

Selecting the Best Model for ROA Analysis:


Various models are estimated to identify key drivers of Return on Assets (ROA) for a sample of
diversified manufacturers.

The models include different combinations of capital expenditures (CAPEX), advertising expenditures
(ADV), and R&D spending.

The best model is chosen based on a balance of explanatory power (R², Adjusted R²), model
parsimony (AIC, BIC), and significance of the variables (t-tests, F-tests).

Recommendation and Justification:

Based on the given statistics, the recommended model is selected by considering the trade-off
between explanatory power and model simplicity. The decision is made by comparing R², Adjusted
R², AIC, and BIC across all potential models and considering the significance of individual variables
and combinations thereof.

Forecasting with Multiple Regression:

Predicting the Dependent Variable: The forecasted value of the dependent variable (Ŷf) in multiple regression is obtained by summing the product of each estimated slope coefficient (b̂j) with the assumed value of its corresponding independent variable (Xjf), and then adding the estimated intercept (b̂0).

Formula: Ŷf = b̂0 + b̂1X1f + b̂2X2f + ... + b̂kXkf = b̂0 + Σ(j=1 to k) b̂jXjf.

Example: Using a five-factor model for portfolio returns, the forecasted return is calculated by
plugging in assumed values for each of the factors into the regression equation.
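A sketch of this calculation with hypothetical coefficient estimates and assumed factor values:

```python
# Point forecast from a five-factor model (hypothetical values).
import numpy as np

b0 = 0.0012                                        # estimated intercept
b = np.array([0.8, 0.3, -0.2, 0.1, 0.05])          # estimated slopes b1..b5
x_f = np.array([0.01, 0.002, -0.005, 0.0, 0.003])  # assumed factor values

y_hat = b0 + b @ x_f  # b0 + sum over j of bj * Xjf
print(y_hat)
```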

Cautions in Prediction:

Inclusion of All Variables: If the regression model includes all five independent variables, predictions
must also consider all of them, even if some are not statistically significant. This is because their
correlations were considered in estimating the coefficients.

Inclusion of Intercept: The intercept term must always be included in the prediction.

Uncertainty and Error: Predictions are subject to model error (residuals) and, if using forecasted
independent variables, additional error from inaccuracies in these forecasts.

Standard Error and Prediction Interval:

Model Error: The basic uncertainty in the model, represented by residuals.

Sampling Error: Arises when independent variables are themselves forecasts, adding to the forecast
error.

Resulting Error: The combination of model error and sampling error leads to a larger standard error of the forecast, widening the prediction interval.

Software Use: Software is typically used to calculate the standard error of the forecast and the forecast interval. With the five-factor model, for example, the software reports the point estimate, its standard error, and the confidence bounds.
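A sketch with statsmodels, assuming `model` is the fitted OLS result from the first snippet (intercept plus X1 and X2); get_prediction returns the point estimate, its standard error, and interval bounds:

```python
# Forecast with a 95% prediction interval (hypothetical new observation).
import pandas as pd

x_new = pd.DataFrame({"const": [1.0], "X1": [0.5], "X2": [-0.2]})
pred = model.get_prediction(x_new)

frame = pred.summary_frame(alpha=0.05)
print(frame[["mean", "mean_se", "obs_ci_lower", "obs_ci_upper"]])
```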

Practical Application:

In practice, predictions made using multiple regression must carefully consider the accuracy of the
independent variable values used and acknowledge the inherent uncertainties in the model and the
external forecasts. The calculation of the forecast interval is complex and generally handled by
statistical software, providing a more comprehensive picture of the potential range of the forecasted
dependent variable.
