Lesson #7 - Regression Analysis
● 1. Intercept (β₀):
○ Represents the expected value of Y when all independent variables are zero.
● 2. Slope Coefficients (β₁, β₂, …):
○ Represents the change in the dependent variable (Y) for a one-unit change in the
independent variable (X), holding other variables constant.
● 3. Sign of the Coefficients:
○ Positive Coefficient: Indicates a positive relationship between the independent
variable and the dependent variable.
○ Negative Coefficient: Indicates a negative relationship.
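The interpretation above can be sketched by fitting an ordinary least squares line to synthetic data; the true intercept and slope here (2 and 3) are illustrative values, not from the lesson:

```python
import numpy as np

# Synthetic data generated from y = 2 + 3x + noise (illustrative values)
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 2.0 + 3.0 * x + rng.normal(0, 0.5, 100)

# Design matrix with a column of ones so the first coefficient is the intercept
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

print(f"intercept (beta_0): {beta[0]:.2f}")  # expected Y when x = 0
print(f"slope (beta_1): {beta[1]:.2f}")      # change in Y per one-unit change in x
```

The recovered coefficients should land close to the true 2 and 3; the positive slope estimate is exactly the "positive relationship" described above.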
6. Residual Analysis
● 1. Definition: Residuals are the differences between observed and predicted values of
the dependent variable.
● 2. Purpose: To check the validity of the regression assumptions (e.g., homoscedasticity,
normality).
● 3. Diagnostic Plots:
○ Residual vs. Fitted Plot: Used to check for linearity and homoscedasticity.
○ Normal Q-Q Plot: Used to check the normality of residuals.
○ Scale-Location Plot: Checks the homoscedasticity assumption.
○ Residuals vs. Leverage Plot: Identifies influential observations (outliers).
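The quantities behind those diagnostic plots can be computed directly. A minimal NumPy sketch (synthetic data; the correlation check is a crude numeric stand-in for eyeballing the scale-location plot):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 200)
y = 1.0 + 2.0 * x + rng.normal(0, 1.0, 200)

# Fit OLS and form residuals: observed minus predicted values
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta
residuals = y - fitted

# With an intercept in the model, OLS residuals sum to (numerically) zero
print(f"mean residual: {residuals.mean():.6f}")

# Crude homoscedasticity check: |residuals| should not trend with fitted values
corr = np.corrcoef(fitted, np.abs(residuals))[0, 1]
print(f"corr(fitted, |resid|): {corr:.3f}")  # near 0 suggests constant variance
```

In practice one would plot `fitted` against `residuals` (the residual vs. fitted plot) rather than summarize with a single correlation, but the inputs are the same.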
7. Applications of Regression Analysis
● Economics: Predicting GDP based on factors like investment, education, and labor.
● Medicine: Analyzing the relationship between patient outcomes and treatment methods.
● Marketing: Estimating sales based on advertising spend, price, and market conditions.
● Finance: Modeling stock prices based on economic indicators and company
performance.
8. Challenges and Limitations
● 1. Outliers: Can disproportionately affect the regression line and model accuracy.
● 2. Multicollinearity: High correlation between independent variables can make it difficult
to determine the effect of each predictor.
● 3. Overfitting: Including too many predictors can lead to a model that fits the sample
data well but performs poorly on new data.
● 4. Assumption Violations: If the assumptions of regression are violated, the results
may not be reliable.
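Multicollinearity (item 2 above) is commonly quantified with the variance inflation factor, VIF_j = 1 / (1 − R²_j), where R²_j comes from regressing predictor j on the remaining predictors. A self-contained NumPy sketch with hypothetical data, where x2 is built to be nearly collinear with x1:

```python
import numpy as np

def vif(X):
    """VIF for each column of X (predictors only, no intercept column)."""
    vifs = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        # Regress column j on the remaining columns (plus an intercept)
        A = np.column_stack([np.ones(len(X)), others])
        coef, *_ = np.linalg.lstsq(A, X[:, j], rcond=None)
        ss_res = np.sum((X[:, j] - A @ coef) ** 2)
        ss_tot = np.sum((X[:, j] - X[:, j].mean()) ** 2)
        r2 = 1 - ss_res / ss_tot
        vifs.append(1.0 / (1.0 - r2))
    return vifs

rng = np.random.default_rng(2)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.1, size=200)  # nearly collinear with x1
x3 = rng.normal(size=200)                  # independent predictor
print([round(v, 1) for v in vif(np.column_stack([x1, x2, x3]))])
```

The collinear pair produces large VIFs (a common rule of thumb flags values above 5 or 10), while the independent predictor stays near 1.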