
Linear Regression – Detailed Notes (Extended)

1. Introduction
Linear Regression is one of the simplest and most widely used statistical tools for
predictive analysis. It models the relationship between a dependent variable and
one or more independent variables with a straight line. The technique is useful for
understanding trends, forecasting future values, and quantifying associations
between variables, although on its own it does not establish causation. In machine
learning, it is a foundational supervised learning algorithm. The core idea is to fit a
line such that the difference between actual and predicted values is minimized.

2. Types of Linear Regression


There are several types of linear regression based on the number of predictors:
- Simple Linear Regression: Deals with a single independent variable.
- Multiple Linear Regression: Includes multiple predictors.
- Polynomial Regression: A variation where the relationship is modeled as an nth
degree polynomial.
- Ridge and Lasso Regression: Regularized versions that help handle
multicollinearity and overfitting (see the short sketch after this list).
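
To illustrate the regularized variants, here is a minimal sketch using scikit-learn's
Ridge and Lasso estimators. The two-predictor dataset below (hours studied and
hours of sleep) is made up purely for demonstration and is not part of these notes.

import numpy as np
from sklearn.linear_model import Ridge, Lasso

# Toy data with two correlated predictors (assumed values for illustration)
X = np.array([[2, 7], [4, 6], [6, 6], [8, 5], [10, 5]])
y = np.array([50, 60, 65, 70, 85])

# alpha controls the penalty strength; larger alpha shrinks coefficients more
ridge = Ridge(alpha=1.0).fit(X, y)   # L2 penalty: shrinks coefficients toward zero
lasso = Lasso(alpha=0.5).fit(X, y)   # L1 penalty: can set some coefficients exactly to zero

print('Ridge coefficients:', ridge.coef_)
print('Lasso coefficients:', lasso.coef_)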

3. Mathematical Foundation
The general equation for simple linear regression is: Y = β₀ + β₁X + ε, where:
- β₀ is the intercept (constant term)
- β₁ is the slope coefficient (shows the change in Y per unit change in X)
- ε is the error term or residual (actual - predicted)

To determine the best fit line, we use the Ordinary Least Squares (OLS) method
which minimizes the sum of squared residuals.
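
In matrix form, the OLS estimates have the closed-form solution β = (XᵀX)⁻¹XᵀY
(the normal equations). As a minimal sketch, the coefficients for the worked example
in Section 5 can be computed directly in NumPy:

import numpy as np

# Design matrix with a leading column of ones for the intercept (data from Section 5)
X = np.array([[1, 2], [1, 4], [1, 6], [1, 8], [1, 10]], dtype=float)
Y = np.array([50, 60, 65, 70, 85], dtype=float)

# Normal equations: beta = (X^T X)^(-1) X^T Y
beta = np.linalg.solve(X.T @ X, X.T @ Y)
print('Intercept and slope:', beta)   # approximately [42.0, 4.0] for this data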

4. Assumptions in Linear Regression


For the linear regression model to produce reliable results, several assumptions
must be met (a short diagnostic sketch follows the list):
1. Linearity: The relationship between the independent and dependent variable
should be linear.
2. Independence: Observations should be independent of each other.
3. Homoscedasticity: The variance of residuals should be constant.
4. Normality: Residuals should be normally distributed.
5. No multicollinearity: Independent variables should not be highly correlated
among themselves.
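
As one simple way to check a few of these assumptions, the sketch below uses
statsmodels: residuals against fitted values give a visual check on linearity and
homoscedasticity, and variance inflation factors (VIF) flag multicollinearity. The
two-predictor data is made up for illustration, and this is only a sketch, not a
full diagnostic workflow.

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Toy data with two predictors (assumed values for illustration)
X = np.array([[2, 7], [4, 6], [6, 6], [8, 5], [10, 5]], dtype=float)
y = np.array([50, 60, 65, 70, 85], dtype=float)

X_const = sm.add_constant(X)            # add an intercept column
results = sm.OLS(y, X_const).fit()

# Linearity / homoscedasticity: residuals should scatter evenly around zero
print(list(zip(results.fittedvalues, results.resid)))

# Multicollinearity: VIF for each predictor (values far above ~5-10 are a warning sign)
for i in range(1, X_const.shape[1]):
    print('VIF for predictor', i, ':', variance_inflation_factor(X_const, i))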

5. Step-by-Step Example
Let’s consider a dataset where we want to predict a student's marks based on the
number of hours studied:

Hours Studied (X): 2, 4, 6, 8, 10
Marks Scored (Y): 50, 60, 65, 70, 85

Steps:
1. Calculate the mean of X and Y
2. Apply the formulas for β₁ and β₀:
β₁ = Σ[(X - X̄)(Y - Ȳ)] / Σ[(X - X̄)²]
β₀ = Ȳ - β₁X̄
3. Use Y = β₀ + β₁X to predict values
4. Visualize with a scatter plot and regression line
This process helps understand how much each hour of study contributes to the
exam marks.
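
To make the arithmetic concrete, here is a minimal sketch that carries out these
steps for the data above. For this data the fitted line works out to Y = 42 + 4X,
i.e. each additional hour of study is associated with roughly 4 extra marks. The
plotting step assumes matplotlib is available.

import numpy as np
import matplotlib.pyplot as plt

# Data from the example above
X = np.array([2, 4, 6, 8, 10], dtype=float)
Y = np.array([50, 60, 65, 70, 85], dtype=float)

# Step 1: means of X and Y
x_bar, y_bar = X.mean(), Y.mean()                                      # 6.0 and 66.0

# Step 2: slope and intercept from the formulas above
beta1 = np.sum((X - x_bar) * (Y - y_bar)) / np.sum((X - x_bar) ** 2)   # 160 / 40 = 4.0
beta0 = y_bar - beta1 * x_bar                                          # 66 - 4 * 6 = 42.0

# Step 3: predicted values from Y = beta0 + beta1 * X
Y_pred = beta0 + beta1 * X
print('Fitted line: Y =', beta0, '+', beta1, '* X')

# Step 4: scatter plot with the regression line
plt.scatter(X, Y, label='actual')
plt.plot(X, Y_pred, label='fitted line')
plt.xlabel('Hours Studied'); plt.ylabel('Marks Scored'); plt.legend(); plt.show()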

6. Model Evaluation Metrics


To assess the quality of our regression model, we use several metrics:
- R² (Coefficient of Determination): The proportion of variance in Y that is
explained by X.
- MAE (Mean Absolute Error): Average of absolute differences between actual and
predicted values.
- MSE (Mean Squared Error): Average of squared differences.
- RMSE (Root Mean Squared Error): Square root of MSE; gives error in same units as
the target variable.

Higher R² and lower error values indicate a better model fit.
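
A minimal sketch of computing these metrics with the standard helpers in
sklearn.metrics, using the fitted line from Section 5 (Y = 42 + 4X) as the prediction:

import numpy as np
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error

# Actual marks and the predictions from the Section 5 fit
y_true = np.array([50, 60, 65, 70, 85], dtype=float)
y_pred = 42 + 4 * np.array([2, 4, 6, 8, 10], dtype=float)

print('R2  :', r2_score(y_true, y_pred))                    # about 0.96
print('MAE :', mean_absolute_error(y_true, y_pred))         # 2.0 marks
print('MSE :', mean_squared_error(y_true, y_pred))          # 6.0
print('RMSE:', np.sqrt(mean_squared_error(y_true, y_pred))) # about 2.45 marks

For this small dataset the errors are only a couple of marks and R² is close to 1,
which matches the visually tight fit of the line.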

7. Real-World Applications
Linear regression is used extensively in real-life scenarios:
- Finance: Forecasting sales, stock prices
- Education: Predicting student performance
- Healthcare: Estimating patient readmission or risk scores
- Marketing: Forecasting customer lifetime value (CLTV)
- Manufacturing: Predicting machinery failure time or defects based on usage
metrics

8. Python Code Implementation


Here is a basic implementation in Python using scikit-learn, fitted on the first four
points of the Section 5 data so that the 10-hour case can be predicted out of sample:

from sklearn.linear_model import LinearRegression
import numpy as np

# Training data: hours studied vs marks scored (first four points from Section 5)
X = np.array([[2], [4], [6], [8]])
y = np.array([50, 60, 65, 70])

# Fit the ordinary least squares line
model = LinearRegression()
model.fit(X, y)
print('Intercept:', model.intercept_)
print('Slope:', model.coef_[0])

# Predict marks for a student who studies 10 hours
predicted = model.predict([[10]])
print('Predicted marks for 10 hours of study:', predicted[0])

9. Limitations and Challenges


Although simple and powerful, linear regression has some limitations:
- It assumes linearity, which may not always hold
- Sensitive to outliers
- Does not handle complex nonlinear interactions
- Performance depends on satisfying model assumptions

Advanced regression or ensemble methods (such as decision trees or random
forests) are used when linear regression falls short.

10. Conclusion
Linear regression is foundational in statistics and machine learning. It's
interpretable, easy to implement, and provides a good starting point for regression
problems. A solid understanding of its assumptions, applications, and limitations
helps in choosing the right model and avoiding pitfalls in real-world analysis.
