What is Lasso Regression?
Lasso Regression (Least Absolute Shrinkage and Selection Operator) is a regression method used for variable selection and regularization. It helps remove irrelevant features and prevents overfitting: because the coefficients of less important variables are shrunk toward zero, features with weak influence can be clearly identified.
In this guide, we will understand the core concepts of lasso regression as well as how it works to mitigate overfitting.
Understanding Lasso Regression
Lasso Regression is a regularization technique used to prevent overfitting. It improves linear regression by adding a penalty term to the standard regression equation: instead of minimizing only the sum of squared differences between observed and predicted values, it minimizes that sum plus a penalty on the size of the coefficients.
Real-world datasets often contain features that are strongly correlated with each other, a situation known as multicollinearity, and this is exactly where Lasso Regression helps.
For example, suppose we are predicting house prices based on features like location, square footage and number of bedrooms. Lasso Regression can identify the most important features: it might determine that location and square footage are the key factors influencing price while the others have less impact. By shrinking the coefficient of the bedroom feature to zero, it simplifies the model and improves its accuracy.
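To make this concrete, here is a minimal sketch using scikit-learn's Lasso on synthetic data; the feature names, coefficients and alpha value are made up to mirror the example above:

```python
# A minimal sketch of Lasso zeroing out a weak feature.
# The data is synthetic; feature names mirror the house-price example.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(42)
n = 200
location_score = rng.normal(size=n)   # strong predictor
square_footage = rng.normal(size=n)   # strong predictor
bedrooms = rng.normal(size=n)         # weak predictor
X = np.column_stack([location_score, square_footage, bedrooms])

# Price depends mostly on location and square footage, barely on bedrooms.
y = 5.0 * location_score + 3.0 * square_footage + 0.05 * bedrooms \
    + rng.normal(scale=0.5, size=n)

model = Lasso(alpha=0.1).fit(X, y)
print(dict(zip(["location", "sqft", "bedrooms"], model.coef_)))
# With this alpha, the bedrooms coefficient is typically driven to (or near) zero.
```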
Bias-Variance Tradeoff in Lasso Regression
The bias-variance tradeoff refers to the balance between two types of errors in a model:
- Bias: Error caused by overly simplistic assumptions about the data.
- Variance: Error caused by the model being too sensitive to small changes in the training data.
When implementing Lasso Regression, the L1 regularization penalty reduces variance by shrinking the coefficients of less important features to zero. This prevents overfitting by ensuring the model doesn't fit the noise in the data.
However, increasing the regularization strength, i.e., raising the lambda value, can increase bias. This happens because a stronger penalty can cause the model to oversimplify, making it unable to capture the true relationships in the data and leading to underfitting.
Thus the goal is to choose the right lambda value, one that balances bias and variance, typically through cross-validation.
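A quick way to see this tradeoff is to fit Lasso at several alpha values (scikit-learn's name for lambda) and compare training and test error; this sketch uses synthetic data and an arbitrary alpha grid:

```python
# As alpha grows, training error rises (more bias) while test error
# first falls (less variance) and then rises again once the model underfits.
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

X, y = make_regression(n_samples=200, n_features=30, n_informative=5,
                       noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for alpha in [0.01, 0.1, 1.0, 10.0, 100.0]:
    model = Lasso(alpha=alpha).fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    print(f"alpha={alpha:<7} train MSE={train_mse:8.1f}  test MSE={test_mse:8.1f}")
```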
[Figure: Bias-Variance Tradeoff]
How Lasso Regression Works
Lasso Regression is an extension of linear regression. While traditional linear regression minimizes the sum of squared differences between the observed and predicted values to find the best-fit line, it doesn’t handle the complexity of real-world data well when many factors are involved.
1. Ordinary Least Squares (OLS) Regression
Lasso builds on the Ordinary Least Squares (OLS) regression method by adding a penalty term. OLS minimizes the residual sum of squares:
RSS = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
Where,
- y_i is the observed value.
- \hat{y}_i is the predicted value for each data point i.
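As a minimal illustration, the RSS above can be computed directly in numpy after an ordinary least-squares fit; the data here is made up:

```python
import numpy as np

# Design matrix with an intercept column and one feature; toy values.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([2.1, 3.9, 6.2, 7.8])

beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # OLS coefficients
y_hat = X @ beta                              # predicted values
rss = np.sum((y - y_hat) ** 2)                # sum of squared residuals
print(beta, rss)
```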
2. Penalty Term for Lasso Regression
In Lasso regression, a penalty term is added to the OLS equation. The penalty is the sum of the absolute values of the coefficients. The updated cost function becomes:
RSS + \lambda \times \sum |\beta_i|
Where,
- \beta_i represents the coefficients of the predictors
- \lambda is the tuning parameter that controls the strength of the penalty. As \lambda increases, more coefficients are pushed towards zero
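This cost function translates directly into numpy; the data, coefficients and \lambda below are placeholder values for illustration:

```python
import numpy as np

def lasso_cost(X, y, beta, lam):
    """RSS + lambda * sum of absolute coefficient values."""
    residuals = y - X @ beta
    rss = np.sum(residuals ** 2)
    l1_penalty = lam * np.sum(np.abs(beta))
    return rss + l1_penalty

# Toy inputs, chosen only to show the computation.
X = np.array([[1.0, 2.0], [2.0, 0.5], [3.0, 1.5]])
y = np.array([3.0, 2.5, 5.0])
print(lasso_cost(X, y, beta=np.array([1.0, 0.5]), lam=0.1))
```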
3. Shrinking Coefficients
A key feature of Lasso is its ability to shrink the coefficients of less important features exactly to zero. This removes irrelevant features from the model, making it especially useful for high-dimensional data with many predictors relative to the number of observations.
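A short sketch of this behaviour on synthetic data: as alpha (scikit-learn's \lambda) grows, more coefficients are set exactly to zero:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# 10 features but only 3 carry signal; Lasso should zero out the rest.
X, y = make_regression(n_samples=100, n_features=10, n_informative=3,
                       noise=5.0, random_state=1)

for alpha in [0.1, 1.0, 10.0]:
    coefs = Lasso(alpha=alpha).fit(X, y).coef_
    n_zero = np.sum(coefs == 0)
    print(f"alpha={alpha:<5} zeroed coefficients: {n_zero} of {len(coefs)}")
```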
4. Selecting the Optimal \lambda
Selecting the correct lambda value is important. Cross-validation techniques are used to find the optimal value, balancing model complexity and predictive performance.
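One common way to do this is scikit-learn's LassoCV, which cross-validates over a grid of alpha values (its name for \lambda); the dataset below is synthetic:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV

X, y = make_regression(n_samples=200, n_features=20, n_informative=5,
                       noise=10.0, random_state=0)

# LassoCV tries a grid of alphas and keeps the one with the best CV score.
model = LassoCV(cv=5, random_state=0).fit(X, y)
print("best alpha:", model.alpha_)
print("non-zero coefficients:", (model.coef_ != 0).sum())
```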
The primary objective of Lasso regression is thus to minimize the residual sum of squares (RSS) plus a penalty term: \lambda multiplied by the sum of the absolute values of the coefficients.
[Figure: Graphical Representation of Lasso Regression]
In the plot, the Lasso Regression cost function combines the residual sum of squares (RSS) and an L1 penalty on the coefficients \beta_j.
- RSS: Measures the squared difference between observed and predicted values.
- L1 penalty: Penalizes the absolute values of the coefficients, driving some of them to zero and simplifying the model. The strength of the L1 penalty is controlled by the lambda parameter.
- y-axis: Represents the value of the cost function, which Lasso Regression tries to minimize.
- x-axis: Represents the value of the lambda (λ) parameter, which controls the strength of the L1 penalty in the cost function.
- Green to orange curve: Shows how the cost function (on the y-axis) changes as lambda (on the x-axis) increases. As lambda grows the curve shifts from green to orange, indicating that the cost function value rises as the L1 penalty becomes stronger and pushes more coefficients toward zero (a numeric sketch follows below).
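A rough numeric sketch of the curve described above: fit Lasso at several \lambda values and evaluate the cost RSS + \lambda \times \sum |\beta_i| at each one. Note that scikit-learn's alpha scales the RSS term by 1/(2n), so it is not identical to the \lambda in the plot; for illustration we treat them as the same knob:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

X, y = make_regression(n_samples=100, n_features=10, n_informative=3,
                       noise=5.0, random_state=1)

for lam in [0.01, 0.1, 1.0, 10.0]:
    model = Lasso(alpha=lam).fit(X, y)
    rss = np.sum((y - model.predict(X)) ** 2)
    cost = rss + lam * np.sum(np.abs(model.coef_))  # the article's cost function
    print(f"lambda={lam:<6} cost={cost:10.1f}")
```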
When to Use Lasso Regression
Lasso Regression is useful in the following situations:
- Feature Selection: It automatically selects the most important features by reducing the coefficients of less significant features to zero (as sketched after this list).
- Collinearity: When there is multicollinearity, it can help by reducing the coefficients of correlated variables and selecting only one of them.
- Regularization: It helps prevent overfitting by penalizing large coefficients, which is useful when the number of predictors is large.
- Interpretability: Compared to a traditional linear regression model that keeps all features, lasso regression produces a model with fewer non-zero coefficients, making it simpler to understand.
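The feature-selection use case amounts to keeping only the columns with non-zero coefficients after fitting; a minimal sketch on synthetic data with made-up feature names:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

X, y = make_regression(n_samples=150, n_features=8, n_informative=3,
                       noise=5.0, random_state=2)
feature_names = np.array([f"feature_{i}" for i in range(X.shape[1])])

model = Lasso(alpha=1.0).fit(X, y)
selected = feature_names[model.coef_ != 0]  # keep non-zero coefficients only
print("selected features:", list(selected))
```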
Advantages of Lasso Regression
- Feature Selection: It removes the need to manually select the most important features, so the resulting regression model is simpler and more explainable.
- Regularization: It constrains large coefficients, producing a model that is robust and generalizes well in its predictions.
- Interpretability: The sparse models it produces are simpler to understand and explain, which is important in fields like healthcare and finance.
- Handles Large Feature Spaces: It is effective in handling high-dimensional data such as images and videos.
Disadvantages of Lasso Regression
- Selection Bias: Lasso may arbitrarily select one variable from a group of highly correlated variables, which can lead to a biased model.
- Sensitive to Scale: It is sensitive to features with different scales, as they can distort the regularization and affect the model's accuracy (see the sketch after this list).
- Impact of Outliers: It can be easily affected by outliers in the data, which can distort the fitted coefficients.
- Model Instability: It can be unstable when there are many correlated variables, causing it to select different features under small changes in the data.
- Tuning Parameter Selection: Choosing the right λ value can be difficult, but cross-validation helps.
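A standard remedy for the scale sensitivity noted above is to standardize features before fitting, for example with a scikit-learn Pipeline; a minimal sketch on synthetic data:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_regression(n_samples=150, n_features=10, noise=5.0, random_state=3)
X[:, 0] *= 1000.0  # give one feature a much larger scale on purpose

# StandardScaler puts all features on a comparable scale before the L1 penalty applies.
model = make_pipeline(StandardScaler(), Lasso(alpha=1.0)).fit(X, y)
print(model.named_steps["lasso"].coef_)
```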
By introducing a penalty term on the coefficients, Lasso helps strike the right balance between bias and variance, improving accuracy and preventing overfitting.