Module: 3
Learning Objectives:
1. Acquire a comprehensive understanding of the core ideas
behind linear regression and recognise the inherent
constraints associated with this statistical technique.
2. Understand the concepts and differences between Ridge
and LASSO regression techniques.
3. Acquire proficiency in the actual application and assessment
procedures of LASSO and Ridge regression models.
4. Delve into generalised linear regression models, ensuring a
thorough understanding of multicollinearity and critical
model assumptions.
Structure:
3.1 Basics of Linear Regression and its Limitations
3.2 Introduction to Ridge and LASSO Regression
3.3 Implementing and Evaluating LASSO and Ridge Models
3.4 Exploring Generalised Linear Regression Models
3.5 Addressing Multicollinearity and Model Assumptions
3.6 Summary
3.7 Keywords
3.8 Self-Assessment Questions
3.9 Case Study
3.10 References
3.1 Basics of Linear Regression and its Limitations
Linear regression is a statistical approach that has significant
importance and enjoys extensive use within the field of
machine learning. The objective is to provide a mathematical
representation of the association between a response variable
and one or many predictor variables via the use of a linear
equation. In its most basic configuration, where there exists
just one independent variable, the association is denoted by
the mathematical expression: \(y = mx + c\), where \(y\)
symbolises the dependent variable, \(x\) represents the
independent variable, \(m\) denotes the slope, and \(c\)
signifies the y-intercept.
The main objective of linear regression is to determine the
optimal straight line, often referred to as the regression line,
that effectively forecasts the output values within a certain
range. The concept of "best fit" is often characterised by the
objective of minimising the sum of squared differences, also
known as errors, between the observed values, which refer to
the actual data points, and the values projected by the model.
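Formally, given observations \((x_i, y_i)\) for \(i = 1, \dots, n\), the least-squares criterion chooses \(m\) and \(c\) to minimise
\[
\sum_{i=1}^{n} \bigl( y_i - (m x_i + c) \bigr)^2 .
\]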
However, while linear regression is powerful, it has its limitations:
Linearity Assumption: The primary limitation of this approach
is in its assumption of a linear association between the
dependent and independent variables. In practical contexts,
this assertion does not consistently hold true. Numerous
phenomena possess intrinsic non-linearity, rendering their
exact representation unattainable by a linear model.
Independence: Linear regression assumes that the residuals,
which refer to the discrepancies between the observed and
projected values, are independent of one another. This
assumption is frequently violated in data involving repeated
measurements or matched observations, where errors tend to
be correlated.
Homoscedasticity: The word "homoscedasticity" refers to the
requirement that the variability of the errors should be
constant across all levels of the independent variables. To
clarify, it is expected that the dispersion of residuals is
approximately uniform along the regression line.
Lack of Flexibility: Linear regression exhibits a degree of
inflexibility, since it may fail to adequately capture intricate
relationships present within the dataset. In such instances, it
may be necessary to use more advanced methodologies or to
include polynomial terms.
Outliers: The linear regression model is susceptible to the
influence of outliers. A single outlier can substantially shift
both the slope and intercept of the regression line.
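To make this sensitivity concrete, the following minimal sketch (using scikit-learn on small, made-up data) fits the same line with and without a single corrupted observation and compares the estimated slope and intercept:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: y is roughly 2x + 1 with small noise
rng = np.random.default_rng(0)
X = np.arange(10, dtype=float).reshape(-1, 1)
y = 2 * X.ravel() + 1 + rng.normal(scale=0.5, size=10)

# Fit on clean data
clean = LinearRegression().fit(X, y)

# Corrupt a single observation to act as an outlier
y_out = y.copy()
y_out[-1] += 30.0
tainted = LinearRegression().fit(X, y_out)

print(f"clean:   slope={clean.coef_[0]:.2f}, intercept={clean.intercept_:.2f}")
print(f"outlier: slope={tainted.coef_[0]:.2f}, intercept={tainted.intercept_:.2f}")
```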
Linear regression provides a transparent and comprehensible
framework for forecasting outcomes. However, it is crucial for
practitioners to thoroughly evaluate the appropriateness of
linear regression for a specific dataset or research inquiry due
to its underlying assumptions and inherent limits.
3.2 Introduction to Ridge and LASSO Regression
Linear regression offers a basic approach for modelling the
associations between variables. Nevertheless, in scenarios
characterised by the existence of multicollinearity (where
independent variables exhibit strong correlation) or when the
number of predictors surpasses the number of observations,
ordinary linear regression may yield results that are unstable or
prone to overfitting. Ridge and LASSO regression are two
statistical approaches that have been specifically developed to
tackle the aforementioned issues.
Ridge Regression: Ridge regression, sometimes referred to as
Tikhonov regularisation, incorporates a penalty component into
the objective function of linear regression. The inclusion of a
penalty term serves to prevent the occurrence of high
coefficients, a phenomenon that has the potential to result in
overfitting, particularly when multicollinearity is present. The
primary goal of Ridge regression is to minimise the sum of
squared residuals while also including a penalty term. This
penalty term is determined by the squared magnitude of the
coefficient vector, which is further scaled by a hyperparameter
denoted as λ. As the value of λ increases, the penalty becomes
more pronounced, resulting in reduced coefficients. This
phenomenon leads to a trade-off, where a decrease in variance
is achieved at the expense of adding a certain level of bias.
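In matrix notation, with response vector \(y\), design matrix \(X\), and coefficient vector \(\beta\), the Ridge estimator can be written as
\[
\hat{\beta}^{\text{ridge}} = \arg\min_{\beta} \; \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_2^2 ,
\]
where larger values of \(\lambda\) impose a stronger penalty on the coefficients.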
LASSO Regression: LASSO regression, an abbreviation for Least
Absolute Shrinkage and Selection Operator, is a regularisation
approach used in statistical analysis.
Similar to the Ridge method, it introduces a penalty term into
the objective function of linear regression. However, the LASSO
penalty corresponds to the absolute value of the coefficients.
This distinction has a significant influence: LASSO tends to
force some coefficients to exactly zero, thereby selecting a
more streamlined model that excludes those predictors. The
LASSO technique may therefore be regarded as a method for
feature selection.
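In the same notation as the Ridge objective above, the LASSO estimator replaces the squared penalty with an absolute-value (\(\ell_1\)) penalty:
\[
\hat{\beta}^{\text{lasso}} = \arg\min_{\beta} \; \lVert y - X\beta \rVert_2^2 + \lambda \sum_{j} \lvert \beta_j \rvert .
\]
It is this \(\ell_1\) penalty that allows individual coefficients to be driven exactly to zero.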
Both Ridge and LASSO regression are used to overcome the
constraints of ordinary linear regression, particularly in
scenarios involving multicollinearity or a significant risk of
overfitting. Nevertheless, the regularisation techniques used by
Ridge and LASSO differ in their approach: Ridge tends to shrink
all coefficients uniformly, whereas LASSO can shrink some
coefficients exactly to zero, thereby eliminating them from the
model. The selection between Ridge and LASSO regularisation
techniques should be based on considerations such as the
problem's context, the characteristics of the data, and the
intended objective.
3.3 Implementing and Evaluating LASSO and Ridge Models
Ridge and LASSO regressions are advanced variations of linear
regression often used in statistical modelling. Their practical
applications frequently include the utilisation of software
packages and modules such as Scikit-learn, a popular Python
framework. During the implementation process, it is customary
to initialise the regression model by explicitly defining the
regularisation strength, which is often represented as α or λ. In
the context of Ridge regression, it is seen that all coefficients
undergo a uniform shrinkage towards zero, yet none of them
are entirely removed. In contrast, the LASSO method has the
capability to successfully conduct feature selection by reducing
certain coefficients to zero.
For Ridge: To begin the process, it is essential to import the
required libraries, namely the Ridge module, from the
sklearn.linear_model package. Next, instantiate a Ridge
regression model, specifying the desired regularisation strength.
Subsequently, train this model using the provided training
dataset.
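A minimal sketch of these steps, assuming small synthetic data and an illustrative regularisation strength (scikit-learn calls this parameter alpha):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Hypothetical synthetic data: 100 samples, 5 features, known coefficients
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.5, -2.0, 0.0, 0.5, 3.0]) + rng.normal(scale=0.1, size=100)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Instantiate Ridge with an illustrative regularisation strength and fit it
ridge = Ridge(alpha=1.0)
ridge.fit(X_train, y_train)

print("Ridge coefficients:", ridge.coef_.round(3))
```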
For LASSO: To begin, the first step is to import the Lasso
module from the sklearn.linear_model library. Create an
instance of a Lasso regression object, once again setting the
regularisation intensity, and proceed to train it using your
training dataset.
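The LASSO workflow is analogous; the sketch below uses the same kind of hypothetical synthetic data, again with an illustrative alpha value:

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split

# Same kind of hypothetical synthetic data as in the Ridge sketch
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.5, -2.0, 0.0, 0.5, 3.0]) + rng.normal(scale=0.1, size=100)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Instantiate Lasso with an illustrative regularisation strength and fit it
lasso = Lasso(alpha=0.1)
lasso.fit(X_train, y_train)

# Unlike Ridge, some coefficients may be shrunk exactly to zero
print("LASSO coefficients:", lasso.coef_.round(3))
```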
Evaluation: The evaluation of the performance of Ridge and
LASSO regressions entails using approaches that are similar to
those used for assessing regular linear regression models.
Common metrics that are often used include the following (a
combined evaluation sketch follows this list):
Root Mean Squared Error (RMSE): This metric quantifies the
typical magnitude of the discrepancy between predicted and
actual values, with smaller values indicating better model
performance.
R-squared: This metric measures the proportion of variability
in the dependent variable that can be accounted for by the
independent variables. Higher R-squared values indicate that
the model explains a greater share of the variance.
Coefficient Analysis: LASSO regression has particular
significance in this context. Through the analysis of the
coefficients that LASSO has decreased to zero, one may
ascertain the characteristics that are considered less significant
in forecasting the result.
Overfitting Check: Utilise cross-validation to assess how well
the model generalises to unseen data. A substantial drop in
performance on new data is indicative of overfitting.
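The sketch below pulls these four checks together, assuming the hypothetical synthetic data and the illustrative Lasso model from the earlier sketches:

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import cross_val_score, train_test_split

# Hypothetical synthetic data, as in the earlier sketches
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.5, -2.0, 0.0, 0.5, 3.0]) + rng.normal(scale=0.1, size=100)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = Lasso(alpha=0.1).fit(X_train, y_train)
y_pred = model.predict(X_test)

# RMSE: square root of the mean squared error (smaller is better)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))

# R-squared: proportion of variance explained (larger is better)
r2 = r2_score(y_test, y_pred)

# Coefficient analysis: indices of features LASSO shrank exactly to zero
zeroed = np.where(model.coef_ == 0.0)[0]

# Overfitting check: 5-fold cross-validated R-squared on the training data
cv_scores = cross_val_score(Lasso(alpha=0.1), X_train, y_train, cv=5, scoring="r2")

print(f"RMSE: {rmse:.3f}  R^2: {r2:.3f}")
print("Zeroed coefficient indices:", zeroed)
print("Cross-validated R^2 scores:", cv_scores.round(3))
```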
When doing an evaluation, it is crucial to consider the distinct
characteristics of Ridge and LASSO. For instance, the LASSO
method's capability to effectively decrease coefficients to zero
may provide valuable insights into the relative relevance of
features. On the other hand, the Ridge method's uniform
shrinkage approach can lead to more stable coefficient
estimates, particularly when dealing with multicollinearity.
It is important to exercise caution and conduct thorough
assessments while using Ridge and LASSO methods, despite
their easy implementation using contemporary technologies.
Regularised regression techniques, such as LASSO and Ridge,
provide a reliable alternative in situations where conventional
regression approaches may encounter difficulties. However, it
is crucial to comprehend the intricacies of these methods and
correctly interpret their results in order to effectively use them.
3.6 Summary
❖ Module 3 delves extensively into the subject of regression
analysis, first with an exploration of the fundamental
principles behind linear regression and the inherent
limitations associated with it. Although linear regression is
widely used in predictive modelling, it may encounter
limitations when dealing with intricate datasets, particularly
when predictor variables are correlated. To address this
issue, Ridge and LASSO regression are introduced as
regularisation methodologies that serve the dual purpose of
mitigating multicollinearity and reducing the risk of model
overfitting.