Least Square Method Definition
The least-squares method is a key statistical technique used to find a regression line, or best-fit line, for a given pattern of data. The method is described by an equation with specific parameters and is widely used in evaluation and regression. In regression analysis it is the standard approach for approximating the solution of overdetermined systems, i.e. sets of equations with more equations than unknowns.
The method of least squares defines the solution that minimizes the sum of the squared deviations, or errors, across all equations. The sum of squared errors is the sum over all observations of (observed value - fitted value) squared, and it measures the variation of the observed data around the fit.
The least-squares method is often applied in data fitting. The best-fit result is the one that minimizes the sum of squared errors, or residuals, which are the differences between the observed or experimental values and the corresponding fitted values given by the model.
Least-squares problems fall into two categories depending on whether the residuals are linear or nonlinear in the unknowns. Linear problems arise frequently in statistical regression analysis, while nonlinear problems are usually solved by an iterative method of refinement in which the model is approximated by a linear one at each iteration.
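As an illustration, here is a minimal sketch with made-up sample data that fits a best-fit line for simple linear regression using the closed-form least-squares slope and intercept, and reports the sum of squared errors being minimized.

import numpy as np

# Hypothetical sample data (x = input, y = observed output)
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8], dtype=float)

# Least-squares slope and intercept for the line y = m*x + c
m = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
c = y.mean() - m * x.mean()

# Sum of squared errors (residuals) that the method minimizes
sse = np.sum((y - (m * x + c)) ** 2)
print("slope:", m, "intercept:", c, "SSE:", sse)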
Mean Absolute Error (MAE)
MAE is a very simple metric that calculates the average absolute difference between the actual and predicted values.
To understand it better, suppose you have input data and output (actual) data, and your model produces predicted values. The MAE of your model measures the mistakes the model makes, known as errors. For each observation, take the difference between the actual value and the predicted value; its absolute value is the absolute error. To get the mean absolute error, sum all the absolute errors and divide by the total number of observations: MAE = (1/n) * sum of |actual - predicted|. A small example follows below.
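As a small illustration (the y_test and y_pred arrays here are made-up values, not from the text), MAE can be computed manually or with scikit-learn's mean_absolute_error:

import numpy as np
from sklearn.metrics import mean_absolute_error

y_test = np.array([3.0, 5.0, 2.5, 7.0])   # actual values (hypothetical)
y_pred = np.array([2.5, 5.0, 3.0, 8.0])   # predicted values (hypothetical)

# MAE = mean of |actual - predicted|
mae_manual = np.mean(np.abs(y_test - y_pred))
print("MAE (manual):", mae_manual)
print("MAE (sklearn):", mean_absolute_error(y_test, y_pred))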
Advantages of MAE
The MAE you get is in the same unit as the output variable.
Disadvantages of MAE
The graph of MAE is not differentiable everywhere (it has a kink at zero), so to use it as a loss function we have to apply optimizers that can handle this, such as sub-gradient methods.
Mean Squared Error (MSE)
MSE is one of the most used and simplest metrics, differing from mean absolute error only slightly. Mean squared error is the average of the squared differences between the actual and predicted values.
So, above we took the absolute difference, and here we take the squared difference.
What does the MSE actually represent? It represents the average squared distance between the actual and predicted values.
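A minimal sketch of the same idea with squared errors, using the same hypothetical y_test and y_pred as above:

import numpy as np
from sklearn.metrics import mean_squared_error

y_test = np.array([3.0, 5.0, 2.5, 7.0])   # actual values (hypothetical)
y_pred = np.array([2.5, 5.0, 3.0, 8.0])   # predicted values (hypothetical)

# MSE = mean of (actual - predicted)^2
mse_manual = np.mean((y_test - y_pred) ** 2)
print("MSE (manual):", mse_manual)
print("MSE (sklearn):", mean_squared_error(y_test, y_pred))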
Advantages of MSE
The graph of MSE is differentiable, so you can easily use it as a loss function.
Disadvantages of MSE
The value you get after calculating MSE is a squared unit of output.
If you have outliers in the dataset then MSE penalizes them the most, since their errors are squared.
Root Mean Squared Error (RMSE)
As the name itself makes clear, RMSE is simply the square root of the mean squared error.
Advantages of RMSE
The output value you get is in the same unit as the required output variable, which makes interpretation easier.
Disadvantages of RMSE
To compute RMSE we have to apply the NumPy square root function over the MSE:

import numpy as np
from sklearn.metrics import mean_squared_error

print("RMSE", np.sqrt(mean_squared_error(y_test, y_pred)))
Most of the time people use RMSE as an evaluation metric, and when you are working with deep learning techniques it is the most preferred metric.
R Squared (R2)
The R2 score is a metric that tells you the performance of your model rather than a loss value; it conveys in an absolute sense how well your model performed.
In contrast, MAE and MSE depend on the context of the data, as we have seen, whereas the R2 score is independent of context. So with the help of R squared we have a baseline (mean) model to compare against, which none of the other metrics provides; a similar idea of a baseline exists in classification problems.
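Concretely, R2 = 1 - (sum of squared errors of the regression line) / (sum of squared errors of the mean line). A minimal sketch with the same hypothetical y_test and y_pred as above, computing it both manually and with scikit-learn's r2_score:

import numpy as np
from sklearn.metrics import r2_score

y_test = np.array([3.0, 5.0, 2.5, 7.0])   # actual values (hypothetical)
y_pred = np.array([2.5, 5.0, 3.0, 8.0])   # predicted values (hypothetical)

ss_res = np.sum((y_test - y_pred) ** 2)          # error of the regression line
ss_tot = np.sum((y_test - y_test.mean()) ** 2)   # error of the mean line
print("R2 (manual):", 1 - ss_res / ss_tot)
print("R2 (sklearn):", r2_score(y_test, y_pred))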
Now, how do you interpret the R2 score? Suppose the R2 score is zero: then the ratio of the regression line's error to the mean line's error equals 1, so 1 - 1 = 0. In this case the regression line is no better than the mean line (the two effectively overlap), which means the model performance is at its worst; it is not capable of explaining any of the variance in the target.
The second case is when the R2 score is 1. This happens when the ratio term is zero, i.e. when the regression line makes no mistakes and fits the data perfectly.
So we can conclude that as our regression line moves towards perfection, the R2 score moves towards one.
The normal case is when the R2 score is between zero and one, for example 0.8, which means your model is capable of explaining 80 per cent of the variance in the data.
Adjusted R Squared
The disadvantage of the R2 score is that when new features are added to the data, the R2 score either increases or remains constant; it never decreases. The problem is that when we add an irrelevant feature to the dataset, the R2 score can still increase, which is misleading. Adjusted R squared corrects for this: adjusted R2 = 1 - ((1 - R2) * (n - 1) / (n - k - 1)), where n is the number of observations and k is the number of independent variables (features).
Now, as k increases when we add features, the denominator n - k - 1 decreases while n - 1 remains constant. If the added feature is irrelevant, the R2 score remains constant or increases only slightly, so the whole fraction increases, and when we subtract it from one the resulting adjusted score decreases. This is what happens when we add an irrelevant feature to the dataset.
If instead we add a relevant feature, the R2 score increases and 1 - R2 decreases heavily; even though the denominator also decreases, the whole fraction decreases, so the adjusted R2 score increases.
from sklearn.metrics import r2_score

r2 = r2_score(y_test, y_pred)   # R2 score of the model
n = 40                          # number of observations
k = 2                           # number of independent variables (features)
adj_r2_score = 1 - ((1 - r2) * (n - 1) / (n - k - 1))
print(adj_r2_score)
Hence, this metric becomes one of the most important metrics to use during the evaluation of a regression model.