
Residual Sum of Squares

Last Updated: 07 Nov, 2024

Residual Sum of Squares (RSS) is the sum of the squared differences between the actual values of the dependent variable and the values predicted by the model. This metric provides a numerical measure of how well the model fits the data: smaller values indicate a better fit, while larger values suggest a poorer fit.

For example, suppose we predict retail store sales from advertising spend using a linear regression model. The Residual Sum of Squares is obtained by summing the squared differences between the actual and predicted sales, and it tells us how well the model fits the data.

[Figure: Residual Sum of Squares — fitted regression line for sales vs. advertising spend, with the residuals plotted against advertising spend]

The scatter plot on the right displays the residuals, which are the differences between actual sales and predicted sales, plotted against advertising spend.

Ideally, we want these residuals to be randomly scattered around the horizontal zero line. If they are, it indicates that our model fits the data well.

However, in this case, we can see some patterns in the residuals, which suggests that our model may not be capturing all the underlying relationships in the data. This could mean we need a more complex model to better understand the relationship between advertising and sales.

Types of Sum of Squares

In regression analysis, RSS is one of the three main types of sum of squares, alongside the Total Sum of Squares (TSS) and the Sum of Squares due to Regression (SSR) or Explained Sum of Squares (ESS).

  • Total Sum of Squares measures the total variation in the dependent variable relative to its mean.
  • Sum of Squares due to Regression measures the variation explained by the regression model.
  • Residual Sum of Squares, on the other hand, measures the variation that is not explained by the model, which is essentially the error or residual component.

How to Calculate the Residual Sum of Squares?

Residual Sum of Squares (RSS) can be calculated using the following formula:

RSS = \Sigma_{i=1}^{n} (y_i - f(x_i))^2

Where, 

  • y_i is the i-th actual value of the variable being predicted,
  • f(x_i) is the corresponding predicted value, and
  • n is the number of observations.
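
As a quick illustration, the formula translates directly into code. The following is a minimal Python sketch; the actual and predicted values are hypothetical and serve only to demonstrate the calculation.

```python
# Minimal sketch: RSS from observed values y_i and predictions f(x_i).
# The numbers below are hypothetical, used only to illustrate the formula.
actual = [10.0, 12.0, 15.0, 18.0, 20.0]     # y_i : observed values
predicted = [11.0, 12.5, 14.0, 17.5, 21.0]  # f(x_i) : model predictions

# RSS = sum over i of (y_i - f(x_i))^2
rss = sum((y - f) ** 2 for y, f in zip(actual, predicted))
print(rss)  # 3.5
```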

Regression Sum of Squares (SSR)

The regression sum of squares measures how well the model performs by capturing how close the predicted values are to the mean of the observed values.

Consider a set X with n observations. The sum of squares S for this set can be calculated using the formula below:

\bold{S = \Sigma_{i=1}^{n} (X_i- \bar{X})^2}

Where,

  • X_i is the i-th observation of the set,
  • \bar{X} is the mean of the dataset, and
  • n is the number of observations.
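
A minimal Python sketch of this formula, using a small hypothetical set of observations:

```python
# Minimal sketch: sum of squares of a set about its mean, S = sum of (X_i - mean)^2.
# The observations below are hypothetical.
X = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(X) / len(X)               # mean = 5.0
S = sum((x - mean) ** 2 for x in X)  # squared deviations from the mean
print(S)  # 32.0
```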

Total Sum of Squares (TSS)

The total sum of squares measures the total amount of variation in the dependent variable about its mean. It is the sum of the regression sum of squares and the residual sum of squares, and is calculated as:

TSS = RSS + SSR

Where the abbreviations have their usual meaning.
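
This identity can be checked numerically. The sketch below fits a straight line by ordinary least squares with numpy.polyfit on hypothetical data and confirms that TSS equals SSR plus RSS (the decomposition holds exactly for least-squares fits with an intercept):

```python
import numpy as np

# Hypothetical data, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Fit y = b1*x + b0 by ordinary least squares.
b1, b0 = np.polyfit(x, y, 1)
y_hat = b1 * x + b0

tss = np.sum((y - y.mean()) ** 2)      # total variation in y
ssr = np.sum((y_hat - y.mean()) ** 2)  # variation explained by the model
rss = np.sum((y - y_hat) ** 2)         # unexplained (residual) variation

print(np.isclose(tss, ssr + rss))      # True
```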

How to Calculate Sum of Squares?

The steps for calculating the residual sum of squares and the regression sum of squares are given in the following sections.

How to Calculate Residual Sum of Squares?

To calculate the residual sum of squares, we can use the following steps:

Step 1: Organize the data and compute the predicted value ŷ_i = f(x_i) for each observation.

Step 2: Calculate the residual for each observation, i.e., y_i - ŷ_i.

Step 3: Use the following formula to calculate the Residual Sum of Squares.

\bold{RSS= \Sigma_{i=1}^n(y_i-f(x_i))^2}

Step 4: The result is the required value of the Residual Sum of Squares.
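
A minimal Python sketch that follows these four steps literally, with hypothetical observed and predicted values:

```python
# Step 1: organise the data — observed values y_i and predicted values f(x_i).
# (Hypothetical values, for illustration only.)
y = [3.0, 5.0, 7.0, 9.0]
y_hat = [2.5, 5.5, 6.5, 9.5]

# Step 2: residuals y_i - f(x_i)
residuals = [yi - fi for yi, fi in zip(y, y_hat)]

# Step 3: square the residuals and add them up
rss = sum(r ** 2 for r in residuals)

# Step 4: the result is the Residual Sum of Squares
print(rss)  # 1.0
```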

How to Calculate Sum of Squares Due to Regression?

To calculate the sum of squares due to regression we can use the following steps:

  • Step 1: Calculate the mean of the given data.
  • Step 2: Calculate the difference between each data point and the mean.
  • Step 3: Square each difference obtained in Step 2.
  • Step 4: Sum all the values obtained in Step 3 (see the sketch below).
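
The same steps in a minimal Python sketch (the data points are hypothetical):

```python
# Hypothetical data points, for illustration only.
data = [2, 4, 6, 8]

# Step 1: mean of the data
mean = sum(data) / len(data)                     # 5.0

# Steps 2 and 3: difference from the mean, squared
squared_diffs = [(x - mean) ** 2 for x in data]

# Step 4: sum of the squared values
print(sum(squared_diffs))  # 20.0
```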

Significance and Limitations

Significance of Sum of Squares

The sum of squares formula can be used for various purposes and has great significance in real life such as:

  • It measures how far the data points spread out from the mean value.
  • It helps investors make better decisions by quantifying the variance (volatility) of a stock.
  • It can also be used to compare the variability of the stock prices of two different companies.

Limitations of Sum of Squares

The sum of squares has the following limitations:

  • For a large dataset the values are more spread out, which makes the result harder to interpret and to act on in practice.
  • An investor may need many years of data to make a sound decision, and handling such a large volume of data becomes difficult.

Solved Examples of Residual Sum of Squares

Problem 1: Calculate the sum of squares of the set X = [1,2,3,6] if the mean is found to be 3.

Solution:

Given \bar{X} = 3

X        X - \bar{X}
1        -2
2        -1
3        0
6        3

Using S = \Sigma_{i=1}^{n} (X_i- \bar{X})^2

S = (-2)^2+(-1)^2+0^2+3^2

S = 4+1+0+9

S = 14

Therefore, the sum of squares of the set is 14.
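
The result can be verified with a short Python check:

```python
# Quick check of Problem 1.
X = [1, 2, 3, 6]
mean = 3  # given
print(sum((x - mean) ** 2 for x in X))  # 14
```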

Problem 2: Calculate the sum of squares of the set X = [3,6,9,12,15] if the mean is found to be 9.

Solution:

Given \bar{X} = 9

X        X - \bar{X}
3        -6
6        -3
9        0
12       3
15       6

Using S = \Sigma_{i=1}^{n} (X_i- \bar{X})^2

S = (-6)^2+(-3)^2+0^2+3^2+6^2

S = 36+9+0+9+36

S = 90

Therefore, the sum of squares of the set is 90.

Problem 3: Calculate the sum of squares of the dataset X = [1,2,3,4,5,6]

Solution:

In this case we need to calculate the mean first.

\bar{X} = \frac{1+2+3+4+5+6}{6} = \frac{21}{6} = 3.5

X        X - \bar{X}
1        -2.5
2        -1.5
3        -0.5
4        0.5
5        1.5
6        2.5

Using S = \Sigma_{i=1}^{n} (X_i- \bar{X})^2

S = (-2.5)^2+(-1.5)^2+(-0.5)^2+(0.5)^2+(1.5)^2+(2.5)^2

S = 6.25+2.25+0.25+0.25+2.25+6.25

S = 17.50

Therefore, the sum of squares of the set is 17.50.

Problem 4: Calculate the sum of squares of the dataset Y = [3,4,5,1,7]

Solution:

In this case we need to calculate the mean first.

\bar{Y} = \frac{3+4+5+1+7}{5} = \frac{20}{5} = 4

Y        Y - \bar{Y}
3        -1
4        0
5        1
1        -3
7        3

Using S = \Sigma_{i=1}^{n} (Y_i - \bar{Y})^2

S = (-1)^2+(0)^2+(1)^2+(-3)^2+(3)^2

S = 1+0+1+9+9

S = 20

Therefore, the sum of squares of the set is 20.

Problem 5: Calculate the sum of squares of the set X = [1,4,6,8] if the mean is found to be 4.75.

Solution:

Given \bar{X} = 4.75

X        X - \bar{X}
1        -3.75
4        -0.75
6        1.25
8        3.25

Using S = \Sigma_{i=1}^{n} (X_i- \bar{X})^2

S = (-3.75)^2+(-0.75)^2+(1.25)^2+(3.25)^2

S = 14.0625+0.5625+1.5625+10.5625

S = 26.75

Therefore, the sum of squares of the set is 26.75.

