
The Sums of Squares

Regression, correlation, Analysis of Variance, and other important statistical models all rely on a single key concept: the sum of the squared deviations of a quantity from its mean. You saw this in elementary statistics as the numerator of the variance of a variable. I will discuss these sums of squares in the context of simple linear regression. The "Total Sum of Squares" $SST = SS_{yy} = \sum_{i=1}^{n}(y_i - \bar{y})^2$ measures the variability of the dependent variable. It is equal to $n-1$ times the sample variance of y: $s_y^2 = \frac{SS_{yy}}{n-1}$.
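To make this concrete, here is a minimal Python sketch (using a small made-up sample, not data from the book) that computes $SS_{yy}$ from the definition and confirms it equals $n-1$ times the sample variance:

```python
# Sketch: SS_yy = sum((y_i - ybar)^2) for a made-up sample,
# checked against (n - 1) times the sample variance of y.
from statistics import variance  # sample variance, divides by n - 1

y = [3.0, 5.0, 4.0, 8.0, 10.0]   # hypothetical observations
n = len(y)
ybar = sum(y) / n

SS_yy = sum((yi - ybar) ** 2 for yi in y)

print(SS_yy)                     # 34.0
print((n - 1) * variance(y))     # 34.0 -- the same quantity
```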


$SS_{xx} = \sum_{i=1}^{n}(x_i - \bar{x})^2$ measures the variability of the independent variable. It is equal to $n-1$ times the sample variance of x: $s_x^2 = \frac{SS_{xx}}{n-1}$.
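The same kind of check works for $SS_{xx}$; here is the analogous sketch with a hypothetical x sample:

```python
# Sketch: SS_xx = sum((x_i - xbar)^2), equal to (n - 1) * sample variance of x.
from statistics import variance

x = [1.0, 2.0, 3.0, 4.0, 5.0]    # hypothetical predictor values
n = len(x)
xbar = sum(x) / n

SS_xx = sum((xi - xbar) ** 2 for xi in x)

print(SS_xx)                     # 10.0
print((n - 1) * variance(x))     # 10.0
```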

$SS_{xy} = \sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$ measures the tendency of x and y to vary together. It can be negative if high x goes with low y, positive if high x goes with high y, or zero if x and y are unrelated. It is $n-1$ times the covariance of x and y: $\mathrm{Cov}[x,y] = \frac{SS_{xy}}{n-1}$. Note that the covariance of any variable with itself is its variance: $\mathrm{Cov}[x,x] = s_x^2$.

The estimated slope of the regression line, $b_1$, is the covariance divided by the variance: $b_1 = \frac{SS_{xy}}{SS_{xx}}$. (The book also gives alternate forms that get the same answer with less arithmetic, but our goal here is to master the concepts, and the shortcut forms obscure those.)
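Here is a sketch, using the same hypothetical x and y values as above, that computes $SS_{xy}$, the sample covariance, and the slope $b_1 = SS_{xy}/SS_{xx}$:

```python
# Sketch: SS_xy, the sample covariance SS_xy / (n - 1), and the slope b1.
x = [1.0, 2.0, 3.0, 4.0, 5.0]    # hypothetical data
y = [3.0, 5.0, 4.0, 8.0, 10.0]
n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n

SS_xx = sum((xi - xbar) ** 2 for xi in x)
SS_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

cov_xy = SS_xy / (n - 1)         # sample covariance of x and y
b1 = SS_xy / SS_xx               # estimated slope; also cov_xy / s_x^2

print(SS_xy, cov_xy, b1)         # 17.0  4.25  1.7
```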

The estimated y-intercept of the regression line is $b_0 = \bar{y} - b_1\bar{x}$.
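Continuing the same made-up example, the intercept follows in one line, and the fitted line necessarily passes through the point of means $(\bar{x}, \bar{y})$:

```python
# Sketch: b0 = ybar - b1 * xbar; the fitted line passes through (xbar, ybar).
x = [1.0, 2.0, 3.0, 4.0, 5.0]    # same hypothetical data as above
y = [3.0, 5.0, 4.0, 8.0, 10.0]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

SS_xx = sum((xi - xbar) ** 2 for xi in x)
SS_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = SS_xy / SS_xx
b0 = ybar - b1 * xbar

print(b0, b1)                                   # approximately 0.9 and 1.7
print(abs((b0 + b1 * xbar) - ybar) < 1e-9)      # True: line hits (xbar, ybar)
```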

The point estimate corresponding to each specific $y_i$ is $\hat{y}_i = b_0 + b_1 x_i$. The "Residual" for each specific $y_i$ is $y_i - (b_0 + b_1 x_i)$. The Error Sum of Squares SSE is the sum of the squared residuals: $SSE = \sum_{i=1}^{n}\left(y_i - [b_0 + b_1 x_i]\right)^2$. The variance of the residuals is $s^2 = \frac{SSE}{n-2}$; the standard error s is the square root of this quantity.
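A sketch of the fitted values, residuals, SSE, and standard error for the same hypothetical data:

```python
import math

# Sketch: fitted values, residuals, SSE, and s = sqrt(SSE / (n - 2)).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 4.0, 8.0, 10.0]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

SS_xx = sum((xi - xbar) ** 2 for xi in x)
SS_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = SS_xy / SS_xx
b0 = ybar - b1 * xbar

y_hat     = [b0 + b1 * xi for xi in x]            # point estimates
residuals = [yi - yh for yi, yh in zip(y, y_hat)]

SSE = sum(e ** 2 for e in residuals)              # error sum of squares
s2  = SSE / (n - 2)                               # variance of the residuals
s   = math.sqrt(s2)                               # standard error

print(SSE, s2, s)                                 # approx. 5.1, 1.7, 1.30
```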

The Regression Sum of Squares SSR measures the total amount of variation in y that is accounted for ("explained") by the variation in x: $SSR = SST - SSE = SS_{yy} - SSE$.

The Simple Coefficient of Determination is $R^2 = \frac{SSR}{SST} = \frac{SS_{yy} - SSE}{SS_{yy}}$.
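Finally, a sketch that puts the pieces together for the same made-up sample: SST, SSE, SSR, and $R^2$:

```python
# Sketch: SSR = SST - SSE and R^2 = SSR / SST for the hypothetical sample.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 4.0, 8.0, 10.0]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

SS_xx = sum((xi - xbar) ** 2 for xi in x)
SS_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = SS_xy / SS_xx
b0 = ybar - b1 * xbar

SST = sum((yi - ybar) ** 2 for yi in y)                        # total variation
SSE = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))  # unexplained
SSR = SST - SSE                                                # explained

R2 = SSR / SST                   # equivalently (SS_yy - SSE) / SS_yy
print(SST, SSE, SSR, R2)         # roughly 34.0, 5.1, 28.9, 0.85
```

For this sample $R^2$ is about 0.85, meaning roughly 85% of the variation in y is explained by the variation in x; it also equals the square of the sample correlation between x and y.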
