CG DADL - 2024 June - Lecture 03
$$SSE = \sum_{i=1}^{S} \varepsilon_i^2$$
$$\alpha = \bar{y} - \beta \bar{x}$$
$$\bar{x} = \frac{1}{S} \sum_{i=1}^{S} x_i, \qquad \bar{y} = \frac{1}{S} \sum_{i=1}^{S} y_i$$
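The intercept formula above can be sketched in a few lines of Python. This is only an illustrative sketch: the slope uses the standard least-squares estimate (sum of cross-deviations divided by the sum of squared x-deviations), which does not appear in this excerpt, and the function name and data values are hypothetical.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (alpha, beta) for the least-squares line y = alpha + beta * x."""
    s = len(xs)                      # S: number of observations
    x_bar = sum(xs) / s              # sample mean of x
    y_bar = sum(ys) / s              # sample mean of y
    # Standard least-squares slope (assumed; not shown in this excerpt).
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    beta = s_xy / s_xx
    # Intercept from the formula above: alpha = y_bar - beta * x_bar.
    alpha = y_bar - beta * x_bar
    return alpha, beta


# Hypothetical data, made up for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

alpha, beta = fit_simple_linear_regression(xs, ys)
print(f"alpha = {alpha:.4f}, beta = {beta:.4f}")
```

Because the intercept is defined as the mean of y minus the slope times the mean of x, the fitted line always passes through the point of means.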
11 CG DADL (June 2024) Lecture 3 – Simple Linear Regression
Simple Linear Regression (cont.)
Suppose we have a linear equation $y = 2 + 3x$ in which $SSE = 0$:
src01
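A minimal sketch of this case, assuming a handful of x values chosen only for illustration: when every observation lies exactly on y = 2 + 3x, every residual is zero and so is SSE. Moving a single point off the line makes SSE positive. The helper name `sse` is hypothetical.

```python
def sse(alpha, beta, xs, ys):
    """Sum of squared residuals of the line y = alpha + beta * x."""
    return sum((y - (alpha + beta * x)) ** 2 for x, y in zip(xs, ys))


xs = [0, 1, 2, 3, 4]                     # illustrative x values
ys_on_line = [2 + 3 * x for x in xs]     # points exactly on y = 2 + 3x

sse_exact = sse(2, 3, xs, ys_on_line)    # all residuals are zero

ys_perturbed = list(ys_on_line)
ys_perturbed[2] += 1                     # move one point 1 unit off the line
sse_perturbed = sse(2, 3, xs, ys_perturbed)

print(sse_exact, sse_perturbed)
```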
Corrected Total Sum of Squares = $\sum_{i=1}^{S} (y_i - \bar{y})^2$
$$\sum_{i=1}^{S} (y_i - \bar{y}) = 0$$
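The two sums above can be checked numerically. The raw deviations from the mean always cancel out, which is why the deviations are squared before summing. The sketch below uses hypothetical y values.

```python
# Hypothetical observations, for illustration only.
ys = [5, 8, 11, 14]

y_bar = sum(ys) / len(ys)                 # sample mean of y
deviations = [y - y_bar for y in ys]      # y_i - y_bar

sum_dev = sum(deviations)                 # raw deviations cancel out
sst = sum(d ** 2 for d in deviations)     # corrected total sum of squares

print(f"sum of deviations = {sum_dev}, corrected total SS = {sst}")
```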
Analysis of Variance
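• F-value = MS(Model) / MS(Error) = 57.08, i.e. the ratio of the model mean square to the error mean square
• The F-value has n and m − n − 1 degrees of freedom (DF), where n is the number of independent variables and m is the number of observations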
• The corresponding p-value is < .0001,
indicating that at least one of the
independent variables is useful for predicting
the dependent variable.
• In this case, there is only 1 independent
variable: the value of height is useful for
predicting the value of weight.
• The area under the curve to the left of -7.555 and to the right of +7.555 is less than 0.0001.
• We reject the null hypothesis and conclude that the slope β is not 0, i.e. the variable height is
useful for predicting the dependent variable weight.
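With a single independent variable, the F-test and the two-sided t-test on the slope are equivalent, and the F-value equals the square of the t-value. A quick arithmetic check with the values reported above (variable names are illustrative):

```python
t_value = 7.555      # t-statistic for the slope, as reported above
f_value = 57.08      # F-value from the analysis of variance, as reported above

t_squared = t_value ** 2
print(f"t^2 = {t_squared:.4f}, F = {f_value}")

# The small gap comes from rounding in the reported figures.
difference = abs(t_squared - f_value)
```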
Validation of Model – Coefficient of
Linear Correlation
• In a simple linear regression model, the coefficient of determination equals the square of the
coefficient of linear correlation between X and Y.
• In our example: X = Height; Y = Weight
• r = 0.877785
• R² = r² = 0.877785 × 0.877785 ≈ 0.7705
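The identity between R² and the squared correlation can be verified on any data set. The sketch below uses hypothetical data (not the height and weight data from the example) and computes R² as 1 − SSE/SST from a least-squares fit, then compares it with the squared correlation coefficient. The last lines repeat the arithmetic with the value of r reported above.

```python
import math

# Hypothetical data, made up for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n_obs = len(xs)
x_bar = sum(xs) / n_obs
y_bar = sum(ys) / n_obs

s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_yy = sum((y - y_bar) ** 2 for y in ys)    # corrected total sum of squares

# Coefficient of linear correlation between X and Y.
r = s_xy / math.sqrt(s_xx * s_yy)

# Least-squares fit and coefficient of determination R^2 = 1 - SSE / SST.
beta = s_xy / s_xx
alpha = y_bar - beta * x_bar
sse = sum((y - (alpha + beta * x)) ** 2 for x, y in zip(xs, ys))
r_squared = 1 - sse / s_yy

print(f"r^2 = {r ** 2:.6f}, R^2 = {r_squared:.6f}")

# Same arithmetic with the value reported in the example.
r_example = 0.877785
r_squared_example = r_example ** 2
print(f"{r_squared_example:.4f}")
```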
• QQ plot on the left shows the residuals in the child’s weight example.
• If the residuals are normally distributed, the data points fall (approximately) on a straight line.
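The coordinates behind a QQ plot can be sketched with the Python standard library alone. This is an illustrative sketch, not the procedure used to produce the plot in the lecture: the residual values are hypothetical, the function name `qq_points` is made up, and the plotting position (i − 0.5)/n is one common convention among several.

```python
from statistics import NormalDist, mean, stdev


def qq_points(residuals):
    """Return (theoretical quantile, standardized residual) pairs."""
    n = len(residuals)
    ordered = sorted(residuals)
    mu, sigma = mean(ordered), stdev(ordered)
    std_normal = NormalDist()            # standard normal: mean 0, sd 1
    points = []
    for i, res in enumerate(ordered, start=1):
        p = (i - 0.5) / n                # plotting position (one convention)
        theoretical = std_normal.inv_cdf(p)
        points.append((theoretical, (res - mu) / sigma))
    return points


# Hypothetical residuals, made up for illustration only.
residuals = [-1.2, 0.4, -0.3, 0.9, 0.1, -0.6, 1.1, -0.4]

points = qq_points(residuals)
for theoretical, observed in points:
    print(f"{theoretical:8.3f} {observed:8.3f}")
```

Plotting the second column against the first gives the QQ plot; points lying close to the line y = x suggest that the residuals are approximately normal.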