Lecture 7 8 Weeks Correlation and Regression

The document explains correlation as a relationship between two variables, represented by ordered pairs, and introduces the correlation coefficient (r) as a measure of the strength and direction of this relationship. It details the range of r from -1 to 1, indicating strong positive, strong negative, or no correlation. Additionally, it discusses residuals in regression analysis and provides an example of predicting a test score based on TV watching hours using a regression equation.

Correlation and Regression
Correlation
A correlation is a relationship between two variables.
The data can be represented by the ordered pairs (x, y),
where x is the independent (or explanatory) variable
and y is the dependent (or response) variable.
Example:

x    1    2    3    4    5
y   –4   –2   –1    0    2

[Figure: scatter plot of the ordered pairs (x, y) from the table]
Correlation Coefficient
The correlation coefficient is a measure of the
strength and the direction of a linear relationship
between two variables. The symbol r represents the
sample correlation coefficient. The formula for r is

$$
r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}}
$$

The range of the correlation coefficient is –1 to 1. If x
and y have a strong positive linear correlation, r is
close to 1. If x and y have a strong negative linear
correlation, r is close to –1. If there is no linear
correlation or a weak linear correlation, r is close to 0.
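
As a minimal sketch, the formula for r can be applied to the five ordered pairs from the earlier example (x = 1, 2, 3, 4, 5 and y = –4, –2, –1, 0, 2). The function name correlation_coefficient is chosen here for illustration and does not come from the lecture.

```python
from math import sqrt


def correlation_coefficient(xs, ys):
    """Sample correlation coefficient r, computed from the sums in the formula."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


# Example data from the Correlation slide.
x = [1, 2, 3, 4, 5]
y = [-4, -2, -1, 0, 2]

r = correlation_coefficient(x, y)
print(f"r = {r:.3f}")  # r = 0.990
```

Here the sums are Σx = 15, Σy = –5, Σxy = –1, Σx² = 55 and Σy² = 25, so r = 70 / (√50 · √100) ≈ 0.990, which indicates a strong positive linear correlation.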

Linear Correlation
[Figure: four scatter plots of y against x, one per case below]

r = –0.91: strong negative correlation
r = 0.88: strong positive correlation
r = 0.42: weak positive correlation
r = 0.07: nonlinear correlation
Residuals
After verifying that the linear correlation between two
variables is significant, next we determine the equation
of the line that can be used to predict the value of y for
a given value of x.
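For a given x-value,

d = (observed y-value) – (predicted y-value)

[Figure: scatter plot with a regression line; d1, d2 and d3 mark the vertical distances between the observed y-values and the predicted y-values on the line]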
For each data point, di is the difference between
the observed y-value and the predicted y-value for a
given x-value on the line. These differences are called
residuals.
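
The sketch below computes residuals for the five example points from the Correlation slide. The lecture text does not say how the line is obtained, so this sketch assumes the usual least-squares line; the function names least_squares_line and residuals are illustrative only.

```python
def least_squares_line(xs, ys):
    """Slope and intercept of the least-squares line (an assumption of this sketch)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    intercept = sum_y / n - slope * sum_x / n
    return slope, intercept


def residuals(xs, ys, slope, intercept):
    """d = (observed y-value) - (predicted y-value) for each data point."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


# Example data from the Correlation slide.
x = [1, 2, 3, 4, 5]
y = [-4, -2, -1, 0, 2]

m, b = least_squares_line(x, y)
d = residuals(x, y, m, b)

print(f"line: y-hat = {m:.1f}x + ({b:.1f})")
for xi, di in zip(x, d):
    print(f"x = {xi}: d = {di:+.1f}")
```

For this data the fitted line is ŷ = 1.4x – 5.2, and the residuals are approximately –0.2, 0.4, 0.0, –0.4 and 0.2. Positive residuals belong to points above the line and negative residuals to points below it.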
Regression equation
Example continued:
Using the equation ŷ = –4.07x + 93.97, we can predict
the test score for a student who watches 9 hours of TV.

ŷ = –4.07x + 93.97
= –4.07(9) + 93.97
= 57.34

A student who watches 9 hours of TV over the weekend
can expect to receive about a 57.34 on Monday’s test.
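
The same substitution can be written as a short function. The slope and intercept are taken from the regression equation above; the function name predict_test_score is an illustrative choice.

```python
# Coefficients from the regression equation y-hat = -4.07x + 93.97.
SLOPE = -4.07
INTERCEPT = 93.97


def predict_test_score(tv_hours):
    """Predicted test score for a given number of hours of TV watched."""
    return SLOPE * tv_hours + INTERCEPT


score = predict_test_score(9)
print(f"{score:.2f}")  # 57.34
```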

Linear Correlation
[Figure: four scatter plots of y against x, one per case below]

Negative linear correlation: as x increases, y tends to decrease.
Positive linear correlation: as x increases, y tends to increase.
No correlation
Nonlinear correlation
