Correlation & Regression

Using this equation, the estimated selling price of a textbook can be calculated based on the number of pages. Interpretation: This regression equation can be used to estimate the selling price (dependent variable Y) of a textbook based on the number of pages (independent variable X) with an intercept of $48 and a slope of $0.05143 per page.

Uploaded by

AkibZ ART

Linear Correlation & Regression Analysis

Correlation & Regression Analysis 1


Chapter Six
Linear Regression and Correlation

GOALS
When you have completed this chapter, you will be able to:
ONE
Draw a scatter diagram.
TWO
Understand and interpret the terms dependent variable and
independent variable.
THREE
Calculate and interpret the coefficient of correlation, the
coefficient of determination, and the standard error of
estimate.
FOUR
Calculate the least squares regression line and interpret
the slope and intercept values.

Correlation Analysis

• Correlation Analysis is a group of statistical techniques used to measure the strength of the association between two variables.
• Types of correlation:
   a. Positive and negative
   b. Simple, multiple, and partial
   c. Linear and nonlinear

Correlation Analysis

• A Scatter Diagram is a chart that portrays the relationship between the two variables.
• The Dependent Variable is the variable being predicted or estimated.
• The Independent Variable provides the basis for estimation. It is the predictor variable.

The Coefficient of Correlation, r

The characteristics of the coefficient of correlation are:

• It requires interval or ratio-scaled data.
• It can range from -1.00 to 1.00.
• Values of -1.00 or 1.00 indicate perfect correlation.
• Values close to 0.0 indicate weak correlation.
• Negative values indicate an inverse relationship and positive values indicate a direct relationship.

[Scatter diagram: Perfect Negative Correlation (r = -1). X and Y axes run from 0 to 10.]

[Scatter diagram: Perfect Positive Correlation (r = +1). X and Y axes run from 0 to 10.]

[Scatter diagram: Zero Correlation (r = 0). X and Y axes run from 0 to 10.]

[Scatter diagram: Strong Positive Correlation. X and Y axes run from 0 to 10.]
Formula for r

We calculate the coefficient of correlation from the following formulas.

\[
r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}}
  = \frac{\sum XY - n\bar{X}\bar{Y}}{\sqrt{\left(\sum X^2 - n\bar{X}^2\right)\left(\sum Y^2 - n\bar{Y}^2\right)}}
\]

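The two forms of the formula for r can be checked against each other numerically. The following is a minimal sketch, not part of the original slides: the function names and the small data sets are hypothetical and chosen only for illustration.

```python
import math

def pearson_r_deviation(xs, ys):
    """r from deviations about the means (first form of the formula)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

def pearson_r_shortcut(xs, ys):
    """r from raw sums (second, computational form of the formula)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar
    s_xx = sum(x * x for x in xs) - n * x_bar ** 2
    s_yy = sum(y * y for y in ys) - n * y_bar ** 2
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical illustration data (not from the slides).
xs = [1, 2, 3, 4, 5]
rising = [2 * x + 1 for x in xs]   # points on a rising straight line
falling = [10 - x for x in xs]     # points on a falling straight line
scattered = [2, 1, 4, 3, 6]        # direct but imperfect relationship

print(pearson_r_deviation(xs, rising))    # approximately +1
print(pearson_r_deviation(xs, falling))   # approximately -1
print(pearson_r_deviation(xs, scattered), pearson_r_shortcut(xs, scattered))
```

Points lying exactly on a straight line give r of +1 or -1 depending on the slope's sign, and both forms of the formula return the same value up to floating-point rounding.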
Coefficient of Determination

The coefficient of determination (r²) is the proportion of the total variation in the dependent variable (Y) that is explained or accounted for by the variation in the independent variable (X).

Coefficient of Determination

The features of the coefficient of determination are:

• It is the square of the coefficient of correlation.
• It ranges from 0 to 1.
• It does not give any information on the direction of the relationship between the variables.
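The last feature can be seen numerically: a direct and an inverse relationship of the same strength yield the same r². This is a small sketch with hypothetical data and a hypothetical helper `pearson_r`; none of it comes from the slides.

```python
import math

def pearson_r(xs, ys):
    """Coefficient of correlation from deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical data: the second Y series is the first with its sign flipped,
# which turns a direct relationship into an inverse one of equal strength.
xs = [1, 2, 3, 4, 5]
ys_direct = [2, 1, 4, 3, 6]
ys_inverse = [-y for y in ys_direct]

r_direct = pearson_r(xs, ys_direct)
r_inverse = pearson_r(xs, ys_inverse)

print(r_direct, r_direct ** 2)     # positive r
print(r_inverse, r_inverse ** 2)   # negative r, same r squared
```

Because squaring discards the sign, r² alone cannot say whether Y rises or falls with X; r (or the slope b) is needed for that.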

Example # 1

Dan Ireland, the student body president at Toledo State University, is concerned about the cost to students of textbooks. He believes there is a relationship between the number of pages in the text and the selling price of the book. To provide insight into the problem he selects a sample of eight textbooks currently on sale in the bookstore. Draw a scatter diagram. Compute the correlation coefficient.

Example # 1 Continued

Book                  Pages   Price ($)
Intro to History        500      84
Basic Algebra           700      75
Intro to Psyc           800      99
Intro to Sociology      600      72
Bus. Mgmt               400      69
Intro to Biology        500      81
Fund. of Jazz           600      63
Princ. of Nursing       800      93
Example # 1 continued

[Scatter diagram of number of pages and selling price of text. X axis: Pages, 400 to 800. Y axis: Price ($), 60 to 100.]

Example # 1 continued

Book                  Pages (X)   Price (Y)        XY          X²        Y²
Intro to History          500         84        42,000     250,000     7,056
Basic Algebra             700         75        52,500     490,000     5,625
Intro to Psyc             800         99        79,200     640,000     9,801
Intro to Sociology        600         72        43,200     360,000     5,184
Bus. Mgmt                 400         69        27,600     160,000     4,761
Intro to Biology          500         81        40,500     250,000     6,561
Fund. of Jazz             600         63        37,800     360,000     3,969
Princ. of Nursing         800         93        74,400     640,000     8,649
Total                   4,900        636       397,200   3,150,000    51,606



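Example # 1 continued

\[
r = \frac{\sum XY - n\bar{X}\bar{Y}}{\sqrt{\left(\sum X^2 - n\bar{X}^2\right)\left(\sum Y^2 - n\bar{Y}^2\right)}}
  = \frac{397200 - 8 \times 612.5 \times 79.5}{\sqrt{\left(3150000 - 8 \times 612.5^2\right)\left(51606 - 8 \times 79.5^2\right)}}
  = 0.614
\]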

Example # 1 continued

Interpretation:
The correlation between the number of pages and the selling price of the book is 0.614. This indicates a fairly strong positive association between the two variables.
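The column totals and the value of r in Example # 1 can be reproduced with a few lines of Python. This is a minimal sketch using the eight (pages, price) pairs from the table; the variable names are assumptions made for illustration.

```python
import math

# Data from Example # 1: number of pages (X) and selling price in $ (Y).
pages = [500, 700, 800, 600, 400, 500, 600, 800]
prices = [84, 75, 99, 72, 69, 81, 63, 93]

n = len(pages)
sum_x = sum(pages)
sum_y = sum(prices)
sum_xy = sum(x * y for x, y in zip(pages, prices))
sum_x2 = sum(x * x for x in pages)
sum_y2 = sum(y * y for y in prices)

x_bar = sum_x / n   # 612.5
y_bar = sum_y / n   # 79.5

# Computational form of the formula for r.
numerator = sum_xy - n * x_bar * y_bar
denominator = math.sqrt((sum_x2 - n * x_bar ** 2) * (sum_y2 - n * y_bar ** 2))
r = numerator / denominator

print(sum_x, sum_y, sum_xy, sum_x2, sum_y2)
print(round(r, 3))  # 0.614
```

Squaring this r gives a coefficient of determination of about 0.377, so roughly 38% of the variation in price is accounted for by the variation in the number of pages.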

Regression Analysis

In regression analysis we use the independent variable (X) to estimate the dependent variable (Y).
• The relationship between the variables is linear.
• Both variables must be at least interval scale.
• The least squares criterion is used to determine the equation. That is, the term \(\sum (Y - Y')^2\) is minimized.

Regression Analysis

The regression equation: Y' = a + bX,
where:

Y' is the average predicted value of Y for any X.
a is the Y-intercept. It is the estimated Y value when X = 0.
b is the slope of the line, or the average change in Y' for each change of one unit in X.

The least squares principle is used to obtain a and b.

Regression Analysis

The least squares principle is used to obtain a and b. The equations to determine a and b are:

\[
b_{yx} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}
       = \frac{\sum xy - n\bar{x}\bar{y}}{\sum x^2 - n\bar{x}^2}
\]

and

\[
a = \bar{y} - b_{yx}\,\bar{x}
\]
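The step between the deviation form and the raw-sum form of the slope is not shown on the slide. A short derivation, using only the facts that \(\sum x = n\bar{x}\) and \(\sum y = n\bar{y}\), could run as follows:

```latex
\begin{align*}
\sum (x - \bar{x})(y - \bar{y})
  &= \sum xy - \bar{x}\sum y - \bar{y}\sum x + n\bar{x}\bar{y} \\
  &= \sum xy - n\bar{x}\bar{y} - n\bar{x}\bar{y} + n\bar{x}\bar{y} \\
  &= \sum xy - n\bar{x}\bar{y}, \\[4pt]
\sum (x - \bar{x})^2
  &= \sum x^2 - 2\bar{x}\sum x + n\bar{x}^2 \\
  &= \sum x^2 - n\bar{x}^2 .
\end{align*}
```

Dividing the first result by the second gives the computational form of the slope.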

Example # 2

Develop a regression equation for the information given in Example # 1 that can be used to estimate the selling price based on the number of pages.

Solution:

\[
b = \frac{397200 - 8 \times 612.5 \times 79.5}{3150000 - 8 \times (612.5)^2} = 0.05143
\]

\[
a = \frac{636}{8} - 0.05143 \times \frac{4900}{8} = 48.0
\]
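The slope and intercept above can be reproduced from the raw data. The following sketch applies the computational least squares formulas to the Example # 1 data; the function name `least_squares` is hypothetical.

```python
# Data from Example # 1: number of pages (X) and selling price in $ (Y).
pages = [500, 700, 800, 600, 400, 500, 600, 800]
prices = [84, 75, 99, 72, 69, 81, 63, 93]

def least_squares(xs, ys):
    """Return (a, b) for the line Y' = a + bX using the raw-sum formulas."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar
    s_xx = sum(x * x for x in xs) - n * x_bar ** 2
    b = s_xy / s_xx          # slope
    a = y_bar - b * x_bar    # intercept
    return a, b

a, b = least_squares(pages, prices)
print(f"a = {a:.1f}, b = {b:.5f}")  # a = 48.0, b = 0.05143
```

The numerator here is the same quantity that appears in the numerator of r, which is why the slope and r always share a sign.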

Example # 2 continued

The regression equation is:

Y' = 48.0 + 0.05143X

The equation crosses the Y-axis at $48. A book with no pages would cost $48.
The slope of the line is 0.05143. Each additional page costs about a nickel.
The sign of the b value and the sign of r will always be the same.

Example # 2 continued

We can use the regression equation to estimate values of Y. The estimated selling price of an 800-page book is $89.14, found by:

\[
Y' = 48.0 + 0.05143X = 48.0 + 0.05143(800) = 89.14
\]

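Estimating a price from the fitted line is a one-line calculation. This sketch wraps it in a hypothetical function `predict_price`, using the rounded coefficients a = 48.0 and b = 0.05143 from Example # 2.

```python
# Rounded coefficients from Example # 2.
A = 48.0        # intercept, in dollars
B = 0.05143     # slope, in dollars per page

def predict_price(pages):
    """Estimated selling price Y' = a + bX for a book with the given page count."""
    return A + B * pages

print(round(predict_price(800), 2))  # 89.14
```

The sample covers books of 400 to 800 pages, so estimates for page counts well outside that range (including X = 0) should be treated with caution.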
The Standard Error of Estimate

The standard error of estimate measures the scatter, or dispersion, of the observed values around the line of regression.

The Standard Error of Estimate

• The formulas that are used to compute the standard error:

\[
s_{y \cdot x} = \sqrt{\frac{\sum (Y - Y')^2}{n - 2}}
             = \sqrt{\frac{\sum Y^2 - a\sum Y - b\sum XY}{n - 2}}
\]

Example # 3

Find the standard error of estimate for the problem involving the number of pages in a book and the selling price.

\[
s_{y \cdot x} = \sqrt{\frac{\sum Y^2 - a\sum Y - b\sum XY}{n - 2}}
             = \sqrt{\frac{51606 - 48(636) - 0.05143(397200)}{8 - 2}}
             = 10.408
\]

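Both forms of the standard error formula can be evaluated on the Example # 1 data. In this sketch (variable names are assumptions), the definitional form uses unrounded a and b, while the shortcut form uses the rounded values a = 48 and b = 0.05143 exactly as in Example # 3.

```python
import math

# Data from Example # 1: number of pages (X) and selling price in $ (Y).
pages = [500, 700, 800, 600, 400, 500, 600, 800]
prices = [84, 75, 99, 72, 69, 81, 63, 93]
n = len(pages)

# Unrounded least squares coefficients.
x_bar = sum(pages) / n
y_bar = sum(prices) / n
b = (sum(x * y for x, y in zip(pages, prices)) - n * x_bar * y_bar) / (
    sum(x * x for x in pages) - n * x_bar ** 2
)
a = y_bar - b * x_bar

# Definitional form: squared residuals about the regression line.
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(pages, prices))
s_exact = math.sqrt(sse / (n - 2))

# Shortcut form with the rounded coefficients used in Example # 3.
a_rounded, b_rounded = 48.0, 0.05143
sum_y = sum(prices)
sum_y2 = sum(y * y for y in prices)
sum_xy = sum(x * y for x, y in zip(pages, prices))
s_rounded = math.sqrt((sum_y2 - a_rounded * sum_y - b_rounded * sum_xy) / (n - 2))

print(round(s_exact, 3), round(s_rounded, 3))  # 10.413 10.408
```

The small gap between the two results comes only from rounding b to 0.05143; the shortcut form subtracts large numbers, so it is sensitive to rounding in the coefficients.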
Assumptions Underlying Linear Regression

• For each value of X, there is a group of Y values, and these Y values are normally distributed.
• The standard deviations of these normal distributions are equal.
• The Y values are statistically independent. This means that in the selection of a sample, the Y values chosen for a particular X value do not depend on the Y values for any other X values.
• The means of these normal distributions of Y values all lie on the straight line of regression.
