
Data Management (Correlation and Regression)


LINEAR CORRELATION AND REGRESSION
LEARNING OBJECTIVES

At the end of the lesson, the student will be able to:

• Use the methods of linear regression and correlation to predict the value of a variable given certain conditions; and

• Advocate the use of statistical data in making important decisions.
CORRELATION

• A correlation is a relationship
between two variables.

• A correlation coefficient is a
numerical measure of the linear
relationship between two
variables.

• It has a range extending from –1 to +1.
Research studies deal with relationships between two
or more variables.
CORRELATION
• A direct or positive relationship between two
variables means that an increase in the value of
one variable corresponds to an increase in the
value of the other variable.

• An inverse or negative relationship between two variables means that an increase in the value of one variable corresponds to a decrease in the value of the other variable.

• A zero or no relationship between two variables means that if one variable has a high value, the other variable may have either a high or a low value.
CORRELATION
The Pearson Product Moment
Correlation reveals the magnitude
and direction of the relationship.

The Pearson Product Moment Correlation coefficient (rxy) is a measure of the correlation between two variables that are measured on either a ratio or an interval scale.
It was derived by the famous British statistician Karl Pearson.
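Pearson Correlation Coefficient

$$ r_{xy} = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} $$

Where:
rxy = degree of relationship between x and y
x = observed data for the independent variable
y = observed data for the dependent variable
n = sample size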
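As a rough illustration, the coefficient can be computed from raw sums: n∑xy − ∑x∑y divided by the square root of [n∑x² − (∑x)²][n∑y² − (∑y)²]. The sketch below assumes that raw-sums form; the function name `pearson_r` and the sample data are hypothetical and are not taken from the lesson.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson r computed from raw sums (an illustrative helper, not from the lesson)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when either variable is constant")
    return numerator / denominator


# Hypothetical data, used only to exercise the function.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
print(round(pearson_r(hours, score), 4))
```

A perfectly linear increasing pair such as `[1, 2, 3]` and `[2, 4, 6]` yields +1, and reversing one list yields –1, which matches the stated range of the coefficient.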
Pearson Correlation Coefficient

► Below are the proposed guidelines for interpreting the Pearson correlation coefficient:

Verbal Interpretation     Positive Correlation Coefficient     Negative Correlation Coefficient
Slight correlation        0.00 to 0.20                         0.00 to -0.20
Low correlation           0.21 to 0.40                         -0.21 to -0.40
Moderate correlation      0.41 to 0.60                         -0.41 to -0.60
High correlation          0.61 to 0.80                         -0.61 to -0.80
Very high correlation     0.81 to 1.00                         -0.81 to -1.00
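The guideline table can be expressed as a small lookup. This is only a sketch: the function name `interpret_r` is hypothetical, and rounding the magnitude to two decimals before choosing a band is an assumption made here to handle values that fall between the listed ranges (for example 0.205).

```python
def interpret_r(r):
    """Map a correlation coefficient to the verbal labels in the guideline table."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")

    # Assumption: round |r| to two decimals so every value lands in exactly one band.
    magnitude = round(abs(r), 2)
    bands = [
        (0.20, "Slight"),
        (0.40, "Low"),
        (0.60, "Moderate"),
        (0.80, "High"),
        (1.00, "Very high"),
    ]
    label = next(name for upper, name in bands if magnitude <= upper)

    # The sign of r gives the direction of the relationship.
    if r > 0:
        direction = "positive "
    elif r < 0:
        direction = "negative "
    else:
        direction = ""
    return f"{label} {direction}correlation"


print(interpret_r(0.89))
print(interpret_r(-0.35))
```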
A research study was conducted to determine the correlation between students' grades in English and their grades in Mathematics. A random sample of 10 students was taken, and the results of the sampling are tabulated below.

Student Number     English Grade     Mathematics Grade
1                  93                91
2                  89                86
3                  84                80
4                  91                88
5                  90                89
6                  83                87
7                  75                78
8                  81                78
9                  84                85
10                 77                76
To determine whether a relationship exists between the two variables, Pearson's rxy is used.

Let x = grade in English
    y = grade in Mathematics
To compute ∑xy, ∑x² and ∑y²:

Number     x       y       xy        x²        y²
1          93      91      8463      8649      8281
2          89      86      7654      7921      7396
3          84      80      6720      7056      6400
4          91      88      8008      8281      7744
5          90      89      8010      8100      7921
6          83      87      7221      6889      7569
7          75      78      5850      5625      6084
8          81      78      6318      6561      6084
9          84      85      7140      7056      7225
10         77      76      5852      5929      5776
n = 10     847     838     71236     72067     70480
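The substitution can be written out step by step. This is a sketch that assumes the raw-sums form of Pearson's formula, which uses exactly the totals tabulated here (n = 10, ∑x = 847, ∑y = 838, ∑xy = 71236, ∑x² = 72067, ∑y² = 70480):

```latex
\begin{aligned}
r_{xy} &= \frac{n\sum xy - \sum x \sum y}
               {\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]
                      \left[n\sum y^2 - \left(\sum y\right)^2\right]}} \\[4pt]
       &= \frac{10(71236) - (847)(838)}
               {\sqrt{\left[10(72067) - 847^2\right]\left[10(70480) - 838^2\right]}} \\[4pt]
       &= \frac{712360 - 709786}
               {\sqrt{(720670 - 717409)(704800 - 702244)}} \\[4pt]
       &= \frac{2574}{\sqrt{(3261)(2556)}}
        \approx \frac{2574}{2887.06}
        \approx 0.89
\end{aligned}
```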
Since rxy ≈ 0.89 falls within the 0.81 to 1.00 range, there is a very high positive correlation between grades in English and grades in Mathematics.
Activity #7

A research study was conducted to determine the correlation between students' grades in English and their grades in Mathematics. A random sample of 10 students was taken, and the results of the sampling are tabulated below.

Student Number     English Grade     Mathematics Grade
1                  15                16
2                  16                18
3                  13                12
4                  11                10
5                  5                 6
6                  4                 5
7                  6                 7
8                  9                 8
9                  13                14
10                 11                12
REGRESSION ANALYSIS

Regression Analysis is used when predicting the behavior of a variable.

The regression equation explains the amount of variation observable in the dependent variable y in terms of the independent variable x.

It is the equation of a line called the line of best fit.
USES OF REGRESSION ANALYSIS

► Determining the strength of predictors

► Forecasting an effect

► Trend forecasting
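The equation of the line of best fit is written in the form:

$$ y = mx + b $$

Where:
y = criterion measure
x = predictor
m = slope of the line
b = y-intercept

To determine the values of m and b, use the following formulas:

$$ m = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2} \qquad b = \frac{\sum y - m\sum x}{n} $$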
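A minimal sketch of how m and b might be computed, assuming the standard least-squares formulas in raw-sums form: m = (n∑xy − ∑x∑y) / (n∑x² − (∑x)²) and b = (∑y − m∑x) / n. The function name `line_of_best_fit` and the sample data are hypothetical.

```python
def line_of_best_fit(x, y):
    """Return (m, b) for y = m*x + b using least-squares raw-sum formulas."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)

    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("the slope is undefined when all x values are equal")

    m = (n * sum_xy - sum_x * sum_y) / denominator  # slope
    b = (sum_y - m * sum_x) / n                     # y-intercept
    return m, b


# Hypothetical points that lie exactly on y = 2x + 1.
m, b = line_of_best_fit([1, 2, 3, 4], [3, 5, 7, 9])
print(m, b)  # 2.0 1.0
```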

LINE OF BEST FIT

A research study was conducted to determine the correlation between students' grades in English and their grades in Mathematics. A random sample of 10 students was taken, and the results of the sampling are tabulated below.

Student Number     English Grade     Mathematics Grade
1                  93                91
2                  89                86
3                  84                80
4                  91                88
5                  90                89
6                  83                87
7                  75                78
8                  81                78
9                  84                85
10                 77                76
To compute ∑xy, ∑x, ∑x² and ∑y:

Number     x       y       xy        x²
1          93      91      8463      8649
2          89      86      7654      7921
3          84      80      6720      7056
4          91      88      8008      8281
5          90      89      8010      8100
6          83      87      7221      6889
7          75      78      5850      5625
8          81      78      6318      6561
9          84      85      7140      7056
10         77      76      5852      5929
n = 10     847     838     71236     72067
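A worked substitution for the slope and intercept, assuming the standard least-squares formulas in raw-sums form (m for the slope, b for the y-intercept) and the totals n = 10, ∑x = 847, ∑y = 838, ∑xy = 71236, ∑x² = 72067:

```latex
\begin{aligned}
m &= \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}
   = \frac{10(71236) - (847)(838)}{10(72067) - 847^2}
   = \frac{2574}{3261}
   \approx 0.7893 \\[4pt]
b &= \frac{\sum y - m\sum x}{n}
   = \frac{838 - (0.7893284)(847)}{10}
   \approx 16.9439
\end{aligned}
```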
From these totals, the equation of the line of best fit is approximately:

$$ y = 0.7893x + 16.9439 $$
REGRESSION ANALYSIS

Given the line of best fit:

$$ y = 0.7893x + 16.9439 $$

predict the student's grade in Mathematics for each of the following grades in English:
a. 70

b. 80

c. 100
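The three predictions can be checked with a short script. This sketch recomputes the slope and intercept from the lesson's totals (∑x = 847, ∑y = 838, ∑xy = 71236, ∑x² = 72067, n = 10) using the standard least-squares formulas, which is an assumption about the method intended here; the name `predict_math` is hypothetical.

```python
# Totals taken from the English/Mathematics example table.
N = 10
SUM_X, SUM_Y = 847, 838
SUM_XY, SUM_X2 = 71236, 72067

# Least-squares slope and intercept in raw-sums form.
m = (N * SUM_XY - SUM_X * SUM_Y) / (N * SUM_X2 - SUM_X ** 2)
b = (SUM_Y - m * SUM_X) / N


def predict_math(english_grade):
    """Predicted Mathematics grade for a given English grade."""
    return m * english_grade + b


for grade in (70, 80, 100):
    print(grade, round(predict_math(grade), 2))
```

With unrounded coefficients the predictions come out to roughly 72.20, 80.09 and 95.88. Note that 70 and 100 lie outside the observed English grades (75 to 93), so those two values are extrapolations.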
Recitation #3

Student Number     English Grade     Mathematics Grade
1                  15                16
2                  16                18
3                  13                12
4                  11                10
5                  5                 6
6                  4                 5
7                  6                 7
8                  9                 8
9                  13                14
10                 11                12

Predict the student's grade in Mathematics for each of the following grades in English:
a. 10
b. 12
c. 20
d. 25
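Calculation Summary
Sum of X = 103
Sum of Y = 108
Mean X (MX) = 10.3
Mean Y (MY) = 10.8
Sum of X² = 1219
Sum of XY = 1272
Sum of squares (SSX) = 1219 - (103²/10) = 158.1
Sum of products (SP) = 1272 - (103 × 108/10) = 159.6

Regression Equation = ŷ = bX + a (here b is the slope and a is the y-intercept)

b = SP/SSX = 159.6/158.1 = 1.0095

a = MY - bMX = 10.8 - (1.0095*10.3) = 0.40215

ŷ = 1.0095X + 0.40215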
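The summary uses the deviation form of the regression formulas: SSX is the sum of squared deviations of X from its mean, SP is the sum of products of the X and Y deviations, the slope is SP/SSX, and the intercept is MY − b·MX. A minimal sketch that reproduces these figures from the Recitation #3 data follows; the function name `regression_from_deviations` is hypothetical.

```python
def regression_from_deviations(x, y):
    """Return (b, a, ssx, sp) for y-hat = b*x + a using deviations from the means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    ssx = sum((xi - mean_x) ** 2 for xi in x)                        # sum of squares of X
    sp = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))  # sum of products
    b = sp / ssx                                                     # slope
    a = mean_y - b * mean_x                                          # intercept
    return b, a, ssx, sp


# Recitation #3 data: English (x) and Mathematics (y) grades.
english = [15, 16, 13, 11, 5, 4, 6, 9, 13, 11]
mathematics = [16, 18, 12, 10, 6, 5, 7, 8, 14, 12]

b, a, ssx, sp = regression_from_deviations(english, mathematics)
print(round(ssx, 1), round(sp, 1), round(b, 4), round(a, 4))
```

Carrying the unrounded slope through gives an intercept of about 0.4023; the summary's 0.40215 results from using the slope already rounded to 1.0095.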
Activity #8

Seven college students of UCC have the following monthly family income and final grade in MMW.

Student     Family Income     Final Grade
A           30,000            1.25
B           21,000            1.75
C           45,000            3.00
D           54,000            2.75
E           86,000            3.00
F           34,000            2.25
G           49,000            2.50

1. Determine the correlation coefficient.
2. Using linear regression, predict the monthly family income of a student with each of the following final grades in MMW:
a. 1.00
b. 1.50
c. 2.00
d. 5.00
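Calculation Summary (X = final grade, Y = monthly family income)
Sum of X = 16.5
Sum of Y = 319000
Mean X = 2.3571
Mean Y = 45571.4286
Sum of squares (SSX) = 2.6071
Sum of products (SP) = 62821.4286

Regression Equation = ŷ = bX + a

b = SP/SSX = 62821.43/2.61 = 24095.89041 (operands shown rounded; the result uses the unrounded values)

a = MY - bMX = 45571.43 - (24095.89*2.36) = -11226.0274 (operands shown rounded; the result uses the unrounded values)

ŷ = 24095.89041X - 11226.0274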
