Data Management (Correlation and Regression)
CORRELATION AND REGRESSION
At the end of the lesson, the student
will be able to:
• A correlation is a relationship between two variables.
• A correlation coefficient is a numerical measure of the linear relationship between two variables.
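Pearson Correlation Coefficient
$$r_{xy} = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$
Where:
rxy = degree of relationship between x and y
x = observed data for the independent variable
y = observed data for the dependent variable
n = sample size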
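The Pearson coefficient can be computed directly from paired observations. The following is a minimal sketch, assuming the standard raw-sums (computational) form of the formula; the function name `pearson_r` and the sample points are illustrative and not part of the lesson.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r from paired data, using the raw-sums computational formula."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when either variable is constant")
    return numerator / denominator


# Illustrative data (not from the lesson): y rises exactly linearly with x.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0
```

A perfectly increasing linear pattern gives r = 1, and a perfectly decreasing one gives r = -1.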
► Below are the proposed guidelines for interpreting the Pearson correlation coefficient:
Verbal Interpretation     Positive Correlation Coefficient    Negative Correlation Coefficient
Slight correlation        0.00 to 0.20                        0.00 to -0.20
Low correlation           0.21 to 0.40                        -0.21 to -0.40
Moderate correlation      0.41 to 0.60                        -0.41 to -0.60
High correlation          0.61 to 0.80                        -0.61 to -0.80
Very high correlation     0.81 to 1.00                        -0.81 to -1.00
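The guideline table can be applied mechanically once r is known. The sketch below is one possible lookup; `interpret_r` and `BANDS` are hypothetical names. The table leaves small gaps between bands (for example between 0.20 and 0.21), so this sketch assumes that such values belong to the higher band.

```python
# Upper bound of |r| for each band, taken from the guideline table.
BANDS = [
    (0.20, "Slight"),
    (0.40, "Low"),
    (0.60, "Moderate"),
    (0.80, "High"),
    (1.00, "Very high"),
]


def interpret_r(r):
    """Return the verbal interpretation of a Pearson coefficient."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    magnitude = abs(r)
    # Assumption: a value between two bands (e.g. 0.205) goes to the higher band.
    for upper, label in BANDS:
        if magnitude <= upper:
            break
    if r == 0:
        return f"{label} correlation"
    sign = "positive" if r > 0 else "negative"
    return f"{label} {sign} correlation"


print(interpret_r(0.89))   # Very high positive correlation
print(interpret_r(-0.35))  # Low negative correlation
```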
A research study was conducted to determine the correlation between students' grades in English and their grades in Mathematics. A random sample of 10 students was taken and the results of the sampling are tabulated below.
Student Number    English Grade    Mathematics Grade
1                 93               91
2                 89               86
3                 84               80
4                 91               88
5                 90               89
6                 83               87
7                 75               78
8                 81               78
9                 84               85
10                77               76
To determine whether a relationship exists between the two variables, Pearson's rxy is used.
Let x = grade in English
    y = grade in Mathematics
To compute ∑xy, ∑x² and ∑y²:
Number x y xy x² y²
1 93 91 8463 8649 8281
2 89 86 7654 7921 7396
3 84 80 6720 7056 6400
4 91 88 8008 8281 7744
5 90 89 8010 8100 7921
6 83 87 7221 6889 7569
7 75 78 5850 5625 6084
8 81 78 6318 6561 6084
9 84 85 7140 7056 7225
10 77 76 5852 5929 5776
n = 10 847 838 71236 72067 70480
Since rxy ≈ 0.89, there is a very high positive correlation between grades in English and Mathematics.
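The value of rxy behind this conclusion can be reproduced by substituting the column totals from the table (n = 10, ∑x = 847, ∑y = 838, ∑xy = 71236, ∑x² = 72067, ∑y² = 70480) into the computational formula for Pearson's r. The working below assumes the standard raw-sums form of that formula.

```latex
\begin{align*}
r_{xy} &= \frac{n\sum xy - \sum x \sum y}
              {\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]
                     \left[n\sum y^2 - \left(\sum y\right)^2\right]}} \\
       &= \frac{10(71236) - (847)(838)}
              {\sqrt{\left[10(72067) - 847^2\right]\left[10(70480) - 838^2\right]}} \\
       &= \frac{712360 - 709786}
              {\sqrt{(720670 - 717409)(704800 - 702244)}} \\
       &= \frac{2574}{\sqrt{(3261)(2556)}}
        \approx \frac{2574}{2887.06}
        \approx 0.89
\end{align*}
```

A coefficient of about 0.89 falls in the 0.81 to 1.00 band of the interpretation guidelines.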
Activity #7
A research study was conducted to determine the correlation between students' grades in English and their grades in Mathematics. A random sample of 10 students was taken and the results of the sampling are tabulated below.
Student Number    English Grade    Mathematics Grade
1                 15               16
2                 16               18
3                 13               12
4                 11               10
5                 5                6
6                 4                5
7                 6                7
8                 9                8
9                 13               14
10                11               12
REGRESSION ANALYSIS
► Forecasting an effect
► Trend forecasting
The equation of the line of best fit is written in the form:
y = mx + b
Where:
y = criterion measure
x = predictor
m = slope of the line
b = y-intercept
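The slope and y-intercept of a line of best fit can be estimated from paired data. The sketch below assumes the line is fitted by ordinary least squares, using m = (n∑xy − ∑x∑y) / (n∑x² − (∑x)²) and b = (∑y − m∑x) / n. The lesson does not state these formulas explicitly, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Least-squares slope m and intercept b for y = m*x + b."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("slope is undefined when all x values are equal")
    m = (n * sum_xy - sum_x * sum_y) / denominator
    b = (sum_y - m * sum_x) / n
    return m, b


# Illustrative points (not from the lesson) lying exactly on y = 2x + 1.
m, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(m, b)  # 2.0 1.0
```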
Number x y xy x²
1 93 91 8463 8649
2 89 86 7654 7921
3 84 80 6720 7056
4 91 88 8008 8281
5 90 89 8010 8100
6 83 87 7221 6889
7 75 78 5850 5625
8 81 78 6318 6561
9 84 85 7140 7056
10 77 76 5852 5929
n = 10 847 838 71236 72067
Substituting the totals from the table into the least-squares formulas for m and b gives the line of best fit:
y ≈ 0.789x + 16.94
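The slope and intercept for the English and Mathematics data can be worked out from the column totals alone. This is a sketch that assumes the usual least-squares formulas, which the lesson does not spell out; `predict` is a hypothetical helper for reading a Mathematics grade off the fitted line.

```python
# Column totals from the English (x) and Mathematics (y) table.
n = 10
sum_x, sum_y = 847, 838
sum_xy, sum_x2 = 71236, 72067

# Assumed least-squares formulas for the slope and y-intercept.
m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
b = (sum_y - m * sum_x) / n


def predict(english_grade):
    """Predicted Mathematics grade for a given English grade."""
    return m * english_grade + b


print(f"y = {m:.4f}x + {b:.4f}")
print(f"predicted y at x = 80: {predict(80):.2f}")
```

The slope is 2574 / 3261, roughly 0.789, and the intercept is roughly 16.94, so an English grade of 80 maps to a predicted Mathematics grade of about 80.09.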
REGRESSION ANALYSIS
b. 80
c. 100
Recitation #3
Student Number    English Grade    Mathematics Grade
1                 15               16
2                 16               18
3                 13               12
4                 11               10
5                 5                6
6                 4                5
7                 6                7
8                 9                8
9                 13               14
10                11               12
b. 12
c. 20
d. 25
Calculation Summary
Sum of X = 103
Sum of Y = 108
Sum of X² = 1219
Sum of XY = 1272
Regression Equation: ŷ = bX + a
ŷ = 1.0095X + 0.4023
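The totals 1219 and 1272 equal the raw sums ∑x² and ∑xy for this data set. The slope uses the centered quantities SSX = ∑x² − (∑x)²/n and SP = ∑xy − ∑x∑y/n, with b = SP / SSX and a = ȳ − b·x̄. The sketch below recomputes each step. It assumes rows 5 to 7 of the Recitation #3 table match the Activity #7 table, which is consistent with the stated sums.

```python
# Recitation #3 data; rows 5 to 7 are assumed to match the Activity #7 table.
english = [15, 16, 13, 11, 5, 4, 6, 9, 13, 11]      # x
mathematics = [16, 18, 12, 10, 6, 5, 7, 8, 14, 12]  # y
n = len(english)

# Raw sums.
sum_x = sum(english)
sum_y = sum(mathematics)
sum_x2 = sum(x * x for x in english)
sum_xy = sum(x * y for x, y in zip(english, mathematics))

# Centered sums used by the slope formula.
ss_x = sum_x2 - sum_x ** 2 / n
sp = sum_xy - sum_x * sum_y / n

slope = sp / ss_x                                # b in ŷ = bX + a
intercept = sum_y / n - slope * (sum_x / n)      # a in ŷ = bX + a

print(f"raw sums: {sum_x}, {sum_y}, {sum_x2}, {sum_xy}")
print(f"SSX = {ss_x:.1f}, SP = {sp:.1f}")
print(f"y_hat = {slope:.4f}X + {intercept:.4f}")
```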
Activity #8
Seven college students of UCC have the following monthly family income and final grade in MMW.
Student    Family Income    Final Grade
A          30,000           1.25
B          21,000           1.75
C          45,000           3.00
D          54,000           2.75
E          86,000           3.00
F          34,000           2.25
G          49,000           2.50
1. Determine the correlation coefficient.
2. Using linear regression, predict the monthly family income of a student with each of the following final grades in MMW.
a. 1.00
b. 1.50
c. 2.00
d. 5.00
Calculation Summary
Sum of X = 16.5
Sum of Y = 319000
Mean X = 2.3571
Mean Y = 45571.4286
Sum of squares (SSX) = 2.6071
Sum of products (SP) = 62821.4286
Regression Equation = ŷ = bX + a
ŷ = 24095.89041X - 11226.0274
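In this summary X is the final grade and Y is the monthly family income, as the two sums indicate. The regression equation can then be evaluated at the final grades listed in the activity. The sketch below copies the coefficients from the summary; `predict_income` is a hypothetical helper. It also flags grades outside the sampled range of 1.25 to 3.00, where the prediction is an extrapolation and should be read with caution.

```python
# Coefficients copied from the Calculation Summary (X = final grade, Y = income).
SLOPE = 24095.89041
INTERCEPT = -11226.0274

# Smallest and largest final grades observed in the sample.
GRADE_MIN, GRADE_MAX = 1.25, 3.00


def predict_income(final_grade):
    """Predicted monthly family income for a given final grade in MMW."""
    return SLOPE * final_grade + INTERCEPT


for grade in (1.00, 1.50, 2.00, 5.00):
    inside = GRADE_MIN <= grade <= GRADE_MAX
    note = "" if inside else " (outside sampled range, extrapolated)"
    print(f"{grade:.2f} -> {predict_income(grade):,.2f}{note}")
```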