Day 8 - Module Linear Correlation

The document discusses linear correlation between two variables. It defines correlation as measuring the strength of relationship between variables. The Pearson's correlation coefficient r ranges from -1 to 1, with values closer to these extremes indicating a stronger relationship. A value near 0 indicates little or no linear relationship. The coefficient of determination r^2 measures the percentage of variation in one variable explained by the other. Three key points are calculating and interpreting r and r^2, and distinguishing between positive, negative, and no correlation.

Uploaded by

Joven Jaravata
Copyright
© © All Rights Reserved
Linear Correlation

Lesson Objectives: At the end of the module, each student should be able to

1) Define and explain the concept of correlation
2) Calculate and interpret Pearson’s coefficient of correlation r
3) Compute and interpret the coefficient of determination r^2

--------------------------------------------------------------------------------------------------------------------

LINEAR CORRELATION

Correlation analysis is concerned with measuring the strength of the relationship
between variables. When we compute measures of correlation from a set of paired data, our
interest focuses on the degree of correlation between the variables. Accompanying the scatter
diagram is the correlation coefficient. This statistic, denoted by r, measures the strength of
the linear association between X and Y. Its range is -1 ≤ r ≤ +1. When r is near zero,
there is little or no linear relationship between X and Y. An r-value near +1 indicates a strong
positive relationship, while an r-value near -1 indicates a strong negative relationship.
There are two linear correlation techniques we shall be studying: Pearson’s correlation
model and Spearman’s correlation method.

The PEARSON’S Correlation Coefficient r:

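\[
r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}}
\]
where n is the number of (x, y) pairs.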
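
Pearson’s r in its raw-sum form, r = [n∑xy - (∑x)(∑y)] / √{[n∑x^2 - (∑x)^2][n∑y^2 - (∑y)^2]}, can be computed step by step in code. The following is a minimal Python sketch; the function name `pearson_r` and the four-point data set are illustrative assumptions, not part of the module.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from paired data, using the raw-sum formula."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when x or y has no variation")
    return numerator / denominator

# Made-up data: y rises by exactly 2 for each unit of x, so r should be +1.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
```

Reversing the y-values (8, 6, 4, 2) gives a perfectly inverse relationship, and the same function returns -1.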

Below are scatter plots showing various correlation coefficient values:

[Figure: three scatter plots of y against x, with r = +0.9, r = -0.5, and r = 0.0]

We use the guide below in interpreting the value of the coefficient of correlation r. The
ranges refer to the size of r regardless of its sign.

Computed r-value     Interpretation
0.00 to 0.10         No correlation
0.11 to 0.25         Negligible correlation
0.26 to 0.50         Moderate correlation
0.51 to 0.75         Strong or high correlation
0.76 to 1.00         Very strong to perfect correlation
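
The guide can be expressed as a small lookup. This is a sketch only: the function name `interpret_r` is hypothetical, and two choices are assumptions rather than statements from the module, namely that the ranges apply to the absolute value of r and that r is rounded to two decimal places before it is matched against the table.

```python
def interpret_r(r):
    """Label a correlation coefficient using the guide table's ranges."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = round(abs(r), 2)  # assumption: the guide is read to two decimals
    if size <= 0.10:
        return "no correlation"
    if size <= 0.25:
        strength = "negligible"
    elif size <= 0.50:
        strength = "moderate"
    elif size <= 0.75:
        strength = "strong or high"
    else:
        strength = "very strong"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} correlation"

print(interpret_r(0.63))   # strong or high positive correlation
print(interpret_r(-0.50))  # moderate negative correlation
print(interpret_r(0.04))   # no correlation
```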

A positive r-value indicates a direct relationship, which means that an increase in the value of
x tends to go with an increase in the value of y, while a decrease in x goes with a fall in y. A
negative r-value shows an inverse relationship, where a rise in x goes with a fall in y and a
fall in x goes with a rise in y.
Example:
Study Time and Exam Scores
The table below shows study time and exam scores for ten students.

Study Hours    1   5   7   8  10  11  14  15  15  19
Exam Scores   53  74  59  43  56  84  96  69  84  83

a) Plot the scatter diagram and make an initial interpretation based on it.
b) Find the correlation coefficient r. Interpret.

Solution
[Figure: scatter plot of exam scores (Y) against study hours (X, axis from 0 to 20)]
a) From the scatter plot, we see here a particular trend where study hours and exam results
tend to change in the same direction. There is a noticeable direct or positive relationship
between these two variables.
b) Computation for the correlation coefficient r:

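∑x = 105; ∑y = 701; ∑xy = 7,880; ∑x^2 = 1,367; ∑y^2 = 51,729

Mean of x = 10.5; mean of y = 70.1

\[
r = \frac{10(7{,}880) - (105)(701)}{\sqrt{\left[10(1{,}367) - (105)^2\right]\left[10(51{,}729) - (701)^2\right]}}
  = \frac{5{,}195}{\sqrt{(2{,}645)(25{,}889)}} \approx 0.63
\]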
Interpretation: An r-value of +0.63 shows a strong positive correlation between study hours
and exam results, which means that students who studied longer tended to earn higher exam
scores. The correlation by itself does not prove that the extra study time caused the higher
scores.
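
The sums and the value of r in this example can be checked with a few lines of Python. The data are the ten pairs from the table; the variable names are illustrative choices.

```python
from math import sqrt

# Data from the Study Time and Exam Scores table.
hours = [1, 5, 7, 8, 10, 11, 14, 15, 15, 19]
scores = [53, 74, 59, 43, 56, 84, 96, 69, 84, 83]

n = len(hours)                                     # 10
sum_x = sum(hours)                                 # 105
sum_y = sum(scores)                                # 701
sum_xy = sum(x * y for x, y in zip(hours, scores)) # 7880
sum_x2 = sum(x * x for x in hours)                 # 1367
sum_y2 = sum(y * y for y in scores)                # 51729

numerator = n * sum_xy - sum_x * sum_y             # 5195
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(round(r, 2))  # 0.63
```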

Coefficient of Determination
The coefficient of determination is the square of the correlation coefficient r. It is denoted by
r^2. It is a measure of the relative fit between the regression line and the scatter plot. Because
the coefficient of determination always lies in the range 0 ≤ r^2 ≤ 1, it is often expressed as a
percent of variation explained. The unexplained variation (which is 1 - r^2) reflects factors not
included in the model.

In the above example, the coefficient of determination is r^2 = (0.628)^2 ≈ 0.39, or 39%. (The
unrounded r ≈ 0.628 is used here; squaring the rounded value 0.63 gives 0.40.)


The scatter plot shows an imperfect fit, since only 39% of the variations in the exam scores
can be explained by the study time. The remaining 61% is unexplained variation in exam
scores that reflects other factors (e.g., previous night’s sleep, class attendance, test anxiety,
etc).
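
The split into explained and unexplained variation is simple arithmetic, sketched below in Python. The function name `explained_and_unexplained` is hypothetical, and the input 0.6278 is the example's r carried to four decimal places, computed from the sums listed in the example.

```python
def explained_and_unexplained(r):
    """Return (r^2, 1 - r^2) for a correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    r_squared = r ** 2
    return r_squared, 1 - r_squared

explained, unexplained = explained_and_unexplained(0.6278)
print(f"{explained:.0%} explained, {unexplained:.0%} unexplained")
# 39% explained, 61% unexplained
```

Because r is squared, r = -0.6278 would give the same 39% and 61% split; the sign of r affects only the direction of the relationship.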

ACTIVITY 8 Name: ________________________

1. Determine in each case whether you would expect a positive correlation (+) , a
negative correlation (-) , or no correlation (0):
____ a) the amount of rubber on tires and the number of miles they have been driven
____ b) income and education
____ c) shirt size and sense of humor
____ d) the number of hours that bowlers practice and their scores
____ e) hair color and one’s knowledge of military affairs

2. Choose the best answer.


____(a) It identifies the percentage of variation of the variable y that is directly attributable to
the variation of the variable x. ( a. r b. r^2 c. slope b d. r = 1)
____(b) It seeks to determine a possible relationship between two variables (x and y).
( a. correlation b. time-series c. regression ) analysis.
____(c) Perfect fit between scatter diagram and regression line. ( a. r = -1 b. r = 0
c. r > 0 d. r < 0)
____(d) 0.11 ≤ r ≤ 0.25 ( a. no correlation b. negligible correlation c. moderate
correlation d. can’t be determined)
____(e) It aims to determine the strength of relationship between two variables x and y.
( a. correlation b. time-series c. regression ) analysis.

3. The following table gives the experience (in years) and monthly salaries (in
thousands of pesos) of 7 randomly selected college instructors.
Experience 4 6 4 9 18 5 16
Monthly salary 22 17 15 19 24 13 27

a) Compute for the Pearson’s r.

b) Are experience and salaries positively or negatively related? ____________ How
strong is their relationship? ________________________ What does this mean?
_____________________________________________________________________
c) What percent (%) of the variation in the salaries is accounted for by the number of
years of experience? ____________ What percent is explained by other factors?
_____________.

d) Suggest two of these other factors that may affect the variations in salaries received other
than the experience factor. (1)_________________________ _________________ and
(2)____________________________________________________.
