
Correlation and Linear Regression

The document discusses correlation and linear regression. It defines correlation coefficient and how it measures the strength and direction of the linear relationship between two variables. It provides examples to interpret correlation coefficients and their strength. It also discusses Pearson product-moment correlation coefficient and Spearman's rank-order correlation coefficient, how to calculate them, and what types of relationships they measure.

Uploaded by MONN

MMW

GOOD AFTERNOON
April 28, 2022
PRAYER
Online House Rules
MMW

Correlation and Linear Regression
Objectives:
01 Discuss the definition and uses of correlation and linear regression.
02 Interpret the strength of scatterplot.
03 Calculate the correlation coefficient using Pearson r.
04 Calculate the correlation coefficient using Spearman’s rank correlation coefficient.
05 Explore the uses of regression analysis.
06 Become familiar with the steps in finding the regression equation.
07 Evaluate the relationship between the variables.
08 Interpret the relationship between the variables.
Mathrivia!
René Descartes was a French philosopher. His famous statement is “I think, therefore I am.”

René Descartes developed rules for deductive reasoning and scientific thinking.

He also developed a system for using letters as mathematical variables and a way to plot points on a plane called the Cartesian plane.

This invention is called the Cartesian Coordinate System.
Math Connect
Investigate possible relationships between two situations.
Have you encountered situations that involve cause and effect?
Examples:
1. When a student misses most of his classes, his grades are lower than usual.
2. The faster a jet pilot flies, the higher the G-forces are.
3. The more gasoline you put in your car, the farther it can go.
4. As one exercises more, one's body weight decreases.
5. The older a man gets, the less hair he has.
Correlation
Coefficient
Correlation Coefficient
- It is used to measure the strength of the linear relationship between two variables.
- It identifies the size and the direction of the relationship.
  Size: -1 to 1; Direction: positive or negative
- It is denoted as “r”.
- The value of r ranges from -1 to 1. Hence no value of r can exceed 1 or fall below -1:

−1 ≤ r ≤ 1
Correlation Coefficient
- A correlation coefficient greater than 0 indicates a positive relationship, while a value less than 0 signifies a negative relationship.
- A value of zero indicates no linear relationship between the two variables being compared.
- Positive one (+1) is a perfect positive correlation.
- Negative one (−1) is a perfect negative correlation.
Correlation Scale
 

Note: The closer the value of r is to +1 or −1, the stronger the linear
relationship. When it is near 0, the linear relationship is weak.
Example 1: Which is the strongest correlation among the following?

0.01    0.10    -0.28    0.05
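
A minimal sketch of how Example 1 can be checked, assuming that "strongest" means the value farthest from 0 regardless of sign (as the note on the correlation scale says). The function name `strongest_correlation` is only illustrative.

```python
# Candidate correlation coefficients from Example 1.
candidates = [0.01, 0.10, -0.28, 0.05]

def strongest_correlation(values):
    """Return the coefficient with the largest absolute value."""
    return max(values, key=abs)

strongest = strongest_correlation(candidates)
print(strongest)  # -0.28
```

The sign only gives the direction, so -0.28 is a stronger (negative) correlation than 0.10.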
Example
If r is 0.78, then there is a moderately high positive correlation between the
variables.

Thus, the sign of r indicates the direction of the correlation, while the absolute
value of r indicates its extent or magnitude.
Pearson Product-
Moment Correlation
Coefficient
Pearson Product-Moment Correlation Coefficient

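It is simply called the Pearson r.

It is the most popular and widely used correlation coefficient.

Pearson’s correlation coefficient (r) for continuous (interval-level) data ranges from -1 to 1.

If we let X and Y be the variables we are investigating, then the formula for finding the correlation coefficient is

$$r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}}$$

Where:
n = number of paired observations
ΣX = sum of the X values
ΣY = sum of the Y values
ΣXY = sum of the products of the paired X and Y values
ΣX² = sum of the squared X values
ΣY² = sum of the squared Y values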
Test of Significance of the Correlation Coefficient

It is important to test whether the value of the correlation coefficient is significant or not.

If it is found to be significant, then a relationship really exists between the two variables. Otherwise, the computed r is due to chance alone.

Use the table of critical values of the correlation coefficient.
Table of Critical Values for Pearson’s r

Example:
n = 8 at 0.05 level of significance

Example:
Suppose a personnel manager would like to know if there is a relationship
between the knowledge factors and practical factors of a training course for 6
trainees, using a 0.05 level of significance. If the computed r is 0.96, is there a
significant relationship between the knowledge factors and practical factors of
the training course for the 6 trainees?
Steps and Solution

• Find the computed r: r = 0.96
• Find the degree of freedom (df).
• Use the table of critical values to locate the critical value.

Interpretation: Since the computed value of r is 0.96, it is very clear that the
tabular value is less than the computed value of r; therefore the computed r
is significant. Thus we conclude that there is really a significant relationship
between the practical and the knowledge factors of the training course.

Note:
TV < CV = There is a significant relationship between the two variables.
TV > CV = There is no significant relationship between the two variables.
Here TV is the tabular (critical) value and CV is the computed value of r.
Example:
A teacher wants to know if the number of hours spent studying is
correlated with the score obtained in an examination. The following table
shows the number of hours spent studying and the scores obtained by 6
students. Compute the correlation coefficient and test its significance at the 0.01 level.

Student   No. of hours spent studying (X)   Score in the exam (Y)
A         3.0                               20
B         2.7                               34
C         3.8                               19
D         2.6                               10
E         3.2                               24
F         3.4                               31
Step 1: Construct a table

X       Y      XY       X²       Y²
3.0     20     60.0     9.00     400
2.7     34     91.8     7.29     1156
3.8     19     72.2     14.44    361
2.6     10     26.0     6.76     100
3.2     24     76.8     10.24    576
3.4     31     105.4    11.56    961
ΣX = 18.7   ΣY = 138   ΣXY = 432.2   ΣX² = 59.29   ΣY² = 3554
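Step 2: Use the formula to compute the value of r

$$r = \frac{6(432.2) - (18.7)(138)}{\sqrt{\left[6(59.29) - (18.7)^2\right]\left[6(3554) - (138)^2\right]}} = \frac{12.6}{\sqrt{(6.05)(2280)}}$$

r = 0.11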
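
As a check on the arithmetic, here is a short sketch that computes r from the raw study-hours data using the raw-sum form of the Pearson formula. The function name `pearson_r` and the variable names are illustrative, not part of the lesson.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r via the computational (raw-sum) formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Data from the example: hours spent studying (X) and exam score (Y).
hours = [3.0, 2.7, 3.8, 2.6, 3.2, 3.4]
scores = [20, 34, 19, 10, 24, 31]

r = pearson_r(hours, scores)
print(round(r, 2))  # 0.11
```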
Steps and Solution

• Find the degree of freedom (df).
• Use the table of critical values to locate the critical value.

Note: TV = 0.875; CV (computed r) = 0.11

Interpretation: Since the tabular value of 0.875 at the 0.01 level of
significance is greater than the computed value of 0.11, we can say that the
relationship is not significant. Thus the result is due to chance alone.
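
The TV/CV decision rule can be sketched as a one-line comparison. The function name `is_significant` is hypothetical, and taking the absolute value of r (so that negative correlations are handled too) is an assumption not stated on the slides.

```python
def is_significant(computed_r, tabular_value):
    """Rule from the note: significant when TV < CV (computed r).

    Assumption: compare against |r| so negative correlations also work.
    """
    return tabular_value < abs(computed_r)

# Values from the study-hours example: computed r = 0.11, TV = 0.875.
print(is_significant(0.11, 0.875))  # False
```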
Coefficient of Determination

• It tells us the amount of variation in Y that can be accounted for by the variation in X.
• It is obtained by squaring the value of the Pearson r.
• It is denoted as r².
Coefficient of Determination

Example:
Suppose the correlation coefficient r between the X and Y variables is
0.8. Find the coefficient of determination.

r² = (0.8)² = 0.64

This means that 64% of the variation in Y is due to the variation in X.
How much then is not accounted for?

36% is not accounted for and not correlated.
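
A small sketch of the same computation, for checking the 64% and 36% figures. The variable names are illustrative.

```python
r = 0.8
r_squared = r ** 2            # coefficient of determination
unexplained = 1 - r_squared   # share of variation in Y not accounted for by X

print(f"{r_squared:.0%} explained, {unexplained:.0%} unexplained")
# 64% explained, 36% unexplained
```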


Spearman Rank-
Order Correlation
Spearman Rank-Order Correlation

• It was named after Charles Spearman.
• It is also called the Spearman correlation.
• It is often denoted with the Greek letter rho (ρ) and called Spearman’s rho.
• It measures the strength and direction of association between two ranked variables.
• It is calculated the same way as the Pearson correlation coefficient but uses the ranks of the values instead of the values themselves.
• It determines the strength and direction of the monotonic relationship between two variables, rather than the strength and direction of the linear relationship, which is what Pearson's correlation determines.
Monotonic Relationship

• Monotonically increasing: as the x variable increases, the y variable never decreases.
• Monotonically decreasing: as the x variable increases, the y variable never increases.
• Not monotonic: as the x variable increases, the y variable sometimes decreases and sometimes increases.
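
The three cases can be sketched as a small classifier over (x, y) pairs. The function name `classify_monotonic` is hypothetical, and the sketch assumes the x values are distinct. A constant y satisfies the "never decreases" definition and is reported as increasing here.

```python
def classify_monotonic(points):
    """Classify (x, y) pairs using the three definitions above."""
    ys = [y for _, y in sorted(points)]           # order the pairs by x
    steps = [b - a for a, b in zip(ys, ys[1:])]   # change in y between neighbours
    if all(s >= 0 for s in steps):
        return "monotonically increasing"
    if all(s <= 0 for s in steps):
        return "monotonically decreasing"
    return "not monotonic"

print(classify_monotonic([(1, 1), (2, 4), (3, 9)]))    # monotonically increasing
print(classify_monotonic([(-1, 1), (0, 0), (1, 1)]))   # not monotonic
```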
Spearman Rank-Order Correlation

Important facts about the Spearman correlation coefficient

• It can take a real value in the range −1 ≤ ρ ≤ 1.
• Its maximum value ρ = 1 corresponds to the case when there’s a
monotonically increasing function between x and y. In other words, larger x
values correspond to larger y values and vice versa.
• Its minimum value ρ = −1 corresponds to the case when there’s a
monotonically decreasing function between x and y. In other words, larger x
values correspond to smaller y values and vice versa.
• The formula is

$$\rho = 1 - \frac{6\sum D^2}{n(n^2 - 1)}$$

Where:
D = difference between the ranks of each pair of X and Y values
n = number of paired observations
Steps on how to calculate the correlation coefficient using Spearman rho

• Rank the scores in X and Y separately, giving rank 1 to the largest, rank 2 to the
second-largest, and so on.
• When there are ties in scores, assign to each tied observation the mean of the
ranks they jointly occupy. For example, if the third and fourth largest
scores are the same, assign each the rank of 3.5, since (3 + 4)/2 = 3.5. If the fifth,
sixth, and seventh-largest values are the same, assign the rank of 6 to each, since
(5 + 6 + 7)/3 = 6.
• Find the difference (D) between the ranks of the Xs and Ys. To check your
work, find the sum of the differences; it should equal zero.
• Square each difference obtained in the previous step and get the sum to obtain ΣD².
• Substitute the obtained values in the formula. Test the significance and interpret.
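
The ranking step with ties is the easiest one to get wrong, so here is a minimal sketch of it. The function name `average_ranks` and the sample scores are hypothetical. Rank 1 goes to the largest score, and tied scores share the mean of the positions they occupy.

```python
def average_ranks(scores):
    """Rank scores with 1 = largest; tied scores share the mean rank."""
    ordered = sorted(scores, reverse=True)
    ranks = {}
    for value in set(ordered):
        first = ordered.index(value) + 1         # first position this value occupies
        count = ordered.count(value)             # number of tied observations
        ranks[value] = first + (count - 1) / 2   # mean of the occupied positions
    return [ranks[s] for s in scores]

# Hypothetical scores: a two-way tie at 3rd/4th and a three-way tie at 5th-7th.
print(average_ranks([90, 85, 80, 80, 70, 70, 70]))
# [1.0, 2.0, 3.5, 3.5, 6.0, 6.0, 6.0]
```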
Example
The following table shows the ratings of a group of 10 student leaders in the University
who have been evaluated independently for leadership on a scale of 1 to 10, with 10 being the
highest, by the SSG Moderators and by the students whom they supervise. Calculate the
value of Spearman rho to determine if there is a correlation between the two evaluations.

Student Leader   SSG Moderators’ Evaluation   Students’ Evaluation
A                4                            3
B                2                            4
C                2                            5
D                1                            1
E                7                            7
F                9                            8
G                3                            6
H                5                            8
I                2                            5
J                7                            3
Solution: Construct a table

Student Leader   SSG Moderators’ Evaluation (X)   Students’ Evaluation (Y)   Rank of X   Rank of Y
A                4                                3                          5.0         8.5
B                2                                4                          8.0         7.0
C                2                                5                          8.0         5.5
D                1                                1                          10.0        10.0
E                7                                7                          2.5         3.0
F                9                                8                          1.0         1.5
G                3                                6                          6.0         4.0
H                5                                8                          4.0         1.5
I                2                                5                          8.0         5.5
J                7                                3                          2.5         8.5

Rank of X
Score:      9     7     7     5     4     3     2     2     2     1
Position:   1st   2nd   3rd   4th   5th   6th   7th   8th   9th   10th
Rank:       1.0   2.5   2.5   4.0   5.0   6.0   8.0   8.0   8.0   10.0
Rank of Y
Score:      8     8     7     6     5     5     4     3     3     1
Position:   1st   2nd   3rd   4th   5th   6th   7th   8th   9th   10th
Rank:       1.5   1.5   3.0   4.0   5.5   5.5   7.0   8.5   8.5   10.0
Solution: Construct a table

Student Leader   X    Y    Rank of X   Rank of Y   D      D²
A                4    3    5.0         8.5         -3.5   12.25
B                2    4    8.0         7.0         1.0    1.00
C                2    5    8.0         5.5         2.5    6.25
D                1    1    10.0        10.0        0.0    0.00
E                7    7    2.5         3.0         -0.5   0.25
F                9    8    1.0         1.5         -0.5   0.25
G                3    6    6.0         4.0         2.0    4.00
H                5    8    4.0         1.5         2.5    6.25
I                2    5    8.0         5.5         2.5    6.25
J                7    3    2.5         8.5         -6.0   36.00

ΣD² = 72.5
Substitute the values in the formula

$$\rho = 1 - \frac{6(72.5)}{10(10^2 - 1)} = 1 - \frac{435}{990} \approx 0.5606$$
Locate the critical value and interpret

Interpretation:
Using the correlation scale, we can say that there is a moderate positive
correlation between the two evaluations. Since the tabulated value
(0.564) is greater than the computed value (0.5606) at the 0.05 level of
significance, the relationship between the two evaluations is not
significant.
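
The substitution and the comparison with the tabulated value can be checked with a short sketch that starts from the ranks in the solution table. The function name `spearman_rho` is illustrative. Note that the ΣD² formula is only approximate when there are tied ranks, so statistical libraries may give a slightly different value.

```python
# Ranks of X and Y for student leaders A to J, copied from the solution table.
rank_x = [5.0, 8.0, 8.0, 10.0, 2.5, 1.0, 6.0, 4.0, 8.0, 2.5]
rank_y = [8.5, 7.0, 5.5, 10.0, 3.0, 1.5, 4.0, 1.5, 5.5, 8.5]

def spearman_rho(rx, ry):
    """Return (rho, sum of D squared) using rho = 1 - 6*sum(D^2) / (n(n^2 - 1))."""
    n = len(rx)
    diffs = [x - y for x, y in zip(rx, ry)]
    assert abs(sum(diffs)) < 1e-9   # the rank differences must sum to zero
    sum_d2 = sum(d * d for d in diffs)
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1)), sum_d2

rho, sum_d2 = spearman_rho(rank_x, rank_y)
print(sum_d2, round(rho, 4))  # 72.5 0.5606

tabulated_value = 0.564        # tabulated value quoted in the interpretation
print(rho > tabulated_value)   # False, so the relationship is not significant
```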
Linear
Regression
Linear Regression

• Linear regression is a basic and commonly used type of predictive analysis.
• The overall idea of regression is to examine two things: (1) Does a set
of predictor variables do a good job in predicting an outcome
(dependent) variable? (2) Which variables, in particular, are
significant predictors of the outcome variable, and in what way do
they, as indicated by the magnitude and sign of the beta estimates,
impact the outcome variable?
• These regression estimates are used to explain the relationship
between one dependent variable and one or more independent
variables.
Four Assumptions of Linear Regression

Linearity: The relationship between X and the mean of Y is linear.
Homoscedasticity: The variance of the residuals is the same for any value of X.
Independence: Observations are independent of each other.
Normality: For any fixed value of X, Y is normally distributed.
Linear Regression Analysis

• It deals with the estimation of one variable based on the changes or movements of the other variable.
• It is used to solve problems concerning prediction, estimation, and forecasting.
Linear Regression Equation

• Regression equations can help you figure out if your data can be fit to an equation.
• This is extremely useful if you want to make predictions from your data, either future predictions or indications of past behavior.
Linear Regression Equation

• The slanted line passing through the data points of a scatter diagram is the regression line or line of best fit, which is used to make predictions.
• The regression equation is the technical way of describing the regression line:

$$Y = a + bX$$

• Y is the predicted score for the dependent variable.
• a is the constant or intercept.
Linear Regression Equation

• The intercept a indicates where the regression line would intersect the y-axis (or the vertical axis, also known as the ordinate).
• It is the value of Y when X = 0.
• b is the regression coefficient or the slope of the regression line.
• It signifies how many predicted units of change (either positive or negative) in the DV (dependent variable) there are for any one-unit increase in the IV (independent variable).
• X is the known score on the independent variable.
Linear Regression Equation

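• The constants a and b can be found using the following formulas:

$$a = \frac{\sum Y \sum X^2 - \sum X \sum XY}{n\sum X^2 - \left(\sum X\right)^2}$$

$$b = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - \left(\sum X\right)^2}$$

Where:
n = number of paired observations
ΣX = sum of the X values
ΣY = sum of the Y values
ΣXY = sum of the products of the paired X and Y values
ΣX² = sum of the squared X values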
Steps in Using the Linear Regression Equation

• Construct a tabular arrangement of the given data. Get the sum of the values of X to get ΣX, as well as the sum of the values of Y to get ΣY.
• Multiply each value of X by its corresponding value of Y. Get the sum to obtain ΣXY.
• Square each entry in the X column and find the sum to get ΣX².
• Substitute the values in the formulas to find a and b.
• Form the regression equation by substituting the values of a and b.
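
The steps above can be sketched as a small function that builds the four sums and then solves for a and b. The function name `fit_line` and the sample data are hypothetical, and the formulas used are the standard least-squares expressions written in terms of ΣX, ΣY, ΣXY, and ΣX².

```python
def fit_line(xs, ys):
    """Return (a, b) for Y = a + bX from the sums named in the steps."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denom = n * sum_x2 - sum_x ** 2
    b = (n * sum_xy - sum_x * sum_y) / denom        # slope
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denom   # intercept
    return a, b

# Hypothetical data lying exactly on the line Y = 1 + 2X.
a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)  # 1.0 2.0
```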
Example

The following table shows the number of weeks six persons have worked at an automobile
inspection station and the number of cars each one inspected on a given day.

• Determine the regression equation that will enable us to predict Y in terms of X, that is,
to predict the number of cars inspected from the number of weeks employed.
• Predict the number of cars inspected by someone who has been working for
eight weeks.

Employee   No. of weeks employed (X)   No. of cars inspected (Y)
A          2                           13
B          7                           21
C          9                           23
D          1                           14
E          5                           15
F          12                          21
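Solution: Construct a table

X      Y      XY     X²
2      13     26     4
7      21     147    49
9      23     207    81
1      14     14     1
5      15     75     25
12     21     252    144
ΣX = 36   ΣY = 107   ΣXY = 721   ΣX² = 304

$$a = \frac{(107)(304) - (36)(721)}{6(304) - (36)^2} = \frac{6572}{528}$$

a = 12.45

$$b = \frac{6(721) - (36)(107)}{6(304) - (36)^2} = \frac{474}{528}$$

b = 0.898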
Solution: Answer the following questions

• Determine the regression equation that will enable us to predict
Y in terms of X, that is, to predict the number of cars inspected
from the number of weeks employed.

a = 12.45, b = 0.898
Y = a + bX
Y = 12.45 + 0.898X

• Predict the number of cars inspected by someone
who has been working for eight weeks.

Y = 12.45 + 0.898(8)
Y = 19.63 or 20
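
A brief sketch that re-derives a, b, and the eight-week prediction from the column sums in the solution table. The variable names are illustrative, and the prediction uses the rounded a and b, as the slide does.

```python
# Column sums from the solution table (n = 6 employees).
n, sum_x, sum_y, sum_xy, sum_x2 = 6, 36, 107, 721, 304

denom = n * sum_x2 - sum_x ** 2                   # 528
a = (sum_y * sum_x2 - sum_x * sum_xy) / denom     # intercept
b = (n * sum_xy - sum_x * sum_y) / denom          # slope

# Predict cars inspected after 8 weeks, using the rounded coefficients.
predicted = round(a, 2) + round(b, 3) * 8
print(round(a, 2), round(b, 3), round(predicted, 2))  # 12.45 0.898 19.63
```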
Generalization
Drill and Practice
• Two judges were asked to rank 7 art projects submitted by Architecture
students in a certain subject. The rankings are shown in the table below.
Measure the degree of relationship of the sets of variables.

Art Project   Judge A   Judge B
A             1         2
B             2         1
C             3         4
D             4         3
E             5         6
F             6         7
G             7         5
Drill and Practice
The annual income and annual savings of 9 families (in thousand pesos) are given below.

Family   Annual Savings (X)   Annual Income (Y)
A        1                    36
B        2                    39
C        2                    42
D        5                    45
E        5                    48
F        6                    51
G        7                    54
H        8                    56
I        7                    59

a. Compute the regression equation for predicting the annual income from the
knowledge of the annual savings.
b. Estimate the savings of a family whose income is Php 60,000.
Assignment

Do advance reading on our next topic, which is about geometric designs.
Prayer
Photo
Opportunity
Thank you!