Correlation and Regression-Lecture9

This document discusses correlation and regression analysis. It defines correlation as a statistical method used to determine if a relationship exists between two variables. Regression is used to describe the nature of this relationship - whether it is positive, negative, linear, or nonlinear. The key aspects covered are: - Correlation coefficient measures the strength and direction of the linear relationship between two quantitative variables. - Scatter plots graph the relationship between two variables to visualize if it is linear. - Linear regression finds the line of best fit to model the relationship between an independent and dependent variable. - The coefficient of determination indicates how much of the dependent variable is explained by the independent variable. Different types of correlation coefficients are used

Correlation and Regression

Dr. Hanaa Moussa


Correlation and regression:
An area of inferential statistics involves determining whether a relationship exists
between two or more numerical or quantitative variables.
Examples:
- Educators are interested in determining whether the number of hours a student
  studies is related to the student's score on a particular exam.
- Medical researchers are interested in questions such as: Is caffeine related to
  kidney damage? Is there a relationship between a person's age and his or her
  blood pressure?
These are only a few of the many questions that can be answered by using the
techniques of correlation and regression analysis.
Correlation and regression:

Correlation is a statistical method used to determine whether a relationship
between variables exists.

Regression is a statistical method used to describe the nature of the
relationship between variables, that is, positive or negative, linear or
nonlinear.

Correlation and regression analysis address four questions:
1. Are two or more variables linearly related?
2. If so, what is the strength of the relationship?
3. What type of relationship exists?
4. What kind of predictions can be made from the relationship?
- To answer questions (1) and (2), statisticians use a numerical measure called a
  correlation coefficient.
- To answer question (3), you must ascertain what type of relationship
  exists. There are two types of relationships:
  1. Simple relationships (simple regression): two variables, an independent
     (explanatory or predictor) variable and a dependent (response) variable.
  2. Multiple relationships (multiple regression): two or more independent
     variables and one dependent variable.
- Finally, question (4) asks what type of predictions can be made.
  Predictions are made daily in all areas.
Correlation coefficient
Correlation coefficient (r): measures the strength and direction of a
linear relationship between two quantitative variables x and y.
- The symbol for the sample correlation coefficient is r.
- The range of the correlation coefficient is from −1 to +1.
- If there is a strong positive linear relationship between the
  variables, the value of r will be close to +1.
- If there is a strong negative linear relationship between the
  variables, the value of r will be close to −1.
- When there is no linear relationship between the variables or
  only a weak relationship, the value of r will be close to 0.
Scatter plot
A scatter plot is a graph of the ordered pairs (x, y) of numbers consisting of the
independent variable x and the dependent variable y.
Example

Construct a scatter plot for the data obtained in a study on the number of absences and the final
grades of seven randomly selected students from a statistics class. The data are shown here.
Degree of correlation    Positive            Negative
Perfect correlation      +1                  −1
Strong correlation       +0.75 to +0.99      −0.75 to −0.99
Moderate correlation     +0.25 to +0.74      −0.25 to −0.74
Weak correlation         above 0 to +0.24    below 0 to −0.24
No correlation           0                   0
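
The table can be read as a simple lookup. The sketch below is only an illustration: the function name `degree_of_correlation` is hypothetical, and the table leaves small gaps (for example between 0.74 and 0.75), so the code assumes each band starts at its lower bound and runs up to the next one.

```python
def degree_of_correlation(r: float) -> str:
    """Return the table's label for a correlation coefficient r.

    Assumption: each band is treated as starting at its lower bound
    (0.25, 0.75) so that values such as 0.745 still receive a label.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    if r == 0:
        return "no correlation"
    direction = "positive" if r > 0 else "negative"
    size = abs(r)
    if size == 1:
        degree = "perfect"
    elif size >= 0.75:
        degree = "strong"
    elif size >= 0.25:
        degree = "moderate"
    else:
        degree = "weak"
    return f"{degree} {direction} correlation"


print(degree_of_correlation(0.80))   # strong positive correlation
print(degree_of_correlation(-0.30))  # moderate negative correlation
```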
Correlation coefficient for quantitative data
1. Pearson product moment correlation coefficient (PPMC)
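
The formula for r is not reproduced in this text. A standard computational form of the Pearson coefficient, written with the same kind of sums used in the regression formulas, is given below as a reference sketch; n is the number of data pairs, and x and y are the two variables from the definition of r.

```latex
r = \frac{n\sum xy - \sum x \sum y}
         {\sqrt{\left[\,n\sum x^{2} - \left(\sum x\right)^{2}\right]
                \left[\,n\sum y^{2} - \left(\sum y\right)^{2}\right]}}
```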
Correlation coefficient examples
Example (1): Construct a scatter plot for the data obtained in a study on the number of
absences and the final grades of seven randomly selected students from a statistics class.
Also, find the correlation coefficient. The data are shown.
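
The data table for this example is missing from this text, so the values below are an assumption: seven (absences, grade) pairs chosen to agree with the totals Σx = 57 and Σy = 511 that appear in the regression calculation for the same study. With that caveat, a minimal sketch of the Pearson calculation could look like this (the function name `pearson_r` is illustrative).

```python
from math import sqrt

# ASSUMED data: not taken from this text, only consistent with its totals
# (sum of absences = 57, sum of grades = 511).
absences = [6, 2, 15, 9, 12, 5, 8]
grades = [82, 86, 43, 74, 58, 90, 78]


def pearson_r(x, y):
    """Pearson correlation coefficient from the computational (sum) formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    sum_y2 = sum(yi * yi for yi in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r = pearson_r(absences, grades)
print(round(r, 3))  # -0.944
```

For these assumed values r is close to −1, which indicates a strong negative linear relationship: more absences go with lower final grades.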
Example (2):
A researcher wishes to see if there is a relationship between the ages and net worth
of the wealthiest people in America. The data for a specific year are shown. Evaluate
Pearson's correlation coefficient.
Regression Line: (Line of best fit)
Linear Regression Equation:
$$Y = a + bX$$
where
$$b = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - \left(\sum X\right)^2}$$
and
$$a = \frac{\sum Y}{n} - b\,\frac{\sum X}{n}$$
Example:

Find the regression line for the data obtained in a study on the number of
absences and the final grades of seven randomly selected students from a
statistics class.
$$a = \frac{\sum y}{n} - b\,\frac{\sum x}{n}$$

$$a = \frac{511}{7} - (-3.622)\left(\frac{57}{7}\right) = 102.493$$
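
The slope and intercept formulas can be checked numerically. The sketch below again uses assumed data (the table is not reproduced in this text; the values are only consistent with Σx = 57 and Σy = 511), and the helper name `regression_line` and the 10-absence prediction are illustrative additions.

```python
# ASSUMED data: consistent with the totals 57 and 511 used above,
# but not taken from this text.
absences = [6, 2, 15, 9, 12, 5, 8]
grades = [82, 86, 43, 74, 58, 90, 78]


def regression_line(x, y):
    """Return (a, b) for the least-squares line y' = a + b*x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = sum_y / n - b * sum_x / n
    return a, b


a, b = regression_line(absences, grades)
print(round(a, 3), round(b, 3))

# Hypothetical prediction: expected grade for a student with 10 absences.
predicted_grade = a + b * 10
print(round(predicted_grade, 2))
```

With these assumed values the intercept rounds to 102.493, matching the worked calculation, and the negative slope means each additional absence lowers the predicted grade.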
Example 2

Evaluate Pearson's correlation coefficient and the regression line for the data
shown for car rental companies in the United States for a recent year.
Pearson’s correlation coefficient
regression line

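$$a = \frac{\sum y}{n} - b\,\frac{\sum x}{n}$$

$$a = \frac{18.7}{6} - (0.106)\left(\frac{153.8}{6}\right) \approx 0.396$$

(The value 0.396 results when b is carried unrounded; with b rounded to 0.106 the
arithmetic gives about 0.400.)

$$y' = a + bx$$

$$y' = 0.396 + 0.106x$$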
Points
Example

Find the coefficient of determination for the data obtained in a study on
the number of absences and the final grades of seven randomly selected
students from a statistics class.

$$r^2 = 0.8911$$

This result means that 89% of the variation in the dependent
variable (final grade y) is accounted for by the variation in the
independent variable (number of absences x). The rest of the
variation, 0.11, or 11%, is unexplained.
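
The "explained" and "unexplained" parts can be made concrete by splitting the total variation of y around its mean. The sketch below uses assumed data (the table is not reproduced in this text; the values only agree with its totals Σx = 57 and Σy = 511). For these values the ratio comes out near 0.8915; squaring r rounded to −0.944 gives the 0.8911 quoted above.

```python
# ASSUMED data: consistent with the totals in this text, not taken from it.
absences = [6, 2, 15, 9, 12, 5, 8]
grades = [82, 86, 43, 74, 58, 90, 78]

n = len(absences)
mean_x = sum(absences) / n
mean_y = sum(grades) / n

# Least-squares slope and intercept (deviation form of the same formulas).
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(absences, grades))
sxx = sum((x - mean_x) ** 2 for x in absences)
b = sxy / sxx
a = mean_y - b * mean_x

predicted = [a + b * x for x in absences]

# Total variation splits into an explained and an unexplained part.
total_variation = sum((y - mean_y) ** 2 for y in grades)
explained_variation = sum((p - mean_y) ** 2 for p in predicted)
unexplained_variation = sum((y - p) ** 2 for y, p in zip(grades, predicted))

r_squared = explained_variation / total_variation
print(round(r_squared, 4))      # 0.8915
print(round(1 - r_squared, 4))  # share of variation left unexplained
```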
Correlation coefficient for ordinal data
2. Spearman correlation coefficient (rank correlation): it is used to find
the correlation between ordinal qualitative variables.

$$r_s = 1 - \frac{6\sum d^2}{n\left(n^2 - 1\right)}$$

where
n is the number of data pairs
d is the difference between the two ranks in each pair.
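
A minimal sketch of the rank correlation follows, assuming the usual rank-difference form 1 − 6Σd²/(n(n² − 1)) with n and d as defined above and no tied ranks. The two rank lists are hypothetical (two judges ranking five items) and the function name is illustrative.

```python
def spearman_rank_correlation(rank_x, rank_y):
    """Spearman coefficient from rank differences.

    Assumes the inputs are already ranks (1..n) with no ties, and uses
    r_s = 1 - 6 * sum(d^2) / (n * (n^2 - 1)).
    """
    if len(rank_x) != len(rank_y):
        raise ValueError("both rankings must have the same length")
    n = len(rank_x)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical rankings of five items by two judges.
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]

r_s = spearman_rank_correlation(judge_a, judge_b)
print(round(r_s, 2))  # 0.8
```

Identical rankings give +1 and exactly reversed rankings give −1, the same range as the Pearson coefficient.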
Example
Correlation coefficient examples
Example (5): The following table gives the grades of some students in mathematics and
statistics. Find the correlation coefficient between them.
Reference:
Bluman, Allan G. Elementary Statistics: A Step by Step Approach. 8th ed.
