L5 Correlation & Regression - 082913
Correlation
Correlation is a statistical technique used to determine the degree to
which two quantitative (numerical) variables are related. It describes
the relationship between them but does not allow causal relationships
to be inferred.
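As a rough illustration of "degree of relation", the sketch below computes a correlation coefficient r, assuming the Pearson product-moment coefficient is the one meant. The function name `pearson_r` and the weight and blood pressure values are hypothetical and are not taken from the lecture.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data (assumption, not from the lecture)
weights = [60, 65, 70, 75, 80]
pressures = [110, 118, 121, 130, 135]
r = pearson_r(weights, pressures)
print(round(r, 3))
```

A value of r close to +1 or -1 indicates a strong linear relation; a value close to 0 indicates little or no linear relation.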
Scatter diagram
• Rectangular coordinates
• Two quantitative variables
• One variable is called independent (X) and the second
is called dependent (Y)
• Points are not joined
Example: scatter plot of weight (Wt) against blood pressure
Scatter plots
The pattern of the data indicates the type of relationship between the
two variables:
• Positive relationship
• Negative relationship
• No relationship
r = 0.759: strong direct relation
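The reading of r = 0.759 as a "strong direct relation" can be sketched as a small helper. The cut-offs used here (below 0.25 weak, 0.25 up to 0.75 intermediate, 0.75 and above strong) are an assumed convention; the slides do not state them, and other texts use different thresholds. The function name `describe_r` is hypothetical.

```python
def describe_r(r):
    """Describe direction and strength of a correlation coefficient.

    The thresholds below are an assumption, not taken from the lecture.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    if r == 0:
        return "no relation"
    direction = "direct" if r > 0 else "indirect"
    size = abs(r)
    if size == 1:
        strength = "perfect"
    elif size >= 0.75:
        strength = "strong"
    elif size >= 0.25:
        strength = "intermediate"
    else:
        strength = "weak"
    return f"{strength} {direction} relation"

print(describe_r(0.759))  # strong direct relation
```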
Example
Find the correlation between the marks of English and Math.
Solution
1- Rank the marks in each subject (English and Math).
2- Since there are ten marks (n = 10), the ranks run from 1 to 10.
3- Give rank 1 to the highest mark, 2 to the second highest, and so on.
4- Do this for both subjects.
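The ranking steps above can be sketched as follows. The marks are hypothetical, since the slide's data table did not survive, and the function name `rank_descending` is an assumption. Tied values share the average of the ranks they occupy, which is the convention used in the education and income example.

```python
def rank_descending(values):
    """Rank values so the highest gets rank 1; ties share the average rank."""
    # Indices of the values, ordered from highest value to lowest
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend 'end' over the run of values tied with the one at 'pos'
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Positions pos..end correspond to ranks pos+1..end+1; take their mean
        shared = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks

# Hypothetical marks for ten students (assumption, not from the lecture)
english = [56, 75, 45, 71, 62, 64, 58, 80, 76, 61]
print(rank_descending(english))
```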
Example
In a study of the relationship between level of education and
income, the following data were obtained. Find the
relationship between them and comment.
Ranking of X (level of education):
X              Rank   Rank (X)
Preparatory    5      5
Primary        6      6
University     2      (1+2)/2 = 1.5
Secondary      3      (3+4)/2 = 3.5
Secondary      4      (3+4)/2 = 3.5
Illiterate     7      7
University     1      (1+2)/2 = 1.5

Ranking of Y (income):
Y     Rank   Rank (Y)
25    3      3
10    5      (5+6)/2 = 5.5
8     7      7
10    6      (5+6)/2 = 5.5
15    4      4
50    2      2
60    1      1
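Answer:
Sample   X             Y     Rank (X)   Rank (Y)   di      di²
A        Preparatory   25    5          3          2       4
B        Primary       10    6          5.5        0.5     0.25
C        University    8     1.5        7          -5.5    30.25
D        Secondary     10    3.5        5.5        -2      4
E        Secondary     15    3.5        4          -0.5    0.25
F        Illiterate    50    7          2          5       25
G        University    60    1.5        1          0.5     0.25
∑ di² = 64
Comment:
There is a weak indirect correlation between level of education and income.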
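The calculation behind this answer can be checked with a short sketch. It assumes the usual Spearman rank formula r_s = 1 - 6∑d²/(n(n² - 1)), which the slide implies by reporting ∑ di² but does not print. The data are the seven pairs from the example; the names `EDUCATION_ORDER`, `average_ranks` and `spearman` are hypothetical.

```python
# Ordering of education levels, lowest to highest (implied by the ranks in the example)
EDUCATION_ORDER = {"Illiterate": 0, "Primary": 1, "Preparatory": 2,
                   "Secondary": 3, "University": 4}

def average_ranks(values):
    """Highest value gets rank 1; tied values share the mean of their ranks."""
    ranks = []
    for v in values:
        greater = sum(1 for w in values if w > v)
        equal = sum(1 for w in values if w == v)
        # A tie group occupies ranks greater+1 .. greater+equal; use the mean
        ranks.append(greater + (equal + 1) / 2)
    return ranks

def spearman(rank_x, rank_y):
    """Return (sum of squared rank differences, Spearman coefficient)."""
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return sum_d2, 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

levels = ["Preparatory", "Primary", "University", "Secondary",
          "Secondary", "Illiterate", "University"]
income = [25, 10, 8, 10, 15, 50, 60]

rank_x = average_ranks([EDUCATION_ORDER[lv] for lv in levels])
rank_y = average_ranks(income)
sum_d2, r_s = spearman(rank_x, rank_y)
print(sum_d2)         # 64.0
print(round(r_s, 3))  # -0.143
```

The small negative value is consistent with the comment of a weak indirect correlation. With tied ranks this simplified formula is only an approximation.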
Linear Regression
Whereas correlation measures the strength of the relation between two variables,
linear regression is used in predictive analysis to find the equation that best
relates the dependent and independent variables.
For example, we may want to relate the weights of individuals to their heights
using a linear regression model.
There are several types of regression analysis available to the researcher:
Simple linear regression
One dependent variable
One independent variable
Multiple linear regression
One dependent variable
Two or more independent variables
Logistic regression
One dependent variable (binary)
One or more independent variables
Ordinal regression
One dependent variable (ordinal)
One or more independent variables (nominal)
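For the simple linear regression case listed above (one dependent and one independent variable), the intercept a and slope b of Y = a + bX can be estimated by ordinary least squares. The sketch below assumes that method; the function name `fit_line` and the height and weight values are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for the line Y = a + b*X."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("slope is undefined when X is constant")
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept: the line passes through the means
    return a, b

# Hypothetical heights (cm) and weights (kg), not from the lecture
heights = [150, 160, 170, 180, 190]
weights = [52, 58, 66, 73, 81]
a, b = fit_line(heights, weights)
print(f"Y = {a:.3f} + {b:.3f} X")
```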
Solved example
Example 2
b = -2.427, a = 143.106
Y = 143.106 - 2.427 X
2- Prediction
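Prediction means substituting a new X into the fitted equation. The sketch below uses the coefficients from Example 2 (a = 143.106, b = -2.427). The input value 10 is hypothetical, and the slide text does not say which quantities X and Y represent.

```python
A = 143.106  # intercept from Example 2
B = -2.427   # slope from Example 2

def predict(x):
    """Predicted Y for a given X using Y = 143.106 - 2.427 X."""
    return A + B * x

# Hypothetical input value (assumption, not from the lecture)
print(round(predict(10), 3))  # 118.836
```

Predictions are most trustworthy for X values inside the range of the data used to fit the line.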
3-Model validation