Correlation Regression
Relationships
In most practical applications, we find a relationship between two (or more) variables. For
example, there is a relationship between a student's grade and the number of hours spent
studying. There are two types of variables:
Independent Variable (X): the variable that causes changes in the dependent variable.
Dependent Variable (Y): the variable that measures the outcome of a study.
A scatter plot (or scatter diagram) is used to show the relationship between two variables.
-1 ≤ r ≤ +1
Value of r      Degree of correlation
r = ±1          Perfect correlation
0.5 to 1        Positive strong correlation
0 to 0.5        Positive weak correlation
r = 0           No correlation
-0.5 to 0       Negative weak correlation
-1 to -0.5      Negative strong correlation
Pearson's Correlation Coefficient
Used to measure the strength and direction of the linear relationship between the dependent
variable (Y) and the independent variable (X).
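The coefficient satisfies -1 ≤ r ≤ +1. On this scale, values from -1 to -0.5 are strong negative,
from -0.5 to 0 weak negative, from 0 to 0.5 weak positive, and from 0.5 to +1 strong positive.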
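The scale can be turned into a small lookup. This is only a sketch: the function name `describe_correlation` is hypothetical, and it assumes that a value of exactly 0.5 in magnitude counts as strong, which the scale itself does not settle.

```python
def describe_correlation(r: float) -> str:
    """Map a correlation coefficient to the verbal scale used in these notes."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    if abs(r) == 1.0:
        return "perfect correlation"
    if r == 0.0:
        return "no correlation"
    direction = "positive" if r > 0 else "negative"
    # Assumption: |r| >= 0.5 is treated as strong, anything smaller as weak.
    strength = "strong" if abs(r) >= 0.5 else "weak"
    return f"{direction} {strength} correlation"


print(describe_correlation(-0.932))  # negative strong correlation
print(describe_correlation(0.306))   # positive weak correlation
```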
X    2   4   7  10  15  18  22  30
Y   94  70  62  55  38  31  42  12
X       Y       XY      X²      Y²
2       94      188     4       8836
4       70      280     16      4900
7       62      434     49      3844
10      55      550     100     3025
15      38      570     225     1444
18      31      558     324     961
22      42      924     484     1764
30      12      360     900     144
Total:
108     404     3864    2102    24918
CORRELATION
Calculate Pearson's correlation coefficient between the two variables.
$$ r = \frac{n \sum XY - \sum X \sum Y}{\sqrt{n \sum X^2 - (\sum X)^2} \, \sqrt{n \sum Y^2 - (\sum Y)^2}} $$
$$ r = \frac{-12720}{\sqrt{5152} \, \sqrt{36128}} \approx -0.932 $$
So there is a strong negative correlation between the two variables X and Y.
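The calculation above can be checked with a few lines of Python. This is a minimal sketch that follows the sum formula term by term; the function name `pearson_r` is an assumption made for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from the raw sums: n, sum X, sum Y, sum XY, sum X^2, sum Y^2."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


x = [2, 4, 7, 10, 15, 18, 22, 30]
y = [94, 70, 62, 55, 38, 31, 42, 12]
r = pearson_r(x, y)
print(round(r, 3))  # -0.932
```

The intermediate quantities match the table: the numerator is -12720 and the two terms under the square roots are 5152 and 36128.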
Example:
Use the following table data:
X   12  14  16  23  18  14
Y   34  45  43  39  45  31
X       Y       XY      X²      Y²
12      34      408     144     1156
14      45      630     196     2025
16      43      688     256     1849
23      39      897     529     1521
18      45      810     324     2025
14      31      434     196     961
Total:
97      237     3867    1645    9537
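Calculate Pearson's correlation coefficient between the two variables.
$$ r = \frac{n \sum XY - \sum X \sum Y}{\sqrt{n \sum X^2 - (\sum X)^2} \, \sqrt{n \sum Y^2 - (\sum Y)^2}} $$
$$ r = \frac{(6 \times 3867) - (97 \times 237)}{\sqrt{(6 \times 1645) - (97)^2} \, \sqrt{(6 \times 9537) - (237)^2}} = \frac{213}{\sqrt{461} \, \sqrt{1053}} \approx 0.306 $$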
There is a weak positive correlation between X and Y.
REGRESSION
Regression is the study of the joint distribution of two variables: an independent variable (X)
and a dependent variable (Y) whose values depend on the value of X. It helps explain the change
in the dependent variable (Y) according to the change in the independent variable (X).
If the correlation is strong, the points formed by the two variables lie close to a straight line
that represents the relationship between them. This regression line is found by the least squares method:
$$ Y = aX + b $$
$$ a = \frac{n \sum XY - \sum X \sum Y}{n \sum X^2 - (\sum X)^2} $$
$$ b = \frac{\sum Y}{n} - a \frac{\sum X}{n} $$
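The two formulas translate directly into code. The sketch below assumes a hypothetical helper named `least_squares_line` and tries it on made-up points that lie exactly on the line Y = 2X + 1, so the expected slope and intercept are known in advance.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the regression line Y = aX + b using the sum formulas."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        # All X values are identical, so the slope is undefined.
        raise ValueError("X values must not all be equal")
    a = (n * sum_xy - sum_x * sum_y) / denominator  # slope
    b = sum_y / n - a * sum_x / n                   # intercept
    return a, b


# Hypothetical data lying exactly on Y = 2X + 1.
a, b = least_squares_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)  # 2.0 1.0
```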
Example 1:
Use the following table data:
X    2   4   7  10  15  18  22  30
Y   94  70  62  55  38  31  42  12
X       Y       XY      X²
2       94      188     4
4       70      280     16
7       62      434     49
10      55      550     100
15      38      570     225
18      31      558     324
22      42      924     484
30      12      360     900
Total:
108     404     3864    2102
a) Find the regression equation.
$$ a = \frac{n \sum XY - \sum X \sum Y}{n \sum X^2 - (\sum X)^2} = \frac{(8 \times 3864) - (108 \times 404)}{(8 \times 2102) - (108)^2} = -2.4689 $$
$$ b = \frac{\sum Y}{n} - a \frac{\sum X}{n} = \frac{404}{8} - (-2.4689) \times \frac{108}{8} = 83.83 $$
$$ Y = -2.4689 X + 83.83 $$
b) Find the value of Y if X = 6.
$$ Y = -2.4689 \times 6 + 83.83 \approx 69.017 $$
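As a cross-check of parts a) and b), the same steps can be run in Python on the Example 1 data. This is a sketch only; the helper name `predict` is hypothetical, and the code keeps full precision for a and b rather than the rounded values used by hand.

```python
x = [2, 4, 7, 10, 15, 18, 22, 30]
y = [94, 70, 62, 55, 38, 31, 42, 12]

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi * xi for xi in x)

# Slope and intercept from the least squares formulas.
a = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
b = sum_y / n - a * sum_x / n


def predict(x_new):
    """Estimate Y on the fitted line Y = aX + b."""
    return a * x_new + b


print(round(a, 4), round(b, 2))  # -2.4689 83.83
print(round(predict(6), 3))      # 69.017
```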
Example 2:
A sample of 10 families in the Chicago area revealed the following figures for family size and
the amount spent on food per week:
Size 3 6 5 6 6 3 4 4 5 3
Food 99 104 55 151 129 142 111 74 91 111