Correlation and Regression Analysis
Regression Analysis
Correlation
$$r = \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{\left[N\sum X^2 - (\sum X)^2\right]\left[N\sum Y^2 - (\sum Y)^2\right]}}$$
Test of Significance
$$t = \frac{r\sqrt{N-2}}{\sqrt{1-r^2}}$$
df = N – 2
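The test statistic can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function name `correlation_t_statistic` is hypothetical, and the inputs r = 0.93 and N = 12 are taken from the worked example in these notes.

```python
import math

def correlation_t_statistic(r: float, n: int) -> float:
    """t statistic for testing H0: no correlation, with n - 2 degrees of freedom."""
    if n <= 2:
        raise ValueError("need more than two paired observations")
    if abs(r) >= 1:
        raise ValueError("t is undefined when |r| = 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Illustrative inputs: r = 0.93 from N = 12 pairs, so df = 10.
t_stat = correlation_t_statistic(0.93, 12)
print(round(t_stat, 2))  # 8.0
```

The computed value is then compared with the critical t value for df = N – 2.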
Correlation Coefficient & Strength of Relationships
Day                    1    2    3    4    5    6    7    8    9   10   11   12
Temperature (°F)      79   76   78   84   90   83   93   94   97   85   88   82
Total Sales (Units)  147  143  147  168  206  155  192  211  209  187  200  150
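$$\begin{aligned}
r &= \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{\left[N\sum X^2 - (\sum X)^2\right]\left[N\sum Y^2 - (\sum Y)^2\right]}} \\
  &= \frac{12(183{,}222) - (1{,}029)(2{,}115)}{\sqrt{\left[12(88{,}733) - (1{,}029)^2\right]\left[12(380{,}887) - (2{,}115)^2\right]}} \\
  &= 0.93
\end{aligned}$$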
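The substitution above can be checked with a short Python sketch that builds the same sums from the raw data. The function name `pearson_r` and the variable names are illustrative choices, not part of the original material.

```python
import math

# Temperature (°F) and total sales (units) for the 12 days in the example.
temperature = [79, 76, 78, 84, 90, 83, 93, 94, 97, 85, 88, 82]
sales = [147, 143, 147, 168, 206, 155, 192, 211, 209, 187, 200, 150]

def pearson_r(x, y):
    """Pearson correlation computed from the raw-score (sums) formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

r = pearson_r(temperature, sales)
print(round(r, 2))  # 0.93
```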
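[Figure: t distribution with rejection regions beyond the critical values –2.228 and +2.228 (df = 10, two-tailed, α = 0.05); the computed t = 8.00 lies in the upper rejection region.]
Decision: Reject H0.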
Step 6: Conclusion.
We conclude that there is a significant
association between atmospheric temperature
and the total sales of fruit shakes.
Example 2: Spearman Rank
Day                    1    2    3    4    5    6    7    8    9   10   11   12
Temperature (°F)      79   76   78   84   90   83   93   94   97   85   88   82
Total Sales (Units)  147  143  147  168  206  155  192  211  209  187  200  150
Day      X     Y    RX     RY      D     D²
1       79   147    10   10.5   –0.5   0.25
2       76   143    12     12      0      0
3       78   147    11   10.5    0.5   0.25
4       84   168     7      7      0      0
5       90   206     4      3      1      1
6       83   155     8      8      0      0
7       93   192     3      5     –2      4
8       94   211     2      1      1      1
9       97   209     1      2     –1      1
10      85   187     6      6      0      0
11      88   200     5      4      1      1
12      82   150     9      9      0      0
Total                        ΣD = 0   ΣD² = 8.5
Computation of Spearman's rank correlation coefficient
$$r_s = 1 - \frac{6\sum D^2}{N(N^2-1)} = 1 - \frac{6(8.5)}{12(12^2-1)} = 1 - \frac{51}{12(143)} = 1 - 0.03 = 0.97$$
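The ranking and the computation above can be reproduced with a small Python sketch. The helper names `average_ranks` and `spearman_rho` are hypothetical. Ranks run from largest (rank 1) to smallest, tied values share the mean of their positions, and the shortcut formula is applied as in the worked computation even though one tie is present.

```python
temperature = [79, 76, 78, 84, 90, 83, 93, 94, 97, 85, 88, 82]
sales = [147, 143, 147, 168, 206, 155, 192, 211, 209, 187, 200, 150]

def average_ranks(values):
    """Rank 1 = largest value; tied values receive the mean of their ranks."""
    ordered = sorted(values, reverse=True)
    rank_of = {}
    for v in set(values):
        first = ordered.index(v) + 1   # 1-based position of the first occurrence
        count = ordered.count(v)       # number of tied observations
        rank_of[v] = first + (count - 1) / 2
    return [rank_of[v] for v in values]

def spearman_rho(x, y):
    """Return (rho, sum of squared rank differences) using the shortcut formula."""
    n = len(x)
    rx, ry = average_ranks(x), average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1)), sum_d2

rho, sum_d2 = spearman_rho(temperature, sales)
print(sum_d2, round(rho, 2))  # 8.5 0.97
```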
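[Figure: t distribution with rejection regions beyond the critical values –2.228 and +2.228 (df = 10, two-tailed, α = 0.05); the computed t = 12.62 lies in the upper rejection region.]
Decision: Reject H0.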
Step 6: Conclusion.
We conclude that there is a significant
association between atmospheric temperature
and the total sales of fruit shakes.
Simple Regression Equation
Regression analysis is a simple statistical
tool used to model the dependence of a
variable on one (or more) explanatory
variables.
A simple linear regression is the least
squares estimator of a linear regression
model with a single predictor (one
independent variable).
The least squares method determines a
regression equation by minimizing the sum
of squares of the vertical distances between
the actual Y values and the predicted values
of Y.
Assumptions of Linear Regression Equation
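Intercept of the regression line:
$$a = \frac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n\sum x^2 - (\sum x)^2}$$
Slope of the regression line:
$$b = \frac{n\sum XY - (\sum X)(\sum Y)}{n\sum X^2 - (\sum X)^2}$$
Regression equation:
$$y = a + bx$$
Where:
y = criterion measure
x = predictor
a = ordinate, or the point where the regression line crosses the y-axis
b = beta weight, or the slope of the line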
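As a minimal sketch, the intercept and slope formulas can be evaluated directly from the sums. The function name `fit_line` is hypothetical, and reusing the temperature and sales data from the correlation examples is an assumption made only for illustration; the original material does not fit a regression line to them.

```python
# Data reused from the correlation examples (illustrative assumption).
temperature = [79, 76, 78, 84, 90, 83, 93, 94, 97, 85, 88, 82]
sales = [147, 143, 147, 168, 206, 155, 192, 211, 209, 187, 200, 150]

def fit_line(x, y):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    denominator = n * sum_x2 - sum_x ** 2   # shared by both formulas
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
    b = (n * sum_xy - sum_x * sum_y) / denominator
    return a, b

a, b = fit_line(temperature, sales)
print(round(a, 2), round(b, 2))  # -145.28 3.75
```

Under these assumptions the fitted line would read y = –145.28 + 3.75x, so each additional degree Fahrenheit is associated with roughly 3.75 more units sold.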
Measures of Variations
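[Figure: scatter of Y against X with the regression line Ŷ = b₁X + b₀ and the horizontal line at Ȳ. At a point (Xi, Yi), the total sum of squares is split into the explained sum of squares (from Ȳ to Ŷi) and the unexplained sum of squares (from Ŷi to Yi).]
$$\sum (Y_i - \bar{Y})^2 = \sum (\hat{Y}_i - \bar{Y})^2 + \sum (Y_i - \hat{Y}_i)^2$$
$$SST = SSR + SSE$$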
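The decomposition can be verified numerically. The sketch below is illustrative only: it reuses the temperature and sales data from the correlation examples (an assumption, since the original does not fit a line to them) and computes the slope and intercept in the equivalent mean-deviation form.

```python
import math

# Data reused from the correlation examples (illustrative assumption).
x = [79, 76, 78, 84, 90, 83, 93, 94, 97, 85, 88, 82]
y = [147, 143, 147, 168, 206, 155, 192, 211, 209, 187, 200, 150]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares slope (b1) and intercept (b0).
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)                # total
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)            # explained
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))   # unexplained

# The identity holds up to floating-point rounding.
print(math.isclose(sst, ssr + sse))
```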
Standard Error of Estimate
$$s_E = \sqrt{\frac{SSE}{n-2}} = \sqrt{\frac{\sum (Y_i - \hat{Y}_i)^2}{n-2}} = \sqrt{\frac{\sum Y^2 - b_0\sum Y - b_1\sum XY}{n-2}}$$
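A short sketch can confirm that the definitional form and the computational shortcut agree. The function name `standard_error` is hypothetical, b0 and b1 denote the intercept and slope of the fitted line, and the data are again borrowed from the correlation examples purely for illustration.

```python
import math

# Data reused from the correlation examples (illustrative assumption).
x = [79, 76, 78, 84, 90, 83, 93, 94, 97, 85, 88, 82]
y = [147, 143, 147, 168, 206, 155, 192, 211, 209, 187, 200, 150]

def standard_error(x, y):
    """Return s_E computed two ways: from residuals and from the shortcut sums."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    sum_y2 = sum(yi * yi for yi in y)
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # slope
    b0 = (sum_y - b1 * sum_x) / n                                  # intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    from_residuals = math.sqrt(sse / (n - 2))
    from_shortcut = math.sqrt((sum_y2 - b0 * sum_y - b1 * sum_xy) / (n - 2))
    return from_residuals, from_shortcut

se_residuals, se_shortcut = standard_error(x, y)
print(math.isclose(se_residuals, se_shortcut, rel_tol=1e-6))
```

The shortcut form subtracts large, nearly equal sums, so it can lose precision on data with large values; the residual form is usually the safer choice in code.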
Coefficient of Determination
The coefficient of determination measures the
proportion of the variation in the dependent
variable that is explained by the regression line,
that is, by the independent variable.
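$$r^2 = \frac{\text{total variation} - \text{unexplained variation}}{\text{total variation}} = \frac{\text{explained variation}}{\text{total variation}}$$
$$r^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST} = 1 - \frac{\sum (Y_i - \hat{Y}_i)^2}{\sum (Y_i - \bar{Y})^2}$$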
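Both expressions for r² can be computed side by side, as in the sketch below. The function name `r_squared` is hypothetical and the data are reused from the correlation examples as an illustrative assumption.

```python
# Data reused from the correlation examples (illustrative assumption).
x = [79, 76, 78, 84, 90, 83, 93, 94, 97, 85, 88, 82]
y = [147, 143, 147, 168, 206, 155, 192, 211, 209, 187, 200, 150]

def r_squared(x, y):
    """Return (SSR/SST, 1 - SSE/SST) for the least squares line."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    intercept = y_bar - slope * x_bar
    y_hat = [intercept + slope * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
    return ssr / sst, 1 - sse / sst

explained_share, one_minus_unexplained = r_squared(x, y)
print(round(explained_share, 2), round(one_minus_unexplained, 2))  # 0.86 0.86
```

For simple linear regression this value equals the square of the Pearson correlation coefficient, so about 86% of the variation in sales would be explained by temperature under these assumptions.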