Week 10 Correlation and Regression Analysis Lec
Regression Analysis
Pearson Product-Moment Correlation
Spearman Rank Correlation
Linear Regression Analysis
Engr. Nora G. Yulo, MATM 111 Teacher
Correlation
Correlation measures the relationship between two
quantitative variables; on its own it does not allow causal
relationships to be inferred. It is a statistical technique used to
determine the degree to which two variables are related.
Scatter Diagram
Rectangular coordinate
Two quantitative variables
One variable is called independent (X)
and the second is called dependent (Y)
Points are not joined
No frequency table
[Figure: scatter diagram with X on the horizontal axis and Y on the vertical axis]
Example:
Correlation
Definition              Situation
Positive correlation    Both variables increase.
Negative correlation    One variable increases while the other variable decreases.
No correlation          One variable neither increases nor decreases while the other variable increases.
Relationships of Variables
\[ r = \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{\left[N\sum X^2 - (\sum X)^2\right]\left[N\sum Y^2 - (\sum Y)^2\right]}} \]
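The raw-sum formula for r can be sketched in Python as below. This is a minimal illustration, not part of the lecture: the function name `pearson_r` and the two small data sets are made up to show the extreme cases of a perfect positive and a perfect negative correlation.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson r computed from the raw-sum (computational) formula."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two values")
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when either variable is constant")
    return numerator / denominator


# Made-up illustrative data (an assumption, not taken from the lecture).
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # Y increases as X increases
falling = [10, 8, 6, 4, 2]   # Y decreases as X increases

r_rising = pearson_r(x, rising)
r_falling = pearson_r(x, falling)
print(r_rising, r_falling)
```

The sign of r matches the direction of the relationship, and its magnitude reaches 1 only when the points fall exactly on a straight line.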
Spearman Rank Correlation
Spearman rank correlation (or Spearman's rho) is a nonparametric test that is used to
measure the degree of association between two variables. It is often denoted by ρ (rho) or
as r_s. Spearman rank correlation is the nonparametric counterpart of the Pearson Product-Moment
Correlation in parametric statistics. It is calculated by converting each variable to ranks and
calculating the Pearson Product-Moment Correlation between the two sets of ranks. For small sample
sizes, the observed correlation coefficient is compared to what would result if the ranks of the
X-values and Y-values were random permutations of the integers 1 to n (sample size). The
following formula is used to calculate the Spearman rank correlation:
\[ \rho \text{ or } r_s = 1 - \frac{6\sum D^2}{N(N^2 - 1)} \]
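A minimal Python sketch of the rank-difference calculation follows, where D is the difference between the paired ranks and N is the number of pairs. The helper names `ranks` and `spearman_rho` and the sample values are hypothetical, and the shortcut formula is exact only when there are no tied values.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman's rho via 1 - 6*sum(D^2) / (N*(N^2 - 1)); assumes no ties."""
    n = len(x)
    rank_x, rank_y = ranks(x), ranks(y)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Made-up illustrative data with no ties (an assumption, not from the lecture).
x = [10, 20, 30, 40, 50]
y = [1, 3, 2, 5, 4]
rho = spearman_rho(x, y)
print(rho)
```

Here the rank differences are 0, -1, 1, -1, 1, so the sum of D² is 4 and rho is 1 - 24/120.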
Does there appear to be a relationship between grams of fat and calories (show a scatter diagram)? Compute the
coefficient of correlation. Determine at the 0.05 significance level whether the correlation in the population is greater
than zero.
Pearson Product-Moment Correlation
Step 2: Calculate the degrees of freedom (d.f. = N – 2) and determine the critical value of t.
(Use the t-table.)
d.f. = ____ – 2 = ____ and the critical value of t = ____.
Solution:
Since the computed t-value of ______________ is greater than the tabular value of __________ at the 0.05 level of significance,
we reject the null hypothesis.
Since the null hypothesis is _______________________, we can conclude that there is evidence of a significant
association between ______________________________________________________________
Day                           1    2    3    4    5    6    7    8    9    10   11   12
Atmospheric Temperature (X)   79   76   78   84   90   83   93   94   97   85   88   82
Total Sales (Units) (Y)       147  143  147  168  206  155  192  211  209  187  200  150
Step 2: Calculate the degrees of freedom (d.f. = N – 2) and determine the critical value.
(Use the table of CRITICAL VALUES FOR THE CORRELATION COEFFICIENT.)
d.f. = 12 – 2 = 10, and the critical value of r is 0.576.
Pearson Product-Moment Correlation
Step 3: Determine the value of r (Pearson product-moment correlation coefficient).
Day     X       Y       X²       Y²        XY
1       79      147     6 241    21 609    11 613
2       76      143     5 776    20 449    10 868
3       78      147     6 084    21 609    11 466
4       84      168     7 056    28 224    14 112
5       90      206     8 100    42 436    18 540
6       83      155     6 889    24 025    12 865
7       93      192     8 649    36 864    17 856
8       94      211     8 836    44 521    19 834
9       97      209     9 409    43 681    20 273
10      85      187     7 225    34 969    15 895
11      88      200     7 744    40 000    17 600
12      82      150     6 724    22 500    12 300
Total   1 029   2 115   88 733   380 887   183 222
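As a worked check, the column totals above (with N = 12) can be substituted into the Pearson product-moment formula. The intermediate values below are rounded for display.

```latex
\begin{aligned}
r &= \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{\left[N\sum X^2 - (\sum X)^2\right]\left[N\sum Y^2 - (\sum Y)^2\right]}} \\
  &= \frac{12(183\,222) - (1\,029)(2\,115)}{\sqrt{\left[12(88\,733) - (1\,029)^2\right]\left[12(380\,887) - (2\,115)^2\right]}} \\
  &= \frac{22\,329}{\sqrt{(5\,955)(97\,419)}} \approx \frac{22\,329}{24\,085.9} \approx 0.93
\end{aligned}
```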
The coefficient of correlation, r = 0.93, between the atmospheric temperature and total
sales indicates a very high, direct correlation (a very dependable relationship); that is,
an increase in atmospheric temperature is highly associated with an increase in
total sales of fruit shake.
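Step 3: Calculate the t value using the formula
\[ t = r\sqrt{\frac{N - 2}{1 - r^2}} \]
where t is the t test for the correlation coefficient, r is the correlation
coefficient, and N is the number of paired samples. Then determine the statistical decision for hypothesis testing.
(NOTE: If the computed t-value is less than or equal to the tabular value, do not reject the null hypothesis. If it is greater than the tabular value, reject the null hypothesis.)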
Solution:
Since the computed t-value of 8.00 is greater than the critical t-value of 2.228 (d.f. = 10, two-tailed) at the 0.05 level of
significance, and equivalently the computed r of 0.93 exceeds the tabular value of 0.576, we reject the null hypothesis.
Since the null hypothesis is rejected, we can conclude that there is evidence that shows significant association
between the atmospheric temperature and the total sales of fruit shake.
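The t computation for this example can be sketched as follows, using the standard t statistic for a correlation coefficient, t = r·sqrt((N – 2)/(1 – r²)). The function name `correlation_t` is hypothetical, and the critical value 2.228 is the usual two-tailed t-table entry for d.f. = 10 at the 0.05 level, supplied here as an assumption.

```python
from math import sqrt


def correlation_t(r, n):
    """t statistic for testing whether a correlation coefficient differs from zero."""
    return r * sqrt((n - 2) / (1 - r ** 2))


r = 0.93   # rounded coefficient from the fruit shake example
n = 12     # number of paired observations (days)
t = correlation_t(r, n)

# Assumption: two-tailed t-table value for d.f. = n - 2 = 10 at the 0.05 level.
critical_t = 2.228
reject_null = abs(t) > critical_t

print(f"t = {t:.2f}, reject null hypothesis: {reject_null}")
```

With r rounded to 0.93 the statistic rounds to 8.00; carrying more decimal places of r would give a slightly smaller t, but the decision is the same.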
Regression Analysis
Regression: a technique for predicting one variable from the known
values of others
The process of predicting variable Y using variable X
Regression
- Uses a variable (x) to predict some outcome variable (y)
- Tells you how values in y change as a function of changes in values of x
- Linear means “straight line”
- Regression tells us how to draw the straight line described by the
correlation
Regression
- It calculates the “best-fit” line for a certain set of data.
- The regression line makes the sum of the squares of the residuals smaller
than for any other line.
- Regression minimizes residuals.
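The least-squares property can be illustrated with the temperature and sales data from the fruit shake example. This is a sketch under assumptions: the helper names `least_squares` and `sse` are hypothetical, and the fit uses the mean-deviation form of the slope, which is algebraically equivalent to the raw-sum form.

```python
temperature = [79, 76, 78, 84, 90, 83, 93, 94, 97, 85, 88, 82]
sales = [147, 143, 147, 168, 206, 155, 192, 211, 209, 187, 200, 150]


def least_squares(x, y):
    """Return (intercept, slope) of the best-fit line for paired data."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sse(x, y, intercept, slope):
    """Sum of squared residuals (observed y minus predicted y) for a given line."""
    return sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))


a, b = least_squares(temperature, sales)
best_sse = sse(temperature, sales, a, b)

# Nudge the fitted line in a few arbitrary directions; each alternative fits worse.
alternatives = [(a + 5, b), (a - 5, b), (a, b + 0.1), (a, b - 0.1)]
alternative_sse = [sse(temperature, sales, alt_a, alt_b) for alt_a, alt_b in alternatives]

print(f"best-fit SSE: {best_sse:.1f}")
for (alt_a, alt_b), value in zip(alternatives, alternative_sse):
    print(f"a = {alt_a:.2f}, b = {alt_b:.3f}: SSE = {value:.1f}")
```

Any line other than the fitted one gives a larger sum of squared residuals, which is what "best-fit" means here.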
Regression
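In deriving the linear regression equation, compute b and a. The
slope (b) and the intercept (a) of the linear regression equation
are obtained from:
\[ b = \frac{N\sum XY - (\sum X)(\sum Y)}{N\sum X^2 - (\sum X)^2} \quad \text{(Slope)} \]
\[ a = \frac{\sum Y - b\sum X}{N} \quad \text{(Intercept)} \]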
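As a worked illustration, the standard least-squares expressions for b and a can be evaluated with the totals from the fruit shake table (N = 12, ΣX = 1 029, ΣY = 2 115, ΣX² = 88 733, ΣXY = 183 222). Applying them to that data set is an assumption made here for illustration; the values are rounded.

```latex
\begin{aligned}
b &= \frac{N\sum XY - (\sum X)(\sum Y)}{N\sum X^2 - (\sum X)^2}
   = \frac{12(183\,222) - (1\,029)(2\,115)}{12(88\,733) - (1\,029)^2}
   = \frac{22\,329}{5\,955} \approx 3.75 \\
a &= \frac{\sum Y - b\sum X}{N}
   = \frac{2\,115 - (3.7496)(1\,029)}{12} \approx -145.28
\end{aligned}
```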
Linear Regression Equation (LRE)
ŷ = a + bX
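A short sketch of using the equation for prediction, built from the totals of the fruit shake table. The function name `predict` and the temperature value 86 are hypothetical choices for illustration, not part of the lecture.

```python
# Totals from the fruit shake table (N = 12 days).
n = 12
sum_x, sum_y = 1029, 2115
sum_x2, sum_xy = 88733, 183222

# Slope and intercept from the standard least-squares formulas.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = (sum_y - b * sum_x) / n


def predict(x):
    """Predicted total sales (y-hat) for atmospheric temperature x."""
    return a + b * x


print(f"y-hat = {a:.2f} + {b:.2f}X")

# Hypothetical temperature inside the observed range (76 to 97).
predicted_sales = predict(86)
print(f"predicted sales at X = 86: {predicted_sales:.1f}")
```

Predictions are most trustworthy for X values inside the observed range; the fitted line always passes through the point formed by the mean of X and the mean of Y.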
- End of discussion -
Is there anything that you would like to be
clarified? Would you like to add some insights in this
discussion? If there’s none, prepare yourselves for an
offline task.