
UDAU M6 Correlation & Regression


Chapter 6

Linear Correlation & Linear Regression
A. Linear Correlation
⚫ Correlation – measures the degree
and direction of the relationship
between paired variables.
⚫ 3 Degrees of Correlation
⚫ 1. perfect (±) correlations
⚫ 2. some degree of (±) correlation
⚫ 3. no correlation
[Figure: Degrees of Correlation and their Interpretation]
Examples of Correlation:
⚫ Weight vs. Height (tall people tend to be
heavier than short people)
⚫ GNP vs. Aggregate Investment
⚫ Work Output vs. Work Experience
⚫ Statistics vs. Technical Writing
⚫ Computer Literacy vs. Personality
⚫ Wine Consumption vs. Illness
⚫ Students’ Grades vs. Number of Hours Spent
Studying
Measures of Correlation
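Pearson r Correlation

$$ r = \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{\left[N\sum X^2 - (\sum X)^2\right]\left[N\sum Y^2 - (\sum Y)^2\right]}} $$

where N is the number of paired observations.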
Extent of Correlation Between Paired
Variables
⚫ 0.00 - ±0.09 - No Correlation
⚫ ±0.10 - ±0.20 - Negligible Correlation
⚫ ±0.21 - ±0.40 - Low Correlation
⚫ ±0.41 - ±0.60 - Substantial Correlation
⚫ ±0.61 - ±0.80 - Marked Correlation
⚫ ±0.81 - ±0.99 - High to Very High Correlation
⚫ ±1.00 - Perfect Correlation
Example:
⚫ Table 1 presents data relating the number
of weeks of experience in a job involving
the wiring of miniature electronic
components and the number of
components which were rejected during
the past week for 12 randomly selected
workers.
⚫ i) Compute the correlation coefficient for
the data.
⚫ ii) Test if the correlation is significant.
Table 1: Weeks of Experience & Number of
Components Rejected During a Sampled
Week for 12 Assembly Workers

Sampled Worker            1   2   3   4   5   6   7   8   9  10  11  12
Wks. of Experience (X)    7   9   6  14   8  12  10   4   2  11   1   8
# of Rejects (Y)         26  20  28  16  23  18  24  26  38  22  32  25
 W    X    Y    X²     Y²    XY
 1    7   26    49    676   182
 2    9   20    81    400   180
 3    6   28    36    784   168
 4   14   16   196    256   224
 5    8   23    64    529   184
 6   12   18   144    324   216
 7   10   24   100    576   240
 8    4   26    16    676   104
 9    2   38     4   1444    76
10   11   22   121    484   242
11    1   32     1   1024    32
12    8   25    64    625   200
∑X=92 ∑Y=298 ∑X²=876 ∑Y²=7798 ∑XY=2048
Solutions:
⚫ i.1. Using Pearson (r)

$$ r = \frac{12(2{,}048) - (92)(298)}{\sqrt{\left[12(876) - 92^2\right]\left[12(7{,}798) - 298^2\right]}} = \frac{-2{,}840}{\sqrt{(2{,}048)(4{,}772)}} $$

r = -0.908 ≈ -0.91
very strong negative
correlation
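The computation above can be checked with a short script. This is a minimal sketch that applies the raw-score Pearson formula to the Table 1 data; the function name `pearson_r` and the variable names are illustrative choices, not part of the original slides.

```python
import math

# Table 1 data: weeks of experience (X) and rejected components (Y)
weeks = [7, 9, 6, 14, 8, 12, 10, 4, 2, 11, 1, 8]
rejects = [26, 20, 28, 16, 23, 18, 24, 26, 38, 22, 32, 25]


def pearson_r(xs, ys):
    """Pearson r from the raw-score (computational) formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


r = pearson_r(weeks, rejects)
print(round(r, 3))  # -0.908
```

The intermediate sums match the table totals (∑X = 92, ∑Y = 298, ∑XY = 2048), so the script is a direct transcription of the hand computation rather than a different method.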
ii. Test for Significance
⚫ Ho: r = 0 Ha: r ≠ 0
⚫ df = n – 2 = 12 – 2 = 10
⚫ Tabular r (rt) at df = 10: α(.05) = 0.5760 α(.01) = 0.7079
⚫ Compare the computed r (rc) with the tabular r (rt):
⚫ |rc| = 0.908 > rt = 0.7079, reject Ho
⚫ The computed r is significant, which implies
that a very strong negative correlation exists
between the 2 variables. The (-) correlation
means that the longer the work experience in a
job involving the wiring of miniature components,
the fewer the rejected components.
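The tabular values of r used in this test can be related to the t distribution. The sketch below assumes the standard identity r = t / √(t² + df) and takes the two-tailed critical t values for df = 10 (2.228 and 3.169) from a standard t table; those t values and the helper name `critical_r` are assumptions added for illustration.

```python
import math


def critical_r(t_critical, df):
    """Convert a two-tailed critical t value into the equivalent critical r."""
    return t_critical / math.sqrt(t_critical ** 2 + df)


n = 12
df = n - 2
r_computed = -0.908  # computed r from part (i)

# Two-tailed critical t values for df = 10 (assumed, from a standard t table)
t_05, t_01 = 2.228, 3.169

r_05 = critical_r(t_05, df)
r_01 = critical_r(t_01, df)

# The decision compares the absolute value of r with the critical r.
reject_h0 = abs(r_computed) > r_01
print(round(r_05, 3), round(r_01, 3), reject_h0)
```

Comparing the absolute value of r with the critical value keeps the decision rule the same whether the correlation is positive or negative.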
B. Linear Regression

⚫ Linear regression is a technique used to
estimate the relationship between a
dependent variable and an independent
(predictor) variable, and to predict
values of the dependent variable.
Regression Techniques
⚫ 1. Scatter Diagram – visual or graphical
method of determining the relationship of
paired data.
⚫ 2. Equation of the Regression Line
(least-squares line) – is used as a
predictor of the y-values.
⚫ 3. Standard Error of Estimate (Se) –
measures the amount of spread of the
sample points about the regression line.
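Formulas:
⚫ Equation of the Regression Line

$$ Y_{pred.} = mX + b $$

$$ m = \frac{n\sum XY - (\sum X)(\sum Y)}{n\sum X^2 - (\sum X)^2} \qquad b = \bar{Y} - m\bar{X} $$

$$ \bar{X} = \frac{\sum X}{n} \qquad \bar{Y} = \frac{\sum Y}{n} $$

⚫ Standard Error of Estimate

$$ S_e = \sqrt{\frac{\sum Y^2 - b\sum Y - m\sum XY}{n - 2}} $$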
Examples:
⚫ 1. Each of the following pairs represents
the number of licensed drivers (X) and the
number of cars (Y) for houses in a posh
village in Metro Manila. Predict the
number of cars for each of 2 new families
with 2 and 5 drivers.
⚫ Drivers(X) | 5 5 2 2 3 1 2
⚫ Cars(Y) | 4 3 2 2 2 1 2
Solution:

X    Y    XY    X²    Y²
5    4    20    25    16
5    3    15    25     9
2    2     4     4     4
2    2     4     4     4
3    2     6     9     4
1    1     1     1     1
2    2     4     4     4
∑X=20 ∑Y=16 ∑XY=54 ∑X²=72 ∑Y²=42
⚫ X̄ = ∑X/N = 20/7 = 2.86
⚫ Ȳ = ∑Y/N = 16/7 = 2.29

$$ m = \frac{n\sum XY - (\sum X)(\sum Y)}{n\sum X^2 - (\sum X)^2} = \frac{7(54) - (20)(16)}{7(72) - (20)^2} = \frac{58}{104} = 0.56 $$

⚫ b = Ȳ – mX̄ = 2.29 – (0.56)(2.86) = 0.69
⚫ Ypred. = mX + b = 0.56X + 0.69
⚫ If X = 2, Ypred. = 0.56(2) + 0.69 = 1.81 ≈ 2 cars
⚫ If X = 5, Ypred. = 0.56(5) + 0.69 = 3.49 ≈ 3 cars
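The slope, intercept, and predictions for the drivers and cars data can be reproduced with a few lines of Python. This is a sketch only; `fit_line` is an illustrative helper name, and it works with unrounded values, so the predictions (about 1.81 and 3.48) can differ slightly in the last digit from a hand computation that uses m = 0.56 and b = 0.69.

```python
# Example 1 data: licensed drivers (X) and cars (Y) per household
drivers = [5, 5, 2, 2, 3, 1, 2]
cars = [4, 3, 2, 2, 2, 1, 2]


def fit_line(xs, ys):
    """Least-squares slope m and intercept b for Ypred = m*X + b."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b = sum_y / n - m * (sum_x / n)  # b = Ybar - m * Xbar
    return m, b


m, b = fit_line(drivers, cars)

# Predicted number of cars for families with 2 and 5 drivers
predictions = {x: m * x + b for x in (2, 5)}
for x, y_pred in predictions.items():
    print(x, round(y_pred, 2), round(y_pred))
```

Rounding to whole cars is done only at the end, after the prediction is computed.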
⚫ 2. The following are the number of years that 10
job applicants for employment abroad studied in
college and their scores in an English
proficiency test.
Yrs. in College (X) | 5 3 2 3 4 2 3 4 4 5
Scores in Eng. (Y)  | 84 63 48 75 78 58 57 72 73 89
a. Construct a scatter diagram.
b. Find the equation of the regression line.
c. Sketch the graph of the regression line.
d. Compute Se.
e. If X = 4½, what is Ypred.?
Solutions:
⚫ a) [Scatter diagram: Scores in English (Y, axis marked 50 to 90) plotted against Years in College (X, axis marked 1 to 5)]
X    Y     XY    X²     Y²
5   84    420    25   7056
3   63    189     9   3969
2   48     96     4   2304
3   75    225     9   5625
4   78    312    16   6084
2   58    116     4   3364
3   57    171     9   3249
4   72    288    16   5184
4   73    292    16   5329
5   89    445    25   7921
∑X=35 ∑Y=697 ∑XY=2554 ∑X²=133 ∑Y²=50085
⚫ b) Ȳ = 697/10 = 69.7
⚫ X̄ = 35/10 = 3.5

$$ m = \frac{10(2554) - (35)(697)}{10(133) - 35^2} = \frac{1145}{105} = 10.9 $$

⚫ b = 69.7 – 10.9(3.5) = 31.55
⚫ Ypred. = 10.9X + 31.55
⚫ e) If X = 4½,
⚫ Ypred. = 10.9(4.5) + 31.55 = 80.6 ≈ 81
⚫ d) Standard Error of Estimate

$$ S_e = \sqrt{\frac{50{,}085 - 31.55(697) - 10.9(2554)}{10 - 2}} = \sqrt{\frac{256.05}{8}} $$

⚫ Se ≈ 5.66
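The standard error of estimate for this example can be cross-checked in two ways: with the shortcut formula Se = √[(∑Y² – b∑Y – m∑XY)/(n – 2)] and directly from the residuals. The sketch below does both; the function name `standard_error` is illustrative, and because it uses unrounded m and b, it lands near 5.65, a little off from a hand computation that rounds m to 10.9 and b to 31.55.

```python
import math

# Example 2 data: years in college (X) and English proficiency scores (Y)
years = [5, 3, 2, 3, 4, 2, 3, 4, 4, 5]
scores = [84, 63, 48, 75, 78, 58, 57, 72, 73, 89]


def standard_error(xs, ys):
    """Return (Se via shortcut formula, Se via residuals, m, b)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b = sum_y / n - m * (sum_x / n)

    # Shortcut: sum of squared errors from the column totals
    sse_shortcut = sum_y2 - b * sum_y - m * sum_xy
    # Direct: sum of squared vertical distances from the regression line
    sse_direct = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))

    se_shortcut = math.sqrt(sse_shortcut / (n - 2))
    se_direct = math.sqrt(sse_direct / (n - 2))
    return se_shortcut, se_direct, m, b


se_shortcut, se_direct, m, b = standard_error(years, scores)
print(round(m, 2), round(b, 2), round(se_shortcut, 2), round(se_direct, 2))
```

Agreement between the two values is a quick check that the radical and the n – 2 divisor have been applied correctly.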
Multiple Regression
Analysis

⚫ allows you to measure how
much of the variation in a series
can be attributed to another
series.
⚫ is used when we have several
variables that we can use to
predict what we’re interested in.
Multiple Regression Equation

$$ Y = b_0 + b_1X_1 + b_2X_2 + \dots + b_nX_n $$

where:
• Y is the dependent variable to be predicted
• b_0 is the intercept
• X_1, X_2, X_3, ..., X_n are the known independent
variables that may influence Y
• b_1, b_2, b_3, ..., b_n are the numerical constants
which must be determined from the observed
data
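One common way to determine the constants b_0, b_1, ..., b_n from observed data is least squares via the normal equations. The slides do not say which method is intended, so the following is only a sketch under that assumption. The data are made up for illustration (generated without noise from Y = 1 + 2X_1 + 3X_2), and the helper names are hypothetical.

```python
def solve_linear_system(a, rhs):
    """Solve a square system by Gauss-Jordan elimination with partial pivoting.

    Assumes the system is non-singular; no check is made for a zero pivot.
    """
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest entry in this column
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        # Eliminate this column from every other row
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


def fit_multiple_regression(rows, ys):
    """Return [b0, b1, ..., bn] from the normal equations (X'X) b = X'Y."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 carries b0
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Made-up illustration data: Y = 1 + 2*X1 + 3*X2 exactly (no noise)
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
ys = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]

coefficients = fit_multiple_regression(rows, ys)
print([round(c, 4) for c in coefficients])
```

Because the illustration data lie exactly on a plane, the recovered constants should be close to 1, 2, and 3; with real observations the fit would only approximate the data.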
Thank you
