Chapter 5

This document discusses correlation analysis and correlation coefficients. It defines correlation coefficients as a measure of the strength of the linear relationship between two quantitative variables from -1 to 1. A value of 1 or -1 indicates a perfect linear relationship, while 0 indicates no linear relationship. The document provides an example of calculating the correlation coefficient between height and muscle strength in male alcoholics, finding a positive correlation of 0.42. It also discusses interpreting and testing the significance of correlation coefficients.

Uploaded by

Samuel Debele

Chapter 5: Correlation Analysis

Introduction
• Correlation coefficients are used to measure the strength of the relationship or association between two quantitative variables.
• The standard method (Pearson correlation) leads to a quantity called r that can take on any value from -1 to +1.
• This correlation coefficient, r, measures the degree of 'straight-line' association between the values of two variables.
• Thus a value of +1.0 or -1.0 is obtained if all the points in a scatter plot lie on a perfectly straight line.

• Suppose we have data on two variables, X and Y.
• If there is a close relationship between the two variables, the points in the scatter plot will tend to bunch together; otherwise they will be scattered all around.
• If high values of X are accompanied by high values of Y, and low values of X by low values of Y, then the sum of products of the two deviations from the means will be positive, and hence r will be positive too.

Example: Height and quadriceps muscle strength in 41 male alcoholics

Figure 1. Scatter diagram showing muscle strength and height for 41 male alcoholics

• Looking at Figure 1, it is fairly easy to see that taller men tend to be stronger than shorter men or, looking at it the other way round, that stronger men tend to be taller than weaker men.
• It is only a tendency: the tallest man is not the strongest, nor is the shortest man the weakest.
• Correlation enables us to measure how close this association is.
• The computation of the correlation coefficient is based on the products of the differences of the two variables from their means.


• To see how correlation works, we can draw two lines on the scatter diagram,
– a horizontal line through the mean strength and
– a vertical line through the mean height, as shown in Figure 2.
• Because large heights tend to go with large strength and small heights with small strength, there are more observations in the top right and bottom left quadrants than there are in the top left and bottom right quadrants.

Figure 2. Scatter diagram showing muscle strength and height for 41
male alcoholics, with lines through the mean height and mean strength
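
• For each subject we multiply the difference between his height and the mean height by the difference between his strength and the mean strength. This product is positive in the top right and bottom left quadrants and negative in the other two.
• When we add the products for all subjects, the sum will be positive, because there are more positive products than negative ones.
• Further, subjects with very large values for both height and strength, or very small values for both, will have large positive products.
• So the stronger the relationship is, the bigger the sum of products will be.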
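The quadrant argument can be checked numerically. The following is a minimal sketch on a small made-up data set: the eight height and strength values are hypothetical and are not the 41 patients from the example. It counts how many products of deviations are positive and how many are negative, and adds them up.

```python
# Hypothetical (height in cm, strength) values, invented for illustration only.
heights = [160, 165, 168, 172, 175, 180, 183, 188]
strengths = [210, 280, 250, 330, 300, 390, 350, 420]

def deviation_products(xs, ys):
    """Return, for each subject, (x - mean of x) * (y - mean of y)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]

products = deviation_products(heights, strengths)

# Positive products: top right and bottom left quadrants.
n_positive = sum(p > 0 for p in products)
# Negative products: top left and bottom right quadrants.
n_negative = sum(p < 0 for p in products)

sum_of_products = sum(products)
print(n_positive, n_negative, sum_of_products)
```

In this toy data most subjects fall in the two "positive" quadrants, and the shortest and tallest subjects contribute the largest products, which is the pattern described above.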

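
• If the sum of products is positive, we say that there is a positive correlation between the variables.
• If the sum of products is negative, we say that there is a negative correlation between the variables.
• The sum of products depends on the number of observations and on the units in which they are measured, so it cannot be used directly as a measure of association. Dividing it by the square root of the product of the two sums of squared deviations removes this dependence and gives the correlation coefficient r.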
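A short sketch can show why the raw sum of products is not a usable measure on its own. The data are hypothetical (invented heights and strengths), and the helper names `sum_of_products` and `pearson_r` are illustrative choices, not names from the source. Changing height from centimetres to metres, or recording every subject twice, changes the sum of products but leaves r unchanged.

```python
import math

def sum_of_products(xs, ys):
    """Sum over subjects of (x - mean of x) * (y - mean of y)."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

def pearson_r(xs, ys):
    """Sum of products scaled by the two sums of squared deviations."""
    return sum_of_products(xs, ys) / math.sqrt(
        sum_of_products(xs, xs) * sum_of_products(ys, ys)
    )

# Hypothetical heights (cm) and strengths, invented for illustration only.
heights_cm = [160, 165, 168, 172, 175, 180, 183, 188]
strengths = [210, 280, 250, 330, 300, 390, 350, 420]

heights_m = [h / 100 for h in heights_cm]      # same heights in metres
sp_cm = sum_of_products(heights_cm, strengths)
sp_m = sum_of_products(heights_m, strengths)   # 100 times smaller than sp_cm

# Recording every subject twice doubles the number of observations.
sp_doubled = sum_of_products(heights_cm * 2, strengths * 2)

r_cm = pearson_r(heights_cm, strengths)
r_m = pearson_r(heights_m, strengths)
r_doubled = pearson_r(heights_cm * 2, strengths * 2)

print(sp_cm, sp_m, sp_doubled)
print(r_cm, r_m, r_doubled)
```

The three sums of products differ, while the three values of r agree up to floating-point rounding.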

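
If we have two variables X and Y with values x_i and y_i for the ith individual, the correlation between them, denoted by r(X,Y), is given by:

$$
r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \, \sum (y_i - \bar{y})^2}}
  = \frac{\sum x_i y_i - \left(\sum x_i\right)\left(\sum y_i\right)/n}{\sqrt{\left[\sum x_i^2 - \left(\sum x_i\right)^2/n\right]\left[\sum y_i^2 - \left(\sum y_i\right)^2/n\right]}}
$$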

The equation is symmetric in the two variables: it does not matter which variable is called X and which is called Y.
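As a rough check that the two forms of the formula agree, the sketch below implements both: the form based on deviations from the means and the form based on raw sums. The function names and the five-point toy data set are assumptions made for illustration.

```python
import math

def r_deviation_form(xs, ys):
    """r from products of deviations from the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def r_computational_form(xs, ys):
    """r from the raw sums: sum x, sum y, sum x^2, sum y^2, sum xy."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sxy = sum_xy - sum_x * sum_y / n
    sxx = sum_xx - sum_x ** 2 / n
    syy = sum_yy - sum_y ** 2 / n
    return sxy / math.sqrt(sxx * syy)

# Toy data, invented for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

r_dev = r_deviation_form(xs, ys)
r_comp = r_computational_form(xs, ys)
r_swapped = r_deviation_form(ys, xs)   # X and Y exchanged
print(r_dev, r_comp, r_swapped)
```

Both forms give the same value, and exchanging the roles of the two variables does not change it. The raw-sum form is convenient for hand calculation, but with large values it can lose precision through subtraction, so the deviation form is usually the safer choice in code.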


Example 1: The correlation coefficient for muscle strength (y) and height (x) will be:

n = 41
Σx = 7,000
Σy = 13,207
Σx² = 1,196,828
Σy² = 4,757,609
Σxy = 2,267,142

r = 0.42
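The value of r can be reproduced from the sums listed in Example 1 with the raw-sum form of the formula. The helper `r_from_sums` below is a hypothetical name introduced for this sketch. The numbers are the ones given in the example.

```python
import math

def r_from_sums(n, sum_x, sum_y, sum_xx, sum_yy, sum_xy):
    """Pearson r from n and the five raw sums."""
    sxy = sum_xy - sum_x * sum_y / n    # sum of products of deviations
    sxx = sum_xx - sum_x ** 2 / n       # sum of squared deviations of x
    syy = sum_yy - sum_y ** 2 / n       # sum of squared deviations of y
    return sxy / math.sqrt(sxx * syy)

# Sums for height (x) and muscle strength (y) from Example 1.
r_height = r_from_sums(
    n=41,
    sum_x=7000,
    sum_y=13207,
    sum_xx=1196828,
    sum_yy=4757609,
    sum_xy=2267142,
)
print(round(r_height, 2))  # 0.42
```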

• The size of the correlation coefficient reflects how close the points on the scatter diagram are to a straight line.
• A correlation coefficient can be less than 1.0 even when there is a perfect relationship between the variables: r will not equal -1.0 or +1.0 unless the points lie on a straight line.
• Correlation measures closeness to a linear relationship, not to any perfect relationship.
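A small sketch illustrates the point. The data here are an assumed stand-in, not the simulated data behind the document's figure: y is determined exactly by x in both cases, once through a straight line and once through the curve y = x².

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation from products of deviations from the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = list(range(1, 11))            # x = 1, 2, ..., 10
y_line = [3 * x + 2 for x in xs]   # perfect straight-line relationship
y_curve = [x ** 2 for x in xs]     # perfect but curved relationship

r_line = pearson_r(xs, y_line)
r_curve = pearson_r(xs, y_curve)
print(r_line, r_curve)
```

The straight line gives r equal to 1 (up to rounding), while the curve gives a value below 1 even though y can be predicted from x without error.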

Figure 3. Scatter diagram showing simulated data from a population where
there is a perfect relationship between the variables and yet the population
correlation coefficient is less than one

Example 2: Figure 4. Scatter diagram showing muscle strength(y) and
age(x) for 41 male alcoholics

Figure 5. Scatter diagram showing muscle strength and age for 41
male alcoholics, with lines through the mean

Example 2 (continued): The correlation coefficient for muscle strength (y) and age (x) will be:

n = 41
Σx = 1,785
Σy = 13,207
Σx² = 82,845
Σy² = 4,757,609
Σxy = 553,800

r = -0.42
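As a check, the sums for age and muscle strength can be substituted into the raw-sum form of the formula for r. The shorthand S_xy, S_xx and S_yy for the corrected sums is introduced here for convenience and is not notation from the slides. Intermediate values are rounded.

$$
\begin{aligned}
S_{xy} &= \sum xy - \frac{\left(\sum x\right)\left(\sum y\right)}{n} = 553{,}800 - \frac{1{,}785 \times 13{,}207}{41} \approx -21{,}187.7 \\
S_{xx} &= \sum x^2 - \frac{\left(\sum x\right)^2}{n} = 82{,}845 - \frac{1{,}785^2}{41} \approx 5{,}132.2 \\
S_{yy} &= \sum y^2 - \frac{\left(\sum y\right)^2}{n} = 4{,}757{,}609 - \frac{13{,}207^2}{41} \approx 503{,}344.4 \\
r &= \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}} \approx \frac{-21{,}187.7}{\sqrt{5{,}132.2 \times 503{,}344.4}} \approx -0.417
\end{aligned}
$$

To two decimal places this is -0.42. The sign is negative because the sum of products is negative: in these data, older men tend to have lower muscle strength.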


Inference on Correlation Coefficient

Figure. Plots of Y against X for three cases: r = 0 (b = 0), r < 0 (b < 0) and r > 0 (b > 0).


Hypothesis testing on correlation coefficient

Under the null hypothesis that there is no association in the population (ρ = 0, where ρ is the population correlation coefficient), the appropriate test statistic is

$$
t = r \sqrt{\frac{n - 2}{1 - r^2}}
$$

which has a t-distribution with n - 2 degrees of freedom.

Example: for the muscle strength and height data:

$$
t = r \sqrt{\frac{n - 2}{1 - r^2}} = 0.42 \sqrt{\frac{41 - 2}{1 - (0.42)^2}} = 2.89
$$

P < 0.01
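The test can be sketched in code using only the standard library. Two assumptions are made here that the slides do not state: the P value is taken to be two-sided, and it is obtained by numerically integrating the t density with Simpson's rule rather than by looking it up in tables. The function names are illustrative.

```python
import math

def t_statistic(r, n):
    """t = r * sqrt((n - 2) / (1 - r^2)), with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

def t_density(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t, df, steps=2000):
    """Two-sided P value (assumed): 1 - 2 * integral of the density from 0 to |t|.

    The integral is approximated with Simpson's rule; steps must be even.
    """
    upper = abs(t)
    h = upper / steps
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    return 1.0 - 2.0 * total * h / 3.0

t = t_statistic(0.42, 41)      # muscle strength and height data
p = two_sided_p(t, df=41 - 2)
print(round(t, 2), p < 0.01)
```

With r = 0.42 and n = 41 this reproduces t = 2.89 on 39 degrees of freedom and a P value below 0.01. In practice a statistics library would normally be used for the P value.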

Interpretation of correlation coefficient
• Correlation coefficients lie within the range -1 to +1, with the midpoint of zero indicating no linear association between the two variables. If two variables are statistically independent their population correlation is zero, but the converse does not hold.
• A very small correlation does not necessarily indicate that two variables are not associated, however.
• To be sure of this we should study a plot of the data, because it is possible that the two variables display a non-linear relationship (for example cyclical or curved).
• In such cases r will underestimate the association, as it is a measure of linear association alone.
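An extreme case of underestimation can be sketched with made-up data: x runs symmetrically from -5 to 5 and y = x², so y is completely determined by x, yet the linear correlation is zero. The data and the function name are assumptions for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation from products of deviations from the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = list(range(-5, 6))      # -5, -4, ..., 5
ys = [x ** 2 for x in xs]    # curved relationship, symmetric about x = 0

r_symmetric = pearson_r(xs, ys)
print(r_symmetric)
```

The positive products on the right-hand side of the plot cancel the negative products on the left, so r is zero. Only a scatter plot reveals the strong curved relationship.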