Chapter 2:
Correlation and Regression
Chung, LI (SAAS , HKU ) STAT1600B Statistics: Ideas and Concepts 2017-2018 (Sem 2) Ch 2 1 / 63
Introduction
Outline
1 Introduction
2 Scatterplot
3 Correlation Coefficient r
4 Rank Correlation Coefficient rs
5 Cautions in the Use of Correlation
6 Simple Linear Regression
Scatterplot
A scatterplot is a two-dimensional graph of data values, used to reveal graphically any relationship between two variables.
Suppose we have the bivariate data in the table below.
TABLE 7.2 Bivariate Data: Scores for 10 Male College Students on Two Self-Report Measures
Scatterplot
The scatterplot lets us see at a glance the nature of the relationship, if any exists, between the two variables.
[FIGURE 7.1 Scatter diagram of the bivariate distribution of stress scores (x-axis) and eating-difficulties scores (y-axis) for 10 male college students, points labelled A-J. Data from Table 7.2.]
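As a minimal sketch, a Figure 7.1-style plot of the Table 7.2 data can be produced along these lines (matplotlib is assumed to be available; the output file name is arbitrary):

```python
# Scatterplot of the stress / eating-difficulties scores from Table 7.2.
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs headless
import matplotlib.pyplot as plt

stress = [17, 8, 8, 20, 14, 7, 21, 22, 19, 30]
eating = [9, 13, 7, 18, 11, 2, 5, 15, 26, 28]
labels = list("ABCDEFGHIJ")

fig, ax = plt.subplots()
ax.scatter(stress, eating)
for x, y, lab in zip(stress, eating, labels):
    ax.annotate(lab, (x, y))  # label each student's point, as in Figure 7.1
ax.set_xlabel("Stress")
ax.set_ylabel("Eating difficulties")
fig.savefig("figure_7_1.png")
```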
Scatterplot
Again, suppose we have data on two variables, height and handspan.
Scatterplot
We can then plot a scatterplot to investigate the relationship between height and handspan.
Scatterplot
Curvilinear Pattern
Scatterplot
Curvilinear Pattern
For instance, below is a scatterplot showing the relationship between song-specific age (age in the year the song was popular) and musical preference (positive score → above average, negative score → below average).
[Annotated scatterplot with a fitted regression line: the relationship is linear, but how strong is it?]
Correlation
Example: Height and Handspan of n = 167 individuals
For a point in the upper-right or lower-left quadrant relative to the mean point (x̄, ȳ):
$$\left(\frac{x-\bar{x}}{s_x}\right)\left(\frac{y-\bar{y}}{s_y}\right) > 0$$
For a point in the upper-left or lower-right quadrant:
$$\left(\frac{x-\bar{x}}{s_x}\right)\left(\frac{y-\bar{y}}{s_y}\right) < 0$$
Correlation
Example: Height and Handspan of n = 167 individuals
$$\sum \left(\frac{x-\bar{x}}{s_x}\right)\left(\frac{y-\bar{y}}{s_y}\right) > 0 \quad \Rightarrow \quad \text{positive association}$$
Correlation
The Pearson correlation coefficient measures the strength of the linear relationship:
$$r = \frac{1}{n-1} \sum \left(\frac{x_i-\bar{x}}{s_x}\right)\left(\frac{y_i-\bar{y}}{s_y}\right)$$
[Four scatterplots: r > 0 (positive association), r < 0 (negative association), r = 0 (not associated), r = 0 (related, but not linearly)]
Correlation coefficient
Sample statistics:
$$\bar{x} = \frac{1}{n}\sum x, \qquad \bar{y} = \frac{1}{n}\sum y$$
$$S_{xx} = \sum (x-\bar{x})^2 = \sum x^2 - \frac{(\sum x)^2}{n}$$
$$S_{yy} = \sum (y-\bar{y})^2 = \sum y^2 - \frac{(\sum y)^2}{n}$$
$$S_{xy} = \sum (x-\bar{x})(y-\bar{y}) = \sum xy - \frac{(\sum x)(\sum y)}{n}$$
$$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} = \frac{\sum (x-\bar{x})(y-\bar{y})}{\sqrt{\sum (x-\bar{x})^2 \sum (y-\bar{y})^2}}$$
Correlation
Stress (x)   Eat Difficulty (y)   Product (xy)
17           9                    153
8            13                   104
8            7                    56
20           18                   360
14           11                   154
7            2                    14
21           5                    105
22           15                   330
19           26                   494
30           28                   840
Σx = 166, Σy = 134, Σxy = 2610, Σx² = 3248, Σy² = 2458
$$S_{xx} = 3248 - \frac{166^2}{10} = 492.4$$
$$S_{yy} = 2458 - \frac{134^2}{10} = 662.4$$
$$S_{xy} = 2610 - \frac{166 \times 134}{10} = 385.6$$
Correlation coefficient:
$$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} = \frac{385.6}{\sqrt{492.4 \times 662.4}} = 0.675$$
Positive association. How strong?
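The worked example above can be checked in a few lines of Python (standard library only):

```python
# Verify S_xx, S_yy, S_xy and r for the stress (x) / eating-difficulty (y)
# scores of Table 7.2, using the computational formulas.
import math

x = [17, 8, 8, 20, 14, 7, 21, 22, 19, 30]
y = [9, 13, 7, 18, 11, 2, 5, 15, 26, 28]
n = len(x)

sxx = sum(v * v for v in x) - sum(x) ** 2 / n                  # 3248 - 166^2/10 = 492.4
syy = sum(v * v for v in y) - sum(y) ** 2 / n                  # 2458 - 134^2/10 = 662.4
sxy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n   # 2610 - 166*134/10 = 385.6

r = sxy / math.sqrt(sxx * syy)
print(round(r, 3))  # 0.675
```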
Correlation
$$-1 \le r \le 1$$
r = -1: perfect negative linear relationship.
r = +1: perfect positive linear relationship.
r = 0: no linear relationship (uncorrelated).
Correlation
The Pearson correlation coefficient is sensitive to outliers.
[Two scatterplots of the same data, without and with a single outlying point: r = 0.675 vs r = 0.899]
Rank Correlation
Rank Correlation Coefficient (Spearman's Rho)
$$r_s = \frac{\sum (R_x - \bar{R}_x)(R_y - \bar{R}_y)}{\sqrt{\sum (R_x - \bar{R}_x)^2 \sum (R_y - \bar{R}_y)^2}}$$
With n = 11 pairs of ranks and ΣRx = 66, ΣRy = 66, ΣRxRy = 473.5, ΣRx² = 505.5, ΣRy² = 506:
$$S_{R_x R_x} = 505.5 - \frac{66^2}{11} = 109.5$$
$$S_{R_y R_y} = 506 - \frac{66^2}{11} = 110$$
$$S_{R_x R_y} = 473.5 - \frac{66 \times 66}{11} = 77.5$$
Spearman's Rho:
$$r_s = \frac{77.5}{\sqrt{109.5 \times 110}} = 0.706$$
Rank Correlation
Computational formula (when there are no tied ranks):
$$r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}, \qquad d_i = R_x - R_y$$
For the example:
$$r_s = 1 - \frac{6 \times 18}{8(8^2 - 1)} = 0.786$$
Strongly positively associated.
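A quick check of the shortcut formula, using the two vice-presidents' candidate rankings from the rank-table example later in this chapter:

```python
# Spearman's rho via the shortcut formula r_s = 1 - 6*sum(d_i^2) / (n(n^2-1)).
rx = [2, 6, 5, 4, 3, 7, 1, 8]  # VP 1's ranking of the eight candidates
ry = [4, 6, 7, 3, 1, 5, 2, 8]  # VP 2's ranking

n = len(rx)
d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))  # sum of squared rank differences = 18
rs = 1 - 6 * d2 / (n * (n * n - 1))
print(round(rs, 3))  # 0.786
```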
Correlation ≠ Causation
Example: Price and Demand for town gas
Year 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969
Price 30 31 37 42 43 45 50 54 54 57
Demand 134 112 136 109 105 87 56 43 77 35
Year 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979
Price 58 58 60 73 88 89 92 97 100 102
Demand 65 56 58 55 49 39 36 46 40 42
[Diagram: Time is a lurking variable driving both Price and Demand, with sub-periods 1960-1965, 1966-1973, and 1974-1979.]
Correlation ≠ Causation
• People with rare surnames live longer? Latent cause: inheritance.
[Diagram: GPA vs time spent, with BEng and BSc students forming separate groups.]
Correlation Coefficient r
Correlation Coefficient r
Formula:
$$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} = \frac{\sum (x-\bar{x})(y-\bar{y})}{\sqrt{\sum (x-\bar{x})^2 \sum (y-\bar{y})^2}}$$
where
$$S_{xx} = \sum (x-\bar{x})^2 = \sum x^2 - \frac{(\sum x)^2}{n}$$
$$S_{yy} = \sum (y-\bar{y})^2 = \sum y^2 - \frac{(\sum y)^2}{n}$$
$$S_{xy} = \sum (x-\bar{x})(y-\bar{y}) = \sum xy - \frac{(\sum x)(\sum y)}{n}$$
Correlation Coefficient r
Calculation of r
TABLE 7.5 Calculation of r from the Raw Scores of Table 7.2
STUDENT   X    Y    X²    Y²     XY
A         17   9    289   81     153
B         8    13   64    169    104
C         8    7    64    49     56
D         20   18   400   324    360
E         14   11   196   121    154
F         7    2    49    4      14
G         21   5    441   25     105
H         22   15   484   225    330
I         19   26   361   676    494
J         30   28   900   784    840
n = 10    Sum: 166  134   3,248  2,458  2,610
$$SS_X = \sum X^2 - \frac{(\sum X)^2}{n} = 3{,}248 - \frac{166^2}{10} = 492.4$$
$$SS_Y = \sum Y^2 - \frac{(\sum Y)^2}{n} = 2{,}458 - \frac{134^2}{10} = 662.4$$
$$\sum (X - \bar{X})(Y - \bar{Y}) = \sum XY - \frac{(\sum X)(\sum Y)}{n} = 2{,}610 - \frac{(166)(134)}{10} = 385.6$$
$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{(SS_X)(SS_Y)}} = \frac{385.6}{\sqrt{(492.4)(662.4)}} = \frac{385.6}{571.11} = 0.675$$
Correlation Coefficient r
[Six example scatterplots: strong positive correlation, strong negative correlation, very strong positive correlation, moderately strong positive correlation, weak connection, not-too-strong negative correlation]
Rank Correlation Coefficient rs
Rank Correlation Coefficient rs
Rank Table
As an example, consider two corporate vice-presidents who have just interviewed eight candidates for the position of personnel manager in the firm.
Each vice-president has separately weighed the strengths and weaknesses of each candidate and has ranked the individuals from 1 = most promising to 8 = least promising. The orderings are shown in the following rank table:
Rank Correlation Coefficient rs
Question
Find the Pearson correlation and Spearman correlation between X
and Y below:
X Y
160 26
158 24
180 19
198 58
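One way to verify your answers afterwards is to compute both coefficients directly from the definitions. There are no tied ranks here, so the shortcut formula applies; the `ranks` helper below is not part of the notes, just an illustration:

```python
# Pearson and Spearman correlation for the four (X, Y) pairs in the question.
import math

x = [160, 158, 180, 198]
y = [26, 24, 19, 58]
n = len(x)

# Pearson: r = S_xy / sqrt(S_xx * S_yy), via the computational formulas
sxx = sum(v * v for v in x) - sum(x) ** 2 / n
syy = sum(v * v for v in y) - sum(y) ** 2 / n
sxy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
r = sxy / math.sqrt(sxx * syy)

# Spearman: replace the values by their ranks, then apply the shortcut formula
def ranks(values):
    order = sorted(values)
    return [order.index(v) + 1 for v in values]  # rank 1 = smallest (no ties here)

d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
rs = 1 - 6 * d2 / (n * (n * n - 1))
print(round(r, 3), rs)
```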
Rank Correlation Coefficient rs
If there are no tied ranks in the data, then the following formula also works.
Shortcut Formula:
$$r_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$$
where
$$d_i = \text{Rank}(x_i) - \text{Rank}(y_i) = R_{x_i} - R_{y_i} \quad \text{(difference between a pair of ranks)}$$
n = the number of pairs of ranks
Rank Correlation Coefficient rs
Calculation of rs
Candidate i   Ranking of VP 1, X   Ranking of VP 2, Y   di    di²
Feldhoff      2                    4                    -2    4
Hancock       6                    6                    0     0
Johnson       5                    7                    -2    4
Pringle       4                    3                    1     1
Reilly        3                    1                    2     4
Sayer         7                    5                    2     4
Stephan       1                    2                    -1    1
Taylor        8                    8                    0     0
n = 8, Σdi² = 18
$$r_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)} = 1 - \frac{6(18)}{8(63)} = 1 - 0.214 = 0.786$$
Rank Correlation Coefficient rs
If the two rankings agree perfectly, every di = 0:
$$r_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)} = 1 - \frac{6(0)}{8(63)} = 1 - 0 = 1$$
Rank Correlation Coefficient rs
If the two rankings are exactly reversed:
$$r_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)} = 1 - \frac{6(168)}{8(63)} = 1 - 2 = -1$$
Rank Correlation Coefficient rs
A 1 28
B 2 21
C 3 22
D 4 22
E 5 32
F 6 36
G 7 33
H 8 39
I 9 25
J 10 30
K 11 20
L 12 28
M 13 31
N 14 38
O 15 34
n = 15
Rank Correlation Coefficient rs
Calculation:
$$r_s = 1 - \frac{6 \sum D^2}{n(n^2 - 1)} = 1 - \frac{6(345)}{15(15^2 - 1)} = 0.38$$
She then converts the test scores to ranks, assigning a rank of 1 to the lowest score.
Since two scores are tied, the instructor assigns each of them the average of the ranks available for them.
The set of paired ranks appears in the columns Rank of X (RX) and Rank of Y (RY).
The value of rs is then computed as above. Are there any problems here?
Cautions in the Use of Correlation
Cautions in the Use of Correlation
[Two scatterplots, (a) and (b), showing nonlinear patterns]
When the data for one or both variables are not linear, other measures of association are better.
Cautions in the Use of Correlation
3. Effect of Variability
The correlation coefficient is sensitive to the variability characterizing the measurements of the two variables.
For example, if a university had only minimal entrance requirements, the relationship between total SAT scores and freshman GPA might look like Fig 7.11 (a).
[FIGURE 7.11 Relations between SAT scores and freshman GPA when range is unrestricted (a) and when it is restricted (b). Axes: SAT score vs Freshman GPA (1.0 to 4.0).]
However, suppose that a more selective private university admitted students only with SAT scores of 1,200 or higher.
From the new scatterplot in Fig (b), the relationship is much weaker.
Therefore, restricting the range, whether in X, in Y, or in both, results in a lower correlation coefficient (in magnitude).
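The range-restriction effect can be illustrated with a small deterministic example (synthetic data, purely for illustration; this is not the SAT/GPA data of Fig 7.11):

```python
# Range restriction lowers r: the same line-plus-noise data give a smaller
# correlation when x is restricted to a narrow band.
import math

def pearson(x, y):
    n = len(x)
    sxx = sum(v * v for v in x) - sum(x) ** 2 / n
    syy = sum(v * v for v in y) - sum(y) ** 2 / n
    sxy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    return sxy / math.sqrt(sxx * syy)

x = list(range(20))
y = [xi + 3 * (-1) ** i for i, xi in enumerate(x)]  # line plus alternating "noise"

r_full = pearson(x, y)

restricted = [(xi, yi) for xi, yi in zip(x, y) if xi >= 12]  # keep only large x
xr = [p[0] for p in restricted]
yr = [p[1] for p in restricted]
r_restricted = pearson(xr, yr)

print(round(r_full, 2), round(r_restricted, 2))  # 0.88 vs 0.49
```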
Cautions in the Use of Correlation
4. Effect of Discontinuity
The correlation tends to be an overestimate in discontinuous distributions.
Revisit the example of GPA vs SAT total. Suppose you made a mistake and lost the data records with GPA between 1.0 and 3.0, but you still want to compute the correlation coefficient using the remaining data. The data might look like this:
[FIGURE 7.12 Scatter diagram for discontinuous data: Y = GPA against X = SAT score, with a region of discontinuity in the middle.]
Most likely you will obtain a higher correlation than the previous one.
Usually, discontinuity, whether in X, in Y, or in both, results in a higher correlation coefficient.
Cautions in the Use of Correlation
[FIGURE 7.13 Correlation resulting from the pooling of data from heterogeneous samples: Y = course grade against X = aptitude for Eagan's class and Haggerty's class, panels (a) and (b).]
If the samples lie as in Fig 7.13 (a), the correlation coefficient would be lower among the pooled data than among the separate samples.
If the samples lie as in Fig 7.13 (b), the correlation coefficient would be higher among the pooled data than among the separate samples.
Simple Linear Regression
Simple Linear Regression
Regression Equation
$$\hat{Y} = b_0 + b_1 X$$
Note that Ŷ, the predicted value, is not the same as Y, which is unknown.
where
b0 is the intercept, the predicted value of Y when X = 0,
b1 is the slope, how much the predicted value of Y changes for a one-unit increase in X.
Purposes of the regression equation:
To estimate the average value of Y at any specified value of X.
To predict the unknown value of Y for an individual, given that individual's value of X.
Simple Linear Regression
[FIGURE 8.2 Discrepancies d1, ..., d7 between seven Y values and the line of regression of Y on X; each di runs from the actual value of Yi to the predicted value on the line.]
The least squares regression line minimizes the SSE (Sum of Squared Errors) for the observed data set:
$$SSE = \sum_i (y_i - \hat{y}_i)^2 = \sum_i d_i^2$$
The term di is called the prediction error or residual: the difference between the observed value and the predicted value of observation i.
Simple Linear Regression
The slope is
$$b_1 = \frac{S_{xy}}{S_{xx}}$$
and the intercept is
$$b_0 = \bar{y} - b_1 \bar{x}$$
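A sketch of these least squares formulas in Python, fitted to the stress / eating-difficulties data of Table 7.2 (an illustration only; this is not the dataset behind the fitted equation shown later in the slides):

```python
# Least squares fit: b1 = S_xy / S_xx, b0 = ybar - b1 * xbar.
x = [17, 8, 8, 20, 14, 7, 21, 22, 19, 30]  # stress
y = [9, 13, 7, 18, 11, 2, 5, 15, 26, 28]   # eating difficulties
n = len(x)

xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum(v * v for v in x) - sum(x) ** 2 / n                  # 492.4
sxy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n   # 385.6

b1 = sxy / sxx         # slope ≈ 0.783
b0 = ybar - b1 * xbar  # intercept ≈ 0.40

yhat = [b0 + b1 * xi for xi in x]                     # predicted values
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))  # sum of squared errors
print(round(b1, 3), round(b0, 3))
```

Any other line through the data would give a larger SSE; that is what "least squares" means.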
Simple Linear Regression
The fitted regression equation is
$$\hat{Y} = 19.1549 + 1.4030X$$
Simple Linear Regression
$$\text{prediction error} = Y - \hat{Y}$$
Simple Linear Regression
To analyse the explained and unexplained deviations over the entire sample, consider their sums of squares, which removes the negative signs.
Simple Linear Regression
Extrapolation
It is risky to use a regression equation to predict values outside the range of the observed data, a process called extrapolation, because there is no guarantee that the relationship will continue to hold beyond the range for which we have observed data.
Examples:
Regression equation relating weight to height:
Weight = -180 + 5 × (Height)
This equation should work well for adults, but not for children: the weight of a boy who is 36 inches tall would be estimated to be 0 pounds.
Straight-line relationship between y = winning time in the Olympic women's 100 m backstroke swim and x = Olympic year: this line could be used to predict the winning time in the near future, but should not be used to predict the time in the year 3000.
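The weight-height example can be made concrete in code (the intercept is written as -180 so that a 36-inch height predicts 0 pounds, consistent with the text):

```python
# Extrapolation demo: a fitted line that is sensible for adults but absurd
# for children, because 36 inches lies outside the fitted range.
def predicted_weight(height_in):
    """Weight = -180 + 5 * Height (pounds, inches)."""
    return -180 + 5 * height_in

print(predicted_weight(70))  # 170 pounds for a 70-inch adult: plausible
print(predicted_weight(36))  # 0 pounds for a 36-inch boy: nonsense
```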