12 Correlation and Significancy

Uploaded by

990293kwi


Correlation

1
Q&A
• Are SSres and SSE the same thing?

• Is dividing by the sum of squared deviations of x, Σ(x − x̄)², in the denominator meant to standardize?

2
Reviews
1. What is a probability distribution? What is it used for?
2. What are degrees of freedom? What are they used for?
3. What is the difference between the z-distribution and the t-distribution? What are
their applications?
4. What is two-sample testing used for?
5. What is the 5% significance level?
6. What is the null hypothesis?
7. What is the p-value? What is it used for?
8. What is linear regression used for?
9. How do you judge whether a dataset is normally distributed?
3
Correlation and covariance
• In statistics, correlation is a linear statistical relation between two random
variables.
• It measures the degree to which two random variables
move/change in relation to each other.
• It is a statistical analysis of scatter plots.
• It has a coefficient (called the correlation coefficient, R or r) ranging from -1
to 1.
• The correlation coefficient (R or r) is calculated from the covariance.
• The correlation is the covariance normalized by the standard deviations of the
two random variables.

4
Correlation vs linear regression?
(Think and discuss)
• What is the difference between R² (coefficient of determination) and
R (correlation coefficient)?

$$R = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2}\,\sqrt{\sum_i (y_i - \bar{y})^2}}$$

5
More about
the coefficient of determination (R²)
• In statistics, the coefficient of determination measures how much
of the variation in one variable (y) can be
explained by the variation in a second variable (x).
• It is used to see how well one variable (y) can be predicted from
the second variable (x).
• This quantity is a measure of the proportion of variability
explained by the fitted model.
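
One common way to write this proportion, given here as a sketch because the slides do not spell it out, uses the residual and total sums of squares of y. The symbols $SS_{res}$, $SS_{tot}$, and $\hat{y}_i$ (the value predicted by the fitted model) are assumed notation:

```latex
R^2 = 1 - \frac{SS_{res}}{SS_{tot}}, \qquad
SS_{res} = \sum_i (y_i - \hat{y}_i)^2, \qquad
SS_{tot} = \sum_i (y_i - \bar{y})^2
```

When the fit reproduces every point, $SS_{res} = 0$ and $R^2 = 1$.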

6
More about
the coefficient of determination (R²)
• An example:
• A coefficient of determination of 0.913 suggests that the
model fitted to the data explains 91.3% of the variability
observed.

• [Probability & Statistics for Engineers & Scientists, Ninth Edition, Walpole et al., 2011]

7
Correlation coefficient (R) vs
coefficient of determination (R²)
• R²: a measure of the proportion of variability explained
by the fitted model.

• R: a measure of the strength/degree of the relationship between two variables.
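
For a straight-line fit with an intercept, the two quantities are linked: the coefficient of determination equals the square of the correlation coefficient. The sketch below checks this numerically. The data values are made up for illustration, and the least-squares slope and intercept formulas are standard results rather than something taken from the slides.

```python
import math

# Hypothetical data, made up for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross-products of the deviations about the means.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)

# Correlation coefficient R.
r = s_xy / (math.sqrt(s_xx) * math.sqrt(s_yy))

# Least-squares line y_hat = intercept + slope * x.
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar
y_hat = [intercept + slope * xi for xi in x]

# Coefficient of determination R^2 = 1 - SS_res / SS_tot.
ss_res = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
r_squared = 1.0 - ss_res / s_yy

print(f"R = {r:.4f}, R^2 = {r_squared:.4f}, R * R = {r * r:.4f}")
```

R keeps the sign of the relationship, while R² only reports how much of the variability the line explains.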

8
Examples (R from -1 to 1)

9
Examples (R from -1 to 1)

10
What is covariance?
• Think of some words that start with “co-”.
• In statistics, covariance is a measure of the joint variability of
two random variables.
• Covariance extends the calculation of variance to two random
variables.

11
Calculating covariance

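• Calculate the variance from sample data (d.f. = N-1):

$$\mathrm{var}(x) = \sigma_x^2 = \frac{\sum_i (x_i - \bar{x})^2}{N-1}$$

• Calculate the covariance of two sample variables (d.f. = N-1):

$$\sigma_{xy} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{N-1}$$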
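
The two sample formulas on this slide can be sketched in a few lines of Python. The data values and the function names `sample_variance` and `sample_covariance` are hypothetical, chosen only to illustrate the N-1 divisor.

```python
# Hypothetical data, made up for illustration only.
x = [2.0, 4.0, 6.0, 8.0]
y = [1.0, 3.0, 2.0, 6.0]


def mean(values):
    return sum(values) / len(values)


def sample_variance(values):
    """Sum of squared deviations divided by the degrees of freedom N - 1."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def sample_covariance(a, b):
    """Sum of products of paired deviations divided by N - 1."""
    if len(a) != len(b):
        raise ValueError("both variables need the same number of samples")
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)


print("var(x)    =", sample_variance(x))
print("cov(x, y) =", sample_covariance(x, y))
# The covariance of a variable with itself is its variance.
print("cov(x, x) =", sample_covariance(x, x))
```

The last line illustrates why covariance can be read as variance generalized to two variables.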
12
Calculating correlation (REVIEW)
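• The correlation is the covariance normalized by the standard
deviations of the two random variables.

$$R = \frac{\sigma_{xy}}{\sigma_x \sigma_y} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{(N-1)\,\sigma_x \sigma_y} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2}\,\sqrt{\sum_i (y_i - \bar{y})^2}}$$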

13
Calculating correlation
• The correlation is the covariance normalized by the standard
deviations of the two random variables.
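
A minimal sketch of this normalization, assuming made-up data: it computes R once as covariance divided by the two sample standard deviations, and once from the sums of deviations, where the N-1 factors cancel. Both routes should agree.

```python
import statistics

# Hypothetical data, made up for illustration only.
x = [2.0, 4.0, 6.0, 8.0]
y = [1.0, 3.0, 2.0, 6.0]

n = len(x)
x_bar = statistics.mean(x)
y_bar = statistics.mean(y)

# Sample covariance with N - 1 degrees of freedom.
cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

# Sample standard deviations (statistics.stdev also divides by N - 1).
sigma_x = statistics.stdev(x)
sigma_y = statistics.stdev(y)

# Route 1: covariance normalized by the two standard deviations.
r_from_cov = cov_xy / (sigma_x * sigma_y)

# Route 2: the sums-of-deviations form, where the N - 1 factors cancel.
num = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
den = (sum((xi - x_bar) ** 2 for xi in x) ** 0.5
       * sum((yi - y_bar) ** 2 for yi in y) ** 0.5)
r_from_sums = num / den

print(f"R (covariance route) = {r_from_cov:.4f}")
print(f"R (sums route)       = {r_from_sums:.4f}")
```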

14
Standard Coordinates of Histograms
(REVIEW)
• Assume we have a dataset {x} of N data items, x1, … , xN.

Dataset: data xi = x1, … , xN
First make the data dimensionless (normalized).
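
As a sketch of what this normalization does, the snippet below subtracts the mean and divides by the standard deviation, so the result has no units. Using the sample standard deviation (N-1) is an assumption made to match the other slides, and the data and the helper name `standard_coordinates` are hypothetical.

```python
import statistics


def standard_coordinates(values):
    """Subtract the mean and divide by the sample standard deviation."""
    m = statistics.mean(values)
    s = statistics.stdev(values)  # divides by N - 1
    return [(v - m) / s for v in values]


# Hypothetical data, made up for illustration only.
x = [2.0, 4.0, 6.0, 8.0]
y = [1.0, 3.0, 2.0, 6.0]

x_hat = standard_coordinates(x)
y_hat = standard_coordinates(y)

# In standard coordinates each variable has mean 0 and standard deviation 1.
print("mean:", statistics.mean(x_hat), "std:", statistics.stdev(x_hat))

# The correlation is then the sum of products divided by N - 1.
r = sum(a * b for a, b in zip(x_hat, y_hat)) / (len(x) - 1)
print(f"R = {r:.4f}")
```

Because both variables are rescaled the same way, R does not depend on the units in which x and y were recorded.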

15
Confusion Caused by Correlation
(Using correlation incorrectly)--REVIEW
• High correlation only tells us that when one variable is large, the other tends to be large
(positive correlation) or small (negative correlation).
• But correlation DOES NOT mean that a change in one variable causes
(or necessarily causes) the other to change (causation).
• Examples:
(1) Shoe size vs reading skills
(2) Fertilizer vs plant size
(3) Ocean temperature vs ocean salinity
(4) Ocean temperature vs ocean current speed
16
Confusion of Correlation Coefficients--REVIEW
(Using correlation coefficient incorrectly)
• To check how reliable a correlation analysis is, we need to do
significance testing in addition to calculating correlation coefficients (we will
discuss this next semester, or maybe later, after learning probability).

17
$$R = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2}\,\sqrt{\sum_i (y_i - \bar{y})^2}}$$

Critical value of r at the 5% significance level:

$$r_c = \sqrt{\frac{t_c^2}{t_c^2 + n - 2}}$$

n-2: d.f.
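
The critical-value formula on this slide follows from the standard t statistic for testing whether a correlation differs from zero. That statistic is not shown in the slides and is given here as background. Setting t to its critical value $t_c$ and solving for r:

```latex
t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}
\;\Longrightarrow\;
t^2 \,(1 - r^2) = r^2 \,(n-2)
\;\Longrightarrow\;
r^2 = \frac{t^2}{t^2 + n - 2}
\;\Longrightarrow\;
r_c = \sqrt{\frac{t_c^2}{t_c^2 + n - 2}}
```

A computed correlation with $|r| > r_c$ is then significant at the chosen level.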
18
t-table

d.f. = n-2

$$r_c = \sqrt{\frac{t_c^2}{t_c^2 + n - 2}} = 0.8783$$
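
The value 0.8783 can be reproduced with a short sketch. It assumes the two-tailed 5% critical t for 3 degrees of freedom, 3.182, read from a t-table. The slide does not state which d.f. it used, so n = 5 is an inference from the result, and the function names are hypothetical.

```python
import math


def critical_r(t_c, n):
    """Critical correlation coefficient for n paired samples (d.f. = n - 2)."""
    dof = n - 2
    if dof < 1:
        raise ValueError("need at least 3 paired samples")
    return math.sqrt(t_c**2 / (t_c**2 + dof))


def is_significant(r, t_c, n):
    """True when |r| exceeds the critical value."""
    return abs(r) > critical_r(t_c, n)


# Two-tailed 5% critical t for 3 degrees of freedom, read from a t-table.
t_c = 3.182
n = 5  # assumed sample size, so that d.f. = n - 2 = 3

print(f"r_c = {critical_r(t_c, n):.4f}")
print(is_significant(0.95, t_c, n), is_significant(0.60, t_c, n))
```

With so few samples, even a correlation of 0.60 does not pass the test, which is why the degrees of freedom matter when judging R.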

19
Homework 11 (due before this Friday)

• Link to
• https://forms.gle/pcxJyJ4kpohCDAdA6

20
HW 11
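• Find the correlation coefficient (R) of x and y for the following dataset.

$$R = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2}\,\sqrt{\sum_i (y_i - \bar{y})^2}}$$

$$r_c = \sqrt{\frac{t_c^2}{t_c^2 + n - 2}}$$

• Based on the degrees of freedom of the data, determine whether the computed correlation coefficient (R) is statistically significant.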

21
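Supplementary exercise
• Using the x and y data below, find (a) the correlation coefficient, (b) the
coefficient of determination, and (c) the equation of the linear regression line. Are these statistical results significant?
Why?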

22
se??

23
se?? (Standard error)

• Comparing the sample mean ($\bar{x}$) and the population mean ($\mu$)

24
Standard error (of the estimated sample mean)

25
Standard error (of the estimated sample mean)

Confidence interval: $-t_{\alpha/2} < t < t_{\alpha/2}$
26
Previous Exercise 01 (discussion)
• John recorded a set of data for a sine wave, as shown
below. Help him conduct a two-tailed t-test for the
mean of the data. Is the mean of the data statistically
trustworthy/representative? (The mean and std of the data are about
0.097 and 0.639, respectively.)

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

John data
t (s)   h (m)
23      0.39
25      0.42
34      0.55
70      0.93
200     -0.34
229     -0.75
501     0.62
509     0.51
593     -0.79
685     -0.57

[Figure: John data and sine wave, h (m) versus t (s)]

27
Example (discussion)
• (The mean and std of the data are about 0.097 and 0.639, respectively.)
• $s/\sqrt{n}$ = 0.2
• $t_{\alpha/2}$ = 2.262
• -0.3 < $\mu$ < 0.5 (95%)
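
The numbers in this example can be checked with a short sketch using the ten h (m) values from John's data table. The critical value 2.262 is the two-tailed 5% t for 9 degrees of freedom, as on the slide. Testing against $\mu_0 = 0$ is an assumption (the mean of an ideal sine wave), not something the slide states.

```python
import math
import statistics

# h (m) values from John's data table.
h = [0.39, 0.42, 0.55, 0.93, -0.34, -0.75, 0.62, 0.51, -0.79, -0.57]

n = len(h)
mean_h = statistics.mean(h)
std_h = statistics.stdev(h)       # sample standard deviation, d.f. = n - 1
std_err = std_h / math.sqrt(n)    # standard error of the mean

# Two-tailed 5% critical t for n - 1 = 9 degrees of freedom, from a t-table.
t_crit = 2.262

lower = mean_h - t_crit * std_err
upper = mean_h + t_crit * std_err

# t statistic for H0: mu = mu_0. The choice mu_0 = 0 is an assumption.
mu_0 = 0.0
t_stat = (mean_h - mu_0) / std_err

print(f"mean = {mean_h:.3f}, std = {std_h:.3f}, SE = {std_err:.3f}")
print(f"95% CI: {lower:.2f} < mu < {upper:.2f}")
print(f"t = {t_stat:.3f}, reject H0: {abs(t_stat) > t_crit}")
```

Without rounding, the interval comes out at roughly -0.36 to 0.55, slightly wider than the rounded bounds on the slide. It contains zero, so the sample mean is not distinguishable from the assumed $\mu_0$.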

[Figure: John data and sine wave, h (m) versus t (s), with the John data table of t (s) and h (m) values]

28
t-table

29
Example (discussion)
• (The mean and std of the data are about 0.47 and 0.56, respectively.)
• $s/\sqrt{n}$ = 0.28
• $t_{\alpha/2}$ = 3.182
• -0.42 < $\mu$ < 1.36 (95%)

[Figure: John data and sine wave, h (m) versus t (s), with the John data table of t (s) and h (m) values]

30
Example comparison (discussion)
• (The mean and std of the data are about 0.097 and 0.639, respectively.)
• $s/\sqrt{n}$ = 0.2
• $t_{\alpha/2}$ = 2.262
• -0.3 < $\mu$ < 0.5 (95%)

[Figure: John data and sine wave, h (m) versus t (s), with the John data table of t (s) and h (m) values]

31
Question (application & discussion)
• If the mean and std of the data are about 0.097 and 0.7, respectively,
how many data points do you need to get a standard error < 0.1,
assuming that the natural signal John observed is unknown?
• $s/\sqrt{n}$ = 0.1
• $\sqrt{n}$ = 0.7/0.1 = 7
• n = 49
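
The same arithmetic can be sketched in Python. Exact fractions are used so that 0.7/0.1 is exactly 7 rather than a rounded floating-point value. The function name `boundary_sample_size` is hypothetical.

```python
from fractions import Fraction


def boundary_sample_size(std, target_se):
    """Return n at which std / sqrt(n) equals target_se: (std / target_se)^2."""
    return (Fraction(std) / Fraction(target_se)) ** 2


# Values from the question, passed as strings so the arithmetic stays exact.
n_boundary = boundary_sample_size("0.7", "0.1")
print("n at the boundary:", n_boundary)

# floor(n_boundary) + 1 is the smallest whole number strictly above the boundary,
# which is what a strict inequality (standard error < target) asks for.
n_strict = n_boundary.numerator // n_boundary.denominator + 1
print("smallest n with SE strictly below the target:", n_strict)
```

At n = 49 the standard error equals 0.1 exactly, so reading the condition as a strict inequality would call for one more data point.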
