
Unit 3: Covariance and Correlation

Covariance

When analyzing data, it’s often useful to be able to investigate the relationship
between two numeric variables to assess trends. For example, you might expect
height and weight observations to have a noticeable positive relationship—taller
people tend to weigh more. One of the simplest and most common ways such
associations are quantified and compared is through the idea of correlation, for
which you need the covariance.

The covariance expresses how much two numeric variables “change together”
and the nature of that relationship, whether it is positive or negative. Suppose
for n individuals you have a sample of observations for two variables, labeled

$$x = \{x_1, x_2, \ldots, x_n\} \quad \text{and} \quad y = \{y_1, y_2, \ldots, y_n\}$$

where x_i corresponds to y_i for i = 1, ..., n.

The sample covariance r_xy is computed with the following, where x̄ and ȳ
represent the respective sample means of both sets of observations:

$$r_{xy} = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$$

When you get a positive result for r_xy, it shows that there is a positive linear
relationship—as x increases, y increases. When you get a negative result, it
shows a negative linear relationship—as x increases, y decreases, and vice versa.
When r_xy = 0, this indicates that there is no linear relationship between the values
of x and y.
Let's take two vectors:

R> xdata <- c(2,4.4,3,3,2,2.2,2,4)

R> ydata <- c(1,4.4,1,3,2,2.2,2,7)

Although these are two different collections of numbers, note that they have an
identical arithmetic mean.

R> mean(xdata)
[1] 2.825
R> mean(ydata)
[1] 2.825

The sample covariance of these two sets of observations is as follows:

$$r_{xy} = \frac{1}{8-1} \sum_{i=1}^{8} (x_i - 2.825)(y_i - 2.825) \approx 1.479$$

The obtained value is a positive number, so this suggests there is a positive
relationship based on the observations in x and y.
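
As a quick check, the same value can be computed directly from the formula above (a minimal sketch; n is simply the length of either vector):

R> n <- length(xdata)
R> sum((xdata - mean(xdata)) * (ydata - mean(ydata))) / (n - 1)
[1] 1.479286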
Correlation

Correlation allows you to interpret the covariance further by identifying both
the direction and the strength of any association. There are different types of
correlation coefficients, but the most common of these is Pearson's product-
moment correlation coefficient, the default implemented by R. Pearson's
sample correlation coefficient ρ_xy is computed by dividing the sample covariance
by the product of the standard deviation of each data set:

$$\rho_{xy} = \frac{r_{xy}}{s_x s_y}$$

where s_x and s_y are the sample standard deviations of the observations in x and y.

When ρ_xy = -1, a perfect negative linear relationship exists. Any result less than
zero shows a negative relationship, and the relationship gets weaker the nearer
to zero the coefficient gets, until ρ_xy = 0, showing no relationship at all. As the
coefficient increases above zero, a positive relationship is shown, until ρ_xy = 1,
which is a perfect positive linear relationship.

R> sd(xdata)
[1] 0.9528154
R> sd(ydata)
[1] 2.012639

The correlation is therefore

$$\rho_{xy} = \frac{1.479286}{0.9528154 \times 2.012639} \approx 0.771$$

This is positive just like r_xy, and the value of 0.771 indicates a moderate-to-strong
positive association between the observations in x and y.
The R commands cov and cor are used for the sample covariance and
correlation; you need only supply the two corresponding vectors of data.

R> xdata <- c(2,4.4,3,3,2,2.2,2,4)

R> ydata <- c(1,4.4,1,3,2,2.2,2,7)

R> cov(xdata,ydata)

[1] 1.479286

R> cov(xdata,ydata)/(sd(xdata)*sd(ydata))

[1] 0.7713962

R> cor(xdata,ydata)

[1] 0.7713962

To plot these bivariate observations as a coordinate-based plot (a scatterplot), use the following:

R> plot(xdata,ydata)
As discussed earlier, the correlation coefficient estimates the nature of the
linear relationship between two sets of observations, so if you look at the
pattern formed by the points in the resulting figure and imagine drawing a perfectly straight
line that best represents all the points, you can determine the strength of the
linear association by how close those points are to your line. Points closer to a
perfect straight line will have a value of ρ_xy closer to either -1 or 1. The
direction is determined by how the line is sloped—an increasing trend, with the
line sloping upward toward the right, indicates positive correlation; a negative
trend would be shown by the line sloping downward toward the right.
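
One way to make that imagined line concrete (a sketch, not part of the original example) is to overlay the least-squares line of best fit using R's lm and abline:

R> plot(xdata, ydata)
R> abline(lm(ydata ~ xdata))  # draw the fitted straight line through the points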

To aid your understanding of the idea of correlation, the figure below displays
different scatterplots, each showing 100 points. These observations have been
randomly and artificially generated to follow preset “true” values of ρ_xy, labeled
above each plot.
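
Data like these can be simulated by sampling from a bivariate normal distribution with a chosen correlation. A minimal sketch using MASS::mvrnorm (the function choice and the value of rho are assumptions for illustration, not taken from the original text):

R> library(MASS)  # provides mvrnorm() for multivariate normal sampling
R> rho <- 0.9  # hypothetical preset "true" correlation
R> Sigma <- matrix(c(1, rho, rho, 1), nrow = 2)  # unit variances, so covariance = correlation
R> xy <- mvrnorm(n = 100, mu = c(0, 0), Sigma = Sigma)
R> plot(xy[, 1], xy[, 2], main = paste("rho =", rho))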
The first row of scatterplots shows negatively correlated data; the second shows
positively correlated data. These match what you would expect to see—the
direction of the line shows the negative or positive correlation of the trend, and
the extremity of the coefficient corresponds to the closeness to a “perfect line.”
The third and final row shows data sets generated with a correlation coefficient
set to zero, implying no linear relationship between the observations in x and y.
The middle and rightmost plots are particularly important because they
highlight the fact that Pearson’s correlation coefficient identifies only “straight-
line” relationships; these last two plots clearly show some kind of trend or
pattern, but this particular statistic cannot be used to detect such a trend. To
wrap up this section, look again at the quakes data. Two of the variables are mag
(the magnitude of each event) and stations (the number of stations that reported
detection of the event). A plot of stations on the y-axis against mag on the x-
axis can be produced with the following:

R> plot(quakes$mag, quakes$stations, xlab="Magnitude", ylab="No. of stations")

The resulting scatterplot is shown in the figure below.


You can see by the vertical patterning that the magnitudes appear to have
been recorded to a certain specific level of precision. Nevertheless, a positive
relationship (more stations tend to detect events of higher magnitude) is
clearly visible in the scatterplot, a feature that is confirmed by a positive
covariance.

R> cov(quakes$mag,quakes$stations)

[1] 7.508181

As you might expect from examining the pattern, Pearson's correlation
coefficient confirms that the linear association is quite strong.

R> cor(quakes$mag,quakes$stations)

[1] 0.8511824
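
As in the earlier example, this is just the covariance rescaled by the two standard deviations, so the same value can be recovered manually:

R> cov(quakes$mag, quakes$stations) / (sd(quakes$mag) * sd(quakes$stations))
[1] 0.8511824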
