
Lecture 13 Correlation Chapter 12 Part 1

This document discusses correlation as a statistical measure of the relationship between variables, specifically focusing on Pearson's and Spearman's correlation coefficients. It explains how to interpret correlation values, the significance of scatterplots, and the characteristics of relationships between variables. Additionally, it emphasizes that correlation does not imply causation and highlights the importance of considering outliers and data range in correlation analysis.

Uploaded by

Dina Bardakji
Copyright
© All Rights Reserved

Data Analysis and Presentation
BIDA330
Correlation
Chapter 12 Part 1
Correlation
• Correlation is a measure of the degree of relatedness of
variables
• It can help a business researcher determine, for example, whether
the stocks of two airlines rise and fall in any related manner
• For a sample of pairs of data, correlation analysis can yield a
numerical value that represents the degree of relatedness of the
two stock prices over time
• In the transportation industry, is a correlation evident between the
price of transportation and the weight of the object being
shipped? If so, how strong is the correlation?
Correlation Cont.
• Several measures of correlation are available, the selection of which
depends mostly on the level of data being analyzed
• Ideally, researchers would like to solve for ρ, the population coefficient
of correlation. However, because researchers virtually always deal with
sample data, this section introduces a widely used sample coefficient
of correlation, r
• This measure is applicable only if both variables being analyzed have
interval or ratio level of data
• The statistic r is the Pearson product-moment correlation
coefficient, named after Karl Pearson (1857–1936), an English
statistician who developed several coefficients of correlation along
with other significant statistical concepts
Pearson's correlation
• Pearson's correlation is the parametric test for correlation
between two continuous (interval/ratio) variables
• The assumptions to apply the test are as follows:
• Normal distribution
• Independence of observations
• Linear relationship
• If the first assumption, that is, normality, is not met or if one
variable is ordinal in nature, a nonparametric alternative known
as Spearman's correlation is applied
Spearman's correlation
• Spearman's correlation can be applied to curvilinear relationships
(in ranked or ordinal data)
• However, the relationship must be monotonic, that is, as the
value of one variable increases, the value of the other variable
either consistently increases or consistently decreases
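A minimal sketch of the idea, assuming the usual definition of Spearman's coefficient as Pearson's r computed on the ranks of the data (tied values share the average of their ranks). The function names pearson_r, average_ranks, and spearman_rho are illustrative, and the data are made up for the example, not taken from the lecture. The curve y = x³ is curvilinear but monotonic, so the rank-based coefficient is 1 while Pearson's r falls short of 1.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1 to n; tied values get the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman's coefficient: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Hypothetical data: a monotonic but curvilinear relationship
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)  # ranks agree perfectly
r = pearson_r(x, y)       # the curve is not a straight line
print(f"Spearman = {rho:.3f}, Pearson = {r:.3f}")
```

Ranking discards the spacing between values and keeps only their order, which is why the coefficient suits ordinal data and tolerates curvature as long as the trend never reverses.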
Pearson product-moment correlation coefficient
• Named after Karl Pearson (1857–1936), an English statistician who developed
several coefficients of correlation along with other significant statistical
concepts
• The term r is a measure of the linear correlation of two variables
• It is a number that ranges from -1 through 0 to +1, representing the strength and
direction of the linear relationship between the variables: r ∈ [-1, 1], that is, -1 ≤ r ≤ 1
• An r value of +1 denotes a perfect positive relationship between two sets of
numbers
• An r value of -1 denotes a perfect negative correlation, which indicates an
inverse relationship between two variables: as one variable gets larger, the
other gets smaller
• An r value of 0 means no linear relationship is present between the two
variables
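The three benchmark values of r can be checked with a short sketch. It assumes the standard deviation-score form of the sample coefficient, r = Σ(x - x̄)(y - ȳ) / √(Σ(x - x̄)² · Σ(y - ȳ)²). The function name pearson_r and the small data sets are illustrative and do not come from the lecture.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 pairs")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations, and sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable has no variation")
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
r_pos = pearson_r(x, [2, 4, 6, 8, 10])  # points on a rising straight line
r_neg = pearson_r(x, [10, 8, 6, 4, 2])  # points on a falling straight line
r_zero = pearson_r(x, [2, 1, 0, 1, 2])  # symmetric V shape: no linear trend

print(r_pos, r_neg, r_zero)
```

The V-shaped case shows why r = 0 only rules out a linear relationship: Y clearly depends on X there, but the rising and falling halves cancel in the cross-product sum.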
Scatterplot/Diagram
A scatterplot is a graph that is used to represent the
relationship between two variables. (Also referred to
as a scatter diagram.)
In a scatterplot, the X values are placed on the
horizontal axis and the Y values are placed on the
vertical axis.
The value of the scatterplot is that it lets you see the
nature of the relationship.
Example scatterplots (figures not reproduced):
• Strong Negative Correlation (r = -.933)
• Moderate Negative Correlation (r = -.674)
• Virtually No Correlation (r = -.004)
• Strong Positive Correlation (r = .909)
• Moderate Positive Correlation (r = .518)
Characteristics of the Relationship
• A correlation measures three characteristics of
the relationship between X and Y:

• 1) The Direction of the Relationship

• 2) The Form of the Relationship

• 3) The Degree of the Relationship


The Direction of the Relationship
• In a positive correlation, the two variables tend to move in the
same direction (correlation is +).
• When X increases, Y increases.
• When X decreases, Y decreases.
• See (a) and (d) of scatterplot examples
• In a negative correlation, the two variables move in opposite
directions (correlation is -).
• When X increases, Y decreases.
• When X decreases, Y increases.
• See (b) and (c) of scatterplot examples.
The Form of the Relationship
• There are many forms that plots can take. The one we will
consider is linear. In a linear form, the points in the plot tend to
form a straight line. See scatterplot examples (a) and (b) for linear
forms. The remaining examples are not linear.
Pearson Product-moment Correlation Coefficient
Example 1: Economics: What is the measure of
correlation between the interest rate of federal funds and
the commodities futures index?
Interpreting the Pearson Correlation
• Correlation describes a relationship between two
variables, not why the variables are related (not proof of
cause-and-effect).
• The value of a correlation can be greatly affected by the
range of scores in the data.
• The value of a correlation can be greatly affected by one
or two extreme points (outliers).
• A correlation should not be interpreted as a “proportion”.
For example, a correlation of .815 does not mean that
one could predict with 81.5% accuracy. To describe how
accurately one variable predicts the other, you must
square the correlation: r = .815 gives r² ≈ .664, so about
66% of the variability in one variable is accounted for by
the other. r² is the coefficient of determination.
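Two of the points above can be illustrated with a short sketch. The data are hypothetical numbers chosen for the illustration, not values from the lecture, and pearson_r is an illustrative helper using the standard deviation-score formula. The first part adds a single extreme point to a small data set and recomputes r; the second squares the r = .815 from the slide.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data with a clear positive trend
x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 6, 5]
r_base = pearson_r(x, y)

# Same data plus one extreme point (an outlier) at (20, 0)
r_outlier = pearson_r(x + [20], y + [0])

# Coefficient of determination for the slide's example value
r_squared = 0.815 ** 2

print(f"r without outlier: {r_base:.3f}")
print(f"r with outlier:    {r_outlier:.3f}")
print(f"r^2 for r = .815:  {r_squared:.3f}")
```

In this made-up case one added point is enough to turn a strong positive r into a negative one, which is why a scatterplot should be inspected before a coefficient is reported.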
Lab: Solve Example 1 Economics using Excel
