Lecture 13 Correlation Chapter 12 Part 1
BIDA330
Correlation
Chapter 12 Part 1
Correlation
• Correlation is a measure of the degree of relatedness of
variables
• It can help a business researcher determine, for example, whether
the stocks of two airlines rise and fall in any related manner
• For a sample of pairs of data, correlation analysis can yield a
numerical value that represents the degree of relatedness of the
two stock prices over time
• In the transportation industry, is a correlation evident between the
price of transportation and the weight of the object being
shipped? If so, how strong are the correlations?
Correlation Cont.
• Several measures of correlation are available, the selection of which
depends mostly on the level of data being analyzed
• Ideally, researchers would like to solve for ρ, the population coefficient
of correlation. However, because researchers virtually always deal with
sample data, this section introduces a widely used sample coefficient
of correlation, r
• This measure is applicable only if both variables being analyzed have
interval- or ratio-level data
• The statistic r is the Pearson product-moment correlation
coefficient, named after Karl Pearson (1857–1936), an English
statistician who developed several coefficients of correlation along
with other significant statistical concepts
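To make the statistic concrete, here is a minimal sketch of how r could be computed from paired sample data, using the standard definition: the sum of cross-products of deviations from the means, divided by the square root of the product of the two sums of squared deviations. The function name pearson_r and the airline price figures are hypothetical, chosen only for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample Pearson product-moment correlation coefficient r."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the two means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)

# Hypothetical closing prices for two airline stocks (illustrative only)
airline_a = [20.0, 21.5, 22.0, 21.0, 23.5, 24.0]
airline_b = [35.0, 36.0, 37.5, 36.5, 38.0, 39.5]

r = pearson_r(airline_a, airline_b)
print(f"r = {r:.3f}")
```

A positive result here would indicate that the two hypothetical price series tend to rise and fall together.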
Pearson's correlation
• Pearson's correlation is the parametric test for correlation
between two continuous (interval/ratio) variables
• The assumptions to apply the test are as follows:
• Normal distribution
• Independence of observations
• Linear relationship
• If the first assumption, that is, normality, is not met or if one
variable is ordinal in nature, a nonparametric alternative known
as Spearman's correlation is applied
Spearman's correlation
• Spearman's correlation can be applied to curvilinear relationships
(in ranked or ordinal data)
• However, the relationship must still be monotonic, that is, as the
value of one variable increases, the value of the other either
consistently increases or consistently decreases
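As a rough illustration of the curvilinear but monotonic case, the sketch below computes Spearman's correlation as Pearson's r applied to the ranks of the data, with tied values given the average of their ranks. The helper names (pearson_r, average_ranks, spearman_rho) and the data are assumptions made for this example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def average_ranks(values):
    """Rank values 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman's correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Curvilinear but monotonic: y always increases with x, but not in a straight line
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]

print(f"Pearson r    = {pearson_r(x, y):.3f}")
print(f"Spearman rho = {spearman_rho(x, y):.3f}")
```

Because the ranks of x and y agree exactly, Spearman's coefficient reaches its maximum even though the points do not fall on a straight line, while Pearson's r stays below 1.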
Pearson product-moment correlation coefficient
• Named after Karl Pearson (1857–1936), an English statistician who developed
several coefficients of correlation along with other significant statistical
concepts
• The term r is a measure of the linear correlation of two variables
• It is a number that ranges from -1 through 0 to +1, representing the strength and
direction of the relationship between the variables: -1 ≤ r ≤ 1, that is, r lies in [-1, 1]
• An r value of +1 denotes a perfect positive relationship between two sets of
numbers
• An r value of -1 denotes a perfect negative correlation, which indicates an
inverse relationship between two variables: as one variable gets larger, the
other gets smaller
• An r value of 0 means no linear relationship is present between the two
variables
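The three reference values of r can be checked on small made-up data sets. The sketch below assumes a helper named pearson_r and invented data. Note that the third data set depends on x exactly, but through a symmetric curve, so it has no linear relationship.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]

y_positive = [2 * v + 1 for v in x]    # rises on a straight line
y_negative = [10 - 3 * v for v in x]   # falls on a straight line
y_no_linear = [(v - 3) ** 2 for v in x]  # symmetric U shape around x = 3

r_positive = pearson_r(x, y_positive)
r_negative = pearson_r(x, y_negative)
r_no_linear = pearson_r(x, y_no_linear)

print(f"perfect positive: r = {r_positive:+.3f}")
print(f"perfect negative: r = {r_negative:+.3f}")
print(f"no linear relationship: r = {r_no_linear:+.3f}")
```

The last case is a reminder that r = 0 rules out a linear relationship only; the variables may still be related in a nonlinear way.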
Scatterplot/Diagram
A scatterplot is a graph that is used to represent the
relationship between two variables. (Also referred to
as a scatter diagram.)
In a scatterplot, the X values are placed on the
horizontal axis and the Y values are placed on the
vertical axis.
The value of the scatterplot is that it lets you see the
nature of the relationship.
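The axis convention can be tried out with a small text-based plot. This is only a rough sketch using plain characters rather than a charting library; the function name ascii_scatter and the shipping weight and cost figures are hypothetical.

```python
def ascii_scatter(xs, ys, width=40, height=12):
    """Return a text scatterplot: X on the horizontal axis, Y on the vertical."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two equal-length, non-empty samples")
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    x_span = (x_max - x_min) or 1  # avoid division by zero for constant data
    y_span = (y_max - y_min) or 1
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        col = round((x - x_min) / x_span * (width - 1))
        # Row 0 is the top of the plot, so larger Y values get smaller row indices
        row = (height - 1) - round((y - y_min) / y_span * (height - 1))
        grid[row][col] = "*"
    lines = ["|" + "".join(cells) for cells in grid]  # vertical (Y) axis
    lines.append("+" + "-" * width)                   # horizontal (X) axis
    return "\n".join(lines)

# Hypothetical shipment weights (X) and shipping costs (Y)
weight = [1, 2, 3, 4, 5, 6, 7, 8]
cost = [5.0, 6.5, 7.0, 9.5, 10.0, 12.5, 13.0, 15.5]

plot = ascii_scatter(weight, cost)
print(plot)
```

With these made-up numbers the points climb from the lower left to the upper right, the visual signature of a positive relationship.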
Strong Negative Correlation (r = –.933)
Moderate Negative Correlation (r = –.674)
Virtually No Correlation (r = –.004)
Strong Positive Correlation (r = .909)
Moderate Positive Correlation (r = .518)
Characteristics of the Relationship
• A correlation measures three characteristics of
the relationship between X and Y: