lecture 10 correlation

The document provides an overview of correlation analysis, focusing on the correlation coefficient, particularly the Pearson product-moment correlation coefficient, and its various characteristics such as direction and strength. It explains how to calculate correlation coefficients, what they indicate about relationships between variables, and the limitations of correlation, including the distinction between correlation and causation. Additionally, it discusses the coefficient of determination and different types of correlation coefficients, such as point-biserial and Spearman's rho.

Uploaded by

Fatima Batool

Correlation Analysis

◼ One of the most basic measures of the association among variables, and a foundational statistic for several more complex statistics, is the correlation coefficient.
◼ Although there are a number of different types of correlation coefficients, the most commonly used in social science research is the Pearson product-moment correlation coefficient.
◼ The other types of correlation are the point-biserial coefficient, the Spearman rho coefficient, and the phi coefficient.
When to Use Correlation and What It Tells Us
◼ Researchers compute correlation coefficients when they want to know how two variables are related to each other.
▪ For a Pearson product-moment correlation, both variables must be measured on an interval or ratio scale; such variables are known as continuous variables.
Characteristics of Correlation
◼ There are two fundamental characteristics of correlation coefficients that researchers care about.
◼ The first of these is the direction of the correlation coefficient.
▪ A positive correlation indicates that the values on the two variables being analyzed move in the same direction.
▪ That is, as scores on one variable go up, scores on the other variable go up as well (on average).
▪ A negative correlation indicates that the values on the two variables being analyzed move in opposite directions.
▪ That is, as scores on one variable go up, scores on the other variable go down, and vice versa (on average).
◼ The second fundamental characteristic of correlation coefficients is the strength or magnitude of the relationship.
◼ Correlation coefficients range from –1.00 to +1.00.
▪ The closer the correlation coefficient is to either –1.00 or +1.00, the stronger the relationship is between the two variables.
▪ r is the symbol for the sample Pearson correlation coefficient.
◼ Generally, observed correlation coefficients stay between –.70 and +.70.
◼ As a rule of thumb, coefficients between –.20 and +.20 indicate a weak relationship between two variables.
◼ Coefficients from .20 to .50 in absolute value (either positive or negative) represent a moderate relationship.
◼ Coefficients larger than .50 in absolute value (either positive or negative) represent a strong relationship.
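The rule-of-thumb cutoffs above can be written as a small helper. This is only a sketch: the function name `describe_correlation` is hypothetical, and because the slides do not say which label applies at exactly .20 or .50, the code assumes both boundary values count as "moderate".

```python
def describe_correlation(r: float) -> str:
    """Label a correlation using the rule-of-thumb cutoffs from the slides."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and +1")
    if r == 0:
        return "no linear relationship"
    direction = "positive" if r > 0 else "negative"
    size = abs(r)  # strength ignores the sign
    if size < 0.20:
        strength = "weak"
    elif size <= 0.50:  # assumption: .20 and .50 themselves count as moderate
        strength = "moderate"
    else:
        strength = "strong"
    return f"{strength} {direction}"


print(describe_correlation(0.35))   # moderate positive
print(describe_correlation(-0.62))  # strong negative
```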
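Pearson Correlation Coefficients in Depth
◼ The first step is to notice that we are concerned with a sample’s scores on two variables at the same time.
◼ It is critical that the scores on the two variables are paired: each case’s score on one variable (e.g., the independent variable, IV) is matched with that same case’s score on the other (e.g., the dependent variable, DV).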
Calculating the Correlation Coefficient
◼ The first step is to standardize your variables.
◼ What this does is provide a z score for each case in the sample.
◼ Then we find the cross product of the z scores on the two variables for each case in the sample (i.e., multiply each individual’s z score on one variable by that individual’s z score on the second variable).
◼ We sum those cross products across all of the individuals in the sample, and then divide by N.
◼ This average cross product of z scores is the correlation coefficient.
◼ If we instead average the cross products of the raw deviation scores (each score minus its mean), the result is known as the covariance.
◼ If we standardize this covariance (divide it by the product of the two standard deviations), we end up with the same correlation coefficient.
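The calculation steps above can be sketched in a few lines of Python. This is a minimal illustration, not the lecturer's own procedure: the function names and the study-hours data are made up, and the standard deviation is computed by dividing by N so that it matches the "divide by N" step in the slides.

```python
import math


def z_scores(values):
    """Standardize raw scores: subtract the mean, divide by the SD (N in the denominator)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def pearson_r(x, y):
    """Average cross product of paired z scores."""
    if len(x) != len(y):
        raise ValueError("scores must be paired: x and y need the same length")
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)


# Hypothetical paired data: one (hours, score) pair per student.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]
print(round(pearson_r(hours, scores), 2))  # about 0.98
```

Dividing by N - 1 in both the standard deviation and the final average would give the same r, as long as the same denominator is used in both places.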
What the Correlation Coefficient Does, and Does Not, Tell Us
Correlation Coefficient Tells Us
◼ Strength of the Relationship
▪ Values range between –1 and +1.
▪ +1: Perfect positive relationship (as one variable increases, the other also increases).
▪ –1: Perfect negative relationship (as one variable increases, the other decreases).
▪ 0: No linear relationship.
◼ Direction of the Relationship
▪ Positive values indicate a direct (positive) relationship.
▪ Negative values indicate an inverse (negative) relationship.
◼ Nature of Linear Association
▪ The correlation coefficient specifically measures how well data points fit a linear pattern.
Correlation Coefficient Does Not Tell Us
◼ Causation
▪ Correlation does not imply causation. A strong correlation does not mean that one variable causes the other to change.
Example
▪ There is a positive correlation between shark attacks and the number of people at the beach.
▪ This doesn’t mean that going to the beach causes shark attacks. A third variable, the season, lies behind both: more people go to the beach in summer, which increases the likelihood of shark attacks.
◼ Non-Linear Relationships
▪ The correlation coefficient fails to capture relationships that are non-linear.
Example
▪ A child’s happiness might increase with candy consumption up to a point, but after eating too much candy, happiness might decrease due to a stomachache.
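The candy example can be made concrete with invented numbers. In the sketch below (hypothetical data, with `pearson_r` as an illustrative helper name), happiness depends strongly on candy consumption, yet the rise and the fall cancel out and r comes out at essentially zero.

```python
import math


def pearson_r(x, y):
    """Pearson r from raw scores: summed deviation cross products over the root of the sums of squares."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical data: happiness rises with candy, then falls after too much.
candy = [0, 1, 2, 3, 4]
happiness = [1, 3, 4, 3, 1]

r = pearson_r(candy, happiness)
print(abs(r) < 1e-9)  # True: no linear relationship despite a clear curved one
```

A scatterplot of the same data would show the inverted-U pattern immediately, which is one reason to plot the data before interpreting r.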
◼ Correlation Can Be Misleading Because of Outliers
▪ A few unusual data points can distort the correlation value.
Example
▪ Imagine measuring the relationship between study hours and exam scores for 10 students.
▪ If 9 students show a strong positive trend, but 1 student studied a lot and failed because of illness, that one “outlier” could lower the correlation, making it seem weaker than it really is.
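A rough numerical version of this example, using invented scores: nine students follow a perfectly linear trend, and a tenth studies the most but scores very low. The helper name `pearson_r` and all data values are assumptions made for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson r from raw scores."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical data: 9 students on a perfect upward trend.
hours = list(range(1, 10))
scores = [50 + 5 * h for h in hours]

# The 10th student studied 10 hours but scored 30 because of illness.
r_without = pearson_r(hours, scores)
r_with = pearson_r(hours + [10], scores + [30])

print(round(r_without, 2))  # 1.0
print(round(r_with, 2))     # about 0.19
```

With only ten cases, a single extreme point moves r from a perfect correlation to a weak one, so small samples deserve particular caution.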
◼ Correlation Does Not Show How Much One Variable Affects the Other
▪ It only shows the strength of the relationship, not the size of the effect.
Example
▪ If height and weight have a high positive correlation, it doesn’t mean all tall people are heavy or that height determines exact weight. It just shows a general trend.
The Coefficient of Determination
◼ The coefficient of determination, denoted R², is a statistical measure of the proportion of the variance (variability) in one variable that is predictable from another variable.
◼ It is the square of the correlation coefficient r:
R² = r²
Explained Variance
◼ Explained variance refers to the portion of the total variability in the dependent variable (outcome) that can be attributed to its relationship with the independent variable(s).
◼ Suppose we are studying the relationship between study hours and exam scores, and R² = 0.70.
▪ This means 70% of the variability in exam scores is explained by study hours.
▪ The remaining 30% (unexplained variance) could be due to other factors like prior knowledge, sleep quality, or test anxiety.
Shared Variance
◼ When two variables are related, or correlated, with each other, there is a certain amount of shared variance between them.
▪ If height and weight are correlated with R² = 0.49, it means 49% of the variability in weight is shared with height (explained by height). The remaining 51% is influenced by other factors like diet or genetics.
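The arithmetic behind these percentages is just squaring r. A minimal sketch follows; the function name `variance_split` is hypothetical, and r = .70 is assumed because it is the positive correlation that yields R² = 0.49 in the height and weight example.

```python
def variance_split(r):
    """Return (shared, unshared) proportions of variance for a correlation r."""
    r_squared = r ** 2
    return r_squared, 1 - r_squared


shared, unshared = variance_split(0.70)
print(f"{shared:.0%} shared, {unshared:.0%} not shared")  # 49% shared, 51% not shared
```

Because the sign disappears when r is squared, r = -.70 gives the same 49% of shared variance.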
Other Types of Correlation Coefficients
◼ Point-Biserial Correlation
◼ The point-biserial correlation is used to measure the relationship between:
▪ One continuous variable (e.g., test scores).
▪ One binary variable (e.g., gender: male/female or pass/fail).
◼ Why We Use Point-Biserial Correlation:
▪ It helps us understand whether, and how strongly, the two groups (from the binary variable) differ on the continuous variable.
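One way to see what the point-biserial coefficient measures: it equals the ordinary Pearson r computed with the binary variable coded 0/1, and it can also be written in terms of the two group means. The sketch below checks that the two routes agree. The data, the tutoring grouping, and the function names are invented for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson r from raw scores."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


def point_biserial(group, scores):
    """r_pb = (M1 - M0) / s * sqrt(p * q), where s is the SD of all scores (N in the denominator)."""
    n = len(scores)
    ones = [s for g, s in zip(group, scores) if g == 1]
    zeros = [s for g, s in zip(group, scores) if g == 0]
    p = len(ones) / n   # proportion of cases coded 1
    q = 1 - p           # proportion of cases coded 0
    mean_all = sum(scores) / n
    sd = math.sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    mean_diff = sum(ones) / len(ones) - sum(zeros) / len(zeros)
    return mean_diff / sd * math.sqrt(p * q)


# Hypothetical data: 0 = no tutoring, 1 = tutoring.
tutoring = [0, 0, 0, 1, 1, 1]
test_scores = [60, 65, 70, 75, 80, 85]

r_pb = point_biserial(tutoring, test_scores)
print(abs(r_pb - pearson_r(tutoring, test_scores)) < 1e-9)  # True: both routes agree
```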
◼ Spearman's Rho Correlation (ρ)
◼ Spearman’s Rho measures the relationship
between two variables based on their ranked
values (not actual scores).
▪ Used for ordinal data (rankings) or when the data
is not normally distributed.
▪ It measures monotonic relationships (consistent
upward or downward trends).
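Spearman's rho can be computed as a Pearson correlation on the ranks of the scores. The sketch below assumes tied scores receive the average of the ranks they occupy (a common convention that the slides do not state) and uses made-up data in which y always increases with x, but not in a straight line.

```python
import math


def pearson_r(x, y):
    """Pearson r from raw scores."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


def average_ranks(values):
    """Rank from 1 (smallest) upward; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the ranks instead of the raw scores."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical monotonic but non-linear data.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]

print(round(spearman_rho(x, y), 2))  # 1.0
print(round(pearson_r(x, y), 2))     # about 0.94
```

Rho reaches 1 because the rank order is perfectly preserved, while Pearson's r stays below 1 because the points do not fall on a straight line.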
SPSS Activity
◼ Analyze > Correlate > Bivariate
One-Tailed vs. Two-Tailed Tests in Hypothesis Testing
◼ When conducting a hypothesis test, choosing one-tailed or two-tailed depends on the research question and how you frame the alternative hypothesis.
◼ Two-Tailed Test
▪ A two-tailed test checks for the possibility of a relationship in both directions.
◼ Question: Does caffeine consumption affect reaction times?
▪ Null: Caffeine has no effect (μ = 0, where μ is the mean change in reaction time).
▪ Alternative: Caffeine either increases or decreases reaction time (μ ≠ 0).
▪ A two-tailed test is used because we are considering effects in both directions.
◼ One-Tailed Test
▪ A one-tailed test checks for a relationship in one specific direction only.
▪ It is used when we are only interested in whether a parameter is greater than or less than a specific value.
◼ Question: Does a new drug lower blood pressure?
▪ Null: The drug has no effect or increases blood pressure (μ ≥ 0, where μ is the mean change in blood pressure).
▪ Alternative: The drug lowers blood pressure (μ < 0).
▪ A one-tailed test is used because we are only interested in a decrease.
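The practical difference between the two kinds of test shows up in the p value. The slides do not name a specific test, so the sketch below assumes, purely for illustration, that the test statistic is a z score compared against the standard normal distribution. The function name `p_values` and the value z = -1.8 are made up.

```python
from statistics import NormalDist


def p_values(z):
    """Return (two-tailed p, one-tailed p for a 'less than' alternative) for a z statistic."""
    std_normal = NormalDist()             # mean 0, SD 1
    p_lower = std_normal.cdf(z)           # area in the lower tail only
    p_two = 2 * std_normal.cdf(-abs(z))   # area in both tails
    return p_two, p_lower


# Hypothetical result: the drug group's blood pressure change gives z = -1.8.
p_two, p_lower = p_values(-1.8)
print(f"two-tailed p = {p_two:.3f}, one-tailed p = {p_lower:.3f}")
```

Here the two-tailed p is about .072 and the one-tailed p about .036, so the same result falls below the conventional .05 level only under the one-tailed test. That is why the direction of the alternative hypothesis should be fixed before looking at the data.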
