The document provides an overview of correlation analysis, focusing on the correlation coefficient, particularly the Pearson product-moment correlation coefficient, and its various characteristics such as direction and strength. It explains how to calculate correlation coefficients, what they indicate about relationships between variables, and the limitations of correlation, including the distinction between correlation and causation. Additionally, it discusses the coefficient of determination and different types of correlation coefficients, such as point-biserial and Spearman's rho.
A basic measure of association among variables, and a foundational statistic for several more complex statistics, is the correlation coefficient.
Correlation Analysis
◼ There are a number of different types of correlation coefficients; the most commonly used in social science research is the Pearson product-moment correlation coefficient.
◼ Other types of correlation include the point-biserial coefficient, the Spearman rho coefficient, and the phi coefficient.
When to Use Correlation and What It Tells Us
◼ Researchers compute correlation coefficients when they want to know how two variables are related to each other.
▪ For a Pearson product-moment correlation, both variables must be measured on an interval or ratio scale; such variables are known as continuous variables.
Characteristics of Correlation
◼ There are two fundamental characteristics of correlation coefficients that researchers care about.
◼ The first of these is the direction of the correlation coefficient.
▪ A positive correlation indicates that the values on the two variables being analyzed move in the same direction.
▪ That is, as scores on one variable go up, scores on the other variable go up as well (on average).
▪ A negative correlation indicates that the values on the two variables being analyzed move in opposite directions.
▪ That is, as scores on one variable go up, scores on the other variable go down, and vice versa (on average).
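To make the idea of direction concrete, here is a minimal Python sketch. It assumes made-up paired scores (hypothetical data, not taken from any study) and two hypothetical helpers, `average_cross_product` and `direction`. The first averages the products of each case's deviations from the two means; only the sign of that average is used, since a positive sign means the scores tend to move together and a negative sign means they tend to move apart.

```python
def average_cross_product(x, y):
    """Average product of paired deviations from the two means.

    A positive result means the variables tend to move together;
    a negative result means they tend to move in opposite directions.
    """
    if len(x) != len(y):
        raise ValueError("scores must be paired: x and y need the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n


def direction(x, y):
    """Label the direction of the relationship from the sign alone."""
    value = average_cross_product(x, y)
    if value > 0:
        return "positive"
    if value < 0:
        return "negative"
    return "none"


# Hypothetical paired scores for five students (illustrative only).
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 60, 61, 70, 77]   # tends to rise with hours studied
errors_made = [9, 7, 6, 4, 2]       # tends to fall with hours studied

print(direction(hours_studied, exam_score))   # positive
print(direction(hours_studied, errors_made))  # negative
```

The sign alone says nothing about how strong the relationship is; that is the second characteristic of a correlation coefficient.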
◼ The second fundamental characteristic of correlation coefficients is the strength or magnitude of the relationship.
◼ Correlation coefficients range from –1.00 to +1.00.
▪ The closer the correlation coefficient is to either –1.00 or +1.00, the stronger the relationship between the two variables.
▪ r is the symbol for the sample Pearson correlation coefficient.
◼ In practice, correlation coefficients generally stay between –.70 and +.70.
◼ Coefficients between –.20 and +.20 indicate a weak relationship between two variables.
◼ Coefficients between .20 and .50 (either positive or negative) represent a moderate relationship.
◼ Coefficients larger than .50 (either positive or negative) represent a strong relationship.
Pearson Correlation Coefficients in Depth
◼ The first thing to notice is that we are concerned with a sample’s scores on two variables at the same time.
◼ It is critical that the scores on the two variables are paired: each case’s score on one variable (e.g., the IV) is matched with that same case’s score on the other (e.g., the DV).
Calculating the Correlation Coefficient
◼ The first step is to standardize your variables, which provides a z score for each case in the sample.
◼ Next, we find the sum of the cross products between the z scores on the two variables for each case in the sample.
◼ When we multiply each individual’s score on one variable by that individual’s score on the second variable, with both scores expressed as deviations from their means (i.e., find a cross product), sum those cross products across all of the individuals in the sample, and then divide by N, we have an average cross product, known as covariance.
◼ If we standardize this covariance, we end up with a correlation coefficient.
What the Correlation Coefficient Does, and Does Not, Tell Us
Correlation Coefficient Tells Us
◼ Strength of the Relationship
▪ Values range between –1 and +1.
▪ +1: Perfect positive relationship (as one variable increases, the other also increases).
▪ –1: Perfect negative relationship (as one variable increases, the other decreases).
▪ 0: No linear relationship.
◼ Direction of the Relationship
▪ Positive values indicate a direct (positive) relationship.
▪ Negative values indicate an inverse (negative) relationship.
◼ Nature of Linear Association
▪ The correlation coefficient specifically measures how well data points fit a linear pattern.
Correlation Coefficient Does Not Tell Us
◼ Causation
▪ Correlation does not imply causation. A strong correlation does not mean that one variable causes the other to change.
▪ Example: There is a positive correlation between shark attacks and the number of people at the beach.
▪ This doesn’t mean that going to the beach causes shark attacks. A third factor is at work: in summer more people go to the beach and into the water, which increases the likelihood of shark attacks.
◼ Non-Linear Relationships
▪ The correlation coefficient fails to capture relationships that are non-linear.
▪ Example: A child’s happiness might increase with candy consumption up to a point, but after eating too much candy, happiness might decrease due to a stomachache.
◼ Correlation Can Be Misleading Because of Outliers
▪ A few unusual data points can distort the correlation value.
▪ Example: Imagine measuring the relationship between study hours and exam scores for 10 students.
▪ If 9 students show a strong positive trend, but 1 student studied a lot and failed because of illness, that one “outlier” could lower the correlation, making the relationship seem weaker than it really is.
◼ Correlation Does Not Show How Much One Variable Affects the Other
▪ It only shows the strength of the relationship, not the size of the effect.
▪ Example: If height and weight have a high positive correlation, it doesn’t mean that all tall people are heavy or that height determines exact weight. It just shows a general trend.
The Coefficient of Determination
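As a bridge between the calculation steps described earlier (standardize, multiply paired z scores, sum, divide by N) and the squaring of r that gives the coefficient of determination, here is a minimal Python sketch. The helper names `z_scores` and `pearson_r` and all of the data are assumptions made for illustration. The standard deviation divides by N rather than N - 1 so that it matches the "divide by N" step. The second data set mimics the candy example: a perfectly regular but non-linear pattern for which r comes out at zero.

```python
from math import sqrt


def z_scores(values):
    """Standardize scores: subtract the mean, divide by the standard deviation."""
    n = len(values)
    mean = sum(values) / n
    # Standard deviation computed with N in the denominator (an assumption
    # chosen to match the "divide by N" step in the calculation).
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)
    if sd == 0:
        raise ValueError("r is undefined when a variable has no variability")
    return [(v - mean) / sd for v in values]


def pearson_r(x, y):
    """Average cross product of paired z scores."""
    if len(x) != len(y):
        raise ValueError("scores must be paired: x and y need the same length")
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)


# Hypothetical paired scores (illustrative only).
study_hours = [2, 4, 6, 8, 10]
exam_scores = [55, 60, 70, 72, 88]

r = pearson_r(study_hours, exam_scores)
r_squared = r ** 2  # coefficient of determination
print(f"r = {r:.2f}, r squared = {r_squared:.2f}")

# Inverted-U pattern: happiness rises, then falls, as candy increases.
candies_eaten = [0, 1, 2, 3, 4]
happiness = [2, 5, 6, 5, 2]
print(f"r for the inverted-U data = {pearson_r(candies_eaten, happiness):.2f}")
```

The near-zero r for the inverted-U data does not mean the two variables are unrelated; it means only that a straight line describes the relationship poorly.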
◼ The coefficient of determination, denoted R², is a statistical measure of the proportion of the variance (variability) in one variable that is predictable from another variable.
◼ It is the square of the correlation coefficient r: R² = r².
Explained Variance
◼ Explained variance refers to the portion of the total variability in the dependent variable (outcome) that can be attributed to its relationship with the independent variable(s).
◼ Suppose we are studying the relationship between study hours and exam scores, and R² = 0.70.
▪ This means 70% of the variability in exam scores is explained by study hours.
▪ The remaining 30% (unexplained variance) could be due to other factors like prior knowledge, sleep quality, or test anxiety.
Shared Variance
◼ When two variables are related, or correlated, with each other, there is a certain amount of shared variance between them.
▪ If height and weight are correlated with R² = 0.49, then 49% of the variability in weight is shared with height (explained by height). The remaining 51% is influenced by other factors like diet or genetics.
Other Types of Correlation Coefficients
◼ Point-Biserial Correlation
▪ The point-biserial correlation is used to measure the relationship between:
▪ One continuous variable (e.g., test scores).
▪ One binary variable (e.g., gender: male/female or pass/fail).
▪ Why we use it: it helps us understand whether the two groups (from the binary variable) differ significantly on the continuous variable.
◼ Spearman’s Rho Correlation (ρ)
▪ Spearman’s rho measures the relationship between two variables based on their ranked values (not actual scores).
▪ It is used for ordinal data (rankings) or when the data are not normally distributed.
▪ It measures monotonic relationships (consistent upward or downward trends).
SPSS Activity
◼ Analyze > Correlate > Bivariate
◼ One-Tailed vs. Two-Tailed Tests in Hypothesis Testing
▪ When conducting a hypothesis test, the choice between a one-tailed and a two-tailed test depends on the research question and how you frame the alternative hypothesis.
◼ Two-Tailed Test
▪ A two-tailed test checks for the possibility of a relationship in either direction.
▪ Question: Does caffeine consumption affect reaction times?
▪ Null: Caffeine has no effect (μ = 0).
▪ Alternative: Caffeine either increases or decreases reaction time (μ ≠ 0).
▪ A two-tailed test is used because we are considering effects in both directions.
◼ One-Tailed Test
▪ A one-tailed test checks for a relationship in one specific direction only.
▪ It is used when we are only interested in whether a parameter is greater than or less than a specific value.
▪ Question: Does a new drug lower blood pressure?
▪ Null: The drug has no effect or increases blood pressure (μ ≥ 0).
▪ Alternative: The drug lowers blood pressure (μ < 0).
▪ A one-tailed test is used because we are only interested in a decrease.
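To see how the one-tailed and two-tailed choices change a p-value for a correlation, here is a minimal Python sketch. It is an exact permutation test, which is a swapped-in technique chosen because it needs only the standard library; it is not the procedure SPSS uses, so its p-values will generally differ somewhat from SPSS output. The helper names and the caffeine data are hypothetical. The idea: if the two variables were unrelated, any re-pairing of the scores would be equally likely, so the p-value is the share of re-pairings whose r is at least as extreme as the observed one.

```python
from itertools import permutations
from math import sqrt


def pearson_r(x, y):
    """Sample Pearson r from sums of deviation products."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def permutation_p_values(x, y):
    """Exact permutation p-values (feasible only for small samples)."""
    observed = pearson_r(x, y)
    tolerance = 1e-12  # guards against floating-point ties
    total = upper = lower = both = 0
    for reordered_y in permutations(y):
        r = pearson_r(x, reordered_y)
        total += 1
        if r >= observed - tolerance:
            upper += 1  # as large as observed, or larger
        if r <= observed + tolerance:
            lower += 1  # as small as observed, or smaller
        if abs(r) >= abs(observed) - tolerance:
            both += 1   # as extreme as observed in either direction
    return {
        "r": observed,
        "one_tailed_positive": upper / total,
        "one_tailed_negative": lower / total,
        "two_tailed": both / total,
    }


# Hypothetical data: caffeine dose (mg) and reaction time (ms) for six people.
caffeine_mg = [0, 50, 100, 150, 200, 250]
reaction_ms = [310, 300, 295, 280, 285, 260]

result = permutation_p_values(caffeine_mg, reaction_ms)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

For these data r is negative, so the one-tailed test in the negative direction gives a p-value no larger than the two-tailed one, while the one-tailed test in the positive direction gives a p-value close to 1. That is why the direction has to be chosen from the research question before looking at the data.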