Association Between Two Variables: - No Association - Linear Association

The document discusses different statistical analysis methods used to examine relationships between variables, including t-tests, correlation, regression, and chi-square tests. It explains how to quantify the strength of linear correlation using the correlation coefficient r, which ranges from -1 to 1. A larger absolute r value indicates a stronger linear relationship. The document also discusses how to interpret r and r-squared, and the difference between correlation and regression analysis. Examples are provided to illustrate linear correlation and the use of regression to predict one variable based on another.

Uploaded by

anon-753408
Copyright
© Attribution Non-Commercial (BY-NC)

             Method                       Dependent Variable          Independent Variable

Last Week    T-Test                       Continuous                  Categorical

This Week    Correlation and Regression   Continuous                  Continuous
             Chi-Square                   Categorical                 Categorical

Next Week    General Linear Model         Categorical or Continuous   Categorical or Continuous

Association Between Two Variables

• No association
• Linear association
– Positive association
– Negative association
• Curvilinear association
[Figure: four scatterplots, each with both axes running from 0 to 1]

Strength of Linear Association

[Figure: three slides, each showing a pair of scatterplots with both axes running from 0 to 1]

Quantifying the Strength of Linear Correlation
• What does a positive linear correlation mean?
– Large numbers on one variable go with large numbers on the other variable.
• How to decide what are large and small numbers?
– Relative to the means.

Student #   SAT (X)   GPA (Y)
1           450       2.7
2           520       3.1
3           600       3.5
4           470       2.6
5           460       3.1
µ           500       3.0
σ           55.5      0.32

[Figure: scatterplot of GPA (2.4 to 3.6) against SAT (350 to 650)]

Student #   SAT (X)   GPA (Y)   X – µX   Y – µY   (X – µX)(Y – µY)
1           450       2.7       -50      -0.3     15
2           520       3.1       20       0.1      2
3           600       3.5       100      0.5      50
4           470       2.6       -30      -0.4     12
5           460       3.1       -40      0.1      -4
Sum         2500      15.0      0        0        75 (Cross Product)
µ           500       3.0       0        0        15 (Covariance)
σ           55.5      0.32

Quantifying the Strength of Linear Correlation
• Is 15 a large or small number?
• At least we know it is positive.
• Judge its magnitude relative to the variability (standard deviations) of X and Y.

$$ r = \frac{\text{Covariance}}{\sigma_X \cdot \sigma_Y} = \frac{\sigma_{XY}}{\sigma_X \cdot \sigma_Y} $$

• r = 15 / (55.5 × 0.32) = 0.84
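
The covariance and r computed above can be reproduced with a short Python sketch. It uses the five SAT/GPA pairs from the table and population formulas (dividing by N), which is what the slide's σ values imply. The function names are illustrative, not part of the lecture.

```python
from math import sqrt

# SAT (X) and GPA (Y) for the five students in the table.
sat = [450, 520, 600, 470, 460]
gpa = [2.7, 3.1, 3.5, 2.6, 3.1]


def mean(values):
    return sum(values) / len(values)


def population_sd(values):
    # Divides by N (not N - 1), matching the sigma values on the slide.
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / len(values))


def covariance(xs, ys):
    # Mean of the cross products of the deviations from each mean.
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


cov_xy = covariance(sat, gpa)
r = cov_xy / (population_sd(sat) * population_sd(gpa))

print(f"covariance = {cov_xy:.2f}")
print(f"r = {r:.2f}")
```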
Alternative Approach
• Standardize X and Y first (z-scores), then calculate the covariance between the z-scores.

$$ r = \frac{\sum z_X \cdot z_Y}{N} $$

Student #   SAT (X)   GPA (Y)   zX      zY      zX·zY
1           450       2.7       -0.90   -0.93   0.84
2           520       3.1       0.36    0.31    0.11
3           600       3.5       1.80    1.55    2.79
4           470       2.6       -0.54   -1.24   0.67
5           460       3.1       -0.72   0.31    -0.22
Sum         2500      15.0      0       0       4.19
µ           500       3.0       0       0       0.84 (r)
σ           55.5      0.32
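
A minimal Python sketch of the z-score route, assuming the same five SAT/GPA pairs and population standard deviations (dividing by N). The helper name `z_scores` is illustrative.

```python
from math import sqrt

sat = [450, 520, 600, 470, 460]
gpa = [2.7, 3.1, 3.5, 2.6, 3.1]


def z_scores(values):
    # Standardize: subtract the mean, divide by the population SD.
    n = len(values)
    m = sum(values) / n
    sd = sqrt(sum((v - m) ** 2 for v in values) / n)
    return [(v - m) / sd for v in values]


z_x = z_scores(sat)
z_y = z_scores(gpa)

# r is the mean of the products of paired z-scores.
r_from_z = sum(zx * zy for zx, zy in zip(z_x, z_y)) / len(sat)

for student, (zx, zy) in enumerate(zip(z_x, z_y), start=1):
    print(student, round(zx, 2), round(zy, 2), round(zx * zy, 2))
print("r =", round(r_from_z, 2))
```

Both routes should agree, since dividing the covariance by σX·σY is the same as averaging the products of the standardized scores.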
Interpreting the Magnitude of Correlations
• Always between -1 and +1
• Proportion of variance explained by the other variable: r²
• r = .84, r² = .71 = 71%
• A correlation of .8 is NOT two times stronger than a correlation of .4.
– How much stronger?
– 4 times, in terms of variance explained: (.8)² = .64; (.4)² = .16

Significance Testing
• The following has a t distribution:

$$ t = \frac{r \sqrt{N - 2}}{\sqrt{1 - r^2}} $$

df = N – 2
r = .84, t = 2.68, df = 3, p = .075
Not significant at .05 level. Small sample size.
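
The t and p values above can be checked with a standard-library Python sketch. It assumes a two-tailed test and gets the p-value by numerically integrating the t density with Simpson's rule. That integration is an illustrative choice, not the method used in the lecture, and the function names are hypothetical.

```python
from math import gamma, pi, sqrt


def t_statistic(r, n):
    # t = r * sqrt(N - 2) / sqrt(1 - r^2)
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


def t_pdf(x, df):
    # Density of Student's t distribution with df degrees of freedom.
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    # Simpson's rule for the area between 0 and |t| (steps must be even),
    # then p = 2 * (0.5 - area) by symmetry of the t distribution.
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    area = total * h / 3
    return 2 * (0.5 - area)


r, n = 0.84, 5          # values from the slide
df = n - 2
t = t_statistic(r, n)
p = two_tailed_p(t, df)

print(f"t = {t:.2f}, df = {df}, p = {p:.3f}")
```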
When There’s a Significant Correlation
• Correlation and Causation
• X causes Y
• Y causes X
• Z causes both X and Y

When There’s No Significant Correlation
• Small sample
• Other Noise
• Attenuation due to unreliability of
measurement
• Outliers
• Restriction in range
• Curvilinearity
From Correlation to Regression
• Correlation: to describe the relationship
between two variables
• Regression: to use one variable to predict
another variable
• The accuracy of prediction depends on the
strength of correlation

Strength of Linear Association

[Figure: pair of scatterplots with both axes running from 0 to 1]

An Example
• Research Question: Does eating spinach increase strength?
• Randomly sampled 20 individuals.
• IV: How many cans of spinach one consumed in the past week.
• DV: How many push-ups one can do in a minute.

[Figure: scatterplot of Pushup (0 to 70) against Spinach (0 to 25); r = .86]

Coefficients (a)

                 Unstandardized Coefficients   Standardized Coefficients
Model            B         Std. Error          Beta                        t       Sig.
1  (Constant)    19.443    3.494                                           5.565   .000
   spinach       1.550     .220                .856                        7.031   .000
a. Dependent Variable: pushup

$$ \hat{Y} = 19.44 + 1.55X $$

$$ \hat{z}_Y = (.856)\, z_X $$
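
To illustrate how the fitted equation is used for prediction, here is a small Python sketch that plugs the coefficients from the SPSS table into the regression line. The function names and the example inputs (0, 10, 20 cans) are assumptions for illustration only.

```python
# Coefficients copied from the SPSS output above.
INTERCEPT = 19.443  # B for (Constant)
SLOPE = 1.550       # B for spinach
BETA = 0.856        # standardized coefficient


def predict_pushups(cans):
    # Unstandardized prediction: Y-hat = intercept + slope * X
    return INTERCEPT + SLOPE * cans


def predict_z_pushups(z_spinach):
    # Standardized prediction: z_Y-hat = beta * z_X
    return BETA * z_spinach


for cans in (0, 10, 20):  # hypothetical spinach consumption values
    print(cans, "cans ->", round(predict_pushups(cans), 1), "push-ups")
```

In a regression with one predictor, the standardized coefficient equals the correlation, which is why Beta (.856) matches the r = .86 reported on the scatterplot.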

Understanding R²: Proportion of Variance Explained, or Proportion Reduction in Error

[Figure: sequence of scatterplots of Pushup (0 to 70) against Spinach (0 to 25), annotated as follows]

• When you don’t know X, you can only use the mean of Y to predict the Y score of any individual.
• Errors (or variance) are relatively high when you use the mean of Y as your prediction.
• When you know X, and use X to predict Y, the errors become smaller.

$$ R^2 = \frac{\sum (\hat{Y} - \bar{Y})^2}{\sum (Y - \bar{Y})^2} $$

(Numerator: the green portion. Denominator: the green and red portions together.)

[Figure: diagram labeled "# of push-ups" and "Spinach consumption"]
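
The raw spinach data are not included in the slides, so the sketch below illustrates the same idea with the five SAT/GPA pairs from earlier in the lecture. It fits a least-squares line, then compares the squared errors from predicting with the mean of Y against those from predicting with the line. Variable names are illustrative.

```python
sat = [450, 520, 600, 470, 460]
gpa = [2.7, 3.1, 3.5, 2.6, 3.1]

n = len(sat)
mean_x = sum(sat) / n
mean_y = sum(gpa) / n

# Least-squares slope and intercept for predicting GPA from SAT.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(sat, gpa))
         / sum((x - mean_x) ** 2 for x in sat))
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * x for x in sat]

# Error when the mean of Y is the only prediction available.
ss_total = sum((y - mean_y) ** 2 for y in gpa)
# Error that remains when X is used to predict Y.
ss_error = sum((y - y_hat) ** 2 for y, y_hat in zip(gpa, predicted))
# Variation in Y accounted for by the regression line.
ss_explained = sum((y_hat - mean_y) ** 2 for y_hat in predicted)

r_squared = ss_explained / ss_total
reduction_in_error = 1 - ss_error / ss_total

print(round(r_squared, 3), round(reduction_in_error, 3))
```

The two quantities coincide, which is why R² can be read either as variance explained or as proportion reduction in error. The unrounded value here is about .70; the .71 quoted earlier comes from squaring the rounded r = .84.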
Association Between Two Categorical Variables
• Angelina Jolie or Jennifer Aniston?

Test for Independence


• Null Hypothesis: There is no relationship between JA/AJ preference and which side of the classroom you are sitting on.
• To rephrase: JA/AJ preference does not depend on which side of the classroom you are sitting on.
• Another version: People sitting on the right and people sitting on the left do not have different JA/AJ preferences.
         JA                    AJ    Total
Left     Observed / Expected
Right
Total

Expected Frequency
• Expected assuming the null hypothesis is true, i.e., no association between the two variables.

$$ \text{Expected} = \frac{C \cdot R}{N} $$

• C: Column total, R: Row total, N: Grand total
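
A short Python sketch of the expected-frequency formula. The observed counts are hypothetical (the class's actual JA/AJ tallies are not in the slides), and the function name is illustrative.

```python
# Hypothetical observed counts, NOT from the lecture.
# Rows: Left, Right. Columns: JA, AJ.
observed = [
    [12, 8],
    [6, 14],
]


def expected_frequencies(table):
    # Expected = (column total * row total) / grand total for each cell.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


expected = expected_frequencies(observed)
for label, row in zip(("Left", "Right"), expected):
    print(label, row)
```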
Chi-Square

$$ \chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}} $$

• Degrees of Freedom
df = (# of Columns – 1)(# of Rows – 1)
• What is the df for a 2 x 2 table?
• The shape of the chi-square distribution depends on the degrees of freedom
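
The statistic and its degrees of freedom can be sketched in Python as follows. The 2 x 2 counts are hypothetical, not the class data, and the function name is illustrative.

```python
# Hypothetical observed counts, NOT from the lecture.
# Rows: Left, Right. Columns: JA, AJ.
observed = [
    [12, 8],
    [6, 14],
]


def chi_square_independence(table):
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count under the null hypothesis of no association.
            exp = row_totals[i] * col_totals[j] / grand_total
            chi2 += (obs - exp) ** 2 / exp

    # df = (# of columns - 1) * (# of rows - 1)
    df = (len(col_totals) - 1) * (len(row_totals) - 1)
    return chi2, df


chi2, df = chi_square_independence(observed)
print(f"chi-square = {chi2:.2f}, df = {df}")
```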

[Figure: chi-square distribution with the critical region marked]

Chi-Square
• The chi-square statistic is never negative. Why?
• When df = 1, the chi-square distribution is the distribution of z².
• Without looking it up in a reference, what is the alpha = .05 cutoff value for the chi-square distribution (df = 1)?
– (1.96)² = 3.84
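
The link between z and chi-square with df = 1 can be checked with Python's standard library. This is a sketch of that relationship only (it does not cover other df), and the function name is illustrative.

```python
from math import erfc, sqrt
from statistics import NormalDist

alpha = 0.05

# Two-tailed z cutoff for alpha = .05 (about 1.96).
z_cutoff = NormalDist().inv_cdf(1 - alpha / 2)

# With df = 1, chi-square is the distribution of z squared,
# so the chi-square cutoff is the squared z cutoff.
chi2_cutoff = z_cutoff ** 2


def chi2_p_value_df1(chi2):
    # P(Z^2 > chi2) = P(|Z| > sqrt(chi2)) = erfc(sqrt(chi2 / 2))
    return erfc(sqrt(chi2 / 2))


print(f"z cutoff = {z_cutoff:.2f}, chi-square cutoff = {chi2_cutoff:.2f}")
print(f"p at the cutoff = {chi2_p_value_df1(chi2_cutoff):.3f}")
```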
Back to Angelina and Jennifer
• In SPSS.

If We Still Have Time…

Chi-Square Test
for Goodness of Fit
To test whether a distribution is
the same as a predetermined or
theoretical distribution.
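
A minimal goodness-of-fit sketch in Python, assuming a hypothetical example that is not in the slides: 60 rolls of a die tested against a uniform distribution. It uses the same chi-square formula as the test for independence, with the standard df of (number of categories – 1), which the slides do not state.

```python
# Hypothetical counts for faces 1-6 over 60 rolls, NOT from the lecture.
observed = [8, 9, 12, 11, 6, 14]

# Theoretical distribution: a fair die, each face with probability 1/6.
expected_proportions = [1 / 6] * 6

n = sum(observed)
expected = [p * n for p in expected_proportions]

# Same statistic as before: sum of (Observed - Expected)^2 / Expected.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Standard df for goodness of fit: number of categories minus 1.
df = len(observed) - 1

print(f"chi-square = {chi2:.2f}, df = {df}")
```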
Next Week
• Integrating t-test, correlation, regression,
and chi-square test for independence
• They are all special cases of the general
linear model
• Effect size and power for the above tests
