
More than two groups:

ANOVA and Chi-square


First, recent news…

RESEARCHERS FOUND A NINE-FOLD INCREASE IN THE RISK OF DEVELOPING PARKINSON'S IN INDIVIDUALS EXPOSED IN THE WORKPLACE TO CERTAIN SOLVENTS…

The data…

Table 3. Solvent Exposure Frequencies and Adjusted Pairwise Odds Ratios in PD-Discordant Twins, n = 99 Pairs
Which statistical test?

Outcome variable: binary or categorical (e.g., fracture, yes/no)

Are the observations correlated?

Independent observations:
- Chi-square test: compares proportions between two or more groups
- Relative risks: odds ratios or risk ratios
- Logistic regression: multivariate technique used when the outcome is binary; gives multivariate-adjusted odds ratios

Correlated observations:
- McNemar's chi-square test: compares a binary outcome between correlated groups (e.g., before and after)
- Conditional logistic regression: multivariate regression technique for a binary outcome when groups are correlated (e.g., matched data)
- GEE modeling: multivariate regression technique for a binary outcome when groups are correlated (e.g., repeated measures)

Alternative to the chi-square test if sparse cells:
- Fisher's exact test: compares proportions between independent groups when there are sparse data (some cells <5)
- McNemar's exact test: compares proportions between correlated groups when there are sparse data (some cells <5)
Comparing more than two groups…

Continuous outcome (means)

Are the observations independent or correlated?

Outcome variable: continuous (e.g., pain scale, cognitive function)

Independent observations:
- Ttest: compares means between two independent groups
- ANOVA: compares means between more than two independent groups
- Pearson's correlation coefficient (linear correlation): shows linear correlation between two continuous variables
- Linear regression: multivariate technique used when the outcome is continuous

Correlated observations:
- Paired ttest: compares means between two related groups (e.g., the same subjects before and after)
- Repeated-measures ANOVA: compares changes over time in the means of two or more groups (repeated measurements)
- Mixed models/GEE modeling: multivariate regression techniques to compare changes over time between two or more groups

Alternatives if the normality assumption is violated (and small sample size): non-parametric statistics
- Wilcoxon signed-rank test: non-parametric alternative to the paired ttest
- Wilcoxon rank-sum test (= Mann-Whitney U test): non-parametric alternative to the ttest
- Kruskal-Wallis test: non-parametric alternative to ANOVA
- Spearman rank correlation coefficient: non-parametric alternative to Pearson's correlation
ANOVA example

Mean micronutrient intake from the school lunch by school

                      S1 (n=28)    S2 (n=25)    S3 (n=21)    P-value
Calcium (mg)   Mean   117.8        158.7        206.5        0.000
               SD     62.4         70.5         86.2
Iron (mg)      Mean   2.0          2.0          2.0          0.854
               SD     0.6          0.6          0.6
Folate (μg)    Mean   26.6         38.7         42.6         0.000
               SD     13.1         14.5         15.1
Zinc (mg)      Mean   1.9          1.5          1.3          0.055
               SD     1.0          1.2          0.4

S1: School 1 (most deprived; 40% subsidized lunches).
S2: School 2 (medium deprived; <10% subsidized).
S3: School 3 (least deprived; no subsidization, private school).
P-values from ANOVA; significant differences are highlighted in bold (P<0.05).

FROM: Gould R, Russell J, Barker ME. School lunch menus and 11 to 12 year old children's food choice in three secondary schools in England - are the nutritional standards being met? Appetite. 2006.
ANOVA (ANalysis Of VAriance)
- Idea: for two or more groups, test the difference between means, for quantitative, normally distributed variables.
- Just an extension of the t-test (an ANOVA with only two groups is mathematically equivalent to a t-test).
One-Way Analysis of Variance
- Assumptions, same as the ttest:
  - Normally distributed outcome
  - Equal variances between the groups
  - Groups are independent
Hypotheses of One-Way ANOVA

H₀: μ₁ = μ₂ = μ₃ = …
H₁: Not all of the population means are the same
ANOVA
- It's like this: if I have three groups to compare:
  - I could do three pairwise ttests, but this would increase my type I error.
  - So, instead, I want to look at the pairwise differences "all at once."
  - To do this, I can recognize that variance is a statistic that lets me look at more than one difference at a time.
The "F-test"

Is the difference in the means of the groups more than background noise (= variability within groups)?

F = variability between groups / variability within groups

- Summarizes the mean differences between all groups at once.
- Analogous to pooled variance from a ttest.

Recall, we have already used an "F-test" to check for equality of variances: if F >> 1 (indicating unequal variances), use unpooled variance in the t-test.
The F-distribution
- The F-distribution is a continuous probability distribution that depends on two parameters n and m (numerator and denominator degrees of freedom, respectively):
http://www.econtools.com/jevons/java/Graphics2D/FDist.html

The F-distribution
- A ratio of variances follows an F-distribution:

  σ²between / σ²within ~ F(n, m)

- The F-test tests the hypothesis that two variances are equal. F will be close to 1 if the sample variances are equal.

  H₀: σ²between = σ²within
  Hₐ: σ²between > σ²within
How to calculate ANOVA's by hand…

Data layout (n = 10 observations per group, k = 4 groups):

Treatment 1: y11, y12, …, y110
Treatment 2: y21, y22, …, y210
Treatment 3: y31, y32, …, y310
Treatment 4: y41, y42, …, y410

The group means:
  ȳ1 = Σj y1j / 10,  ȳ2 = Σj y2j / 10,  ȳ3 = Σj y3j / 10,  ȳ4 = Σj y4j / 10   (sums over j = 1 to 10)

The (within) group variances:
  Σj (y1j − ȳ1)² / (10 − 1),  Σj (y2j − ȳ2)² / (10 − 1),  Σj (y3j − ȳ3)² / (10 − 1),  Σj (y4j − ȳ4)² / (10 − 1)
Sum of Squares Within (SSW), or Sum of Squares Error (SSE)

Start from the (within) group variances:
  Σj (y1j − ȳ1)² / (10 − 1),  Σj (y2j − ȳ2)² / (10 − 1),  Σj (y3j − ȳ3)² / (10 − 1),  Σj (y4j − ȳ4)² / (10 − 1)

Add up their numerators:
  Σj (y1j − ȳ1)² + Σj (y2j − ȳ2)² + Σj (y3j − ȳ3)² + Σj (y4j − ȳ4)²
  = Σi Σj (yij − ȳi)²   (i = 1 to 4, j = 1 to 10)
  = Sum of Squares Within (SSW), or SSE, for chance error.
Sum of Squares Between (SSB), or Sum of Squares Regression (SSR)

Overall mean of all 40 observations (the "grand mean"):
  ȳ·· = Σi Σj yij / 40

  SSB = 10 × Σi (ȳi − ȳ··)²   (i = 1 to 4)

Sum of Squares Between (SSB): the variability of the group means compared to the grand mean (the variability due to the treatment).
Total Sum of Squares (TSS)

  TSS = Σi Σj (yij − ȳ··)²   (i = 1 to 4, j = 1 to 10)

Total sum of squares (TSS): the squared difference of every observation from the overall mean (the numerator of the variance of Y!).
Partitioning of Variance

  Σi Σj (yij − ȳi)²  +  10 × Σi (ȳi − ȳ··)²  =  Σi Σj (yij − ȳ··)²

  SSW + SSB = TSS
ANOVA Table

Between (k groups): d.f. = k − 1; sum of squares = SSB (sum of squared deviations of group means from the grand mean); mean sum of squares = SSB/(k − 1); F-statistic = [SSB/(k − 1)] / [SSW/(nk − k)], referred to the F(k − 1, nk − k) distribution; p-value: go to the F chart.

Within (n individuals per group): d.f. = nk − k; sum of squares = SSW (sum of squared deviations of observations from their group mean); mean sum of squares = s² = SSW/(nk − k).

Total variation: d.f. = nk − 1; sum of squares = TSS (sum of squared deviations of observations from the grand mean); TSS = SSB + SSW.
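Where the table says "go to the F chart," the p-value can equivalently be read off the F distribution in software. A minimal sketch, assuming Python with scipy (the lecture itself uses F tables/charts):

```python
from scipy import stats

k, n = 4, 10                                  # k groups, n observations per group
F = 1.14                                      # F-statistic (value from the worked example below)
p = stats.f.sf(F, dfn=k - 1, dfd=n * k - k)   # upper-tail area of F(3, 36)
print(p)                                      # close to the .344 reported in the worked example below
```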
ANOVA = t-test

For two groups of size n each, with means X̄ and Ȳ, the grand mean is (X̄ + Ȳ)/2, so:

SSB = n(X̄ − (X̄ + Ȳ)/2)² + n(Ȳ − (X̄ + Ȳ)/2)²
    = n(X̄/2 − Ȳ/2)² + n(Ȳ/2 − X̄/2)²
    = n(X̄² − 2X̄Ȳ + Ȳ²)/2
    = n(X̄ − Ȳ)²/2

The two-group ANOVA table:

Between (2 groups): d.f. = 1; sum of squares = SSB = n(X̄ − Ȳ)²/2 (the squared difference in means multiplied by n/2); mean sum of squares = SSB; F-statistic = SSB/sp² = (X̄ − Ȳ)²/(2sp²/n) = [(X̄ − Ȳ)/(sp·sqrt(2/n))]² = (t with 2n−2 d.f.)², compared to F(1, 2n−2); p-value: go to the chart, and notice the values are just (t with 2n−2 d.f.)².

Within: d.f. = 2n−2; sum of squares = SSW (equivalent to the numerator of the pooled variance); mean sum of squares = pooled variance sp².

Total variation: d.f. = 2n−1; sum of squares = TSS.
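The algebra above says a two-group ANOVA is just the square of the (pooled) two-sample t-test. A quick numerical check, assuming Python with scipy and made-up data (not from the lecture):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(50, 10, size=20)   # hypothetical group 1
y = rng.normal(55, 10, size=20)   # hypothetical group 2

t, p_t = stats.ttest_ind(x, y)    # pooled (equal-variance) t-test
F, p_F = stats.f_oneway(x, y)     # one-way ANOVA on the same two groups

print(t**2, F)                    # identical: F = t^2
print(p_t, p_F)                   # identical p-values
```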
Example
Treatment 1 Treatment 2 Treatment 3 Treatment 4
60 inches 50 48 47
67 52 49 67
42 43 50 54
67 67 55 67
56 67 56 68
62 59 61 65
64 67 61 65
59 64 60 56
72 63 59 60
71 65 64 65
Example

Step 1) Calculate the sum of squares between groups:

Mean for group 1 = 62.0
Mean for group 2 = 59.7
Mean for group 3 = 56.3
Mean for group 4 = 61.4

Grand mean = 59.85

SSB = [(62 − 59.85)² + (59.7 − 59.85)² + (56.3 − 59.85)² + (61.4 − 59.85)²] × n per group = 19.65 × 10 = 196.5
Example

Step 2) Calculate the sum of squares within groups:

(60 − 62)² + (67 − 62)² + (42 − 62)² + (67 − 62)² + (56 − 62)² + (62 − 62)² + (64 − 62)² + (59 − 62)² + (72 − 62)² + (71 − 62)² + (50 − 59.7)² + (52 − 59.7)² + (43 − 59.7)² + (67 − 59.7)² + (67 − 59.7)² + … (the sum of all 40 squared deviations from the group means) = 2060.6
Step 3) Fill in the ANOVA table

Source of variation   d.f.   Sum of squares   Mean Sum of Squares   F-statistic   p-value
Between               3      196.5            65.5                  1.14          .344
Within                36     2060.6           57.2
Total                 39     2257.1

INTERPRETATION of ANOVA:
How much of the variance in height is explained by treatment group?
Coefficient of Determination

R² = SSB / (SSB + SSE) = SSB / SST

The amount of variation in the outcome variable (dependent variable) that is explained by the predictor (independent variable).
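To check the worked example above (and answer the interpretation question), here is a minimal sketch, assuming Python with numpy/scipy rather than the by-hand and chart-based steps in the slides:

```python
import numpy as np
from scipy import stats

groups = [
    [60, 67, 42, 67, 56, 62, 64, 59, 72, 71],   # treatment 1 (heights in inches)
    [50, 52, 43, 67, 67, 59, 67, 64, 63, 65],   # treatment 2
    [48, 49, 50, 55, 56, 61, 61, 60, 59, 64],   # treatment 3
    [47, 67, 54, 67, 68, 65, 65, 56, 60, 65],   # treatment 4
]

F, p = stats.f_oneway(*groups)     # F ~ 1.14, p ~ .34, as in the ANOVA table above

all_obs = np.concatenate(groups)
grand_mean = all_obs.mean()        # 59.85
ssb = sum(len(g) * (np.mean(g) - grand_mean) ** 2 for g in groups)   # ~196.5
tss = ((all_obs - grand_mean) ** 2).sum()                            # ~2257.1
r_squared = ssb / tss              # ~0.09: treatment group explains about 9% of the variance in height
print(F, p, r_squared)
```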
Beyond one-way ANOVA

Often, you may want to test more than 1 treatment. ANOVA can accommodate more than 1 treatment or factor, so long as they are independent. Again, the variation partitions beautifully!

TSS = SSB1 + SSB2 + SSW
ANOVA example

Table 6. Mean micronutrient intake from the school lunch by school

                      S1 (n=25)    S2 (n=25)    S3 (n=25)    P-value
Calcium (mg)   Mean   117.8        158.7        206.5        0.000
               SD     62.4         70.5         86.2
Iron (mg)      Mean   2.0          2.0          2.0          0.854
               SD     0.6          0.6          0.6
Folate (μg)    Mean   26.6         38.7         42.6         0.000
               SD     13.1         14.5         15.1
Zinc (mg)      Mean   1.9          1.5          1.3          0.055
               SD     1.0          1.2          0.4

S1: School 1 (most deprived; 40% subsidized lunches).
S2: School 2 (medium deprived; <10% subsidized).
S3: School 3 (least deprived; no subsidization, private school).
P-values from ANOVA; significant differences are highlighted in bold (P<0.05).

FROM: Gould R, Russell J, Barker ME. School lunch menus and 11 to 12 year old children's food choice in three secondary schools in England - are the nutritional standards being met? Appetite. 2006.
Answer

Step 1) Calculate the sum of squares between groups:

Mean for School 1 = 117.8
Mean for School 2 = 158.7
Mean for School 3 = 206.5

Grand mean = 161

SSB = [(117.8 − 161)² + (158.7 − 161)² + (206.5 − 161)²] × 25 per group = 98,113
Answer

Step 2) Calculate the sum of squares within groups:

S.D. for S1 = 62.4
S.D. for S2 = 70.5
S.D. for S3 = 86.2

Therefore, the sum of squares within is:
(24)[62.4² + 70.5² + 86.2²] = 391,066
Answer

Step 3) Fill in your ANOVA table

Source of variation   d.f.   Sum of squares   Mean Sum of Squares   F-statistic   p-value
Between               2      98,113           49,056                9             <.05
Within                72     391,066          5,431
Total                 74     489,179

**R² = 98,113/489,179 = 20%
School explains 20% of the variance in lunchtime calcium intake in these kids.
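The same arithmetic can be scripted directly from the published means and SDs. A sketch, assuming Python with numpy/scipy and using the calcium row of the table above:

```python
import numpy as np
from scipy import stats

means = np.array([117.8, 158.7, 206.5])   # calcium means for Schools 1-3
sds = np.array([62.4, 70.5, 86.2])        # calcium SDs
n, k = 25, 3                              # 25 children per school, 3 schools

grand_mean = means.mean()                           # 161 (equal n, so a simple average)
ssb = n * ((means - grand_mean) ** 2).sum()         # sum of squares between
ssw = (n - 1) * (sds ** 2).sum()                    # sum of squares within, about 391,000
F = (ssb / (k - 1)) / (ssw / (n * k - k))           # about 9, as in the table above
p = stats.f.sf(F, k - 1, n * k - k)                 # far below .05
print(F, p)
```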
ANOVA summary
- A statistically significant ANOVA (F-test) only tells you that at least two of the groups differ, but not which ones differ.
- Determining which groups differ (when it's unclear) requires more sophisticated analyses to correct for the problem of multiple comparisons…
Question: Why not just do 3 pairwise ttests?
- Answer: because, at an error rate of 5% per test, you have an overall chance of up to 1 − (.95)³ = 14% of making a type-I error (if all 3 comparisons were independent).
- If you wanted to compare 6 groups, you'd have to do 6C2 = 15 pairwise ttests, which would give you a high chance of finding something significant just by chance (if all tests were independent with a type-I error rate of 5% each); the probability of at least one type-I error is 1 − (.95)¹⁵ = 54%.
Recall: Multiple
comparisons
Correction for multiple comparisons

How to correct for multiple comparisons post-hoc…
• Bonferroni correction (adjusts p by the most conservative amount; assuming all tests are independent, divide p by the number of tests)
• Tukey (adjusts p)
• Scheffé (adjusts p)
• Holm/Hochberg (gives a p-cutoff beyond which results are not significant)
Procedures for Post Hoc Comparisons
- If your ANOVA test identifies a difference between group means, then you must identify which of your k groups differ.
- If you did not specify the comparisons of interest ("contrasts") ahead of time, then you have to pay a price for making all kCr pairwise comparisons to keep the overall type-I error rate to α.
- Alternately, run a limited number of planned comparisons (making only those comparisons that are most important to your research question). (This limits the number of tests you make.)
1. Bonferroni

For example, to make a Bonferroni correction, divide your desired alpha cut-off level (usually .05) by the number of comparisons you are making. It assumes complete independence between comparisons, which is way too conservative.

Obtained P-value   Original Alpha   # tests   New Alpha   Significant?
.001               .05              5         .010        Yes
.011               .05              4         .013        Yes
.019               .05              3         .017        No
.032               .05              2         .025        No
.048               .05              1         .050        Yes
2/3. Tukey and Scheffé
- Both methods increase your p-values to account for the fact that you've done multiple comparisons, but they are less conservative than Bonferroni (let the computer calculate them for you!).
- SAS options in PROC GLM:
  adjust=tukey
  adjust=scheffe
4/5. Holm and Hochberg
- Arrange all the resulting p-values (from the T = kCr pairwise comparisons) in order from smallest (most significant) to largest: p1 to pT.

Holm
1. Start with p1 and compare it to the Bonferroni cutoff (= α/T). If p1 < α/T, then p1 is significant; continue to step 2. If not, then there are no significant p-values and you stop here.
2. If p2 < α/(T − 1), then p2 is significant; continue to step 3. If not, then p2 through pT are not significant and you stop here.
3. If p3 < α/(T − 2), then p3 is significant; continue to step 4. If not, then p3 through pT are not significant and you stop here.
Repeat the pattern…
Hochberg
1. Start with the largest (least significant) p-value, pT, and compare it to α. If it's significant, so are all the remaining p-values, and you stop here. If it's not significant, go to step 2.
2. If pT-1 < α/2, then pT-1 is significant, as are all the remaining smaller p-values, and you stop here. If not, then pT-1 is not significant; go to step 3.
Repeat the pattern…

Note: Holm and Hochberg should give you the same results. Use Holm if you anticipate few significant comparisons; use Hochberg if you anticipate many significant comparisons.
Practice Problem

A large randomized trial compared an experimental drug and 9 other standard drugs for treating motion sickness. An ANOVA test revealed significant differences between the groups. The investigators wanted to know if the experimental drug ("drug 1") beat any of the standard drugs in reducing total minutes of nausea, and, if so, which ones. The p-values from the pairwise ttests (comparing drug 1 with drugs 2-10) are below.

Drug 1 vs. drug…   2     3    4     5     6      7      8     9      10
p-value            .05   .3   .25   .04   .001   .006   .08   .002   .01

a. Which differences would be considered statistically significant using a Bonferroni correction? A Holm correction? A Hochberg correction?
Answer

Bonferroni makes the new α value = α/9 = .05/9 = .0056; therefore, using Bonferroni, the new drug is only significantly different from standard drugs 6 and 9.

Arrange the p-values:

Drug       6      9      7      10    5     2     8     4     3
p-value    .001   .002   .006   .01   .04   .05   .08   .25   .3

Holm: .001 < .0056; .002 < .05/8 = .00625; .006 < .05/7 = .007; .01 > .05/6 = .0083; therefore, the new drug is only significantly different from standard drugs 6, 9, and 7.

Hochberg: .3 > .05; .25 > .05/2; .08 > .05/3; .05 > .05/4; .04 > .05/5; .01 > .05/6; .006 < .05/7; therefore, drugs 7, 9, and 6 are significantly different.
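The same three corrections can be applied in software. A sketch, assuming Python with statsmodels (the lecture works through the cutoffs by hand):

```python
from statsmodels.stats.multitest import multipletests

drugs = [2, 3, 4, 5, 6, 7, 8, 9, 10]                      # drug 1 vs. each of these
pvals = [.05, .3, .25, .04, .001, .006, .08, .002, .01]   # in the order given above

for method in ("bonferroni", "holm", "simes-hochberg"):
    reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method=method)
    significant = [d for d, r in zip(drugs, reject) if r]
    print(method, "-> drug 1 differs from drugs:", significant)

# Bonferroni flags drugs 6 and 9; Holm and (Simes-)Hochberg also flag drug 7,
# matching the hand calculations above.
```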
Practice problem

b. Your patient is taking one of the standard drugs that was shown to be statistically less effective in minimizing motion sickness (i.e., a significant p-value for the comparison with the experimental drug). Assuming that none of these drugs have side effects but that the experimental drug is slightly more costly than your patient's current drug-of-choice, what (if any) other information would you want to know before you start recommending that patients switch to the new drug?

Answer
- The magnitude of the reduction in minutes of nausea.
- With a large enough sample size, a 1-minute difference could be statistically significant, but it's obviously not clinically meaningful and you probably wouldn't recommend a switch.
Continuous outcome (means)

Are the observations independent or correlated?

Outcome variable: continuous (e.g., pain scale, cognitive function)

Independent observations:
- Ttest: compares means between two independent groups
- ANOVA: compares means between more than two independent groups
- Pearson's correlation coefficient (linear correlation): shows linear correlation between two continuous variables
- Linear regression: multivariate technique used when the outcome is continuous

Correlated observations:
- Paired ttest: compares means between two related groups (e.g., the same subjects before and after)
- Repeated-measures ANOVA: compares changes over time in the means of two or more groups (repeated measurements)
- Mixed models/GEE modeling: multivariate regression techniques to compare changes over time between two or more groups

Alternatives if the normality assumption is violated (and small sample size): non-parametric statistics
- Wilcoxon signed-rank test: non-parametric alternative to the paired ttest
- Wilcoxon rank-sum test (= Mann-Whitney U test): non-parametric alternative to the ttest
- Kruskal-Wallis test: non-parametric alternative to ANOVA
- Spearman rank correlation coefficient: non-parametric alternative to Pearson's correlation
Non-parametric ANOVA

Kruskal-Wallis one-way ANOVA (just an extension of the Wilcoxon rank-sum (Mann-Whitney U) test for 2 groups; based on ranks)

Proc NPAR1WAY in SAS
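The lecture points to PROC NPAR1WAY in SAS; an equivalent call, assuming Python with scipy and reusing the four-treatment height data from the earlier example purely for illustration:

```python
from scipy import stats

t1 = [60, 67, 42, 67, 56, 62, 64, 59, 72, 71]
t2 = [50, 52, 43, 67, 67, 59, 67, 64, 63, 65]
t3 = [48, 49, 50, 55, 56, 61, 61, 60, 59, 64]
t4 = [47, 67, 54, 67, 68, 65, 65, 56, 60, 65]

H, p = stats.kruskal(t1, t2, t3, t4)   # rank-based, so no normality assumption
print(H, p)
```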
Binary or categorical outcomes (proportions)

Are the observations correlated?

Outcome variable: binary or categorical (e.g., fracture, yes/no)

Independent observations:
- Chi-square test: compares proportions between two or more groups
- Relative risks: odds ratios or risk ratios
- Logistic regression: multivariate technique used when the outcome is binary; gives multivariate-adjusted odds ratios

Correlated observations:
- McNemar's chi-square test: compares a binary outcome between correlated groups (e.g., before and after)
- Conditional logistic regression: multivariate regression technique for a binary outcome when groups are correlated (e.g., matched data)
- GEE modeling: multivariate regression technique for a binary outcome when groups are correlated (e.g., repeated measures)

Alternative to the chi-square test if sparse cells:
- Fisher's exact test: compares proportions between independent groups when there are sparse data (some cells <5)
- McNemar's exact test: compares proportions between correlated groups when there are sparse data (some cells <5)
Chi-square test: for comparing proportions (of a categorical variable) between >2 groups

I. Chi-Square Test of Independence
- When both your predictor and outcome variables are categorical, they may be cross-classified in a contingency table and compared using a chi-square test of independence.
- A contingency table with R rows and C columns is an R x C contingency table.
Example
- Asch, S.E. (1955). Opinions and social pressure. Scientific American, 193, 31-35.

The Experiment
- A Subject volunteers to participate in a "visual perception study."
- Everyone else in the room is actually a conspirator in the study (unbeknownst to the Subject).
- The "experimenter" reveals a pair of cards…
The Task Cards

One card shows the standard line; the other shows the comparison lines A, B, and C.
The Experiment
- Everyone goes around the room and says which comparison line (A, B, or C) is correct; the true Subject always answers last, after hearing all the others' answers.
- The first few times, the 7 "conspirators" give the correct answer.
- Then, they start purposely giving the (obviously) wrong answer.
- 75% of Subjects tested went along with the group's consensus at least once.
Further Results
- In a further experiment, the group size (number of conspirators) was altered from 2-10.
- Does the group size alter the proportion of subjects who conform?
The Chi-Square test

Conformed?   Number of group members
             2     4     6     8     10
Yes          20    50    75    60    30
No           80    50    25    40    70

Apparently, conformity is less likely with either fewer or more group members…

- 20 + 50 + 75 + 60 + 30 = 235 conformed, out of 500 experiments.
- Overall likelihood of conforming = 235/500 = .47
Calculating the expected, in general
- Null hypothesis: the variables are independent.
- Recall that under independence: P(A)*P(B) = P(A&B).
- Therefore, calculate the marginal probability of B and the marginal probability of A. Multiply P(A)*P(B)*N to get the expected cell count.
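In matrix form, the expected counts are just the outer product of the row and column totals divided by N. A small sketch, assuming Python with numpy and using the conformity data from the previous slide:

```python
import numpy as np

observed = np.array([[20, 50, 75, 60, 30],    # conformed: yes
                     [80, 50, 25, 40, 70]])   # conformed: no

row_totals = observed.sum(axis=1)   # [235, 265]
col_totals = observed.sum(axis=0)   # [100, 100, 100, 100, 100]
N = observed.sum()                  # 500

expected = np.outer(row_totals, col_totals) / N   # P(A)*P(B)*N for every cell
print(expected)                     # 47s in the "yes" row, 53s in the "no" row
```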
Expected frequencies if no association between group size and conformity…

Conformed?   Number of group members
             2     4     6     8     10
Yes          47    47    47    47    47
No           53    53    53    53    53

Do observed and expected differ more than expected due to chance?
Chi-Square test

χ² = Σ (observed − expected)² / expected

χ²₄ = (20 − 47)²/47 + (50 − 47)²/47 + (75 − 47)²/47 + (60 − 47)²/47 + (30 − 47)²/47
    + (80 − 53)²/53 + (50 − 53)²/53 + (25 − 53)²/53 + (40 − 53)²/53 + (70 − 53)²/53 ≈ 79.5

Degrees of freedom = (rows − 1)*(columns − 1) = (2 − 1)*(5 − 1) = 4

The Chi-Square distribution: the sum of squared normal deviates

χ²(df) = Σ Z², summed over i = 1 to df, where Z ~ Normal(0,1)

The expected value and variance of a chi-square:
E(x) = df
Var(x) = 2(df)
Chi-Square test

χ² = Σ (observed − expected)² / expected

χ²₄ = (20 − 47)²/47 + (50 − 47)²/47 + (75 − 47)²/47 + (60 − 47)²/47 + (30 − 47)²/47
    + (80 − 53)²/53 + (50 − 53)²/53 + (25 − 53)²/53 + (40 − 53)²/53 + (70 − 53)²/53 ≈ 79.5

Degrees of freedom = (rows − 1)*(columns − 1) = (2 − 1)*(5 − 1) = 4

Rule of thumb: if the chi-square statistic is much greater than its degrees of freedom, this indicates statistical significance. Here 79.5 >> 4.
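The whole test in one call, as a sketch assuming Python with scipy (the hand calculation is above):

```python
import numpy as np
from scipy.stats import chi2_contingency

observed = np.array([[20, 50, 75, 60, 30],    # conformed: yes
                     [80, 50, 25, 40, 70]])   # conformed: no

chi2, p, dof, expected = chi2_contingency(observed)   # Yates correction only applies when dof = 1
print(chi2, dof, p)   # chi-square of roughly 80 on 4 d.f.; p is vanishingly small
```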
Chi-square example: recall data…

                         Brain tumor   No brain tumor   Total
Own a cell phone         5             347              352
Don't own a cell phone   3             88               91
Total                    8             435              453

p(tumor | cell phone) = 5/352 = .014;  p(tumor | no phone) = 3/91 = .033

Z = (p̂₁ − p̂₂ − 0) / sqrt[ p̄(1 − p̄)/n₁ + p̄(1 − p̄)/n₂ ],  where p̄ = 8/453 = .018

Z = (.014 − .033) / sqrt[ (.018)(.982)/352 + (.018)(.982)/91 ] = −.019/.0156 = −1.22
Same data, but use the Chi-square test

              Brain tumor   No brain tumor   Total
Own           5             347              352
Don't own     3             88               91
Total         8             435              453

p(tumor) = 8/453 = .018;  p(cell phone) = 352/453 = .777
p(tumor) × p(cell phone) = .018 × .777 = .014
Expected in cell a = .014 × 453 = 6.3; 1.7 in cell c; 345.7 in cell b; 89.3 in cell d

The expected value in cell c is 1.7, so technically you should use a Fisher's exact test here! Next term…

df = (R − 1)*(C − 1) = 1*1 = 1

χ²₁ = (8 − 6.3)²/6.3 + (3 − 1.7)²/1.7 + (89.3 − 88)²/89.3 + (347 − 345.7)²/345.7 ≈ 1.48

NS

note: Z² = 1.22² ≈ 1.48
Caveat
**When the sample size is very
small in any cell (expected
value<5), Fisher’s exact test is
used as an alternative to the chi-
square test.
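For the 2x2 cell-phone table, both tests are one call each; chi2_contingency also returns the expected counts, which is how the sparse cell shows up. A sketch, assuming Python with scipy:

```python
import numpy as np
from scipy.stats import chi2_contingency, fisher_exact

observed = np.array([[5, 347],    # own a cell phone: tumor, no tumor
                     [3, 88]])    # don't own:         tumor, no tumor

chi2, p_chi, dof, expected = chi2_contingency(observed, correction=False)
print(expected.min())             # smallest expected count is well below 5

odds_ratio, p_fisher = fisher_exact(observed)   # preferred here because of the sparse cell
print(p_chi, p_fisher)
```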
Binary or categorical outcomes (proportions)

Are the observations correlated?

Outcome variable: binary or categorical (e.g., fracture, yes/no)

Independent observations:
- Chi-square test: compares proportions between two or more groups
- Relative risks: odds ratios or risk ratios
- Logistic regression: multivariate technique used when the outcome is binary; gives multivariate-adjusted odds ratios

Correlated observations:
- McNemar's chi-square test: compares a binary outcome between correlated groups (e.g., before and after)
- Conditional logistic regression: multivariate regression technique for a binary outcome when groups are correlated (e.g., matched data)
- GEE modeling: multivariate regression technique for a binary outcome when groups are correlated (e.g., repeated measures)

Alternative to the chi-square test if sparse cells:
- Fisher's exact test: compares proportions between independent groups when there are sparse data (np < 5)
- McNemar's exact test: compares proportions between correlated groups when there are sparse data (np < 5)