
Chapter 12: Tests for 3 or more samples: ANOVA and Kruskal-Wallis

Analysis of Variance (ANOVA):

 Here we start off with the null hypothesis that all the samples come from the same population,
then we look for evidence otherwise, the same as in the previous tests. If the samples come from the
same population, we still expect them to be somewhat different because of sampling, but the
different samples will have distributions that look similar and have similar means. If the samples
come from different populations, they will have different means, which means the sample means
will be more spread apart
 We only need one sample mean to be different in order to reject the null hypothesis. Rejecting it
tells us that the samples differ, but not which ones differ
 We can use a “post-hoc” test, which allows us to identify where the differences are in an ANOVA
 Assumptions:
o The three or more samples are independent random samples
o Each population is normally distributed
o Each population has the same variance
o The variable under study (the variable whose mean you are calculating) is measured at the
interval or ratio scale
 With an independent-samples t-test, we take the observed difference (between two means, or
between a sample mean and a hypothesized or known population mean), divide it by the difference
we expect to be present because of our sampling, and then compare that number to a critical test
value or threshold. ANOVA follows the same logic with three or more samples
 Within-group variability is simply the variability that we can expect because of sampling; it is
equivalent to the standard errors in previous tests
 It is called one-way ANOVA because we are only considering one grouping variable (one factor)
 How to do ANOVA (a worked sketch follows this list):
o Step one is to calculate the between-group variability. We first need the total (grand) mean,
the mean we get when we pool all of our data together. You get this by adding up all of the
observations from all your samples and dividing by the sum of all the sample sizes, or you
can combine the sample means weighted by their sample sizes
 As with the variance, squaring the deviations “penalizes” the larger deviations more, so if there
are any large differences between the total mean and the sample means, there is a large difference
between the group means. We multiply by the number of observations in each sample to weight the
different samples accordingly: if one sample has a lot of between-group variability, but few
observations, we want to count it less than another sample with a lot of observations
 After this we calculate the between-group mean square by dividing the between-group sum of
squares by its degrees of freedom, df = k - 1, where k is the number of groups
 Step two is to calculate the within-group variability. You do this by subtracting each score from
the mean of its own group. Square each of those deviations, add them up within each group, then
add the totals across all of the groups. Dividing by the within-group degrees of freedom,
df = N - k, gives the within-group mean square
 The last step is to compute the F-statistic, the ratio of the two variances (between-group mean
square divided by within-group mean square), and compare it with a critical F value. If your
F-statistic is greater than the critical value in the table, reject the null hypothesis. This
means your data exhibit more between-group variability than you would expect from sampling alone
 Based on our F-statistic, if we reject the null hypothesis we can claim that at least one group
mean differs from the others; a post-hoc test is needed to say which
 To check the normality assumption: the critical statistic for the Jarque-Bera test is a
chi-square with 2 degrees of freedom. The Kolmogorov-Smirnov test and the Shapiro-Wilk test can
also be used; if they cannot reject the null hypothesis (that the data are normal), the normality
assumption holds (see the second sketch below)
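
To make the steps above concrete, here is a minimal Python sketch that computes the between-group and within-group sums of squares by hand and cross-checks the result against scipy.stats.f_oneway. The three groups and the alpha = 0.05 threshold are made-up assumptions for illustration, not data from the course.

```python
# A minimal one-way ANOVA sketch on made-up data, cross-checked
# against scipy.stats.f_oneway. The group values are illustrative only.
import numpy as np
from scipy import stats

groups = [np.array([4.1, 5.2, 6.3, 5.5]),
          np.array([6.6, 7.0, 8.1, 7.4]),
          np.array([5.0, 5.6, 6.8, 6.1])]

k = len(groups)                             # number of groups
N = sum(len(g) for g in groups)             # total number of observations
grand_mean = np.concatenate(groups).mean()  # the pooled "total mean"

# Between-group sum of squares: each group mean's squared deviation
# from the grand mean, weighted by that group's sample size.
ssb = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)

# Within-group sum of squares: each score's squared deviation from its
# own group's mean, summed over all groups.
ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)

msb = ssb / (k - 1)  # between-group mean square, df = k - 1
msw = ssw / (N - k)  # within-group mean square,  df = N - k
F = msb / msw

f_crit = stats.f.ppf(0.95, k - 1, N - k)    # critical F at alpha = 0.05
print(f"F = {F:.3f}, critical F = {f_crit:.3f}, reject H0: {F > f_crit}")

# Cross-check with SciPy's built-in one-way ANOVA.
F_scipy, p = stats.f_oneway(*groups)
print(f"scipy: F = {F_scipy:.3f}, p = {p:.4f}")
```

The hand-computed F and the scipy value should agree exactly; the critical-value comparison (rather than the p-value) mirrors the table-lookup procedure described above.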
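The normality checks mentioned in the last bullet can also be run in SciPy. This is a quick sketch on one made-up sample, assuming a 0.05 significance level; each test's null hypothesis is that the data are normally distributed, so a large p-value means the assumption survives.

```python
# A quick sketch of the normality tests named above, applied to one
# made-up sample drawn from a normal distribution for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(loc=5.0, scale=1.0, size=30)  # illustrative data

# Jarque-Bera: the statistic is compared to a chi-square with 2 df.
jb = stats.jarque_bera(sample)
jb_crit = stats.chi2.ppf(0.95, df=2)
print(f"Jarque-Bera: stat = {jb.statistic:.3f}, crit = {jb_crit:.3f}, "
      f"p = {jb.pvalue:.3f}")

# Shapiro-Wilk.
sw = stats.shapiro(sample)
print(f"Shapiro-Wilk: stat = {sw.statistic:.3f}, p = {sw.pvalue:.3f}")

# Kolmogorov-Smirnov against a normal whose parameters are estimated
# from the sample (a simplification; strictly this needs a correction
# such as the Lilliefors adjustment).
ks = stats.kstest(sample, "norm", args=(sample.mean(), sample.std(ddof=1)))
print(f"Kolmogorov-Smirnov: stat = {ks.statistic:.3f}, p = {ks.pvalue:.3f}")

# If none of the tests reject (all p > 0.05), the normality assumption
# behind ANOVA is considered satisfied.
```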

Kruskal-Wallis test:

 It is a nonparametric test (a rank-based alternative to ANOVA)
 Put all the data into one variable, noting which data are from which sample, and rank all of the
data
 Sum all of the rankings for each sample
 Calculate the mean rank for each sample. (A median is the middle-ranking value of an ordered
list. Given the lowest rank is 1 and the highest rank is n, the middle rank is (1 + n)/2, which is
also the mean rank.) The mean ranks should be similar if the samples are from the same population
 The H statistic compares two quantities for simplicity: the variability/differences we observe
vs the variability/differences we would expect if all the samples came from the same population
 The formula is H = [12 / (N(N + 1))] * sum(Ri^2 / ni) - 3(N + 1), where Ri is the rank sum and
ni the size of sample i, and N is the total sample size. The first part uses the total sample size
and the rank sums (and hence the mean ranks) that we calculate from our data/samples; the second
part, 3 multiplied by the total sample size plus one, is exactly what the first part equals when
the mean ranks of each sample are the same, so H = 0 in that case
 If the rank-sum averages are very different, this can push the value of H over the critical
value. H is a rank-based nonparametric test statistic that can be used to determine whether there
are statistically significant differences between two or more groups of an independent variable on
a continuous or ordinal dependent variable (see the sketch after this list)
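
Tying the list together, here is a minimal sketch of the Kruskal-Wallis procedure in Python, reusing the same made-up groups as the ANOVA sketch: it ranks the pooled data, sums the ranks per sample, computes H from the formula above, and cross-checks against scipy.stats.kruskal.

```python
# A minimal Kruskal-Wallis sketch on made-up data: pool the samples,
# rank everything, compute H by hand, cross-check scipy.stats.kruskal.
import numpy as np
from scipy import stats

groups = [np.array([4.1, 5.2, 6.3, 5.5]),
          np.array([6.6, 7.0, 8.1, 7.4]),
          np.array([5.0, 5.6, 6.8, 6.1])]

sizes = [len(g) for g in groups]
N = sum(sizes)

# Pool all observations (group membership is tracked via the boundaries
# below) and rank the pooled data.
pooled = np.concatenate(groups)
ranks = stats.rankdata(pooled)  # assigns average ranks to any ties

# Rank sum and mean rank for each sample.
bounds = np.cumsum([0] + sizes)
rank_sums = [ranks[bounds[i]:bounds[i + 1]].sum() for i in range(len(groups))]
mean_ranks = [R / n for R, n in zip(rank_sums, sizes)]
print("mean ranks:", mean_ranks)  # similar values would support H0

# H = [12 / (N(N+1))] * sum(Ri^2 / ni) - 3(N+1)
H = 12.0 / (N * (N + 1)) * sum(R**2 / n for R, n in zip(rank_sums, sizes)) \
    - 3 * (N + 1)

# H is compared to a chi-square critical value with k - 1 degrees of
# freedom (an approximation that strictly prefers at least 5
# observations per group; the tiny groups here are only for illustration).
crit = stats.chi2.ppf(0.95, df=len(groups) - 1)
print(f"H = {H:.3f}, critical value = {crit:.3f}, reject H0: {H > crit}")

# Cross-check: scipy applies a tie correction, which changes nothing
# here because this made-up data has no ties.
H_scipy, p = stats.kruskal(*groups)
print(f"scipy: H = {H_scipy:.3f}, p = {p:.4f}")
```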
