T Test and Chi Square Test
Statistical Tests
Introduction
A parametric statistical test is one that makes assumptions about the parameters (defining
properties) of the population distribution(s) from which one's data are drawn.
A non-parametric test is one that makes no such assumptions. In this strict sense, "non-
parametric" is essentially a null category, since virtually all statistical tests assume one thing
or another about the properties of the source population(s).
Which is more powerful?
Non-parametric statistical procedures are less powerful because they use less information in
their calculation. For example, a parametric correlation uses information about the mean and
deviation from the mean while a non-parametric correlation will use only the ordinal position
of pairs of scores.
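The contrast between the two kinds of correlation can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the data values are hypothetical, and the helper names (pearson, ranks, spearman) are chosen here for the example. The parametric (Pearson) coefficient works on deviations from the mean, while the non-parametric (Spearman) coefficient is the same formula applied to ordinal positions only.

```python
import math

def pearson(x, y):
    """Parametric correlation: built from each value's deviation from its mean."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def ranks(values):
    """Ordinal positions (1 = smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def spearman(x, y):
    """Non-parametric correlation: Pearson's formula applied to the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical scores: y always increases with x, but not linearly.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 100]

r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(f"Pearson r = {r_pearson:.3f}, Spearman rho = {r_spearman:.3f}")
```

Because the rank-based coefficient discards the sizes of the gaps between scores, it reports a perfect monotonic relationship here, whereas the Pearson coefficient is pulled below 1 by the extreme last value.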
Parametric Assumptions
The observations must be independent
The observations must be drawn from normally distributed populations
These populations must have the same variances
The means of these normal and homoscedastic populations must be linear combinations of
effects due to columns and/or rows
Nonparametric Assumptions
Certain assumptions are associated with most nonparametric statistical tests, but these are
fewer and weaker than those of parametric tests.
Advantages of Nonparametric Tests
Probability statements obtained from most nonparametric statistics are exact probabilities,
regardless of the shape of the population distribution from which the random sample was
drawn
If sample sizes as small as N=6 are used, there is no alternative to using a nonparametric
test unless the nature of the population distribution is known exactly
Easier to learn and apply than parametric tests
Based on a model that specifies very general conditions.
No specific form of the distribution from which the sample was drawn.
Hence nonparametric tests are also known as distribution free tests.
Disadvantages of nonparametric tests
Losing precision/wasteful of data
Low power
False sense of security
Lack of software
Testing distributions only
Higher-ordered interactions not dealt with
Parametric models are more efficient if data permit.
It is difficult to compute by hand for large samples
Tables are not widely available
In cases where a parametric test would be appropriate, non-parametric tests have less
power. In other words, a larger sample size can be required to draw conclusions with the
same degree of confidence.
Few points
Inferences drawn from parametric tests such as t, F and chi-square may be seriously
affected when the parent population's distribution is not normal.
The adverse effect could be greater when the sample size is small.
Thus when there is doubt about the distribution of the parent population, a nonparametric
method should be used.
In many situations, particularly in social and behavioural sciences, observations are
difficult or impossible to take on numerical scales and a suitable nonparametric test is an
alternative under such situations.
Level of Measurement
4. Ratio Scale
Kelvin temperature, speed, height, mass or weight
Ratio data is interval data with a natural zero point
fruits count
orange 1
orange 1
mango 2
banana 3
lemon 4
banana 3
orange 1
lemon 4
lemon 4
orange 1
mango 2
banana 3
lemon 4
banana 3
orange 1
lemon 4
lemon 4
Table 2: Data Sets
Interpretation:
Here the p value is 0.560, which is greater than 0.05. The result is therefore not significant: we fail
to reject the null hypothesis and conclude that there is no significant difference in the proportions of
oranges, mangoes, bananas, and lemons.
We could also test to see if a basket of fruit contains 10% apples, 20% bananas, 50% oranges, and
20% peaches.
count
                Frequency   Percent   Valid Percent   Cumulative Percent
Valid   1.00            5      29.4            29.4                 29.4
        2.00            2      11.8            11.8                 41.2
        3.00            4      23.5            23.5                 64.7
        4.00            6      35.3            35.3                100.0
        Total          17     100.0           100.0
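The frequencies and the goodness-of-fit result can be checked with a short Python sketch using only the standard library. The assumptions are labelled here and in the comments: the codes are read off the data set (1 = orange, 2 = mango, 3 = banana, 4 = lemon), the null hypothesis is taken to be equal proportions for the four fruits, and the helper names chi2_sf and goodness_of_fit are made up for this example.

```python
import math
from collections import Counter

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    z = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-z) * sum of z**j / j! for j < df/2.
        term = math.exp(-z)
        total = term
        for j in range(1, df // 2):
            term *= z / j
            total += term
        return total
    # Odd df: erfc(sqrt(z)) plus a finite sum of half-integer terms.
    total = math.erfc(math.sqrt(z))
    term = math.exp(-z) * math.sqrt(z) / math.gamma(1.5)
    for j in range((df - 1) // 2):
        total += term
        term *= z / (j + 1.5)
    return total

def goodness_of_fit(observed, expected_props):
    """Return (chi-square statistic, degrees of freedom)."""
    n = sum(observed)
    expected = [n * p for p in expected_props]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1

# Coded values as listed in the data set (assumed coding:
# 1 = orange, 2 = mango, 3 = banana, 4 = lemon).
codes = [1, 1, 2, 3, 4, 3, 1, 4, 4, 1, 2, 3, 4, 3, 1, 4, 4]

counts = Counter(codes)
observed = [counts[c] for c in sorted(counts)]
percents = [round(100 * o / len(codes), 1) for o in observed]

# Assumed null hypothesis: all four fruits are equally likely.
stat, df = goodness_of_fit(observed, [0.25, 0.25, 0.25, 0.25])
p_value = chi2_sf(stat, df)

print(observed, percents)
print(f"chi-square = {stat:.3f}, df = {df}, p = {p_value:.3f}")
```

Under the equal-proportions assumption the statistic is 8.75/4.25 with 3 degrees of freedom, and the p value agrees with the 0.560 quoted in the interpretation. A different null hypothesis, such as a 10%/20%/50%/20% split, only changes the expected_props argument.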
The chi-square test for independence, also called Pearson's chi-square test or the
chi-square test of association, is used to discover whether there is a relationship between
two categorical variables.
Assumptions
When you choose to analyse your data using a chi-square test for independence, you need
to make sure that the data you want to analyse "passes" two assumptions. If it does not,
you cannot use a chi-square test for independence. These two assumptions are:
o Assumption #1: Your two variables should be measured at an ordinal or nominal
level (i.e., categorical data).
o Assumption #2: Your two variables should each consist of two or more
categorical, independent groups. Example independent variables that meet this
criterion include gender (2 groups: Males and Females), ethnicity (e.g., 3 groups:
Caucasian, African American and Hispanic), physical activity level (e.g., 4 groups:
sedentary, low, moderate and high), profession (e.g., 5 groups: surgeon, doctor,
nurse, dentist, therapist), and so forth.
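For data that meet both assumptions, the test itself can be sketched as follows. This is an illustrative calculation only: the 2 x 2 table of counts is hypothetical (rows standing for two gender groups, columns for two activity levels), the helper names are invented for the example, and the statistic is the plain Pearson chi-square with no continuity correction.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    z = x / 2.0
    if df % 2 == 0:
        term = math.exp(-z)
        total = term
        for j in range(1, df // 2):
            term *= z / j
            total += term
        return total
    total = math.erfc(math.sqrt(z))
    term = math.exp(-z) * math.sqrt(z) / math.gamma(1.5)
    for j in range((df - 1) // 2):
        total += term
        term *= z / (j + 1.5)
    return total

def chi_square_independence(table):
    """Return (Pearson chi-square statistic, degrees of freedom)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count if the row and column variables were unrelated.
            exp = row_totals[i] * col_totals[j] / n
            stat += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Hypothetical counts: rows = Males, Females; columns = low, high activity.
table = [[30, 20],
         [20, 30]]

stat, df = chi_square_independence(table)
p_value = chi2_sf(stat, df)
print(f"chi-square = {stat:.2f}, df = {df}, p = {p_value:.4f}")
```

Swapping the rows and columns of the table leaves the statistic unchanged, which is why the choice of row versus column variable affects only how the results are presented.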
relevant buttons to transfer the variables; or (2) drag-and-drop the variables. How do
you know which variable goes in the row or column box? There is no right or wrong way. It
will depend on how you want to present your data.
If you want to display clustered bar charts (recommended), make sure that the Display
clustered bar charts checkbox is ticked.