
BIOSTATISTICS

Definition:
Statistics is the science of collecting, summarizing, presenting, and interpreting data, and of using them to test a hypothesis.

The statistical methods can generally be classified into 3 main classes:
1. Methods of data collection.
2. Methods of data presentation.
3. Statistical analysis and interpretation of data.


• METHODS OF DATA COLLECTION
In health affairs, data can be collected in several ways:
1. Census: the process of enumerating the persons in the whole country, or in a certain locality, at some point of time to obtain data about age and sex distribution, marital status, occupation, educational level, income, housing conditions, etc.
2. Hospital records: information about the epidemiological characteristics of various diseases (e.g. malignant diseases, heart diseases) can be obtained by reviewing hospital records.
3. Field surveys: the investigator collects data from the community, either through a comprehensive survey (collection of data from everyone in the community), which needs a lot of effort, time and money (e.g. a census), or through a "sample survey", in which a sample is taken. The sample should be representative of the total population.
4. Special registers, such as registers of diseases, deaths and births (e.g. cancer and diabetes registers).
PRESENTATION OF DATA

• Once a number of observations are collected, we have to express them in a simple form which permits conclusions to be drawn, directly or by means of further calculations.

Types of data: there are two types of data:
 Constant data: of no statistical importance, such as the fact that a normal person has 2 eyes, etc.
 Variables: such as height, weight, blood pressure, pulse rate, blood urea, blood calcium, etc.

There are two types of variables:
 Quantitative variables
 Qualitative (Categorical) variables.
Quantitative variables:
These are variables which can be expressed in the form of quantities or numbers. They are classified into 2 subtypes:
1. Continuous quantitative variables, which have 2 characteristics:
 They are usually obtained through measurement, e.g. height, weight, blood sugar level, blood urea, blood cholesterol, calcium in the blood.
 They can take integer or fractional values.
2. Discrete quantitative variables: this type of variable is usually obtained through enumeration. It is always an integer and never a fractional value. Examples: pulse rate, number of pregnancies, number of deliveries.
Qualitative variables:
Take the form of qualities or names and cannot be expressed in
the form of quantities or numbers. Examples:
Social level:
 Upper social level
 Middle social level
 Lower social level
Level of education:
 Illiterate
 Read and write
 Primary
 Secondary
 University
In these 2 examples the categories can be arranged in a definite order, and such a variable is called an ordinal qualitative variable.
• The other type of qualitative variable, in which the categories cannot be arranged in any definite order, is called a nominal qualitative variable.

Examples:
 Marital status: single, married, divorced, widow
 Blood group: A, B, AB and O.
Types of variables

Quantitative variables:
• Continuous: age, height, weight, body temp., blood minerals, blood urea
• Discrete: number of pregnancies, number of deliveries, pulse rate

Qualitative variables:
• Ordinal: social class, level of education, level of success
• Nominal: marital status, sex, blood group, race
Types of variables

 Dependent variable = response variable = outcome variable.
 Independent variable = explanatory variable = predictor variable.
 Dichotomous (binary) & polychotomous variables.

 A categorical variable is binary or dichotomous when there are only two possible categories. Examples include 'Yes / No', 'Dead / Alive' or 'Patient has disease / patient does not have disease'.
 A categorical variable is polychotomous when there are more than two possible categories.
Proportions, %, ratio, rates and scores:
We may encounter a number of other types of data in the medical field. These include:
 Proportions: the numerator is part of the denominator, e.g. the proportion of diseased persons.
 Percentage: a proportion multiplied by 100.
 Ratios: the numerator is not a part of the denominator. Occasionally you may encounter the ratio of two variables. For example, body mass index (BMI) is calculated as an individual's weight (kg) divided by his / her height squared (m²).
Proportions, %, ratio, rates and scores:
 Rates: Event (e.g. disease) rates, in which the number of disease events is divided by the population at risk (PAR) during the time period under consideration.

 Scores: We sometimes use an arbitrary value, i.e. a score,


when we cannot measure a quantity. For example, a
series of responses to questions on quality of life may be
summed to give some overall quality of life score on each
individual. Another example is socio-economic score.
• Censored data
• We may come across censored data in situations
illustrated by the following examples.
• If we measure laboratory values using a tool that can only detect levels above a certain cut-off value, then any values below this cut-off will not be detected. For example, when measuring virus levels, those below the limit of detectability will often be reported as 'undetectable' even though there may be some virus in the sample.
• We may encounter censored data when following
patients in a trial in which, for example, some patients
withdraw from the trial before the trial has ended.
• Methods of data presentation:
• Presentation of data can be classified into 3
methods:
A. Numerical presentation (Tabular)
B. Graphical presentation
C. Mathematical or statistical presentation
• NUMERICAL PRESENTATION (TABULAR)
• In this method the collected data are presented without being subjected to any mathematical procedure, and can be presented as follows:
1. Raw data or simple presentation: the collected data are presented according to the order of appearance of the observations, without arranging them in any order. It is not a good method of presentation, especially if the number of observations is large.
2. Ordered presentation (array): this is a better method of presentation, but it is still not suitable for a large number of observations. It can be used with quantitative continuous, discrete and qualitative ordinal data.
Tabular presentation:
• There are various types of tables, but the basic type is the simple frequency distribution table. The general structure of the table is composed of three parts:
1. Title: answers four questions: what, whom, where and when. Examples of titles:
• Sex distribution (what) of patients (whom) admitted to the pediatric department in BMC (where) in 2010 (when).
• Age distribution (what) of patients (whom) admitted to the internal medicine department in BMC (where) in 2009 (when).
2. Contents: the general structure of the table is composed of at least two columns, as shown below:
• The variable name is written in the upper part of the left column, with the unit of measurement (e.g. kg).
• In the middle part we write the variable categories.
• The right column is used for the frequency. In the upper part of the column we write "frequency" or a more specific term such as "number of patients" or "number of students".
• In the middle part we put the frequency of each category, which is the number of observations that belong to that category.
• At the bottom of the column we write the sum of the frequencies.
3. Total: in the lower part we write the total, which is the sum of the observations.

For categorical (qualitative) data:


• The categories here are the different groups of data.
• Example: the following sheet contains raw data of 18 attendants of the CCU, AUH, Jan 2010:
Serial no. Sex Age Pulse rate
1 Male 35 70
2 Female 30 75
3 Male 40 74
4 Male 45 68
5 Female 33 80
6 Female 34 65
7 Female 32 72
8 Female 28 74
9 Male 25 75
10 Male 30 71
11 Male 40 69
12 Male 42 72
13 Male 45 73
14 Male 41 71
15 Female 55 70
16 Female 44 72
17 Male 36 66
18 Male 37 77
1. For qualitative variables:
sex of patients admitted to CCU, AUH, Jan 2010:

Sex      Tally          Frequency
Male     |||| |||| |    11
Female   |||| ||        7
Total                   18
• For discrete quantitative variables:
• The categories here are the different classes of data.
• Pulse rate in the previous table is a discrete quantitative variable. An example of its (manually tallied) table is shown below.
• In constructing this frequency distribution, the following rules are followed:
• In general, about 4 – 12 groups is an appropriate number to adopt (the number of classes depends upon the number of observations in the series).
• Keep the class (or group) intervals at an equal width.
• The first step is to find the upper and lower limits over which the tabulation extends, and then to form the classes or intervals. For example, for the pulse data in the previous table, the upper limit is 80 and the lower limit is 65, so the range is 15 and we distributed it into 4 equal classes, each with a class interval of 4.
2. For discrete quantitative variables:

Pulse rate of patients of CCU, AUH, Jan 2010:


Pulse    Tally       Frequency
65-68    |||         3
69-72    |||| |||    8
73-76    ||||        5
77-80    ||          2
Total                18
3. For continuous quantitative variables:

 Example: the weights of 60 children in kilograms are as follows:


 10 19 18 20 29 25 27 19 22 18 15 11 19 12 29 30
16 35 23 21 22 14 23 32 31 13 26 17 24 19 21 25
28 18 32 17 26 20 25 17 27 15 26 33 34 31 25 26
28 29 24 16 24 26 31 27 24 17 35 23
• Put these data in a simple frequency distribution table and in cumulative frequency tables.

Answer:
• Range = highest value – lowest value = 35 – 10 = 25 kg.
• We select a class interval of 5; therefore the number of classes = 5.
Simple frequency distribution table:
Weight (kg) Frequency
10- 5
15- 15
20- 10
25- 20
30 – 35 10
Total 60
Cumulative frequency tables:

• This form of the frequency table gives the number of individuals having less (or more) than a certain value of weight, age, height, etc. in the frequency table.
Ascending and descending cumulative frequency tables:

Ascending cumulative frequency table:
Weight (kg.)   Ascending cum. freq.
Less than 10   0
Less than 15   5
Less than 20   20
Less than 25   30
Less than 30   50
35 or less     60

Descending cumulative frequency table:
Weight (kg.)   Descending cum. freq.
10 and more    60
15 and more    55
20 and more    40
25 and more    30
30 and more    10
More than 35   0
• In the ascending cumulative frequency table each frequency is added to the previous one, and so on, until at the last category or class interval we reach the total.
• In the descending cumulative frequency table the total frequency is placed in the first category, and we subtract each following frequency from the previous cumulative value until we reach 0 at the end of the table.
• Advantage: it tells how many children are less than or more than a particular weight.
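The same tables can be produced programmatically. Below is a minimal Python sketch; the class limits follow the table above, and if the raw data were transcribed here with small errors the printed counts may differ slightly from the slide's table:

weights = [10, 19, 18, 20, 29, 25, 27, 19, 22, 18, 15, 11, 19, 12, 29, 30,
           16, 35, 23, 21, 22, 14, 23, 32, 31, 13, 26, 17, 24, 19, 21, 25,
           28, 18, 32, 17, 26, 20, 25, 17, 27, 15, 26, 33, 34, 31, 25, 26,
           28, 29, 24, 16, 24, 26, 31, 27, 24, 17, 35, 23]

# Classes 10-, 15-, 20-, 25-, 30-35 (the last class includes 35).
classes = [(10, 15), (15, 20), (20, 25), (25, 30), (30, 36)]
freqs = [sum(lo <= w < hi for w in weights) for lo, hi in classes]

cum = 0
for (lo, hi), f in zip(classes, freqs):
    cum += f                             # ascending cumulative frequency
    print(f"{lo}-  frequency = {f}, less than {hi}: {cum}")
print("Total =", sum(freqs))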
GRAPHICAL PRESENTATION OF DATA
 Numerical presentation of data in the form of tables may not be understandable to all readers, whereas presentation of the same material diagrammatically often proves a very useful aid. However, there are some basic principles which should not be forgotten:
a) The object of the graph is to make the data obvious, or at least as clear as possible.
b) Simplicity is invariably the keynote.
c) The numbers on the axis should not be numerous and should run at regular and even intervals.
d) Scales should begin at zero if the charts are used for comparisons; otherwise a false impression may be created.
• The most common diagrams which have been used
in presenting the frequency distribution data are
the following:
1. Bar charts:
a) Simple bar chart
b) Multiple bar chart
c) Multi-component bar chart.
2. Line graph
3. Histogram
4. Frequency polygon
5. Pie or circular chart.
[Figure: simple bar chart of parasitic infections (Ascaris, Bilharziasis, Ankylostoma, Oxyuris, E. histolytica)]

[Figure: multiple bar chart of the same infections by sex (males and females)]

[Figure: multi-component bar chart of the same infections by sex]
2. Line graph:
• It is the most suitable type to represent data when we are
dealing with a time variable (hours, days, weeks, months,
years).
• The time variable is a special type of quantitative continuous variable. Usually we show the time variable on the horizontal axis and the other variable on the vertical axis.
For example temperature chart over hours, blood sugar over
months, crude birth rate and infant mortality rate over
years.
• Example: the following table shows infant mortality rate in
Egypt over 10 years:
Year IMR per thousand
1971 103
1972 116
1973 98
1974 101
1975 89
1976 87
1977 85
1978 74
1979 76
1980 76
[Figure: line graph of infant mortality rate in Egypt, 1971–1980]
3. Histogram:
• This type of graph is used to present data from a frequency distribution table; the variable must be quantitative continuous. The table is of the simple, closed-ended type. In this graph each category in the table is represented by a bar, whose height corresponds to the frequency of the category on the Y (vertical) axis. The width of each bar corresponds to the width of the class interval it represents; the intervals are equal and there is no space between the bars.
• Example: this table shows the distribution of 45 students by
their weights:
4. Frequency polygon:

• This graph is used to represent a quantitative continuous variable, whether the table is of a simple or complex type; the table should be of the closed-ended type. In this graph each category in the table is represented by a single point, whose abscissa (X) is the midpoint of the class and whose ordinate (Y) is the frequency of the class. Then every two consecutive points are joined by a straight line. Fig (7).

• Example: this table shows the distribution of 42 students by


their weights and sex. Present it in a suitable graph.
[Figure: frequency polygon of students' weights by sex (class intervals 50- to 75-80)]
5. Pie chart:
• This graph can be used to represent qualitative or discrete quantitative variables; indeed, it can be used for all 4 types of variables.
• The idea is to draw a circle with a suitable radius and this circle
is divided into a number of sectors equal to the number of
categories in the frequency distribution table. Therefore each
sector will represent one of the categories in the table, and the
area of the sector will be proportional to its frequency.
• The angle of each sector (i.e. each class) is determined as follows:

Angle of the sector = (Frequency of the class × 360) / Total frequency
[Figure: pie chart of blood group frequencies (A, B, AB, O)]
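As a small illustration of the sector-angle formula, the Python sketch below uses hypothetical blood-group frequencies (the figure's actual numbers are not given in the text):

# Hypothetical blood-group frequencies, for illustration only.
freqs = {"A": 40, "B": 25, "AB": 10, "O": 45}
total = sum(freqs.values())
for group, f in freqs.items():
    angle = f * 360 / total          # angle of each class (sector)
    print(f"{group}: {angle:.1f} degrees")
# The angles sum to 360 degrees.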
POPULATION AND SAMPLES

• Usually it is not possible to study the entire population. It is therefore necessary to consider a sample and to relate its characteristics to the total population.

• The population is the entire group in which the investigator is interested. A statistical population need not be made only of people; it may, for example, be a population of birth weights, R.B.Cs, W.B.Cs, etc.
POPULATION AND SAMPLES
• Samples are portions of the population. The population must be clearly and exhaustively defined before any sample is drawn from it. Any inference or conclusion based on the information from a sample must refer to the particular population from which the sample was drawn. Under this condition, the findings from a sample can be generalized to the population.
• Sampling methods are broadly divided into two categories: probability and non-probability.
• In probability sampling every member of the population has a known chance (probability) of participating in the study. Probability sampling methods include simple, stratified, systematic, multistage, and cluster sampling.
• In non-probability sampling, on the other hand, group members are selected in a non-random manner, so not every population member has a chance to participate in the study. Non-probability sampling methods include purposive, quota, convenience and snowball sampling.
• Advantages of sampling:
1. Low cost
2. Reduces manpower requirements
3. Gather information more quickly
4. Obtain more comprehensive data
5. Greater accuracy: some samples actually give better estimates of the complete count than a comprehensive survey.

The Process of Sampling in Primary Data Collection

• The process of sampling in primary data collection involves the following stages:
1. Defining the target population. The target population represents the specific segment within the wider population that is best positioned to serve as a primary data source for the research. Suppose you are conducting a study on smoking among medical students in Assiut; your target population would be Assiut medical students.
2. Choosing a sampling frame. The sampling frame can be explained as a list of people within the target population who can contribute to the research. A list of all medical students of Assiut would be our sampling frame.
3. Determining the sample size. This is the number of individuals from the sampling frame who will participate in the primary data collection process. We use statistical methods to determine the sample size for descriptive and analytic studies. Consult a statistician.
4. Selecting a sampling method. Use a probability sampling technique from the ones listed above: simple random sampling, stratified sampling, etc.
5. Applying the chosen sampling method in practice.
A- Random probability samples:
1. Simple random sample: a sample drawn from the reference population (universe) at random, so that the different population characteristics (variables) are not only represented in the sample but also in more or less the same proportion; hence the sample is described as being "representative". It is suitable for drawing a sample from a uniform or homogeneous population, using techniques such as random number tables or a random number generator.
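A minimal sketch of drawing a simple random sample with Python's random module (the numbered population here is hypothetical):

import random

population = list(range(1, 501))        # e.g. 500 numbered students (hypothetical)
sample = random.sample(population, 50)  # 50 members drawn without replacement
print(sorted(sample))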
2. Stratified random sample: it is suitable when the population under study is heterogeneous (not similar in structure). It is a form of random sample obtained as follows:
• First the target population is divided (stratified) into a suitable
number of strata representing the different attributes and
variables, as age, sex, occupation, residence, and
socioeconomic status.
• Then out of each stratum, a random sample is drawn (to be
subdivided into study and control groups, if necessary).
• The samples taken from all the strata together form the final stratified sample.
• Advantage: when the disease or health problem under study is
not uniformly distributed within the different variables,
stratification of the target population into homogenous groups
of variables allows for getting a final representative sample
that gives precise data.
3. Systematic random sample: when a study is conducted in a
confined community, as village or camp, a random sample can
be drawn as follows:
• Houses of the village are numbered: the first house in the
sample is randomly chosen, followed by taking every (n)th
house, e.g. the 5th, 10th, or 15th, depending on the sample size,
until the end of houses. Collected houses form the systematic
random sample.
• Or individual of the camp are listed, the first in sample is
randomly chosen, then every (n)th individual is taken until the
end of list. Collected individuals form the systematic random
sample.
4. Cluster sample: it is useful when the population is very
widely dispersed. The sub-divisions of the population are
decided on the basis of area, time or both. For example, the
persons selected for the study are clustered in certain
streets, blocks, places, weeks or months, etc. It is a costly
method. On conducting a study on school children or
industrial workers, the random sample can be drawn by
subdividing the group into clusters representing the school
classes, or industrial processes. From each cluster, a simple
random sample is drawn, and taken samples collectively
form the final sample of study.
• Multistage sample: it is a form of random sample that can be
applied in national or widespread studies. The field of work is
arranged into levels or stages: governorate form the first level,
then come the other levels of cities and towns, districts or
villages, and lastly families and individuals (the last level). A
random sample is successively drawn from each stage (level),
and collectively form the final sample of study.
Advantages of using random (probability) samples:
1. They enable drawing conclusions concerning the population.
2. They help to eliminate bias.
• Selected, purposive or non-probability sample:
• It is chosen according to the investigator's own judgment and experience.
• The results from this sample cannot be generalized to the whole population because it is not a representative sample.
• It is only suitable for pilot studies.
Frequency distribution of the population
 Histograms (or frequency polygons) are used to show the frequency distribution of the population.
 Our confidence in drawing general conclusions from the data
depends on how many individuals were measured. The larger
the sample, the finer the grouping interval that can be chosen,
so that the histogram (or frequency polygon) becomes
smoother and more closely resembles the distribution of the
total population.
Shapes of frequency distributions
 Figure 3.5 shows three of the most common shapes of frequency
distributions.
 They all have high frequencies in the centre of the distribution
and low frequencies at the two extremes, which are called the
upper and lower tails of the distribution.
 The distribution in Figure 3.5(a) is also symmetrical about the
centre; this shape of curve is often described as ‘bell-shaped’. The
two other distributions are asymmetrical or skewed.
 The upper tail of the distribution in Figure 3.5(b) is longer than
the lower tail; this is called positively skewed or skewed to the
right.
 The distribution in Figure 3.5(c) is negatively skewed or skewed to
the left.
All three distributions in Figure 3.5 are unimodal, that is they
have just one peak.
Figure 3.6 (a) shows a bimodal frequency distribution, that is a
distribution with two peaks. This is occasionally seen and usually
indicates that the data are a mixture of two separate distributions.
Also shown in Figure 3.6 are two other distributions that are
sometimes found, the reverse J-shaped and the uniform
distributions.
THE NORMAL DISTRIBUTION
Introduction:
• In practice it is found that a reasonable description of many
variables is provided by the normal distribution, sometimes
called the Gaussian distribution after its discoverer, Gauss.
• The curve of normal distribution is symmetrical about the
mean and bell – shaped; the bell is tall and narrow for small
standard deviation and short and wide for large ones.
• Fig. (9) illustrates the normal curve describing the distribution of heights of adult men in the United Kingdom.
• Other examples of variables that are approximately normally
distributed are blood pressure, body temperature, and hemoglobin
level. Examples of variables that are not normally distributed are
triceps skin-fold thickness and income, both of which are positively
skewed.
• The normal distribution is not only important because it is a good
empirical description of many variables, but because it occupies a
central role in the techniques of statistical analysis. For example, it is
the justification for the calculation of the confidence interval. It also
forms the basis of the methodology of significance testing of means.
The standard normal distribution:
• If a variable is normally distributed then a change of units does
not affect this. For example, whether the height is measured in
centimeters or inches it is normally distributed. Changing the
mean simply moves the curve up or down the axis, while changing
the standard deviation alters the height and width of the curve.
• In practice, by a suitable change of units any normally distributed
variable can be related to the standard normal distribution whose
mean is zero and whose standard deviation is 1.
• The standard normal distribution:
• This is done by subtracting the mean from each observation and dividing by the standard deviation. The relationship is:

SND: z = (x – µ) / σ

• Where x is the original variable with mean µ and standard deviation σ, and z is the corresponding standard normal deviate (SND). This is illustrated for the distribution of adult male heights in figure (10):
• SND = (height – 171.5) / 6.5
• Height = 171.5 + (6.5 × SND).
• The possibility of converting any normally distributed variable into an SND means that tables are only needed for the standard normal distribution (SND). The most commonly provided tables are:
• the area under the frequency distribution curve, and
• the so-called percentage points.
Table for area under the curve of the normal distribution:
 Table for the area under the frequency distribution curve of the
normal distribution is useful for determining the proportion of
the population which has values in some specified range.

 This will be illustrated for the distribution shown in figs. (9) and (10) of the heights of adult men in the United Kingdom, which is approximately normal with mean µ = 171.5 cm and standard deviation σ = 6.5 cm.
Area in the upper tail of distribution:
 The normal distribution can be used to estimate, for
example, the proportion of men taller than 180 cm.
 This proportion is represented by the fraction of the
area under the frequency distribution curve that is
above 180cm. The corresponding SND (Z) is:

Z = (180 – 171.5) / 6.5 = 1.31
• so that the proportion may be derived from the proportion of
the area of the standard normal distribution that is above 1.31.
• This area is illustrated in Figure 5.3(a) and can be found from a
computer or from Table A1 in the Appendix.
• The rows of the table refer to z to one decimal place and the
columns to the second decimal place.
• Thus the area above 1.31 is given in row 1.3 and column 0.01
and is 0.0951.
• We conclude that a fraction 0.0951, or equivalently 9.51%, of
adult men are taller than 180 cm.
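The same lookup can be reproduced in Python with scipy (assuming scipy is available; norm.sf gives the area above z, i.e. 1 minus the cumulative area):

from scipy.stats import norm

mu, sigma = 171.5, 6.5
z = (180 - mu) / sigma     # 1.3077, which rounds to 1.31
print(norm.sf(z))          # about 0.0955; the table lookup at z = 1.31 gives 0.0951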
Area in the lower tail of distribution:
 The proportion of men shorter than 160 cm, for example, can be similarly estimated:

Z = (160 – 171.5) / 6.5 = –1.77

• The required area is illustrated in fig. (5.3). As the standard normal distribution (SND) is symmetrical about zero, the area below z = –1.77 is equal to the area above z = 1.77, which is 0.0384. Thus 3.84% of men are shorter than 160 cm.
Area of distribution between two values:
 The proportion of men with a height between, for example, 165 cm and 175 cm is estimated by finding the proportion of men shorter than 165 cm and the proportion taller than 175 cm, and subtracting these from 1. This is illustrated in figure (5.3).
 SND corresponding to 165 cm:

Z = (165 – 171.5) / 6.5 = –1. The proportion below this height is 0.1587.

 SND corresponding to 175 cm:

Z = (175 – 171.5) / 6.5 = 0.54. The proportion above this height is 0.2946.

 Proportion of men with heights between 165 cm and 175 cm
= 1 – proportion below 165 cm – proportion above 175 cm
= 1 – 0.1587 – 0.2946 = 0.5467, or 54.67%.
Value corresponding to specified tail area:
• Table A1 can also be used the other way round, that is starting
with an area and finding the corresponding z value.
• For example, what height is exceeded by 5% (0.05) of the population? Looking through the table, the closest value to 0.05 is found in row 1.6 and column 0.04, so the required z value is 1.64.
• The corresponding height is found by inverting the definition of the SND to give:

X = µ + zσ

which is 171.5 + 1.64 × 6.5 = 182.2 cm.
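The inverse lookup can also be done in Python with scipy (norm.isf returns the z value with the given area above it):

from scipy.stats import norm

mu, sigma = 171.5, 6.5
z = norm.isf(0.05)         # about 1.645 (the table gives 1.64)
x = mu + z * sigma         # invert the SND definition
print(round(x, 1))         # about 182.2 cm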


Percentage points of the normal distribution:
• An interpretation of the SND that is sometimes useful is that it
expresses the value of the variable in terms of the number of
standard deviations it is away from the mean. This is shown on
the scale of the original variable in figure (12).
• Thus, for example, z = 1 corresponds to a value which is one standard deviation above the mean, and z = –1 to one standard deviation below the mean. The areas above z = 1 and below z = –1 are both 0.1587, or 15.87%. Therefore 31.74% (2 × 15.87%) of the distribution is further than one standard deviation from the mean, or equivalently 68.26% of the distribution lies within one standard deviation of the mean.
• Similarly, 4.55% of the distribution is further than two standard
deviation from the mean, or equivalently 95.45% of the distribution
lies within two standard deviations of the mean. This is the justification
for the practical application of the standard deviation.
• The z values encompassing exactly 95% of the distribution are –1.96 and +1.96 (figure 5.5a). Therefore, the z value 1.96 is said to be the 5% percentage point of the normal distribution, as 5% of the distribution is further than 1.96 standard deviations from the mean (2.5% in each tail). Similarly, 2.58 is the 1% percentage point. The commonly used percentage points are tabulated in table A2. Note that they could also be found from table A1 in the way described above.
 The percentage points described here are termed two-sided percentage points, as they cover extreme observations in both the upper and lower tails of the distribution.
 Some books tabulate one-sided percentage points, referring to just one tail of the distribution. The one-sided a% point is the same as the two-sided 2a% point (fig. 5.5b).
 For example, 1.96 is the one-sided 2.5% point (as 2.5% of the distribution is below –1.96) and it is the two-sided 5% point.
• These properties mean that, for a normally distributed
population, we can derive the range of values within which a
given proportion of the population will lie. The 95% reference
range is given by the mean – 1.96 s.d. to mean +1.96 s.d.,
since 95% of the values in a population lie in this range.
• We can also define the 90% reference range and the 99% reference range in the same way, as mean – 1.64 s.d. to mean + 1.64 s.d., and mean – 2.58 s.d. to mean + 2.58 s.d., respectively.
• Characteristics of normal distribution curve:
1. Bell shaped curve with most of the values clustered near
the mean and few values out near the tails.
2. Symmetrical around the mean
3. The mean, the median and the mode of a normal
distribution have the same value.
4. 95% of the measurements have values which are approximately within 2 (more exactly 1.96) standard deviations of the mean. (Also, about 68% of the measurements have values within one standard deviation of the mean, and about 99% have values within three (more exactly 2.58) standard deviations of the mean.) [68% lies within mean ±1 SD, 95% lies within mean ±2 SD and 99% lies within mean ±3 SD]. These are the reference ranges of the distribution.
• Example: if the mean height of a group of 120 women is 158
cm and the standard deviation is 3cm, it means that 95% of
the women are between 152 and 164cm (assuming that the
heights are normally distributed). In other words, 2.5% of
the women (which in this case corresponds to 3 women) are
shorter than 152cm and 2.5% (or 3 women) are taller than
164cm
[Figure: normal distribution curve]
Sample size calculation for cross-sectional studies

N (sample size) = [(1.96)² × P (1 – P)] / d²

Where:
P = prevalence of the condition under study.
d = margin of error (e.g. 5%).
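A direct translation of this formula into Python; the prevalence and error values below are hypothetical, for illustration:

def n_cross_sectional(p, d, z=1.96):
    # N = z^2 * P(1 - P) / d^2
    return z**2 * p * (1 - p) / d**2

# e.g. expected prevalence 20% with a 5% margin of error:
print(round(n_cross_sectional(0.20, 0.05)))   # about 246 subjects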
Sample size calculation for cohort studies and randomized clinical trials

N (sample size) = [2 × (1.96 + 0.84)² × P (1 – P)] / (P0 – P1)²

Where:
P0 = proportion of the participants with the condition in the control group
P1 = proportion of the participants with the condition in the exposed group
P = (P0 + P1) / 2
Sample size calculation for case-control studies

N (sample size) = [2 × (1.96 + 0.84)² × P (1 – P)] / (P0 – P1)²

Where:
P0 = proportion of the participants with the condition in the control group
P1 = proportion of the participants with the condition in the cases
   = P0 × OR / [1 + P0 (OR – 1)]
OR = odds ratio
P = (P0 + P1) / 2
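The cohort/RCT and case-control formulas share the same core, differing only in how P1 is obtained. A Python sketch (the proportions and odds ratio below are hypothetical):

def n_per_group(p0, p1, z_alpha=1.96, z_beta=0.84):
    # N = 2 * (z_alpha + z_beta)^2 * P(1 - P) / (P0 - P1)^2, with P = (P0 + P1) / 2
    p = (p0 + p1) / 2
    return 2 * (z_alpha + z_beta)**2 * p * (1 - p) / (p0 - p1)**2

def p1_from_odds_ratio(p0, odds_ratio):
    # For case-control studies: P1 = P0 * OR / (1 + P0 * (OR - 1))
    return p0 * odds_ratio / (1 + p0 * (odds_ratio - 1))

print(round(n_per_group(0.20, 0.35)))                          # cohort / RCT example
print(round(n_per_group(0.20, p1_from_odds_ratio(0.20, 2))))   # case-control example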
MATHEMATICAL PRESENTATION OF DATA
• There are 2 groups of measures for the statistical presentation of data:
A. Measures of position (measures of central tendency)
B. Measures of dispersion

• MEASURES OF POSITION (CENTRAL TENDENCY)

• These measures are:


1. The average or arithmetic mean
2. The median and
3. The mode.
1. Average or arithmetic mean:
• The mean of a sample (or average value) is designated X̄ ("X dash") and can be calculated from the frequency distribution as the sum of all observations (Xi) divided by the number of observations (n).
• Mean from ungrouped data:
e.g. 39, 50, 26, 45, 47 are the ages of 5 individuals in years.

Mean = (39 + 50 + 26 + 45 + 47) / 5 = 207 / 5 = 41.4 years

Mean (X dash) = sum of observations / total number of observations = Σ Xi / n
• Mean from grouped data:
• Calculate the arithmetic mean from the following
table:
• First, we should find the midpoint of each class interval
• Then we multiply this midpoint by frequency of this class
• Then we add the results of multiplication of the midpoint of
class interval by its frequency and divide it by the sum of the
frequencies. As in the following table:
Age (years) Frequency (F) Midpoint (X) FX
10- 10 15 10 X 15 = 150
20- 5 25 5 X 25 = 125
30- 15 35 15 X 35 = 525
40- 10 45 10 X 45 = 450
50-60 20 55 20 X 55 = 1100
Total 60 2350
• Mean (X dash) = ΣFX / N = 2350 / 60 = 39.17 yrs

Where:
• Σ = summation
• F = frequency of the corresponding class interval
• X = midpoint of the corresponding class interval
• N = sum of the frequencies (total number of observations).
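A short Python check of the grouped-data calculation above:

freqs = [10, 5, 15, 10, 20]       # class frequencies (F) from the table
mids  = [15, 25, 35, 45, 55]      # class midpoints (X)
mean = sum(f * x for f, x in zip(freqs, mids)) / sum(freqs)
print(round(mean, 2))             # 39.17 years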
Properties of the arithmetic mean:
1. Simple: can easily be calculated and interpreted
2. Unique: for a given set of data there is only one mean.
3. Sensitive: the mean is affected by every value. Extreme values may have an influence, can distort the mean and make it misleading. For example, the following are durations of stay in hospital, in days, for specific conditions: 5, 5, 5, 7, 10, 20 and 102.

The mean = 154 / 7 = 22 days.

• The extreme value of 102 days has a misleading effect on the mean. In this case the median (7 days) is a less misleading measure, i.e. the mean may be affected by large outlying observations while the median is unaffected.
Other types of mean

1. Geometric mean
2. Weighted mean
3. Harmonic mean
4. Trimmed mean
• The geometric mean
• The arithmetic mean is an inappropriate summary measure of
location if our data are skewed. If the data are skewed to the
right, we can produce a distribution that is more symmetrical if
we take the logarithm [to base 10 or to base e (2.7182818)] of
each value of the variable in this data set.
• The arithmetic mean of the log values is a measure of location
for the transformed data. To obtain a measure that has the same
units as the original observations, we have to back transform
(i.e. take the antilog of) (also known as exponentiating) the
mean of the log data; we call this the geometric mean.
• The geometric mean

• When reporting the final results, however, it is clearer to


transform them back into the original units by taking antilogs
(also known as exponentiating). The antilog of the mean of
the transformed values is called the geometric mean.

• Geometric mean (GM) = antilog(u) = exp(u) = e^u, where u is the mean of the log values.

• Provided the distribution of the log data is approximately


symmetrical, the geometric mean is similar to the median and
less than the mean of the raw data.
Geometric mean
By anti-logging, the geometric mean for normals = e^2.433 = (2.7182818)^2.433 = 11.39 ng/day/100 ml creatinine.
By anti-logging, the geometric mean for diabetics = e^3.391 = (2.7182818)^3.391 = 29.7 ng/day/100 ml creatinine.
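A minimal Python sketch of the log / back-transform steps (the data below are hypothetical):

import math

values = [6, 9, 12, 15, 21]                          # hypothetical skewed data
u = sum(math.log(v) for v in values) / len(values)   # mean of the log (base e) values
gm = math.exp(u)                                     # back-transform: GM = e**u
print(round(gm, 2))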
The weighted mean
• We use a weighted mean when certain values of the variable of
interest, x, are more important than others. We attach a weight,
w, to each of the values, xi, in our sample, to reflect this
importance.

• If the values xl, x2, x3, . . . , xn, have corresponding weights w1, w2,

w3, . . . , wn, the weighted arithmetic mean is:

W1 x1 + W2 x2 + ………. Wn xn
-------------------------------------------
W1 + w2 + …………….. wn
• The weighted mean
• For example, suppose we are interested in determining the
average length of stay of hospitalized patients in a district,
and we know the average discharge time for patients in
every hospital. To take account of the amount of
information provided, one approach might be to take each
weight as the number of patients in the associated
hospital.
• The weighted mean and the arithmetic mean are identical
if each weight is equal to one.
• Example: Last week, 6 patients were discharged from hospital A; their periods of stay were 8, 7, 15, 9, 4 and 3 days. In the same week, 6 patients were discharged from hospital B; their durations of stay were 5, 4, 14, 7, 6 and 8 days. Also in the same week, 3 patients were discharged from hospital C; their durations of stay were 2, 4 and 6 days. Calculate the weighted mean.

• The average duration of stay in hospital A = (8 + 7 + 15 + 9 + 4 + 3) divided by the number of patients (6) = 46 ÷ 6 = 7.67.
• The average duration of stay in hospital B = (5 + 4 + 14 + 7 + 6 + 8) divided by the number of patients (6) = 44 ÷ 6 = 7.3.
• The average duration of stay in hospital C = (2 + 4 + 6) divided by the number of patients (3) = 12 ÷ 3 = 4.
• The weight of the first mean (hospital A) = 6/15 = 0.4
• The weight of the second mean (hospital B) = 6/15 = 0.4
• The weight of the third mean (hospital C) = 3/15 = 0.2
• The total sum of the weights should be 1.
• If we multiply each mean by its weight and add them all together we obtain the weighted mean, which here equals the overall arithmetic mean of all 15 patients:
• (7.67 × 0.4) + (7.3 × 0.4) + (4 × 0.2) = 3.07 + 2.92 + 0.8 = 6.79 days.

• If we want to give extra weight to the first mean and less weight to the other two means, for example:
• Weight for the first mean = .2
• Weight for the second mean = .2
• Weight for the third mean = .6
• Then, the weighted mean will be: [(7.67 x .2) + (7.3 x .2) + (4 x .6)]
= 1.53 + 1.46 + 2.4 = 5.39 days
• Example: Suppose you take three 100-point exams in your statistics class and score 80, 80 and 95. The last exam is much easier than the first two, so your supervisor has given it less weight. The weights for the three exams are:
• Exam 1: 40% of your grade (note: 40% as a proportion is 0.4)
• Exam 2: 40% of your grade
• Exam 3: 20% of your grade
• What is your weighted average (mean) for the class?

1. Multiply the numbers in your data by the weights:
0.4 × 80 = 32
0.4 × 80 = 32
0.2 × 95 = 19
2. Add the numbers up: 32 + 32 + 19 = 83. 83 is the weighted average for the 3 exams.
The proportional weight given to each exam is called the weighting factor.
Decisions: Weighted means can help with decisions where
some things are more important than others:

• Example: someone wants to buy a new telephone and decides on the following rating system:
• Voice quality 50% – Battery life 30% – Camera 20%

• The Samsung phone gets 8 (out of 10) for voice quality, 6 for battery life and 7 for camera.
• The Huawei phone gets 9 for voice quality, 4 for battery life and 6 for camera.

• Which phone is best?

• Samsung: 0.5 × 8 + 0.3 × 6 + 0.2 × 7 = 4 + 1.8 + 1.4 = 7.2
• Huawei: 0.5 × 9 + 0.3 × 4 + 0.2 × 6 = 4.5 + 1.2 + 1.2 = 6.9
• So it is better to buy the Samsung.
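numpy.average computes the weighted mean directly (shown here with the exam example above, assuming numpy is available):

import numpy as np

scores  = [80, 80, 95]
weights = [0.4, 0.4, 0.2]
print(np.average(scores, weights=weights))   # 83.0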
The harmonic mean
• The harmonic mean is a measure of central tendency calculated as the reciprocal of the average of the reciprocals of the data points. For example, the harmonic mean of 10 and 30 is 15. It is calculated by averaging 0.1 (the reciprocal of 10) and 0.0333 (the reciprocal of 30); the average of these two numbers is 0.0667, and the reciprocal of 0.0667 is 15.

• The formula is:


H = n ÷ [(1 ÷ M1) + (1 ÷ M2) + (1 ÷ M3) + … + (1 ÷ Mn)]

Where: H = the harmonic mean;


n = the number of values; and
M1, M2, etc. = the individual values.
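In Python the same result can be obtained from the formula directly or from the standard library:

import statistics

values = [10, 30]
h = len(values) / sum(1 / v for v in values)   # n divided by the sum of reciprocals
print(h, statistics.harmonic_mean(values))     # both give 15.0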
The trimmed mean
• The mean is the sum of the observations divided by the number
of observations.

• Mean is too sensitive to extreme observations. Trimmed mean is


designed to solve that problem.

• The trimmed mean compensates for this by dropping a certain


percentage of values on the tails. For example, the 50% trimmed
mean is the mean of the values between the upper and lower
quartiles. The 90% trimmed mean is the mean of the values after
truncating the lowest and highest 5% of the values.

• The use of a trimmed mean helps eliminate the influence of


outliers or data points on the tails that may unfairly affect the
traditional mean.
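scipy provides a trimmed mean; note that its argument is the fraction cut from each tail, so the '90% trimmed mean' above corresponds to proportiontocut = 0.05. A sketch using the hospital-stay data from the arithmetic-mean example:

from scipy.stats import trim_mean

stays = [5, 5, 5, 7, 10, 20, 102]
print(trim_mean(stays, 0.05))   # too few values here for a 5% cut to drop any
print(trim_mean(stays, 0.25))   # cuts 25% from each tail: mean of 5, 5, 7, 10, 20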
2. The Median:
• The median (or middle value) of a series of observations is the value of the middle observation when all observations are listed in order (lowest to highest). In other words, half of the observations lie below or equal to the median and half lie above or equal to it.
The median from ungrouped data:
• When the number of observations is odd: the median is the value corresponding to the middle observation.
• Example: these are the blood urea values of 5 persons: 20 – 25 – 30 – 40 – 35 mg.
• First, put the observations in the form of an array, i.e. 20 – 25 – 30 – 35 – 40 mg.
• The median is the 3rd of the ordered observations = 30 mg.

In general, the rank of the median = (N + 1) / 2 = (5 + 1) / 2 = 3rd,
where N is the number of observations.
• When the number of observations is even: the median is the average of the two middle observations.
• Example: the following are the blood urea values of 6 persons: 20 – 35 – 30 – 40 – 25 – 15 mg.
• First, put the observations in the form of an array, i.e. 15 – 20 – 25 – 30 – 35 – 40 mg.
• The two middle ordered observations are 25 and 30 mg.
• The value of the median = (25 + 30) / 2 = 27.5 mg.

In general, the ranks of the median for an even number of observations are
N/2 and N/2 + 1; here 6/2 = 3rd and 6/2 + 1 = 4th.

• The median therefore lies between the 3rd and 4th observations, and its value equals the average of these two observations.
The Median from grouped data:
The following table shows frequency distribution of 26
individuals by age in years:

Age (years) Frequency Cumulative


frequency

5- 3 3
10- 5 8
15- 12 20
20- 3 23
25-30 3 26

Total 26
• In order to determine the value of the median, we first have to calculate:
• General rank of the median = N / 2 = 26 / 2 = 13
• Special rank of the median = general rank – cumulative frequency of the class before the median class = 13 – 8 = 5 (5th)
• The median:

Median = Lm + (special rank × class interval) / frequency of the median class

Where:
• Lm = lower limit of the median class

Median = 15 + (5 × 5) / 12 = 15 + 2.08 = 17.08 years.
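The same steps as a small Python function (assuming equal-width classes, as in the table above):

def grouped_median(lowers, freqs, width):
    n = sum(freqs)
    general_rank = n / 2
    cum = 0
    for lower, f in zip(lowers, freqs):
        if cum + f >= general_rank:               # this is the median class
            special_rank = general_rank - cum
            return lower + special_rank * width / f
        cum += f

# Classes 5-, 10-, 15-, 20-, 25-30 with frequencies 3, 5, 12, 3, 3:
print(round(grouped_median([5, 10, 15, 20, 25], [3, 5, 12, 3, 3], 5), 2))   # 17.08 years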
3. The Mode:
 The mode is defined as the most frequently occurring value
in a series of observations. There may no mode or one mode
or more than one mode for a group of observations.
 The mode from ungrouped data: these are the weights of 5
live births: 3 – 3.5 – 3 – 2.5 – 3 kg. The mode is 3kg. which is
the most frequent observation.
 The mode from grouped data: the class which has the
highest frequency is the modal class.

Mode = Lm + [D1 / (D1 + D2)] × I

Where:
 Modal class = the class interval with the highest frequency.
 Lm = lower limit of the modal class.
 D1 = difference between the frequency of the modal class and that of the preceding class.
 D2 = difference between the frequency of the modal class and that of the following class.
 I = width of the class interval.

 Example: the following table shows distribution of children


according to their ages in pediatric department 2010:
The modal class interval is 9 – <12.
D1 = 60 – 30 = 30; D2 = 60 – 20 = 40; I = 3; Lm = 9

Mode = 9 + [30 / (30 + 40)] × 3 = 10.29 years.

Age (years) Frequency


0- 40
3- 40
6- 30
9- 60
12- 20
15-18 10

Total 200
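The mode formula as a small Python function, checked against the table above (equal-width classes assumed):

def grouped_mode(lowers, freqs, width):
    i = freqs.index(max(freqs))                                   # modal class
    d1 = freqs[i] - (freqs[i - 1] if i > 0 else 0)                # vs. previous class
    d2 = freqs[i] - (freqs[i + 1] if i + 1 < len(freqs) else 0)   # vs. following class
    return lowers[i] + d1 / (d1 + d2) * width

print(round(grouped_mode([0, 3, 6, 9, 12, 15], [40, 40, 30, 60, 20, 10], 3), 2))   # 10.29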
MEASURES OF DISPERSION
Introduction:
 As we have seen, the mean, median and mode are measures
of the central tendency of a variable, but they do not provide
any information of how much the measurements vary. This
section describes some common measures of variation (or
variability) which in statistical textbooks are often referred to
as measures of dispersion.
Measures of dispersion include the following:
1. Range
2. Percentiles and quantiles
3. Average deviation
4. Variance and
5. Standard deviation
1. RANGE:
• The range of a set of measurements is the difference
between the smallest and the largest measurement. For
example, if the weights of 7 pregnant women were: 40 – 41
– 42 – 43 – 47 – 72 kg.
• The range would be 72 – 40 = 32 kg.
• Although simple to calculate, the range does not tell us
anything about the distribution of the values between the
two extreme ones.

• If the weight of 7 other pregnant women were: 40 – 46 – 46


– 46 – 50 – 60 – 72kg.
• The range would be 72 – 40 = 32 kg. Although the values are
very different from those of the previous example.
2. PERCENTILES AND QUANTILES:
• A second way of describing the variation of a set of measurement is to
divide the distribution into percentiles. As a matter of fact the concept
of percentiles is just an extension of the concept of median, which
may also be called the 50th percentile.
• Percentiles are points which divide all the measurements into 100
equal parts.
• The 3rd percentile (P3) is the value below which 3% of the
measurements lie. The 50th percentile (P50) or the median is the value
below which 50% of measurements lie.
• The concept of percentile is used by nutritionists to develop standard
charts for specific countries.
Quantiles and percentiles
• Equal-sized divisions of a distribution are called quantiles.
• For example, we may define tertiles, which divide the data into
three equally-sized groups, and quintiles, which divide them into
five.
• Quintiles are estimated from the intersections with the
cumulative frequency curve of lines at 20%, 40%, 60% and 80%.
• Divisions into ten equally sized groups are called deciles.
• More generally, the kth percentile (or centile as it is also called) is
the point below which k% of the values of the distribution lie.
Cumulative % and quartiles for age of children (months), etc.
Observation Cumulative % Age (in months) Quartile
1 5 21 Minimum = 21
2 10 22
3 15 23
4 20 24

5 25 27 1st quartile ??
6 30 28
7 35 31
8 40 32
9 45 33

10 50 35 Median (2nd Q) ??
11 55 45
12 60 46
13 65 48
14 70 54

15 75 55 3rd quartile ??
16 80 57
17 85 59
18 90 59
19 95 61
20 100 62 Maximum = 62
• For the haemoglobin data, the median is the 35.5th observation and so we take the average of the 35th and 36th observations. Thus the median is (11.8 + 11.9) ÷ 2 = 11.85, as shown in Table 3.3. It is the haemoglobin value corresponding to the point where the 50% line crosses the curve, as shown in the figure.
• Also marked on Figure 3.7 are the two points where the 25% and
75% lines cross the curve. These are called the lower (first) and
upper (third) quartiles of the distribution, respectively, and
together with the median they divide the distribution into four
equally-sized groups.
• The difference between the lower and upper quartiles is known as
the interquartile range.
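numpy can compute the quartiles and the interquartile range directly; note that software packages use slightly different interpolation rules, so the results may differ a little from the hand method based on the cumulative-% column above:

import numpy as np

ages = [21, 22, 23, 24, 27, 28, 31, 32, 33, 35,
        45, 46, 48, 54, 55, 57, 59, 59, 61, 62]
q1, median, q3 = np.percentile(ages, [25, 50, 75])
print(q1, median, q3, "IQR =", q3 - q1)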
Box plot:
• A useful plot, based on these values, is a box and whiskers plot.
• The box is drawn from the lower quartile to the upper quartile; its
length gives the interquartile range.
• The horizontal line in the middle of the box represents the median.
• Just as a cat’s whiskers mark the full width of its body, the
‘whiskers’ in this plot mark the full extent of the data. They are
drawn on either end of the box to the minimum and maximum
values.
[Figure: box and whiskers plot]
3. AVERAGE DEVIATION:
• The average deviation is the mean of the absolute differences between the individual values and the mean of these values.
1. Calculate the mean of all measurements.
2. Calculate the difference between each individual measurement and the mean.
3. Add the absolute differences between the mean and the individual measurements (remove the negative sign).
4. Divide the above result by the number of observations.
5. Average deviation = Σ |X – mean| / n
3. AVERAGE DEVIATION: example
Mean = 60 / 10 = 6; Average deviation = Σ |X – mean| / n = 16 / 10 = 1.6

Variable (X)   X – mean     |X – mean|
5              5 – 6 = –1   1
6              6 – 6 = 0    0
4              4 – 6 = –2   2
8              8 – 6 = 2    2
7              7 – 6 = 1    1
4              4 – 6 = –2   2
7              7 – 6 = 1    1
9              9 – 6 = 3    3
3              3 – 6 = –3   3
7              7 – 6 = 1    1
Total: 60      0            16
4. VARIANCE:
• For calculation of the variance follow the following steps:

1. Calculate the mean of all measurements

2. Calculate the difference between each individual measurement


and the mean

3. Square all these differences

4. Take the sum of all squared differences

5. Divide this sum by the number of measurements minus one

Variance, SD² = Σ (X – X̄)² / (n – 1)
4. VARIANCE: example
Mean = 60 / 10 = 6; Variance = Σ (X – mean)² / (n – 1) = 34 / (10 – 1) = 3.78

Variable (X)   X – mean     (X – mean)²
5              5 – 6 = –1   (–1)² = 1
6              6 – 6 = 0    (0)² = 0
4              4 – 6 = –2   (–2)² = 4
8              8 – 6 = 2    (2)² = 4
7              7 – 6 = 1    (1)² = 1
4              4 – 6 = –2   (–2)² = 4
7              7 – 6 = 1    (1)² = 1
9              9 – 6 = 3    (3)² = 9
3              3 – 6 = –3   (–3)² = 9
7              7 – 6 = 1    (1)² = 1
Total: 60      0            34
• Degrees of freedom:
• The denominator (n – 1) is called the number of degrees
of freedom of the variance.
• The variance has convenient mathematical properties and
is the appropriate measure when doing statistical theory.
• A disadvantage is that it is in the square of the units used for the observations. For example, if the observations are weights in grams, the variance is in grams squared.
• For many purposes it is more convenient to express the
variation in the original units by taking the square root of
the variance. This is called standard deviation (SD)
SD = √[Σ (X – mean)² / (n – 1)]
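In Python, the statistics module computes both measures with the (n – 1) denominator; checked against the worked example above:

import statistics

x = [5, 6, 4, 8, 7, 4, 7, 9, 3, 7]
print(round(statistics.variance(x), 2))   # 3.78
print(round(statistics.stdev(x), 2))      # 1.94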
5. STANDARD DEVIATION:
• To determine how much our measurements differ from the mean value, there is a measure which we need when we use statistical tests. This measure is called the standard deviation.
• The standard deviation describes how much the individual measurements differ, on average, from the mean.
• Standard deviation = square root of the variance.
5. STANDARD DEVIATION:
• A large standard deviation shows that there is a wide scatter of measured values around the mean, while a small standard deviation shows that the individual values are concentrated around the mean with little variation among them.
• Fortunately, many pocket calculators can do this calculation for us, but it is still important to understand what it means.

• For calculation of the standard deviation from grouped data, follow the steps in the following example:
Weight Frequency Mid-point FX F(X)2
(F) (X)
2- 8 3 8X3 = 24 8X(3)2 = 72
4- 6 5 6X5 = 30 6X(5)2 = 150
6- 10 7 10X7 = 70 10X(7)2 = 490
8 -10 6 9 6X9 = 54 6X(9)2 =486
Total 30 178 1198
First, calculate the mean = ΣFX / n = 178 / 30 = 5.93

SD = √{[Σ F(midpoint X)² – n (mean)²] / (n – 1)}

SD = √[(1198 – 30 × (5.93)²) / (30 – 1)] = √(143 / 29) = √4.93 = 2.22
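A short Python check of the grouped-data SD (keeping full precision for the mean gives 2.21; the slide's 2.22 comes from rounding the mean to 5.93 first):

import math

freqs, mids = [8, 6, 10, 6], [3, 5, 7, 9]
n = sum(freqs)
mean = sum(f * x for f, x in zip(freqs, mids)) / n    # 178 / 30
ss = sum(f * x * x for f, x in zip(freqs, mids))      # sum of F * X^2 = 1198
sd = math.sqrt((ss - n * mean**2) / (n - 1))
print(round(mean, 2), round(sd, 2))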
N.B.
For normally distributed data, it is suitable to use the
mean as a measure of central tendency and the
standard deviation as a measure of dispersion.
[Mean ± SD]

For skewed data – non normally distributed data – it is


suitable to use the median as a measure of central
tendency and the inter-quartile range (IQR) as a
measure of dispersion. [Median, IQR]
• From the table we determine that there seems to be a
difference in utilization of antenatal care between those who
live close to and those who live far from the clinic (64% versus
47%). We now want to know whether this observed difference
is statistically significant.

• The chi-square test can be used to give us the answer. This test
is based on measuring the difference between the observed
frequencies and the expected frequencies if the null hypothesis
(i.e. the hypothesis of no difference) were true.
To perform a x2 test you need to complete the following 3 steps:
1. Calculate the x2 value
2. Use a x2 table, and
3. Interpret the results.

1. Calculate the x2 value


Complete the following steps:
a) Calculate the expected frequency (E) for each cell. To find the expected frequency E of a cell, multiply the row total by the column total and divide by the grand (overall) total:

Expected frequency (E) = (row total × column total) / grand (overall) total
b) For each cell, subtract the expected frequency from the
observed frequency (O), O – E
c) For each cell, square the result of (O – E) and divide by the
expected E.
d) Add the results of step (c) for all cells.
• The formula for calculating a chi-square value (steps (b) to (d)) is as follows:

X² = Σ (O – E)² / E
Where:
• O is the observed frequency (indicated in the table)
• E is the expected frequency (to be calculated), and
• Σ (the sum of) for all the cells of the table.
 For any two-by-two table (which contains 4 cells) the formula is:

X² = (O1 – E1)²/E1 + (O2 – E2)²/E2 + (O3 – E3)²/E3 + (O4 – E4)²/E4

2. Using a x2 table
• As for the t-test, the calculated X2 value has to be compared
with a theoretical X2 value in order to determine whether the
null hypothesis is rejected or not.
• First you must decide on a p-value; we usually take 0.05.
• Then the degrees of freedom have to be calculated. With the X² test the number of degrees of freedom is related to the number of cells, i.e. the number of groups or variables you are comparing.
The number of degrees of freedom is found by multiplying the
number of rows (r) minus 1 by the number of columns (c) minus 1:
• d.f. = (r – 1) X (c – 1)
• For a simple two-by-two table the number of degrees of freedom is 1 (i.e. d.f. = (2 – 1) × (2 – 1) = 1).
• Then the X2 value belonging to the p-value and the number of
degrees of freedom is located in the table, in order to determine
whether the X2 value is statistically significant or not.
3. Interpreting the result:
• As for the t-test, the null hypothesis is rejected if p<0.05, which is
the case if the calculated X2 is larger than the theoretical X2 value
in the table.

• Let us now apply the X2 test to the data given in Example (D)
(utilization of antenatal care). This gives the following results:
Step 1(a):
The expected frequencies for each cell are calculated as follows:
E1 = 86 × 80 / 155 = 44.4    E2 = 69 × 80 / 155 = 35.6
E3 = 86 × 75 / 155 = 41.6    E4 = 69 × 75 / 155 = 33.4
For convenience, the observed and expected frequencies are
shown in the following table:
Table (E): Utilization of antenatal clinics, observed and expected frequencies.

Distance from ANC   Used ANC             Did not use ANC      Total
< 10 km             O1 = 51, E1 = 44.4   O2 = 29, E2 = 35.6   80
≥ 10 km             O3 = 35, E3 = 41.6   O4 = 40, E4 = 33.4   75
Total               86                   69                   155
 Note that the expected frequencies refer to the values we would have expected, given the totals of 80 and 75 women in the two groups, if the null hypothesis (stating that there is no difference between the two groups) were true.

 Steps 1(b) to 1(d):

X² = (51 – 44.4)²/44.4 + (29 – 35.6)²/35.6 + (35 – 41.6)²/41.6 + (40 – 33.4)²/33.4
   = 0.98 + 1.22 + 1.05 + 1.30 = 4.55


• Step2:
• As we have a simple two-by-two table the number of degrees
of freedom (d.f.) is 1.
• Use the table of chi-square values. We have decided on p = 0.05; this gives a theoretical value of 3.84. Our value of 4.55 is larger than 3.84, which means that the p value is smaller than 0.05.
Step 3:
• We can now conclude that the women living within a distance
of 10 km from the clinic use antenatal care significantly more
often than the women living more than 10 km away.
By using SPSS the x2 = 4.57 and the p value = 0.032
• It is important to present your data clearly and to formulate
carefully any conclusions based on statistical tests in the final
report of your study.
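The whole calculation can be reproduced with scipy's chi2_contingency (correction=False gives the uncorrected X² used in the hand calculation and the SPSS output above):

from scipy.stats import chi2_contingency

observed = [[51, 29],    # < 10 km: used ANC / did not use ANC
            [35, 40]]    # >= 10 km
chi2, p, dof, expected = chi2_contingency(observed, correction=False)
print(round(chi2, 2), round(p, 3), dof)   # about 4.57, p about 0.03, d.f. = 1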
Example (1): A researcher was interested in studying the association between sex of students and the prevalence of obesity. A sample of 200 male students aged 10-12 years and 150 female students of the same age was taken. It was found that 50 male students were obese compared with 45 female students. Test whether there is a significant difference between the prevalence of obesity in the two sexes.

Sex | Obese | Normal weight | Total
Male | 50 (25%) | 150 (75%) | 200
Female | 45 (30%) | 105 (70%) | 150
Total | 95 | 255 | 350

The Answer:
• E1 = 95 X 200 / 350 = 54.3 (O1 = 50)
• E2 = 255 X 200 / 350 = 145.7 (O2 = 150)
• E3 = 95 X 150 / 350 = 40.7 (O3 = 45)
• E4 = 255 X 150 / 350 = 109.3 (O4 = 105)
• X² = (50 – 54.3)²/54.3 + (150 – 145.7)²/145.7 + (45 – 40.7)²/40.7 + (105 – 109.3)²/109.3
     = 0.34 + 0.12 + 0.45 + 0.17 = 1.08
• d.f. = (r – 1) (c – 1) = (2 – 1) (2 – 1) = 1
• Theoretical X² value from the X² table at p = 0.05 and d.f. = 1 is 3.84.
• The theoretical X² value (3.84) is greater than the calculated X² value (1.08), so the difference between the prevalence rates of obesity among males and females is not significant (p > 0.05).
Example (2): A survey was conducted to study Oxyuris infestation among primary school students in Assiut city. The results were as follows:

Sex | Positive | Negative | Total
Males | 6 (5.2) | 149 (94.8) | 155 (100.0)
Females | 9 (9.5) | 86 (90.5) | 95 (100.0)
Total | 15 | 235 | 250

Is there a significant difference in the prevalence rates of this disease between males and females?
The Answer:
• E1 = 15 X 155 / 250 = 9.3 (O1 = 6)
• E2 = 235 X 155 / 250 = 145.7 (O2 = 149)
• E3 = 15 X 95 / 250 = 5.7 (O3 = 9)
• E4 = 235 X 95 / 250 = 89.3 (O4 = 86)
• X² = (6 – 9.3)²/9.3 + (149 – 145.7)²/145.7 + (9 – 5.7)²/5.7 + (86 – 89.3)²/89.3
     = 1.17 + 0.07 + 1.91 + 0.12 = 3.27
• d.f. = (r – 1) (c – 1) = (2 – 1) (2 – 1) = 1
• Theoretical X² value from the X² table at p = 0.05 and d.f. = 1 is 3.84. The calculated value (3.27) is smaller, so the difference is not significant.

• By using SPSS the x2 = 3.27 and the p value = 0.07


Example (3): To study the effects of three drugs (A, B and C) on patients, the following results were obtained:

Treatment | Drug A | Drug B | Drug C | Total
Favourable | 70 (70%) | 160 (80%) | 168 (84%) | 398
Un-favourable | 30 (30%) | 40 (20%) | 32 (16%) | 102
Total | 100 | 200 | 200 | 500

Is there a significant difference between the proportions of favourable outcomes of these drugs?
The Answer:
• E1 = 100 X 398 / 500 = 79.6 (O1 = 70)
• E2 = 100 X 102 / 500 = 20.4 (O2 = 30)
• E3 = 200 X 398 / 500 = 159.2 (O3 = 160)
• E4 = 200 X 102 / 500 = 40.8 (O4 = 40)
• E5 = 200 X 398 / 500 = 159.2 (O5 = 168)
• E6 = 200 X 102 / 500 = 40.8 (O6 = 32)
• X² = (70 – 79.6)²/79.6 + (30 – 20.4)²/20.4 + (160 – 159.2)²/159.2
       + (40 – 40.8)²/40.8 + (168 – 159.2)²/159.2 + (32 – 40.8)²/40.8
     = 1.16 + 4.5 + 0.004 + 0.016 + 0.486 + 1.898 = 8.064
• d.f. = (r – 1) (c – 1) = (2 – 1) (3 – 1) = 2
• Theoretical X² value from the X² table at p = 0.05 and d.f. = 2 is 5.99.
• The calculated value (8.064) is larger, so the difference is significant (p < 0.05).
Example (4): To study the effect of diabetes on the duration of wound healing, the following results were obtained:

Patients | Normal | Prolonged | Total
Diabetic | 228 (69.5) | 100 (30.5) | 328
Non-diabetic | 118 (74.7) | 40 (25.3) | 158
Total | 346 | 140 | 486

Test whether there is a relationship between the duration of wound healing and the presence or absence of diabetes.
The Answer:
• E1 = 346 X 328 / 486 = 233.5 (O1 = 228)
• E2 = 346 X 158 / 486 = 112.5 (O2 = 118)
• E3 = 140 X 328 / 486 = 94.5 (O3 = 100)
• E4 = 140 X 158 / 486 = 45.5 (O4 = 40)
• X² = (228 – 233.5)²/233.5 + (118 – 112.5)²/112.5 + (100 – 94.5)²/94.5 + (40 – 45.5)²/45.5
     = 0.13 + 0.27 + 0.32 + 0.66 = 1.38
• d.f. = (2 – 1) (2 – 1) = 1
• Theoretical X² value from the X² table at p = 0.05 and d.f. = 1 is 3.84. Then, the difference is not significant.
• By using SPSS the X² = 1.39 and the p value = 0.239
Assumptions of the chi-square test
1. The first assumption is the independence of data. For the chi-square test to be meaningful it is imperative that each person, item or entity contributes to only one cell of the contingency table. Therefore, you cannot use a chi-square test on a repeated-measures design.
2. The expected frequencies should be greater than 5. Although it is acceptable in larger contingency tables to have up to 20% of expected frequencies below 5, the result is a loss of statistical power (so the test may fail to detect a genuine effect). Even in larger contingency tables no expected frequencies should be below 1. If you find yourself in this situation, consider using Fisher's exact test (see the sketch below).
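As a sketch of that fallback, Fisher's exact test is available in scipy; the 2 X 2 counts below are hypothetical, chosen so that some expected frequencies fall below 5. The lecture itself would run this in SPSS.

from scipy.stats import fisher_exact

# Hypothetical 2 x 2 table with small counts (not from the lecture data)
table = [[2, 8],
         [7, 3]]

odds_ratio, p = fisher_exact(table)
# The p-value comes from the exact hypergeometric distribution,
# so no minimum expected frequency is required.
print(odds_ratio, p)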
Analyze --- Descriptive Statistics --- Crosstabs
Statistics --- Chi-square
Cells
Output of chi square test
Reporting the results of chi-square

• When reporting Pearson’s chi-square we simply report the value


of the test statistic with its associated degrees of freedom and
the significance value. It’s also useful to reproduce the
contingency table. As such, we could report:

• There was a significant association between the type of training


and whether or not cats would dance χ2 (1) = 25.36, p < .001.
• One way chi-square:
Analyze --- non-parametric tests --- chi square
Example 1:
Minority Classification
Observed N Expected N Residual
No 370 237.0 133.0
Yes 104 237.0 -133.0
Total 474

Test Statistics
Minority Classification
Chi-Square 149.274a
Df 1
Asymp. Sig. .000

a. 0 cells (.0%) have expected frequencies less than 5. The minimum expected cell
frequency is 237.0.
Example 2
colour
Observed N Expected N Residual
blue 25 20.0 5.0
pink 20 20.0 .0
green 20 20.0 .0
brown 15 20.0 -5.0
yellow 20 20.0 .0
Total 100

• Test Statistics
• colour
• Chi-Square 2.500a
• df 4
• Asymp. Sig. .645
• a. 0 cells (.0%) have expected frequencies less than 5. The minimum
expected cell frequency is 20.0.
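The same one-way (goodness-of-fit) chi-square can be sketched in Python; scipy's chisquare assumes equal expected frequencies by default, exactly as in the colour example above.

from scipy.stats import chisquare

observed = [25, 20, 20, 15, 20]   # blue, pink, green, brown, yellow
stat, p = chisquare(observed)     # expected defaults to 20 per colour
print(stat, p)                    # about 2.5 and 0.645, as in the output above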
COMPARISON OF TWO MEANS - STUDENT'S T-TEST
The unpaired (two-sample) t-test - Independent samples t-test

 The t-test also referred to as Student's t-test is used for


numerical data to determine whether an observed difference
between the means of two groups can be considered statistically
significant.
 We have samples from two independent (unrelated) groups of
individuals one numerical variable of interest. We are interested
in whether the mean is the same in the two groups.

 For example, we may wish to compare the weights in two groups


of children, each child being randomly allocated to receive either
a dietary supplement or placebo.
Assumptions
• In the population, the variable is Normally distributed and the
variances of the two groups are the same. In addition, we have
reasonable sample sizes so that we can check the assumptions
of Normality and equal variances.

Rationale
• We consider the difference in the means of the two groups.
Under the null hypothesis that the population means in the two
groups are the same, this difference will equal zero. Therefore,
we use a test statistic that is based on the difference in the two
sample means, and on the value of the difference in population
means under the null hypothesis (i.e. zero).

• This test statistic, often referred to as t, follows the t-


distribution.
If the assumptions are not satisfied

 When the sample sizes are reasonably large, the t-test is fairly
robust to departures from Normality.

 However, it is less robust to unequal variances. There is a


modification of the unpaired t-test that allows for unequal
variances, and results from it are often provided in computer
output.

 However, if you are concerned that the assumptions are not


satisfied, then you either transform the data to achieve
approximate Normality and /or equal variances, or use a non-
parametric test such as the Wilcoxon rank sum test (Mann
Whitney U test).
Example:

 It has been observed that in a certain province the proportion of


women who are delivered through Caesarean section is very
high.

 A study is, therefore, conducted to discover why this is the case.


As small height is known to be one of the risk factors related to
difficult deliveries, the researcher may want to find out if there is
a difference between the mean height of women in this province
who had normal deliveries and of those who had Caesarean
sections.

 The null hypothesis would be that there is no difference between


the mean heights of the two groups of women.
• A t-test would be the appropriate way to determine
whether the observed difference between the two
means is statistically significant.

• To actually perform a t-test you have to complete 4


steps:
1. Test for homogeneity of variance
2. Calculate the t-value
3. Use a t-table, and
4. Interpret the results.
1. Testing for homogeneity of variance with Hartley's F
max test:

• In order to use a parametric statistical test, your data should


show homogeneity of variance: in other words, the spread of
scores in each condition should be roughly similar. (The spread of
scores is reflected in the variance, which is simply the standard
deviation squared).

• Sometimes, it's quite obvious that the variances are very


dissimilar; you just need to look at them.
• In other cases, it's less obvious, and a more formal test is
required. There are various ways to test for homogeneity of
variance.
• If you are performing the statistical tests by hand, then
it's easier to use Hartley's F max test than Levene's.

(a) Divide the larger variance by the smaller one. This


gives you an F-ratio (NB: this should not be confused
with the F-ratio that's produced in ANOVA, this F-ratio
is quite different). If the variances are similar to each
other, then the F-ratio will be close to 1: the more the
variances differ, the larger the F-ratio will be.

• F max = larger variance / smaller variance


• When performing some statistical tests, SPSS routinely tests
for homogeneity of variance.

• For example, if you perform an independent-measures t-


test, SPSS will also show the results of a Levene's test on the
data.

• If the Levene's test result is statistically significant (the result


has a p < 0.05) , it means that the data do not show
homogeneity of variance.

• If the Levene's test is not significant (p > .05) then you can
assume that the data show homogeneity of variance.
(b) If the F-ratio is very close to 1, you are safe in concluding that
the data probably show homogeneity of variance. If the F-ratio
is quite a bit larger than 1, then to decide how likely it is to get
your obtained F-ratio by chance, you need to use a table of F-
max values.

 To use the table you need to know the d.f. (the number of
participants in a group, minus 1) and k (the number of groups or
conditions).

 Note that Hartley's test assumes that there are equal numbers
of participants in each group.
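A sketch of both checks in Python (the lecture performs them in SPSS); the two groups of scores below are illustrative only.

import numpy as np
from scipy.stats import levene

# Hypothetical scores for two groups (illustration only)
group1 = [11, 12, 13, 14, 15, 13, 14, 15, 16, 18]
group2 = [22, 33, 44, 55, 66, 65, 45, 43, 66, 32]

# Hartley's F max: ratio of the larger to the smaller variance
var1 = np.var(group1, ddof=1)
var2 = np.var(group2, ddof=1)
f_max = max(var1, var2) / min(var1, var2)

# Levene's test (what SPSS reports alongside the independent-samples t-test)
stat, p = levene(group1, group2)
print(f_max, stat, p)   # p < 0.05 suggests the variances are not homogeneous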
2. Calculate the t-value: when variances are unequal:

• To calculate the t-value, you need to complete the following:

A. Calculate the difference between the means: first mean –


second mean.

B. Calculate the standard deviation for each of the study groups.

C. Calculate the standard error of the difference between the two means:
• SE = square root of (SD1²/n1 + SD2²/n2)
Example: the following amounts of money (in pounds) were recorded for 20 male and 20 female children. Test whether there is a significant difference between the two means.
• Males:
11.0 - 12.0 - 13.0 - 14.0 - 15.0 - 13.0 - 14.0 - 15.0 - 16.0 - 18.0
- 22.0 - 33.0 - 44.0 - 55.0 - 66.0 - 65.0 - 45.0 - 43.0 - 66.0 -
32.0
N1 = 20 - Mean1 = 15.4 - SD1 = 9.5

• Females:
11.0 - 12.0 - 13.0 - 11.0 - 11.0 - 12.0 - 13.0 - 14.0 - 15.0 - 55.0 -
22.0 - 21.0 - 21.0 - 23.0 - 24.0 - 21.0 - 23.0 - 24.0 - 25.0 -
32.0
N2 = 20 - Mean2 = 35.4- SD2 = 16.3
The standard error of the difference is given by the following formula:
• SE = square root of (9.5²/20 + 16.3²/20) = 4.2

Where:

• SD1 is the standard deviation of the first sample

• SD2 is the standard deviation of the second sample

• n1 is the sample size of the first sample

• n2 is the sample size of the second sample


Calculate the t- value
Finally, divide the difference between the means by the standard error of the difference. The value now obtained is called the t-value.

t-value = (First mean – Second mean) / Standard error = (15.4 – 35.4) / 4.2 = –4.7

• Expressed in one single formula:
t = (X1 – X2) / square root of (SD1²/n1 + SD2²/n2)
• Where X1 is the mean value of the first sample and X2 is the mean value of the second sample.
3. Using a T-Table:
• Once the t-value has been calculated, you will have to refer to a
t-table, from which you can determine whether the null
hypothesis is rejected or not.
A. First, decide which significance level (p-value) you want to use.
Remember that the p-value is an expression of the likelihood of
finding a difference by chance when there is no real difference.
Usually we take a p-value of 0.05.
B. Second, determine the number of degrees of freedom for the
test being performed. Degrees of freedom is a measure derived
from the sample size, which has to be taken into account when
performing a t-test. The bigger the sample size (and degrees of
freedom) the smaller the difference needed to reject the null
hypothesis.
• The way the number of degrees of freedom is calculated differs
from one statistical test to another. For Student t-test the number
of degrees of freedom is calculated as the sum of the two sample
sizes minus 2. That is: d.f. = n1 + n2 – 2
• Thus, for our example the number of degrees of freedom is:
• d.f. = 20 + 20 – 2 = 38

C. Third, the t-value belonging to the p-value and the degrees of


freedom is located in the table.
• In our example, we look up the t-value belonging to p = 0.05 and
d.f. = 38 and we find it is 2.03.
4. Interpreting the Result:

• We now compare the absolute value of the t-value calculated in


step 1 (i.e. the t-value, ignoring the sign) with the t-value derived
from the table in step 3. If the calculated t-value is larger than the
value derived from the table, p is smaller than the value indicated
at the top of the column. We then reject the null hypothesis and
conclude that there is a statistically significant difference
between the two means.

• If the calculated t-value is smaller than the value derived from


the table, p is larger than the value indicated at the top of the
table. We then accept the null hypothesis and conclude that the
observed difference is not statistically significant.
• In our example the absolute t-value calculated in step 2 is 4.7, which is larger than the t-value derived from the table in step 3 (2.03). Thus, p is smaller than 0.05, and we therefore reject the null hypothesis and conclude that the observed difference of 20 between the two means is a statistically significant difference.
• by using SPSS, the calculated t-value =2.783, P value = 0.019.
We can express this conclusion in different ways:
• We can say that the probability that the observed difference of
20 between the two groups is due to chance is less than 5%.
• We can also say that the difference between the two groups is
4.7 times the standard error.
If you want to compare mean values of more than two groups
(e.g. heights of urban, semi-urban and rural women) you
cannot use Student's t-test. In this case you must use the F-test.
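A sketch of the same calculation in Python, working from the summary statistics quoted in the example (means 15.4 and 35.4, SDs 9.5 and 16.3, n = 20 per group); scipy's ttest_ind_from_stats with equal_var=False gives the unequal-variance (Welch) version. The lecture itself uses SPSS.

from scipy.stats import ttest_ind_from_stats

t, p = ttest_ind_from_stats(mean1=15.4, std1=9.5,  nobs1=20,
                            mean2=35.4, std2=16.3, nobs2=20,
                            equal_var=False)   # Welch's unequal-variance t-test
print(t, p)   # t is about -4.7, in line with the hand calculation; p < 0.001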
Calculate the t-value: when variances are equal:

The standard error of the difference between two means when the variances are equal is:

SE = square root of [ (SD1² X (n1 – 1) + SD2² X (n2 – 1)) / (n1 + n2 – 2) X (1/n1 + 1/n2) ]

t-value = (First mean – Second mean) / Standard error

Then proceed as for the unequal-variance case (t-table, d.f. = n1 + n2 – 2, and interpretation).


Example: the following are weights of 10 male and 10 female children; test whether there is a significant difference between their mean weights.

Serial | Males | Females
1 | 30 | 22
2 | 35 | 33
3 | 45 | 44
4 | 40 | 55
5 | 50 | 66
6 | 35 | 65
7 | 55 | 45
8 | 25 | 43
9 | 35 | 66
10 | 40 | 32
Mean | 39.0 | 47.1
SD | 9.1 | 15.6
Applying the equal-variance formula to this example:

SE = square root of [ ((9.1)² X (10 – 1) + (15.6)² X (10 – 1)) / (10 + 10 – 2) X (1/10 + 1/10) ] = 5.7

t-value = (First mean (39.0) – Second mean (47.1)) / Standard error (5.7) = –1.4

The theoretical t-value at p = 0.05 and d.f. = 10 + 10 – 2 = 18 is 2.1, which is larger than the absolute calculated t-value (1.4), so the difference is not significant.
Analyze --- Compare means --- Independent samples t test
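A sketch of the pooled (equal-variance) version in Python, using the raw weights from this example (the menu path above is the SPSS route):

from scipy.stats import ttest_ind

males   = [30, 35, 45, 40, 50, 35, 55, 25, 35, 40]
females = [22, 33, 44, 55, 66, 65, 45, 43, 66, 32]

# equal_var=True pools the two variances, as in the hand calculation above
t, p = ttest_ind(males, females, equal_var=True)
print(t, p)   # t is about -1.4 with 18 d.f., p about 0.17 - not significant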
PAIRED T-TEST
• When dealing with paired (matched) observations, comparison of sample means is performed by a modified t-test known as the paired t-test.

• In the paired t-test a single set of differences between the


paired observations is used instead of the original two sets
of observations.

• The paired t-test calculates a value of t as:


t= mean difference / standard error

• The number of degrees of freedom is the sample size minus


1 (or the number of paired observations minus 1).
 The problem

 We have two samples that are related to each other and one
numerical or ordinal variable of interest.

 The variable may be measured on each individual in two


circumstances. For example, in a cross-over trial, each patient
has two measurements on the variable, one while taking active
treatment and one while taking placebo.

 The individuals in each sample may be different, but are linked to each other in some way. For example, patients in one group may be individually matched to patients in the other group in a case-control study.
 Such data are known as paired data. It is important to take
account of the dependence between the two samples when
analyzing the data, otherwise the advantages of pairing are lost.
We do this by considering the differences in the values for each
pair, thereby reducing our two samples to a single sample of
differences.

 The paired t-test

 Assumptions
 In the population of interest, the individual differences are
Normally distributed with a given (usually unknown) variance.
 We have a reasonable sample size so that we can check the
assumption of Normality.
 Rationale
 If the two sets of measurements were the same, then we would
expect the mean of the differences between each pair of
measurements to be zero in the population of interest.
Therefore, our test statistic simplifies to a one-sample t-test on
the differences, where the hypothesized value for the mean
difference in the population is zero.

 Additional notation
 Because of the paired nature of the data, our two samples must
be of the same size, n. We have n differences, with sample
mean, d dash, and estimated standard deviation sd.
• Since the pairing is explicitly defined and thus new information added
to the data, paired data can always be analyzed with the independent
sample t-test as well, but not vice versa.

• A typical guideline to determine whether the dependent sample t-test


is the right test is to answer the following three questions:
1. Is there a direct relationship between each pair of observations (e.g.,
before vs. after scores on the same subject)?
2. Are the observations of the data points definitely not random (e.g.,
they must not be randomly selected specimen of the same
population)?
3. Do both samples have to have the same number of data points?

• If the answer is yes to all three of these questions the dependent sample t-test is the right test; otherwise use the independent sample t-test. In statistical terms the dependent samples t-test requires that the within-group variation, which is a source of measurement errors, can be identified and excluded from the analysis.
If the assumptions are not satisfied
• If the differences do not follow a Normal distribution, the
assumption underlying the t-test is not satisfied. We can either
transform the data, or use a non-parametric test such as the sign
test or Wilcoxon signed ranks test to assess whether the
differences are centered around zero.
Paired t-test:

• To interpret the result, the same table of t-values is used as for


the t-test that is used unpaired observations.

• To illustrate how the paired t-test is used, it will be performed


on the results of the nutritional survey referred to in the
following table:

• Table (I): Results of quality control exercise during a nutritional


survey (weight measurement in Kg):
Child No. | Observer A | Observer B | Difference A – B (kg)
1 | 18.6 | 17.7 | 0.9
2 | 17.1 | 14.5 | 2.6
3 | 14.3 | 12.4 | 1.9
4 | 23.2 | 20.7 | 2.5
5 | 18.4 | 16.8 | 1.6
6 | 14.9 | 14.4 | 0.5
7 | 16.6 | 14.1 | 2.5
8 | 14.8 | 17.1 | -2.3
9 | 21.5 | 21.2 | 0.3
10 | 24.6 | 21.9 | 2.7
11 | 17.4 | 16.6 | 0.8
12 | 15.7 | 13.6 | 2.1
13 | 16.1 | 14.5 | 1.6
14 | 12.9 | 11.2 | 1.7
15 | 12.3 | 16.0 | -3.7
16 | 19.4 | 20.4 | -1.0
17 | 19.3 | 17.5 | 1.8
18 | 24.8 | 22.2 | 2.6
19 | 14.3 | 15.1 | -0.8
20 | 13.4 | 10.9 | 2.5
• The null hypothesis in this study is that if observer A and B
measured all the children in the population form which these 20
children were sampled, there would, on average, be no
difference between their measurements. In other words, the
mean difference between A and B would be zero.

• We can regard this set of 20 differences (the A – B column) as a sample from the population of differences that would have been obtained if the observers had measured the whole population.

• To perform the significance test the value of t has to be


calculated and compared to the theoretical value in the t-table
to determine the probability that the result occurred by chance.
• This is done as follows:
1. Calculate the mean difference in the sample. This is the sum
of the differences divided by the number of measurements:
Mean difference = 1.04

2. Calculate the standard deviation of the differences. Standard


deviation = 1.77

3. Calculate the standard error: Standard error = standard


deviation / √n = 1.77 / √20 = 0.40.

4. The value of t is the mean difference divided by the standard


error: t = 1.04 / 0.40 = 2.60
5. Refer to tables of t-values.
• The number of degrees of freedom is the sample size (the number of pairs of observations) minus 1, which in this case is 20 – 1 = 19. The t-value from the table at 19 d.f. and p = 0.05 is 2.09, which is less than the calculated t-value (2.60). Then the difference is significant.

• The probability from the table is < 0.05 which allows us to


conclude that there is a significant difference between the
observers.

• By using SPSS, the calculated t-value =2.626, P value = 0.017.
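Because the paired t-test is simply a one-sample t-test on the differences, the Table (I) result can be checked in Python from the A – B column alone (a sketch; the lecture uses SPSS):

from scipy.stats import ttest_1samp

differences = [0.9, 2.6, 1.9, 2.5, 1.6, 0.5, 2.5, -2.3, 0.3, 2.7,
               0.8, 2.1, 1.6, 1.7, -3.7, -1.0, 1.8, 2.6, -0.8, 2.5]

# Test whether the mean difference is zero (the null hypothesis)
t, p = ttest_1samp(differences, popmean=0)
print(t, p)   # about 2.63 with 19 d.f., p about 0.017 - the SPSS result above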

• Exercise (1): the effect of vitamin B12 was studied in two matched groups of 20 patients each, with the following haemoglobin (Hb) levels:
Serial no. | Hb1 (x1) | Hb2 (x2)
1 | 15.5 | 7
2 | 13 | 6
3 | 9 | 10
4 | 8 | 11
5 | 7 | 13
6 | 15 | 9
7 | 12.5 | 8
8 | 16 | 10
9 | 10 | 6
10 | 8 | 7
11 | 10 | 11
12 | 13 | 7
13 | 15 | 8
14 | 15.5 | 12
15 | 13.5 | 11
16 | 10.5 | 5
17 | 9 | 7
18 | 8 | 8
19 | 7 | 6
20 | 6 | 7
• Test whether there is a significant difference between the mean haemoglobin levels of the two groups.
Answer:
• First, calculate the mean difference between the groups: d dash (mean difference) = Σ(x1 – x2)/n. From the table below: d dash = 52.5/20 = 2.625
• (Sd)² = [Σd² – n(d dash)²] / (n – 1) = [417.25 – 20 X (2.625)²] / (20 – 1) = 14.71
• Sd = √14.71 = 3.835
• Standard error = standard deviation / √n = 3.835 / √20 = 0.857
• t-value = mean difference (d dash) / standard error = 2.625 / 0.857 = 3.063
• Theoretical t-value from the table at p = 0.05 and d.f. = n – 1 = 19 is 2.09, which is less than the calculated t-value (3.063). Then the difference between the two groups is significant.
• By using SPSS, the calculated t-value = 3.06, P value = 0.006.
Serial no. | Hb1 (x1) | Hb2 (x2) | d = x1 – x2 | d²
1 | 15.5 | 7 | 8.5 | 72.25
2 | 13 | 6 | 7 | 49
3 | 9 | 10 | -1 | 1
4 | 8 | 11 | -3 | 9
5 | 7 | 13 | -6 | 36
6 | 15 | 9 | 6 | 36
7 | 12.5 | 8 | 4.5 | 20.25
8 | 16 | 10 | 6 | 36
9 | 10 | 6 | 4 | 16
10 | 8 | 7 | 1 | 1
11 | 10 | 11 | -1 | 1
12 | 13 | 7 | 6 | 36
13 | 15 | 8 | 7 | 49
14 | 15.5 | 12 | 3.5 | 12.25
15 | 13.5 | 11 | 2.5 | 6.25
16 | 10.5 | 5 | 5.5 | 30.25
17 | 9 | 7 | 2 | 4
18 | 8 | 8 | 0 | 0
19 | 7 | 6 | 1 | 1
20 | 6 | 7 | -1 | 1
Total | | | 52.5 | 417.25
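The same exercise can be checked with scipy's paired-samples function, which takes the two columns directly (a sketch only; the lecture uses SPSS):

from scipy.stats import ttest_rel

hb1 = [15.5, 13, 9, 8, 7, 15, 12.5, 16, 10, 8,
       10, 13, 15, 15.5, 13.5, 10.5, 9, 8, 7, 6]
hb2 = [7, 6, 10, 11, 13, 9, 8, 10, 6, 7,
       11, 7, 8, 12, 11, 5, 7, 8, 6, 7]

t, p = ttest_rel(hb1, hb2)   # paired t-test on the 20 matched pairs
print(t, p)                  # about 3.06 with 19 d.f., p about 0.006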
Exercise (2): the following are matched pairs of weights of 10 children (5 years old) measured by two observers. Test whether there is a significant difference between the measurements of the two observers.

Serial no. | Observer 1 | Observer 2
1 | 11 | 5
2 | 10 | 4
3 | 5 | 8
4 | 6 | 7
5 | 7 | 9
6 | 9 | 11
7 | 4 | 5
8 | 5 | 6
9 | 9 | 7
10 | 11 | 10
Serial no. | Observer 1 | Observer 2 | d = x1 – x2 | d²
1 | 11 | 5 | 6 | 36
2 | 10 | 4 | 6 | 36
3 | 5 | 8 | -3 | 9
4 | 6 | 7 | -1 | 1
5 | 7 | 9 | -2 | 4
6 | 9 | 11 | -2 | 4
7 | 4 | 5 | -1 | 1
8 | 5 | 6 | -1 | 1
9 | 9 | 7 | 2 | 4
10 | 11 | 10 | 1 | 1
Total | | | 5 | 97
• Mean difference (d dash) = 5/10 = 0.5
• (Sd)² = [Σd² – n(d dash)²] / (n – 1) = [97 – 10 X (0.5)²] / 9 = 94.5/9 = 10.5
• Sd = √10.5 = 3.24
• Standard error = Sd/√n = 3.24/√10 = 3.24/3.16 = 1.025
• t-value = d dash / standard error = 0.5 / 1.025 = 0.49
• Theoretical t-value from the table at d.f. (n – 1) = 10 – 1 = 9 and p-value 0.05 is 2.26, which is more than the calculated t-value (0.49). Then the difference is not significant.
• By using SPSS, the calculated t-value = 0.488, P value = 0.637.


Exercise (3): the following are matched pairs of weights of 10 children (5 years old). Test whether there is a significant difference between the two groups.

Serial no. | Group A | Group B | d = x1 – x2 | d²
1 | 10 | 12 | -2 | 4
2 | 12 | 11 | 1 | 1
3 | 11 | 10 | 1 | 1
4 | 13 | 13 | 0 | 0
5 | 8 | 14 | -6 | 36
6 | 7 | 12 | -5 | 25
7 | 11 | 15 | -4 | 16
8 | 14 | 16 | -2 | 4
9 | 15 | 11 | 4 | 16
10 | 16 | 14 | 2 | 4
Total | | | -11 | 107
The answer:
• Mean difference (d dash) = –11/10 = –1.1 (absolute value 1.1)
• Sd = √{[Σd² – n(d dash)²] / (n – 1)} = √{[107 – 10 X (1.1)²] / 9} = √(94.9/9) = 3.25
• Standard error = Sd/√n = 3.25/√10 = 3.25/3.16 = 1.03
• t-value = d dash / standard error = 1.1/1.03 = 1.07
• d.f. = 10 – 1 = 9
• Theoretical t-value from the table at d.f. = 9 and p-value 0.05 is 2.26, which is more than the calculated t-value (1.07). Then the difference is not significant.
• By using SPSS, the calculated t-value = 1.071, P value = 0.312.
Analyze --- Compare means ----- Paired samples t test
ONE-SAMPLE T-TEST
 The paired t-test is a special case of the one-sample t-test, which tests whether a sample mean is different from a specified value, µ, which need not be zero.
 The general formula is:
 t-value = (X dash – µ) / (S / √n)
 d.f. = n – 1
The problem
• We have a sample from a single group of individuals and one
numerical or ordinal variable of interest. We are interested in
whether the average of this variable takes a particular value.
• For example, we may have a sample of patients with a specific
medical condition. We have been monitoring triglyceride levels in
the blood of healthy individuals and know that they have a
geometric mean of 1.74 mmol/L. We wish to know whether the
average level in our patients is the same as this value.
The one-sample t-test
• Assumptions
• In the population, the variable is Normally distributed with a
given (usually unknown) variance. In addition, we have taken a
reasonable sample size so that we can check the assumption of
Normality.
Rationale
• We are interested in whether the mean of the variable in the population of interest differs from some hypothesized value, µ. We use a test statistic that is based on the difference between the sample mean, X dash, and µ. Assuming that we do not know the population variance, this test statistic, often referred to as t, follows the t-distribution.
• If we do know the population variance, or the sample size is
very large, then an alternative test (often called a z-test), based
on the Normal distribution, may be used. However, in these
situations, results from either test are virtually identical.

Additional notation
• Our sample is of size n and the estimated standard deviation is
s.
• Interpretation of the confidence interval
• The 95% confidence interval provides a range of values in which we are 95% certain that the true population mean lies. If the 95% confidence interval does not include the hypothesized value for the mean, µ, we reject the null hypothesis at the 5% level. If, however, the confidence interval includes µ, then we fail to reject the null hypothesis at that level.

If the assumptions are not satisfied
• We may be concerned that the variable does not follow a
Normal distribution in the population. Whereas the t-test is
relatively robust to some degree of non-Normality, extreme
skewness may be a concern.
• Example (one sample t-test): the following are
weights of 10 five-year old Libyan children with
measles: 9 – 11 – 12 – 8 – 13 – 7 – 14 – 6 – 5 – 15. If
we know that reference weight for 5 year old Libyan
children is 14 kg.

• Test whether the above sample suggests that 5 year old children with measles differ in their weights from the reference weight (14 kg) or not.

• The null hypothesis in this condition states that the


mean weight of 5 year old children is 14 kg.
Serial | Weight of children | X – X dash | (X – X dash)²
1 | 9 | 9 – 10 = -1 | 1
2 | 11 | 11 – 10 = 1 | 1
3 | 12 | 12 – 10 = 2 | 4
4 | 8 | 8 – 10 = -2 | 4
5 | 13 | 13 – 10 = 3 | 9
6 | 7 | 7 – 10 = -3 | 9
7 | 14 | 14 – 10 = 4 | 16
8 | 6 | 6 – 10 = -4 | 16
9 | 5 | 5 – 10 = -5 | 25
10 | 15 | 15 – 10 = 5 | 25
Total | 100 | | 110
• The mean weight of children with measles = 100/10 = 10.
• Standard deviation = square root of [Σ(X – X dash)²/(n – 1)] = √(110/9) = 3.5
• Standard error = SD/√n = 3.5/√10 = 1.11
• t = (X dash – 14)/SE = (10 – 14)/1.11 = –3.6
• Theoretical t-value from the table at d.f. = 9 and p-value 0.05 is 2.26, which is less than the absolute calculated t-value (3.6).
• Then, the weight of children with measles is significantly lower than the reference weight for Libyan children.
• By using SPSS, the calculated t-value = 3.618, P value = 0.006.
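A sketch of this one-sample test in Python (the lecture uses SPSS):

from scipy.stats import ttest_1samp

weights = [9, 11, 12, 8, 13, 7, 14, 6, 5, 15]

# Compare the sample mean (10) with the reference weight of 14 kg
t, p = ttest_1samp(weights, popmean=14)
print(t, p)   # about -3.6 with 9 d.f., p about 0.006 - the SPSS result above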
Example (one-sample t-test): the following are marks of 10 children. Test whether their mean mark differs from a reference value of 60.

Serial | Mark (X) | X – X dash | (X – X dash)²
1 | 54 | 54 – 52 = 2 | 4
2 | 65 | 65 – 52 = 13 | 169
3 | 41 | 41 – 52 = -11 | 121
4 | 52 | 52 – 52 = 0 | 0
5 | 47 | 47 – 52 = -5 | 25
6 | 51 | 51 – 52 = -1 | 1
7 | 43 | 43 – 52 = -9 | 81
8 | 66 | 66 – 52 = 14 | 196
9 | 60 | 60 – 52 = 8 | 64
10 | 41 | 41 – 52 = -11 | 121
Total | 520 | | 782
• The mean of the marks = 520/10 = 52.
• Standard deviation = square root of [Σ(X – X dash)²/(n – 1)] = √(782/9) = 9.321
• Standard error = SD/√n = 9.321/√10 = 2.948
• t = (X dash – 60)/SE = (52 – 60)/2.948 = –2.714
• Theoretical t-value from the table at d.f. = 9 and p-value 0.05 is 2.26, which is less than the absolute calculated t-value (2.714).
• Then, the marks of the children are significantly lower than the reference value (p < 0.05).
• By using SPSS, the calculated t-value = 2.714, P value = 0.024.
One sample t test
• APA Style write-up for Independent Sample t-test

• An independent sample t test reported a significant difference


in salary drawn by male and female employees, t (344.26) =
11.69, p <.001, 95% C.I. [$12,816.73 - $ 18,002.99]. The male
employees are drawing on an average higher salary (M =
$41,441.78, SD = $19,499.21) as compared to female (M = $
26,031.92, SD = $7,558.021) employees.

• APA Style write-up for Paired Sample t-test

• A paired sample t-test suggested that there had been significant


increase in the salary of employees (M = $34,419.57, SD =
$17,075.66) since they joined the company (M = $17,016.09 , SD
= $7870.64 ), t (473) = 35.04, p<.001.
Comparison of means from several groups:
analysis of variance (ANOVA)
INTRODUCTION
• When our exposure variable has more than two categories, we
often wish to compare the mean outcomes from each of the
groups defined by these categories. For example, we may wish
to examine how haemoglobin measurements collected as part
of a community survey vary with age groups and sex, and to
see whether any sex difference is the same for all age groups.
We can do this using analysis of variance.
INTRODUCTION
• In general this will be done using a computer package, but we
include details of the calculations for the simplest case, that of
one-way analysis of variance, as these are helpful in
understanding the basis of the methods. Analysis of variance
may be seen as a special case of multiple regression.
• We start with one-way ANOVA, which is appropriate when
the subgroups to be compared are defined by just one
exposure, for example in the comparison of means between
different socioeconomic or ethnic groups (there is only one
dependent variable and one independent variable).

• Two-way ANOVA is also described and is appropriate when


the subdivision is based on two factors such as age groups
and sex (there is only one dependent variable and two
independent variable [e.g. age and sex]).

• The methods can be extended to the comparison of


subgroups cross-classified by more than two factors.
ONE-WAY ANALYSIS OF VARIANCE

• One-way ANOVA is used to compare the mean of a numerical


outcome variable in the groups defined by an exposure level
with two or more categories.

• It is called one-way as the exposure groups are classified by


just one variable (one independent variable).

• The method is based on assessing how much of the overall


variation in the outcome (dependent variable) is attributable
to differences between the exposure group means: hence the
name ANOVA.
Example
• The following table shows the mean weight of individuals
according to group.

• We start by considering the variance of all the observations,


ignoring their subdivision into groups. i.e. total sum of squares.

• Recall that the variance is the square of the standard deviation,


and equals the sum of squared deviations of the observations
about the overall mean divided by the degrees of freedom:
• Variance, s² = Σ(x – x dash)²/(n – 1)
• Variance = (standard deviation)²
• SD = square root of the variance
• One-way ANOVA divides this sum of squares (SS = Σ(x – x dash)²) into two distinct components:

1. The sum of squares due to differences between the group


means (between groups sum of squares).
2. The sum of squares due to differences between the
observations within each group. This is also called the residual
sum of squares (within groups sum of squares).

• The total degrees of freedom (n - 1) are similarly divided. The


between-groups SS has (k - 1) d.f:, and the residual SS has (n – k)
d.f:, where k is the number of groups, and n is the number of
observations.
• The fourth column of the table gives the amount of variation per
degree of freedom, and this is called the mean square (MS). The
test of the null hypothesis that the mean outcome does not
differ between exposure groups is based on a comparison of the
between-groups and within-groups mean squares.

• If the observed differences in mean weight for the different


groups were simply due to chance, the variation between these
group means would be about the same size as the variation
between individuals with the same type, while if they were real
differences the between-groups variation would be larger.

• The mean squares are compared using the F test, sometimes


called the variance-ratio test.
• F = Between-groups MS / Within-groups MS

Degrees of freedom (d.f.) between groups = k – 1,
Degrees of freedom (d.f.) within groups = n – k,
where
n = is the total number of observations and
k = is the number of groups.
One-way ANOVA:
Differences in weight between individuals in three groups (first, second and third).

Grand mean = 79

a) Data:

Group | No. of patients | Mean | SD | Individual values
First | 5 | 79 | 3.16 | 75 – 77 – 79 – 81 – 83
Second | 5 | 84 | 3.16 | 80 – 82 – 84 – 86 – 88
Third | 5 | 74 | 3.16 | 70 – 72 – 74 – 76 – 78
Total sum of squares (TSS) = Σ(x – grand mean)² = 370

Group | Individual value (x) | x – grand mean | (x – grand mean)²
First | 75 | 75 – 79 = -4 | 16
First | 77 | 77 – 79 = -2 | 4
First | 79 | 79 – 79 = 0 | 0
First | 81 | 81 – 79 = 2 | 4
First | 83 | 83 – 79 = 4 | 16
Second | 80 | 80 – 79 = 1 | 1
Second | 82 | 82 – 79 = 3 | 9
Second | 84 | 84 – 79 = 5 | 25
Second | 86 | 86 – 79 = 7 | 49
Second | 88 | 88 – 79 = 9 | 81
Third | 70 | 70 – 79 = -9 | 81
Third | 72 | 72 – 79 = -7 | 49
Third | 74 | 74 – 79 = -5 | 25
Third | 76 | 76 – 79 = -3 | 9
Third | 78 | 78 – 79 = -1 | 1
Total | | | 370
Within-groups sum of squares (RSS) = Σ(x – group mean)² = 120

Group | Individual value (x) | x – group mean | (x – group mean)²
First | 75 | 75 – 79 = -4 | 16
First | 77 | 77 – 79 = -2 | 4
First | 79 | 79 – 79 = 0 | 0
First | 81 | 81 – 79 = 2 | 4
First | 83 | 83 – 79 = 4 | 16
Second | 80 | 80 – 84 = -4 | 16
Second | 82 | 82 – 84 = -2 | 4
Second | 84 | 84 – 84 = 0 | 0
Second | 86 | 86 – 84 = 2 | 4
Second | 88 | 88 – 84 = 4 | 16
Third | 70 | 70 – 74 = -4 | 16
Third | 72 | 72 – 74 = -2 | 4
Third | 74 | 74 – 74 = 0 | 0
Third | 76 | 76 – 74 = 2 | 4
Third | 78 | 78 – 74 = 4 | 16
Total | | | 120
Between-groups sum of squares = Σ(group mean – grand mean)² = 250
(each observation contributes one term equal to its group's squared deviation, so each group contributes 5 identical terms)

Group | Group mean | Group mean – grand mean | (group mean – grand mean)² | Contribution (X 5 observations)
First | 79 | 79 – 79 = 0 | 0 | 0
Second | 84 | 84 – 79 = 5 | 25 | 125
Third | 74 | 74 – 79 = -5 | 25 | 125
Total | | | | 250
b) Calculation:
 n = 5 + 5 + 5 = 15, number of groups (k) = 3
 Total sum of squares = 370, d.f. = n – 1 = 15 – 1 = 14
 Between-groups sum of squares = 250, d.f. = k – 1 = 3 – 1 = 2
 Within-groups sum of squares = 120, d.f. = n – k = 15 – 3 = 12
 Between-groups mean square = 250/2 = 125
 Within-groups mean square = 120/12 = 10
 F ratio = between-groups mean square / within-groups mean square = 125/10 = 12.5
c) Analysis of variance:

Of the total sum of squares (370), 250 (67.6%) is attributable to between-group variation (between-groups sum of squares) and 120 (32.4%) to within-group variation (within-groups or residual sum of squares).

Source of variation | SS | d.f. | MS = SS/d.f. | F = Between-groups MS / Within-groups MS
Between groups | 250 | 2 | 125 | F = 125/10 = 12.5, P < 0.05
Within groups | 120 | 12 | 10 |
Total | 370 | 14 | |
 F should be about 1 if there are no real differences between the
groups and larger than 1 if there are differences.

 Under the null hypothesis that the between group differences


are simply due to chance, this ratio follows an F distribution
which, in contrast to most distributions, is specified by a pair of
degrees of freedom: (k – 1) degrees of freedom in the numerator
and (n – k) in the denominator.

 P-values for the corresponding test of the null hypothesis (that


mean weight do not differ according to group) are reported by
statistical computer packages
 In the previous table, F = 125/10 = 12.5.

 The critical or theoretical f value from the table of f distribution at d.f. 2


for the numerator (column) and 12 for the denominator (row) is 3.89.
 The calculated f value (12.5) is greater than the critical or theoretical f
value (3.89), then the difference is significant ( P value < 0.05).

 There is thus strong evidence that mean weight levels differ between
the three groups (first, second and third), the mean being lowest for
the third group (74), intermediate for the first group (79), and highest
for the second group (84).

 To identify the source of differences, conduct a post-hoc analysis.
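A sketch of this one-way ANOVA in Python; scipy's f_oneway reproduces the F ratio of 12.5 from the raw values (the lecture itself uses SPSS):

from scipy.stats import f_oneway

first  = [75, 77, 79, 81, 83]
second = [80, 82, 84, 86, 88]
third  = [70, 72, 74, 76, 78]

F, p = f_oneway(first, second, third)
print(F, p)   # F = 12.5 with (2, 12) d.f., p about 0.001 - significant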
Rationale
• The one-way analysis of variance separates the total variability
in the data into that which can be attributed to differences
between the individuals from the different groups (the
between-group variation), and to the random variation
between the individuals within each group (the within group
variation, sometimes called unexplained or residual variation).
These components of variation are measured using variances,
hence the name analysis of variance (ANOVA).
• Under the null hypothesis that the group means are the same,
the between-group variance will be similar to the within-group
variance.
• If, however, there are differences between the groups, then the
between-group variance will be larger than the within-group
variance. The test is based on the ratio of these two variances.
Procedures:
1. Define the null and alternative hypotheses under study
 H0: all group means in the population are equal
 H1: at least one group mean in the population differs from the
others.

2. Collect relevant data from samples of individuals

3. Calculate the value of the test statistic specific to Ho


 The test statistic for ANOVA is a ratio, F, of the between group
variance to the within-group variance. This F statistic follows the
F-distribution with (k - I), (n - k) degrees of freedom in the
numerator and denominator, respectively.
4. Compare the value of the test statistic to values from a known
probability distribution
 Refer to the F-ratio (Appendix). Because the between group
variation is > the within-group variation, we look at the one-
sided P-values.

5. Interpret the P-value and results


 If we obtain a significant result at this initial stage, we may
consider performing specific pair-wise post-hoc comparisons. We
can use one of a number of special tests devised for this purpose
(e.g. Duncan's, Scheffe's).
 Assumptions
 There are two assumptions underlying the analysis of variance
and corresponding F test.

1. The first is that the outcome is normally distributed.

2. The second is that the population value for the standard


deviation between individuals is the same in each exposure
group. This is estimated by the square root of the within-groups
mean square. This is called equal variances or homogeneity of
variances.

 Moderate departures from normality may be safely ignored, but


the effect of unequal standard deviations may be serious. In the
latter case, transforming the data may help.
If the assumptions are not satisfied

• Although ANOVA is relatively robust to moderate departures from


Normality, it is not robust to unequal variances.

• Therefore, before carrying out the analysis, we check for


Normality, and test whether the variances are similar in the
groups either by eyeballing them, or by using Levene's test.

• If the assumptions are not satisfied, we can either transform the


data, or use the non-parametric equivalent of one-way ANOVA,
the Kruskal-Wallis test.

• We can perform one way ANOVA with unequal variances using


Brown Forsythe test or Welch test.
• Relationship with the unpaired t test

• When there are only two groups, the one-way analysis of


variance gives exactly the same results as the t test.

• The F statistic (with 1, n – 2 degrees of freedom) exactly equals


the square of the corresponding t statistic (with n – 2 degrees of
freedom), and the corresponding P-values are identical.
TWO-WAY ANALYSIS OF VARIANCE

• Two-way analysis of variance is used when the data are


classified in two ways, for example by age-group and sex. There
is one dependent variable (weight) and two independent
variables (e.g. age groups and sex).

• The data are said to have a balanced design if there are equal
numbers of observations in each group and an unbalanced
design if there are not.

• Balanced designs are of two types, with replication if there is


more than one observation in each group and without
replication if there is only one.
Balanced design with replication

• The following table shows the results from an experiment in


which five male and five female rats of each of three strains
were treated with growth hormone. The aims were to find out
whether the strains responded to the treatment to the same
extent, and whether there was any sex difference. The
measure of response was weight gain after seven days.

• These data are classified in two ways, by strain and by sex. The
design is balanced with replication because there are five
observations in each strain–sex group.
• Number of groups = sex(2) X strain (3) = 6.
Mean weight gains in grams with standard deviations in parentheses (n = 5 for each group).

Sex / strain | A | B | C
Male | 11.9 (0.9) | 12.1 (0.7) | 12.2 (0.7)
Female | 12.3 (1.1) | 11.8 (0.6) | 13.1 (0.9)


• Two-way ANOVA divides the total sum of squares into four
components:

1. The sum of squares due to differences between the strains. This is


said to be the main effect of the factor, strain. Its associated degrees
of freedom are one less than the number of strains and equal 2.

2. The sum of squares due to differences between the sexes, that is the
main effect of sex. Its degrees of freedom equal 1, one less than the
number of sexes.

3. The sum of squares due to the interaction between strain and sex. An
interaction means that the strain differences are not the same for
both sexes and, equivalently, that the sex difference is not the same
for the three strains. The degrees of freedom equal the product of
the degrees of freedom of the two main effects, which is 2 x 1 = 2.
4. The residual sum of squares due to differences between the rats
within each strain–sex group. Its degrees of freedom equal 24,
the product of the number of strains (3), the number of sexes (2)
and the number of observations in each group minus one (4).

• The null hypotheses of no main effect of the two exposures and


of no interaction are examined by using the F test to compare
their mean squares with the residual mean square, as described
for one-way analysis of variance.

• No evidence of any association was obtained in this experiment


(P value is > 0.05).
Analysis of variance:

Source of variation | SS | d.f. | MS | F = Between-groups MS / Within-groups MS
Main effects:
Strain | 2.63 | 2 | 1.32 | F = 1.32/0.7 = 1.9, p = 0.17
Sex | 1.16 | 1 | 1.16 | F = 1.16/0.7 = 1.7, p = 0.2
Interaction:
Strain x sex | 1.65 | 2 | 0.83 | F = 0.83/0.7 = 1.2, p = 0.32
Residual | 16.85 | 24 | 0.70 |
Total | 22.3 | 29 | |
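A sketch of a two-way ANOVA with interaction in Python's statsmodels (the lecture uses SPSS). The raw rat data are not reproduced in the lecture, so the weight gains below are hypothetical, arranged as a balanced design with replication:

import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Hypothetical balanced data: 2 observations per strain-sex cell
df = pd.DataFrame({
    'gain':   [11.9, 12.4, 12.3, 11.8, 12.1, 12.6,
               11.8, 12.0, 12.2, 12.5, 13.1, 12.9],
    'strain': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C'],
    'sex':    ['M', 'M', 'F', 'F', 'M', 'M', 'F', 'F', 'M', 'M', 'F', 'F'],
})

# 'C(strain) * C(sex)' fits both main effects and their interaction
model = ols('gain ~ C(strain) * C(sex)', data=df).fit()
print(sm.stats.anova_lm(model, typ=2))   # SS, d.f., F and p for each term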
Naming ANOVA
• All ANOVAs have two things in common: they involve some quantity of
independent variables and these variables can be measured using
either the same or different participants. If the same participants are
used we typically use the term repeated measures and if different
participants are used we use the term independent.

• When there are two or more independent variables, it’s possible that
some variables use the same participants whereas others use different
participants. In this case we use the term mixed.

• When we name an ANOVA, we are simply telling the reader how many
independent variables we used and how they were measured. In
general terms we could write the name of an ANOVA as:
 A (number of independent variables) way of how these variables were
measured ANOVA.
Naming ANOVA
• By remembering this you can understand the name of any ANOVA you
come across. Look at these examples and try to work out how many
variables were used and how they were measured:
 One-way independent ANOVA
 Two-way repeated-measures ANOVA
 Two-way mixed ANOVA
 Three-way independent ANOVA

The answers you should get are:


 One independent variable measured using different participants.
 2 independent variables both measured using the same participants.
 Two independent variables: one measured using different participants
and the other measured using the same participants.
 Three independent variables all of which are measured using different
participants.
ANOVA
How to report the results
What to do when assumptions are violated
A - Reporting results from one-way independent ANOVA
• When we report an ANOVA, we have to give details of the F-ratio and the
degrees of freedom from which it was calculated. For the experimental effect
in these data the F-ratio was derived by dividing the mean squares for the
effect by the mean squares for the residual.

• Therefore, the degrees of freedom used to assess the F-ratio are the degrees
of freedom for the effect of the model (dfM = 2) and the degrees of freedom
for the residuals of the model (dfR = 12).

• Therefore, the correct way to report the main finding would be:
 There was a significant effect of Viagra on levels of libido, F(2, 12) = 5.12, p
< .05, ω = .60.

• Notice that the value of the F-ratio is preceded by the values of the degrees of
freedom for that effect. Also, we rarely state the exact significance value of
the F-ratio: instead we report that the significance value, p, was less than the
criterion value of .05 and include an effect size measure.
Input data
Analysis – compare means – one way ANOVA
Write variables in the windows
Select options
Post hoc multiple comparisons (pair-wise comparisons)
ANOVA table
Test of homogeneity of variances (result of Levene's test)
Result of post-hoc multiple comparison
Post hoc pairwise comparison tests
Post hoc procedures
 Post hoc tests consist of pairwise comparisons that are designed
to compare all different combinations of the treatment groups. So,
it is rather like taking every pair of groups and then performing a
t-test on each pair of groups.

 Now, this might seem like a particularly stupid thing to say in the
light of what I have already told you about the problems of
inflated familywise error rates.

 However, pairwise comparisons control the family-wise error by


correcting the level of significance for each test such that the
overall Type I error rate (α) across all comparisons remains at .05.
Post hoc procedures
• There are several ways in which the familywise error rate can be
controlled. The most popular (and easiest) way is to divide α by
the number of comparisons, thus ensuring that the cumulative
Type I error is below .05.
• Therefore, if we conduct 10 tests, we use .005 as our criterion for
significance.
• This method is known as the Bonferroni correction.

• There is a trade-off for controlling the familywise error rate, and that is a loss of statistical power. This means that the probability of missing an effect that does actually exist is increased (this is called a Type II error). By being more conservative in the Type I error rate for each comparison, we increase the chance that we will miss a genuine difference in the data.
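A sketch of the Bonferroni correction in Python; statsmodels applies it to a set of raw pairwise p-values (the values below are hypothetical):

from statsmodels.stats.multitest import multipletests

raw_p = [0.001, 0.010, 0.020, 0.040]   # hypothetical pairwise p-values

reject, p_adj, _, _ = multipletests(raw_p, alpha=0.05, method='bonferroni')
# With 4 comparisons each adjusted p is min(p * 4, 1),
# equivalent to testing each raw p against .05/4 = .0125
print(list(zip(p_adj, reject)))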
Post hoc procedures
 Therefore, when considering which post hoc procedure to use we
need to consider three things:

(1) does the test control the Type I error rate;

(2) does the test control the Type II error rate (i.e. does the test have
good statistical power); and

(3) is the test reliable when the test assumptions of ANOVA have
been violated?

 SPSS provides no less than 18 post hoc procedures. It is important


that you know which post hoc tests perform best according to the
aforementioned criteria.
Post hoc procedures and Type I (α) and
Type II error rates
• The Type I error rate and the statistical power of a test are linked.
Therefore, there is always a trade-off: if a test is conservative (the
probability of a Type I error is small) then it is likely to lack
statistical power (the probability of a Type II error will be high).

• Therefore, it is important that multiple comparison procedures


control the Type I error rate but without a substantial loss in
power.

• If a test is too conservative then we are likely to reject differences


between means that are, in reality, meaningful.
Post hoc procedures and violations of test assumptions
• Most research on post hoc tests has looked at whether the test
performs well when the group sizes are different (an unbalanced
design), when the population variances are very different, and when
data are not normally distributed.

• The good news is that most multiple comparison procedures perform


relatively well under small deviations from normality. The bad news
is that they perform badly when group sizes are unequal and when
population variances are different.
Post hoc procedures and violations of test assumptions

• There are several multiple comparison procedures that have been


specially designed for situations in which population variances
differ.

• SPSS provides four options for this situation: Tamhane’s T2,


Dunnett’s T3, Games–Howell and Dunnett’s C. Tamhane’s T2 is
conservative and Dunnett’s T3 and C keep very tight Type I error
control.

• The Games–Howell procedure is the most powerful but can be


liberal when sample sizes are small. However, Games–Howell is
also accurate when sample sizes are unequal.
• SPSS divides the post hoc multiple comparison tests
into two groups, when equal variances assumed and
when equal variances not assumed.

• Precautions which should be considered when


choosing a suitable post hoc (pair-wise) analysis
following significant ANOVA test:

1. does the test control the Type I error rate;


2. does the test control the Type II error rate (i.e. does
the test have good statistical power);
3. is the test reliable when the test assumptions of
ANOVA have been violated?
• The least significant difference (LSD) pairwise multiple
comparison test following ANOVA is characterized by:
• It is equivalent to multiple individual t tests between all
pairs of groups.
• The disadvantage of this test is that no attempt is made
to adjust the observed significance level for multiple
comparisons.
• Works best if you have just three groups, but if you
have > 3 groups, then using LSD may increase chances
of type I error. So if you have just 3 repeated measures
use Fisher’s LSD.
• However, if you have just three groups, LSD is 8% more
powerful than Tukey HSD
• Bonferroni correction pairwise multiple comparison following ANOVA is characterized by:
1. It corrects the Type I error rate by dividing .05 by the number of comparisons.
2. Bonferroni's correction should not be used if you have more than five groups.
3. As it overcorrects (hence it is very conservative), it should be avoided where you may commit a Type II error (failing to detect significance when it is present).
Tukey's test
• It is preferred when you will perform all possible comparisons between a large set of means (six or more groups).
• It can be used only when you have equal numbers of observations in each group.
• It requires equal variances to be assumed between groups.
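As a sketch, Tukey's HSD can be run in Python with statsmodels, here reusing the weights from the one-way ANOVA example earlier in this section (the lecture performs post-hoc tests in SPSS):

import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

weights = np.array([75, 77, 79, 81, 83,    # first group
                    80, 82, 84, 86, 88,    # second group
                    70, 72, 74, 76, 78])   # third group
groups  = ['first'] * 5 + ['second'] * 5 + ['third'] * 5

# Pairwise comparisons with the familywise error rate held at 0.05
result = pairwise_tukeyhsd(weights, groups, alpha=0.05)
print(result)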
Scheffe ‘s test
• The test works even when there is unequal number of
observations in groups
• The significance level of the Scheffé test is designed to
allow all possible linear combinations of group means to
be tested, not just pairwise comparisons available in this
feature. The result is that the Scheffé test is often more
conservative than other tests, which means that a larger
difference between means is required for significance.
• It requires equal variance to be assumed between
groups
Tukey's-b
• Also called Tukey's WSD (wholly significant difference); it is a compromise between the more liberal Newman–Keuls test and the more conservative Tukey's HSD.
S-N-K (Student–Newman–Keuls test)
• It seems the most popular of all post-hoc tests. It uses the concept of steps, in which all means are arranged in ascending or descending order. For example, 1.2, 2.4, 3.8 and 3.75 are four ordered means; moving from 1.2 to 2.4 is just one step, while comparing 1.2 with 3.75 is three steps.
• So, the critical value for each comparison is different in SNK, depending on the number of steps, while in Tukey's HSD the criterion remains constant (.05) [such tests are also known as exact tests].
• The S-N-K test pools the groups that do not differ significantly from each other. This improves the reliability of the post hoc comparison because it increases the sample size used in the comparison.
• You can use this test if you feel that using Tukey's would be too conservative. It is a somewhat liberal test.
Dunnett's test
• Use it when you have a control group. It uses a t test to compare multiple groups: Dunnett's pairwise multiple comparison t test compares a set of treatments against a single control mean.
• The last category is the default control category.
Alternatively, you can choose the first category.
• You can also choose a two-sided or one-sided test. To test
that the mean at any level (except the control category) of
the factor is not equal to that of the control category, use a
two-sided test
• To test whether the mean at any level of the factor is
smaller than that of the control category, select < Control.
• Likewise, to test whether the mean at any level of the factor
is larger than that of the control category, select > Control.
• Sidak test
• It is suitable for small number of comparisons
only.
• It is used as an alternative to Bonferroni test if
you are concerned about loss of power due to
Bonnferroni correction.
• Duncan test
• It is also called Duncan's multiple range test.
• It is a useful alternative to the LSD test, as it requires larger differences between group means to detect significance during post-hoc comparisons.
• Games Howell test:
• It is a non-parametric test for doing multiple
comparisons. It means it can be used when sample size
is unequal and homogeneity of variance is violated.
• It may be too liberal when sample size is small. Hence,
this test is not recommended when sample size is less
than 5.
• This test is more powerful than T2, T3 and C and that is
why it should be preferred over them.
• It is best known of all test for doing multiple
comparisons when homogeneity of variance is violated.
• Very commonly used and recommended.
• Tamhane’s T2:

• Although, compared to Games-Howell test, all the


three remaining tests T2, T3 and C are conservative
(least likely to detect significant difference)

• T2 is more conservative than T3 and C.
