Mean, Median, Mode and Standard Deviation
DR.TANU SHREYA
1ST YEAR MDS
INTRODUCTION
JOHN GRAUNT (1620-1674) is regarded as the father of health statistics.
Normal BP – 120/80 mm Hg
Europeans are taller than Asians
Average male adult weighs 70kgs
Drug A is better than drug B
Such statements are endless.
They cannot be arrived at from raw data alone.
Numbers tell tales: statistics is the language that adds meaning to
data and helps to interpret it,
thus lending “significance” to the study.
DESCRIPTIVE STATISTICS
STATISTICS – Is the science of compiling, classifying & tabulating numerical
data and expressing the results in a mathematical or graphical form.
OR
Statistics is the study of methods & procedures for collecting, classifying,
summarizing & analysing data & for making scientific inferences from such
data.
- Prof P.V.Sukhatme
BIOSTATISTICS
Is the branch of statistics applied to biological or medical sciences
(biometry).
OR
Is that branch of statistics concerned with mathematical facts and data
relating to biological events.
USES OF BIOSTATISTICS
To test whether the difference between two populations is real or a chance
occurrence.
To study correlation between attributes in the same population.
To evaluate the efficacy of vaccines, sera, etc.
To measure mortality and morbidity.
To evaluate achievements of public health programs.
To fix priorities in public health programs.
To help promote health legislation and create administrative standards for oral
health.
BASIS OF STATISTICAL ANALYSIS
1. The population (U)
2. The set of characteristics [variables] of the units of this population (V)
3. The probability distribution (P) of these characteristics in the population.
THE POPULATION (U)
Is a collection of units of observation that are of interest.
They are the target of investigation.
The success of an investigation depends largely on identification of population
of interest.
VARIABLE
For example, to determine the efficacy of a particular drug, one needs to
define the disease and which other characteristics of the units of U one intends to
study (age, sex, educational qualifications, etc.).
TYPES OF VARIABLES
QUALITATIVE(CATEGORICAL)
QUANTITATIVE(NUMERICAL)
FREQUENCY / PROBABILITY DISTRIBUTION (P)
The most crucial link between the population and its characteristics, which allows us to draw
inferences on the population based on sample observations.
It is a way of showing the number of observations or frequencies at different values, or how
frequently each value appears in a population.
P-value is defined as the probability under the assumption of null hypothesis, of obtaining a
result equal to or more extreme than what was actually observed.
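The phrase "equal to or more extreme" can be illustrated with a small exact calculation. The sketch below assumes a hypothetical experiment (not from the slides): a coin is flipped 10 times, 9 heads are observed, and the null hypothesis is that the coin is fair. The function name `one_sided_p_value` is invented for this illustration.

```python
from math import comb

def one_sided_p_value(heads, flips, p_heads=0.5):
    """P(X >= heads) for X ~ Binomial(flips, p_heads), i.e. assuming H0 is true."""
    return sum(
        comb(flips, k) * p_heads**k * (1 - p_heads)**(flips - k)
        for k in range(heads, flips + 1)
    )

# Hypothetical observation: 9 heads in 10 flips of a supposedly fair coin.
# "Equal to or more extreme" here means 9 heads or 10 heads.
p = one_sided_p_value(9, 10)
print(round(p, 4))  # 0.0107
```

A small P-value such as this one says that the observed result would be unusual if the null hypothesis were true.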
Arithmetic mean, formula: $\bar{X} = \frac{\sum X_i}{n}$
∑ (sigma) denotes "the sum of", $X_i$ is the value of each observation in the data, and n
is the number of observations in the data.
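As a minimal sketch of this formula, the code below sums the observations and divides by their count. The blood pressure readings are hypothetical values chosen for the example, and `arithmetic_mean` is an illustrative helper name.

```python
# Hypothetical sample: systolic blood pressure (mm Hg) of 5 patients.
readings = [118, 122, 120, 125, 115]

def arithmetic_mean(values):
    """Sum of the observations (sigma Xi) divided by their number n."""
    n = len(values)
    if n == 0:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / n

mean_bp = arithmetic_mean(readings)
print(mean_bp)  # 120.0
```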
ADVANTAGES- Easy to calculate and understand
- It is the most useful of all averages.
VARIATIONS OF MEAN
GEOMETRIC MEAN: the nth root of the product of n observations.
When the variation between the lowest and the highest value is
very high, the geometric mean is advised & preferred.
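The effect of a wide spread on the two averages can be seen in a short sketch. The titre values are hypothetical, and the helper `geometric_mean` is written out here for illustration (it assumes all observations are positive).

```python
import math

# Hypothetical antibody titres: the highest value is 64 times the lowest.
titres = [2, 8, 32, 128]

def geometric_mean(values):
    """nth root of the product of n positive observations."""
    n = len(values)
    if n == 0 or any(v <= 0 for v in values):
        raise ValueError("needs at least one observation, all positive")
    # Averaging logarithms is equivalent to taking the nth root of the
    # product, and avoids overflow when the product is very large.
    return math.exp(sum(math.log(v) for v in values) / n)

gm = geometric_mean(titres)
am = sum(titres) / len(titres)
print(round(gm, 6), am)  # 16.0 42.5
```

In this example the arithmetic mean is pulled towards the single large value, while the geometric mean stays closer to the bulk of the data.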
MODE
Is the value of the variable which occurs with the greatest frequency.
The mode is located from frequency distribution table, taking the value of
the variable with the maximum frequency.
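Locating the mode from a frequency distribution table can be sketched as follows. The data set (decayed teeth per child) is hypothetical; the sketch also allows for more than one mode, which is an addition to what is stated above.

```python
from collections import Counter

# Hypothetical data: number of decayed teeth recorded for 12 children.
decayed_teeth = [0, 2, 1, 2, 3, 2, 0, 1, 2, 4, 1, 2]

# Frequency distribution table: value -> number of occurrences.
frequency_table = Counter(decayed_teeth)

# The mode is every value that attains the maximum frequency.
max_frequency = max(frequency_table.values())
modes = sorted(v for v, f in frequency_table.items() if f == max_frequency)

for value in sorted(frequency_table):
    print(value, frequency_table[value])
print("mode:", modes)  # mode: [2]
```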
MEASURES OF DISPERSION
Dispersion is the degree of spread or variation of the variable about a central
value.
They help to know how widely the observations are spread on either side of
the average.
The 3 measures are:
1. RANGE
2. MEAN DEVIATION
3. STANDARD DEVIATION
RANGE
Simplest method.
Defined as the difference between the value of the largest item and the value of
the smallest item.
It gives no information about the values that lie between the extreme values.
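A small sketch makes the limitation concrete. Both samples below are hypothetical: they share the same smallest and largest values, so the range is identical even though the values in between are distributed very differently.

```python
# Two hypothetical samples with the same extremes but different spread.
sample_a = [10, 11, 12, 48, 49, 50]   # values clustered at the two ends
sample_b = [10, 29, 30, 30, 31, 50]   # values clustered in the middle

def value_range(values):
    """Largest observation minus the smallest observation."""
    return max(values) - min(values)

range_a = value_range(sample_a)
range_b = value_range(sample_b)
print(range_a, range_b)  # 40 40
```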
MEAN DEVIATION
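It is the average of the absolute deviations from the arithmetic mean.
It is given by,
M.D. = $\frac{\sum |X_i - \bar{X}|}{n}$
where $\bar{X}$ is the arithmetic mean, $X_i$ is each observation and n is the number of observations.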
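The calculation can be sketched as below, assuming that the absolute deviations are meant: the signed deviations from the arithmetic mean always sum to zero, so they cannot measure spread on their own. The observations are hypothetical.

```python
# Hypothetical observations.
observations = [4, 6, 8, 10, 12]

def mean_deviation(values):
    """Average of the absolute deviations from the arithmetic mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n

mean_value = sum(observations) / len(observations)

# Signed deviations cancel out, which is why absolute values are taken.
signed_sum = sum(x - mean_value for x in observations)

md = mean_deviation(observations)
print(signed_sum, md)  # 0.0 2.4
```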
STANDARD DEVIATION (ROOT MEAN SQUARE DEVIATION)
Is the most important and widely used measure of studying
dispersion.
It is the square root of the mean of the squared deviations from
arithmetic mean.
Greater the S.D. greater will be the magnitude of dispersion from
mean.
A small S.D. means higher degree of uniformity of the
observations.
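The definition above translates directly into code. The sketch uses hypothetical observations and divides by n, exactly as the definition reads; the comparison with the n - 1 (sample) version from Python's `statistics` module is an addition for context, not something stated in the slides.

```python
import math
import statistics

# Hypothetical observations.
observations = [4, 6, 8, 10, 12]

def root_mean_square_deviation(values):
    """Square root of the mean of the squared deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)

sd_population = root_mean_square_deviation(observations)  # divides by n
sd_sample = statistics.stdev(observations)                # divides by n - 1

print(round(sd_population, 4), round(sd_sample, 4))  # 2.8284 3.1623
```

The n - 1 version is commonly used when the data are a sample from a larger population, and it is always slightly larger than the n version.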
USES OF SD
1. Summarizes the deviations of a large distribution from the mean in one figure,
used as a unit of variation.
2. Indicates whether the variation of an observation from the mean is by chance or real.
3. Helps in finding the standard error, which determines whether the difference
between the means of two samples is by chance or real.
4. Helps in finding the suitable size of the sample for valid conclusions.
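How the S.D. feeds into the standard error (use 3) can be sketched with the usual textbook formulas, which are assumed here because the slides do not give them: the standard error of a mean is S.D. divided by the square root of n, and the standard error of the difference between two independent means is the square root of the summed variance/n terms. The two groups are hypothetical.

```python
import math
import statistics

# Hypothetical measurements from two independent groups.
group_a = [12, 14, 15, 13, 16, 14]
group_b = [10, 11, 13, 12, 11, 12]

def standard_error_of_mean(values):
    """Sample S.D. divided by the square root of the sample size."""
    return statistics.stdev(values) / math.sqrt(len(values))

def standard_error_of_difference(a, b):
    """Standard error of the difference between two independent means."""
    return math.sqrt(
        statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    )

se_a = standard_error_of_mean(group_a)
se_diff = standard_error_of_difference(group_a, group_b)
difference = statistics.mean(group_a) - statistics.mean(group_b)

# A difference that is large relative to its standard error is less
# likely to be a chance finding.
print(difference, round(se_diff, 4), round(difference / se_diff, 2))
```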
THE NORMAL CURVE
Also known as Normal distribution or Gaussian distribution.
When data is collected from a very large number of people and a
frequency distribution is made with narrow class intervals, the
resulting curve is smooth and symmetrical and is called the normal curve.
CHARACTERISTICS OF NORMAL CURVE
Bell shaped
Symmetrical
Mean, Mode & Median – coincide
Has two inflections – the central part is convex, while at the point of
inflection the curve changes from convexity to concavity.
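For the standard normal curve: total area under the curve = 1, mean = 0, standard deviation = 1.
Rule of thumb: if the mean is greater than 2 S.D., the values can be taken as approximately normally distributed.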
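The properties of the standard normal curve (mean 0, standard deviation 1, total area 1) can be checked numerically. This is a minimal sketch using `statistics.NormalDist` from the Python standard library; the helper `area_within` is an illustrative name, and the areas within 1, 2 and 3 standard deviations are well-known values added here for context.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Proportion of the total area lying within k S.D. of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

# Symmetry: exactly half of the area lies on each side of the mean.
half = standard_normal.cdf(0)

for k in (1, 2, 3):
    print(k, round(area_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```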
VARIATION FROM NORMAL CURVE
Skewness is the statistic that measures the asymmetry of a distribution.
For a normal (symmetrical) curve the coefficient of skewness is 0.
A skewed curve can be:
Negatively (left) skewed
Positively (right) skewed
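One common way to compute a coefficient of skewness is the moment coefficient (third central moment divided by the 1.5 power of the second). The slides do not say which coefficient is intended, so this choice is an assumption, and the data sets are hypothetical.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2 ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]           # long tail towards high values
left_skewed = [-x for x in right_skewed]     # mirror image: long tail to the left

print(skewness(symmetric))          # 0.0
print(skewness(right_skewed) > 0)   # True
print(skewness(left_skewed) < 0)    # True
```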
KURTOSIS
Kurtosis is a measure of the peakedness (height) of the distribution curve.
For a normal curve the coefficient of kurtosis is 3.
1. Mesokurtic (normal) 2. Platykurtic (flat) 3. Leptokurtic (high)
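The three categories can be sketched with the moment coefficient of kurtosis (fourth central moment divided by the square of the second), for which a normal curve gives 3. The data sets, the helper names and the tolerance of 0.1 around 3 are all assumptions made for this illustration.

```python
def kurtosis(values):
    """Moment coefficient of kurtosis: m4 / m2 ** 2 (equals 3 for a normal curve)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2

def classify(coefficient, tolerance=0.1):
    """Label a coefficient relative to 3; the tolerance is an arbitrary choice."""
    if abs(coefficient - 3) <= tolerance:
        return "mesokurtic"
    return "platykurtic" if coefficient < 3 else "leptokurtic"

flat = [1, 2, 3, 4, 5]            # evenly spread values
peaked = [0] * 8 + [-5, 5]        # most values at the centre, two far out

print(classify(kurtosis(flat)))    # platykurtic
print(classify(kurtosis(peaked)))  # leptokurtic
```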
TESTS OF SIGNIFICANCE
A statistical procedure by which one can conclude whether the results from the
sample are due to chance or not.
2 types of tests: parametric and non-parametric.
PROCEDURE OF TESTING
1. Hypothesis testing
A hypothesis is an assumption about the status of a phenomenon, or a
statement about the parameters or form of a population.
The null hypothesis is the hypothesis of no difference.
It states that there is no difference between the statistic of a sample & the parameter of the population,
or between the statistics of two samples.
This nullifies the claim that the experimental result is different from or better
than the one observed already.
2. State the alternate hypothesis
Any hypothesis alternate to the null hypothesis, which is to be tested.
Note: the alternate hypothesis is accepted when the null hypothesis is rejected.
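The two hypotheses can be put to work in a one-sample Z test. The sketch below is an illustration under stated assumptions: the population S.D. is taken as known, the sample is large, the figures are hypothetical, and the 0.05 significance level is the common convention rather than something given in the slides.

```python
import math
from statistics import NormalDist

def one_sample_z_test(sample_mean, population_mean, population_sd, n):
    """Return the z statistic and the two-sided P-value.

    H0: the sample mean is not different from the population mean.
    H1: the sample mean is different from the population mean.
    """
    standard_error = population_sd / math.sqrt(n)
    z = (sample_mean - population_mean) / standard_error
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided

# Hypothetical figures: 36 subjects with a mean systolic BP of 124 mm Hg,
# compared with a population mean of 120 mm Hg and a known S.D. of 10.
z, p = one_sample_z_test(sample_mean=124, population_mean=120,
                         population_sd=10, n=36)

alpha = 0.05  # conventional significance level (an assumption)
decision = "reject H0" if p < alpha else "fail to reject H0"
print(round(z, 2), round(p, 4), decision)
```

With these figures the P-value falls below 0.05, so the null hypothesis is rejected and the alternate hypothesis is accepted.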
FOR QUANTITATIVE DATA
COMPARISONS | HYPOTHESIS TESTED (PARAMETRIC) | PARAMETRIC TEST | HYPOTHESIS TESTED (NONPARAMETRIC) | NONPARAMETRIC TEST
1. SINGLE GROUP | SAMPLE MEAN NOT DIFFERENT FROM POPULATION MEAN | ONE-SAMPLE T-TEST (n < 30), Z TEST (n > 30) | SAMPLE MEDIAN NOT DIFFERENT FROM POPULATION MEDIAN | SIGN TEST
2. TWO INDEPENDENT GROUPS | TWO POPULATION MEANS ARE EQUAL | UNPAIRED T-TEST / INDEPENDENT-SAMPLE T-TEST | TWO POPULATION MEDIANS ARE EQUAL | MANN-WHITNEY TEST
3. TWO RELATED OR PAIRED SAMPLES | MEAN DIFFERENCE IS ZERO | PAIRED T-TEST | MEDIAN DIFFERENCE IS ZERO | WILCOXON SIGNED-RANK TEST
4. THREE OR MORE INDEPENDENT SAMPLES | ALL POPULATION MEANS ARE EQUAL | ANOVA | ALL POPULATION MEDIANS ARE EQUAL | KRUSKAL-WALLIS TEST
FOR QUALITATIVE DATA
TESTS CHARACTERISTICS
APPLICATION IN RESEARCH
METHODOLOGY
CONCLUSION
Research is a quest for knowledge through diligent search or
investigation aimed at the discovery and interpretation of new
knowledge. The scientific method is a systematic body of procedures and
techniques applied in carrying out experimentation targeted at
obtaining new knowledge; hence its understanding is necessary in
the human health science and medicine fields.
THANK YOU.