
Biostatistics: Khadeeja PK

The document discusses various measures of central tendency including mean, median, and mode. It also covers measures of dispersion such as range, mean deviation, and standard deviation. Normal distribution and concepts of skewness and kurtosis are explained.
BIOSTATISTICS

KHADEEJA PK
PART 1
MEASURES OF CENTRAL TENDENCY
• Central tendency refers to the middle of a distribution: a single value or parameter that serves as an estimate for a whole series of data.
• Gives a mental picture of the central value
• Enables comparison
• Provides one central value around which all other observations are dispersed
• Objectives:
• to condense the entire mass of data
• to facilitate comparison
• There are three measures used to calculate the central value:
1) Mean
2) Median
3) Mode
MEAN
• The mean is the arithmetic average. It is calculated by adding all the values and then dividing by the total number of observations.
• The calculation of the mean incorporates every value in the data, so changing any value changes the mean. However, the mean does not always locate the centre of the data accurately. This happens when there are large differences between the values.
• For example, take the values 4, 3, 5, 6, 14, 18, 40, 10, 2, 6, 7. Their sum is 115 and there are 11 observations, so the mean is 115/11 ≈ 10.45, which does not look like a central value because most of the observations are below 10. The mean is being pulled up by the extreme value 40. This is the main problem with the mean, so it is best used when the values are not widely scattered.
• Advantages of mean:
• Easy to calculate and understand
• Takes all values into consideration
• Allows further statistical analysis
• Generally more stable from sample to sample than the other measures of central tendency

• Limitation:
• Gets influenced by extreme values
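
A minimal Python sketch of the mean example above, using the same eleven values. Recomputing the mean without the extreme value 40 is an addition made here purely for illustration; the variable names are hypothetical.

```python
from statistics import mean

# Values from the example above
values = [4, 3, 5, 6, 14, 18, 40, 10, 2, 6, 7]

# Mean of all eleven observations: 115 / 11
full_mean = mean(values)

# Illustration only: drop the extreme value 40 and recompute (75 / 10)
without_extreme = [v for v in values if v != 40]
mean_without_extreme = mean(without_extreme)

print(round(full_mean, 2))    # 10.45
print(mean_without_extreme)   # 7.5
```

Removing a single observation moves the mean from about 10.45 to 7.5, which is what "gets influenced by extreme values" means in practice.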
MEDIAN
• The median is the middle value. It is the value that splits the data set in half.
• To calculate the median, first arrange the values in ascending or descending order and then pick the middle value.
• For example, arrange the values used for the mean in ascending order: 2, 3, 4, 5, 6, 6, 7, 10, 14, 18, 40. The central (6th) value is 6, so the median is 6. This is the case when the number of observations is odd.
• When the number of observations is even, arrange the values in ascending or descending order, select the two central values, and calculate their average (add them and divide by 2).
• E.g. 2, 3, 4, 6, 8, 10 is already in ascending order. The central values are 4 and 6, and (4 + 6)/2 = 5, so the median is 5.
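
The two cases above (odd and even number of observations) can be sketched in Python as follows. The function name `median_value` is hypothetical; Python's standard library also offers `statistics.median`, which follows the same rule.

```python
def median_value(values):
    """Return the middle value, or the average of the two middle values."""
    ordered = sorted(values)          # arrange in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd count: single middle value
        return ordered[mid]
    # even count: average of the two central values
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_example = [4, 3, 5, 6, 14, 18, 40, 10, 2, 6, 7]
even_example = [2, 3, 4, 6, 8, 10]

print(median_value(odd_example))    # 6
print(median_value(even_example))   # 5.0
```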
MODE
• The mode is the value that occurs most frequently in the data set. It is not affected by extreme values.
• For example, in 2, 3, 4, 5, 6, 6, 7, 10, 14, 18, 40 the mode is 6 (it appears twice).
• In 2, 5, 4, 7, 8, 9, 10 no value repeats, so there is no mode.
• A data set can therefore have a single mode, multiple modes, or no mode at all. When there is no mode, it can be estimated with the following empirical formula, which holds only approximately and is intended for moderately skewed distributions:
• Mode = 3 × Median − 2 × Mean
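
A short Python sketch of the mode cases above, including the empirical formula. The helper names `modes` and `empirical_mode` are hypothetical, and the choice to report "no mode" as an empty list is an assumption made for this illustration.

```python
from collections import Counter
from statistics import mean, median

def modes(values):
    """Return all most-frequent values, or [] when no value repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:                  # every value occurs once: no mode
        return []
    return sorted(v for v, c in counts.items() if c == highest)

def empirical_mode(values):
    """Estimate the mode as 3 * median - 2 * mean (approximation only)."""
    return 3 * median(values) - 2 * mean(values)

with_mode = [2, 3, 4, 5, 6, 6, 7, 10, 14, 18, 40]
no_mode = [2, 5, 4, 7, 8, 9, 10]

print(modes(with_mode))                    # [6]
print(modes(no_mode))                      # []
print(round(empirical_mode(no_mode), 2))   # 8.14
```

For the second data set the median is 7 and the mean is 45/7, so the formula gives 21 − 90/7 ≈ 8.14. This is an estimate, not an observed value.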
MEASURES OF DISPERSION
• A measure of dispersion describes how the data are spread. It shows how much the data vary from their average value. Dispersion helps in understanding the distribution of the data.
RANGE
• Range is the difference between the largest and the smallest observations.
Range = X_max − X_min

• Advantages of Range:
• It is the simplest of the measures of dispersion
• Easy to calculate
• Easy to understand
• Independent of change of origin
• Limitations of Range:
• It is based on only the two extreme observations, and hence is strongly affected by fluctuations in them
• It is not a reliable measure of dispersion
• Dependent on change of scale
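
A minimal Python sketch of the range, also checking the two properties listed above: adding a constant to every value (change of origin) leaves the range unchanged, while multiplying every value by a constant (change of scale) multiplies the range. The data reuse the earlier example; the shift of 100 and factor of 10 are arbitrary choices for illustration.

```python
def value_range(values):
    """Range = largest observation minus smallest observation."""
    return max(values) - min(values)

data = [2, 3, 4, 5, 6, 6, 7, 10, 14, 18, 40]

shifted = [v + 100 for v in data]   # change of origin
scaled = [v * 10 for v in data]     # change of scale

print(value_range(data))      # 38
print(value_range(shifted))   # 38
print(value_range(scaled))    # 380
```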
MEAN DEVIATION
• Mean deviation is the arithmetic mean of the absolute deviations of the observations from a measure of central tendency (usually the mean or the median). It is also called average deviation.
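
The definition above can be sketched in Python as follows. The function name `mean_deviation` and its optional `center` argument are hypothetical; the data are the six values from the earlier median example.

```python
from statistics import mean, median

def mean_deviation(values, center=None):
    """Average of absolute deviations from a central value (default: mean)."""
    if center is None:
        center = mean(values)
    return sum(abs(v - center) for v in values) / len(values)

data = [2, 3, 4, 6, 8, 10]

# About the mean (5.5): |deviations| = 3.5, 2.5, 1.5, 0.5, 2.5, 4.5
print(mean_deviation(data))                 # 2.5

# About the median (5): |deviations| = 3, 2, 1, 1, 3, 5
print(mean_deviation(data, median(data)))   # 2.5
```

The two results coincide for this particular data set; in general they differ.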
STANDARD DEVIATION
• Standard deviation is the most important and widely used measure of dispersion. It was first used by Karl Pearson in 1893. It is the square root of the mean of the squared deviations from the arithmetic mean. It is denoted by the Greek letter sigma, σ.
• The greater the deviation, the greater the dispersion from the central value
• The smaller the deviation, the higher the degree of uniformity
Advantages of Standard Deviation:
• Squaring the deviations overcomes the drawback of ignoring signs in the mean deviation
• Suitable for further mathematical treatment
• Least affected by sampling fluctuations
• The standard deviation is zero if all the observations are identical
• Independent of change of origin
Limitations of Standard Deviation
• Not easy to calculate
• Difficult to understand for a layman
• Dependent on the change of scale
USES OF STANDARD DEVIATION:
• Summarizes the deviations of a large distribution
• Indicates whether the variation from the mean is by chance or real
• Helps in finding the standard error
• Helps in finding a suitable sample size
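
A minimal Python sketch of the standard deviation as defined in this section (square root of the mean of the squared deviations from the arithmetic mean). It divides by n, as the definition states; the sample version that divides by n − 1 is not covered here. The function name and the data are illustrative assumptions.

```python
from math import sqrt

def standard_deviation(values):
    """Square root of the mean of squared deviations from the mean."""
    n = len(values)
    m = sum(values) / n
    squared_deviations = [(v - m) ** 2 for v in values]
    return sqrt(sum(squared_deviations) / n)

data = [2, 3, 4, 6, 8, 10]

sd = standard_deviation(data)
sd_shifted = standard_deviation([v + 100 for v in data])   # change of origin
sd_scaled = standard_deviation([v * 10 for v in data])     # change of scale
sd_constant = standard_deviation([5, 5, 5, 5])             # identical values

print(round(sd, 2))           # 2.81
print(round(sd_shifted, 2))   # 2.81
print(round(sd_scaled, 2))    # 28.14
print(sd_constant)            # 0.0
```

The output matches the properties listed above: the value is unchanged by a shift, multiplied by the scale factor, and zero when all observations are identical.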
COEFFICIENT OF DISPERSION
• Coefficients of dispersion are calculated along with the measures of dispersion when comparing two series that differ widely in their averages. They are also used when comparing two series measured in different units. The coefficient of dispersion is denoted C.D.
• The coefficients of dispersion (C.D.) based on different measures of dispersion are:
• Coefficient of mean deviation = Mean deviation / average from which it is calculated
• Coefficient of standard deviation = S.D. / Mean
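
A short Python sketch of the coefficient of standard deviation (S.D. / Mean) for two series in different units. The height and weight figures are made up for illustration and do not come from the source; `pstdev` is the standard library's population standard deviation.

```python
from statistics import mean, pstdev

def coefficient_of_sd(values):
    """Coefficient of standard deviation = S.D. / Mean (unit-free)."""
    return pstdev(values) / mean(values)

# Hypothetical data in different units
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [50, 60, 70, 80, 90]

cd_height = coefficient_of_sd(heights_cm)
cd_weight = coefficient_of_sd(weights_kg)

print(round(cd_height, 3))   # 0.083
print(round(cd_weight, 3))   # 0.202
```

Both series have the same standard deviation (about 14.14 in their own units), yet the weights vary more relative to their mean, which is the kind of comparison the coefficient is meant for.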
NORMAL CURVE

• When data are collected from a very large number of people and a frequency distribution is made with narrow class intervals, the resulting curve is smooth and symmetrical. It is called a normal curve.
• In a normal curve:
• The area within one standard deviation on either side of the mean includes approximately 68% of the values
• The area within two standard deviations on either side of the mean includes approximately 95% of the values
• The area within three standard deviations on either side of the mean includes approximately 99.7% of the values
• The limits on either side of the mean are called confidence limits.
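
The three percentages above can be checked numerically. A minimal Python sketch, assuming the standard result that the area of a normal curve within k standard deviations of the mean equals erf(k/√2); the function name `area_within` is hypothetical.

```python
from math import erf, sqrt

def area_within(k):
    """Fraction of a normal distribution within k SDs of the mean."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(k, round(area_within(k) * 100, 1))
# 1 68.3
# 2 95.4
# 3 99.7
```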
STANDARD NORMAL CURVE
• There may be many normal curves, but there is only one standard normal curve
• The standard normal curve is bell shaped
• The curve is perfectly symmetrical and is based on an infinitely large number of observations. The maximum number of observations is at the mean, and the number of observations gradually decreases on either side, with few observations at the extremes
• The total area under the curve is one, its mean is zero and its standard deviation is one
• All three measures of central tendency (mean, median and mode) coincide
• As a rough rule of thumb, if the mean is greater than 2 standard deviations (mean > 2 × SD), the values are likely to be normally distributed
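
The slides do not spell out how an ordinary series is brought to mean zero and standard deviation one. The usual conversion is z = (x − mean) / SD, sketched below in Python on made-up data. The function name `standardize` is hypothetical, and rescaling this way changes only the location and scale of the values, not the shape of their distribution.

```python
from statistics import mean, pstdev

def standardize(values):
    """Convert each value to z = (x - mean) / SD."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]

data = [2, 3, 4, 6, 8, 10]   # illustrative values only
z_scores = standardize(data)

# After standardizing, the mean is (numerically) 0 and the SD is 1
print(abs(mean(z_scores)) < 1e-9)           # True
print(abs(pstdev(z_scores) - 1) < 1e-9)     # True
```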
SKEWNESS
• It is the statistic that measures the asymmetry of a distribution on either side of the mean
KURTOSIS
• It is the measure of the height (peakedness) of the distribution curve
• Tall curve: Leptokurtic
• Flat curve: Platykurtic
• Normal curve: Mesokurtic
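
The slides give no formulas for skewness or kurtosis. One common choice, assumed here, is the moment-based pair: skewness = m3 / m2^1.5 and kurtosis = m4 / m2^2, where mk is the k-th central moment. With this definition a normal (mesokurtic) curve has kurtosis 3, larger values indicate a leptokurtic curve, and smaller values a platykurtic one. All function names below are hypothetical.

```python
def central_moment(values, k):
    """Mean of the k-th powers of deviations from the mean."""
    n = len(values)
    m = sum(values) / n
    return sum((v - m) ** k for v in values) / n

def skewness(values):
    """Moment-based skewness: 0 for a symmetric series."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def kurtosis(values):
    """Moment-based kurtosis: 3 for a normal distribution."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2

symmetric = [1, 2, 3, 4, 5]
right_skewed = [2, 3, 4, 5, 6, 6, 7, 10, 14, 18, 40]

print(skewness(symmetric))             # 0.0
print(round(kurtosis(symmetric), 2))   # 1.7
print(skewness(right_skewed) > 0)      # True
```

The second series is the one used in the mean example: its long tail towards 40 shows up as positive skewness.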
REFERENCES
• P. Soben. Essentials of Preventive and Social Medicine.
• K. Park. Park's Textbook of Preventive and Social Medicine.
