
Com 201 - Concept of Frequency Distribution, Mean, Median - Print

The document discusses the concept of frequency distribution and measures of central tendency, including mean, median, and mode for both grouped and ungrouped data. It outlines methods for data presentation, such as tabular and graphical formats, and provides examples of calculating mean, median, and mode. Additionally, it highlights the advantages and disadvantages of each measure of central tendency and their applicability based on data distribution characteristics.

Uploaded by

bobofficial001

CONCEPT OF FREQUENCY DISTRIBUTION, MEAN, MEDIAN AND MODE OF GROUPED AND UNGROUPED DATA

OMISORE AKINLOLU G.
FWACP, MPH, MB.Ch.B
RE-CAP OF LAST LECTURE
PRESENTATION OF DATA
• There are various methods of data presentation.
• Irrespective of the method, data are usually grouped or collated into frequencies to show how often each value or category occurs.
METHODS OF DATA PRESENTATION
• Tabular presentation of data.
• Graphical or diagrammatic presentation of data
- Quantitative or numerical data: use a histogram or frequency polygon.
- Qualitative or categorical data: use a bar chart or pie chart. Use dot maps for geographical mapping.
• Summary indices, e.g. mean, median and mode.
TABULAR PRESENTATION.
• Done in form of frequency tables.
• Can be for both quantitative and qualitative data.
• Definitions for Frequency Table
CLASS: one of the groups into which data can be classified.
CLASS FREQUENCY (CF): the number of observations (NOB) in the data set falling in a particular class.
CLASS RELATIVE FREQUENCY: the CF divided by the total NOB in the data set.
Example of a Frequency Table
CLASS (Level of Education)    FREQUENCY (Number)    RELATIVE FREQUENCY
None                          254                   0.34
Primary                       201                   0.27
Secondary                     119                   0.16
Post secondary                 97                   0.13
Others                         75                   0.10
Total                         746                   1.00
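As a small illustration of the definitions above, the relative frequency column can be recomputed from the class frequencies. This is only a sketch; the variable names are chosen here for illustration and are not part of the lecture.

```python
# Class frequencies taken from the education-level table above.
frequencies = {
    "None": 254,
    "Primary": 201,
    "Secondary": 119,
    "Post secondary": 97,
    "Others": 75,
}

# Total number of observations (NOB) in the data set.
total = sum(frequencies.values())

# Class relative frequency = class frequency / total NOB.
relative = {level: count / total for level, count in frequencies.items()}

for level, count in frequencies.items():
    print(f"{level:<16}{count:>5}{relative[level]:>8.2f}")
print(f"{'Total':<16}{total:>5}{sum(relative.values()):>8.2f}")
```

Rounded to two decimal places, the computed values match the table, and the relative frequencies add up to 1.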
Methods of summarizing data
• Measures of location / central tendency
• Measures of dispersion / spread / variation
• (Measures of partition): some take these as measures of dispersion too.
Measures of Central Tendency
• Describe the location of the centre of a distribution of numerical or ordinal measurements
• Indicate the typical experience for a group
• Are values around which the other values in the data are distributed
• Locate the midpoint of a distribution
Measures of Central Tendency
Frequently used ones are:
• Mean
• Median
• Mode
• Midrange
The Mean (x-bar)

• Arithmetic average of the observations
• Most widely used average measure
• Most reliable and trustworthy
• Locates the centre of gravity of a distribution
• Tells where the values for a group are centred
The Mean cont’d
• Used when numbers can be added
• Suitable for numeric variables measured on interval or ratio scales
• Cannot be used for nominal or ordinal variables because their values cannot be added.
The Mean cont’d
• Obtained by adding up all the individual observations (summation "∑") and dividing by the number of observations:
$$\bar{x} = \frac{x_1 + x_2 + x_3 + x_4 + \dots + x_n}{N} = \frac{\sum x}{N}$$
• ∑x = sum of the observations
• N = number of observations
Example:
Find the mean age of the first 10 first-year clinical students in UNIOSUN: 23, 19, 21, 20, 23, 21, 22, 24, 22, 22
$$\bar{x} = \frac{23 + 19 + 21 + 20 + 23 + 21 + 22 + 24 + 22 + 22}{10} = \frac{217}{10} = 21.7 \text{ years}$$
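The same calculation can be sketched in a few lines of Python. The variable names are illustrative only; the ages are the ten values from the example.

```python
# Ages of the 10 students from the example above.
ages = [23, 19, 21, 20, 23, 21, 22, 24, 22, 22]

total = sum(ages)      # sum of the observations
n = len(ages)          # number of observations
mean_age = total / n   # arithmetic mean

print(total, n, mean_age)
```

The sum is 217 over 10 observations, which gives the 21.7 years shown above.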
PROCEDURE FOR CALCULATING MEAN (Grouped data)
• (i) Find the class mid-mark for each interval. The class mid-mark is the average of the class limits.
• (ii) Multiply the class mid-mark in each interval by the corresponding frequency.
• (iii) Add the results in (ii) across all intervals.
• (iv) Divide the result in (iii) by the number of observations (the total frequency).
Mean - (Grouped Data)
Example:
Marks of students in practical 1
Marks    Frequency (f)    Class mid-mark (x)    f × x
60-64    10               62                    10 × 62 = 620
65-69    14               67                    14 × 67 = 938
70-74    12               72                    12 × 72 = 864
75-79    20               77                    20 × 77 = 1540
80-84    10               82                    10 × 82 = 820
85-89    14               87                    14 × 87 = 1218
Total    ∑f = 80                                ∑fx = 6000

$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{6000}{80} = 75$$
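The grouped-mean procedure can be sketched as follows, using the marks table from this example. The tuple layout (lower limit, upper limit, frequency) is an assumption made here for illustration.

```python
# Each class: (lower limit, upper limit, frequency), from the marks table.
classes = [
    (60, 64, 10),
    (65, 69, 14),
    (70, 74, 12),
    (75, 79, 20),
    (80, 84, 10),
    (85, 89, 14),
]

# Class mid-mark = average of the class limits.
mid_marks = [(lower + upper) / 2 for lower, upper, _ in classes]

# Multiply each mid-mark by its class frequency.
products = [f * x for (_, _, f), x in zip(classes, mid_marks)]

total_f = sum(f for _, _, f in classes)   # total frequency
grouped_mean = sum(products) / total_f    # sum of f*x divided by sum of f

print(mid_marks, sum(products), total_f, grouped_mean)
```

Because every observation in a class is represented by the class mid-mark, the grouped mean is an approximation of the mean of the raw marks.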
Properties of Mean (x-bar)
• Affected by extreme values
• All the other observations lie about it
• Makes use of all information
• The sum of the deviations of the values from the mean is always equal to zero, where each deviation is formed by subtracting the mean from a value (x minus x-bar):
$$\sum (x - \bar{x}) = (x_1 - \bar{x}) + (x_2 - \bar{x}) + (x_3 - \bar{x}) + \dots + (x_n - \bar{x}) = 0$$
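The zero-sum property of the deviations can be checked numerically on the ten ages from the earlier example. This is a minimal sketch; a small tolerance is used because floating-point arithmetic may leave a tiny rounding residue.

```python
import math

# Ages from the ungrouped mean example.
ages = [23, 19, 21, 20, 23, 21, 22, 24, 22, 22]
mean_age = sum(ages) / len(ages)

# Deviation of each value from the mean (x minus x-bar).
deviations = [x - mean_age for x in ages]
total_deviation = sum(deviations)

# The sum should be zero up to floating-point rounding.
print(math.isclose(total_deviation, 0.0, abs_tol=1e-9))
```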
Advantages of the Mean
• It is easy to calculate
• Makes use of all information in the
distribution- hence reliable and
accurate
• Amenable to statistical procedures and
testing
Disadvantages of the mean
• It may be unduly influenced by abnormal (extreme) values in the distribution
• Not suitable for badly skewed distributions: the more asymmetric the distribution, the less desirable it is to summarize the observations by using the mean
The Median
• The middle number in an array of
the data, when the number of
observations is odd
Or
• The arithmetic mean of the middle
numbers in an array of data when
the number of observations is even
The Median cont’d
• The value above or below which half
(50%) of the observations fall
• Bisector of histogram/ polygon
• For interval, ratio and ordinal scale (not
nominal)
EXAMPLE ON MEDIAN
Find the median age of the first 10 first-year clinical students in UNIOSUN: 23, 19, 21, 20, 23, 21, 22, 24, 22, 22

Step 1: Arrange in ascending order
19, 20, 21, 21, 22, 22, 22, 23, 23, 24
Step 2: Pick the middle observations. With 10 values, these are the 5th and 6th, so take their mean:
(22 + 22)/2 = 22
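The two steps can be sketched as a short function covering both the odd and the even case described in the definition of the median. The function name `median` is chosen here for illustration.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)          # Step 1: arrange in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd n: the single middle value
    # even n: mean of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


ages = [23, 19, 21, 20, 23, 21, 22, 24, 22, 22]
median_age = median(ages)
print(sorted(ages), median_age)
```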
Calculation of Median (Grouped)
Using the marks data from the grouped mean example:
• Sample size (n) = 80
• Median position (n/2) = 40th
• Median class = 75-79
• Lower boundary of the median class (bL) = 74.5
• Frequency in the median class (f) = 20
• Cumulative frequency of the classes below the median class (F) = 36
• Class width (c) = 5
• Apply the formula:
$$\text{Median} = b_L + \frac{n/2 - F}{f} \times c$$
CLASS BOUNDARIES & INTERVALS
• A class (or group) boundary lies midway between the upper limit of one class and the lower limit of the next.
For example,
• For data in the class or group labelled:
• 7.1 – 7.3

(a) The class values 7.1 and 7.3 are the lower and upper limits of the class.
(b) The class boundaries are 0.05 below the lower class limit and 0.05 above the upper class limit (because the figures are recorded to 1 decimal place), i.e. 7.05 and 7.35.
(c) The class interval/width is the difference between the upper and lower class boundaries: 7.35 − 7.05 = 0.3.
(d) Question: What are the class boundaries if the figures are between 7 and 8?
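A minimal sketch of rules (b) and (c), assuming that adjacent classes are separated by one recording unit (0.1 for figures to 1 decimal place, 1 for whole marks). The helper `class_boundaries` and its `unit` parameter are hypothetical names introduced here; `Decimal` is used so that values such as 7.05 are represented exactly.

```python
from decimal import Decimal


def class_boundaries(lower_limit, upper_limit, unit):
    """Return (lower boundary, upper boundary, class width).

    `unit` is the smallest recorded step (an assumption of this sketch).
    Boundaries sit half a unit outside the class limits.
    """
    half = Decimal(unit) / 2
    lower = Decimal(lower_limit) - half
    upper = Decimal(upper_limit) + half
    return lower, upper, upper - lower


# The 7.1 - 7.3 class, recorded to 1 decimal place.
lower, upper, width = class_boundaries("7.1", "7.3", "0.1")
print(lower, upper, width)

# A marks class recorded in whole numbers, e.g. 75-79.
print(class_boundaries("75", "79", "1"))
```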
Median (Grouped Data)
Using the last example of the mean.
Marks    Boundaries    Frequency    Cumulative Frequency (F)
60-64    59.5-64.5     10           10
65-69    64.5-69.5     14           24
70-74    69.5-74.5     12           36
75-79    74.5-79.5     20           56
80-84    79.5-84.5     10           66
85-89    84.5-89.5     14           80

$$\text{Median} = 74.5 + \frac{40 - 36}{20} \times 5 = 75.5 \text{ marks}$$
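The grouped median calculation can be sketched as a function that walks the cumulative frequencies until it reaches the class containing the n/2-th observation. The name `grouped_median` and the list-of-boundaries layout are assumptions made for illustration.

```python
def grouped_median(boundaries, frequencies):
    """Median of grouped data: bL + ((n/2 - F) / f) * c.

    boundaries  : list of (lower boundary, upper boundary) per class
    frequencies : list of class frequencies, in the same order
    """
    n = sum(frequencies)
    if n == 0:
        raise ValueError("frequency table is empty")
    half = n / 2
    cumulative = 0  # F: cumulative frequency below the current class
    for (lower, upper), f in zip(boundaries, frequencies):
        if f > 0 and cumulative + f >= half:
            width = upper - lower  # c: class width
            return lower + (half - cumulative) / f * width
        cumulative += f
    raise ValueError("median class not found")


boundaries = [(59.5, 64.5), (64.5, 69.5), (69.5, 74.5),
              (74.5, 79.5), (79.5, 84.5), (84.5, 89.5)]
frequencies = [10, 14, 12, 20, 10, 14]
median_marks = grouped_median(boundaries, frequencies)
print(median_marks)
```

For the marks table, the loop stops at the 75-79 class with F = 36, reproducing the 75.5 marks obtained by hand.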
Advantages of the median
• Used with distribution of any shape
especially when data are skewed
• Easy to calculate and understand
• Better representations when there are
outliers
• Not affected by extreme values “the
middle remains the middle”
Disadvantages of the median
• Does not use all information in the distribution
• Only takes into account one or two (middle) observations
• Provides no information about the other observations
The Mode
• The value/observation which occurs most frequently when observations are arranged in an array
• The most "fashionable" (most common) value
• There can be several modes: a distribution can be bimodal or multimodal
Example on Mode
For the ages of the 10 medical students:
23, 19, 21, 20, 23, 21, 22, 24, 22, 22
The mode is 22 (it occurs three times, more often than any other age).
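A small sketch of finding the mode by counting occurrences. The function returns a list so that bimodal or multimodal data can be reported too; the name `modes` is illustrative.

```python
from collections import Counter


def modes(values):
    """Return all values that share the highest frequency, in sorted order."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


ages = [23, 19, 21, 20, 23, 21, 22, 24, 22, 22]
print(modes(ages))             # a single mode
print(modes([1, 1, 2, 2, 3]))  # two modes (bimodal)
```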
Formula for grouped mode
• Mode is the value that has the highest frequency in a data set.
• For grouped data, the modal class is the class with the highest frequency.
• To find the mode for grouped data, use the following formula:
$$\text{Mode} = L_b + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times C$$
• Where:
– Lb is the lower boundary of the modal class.
– ∆1 is the difference between the frequency of the modal class and the frequency of the class before it.
– ∆2 is the difference between the frequency of the modal class and the frequency of the class after it.
– C is the class width.
Mode (Grouped Data)
Using the last example of the median.
Marks    Boundaries    Frequency    Cumulative Frequency (F)
60-64    59.5-64.5     10           10
65-69    64.5-69.5     14           24
70-74    69.5-74.5     12           36
75-79    74.5-79.5     20           56
80-84    79.5-84.5     10           66
85-89    84.5-89.5     14           80

• The modal class is 75-79 (highest frequency, 20).
• Lb = lower boundary of the modal class = 74.5; C = class width (modal class size) = 5
• ∆1 = f1 − f0 (frequency of the modal class minus frequency of the class preceding it) = 20 − 12 = 8
• ∆2 = f1 − f2 (frequency of the modal class minus frequency of the class succeeding it) = 20 − 10 = 10
$$\text{Mode} = L_b + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times C = 74.5 + \frac{20 - 12}{(20 - 12) + (20 - 10)} \times 5 \approx 76.7$$
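The grouped mode formula can be sketched as below. Two choices here are assumptions of this sketch rather than part of the lecture: when several classes tie for the highest frequency the first one is used, and a missing neighbouring class (modal class at either end of the table) is treated as having frequency 0.

```python
def grouped_mode(boundaries, frequencies):
    """Mode of grouped data: Lb + (d1 / (d1 + d2)) * C."""
    # Index of the modal class (first class with the highest frequency).
    i = max(range(len(frequencies)), key=lambda k: frequencies[k])
    f1 = frequencies[i]
    f0 = frequencies[i - 1] if i > 0 else 0                      # class before
    f2 = frequencies[i + 1] if i < len(frequencies) - 1 else 0   # class after
    d1 = f1 - f0
    d2 = f1 - f2
    if d1 + d2 == 0:
        raise ValueError("formula undefined: neighbours equal the modal frequency")
    lower, upper = boundaries[i]
    return lower + d1 / (d1 + d2) * (upper - lower)


boundaries = [(59.5, 64.5), (64.5, 69.5), (69.5, 74.5),
              (74.5, 79.5), (79.5, 84.5), (84.5, 89.5)]
frequencies = [10, 14, 12, 20, 10, 14]
mode_marks = grouped_mode(boundaries, frequencies)
print(round(mode_marks, 1))
```

With ∆1 = 8 and ∆2 = 10 the function returns about 76.72, which rounds to the 76.7 found by hand.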
Advantages of the Mode
• Easy to compute
• Not affected by extreme values
• Its main usefulness is in calling attention to distributions in which the values cluster at several places
• The only average available for data on a nominal scale
Disadvantages of the Mode
• Not the best for biological or medical statistics
• May not be unique: several observations can share the same highest frequency (multimodal)
• Least valuable of the three averages
Mid-Range
• (Minimum + maximum) divided by 2
• Strongly affected by extreme values, because it is computed from the two extreme observations only
• Does not consider all information in the distribution
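A minimal sketch of the mid-range on the ten ages used earlier, together with what happens when a single extreme value is added. The added value of 60 is hypothetical and used only to show how the result responds to the extremes.

```python
def midrange(values):
    """Mid-range: (minimum + maximum) divided by 2."""
    return (min(values) + max(values)) / 2


ages = [23, 19, 21, 20, 23, 21, 22, 24, 22, 22]
print(midrange(ages))          # (19 + 24) / 2

# Adding one hypothetical extreme value shifts the mid-range substantially,
# because only the minimum and maximum enter the calculation.
print(midrange(ages + [60]))   # (19 + 60) / 2
```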
Choice of measure of central tendency
• Depends on the shape/nature of the distribution: skewed to the left or right, or a normal distribution
• Depends on the scale of measurement (ordinal or numerical)
Choice of measure cont’d
• For a continuous variable with a unimodal and symmetric distribution, the mean, median and mode will be identical and lie at the same point
• When the distribution is skewed, the median may be a more informative descriptive measure to use than the mean, as it is not affected by extreme values
Choice of measure cont’d
• For statistical analyses and tests of significance, the mean is preferred whenever possible, since it includes information from all observations and its theoretical properties provide for more powerful statistical tests
Note
• If mean = median ….. Symmetry
• If mean > median ….. Skewed to the
right (positive)
• If mean < median ….. Skewed to the left
(negative)
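The rule of thumb in this note can be sketched with Python's standard `statistics` module. This compares the mean and the median only; it is not a formal test of skewness, and the function name `skew_direction` is chosen here for illustration.

```python
import statistics


def skew_direction(values):
    """Apply the mean-versus-median rule of thumb from the note above."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "skewed to the right (positive)"
    if mean < median:
        return "skewed to the left (negative)"
    return "symmetric"


ages = [23, 19, 21, 20, 23, 21, 22, 24, 22, 22]
print(skew_direction(ages))               # mean 21.7 < median 22
print(skew_direction([1, 2, 3, 4, 100]))  # mean 22 > median 3
print(skew_direction([1, 2, 3, 4, 5]))    # mean 3 = median 3
```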
Conclusion
• Measures of central tendency provide
good ways of summarizing data.
• They are in everyday statistical use.
• Though they are regarded as descriptive statistics, they often provide the basis for statistical inference.
THANKS FOR YOUR ATTENTION
• Rosner, Bernard. Fundamentals of Biostatistics, 8th Edition (Chapters 1 & 2).
http://galaxy.ustc.edu.cn:30803/zhangwen/Biostatistics/Fundamentals+of+Biostatistics+8th+edition.pdf
