SFB Module I 2019

This document provides an overview of descriptive statistics concepts including: 1) Data can be quantitative (measurable) or qualitative (categorical). Variables can be dependent or independent. 2) A population is the entire set being studied, while a sample is a subset of the population. Parameters describe populations, while statistics describe samples. 3) Descriptive statistics are used to organize and summarize characteristics of data through methods like ordered arrays, tables, and frequency distributions.

Uploaded by

Ashwani Sharma


Statistics for Biology

Notes only for the use of students of

B.Sc. (H) Microbiology
Amity Institute of Microbial Technology

Course Instructor
Anil Chandra

Module I: Descriptive Statistics
1. Data, Types of Data, Variables & Constants

Data: Data is the raw material of statistics.

1.1 Quantitative Data: represent a measurable or countable quantity
Examples:
- the height or weight of a person
- average marks scored by student

1.2 Qualitative Data: represents the properties, classification, names or labels OR data
which cannot be measured or quantified
Examples:
- the hair color of a person (red, blond, brown, black)
- Division scored by the students (remember division scored is a classification and not a
measurable quantity)

Variables & Constants:

Variables: Values which vary from one observation to the next
Constants: Values which remain the same for every observation
Eg., Suppose we are conducting a study on 100 women from a village regarding their Age,
Height, Ethnicity etc. In this case the sex of the subject (woman) and the village remain
constant, while the other characteristics observed, like Age, Height etc., are variables since
these values will vary with each observation

Dependent & Independent variables:

In terms of statistics, independent variables are those variables which can be controlled by
the experimenter, as he/she can set different levels for these variables, while dependent
variables are those variables whose values depend on the independent variables, i.e. the
values in which the experimenter is interested. For eg. A study regarding plant
growth is being conducted at different pH levels. Here pH level is the independent variable,
because its values are under the control of the experimenter, while plant growth is the
dependent variable, since plant growth depends on the pH levels set by the experimenter.

Quantitative Variables are of two types

Discrete Variables: The data which has interruptions or gaps in between OR the data which
cannot take all intermediate values in between
Eg. Number of students in a class OR number of seeds observed in Mendel’s
experiment on peas. Number of students could be 0, 1, 2, 3, …. But it cannot be 1.5
or 1.7 etc. (It cannot take the intermediate values, and for the reference of students it is
relevant to mention here that between any two consecutive natural numbers, such as 1 and 2,
there are infinitely many, in fact uncountably many, real values)

Continuous Variables: A variable which is continuous over a certain interval OR in other
words can take all the values between two given values is called a continuous variable
Eg. Height of students of a class (say class XII)
The height of students can range from 4 feet (or maybe lower) to 6 feet (or maybe higher).
So the height of a student can take any value between these two values.

2. Population/Universe and Sample, Parameter & Statistics:
Population/Universe:
- The term population/universe refers to a collection of people or objects that share common
observable characteristics.
For example, a population could be all of the people who live in your city, all of the students
enrolled in a particular university, or all of the people who are afflicted by a certain disease
(e.g., all women diagnosed with breast cancer during the last five years).
- Generally, researchers are interested in particular characteristics of a population, not
the characteristics that define the population but rather such attributes as height,
weight, gender, age, heart rate, and systolic or diastolic blood pressure.

Sample:
Subset of a population, or a part taken out from the population is called a sample.

Given below is a Venn diagram that represents a Population and its Sample

[Figure: Venn diagram showing the Sample (S) as a region inside the Population (P)]

Why select a Sample?

Often, it is:
a) too expensive or
b) impossible to collect information on an entire population.
For appropriately chosen samples, accurate statistical estimates of population
parameters are possible.

Parameter and Statistics

The quantity calculated to represent a characteristic of a population is known as a parameter.
For eg. Mean of a Population (µ)

The quantity calculated to represent a characteristic of a sample is known as a statistic.
For example Average of a Sample (x̄)

3. Introduction to Statistics
3.1 What is Statistics?
Statistics: it is a branch of science that deals with the:
a) Collection, organization and interpretation of data (Descriptive Statistics)
b) Drawing an inference about population(s) from given sample(s): (Inferential Statistics)

Adapted from the book “Statistics for the Utterly Confused” by Lloyd Jaisingh

Limitations of Statistics
1. Statistics does not deal with individual measurements: Since statistics deals with
aggregates of facts, the study of individual measurements lies outside the scope of
statistics. Data are statistical when they relate to measurements of masses. For eg.
Wages earned by an individual worker have no role to play in statistics, however, the
data of wages pertaining to all the workers is used for statistical interpretation
2. Statistical results are true only on an average: The conclusions obtained statistically
are not universally true, they are true only under certain conditions
3. Statistics deals with quantitative characteristics: Statistics are numerical
statements of facts. Characteristics which cannot be expressed in numbers are
incapable of statistical analysis. Thus qualitative characteristics like honesty,
efficiency, intelligence, blindness and deafness cannot be studied directly. We must
assign certain quantitative scales of measurement so that these characteristics can be
studied statistically.
4. Statistics can be misused: One of the major drawbacks of statistics is that it can be
misused. This misuse could be because of many reasons. For example if statistical
conclusions are based on incomplete information one may arrive at a false conclusion.

3.2 Data Scale

Measurement & Scaling:
It is a process of associating numbers or symbols to observations of a research study
Eg., Weight, height: measured directly with their respective units of measurement
Eg., Marital Status: Single (S), Married (M), Widowed (W), Divorced (D)

Problems associated with Scaling

In research, we face measurement problems, especially when measures are complex and
abstract
Eg., to measure attitude or opinion towards a subject we have to construct a scale
First we should understand the types of data and the appropriate scales of measurement

Types of Data and Measurement Scale:

Interval Scale
- Provides information about order
• -20º C to +20º C
- Possesses equal intervals
• 10º C, 20º C, 30º C
- Differences between values are well defined
• 20º C – 10º C = 10º C
• 40º C – 30º C = 10º C
- Has no absolute zero
• 0º C does not indicate no temperature
- Other examples
• Time of day on a 12-hour clock

Ratio Scale
- Provides information about order
• 20 cm to 120 cm
- Possesses equal intervals
• 10 cm, 20 cm …
- Differences between values are well defined
• 100 cm – 90 cm = 10 cm
• 20 cm – 10 cm = 10 cm
- Has absolute zero
• 0 cm means no height
- Other examples
• Years of experience, no. of children

Ordinal Scale
- Provides information about order (natural order)
- Differences between values are logically not defined, but we can categorize values as
greater than or less than one another
- Examples
• Satisfaction, Happiness, Discomfort, Rank orders

Nominal Scale
- Used only for labeling variables (no order, and no concept of difference between values
or of less than / greater than)

4. Descriptive Statistics
4.1 Descriptive Statistics
4.1.1 Collection of data: Collection of data can be classified into two forms:
1. Primary data: Data collected by the experimenter himself/herself
2. Secondary Data: Data collected from other sources
There are many different ways of collecting data. Some of the methods are:
- Interviews
- Questionnaires and Surveys
- Observations
- Focus Groups
- Case Studies
- Documents and Records
4.1.2 Organization of data:
Data can be organized into the following ways:
a) Ordered Array: representation of data in the increasing order of its magnitude is
called an Ordered Array (used for quantitative data)
Eg 1. Given below is number of marks in Statistics scored by 10 students in a class
90, 87, 85, 90, 92, 94, 56, 73, 75, 75
The Ordered Array of above data will be:
56, 73, 75, 75, 85, 87, 90, 90, 92, 94

b) Tabulated form: Representation of data in the form of a table

(i) Tabulated Data in Discrete, Ungrouped Form (Raw data):
Marks scored: 56, 73, 75, 75, 85, 87, 90, 90, 92, 94 (Total 10 students)

Frequency Distribution
In statistics, a frequency distribution is a list, table or graph that displays the
frequency of various outcomes in a sample. Each entry in the table contains the
frequency or count of the occurrences of values within a particular group or interval,
and in this way, the table summarizes the distribution of values in the sample.

(ii) Tabulated Data in Discrete, Grouped Form:

Marks scored    No. of students (Frequency)
56              1
73              1
75              2
85              1
87              1
90              2
92              1
94              1
Total           10 students

(iii) Tabulated Data in Continuous Form:

Marks scored    No. of students
0-10            03
10-20           06
20-30           12
30-40           13
40-50           04
50-60           09
60-70           03
Total           50
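
The ordered array and the discrete frequency table above can be reproduced with a few lines of Python. This is only a minimal sketch using the standard library; the variable names are illustrative and not part of the course notes.

```python
from collections import Counter

# Marks in Statistics scored by 10 students (raw data from the notes)
marks = [90, 87, 85, 90, 92, 94, 56, 73, 75, 75]

# Ordered array: the data arranged in increasing order of magnitude
ordered_array = sorted(marks)

# Frequency distribution: how many times each mark occurs
frequency = Counter(ordered_array)
frequency_table = sorted(frequency.items())

print("Ordered array:", ordered_array)
for value, count in frequency_table:
    print(f"{value:>5} {count:>5}")
print("Total", sum(frequency.values()), "students")
```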

In the continuous table above the marks scored are given in the form called Class Intervals (CI)
0 – 10 is a class interval whose lower class limit is 0 and whose upper class limit is 10; it
contains the number of students who have scored 0 or more marks but fewer than 10 marks

Class interval can be of two types:

- Overlapping: If class intervals are given in the form:
Class Interval
0-10
10-20
20-30
And so on
such class intervals are called overlapping class intervals, but always remember
0-10 contains all the values between 0 and 10 including 0 but excluding 10;
10-20 contains all the values between 10 and 20 including 10 but excluding 20
and so on …
- Non-overlapping: Class intervals given in the form:
Class Interval
0-9
10-19
20-29
And so on
are called non-overlapping class intervals, in this case:
0-9 will contain all the values between 0 and 9 including 0 and 9
10-19 will contain all the values between 10 and 19 including 10 and 19
and so on…
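
The two conventions differ only in whether the upper limit belongs to the class. A small Python sketch of the membership rule for each type follows; the function names are hypothetical and chosen only for illustration.

```python
def in_overlapping_class(value, lower, upper):
    """Overlapping intervals (0-10, 10-20, ...): lower limit included, upper excluded."""
    return lower <= value < upper


def in_non_overlapping_class(value, lower, upper):
    """Non-overlapping intervals (0-9, 10-19, ...): both limits included."""
    return lower <= value <= upper


# A mark of 10 belongs to 10-20, not to 0-10, under the overlapping convention
print(in_overlapping_class(10, 0, 10), in_overlapping_class(10, 10, 20))
# A mark of 9 belongs to 0-9 under the non-overlapping convention
print(in_non_overlapping_class(9, 0, 9))
```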

Creating a Grouped Frequency Distribution from a Given Data

Grouped Frequency Distributions

Guidelines for classes
- There should be between 5 and 20 classes.
- The class width should be an odd number. This will guarantee that the class
midpoints are integers instead of decimals.
- The classes must be mutually exclusive. This means that no data value can fall into
two different classes.
- The classes must be all inclusive or exhaustive. This means that all data values must
be included.
- The classes must be continuous. There are no gaps in a frequency distribution.
Classes that have no values in them must be included (unless it is the first or last class,
which are dropped).
- The classes must be equal in width. The exception here is the first or last class. It is
possible to have a "below ..." or "... and above" class. This is often used with ages.

Creating a Grouped Frequency Distribution
1. Find the largest and smallest values.
2. Compute the Range = Maximum - Minimum.
3. Select the number of classes desired. This is usually between 5 and 20.
4. Find the class width by dividing the range by the number of classes and rounding up. There
are two things to be careful of here. You must round up, not off. Normally 3.2 would round
to 3, but in rounding up it becomes 4. If the range divided by the number of classes gives
an integer value (no remainder), then you can either add one to the number of classes or add
one to the class width. Sometimes you are locked into a certain number of classes because of
the instructions. The Bluman text fails to mention the case when there is no remainder.
5. Pick a suitable starting point less than or equal to the minimum value. You will be able to
cover "the class width times the number of classes" values. You need to cover one more
value than the range. Follow this rule and you will be okay: the starting point plus the number
of classes times the class width must be greater than the maximum value. Your starting point
is the lower limit of the first class. Continue to add the class width to this lower limit to get
the rest of the lower limits.
6. To find the upper limit of the first class, subtract one from the lower limit of the second class.
Then continue to add the class width to this upper limit to find the rest of the upper limits.
7. Find the boundaries by subtracting 0.5 units from the lower limits and adding 0.5 units to
the upper limits. The boundaries are also half-way between the upper limit of one class and
the lower limit of the next class. Depending on what you are trying to accomplish, it may not
be necessary to find the boundaries.
8. Tally the data.
9. Find the frequencies.
10. Find the cumulative frequencies. Depending on what you are trying to accomplish, it may not
be necessary to find the cumulative frequencies.
11. If necessary, find the relative frequencies and/or relative cumulative frequencies.
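
The procedure above can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive implementation: the data are assumed to be integers, the starting point is assumed to be the minimum value, and the function name `grouped_frequency_distribution` is hypothetical.

```python
from itertools import accumulate


def grouped_frequency_distribution(data, num_classes):
    """Group integer data into equal-width classes following the steps above."""
    smallest, largest = min(data), max(data)
    data_range = largest - smallest
    # Round range / num_classes UP; if it divides exactly, add one to the width.
    # For integer data both cases reduce to floor division plus one.
    width = data_range // num_classes + 1
    start = smallest  # assumption: the starting point is the minimum value

    # Tally: the class index of a value is its offset from the start, divided by the width
    frequencies = [0] * num_classes
    for value in data:
        frequencies[(value - start) // width] += 1
    cumulative = list(accumulate(frequencies))

    rows = []
    for i, (freq, cum) in enumerate(zip(frequencies, cumulative)):
        lower = start + i * width
        upper = lower + width - 1  # one less than the next lower limit
        rows.append({
            "limits": (lower, upper),
            "boundaries": (lower - 0.5, upper + 0.5),
            "frequency": freq,
            "cumulative": cum,
            "relative": freq / len(data),
        })
    return rows


marks = [90, 87, 85, 90, 92, 94, 56, 73, 75, 75]
table = grouped_frequency_distribution(marks, num_classes=5)
for row in table:
    print(row)
```

With only 10 marks and 5 classes, one class (64-71) is empty; it is still listed, in line with the guideline that classes with no values must be included.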

c) Graphical representation: Graphical representation of data is widely used to present
data in a simple, clear and effective manner. Some of the methods of graphical
representation we will discuss are given below:

i) Line Graph/Line Chart (source: https://www.smartdraw.com/line-graph/)

A line graph, also known as a line chart, is a type of chart used to visualize the value of
something over time. For example, a finance department may plot the change in the amount
of cash the company has on hand over time.
The line graph consists of a horizontal x-axis and a vertical y-axis. Most line graphs only
deal with positive number values, so these axes typically intersect near the bottom of the
y-axis and the left end of the x-axis. The point at which the axes intersect is usually the origin (0, 0). Each
axis is labeled with a data type. For example, the x-axis could be days, weeks, quarters, or
years, while the y-axis shows revenue in dollars.
Data points are plotted and connected by a line in a "dot-to-dot" fashion.
The x-axis is also called the independent axis because its values do not depend on anything.
For example, time is always placed on the x-axis since it continues to move forward
regardless of anything else. The y-axis is also called the dependent axis because its values
depend on those of the x-axis: at this time, the company had this much money. The result is
that the line of the graph always progresses in a horizontal fashion and each x value only has
one y value (the company cannot have two amounts of money at the same time).

More than one line may be plotted in the same axis as a form of comparison. For example,
you could create a line graph comparing the amount of money held by each branch office
with a separate line for each office. In this case each line would have a different color,
identified in a legend.
The line graph is a powerful visual tool for marketing, finance, and other areas. It is also
useful in laboratory research, weather monitoring, or any other function involving a
correlation between two numerical values. If two or more lines are on the chart, it can be
used as a comparison between them.

i) Histogram:
- A histogram is a graphical display of tabulated frequencies of continuous data (quantitative
data on both the X and Y axes), which are shown as bars.
- It shows what proportion of cases fall into each of several categories. The categories are
usually specified as non-overlapping intervals of some variable.
- The categories (bars) must be adjacent.
- The intervals (or bands, or bins) are generally of the same size.
- The width of the bars, marked along the X axis, is as important as their height.

Eg. 3(a) Consider the data given below:

Marks scored    No. of students
00-10           03
10-20           06
20-30           12
30-40           13
40-50           04
50-60           09
60-70           03
Total           50
Histogram of the above tabulated data is given below

[Figure: histogram; X axis: Marks scored (0 to 70), Y axis: No. of students (0 to 14)]

ii) Frequency Polygon / Frequency Curve

-The values of the variables are plotted on X axis and their frequencies are plotted on Y axis.
-A frequency polygon is obtained by joining the plotted points by straight lines.
-If the class intervals are of small lengths then the plotted points are joined by free hand.
-The former is called frequency polygon and the latter is called frequency curve.
e.g. Frequency polygon of example 3(a) is given below

[Figure: frequency polygon; X axis: Marks scored (class midpoints 5, 15, 25, 35, 45, 55, 65), Y axis: No. of students]

iii) Cumulative frequency curve OR ogive:

A cumulative frequency polygon can be drawn when the cumulative frequency is plotted
against its respective class interval. Consider the example given below:

Marks scored    No. of students (frequency)    Cumulative frequency
00-10           03                             03
10-20           06                             09
20-30           12                             21
30-40           13                             34
40-50           04                             38
50-60           09                             47
60-70           03                             50
Total           50

[Figure: cumulative frequency curve (ogive); X axis: Marks scored (5, 15, 25, 35, 45, 55, 65), Y axis: Cumulative frequency (0 to 60)]
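
The cumulative frequency column is simply a running total of the frequencies. A minimal Python sketch, using `itertools.accumulate` from the standard library and the data of the table above (variable names are illustrative):

```python
from itertools import accumulate

classes = ["00-10", "10-20", "20-30", "30-40", "40-50", "50-60", "60-70"]
frequencies = [3, 6, 12, 13, 4, 9, 3]

# Running total of the frequencies gives the cumulative frequencies
cumulative = list(accumulate(frequencies))

for label, f, cf in zip(classes, frequencies, cumulative):
    print(f"{label:>6} {f:>4} {cf:>4}")
```

The last cumulative frequency always equals the total number of observations, which is a quick check on the arithmetic.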

d. Diagrammatic representation

b) Bar diagrams:
- A bar diagram is a chart with rectangular bars with lengths proportional to the values that
they represent.
- Bar charts are used for comparing two or more values that were taken over time or on
different conditions, usually on small data sets.
- The bars can be drawn horizontally or vertically.
- It is generally used to depict the discrete data set.
Bar diagrams are of following types:
1. Simple bar diagram: A simple bar diagram is used to depict only one variable. As one
bar represents only one figure, there are as many bars as the number of figures.
Eg. Oxygen consumption in cc/kg/h in different months in half year in a species of fish was
obtained as below:
Months          Jan   Feb   Mar   Apr   May   June
O2 (cc/kg/h)    67    74    84    85    100   105
The simple bar diagram of above data is depicted below

[Figure: simple bar diagram; X axis: Months (Jan to June), Y axis: O2 consumption (0 to 120)]

Multiple bar diagrams: When a comparison between two or more related variables has to
be made, then multiple bar diagrams are preferred. An example of a multiple bar
diagram is given below:

[Figure: multiple bar diagram; X axis: quarters (1st Qtr to 4th Qtr), Y axis: 0 to 90; one bar each for East, West and North in every quarter]

         1st Qtr   2nd Qtr   3rd Qtr   4th Qtr
East     20.4      27.4      90        20.4
West     30.6      38.6      34.6      31.6
North    45.9      46.9      45        43.9

Proportional Bar Diagram

Proportional Bar Diagrams are used for more complex comparisons wherein we not only
compare the values of data but also various components in frequencies of the data. Example
is given below:

Difference between Histogram and Bar diagram

BASIS FOR COMPARISON | HISTOGRAM | BAR GRAPH
Definition | Histogram refers to a graphical representation that displays data by way of bars to show the frequency of numerical data. | Bar graph is a pictorial representation of data that uses bars to compare different categories of data.
Indicates | Distribution of non-discrete variables | Comparison of discrete variables
Presents | Quantitative data | Categorical data
Spaces | Bars touch each other, hence there are no spaces between bars | Bars do not touch each other, hence there are spaces between bars
Elements | Elements are grouped together, so that they are considered as ranges. | Elements are taken as individual entities.
Can bars be reordered? | No | Yes
Width of bars | Need not be the same | Same

Pie Charts (Source: https://www.smartdraw.com/pie-chart/):
A pie chart is a circular chart divided into wedge-like sectors, illustrating proportion. Each
wedge represents a proportionate part of the whole, and the total value of the pie is always
100 percent.

Pie charts can make the size of portions easy to understand at a glance. They're widely used
in business presentations and education to show the proportions among a large variety of
categories including expenses, segments of a population, or answers to a survey

For example, let us consider placement data of students of M.Sc. Microbiology:

Type of Organization    No. of Students Placed    Angle
JRF/SRF                 10                        360° × 10/50 = 72°
Pharma Industry         12                        360° × 12/50 = 86.4°
Food Industry           25                        360° × 25/50 = 180°
Government Job          3                         360° × 3/50 = 21.6°
TOTAL                   50                        360°

[Figure: pie chart titled "Type of Organization" with sectors for JRF/SRF (10), Pharma Industry (12), Food Industry (25) and Government Job (3)]
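
The sector angles follow from the rule angle = 360° × (category count / total). A short Python sketch of this calculation for the placement data above (the dictionary name is illustrative):

```python
placements = {
    "JRF/SRF": 10,
    "Pharma Industry": 12,
    "Food Industry": 25,
    "Government Job": 3,
}

total = sum(placements.values())

# Each sector's angle is its share of the total, scaled to 360 degrees
angles = {name: 360 * count / total for name, count in placements.items()}

for name, angle in angles.items():
    print(f"{name:<16} {angle:6.1f} degrees")
```

The angles must add up to 360°, which serves as a check before drawing the chart.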

Numerical representation (which we will discuss)

a. Measures of location or central tendency: Mean, Median and Mode
b. Measures of dispersion: Range, Standard Deviation & Variance, Quartile Deviation

Statistical Inference (discussed in Module III for B.Sc. Microbiology / Module V for B.Sc. Anthropology):
Statistical inference is a method used to draw inference about the large original
population from given Sample(s)
Examples for such inferences:
- Hypothesis testing: yes/no questions about the population
- Estimation: estimation of numerical characteristics
- Correlation: description of the association of different variables
- Forecasting of future observations

Assignments:
1) Define Histogram and Bar Diagram. Give the difference between them.
2) Define Primary & Secondary Data. Differentiate between discrete variable and continuous
variables by giving appropriate examples
3) Define Sample & Population. Write a short note on descriptive and inferential statistics.
Write short notes on data scales/scaling techniques

5. Measures of Central Tendency – Mean, Mode & Median

Mean:
a) Arithmetic Mean ($\bar{X}$)
i) Arithmetic Mean for Ungrouped Data: If x1, x2, x3, …, xn is a set of n values then

$$\bar{X} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n} = \frac{\sum X}{n}$$

Find the Arithmetic Mean (AM) of: 20, 21, 20, 24, 21, 24, 26, 22, 23, 21 (Note here n = 10)

$$\text{AM} = \frac{20 + 21 + 20 + 24 + 21 + 24 + 26 + 22 + 23 + 21}{10} = \frac{222}{10} = 22.2 \quad \text{(Answer)}$$
ii) Arithmetic Mean for Discrete Grouped Data. Also known as the “weighted
arithmetic mean”
X               x1   x2   x3   …   xn
Frequency (f)   f1   f2   f3   …   fn

$$\bar{X} = \frac{f_1 x_1 + f_2 x_2 + f_3 x_3 + \dots + f_n x_n}{f_1 + f_2 + f_3 + \dots + f_n} = \frac{\sum fX}{\sum f}$$

Eg) Let us arrange the data given below in tabulated form

20, 21, 20, 24, 21, 24, 26, 22, 23, 21
X     20   21   22   23   24   26
f     2    3    1    1    2    1     Σf = 10
fX    40   63   22   23   48   26    ΣfX = 222

Here AM = ΣfX / Σf = 222/10 = 22.2
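
Both forms of the arithmetic mean can be checked with a few lines of Python. The sketch below computes the mean directly from the raw values and again from the frequency table; the variable names are illustrative.

```python
from collections import Counter

values = [20, 21, 20, 24, 21, 24, 26, 22, 23, 21]

# Ungrouped data: sum of the values divided by the number of values
mean_ungrouped = sum(values) / len(values)

# Discrete grouped data: weighted mean, sum(f * X) / sum(f)
frequency = Counter(values)            # maps each X to its frequency f
sum_fx = sum(x * f for x, f in frequency.items())
sum_f = sum(frequency.values())
mean_weighted = sum_fx / sum_f

print(mean_ungrouped, mean_weighted)
```

The two results agree because the frequency table is only a compact way of writing the same ten observations.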

iii) Arithmetic Mean for Continuous Grouped Data.

In case of continuous grouped data, Class Intervals are given and mid point of class interval
is used as variable X

Eg., Rate of respiration of 50 fishes in a species and their frequencies are given below.
Calculate the mean of this experimental data
Rate of Resp    1-10   11-20   21-30   31-40   41-50   51-60   61-70   71-80
Frequency       3      11      7       4       10      5       7       3

Here rate of Respiration is the variable given in the form of class interval. So we will use the
mid point of each class interval to calculate the mean
Mid point of Class Interval 1-10 is 5.5 (= m1)
Mid point of Class Interval 11-20 is 15.5 (= m2)
Mid point of Class Interval 21-30 is 25.5 (= m3) and so on

So our modified table will be

Rate of Resp (X)                 1-10   11-20   21-30   31-40   41-50   51-60   61-70   71-80
Frequency (f)                    3      11      7       4       10      5       7       3       Σf = 50
Mid point of class interval (m)  5.5    15.5    25.5    35.5    45.5    55.5    65.5    75.5
fm                               16.5   170.5   178.5   142     455     277.5   458.5   226.5   Σfm = 1925

$$\text{Mean} = \frac{\sum fm}{\sum f} = \frac{1925}{50} = 38.5$$
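
The same calculation can be sketched in Python: take the midpoint of each class interval as X and apply the weighted mean formula. The list names below are illustrative.

```python
# Class intervals (lower limit, upper limit) and their frequencies
classes = [(1, 10), (11, 20), (21, 30), (31, 40), (41, 50), (51, 60), (61, 70), (71, 80)]
frequencies = [3, 11, 7, 4, 10, 5, 7, 3]

# Midpoint of each class interval is used as the variable X
midpoints = [(lower + upper) / 2 for lower, upper in classes]

sum_fm = sum(f * m for f, m in zip(frequencies, midpoints))
sum_f = sum(frequencies)
mean = sum_fm / sum_f

print(midpoints)
print(sum_fm, sum_f, mean)
```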

Merits and Limitations of Arithmetic Mean
Merits:
1. Simplest to understand
2. Easy to Compute
3. Each item is used in calculation
4. Defined by rigid mathematical formula
5. Can be subjected to further algebraic treatment
6. It is relatively reliable, which means that it does not vary too much when repeated
samples are taken from one and same population

Limitations:
1. Since the value of mean is dependent on each and every item of the series, extreme
items, i.e. values which are very large or very small compared to most of the values in
the group unduly affect the value of average
2. In a distribution with open end classes the value of mean cannot be calculated without
making assumptions about the lower/upper limits of the class intervals
3. The average is not always a good measure of central tendency. It is a good measure
only when the frequency distribution follows a typical bell shaped curve. The average
is not a good measure of central tendency for U-shaped distribution (example –
failure rate of electronic components) or markedly skewed distribution like income
distribution or price distribution

Median
A median of a distribution is defined as the value of that variable which divides the total
frequency into two equal parts when the series is arranged in ascending or descending order
of magnitude

(a) Median for ungrouped data:

If n is odd: Median is the ((n+1)/2)th value

If n is even: Median is the average of the (n/2)th value and the next value

Eg. To find the Median of the given data set of 10 observations viz.
{20, 21, 20, 24, 21, 24, 26, 22, 23, 21}
First arrange it in ascending order of magnitude OR Ordered Array
{20, 20, 21, 21, 21, 22, 23, 24, 24, 26}
Here n = 10 (which is even value) so Median = Average of n/2 nd and its next value =
Average of 5th and 6th value = Average of 21 & 22 = (21+22)/2 = 43/2 = 21.5

Eg. To find the Median of the given data set of 9 observations viz.
{20, 21, 20, 24, 21, 24, 26, 22, 23}
First arrange it into an Ordered Array viz.
{20, 20, 21, 21, 22, 23, 24, 24, 26}
Here n = 9 (odd) so Median = ((n+1)/2)th value = 5th value = 22
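
The odd/even rule for ungrouped data can be written as a small Python function. This is a minimal sketch; the function name `median` is our own choice here (Python's standard library also offers `statistics.median`, which follows the same rule).

```python
def median(values):
    """Median of ungrouped data using the odd/even rule given above."""
    ordered = sorted(values)           # ordered array
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # n odd: the ((n+1)/2)th value, which is index n // 2 when counting from 0
        return ordered[middle]
    # n even: average of the (n/2)th value and the next value
    return (ordered[middle - 1] + ordered[middle]) / 2


ten_values = [20, 21, 20, 24, 21, 24, 26, 22, 23, 21]
nine_values = [20, 21, 20, 24, 21, 24, 26, 22, 23]

print(median(ten_values))    # average of the 5th and 6th values
print(median(nine_values))   # the 5th value
```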

Median for Discrete Series: In a discrete series items are first arranged into an Ordered
Array and their respective frequencies are written against respective items.

Median is located by the same formula as given in (a)

Eg., Determine median of the following data
X    5   6   7   8    9    10
f    2   4   8   10   15   25

Here n = ∑f = 2+4+8+10+15+25 = 64 which is an even number

So Median = Average of the (n/2)th value and the next value = Average of the 32nd and 33rd values
To find the 32nd value we will add a row of cumulative frequencies and our modified table
will be
X                      5   6   7    8    9    10
f                      2   4   8    10   15   25
Cumulative frequency   2   6   14   24   39   64

Looking at the cumulative frequency table we observe that the 25th to 39th values are all 9,
so the 32nd value is 9 and the 33rd value is also 9. Hence Median = 9
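
Locating a position through the cumulative frequencies can be sketched in Python as below. The helper `value_at_position` is hypothetical; it uses `bisect_left` from the standard library to find the first cumulative frequency that reaches the requested position.

```python
from bisect import bisect_left
from itertools import accumulate

xs = [5, 6, 7, 8, 9, 10]
fs = [2, 4, 8, 10, 15, 25]

cumulative = list(accumulate(fs))
n = cumulative[-1]


def value_at_position(position):
    """Value occupying the given position (counting from 1) in the ordered array."""
    # First class whose cumulative frequency is at least the position
    return xs[bisect_left(cumulative, position)]


if n % 2 == 0:
    median = (value_at_position(n // 2) + value_at_position(n // 2 + 1)) / 2
else:
    median = value_at_position((n + 1) // 2)

print(cumulative, n, median)
```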

(b) Median for grouped data

In this case the median can not be found directly by the method given above.
Here n = ∑f

The steps in computation of median in this case are:

1. Make the cumulative frequency table
2. Find the median class, i.e. the class in which the ((n+1)/2)th value lies
3. Calculate the Median using the formula

$$\text{Median} = L_1 + \frac{\frac{n}{2} - c}{f_m} \times h$$

Where
L1 = Lower limit of the median class
n = ∑f
c = cumulative frequency of the class preceding the median class
fm = frequency of the median class
h = width of the class interval of the median class

Eg. Calculate the median

CI          1-10   11-20   21-30   31-40   41-50   51-60   61-70   71-80
Frequency   3      15      2       8       11      4       1       6

Here first we will prepare the cumulative frequency (cf) table and the revised table will be:
CI          1-10   11-20   21-30   31-40   41-50   51-60   61-70   71-80
Frequency   3      15      2       8       11      4       1       6
cf          3      18      20      28      39      43      44      50
Here n = 50
Median class = class in which the ((n+1)/2)th value = (51/2)th value = 25.5th value lies
Looking at the cf table the 25.5th value lies in the class interval 31-40, which will be our median
class
Therefore,
L1 = 31; n = 50; c = 20; fm = 8; h = 10

$$\text{Median} = 31 + \frac{\frac{50}{2} - 20}{8} \times 10 = 31 + 6.25 = 37.25$$
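
The three steps can be sketched in Python as follows. This is an illustration that follows the notes' convention of using the lower class limit as L1; the function name `grouped_median` is hypothetical.

```python
from itertools import accumulate


def grouped_median(lower_limits, frequencies, width):
    """Median of grouped data: L1 + ((n/2 - c) / fm) * h."""
    n = sum(frequencies)
    cumulative = list(accumulate(frequencies))
    # Median class: first class whose cumulative frequency reaches (n+1)/2
    target = (n + 1) / 2
    index = next(i for i, cf in enumerate(cumulative) if cf >= target)
    c = cumulative[index - 1] if index > 0 else 0   # cf of the preceding class
    fm = frequencies[index]
    return lower_limits[index] + (n / 2 - c) * width / fm


lower_limits = [1, 11, 21, 31, 41, 51, 61, 71]
frequencies = [3, 15, 2, 8, 11, 4, 1, 6]

median = grouped_median(lower_limits, frequencies, width=10)
print(median)
```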

Merits and Limitations of Median
Merits of median
(i) If found directly it represents an actual item
(ii) It eliminates the effects of extreme items, since they are not taken into its calculation, and
hence it is a good measure of central tendency in case of markedly skewed
distributions like income or price distribution
(iii) The values of only the middle items are required to be known
(iv) It can be found even for qualitative data, provided the items can be ranked (ordinal data)
(v) Median is the most suitable average for expressing such ranked qualitative data

Limitations of Median
(i) It may not be representative when the distribution is irregular
(ii) It cannot be located exactly when the items are grouped. It can only be estimated in this case.
(iii) The data must be arranged in ascending or descending order, which involves considerable
work if the number of items is large
(iv) It is a positional value only and is not based on every value of the distribution
(v) Median is affected more by sampling fluctuations than the mean

Quartiles
The procedure for computing quartiles is the same as for the median

Q1 = size of the ((N+1)/4)th item (if N is odd) or the (N/4)th item (if N is even)

Q2 = Median
Q3 = size of the (3(N+1)/4)th item (if N is odd) or the (3N/4)th item (if N is even)

Q) From the following data compute the value of upper and lower quartiles:
Marks             Below 10   10-20   20-40   40-60   60-80   Above 80
No. of students   8          10      22      25      10      5

First we will calculate cumulative frequencies (c.f.)

Marks                 Below 10   10-20   20-40   40-60   60-80   Above 80
No. of students (f)   8          10      22      25      10      5
c.f.                  8          18      40      65      75      80

First quartile Q1 = size of the (N/4)th item = 80/4 = 20th item

Hence Q1 lies in the class 20 – 40

$$Q_1 = L + \frac{\frac{N}{4} - c.f.}{f} \times i$$

L = 20, N/4 = 20, c.f. = 18, f = 22, i = 20

$$Q_1 = 20 + \frac{20 - 18}{22} \times 20 = 20 + 1.82 = 21.82$$

Similarly Q3 = size of the (3N/4)th item = 60th item

Hence Q3 lies in the class 40 – 60

$$Q_3 = L + \frac{\frac{3N}{4} - c.f.}{f} \times i$$

In this case L = 40, 3N/4 = 60, c.f. = 40, f = 25, i = 20

$$Q_3 = 40 + \frac{60 - 40}{25} \times 20 = 40 + 16 = 56$$
Mode
Mode of a frequency distribution is defined as “that value of the variable for which the
frequency is maximum”

Mode for discrete series:

Find the most repeated value in the discrete data. It will be the mode
Eg., (i) The mode of 1, 2, 2, 3, 4, 5, 6 is 2 (ii) The modes of 1, 2, 2, 3, 3, 4, 4, 5 are 2, 3 and 4

Mode for continuous series:

For continuous grouped data the Mode is calculated using the following steps
1. Modal Class is determined by inspection. It is the class which has maximum frequency
2. Formula of Mode is applied which is

$$\text{Mode} = L_1 + \frac{f_1 - f_0}{(f_1 - f_0) + (f_1 - f_2)} \times h$$

Where
L1 = Lower limit of modal class
f0 = frequency of the class preceding modal class
f1 = frequency of the modal class
f2 = frequency of the class succeeding the modal class
h = Width of the class interval of modal class

Eg. Find mode of the distribution given below:

CI          30-34   35-39   40-45
Frequency   2       6       3

Here since the class interval 35-39 has the maximum frequency it will be our modal class
As per the formula
L1 = 35; f0 = 2; f1 = 6; f2 = 3; h = 5

$$\text{Mode} = 35 + \frac{6 - 2}{(6 - 2) + (6 - 3)} \times 5 = 35 + \frac{20}{7} \approx 35 + 2.86 = 37.86 \text{ (approx)}$$
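
The modal class and the mode formula can be sketched in Python as follows. The function name `grouped_mode` is hypothetical, and treating a missing neighbouring class as having frequency 0 is an assumption of this sketch.

```python
def grouped_mode(lower_limits, frequencies, width):
    """Mode of grouped data: L1 + (f1 - f0) / ((f1 - f0) + (f1 - f2)) * h."""
    # Modal class: the class with the maximum frequency (first one if tied)
    index = frequencies.index(max(frequencies))
    f1 = frequencies[index]
    # Assumption: a missing neighbouring class is treated as frequency 0
    f0 = frequencies[index - 1] if index > 0 else 0
    f2 = frequencies[index + 1] if index + 1 < len(frequencies) else 0
    return lower_limits[index] + (f1 - f0) / ((f1 - f0) + (f1 - f2)) * width


mode = grouped_mode(lower_limits=[30, 35, 40], frequencies=[2, 6, 3], width=5)
print(round(mode, 2))   # 35 + 20/7
```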

Empirical formula for Mode

Many a time the mode is ill defined: a distribution may show two modes (bi-modal) or
more than two modes (multi-modal). In such cases, for a moderately skewed distribution,
the mode can be estimated by using the empirical formula
Mode = 3 Median – 2 Mean
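
As a quick illustration of the empirical formula, the Python sketch below applies it to the ten observations used earlier in this module (mean 22.2, median 21.5). The function name is hypothetical, and the result is only an estimate: it need not coincide with the most frequent value in the data.

```python
def empirical_mode(mean, median):
    """Empirical estimate of the mode: 3 * Median - 2 * Mean."""
    return 3 * median - 2 * mean


estimate = empirical_mode(mean=22.2, median=21.5)
print(round(estimate, 2))
```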

Merits and Limitations of Mode
Merits of Mode
(i) It avoids the effects of extreme values
(ii) Often it can be ascertained by mere inspection
(iii) Only the values occurring with high frequencies are required to be known. All values
need not be known

Limitations of Mode
(i) It is not well defined and therefore is rarely used for higher life science research
(ii) Arithmetic explanation of mode is not possible
(iii) Sometimes it is indefinite
(iv) It becomes difficult to find a good measure of location in multi-modal distributions
(v) It is not based on all the observations of a series

6. Measure of variation

Significance of a good measure of variation

1. To determine the reliability of an average: A good measure of
variation helps us to determine how far an average is representative of
the mass of data. When dispersion is small, the average is a typical value in
the sense that it closely represents the individual values, and it is reliable in
the sense that it is a good estimate of the average in the corresponding universe
(population).
2. To serve as a basis for the control of variability: It helps us to
determine the nature and cause of variation in order to control the
variation itself. In matters related to health, variations in body temperature,
pulse beat and blood pressure are basic guides to diagnosis. Prescribed
treatment is designed to control their variation.
3. To compare two or more series with regard to their variability: It
helps to compare the variability of two or more series.
4. To facilitate the use of other statistical measures: Correlation,
other statistical tools, hypothesis testing and quality control techniques such as
product control and process control are based on measures of dispersion.

Properties of a good measure of variation:

1. It should be simple to understand
2. It should be easy to compute
3. It should be rigidly defined
4. It should be based on each and every item of distribution
5. It should be amenable to further algebraic treatment
6. It should not be unduly affected by extreme values

There are five different measures of variation in a frequency distribution:

1. Range
2. Mean Deviation
3. Standard Deviation / Variance
4. Quartile Deviation (QD)/ Inter-Quartile Range (IQR)
5. Coefficient of variation

Range: Maximum Value (L) – Minimum Value (S)

$$\text{Coefficient of Range} = \frac{L - S}{L + S}$$

Mean Deviation:
The mean deviation is also known as average deviation. It is the average difference between
the items in a distribution from the median or mean of that series. Theoretically there is an
advantage in taking the deviation from median because sum of the deviations of items from
median is minimum when signs are ignored. However, in practice the arithmetic mean is
more frequently used in calculating the value of average deviation and this is the reason why
it is more commonly called “mean deviation”.

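Discrete ungrouped:

$$\text{Mean Deviation} = \frac{\sum |X - \bar{x}|}{n}$$

Discrete grouped:

$$\text{Mean Deviation} = \frac{\sum f\,|X - \bar{x}|}{\sum f}$$

Where
f = frequencies of X

Continuous:

$$\text{Mean Deviation} = \frac{\sum f\,|m - \bar{x}|}{\sum f}$$

Where
f = frequencies of class intervals and
m = mid points of class intervals

$$\text{Coefficient of Mean Deviation} = \frac{\text{Mean Deviation}}{\text{Mean}}$$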

Calculation of mean deviation (Discrete Ungrouped series)

Eg) Calculate mean deviation with respect to mean for the following data
X 23 22 20 24 16 17 18 19 21 ∑X = 180
X – x̄ = X – 20 3 2 0 4 -4 -3 -2 -1 1
|X – x̄| 3 2 0 4 4 3 2 1 1 ∑|X – x̄| = 20

Mean = 180/9 = 20

$$\text{Mean Deviation} = \frac{\sum |X - \bar{x}|}{n} = \frac{20}{9} = 2.22$$

Calculation of mean deviation (Discrete Grouped series)

Eg) Calculate mean deviation with respect to mean for the following data

X 10 11 12 13 14 15 16 17 18
f 1 2 3 4 5 4 3 2 1 ∑f = 25
fX 10 22 36 52 70 60 48 34 18 ∑fX = 350
Mean = 350/25 = 14
X – x̄ = X – 14 -4 -3 -2 -1 0 1 2 3 4
|X – x̄| 4 3 2 1 0 1 2 3 4
f|X – x̄| 4 6 6 4 0 4 6 6 4 ∑f|X – x̄| = 40

$$\text{Mean Deviation} = \frac{\sum f\,|X - \bar{x}|}{\sum f} = \frac{40}{25} = 1.6$$
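
Both mean deviation examples above can be checked with a short Python sketch. The function name `mean_deviation` is an illustrative assumption. Ungrouped data are handled by giving every value a frequency of 1.

```python
def mean_deviation(values, freqs=None):
    """Mean deviation about the arithmetic mean; freqs default to 1 (ungrouped)."""
    if freqs is None:
        freqs = [1] * len(values)
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    return sum(f * abs(x - mean) for x, f in zip(values, freqs)) / n

# Discrete ungrouped example
ungrouped = mean_deviation([23, 22, 20, 24, 16, 17, 18, 19, 21])
# Discrete grouped example: X = 10..18 with the given frequencies
grouped = mean_deviation(list(range(10, 19)), [1, 2, 3, 4, 5, 4, 3, 2, 1])
print(round(ungrouped, 2), round(grouped, 2))  # 2.22 1.6
```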

Standard Deviation
Standard deviation may be defined as “the square root of the arithmetic mean of the squares
of deviations from the arithmetic mean”

Computation of Standard Deviation (σ)

Standard Deviation for ungrouped data:
Standard deviation for {x1, x2, x3, …, xn} is calculated by the formula

$$\sigma = \sqrt{\frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + (x_3 - \bar{x})^2 + \dots + (x_n - \bar{x})^2}{n}} \qquad (1)$$

The above formula can also be written as

$$\sigma = \sqrt{\frac{\sum (X - \bar{x})^2}{n}}$$

Where $\bar{x} = \text{Mean} = \frac{\sum X}{n}$

Formula (1) can further be simplified into an easier and more popular form, viz.

$$\sigma = \sqrt{\frac{\sum X^2}{n} - \left(\frac{\sum X}{n}\right)^2} \qquad (2)$$

For all the questions on Standard Deviation it is convenient to use Formula (2)

Eg., Haemoglobin percent g/100 ml of liver fed Wallago attu was recorded as 23, 22, 20,
24, 16, 17, 18, 19 and 21. Calculate the Standard Deviation

X 23 22 20 24 16 17 18 19 21 ∑X = 180
X² 529 484 400 576 256 289 324 361 441 ∑X² = 3660

Here x̄ = Mean = 180/9 = 20

Using formula (2) for Standard Deviation

$$\sigma = \sqrt{\frac{3660}{9} - (20)^2} = \sqrt{\frac{60}{9}} = \frac{2\sqrt{15}}{3} = \frac{2 \times 3.87}{3} = 2.58 \text{ (approx)}$$
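
The haemoglobin example can be checked with a minimal Python sketch of formula (2). The function name `std_dev` is an illustrative assumption. It divides by n (the population form used throughout these notes), not by n – 1.

```python
from math import sqrt

def std_dev(values):
    """Standard deviation via formula (2): sqrt(sum(X^2)/n - (sum(X)/n)^2)."""
    n = len(values)
    return sqrt(sum(x * x for x in values) / n - (sum(values) / n) ** 2)

haemoglobin = [23, 22, 20, 24, 16, 17, 18, 19, 21]
sigma = std_dev(haemoglobin)
print(round(sigma, 2))  # 2.58
```

Python's standard library function `statistics.pstdev` computes the same population standard deviation and can be used as a cross-check.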

Standard Deviation for discrete grouped data:

$$\sigma = \sqrt{\frac{\sum fX^2}{\sum f} - \left(\frac{\sum fX}{\sum f}\right)^2} \qquad (3)$$

Standard Deviation for continuous grouped data:

$$\sigma = \sqrt{\frac{\sum fm^2}{\sum f} - \left(\frac{\sum fm}{\sum f}\right)^2} \qquad (4)$$

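A very important property of Standard Deviation is that it is independent of
change of origin, but not of change of scale.
It means that if a suitably chosen constant is added to or subtracted from each value of the
variable, the value of Standard Deviation remains unchanged. If each value is also divided
by a constant h, the Standard Deviation is divided by h as well, so the result must be
multiplied by h to recover the Standard Deviation of the original values.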

This leads to a very important method called Step Deviation Method.

Eg. We will find Standard Deviation for discrete grouped data given below
X 101 102 103 104 105 106 107 108 109 110
f 2 4 6 8 10 8 6 4 2 2

Please note that if we try to solve this problem using the normal method, the values of X are
large. To find Standard Deviation we have to find X² and then ∑fX², which will be even larger
values. By Step Deviation Method our values of X will be simplified.

By Step Deviation Method, we shall subtract a suitably chosen constant from all the values of X.
Now the question arises: how to choose this constant? An easy method is to find the average of
the maximum and minimum values of X. In the above question the minimum and maximum values of
X are 101 and 110 respectively, the average of which is 211/2 = 105.5. To make our
calculations simple we shall choose 105 instead of 105.5

Always remember that in step deviation method a suitable constant is subtracted only from
the variables (values of X or m). Frequencies remain unchanged.

Applying Step Deviation Method i.e. subtracting 105 from each value of X, we get new
variables which we name X1. Our modified table shall be

X1 = X – 105 -4 -3 -2 -1 0 1 2 3 4 5
f 2 4 6 8 10 8 6 4 2 2 ∑f = 52
X1² 16 9 4 1 0 1 4 9 16 25
fX1 -8 -12 -12 -8 0 8 12 12 8 10 ∑fX1 = 10
fX1² 32 36 24 8 0 8 24 36 32 50 ∑fX1² = 250

In formula (3) of Standard Deviation we shall use X1 instead of X, then formula will be

$$\sigma = \sqrt{\frac{\sum fX_1^2}{\sum f} - \left(\frac{\sum fX_1}{\sum f}\right)^2}$$

$$\sigma = \sqrt{\frac{250}{52} - \left(\frac{10}{52}\right)^2} = 2.2 \text{ (approx)}$$
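
A short Python sketch can confirm that subtracting the constant 105 leaves the Standard Deviation unchanged. The function name `grouped_sd` is an illustrative assumption. It simply applies formula (3) to whichever values it is given.

```python
from math import sqrt

def grouped_sd(values, freqs):
    """Formula (3): sqrt(sum(f*X^2)/sum(f) - (sum(f*X)/sum(f))^2)."""
    n = sum(freqs)
    mean_of_squares = sum(f * x * x for x, f in zip(values, freqs)) / n
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    return sqrt(mean_of_squares - mean ** 2)

x = list(range(101, 111))              # 101 ... 110
f = [2, 4, 6, 8, 10, 8, 6, 4, 2, 2]
x1 = [xi - 105 for xi in x]            # step deviation: shift the origin to 105

sd_direct = grouped_sd(x, f)           # works with the large original values
sd_step = grouped_sd(x1, f)            # works with the small shifted values
print(round(sd_direct, 3), round(sd_step, 3))  # both about 2.184
```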

Next we shall find Standard Deviation for continuous grouped data given below

Question) Ovary weight of 50 fishes and their frequency is given in class interval (CI),
tabulated below. Find Standard Deviation
CI 2-3 3-4 4-5 5-6 6-7
Frequency 6 13 11 8 12

Here first we shall find the mid point (m) of each class interval
CI 2-3 3-4 4-5 5-6 6-7
Frequency 6 13 11 8 12
M 2.5 3.5 4.5 5.5 6.5

Now we shall apply step deviation method. Here we shall subtract a suitable constant from
the values of m. The minimum and maximum values of m are 2.5 and 6.5 respectively, and their
average is 4.5, so it seems appropriate to subtract 4.5 from each value of m.
Our modified table will be
CI 2-3 3-4 4-5 5-6 6-7
Frequency (f) 6 13 11 8 12 ∑f = 50
m 2.5 3.5 4.5 5.5 6.5
m1 = m – 4.5 -2 -1 0 1 2
m1² 4 1 0 1 4
fm1 -12 -13 0 8 24 ∑fm1 = 7
fm1² 24 13 0 8 48 ∑fm1² = 93

In formula (4) of Standard Deviation we shall use m1 instead of m, then formula will be

$$\sigma = \sqrt{\frac{\sum fm_1^2}{\sum f} - \left(\frac{\sum fm_1}{\sum f}\right)^2}$$

Upon calculation

$$\sigma = \sqrt{\frac{93}{50} - \left(\frac{7}{50}\right)^2} = \sqrt{1.8404} = 1.36 \text{ (approx)}$$
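
The ovary weight calculation can be reproduced with the following Python sketch, which builds the mid points, applies the shift of 4.5 and then uses formula (4). The variable names are illustrative assumptions.

```python
from math import sqrt

classes = [(2, 3), (3, 4), (4, 5), (5, 6), (6, 7)]
freqs = [6, 13, 11, 8, 12]

mids = [(lo + hi) / 2 for lo, hi in classes]   # 2.5, 3.5, 4.5, 5.5, 6.5
m1 = [m - 4.5 for m in mids]                   # step deviations from 4.5

n = sum(freqs)                                               # 50
sum_fm1 = sum(f * d for f, d in zip(freqs, m1))              # 7
sum_fm1_sq = sum(f * d * d for f, d in zip(freqs, m1))       # 93
sigma = sqrt(sum_fm1_sq / n - (sum_fm1 / n) ** 2)
print(sum_fm1, sum_fm1_sq, round(sigma, 2))  # 7.0 93.0 1.36
```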

Merits and Limitations of Standard deviation

Merits of Standard Deviation
1. Standard deviation summarizes the deviation of a large distribution from mean
2. It indicates whether the variation of differences of an individual from the mean is real or
by chance
3. It helps in calculating Standard Error

Limitations of Standard deviation

1. Standard deviation gives greater weightage to extreme values, because the deviations are squared.
2. The process of squaring deviations and then taking square root involves lengthy
calculations

Variance
Square of standard deviation is called Variance i.e. Variance = σ²

Quartile Deviation (QD)

$$QD = \frac{Q_3 - Q_1}{2}$$

Inter Quartile Range (IQR)

$$IQR = Q_3 - Q_1$$

Box Plot (Based on Q1, Q2, Q3)

A box-and-whisker plot organizes data values into four groups. Ordered data are divided
into lower and upper halves by the median. The median of the lower half is the lower
quartile. The median of the upper half is the upper quartile

Make a box-and-whisker plot

Example 1: The following set of numbers are the allowances of fifteen different boys in a
given week (they are arranged from least to greatest).
Step I: Find the median (which will act as second quartile Q2).
Step II: Find First Quartile Q1 and Third Quartile Q3
Step III: Plot the minimum value, the quartiles Q1, Q2 and Q3, and the maximum value as per the diagram given below:

Q3 – Q1 = Interquartile Range
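
The steps for a box-and-whisker plot can be sketched in Python as below. The fifteen allowance values are hypothetical, chosen only for illustration, and the function name `five_number_summary` is an assumption. For an odd number of values the sketch leaves the median out of both halves, which is one common convention. Other conventions give slightly different quartiles.

```python
from statistics import median

def five_number_summary(values):
    """Minimum, Q1, Q2 (median), Q3 and maximum using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]                                  # values below the median
    upper = data[half + 1:] if n % 2 else data[half:]    # values above the median
    return min(data), median(lower), median(data), median(upper), max(data)

# Hypothetical allowances of fifteen boys, arranged from least to greatest
allowances = [10, 12, 15, 15, 18, 20, 22, 25, 25, 28, 30, 32, 35, 38, 40]
low, q1, q2, q3, high = five_number_summary(allowances)
print(low, q1, q2, q3, high, q3 - q1)  # 10 15 25 32 40 17
```

The box runs from Q1 to Q3 with a line at Q2, and the whiskers extend to the minimum and maximum values.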

Questions to calculate QD

Discrete series
Eg) Compute the coefficient of quartile deviation from the following data
Marks 10 20 30 40 50 80
No. of students 4 7 15 8 7 2
Cumulative frequency (cf) 4 11 26 34 41 43
N = 43
Q1 = size of (N+1)/4 th item = 11th item = 20
Q3 = size of 3(N+1)/4 th item = 33rd item = 40
QD = (Q3 – Q1)/2 = (40 – 20)/2 = 10
Coefficient of QD = (Q3 – Q1)/(Q3 + Q1) = (40 – 20)/(40 + 20) = 0.333
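
The cumulative frequency lookup used in this example can be sketched in Python as follows. The helper name `item_value` is an illustrative assumption. It returns the value of the first class whose cumulative frequency reaches the required item number.

```python
from itertools import accumulate

def item_value(values, freqs, position):
    """Value of the item at the given 1-based position in the ordered series."""
    for value, cf in zip(values, accumulate(freqs)):
        if position <= cf:
            return value
    raise ValueError("position exceeds total frequency")

marks = [10, 20, 30, 40, 50, 80]
students = [4, 7, 15, 8, 7, 2]
n = sum(students)                                    # 43

q1 = item_value(marks, students, (n + 1) / 4)        # 11th item
q3 = item_value(marks, students, 3 * (n + 1) / 4)    # 33rd item
qd = (q3 - q1) / 2
coeff_qd = (q3 - q1) / (q3 + q1)
print(q1, q3, qd, round(coeff_qd, 3))  # 20 40 10.0 0.333
```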

Merits and Limitations of Quartile Deviation
Merits
1. In certain respects it is superior to range as a measure of dispersion
2. It has a special utility in measuring variation in case of open-end distributions or one
in which data may be ranked
3. It is also useful in erratic skewed distributions, where the other measures of variations
may be warped by extreme values
Limitations
1. Quartile deviation ignores 50% items i.e. the first 25% and the last 25%. As the value
of quartile deviation does not depend upon every item of the series it cannot be
regarded as a good method of measuring dispersion
2. It is not capable of mathematical manipulation
3. Its value is very much affected by sampling fluctuations
4. It does not show scatter of data around average, rather it shows a distance on scale

Mean & Standard deviation of a composite group

If two groups contain n1 and n2 observations with means x̄1 and x̄2, and standard deviations
σ1 and σ2 respectively, then:
Mean of the composite group (x̄) is given by:

$$\bar{x} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$$

Standard deviation (σ) of the composite group is given by:

$$\sigma^2 = \frac{n_1 \sigma_1^2 + n_2 \sigma_2^2 + n_1 d_1^2 + n_2 d_2^2}{n_1 + n_2}$$

where
$d_1 = \bar{x}_1 - \bar{x}$
$d_2 = \bar{x}_2 - \bar{x}$

Example) There are two branches of an establishment employing 100 and 80 persons
respectively. If the arithmetic mean of monthly salaries paid by the two branches are Rs. 275
and Rs. 225 respectively, find the arithmetic mean of the salaries of the employees of the
establishment as a whole.

Sol:

Branch No. 1 Branch No. 2

No. of person 100 (n1) 80 (n2)
Arithmetic mean of monthly salaries 275 (x1) 225 (x2)
(in Rs.)

Mean of the composite group (x̄) is given by:

$$\bar{x} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2} = \frac{100 \times 275 + 80 \times 225}{100 + 80} = \frac{45500}{180} = 252.78$$

Example) For a group of 50 boys the mean score and the standard deviation of the scores in a
test are 59.5 and 8.38 respectively while for a group of 40 girls the mean score and the
standard deviation of the scores in the same test are 54 and 8.23 respectively. Find the mean
and standard deviation of the combined group of 90 students.
Boys Girls
No. 50 (n1) 40 (n2)
Arithmetic mean 59.5 (x̄1) 54.0 (x̄2)
S.D. 8.38 (σ1) 8.23 (σ2)

As per the formulas

Mean of the composite group (x̄) is given by:

$$\bar{x} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2} = \frac{50 \times 59.5 + 40 \times 54.0}{50 + 40} = \frac{5135}{90} = 57.05 \text{ (approx)}$$

Standard deviation (σ) of the composite group is given by:

$$\sigma^2 = \frac{n_1 \sigma_1^2 + n_2 \sigma_2^2 + n_1 d_1^2 + n_2 d_2^2}{n_1 + n_2}$$

where
d1 = x̄1 – x̄ = 59.5 – 57.05 = 2.45
d2 = x̄2 – x̄ = 54 – 57.05 = –3.05

Hence,

$$\sigma^2 = \frac{50 \times (8.38)^2 + 40 \times (8.23)^2 + 50 \times (2.45)^2 + 40 \times (3.05)^2}{50 + 40} = 76.58$$

$$\sigma = \sqrt{76.58} = 8.75$$
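
The composite group formulas can be checked with a minimal Python sketch. The function name `combine_groups` is an illustrative assumption. It works with the unrounded combined mean (57.0556), so the intermediate values differ slightly from the hand calculation while the final answer agrees.

```python
from math import sqrt

def combine_groups(n1, mean1, sd1, n2, mean2, sd2):
    """Mean and standard deviation of two groups taken together."""
    n = n1 + n2
    mean = (n1 * mean1 + n2 * mean2) / n
    d1, d2 = mean1 - mean, mean2 - mean
    variance = (n1 * sd1 ** 2 + n2 * sd2 ** 2 + n1 * d1 ** 2 + n2 * d2 ** 2) / n
    return mean, sqrt(variance)

# Boys: n = 50, mean 59.5, SD 8.38; girls: n = 40, mean 54.0, SD 8.23
combined_mean, combined_sd = combine_groups(50, 59.5, 8.38, 40, 54.0, 8.23)
print(round(combined_mean, 2), round(combined_sd, 2))  # 57.06 8.75
```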

-------------------------------------------------------------------------------------------------------------

Coefficient of Variation
The measures of dispersion - Range, Mean Deviation, Standard deviation etc. are expressed
in same unit as the original observations, and are called Absolute measures of variability. So
they can not be used for comparing the variability of two or more distributions given in
different units. In order to meet such situations, we use Coefficient of Variation to compare
the variability in two different distributions.

$$\text{Coefficient of variation (CV)} = 100 \times \frac{\text{Standard Deviation}}{\text{Mean}}$$

Eg., The scores of two batsmen, A and B, in ten innings during a certain season are as under:
A 32 28 47 63 71 39 10 60 96 14
B 19 31 48 53 67 90 10 62 40 80
Find which of the batsman is more consistent in scoring

Sol. For comparing the variability we use CV = 100 × Standard deviation / Mean
For cricketer A
Mean = (32+28+47+63+71+39+10+60+96+14)/10 = 46

Standard deviation = 25.5 (please calculate)
Therefore, CV = 100 X 25.5/46 = 55
For cricketer B
Mean = 50
Standard deviation = 24.4
Therefore, CV = 100 X 24.4/50 = 49

Since for Cricketer B, the CV is smaller he is more consistent.
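
The standard deviations left as an exercise above can be verified with a short Python sketch. It uses `statistics.pstdev`, which divides by n as these notes do. The function name `coefficient_of_variation` is an illustrative assumption.

```python
from statistics import mean, pstdev

def coefficient_of_variation(scores):
    """CV = 100 * standard deviation / mean, using the population SD."""
    return 100 * pstdev(scores) / mean(scores)

batsman_a = [32, 28, 47, 63, 71, 39, 10, 60, 96, 14]
batsman_b = [19, 31, 48, 53, 67, 90, 10, 62, 40, 80]

cv_a = coefficient_of_variation(batsman_a)
cv_b = coefficient_of_variation(batsman_b)
print(round(cv_a, 1), round(cv_b, 1))  # 55.4 48.9
more_consistent = "A" if cv_a < cv_b else "B"
print(more_consistent)  # B
```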

Assignments:

1) If the interest paid on each of three different sums of money yielding 5%, 6% and
8% simple interest per annum respectively is the same, what is the average yield
percent on the total sum invested? (Hint: Find which mean to use AM, HM or GM)

2) Find the Standard deviation of the incubation period of smallpox in 50 patients from the
following data:
Period: 10 11 12 13 14 15 16
No. of patients: 2 7 11 15 10 4 1

2) Calculate mean and the standard deviation of the following frequency distribution
Variable 5 10 15 20 25 30 40 45 50 60
Frequency 2 4 6 6 10 10 10 6 4 2

3) Find the Standard deviation of first ten natural numbers. (Hint: First ten natural
numbers are 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, you have to find their standard deviation)

4) From the given data set, state which series is more variable:
Variable 10 – 20 20 – 30 30 – 40 40 – 50 50 – 60 60 – 70
Series A 10 18 32 40 22 18
Series B 18 22 40 32 18 10

5) In a series of adults the mean blood pressure was 135 mmHg with standard deviation 10
mmHg. In the same series mean height was 170 cm with standard deviation 6 cm. Which
character shows greater variation?

6) In an industrial establishment, the coefficients of variation of wages of male and female

workers were 55% and 70% respectively. The standard deviations were Rs. 22 and Rs. 15.40
respectively. Calculate the combined average wages for all the workers, if 80% of the
workers were male.

7) The median and mode of the following frequency distribution are known to be 27 and 26
respectively. Find the values of a & b.
Values 0-10 10-20 20-30 30-40 40-50
Frequency 3 a 20 12 b

8) The mean monthly salary paid to all employees in a certain company was Rs. 500. The
mean monthly salaries paid to male and female employees were 520 and 420 rupees
respectively. Obtain the percentage of male to female employees in the company.

9) Find median, mode and standard deviation of the data given below:
CI 1-10 11-20 21-30 31-40 41-50 51-60 61-70 71-80
F 3 11 7 4 10 5 7 3

10) Find the standard deviation by step deviation method for the following data on the age of
patients suffering from pulmonary disease.
Age (in years) 0-10 10-20 20-30 30-40 40-50 50-60 60-70
No. of patients 6 14 10 8 1 3 8

11) Calculate Quartile Deviation: Find the interquartile range and Coefficient of quartile
deviation from the following data

Marks in statistics Above 0 10 20 30 40 50 60 70 80

No. of students 150 140 100 80 80 70 30 14 0

5. Skewness Moments & Kurtosis:

Skewness
Skewness refers to asymmetry or lack of symmetry in the shape of frequency distribution i.e.
When a distribution is not symmetrical it is called a skewed distribution.

Tests of Skewness
Skewness is present if:
1. The values of mean, median and mode do not coincide
2. The data, when plotted on a graph, do not give the normal bell-shaped curve

Measures of Skewness
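1. Karl Pearson’s Coefficient of skewness

$$Sk_p = \frac{\text{Mean} - \text{Mode}}{\text{Standard Deviation}}$$

OR using the Empirical Formula, when mode is ill defined:

Mode = 3 Median – 2 Mean

$$Sk_p = \frac{3\,(\text{Mean} - \text{Median})}{\text{Standard Deviation}}$$

2. Bowley’s Coefficient of skewness

$$Sk_B = \frac{(Q_3 - \text{Median}) - (\text{Median} - Q_1)}{(Q_3 - \text{Median}) + (\text{Median} - Q_1)}$$

OR

$$Sk_B = \frac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1}$$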

3. Measure of skewness based on moments

First we will understand Moments

MOMENTS
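Moments about mean (central moments)
In each case the first form is for ungrouped data (n observations) and the second is for a frequency distribution (N = ∑f).
First moment about mean

$$\mu_1 = \frac{\sum (X - \bar{x})}{n} = 0 \qquad \mu_1 = \frac{\sum f(X - \bar{x})}{N} = 0$$

Second moment about mean

$$\mu_2 = \frac{\sum (X - \bar{x})^2}{n} = \sigma^2 \qquad \mu_2 = \frac{\sum f(X - \bar{x})^2}{N} = \sigma^2$$

Third moment about mean

$$\mu_3 = \frac{\sum (X - \bar{x})^3}{n} \qquad \mu_3 = \frac{\sum f(X - \bar{x})^3}{N}$$

Fourth moment about mean

$$\mu_4 = \frac{\sum (X - \bar{x})^4}{n} \qquad \mu_4 = \frac{\sum f(X - \bar{x})^4}{N}$$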

Skewness & Kurtosis Based on Moments

Skewness (β1):

$$\beta_1 = \frac{\mu_3^2}{\mu_2^3}$$

Kurtosis (β2):

$$\beta_2 = \frac{\mu_4}{\mu_2^2}$$

Moments about Origin (OR Moments about Zero)

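First moment about origin

$$\nu_1 = \frac{\sum X}{n} \qquad \nu_1 = \frac{\sum fX}{N}$$

Second moment about origin

$$\nu_2 = \frac{\sum X^2}{n} \qquad \nu_2 = \frac{\sum fX^2}{N}$$

Third moment about origin

$$\nu_3 = \frac{\sum X^3}{n} \qquad \nu_3 = \frac{\sum fX^3}{N}$$

Fourth moment about origin

$$\nu_4 = \frac{\sum X^4}{n} \qquad \nu_4 = \frac{\sum fX^4}{N}$$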

Moments about Arbitrary Point ‘A’

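First moment about A

$$\mu_1' = \frac{\sum (X - A)}{n} \qquad \mu_1' = \frac{\sum f(X - A)}{N}$$

Second moment about A

$$\mu_2' = \frac{\sum (X - A)^2}{n} \qquad \mu_2' = \frac{\sum f(X - A)^2}{N}$$

Third moment about A

$$\mu_3' = \frac{\sum (X - A)^3}{n} \qquad \mu_3' = \frac{\sum f(X - A)^3}{N}$$

Fourth moment about A

$$\mu_4' = \frac{\sum (X - A)^4}{n} \qquad \mu_4' = \frac{\sum f(X - A)^4}{N}$$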

Conversion of Moments about Arbitrary Origin into Moments about Mean
$$\mu_1 = 0$$
$$\mu_2 = \mu_2' - (\mu_1')^2$$
$$\mu_3 = \mu_3' - 3\mu_1'\mu_2' + 2(\mu_1')^3$$
$$\mu_4 = \mu_4' - 4\mu_1'\mu_3' + 6\mu_2'(\mu_1')^2 - 3(\mu_1')^4$$

Q) Find the first four moments about the mean in the following distribution:

Height 60-62 63-65 66-68 69-71 72-74

Frequency 5 18 42 27 8

Solution:
Calculation of Moments
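Height 60-62 63-65 66-68 69-71 72-74 TOTAL
m 61 64 67 70 73
Frequency (f) 5 18 42 27 8 100
d = (m – 67)/3 -2 -1 0 1 2
d² 4 1 0 1 4
d³ -8 -1 0 1 8
d⁴ 16 1 0 1 16
fd -10 -18 0 27 16 15
fd² 20 18 0 27 32 97
fd³ -40 -18 0 27 64 33
fd⁴ 80 18 0 27 128 253

Moments about the arbitrary point A = 67, with class width i = 3:

$$\mu_1' = \frac{\sum fd}{N} \times i = \frac{15 \times 3}{100} = 0.45$$
$$\mu_2' = \frac{\sum fd^2}{N} \times i^2 = \frac{97 \times 9}{100} = 8.73$$
$$\mu_3' = \frac{\sum fd^3}{N} \times i^3 = \frac{33 \times 27}{100} = 8.91$$
$$\mu_4' = \frac{\sum fd^4}{N} \times i^4 = \frac{253 \times 81}{100} = 204.93$$

Applying the conversions

$$\mu_1 = 0$$
$$\mu_2 = \mu_2' - (\mu_1')^2 = 8.53$$
$$\mu_3 = \mu_3' - 3\mu_1'\mu_2' + 2(\mu_1')^3 = -2.693$$
$$\mu_4 = \mu_4' - 4\mu_1'\mu_3' + 6\mu_2'(\mu_1')^2 - 3(\mu_1')^4 = 199.376$$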
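
The height example can be reproduced with a minimal Python sketch that first computes the moments about the arbitrary point A = 67 directly from the mid points and then applies the conversion formulas. The function names `raw_moments_about` and `central_from_raw` are illustrative assumptions.

```python
def raw_moments_about(a, mids, freqs):
    """First four moments about the arbitrary point a for a frequency distribution."""
    n = sum(freqs)
    return [sum(f * (m - a) ** r for m, f in zip(mids, freqs)) / n for r in (1, 2, 3, 4)]

def central_from_raw(m1, m2, m3, m4):
    """Convert moments about an arbitrary point into moments about the mean."""
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
    mu4 = m4 - 4 * m1 * m3 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return 0.0, mu2, mu3, mu4

mids = [61, 64, 67, 70, 73]
freqs = [5, 18, 42, 27, 8]

raw = raw_moments_about(67, mids, freqs)
central = central_from_raw(*raw)
print([round(v, 2) for v in raw])      # [0.45, 8.73, 8.91, 204.93]
print([round(v, 2) for v in central])  # [0.0, 8.53, -2.69, 199.38]
```

Working with (m – 67) directly gives the same moments as using d = (m – 67)/3 and multiplying by powers of the class width 3.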

Kurtosis
Kurtosis refers to the degree of flatness or peakedness of the curve of a frequency
distribution. In other words, measures of kurtosis tell us the extent to which a distribution is
more peaked or more flat-topped than a normal curve (normal curve is discussed later)

The three different curves in terms of Kurtosis are drawn below:

If β2 > 3 distribution is Leptokurtic

If β2 = 3 distribution is Mesokurtic (Normal)
If β2 < 3 distribution is Platykurtic

Simple Example
Q) Calculate Skewness & Kurtosis of the given data

X 2 3 4 5 6
f 1 3 7 3 1

Calculate first four moments about mean

First calculate the mean:

$$\bar{x} = \frac{\sum fX}{N} = \frac{60}{15} = 4$$

X f (X – x̄) (X – x̄)² (X – x̄)³ (X – x̄)⁴ f(X – x̄) f(X – x̄)² f(X – x̄)³ f(X – x̄)⁴

2 1 -2 4 -8 16 -2 4 -8 16
3 3 -1 1 -1 1 -3 3 -3 3
4 7 0 0 0 0 0 0 0 0
5 3 +1 1 1 1 3 3 3 3
6 1 +2 4 8 16 2 4 8 16
Total 15 0 14 0 38

$$\mu_1 = \frac{0}{15} = 0$$
$$\mu_2 = \frac{14}{15} = 0.9333$$
$$\mu_3 = \frac{0}{15} = 0$$
$$\mu_4 = \frac{38}{15} = 2.533$$

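Hence we get

$$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{0^2}{(0.9333)^3} = 0$$

(Distribution is symmetric and not skewed)

$$\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{2.533}{(0.9333)^2} = 2.91$$

(Distribution is platykurtic since the value is less than 3)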
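
The moment-based measures for this example can be checked with a short Python sketch. The frequencies 1, 3, 7, 3, 1 are taken from the calculation table (total 15), and the function name `central_moment` is an illustrative assumption.

```python
def central_moment(values, freqs, r):
    """r-th moment about the mean for a discrete frequency distribution."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    return sum(f * (x - mean) ** r for x, f in zip(values, freqs)) / n

x = [2, 3, 4, 5, 6]
f = [1, 3, 7, 3, 1]

mu2, mu3, mu4 = (central_moment(x, f, r) for r in (2, 3, 4))
beta1 = mu3 ** 2 / mu2 ** 3     # skewness measure
beta2 = mu4 / mu2 ** 2          # kurtosis measure
print(round(beta1, 2), round(beta2, 2))  # 0.0 2.91
```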

Q) Calculate Karl Pearson’s Coefficient of Skewness

Marks 20-30 30-40 40-50 50-60 60-70 70-80 80-90 90-100

No. of students 5 12 15 20 18 10 6 4

Skp = (Mean – Mode) / Standard Deviation

Calculate the values

Answer is Skp = -0.008
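
The stated answer can be reproduced with the following Python sketch, which computes the mean and standard deviation from the class mid points and the mode from the modal-class formula. The variable names are illustrative assumptions.

```python
from math import sqrt

lower_limits = [20, 30, 40, 50, 60, 70, 80, 90]
h = 10                                          # class width
freqs = [5, 12, 15, 20, 18, 10, 6, 4]

mids = [low + h / 2 for low in lower_limits]    # 25, 35, ..., 95
n = sum(freqs)
mean = sum(f * m for f, m in zip(freqs, mids)) / n
sd = sqrt(sum(f * m * m for f, m in zip(freqs, mids)) / n - mean ** 2)

k = freqs.index(max(freqs))                     # modal class 50-60
f0, f1, f2 = freqs[k - 1], freqs[k], freqs[k + 1]
mode = lower_limits[k] + (f1 - f0) / ((f1 - f0) + (f1 - f2)) * h

sk_p = (mean - mode) / sd
print(round(mean, 2), round(mode, 2), round(sd, 2), round(sk_p, 3))  # 57.0 57.14 17.65 -0.008
```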

Q) Calculate Bowley’s Coefficient of skewness

No. of children per family 0 1 2 3 4 5 6

No. of families 7 10 16 25 18 11 8

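$$Sk_B = \frac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1}$$

Here N = 95, so Q1 = size of 24th item = 2, Median = size of 48th item = 3 and Q3 = size of 72nd item = 4

$$Sk_B = \frac{4 + 2 - 2 \times 3}{4 - 2} = 0$$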
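
The quartiles behind this result can be located with a cumulative frequency lookup, sketched in Python below. The helper name `item_value` is an illustrative assumption.

```python
from itertools import accumulate

def item_value(values, freqs, position):
    """Value of the item at the given 1-based position in the ordered series."""
    for value, cf in zip(values, accumulate(freqs)):
        if position <= cf:
            return value
    raise ValueError("position exceeds total frequency")

children = [0, 1, 2, 3, 4, 5, 6]
families = [7, 10, 16, 25, 18, 11, 8]
n = sum(families)                                          # 95

q1 = item_value(children, families, (n + 1) / 4)           # 24th item
median = item_value(children, families, (n + 1) / 2)       # 48th item
q3 = item_value(children, families, 3 * (n + 1) / 4)       # 72nd item
sk_b = (q3 + q1 - 2 * median) / (q3 - q1)
print(q1, median, q3, sk_b)  # 2 3 4 0.0
```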

Some examples
Q 1) The standard deviation of a symmetric distribution is 3. What must be the value of
the fourth moment about the mean in order that the distribution be mesokurtic?
For a mesokurtic distribution $\beta_2 = 3$
$\sigma = 3$, hence $\mu_2 = \sigma^2 = 9$
$\beta_2 = \mu_4 / \mu_2^2$, hence $3 = \mu_4 / 81$, or $\mu_4 = 243$
Thus the fourth moment about mean must be 243 in order that the distribution be mesokurtic

Q2) If the first four moments about the value 5 are equal to -4, 22, -117 and 560, determine
the corresponding moments about the mean
Conversion of Moments about Arbitrary Origin into Moments about Mean
$$\mu_2 = \mu_2' - (\mu_1')^2$$
$$\mu_3 = \mu_3' - 3\mu_1'\mu_2' + 2(\mu_1')^3$$
$$\mu_4 = \mu_4' - 4\mu_1'\mu_3' + 6\mu_2'(\mu_1')^2 - 3(\mu_1')^4$$

Given that $\mu_1' = -4$, $\mu_2' = 22$, $\mu_3' = -117$, $\mu_4' = 560$, substituting these values in the above
equations we get
$\mu_2 = 22 - 16 = 6$, $\mu_3 = -117 + 264 - 128 = 19$, $\mu_4 = 560 - 1872 + 2112 - 768 = 32$

Note: Also refer to class notes for additional questions

