Module 3 - Branches of Statistics
1. Measures of central tendency:
a. Arithmetic mean
b. Median
c. Mode
d. Weighted average or mean
1. Arithmetic mean. This is a computational average and is defined as the sum of the variables
divided by the number of variables.
2. Median. This represents the point on the scale or on the distribution where half of the variables
are greater and the other half are smaller.
3. Mode. The mode is the simplest of the measures of central tendency and is defined as the
variable which occurs most frequently in a statistical series.
4. Weighted average or mean. It is defined as the sum of the products of each frequency and its
weight, divided by the total frequency. It is used for finding the average of responses to opinions
or questionnaire items that are given weights.
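The weighted mean definition above can be sketched in a few lines of Python. The questionnaire item, its 5-point weights, and the response frequencies below are hypothetical values chosen only for illustration; the function name `weighted_mean` is likewise an assumption, not part of the module.

```python
def weighted_mean(weights, frequencies):
    """Sum of (frequency x weight) divided by the total frequency."""
    if len(weights) != len(frequencies):
        raise ValueError("weights and frequencies must have the same length")
    total_frequency = sum(frequencies)
    if total_frequency == 0:
        raise ValueError("total frequency must be greater than zero")
    weighted_sum = sum(w * f for w, f in zip(weights, frequencies))
    return weighted_sum / total_frequency


# Hypothetical questionnaire item scored from 5 (strongly agree) to 1 (strongly disagree).
weights = [5, 4, 3, 2, 1]
# Hypothetical number of respondents who chose each weight (30 in total).
frequencies = [10, 15, 5, 0, 0]

result = weighted_mean(weights, frequencies)
print(round(result, 2))
```

Here the weighted sum is 50 + 60 + 15 = 125 and the total frequency is 30, so the weighted mean is about 4.17 on the 5-point scale.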
2. Measures of spread:
These are ways of summarizing a group of data by describing how spread
out the scores are. To describe this spread, a number of statistics are
available to us, including the range, quartiles, absolute deviation,
variance, and standard deviation. When we use descriptive statistics, it is
useful to summarize our group of data using a combination of tabulated
description (i.e., tables), graphical description (i.e., graphs and charts),
and statistical commentary (i.e., a discussion of the results).
CENTRAL TENDENCY
Measures of central tendency help you find the middle, or the average, of a
data set. The 3 most common measures of central tendency are the mode,
median, and mean.
Find the mean, median, mode, and range for the following list of values:
13,18,13,14,13,16,14,21,13
Solution:
Given data: 13, 18, 13, 14, 13, 16, 14, 21, 13
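The mean is the usual average: add all the values, then divide by how many values there are.
(13 + 18 + 13 + 14 + 13 + 16 + 14 + 21 + 13) / 9 = 135 / 9 = 15
Mean = 15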
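As a quick check of the mean computed above, the same arithmetic can be sketched in Python. The variable names are illustrative; `statistics.mean` comes from Python's standard library.

```python
from statistics import mean

values = [13, 18, 13, 14, 13, 16, 14, 21, 13]

# Manual computation: sum of the values divided by the number of values.
manual_mean = sum(values) / len(values)

# The standard library gives the same result.
library_mean = mean(values)

print(manual_mean, library_mean)
```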
Find the median for the following list of values:
Solution:
The median is the middle value, so rewrite the list in ascending order as given below:
13, 13, 13, 13, 14, 14, 16, 18, 21
There are nine numbers in the list, so the position of the middle one is
$\frac{9 + 1}{2} = \frac{10}{2} = 5$, that is, the 5th number.
Hence, the median is 14.
Median = 14
NOTE: Arrange the data points from smallest to largest. If the number of data points is odd, the median is the middle data
point in the list. If the number of data points is even, the median is the average of the two middle data points in the list.
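The rule in the note above can be sketched as a small Python function. The name `median_by_rule` is an assumption made for this illustration, and the four-value list is a hypothetical example added to show the even case; `statistics.median` from the standard library is used only as a cross-check.

```python
from statistics import median


def median_by_rule(values):
    """Sort the data, then take the middle point (odd count)
    or the average of the two middle points (even count)."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


odd_list = [13, 18, 13, 14, 13, 16, 14, 21, 13]   # nine values from the example
even_list = [13, 14, 16, 18]                      # hypothetical four-value list

print(median_by_rule(odd_list), median(odd_list))
print(median_by_rule(even_list), median(even_list))
```

For the nine-value list the function returns the 5th sorted value, 14. For the four-value list it averages the two middle values, 14 and 16.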
Find the mode for the following list of values:
Solution:
The mode is the number that is repeated more often than any other, so 13 is the mode.
Mode = 13
Types of Mode
1. Unimodal - A set of data with one mode is known as unimodal. For example, the mode of data set
A = {14, 15, 16, 17, 15, 18, 15, 19} is 15, as it is the only value with the highest frequency. Hence, it is a
unimodal data set.
2. Bimodal - A set of data with two modes is known as bimodal. This means that two data values share
the highest frequency. For example, the modes of data set A = {8, 13, 13, 14, 15, 17, 17, 19} are
13 and 17 because both 13 and 17 occur twice in the given set. Hence, it is a bimodal data set.
3. Trimodal - A set of data with three modes is known as trimodal. This means that three data values
share the highest frequency. For example, the modes of data set A = {2, 2, 2, 3, 4, 4, 5,
6, 5, 4, 7, 5, 8} are 2, 4, and 5 because all three values occur three times in the given set. Hence, it is a
trimodal data set.
4. Multimodal - A set of data with four or more modes is known as multimodal. For example, the modes
of data set A = {100, 80, 80, 95, 95, 100, 90, 90} are 80, 90, 95, and 100 because all four values
occur twice in the given set. Hence, it is a multimodal data set.
Note: If no number in a data set is repeated more often than any other, the data set has no mode.
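The classification of modes above can be sketched in Python with a frequency count. The function names `find_modes` and `describe_modes` are assumptions made for this illustration. The sketch also assumes that a data set in which every value occurs exactly once has no mode.

```python
from collections import Counter


def find_modes(values):
    """Return the values that share the highest frequency, in ascending order.
    Assumption: if every value occurs only once, there is no mode."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []
    return sorted(v for v, c in counts.items() if c == highest)


def describe_modes(modes):
    """Name the data set by its number of modes."""
    names = {0: "no mode", 1: "unimodal", 2: "bimodal", 3: "trimodal"}
    return names.get(len(modes), "multimodal")


unimodal_set = [14, 15, 16, 17, 15, 18, 15, 19]
bimodal_set = [8, 13, 13, 14, 15, 17, 17, 19]
trimodal_set = [2, 2, 2, 3, 4, 4, 5, 6, 5, 4, 7, 5, 8]

for data in (unimodal_set, bimodal_set, trimodal_set):
    modes = find_modes(data)
    print(modes, describe_modes(modes))
```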
Find the range for the following list of values:
Solution:
The largest value in the list is 21, and the smallest is 13, so the range is 21 – 13 = 8.
Range = 8
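The range computation above is short enough to check directly in Python. This is a minimal sketch; the variable names are illustrative.

```python
values = [13, 18, 13, 14, 13, 16, 14, 21, 13]

# Range = largest value minus smallest value.
value_range = max(values) - min(values)

print(value_range)
```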
For another example of computing central tendency, watch this video:
https://www.youtube.com/watch?v=81zcjULlh58
VARIANCE
The variance is a measure of variability. It is calculated by taking the
average of squared deviations from the mean. Variance tells you the degree
of spread in your dataset. The more spread out the data, the larger the
variance is in relation to the mean.
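VAR.P: This function calculates the population variance. Use this function when the range of
values represents the entire population.
This function uses the following formula:
$\text{Population variance} = \frac{\sum (x_i - \mu)^2}{N}$
where $x_i$ is each value, $\mu$ is the population mean, and $N$ is the number of values.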
x_i      x_i - μ    (x_i - μ)²
...      ...        ...
10       1.2        1.44
12       3.2        10.24
9        0.2        0.04
11       2.2        4.84
10       1.2        1.44
12       3.2        10.24
7        -1.8       3.24
TOTAL    0          73.6
Step 3:
Because the data are not described as sample data, we use the formula
for population variance.
Population variance = 73.6 / 10
= 7.36
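The population variance formula can be sketched in Python as follows. To avoid guessing at values not shown here, this sketch reuses the nine-value list from the central tendency example rather than the data in the table. `statistics.pvariance` divides by N (the same idea as VAR.P), while `statistics.variance` divides by N - 1 and is meant for sample data.

```python
from statistics import pvariance, variance

values = [13, 18, 13, 14, 13, 16, 14, 21, 13]

# Step 1: the mean.
mu = sum(values) / len(values)

# Step 2: squared deviations from the mean.
squared_deviations = [(x - mu) ** 2 for x in values]

# Step 3: population variance = sum of squared deviations / N.
manual_population_variance = sum(squared_deviations) / len(values)

print(sum(squared_deviations))          # total of the squared deviations
print(manual_population_variance)       # divides by N
print(pvariance(values))                # library equivalent, divides by N
print(variance(values))                 # sample variance, divides by N - 1
```

For this list the squared deviations add up to 64, so the population variance is 64 / 9 (about 7.11) and the sample variance is 64 / 8 = 8.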
For another example of computing variance, watch this video:
https://www.youtube.com/watch?v=deIQeQzPK08
STANDARD DEVIATION
The standard deviation is the average amount of variability in your dataset. It tells you, on
average, how far each value lies from the mean. A high standard deviation means that values
are generally far from the mean, while a low standard deviation indicates that values are
clustered close to the mean.
Variance is the average squared deviation from the mean, while standard deviation is the
square root of this number. Both measures reflect variability in a distribution, but their units differ:
Standard deviation is expressed in the same units as the original values (e.g., minutes or
meters).
Variance is expressed in squared units (e.g., meters squared).
Although the units of variance are harder to intuitively understand, variance is important in
statistical tests.
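The square-root relationship between variance and standard deviation can be sketched in Python. The sketch reuses the nine-value list from the central tendency example; `pstdev` and `stdev` are the standard library's population and sample standard deviation functions.

```python
import math
from statistics import pstdev, pvariance, stdev

values = [13, 18, 13, 14, 13, 16, 14, 21, 13]

population_variance = pvariance(values)

# Standard deviation is the square root of the variance.
manual_population_sd = math.sqrt(population_variance)

print(manual_population_sd)   # square root of the population variance
print(pstdev(values))         # library equivalent for a population
print(stdev(values))          # sample version (variance divided by N - 1)
```

If the values were measured in meters, the variance here would be in meters squared, while the standard deviation (about 2.67) would be back in meters.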
For another example of computing standard deviation, watch this video:
https://www.youtube.com/watch?v=esskJJF8pCc
FRACTILES
A fractile is a cut-off point for a certain fraction of a sample. If your distribution is known, then
the fractile is just the cut-off point where the distribution reaches a certain probability.
In visual terms, a fractile is a point on a probability density function (PDF) curve such that the area
under the curve between that point and the origin (i.e., zero) is equal to a specified fraction.
For example, a fractile of .25 cuts off the bottom quarter of a sample, and .5 cuts the sample
in half.
Fractiles are measures of location or position which include not only central location but also
any position based on the number of equal divisions in a given distribution. If we divide the
distribution into four equal divisions, then we have quartiles, denoted by Q1, Q2, and Q3. The most
commonly used fractiles are quartiles, deciles, and percentiles.
FRACTILES FOR UNGROUPED DATA
QUARTILES
QUARTILES divide a distribution into four equal parts. For example, Q1, or the first quartile,
locates the point which is greater than 25% of the items in the distribution.
Q3 is the 3rd quartile
$Q_3 = \frac{3N}{4}$th item
This means that 75% of the observations lie below this value.
Q2 is the 2nd quartile
$Q_2 = \frac{2N}{4}$th item, or the median
https://www.youtube.com/watch?v=40o82o3uNfk
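The position rule for quartiles (the kN/4-th item) extends naturally to deciles (kN/10) and percentiles (kN/100). The sketch below is one possible reading, not the module's prescribed method: it assumes the nearest-rank convention, where a fractional position is rounded up to the next whole item. Other textbooks and software interpolate between items instead, so results can differ slightly. The function name `fractile` is hypothetical, and the data are the nine values from the central tendency example.

```python
def fractile(values, k, parts):
    """Return the k-th fractile when the sorted data are split into
    `parts` equal divisions (4 = quartiles, 10 = deciles, 100 = percentiles).
    Assumption: nearest-rank rule, position = ceiling of k * N / parts."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < k < parts:
        raise ValueError("k must be between 1 and parts - 1")
    ordered = sorted(values)
    n = len(ordered)
    position = (k * n + parts - 1) // parts   # integer ceiling of k * n / parts
    position = max(position, 1)               # positions are counted from 1
    return ordered[position - 1]


values = [13, 18, 13, 14, 13, 16, 14, 21, 13]

q1 = fractile(values, 1, 4)
q2 = fractile(values, 2, 4)      # same as the median
q3 = fractile(values, 3, 4)
d3 = fractile(values, 3, 10)
p70 = fractile(values, 70, 100)

print(q1, q2, q3, d3, p70)
```

With N = 9, Q3 sits at position 3(9)/4 = 6.75, which rounds up to the 7th sorted item under this convention. Q2 lands on the 5th item, which matches the median found earlier.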
FRACTILES FOR GROUPED DATA
QUARTILES
EXAMPLE
DECILE
EXAMPLE
PERCENTILE
EXAMPLE
ACTIVITY NO. 3: DESCRIPTIVE STATISTICS
1. Using our data set, compute the following measures of central tendency for the ungrouped data of 4Ps
families in terms of monthly income, age, and number of children:
Mean
Median
Mode
Interpret your data based on your computation.
2. Also compute the variance and standard deviation of the ungrouped data of 4Ps families in terms of
monthly income, age, and number of children:
Interpret your data based on your computation.
3. Identify the following fractiles of 4Ps families in terms of monthly income, age, and number of
children:
Q2
Q3
D3
D5
P50
P70
Interpret your data based on your computation.