
Module 3 - Branches of Statistics


Jasper B. Alcedo, RSW, MSSW
Instructor II
1. Data
Data are the values (measurements or observations) that the variables can assume.

2. Random Variables
Variables whose values are determined by chance are called random variables.

3. Data Set
A collection of data values forms a data set.

4. Data Value/Datum
Each value in the data set is called a data value or a datum.

5. Population
A population consists of all subjects (human or otherwise) that are being studied.

6. Sample
A sample is a subgroup of the population.
Descriptive Statistics
Descriptive statistics is the term given to the analysis of data that helps
describe, show or summarize data in a meaningful way such that, for
example, patterns might emerge from the data.

Descriptive statistics do not, however, allow us to draw conclusions
beyond the data we have analyzed or to reach conclusions regarding any
hypotheses we might have made. They are simply a way to describe
our data.
1. Measures of central tendency
These are ways of describing the central position of a
frequency distribution for a group of data:

a. Arithmetic mean
b. Median
c. Mode
d. Weighted average or Mean
1. Arithmetic mean. This is a computational average, defined as the sum of the values divided by
the number of values.

2. Median. This represents the point on the scale or on the distribution where half of the values
are greater and the other half are smaller.

3. Mode. The mode is the simplest of the measures of central tendency and is defined as the
value which occurs most frequently in a statistical series.

4. Weighted average or mean. It is defined as the sum of the products of the frequency and the
weight of a set of values, divided by the total of the frequencies. It is used for finding the
average of responses to opinions or questionnaire items which are given weights.
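
The weighted mean definition above can be sketched in a few lines of Python. The weights and response counts below are hypothetical (they do not come from the module); they stand for a five-point questionnaire item where 5 means "strongly agree" and 1 means "strongly disagree".

```python
def weighted_mean(weights, frequencies):
    """Sum of (frequency x weight) divided by the total of the frequencies."""
    total_frequency = sum(frequencies)
    if total_frequency == 0:
        raise ValueError("total frequency must be greater than zero")
    weighted_sum = sum(w * f for w, f in zip(weights, frequencies))
    return weighted_sum / total_frequency


# Hypothetical questionnaire item: one response count per weight.
weights = [5, 4, 3, 2, 1]
frequencies = [10, 15, 5, 6, 4]

item_mean = weighted_mean(weights, frequencies)
print(item_mean)
```

Here the weighted sum is 141 and the total frequency is 40, so the item's weighted mean falls between "neutral" (3) and "agree" (4).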
2. Measures of spread
These are ways of summarizing a group of data by describing how spread out
the scores are. To describe this spread, a number of statistics are available to us,
including the range, quartiles, absolute deviation, variance, and standard
deviation. When we use descriptive statistics, it is useful to summarize our group
of data using a combination of tabulated description (i.e., tables), graphical
description (i.e., graphs and charts), and statistical commentary (i.e., a discussion
of the results).
CENTRAL TENDENCY
Measures of central tendency help you find the middle, or the average, of a
data set. The 3 most common measures of central tendency are the mode,
median, and mean.

Mode: the most frequent value.


Median: the middle number in an ordered dataset.
Mean: the sum of all values divided by the total number of values.

Find the mean, median, mode, and range for the following list of values:

13, 18, 13, 14, 13, 16, 14, 21, 13

Solution:

Given data: 13, 18, 13, 14, 13, 16, 14, 21, 13
The mean is the usual average.

Mean = (13 + 18 + 13 + 14 + 13 + 16 + 14 + 21 + 13) / 9 = 135 / 9 = 15


(Note that the mean is not a value from the original list. This is a common result. You should not assume
that your mean will be one of your original numbers.)

Mean = 15
Find the median for the following list of values:

13, 18, 13, 14, 13, 16, 14, 21, 13

Solution:

The median is the middle value, so rewrite the list in ascending order as given below:
13, 13, 13, 13, 14, 14, 16, 18, 21
There are nine numbers in the list, so the middle one is at position
(9 + 1) / 2 = 10 / 2 = 5,
that is, the 5th number.
Hence, the median is 14.

Median = 14

NOTE: Arrange the data points from smallest to largest. If the number of data points is odd, the median is the middle data
point in the list. If the number of data points is even, the median is the average of the two middle data points in the list.
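
The odd/even rule in the note can be written out as a short Python function. This is a minimal sketch; the function name `median_of` and the four-value list in the second call are illustrative and not part of the module.

```python
def median_of(values):
    """Median by the rule above: sort, then take the middle point(s)."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)          # arrange from smallest to largest
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:                    # odd count: the single middle data point
        return ordered[middle]
    # even count: average of the two middle data points
    return (ordered[middle - 1] + ordered[middle]) / 2


odd_result = median_of([13, 18, 13, 14, 13, 16, 14, 21, 13])
even_result = median_of([1, 2, 3, 4])   # illustrative even-length list
print(odd_result, even_result)
```

The first call uses the nine values from the worked example; the second shows the even case, where the two middle values 2 and 3 are averaged.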
Find the mode for the following list of values:

13, 18, 13, 14, 13, 16, 14, 21, 13

Solution:

The mode is the number that is repeated more often than any other, so 13 is the mode.

Mode = 13
Types of Mode
1. Unimodal - A data set with one mode is called unimodal. For example, the mode of data set
A = {14, 15, 16, 17, 15, 18, 15, 19} is 15, because it is the only value that occurs most often (three times).
Hence, it is a unimodal data set.
2. Bimodal - A data set with two modes is called bimodal. This means that two data values share
the highest frequency. For example, the modes of data set A = {8, 13, 13, 14, 15, 17, 17, 19} are
13 and 17, because both 13 and 17 occur twice in the given set. Hence, it is a bimodal data set.
3. Trimodal - A data set with three modes is called trimodal. This means that three data values
share the highest frequency. For example, the modes of data set A = {2, 2, 2, 3, 4, 4, 5,
6, 5, 4, 7, 5, 8} are 2, 4, and 5, because all three values occur three times in the given set. Hence, it is a
trimodal data set.
4. Multimodal - A data set with four or more modes is called multimodal. For example, the modes
of data set A = {100, 80, 80, 95, 95, 100, 90, 90} are 80, 90, 95, and 100, because all four values
occur twice in the given set. Hence, it is a multimodal data set.

Note: If every value in a data set occurs only once, no number is repeated more often than any other, so there is no mode.
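
The classification above can be checked mechanically by counting how often each value occurs. The sketch below is one way to do it with Python's standard library; the helper names `modes` and `modality` are illustrative, and it assumes that a data set in which no value repeats has no mode.

```python
from collections import Counter


def modes(values):
    """Return the values that share the highest frequency (sorted)."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:                  # assumption: nothing repeats, so no mode
        return []
    return sorted(v for v, c in counts.items() if c == highest)


def modality(values):
    """Name the data set by its number of modes."""
    labels = {0: "no mode", 1: "unimodal", 2: "bimodal", 3: "trimodal"}
    return labels.get(len(modes(values)), "multimodal")


print(modes([14, 15, 16, 17, 15, 18, 15, 19]), modality([14, 15, 16, 17, 15, 18, 15, 19]))
print(modes([8, 13, 13, 14, 15, 17, 17, 19]), modality([8, 13, 13, 14, 15, 17, 17, 19]))
print(modes([2, 2, 2, 3, 4, 4, 5, 6, 5, 4, 7, 5, 8]), modality([2, 2, 2, 3, 4, 4, 5, 6, 5, 4, 7, 5, 8]))
```

The three calls use the unimodal, bimodal, and trimodal example sets from the list above.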
Find the range for the following list of values:

13, 18, 13, 14, 13, 16, 14, 21, 13

Solution:

The largest value in the list is 21, and the smallest is 13, so the range is 21 – 13 = 8.

Range = 8
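
As a cross-check of the four worked answers, the same list can be passed to Python's built-in `statistics` module. This is only a sketch for verifying hand computations; the variable names are illustrative.

```python
import statistics

values = [13, 18, 13, 14, 13, 16, 14, 21, 13]

mean = statistics.mean(values)            # sum of values / number of values
median = statistics.median(values)        # middle value of the sorted list
mode = statistics.mode(values)            # most frequent value
value_range = max(values) - min(values)   # largest minus smallest

print(mean, median, mode, value_range)
```

The results agree with the solutions above: mean 15, median 14, mode 13, and range 8.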
For another example of computing measures of central tendency, watch this video:
https://www.youtube.com/watch?v=81zcjULlh58
VARIANCE
The variance is a measure of variability. It is calculated by taking the
average of squared deviations from the mean. Variance tells you the degree
of spread in your dataset. The more spread the data, the larger the variance
is in relation to the mean.
VAR.P: This spreadsheet function calculates the population variance. Use it when the range of
values represents the entire population.
It uses the following formula:

$$\text{Population variance} = \frac{\sum (x_i - \mu)^2}{N}$$

where:

$\sum$: a Greek symbol (capital sigma) that means "sum"
$x_i$: the ith value in the dataset
$\mu$: the population mean
$N$: the total number of observations

Example: Find the variance of the numbers 3, 8, 6, 10, 12, 9, 11, 10, 12, and 7.

Step 1: First compute the mean of the 10 values given.

x̅ = (3 + 8 + 6 + 10 + 12 + 9 + 11 + 10 + 12 + 7) / 10
= 88 / 10
= 8.8
Step 2: Make a table with three columns: one for the X values, the second for the
deviations, and the third for the squared deviations.

X (value)    X - x̅     (X - x̅)²
3            -5.8      33.64
8            -0.8      0.64
6            -2.8      7.84
10           1.2       1.44
12           3.2       10.24
9            0.2       0.04
11           2.2       4.84
10           1.2       1.44
12           3.2       10.24
7            -1.8      3.24
TOTAL        0         73.6
Step 3: Because the data are not given as sample data, we use the formula
for population variance: divide the total of the squared deviations by the number of values.

Population variance = 73.6 / 10
= 7.36
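
The three steps of this example can be reproduced in Python as a check on the arithmetic. This is a minimal sketch using the ten values from the example; the variable names are illustrative, and `statistics.pvariance` from the standard library is used only to confirm the hand-built result.

```python
import statistics

values = [3, 8, 6, 10, 12, 9, 11, 10, 12, 7]
n = len(values)

# Step 1: mean of the values
mean = sum(values) / n

# Step 2: deviations from the mean and their squares
deviations = [x - mean for x in values]
squared_deviations = [d ** 2 for d in deviations]

# Step 3: population variance = total of squared deviations / N
population_variance = sum(squared_deviations) / n

print(round(mean, 2), round(population_variance, 2))
print(round(statistics.pvariance(values), 2))   # library cross-check
```

Results are rounded before printing because floating-point arithmetic can leave tiny differences in the last decimal places.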
For another example of computing variance, watch this video:
https://www.youtube.com/watch?v=deIQeQzPK08
STANDARD DEVIATION
The standard deviation is the average amount of variability in your dataset. It tells you, on
average, how far each value lies from the mean. A high standard deviation means that values
are generally far from the mean, while a low standard deviation indicates that values are
clustered close to the mean.

Variance is the average squared deviation from the mean, while standard deviation is the
square root of this number. Both measures reflect variability in a distribution, but their units differ:
Standard deviation is expressed in the same units as the original values (e.g., minutes or
meters).
Variance is expressed in squared units (e.g., meters squared).
Although the units of variance are harder to understand intuitively, variance is important in
statistical tests.
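
To illustrate the square-root relationship, the sketch below reuses the ten values from the variance example (population variance 7.36) and takes the square root. Treating the values as a whole population is an assumption carried over from that example; the variable names are illustrative.

```python
import math
import statistics

values = [3, 8, 6, 10, 12, 9, 11, 10, 12, 7]

population_variance = statistics.pvariance(values)   # in squared units
population_sd = math.sqrt(population_variance)       # back in the original units

print(round(population_variance, 2), round(population_sd, 2))
print(round(statistics.pstdev(values), 2))            # library cross-check
```

The standard deviation comes out at roughly 2.71, so on average the values lie about 2.7 units from the mean of 8.8.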
For another example of computing standard deviation, watch this video:
https://www.youtube.com/watch?v=esskJJF8pCc
FRACTILES
A fractile is a cut-off point for a certain fraction of a sample. If your distribution is known, then
the fractile is just the cut-off point where the cumulative distribution reaches a certain probability.

In visual terms, a fractile is a point on a probability density function (PDF) curve such that the area
under the curve to the left of that point is equal to a specified fraction.
For example, a fractile of .25 cuts off the bottom quarter of a sample, and .5 cuts the sample
in half.

Fractiles are measures of location or position which include not only central location but also
any position based on the number of equal divisions in a given distribution. If we divide the
distribution into four equal divisions, then we have quartiles denoted by Q1, Q2, Q3. The most
commonly used fractiles are quartiles, deciles, and percentiles.
FRACTILES FOR UNGROUPED DATA
QUARTILES
QUARTILES divide a distribution into four equal parts. For example, Q1, or the first quartile,
locates the point which is greater than 25% of the items in the distribution.

Q1 is the first quartile
Q1 = (N/4)th item

Q2 is the 2nd quartile
Q2 = (2N/4)th item, or the median

Q3 is the 3rd quartile
Q3 = (3N/4)th item
This means that 75% of the observations lie below this value.
FRACTILES FOR UNGROUPED DATA
DECILES
DECILES are values that divide a distribution into ten equal parts.
D1 is the 1st decile
D1 = (N/10)th item

D3 is the 3rd decile
D3 = (3N/10)th item

D5 is the fifth decile
D5 = (5N/10)th item, or the median
FRACTILES FOR UNGROUPED DATA
PERCENTILES
PERCENTILES are values that divide a distribution into 100 equal parts.
P10, or the tenth percentile, is the value below which 10% of the items in the distribution fall.

P1 is the first percentile
P1 = (N/100)th item

P25 is the 25th percentile
P25 = (25N/100)th item, or Q1

P50 is the 50th percentile
P50 = (50N/100)th item, or the median

P67 is the 67th percentile
P67 = (67N/100)th item
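
The position formulas for quartiles, deciles, and percentiles all follow one pattern: the kth fractile out of m equal parts is the (kN/m)th item of the sorted data. The sketch below applies that pattern to the nine values used in the central tendency examples. The function name `fractile` is illustrative. The formulas above do not say what to do when kN/m is not a whole number, so rounding the position up to the next whole item is an assumption made here; other textbooks interpolate between neighbouring items instead.

```python
def fractile(values, k, parts):
    """Return the (k*N/parts)th item of the sorted values (1-based position)."""
    if not values:
        raise ValueError("values must not be empty")
    if not 1 <= k <= parts:
        raise ValueError("k must be between 1 and parts")
    ordered = sorted(values)
    n = len(ordered)
    # Assumption: a fractional position is rounded up to the next whole item.
    position = -(-k * n // parts)     # integer ceiling of k*n/parts
    return ordered[position - 1]


data = [13, 18, 13, 14, 13, 16, 14, 21, 13]

q1 = fractile(data, 1, 4)       # first quartile
q2 = fractile(data, 2, 4)       # second quartile (median)
q3 = fractile(data, 3, 4)       # third quartile
d5 = fractile(data, 5, 10)      # fifth decile
p50 = fractile(data, 50, 100)   # fiftieth percentile

print(q1, q2, q3, d5, p50)
```

Under this rounding rule Q2, D5, and P50 all land on the 5th item, which matches the median of 14 found earlier.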
EXAMPLE
[The worked examples on these slides were images; their content is not recoverable.]
FOR FURTHER DISCUSSION, WATCH THIS VIDEO:

https://www.youtube.com/watch?v=40o82o3uNfk
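FRACTILES FOR GROUPED DATA
QUARTILES
[The formulas and worked examples on these slides were images; their content is not recoverable.]
DECILE
[The worked example on this slide was an image; its content is not recoverable.]
PERCENTILE
[The worked example on this slide was an image; its content is not recoverable.]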
ACTIVITY NO. 3: DESCRIPTIVE STATISTICS
1. Using our Data Set, compute the following measures of central tendency for the ungrouped data of 4Ps
families in terms of monthly income, age, and number of children:
Mean
Median
Mode
Interpret your data based on your computation.
2. Compute also the variance and standard deviation of the ungrouped data of 4Ps families in terms of
monthly income, age, and number of children.
Interpret your data based on your computation.
3. Identify the following fractiles of 4Ps families in terms of monthly income, age, and number of
children:
Q2
Q3
D3
D5
P50
P70
Interpret your data based on your computation.
