
Descriptive Statistics: Frequency Distributions and Related Statistics

This document discusses descriptive statistics such as frequency distributions, measures of central tendency, and measures of variation. It describes different types of datasets and how to represent grouped and ungrouped data using frequency distributions, including discrete and continuous distributions. It explains how to calculate common measures of central tendency like the mean, median, and mode for both ungrouped and grouped data. It also discusses measures of variation such as variance, standard deviation, and the variation range. Additionally, the document covers skewness and kurtosis statistics to characterize the symmetry and concentration of a distribution, and how to analyze continuous frequency distributions.

Uploaded by

ha ssan
Copyright: © All Rights Reserved

Descriptive statistics

Frequency distributions and related statistics
Types of datasets
Separately registered data – ungrouped data

Frequency distribution – data grouped by classes
Ungrouped, raw data: observations
Grouped data (frequency distribution): classes x, frequency of occurrence f
1 1
2 3
1 6
1 1
2 3
2 4
3 3
5 1
4 1
3 3
5 1
2 1
• Discrete frequency distribution
– Classes are separate, usually whole numbers
• Continuous frequency distribution
– Classes are ranges of a continuous variable
– Observations that fall within the same range are treated as similar
Example
Discrete
0 1 2 3 4
€9,99 €10,99 €19,99

Continuous
(0-100] (100-200] (200-300] (300-500] (500-1000]
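Grouping raw observations into the two kinds of distribution can be sketched as follows. This is a minimal illustration, not part of the original slides: the function names and the sample data are hypothetical, and the classes are taken as half-open intervals (lower, upper], as in the example above.

```python
from collections import Counter


def discrete_distribution(observations):
    """Count how often each distinct value occurs, sorted by value."""
    return sorted(Counter(observations).items())


def continuous_distribution(observations, bounds):
    """Count observations per half-open class (lower, upper]."""
    classes = list(zip(bounds[:-1], bounds[1:]))
    counts = [0] * len(classes)
    for x in observations:
        for k, (lower, upper) in enumerate(classes):
            if lower < x <= upper:
                counts[k] += 1
                break
    return list(zip(classes, counts))


raw = [1, 2, 1, 3, 2, 2, 5]              # hypothetical raw observations
print(discrete_distribution(raw))         # [(1, 2), (2, 3), (3, 1), (5, 1)]

amounts = [50, 150, 180, 420, 700]        # hypothetical amounts
print(continuous_distribution(amounts, [0, 100, 200, 300, 500, 1000]))
```

Observations outside every class are silently skipped here; a real analysis would probably want to report them.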
Continuous frequency distribution
• Class interval length: the difference between the two limits of class $i$
$$\Delta_i = x_i' - x_i''$$
• Usually same-length classes are used
• Constructed based on socio-economic context
Continuous frequency distribution
• Different class interval lengths are used
– if a classification system is used,
– if there are few observations in outlying intervals,
– if empty intervals appear.
Continuous frequency distribution
• Interval density – ratio of the interval frequency to the interval length
Determining the length of intervals
• Sturges' formula (n ≤ 100)
$$\Delta = \frac{x_{\max} - x_{\min}}{1 + 3.2 \lg n}$$
• Brex formula (n > 100)
$$\Delta = \frac{x_{\max} - x_{\min}}{5 \lg n}$$
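The two rules can be combined into one helper. This is only a sketch under the assumptions stated on the slide: the constant 3.2 and the switch at n = 100 are taken as given, lg is read as the base-10 logarithm, and the function name is hypothetical.

```python
import math


def interval_length(x_min, x_max, n):
    """Class interval length, choosing the formula by sample size n."""
    if n <= 100:
        classes = 1 + 3.2 * math.log10(n)   # Sturges' formula, constant as on the slide
    else:
        classes = 5 * math.log10(n)         # Brex formula
    return (x_max - x_min) / classes


# Hypothetical data range 0..74 with 100 observations: 1 + 3.2 * 2 = 7.4 classes
print(interval_length(0, 74, 100))
```

In practice the result is usually rounded to a convenient interval length.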
Cumulative frequency distribution
• Used to analyse the total frequency up to and including a class.
• For each class, it gives the number of observations whose value falls in that class or is lower.
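A cumulative distribution is a running total of the class frequencies. The sketch below uses Python's `itertools.accumulate`; the function name and the sample frequencies are illustrative only.

```python
from itertools import accumulate


def cumulative_frequencies(frequencies):
    """Running total of frequencies, in class order."""
    return list(accumulate(frequencies))


# Hypothetical classes 3, 5, 8 with frequencies 1, 2, 1
print(cumulative_frequencies([1, 2, 1]))   # [1, 3, 4]
```

The last cumulative value always equals the total number of observations.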
Charts
• Polygon
– Used for discrete frequency distributions
– Horizontal axis: classes; vertical axis: frequencies
• Histogram
– Used for continuous frequency distributions
– Horizontal axis: segments whose lengths correspond to the class intervals; vertical axis: frequencies
• Cumulate (cumulative frequency curve)
– Used for cumulative frequency distributions
– Horizontal axis: values of discrete or continuous classes; vertical axis: accumulated frequencies
[Figure: polygon, drawn as a scatter chart with lines]
[Figure: histogram]
Analysis of a discrete frequency distribution
Mean statistics

• Objective
• Abstract
• Describe the phenomenon as a whole
Mean statistics
Measures of central tendency
Summary means and structural (location) means

Simple (unweighted) means – for ungrouped data
Weighted means – for grouped data
Arithmetic mean
• Simple arithmetic mean
$$\bar{x} = \frac{\sum x_i}{N}$$
• Weighted arithmetic mean
$$\bar{x} = \frac{\sum x_i f_i}{\sum f_i}$$
Arithmetic mean
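Ungrouped data:
$x_i$
3
5
5
8
$$\bar{x} = \frac{3 + 5 + 5 + 8}{4} = 5.25$$

Grouped data:
$x_i$   $f_i$
3       1
5       2
8       1
$$\bar{x} = \frac{3 + 5 \cdot 2 + 8}{4} = 5.25$$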
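The two calculations above can be checked with a few lines of Python. The function names are hypothetical; the data are the values from the example.

```python
def simple_mean(values):
    """Simple arithmetic mean of ungrouped data."""
    return sum(values) / len(values)


def weighted_mean(values, frequencies):
    """Weighted arithmetic mean of grouped data (classes with frequencies)."""
    total = sum(x * f for x, f in zip(values, frequencies))
    return total / sum(frequencies)


print(simple_mean([3, 5, 5, 8]))             # 5.25
print(weighted_mean([3, 5, 8], [1, 2, 1]))   # 5.25
```

Both forms give the same result because the grouped table is only a compact way of writing the same four observations.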
Structural means
• Median
• Quantiles
• Mode
Median
• The value or class that divides an ordered frequency distribution (ascending or descending) into two equal parts (by frequencies).

• If the number of observations is even, the median is the arithmetic mean of the two middle observations.
Median
Observed $x_i$: 8 7 3 9 1 2 2
Ordered $x_i$: 1 2 2 3 7 8 9
Me = 3 (the middle, fourth value)
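A minimal sketch of the median rule for ungrouped data, covering both an odd and an even number of observations. The function name is hypothetical; the first call uses the seven values from the example.

```python
def median(values):
    """Median of ungrouped data: middle value, or mean of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd n: the single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: mean of the two middle values


print(median([8, 7, 3, 9, 1, 2, 2]))   # 3
print(median([1, 2, 3, 4]))            # 2.5
```

Python's standard library offers `statistics.median`, which follows the same rule.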
Mode
• The most frequent observation or class.
• x with the largest f
• For continuous frequency distributions:
$$Mo = x_0 + \Delta_{Mo} \cdot \frac{f_{Mo} - f_{Mo-1}}{(f_{Mo} - f_{Mo-1}) + (f_{Mo} - f_{Mo+1})}$$
Mode
• One mode — monomodal
• Two modes — bimodal
• Three or more modes — multimodal
Variance statistics
How much the observed values differ from the average, and how significant these differences are.
Variance statistics
• Variation range
• Variance and standard deviation
• Variance coefficient
Variation range
• Difference between the largest and smallest observation
$$R_v = x_{\max} - x_{\min}$$
• Takes rare, extreme values into account!
Variance and standard deviation
• Variance: the mean squared deviation from the arithmetic mean, expressed in squared units
$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N}$$
• Weighted variance
$$\sigma^2 = \frac{\sum (x_i - \bar{x})^2 f_i}{\sum f_i}$$
Variance and standard deviation
Ungrouped data:
$x_i$   $(x_i - \bar{x})^2$
3       $(3 - 5.25)^2$
5       $(5 - 5.25)^2$
5       $(5 - 5.25)^2$
8       $(8 - 5.25)^2$

Grouped data:
$x_i$   $f_i$   $(x_i - \bar{x})^2 f_i$
3       1       $(3 - 5.25)^2$
5       2       $(5 - 5.25)^2 \cdot 2$
8       1       $(8 - 5.25)^2$
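The table above can be finished numerically with a short sketch. The function name is hypothetical; the data are the grouped values from the table, and the standard deviation is taken as the square root of the variance.

```python
import math


def weighted_variance(values, frequencies):
    """Mean squared deviation from the weighted arithmetic mean."""
    n = sum(frequencies)
    mean = sum(x * f for x, f in zip(values, frequencies)) / n
    return sum((x - mean) ** 2 * f for x, f in zip(values, frequencies)) / n


variance = weighted_variance([3, 5, 8], [1, 2, 1])
std_dev = math.sqrt(variance)   # back in the same units as x_i

print(variance)            # 3.1875
print(round(std_dev, 3))   # 1.785
```

Note that this divides by the number of observations, as on the slides, not by N - 1 as the sample variance does.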
Variance and standard deviation
• Standard deviation: the mean quadratic deviation from the arithmetic mean, expressed in the same units as $x_i$
$$\sigma = \sqrt{\sigma^2}$$
Variance application
• Dispersion comparison
• Inequality analysis
• Convergence analysis
• What is “normal”?
• etc.
Variance coefficient
• Relative level of variation, in percent (the coefficient of variation)
$$V = \frac{\sigma}{\bar{x}} \cdot 100$$
• Allows comparison of different objects measured in different units
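As a rough illustration, the coefficient can be computed from grouped data as below. The function name is hypothetical, and the sample values are the small grouped example used earlier in the slides (classes 3, 5, 8 with frequencies 1, 2, 1).

```python
import math


def variation_coefficient(values, frequencies):
    """Standard deviation as a percentage of the arithmetic mean."""
    n = sum(frequencies)
    mean = sum(x * f for x, f in zip(values, frequencies)) / n
    variance = sum((x - mean) ** 2 * f for x, f in zip(values, frequencies)) / n
    return math.sqrt(variance) / mean * 100


v = variation_coefficient([3, 5, 8], [1, 2, 1])
print(round(v, 1))   # about 34 percent
```

Because the result is unit-free, rescaling the data (for example from metres to centimetres) leaves it unchanged.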
Skewness and kurtosis statistics
Skewness
• Skewness statistics characterise the asymmetry of a distribution relative to the arithmetic mean
Structural skewness
• An approximate statistic of skewness

$$A = \frac{\bar{x} - Me}{\bar{x} - Mo}$$
• A = 0 → symmetric
• A < 3 → asymmetric
Skewness coefficient
• A more precise skewness statistic
$$K_3 = \frac{m_3}{\sigma^3}$$
• K3 > 0 → positive skew
• K3 < 0 → negative skew


Kurtosis coefficient
• The concentration of observations around the arithmetic mean, in comparison to the normal distribution.
$$E = \frac{m_4}{\sigma^4} - 3$$
• E > 0 → pointed at peak
• For the normal distribution, E = 0
Central moments
$$m_k = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^k f_i}{\sum_{i=1}^{n} f_i}$$
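The central moments feed directly into the skewness and kurtosis coefficients defined above. The sketch below is one way to compute them for grouped data; the function names and the sample data are hypothetical.

```python
def central_moment(values, frequencies, k):
    """k-th central moment m_k of a frequency distribution."""
    n = sum(frequencies)
    mean = sum(x * f for x, f in zip(values, frequencies)) / n
    return sum((x - mean) ** k * f for x, f in zip(values, frequencies)) / n


def skewness(values, frequencies):
    """Skewness coefficient K3 = m3 / sigma^3."""
    sigma = central_moment(values, frequencies, 2) ** 0.5
    return central_moment(values, frequencies, 3) / sigma ** 3


def excess_kurtosis(values, frequencies):
    """Kurtosis coefficient E = m4 / sigma^4 - 3."""
    m2 = central_moment(values, frequencies, 2)   # sigma^2
    return central_moment(values, frequencies, 4) / m2 ** 2 - 3


print(central_moment([3, 5, 8], [1, 2, 1], 2))   # 3.1875 (the variance)
print(skewness([3, 5, 8], [1, 2, 1]) > 0)        # True: longer tail to the right
print(skewness([1, 2, 3], [1, 2, 1]))            # 0.0 for a symmetric distribution
```

The second central moment is the variance, so `sigma` here is the same standard deviation as in the variance slides.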
Analysis of a continuous frequency distribution
Arithmetic mean
• Simple arithmetic mean
$$\bar{x} = \frac{\sum x_i}{N}$$
• Weighted arithmetic mean
$$\bar{x} = \frac{\sum x_i f_i}{\sum f_i}$$
Median
For continuous frequency distributions:
$$Me = x_0 + \Delta_{Me} \cdot \frac{\frac{\sum_{i=1}^{n} f_i}{2} - \sum_{i=1}^{Me-1} f_i}{f_{Me}}$$
$x_0$ - start of the median interval
$\Delta_{Me}$ - length of the median interval
$\sum_{i=1}^{n} f_i$ - total number of observations
$\sum_{i=1}^{Me-1} f_i$ - number of observations before the median interval
$f_{Me}$ - number of observations in the median interval
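A possible implementation of the interpolation formula is sketched below. It assumes the median interval is the first one whose cumulative frequency reaches half of the total; the function name and the sample intervals are hypothetical.

```python
def grouped_median(lower_bounds, lengths, frequencies):
    """Median of a continuous frequency distribution by linear interpolation."""
    half = sum(frequencies) / 2
    cumulative_before = 0   # observations before the current interval
    for x0, delta, f in zip(lower_bounds, lengths, frequencies):
        if f > 0 and cumulative_before + f >= half:
            # This is the median interval: interpolate inside it.
            return x0 + delta * (half - cumulative_before) / f
        cumulative_before += f
    raise ValueError("frequencies must contain at least one positive value")


# Hypothetical intervals (0-10], (10-20], (20-30] with frequencies 2, 5, 3
print(grouped_median([0, 10, 20], [10, 10, 10], [2, 5, 3]))   # 16.0
```

Here half of the 10 observations is 5, two observations precede the second interval, so the median lies 3/5 of the way through it.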
Mode
For continuous frequency distributions:
$$Mo = x_0 + \Delta_{Mo} \cdot \frac{f_{Mo} - f_{Mo-1}}{(f_{Mo} - f_{Mo-1}) + (f_{Mo} - f_{Mo+1})}$$
$x_0$ - start of the mode interval
$\Delta_{Mo}$ - length of the mode interval
$f_{Mo}$ - number of observations in the mode interval
$f_{Mo-1}$ - number of observations in the interval before the mode interval
$f_{Mo+1}$ - number of observations in the interval after the mode interval
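The mode formula can be sketched in the same way. Two choices below are assumptions, not something the slides state: the mode interval is the first interval with the highest frequency, and a missing neighbour (when the mode interval is first or last) counts as frequency 0. The function name and sample data are hypothetical.

```python
def grouped_mode(lower_bounds, lengths, frequencies):
    """Mode of a continuous frequency distribution by interpolation."""
    # Index of the first interval with the highest frequency.
    m = max(range(len(frequencies)), key=frequencies.__getitem__)
    f_mo = frequencies[m]
    f_prev = frequencies[m - 1] if m > 0 else 0                      # assumed 0 at the edge
    f_next = frequencies[m + 1] if m + 1 < len(frequencies) else 0   # assumed 0 at the edge
    denominator = (f_mo - f_prev) + (f_mo - f_next)
    if denominator == 0:
        raise ValueError("mode is undefined when all frequencies are zero")
    return lower_bounds[m] + lengths[m] * (f_mo - f_prev) / denominator


# Hypothetical intervals (0-10], (10-20], (20-30] with frequencies 1, 4, 3
print(grouped_mode([0, 10, 20], [10, 10, 10], [1, 4, 3]))   # 17.5
```

The mode is pulled toward the neighbouring interval with the higher frequency, here the third one.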
Variance and standard deviation
• Variance: the mean squared deviation from the arithmetic mean, expressed in squared units
$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N}$$
• Weighted variance
$$\sigma^2 = \frac{\sum (x_i - \bar{x})^2 f_i}{\sum f_i}$$
Analysis of a growth rate
Geometric mean
• Simple geometric mean
$$\bar{x}_0 = \sqrt[N]{\prod_{i=1}^{N} x_i}$$
• Weighted geometric mean
$$\bar{x}_0 = \sqrt[\sum_{i=1}^{n} f_i]{\prod_{i=1}^{n} x_i^{f_i}}$$
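Both geometric means can be sketched with `math.prod` (available from Python 3.8). The function names are hypothetical, and the growth rates in the usage lines are illustrative values.

```python
import math


def geometric_mean(values):
    """Simple geometric mean: N-th root of the product of N values."""
    return math.prod(values) ** (1 / len(values))


def weighted_geometric_mean(values, frequencies):
    """Weighted geometric mean: each value enters the product f_i times."""
    product = math.prod(x ** f for x, f in zip(values, frequencies))
    return product ** (1 / sum(frequencies))


# Average growth rate over two periods with rates 1.34 and 0.72
print(round(geometric_mean([1.34, 0.72]), 2))   # 0.98

# The same value repeated twice equals weight 2 in the weighted form
print(math.isclose(geometric_mean([2, 2, 8]), weighted_geometric_mean([2, 8], [2, 1])))
```

The geometric mean suits growth rates because rates multiply across periods rather than add.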
Example – growth rate vs. percent

Growth rate → change in percent: × 100, then − 100
Change in percent → growth rate: + 100, then ÷ 100

Growth rate   Change in percent
1.34          +34%
0.72          -28%
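The two conversions in this example are simple enough to express as a pair of helper functions. The names are hypothetical; the values are the ones from the example.

```python
def rate_to_percent(rate):
    """Growth rate -> change in percent: times 100, then minus 100."""
    return rate * 100 - 100


def percent_to_rate(percent):
    """Change in percent -> growth rate: plus 100, then divided by 100."""
    return (percent + 100) / 100


print(round(rate_to_percent(1.34), 2))   # 34.0
print(round(rate_to_percent(0.72), 2))   # -28.0
print(percent_to_rate(-28))              # 0.72
```

Rounding is applied on output only because floating-point multiplication can leave tiny residues.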
