Chapter 3

This document provides an overview of descriptive statistics including measures of central tendency, variability, and shape. It discusses various statistical measures such as the mode, median, mean, range, interquartile range, variance, standard deviation, skewness, and coefficient of skewness. Formulas and examples are provided to demonstrate how to calculate each measure using raw data sets. The key statistical concepts covered in the document are measures used to describe the center, spread, and symmetry of data distributions.

Uploaded by

KHIEM HUOL GIA
Copyright
© All Rights Reserved

Chapter 3

Descriptive Statistics
University of Economics
Ho Chi Minh City

Dang Van Thac 1/22


Outline
• Measures of Central Tendency: Ungrouped Data
• Measures of Variability: Ungrouped Data
• Measures of Shape


Measures of Central Tendency: Ungrouped Data
• Measures of central tendency yield information about the center, or middle part, of a group
of numbers.
• Five measures of central tendency: Mode, Median, Mean, Percentiles, and Quartiles.
• Mode: is the most frequently occurring value in a set of data. Data with two modes are said
to be bimodal. Data with more than two modes are referred to as multimodal.
• How to find the mode?
Step 1: rank the raw data from smallest to largest value.
Step 2: identify the most frequently occurring value in the ranked data.
Example:
Raw data:
14.25 19.00 11.00 28.00
24.00 23.00 43.25 19.00
27.00 25.00 15.00 7.00
34.22 15.50 15.00 22.00
19.00 19.00 27.00 21.00
Step 1 (ranked data):
7.00 11.00 14.25 15.00 15.00 15.50 19.00 19.00 19.00 19.00
21.00 22.00 23.00 24.00 25.00 27.00 27.00 28.00 34.22 43.25
Step 2: The mode is 19.00, because it occurs four times, more often than any other value in the data set.

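The two steps for finding the mode can be sketched in Python with the standard library's `collections.Counter`. This is a minimal sketch, not part of the slides: the function name `find_modes` is an illustrative assumption, and it returns every value tied for the highest frequency so that bimodal and multimodal data are covered as well.

```python
from collections import Counter

def find_modes(values):
    """Return all values that occur most often (several if there is a tie)."""
    counts = Counter(values)            # frequency of each distinct value
    highest = max(counts.values())      # the largest frequency observed
    return sorted(v for v, c in counts.items() if c == highest)

# Ranked values from the example
data = [7.00, 11.00, 14.25, 15.00, 15.00, 15.50, 19.00, 19.00, 19.00, 19.00,
        21.00, 22.00, 23.00, 24.00, 25.00, 27.00, 27.00, 28.00, 34.22, 43.25]

print(find_modes(data))  # [19.0]
```

Sorting the data first is not needed by `Counter`; ranking mainly helps when counting by hand.
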
Measures of Central Tendency: Ungrouped Data
• Median: is the middle value in an ordered array of numbers.
• How to find the median?
Step 1: Arrange the observations in an ordered data array.
Step 2: Calculate the position of the median, (n+1)/2, where n is the number of observations.
Step 3: For an odd number of terms, find the middle term of the ordered array. It is the median.
Step 4: For an even number of terms, find the average of the middle two terms. This average is the median.
Example 1 (odd number of terms, n = 17):
15 11 14 3 21 17 22 16 19 16 5 7 19 8 9 20 4
Step 1: 3 4 5 7 8 9 11 14 15 16 16 17 19 19 20 21 22
Step 2: (n+1)/2 = (17+1)/2 = 9
Step 3: n = 17 is odd, thus the median is the value at the 9th location: 15
Example 2 (even number of terms, n = 16):
15 11 14 3 21 17 16 19 16 5 7 19 8 9 20 4
Step 1: 3 4 5 7 8 9 11 14 15 16 16 17 19 19 20 21
Step 2: (n+1)/2 = (16+1)/2 = 8.5
Step 4: n = 16 is even, thus the median is the mean of the values at the 8th and 9th locations: (14+15)/2 = 14.5

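The median procedure can be sketched in Python as follows. This is a minimal sketch under the assumption that the choice between steps 3 and 4 depends on whether the number of observations n is odd or even; the function name `median_ungrouped` is illustrative.

```python
def median_ungrouped(values):
    """Median of raw data: middle term (odd n) or mean of the two middle terms (even n)."""
    ordered = sorted(values)               # Step 1: ordered array
    n = len(ordered)
    position = (n + 1) / 2                 # Step 2: 1-based position of the median
    if n % 2 == 1:                         # Step 3: odd n, position is a whole number
        return ordered[int(position) - 1]
    lower = ordered[n // 2 - 1]            # Step 4: even n, the two middle terms
    upper = ordered[n // 2]
    return (lower + upper) / 2

odd_data = [15, 11, 14, 3, 21, 17, 22, 16, 19, 16, 5, 7, 19, 8, 9, 20, 4]
even_data = [15, 11, 14, 3, 21, 17, 16, 19, 16, 5, 7, 19, 8, 9, 20, 4]

print(median_ungrouped(odd_data))   # 15
print(median_ungrouped(even_data))  # 14.5
```
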
Measures of Central Tendency: Ungrouped Data
• Mean: is the average of a group of numbers

Population mean: $\mu = \frac{\sum x}{N} = \frac{x_1 + x_2 + x_3 + \cdots + x_N}{N}$

Sample mean: $\bar{x} = \frac{\sum x}{n} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n}$

Note: $\sum$ is commonly used in mathematics to represent a summation of all the numbers in a grouping.

Example:
15 11 14 3 21 17 22 16 19 16 5 7 19 8 9 20 4

$\bar{x} = \frac{\sum x}{n} = \frac{15 + 11 + 14 + 3 + \cdots + 4}{17} = \frac{226}{17} \approx 13.29$
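The sample mean of the example data can be checked with a few lines of Python. This is a minimal sketch using only built-ins; the variable names are illustrative. The 17 values sum to 226, so the mean is 226/17, which rounds to 13.29.

```python
data = [15, 11, 14, 3, 21, 17, 22, 16, 19, 16, 5, 7, 19, 8, 9, 20, 4]

total = sum(data)            # the summation of all values
n = len(data)                # number of observations in the sample
sample_mean = total / n      # x-bar = (sum of x) / n

print(total, n, round(sample_mean, 2))  # 226 17 13.29
```
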


Measures of Central Tendency: Ungrouped Data
• Percentiles: are measures of central tendency that divide a group of data into 100 parts. There are
99 percentiles because it takes 99 dividers to separate a group of data into 100 parts. The nth
percentile is the value such that at least n percent of the data are below that value and at most
(100 - n) percent are above that value.
• How to find the location of a percentile?
Step 1: Organize the numbers into an ascending-order array.
Step 2: Calculate the percentile location (i) by: $i = \frac{P}{100}(N)$
i = percentile location, P = the percentile of interest, N = number in the data set
Step 3: Determine the location by either (a) or (b).
a. If i is a whole number, the Pth percentile is the average of the value at the ith location and the value at the (i+1)th location.
b. If i is not a whole number, the Pth percentile value is located at the whole number part of (i+1).
Example:
Determine the 80th percentile of 1240 numbers.
Step 2: i = (80/100)(1240) = 992
Step 3: Because i = 992 is a whole number, rule (a) applies, and the 80th percentile is:
P80 = (992nd number + 993rd number)/2
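The location rule for percentiles can be sketched in Python as follows. This is a minimal sketch, assuming an integer percentile between 1 and 99. The function name `percentile_by_location` is illustrative, and because the slide does not list its 1240 numbers, the integers 1 to 1240 are used as a hypothetical stand-in data set.

```python
def percentile_by_location(values, p):
    """Pth percentile using the location i = (p/100) * N and rules (a) and (b)."""
    if not 1 <= p <= 99:
        raise ValueError("p must be an integer from 1 to 99")
    ordered = sorted(values)                            # Step 1: ascending order
    # Step 2: i = p * N / 100, kept as whole part and remainder to avoid float error
    whole, remainder = divmod(p * len(ordered), 100)
    if remainder == 0:
        # Rule (a): i is whole, average the ith and (i+1)th values (1-based)
        return (ordered[whole - 1] + ordered[whole]) / 2
    # Rule (b): i is not whole, take the value at the whole-number part of i + 1
    return ordered[whole]

# Hypothetical data: 1240 numbers (the integers 1..1240), not from the slides
numbers = list(range(1, 1241))
print(percentile_by_location(numbers, 80))  # 992.5
```

With this stand-in data the 992nd and 993rd values are 992 and 993, so the result is their average. Library routines such as `statistics.quantiles` use interpolation rules and may return slightly different values.
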


Measures of Central Tendency: Ungrouped Data
• Quartiles: are measures of central tendency that divide a group of data into four parts.

The value of $Q_1$ is found at the 25th percentile, at location:
$i = \frac{25}{100}(N)$
The value of $Q_3$ is found at the 75th percentile, at location:
$i = \frac{75}{100}(N)$
The value of $Q_2$ is equal to the median.

Example: 106 109 114 116 121 122 125 129

$Q_1$: $i = \frac{25}{100}(8) = 2 \Rightarrow Q_1 = (109+114)/2 = 111.5$
$Q_3$: $i = \frac{75}{100}(8) = 6 \Rightarrow Q_3 = (122+125)/2 = 123.5$
$Q_2$ = median: location $\frac{n+1}{2} = \frac{8+1}{2} = 4.5 \Rightarrow Q_2 = (116+121)/2 = 118.5$

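The three quartiles of the example can be reproduced by applying the percentile location rule at P = 25, 50 and 75. This is a minimal sketch; the helper name `percentile_by_location` is an illustrative assumption, and treating Q2 as the 50th percentile is an assumption that gives the same value as the median rule.

```python
def percentile_by_location(values, p):
    """Location i = (p/100) * N; average two values if i is whole, else round up."""
    ordered = sorted(values)
    whole, remainder = divmod(p * len(ordered), 100)
    if remainder == 0:
        return (ordered[whole - 1] + ordered[whole]) / 2   # ith and (i+1)th values
    return ordered[whole]                                   # value at whole part of i + 1

data = [106, 109, 114, 116, 121, 122, 125, 129]

q1 = percentile_by_location(data, 25)   # i = 2, average of 2nd and 3rd values
q2 = percentile_by_location(data, 50)   # i = 4, average of 4th and 5th values
q3 = percentile_by_location(data, 75)   # i = 6, average of 6th and 7th values

print(q1, q2, q3)  # 111.5 118.5 123.5
```
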
Measures of Variability: Ungrouped Data
• Measures of variability: describe the spread or the dispersion of a set of data.
• Seven measures of variability for ungrouped data: range, interquartile range,
mean absolute deviation, variance, standard deviation, z scores, and coefficient
of variation.
• Range: is the difference between the largest value of a data set and the smallest
value of a set.
Range = Highest – Lowest

Example: 5 21 12 23 15 14 6 8 7 9 16 21 17
Range=23-5=18


Measures of Variability: Ungrouped Data
• Interquartile Range: is the range of values between the first and third quartile.
Interquartile range= 𝑄3 - 𝑄1

Example: 5 21 12 23 15 14 6 8 7 9 16 21 17

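Ordered data: 5 6 7 8 9 12 14 15 16 17 21 21 23

$Q_1$: $i = \frac{25}{100}(13) = 3.25 \Rightarrow$ 4th location $\Rightarrow Q_1 = 8$

$Q_3$: $i = \frac{75}{100}(13) = 9.75 \Rightarrow$ 10th location $\Rightarrow Q_3 = 17$

Interquartile range $= Q_3 - Q_1 = 17 - 8 = 9$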
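The interquartile range of the example can be checked in Python. This is a minimal sketch: the helper name `quartile_value` is illustrative, and it assumes the percentile location rule (a non-whole location i is moved up to the whole-number part of i + 1).

```python
def quartile_value(ordered, p):
    """Value of the pth percentile in already sorted data, by the location rule."""
    whole, remainder = divmod(p * len(ordered), 100)
    if remainder == 0:
        return (ordered[whole - 1] + ordered[whole]) / 2
    return ordered[whole]

data = [5, 21, 12, 23, 15, 14, 6, 8, 7, 9, 16, 21, 17]
ordered = sorted(data)

q1 = quartile_value(ordered, 25)   # i = 3.25, so the 4th value
q3 = quartile_value(ordered, 75)   # i = 9.75, so the 10th value
iqr = q3 - q1

print(q1, q3, iqr)  # 8 17 9
```
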


Measures of Variability: Ungrouped Data
• Variance: is the average of the squared deviations about the arithmetic mean for a set of numbers.
Population variance: $\sigma^2 = \frac{\sum (x-\mu)^2}{N}$
Sample variance: $s^2 = \frac{\sum (x-\bar{x})^2}{n-1}$
Example:
x     x-μ    (x-μ)²
5     -8     64
9     -4     16
16     3      9
17     4     16
18     5     25
$\mu = 13$, $\sum (x-\mu) = 0$, $\sum (x-\mu)^2 = 130$
$\sigma^2 = \frac{\sum (x-\mu)^2}{N} = \frac{130}{5} = 26$
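The variance table can be reproduced step by step in Python. This is a minimal sketch with illustrative variable names; it also computes the sample variance of the same five values, which the slide defines but does not work out.

```python
data = [5, 9, 16, 17, 18]

n = len(data)
mu = sum(data) / n                          # mean of the data: 13.0
deviations = [x - mu for x in data]         # the x - mu column
squared = [d ** 2 for d in deviations]      # the (x - mu)^2 column

population_variance = sum(squared) / n      # divide by N
sample_variance = sum(squared) / (n - 1)    # divide by n - 1

print(sum(deviations), sum(squared))        # 0.0 130.0
print(population_variance, sample_variance) # 26.0 32.5
```

The deviations about the mean always sum to zero, which is why they are squared before averaging.
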


Measures of Variability: Ungrouped Data
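• Standard deviation: is the square root of the variance.

Population: $\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (x-\mu)^2}{N}}$
Sample: $s = \sqrt{s^2} = \sqrt{\frac{\sum (x-\bar{x})^2}{n-1}}$

Example:

x     x-μ    (x-μ)²
5     -8     64
9     -4     16
16     3      9
17     4     16
18     5     25

$\sigma^2 = \frac{\sum (x-\mu)^2}{N} = \frac{130}{5} = 26$
$\sigma = \sqrt{\sigma^2} = \sqrt{26} \approx 5.1$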
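The standard deviation of the example can be checked in Python by taking the square root of the variance and comparing it with the standard library's `statistics.pstdev` (population) and `statistics.stdev` (sample). This is a minimal sketch; the variable names are illustrative.

```python
import math
import statistics

data = [5, 9, 16, 17, 18]

mu = sum(data) / len(data)
population_variance = sum((x - mu) ** 2 for x in data) / len(data)   # 26.0
sigma = math.sqrt(population_variance)     # population standard deviation

# Cross-check against the standard library
assert math.isclose(sigma, statistics.pstdev(data))

sample_std = statistics.stdev(data)        # divides by n - 1 instead of N

print(round(sigma, 1))       # 5.1
print(round(sample_std, 2))  # 5.7
```
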


Measures of Shape
• Measures of shape are tools that can be used to describe the shape of a distribution of data.
• Skewness: is when a distribution is asymmetrical or lacks symmetry. The skewed portion is the
long, thin part of the curve.


Measures of Shape
• Skewness and the Relationship between Mean, Median, and Mode


Measures of Shape
• Coefficient of skewness

$S_k = \frac{3(\mu - M_d)}{\sigma}$
where $S_k$ = coefficient of skewness, $\mu$ = mean, $M_d$ = median, $\sigma$ = standard deviation

$S_k = 0$: symmetric distribution
$S_k > 0$: positively skewed
$S_k < 0$: negatively skewed

Example:
A distribution has a mean of 29, a median of 26, and a standard deviation of 12.3. The coefficient of skewness is

$S_k = \frac{3(\mu - M_d)}{\sigma} = \frac{3(29 - 26)}{12.3} = +0.73$

=> The distribution is positively skewed

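The coefficient of skewness and its interpretation can be sketched in Python. This is a minimal sketch; the function names `skewness_coefficient` and `describe_skewness` are illustrative assumptions, and the inputs are the summary values from the example.

```python
def skewness_coefficient(mean, median, std_dev):
    """S_k = 3 * (mean - median) / standard deviation."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return 3 * (mean - median) / std_dev

def describe_skewness(sk):
    """Map the sign of S_k to the three cases listed on the slide."""
    if sk > 0:
        return "positively skewed"
    if sk < 0:
        return "negatively skewed"
    return "symmetric distribution"

sk = skewness_coefficient(29, 26, 12.3)
print(round(sk, 2), describe_skewness(sk))  # 0.73 positively skewed
```
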
Measures of Shape
• Kurtosis: describes the amount of peakedness of a distribution.


Measures of Shape
• Box-and-Whisker Plots (Box plot): is a diagram that utilizes the upper and
lower quartiles along with the median and the two most extreme values to
depict a distribution graphically.


Measures of Shape
Example:
87 85 84 84 82 82 82 81 81 81
80 79 79 77 76 75 74 74 74 73
73 73 73 72 72 71 71 71 70 69
69 68 68 65 65 64 64 63 62 62

$Q_1 = 69$
$Q_2 = \text{median} = 73$
$Q_3 = 80.5$
IQR $= Q_3 - Q_1 = 80.5 - 69 = 11.5$
Inner fences: $Q_1 - 1.5\,\text{IQR} = 69 - 1.5(11.5) = 51.75$ and $Q_3 + 1.5\,\text{IQR} = 80.5 + 1.5(11.5) = 97.75$
Outer fences: $Q_1 - 3\,\text{IQR} = 69 - 3(11.5) = 34.5$ and $Q_3 + 3\,\text{IQR} = 80.5 + 3(11.5) = 115.0$
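The fence arithmetic for the box-and-whisker example can be checked in Python. This is a minimal sketch that takes Q1 = 69 and Q3 = 80.5 as given in the example rather than recomputing them. Listing the values that fall outside the inner fences is an added illustration based on a common convention for flagging possible outliers; it is not stated on the slide.

```python
data = [87, 85, 84, 84, 82, 82, 82, 81, 81, 81,
        80, 79, 79, 77, 76, 75, 74, 74, 74, 73,
        73, 73, 73, 72, 72, 71, 71, 71, 70, 69,
        69, 68, 68, 65, 65, 64, 64, 63, 62, 62]

q1, q3 = 69, 80.5                 # quartiles taken from the example
iqr = q3 - q1                     # interquartile range: 11.5

inner_fences = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
outer_fences = (q1 - 3 * iqr, q3 + 3 * iqr)

# Assumed convention (not from the slide): values beyond the inner fences
# are candidates for outliers.
outside_inner = [x for x in data if x < inner_fences[0] or x > inner_fences[1]]

print(inner_fences)   # (51.75, 97.75)
print(outer_fences)   # (34.5, 115.0)
print(outside_inner)  # []
```

All 40 values lie between 62 and 87, so none falls outside the inner fences.
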
