Lecture 3

Uploaded by

Ly Khánh


Lecture 3.

Chapter 5
Numerical descriptive measures
5.1 Measure of central location
5.2 Measure of variability
5.3 Measure of relative standing and box plots
5.4 Approximate descriptive measures for grouped data
Optional reading: 5.5 & 5.6

5.1 Measures of central location:

Arithmetic Mean (or Average)
This is the most popular and useful measure
of central location.

Mean = (Sum of measurements) / (Number of measurements)

Sample mean (n = sample size):         $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
Population mean (N = population size): $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$

Example 5.1, page 131
The mean of the sample of 10 measurements (waiting times for a bus),
14, 8, 12, 13, 12, 6, 19, 7, 11, 8, is given by
$\bar{x} = \frac{\sum_{i=1}^{10} x_i}{10} = \frac{14 + 8 + 12 + \dots + 11 + 8}{10} = 11.0$

Example 4.1 (contd.), page 85
Suppose the telephone bills of Example 4.1 represent a population
of measurements. The population mean is
$\mu = \frac{\sum_{i=1}^{200} x_i}{200} = \frac{196.65 + 468.75 + \dots + 270.90 + 196.65}{200} = 238.015$

Example 5.4, page 134

When many of the measurements have the same value, the
measurements can be summarized in a frequency table.

Waist sizes (cm), x_i:             70  75  77  80  82  85  90  100
Number of pairs of trousers, f_i:   2   1   2   1   1   5   1    1

The mean is $\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{1146}{14} \approx 81.9$ cm.

The mean is seriously affected by extreme values, called
'outliers'. For example, if one pair of trousers with a very large
waist size (say, 200 cm) is added to the sample, the average
waist size rises from 81.9 cm to 89.7 cm.
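
The effect of the outlier can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the helper name `mean_from_frequencies` is an illustrative assumption. It computes the mean from the frequency table of Example 5.4 and then again after adding one 200 cm observation.

```python
# Frequency table from Example 5.4: waist size (cm) -> number of pairs of trousers.
waist_freq = {70: 2, 75: 1, 77: 2, 80: 1, 82: 1, 85: 5, 90: 1, 100: 1}


def mean_from_frequencies(freq):
    """Mean of data summarised as {value: frequency}: sum(f_i * x_i) / sum(f_i)."""
    n = sum(freq.values())
    return sum(x * f for x, f in freq.items()) / n


mean_before = mean_from_frequencies(waist_freq)

# Add one extreme observation (200 cm) to see how far the mean moves.
with_outlier = dict(waist_freq)
with_outlier[200] = 1
mean_after = mean_from_frequencies(with_outlier)

print(round(mean_before, 1))  # 81.9
print(round(mean_after, 1))   # 89.7
```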
Median: another measure of central location

The median is the value that falls in the middle when the
measurements are arranged in order of magnitude.

Example 5.2, page 132 (odd number of observations)
Seven employee salaries were recorded (in $1000): 49, 52, 47, 53, 51, 47, 50.
Find the median salary.
First, sort the salaries. Then, locate the value in the middle.
47, 47, 49, [50], 51, 52, 53
The median is 50.

Example 5.3, page 132 (even number of observations)
Suppose the director’s salary of $130000 was added to the group recorded before.
Find the median salary.
First, sort the salaries. Then, locate the values in the middle.
There are two middle values!
47, 47, 49, [50, 51], 52, 53, 130
The median is the average of the two middle values: (50 + 51)/2 = 50.5.
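
The sort-then-pick-the-middle procedure of Examples 5.2 and 5.3 can be written out as a short Python sketch. The function name `median` and the variable names are illustrative assumptions; the standard-library `statistics.median` is used only as a cross-check.

```python
import statistics

salaries = [49, 52, 47, 53, 51, 47, 50]   # Example 5.2, in $1000
with_director = salaries + [130]          # Example 5.3


def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median(salaries))        # 50
print(median(with_director))   # 50.5
# The standard-library function gives the same answers.
print(statistics.median(salaries), statistics.median(with_director))  # 50 50.5
```

Note that the director's salary moves the median only from 50 to 50.5, whereas it would pull the mean up considerably.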

Mode: another commonly used measure of central location

The mode of a set of observations is the value that
occurs most frequently. A set of data may have
one mode or two or more modes.

Example 5.4, page 134

Waist sizes (cm), x_i:             70  75  77  80  82  85  90  100
Number of pairs of trousers, f_i:   2   1   2   1   1   5   1    1

The mean is 81.9 cm and the median is (x7 + x8)/2 = (82 + 85)/2 = 83.5 cm,
where x7 and x8 are the 7th and 8th ordered observations.
The mode of this data set is 85 cm, which is greater than both
the median (83.5 cm) and the mean (81.9 cm).
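
As a rough check of the mode and median quoted for Example 5.4, the frequency table can be expanded into individual observations and passed to Python's standard `statistics` module. This is only a sketch; the variable names are assumptions.

```python
import statistics

waist_freq = {70: 2, 75: 1, 77: 2, 80: 1, 82: 1, 85: 5, 90: 1, 100: 1}

# Expand the frequency table back into the 14 individual observations.
observations = [x for x, f in waist_freq.items() for _ in range(f)]

modes = statistics.multimode(observations)   # list of all most-frequent values
median = statistics.median(observations)

print(modes)    # [85]
print(median)   # 83.5
```

`multimode` returns a list, so a data set with two or more modes is reported in full rather than as a single value.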

Example 4.1 (contd.), page 85
For large data sets, the modal class is much more
relevant than a single-value mode. The modal class
is the class with the highest frequency. There may be
one modal class or two or more modal classes.

In Example 4.1, the modal class is [200, 250], with the highest frequency, 60.

[Figure: histogram of the telephone bills; x-axis: Bin, y-axis: Frequency]

Example 5.6, page 136

Excel output (Marks):

Mean                 73.98
Standard Error       2.1502163
Median               81
Mode                 84
Standard Deviation   21.502163
Sample Variance      462.34303
Kurtosis             0.3936606
Skewness             -1.073098
Range                89
Minimum              11
Maximum              100
Sum                  7398
Count                100

The mean provides information about the overall performance level
of the class. It can serve as a tool for making comparisons with other
classes and/or other exams.

The median indicates that half of the class received a grade below 81%,
and half of the class received a grade above 81%.

The mode must be used when data is nominal. If marks are classified by
letter grade (say, A, B, C, D, F), the frequency of each grade can be
calculated.

Note: If your data is multi-modal, then Excel prints the smallest one or N/A.

Excel Histogram for Example 5.6

Bin    Frequency
10     0
20     3
30     2
40     6
50     6
60     5
70     10
80     16
90     28
100    24
More   0

[Figure: histogram of the marks; x-axis: Bin, y-axis: Frequency]

The histogram is skewed to the left.
If the distribution is negatively skewed, then Mean < Median < Mode:
73.98 < 81 < 84.
The modal class is [80, 90), with the highest frequency, 28.

5.2 Measures of Variability (about the mean): Range, variance …

Range = Largest observation – Smallest observation

The variance of a population of N measurements x1, x2, …, xN having
a mean $\mu$ is defined as
$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$

The variance of a sample of n measurements x1, x2, …, xn having
a mean $\bar{x}$ is defined as
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{1}{n - 1}\left[\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right]$
or
$s^2 = \frac{n}{n - 1}\left(\overline{x^2} - \bar{x}^2\right)$

The population standard deviation = $\sigma = \sqrt{\sigma^2}$
The sample standard deviation = $s = \sqrt{s^2}$
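
The three expressions for the sample variance are algebraically equivalent. The following Python sketch checks this on the bus waiting times of Example 5.1; the variable names are illustrative assumptions, and the data set is chosen here only because it is small.

```python
import math

x = [14, 8, 12, 13, 12, 6, 19, 7, 11, 8]   # waiting times from Example 5.1
n = len(x)
mean = sum(x) / n

# Definition: squared deviations from the sample mean, divided by n - 1.
s2_definition = sum((xi - mean) ** 2 for xi in x) / (n - 1)

# Shortcut: sum of squares minus (sum)^2 / n, divided by n - 1.
s2_shortcut = (sum(xi ** 2 for xi in x) - sum(x) ** 2 / n) / (n - 1)

# Third form: n / (n - 1) * (mean of squares - square of mean).
mean_of_squares = sum(xi ** 2 for xi in x) / n
s2_third = n / (n - 1) * (mean_of_squares - mean ** 2)

s = math.sqrt(s2_definition)

print(round(s2_definition, 3), round(s2_shortcut, 3), round(s2_third, 3))  # 15.333 15.333 15.333
print(round(s, 3))  # 3.916
```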

Example 5.8, page 149
Rates of return over the past 10 years for two unit trusts are shown below.
Which one has a higher level of risk?
Trust A: 12.3, -2.2, 24.9, 1.3, 37.6, 46.9, 28.4, 9.2, 7.1, 34.5
Trust B: 15.1, 0.2, 9.4, 15.2, 30.8, 28.3, 21.2, 13.7, 1.7, 14.4

Let us use the Excel printout that is run from the
‘Descriptive statistics’ sub-menu.

Measure              Trust A        Trust B
Mean                 20             15
Standard Error       5.29471435     3.152353618
Median               18.6           14.75
Mode                 #N/A           #N/A
Standard Deviation   16.7433569     9.968617423
Sample Variance      280.34         99.37333333
Kurtosis             -1.3419311     -0.46393926
Skewness             0.21697141     0.106952106
Range                49.1           30.6
Minimum              -2.2           0.2
Maximum              46.9           30.8
Sum                  200            150
Count                10             10

Trust A should be considered riskier because its standard
deviation is larger.
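
Several rows of the Excel printout can be reproduced with Python's standard `statistics` module. This is a sketch under the assumption that Excel's figures are the usual sample (n − 1) statistics; the function name `describe` is hypothetical and it covers only a subset of the rows.

```python
import math
import statistics

trust_a = [12.3, -2.2, 24.9, 1.3, 37.6, 46.9, 28.4, 9.2, 7.1, 34.5]
trust_b = [15.1, 0.2, 9.4, 15.2, 30.8, 28.3, 21.2, 13.7, 1.7, 14.4]


def describe(returns):
    """A few of the measures from the Excel 'Descriptive statistics' output."""
    s = statistics.stdev(returns)            # sample standard deviation (n - 1)
    return {
        "mean": statistics.mean(returns),
        "standard_error": s / math.sqrt(len(returns)),
        "median": statistics.median(returns),
        "stdev": s,
        "variance": statistics.variance(returns),
        "range": max(returns) - min(returns),
    }


for name, data in (("Trust A", trust_a), ("Trust B", trust_b)):
    summary = {k: round(v, 2) for k, v in describe(data).items()}
    print(name, summary)
```

The larger standard deviation for Trust A is what leads to the conclusion that it is the riskier of the two.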

Interpreting Standard Deviation
in case the histogram is bell-shaped or mound-shaped

Example 5.9, page 151

A statistician wants to describe the way returns on
investment are distributed: the mean return = 10%, the
standard deviation of the return = 3% and the histogram is
bell-shaped.
How can the statistician use the mean and the standard
deviation to describe the distribution?
– Approximately 68% of the returns lie between
$[\bar{x} - s, \bar{x} + s]$ = [7%, 13%]
– Approximately 95% of the returns lie between
$[\bar{x} - 2s, \bar{x} + 2s]$ = [4%, 16%]
– Approximately 99.7% of the returns lie between
$[\bar{x} - 3s, \bar{x} + 3s]$ = [1%, 19%]
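
The three intervals are simple arithmetic on the mean and standard deviation. A minimal Python sketch, with the hypothetical helper `empirical_rule_intervals`, reproduces them for Example 5.9. The percentages are the approximate shares stated above and only apply when the histogram is bell-shaped.

```python
def empirical_rule_intervals(mean, sd):
    """Intervals mean ± k*sd for k = 1, 2, 3, paired with the approximate
    share of observations they contain when the histogram is bell-shaped."""
    shares = {1: 0.68, 2: 0.95, 3: 0.997}
    return {k: (mean - k * sd, mean + k * sd, share) for k, share in shares.items()}


# Example 5.9: mean return 10%, standard deviation 3%.
for k, (low, high, share) in empirical_rule_intervals(10, 3).items():
    print(f"k={k}: about {share:.1%} of returns lie in [{low}%, {high}%]")
```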

Example 5.9, page 151 (contd.): other conclusions
• By the empirical rule, approximately 95% of the area
under a mound-shaped histogram lies between
$\bar{x} - 2s$ and $\bar{x} + 2s$.
[Figure: mound-shaped histogram with 95% of the area lying between $\bar{x} - 2s$ and $\bar{x} + 2s$]
• About 95% of all the measurements fall within two
standard deviations of the mean, [4%, 16%] (in
fact, 96 out of 100 observations fall in this
range: 96%).
• The range = 16.72 – 3.41 = 13.31
• s ≈ range / 4 ≈ 3.3 (in fact s = 3)
Example 5.10, page 152: study yourself

Interpreting Standard Deviation

Chebyshev’s Theorem
• Given any set of measurements and a number k
(greater than 1), the fraction of these
measurements that lie within k standard deviations
around the mean is at least $1 - 1/k^2$.
For k = 2: 1 – 1/2² = 3/4, or 75%.
For k = 3: 1 – 1/3² = 8/9, or 89%.
• This theorem is valid for any set of measurements
(sample, population) of any shape (not only for
bell-shaped populations).

k   Interval                              Chebyshev       Empirical rule
1   $[\bar{x} - s, \bar{x} + s]$          (not covered)   approx 68%
2   $[\bar{x} - 2s, \bar{x} + 2s]$        at least 75%    approx 95%
3   $[\bar{x} - 3s, \bar{x} + 3s]$        at least 89%    approx 100%
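
Chebyshev's bound can be compared with what a real data set does. The sketch below is an illustration rather than part of the slides: it evaluates 1 − 1/k² and counts the observed fraction of the Trust A returns from Example 5.8 that lie within k sample standard deviations of the mean. The function names are hypothetical.

```python
import statistics


def chebyshev_bound(k):
    """Minimum fraction of measurements within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k ** 2


def fraction_within(data, k):
    """Observed fraction of data within k sample standard deviations of the mean."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)
    inside = [x for x in data if abs(x - mean) <= k * s]
    return len(inside) / len(data)


trust_a = [12.3, -2.2, 24.9, 1.3, 37.6, 46.9, 28.4, 9.2, 7.1, 34.5]

for k in (2, 3):
    print(k, round(chebyshev_bound(k), 3), fraction_within(trust_a, k))
# 2 0.75 1.0
# 3 0.889 1.0
```

The observed fractions are well above the bound, which is expected: Chebyshev's theorem gives a guaranteed minimum for any shape, not an estimate.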

A measure of variability: Coefficient of Variation

Sample coefficient of variation: $cv = \frac{s}{\bar{x}}$
Population coefficient of variation: $CV = \frac{\sigma}{\mu}$

This coefficient provides a proportionate measure
of variation. A standard deviation of 10 may be
perceived as large when the mean value is 100,
but only moderately large when the mean value
is 500.

Example 5.11, page 154 (Example 5.8, contd.):

$cv_A = s_A / \bar{x}_A = 0.837$, $cv_B = s_B / \bar{x}_B = 0.665$.
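
The two coefficients of variation follow directly from the returns in Example 5.8. A short Python sketch, with the hypothetical function name `coefficient_of_variation`:

```python
import statistics

trust_a = [12.3, -2.2, 24.9, 1.3, 37.6, 46.9, 28.4, 9.2, 7.1, 34.5]
trust_b = [15.1, 0.2, 9.4, 15.2, 30.8, 28.3, 21.2, 13.7, 1.7, 14.4]


def coefficient_of_variation(sample):
    """Sample coefficient of variation: standard deviation divided by the mean."""
    return statistics.stdev(sample) / statistics.mean(sample)


print(round(coefficient_of_variation(trust_a), 3))  # 0.837
print(round(coefficient_of_variation(trust_b), 3))  # 0.665
```

Relative to its mean, Trust A is still the more variable of the two, so the ranking by standard deviation is unchanged here.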

5.3 Measures of Relative Standing and Box Plots

Measures of relative standing are designed to
provide information about the position of
particular values relative to the entire data set.

Percentile: the pth percentile is the value for
which p% of the values are less than that value and
(100 – p)% are greater than that value.

Suppose you scored in the 60th percentile on the
UMAT. That means 60% of the other scores were
below yours, while 40% of the scores were above
yours.

Commonly Used Percentiles
First (lower) decile = 10th percentile
First (lower) quartile, Q1 = 25th percentile
Second (middle) quartile, Q2 = 50th percentile
Third (upper) quartile, Q3 = 75th percentile
Ninth (upper) decile = 90th percentile

Location of Percentiles
The location of the Pth percentile in the sorted data is
$L_P = (n + 1)\frac{P}{100}$

Example 5.12, page 160

Calculate the 25th, 50th, and 75th percentile of the
data: 5, 12, 17, 10, 38, 19, 13, 5, 14, 27.
After sorting the data we have
5, 5, 10, 12, 13, 14, 17, 19, 27, 38.
$L_{25} = (10 + 1)\frac{25}{100} = 2.75$

The 2.75th location lies three quarters of the way from the
2nd observation (5) to the 3rd observation (10), so

P25 = 5 + (0.75)(10 – 5) = 8.75

Similarly, we can have: P50 = 13.5 and P75 = 21
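
The location-and-interpolation procedure of Example 5.12 can be sketched in Python as follows. The function name `percentile` is an assumption, and so is the handling of locations that fall before the first or after the last observation (they are clamped to the nearest observation), which the example does not cover.

```python
import math


def percentile(data, p):
    """p-th percentile using the location L_p = (n + 1) * p / 100 and
    linear interpolation between the two neighbouring ordered observations."""
    ordered = sorted(data)
    n = len(ordered)
    location = (n + 1) * p / 100          # 1-based position in the ordered data
    lower = math.floor(location)
    fraction = location - lower
    if lower < 1:                          # assumption: clamp to the first value
        return ordered[0]
    if lower >= n:                         # assumption: clamp to the last value
        return ordered[-1]
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])


data = [5, 12, 17, 10, 38, 19, 13, 5, 14, 27]
print(percentile(data, 25), percentile(data, 50), percentile(data, 75))  # 8.75 13.5 21.0
```

Other software may use a different location rule and return slightly different quartiles for the same data.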

Quartiles and Variability

• Quartiles can provide an idea about the
shape of a histogram.

[Figure: two histograms with Q1, Q2 and Q3 marked, one positively skewed and one negatively skewed]

Interquartile Range
The quartiles can be used to create another
measure of variability, the interquartile range,
which is defined as follows:

Interquartile Range = Q3 – Q1

The interquartile range measures the spread of the
middle 50% of the observations.

Large values of this statistic mean that the 1st and
3rd quartiles are far apart, indicating a high level of
variability.

Box Plots
A box plot is a pictorial display that graphs
five main descriptive measures of the
measurement set:
• L – The largest measurement
• Q3 – The upper quartile
• Q2 – The median
• Q1 – The lower quartile
• S – The smallest measurement

An adjustment to this general description of a box plot may
be needed in the presence of outliers. See the next example.

[Figure: box plot with S, Q1, Q2, Q3 and L marked; each whisker is at most 1.5*(Q3–Q1) long]

Box Plots (contd.)

• The lines extending to the left and right
are called whiskers.
• The whiskers extend outward to the
smaller of 1.5 times the interquartile range
or to the most extreme point.
• Any points that lie outside the whiskers are
called outliers.

Example, page 162: Share value of 11 stocks

S = 0.9, Q1 = 4.3, Q2 = 5.3, Q3 = 9.0, L = 25.5
(largest value that is not an outlier: 11.4)

[Figure: box plot of the share values with fences marked at -2.75 and 16.05]

IQR = Q3 – Q1 = 9.0 – 4.3 = 4.7

Fences = {Q1 – 1.5(IQR), Q3 + 1.5(IQR)} = {-2.75, 16.05}

Any value outside the interval (-2.75, 16.05) is an outlier.

The only outlier is 25.5. Therefore, the whiskers, emanating
from each end of the box (Q1 and Q3), extend to the two
most extreme values that are not outliers: 0.9 and 11.4.
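
The fence calculation can be sketched in a few lines of Python. The function name `fences` is hypothetical, and only the values quoted in the example are used, since the full list of 11 share values is not reproduced here.

```python
def fences(q1, q3, factor=1.5):
    """Lower and upper fences: Q1 - factor*IQR and Q3 + factor*IQR."""
    iqr = q3 - q1
    return q1 - factor * iqr, q3 + factor * iqr


# Quartiles from the share-value example.
lower, upper = fences(4.3, 9.0)
print(round(lower, 2), round(upper, 2))   # -2.75 16.05

# Values named in the example: smallest, largest non-outlier, largest.
for value in (0.9, 11.4, 25.5):
    label = "outlier" if value < lower or value > upper else "inside fences"
    print(value, label)
```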

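Interpreting the box plot results

S = 0.9, Q1 = 4.3, Q2 = 5.3, Q3 = 9.0, L = 25.5

[Figure: box plot from 0.9 to 25.5; 25% of the observations lie between S and Q1, 50% between Q1 and Q3, and 25% between Q3 and L]

The distribution is positively skewed.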

5.4 Approximating Descriptive Measures for Grouped Data

• Approximate descriptive measures may be needed when
only secondary, grouped data are available, or when
approximate values are sufficient.

$\bar{x} \approx \frac{\sum_{i=1}^{k} f_i m_i}{n}$

$s^2 \approx \frac{1}{n - 1}\left[\sum_{i=1}^{k} f_i m_i^2 - \frac{\left(\sum_{i=1}^{k} f_i m_i\right)^2}{n}\right]$

where
k = number of classes
m_i = midpoint of class i
f_i = frequency of class i (the number of measurements in class i)
n = f1 + f2 + … + fk

Example 5.15, page 169
Approximate the mean and standard deviation of the
telephone call durations represented by the
frequency distribution.

Class i   Class limits   Frequency f_i   Midpoint m_i   f_i m_i   f_i m_i^2
1         2–5            3               3.5            10.5      36.75
2         5–8            6               6.5            39.0      253.5
3         8–11           8               9.5            76.0      722.0
–         –              –               –              –         –
6         17–20          2               18.5           37.0      684.5
Total                    n = 30                         312.0     3,751.5

$\bar{x} \approx \frac{\sum_{i=1}^{6} f_i m_i}{30} = \frac{312.0}{30} = 10.4$

$s^2 \approx \frac{1}{n - 1}\left[\sum_{i=1}^{k} f_i m_i^2 - \frac{\left(\sum_{i=1}^{k} f_i m_i\right)^2}{n}\right] = \frac{1}{29}\left[3{,}751.5 - \frac{312^2}{30}\right] = 17.47$

Real values: $\bar{x} = 10.26$ and $s^2 = 18.40$

[Figure: histogram of the call durations with class limits 2, 5, 8, 11, 14, 17, 20 and class midpoints (3.5, 6.5, …) marked]
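
The grouped-data approximation can be sketched in Python from the column totals of the table. The function name `grouped_mean_and_variance` is hypothetical. Only the rows printed in full are used to check the f_i m_i and f_i m_i² columns; the elided rows are not reconstructed.

```python
def grouped_mean_and_variance(n, sum_fm, sum_fm2):
    """Approximate sample mean and variance from grouped-data totals:
    n = sum of f_i, sum_fm = sum of f_i*m_i, sum_fm2 = sum of f_i*m_i**2."""
    mean = sum_fm / n
    variance = (sum_fm2 - sum_fm ** 2 / n) / (n - 1)
    return mean, variance


# Rows of the table that are shown in full: (frequency f_i, midpoint m_i).
visible_rows = [(3, 3.5), (6, 6.5), (8, 9.5), (2, 18.5)]
for f, m in visible_rows:
    print(f, m, f * m, f * m ** 2)      # reproduces the f_i*m_i and f_i*m_i^2 columns

# Column totals as given in the table (all six classes).
mean, variance = grouped_mean_and_variance(n=30, sum_fm=312.0, sum_fm2=3751.5)
print(round(mean, 2), round(variance, 2), round(variance ** 0.5, 2))   # 10.4 17.47 4.18
```

The approximate standard deviation is the square root of the approximate variance, about 4.18, compared with the square root of the real variance 18.40, about 4.29.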

Summary: page 189

Home assignment:

- Section 5.1 Exercises pages 139-140: 5.6, 5.10

- Section 5.2 Exercises pages 155-157: 5.29, 5.40

- Section 5.3 Exercises page 167: 5.63

- Section 5.4 Exercises page 170: 5.74
