3 Numerical Descriptive Measures

Uploaded by

sarikajayaswal


Numerical Descriptive Measures
Summary Definitions

• The central tendency (location) is the extent to which the values of a numerical variable group around a typical or central value. It is the central value around which data tend to cluster.

• The variation is the amount of dispersion or scattering away from a central value that the values of a numerical variable show. It measures the spread of the data.

• The shape is the pattern of the distribution of values from the lowest value to the highest value.
Numerical Descriptive Techniques
Measure of Central Tendency: Mean

Mean (average)
• The sum of all the data entries divided by the number of entries.
• Sigma notation: $\sum x$ = add all of the data entries ($x$) in the data set.
• Population mean: $\mu = \dfrac{\sum x}{N}$
• Sample mean: $\bar{x} = \dfrac{\sum x}{n}$
Population mean μ

• The population mean is the sum of the values in the population divided by the population size, N.

$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \cdots + X_N}{N}$$

Where μ = population mean
N = population size
$X_i$ = ith value of the variable X
Sample mean

• The arithmetic mean (often just called the “mean”) is the most common measure of central tendency.

• For a sample of size n:

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$

Where $\bar{X}$ (pronounced x-bar) = sample mean
n = sample size
$X_i$ = ith observed value of the variable X

Measures of Central Tendency : Mean

Advantages of using mean:

• easy to calculate,
• provides good description for data on height, grades etc.
Disadvantages of using mean:
• is sensitive to extreme values.

[Figure: two dot plots on an axis from 11 to 20; the mean is 13 in the first plot and 14 in the second.]
Measures of Central Tendency : Median
• The median is the value that divides the data into two parts: 50% of the observations have values less than the median and 50% of the observations have values greater than the median.

• The median is calculated by placing all the observations in order; the observation that falls in the middle is the median.

• The location of the median when the values are in numerical order (smallest to largest):

$$\text{Median position} = \frac{n+1}{2} \text{ position in the ordered data}$$

• If the number of values is odd, the median is the middle number.

• If the number of values is even, the median is the average of the two middle numbers.

• Note that (n + 1)/2 is not the value of the median, only the position of the median in the ranked data.
Measures of Central Tendency: Median

• In an ordered array, the median is the “middle” number (50% above, 50% below).

[Figure: two dot plots on an axis from 11 to 20; the median is 13 in both plots.]

• Less sensitive than the mean to extreme values.
• There are as many values above the median as below it in the data array.
• The sample and population medians are computed in the same way.
EXAMPLES - Mean and Median

A sample of 10 adults was asked to report the number of hours they spent on the internet the previous month. The results are listed here. Calculate the sample mean and median.

0 7 12 5 33 14 8 0 9 22

The sample mean is the sum of the observations divided by the sample size: 110/10 = 11.0.

In ascending order the observations are 0 0 5 7 8 9 12 14 22 33. The median is the average of the fifth and sixth observations (the middle two), which are 8 and 9, respectively. Thus, the median is 8.5.
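The internet-hours example can be checked with a short Python sketch. It follows the slide definitions directly (sum divided by count, and the (n + 1)/2 position rule); the function names `mean` and `median` and the variable `hours` are illustrative choices, not part of the original slides.

```python
def mean(values):
    """Sum of the entries divided by the number of entries."""
    return sum(values) / len(values)


def median(values):
    """Middle value of the ordered data, located with the (n + 1) / 2 rule."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2              # 1-based position; ends in .5 when n is even
    lower = ordered[int(position) - 1]
    if position.is_integer():           # odd n: a single middle value
        return lower
    upper = ordered[int(position)]      # even n: average the two middle values
    return (lower + upper) / 2


hours = [0, 7, 12, 5, 33, 14, 8, 0, 9, 22]
print(mean(hours))    # 11.0
print(median(hours))  # 8.5
```

With n = 10 the position is 5.5, so the function averages the fifth and sixth ordered values, which matches the hand calculation.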
Measures of Central Tendency : The Mode

• Value that occurs most often.
• Not affected by extreme values.
• Used mainly for nominal data.
• There may be no mode.
• There may be several modes (a data set with two modes is bimodal).
• The sample and population modes are computed in the same way.

[Figure: two dot plots; on an axis from 0 to 14 the mode is 9, and on an axis from 0 to 6 there is no mode.]
Copyright © 2017 Pearson Education, Ltd.
Who wins between Mean, Median, and Mode?

Out of the three measures to choose from, which one should we use?

• The mean is generally our first selection. However, there are several circumstances when the median is better.
• The mode is seldom the best measure of central location.
• One advantage the median holds is that it is not as sensitive to extreme values as the mean is.

Find the mode for the data in Internet Example

0 7 12 5 33 14 8 0 9 22

All observations except 0 occur once. There are two 0s. Thus, the
mode is 0. As you can see, this is a poor measure of central location. It
is nowhere near the center of the data. Compare this with the mean
11.0 and median 8.5 and we can see that mean and median are
superior measures.
Activity
The prices (in dollars) for a sample of round-trip flights from Chicago, Illinois to Cancun, Mexico are listed. What are the mean, median, and mode of the flight prices?

1872 432 397 427 388 482 397 358 432

Which measure of central tendency best describes these data?

Ordered data: 358, 388, 397, 397, 427, 432, 432, 482, 1872

Mean = 5185/9 = 576.11

Median = value in the 5th position = 427

Mode = 397 and 432 (each occurs twice)

Because of the extreme value (1872), the median is the most appropriate measure.
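As a cross-check of the activity, the same three measures can be computed with Python's standard `statistics` module. This is only a sketch: the variable names are illustrative, and the final step (dropping the 1872 fare) is an added comparison, not part of the original activity.

```python
import statistics

prices = [1872, 432, 397, 427, 388, 482, 397, 358, 432]

mean_price = statistics.mean(prices)       # 5185 / 9
median_price = statistics.median(prices)   # 5th of the 9 ordered values
modes = statistics.multimode(prices)       # every value tied for most frequent

print(round(mean_price, 2))   # 576.11
print(median_price)           # 427
print(sorted(modes))          # [397, 432]

# Added comparison: removing the extreme fare moves the mean far more than the median.
without_extreme = [p for p in prices if p != 1872]
print(statistics.mean(without_extreme))     # 414.125
print(statistics.median(without_extreme))   # 412.0
```

The mean falls by about 162 dollars when the single extreme fare is removed, while the median moves by only 15, which is the reason the median is preferred here.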

Summary
• Compute the mean to describe the central location of a single set of interval data.
• Compute the median to describe the central location of a single set of interval or ordinal data (with extreme observations).
• Compute the mode to describe a single set of nominal or interval data.
Dispersion and Variation

Why Study Dispersion?

– A measure of location, such as the mean or the median, only describes the center of the data. It is valuable from that standpoint, but it does not tell us anything about the spread of the data.
– For example, if your nature guide told you that the river ahead averaged 3 feet in depth, would you want to wade across on foot without additional information? Probably not. You would want to know something about the variation in the depth.
– A second reason for studying the dispersion in a set of data is to compare the spread in two or more distributions.
Measures of Variation

Variation: Range, Variance, Standard Deviation, Coefficient of Variation

• Measures of variation give information on the spread or variability of the data values, which measures of location fail to convey.

[Figure: two distributions with the same centre but different variation.]
Measures of Variation: The Range

• Simplest measure of variation.
• Difference between the largest and the smallest values:

$$\text{Range} = X_{\text{largest}} - X_{\text{smallest}}$$

Example:

[Figure: dot plot on an axis from 0 to 14.]

Range = 14 - 2 = 12

• Potential problem with the range? Consider the following example on grades.
• Grades of course 1: {4, 4, 4, 4, 50}.
• Grades of course 2: {4, 8, 15, 24, 39, 50}.
• Range = 46 in both courses, but the two courses have very different distributions.

• Its major advantage is the ease with which it can be computed.
• Its major shortcoming is its failure to provide information on the dispersion of the observations between the two end points.
• Hence we need a measure of variability that incorporates all the data and not just two observations.
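The grades example can be reproduced in a few lines of Python. The helper name `value_range` is an illustrative choice; the two lists are the course grades given above.

```python
def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)


course_1 = [4, 4, 4, 4, 50]
course_2 = [4, 8, 15, 24, 39, 50]

print(value_range(course_1))  # 46
print(value_range(course_2))  # 46
```

Both calls return 46 because only the two end points enter the calculation; everything in between is ignored.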
Deviation, Variance, and Standard Deviation

Deviation
• The difference between a data entry, x, and the mean of the data set.
• It gives a rough estimate of the typical distance of a data value from the mean.
• Population data set: Deviation of $x = x - \mu$
• Sample data set: Deviation of $x = x - \bar{x}$
Numerical Descriptive Measures for a Population: Variance σ²

• Average of squared deviations of values from the mean.

Population variance:

$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$

Where
μ = population mean, N = population size
$X_i$ = ith value of the variable X
Numerical Descriptive Measures for a Population: Standard Deviation σ
• Most commonly used measure of variation.
• Shows average variation about the mean.
• Is the square root of the population variance.
• Has the same units as the original data.

Population standard deviation:

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$
Measures of Variation: Sample Variance
• Average (approximately) of squared deviations of values from the mean.

Sample variance:

$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$

Where
$\bar{X}$ = arithmetic mean
n = sample size
$X_i$ = ith value of the variable X
Measures of Variation: Sample Standard Deviation
• Most commonly used measure of variation.
• Shows average variation about the mean.
• Is the square root of the sample variance.
• Has the same units as the original data.

Sample standard deviation:

$$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
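The difference between the population formulas (divide by N) and the sample formulas (divide by n - 1) can be seen in a minimal Python sketch. The function names are illustrative, and the data are the internet-hours sample from the earlier example; treating the same ten values as a whole population is an assumption made purely for comparison.

```python
import math
import statistics


def population_variance(values):
    """Average squared deviation from the mean (divide by N)."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)


hours = [0, 7, 12, 5, 33, 14, 8, 0, 9, 22]

sigma_squared = population_variance(hours)   # 922 / 10
s_squared = sample_variance(hours)           # 922 / 9
sigma = math.sqrt(sigma_squared)             # population standard deviation
s = math.sqrt(s_squared)                     # sample standard deviation

# Cross-check against the standard library implementations.
assert math.isclose(sigma_squared, statistics.pvariance(hours))
assert math.isclose(s_squared, statistics.variance(hours))
assert math.isclose(s, statistics.stdev(hours))

print(sigma_squared, s_squared)
print(sigma, s)
```

The sum of squared deviations is 922 in both cases; only the divisor changes, so the sample values are always slightly larger than the population values for the same data.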
Interpreting Standard Deviation
• Standard deviation is a measure of the typical amount an entry deviates from the mean.
• The more the entries are spread out, the greater the standard deviation.
Measures of Variation: Comparing Standard Deviations

[Figure: two distributions, one with a smaller standard deviation and one with a larger standard deviation.]

Measure of Variability: Standard Deviation - Interpretation
• Suppose that the mean and standard deviation of last year’s midterm test marks are 70 and 5, respectively.

• What can you say about the distribution of grades if the histogram is bell-shaped?
• By the Empirical Rule, approximately 68% of the marks fell between 65 and 75, approximately 95% of the marks fell between 60 and 80, and approximately 99.7% of the marks fell between 55 and 85.

• What can you say about the distribution of grades if the shape of the histogram is not known?
• By Chebyshev's theorem, at least 1 - 1/k² of the observations lie within k standard deviations of the mean (for k > 1). So at least 75% of the marks fell between 60 and 80 (k = 2), and at least 88.9% of the marks fell between 55 and 85 (k = 3).
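The midterm-marks intervals can be reproduced with a small Python sketch. It assumes the standard statements of the two rules: roughly 68%, 95% and 99.7% within 1, 2 and 3 standard deviations for bell-shaped data, and at least 1 - 1/k² within k standard deviations for any shape. The function and variable names are illustrative.

```python
def interval(mean, std_dev, k):
    """Interval within k standard deviations of the mean."""
    return (mean - k * std_dev, mean + k * std_dev)


def chebyshev_lower_bound(k):
    """Minimum proportion of observations within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2


# Approximate proportions for bell-shaped data.
EMPIRICAL_RULE = {1: 0.68, 2: 0.95, 3: 0.997}

mean_mark, std_mark = 70, 5

for k in (1, 2, 3):
    low, high = interval(mean_mark, std_mark, k)
    print(f"{low} to {high}: about {EMPIRICAL_RULE[k]:.1%} if bell-shaped")

for k in (2, 3):
    low, high = interval(mean_mark, std_mark, k)
    print(f"{low} to {high}: at least {chebyshev_lower_bound(k):.1%} for any shape")
```

The Chebyshev bounds are weaker than the bell-shaped percentages because they have to hold for every possible distribution.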
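The Coefficient of Variation (CV)
• Measures relative variation.
• Always expressed as a percentage (%).
• Shows variation relative to the mean.
• Is the standard deviation divided by the mean, multiplied by 100%.
Comparing Coefficients of Variation

• Stock A:
• Average price last year = $50
• Standard deviation = $5

$$CV_A = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \frac{\$5}{\$50} \cdot 100\% = 10\%$$

• Stock B:
• Average price last year = $100
• Standard deviation = $5

$$CV_B = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \frac{\$5}{\$100} \cdot 100\% = 5\%$$

• Both stocks have the same standard deviation, but stock B is less variable relative to its average price.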
Measures of Variation: Summary Characteristics
• The more the data are spread out, the greater the range, variance, and standard deviation.
• The more the data are concentrated, the smaller the range, variance, and standard deviation.
• If the values are all the same (no variation), all these measures will be zero.
• None of these measures are ever negative.
• Measures of variability can be used for interval data and, through the interquartile range (IQR), for ordinal data.
Measure of Relative Standing

• Measures of relative standing are designed to provide information about the position of particular values relative to the entire data set.
• Percentile: the Pth percentile is the value for which P% of the observations are less than that value and (100 - P)% of the observations are greater than that value.
• Suppose you scored in the 60th percentile on some exam. That means 60% of the other scores were below yours, while 40% of the scores were above yours.
Quartile Measures
• Quartiles measure the spread of values above and below the median by dividing the distribution into four groups.
• Three quartile points divide the data into four groups:
• First quartile, Q1: About one quarter of the data fall on or below Q1.
• Second quartile, Q2: About one half of the data fall on or below Q2 (the median).
• Third quartile, Q3: About three quarters of the data fall on or below Q3.

25% | Q1 | 25% | Q2 | 25% | Q3 | 25%

• Quartiles are used to calculate the interquartile range, which is a measure of variability around the median.
Quartile Measures:
Locating Quartiles

Find a quartile by determining the value in the appropriate position

in the ranked data, where:

First quartile position: Q1 = (n+1)/4 ranked value.

Second quartile position: Q2 = (n+1)/2 ranked value or Median.

Third quartile position: Q3 = 3(n+1)/4 ranked value.

where n is the number of observed values.

The numbers of nuclear power plants in the top 15 nuclear power-producing countries in the world are listed. Find the first, second, and third quartiles of the data set.

7 18 11 6 59 17 18 54 104 20 31 8 10 15 19

Solution:
• Rank the data from smallest to largest. Q2 divides the data set into a lower half and an upper half.

6 7 8 10 11 15 17 18 18 19 20 31 54 59 104

• With n = 15, the first quartile is in position 16/4 = 4, so Q1 = 10; the second quartile is in position (16 × 2)/4 = 8, so Q2 = 18; and the third quartile is in position (16 × 3)/4 = 12, so Q3 = 31.

• Q1 tells us that about 25% of the countries have 10 or fewer nuclear plants; Q2 tells us that about 50% have 18 or fewer; and Q3 reveals that about 75% have 31 or fewer.
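The position rule for quartiles can be sketched in Python as follows. The slides only show a case where all three positions are whole numbers; linear interpolation between neighbouring values for fractional positions is an assumption made here (textbooks differ on this convention), and the function names are illustrative.

```python
def value_at_position(ordered, position):
    """Value at a 1-based position in ranked data.

    Fractional positions are interpolated linearly between the two
    neighbouring values (an assumed convention, not from the slides).
    """
    lower_index = int(position) - 1
    fraction = position - int(position)
    if fraction == 0:
        return ordered[lower_index]
    step = ordered[lower_index + 1] - ordered[lower_index]
    return ordered[lower_index] + fraction * step


def quartiles(values):
    """Q1, Q2, Q3 at positions (n+1)/4, (n+1)/2 and 3(n+1)/4."""
    if len(values) < 3:
        raise ValueError("the position rule needs at least 3 observations")
    ordered = sorted(values)
    n = len(ordered)
    return tuple(value_at_position(ordered, k * (n + 1) / 4) for k in (1, 2, 3))


plants = [7, 18, 11, 6, 59, 17, 18, 54, 104, 20, 31, 8, 10, 15, 19]
q1, q2, q3 = quartiles(plants)
print(q1, q2, q3)   # 10 18 31
print(q3 - q1)      # 21
```

For this data set the positions are 4, 8 and 12, so no interpolation is needed and the results match the hand calculation. The last line prints the difference between the third and first quartiles.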
Measure of Relative Standing: Commonly used Percentiles
Measure of Relative Standing: Location of Percentiles
Interquartile Range (IQR)

• Measures the range of the middle 50% of the data, showing how spread out the data are.
• The difference between the third and first quartiles.
• IQR = Q3 – Q1
• Large values of this statistic mean that the 1st and 3rd quartiles are far apart, indicating a high level of variability.

Find the interquartile range of the data set. Recall Q1 = 10, Q2 = 18, and Q3 = 31.

Solution:
• IQR = Q3 – Q1 = 31 – 10 = 21

The numbers of power plants in the middle portion of the data set vary by at most 21.
Describing Relationship between Two Variables

• One graphical technique we use to show the relationship between two variables is called a scatter diagram.
• To draw a scatter diagram we need two variables. We scale one variable along the horizontal axis (X-axis) of a graph and the other variable along the vertical axis (Y-axis).
Describing Relationship between Two
Variables – Scatter Diagram Examples
We Discuss Two Measures Of The Relationship Between Two Numerical Variables

• Scatter plots allow you to visually examine the relationship between two numerical variables.
• Now we will discuss two quantitative measures of such relationships:
• The Covariance
• The Coefficient of Correlation
The Covariance
• The covariance measures the strength of the linear relationship between two numerical variables (X and Y).
Numerical Illustration
Interpreting Covariance

• Covariance between two variables: when there is no particular pattern, the covariance is a small number.

• The covariance has a major flaw:
• It is not possible to determine the relative strength of the relationship from the size of the covariance.
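Coefficient of Correlation

• Measures the relative strength of the linear relationship between two numerical variables.
• Sample coefficient of correlation:

$$r = \frac{\operatorname{cov}(X, Y)}{S_X S_Y}$$

where

$$\operatorname{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}, \qquad S_X = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}, \qquad S_Y = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n - 1}}$$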
Features of the Coefficient of Correlation

• The population coefficient of correlation is referred to as ρ.
• The sample coefficient of correlation is referred to as r.
• Both ρ and r have the following features:
• Unit free
• Range between –1 and 1
• The closer to –1, the stronger the negative linear relationship
• The closer to 1, the stronger the positive linear relationship
• The closer to 0, the weaker the linear relationship (or no linear relationship)
Scatter Plots of Sample Data with
Various Coefficients of Correlation
Coefficient of Correlation

Because we’ve already calculated the covariances we need to compute only the standard deviations of X
and Y.
For Set 1: Strong positive linear relationship
For Set 2: Strong negative linear relationship
For Set 3: Weak negative linear relationship
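The sample covariance and coefficient of correlation can be sketched in Python directly from their definitions. The data for Sets 1 to 3 are not reproduced in the text, so the values below are hypothetical and made up for illustration only; the function names are likewise illustrative.

```python
import math


def sample_covariance(xs, ys):
    """Sum of cross-products of deviations, divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


def sample_std(values):
    """Sample standard deviation (n - 1 in the denominator)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


def correlation(xs, ys):
    """Sample coefficient of correlation r = cov(X, Y) / (S_X * S_Y)."""
    return sample_covariance(xs, ys) / (sample_std(xs) * sample_std(ys))


# Hypothetical data, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

cov_xy = sample_covariance(x, y)
r_xy = correlation(x, y)
print(cov_xy, round(r_xy, 4))

# Changing the units of x (multiplying by 100) rescales the covariance
# but leaves the coefficient of correlation unchanged.
x_rescaled = [100 * v for v in x]
cov_rescaled = sample_covariance(x_rescaled, y)
r_rescaled = correlation(x_rescaled, y)
print(cov_rescaled, round(r_rescaled, 4))
```

The rescaling step illustrates why the size of the covariance alone says little about the strength of a relationship, while r is unit free and always lies between -1 and 1.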
