Chapter 3: Statistics

This document summarizes numerical measures used in descriptive statistics: measures of location, variability, distribution shape, and association. Measures of location include the mean, weighted mean, geometric mean, median, trimmed mean, mode, percentiles, and quartiles. Measures of variability include the range, interquartile range, variance, standard deviation, and coefficient of variation. Distribution shape and relative location are described using skewness, z-scores, Chebyshev's theorem, and the empirical rule. The notes also cover five-number summaries, box plots, covariance, correlation, and data dashboards.

Uploaded by

Sophia Athena

CHAPTER 3
DESCRIPTIVE STATISTICS: NUMERICAL MEASURES

Numerical Measures
• If the measures are computed for data from a sample, they are called sample statistics.
• If the measures are computed for data from a population, they are called population parameters.
• A sample statistic is referred to as the point estimator of the corresponding population parameter.

Measures of Location
• Mean
  o The mean provides a measure of central location.
  o The mean of a data set is the average of all the data values.
  o The sample mean $\bar{x}$ is the point estimator of the population mean $\mu$.
  o Sample Mean: $\bar{x} = \frac{\sum x_i}{n}$
    where: $\sum x_i$ = sum of the values of the n observations
           n = number of observations in the sample
  o Population Mean: $\mu = \frac{\sum x_i}{N}$
    where: $\sum x_i$ = sum of the values of the N observations
           N = number of observations in the population
• Median
  o The median of a data set is the value in the middle when the data items are arranged in ascending order.
  o Whenever a data set has extreme values, the median is the preferred measure of central location.
  o The median is the measure of location most often reported for annual income and property value data.
  o A few extremely large incomes or property values can inflate the mean.
  o For an odd number of observations, arrange the data in ascending order; the middle value is the median.
  o For an even number of observations, arrange the data in ascending order; the average of the middle two values is the median.
• Trimmed Mean
  o Another measure sometimes used when extreme values are present.
  o It is obtained by deleting a percentage of the smallest and largest values from a data set and then computing the mean of the remaining values.
  o For example, the 5% trimmed mean is obtained by removing the smallest 5% and the largest 5% of the data values and then computing the mean of the remaining values.
• Mode
  o The mode of a data set is the value that occurs with greatest frequency.
  o The greatest frequency can occur at two or more different values.
  o If the data have exactly two modes, the data are bimodal.
  o If the data have more than two modes, the data are multimodal.
• Weighted Mean
  o In some instances the mean is computed by giving each observation a weight that reflects its relative importance.
  o The choice of weights depends on the application.
  o The weights might be the number of credit hours earned for each grade, as in GPA.
  o In other weighted mean computations, quantities such as pounds, dollars, or volume are frequently used.
  o $\bar{x} = \frac{\sum w_i x_i}{\sum w_i}$
    where: $x_i$ = value of observation i
           $w_i$ = weight for observation i
    Numerator: sum of the weighted data values
    Denominator: sum of the weights
• Geometric Mean
  o The geometric mean is calculated by finding the nth root of the product of n values.
  o It is often used in analyzing growth rates in financial data (where using the arithmetic mean will provide misleading results).
  o It should be applied anytime you want to determine the mean rate of change over several successive periods (be it years, quarters, weeks, . . .).
  o Other common applications include: changes in populations of species, crop yields, pollution levels, and birth and death rates.
  o $\bar{x}_g = \sqrt[n]{(x_1)(x_2)\cdots(x_n)} = [(x_1)(x_2)\cdots(x_n)]^{1/n}$
• Percentiles
  o A percentile provides information about how the data are spread over the interval from the smallest value to the largest value.
  o Admission test scores for colleges and universities are frequently reported in terms of percentiles.
  o The pth percentile of a data set is a value such that at least p percent of the items take on this value or less and at least (100 - p) percent of the items take on this value or more.
  o Arrange the data in ascending order.
  o Compute $L_p$, the location of the pth percentile: $L_p = \frac{p}{100}(n + 1)$
• Quartiles
  o Quartiles are specific percentiles.
  o First Quartile = 25th Percentile
  o Second Quartile = 50th Percentile = Median
  o Third Quartile = 75th Percentile
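
The measures of location in this section can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names and the data are invented for the example, the trimmed mean rounds the number of dropped values down to a whole count, and a fractional percentile location $L_p$ is resolved by linear interpolation between neighbouring values, a convention the notes themselves do not spell out.

```python
import math
import statistics


def weighted_mean(values, weights):
    """Sum of the weighted data values divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


def geometric_mean(values):
    """nth root of the product of n positive values."""
    return math.prod(values) ** (1 / len(values))


def trimmed_mean(values, percent):
    """Mean after dropping `percent`% of the values from each end.

    Assumptions: percent < 50, and the count dropped per end is rounded down.
    """
    data = sorted(values)
    k = int(len(data) * percent / 100)
    return statistics.mean(data[k:len(data) - k])


def percentile(values, p):
    """Value at location L_p = (p/100)(n + 1) in the sorted data.

    Assumption: a fractional location is interpolated linearly, and a
    location outside 1..n is clamped to the smallest or largest value.
    """
    data = sorted(values)
    n = len(data)
    location = (p / 100) * (n + 1)  # 1-based position in the sorted data
    lower = math.floor(location)
    fraction = location - lower
    if lower < 1:
        return data[0]
    if lower >= n:
        return data[-1]
    return data[lower - 1] + fraction * (data[lower] - data[lower - 1])


# Hypothetical data, invented for this illustration.
grade_points = [4.0, 3.0, 2.0]
credit_hours = [3, 4, 2]  # weights, as in a GPA computation
growth_factors = [1.10, 0.95, 1.20]
scores = [12, 15, 15, 18, 21, 24, 27, 30, 33, 90]

print("mean:", statistics.mean(scores))
print("median:", statistics.median(scores))
print("mode(s):", statistics.multimode(scores))
print("10% trimmed mean:", trimmed_mean(scores, 10))
print("weighted mean (GPA):", weighted_mean(grade_points, credit_hours))
print("geometric mean growth factor:", geometric_mean(growth_factors))
print("quartiles:", [percentile(scores, p) for p in (25, 50, 75)])
```

Because of the single large value (90), the mean of `scores` lies well above the median, while the trimmed mean stays close to the median; this is the sensitivity to extreme values that the notes describe.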

Measures of Variability
• It is often desirable to consider measures of variability (dispersion), as well as measures of location.
• For example, in choosing supplier A or supplier B we might consider not only the average delivery time for each, but also the variability in delivery time for each.
• Range
  o The range of a data set is the difference between the largest and smallest data values.
  o Range = Largest value - Smallest value
  o It is the simplest measure of variability.
  o It is very sensitive to the smallest and largest data values.
• Interquartile Range
  o The interquartile range of a data set is the difference between the third quartile and the first quartile.
  o It is the range for the middle 50% of the data.
  o It overcomes the sensitivity to extreme data values.
• Variance
  o The variance is a measure of variability that utilizes all the data.
  o It is based on the difference between the value of each observation ($x_i$) and the mean ($\bar{x}$ for a sample, $\mu$ for a population).
  o The variance is useful in comparing the variability of two or more variables.
  o The variance is the average of the squared differences between each data value and the mean.
  o Sample Variance: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$
  o Population Variance: $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$
• Standard Deviation
  o The standard deviation of a data set is the positive square root of the variance.
  o It is measured in the same units as the data, making it more easily interpreted than the variance.
  o Sample SD: $s = \sqrt{s^2}$
  o Population SD: $\sigma = \sqrt{\sigma^2}$
• Coefficient of Variation
  o The coefficient of variation indicates how large the standard deviation is in relation to the mean.
  o Sample CoV: $\left[\frac{s}{\bar{x}} \times 100\right]\%$
  o Population CoV: $\left[\frac{\sigma}{\mu} \times 100\right]\%$

Measures of Distribution Shape, Relative Location, and Detecting Outliers
• Distribution Shape
  o Skewness
    ▪ An important measure of the shape of a distribution is called skewness.
    ▪ The formula for the skewness of sample data is
      $\text{Skewness} = \frac{n}{(n-1)(n-2)} \sum \left[\frac{x_i - \bar{x}}{s}\right]^3$
    ▪ Skewness can be easily computed using statistical software. (Chapter 2)
• z-Scores
  o The z-score is often called the standardized value.
  o It denotes the number of standard deviations a data value $x_i$ is from the mean.
  o $z_i = \frac{x_i - \bar{x}}{s}$
  o Excel's STANDARDIZE function can be used to compute the z-score.
  o An observation's z-score is a measure of the relative location of the observation in a data set.
  o A data value less than the sample mean will have a z-score less than zero.
  o A data value greater than the sample mean will have a z-score greater than zero.
  o A data value equal to the sample mean will have a z-score of zero.
• Chebyshev's Theorem
  o At least $(1 - 1/z^2)$ of the items in any data set will be within z standard deviations of the mean, where z is any value greater than 1.
  o Chebyshev's theorem requires z > 1, but z need not be an integer.
  o At least 75% of the data values must be within z = 2 standard deviations of the mean.
  o At least 89% of the data values must be within z = 3 standard deviations of the mean.
  o At least 94% of the data values must be within z = 4 standard deviations of the mean.
• Empirical Rule
  o When the data are believed to approximate a bell-shaped distribution:
    ▪ The empirical rule can be used to determine the percentage of data values that must be within a specified number of standard deviations of the mean.
    ▪ The empirical rule is based on the normal distribution, which is covered in Chapter 6.
  o For data having a bell-shaped distribution:
    ▪ Approximately 68% of the data values will be within +/- 1 standard deviation of the mean.
    ▪ Approximately 95% of the data values will be within +/- 2 standard deviations of the mean.
    ▪ Almost all of the data values will be within +/- 3 standard deviations of the mean.
• Detecting Outliers
  o An outlier is an unusually small or unusually large value in a data set.
  o A data value with a z-score less than -3 or greater than +3 might be considered an outlier.
  o It might be:
    ▪ an incorrectly recorded data value
    ▪ a data value that was incorrectly included in the data set
    ▪ a correctly recorded unusual data value that belongs in the data set

Five-Number Summaries and Box Plots
• Summary statistics and easy-to-draw graphs can be used to quickly summarize large quantities of data.
• Five-Number Summary
  o Smallest Value
  o First Quartile
  o Median
  o Third Quartile
  o Largest Value
• Box Plot
  o A box plot is a graphical summary of data that is based on a five-number summary.
  o A key to the development of a box plot is the computation of the median and the quartiles Q1 and Q3.
  o Box plots provide another way to identify outliers.
  o Limits are located (not drawn) using the interquartile range (IQR).
  o Data outside these limits are considered outliers.
  o The location of each outlier is marked with a symbol on the plot.

Measures of Association between Two Variables
• Often a manager or decision maker is interested in the relationship between two variables.
• Covariance
  o The covariance is a measure of the linear association between two variables.
  o Positive values indicate a positive relationship.
  o Negative values indicate a negative relationship.
  o Sample Covariance: $s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$
  o Population Covariance: $\sigma_{xy} = \frac{\sum (x_i - \mu_x)(y_i - \mu_y)}{N}$
• Correlation Coefficient
  o Correlation is a measure of linear association and not necessarily causation.
  o Just because two variables are highly correlated, it does not mean that one variable is the cause of the other.
  o Sample CC: $r_{xy} = \frac{s_{xy}}{s_x s_y}$
  o Population CC: $\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$
  o The coefficient can take on values between -1 and +1.
  o Values near -1 indicate a strong negative linear relationship.
  o Values near +1 indicate a strong positive linear relationship.
  o The closer the correlation is to zero, the weaker the relationship.

Data Dashboards: Adding Numerical Measures to Improve Effectiveness
• Data dashboards are not limited to graphical displays.
• The addition of numerical measures, such as the mean and standard deviation of KPIs, to a data dashboard is often critical.
• Dashboards are often interactive.
• Drilling down refers to functionality in interactive dashboards that allows the user to access information and analyses at an increasingly detailed level.
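
As a rough sketch of how the numerical measures in these notes might be computed for a pair of variables (for example, two KPIs), the Python below implements the sample variance, standard deviation, coefficient of variation, z-scores, Chebyshev's lower bound, sample covariance, and sample correlation coefficient directly from their formulas. The function names and the data are hypothetical, chosen only for illustration.

```python
import math


def sample_mean(xs):
    return sum(xs) / len(xs)


def sample_variance(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    x_bar = sample_mean(xs)
    return sum((x - x_bar) ** 2 for x in xs) / (len(xs) - 1)


def sample_sd(xs):
    """Positive square root of the sample variance."""
    return math.sqrt(sample_variance(xs))


def coefficient_of_variation(xs):
    """Standard deviation expressed as a percentage of the mean."""
    return sample_sd(xs) / sample_mean(xs) * 100


def z_scores(xs):
    """Number of standard deviations each value lies from the mean."""
    x_bar, s = sample_mean(xs), sample_sd(xs)
    return [(x - x_bar) / s for x in xs]


def chebyshev_lower_bound(z):
    """Minimum proportion of values within z standard deviations (z > 1)."""
    if z <= 1:
        raise ValueError("Chebyshev's theorem requires z > 1")
    return 1 - 1 / z ** 2


def sample_covariance(xs, ys):
    x_bar, y_bar = sample_mean(xs), sample_mean(ys)
    products = ((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sum(products) / (len(xs) - 1)


def sample_correlation(xs, ys):
    """Sample covariance divided by the product of the standard deviations."""
    return sample_covariance(xs, ys) / (sample_sd(xs) * sample_sd(ys))


# Hypothetical paired observations, invented for this illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

print("sample variance of x:", sample_variance(x))
print("sample SD of x:", sample_sd(x))
print("CoV of x (%):", coefficient_of_variation(x))
print("z-scores of x:", z_scores(x))
print("possible outliers in x:", [v for v, z in zip(x, z_scores(x)) if abs(z) > 3])
print("Chebyshev bound for z = 2:", chebyshev_lower_bound(2))
print("sample covariance:", sample_covariance(x, y))
print("sample correlation:", sample_correlation(x, y))
```

The covariance here is positive and the correlation lies between 0 and +1, which indicates a positive linear relationship; as the notes stress, that says nothing about whether one variable causes the other.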
