DescribingDataNumerically Lesson

This document provides an overview of describing data numerically through calculating measures of central tendency (mean, median, mode) and measures of spread (range, variance, standard deviation). It discusses key concepts like sample vs population, descriptive vs inferential statistics, and skewed data. Examples are provided to demonstrate calculating the mean, median, mode, and range for sample battery lifetime data. The document is intended to teach students how to numerically describe and analyze sample data.

Uploaded by

rgererg


LESSON: DESCRIBING DATA NUMERICALLY

This lesson includes an overview of the subject, instructor notes, and example exercises using
Minitab.

Describing Data Numerically

Lesson Overview

Statistics is the discipline concerned with the optimal acquisition (where garbage in equals
garbage out) and analysis of data in order to model a population or process.

We can begin to analyze a data set by describing it both numerically and graphically. This lesson
considers important numerical summaries of data. In this lesson, we will use sample data taken
from a large population, and we are only considering quantitative (numeric) data, not qualitative
(categorical) data. For the data sets of interest, we will select only one variable of interest; that is,
we will be working with univariate data, not bivariate or multivariate data.

Prerequisites

This lesson requires knowledge of basic arithmetic. Symbolic notation will be introduced and
used to simplify the formulas for the computation of numerical measurements. In Minitab,
computations will be made on single columns of data.

Learning Targets

This lesson teaches students how to:

 Calculate basic numerical measures of center for a sample set of data, including its
mean, median, and mode
 Determine which measure of center may be more appropriate for a given data set
 Calculate basic numerical measures of spread for a sample set of data, including its
range, variance, and standard deviation

Time Required

It will take the instructor 30-45 minutes in class to introduce the descriptive statistics formulas.
We recommend starting the activity sheet in class so that students can ask the instructor
questions while working on it. The exercises on the activity sheet will take an additional 30-45
minutes, and they can be used as homework or quiz problems.

WWW.MINITAB.COM/ACADEMIC

Materials Required

 Minitab or Minitab Express

 Minitab worksheet of sample data, entitled DescribingDataNumerically_Lesson.mtw
 Internet access (optional example)

Assessment

The activity sheet contains exercises for students to assess their understanding of the learning
targets for this lesson.

Possible Extensions

This lesson provides good introductory examples for students new to statistics. The instructor
may want to do the Sampling lesson first so that students know how data is being selected
from the population. The recommended follow-up lesson is Describing Data Graphically.

References

Tranquilizing Sheep – Reaction Time Online Game:

http://www.freeonlinegames.com/game/sheep-reaction

Instructor Notes with Examples

Sample Data
Since we are calculating numerical values on sample data, below is the definition of a sample.
There is another lesson devoted entirely to sampling.

Definition: A sample is a subset of subjects from the population for which observations are
actually made.

The numerical values that are calculated on a sample are called statistics.


Definition: A sample statistic is a numerical value characterizing the sample (e.g. center,
range, spread, shape). Statistics are typically denoted by “English” (Latin) letters: 𝑥̅ , s, or m.

There are two branches of statistics that are discussed in introductory statistics courses –
descriptive statistics and inferential statistics. Later lessons will be devoted to inferential
statistics.

Definition: Descriptive statistics (also called summary statistics) uses graphical and/or
numerical summaries for describing or summarizing data from a sample.

 The most common descriptive statistics provide information about a sample’s central
tendency (mean, median, mode) and variability (variance, standard deviation, range).
 Some graphical methods for displaying and describing data include: dotplot, stem-and-
leaf plot, histogram, boxplot, and time series plot (time ordered data). Additional lessons
describe these graphs.

Notation: When discussing samples throughout this lesson, we need to have notation for a
generic sample of size n. We’ll use:

x1, x2, ..., xi, ..., xn,

where

x1 denotes the numeric value of the first item in the sample,

x2 denotes the numeric value of the second item in the sample,
⁞
xi denotes the numeric value of the ith item in the sample,
⁞
xn denotes the numeric value of the nth item in the sample.

Sample Mean

Definition: The sample mean, denoted by $\bar{x}$, is the arithmetic average of the n data values
in the sample:

$$\bar{x} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$$


Pictorially, we can think of the sample mean as the fulcrum point that keeps a weightless ruler,
on which each observation is represented by an equal weight, in perfect balance.

Also noted in each picture are the modes (circled) and location of the medians (m). The
definitions of these statistics are contained in the following pages.

Example 1
Ten batteries from brands A, B, and C were tested to determine their lifetimes (in hours).

Here are the lifetimes plotted as comparison dotplots in Minitab:

The sample mean lifetimes of battery brands A, B, and C are:
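The dotplots and the computed means from the handout are not reproduced in this text. As a minimal sketch, the means can be recomputed in Python from the ordered lifetimes listed in Example 2. The names `lifetimes` and `sample_mean` are illustrative only and are not part of the Minitab worksheet.

```python
# Ordered battery lifetimes (hours), copied from Example 2 of this lesson.
lifetimes = {
    "A": [38, 41, 87, 94, 102, 116, 155, 179, 214, 289],
    "B": [22, 22, 32, 39, 64, 65, 99, 142, 191, 317],
    "C": [18, 24, 34, 41, 43, 95, 122, 139, 318, 360],
}


def sample_mean(values):
    """Arithmetic average: the sum of the n data values divided by n."""
    return sum(values) / len(values)


# Hypothetical helper dictionary holding one mean per brand.
means = {brand: sample_mean(values) for brand, values in lifetimes.items()}

for brand, mean in means.items():
    print(f"Brand {brand}: sample mean = {mean:.1f} hours")
```

This gives 131.5 hours for brand A, which agrees with the value used in the variance calculation later in the lesson, and 99.3 and 119.4 hours for brands B and C.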


Sample Median

Definition: The sample median is the middle ordered data value if the sample size n is
odd and the average of the middle two ordered data values if the sample size n is even.

 At least 50% of the data values are less than or equal to the median.

 At least 50% of the data values are greater than or equal to the median.
 The median is less affected by extreme values than the mean is.

Example 2
The sample median lifetimes of batteries from brands A, B, and C are:

 Battery brand A ordered lifetimes: 38, 41, 87, 94, 102, 116, 155, 179, 214, 289. Since there
is an even number of data points, the sample median is (102 + 116)/2 = 109 hours.
 Battery brand B ordered lifetimes: 22, 22, 32, 39, 64, 65, 99, 142, 191, 317. The sample
median is (64 + 65)/2 = 64.5 hours.
 Battery brand C ordered lifetimes: 18, 24, 34, 41, 43, 95, 122, 139, 318, 360. The sample
median is (43 + 95)/2 = 69 hours.
 As an additional example, suppose we have battery brand D with ordered lifetimes: 20,
32, 45, 67, 69, 142, 150. Since there is an odd number of data points, the sample
median is 67, the middle ordered data value.
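The odd/even rule in the definition can be sketched in a few lines of Python. This is an illustration only; the function name `sample_median` and the variable names are assumptions made for the example, and the data are the lifetimes from Example 2.

```python
def sample_median(values):
    """Middle ordered value if n is odd, average of the middle two if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Battery lifetimes (hours) from Example 2.
brand_a = [38, 41, 87, 94, 102, 116, 155, 179, 214, 289]
brand_b = [22, 22, 32, 39, 64, 65, 99, 142, 191, 317]
brand_c = [18, 24, 34, 41, 43, 95, 122, 139, 318, 360]
brand_d = [20, 32, 45, 67, 69, 142, 150]

median_a = sample_median(brand_a)  # even n: average of 102 and 116
median_b = sample_median(brand_b)  # even n: average of 64 and 65
median_c = sample_median(brand_c)  # even n: average of 43 and 95
median_d = sample_median(brand_d)  # odd n: the 4th ordered value

print(median_a, median_b, median_c, median_d)
```

The function sorts its input first, so it does not rely on the data already being ordered.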

Sample Mode

Definition: The most frequently occurring sample data value is the mode. There can be
more than one mode.


Example 3
Battery brands A and C do not have modes. Battery brand B has mode 22 hours.
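One way to check the modes is to count how often each value occurs. The sketch below follows this lesson's convention that a sample in which every value occurs exactly once has no mode; the function name `sample_modes` is hypothetical.

```python
from collections import Counter


def sample_modes(values):
    """Return the most frequent value(s), or an empty list if no value repeats."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        # Every value occurs once: treated here as "no mode".
        return []
    return sorted(value for value, count in counts.items() if count == highest)


# Battery lifetimes (hours) from Example 2.
brand_a = [38, 41, 87, 94, 102, 116, 155, 179, 214, 289]
brand_b = [22, 22, 32, 39, 64, 65, 99, 142, 191, 317]
brand_c = [18, 24, 34, 41, 43, 95, 122, 139, 318, 360]

modes_a = sample_modes(brand_a)
modes_b = sample_modes(brand_b)
modes_c = sample_modes(brand_c)

print(modes_a, modes_b, modes_c)
```

Returning a list allows for more than one mode. Note that Python's built-in `statistics.multimode` uses a different convention: when no value repeats, it returns every value rather than an empty list.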

Example 4
You decide to participate in a fishing contest at a local pond. Each contestant must catch 5 fish,
and the winner will be the contestant with the “longest” catches overall. Given the 5 fish you
caught, shown below, would you rather the judges use the mean or the median to determine the
longest catches?

Answer: You want to win the contest! So, hopefully the judges will determine the longest
catches using the mean of the five catches. The length of the median catch definitely won’t win
you the top prize!

Skewed Data
A data set is said to be skewed if it is asymmetric, either positively or negatively, as denoted in
the figures below.

 For positively skewed data, generally the mean is greater than the median.
 For negatively skewed data, generally the mean is less than the median.
 For symmetric data, the mean and median tend to be close to the same value.
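As a rough illustration of the first rule of thumb, the battery lifetimes from Example 2 can be used to compare each brand's mean with its median. This sketch uses Python's standard `statistics` module; the dictionary name `lifetimes` is an assumption made for the example.

```python
import statistics

# Battery lifetimes (hours) from Example 2.
lifetimes = {
    "A": [38, 41, 87, 94, 102, 116, 155, 179, 214, 289],
    "B": [22, 22, 32, 39, 64, 65, 99, 142, 191, 317],
    "C": [18, 24, 34, 41, 43, 95, 122, 139, 318, 360],
}

# For each brand, store the pair (mean, median).
comparison = {
    brand: (statistics.mean(values), statistics.median(values))
    for brand, values in lifetimes.items()
}

for brand, (mean, median) in comparison.items():
    relation = "greater than" if mean > median else "not greater than"
    print(f"Brand {brand}: mean {mean:.1f} is {relation} median {median:.1f}")
```

For all three brands the mean exceeds the median, which is what the rule of thumb suggests for data with a long right tail. A plot of the data is still needed to judge the shape properly.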

Below are histograms of exam scores for 110 students. Note: All histogram bins contain their left
endpoints.


Histogram panel captions:

 Mean (~71.86) and median (73) are about the same
 Mean (~66.32) is greater than the median (62)
 Mean (~75.04) and median (75) are about the same
 Mean (~83.50) is less than the median (89)

Measures of Spread
We can observe three measures of spread for a sample: the sample range, sample variance, and
sample standard deviation.

Sample Range

Definition: The sample range for a data set is the difference between the largest
(maximum) and smallest (minimum) data values in the sample.

Returning to Example 1 (data below), we can calculate sample ranges for battery brands A, B,
and C.


 Sample range for battery brand A lifetimes: 289 – 38 = 251 hours
 Sample range for battery brand B lifetimes: 317 – 22 = 295 hours
 Sample range for battery brand C lifetimes: 360 – 18 = 342 hours
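The three ranges above can be checked with a short Python sketch. The function name `sample_range` is illustrative, and the data are the lifetimes listed in Example 2.

```python
def sample_range(values):
    """Difference between the largest and smallest data values."""
    return max(values) - min(values)


# Battery lifetimes (hours) from Example 2.
lifetimes = {
    "A": [38, 41, 87, 94, 102, 116, 155, 179, 214, 289],
    "B": [22, 22, 32, 39, 64, 65, 99, 142, 191, 317],
    "C": [18, 24, 34, 41, 43, 95, 122, 139, 318, 360],
}

ranges = {brand: sample_range(values) for brand, values in lifetimes.items()}

for brand, value in ranges.items():
    print(f"Brand {brand}: sample range = {value} hours")
```

Because the range uses only the two most extreme values, a single unusual observation can change it substantially.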

Sample Variance and Sample Standard Deviation

Definition: The sample variance is the most common estimate of data spread, and we use
it in conjunction with the sample mean. It is a measure of deviation from the sample mean $\bar{x}$.
For instance, the difference $(x_1 - \bar{x})$ is the deviation of the first data point from the sample
mean. Hence, we have the n deviations:

$$(x_1 - \bar{x}),\ (x_2 - \bar{x}),\ \ldots,\ (x_i - \bar{x}),\ \ldots,\ (x_n - \bar{x})$$

Some deviations are negative, while others are positive, and summing the deviations yields
0. In order to make all deviations nonnegative, we square each deviation. The sample variance is
the sum of the squared deviations divided by (n – 1) and is denoted by the symbol $s^2$.

$$s^2 = \frac{1}{n-1}\left[(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2\right] = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$$

Comments regarding the sample variance, $s^2$:

 The sample variance ($s^2$) measures the average scatter of the data values about the
sample mean. It is approximately the average of the squared deviations (the sum is divided
by n – 1 rather than n).
 Why do we divide by n – 1 instead of n? Because dividing by n – 1 gives an unbiased
estimate of the true population variance $\sigma^2$, whereas dividing by n tends to underestimate it.
 The units of $s^2$ are squared units. For example, if our data consists of people’s weights
in pounds, $s^2$ has units of pounds squared. To return to the same units as the sample mean,
we take the square root of the sample variance; it is called the sample standard
deviation, and it is denoted by s.


The sample variance and sample standard deviation for battery brand A lifetimes from Example
1 are computed as follows.

We already computed the sample mean of battery brand A as $\bar{x}$ = 131.5 hours. So, the sum of
the squared deviations is:

$$(41 - 131.5)^2 + (289 - 131.5)^2 + (214 - 131.5)^2 + \cdots + (155 - 131.5)^2 = 55850.5 \text{ hours}^2$$

Thus,

$$s^2 = \frac{55850.5}{9} \approx 6205.61 \text{ hrs}^2, \quad \text{and} \quad s = \sqrt{6205.61} \approx 78.78 \text{ hrs.}$$
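The hand calculation for brand A can be reproduced step by step in Python. This is a sketch for checking the arithmetic; the variable names are illustrative, and the data are the brand A lifetimes listed in Example 2.

```python
import math
import statistics

# Battery brand A lifetimes (hours) from Example 2.
brand_a = [38, 41, 87, 94, 102, 116, 155, 179, 214, 289]

n = len(brand_a)
mean_a = sum(brand_a) / n

# Sum of squared deviations from the sample mean.
sum_sq_dev = sum((x - mean_a) ** 2 for x in brand_a)

# Sample variance divides by n - 1; the standard deviation is its square root.
variance_a = sum_sq_dev / (n - 1)
std_dev_a = math.sqrt(variance_a)

print(f"sum of squared deviations = {sum_sq_dev}")
print(f"s^2 = {variance_a:.2f} hrs^2")
print(f"s = {std_dev_a:.2f} hrs")

# Cross-check against the standard library, which also divides by n - 1.
library_variance = statistics.variance(brand_a)
library_std_dev = statistics.stdev(brand_a)
```

The standard library functions `statistics.variance` and `statistics.stdev` use the n – 1 divisor, so they match the sample formulas in this lesson. `statistics.pvariance` and `statistics.pstdev` divide by n instead and are meant for a full population.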

Minitab Calculations
All computations we just did by hand in previous examples can be easily calculated in Minitab.

Example 5
Ten batteries from brands A, B, and C were tested to determine their lifetimes (in hours).

Open the Minitab worksheet DescribingDataNumerically_Lesson.mtw. Data for battery brand
A, B, and C lifetimes are in columns C1, C2, and C3, respectively.

How to compute descriptive statistics in Minitab:

Minitab

1 Choose Stat > Basic Statistics > Display Descriptive Statistics.

2 In Variables, enter ‘Brand A’ ‘Brand B’ ‘Brand C’.
3 Click Statistics and check Mean, Standard deviation, Variance, Median, Mode,
Minimum, Maximum, Range, and N total.
4 Click OK in each dialog box.


Minitab Express

1 Open the descriptive statistics dialog box.

 Mac: Statistics > Summary Statistics > Descriptive Statistics
 PC: STATISTICS > Descriptive Statistics
2 In Variable, enter ‘Brand A’ ‘Brand B’ ‘Brand C’.
3 Click Statistics, and then select Mean, Standard deviation, Variance, Median, Mode,
Minimum, Maximum, Range, and N total.
4 Click OK.

The Minitab output is:

Before beginning the activity sheet, here’s a fun riddle for remembering the mean, median,
mode, and range.
