Descriptive Statistics: Overview of Using Data
Lecture # 02
TOPICS to be COVERED
02 Types of Data
03 Types of Measurements
© 2016 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part,
except for use as permitted in a license distributed with a certain product or service or otherwise on a password-
protected website for classroom use.
Overview of Using Data: Definitions and Goals
• Data
• Variable
• Observation
• Variation
• Random variables
• Data are the facts and figures collected, analyzed,
and summarized for presentation and
interpretation.
• A variable is any characteristic, number, or
quantity that can be measured or counted. It is
called a variable because its value may vary
between data units in a population and may
change over time.
Example: the heights of all the students in a class.
• An observation is a set of values corresponding
to a set of variables.
• Variation is the difference in a variable measured over
observations (time, customers, items, etc.).
Example: When we collect data, we are gathering past observed values, or realizations of a variable.
By collecting these past realizations of one or more variables, our goal is to learn more about the
variation of a particular business situation.
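The definitions above can be illustrated with a minimal sketch. The student labels and heights are hypothetical values chosen only for illustration; each dictionary is one observation, each key is a variable, and the range is used as one simple description of variation.

```python
# Hypothetical data set: each dict is one observation, each key is a variable.
observations = [
    {"student": "A", "height_cm": 158},
    {"student": "B", "height_cm": 171},
    {"student": "C", "height_cm": 165},
    {"student": "D", "height_cm": 180},
]

# The variable "height_cm" takes a different value in each observation.
heights = [obs["height_cm"] for obs in observations]

# One simple way to describe variation: the spread between the extremes.
height_range = max(heights) - min(heights)

print("Observed heights:", heights)
print("Range of heights:", height_range)
```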
Data can be categorized in several ways based
on how they are collected and the type of data
collected.
• In many cases, it is not feasible to collect data
from the population of all elements of
interest.
• In such instances, we collect data from a
subset of the population known as a sample.
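As a rough sketch of drawing a sample from a population, the example below uses Python's standard `random` module. The population of 1,000 store sales figures is hypothetical and generated only for illustration; the point is that the sample is a subset of the population and is much cheaper to collect.

```python
import random

# Hypothetical population: monthly sales (in dollars) for 1,000 stores.
random.seed(42)  # fixed seed so the sketch is repeatable
population = [random.randint(10_000, 50_000) for _ in range(1000)]

# Draw a simple random sample of 50 stores without replacement.
sample = random.sample(population, k=50)

population_mean = sum(population) / len(population)
sample_mean = sum(sample) / len(sample)

print(f"Population mean: {population_mean:.2f}")
print(f"Sample mean:     {sample_mean:.2f}")
```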
What is a Statistic?
[Figure: several samples drawn from a population]
• If arithmetic operations cannot be performed
on the data, they are considered categorical
data.
• For instance, the data in the Industry column
in Table 2.1 are categorical.
Cross-Sectional and Time Series Data
• Cross-sectional data are collected from several entities at the
same, or approximately the same, point in time.
Types of Measurements
1. Nominal 2. Ordinal 3. Interval 4. Ratio
Categorical (Nominal) data
• What does this mean? No mathematical
operations can be performed on the data
relative to each other.
• Therefore, nominal data reflect qualitative
differences rather than quantitative ones.
• Nominal measurements only permit you to
determine whether two individuals are the
same or different.
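A minimal sketch of what nominal measurement permits, using a hypothetical list of industry labels (the values are assumptions, not taken from Table 2.1). Only equality comparisons carry meaning; any ordering of the labels is for display only.

```python
# Hypothetical nominal variable: the industry of each company.
industries = ["Banking", "Retail", "Banking", "Energy", "Retail", "Banking"]

# Nominal values can only be compared for equality.
same_category = industries[0] == industries[2]       # both "Banking"
different_category = industries[0] != industries[1]  # "Banking" vs "Retail"

# The distinct categories can be listed; the alphabetical order used here
# is a display choice and says nothing about the categories themselves.
distinct_categories = sorted(set(industries))

print(same_category, different_category)
print(distinct_categories)
```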
Nominal data
Examples:
‒ Male / Female
‒ Yes / No
Ordinal data
• Ordinal data comprise categories that can be rank ordered.
• As with nominal data, the distance between categories
cannot be calculated, but the categories can be ranked above or
below each other.
• No fixed units of measurement
Examples:
‒ college football rankings
‒ survey responses (poor, average, good, very good, excellent)
What does this mean? We can make statistical judgements and perform
limited mathematics.
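The "limited mathematics" available for ordinal data can be sketched as follows. The response list is hypothetical; the scale is the one from the survey example above. Ranking and the median category are meaningful because they depend only on order, whereas differences between categories are not defined.

```python
# Ordered categories from the survey example, lowest to highest.
scale = ["poor", "average", "good", "very good", "excellent"]
rank = {label: position for position, label in enumerate(scale)}

# Hypothetical survey responses.
responses = ["good", "poor", "excellent", "average", "good", "very good", "good"]

# Ranking is meaningful: sort the responses by their position on the scale.
ordered = sorted(responses, key=lambda label: rank[label])

# With an odd number of responses, the median category is the middle one.
median_response = ordered[len(ordered) // 2]

print(ordered)
print("Median response:", median_response)
```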
Interval and Ratio data
• Both interval and ratio data are examples of scale data.
• Scale data:
‒ are in numeric format ($50, $100, $150)
‒ can be measured on a continuous scale
‒ have distances between values that can be observed and therefore
measured
‒ can be placed in rank order.
Interval data
• Like ordinal data, but with constant differences
between values
• Examples:
‒ Time (as read from a clock or calendar): moves along a
continuous measure of seconds, minutes, and so on, and
has no true zero point.
‒ Temperature (in degrees Celsius or Fahrenheit): moves along
a continuous measure of degrees and has no true zero.
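A short sketch of why the missing true zero matters. The two temperature readings are hypothetical. Differences between Celsius readings are meaningful, but their ratio is not; converting to kelvin, a scale that does have a true zero, gives a very different ratio.

```python
# Hypothetical temperature readings in degrees Celsius (an interval scale).
morning_c = 10.0
afternoon_c = 20.0

# Differences are meaningful on an interval scale.
difference = afternoon_c - morning_c  # 10 degrees warmer


def to_kelvin(celsius):
    """Convert degrees Celsius to kelvin, a scale with a true zero."""
    return celsius + 273.15


# Ratios are not meaningful in Celsius: 20 degrees is not "twice as hot"
# as 10 degrees, because 0 degrees Celsius is an arbitrary zero point.
naive_ratio = afternoon_c / morning_c
kelvin_ratio = to_kelvin(afternoon_c) / to_kelvin(morning_c)

print(f"Difference: {difference} degrees")
print(f"Ratio of Celsius readings: {naive_ratio:.2f}")
print(f"Ratio of kelvin readings:  {kelvin_ratio:.3f}")
```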
Ratio data
• Ratio data are measured on a continuous scale and
have a natural zero point
• Ratios are meaningful
Examples:
‒ Monthly sales
‒ Delivery times
‒ Weight
‒ Height
‒ Age
Data for Business Analytics
Classifying Data Elements in a Purchasing Database
[Figure: columns of a purchasing database labeled by measurement type, left to right: Ratio, Categorical, Categorical, Categorical, Categorical, Ratio, Ratio, Interval, Interval, Ratio]
• Reference: textbook, p. 24
Modifying Data in Excel
Sorting Data in Excel
• Step 1. Select cells A1:F21
• Step 2. Click the Data tab in the Ribbon
• Step 3. Click Sort in the Sort & Filter group
• Step 4. Select the check box for My data has headers
• Step 5. In the first Sort by dropdown menu, select Sales
(March 2010)
• Step 6. In the Order dropdown menu, select Largest to
Smallest (see Figure 2.4)
• Step 7. Click OK
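For readers working outside Excel, the sort above can be sketched in Python. The rows below are hypothetical stand-ins for the worksheet (the model names and sales figures are invented for illustration); only the column name Sales (March 2010) and the Largest to Smallest order come from the steps.

```python
# Hypothetical rows standing in for the worksheet; values are illustrative.
rows = [
    {"Model": "Model A", "Manufacturer": "Toyota", "Sales (March 2010)": 2900},
    {"Model": "Model B", "Manufacturer": "Honda", "Sales (March 2010)": 2200},
    {"Model": "Model C", "Manufacturer": "Ford", "Sales (March 2010)": 1900},
    {"Model": "Model D", "Manufacturer": "Toyota", "Sales (March 2010)": 3600},
]

# Equivalent of Sort by "Sales (March 2010)", Order "Largest to Smallest".
sorted_rows = sorted(
    rows, key=lambda row: row["Sales (March 2010)"], reverse=True
)

for row in sorted_rows:
    print(row["Model"], row["Manufacturer"], row["Sales (March 2010)"])
```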
Filtering
• Step 1. Select cells A1:F21
• Step 2. Click the Data tab in the Ribbon
• Step 3. Click Filter in the Sort & Filter group
• Step 4. Click on the Filter Arrow in column B, next to
Manufacturer
• Step 5. If all choices are checked, you can easily
deselect all choices by unchecking (Select All).
Then select only the check box for Toyota.
• Step 6. Click OK
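The filtering steps above can be sketched in Python as a simple selection on the Manufacturer column. The rows are hypothetical; only the column name and the value Toyota come from the steps.

```python
# Hypothetical rows standing in for the worksheet; values are illustrative.
rows = [
    {"Model": "Model A", "Manufacturer": "Toyota"},
    {"Model": "Model B", "Manufacturer": "Honda"},
    {"Model": "Model C", "Manufacturer": "Ford"},
    {"Model": "Model D", "Manufacturer": "Toyota"},
]

# Equivalent of checking only "Toyota" in the Manufacturer filter.
toyota_rows = [row for row in rows if row["Manufacturer"] == "Toyota"]

for row in toyota_rows:
    print(row["Model"], row["Manufacturer"])
```

As in Excel, the original rows are untouched; the filter only controls which rows are shown.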
Creating Distributions from Data
• Distributions help summarize many
characteristics of a data set by describing how
often certain values for a variable appear in
that data set.
• Distributions can be created for both
categorical and quantitative data, and they
assist the analyst in determining variation.
Frequency Distributions for Categorical Data
• A frequency distribution is a summary of data
that shows the number (frequency) of
observations in each of several nonoverlapping
classes, typically referred to as bins.
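A minimal sketch of a frequency distribution for categorical data, using `collections.Counter` from the Python standard library. The list of purchases is hypothetical. Each category acts as one bin, and every observation falls into exactly one of them.

```python
from collections import Counter

# Hypothetical categorical data: the drink chosen in 10 purchases.
purchases = [
    "Cola", "Lemonade", "Cola", "Water", "Cola",
    "Water", "Lemonade", "Cola", "Water", "Cola",
]

# Each category is a bin; Counter tallies the observations per bin.
frequency = Counter(purchases)

print(f"{'Drink':<12}Frequency")
for category, count in sorted(frequency.items()):
    print(f"{category:<12}{count}")
```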
Frequency Distribution
Consider a data set of 26 children aged 1-6 years. The frequency
distribution of the variable 'age' can be tabulated as follows:

Age          1    2    3    4    5    6
Frequency    5    3    7    5    4    2

Grouped Frequency Distribution of Age:

Age Group    1-2    3-4    5-6
Frequency    8      12     6
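The grouped table follows from the ungrouped one by adding the frequencies of the ages in each group. A short sketch that reproduces it from the frequencies given above:

```python
# Frequencies by single year of age, as tabulated in the first table.
age_frequency = {1: 5, 2: 3, 3: 7, 4: 5, 5: 4, 6: 2}

# Age groups used in the grouped table: (lower, upper), both inclusive.
groups = [(1, 2), (3, 4), (5, 6)]

grouped_frequency = {
    f"{low}-{high}": sum(
        count for age, count in age_frequency.items() if low <= age <= high
    )
    for low, high in groups
}

total_children = sum(age_frequency.values())

print(grouped_frequency)  # {'1-2': 8, '3-4': 12, '5-6': 6}
print(total_children)     # 26
```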
Example: 1
• A survey was taken on Maple Avenue. In each
of 20 homes, people were asked how many
cars were registered to their household.
• The results were recorded as follows:
3, 1, 4, 0, 2, 1, 5, 2, 1, 5, 4, 2, 3, 2, 0, 2, 1, 0, 3, 2.
• Present these data in a frequency distribution
table.
• Also find the maximum number of cars registered
to a household.
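One possible way to work this example in Python is sketched below, using the 20 recorded values from the survey. It is offered as a check on a hand-built table rather than as the intended classroom solution.

```python
from collections import Counter

# The 20 survey results, as recorded in the example.
cars = [3, 1, 4, 0, 2, 1, 5, 2, 1, 5, 4, 2, 3, 2, 0, 2, 1, 0, 3, 2]

# Tally how many households reported each number of cars.
frequency = Counter(cars)

print(f"{'Cars':<6}Frequency")
for number_of_cars in range(min(cars), max(cars) + 1):
    print(f"{number_of_cars:<6}{frequency[number_of_cars]}")

# The largest number of cars registered to any one household.
maximum_cars = max(cars)
print("Maximum cars registered to one household:", maximum_cars)
```

The tallies come to 3, 4, 6, 3, 2, and 2 households for 0 through 5 cars, which sum to the 20 homes surveyed, and the maximum is 5 cars.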
Example: 2
Solution ?
• Discussed in class
Relative Frequency and Percent Frequency Distributions
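A minimal sketch, assuming the standard definitions: the relative frequency of a bin is its frequency divided by the number of observations n, and the percent frequency is the relative frequency multiplied by 100. The frequencies used are those of the 26 children's ages tabulated earlier in this lecture.

```python
# Frequencies of the variable 'age' for the 26 children.
age_frequency = {1: 5, 2: 3, 3: 7, 4: 5, 5: 4, 6: 2}

n = sum(age_frequency.values())  # total number of observations

# Relative frequency = frequency / n; percent frequency = relative * 100.
relative = {age: count / n for age, count in age_frequency.items()}
percent = {age: 100 * value for age, value in relative.items()}

print(f"{'Age':<5}{'Frequency':<11}{'Relative':<10}Percent")
for age in age_frequency:
    print(
        f"{age:<5}{age_frequency[age]:<11}"
        f"{relative[age]:<10.3f}{percent[age]:.1f}%"
    )
```

The relative frequencies always sum to 1, and the percent frequencies to 100.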
Example: 3
Frequency Distributions for Quantitative Data
• Consider the quantitative data in Table 2.6
• These data show the time in days required to
complete year-end audits for a sample of 20 clients
of Sanderson and Clifford, a small public
accounting firm. The three steps necessary to
define the classes for a frequency distribution with
quantitative data are as follows:
• Bin Limits: Bin limits must be chosen so that
each data item belongs to one and only one
class.
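The "one and only one class" requirement can be checked mechanically. In the sketch below, both the audit times and the bin limits are hypothetical (they are not the values from Table 2.6); the check confirms that the chosen limits neither overlap nor leave gaps for the data at hand.

```python
# Hypothetical audit times in days; not the values from Table 2.6.
audit_times = [12, 15, 14, 19, 22, 27, 33, 18, 21, 16]

# Hypothetical bins as (lower limit, upper limit), both inclusive.
bins = [(10, 14), (15, 19), (20, 24), (25, 29), (30, 34)]


def bins_containing(value):
    """Return every bin whose limits include the value."""
    return [(low, high) for low, high in bins if low <= value <= high]


# Well-chosen limits place each data item in exactly one bin.
for value in audit_times:
    assert len(bins_containing(value)) == 1, f"{value} is not in exactly one bin"

# With valid limits, the bin frequencies account for every data item once.
frequency = {
    (low, high): sum(1 for value in audit_times if low <= value <= high)
    for low, high in bins
}
print(frequency)
```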
Example: 4
• Step 1. Select cells B10:B14
• Step 2. Type the formula =FREQUENCY(A2:D6,
A10:A14). The range A2:D6 defines the data set,
and the range A10:A14 defines the bins.
• Step 3. Press CTRL+SHIFT+ENTER after typing
the formula in Step 2.
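To make the counting rule concrete, here is a Python sketch of what the FREQUENCY formula computes, on the assumption that each bin value is treated as an inclusive upper limit and that values above the largest limit are counted in one extra overflow position. The function name `frequency` and the sample data are hypothetical.

```python
from bisect import bisect_left


def frequency(data, bin_upper_limits):
    """Count values per bin, treating each limit as an inclusive upper bound.

    Returns one count per limit, plus a final count for values above the
    largest limit.
    """
    limits = sorted(bin_upper_limits)
    counts = [0] * (len(limits) + 1)
    for value in data:
        # bisect_left gives the index of the first limit that is >= value,
        # or len(limits) when the value exceeds every limit.
        counts[bisect_left(limits, value)] += 1
    return counts


# Hypothetical data and bin upper limits, for illustration only.
audit_times = [12, 15, 14, 19, 22, 27, 33, 18, 21, 16]
upper_limits = [14, 19, 24, 29, 34]

counts = frequency(audit_times, upper_limits)
print(counts)  # [2, 4, 2, 1, 1, 0]
```

The last entry is the overflow count; it is zero here because no value exceeds the largest limit.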
Data Presentation
There are two types of statistical presentation of data: graphical and numerical.
In a graphical presentation, we look for the overall pattern and for striking deviations
from that pattern. The overall pattern is usually described by the shape, center, and spread
of the data. An individual value that falls outside the overall pattern is called an
outlier.
• Bar charts and pie charts are used for categorical variables.
• Histograms, stem-and-leaf plots, and box plots are used for numerical variables.
Histograms
• Step 1. Click the Data tab in the Ribbon
• Step 2. Click Data Analysis in the Analyze group
• Step 3. When the Data Analysis dialog box opens, choose
Histogram from the list of Analysis Tools, and click OK
• In the Input Range: box, enter A2:D6
• In the Bin Range: box, enter A10:A14
• Under Output Options:, select New Worksheet Ply:
• Select the check box for Chart Output (see Figure 2.13)
• Click OK
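The idea behind the chart output can be sketched without Excel: a histogram draws one bar per bin, with bar length proportional to the bin's frequency. The bin labels and frequencies below are hypothetical, and the helper name `text_histogram` is invented for this sketch.

```python
# Hypothetical bin labels and frequencies, for illustration only.
bin_frequencies = {"10-14": 2, "15-19": 4, "20-24": 2, "25-29": 1, "30-34": 1}


def text_histogram(frequencies, symbol="#"):
    """Return one text line per bin, with a bar as long as its frequency."""
    lines = []
    for label, count in frequencies.items():
        lines.append(f"{label:>6} | {symbol * count} {count}")
    return lines


for line in text_histogram(bin_frequencies):
    print(line)
```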
A common graphical presentation
of quantitative data is a histogram.
Thank You !