Statistical Business Analytics: Unit 1: Introduction, Data and Statistics
Business analytics:
■ Economics
Economists use statistical information
in making forecasts about the future of
the economy or some aspect of it.
Applications in
Business and Economics
■ Marketing
Electronic point-of-sale scanners at
retail checkout counters are used to
collect data for a variety of marketing
research applications.
■ Production
A variety of statistical quality
control charts are used to monitor
the output of a production process.
■ Finance
Financial advisors use price-earnings ratios and
dividend yields to guide their investment
recommendations.
Data and Data Sets
Data Set
Types of Data and Scales of Measurement
[Diagram: classification of data]
Nature of variable: qualitative or quantitative
Type of data representation: qualitative data may be numerical or nonnumerical; quantitative data are numerical
Scale of measurement: qualitative data use the nominal or ordinal scale; quantitative data use the interval or ratio scale
Qualitative and Quantitative Data
Data can be classified as being either qualitative or quantitative in nature.
The statistical analysis that is appropriate depends on whether the data for the variable are qualitative or quantitative.
In general, there are more alternatives for statistical analysis when the data are quantitative.
Qualitative Data
• Labels or names used to identify an attribute of each element
• Often referred to as categorical data
• Use either the nominal or ordinal scale of measurement
• Can be either numeric or nonnumeric
• Appropriate statistical analyses are rather limited
Quantitative Data
Quantitative data indicate how many or how much:
• discrete, if measuring how many
• continuous, if measuring how much
Quantitative data are always numeric.
Ordinary arithmetic operations are meaningful for quantitative data.
Scales of Measurement
Scales of measurement include: Nominal, Ordinal, Interval, Ratio.
The scale determines the amount of information contained in the data.
The scale indicates the data summarization and statistical analyses that are most appropriate.
■ Nominal
Data are labels or names used to identify an attribute of the element.
A nonnumeric label or numeric code may be used.
Example:
Students of a university are classified by the school in which they are enrolled using a nonnumeric label such as Business, Humanities, Education, and so on.
Alternatively, a numeric code could be used for the school variable (e.g. 1 denotes Business, 2 denotes Humanities, 3 denotes Education, and so on).
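A minimal Python sketch of the nominal example, using the numeric codes given above. The list of enrolled students is made up for illustration. It shows that counting and finding the most frequent category are meaningful for nominal codes, whereas averaging the codes would not be.

```python
from collections import Counter

# Numeric codes for the school variable (nominal scale), as in the example.
SCHOOL_CODES = {1: "Business", 2: "Humanities", 3: "Education"}

# Made-up enrollment codes for seven students, for illustration only.
students = [1, 3, 2, 1, 1, 2, 3]

# Counting and the mode are meaningful for nominal data.
counts = Counter(SCHOOL_CODES[code] for code in students)
most_common_school, frequency = counts.most_common(1)[0]

print(counts)
print(most_common_school, frequency)

# The mean of the codes can be computed, but it has no interpretation:
# the codes are only labels, so the number says nothing about the schools.
```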
■ Ordinal
The data have the properties of nominal data and the order or rank of the data is meaningful.
A nonnumeric label or numeric code may be used.
Example:
Students of a university are classified by their course performance using a nonnumeric label such as Distinction, Merit, Pass or Fail.
Alternatively, a numeric code could be used for the course performance variable (e.g. 1 denotes Distinction, 2 denotes Merit and so on).
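A short Python sketch of the ordinal example. The codes 1 and 2 come from the example; 3 for Pass and 4 for Fail are an assumed continuation of "and so on", and the five student results are made up. Because the order of the codes is meaningful, sorting and taking a median category make sense, which would not be true of nominal codes.

```python
from statistics import median_low

# Ordered categories; a lower code means better performance.
# Codes 3 and 4 are an assumed continuation of the example's coding.
GRADE_CODES = {"Distinction": 1, "Merit": 2, "Pass": 3, "Fail": 4}
CODE_TO_GRADE = {code: grade for grade, code in GRADE_CODES.items()}

# Made-up results for five students, for illustration only.
results = ["Merit", "Pass", "Distinction", "Merit", "Fail"]

# Ranking is meaningful: sort from best to worst using the codes.
ranked = sorted(results, key=GRADE_CODES.get)

# The median category is also meaningful for ordinal data.
median_grade = CODE_TO_GRADE[median_low([GRADE_CODES[r] for r in results])]

print(ranked)
print(median_grade)
```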
■ Interval
The data have the properties of ordinal data, and the interval (or distance) between observations is expressed in terms of a fixed unit of measure.
Interval data are always numeric.
It is meaningful to calculate sums and differences of data values, but the scale doesn't have a natural zero point.
Example:
Marianna has a GMAT score of 605, while Kostas has a GMAT score of 490. Marianna scored 115 points more than Kostas.
■ Ratio
The data have all the properties of interval data and the ratio of two values is meaningful.
Variables such as distance, height, weight, and time use the ratio scale.
This scale must contain a zero value that indicates that nothing exists for the variable at the zero point.
Example:
Marianna's college record shows 36 credits earned, while Kostas's record shows 72 credits earned. Kostas has twice as many credits earned as Marianna.
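The arithmetic in the interval and ratio examples can be checked with a few lines of Python. This is only a sketch of the two calculations; the variable names are illustrative.

```python
# Interval scale: GMAT scores from the example. Differences are meaningful.
gmat = {"Marianna": 605, "Kostas": 490}
gmat_difference = gmat["Marianna"] - gmat["Kostas"]

# Ratio scale: credits earned. Zero credits means nothing has been earned,
# so the ratio of two values is meaningful as well.
credits_earned = {"Marianna": 36, "Kostas": 72}
credit_ratio = credits_earned["Kostas"] / credits_earned["Marianna"]

print(f"Marianna scored {gmat_difference} points more than Kostas.")
print(f"Kostas has {credit_ratio:g} times as many credits as Marianna.")
```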
End of Unit 1
Unit 2 Data Acquisition and Analysis
■ Types of Data Sets
■ Data Sources and Acquisition
■ Descriptive Statistics
■ Statistical Inference
■ Computers and Statistical Analysis
Cross-Sectional Data
With cross-sectional data, observations are made on a number of elements at a single date / for a single time period.
Example: data detailing the number of building permits issued in June 2006 in each of the regions of Italy
Time Series Data
With time series data, observations are made on a single entity at several dates / over several time periods.
Example: data detailing the number of building permits issued in Tuscany, Italy in each of the last 36 months
Panel Data
With panel data, observations are made on a number of elements at several dates / over several time periods.
Example: data detailing the number of building permits issued in each of the regions of Italy in each of the last 36 months
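One way to see how the three types of data set relate is to treat a panel as a table keyed by (element, time period). A cross-section is then a slice at one period, and a time series is a slice for one element. The Python sketch below uses made-up permit counts for two regions and two months, purely for illustration.

```python
# Made-up permit counts, for illustration only: (region, month) -> permits issued.
panel = {
    ("Tuscany", "2006-05"): 120,
    ("Tuscany", "2006-06"): 135,
    ("Lazio", "2006-05"): 210,
    ("Lazio", "2006-06"): 198,
}

# Cross-sectional data: several elements (regions), one time period.
cross_section = {
    region: permits
    for (region, month), permits in panel.items()
    if month == "2006-06"
}

# Time series data: one entity (Tuscany), several time periods.
time_series = {
    month: permits
    for (region, month), permits in panel.items()
    if region == "Tuscany"
}

print(cross_section)
print(time_series)
```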
Data Sources
■ Existing Sources
■ Statistical Studies
In experimental studies the variables of interest are first identified. Then one or more factors are controlled so that data can be obtained about how the factors influence the variables.
In observational (nonexperimental) studies no attempt is made to control or influence the variables of interest. A survey is a good example of an observational study.
Data Acquisition Considerations
Time Requirement
• Searching for information can be time consuming.
• Information may no longer be useful by the time it
is available.
Cost of Acquisition
• Organizations often charge for information even
when it is not their primary business activity.
Data Errors
• Using any data that happen to be available or
that were acquired with little care can lead to poor
and misleading information.
Descriptive Statistics
91 78 93 57 75 52 99 80 97 62
71 69 72 89 66 75 79 75 72 76
104 74 62 68 97 105 77 65 80 109
85 97 88 68 83 68 72 69 67 74
62 82 98 101 79 105 79 69 62 73
Tabular Summary:
Frequency and Percent Frequency
[Figure: histogram of frequency by Parts Cost (€), with classes 50-59, 60-69, 70-79, 80-89, 90-99, 100-110]
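The tabular summary can be reproduced from the 50 values listed under Descriptive Statistics. The Python sketch below assumes those values are the parts costs in euros and uses the class limits shown on the chart axis. It counts the values in each class and converts the counts to percentages.

```python
# The 50 values listed above, assumed to be parts costs in euros.
costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 72, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

# Class limits as labeled on the chart axis.
CLASSES = [(50, 59), (60, 69), (70, 79), (80, 89), (90, 99), (100, 110)]

# Frequency: number of values falling in each class.
frequency = {
    f"{low}-{high}": sum(low <= cost <= high for cost in costs)
    for low, high in CLASSES
}

# Percent frequency: share of all values in each class.
percent = {label: 100 * count / len(costs) for label, count in frequency.items()}

for label, count in frequency.items():
    print(f"{label:>7}  {count:>3}  {percent[label]:>5.0f}%")
```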
Numerical Descriptive Statistics
1. Population consists of all tune-ups. Average cost of parts is unknown.
2. A sample of 50 engine tune-ups is examined.
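Since the population average is unknown, the sample average is the natural numerical summary to compute. The Python sketch below again assumes that the 50 values listed under Descriptive Statistics are the parts costs of the sampled tune-ups, in euros.

```python
from statistics import mean

# The 50 sample values, assumed to be parts costs in euros.
costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 72, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

# The sample mean summarizes the sample and can serve as an estimate
# of the unknown population average.
sample_mean = mean(costs)

print(f"Sample size: {len(costs)}")
print(f"Sample mean parts cost: {sample_mean} euros")
```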