
Statistical Business Analytics: Unit 1: Introduction, Data and Statistics

Three key developments have spurred growth in business analytics: increased data availability due to technological advances, methodological developments, and greater computing power. Business analytics involves transforming data into insights to make better, more objective decisions. There are three main types of analytical methods: descriptive analytics which describes past events, predictive analytics which uses past data to predict the future, and prescriptive analytics which indicates the best course of action. Statistics are important for understanding phenomena, facilitating decision making, and are used extensively in business, economics, marketing, production, and finance.

Uploaded by sziklay

Statistical Business Analytics

Unit 1: Introduction, Data and Statistics


Unit 1 Introduction to Business Analytics

Three developments spurred recent explosive growth in the use of analytical methods in business applications:

■ Technological advances produce a lot of data for business
■ Numerous methodological developments
■ Explosion in computing power and storage capability
Business Analytics Defined

Business analytics:

• Scientific process of transforming data into insight for making better decisions
• Used for data-driven or fact-based decision making, which is often seen as more objective than other alternatives for decision making
A Categorization of Analytical Methods and Models

Descriptive Analytics: Encompasses the set of techniques that describes what has happened in the past.

Predictive Analytics: Consists of techniques that use models constructed from past data to predict the future or ascertain the impact of one variable on another.

Prescriptive Analytics: Indicates a best course of action to take (optimization, simulation optimization, decision analysis).
The Spectrum of Business Analytics
Unit 1 Data and Statistics
■ General Purpose of Statistics
■ Organization of Data
■ Types of Data Sources
■ Scales of Measurement
Why Do We Need Statistics?

■ Statistics is about collecting, analyzing, interpreting and presenting data…
■ … for the purpose of
• understanding phenomena in the world around us;
• facilitating informed decision making.
■ Statistics is used extensively in various fields of science and policy-making.
Applications in Business and Economics
■ Accounting
Public accounting firms use statistical
sampling procedures when conducting
audits for their clients.

■ Economics
Economists use statistical information
in making forecasts about the future of
the economy or some aspect of it.
■ Marketing
Electronic point-of-sale scanners at
retail checkout counters are used to
collect data for a variety of marketing
research applications.

■ Production
A variety of statistical quality
control charts are used to monitor
the output of a production process.
■ Finance
Financial advisors use price-earnings ratios and
dividend yields to guide their investment
recommendations.
Data and Data Sets

■ Data are the facts and figures collected, summarized, analyzed, and interpreted.
■ The data collected in a particular study are referred
to as the data set.
Elements, Variables, and Observations

■ The elements are the entities on which data are collected.
■ A variable is a characteristic of interest for the elements.
■ The set of measurements collected for a particular
element is called an observation.
■ The total number of data values in a data set is the
number of elements multiplied by the number of
variables.
Data, Data Sets, Elements, Variables, and Observations

Company                 Stock Exchange   Annual Sales ($bn)   Price/Earnings
BP                      FTSE                    22.34              11.5
Carrefour               PAR                     25.4               13.4
Toyota                  TSE                    183.02              14.9
General Motors          NYSE                   192.60              -1.5
Phillips Electronics    NYSE                     3.47              14.45

Element names: the Company column. Variables: Stock Exchange, Annual Sales ($bn), and Price/Earnings. Observation: one row. Data set: the whole table.
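As a rough illustration of the vocabulary above, the table can be held in a small Python structure. This is only a sketch: the dictionary layout and the names `data_set`, `elements`, `variables`, and `observation` are assumptions made for the example, not part of the slides.

```python
# Data set from the table above: one observation (row) per element (company).
# The key names are illustrative choices, not taken from the slides.
data_set = {
    "BP": {"stock_exchange": "FTSE", "annual_sales_bn": 22.34, "price_earnings": 11.5},
    "Carrefour": {"stock_exchange": "PAR", "annual_sales_bn": 25.4, "price_earnings": 13.4},
    "Toyota": {"stock_exchange": "TSE", "annual_sales_bn": 183.02, "price_earnings": 14.9},
    "General Motors": {"stock_exchange": "NYSE", "annual_sales_bn": 192.60, "price_earnings": -1.5},
    "Phillips Electronics": {"stock_exchange": "NYSE", "annual_sales_bn": 3.47, "price_earnings": 14.45},
}

elements = list(data_set)                         # entities on which data are collected
variables = list(next(iter(data_set.values())))   # characteristics recorded per element
observation = data_set["Toyota"]                  # all measurements for one element

# Total number of data values = number of elements x number of variables.
n_data_values = len(elements) * len(variables)
print(len(elements), len(variables), n_data_values)
```

Counting the stored values directly (`sum(len(row) for row in data_set.values())`) gives the same total, which is a quick consistency check on the elements-times-variables rule.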
Types of Data and Scales of Measurement

[Diagram: Data are split by the nature of the variable into Qualitative and Quantitative. Qualitative data can be represented numerically or nonnumerically, and in either case use the nominal or ordinal scale of measurement. Quantitative data are numerical and use the interval or ratio scale of measurement.]
Qualitative and Quantitative Data

Data can be classified as being either of qualitative or of quantitative nature.

The statistical analysis that is appropriate depends on whether the data for the variable are qualitative or quantitative.

In general, there are more alternatives for statistical analysis when the data are quantitative.
Qualitative Data

■ Labels or names used to identify an attribute of each element
■ Often referred to as categorical data
■ Use either the nominal or ordinal scale of measurement
■ Can be either numeric or nonnumeric
■ Appropriate statistical analyses are rather limited
Quantitative Data

■ Quantitative data indicate how many or how much:
• discrete, if measuring how many
• continuous, if measuring how much
■ Quantitative data are always numeric.
■ Ordinary arithmetic operations are meaningful for quantitative data.
Scales of Measurement

Scales of measurement include:
■ Nominal
■ Ordinal
■ Interval
■ Ratio

The scale determines the amount of information contained in the data.

The scale indicates the data summarization and statistical analyses that are most appropriate.
Scales of Measurement

■ Nominal

Data are labels or names used to identify an attribute of the element.

A nonnumeric label or numeric code may be used.

Example:
Students of a university are classified by the school in which they are enrolled using a nonnumeric label such as Business, Humanities, Education, and so on.
Alternatively, a numeric code could be used for the school variable (e.g. 1 denotes Business, 2 denotes Humanities, 3 denotes Education, and so on).
Scales of Measurement

■ Ordinal

The data have the properties of nominal data and the order or rank of the data is meaningful.

A nonnumeric label or numeric code may be used.

Example:
Students of a university are classified by their course performance using a nonnumeric label such as Distinction, Merit, Pass or Fail.
Alternatively, a numeric code could be used for the course performance variable (e.g. 1 denotes Distinction, 2 denotes Merit, and so on).
Scales of Measurement

■ Interval

The data have the properties of ordinal data, and the interval (or distance) between observations is expressed in terms of a fixed unit of measure.

Interval data are always numeric.

It is meaningful to calculate sums and differences of data values, but the scale doesn’t have a natural zero point.

Example:
Marianna has a GMAT score of 605, while Kostas has a GMAT score of 490. Marianna scored 115 points more than Kostas.
Scales of Measurement

■ Ratio

The data have all the properties of interval data and the ratio of two values is meaningful.

Variables such as distance, height, weight, and time use the ratio scale.

This scale must contain a zero value that indicates that nothing exists for the variable at the zero point.

Example:
Marianna’s college record shows 36 credits earned, while Kostas’s record shows 72 credits earned. Kostas has twice as many credits earned as Marianna.
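The four scales differ in which operations give a meaningful answer. The sketch below walks through one operation per scale. The GMAT scores and credit counts come from the examples above; the short lists of schools and grades are hypothetical samples made up for illustration, and the codes 3 = Pass and 4 = Fail are an assumed continuation of the "and so on" in the ordinal example.

```python
from collections import Counter

# Nominal: labels only, so counting (and the most frequent label) is meaningful;
# ordering or averaging the labels is not.
schools = ["Business", "Humanities", "Business", "Education"]  # hypothetical sample
most_common_school = Counter(schools).most_common(1)[0][0]

# Ordinal: the codes carry rank, so sorting and picking a middle category work,
# but the distance between codes has no fixed meaning.
grade_code = {"Distinction": 1, "Merit": 2, "Pass": 3, "Fail": 4}  # 3 and 4 assumed
grades = ["Merit", "Pass", "Distinction", "Merit", "Fail"]        # hypothetical sample
ranked = sorted(grades, key=grade_code.get)
median_grade = ranked[len(ranked) // 2]

# Interval: differences are meaningful (GMAT example from the slides).
gmat_difference = 605 - 490

# Ratio: a true zero point makes ratios meaningful (credits example from the slides).
credit_ratio = 72 / 36

print(most_common_school, median_grade, gmat_difference, credit_ratio)
```

Each step only uses operations the corresponding scale supports; for instance, no ratio of GMAT scores is computed, because an interval scale has no natural zero.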
End of Unit 1
Unit 2 Data Acquisition and Analysis
■ Types of Data Sets
■ Data Sources and Acquisition
■ Descriptive Statistics
■ Statistical Inference
■ Computers and Statistical Analysis
Cross-Sectional Data

With cross-sectional data, observations are made on a number of elements at a single date / for a single time period.

Example: data detailing the number of building permits issued in June 2006 in each of the regions of Italy
Time Series Data

With time series data, observations are made on a single entity at several dates / over several time periods.

Example: data detailing the number of building permits issued in Tuscany, Italy in each of the last 36 months
Panel Data

With panel data, observations are made on a number of elements at several dates / over several time periods.

Example: data detailing the number of building permits issued in each of the regions of Italy in each of the last 36 months
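One way to see how the three types of data set relate is to start from a panel and slice it. The following is a minimal sketch under stated assumptions: the permit counts are made-up placeholders (not real figures), only three regions and two months are shown, and the helper names `cross_section` and `time_series` are hypothetical.

```python
# Hypothetical panel: (region, month) -> number of building permits issued.
# All counts are invented placeholders for illustration only.
panel = {
    ("Tuscany", "2006-05"): 120, ("Tuscany", "2006-06"): 135,
    ("Lazio", "2006-05"): 210, ("Lazio", "2006-06"): 198,
    ("Umbria", "2006-05"): 40, ("Umbria", "2006-06"): 44,
}

def cross_section(panel, month):
    """Many elements observed in a single time period."""
    return {region: value for (region, m), value in panel.items() if m == month}

def time_series(panel, region):
    """A single element observed over several time periods, in date order."""
    return {m: value for (r, m), value in sorted(panel.items()) if r == region}

june_2006 = cross_section(panel, "2006-06")   # cross-sectional data
tuscany = time_series(panel, "Tuscany")       # time series data
print(june_2006)
print(tuscany)
```

Fixing the time period yields cross-sectional data, fixing the element yields time series data, and keeping both dimensions is the panel itself.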
Data Sources

■ Existing Sources

• Within a firm – almost any department
• Business database services – Economist Intelligence Unit, Reuters, Bloomberg
• Government agencies – European Commission, European Central Bank, Federal Reserve Bank of St. Louis
• Bureaus of statistics – Eurostat
• Industry associations – European Tourist Office
• Special-interest organizations – OECD, IMF, UNO
• Internet – more and more firms; Wikipedia
Data Sources

■ Statistical Studies

In experimental studies the variables of interest are first identified. Then one or more factors are controlled so that data can be obtained about how the factors influence the variables.

In observational (nonexperimental) studies no attempt is made to control or influence the variables of interest. A survey is a good example.
Data Acquisition Considerations

Time Requirement
• Searching for information can be time consuming.
• Information may no longer be useful by the time it
is available.
Cost of Acquisition
• Organizations often charge for information even
when it is not their primary business activity.
Data Errors
• Using any data that happen to be available or that were acquired with little care can lead to poor and misleading information.
Descriptive Statistics

■ Descriptive statistics are the tabular, graphical, and numerical methods used to summarize data.
■ Descriptive statistical tools are used whenever the
sole purpose of our analysis is to describe the
observed data values.
Example: Hudson Auto Repair

The manager of Hudson Auto would like to have a better understanding of the cost of parts used in the engine tune-ups performed in the shop. She examines 50 customer invoices for tune-ups. The costs of parts, rounded to the nearest euro, are listed on the next slide.
Example: Hudson Auto Repair

■ Sample of Parts Cost for 50 Tune-ups

91 78 93 57 75 52 99 80 97 62
71 69 72 89 66 75 79 75 72 76
104 74 62 68 97 105 77 65 80 109
85 97 88 68 83 68 72 69 67 74
62 82 98 101 79 105 79 69 62 73
Tabular Summary: Frequency and Percent Frequency

Parts Cost (€)    Frequency    Percent Frequency
50-59                  2               4
60-69                 13              26
70-79                 16              32
80-89                  7              14
90-99                  7              14
100-109                5              10
Total                 50             100

Percent frequency = (frequency / 50) × 100; for the 50-59 class, (2/50) × 100 = 4.
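The tabular summary can be reproduced from the 50 sample values. This is a minimal sketch, assuming classes of width 10 starting at 50 as in the table; the variable names are illustrative.

```python
from collections import Counter

# Sample of parts cost (in euros) for the 50 tune-ups listed on the slide.
parts_cost = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 72, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

class_width = 10
# Map each cost to the lower limit of its class, e.g. 57 -> 50 and 104 -> 100.
frequency = Counter((cost // class_width) * class_width for cost in parts_cost)

n = len(parts_cost)
percent_frequency = {lower: count / n * 100 for lower, count in frequency.items()}

for lower in sorted(frequency):
    upper = lower + class_width - 1
    print(f"{lower}-{upper}: {frequency[lower]:2d}  {percent_frequency[lower]:5.1f}%")
```

Integer division by the class width works here because every class boundary is a multiple of 10; classes with other limits would need explicit bin edges instead.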
Graphical Summary: Histogram

[Histogram: Tune-up Parts Cost. Vertical axis: Frequency (scale 2 to 18). Horizontal axis: Parts Cost (€), with classes 50-59, 60-69, 70-79, 80-89, 90-99, 100-109.]
Numerical Descriptive Statistics

■ The most common numerical descriptive statistic is the average (or mean).
■ Hudson’s average cost of parts, based on the 50
tune-ups studied, is € 79 (found by summing the
50 cost values and then dividing by 50).
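The calculation described above (sum the 50 cost values, then divide by 50) can be checked directly. A short sketch, with the list name `parts_cost` chosen for illustration:

```python
# Sample of parts cost (in euros) for the 50 tune-ups listed on the slide.
parts_cost = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 72, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

total_cost = sum(parts_cost)                 # sum of the 50 cost values
average_cost = total_cost / len(parts_cost)  # divide by the number of tune-ups
print(total_cost, average_cost)              # 3950 79.0
```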
Statistical Inference

■ In statistical inference the purpose of our analysis is to generalize the information extracted from the observed data to some larger population.
■ One needs to resort to statistical inference whenever
collecting data on the entire population is impossible
or would be too costly or time-consuming.
Statistical Inference

Population - the set of all elements of interest in a particular study

Sample - a subset of the population

Statistical inference - the process of using data obtained from a sample to make estimates and test hypotheses about the characteristics of a population

Census - collecting data for a population

Sample survey - collecting data for a sample


Process of Statistical Inference

1. Population consists of all tune-ups. Average cost of parts is unknown.
2. A sample of 50 engine tune-ups is examined.
3. The sample data provide a sample average parts cost of € 79 per tune-up.
4. The sample average is used to estimate the population average.
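The four steps above can be mimicked with a small simulation. This is only a sketch under loud assumptions: the "population" is an invented list of 10,000 random costs between 50 and 109 euros (in the Hudson example the real population is unknown), and the seed is fixed purely so the run is repeatable.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

# Step 1: hypothetical population of parts costs for all tune-ups (made-up values).
# In practice this list is not available, which is why a sample is needed.
population = [random.randint(50, 109) for _ in range(10_000)]
population_mean = sum(population) / len(population)

# Steps 2-3: examine a sample of 50 tune-ups and compute the sample average.
sample = random.sample(population, 50)
sample_mean = sum(sample) / len(sample)

# Step 4: the sample average is used as the estimate of the population average.
estimation_error = sample_mean - population_mean
print(f"estimate {sample_mean:.2f}, population {population_mean:.2f}, "
      f"error {estimation_error:+.2f}")
```

The estimate will generally not equal the population average exactly; the point of the sketch is that a sample of 50 gets close without examining all 10,000 values.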
Computers and Statistical Analysis
■ Statistical analysis often involves working with
large amounts of data.
■ Computer software is typically used to conduct the
analysis.
■ Statistical software packages such as Microsoft Excel, Minitab and SPSS are capable of data management, analysis, and presentation.
■ Instructions for using Excel, Minitab and SPSS are
provided in chapter appendices.
End of Unit 2
