Lecture 1
Data
Data Sources
Descriptive Statistics
Statistical Inference
Computers and Statistical Analysis
Applications in Economics
Statistics: a methodology for using data to
learn the “truth,” i.e., to uncover the true
data-generating mechanism.
Marketing
Electronic point-of-sale scanners at
retail checkout counters are used to
collect data for a variety of marketing
research applications.
Production
Statistical quality
control charts are used to monitor
the output of a production process.
Applications in Finance
Finance
Data Set
Data and Data Sets
Data are the facts and figures collected,
summarized, analyzed, and interpreted.
The data collected in a particular study are referred
to as the data set.
Elements, Variables, and Observations
[Diagram: classification of data]
Data
  Qualitative
    Numerical: Nominal, Ordinal
    Nonnumerical: Nominal, Ordinal
  Quantitative
    Numerical: Interval, Ratio
Scales of Measurement
Scales of measurement include: Nominal, Ordinal, Interval, Ratio
The scale determines the amount of information contained in the data.
The scale indicates the data summarization and statistical analyses that are most appropriate.
Scales of Measurement
Nominal
Data are labels or names used to identify an attribute of the element.
A nonnumeric label or numeric code may be used.
Scales of Measurement
Nominal
Example:
Students of a university are classified by the dorm that they live in using a nonnumeric label such as Farley, Keenan, Zahm, Breen-Phillips, and so on.
A numeric code can be used for the dorm variable (e.g., 1: Farley, 2: Keenan, 3: Zahm, and so on).
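As a minimal sketch of what a nominal scale does and does not support, the Python snippet below codes the dorm variable numerically. The three codes come from the example; the list of student observations is hypothetical and exists only for illustration.

```python
from collections import Counter

# Numeric codes for the dorm variable, taken from the example.
DORM_CODES = {1: "Farley", 2: "Keenan", 3: "Zahm"}

# Hypothetical coded observations for six students (illustrative only).
observations = [1, 3, 2, 1, 1, 3]

# Counting how often each category occurs is meaningful for nominal data.
counts = Counter(DORM_CODES[code] for code in observations)
print(counts)

# Arithmetic on the codes runs without error but has no interpretation:
# an "average dorm" of 1.83 does not correspond to any dorm.
meaningless_mean = sum(observations) / len(observations)
print(round(meaningless_mean, 2))
```

The codes are only names, so a frequency count is about as far as the analysis can sensibly go.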
Scales of Measurement
Ordinal
The data have the properties of nominal data and the order or rank of the data is meaningful.
A nonnumeric label or numeric code may be used.
Scales of Measurement
Ordinal
Example:
Students of a university are classified by their class standing using a nonnumeric label such as Freshman, Sophomore, Junior, or Senior.
A numeric code can be used for the class standing variable (e.g., 1 denotes Freshman, 2 denotes Sophomore, and so on).
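A short Python sketch of the extra information an ordinal scale carries: the codes can be ranked and compared. The codes 3 and 4 for Junior and Senior are an assumed continuation of "and so on", and the list of students is hypothetical.

```python
# Numeric codes for class standing; codes 3 and 4 are an assumed
# continuation of the pattern "1 denotes Freshman, 2 denotes Sophomore".
STANDING_CODES = {"Freshman": 1, "Sophomore": 2, "Junior": 3, "Senior": 4}

# Hypothetical students (illustrative only).
students = ["Junior", "Freshman", "Senior", "Sophomore"]

# Sorting by code ranks the students, which is meaningful for ordinal data.
ranked = sorted(students, key=STANDING_CODES.get)
print(ranked)

# Order comparisons are meaningful as well.
senior_outranks_sophomore = STANDING_CODES["Senior"] > STANDING_CODES["Sophomore"]
print(senior_outranks_sophomore)
```

Differences between the codes are still not interpretable, because the scale does not define a fixed unit between ranks.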
Scales of Measurement
Interval
The data have the properties of ordinal data, and the interval between observations is expressed in terms of a fixed unit of measure.
Interval data are always numeric.
Scales of Measurement
Interval
Example: Average Starting Salary Offer 2003
Economics/Finance: $40,084
History: $32,108
Psychology: $27,454
Econ & Finance majors earn $7,976 more than History majors and $12,630 more than Psychology majors.
Source: National Association of Colleges and Employers
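The differences quoted in this example can be checked with a few lines of Python. This is only a sketch of the arithmetic; the variable names are illustrative.

```python
# Average starting salary offers (2003), in dollars, from the example.
salaries = {
    "Economics/Finance": 40084,
    "History": 32108,
    "Psychology": 27454,
}

base = salaries["Economics/Finance"]

# Differences are meaningful on an interval scale because the unit
# (one dollar) is fixed.
differences = {
    major: base - offer
    for major, offer in salaries.items()
    if major != "Economics/Finance"
}

for major, diff in differences.items():
    print(f"Econ & Finance minus {major}: ${diff:,}")
```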
Scales of Measurement
Ratio
The data have all the properties of interval data and the ratio of two values is meaningful.
Variables such as distance, height, weight, and time use the ratio scale.
This scale must contain a zero value that indicates that nothing exists for the variable at the zero point.
Scales of Measurement
Ratio
Example:
Econ & Finance majors' salaries are about 1.25 times History majors' salaries and about 1.46 times Psychology majors' salaries.
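A quick Python check of the ratios, using the 2003 average starting salary offers given earlier in the lecture. Note that the History ratio is roughly 1.248, which rounds to 1.25 and truncates to 1.24. Variable names are illustrative.

```python
# Average starting salary offers (2003), in dollars.
econ_finance = 40084
history = 32108
psychology = 27454

# Ratios are meaningful because salary has a true zero point.
ratio_history = econ_finance / history
ratio_psychology = econ_finance / psychology

print(round(ratio_history, 2))
print(round(ratio_psychology, 2))
```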
Qualitative and Quantitative Data
Data can be qualitative or quantitative.
The appropriate statistical analysis depends on whether the data for the variable are qualitative or quantitative.
There are more options for statistical analysis when the data are quantitative.
Qualitative Data
Labels or names used to identify an attribute of each element. E.g., black or white, male or female.
Referred to as categorical data
Use either the nominal or ordinal scale of measurement
Can be either numeric or nonnumeric
Appropriate statistical analyses are rather limited
Quantitative Data
Quantitative data indicate how many or how much:
Discrete, if measuring how many. E.g., number of 6-packs consumed at tail-gate party
Continuous, if measuring how much. E.g., pounds of hamburger consumed at tail-gate party
Quantitative data are always numeric.
Ordinary arithmetic operations are meaningful for quantitative data.
Cross-Sectional Data
Cross-sectional data are observations across individuals at the same point in time.
Example: the growth rate from 1960 to 2004 of each country in the world (about 182 of them).
Example: wages for head of household in Indiana
Time Series Data
Time series data are collected over several time periods.
Example: the sequence of U.S. GDP growth each year from 1960 to 2005
Example: the sequence of Professor Mark’s wage each year from 1983 to 2005.
Data Sources
Existing Sources
Within a firm – almost any department
Business database services – Dow Jones & Co.
Government agencies – U.S. Department of Labor
Industry associations – Travel Industry Association of America
Special-interest organizations – Graduate Management Admission Council
Collect your own
Data Sources
Statistical Studies
In experimental studies, variables of interest are identified. Then additional factors are varied to obtain data that tell us how those factors influence the variables.
In observational (nonexperimental) studies, we cannot control or influence the variables of interest. A survey is a good example.
Descriptive Statistics
Descriptive statistics are the tabular, graphical,
and numerical methods used to summarize
data.
Example: Hudson Auto Repair
91 78 93 57 75 52 99 80 97 62
71 69 72 89 66 75 79 75 72 76
104 74 62 68 97 105 77 65 80 109
85 97 88 68 83 68 71 69 67 74
62 82 98 101 79 105 79 69 62 73
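Tabular Summary:
Frequency and Percent Frequency
[Figure: frequency chart of parts cost. Vertical axis ticks shown: 2, 4, 6, 8, 10. Horizontal axis: Parts Cost ($), classes 50-59, 60-69, 70-79, 80-89, 90-99, 100-110.]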
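A frequency and percent frequency summary of the Hudson Auto Repair data can be sketched in Python as follows. The class width of 10 is taken from the axis labels; treating the last class as 100-109 is an assumption (it covers the largest value, 109).

```python
from collections import Counter

# Parts costs ($) for the 50 Hudson Auto Repair tune-ups.
costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

# Assign each cost to a class by its lower bound (50, 60, ..., 100),
# assuming classes of width 10.
frequencies = Counter((cost // 10) * 10 for cost in costs)

for lower in sorted(frequencies):
    freq = frequencies[lower]
    percent = 100 * freq / len(costs)
    print(f"{lower}-{lower + 9}: frequency {freq}, percent {percent:.0f}%")
```

With these data the 70-79 class is the most frequent, holding 16 of the 50 tune-ups (32%).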
Numerical Descriptive Statistics
The most common numerical descriptive statistic
is the average (or sample mean).
Hudson’s average cost of parts, based on the 50
tune-ups studied, is about $79 (found by summing the
50 cost values and then dividing by 50).
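The sample mean can be reproduced directly from the 50 Hudson Auto Repair values. The snippet below is a minimal sketch of the sum-and-divide calculation; variable names are illustrative.

```python
# Parts costs ($) for the 50 Hudson Auto Repair tune-ups.
costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

# Sample mean: sum the observations, then divide by the number of them.
total = sum(costs)
n = len(costs)
sample_mean = total / n

print(total, n, sample_mean)
```

The total is 3,949, so the mean is 78.98, which rounds to $79.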
Statistical Inference