Chapter 1
Data Information
Questions:
• What information to note?
• How to generate and extract this
information?
Copyright © 2009 Cengage Learning
I. What is statistics?
1. Why study statistics?
• Being informed
Your ability to be informed thoroughly
Extract information from tables, charts, and
graphs
Follow numerical arguments
I. What is statistics?
1. Why study statistics?
• Making informed judgments
Example:
How should you select an online seller on
eBay based on their feedback scores and
detailed ratings?
If you know the unemployment rate among
new graduates of a particular major,
would you choose that major?
Questions:
• What information to note?
• How to make your own judgments based on the
available information?
I. What is statistics?
1. Why study statistics?
• Making informed judgments
Your ability to make informed judgments
Decide whether existing information is adequate or
whether additional information is required
If necessary, collect more information in a
reasonable and thoughtful way
Summarize the available data in a useful and
informative manner
Analyze the available data
I. What is statistics?
2. Understand the nature of
probabilities & statistics
Examples: interest rates, population, stock market
prices, unemployment rate…
I. What is statistics?
- Furthermore:
- Collect
- Describe
- Summarize
- Present
- Analyze
II/ Key concepts in statistics
1/ Elementary units vs The Frame
Elementary units
The persons or objects that have
characteristics of interest to statisticians
The frame
A complete listing of all elementary units
relevant to a statistical investigation
II/ Key concepts in statistics
2/ Variables and Data
Variables
Characteristics of interest of elementary
units
Data
A single observation about a specified
characteristic of interest is called a datum.
Any collection of observations about one or
more characteristics of interest, for one or
more elementary units, is called a data set.
A data set may be univariate, bivariate, or
multivariate.
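To make these terms concrete, here is a small Python sketch. The student records and variable names are invented for illustration and do not come from the course material.

```python
# Hypothetical data: each dict is one elementary unit (a student),
# each key is a variable, and each value is a datum.
students = [
    {"name": "An", "height_cm": 160, "weight_kg": 52, "gpa": 3.2},
    {"name": "Binh", "height_cm": 172, "weight_kg": 65, "gpa": 2.9},
    {"name": "Chi", "height_cm": 165, "weight_kg": 55, "gpa": 3.6},
]


def data_set(units, variables):
    """Collect the observations on the chosen variables for every unit."""
    return [tuple(unit[v] for v in variables) for unit in units]


univariate = data_set(students, ["height_cm"])                         # one variable
bivariate = data_set(students, ["height_cm", "weight_kg"])             # two variables
multivariate = data_set(students, ["height_cm", "weight_kg", "gpa"])   # three or more

print(univariate)
print(bivariate)
print(multivariate)
```

The frame in this sketch would be the complete list of students; the three records stand in for it only to keep the example short.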
II/ Key concepts in statistics
3/ Population vs Sample
Population
is the WHOLE group of all possible
observations about a specified characteristic
of interest.
A descriptive measure of a population is
called a parameter
Sample
is a subset of data drawn from the
population.
A descriptive measure of a sample is called a
statistic
3/Population vs. Sample
Population: a b c d e f g h i j k l m n o p q r s t u v w x y z
Sample: b c g i n o r u y
A politician campaigning for mayor of a city
with 25,000 voters conducts a sample survey.
Of the 200 people asked, 48% say they will
vote for him. Identify:
• The statistical population
• The sample
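One possible reading of this exercise, sketched in Python: the population is the 25,000 voters, the sample is the 200 people surveyed, and 48% is a statistic. The variable names are illustrative. The share of all voters who support the candidate (the parameter) is not given and stays unknown.

```python
# Figures taken from the exercise above.
population_size = 25_000      # all voters in the city
sample_size = 200             # voters who were actually asked
sample_proportion = 0.48      # statistic: share of the sample supporting the candidate

# Number of sampled voters who said they would vote for the candidate.
supporters_in_sample = round(sample_proportion * sample_size)

# Fraction of the population that was sampled.
sampling_fraction = sample_size / population_size

print(supporters_in_sample)
print(sampling_fraction)
```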
Statistic: a descriptive measure of a sample.
Parameter: a descriptive measure of a population.
A sample is a subset of the population.
Populations have parameters,
samples have statistics.
Statistics
- Collect data
e.g., Survey, Observation, Experiments
- Present data
e.g., Charts and graphs
- Characterize data (summary measures of the observations x_i)
Inferential Statistics
• Procedures used to draw conclusions or
inferences about the characteristics of
a population from information obtained
from the sample.
• Making estimates, testing hypotheses…
• Used when we cannot enumerate the
whole population
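The idea of inferring a population characteristic from a sample can be sketched as follows. The population here is simulated (normally distributed heights, an assumption made purely for illustration), which is the only reason the parameter can be computed and compared with the statistic.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Simulated population (an assumption for illustration): 25,000 heights in cm.
population = [random.gauss(165, 8) for _ in range(25_000)]

# In practice the parameter is unknown; it can be computed here only
# because the whole population is simulated.
parameter = statistics.mean(population)

# Draw a simple random sample of 200 units and compute the statistic.
sample = random.sample(population, 200)
statistic = statistics.mean(sample)

print(f"population mean (parameter): {parameter:.2f}")
print(f"sample mean (statistic):     {statistic:.2f}")
```

The sample mean is used as an estimate of the population mean; the two values are typically close but not identical.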
Inferential Statistics
Sample statistics (known)
→ Inference →
Population parameters (unknown, but can be estimated from sample evidence)
Inferential Statistics
Sample statistic → Inference → Population parameter
Variables
Variables can be classified as being qualitative or quantitative.
Depending on whether the variables are qualitative or quantitative, we choose the most appropriate statistical methods.
In general, there are more statistical analyses available for quantitative variables.
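As a rough illustration of how the variable type guides the choice of method, the sketch below summarizes a qualitative variable by counting categories and a quantitative variable by averaging. The observations are invented for illustration.

```python
from collections import Counter
import statistics

# Hypothetical observations, invented for illustration.
marital_status = ["Single", "Married", "Single", "Divorced", "Married", "Single"]  # qualitative
heights_cm = [160, 172, 165, 158, 180, 169]                                        # quantitative

# Qualitative variable: count how often each category occurs.
status_counts = Counter(marital_status)

# Quantitative variable: numerical summaries such as the mean make sense.
mean_height = statistics.mean(heights_cm)

print(status_counts)
print(mean_height)
```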
Qualitative or Quantitative variable?
Marital Status
Gender
Height
Ages
Student Evaluation
Grades
Options: Qualitative, Quantitative
Qualitative Variable
• Gender:
1. Male 2. Female
• Eye colors:
1. Brown 2. Black 3. Blue 4. Green
• Marital status:
1. Single
2. Married
3. Divorced
4. Widowed
Quantitative Variable
A quantitative variable is a variable that is normally expressed numerically.
It indicates how many or how much:
E.g.
(i) The number of students in a class
(ii) The number of correct answers in a test
(iii) People’s height, weight; students’ GPA
4/ Scales of Measurement
Scales of measurement include:
Nominal, Ordinal, Interval, Ratio
The scale determines the amount of information contained in the data.
The scale indicates the data summarization and statistical analyses that are most appropriate.
Levels of measurement
Lowest level: nominal scale (basic analysis), e.g., categorical codes, ID numbers, category names
Highest level: ratio scale
Scales of Measurement
Nominal
Data are labels or names used to identify a characteristic of the elementary units.
There is no relative order or rank between these data categories.
Numeric codes are assigned for each data category.
Scales of Measurement
Nominal
It never makes sense to add, subtract, multiply, divide, rank, or average nominal codes.
Used to count the frequency of each outcome of a variable.
E.g., numbers assigned to a person’s gender
or numbers assigned to a person’s marital status.
Example
Students of a university are classified by the school in which they are enrolled, such as Business, Humanities, Education, and so on.
A numeric code could be used for the school variable (e.g. 1 denotes Business, 2 denotes Humanities, 3 denotes Education, and so on).
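The school example can be sketched in Python to show which operations are meaningful for nominal codes. The codes follow the example; the eight coded observations are invented for illustration.

```python
from collections import Counter

# Codes from the example: 1 = Business, 2 = Humanities, 3 = Education.
school_names = {1: "Business", 2: "Humanities", 3: "Education"}

# Hypothetical coded observations for eight students (invented data).
school_codes = [1, 3, 2, 1, 1, 2, 3, 1]

# Meaningful: frequency of each category.
frequencies = Counter(school_names[code] for code in school_codes)

# Not meaningful: the arithmetic mean of the codes.
# It evaluates to 1.75, but "1.75" is not a school.
meaningless_mean = sum(school_codes) / len(school_codes)

print(frequencies)
print(meaningless_mean)
```

Renumbering the schools (say, 3 for Business and 1 for Education) would change the mean but not the frequencies, which is why only counting is appropriate here.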
Scales of Measurement
Ordinal
The data have the properties of nominal data, and the order or rank of the data is meaningful.
Numeric codes may be used that indicate the rank or order of the data categories.
The gaps between numbers or units on this scale do not represent equal differences between variable outcomes.
Monthly income
1. < 3 million VND 2. 3-5 million VND 3. > 5 million VND
Ranking of favorite type of music
1. Pop 2. Rock 3. Hip-hop 4. Other
Just like nominal data, it never makes sense to add, subtract, multiply, divide, or average ordinal codes.
Unlike nominal data, the categories can be ranked.
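Because ordinal codes carry order, the responses can be sorted and a middle category reported, even though averaging the codes is not meaningful. A minimal sketch using the income brackets above; the seven responses are invented for illustration.

```python
import statistics

# Income brackets from the example, coded in increasing order.
brackets = {1: "< 3 million VND", 2: "3-5 million VND", 3: "> 5 million VND"}

# Hypothetical coded responses from seven people (invented data).
responses = [2, 1, 3, 2, 1, 2, 3]

# Meaningful: ordering the responses and reporting the middle category.
ordered = sorted(responses)
median_code = statistics.median_low(ordered)  # always returns an actual code
median_bracket = brackets[median_code]

print(ordered)
print(median_bracket)
```

`median_low` is used rather than `median` so that the result is always one of the existing codes and never a value halfway between two brackets.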
Scales of Measurement
Interval
Interval data have the properties of ordinal data, and the intervals between numbers or units on the scale are equal over all levels of the scale.
Interval scales provide more quantitative information.
There is no zero value that indicates that nothing exists for the variable at the zero point.
Scales of Measurement
Interval
Addition and subtraction are permissible, but multiplication and division continue to make no sense.
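Temperature in degrees Celsius or Fahrenheit is a common example of an interval scale. It is not mentioned in the slides and is used here as an assumed illustration. The sketch shows that differences stay consistent across units while ratios do not.

```python
def celsius_to_fahrenheit(c):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32


# Temperature is used here as an assumed example of an interval scale.
morning_c, noon_c = 10.0, 20.0
morning_f = celsius_to_fahrenheit(morning_c)
noon_f = celsius_to_fahrenheit(noon_c)

# Differences are meaningful: a 10 °C gap always corresponds to an 18 °F gap.
difference_c = noon_c - morning_c
difference_f = noon_f - morning_f

# Ratios are not meaningful: they change with the unit.
ratio_c = noon_c / morning_c
ratio_f = noon_f / morning_f

print(difference_c, difference_f)
print(ratio_c, ratio_f)
```

Since 0 °C does not mean "no temperature", saying that 20 °C is "twice as hot" as 10 °C has no physical meaning, and the ratio in Fahrenheit confirms this by giving a different number.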
Scales of Measurement
Ratio
Ratio data have all the properties of interval data, and the ratio of two values is meaningful.
This scale must contain a zero value that indicates that nothing exists for the variable at the zero point.
Variables such as distance, height, weight, and time use the ratio scale.
Scales of Measurement
Ratio
All types of arithmetic operations, even multiplication and division, can be performed with such data.
• Student A's spending in July 2008: 1,640,000 VND
• Student B's spending in July 2008: 3,280,000 VND
Converted to USD (exchange rate: 1 USD = 16,400 VND):
Student A's spending in July 2008: 100 USD
Student B's spending in July 2008: 200 USD
If student A is robbed, the budget will be 0
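The spending example can be checked with a few lines of Python: on a ratio scale, the ratio of two values is the same in any unit, and zero means the same thing in every unit. The dictionary layout and variable names are illustrative.

```python
VND_PER_USD = 16_400  # exchange rate given in the example

spending_vnd = {"A": 1_640_000, "B": 3_280_000}

# Convert each student's spending to USD.
spending_usd = {student: vnd / VND_PER_USD for student, vnd in spending_vnd.items()}

# On a ratio scale the ratio of two values does not depend on the unit.
ratio_vnd = spending_vnd["B"] / spending_vnd["A"]
ratio_usd = spending_usd["B"] / spending_usd["A"]

# Zero is a true zero: 0 VND and 0 USD both mean "no money".
zero_usd = 0 / VND_PER_USD

print(spending_usd)
print(ratio_vnd, ratio_usd)
print(zero_usd)
```

Student B spends twice as much as student A whether the amounts are expressed in VND or in USD.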
Requirements:
1/ Answer the questions on your own
2/ Identify the scale of measurement used
for each question
III/ Data Analysis Process
• Reliability
• Validity
– Data collection
Statistical Methods:
Presenting data: tables, charts, and graphs
Presenting data: descriptive statistics
III/ Data Analysis Process
Statistical Methods:
Estimation, Hypothesis testing,
Variance analysis, Regression and
Correlation, Time series and Forecasting,
Index numbers…
III/ Data Analysis Process
• Study how students of Foreign Trade University
use their leisure time