Lecture_4

Statistics is the science of gathering, analyzing, interpreting, and presenting data, and is crucial in various fields such as business, medicine, and government for making informed decisions. It is divided into descriptive statistics, which summarizes data, and inferential statistics, which makes predictions about a population based on sample data. Understanding the differences between populations and samples, as well as the types of variables and levels of measurement, is essential for effective statistical analysis.

Uploaded by

asoulonjourney
Copyright
© All Rights Reserved

Statistics

What is Statistics?

• Science of gathering, analyzing, interpreting, and presenting data
• Branch of mathematics
• Facts and figures
• Measurement taken on a sample
Statistics is the scientific method that
enables us to make decisions as responsibly
as possible.
Statistics in Business
• Accounting — auditing and cost estimation
• Economics — regional, national, and international economic performance
• Finance — investments and portfolio management
• Management — human resources, compensation, and quality
management
• Management Information Systems — (ERP): performance of systems
which gather, summarize, and disseminate information to various
managerial levels
• Marketing — market analysis and consumer research
• International Business — market and demographic analysis
Statistics…
• The science of data to answer research
questions
– Formulate a research question(s) (hypothesis)
– Collect data
– Analyze and summarize data
– Draw conclusions to answer research
questions
• Statistical Inference
– In the presence of variation
Answers Questions from Everyday
Life
• Business: Will a new marketing strategy be
profitable?
• Industry: Will a product’s life exceed the
warranty period?
• Medicine: Will this year’s flu vaccine reduce the
chance of flu?
• Education: Will technology improve learning?
• Government: Will a change in interest rates
affect inflation?

Statistics: Science of
variability?
• Virtually everything varies
• Variation occurs among individuals
• Variation occurs within any one individual
as time passes
Population Versus Sample
• Population — the whole
– a collection of persons, objects, or items under study
– The entire group of individuals in a statistical study we
want information about.

• Census — gathering data from the entire population
• Sample — a portion of the whole
– a subset of the population
– a part of the population from which we actually collect
information, used to draw conclusions about the
whole (statistical inference)
Statistics can be split into two
broad categories

1. Descriptive statistics

2. Inferential statistics
Descriptive Statistics

• Collect data
– ex. Survey
• Present data
– ex. Tables and graphs
• Characterize data
– ex. Sample mean: $\bar{X} = \frac{\sum X_i}{n}$
Descriptive statistics..
• Encompasses the following:
– Graphical or pictorial display
– Condensation of large masses of data into a
form such as tables
– Preparation of summary measures to give a
concise description of complex information (e.g.
an average figure)
– Exhibition of patterns that may be found in sets
of information
Inferential Statistics
• Estimation
– ex. Estimate the population mean weight using the sample mean weight
• Hypothesis testing
– ex. Test the claim that the population mean weight is 120 pounds

Drawing conclusions and/or making decisions concerning a population based on sample results.
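
As a rough illustration of estimation and hypothesis testing, the sketch below uses a made-up sample of ten weights (hypothetical data, not from the lecture). It estimates the population mean with the sample mean and computes a one-sample t statistic for the claim that the population mean weight is 120 pounds. The t statistic is one common choice for this kind of test; the lecture itself does not name a specific method.

```python
import math
import statistics

# Hypothetical sample of weights in pounds (illustrative only, not lecture data).
sample_weights = [118, 125, 121, 130, 115, 122, 127, 119, 124, 129]

claimed_mean = 120  # the claim under test: population mean weight is 120 pounds

n = len(sample_weights)
sample_mean = statistics.mean(sample_weights)  # point estimate of the population mean
sample_sd = statistics.stdev(sample_weights)   # sample standard deviation (n - 1 divisor)

# One-sample t statistic: how many standard errors the sample mean
# lies from the claimed population mean.
standard_error = sample_sd / math.sqrt(n)
t_statistic = (sample_mean - claimed_mean) / standard_error

print(f"Estimated population mean: {sample_mean}")
print(f"t statistic for the claim (mean = {claimed_mean}): {t_statistic:.2f}")
```

A t statistic far from zero would count as evidence against the claim. Deciding how far is "far" requires a reference distribution and a significance level, which this sketch leaves out.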
Inferential Statistics..
• Especially relates to:
– Determining whether characteristics of a situation are unusual or if they have happened by chance
– Estimating values of numerical quantities and determining the reliability of those estimates
– Using past occurrences to attempt to predict the future
Process of Inferential Statistics

Population ($\mu$, parameter) → Select a random sample → Sample ($\bar{x}$, statistic) → Calculate $\bar{x}$ to estimate $\mu$
Population vs. Sample

Population: measures used to describe the population are called parameters.
Sample: measures computed from sample data are called statistics.
Parameter vs. Statistic

• Parameter — descriptive measure of the population
– Usually represented by Greek letters

• Statistic — descriptive measure of a sample
– Usually represented by Roman letters
Symbols for Population Parameters and
Sample Statistics
$\mu$ denotes population mean
$\sigma^2$ denotes population variance
$\sigma$ denotes population standard deviation
$\bar{x}$ denotes sample mean
$S^2$ denotes sample variance
$S$ denotes sample standard deviation
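
To make the distinction between these symbols concrete, here is a minimal sketch using Python's standard `statistics` module on a small made-up data set (the values are hypothetical). It assumes the usual convention that sample variance divides by n - 1 while population variance divides by N; the slides do not state the formulas.

```python
import statistics

# Hypothetical data set (illustrative only).
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Treating the values as a sample (divisor n - 1).
x_bar = statistics.mean(values)          # sample mean
s_squared = statistics.variance(values)  # sample variance S^2
s = statistics.stdev(values)             # sample standard deviation S

# Treating the same values as a whole population (divisor N).
mu = statistics.mean(values)                  # population mean
sigma_squared = statistics.pvariance(values)  # population variance
sigma = statistics.pstdev(values)             # population standard deviation

print(f"sample:     mean={x_bar}, variance={s_squared:.3f}, sd={s:.3f}")
print(f"population: mean={mu}, variance={sigma_squared:.3f}, sd={sigma:.3f}")
```

The mean is computed the same way in both cases; only the variance and standard deviation change with the divisor.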
Types of Variables
• Categorical (qualitative) variables have values that can only be placed into categories, such as “yes” and “no.”
• Numerical (quantitative) variables have values that represent quantities.
Types of Variables

Data
• Categorical (defined categories)
– Examples: Marital Status, Political Party, Eye Color
• Numerical
– Discrete (counted items). Examples: Number of Children, Defects per hour
– Continuous (measured characteristics). Examples: Weight, Voltage
Levels of Data Measurement

• Nominal — Lowest level of measurement


• Ordinal
• Interval
• Ratio — Highest level of measurement

Levels of Measurement

• A nominal scale classifies data into distinct categories in which no ranking is implied.

Categorical Variables         Categories
Personal Computer Ownership   Yes / No
Type of Stocks Owned          Growth / Value / Other
Internet Provider             Microsoft Network / AOL


Levels of Measurement

• An ordinal scale classifies data into distinct categories in which ranking is implied.

Categorical Variable             Ordered Categories
Student class designation        Freshman, Sophomore, Junior, Senior
Product satisfaction             Satisfied, Neutral, Unsatisfied
Faculty rank                     Professor, Associate Professor, Assistant Professor, Instructor
Standard & Poor’s bond ratings   AAA, AA, A, BBB, BB, B, CCC, CC, C, DDD, DD, D
Student Grades                   A, B, C, D, F
Levels of Measurement

• An interval scale is an ordered scale in which the difference between measurements is a meaningful quantity but the measurements do not have a true zero point.

• A ratio scale is an ordered scale in which the difference between the measurements is a meaningful quantity and the measurements have a true zero point.
Interval and Ratio Scales
Usage Potential of Various
Levels of Data
Ratio
Interval
Ordinal

Nominal
Data Level, Operations,
and Statistical Methods
Data Level   Meaningful Operations                               Statistical Methods
Nominal      Classifying and Counting                            Nonparametric
Ordinal      All of the above plus Ranking                       Nonparametric
Interval     All of the above plus Addition, Subtraction         Parametric
Ratio        All of the above plus Multiplication and Division   Parametric
Data preparation rules

• Data presented must be
– factual
– relevant
Before presentation always check:
• the source of the data
• that the data has been accurately transcribed
• that the figures are relevant to the problem
Methods of visual presentation
of data
• Table

         1st Qtr   2nd Qtr   3rd Qtr   4th Qtr
East     20.4      27.4      90        20.4
West     30.6      38.6      34.6      31.6
North    45.9      46.9      45        43.9
Methods of visual presentation
of data
• Graphs
[Figure: graph of East, West, and North values by quarter (1st Qtr to 4th Qtr); vertical axis 0 to 90]
Methods of visual presentation
of data
• Pie chart
[Figure: pie chart with slices for 1st Qtr, 2nd Qtr, 3rd Qtr, and 4th Qtr]
Methods of visual presentation
of data
• Multiple bar chart
[Figure: horizontal bars for East, West, and North grouped by quarter (1st Qtr to 4th Qtr); value axis 0 to 100]
Methods of visual presentation
of data
• Simple pictogram
[Figure: pictogram of North, East, and West by quarter (1st Qtr to 4th Qtr); vertical axis 0 to 100]
Frequency distributions
• Frequency tables

Observation Table
Class Interval   Frequency   Cumulative Frequency
< 20             13          13
< 40             18          31
< 60             25          56
< 80             15          71
< 100            9           80
Frequency diagrams
[Figures: Frequency and Cumulative Frequency plotted by class interval (< 20, < 40, < 60, < 80, < 100)]
Ungrouped Versus Grouped
Data

• Ungrouped data
• have not been summarized in any way
• are also called raw data
• Grouped data
• have been organized into a frequency
distribution
Example of Ungrouped
Data
Ages of a Sample of Managers from XYZ

42 26 32 34 57
30 58 37 50 30
53 40 30 47 49
50 40 32 31 40
52 28 23 35 25
30 36 32 26 50
55 30 58 64 52
49 33 43 46 32
61 31 30 40 60
74 37 29 43 54
Frequency Distribution of
Ages

Class Interval Frequency


20-under 30 6
30-under 40 18
40-under 50 11
50-under 60 11
60-under 70 3
70-under 80 1
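
The frequency distribution above can be reproduced from the raw ages. The sketch below is one simple way to do the counting, assuming each class includes its lower limit and excludes its upper limit ("20-under 30" holds ages 20 through 29). The variable names are illustrative.

```python
# Ages of the sample of 50 managers from the ungrouped data example.
ages = [
    42, 26, 32, 34, 57, 30, 58, 37, 50, 30,
    53, 40, 30, 47, 49, 50, 40, 32, 31, 40,
    52, 28, 23, 35, 25, 30, 36, 32, 26, 50,
    55, 30, 58, 64, 52, 49, 33, 43, 46, 32,
    61, 31, 30, 40, 60, 74, 37, 29, 43, 54,
]

class_width = 10
first_lower_limit = 20
number_of_classes = 6

# Count how many ages fall in each class "lower-under upper"
# (lower limit included, upper limit excluded).
frequencies = [0] * number_of_classes
for age in ages:
    class_index = (age - first_lower_limit) // class_width
    frequencies[class_index] += 1

for i, count in enumerate(frequencies):
    lower = first_lower_limit + i * class_width
    print(f"{lower}-under {lower + class_width}: {count}")
```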

Data Range

42 26 32 34 57
30 58 37 50 30
53 40 30 47 49
50 40 32 31 40
52 28 23 35 25
30 36 32 26 50
55 30 58 64 52
49 33 43 46 32
61 31 30 40 60
74 37 29 43 54

Smallest = 23
Largest = 74
Range = Largest - Smallest = 74 - 23 = 51
Number of Classes and Class
Width
• The number of classes should be between 5 and 15.
• Fewer than 5 classes cause excessive summarization.
• More than 15 classes leave too much detail.
• Class Width
• Divide the range by the number of classes for an
approximate class width
• Round up to a convenient number

$\text{Approximate Class Width} = \frac{51}{6} = 8.5$
Class Width = 10
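
The class-width arithmetic can be sketched in a few lines. "Round up to a convenient number" is a judgment call; the code below assumes, purely for illustration, that it means the next multiple of 5, which gives the width of 10 used on the slide.

```python
import math

smallest, largest = 23, 74  # smallest and largest ages in the data
number_of_classes = 6

data_range = largest - smallest                     # 74 - 23
approximate_width = data_range / number_of_classes  # 51 / 6

# Assumption: "a convenient number" is taken to be the next multiple of 5.
convenient_step = 5
class_width = math.ceil(approximate_width / convenient_step) * convenient_step

print(f"Range = {data_range}")
print(f"Approximate class width = {approximate_width}")
print(f"Class width = {class_width}")
```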

Class Midpoint

$\text{Class Midpoint} = \frac{\text{beginning class endpoint} + \text{ending class endpoint}}{2} = \frac{30 + 40}{2} = 35$

$\text{Class Midpoint} = \text{class beginning point} + \frac{1}{2}(\text{class width}) = 30 + \frac{1}{2}(10) = 35$
Relative Frequency

Class Interval   Frequency   Relative Frequency
20-under 30      6           .12  (= 6/50)
30-under 40      18          .36  (= 18/50)
40-under 50      11          .22
50-under 60      11          .22
60-under 70      3           .06
70-under 80      1           .02
Total            50          1.00
Cumulative Frequency

Class Interval   Frequency   Cumulative Frequency
20-under 30      6           6
30-under 40      18          24  (= 18 + 6)
40-under 50      11          35  (= 11 + 24)
50-under 60      11          46
60-under 70      3           49
70-under 80      1           50
Total            50
Class Midpoints, Relative Frequencies,
and Cumulative Frequencies

Class Interval   Frequency   Midpoint   Relative Frequency   Cumulative Frequency
20-under 30      6           25         .12                  6
30-under 40      18          35         .36                  24
40-under 50      11          45         .22                  35
50-under 60      11          55         .22                  46
60-under 70      3           65         .06                  49
70-under 80      1           75         .02                  50
Total            50                     1.00
Cumulative Relative Frequencies

Class Interval   Frequency   Relative Frequency   Cumulative Frequency   Cumulative Relative Frequency
20-under 30      6           .12                  6                      .12
30-under 40      18          .36                  24                     .48
40-under 50      11          .22                  35                     .70
50-under 60      11          .22                  46                     .92
60-under 70      3           .06                  49                     .98
70-under 80      1           .02                  50                     1.00
Total            50          1.00
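
The midpoint, relative frequency, cumulative frequency, and cumulative relative frequency columns all follow from the class limits and frequencies. A minimal sketch, with illustrative variable names:

```python
from itertools import accumulate

class_limits = [(20, 30), (30, 40), (40, 50), (50, 60), (60, 70), (70, 80)]
frequencies = [6, 18, 11, 11, 3, 1]

total = sum(frequencies)

# Midpoint: average of the beginning and ending class endpoints.
midpoints = [(lower + upper) / 2 for lower, upper in class_limits]

# Relative frequency: class frequency divided by the total.
relative = [f / total for f in frequencies]

# Cumulative frequency: running total of the frequencies.
cumulative = list(accumulate(frequencies))

# Cumulative relative frequency: cumulative frequency divided by the total.
cumulative_relative = [c / total for c in cumulative]

rows = zip(class_limits, frequencies, midpoints, relative, cumulative, cumulative_relative)
for (lower, upper), f, m, r, c, cr in rows:
    print(f"{lower}-under {upper}  {f:>2}  {m:>3.0f}  {r:.2f}  {c:>2}  {cr:.2f}")
```

Dividing the cumulative counts by the total, rather than summing the rounded relative frequencies, avoids accumulating rounding error.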

Common Statistical Graphs
• Histogram -- vertical bar chart of frequencies
• Frequency Polygon -- line graph of frequencies
• Ogive -- line graph of cumulative frequencies
• Pie Chart -- proportional representation for
categories of a whole
• Stem and Leaf Plot
• Pareto Chart
• Scatter Plot

Histogram

Class Interval   Frequency
20-under 30      6
30-under 40      18
40-under 50      11
50-under 60      11
60-under 70      3
70-under 80      1

[Figure: histogram of the frequencies; horizontal axis: Years (0 to 80); vertical axis: Frequency (0 to 20)]
Histogram Construction

Class Interval   Frequency
20-under 30      6
30-under 40      18
40-under 50      11
50-under 60      11
60-under 70      3
70-under 80      1

[Figure: histogram under construction; horizontal axis: Years (0 to 80); vertical axis: Frequency (0 to 20)]
Frequency Polygon

Class Interval   Frequency
20-under 30      6
30-under 40      18
40-under 50      11
50-under 60      11
60-under 70      3
70-under 80      1

[Figure: frequency polygon; horizontal axis: Years (0 to 80); vertical axis: Frequency (0 to 20)]
Ogive

Class Interval   Cumulative Frequency
20-under 30      6
30-under 40      24
40-under 50      35
50-under 60      46
60-under 70      49
70-under 80      50

[Figure: ogive; horizontal axis: Years (0 to 80); vertical axis: Frequency (0 to 60)]
Relative Frequency Ogive

Class Interval   Cumulative Relative Frequency
20-under 30      .12
30-under 40      .48
40-under 50      .70
50-under 60      .92
60-under 70      .98
70-under 80      1.00

[Figure: relative frequency ogive; horizontal axis: Years (0 to 80); vertical axis: Cumulative Relative Frequency (0.00 to 1.00)]
Complaints by Passengers
[Figure: pie chart with the following slices]
Stations, Etc.: 40%
Train Performance: 21%
Equipment: 15%
Personnel: 14%
Schedules, Etc.: 10%
Second Quarter Truck Production

Company   2d Quarter Truck Production
A         357,411
B         354,936
C         160,997
D         34,099
E         12,747
Totals    920,190
Pie Chart Calculations for
Company A

Company   2d Quarter Truck Production   Proportion   Degrees
A         357,411                       .388         140
B         354,936                       .386         139
C         160,997                       .175         63
D         34,099                        .037         13
E         12,747                        .014         5
Totals    920,190                       1.000        360

For Company A: $\frac{357{,}411}{920{,}190} = .388$ and $.388 \times 360 \approx 140$
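
The same proportion and degree calculation can be applied to every company. A minimal sketch using the production figures from the table (the variable names are illustrative):

```python
# Second-quarter truck production by company, from the table.
production = {"A": 357_411, "B": 354_936, "C": 160_997, "D": 34_099, "E": 12_747}

total = sum(production.values())

# Each slice's share of the whole, and its angle out of 360 degrees.
slices = {}
for company, units in production.items():
    proportion = units / total
    degrees = proportion * 360
    slices[company] = (round(proportion, 3), round(degrees))

for company, (proportion, degrees) in slices.items():
    print(f"{company}: proportion {proportion:.3f}, {degrees} degrees")
```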

Second Quarter
Truck Production
[Figure: pie chart with slices A 39%, B 39%, C 17%, D 4%, E 1%]
Pareto Chart
[Figure: Pareto chart; categories: Poor Wiring, Short in Coil, Defective Plug, Other; left axis: Frequency (0 to 100); right axis: 0% to 100%]
Scatter Plot

Registered Vehicles (1000's)   Gasoline Sales (1000's of Gallons)
5                              60
15                             120
9                              90
15                             140
7                              60

[Figure: scatter plot; horizontal axis: Registered Vehicles (0 to 20); vertical axis: Gasoline Sales (0 to 200)]
Principles of Excellent Graphs

• The graph should not distort the data.
• The graph should not contain unnecessary adornments (sometimes referred to as chart junk).
• The scale on the vertical axis should begin at zero.
• All axes should be properly labeled.
• The graph should contain a title.
• The simplest possible graph should be used for a given set of data.
Graphical Errors: Chart Junk

[Figure: two presentations of Minimum Wage. Bad Presentation: values labeled 1960: $1.00, 1970: $1.60, 1980: $3.10, 1990: $3.80. Good Presentation: vertical axis in $ (0, 2, 4); horizontal axis 1960, 1970, 1980, 1990]
Graphical Errors:
Compressing the Vertical Axis
[Figure: two presentations of Quarterly Sales for Q1 to Q4. Bad Presentation: vertical axis from $0 to $200. Good Presentation: vertical axis from $0 to $50]
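Graphical Errors: No Zero Point
on the Vertical Axis
[Figure: two presentations of Monthly Sales for the first six months (J F M A M J). Bad Presentation: vertical axis in $ marked 36, 39, 42, 45 with no zero point. Good Presentations: vertical axis in $ marked 0, 36, 39, 42, 45]

Graphing the first six months of sales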


• http://www.stats.gla.ac.uk/steps/glossary/presenting_data.html
• http://www.ilir.uiuc.edu/courses/lir593/