Quantitative Techniques A COQT111: Robbie Stewart
1.1 Introduction
• Management Decision Making
➢Statistics is used by managers to assist them in their
decision making. Statistics forms a “management decision
support system”.
• What is Statistics?
➢Statistics can be defined as the science of collecting,
organising, analysing and interpreting data in order to
make decisions
• The language of statistics continued
➢A sample statistic describes a characteristic of a sample
1.5 Statistical Applications in Management
➢Finance
➢Marketing
➢Human Resources
➢Operations and Logistics
• Types of Data
➢Continuous data
✓ Data can take on any numerical value
✓ Can be fractions or decimals
✓ e.g. – The distance a student travels to campus in
kilometres
• Measurement scales
➢Nominal data is the weakest form of data to analyse, has
no numerical properties, no specific order and is
associated with categorical data e.g. male, female
➢Ordinal data is also associated with categorical data but
has an implied ranking, is stronger than nominal data and
each consecutive category possesses either more or less
than the previous category e.g. small, medium, large
➢Interval data is associated with numeric data and
quantitative random variables
➢Ratio data consists of real numbers associated with
quantitative random variables, has all the properties of
numbers and is the strongest data for statistical analysis
1.9 Data Collection Methods
➢ A personal interview is a face to face questionnaire:
✓ Disadvantages of telephone interviews include:
❖ Lack of anonymity
❖ Non-verbal responses cannot be observed
❖ Trained interviewers are required which increases costs
❖ Interviewer bias can occur
❖ Data can be lost if the respondent puts down the phone
❖ Sampling bias can occur as only people with phones can
be interviewed
✓ Disadvantages of e-surveys over personal interviews
include:
❖ Limited sampling frames
❖ Sampling bias can occur as only people with email,
internet access or mobile phones can respond
Chapter 2 Summarising Data
Prescribed sections 2.1 – 2.3 p 26 – 42
Do not study Stacked Bar Charts, Multiple Bar Charts, Histograms, Box Plot, Trendline
Graphs, Lorenz Curve and Pareto Curve.
2.2 Summarising Categorical Data
• Single Categorical Variable
– To construct a categorical frequency table
➢ List the categorical variables (column 1)
➢ Count the number of occurrences (column 2)
➢ For a percentage frequency table convert the count to a
percentage (column 3)
– To construct a pie chart
➢ Divide a circle into categorical segments
➢ The size of each segment must be proportional to the count
(or percentage) of its category
➢ The sum of the segments must be equal to the sample size
(or 100%)
2.3 Summarising Numeric Data (p35)
– Numeric frequency distribution summarises data into
intervals of equal width. (Grouped data)
➢ Step 1: Determine the data range
– Interval width = (70 − 20)/5 = 10
• Cumulative Frequency Distribution (p38)
– A cumulative frequency distribution is a summary
table of cumulative frequency counts and is used to
answer questions of a less than or greater than nature
➢ Step 1: Using the numeric frequency distribution add
an extra interval below the first interval (the
frequency for this interval should be zero (0))
➢ Step 2: Count the number of data values that fall below
(or equal to) the maximum value of each interval
➢ Alternatively add the frequencies of each interval
below the maximum value of the current interval
together or add the current interval's frequency to the
cumulative frequency of the previous interval.
➢ Step 3: Check that the value of the cumulative
frequency of the last interval is equal to the sample
size.
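The alternative method in Step 2 (a running total of interval frequencies) and the check in Step 3 can be sketched in Python. This is a minimal illustration; the frequencies are those of the car price table used later in these notes, and the variable names are assumptions.

```python
from itertools import accumulate

# Frequencies of ten price intervals (R000): 50<75, 75<100, ..., 275<300
frequencies = [4, 10, 17, 24, 32, 29, 20, 15, 7, 2]

# Step 2 (alternative method): add each interval's frequency to the
# cumulative frequency of the previous interval
cumulative = list(accumulate(frequencies))

# Step 3: the last cumulative frequency must equal the sample size
sample_size = sum(frequencies)
assert cumulative[-1] == sample_size

print(cumulative)
```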
• Ogive (p39)
– An Ogive is a graph of the cumulative frequency
distribution. To construct an Ogive:
➢ Step 1: On the x axis mark the interval limits.
➢ Step 2: On the y axis plot the cumulative frequency
against the upper limit of the interval.
➢ Step 3: Join the cumulative frequency points with a
line graph
Chapter 3 Describing Data: Numeric Descriptive
Statistics
Prescribed sections 3.1 – 3.5, 3.8 & 3.10 p 66 – 87, 91
& 93 - 94
Do not study Geometric mean on p73.
Using your Casio fx82 to calculate measures
• Ungrouped Data is data that is given as individual data points
• Ungrouped data has no frequency distribution
• On your Casio fx82 calculator you must setup the Stats Mode to
frequency OFF for ungrouped data
✓ Press Shift ; Setup ; down arrow ; 3 (STAT) ; 2 (OFF)
✓ Press Mode ; 2 (STAT) ; 1 (1-VAR)
✓On your calculator choose Option 3:Sum ; Option 2:Σx (Σx = 376)
✓ n = Option 4:Var ; Option 1:n (n = 20)
UNGROUPED DATA
✓ x̄ = Σx/n = 376/20 = 18.8
✓ Check your answer x̄ = Option 4:Var ; Option 2: x̄ = 18.8
• Central Location Measures GROUPED DATA
➢ Σx = 1130
➢ n = 30
➢ x̄ = Σx/n = 1130/30 = 37.67 minutes
• Central Location Measures GROUPED DATA
– Median (Me)
➢ Me = 30 + 10[(30/2) − 8]/9 = 30 + 7.78 = 37.78 minutes
➢ The median courier delivery time is 37.78 minutes
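The grouped median calculation above can be checked with a short Python sketch. The function and parameter names are assumptions chosen for illustration; the inputs (lower limit 30, width 10, n = 30, cumulative frequency 8 before the median interval, median interval frequency 9) are read off the worked example.

```python
def grouped_median(lower_limit, width, n, cum_freq_before, interval_freq):
    """Interpolate the median inside the median interval of grouped data."""
    return lower_limit + width * (n / 2 - cum_freq_before) / interval_freq

# Courier delivery times example from the notes
median = grouped_median(lower_limit=30, width=10, n=30,
                        cum_freq_before=8, interval_freq=9)
print(round(median, 2))
```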
• Central Location Measures GROUPED DATA
– Mode (Mo)
➢ Mo = 30 + 10(9 − 5)/(2 × 9 − 5 − 7) = 30 + 6.67 = 36.67 minutes
➢ The mode for courier delivery time is 36.67 minutes
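A similar sketch reproduces the grouped mode. The names are hypothetical; the inputs (modal interval frequency 9, preceding frequency 5, following frequency 7) come from the calculation above.

```python
def grouped_mode(lower_limit, width, f_modal, f_before, f_after):
    """Interpolate the mode inside the modal interval of grouped data."""
    return lower_limit + width * (f_modal - f_before) / (
        2 * f_modal - f_before - f_after
    )

# Courier delivery times example from the notes
mode = grouped_mode(lower_limit=30, width=10, f_modal=9, f_before=5, f_after=7)
print(round(mode, 2))
```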
• Advantages and Disadvantages of Different
Measures of Central Location
– The data below represents the prices
of different vehicle types sold by a
dealer. Calculate the:
➢ Mean,
➢ Median and
➢ Mode
UNGROUPED DATA
• Percentiles
Step 1: Determine the percentile position as:
P20 position = 20(n + 1)/100 = (20/100)(n + 1)
P60 position = 60(n + 1)/100 = (60/100)(n + 1)
For n = 20:
P20 position = (20/100)(20 + 1) = 4.2
P60 position = (60/100)(20 + 1) = 12.6
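The percentile position rule can be sketched in Python as follows. The position formula is taken from the notes; the linear interpolation between neighbouring sorted values and the sample data are assumptions added for illustration, since the notes do not show that step here.

```python
def percentile_position(p, n):
    """Position of the p-th percentile in a sorted list of n values."""
    return p / 100 * (n + 1)

def percentile_value(sorted_values, p):
    """Assumed step: interpolate linearly between the neighbouring values."""
    pos = percentile_position(p, len(sorted_values))
    whole = int(pos)       # whole part of the position (1-based)
    frac = pos - whole     # fractional part of the position
    if whole < 1:
        return sorted_values[0]
    if whole >= len(sorted_values):
        return sorted_values[-1]
    lower = sorted_values[whole - 1]
    upper = sorted_values[whole]
    return lower + frac * (upper - lower)

# Hypothetical sorted sample of 20 values: 2, 4, ..., 40
data = [2 * i for i in range(1, 21)]
print(round(percentile_position(20, 20), 1), round(percentile_position(60, 20), 1))
print(round(percentile_value(data, 20), 1))
```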
GROUPED DATA
Step 2: Determine quartile value as:
Q1 = 20 + 10[(1/4)(30) − 3]/5 = 20 + 9 = 29
Q3 = 40 + 10[(3/4)(30) − 17]/7 = 40 + 55/7 = 47.86
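• Percentiles (continued) GROUPED DATA
P20 = O(P20) + c[(20/100)n − f(<)]/f(P20)
P60 = O(P60) + c[(60/100)n − f(<)]/f(P60)
Where O is the lower limit of the percentile interval, c is the interval width,
f(<) is the cumulative frequency below the percentile interval and f(P) is the
frequency of the percentile interval
P20 = 20 + 10[(20/100) × 30 − 3]/5 = 20 + 30/5 = 26
P60 = 40 + 10[(60/100) × 30 − 17]/7 = 40 + 10/7 = 41.43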
Calculate the:
➢ 1st Quartile,
➢ 3rd Quartile,
➢ 20th Percentile
➢ 90th Percentile
Car Prices
Price (R000)    f     F
50 < 75         4     4
75 < 100       10    14
100 < 125      17    31
125 < 150      24    55
150 < 175      32    87
175 < 200      29   116
200 < 225      20   136
225 < 250      15   151
250 < 275       7   158
275 < 300       2   160
Total         160

1st Quartile
n = 160 ∴ n/4 = 40
Q1 = 125 + 25[(160/4) − 31]/24 = 125 + 9.375 = 134.375
The 1st Quartile car price is R134,375.00

3rd Quartile
n = 160 ∴ 3n/4 = 120
Q3 = 200 + 25[(3 × 160/4) − 116]/20 = 200 + 5 = 205
The 3rd Quartile car price is R205,000.00
Using the Car Prices frequency table (n = 160):

20th Percentile
n = 160 ∴ 0.2 × n = 32
P20 = 125 + 25[32 − 31]/24 = 125 + 1.04167 = 126.04167
The 20th Percentile car price is R126,041.67

90th Percentile
n = 160 ∴ 0.9 × n = 144
P90 = 225 + 25[144 − 136]/15 = 225 + 13.33333 = 238.33333
The 90th Percentile car price is R238,333.33
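The quartile and percentile calculations on the car price table all follow the same interpolation, which can be sketched as one Python function. The function name and the search loop are assumptions for illustration; the interval limits and frequencies are those of the table.

```python
# Car price table (R000): lower limits and frequencies, interval width 25
lower_limits = [50, 75, 100, 125, 150, 175, 200, 225, 250, 275]
frequencies = [4, 10, 17, 24, 32, 29, 20, 15, 7, 2]
width = 25

def grouped_percentile(p, lower_limits, frequencies, width):
    """Interpolate the p-th percentile from a grouped frequency table."""
    n = sum(frequencies)
    target = p / 100 * n              # e.g. n/4 for the 1st quartile
    cumulative = 0                    # cumulative frequency below the interval
    for lower, f in zip(lower_limits, frequencies):
        if cumulative + f >= target:  # the percentile lies in this interval
            return lower + width * (target - cumulative) / f
        cumulative += f
    raise ValueError("p must be between 0 and 100")

q1 = grouped_percentile(25, lower_limits, frequencies, width)
q3 = grouped_percentile(75, lower_limits, frequencies, width)
p20 = grouped_percentile(20, lower_limits, frequencies, width)
p90 = grouped_percentile(90, lower_limits, frequencies, width)
print(q1, q3, round(p20, 5), round(p90, 5))
```

The sketch assumes every interval up to the target has a non-zero frequency.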
3.4 Measures of Dispersion (p79)
Dispersion refers to the extent to which data values are scattered
around their central location value
– Range (p79)
– Variance (s²) (p80)
ALWAYS SHOW
ALL YOUR
WORKINGS
– What is the Standard Deviation used for?
✓ The Standard Deviation (s) measures how closely the data
values cluster around the mean
✓ Where the data values are normally distributed (bell
shaped) 68.3% of the data will fall within 1 Standard
Deviation of the mean, 95.5% of the data will fall within 2
Standard Deviations of the mean and 99.7% of the data will
fall within 3 Standard Deviations of the mean.
– Coefficient of Variation (cv) (p83)
✓ cv(mass) = 16.4/78 × 100 = 21.03%
✓ cv(height) = 20.1/166 × 100 = 12.11%
✓ The Coefficient of Variation (cv) for passenger height is
smaller than that for passenger mass compared to their
respective averages.
✓ There is less variation in passenger height than there is in
passenger mass.
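The comparison above can be reproduced with a small Python sketch. The function name is an assumption; the standard deviations and means are those of the passenger mass and height example.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return std_dev / mean * 100

cv_mass = coefficient_of_variation(16.4, 78)
cv_height = coefficient_of_variation(20.1, 166)

# The variable with the smaller cv shows less relative variation
print(round(cv_mass, 2), round(cv_height, 2), cv_height < cv_mass)
```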
✓ The mean is
greater than the
median.
– Negatively Skewed Distribution
• Bowley Coefficient of Skewness
The Bowley skewness method is used when the mean
and median are not known:
3.8 Choosing Valid Descriptive Statistics
Measures
For categorical type data:
➢ categorical frequency table;
➢ bar and pie charts;
➢ the modal category;
➢ measures of dispersion and skewness cannot
be determined.
Chapter 4 Basic Probability Concepts
P(A) = 76/355 = 0.214 = 21.4%
• Ways to derive Objective Probabilities
— a priori – where outcomes are known in advance
— empirically – where research is conducted
— mathematically – through use of probability distributions
P(BP) = 13/50 = 0.26 = 26%
There is a 26% probability that a motorist will prefer BP
4.4 Basic Probability Concepts (p109)
— The intersection of events
P(small ∩ service) = 10/170 = 0.0588 = 5.88%
There is a 5.88% probability that the firm will be both small and in
the service sector
• Basic Probability Concepts
— The union of events
P(small ∪ service) = (36 + 24 − 10)/170 = 0.294 = 29.4%
Note that the intersection is subtracted to avoid double counting
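The intersection and union calculations for the 170 firms can be verified with a few lines of Python. The variable names are assumptions; the counts (36 small firms, 24 service firms, 10 firms that are both) come from the examples above.

```python
total_firms = 170
small = 36              # firms that are small
service = 24            # firms in the service sector
small_and_service = 10  # firms counted in both groups

p_intersection = small_and_service / total_firms
# Subtract the overlap once so those firms are not counted twice
p_union = (small + service - small_and_service) / total_firms

print(round(p_intersection, 4), round(p_union, 3))
```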
• Basic Probability Concepts
— Mutually exclusive events
P(small ∪ medium ∪ large) = (36 + 48 + 86)/170 = 170/170 = 1
• Basic Probability Concepts
— Statistically independent events
➢ Two events, A and B, are statistically independent if the
occurrence of event A has no effect on the outcome of event B,
and vice versa.
➢ Just because two events are statistically independent does not
make them mutually exclusive.
➢ Events that are statistically independent may still occur at the
same time but the outcome of the one event does not affect
the outcome of the other event.
• Calculating Objective Probabilities
– Joint Probability P(A∩B) is the probability of both
event A and event B occurring
4.6 Probability Rules p116
– The addition rule for non-mutually exclusive events
P(A∪B) = P(A) + P(B) − P(A∩B)
• Probability Rules
– The addition rule for mutually exclusive events
P(A∪B) = P(A) + P(B)
• Probability Rules
– The multiplication rule for statistically dependent
events P(A∩B) = P(A|B) × P(B)
• Probability Rules
– To test for statistical independence: P(A|B) = P(A)
Is the size of the firm
statistically independent of
the sector that the firm
operates in?
4.7 Probability Trees p120
In order for a product, consisting of 2 components, to work both
components must succeed. If the probability of component 1
failing is 5% and the probability of component 2 failing is 10%
construct a probability tree to determine the probabilities of the
product failing or succeeding?
The probability of the
product succeeding is
0.855 or 85.5%
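The branches of this probability tree can be enumerated in Python. This is a sketch that assumes the two components fail independently, which is what multiplying along the branches of the tree implies; the names are illustrative.

```python
p_fail_1 = 0.05  # probability that component 1 fails
p_fail_2 = 0.10  # probability that component 2 fails

# Each branch multiplies the probabilities along its path
branches = {
    ("works", "works"): (1 - p_fail_1) * (1 - p_fail_2),
    ("works", "fails"): (1 - p_fail_1) * p_fail_2,
    ("fails", "works"): p_fail_1 * (1 - p_fail_2),
    ("fails", "fails"): p_fail_1 * p_fail_2,
}

# The product succeeds only on the branch where both components work
p_success = branches[("works", "works")]
p_failure = 1 - p_success

print(round(p_success, 3), round(p_failure, 3))
```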
P(A|B) = P(A and B)/P(B)
Where: P(B) = P(A and B) + P(Ā and B)
And: P(A and B) = P(B|A) × P(A)
Bayes’ Theorem
➢ The historical probability of successfully launching a new
product is 40%. Market research is known to predict a positive
test market for 80% of successful products but will also predict a
positive test market for 30% of failed product launches.
➢ If the market research for the launch of a new product A comes
back positive, what is the probability that the product will be
successful?
Marginal       Conditional       Joint
P(S) = 0.4     P(Y|S) = 0.8      P(S and Y) = 0.32
               P(N|S) = 0.2      P(S and N) = 0.08
P(F) = 0.6     P(Y|F) = 0.3      P(F and Y) = 0.18
               P(N|F) = 0.7      P(F and N) = 0.42
P(S|Y) = P(S and Y)/P(Y) = 0.32/(0.32 + 0.18) = 0.32/0.5 = 0.64
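The Bayes' Theorem calculation can be traced step by step in Python. The variable names are assumptions; the probabilities (40% success rate, 80% positive test market for successes, 30% for failures) are those of the example.

```python
p_success = 0.4            # marginal probability of a successful launch
p_failure = 1 - p_success  # marginal probability of a failed launch
p_pos_given_success = 0.8  # positive test market given success
p_pos_given_failure = 0.3  # positive test market given failure

# Joint probabilities: conditional x marginal
joint_success_pos = p_pos_given_success * p_success
joint_failure_pos = p_pos_given_failure * p_failure

# Total probability of a positive test market
p_pos = joint_success_pos + joint_failure_pos

# Probability of success given a positive test market
p_success_given_pos = joint_success_pos / p_pos
print(round(p_success_given_pos, 2))
```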
4.9 Counting Rules – Permutations and
Combinations p122
– The Multiplication Rule of Counting for a single
event
n! = n factorial = n × (n − 1) × (n − 2) × (n − 3) × ⋯ × 3 × 2 × 1
6! = 720
There are 720 possible permutations
4 × 10 × 6 = 240
There are 240 possible permutations
• Counting Rules – Permutations and
Combinations
– The Permutations Rule
The number of permutations is the number of distinct ways of selecting r objects
from a larger group of n objects where the ordering IS important
nPr = n!/(n − r)!   Note that r cannot be larger than n
A factory has 3 machines and 8 workers
How many permutations (arrangements) of workers to
machines are possible?
nPr = 8P3 = 8!/(8 − 3)! = 336
There are 336 possible permutations
The probability of any one permutation is 1/336 = 0.00298 = 0.298%
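The factorial and permutation counts can be confirmed with Python's standard `math` module (`math.perm` requires Python 3.8 or later). This is a quick check rather than part of the prescribed calculator method.

```python
from math import factorial, perm

# 6! from the single-event multiplication rule
six_factorial = factorial(6)

# 8 workers arranged over 3 machines, where order matters
arrangements = perm(8, 3)
by_formula = factorial(8) // factorial(8 - 3)  # n!/(n - r)!

# Probability of any one specific arrangement
p_single = 1 / arrangements

print(six_factorial, arrangements, round(p_single, 5))
```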
Chapter 5 Probability Distributions
5.4 Binomial Probability Distribution
A random variable follows a Binomial Probability Distribution if:
➢ Random variable is observed n times
➢ 2 events are mutually exclusive and collectively exhaustive
➢ If the probability of success is p then the probability of failure
is (1 − p)
➢ Events are independent
P(x) = nCx × p^x × (1 − p)^(n − x)
for x = 0, 1, 2, 3, …, n
P(x ≤ 3) = P(x = 0) + P(x = 1) + P(x = 2) + P(x = 3)
P(x = 0) = 10C0 × (0.2)^0 × (1 − 0.2)^(10 − 0) = 0.107
P(x = 1) = 10C1 × (0.2)^1 × (1 − 0.2)^(10 − 1) = 0.268
P(x = 2) = 10C2 × (0.2)^2 × (1 − 0.2)^(10 − 2) = 0.302
P(x = 3) = 10C3 × (0.2)^3 × (1 − 0.2)^(10 − 3) = 0.201
• Binomial Probability Distribution
Descriptive Statistical Measures of the Binomial Distribution
➢ The mean and standard deviation for a random variable that
follows a Binomial Distribution can be calculated using the
formulas:
Mean: μ = np
Standard deviation: σ = √(np(1 − p))
➢ Research shows 20% of insurance policies are cashed in before
their maturity date. Determine the mean and standard deviation
for 10 randomly selected policies.
p = 20% = 0.2
q = 1 − 0.2 = 0.8
n = 10
μ = np = 10 × 0.2 = 2
σ = √(np(1 − p)) = √(10 × 0.2 × 0.8) = 1.26
On average 2 policies out of 10 will be cashed in.
The standard deviation is 1.26 policies.
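The binomial probabilities, mean and standard deviation for n = 10 and p = 0.2 can be computed directly in Python. The function name is an assumption for illustration; `math.comb` supplies the nCx term.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """P(x successes in n independent trials with success probability p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.2

# Individual probabilities for x = 0, 1, 2, 3 and their sum P(x <= 3)
probs = [binomial_pmf(x, n, p) for x in range(4)]
p_at_most_3 = sum(probs)

# Descriptive measures of the binomial distribution
mean = n * p
std_dev = sqrt(n * p * (1 - p))

print([round(v, 3) for v in probs], round(p_at_most_3, 3))
print(mean, round(std_dev, 2))
```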
P(x) = e^(−λ) × λ^x / x! = e^(−a) × a^x / x!
for x = 0, 1, 2, 3 …
Where:
e = mathematical constant ≈ 2.71828
a = λ = mean
x = number of occurrences
• Poisson Probability Distribution
A travel agency receives 5 web based queries a day on average.
What is the probability that it will receive more than 4 web
based queries on any day?
P(x ≥ 5) = 1 − [P(x = 0) + P(x = 1) + P(x = 2) + P(x = 3) + P(x = 4)]
P(x = 0) = e^(−5) × 5^0 / 0! = 0.00674
P(x = 1) = e^(−5) × 5^1 / 1! = 0.0337
P(x = 2) = e^(−5) × 5^2 / 2! = 0.0842
P(x = 3) = e^(−5) × 5^3 / 3! = 0.1404
P(x = 4) = e^(−5) × 5^4 / 4! = 0.1755
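The travel agency example can be completed numerically with a short Python sketch. The function name is an assumption; the mean of 5 queries per day is from the example.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(x occurrences) for a Poisson distribution with mean lam."""
    return exp(-lam) * lam**x / factorial(x)

lam = 5  # average number of web based queries per day

# P(x >= 5) is the complement of P(x = 0) + ... + P(x = 4)
p_up_to_4 = sum(poisson_pmf(x, lam) for x in range(5))
p_more_than_4 = 1 - p_up_to_4

print(round(p_more_than_4, 4))
```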
Mean: μ = λ = a
Standard deviation: σ = √λ = √a
5.6 Continuous Probability Distributions (p140)
➢ Can take on any value e.g. time, mass, distance
➢ The most widely used Continuous Probability Distribution is the
Normal Probability Distribution
Chapter 7 Confidence Interval Estimation
7.4 Confidence Interval for a Single
Population Mean (μ) when the Population
Standard Deviation (σ) is known (p180)
x̄ → μ
• 300 shoppers sampled spend R78 on average
• Population standard deviation = R21
• Find the 95% confidence interval for the
average spend by grocery shoppers
• z-value from t distribution tables
x̄ − z(σ/√n) ≤ μ ≤ x̄ + z(σ/√n)   or   μ = x̄ ± z(σ/√n)
(lower limit)     (upper limit)
x̄ = R78
σ = 21
n = 300
Standard error = σx̄ = σ/√n = 21/√300 = 1.2124
• 95% confidence level: α = 5% = 0.05 and α/2 = 0.05/2 = 0.025
– The greater the confidence level the wider the
range of the upper and lower limits into which the
population parameter will fall (Example 7.2 p183)
x̄ = 35.8 min
n = 100
σ = 11
Standard error σx̄ = σ/√n = 11/√100 = 1.1
At 95% confidence level
μ = x̄ ± z(σ/√n) = 35.8 ± 1.96(1.1)
Lower limit = 33.64
Upper limit = 37.96
The population mean will fall between 33.64 min and 37.96 min 95% of the time
At 90% confidence level
μ = x̄ ± z(σ/√n) = 35.8 ± 1.645(1.1)
Lower limit = 33.99
Upper limit = 37.61
The population mean will fall between 33.99 min and 37.61 min 90% of the time
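The effect of the confidence level on the width of the interval can be seen by computing both intervals in Python. The function name is an assumption; the z-values 1.96 and 1.645 are those used in the example.

```python
from math import sqrt

def confidence_interval(sample_mean, sigma, n, z):
    """Confidence limits for a population mean when sigma is known."""
    margin = z * sigma / sqrt(n)  # z x standard error
    return sample_mean - margin, sample_mean + margin

ci_95 = confidence_interval(35.8, 11, 100, 1.96)
ci_90 = confidence_interval(35.8, 11, 100, 1.645)

print([round(limit, 2) for limit in ci_95])
print([round(limit, 2) for limit in ci_90])
```

The 95% interval is wider than the 90% interval because a larger z-value multiplies the same standard error.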
– The smaller the population Standard Deviation, in relation to
its mean, the more precise the estimate and the narrower the
confidence interval (variation of Example 7.3 p184)
At 95% confidence level where σ = 21.5
x̄ = 88.4 months
n = 144
σ1 = 21.5
Standard error σx̄ = σ/√n = 21.5/√144 = 1.792
μ = x̄ ± z(σ/√n) = 88.4 ± 1.96(1.792)
Lower limit = 84.89
Upper limit = 91.91
The population mean will fall between 84.89 months and 91.91
months 95% of the time
The population mean will fall between 86.52 months and 90.28
months 95% of the time
7.6 The Rationale of a Confidence Interval (p185)
• If n = 100, µ = 32 and standard error = 1.1
• 95% confidence: µ = 32 ± 1.96(1.1)
• Lower limit R29.84 to upper limit R34.16
• For any sample mean that lies between R29.84 and
R34.16, the 95% confidence interval around that
sample mean will include the true mean (µ)
• For any sample mean that does not lie between
R29.84 and R34.16, the 95% confidence interval
around that sample mean will not include the true
mean (µ)
7.9 Confidence Interval for the Population
Proportion (π) (p189)
σp ≈ √(p(1 − p)/n)
p − z√(p(1 − p)/n) ≤ π ≤ p + z√(p(1 − p)/n)
(lower limit)     (upper limit)
ALL WORK COVERED TO
THIS POINT WILL BE
ASSESSED IN THE
SEMESTER TEST
8.1 Introduction
Statistical and Management Conclusions
• Hypothesis Testing
n = 360
σ = 67.5
α = 0.05
z-crit = ±1.96
– Step 1: Define the Statistical Hypotheses (Null and
Alternative)
➢Two-sided Hypothesis Test
H0: μ = 175
H1: μ ≠ 175
– Step 4: Compare the Sample Test Statistic to the
Area of Acceptance
8.4 Hypothesis Test for a Single Population
Mean (μ) – Population Standard Deviation
(σ) is Unknown
➢ Example 8.3 page 212
➢ SARS claims it takes, on average, less than 45 minutes
to complete a tax return via eFiling
Time taken to complete tax return on eFiling
42 56 29 35 47 37 39 29 45 35 51 53
n = 12 (df = 11)
x̄ = 41.5
s = 9.04
α = 0.05
t-crit = −1.796
Standard error = s/√n
t-stat = (x̄ − μ)/(s/√n)
• Hypothesis Testing
– Step 1: Define the Statistical Hypotheses (Null and
Alternative)
➢One-sided Lower-tailed Hypothesis Test
𝐻0 : 𝜇 ≥ 45 (It takes 45 minutes or more to
complete a tax return)
• Step 2: Determine the Region of Acceptance of
the Null Hypothesis
➢Accept H0 if t-stat ≥ -1.796
➢Accept H1 (reject H0) if t-stat < -1.796
t-stat = (x̄ − μ)/(s/√n) = (41.5 − 45)/(9.04/√12) = −3.5/2.6096 = −1.341
t-crit = -1.796
– Step 5: Draw Statistical and Management
Conclusions
➢ t-crit = -1.796 ≤ t-stat = -1.341
➢ Accept H0 : µ ≥ 45
➢ It can be concluded with 95% confidence that the
mean time to complete a tax return via eFiling is
no less than 45 minutes.
➢ The claim by SARS is therefore rejected at the 5%
level of significance.
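The eFiling test can be reproduced from the raw data in Python using the standard `statistics` module. This is a sketch; the critical value −1.796 is taken from the notes (t tables, df = 11, α = 0.05) rather than computed, and the variable names are assumptions.

```python
from math import sqrt
from statistics import mean, stdev

# Minutes taken to complete a tax return on eFiling (sample of 12)
times = [42, 56, 29, 35, 47, 37, 39, 29, 45, 35, 51, 53]

mu_0 = 45        # hypothesised population mean under H0
t_crit = -1.796  # lower-tailed critical value quoted in the notes

n = len(times)
x_bar = mean(times)
s = stdev(times)  # sample standard deviation (n - 1 divisor)

t_stat = (x_bar - mu_0) / (s / sqrt(n))

# Lower-tailed test: reject H0 only if t-stat falls below t-crit
reject_h0 = t_stat < t_crit
print(x_bar, round(s, 2), round(t_stat, 3), reject_h0)
```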
• 8.5 Hypothesis Test for a Single Population
Proportion (π) (p215)
➢ Example 8.4
➢ A cell phone company claims it has a 15% market share.
If a survey of 360 users shows that 42 use the cell phone
company, test the claim at the 1% level of significance.
n = 360
p = 42/360 = 0.1167
π = 0.15
α = 0.01
z-crit = ±2.58
z-stat = (p − π)/√(π(1 − π)/n)
• Hypothesis Testing
– Step 1: Define the Statistical Hypotheses (Null and
Alternative)
➢Two-sided Hypothesis Test
H0: π = 0.15
H1: π ≠ 0.15
• Step 2: Determine the Region of Acceptance of
the Null Hypothesis
➢α/2 = 0.01/2 = 0.005
➢z-crit = ±2.58
➢Accept H0 if -2.58 ≤ z ≤2.58
➢Accept H1 (reject H0) if z < -2.58 or z > 2.58
z-stat = (0.1167 − 0.15)/0.0188 = −1.771
– Step 5: Draw Statistical and Management
Conclusions
– z-stat = -1.771 ≥ critical z -2.58, the z-stat lies
within the area of acceptance of H0 : π = 0.15
– We can say with 99% confidence that the claim by
the cell phone company that they have a 15%
market share is true.
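The market share test can be checked with a few lines of Python. The variable names are assumptions; the critical value 2.58 is the one quoted in the notes for a two-sided test at the 1% level.

```python
from math import sqrt

n = 360        # users surveyed
users = 42     # users of the cell phone company in the sample
pi_0 = 0.15    # claimed market share under H0
z_crit = 2.58  # two-sided critical value quoted in the notes

p = users / n                                # sample proportion
std_error = sqrt(pi_0 * (1 - pi_0) / n)      # standard error under H0
z_stat = (p - pi_0) / std_error

# Two-sided test: reject H0 only if z-stat lies outside +/- z-crit
reject_h0 = abs(z_stat) > z_crit
print(round(p, 4), round(std_error, 4), round(z_stat, 3), reject_h0)
```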
Chapter 9 Hypothesis Testing: Comparison between
Two Populations (Means and Proportions)
Prescribed sections 9.1 – 9.3 and 9.5 p 235 – 242 and
247 - 251
• Hypothesis Testing (Difference between two
means where σ is known)
– Example 9.1 (p236)
➢ Courier A was used 60 times with an average delivery time
of 42 minutes and a population standard deviation of
14 minutes.
➢ Courier B was used 48 times with an average delivery time
of 38 minutes and a population standard deviation of
10 minutes.
➢ Test the claim that there is no statistically significant
difference between the courier companies at a 5%
level of significance
𝐻1 : 𝜇1 − 𝜇2 ≠ 0 (there is a significant
difference in courier delivery times)
• Step 2: Determine the Region of Acceptance of
the Null Hypothesis
➢α/2 = 0.05/2 = 0.025
➢z-crit = ±1.96
➢Accept H0 if -1.96 ≤ z ≤ 1.96
➢Accept H1 (reject H0) if z < -1.96 or z > 1.96
z-stat = ((x̄1 − x̄2) − 0)/√(σ1²/n1 + σ2²/n2) = ((42 − 38) − 0)/√(14²/60 + 10²/48) = 1.73
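The courier comparison can be verified in Python. The function name is an assumption; the sample sizes, means and population standard deviations are those of Example 9.1, and 1.96 is the critical value quoted in the notes.

```python
from math import sqrt

def two_sample_z(mean_1, sigma_1, n_1, mean_2, sigma_2, n_2, diff_0=0):
    """z-stat for the difference between two means with known sigmas."""
    std_error = sqrt(sigma_1**2 / n_1 + sigma_2**2 / n_2)
    return ((mean_1 - mean_2) - diff_0) / std_error

# Courier A: 60 deliveries, mean 42 min, sigma 14 min
# Courier B: 48 deliveries, mean 38 min, sigma 10 min
z_stat = two_sample_z(42, 14, 60, 38, 10, 48)

z_crit = 1.96  # two-sided critical value at the 5% level
reject_h0 = abs(z_stat) > z_crit
print(round(z_stat, 2), reject_h0)
```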
• Hypothesis Testing (Difference between two
means where σ is unknown)
– Step 1: Define the Statistical Hypotheses (Null and
Alternative)
➢One-sided upper tailed Hypothesis Test
𝐻0 : 𝜇1 − 𝜇2 ≤ 0 (ROI% of financial firms is
not significantly greater than ROI% of manufacturing
firms)
• Step 3: Calculate the Sample Test Statistic
sp² = ((n1 − 1)s1² + (n2 − 1)s2²)/(n1 + n2 − 2)
= ((28 − 1) × 9.645² + (24 − 1) × 8.823²)/(28 + 24 − 2) = 86.0468
t-stat = 1.391
• Step 5: Statistical and management conclusion
➢At a 5% level of significance we accept H0
➢There is no statistically significant difference
between the ROI% of financial and manufacturing
firms at the 5% level of significance.
➢The claim by the financial analyst that the ROI% of
financial firms is greater than the ROI% of
manufacturing firms is rejected at the 5% level of
significance
(p1 − p2) → (π1 − π2)
• Hypothesis Testing (Difference between two
proportions)
– Example 9.4 (p248)
➢ A research company is required to establish if the
recall rate of teenagers is different from the recall rate
for young adults of a recent AIDS awareness campaign.
➢ A sample of 640 teenagers and 420 young adults were
interviewed. 362 teenagers and 260 young adults were
able to recall the AIDS awareness campaign.
➢ Conduct a hypothesis test at a 5% level of significance
to test the claim that there is an equal recall rate for
both groups.
𝐻1 : 𝜋1 − 𝜋2 ≠ 0
• Step 2: Determine the Region of Acceptance of the
Null Hypothesis
➢α/2 = 0.05/2 = 0.025
➢where degrees of freedom (df) = 𝑛1 + 𝑛2
= 640 + 420 = 1060
➢z-crit = ±1.96
➢Accept H0 if -1.96 ≤ z ≤ 1.96
➢Accept H1 (reject H0) if z < -1.96 or z > 1.96
p1 = 362/640 = 0.5656 where n1 = 640
p2 = 260/420 = 0.619 where n2 = 420
π̂ = (362 + 260)/(640 + 420) = 0.5868
Standard error = √(π̂ × (1 − π̂) × (1/n1 + 1/n2))
= √(0.5868 × (1 − 0.5868) × (1/640 + 1/420)) = 0.0309
Standard error = 0.0309
z-stat = ((p1 − p2) − (π1 − π2))/standard error
= ((0.5656 − 0.619) − 0)/0.0309
= −1.728
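The recall rate comparison can be recomputed in Python without intermediate rounding. The variable names are assumptions; the counts are those of Example 9.4 and 1.96 is the critical value quoted in the notes.

```python
from math import sqrt

# Teenagers: 362 of 640 recalled the campaign
# Young adults: 260 of 420 recalled the campaign
x_1, n_1 = 362, 640
x_2, n_2 = 260, 420

p_1 = x_1 / n_1
p_2 = x_2 / n_2

# Pooled proportion under H0 (equal recall rates)
pooled = (x_1 + x_2) / (n_1 + n_2)
std_error = sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))

z_stat = ((p_1 - p_2) - 0) / std_error

z_crit = 1.96  # two-sided critical value at the 5% level
reject_h0 = abs(z_stat) > z_crit
print(round(pooled, 4), round(std_error, 4), round(z_stat, 3), reject_h0)
```

Small differences in the third decimal of the z-stat arise because the hand calculation rounds the proportions and the standard error first.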
• Step 5: Statistical and management conclusion
➢At a 5% level of significance we accept H0
➢We can say with 95% confidence that there is no
statistically significant difference in the recall rate
of the AIDS awareness campaign between
teenagers and young adults.
10.1 Introduction and Rationale
The Chi-Squared Statistic
• The chi-squared test is based on frequency
count data. It always compares a set of
observed frequencies obtained from a random
sample to a set of expected frequencies that
describes the null hypothesis.
Table of gender by magazine preference
Hypothesis test for independence
of association
• Step 1: Define the null and alternative
hypotheses
➢ 𝐻0: There is no association between
gender and magazine preference.
➢ 𝐻1: There is an association between
gender and magazine preference.
• Step 2: Determine the region of acceptance of the
null hypothesis
➢ α = 0.05
➢ 𝑑𝑓 = (𝑟 − 1)(𝑐 − 1)
➢ df = (2-1)x(3-1) = 2
➢ Critical χ2 = 5.991. Accept H0 if χ2 ≤ 5.991
• Construct the χ2 table for observed (fo) and
expected (fe) frequencies
• Step 5: Statistical and management conclusion
➢At a 5% level of significance we accept H0
➢There is no statistically significant difference
in the preference for magazines based on
the reader's gender.
Hypothesis test for equality of multiple
proportions
• Step 3: Calculate the sample test statistic (χ2 -stat)
To calculate the expected frequencies:
fe = (row total × column total)/n
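The expected frequency formula and the resulting chi-squared statistic can be sketched in Python. The observed table below is hypothetical (the data tables from the slides are not reproduced in these notes), and the statistic is computed with the standard formula χ² = Σ(fo − fe)²/fe.

```python
# Hypothetical observed frequencies: 2 rows (groups) x 3 columns (categories)
observed = [
    [20, 30, 50],
    [30, 30, 40],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# fe = (row total x column total) / n for every cell
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-squared statistic: sum of (fo - fe)^2 / fe over all cells
chi2_stat = sum(
    (fo - fe) ** 2 / fe
    for obs_row, exp_row in zip(observed, expected)
    for fo, fe in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1) x (columns - 1)
df = (len(observed) - 1) * (len(observed[0]) - 1)
print(expected, round(chi2_stat, 3), df)
```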
• Step 4: Compare the sample statistic to the region
of acceptance
➢Sample χ2 = 3.805 < Critical χ2 = 4.605
➢ Sample χ2 is within the critical χ2 limits
Chapter 12 Linear Regression and Correlation Analysis
12.1 Introduction
In management, many numeric measures are
related (either strongly or loosely) to one
another. For example:
➢ advertising expenditure is assumed to
influence sales volumes;
➢ a company’s share price is likely to be
influenced by its return on investment;
➢ the number of hours of operator training is
believed to impact positively on productivity.
• Scatter plot between pairs of x and y
y = a + bx
Where:
y = dependent variable
a = intercept on the vertical axis (value of y
when x = 0)
b = gradient (b = Δy/Δx)
x = independent variable
• Example 12.1 Flat screen TV Sales p330
➢ To determine a use the formula a = (Σy − bΣx)/n
➢ From the 5:Reg option retrieve the values for 1:A and
2:B to confirm your answers
➢ Plot the regression line
𝑦 = 12.817 + 4.368𝑥
• Correlation Analysis (cont)
➢This measure is called Pearson’s correlation
coefficient. It is represented by the symbol r
➢When r is calculated from sample data the
following formula is used :
➢ Press Shift 1
➢ Press 3:Sum
➢ Retrieve the values for 1:∑x2; 2:∑x ; 3:∑y2; 4:∑y; 5:∑xy
➢ Press Shift 1
➢ Press 4:Var
➢ Retrieve the values for 1:n
• Predicting the value of the dependent
variable for a given independent variable
➢ Once the regression equation has been estimated it is
possible to predict the value of the dependent variable (y)
for any given independent variable (x) by substitution.
Flat-screen TV sales and adverts placed
Adverts 4 4 3 2 5 2 4 3 5 5 3 4
Sales 26 28 24 18 35 24 36 25 31 37 30 32
➢ a = 12.816; b = 4.368
➢ y = 12.816 + 4.368x
➢ Estimate the sales level if the firm places 6 advertisements
➢ y = 12.816 + 4.368(6) = 39
➢ Thus if 6 adverts are placed we predict the firm will sell 39
TVs
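The regression coefficients and the prediction can be reproduced from the data table in Python. The intercept uses the formula a = (Σy − bΣx)/n from the notes; the gradient uses the standard least squares formula b = (nΣxy − ΣxΣy)/(nΣx² − (Σx)²), which is an assumption here since it is not written out in these notes.

```python
adverts = [4, 4, 3, 2, 5, 2, 4, 3, 5, 5, 3, 4]            # x
sales = [26, 28, 24, 18, 35, 24, 36, 25, 31, 37, 30, 32]  # y

n = len(adverts)
sum_x = sum(adverts)
sum_y = sum(sales)
sum_xy = sum(x * y for x, y in zip(adverts, sales))
sum_x2 = sum(x * x for x in adverts)

# Gradient and intercept of the least squares line y = a + bx
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x**2)
a = (sum_y - b * sum_x) / n

# Predicted sales if 6 advertisements are placed
predicted_sales = a + b * 6
print(round(a, 3), round(b, 3), round(predicted_sales))
```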
• Perfect associations
• Strong associations
• Moderate to weak associations
• No association
12.4 The Coefficient of Determination (r²)
The coefficient of determination can be calculated by squaring
the correlation coefficient (r)
Step 2: Determine the Region of Acceptance of the Null
Hypothesis
➢At a 5% level of significance α/2 = 0.025
➢Degrees of freedom (df) = n – 2 = 12 – 2 = 10
➢ Accept H0 if -2.228 ≤ t-stat ≤ 2.228
➢Accept H1 (reject H0) if t-stat < -2.228 or > 2.228
t-stat = r × √((n − 2)/(1 − r²)) = 0.8198 × √((12 − 2)/(1 − 0.8198²)) = 0.8198 × √(10/0.3279) = 4.527
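The correlation coefficient and its t-stat can be recomputed from the flat-screen TV data in Python. Pearson's r is calculated with the standard sums formula (an assumption, as the formula itself is not written out in these notes); the t-stat follows the expression above, and 2.228 is the critical value quoted in the notes for df = 10.

```python
from math import sqrt

adverts = [4, 4, 3, 2, 5, 2, 4, 3, 5, 5, 3, 4]            # x
sales = [26, 28, 24, 18, 35, 24, 36, 25, 31, 37, 30, 32]  # y

n = len(adverts)
sum_x, sum_y = sum(adverts), sum(sales)
sum_xy = sum(x * y for x, y in zip(adverts, sales))
sum_x2 = sum(x * x for x in adverts)
sum_y2 = sum(y * y for y in sales)

# Pearson's correlation coefficient
r = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_x2 - sum_x**2) * (n * sum_y2 - sum_y**2)
)

# Test statistic for H0: rho = 0 with n - 2 degrees of freedom
t_stat = r * sqrt((n - 2) / (1 - r**2))

t_crit = 2.228
reject_h0 = abs(t_stat) > t_crit
print(round(r, 4), round(r**2, 4), round(t_stat, 3), reject_h0)
```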
– Step 5: Draw Statistical and Management
Conclusions
➢ t-stat = 4.527 > critical t = 2.228
➢ Reject H0 : 𝜌 = 0
➢ Accept 𝐻1 : 𝜌 ≠ 0
➢ It can be concluded with 95% confidence that
there is a strong positive correlation between the
number of advertisements placed and the sales
of flat-screen TVs.
14.1 Introduction
Example 14.1 p377
➢ Price Relative(2008) = 6.87/6.87 × 100 = 100 (Base year)
➢ Price Relative(2009) = 7.18/6.87 × 100 = 104.5 (4.5% increase
since 2008)
➢ Price Relative(2010) = 7.58/6.87 × 100 = 110.3 (10.3% increase
since 2008)
➢ Price Relative(2011) = 8.44/6.87 × 100 = 122.9 (22.9% increase
since 2008)
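The price relatives can be generated for every year at once in Python. The dictionary layout and names are assumptions; the prices and the 2008 base year are those of Example 14.1.

```python
# Price per year from Example 14.1
prices = {2008: 6.87, 2009: 7.18, 2010: 7.58, 2011: 8.44}
base_year = 2008

# Price relative = (price in the year / price in the base year) x 100
price_relatives = {
    year: round(price / prices[base_year] * 100, 1)
    for year, price in prices.items()
}

print(price_relatives)
```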
• Laspeyres vs Paasche weighting
methods
➢ The Laspeyres approach holds quantities constant
at base period values
➢ The Paasche approach holds quantities constant
at current period values
➢ Step 2: Find the current year Σ(p1 × q0).
➢ Step 3: Calculate the composite price index Σ(p1 × q0)/Σ(p0 × q0) × 100
➢ 554.86/511.81 × 100 = 108.41
➢ Step 2: Find the current year Σ(p1 × q1).
➢ Step 3: Calculate the composite price index Σ(p1 × q1)/Σ(p0 × q1) × 100
➢ 478.94/442.34 × 100 = 108.3
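The difference between the two weighting methods can be seen side by side in a short Python sketch. The two-item basket below is hypothetical (the basket behind the totals above is not reproduced in these notes), and the function names are assumptions.

```python
# Hypothetical basket of two items
p0 = [10, 20]  # base period prices
q0 = [5, 2]    # base period quantities
p1 = [12, 22]  # current period prices
q1 = [4, 3]    # current period quantities

def weighted_sum(prices, quantities):
    """Sum of price x quantity over all items in the basket."""
    return sum(p * q for p, q in zip(prices, quantities))

# Laspeyres price index: quantities held at base period values (q0)
laspeyres_price = weighted_sum(p1, q0) / weighted_sum(p0, q0) * 100

# Paasche price index: quantities held at current period values (q1)
paasche_price = weighted_sum(p1, q1) / weighted_sum(p0, q1) * 100

print(round(laspeyres_price, 2), round(paasche_price, 2))
```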
Example 14.5 p385
➢ In 2009 a hardware store sold 143 doors. In 2010 they sold
122 doors and in 2011 they sold 174 doors.
➢ Using 2009 as the base year calculate the quantity relatives
for 2010 and 2011
➢ Quantity Relative(2009) = 143/143 × 100 = 100 (Base year)
➢ Quantity Relative(2010) = 122/143 × 100 = 85.3 (14.7%
decrease since 2009)
➢ Quantity Relative(2011) = 174/143 × 100 = 121.7 (21.7%
increase since 2009)
• Laspeyres vs Paasche weighting
methods
➢ The Laspeyres approach holds prices constant at
base period values
➢ The Paasche approach holds prices constant at
current period values
Laspeyres quantity index = Σ(p0 q1)/Σ(p0 q0)
Paasche quantity index = Σ(p1 q1)/Σ(p1 q0)
➢ Step 2: Find the current year Σ(p0 × q1).
➢ Step 2: Find the current year Σ(p1 × q1).
➢ Step 3: Calculate the composite quantity index Σ(p1 × q1)/Σ(p1 × q0) × 100
➢ 11750/12069 × 100 = 97.36
THE END