Notes
Statistics
Descriptive statistics, probability, probability distributions, estimation and
hypothesis testing, further topics.
Operations Research
Linear programming, stock control.
Reference Books
Statistics
1. Roxy Peck, Tom Short, Chris Olsen. 2020. Introduction to statistics and data
analysis, 6th Edition, Cengage.
2. Neil A. Weiss. 2017. Introductory Statistics, 10th Edition, Pearson.
Operations Research
1. Hamdy A. Taha. 2017. Operations Research: An Introduction, 10th Edition,
Pearson.
2. Michael W. Carter, Camille C. Price, Ghaith Rabadi. 2019. Operations
Research: A Practical Introduction, 2nd Edition, CRC Press, Boca Raton, FL.
An Introduction to Operations Research
Linear Programming (L.P.)
Linear programming is a special case of mathematical programming in which the objective
function (e.g. a profit function to be maximized or a cost function to be minimized) and the
constraints (resource limitations) can be expressed as linear mathematical
relationships.
The manufacturer can make a profit of $12 on a unit length of cloth A and $8 on
a unit length of cloth B. What product mix will make the largest possible profit?
Model formulation
Let x be the number of unit length of cloth A produced
and y be the number of unit length of cloth B produced.
(x and y are called decision variables)
Our objective is to maximize total profit P = 12x + 8y
subject to the following constraints:
4x + 4y ≤ 1400 (red wool)
6x + 3y ≤ 1800 (green wool)
2x + 6y ≤ 1800 (yellow wool)
x ≥ 0, y ≥ 0 (non-negativity)
(x,y) satisfying (*) is called a feasible solution. A feasible solution which also
attains the maximum value for P is called an optimal solution. All feasible
points form a feasible region. (**) is the standard form of maximization
problem.
Graphical method
The presence of only 2 decision variables, x and y, allows the graphical method to be
used. Each constraint is drawn as a line with an arrow indicating the side of the line
that satisfies it.
[Figure: graph of the three wool constraints and the feasible region; x-axis from 0 to 900, y-axis from 0 to 700.]
The optimal point will always be found at one of the corners of the feasible
region. To determine the required corner, the objective function P = 12x + 8y is
plotted on the graph by choosing a suitable value of P so that it can be easily
plotted on the graph and it will lie inside the feasible region.
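Choose P = 12 × 8 × k, i.e. 12 × 8 × k = 12x + 8y.
Take k = 10, so that the line 12x + 8y = 960 passes through (80, 0) and (0, 120).
Moving this line parallel to itself away from the origin, the last corner of the
feasible region it touches is the intersection of the green and red wool constraints:
6x + 3y = 1800 --------(1)
4x + 4y = 1400 --------(2)
Solving (1) and (2) gives x = 250, y = 100.
The optimum product mix is to produce 250 unit lengths of cloth A and 100 unit
lengths of cloth B. The maximum profit = max. P = 12(250) + 8(100)
= 3800 ($)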
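The corner-point property can also be checked numerically. The following is a minimal Python sketch, assuming the three wool constraints of this example; the names `CONSTRAINTS`, `corner_points` and `best_corner` are illustrative and not part of the notes.

```python
from itertools import combinations

# Constraints written as a*x + b*y <= c (the last two encode x >= 0, y >= 0).
CONSTRAINTS = [
    (4, 4, 1400),   # red wool
    (6, 3, 1800),   # green wool
    (2, 6, 1800),   # yellow wool
    (-1, 0, 0),     # x >= 0
    (0, -1, 0),     # y >= 0
]

def corner_points(constraints, tol=1e-9):
    """Return the feasible intersections of every pair of constraint lines."""
    corners = []
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < tol:              # parallel lines never meet
            continue
        x = (c1 * b2 - c2 * b1) / det   # Cramer's rule
        y = (a1 * c2 - a2 * c1) / det
        if all(a * x + b * y <= c + tol for a, b, c in constraints):
            corners.append((x, y))
    return corners

def best_corner(constraints, profit):
    """Evaluate P = px*x + py*y at each corner and return the best corner."""
    px, py = profit
    return max(corner_points(constraints), key=lambda p: px * p[0] + py * p[1])

x, y = best_corner(CONSTRAINTS, (12, 8))
print(x, y, 12 * x + 8 * y)   # 250.0 100.0 3800.0
```

Enumerating every pair of lines is only practical for small problems such as this one; larger problems are solved with the simplex method.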
Slack resources (unused resources)
Consider the red wool constraint 4x + 4y ≤ 1400
LHS = 4x + 4y = amount of red wool used
= 4(250) + 4(100) = 1400
RHS = max. availability of red wool = 1400
→ slack red wool = 1400 − 1400 = 0 (fully utilized)
As the red wool and green wool are fully utilized, they are called binding
constraints. Yellow wool is a non-binding constraint.
Shadow price
The change in the value of the objective function resulting from 1 unit increase
in the value of the RHS of a constraint is called shadow price.
Intercepts of the new red wool line 4x + 4y = 1401:
x    0                   1401/4 = 350.25
y    1401/4 = 350.25     0
The green and yellow wool constraints are still the same. The objective
function P = 12x + 8y remains the same.
[Figure: graph of the constraints with the red wool line shifted to 4x + 4y = 1401; x-axis from 0 to 900, y-axis from 0 to 700.]
4x + 4y = 1401---------(5)
6x + 3y = 1800---------(6)
Solving (5) and (6) gives y = 100½. Substituting y = 100½ into (5), we get
4x + 4(100½) = 1401, so x = 249¾.
Max. P = 12x + 8y = 12(249¾) + 8(100½) = 3801 ($)
An additional kg of red wool results in an increase of profit = 3801 – 3800
= 1($)
= shadow price of 1 kg of red wool
= max. amount of money spent to get an additional kg of red wool.
(11) − (12): 6y = 598, so y = 99⅔.
Substituting y = 99⅔ into (9), we get 4x + 4(99⅔) = 1400,
so x = 250⅓.
(II) Consider the yellow wool constraint 2x + 6y ≤ 1800
If an additional kg of yellow wool is available, then the new yellow wool
constraint becomes 2x + 6y ≤ 1801. The new optimal point remains the
same as in the original L.P., i.e. x = 250, y = 100, max. P = 3800 ($).
Therefore, shadow price of 1 kg of yellow wool = 0.
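The shadow prices can be cross-checked by re-solving the problem with one extra kg of each wool in turn. The sketch below is an illustration only: it assumes the constraints 4x + 4y ≤ red, 6x + 3y ≤ green, 2x + 6y ≤ yellow and the objective P = 12x + 8y, and the helper `max_profit` is a hypothetical name. Exact fractions are used so that 4/3 is not rounded.

```python
from fractions import Fraction
from itertools import combinations

def max_profit(rhs):
    """Maximum of P = 12x + 8y for the given (red, green, yellow) availabilities."""
    red, green, yellow = rhs
    cons = [(4, 4, red), (6, 3, green), (2, 6, yellow), (-1, 0, 0), (0, -1, 0)]
    best = None
    for (a1, b1, c1), (a2, b2, c2) in combinations(cons, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:                      # parallel lines
            continue
        x = Fraction(c1 * b2 - c2 * b1, det)
        y = Fraction(a1 * c2 - a2 * c1, det)
        if all(a * x + b * y <= c for a, b, c in cons):
            p = 12 * x + 8 * y
            best = p if best is None else max(best, p)
    return best

base = (1400, 1800, 1800)
shadow = {}
for i, name in enumerate(["red", "green", "yellow"]):
    bumped = list(base)
    bumped[i] += 1                        # one additional kg of this wool
    shadow[name] = max_profit(bumped) - max_profit(base)

print(shadow["red"], shadow["green"], shadow["yellow"])   # 1 4/3 0
```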
Model Formulation
Let x be the number of kg of ingredient A produced
Let y be the number of kg of ingredient B produced
Our objective is to minimize total cost C = 1.2x + 0.8y
subject to the following constraints:
x + y ≥ 100 (weight)
0.3x + 0.1y ≥ 15 (nitrogen)
0.1x + 0.05y ≥ 8 (phosphate)
_________________ (bone meal)
x ≥ 0, y ≥ 0 (non-negativity)
No limitation for lime.
(*) is the standard form of minimization problem.
[Figure: graph of the constraints and the feasible region for the minimization problem; x-axis from 0 to 160, y-axis from 0 to 180.]
Note: the nitrogen constraint 0.3x + 0.1y ≥ 15 is a redundant constraint, i.e. it
can be omitted without affecting the feasible region.
Consider the phosphate constraint 0.1x + 0.05y ≥ 8
At the optimal point x = 60, y = 40: 0.1(60) + 0.05(40) = 8 = RHS
→ surplus phosphate = 0
Shadow cost
Consider the weight constraint x + y ≥ 100
If an additional kg is required in the weight, then the new weight
constraint becomes x + y ≥ 101
The optimal point is obtained by solving x + y = 101
0.1x + 0.05y = 8
x = 59, y = 42 and min. C = 1.2(59) + 0.8(42) = 104.4
Compared with the original min. C = 1.2(60) + 0.8(40) = 104,
shadow cost of 1 kg of weight = 104.4 − 104 = 0.4 ($)
Similarly, shadow cost of 1 kg of nitrogen = __________
and shadow cost of 1 kg of bone meal = ___________
Simplex Method
It is an iterative process. We start with a feasible solution which is then
improved step by step until an optimal solution is obtained.
The simplex method for solving a standard maximization problem is illustrated by the
following example on the manufacturing of cloth.
Example
Maximize total profit P = 12x + 8y
subject to 4x + 4y ≤ 1400 (red wool)
6x + 3y ≤ 1800 (green wool)
2x + 6y ≤ 1800 (yellow wool)
x ≥ 0, y ≥ 0 (non-negativity)
Solution:
Step 1 Rewrite the objective function as −12x − 8y + P = 0
Step 2 Convert the inequalities into equalities for the constraints by introducing
non-negative slack variables r, s and t:
4x + 4y + r = 1400
6x + 3y + s = 1800
2x + 6y + t = 1800
Step 3 Set up the initial simplex tableau which requires
(a) a column for each variable, including slack variables, objective function
and a ‘quantity’ column.
(b) a row for each constraint and a row for the objective function. Row titles
are r, s, t and P.
Title    x     y     r    s    t    P    Quantity
r        4     4     1    0    0    0    1400
s        6     3     0    1    0    0    1800
t        2     6     0    0    1    0    1800
P      -12    -8     0    0    0    1    0
(c) the coefficients of the variables are now entered into the tableau in the
order in which they appear in the equations of the model.
(d) the initial simplex tableau represents the first feasible solution to the
problem.
r = 1400, s = 1800, t = 1800
x = 0, y = 0, P = 0
Note: the coefficients of the basic variables (here r, s, t and P) always form an
identity matrix, so the solution can be read directly from the quantity column.
Step 4 Iteration
(a) Identify the pivotal column
Consider the objective function
-12x -8y + P = 0
When x increases by 1 unit, P increases by 12
When y increases by 1 unit, P increases by 8
Thus, column x is chosen since it can increase P faster which is called
‘pivotal column’.
Note: the column in the P-row with the most negative entry is
the pivotal column.
(b) Identify the pivotal row (carried out after the pivotal column has been
chosen, in this case column x)
Consider the red wool constraint 4x + 4y + r = 1400
Set y = r = 0, x = 1400/4 = 350. This solution cannot be used as the
green wool constraint is violated (6 × 350 = 2100 > 1800).
Thus, s-row is chosen since the calculated x-value is feasible. The s-row
is called ‘pivotal row’.
Note: Calculate the ratio of the quantity element in each row to the
element in the pivotal column. The row in which this ratio is the least
non-negative number is the pivotal row.
(e) Divide all the elements in the pivotal row by the pivotal element and enter
into the new tableau.
(f) Enter 0 for all the elements in the pivotal column except the pivotal
element.
New element = old element − (m × n) / pivotal element
where m = element at the intersection of the pivotal column and the row
in which the old element lies
and n = element at the intersection of the pivotal row and the column
in which the old element lies.
Remarks:
(i) If two columns tie for the most negative entry in the P-row, either column
may be chosen first.
(ii) In the pivotal column, choose the element that gives the least non-negative ratio.
If there is a tie for the lowest ratio, either row may be chosen.
(h) Stop when there are no more negative entries in the P-row. The last
tableau gives the optimal solution.
x = 250 → 250 unit lengths of cloth A
y = 100 → 100 unit lengths of cloth B
r = 0 → slack red wool = 0, i.e. red wool is fully used
s = 0 → slack green wool = 0, i.e. green wool is fully used
t = 700 → slack yellow wool = 700, i.e. 700 kg of unused yellow
wool.
Maximum P = 3800 ($)
In the P-row, the entry
under the r column gives the shadow price of 1 kg of red wool = 1
under the s column gives the shadow price of 1 kg of green wool = 4/3
under the t column gives the shadow price of 1 kg of yellow wool = 0
Title    x     y     r      s      t    P    Quantity    Ratio
r        4     4     1      0      0    0    1400        1400/4 = 350
s        6     3     0      1      0    0    1800        1800/6 = 300
t        2     6     0      0      1    0    1800        1800/2 = 900
P      -12    -8     0      0      0    1    0

r        0     2     1     -2/3    0    0    200         200/2 = 100
x        1     1/2   0      1/6    0    0    300         300/(1/2) = 600
t        0     5     0     -1/3    1    0    1200        1200/5 = 240
P        0    -2     0      2      0    1    3600

y        0     1     1/2   -1/3    0    0    100
x        1     0    -1/4    1/3    0    0    250
t        0     0    -5/2    4/3    1    0    700
P        0     0     1      4/3    0    1    3800
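The iterations above can be reproduced with a short program. This is a minimal sketch of the tableau method for a standard maximization problem, not a general-purpose solver: it assumes the columns are ordered x, y, r, s, t, P, Quantity with the P-row last, and the function name `simplex_max` is illustrative.

```python
from fractions import Fraction

def simplex_max(tableau):
    """Run the tableau simplex method. The last row is the P-row and the
    last column is the quantity column. Returns the final tableau."""
    t = [[Fraction(v) for v in row] for row in tableau]
    while True:
        p_row = t[-1]
        # Pivotal column: most negative entry in the P-row (quantity excluded).
        col = min(range(len(p_row) - 1), key=lambda j: p_row[j])
        if p_row[col] >= 0:
            return t                      # no negative entries left: optimal
        # Pivotal row: least non-negative ratio quantity / pivotal-column entry.
        ratios = [(row[-1] / row[col], i)
                  for i, row in enumerate(t[:-1]) if row[col] > 0]
        if not ratios:
            raise ValueError("problem is unbounded")
        _, r = min(ratios)
        pivot = t[r][col]
        t[r] = [v / pivot for v in t[r]]  # divide the pivotal row by the pivot
        for i, row in enumerate(t):
            if i != r and row[col] != 0:
                m = row[col]              # entry in the pivotal column
                t[i] = [v - m * n for v, n in zip(row, t[r])]

initial = [
    [4, 4, 1, 0, 0, 0, 1400],     # r row (red wool)
    [6, 3, 0, 1, 0, 0, 1800],     # s row (green wool)
    [2, 6, 0, 0, 1, 0, 1800],     # t row (yellow wool)
    [-12, -8, 0, 0, 0, 1, 0],     # P row
]
final = simplex_max(initial)
print(final[-1][-1])              # 3800
```

Because the pivotal row is divided by the pivotal element first, subtracting m times the new pivotal row is the same as the rule "old element − (m × n) / pivotal element".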
Example
A company can produce 3 products A, B and C. The products yield a profit of
$8, $5 and $10 respectively. The products use a machine which has 400 hours
capacity in the next period. Each unit of the products uses 2, 3 and 1 hour
respectively of the machine capacity. There are only 150 units of a component
used by A and C. 200 kg of material Y are available, A uses 2 kg per unit and C
uses 4kg per unit. The marketing department states that no more than 50 units
of B can be sold.
The company wishes to maximize total contribution.
(a) Formulate the linear program.
(b) Solve the linear program.
(c) Interpret the final solution by giving the optimal product mix, slack
resources and shadow prices.
Solution:
(a) Let x1 , x2 and x3 be the no. of units of product A, B and C produced
respectively.
Our objective is to maximize contribution P =
subject to
(machine hour)
(component)
(material Y)
(sales of B)
(non-negativity)
Title x1 x2 x3 S1 S2 S3 S4 P Qty Ratio
(c) Interpretation
Product mix : x1 = _________, x2 = ___________, x3 = __________
max . P = ___________
Slack resources : S1 = _________unused machine capacity
S2 = _________unused component
S3 = _________ → material Y is fully utilized
S4 = _________ → production of B is just enough
to meet its sales
Sensitivity Analysis
The study of the effect of discrete changes in the problem’s parameters on the current
optimal solution is referred to as sensitivity analysis.
[Figure: feasible region of the cloth problem, showing the lines 4x + 4y ≤ 1400, through (350, 0) and (0, 350), and 6x + 3y ≤ 1800, through (300, 0); x-axis from 0 to 900, y-axis from 0 to 700.]
Sensitivity analysis using graphical approach
(I) Sensitivity analysis on the coefficient of the objective function
By how much can the profit from one-unit length of cloth A or cloth B change
before it becomes profitable to change the optimal mix?
Note: A change in the coefficient of the variable in the objective function changes
the slope / gradient of the objective function.
Now, let the profit of 1 unit length of cloth A be a, then the new objective function
becomes P’ = ax + 8y
The slope of the new objective function = −a/8
As long as the slope of the new objective function lies between the slopes of the
two binding constraints, the optimal mix is still at point B,
i.e. −2 ≤ −a/8 ≤ −1
−16 ≤ −a ≤ −8
8 ≤ a ≤ 16
Similarly, let the profit of 1 unit length of cloth B be b, then the new objective
function becomes P” = 12x + by
The slope of the new objective function = −12/b
The optimal mix is still at B if −2 ≤ −12/b ≤ −1
−1/2 ≥ −b/12 ≥ −1
−6 ≥ −b ≥ −12
6 ≤ b ≤ 12
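The slope argument can be written as a small calculation. The sketch below assumes the two binding constraints are red wool (4x + 4y) and green wool (6x + 3y); the function names are illustrative.

```python
def coefficient_range_x(binding, y_coeff):
    """Range of a in P = a*x + y_coeff*y that keeps the same optimal corner.
    binding holds the (x-coefficient, y-coefficient) of each binding constraint."""
    # The slope -a / y_coeff must lie between the constraint slopes -cx / cy,
    # so a / y_coeff lies between the two ratios cx / cy.
    low, high = sorted(cx / cy for cx, cy in binding)
    return low * y_coeff, high * y_coeff

def coefficient_range_y(binding, x_coeff):
    """Range of b in P = x_coeff*x + b*y that keeps the same optimal corner."""
    # Here x_coeff / b lies between the two ratios, so b = x_coeff / ratio.
    low, high = sorted(cx / cy for cx, cy in binding)
    return x_coeff / high, x_coeff / low

binding = [(4, 4), (6, 3)]                 # red wool, green wool
print(coefficient_range_x(binding, 8))     # (8.0, 16.0)
print(coefficient_range_y(binding, 12))    # (6.0, 12.0)
```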
(II) Sensitivity analysis on the RHS constants of the constraints
This form of sensitivity analysis is concerned with the range over which the
RHS constant of a constraint can fluctuate such that the optimal solution
remains feasible (before the binding constraints become non-binding).
(a) By how much can the amount of red wool available be changed before
the green and red wool constraints become non-binding?
Let b1 denote the amount of red wool available. The red wool constraint
becomes 4x + 4y ≤ b1
If b1 > 1400 then the red wool constraint is shifted upwards parallel to the
old red wool constraint. The optimal point B will correspondingly shift
along the green wool constraint to a new point of intersection. At
point E(180, 240), the green and red wool constraints are still binding.
Above E, they are no longer both binding; the yellow and green wool
constraints become binding instead. Therefore, the maximum value that b1 can
take is 4(180) + 4(240) = 1680.
If b1 < 1400 then the red wool constraint shifts downwards. The optimal point
B moves down the green wool constraint until point A(300, 0). Beyond A,
the red and green wool constraints are no longer both binding. Therefore, the
minimum value that b1 can take is 4(300) + 4(0) = 1200.
Hence, if 1200 ≤ b1 ≤ 1680, then the green and red wool constraints
will still be binding, i.e. the optimal solution remains feasible.
(b) By how much can the amount of green wool available be changed
before the green and red wool constraints become non-binding?
Let b2 denote the amount of green wool available. The green wool
constraint becomes 6x + 3y ≤ b2
If b2 > 1800 then the green and red wool constraints will be binding until
point (350, 0). Therefore, the maximum value that b2 can take is
6(350) + 3(0) = 2100.
If b2 < 1800 then the green and red wool constraints will be binding until
point (75, 275). Therefore, the minimum value that b2 can take is
6(75) + 3(275) = 1275.
Hence, as long as 1275 ≤ b2 ≤ 2100, the green and red wool
constraints will still be binding.
(c) By how much can the amount of yellow wool available be changed
before the green and red wool constraints become non-binding?
Let b3 denote the amount of yellow wool available. The yellow wool
constraint becomes 2x + 6y ≤ b3
The yellow wool constraint is non-binding, so small shifts in b3 will not
affect the optimal point B.
If b3 > 1800, the yellow wool constraint will not touch the optimal point,
thus the maximum value of b3 = ∞.
If b3 < 1800, the yellow wool constraint can be shifted downwards until it reaches
point B. Therefore, the minimum value of b3 = 2(250) + 6(100) = 1100.
Hence, as long as b3 ≥ 1100, the green and red wool constraints
will still be binding.
Sensitivity analysis using Simplex method
Example (manufacturing of cloth): the simplex tableau
Title x y r s t P Quantity
r 4 4 1 0 0 0 1400
s 6 3 0 1 0 0 1800
t 2 6 0 0 1 0 1800
P -12 -8 0 0 0 1 0
r 0 2 1 -2/3 0 0 200
x 1 ½ 0 1/6 0 0 300
t 0 5 0 -1/3 1 0 1200
P 0 -2 0 2 0 1 3600
y 0 1 ½ -1/3 0 0 100
x 1 0 -1/4 1/3 0 0 250
t 0 0 -5/2 4/3 1 0 700
P 0 0 1 4/3 0 1 3800
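(I) Sensitivity analysis on the coefficients of the objective function
Let the profit of 1 unit length of cloth A change from 12 to 12 + δ. In the final
tableau, the P-row entries under the r and s columns become 1 − δ/4 and 4/3 + δ/3.
The same optimal solution will remain if
1 − δ/4 ≥ 0 and 4/3 + δ/3 ≥ 0
1 ≥ δ/4 and δ/3 ≥ −4/3
∴ −4 ≤ δ ≤ 4
Coefficient of x in the objective function:
Add 12 throughout: 12 − 4 ≤ (12 + δ) ≤ 12 + 4
8 ≤ (12 + δ) ≤ 16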
(II) Sensitivity analysis on the RHS constants
If the amount of red wool available becomes 1400 + b, the only effect on the
tableau is a change in the entries of the quantity column of each tableau.
Title x y r s t P Quantity
r 4 4 1 0 0 0 1400+b
s 6 3 0 1 0 0 1800
t 2 6 0 0 1 0 1800
P -12 -8 0 0 0 1 0
r 0 2 1 -2/3 0 0 (1400+b)-4(300)=200+b
x 1 ½ 0 1/6 0 0 300
t 0 5 0 -1/3 1 0 1200
P 0 -2 0 2 0 1 3600
y 0 1 ½ -1/3 0 0 100+b/2
x 1 0 -1/4 1/3 0 0 250-b/4
t 0 0 -5/2 4/3 1 0 700-5b/2
P 0 0 1 4/3 0 1 3800+b
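For the solution to remain feasible, every entry in the final quantity column must be
non-negative: 100 + b/2 ≥ 0, 250 − b/4 ≥ 0 and 700 − 5b/2 ≥ 0, which gives
−200 ≤ b ≤ 280
If the amount of green wool available becomes 1800 + b’, rework the
quantity column of each tableau.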
Title x y r s t P Quantity
r 4 4 1 0 0 0 1400
s 6 3 0 1 0 0 1800+b’
t 2 6 0 0 1 0 1800
P -12 -8 0 0 0 1 0
r 0 2 1 -2/3 0 0 200- 2b’/3
x 1 ½ 0 1/6 0 0 300+b’/6
t 0 5 0 -1/3 1 0 1200-b’/3
P 0 -2 0 2 0 1 3600+2b’
y 0 1 ½ -1/3 0 0 100-b’/3
x 1 0 -1/4 1/3 0 0 250+b’/3
t 0 0 -5/2 4/3 1 0 700+4b’/3
P 0 0 1 4/3 0 1 3800+4b’/3
Non-negativity of the final quantity column gives
−525 ≤ b’ ≤ 300
If the amount of yellow wool available becomes 1800 + b”, the only effect on
the tableau is a change in the entries of the quantity column of each tableau.
Title x y r s t P Quantity
r 4 4 1 0 0 0 1400
s 6 3 0 1 0 0 1800
t 2 6 0 0 1 0 1800+b”
P -12 -8 0 0 0 1 0
r 0 2 1 -2/3 0 0 200
x 1 ½ 0 1/6 0 0 300
t 0 5 0 -1/3 1 0 1200+b”
P 0 -2 0 2 0 1 3600
y 0 1 ½ -1/3 0 0 100
x 1 0 -1/4 1/3 0 0 250
t 0 0 -5/2 4/3 1 0 700+b”
P 0 0 1 4/3 0 1 3800
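The three ranges can be computed directly from the final tableau: each final quantity has the form q + (change) × d, where d is the entry in the corresponding slack column, and it must stay non-negative. The following is a minimal sketch under that assumption; `rhs_range` is an illustrative name and `None` stands for "no limit".

```python
from fractions import Fraction as F

def rhs_range(quantities, column):
    """Range of the change b for which quantities[i] + b*column[i] >= 0 for all i.
    Returns (lower, upper); None means unbounded on that side."""
    lower, upper = None, None
    for q, d in zip(quantities, column):
        if d > 0:                         # gives a lower bound b >= -q/d
            bound = -q / d
            lower = bound if lower is None else max(lower, bound)
        elif d < 0:                       # gives an upper bound b <= -q/d
            bound = -q / d
            upper = bound if upper is None else min(upper, bound)
    return lower, upper

final_qty = [F(100), F(250), F(700)]      # y, x and t rows of the final tableau
r_col = [F(1, 2), F(-1, 4), F(-5, 2)]     # red wool slack column
s_col = [F(-1, 3), F(1, 3), F(4, 3)]      # green wool slack column
t_col = [F(0), F(0), F(1)]                # yellow wool slack column

red = rhs_range(final_qty, r_col)         # -200 <= b  <= 280
green = rhs_range(final_qty, s_col)       # -525 <= b' <= 300
yellow = rhs_range(final_qty, t_col)      # b'' >= -700, no upper limit
```

Adding these changes to the original availabilities gives 1200 to 1680 for red wool, 1275 to 2100 for green wool and at least 1100 for yellow wool, in agreement with the graphical approach.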
Inventory Planning and control
Stock comprises a very large part of a business’s working capital and therefore it is
very important to control it effectively.
Total relevant inventory costs (or total operating costs, TOC) are costs relevant to the
derivation of economic order quantity (EOQ) or economic batch quantity (EBQ).
Economic order quantity (EOQ) is the quantity ordered which minimizes total
operating costs of holding and ordering stocks.
Economic batch quantity (EBQ) is the quantity manufactured which minimizes total
operating costs of holding and setting up of production.
Inventory control
The objective is to maintain stock levels so that the total relevant inventory costs ( or
total operating costs TOC) of the company are at a minimum.
Deterministic models
A deterministic model is one in which all the parameters are known with certainty,
i.e. the demand rate and lead time are known.
Assumptions
(a) the demand rate is constant
(b) the unit cost is constant
(c) lead time is zero or constant
(d) the cost of holding stock is proportional to the quantity of stock held
(e) no shortage
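Average stock = $\dfrac{Q}{2}$
where Q = order quantity, R = annual demand, S = cost per order, C = unit cost and
I = inventory holding cost expressed as a fraction of the value of average stock held.
Annual ordering costs:
$C_O = \dfrac{\text{annual demand}}{\text{order quantity}} \times \text{order cost} = \dfrac{RS}{Q}$
Annual holding costs:
$C_H = \text{average stock} \times \text{holding cost} = \dfrac{Q}{2} \times CI$
Total inventory costs:
$TOC = C_O + C_H = \dfrac{RS}{Q} + \dfrac{QCI}{2}$
Differentiate TOC with respect to Q:
$\dfrac{dTOC}{dQ} = -\dfrac{RS}{Q^2} + \dfrac{CI}{2}$
TOC has a stationary point (maximum or minimum) when $\dfrac{dTOC}{dQ} = 0$:
$\dfrac{CI}{2} = \dfrac{RS}{Q^2} \quad \therefore \quad Q^2 = \dfrac{2RS}{CI}$
Second derivative: $\dfrac{d^2TOC}{dQ^2} = \dfrac{2RS}{Q^3} > 0$, hence TOC is a minimum at this Q.
$EOQ = Q = \sqrt{\dfrac{2RS}{CI}}$
TOC is minimum when Q = EOQ:
$TOC = \dfrac{RS}{\sqrt{\dfrac{2RS}{CI}}} + \sqrt{\dfrac{2RS}{CI}} \cdot \dfrac{CI}{2}$
$\therefore\ TOC_{minimum} = \sqrt{2RSCI}$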
Graphical approach to derive EOQ
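[Figure: annual ordering cost, annual holding cost and TOC plotted against order quantity Q; the TOC curve has its minimum value $\sqrt{2RSCI}$ at $Q = \sqrt{2RS/(CI)}$.]
Minimum TOC occurs when annual ordering cost equals annual holding cost.
$\dfrac{RS}{Q} = \dfrac{QCI}{2} \;\rightarrow\; Q = \sqrt{\dfrac{2RS}{CI}} = EOQ$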
E.g. The annual demand forecast for a particular product sold by a retail store is
12,000 units, the cost is $60 per unit. The cost of ordering and receiving
delivery is $30 on each occasion. Stock holding costs are 30% per annum of
stock value. No shortage is allowed.
(a) What is the optimal order size and how many orders should be placed in
a year?
(b) What are the ordering and holding costs and hence what is the total
relevant inventory cost per annum?
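The EOQ formulas can be applied to this example with a few lines of Python. This is a sketch only, assuming R = 12,000 units, S = $30 per order, C = $60 and I = 30%; the function names are illustrative.

```python
from math import sqrt

def eoq(R, S, C, I):
    """Economic order quantity: sqrt(2RS / (CI))."""
    return sqrt(2 * R * S / (C * I))

def total_operating_cost(Q, R, S, C, I):
    """Return (annual ordering cost, annual holding cost, TOC) for order size Q."""
    ordering = R * S / Q
    holding = Q / 2 * C * I
    return ordering, holding, ordering + holding

R, S, C, I = 12000, 30, 60, 0.30
Q = eoq(R, S, C, I)                       # about 200 units per order
orders_per_year = R / Q                   # about 60 orders
ordering, holding, toc = total_operating_cost(Q, R, S, C, I)
print(round(Q), round(orders_per_year), round(ordering), round(holding), round(toc))
```

At the EOQ the ordering and holding costs are equal (about $1,800 each), so the total relevant inventory cost is about $3,600 per annum, which agrees with the formula √(2RSCI).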
E.g. Metrobus Company is a city-owned transit company which operates a fleet of
400 buses. The fleet includes buses used for public transit as well as school
buses. Metrobus Company is interested in establishing its inventory cost. All
buses use the same type of tyre and the annual requirement is estimated at
5000 tyres. The ordering cost per order is $125 and the cost of carrying a tyre in
inventory for one year is estimated at $20. Assume that the lead time is zero
and that there are 300 working days in a year.
(i) Determine the optimal order quantity, minimum total annual relevant
inventory cost and the time between orders.
(ii) If Metrobus Company must order tyres by the dozen, what would be the
percentage change in the total annual relevant inventory cost as
compared to the answers in (i)?
(II) Manufacturing model to derive EBQ formula
In the purchase model, all items purchased were treated as being received into
inventory at one time. However, when a company manufactures the items, there
is a continuous flow of stock into the inventory as items are completed. The
inventory of finished items does not build up immediately to its maximum point
but it builds up gradually since items are being produced faster than they are
being sold.
Notation
Let Q = no. of units produced per production run
R = annual demand
S = set up cost per production run
V = usage/ sales rate in units per day
P = production rate in units per day
C = unit cost
I = inventory holding or carrying cost expressed
as a % of value of average stock held.
D = no. of days in actual production.
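[Figure: stock level over time in the manufacturing model; stock rises at the rate (P − V) per day during production to a maximum of D(P − V), then falls at the rate V per day.]
When there is production, P units are produced per day and V units are
demanded per day. There is a net increase of (P − V) units per day.
Stock reaches its peak at the end of the actual production (i.e. after D days).
The maximum stock = D(P − V)
The minimum stock = 0
Since Q = DP, i.e. D = Q/P,
$\text{The average stock} = \dfrac{D(P - V)}{2} = \dfrac{Q}{2P}(P - V) = \dfrac{Q}{2}\left(1 - \dfrac{V}{P}\right)$
Total inventory costs:
$TOC = \dfrac{RS}{Q} + \dfrac{QCI}{2}\left(1 - \dfrac{V}{P}\right)$
Differentiate TOC with respect to Q:
$\dfrac{dTOC}{dQ} = \dfrac{CI}{2}\left(1 - \dfrac{V}{P}\right) - \dfrac{RS}{Q^2}$
TOC has a stationary point (maximum or minimum) when $\dfrac{dTOC}{dQ} = 0$:
$\dfrac{CI}{2}\left(1 - \dfrac{V}{P}\right) = \dfrac{RS}{Q^2} \quad \therefore \quad Q^2 = \dfrac{2RS}{CI\left(1 - \dfrac{V}{P}\right)}$
Since the quantity is non-negative,
$Q = \sqrt{\dfrac{2RS}{CI\left(1 - \dfrac{V}{P}\right)}}$
Second derivative: $\dfrac{d^2TOC}{dQ^2} = \dfrac{2RS}{Q^3} > 0$, hence TOC is a minimum at this Q.
$EBQ = Q = \sqrt{\dfrac{2RS}{CI\left(1 - \dfrac{V}{P}\right)}}$
TOC is minimum when Q = EBQ:
$TOC = \dfrac{RS}{\sqrt{\dfrac{2RS}{CI\left(1 - \dfrac{V}{P}\right)}}} + \sqrt{\dfrac{2RS}{CI\left(1 - \dfrac{V}{P}\right)}} \cdot \dfrac{CI\left(1 - \dfrac{V}{P}\right)}{2}$
$\therefore\ TOC_{minimum} = \sqrt{2RSCI\left(1 - \dfrac{V}{P}\right)}$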
E.g. ABC Company manufactures pencil boxes. The estimated annual demand is
9000 boxes. The set-up cost of each production run is $50 and the current
rate of production is 1000 boxes per month. The cost of each box is $4 and the
cost of holding one box in stock for one year is $0.40.
(b) What are the set up cost and holding cost and hence what is the total
relevant inventory cost per annum?
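The EBQ formula can be applied to this example as follows. The sketch assumes production runs 12 months a year, so that P = 12,000 boxes per year and V = 9,000 boxes per year (only the ratio V/P matters, so both rates just need the same time unit), and it takes the holding cost CI as $0.40 per box per year. The function names are illustrative.

```python
from math import sqrt

def ebq(R, S, holding_cost, V, P):
    """Economic batch quantity: sqrt(2RS / (CI (1 - V/P))), with CI = holding_cost."""
    return sqrt(2 * R * S / (holding_cost * (1 - V / P)))

def manufacturing_costs(Q, R, S, holding_cost, V, P):
    """Return (annual set-up cost, annual holding cost, TOC) for batch size Q."""
    setup = R * S / Q
    holding = Q / 2 * (1 - V / P) * holding_cost
    return setup, holding, setup + holding

R, S, holding_cost = 9000, 50, 0.40
V, P = 9000, 12 * 1000                    # assumed annual usage and production rates
Q = ebq(R, S, holding_cost, V, P)         # about 3000 boxes per run
setup, holding, toc = manufacturing_costs(Q, R, S, holding_cost, V, P)
print(round(Q), round(setup), round(holding), round(toc))
```

Under these assumptions the set-up and holding costs are about $150 each, giving a total relevant inventory cost of about $300 per annum.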
Quantity Discount (Assume constant demand & lead time = 0)
Advantages of buying in large quantities
(i) lower unit cost
(ii) lower ordering cost
(iii) fewer stock-outs
(iv) lower transportation cost
Disadvantages of buying in large quantities
(i) higher inventory carrying or holding costs
(ii) more capital required
(iii) Obsolete stock or older stock
(iv) greater risk of deterioration and depreciation of the stock.
E.g. A merchant has an annual demand for a product of 600 items. He buys from a
supplier at a cost of $6 per item and the cost of ordering is $10 per order. The
inventory holding costs are 20% p.a. of stock value. The supplier offers a 5%
discount on orders of between 200 and 999 items, and a 10% discount on
orders of 1,000 or more. Can the merchant reduce his costs by taking
advantage of either of these discounts?
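One way to answer this is to compare the total annual cost (purchase + ordering + holding) at the best order size within each price band. The sketch below is an illustration, not the notes' own working: for each band it computes the EOQ at the discounted price and moves it to the nearest quantity inside the band. The names `annual_cost`, `best_order` and `tiers` are hypothetical.

```python
from math import sqrt

def annual_cost(Q, R, S, price, I):
    """Purchase cost + ordering cost + holding cost for order size Q."""
    return R * price + R * S / Q + Q / 2 * price * I

def best_order(R, S, base_price, I, tiers):
    """tiers: list of (min_qty, max_qty, discount). Returns (cost, Q, discount)."""
    options = []
    for lo, hi, disc in tiers:
        price = base_price * (1 - disc)
        q = sqrt(2 * R * S / (price * I))   # EOQ at this price
        q = min(max(q, lo), hi)             # nearest order size inside the band
        options.append((annual_cost(q, R, S, price, I), q, disc))
    return min(options)

tiers = [(1, 199, 0.0), (200, 999, 0.05), (1000, float("inf"), 0.10)]
cost, Q, discount = best_order(R=600, S=10, base_price=6, I=0.20, tiers=tiers)
print(round(cost), round(Q), discount)
```

With these figures the costs are about $3,720 at the undiscounted EOQ of 100, about $3,564 for orders of 200 at 5% off, and about $3,786 for orders of 1,000 at 10% off, so only the 5% discount reduces the merchant's costs.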
A curve showing annual total cost plotted against ordering size.
Stochastic model
A stochastic model is one in which the rate of demand or lead time is not known with
certainty. In this case, the demand or lead time follows a known probability distribution
(probably constructed from a historical analysis of demand or lead time in the past).
Since demand or lead time is not constant, it is necessary to keep safety stock and set
a reorder point to minimize the risk of stock out.
Reorder point is defined as a condition that signals someone that a purchase order
should be placed to replenish the inventory stock of some items.
Safety stock is the extra inventory held as a buffer or protection against the possibility
of a stock out.
Reorder point = (average daily usage × lead time in days) + safety stock
Re = VL + B, where V is the average daily usage, L the lead time in days and B the safety stock.
Note:-
For deterministic model, demand is known with certainty and so safety stock is not
required.
[Figure: stock level against time, showing the reorder level (reorder point) and the lead time.]
E.g. Find the reorder level for a company which has constant demand 100
units/day, ordering cost $10/order, lead time = 3 days, holding cost/ unit/ year
= $2 and 1 year has 250 working days.
Draw a diagram to show the stock level.
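The reorder level for this example follows directly from Re = VL + B. The sketch below assumes no safety stock (B = 0), since demand is constant, and also computes the EOQ and the time between orders, which are useful when drawing the stock-level diagram. The variable names are illustrative.

```python
from math import sqrt

def reorder_point(daily_usage, lead_time_days, safety_stock=0):
    """Re = V * L + B."""
    return daily_usage * lead_time_days + safety_stock

daily_usage = 100                          # units per day
working_days = 250
annual_demand = daily_usage * working_days
order_cost = 10                            # $ per order
holding_cost = 2                           # $ per unit per year

reorder_level = reorder_point(daily_usage, lead_time_days=3)
order_quantity = sqrt(2 * annual_demand * order_cost / holding_cost)
days_between_orders = order_quantity / daily_usage
print(reorder_level, order_quantity, days_between_orders)   # 300 500.0 5.0
```

An order of 500 units is therefore placed whenever stock falls to 300 units, i.e. 3 days before the stock runs out in each 5-day cycle.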
Statistics - Introduction
Definition
1. Statistics is commonly regarded as a collection of numerical
facts which are expressed in terms of summarizing
statements and which have been collected either from several
observations or from other numerical data.
Types of variables
Quantitative variables
- These are values/data which can be quantified. These can be
discrete or continuous.
E.g. no. of people, height of student, weight of student
Qualitative variables
- These are variables which cannot be quantified e.g. beauty,
intelligence, aggressiveness etc. but they can be classified or
ranked.
E.g. Faculty ranking: Lecturer, Senior Lecturer, Assoc. Prof.,
Prof.
Discrete variable
- a variable which can take only a finite or countable number of values, usually
arising through the process of counting and usually integer-valued.
E.g. no. of people
Continuous variable
- a variable which can take a value at any fractional point along
a specified interval of finite values, usually generated by the
process of measurement.
E.g. height of student
Sampling
The term population refers to the set of all the items under
consideration (not necessarily human) in a particular enquiry.
A sample is a group of items drawn from the population on which
observations are made. A sample is a part of the population. We
hope to draw some conclusions about the population by studying
the sample.
Useful sample
Conclusions can be drawn about the population if the sample is
1. of the proper size; the larger the sample, the more reliable the
results.
2. random, to avoid bias in the results.
Frequency distribution
A frequency distribution is a grouping of data into classes, showing
the number or frequency of data in each class. A frequency
distribution can be presented in tabular form called frequency
distribution table.
Basic rules of construction
1. Pick out highest and lowest values.
2. Find range (=highest value - lowest value)
3. Divide into class intervals, normally 5 to 15 classes,
preferably of equal width (not necessary all the time). Classes
must be chosen so that all the data can be included and each
item of data can only go into one class. Use relatively few classes
so that the data are easily grouped and show the pattern,
but not so few that too much detail is lost.
4. For each figure in the raw data, insert a tally mark against the
appropriate class, e.g. llll llll ll .
5. Total the tally marks.
6. Tabulate the frequency distribution.
Some terminology used in the construction of frequency distribution
1. The largest and smallest values that can go into any given
class are referred to as class limits.
E.g. 5 - 6 has lower class limit 5 and upper class limit 6
10 - 20 has lower class limit 10 and upper class limit 20
A class which has either no upper class limit or no lower class
limit is called an open-ended class.
E.g. ≤ 30 is an open-ended class with no lower class limit
≥ 50 is an open-ended class with no upper class limit
In further calculations, assume an open-ended class to be of the
same size as its immediate neighboring class.
3. Class size
= upper class boundary − lower class boundary
Preferably (but not necessarily) equal.
1. Histogram
A histogram has values of the variable on the horizontal
scale and frequencies on the vertical scale. For each class,
a rectangle is constructed with base equal to the class size
and height determined from the class frequency. The areas
of the rectangles must be proportional to the frequencies. If
all classes have equal size, then the heights of the
rectangles will be proportional to the class frequencies. If
the class sizes are unequal, then the height of each rectangle will be
proportional to the class frequency divided by the class size.
The class boundaries are usually marked on the horizontal
scale.
3 2 3 6 3 6 2 4 7 4
6 4 1 5 8 3 4 10 1 5
4 7 11 4 15 3 2 6 3 2
13 1 5 8 4 1 3 2 10 8
3 9 6 3 1 14 4 5 3 1
Solution:
(i)
No. of occupants   Frequency   Class boundaries   Class size
1 − 2              11          0.5 − 2.5          2.5 − 0.5 = 2
3 − 4              18          2.5 − 4.5          4.5 − 2.5 = 2
5 − 6              9           4.5 − 6.5          6.5 − 4.5 = 2
7 − 8              5           6.5 − 8.5          2
9 − 10             3           8.5 − 10.5         2
11 − 12            1           10.5 − 12.5        2
13 − 14            2           12.5 − 14.5        2
15 − 16            1           14.5 − 16.5        2
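The tallying steps can be checked with a short program. This is a minimal sketch that groups the 50 observations above into classes of width 2 starting at 1; the function name `frequency_table` and its parameters are illustrative.

```python
from collections import Counter

data = [
    3, 2, 3, 6, 3, 6, 2, 4, 7, 4,
    6, 4, 1, 5, 8, 3, 4, 10, 1, 5,
    4, 7, 11, 4, 15, 3, 2, 6, 3, 2,
    13, 1, 5, 8, 4, 1, 3, 2, 10, 8,
    3, 9, 6, 3, 1, 14, 4, 5, 3, 1,
]

def frequency_table(values, width=2, start=1):
    """Group integer data into classes start..start+width-1, and so on.
    Each row is (lower limit, upper limit, frequency, lower boundary, upper boundary)."""
    counts = Counter((v - start) // width for v in values)
    table = []
    for k in range(max(counts) + 1):
        lower = start + k * width
        upper = lower + width - 1
        table.append((lower, upper, counts.get(k, 0), lower - 0.5, upper + 0.5))
    return table

for lower, upper, freq, lb, ub in frequency_table(data):
    print(f"{lower:>2} - {upper:<2}  {freq:>2}   {lb} - {ub}")
```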
(ii)
[Figure: histogram of the number of occupants of 50 houses (equal class size); frequency (0 to 20) against class boundaries, with the number of occupants on the horizontal axis.]
(iii)
No. of occupants   Frequency       Class boundaries   Class size   Adjusted freq.
1 − 2              11              0.5 − 2.5          2            (11 × 2)⁄2 = 11
3 − 4              18              2.5 − 4.5          2            (18 × 2)⁄2 = 18
5 − 8              9 + 5 = 14      4.5 − 8.5          4            (14 × 2)⁄4 = 7
9 − 10             3               8.5 − 10.5         2            (3 × 2)⁄2 = 3
11 − 16            1 + 2 + 1 = 4   10.5 − 16.5        6            (4 × 2)⁄6 = 1.33
2. Frequency Polygon
It is a line graph of class frequency plotted against class mark.
It can be obtained by connecting mid-points of the tops of the
rectangles in the histogram. It is customary to join the points
at each end of the diagram to the base line at the centres of
the adjoining class intervals (i.e. as if there are 2 classes with
0 frequency). The area under the frequency polygon is equal
to the area of the histogram.
[Figure: '<' ogive of the number of occupants; cumulative frequency (0 to 50) against class boundaries 0.5, 2.5, …, 16.5.]
4. Frequency curve
As the class intervals are made smaller and smaller and the
number of observations (frequency) increases, the histogram
or frequency polygon will closely approximate a curve, which is
called the frequency curve. A smooth curve is drawn instead of
straight line segments. A smoothed ogive is obtained by
smoothing the ogive.
STATISTICAL MEASURES
1.Introduction
(i) $\sum X = x_1 + x_2 + x_3 + \dots + x_n$, i.e. $\sum X = \sum_{i=1}^{n} x_i$
(iv) $\sum (X + Y) = \sum X + \sum Y$, where Y is another variable
(vi) $\sum kX = k \sum X$, where k is a constant
i. Arithmetic mean, $\bar{X}$
It is defined as the value each item in the distribution would
have if the total of all the values were shared out equally among all the
items.
a. For ungrouped data $x_1, x_2, x_3, \dots, x_n$:
$\bar{X} = \dfrac{x_1 + x_2 + \dots + x_n}{n}$
b. For grouped data of a frequency distribution:
The arithmetic mean, $\bar{X} = \dfrac{f_1 x_1 + f_2 x_2 + \dots + f_n x_n}{f_1 + f_2 + \dots + f_n} = \dfrac{\sum fx}{\sum f}$
Solution:
Scores     Number of candidates (f)   x      fx       x²        fx²
10 - 19    1                          14.5   14.5     210.25    210.25
20 - 29    6                          24.5   147      600.25    3601.5
30 - 39    9                          34.5   310.5    1190.25   10712.25
40 - 49    31                         44.5   1379.5   1980.25   61387.75
50 - 59    42                         54.5   2289     2970.25   124750.5
60 - 69    32                         64.5   2064     4160.25   133128
70 - 79    17                         74.5   1266.5   5550.25   94354.25
80 - 89    10                         84.5   845      7140.25   71402.5
90 - 99    2                          94.5   189      8930.25   17860.5
Total      150                               8505               517407.5
Mean score $\bar{X} = \dfrac{\sum fx}{\sum f} = \dfrac{8505}{150} = 56.7$
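The grouped mean can be verified numerically. The sketch below takes each class mark as the midpoint of the class limits, as in the table; `grouped_mean` and `scores` are illustrative names.

```python
def grouped_mean(classes):
    """classes: list of (lower limit, upper limit, frequency).
    Uses the class mark (midpoint) as the representative value x."""
    total_f = sum(f for _, _, f in classes)
    total_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
    return total_fx / total_f

scores = [
    (10, 19, 1), (20, 29, 6), (30, 39, 9), (40, 49, 31), (50, 59, 42),
    (60, 69, 32), (70, 79, 17), (80, 89, 10), (90, 99, 2),
]
print(round(grouped_mean(scores), 1))   # 56.7
```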
ii. Mode, $\hat{X}$
The mode of a set of data is the value which occurs with the
greatest frequency, i.e. the most common value in the
distribution. The mode may not exist and, even if it does, it
may not be unique.
For grouped data,
$\text{Mode}, \hat{X} = L_{\hat{X}} + \left(\dfrac{\Delta_1}{\Delta_1 + \Delta_2}\right) \times C_{\hat{X}}$ where
$L_{\hat{X}}$ = lower boundary of the modal class
$\Delta_1$ = excess of the modal frequency over the frequency of the
preceding class
$\Delta_2$ = excess of the modal frequency over the frequency of the next class
$C_{\hat{X}}$ = class size of the modal class.
E.g. Find the modal height for the distribution of the heights of 200
trees.
Class (meters)   No. of trees (freq.)
60 - 62          10
63 - 65          36
66 - 68          84
69 - 71          54
72 - 74          16
$\text{Mode} = 65.5 + \left(\dfrac{84 - 36}{(84 - 36) + (84 - 54)}\right)(68.5 - 65.5) = 67.3\ \text{m}$
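The modal height can be checked with a small function. This is a sketch of the grouped-data mode formula only; it assumes a single modal class, and `grouped_mode` is an illustrative name.

```python
def grouped_mode(boundaries, freqs):
    """Mode of grouped data. boundaries has one more entry than freqs."""
    k = max(range(len(freqs)), key=freqs.__getitem__)    # index of the modal class
    prev_f = freqs[k - 1] if k > 0 else 0
    next_f = freqs[k + 1] if k < len(freqs) - 1 else 0
    d1 = freqs[k] - prev_f        # excess over the preceding class
    d2 = freqs[k] - next_f        # excess over the next class
    size = boundaries[k + 1] - boundaries[k]
    return boundaries[k] + d1 / (d1 + d2) * size

boundaries = [59.5, 62.5, 65.5, 68.5, 71.5, 74.5]
freqs = [10, 36, 84, 54, 16]
print(round(grouped_mode(boundaries, freqs), 1))   # 67.3
```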
iii. Median, $\tilde{X}$
The median is defined as the value of the middle item when the
data are arranged in ascending or descending order of
magnitude. It divides a set of data such that at least half of the data
items are as large as or larger than it, and at least half of the
data items are as small as or smaller than it.
If n is even, then $\tilde{X}$ is the mean of the values of the $\left(\frac{n}{2}\right)$th
item and the $\left(\frac{n}{2} + 1\right)$th item.
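b. For grouped data
2. The position of $\tilde{X}$, $\dfrac{\sum f}{2}$, is calculated.
3. Locate the $\tilde{X}$ class, then:
$\tilde{X} = L_{\tilde{X}} + \dfrac{\left[\dfrac{\sum f}{2} - \left(\sum f\right)_{\tilde{X}}\right]}{f_{\tilde{X}}} \times C_{\tilde{X}}$
where $L_{\tilde{X}}$ = lower boundary of the $\tilde{X}$ class
$\left(\sum f\right)_{\tilde{X}}$ = cumulative frequency preceding the $\tilde{X}$ class
$f_{\tilde{X}}$ = frequency of the median class
$C_{\tilde{X}}$ = class size of the $\tilde{X}$ class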
E.g. Find the median height of the following frequency distribution
Solution:
Class (meters)   Number of trees   Class boundaries   Cumulative frequency
60 - 62          10                59.5 - 62.5        10
63 - 65          36                62.5 - 65.5        46
66 - 68          84                65.5 - 68.5        130
69 - 71          54                68.5 - 71.5        184
72 - 74          16                71.5 - 74.5        200
Position of $\tilde{X}$ = $\dfrac{200}{2}$ = 100
$\tilde{X}$ class boundaries: 65.5 - 68.5, so L = 65.5 and C = 68.5 − 65.5 = 3
$\text{median} = 65.5 + \left(\dfrac{100 - 46}{84}\right) \times 3 = 67.4\ \text{m}$
The fractiles
The median belongs to a general class of statistical
description called fractiles. A fractile is a value below which
lies a given percentage of the data ; for the median this
percentage is 50%. The quartiles divide the data into 4 equal
parts , the deciles divide the data into 10 equal parts, the
percentiles divide the data into 100 equal parts , etc.
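Formulae of fractile
(a) Position of $Q_i = \frac{n}{4} \times i$ ; i = 1, 2, 3
$Q_i = L_{Q_i} + \left[\frac{\frac{n}{4} \times i - (\sum f)_{Q_i}}{f_{Q_i}}\right] \times C_{Q_i}$
(b) Position of $D_i = \frac{n}{10} \times i$ ; i = 1, 2, …, 9
$D_i = L_{D_i} + \left[\frac{\frac{n}{10} \times i - (\sum f)_{D_i}}{f_{D_i}}\right] \times C_{D_i}$
(c) Position of $P_i = \frac{n}{100} \times i$ ; i = 1, 2, …, 99
$P_i = L_{P_i} + \left[\frac{\frac{n}{100} \times i - (\sum f)_{P_i}}{f_{P_i}}\right] \times C_{P_i}$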
Locating Fractiles from the ‘<’ ogive.
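[Figure: '<' ogive of cumulative frequency against class boundaries; Q1, $\tilde{X}$ and Q3 are read off the horizontal axis at cumulative frequencies n/4, n/2 and 3n/4.]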
[Figure: '<' ogive of height of trees; cumulative frequency (no. of trees, 0 to 200) against class boundaries (59.5 to 74.5).]
Position of $Q_3 = \frac{3 \times 200}{4} = 150$; $Q_3$ = 70 meters
75% of the trees have height < 70 meters.
Position of $D_7 = \frac{7 \times 200}{10} = 140$; $D_7$ = 69 meters
70% of the trees have height < 69 meters.
Position of $P_{20} = \frac{20 \times 200}{100} = 40$; $P_{20}$ = 65 meters
20% of the trees have height < 65 meters.
Using formula: $P_{20} = 62.5 + \frac{(40 - 10)}{36} \times 3 = 65$
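The quartile, decile and percentile formulae share one interpolation step, so they can be sketched as a single function. This is an illustrative sketch only; `grouped_fractile` and its argument layout are assumptions, and the values read from the ogive above are rounded versions of what the formula gives.

```python
def grouped_fractile(boundaries, freqs, fraction):
    """Value below which `fraction` of the grouped data lies.

    boundaries: class boundaries (one more entry than freqs).
    freqs: class frequencies, in the same order.
    """
    position = fraction * sum(freqs)
    cumulative = 0  # cumulative frequency preceding the current class
    for k, f in enumerate(freqs):
        if f > 0 and cumulative + f >= position:
            width = boundaries[k + 1] - boundaries[k]
            return boundaries[k] + (position - cumulative) / f * width
        cumulative += f
    return boundaries[-1]


bounds = [59.5, 62.5, 65.5, 68.5, 71.5, 74.5]
trees = [10, 36, 84, 54, 16]
median = grouped_fractile(bounds, trees, 0.50)
q3 = grouped_fractile(bounds, trees, 0.75)
d7 = grouped_fractile(bounds, trees, 0.70)
p20 = grouped_fractile(bounds, trees, 0.20)
print(round(median, 1), round(q3, 1), round(d7, 1), round(p20, 1))
```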
Relationship between mean, median and mode
$\bar{X} = \hat{X} = \tilde{X}$
b. Positively skewed
c. Negatively skewed
$\bar{X} < \tilde{X} < \hat{X}$
[Figure: two distributions, one with a small spread and one with a big spread.]
The extent of variability or the ‘spread’ of the data may be
described by the range, the quartile deviation and the standard
deviation.
(i) Range
a. For ungrouped data
Range = maximum value – minimum value
E.g. {2, 2, 3, 4, 5} ; range = 5 – 2 = 3
(ii) Quartile deviation, Q.D. or semi-interquartile range
It gives the average amount by which the two quartiles Q1
and Q3 differ from the median.
$Q.D. = \frac{Q_3 - Q_1}{2}$
a. For ungrouped data
E.g. Find the Q.D. of the following data:
2, 2, 2, 3, 4, 5, 5, 6, 6, 6
Advantages
1. It is not affected by extreme values because it is calculated
on the central values of the distribution.
2. Calculation is unaffected by open-ended classes.
Disadvantage
It is not fully representative of a set of measurements as it is
not based on all the information available.
(iii) Standard deviation
This is the most important measure of dispersion. It can be
used for further statistical analysis.
Standard deviation is the root-mean-square deviation
between the individual values and the mean in a distribution.
square deviations:
$(x_1 - \bar{x})^2, (x_2 - \bar{x})^2, \ldots, (x_n - \bar{x})^2$
mean-square deviation: $\frac{\sum (x - \bar{x})^2}{n}$
root-mean-square deviation: $\sqrt{\frac{\sum (x - \bar{x})^2}{n}}$
Computation of the standard deviation:
a. Ungrouped data
Alternatively:
Variance
E.g. Find the standard deviation and variance for the following data:
2, 12, 7, 5, 9
N=5
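∑x = 2 + 12 + 7 + 5 + 9 = 35
∑x² = 2² + 12² + 7² + 5² + 9² = 303
Population standard deviation,
$\sigma = \sqrt{\frac{\sum x^2}{N} - \left(\frac{\sum x}{N}\right)^2} = \sqrt{\frac{303}{5} - \left(\frac{35}{5}\right)^2} = \sqrt{11.6} = 3.41$
Population variance, $\sigma^2 = 3.41^2 = 11.6$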
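As a quick check of the arithmetic, the shortcut formula can be compared with the standard library's population standard deviation. This is only a sketch; the variable names are illustrative.

```python
import math
import statistics

data = [2, 12, 7, 5, 9]
n = len(data)
sum_x = sum(data)                    # 35
sum_x2 = sum(x * x for x in data)    # 303

# Shortcut formula: mean of the squares minus the square of the mean.
variance = sum_x2 / n - (sum_x / n) ** 2
sigma = math.sqrt(variance)

# statistics.pstdev uses the definition (root-mean-square deviation).
sigma_check = statistics.pstdev(data)
print(round(variance, 1), round(sigma, 2))
```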
b. Grouped data
Population mean, $\mu = \frac{\sum fx}{\sum f} = \frac{792}{40} = 19.8$
Sample mean, $\bar{x} = \frac{\sum fx}{\sum f} = \frac{3425}{100} = 34.25$ units
Remarks:
(1) The larger the value of the standard deviation, the greater the
dispersion or spread of the data from the mean. If the standard
deviation is large, the mean is not really suitable as a
‘representative’ value. If the standard deviation is small then
mean is a more representative value as the data values are
concentrated around the mean.
Data range ≈ 6s
We may estimate s approximately by $\frac{1}{6} \times$ range.
[Figure: normal curve; 68.27% of the area lies between $\bar{X} - s$ and $\bar{X} + s$, and 95.45% between $\bar{X} - 2s$ and $\bar{X} + 2s$.]
5. Skewness
It is the degree of asymmetry or departure from symmetry of a
distribution. A measure of skewness indicates not only the amount
of ‘asymmetry’ but also its direction. A set of data is said to be
skewed in the direction of the extreme values, or speaking in
terms of frequency curves, in the direction of the ‘tail’.
E.g. The mean and median of a distribution are found to be
30.9 and 28.8 respectively. The standard deviation is
13.23.
$Sk = \frac{3(\bar{X} - \tilde{X})}{s} = \frac{3(30.9 - 28.8)}{13.23} = 0.48$ shows positive skewness
(i.e. skewed to the right) and the degree of skewness is
moderate.
CORRELATION AND LINEAR REGRESSION
Correlation
▪ measures the strength of the relationship between two
variables
▪ involves a bivariate data / distribution
Regression
▪ a study to identify the relationship between two or more
variables using a mathematical equation
▪ is normally used for estimation purposes
Example:
A study on relationship between the sales of ice cream and the
temperatures
▪ temperature is an independent variable since it can be
used to explain the sales of ice cream
▪ sales is a dependent variable since the sales depends on
temperature
Univariate distribution
▪ data of single characteristic is grouped together
▪ Example: height of student, price of item etc
Bivariate distribution
▪ data of two characteristics are grouped together
▪ Example: sales of ice cream and temperature, sales of good
and advertisement expenses.
Scatter Diagram
▪ a plot of paired observations ( X, Y )
▪ illustrates whether
   - any relationship between the dependent and independent
     variables exists
   - the relationship is positive or negative
   - the relationship is linear or non-linear
▪ A positive relationship exists when both variables ↑ (or ↓)
at the same time
▪ In a negative relationship, as one variable ↑, the other
variable will ↓, and vice versa
Example:
The data below relates the weekly maintenance cost ($) to the
age (in months) of ten machines of similar type in a
manufacturing company.
Machine 1 2 3 4 5 6 7 8 9 10
Age (X) 5 10 15 20 30 30 30 50 50 60
Cost (Y) 190 240 250 300 310 335 300 300 350 395
Construct a scatter diagram and comment on it.
Solution:
[Figure: Scatter Diagram of Weekly Maintenance Cost and Age of Machine; maintenance cost ($) against age of machine (months).]
Comment:
Two types of correlation
1. Linear correlation
▪ correlation is said to be linear if the relationship can be
represented by a straight line
2. Non-linear correlation (or curvilinear correlation)
▪ correlation is said to be non-linear if the relationship can be
represented by a curve
Positive linear correlation
▪ An increase in the independent variable (X) will result in an
increase in the dependent variable (Y)
Correlation Coefficient ( r )
▪ measure the strength of linear relationship between two
variables
▪ has a range of values from –1 to +1, i.e. $-1 \le r \le +1$
▪ if r = 0, then there is no linear relationship between the
two variables
▪ words of different strength are used to describe the
degree of correlation; rough guides are listed in the
following table for interpretation purposes
r Correlation coefficient
▪ The degree of strength of the relationship does not depend
on the sign of the coefficient of correlation.
▪ E.g. Coefficient of – 0.92 and + 0.92 have equal strength,
both indicate very strong correlation between the two
variables.
Product Moment Correlation Coefficient, r
Example:
Calculate the product moment correlation coefficient for the
following data. What does the value of the coefficient indicate?
X 5 6 7 9 8
Y 8 9 9 11 13
Solution:
X      Y      XY     X²     Y²
5      8      40     25     64
6      9      54     36     81
7      9      63     49     81
9      11     99     81     121
8      13     104    64     169
∑X = 35   ∑Y = 50   ∑XY = 360   ∑X² = 255   ∑Y² = 516
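The column sums above feed directly into the product moment formula. The sketch below assumes the usual computational form, $r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{[n\sum X^2 - (\sum X)^2][n\sum Y^2 - (\sum Y)^2]}}$, which does not survive in these notes; `pearson_r` is an illustrative name.

```python
import math


def pearson_r(xs, ys):
    """Product moment correlation coefficient from the column sums."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    numerator = n * sxy - sx * sy
    denominator = math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
    return numerator / denominator


r = pearson_r([5, 6, 7, 9, 8], [8, 9, 9, 11, 13])
print(round(r, 4))
```

A value of r near 0.79 would usually be read as a fairly strong positive linear relationship between X and Y.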
Example:
Refer to the data given in the previous example, calculate the
product moment correlation coefficient between age and
maintenance cost. Hence, find the coefficient of determination
and comment on the results.
Machine 1 2 3 4 5 6 7 8 9 10
Age (X) 5 10 15 20 30 30 30 50 50 60
Cost (Y) 190 240 250 300 310 335 300 300 350 395
Solution:
X      Y      XY             X²          Y²
5      190    5(190) = 950   5² = 25     190² = 36100
10     240    2400           100         57600
15     250    3750           225         62500
20     300    6000           400         90000
30     310    9300           900         96100
30     335    10050          900         112225
30     300    9000           900         90000
50     300    15000          2500        90000
50     350    17500          2500        122500
60     395    23700          3600        156025
∑X = 300   ∑Y = 2970   ∑XY = 97650   ∑X² = 12050   ∑Y² = 913050
Comment:
Correlation and Causation
✓ Causation Correlation
The procedure for obtaining $r_s$ is given as follows:
STEP 1 Rank the X values (to give $R_1$ values)
STEP 2 Rank the Y values (to give $R_2$ values)
STEP 3 For each pair of ranks, calculate $d^2 = (R_1 - R_2)^2$
and then calculate $\sum d^2$
STEP 4 The value of the Spearman’s rank correlation
coefficient can be found using the following
formula:
$r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$, with $-1 \le r_s \le +1$
Solution:
Competitor     A    B    C    D    E    F    G    H    I    J
R1             4    9    2    5    3    10   6    7    8    1
R2             6    10   2    8    1    9    7    4    5    3
d = R1 – R2    –2   –1   0    –3   2    1    –1   3    3    –2
d²             4    1    0    9    4    1    1    9    9    4
n = 10, $\sum d^2 = 42$
$r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} = 1 - \frac{6(42)}{10(10^2 - 1)} = 0.7455$
Comment:
Spearman’s coefficient of rank correlation for the data is 0.7455,
indicating that there is a moderate degree of association
between rankings of X and Y i.e. the opinions of the 2 judges
agree moderately well.
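Steps 3 and 4 of the procedure can be sketched in a few lines, starting from the two rows of ranks in the solution table. The function name `spearman_from_ranks` is illustrative.

```python
def spearman_from_ranks(r1, r2):
    """Spearman's rank correlation from two lists of ranks."""
    n = len(r1)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Ranks of competitors A to J from the solution table.
R1 = [4, 9, 2, 5, 3, 10, 6, 7, 8, 1]
R2 = [6, 10, 2, 8, 1, 9, 7, 4, 5, 3]
rs = spearman_from_ranks(R1, R2)
print(round(rs, 4))
```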
Note:
Sometimes two or more individuals or entries may be tied in
rank. In this case, each is given the average of the ranks, as
shown by the following example.
Salesman   1    2     3    4    5     6    7    8
Sales      20   35    25   20   35    40   20   10
Ranking    3    6.5   5    3    6.5   8    3    1
Solution:
X     R1     Y     R2     d = R1 – R2   d²
30    3.5    30    9      -5.5          30.25
31    6      14    1      5             25
32    7      30    9      -2            4
30    3.5    23    4.5    -1            1
46    10.5   32    11     -0.5          0.25
30    3.5    26    6.5    -3            9
19    1      20    2      -1            1
35    8      21    3      5             25
40    9      23    4.5    4.5           20.25
46    10.5   30    9      1.5           2.25
57    12     35    12     0             0
30    3.5    26    6.5    -3            9
$\sum d^2 = 127$
n = 12, $r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} = 1 - \frac{6(127)}{12(12^2 - 1)} = 0.5559$
The result shows that there is a fairly weak positive correlation
between rankings of vehicles owned and rankings of number
of road deaths.
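The averaging of tied ranks is easy to get wrong by hand, so a small sketch may help. It assumes ranks are assigned in ascending order (smallest value gets rank 1), as in the tables above; `average_ranks` is an illustrative helper.

```python
def average_ranks(values):
    """Ascending ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = ((i + 1) + (j + 1)) / 2  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


sales_ranks = average_ranks([20, 35, 25, 20, 35, 40, 20, 10])

x = [30, 31, 32, 30, 46, 30, 19, 35, 40, 46, 57, 30]
y = [30, 14, 30, 23, 32, 26, 20, 21, 23, 30, 35, 26]
rx, ry = average_ranks(x), average_ranks(y)
sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
n = len(x)
rs = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
print(sales_ranks, sum_d2, round(rs, 4))
```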
LINEAR REGRESSION
▪ Regression is concerned with obtaining a mathematical
equation which describes the relationship between two
variables
▪ The equation can be used for comparison or estimation
purpose
Notes:
For any set of bivariate data, the least squares regression line of
Y on X
1. is used to estimate a value of Y given a value of X
2. passes through the mean point $(\bar{X}, \bar{Y})$ of the data
Example
The following table shows the output at a factory and costs of
production over the past 5 months. Find the equation of the least
squares regression line.
Month 1 2 3 4 5
Output(000’s units) 20 16 24 22 18
Costs (RM’000) 82 70 90 85 73
Solution:
Let X = output in 000’s units; Y = total costs in RM’000.
X      Y      XY              X²
20     82     20(82) = 1640   20² = 400
16     70     1120            256
24     90     2160            576
22     85     1870            484
18     73     1314            324
∑X = 100   ∑Y = 400   ∑XY = 8104   ∑X² = 2040
$a = \frac{\sum Y}{n} - b \frac{\sum X}{n}$
Regression Analysis as a forecasting tool
• Two types of estimation using the regression equation
1. Extrapolation estimate
   - Extrapolation finds the value of Y outside the observed
     range of X
   - most commonly used for forecasting using a time series
   - may be less accurate and unreliable to a certain extent
2. Interpolation estimate
   - Interpolation finds the value of Y within the observed
     range of X
   - forecasting using interpolation is more accurate and
     more reliable than using extrapolation
Example:
The data below relates the weekly maintenance cost ($) to the
age (in months) of machines of similar type in a manufacturing
company.
Machine 1 2 3 4 5 6 7 8 9 10
Age (X) 5 10 15 20 30 30 30 50 50 60
Cost (Y) 190 240 250 300 310 335 300 300 350 395
(a) Find the least squares regression line of maintenance cost
on age.
(b) Using the regression line, predict the maintenance cost for
a machine of this type, which is 40 months old. Comment
on the accuracy of your estimate.
(c) Plot a scatter diagram and draw the regression line.
(d) Predict the maintenance cost for a 40-month old machine
graphically.
Solution:
From the previous Example, we have
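∑X = 300   ∑Y = 2970   ∑XY = 97650   ∑X² = 12050   ∑Y² = 913050
(a) $b = \frac{n \sum XY - (\sum X)(\sum Y)}{n \sum X^2 - (\sum X)^2} = \frac{10(97650) - (300)(2970)}{10(12050) - (300)^2} = 2.8033$
$a = \frac{\sum Y}{n} - b \frac{\sum X}{n} = \frac{2970}{10} - 2.8033\left(\frac{300}{10}\right) = 212.901$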
The least squares regression line of maintenance cost on age
is Ŷ = 212.901 + 2.8033X
(b) When X = 40, $\hat{Y}$ = 212.901 + 2.8033(40) = 325.03
Comment: This estimate is obtained by interpolation since
X=40 lies within the range of X, i.e., [5,60]. Hence, it is more
accurate and reliable.
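The least squares coefficients and the prediction at X = 40 can be reproduced with a short sketch. `least_squares` is an illustrative name; the second call applies the same formulas to the factory output and cost data from the earlier example.

```python
def least_squares(xs, ys):
    """Return (a, b) for the least squares line Y = a + bX."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    a = sy / n - b * sx / n
    return a, b


age = [5, 10, 15, 20, 30, 30, 30, 50, 50, 60]
cost = [190, 240, 250, 300, 310, 335, 300, 300, 350, 395]
a, b = least_squares(age, cost)
estimate_40 = a + b * 40

# X = 40 lies inside the observed range of ages, so this is interpolation.
is_interpolation = min(age) <= 40 <= max(age)

# Factory example: output (000's units) against costs (RM'000).
a2, b2 = least_squares([20, 16, 24, 22, 18], [82, 70, 90, 85, 73])
print(round(a, 3), round(b, 4), round(estimate_40, 2), a2, b2)
```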
[Figure: scatter diagram of maintenance cost against age of machine (months), 0 to 60.]
Interpretation of 'a ' and 'b '
In the regression equation Y = a + b X,
✓ a is the estimated value of Y when X = 0; i.e. the Y-intercept
value
✓ b indicates the change in Y for a unit change in X
✓ b is positive ⇒ positive linear relationship between X and Y
✓ b is negative ⇒ negative linear relationship between X and Y
✓ b will always have the same sign as the coefficient of
correlation, r
Example:
If Y = a + b X = 3.33 + 0.47 X, then interpret the values of 'a '
and 'b ' ; where Y = sales ($'000) and X = advertising costs
($'00),
Solution:
a = 3.33 is the value of Y when X = 0. Hence it is the value of
sales ($'000) when there is no expenditure on advertising.
Example:
If Y = a + b X =28 + 2.6X, then interpret the values of 'a ' and 'b '
; where Y = expenditure in $'000 and X = output in 000's units.
Solution:
THE ADVANTAGES AND DISADVANTAGES OF
REGRESSION ANALYSIS
Advantages:
(a) It can be used to estimate a line of best fit using all the data
available. It is likely to provide a more reliable estimate than
any other technique of producing a straight line of best fit
(for example, estimating by eye).
Disadvantages:
(a) It assumes a linear relationship between the two variables,
whereas a non-linear relationship may exist.
Probability
1. Introduction
1.3 Event
An event A associated with the experiment is a subset of the sample
space, S .
E.g. Throwing a die once.
A = {5} is an event of S.
B = { even number } = { 2, 4, 6 } is also an event of S.
[Figure: Venn diagram of the sample space S = {1, 2, 3, 4, 5, 6}.]
(d) If A and B are two events of the sample space S, then the event
which consists of all the sample points in A, B or both is called
A union B and it is denoted by A ∪ B or A + B.
E.g. Throwing a die once.
S = {1, 2, 3, 4, 5, 6}, A = {1, 3, 5} and B = {2, 3, 6}; then
A ∪ B = {1, 2, 3, 5, 6}
(e) If A and B are two events of the sample space S, then the event
which consists of all the sample points that are common to A and
B is called A intersect B and it is denoted by A ∩ B or AB.
E.g. Throwing a die once.
S = {1, 2, 3, 4, 5, 6}, A = {1, 3, 5} and B = {2, 3, 6}; then
A ∩ B = {3}
(f) A ∪ B is an event which occurs iff either A or B or both occur.
A ∩ B is an event which occurs iff both A and B occur together.
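Events are subsets of the sample space, so Python's built-in sets give a direct illustration of union and intersection for the die example. The variable names are illustrative.

```python
S = {1, 2, 3, 4, 5, 6}   # sample space for one throw of a die
A = {1, 3, 5}
B = {2, 3, 6}

union = A | B            # sample points in A, in B, or in both
intersection = A & B     # sample points common to A and B

print(sorted(union), sorted(intersection))
```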
(g) Two events A and B are said to be mutually exclusive if the
occurrence of A excludes B and vice versa. That is, A and B
cannot occur together or if A occurs then B cannot occur and vice
versa.
E.g. Throwing a die once. S = { 1, 2, 3, 4, 5, 6 }
If A = { 1, 2, 3 }, B = { 4, 5, 6 } then A and B are mutually
exclusive events.
If C = { 1, 5, 6 }, D = { 2, 3, 5 } then C and D are not mutually
exclusive events.
E.g. In an examination.
A = you pass the exam.
B = you fail the exam.
A and B are mutually exclusive events.
2. Approaches to probability
E.g. If a pair of dice are thrown, what is the probability that a total of 8
shows?
Let A be the event of getting a total of 8.
The first die gives 6 possible outcomes.
The second die gives 6 possible outcomes.
Therefore, 6x6 = 36 possible outcomes for the experiment of
throwing a pair of dice; N = 36
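The 36 equally likely outcomes can be enumerated to count those that give a total of 8. This sketch assumes the classical definition P(A) = (number of outcomes in A) / N; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All ordered pairs (first die, second die).
outcomes = list(product(range(1, 7), repeat=2))
favourable = [pair for pair in outcomes if sum(pair) == 8]

p_total_8 = Fraction(len(favourable), len(outcomes))
print(favourable, p_total_8)
```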
2.2 Relative frequency approach to probability (empirical approach)
If an experiment is repeated N times under the same conditions and $N_A$ of
these trials result in the occurrence of event A, then the probability of
event A is
$P(A) = \lim_{N \to \infty} \frac{N_A}{N}$
This approach is formulated on the assumption that the proportion $\frac{N_A}{N}$
will tend to be stable and approach a constant as the number of trials N
increases.
E.g. No. of ‘head’ resulting from throwing a coin.
No. of throws (N)     No. of ‘head’ ($N_A$)     $N_A / N$
___________________________________________
10 4 0.4
100 54 0.54
1,000 520 0.52
10,000 5,100 0.51
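The stabilising behaviour in the table can be imitated by simulation. This is a sketch only: it assumes a fair coin and uses a seeded pseudo-random generator, so the figures will differ from the table above.

```python
import random


def relative_frequency(num_throws, seed=0):
    """Proportion of heads in `num_throws` simulated throws of a fair coin."""
    rng = random.Random(seed)  # seeded so that the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(num_throws))
    return heads / num_throws


for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, relative_frequency(n))
```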
Note: A single probability means that only one event can take place . It is
called a marginal or unconditional probability.
3. Probability rules
3.1 Complementary event
If $\bar{A}$ is the complementary event of A, then $P(\bar{A}) = 1 - P(A)$.
E.g. A sample of 1,000 people showed that 400 smoke cigarettes, 500
drink beer, 250 smoke and drink.
Calculate the probability that a person smokes cigarettes or
drinks beer or both.
Let C be the event that a person smokes cigarettes.
Let D be the event that a person drinks beer.
Method 2---using a table
              $C$     $\bar{C}$    Total
$D$           250     250          500
$\bar{D}$     150     350          500
Total         400     600          1000

              P(C)    P($\bar{C}$)    Total
P(D)
P($\bar{D}$)
Total                                 1.0
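The counts in the table can be cross-checked with the addition rule, P(C ∪ D) = P(C) + P(D) − P(C ∩ D), which is assumed here as the first method since that page is missing from these notes. Exact fractions avoid rounding.

```python
from fractions import Fraction

n = 1000
p_c = Fraction(400, n)      # smokes cigarettes
p_d = Fraction(500, n)      # drinks beer
p_both = Fraction(250, n)   # smokes and drinks

p_either = p_c + p_d - p_both   # smokes or drinks or both
p_neither = 1 - p_either        # complement: neither smokes nor drinks

print(float(p_either), float(p_neither))
```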
P(A B) =
$P(B|A) = \frac{P(A \cap B)}{P(A)}$ if P(A) > 0
$P(B|A) = \frac{P(A \cap B)}{P(A)}$ = --------- =
E.g. In the experiment of throwing 2 dice once. One of the dice turns out 2.
What is the probability that the sum of the 2 dice is less than 5?
E.g. A batch of 12 T.V. sets contains 4 which are not working. What is the
probability that
(i) choosing 2 consecutive sets at random, both will be defective?
(ii) choosing 3 consecutive sets at random, all will be defective?
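Assuming the sets are chosen without replacement, each draw is conditional on the previous ones, and the probabilities multiply. The helper `all_defective` is illustrative.

```python
from fractions import Fraction


def all_defective(total, defective, draws):
    """P(every one of `draws` items drawn without replacement is defective)."""
    p = Fraction(1)
    for k in range(draws):
        # After k defective items are removed, defective - k remain out of total - k.
        p *= Fraction(defective - k, total - k)
    return p


p_two = all_defective(12, 4, 2)     # (4/12)(3/11)
p_three = all_defective(12, 4, 3)   # (4/12)(3/11)(2/10)
print(p_two, p_three)
```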
(b) $P(A|D) = \frac{P(A \cap D)}{P(D)} = \frac{P(A) \times P(D|A)}{P(A) \times P(D|A) + P(B) \times P(D|B)}$, which is known
as posterior probability.
E.g. Two machines A and B produced respectively 40% and 60% of the
total output of a factory. The percentages of defective output of these
machines are 5% and 3% respectively.
(a) If an item is selected at random, find the probability that the item
is defective.
(b) If the item selected is found to be defective, what is the probability
that it is produced by machine B.
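Both parts of this example follow from the total probability and posterior probability formulas. The sketch below uses an illustrative helper, `posterior`, that takes the prior probabilities and the conditional defect rates as dictionaries.

```python
def posterior(priors, likelihoods):
    """Return (total probability, posterior probability for each source)."""
    joint = {k: priors[k] * likelihoods[k] for k in priors}
    total = sum(joint.values())
    return total, {k: v / total for k, v in joint.items()}


# Machines A and B: share of output and percentage defective.
p_defective, post = posterior({"A": 0.40, "B": 0.60}, {"A": 0.05, "B": 0.03})
print(round(p_defective, 3), round(post["B"], 4))
```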
A probability tree can be used to represent the above probabilities.
E.g. Suppose a factory manager wishes to advertise his new brand of jam
and has 3 alternative advertising mediums: T.V., posters and shop
promotions. Usually, he advertises on T.V. 60% of the time, posters 30% of the
time and shop promotions 10% of the time. It has been calculated that the
probability of each advertising medium being successful is 0.1, 0.3 and 0.2
respectively.
(a) Find the probability that his advertisements will be successful.
(b) If his advertisements were successful, what is the probability that he
used poster advertisement?
E.g. A company has 1,000 replacement parts for a given assembly. 20% of
the parts are defective and the rest are good. 40% were bought from
external sources and the rest were made by the company itself, and of
those bought from external sources, 80% are good. If a part is randomly
selected from this stock, what is the probability that :-
Probability Distribution
E.g. Suppose we toss a coin three times and we are interested in
the number of ‘H’ obtained, X.
Now
X = 0 means the occurrence of outcome TTT
X = 1 means the occurrence of outcome TTH,HTT,THT
X = 2 means the occurrence of outcome HHT,HTH,THH
X = 3 means the occurrence of outcome HHH
P( X = 0 ) = P(TTT) = 1/8
P( X = 1 ) = P(TTH,HTT,THT) = 3/8
P( X = 2 ) = P(HHT,HTH,THH) = 3/8
P( X = 3 ) = P(HHH) = 1/8
Any description (e.g. tabular, graphic or formula) which gives each
value of a random variable with its corresponding probability is called
the probability distribution of that random variable.
$\sum P(X = x) = 1$
mean = expected number = $\mu = E(X) = \sum x \cdot P(x)$
$E(X^2) = \sum x^2 \cdot P(x)$
variance = $\sigma^2 = E(X^2) - [E(X)]^2$
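For the three-toss example, the distribution, mean and variance can be built by listing the eight equally likely outcomes. This is a sketch with illustrative variable names; a fair coin is assumed.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=3))   # HHH, HHT, ..., TTT

# Probability distribution of X = number of heads.
dist = {}
for outcome in outcomes:
    x = outcome.count("H")
    dist[x] = dist.get(x, 0) + Fraction(1, len(outcomes))

mean = sum(x * p for x, p in dist.items())
e_x2 = sum(x * x * p for x, p in dist.items())
variance = e_x2 - mean ** 2
print(dist, mean, variance)
```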
[Figure: Probability Distribution of X; P(X = x) plotted against all possible outcomes of X (0, 1, 2, 3).]
2. Discrete probability distribution
For discrete probability distribution, the random variable takes a discrete
set of values.
Let the random variable X denote the number of ‘successes’ that can be
obtained from such a repetitive experiment of n independent trials.
Let p be the probability of getting a ‘success’ of each trial.
E.g. Suppose that, of all students taking a certain examination, only 40% pass.
If a group of three candidates for this examination is selected at
random, what is the probability that
(i) all three will pass
(ii) exactly two will pass
(iii) at least two will pass.
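The three parts can be evaluated with the binomial probability function, $P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}$, assumed here in its standard form because the formula itself does not survive in these notes. `binomial_pmf` is an illustrative name.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ b(x; n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


n, p = 3, 0.4
p_all_three = binomial_pmf(3, n, p)
p_exactly_two = binomial_pmf(2, n, p)
p_at_least_two = p_exactly_two + p_all_three
print(round(p_all_three, 3), round(p_exactly_two, 3), round(p_at_least_two, 3))
```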
E.g. Suppose that 20% of all items coming off a production are defective. If
a random sample of 10 are chosen and inspected, what is the
probability of
(i) no defective item in the sample
(ii) at most 2 defective items
(iii) more than 3 but less than or equal to 6 defective items.
Notes:
For the use of binomial distribution (FITS)
(a) We are dealing with mutually exclusive events with only Two possible
outcomes i.e. either getting a success or a failure.
(b) The events are Independent such that the probability of getting a
success in a single trial remains the Same for all trials.
(c) Number of trials is Fixed.
2.1.1 Mean and standard deviation of binomial distribution
Mean = µ = n p
Standard deviation = σ = $\sqrt{np(1 - p)}$
2.2.2 Poisson distribution not as an approximation
It is important to realise that although the Poisson distribution is useful
as an approximation to the binomial distribution, it is also a distribution
in its own right. It is used in many situations where we can expect a
fixed number of ‘successes’ per unit time (or per some other kind of unit),
e.g. average service rate = 10 customers per hour; average arrival rate
= 6 customers per hour; 1.6 accidents can be expected per day at a busy
road junction; 12 small pieces of meat can be expected in a frozen meat
pie etc.
E.g. A machine shop employing a large number of men finds that, over a
period of time, the average absentee rate is 3 men per shift. Calculate
the probability that, on a given shift,
(i) exactly two men will be absent;
(ii) more than four men will be absent.
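With an average of 3 absentees per shift, both parts follow from the Poisson probability function, $P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}$, assumed here in its standard form. `poisson_pmf` is an illustrative name.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson distribution with mean lam."""
    return exp(-lam) * lam ** x / factorial(x)


lam = 3
p_exactly_two = poisson_pmf(2, lam)
# "More than four" is the complement of "four or fewer".
p_more_than_four = 1 - sum(poisson_pmf(k, lam) for k in range(5))
print(round(p_exactly_two, 4), round(p_more_than_four, 4))
```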
(c) For 2 normal distributions which have the same mean but
different standard deviations, the curve for the distribution which
has the larger standard deviation will be flatter and will not have
such a prominent peak as the curve for the distribution with the
smaller standard deviation.
(d) No matter what the shape of the normal distribution, one important
property of the normal distribution is the nature of the relationship
of the area under the curve to the standard deviation of the
distribution. In particular, 68.27% of the area under the curve is
contained within plus and minus one standard deviation of the
mean; 95.45% of the area under the curve is contained within plus
and minus 2 standard deviations of the mean; 99.73% of the area
under the curve is contained within plus and minus 3 standard
deviations of the mean.
3.1.2 Standard normal distribution
Normal distribution with mean μ = 0 and standard deviation σ = 1 is
known as the standard normal distribution, denoted by $Z \sim N(0, 1^2)$.
[Figure: standard normal curve with the tail area P(Z > z) marked; −z, 0 and z shown on the axis.]
(c) P(Z > z) = P(Z < −z)
E.g. To show that 95.45% of the area under the normal curve lies within plus
and minus 2 standard deviations of the mean.
Consider the standard normal curve $Z \sim N(0, 1^2)$
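The areas quoted for one, two and three standard deviations can be checked numerically instead of from tables. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later).

```python
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)   # standard normal distribution

within_1 = z.cdf(1) - z.cdf(-1)
within_2 = z.cdf(2) - z.cdf(-2)
within_3 = z.cdf(3) - z.cdf(-3)
print(round(within_1, 4), round(within_2, 4), round(within_3, 4))
```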
E.g. Assume that the weight of adult males is normally distributed with a
mean of 69 kg. and a standard deviation of 3 kg.
(i) What is the conditional probability that an individual will be
heavier than 72 kg. if it is known that he is heavier than 70 kg. ?
(ii) Determine the maximum weight of 95% of adult males.
E.g. A manufacturer is filling cans with soup to a net weight of 16 gm. The
actual amount of soup which is put into the cans by the filling machine
is normally distributed about the set weight with a standard deviation of
½ gm. If the manufacturer requires not more than 1% of cans to contain
less than the advertised net weight of 16 gm. , at what weight should the
filling machine be set?
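Both examples can be worked without tables. The sketch below uses `statistics.NormalDist` and reads "maximum weight of 95% of adult males" as the 95th percentile, which is an interpretation rather than something stated in the notes.

```python
from statistics import NormalDist

# Adult male weights: mean 69 kg, standard deviation 3 kg.
weight = NormalDist(mu=69, sigma=3)
p_cond = (1 - weight.cdf(72)) / (1 - weight.cdf(70))   # P(X > 72 | X > 70)
w95 = weight.inv_cdf(0.95)                             # 95th percentile

# Soup cans: choose the machine setting m so that P(X < 16) = 0.01
# when X ~ N(m, 0.5^2), i.e. (16 - m) / 0.5 = z_0.01.
z_01 = NormalDist().inv_cdf(0.01)   # about -2.326
setting = 16 - z_01 * 0.5
print(round(p_cond, 4), round(w95, 2), round(setting, 2))
```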
3.2 Normal approximation of binomial distribution
If X ~ b( x; n, p ), we can use the normal approximation with μ = n p,
if $0.1 \le p \le 0.9$ and $np \ge 5$.
E.g. Of the people who enter a large supermarket, it has been found that 70%
will make at least one purchase. For a sample of 50 individuals, use the
normal approximation to find the probability that
(a) at least 40 people make one or more purchases each;
(b) fewer than 30 people make at least one purchase.
E.g. The average number of calls for services received by a machine repair
department per 8-hour shift is 10. Determine the probability that more
than 15 calls will be received during a randomly selected 8-hour shift:
(a) Using Poisson distribution
(b) Using normal approximation to the Poisson distribution.
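A sketch of both approximations follows. It assumes a continuity correction of 0.5 and, for the Poisson case, a normal distribution with mean λ and standard deviation √λ; neither assumption is stated in these notes, so answers worked without the correction will differ slightly.

```python
from math import exp, factorial, sqrt
from statistics import NormalDist

# Supermarket: X ~ b(x; 50, 0.7), approximated by a normal distribution.
n, p = 50, 0.7
shoppers = NormalDist(n * p, sqrt(n * p * (1 - p)))
p_at_least_40 = 1 - shoppers.cdf(39.5)   # P(X >= 40)
p_fewer_than_30 = shoppers.cdf(29.5)     # P(X < 30) = P(X <= 29)

# Repair calls: Poisson with mean 10 per shift, P(X > 15).
lam = 10
p_exact = 1 - sum(exp(-lam) * lam ** k / factorial(k) for k in range(16))
p_approx = 1 - NormalDist(lam, sqrt(lam)).cdf(15.5)
print(round(p_at_least_40, 4), round(p_fewer_than_30, 4),
      round(p_exact, 4), round(p_approx, 4))
```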
4. Mathematical expectation (or expected value)
Suppose x1, x2, …, xn are n possible outcomes of an experiment with p1, p2 ,…,
pn as their respective probabilities of occurrence.
E.g. 1
In a business venture, a man can make a profit of $2,000 with a probability of
0.65 or a loss of $4,000 with a probability of 0.35. Should he undertake the
venture? Calculate the variance of profit.
Solution:
x: profit
p: probability
= 8,190,000
E.g. 2
A man is considering the purchase of a lottery ticket. He can win a first prize of
$50,000, one of four second prizes of $10,000 each or one of ten third prizes of
$1,000 each. He estimates that 100,000 tickets will be sold. The winners will
be selected at random. How much should he be prepared to pay for the ticket?
Solution:
x: prize
p: probability
Expected prize = $\sum xp$ =
= $1
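Both examples reduce to the sum of each outcome multiplied by its probability. The sketch below uses an illustrative helper, `expectation`, and treats a loss as a negative profit.

```python
from fractions import Fraction


def expectation(values, probs):
    """Expected value: sum of x * p over all outcomes."""
    return sum(x * p for x, p in zip(values, probs))


# E.g. 1: profit of $2,000 (p = 0.65) or loss of $4,000 (p = 0.35).
profit = [2000, -4000]
prob = [0.65, 0.35]
mean_profit = expectation(profit, prob)
var_profit = expectation([x * x for x in profit], prob) - mean_profit ** 2

# E.g. 2: lottery with 100,000 tickets sold.
prizes = [50_000, 10_000, 1_000]
chances = [Fraction(1, 100_000), Fraction(4, 100_000), Fraction(10, 100_000)]
expected_prize = expectation(prizes, chances)
print(round(mean_profit), round(var_profit), expected_prize)
```

A negative expected profit in E.g. 1 suggests that, on average, the venture loses money.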
Sampling Distribution
Introduction
Sampling distribution is a study of the relationship between a population and
the samples drawn from the population.
There are 2 types of sample: (1) probability sample (or random sample)
(2) non-probability sample (or quota sample)
It is the probability samples that are of importance as they are the only type of
sample which allows us to make statistical inferences about the population
from which they were drawn.
Note:
** Parameter - a number that describes a population.
e.g. µ = population mean;
σ = population standard deviation;
p = population proportion are population parameters.
Sampling distribution
Consider all possible random samples of size n drawn from a population. For
each sample, a sample statistic such as $\bar{X}$ or $\hat{p}$ is computed which will vary
from sample to sample. We then obtain the frequency distribution of the
sample statistic which is called the distribution of the sample statistic or
sampling distribution of the sample statistic.
A histogram can be plotted, and we can see that it has the shape of a
normal curve.
Sampling distribution of sample means
Properties
(a) For random samples of size n taken from a population having mean
μ and standard deviation σ, the sampling distribution of sample means
has mean and standard deviation given by
$\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
or
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$ (for finite population of size N)
(c) If the size of the random sample n is large (n ≥ 30), then the sampling
distribution of sample means can be approximated using normal
distribution. This result is known as Central Limit Theorem.
i.e. If n ≥ 30, then $\bar{X} \sim N(\mu_{\bar{X}}, \sigma_{\bar{X}}^2)$.
= or
E.g. 1
A random sample of 25 adults is drawn from a normal population of height for
which the mean is 172.5 cm. and standard deviation 6.25 cm.
(a) This sample of size 25 has mean value that belongs to a sampling
distribution of sample mean. Find the shape of this sampling distribution.
(b) Find the mean and standard error of this sampling distribution.
(c) What is the probability that the sample mean will be greater than 175
cm ?
(d) Find a symmetric interval about 172.5 cm, which can be expected to
contain 99% of all sample mean.
[Figure: normal curve centred at 172.5 with area 0.99 between a and b and area 0.005 in each tail.]
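Parts (b) to (d) can be sketched numerically. Because the population is normal, the sample mean is taken to be normally distributed with standard error σ/√n; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 172.5, 6.25, 25
standard_error = sigma / sqrt(n)              # (b) 6.25 / 5 = 1.25
sample_mean = NormalDist(mu, standard_error)

p_greater_175 = 1 - sample_mean.cdf(175)      # (c)
a = sample_mean.inv_cdf(0.005)                # (d) lower limit, 0.005 in the tail
b = sample_mean.inv_cdf(0.995)                # (d) upper limit
print(standard_error, round(p_greater_175, 4), round(a, 2), round(b, 2))
```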
E.g. 2
We are making 500 special components for a unique machine. Each
component consists of a carbon rod which has a mean length of 4.08 cm and
standard deviation of 0.5 cm.
(a) Calculate the probability that, in a random sample of 100 of these
components, the mean length will be between 4.00 cm. and 4.10 cm.
(b) What is the probability that the combined length of a random sample of 49
of these components is more than 205 cm?
E.g. 3
A large number of samples of size n are taken from a normally distributed
population with mean 74 and standard deviation 6, find the sample size n
(a) if the probability that the sample mean exceeds 75 is 0.282
(b) if the probability that the sample mean is less than 70.4 is 0.00135 .
$X \sim N(\mu = 74, \sigma^2 = 6^2) \Rightarrow \bar{X} \sim N(\mu_{\bar{X}} = 74, \sigma_{\bar{X}}^2 = 6^2/n)$
Sampling distribution of sample proportions
Frequently in statistical work, it is desirable to know what proportion of items in
a population possess a certain characteristic. For e.g. what proportion of
consumers prefer a certain product or what proportion of the students pass a
certain examination? In these cases, we consider all possible samples of size
n drawn from a population having population proportion p. For each sample
the sample proportion is computed.
Note:
Population proportion, $p = \frac{\text{No. of items having the characteristic concerned in the population}}{\text{Total no. of items in the population}}$
Sample proportion, $\hat{p} = \frac{\text{No. of items having the characteristic concerned in the sample}}{\text{Total no. of items in the sample}}$
= or
E.g.1
It has been found that 2% of the items produced by a machine are defective.
What is the probability that in a shipment of 400 such items,
(a) more than 3% are defective?
(b) between 1% and 3% are defective?
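A sketch of this example follows. It assumes the sample proportion is approximately normal with mean p and standard error $\sqrt{p(1 - p)/n}$ and applies no continuity correction; the formula is the standard one and does not survive in these notes, so a worked answer that uses a correction will differ slightly.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.02, 400
standard_error = sqrt(p * (1 - p) / n)        # 0.007
proportion = NormalDist(p, standard_error)

p_more_than_3pc = 1 - proportion.cdf(0.03)                # (a)
p_between = proportion.cdf(0.03) - proportion.cdf(0.01)   # (b)
print(round(standard_error, 4), round(p_more_than_3pc, 4), round(p_between, 4))
```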
Statistical estimation and hypothesis testing
Introduction
There are 2 types of statistical inferences:-
(1) Statistical estimation (2) Hypothesis testing
Hypothesis testing involves the setting up of a hypothesis (or theory) about the
population and then sampling in order to see if the hypothesis is supported or
rejected.
Statistical estimation
Point estimate
An estimate of a population parameter given by a single value and
calculated from sample data is called a point estimate of the population
parameter.
(c) Given 2 samples from the same population.
Sp =
Confidence interval estimates or confidence limits
$\bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
Confidence interval estimate of population mean, μ
Model I :
where and or
which contains 99% or 95% of all sample means can be obtained i.e.
E.g. A normal population has unknown mean μ and standard deviation 15. A
random sample of size 25 drawn from this population was found to have a
mean of 950. Construct
(a) a 90% C.I. for μ; (b) a 95% C.I. for μ; (c) compare your
results.
Solution: Given:
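With σ known and a normal population, the interval is $\bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$. The sketch below evaluates it for both confidence levels; `z_interval` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence):
    """Confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # Z_{alpha/2}
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin


ci_90 = z_interval(950, 15, 25, 0.90)
ci_95 = z_interval(950, 15, 25, 0.95)
print([round(v, 2) for v in ci_90], [round(v, 2) for v in ci_95])
```

The 95% interval is wider than the 90% interval: more confidence is bought with less precision.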
Model II :
is estimated by or
Model III : If σ is unknown then it is estimated by s, and the sample size n is
small (n<30) and the population is normal or approximately normal.
E.g. A sample of 10 packets of sugar packed by a machine has the following
weights (kg):
1 , 1.05, 1.10, 0.95, 0.96, 1.10, 1.02, 0.97, 0.99, 1
(a) Calculate the sample mean and standard deviation
Confidence interval estimate of difference of two means
Population I Population II
Pop. Mean
Sample size n1 n2
Sample mean
then
where
and
Model IV : For large sample sizes $n_1, n_2 \ge 30$, $\sigma_1$ and $\sigma_2$ are unknown and
estimated by s1 & s2 respectively.
Confidence interval estimate of population proportion, p
Model V
Assumption: For large sample size, n ≥ 30, the 100(1 − α)% C.I. estimate for p
is
where or .
E.g. A producer of steel pipes selected a simple random sample of 300 pipes
from the production process to estimate the proportion of defective pipes.
There were 15 defective pipes in the sample.
(a) What is the point estimate of the proportion of defective pipes in the
population?
(b) Construct a 95% confidence interval estimate of the proportion of the
defective pipes in the population.
(c) How large a sample would be needed if the probability is to be 0.95 that
the error of estimate will not exceed 0.02 unit?
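A sketch of the three parts follows. It assumes the usual large-sample interval $\hat{p} \pm Z_{\alpha/2} \sqrt{\hat{p}(1 - \hat{p})/n}$, which is not legible in these notes, and for part (c) it plugs the sample estimate into the sample-size formula; using p = 0.5 instead would give a more conservative (larger) n.

```python
from math import ceil, sqrt
from statistics import NormalDist

defective, n = 15, 300
p_hat = defective / n                           # (a) point estimate

z = NormalDist().inv_cdf(0.975)                 # Z_{0.025}, about 1.96
margin = z * sqrt(p_hat * (1 - p_hat) / n)
ci_95 = (p_hat - margin, p_hat + margin)        # (b)

max_error = 0.02
required_n = ceil(z ** 2 * p_hat * (1 - p_hat) / max_error ** 2)   # (c)
print(p_hat, [round(v, 4) for v in ci_95], required_n)
```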
Confidence interval estimate of difference of two proportions
Population proportion p1 p2
Sample size n1 n2
Sample proportion
and which is
estimated by
Model VI : For large sample sizes $n_1, n_2 \ge 30$, p1 & p2 are estimated by
$\hat{p}_1$ and $\hat{p}_2$ respectively.
Hypothesis Testing
Statistical decisions
We study the sample data and then make decisions about the
population from which the sample is drawn. Such decisions are
called statistical decisions.
Statistical hypotheses
They are statements or assumptions which may or may not be
true concerning one or more populations. Based on sample
information, these hypotheses will be tested. Normally, the
hypothesis to be tested is formulated for the sole purpose of
being rejected or nullified. This hypothesis is called the null
hypothesis, denoted by H0. We have to formulate the other
hypothesis which differs from the null hypothesis and it is usually
called the alternative hypothesis, denoted by H1 or Ha.
Procedures for testing statistical hypotheses
(1) State the assumptions or known facts about:-
(i) the population of interest
(ii) the nature of the samples and the sample sizes
(iii) state H0 and H1 .
(5) Make decision. If the test statistic falls in the CR, then we
reject H0; otherwise we accept it and draw a conclusion.
H0 : μ ≥ μ0          or   H0 : μ ≤ μ0           or   H0 : μ = μ0
H1 : μ < μ0               H1 : μ > μ0                H1 : μ ≠ μ0
(left-tailed test)        (right-tailed test)        (2-tailed test)
Test A (Z - test )
Model : Population is normal with known standard deviation σ.
OR
Population is not normal with known standard
deviation σ and sample size n ≥ 30.
The test statistic is $Z = \dfrac{\bar{X} - \mu_0}{\sigma_{\bar{X}}}$ where $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$.
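As a rough illustration of Test A, the statistic can be computed directly from its definition. The function name `z_test` and the numbers passed to it are hypothetical and are not taken from these notes; the normal probability is obtained from the error function rather than from tables.

```python
import math

def z_test(xbar, mu0, sigma, n):
    """Return the Z statistic and the two-tailed p-value for Test A."""
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    # Standard normal CDF at |z|, written with the error function.
    phi = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return z, 2 * (1 - phi)

# Hypothetical figures, for illustration only.
z, p_two_tailed = z_test(xbar=52.0, mu0=50.0, sigma=6.0, n=36)
print(f"Z = {z:.2f}, two-tailed p-value = {p_two_tailed:.4f}")
```

For a 2-tailed test at the 5% level the critical region is |Z| > 1.96, so this hypothetical sample would lead to rejection of H0.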
Test B (Large sample Z-test)
Test C (Small sample t-test)
Model: (i) Population is normally distributed with unknown
standard deviation, which is estimated by the sample
standard deviation, s.
(ii) The sample size is small (n < 30).
The test statistic is $t = \dfrac{\bar{X} - \mu_0}{S_{\bar{X}}}$ where $S_{\bar{X}} = \dfrac{s}{\sqrt{n}}$, which follows a
t-distribution with (n − 1) degrees of freedom.
Tests of hypotheses concerning the difference of means of
two populations
H0 : μ1 − μ2 ≥ d0     or   H0 : μ1 − μ2 ≤ d0      or   H0 : μ1 − μ2 = d0
H1 : μ1 − μ2 < d0          H1 : μ1 − μ2 > d0           H1 : μ1 − μ2 ≠ d0
(Left-tailed test)         (Right-tailed test)         (2-tailed test)
Test D (Z-test)
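Model: The two populations are normal with known standard
deviations σ1 and σ2.
The test statistic is $Z = \dfrac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sigma_{\bar{x}_1 - \bar{x}_2}}$ where $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$.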
Test E (Large sample Z-test)
The test statistic is $Z = \dfrac{(\bar{x}_1 - \bar{x}_2) - d_0}{s_{\bar{x}_1 - \bar{x}_2}}$ where $s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$.
Test F (small sample t-test)
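Model: (i) The 2 populations are normal with unknown but common
standard deviation σ1 = σ2 = σ.
(ii) The 2 samples are independent random samples
with small sample sizes n1 and n2 (< 30).
The test statistic is $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - d_0}{s_{\bar{x}_1 - \bar{x}_2}}$
where $s_{\bar{x}_1 - \bar{x}_2} = s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$ and $s_p^2 = \dfrac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$,
which follows a t-distribution with (n1 + n2 − 2) degrees of freedom.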
E.g. Two salesmen A and B are working in a certain district.
From a sample survey conducted by the Head office, the
following results on sales were obtained:-
Salesman A       Salesman B
n1 = 20          n2 = 18
x̄1 = 170         x̄2 = 205
s1 = 20          s2 = 25
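The pooled-variance statistic of Test F can be evaluated for these summary figures. This is only a sketch: the question being tested is not reproduced in this part of the notes, so the hypothesised difference d0 = 0 is an assumption.

```python
import math

# Summary statistics from the salesmen example.
n1, xbar1, s1 = 20, 170.0, 20.0   # Salesman A
n2, xbar2, s2 = 18, 205.0, 25.0   # Salesman B
d0 = 0.0  # assumed hypothesised difference (not stated in the notes)

# Pooled estimate of the common variance sigma^2.
sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)

# Standard error of (xbar1 - xbar2) and the t statistic.
se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
t = ((xbar1 - xbar2) - d0) / se
df = n1 + n2 - 2

print(f"pooled variance = {sp2:.2f}, s.e. = {se:.3f}")
print(f"t = {t:.3f} with {df} degrees of freedom")
```

The statistic would then be compared with the t-table value for 36 degrees of freedom at the chosen significance level.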
H0 : p ≥ p0           or   H0 : p ≤ p0            or   H0 : p = p0
H1 : p < p0                H1 : p > p0                 H1 : p ≠ p0
(Left-tailed test)         (Right-tailed test)         (2-tailed test)
where p0 is a predetermined constant.
(b) Right-tailed test
To test H0 : p ≤ p0 against H1 : p > p0
Test H (Test concerning differences between proportions)
Consider 2 independent random samples from 2 binomial
populations consisting of n1 and n2 trials and the no. of
successes are x1 and x2 respectively.
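Then $\hat{p}_1 = \dfrac{x_1}{n_1} \sim N\left(p_1, \dfrac{p_1(1 - p_1)}{n_1}\right)$ and
$\hat{p}_2 = \dfrac{x_2}{n_2} \sim N\left(p_2, \dfrac{p_2(1 - p_2)}{n_2}\right)$ approximately when n1, n2 are
large.
(i) If p1 = p2 = p is known, the standard error of $\hat{p}_1 - \hat{p}_2$ is
$\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\dfrac{pq}{n_1} + \dfrac{pq}{n_2}} = \sqrt{pq\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$
(ii) If p1 = p2 = p is unknown, the standard error of $\hat{p}_1 - \hat{p}_2$ is
estimated by $S_{\hat{p}_1 - \hat{p}_2} = \sqrt{\hat{p}\hat{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$ where $\hat{p} = \dfrac{x_1 + x_2}{n_1 + n_2}$ and $\hat{q} = 1 - \hat{p}$.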
(a) Left-tailed test
H0 : p1 − p2 ≥ d0 against H1 : p1 − p2 < d0
Test I (Paired comparison t – test)
The test statistic is $t = \dfrac{\bar{D} - (\mu_X - \mu_Y)}{S_{\bar{D}}} = \dfrac{\bar{X} - \bar{Y} - (\mu_X - \mu_Y)}{S_{\bar{D}}}$
which follows a Student's t-distribution with (n – 1) degrees of
freedom.
E.g. A new product was introduced into the market in January
1997. After a poor year for sales, the manufacturer initiated an
intensive advertising campaign during January 1998. The table
below records the sales, in thousand dollars, for a one-month
period before and a one-month period after the advertising
campaign, for each of eleven regions.
Region         A    B    C    D    E    F    G    H    I    J    K
Sales before   2.4  2.6  3.9  2.0  3.2  2.2  3.3  2.1  3.1  2.2  2.8
Sales after    3.0  2.5  4.0  4.1  4.8  2.0  3.4  4.0  3.3  4.2  3.9
The sales may be assumed to follow a normal distribution.
Determine, at the 5% sig. level, whether an increase in sales has
occurred .
Solution:
Region             A    B    C    D    E    F    G    H    I    J    K
Sales before (X)   2.4  2.6  3.9  2.0  3.2  2.2  3.3  2.1  3.1  2.2  2.8
Sales after (Y)    3.0  2.5  4.0  4.1  4.8  2.0  3.4  4.0  3.3  4.2  3.9
D = Y − X
ΣD =          ; ΣD² =
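The working for this paired comparison can be sketched as follows. The differences are taken as D = Y − X, the null hypothesis is assumed to be a mean difference of zero, and the critical value 1.812 is the usual t-table entry for 10 degrees of freedom and an upper-tail area of 5%.

```python
import math

before = [2.4, 2.6, 3.9, 2.0, 3.2, 2.2, 3.3, 2.1, 3.1, 2.2, 2.8]  # X
after = [3.0, 2.5, 4.0, 4.1, 4.8, 2.0, 3.4, 4.0, 3.3, 4.2, 3.9]   # Y

# Differences D = Y - X for each of the eleven regions.
d = [y - x for x, y in zip(before, after)]
n = len(d)
sum_d = sum(d)
sum_d2 = sum(v * v for v in d)

# Mean and sample standard deviation of the differences.
d_bar = sum_d / n
s_d = math.sqrt((sum_d2 - sum_d**2 / n) / (n - 1))

# Paired t statistic under H0: mean difference = 0 (right-tailed test).
t = d_bar / (s_d / math.sqrt(n))
t_crit = 1.812  # t table value, 10 d.f., upper 5% point

print(f"sum D = {sum_d:.1f}, sum D^2 = {sum_d2:.2f}")
print(f"t = {t:.3f}, critical value = {t_crit}")
print("Reject H0" if t > t_crit else "Do not reject H0")
```

Since the statistic exceeds the critical value, the data support an increase in sales at the 5% sig. level.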
Chi-square tests (χ² tests) can be used in the following ways:-
(a) Test of differences among k proportions
(b) Test of contingency tables
(c) Test of goodness of fit
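Sample   Sample size   Successes   Failures
1        n1            x1          n1 − x1
2        n2            x2          n2 − x2
3        n3            x3          n3 − x3
.        .             .           .
.        .             .           .
.        .             .           .
k        nk            xk          nk − xk
We estimate p0 by $\hat{p} = \dfrac{x_1 + x_2 + \dots + x_k}{n_1 + n_2 + \dots + n_k}$ if H0 is true.
At α sig. level, the critical region is CR = { $\chi^2 > \chi^2_{\alpha,\,k-1}$ } with
k – 1 degrees of freedom.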
We are interested in testing
H0 : p1 = p2 = p3 against
H1 : at least 2 of the proportions are not the same.
p= = 0.49
(b) Test of contingency tables
                 A
B                A1     A2     A3     . . .   Ac     Row total
B1               O11    O12    O13    . . .   O1c    O1.
B2               O21    O22    O23    . . .   O2c    O2.
.                .      .      .      . . .   .      .
.                .      .      .      . . .   .      .
.                .      .      .      . . .   .      .
Br               Or1    Or2    Or3    . . .   Orc    Or.
Column total     O.1    O.2    O.3    . . .   O.c    O..
E.g. Consider the following table of frequencies of 6800 men
according to eye color and hair color.
                           Hair color
Eye color         Fair    Brown   Black   Red    Total
Blue              1768    807     189     47     2811
Grey or green     946     1387    746     53     3132
Brown             115     438     288     16     857
Total             2829    2632    1223    116    6800
Use the level of sig. = 1% to test the null hypothesis that there is
no relationship between hair and eye color.
To test H0 : eye color and hair color are not associated
against H1 : eye color and hair color are associated.
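The test statistic is $\chi^2 = \sum\sum \dfrac{(O_{ij} - E_{ij})^2}{E_{ij}}$ = 1073.52,
where $E_{ij}$ = (row total × column total) / grand total.
At 1% sig. level and v = (3-1)(4-1) = 6 degrees of freedom, the
critical value is $\chi^2_{0.01,\,6}$ = 16.812.
Since the calculated test statistic 1073.52 > 16.812, H0 is
rejected, showing that hair color and eye color are significantly
associated at 1% sig. level.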
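The test statistic for this table can be reproduced with a few lines of code. The sketch below assumes the usual expected frequency, row total × column total / grand total; the critical value 16.812 is the one quoted above.

```python
# Observed frequencies (rows: eye color; columns: Fair, Brown, Black, Red hair).
observed = [
    [1768, 807, 189, 47],   # Blue
    [946, 1387, 746, 53],   # Grey or green
    [115, 438, 288, 16],    # Brown
]

row_tot = [sum(row) for row in observed]
col_tot = [sum(col) for col in zip(*observed)]
grand = sum(row_tot)

# Sum of (O - E)^2 / E over all cells.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_tot[i] * col_tot[j] / grand
        chi2 += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)
critical = 16.812  # chi-square table value, 6 d.f., 1% level (from the notes)

print(f"chi-square = {chi2:.2f} with {df} d.f.")
print("Reject H0" if chi2 > critical else "Do not reject H0")
```

The computed value agrees with the 1073.52 quoted above up to rounding of the expected frequencies.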
Example
A study is conducted of the number of calls received on the
switchboard of an insurance firm. A count is made of the number of
incoming calls per minute for a sample of 100 minutes. The results
of the study are shown below:
No. of calls per min (X)   Observed freq. (O)
0                          40
1                          35
2                          14
3                          8
4                          2
5                          1
Test at 5% level of sig. whether the distribution of calls arriving at
the switchboard is a Poisson distribution.
Rules for using Chi-square test
(1) The total no. of observations (total frequency) should not be
too small, usually not less than 50.
(2) Each expected frequency should be at least 5. If any E < 5,
it is necessary to group several adjacent classes into one
class.
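Applying these rules to the switchboard example gives the following sketch of a goodness-of-fit test. It assumes the Poisson mean is estimated by the sample mean, that the classes 3, 4 and 5 (together with any higher count) are grouped into "3 or more" so that every expected frequency is at least 5, and that one degree of freedom is lost for the estimated parameter. The critical value 5.991 is the chi-square table entry for 2 d.f. at the 5% level.

```python
import math

calls = [0, 1, 2, 3, 4, 5]
observed = [40, 35, 14, 8, 2, 1]
n = sum(observed)

# Estimate the Poisson mean by the sample mean number of calls per minute.
lam = sum(x * o for x, o in zip(calls, observed)) / n

# Poisson probabilities for 0, 1, 2 calls and for the grouped class "3 or more".
probs = [math.exp(-lam) * lam**x / math.factorial(x) for x in range(3)]
probs.append(1 - sum(probs))

obs_grouped = observed[:3] + [sum(observed[3:])]
expected = [n * p for p in probs]

chi2 = sum((o - e) ** 2 / e for o, e in zip(obs_grouped, expected))
df = len(expected) - 1 - 1  # classes - 1 - one estimated parameter
critical = 5.991            # chi-square table value, 2 d.f., 5% level

print(f"estimated mean = {lam:.2f}")
print(f"chi-square = {chi2:.3f} with {df} d.f.")
print("Reject H0" if chi2 > critical else "Do not reject H0")
```

Under these assumptions the statistic is well below the critical value, so the Poisson model is not rejected at the 5% level.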
Analysis of Variance
Previously, we have tested hypotheses comparing two means. Now, we
expand further our idea of hypothesis tests. We describe a test that
simultaneously compares several means (three or more means).
The null hypothesis is Ho: 𝜇1 = 𝜇2 =⋅⋅⋅= 𝜇𝑐 for c population means.
The alternative hypothesis is H1 : not all 𝜇𝑖 ’s are equal, i = 1, 2, …, c.
The F-distribution
The F-distribution, similar to the t-distribution and the 𝜒²-distribution,
is a family of probability distributions. Each F-distribution is identified
by two numbers of degrees of freedom, the degrees of freedom in the
numerator and the degrees of freedom in the denominator.
ANOVA assumptions
To use ANOVA, we assume the following:
1. The c populations are normally distributed.
2. The populations have equal variances (𝜎²).
3. The c samples are selected independently.
When these conditions are met, F is used as the test statistic. The
symbolic name for a critical value of F will be F𝛼 ,dfn, dfd , where
𝛼 = the level of significance of the test , i.e. the area under the
distribution curve to the right of the critical value being sought.
dfn = the degrees of freedom in the numerator
dfd = the degrees of freedom in the denominator
Table 9 gives values of F𝛼 , dfn, dfd for 𝛼 = 0.05, 0.025, 0.01, 0.001 for
various combinations for the degrees of freedom. Hence, the critical
value with 6 and 10 d.f. for 𝛼 = 0.05 (area to the right = 0.05) is
F 0.05, 6, 10 = 3.22
Data presentation
Let xij denote the jth observation from the ith treatment and arrange the
data as follows
Treatment   1       2       …   i       …   c
            x11     x21         xi1         xc1
            x12     x22         xi2         xc2
            .       .           .           .
            .       .           .           .
            x1k1    x2k2        xiki        xckc
Total       T1      T2      …   Ti      …   Tc      T
Mean        x̄1      x̄2      …   x̄i      …   x̄c      x̄
Here, Ti is the total of all observations for treatment i (or column total),
𝑥𝑖 is the mean of all observations for treatment i, and n is the total
sample size.
That is, $T_i = \sum_{j=1}^{k_i} x_{ij}$, $\bar{x}_i = \dfrac{\sum_{j=1}^{k_i} x_{ij}}{k_i}$, $n = k_1 + k_2 + \dots + k_c = \sum_{i=1}^{c} k_i$
The symbol T represents total of all observations in the experiment and
𝑥 represents the overall mean of all observations:
$T = \sum_{i=1}^{c} \sum_{j=1}^{k_i} x_{ij} = \sum_{i=1}^{c} T_i$, $\bar{x} = \dfrac{\sum_{i=1}^{c} \sum_{j=1}^{k_i} x_{ij}}{n} = \dfrac{T}{n}$
The second component, called the sum of squares for error [ SS(error) ],
is used to measure the variation within the c samples
$SS(error) = \sum_i \sum_j (x_{ij} - \bar{x}_i)^2 = \sum_i \sum_j x_{ij}^2 - \left(\dfrac{T_1^2}{k_1} + \dfrac{T_2^2}{k_2} + \dots + \dfrac{T_c^2}{k_c}\right)$
The mean square for treatments and the mean square error are obtained
by dividing the sum-of-squares value by the corresponding number of
degrees of freedom.
$MS(treatments) = \dfrac{SS(treatments)}{c - 1}$     $MS(error) = \dfrac{SS(error)}{n - c}$
Source       df      SS               MS                                F
Treatments   c – 1   SS(treatments)   $\dfrac{SS(treatments)}{c - 1}$   $\dfrac{MS(treatments)}{MS(error)}$
Error        n – c   SS(error)        $\dfrac{SS(error)}{n - c}$
Testing the Equality of the Treatment Means
The mean squares in the ANOVA table are used to test the null hypothesis
Ho : 𝜇1 = 𝜇2 =⋅⋅⋅= 𝜇𝑐 against
H1 : at least two of the means are not equal
The test statistic is $F^* = \dfrac{MS(treatments)}{MS(error)}$
In the formula of F, a measure of the variation between the treatments, the
MS(treatments), is compared to a measure of variation within treatments,
the MS(error). If the MS(treatments) is significantly larger than the
MS(error), we will conclude that the treatment means are not all the same
as expressed by H1. This would imply that the factor being tested does have
a significant effect on the response variable.
Thus, large values of F* lead to the rejection of Ho.
If, however, the MS(treatments) is not significantly larger than the
MS(error), we will not be able to reject the null hypothesis that all means
are equal.
Step 1 Let 𝜇1 = mean production rate at 68°F
𝜇2 = mean production rate at 72°F
𝜇3 = mean production rate at 76°F
Ho : 𝜇1 = 𝜇2 = 𝜇3 against
H1 : not all mean production rates are equal
Step 2 We will use an ANOVA table to record the sums of squares (SS)
and organize the calculation.
Source df SS MS F
Treatments c–1
Error n–c
Total n-1
SS(error) = 9.5
The calculated value of the test statistic is $F^* = \dfrac{42.25}{0.95} = 44.47$
Step 3
At 5% sig. level, the critical value is F 5%, 2, 10 = 4.10.
Step 4
Since F* > F 5%, 2, 10 , Ho is rejected at 5% sig. level. We conclude that the
three temperature treatments have significantly different effects on the
production rate at 5% sig. level.
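The arithmetic behind Steps 2 to 4 can be sketched in code. The observations for this example are not reproduced in this part of the notes, so the data below are hypothetical: they were chosen only because they give the quoted SS(error) = 9.5, MS(treatments) = 42.25 and F* = 44.47 with 2 and 10 degrees of freedom. The variable names are likewise illustrative.

```python
# Hypothetical production rates at each temperature (NOT from the notes),
# chosen so that the results match the figures quoted in the example.
samples = {
    "68F": [10, 12, 10, 9],
    "72F": [7, 6, 7, 8, 7],
    "76F": [3, 3, 5, 4],
}

c = len(samples)                                  # number of treatments
n = sum(len(v) for v in samples.values())         # total sample size
T = sum(sum(v) for v in samples.values())         # grand total
sum_sq = sum(x * x for v in samples.values() for x in v)
between = sum(sum(v) ** 2 / len(v) for v in samples.values())  # sum of Ti^2/ki

ss_total = sum_sq - T**2 / n
ss_treat = between - T**2 / n
ss_error = sum_sq - between

ms_treat = ss_treat / (c - 1)
ms_error = ss_error / (n - c)
F = ms_treat / ms_error
f_crit = 4.10  # F 5%, 2, 10 as quoted in Step 3

print(f"SS(treatments) = {ss_treat:.2f}, SS(error) = {ss_error:.2f}, SS(total) = {ss_total:.2f}")
print(f"MS(treatments) = {ms_treat:.2f}, MS(error) = {ms_error:.2f}, F* = {F:.2f}")
print("Reject Ho" if F > f_crit else "Do not reject Ho")
```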
Estimating Differences in the Treatment Means
Suppose that, in carrying out the ANOVA procedure, we make the decision to
reject the null hypothesis. This allows us to conclude that not all the treatment
means are the same. The next question we may want to ask is: which
treatment means are different from the others? This section provides a
procedure for comparing a particular pair of means.
The t- distribution is used as a basis for such comparison. Recall that one
of the assumptions of ANOVA is that the population variances are equal
for all treatments. This common population variance 𝜎 2 is estimated by
the mean square error, the MS(error), in the ANOVA.
Randomized Complete Block Design
When we want to compare the means of c populations in the presence
of an extraneous variable, blocking is used. A block is a collection of
c experimental units that are as nearly alike (homogeneous) as possible
relative to the extraneous variable. Each treatment is randomly assigned
to 1 unit within each block. Since the effect of the extraneous variable
is controlled by matching like experimental units, any differences in
response are attributed to treatment effects.
The term ‘fixed effects’ applies to both blocks and treatments. That is,
it is assumed that neither blocks nor treatments are randomly chosen.
Any inferences made apply only to the c treatments and b blocks
actually used.
Data presentation
Let xij denote the observation from the ith block and jth treatment .
There are n = cb measurements divided between c treatment levels and
b blocks.
                          Treatment
Block   1     2     3     . .   j     . .   c     Block total   Block mean
1       x11   x12   x13                     x1c   T1.           x̄1.
2       x21   x22   x23                     x2c   T2.           x̄2.
3       x31   x32   x33                     x3c   T3.           x̄3.
.
i       xi1   xi2   xi3         xij         xic   Ti.           x̄i.
.
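Source   df             SS                                                                            MS                                   F
Blocks   b − 1          SS(blocks) = $\dfrac{\sum_{i=1}^{b} T_{i.}^2}{c} - \dfrac{T_{..}^2}{n}$       $\dfrac{SS(blocks)}{b - 1}$          $\dfrac{MS(blocks)}{MS(error)}$
Error    (c−1)(b−1)     SS(error) by subtraction                                                      $\dfrac{SS(error)}{(c - 1)(b - 1)}$
Total    n − 1          SS(total) = $\sum \sum x_{ij}^2 - \dfrac{T_{..}^2}{n}$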
Example
In an experiment to compare the percentage efficiency of different
chelating agents in extracting metal ions from aqueous solution the
following results were obtained:
Chelating agent
Day A B C D
1 84 80 83 79
2 79 77 80 79
3 83 78 80 78
(a) Test whether the different chelating agents have significantly
different efficiencies at 5% sig. level.
(b) Test whether the day-to-day variation has significantly affected the
efficiencies at 5% sig. level.
In this e.g., the factor is the chelating agent; there are 4 treatments (A, B, C
and D) and 3 blocks (days).
The day on which the experiment is performed introduces uncontrolled
variation caused both by changes in laboratory temperature, pressure etc.
and by slight differences in the concentration of the metal ion solution, i.e.
the day is an extraneous uncontrolled random factor. It is a randomized
complete block design.
Chelating agent
Day A B C D Total
1 84 80 83 79 T1. =
2 79 77 80 79 T2. =
3 83 78 80 78 T3. =
Total: T.1 = T.2= T.3= T.4 = T.. =
ANOVA table
Source df SS MS F*
Treatments
Blocks
error
total
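The blank ANOVA table for the chelating-agent example can be completed with a short computation. The sketch below follows the sums-of-squares formulas of the randomized complete block design given in these notes; the critical values F 5%, 3, 6 = 4.76 and F 5%, 2, 6 = 5.14 are standard F-table entries and are quoted here as assumptions, since they are not stated in this example.

```python
# Rows are days (blocks); columns are chelating agents A, B, C, D (treatments).
data = [
    [84, 80, 83, 79],
    [79, 77, 80, 79],
    [83, 78, 80, 78],
]

b, c = len(data), len(data[0])
n = b * c
block_tot = [sum(row) for row in data]
treat_tot = [sum(col) for col in zip(*data)]
T = sum(block_tot)
corr = T**2 / n  # correction term T..^2 / n

ss_total = sum(x * x for row in data for x in row) - corr
ss_treat = sum(t * t for t in treat_tot) / b - corr
ss_block = sum(t * t for t in block_tot) / c - corr
ss_error = ss_total - ss_treat - ss_block

ms_treat = ss_treat / (c - 1)
ms_block = ss_block / (b - 1)
ms_error = ss_error / ((c - 1) * (b - 1))
f_treat = ms_treat / ms_error
f_block = ms_block / ms_error

F_CRIT_TREAT = 4.76  # F table value, 5%, 3 and 6 d.f. (assumed from tables)
F_CRIT_BLOCK = 5.14  # F table value, 5%, 2 and 6 d.f. (assumed from tables)

print(f"Treatments: df={c - 1}, SS={ss_treat:.2f}, MS={ms_treat:.2f}, F*={f_treat:.2f}")
print(f"Blocks:     df={b - 1}, SS={ss_block:.2f}, MS={ms_block:.2f}, F*={f_block:.2f}")
print(f"Error:      df={(c - 1) * (b - 1)}, SS={ss_error:.2f}, MS={ms_error:.2f}")
print(f"Total:      df={n - 1}, SS={ss_total:.2f}")
```

With these table values, F* for treatments exceeds 4.76 while F* for blocks falls below 5.14, so the agents differ significantly but the day-to-day variation is not significant at the 5% level.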
Example A company sells three shampoos for dry, normal and oily hair.
Sales, in millions of dollars, for the past five months are given in the
following table:-
Month    Sales ($ millions)
         Dry    Normal   Oily
June     7      9        12
July     11     12       14
Aug.     13     11       8
Sept.    8      9        7
Oct.     9      10       13
Using 5% sig. level, apply the ANOVA procedure to test whether:-
(a) the mean sales for dry, normal and oily hair are the same.
(b) the mean sales are the same for each of the five months.
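$SS(total) = \sum \sum x_{ij}^2 - \dfrac{T_{..}^2}{n} = 1633 - \dfrac{153^2}{15} = 72.4$
$SS(treatments) = \dfrac{\sum_{j=1}^{c} T_{.j}^2}{b} - \dfrac{T_{..}^2}{n} = \left(\dfrac{48^2}{5} + \dfrac{51^2}{5} + \dfrac{54^2}{5}\right) - \dfrac{153^2}{15}$
= 1564.2 - 1560.6 = 3.6
$SS(blocks) = \dfrac{\sum_{i=1}^{b} T_{i.}^2}{c} - \dfrac{T_{..}^2}{n} = \left(\dfrac{28^2}{3} + \dfrac{37^2}{3} + \dfrac{32^2}{3} + \dfrac{24^2}{3} + \dfrac{32^2}{3}\right) - \dfrac{153^2}{15}$
= 1592.33 - 1560.6 = 31.73
SS(error) = SS(total) - SS(treatments) - SS(blocks)
= 72.4 - 3.6 - 31.73 = 37.07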
ANOVA table:
Source df SS MS F*
Treatments
Blocks
Error
total
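Starting from the sums of squares computed above, the remaining entries of this ANOVA table follow by division. The sketch below is one way to fill them in; the critical values F 5%, 2, 8 = 4.46 and F 5%, 4, 8 = 3.84 are standard F-table entries quoted as assumptions, since they do not appear in this part of the notes.

```python
# Sums of squares from the shampoo example (c = 3 treatments, b = 5 months).
c, b = 3, 5
ss_total, ss_treat, ss_block = 72.4, 3.6, 31.73
ss_error = ss_total - ss_treat - ss_block

df_treat, df_block = c - 1, b - 1
df_error = (c - 1) * (b - 1)

ms_treat = ss_treat / df_treat
ms_block = ss_block / df_block
ms_error = ss_error / df_error
f_treat = ms_treat / ms_error
f_block = ms_block / ms_error

F_CRIT_TREAT = 4.46  # F table value, 5%, 2 and 8 d.f. (assumed from tables)
F_CRIT_BLOCK = 3.84  # F table value, 5%, 4 and 8 d.f. (assumed from tables)

print(f"Treatments: df={df_treat}, SS={ss_treat:.2f}, MS={ms_treat:.2f}, F*={f_treat:.2f}")
print(f"Blocks:     df={df_block}, SS={ss_block:.2f}, MS={ms_block:.2f}, F*={f_block:.2f}")
print(f"Error:      df={df_error}, SS={ss_error:.2f}, MS={ms_error:.2f}")
print(f"Total:      df={c * b - 1}, SS={ss_total:.2f}")
```

Both F* values fall below their critical values, so with these table entries neither the shampoo type nor the month shows a significant effect on mean sales at the 5% level.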
Statistical Quality Control (SQC)
Introduction
In this chapter, we shall be concerned with industrial
manufacturing processes with repetitive operations, where the
quality of the product is of interest. Quality control is a tool
for cost reduction and quality improvement.
Types of measurements
(i) Variable, e.g. weight, length, height etc. Variable
measurements are usually assumed to be normally
distributed.
(ii) Attribute, e.g. good or defective, accepted or rejected, up or
down etc. A binomial random variable is appropriate.
Sources of variation in a manufacturing process
[Figure: control chart showing the upper control limit (UCL), central line and lower control limit (LCL), plotted against time/sample number]
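The 2-sigma (2σ) limits or inner control limits are:
$\bar{\bar{x}} \pm 2\sigma_{\bar{x}} = \bar{\bar{x}} \pm \dfrac{2\sigma}{\sqrt{n}}$
95.45% of the sample means will lie inside the 2σ limits
(warning limits)
The 3-sigma (3σ) limits or outer control limits are:
$\bar{\bar{x}} \pm 3\sigma_{\bar{x}} = \bar{\bar{x}} \pm \dfrac{3\sigma}{\sqrt{n}}$
99.73% of the sample means will lie inside the 3σ limits
(action limits)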
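The two percentages quoted for the warning and action limits come from the normal distribution of the sample mean. A small check, assuming normality and using the error function in place of normal tables (the function name `coverage` is illustrative):

```python
import math

def coverage(k):
    """P(|Z| < k) for a standard normal variable Z."""
    return math.erf(k / math.sqrt(2))

for k in (2, 3):
    inside = 100 * coverage(k)
    print(f"{k}-sigma limits: {inside:.2f}% inside, {100 - inside:.2f}% outside")
```

The 0.27% chance of a point falling outside the 3σ limits when the process is in control is the complement of the second figure.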
Features indicating a process is out of control (or assignable
variations are present)
(a) A single point is outside the 3σ limits. The probability of a
point being outside the 3σ limits, if the process is in control, is
0.27%.
(b) Several points are near the control limits, especially
successive points. Some quality control experts draw 2σ and
3σ limits on a control chart. Several successive points
beyond the 2σ limits indicate that the process should be
carefully watched.
(c) A run of several points is on one side of the central line. The
probability that a single point is (say) above the central line is
1/2. The probability that six successive points would lie above
the central line is only $(1/2)^6 = 0.016$.
(d) A trend in the points exists.
Example
The weight of a component is specified as 5.5 ± 1.0 kg. Five
samples, each of size n = 6 items, are drawn, one each hour for 5
consecutive hours. The results are tabulated as follows:-
Hour (am)   Sample no.   Weight (kg)                      Sum    Mean   Range
8:00        1            4.9  4.8  4.8  5.1  6.6  5.2     31.4   5.2    1.8
9:00        2            6.8  5.1  5.2  7.1  5.3  5.2     34.7   5.8    2.0
10:00       3            7.1  6.9  5.9  6.2  6.9  6.9     39.9   6.7    1.2
11:00       4            6.8  6.2  6.5  7.1  7.6  6.8     41.0   6.8    1.4
12:00       5            6.0  4.6  4.5  4.5  4.3  5.2     29.1   4.9    1.7
Average:                                                         5.9    1.62
(a) Find the % of components that fail to meet the specified
tolerances.
(b) Construct the X̄-chart. Does the process appear to be in
control?
Solution
(a) A chart which is not a control chart may be of interest to
production supervision, but it does not give the definite basis
for action that the control chart supplies. It shows individual
measurements plotted for each sample and the nominal
weight, upper and lower tolerance limits.
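The percentage asked for in part (a) can be counted directly from the individual weights. The sketch below assumes that a weight exactly on a tolerance limit (4.5 kg or 6.5 kg) still meets the specification.

```python
# Individual weights (kg) for the five hourly samples of size 6.
weights = [
    [4.9, 4.8, 4.8, 5.1, 6.6, 5.2],
    [6.8, 5.1, 5.2, 7.1, 5.3, 5.2],
    [7.1, 6.9, 5.9, 6.2, 6.9, 6.9],
    [6.8, 6.2, 6.5, 7.1, 7.6, 6.8],
    [6.0, 4.6, 4.5, 4.5, 4.3, 5.2],
]

nominal, tolerance = 5.5, 1.0
lower, upper = nominal - tolerance, nominal + tolerance

# Flatten the samples and keep the items outside the tolerance limits.
items = [w for sample in weights for w in sample]
failed = [w for w in items if w < lower or w > upper]
pct_failed = 100 * len(failed) / len(items)

print(f"{len(failed)} of {len(items)} components fail: {pct_failed:.0f}%")
```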
The X̄-chart, which shows sample means (averages) rather than
individual values on the vertical scale, is then drawn as
follows:-
From the above chart, we can see that 2 of the 5 sample means
lie outside the 3σ control limits. The process is out of control
and it is the responsibility of the Q.C. inspector to inform the
production department so that any assignable causes can be
detected.
Remark:
Tolerance limits should not be indicated on the X̄-chart. It is
the individual output that has to meet the tolerance limits, not
the average of a sample. Averages of samples often fall
within tolerance limits even though some of the individual
outputs in the sample are outside the limits.
Control chart for sample ranges , R-chart
In addition to monitoring changes in the mean, it is also useful
to closely scrutinize variation in the dispersion within a
process. Although standard deviation is a reliable measure of
dispersion, quality control techniques usually rely on the
range as an indication of the variability of the process. Range
is easier to compute and more readily understood by those
without a sufficient statistical background.
$LCL_R = \bar{R} - 3S_R$ and
$UCL_R = \bar{R} + 3S_R$
where $S_R$ = standard deviation in the distribution of the
sample ranges. In practice, it is calculated as follows:-
$LCL_R = D_3\bar{R}$
$UCL_R = D_4\bar{R}$
Central line = $\bar{R}$
where $D_3$ = 0 and $D_4$ = 2.004
$LCL_R = D_3\bar{R}$ = 0 (1.62) = 0
$UCL_R = D_4\bar{R}$ = 2.004 (1.62) = 3.25
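These R-chart limits can be reproduced from the raw weights of the component example. The sketch below uses the constants D3 = 0 and D4 = 2.004 given above for samples of size 6 and lists any sample whose range falls outside the limits.

```python
# Individual weights (kg) for the five hourly samples of size 6.
weights = [
    [4.9, 4.8, 4.8, 5.1, 6.6, 5.2],
    [6.8, 5.1, 5.2, 7.1, 5.3, 5.2],
    [7.1, 6.9, 5.9, 6.2, 6.9, 6.9],
    [6.8, 6.2, 6.5, 7.1, 7.6, 6.8],
    [6.0, 4.6, 4.5, 4.5, 4.3, 5.2],
]
D3, D4 = 0.0, 2.004  # control chart constants quoted in the notes for n = 6

# Range of each sample and the mean range (central line).
ranges = [max(sample) - min(sample) for sample in weights]
r_bar = sum(ranges) / len(ranges)

lcl_r = D3 * r_bar
ucl_r = D4 * r_bar

# Sample numbers (1-based) whose range lies outside the control limits.
out_of_control = [i + 1 for i, r in enumerate(ranges) if r < lcl_r or r > ucl_r]

print(f"R-bar = {r_bar:.2f}, LCL = {lcl_r:.2f}, UCL = {ucl_r:.2f}")
print("Samples outside limits:", out_of_control or "none")
```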
The R-chart is constructed as follows:-
Remarks:
The interpretation of the R-chart is similar to that of the X̄-chart.
The process can be considered out of control with respect to
dispersion if
(a) a single point is outside the 3σ limits
(b) several points are near the control limits
(c) several points lie on one side of the central line
(d) there is a trend in the points
In the above example, none of the 4 features described above
is clearly present, thus the process variation is in control.
Control charts for attributes
The X̄-chart and R-chart are designed to
monitor quantitative data in a process. In many cases it is
necessary or desirable to measure the quality of a process, or
the output of that process, based on the attribute of
acceptability. The term attribute, as used in quality control,
refers to those quality characteristics which are either good or
defective, accepted or rejected etc. They conform to
specifications or they do not conform to specifications.
Fraction defective (or proportion defective) = $\dfrac{X}{n}$
with mean $E\left(\dfrac{X}{n}\right) = p$ and $V\left(\dfrac{X}{n}\right) = \dfrac{p(1 - p)}{n}$
Control charts can be set up using the binomial distribution but
the working is simplified by using the normal distribution to
approximate the binomial distribution.
The 3σ control limits of the fraction defective are:
$LCL_p = p - 3\sqrt{\dfrac{p(1 - p)}{n}}$ and
$UCL_p = p + 3\sqrt{\dfrac{p(1 - p)}{n}}$
and p is estimated by $\bar{p}$ = mean proportion of defectives for
all samples.
In constructing p-charts, we simply take note of the proportion
of defective items in a sample. This sample proportion, $\hat{p}$, is
$\hat{p} = \dfrac{\text{number of defectives in a sample}}{\text{sample size}}$
$LCL_p = \bar{p} - 3\sqrt{\dfrac{\bar{p}(1 - \bar{p})}{n}}$
$UCL_p = \bar{p} + 3\sqrt{\dfrac{\bar{p}(1 - \bar{p})}{n}}$
Central line = $\bar{p}$
Example
XYZ Factory makes electric guitars and other musical
instruments. A quality control procedure to detect the proportion
of defectives in their AAA model guitar entailed the selection
of k = 15 different samples of size n = 40. The number of
defective guitars in each sample is shown in the following
table:
Solution:-
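(a) A total of k × n = 15(40) = 600 guitars are inspected.
$\bar{p} = \dfrac{211}{600} = 0.35$
$LCL_p = \bar{p} - 3\sqrt{\dfrac{\bar{p}(1 - \bar{p})}{n}} = 0.35 - 3\sqrt{\dfrac{0.35(1 - 0.35)}{40}} = 0.12$
$UCL_p = \bar{p} + 3\sqrt{\dfrac{\bar{p}(1 - \bar{p})}{n}} = 0.35 + 3\sqrt{\dfrac{0.35(1 - 0.35)}{40}} = 0.58$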
(b) Interpretation of the p-chart is the same as for the
variable chart. The same four indications that the process is
out of control apply here.
Note that a point above the UCLp indicates a deterioration in
quality whereas a point below indicates an improvement in
quality. If the sample size n changes, the control limits must
be recomputed since UCLp and LCLp are linked to n.
In this example, the p- chart shows that 3 out of 15 samples
are outside the control limits (i.e. sample nos. 5, 12 and 13 are
out of control). The search for the assignable causes revealed
that:
(i) Sample no. 5 was taken during time when certain key
personnel were on vacation, and less skilled employees were
forced to fill in.
(ii) Sample no. 12 has an unusually low proportion of
defectives, which is due to a one-time use of superior raw
materials when the regular supplier was unable to provide
the usual material.
(iii) Sample no. 13 was taken when new construction at the
plant temporarily interrupted electric power, thus
disallowing the use of computerized production methods.
Example
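Eliminating sample nos. 5, 12 and 13 yields
$\bar{p} = \dfrac{156}{480} = 0.33$
$LCL_p = \bar{p} - 3\sqrt{\dfrac{\bar{p}(1 - \bar{p})}{n}} = 0.33 - 3\sqrt{\dfrac{0.33(1 - 0.33)}{40}} = 0.11$
$UCL_p = \bar{p} + 3\sqrt{\dfrac{\bar{p}(1 - \bar{p})}{n}} = 0.33 + 3\sqrt{\dfrac{0.33(1 - 0.33)}{40}} = 0.55$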
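Both sets of p-chart limits in the guitar example follow from the same formula, so they can be checked with one small function. The name `p_chart_limits` is illustrative; the mean proportions are entered rounded to two decimals, as in the working above, and the lower limit is clamped at zero, which is a common convention assumed here rather than something stated in the notes.

```python
import math

def p_chart_limits(p_bar, n):
    """Return (LCL, UCL) of the 3-sigma p-chart for samples of size n."""
    half_width = 3 * math.sqrt(p_bar * (1 - p_bar) / n)
    # A proportion cannot be negative, so the lower limit is clamped at 0.
    return max(0.0, p_bar - half_width), p_bar + half_width

# Original limits: 211 defectives in 600 guitars, p-bar rounded to 0.35.
first = p_chart_limits(0.35, 40)
# Revised limits after removing samples 5, 12 and 13: 156 in 480, rounded to 0.33.
revised = p_chart_limits(0.33, 40)

print(f"original: LCL = {first[0]:.2f}, UCL = {first[1]:.2f}")
print(f"revised:  LCL = {revised[0]:.2f}, UCL = {revised[1]:.2f}")
```

If the sample size n changed between samples, the function would simply be called again with the new n, in line with the remark that the limits are linked to n.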