Maths
$P(B/A) = \dfrac{P(A \cap B)}{P(A)} = \dfrac{1/6}{1/2} = \dfrac{1}{3}$
3. Let A and B be two events such that $P(A) = \frac{1}{3}$, $P(B) = \frac{3}{4}$, $P(A \cap B) = \frac{1}{4}$. Compute $P(A/B)$ and $P(A \cap \bar{B})$. (May/June 2019)
Solution:
$P(A/B) = \dfrac{P(A \cap B)}{P(B)} = \dfrac{1/4}{3/4} = \dfrac{1}{3}$,
$P(A \cap \bar{B}) = P(A) - P(A \cap B) = \dfrac{1}{3} - \dfrac{1}{4} = \dfrac{1}{12}$
4. State Bayes' Theorem on Probability.
Statement:
If $E_1, E_2, \ldots, E_n$ are a set of exhaustive and mutually exclusive events associated with a random experiment and A is any other event associated with the $E_i$, then
$P(E_i/A) = \dfrac{P(E_i)\,P(A/E_i)}{\sum_{i=1}^{n} P(E_i)\,P(A/E_i)}$, i = 1, 2, ..., n
5. Let X be the random variable which denotes the number of heads in three tosses of a fair coin.
Determine the probability mass function of X.
Solution:
If a coin is tossed three times, then the sample space is
S = {TTT, HTT, THT, TTH, HTH, HHT, THH, HHH}
Let X be the random variable denoting the number of heads. The probability mass function is
X= x 0 1 2 3
P(X=x) 1/8 3/8 3/8 1/8
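The table above can be checked by brute-force enumeration. The following is a minimal sketch, assuming a fair coin; the function name `heads_pmf` is illustrative and not part of the original notes.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_pmf(n_tosses=3):
    """Return {x: P(X = x)} for the number of heads in n fair coin tosses."""
    # Enumerate the full sample space, e.g. ('H', 'T', 'T') for n_tosses = 3.
    outcomes = list(product("HT", repeat=n_tosses))
    # Count how many outcomes give each number of heads.
    counts = Counter(outcome.count("H") for outcome in outcomes)
    total = len(outcomes)
    return {x: Fraction(c, total) for x, c in sorted(counts.items())}


pmf = heads_pmf(3)
for x, p in pmf.items():
    print(x, p)
```

Using `Fraction` keeps the probabilities exact, so they can be compared with the tabulated values directly.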
P(X=x) 1/36 2/36 3/36 4/36 5/36 6/36 5/36 4/36 3/36 2/36 1/36
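$E(X) = \sum_{x=2}^{12} x\,P(X = x) = 2\cdot\frac{1}{36} + 3\cdot\frac{2}{36} + 4\cdot\frac{3}{36} + 5\cdot\frac{4}{36} + 6\cdot\frac{5}{36} + 7\cdot\frac{6}{36} + 8\cdot\frac{5}{36} + 9\cdot\frac{4}{36} + 10\cdot\frac{3}{36} + 11\cdot\frac{2}{36} + 12\cdot\frac{1}{36}$
$= \dfrac{252}{36} = 7$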
7. Find the value of ‘k’ for a continuous random variable X whose probability density function is given by $f(x) = k x^2 e^{-x}$; $x \ge 0$.
Solution:
Since X is a continuous random variable, $f(x) \ge 0$ for $x \ge 0$ and $\int_0^{\infty} f(x)\,dx = 1$.
$k\int_0^{\infty} x^2 e^{-x}\,dx = 1 \Rightarrow k\left[(x^2)(-e^{-x}) - (2x)(e^{-x}) + (2)(-e^{-x})\right]_0^{\infty} = 1$
$k\left[0 - (-2)\right] = 1 \Rightarrow 2k = 1 \Rightarrow k = \dfrac{1}{2}$
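8. A random variable X has the p.d.f. f(x) given by $f(x) = \begin{cases} a(1 + x^2), & 2 < x < 5 \\ 0, & \text{otherwise} \end{cases}$. Find ‘a’ and P(X < 4). (Nov/Dec 2023)
Solution:
Since X is a continuous random variable, $f(x) \ge 0$ for all x and $\int_{-\infty}^{\infty} f(x)\,dx = 1$.
$\int_2^5 a(1 + x^2)\,dx = 1 \Rightarrow a\left[x + \dfrac{x^3}{3}\right]_2^5 = 1$
$a\left[\left(5 + \dfrac{5^3}{3}\right) - \left(2 + \dfrac{2^3}{3}\right)\right] = 1 \Rightarrow 42a = 1 \Rightarrow a = \dfrac{1}{42}$
$P(X < 4) = \int_2^4 f(x)\,dx = \int_2^4 \dfrac{1 + x^2}{42}\,dx = \dfrac{1}{42}\left[x + \dfrac{x^3}{3}\right]_2^4 = \dfrac{1}{42}\left[\left(4 + \dfrac{4^3}{3}\right) - \left(2 + \dfrac{2^3}{3}\right)\right] = \dfrac{31}{63}$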
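9. The CDF of a continuous random variable is given by $F(x) = \begin{cases} 0, & x < 0 \\ 1 - e^{-x/5}, & x \ge 0 \end{cases}$. Find the PDF of X and mean of X.
Solution:
The relation between the pdf and the cdf is
$f(x) = \dfrac{d}{dx}F(x) = \dfrac{d}{dx}\begin{cases} 0, & x < 0 \\ 1 - e^{-x/5}, & x \ge 0 \end{cases} = \begin{cases} 0, & x < 0 \\ \dfrac{1}{5}e^{-x/5}, & x \ge 0 \end{cases}$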
10. Let X be a random variable with E(X) = 1, E[X(X - 1)] = 4. Find Var(X) and Var(2 - 3X).
Solution:
$E[X(X-1)] = 4 \Rightarrow E[X^2 - X] = 4$
$E[X^2] - E[X] = 4 \Rightarrow E[X^2] - 1 = 4 \Rightarrow E[X^2] = 5$
$Var(X) = E[X^2] - (E[X])^2 = 5 - 1 = 4$
$Var(2 - 3X) = (-3)^2\,Var(X) = 9(4) = 36$
$= \dfrac{1}{2}(r + 2)!$, since $\Gamma(n) = (n - 1)!$ if n is a positive integer
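14. For a binomial variate X, find p when n = 6 and 9 P(X = 4) = P(X = 2).
Solution:
By the binomial distribution, $P(X = x) = nC_x\,p^x q^{n-x}$.
$9\cdot 6C_4\,p^4 q^2 = 6C_2\,p^2 q^4 \Rightarrow 9p^2 = q^2 = (1 - p)^2 \Rightarrow 8p^2 + 2p - 1 = 0$
$p = \dfrac{1}{4}, -\dfrac{1}{2}$. Since p is positive, $p = \dfrac{1}{4}$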
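Solution: X follows N(20, 10) $\Rightarrow \mu = 20$ and $\sigma = 10$. Let $Z = \dfrac{X - \mu}{\sigma}$ be the standard normal variate.
$P[15 \le X \le 40] = P\left[\dfrac{15 - 20}{10} \le Z \le \dfrac{40 - 20}{10}\right] = P[-0.5 \le Z \le 2]$
$= P[-0.5 \le Z \le 0] + P[0 \le Z \le 2]$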
Event   P(B_i)   P(A/B_i)                        P(B_i) P(A/B_i)
B_1     1/3      (1C1 · 3C1)/6C2 = 1/5           (1/3)(1/5) = 0.0667
B_2     1/3      (2C1 · 1C1)/4C2 = 1/3           (1/3)(1/3) = 0.1111
B_3     1/3      (4C1 · 3C1)/12C2 = 2/11         (1/3)(2/11) = 0.0606
Total                                            $\sum_{i=1}^{3} P(B_i)\,P(A/B_i) = 0.2384$
By Bayes' theorem,
$P(B_i/A) = \dfrac{P(B_i)\,P(A/B_i)}{\sum_{i=1}^{3} P(B_i)\,P(A/B_i)}$, i = 1, 2, 3.
P[Urn I was chosen given that the balls are white and red]
$= P(B_1/A) = \dfrac{P(B_1)\,P(A/B_1)}{\sum_{i=1}^{3} P(B_i)\,P(A/B_i)} = \dfrac{0.0667}{0.2384} = 0.2798$
P[Urn II was chosen given that the balls are white and red]
$= P(B_2/A) = \dfrac{P(B_2)\,P(A/B_2)}{\sum_{i=1}^{3} P(B_i)\,P(A/B_i)} = \dfrac{0.1111}{0.2384} = 0.4660$
P[Urn III was chosen given that the balls are white and red]
$= P(B_3/A) = \dfrac{P(B_3)\,P(A/B_3)}{\sum_{i=1}^{3} P(B_i)\,P(A/B_i)} = \dfrac{0.0606}{0.2384} = 0.2542$
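The Bayes computation above can be reproduced with exact fractions. This is a minimal sketch, assuming equal priors of 1/3 and the likelihoods written in the table as ratios of combinations; the helper name `posterior` is hypothetical.

```python
from fractions import Fraction
from math import comb


def posterior(priors, likelihoods):
    """Apply Bayes' theorem: P(B_i/A) = P(B_i)P(A/B_i) / sum_j P(B_j)P(A/B_j)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]


# Each urn is chosen with probability 1/3 (assumption taken from the table).
priors = [Fraction(1, 3)] * 3
# Likelihoods of drawing one white and one red ball, as given in the table.
likelihoods = [
    Fraction(comb(1, 1) * comb(3, 1), comb(6, 2)),    # urn I   -> 1/5
    Fraction(comb(2, 1) * comb(1, 1), comb(4, 2)),    # urn II  -> 1/3
    Fraction(comb(4, 1) * comb(3, 1), comb(12, 2)),   # urn III -> 2/11
]

post = posterior(priors, likelihoods)
for name, p in zip(["I", "II", "III"], post):
    print(f"Urn {name}: {float(p):.4f}")
```

Because the fractions are exact, the printed decimals can differ in the last digit from values obtained by dividing rounded intermediate results.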
2. The discrete random variable X has the following probability mass function:
$P(x) = \begin{cases} b, & x = 0 \\ 2b, & x = 1 \\ 3b, & x = 2 \\ 0, & \text{otherwise} \end{cases}$
(a) Find the value of ‘b’. (b) Determine P(X < 2), P(X ≤ 2) and P(0 < X < 2).
(c) Determine the distribution function of X.
Solution:
To find the value of ‘b’:
We know that $\sum_i P(X = x_i) = 1 \Rightarrow b + 2b + 3b = 1 \Rightarrow 6b = 1 \Rightarrow b = \dfrac{1}{6}$
X = x       0     1     2
P(X = x)    1/6   2/6   3/6
P(X < 2) = P(X = 0) + P(X = 1) = 1/6 + 2/6 = 3/6
P(X ≤ 2) = 1/6 + 2/6 + 3/6 = 1
P(0 < X < 2) = P(X = 1) = 2/6
The distribution function of X is
X = x       0     1     2
P(X = x)    1/6   2/6   3/6
F(x)        1/6   3/6   6/6 = 1
3. A discrete random variable has the following probability distribution.
X=x 0 1 2 3 4 5 6 7 8
P(X = x) a 3a 5a 7a 9a 11a 13a 15a 17a
Find (i) the value of a, (ii) P(X > 3), (iii) the distribution of X, (iv) P(1.5 < X < 4.5 / X > 2), (v) the mean.
(Nov/Dec 2023)
Solution:
(i) To find the value of ‘a’:
a + 3a + 5a + 7a + 9a + 11a + 13a + 15a + 17a = 1
$81a = 1 \Rightarrow a = \dfrac{1}{81}$
X = x       0      1      2      3      4      5       6       7       8
P(X = x)    1/81   3/81   5/81   7/81   9/81   11/81   13/81   15/81   17/81
(v) Mean: $E(X) = \sum x\,P(X = x)$
= 0 P(X = 0) + 1 P(X = 1) + 2 P(X = 2) + 3 P(X = 3) + 4 P(X = 4)
+ 5 P(X = 5) + 6 P(X = 6) + 7 P(X = 7) + 8 P(X = 8)
= 5.4814
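As a cross-check on parts (i), (ii) and (v), the distribution can be built from its weights 1, 3, ..., 17. This is a small sketch with illustrative variable names, not the notes' own method.

```python
from fractions import Fraction

# P(X = x) = w_x * a for x = 0..8, with weights 1, 3, 5, ..., 17.
weights = [1, 3, 5, 7, 9, 11, 13, 15, 17]

# The probabilities must sum to 1, which fixes a.
a = Fraction(1, sum(weights))
pmf = {x: w * a for x, w in enumerate(weights)}

# Mean: E(X) = sum of x * P(X = x).
mean = sum(x * p for x, p in pmf.items())

# P(X > 3): add the probabilities of x = 4, ..., 8.
p_gt_3 = sum(p for x, p in pmf.items() if x > 3)

print(a, mean, float(mean), p_gt_3)
```

The exact mean is 444/81; its decimal expansion 5.48148... is what the notes truncate to 5.4814.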
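4. Let X be a continuous random variable with the pdf $f(x) = \begin{cases} \dfrac{4x(9 - x^2)}{81}; & 0 \le x \le 3 \\ 0; & \text{otherwise} \end{cases}$
Find the first four moments about the origin.
Solution:
Let $\mu_r'$ be the rth moment about the origin.
$\mu_r' = E(X^r) = \int_{-\infty}^{\infty} x^r f(x)\,dx$
$= \int_0^3 x^r\,\dfrac{4x(9 - x^2)}{81}\,dx$
$= \dfrac{4}{81}\int_0^3 \left(9x^{r+1} - x^{r+3}\right)dx$
$= \dfrac{4}{81}\left[\dfrac{9x^{r+2}}{r+2} - \dfrac{x^{r+4}}{r+4}\right]_0^3$
$= \dfrac{4}{81}\left[\dfrac{9(3)^{r+2}}{r+2} - \dfrac{(3)^{r+4}}{r+4} - 0 - 0\right]$
$= \dfrac{4}{81}\left[\dfrac{(3)^{r+4}}{r+2} - \dfrac{(3)^{r+4}}{r+4}\right]$
$\mu_r' = \dfrac{4(3)^{r+4}}{81}\left[\dfrac{1}{r+2} - \dfrac{1}{r+4}\right]$ ----- (1)
Putting r = 1, 2, 3, 4 in (1), we get the first four moments about the origin as
$\mu_1' = 1.6,\ \mu_2' = 3,\ \mu_3' = 6.17,\ \mu_4' = 13.5$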
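The closed form (1) for the rth moment is easy to evaluate exactly. The sketch below assumes the formula as derived above; `raw_moment` is an illustrative name.

```python
from fractions import Fraction


def raw_moment(r):
    """r-th moment about the origin from formula (1):
    (4 * 3^(r+4) / 81) * (1/(r+2) - 1/(r+4))."""
    return Fraction(4 * 3 ** (r + 4), 81) * (Fraction(1, r + 2) - Fraction(1, r + 4))


moments = [raw_moment(r) for r in range(1, 5)]
for r, m in enumerate(moments, start=1):
    print(f"mu'_{r} = {m} = {float(m):.2f}")
```

The third moment comes out as 216/35, which rounds to the 6.17 quoted in the solution.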
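5. If the density function of a continuous random variable X is given by
$f(x) = \begin{cases} ax, & 0 \le x \le 1 \\ a, & 1 \le x \le 2 \\ 3a - ax, & 2 \le x \le 3 \\ 0, & \text{elsewhere} \end{cases}$
(a) Find the value of ‘a’ (b) Find the c.d.f. of X (c) Find P(X ≤ 1.5) (April/May 2022)
Solution:
(a) Since $\int_{-\infty}^{\infty} f(x)\,dx = 1$,
$\int_{-\infty}^{0} f(x)\,dx + \int_0^1 f(x)\,dx + \int_1^2 f(x)\,dx + \int_2^3 f(x)\,dx + \int_3^{\infty} f(x)\,dx = 1$
$\int_0^1 ax\,dx + \int_1^2 a\,dx + \int_2^3 (3a - ax)\,dx = 1$
$a\left[\dfrac{x^2}{2}\right]_0^1 + a\left[x\right]_1^2 + a\left[3x - \dfrac{x^2}{2}\right]_2^3 = 1 \Rightarrow 2a = 1 \Rightarrow a = \dfrac{1}{2}$
(b) CDF
If $x \le 0$, $F(x) = 0$.
If $0 \le x \le 1$, $F(x) = \int_0^x \dfrac{x}{2}\,dx = \dfrac{x^2}{4}$.
If $1 \le x \le 2$, $F(x) = \int_0^1 \dfrac{x}{2}\,dx + \int_1^x \dfrac{1}{2}\,dx = \dfrac{1}{4} + \dfrac{x - 1}{2} = \dfrac{x}{2} - \dfrac{1}{4}$.
If $2 \le x \le 3$,
$F(x) = \left[\dfrac{x^2}{4}\right]_0^1 + \left[\dfrac{x}{2}\right]_1^2 + \left[\dfrac{3x}{2} - \dfrac{x^2}{4}\right]_2^x = \dfrac{3x}{2} - \dfrac{x^2}{4} - \dfrac{5}{4}$
If $x \ge 3$,
$F(x) = \int_{-\infty}^{x} f(x)\,dx = \int_0^1 \dfrac{x}{2}\,dx + \int_1^2 \dfrac{1}{2}\,dx + \int_2^3 \left(\dfrac{3}{2} - \dfrac{x}{2}\right)dx + \int_3^x 0\,dx$
$= \left[\dfrac{x^2}{4}\right]_0^1 + \left[\dfrac{x}{2}\right]_1^2 + \left[\dfrac{3x}{2} - \dfrac{x^2}{4}\right]_2^3 = \dfrac{1}{4} + \dfrac{1}{2} + \dfrac{1}{4} = 1$
$F_X(x) = \begin{cases} 0, & x \le 0 \\ \dfrac{x^2}{4}, & 0 \le x \le 1 \\ \dfrac{x}{2} - \dfrac{1}{4}, & 1 \le x \le 2 \\ \dfrac{3x}{2} - \dfrac{x^2}{4} - \dfrac{5}{4}, & 2 \le x \le 3 \\ 1, & x \ge 3 \end{cases}$
(c) $P(X \le 1.5) = F(1.5)$ [since $F(x) = P(X \le x)$]
$= \dfrac{1.5}{2} - \dfrac{1}{4} = 0.5$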
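6. A random variable X has the p.d.f. $f(x) = \begin{cases} 2x, & 0 < x < 1 \\ 0, & \text{otherwise} \end{cases}$. Find (i) $P\left(X < \frac{1}{2}\right)$, (ii) $P\left(\frac{1}{4} < X < \frac{1}{2}\right)$, (iii) $P\left(X > \frac{3}{4} \,/\, X > \frac{1}{2}\right)$
Solution:
(i) $P\left(X < \frac{1}{2}\right) = \int_0^{1/2} f(x)\,dx = \int_0^{1/2} 2x\,dx = 2\left[\dfrac{x^2}{2}\right]_0^{1/2} = \dfrac{1}{4}$
(ii) $P\left(\frac{1}{4} < X < \frac{1}{2}\right) = \int_{1/4}^{1/2} f(x)\,dx = \int_{1/4}^{1/2} 2x\,dx = 2\left[\dfrac{x^2}{2}\right]_{1/4}^{1/2} = \dfrac{1}{4} - \dfrac{1}{16} = \dfrac{3}{16}$
(iii) $P\left(X > \frac{1}{2}\right) = 1 - P\left(X < \frac{1}{2}\right) = 1 - \dfrac{1}{4} = \dfrac{3}{4}$ and $P\left(X > \frac{3}{4}\right) = 1 - \dfrac{9}{16} = \dfrac{7}{16}$, so
$P\left(X > \frac{3}{4} \,/\, X > \frac{1}{2}\right) = \dfrac{P\left(X > \frac{3}{4}\right)}{P\left(X > \frac{1}{2}\right)} = \dfrac{7/16}{3/4} = \dfrac{7}{12}$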
7. Derive the moment generating function, mean and variance of Binomial distribution.
A random variable X is said to follow a binomial distribution if it assumes only non-negative values with probability mass function
$P(X = x) = nC_x\,p^x q^{n-x}$, where x = 0, 1, 2, ..., n and q = 1 - p.
MGF, Mean and Variance of Binomial distribution:
$M_X(t) = E(e^{tX}) = \sum_{x=0}^{n} e^{tx} p(x)$
$= \sum_{x=0}^{n} e^{tx}\,nC_x\,p^x q^{n-x}$
$= \sum_{x=0}^{n} nC_x\,(pe^t)^x q^{n-x}$
MGF $= M_X(t) = (pe^t + q)^n$
Mean $= E(X) = \left[\dfrac{d}{dt} M_X(t)\right]_{t=0}$
$= \left[n(pe^t + q)^{n-1}\,pe^t\right]_{t=0}$
$= np(p + q)^{n-1}$
Mean $= np$ $\quad [\because p + q = 1]$
$E(X^2) = \left[\dfrac{d^2}{dt^2} M_X(t)\right]_{t=0}$
$= \left[\dfrac{d}{dt}\left(np\,e^t (pe^t + q)^{n-1}\right)\right]_{t=0}$
$= np\left[e^t (n-1)(pe^t + q)^{n-2}\,pe^t + (pe^t + q)^{n-1} e^t\right]_{t=0}$
$= np\left[(n-1)(p + q)^{n-2}\,p + (p + q)^{n-1}\right]$ $\quad [\because p + q = 1]$
$= np\left[(n-1)p + 1\right]$
$= np\left[np - p + 1\right]$
Variance $= E(X^2) - [E(X)]^2 = n^2p^2 - np^2 + np - n^2p^2 = np(1 - p) = npq$
8. Messages arrive at a switchboard in a Poisson manner at an average rate of six per hour. Find the
probability for each of the following events (a) exactly two messages arrive within one hour,
(b) no message arrives within one hour, (c) at least three messages arrive within one hour.
Solution:
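Let X be the random variable denoting the number of messages arriving at the switchboard in one hour.
By the Poisson distribution, $P(X = x) = \dfrac{e^{-\lambda}\lambda^x}{x!}$, x = 0, 1, 2, ...
Here $\lambda = 6$
(i) P[exactly two messages arrive within one hour] $= P(X = 2) = \dfrac{e^{-6}(6)^2}{2!} = 0.0446$
(ii) P[no message arrives within one hour] $= P(X = 0) = \dfrac{e^{-6}(6)^0}{0!} = e^{-6} = 0.0025$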
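All three requested Poisson probabilities follow from the same mass function. The following is a minimal sketch using only the standard library, with the rate of 6 messages per hour taken from the problem; the function name `poisson_pmf` is illustrative.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam ** x / factorial(x)


lam = 6  # average number of messages per hour

p_two = poisson_pmf(2, lam)                  # (a) exactly two messages
p_none = poisson_pmf(0, lam)                 # (b) no message
# (c) at least three = 1 - P(X = 0) - P(X = 1) - P(X = 2)
p_at_least_three = 1 - sum(poisson_pmf(x, lam) for x in range(3))

print(round(p_two, 4), round(p_none, 4), round(p_at_least_three, 4))
```

Part (c) uses the complement because summing the infinite tail directly is unnecessary.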
9. Find mean, variance and MGF of exponential distribution. Also prove the lack of memory
property of the Exponential distribution. (APR / MAY 2019)
Solution:
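We know that $f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$
$M_X(t) = E(e^{tX}) = \int_0^{\infty} e^{tx} f(x)\,dx = \int_0^{\infty} \lambda e^{-\lambda x} e^{tx}\,dx$
$= \lambda\int_0^{\infty} e^{-x(\lambda - t)}\,dx$
$= \lambda\left[\dfrac{e^{-x(\lambda - t)}}{-(\lambda - t)}\right]_0^{\infty} = \dfrac{\lambda}{\lambda - t}$, for $t < \lambda$
Mean $= \mu_1' = \left[\dfrac{d}{dt} M_X(t)\right]_{t=0} = \left[\dfrac{\lambda}{(\lambda - t)^2}\right]_{t=0} = \dfrac{1}{\lambda}$
$\mu_2' = \left[\dfrac{d^2}{dt^2} M_X(t)\right]_{t=0} = \left[\dfrac{\lambda(2)}{(\lambda - t)^3}\right]_{t=0} = \dfrac{2}{\lambda^2}$
Variance $= \mu_2' - (\mu_1')^2 = \dfrac{2}{\lambda^2} - \dfrac{1}{\lambda^2} = \dfrac{1}{\lambda^2}$.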
MEMORYLESS PROPERTY
Statement: If X is exponentially distributed with parameter $\lambda$, then for any two positive real numbers ‘s’ and ‘t’, $P[X > s + t \,/\, X > s] = P[X > t]$, for all $s, t > 0$
St. Joseph’s College of Engineering 11
MA1452-Applied Probability & Statistics Dept. of BIOTECH & CHEMICAL 2023-2024
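Proof:
The p.d.f. of X is $f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$
$P[X > k] = \int_k^{\infty} \lambda e^{-\lambda x}\,dx = \left[-e^{-\lambda x}\right]_k^{\infty} = e^{-\lambda k}$
$P[X > s + t \,/\, X > s] = \dfrac{P[X > s + t \cap X > s]}{P[X > s]}$
$= \dfrac{P[X > s + t]}{P[X > s]} = \dfrac{e^{-\lambda(s + t)}}{e^{-\lambda s}} = e^{-\lambda t} = P[X > t]$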
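10. Let X be a uniformly distributed R.V. over [-5, 5]. Determine (1) P(X ≤ 2) (2) P(|X| ≤ 2) (3) CDF of X (4) Var(X)
Solution:
The R.V. X ~ U[-5, 5].
The p.d.f. of a uniform distribution on [a, b] is
$f(x) = \begin{cases} \dfrac{1}{b - a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases}$
so that
$f(x) = \begin{cases} \dfrac{1}{10}, & -5 \le x \le 5 \\ 0, & \text{otherwise} \end{cases}$
(1) $P(X \le 2) = \int_{-5}^{2} f(x)\,dx = \int_{-5}^{2} \dfrac{1}{10}\,dx = \dfrac{1}{10}\left[x\right]_{-5}^{2} = \dfrac{1}{10}[2 + 5] = \dfrac{7}{10}$
(2) $P(|X| \le 2) = P(-2 \le X \le 2) = \int_{-2}^{2} f(x)\,dx = \int_{-2}^{2} \dfrac{1}{10}\,dx = \dfrac{1}{10}\left[x\right]_{-2}^{2} = \dfrac{1}{10}[2 + 2] = \dfrac{4}{10} = \dfrac{2}{5}$
(3) Cumulative distribution function of X
If $x < -5$: $F(x) = \int_{-\infty}^{x} f(x)\,dx = \int_{-\infty}^{x} 0\,dx = 0$
If $-5 \le x \le 5$: $F(x) = \int_{-5}^{x} f(x)\,dx = \int_{-5}^{x} \dfrac{1}{10}\,dx = \dfrac{1}{10}\left[x\right]_{-5}^{x} = \dfrac{x + 5}{10}$
If $x > 5$: $F(x) = \int_{-5}^{5} f(x)\,dx + \int_5^{x} f(x)\,dx = \int_{-5}^{5} \dfrac{1}{10}\,dx + 0 = \dfrac{1}{10}\left[x\right]_{-5}^{5} = \dfrac{5 + 5}{10} = 1$
(4) $Var(X) = \dfrac{(b - a)^2}{12} = \dfrac{(5 - (-5))^2}{12} = \dfrac{100}{12} = \dfrac{25}{3}$.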
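11. Let $P(X = x) = \dfrac{3}{4}\left(\dfrac{1}{4}\right)^{x-1}$, x = 1, 2, 3, ... be the probability mass function of the RV X. Compute
(a) P(X > 4 / X > 2) (b) Find mean and variance
Solution:
(a) To find P(X > 4 / X > 2): by the memoryless property, P(X > 4 / X > 2) = P(X > 2)
$P(X > 2) = \sum_{x=3}^{\infty} \dfrac{3}{4}\left(\dfrac{1}{4}\right)^{x-1} = \dfrac{3}{4}\left(\dfrac{1}{4}\right)^2\left[1 + \dfrac{1}{4} + \left(\dfrac{1}{4}\right)^2 + \cdots\right]$
$= \dfrac{3}{4}\left(\dfrac{1}{4}\right)^2\left[1 - \dfrac{1}{4}\right]^{-1} = \dfrac{3}{4}\left(\dfrac{1}{4}\right)^2\left(\dfrac{3}{4}\right)^{-1} = \left(\dfrac{1}{4}\right)^2 = \dfrac{1}{16}$
(b) To find mean and variance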
12. Buses arrive at a specified stop at 15 minutes interval starting at 7 a.m., that is, they arrive at 7,
7.15, 7.30, 7.45 and so on. If a passenger arrives at a stop at a random time that is uniformly
distributed between 7 and 7.30, find the probability that he waits for (i) Less than 5 minutes for a
bus, (ii) More than 10 minutes for a bus.
Solution:
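Let X denote the number of minutes past 7.00 a.m. at which the passenger arrives at the stop, up to 7.30 a.m.
X ~ U[0, 30] $\Rightarrow f(x) = \dfrac{1}{30}$, $0 \le x \le 30$
(i) Here the passenger has to arrive between 7.10 and 7.15 or 7.25 and 7.30
P(that he has to wait for the bus for less than 5 minutes) $= P[(10 \le X \le 15) \cup (25 \le X \le 30)]$
$= P(10 \le X \le 15) + P(25 \le X \le 30)$
$= \int_{10}^{15} f(x)\,dx + \int_{25}^{30} f(x)\,dx$
$= \int_{10}^{15} \dfrac{1}{30}\,dx + \int_{25}^{30} \dfrac{1}{30}\,dx$
$= \dfrac{1}{30}\left(\left[x\right]_{10}^{15} + \left[x\right]_{25}^{30}\right)$
$= \dfrac{10}{30} = 0.3333$
(ii) Here the passenger has to arrive between 7 and 7.05 or 7.15 and 7.20
P(that he has to wait for the bus for more than 10 minutes) $= P[(0 \le X \le 5) \cup (15 \le X \le 20)]$
$= P(0 \le X \le 5) + P(15 \le X \le 20)$
$= \int_0^5 f(x)\,dx + \int_{15}^{20} f(x)\,dx$
$= \int_0^5 \dfrac{1}{30}\,dx + \int_{15}^{20} \dfrac{1}{30}\,dx$
$= \dfrac{1}{30}\left(\left[x\right]_0^5 + \left[x\right]_{15}^{20}\right) = \dfrac{10}{30} = 0.3333$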
13. A component has an exponential time to failure distribution with mean of 10,000 hours.
(a)The component has already been in operation for its mean life. What is the probability that it
will fail by 15,000 hours? (b) At 15,000 hours the component is still in operation. What is the
probability that it will operate for another 5000 hours?
Solution:
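Let X be the random variable denoting the time to failure of the component, following an exponential distribution with mean = 10,000 hours.
$\dfrac{1}{\lambda} = 10{,}000 \Rightarrow \lambda = \dfrac{1}{10{,}000}$
The p.d.f. of X is $f(x) = \begin{cases} \dfrac{1}{10{,}000}\,e^{-x/10{,}000}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$
(a) Probability that the component will fail by 15,000 hours given that it has already been in operation for its mean life $= P[X < 15{,}000 \,/\, X > 10{,}000]$
$= \dfrac{P[10{,}000 < X < 15{,}000]}{P[X > 10{,}000]}$ -------- (1)
$P[10{,}000 < X < 15{,}000] = \int_{10{,}000}^{15{,}000} \dfrac{1}{10{,}000}\,e^{-x/10{,}000}\,dx$
(b) By the memoryless property, $P[X > 20{,}000 \,/\, X > 15{,}000] = P[X > 5000]$
$= \int_{5000}^{\infty} \dfrac{1}{10{,}000}\,e^{-x/10{,}000}\,dx = \dfrac{1}{10{,}000}\left[\dfrac{e^{-x/10{,}000}}{-1/10{,}000}\right]_{5000}^{\infty} = -\left[e^{-x/10{,}000}\right]_{5000}^{\infty}$
$= e^{-0.5} = 0.6065$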
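Both conditional probabilities in this problem reduce to ratios of the survival function $P(X > x) = e^{-\lambda x}$. The sketch below evaluates them numerically, assuming the mean life of 10,000 hours given in the problem; the names `survival`, `p_fail_by_15000` and `p_run_5000_more` are illustrative.

```python
from math import exp

mean_life = 10_000.0      # hours, given in the problem
rate = 1.0 / mean_life    # lambda = 1 / mean


def survival(x):
    """P(X > x) for an exponential time to failure with the given rate."""
    return exp(-rate * x)


# (a) P(X < 15000 / X > 10000) = P(10000 < X < 15000) / P(X > 10000)
p_fail_by_15000 = (survival(10_000) - survival(15_000)) / survival(10_000)

# (b) P(X > 20000 / X > 15000) = P(X > 20000) / P(X > 15000)
p_run_5000_more = survival(20_000) / survival(15_000)

print(round(p_fail_by_15000, 4), round(p_run_5000_more, 4))
```

The two results add up to 1, which is the memoryless property at work: after any elapsed time, the chance of surviving a further 5000 hours is the same.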
14. The weights in pounds of parcels arriving at a package delivery company’s warehouse can be
modelled by an N (5,16) normal random variable X. (a) What is the probability that a randomly
selected parcel weighs between 1 and 10 pounds? (b) What is the probability that a randomly
selected parcel weighs more than 9 pounds?
Solution:
Given X ~ N($\mu$, $\sigma$) where $\mu = 5$, $\sigma = 16$, $z = \dfrac{x - \mu}{\sigma} = \dfrac{x - 5}{16}$
(a) The standard normal z values corresponding to $x_1 = 1$, $x_2 = 10$ are
$z_1 = \dfrac{1 - 5}{16} = -0.25$ and $z_2 = \dfrac{10 - 5}{16} = 0.31$
$P(1 \le X \le 10) = P(-0.25 \le Z \le 0.31) = P(-0.25 \le Z \le 0) + P(0 \le Z \le 0.31)$
$= P(0 \le Z \le 0.25) + P(0 \le Z \le 0.31)$
$= 0.0987 + 0.1217 = 0.2204$
(b) The standard normal z value corresponding to x = 9 is $z = \dfrac{9 - 5}{16} = 0.25$
$P(X > 9) = P(Z > 0.25) = 0.5 - P(0 \le Z \le 0.25)$
$= 0.5 - 0.0987 = 0.4013$
15. An electrical firm manufactures light bulbs that have a life, before burn-out, that is normally
distributed with mean equal to 800 hours and a standard deviation of 40 hours. Find the
probability that a bulb burns between 778 and 834 hours.
Solution:
Given X ~ N($\mu$, $\sigma$) where $\mu = 800$, $\sigma = 40$, $z = \dfrac{x - \mu}{\sigma} = \dfrac{x - 800}{40}$
The standard normal z values corresponding to $x_1 = 778$, $x_2 = 834$ are
$z_1 = \dfrac{778 - 800}{40} = -0.55$ and $z_2 = \dfrac{834 - 800}{40} = 0.85$
$P(778 \le X \le 834) = P(-0.55 \le Z \le 0.85)$
$= P(-0.55 \le Z \le 0) + P(0 \le Z \le 0.85)$
$= P(0 \le Z \le 0.55) + P(0 \le Z \le 0.85)$
$= 0.2088 + 0.3023 = 0.5111$
Hence the probability that a bulb burns between 778 and 834 hours is 0.5111
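The table look-ups can be cross-checked with the error function from the standard library. This is a rough sketch in which `phi` and `normal_interval_prob` are illustrative helper names; it uses the identity $\Phi(z) = \frac{1}{2}\left[1 + \mathrm{erf}(z/\sqrt{2})\right]$ for the standard normal CDF.

```python
from math import erf, sqrt


def phi(z):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def normal_interval_prob(lo, hi, mu, sigma):
    """P(lo < X < hi) for X normal with mean mu and standard deviation sigma."""
    return phi((hi - mu) / sigma) - phi((lo - mu) / sigma)


# Bulb life: mean 800 hours, standard deviation 40 hours.
p = normal_interval_prob(778, 834, 800, 40)
print(round(p, 4))
```

Small differences from the hand computation in the last decimal place can arise because printed tables round each area to four places.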
UNIT – II TWO DIMENSIONAL RANDOM VARIABLES
PART-A
1. The joint probability mass function of X and Y is
X\Y 0 1 2
0 0.1 0.04 0.02
1 0.08 0.2 0.06
2 0.06 0.14 0.3
Find the marginal distributions.
Solution:
X\Y 0 1 2 P(X = x)
0 0.1 0.04 0.02 0.16
1 0.08 0.2 0.06 0.34
2 0.06 0.14 0.3 0.5
P(Y = y) 0.24 0.38 0.38 1
The marginal distribution of X is
X           0      1      2
P(X = x)    0.16   0.34   0.5
The marginal distribution of Y is
Y           0      1      2
P(Y = y)    0.24   0.38   0.38
2. The two discrete random variables X and Y are independent and P(X = 0, Y = 0) = 2/9,
P(X = 1, Y = 1) = 2/9, P(X = 0) = 3/9. Find the joint probability mass function of X and Y.
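5. The joint p.d.f. of the RV (X, Y) is given as $f(x, y) = \begin{cases} k, & 0 < x, y < 1 \\ 0, & \text{elsewhere} \end{cases}$. Find k.
Solution:
$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1 \Rightarrow \int_0^1\int_0^1 k\,dx\,dy = 1 \Rightarrow k\int_0^1 \left[x\right]_0^1 dy = 1 \Rightarrow k\int_0^1 dy = 1 \Rightarrow k = 1$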
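6. The joint pdf of the random variable (X, Y) is given as $f(x, y) = \dfrac{1}{2}xe^{-y}$, $0 < x < 2$, $y > 0$. Calculate the marginal p.d.f. of X.
Solution:
The marginal p.d.f. of X is given by
$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_0^{\infty} \dfrac{1}{2}xe^{-y}\,dy = \dfrac{1}{2}x\int_0^{\infty} e^{-y}\,dy = \dfrac{1}{2}x\left[\dfrac{e^{-y}}{-1}\right]_0^{\infty} = -\dfrac{1}{2}x[0 - 1] = \dfrac{x}{2}$, $0 < x < 2$.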
7. If $f(x, y) = \begin{cases} 8xy, & 0 < x < 1,\ 0 < y < x \\ 0, & \text{elsewhere} \end{cases}$ is the joint PDF of X and Y, find f(y/x).
Solution:
$f_X(x) = \int_{y} f(x, y)\,dy = \int_{y=0}^{x} 8xy\,dy = 8x\left[\dfrac{y^2}{2}\right]_0^x = 4x^3$, $0 < x < 1$
$f(y/x) = \dfrac{f(x, y)}{f_X(x)} = \dfrac{8xy}{4x^3} = \dfrac{2y}{x^2}$, $0 < y < x$, $0 < x < 1$
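8. The joint probability density function of the bivariate random variable (X, Y) is given by
$f(x, y) = \begin{cases} 4xy, & 0 < x < 1,\ 0 < y < 1 \\ 0, & \text{elsewhere} \end{cases}$. Find P(X + Y < 1).
Solution:
Given the joint pdf of (X, Y) is $f(x, y) = \begin{cases} 4xy, & 0 < x < 1,\ 0 < y < 1 \\ 0, & \text{elsewhere} \end{cases}$
$P(X + Y < 1) = \int_0^1\int_0^{1-x} 4xy\,dy\,dx = \int_0^1 4x\left[\dfrac{y^2}{2}\right]_0^{1-x} dx = \int_0^1 2x(1 - x)^2\,dx$
$= 2\int_0^1 (x - 2x^2 + x^3)\,dx = 2\left[\dfrac{x^2}{2} - \dfrac{2x^3}{3} + \dfrac{x^4}{4}\right]_0^1 = 2\left[\dfrac{1}{2} - \dfrac{2}{3} + \dfrac{1}{4}\right] = \dfrac{1}{6}$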
9. If the joint cumulative distribution function of X and Y is given by
F ( x , y ) = (1 − e − x )(1 − e − y ), x 0, y 0 , find P ( 1 < X < 2 , 1 < Y < 2 ).
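Solution:
$f(x, y) = \dfrac{\partial^2 F}{\partial x\,\partial y} = \dfrac{\partial^2}{\partial x\,\partial y}\left[(1 - e^{-x})(1 - e^{-y})\right] = e^{-x}e^{-y}$
$P(1 < X < 2,\ 1 < Y < 2) = \int_1^2 e^{-x}\,dx \cdot \int_1^2 e^{-y}\,dy = \left[-e^{-x}\right]_1^2 \cdot \left[-e^{-y}\right]_1^2 = \left(e^{-1} - e^{-2}\right)^2$
$= \left(\dfrac{1}{e} - \dfrac{1}{e^2}\right)^2 = \left(\dfrac{e - 1}{e^2}\right)^2 = 0.054$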
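10. Let X and Y be two random variables having joint density function
$f(x, y) = \dfrac{3}{2}(x^2 + y^2)$, $0 \le x \le 1$, $0 \le y \le 1$. Determine $P\left(X < \frac{1}{2},\ Y > \frac{1}{2}\right)$
Solution:
$P\left(X < \frac{1}{2},\ Y > \frac{1}{2}\right) = \int_{x=0}^{1/2}\int_{y=1/2}^{1} f(x, y)\,dy\,dx = \int_{x=0}^{1/2}\int_{y=1/2}^{1} \dfrac{3}{2}(x^2 + y^2)\,dy\,dx$
$= \dfrac{3}{2}\int_0^{1/2}\left[x^2 y + \dfrac{y^3}{3}\right]_{1/2}^{1} dx = \dfrac{3}{2}\int_0^{1/2}\left[x^2\left(1 - \dfrac{1}{2}\right) + \dfrac{1}{3}\left(1 - \dfrac{1}{8}\right)\right]dx = \dfrac{3}{2}\int_0^{1/2}\left(\dfrac{x^2}{2} + \dfrac{7}{24}\right)dx$
$= \dfrac{3}{2}\left[\dfrac{x^3}{6} + \dfrac{7x}{24}\right]_0^{1/2} = \dfrac{3}{2}\left[\dfrac{1}{6}\cdot\dfrac{1}{8} + \dfrac{7}{24}\cdot\dfrac{1}{2}\right] = \dfrac{3}{2}\cdot\dfrac{8}{48} = \dfrac{1}{4}$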
11. If X and Y have joint p.d.f. $f(x, y) = e^{-(x + y)}$, $x > 0$, $y > 0$, check whether X and Y are independent.
Solution:
The marginal pdf of X is $f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_0^{\infty} e^{-(x + y)}\,dy = e^{-x}\left[\dfrac{e^{-y}}{-1}\right]_0^{\infty} = -e^{-x}[0 - 1] = e^{-x}$
The marginal pdf of Y is $f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \int_0^{\infty} e^{-(x + y)}\,dx = e^{-y}\left[\dfrac{e^{-x}}{-1}\right]_0^{\infty} = -e^{-y}[0 - 1] = e^{-y}$
Now, $f_X(x)\,f_Y(y) = e^{-x}\,e^{-y} = e^{-(x + y)} = f(x, y)$ $\Rightarrow$ X and Y are independent.
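12. If the joint distribution function of X and Y is given by
$F(x, y) = \begin{cases} 1 - e^{-x} - e^{-y} + e^{-(x + y)}, & x > 0,\ y > 0 \\ 0, & \text{elsewhere} \end{cases}$, find the joint density function of X and Y.
Solution:
The joint p.d.f. of (X, Y) is given by
$f(x, y) = \dfrac{\partial^2}{\partial x\,\partial y}F(x, y) = \dfrac{\partial^2}{\partial x\,\partial y}\left(1 - e^{-x} - e^{-y} + e^{-(x + y)}\right) = \dfrac{\partial}{\partial x}\left(e^{-y} - e^{-(x + y)}\right) = e^{-(x + y)}$
$f(x, y) = \begin{cases} e^{-(x + y)}, & x > 0,\ y > 0 \\ 0, & \text{elsewhere} \end{cases}$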
13. Let X and Y be two independent R.Vs with Var(X) = 9 and Var(Y) = 3. Find Var (4X – 2Y + 6)
Solution:
Var (4X – 2Y + 6) = 16 Var(X) + 4 Var(Y) = 16(9) + 4(3) = 156
14. If X has mean 4 and variance 9, while Y has mean -2 and variance 5, and the two are independent, find (a) E[XY] (b) E[XY²]
Solution:
Given E[X] = 4, E[Y] = -2, $\sigma_X^2 = 9$, $\sigma_Y^2 = 5$, and X and Y are independent.
(a) E[XY] = E[X] E[Y] = 4(-2) = -8
(b) E[XY²] = E[X] E[Y²]
$\sigma_Y^2 = E[Y^2] - (E[Y])^2 \Rightarrow 5 = E[Y^2] - 4 \Rightarrow E[Y^2] = 9 \Rightarrow E[XY^2] = 4(9) = 36$
15. If Y = –2X + 3, find Cov (X, Y).
Solution:
Cov(X, Y) = E(XY) - E(X) E(Y)
= E(X(-2X + 3)) - E(X) E(-2X + 3) = E(-2X² + 3X) - E(X){-2E(X) + 3}
= -2E(X²) + 3E(X) + 2(E(X))² - 3E(X) = 2(E(X))² - 2E(X²) = -2 Var(X)
16. Define Coefficient of correlation between two random variables X,Y and its range.
Solution:
The coefficient of correlation between X and Y is $r_{XY} = \dfrac{Cov(X, Y)}{\sqrt{Var(X)}\,\sqrt{Var(Y)}}$, and the range of the correlation coefficient is $-1 \le r_{XY} \le 1$
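17. The correlation coefficient of two random variables X and Y is $-\dfrac{1}{4}$ while their variances are 3 and 5. Find the covariance.
Solution:
Given $r_{XY} = -\dfrac{1}{4}$, $\sigma_X^2 = 3$, $\sigma_Y^2 = 5$,
$r_{XY} = \dfrac{Cov(X, Y)}{\sigma_X\,\sigma_Y}$, $\sigma_X \ne 0$, $\sigma_Y \ne 0$
$-\dfrac{1}{4} = \dfrac{Cov(X, Y)}{\sqrt{3}\,\sqrt{5}} \Rightarrow Cov(X, Y) = -\dfrac{1}{4}\cdot\sqrt{3}\cdot\sqrt{5} = -0.968$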
18. The regression equations are 3x + 2y = 26 and 6x + y = 31. Find the mean of X and Y.
Solution:
Regression lines pass through the mean values of X and Y.
Let 3x + 2y = 26 ------ (1) and 6x + y = 31 ------ (2)
Multiply equation (2) by 2 and subtract the result from equation (1):
3x + 2y = 26
12x + 2y = 62
----------
-9x = -36 $\Rightarrow$ x = 4
Substituting in equation (1), 3(4) + 2y = 26 $\Rightarrow$ y = 7.
Mean value of X = 4 and mean value of Y = 7
19. The two lines of regression are 4x – 5y + 33 = 0 and 20x – 9y = 107. Calculate the coefficient of
correlation between X and Y.
Solution:
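4x - 5y + 33 = 0 --------- (1) 20x - 9y = 107 -------- (2)
Let (1) be the regression line of Y on X and let (2) be the regression line of X on Y.
$y = \dfrac{4}{5}x + \dfrac{33}{5} \Rightarrow b_1 = \dfrac{4}{5}$
$x = \dfrac{9}{20}y + \dfrac{107}{20} \Rightarrow b_2 = \dfrac{9}{20}$
$r = \sqrt{b_1 b_2} = \sqrt{\dfrac{4}{5}\cdot\dfrac{9}{20}} = \sqrt{\dfrac{9}{25}} = \dfrac{3}{5} = 0.6 < 1$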
Y \ X    1     2     3     p(y)
1        c     2c    3c    6c
2        2c    4c    6c    12c
3        3c    6c    9c    18c
p(x)     6c    12c   18c   36c
Since p(x, y) is the joint pmf of X and Y, p(x, y) ≥ 0 for all x, y and
$\sum\sum p(x, y) = 1 \Rightarrow 36c = 1 \Rightarrow c = \dfrac{1}{36}$
When y = 1 and x = 1, 2, 3
$P(Y = 1 / X = 1) = \dfrac{P(Y = 1, X = 1)}{P(X = 1)} = \dfrac{1/36}{1/6} = \dfrac{1}{6}$, $P(Y = 1 / X = 2) = \dfrac{P(Y = 1, X = 2)}{P(X = 2)} = \dfrac{2/36}{1/3} = \dfrac{1}{6}$
$P(Y = 1 / X = 3) = \dfrac{P(Y = 1, X = 3)}{P(X = 3)} = \dfrac{3/36}{1/2} = \dfrac{1}{6}$
When y = 2 and x = 1, 2, 3
$P(Y = 2 / X = 1) = \dfrac{P(Y = 2, X = 1)}{P(X = 1)} = \dfrac{2/36}{1/6} = \dfrac{1}{3}$, $P(Y = 2 / X = 2) = \dfrac{P(Y = 2, X = 2)}{P(X = 2)} = \dfrac{4/36}{1/3} = \dfrac{1}{3}$
$P(Y = 2 / X = 3) = \dfrac{P(Y = 2, X = 3)}{P(X = 3)} = \dfrac{6/36}{1/2} = \dfrac{1}{3}$
When y = 3 and x = 1, 2, 3
$P(Y = 3 / X = 1) = \dfrac{P(Y = 3, X = 1)}{P(X = 1)} = \dfrac{3/36}{1/6} = \dfrac{1}{2}$, $P(Y = 3 / X = 2) = \dfrac{P(Y = 3, X = 2)}{P(X = 2)} = \dfrac{6/36}{1/3} = \dfrac{1}{2}$
$P(Y = 3 / X = 3) = \dfrac{P(Y = 3, X = 3)}{P(X = 3)} = \dfrac{9/36}{1/2} = \dfrac{1}{2}$
P(X + Y ≤ 4) = P(X = 1, Y = 1) + P(X = 1, Y = 2) + P(X = 1, Y = 3) + P(X = 2, Y = 1) + P(X = 2, Y = 2) + P(X = 3, Y = 1)
= c + 2c + 3c + 2c + 4c + 3c
$= 15c = \dfrac{15}{36}$
2. The joint distribution of X and Y is given by $p(x, y) = \dfrac{x + y}{36}$, x = 1, 2, 3; y = 1, 2, 3. Find all the marginal distributions and the conditional probability distribution of Y given X = 2.
Solution:
Given $p(x, y) = \dfrac{x + y}{36}$, x = 1, 2, 3; y = 1, 2, 3
Y \ X     1       2       3       P_Y(y)
1         2/36    3/36    4/36    9/36
2         3/36    4/36    5/36    12/36
3         4/36    5/36    6/36    15/36
P_X(x)    9/36    12/36   15/36
The marginal distribution of X is
X       1       2       3
P(x)    9/36    12/36   15/36
The marginal distribution of Y is
Y       1       2       3
p(y)    9/36    12/36   15/36
3. Find the constant k such that $f(x, y) = \begin{cases} k(x + 1)e^{-y}, & 0 < x < 1,\ y > 0 \\ 0, & \text{otherwise} \end{cases}$ is a joint p.d.f. of the continuous random variable (X, Y). Are X and Y independent R.Vs? Explain.
Solution:
To find k: Given that f(x, y) is the pdf of (X, Y),
f(x, y) ≥ 0 for all x, y and $\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$
$\int_0^{\infty}\int_0^1 k(x + 1)e^{-y}\,dx\,dy = 1$
$k\int_0^1 (x + 1)\,dx \cdot \int_0^{\infty} e^{-y}\,dy = 1$
$k\left[\dfrac{x^2}{2} + x\right]_0^1 \cdot \left[-e^{-y}\right]_0^{\infty} = 1$
$k\cdot\dfrac{3}{2}\cdot(1) = 1 \Rightarrow k = \dfrac{2}{3}$
$f(x, y) = \begin{cases} \dfrac{2}{3}(x + 1)e^{-y}, & 0 < x < 1,\ y > 0 \\ 0, & \text{otherwise} \end{cases}$
The marginal PDF of X is
$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_0^{\infty} \dfrac{2}{3}(x + 1)e^{-y}\,dy = \dfrac{2}{3}(x + 1)\left[-e^{-y}\right]_0^{\infty}$
$= \dfrac{2}{3}(x + 1)\left[-e^{-\infty} + e^0\right]$
$= \dfrac{2}{3}(x + 1)(1) = \dfrac{2}{3}(x + 1)$, $0 < x < 1$
The marginal PDF of Y is
$f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \int_0^1 \dfrac{2}{3}(x + 1)e^{-y}\,dx$
$= \dfrac{2}{3}e^{-y}\left[\dfrac{x^2}{2} + x\right]_0^1$
$= \dfrac{2}{3}e^{-y}\left[\dfrac{1}{2} + 1\right] = \dfrac{2}{3}\cdot\dfrac{3}{2}\,e^{-y} = e^{-y}$, $y > 0$
Consider $f_X(x)\,f_Y(y) = \dfrac{2}{3}(x + 1)\cdot e^{-y} = f(x, y)$
$\Rightarrow$ X and Y are independent
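4. The joint density function of two random variables X and Y is given by
$f(x, y) = \begin{cases} \dfrac{6}{7}\left(x^2 + \dfrac{xy}{2}\right), & 0 \le x \le 1,\ 0 \le y \le 2 \\ 0, & \text{elsewhere} \end{cases}$
(a) Compute the marginal p.d.f. of X and Y.
(b) Find E(X) and E(Y). (c) $P\left(X < \frac{1}{2},\ Y > \frac{1}{2}\right)$.
Solution:
The marginal pdf of X is
$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_0^2 \dfrac{6}{7}\left(x^2 + \dfrac{xy}{2}\right)dy = \dfrac{6}{7}\left[x^2 y + \dfrac{x y^2}{4}\right]_0^2 = \dfrac{6}{7}\left(2x^2 + x\right)$, $0 \le x \le 1$
The marginal pdf of Y is
$f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \int_0^1 \dfrac{6}{7}\left(x^2 + \dfrac{xy}{2}\right)dx$
$= \dfrac{6}{7}\left[\dfrac{x^3}{3} + \dfrac{x^2 y}{4}\right]_0^1 = \dfrac{6}{7}\left(\dfrac{1}{3} + \dfrac{y}{4}\right)$, $0 \le y \le 2$
$E(X) = \int_{-\infty}^{\infty} x f_X(x)\,dx = \int_0^1 x\cdot\dfrac{6}{7}\left(2x^2 + x\right)dx$
$= \dfrac{6}{7}\int_0^1 \left(2x^3 + x^2\right)dx$
$= \dfrac{6}{7}\left[\dfrac{2x^4}{4} + \dfrac{x^3}{3}\right]_0^1 = \dfrac{6}{7}\left(\dfrac{1}{2} + \dfrac{1}{3}\right) = \dfrac{5}{7}$
$E(Y) = \int_{-\infty}^{\infty} y f_Y(y)\,dy$
$= \int_0^2 y\cdot\dfrac{6}{7}\left(\dfrac{1}{3} + \dfrac{y}{4}\right)dy$
$= \dfrac{6}{7}\int_0^2 \left(\dfrac{y}{3} + \dfrac{y^2}{4}\right)dy$
$= \dfrac{6}{7}\left[\dfrac{y^2}{6} + \dfrac{y^3}{12}\right]_0^2 = \dfrac{6}{7}\left(\dfrac{2}{3} + \dfrac{2}{3}\right) = \dfrac{8}{7}$
$P\left(X < \frac{1}{2},\ Y > \frac{1}{2}\right) = \int_{-\infty}^{1/2}\int_{1/2}^{\infty} f(x, y)\,dy\,dx$
$= \int_0^{1/2}\int_{1/2}^{2} \dfrac{6}{7}\left(x^2 + \dfrac{xy}{2}\right)dy\,dx$
$= \dfrac{6}{7}\int_0^{1/2}\left[x^2 y + \dfrac{x}{2}\cdot\dfrac{y^2}{2}\right]_{1/2}^{2} dx$
$= \dfrac{6}{7}\int_0^{1/2}\left(2x^2 + x - \dfrac{x^2}{2} - \dfrac{x}{16}\right)dx$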
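$= \int_{x=0}^{y} 8xy\,dx$
$= 8y\left[\dfrac{x^2}{2}\right]_0^y = 4y^3$, $0 \le y \le 1$
The conditional p.d.f.s of X and Y are
$f(x/y) = \dfrac{f(x, y)}{f_Y(y)} = \dfrac{8xy}{4y^3} = \dfrac{2x}{y^2}$, $0 \le x \le y \le 1$
$f(y/x) = \dfrac{f(x, y)}{f_X(x)} = \dfrac{8xy}{4x(1 - x^2)} = \dfrac{2y}{1 - x^2}$, $0 \le x \le y \le 1$
(ii) $E(X) = \int_x x f_X(x)\,dx = \int_{x=0}^{1} x\cdot 4x(1 - x^2)\,dx$
$= 4\int_{x=0}^{1} \left(x^2 - x^4\right)dx$
$= 4\left[\dfrac{x^3}{3} - \dfrac{x^5}{5}\right]_0^1 = 4\left[\dfrac{1}{3} - \dfrac{1}{5}\right] = \dfrac{8}{15}$
$E(Y) = \int_y y f_Y(y)\,dy = \int_{y=0}^{1} y\cdot 4y^3\,dy = 4\int_{y=0}^{1} y^4\,dy = 4\left[\dfrac{y^5}{5}\right]_0^1 = \dfrac{4}{5}$
(iii) $f_X(x)\,f_Y(y) = 4x(1 - x^2)\cdot 4y^3 \ne 8xy = f(x, y)$, so X and Y are not independent.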
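$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$
$\int_0^{\infty}\int_0^{\infty} kxy\,e^{-(x^2 + y^2)}\,dx\,dy = 1$
$k\int_0^{\infty} xe^{-x^2}\,dx \cdot \int_0^{\infty} ye^{-y^2}\,dy = 1$
Put $x^2 = u$ and $y^2 = v$, so that 2x dx = du and 2y dy = dv.
When x = 0, u = 0 and when y = 0, v = 0; when $x = \infty$, $u = \infty$ and when $y = \infty$, $v = \infty$.
$k\cdot\dfrac{1}{2}\int_0^{\infty} e^{-u}\,du \cdot \dfrac{1}{2}\int_0^{\infty} e^{-v}\,dv = 1$
$\dfrac{k}{4}\left[\dfrac{e^{-u}}{-1}\right]_0^{\infty}\cdot\left[\dfrac{e^{-v}}{-1}\right]_0^{\infty} = 1$
$\dfrac{k}{4}[0 + 1]\cdot[0 + 1] = 1 \Rightarrow k = 4$
The joint pdf is $f(x, y) = 4xy\,e^{-(x^2 + y^2)}$, $x \ge 0$, $y \ge 0$
The marginal pdf of X is
$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_0^{\infty} 4xy\,e^{-(x^2 + y^2)}\,dy$
$= 4xe^{-x^2}\int_0^{\infty} ye^{-y^2}\,dy$
$= 4xe^{-x^2}\cdot\dfrac{1}{2}$
$= 2xe^{-x^2}$, $x \ge 0$
Similarly, the marginal pdf of Y is
$f_Y(y) = 4ye^{-y^2}\cdot\dfrac{1}{2}$
$= 2ye^{-y^2}$, $y \ge 0$
Now,
$f_X(x)\,f_Y(y) = \left(2xe^{-x^2}\right)\left(2ye^{-y^2}\right) = 4xy\,e^{-(x^2 + y^2)}$
$f_X(x)\,f_Y(y) = f(x, y)$
$\Rightarrow$ X and Y are independent.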
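7. The joint probability mass function of (X, Y) is given by p(x, y) = k(2x + 3y), x = 0, 1, 2; y = 1, 2, 3. Find the correlation coefficient between X and Y.
Solution:
Y \ X    0     1     2
1        3k    5k    7k
2        6k    8k    10k
3        9k    11k   13k
To find the value of k:
We know that $\sum_i\sum_j P(x_i, y_j) = 1$, i.e. the sum of all probabilities = 1 $\Rightarrow 72k = 1 \Rightarrow k = \dfrac{1}{72}$
$E(XY) = \sum\sum x\,y\,p(x, y)$
$= 0\cdot 1\cdot\dfrac{3}{72} + 1\cdot 1\cdot\dfrac{5}{72} + 2\cdot 1\cdot\dfrac{7}{72} + 0\cdot 2\cdot\dfrac{6}{72} + 1\cdot 2\cdot\dfrac{8}{72} + 2\cdot 2\cdot\dfrac{10}{72} + 0\cdot 3\cdot\dfrac{9}{72} + 1\cdot 3\cdot\dfrac{11}{72} + 2\cdot 3\cdot\dfrac{13}{72}$
$= 0 + \dfrac{5}{72} + \dfrac{14}{72} + 0 + \dfrac{16}{72} + \dfrac{40}{72} + 0 + \dfrac{33}{72} + \dfrac{78}{72} = \dfrac{186}{72} = 2.5833$
$\sigma_X^2 = E(X^2) - (E(X))^2 = 2 - (1.1667)^2 = 0.6388 \Rightarrow \sigma_X = 0.7992$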
4 12 4 144 144 12
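$E(X) = \int_{-\infty}^{\infty} x f_X(x)\,dx = \int_0^{0.2} x(25)\,dx = 25\left[\dfrac{x^2}{2}\right]_0^{0.2} = 25\cdot\dfrac{0.04}{2} = 0.5$
$E(Y) = \int_{-\infty}^{\infty} y f_Y(y)\,dy = \int_0^{\infty} y\left(5e^{-y}\right)dy = 5\left[-ye^{-y} - e^{-y}\right]_0^{\infty} = 5[0 + 1] = 5$
$E(XY) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy\,f(x, y)\,dx\,dy = \int_0^{\infty}\int_0^{0.2} xy\left(25e^{-y}\right)dx\,dy = 25\int_0^{\infty} ye^{-y}\,dy \cdot \int_0^{0.2} x\,dx$
$= 25\left[-ye^{-y} - e^{-y}\right]_0^{\infty}\cdot\left[\dfrac{x^2}{2}\right]_0^{0.2} = 25[0 + 1]\cdot\dfrac{0.04}{2} = (25)(0.02) = 0.5$
Cov(X, Y) = E(XY) - E(X)E(Y) = 0.5 - (0.5)(5) = -2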
$E(x^2) = \dfrac{37028}{8} = 4628.5$, $E(y^2) = \dfrac{38132}{8} = 4766.5$
$r(X, Y) = \dfrac{Cov(X, Y)}{\sigma_X\,\sigma_Y} = \dfrac{3}{(2.121)(2.345)} = 0.603$
11. Ten students got the following percentage of marks in economics (X) and statistics (Y). Find Karl Pearson's correlation coefficient from the following data:
X 78 36 98 25 75 82 90 62 65 39
Y 84 51 91 60 68 62 86 58 53 47
Solution:
Since $r_{xy}$ does not change with a change of origin and scale, let u = x - 65 and v = y - 66.
x      y      u = x - 65   v = y - 66   u²      v²      uv
78     84     13           18           169     324     234
36     51     -29          -15          841     225     435
98     91     33           25           1089    625     825
25     60     -40          -6           1600    36      240
75     68     10           2            100     4       20
82     62     17           -4           289     16      -68
90     86     25           20           625     400     500
62     58     -3           -8           9       64      24
65     53     0            -13          0       169     0
39     47     -26          -19          676     361     494
650    660    0            0            5398    2224    2704
$E(x) = \bar{x} = \dfrac{650}{10} = 65$, $E(y) = \bar{y} = \dfrac{660}{10} = 66$
$E(u) = 0$, $E(v) = 0$, $E(uv) = \dfrac{2704}{10} = 270.4$, $E(u^2) = \dfrac{5398}{10} = 539.8$, $E(v^2) = \dfrac{2224}{10} = 222.4$
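Karl Pearson's coefficient can also be computed directly from the raw marks, which gives a check on the tabulated sums. This is a minimal sketch using only the standard library; `pearson_r` is an illustrative helper name.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Karl Pearson's correlation coefficient from paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


economics = [78, 36, 98, 25, 75, 82, 90, 62, 65, 39]
statistics = [84, 51, 91, 60, 68, 62, 86, 58, 53, 47]

r = pearson_r(economics, statistics)
print(round(r, 4))
```

Because the deviations are taken from the actual means (65 and 66), the three sums inside the function equal the uv, u² and v² column totals of the table.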
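$\bar{U} = \dfrac{\sum U}{n} = \dfrac{0}{8} = 0$, $\bar{V} = \dfrac{\sum V}{n} = \dfrac{-17}{8} = -2.125$,
$\sigma_U^2 = \dfrac{\sum U^2}{n} - (\bar{U})^2 = \dfrac{132}{8} - 0 = 16.5 \Rightarrow \sigma_U = 4.062$,
$\sigma_V^2 = \dfrac{\sum V^2}{n} - (\bar{V})^2 = \dfrac{163}{8} - (-2.125)^2 = 15.86 \Rightarrow \sigma_V = 3.9825$,
$Cov(U, V) = \dfrac{\sum UV}{n} - \bar{U}\,\bar{V} = \dfrac{101}{8} - 0 = 12.625$
$r_{UV} = \dfrac{Cov(U, V)}{\sigma_U\,\sigma_V} = \dfrac{12.625}{(4.062)(3.9825)} = 0.7804$
$r_{XY} = 0.7804$
$\bar{X} = \bar{U} + 30 \Rightarrow \bar{X} = 0 + 30 = 30$
$\bar{Y} = \bar{V} + 27 \Rightarrow \bar{Y} = -2.125 + 27 = 24.875$
$\sigma_X = \sigma_U \Rightarrow \sigma_X = 4.062$
$\sigma_Y = \sigma_V \Rightarrow \sigma_Y = 3.9825$
The regression line of X on Y is $X - \bar{X} = \dfrac{r\,\sigma_X}{\sigma_Y}(Y - \bar{Y})$
$X - 30 = \dfrac{(0.7804)(4.062)}{3.9825}(Y - 24.875)$
X = 0.796Y + 10.2
When Y = 18, X = 0.796(18) + 10.2 = 24.528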
Critical value: The critical value of z for a two tailed test at 5% level of significance is 1.96
Conclusion:
i.e., |z| = 1.724 < 1.96, so the calculated value < tabulated value.
Therefore, we accept the null hypothesis H0,
i.e., the sample has been drawn from the population with mean $\mu = 3.25$
To find the confidence limits:
95% confidence limits are
$\bar{x} \pm 1.96\,\dfrac{\sigma}{\sqrt{n}} = 3.4 \pm 1.96\,\dfrac{2.61}{\sqrt{900}} = 3.4 \pm 0.1705 = (3.2295,\ 3.5705)$
3. (i) In a big city 325 men out of 600 men were found to be smokers. Does this information support the conclusion that the majority of men in this city are smokers?
Given n = 600, number of smokers = 325
p = sample proportion of smokers = 325/600 = 0.5417
P = population proportion of smokers in the city = 1/2 = 0.5, Q = 0.5
Null Hypothesis H0: The number of smokers and non-smokers are equal in the city, i.e. P = 0.5.
Alternative Hypothesis H1: P > 0.5 (Right Tailed)
Test Statistic:
$z = \dfrac{p - P}{\sqrt{\dfrac{PQ}{n}}} = \dfrac{0.5417 - 0.5}{\sqrt{\dfrac{0.5 \times 0.5}{600}}} = 2.04$
Critical value:
Tabulated value of z at 5% level of significance for a right tailed test is 1.645.
Conclusion: Since the calculated value of z > tabulated value of z, we reject the null hypothesis. The majority of men in the city are smokers.
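The test statistic for a single proportion is simple to reproduce. The sketch below assumes the large-sample z-test used in the solution, with the right-tail critical value 1.645 hard-coded from the text; `one_proportion_z` is an illustrative name.

```python
from math import sqrt


def one_proportion_z(successes, n, p0):
    """z statistic for testing H0: P = p0 against a sample proportion."""
    p_hat = successes / n
    standard_error = sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error


z = one_proportion_z(325, 600, 0.5)
critical = 1.645          # right-tailed critical value at the 5% level
reject_h0 = z > critical

print(round(z, 2), reject_h0)
```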
(ii) In a large city A, 20% of a random sample of 900 school boys had a slight physical defect. In
another large city B, 18.5% of a random sample of 1600 school boys had the same defect. Is the
difference between the proportions significant? (Nov / Dec 2022)
Solution:
$n_1 = 900$, $n_2 = 1600$
$p_1 = \dfrac{20}{100} = 0.2$, $p_2 = \dfrac{18.5}{100} = 0.185$
$\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n} = \dfrac{5752}{10} = 575.2$, $S^2 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \dfrac{681.6}{9} = 75.733 \Rightarrow S = 8.702$
Under $H_0$, the test statistic is $t = \dfrac{\bar{x} - \mu}{S/\sqrt{n}} = \dfrac{575.2 - 577}{8.702/\sqrt{10}} = -0.6541$
$|t| = 0.6541$
Tabulated value of t for v = 9 degrees of freedom is $t_{0.025} = 2.262$
Since $|t| < t_{0.025}$, $H_0$ is accepted.
Conclusion: The mean breaking strength of the wire can be assumed as 577 kg at 5% level of significance.
6. Samples of two types of electric bulbs were tested for length of life and the following data were
obtained.
Sample Size Mean S.D
I 8 1234h 36h
II 7 1036h 40h
Is the difference in the means sufficient to warrant that type I bulbs are superior to type II bulbs?
Solution:
Here $\bar{x}_1 = 1234$, $\bar{x}_2 = 1036$, $n_1 = 8$, $n_2 = 7$, $s_1 = 36$, $s_2 = 40$
Let $H_0: \mu_1 = \mu_2$,
$H_1: \mu_1 > \mu_2$ (i.e. type I bulbs are superior to type II bulbs) (one tailed test, right tailed)
Under $H_0$, the test statistic is $t = \dfrac{\bar{x}_1 - \bar{x}_2}{S\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$, where $S = \sqrt{\dfrac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}} = 40.7317$
$t = \dfrac{1234 - 1036}{40.7317\sqrt{\dfrac{1}{8} + \dfrac{1}{7}}} = 9.39$
Degrees of freedom $v = n_1 + n_2 - 2 = 13$
Tabulated value of t for 13 d.f. at 5% level of significance is $t_{0.05} = 1.77$
Since $t > t_{0.05}$, $H_0$ is rejected and $H_1$ is accepted.
Conclusion: Type I bulbs may be regarded superior to type II bulbs at 5% level of significance.
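The pooled t statistic can be recomputed from the summary figures. This sketch follows the formula used in the solution, $S^2 = (n_1 s_1^2 + n_2 s_2^2)/(n_1 + n_2 - 2)$, which treats $s_1$ and $s_2$ as sample standard deviations with divisor n; the function name `pooled_t` is illustrative.

```python
from math import sqrt


def pooled_t(mean1, mean2, n1, n2, s1, s2):
    """Return (S, t) for the two-sample t-test with pooled standard deviation.

    Uses S^2 = (n1*s1^2 + n2*s2^2) / (n1 + n2 - 2), as in the solution above.
    """
    s = sqrt((n1 * s1 ** 2 + n2 * s2 ** 2) / (n1 + n2 - 2))
    t = (mean1 - mean2) / (s * sqrt(1 / n1 + 1 / n2))
    return s, t


s, t = pooled_t(1234, 1036, 8, 7, 36, 40)
t_table = 1.77            # tabulated t for 13 d.f. at the 5% level (one tail)
reject_h0 = t > t_table

print(round(s, 4), round(t, 2), reject_h0)
```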
7. Two horses A and B were tested according to the time (in seconds) to run a particular track with
the following results.
Horse A 28 30 32 33 33 29 34
Horse B 29 30 30 24 27 27 -
Test whether you can discriminate between two horses. (Nov / Dec 2022)
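Solution:
Null Hypothesis H0: $\mu_1 = \mu_2$
Alternative Hypothesis H1: $\mu_1 \ne \mu_2$ (two tailed test)
$\sum x = 219$, $\sum y = 167$, $\sum (x - \bar{x})^2 = 31.43$, $\sum (y - \bar{y})^2 = 26.84$
$\bar{x} = \dfrac{1}{7}(219) = 31.3$; $\bar{y} = \dfrac{1}{6}(167) = 27.8$
$S^2 = \dfrac{\sum (x - \bar{x})^2 + \sum (y - \bar{y})^2}{n_1 + n_2 - 2} = \dfrac{31.43 + 26.84}{7 + 6 - 2} = 5.29$
$S = 2.3$
$t = \dfrac{\bar{x} - \bar{y}}{S\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \dfrac{31.3 - 27.8}{2.3\sqrt{\dfrac{1}{7} + \dfrac{1}{6}}} = 2.73$
Degrees of freedom $= n_1 + n_2 - 2 = 11$. From the table, $t_{5\%}(v = 11) = 2.23$
Calculated t > tabulated t, so $H_0$ is rejected, i.e. there is a significant difference between the two horses, and they can be discriminated.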
8. The nicotine contents in milligrams in two samples of tobacco were found to be as follows:
Sample A 24 27 26 21 25 -
Sample B 27 30 28 31 22 36
Can it be said that both the samples have come from same normal population? (Apr/May 2021)
Solution:
(i) F-test: (Equality of variances)
Let $H_0: \sigma_1^2 = \sigma_2^2$
$H_1: \sigma_1^2 \ne \sigma_2^2$
x       x - x̄    (x - x̄)²    y       y - ȳ    (y - ȳ)²
24      -0.6     0.36        27      -2       4
27      2.4      5.76        30      1        1
26      1.4      1.96        28      -1       1
21      -3.6     12.96       31      2        4
25      0.4      0.16        22      -7       49
                             36      7        49
123              21.2        174              108
$\bar{x} = \dfrac{\sum x}{n_1} = \dfrac{123}{5} = 24.6$, $\bar{y} = \dfrac{\sum y}{n_2} = \dfrac{174}{6} = 29$
$S_1^2 = \dfrac{\sum (x - \bar{x})^2}{n_1 - 1} = \dfrac{21.2}{4} = 5.3$, $S_2^2 = \dfrac{\sum (y - \bar{y})^2}{n_2 - 1} = \dfrac{108}{5} = 21.6$
The test statistic is $F = \dfrac{S_2^2}{S_1^2}$ (since $S_2^2 > S_1^2$)
$= \dfrac{21.6}{5.3} = 4.07$
From the table, $F_{0.05}(n_2 - 1, n_1 - 1) = F_{0.05}(5, 4) = 6.26$
Since $F < F_{0.05}$, $H_0$ is accepted.
(ii) t-test: (Equality of means)
Null Hypothesis H0: $\mu_1 = \mu_2$
Alternative Hypothesis H1: $\mu_1 \ne \mu_2$
Under $H_0$, the test statistic is $t = \dfrac{\bar{x} - \bar{y}}{S\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$
$S^2 = \dfrac{1}{n_1 + n_2 - 2}\left[\sum (x - \bar{x})^2 + \sum (y - \bar{y})^2\right] = \dfrac{1}{5 + 6 - 2}\left[21.2 + 108\right] = \dfrac{129.2}{9} = 14.36$
$S = 3.79$
$t = \dfrac{24.6 - 29}{3.79\sqrt{\dfrac{1}{5} + \dfrac{1}{6}}} = \dfrac{-4.4}{2.30432} = -1.909 \Rightarrow |t| = 1.909$
From the table, with degrees of freedom $n_1 + n_2 - 2 = 9$, $t_{0.05} = 2.262$
Since $|t| < t_{0.05}$, $H_0$ is accepted, i.e. $\mu_1 = \mu_2$
Conclusion: The two samples could have been drawn from the same normal population.
9. Two independent samples of six and seven items respectively had the following values of the
variable:
Sample 1 39 41 43 41 45 39 -
Sample 2 40 42 40 44 39 38 40
Do the two estimates of population variance differ significantly at 5% level of significance?
(Nov / Dec 2022)
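Solution:
Let $H_0: \sigma_1^2 = \sigma_2^2$; $H_1: \sigma_1^2 \ne \sigma_2^2$
x       (x - x̄)²    y       (y - ȳ)²
39      5.4289      40      0.1764
41      0.1089      42      2.4964
43      2.7889      40      0.1764
41      0.1089      44      12.8164
45      13.4689     39      2.0164
39      5.4289      38      5.8564
                    40      0.1764
$\sum x = 248$, $\sum (x - \bar{x})^2 = 27.3334$; $\sum y = 283$, $\sum (y - \bar{y})^2 = 23.7148$
Given $n_1 = 6$; $n_2 = 7$
$\bar{x} = \dfrac{\sum x}{n_1} = \dfrac{248}{6} = 41.33$, $\bar{y} = \dfrac{\sum y}{n_2} = \dfrac{283}{7} = 40.42$
$S_1^2 = \dfrac{\sum (x - \bar{x})^2}{n_1 - 1} = \dfrac{27.3334}{5} = 5.47$, $S_2^2 = \dfrac{\sum (y - \bar{y})^2}{n_2 - 1} = \dfrac{23.7148}{6} = 3.95$
The test statistic is $F = \dfrac{S_1^2}{S_2^2}$ (since $S_1^2 > S_2^2$)
$F = \dfrac{5.47}{3.95} = 1.383$
From the table, $F_{0.05}(n_1 - 1, n_2 - 1) = F_{0.05}(5, 6) = 4.39$
Since $F < F_{0.05}$, $H_0$ is accepted.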
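The variance ratio can be recomputed from the raw samples as a check on the table. The following is a small sketch that puts the larger unbiased variance estimate in the numerator, as the solution does; `sample_variance` is an illustrative helper, and the critical value 4.39 is copied from the text.

```python
def sample_variance(values):
    """Unbiased estimate of the population variance (divisor n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


sample_1 = [39, 41, 43, 41, 45, 39]
sample_2 = [40, 42, 40, 44, 39, 38, 40]

var_1 = sample_variance(sample_1)
var_2 = sample_variance(sample_2)

# F is the ratio of the larger estimate to the smaller one.
f_ratio = max(var_1, var_2) / min(var_1, var_2)
f_table = 4.39            # F(5, 6) at the 5% level, from the text
accept_h0 = f_ratio < f_table

print(round(var_1, 2), round(var_2, 2), round(f_ratio, 3), accept_h0)
```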
10. Two random samples gave the following results:
Sample    Size    Sample mean    Sum of squares of deviations from the mean
1         10      15             90
2         12      14             108
Test whether the samples come from the same normal population at 5% level of significance.
Solution:
A normal population has 2 parameters, namely the mean $\mu$ and the variance $\sigma^2$. To test if independent samples have been drawn from the same normal population, we have to test
1) Equality of population variances using the F-test.
2) Equality of population means using the t-test.
Level of significance: $\alpha = 5\%$
Test Statistic: $F = \dfrac{S_1^2}{S_2^2}$
where $S_1^2 = \dfrac{1}{n_1 - 1}\sum (x - \bar{x})^2 = \dfrac{1}{10 - 1}(90) = 10$
$S_2^2 = \dfrac{1}{n_2 - 1}\sum (y - \bar{y})^2 = \dfrac{1}{12 - 1}(108) = 9.818$
Here $S_1^2 > S_2^2$, so $F = \dfrac{S_1^2}{S_2^2} = \dfrac{10}{9.818} = 1.02$
Critical value: The critical value of F at 5% level of significance with degrees of freedom $(n_1 - 1, n_2 - 1) = (9, 11)$ is 2.90
Here calculated value < table value, so we accept $H_0$.
Hence we may conclude that the two samples are drawn from the same normal population.
11. The table below gives the number of aircraft accidents that occurred during the various days of
the week. Test whether the accidents are uniformly distributed over the week.
Days Mon Tue Wed Thurs Fri Sat
No. of accidents 14 18 12 11 15 14
Solution:
We want to test whether the accidents are uniformly distributed, so we apply the $\chi^2$-test.
Null Hypothesis H0: The accidents are uniformly distributed over the 6 days (Monday to Saturday).
Alternative Hypothesis H1: The accidents are not uniformly distributed.
Under H0, the expected frequency for each day = 84/6 = 14
The test statistic is $\chi^2 = \sum \frac{(O - E)^2}{E}$
O      E      O − E    (O − E)²    (O − E)²/E
14     14      0        0          0.000
18     14      4       16          1.143
12     14     -2        4          0.286
11     14     -3        9          0.643
15     14      1        1          0.071
14     14      0        0          0.000
84     84                          χ² = 2.143
Number of degrees of freedom v = n − 1 = 6 − 1 = 5
For v = 5 degrees of freedom, the table value of $\chi^2$ at 5% level is $\chi^2_{0.05} = 11.07$
Conclusion: Since $\chi^2 < \chi^2_{0.05}$, H0 is accepted at 5% level of significance, i.e. the accidents are uniformly
distributed over the 6 days.
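The goodness-of-fit calculation above can be reproduced in a few lines. The helper name `chi_square_uniform` is hypothetical and the table value is copied from the solution; this is a sketch, not a general-purpose routine.

```python
def chi_square_uniform(observed):
    """Return (statistic, degrees of freedom) for a fit to equal expected counts."""
    expected = sum(observed) / len(observed)
    statistic = sum((o - expected) ** 2 / expected for o in observed)
    return statistic, len(observed) - 1

accidents = [14, 18, 12, 11, 15, 14]  # Mon .. Sat
chi2, dof = chi_square_uniform(accidents)

CHI2_TABLE = 11.07  # chi-square at 5% level for 5 d.f., from the solution
print(f"chi2 = {chi2:.3f}, d.f. = {dof}, accept H0: {chi2 < CHI2_TABLE}")
```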
12. The theory predicts that the proportion of beans in the four groups A, B, C, D should be 9:3:3:1.
In an experiment among 1600 beans, the numbers in the four groups were 882, 313, 287, and 118.
Does the experimental result support the theory?
Solution:
Ho: The experimental data support the theory
Based on Ho, the expected numbers of beans in the four groups are as follows
St. Joseph’s College of Engineering 42
MA1452-Applied Probability & Statistics Dept. of BIOTECH & CHEMICAL 2023-2024
Observed frequency (O)    Expected frequency (E)    (O − E)²    (O − E)²/E
882                       900                       324         0.360
313                       300                       169         0.563
287                       300                       169         0.563
118                       100                       324         3.240
Total                                                           4.726
$\chi^2 = \sum \frac{(O - E)^2}{E} = 4.726$
Calculated value of $\chi^2$ = 4.726
Tabulated value of $\chi^2$ for 4 − 1 = 3 degrees of freedom is 7.81 at 5% level of significance. Since calculated value < tabulated value,
we accept the null hypothesis, i.e. the experimental data support the theory.
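The expected frequencies in the table come from splitting the 1600 beans in the ratio 9:3:3:1. A small sketch of that step and of the resulting statistic is given below; `expected_from_ratio` is an illustrative helper, and exact fractions are used only to avoid rounding in the intermediate values.

```python
from fractions import Fraction

def expected_from_ratio(total, ratio):
    """Split `total` into expected counts proportional to `ratio`."""
    parts = sum(ratio)
    return [Fraction(total * r, parts) for r in ratio]

observed = [882, 313, 287, 118]
expected = expected_from_ratio(1600, [9, 3, 3, 1])

chi2 = float(sum((o - e) ** 2 / e for o, e in zip(observed, expected)))
print([int(e) for e in expected], round(chi2, 3))
```

The unrounded statistic differs from the tabulated 4.726 only in the last digit, because the table rounds each term before adding.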
13. A die was thrown 498 times. Denoting x to be the number appearing on the top face of it, the
observed frequency of x is given below:
X 1 2 3 4 5 6
F 69 78 85 82 86 98
What opinion you would form for the accuracy of the die?
Solution:
Given n = 6
Null Hypothesis H0: There is no significant difference between the observed and expected frequencies.
Alternative Hypothesis H1: There is a significant difference.
Level of significance: $\alpha$ = 5% or 0.05
Degrees of freedom = n − 1 = 6 − 1 = 5
On the assumption H0, the expected frequency for each face = 498/6 = 83
The test statistic is $\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{196 + 25 + 4 + 1 + 9 + 225}{83} = \frac{460}{83} = 5.542$
For v = 5 degrees of freedom, the table value of $\chi^2$ at 5% level is $\chi^2_{0.05} = 11.07$
Conclusion: Since the calculated value of $\chi^2$ < the table value of $\chi^2$, H0 is accepted at 5% level of
significance. The die is accurate.
Alternate hypothesis H 1 : Party affiliation and tax plan are not independent.
Level of significance: = 0.05
The test statistic: $\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{s} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$
                    Reaction
Party       In favour    Neutral    Opposed    Total
Party A     120          20         20         160
Party B     50           30         60         140
Party C     50           10         40         100
Total       220          60         120        400
Alternate hypothesis H 1 : Conditions of home and conditions of child are not independent.
The test statistic: $\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{s} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$
Analysis:
                        Condition of home
Condition of child      Clean     Dirty     Total
Clean                   70        50        120
Fair                    80        20        100
Dirty                   35        45        80
Total                   185       115       300
Expected frequency = (corresponding row total × column total) / grand total
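O      E        O − E     (O − E)²    (O − E)²/E
70     74       -4        16          16/74 = 0.216
50     46       4         16          0.348
80     61.67    18.33     335.99      5.448
20     38.33    -18.33    335.99      8.766
35     49.33    -14.33    205.35      4.163
45     30.67    14.33     205.35      6.695
Total                                 25.636
$\chi^2 = 25.636$
For (3 − 1)(2 − 1) = 2 degrees of freedom, $\chi^2_{0.05} = 5.991$
Conclusion: Since $\chi^2 > \chi^2_{0.05}$, we reject our null hypothesis H0. Hence, conditions of home and
conditions of child are not independent.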
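The expected-frequency rule (row total × column total / grand total) and the resulting statistic can be sketched for the home/child table as follows. Variable names are illustrative; the script keeps full precision, so its total is slightly larger than the hand-rounded 25.636.

```python
# Rows: child Clean, Fair, Dirty; columns: home Clean, Dirty.
observed = [[70, 50], [80, 20], [35, 45]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total  # expected frequency
        chi2 += (o - e) ** 2 / e

dof = (len(observed) - 1) * (len(observed[0]) - 1)
print(f"chi2 = {chi2:.3f}, d.f. = {dof}")
```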
3. What are the three essential steps to plan design of experiment? (Nov/Dec 2022)
Solution:
To plan an experiment, the following three are essential.
1. A Statement of the objective: Statement should clearly mention the hypothesis to be tested.
2. A description of the experiment: Description should include the type of experimental material, size
of the experiment and the number of replications.
3. The outline of the method of analysis: The outline of the method consists of analysis of variance.
4. Define Analysis of Variance. (Nov/Dec 2023)
Solution:
Analysis of Variance is a technique that will enable us to test for the significance of the difference
among more than two sample means.
5. What are the assumptions of analysis of variance? (April / May 2021)
Solution:
(i) The sample observations are independent
(ii) The Environmental effects are additive in nature
(iii) Population from which samples are taken is normal
6. What is the purpose of ANOVA?
Solution:
The purpose of ANOVA is to test the homogeneity of several means.
7. Define Replication.
Solution:
The repetition of the treatments under investigation is known as replication.
8. Define Experimental Error.
Solution:
The variation from plot to plot caused by uncontrolled factors, that is factors beyond the control of
the experimenter, is known as experimental error.
9. Define one-way classification and two-way classifications in ANOVA.
Solution:
If the experiment studies the influence of a single factor only, it is a one-way classification.
If the experiment studies the influence of two factors, it is a two-way classification.
10. Write down the ANOVA table for one way classification. (April / May 2017)
Solution:
Source of Variation    Sum of Squares    Degrees of freedom    Mean Square           F-Ratio
Between Samples        SSC               C − 1                 MSC = SSC/(C − 1)     F_C = MSC/MSE, if MSC > MSE
Within Samples         SSE               N − C                 MSE = SSE/(N − C)
11. Find the missing value of A, B, C, D from the ANOVA table.
S.V D.F S.S M.S.S F cal
Treatment 2 A 3 1.66
Error B C 5 -
Total 9 D - -
Solution:
A = 2 × 3 = 6, B = Total D.F − Treatment D.F = 9 − 2 = 7,
C = 7 × 5 = 35, D = A + C = 6 + 35 = 41
12. What is Latin Square Design? (April / May 2019)
18. Mention the advantages of Latin square design over other designs. (April/May 2021)
Solution:
The advantages of the Latin square design over other designs are:
(i) With a two-way stratification or grouping, the Latin square controls more of the variation than the
CRD or the randomized complete block design. The two-way elimination of variation often results in
a small error mean square.
(ii) The analysis is simple.
(iii) Even with missing data the analysis remains relatively simple.
19. Compare RBD, LSD, CRD.
Solution:
CRD | RBD | LSD
Studies the influence of one factor | Studies the influence of two factors | Studies the influence of more than two factors
No restriction on treatments and replications | No restriction on treatments and replications | The number of replications of each treatment is equal to the number of treatments
- | Uses only a rectangular or square field | Uses only a square field
PART-B
1. The following are the number of mistakes made in 5 successive days by 4 technicians working for
a photographic laboratory. Test whether the difference among the four sample means can be
attributed to chance. (Test at a level of significance $\alpha = 0.01$)
Technicians
I II III IV
6 14 10 9
14 9 12 12
10 12 7 8
8 10 15 10
11 14 11 11
Solution:
H0: There is no significant difference between the technicians
H1 : Significant difference between the technicians
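We shift the origin to 10, i.e. X = x − 10
         X1     X2     X3     X4     TOTAL    X1²    X2²    X3²    X4²
         -4      4      0     -1      -1      16     16      0      1
          4     -1      2      2       7      16      1      4      4
          0      2     -3     -2      -3       0      4      9      4
         -2      0      5      0       3       4      0     25      0
          1      4      1      1       7       1     16      1      1
Total    -1      9      5      0      13      37     37     39     10
N = Total No. of Observations = 20, T = Grand Total = 13
Correction Factor = $\frac{(\text{Grand total})^2}{\text{Total No. of Observations}} = \frac{13^2}{20} = 8.45$
$TSS = \sum X_1^2 + \sum X_2^2 + \sum X_3^2 + \sum X_4^2 - C.F = 37 + 37 + 39 + 10 - 8.45 = 114.55$
$SSC = \frac{(\sum X_1)^2}{C_1} + \frac{(\sum X_2)^2}{C_2} + \frac{(\sum X_3)^2}{C_3} + \frac{(\sum X_4)^2}{C_4} - C.F = \frac{(-1)^2}{5} + \frac{(9)^2}{5} + \frac{(5)^2}{5} + \frac{(0)^2}{5} - 8.45 = 12.95$
SSE = TSS − SSC = 114.55 − 12.95 = 101.6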
ANOVA Table
Source of Variation    Sum of Squares    Degrees of freedom     Mean Square                  F-Ratio
Between Samples        SSC = 12.95       C − 1 = 4 − 1 = 3      MSC = SSC/(C − 1) = 4.317    F_C = MSE/MSC = 1.471 (since MSE > MSC)
Within Samples         SSE = 101.6       N − C = 20 − 4 = 16    MSE = SSE/(N − C) = 6.35
Cal F_C = 1.471 & Tab F_C(16,3) = 5.29
Conclusion: Cal F_C < Tab F_C. There is no significant difference between the technicians.
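The sums of squares in this one-way layout can be verified without shifting the origin, since the shift does not change TSS, SSC or SSE. The function below is a hedged sketch with a hypothetical name; it also accepts groups of unequal size.

```python
def one_way_anova(groups):
    """Return the one-way ANOVA sums of squares and mean squares."""
    values = [x for group in groups for x in group]
    n_total = len(values)
    cf = sum(values) ** 2 / n_total                  # correction factor
    tss = sum(x * x for x in values) - cf            # total sum of squares
    ssc = sum(sum(g) ** 2 / len(g) for g in groups) - cf
    sse = tss - ssc
    msc = ssc / (len(groups) - 1)
    mse = sse / (n_total - len(groups))
    return {"TSS": tss, "SSC": ssc, "SSE": sse, "MSC": msc, "MSE": mse}

technicians = [
    [6, 14, 10, 8, 11],    # I
    [14, 9, 12, 10, 14],   # II
    [10, 12, 7, 15, 11],   # III
    [9, 12, 8, 10, 11],    # IV
]
result = one_way_anova(technicians)
print({name: round(value, 3) for name, value in result.items()})
```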
2. As part of the investigation of the collapse of the roof of a building, a testing laboratory is given
all the available bolts that connected the steel structure at 3 different positions on the roof. The
forces required to shear each of these bolts (coded values) are as follows:
Position 1 : 90 82 79 98 83 91
Position 2 : 105 89 93 104 89 95 86
Position 3 : 83 89 80 94
Perform an analysis of variance to test at the 0.05 level of significance whether the differences
among the sample means at the 3 positions are significant. (April / May 2019)
Solution:
H0: There is no significant difference between the sample means at the three positions.
H1: Significant difference between the sample means at the three positions.
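We shift the origin to 89, i.e. X = x − 89
         X1      X2      X3      TOTAL    X1²     X2²     X3²
          1      16      -6       11       1      256     36
         -7       0       0       -7      49        0      0
        -10       4      -9      -15     100       16     81
          9      15       5       29      81      225     25
         -6       0       -       -6      36        0      -
          2       6       -        8       4       36      -
          -      -3       -       -3       -        9      -
Total   -11      38     -10       17     271      542    142
N = 17, T = 17, Correction Factor = $\frac{17^2}{17} = 17$
$TSS = \sum X_1^2 + \sum X_2^2 + \sum X_3^2 - C.F = 271 + 542 + 142 - 17 = 938$
$SSC = \frac{(\sum X_1)^2}{C_1} + \frac{(\sum X_2)^2}{C_2} + \frac{(\sum X_3)^2}{C_3} - C.F = \frac{(-11)^2}{6} + \frac{(38)^2}{7} + \frac{(-10)^2}{4} - 17 = 234.44$
SSE = TSS − SSC = 938 − 234.44 = 703.56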
ANOVA Table
Source of Variation    Sum of Squares    Degrees of freedom     Mean Square                   F-Ratio
Between Samples        SSC = 234.44      C − 1 = 3 − 1 = 2      MSC = SSC/(C − 1) = 117.22    F_C = MSC/MSE = 2.332
Within Samples         SSE = 703.56      N − C = 17 − 3 = 14    MSE = SSE/(N − C) = 50.25
Column / Row    1       2       3       4
1               A 18    C 21    D 25    B 11
2               D 22    B 12    A 15    C 19
3               B 15    A 20    C 23    D 24
4               C 22    D 21    B 10    A 17
Analyse the experimental yield.
Solution:
Null Hypothesis H 0 : There is no significant difference between rows and column
Alternate Hypothesis H1 : There is a significant difference between rows and column
Test statistic:
Y1 18 21 25 11 75
Y2 22 12 15 19 68
Y3 15 20 23 24 82
Y4 22 21 10 17 70
Total 77 74 73 71 295
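$SSR = \frac{(\sum Y_1)^2}{R_1} + \frac{(\sum Y_2)^2}{R_2} + \frac{(\sum Y_3)^2}{R_3} + \frac{(\sum Y_4)^2}{R_4} - C.F = \frac{75^2 + 68^2 + 82^2 + 70^2}{4} - \frac{295^2}{16} = 29.19$
Step 7: SSE = TSS − SSC − SSR = 329.93 − 4.68 − 29.19
SSE = 296.06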
Step 8: ANOVA TABLE:
Source of Variation    Sum of Squares    Degrees of Freedom    Mean Sum of Squares          Variance F-ratio          Table value
Between Columns        SSC = 4.68        c − 1 = 4 − 1 = 3     MSC = SSC/(c − 1) = 1.56     Fc = MSE/MSC = 21.09      Fc(9,3) = 8.81
Total                  329.93            15
Table value: Fc > Fc(9,3) and FR<FR(9,3) at 5% LOS
Conclusion:
H 0 is rejected, hence there is a significant difference between rows.
H 0 is accepted, hence there is no significant difference between columns.
4. The following table gives the number of refrigerators sold by 4 salesmen in three months.
Salesman
Months
A B C D
May 50 40 48 39
June 46 48 50 45
July 39 44 40 39
Is there a significant difference in the sales made by the four salesmen? Is there a significant
difference in the sales made during different months? (April / May 2021)
Solution:
Null Hypothesis $H_0$: There is no significant difference between the sales in the 3 months and also
between the sales of the 4 salesmen.
Alternate Hypothesis $H_1$: There is a significant difference between the sales in the 3 months and also
between the sales of the 4 salesmen.
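Test statistic: To simplify calculations, we deduct 40 from each value
Months       A (X1)    B (X2)    C (X3)    D (X4)    Month Total    X1²    X2²    X3²    X4²
Y1 May       10        0         8         -1        17             100    0      64     1
Y2 June      6         8         10        5         29             36     64     100    25
Y3 July      -1        4         0         -1        2              1      16     0      1
Total        15        12        18        3         48             137    80     164    27
N = 12, T = 48, Correction Factor $C.F = \frac{T^2}{N} = \frac{48^2}{12} = 192$
$TSS = \sum X_1^2 + \sum X_2^2 + \sum X_3^2 + \sum X_4^2 - C.F = 137 + 80 + 164 + 27 - 192 = 216$
Step 5: $SSC = \frac{(\sum X_1)^2}{C_1} + \frac{(\sum X_2)^2}{C_2} + \frac{(\sum X_3)^2}{C_3} + \frac{(\sum X_4)^2}{C_4} - C.F = \frac{15^2}{3} + \frac{12^2}{3} + \frac{18^2}{3} + \frac{3^2}{3} - 192 = 42$
Step 6: $SSR = \frac{(\sum Y_1)^2}{R_1} + \frac{(\sum Y_2)^2}{R_2} + \frac{(\sum Y_3)^2}{R_3} - C.F = \frac{17^2}{4} + \frac{29^2}{4} + \frac{2^2}{4} - 192 = 91.5$
Step 7: SSE = TSS − SSC − SSR = 216 − 42 − 91.5
SSE = 82.5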
Step 8: ANOVA TABLE:
Source of Variation           Sum of Squares    Degrees of Freedom     Mean Sum of Squares                    Variance F-ratio                        Table value
Between Columns (Salesmen)    SSC = 42          c − 1 = 4 − 1 = 3      MSC = SSC/(c − 1) = 42/3 = 14          FC = MSC/MSE = 14/13.75 = 1.018         FC(3,6) = 4.76
Between Rows (Months)         SSR = 91.5        r − 1 = 3 − 1 = 2      MSR = SSR/(r − 1) = 91.5/2 = 45.75     FR = MSR/MSE = 45.75/13.75 = 3.327      FR(2,6) = 5.14
Error                         SSE = 82.5        (c − 1)(r − 1) = 6     MSE = SSE/((c − 1)(r − 1)) = 82.5/6 = 13.75
Total                         216               11
Conclusion:
1) Cal $F_C$ < Table $F_{C,0.05}(3,6)$. Hence, we accept $H_0$ and conclude that there is no significant
difference between the sales of the 4 salesmen.
2) Cal $F_R$ < Table $F_{R,0.05}(2,6)$. Hence, we accept $H_0$ and conclude that there is no significant
difference between the sales in the three months.
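Steps 5 to 8 above follow the same pattern for any two-way table with one observation per cell. The following sketch (function and key names are illustrative) recomputes them from the raw sales figures; subtracting 40 first is unnecessary because the sums of squares are unaffected by a shift of origin.

```python
def two_way_anova(table):
    """Two-way ANOVA, one observation per cell; rows and columns are the factors."""
    r, c = len(table), len(table[0])
    grand_total = sum(sum(row) for row in table)
    cf = grand_total ** 2 / (r * c)
    tss = sum(x * x for row in table for x in row) - cf
    ssr = sum(sum(row) ** 2 for row in table) / c - cf
    ssc = sum(sum(col) ** 2 for col in zip(*table)) / r - cf
    sse = tss - ssr - ssc
    mse = sse / ((r - 1) * (c - 1))
    return {
        "TSS": tss, "SSR": ssr, "SSC": ssc, "SSE": sse,
        "F_rows": (ssr / (r - 1)) / mse,
        "F_cols": (ssc / (c - 1)) / mse,
    }

# Rows: May, June, July; columns: salesmen A, B, C, D.
sales = [[50, 40, 48, 39], [46, 48, 50, 45], [39, 44, 40, 39]]
anova = two_way_anova(sales)
print({name: round(value, 3) for name, value in anova.items()})
```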
5. The following table gives the number of articles of a product produced by five different workers
using four types of machines.
Machines
Workers
P Q R S
A 44 38 47 36
B 46 40 52 43
C 34 36 44 32
D 43 38 46 33
E 38 42 49 39
Test (i) Whether the five workers differ with respect to mean productivity and
(ii) Whether the four machines differ with respect to mean productivity.
Solution:
H0: There is no significant difference between the Machine types and between the Workers
H1 : There is a Significant difference between the Machine types and between the Workers
We shift the origin Xij = xij – 46; h = 5; k = 4; N = 20
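Worker     X1 (P)    X2 (Q)    X3 (R)    X4 (S)    T_i      T_i²/k    ΣX²
1 (A)      -2        -8        1         -10       -19      90.25     169
2 (B)      0         -6        6         -3        -3       2.25      81
3 (C)      -12       -10       -2        -14       -38      361       444
4 (D)      -3        -8        0         -13       -24      144       242
5 (E)      -8        -4        3         -7        -16      64        138
T_j        -25       -36       8         -47       -100     661.5     1074
T_j²/h     125       259.2     12.8      441.8     838.8
T = −100, Correction Factor $C.F = \frac{T^2}{N} = \frac{(-100)^2}{20} = 500$
$TSS = \sum\sum X_{ij}^2 - C.F = 1074 - 500 = 574$
$SSR = \sum \frac{T_i^2}{k} - C.F = 661.5 - 500 = 161.5$
$SSC = \sum \frac{T_j^2}{h} - C.F = 838.8 - 500 = 338.8$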
SSE = TSS – SSC – SSR = 574 – 161.5 – 338.8 = 73.7
ANOVA Table
Source of Variation          Sum of Squares    Degrees of freedom        Mean Square       F-Ratio         F Tab Ratio
Between Rows (Workers)       SSR = 161.5       h − 1 = 4                 MSR = 40.375      FR = 6.574      F5%(4, 12) = 3.26
Between Columns (Machine)    SSC = 338.8       k − 1 = 3                 MSC = 112.933     FC = 18.388     F5%(3, 12) = 3.49
Residual                     SSE = 73.7        (h − 1)(k − 1) = 12       MSE = 6.1417
Total                        574
Conclusion: Cal FC > Tab FC and Cal FR > Tab FR. There is a significant difference between the
machine types and a significant difference between the workers.
6. Analyse the variance in the following latin square of yields (in kgs) of paddy where A, B, C, D
denote the different methods of cultivation.
D 122 A 121 C 123 B 122
B 124 C 123 A 122 D 125
A 120 B 119 D 120 C 121
C 122 D 123 B 121 A 122 (April / May 2021)
Examine whether the different methods of cultivation have given significantly different yields.
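After subtracting 120 from every value, the column totals are $\sum X_1 = 8$, $\sum X_2 = 6$, $\sum X_3 = 6$, $\sum X_4 = 10$ (total 30) and the column sums of squares are $\sum X_1^2 = 24$, $\sum X_2^2 = 20$, $\sum X_3^2 = 14$, $\sum X_4^2 = 34$. The row totals are 8, 14, 0, 8.
n = 4, N = 16, T = Grand Total = 30;
Correction Factor = $\frac{(\text{Grand total})^2}{\text{Total No. of Observations}} = \frac{(30)^2}{16} = 56.25$
$SST = \sum X_1^2 + \sum X_2^2 + \sum X_3^2 + \sum X_4^2 - C.F = 92 - 56.25 = 35.75$
$SSC = \frac{(\sum X_1)^2}{c_1} + \frac{(\sum X_2)^2}{c_2} + \frac{(\sum X_3)^2}{c_3} + \frac{(\sum X_4)^2}{c_4} - C.F = \frac{8^2 + 6^2 + 6^2 + 10^2}{4} - 56.25 = 2.75$
$SSR = \frac{(\sum Y_1)^2}{r_1} + \frac{(\sum Y_2)^2}{r_2} + \frac{(\sum Y_3)^2}{r_3} + \frac{(\sum Y_4)^2}{r_4} - C.F = \frac{8^2 + 14^2 + 0^2 + 8^2}{4} - 56.25 = 24.75$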
Letters                           Total
A        1     2     0     2      5
B        2     4    -1     1      6
C        3     3     1     2      9
D        2     5     0     3      10
Total                             30
$SSK = \frac{(\sum A)^2}{4} + \frac{(\sum B)^2}{4} + \frac{(\sum C)^2}{4} + \frac{(\sum D)^2}{4} - C.F$
$= \frac{5^2}{4} + \frac{6^2}{4} + \frac{9^2}{4} + \frac{10^2}{4} - 56.25$
= 60.5 − 56.25 = 4.25
SSE = SST − SSC − SSR − SSK = 35.75 − 2.75 − 24.75 − 4.25 = 4
ANOVA Table
Source of Variation    Sum of Squares    Degrees of freedom    Mean Square     F-Ratio        F Tab Ratio (5% level)
Between Rows           SSR = 24.75       n − 1 = 3             MSR = 8.25      FR = 12.31     FR(3, 6) = 4.76
Between Columns        SSC = 2.75        n − 1 = 3             MSC = 0.92
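         X1     X2     X3     X4     TOTAL    X1²    X2²    X3²    X4²
Y1        1     -1      5      3       8       1      1     25      9
Y2        3      5      1      1      10       9     25      1      1
Y3        3     -1      1      3       6       9      1      1      9
Y4       -1      7     -1      3       8       1     49      1      9
Total     6     10      6     10      32      20     76     28     28
N = Total No. of Observations = 16, T = Grand Total = 32
Correction Factor = $\frac{(\text{Grand total})^2}{\text{Total No. of Observations}} = \frac{32^2}{16} = 64$
$TSS = \sum X_1^2 + \sum X_2^2 + \sum X_3^2 + \sum X_4^2 - C.F = 20 + 76 + 28 + 28 - 64 = 88$
$SSC = \frac{(\sum X_1)^2}{C_1} + \frac{(\sum X_2)^2}{C_2} + \frac{(\sum X_3)^2}{C_3} + \frac{(\sum X_4)^2}{C_4} - C.F = \frac{(6)^2}{4} + \frac{(10)^2}{4} + \frac{(6)^2}{4} + \frac{(10)^2}{4} - 64 = 4$
$SSR = \frac{(\sum Y_1)^2}{R_1} + \frac{(\sum Y_2)^2}{R_2} + \frac{(\sum Y_3)^2}{R_3} + \frac{(\sum Y_4)^2}{R_4} - C.F = \frac{(8)^2}{4} + \frac{(10)^2}{4} + \frac{(6)^2}{4} + \frac{(8)^2}{4} - 64 = 2$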
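To find SSK
Treatment    1     2     3     4     Total
A            1     1     3     7     12
B           -1     1     1    -1     0
C            5     3    -1     3     10
$SSK = \frac{(\sum K_1)^2}{n} + \frac{(\sum K_2)^2}{n} + \frac{(\sum K_3)^2}{n} + \frac{(\sum K_4)^2}{n} - C.F = 22$, where $\sum K_i$ is the total for the i-th treatment
SSE = TSS − SSC − SSR − SSK = 88 − 4 − 2 − 22 = 60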
ANOVA Table
Source of Variation    Sum of Squares    Degrees of freedom      Mean Square                          F-Ratio
Between Columns        SSC = 4           n − 1 = 3               MSC = SSC/(n − 1) = 1.33             FC = MSE/MSC = 7.52
Between Rows           SSR = 2           n − 1 = 3               MSR = SSR/(n − 1) = 0.67             FR = MSE/MSR = 14.9
Between Treatments     SSK = 22          n − 1 = 3               MSK = SSK/(n − 1) = 7.33             FK = MSE/MSK = 1.36
Error (or) Residual    SSE = 60          (n − 1)(n − 2) = 6      MSE = SSE/((n − 1)(n − 2)) = 10
(Each F-ratio is formed with the larger mean square, here MSE, in the numerator.)
Table value of F for (3,6) degrees of freedom is 4.76
Since FK = 1.36 is less than the table value, there is no significant difference between treatments.
8. Analyse the following results of a Latin square experiment. (April / May 2021)
Column / Row 1 2 3 4
1 A(12) D(20) C(16) B(10)
2 D(18) A(14) B(11) C(14)
3 B(12) C(15) D(19) A(13)
4 C(16) B(11) A(15) D(20)
The letters A, B, C, D denote the treatments and the figures in brackets denote the observations.
Solution:
H0 : There is no significant difference between rows,columns and treatments
H1 : There is a significant difference between rows,columns and treatments
Subtract 15 from every value
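Variety    X1     X2     X3     X4     TOTAL
Y1         -3      5      1     -5      -2
Y2          3     -1     -4     -1      -3
Y3         -3      0      4     -2      -1
Y4          1     -4      0      5       2
Total      -2      0      1     -3      -4
N = Total No. of Observations = 16, T = Grand Total = −4
Correction Factor $C.F = \frac{T^2}{N} = \frac{(-4)^2}{16} = 1$
$TSS = \sum\sum X^2 - C.F = 158 - 1 = 157$
$SSC = \frac{(\sum X_1)^2}{C_1} + \frac{(\sum X_2)^2}{C_2} + \frac{(\sum X_3)^2}{C_3} + \frac{(\sum X_4)^2}{C_4} - C.F = \frac{(-2)^2 + 0^2 + 1^2 + (-3)^2}{4} - 1 = 2.5$
$SSR = \frac{(\sum Y_1)^2}{R_1} + \frac{(\sum Y_2)^2}{R_2} + \frac{(\sum Y_3)^2}{R_3} + \frac{(\sum Y_4)^2}{R_4} - C.F = \frac{(-2)^2 + (-3)^2 + (-1)^2 + 2^2}{4} - 1 = 3.5$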
To find SSK
Treatment    1     2     3     4     Total
A           -3    -1     0    -2     -6
B           -3    -4    -4    -5     -16
C            1     0     1    -1     1
D            3     5     4     5     17
$SSK = \frac{(-6)^2}{4} + \frac{(-16)^2}{4} + \frac{(1)^2}{4} + \frac{(17)^2}{4} - C.F = 145.5 - 1 = 144.5$
SSE = TSS − SSC − SSR − SSK = 157 − 2.5 − 3.5 − 144.5 = 6.5
ANOVA Table
Source of Variation    Sum of Squares    Degrees of freedom      Mean Square                            F-Ratio
Between Columns        SSC = 2.5         n − 1 = 3               MSC = SSC/(n − 1) = 0.83               FC = MSE/MSC = 1.30 (since MSE > MSC)
Between Rows           SSR = 3.5         n − 1 = 3               MSR = SSR/(n − 1) = 1.167              FR = MSR/MSE = 1.08
Between Treatments     SSK = 144.5       n − 1 = 3               MSK = SSK/(n − 1) = 48.17              FK = MSK/MSE = 44.6
Error (or) Residual    SSE = 6.5         (n − 1)(n − 2) = 6      MSE = SSE/((n − 1)(n − 2)) = 1.08
Table value of F for (3,6) degrees of freedom is 4.76
Since FK > 4.76, there is a significant difference between treatments.
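The row, column and treatment sums of squares of this Latin square can be cross-checked from the raw observations. The layout strings and variable names below are illustrative; the sketch assumes a square n × n design with one letter per cell, as in the problem.

```python
# Treatment letters and observations, row by row, as given in the problem.
layout = ["ADCB", "DABC", "BCDA", "CBAD"]
yields = [[12, 20, 16, 10], [18, 14, 11, 14], [12, 15, 19, 13], [16, 11, 15, 20]]
n = len(yields)

grand_total = sum(sum(row) for row in yields)
cf = grand_total ** 2 / n ** 2
tss = sum(y * y for row in yields for y in row) - cf
ssr = sum(sum(row) ** 2 for row in yields) / n - cf
ssc = sum(sum(col) ** 2 for col in zip(*yields)) / n - cf

# Accumulate the total yield for each treatment letter.
treatment_totals = {}
for letters, row in zip(layout, yields):
    for letter, y in zip(letters, row):
        treatment_totals[letter] = treatment_totals.get(letter, 0) + y
ssk = sum(t ** 2 for t in treatment_totals.values()) / n - cf

sse = tss - ssr - ssc - ssk
mse = sse / ((n - 1) * (n - 2))
f_treatments = (ssk / (n - 1)) / mse
print(ssr, ssc, ssk, sse, round(f_treatments, 2))
```

The treatment F-ratio printed here uses unrounded mean squares, so it can differ in the first decimal from the value obtained with the rounded figures 48.17 and 1.08.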
UNIT – V STATISTICAL QUALITY CONTROL
PART A
1. Write down the objectives of statistical quality control.
To achieve better utilization of raw materials, to control waste and scrap and to optimize the
quality of the product without any defects
2. Define control chart.
It is a useful graphical method to find whether a process is in statistical quality control.
3. What are the uses of Quality control chart?
Lower control limit = $n\bar{p} - 3\sqrt{n\bar{p}(1 - \bar{p})}$ or $n\bar{p} - 3\sqrt{n\bar{p}\,\bar{q}}$
17. Find the lower and upper control limits for $\bar{X}$-chart and R-chart, when each sample is of size 4
and $\bar{\bar{X}} = 10.80$ and $\bar{R} = 0.46$.
For $\bar{X}$-chart, LCL = 10.46, UCL = 11.14; for R-chart, LCL = 0, UCL = 1.05.
18. Find the lower and upper control limits for $\bar{X}$-chart?
For $\bar{X}$-chart, LCL = 11.43, UCL = 18.57.
19. Find the lower and upper control limits for p- chart and np – chart, when n=100 and P = 0.085?
For p-chart, LCL =0.0013, UCL=0.1687; For np-chart, LCL=0.134, UCL=16.867
20. A garment was sampled on 10 consecutive hours of production. The number of defects found per
garment is given below: Defects: 5,1,7,0,2,3,4,0,3,2. Compute upper and lower control limits for
monitoring number of defects. (APR / MAY ’19)
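$\bar{c} = 2.7$; UCL = $\bar{c} + 3\sqrt{\bar{c}}$ = 7.6295; LCL = $\bar{c} - 3\sqrt{\bar{c}}$ = −2.2295, which is taken as 0 since LCL cannot be negative.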
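The c-chart limits in question 20 follow directly from the mean number of defects. A minimal sketch, with a hypothetical helper name, that also clamps a negative lower limit to zero:

```python
from math import sqrt

def c_chart_limits(defect_counts):
    """Return (c_bar, LCL, UCL) for a c-chart; a negative LCL is replaced by 0."""
    c_bar = sum(defect_counts) / len(defect_counts)
    spread = 3 * sqrt(c_bar)
    return c_bar, max(0.0, c_bar - spread), c_bar + spread

defects = [5, 1, 7, 0, 2, 3, 4, 0, 3, 2]
c_bar, lcl, ucl = c_chart_limits(defects)
print(f"c_bar = {c_bar}, LCL = {lcl}, UCL = {ucl:.4f}")
```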
PART B
1. Given below are the values of sample mean $\bar{X}$ and sample range R for 10 samples, each of size 5.
Draw the appropriate mean and range charts and comment on the state of control of the process.
Sample No. 1 2 3 4 5 6 7 8 9 10
Mean X i 43 49 37 44 45 37 51 46 43 47
Range Ri 5 6 5 7 7 4 8 6 4 6
Solution:
$\bar{\bar{X}} = \frac{1}{N}\sum \bar{X}_i = \frac{1}{10}(43 + 49 + 37 + 44 + 45 + 37 + 51 + 46 + 43 + 47) = 44.2$
$\bar{R} = \frac{1}{N}\sum R_i = \frac{1}{10}(5 + 6 + 5 + 7 + 7 + 4 + 8 + 6 + 4 + 6) = 5.8$
From the table of control chart constants for sample size n = 5, we have $A_2 = 0.577$, $D_3 = 0$ and $D_4 = 2.115$
(i) Control limits for $\bar{X}$-chart:
CL = $\bar{\bar{X}}$ = 44.2; LCL = $\bar{\bar{X}} - A_2\bar{R}$ = 44.2 − (0.577)(5.8) = 40.85
UCL = $\bar{\bar{X}} + A_2\bar{R}$ = 44.2 + (0.577)(5.8) = 47.55
Conclusion:
Since the 2nd, 3rd, 6th and 7th sample means fall outside the control limits, the process is out
of control according to the $\bar{X}$-chart.
(ii) Control limits for R-chart:
CL = $\bar{R}$ = 5.8; LCL = $D_3\bar{R}$ = 0
UCL = $D_4\bar{R}$ = (2.115)(5.8) = 12.267 ≈ 12.27
Conclusion:
Since all the sample ranges fall within the control limits, the process is under control
according to the R-chart.
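The chart limits and the out-of-control check for this problem can be sketched as below. The constants are the n = 5 values quoted in the solution; the variable names are illustrative.

```python
A2, D3, D4 = 0.577, 0.0, 2.115  # control chart constants for n = 5 (from the solution)

means = [43, 49, 37, 44, 45, 37, 51, 46, 43, 47]
ranges = [5, 6, 5, 7, 7, 4, 8, 6, 4, 6]

grand_mean = sum(means) / len(means)
mean_range = sum(ranges) / len(ranges)

x_lcl, x_ucl = grand_mean - A2 * mean_range, grand_mean + A2 * mean_range
r_lcl, r_ucl = D3 * mean_range, D4 * mean_range

# Sample numbers (1-based) whose statistic lies outside the limits.
means_out = [i for i, m in enumerate(means, start=1) if not x_lcl <= m <= x_ucl]
ranges_out = [i for i, r in enumerate(ranges, start=1) if not r_lcl <= r <= r_ucl]

print(f"X-bar chart: LCL = {x_lcl:.2f}, UCL = {x_ucl:.2f}, out: {means_out}")
print(f"R chart:     LCL = {r_lcl:.2f}, UCL = {r_ucl:.2f}, out: {ranges_out}")
```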
2. The following data gives the average life in hours and range in hours of 12 samples each of 5 lamps.
Construct X - chart and R- chart, comment on state of control.
Sample No. 1 2 3 4 5 6 7 8 9 10 11 12
Mean X i 120 127 152 157 160 134 137 123 140 144 120 127
Range Ri 30 44 60 34 38 35 45 62 39 50 35 41
(APR / MAY 2019)
Solution:
$\bar{\bar{X}} = \frac{1}{N}\sum \bar{X}_i = \frac{1}{12}(120 + 127 + 152 + 157 + 160 + 134 + 137 + 123 + 140 + 144 + 120 + 127) = 136.75$
$\bar{R} = \frac{1}{N}\sum R_i = \frac{1}{12}(30 + 44 + 60 + 34 + 38 + 35 + 45 + 62 + 39 + 50 + 35 + 41) = 42.75$
From the table of control chart constants for sample size n = 5, we have $A_2 = 0.577$, $D_3 = 0$ and $D_4 = 2.115$
$\bar{X}$-chart: UCL = 161.44, CL = 136.75
Conclusion: Since all the sample points lie within the LCL and UCL lines, the process is under control according
to the $\bar{X}$-chart.
R-chart: UCL = 90.41, LCL = 0
Conclusion:
Since all the sample ranges fall within the control limits, the process is under control according to the
R-chart.
3. The following data give the measurements of 10 samples each of size 5 in the production process
taken in an interval of 2 hours. Calculate the sample means and ranges and draw the control
charts for mean and range.
Sample No. 1 2 3 4 5 6 7 8 9 10
Observed         49 50 50 48 47 52 49 55 53 54
measurement X    55 51 53 53 49 55 49 55 50 54
                 54 53 48 51 50 47 49 50 54 52
                 49 46 52 50 44 56 53 53 47 54
                 53 50 47 53 45 50 45 57 51 56
(MAY / JUNE 2016) (NOV / DEC 2018)
Solution:
$\bar{\bar{X}} = \frac{1}{N}\sum \bar{X}_i = \frac{1}{10}(52 + 50 + 50 + 51 + 47 + 52 + 49 + 54 + 51 + 54) = 51.0$
$\bar{R} = \frac{1}{N}\sum R_i = \frac{1}{10}(6 + 7 + 6 + 5 + 6 + 9 + 8 + 7 + 7 + 4) = 6.5$
From the table of control chart constants for sample size n = 5, we have $A_2 = 0.577$, $D_3 = 0$ and $D_4 = 2.115$
MEAN CHART
CL = $\bar{\bar{X}}$ = 51.0;
LCL = $\bar{\bar{X}} - A_2\bar{R}$ = 51.0 − (0.577)(6.5) = 47.25;
UCL = $\bar{\bar{X}} + A_2\bar{R}$ = 51.0 + (0.577)(6.5) = 54.75
Conclusion:
Since the 5th sample mean falls outside the control limits, the process is out of control
according to the $\bar{X}$-chart.
RANGE CHART
CL = $\bar{R}$ = 6.5;
LCL = $D_3\bar{R}$ = 0;
UCL = $D_4\bar{R}$ = (2.115)(6.5) = 13.7475
Conclusion: Since all the sample ranges fall within the control limits, the process is under
control according to the R-chart.
4. 15 samples of 200 items each were from the output of a process. The number of defective items in
the samples are given below. Prepare a control chart for the fraction defective and comment on
the state of control.
Sample 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
number(i):
No. of. 12 15 10 8 19 15 17 11 13 20 10 8 9 5 8
Defective(np):
Solution:
$\sum np = 12 + 15 + 10 + \dots + 5 + 8 = 180$, $n\bar{p} = \frac{1}{N}\sum np = \frac{180}{15} = 12$,
$\bar{p} = \frac{12}{200} = 0.06$
For the p-chart: CL = $\bar{p}$ = 0.06
LCL = $\bar{p} - 3\sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} = 0.06 - 3\sqrt{\frac{(0.06)(0.94)}{200}} = 0.0096 \approx 0.01$
UCL = $\bar{p} + 3\sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} = 0.06 + 3\sqrt{\frac{(0.06)(0.94)}{200}} = 0.1104 \approx 0.11$
The fraction defectives (values of $p_i = \frac{np_i}{n}$) for the given samples are $p_1 = \frac{12}{200} = 0.06$,
$p_2 = \frac{15}{200} = 0.075$, p3 = 0.05, p4 = 0.04, p5 = 0.095, p6 = 0.075, p7 = 0.085, p8 = 0.055, p9 = 0.065, p10 = 0.1,
p11 = 0.05, p12 = 0.04, p13 = 0.045, p14 = 0.025, p15 = 0.04.
Conclusion:
Since all the sample points lie between the LCL and UCL lines, the process is under control.
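The p-chart limits for a constant sample size can be sketched as follows; `p_chart_limits` is a hypothetical helper, and a negative lower limit would be replaced by 0.

```python
from math import sqrt

def p_chart_limits(defectives, sample_size):
    """Return (p_bar, LCL, UCL) for a p-chart with constant sample size."""
    p_bar = sum(defectives) / (sample_size * len(defectives))
    spread = 3 * sqrt(p_bar * (1 - p_bar) / sample_size)
    return p_bar, max(0.0, p_bar - spread), p_bar + spread

defectives = [12, 15, 10, 8, 19, 15, 17, 11, 13, 20, 10, 8, 9, 5, 8]
p_bar, lcl, ucl = p_chart_limits(defectives, 200)

fractions = [d / 200 for d in defectives]
in_control = all(lcl <= p <= ucl for p in fractions)
print(f"p_bar = {p_bar:.3f}, LCL = {lcl:.4f}, UCL = {ucl:.4f}, in control: {in_control}")
```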
5. Construct a control chart for defectives for the following data:
Sample No: 1 2 3 4 5 6 7 8 9 10
No. inspected: 90 65 85 70 80 80 70 95 90 75
No. of defectives: 9 7 3 2 9 5 3 9 6 7
(APR / MAY 2019)
Solution:
We note that the size of the sample varies from sample to sample.
We can construct a p-chart, provided $0.75\bar{n} < n_i < 1.25\bar{n}$ for all i.
Here $\bar{n} = \frac{1}{N}\sum n_i = \frac{1}{10}(90 + 65 + \dots + 90 + 75) = \frac{1}{10}(800) = 80$
Hence $0.75\bar{n} = 60$ and $1.25\bar{n} = 100$
The values of $n_i$ lie between 60 and 100. Hence the p-chart can be drawn by the method given below.
CL = $\bar{p}$ = (total number of defectives)/(total number inspected) = 60/800 = 0.075
LCL = $\bar{p} - 3\sqrt{\frac{\bar{p}(1 - \bar{p})}{\bar{n}}} = 0.075 - 3\sqrt{\frac{0.075 \times 0.925}{80}} = -0.013$
Since LCL cannot be negative, it is taken as 0.
UCL = $\bar{p} + 3\sqrt{\frac{\bar{p}(1 - \bar{p})}{\bar{n}}} = 0.075 + 3\sqrt{\frac{0.075 \times 0.925}{80}} = 0.163$
The values of $p_i$ for the various samples are 0.100, 0.108, 0.035, 0.029, 0.113, 0.063, 0.043, 0.095, 0.067,
0.093
Since all the sample points lie within the control lines, the process is under control.
6. The data given below are the number of defectives in 10 samples of 100 items each. Construct a
p-chart and an np-chart and comment on the results.
Sample No. 1 2 3 4 5 6 7 8 9 10
No. of defectives 6 16 7 3 8 12 7 11 11 4
(Nov/Dec 2023)
Solution:
Sample size is constant for all samples, n=100.
For p-chart:
p (1 − p ) ( 0.085) (0.915)
LCL = p − 3 = 0.085 − 3 = 0.0013
n 100
p (1 − p ) ( 0.085)( 0.915) = 0.1687
UCL = p + 3 = 0.085 +
n 3
UCL=0.1687
Conclusion:
All these values are less than UCL=0.1687 and greater than LCL=0.0013. In the control chart, all
sample points lie within the control limits. Hence, the process is under statistical control.
For np-chart:
(
UCL = n p + 3 n p 1 − p )
= np+3
p 1− p ( )
= 100 ( 0.1687 ) = 16.87
n
np = 100 ( 0.085 ) = 8.5
(
LCL = n p − 3 n p 1 − p )
= n p −3
(
p 1− p
) = 100 ( 0.0013) = 0.13
n
UCL=16.87
CL=8.5
LCL=0.13
Conclusion:
All the values of number of defectives in the table lie between 16.87 and 0.13. Hence, the process is
under control even in np-chart.
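The np-chart limits are simply n times the p-chart limits, which the following sketch confirms for the data of this problem. The helper name is illustrative.

```python
from math import sqrt

def np_chart_limits(defectives, sample_size):
    """Return (np_bar, LCL, UCL) for an np-chart with constant sample size."""
    np_bar = sum(defectives) / len(defectives)
    p_bar = np_bar / sample_size
    spread = 3 * sqrt(np_bar * (1 - p_bar))
    return np_bar, max(0.0, np_bar - spread), np_bar + spread

defectives = [6, 16, 7, 3, 8, 12, 7, 11, 11, 4]
np_bar, lcl, ucl = np_chart_limits(defectives, 100)

in_control = all(lcl <= d <= ucl for d in defectives)
print(f"CL = {np_bar}, LCL = {lcl:.2f}, UCL = {ucl:.2f}, in control: {in_control}")
```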
Solution:
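$\sum np = 5 + 10 + 12 + \dots + 3 + 4 = 90$,
$n\bar{p} = \frac{1}{N}\sum np = \frac{90}{15} = 6$,
$\bar{p} = \frac{1}{n}(n\bar{p}) = \frac{6}{100} = 0.06$
For np-chart:
CL = $n\bar{p}$ = 6
UCL = $n\bar{p} + 3\sqrt{n\bar{p}(1 - \bar{p})} = 6 + 3\sqrt{6(0.94)} = 13.12$
LCL = $n\bar{p} - 3\sqrt{n\bar{p}(1 - \bar{p})} = 6 - 3\sqrt{6(0.94)} = -1.12$
Since LCL cannot be negative, LCL = 0
Conclusion:
Since all the sample points lie between the upper and lower control lines, the process is under
control.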
$\bar{c} = \frac{\sum c_i}{N} = \frac{(2 + 4 + 3 + \dots + 2 + 1)}{15} = \frac{45}{15} = 3$
CL = $\bar{c}$ = 3; LCL = $\bar{c} - 3\sqrt{\bar{c}} = 3 - 3\sqrt{3} = -2.20$;
LCL = 0 (since LCL cannot be negative)
UCL = $\bar{c} + 3\sqrt{\bar{c}} = 3 + 3\sqrt{3} = 8.20$
Since all the sample points lie within the LCL and UCL lines, the process is under control.
9. A plant produces paper for newsprint and rolls of papers are inspected for defects. The results of
inspection of 20 rolls of papers are given below: Draw the c-chart and comment on the state of
control.
Roll No. (i):          1    2    3    4    5    6    7    8    9    10
No. of defects (c):    19   10   8    12   15   22   7    13   18   13
Roll No. (i):          11   12   13   14   15   16   17   18   19   20
No. of defects (c):    16   14   8    7    6    4    5    6    8    9
Solution:
The number of defects per sample containing only one item is given.
$\bar{c} = \frac{\sum c_i}{N} = \frac{(19 + 10 + 8 + \dots + 8 + 9)}{20} = \frac{220}{20} = 11$
CL = $\bar{c}$ = 11;
LCL = $\bar{c} - 3\sqrt{\bar{c}} = 11 - 3\sqrt{11} = 1.05$,
UCL = $\bar{c} + 3\sqrt{\bar{c}} = 11 + 3\sqrt{11} = 20.95$
Conclusion: Since one point (roll 6, with c = 22) falls outside the control lines, the process is out of control.