
Maths

The document outlines the syllabus and content for the MA1452 Applied Probability and Statistics course for the 2023-2024 academic year. It includes various probability problems, solutions, and theorems such as Bayes' Theorem, probability mass functions, and moment generating functions. The document serves as a comprehensive guide for students in the Biotech and Chemical departments during their fourth semester.

Uploaded by

Sushmita N

MA1452-Applied Probability & Statistics Dept. of BIOTECH & CHEMICAL 2023-2024


MA1452 – APPLIED PROBABILITY AND STATISTICS
(II Biotech & Chemical – IV Semester)
UNIT I – PROBABILITY & RANDOM VARIABLES
PART – A
1. A is known to hit the target in 2 out of 5 shots. B is known to hit the target in 3 out of 4 shots.
Find the probability of the target being hit when both try.
Solution:
Given $P(A) = \frac{2}{5}$, $P(B) = \frac{3}{4}$.
By the addition theorem, $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
$= P(A) + P(B) - P(A)P(B)$, since A and B are independent,
$= \frac{2}{5} + \frac{3}{4} - \frac{6}{20} = \frac{17}{20}$.
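The addition rule for independent events can be checked with exact fractions. This is a minimal sketch; the function name `prob_union_independent` is chosen here for illustration and does not come from the notes.

```python
from fractions import Fraction

def prob_union_independent(p_a, p_b):
    """P(A or B) for independent events: P(A) + P(B) - P(A)P(B)."""
    return p_a + p_b - p_a * p_b

p_hit = prob_union_independent(Fraction(2, 5), Fraction(3, 4))
# Cross-check through the complement: the target is missed only if both miss.
p_hit_via_complement = 1 - (1 - Fraction(2, 5)) * (1 - Fraction(3, 4))
print(p_hit)  # 17/20
```

Both routes give the same value, which is a quick way to catch sign slips in the addition formula.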
2. Let A and B be two events such that $P(A) = \frac{1}{2}$, $P(B) = \frac{1}{3}$ and $P(A \cap B) = \frac{1}{6}$. Compute $P(B/A)$ and $P(\bar{A} \cap B)$. (April/May 2021)
Solution:
$P(\bar{A} \cap B) = P(B) - P(A \cap B) = \frac{1}{3} - \frac{1}{6} = \frac{1}{6}$,
$P(B/A) = \frac{P(A \cap B)}{P(A)} = \frac{1}{6} \times 2 = \frac{1}{3}$.

3. Let A and B be two events such that $P(A) = \frac{1}{3}$, $P(B) = \frac{3}{4}$, $P(A \cap B) = \frac{1}{4}$. Compute $P(A/B)$ and $P(A \cap \bar{B})$. (May/June 2019)
Solution:
$P(A/B) = \frac{P(A \cap B)}{P(B)} = \frac{1/4}{3/4} = \frac{1}{3}$,
$P(A \cap \bar{B}) = P(A) - P(A \cap B) = \frac{1}{3} - \frac{1}{4} = \frac{1}{12}$.
4. State Bayes' Theorem on Probability.
Statement:
If $E_1, E_2, \ldots, E_n$ are a set of exhaustive and mutually exclusive events associated with a random experiment and A is any other event associated with the $E_i$, then
$P(E_i/A) = \frac{P(E_i) P(A/E_i)}{\sum_{j=1}^{n} P(E_j) P(A/E_j)}$, $i = 1, 2, \ldots, n$.
5. Let X be the random variable which denotes the number of heads in three tosses of a fair coin.
Determine the probability mass function of X.
Solution:
If a coin is tossed three times, then the sample space is
S = {TTT, HTT, THT, TTH, HTH, HHT, THH, HHH}.
Let X be the R.V. denoting the number of heads. The probability mass function is
X= x 0 1 2 3
P(X=x) 1/8 3/8 3/8 1/8

St. Joseph’s College of Engineering 1


6. A fair die is thrown twice and the scores on the upper faces are summed. Find the expected value of the sum.
Solution:
Let X be the sum of the two scores. X takes the values 2, 3, 4, ..., 12.
X = x      2     3     4     5     6     7     8     9     10    11    12
P(X = x)   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
$E(X) = \sum_{x=2}^{12} x P(X = x) = \frac{2(1) + 3(2) + 4(3) + 5(4) + 6(5) + 7(6) + 8(5) + 9(4) + 10(3) + 11(2) + 12(1)}{36}$
$= \frac{252}{36} = 7$.
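The table above can be rebuilt by enumerating all 36 equally likely outcomes. The sketch below assumes a fair six-sided die; the helper name `sum_distribution` is illustrative only.

```python
from fractions import Fraction
from itertools import product

def sum_distribution(faces=6):
    """PMF of the sum of two fair dice, built by listing every outcome."""
    pmf = {}
    for a, b in product(range(1, faces + 1), repeat=2):
        pmf[a + b] = pmf.get(a + b, 0) + Fraction(1, faces * faces)
    return pmf

pmf = sum_distribution()
expected = sum(x * p for x, p in pmf.items())
print(expected)  # 7
```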
7. Find the value of k for a continuous random variable X whose probability density function is given by $f(x) = k x^2 e^{-x}$; $x \ge 0$.
Solution:
Since X is a continuous random variable, $f(x) \ge 0$ for $x \ge 0$ and $\int_0^\infty f(x)\,dx = 1$.
$k \int_0^\infty x^2 e^{-x}\,dx = 1 \Rightarrow k\left[(x^2)(-e^{-x}) - (2x)(e^{-x}) + (2)(-e^{-x})\right]_0^\infty = 1$
$\Rightarrow k[0 - (-2)] = 1 \Rightarrow 2k = 1 \Rightarrow k = \frac{1}{2}$.
8. A random variable X has the p.d.f. $f(x) = a(1 + x^2)$ for $2 < x < 5$ and $f(x) = 0$ otherwise. Find a and P(X < 4). (Nov/Dec 2023)
Solution:
Since X is a continuous random variable, $f(x) \ge 0$ for all x and $\int_{-\infty}^{\infty} f(x)\,dx = 1$.
$\int_2^5 a(1 + x^2)\,dx = 1 \Rightarrow a\left[x + \frac{x^3}{3}\right]_2^5 = 1$
$\Rightarrow a\left[\left(5 + \frac{5^3}{3}\right) - \left(2 + \frac{2^3}{3}\right)\right] = 1 \Rightarrow 42a = 1 \Rightarrow a = \frac{1}{42}$.
$P(X < 4) = \int_2^4 f(x)\,dx = \frac{1}{42}\int_2^4 (1 + x^2)\,dx = \frac{1}{42}\left[x + \frac{x^3}{3}\right]_2^4 = \frac{1}{42}\left[\left(4 + \frac{4^3}{3}\right) - \left(2 + \frac{2^3}{3}\right)\right] = \frac{31}{63}$.
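The arithmetic in problem 8 can be reproduced exactly by evaluating the antiderivative $x + x^3/3$ with rational numbers. This is a small sketch; the name `antiderivative` is an illustrative choice.

```python
from fractions import Fraction

def antiderivative(x):
    """Antiderivative of 1 + x^2, that is x + x^3/3, evaluated exactly."""
    x = Fraction(x)
    return x + x ** 3 / 3

# Normalisation over 2 < x < 5 fixes the constant a.
total = antiderivative(5) - antiderivative(2)
a = 1 / total
# P(X < 4) is the same integral taken only up to 4.
p_less_than_4 = a * (antiderivative(4) - antiderivative(2))
print(a, p_less_than_4)  # 1/42 31/63
```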


9. The CDF of a continuous random variable is given by $F(x) = 0$ for $x < 0$ and $F(x) = 1 - e^{-x/5}$ for $x \ge 0$. Find the PDF of X and the mean of X.
Solution:
The pdf is the derivative of the cdf:
$f(x) = \frac{d}{dx}F(x) = 0$ for $x < 0$, and $f(x) = \frac{1}{5}e^{-x/5}$ for $x \ge 0$.
$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \frac{1}{5}\int_0^\infty x e^{-x/5}\,dx = \frac{1}{5}\left[(x)\left(\frac{e^{-x/5}}{-1/5}\right) - (1)\left(\frac{e^{-x/5}}{1/25}\right)\right]_0^\infty = \frac{25}{5} = 5$.

10. Let X be a random variable with E(X) = 1, E[X(X – 1)] = 4. Find Var(X) and Var(2 – 3X).
Solution:
$E[X(X-1)] = 4 \Rightarrow E[X^2 - X] = 4$
$\Rightarrow E[X^2] - E[X] = 4 \Rightarrow E[X^2] - 1 = 4 \Rightarrow E[X^2] = 5$
$Var(X) = E[X^2] - (E[X])^2 = 5 - 1 = 4$
$Var(2 - 3X) = (-3)^2\,Var(X) = 9 \times 4 = 36$, since $Var(aX + b) = a^2\,Var(X)$.


11. Let $M_X(t) = \frac{1}{1-t}$, $|t| < 1$, be the MGF of a R.V. X. Find the MGF of Y = 2X + 1.
Solution:
$M_Y(t) = M_{2X+1}(t) = e^t M_X(2t)$, since $M_{aX+b}(t) = e^{bt} M_X(at)$,
$= e^t \left[\frac{1}{1-t}\right]_{t \to 2t} = \frac{e^t}{1 - 2t}$.
12. If a random variable X has the moment generating function $M_X(t) = \frac{3}{3-t}$, compute $E[X^2]$.
Solution:
$M_X(t) = \frac{3}{3-t} = \frac{3}{3\left(1 - \frac{t}{3}\right)} = \left(1 - \frac{t}{3}\right)^{-1} = 1 + \frac{t}{3} + \left(\frac{t}{3}\right)^2 + \left(\frac{t}{3}\right)^3 + \cdots$
$= 1 + \frac{1}{3}\cdot\frac{t}{1!} + \frac{2}{9}\cdot\frac{t^2}{2!} + \frac{2}{9}\cdot\frac{t^3}{3!} + \cdots$
$E(X^r) = \mu_r' =$ coefficient of $\frac{t^r}{r!}$ in $M_X(t)$; hence $E(X^2) = \mu_2' =$ coefficient of $\frac{t^2}{2!}$ in $M_X(t) = \frac{2}{9}$.
13. A continuous RV X has the pdf $f(x) = \frac{x^2 e^{-x}}{2}$, $x > 0$. Find the rth order moment of X about the origin.
Solution:
$\mu_r' = E[X^r] = \int_{-\infty}^{\infty} x^r f(x)\,dx = \int_0^\infty x^r \frac{x^2 e^{-x}}{2}\,dx = \frac{1}{2}\int_0^\infty x^{r+2} e^{-x}\,dx = \frac{1}{2}\int_0^\infty x^{(r+3)-1} e^{-x}\,dx$
$= \frac{1}{2}\Gamma(r+3)$, since $\int_0^\infty x^{n-1} e^{-x}\,dx = \Gamma(n)$,
$= \frac{1}{2}(r+2)!$, since $\Gamma(n) = (n-1)!$ when n is a positive integer.
14. For a binomial variate X, find p when n = 6 and 9 P(X = 4) = P(X = 2).
Solution:
By the binomial distribution, $P(X = x) = nC_x\, p^x q^{n-x}$.
Given 9 P(X = 4) = P(X = 2), n = 6:
$9 \times 6C_4\, p^4 q^2 = 6C_2\, p^2 q^4$
Since $6C_4 = 6C_2 = 15$, this gives $9p^2 = q^2 = (1-p)^2 = 1 - 2p + p^2 \Rightarrow 8p^2 + 2p - 1 = 0$
$\Rightarrow p = \frac{1}{4}$ or $p = -\frac{1}{2}$. Since p is positive, $p = \frac{1}{4}$.

15. If X and Y are independent binomial variates following $B\left(5, \frac{1}{2}\right)$ and $B\left(7, \frac{1}{2}\right)$ respectively, find P(X + Y = 3).
Solution:
By the additive property, X + Y is also a binomial variable with parameters $n_1 + n_2 = 12$ and $p = \frac{1}{2}$.
$\therefore X + Y \sim B\left(12, \frac{1}{2}\right)$; $P(Z = z) = nC_z\, p^z q^{n-z}$; $z = 0, 1, 2, \ldots, n$
$\therefore P(X + Y = 3) = 12C_3 \left(\frac{1}{2}\right)^3 \left(\frac{1}{2}\right)^9 = \frac{55}{2^{10}}$.
16. One percent of jobs arriving at a computer system need to wait until weekends for scheduling, owing to core-size limitations. Find the probability that there are no such jobs among a sample of 200 jobs.
Solution:
Let X be the Poisson random variable denoting the number of jobs that have to wait.
p = 1% = 0.01, n = 200, $\lambda = np = (200)(0.01) = 2$.
By the Poisson distribution, $P(X = x) = \frac{e^{-\lambda}\lambda^x}{x!}$, $x = 0, 1, 2, \ldots$
$\Rightarrow P(X = 0) = \frac{e^{-2}\, 2^0}{0!} = e^{-2} = 0.1353$.
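The Poisson value above is an approximation to the underlying binomial count with n = 200 and p = 0.01. The sketch below compares the two; the helper names `poisson_pmf` and `binomial_pmf` are illustrative and not taken from the notes.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial variable with n trials and success probability p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

approx = poisson_pmf(0, 200 * 0.01)  # Poisson approximation, lambda = np = 2
exact = binomial_pmf(0, 200, 0.01)   # exact binomial value, 0.99 ** 200
print(round(approx, 4), round(exact, 4))
```

The two values differ only in the third decimal place, which is why the Poisson form is acceptable for large n and small p.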
17. If the probability that a target is destroyed on any one shot is 0.5, find the probability that it
would be destroyed on 6th attempt.
Solution:
Given that the probability that a target is destroyed on any one shot is 0.5, p = 0.5 and q = 1 − p = 0.5.
By the geometric distribution, $P(X = x) = q^{x-1} p$, $x = 1, 2, 3, \ldots$
$P(X = 6) = (0.5)^{6-1}(0.5) = (0.5)^6 = 0.0156$.
18. If X is a uniformly distributed R.V. with mean 1 and variance $\frac{4}{3}$, find P(X < 0).
Solution:
Mean $= \frac{a+b}{2} = 1 \Rightarrow a + b = 2$ -------(1); variance $= \frac{(b-a)^2}{12} = \frac{4}{3} \Rightarrow b - a = 4$ -------(2)
(1) + (2): $2b = 6 \Rightarrow b = 3$; (1) – (2): $2a = -2 \Rightarrow a = -1$.
The probability density function of the uniform distribution is
$f(x) = \frac{1}{b-a}$ for $a < x < b$ and 0 otherwise, so here $f(x) = \frac{1}{4}$ for $-1 < x < 3$ and 0 otherwise.
$P(X < 0) = \int_{-1}^{0} f(x)\,dx = \int_{-1}^{0} \frac{1}{4}\,dx = \frac{1}{4}\left[x\right]_{-1}^{0} = \frac{1}{4}$.
19. Suppose the length of life of an appliance has an exponential distribution with mean 10 years. What is the probability that the average life time of a random sample of the appliance is at least 10.5 years?
Solution:
Mean of the exponential distribution $= E(X) = \frac{1}{\lambda} = 10$
$\Rightarrow \lambda = \frac{1}{10}$, $f(x) = \lambda e^{-\lambda x}$, $x \ge 0 \Rightarrow f(x) = \frac{1}{10} e^{-x/10}$, $x \ge 0$.
$P(X \ge 10.5) = \int_{10.5}^{\infty} f(x)\,dx = \int_{10.5}^{\infty} \frac{1}{10} e^{-x/10}\,dx = e^{-1.05} = 0.3499$.
20. If X is a normal variate with mean = 20 and S.D. = 10, find $P(15 \le X \le 40)$. (April/May 2022)
Solution: X follows N(20, 10), so $\mu = 20$ and $\sigma = 10$. Let $Z = \frac{X - \mu}{\sigma}$ be the standard normal variate.
$P(15 \le X \le 40) = P\left(\frac{15 - 20}{10} \le Z \le \frac{40 - 20}{10}\right) = P(-0.5 \le Z \le 2)$
$= P(-0.5 \le Z \le 0) + P(0 \le Z \le 2)$
$= P(0 \le Z \le 0.5) + P(0 \le Z \le 2) = 0.1915 + 0.4772 = 0.6687$.
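The table lookups can be cross-checked with the error function, using $\Phi(z) = \frac{1}{2}\left[1 + \operatorname{erf}(z/\sqrt{2})\right]$. This is a sketch under that standard identity; the function names are chosen here for illustration.

```python
import math

def std_normal_cdf(z):
    """Standard normal CDF written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_interval_prob(lo, hi, mean, sd):
    """P(lo <= X <= hi) for X normal with the given mean and standard deviation."""
    return std_normal_cdf((hi - mean) / sd) - std_normal_cdf((lo - mean) / sd)

p = normal_interval_prob(15, 40, mean=20, sd=10)
print(round(p, 4))  # 0.6687
```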


PART-B
1. The contents of urns I, II, III are as follows:
1 white, 2 black and 3 red balls;
2 white, 1 black and 1 red balls;
4 white, 5 black and 3 red balls. One urn is chosen at random and two balls are drawn. They happen to be white and red. What is the probability that they come from urns I, II and III?
Solution:
Let $B_1, B_2, B_3$ be the events that urn I, II and III are chosen respectively.
Let A be the event of getting 1 white and 1 red ball.
Urn    $P(B_i)$    $P(A/B_i)$                                             $P(B_i)P(A/B_i)$
I      1/3         $\frac{1C_1 \times 3C_1}{6C_2} = \frac{1}{5}$          $\frac{1}{3} \times \frac{1}{5} = 0.0667$
II     1/3         $\frac{2C_1 \times 1C_1}{4C_2} = \frac{1}{3}$          $\frac{1}{3} \times \frac{1}{3} = 0.1111$
III    1/3         $\frac{4C_1 \times 3C_1}{12C_2} = \frac{2}{11}$        $\frac{1}{3} \times \frac{2}{11} = 0.0606$
Total  $\sum_{i=1}^{3} P(B_i)P(A/B_i) = \frac{118}{495} = 0.2384$
By Bayes' theorem,
$P(B_i/A) = \frac{P(B_i)P(A/B_i)}{\sum_{j=1}^{3} P(B_j)P(A/B_j)}$, $i = 1, 2, 3$.
P[Urn I was chosen given that the balls are white and red]
$= P(B_1/A) = \frac{1/15}{118/495} = \frac{33}{118} = 0.2797$
P[Urn II was chosen given that the balls are white and red]
$= P(B_2/A) = \frac{1/9}{118/495} = \frac{55}{118} = 0.4661$
P[Urn III was chosen given that the balls are white and red]
$= P(B_3/A) = \frac{2/33}{118/495} = \frac{30}{118} = 0.2542$
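The Bayes computation for the three urns can be sketched with exact fractions. The function name `posterior` and the tuple layout (white, black, red) are assumptions made for this illustration.

```python
from fractions import Fraction
from math import comb

def posterior(priors, likelihoods):
    """Bayes' theorem: normalise prior * likelihood over all hypotheses."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]

# Each urn is described as (white, black, red).
urns = [(1, 2, 3), (2, 1, 1), (4, 5, 3)]
# P(one white and one red | urn) = (white * red) / C(total, 2).
likelihoods = [Fraction(w * r, comb(w + b + r, 2)) for w, b, r in urns]
priors = [Fraction(1, 3)] * 3
post = posterior(priors, likelihoods)
print([float(round(p, 4)) for p in post])  # [0.2797, 0.4661, 0.2542]
```

Working with fractions avoids the small drift that appears when the already rounded values 0.0667, 0.1111 and 0.0606 are divided by 0.2384.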

2. The discrete random variable X has the following probability mass function:
P(x) = b for x = 0; 2b for x = 1; 3b for x = 2; 0 otherwise.
(a) Find the value of b, (b) Determine P(X < 2), $P(X \le 2)$ and P(0 < X < 2),
(c) Determine the distribution function of X.
Solution:
(a) To find the value of b:
We know that $\sum_i P(X = x_i) = 1 \Rightarrow b + 2b + 3b = 1 \Rightarrow 6b = 1 \Rightarrow b = \frac{1}{6}$.
X = x      0     1     2
P(X = x)   1/6   2/6   3/6
(b) $P(X < 2) = P(X = 0) + P(X = 1) = \frac{1}{6} + \frac{2}{6} = \frac{3}{6}$
$P(X \le 2) = \frac{1}{6} + \frac{2}{6} + \frac{3}{6} = 1$
$P(0 < X < 2) = P(X = 1) = \frac{2}{6}$
(c) The distribution function of X is
X = x      0     1     2
P(X = x)   1/6   2/6   3/6
F(x)       1/6   3/6   6/6 = 1
3. A discrete random variable has the following probability distribution.
X = x      0    1    2    3    4    5     6     7     8
P(X = x)   a    3a   5a   7a   9a   11a   13a   15a   17a
Find (i) the value of a, (ii) P(X > 3), (iii) the distribution function of X, (iv) P(1.5 < X < 4.5 / X > 2), (v) the mean. (Nov/Dec 2023)
Solution:
(i) To find the value of a:
We know that $\sum_i P(X = x_i) = 1$
a + 3a + 5a + 7a + 9a + 11a + 13a + 15a + 17a = 1
$81a = 1 \Rightarrow a = \frac{1}{81}$
X = x      0      1      2      3      4      5       6       7       8
P(X = x)   1/81   3/81   5/81   7/81   9/81   11/81   13/81   15/81   17/81
(ii) To find P(X > 3):
P(X > 3) = P(X = 4) + P(X = 5) + P(X = 6) + P(X = 7) + P(X = 8)
$= \frac{9}{81} + \frac{11}{81} + \frac{13}{81} + \frac{15}{81} + \frac{17}{81} = \frac{65}{81}$
(iii) Distribution function of X:
X    P(X = x)   F(x)
0    1/81       1/81
1    3/81       4/81
2    5/81       9/81
3    7/81       16/81
4    9/81       25/81
5    11/81      36/81
6    13/81      49/81
7    15/81      64/81
8    17/81      81/81 = 1
(iv) $P(1.5 < X < 4.5 / X > 2) = \frac{P(1.5 < X < 4.5 \cap X > 2)}{P(X > 2)} = \frac{P(X = 3, 4)}{P(X = 3, 4, 5, 6, 7, 8)}$
$= \frac{\frac{7}{81} + \frac{9}{81}}{\frac{7}{81} + \frac{9}{81} + \frac{11}{81} + \frac{13}{81} + \frac{15}{81} + \frac{17}{81}} = \frac{16}{72} = \frac{2}{9}$
(v) Mean:
$E(X) = \sum x P(X = x)$
$= 0 \times P(X = 0) + 1 \times P(X = 1) + 2 \times P(X = 2) + 3 \times P(X = 3) + 4 \times P(X = 4) + 5 \times P(X = 5) + 6 \times P(X = 6) + 7 \times P(X = 7) + 8 \times P(X = 8)$
$= \frac{0 + 3 + 10 + 21 + 36 + 55 + 78 + 105 + 136}{81} = \frac{444}{81} = 5.4815$
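Each part of this problem is a sum over the same table, so it can be checked with a small dictionary-based PMF. The helper `prob` below is a hypothetical name used only for this sketch.

```python
from fractions import Fraction

# P(X = x) = (2x + 1) a with a = 1/81, for x = 0, 1, ..., 8.
pmf = {x: Fraction(2 * x + 1, 81) for x in range(9)}

def prob(event):
    """Total probability of the values of X that satisfy `event`."""
    return sum(p for x, p in pmf.items() if event(x))

p_gt_3 = prob(lambda x: x > 3)
p_cond = prob(lambda x: 1.5 < x < 4.5 and x > 2) / prob(lambda x: x > 2)
mean = sum(x * p for x, p in pmf.items())
print(p_gt_3, p_cond, mean)  # 65/81 2/9 148/27
```

The mean prints as 148/27, which is 444/81 in lowest terms.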

4. Let X be a continuous random variable with the pdf $f(x) = \frac{4x(9 - x^2)}{81}$ for $0 \le x \le 3$ and $f(x) = 0$ otherwise. Find the first four moments about the origin.
Solution:
Let $\mu_r'$ be the rth moment about the origin.
$\mu_r' = E(X^r) = \int_{-\infty}^{\infty} x^r f(x)\,dx$
$= \int_0^3 x^r\, \frac{4x(9 - x^2)}{81}\,dx$
$= \frac{4}{81}\int_0^3 \left(9x^{r+1} - x^{r+3}\right) dx$
$= \frac{4}{81}\left[\frac{9x^{r+2}}{r+2} - \frac{x^{r+4}}{r+4}\right]_0^3$
$= \frac{4}{81}\left[\frac{9(3)^{r+2}}{r+2} - \frac{(3)^{r+4}}{r+4}\right]$
$= \frac{4}{81}\left[\frac{(3)^{r+4}}{r+2} - \frac{(3)^{r+4}}{r+4}\right]$
$\mu_r' = \frac{4(3)^{r+4}}{81}\left[\frac{1}{r+2} - \frac{1}{r+4}\right]$ -----(1)
Putting r = 1, 2, 3, 4 in (1), we get the first four moments about the origin as
$\mu_1' = 1.6$, $\mu_2' = 3$, $\mu_3' = 6.17$, $\mu_4' = 13.5$.
5. If the density function of a continuous random variable X is given by
$f(x) = \begin{cases} ax, & 0 \le x \le 1 \\ a, & 1 \le x \le 2 \\ 3a - ax, & 2 \le x \le 3 \\ 0, & \text{elsewhere} \end{cases}$
(a) Find the value of a, (b) find the c.d.f. of X, (c) find $P(X \le 1.5)$. (April/May 2022)
Solution:
(i) Since $\int_{-\infty}^{\infty} f(x)\,dx = 1$,
$\int_{-\infty}^{0} f(x)\,dx + \int_0^1 f(x)\,dx + \int_1^2 f(x)\,dx + \int_2^3 f(x)\,dx + \int_3^{\infty} f(x)\,dx = 1$
$\int_0^1 ax\,dx + \int_1^2 a\,dx + \int_2^3 (3a - ax)\,dx = 1$
$a\left[\frac{x^2}{2}\right]_0^1 + a\left[x\right]_1^2 + a\left[3x - \frac{x^2}{2}\right]_2^3 = 1 \Rightarrow 2a = 1 \Rightarrow a = \frac{1}{2}$
(ii) CDF:
If $x < 0$,
$F(x) = \int_{-\infty}^{x} f(x)\,dx = \int_{-\infty}^{x} 0\,dx = 0$
If $0 \le x < 1$,
$F(x) = \int_{-\infty}^{x} f(x)\,dx = \int_0^x \frac{x}{2}\,dx = \left[\frac{x^2}{4}\right]_0^x = \frac{x^2}{4}$
If $1 \le x < 2$,
$F(x) = \int_0^1 \frac{x}{2}\,dx + \int_1^x \frac{1}{2}\,dx = \left[\frac{x^2}{4}\right]_0^1 + \left[\frac{x}{2}\right]_1^x = \frac{x}{2} - \frac{1}{4}$
If $2 \le x < 3$,
$F(x) = \int_0^1 \frac{x}{2}\,dx + \int_1^2 \frac{1}{2}\,dx + \int_2^x \left(\frac{3}{2} - \frac{x}{2}\right) dx = \left[\frac{x^2}{4}\right]_0^1 + \left[\frac{x}{2}\right]_1^2 + \left[\frac{3x}{2} - \frac{x^2}{4}\right]_2^x = \frac{3x}{2} - \frac{x^2}{4} - \frac{5}{4}$
If $x \ge 3$,
$F(x) = \int_0^1 \frac{x}{2}\,dx + \int_1^2 \frac{1}{2}\,dx + \int_2^3 \left(\frac{3}{2} - \frac{x}{2}\right) dx + \int_3^x 0\,dx = \frac{1}{4} + \frac{1}{2} + \frac{1}{4} = 1$
$F_X(x) = \begin{cases} 0, & x < 0 \\ \frac{x^2}{4}, & 0 \le x < 1 \\ \frac{x}{2} - \frac{1}{4}, & 1 \le x < 2 \\ \frac{3x}{2} - \frac{x^2}{4} - \frac{5}{4}, & 2 \le x < 3 \\ 1, & x \ge 3 \end{cases}$
(iii) $P(X \le 1.5) = F(1.5)$, since $F(x) = P(X \le x)$,
$= \frac{1.5}{2} - \frac{1}{4} = 0.5$
6. A random variable X has the p.d.f. $f(x) = 2x$ for $0 < x < 1$ and $f(x) = 0$ otherwise. Find (i) $P\left(X < \frac{1}{2}\right)$, (ii) $P\left(\frac{1}{4} < X < \frac{1}{2}\right)$, (iii) $P\left(X > \frac{3}{4} / X > \frac{1}{2}\right)$.
Solution:
(i) $P\left(X < \frac{1}{2}\right) = \int_0^{1/2} f(x)\,dx = \int_0^{1/2} 2x\,dx = 2\left[\frac{x^2}{2}\right]_0^{1/2} = \frac{1}{4}$
(ii) $P\left(\frac{1}{4} < X < \frac{1}{2}\right) = \int_{1/4}^{1/2} f(x)\,dx = \int_{1/4}^{1/2} 2x\,dx = 2\left[\frac{x^2}{2}\right]_{1/4}^{1/2} = \frac{1}{4} - \frac{1}{16} = \frac{3}{16}$
(iii) $P\left(X > \frac{3}{4} / X > \frac{1}{2}\right) = \frac{P\left(X > \frac{3}{4} \cap X > \frac{1}{2}\right)}{P\left(X > \frac{1}{2}\right)} = \frac{P\left(X > \frac{3}{4}\right)}{P\left(X > \frac{1}{2}\right)}$
$P\left(X > \frac{3}{4}\right) = 1 - P\left(X \le \frac{3}{4}\right) = 1 - \int_0^{3/4} 2x\,dx = 1 - 2\left[\frac{x^2}{2}\right]_0^{3/4} = 1 - \frac{9}{16} = \frac{7}{16}$
$P\left(X > \frac{1}{2}\right) = 1 - P\left(X \le \frac{1}{2}\right) = 1 - \frac{1}{4} = \frac{3}{4}$
$P\left(X > \frac{3}{4} / X > \frac{1}{2}\right) = \frac{7/16}{3/4} = \frac{7}{12}$
7. Derive the moment generating function, mean and variance of the binomial distribution.
A random variable X is said to follow a binomial distribution if it assumes only non-negative values with probability mass function
$P(X = x) = nC_x\, p^x q^{n-x}$, where x = 0, 1, 2, ..., n and q = 1 – p.
MGF, mean and variance of the binomial distribution:
$M_X(t) = E(e^{tX}) = \sum_{x=0}^{n} e^{tx} p(x)$
$= \sum_{x=0}^{n} e^{tx}\, nC_x\, p^x q^{n-x}$
$= \sum_{x=0}^{n} nC_x \left(pe^t\right)^x q^{n-x}$
MGF $= M_X(t) = \left(pe^t + q\right)^n$
Mean $= E(X) = \left[\frac{d}{dt} M_X(t)\right]_{t=0}$
$= \left[n\left(pe^t + q\right)^{n-1} pe^t\right]_{t=0}$
$= np\,(p + q)^{n-1}$
Mean $= np$, since p + q = 1.
$E(X^2) = \left[\frac{d^2}{dt^2} M_X(t)\right]_{t=0}$
$= \left[\frac{d}{dt}\left\{np\,e^t \left(pe^t + q\right)^{n-1}\right\}\right]_{t=0}$
$= np\left[e^t (n-1)\left(pe^t + q\right)^{n-2} pe^t + \left(pe^t + q\right)^{n-1} e^t\right]_{t=0}$
$= np\left[(n-1)(p + q)^{n-2}\, p + (p + q)^{n-1}\right]$
$= np\left[(n-1)p + 1\right]$, since p + q = 1,
$= np\left[np - p + 1\right]$
$= np(np + q)$
$E(X^2) = n^2p^2 + npq$
Variance $= E(X^2) - [E(X)]^2 = n^2p^2 + npq - n^2p^2 = npq \Rightarrow$ Variance $= npq$
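The results Mean = np and Variance = npq can be confirmed for any particular n and p by summing over the PMF directly. The sketch below uses n = 6 and p = 1/4 as an arbitrary test case; `binomial_moments` is an illustrative name.

```python
from fractions import Fraction
from math import comb

def binomial_moments(n, p):
    """Mean and variance of B(n, p), computed by summing over the PMF."""
    q = 1 - p
    pmf = [comb(n, x) * p ** x * q ** (n - x) for x in range(n + 1)]
    mean = sum(x * px for x, px in enumerate(pmf))
    second_moment = sum(x * x * px for x, px in enumerate(pmf))
    return mean, second_moment - mean ** 2

n, p = 6, Fraction(1, 4)
mean, variance = binomial_moments(n, p)
print(mean, variance)  # 3/2 9/8
```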

8. Messages arrive at a switchboard in a Poisson manner at an average rate of six per hour. Find the probability for each of the following events: (a) exactly two messages arrive within one hour, (b) no message arrives within one hour, (c) at least three messages arrive within one hour.
Solution:
Let X be the number of messages arriving at the switchboard in one hour.
By the Poisson distribution, $P(X = x) = \frac{e^{-\lambda}\lambda^x}{x!}$, $x = 0, 1, 2, \ldots$
Here $\lambda = 6$.
(i) P[exactly two messages arrive within one hour] $= P(X = 2) = \frac{e^{-6}(6)^2}{2!} = 0.0446$
(ii) P[no message arrives within one hour] $= P(X = 0) = \frac{e^{-6}(6)^0}{0!} = 0.00248$
(iii) P[at least three messages arrive within one hour] $= P(X \ge 3)$
$= 1 - P(X < 3) = 1 - \left[\frac{e^{-6}(6)^0}{0!} + \frac{e^{-6}(6)^1}{1!} + \frac{e^{-6}(6)^2}{2!}\right] = 1 - 25e^{-6} = 0.9380$

9. Find the mean, variance and MGF of the exponential distribution. Also prove the lack of memory property of the exponential distribution. (APR / MAY 2019)
Solution:
We know that $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$ and $f(x) = 0$ otherwise.
$M_X(t) = E(e^{tX}) = \int_0^\infty e^{tx} f(x)\,dx = \lambda\int_0^\infty e^{-\lambda x} e^{tx}\,dx$
$= \lambda\int_0^\infty e^{-x(\lambda - t)}\,dx$
$= \lambda\left[\frac{e^{-x(\lambda - t)}}{-(\lambda - t)}\right]_0^\infty = \frac{\lambda}{\lambda - t}$, for $t < \lambda$.
Mean $= \mu_1' = \left[\frac{d}{dt} M_X(t)\right]_{t=0} = \left[\frac{\lambda}{(\lambda - t)^2}\right]_{t=0} = \frac{1}{\lambda}$
$\mu_2' = \left[\frac{d^2}{dt^2} M_X(t)\right]_{t=0} = \left[\frac{\lambda(2)}{(\lambda - t)^3}\right]_{t=0} = \frac{2}{\lambda^2}$
Variance $= \mu_2' - \left(\mu_1'\right)^2 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}$.
MEMORYLESS PROPERTY
Statement: If X is exponentially distributed with parameter $\lambda$, then for any two positive real numbers s and t, $P(X > s + t / X > s) = P(X > t)$, $s, t > 0$.
Proof:
The p.d.f. of X is $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$ and $f(x) = 0$ otherwise.
$\therefore P(X > k) = \int_k^\infty \lambda e^{-\lambda x}\,dx = \left[-e^{-\lambda x}\right]_k^\infty = e^{-\lambda k}$
$\therefore P(X > s + t / X > s) = \frac{P(X > s + t \cap X > s)}{P(X > s)}$
$= \frac{P(X > s + t)}{P(X > s)} = \frac{e^{-\lambda(s + t)}}{e^{-\lambda s}} = e^{-\lambda t} = P(X > t)$

10. Let X be a uniformly distributed R.V. over [-5, 5]. Determine (1) $P(X \le 2)$, (2) $P(|X| \le 2)$, (3) the CDF of X, (4) Var(X).
Solution:
The R.V. X ~ U[-5, 5].
The p.d.f. is
$f(x) = \frac{1}{b - a}$ for $a < x < b$ and 0 otherwise, so
$f(x) = \frac{1}{10}$ for $-5 < x < 5$ and 0 otherwise.
(1) $P(X \le 2) = \int_{-5}^{2} f(x)\,dx = \int_{-5}^{2} \frac{1}{10}\,dx = \frac{1}{10}\left[x\right]_{-5}^{2} = \frac{1}{10}[2 + 5] = \frac{7}{10}$
(2) $P(|X| \le 2) = P(-2 \le X \le 2) = \int_{-2}^{2} f(x)\,dx = \int_{-2}^{2} \frac{1}{10}\,dx = \frac{1}{10}\left[x\right]_{-2}^{2} = \frac{2 + 2}{10} = \frac{4}{10} = \frac{2}{5}$
(3) Cumulative distribution function of X:
If $x < -5$,
$F(x) = \int_{-\infty}^{x} f(x)\,dx = \int_{-\infty}^{x} 0\,dx = 0$
If $-5 \le x < 5$,
$F(x) = \int_{-5}^{x} f(x)\,dx = \int_{-5}^{x} \frac{1}{10}\,dx = \frac{1}{10}\left[x\right]_{-5}^{x} = \frac{x + 5}{10}$
If $x \ge 5$,
$F(x) = \int_{-5}^{5} f(x)\,dx + \int_{5}^{x} f(x)\,dx = \int_{-5}^{5} \frac{1}{10}\,dx + 0 = \frac{1}{10}\left[x\right]_{-5}^{5} = \frac{5 + 5}{10} = 1$
$F(x) = \begin{cases} 0, & x < -5 \\ \frac{x + 5}{10}, & -5 \le x < 5 \\ 1, & x \ge 5 \end{cases}$
(4) $Var(X) = \frac{(b - a)^2}{12} = \frac{(5 - (-5))^2}{12} = \frac{100}{12} = \frac{25}{3}$.
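11. Let $P(X = x) = \frac{3}{4}\left(\frac{1}{4}\right)^{x-1}$, x = 1, 2, 3, ..., be the probability mass function of the RV X. Compute (a) P(X > 4 / X > 2), (b) the mean and variance.
Solution:
(a) To find P(X > 4 / X > 2):
X follows a geometric distribution with $p = \frac{3}{4}$ and $q = \frac{1}{4}$. By the memoryless property, P(X > 4 / X > 2) = P(X > 2).
$P(X > 2) = \sum_{x=3}^{\infty} \frac{3}{4}\left(\frac{1}{4}\right)^{x-1} = \frac{3}{4}\left(\frac{1}{4}\right)^2\left[1 + \frac{1}{4} + \left(\frac{1}{4}\right)^2 + \cdots\right]$
$= \frac{3}{4}\left(\frac{1}{4}\right)^2\left[1 - \frac{1}{4}\right]^{-1} = \frac{3}{4}\left(\frac{1}{4}\right)^2\left(\frac{4}{3}\right) = \left(\frac{1}{4}\right)^2 = \frac{1}{16}$
(b) To find the mean and variance:
For the geometric distribution $P(X = x) = q^{x-1}p$, x = 1, 2, 3, ..., the mean is $\frac{1}{p}$ and the variance is $\frac{q}{p^2}$.
Mean $= \frac{1}{3/4} = \frac{4}{3}$, Variance $= \frac{1/4}{(3/4)^2} = \frac{4}{9}$.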

12. Buses arrive at a specified stop at 15-minute intervals starting at 7 a.m., that is, they arrive at 7, 7.15, 7.30, 7.45 and so on. If a passenger arrives at the stop at a random time that is uniformly distributed between 7 and 7.30, find the probability that he waits (i) less than 5 minutes for a bus, (ii) more than 10 minutes for a bus.
Solution:
Let X denote the number of minutes past 7.00 a.m. at which the passenger arrives at the stop.
X ~ U[0, 30] $\Rightarrow f(x) = \frac{1}{30}$, $0 \le x \le 30$
(i) The passenger waits less than 5 minutes if he arrives between 7.10 and 7.15 or between 7.25 and 7.30.
P(he has to wait for the bus for less than 5 minutes) $= P[(10 < X < 15) \cup (25 < X < 30)]$
$= P(10 < X < 15) + P(25 < X < 30)$
$= \int_{10}^{15} f(x)\,dx + \int_{25}^{30} f(x)\,dx$
$= \int_{10}^{15} \frac{1}{30}\,dx + \int_{25}^{30} \frac{1}{30}\,dx$
$= \frac{1}{30}\left\{\left[x\right]_{10}^{15} + \left[x\right]_{25}^{30}\right\} = \frac{10}{30} = 0.3333$
(ii) The passenger waits more than 10 minutes if he arrives between 7 and 7.05 or between 7.15 and 7.20.
P(he has to wait for the bus for more than 10 minutes) $= P[(0 < X < 5) \cup (15 < X < 20)]$
$= P(0 < X < 5) + P(15 < X < 20)$
$= \int_{0}^{5} f(x)\,dx + \int_{15}^{20} f(x)\,dx$
$= \int_{0}^{5} \frac{1}{30}\,dx + \int_{15}^{20} \frac{1}{30}\,dx$
$= \frac{1}{30}\left\{\left[x\right]_{0}^{5} + \left[x\right]_{15}^{20}\right\} = \frac{10}{30} = 0.3333$
13. A component has an exponential time to failure distribution with mean of 10,000 hours.
(a) The component has already been in operation for its mean life. What is the probability that it will fail by 15,000 hours? (b) At 15,000 hours the component is still in operation. What is the probability that it will operate for another 5000 hours?
Solution:
Let X be the random variable denoting the time to failure of the component, following an exponential distribution with mean = 10,000 hours.
$\frac{1}{\lambda} = 10000 \Rightarrow \lambda = \frac{1}{10000}$
The p.d.f. of X is $f(x) = \frac{1}{10000} e^{-x/10000}$ for $x \ge 0$ and $f(x) = 0$ otherwise.
(a) Probability that the component will fail by 15,000 hours given that it has already been in operation for its mean life
$= P(X < 15000 / X > 10000) = \frac{P(10000 < X < 15000)}{P(X > 10000)}$ --------(1)
$P(10000 < X < 15000) = \int_{10000}^{15000} \frac{1}{10000} e^{-x/10000}\,dx = \left[-e^{-x/10000}\right]_{10000}^{15000} = e^{-1} - e^{-1.5}$ ----(2)
$P(X > 10000) = \int_{10000}^{\infty} \frac{1}{10000} e^{-x/10000}\,dx = \left[-e^{-x/10000}\right]_{10000}^{\infty} = e^{-1}$ -------(3)
Substituting (2) and (3) in (1),
$P(X < 15000 / X > 10000) = \frac{e^{-1} - e^{-1.5}}{e^{-1}} = 1 - e^{-0.5} = 0.3935$.
(b) Probability that the component will operate for another 5000 hours given that it is still in operation at 15,000 hours
$= P(X > 20000 / X > 15000)$
$= P(X > 5000)$ [by the memoryless property]
$= \int_{5000}^{\infty} f(x)\,dx = \int_{5000}^{\infty} \frac{1}{10000} e^{-x/10000}\,dx = \left[-e^{-x/10000}\right]_{5000}^{\infty}$
$= e^{-0.5} = 0.6065$
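Both answers follow from the survival function $P(X > x) = e^{-x/10000}$, so they can be verified numerically. This is a sketch; `MEAN_LIFE` and `survival` are names introduced here for illustration.

```python
import math

MEAN_LIFE = 10_000.0  # hours, so lambda = 1 / MEAN_LIFE

def survival(x):
    """P(X > x) for the exponential time to failure."""
    return math.exp(-x / MEAN_LIFE)

# (a) fail by 15,000 hours given survival to 10,000 hours
p_fail = (survival(10_000) - survival(15_000)) / survival(10_000)
# (b) survive to 20,000 hours given survival to 15,000 hours
p_survive = survival(20_000) / survival(15_000)
print(round(p_fail, 4), round(p_survive, 4))  # 0.3935 0.6065
```

The two conditional probabilities add up to 1, as the memoryless property predicts: each depends only on the additional 5000 hours.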
14. The weights in pounds of parcels arriving at a package delivery company's warehouse can be modelled by an N(5, 16) normal random variable X. (a) What is the probability that a randomly selected parcel weighs between 1 and 10 pounds? (b) What is the probability that a randomly selected parcel weighs more than 9 pounds?
Solution:
Given X ~ $N(\mu, \sigma)$ where $\mu = 5$, $\sigma = 16$, $z = \frac{x - \mu}{\sigma} = \frac{x - 5}{16}$
(a) The standard normal z values corresponding to $x_1 = 1$, $x_2 = 10$ are
$z_1 = \frac{1 - 5}{16} = -0.25$ and $z_2 = \frac{10 - 5}{16} = 0.31$
$P(1 < X < 10) = P(-0.25 < Z < 0.31) = P(-0.25 < Z < 0) + P(0 < Z < 0.31)$
$= P(0 < Z < 0.25) + P(0 < Z < 0.31)$
$= 0.0987 + 0.1217 = 0.2204$
(b) The standard normal z value corresponding to x = 9 is $z = \frac{9 - 5}{16} = 0.25$
$P(X > 9) = P(Z > 0.25) = 0.5 - P(0 < Z < 0.25)$
$= 0.5 - 0.0987 = 0.4013$
15. An electrical firm manufactures light bulbs that have a life, before burn-out, that is normally distributed with mean equal to 800 hours and a standard deviation of 40 hours. Find the probability that a bulb burns between 778 and 834 hours.
Solution:
Given X ~ $N(\mu, \sigma)$ where $\mu = 800$, $\sigma = 40$, $z = \frac{x - \mu}{\sigma} = \frac{x - 800}{40}$
The standard normal z values corresponding to $x_1 = 778$, $x_2 = 834$ are
$z_1 = \frac{778 - 800}{40} = -0.55$ and $z_2 = \frac{834 - 800}{40} = 0.85$
$P(778 < X < 834) = P(-0.55 < Z < 0.85)$
$= P(-0.55 < Z < 0) + P(0 < Z < 0.85)$
$= P(0 < Z < 0.55) + P(0 < Z < 0.85)$
$= 0.2088 + 0.3023 = 0.5111$
Hence the probability that a bulb burns between 778 and 834 hours is 0.5111.
UNIT – II TWO DIMENSIONAL RANDOM VARIABLES
PART-A
1. The joint probability mass function of X and Y is
X\Y 0 1 2
0 0.1 0.04 0.02
1 0.08 0.2 0.06
2 0.06 0.14 0.3
Find the marginal distributions.
Solution:
X\Y 0 1 2 P(X = x)
0 0.1 0.04 0.02 0.16
1 0.08 0.2 0.06 0.34
2 0.06 0.14 0.3 0.5
P(Y = y) 0.24 0.38 0.38 1
The marginal distribution of X is
X          0      1      2
P(X = x)   0.16   0.34   0.5
The marginal distribution of Y is
Y          0      1      2
P(Y = y)   0.24   0.38   0.38

2. The two discrete random variables X and Y are independent and P(X = 0, Y = 0) = 2/9, P(X = 1, Y = 1) = 2/9, P(X = 0) = 3/9. Find the joint probability mass function of X and Y.
Solution:
X and Y each take the values 0 and 1.
The joint probability mass function of X and Y is
Y \ X      0      1      P(Y = y)
0          2/9    4/9    6/9
1          1/9    2/9    3/9
P(X = x)   3/9    6/9    1
where the entries are obtained as follows:
(i) Since X and Y are independent random variables,
$P(X = 0, Y = 0) = P(X = 0)\,P(Y = 0) \Rightarrow \frac{2}{9} = \frac{3}{9}\,P(Y = 0) \Rightarrow P(Y = 0) = \frac{2}{3} = \frac{6}{9}$
(ii) $P(X = 1, Y = 0) = \frac{6}{9} - \frac{2}{9} = \frac{4}{9}$, (iii) $P(Y = 1) = 1 - \frac{6}{9} = \frac{3}{9}$, (iv) $P(X = 0, Y = 1) = \frac{3}{9} - \frac{2}{9} = \frac{1}{9}$, (v) $P(X = 1) = 1 - \frac{3}{9} = \frac{6}{9}$
3. If the joint probability mass function of two discrete random variables X and Y is given by $p(x, y) = \frac{1}{36}xy$, where x = 1, 2, 3 and y = 1, 2, 3, find $P(X + Y \le 4)$.
Solution:
Y \ X    1       2       3       p(y)
1        1/36    2/36    3/36    6/36
2        2/36    4/36    6/36    12/36
3        3/36    6/36    9/36    18/36
p(x)     6/36    12/36   18/36   1
$P(X + Y \le 4) = P(1,1) + P(1,2) + P(1,3) + P(2,1) + P(2,2) + P(3,1)$
$= \frac{1}{36} + \frac{2}{36} + \frac{3}{36} + \frac{2}{36} + \frac{4}{36} + \frac{3}{36} = \frac{15}{36} = \frac{5}{12}$
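The joint table and the probability of the event can be generated directly from $p(x, y) = xy/36$. A minimal sketch, with variable names chosen only for this illustration:

```python
from fractions import Fraction

values = (1, 2, 3)
# Joint PMF p(x, y) = xy / 36 on the 3 x 3 grid.
joint = {(x, y): Fraction(x * y, 36) for x in values for y in values}

# Marginals are row and column sums of the joint table.
marginal_x = {x: sum(joint[(x, y)] for y in values) for x in values}
marginal_y = {y: sum(joint[(x, y)] for x in values) for y in values}

p_sum_at_most_4 = sum(p for (x, y), p in joint.items() if x + y <= 4)
print(p_sum_at_most_4)  # 5/12
```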
4. Find the value of k, if the joint density function of (X, Y) is given by
$f(x, y) = k(1 - x)(1 - y)$ for $0 < x < 4$, $1 < y < 5$, and $f(x, y) = 0$ otherwise.
Solution:
Given the joint pdf of (X, Y) is f(x, y) = k(1 – x)(1 – y), 0 < x < 4, 1 < y < 5.
$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1 \Rightarrow \int_1^5\int_0^4 k(1 - x)(1 - y)\,dx\,dy = 1 \Rightarrow k\int_1^5\int_0^4 (1 - x - y + xy)\,dx\,dy = 1$
$\Rightarrow k\int_1^5 \left[x - \frac{x^2}{2} - yx + y\frac{x^2}{2}\right]_0^4 dy = 1 \Rightarrow k\int_1^5 (-4 + 4y)\,dy = 1$
$\Rightarrow k\left[-4y + 4\frac{y^2}{2}\right]_1^5 = 1 \Rightarrow k(30 + 2) = 1 \Rightarrow k = \frac{1}{32}$

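5. The joint p.d.f. of the RV (X, Y) is given as $f(x, y) = k$ for $0 \le x, y \le 1$ and $f(x, y) = 0$ elsewhere. Find k.
Solution:
$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1 \Rightarrow \int_0^1\int_0^1 k\,dx\,dy = 1 \Rightarrow k\int_0^1 \left[x\right]_0^1 dy = 1 \Rightarrow k\int_0^1 dy = 1 \Rightarrow k = 1$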

6. The joint pdf of the random variable (X, Y) is given as $f(x, y) = \frac{1}{2}xe^{-y}$, $0 < x < 2$, $y > 0$. Calculate the marginal p.d.f. of X.
Solution:
The marginal p.d.f. of X is given by
$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_0^\infty \frac{1}{2}xe^{-y}\,dy = \frac{1}{2}x\int_0^\infty e^{-y}\,dy = \frac{1}{2}x\left[\frac{e^{-y}}{-1}\right]_0^\infty = -\frac{1}{2}x[0 - 1] = \frac{x}{2}$, $0 < x < 2$.
7. If $f(x, y) = 8xy$ for $0 < x < 1$, $0 < y < x$, and $f(x, y) = 0$ elsewhere, is the joint PDF of X and Y, find f(y/x).
Solution:
$f_X(x) = \int_y f(x, y)\,dy = \int_{y=0}^{x} 8xy\,dy = 8x\left[\frac{y^2}{2}\right]_0^x = 4x^3$, $0 < x < 1$
$f(y/x) = \frac{f(x, y)}{f_X(x)} = \frac{8xy}{4x^3} = \frac{2y}{x^2}$, $0 < y < x$, $0 < x < 1$
8. The joint probability density function of the bivariate random variable (X, Y) is given by
$f(x, y) = 4xy$ for $0 < x < 1$, $0 < y < 1$, and $f(x, y) = 0$ elsewhere. Find P(X + Y < 1).
Solution:
$P(X + Y < 1) = \int_0^1\int_0^{1-x} 4xy\,dy\,dx = \int_0^1 4x\left[\frac{y^2}{2}\right]_0^{1-x} dx = \int_0^1 2x(1 - x)^2\,dx$
$= 2\int_0^1 \left(x - 2x^2 + x^3\right) dx = 2\left[\frac{x^2}{2} - \frac{2x^3}{3} + \frac{x^4}{4}\right]_0^1 = 2\left[\frac{1}{2} - \frac{2}{3} + \frac{1}{4}\right] = \frac{1}{6}$
9. If the joint cumulative distribution function of X and Y is given by $F(x, y) = (1 - e^{-x})(1 - e^{-y})$, $x > 0$, $y > 0$, find P(1 < X < 2, 1 < Y < 2).
Solution:
The joint pdf is $f(x, y) = \frac{\partial^2 F}{\partial x\,\partial y} = \frac{\partial^2}{\partial x\,\partial y}\left[(1 - e^{-x})(1 - e^{-y})\right] = \frac{\partial}{\partial x}\left[(1 - e^{-x})\,e^{-y}\right] = e^{-x}e^{-y} = e^{-(x + y)}$, $x > 0$, $y > 0$
$P(1 < X < 2, 1 < Y < 2) = \int_1^2\int_1^2 f(x, y)\,dx\,dy = \int_1^2\int_1^2 e^{-(x + y)}\,dx\,dy = \int_1^2\int_1^2 e^{-x}e^{-y}\,dx\,dy$
$= \int_1^2 e^{-x}\,dx \cdot \int_1^2 e^{-y}\,dy = \left[-e^{-x}\right]_1^2 \left[-e^{-y}\right]_1^2 = \left(e^{-1} - e^{-2}\right)^2$
$= \left(\frac{1}{e} - \frac{1}{e^2}\right)^2 = \left(\frac{e - 1}{e^2}\right)^2 = 0.054$
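10. Let X and Y be two random variables having joint density function $f(x, y) = \frac{3}{2}(x^2 + y^2)$, $0 \le x \le 1$, $0 \le y \le 1$. Determine $P\left(X < \frac{1}{2}, Y > \frac{1}{2}\right)$.
Solution:
$P\left(X < \frac{1}{2}, Y > \frac{1}{2}\right) = \int_{x=0}^{1/2}\int_{y=1/2}^{1} f(x, y)\,dy\,dx = \frac{3}{2}\int_{x=0}^{1/2}\int_{y=1/2}^{1} \left(x^2 + y^2\right) dy\,dx$
$= \frac{3}{2}\int_0^{1/2} \left[x^2 y + \frac{y^3}{3}\right]_{1/2}^{1} dx = \frac{3}{2}\int_0^{1/2} \left[x^2\left(1 - \frac{1}{2}\right) + \frac{1}{3}\left(1 - \frac{1}{8}\right)\right] dx = \frac{3}{2}\int_0^{1/2} \left[\frac{x^2}{2} + \frac{7}{24}\right] dx$
$= \frac{3}{2}\left[\frac{x^3}{6} + \frac{7x}{24}\right]_0^{1/2} = \frac{3}{2}\left[\frac{1}{6}\cdot\frac{1}{8} + \frac{7}{24}\cdot\frac{1}{2}\right] = \frac{3}{2}\left[\frac{8}{48}\right] = \frac{1}{4}$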
11. If X and Y have joint p.d.f. $f(x, y) = e^{-(x + y)}$, $x > 0$, $y > 0$, check whether X and Y are independent.
Solution:
The marginal pdf of X is $f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_0^\infty e^{-(x + y)}\,dy = e^{-x}\left[\frac{e^{-y}}{-1}\right]_0^\infty = -e^{-x}[0 - 1] = e^{-x}$
The marginal pdf of Y is $f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \int_0^\infty e^{-(x + y)}\,dx = e^{-y}\left[\frac{e^{-x}}{-1}\right]_0^\infty = -e^{-y}[0 - 1] = e^{-y}$
Now, $f_X(x)\,f_Y(y) = e^{-x}e^{-y} = e^{-(x + y)} = f(x, y) \Rightarrow$ X and Y are independent.
12. If the joint distribution function of X and Y is given by
$F(x, y) = 1 - e^{-x} - e^{-y} + e^{-(x + y)}$ for $x > 0$, $y > 0$, and $F(x, y) = 0$ elsewhere, find the joint density function of X and Y.
Solution:
The joint p.d.f. of (X, Y) is given by
$f(x, y) = \frac{\partial^2}{\partial x\,\partial y}F(x, y) = \frac{\partial^2}{\partial x\,\partial y}\left(1 - e^{-x} - e^{-y} + e^{-(x + y)}\right) = \frac{\partial}{\partial x}\left(e^{-y} - e^{-(x + y)}\right) = e^{-(x + y)}$
$f(x, y) = e^{-(x + y)}$ for $x > 0$, $y > 0$, and $f(x, y) = 0$ elsewhere.
13. Let X and Y be two independent R.Vs with Var(X) = 9 and Var(Y) = 3. Find Var (4X – 2Y + 6)
Solution:
Var (4X – 2Y + 6) = 16 Var(X) + 4 Var(Y) = 16(9) + 4(3) = 156
14. If X has mean 4 and variance 9, while Y has mean –2 and variance 5 and the two are independent, find (a) E[XY], (b) $E[XY^2]$.
Solution:
Given E[X] = 4, E[Y] = –2, $\sigma_X^2 = 9$, $\sigma_Y^2 = 5$, and X and Y are independent.
(a) E[XY] = E[X] E[Y] = 4(–2) = –8
(b) $E[XY^2] = E[X]\,E[Y^2]$
$\sigma_Y^2 = E[Y^2] - (E[Y])^2 \Rightarrow 5 = E[Y^2] - 4 \Rightarrow E[Y^2] = 9$, so $E[XY^2] = 4(9) = 36$.
15. If Y = –2X + 3, find Cov(X, Y).
Solution:
Cov(X, Y) = E(XY) – E(X) E(Y)
$= E(X(-2X + 3)) - E(X)\,E(-2X + 3) = E(-2X^2 + 3X) - E(X)\{-2E(X) + 3\}$
$= -2E(X^2) + 3E(X) + 2(E(X))^2 - 3E(X) = 2(E(X))^2 - 2E(X^2) = -2\,Var(X)$
16. Define the coefficient of correlation between two random variables X and Y, and state its range.
Solution:
The coefficient of correlation between X and Y is $r_{XY} = \frac{Cov(X, Y)}{\sqrt{Var(X)}\sqrt{Var(Y)}}$, and the range of the correlation coefficient is $-1 \le r_{XY} \le 1$.
17. The correlation coefficient of two random variables X and Y is $-\frac{1}{4}$, while their variances are 3 and 5. Find the covariance.
Solution:
Given $r_{XY} = -\frac{1}{4}$, $\sigma_X^2 = 3$, $\sigma_Y^2 = 5$,
$r_{XY} = \frac{Cov(X, Y)}{\sigma_X\sigma_Y}$, $\sigma_X \ne 0$, $\sigma_Y \ne 0$
$-\frac{1}{4} = \frac{Cov(X, Y)}{\sqrt{3}\sqrt{5}} \Rightarrow Cov(X, Y) = -\frac{1}{4}\sqrt{3}\sqrt{5} = -0.968$
18. The regression equations are 3x + 2y = 26 and 6x + y = 31. Find the mean of X and Y.
Solution:
Regression lines pass through the mean values of X and Y.
Let 3x + 2y = 26 ------(1) 6x + y = 31 -------(2)
Multiply equation (2) by 2 and subtract the result from equation (1):
3x + 2y = 26
12x + 2y = 62
−−−−−−−−−−
−9x = −36 ⇒ x = 4
Substituting in equation (1), 3(4) + 2y = 26 ⇒ y = 7.
∴ The mean value of X is 4 and the mean value of Y is 7.
19. The two lines of regression are 4x – 5y + 33 = 0 and 20x – 9y = 107. Calculate the coefficient of
correlation between X and Y.
Solution:
4x – 5y + 33 = 0 --------- (1) 20x – 9y = 107 --------(2)
Let (1) be the regression line of Y on X and let (2) be the regression line of X on Y.
(1) ⇒ $y = \frac{4}{5}x + \frac{33}{5} \Rightarrow b_1 = \frac{4}{5}$
(2) ⇒ $x = \frac{9}{20}y + \frac{107}{20} \Rightarrow b_2 = \frac{9}{20}$
$r = \sqrt{b_1 b_2} = \sqrt{\frac{4}{5} \cdot \frac{9}{20}} = \sqrt{\frac{9}{25}} = \frac{3}{5} = 0.6 \le 1$
The positive root is taken because both regression coefficients are positive.
20. State central limit theorem
Statement:
If $X_1, X_2, \dots, X_n$ is a sequence of independent random variables with $E(X_i) = \mu_i$ and $\mathrm{Var}(X_i) = \sigma_i^2$, $i = 1, 2, \dots, n$, and if $S_n = X_1 + X_2 + \dots + X_n$, then under certain general conditions $S_n$ follows a Normal distribution with mean $\mu = \sum_{i=1}^{n} \mu_i$ and variance $\sigma^2 = \sum_{i=1}^{n} \sigma_i^2$ as $n \to \infty$.
PART-B
1. Determine the value of the constant c if the joint probability mass function of two discrete random variables X and Y is given by p(x,y) = cxy, where x = 1,2,3; y = 1,2,3. Also find the marginal distributions, the conditional distributions P(Y = yj / X = xi), and P(X + Y ≤ 4).
Solution:

Y \ X    1     2     3     p(y)
1        c     2c    3c    6c
2        2c    4c    6c    12c
3        3c    6c    9c    18c
p(x)     6c    12c   18c   36c

Since p(x,y) is the joint pmf of X and Y, p(x,y) ≥ 0 for all x, y and
$\sum_{x} \sum_{y} p(x, y) = 1 \Rightarrow 36c = 1 \Rightarrow c = \frac{1}{36}$

The marginal distribution of X is

X           1      2      3
P(X = x)    1/6    1/3    1/2

The marginal distribution of Y is

Y           1      2      3
P(Y = y)    1/6    1/3    1/2

The conditional distributions P(Y = yj / X = xi) are obtained from P(Y = yj / X = xi) = P(Y = yj, X = xi) / P(X = xi).

When y = 1 and x = 1, 2, 3:
P(Y = 1 / X = 1) = (1/36) / (1/6) = 1/6
P(Y = 1 / X = 2) = (2/36) / (1/3) = 1/6
P(Y = 1 / X = 3) = (3/36) / (1/2) = 1/6

When y = 2 and x = 1, 2, 3:
P(Y = 2 / X = 1) = (2/36) / (1/6) = 1/3
P(Y = 2 / X = 2) = (4/36) / (1/3) = 1/3
P(Y = 2 / X = 3) = (6/36) / (1/2) = 1/3

When y = 3 and x = 1, 2, 3:
P(Y = 3 / X = 1) = (3/36) / (1/6) = 1/2
P(Y = 3 / X = 2) = (6/36) / (1/3) = 1/2
P(Y = 3 / X = 3) = (9/36) / (1/2) = 1/2

P(X + Y ≤ 4) = P(X=1, Y=1) + P(X=1, Y=2) + P(X=1, Y=3) + P(X=2, Y=1) + P(X=2, Y=2) + P(X=3, Y=1)
= c + 2c + 3c + 2c + 4c + 3c
= 15c = 15/36
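The steps of Problem 1 (normalising constant, marginals, conditionals, and the probability of the event X + Y ≤ 4, i.e. the six pairs summed above) can be reproduced with exact fractions. This is a minimal sketch; the variable names are illustrative assumptions.

```python
from fractions import Fraction

# Support of the joint pmf p(x, y) = c * x * y from Problem 1.
xs, ys = [1, 2, 3], [1, 2, 3]

# Normalising constant: all probabilities must sum to 1.
c = Fraction(1, sum(x * y for x in xs for y in ys))
pmf = {(x, y): c * x * y for x in xs for y in ys}

# Marginals are the column and row sums of the joint table.
p_x = {x: sum(pmf[(x, y)] for y in ys) for x in xs}
p_y = {y: sum(pmf[(x, y)] for x in xs) for y in ys}

# Conditional pmf of Y given X = x, keyed as (y, x).
cond_y_given_x = {(y, x): pmf[(x, y)] / p_x[x] for x in xs for y in ys}

# Probability of the event X + Y <= 4.
p_event = sum(p for (x, y), p in pmf.items() if x + y <= 4)

print(c, p_x, p_y, p_event)
```

Using `Fraction` avoids rounding, so the results can be compared directly with the hand-computed table entries.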
2. The joint distribution of X and Y is given by $p(x, y) = \frac{x + y}{36}$, x = 1, 2, 3; y = 1, 2, 3. Find all the marginal distributions and the conditional probability distribution of Y given X = 2.
Solution:
Given $p(x, y) = \frac{x + y}{36}$, x = 1, 2, 3; y = 1, 2, 3

Y \ X     1       2       3       PY(y)
1         2/36    3/36    4/36    9/36
2         3/36    4/36    5/36    12/36
3         4/36    5/36    6/36    15/36
PX(x)     9/36    12/36   15/36   1

The marginal distribution of X is

X       1       2       3
P(x)    9/36    12/36   15/36

The marginal distribution of Y is

Y       1       2       3
p(y)    9/36    12/36   15/36

Conditional distribution of Y given X = 2:

Y                                                  1       2       3
P(Y = yi / X = 2) = P(X = 2, Y = yi) / P(X = 2)     3/12    4/12    5/12

3. Find the constant k such that
$f(x, y) = \begin{cases} k(x + 1)e^{-y}, & 0 < x < 1,\ y > 0 \\ 0, & \text{otherwise} \end{cases}$
is a joint p.d.f. of the continuous random variable (X,Y). Are X and Y independent R.Vs? Explain.
Solution:
To find k: Given that f(x,y) is the pdf of (X,Y),
f(x,y) ≥ 0 for all x, y and $\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$
$\Rightarrow \int_{0}^{\infty}\int_{0}^{1} k(x + 1)e^{-y}\,dx\,dy = 1$
$\Rightarrow k \int_{0}^{1} (x + 1)\,dx \cdot \int_{0}^{\infty} e^{-y}\,dy = 1$
$\Rightarrow k \left[\frac{x^2}{2} + x\right]_{0}^{1} \left[-e^{-y}\right]_{0}^{\infty} = 1$
$\Rightarrow k \left(\frac{3}{2}\right)(1) = 1 \Rightarrow k = \frac{2}{3}$
$\therefore f(x, y) = \begin{cases} \frac{2}{3}(x + 1)e^{-y}, & 0 < x < 1,\ y > 0 \\ 0, & \text{otherwise} \end{cases}$
The marginal PDF of X is
$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_{0}^{\infty} \frac{2}{3}(x + 1)e^{-y}\,dy = \frac{2}{3}(x + 1)\left[-e^{-y}\right]_{0}^{\infty} = \frac{2}{3}(x + 1)\left[-e^{-\infty} + e^{0}\right] = \frac{2}{3}(x + 1), \quad 0 < x < 1$
The marginal PDF of Y is
$f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \int_{0}^{1} \frac{2}{3}(x + 1)e^{-y}\,dx = \frac{2}{3}e^{-y}\left[\frac{x^2}{2} + x\right]_{0}^{1} = \frac{2}{3}e^{-y}\left(\frac{1}{2} + 1\right) = e^{-y}, \quad y > 0$
Consider $f_X(x)\,f_Y(y) = \frac{2}{3}(x + 1)\,e^{-y} = f(x, y)$
∴ X and Y are independent.
4. The joint density function of two random variables X and Y is given by
$f(x, y) = \begin{cases} \frac{6}{7}\left(x^2 + \frac{xy}{2}\right), & 0 < x < 1,\ 0 < y < 2 \\ 0, & \text{elsewhere} \end{cases}$
(a) Compute the marginal p.d.f. of X and Y.
(b) Find E(X) and E(Y). (c) Find $P\left(X < \frac{1}{2},\ Y > \frac{1}{2}\right)$.
Solution:
The marginal pdf of X is
$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_{0}^{2} \frac{6}{7}\left(x^2 + \frac{xy}{2}\right)dy = \frac{6}{7}\left[x^2 y + \frac{x y^2}{4}\right]_{0}^{2} = \frac{6}{7}\left(2x^2 + x\right), \quad 0 < x < 1$
The marginal pdf of Y is
$f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \int_{0}^{1} \frac{6}{7}\left(x^2 + \frac{xy}{2}\right)dx = \frac{6}{7}\left[\frac{x^3}{3} + \frac{x^2 y}{4}\right]_{0}^{1} = \frac{6}{7}\left(\frac{1}{3} + \frac{y}{4}\right), \quad 0 < y < 2$
$E(X) = \int_{-\infty}^{\infty} x f_X(x)\,dx = \int_{0}^{1} x \cdot \frac{6}{7}\left(2x^2 + x\right)dx = \frac{6}{7}\int_{0}^{1}\left(2x^3 + x^2\right)dx = \frac{6}{7}\left[\frac{x^4}{2} + \frac{x^3}{3}\right]_{0}^{1} = \frac{6}{7}\left(\frac{1}{2} + \frac{1}{3}\right) = \frac{5}{7}$
$E(Y) = \int_{-\infty}^{\infty} y f_Y(y)\,dy = \int_{0}^{2} y \cdot \frac{6}{7}\left(\frac{1}{3} + \frac{y}{4}\right)dy = \frac{6}{7}\int_{0}^{2}\left(\frac{y}{3} + \frac{y^2}{4}\right)dy = \frac{6}{7}\left[\frac{y^2}{6} + \frac{y^3}{12}\right]_{0}^{2} = \frac{6}{7}\left(\frac{2}{3} + \frac{2}{3}\right) = \frac{8}{7}$
$P\left(X < \frac{1}{2},\ Y > \frac{1}{2}\right) = \int_{0}^{1/2}\int_{1/2}^{2} \frac{6}{7}\left(x^2 + \frac{xy}{2}\right)dy\,dx$
$= \frac{6}{7}\int_{0}^{1/2}\left[x^2 y + \frac{x y^2}{4}\right]_{1/2}^{2}dx$
$= \frac{6}{7}\int_{0}^{1/2}\left(2x^2 + x - \frac{x^2}{2} - \frac{x}{16}\right)dx$
$= \frac{6}{7}\int_{0}^{1/2}\left(\frac{3x^2}{2} + \frac{15x}{16}\right)dx$
$= \frac{6}{7}\left[\frac{x^3}{2} + \frac{15x^2}{32}\right]_{0}^{1/2} = \frac{6}{7}\left(\frac{1}{16} + \frac{15}{128}\right) = \frac{6}{7} \cdot \frac{23}{128} = \frac{69}{448}$
5. Given the joint probability density function of X and Y is
$f(x, y) = \begin{cases} 8xy, & 0 < x < y < 1 \\ 0, & \text{elsewhere} \end{cases}$
(i) Find the marginal and conditional p.d.fs of X and Y. (ii) Find E(X), E(Y). (iii) Are X and Y independent?
Solution:
(i). The marginal density function of X is
$f_X(x) = \int_{y} f(x, y)\,dy = \int_{y=x}^{1} 8xy\,dy = 8x\left[\frac{y^2}{2}\right]_{x}^{1} = 4x\left(1 - x^2\right), \quad 0 < x < 1$
The marginal density function of Y is
$f_Y(y) = \int_{x} f(x, y)\,dx = \int_{x=0}^{y} 8xy\,dx = 8y\left[\frac{x^2}{2}\right]_{0}^{y} = 4y^3, \quad 0 < y < 1$
The conditional p.d.fs of X and Y are
$f(x / y) = \frac{f(x, y)}{f_Y(y)} = \frac{8xy}{4y^3} = \frac{2x}{y^2}, \quad 0 < x < y < 1$
$f(y / x) = \frac{f(x, y)}{f_X(x)} = \frac{8xy}{4x(1 - x^2)} = \frac{2y}{1 - x^2}, \quad 0 < x < y < 1$
(ii). $E(X) = \int_{x} x f_X(x)\,dx = \int_{0}^{1} x \cdot 4x\left(1 - x^2\right)dx = 4\int_{0}^{1}\left(x^2 - x^4\right)dx = 4\left[\frac{x^3}{3} - \frac{x^5}{5}\right]_{0}^{1} = 4\left(\frac{1}{3} - \frac{1}{5}\right) = \frac{8}{15}$
$E(Y) = \int_{y} y f_Y(y)\,dy = \int_{0}^{1} y \cdot 4y^3\,dy = 4\int_{0}^{1} y^4\,dy = 4\left[\frac{y^5}{5}\right]_{0}^{1} = \frac{4}{5}$
(iii). $f_X(x)\,f_Y(y) = 4x\left(1 - x^2\right) \cdot 4y^3 \ne 8xy = f(x, y)$
∴ $f_X(x)\,f_Y(y) \ne f(x, y)$. Hence X and Y are not independent.
6. Two random variables X and Y have the joint probability density function $f(x, y) = k\,x\,y\,e^{-(x^2 + y^2)}$, $x \ge 0$, $y \ge 0$. Find the value of k and also prove that X and Y are independent.
Solution:
Given the joint pdf $f(x, y) = k\,x\,y\,e^{-(x^2 + y^2)}$, $x \ge 0$, $y \ge 0$,
$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$
$\Rightarrow \int_{0}^{\infty}\int_{0}^{\infty} k\,x\,y\,e^{-(x^2 + y^2)}\,dx\,dy = 1$
$\Rightarrow k \int_{0}^{\infty} x e^{-x^2}\,dx \cdot \int_{0}^{\infty} y e^{-y^2}\,dy = 1$
Put $x^2 = u$ and $y^2 = v$, so that $2x\,dx = du$ and $2y\,dy = dv$.
When x = 0, u = 0 and when x = ∞, u = ∞; when y = 0, v = 0 and when y = ∞, v = ∞.
$\Rightarrow k \cdot \frac{1}{2}\int_{0}^{\infty} e^{-u}\,du \cdot \frac{1}{2}\int_{0}^{\infty} e^{-v}\,dv = 1$
$\Rightarrow \frac{k}{4}\left[\frac{e^{-u}}{-1}\right]_{0}^{\infty}\left[\frac{e^{-v}}{-1}\right]_{0}^{\infty} = 1$
$\Rightarrow \frac{k}{4}(0 + 1)(0 + 1) = 1 \Rightarrow k = 4$
∴ The joint pdf is $f(x, y) = 4xy\,e^{-(x^2 + y^2)}$, $x \ge 0$, $y \ge 0$
The marginal pdf of X is
$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_{0}^{\infty} 4xy\,e^{-(x^2 + y^2)}\,dy = 4x e^{-x^2}\int_{0}^{\infty} y e^{-y^2}\,dy = 4x e^{-x^2}\left(\frac{1}{2}\right) = 2x e^{-x^2}, \quad x \ge 0$
The marginal pdf of Y is
$f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \int_{0}^{\infty} 4xy\,e^{-(x^2 + y^2)}\,dx = 4y e^{-y^2}\int_{0}^{\infty} x e^{-x^2}\,dx = 4y e^{-y^2}\left(\frac{1}{2}\right) = 2y e^{-y^2}, \quad y \ge 0$
Now, $f_X(x)\,f_Y(y) = \left(2x e^{-x^2}\right)\left(2y e^{-y^2}\right) = 4xy\,e^{-(x^2 + y^2)} = f(x, y)$
∴ X and Y are independent.
7. The joint probability mass function of (X,Y) is given by p(x, y) = k(2x + 3y), x = 0, 1, 2; y = 1, 2, 3. Find the correlation coefficient between X and Y.
Solution:

y \ x    0     1     2
1        3k    5k    7k
2        6k    8k    10k
3        9k    11k   13k

To find the value of k:
We know that $\sum_{i}\sum_{j} P(x_i, y_j) = 1$, i.e. the sum of all probabilities is 1.
So, 3k + 5k + 7k + 6k + 8k + 10k + 9k + 11k + 13k = 1 ⇒ 72k = 1 ⇒ k = 1/72

y \ x     0       1       2       PY(y)
1         3/72    5/72    7/72    15/72
2         6/72    8/72    10/72   24/72
3         9/72    11/72   13/72   33/72
PX(x)     18/72   24/72   30/72   1

Marginal distribution of X:

X           0       1       2
P(X = x)    18/72   24/72   30/72

Marginal distribution of Y:

Y           1       2       3
P(Y = y)    15/72   24/72   33/72

$E(X) = \sum_{i} x_i P(X = x_i) = 0\left(\frac{18}{72}\right) + 1\left(\frac{24}{72}\right) + 2\left(\frac{30}{72}\right) = \frac{84}{72} = 1.1667$
$E(X^2) = \sum_{i} x_i^2 P(X = x_i) = 0^2\left(\frac{18}{72}\right) + 1^2\left(\frac{24}{72}\right) + 2^2\left(\frac{30}{72}\right) = \frac{144}{72} = 2$
$E(Y) = \sum_{j} y_j P(Y = y_j) = 1\left(\frac{15}{72}\right) + 2\left(\frac{24}{72}\right) + 3\left(\frac{33}{72}\right) = \frac{162}{72} = 2.25$
$E(Y^2) = \sum_{j} y_j^2 P(Y = y_j) = 1^2\left(\frac{15}{72}\right) + 2^2\left(\frac{24}{72}\right) + 3^2\left(\frac{33}{72}\right) = \frac{408}{72} = 5.6667$
$E(XY) = \sum_{i}\sum_{j} x_i y_j P(X = x_i, Y = y_j)$
$= 0 + \frac{5}{72} + \frac{14}{72} + 0 + \frac{16}{72} + \frac{40}{72} + 0 + \frac{33}{72} + \frac{78}{72} = \frac{186}{72} = 2.5833$
$\sigma_X^2 = E(X^2) - (E(X))^2 = 2 - (1.1667)^2 = 0.6389 \Rightarrow \sigma_X = 0.7993$
$\sigma_Y^2 = E(Y^2) - (E(Y))^2 = 5.6667 - (2.25)^2 = 0.6042 \Rightarrow \sigma_Y = 0.7773$
$\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y) = 2.5833 - (1.1667)(2.25) = -0.0417$
$r(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{-0.0417}{(0.7993)(0.7773)} = -0.0671$
8. The joint pdf of random variables X and Y is given by
$f(x, y) = \begin{cases} 2 - x - y, & 0 < x < 1,\ 0 < y < 1 \\ 0, & \text{elsewhere} \end{cases}$
Find Cov(X, Y) and the correlation coefficient between X and Y.
Solution:
The marginal pdf of X is
$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_{0}^{1} (2 - x - y)\,dy = \left[2y - xy - \frac{y^2}{2}\right]_{0}^{1} = 2 - x - \frac{1}{2} = \frac{3}{2} - x, \quad 0 < x < 1$
The marginal pdf of Y is
$f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \int_{0}^{1} (2 - x - y)\,dx = \left[2x - \frac{x^2}{2} - yx\right]_{0}^{1} = 2 - \frac{1}{2} - y = \frac{3}{2} - y, \quad 0 < y < 1$
$E(X) = \int_{0}^{1} x\left(\frac{3}{2} - x\right)dx = \left[\frac{3x^2}{4} - \frac{x^3}{3}\right]_{0}^{1} = \frac{3}{4} - \frac{1}{3} = \frac{5}{12}$
$E(X^2) = \int_{0}^{1} x^2\left(\frac{3}{2} - x\right)dx = \left[\frac{x^3}{2} - \frac{x^4}{4}\right]_{0}^{1} = \frac{1}{2} - \frac{1}{4} = \frac{1}{4}$
$E(Y) = \int_{0}^{1} y\left(\frac{3}{2} - y\right)dy = \left[\frac{3y^2}{4} - \frac{y^3}{3}\right]_{0}^{1} = \frac{3}{4} - \frac{1}{3} = \frac{5}{12}$
$E(Y^2) = \int_{0}^{1} y^2\left(\frac{3}{2} - y\right)dy = \left[\frac{y^3}{2} - \frac{y^4}{4}\right]_{0}^{1} = \frac{1}{2} - \frac{1}{4} = \frac{1}{4}$
$\sigma_X^2 = E(X^2) - (E(X))^2 = \frac{1}{4} - \left(\frac{5}{12}\right)^2 = \frac{1}{4} - \frac{25}{144} = \frac{11}{144} \Rightarrow \sigma_X = \frac{\sqrt{11}}{12}$
$\sigma_Y^2 = E(Y^2) - (E(Y))^2 = \frac{1}{4} - \left(\frac{5}{12}\right)^2 = \frac{1}{4} - \frac{25}{144} = \frac{11}{144} \Rightarrow \sigma_Y = \frac{\sqrt{11}}{12}$
$E(XY) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy\,f(x, y)\,dx\,dy = \int_{0}^{1}\int_{0}^{1} xy(2 - x - y)\,dx\,dy$
$= \int_{0}^{1} y\left[\int_{0}^{1}\left(2x - x^2 - xy\right)dx\right]dy$
$= \int_{0}^{1} y\left[x^2 - \frac{x^3}{3} - \frac{x^2 y}{2}\right]_{0}^{1}dy$
$= \int_{0}^{1} y\left(1 - \frac{1}{3} - \frac{y}{2}\right)dy$
$= \int_{0}^{1}\left(\frac{2y}{3} - \frac{y^2}{2}\right)dy = \left[\frac{y^2}{3} - \frac{y^3}{6}\right]_{0}^{1} = \frac{1}{3} - \frac{1}{6} = \frac{1}{6}$
$\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y) = \frac{1}{6} - \frac{5}{12} \cdot \frac{5}{12} = -\frac{1}{144}$
The coefficient of correlation between X and Y is
$r_{XY} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{-\frac{1}{144}}{\frac{\sqrt{11}}{12} \cdot \frac{\sqrt{11}}{12}} = -\frac{1}{11} = -0.0909$
9. The joint pdf of the random variables X and Y is defined as
$f(x, y) = \begin{cases} 25e^{-5y}, & 0 < x < 0.2,\ y > 0 \\ 0, & \text{elsewhere} \end{cases}$
(a) Find the marginal PDFs of X and Y, (b) Cov(X,Y).
Solution:
The exponent must be −5y for f to be a valid density, since $\int_{0}^{0.2}\int_{0}^{\infty} 25e^{-5y}\,dy\,dx = 25(0.2)\left(\frac{1}{5}\right) = 1$.
The marginal PDF of X is
$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_{0}^{\infty} 25e^{-5y}\,dy = 25\left[\frac{-e^{-5y}}{5}\right]_{0}^{\infty} = 5, \quad 0 < x < 0.2$
The marginal PDF of Y is
$f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \int_{0}^{0.2} 25e^{-5y}\,dx = 25e^{-5y}\left[x\right]_{0}^{0.2} = 5e^{-5y}, \quad y > 0$
$E(X) = \int_{-\infty}^{\infty} x f_X(x)\,dx = \int_{0}^{0.2} 5x\,dx = 5\left[\frac{x^2}{2}\right]_{0}^{0.2} = 5\left(\frac{0.04}{2}\right) = 0.1$
$E(Y) = \int_{-\infty}^{\infty} y f_Y(y)\,dy = \int_{0}^{\infty} 5y e^{-5y}\,dy = 5\left[-\frac{y e^{-5y}}{5} - \frac{e^{-5y}}{25}\right]_{0}^{\infty} = 5\left(\frac{1}{25}\right) = 0.2$
$E(XY) = \int_{0}^{\infty}\int_{0}^{0.2} xy\left(25e^{-5y}\right)dx\,dy = \int_{0}^{\infty} 25y e^{-5y}\,dy \cdot \int_{0}^{0.2} x\,dx = (1)\left(\frac{0.04}{2}\right) = 0.02$
Cov(X, Y) = E(XY) – E(X)E(Y) = 0.02 – (0.1)(0.2) = 0
10. Calculate the correlation coefficient for the following heights in inches of fathers (x) & their son
(y):
x 65 66 67 67 68 69 70 72
y 67 68 65 68 72 72 69 71
Solution:
X y x2 y2 xy
65 67 4225 4489 4355
66 68 4356 4624 4488
67 65 4489 4225 4355
67 68 4489 4624 4556
68 72 4624 5184 4896
69 72 4761 5184 4968
70 69 4900 4761 4830
72 71 5184 5041 5112
544 552 37028 38132 37560
$E(x) = \frac{544}{8} = 68$, $E(y) = \frac{552}{8} = 69$, $E(xy) = \frac{37560}{8} = 4695$,
$E(x^2) = \frac{37028}{8} = 4628.5$, $E(y^2) = \frac{38132}{8} = 4766.5$
$\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y) = 4695 - (68)(69) = 3$
$\sigma_X^2 = E(X^2) - (E(X))^2 = 4628.5 - (68)^2 = 4.5 \Rightarrow \sigma_X = 2.121$
$\sigma_Y^2 = E(Y^2) - (E(Y))^2 = 4766.5 - (69)^2 = 5.5 \Rightarrow \sigma_Y = 2.345$
$r(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{3}{(2.121)(2.345)} = 0.603$
11. Ten students got the following percentage of marks in economics (X) and statistics (Y). Find Karl Pearson's correlation coefficient from the following data:
X 78 36 98 25 75 82 90 62 65 39
Y 84 51 91 60 68 62 86 58 53 47
Solution:
Since $r_{XY}$ does not change with a change of origin and scale, we work with u = x − 65 and v = y − 66.

x      y      u = x − 65   v = y − 66   u²      v²      uv
78     84     13           18           169     324     234
36     51     −29          −15          841     225     435
98     91     33           25           1089    625     825
25     60     −40          −6           1600    36      240
75     68     10           2            100     4       20
82     62     17           −4           289     16      −68
90     86     25           20           625     400     500
62     58     −3           −8           9       64      24
65     53     0            −13          0       169     0
39     47     −26          −19          676     361     494
650    660    0            0            5398    2224    2704

$E(x) = \bar{x} = \frac{650}{10} = 65$, $E(y) = \bar{y} = \frac{660}{10} = 66$
$E(u) = 0$, $E(v) = 0$, $E(uv) = \frac{2704}{10} = 270.4$, $E(u^2) = \frac{5398}{10} = 539.8$, $E(v^2) = \frac{2224}{10} = 222.4$
$\mathrm{Cov}(u, v) = E(uv) - E(u)E(v) = 270.4 - 0 = 270.4$
$\sigma_u^2 = E(u^2) - (E(u))^2 = 539.8 - 0 = 539.8 \Rightarrow \sigma_u = 23.234$
$\sigma_v^2 = E(v^2) - (E(v))^2 = 222.4 - 0 = 222.4 \Rightarrow \sigma_v = 14.913$
$r(u, v) = \frac{\mathrm{Cov}(u, v)}{\sigma_u \sigma_v} = \frac{270.4}{(23.234)(14.913)} = 0.7804 = r(X, Y)$
12. Obtain the equations of the lines of regression from the following data.
X 22 26 29 30 31 33 34 35
Y 20 20 21 29 27 24 27 31
Also estimate the value of Y when X = 38 and the value of X when Y = 18.
Solution:

X     Y     U = X – 30   V = Y – 27   U²     V²     UV
22    20    –8           –7           64     49     56
26    20    –4           –7           16     49     28
29    21    –1           –6           1      36     6
30    29    0            2            0      4      0
31    27    1            0            1      0      0
33    24    3            –3           9      9      –9
34    27    4            0            16     0      0
35    31    5            4            25     16     20
            0            –17          132    163    101

$n = 8$, $\sum U = 0$, $\sum V = -17$, $\sum U^2 = 132$, $\sum V^2 = 163$, $\sum UV = 101$
$\bar{U} = \frac{\sum U}{n} = \frac{0}{8} = 0$, $\bar{V} = \frac{\sum V}{n} = \frac{-17}{8} = -2.125$
$\sigma_U^2 = \frac{\sum U^2}{n} - \bar{U}^2 = \frac{132}{8} - 0 = 16.5 \Rightarrow \sigma_U = 4.062$
$\sigma_V^2 = \frac{\sum V^2}{n} - \bar{V}^2 = \frac{163}{8} - (-2.125)^2 = 15.86 \Rightarrow \sigma_V = 3.9825$
$\mathrm{Cov}(U, V) = \frac{\sum UV}{n} - \bar{U}\,\bar{V} = \frac{101}{8} - 0 = 12.625$
$\therefore r_{UV} = \frac{\mathrm{Cov}(U, V)}{\sigma_U \sigma_V} = \frac{12.625}{(4.062)(3.9825)} = 0.7804$
$\therefore r_{XY} = 0.7804$
$X = U + 30 \Rightarrow \bar{X} = 0 + 30 = 30$
$Y = V + 27 \Rightarrow \bar{Y} = -2.125 + 27 = 24.875$
$\sigma_X = \sigma_U \Rightarrow \sigma_X = 4.062$
$\sigma_Y = \sigma_V \Rightarrow \sigma_Y = 3.9825$
The regression line of X on Y is $X - \bar{X} = \frac{r\,\sigma_X}{\sigma_Y}\left(Y - \bar{Y}\right)$
$\Rightarrow X - 30 = \frac{(0.7804)(4.062)}{3.9825}(Y - 24.875)$
$\Rightarrow X = 0.796Y + 10.2$
When Y = 18, X = 0.796(18) + 10.2 = 24.528
The regression line of Y on X is $Y - \bar{Y} = \frac{r\,\sigma_Y}{\sigma_X}\left(X - \bar{X}\right)$
$\Rightarrow Y - 24.875 = \frac{(0.7804)(3.9825)}{4.062}(X - 30)$
$\Rightarrow Y = 0.765X + 1.925$
When X = 38, Y = 0.765(38) + 1.925 = 30.995
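The two regression lines can also be obtained directly from the raw data, using b_YX = Cov(X,Y)/Var(X) and b_XY = Cov(X,Y)/Var(Y), which are equivalent to r·σ_Y/σ_X and r·σ_X/σ_Y. The sketch below uses the Y values of the working table; the function name `regression_lines` is an assumption for illustration.

```python
def regression_lines(xs, ys):
    """Return (mean_x, mean_y, b_yx, b_xy) for the two lines of regression."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    b_yx = cov / var_x  # slope of the regression line of Y on X
    b_xy = cov / var_y  # slope of the regression line of X on Y
    return mean_x, mean_y, b_yx, b_xy


xs = [22, 26, 29, 30, 31, 33, 34, 35]
ys = [20, 20, 21, 29, 27, 24, 27, 31]  # Y values as used in the working table
mean_x, mean_y, b_yx, b_xy = regression_lines(xs, ys)

# Estimates: Y from the line of Y on X, X from the line of X on Y.
y_at_38 = mean_y + b_yx * (38 - mean_x)
x_at_18 = mean_x + b_xy * (18 - mean_y)
print(b_yx, b_xy, y_at_38, x_at_18)
```

Small differences from the hand-computed estimates come only from rounding the slopes to three decimals.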
13. In a partially destroyed laboratory record only the lines of regressions and variance of X are
available. The regression equations are 8x – 10y + 66 = 0 and 40x – 18y = 214 and variance of X =
9. Find (a) the correlation coefficient between X and Y (b) Mean values of X and Y (c) variance
of Y.
Solution:
Given 8x – 10y = –66 ……(1)
40x – 18y = 214 ……(2)
Let (1) be the regression line of y on x and (2) be the regression line of x on y.
10y = 8x + 66 ⇒ $y = \frac{8x}{10} + \frac{66}{10}$ ⇒ the regression coefficient of y on x is $b_1 = \frac{8}{10} = \frac{4}{5}$
40x = 18y + 214 ⇒ $x = \frac{18y}{40} + \frac{214}{40}$ ⇒ the regression coefficient of x on y is $b_2 = \frac{18}{40} = \frac{9}{20}$
$b_1 b_2 = \left(\frac{4}{5}\right)\left(\frac{9}{20}\right) = \frac{9}{25} \le 1$
Let r be the correlation between x and y.
$\therefore r = \sqrt{b_1 b_2} = \sqrt{\frac{9}{25}} = \frac{3}{5} = 0.6$ [Since both regression coefficients are positive, r is positive]
Let $(\bar{x}, \bar{y})$ be the point of intersection of the two regression lines.
Solving (1) and (2) we get $(\bar{x}, \bar{y})$:
5 × (1) ⇒ 40x – 50y = –330
40x – 18y = 214
Subtracting, –32y = –544
y = 17
Now, 8x – 10y = –66 ⇒ 8x – 10(17) = –66 ⇒ 8x = 170 – 66 ⇒ 8x = 104 ⇒ x = 13
∴ $(\bar{x}, \bar{y})$ = (13, 17) is the mean of X and Y.
We know $\frac{\sigma_Y^2}{\sigma_X^2} = \frac{b_1}{b_2} \Rightarrow \sigma_Y^2 = \frac{b_1}{b_2}\,\sigma_X^2 = \frac{4/5}{9/20}\,(9) = 16$ ⇒ Variance of Y is 16
14. A bank teller serves customers standing in the queue one by one. Suppose that the service time Xi for customer i has mean E(Xi) = 2 minutes and Var(Xi) = 1. We assume that service times for different bank customers are independent. Let Y be the total time the bank teller spends serving 50 customers. Using the central limit theorem, find the probability that the total service time is between 90 and 110 minutes.
Solution:
Let Y be the total time the bank teller spends serving 50 customers. Here n = 50, $\mu = 2$, $\sigma^2 = 1$.
By CLT, $Y = X_1 + X_2 + \dots + X_{50}$ follows a normal distribution with mean $n\mu = 100$ and standard deviation $\sigma\sqrt{n} = \sqrt{50} = 7.071$.
The standard normal variable is given by $Z = \frac{Y - n\mu}{\sigma\sqrt{n}} = \frac{Y - 100}{1 \cdot \sqrt{50}} = \frac{Y - 100}{7.071}$
When Y = 90, $Z = \frac{90 - 100}{7.071} = -1.41$; when Y = 110, $Z = \frac{110 - 100}{7.071} = 1.41$
$\therefore P(90 < Y < 110) = P(-1.41 < Z < 1.41) = 2P(0 < Z < 1.41) = 2 \times 0.4207 = 0.8414$
15. The life time of a certain brand of an electric bulb may be considered as a random variable with
mean 1200h and standard deviation 250h. Find the probability, using central limit theorem, that
the average lifetime of 60 bulbs exceeds 1250hours.
Solution:
Let Xi (i = 1, 2, ..., 60) denote the life time of the bulbs. Here $\mu = 1200$, $\sigma = 250$.
Let $\bar{X}$ denote the average life time of 60 bulbs.
By the central limit theorem, $\bar{X}$ follows $N\left(\mu, \frac{\sigma^2}{n}\right)$. Let $Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{\bar{X} - 1200}{32.27}$
When $\bar{X} = 1250$, $Z = \frac{1250 - 1200}{32.27} = 1.55$
$P(\bar{X} > 1250) = P(Z > 1.55) = 0.5 - P(0 < Z < 1.55) = 0.5 - 0.4394 = 0.0606$
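The normal probabilities in Problems 14 and 15 can be computed without tables using the error function from Python's standard library. This is a sketch; `normal_cdf` is a helper defined here, not a library function, and because z is not rounded to two decimals the results differ slightly from the table-based answers.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Problem 14: total service time of n = 50 customers, mu = 2, sigma = 1.
n, mu, sigma = 50, 2.0, 1.0
mean_total = n * mu               # mean of the sum
sd_total = sigma * math.sqrt(n)   # standard deviation of the sum
p_total = (normal_cdf((110 - mean_total) / sd_total)
           - normal_cdf((90 - mean_total) / sd_total))

# Problem 15: average life of n = 60 bulbs, mu = 1200, sigma = 250.
n_b, mu_b, sigma_b = 60, 1200.0, 250.0
se_mean = sigma_b / math.sqrt(n_b)  # standard error of the sample mean
p_mean = 1.0 - normal_cdf((1250 - mu_b) / se_mean)

print(p_total, p_mean)
```

The sum uses the scale σ√n while the average uses σ/√n; mixing the two is the most common slip in CLT problems of this kind.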
UNIT – III TESTING OF HYPOTHESIS
PART-A
1. Define Population, Sample and Sample Size.
Solution:
The group of individuals under study is called population. The population may be finite or infinite. A
finite subset of statistical individuals in a population is called Sample. The number of individuals in a
sample is called Sample Size (n).
2. Define Parameter and Statistic.
Solution:
The statistical constants of the population, namely the mean µ and the variance $\sigma^2$, are usually referred to as parameters.
Statistical measures computed from sample observations alone, i.e. the mean $\bar{x}$ and the variance $s^2$, are usually referred to as statistics.
3. Define Sampling distribution.
Solution: The probability distribution of a sample statistic is called the sampling distribution
4. What is Standard Error? (April / May 2017)
Solution: The standard deviation of the sampling distribution of a statistic is known as its Standard
error.
5. A random sample of 20 observations produced a sample mean of $\bar{x} = 92.4$ and s = 25.8. What is the value of the standard error of the sample mean?
Solution: Standard error of $\bar{x}$ $= \frac{s}{\sqrt{n - 1}} = \frac{25.8}{\sqrt{20 - 1}} = 5.919$
6. Define Null hypothesis. (Nov/Dec 2022)
Solution:
For applying the tests of significance, we first set up a hypothesis which is a definite statement about
the population parameter. Usually, such a hypothesis is a hypothesis of no difference & it is denoted
by H 0 .
7. Define critical region and acceptance region? (Nov/Dec 2019)
Solution:
A region corresponding to a statistic (t) in the sample space (S) which amounts to rejection of the null hypothesis is called the critical region or region of rejection.
The region complementary to the critical region is called acceptance region.
8. Define one - tailed and two - tailed test.
Solution:
A test of any statistical hypothesis where the alternative hypothesis is one-tailed (right-tailed or left-tailed) is called a one-tailed test, i.e. $H_0: \mu = \mu_0$ vs $H_1: \mu > \mu_0$ (right-tailed) or $H_1: \mu < \mu_0$ (left-tailed).
A test of a statistical hypothesis whose alternative hypothesis is two-tailed, such as $H_0: \mu = \mu_0$ vs $H_1: \mu \ne \mu_0$, is known as a two-tailed test.

9. Define Type-I and Type-II errors. (Nov/Dec2019, Nov/Dec 2023)


Solution:
Type I error: Rejecting the null hypothesis when it is true. The probability of a type I error is denoted by α. Example: rejecting a lot when it is good.
Type II error: Accepting the null hypothesis when it is false. The probability of a type II error is denoted by β. Example: accepting a lot when it is bad.
10. Write the 95% confidence interval of the population mean when small samples are considered.
Solution: $\bar{x} - t_{0.05}\frac{S}{\sqrt{n - 1}} \le \mu \le \bar{x} + t_{0.05}\frac{S}{\sqrt{n - 1}}$
11. Write the 95% confidence interval of the population proportion.
Solution: The 95% confidence interval for P is $p - 1.96\sqrt{\frac{pq}{n}} \le P \le p + 1.96\sqrt{\frac{pq}{n}}$
12. List out the applications of t –distribution.
Solution:
❖ To test the significant difference between the means of two independent samples.
❖ To test the significant difference between the means of two dependent samples or paired
observation.
❖ To test the significant difference between population, mean and mean of a random sample.
❖ To test the significance of an observed correlation coefficient.
13. Mention the Properties of t – distribution.
Solution:
❖ The variable t ranges from −∞ to ∞.
❖ The t – distribution is symmetrical and has a mean zero.
❖ The variance of the t – distribution is greater than one, but approaches one as the number of degrees
of freedom and therefore the sample size become large.
14. Define student’s t-test for single mean. (April/May 2021)
Solution:
The t-distribution is used when the sample size is 30 or less and the population standard deviation is unknown. The t-statistic is defined as $t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$, where $s^2 = \sum_{i=1}^{n} \frac{(x_i - \bar{x})^2}{n - 1}$. The t-distribution has been derived mathematically under the assumption of a normally distributed population.
15. What are the assumptions on which F-test is based?
Solution:
❖ Normality: The values in each group should be normally distributed.
❖ Independence of error: The variations of each value around its own group mean. i.e. error should
be independent of each value.
❖ Homogeneity : The variances within each group should be equal for all groups.
16. What are the uses F – test? (April/May 2021) (Nov/Dec 2022)
Solution:
❖ To test equality of two population variances.
❖ To test the sample observation coming from normal population.
❖ To determine whether or not the two independent estimates of the population
variances are homogeneous in nature.
17. State any two properties of the $\chi^2$ distribution.
Solution:
(i)Chi – square curve is always positively skewed
(ii)Chi – square values increase with the increase in degrees of freedom
18. Explain the various uses of Chi-square test. (April/May 2021) (Nov/Dec 2022,2023)
Solution:
❖ Test of goodness of fit.
❖ Test of independence of attributes.
❖ Test of Homogeneity for a specified value of standard deviation.
19. State the assumptions of Chi-square test.
Solution:
❖ The sample observations should be independent.
❖ Constraints on the cell frequencies, if any must be linear.
❖ The total frequency should be at least 50.
❖ No theoretical cell frequency should be less than 5. If any is, it is pooled with the preceding or succeeding frequency so that the pooled frequency is 5 or more.
20. Write down the value of $\chi^2$ for a 2 × 2 contingency table with cell frequencies a, b, c, d.
Solution:

         A        B        Total
A        a        b        a + b
B        c        d        c + d
Total    a + c    b + d    N = a + b + c + d

$\chi^2 = \frac{N(ad - bc)^2}{(a + b)(c + d)(a + c)(b + d)}$
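The shortcut formula for a 2 × 2 table gives the same value as the general definition, the sum of (O − E)²/E over the four cells. The sketch below compares the two; the function names and the cell counts are hypothetical and chosen only for illustration.

```python
def chi_square_2x2(a, b, c, d):
    """Shortcut formula N(ad - bc)^2 / [(a+b)(c+d)(a+c)(b+d)]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi_square_from_expected(a, b, c, d):
    """Same statistic computed as the sum of (O - E)^2 / E over the four cells."""
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    observed = ((a, b), (c, d))
    total = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n  # row total * column total / N
            total += (observed[i][j] - expected) ** 2 / expected
    return total


# Hypothetical cell counts, chosen only for illustration.
print(chi_square_2x2(10, 20, 30, 40), chi_square_from_expected(10, 20, 30, 40))
```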


PART-B
1. The means of two large samples of 1000 and 2000 members are 67.5 inches and 68.0 inches respectively. Can the samples be regarded as drawn from the same population of standard deviation 2.5 inches? (April/May 2021)
Solution: $H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \ne \mu_2$
Level of Significance: 5%
Test Statistic:
$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sigma\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{67.5 - 68}{2.5\sqrt{\frac{1}{1000} + \frac{1}{2000}}} = -5.16$
$\therefore |Z| = 5.16$
Table value: Z = 1.96
Conclusion: The calculated value is greater than the table value, hence we reject the null hypothesis.
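The test statistic for two large samples from a population with known standard deviation can be sketched as follows. The function name `two_sample_z` is an illustrative assumption; 1.96 is the two-tailed 5% critical value used above.

```python
import math


def two_sample_z(mean1, mean2, sigma, n1, n2):
    """Z statistic for the difference of two sample means with common known sigma."""
    return (mean1 - mean2) / (sigma * math.sqrt(1.0 / n1 + 1.0 / n2))


z = two_sample_z(67.5, 68.0, 2.5, 1000, 2000)
reject_h0 = abs(z) > 1.96  # two-tailed test at the 5% level
print(z, reject_h0)
```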
2. A sample of 900 members has mean 3.4 cms and S.D 2.61 cms. Is the sample from a large population of mean 3.25 cms and S.D 2.61 cms? If the population is normal and its mean is unknown, find the 95% confidence limits of the true mean. (April/May 2021)
Solution:
Given n = 900, $\mu = 3.25$, $\bar{x} = 3.4$ cm, $\sigma = 2.61$, s = 2.61
H0: There is no significant difference between the sample mean and the population mean, i.e. $\mu = 3.25$
H1: There is a significant difference between the sample mean and the population mean, i.e. $\mu \ne 3.25$
Test Statistic:
$z = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{3.4 - 3.25}{2.61 / \sqrt{900}} = 1.724 \Rightarrow |z| = 1.724$
Critical value: The critical value of z for a two-tailed test at 5% level of significance is 1.96
Conclusion:
|z| = 1.724 < 1.96, i.e. calculated value < tabulated value.
Therefore we accept the null hypothesis H0,
i.e. the sample has been drawn from the population with mean $\mu = 3.25$.
To find the confidence limits:
The 95% confidence limits are
$\bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}} = 3.4 \pm 1.96\left(\frac{2.61}{\sqrt{900}}\right) = 3.4 \pm 0.1705 = (3.2295, 3.5705)$
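The confidence limits above follow the pattern mean ± z·σ/√n, sketched here in Python. The function name `z_confidence_interval` and its default critical value are assumptions made for illustration.

```python
import math


def z_confidence_interval(mean, sd, n, z_crit=1.96):
    """Large-sample confidence limits: mean -/+ z_crit * sd / sqrt(n)."""
    margin = z_crit * sd / math.sqrt(n)
    return mean - margin, mean + margin


lower, upper = z_confidence_interval(3.4, 2.61, 900)
print(lower, upper)
```

Returning the pair in (lower, upper) order keeps the interval unambiguous when the limits are reported.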
3. (i) In a big city 325 men out of 600 men were found to be smokers. Does this information support the conclusion that the majority of men in this city are smokers?
Solution:
Given n = 600, number of smokers = 325
p = sample proportion of smokers ⇒ p = 325/600 = 0.5417
P = population proportion of smokers in the city = 1/2 = 0.5 ⇒ Q = 0.5
Null Hypothesis H0: The numbers of smokers and non-smokers are equal in the city, i.e. P = 0.5
Alternative Hypothesis H1: P > 0.5 (Right Tailed)
Test Statistic:
$z = \frac{p - P}{\sqrt{\frac{PQ}{n}}} = \frac{0.5417 - 0.5}{\sqrt{\frac{0.5 \times 0.5}{600}}} = 2.04$
Critical value:
The tabulated value of z at 5% level of significance for a right-tailed test is 1.645.
Conclusion: Since the calculated value of z > the tabulated value of z, we reject the null hypothesis. The majority of men in the city are smokers.

(ii) In a large city A, 20% of a random sample of 900 school boys had a slight physical defect. In another large city B, 18.5% of a random sample of 1600 school boys had the same defect. Is the difference between the proportions significant? (Nov / Dec 2022)
Solution:
$n_1 = 900$, $n_2 = 1600$
$p_1 = \frac{20}{100} = 0.2$, $p_2 = \frac{18.5}{100} = 0.185$
$H_0$: The difference between the two proportions is not significant.
$H_1$: The difference between the two proportions is significant.
Now $p = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{476}{2500} = 0.1904$
q = 1 – p = 1 – 0.1904 = 0.8096
The test statistic is $z = \frac{p_1 - p_2}{\sqrt{pq\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{0.015}{0.01636} = 0.917$
Since |z| < 1.96, we accept the null hypothesis $H_0$ at 5% level of significance,
i.e. $p_1 = p_2$ ⇒ the difference between the two proportions is not significant.
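The pooled two-proportion test can be sketched in Python as below. The function name `two_proportion_z` is an illustrative assumption; it returns both the pooled proportion and the z statistic so each intermediate value can be compared with the hand computation.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """Pooled proportion and z statistic for testing equality of two proportions."""
    pooled = (n1 * p1 + n2 * p2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    return pooled, (p1 - p2) / se


pooled, z = two_proportion_z(0.20, 900, 0.185, 1600)
significant = abs(z) > 1.96  # two-tailed test at the 5% level
print(pooled, z, significant)
```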
4. Ten oil tins are taken at random from an automatic filling machine. The mean weight of the tins is 15.8 kg and the standard deviation is 0.5 kg. Does the sample mean differ significantly from the intended weight of 16 kg?
Solution:
Given $\bar{x} = 15.8$, $\mu = 16$, s = 0.50, n = 10
H0: $\mu = 16$, i.e. the sample mean weight is not different from the intended weight.
H1: $\mu \ne 16$, i.e. the sample mean weight is different from the intended weight.
Level of significance: α = 5% = 0.05, degrees of freedom = 10 − 1 = 9
Test Statistic: $t = \frac{\bar{x} - \mu}{s / \sqrt{n - 1}}$
$t = \frac{15.8 - 16}{0.50 / \sqrt{10 - 1}} = \frac{-0.2}{0.167} = -1.2 \Rightarrow |t| = 1.2$
Critical value: The critical value of t at 5% level of significance with 9 degrees of freedom is 2.26
Conclusion: Here calculated value < table value, so we accept H0 at 5% level of significance.
5. Tests made on the breaking strength of 10 pieces of a metal wire gave the following results: 578, 572, 570, 568, 572, 570, 570, 572, 596, and 584 kg. Test if the mean breaking strength of the wire can be assumed as 577 kg.
Solution:
Let us first compute the sample mean $\bar{x}$ and sample standard deviation S and then test if $\bar{x}$ differs significantly from the population mean µ = 577.
Let $H_0: \mu = 577$, $H_1: \mu \ne 577$

x       x − x̄     (x − x̄)²
578     2.8       7.84
572     −3.2      10.24
570     −5.2      27.04
568     −7.2      51.84
572     −3.2      10.24
570     −5.2      27.04
570     −5.2      27.04
572     −3.2      10.24
596     20.8      432.64
584     8.8       77.44
5752    0         681.6

where
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{5752}{10} = 575.2$, $S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{681.6}{9} = 75.733 \Rightarrow S = 8.702$
Under $H_0$, the test statistic is $t = \frac{\bar{x} - \mu}{S / \sqrt{n}} = \frac{575.2 - 577}{8.702 / \sqrt{10}} = -0.6541$
$\therefore |t| = 0.6541$
Tabulated value of t for ν = 9 degrees of freedom: $t_{0.025} = 2.262$
Since $|t| < t_{0.025}$, $H_0$ is accepted.
Conclusion: The mean breaking strength of the wire can be assumed as 577 kg at 5% level of significance.
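The one-sample t statistic of Problem 5 can be reproduced with Python's standard `statistics` module, whose `stdev` uses the n − 1 divisor, matching the definition of S used in this solution. The function name `one_sample_t` is an illustrative assumption.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """t = (mean - mu0) / (S / sqrt(n)), with S computed using divisor n - 1."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    return (mean - mu0) / (s / math.sqrt(n))


strengths = [578, 572, 570, 568, 572, 570, 570, 572, 596, 584]
t = one_sample_t(strengths, 577)
print(t)
```

The absolute value of `t` is then compared with the tabulated value for n − 1 = 9 degrees of freedom.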
6. Samples of two types of electric bulbs were tested for length of life and the following data were obtained.

Sample    Size    Mean      S.D
I         8       1234h     36h
II        7       1036h     40h

Is the difference in the means sufficient to warrant that type I bulbs are superior to type II bulbs?
Solution:
Here $\bar{x}_1 = 1234$, $\bar{x}_2 = 1036$, $n_1 = 8$, $n_2 = 7$, $s_1 = 36$, $s_2 = 40$
Let $H_0: \mu_1 = \mu_2$,
$H_1: \mu_1 > \mu_2$ (i.e. type I bulbs are superior to type II bulbs) (one-tailed test, right-tailed)
Under $H_0$, the test statistic is $t = \frac{\bar{x}_1 - \bar{x}_2}{S\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$, where $S = \sqrt{\frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}} = 40.7317$
$t = \frac{1234 - 1036}{40.7317\sqrt{\frac{1}{8} + \frac{1}{7}}} = 9.39$
Degrees of freedom ν = $n_1 + n_2 - 2$ = 13
Tabulated value of t for 13 d.f. at 5% level of significance is $t_{0.05} = 1.77$
Since $t > t_{0.05}$, $H_0$ is rejected and $H_1$ is accepted.
Conclusion: Type I bulbs may be regarded as superior to type II bulbs at 5% level of significance.
7. Two horses A and B were tested according to the time (in seconds) to run a particular track with
the following results.
Horse A 28 30 32 33 33 29 34
Horse B 29 30 30 24 27 27 -
Test whether you can discriminate between two horses. (Nov / Dec 2022)
Solution:
Null Hypothesis $H_0: \mu_1 = \mu_2$
Alternative Hypothesis $H_1: \mu_1 \neq \mu_2$ (two-tailed test)
$\bar{x} = \frac{1}{7} \times 219 = 31.3$ ; $\bar{y} = \frac{1}{6} \times 167 = 27.8$
$x$      $(x - \bar{x})^2$     $y$      $(y - \bar{y})^2$
28       10.89                 29       1.44
30       1.69                  30       4.84
32       0.49                  30       4.84
33       2.89                  24       14.44
33       2.89                  27       0.64
29       5.29                  27       0.64
34       7.29                  -        -
219      31.43                 167      26.84      (totals)
$S^2 = \frac{\sum (x - \bar{x})^2 + \sum (y - \bar{y})^2}{n_1 + n_2 - 2} = \frac{31.43 + 26.84}{7 + 6 - 2} = 5.30$
$S = 2.3$
$t = \frac{\bar{x} - \bar{y}}{S \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{31.3 - 27.8}{2.3 \sqrt{\frac{1}{7} + \frac{1}{6}}} = 2.73$
Degrees of freedom $= n_1 + n_2 - 2 = 11$. From the table, $t_{5\%}(\nu = 11) = 2.20$.
Calculated t > tabulated t, so $H_0$ is rejected, i.e., there is a significant difference between the two horses,
and they can be discriminated.
8. The nicotine contents in milligrams in two samples of tobacco were found to be as follows:
Sample A 24 27 26 21 25 -
Sample B 27 30 28 31 22 36
Can it be said that both the samples have come from same normal population? (Apr/May 2021)
Solution:
(i) F-test : (Equality of variance)
Let $H_0: \sigma_1^2 = \sigma_2^2$
$H_1: \sigma_1^2 \neq \sigma_2^2$
$x$      $x - \bar{x}$    $(x - \bar{x})^2$    $y$      $y - \bar{y}$    $(y - \bar{y})^2$
24       -0.6             0.36                 27       -2               4
27       2.4              5.76                 30       1                1
26       1.4              1.96                 28       -1               1
21       -3.6             12.96                31       2                4
25       0.4              0.16                 22       -7               49
-        -                -                    36       7                49
123                       21.2                 174                       108      (totals)
$\bar{x} = \frac{\sum x}{n_1} = \frac{123}{5} = 24.6$, $\bar{y} = \frac{\sum y}{n_2} = \frac{174}{6} = 29$
$S_1^2 = \frac{\sum (x - \bar{x})^2}{n_1 - 1} = \frac{21.2}{4} = 5.3$, $S_2^2 = \frac{\sum (y - \bar{y})^2}{n_2 - 1} = \frac{108}{5} = 21.6$
The test statistic is $F = \frac{S_2^2}{S_1^2}$ (since $S_2^2 > S_1^2$)
$= \frac{21.6}{5.3} = 4.07$
From the table, $F_{0.05}(n_2 - 1, n_1 - 1) = F_{0.05}(5, 4) = 6.26$
Since $F < F_{0.05}$, $H_0$ is accepted.
(ii) t-test: (Equality of means)
Null Hypothesis $H_0: \mu_1 = \mu_2$
Alternative Hypothesis $H_1: \mu_1 \neq \mu_2$
Under $H_0$, the test statistic is $t = \frac{\bar{x} - \bar{y}}{S \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$
$S^2 = \frac{1}{n_1 + n_2 - 2} \left[ \sum (x - \bar{x})^2 + \sum (y - \bar{y})^2 \right] = \frac{1}{5 + 6 - 2} [21.2 + 108] = \frac{129.2}{9} = 14.36$
$\therefore S = 3.79$
$t = \frac{24.6 - 29}{3.79 \sqrt{\frac{1}{5} + \frac{1}{6}}} = \frac{-4.4}{2.295} = -1.92 \Rightarrow |t| = 1.92$
From the table, with $n_1 + n_2 - 2 = 9$ degrees of freedom, $t_{0.05} = 2.262$ (two-tailed)
Since $|t| < t_{0.05}$, $H_0$ is accepted, i.e., $\mu_1 = \mu_2$
Conclusion: The two samples could have been drawn from the same normal population.
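The two-stage procedure (F-test for variances, then pooled t-test for means) can be sketched in Python as follows. The helper names are hypothetical, and the tabulated values 6.26 and 2.262 are copied from the solution rather than computed.

```python
import math

def sum_sq_dev(xs):
    """Sum of squared deviations from the sample mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)

def variance_ratio(a, b):
    """F = larger sample variance / smaller, with matching (num, den) d.f."""
    va = sum_sq_dev(a) / (len(a) - 1)
    vb = sum_sq_dev(b) / (len(b) - 1)
    if va >= vb:
        return va / vb, (len(a) - 1, len(b) - 1)
    return vb / va, (len(b) - 1, len(a) - 1)

def pooled_t(a, b):
    """Pooled two-sample t statistic computed from raw observations."""
    df = len(a) + len(b) - 2
    s = math.sqrt((sum_sq_dev(a) + sum_sq_dev(b)) / df)
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return (mean_a - mean_b) / (s * math.sqrt(1 / len(a) + 1 / len(b))), df

sample_a = [24, 27, 26, 21, 25]
sample_b = [27, 30, 28, 31, 22, 36]
f_stat, f_df = variance_ratio(sample_a, sample_b)
t_stat, t_df = pooled_t(sample_a, sample_b)

# Both nulls must be retained to conclude "same normal population".
same_population = f_stat < 6.26 and abs(t_stat) < 2.262
print(round(f_stat, 2), f_df, round(t_stat, 2), t_df, same_population)
```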
9. Two independent samples of six and seven items respectively had the following values of the
variable:
Sample 1 39 41 43 41 45 39 -
Sample 2 40 42 40 44 39 38 40
Do the two estimates of population variance differ significantly at 5% level of significance?
(Nov / Dec 2022)
Solution:
Let $H_0: \sigma_1^2 = \sigma_2^2$ ; $H_1: \sigma_1^2 \neq \sigma_2^2$
Given $n_1 = 6$ ; $n_2 = 7$
$\bar{x} = \frac{\sum x}{n_1} = \frac{248}{6} = 41.33$, $\bar{y} = \frac{\sum y}{n_2} = \frac{283}{7} = 40.43$
$x$      $(x - \bar{x})^2$     $y$      $(y - \bar{y})^2$
39       5.4289                40       0.1764
41       0.1089                42       2.4964
43       2.7889                40       0.1764
41       0.1089                44       12.8164
45       13.4689               39       2.0164
39       5.4289                38       5.8564
-        -                     40       0.1764
248      27.3334               283      23.7148      (totals)
$S_1^2 = \frac{\sum (x - \bar{x})^2}{n_1 - 1} = \frac{27.33}{5} = 5.47$, $S_2^2 = \frac{\sum (y - \bar{y})^2}{n_2 - 1} = \frac{23.71}{6} = 3.95$
The test statistic is $F = \frac{S_1^2}{S_2^2}$ (since $S_1^2 > S_2^2$)
$F = \frac{5.47}{3.95} = 1.38$
From the table, $F_{0.05}(n_1 - 1, n_2 - 1) = F_{0.05}(5, 6) = 4.39$
Since $F < F_{0.05}$, $H_0$ is accepted: the two estimates of population variance do not differ significantly.
10. Two random samples gave the following results:
Sample    Size    Sample mean    Sum of squares of deviations from the mean
1         10      15             90
2         12      14             108
Test whether the samples come from the same normal population at 5% level of significance.
Solution:
A normal population has 2 parameters namely mean µ and variance  2 . To test if independent samples
have been drawn from the same normal population, we have to test
1) Equality of population variances using F-test.
2)Equality of population means using t-test

Given $\bar{x} = 15$, $\bar{y} = 14$, $n_1 = 10$, $n_2 = 12$, $\sum (x - \bar{x})^2 = 90$, $\sum (y - \bar{y})^2 = 108$
1) F-test to test equality of population variances
Null Hypothesis $H_0: \sigma_1^2 = \sigma_2^2$ (the population variances are equal)
Alternative Hypothesis $H_1: \sigma_1^2 \neq \sigma_2^2$ (the population variances are not equal)
Level of significance: $\alpha = 5\%$
Test statistic: $F = \frac{S_1^2}{S_2^2}$
where $S_1^2 = \frac{1}{n_1 - 1} \sum (x - \bar{x})^2 = \frac{1}{10 - 1}(90) = 10$
and $S_2^2 = \frac{1}{n_2 - 1} \sum (y - \bar{y})^2 = \frac{1}{12 - 1}(108) = 9.818$
Here $S_1^2 > S_2^2$, so $F = \frac{S_1^2}{S_2^2} = \frac{10}{9.818} = 1.02$
Critical value: The critical value of F at 5% level of significance with degrees of freedom
$(n_1 - 1, n_2 - 1) = (9, 11)$ is 2.90
Here calculated value < table value, so we accept $H_0$.
2) t-test to test equality of population means:
Null hypothesis $H_0: \mu_1 = \mu_2$ (there is no difference between the two population means)
Alternate hypothesis $H_1: \mu_1 \neq \mu_2$ (there is a difference between the two population means)
Level of significance: $\alpha = 5\% = 0.05$ (two-tailed test)
Test statistic: $t = \frac{\bar{x} - \bar{y}}{S \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$
where $S^2 = \frac{1}{n_1 + n_2 - 2} \left[ \sum (x - \bar{x})^2 + \sum (y - \bar{y})^2 \right] = \frac{1}{10 + 12 - 2}(90 + 108) = 9.9$
$S = \sqrt{9.9} = 3.146$
$t = \frac{15 - 14}{3.146 \sqrt{\frac{1}{10} + \frac{1}{12}}} = 0.742$
Critical value: The critical value of t at 5% level of significance with degrees of freedom
$n_1 + n_2 - 2 = 10 + 12 - 2 = 20$ is 2.086
Calculated value < table value, so $H_0$ is accepted.
Conclusion: Both null hypotheses $\sigma_1^2 = \sigma_2^2$ and $\mu_1 = \mu_2$ are accepted.
Hence we may conclude that the two samples are drawn from the same normal population.
11. The table below gives the number of aircraft accidents that occurred during the various days of
the week. Test whether the accidents are uniformly distributed over the week.
Days Mon Tue Wed Thurs Fri Sat
No. of accidents 14 18 12 11 15 14
Solution:
We want to test whether the accidents are uniformly distributed, so we apply the $\chi^2$-test.
Null Hypothesis $H_0$: The accidents are uniformly distributed over the 6 days (Monday to Saturday).
Alternative Hypothesis $H_1$: The accidents are not uniformly distributed.
Under $H_0$, the expected frequency for each day $= \frac{84}{6} = 14$
The test statistic is $\chi^2 = \sum \frac{(O - E)^2}{E}$
O      E      O - E    (O - E)^2    (O - E)^2 / E
14     14     0        0            0.000
18     14     4        16           1.143
12     14     -2       4            0.286
11     14     -3       9            0.643
15     14     1        1            0.071
14     14     0        0            0.000
84     84                           $\chi^2 = 2.143$
Number of degrees of freedom $\nu = n - 1 = 6 - 1 = 5$
For $\nu = 5$ degrees of freedom, the table value of $\chi^2$ at 5% level is $\chi^2_{0.05} = 11.07$
Conclusion: Since $\chi^2 < \chi^2_{0.05}$, $H_0$ is accepted at 5% level of significance. The accidents are uniformly
distributed over the 6 days.
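The goodness-of-fit statistic is easy to reproduce in code. The following is a minimal Python sketch; `chi_square_gof` is a hypothetical helper, and the critical value 11.07 is the table value quoted above.

```python
def chi_square_gof(observed, expected=None):
    """Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E."""
    if expected is None:
        # Uniform hypothesis: every class gets the same expected count.
        expected = [sum(observed) / len(observed)] * len(observed)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

accidents = [14, 18, 12, 11, 15, 14]  # Monday to Saturday
chi2 = chi_square_gof(accidents)

CHI2_CRIT = 11.07  # 5% table value for 5 d.f., taken from the text
print(round(chi2, 3), "accept H0" if chi2 < CHI2_CRIT else "reject H0")
```

Passing an explicit `expected` list covers non-uniform hypotheses, for example expected counts derived from a ratio such as 9:3:3:1.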
12. The theory predicts that the proportion of beans in the four groups A, B, C, D should be 9:3:3:1.
In an experiment among 1600 beans, the numbers in the four groups were 882, 313, 287, and 118.
Does the experimental result support the theory?
Solution:
Ho: The experimental data support the theory
Based on $H_0$, the expected numbers of beans in the four groups are $1600 \times \frac{9}{16}$, $1600 \times \frac{3}{16}$, $1600 \times \frac{3}{16}$, $1600 \times \frac{1}{16}$, i.e., 900, 300, 300, 100.
Observed frequency (O)    Expected frequency (E)    (O - E)^2    (O - E)^2 / E
882                       900                       324          0.360
313                       300                       169          0.563
287                       300                       169          0.563
118                       100                       324          3.240
Total                                                            4.726
$\chi^2 = \sum \frac{(O - E)^2}{E} = 4.726$
Calculated value of $\chi^2 = 4.726$
Tabulated value of $\chi^2$ for $4 - 1 = 3$ degrees of freedom is 7.81 at 5% level of significance. Since calculated value < tabulated value,
we accept the null hypothesis, i.e., the experimental data support the theory.
13. A die was thrown 498 times. Denoting x to be the number appearing on the top face of it, the
observed frequency of x is given below:
X 1 2 3 4 5 6
F 69 78 85 82 86 98
What opinion you would form for the accuracy of the die?
Solution:
Given n = 6
Null Hypothesis $H_0$: The die is unbiased (there is no significant difference between observed and expected frequencies).
Alternative Hypothesis $H_1$: The die is biased (there is a significant difference).
Level of significance: $\alpha = 5\%$ or 0.05
Degrees of freedom = n - 1 = 6 - 1 = 5
On the assumption $H_0$, the expected frequency for each face = 498 / 6 = 83
Face    Observed Frequency (O)    Expected Frequency (E)    O - E    (O - E)^2    (O - E)^2 / E
1       69                        83                        -14      196          2.36
2       78                        83                        -5       25           0.30
3       85                        83                        2        4            0.05
4       82                        83                        -1       1            0.01
5       86                        83                        3        9            0.11
6       98                        83                        15       225          2.71
Total                                                                460          5.54
The test statistic is $\chi^2 = \sum \frac{(O - E)^2}{E} = 5.54$
For $\nu = 5$ degrees of freedom, the table value of $\chi^2$ at 5% level is $\chi^2_{0.05} = 11.07$
$\therefore \chi^2 < \chi^2_{0.05}$
Conclusion: Since the calculated value of $\chi^2$ < the table value of $\chi^2$, $H_0$ is accepted at 5% level of
significance. The die may be regarded as unbiased.

14. The following contingency table presents the reactions of legislators to a tax plan according to
party affiliation. Test whether party affiliation influences the reaction to the tax plan at 0.01 level
of significance.
Reaction
Party In favor Neutral Opposed Total
Party A 120 20 20 160
Party B 50 30 60 140
Party C 50 10 40 100
Total 220 60 120 400
Solution:
Null hypothesis $H_0$: Party affiliation and reaction to the tax plan are independent.
Alternate hypothesis $H_1$: Party affiliation and reaction to the tax plan are not independent.
Level of significance: $\alpha = 0.01$
The test statistic: $\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{s} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$
Expected frequencies (row total × column total / grand total):
$E(120) = \frac{160 \times 220}{400} = 88$ ; $E(20) = \frac{160 \times 60}{400} = 24$ ; $E(20) = \frac{160 \times 120}{400} = 48$
$E(50) = \frac{140 \times 220}{400} = 77$ ; $E(30) = \frac{140 \times 60}{400} = 21$ ; $E(60) = \frac{140 \times 120}{400} = 42$
$E(50) = \frac{100 \times 220}{400} = 55$ ; $E(10) = \frac{100 \times 60}{400} = 15$ ; $E(40) = \frac{100 \times 120}{400} = 30$
$O_{ij}$    $E_{ij}$    $O_{ij} - E_{ij}$    $(O_{ij} - E_{ij})^2$    $(O_{ij} - E_{ij})^2 / E_{ij}$
120         88          32                   1024                     11.64
20          24          -4                   16                       0.67
20          48          -28                  784                      16.33
50          77          -27                  729                      9.47
30          21          9                    81                       3.86
60          42          18                   324                      7.71
50          55          -5                   25                       0.45
10          15          -5                   25                       1.67
40          30          10                   100                      3.33
Total                                                                 55.13
$\therefore \chi^2 = 55.13$
Degrees of freedom $= (r - 1)(s - 1) = (3 - 1)(3 - 1) = 4$
For 4 degrees of freedom at $\alpha = 0.01$, the table value is $\chi^2_{0.01} = 13.28$
Conclusion: Since $\chi^2 > \chi^2_{0.01}$, we reject the null hypothesis $H_0$.
Hence, party affiliation and reaction to the tax plan are dependent.
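The expected frequencies and the statistic for a contingency table follow mechanically from the row and column totals. A minimal Python sketch, where `chi_square_independence` is a hypothetical helper and 13.28 is the 1% table value for 4 degrees of freedom used in this problem:

```python
def chi_square_independence(table):
    """Return (chi-square, d.f., expected table) for an r x s contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    # E_ij = (row total) * (column total) / (grand total)
    expected = [[r * c / grand for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, expected

reactions = [
    [120, 20, 20],  # Party A: in favor, neutral, opposed
    [50, 30, 60],   # Party B
    [50, 10, 40],   # Party C
]
chi2, df, expected = chi_square_independence(reactions)

CHI2_CRIT = 13.28  # table value taken from the text
print(round(chi2, 2), df, "reject H0" if chi2 > CHI2_CRIT else "accept H0")
```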


15. Calculate the expected frequencies for the following data presuming two attributes viz.,
conditions of home and condition of child as independent.
Condition of home
Clean Dirty
Condition of Child Clean 70 50
Fair 80 20
Dirty 35 45
Use Chi-Square test at 5% level of significance to state whether the two attributes are
independent.
Solution:
Null hypothesis $H_0$: Condition of home and condition of child are independent.
Alternate hypothesis $H_1$: Condition of home and condition of child are not independent.
The test statistic: $\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$
Analysis:
                               Condition of home
                               Clean     Dirty     Total
Condition of child   Clean     70        50        120
                     Fair      80        20        100
                     Dirty     35        45        80
                     Total     185       115       300
Expected Frequency $= \frac{\text{Corresponding row total} \times \text{Column total}}{\text{Grand total}}$
Expected Frequency for 70 $= \frac{120 \times 185}{300} = 74$
Expected Frequency for 80 $= \frac{100 \times 185}{300} = 61.67$
Expected Frequency for 35 $= \frac{80 \times 185}{300} = 49.33$
Expected Frequency for 50 $= \frac{120 \times 115}{300} = 46$
Expected Frequency for 20 $= \frac{100 \times 115}{300} = 38.33$
Expected Frequency for 45 $= \frac{80 \times 115}{300} = 30.67$
$O_{ij}$    $E_{ij}$    $O_{ij} - E_{ij}$    $(O_{ij} - E_{ij})^2$    $(O_{ij} - E_{ij})^2 / E_{ij}$
70          74          -4                   16                       0.216
50          46          4                    16                       0.348
80          61.67       18.33                335.99                   5.448
20          38.33       -18.33               335.99                   8.766
35          49.33       -14.33               205.35                   4.163
45          30.67       14.33                205.35                   6.695
Total                                                                 25.636
$\therefore \chi^2 = 25.636$
$\alpha = 0.05$, degrees of freedom $= (r - 1)(c - 1) = (3 - 1)(2 - 1) = 2$
$\chi^2_{0.05} = 5.991$
Conclusion: Since $\chi^2 > \chi^2_{0.05}$, we reject the null hypothesis $H_0$. Hence, condition of home and
condition of child are not independent.


UNIT IV DESIGN OF EXPERIMENTS
PART-A
1. What is the aim of the design of the experiments?
Solution:
The main aim of the design of experiments is to control the extraneous variables and hence to minimize
the experimental error so that the results of the experiments could be attributed only to the
experimental variables.
2. What are the basic principles of design of experiments? (April/ May 2017), (Nov/Dec 2019)
Solution: (i) Randomization (ii) Replication (iii) Local Control

3. What are the three essential steps to plan design of experiment? (Nov/Dec 2022)
Solution:
To plan an experiment, the following three are essential.
1. A Statement of the objective: Statement should clearly mention the hypothesis to be tested.
2. A description of the experiment: Description should include the type of experimental material, size
of the experiment and the number of replications.
3. The outline of the method of analysis: The outline of the method consists of analysis of variance.
4. Define Analysis of Variance. (Nov/Dec 2023)
Solution:
Analysis of Variance is a technique that will enable us to test for the significance of the difference
among more than two sample means.
5. What are the assumptions of analysis of variance? (April / May 2021)
Solution:
(i) The sample observations are independent
(ii) The Environmental effects are additive in nature
(iii) Population from which samples are taken is normal
6. What is the purpose of ANOVA?
Solution:
The purpose of ANOVA is to test the homogeneity of several means.
7. Define Replication.
Solution:
The repetition of the treatments under investigation is known as replication.
8. Define Experimental Error.
Solution:
The variation from plot to plot caused by uncontrolled factors, that is factors beyond the control of
the experimenter, is known as experimental error.
9. Define one-way classification and two-way classifications in ANOVA.
Solution:
In one-way classification, the observations are classified according to a single factor.
In two-way classification, the observations are classified according to two factors.
10. Write down the ANOVA table for one way classification. (April / May 2017)
Solution:
Source of Variation    Sum of Squares    Degrees of freedom    Mean Square                   F-Ratio
Between Samples        SSC               C - 1                 $MSC = \frac{SSC}{C - 1}$     $F_C = \frac{MSC}{MSE}$ if MSC > MSE
Within Samples         SSE               N - C                 $MSE = \frac{SSE}{N - C}$
11. Find the missing value of A, B, C, D from the ANOVA table.
S.V D.F S.S M.S.S F cal
Treatment 2 A 3 1.66
Error B C 5 -
Total 9 D - -
Solution:
A = 2 x 3 = 6, B = Total D.F - Treatment D.F = 9 - 2 = 7,
C = 7 x 5 = 35, D = A + C = 6 + 35 = 41
12. What is Latin Square Design? (April / May 2019)

Solution:
A Latin square is a square array of Latin letters A, B, C, D, etc., in which each letter
appears once and only once in each row and each column. A Latin square design involving n treatments
requires $n^2$ observations, n for each treatment.
13. Write down the ANOVA table for Randomized Block Design.
Solution:
Source of Variation     Sum of Squares    Degrees of freedom    Mean Square                            F-Ratio
Column Treatments       SSC               c - 1                 $MSC = \frac{SSC}{c - 1}$              $F_C = \frac{MSC}{MSE}$ if MSC > MSE
Row Treatments          SSR               r - 1                 $MSR = \frac{SSR}{r - 1}$              $F_R = \frac{MSR}{MSE}$ if MSR > MSE
Error (or) Residual     SSE               (r - 1)(c - 1)        $MSE = \frac{SSE}{(r - 1)(c - 1)}$
14. Find the missing values from the ANOVA table.
S.V D.F S.S M.S.S F cal
Treatment 4 A 40 5
Block B C 115 14.375
Error D 96 8 -
Total 19 601 - -
Solution:
A = 4 x 40 = 160 ; D = 96 / 8 = 12 ; B = 19 - 4 - 12 = 3 ; C = 115 x 3 = 345
15. Why a 2 x 2 Latin Square is not possible? (Nov/Dec 2023)
Solution:
For an n x n Latin square design, the degrees of freedom for SSE are
$(n^2 - 1) - (n - 1) - (n - 1) - (n - 1) = n^2 - 1 - 3n + 3 = n^2 - 3n + 2 = (n - 1)(n - 2)$
For n = 2, the degrees of freedom of SSE = 0 and hence MSE is not defined.
Comparisons are not possible.
Hence a 2 x 2 Latin Square Design is not possible.
16. What are the advantages of completely randomized block design?
Solution:
The advantages of a completely randomized experimental design are as follows:
(i) It is easy to lay out.
(ii) It allows flexibility.
(iii) The statistical analysis is simple.
(iv) The loss of information due to missing data is smaller than with any other design.
17. Write down the ANOVA table for Latin Square Design.
Solution:
Source of Variation     Sum of Squares    Degrees of freedom    Mean Square                            F-Ratio
Column Treatments       SSC               n - 1                 $MSC = \frac{SSC}{n - 1}$              $F_C = \frac{MSC}{MSE}$ if MSC > MSE
Row Treatments          SSR               n - 1                 $MSR = \frac{SSR}{n - 1}$              $F_R = \frac{MSR}{MSE}$ if MSR > MSE
Between Treatments      SSK               n - 1                 $MSK = \frac{SSK}{n - 1}$              $F_K = \frac{MSK}{MSE}$ if MSK > MSE
Error (or) Residual     SSE               (n - 1)(n - 2)        $MSE = \frac{SSE}{(n - 1)(n - 2)}$

18. Mention the advantages of Latin square design over other designs. (April/May 2021)
Solution:
The advantages of the Latin square design over other designs are:
(i) With a two-way stratification or grouping, the Latin square controls more of the variation than the
CRD or the randomized complete block design. The two-way elimination of variation often results in
a small error mean square.
(ii) The analysis is simple.
(iii) Even with missing data the analysis remains relatively simple.
19. Compare RBD, LSD, CRD.
Solution:
Design    Factors studied          Treatments and replications                                                           Field shape
CRD       One factor               No restriction on treatments and replications                                         -
RBD       Two factors              No restriction on treatments and replications                                         Rectangular or square field only
LSD       More than two factors    The number of replications of each treatment is equal to the number of treatments     Square field only

20. Construct 4x4 Latin Square design.


Solution:
Each treatment appears only once in each row and each column.
A B C D
B C D A
C D A B
D A B C
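The square above is the cyclic arrangement: each row is the previous row shifted one place to the left. A minimal Python sketch that builds such a square and checks the Latin property (the function names are hypothetical, and other valid 4 x 4 Latin squares exist):

```python
def cyclic_latin_square(symbols):
    """Build an n x n Latin square by cyclically shifting the first row."""
    n = len(symbols)
    return [[symbols[(i + j) % n] for j in range(n)] for i in range(n)]

def is_latin_square(square):
    """True if every symbol occurs exactly once in each row and each column."""
    symbols = set(square[0])
    n = len(square)
    if len(symbols) != n:
        return False
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all(set(col) == symbols for col in zip(*square))
    return rows_ok and cols_ok

square = cyclic_latin_square(["A", "B", "C", "D"])
for row in square:
    print(" ".join(row))
print(is_latin_square(square))
```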

PART-B
1. The following are the number of mistakes made in 5 successive days by 4 technicians working for
a photographic laboratory. Test whether the difference among the four sample means can be
attributed to chance. (Test at a level of significance $\alpha = 0.01$)
Technicians
I II III IV
6 14 10 9
14 9 12 12
10 12 7 8
8 10 15 10
11 14 11 11
Solution:
H0: There is no significant difference between the technicians
H1 : Significant difference between the technicians
We shift the origin to 10, i.e., $X_i = x_i - 10$.
         $X_1$    $X_2$    $X_3$    $X_4$    TOTAL    $X_1^2$    $X_2^2$    $X_3^2$    $X_4^2$
         -4       4        0        -1       -1       16         16         0          1
         4        -1       2        2        7        16         1          4          4
         0        2        -3       -2       -3       0          4          9          4
         -2       0        5        0        3        4          0          25         0
         1        4        1        1        7        1          16         1          1
Total    -1       9        5        0        13       37         37         39         10
N = Total No. of Observations = 20, T = Grand Total = 13
Correction Factor $= \frac{(\text{Grand total})^2}{\text{Total No. of Observations}} = \frac{13^2}{20} = 8.45$
$TSS = \sum X_1^2 + \sum X_2^2 + \sum X_3^2 + \sum X_4^2 - C.F = 37 + 37 + 39 + 10 - 8.45 = 114.55$
$SSC = \frac{(\sum X_1)^2}{C_1} + \frac{(\sum X_2)^2}{C_2} + \frac{(\sum X_3)^2}{C_3} + \frac{(\sum X_4)^2}{C_4} - C.F = \frac{(-1)^2}{5} + \frac{(9)^2}{5} + \frac{(5)^2}{5} + \frac{(0)^2}{5} - 8.45 = 12.95$
SSE = TSS - SSC = 114.55 - 12.95 = 101.6
ANOVA Table
Source of Variation    Sum of Squares    Degrees of freedom     Mean Square                           F-Ratio
Between Samples        SSC = 12.95       C - 1 = 4 - 1 = 3      $MSC = \frac{SSC}{C - 1} = 4.317$     $F_C = \frac{MSE}{MSC} = 1.471$ (since MSE > MSC)
Within Samples         SSE = 101.6       N - C = 20 - 4 = 16    $MSE = \frac{SSE}{N - C} = 6.35$
Cal $F_C$ = 1.471 and Tab $F_{0.01}(16, 3) = 26.83$
Conclusion: Cal $F_C$ < Tab $F_C$, so there is no significant difference between the technicians.
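The sums of squares do not depend on the choice of origin, so they can be checked directly on the raw data. A minimal Python sketch of the one-way computation (the helper `one_way_anova` is hypothetical; it reports the conventional ratio MSC/MSE, whereas the worked solution divides the larger mean square by the smaller):

```python
def one_way_anova(groups):
    """Sums of squares and mean squares for a one-way classification."""
    n_total = sum(len(g) for g in groups)
    grand_total = sum(sum(g) for g in groups)
    cf = grand_total ** 2 / n_total                      # correction factor
    tss = sum(x * x for g in groups for x in g) - cf
    ssc = sum(sum(g) ** 2 / len(g) for g in groups) - cf  # between samples
    sse = tss - ssc                                       # within samples
    msc = ssc / (len(groups) - 1)
    mse = sse / (n_total - len(groups))
    return {"TSS": tss, "SSC": ssc, "SSE": sse, "MSC": msc, "MSE": mse, "F": msc / mse}

technicians = [
    [6, 14, 10, 8, 11],    # I
    [14, 9, 12, 10, 14],   # II
    [10, 12, 7, 15, 11],   # III
    [9, 12, 8, 10, 11],    # IV
]
result = one_way_anova(technicians)
print({k: round(v, 3) for k, v in result.items()})
```

Because each group contributes its own length, the same function also handles unequal sample sizes.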
2. As part of the investigation of the collapse of the roof of a building, a testing laboratory is given
all the available bolts that connected the steel structure at 3 different positions on the roof. The
forces required to shear each of these bolts (coded values) are as follows:
Position 1 : 90 82 79 98 83 91
Position 2 : 105 89 93 104 89 95 86
Position 3 : 83 89 80 94
Perform an analysis of variance to test at the 0.05 level of significance whether the differences
among the sample means at the 3 positions are significant. (April / May 2019)
Solution:
H0: There is no significant difference between the sample means at the three positions.
H1: Significant difference between the sample means at the three positions.
We shift the origin to 89, i.e., $X_i = x_i - 89$.
         $X_1$    $X_2$    $X_3$    TOTAL    $X_1^2$    $X_2^2$    $X_3^2$
         1        16       -6       11       1          256        36
         -7       0        0        -7       49         0          0
         -10      4        -9       -15      100        16         81
         9        15       5        29       81         225        25
         -6       0        -        -6       36         0          -
         2        6        -        8        4          36         -
         -        -3       -        -3       -          9          -
Total    -11      38       -10      17       271        542        142
N = Total No. of Observations = 17, T = Grand Total = 17
Correction Factor $= \frac{(\text{Grand total})^2}{\text{Total No. of Observations}} = \frac{17^2}{17} = 17$
$TSS = \sum X_1^2 + \sum X_2^2 + \sum X_3^2 - C.F = 271 + 542 + 142 - 17 = 938$
$SSC = \frac{(\sum X_1)^2}{C_1} + \frac{(\sum X_2)^2}{C_2} + \frac{(\sum X_3)^2}{C_3} - C.F = \frac{(-11)^2}{6} + \frac{(38)^2}{7} + \frac{(-10)^2}{4} - 17 = 234.45$
SSE = TSS - SSC = 938 - 234.45 = 703.55
ANOVA Table
Source of Variation    Sum of Squares    Degrees of freedom     Mean Square                            F-Ratio
Between Samples        SSC = 234.45      C - 1 = 3 - 1 = 2      $MSC = \frac{SSC}{C - 1} = 117.23$     $F_C = \frac{MSC}{MSE} = 2.33$
Within Samples         SSE = 703.55      N - C = 17 - 3 = 14    $MSE = \frac{SSE}{N - C} = 50.25$
Cal $F_C$ = 2.33 and Tab $F_{0.05}(2, 14) = 3.74$
Conclusion: Cal $F_C$ < Tab $F_C$, so there is no significant difference between the given positions.
3. Four varieties A, B, C, D of a fertilizer are tested in a randomized block design with 4
replications. The plot yields in pounds are as follows:

Column /
1 2 3 4
Row
1 A 18 C 21 D 25 B 11
2 D 22 B 12 A 15 C 19
3 B 15 A 20 C 23 D 24
4 C 22 D 21 B 10 A 17
Analyse the experimental yield.
Solution:
Null Hypothesis $H_0$: There is no significant difference between the rows and between the columns.
Alternate Hypothesis $H_1$: There is a significant difference between the rows or between the columns.
Test statistic:
Row / Column    1 ($X_1$)    2 ($X_2$)    3 ($X_3$)    4 ($X_4$)    Total
1 ($Y_1$)       18           21           25           11           75
2 ($Y_2$)       22           12           15           19           68
3 ($Y_3$)       15           20           23           24           82
4 ($Y_4$)       22           21           10           17           70
Total           77           74           73           71           295
Step 1: N = Total No. of observations = 16
Step 2: T = Grand Total = 295
Step 3: Correction Factor $= \frac{(\text{Grand total})^2}{\text{Total No. of Observations}} = \frac{T^2}{N} = \frac{295^2}{16} = 5439.06$
Step 4: $TSS = \sum X_1^2 + \sum X_2^2 + \sum X_3^2 + \sum X_4^2 - C.F = 5769 - 5439.06 = 329.94$
Step 5: $SSC = \frac{(\sum X_1)^2}{C_1} + \frac{(\sum X_2)^2}{C_2} + \frac{(\sum X_3)^2}{C_3} + \frac{(\sum X_4)^2}{C_4} - C.F = \frac{77^2 + 74^2 + 73^2 + 71^2}{4} - 5439.06 = 4.69$
Step 6: $SSR = \frac{(\sum Y_1)^2}{R_1} + \frac{(\sum Y_2)^2}{R_2} + \frac{(\sum Y_3)^2}{R_3} + \frac{(\sum Y_4)^2}{R_4} - C.F = \frac{75^2 + 68^2 + 82^2 + 70^2}{4} - 5439.06 = 29.19$
Step 7: SSE = TSS - SSC - SSR = 329.94 - 4.69 - 29.19
SSE = 296.06
Step 8: ANOVA TABLE:
Source of Variation    Sum of Squares    Degrees of Freedom      Mean Sum of Squares                             Variance Ratio                      Table value at 5%
Between Columns        SSC = 4.69        c - 1 = 4 - 1 = 3       $MSC = \frac{SSC}{c - 1} = 1.5625$              $F_C = \frac{MSE}{MSC} = 21.05$     $F(9, 3) = 8.81$
Between Rows           SSR = 29.19       r - 1 = 4 - 1 = 3       $MSR = \frac{SSR}{r - 1} = 9.7292$              $F_R = \frac{MSE}{MSR} = 3.38$      $F(9, 3) = 8.81$
Error                  SSE = 296.06      (c - 1)(r - 1) = 9      $MSE = \frac{SSE}{(c - 1)(r - 1)} = 32.8958$
Total                  329.94            15
Table value: $F_C > F(9, 3)$ and $F_R < F(9, 3)$ at 5% LOS
Conclusion:
For columns, $H_0$ is rejected; hence there is a significant difference between columns.
For rows, $H_0$ is accepted; hence there is no significant difference between rows.
4. The following table gives the number of refrigerators sold by 4 salesmen in three months.
Salesman
Months
A B C D
May 50 40 48 39
June 46 48 50 45
July 39 44 40 39
Is there a significant difference in the sale made by the four salesmen? Is there a significant
difference in the sales made during different months? (April / May 2021)
Solution:
Null Hypothesis $H_0$: There is no significant difference between the sales in the 3 months and also
between the sales of the 4 salesmen.
Alternate Hypothesis $H_1$: There is a significant difference between the sales in the 3 months or
between the sales of the 4 salesmen.
Test statistic: To simplify calculations, we deduct 40 from each value.
Months          A ($X_1$)    B ($X_2$)    C ($X_3$)    D ($X_4$)    Total    $X_1^2$    $X_2^2$    $X_3^2$    $X_4^2$
$Y_1$ May       10           0            8            -1           17       100        0          64         1
$Y_2$ June      6            8            10           5            29       36         64         100        25
$Y_3$ July      -1           4            0            -1           2        1          16         0          1
Total           15           12           18           3            48       137        80         164        27
Step 1: N = Total No. of Observations = 12
Step 2: T = Grand Total = 48
Step 3: Correction Factor $= \frac{(\text{Grand total})^2}{\text{Total No. of Observations}} = \frac{T^2}{N} = \frac{48^2}{12} = 192$
Step 4: $TSS = \sum X_1^2 + \sum X_2^2 + \sum X_3^2 + \sum X_4^2 - C.F = 137 + 80 + 164 + 27 - 192 = 216$
Step 5: $SSC = \frac{(\sum X_1)^2}{C_1} + \frac{(\sum X_2)^2}{C_2} + \frac{(\sum X_3)^2}{C_3} + \frac{(\sum X_4)^2}{C_4} - C.F = \frac{15^2}{3} + \frac{12^2}{3} + \frac{18^2}{3} + \frac{3^2}{3} - 192 = 42$
Step 6: $SSR = \frac{(\sum Y_1)^2}{R_1} + \frac{(\sum Y_2)^2}{R_2} + \frac{(\sum Y_3)^2}{R_3} - C.F = \frac{17^2}{4} + \frac{29^2}{4} + \frac{2^2}{4} - 192 = 91.5$
Step 7: SSE = TSS - SSC - SSR = 216 - 42 - 91.5
SSE = 82.5
Step 8: ANOVA TABLE:
Source of Variation           Sum of Squares    Degrees of Freedom      Mean Sum of Squares                                         Variance Ratio                                          Table value at 5%
Between Columns (Salesmen)    SSC = 42          c - 1 = 4 - 1 = 3       $MSC = \frac{SSC}{c - 1} = \frac{42}{3} = 14$               $F_C = \frac{MSC}{MSE} = \frac{14}{13.75} = 1.018$      $F_C(3, 6) = 4.76$
Between Rows (Months)         SSR = 91.5        r - 1 = 3 - 1 = 2       $MSR = \frac{SSR}{r - 1} = \frac{91.5}{2} = 45.75$          $F_R = \frac{MSR}{MSE} = \frac{45.75}{13.75} = 3.327$   $F_R(2, 6) = 5.14$
Error                         SSE = 82.5        (c - 1)(r - 1) = 6      $MSE = \frac{SSE}{(c - 1)(r - 1)} = \frac{82.5}{6} = 13.75$
Total                         216               11
Conclusion:
1) Cal $F_C$ < Table $F_{C, 0.05}(3, 6)$. Hence, we accept $H_0$ and conclude that there is no significant
difference between the sales of the 4 salesmen.
2) Cal $F_R$ < Table $F_{R, 0.05}(2, 6)$. Hence, we accept $H_0$ and conclude that there is no significant
difference between the sales in the three months.
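Steps 1 to 8 translate almost line by line into code. The sketch below is a minimal Python version for a two-way classification with one observation per cell; `two_way_anova` is a hypothetical helper, and the critical values 4.76 and 5.14 are the table values quoted in the solution.

```python
def two_way_anova(table):
    """Two-way ANOVA (one observation per cell); rows and columns are the two factors."""
    r, c = len(table), len(table[0])
    grand_total = sum(sum(row) for row in table)
    cf = grand_total ** 2 / (r * c)                              # correction factor
    tss = sum(x * x for row in table for x in row) - cf
    ssr = sum(sum(row) ** 2 for row in table) / c - cf           # between rows
    ssc = sum(sum(col) ** 2 for col in zip(*table)) / r - cf     # between columns
    sse = tss - ssr - ssc
    msr, msc = ssr / (r - 1), ssc / (c - 1)
    mse = sse / ((r - 1) * (c - 1))
    return {"TSS": tss, "SSR": ssr, "SSC": ssc, "SSE": sse,
            "F_R": msr / mse, "F_C": msc / mse}

sales = [
    [50, 40, 48, 39],  # May: salesmen A, B, C, D
    [46, 48, 50, 45],  # June
    [39, 44, 40, 39],  # July
]
res = two_way_anova(sales)

# Table values at 5% taken from the text: F(3, 6) = 4.76, F(2, 6) = 5.14.
print(round(res["F_C"], 3), res["F_C"] < 4.76)
print(round(res["F_R"], 3), res["F_R"] < 5.14)
```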
5. The following table gives the number of articles of a product produced by five different workers
using four types of machines.
Machines
Workers
P Q R S
A 44 38 47 36
B 46 40 52 43
C 34 36 44 32
D 43 38 46 33
E 38 42 49 39
Test (i) Whether the five workers differ with respect to mean productivity and
(ii) Whether the four machines differ with respect to mean productivity.
Solution:
$H_0$: There is no significant difference between the machine types and between the workers.
$H_1$: There is a significant difference between the machine types and between the workers.
We shift the origin: $X_{ij} = x_{ij} - 46$; h = 5 (workers); k = 4 (machines); N = 20
Worker              P        Q        R        S        Total $T_{i*}$    $T_{i*}^2 / k$    $\sum_j X_{ij}^2$
A                   -2       -8       1        -10      -19               90.25             169
B                   0        -6       6        -3       -3                2.25              81
C                   -12      -10      -2       -14      -38               361               444
D                   -3       -8       0        -13      -24               144               242
E                   -8       -4       3        -7       -16               64                138
Total $T_{*j}$      -25      -36      8        -47      -100              661.5             1074
$T_{*j}^2 / h$      125      259.2    12.8     441.8    838.8
T = Grand Total = -100, Correction Factor $= \frac{(\text{Grand total})^2}{\text{Total No. of Observations}} = \frac{(-100)^2}{20} = 500$
$TSS = \sum_i \sum_j X_{ij}^2 - C.F = 1074 - 500 = 574$
$SSR = \sum_i \frac{T_{i*}^2}{k} - C.F = 661.5 - 500 = 161.5$
$SSC = \sum_j \frac{T_{*j}^2}{h} - C.F = 838.8 - 500 = 338.8$
SSE = TSS - SSR - SSC = 574 - 161.5 - 338.8 = 73.7
ANOVA Table
Source of Variation          Sum of Squares    Degrees of freedom        Mean Square       F-Ratio           Tabulated F
Between Rows (Workers)       SSR = 161.5       h - 1 = 4                 MSR = 40.375      $F_R = 6.574$     $F_{5\%}(4, 12) = 3.26$
Between Columns (Machines)   SSC = 338.8       k - 1 = 3                 MSC = 112.933     $F_C = 18.388$    $F_{5\%}(3, 12) = 3.49$
Residual                     SSE = 73.7        (h - 1)(k - 1) = 12       MSE = 6.1417
Total                        574               19
Conclusion: Cal $F_C$ > Tab $F_C$ and Cal $F_R$ > Tab $F_R$, so there is a significant difference between the
machine types and a significant difference between the workers.
6. Analyse the variance in the following latin square of yields (in kgs) of paddy where A, B, C, D
denote the different methods of cultivation.
D 122 A 121 C 123 B 122
B 124 C 123 A 122 D 125
A 120 B 119 D 120 C 121
C 122 D 123 B 121 A 122 (April / May 2021)
Examine whether the different methods of cultivation have given significantly different yields.

Solution:
$H_0$: There is no difference between columns, between rows and between treatments.
$H_1$: Not all are equal.
We shift the origin: $X_{ij} = x_{ij} - 120$
          $X_1$    $X_2$    $X_3$    $X_4$    TOTAL    $X_1^2$    $X_2^2$    $X_3^2$    $X_4^2$
$Y_1$     D 2      A 1      C 3      B 2      8        4          1          9          4
$Y_2$     B 4      C 3      A 2      D 5      14       16         9          4          25
$Y_3$     A 0      B -1     D 0      C 1      0        0          1          0          1
$Y_4$     C 2      D 3      B 1      A 2      8        4          9          1          4
TOTAL     8        6        6        10       30       24         20         14         34
n = 4, N = 16, T = Grand Total = 30
Correction Factor $= \frac{(\text{Grand total})^2}{\text{Total No. of Observations}} = \frac{(30)^2}{16} = 56.25$
$SST = \sum X_1^2 + \sum X_2^2 + \sum X_3^2 + \sum X_4^2 - C.F = 92 - 56.25 = 35.75$
$SSC = \frac{(\sum X_1)^2}{c_1} + \frac{(\sum X_2)^2}{c_2} + \frac{(\sum X_3)^2}{c_3} + \frac{(\sum X_4)^2}{c_4} - C.F = \frac{8^2 + 6^2 + 6^2 + 10^2}{4} - 56.25 = 59 - 56.25 = 2.75$
$SSR = \frac{(\sum Y_1)^2}{r_1} + \frac{(\sum Y_2)^2}{r_2} + \frac{(\sum Y_3)^2}{r_3} + \frac{(\sum Y_4)^2}{r_4} - C.F = \frac{8^2 + 14^2 + 0^2 + 8^2}{4} - 56.25 = 81 - 56.25 = 24.75$
Letters                             Total
A          1     2     0     2      5
B          2     4     -1    1      6
C          3     3     1     2      9
D          2     5     0     3      10
Total                               30
$SSK = \frac{(\sum A)^2}{4} + \frac{(\sum B)^2}{4} + \frac{(\sum C)^2}{4} + \frac{(\sum D)^2}{4} - C.F = \frac{5^2}{4} + \frac{6^2}{4} + \frac{9^2}{4} + \frac{10^2}{4} - 56.25$
$= 60.5 - 56.25 = 4.25$
SSE = SST - SSR - SSC - SSK = 35.75 - 24.75 - 2.75 - 4.25 = 4
ANOVA Table
Source of Variation    Sum of Squares    Degrees of freedom        Mean Square     F-Ratio          Tabulated F (5% level)
Between Rows           SSR = 24.75       n - 1 = 3                 MSR = 8.25      $F_R = 12.31$    $F_R(3, 6) = 4.76$
Between Columns        SSC = 2.75        n - 1 = 3                 MSC = 0.92      $F_C = 1.37$     $F_C(3, 6) = 4.76$
Between Letters        SSK = 4.25        n - 1 = 3                 MSK = 1.42      $F_K = 2.12$     $F_K(3, 6) = 4.76$
Residual               SSE = 4           (n - 1)(n - 2) = 6        MSE = 0.67
Total                  35.75             15
Conclusion: Cal $F_C$ < Tab $F_C$, Cal $F_K$ < Tab $F_K$ and Cal $F_R$ > Tab $F_R$, so there is a significant difference
between the rows, no significant difference between the columns and no significant difference
between the letters. Hence the different methods of cultivation have not given significantly different yields.
7. The following is a Latin square of a design when 4 varieties of seeds are being tested. Set up the
analysis of variance table and state your conclusion. You may carry out suitable change of origin and scale.
A 105 B 95 C 125 D 115
C 115 D 125 A 105 B 105
D 115 C 95 B 105 A 115
B 95 A 135 D 95 C 115 (April / May 2017)
Solution:
$H_0$: The four varieties are similar.
$H_1$: The four varieties are not similar.
Let us take 100 as origin and divide by 5 to simplify the calculation, i.e., $X = \frac{x - 100}{5}$.
Variety    $X_1$    $X_2$    $X_3$    $X_4$    TOTAL    $X_1^2$    $X_2^2$    $X_3^2$    $X_4^2$
$Y_1$      1        -1       5        3        8        1          1          25         9
$Y_2$      3        5        1        1        10       9          25         1          1
$Y_3$      3        -1       1        3        6        9          1          1          9
$Y_4$      -1       7        -1       3        8        1          49         1          9
Total      6        10       6        10       32       20         76         28         28
N = Total No. of Observations = 16, T = Grand Total = 32
Correction Factor $= \frac{(\text{Grand total})^2}{\text{Total No. of Observations}} = \frac{32^2}{16} = 64$
$TSS = \sum X_1^2 + \sum X_2^2 + \sum X_3^2 + \sum X_4^2 - C.F = 20 + 76 + 28 + 28 - 64 = 88$
$SSC = \frac{(\sum X_1)^2}{C_1} + \frac{(\sum X_2)^2}{C_2} + \frac{(\sum X_3)^2}{C_3} + \frac{(\sum X_4)^2}{C_4} - C.F = \frac{(6)^2}{4} + \frac{(10)^2}{4} + \frac{(6)^2}{4} + \frac{(10)^2}{4} - 64 = 4$
$SSR = \frac{(\sum Y_1)^2}{R_1} + \frac{(\sum Y_2)^2}{R_2} + \frac{(\sum Y_3)^2}{R_3} + \frac{(\sum Y_4)^2}{R_4} - C.F = \frac{(8)^2}{4} + \frac{(10)^2}{4} + \frac{(6)^2}{4} + \frac{(8)^2}{4} - 64 = 2$
To find SSK
Treatment    1     2     3     4     Total
A            1     1     3     7     12
B            -1    1     1     -1    0
C            5     3     -1    3     10
D            3     5     3     -1    10
$SSK = \frac{(\sum A)^2}{K_1} + \frac{(\sum B)^2}{K_2} + \frac{(\sum C)^2}{K_3} + \frac{(\sum D)^2}{K_4} - C.F = \frac{12^2 + 0^2 + 10^2 + 10^2}{4} - 64 = 22$
SSE = TSS - SSC - SSR - SSK = 88 - 4 - 2 - 22 = 60
ANOVA Table
Source of Variation     Sum of Squares    Degrees of freedom        Mean Square                                   F-Ratio
Between Columns         SSC = 4           n - 1 = 3                 $MSC = \frac{SSC}{n - 1} = 1.33$              $F_C = \frac{MSE}{MSC} = 7.5$
Between Rows            SSR = 2           n - 1 = 3                 $MSR = \frac{SSR}{n - 1} = 0.67$              $F_R = \frac{MSE}{MSR} = 15$
Between Treatments      SSK = 22          n - 1 = 3                 $MSK = \frac{SSK}{n - 1} = 7.33$              $F_K = \frac{MSE}{MSK} = 1.36$
Error (or) Residual     SSE = 60          (n - 1)(n - 2) = 6        $MSE = \frac{SSE}{(n - 1)(n - 2)} = 10$
Since MSE exceeds MSC, MSR and MSK, each ratio is formed as MSE divided by the corresponding mean square, with (6, 3) degrees of freedom.
Table value of F at 5% level for (6, 3) degrees of freedom = 8.94
Since $F_K = 1.36$ is less than the table value, $H_0$ is accepted: there is no significant difference between the four varieties.
8. Analyse the following results of a Latin square experiment. (April / May 2021)
Column / Row 1 2 3 4
1 A(12) D(20) C(16) B(10)
2 D(18) A(14) B(11) C(14)
3 B(12) C(15) D(19) A(13)
4 C(16) B(11) A(15) D(20)
The letters A, B, C, D denote the treatments and the figures in brackets denote the observations.
Solution:
H0 : There is no significant difference between rows,columns and treatments
H1 : There is a significant difference between rows,columns and treatments
Subtract 15 from every value

Variety X1 X2 X3 X4 TOTAL

Y1 -3 5 1 -5 -2

Y2 3 -1 -4 -1 -3

Y3 -3 0 4 -2 -1

Y4 1 -4 0 5 2
-2 0 1 -3 -4
N=Total No of Observations = 16 T=Grand Total = -4

St. Joseph’s College of Engineering 58


MA1452-Applied Probability & Statistics Dept. of BIOTECH & CHEMICAL 2023-2024
(Grand total )2
Correction Factor = =1
Total No of Observatio ns
TSS =  X12 + X 2 2 + X 32 + X 4 2 − C.F = 157

( X ) + ( X ) + ( X ) + ( X )
2 2 2 2

SSC = − C.F = 2.5


1 2 3 4

C1 C2 C3 C4
( Y ) + ( Y ) + ( Y ) + ( Y )
2 2 2 2

SSR = − C.F = 3.5


1 2 3 4

R1 R2 R3 R4
To find SSK
Treatment   1    2    3    4   Total
A          -3   -1    0   -2    -6
B          -3   -4   -4   -5   -16
C           1    0    1   -1     1
D           3    5    4    5    17

$SSK = \dfrac{(\sum Y_1)^2}{K_1} + \dfrac{(\sum Y_2)^2}{K_2} + \dfrac{(\sum Y_3)^2}{K_3} + \dfrac{(\sum Y_4)^2}{K_4} - C.F = 144.5$
SSE= TSS − SSC−SSR−SSK = 6.5

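ANOVA Table
Source of Variation    Sum of Squares   Degrees of freedom    Mean Square                          F-Ratio
Between Columns        SSC = 2.5        n - 1 = 3             MSC = SSC/(n - 1) = 0.83             FC = MSE/MSC = 1.30
Between Rows           SSR = 3.5        n - 1 = 3             MSR = SSR/(n - 1) = 1.167            FR = MSR/MSE = 1.08
Between Treatments     SSK = 144.5      n - 1 = 3             MSK = SSK/(n - 1) = 48.17            FK = MSK/MSE = 44.6
Error (or) Residual    SSE = 6.5        (n - 1)(n - 2) = 6    MSE = SSE/((n - 1)(n - 2)) = 1.08
Since MSE is larger than MSC, the ratio FC is formed with MSE in the numerator.
Table value of F for (3,6) degrees of freedom = 4.76
Since FC and FR are less than the table value, there is no significant difference between columns or between rows.
Since FK = 44.6 is greater than the table value, there is a significant difference between treatments.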
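The sums of squares in this problem can be cross-checked numerically. The following is a minimal Python sketch and not part of the original solution; the function name `latin_square_anova` and its `origin` argument are illustrative assumptions. It codes the data by subtracting 15, as in the solution, and forms the row, column and treatment sums of squares from the respective totals.

```python
# Latin square of Problem 8: each cell is (treatment letter, observation).
layout = [
    [("A", 12), ("D", 20), ("C", 16), ("B", 10)],
    [("D", 18), ("A", 14), ("B", 11), ("C", 14)],
    [("B", 12), ("C", 15), ("D", 19), ("A", 13)],
    [("C", 16), ("B", 11), ("A", 15), ("D", 20)],
]

def latin_square_anova(layout, origin=0):
    """Return the sums of squares and the treatment F-ratio for an n x n Latin square."""
    n = len(layout)
    # Code the data by shifting the origin (the solution subtracts 15).
    coded = [[value - origin for _, value in row] for row in layout]
    grand_total = sum(sum(row) for row in coded)
    cf = grand_total ** 2 / (n * n)                      # correction factor
    tss = sum(x * x for row in coded for x in row) - cf  # total sum of squares
    row_totals = [sum(row) for row in coded]
    col_totals = [sum(row[j] for row in coded) for j in range(n)]
    trt_totals = {}
    for i, row in enumerate(layout):
        for j, (letter, _) in enumerate(row):
            trt_totals[letter] = trt_totals.get(letter, 0) + coded[i][j]
    ssr = sum(t * t for t in row_totals) / n - cf
    ssc = sum(t * t for t in col_totals) / n - cf
    ssk = sum(t * t for t in trt_totals.values()) / n - cf
    sse = tss - ssr - ssc - ssk
    msk = ssk / (n - 1)
    mse = sse / ((n - 1) * (n - 2))
    return {"CF": cf, "TSS": tss, "SSC": ssc, "SSR": ssr,
            "SSK": ssk, "SSE": sse, "MSK": msk, "MSE": mse, "FK": msk / mse}

result = latin_square_anova(layout, origin=15)
for name, value in result.items():
    print(f"{name} = {value:.3f}")
```

Without intermediate rounding the treatment ratio comes out near 44.46; the hand value 44.6 arises from rounding MSE to 1.08 before dividing. Either value is far above the table value 4.76, so the conclusion is unchanged.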
UNIT – V STATISTICAL QUALITY CONTROL
PART A
1. Write down the objectives of statistical quality control.
To achieve better utilization of raw materials, to control waste and scrap and to optimize the
quality of the product without any defects
2. Define control chart.
A control chart is a graphical method used to find whether a process is in statistical control.
3. What are the uses of a quality control chart?
It helps in determining whether the goal set is being achieved, by finding out whether the process
is in control or not.
4. What are the different types of control chart?
Control charts for variables: mean ($\bar{X}$) chart and range (R) chart;
Control charts for attributes: p-chart, np-chart, c-chart.
5. Write down the control limits for mean chart.
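Central line (CL) = $\bar{\bar{x}}$, upper control limit (UCL) = $\bar{\bar{x}} + A_2\bar{R}$, lower control limit (LCL) = $\bar{\bar{x}} - A_2\bar{R}$, where $\bar{\bar{x}}$ is the
mean of the sample means and $\bar{R}$ is the mean of the sample ranges.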
6. Write down the control limits for range chart.
CL = $\bar{R}$, UCL = $D_4\bar{R}$, LCL = $D_3\bar{R}$.
7. Define p-chart.
Control chart for fraction defectives is called p-chart.
8. Define c-chart. When is it used? (Nov/Dec 2023)
Control chart for number of defects is called c-chart and it is used when c  4 or when c is small
compared with the maximum number of defects given in the data.
9. Write down the control limits for c-chart.
CL = $\bar{c}$, UCL = $\bar{c} + 3\sqrt{\bar{c}}$, LCL = $\bar{c} - 3\sqrt{\bar{c}}$
10. The total number of defects in 20 pieces is 220. What are the UCL and LCL?
$\bar{c} = 220/20 = 11$, so UCL = $\bar{c} + 3\sqrt{\bar{c}}$ = 20.95 and LCL = $\bar{c} - 3\sqrt{\bar{c}}$ = 1.05.
11. Write down the control limits for p-chart.
CL = $\bar{p}$, UCL = $\bar{p} + 3\sqrt{\dfrac{\bar{p}(1-\bar{p})}{n}}$, LCL = $\bar{p} - 3\sqrt{\dfrac{\bar{p}(1-\bar{p})}{n}}$
12. Define np –chart.
Control chart for number of defectives is called np chart.
13. When are the p-chart and np-chart used? (Nov/Dec 2023)
Solution: p-chart and np-chart are used when p  0.05 or np  4
14. What are two-sided tolerance limits?
Two-sided tolerance limits are values determined from a sample of size n so that one can claim
with $(1 - \alpha)100\%$ confidence that at least a proportion δ of the population is included between these values.
15. What is the tolerance limit? (APR / MAY 2019)
Tolerance limits of a quality characteristic are defined as those values between which nearly all
the manufactured items will lie. If the measurable quality characteristic X is assumed to be normally
distributed with mean μ and S.D. σ, then the tolerance limits are usually taken as μ ± 3σ, since only
0.27% of all the items produced can be expected to fall outside these limits.
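16. Write down the formulas for the UCL and LCL of the np-chart.
Solution:
Central line (CL) = $n\bar{p}$, Upper control limit = $n\bar{p} + 3\sqrt{n\bar{p}(1-\bar{p})}$ or $n\bar{p} + 3\sqrt{n\bar{p}\,\bar{q}}$,
Lower control limit = $n\bar{p} - 3\sqrt{n\bar{p}(1-\bar{p})}$ or $n\bar{p} - 3\sqrt{n\bar{p}\,\bar{q}}$, where $\bar{q} = 1 - \bar{p}$.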
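17. Find the lower and upper control limits for the $\bar{X}$-chart and R-chart, when each sample is of size 4
and $\bar{\bar{X}}$ = 10.80 and $\bar{R}$ = 0.46.
For n = 4, $A_2$ = 0.729, $D_3$ = 0 and $D_4$ = 2.282. For the $\bar{X}$-chart, LCL = 10.46, UCL = 11.14; for the R-chart, LCL = 0, UCL = 1.05.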
18. Find the lower and upper control limits for X - chart?
For X -chart, LCL=11.43, UCL=18.57 .
19. Find the lower and upper control limits for the p-chart and np-chart, when n = 100 and $\bar{p}$ = 0.085.
For the p-chart, LCL = 0.0013, UCL = 0.1687; for the np-chart, LCL = 0.134, UCL = 16.867.
20. A garment was sampled on 10 consecutive hours of production. The number of defects found per
garment is given below: Defects: 5,1,7,0,2,3,4,0,3,2. Compute upper and lower control limits for
monitoring number of defects. (APR / MAY ’19)
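$\bar{c}$ = 27/10 = 2.7, UCL = $\bar{c} + 3\sqrt{\bar{c}}$ = 7.6295, LCL = $\bar{c} - 3\sqrt{\bar{c}}$ = -2.2295, which is taken as 0 since the LCL cannot be negative.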
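The figure of 0.27% quoted in Question 15 for items falling outside μ ± 3σ is also the basis of the 3-sigma control limits used throughout this unit. It can be checked with a short Python sketch; this is an illustration only, and the helper names `normal_cdf` and `fraction_outside` are assumptions introduced here, not part of the notes.

```python
import math

def normal_cdf(z):
    """Standard normal distribution function, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def fraction_outside(k):
    """Probability that a normal variable lies more than k standard deviations from its mean."""
    return 2.0 * (1.0 - normal_cdf(k))

outside_3_sigma = fraction_outside(3.0)
print(f"Fraction outside mu +/- 3 sigma: {outside_3_sigma:.4%}")
```

The same function with k = 2 gives the fraction outside 2-sigma limits, which is why limits narrower than 3σ would produce many more false alarms.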
PART B
1. Given below are the values of the sample mean $\bar{X}$ and sample range R for 10 samples, each of size 5.
Draw the appropriate mean and range charts and comment on the state of control of the process.
Sample No. 1 2 3 4 5 6 7 8 9 10
Mean X i 43 49 37 44 45 37 51 46 43 47
Range Ri 5 6 5 7 7 4 8 6 4 6
Solution:
$\bar{\bar{X}} = \dfrac{1}{N}\sum \bar{X}_i = \dfrac{1}{10}(43 + 49 + 37 + 44 + 45 + 37 + 51 + 46 + 43 + 47) = 44.2$
$\bar{R} = \dfrac{1}{N}\sum R_i = \dfrac{1}{10}(5 + 6 + 5 + 7 + 7 + 4 + 8 + 6 + 4 + 6) = 5.8$

From the table of control chart for sample size n=5, we have A2 = 0.577, D3 = 0 & D4 = 2.115

(i). Control limits for $\bar{X}$ chart:
CL (central line) = $\bar{\bar{X}}$ = 44.2
LCL = $\bar{\bar{X}} - A_2\bar{R}$ = 44.2 - (0.577)(5.8) = 40.85, UCL = $\bar{\bar{X}} + A_2\bar{R}$ = 44.2 + (0.577)(5.8) = 47.55

Conclusion:
Since the 2nd, 3rd, 6th and 7th sample means fall outside the control limits, the process is out
of control according to the $\bar{X}$ chart.
(ii). Control limits for R-Chart:
CL = $\bar{R}$ = 5.8; LCL = $D_3\bar{R}$ = 0
UCL = $D_4\bar{R}$ = (2.115)(5.8) = 12.267 ≈ 12.27

Conclusion:
Since all the sample ranges fall within the control limits, the process is under control
according to the R chart.
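The limits and the out-of-control samples in this problem can be reproduced with a few lines of Python. This is a minimal sketch under the constants quoted in the solution for n = 5; the function names `xbar_r_limits` and `out_of_control` are hypothetical helpers introduced only for illustration.

```python
# Control-chart constants for subgroup size n = 5, as quoted in the solution.
A2, D3, D4 = 0.577, 0.0, 2.115

means = [43, 49, 37, 44, 45, 37, 51, 46, 43, 47]
ranges = [5, 6, 5, 7, 7, 4, 8, 6, 4, 6]

def xbar_r_limits(means, ranges, a2, d3, d4):
    """Return (LCL, CL, UCL) for the mean chart and for the range chart."""
    grand_mean = sum(means) / len(means)
    mean_range = sum(ranges) / len(ranges)
    xbar = (grand_mean - a2 * mean_range, grand_mean, grand_mean + a2 * mean_range)
    r = (d3 * mean_range, mean_range, d4 * mean_range)
    return xbar, r

def out_of_control(values, lcl, ucl):
    """1-based sample numbers whose value lies outside [lcl, ucl]."""
    return [i for i, v in enumerate(values, start=1) if v < lcl or v > ucl]

xbar_limits, r_limits = xbar_r_limits(means, ranges, A2, D3, D4)
xbar_flags = out_of_control(means, xbar_limits[0], xbar_limits[2])
r_flags = out_of_control(ranges, r_limits[0], r_limits[2])
print("X-bar chart (LCL, CL, UCL):", xbar_limits, "out of control:", xbar_flags)
print("R chart (LCL, CL, UCL):", r_limits, "out of control:", r_flags)
```

Replacing `means` and `ranges` with the data of Problem 2 gives the limits for that problem with the same constants, since its samples are also of size 5.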
2. The following data give the average life in hours and range in hours of 12 samples each of 5 lamps.
Construct the $\bar{X}$-chart and R-chart and comment on the state of control.
Sample No. 1 2 3 4 5 6 7 8 9 10 11 12

Mean X i 120 127 152 157 160 134 137 123 140 144 120 127
Range Ri 30 44 60 34 38 35 45 62 39 50 35 41
(APR / MAY 2019)
Solution:
$\bar{\bar{X}} = \dfrac{1}{N}\sum \bar{X}_i = \dfrac{1}{12}(120 + 127 + 152 + 157 + 160 + 134 + 137 + 123 + 140 + 144 + 120 + 127) = 136.75$
$\bar{R} = \dfrac{1}{N}\sum R_i = \dfrac{1}{12}(30 + 44 + 60 + 34 + 38 + 35 + 45 + 62 + 39 + 50 + 35 + 41) = 42.75$

From the table of control chart for sample size n=5, we have A2 = 0.577, D3 = 0 & D4 = 2.115

i) Control limits for $\bar{X}$ chart:
CL (central line) = $\bar{\bar{X}}$ = 136.75;
LCL = $\bar{\bar{X}} - A_2\bar{R}$ = 136.75 - (0.577)(42.75) = 112.08
UCL = $\bar{\bar{X}} + A_2\bar{R}$ = 136.75 + (0.577)(42.75) = 161.42
[$\bar{X}$ chart: UCL = 161.42, CL = 136.75, LCL = 112.08]
Conclusion: Since all the sample points lie within the LCL and UCL lines, the process is under control according
to the $\bar{X}$ chart.

ii) Control limits for R-Chart:
CL = $\bar{R}$ = 42.75; LCL = $D_3\bar{R}$ = 0; UCL = $D_4\bar{R}$ = (2.115)(42.75) = 90.42
[R chart: UCL = 90.42, CL = 42.75, LCL = 0]
Conclusion:
Since all the sample ranges fall within the control limits, the process is under control according to
the R chart.
3. The following data give the measurements of 10 samples each of size 5 in the production process
taken in an interval of 2 hours. Calculate the sample means and ranges and draw the control
charts for mean and range.
Sample No.                 1    2    3    4    5    6    7    8    9   10
Observed measurements X   49   50   50   48   47   52   49   55   53   54
                          55   51   53   53   49   55   49   55   50   54
                          54   53   48   51   50   47   49   50   54   52
                          49   46   52   50   44   56   53   53   47   54
                          53   50   47   53   45   50   45   57   51   56
(MAY / JUNE 2016) (NOV / DEC 2018)
Solution:
The sample means $\bar{X}_i$ are 52, 50, 50, 51, 47, 52, 49, 54, 51, 54 and the sample ranges $R_i$ are 6, 7, 6, 5, 6, 9, 8, 7, 7, 4.
$\bar{\bar{X}} = \dfrac{1}{N}\sum \bar{X}_i = \dfrac{1}{10}(52 + 50 + 50 + 51 + 47 + 52 + 49 + 54 + 51 + 54) = 51.0$
$\bar{R} = \dfrac{1}{N}\sum R_i = \dfrac{1}{10}(6 + 7 + 6 + 5 + 6 + 9 + 8 + 7 + 7 + 4) = 6.5$

From the table of control chart for sample size n=5, we have A2 = 0.577, D3 = 0 & D4 = 2.115

(i). Control limits for $\bar{X}$ chart:
CL (central line) = $\bar{\bar{X}}$ = 51.0;
LCL = $\bar{\bar{X}} - A_2\bar{R}$ = 51.0 - (0.577)(6.5) = 47.2495
UCL = $\bar{\bar{X}} + A_2\bar{R}$ = 51.0 + (0.577)(6.5) = 54.7505
[Mean chart: UCL = 54.75, CL = 51.0, LCL = 47.25]
Conclusion:
Since the 5th sample mean (47) falls below the LCL, the process is out of control
according to the $\bar{X}$ chart.

(ii). Control limits for R-Chart:
CL = $\bar{R}$ = 6.5;
LCL = $D_3\bar{R}$ = 0;
UCL = $D_4\bar{R}$ = (2.115)(6.5) = 13.7475
[Range chart: UCL = 13.75, CL = 6.5, LCL = 0]
Conclusion: Since all the sample ranges fall within the control limits, the process is under
control according to the R chart.
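The first step of this problem, reducing the raw measurements to sample means and ranges, can be sketched in Python as follows. This is an illustrative sketch only; the variable names are assumptions, and the constant A2 = 0.577 is the value quoted in the solution for n = 5.

```python
# Each inner list is one row of the data table; column j holds sample j + 1.
observations = [
    [49, 50, 50, 48, 47, 52, 49, 55, 53, 54],
    [55, 51, 53, 53, 49, 55, 49, 55, 50, 54],
    [54, 53, 48, 51, 50, 47, 49, 50, 54, 52],
    [49, 46, 52, 50, 44, 56, 53, 53, 47, 54],
    [53, 50, 47, 53, 45, 50, 45, 57, 51, 56],
]

# Transpose so that each tuple holds the five measurements of one sample.
samples = list(zip(*observations))
sample_means = [sum(s) / len(s) for s in samples]
sample_ranges = [max(s) - min(s) for s in samples]

grand_mean = sum(sample_means) / len(sample_means)
mean_range = sum(sample_ranges) / len(sample_ranges)

A2 = 0.577  # constant for n = 5, as quoted in the solution
lcl = grand_mean - A2 * mean_range
ucl = grand_mean + A2 * mean_range
flagged = [i for i, m in enumerate(sample_means, start=1) if m < lcl or m > ucl]

print("means :", sample_means)
print("ranges:", sample_ranges)
print("samples outside the mean-chart limits:", flagged)
```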
4. 15 samples of 200 items each were drawn from the output of a process. The numbers of defective items in
the samples are given below. Prepare a control chart for the fraction defective and comment on
the state of control.
Sample number (i):         1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
No. of defectives (np):   12  15  10   8  19  15  17  11  13  20  10   8   9   5   8
Solution:
$\sum np = 12 + 15 + 10 + \dots + 5 + 8 = 180$, $n\bar{p} = \dfrac{1}{N}\sum np = \dfrac{180}{15} = 12$, $\bar{p} = \dfrac{12}{200} = 0.06$
For the p-chart: CL = $\bar{p}$ = 0.06
$LCL = \bar{p} - 3\sqrt{\dfrac{\bar{p}(1-\bar{p})}{n}} = 0.06 - 3\sqrt{\dfrac{(0.06)(0.94)}{200}} = 0.0096 \approx 0.01$
$UCL = \bar{p} + 3\sqrt{\dfrac{\bar{p}(1-\bar{p})}{n}} = 0.06 + 3\sqrt{\dfrac{(0.06)(0.94)}{200}} = 0.1104 \approx 0.11$
The fraction defectives (values of $p_i = \dfrac{np_i}{n}$) for the given samples are $p_1 = \dfrac{12}{200} = 0.06$,
$p_2 = \dfrac{15}{200} = 0.075$, p3=0.05, p4=0.04, p5=0.095, p6=0.075, p7=0.085, p8=0.055, p9=0.065, p10=0.1,
p11=0.05, p12=0.04, p13=0.045, p14=0.025, p15=0.04.


Conclusion:
Since all the sample points lie between the LCL and UCL lines, the process is under control.
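The p-chart limits for a constant sample size can be checked with the following Python sketch. It is an illustration, not part of the original solution; `p_chart_limits` is a hypothetical helper name, and a negative lower limit is replaced by 0 as is done elsewhere in these notes.

```python
import math

def p_chart_limits(defectives, n):
    """Return (LCL, CL, UCL) of a p-chart for samples of constant size n."""
    p_bar = sum(defectives) / (n * len(defectives))
    sigma = math.sqrt(p_bar * (1 - p_bar) / n)
    lcl = max(0.0, p_bar - 3 * sigma)  # a negative LCL is taken as 0
    ucl = p_bar + 3 * sigma
    return lcl, p_bar, ucl

defectives = [12, 15, 10, 8, 19, 15, 17, 11, 13, 20, 10, 8, 9, 5, 8]
n = 200

lcl, p_bar, ucl = p_chart_limits(defectives, n)
fractions = [d / n for d in defectives]
flagged = [i for i, p in enumerate(fractions, start=1) if p < lcl or p > ucl]

print(f"LCL = {lcl:.4f}, CL = {p_bar:.4f}, UCL = {ucl:.4f}")
print("samples outside the limits:", flagged)
```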
5. Construct a control chart for defectives for the following data:
Sample No:            1   2   3   4   5   6   7   8   9  10
No. inspected:       90  65  85  70  80  80  70  95  90  75
No. of defectives:    9   7   3   2   9   5   3   9   6   7
(APR / MAY 2019)
Solution:
We note that the size of the sample varies from sample to sample.
We can construct a p-chart, provided $0.75\bar{n} < n_i < 1.25\bar{n}$ for all i.
Here $\bar{n} = \dfrac{1}{N}\sum n_i = \dfrac{1}{10}(90 + 65 + \dots + 90 + 75) = \dfrac{1}{10}(800) = 80$
Hence $0.75\bar{n} = 60$ and $1.25\bar{n} = 100$.
All the values of $n_i$ lie between 60 and 100. Hence the p-chart can be drawn by the method given below.
Now $\bar{p} = \dfrac{\text{Total no. of defectives}}{\text{Total no. of items inspected}} = \dfrac{60}{800} = 0.075$
Hence for the p-chart to be constructed,
CL = $\bar{p}$ = 0.075
$LCL = \bar{p} - 3\sqrt{\dfrac{\bar{p}(1-\bar{p})}{\bar{n}}} = 0.075 - 3\sqrt{\dfrac{0.075 \times 0.925}{80}} = -0.013$
Since LCL cannot be negative, it is taken as 0.


$UCL = \bar{p} + 3\sqrt{\dfrac{\bar{p}(1-\bar{p})}{\bar{n}}} = 0.075 + 3\sqrt{\dfrac{0.075 \times 0.925}{80}} = 0.163$
The values of $p_i$ for the various samples are 0.100, 0.108, 0.035, 0.029, 0.113, 0.063, 0.043, 0.095, 0.067,
0.093

Since all the sample points lie within the control lines, the process is under control.
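When the sample size varies, the check on the sample sizes and the per-sample fractions can be sketched in Python as below. This is an illustrative sketch; the variable names are assumptions, and the size condition is written with strict inequalities, which is itself an assumption about the intended rule.

```python
import math

inspected = [90, 65, 85, 70, 80, 80, 70, 95, 90, 75]
defectives = [9, 7, 3, 2, 9, 5, 3, 9, 6, 7]

# Average sample size and the condition under which a single set of limits is used.
n_bar = sum(inspected) / len(inspected)
sizes_ok = all(0.75 * n_bar < n < 1.25 * n_bar for n in inspected)

# Overall fraction defective and 3-sigma limits based on the average sample size.
p_bar = sum(defectives) / sum(inspected)
sigma = math.sqrt(p_bar * (1 - p_bar) / n_bar)
lcl = max(0.0, p_bar - 3 * sigma)  # a negative LCL is taken as 0
ucl = p_bar + 3 * sigma

# Each sample's fraction defective uses its own sample size.
fractions = [d / n for d, n in zip(defectives, inspected)]
flagged = [i for i, p in enumerate(fractions, start=1) if p < lcl or p > ucl]

print(f"n_bar = {n_bar}, sizes within 25% of n_bar: {sizes_ok}")
print(f"LCL = {lcl:.3f}, CL = {p_bar:.3f}, UCL = {ucl:.3f}")
print("fractions:", [f"{p:.3f}" for p in fractions])
print("samples outside the limits:", flagged)
```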
6. The data given below are the number of defectives in 10 samples of 100 items each. Construct a
p-chart and an np-chart and comment on the results.
Sample No. 1 2 3 4 5 6 7 8 9 10
No. of defectives 6 16 7 3 8 12 7 11 11 4
(Nov/Dec 2023)
Solution:
Sample size is constant for all samples, n=100.

Total no. of defectives = 6 + 16 + 7 + 3 + 8 + 12 + 7 + 11 + 11 + 4 = 85,
Total no. inspected = 10 × 100 = 1000
Average fraction defective = $\bar{p} = \dfrac{\text{Total no. of defectives}}{\text{Total no. of items inspected}} = \dfrac{85}{1000} = 0.085$

For p-chart:
CL = $\bar{p}$ = 0.085
$LCL = \bar{p} - 3\sqrt{\dfrac{\bar{p}(1-\bar{p})}{n}} = 0.085 - 3\sqrt{\dfrac{(0.085)(0.915)}{100}} = 0.0013$
$UCL = \bar{p} + 3\sqrt{\dfrac{\bar{p}(1-\bar{p})}{n}} = 0.085 + 3\sqrt{\dfrac{(0.085)(0.915)}{100}} = 0.1687$
The fraction defectives $p_i$ for the samples are 0.06, 0.16, 0.07, 0.03, 0.08, 0.12, 0.07, 0.11, 0.11, 0.04.
[p-chart: UCL = 0.1687, CL = 0.085, LCL = 0.0013]
Conclusion:
All these values are less than UCL=0.1687 and greater than LCL=0.0013. In the control chart, all
sample points lie within the control limits. Hence, the process is under statistical control.
For np-chart:
$UCL = n\bar{p} + 3\sqrt{n\bar{p}(1-\bar{p})} = n\left[\bar{p} + 3\sqrt{\dfrac{\bar{p}(1-\bar{p})}{n}}\right] = 100(0.1687) = 16.87$
CL = $n\bar{p}$ = 100(0.085) = 8.5
$LCL = n\bar{p} - 3\sqrt{n\bar{p}(1-\bar{p})} = n\left[\bar{p} - 3\sqrt{\dfrac{\bar{p}(1-\bar{p})}{n}}\right] = 100(0.0013) = 0.13$
[np-chart: UCL = 16.87, CL = 8.5, LCL = 0.13]
Conclusion:
All the values of the number of defectives in the table lie between 0.13 and 16.87. Hence, the process is
under control according to the np-chart as well.

7. In a factory producing spark plugs, the number of defectives found in the inspection of 15 lots of
100 each is given below; Draw the control chart for the number of defectives and comment on the
state of control.
Sample number (i):        1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
No. of defectives (np):   5  10  12   8   6   4   6   3   4   5   4   7   9   3   4

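Solution:
$\sum np = 5 + 10 + 12 + \dots + 3 + 4 = 90$,
$n\bar{p} = \dfrac{1}{N}\sum np = \dfrac{90}{15} = 6$,
$\bar{p} = \dfrac{n\bar{p}}{n} = \dfrac{6}{100} = 0.06$
For np-chart:
CL = $n\bar{p}$ = 6
$UCL = n\bar{p} + 3\sqrt{n\bar{p}(1-\bar{p})} = 6 + 3\sqrt{6(0.94)} = 13.12$
$LCL = n\bar{p} - 3\sqrt{n\bar{p}(1-\bar{p})} = 6 - 3\sqrt{6(0.94)} = -1.12$
Since LCL cannot be negative, LCL = 0.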

Conclusion:
Since all the sample points lie between the upper and lower control lines, the process is under
control.
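The np-chart calculation for the spark plug data can be sketched in Python as follows. This is a minimal illustration; `np_chart_limits` is a hypothetical helper name, and the clamping of a negative lower limit to 0 follows the convention used in the solution.

```python
import math

def np_chart_limits(defectives, n):
    """Return (LCL, CL, UCL) of an np-chart for lots of constant size n."""
    np_bar = sum(defectives) / len(defectives)   # average number of defectives
    p_bar = np_bar / n                           # average fraction defective
    spread = 3 * math.sqrt(np_bar * (1 - p_bar))
    return max(0.0, np_bar - spread), np_bar, np_bar + spread

defectives = [5, 10, 12, 8, 6, 4, 6, 3, 4, 5, 4, 7, 9, 3, 4]

lcl, np_bar, ucl = np_chart_limits(defectives, n=100)
flagged = [i for i, d in enumerate(defectives, start=1) if d < lcl or d > ucl]

print(f"LCL = {lcl:.2f}, CL = {np_bar:.2f}, UCL = {ucl:.2f}")
print("lots outside the limits:", flagged)
```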

8. 15 tape recorders were examined for quality control test. The number of defects in each tape
recorder is recorded below. Draw the appropriate control chart and comment on the state of
control.
Unit No. (i):         1  2  3  4  5  6  7  8  9  10  11  12  13  14  15
No. of defects (c):   2  4  3  1  1  2  5  3  6   7   3   1   4   2   1
Solution:
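The number of defects per sample containing only one item is given, so the c-chart is appropriate.
$\bar{c} = \dfrac{\sum c_i}{N} = \dfrac{2 + 4 + 3 + \dots + 2 + 1}{15} = \dfrac{45}{15} = 3$
CL = $\bar{c}$ = 3; LCL = $\bar{c} - 3\sqrt{\bar{c}} = 3 - 3\sqrt{3} = -2.20$;
LCL = 0 (since LCL cannot be negative)
UCL = $\bar{c} + 3\sqrt{\bar{c}} = 3 + 3\sqrt{3} = 8.20$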

Since all the sample points lie within the LCL and UCL lines, the process is under control.
9. A plant produces paper for newsprint and rolls of papers are inspected for defects. The results of
inspection of 20 rolls of papers are given below: Draw the c-chart and comment on the state of
control.

Roll No. (i):         1   2   3   4   5   6   7   8   9  10
No. of defects (c):  19  10   8  12  15  22   7  13  18  13
Roll No. (i):        11  12  13  14  15  16  17  18  19  20
No. of defects (c):  16  14   8   7   6   4   5   6   8   9

Solution:

The number of defects per sample containing only one item is given,
$\bar{c} = \dfrac{\sum c_i}{N} = \dfrac{19 + 10 + 8 + \dots + 8 + 9}{20} = \dfrac{220}{20} = 11$
CL = $\bar{c}$ = 11;
LCL = $\bar{c} - 3\sqrt{\bar{c}} = 11 - 3\sqrt{11} = 1.05$,
UCL = $\bar{c} + 3\sqrt{\bar{c}} = 11 + 3\sqrt{11} = 20.95$

Conclusion: Since one point (the 6th roll, with 22 defects) falls above the UCL, the process is out of control.
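The c-chart limits of Problems 8 and 9 can be reproduced with one small Python function. This is a sketch under the formulas used above; the name `c_chart` is a hypothetical helper, and a negative lower limit is replaced by 0 as in Problem 8.

```python
import math

def c_chart(defects):
    """Return (LCL, CL, UCL, flagged) for a c-chart; flagged holds 1-based unit numbers."""
    c_bar = sum(defects) / len(defects)
    spread = 3 * math.sqrt(c_bar)
    lcl = max(0.0, c_bar - spread)  # a negative LCL is taken as 0
    ucl = c_bar + spread
    flagged = [i for i, c in enumerate(defects, start=1) if c < lcl or c > ucl]
    return lcl, c_bar, ucl, flagged

# Problem 9: defects in 20 rolls of newsprint.
rolls = [19, 10, 8, 12, 15, 22, 7, 13, 18, 13, 16, 14, 8, 7, 6, 4, 5, 6, 8, 9]
# Problem 8: defects in 15 tape recorders.
recorders = [2, 4, 3, 1, 1, 2, 5, 3, 6, 7, 3, 1, 4, 2, 1]

rolls_result = c_chart(rolls)
recorders_result = c_chart(recorders)
print("rolls     (LCL, CL, UCL, flagged):", rolls_result)
print("recorders (LCL, CL, UCL, flagged):", recorders_result)
```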
