
MALLA REDDY COLLEGE OF ENGINEERING & TECHNOLOGY
(An Autonomous Institution – UGC, Govt. of India)
Recognized under 2(f) and 12(B) of UGC Act 1956
(Affiliated to JNTUH, Hyderabad; Approved by AICTE; Accredited by NBA & NAAC "A" Grade; ISO 9001:2015 Certified)

STATISTICAL INFERENCES
AND
STOCHASTIC PROCESS

B.Tech – II Year – I Semester


DEPARTMENT OF HUMANITIES AND SCIENCES
R22
MALLA REDDY COLLEGE OF ENGINEERING & TECHNOLOGY

B.Tech II Year I Sem        L: 3    T/P/D: -/-/-    C: 3

STATISTICAL INFERENCES AND STOCHASTIC PROCESS


(Common to CSE-AIML AND CSE-DS)
Course Objectives:

• To understand a random variable, which describes randomness or uncertainty in a realistic situation and can be of either discrete or continuous type.
• To learn important probability distributions: the Binomial and Poisson distributions in the discrete case and the Normal distribution in the continuous case.
• To understand the linear relationship between two variables and to predict how a dependent variable changes when an independent variable is adjusted.
• To learn the types of sampling, the sampling distributions of means and variances, and the estimation of statistical parameters.
• To use probability theory to make inferences about a population from large and small samples.
• To understand stochastic processes and Markov chains.

UNIT-I: Random Variables


Concept of a Random Variable, Discrete Probability Distributions, Continuous Probability
Distributions. Expectation-Mean and Variance of a Random Variables, Means and
Variances of Linear Combinations of Random Variables. Moments and Moment
Generating Functions.

UNIT-II: Probability Distributions


Discrete Probability Distributions: Binomial Distribution, Poisson Distribution. Continuous Probability Distributions: Normal Distribution, Areas under the Normal Curve, Applications of the Normal Distribution, Normal Approximation to the Binomial Distribution.

UNIT-III: Correlation and Regression


Correlation- Karl Pearson Correlation Coefficient, Rank correlation, Repeated Rank
Correlation, Introduction to Linear Regression-The Simple Linear Regression Model, the
lines of regression, properties of regression coefficients, angles between two regression
lines, interpretation of regression coefficients.
UNIT-IV: Sample Estimation & Test of Hypotheses
Sampling: Definitions, Standard error. Estimation: Point estimation and Interval estimation.
Testing of hypothesis: Null and Alternative hypothesis, Type I and Type II errors, Critical region, Confidence interval, Level of significance, One-tailed and Two-tailed tests.
Large sample tests: Test of significance; large sample tests for single mean, difference of means, single proportion, difference of proportions.
Small sample tests: Test for single mean, difference of means, paired t-test, test for ratio of variances (F-test), Chi-square test for goodness of fit and independence of attributes.

UNIT-V: Stochastic Processes and Markov Chains


Introduction to Stochastic processes- Markov process. Transition Probability, Transition
Probability Matrix, First order and Higher order Markov process, n-step transition
probabilities, Markov chain, Steady state condition, Markov analysis.
Suggested Text Books:

i) Fundamentals of Statistics by S.C. Gupta, 7th Edition, 2016.
ii) Fundamentals of Mathematical Statistics by S.C. Gupta and V.K. Kapoor.
iii) Higher Engineering Mathematics by B.S. Grewal, Khanna Publishers, 35th Edition, 2000.
iv) Miller and Freund's "Probability and Statistics for Engineers" by R.A. Johnson, Pearson Publishers, 9th Edition, 2017.

References :

i) Introduction to Probability and Statistics for Engineers and Scientists by Sheldon M. Ross.
ii) Probability and Statistics for Engineers by Dr. J. Ravichandran.

Course Outcomes: After learning the contents of this paper, the student must be able to
1. Describe randomness in realistic situations, which can be of either discrete or continuous type, and compute statistical constants of these random variables.
2. Gain the insight into probability distributions that is essential for industrial applications.
3. Make objective, data-driven decisions by using correlation and regression.
4. Draw statistical inferences using samples of a given size taken from a population.
5. Understand stochastic processes and Markov processes.
INDEX

UNIT NO    UNIT NAME

1          Random Variables
2          Probability Distributions
3          Correlation and Regression
4          Sample Estimation & Test of Hypotheses
5          Stochastic Processes and Markov Chains


STATISTICAL INFERENCE AND
RANDOM VARIABLES
STOCHASTIC PROCESS

UNIT – I

Random Variables
INTRODUCTION:

Random Experiment
If an experiment is conducted any number of times under identical conditions, there will be a
set of outcomes associated with it. If the result is not certain and is any one of the several
possible outcomes, the experiment is called a random experiment.

Each outcome is known as an elementary event.

Sample Space
The set of all possible elementary events in a trial is called the sample space (denoted by S), and each element of the sample space is called a sample point. Any subset of the sample space is an event (denoted by E).

Equally Likely Events


Events are said to be equally likely when there is no reason to expect any one of them rather
than any one of the others.

E.g., when a card is drawn from a pack of cards, any card may be obtained, i.e., all 52 elementary events are equally likely.

Exhaustive Events
All possible events in a trial are called exhaustive events.

Eg. In tossing a coin, there are two exhaustive elementary events, head and tail.

Mutually Exclusive Events


Events are said to be mutually exclusive if the happening of any one of them in a trial excludes the happening of all the others.

Classical definition of Probability


In a random experiment, let there be n mutually exclusive and equally likely elementary events. Let E be an event of the experiment. If m elementary events are in E (i.e., are favourable to the event E), then the probability of E is defined as

$$P(E) = \frac{m}{n} = \frac{\text{Number of elementary events in } E}{\text{Total number of elementary events in the random experiment}}$$

If $\bar{E}$ (the complementary event of E) denotes the non-occurrence of E, then the number of elementary events in $\bar{E}$ is n − m, and hence the probability of $\bar{E}$ is

$$P(\bar{E}) = \frac{n-m}{n} = 1 - \frac{m}{n} = 1 - P(E)$$

i.e., $P(E) + P(\bar{E}) = 1$.

Since m is a non-negative integer, n is a natural number and m ≤ n, we have $0 \le \frac{m}{n} \le 1$.

Hence $0 \le P(E) \le 1$.

Example 1: What is the probability for a leap year to have 52 Mondays and 53 Sundays?

Solution: A leap year has 366 days, i.e., 52 weeks and 2 days. These two extra days can be any one of the following 7 pairs:

(i) Mon & Tue
(ii) Tue & Wed
(iii) Wed & Thurs
(iv) Thurs & Fri
(v) Fri & Sat
(vi) Sat & Sun
(vii) Sun & Mon

Let E be the event of having 52 Mondays and 53 Sundays in the year. This happens only when the two extra days include a Sunday but not a Monday, i.e., only in case (vi).

Total number of possible cases is n = 7.

Number of cases favourable to E is m = 1.

$$\therefore P(E) = \frac{m}{n} = \frac{1}{7}$$

Example 2: A class consists of 6 girls and 10 boys. If a committee of 3 is chosen at random
from the class, find the probability that (i) 3 boys are selected (ii) exactly 2 girls are selected.

Solution: Total number of students = 16.

n(S) = number of ways of choosing 3 from 16 = ${}^{16}C_3 = 560$.

(i) Suppose 3 boys are selected. This can be done in ${}^{10}C_3$ ways, so $n(E) = {}^{10}C_3 = 120$.

$$\therefore P(E) = \frac{n(E)}{n(S)} = \frac{{}^{10}C_3}{{}^{16}C_3} = \frac{120}{560} = 0.2143$$

(ii) Suppose exactly 2 girls are selected. Then the third member is a boy, so

$$n(E) = {}^{6}C_2 \times {}^{10}C_1 = 15 \times 10 = 150$$

$$\therefore P(E) = \frac{n(E)}{n(S)} = \frac{{}^{6}C_2 \times {}^{10}C_1}{{}^{16}C_3} = \frac{150}{560} = 0.2679$$
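The counting in Example 2 can be checked with a short script. This is only a sketch: the variable names are illustrative, and `math.comb` from the Python standard library is used for the binomial coefficients.

```python
from math import comb

girls, boys, committee = 6, 10, 3
total = comb(girls + boys, committee)      # n(S): ways to choose 3 students from 16

# (i) all three members are boys
p_three_boys = comb(boys, 3) / total

# (ii) exactly two girls, so the remaining member is a boy
p_two_girls = comb(girls, 2) * comb(boys, 1) / total

print(total, round(p_three_boys, 4), round(p_two_girls, 4))
```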

PROBABILITY-AXIOMATIC APPROACH

Let S be a finite sample space. A real-valued function P from the power set of S into R is called a probability function on S if the following axioms are satisfied.

Axioms of probability:

(i) Axiom of positivity: $P(E) \ge 0$ for every event E ⊆ S
(ii) Axiom of certainty: $P(S) = 1$
(iii) Axiom of union: If $E_1$ and $E_2$ are disjoint subsets of S, then $P(E_1 \cup E_2) = P(E_1) + P(E_2)$

Addition Theorem on Probability

If S is a sample space, and 𝐸1 , 𝐸2 are any events in S then-

𝑃(𝐸1 𝑜𝑟 𝐸2 ) = 𝑃(𝐸1 ∪ 𝐸2 ) = 𝑃(𝐸1 ) + 𝑃(𝐸2 ) − 𝑃(𝐸1 ∩ 𝐸2 )

Multiplication Theorem of Probability

In a random experiment, if 𝐸1 , 𝐸2 are two events such that 𝑃(𝐸1 ) ≠ 0 and 𝑃(𝐸2 ) ≠ 0, then-

𝑃(𝐸1 ∩ 𝐸2 ) = 𝑃(𝐸1 ). 𝑃(𝐸2 ⁄𝐸1 )

𝑃(𝐸2 ∩ 𝐸1 ) = 𝑃(𝐸2 ). 𝑃(𝐸1 ⁄𝐸2 )

Conditional Probability

If $E_1, E_2$ are two events in a sample space and $P(E_1) \neq 0$, then the probability of $E_2$ after the event $E_1$ has occurred is called the conditional probability of $E_2$ given $E_1$. It is denoted by $P(E_2 / E_1)$ and defined as

$$P(E_2 / E_1) = \frac{P(E_1 \cap E_2)}{P(E_1)}$$

Similarly, if $P(E_2) \neq 0$,

$$P(E_1 / E_2) = \frac{P(E_1 \cap E_2)}{P(E_2)}$$


Random Variable
A random variable X is a real-valued function from the sample space S to the set of real numbers R.

(or)

A random variable X is a real number determined by the outcome of a random experiment.

E.g.: 1. Tossing 2 coins simultaneously

Sample space = {HH, HT, TH, TT}

Let the random variable X be the number of heads obtained. Then

X(S) = {0, 1, 2}.

2. Sum of the two numbers on throwing 2 dice

X(S) = {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}.

Types of Random Variables:


1. Discrete Random Variables: A random variable X is said to be discrete if it takes only a finite or countably infinite set of values, e.g. the values of the set {0, 1, 2, …, n}.

E.g.: Tossing a coin, throwing a die, number of defective items in a bag.

2. Continuous Random Variables: A random variable X is said to be continuous if it can take all possible values in a given interval.

E.g.: Heights and weights of students in a class.

Discrete Probability Distribution:


Let X be a discrete random variable with possible outcomes $x_1, x_2, x_3, \dots, x_n$ having probabilities $p(x_i)$ for $i = 1, 2, \dots, n$. If $p(x_i) > 0$ and $\sum_{i=1}^{n} p(x_i) = 1$, then the function $p(x_i)$ is called the probability mass function of the random variable X, and $\{x_i, p(x_i)\}$ for $i = 1, 2, \dots, n$ is called the discrete probability distribution of X.

E.g.: Tossing 2 coins simultaneously

Sample space = {HH, HT, TH, TT}

Let the random variable X be the number of heads obtained. Then X(S) = {0, 1, 2}.

Probability of getting no heads = 1/4, probability of getting 1 head = 1/2, probability of getting 2 heads = 1/4.

∴ The discrete probability distribution is

x_i       0      1      2
p(x_i)    1/4    1/2    1/4

The cumulative distribution function is given by $F(x) = P[X \le x] = \sum_{x_i \le x} p(x_i)$.

Properties of Cumulative Distribution function:

1. $P[a < X < b] = F(b) - F(a) - P[X = b]$

2. $P[a \le X \le b] = F(b) - F(a) + P[X = a]$

3. $P[a < X \le b] = F(b) - F(a)$

4. $P[a \le X < b] = F(b) - F(a) - P[X = b] + P[X = a]$
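The four interval identities for a discrete cumulative distribution function can be checked by brute force. The sketch below assumes the two-coin distribution from the example above; the helper names `F` and `P` and the choice a = 0, b = 2 are illustrative only.

```python
from fractions import Fraction as Fr

# Number of heads in two tosses of a fair coin
pmf = {0: Fr(1, 4), 1: Fr(1, 2), 2: Fr(1, 4)}

def F(x):
    """Cumulative distribution function F(x) = P[X <= x]."""
    return sum(p for v, p in pmf.items() if v <= x)

def P(cond):
    """Probability of the event described by the predicate cond."""
    return sum(p for v, p in pmf.items() if cond(v))

a, b = 0, 2
checks = [
    P(lambda v: a < v < b) == F(b) - F(a) - pmf[b],
    P(lambda v: a <= v <= b) == F(b) - F(a) + pmf[a],
    P(lambda v: a < v <= b) == F(b) - F(a),
    P(lambda v: a <= v < b) == F(b) - F(a) - pmf[b] + pmf[a],
]
print(checks)
```

Exact fractions are used so that the equalities can be compared without rounding error.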

Mean: The mean of the discrete probability distribution is defined as

$$\mu = \frac{\sum_{i=1}^{n} x_i\, p(x_i)}{\sum_{i=1}^{n} p(x_i)} = \sum_{i=1}^{n} x_i\, p(x_i), \quad \text{since } \sum_{i=1}^{n} p(x_i) = 1$$

Expectation: The expectation of the discrete probability distribution is defined as

$$E(X) = \sum_{i=1}^{n} x_i\, p(x_i)$$

In general, $E(g(X)) = \sum_{i=1}^{n} g(x_i)\, p(x_i)$

Properties:
1) $E(X) = \mu$

2) $E(kX) = k\,E(X)$

3) $E(X + k) = E(X) + k$

4) $E(aX \pm b) = a\,E(X) \pm b$

Variance: The variance of the discrete probability distribution is defined as

$$\mathrm{Var}(X) = V(X) = E\big[(X - E(X))^2\big]$$

$$\therefore V(X) = E(X^2) - [E(X)]^2 = \sum_{i=1}^{n} x_i^2\, p(x_i) - \mu^2$$

Properties:
1) $V(c) = 0$, where c is a constant

2) $V(kX) = k^2\, V(X)$

3) $V(X + k) = V(X)$

4) $V(aX \pm b) = a^2\, V(X)$

Problems
1. Three cars are selected at random from 6 cars, of which 2 are defective.

a) Find the probability distribution of the number of defective cars.

b) Find the expected number of defective cars.

Sol: Number of ways to select 3 cars from 6 cars = ${}^{6}C_3 = 20$.

Let the random variable X = number of defective cars selected; X(S) = {0, 1, 2}.

$$P(\text{no defective car}) = \frac{{}^{4}C_3 \cdot {}^{2}C_0}{{}^{6}C_3} = \frac{1}{5}$$

$$P(\text{one defective car}) = \frac{{}^{4}C_2 \cdot {}^{2}C_1}{{}^{6}C_3} = \frac{3}{5}$$

$$P(\text{two defective cars}) = \frac{{}^{4}C_1 \cdot {}^{2}C_2}{{}^{6}C_3} = \frac{1}{5}$$

Clearly, $p(x_i) > 0$ and $\sum_{i=1}^{n} p(x_i) = 1$.

The probability distribution of defective cars is

x_i       0      1      2
p(x_i)    1/5    3/5    1/5

$$\text{Expected number of defective cars} = \sum_{i=1}^{n} x_i\, p(x_i) = 0\left(\tfrac{1}{5}\right) + 1\left(\tfrac{3}{5}\right) + 2\left(\tfrac{1}{5}\right) = 1$$
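The distribution in Problem 1 can be reproduced by direct counting. The following is a minimal sketch using exact fractions; the names `good`, `bad` and `drawn` are illustrative.

```python
from fractions import Fraction as Fr
from math import comb

good, bad, drawn = 4, 2, 3
total = comb(good + bad, drawn)            # ways to pick 3 cars out of 6

# P(X = x): choose x defective cars and (drawn - x) non-defective ones
pmf = {x: Fr(comb(bad, x) * comb(good, drawn - x), total)
       for x in range(bad + 1)}

expected = sum(x * p for x, p in pmf.items())
print(pmf, expected)
```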

2. Let X be the random variable giving the sum of the two numbers obtained in throwing two fair dice. Find the probability distribution of X, its mean and its variance.

Sol: The sample space of throwing two dice is

S = {(1,1),(1,2),(1,3),(1,4),(1,5),(1,6),
     (2,1),(2,2),(2,3),(2,4),(2,5),(2,6),
     (3,1),(3,2),(3,3),(3,4),(3,5),(3,6),
     (4,1),(4,2),(4,3),(4,4),(4,5),(4,6),
     (5,1),(5,2),(5,3),(5,4),(5,5),(5,6),
     (6,1),(6,2),(6,3),(6,4),(6,5),(6,6)}

∴ n(S) = 36.

Let X = sum of the two numbers in throwing two dice; X(S) = {2,3,4,5,6,7,8,9,10,11,12}.

X     Favorable cases                          No. of favorable cases    p(x)
2     (1,1)                                    1                         1/36
3     (2,1),(1,2)                              2                         2/36
4     (3,1),(2,2),(1,3)                        3                         3/36
5     (4,1),(3,2),(2,3),(1,4)                  4                         4/36
6     (5,1),(4,2),(3,3),(2,4),(1,5)            5                         5/36
7     (6,1),(5,2),(4,3),(3,4),(2,5),(1,6)      6                         6/36
8     (6,2),(5,3),(4,4),(3,5),(2,6)            5                         5/36
9     (6,3),(5,4),(4,5),(3,6)                  4                         4/36
10    (6,4),(5,5),(4,6)                        3                         3/36
11    (6,5),(5,6)                              2                         2/36
12    (6,6)                                    1                         1/36


Clearly, $p(x_i) > 0$ and $\sum_{i=1}^{n} p(x_i) = 1$.

The probability distribution is given by

x_i       2      3      4      5      6      7      8      9      10     11     12
p(x_i)    1/36   2/36   3/36   4/36   5/36   6/36   5/36   4/36   3/36   2/36   1/36

$$\text{Mean} = \mu = \sum_{i=1}^{n} x_i\, p(x_i) = \frac{2(1) + 3(2) + 4(3) + 5(4) + 6(5) + 7(6) + 8(5) + 9(4) + 10(3) + 11(2) + 12(1)}{36} = \frac{252}{36} = 7$$

$$\text{Variance} = V(X) = \sum x_i^2\, p_i - \mu^2 = \frac{4(1) + 9(2) + 16(3) + 25(4) + 36(5) + 49(6) + 64(5) + 81(4) + 100(3) + 121(2) + 144(1)}{36} - 49 = \frac{1974}{36} - 49$$

∴ Variance = 5.83
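The table for the sum of two dice can be generated by enumerating all 36 outcomes. This is a sketch using only the standard library; exact fractions avoid rounding in the mean and variance.

```python
from collections import Counter
from fractions import Fraction as Fr
from itertools import product

# Count how many of the 36 equally likely outcomes give each sum
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
pmf = {x: Fr(c, 36) for x, c in sorted(counts.items())}

mean = sum(x * p for x, p in pmf.items())
variance = sum(x * x * p for x, p in pmf.items()) - mean ** 2
print(mean, variance, round(float(variance), 2))
```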

3. Let X be the random variable giving the maximum of the two numbers obtained in throwing two fair dice simultaneously. Find

a) the probability distribution of X

b) the mean

c) the variance

d) P(1 < X < 4)

e) P(2 ≤ X ≤ 4).

Sol: The sample space of throwing two dice is

S = {(1,1),(1,2),(1,3),(1,4),(1,5),(1,6),
     (2,1),(2,2),(2,3),(2,4),(2,5),(2,6),
     (3,1),(3,2),(3,3),(3,4),(3,5),(3,6),
     (4,1),(4,2),(4,3),(4,4),(4,5),(4,6),
     (5,1),(5,2),(5,3),(5,4),(5,5),(5,6),
     (6,1),(6,2),(6,3),(6,4),(6,5),(6,6)}

∴ n(S) = 36.

Let X = maximum of the two numbers in throwing two dice; X(S) = {1,2,3,4,5,6}.

X    Favorable cases                                                     No. of favorable cases    p(x)
1    (1,1)                                                               1                         1/36
2    (2,1),(1,2),(2,2)                                                   3                         3/36
3    (3,1),(1,3),(2,3),(3,3),(3,2)                                       5                         5/36
4    (1,4),(4,1),(4,2),(2,4),(4,3),(3,4),(4,4)                           7                         7/36
5    (1,5),(5,1),(2,5),(5,2),(3,5),(5,3),(5,4),(4,5),(5,5)               9                         9/36
6    (1,6),(6,1),(6,2),(2,6),(6,3),(3,6),(4,6),(6,4),(6,5),(5,6),(6,6)   11                        11/36

Clearly, $p(x_i) > 0$ and $\sum_{i=1}^{n} p(x_i) = 1$.

The probability distribution is given by

x_i       1      2      3      4      5      6
p(x_i)    1/36   3/36   5/36   7/36   9/36   11/36

$$\text{Mean} = \mu = \sum_{i=1}^{n} x_i\, p(x_i) = \frac{1(1) + 2(3) + 3(5) + 4(7) + 5(9) + 6(11)}{36} = \frac{161}{36} = 4.47$$

$$\text{Variance} = V(X) = \sum x_i^2\, p_i - \mu^2 = \frac{1(1) + 4(3) + 9(5) + 16(7) + 25(9) + 36(11)}{36} - \left(\frac{161}{36}\right)^2 = \frac{791}{36} - \frac{25921}{1296} = \frac{2555}{1296}$$

∴ Variance = 1.97.

$$P(1 < X < 4) = P(X = 2) + P(X = 3) = \frac{3}{36} + \frac{5}{36} = \frac{2}{9}$$

$$P(2 \le X \le 4) = P(X = 2) + P(X = 3) + P(X = 4) = \frac{3 + 5 + 7}{36} = \frac{5}{12}$$
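The same enumeration idea applies to the maximum of two dice. In the sketch below, the helper `distribution` is a hypothetical name introduced for illustration; it builds the probability mass function of any statistic of the two faces.

```python
from collections import Counter
from fractions import Fraction as Fr
from itertools import product

def distribution(stat):
    """PMF of stat(a, b) over the 36 equally likely outcomes of two dice."""
    counts = Counter(stat(a, b) for a, b in product(range(1, 7), repeat=2))
    return {x: Fr(c, 36) for x, c in sorted(counts.items())}

pmf = distribution(max)
mean = sum(x * p for x, p in pmf.items())
variance = sum(x * x * p for x, p in pmf.items()) - mean ** 2

p_open = sum(p for x, p in pmf.items() if 1 < x < 4)       # P(1 < X < 4)
p_closed = sum(p for x, p in pmf.items() if 2 <= x <= 4)   # P(2 <= X <= 4)
print(mean, variance, p_open, p_closed)
```

Keeping the mean as an exact fraction before squaring avoids the small error that appears when a rounded mean is squared by hand.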

4. A random variable X has the following probability function:

x_i       -3    -2     -1    0      1     2      3
p(x_i)    k     0.1    k     0.2    2k    0.4    2k

Find k, the mean and the variance.

Sol: We know that $\sum_{i=1}^{n} p(x_i) = 1$,

i.e., k + 0.1 + k + 0.2 + 2k + 0.4 + 2k = 1,

i.e., 6k + 0.7 = 1, ∴ k = 0.05.

$$\text{Mean} = \mu = \sum_{i=1}^{n} x_i\, p(x_i) = (-3)k + (-2)(0.1) + (-1)k + 0(0.2) + 1(2k) + 2(0.4) + 3(2k) = 4k + 0.6 = 0.8$$

$$\text{Variance} = V(X) = \sum x_i^2\, p_i - \mu^2 = 9k + 4(0.1) + k + 0 + 2k + 4(0.4) + 9(2k) - (0.8)^2 = 30k + 2 - 0.64$$

∴ Variance = 2.86.
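One way to check Problem 4 is to write each probability as (coefficient × k + constant) and solve the normalisation condition for k. The sketch below follows that idea; the lists `coeffs` and `consts` are an illustrative encoding of the table, not notation from the notes.

```python
from fractions import Fraction as Fr

xs = [-3, -2, -1, 0, 1, 2, 3]
# p(x) = coeff * k + const, e.g. p(-3) = 1*k + 0 and p(-2) = 0*k + 0.1
coeffs = [1, 0, 1, 0, 2, 0, 2]
consts = [Fr(0), Fr(1, 10), Fr(0), Fr(2, 10), Fr(0), Fr(4, 10), Fr(0)]

# The probabilities must sum to 1: k * sum(coeffs) + sum(consts) = 1
k = (1 - sum(consts)) / sum(coeffs)

probs = [c * k + d for c, d in zip(coeffs, consts)]
mean = sum(x * p for x, p in zip(xs, probs))
variance = sum(x * x * p for x, p in zip(xs, probs)) - mean ** 2
print(k, mean, variance)
```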

Continuous Probability distribution:


Let X be a continuous random variable taking values in the interval (a, b). A function f(x) is said to be the probability density function of X if

i) $f(x) \ge 0$ for all $x \in (a, b)$;

ii) the total area under the probability curve is 1, i.e., $\int_a^b f(x)\,dx = 1$;

iii) for any two distinct numbers c and d in (a, b) with c < d, P(c < X < d) is the area under the probability curve between the ordinates x = c and x = d, i.e., $P(c < X < d) = \int_c^d f(x)\,dx$.

Note: P(c < X < d) = P(c ≤ X ≤ d) = P(c ≤ X < d) = P(c < X ≤ d)


The cumulative distribution function of X is given by

$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt, \quad \text{i.e., } f(x) = \frac{d}{dx}F(x)$$

Mean: The mean of the continuous probability distribution is defined as

$$\mu = \int_{-\infty}^{\infty} x\, f(x)\,dx$$

Expectation: The expectation of the continuous probability distribution is defined as

$$E(X) = \int_{-\infty}^{\infty} x\, f(x)\,dx$$

In general, $E(g(X)) = \int_{-\infty}^{\infty} g(x)\, f(x)\,dx$.

Properties:
1) $E(X) = \mu$

2) $E(kX) = k\,E(X)$

3) $E(X + k) = E(X) + k$

4) $E(aX \pm b) = a\,E(X) \pm b$

Variance: The variance of the continuous probability distribution is defined as

$$\mathrm{Var}(X) = V(X) = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu^2$$

Properties:
1) $V(c) = 0$, where c is a constant

2) $V(kX) = k^2\, V(X)$

3) $V(X + k) = V(X)$

4) $V(aX \pm b) = a^2\, V(X)$

Mean Deviation: The mean deviation of a continuous probability distribution about its mean is defined as

$$\int_{-\infty}^{\infty} |x - \mu|\, f(x)\,dx$$


Median: The median is the point that divides the entire distribution into two equal parts. For a continuous distribution on (a, b), the median M is the point that divides the total area into two equal parts, i.e.,

$$\int_a^M f(x)\,dx = \int_M^b f(x)\,dx = \frac{1}{2}$$

Mode: The mode is the value of x for which f(x) is maximum, i.e., the point $x \in (a, b)$ at which $f'(x) = 0$ and $f''(x) < 0$.

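Problems
1. If the probability density function is $f(x) = \frac{k}{1+x^2}$, $-\infty < x < \infty$, find the value of k and the cumulative distribution function of X.

Sol: Since the total area under the probability curve is 1,

$$\int_{-\infty}^{\infty} \frac{k}{1+x^2}\,dx = 1$$

$$2k\left[\tan^{-1} x\right]_0^{\infty} = 1$$

$$2k\left(\tan^{-1}\infty - \tan^{-1} 0\right) = 2k \cdot \frac{\pi}{2} = 1$$

$$\therefore k = \frac{1}{\pi}$$

The cumulative distribution function is given by

$$F(x) = \int_{-\infty}^{x} f(t)\,dt = \frac{1}{\pi}\int_{-\infty}^{x} \frac{dt}{1+t^2} = \frac{1}{\pi}\left[\tan^{-1} t\right]_{-\infty}^{x} = \frac{1}{\pi}\left[\frac{\pi}{2} + \tan^{-1} x\right]$$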
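A quick numerical sanity check of this result is to confirm that the derivative of the closed-form F(x) reproduces the density f(x). The sketch below uses a central difference with an assumed step size `h`; the sample points are arbitrary.

```python
import math

k = 1 / math.pi

def f(x):
    """Density k / (1 + x^2) with k = 1/pi."""
    return k / (1 + x * x)

def F(x):
    """Closed-form CDF (1/pi) * (pi/2 + arctan x)."""
    return (math.pi / 2 + math.atan(x)) / math.pi

h = 1e-5  # assumed step for the central difference
points = (-2.0, 0.0, 0.5, 3.0)
max_gap = max(abs((F(x + h) - F(x - h)) / (2 * h) - f(x)) for x in points)
print(max_gap, F(0.0))
```

The value F(0) = 1/2 reflects the symmetry of the density about the origin.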

2. If the probability density function is $f(x) = c\,e^{-|x|}$, $-\infty < x < \infty$, find the value of c, the mean and the variance.

Sol: Since the total area under the probability curve is 1,

$$\int_{-\infty}^{\infty} c\,e^{-|x|}\,dx = 1$$

Since the integrand is an even function,

$$2c\int_0^{\infty} e^{-x}\,dx = 1$$

$$2c\left[-e^{-x}\right]_0^{\infty} = 2c = 1$$

$$\therefore c = \frac{1}{2}$$

$$\text{Mean} = \mu = \int_{-\infty}^{\infty} x\, f(x)\,dx = \frac{1}{2}\int_{-\infty}^{\infty} x\,e^{-|x|}\,dx = 0, \quad \text{since } x\,e^{-|x|} \text{ is an odd function.}$$

$$\text{Variance} = V(X) = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu^2 = \frac{1}{2}\int_{-\infty}^{\infty} x^2 e^{-|x|}\,dx = \frac{1}{2}\int_0^{\infty} 2x^2 e^{-x}\,dx$$

$$= \left[x^2(-e^{-x}) - 2x(e^{-x}) + 2(-e^{-x})\right]_0^{\infty} = 2$$

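3. If the probability density function is

$$f(x) = \begin{cases} \dfrac{\sin x}{2} & \text{if } 0 \le x \le \pi \\ 0 & \text{otherwise} \end{cases}$$

find the mean, median, mode and $P\left(0 < X < \frac{\pi}{2}\right)$.

Sol:

$$\text{Mean} = \mu = \int_{-\infty}^{\infty} x\, f(x)\,dx = \frac{1}{2}\int_0^{\pi} x \sin x\,dx = \frac{1}{2}\left[-x\cos x + \sin x\right]_0^{\pi} = \frac{\pi}{2}$$

Let M be the median. Then

$$\int_0^M \frac{\sin x}{2}\,dx = \int_M^{\pi} \frac{\sin x}{2}\,dx = \frac{1}{2}$$

Consider $\int_M^{\pi} \frac{\sin x}{2}\,dx = \frac{1}{2}$. Then $\left[-\cos x\right]_M^{\pi} = 1$, i.e., $1 + \cos M = 1$, so $\cos M = 0$.

$$\therefore M = \frac{\pi}{2}$$

To find the mode, set $f'(x) = 0$ on $0 \le x \le \pi$:

$$f'(x) = \frac{\cos x}{2} = 0 \implies x = \frac{\pi}{2}$$

and $f''(x) = -\frac{\sin x}{2}$, which is less than 0 at $x = \frac{\pi}{2}$.

$$\therefore \text{Mode} = \frac{\pi}{2}$$

$$P\left(0 < X < \frac{\pi}{2}\right) = \int_0^{\pi/2} \frac{\sin x}{2}\,dx = \frac{1}{2}\left[-\cos x\right]_0^{\pi/2} = \frac{1}{2}$$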


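4. If the distribution function is given by

$$F(x) = \begin{cases} 0 & \text{if } x \le 1 \\ k(x-1)^4 & \text{if } 1 < x \le 3 \\ 1 & \text{if } x > 3 \end{cases}$$

find k, f(x) and the mean.

Sol: The density is obtained from the cumulative distribution function by $f(x) = \frac{d}{dx}F(x)$,

$$\text{i.e., } f(x) = \begin{cases} 0 & \text{if } x \le 1 \\ 4k(x-1)^3 & \text{if } 1 < x \le 3 \\ 0 & \text{if } x > 3 \end{cases}$$

Since the total area under the probability curve is 1,

$$\int_1^3 4k(x-1)^3\,dx = 1$$

$$\left[k(x-1)^4\right]_1^3 = 16k = 1$$

$$\therefore k = \frac{1}{16}$$

$$\therefore f(x) = \begin{cases} 0 & \text{if } x \le 1 \\ \frac{1}{4}(x-1)^3 & \text{if } 1 < x \le 3 \\ 0 & \text{if } x > 3 \end{cases}$$

Substituting u = x − 1,

$$\text{Mean} = \mu = \int_{-\infty}^{\infty} x\, f(x)\,dx = \frac{1}{4}\int_1^3 x(x-1)^3\,dx = \frac{1}{4}\int_0^2 (u+1)u^3\,du = \frac{1}{4}\left[\frac{u^5}{5} + \frac{u^4}{4}\right]_0^2 = \frac{1}{4}\left(\frac{32}{5} + 4\right) = 2.6$$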
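Integrals such as the one for the mean in Problem 4 can be checked numerically. The sketch below uses a simple midpoint rule; the helper `integrate` and the number of sub-intervals are illustrative choices, not part of the notes. Because the density vanishes outside [1, 3], the mean must lie in that interval.

```python
def integrate(g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

k = 1 / 16

def f(x):
    """Density 4k(x - 1)^3 on [1, 3], obtained by differentiating F(x)."""
    return 4 * k * (x - 1) ** 3

area = integrate(f, 1, 3)                      # total probability
mean = integrate(lambda x: x * f(x), 1, 3)     # E(X)
print(round(area, 6), round(mean, 6))
```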

Mean and Variance of Linear combination of Variables:

Th1: If X is a continuous random variable and Y = aX + b, prove that E(Y) = aE(X) + b and V(Y) = a²V(X), where V stands for variance and a, b are constants.

Proof: By definition,

$$E(Y) = E(aX + b) = \int_{-\infty}^{\infty} (ax + b) f(x)\,dx = a\int_{-\infty}^{\infty} x\, f(x)\,dx + b\int_{-\infty}^{\infty} f(x)\,dx = a\,E(X) + b(1) = a\,E(X) + b$$

We have E(Y) = aE(X) + b and Y = aX + b.

Then $Y - E(Y) = a[X - E(X)]$, so

$$(Y - E(Y))^2 = a^2 [X - E(X)]^2$$

Taking expectation on both sides,

$$E\{(Y - E(Y))^2\} = a^2 E\{[X - E(X)]^2\}$$

$$V(Y) = a^2 V(X)$$

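Th2: If X is a continuous random variable and k is a constant, then prove that

i) Var(X + k) = Var(X)    ii) Var(kX) = k² Var(X)

Proof: By definition,

$$\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \left[\int_{-\infty}^{\infty} x\, f(x)\,dx\right]^2$$

i)
$$\mathrm{Var}(X + k) = \int_{-\infty}^{\infty} (x + k)^2 f(x)\,dx - \left[\int_{-\infty}^{\infty} (x + k) f(x)\,dx\right]^2$$

$$= \int_{-\infty}^{\infty} (x^2 + 2kx + k^2) f(x)\,dx - \left[\int_{-\infty}^{\infty} x\, f(x)\,dx + k\int_{-\infty}^{\infty} f(x)\,dx\right]^2$$

$$= \int_{-\infty}^{\infty} x^2 f(x)\,dx + 2k\int_{-\infty}^{\infty} x\, f(x)\,dx + k^2 - \left[\int_{-\infty}^{\infty} x\, f(x)\,dx + k\right]^2$$

$$= E(X^2) + 2k\,E(X) + k^2 - [E(X) + k]^2$$

$$= E(X^2) + 2k\,E(X) + k^2 - [E(X)]^2 - 2k\,E(X) - k^2$$

$$= E(X^2) - [E(X)]^2 = \mathrm{Var}(X)$$

ii)
$$\mathrm{Var}(kX) = \int_{-\infty}^{\infty} k^2 x^2 f(x)\,dx - \left[\int_{-\infty}^{\infty} kx\, f(x)\,dx\right]^2 = k^2\int_{-\infty}^{\infty} x^2 f(x)\,dx - k^2\left[\int_{-\infty}^{\infty} x\, f(x)\,dx\right]^2$$

$$= k^2\left[E(X^2) - \{E(X)\}^2\right] = k^2\,\mathrm{Var}(X)$$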

Th3: Expectation of a Linear Combination of Random Variables:

Let $X_1, X_2, \dots, X_n$ be any n random variables and let $a_1, a_2, \dots, a_n$ be any n constants. Then

$$E\left(\sum_{i=1}^{n} a_i X_i\right) = \sum_{i=1}^{n} a_i\, E(X_i)$$

provided all the expectations exist.


Problem:

1. Let X be a random variable with the following probability distribution.

X           -3     6      9
P(X = x)    1/6    1/2    1/3

Find $E[(2X + 1)^2]$.

Sol:
$$E(X) = \sum x\, P(x) = (-3)\cdot\frac{1}{6} + 6\cdot\frac{1}{2} + 9\cdot\frac{1}{3} = \frac{11}{2}$$

$$E(X^2) = \sum x^2 P(x) = (-3)^2\cdot\frac{1}{6} + 6^2\cdot\frac{1}{2} + 9^2\cdot\frac{1}{3} = \frac{93}{2}$$

$$E[(2X + 1)^2] = E(4X^2 + 4X + 1) = 4E(X^2) + 4E(X) + 1 = 4\cdot\frac{93}{2} + 4\cdot\frac{11}{2} + 1 = 209$$
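The linearity step used above can be compared against a direct evaluation of the expectation. This is a small sketch; the helper `E` is an illustrative name for the expectation of a function of X.

```python
from fractions import Fraction as Fr

pmf = {-3: Fr(1, 6), 6: Fr(1, 2), 9: Fr(1, 3)}

def E(g):
    """Expectation of g(X) under the distribution pmf."""
    return sum(g(x) * p for x, p in pmf.items())

direct = E(lambda x: (2 * x + 1) ** 2)                       # E[(2X+1)^2] term by term
via_linearity = 4 * E(lambda x: x * x) + 4 * E(lambda x: x) + 1
print(direct, via_linearity)
```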

MOMENTS:
Statistical moments play a crucial role when we specify the probability distribution to work with, since moments let us describe the properties of a statistical distribution.

Statistical estimation and testing of hypotheses, which are based on numerical values computed for each distribution, require the statistical moments.

The word "moment" is borrowed from mechanics. In statistics, moments are the arithmetic means of the first, second, third, ..., rth powers of the deviations taken from either the mean or an arbitrary point of the distribution. In other words, moments are statistical measures that give certain characteristics of the distribution. Generally, for any frequency distribution, four moments are obtained, known as the first, second, third and fourth moments. These four moments carry information about the mean, variance, skewness and kurtosis of the frequency distribution.

Moments can be classified into raw and central moments. Raw moments are measured about an arbitrary point A (say). If A is taken to be zero, the raw moments are called moments about the origin. When A is taken to be the arithmetic mean, we get central moments. The first raw moment about the origin is the mean, whereas the first central moment is zero. The second raw and central moments are the mean square deviation and the variance, respectively. The third and fourth moments are useful in measuring skewness and kurtosis.

The three types of moments are:
1. Moments about an arbitrary point,
2. Moments about the mean, and
3. Moments about the origin


Moments about Arbitrary Point


When the actual mean is a fraction, moments are first calculated about an arbitrary point and then converted to moments about the actual mean. When deviations are taken from an arbitrary point, the formulas are:

For Ungrouped Data

If $x_1, x_2, \dots, x_n$ are the n observations of a variable X, then their moments about an arbitrary point A are:

$$\text{Zero order moment about } A: \quad \mu'_0 = \frac{\sum_{i=1}^{n} (x_i - A)^0}{n} = 1$$

$$\text{First order moment about } A: \quad \mu'_1 = \frac{\sum_{i=1}^{n} (x_i - A)^1}{n}$$

$$\text{Second order moment about } A: \quad \mu'_2 = \frac{\sum_{i=1}^{n} (x_i - A)^2}{n}$$

$$\text{Third order moment about } A: \quad \mu'_3 = \frac{\sum_{i=1}^{n} (x_i - A)^3}{n}$$

$$\text{Fourth order moment about } A: \quad \mu'_4 = \frac{\sum_{i=1}^{n} (x_i - A)^4}{n}$$

In general, the rth order moment about an arbitrary point A is given by

$$\mu'_r = \frac{\sum_{i=1}^{n} (x_i - A)^r}{n}, \quad \text{for } r = 1, 2, \dots$$
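The conversion from moments about an arbitrary point to moments about the mean can be illustrated on a small data set. In the sketch below, the observations and the point A = 4 are hypothetical, and the conversion identities used (mean = A + μ′₁, μ₂ = μ′₂ − μ′₁², μ₃ = μ′₃ − 3μ′₂μ′₁ + 2μ′₁³) are the standard ones, stated here as an assumption since they are not written out in this section.

```python
from fractions import Fraction as Fr

def moment_about(data, A, r):
    """r-th order moment of the observations about the point A."""
    return Fr(sum((x - A) ** r for x in data), len(data))

data = [2, 4, 4, 5, 7, 8]   # hypothetical observations
A = 4                       # hypothetical arbitrary point

m1, m2, m3 = (moment_about(data, A, r) for r in (1, 2, 3))

# Convert moments about A into the mean and central moments
mean = A + m1
mu2 = m2 - m1 ** 2
mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3

# Central moments computed directly, for comparison
direct_mu2 = moment_about(data, mean, 2)
direct_mu3 = moment_about(data, mean, 3)
print(mean, mu2, mu3, direct_mu2, direct_mu3)
```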

For Grouped Data

If $x_1, x_2, \dots, x_k$ are k values (or mid-values in the case of class intervals) of a variable X with corresponding frequencies $f_1, f_2, \dots, f_k$, then the moments about an arbitrary point A are

$$\mu'_r = \frac{\sum_{i=1}^{k} f_i (x_i - A)^r}{N}, \quad \text{where } N = \sum_{i=1}^{k} f_i, \quad \text{for } r = 1, 2, \dots$$

Moments about Origin


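When we take the arbitrary point A = 0, we get the moments about the origin.

For Ungrouped Data

$$\mu'_r = \frac{\sum_{i=1}^{n} x_i^r}{n}, \quad \text{for } r = 1, 2, \dots$$

For Grouped Data

$$\mu'_r = \frac{\sum_{i=1}^{k} f_i\, x_i^r}{N}, \quad \text{where } N = \sum_{i=1}^{k} f_i$$

Moments about Mean

When we take the deviations from the actual mean $\bar{x}$ and calculate the moments, these are known as moments about the mean, or central moments, and are given by

$$\mu_r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^r}{n} \ \text{(ungrouped data)}, \qquad \mu_r = \frac{\sum_{i=1}^{k} f_i (x_i - \bar{x})^r}{N} \ \text{(grouped data)}$$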


Sol: First we construct following frequency distribution for calculation of moments


Moment Generating Function:

The M.G.F. of a random variable X (about the origin), whose p.d.f. (or p.m.f.) is $f_X(x)$, is given by

$$M_X(t) = E(e^{tX}) = \begin{cases} \sum_x e^{tx}\, p(x), & \text{for a discrete random variable} \\ \int_{-\infty}^{\infty} e^{tx} f_X(x)\,dx, & \text{for a continuous random variable} \end{cases}$$

It is a tool for calculating the higher moments:

$$\mu'_r = \left[\frac{d^r}{dt^r} M_X(t)\right]_{t=0}$$

The M.G.F. of a random variable X about the point x = a is defined as

$$M_X(t)\ (\text{about } x = a) = E(e^{t(X-a)}) = \begin{cases} \sum_x e^{t(x-a)}\, p(x), & \text{for a discrete random variable} \\ \int_{-\infty}^{\infty} e^{t(x-a)} f_X(x)\,dx, & \text{for a continuous random variable} \end{cases}$$

Properties of M.G.F:

1. Let Y = aX + b, where X is a r.v. with M.G.F. $M_X(t)$. Then $M_Y(t) = M_{aX+b}(t) = e^{bt} M_X(at)$.

2. $M_{kX}(t) = M_X(kt)$, where k is a constant.

3. If X and Y are two independent r.v.s having the M.G.F.s $M_X(t)$ and $M_Y(t)$, then the M.G.F. of (X + Y) is given by $M_{X+Y}(t) = M_X(t)\cdot M_Y(t)$.

4. A r.v. X may have no moments even though its M.G.F. exists.

5. A r.v. X can have all or some moments, yet its M.G.F. may not exist except perhaps at one point.
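The relation between the M.G.F. and the raw moments can be illustrated numerically. The sketch below assumes the two-coin distribution (number of heads) used earlier in the unit and approximates the first two derivatives of M_X(t) at t = 0 by finite differences; the step size `h` is an arbitrary choice.

```python
import math

# Number of heads in two tosses of a fair coin
pmf = {0: 0.25, 1: 0.5, 2: 0.25}

def M(t):
    """Moment generating function M_X(t) = E(e^{tX}) of the discrete pmf."""
    return sum(math.exp(t * x) * p for x, p in pmf.items())

h = 1e-4  # assumed finite-difference step
first = (M(h) - M(-h)) / (2 * h)               # approximates M'(0)  = E(X)
second = (M(h) - 2 * M(0) + M(-h)) / h ** 2    # approximates M''(0) = E(X^2)

exact_first = sum(x * p for x, p in pmf.items())
exact_second = sum(x * x * p for x, p in pmf.items())
print(first, exact_first, second, exact_second)
```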



STATISTICAL INFERENCE AND
PROBABILITY DISTRIBUTIONS
STOCHASTIC PROCESS

UNIT- II

PROBABILITY DISTRIBUTIONS

Binomial Distribution: A random variable X has a binomial distribution if it assumes only non-negative values with probability mass function given by

$$P(X = r) = b(r; n, p) = \begin{cases} {}^{n}C_r\, p^r q^{n-r}, & r = 0, 1, 2, \dots, n \\ 0, & \text{otherwise} \end{cases}$$

where q = 1 − p.

Conditions For Applicability Of Binomial Distributions:


1. The number of trials must be finite (n is finite).
2. The trials are independent.
3. There are only two possible outcomes in each trial, i.e., success and failure.
4. The probability of success in each trial remains constant.

Examples:
1. Tossing a coin 𝑛 times
2. Throwing a die
3. No. of defective items in the box

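Mean Of The Binomial Distribution

$$\mu = \sum_{r=0}^{n} r\, P(r) = \sum_{r=0}^{n} r\, {}^{n}C_r\, p^r q^{n-r}$$

$$= {}^{n}C_1\, p\, q^{n-1} + 2\,{}^{n}C_2\, p^2 q^{n-2} + 3\,{}^{n}C_3\, p^3 q^{n-3} + \dots + n\,{}^{n}C_n\, p^n q^{n-n}$$

$$= n p\, q^{n-1} + 2\cdot\frac{n(n-1)}{2!}\, p^2 q^{n-2} + 3\cdot\frac{n(n-1)(n-2)}{3!}\, p^3 q^{n-3} + \dots + n p^n$$

$$= np\left[q^{n-1} + {}^{(n-1)}C_1\, p\, q^{(n-1)-1} + \dots + p^{n-1}\right]$$

$$= np\,[p + q]^{n-1}$$

$$= np, \quad \text{since } p + q = 1$$

Mean = np.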

Variance Of The Binomial Distribution

$$\sigma^2 = \sum_{r=0}^{n} r^2 P(r) - \mu^2 = \sum_{r=0}^{n} [r(r-1) + r]\, P(r) - \mu^2$$

$$= \sum_{r=0}^{n} r(r-1) P(r) + \sum_{r=0}^{n} r\, P(r) - n^2 p^2$$

$$= \sum_{r=0}^{n} r(r-1)\, {}^{n}C_r\, p^r q^{n-r} + np - n^2 p^2$$

Now

$$\sum_{r=0}^{n} r(r-1)\, {}^{n}C_r\, p^r q^{n-r} = 2\,{}^{n}C_2\, p^2 q^{n-2} + 6\,{}^{n}C_3\, p^3 q^{n-3} + 12\,{}^{n}C_4\, p^4 q^{n-4} + \dots + n(n-1)\, p^n$$

$$= n(n-1)\, p^2 \left[q^{n-2} + {}^{(n-2)}C_1\, p\, q^{(n-2)-1} + \dots + p^{n-2}\right]$$

$$= n(n-1)\, p^2 (p + q)^{n-2} = n^2 p^2 - n p^2$$

Therefore

$$\sigma^2 = n^2 p^2 - n p^2 + np - n^2 p^2 = np(1 - p) = npq$$
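The results Mean = np and Variance = npq can be spot-checked by summing over the probability mass function directly. The values n = 12 and p = 0.3 below are hypothetical, chosen only for illustration.

```python
from math import comb

def binomial_pmf(n, p):
    """List of P(X = r) for r = 0..n under the binomial distribution b(r; n, p)."""
    q = 1 - p
    return [comb(n, r) * p ** r * q ** (n - r) for r in range(n + 1)]

n, p = 12, 0.3                      # hypothetical parameters
pmf = binomial_pmf(n, p)

mean = sum(r * pr for r, pr in enumerate(pmf))
variance = sum(r * r * pr for r, pr in enumerate(pmf)) - mean ** 2
print(mean, n * p, variance, n * p * (1 - p))
```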

Problems
1. A coin is tossed 10 times. Find the probability of getting

i) at least 7 heads ii) at most 3 heads iii) exactly 6 heads.

Sol: Given n = 10.

Probability of getting a head in tossing a coin = p = 1/2.

Probability of getting no head = q = 1 − 1/2 = 1/2.

The probability of getting r heads in 10 tosses is

$$P(X = r) = p(r) = {}^{10}C_r \left(\frac{1}{2}\right)^r \left(\frac{1}{2}\right)^{10-r}, \quad r = 0, 1, 2, \dots, 10$$

(i) The probability of getting at least seven heads is given by

$$P(X \ge 7) = P(X = 7) + P(X = 8) + P(X = 9) + P(X = 10)$$

$$= \frac{1}{2^{10}}\left[{}^{10}C_7 + {}^{10}C_8 + {}^{10}C_9 + {}^{10}C_{10}\right] = \frac{1}{2^{10}}[120 + 45 + 10 + 1] = \frac{176}{1024} = 0.1719$$

ii) The probability of getting at most 3 heads is given by

$$P(X \le 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)$$

$$= \frac{1}{2^{10}}\left[{}^{10}C_0 + {}^{10}C_1 + {}^{10}C_2 + {}^{10}C_3\right] = \frac{1}{2^{10}}[1 + 10 + 45 + 120] = \frac{176}{1024} = 0.1719$$

iii) The probability of getting exactly six heads is given by

$$P(X = 6) = {}^{10}C_6 \left(\frac{1}{2}\right)^6 \left(\frac{1}{2}\right)^{10-6} = \frac{210}{1024} = 0.205$$
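The three probabilities in this problem can be computed from the binomial formula directly. The sketch below uses exact fractions; the helper name `b` mirrors the notation b(r; n, p) but is otherwise an illustrative choice.

```python
from fractions import Fraction as Fr
from math import comb

def b(r, n, p):
    """Binomial probability b(r; n, p) = nCr * p^r * q^(n-r)."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

n, p = 10, Fr(1, 2)
at_least_7 = sum(b(r, n, p) for r in range(7, n + 1))
at_most_3 = sum(b(r, n, p) for r in range(0, 4))
exactly_6 = b(6, n, p)
print(float(at_least_7), float(at_most_3), float(exactly_6))
```

With p = 1/2 the distribution is symmetric, which is why the first two probabilities coincide.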

2. In 256 sets of 12 tosses of a coin, in how many cases can one expect 8 heads and 4 tails?

Sol: The probability of getting a head, p = 1/2.

The probability of getting a tail, q = 1/2.

Here n = 12.

The probability of getting 8 heads and 4 tails in 12 trials is

$$P(X = 8) = {}^{12}C_8 \left(\frac{1}{2}\right)^8 \left(\frac{1}{2}\right)^4 = \frac{12!}{8!\,4!}\left(\frac{1}{2}\right)^{12} = \frac{495}{2^{12}}$$

The expected number of such cases in 256 sets is

$$256 \times P(X = 8) = 2^8 \times \frac{495}{2^{12}} = \frac{495}{16} = 30.9375 \approx 31$$


3. Find the probability of getting an even number 3 or 4 or 5 times in throwing a die 10 times.

Sol: Probability of getting an even number in throwing a die = 3/6 = 1/2 = p.

Probability of getting an odd number in throwing a die = q = 1/2.

∴ The probability of getting an even number 3 or 4 or 5 times in throwing a die 10 times is

$$P(X = 3) + P(X = 4) + P(X = 5) = {}^{10}C_3 \left(\frac{1}{2}\right)^3 \left(\frac{1}{2}\right)^{7} + {}^{10}C_4 \left(\frac{1}{2}\right)^4 \left(\frac{1}{2}\right)^{6} + {}^{10}C_5 \left(\frac{1}{2}\right)^5 \left(\frac{1}{2}\right)^{5}$$

$$= \frac{1}{2^{10}}\left[{}^{10}C_3 + {}^{10}C_4 + {}^{10}C_5\right] = \frac{1}{2^{10}}[120 + 210 + 252] = \frac{582}{1024} = 0.568$$

4. Out of 800 families with 5 children each, how many would you expect to have

a) three boys b) five girls c) 2 or 3 boys d) at least 1 boy?

Sol: Given n = 5, N = 800

Let having a boy be a success.

Probability of having a boy, p = 1/2.

Probability of having a girl, q = 1 − 1/2 = 1/2.

The probability of having r boys among 5 children is

$$P(X=r) = p(r) = {}^{5}C_{r}\left(\tfrac{1}{2}\right)^{r}\left(\tfrac{1}{2}\right)^{5-r};\quad r = 0, 1, 2, \dots, 5$$

a) Probability of having 3 boys is given by

$$P(X=3) = {}^{5}C_{3}\left(\tfrac{1}{2}\right)^{3}\left(\tfrac{1}{2}\right)^{5-3} = \frac{10}{32} = \frac{5}{16}$$

Expected number of families having 3 boys = N p(3) = 800(5/16) = 250 families.

b) Probability of having 5 girls = probability of having no boys, which is given by

$$P(X=0) = {}^{5}C_{0}\left(\tfrac{1}{2}\right)^{0}\left(\tfrac{1}{2}\right)^{5-0} = \frac{1}{32}$$

Expected number of families having 5 girls = N p(0) = 800(1/32) = 25 families.

c) Probability of having either 2 or 3 boys is given by

$$P(X=2) + P(X=3) = {}^{5}C_{2}\left(\tfrac{1}{2}\right)^{2}\left(\tfrac{1}{2}\right)^{5-2} + {}^{5}C_{3}\left(\tfrac{1}{2}\right)^{3}\left(\tfrac{1}{2}\right)^{5-3} = \frac{10}{32} + \frac{10}{32} = \frac{5}{8}$$

Expected number of families having 2 or 3 boys = N [p(2) + p(3)] = 800(5/8) = 500 families.

d) Probability of having at least 1 boy is given by

$$P(X \ge 1) = 1 - P(X=0) = 1 - {}^{5}C_{0}\left(\tfrac{1}{2}\right)^{0}\left(\tfrac{1}{2}\right)^{5-0} = \frac{31}{32}$$

Expected number of families having at least 1 boy = 800(31/32) = 775 families.

5. Fit a Binomial distribution for the following data.

x    0    1    2    3    4    5
f    2   14   20   34   22    8

Sol: Given n = 5, N = ∑f = 2 + 14 + 20 + 34 + 22 + 8 = 100

∑ x_i f_i = 0(2) + 1(14) + 2(20) + 3(34) + 4(22) + 5(8) = 284

∴ Mean of the distribution = ∑ x_i f_i / ∑ f_i = 284/100 = 2.84

We have mean of the binomial distribution = np = 2.84

∴ p = 2.84/5 = 0.568; q = 1 − 0.568 = 0.432

Table To Fit Binomial Distribution

x    P(x)                                      Expected frequency N P(x)
0    5C0 (0.568)^0 (0.432)^5 = 0.015           100(0.015) = 1.5 ≈ 2
1    5C1 (0.568)^1 (0.432)^4 = 0.099           9.9 ≈ 10
2    5C2 (0.568)^2 (0.432)^3 = 0.260           26
3    5C3 (0.568)^3 (0.432)^2 = 0.342           34.2 ≈ 34
4    5C4 (0.568)^4 (0.432)^1 = 0.225           22.5 ≈ 22
5    5C5 (0.568)^5 (0.432)^0 = 0.059           5.9 ≈ 6

Fitted Binomial distribution is

x    0    1    2    3    4    5
f    2   10   26   34   22    6
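
The fitting procedure above (estimate p from the sample mean, then scale the binomial probabilities by N) can be sketched in a few lines. This is an illustrative check under the same assumptions as the solution, not a general fitting routine; all variable names are hypothetical.

```python
from math import comb

x = [0, 1, 2, 3, 4, 5]
f = [2, 14, 20, 34, 22, 8]   # observed frequencies from the problem

n = max(x)                   # number of trials, n = 5
N = sum(f)                   # total frequency, N = 100
mean = sum(xi * fi for xi, fi in zip(x, f)) / N
p = mean / n                 # from mean = n p
q = 1 - p

# Binomial probabilities and the corresponding expected frequencies
probs = [comb(n, r) * p**r * q**(n - r) for r in x]
expected = [round(N * pr) for pr in probs]

print(p, expected)
```

Working with unrounded probabilities avoids the small discrepancies that appear when each P(x) is first rounded to two decimals.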

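Recurrence Relation

$$p(r+1) = {}^{n}C_{r+1}\, p^{r+1} q^{n-r-1} \quad \dots (1)$$

$$p(r) = {}^{n}C_{r}\, p^{r} q^{n-r} \quad \dots (2)$$

Dividing (1) by (2),

$$\frac{p(r+1)}{p(r)} = \frac{{}^{n}C_{r+1}\, p^{r+1} q^{n-r-1}}{{}^{n}C_{r}\, p^{r} q^{n-r}} = \frac{{}^{n}C_{r+1}}{{}^{n}C_{r}}\left(\frac{p}{q}\right)$$

$$\therefore\ p(r+1) = \frac{{}^{n}C_{r+1}}{{}^{n}C_{r}}\left(\frac{p}{q}\right) p(r) = \frac{n-r}{r+1}\left(\frac{p}{q}\right) p(r).$$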

Poisson Distribution
A random variable X follows a Poisson distribution if it assumes only non-negative integer values and its probability mass function is given by

$$P(X=r) = P(r, \lambda) = \begin{cases} \dfrac{e^{-\lambda}\lambda^{r}}{r!} & \text{for } r = 0, 1, 2, \dots\ (\lambda > 0) \\ 0 & \text{otherwise} \end{cases}$$

Conditions For Poisson Distribution

1. The number of trials n is very large (n → ∞)
2. The probability p of occurrence of the event in a single trial is very small (p → 0)
3. λ = np is finite

Examples:
1. The number of printing mistakes per page in a large text
2. The number of telephone calls per minute at a switch board
3. The number of defective items manufactured by a company.


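Recurrence Relation

$$P(r+1) = \frac{e^{-\lambda}\lambda^{r+1}}{(r+1)!} \quad \dots (1)$$

$$P(r) = \frac{e^{-\lambda}\lambda^{r}}{r!} \quad \dots (2)$$

Dividing (1) by (2),

$$\frac{P(r+1)}{P(r)} = \frac{e^{-\lambda}\lambda^{r}\cdot\lambda}{(r+1)\,r!} \times \frac{r!}{e^{-\lambda}\lambda^{r}} = \frac{\lambda}{r+1}$$

$$P(r+1) = \left(\frac{\lambda}{r+1}\right) P(r) \quad \text{for } r = 0, 1, 2, \dots$$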

Problems
1. Using the recurrence relation, find the probabilities when x = 0, 1, 2, 3, 4, 5, if the mean of the Poisson distribution is 3.

Sol: We have

$$P(r+1) = \left(\frac{\lambda}{r+1}\right) P(r) \quad \text{for } r = 0, 1, 2, \dots \quad (1)$$

Given λ = 3

$$P(0) = \frac{e^{-3}\,3^{0}}{0!} = e^{-3} \quad [\text{by definition of the Poisson distribution}]$$

From (1),

For r = 0, P(1) = (3/(0+1)) P(0) = 3e^{-3}

For r = 1, P(2) = (3/(1+1)) P(1) = (9/2)e^{-3}

For r = 2, P(3) = (3/(2+1)) P(2) = (9/2)e^{-3}

For r = 3, P(4) = (3/(3+1)) P(3) = (27/8)e^{-3}

For r = 4, P(5) = (3/(4+1)) P(4) = (81/40)e^{-3}.
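
As a quick check on the recurrence, the sketch below builds P(0), ..., P(5) step by step, each term from the one before it, and compares the result with the direct Poisson formula. The function name `poisson_by_recurrence` is illustrative only.

```python
from math import exp, factorial

def poisson_by_recurrence(lam, r_max):
    """Return [P(0), ..., P(r_max)] using P(r+1) = lam / (r+1) * P(r)."""
    probs = [exp(-lam)]                      # P(0) = e^{-lam}
    for r in range(r_max):
        probs.append(lam / (r + 1) * probs[r])
    return probs

probs = poisson_by_recurrence(3.0, 5)

# Direct evaluation of e^{-lam} lam^r / r! for comparison
direct = [exp(-3.0) * 3.0**r / factorial(r) for r in range(6)]

for r, (a, b) in enumerate(zip(probs, direct)):
    print(r, round(a, 4), round(b, 4))
```

Note that each step multiplies the previous probability P(r), not P(0).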

2. If X is a Poisson variable such that 3P(X = 4) = P(X = 2)/2 + P(X = 0), find the mean and P(X ≤ 2).

Sol: Given

$$3P(X=4) = \frac{P(X=2)}{2} + P(X=0) \quad \dots (1)$$

Since X is a Poisson variable,

$$P(X=r) = \frac{e^{-\lambda}\lambda^{r}}{r!}$$

$$\therefore\ 3\,\frac{e^{-\lambda}\lambda^{4}}{4!} = \frac{e^{-\lambda}\lambda^{2}}{(2)\,2!} + \frac{e^{-\lambda}\lambda^{0}}{0!}$$

Cancelling e^{-λ} and multiplying by 8, we get λ⁴ − 2λ² − 8 = 0

Taking λ² = k, we get k² − 2k − 8 = 0

∴ k = 4, −2

Since λ² cannot be negative, λ² = 4, which implies λ = 2

Therefore, mean of the Poisson distribution = 2

$$P(X \le 2) = P(X=0) + P(X=1) + P(X=2) = \frac{e^{-2}\,2^{0}}{0!} + \frac{e^{-2}\,2^{1}}{1!} + \frac{e^{-2}\,2^{2}}{2!} = 5e^{-2} = 0.677.$$

3. A car hire firm has 2 cars which it hires out day by day. The number of demands for a car on each day is distributed as Poisson with mean 1.5. Calculate the proportion of days

i) on which there is no demand

ii) on which demand is refused.

Sol: Let X be the number of demands for a car on a given day.

Given mean = 1.5 = λ

Using the Poisson distribution,

$$P(X=r) = \frac{e^{-\lambda}\lambda^{r}}{r!}$$

i) Probability that there is no demand for a car is

$$P(X=0) = \frac{e^{-1.5}(1.5)^{0}}{0!} = 0.223$$

Expected number of days in a year on which there is no demand = N P(0) = 365(0.223) = 81.39 ≈ 81 days

ii) Demand is refused when more than 2 cars are requested. The probability of this is

$$P(X>2) = 1 - P(X=0) - P(X=1) - P(X=2) = 1 - \frac{e^{-1.5}(1.5)^{0}}{0!} - \frac{e^{-1.5}(1.5)^{1}}{1!} - \frac{e^{-1.5}(1.5)^{2}}{2!} = 0.191$$

Expected number of days in a year on which demand is refused = N P(X > 2) = 365(0.191) = 69.7 ≈ 70 days.

4. The distribution of typing mistakes committed by a typist is given below. Fit a Poisson distribution for it.

Mistakes per page    0     1     2    3    4    5
Number of pages      142   156   69   27   5    1

Sol: N = ∑f = 142 + 156 + 69 + 27 + 5 + 1 = 400

∑ x_i f_i = 0(142) + 1(156) + 2(69) + 3(27) + 4(5) + 5(1) = 400

∴ Mean of the distribution = ∑ x_i f_i / ∑ f_i = 400/400 = 1.

We have mean of the Poisson distribution = λ = 1

Table To Fit Poisson Distribution

x    P(x)                        Expected frequency N P(x)
0    e^{-1}(1)^0/0! = 0.368      400(0.368) = 147.2 ≈ 147
1    e^{-1}(1)^1/1! = 0.368      147
2    e^{-1}(1)^2/2! = 0.184      74
3    e^{-1}(1)^3/3! = 0.061      24
4    e^{-1}(1)^4/4! = 0.015      6
5    e^{-1}(1)^5/5! = 0.003      1

Fitted Poisson distribution is

Mistakes per page    0     1     2    3    4    5
Number of pages      147   147   74   24   6    1

Normal Distribution (Gaussian Distribution)

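Let X be a continuous random variable. It is said to follow a normal distribution if its pdf is given by

$$f(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}},\quad -\infty < x < \infty,\ -\infty < \mu < \infty,\ \sigma > 0$$

Here μ, σ are the mean and S.D. of X.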

Properties Of Normal Distribution

1. Normal curve is always centered at mean


2. Mean, median and mode coincide (i.e., equal)
3. It is unimodal
4. It is a symmetrical curve and bell shaped curve
5. X-axis is an asymptote to the normal curve
6. The total area under the normal curve from −∞ 𝑡𝑜∞ is “1”
7. The points of inflection of the normal curve are at x = 𝜇 ± 𝜎
8. The area of the normal curve between
𝜇 − 𝜎 to 𝜇 + 𝜎 = 68.27%

𝜇 − 2𝜎 𝑡𝑜 𝜇 + 2𝜎=95.44%

𝜇 − 3𝜎 𝑡𝑜 𝜇 + 3𝜎=99.73%

9.The curve for normal distribution is given below


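Standard Normal Variable

If X is a normal variable with mean μ and S.D. σ, then $Z = \dfrac{X-\mu}{\sigma}$ has mean 0 and variance 1 and is called the standard normal variable.

Standard Normal Distribution

The normal distribution with mean 0 and variance 1 is called the standard normal distribution. Starting from the general normal density

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}},\quad -\infty < x < \infty$$

and putting μ = 0, σ = 1, its probability density function is

$$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^{2}}{2}},\quad -\infty < z < \infty$$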

Mean Of The Normal Distribution


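Consider the normal distribution with b, σ as parameters. Then

$$f(x; b, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-b)^{2}}{2\sigma^{2}}}$$

Mean of the continuous distribution is given by

$$\mu = \int_{-\infty}^{\infty} x f(x)\,dx = \int_{-\infty}^{\infty} x\, \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-b)^{2}}{2\sigma^{2}}}\,dx$$

$$= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} (\sigma z + b)\, e^{-\frac{z^{2}}{2}}\,dz \quad \left[\text{putting } z = \frac{x-b}{\sigma} \text{ so that } dx = \sigma\,dz\right]$$

$$= \frac{\sigma}{\sqrt{2\pi}} \int_{-\infty}^{\infty} z\, e^{-\frac{z^{2}}{2}}\,dz + \frac{b}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{z^{2}}{2}}\,dz$$

$$= 0 + \frac{2b}{\sqrt{2\pi}} \int_{0}^{\infty} e^{-\frac{z^{2}}{2}}\,dz$$

[since $z\,e^{-z^{2}/2}$ is an odd function and $e^{-z^{2}/2}$ is an even function]

$$\mu = \frac{2b}{\sqrt{2\pi}} \sqrt{\frac{\pi}{2}} = b$$

∴ Mean = b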

Variance Of The Normal Distribution



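$$\text{Variance} = \sigma^{2} = \int_{-\infty}^{\infty} x^{2} f(x)\,dx - \mu^{2}$$

$$= \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} x^{2}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}\,dx - \mu^{2}$$

Let $z = \dfrac{x-\mu}{\sigma} \Rightarrow x = \mu + \sigma z,\ dx = \sigma\,dz$

$$= \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} (\mu^{2} + \sigma^{2}z^{2} + 2\mu\sigma z)\, e^{-\frac{z^{2}}{2}}\, \sigma\,dz - \mu^{2}$$

$$= \frac{\mu^{2}}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{z^{2}}{2}}\,dz + \frac{\sigma^{2}}{\sqrt{2\pi}} \int_{-\infty}^{\infty} z^{2} e^{-\frac{z^{2}}{2}}\,dz + \frac{2\mu\sigma}{\sqrt{2\pi}} \int_{-\infty}^{\infty} z\, e^{-\frac{z^{2}}{2}}\,dz - \mu^{2}$$

The third integral vanishes because its integrand is odd, and the first two integrands are even, so

$$= \frac{2\mu^{2}}{\sqrt{2\pi}} \int_{0}^{\infty} e^{-\frac{z^{2}}{2}}\,dz + \frac{2\sigma^{2}}{\sqrt{2\pi}} \int_{0}^{\infty} z^{2} e^{-\frac{z^{2}}{2}}\,dz - \mu^{2}$$

$$= \mu^{2} + \frac{2\sigma^{2}}{\sqrt{2\pi}} \int_{0}^{\infty} z^{2} e^{-\frac{z^{2}}{2}}\,dz - \mu^{2} = \frac{2\sigma^{2}}{\sqrt{2\pi}} \int_{0}^{\infty} z^{2} e^{-\frac{z^{2}}{2}}\,dz$$

Put $\dfrac{z^{2}}{2} = t \Rightarrow z\,dz = dt,\ dz = \dfrac{dt}{\sqrt{2t}}$

$$= \frac{2\sigma^{2}}{\sqrt{2\pi}} \int_{0}^{\infty} 2t\, e^{-t}\, \frac{dt}{\sqrt{2t}} = \frac{2\sigma^{2}}{\sqrt{\pi}} \int_{0}^{\infty} e^{-t}\, t^{\frac{3}{2}-1}\,dt$$

$$= \frac{2\sigma^{2}}{\sqrt{\pi}}\, \Gamma\!\left(\frac{3}{2}\right) = \frac{2\sigma^{2}}{\sqrt{\pi}} \cdot \frac{1}{2}\, \Gamma\!\left(\frac{1}{2}\right) = \frac{\sigma^{2}}{\sqrt{\pi}} \sqrt{\pi} = \sigma^{2}$$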

Median Of The Normal Distribution

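Let x = M be the median. Then

$$\int_{-\infty}^{M} f(x)\,dx = \int_{M}^{\infty} f(x)\,dx = \frac{1}{2}$$

Let μ ∈ (−∞, M). Then

$$\int_{-\infty}^{M} f(x)\,dx = \int_{-\infty}^{\mu} f(x)\,dx + \int_{\mu}^{M} f(x)\,dx = \frac{1}{2} \quad \dots (1)$$

Consider

$$\int_{-\infty}^{\mu} f(x)\,dx = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\mu} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}\,dx$$

Let $z = \dfrac{x-\mu}{\sigma} \Rightarrow dx = \sigma\,dz$ [limits of z: −∞ to 0]

$$\int_{-\infty}^{\mu} f(x)\,dx = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{0} e^{-\frac{z^{2}}{2}}\, \sigma\,dz = \frac{1}{\sqrt{2\pi}} \int_{0}^{\infty} e^{-\frac{t^{2}}{2}}\,dt \quad (\text{taking } z = -t)$$

$$= \frac{1}{\sqrt{2\pi}} \sqrt{\frac{\pi}{2}} = \frac{1}{2}$$

From (1),

$$\int_{\mu}^{M} f(x)\,dx = 0 \Rightarrow \mu = M$$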


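Mode Of The Normal Distribution

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$

$$f'(x) = 0 \Rightarrow \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}} \left(-\frac{1}{2}\right) 2\left(\frac{x-\mu}{\sigma}\right)\left(\frac{1}{\sigma}\right) = 0$$

⇒ x − μ = 0 ⇒ x = μ

$$f''(x) = \frac{-1}{\sigma^{3}\sqrt{2\pi}} \left[ e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}} \cdot 1 + (x-\mu)\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}} \left(-\frac{1}{2}\right) 2\left(\frac{x-\mu}{\sigma}\right)\left(\frac{1}{\sigma}\right) \right]$$

At x = μ,

$$f''(\mu) = \frac{-1}{\sigma^{3}\sqrt{2\pi}}\,[e^{0} + 0] = \frac{-1}{\sigma^{3}\sqrt{2\pi}} < 0$$

∴ x = μ is the mode of the normal distribution.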

Problems :
1. If X is a normal variate, find the area A

i) to the left of z = −1.78

ii) to the right of z = −1.45

iii) corresponding to −0.8 ≤ z ≤ 1.53

iv) to the left of z = −2.52 and to the right of z = 1.83.

Sol: i) P(z < −1.78) = 0.5 − P(−1.78 < z < 0)

= 0.5 − P(0 < z < 1.78)

= 0.5 − 0.4625 = 0.0375.

ii) P(z > −1.45) = 0.5 + P(−1.45 < z < 0)

= 0.5 + P(0 < z < 1.45)

= 0.5 + 0.4265 = 0.9265.

iii) P(−0.8 ≤ z ≤ 1.53) = P(−0.8 ≤ z ≤ 0) + P(0 ≤ z ≤ 1.53)

= 0.2881 + 0.4370 = 0.7251.

iv) P(z < −2.52) = 0.5 − P(0 < z < 2.52) = 0.5 − 0.4941 = 0.0059

P(z > 1.83) = 0.5 − P(0 < z < 1.83) = 0.5 − 0.4664 = 0.0336

Total area = 0.0059 + 0.0336 = 0.0395.
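
The table look-ups in this solution can be cross-checked with the standard normal cumulative distribution function Φ(z) = ½[1 + erf(z/√2)]. The sketch below is illustrative; the helper `phi` and the variable names are not part of the notes, and the areas are those worked out in the solution (left of −1.78, right of −1.45, between −0.8 and 1.53, and the two tails beyond −2.52 and 1.83).

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF, P(Z <= z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))

area_i = phi(-1.78)                       # left of z = -1.78
area_ii = 1 - phi(-1.45)                  # right of z = -1.45
area_iii = phi(1.53) - phi(-0.8)          # between -0.8 and 1.53
area_iv = phi(-2.52) + (1 - phi(1.83))    # two tails combined

print(round(area_i, 4), round(area_ii, 4), round(area_iii, 4), round(area_iv, 4))
```

Table values are rounded to four decimals, so agreement to about three or four decimal places is the most that can be expected.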

2. The masses of 300 students are normally distributed with mean 68 kg and standard deviation 3 kg. How many students have masses

i) greater than 72 kg

ii) less than or equal to 64 kg

iii) between 65 and 71 kg inclusive?

Sol: Given N = 300, μ = 68, σ = 3. Let X be the mass of a student.

i) Standard normal variate for X = 72 is

$$z = \frac{x-\mu}{\sigma} = \frac{72-68}{3} = 1.33$$

P(X > 72) = P(z > 1.33)

= 0.5 − P(0 < z < 1.33)

= 0.5 − 0.4082

= 0.0918

Expected number of students with mass greater than 72 kg = 300(0.0918) = 27.54 ≈ 28 students

ii) Standard normal variate for X = 64 is

$$z = \frac{x-\mu}{\sigma} = \frac{64-68}{3} = -1.33$$

P(X ≤ 64) = P(z ≤ −1.33)

= 0.5 − P(0 < z < 1.33) (using symmetry)

= 0.5 − 0.4082

= 0.0918

Expected number of students with mass less than or equal to 64 kg = 300(0.0918) = 27.54 ≈ 28 students.

iii) Standard normal variate for X = 65 is

$$z_{1} = \frac{x-\mu}{\sigma} = \frac{65-68}{3} = -1$$

Standard normal variate for X = 71 is

$$z_{2} = \frac{x-\mu}{\sigma} = \frac{71-68}{3} = 1$$

P(65 ≤ X ≤ 71) = P(−1 ≤ z ≤ 1)

= P(−1 ≤ z ≤ 0) + P(0 ≤ z ≤ 1)

= 2 P(0 ≤ z ≤ 1)

= 2(0.3413) = 0.6826

∴ Expected number of students between 65 and 71 kg inclusive = 300(0.6826) = 204.78 ≈ 205 students.

3. In a normal distribution, 31% of the items are under 45 and 8% of the items are over 64. Find the mean and variance of the distribution.

Sol: Given P(X < 45) = 31% = 0.31

and P(X > 64) = 8% = 0.08

Let the mean and variance of the normal distribution be μ, σ².

Standard normal variate for X is

$$z = \frac{x-\mu}{\sigma}$$

Standard normal variate for X₁ = 45 is

$$z_{1} = \frac{X_{1}-\mu}{\sigma} = \frac{45-\mu}{\sigma} \Rightarrow \mu + \sigma z_{1} = 45 \quad \dots (1)$$

Standard normal variate for X₂ = 64 is

$$z_{2} = \frac{X_{2}-\mu}{\sigma} = \frac{64-\mu}{\sigma} \Rightarrow \mu + \sigma z_{2} = 64 \quad \dots (2)$$

Since P(z < z₁) = 0.31 < 0.5, z₁ is negative and P(z₁ ≤ z ≤ 0) = 0.5 − 0.31 = 0.19. From the normal table,

⇒ z₁ = −0.5

Since P(z > z₂) = 0.08, P(0 ≤ z ≤ z₂) = 0.5 − 0.08 = 0.42

⇒ z₂ = 1.41

Substituting the values of z₁, z₂ in (1) and (2), we get μ − 0.5σ = 45 and μ + 1.41σ = 64. Subtracting, 1.91σ = 19, so σ ≈ 9.95 and

μ ≈ 50, σ² ≈ 99.

4. In a normal distribution, 7% of the items are under 35 and 89% of the items are under 63. Find the mean and variance of the distribution.

Sol: Given P(X < 35) = 7% = 0.07

and P(X < 63) = 89% = 0.89

Let the mean and variance of the normal distribution be μ, σ².

Standard normal variate for X is

$$z = \frac{x-\mu}{\sigma}$$

Standard normal variate for X₁ = 35 is

$$z_{1} = \frac{X_{1}-\mu}{\sigma} = \frac{35-\mu}{\sigma} \Rightarrow \mu + \sigma z_{1} = 35 \quad \dots (1)$$

Standard normal variate for X₂ = 63 is

$$z_{2} = \frac{X_{2}-\mu}{\sigma} = \frac{63-\mu}{\sigma} \Rightarrow \mu + \sigma z_{2} = 63 \quad \dots (2)$$

Given P(X < 35) = P(z < z₁). Since 0.07 < 0.5, z₁ is negative and

0.07 = 0.5 − P(z₁ ≤ z ≤ 0)

P(0 ≤ z ≤ −z₁) = 0.43

From the normal table, −z₁ = 1.48

⇒ z₁ = −1.48

We have P(X < 63) = P(z < z₂)

0.89 = 0.5 + P(0 ≤ z ≤ z₂)

P(0 ≤ z ≤ z₂) = 0.39

From the normal table,

⇒ z₂ = 1.23

Substituting the values of z₁, z₂ in (1) and (2), we get μ − 1.48σ = 35 and μ + 1.23σ = 63. Subtracting, 2.71σ = 28, so σ ≈ 10.33 and

μ ≈ 50.3, σ² ≈ 106.8.
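
Problems 3 and 4 both reduce to two linear equations μ + σz₁ = x₁ and μ + σz₂ = x₂. The sketch below solves them using the inverse normal CDF from Python's standard library instead of a printed table; the function name `mean_sd_from_two_percentiles` is hypothetical, and because it does not round z to two decimals its answers differ slightly from table-based ones.

```python
from statistics import NormalDist

def mean_sd_from_two_percentiles(x1, p1, x2, p2):
    """Solve mu + sigma*z1 = x1 and mu + sigma*z2 = x2,
    where P(X < x1) = p1 and P(X < x2) = p2."""
    z1 = NormalDist().inv_cdf(p1)
    z2 = NormalDist().inv_cdf(p2)
    sigma = (x2 - x1) / (z2 - z1)
    mu = x1 - sigma * z1
    return mu, sigma

# Problem 3: 31% under 45, and 8% over 64 means 92% are under 64
mu3, sigma3 = mean_sd_from_two_percentiles(45, 0.31, 64, 0.92)

# Problem 4: 7% under 35 and 89% under 63
mu4, sigma4 = mean_sd_from_two_percentiles(35, 0.07, 63, 0.89)

print(round(mu3, 2), round(sigma3**2, 2))
print(round(mu4, 2), round(sigma4**2, 2))
```

The variance is the square of the returned standard deviation, so small differences in z are amplified in σ².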


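Normal approximation to the Binomial distribution:

When n is large, the normal distribution with mean μ = np and S.D. σ = √(npq) can be used to approximate the binomial distribution (B.D.).

To find P(x₁ ≤ X ≤ x₂), apply a continuity correction of 1/2 at each end:

$$z_{1} = \frac{x_{1} - \frac{1}{2} - \mu}{\sigma} \quad \text{and} \quad z_{2} = \frac{x_{2} + \frac{1}{2} - \mu}{\sigma}$$

Then P(x₁ ≤ X ≤ x₂) ≈ P(z₁ < Z < z₂)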

Problems:

1. Find the probability that, out of 100 patients, between 84 and 95 inclusive will survive a heart operation, given that the chance of survival is 0.9.

Sol: Given n = 100, p = 0.9, q = 1 − p = 0.1

Here X ~ B.D.(n, p)

Required probability

$$P(84 \le X \le 95) = \sum_{r=84}^{95} {}^{100}C_{r}\,(0.9)^{r}(0.1)^{100-r}$$

We would have to sum a large number of terms of the binomial.

To avoid this, we can replace the B.D. by a N.D.

Mean μ = np = (100)(0.9) = 90

S.D. σ = √(npq) = √(100 × 0.9 × 0.1) = 3

$$z_{1} = \frac{x_{1} - \frac{1}{2} - \mu}{\sigma} = \frac{84 - \frac{1}{2} - 90}{3} = \frac{-13}{6} = -2.17$$

$$z_{2} = \frac{x_{2} + \frac{1}{2} - \mu}{\sigma} = \frac{95 + \frac{1}{2} - 90}{3} = \frac{11}{6} = 1.83$$

Hence the required probability is

P(84 ≤ X ≤ 95) = P(−2.17 ≤ Z ≤ 1.83)

= A(2.17) + A(1.83)

= 0.4850 + 0.4664 = 0.9514


2. Eight coins are tossed together. Find the probability of getting 1 to 4 heads in a single toss.

Sol: Given n = 8, p = q = 1/2

Mean μ = np = 8(1/2) = 4

S.D. σ = √(npq) = √(8 × (1/2) × (1/2)) = √2

Even though n = 8 is not sufficiently large, we can approximate the binomial by the normal distribution.

Here x₁ = 1 and x₂ = 4

$$z_{1} = \frac{x_{1} - \frac{1}{2} - \mu}{\sigma} = \frac{1 - \frac{1}{2} - 4}{\sqrt{2}} = -2.47$$

$$z_{2} = \frac{x_{2} + \frac{1}{2} - \mu}{\sigma} = \frac{4 + \frac{1}{2} - 4}{\sqrt{2}} = 0.35$$

P(1 ≤ X ≤ 4) = P(−2.47 ≤ Z ≤ 0.35)

= A(2.47) + A(0.35)

= 0.4932 + 0.1368 = 0.63
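
To see how good the approximation is, the sketch below compares the exact binomial sum with the normal approximation using the continuity correction of 1/2 at each end. The helper names (`phi`, `binomial_range_exact`, `binomial_range_normal`) are illustrative and not from the notes.

```python
from math import comb, erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def binomial_range_exact(n, p, lo, hi):
    """Exact P(lo <= X <= hi) for X ~ Binomial(n, p)."""
    return sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(lo, hi + 1))

def binomial_range_normal(n, p, lo, hi):
    """Normal approximation to P(lo <= X <= hi) with continuity correction."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    z1 = (lo - 0.5 - mu) / sigma
    z2 = (hi + 0.5 - mu) / sigma
    return phi(z2) - phi(z1)

# Problem 2: eight coins, 1 to 4 heads
exact_coins = binomial_range_exact(8, 0.5, 1, 4)
approx_coins = binomial_range_normal(8, 0.5, 1, 4)

# Problem 1: 100 patients, 84 to 95 survivors
approx_patients = binomial_range_normal(100, 0.9, 84, 95)

print(round(exact_coins, 4), round(approx_coins, 4), round(approx_patients, 4))
```

For the coin problem the exact value is 162/256, so even with n = 8 the corrected approximation is close when p = 1/2.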


UNIT-III

CORRELATION AND REGRESSION

CORRELATION

Introduction

In a bivariate distribution and multivariate distribution we may be interested to find if there is


any relationship between the two variables under study. Correlation refers to the relationship
between two or more variables. The correlation expresses the relationship or interdependence
of two sets of variables upon each other.

Definition Correlation is a statistical tool which studies the relationship b/w 2 variables &
correlation analysis involves various methods & techniques used for studying & measuring
the extent of the relationship b/w them.

Two variables are said to be correlated if the change in one variable results in a
corresponding change in the other.

The Types of Correlation

1) Positive and Negative Correlation: If the values of the 2 variables deviate in the same direction, i.e., if an increase in the values of one variable results in a corresponding increase in the values of the other variable (or) a decrease in the values of one variable results in a corresponding decrease in the values of the other variable, the correlation is called Positive Correlation.

e.g. Heights & weights of individuals.

If an increase (decrease) in the values of one variable results in a corresponding decrease (increase) in the values of the other variable, the correlation is called Negative Correlation.

e.g. Price and demand of a commodity.

2) Linear and Non-linear Correlation: The correlation between two variables is said to be Linear if, corresponding to a unit change in one variable, there is a constant change in the other variable over the entire range of the values (or) two variables 𝑥, 𝑦 are said to be linearly related if there exists a relationship of the form y = a + bx.


e.g when the amount of output in a factory is doubled by doubling the number of workers.

Two variables are said to be Non linear or curvilinear if corresponding to a unit change

in one variable the other variable does not change at a constant rate but at fluctuating rate.

i.e Correlation is said to be non linear if the ratio of change is not constant. In other words,

when all the points on the scatter diagram tend to lie near a smooth curve, the correlation is

said to be non linear (curvilinear).

3) Partial and Total correlation: The study of two variables excluding some other variables
is called Partial correlation .

e.g. We study price and demand eliminating the supply.

In Total correlation all the facts are taken into account.

e.g Price, demand & supply ,all are taken into account.

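4) Simple and Multiple correlation: When we study only two variables, the relationship is described as Simple correlation. When three or more variables are studied simultaneously, it is described as Multiple correlation.

E.g. quantity of money and price level, demand and price (simple correlation).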

The following are scatter diagrams of Correlation.


Karl Pearson’s Coefficient of Correlation

Karl Pearson suggested a mathematical method for measuring the magnitude of the linear relationship between 2 variables. This is known as the Pearsonian coefficient of correlation and is denoted by ‘𝑟’. It is also known as the product-moment correlation coefficient.

$$r = \frac{\mathrm{Cov}(x, y)}{\sigma_{x}\sigma_{y}} = \frac{\sum XY}{N\sigma_{x}\sigma_{y}} = \frac{\sum XY}{\sqrt{\sum X^{2} \sum Y^{2}}}$$

where $X = x - \bar{x}$, $Y = y - \bar{y}$, and $\bar{x}$, $\bar{y}$ are the means of the series 𝑥 & 𝑦.

σx = standard deviation of series x

σy = standard deviation of series y

Properties

1. The Coefficient of correlation lies b/w −1 & +1 .


2. The Coefficient of correlation is independent of change of origin & scale of
measurements.
3. If X, Y are random variables and a, b, c, d are any numbers such that a ≠ 0, c ≠ 0, then

$$r(aX + b,\ cY + d) = \frac{ac}{|ac|}\, r(X, Y)$$

4. Two independent variables are uncorrelated. That is if X and Y are independent variables
then r(X, Y) = 0

Rank Correlation Coefficient

Charles Edward Spearman found out the method of finding the coefficient of correlation by ranks. This method is based on rank & is useful in dealing with qualitative characteristics such as morality, character, intelligence and beauty. Rank correlation is applicable only to individual observations.

$$\text{Formula: } \rho = 1 - \frac{6\sum D^{2}}{N(N^{2}-1)}$$

where: ρ - Rank coefficient of correlation

∑D² - Sum of the squares of the differences of two ranks

N - Number of paired observations.

Properties

1. The value of ρ lies between +1 and − 1.


2. If ρ = 1, then there is complete agreement in the order of the ranks & the direction of the
rank is same.
3. If ρ = −1, then there is complete disagreement in the order of the ranks & they are in
opposite directions.

Equal or Repeated ranks

If any 2 or more items have the same value, then common ranks are given to the repeated items. The common rank is the average of the ranks which these items would have assumed if they were different from each other, and the next item gets the rank next to the ranks already assumed.

$$\text{Formula: } \rho = 1 - 6\left\{\frac{\sum D^{2} + \frac{1}{12}(m^{3}-m) + \frac{1}{12}(m^{3}-m) + \dots}{N^{3}-N}\right\}$$

where m = the number of items whose ranks are common (one correction term for each group of tied items).

N - Number of paired observations.

∑D² - Sum of the squares of the differences of two ranks


REGRESSION

In regression we can estimate the value of one variable from the value of the other variable, which is known. The statistical method which helps us to estimate the unknown value of one variable from the known value of the related variable is called ‘Regression’. The line describing the average relationship b/w 2 variables is known as the Line of Regression.

Regression Equation:

The standard form of the regression equation is Y = a + bX, where a, b are constants. ‘a’ indicates the value of Y when X = 0. It is called the Y-intercept. ‘b’ indicates the slope of the regression line & gives a measure of the change of Y for a unit change in X. It is also called the regression coefficient of Y on X. The values of a, b are found with the help of the following Normal Equations.

Regression Equation of Y on X (Y = a + bX):

∑Y = Na + b∑X

∑XY = a∑X + b∑X²

Regression Equation of X on Y (X = a + bY):

∑X = Na + b∑Y

∑XY = a∑Y + b∑Y²

Regression equations when deviations are taken from the arithmetic mean (here X and Y in the sums denote deviations from the respective means):

$$\text{Regression equation of Y on X: } Y - \bar{Y} = b_{yx}(X - \bar{X}) \quad \text{where } b_{yx} = \frac{\sum XY}{\sum X^{2}}$$

$$\text{Regression equation of X on Y: } X - \bar{X} = b_{xy}(Y - \bar{Y}) \quad \text{where } b_{xy} = \frac{\sum XY}{\sum Y^{2}}$$

$$\text{Angle b/w two regression lines: } \tan\theta = \frac{m_{1} - m_{2}}{1 + m_{1}m_{2}}$$

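Note:

1. If θ is acute then $\tan\theta = \dfrac{\sigma_{x}\sigma_{y}}{\sigma_{x}^{2} + \sigma_{y}^{2}}\left(\dfrac{1-r^{2}}{r}\right)$

2. If θ is obtuse then $\tan\theta = \dfrac{\sigma_{x}\sigma_{y}}{\sigma_{x}^{2} + \sigma_{y}^{2}}\left(\dfrac{r^{2}-1}{r}\right)$

3. If r = 0 then tan θ = ∞, so θ = π/2. Thus if there is no linear relationship between the 2 variables (i.e., they are uncorrelated) then the regression lines are perpendicular, θ = π/2.

4. If 𝑟 = ±1 then tan θ = 0, so θ = 0 or π. Hence the 2 regression lines coincide (both pass through the point of means). The correlation between the 2 variables is perfect.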


Problems

1. Find Karl Pearson’s coefficient of correlation from the following data.

Ht. in inches    57    59    62    63    64    65    55    58    57
Weight in lbs    113   117   126   126   130   129   111   116   112

Solution: Mean height = 540/9 = 60, mean weight = 1080/9 = 120.

Ht. in inches (x)   X = x − x̄   X²    Wt. in lbs (y)   Y = y − ȳ   Y²    XY
57                  -3          9     113              -7          49    21
59                  -1          1     117              -3          9     3
62                  2           4     126              6           36    12
63                  3           9     126              6           36    18
64                  4           16    130              10          100   40
65                  5           25    129              9           81    45
55                  -5          25    111              -9          81    45
58                  -2          4     116              -4          16    8
57                  -3          9     112              -8          64    24
Total 540           0           102   1080             0           472   216

$$\text{Coefficient of correlation } r = \frac{\sum XY}{\sqrt{\sum X^{2} \sum Y^{2}}} = \frac{216}{\sqrt{(102)(472)}} = 0.98$$
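
The deviation table above translates directly into code. The following minimal sketch computes Karl Pearson's coefficient from deviations about the means; the function name `pearson_r` is illustrative.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Product-moment correlation using deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / sqrt(sum_xx * sum_yy)

heights = [57, 59, 62, 63, 64, 65, 55, 58, 57]
weights = [113, 117, 126, 126, 130, 129, 111, 116, 112]

r = pearson_r(heights, weights)
print(round(r, 2))
```

The three sums computed inside the function correspond to the column totals ∑XY, ∑X² and ∑Y² of the table.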


2. Calculate Coefficient of correlation for the following data.

X 12 9 8 10 11 13 7

Y 14 8 6 9 11 12 3

Solution: In both series the items are small in number, so there is no need to take deviations.

$$\text{Formula used: } r = \frac{\mathrm{Cov}(X, Y)}{\sigma_{x}\sigma_{y}}$$

X Y X2 Y2 XY

12 14 144 196 168

9 8 81 64 72

8 6 64 36 48

10 9 100 81 90

11 11 121 121 121

13 12 169 144 156

7 3 49 9 21

∑ X = 70 ∑ Y = 63 ∑ X 2 = 728 ∑ Y 2 = 651 ∑ XY = 676

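$$r = \frac{\sum XY - \dfrac{\sum X \sum Y}{N}}{\sqrt{\left(\sum X^{2} - \dfrac{(\sum X)^{2}}{N}\right)\left(\sum Y^{2} - \dfrac{(\sum Y)^{2}}{N}\right)}} = \frac{N\sum XY - \sum X \sum Y}{\sqrt{\left(N\sum X^{2} - (\sum X)^{2}\right)\left(N\sum Y^{2} - (\sum Y)^{2}\right)}}$$

Here N = 7.

$$r = \frac{4732 - 4410}{\sqrt{5096 - 4900}\,\sqrt{4557 - 3969}} = \frac{322}{\sqrt{(196)(588)}} = \frac{322}{339.48} = +0.95$$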


3. A sample of 12 fathers and their elder sons gave the following data about their heights. Calculate the rank correlation coefficient.

Fathers    65   63   67   64   68   62   70   66   68   67   69   71
Sons       68   66   68   65   69   66   68   65   71   67   68   70

Solution:

Fathers (x)   Sons (y)   Rank (x)   Rank (y)   d = Rank(x) − Rank(y)   d²
65            68         9          5.5        3.5                     12.25
63            66         11         9.5        1.5                     2.25
67            68         6.5        5.5        1.0                     1
64            65         10         11.5       -1.5                    2.25
68            69         4.5        3          1.5                     2.25
62            66         12         9.5        2.5                     6.25
70            68         2          5.5        -3.5                    12.25
66            65         8          11.5       -3.5                    12.25
68            71         4.5        1          3.5                     12.25
67            67         6.5        8          -1.5                    2.25
69            68         3          5.5        -2.5                    6.25
71            70         1          2          -1                      1

∑d² = 72.5

Repeated values are given a common rank, which is the mean of the ranks they would otherwise occupy. In X: 68 and 67 each appear twice. In Y: 68 appears 4 times, 66 appears twice and 65 appears twice. Here N = 12.

The correction for ties is

$$\frac{1}{12}(2^{3}-2) + \frac{1}{12}(2^{3}-2) + \frac{1}{12}(4^{3}-4) + \frac{1}{12}(2^{3}-2) + \frac{1}{12}(2^{3}-2) = 0.5 + 0.5 + 5 + 0.5 + 0.5 = 7$$

$$\rho = 1 - 6\left\{\frac{\sum D^{2} + \frac{1}{12}(m^{3}-m) + \frac{1}{12}(m^{3}-m) + \dots}{N^{3}-N}\right\} = 1 - \frac{6(72.5 + 7)}{12(12^{2}-1)} = 0.722$$
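
The ranking with ties and the correction term can be reproduced programmatically. The sketch below assigns average ranks (largest value gets rank 1) and applies the tie-corrected formula used in this solution; the helper names `average_ranks` and `spearman_with_ties` are hypothetical.

```python
from collections import Counter

def average_ranks(values):
    """Rank values in descending order; tied values share the mean of their ranks."""
    ordered = sorted(values, reverse=True)
    first_position = {}
    for position, v in enumerate(ordered, start=1):
        first_position.setdefault(v, position)
    counts = Counter(values)
    # A group of m equal values occupies ranks first, ..., first + m - 1
    return [first_position[v] + (counts[v] - 1) / 2 for v in values]

def spearman_with_ties(xs, ys):
    n = len(xs)
    rx, ry = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    # One (m^3 - m)/12 term for every group of m tied values in either series
    correction = sum((m**3 - m) / 12
                     for series in (xs, ys)
                     for m in Counter(series).values())
    return 1 - 6 * (sum_d2 + correction) / (n**3 - n), sum_d2, correction

fathers = [65, 63, 67, 64, 68, 62, 70, 66, 68, 67, 69, 71]
sons = [68, 66, 68, 65, 69, 66, 68, 65, 71, 67, 68, 70]

rho, sum_d2, correction = spearman_with_ties(fathers, sons)
print(sum_d2, correction, round(rho, 3))
```

Untied values contribute nothing to the correction because m³ − m = 0 when m = 1.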

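4. Given n = 10, σx = 5.4, σy = 6.2 and that the sum of products of deviations from the means of X & Y is 66, find the correlation coefficient.

Solution: n = 10, σx = 5.4, σy = 6.2

$$\sigma_{x}^{2} = \frac{\sum(x - \bar{x})^{2}}{n}, \qquad \sigma_{y}^{2} = \frac{\sum(y - \bar{y})^{2}}{n}$$

$$\sum(x - \bar{x})(y - \bar{y}) = 66$$

$$r = \frac{\sum(x - \bar{x})(y - \bar{y})}{\sqrt{\sum(x - \bar{x})^{2} \sum(y - \bar{y})^{2}}} = \frac{\sum(x - \bar{x})(y - \bar{y})}{n\,\sigma_{x}\sigma_{y}} = \frac{66}{10(5.4)(6.2)} = 0.1971$$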

5. The heights of mothers & daughters are given in the following table. From the 2 lines of regression estimate the expected average height of the daughter when the height of the mother is 64.5 inches.

Ht. of mother (inches)       62   63   64   64   65   66   68   70
Ht. of daughter (inches)     64   65   61   69   67   68   71   65

Solution:

Let X = height of the mother

Y = height of the daughter

Let dx = X − 65, dy = Y − 67. Then ∑X = 522, ∑dx = 2, ∑dx² = 50, ∑Y = 530, ∑dy = −6, ∑dy² = 74, ∑dx dy = 20

$$\bar{X} = \frac{\sum X}{N} = \frac{522}{8} = 65.25$$

$$\bar{Y} = \frac{\sum Y}{N} = \frac{530}{8} = 66.25$$

$$b_{yx} = \frac{\sum dx\,dy - \dfrac{\sum dx \sum dy}{N}}{\sum dx^{2} - \dfrac{(\sum dx)^{2}}{N}} = \frac{20 - \dfrac{2(-6)}{8}}{50 - \dfrac{2^{2}}{8}} = \frac{21.5}{49.5} = 0.434$$

Regression equation of Y on X: $Y - \bar{Y} = b_{yx}(X - \bar{X})$, i.e. Y − 66.25 = 0.434(X − 65.25)

Y = 37.93 + 0.434X

When X = 64.5, Y = 37.93 + 0.434(64.5) = 65.92 inches.
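
The regression line of Y on X can be recomputed directly from the raw data as a check on the arithmetic. This is a minimal least-squares sketch, assuming only the formulas used in the solution; the names `fit_y_on_x`, `slope` and `intercept` are illustrative.

```python
def fit_y_on_x(xs, ys):
    """Return (slope, intercept) of the regression line of Y on X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx                     # b_yx
    intercept = mean_y - slope * mean_x   # line passes through the means
    return slope, intercept

mothers = [62, 63, 64, 64, 65, 66, 68, 70]
daughters = [64, 65, 61, 69, 67, 68, 71, 65]

slope, intercept = fit_y_on_x(mothers, daughters)
predicted = intercept + slope * 64.5

print(round(slope, 3), round(intercept, 2), round(predicted, 2))
```

Taking deviations from assumed means (65 and 67) or from the true means gives the same slope, since the regression coefficient is independent of the change of origin.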

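6. The equations of two regression lines are 7x − 16y + 9 = 0 and 5y − 4x − 3 = 0. Find the coefficient of correlation and the means of x & y.

Solution: Given equations are 7x − 16y + 9 = 0 .................. (1)

5y − 4x − 3 = 0 .................. (2)

(1) × 4 gives 28x − 64y + 36 = 0

(2) × 7 gives −28x + 35y − 21 = 0

On adding we get −29y + 15 = 0

y = 0.5172

From (1), 7x = 16y − 9, which gives x = −0.1034

Since both regression lines pass through $(\bar{x}, \bar{y})$, we have $\bar{x} = -0.1034$, $\bar{y} = 0.5172$

Take (1) as the regression line of y on x and (2) as the regression line of x on y.

From (1), y = (7/16)x + 9/16, so $b_{yx} = r\dfrac{\sigma_{y}}{\sigma_{x}} = \dfrac{7}{16}$

From (2), x = (5/4)y − 3/4, so $b_{xy} = r\dfrac{\sigma_{x}}{\sigma_{y}} = \dfrac{5}{4}$

Multiplying these 2 equations, we get $r^{2} = \dfrac{7}{16} \cdot \dfrac{5}{4} = \dfrac{35}{64}$

$$r = \frac{\sqrt{35}}{8} = 0.7395$$

r is positive since both regression coefficients are positive. (The opposite assignment of the two lines would give r² = (16/7)(4/5) = 64/35 > 1, which is impossible.)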

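7. If σx = σy = σ and the angle between the regression lines is tan⁻¹(4/3), find r.

$$\text{Solution: } \tan\theta = \frac{\sigma_{x}\sigma_{y}}{\sigma_{x}^{2} + \sigma_{y}^{2}}\left(\frac{1-r^{2}}{r}\right) = \frac{\sigma^{2}}{2\sigma^{2}}\left(\frac{1-r^{2}}{r}\right) = \frac{1-r^{2}}{2r}$$

By data, θ = tan⁻¹(4/3).

$$\frac{1-r^{2}}{2r} = \frac{4}{3}$$

3 − 3r² − 8r = 0

(3r − 1)(r + 3) = 0

r = 1/3 or r = −3

Since −1 ≤ r ≤ 1, we cannot have r = −3.

Thus r = 1/3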

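8. Given the following information regarding a distribution: N = 5, X̄ = 10, Ȳ = 20, ∑(X − 4)² = 100, ∑(Y − 10)² = 160, ∑(X − 4)(Y − 10) = 80. Find the regression coefficients and hence the coefficient of correlation.

Solution: Here dx = X − 4, dy = Y − 10

$$\bar{X} = A + \frac{\sum dx}{N} \Rightarrow 10 = 4 + \frac{\sum dx}{5} \Rightarrow \sum dx = 30 \quad (\text{here } A = 4)$$

$$\bar{Y} = B + \frac{\sum dy}{N} \Rightarrow 20 = 10 + \frac{\sum dy}{5} \Rightarrow \sum dy = 50 \quad (\text{here } B = 10)$$

$$b_{yx} = \frac{\sum dx\,dy - \dfrac{\sum dx \sum dy}{N}}{\sum dx^{2} - \dfrac{(\sum dx)^{2}}{N}} = \frac{80 - 300}{100 - 180} = \frac{-220}{-80} = 2.75$$

$$b_{xy} = \frac{\sum dx\,dy - \dfrac{\sum dx \sum dy}{N}}{\sum dy^{2} - \dfrac{(\sum dy)^{2}}{N}} = \frac{80 - 300}{160 - 500} = \frac{-220}{-340} = 0.65$$

$$\text{Coefficient of correlation } r = \pm\sqrt{b_{xy} \times b_{yx}} = \sqrt{(0.65)(2.75)} = \sqrt{1.7875} = 1.337$$

Since r cannot exceed 1 in magnitude, this value shows that the given data are not mutually consistent.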

9. Given that X = 4Y + 5 and Y = kX + 4 are the lines of regression of X on Y and Y on X respectively, show that 0 ≤ 4k ≤ 1. If k = 1/16, find the means of the two variables and the coefficient of correlation between them.

Solution: Given lines are X = 4Y + 5 ....... (1)

Y = kX + 4 ....... (2)

From (1) & (2), $r\dfrac{\sigma_{x}}{\sigma_{y}} = 4$ and $r\dfrac{\sigma_{y}}{\sigma_{x}} = k$

Multiplying these two equations we get r² = 4k

Since 0 ≤ r² ≤ 1, we have 0 ≤ 4k ≤ 1, i.e. 0 ≤ k ≤ 1/4

If k = 1/16 then we have X = 4Y + 5 and Y = X/16 + 4

We get X − 4Y − 5 = 0

−X/4 + 4Y − 16 = 0

Adding, we get 3X/4 − 21 = 0

X = 28

From (2), we get Y = 28/16 + 4 = 23/4

The regression lines pass through $(\bar{x}, \bar{y})$

We get means $\bar{x} = 28$ and $\bar{y} = \dfrac{23}{4}$

We have r² = 4k = 4/16 = 1/4 ⇒ r = ±1/2

Since both regression coefficients are positive, we take the positive value, r = 1/2

10. The differences between the ranks are 0.5, −6, −4.5, −3, −5, −1, 3, 0, 5, 5.5, 0, −0.5. For the repeated ranks in x and y, ∑m(m² − 1)/12 = 3.5, and r = 0.44. Find the number of terms.

Solution: Given differences (dᵢ): 0.5, −6, −4.5, −3, −5, −1, 3, 0, 5, 5.5, 0, −0.5

∑dᵢ² = 156

$$\text{Here } r = 1 - 6\left\{\frac{\sum d_{i}^{2} + \dfrac{\sum m(m^{2}-1)}{12}}{N^{3}-N}\right\} = 1 - \frac{6(159.5)}{N^{3}-N} = 1 - \frac{957}{N^{3}-N}$$

$$\Rightarrow 0.44 = 1 - \frac{957}{N^{3}-N}$$

⇒ N³ − N = 1708.93

⇒ N = 12 (since 12³ − 12 = 1716), which agrees with the 12 rank differences given.


UNIT –IV

SAMPLING AND TESTING OF HYPOTHESIS

Introduction: The totality of observations with which we are concerned , whether this
number be finite or infinite constitute population. In this chapter we focus on sampling from
distributions or populations and such important quantities as the sample mean and sample
variance.

Def: Population is defined as the aggregate or totality of statistical data forming a subject of
investigation .

Ex: The population of the heights of Indians.

The number of observations in the population is defined to be the size of the population. It may be finite or infinite. The size of the population is denoted by N. As a study of the entire population may not be possible, only a part of the population is selected.

Def: A portion of the population which is examined with a view to determining the
population characteristics is called a sample . In other words, sample is a subset of
population. Size of the sample is denoted by n.

The process of selection of a sample is called Sampling. There are different methods of
sampling

➢ Probability Sampling Methods


➢ Non-Probability Sampling Methods

Probability Sampling Methods:


a) Random Sampling (Probability Sampling):
It is the process of drawing a sample from a population in such a way that each
member of the population has an equal chance of being included in the sample.
Ex: A hand of cards from a well shuffled pack of cards is a random sample.
Note : If N is the size of the population and n is the size of the sample, then
➢ The no. of samples with replacement = 𝑁 𝑛
➢ The no. of samples without replacement = 𝑁𝐶𝑛
b) Stratified Sampling:
In this method, the population is first divided into several smaller groups called strata
according to some relevant characteristic. A sample is selected at random from each
stratum, and these samples are combined to form the stratified sample.
c) Systematic Sampling (Quasi Random Sampling):
In this method, all the units of the population are arranged in some order. If the
population size is N and the sample size is n, we first define the sampling interval
k = N/n. One unit is selected at random from the first k items, and then every kth unit
after it is selected in sequence. The selected units together constitute a systematic
sample.
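
The counting rules and the systematic sampling procedure described above can be sketched in Python. This is only an illustrative sketch: the function names `count_samples` and `systematic_sample` are hypothetical, and the systematic sampler assumes that n divides N exactly.

```python
import math
import random


def count_samples(N, n):
    """Return (no. of samples with replacement, no. without replacement)."""
    return N ** n, math.comb(N, n)


def systematic_sample(units, n, rng=random):
    """Pick a random start among the first k units, then take every kth unit."""
    k = len(units) // n          # sampling interval k = N/n (assumes n divides N)
    start = rng.randrange(k)     # random position within the first k units
    return [units[start + i * k] for i in range(n)]


print(count_samples(5, 3))                       # (125, 10)
print(systematic_sample(list(range(1, 21)), 4))  # four units spaced 5 apart
```

The start position is the only random choice in systematic sampling; once it is fixed, the remaining units are determined by the interval k.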

Non Probability Sampling Methods:


a) Purposive (Judgment ) Sampling :
In this method, the members constituting the sample are chosen not according to some
definite scientific procedure, but according to the convenience and personal choice of the
individual who selects the sample. The choice of the individual items of the sample
depends entirely on the judgment of the investigator.
b) Sequential Sampling:
It consists of a sequence of samples drawn one after another from the population.
If the result of the first sample is acceptable, no new sample is drawn. If it is not
acceptable, a second sample is drawn, and the process continues until a proper
decision can be taken.

Classification of Samples:
➢ Large Samples: If the size of the sample n ≥ 30, then it is said to be a large sample.
➢ Small Samples: If the size of the sample n < 30, then it is said to be a small sample or
exact sample.

Parameters and Statistics:


A parameter is a statistical measure based on all the units of a population. A statistic is a
statistical measure based on only the units selected in a sample.
Note: In this unit, parameter refers to the population and statistic refers to the sample.

Central Limit Theorem: If x̄ is the mean of a random sample of size n drawn from a
population having mean μ and standard deviation σ, then the sampling distribution of the
sample mean x̄ is approximately a normal distribution with mean μ and
SD = S.E. of x̄ = σ/√n, provided the sample size n is large.

Standard Error of a Statistic: The standard error of a statistic ‘t’ is the standard
deviation of the sampling distribution of that statistic. For example, the S.E. of the sample
mean is the standard deviation of the sampling distribution of the sample mean.

Formulae for S.E:

➢ S.E. of the sample mean x̄: S.E.(x̄) = σ/√n
➢ S.E. of the sample proportion p: S.E.(p) = √(PQ/n), where Q = 1 − P
➢ S.E. of the difference of two sample means x̄1 and x̄2:
S.E.(x̄1 − x̄2) = √(σ1²/n1 + σ2²/n2)
➢ S.E. of the difference of two proportions:
S.E.(p1 − p2) = √(P1Q1/n1 + P2Q2/n2)
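
The four standard error formulae listed above translate directly into small functions. The sketch below is illustrative only; the function names are hypothetical, and the sample inputs (σ = 15, n = 36, and σ1 = 10, n1 = 400, σ2 = 15, n2 = 100) are borrowed from problems worked elsewhere in these notes.

```python
import math


def se_mean(sigma, n):
    """S.E. of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


def se_proportion(P, n):
    """S.E. of the sample proportion: sqrt(PQ / n) with Q = 1 - P."""
    return math.sqrt(P * (1 - P) / n)


def se_diff_means(sigma1, n1, sigma2, n2):
    """S.E. of the difference of two sample means."""
    return math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)


def se_diff_proportions(P1, n1, P2, n2):
    """S.E. of the difference of two sample proportions."""
    return math.sqrt(P1 * (1 - P1) / n1 + P2 * (1 - P2) / n2)


print(se_mean(15, 36))                           # 2.5
print(round(se_diff_means(10, 400, 15, 100), 4))  # 1.5811
```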

Estimation:
Estimation is the use of a statistic obtained from a sample to predict the unknown parameter
of the population from which the sample is drawn.

Estimate: An estimate is a statement made to find an unknown population parameter.

Estimator: The procedure or rule used to determine an unknown population parameter is called
an estimator.

Ex. The sample proportion is an estimate of the population proportion, because with the help of
the sample proportion value we can estimate the population proportion value.

Types of Estimation:
➢ Point Estimation: If the estimate of the population parameter is given by a single
value, then the estimate is called a point estimate of the parameter.
➢ Interval Estimation: If the estimate of the population parameter is given by two
different values between which the parameter may be considered to lie, then the
estimate is called an interval estimate of the parameter.

Confidence interval Estimation of parameters:


In an interval estimation of the population parameter θ, if we can find two quantities t1 and
t2 based on sample observations drawn from the population such that the unknown parameter
θ is included in the interval [t1, t2] in a specified percentage of cases, then this interval is
called a confidence interval for the parameter θ.

Confidence Limits for Population mean 𝝁


➢ 95% confidence limits are 𝑥̅ ± 1.96 (𝑆. 𝐸. 𝑜𝑓 𝑥̅ )
➢ 99% confidence limits are 𝑥̅ ± 2.58 (𝑆. 𝐸. 𝑜𝑓 𝑥̅ )
➢ 99.73% confidence limits are 𝑥̅ ± 3 (𝑆. 𝐸. 𝑜𝑓 𝑥̅ )
➢ 90% confidence limits are 𝑥̅ ± 1.645 (𝑆. 𝐸. 𝑜𝑓 𝑥̅ )
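
The multipliers 1.96, 2.58, 3 and 1.645 used in the confidence limits above are the two-sided critical values of the standard normal distribution. A minimal sketch using Python's standard library shows where they come from; the helper name `z_two_sided` is hypothetical.

```python
from statistics import NormalDist


def z_two_sided(confidence):
    """Return z such that P(-z < Z < z) = confidence for a standard normal Z."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99, 0.9973):
    print(level, round(z_two_sided(level), 3))
```

The printed values agree with the table multipliers after rounding (for example, 2.576 is usually quoted as 2.58).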

Confidence limits for population proportion P


➢ 95% confidence limits are p ± 1.96(S.E.of p)
➢ 99% confidence limits are p ± 2.58(S.E. of p)
➢ 99.73% confidence limits are p ± 3(S.E.of p)
➢ 90% confidence limits are p ± 1.645(S.E.of p)

Confidence limits for the difference of two population means μ1 and μ2

➢ 95% confidence limits are (x̄1 − x̄2) ± 1.96 (S.E. of (x̄1 − x̄2))
➢ 99% confidence limits are (x̄1 − x̄2) ± 2.58 (S.E. of (x̄1 − x̄2))
➢ 99.73% confidence limits are (x̄1 − x̄2) ± 3 (S.E. of (x̄1 − x̄2))
➢ 90% confidence limits are (x̄1 − x̄2) ± 1.645 (S.E. of (x̄1 − x̄2))

Confidence limits for the difference of two population proportions


➢ 95% confidence limits are (p1 − p2) ± 1.96 (S.E. of (p1 − p2))
➢ 99% confidence limits are (p1 − p2) ± 2.58 (S.E. of (p1 − p2))
➢ 99.73% confidence limits are (p1 − p2) ± 3 (S.E. of (p1 − p2))
➢ 90% confidence limits are (p1 − p2) ± 1.645 (S.E. of (p1 − p2))

Determination of proper sample size


Sample size for estimating population mean:

n = (zα σ / E)², where zα – Critical value of z at α level of significance
σ – Standard deviation of the population and
E – Maximum sampling error = |x̄ − μ|

Sample size for estimating population proportion:

n = zα² PQ / E², where zα – Critical value of z at α level of significance
P – Population proportion
Q – 1 − P
E – Maximum sampling error = |p − P|
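
The two sample-size formulae above can be evaluated with a short script. This is a sketch with hypothetical function names. It rounds the result up to the next whole number, which is a choice made here so that the stated maximum error is not exceeded; the inputs are taken from problems worked later in this unit.

```python
import math


def sample_size_mean(z, sigma, E):
    """n = (z * sigma / E)^2, rounded up to a whole number of observations."""
    return math.ceil((z * sigma / E) ** 2)


def sample_size_proportion(z, P, E):
    """n = z^2 * P * Q / E^2 with Q = 1 - P, rounded up."""
    return math.ceil(z ** 2 * P * (1 - P) / E ** 2)


print(sample_size_mean(1.645, 60, 10))          # 98  (unrounded value 97.42)
print(sample_size_proportion(1.96, 0.2, 0.05))  # 246 (unrounded value 245.86)
```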

Testing of Hypothesis:
A hypothesis is an assumption or supposition about a population. The decision-making
procedure used to decide whether to accept or reject the assumption is called hypothesis testing.

Def: Statistical Hypothesis: To arrive at a decision about the population on the basis of
sample information, we make assumptions about the population parameters involved. Such
an assumption is called a statistical hypothesis.

Procedure for testing a hypothesis:

Test of Hypothesis involves the following steps:

Step1: Statement of hypothesis :


There are two types of hypothesis :

➢ Null hypothesis: A definite statement about the population parameter. Usually a null
hypothesis is written as a statement of no difference, denoted by H0.
Ex. H0: μ = μ0
➢ Alternative hypothesis: A statement which contradicts the null hypothesis is called the
alternative hypothesis. Usually an alternative hypothesis is written as a statement of some
difference, denoted by H1.
The form of the alternative hypothesis decides whether the test is two-tailed or
one-tailed, and it depends on the question being asked.
Ex. H1: μ ≠ μ0 (Two-tailed test)
or
H1: μ > μ0 (Right one-tailed test)
or
H1: μ < μ0 (Left one-tailed test)

Step 2: Specification of level of significance :


The level of significance (LOS), denoted by α, is the probability of rejecting the null
hypothesis when it is true. It is specified before the test procedure, usually as 5%
(0.05), 1% or 10%. A 5% LOS means that there are about 5 chances in 100 that we would
reject the null hypothesis H0 when it is true, i.e., we are about 95% confident of making
the correct decision. The same interpretation applies to other levels of significance.

Step 3 : Identification of the test Statistic :


There are several tests of significance, such as z, t and F. Depending upon the nature of the
information given in the problem, we have to select the right test and construct the test
criterion with the appropriate probability distribution.

Step 4: Critical Region:


It is the region of the distribution of the test statistic in which H0 is rejected.
➢ Two-Tailed Test: The critical region under the curve is equally distributed on
both sides of the mean.
If H1 has the ≠ sign, the critical region is divided equally between the two tails of the
distribution.

➢ One-Tailed Test: The critical region under the curve lies on one side of
the mean.

Left one-tailed test: If H1 has the < sign, the critical region is taken on the left side of the
distribution.

Right one-tailed test: If H1 has the > sign, the critical region is taken on the right side of the
distribution.

Step 5 : Making decision:


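The decision to accept or reject H0 is taken by comparing the computed value of the test
statistic with the critical value.

If the calculated value does not fall in the critical region (for a two-tailed test,
|calculated value| ≤ critical value), we accept H0; otherwise we reject H0.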

Errors of Sampling :
While drawing conclusions for population parameters on the basis of the sample results , we
have two types of errors.

➢ Type I error : Reject 𝐻0 when it is true i.e, if the null hypothesis 𝐻0 is true but it is
rejected by test procedure .
➢ Type II error : Accept 𝐻0 when it is false i.e, if the null hypothesis 𝐻0 is false but it is
accepted by test procedure.

DECISION TABLE

                 H0 is accepted      H0 is rejected
H0 is true       Correct Decision    Type I Error
H0 is false      Type II Error       Correct Decision

Problems:
1. If the population is 3, 6, 9, 15, 27:

a) List all possible samples of size 3 that can be taken without replacement
from the finite population
b) Calculate the mean of the sampling distribution of means
c) Find the standard deviation of the sampling distribution of means

Sol: Mean of the population, μ = (3 + 6 + 9 + 15 + 27)/5 = 60/5 = 12

Standard deviation of the population,

σ = √[((3 − 12)² + (6 − 12)² + (9 − 12)² + (15 − 12)² + (27 − 12)²)/5]
= √[(81 + 36 + 9 + 9 + 225)/5] = √(360/5) = 8.4853

a) Sampling without replacement:

The total number of samples without replacement is NCn = 5C3 = 10

The 10 samples are (3,6,9), (3,6,15), (3,9,15), (3,6,27), (3,9,27), (3,15,27),
(6,9,15), (6,9,27), (6,15,27), (9,15,27)
b) Mean of the sampling distribution of means is
μx̄ = (6 + 8 + 9 + 10 + 12 + 13 + 14 + 15 + 16 + 17)/10 = 120/10 = 12
c) σx̄² = [(6 − 12)² + (8 − 12)² + (9 − 12)² + (10 − 12)² + (12 − 12)² + (13 − 12)² + (14 − 12)² + (15 − 12)² + (16 − 12)² + (17 − 12)²]/10
= (36 + 16 + 9 + 4 + 0 + 1 + 4 + 9 + 16 + 25)/10 = 120/10 = 12

∴ σx̄ = √12 = 3.464
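
The enumeration in this problem can be checked by listing every sample of size 3 without replacement and averaging. The sketch below uses only the Python standard library; the variance is computed with divisor 10 (the number of samples), as in the solution, and works out to 120/10 = 12.

```python
from itertools import combinations
from statistics import mean, pvariance

population = [3, 6, 9, 15, 27]

# Every sample of size 3 drawn without replacement, and its mean
sample_means = [mean(s) for s in combinations(population, 3)]

mu_xbar = mean(sample_means)        # mean of the sampling distribution
var_xbar = pvariance(sample_means)  # variance with divisor 10

print(len(sample_means), mu_xbar, var_xbar, round(var_xbar ** 0.5, 3))
```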

2. A population consists of five numbers 2, 3, 6, 8 and 11. Consider all possible samples of
size two which can be drawn with replacement from this population. Find

a) The mean of the population
b) The standard deviation of the population
c) The mean of the sampling distribution of means and
d) The standard deviation of the sampling distribution of means

Sol: a) Mean of the population is given by

μ = (2 + 3 + 6 + 8 + 11)/5 = 30/5 = 6

b) Variance of the population is given by

σ² = Σ(xi − μ)²/N
= [(2 − 6)² + (3 − 6)² + (6 − 6)² + (8 − 6)² + (11 − 6)²]/5
= (16 + 9 + 0 + 4 + 25)/5 = 10.8 ∴ σ = 3.29

c) Sampling with replacement

The total no. of samples with replacement is N^n = 5² = 25
∴ The list of all possible samples with replacement is
(2,2), (2,3), (2,6), (2,8), (2,11)
(3,2), (3,3), (3,6), (3,8), (3,11)
(6,2), (6,3), (6,6), (6,8), (6,11)
(8,2), (8,3), (8,6), (8,8), (8,11)
(11,2), (11,3), (11,6), (11,8), (11,11)
Now compute the arithmetic mean of each of these 25 samples. This gives the
distribution of the sample means, known as the sampling distribution of means.
The sample means are
2, 2.5, 4, 5, 6.5
2.5, 3, 4.5, 5.5, 7
4, 4.5, 6, 7, 8.5
5, 5.5, 7, 8, 9.5
6.5, 7, 8.5, 9.5, 11

The mean of the sampling distribution of means is the mean of these 25 means:
μx̄ = (sum of all the above sample means)/25 = 150/25 = 6
d) The variance of the sampling distribution of means is obtained by subtracting the
mean 6 from each number in the sampling distribution of means, squaring the result,
adding all 25 numbers thus obtained and dividing by 25.
σx̄² = [(2 − 6)² + (2.5 − 6)² + (4 − 6)² + (5 − 6)² + … + (11 − 6)²]/25 = 135/25 = 5.4
∴ σx̄ = √5.4 = 2.32
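
As a check on this problem, the 25 samples drawn with replacement can be generated programmatically. The sketch below also compares the variance of the sample means with σ²/n, which is the relation implied by the standard error formula σ/√n for sampling with replacement.

```python
from itertools import product
from statistics import mean, pvariance

population = [2, 3, 6, 8, 11]

# All ordered samples of size 2 drawn with replacement, and their means
sample_means = [mean(s) for s in product(population, repeat=2)]

print(len(sample_means))                      # 25
print(round(mean(sample_means), 2))           # 6.0
print(round(pvariance(sample_means), 2))      # 5.4
print(round(pvariance(population) / 2, 2))    # sigma^2 / n = 5.4
```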

3. When a sample is taken from an infinite population, what happens to the standard
error of the mean if the sample size is decreased from 800 to 200?

Sol: The standard error of the mean = σ/√n

Let n = n1 = 800.

Then S.E1 = σ/√800 = σ/(20√2)

When n1 is reduced to 200, let n = n2 = 200.

Then S.E2 = σ/√200 = σ/(10√2)

∴ S.E2 = σ/(10√2) = 2 (σ/(20√2)) = 2 (S.E1)

Hence if the sample size is reduced from 800 to 200, the S.E. of the mean is doubled.

4. The variance of a population is 2. The size of the sample collected from the population
is 169. What is the standard error of the mean?

Sol: n = The size of the sample = 169

σ = S.D of population = √Variance = √2

Standard error of the mean = σ/√n = √2/√169 = 1.414/13 = 0.109

5. The mean height of students in a college is 155 cms and the standard deviation is 15 cms. What
is the probability that the mean height of 36 students is less than 157 cms?

Sol: μ = Mean of the population
= Mean height of students of the college = 155 cms

σ = S.D of population = 15 cms

n = Sample size = 36

x̄ = Mean of sample = 157 cms

Now z = (x̄ − μ)/(σ/√n) = (157 − 155)/(15/√36) = 12/15 = 0.8

∴ P(x̄ ≤ 157) = P(z < 0.8) = 0.5 + P(0 ≤ z ≤ 0.8)
= 0.5 + 0.2881 = 0.7881

Thus the probability that the mean height of 36 students is less than 157 cms is 0.7881.
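
The table lookup in this problem can be reproduced with the standard normal CDF from Python's standard library. This is a minimal sketch; the function name `prob_mean_below` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def prob_mean_below(x_bar, mu, sigma, n):
    """P(sample mean < x_bar) under the normal approximation."""
    z = (x_bar - mu) / (sigma / sqrt(n))   # standardise with S.E. = sigma / sqrt(n)
    return NormalDist().cdf(z)


print(round(prob_mean_below(157, 155, 15, 36), 4))  # 0.7881
```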

6. A random sample of size 100 is taken from a population with σ = 5.1. Given that the
sample mean is x̄ = 21.6, construct 95% confidence limits for the population mean.

Sol: Given x̄ = 21.6

zα/2 = 1.96, n = 100, σ = 5.1

∴ Confidence interval = (x̄ − zα/2 · σ/√n, x̄ + zα/2 · σ/√n)

x̄ − zα/2 · σ/√n = 21.6 − (1.96 × 5.1)/10 = 20.6
x̄ + zα/2 · σ/√n = 21.6 + (1.96 × 5.1)/10 = 22.6

Hence (20.6, 22.6) is the confidence interval for the population mean μ.

7. It is desired to estimate the mean time of continuous use until an answering machine
will first require service. If it can be assumed that σ = 60 days, how large a sample is
needed so that one will be able to assert with 90% confidence that the sample mean is
off by at most 10 days?

Sol: We have maximum error E = 10 days, σ = 60 days and zα/2 = 1.645

∴ n = [zα/2 · σ / E]² = [(1.645 × 60)/10]² = 97.42

Rounding up to the next whole number, a sample of size n = 98 is needed.

8. A random sample of size 64 is taken from a normal population with μ = 51.4 and σ =
6.8. What is the probability that the mean of the sample will a) exceed 52.9 b) fall
between 50.5 and 52.3 c) be less than 50.6?

Sol: Given n = the size of the sample = 64

μ = the mean of the population = 51.4

σ = the S.D of the population = 6.8

S.E. of x̄ = σ/√n = 6.8/√64 = 0.85

a) P(x̄ exceeds 52.9) = P(x̄ > 52.9)

z = (x̄ − μ)/(σ/√n) = (52.9 − 51.4)/0.85 = 1.76

∴ P(x̄ > 52.9) = P(z > 1.76)
= 0.5 − P(0 < z < 1.76)
= 0.5 − 0.4608 = 0.0392

b) P(x̄ falls between 50.5 and 52.3)

i.e., P(50.5 < x̄ < 52.3) = P(x̄1 < x̄ < x̄2)

z1 = (x̄1 − μ)/(σ/√n) = (50.5 − 51.4)/0.85 = −1.06

z2 = (x̄2 − μ)/(σ/√n) = (52.3 − 51.4)/0.85 = 1.06

P(50.5 < x̄ < 52.3) = P(−1.06 < z < 1.06)
= P(−1.06 < z < 0) + P(0 < z < 1.06)
= P(0 < z < 1.06) + P(0 < z < 1.06)
= 2(0.3554) = 0.7108

c) P(x̄ will be less than 50.6) = P(x̄ < 50.6)

z = (x̄ − μ)/(σ/√n) = (50.6 − 51.4)/0.85 = −0.94

∴ P(z < −0.94) = 0.5 − P(−0.94 < z < 0)
= 0.5 − P(0 < z < 0.94) = 0.50 − 0.3264
= 0.1736

9. The mean of a certain normal population is equal to the standard error of the mean of
samples of size 64 from that distribution. Find the probability that the mean of a
sample of size 36 will be negative.

Sol: The standard error of the mean = σ/√n

Sample size, n = 64

Given mean, μ = Standard error of the mean of the samples

μ = σ/√64 = σ/8

For a sample of size 36, we know

z = (x̄ − μ)/(σ/√n) = (x̄ − σ/8)/(σ/6) = 6x̄/σ − 3/4

If x̄ is negative, then z < −0.75.

P(x̄ < 0) = P(z < −0.75) = 0.50 − P(0 < z < 0.75)
= 0.50 − 0.2734
= 0.2266

10. The guaranteed average life of a certain type of electric bulb is 1500 hrs with a S.D
of 120 hrs. It is decided to sample the output so as to ensure that 95% of bulbs do not fall
short of the guaranteed average by more than 2%. What will be the minimum sample
size?

Sol: Let n be the size of the sample.

The guaranteed mean is 1500 and σ = 120.

We do not want the mean of the sample to fall short of 1500 by more than 2%, i.e., by more than 30 hrs.
So 1500 − 30 = 1470
∴ x̄ > 1470

∴ |z| = |(x̄ − μ)/(σ/√n)| = |(1470 − 1500)/(120/√n)| = √n/4

From the given condition, the area under the normal probability curve to the left of
√n/4 should be 0.95.

∴ The area between 0 and √n/4 is 0.45

We are not concerned with the bulbs whose life is above the guaranteed life.

∴ √n/4 = 1.65 i.e., √n = 6.6
∴ n = 44

11. A normal population has a mean of 0.1 and a standard deviation of 2.1. Find the
probability that the mean of a sample of size 900 will be negative.

Sol: Given μ = 0.1, σ = 2.1 and n = 900

The standard normal variate is

z = (x̄ − μ)/(σ/√n) = (x̄ − 0.1)/(2.1/√900) = (x̄ − 0.1)/0.07
∴ x̄ = 0.1 + 0.07 z where z ~ N(0, 1)
∴ The required probability that the sample mean is negative is given by
P(x̄ < 0) = P(0.1 + 0.07 z < 0)
= P(0.07 z < −0.1)
= P(z < −0.1/0.07)
= P(z < −1.43)
= 0.50 − P(0 < z < 1.43)
= 0.50 − 0.4236 = 0.0764

12. In a study of automobile insurance, a random sample of 80 body repair costs had a
mean of Rs 472.36 and a S.D of Rs 62.35. If x̄ is used as a point estimate of the true
average repair cost, with what confidence can we assert that the maximum error
doesn’t exceed Rs 10?

Sol: Size of the random sample, n = 80

The mean of the random sample, x̄ = Rs 472.36

Standard deviation, σ = Rs 62.35
Maximum error of estimate, Emax = Rs 10

We have Emax = Zα/2 · σ/√n

i.e., Zα/2 = Emax · √n / σ = 10√80/62.35 = 89.4427/62.35 = 1.4345
∴ Zα/2 = 1.43

The area between 0 and z = 1.43 from tables is 0.4236

∴ (1 − α)/2 = 0.4236 i.e., 1 − α = 0.8472

∴ confidence = (1 − α) 100% = 84.72%

Hence we are 84.72% confident that the maximum error does not exceed Rs. 10.

13. If we can assert with 95% confidence that the maximum error is 0.05 and P = 0.2, find the size of
the sample.

Sol: Given P = 0.2, E = 0.05

We have Q = 0.8 and Zα/2 = 1.96 (5% LOS)

We know that maximum error, E = Zα/2 √(PQ/n)

⇒ 0.05 = 1.96 √((0.2 × 0.8)/n)
⇒ Sample size, n = (0.2 × 0.8 × (1.96)²)/(0.05)²
= 246

14. The mean and standard deviation of a population are 11,795 and 14,054 respectively.
What can one assert with 95% confidence about the maximum error if x̄ = 11,795 and
n = 50? Also construct a 95% confidence interval for the true mean.

Sol: Here mean of population, μ = 11795

S.D of population, σ = 14054

x̄ = 11795

n = sample size = 50, maximum error = Zα/2 · σ/√n
Zα/2 for 95% confidence = 1.96

Max. error, E = Zα/2 · σ/√n = 1.96 × 14054/√50 = 3896

∴ Confidence interval = (x̄ − Zα/2 · σ/√n, x̄ + Zα/2 · σ/√n)
= (11795 − 3896, 11795 + 3896)
= (7899, 15691)

15. Find 95% confidence limits for the mean of a normally distributed population from
which the following sample was taken: 15, 17, 10, 18, 16, 9, 7, 11, 13, 14.

Sol: We have x̄ = (15 + 17 + 10 + 18 + 16 + 9 + 7 + 11 + 13 + 14)/10 = 13

S² = Σ(xi − x̄)²/(n − 1)
= (1/9) [(15 − 13)² + (17 − 13)² + (10 − 13)² + (18 − 13)² + (16 − 13)² +
(9 − 13)² + (7 − 13)² + (11 − 13)² + (13 − 13)² + (14 − 13)²]
= 120/9 = 40/3

Since Zα/2 = 1.96, we have

Zα/2 · s/√n = 1.96 × √40/(√10 · √3) = 2.26

∴ Confidence limits are x̄ ± Zα/2 · s/√n = 13 ± 2.26 = (10.74, 15.26)
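
The arithmetic of this problem can be verified with a few lines of Python. The sketch follows the solution as written, that is, it uses the multiplier 1.96 with the sample standard deviation computed with divisor n − 1.

```python
from math import sqrt
from statistics import mean, stdev

sample = [15, 17, 10, 18, 16, 9, 7, 11, 13, 14]

x_bar = mean(sample)
s = stdev(sample)                        # sample S.D. with divisor n - 1
margin = 1.96 * s / sqrt(len(sample))    # z * s / sqrt(n)

print(x_bar, round(s ** 2, 3))                              # 13 13.333
print(round(x_bar - margin, 2), round(x_bar + margin, 2))   # 10.74 15.26
```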
√𝑛

16. A random sample of 100 teachers in a large metropolitan area revealed a mean weekly
salary of Rs. 487 with a standard deviation of Rs. 48. With what degree of confidence can
we assert that the average weekly salary of all teachers in the metropolitan area is between
Rs. 472 and Rs. 502?

Sol: Given μ = 487, σ = 48, n = 100

z = (x̄ − μ)/(σ/√n) = (x̄ − 487)/(48/√100) = (x̄ − 487)/4.8

The standard variable corresponding to Rs. 472 is
z1 = (472 − 487)/4.8 = −3.125
The standard variable corresponding to Rs. 502 is
z2 = (502 − 487)/4.8 = 3.125

Let x̄ be the mean salary of the teachers. Then

P(472 < x̄ < 502) = P(−3.125 < z < 3.125)
= 2 P(0 < z < 3.125)
= 2 (0.4991) = 0.9982

Thus we can assert with 99.82% confidence that the average weekly salary lies between
Rs. 472 and Rs. 502.

Large Samples: A random sample of size n ≥ 30 is regarded as a large sample.

Applications of Large Samples

Test of Significance of a Single Mean


Let x̄ be the mean of a random sample of size n and μ be the population mean.

1. Null hypothesis: H0: There is no significant difference between the population mean and
the given value, say μ0.

i.e., H0: μ = μ0

2. Alternative hypothesis: H1: There is some significant difference between the population
mean and the given value.
i.e.,
a) H1: μ ≠ μ0 (Two-tailed)
b) H1: μ > μ0 (Right one-tailed)
c) H1: μ < μ0 (Left one-tailed)
3. Level of significance: Set the LOS α
4. Test Statistic: zcal = (x̄ − μ0)/(σ/√n) (OR) zcal = (x̄ − μ0)/(s/√n)
5. Decision/conclusion: If |zcal| < zα, accept H0; otherwise reject H0
CRITICAL VALUES OF Z

LOS α        1%            5%            10%
μ ≠ μ0       |Z| > 2.58    |Z| > 1.96    |Z| > 1.645
μ > μ0       Z > 2.33      Z > 1.645     Z > 1.28
μ < μ0       Z < −2.33     Z < −1.645    Z < −1.28

NOTE: Confidence limits for the mean of the population corresponding to the given sample:

μ = x̄ ± Zα/2 (S.E. of x̄), i.e.,

μ = x̄ ± Zα/2 (σ/√n) (or) μ = x̄ ± Zα/2 (s/√n)
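
The five steps of the single-mean test can be collected into one small function. The sketch below is illustrative: the name `z_test_mean` and the `tail` argument are hypothetical, and the critical values are copied from the table of critical values of Z. The example call uses a sample of 64 with mean 70 tested against μ0 = 56 and σ = 25, the data of a problem worked later in this unit.

```python
from math import sqrt

# Critical values of Z taken from the table above
CRITICAL = {
    "two-tailed": {0.01: 2.58, 0.05: 1.96, 0.10: 1.645},
    "one-tailed": {0.01: 2.33, 0.05: 1.645, 0.10: 1.28},
}


def z_test_mean(x_bar, mu0, sd, n, alpha=0.05, tail="two"):
    """Return (z_cal, reject_H0) for the large-sample test of a single mean."""
    z = (x_bar - mu0) / (sd / sqrt(n))
    if tail == "two":
        reject = abs(z) > CRITICAL["two-tailed"][alpha]
    elif tail == "right":
        reject = z > CRITICAL["one-tailed"][alpha]
    else:  # tail == "left"
        reject = z < -CRITICAL["one-tailed"][alpha]
    return z, reject


z, reject = z_test_mean(70, 56, 25, 64)
print(round(z, 2), reject)  # 4.48 True
```

Keeping the tail direction as an explicit argument avoids comparing |z| with a one-tailed critical value when the sample mean lies on the side opposite to H1.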

2. Test of Significance for Difference of Means of two Large Samples

Let x̄1 and x̄2 be the means of two random samples of sizes n1 and n2 drawn from two
populations having means μ1 and μ2 and S.D.s σ1 and σ2.
i) Null hypothesis: H0: μ1 = μ2

ii) Alternative hypothesis: a) H1: μ1 ≠ μ2 (Two-tailed)
b) H1: μ1 < μ2 (Left one-tailed)
c) H1: μ1 > μ2 (Right one-tailed)

iii) Level of Significance: Set the LOS α

iv) Test Statistic: Zcal = ((x̄1 − x̄2) − δ)/(S.E. of (x̄1 − x̄2)) = ((x̄1 − x̄2) − δ)/√(σ1²/n1 + σ2²/n2)

where δ = μ1 − μ2 is a given constant; otherwise δ = μ1 − μ2 = 0.

If σ1² = σ2² = σ², then Zcal = (x̄1 − x̄2)/(σ √(1/n1 + 1/n2))

The critical value of Z is taken from the normal table at the LOS α.

v) Decision: If |Zcal| < Ztab, accept H0; otherwise reject H0
CRITICAL VALUES OF Z

LOS α        1%            5%            10%
μ ≠ μ0       |Z| > 2.58    |Z| > 1.96    |Z| > 1.645
μ > μ0       Z > 2.33      Z > 1.645     Z > 1.28
μ < μ0       Z < −2.33     Z < −1.645    Z < −1.28

NOTE: Confidence limits for difference of means

μ1 − μ2 = (x̄1 − x̄2) ± zα/2 [S.E. of (x̄1 − x̄2)]
= (x̄1 − x̄2) ± zα/2 √(σ1²/n1 + σ2²/n2)
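
A minimal sketch of the test statistic for the difference of two means follows. The function name `z_two_means` is hypothetical; the example values (means 55 and 57, S.D.s 10 and 15, sizes 400 and 100) are those of the two-universities problem worked later in this unit.

```python
from math import sqrt


def z_two_means(x1, s1, n1, x2, s2, n2, delta=0.0):
    """Z_cal = ((x1 - x2) - delta) / sqrt(s1^2/n1 + s2^2/n2)."""
    se = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return ((x1 - x2) - delta) / se


z = z_two_means(55, 10, 400, 57, 15, 100)
print(round(z, 2))    # -1.26
print(abs(z) < 1.96)  # True, so H0 is not rejected at the 5% LOS (two-tailed)
```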

3. Test of Significance for Single Proportions

Suppose a random sample of size n has a sample proportion p of members possessing
a certain attribute (proportion of successes). We test the hypothesis that the proportion
P in the population has a specified value P0.

i) Null hypothesis: H0: P = P0
ii) Alternative hypothesis: a) H1: P ≠ P0 (Two-tailed test)
b) H1: P < P0 (Left one-tailed)
c) H1: P > P0 (Right one-tailed)
iii) Test statistic: Zcal = (p − P)/√(PQ/n), where P is the population proportion and Q = 1 − P
iv) At the specified LOS α, find the critical value of Z
v) Decision: If |zcal| < Ztab, accept H0; otherwise reject H0
CRITICAL VALUES OF Z

LOS α        1%            5%            10%
μ ≠ μ0       |Z| > 2.58    |Z| > 1.96    |Z| > 1.645
μ > μ0       Z > 2.33      Z > 1.645     Z > 1.28
μ < μ0       Z < −2.33     Z < −1.645    Z < −1.28

NOTE: Confidence limits for population proportion

P = p ± Zα/2 (S.E. of p)
= p ± Zα/2 √(pq/n)
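
The single-proportion test statistic and the confidence limits can be sketched as follows. The function names are hypothetical, and the data in the example (540 successes in a sample of 1000, tested against P0 = 0.5) are made up purely for illustration; they do not come from these notes.

```python
from math import sqrt


def z_single_proportion(p, P0, n):
    """Z_cal = (p - P0) / sqrt(P0 * Q0 / n) with Q0 = 1 - P0."""
    return (p - P0) / sqrt(P0 * (1 - P0) / n)


def proportion_limits(p, n, z=1.96):
    """Confidence limits p +/- z * sqrt(p * q / n) with q = 1 - p."""
    margin = z * sqrt(p * (1 - p) / n)
    return p - margin, p + margin


# Hypothetical data: 540 successes out of 1000, H0: P = 0.5
z = z_single_proportion(540 / 1000, 0.5, 1000)
print(round(z, 2))  # 2.53
print(tuple(round(v, 3) for v in proportion_limits(0.54, 1000)))
```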

4. Test for Equality of Two Proportions (Populations)

Let p1 and p2 be the sample proportions in two large random samples of sizes n1 and n2
drawn from two populations having proportions P1 and P2.

i) Null hypothesis: H0: P1 = P2

ii) Alternative hypothesis: a) H1: P1 ≠ P2 (Two-tailed)
b) H1: P1 < P2 (Left one-tailed)
c) H1: P1 > P2 (Right one-tailed)

iii) Test statistic: Zcal = ((p1 − p2) − (P1 − P2))/√(P1Q1/n1 + P2Q2/n2), if (P1 − P2) is given.

If only the sample proportions are given, then

Zcal = (p1 − p2)/√(p1q1/n1 + p2q2/n2), where p1 = x1/n1 and p2 = x2/n2

OR

Zcal = (p1 − p2)/√(pq(1/n1 + 1/n2)), where p = (n1p1 + n2p2)/(n1 + n2) = (x1 + x2)/(n1 + n2)
and q = 1 − p

iv) At the specified LOS α, find the critical value of Z
v) Decision: If |Zcal| < Ztab, accept H0; otherwise reject H0

CRITICAL VALUES OF Z

LOS α        1%            5%            10%
μ ≠ μ0       |Z| > 2.58    |Z| > 1.96    |Z| > 1.645
μ > μ0       Z > 2.33      Z > 1.645     Z > 1.28
μ < μ0       Z < −2.33     Z < −1.645    Z < −1.28

NOTE: Confidence limits for difference of population proportions

P1 − P2 = (p1 − p2) ± Zα/2 (S.E. of (p1 − p2))
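
The pooled form of the two-proportion test statistic can be sketched in Python as below. The function name is hypothetical, and the counts in the example (60 out of 200 and 45 out of 200) are invented for illustration only.

```python
from math import sqrt


def z_two_proportions(x1, n1, x2, n2):
    """Pooled statistic: (p1 - p2) / sqrt(p * q * (1/n1 + 1/n2))."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)   # pooled proportion (n1*p1 + n2*p2)/(n1 + n2)
    q = 1 - p
    return (p1 - p2) / sqrt(p * q * (1 / n1 + 1 / n2))


# Hypothetical data: 60 successes out of 200 versus 45 successes out of 200
z = z_two_proportions(60, 200, 45, 200)
print(round(z, 3))
print(abs(z) < 1.96)  # True here, so H0: P1 = P2 is not rejected at the 5% LOS
```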

Problems:
1. A sample of 64 students has a mean weight of 70 kgs. Can this be regarded as
a sample from a population with mean weight 56 kgs and standard
deviation 25 kgs?
Sol: Given x̄ = Mean of the sample = 70 kgs
μ = Mean of the population = 56 kgs
σ = S.D of population = 25 kgs
and n = Sample size = 64

i) Null Hypothesis H0: A sample of 64 students with mean weight 70 kgs can be
regarded as a sample from a population with mean weight 56 kgs and standard
deviation 25 kgs, i.e., H0: μ = 56 kgs
ii) Alternative Hypothesis H1: The sample cannot be regarded as one coming from the
population, i.e., H1: μ ≠ 56 kgs (Two-tailed test)
iii) Level of significance: α = 0.05 (Zα = 1.96)
iv) Test Statistic: Zcal = (x̄ − μ)/(σ/√n) = (70 − 56)/(25/√64) = 4.48
v) Conclusion: Since |Zcal| > Zα, we reject H0

∴ The sample cannot be regarded as one coming from the population.

2. In a random sample of 60 workers, the average time taken by them to get to
work is 33.8 minutes with a standard deviation of 6.1 minutes. Can we reject the
null hypothesis μ = 32.6 in favor of the alternative hypothesis μ > 32.6 at α =
0.05 LOS?
Sol: Given n = 60, x̄ = 33.8, μ = 32.6 and σ = 6.1
i) Null Hypothesis H0: μ = 32.6
ii) Alternative Hypothesis H1: μ > 32.6 (Right one-tailed test)
iii) Level of significance: α = 0.05 (Zα = 1.645)
iv) Test Statistic: Zcal = (x̄ − μ)/(σ/√n) = (33.8 − 32.6)/(6.1/√60) = 1.2/0.7875 = 1.5238
v) Conclusion: Since Zcal < Zα, we accept H0

3. A sample of 400 items is taken from a population whose standard deviation is 10.
The mean of the sample is 40. Test whether the sample has come from a
population with mean 38. Also calculate 95% confidence limits for the
population mean.
Sol: Given n = 400, x̄ = 40, μ = 38 and σ = 10
i) Null Hypothesis H0: μ = 38
ii) Alternative Hypothesis H1: μ ≠ 38 (Two-tailed test)
iii) Level of significance: α = 0.05 (Zα = 1.96)
iv) Test Statistic: Zcal = (x̄ − μ)/(σ/√n) = (40 − 38)/(10/√400) = 2/0.5 = 4
v) Conclusion: Since |Zcal| > Zα, we reject H0
i.e., the sample is not from a population whose mean is 38.
∴ 95% confidence interval is (x̄ − 1.96 σ/√n, x̄ + 1.96 σ/√n)
i.e., (40 − 1.96(10)/√400, 40 + 1.96(10)/√400)
= (40 − 1.96(10)/20, 40 + 1.96(10)/20)
= (40 − 0.98, 40 + 0.98)
= (39.02, 40.98)

4. An insurance agent has claimed that the average age of policy holders who insure
through him is less than the average for all agents, which is 30.5. A random
sample of 100 policy holders who had insured through him gave the following age
distribution.

Age              16-20   21-25   26-30   31-35   36-40
No. of persons   12      22      20      30      16

Calculate the arithmetic mean and standard deviation of this distribution and
use these values to test his claim at 5% LOS.

Sol: Take A = 28, where A is the assumed mean, and h = 5, the class width.

di = (xi − A)/h, where xi is the mid-value of the ith class, so that Σfidi = 16 and Σfidi² = 164.

x̄ = A + h Σfidi/N
= 28 + (5 × 16)/100
= 28.8

S.D: S = h √(Σfd²/N − (Σfd/N)²) = 5 √(164/100 − (16/100)²) = 6.35

i) Null Hypothesis H0: The sample is drawn from a population with mean μ = 30.5 years,
i.e., H0: μ = 30.5 years
ii) Alternative Hypothesis H1: μ < 30.5 (Left one-tailed test)
iii) Level of significance: α = 0.05 (Zα = 1.645)
iv) Test Statistic: Zcal = (x̄ − μ)/(S/√n) = (28.8 − 30.5)/(6.35/√100) = −2.677
v) Conclusion: Since |Zcal| > Zα, we reject H0
i.e., the sample is not drawn from the population with μ = 30.5 years.
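
The step-deviation calculation in this problem can be reproduced in Python. This sketch assumes the class mid-values 18, 23, 28, 33, 38 and class width h = 5, which are read off the age table; the variable names are illustrative.

```python
from math import sqrt

mid_values = [18, 23, 28, 33, 38]   # mid-points of 16-20, 21-25, ..., 36-40
freqs = [12, 22, 20, 30, 16]
A, h = 28, 5                        # assumed mean and class width

N = sum(freqs)
d = [(x - A) // h for x in mid_values]              # step deviations -2..2
sum_fd = sum(f * di for f, di in zip(freqs, d))
sum_fd2 = sum(f * di ** 2 for f, di in zip(freqs, d))

x_bar = A + h * sum_fd / N
S = h * sqrt(sum_fd2 / N - (sum_fd / N) ** 2)
z = (x_bar - 30.5) / (S / sqrt(N))

print(sum_fd, sum_fd2, round(x_bar, 1), round(S, 2), round(z, 2))
```

The statistic comes out near −2.68 when S is not rounded first, which matches the hand calculation (−2.677 with S = 6.35) and leads to the same decision.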

5. An ambulance service claims that it takes on the average less than 10 minutes to
reach its destination in emergency calls. A sample of 36 calls has a mean of 11
minutes and a variance of 16 minutes. Test the claim at 0.05 LOS.
Sol: Given n = 36, x̄ = 11, μ = 10 and σ = √16 = 4
i) Null Hypothesis H0: μ = 10
ii) Alternative Hypothesis H1: μ < 10 (Left one-tailed test)
iii) Level of significance: α = 0.05 (Zα = 1.645)
iv) Test Statistic: Zcal = (x̄ − μ)/(σ/√n) = (11 − 10)/(4/√36) = 6/4 = 1.5
v) Conclusion: Since |Zcal| < Zα, we accept H0, i.e., the data do not support the
claim that the average time is less than 10 minutes.

6. The means of two large samples of sizes 1000 and 2000 members are 67.5 inches
and 68 inches respectively. Can the samples be regarded as drawn from the
same population of S.D 2.5 inches?
Sol: Let μ1 and μ2 be the means of the two populations
Given n1 = 1000, n2 = 2000 and x̄1 = 67.5 inches, x̄2 = 68 inches
Population S.D, σ = 2.5 inches
i) Null Hypothesis H0: The samples have been drawn from the same population of
S.D 2.5 inches
i.e., H0: μ1 = μ2
ii) Alternative Hypothesis H1: μ1 ≠ μ2 (Two-tailed test)
iii) Level of significance: α = 0.05 (Zα = 1.96)
iv) Test Statistic: Zcal = (x̄1 − x̄2)/(σ √(1/n1 + 1/n2)) = (67.5 − 68)/(2.5 √(1/1000 + 1/2000)) = −0.5/0.0968 = −5.16
v) Conclusion: Since |Zcal| > Zα, we reject H0
Hence, we conclude that the samples are not drawn from the same population of
S.D 2.5 inches.

7. Samples of students were drawn from two universities and from their weights in
kilograms, means and standard deviations are calculated and shown below.
Make a large sample test to test the significance of the difference between the
means.

                Mean    S.D    Size of the sample
University A    55      10     400
University B    57      15     100

Sol: Let μ1 and μ2 be the means of the two populations
Given n1 = 400, n2 = 100 and x̄1 = 55 kgs, x̄2 = 57 kgs
σ1 = 10 and σ2 = 15
i) Null Hypothesis H0: μ1 = μ2
ii) Alternative Hypothesis H1: μ1 ≠ μ2 (Two-tailed test)
iii) Level of significance: α = 0.05 (Zα = 1.96)
iv) Test Statistic: Zcal = (x̄1 − x̄2)/√(σ1²/n1 + σ2²/n2) = (55 − 57)/√(10²/400 + 15²/100) = −2/√(1/4 + 9/4) = −1.26
v) Conclusion: Since |Zcal| < Zα, we accept H0
Hence, we conclude that there is no significant difference between the means.

8. The average marks scored by 32 boys is 72 with a S.D of 8, while that for 36
girls is 70 with a S.D of 6. Does this data indicate that the boys perform better
than girls at 5% LOS?
Sol: Let μ1 and μ2 be the means of the two populations
Given n1 = 32, n2 = 36 and x̄1 = 72, x̄2 = 70
σ1 = 8 and σ2 = 6
i) Null Hypothesis H0: μ1 = μ2
ii) Alternative Hypothesis H1: μ1 > μ2 (Right one-tailed test)
iii) Level of significance: α = 0.05 (Zα = 1.645)
iv) Test Statistic: Zcal = (x̄1 − x̄2)/√(σ1²/n1 + σ2²/n2) = (72 − 70)/√(8²/32 + 6²/36) = 2/√(2 + 1) = 1.1547
v) Conclusion: Since Zcal < Zα, we accept H0
Hence, we conclude that the data do not indicate that the boys perform better than the girls.

9. A sample of the height of 6400 Englishmen has a mean of 67.85 inches and a S.D
of 2.56 inches while another sample of heights of 1600 Austrians has a mean of
68.55 inches and S.D of 2.52 inches. Do the data indicate that Austrians are on
the average taller than the Englishmen ? (Use 𝜶 𝒂𝒔 𝟎. 𝟎𝟏)
Sol : Let 𝜇1 and 𝜇2 be the means of the two populations
Given 𝑛1 = 6400 , 𝑛2 = 1600 , $\bar{x}_1 = 67.85$ and $\bar{x}_2 = 68.55$
𝜎1 = 2.56 and 𝜎2 = 2.52
i) Null Hypothesis 𝐻0 : 𝜇1 = 𝜇2
ii) Alternative Hypothesis 𝐻1 : 𝜇1 < 𝜇2 ( Left One Tailed test)
iii) Level of significance : 𝛼 = 0.01 (𝑍𝛼 = - 2.33 )
iv) Test Statistic :
$$Z_{cal} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \frac{67.85 - 68.55}{\sqrt{\frac{2.56^2}{6400} + \frac{2.52^2}{1600}}} = \frac{-0.7}{\sqrt{\frac{6.5536}{6400} + \frac{6.3504}{1600}}} = \frac{-0.7}{\sqrt{0.001 + 0.004}} = \frac{-0.7}{0.0707} = -9.9$$
v) Conclusion: Since Z cal = -9.9 < Zα = -2.33 (i.e., |Z cal| > |Zα|) , we reject 𝐻0
Hence , we conclude that Austrians are taller than Englishmen.

10. At a certain large university a sociologist speculates that male students spend
considerably more money on junk food than female students. To test her
hypothesis the sociologist randomly selects from records the names of 200
students . Of these , 125 are men and 75 are women . The mean amount spent on
junk food per week by the men is Rs. 400 and the S.D is 100. For
the women the sample mean is Rs. 450 and the S.D is 150. Test the hypothesis at the 5 %
los.
Sol: Let 𝜇1 and 𝜇2 be the means of the two populations
Given 𝑛1 = 125 , 𝑛2 = 75 and ̅𝑥1 = Mean of men = 400 , ̅𝑥2 = Mean of women = 450
𝜎1 = 100 and 𝜎2 = 150
i) Null Hypothesis 𝐻0 : 𝜇1 = 𝜇2
ii) Alternative Hypothesis 𝐻1 : 𝜇1 > 𝜇2 ( Right One Tailed test)
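iii) Level of significance : 𝛼 = 0.05 (𝑍𝛼 = 1.645 )
iv) Test Statistic :
$$Z_{cal} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \frac{400 - 450}{\sqrt{\frac{100^2}{125} + \frac{150^2}{75}}} = \frac{-50}{\sqrt{80 + 300}} = \frac{-50}{\sqrt{380}} = \frac{-50}{19.49} = -2.5654$$
v) Conclusion: Since Z cal = -2.5654 < Zα = 1.645 , we accept 𝐻0
Hence , we conclude that the data do not support the hypothesis that male students spend more on junk food than female students.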


11. A research investigator is interested in studying whether there is a significant
difference in the salaries of MBA graduates in two cities. A random sample of size
100 from city A yields an average income of Rs. 20,150 . Another random sample
of size 60 from city B yields an average income of Rs. 20,250. The variances are
given as $\sigma_1^2$ = 40,000 and $\sigma_2^2$ = 32,400 respectively . Test the equality of means and also construct 95%
confidence limits.
Sol: Let 𝜇1 and 𝜇2 be the means of the two populations
Given 𝑛1 = 100 , 𝑛2 = 60 , $\bar{x}_1$ = Mean of city A = 20,150 and $\bar{x}_2$ = Mean of city B = 20,250
$\sigma_1^2$ = 40,000 and $\sigma_2^2$ = 32,400
i) Null Hypothesis 𝐻0 : 𝜇1 = 𝜇2
ii) Alternative Hypothesis 𝐻1 : 𝜇1 ≠ 𝜇2 (Two -Tailed test)
iii) Level of significance : 𝛼 = 0.05 (𝑍𝛼 = 1.96 )
iv) Test Statistic :
$$Z_{cal} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \frac{20{,}150 - 20{,}250}{\sqrt{\frac{40000}{100} + \frac{32400}{60}}} = \frac{-100}{\sqrt{400 + 540}} = \frac{-100}{30.66} = -3.26$$
v) Conclusion: Since |Z cal| = 3.26 > Zα = 1.96 , we reject 𝐻0
Hence , we conclude that there is a significant difference in the salaries of MBA
graduates in the two cities.
∴ The 95% confidence interval for 𝜇1 - 𝜇2 is
$$(\bar{x}_1 - \bar{x}_2) \pm 1.96\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = (20{,}150 - 20{,}250) \pm 1.96\sqrt{\frac{40000}{100} + \frac{32400}{60}} = -100 \pm 60.09 = (-160.09,\ -39.91)$$
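The confidence limits for 𝜇1 - 𝜇2 in Problem 11 can be reproduced with a few lines of Python. This is an illustrative sketch only; the function name `diff_means_ci` is an assumption and does not come from the notes. Since city A has the smaller sample mean, both limits are negative.

```python
import math

def diff_means_ci(mean1, mean2, var1, var2, n1, n2, z_crit=1.96):
    """Large-sample confidence interval for mu1 - mu2 (variances known)."""
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    difference = mean1 - mean2
    return difference - z_crit * standard_error, difference + z_crit * standard_error

# City A vs city B (Problem 11), 95% confidence
lower, upper = diff_means_ci(20150, 20250, 40000, 32400, 100, 60)
print(round(lower, 2), round(upper, 2))
```

The interval does not contain 0, which agrees with the rejection of 𝐻0 in the two-tailed test.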
12. A die was thrown 9000 times and of these 3220 yielded a 3 or 4. Is this consistent
with the hypothesis that the die was unbiased?
Sol : Given n = 9000
P = Population proportion of successes
= P( getting a 3 or 4 ) = $\frac{1}{6} + \frac{1}{6} = \frac{2}{6} = \frac{1}{3} = 0.3333$
Q = 1- P = 0.6667
p = Sample proportion of successes (getting a 3 or 4 in 9000 throws) = $\frac{3220}{9000} = 0.3578$
i) Null Hypothesis 𝐻0 : The die is unbiased
i.e., 𝐻0 : P = 1/3

ii) Alternative Hypothesis 𝐻1 : The die is biased

i.e., 𝐻1 : P ≠ 1/3 ( Two –Tailed test)

iii) Level of Significance : 𝛼 = 0.05 (𝑍𝛼 = 1.96 )

iv) Test Statistic :
$$Z_{cal} = \frac{p - P}{\sqrt{\frac{PQ}{n}}} = \frac{0.3578 - 0.3333}{\sqrt{\frac{(0.3333)(0.6667)}{9000}}} = 4.92$$

v)Conclusion: Since Zcal value > Zα value , we reject 𝐻0

Hence , we conclude that the die is biased.
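As a check on the single-proportion test, here is a minimal Python sketch using the die data of Problem 12. The function name `one_proportion_z` is a hypothetical helper, not something defined in the notes.

```python
import math

def one_proportion_z(successes, n, p0):
    """Z statistic for testing H0: P = p0 with a large sample."""
    p_hat = successes / n                       # sample proportion p
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error

# 3220 occurrences of a 3 or 4 in 9000 throws; P = 1/3 for a fair die
z = one_proportion_z(3220, 9000, 1 / 3)
print(round(z, 2))
```

Working with unrounded proportions gives a value close to 4.92, well above the critical value 1.96.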

13. In a random sample of 125 cool-drink consumers , 68 said they prefer Thums Up to Pepsi .
Test the null hypothesis P = 0.5 against the alternative hypothesis P > 0.5.
Sol : Given n = 125 , x = 68 and $p = \frac{x}{n} = \frac{68}{125} = 0.544$
i) Null Hypothesis 𝐻0 : P = 0.5
ii) Alternative Hypothesis 𝐻1 : P > 0.5( Right One Tailed test)
iii) Level of Significance : 𝛼 = 0.05 (𝑍𝛼 = 1.645 )
iv) Test Statistic :
$$Z_{cal} = \frac{p - P}{\sqrt{\frac{PQ}{n}}} = \frac{0.544 - 0.5}{\sqrt{\frac{(0.5)(0.5)}{125}}} = 0.9839$$

v) Conclusion: Since Zcal value < Zα value , we accept 𝐻0

14. A manufacturer claimed that at least 95% of the equipment which he supplied to
a factory conformed to specifications . An examination of a sample of 200 pieces of
equipment revealed that 18 were faulty . Test the claim at the 5% los.
Sol : Given n = 200
Number of pieces conforming to specifications = 200-18 = 182
∴ p = Proportion of pieces conforming to specifications = $\frac{182}{200} = 0.91$
P = Population proportion = $\frac{95}{100} = 0.95$

i) Null Hypothesis 𝐻0 : P = 0.95


ii) Alternative Hypothesis 𝐻1 : P < 0.95( Left One Tailed test)
iii) Level of Significance : 𝛼 = 0.05 (𝑍𝛼 = -1.645 )
iv) Test Statistic :
$$Z_{cal} = \frac{p - P}{\sqrt{\frac{PQ}{n}}} = \frac{0.91 - 0.95}{\sqrt{\frac{0.95 \times 0.05}{200}}} = -2.60$$
v) Conclusion: Since Z cal = -2.60 < Zα = -1.645 , we reject 𝐻0
Hence , we conclude that the manufacturer’s claim is rejected.

15. Among 900 people in a state 90 are found to be chapatti eaters . Construct a 99%
confidence interval for the true proportion and also test the hypothesis for a single
proportion.
Sol: Given x = 90 , n = 900
∴ $p = \frac{x}{n} = \frac{90}{900} = \frac{1}{10} = 0.1$
And q = 1- p = 0.9
Now $\sqrt{\frac{pq}{n}} = \sqrt{\frac{(0.1)(0.9)}{900}} = 0.01$

Confidence interval is $P = p \pm Z_{\alpha/2}\sqrt{\frac{pq}{n}}$
i.e., ( 0.1 - 2.58 × 0.01 , 0.1 + 2.58 × 0.01 )
= ( 0.0742 , 0.1258 ) , or approximately ( 0.07 , 0.13 )
i) Null Hypothesis 𝐻0 : P = 0.5
ii) Alternative Hypothesis 𝐻1 : P ≠0.5( Two Tailed test)
iii) Level of Significance : 𝛼 = 0.01 (𝑍𝛼 = 2.58 )
iv) Test Statistic :
$$Z_{cal} = \frac{p - P}{\sqrt{\frac{PQ}{n}}} = \frac{0.1 - 0.5}{\sqrt{\frac{0.5 \times 0.5}{900}}} = \frac{-0.4}{0.01667} = -24.0$$

v) Conclusion: Since |Z cal| = 24.0 > Zα = 2.58 , we reject 𝐻0
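The confidence interval for a single proportion used in Problem 15 can be sketched in Python as follows. The helper name `proportion_ci` is an assumption made for this example, and the critical value 2.58 is the 99% two-tailed value quoted in the problem.

```python
import math

def proportion_ci(successes, n, z_crit):
    """Large-sample confidence interval p +/- z * sqrt(pq/n)."""
    p = successes / n
    margin = z_crit * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# 90 chapatti eaters among 900 people, 99% confidence
low, high = proportion_ci(90, 900, 2.58)
print(round(low, 4), round(high, 4))
```

The hypothesised value P = 0.5 lies far outside this interval, consistent with rejecting 𝐻0.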

16. Random samples of 400 men and 200 women in a locality were asked whether
they would like to have a bus stop near their residence . 200 men and
40 women were in favor of the proposal . Test the significance of the difference
between the two proportions at the 5% los.
Sol: Let 𝑃1 and 𝑃2 be the population proportions in a locality who favor the bus stop
Given 𝑛1 = Number of men = 400
𝑛2 = Number of women = 200
𝑥1 = Number of men in favor of the bus stop = 200
𝑥2 = Number of women in favor of the bus stop = 40
∴ $p_1 = \frac{x_1}{n_1} = \frac{200}{400} = \frac{1}{2}$ and $p_2 = \frac{x_2}{n_2} = \frac{40}{200} = \frac{1}{5}$

i) Null Hypothesis 𝐻0 : 𝑃1 = 𝑃2
ii) Alternative Hypothesis 𝐻1 : 𝑃1 ≠ 𝑃2 ( Two Tailed test)
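iii) Level of Significance : 𝛼 = 0.05 (𝑍𝛼 = 1.96 )
iv) Test Statistic : $Z_{cal} = \frac{p_1 - p_2}{\sqrt{pq\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
We have $p = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{200 + 40}{400 + 200} = \frac{240}{600} = \frac{2}{5}$
$q = 1 - p = \frac{3}{5}$
∴ $Z_{cal} = \frac{0.5 - 0.2}{\sqrt{(0.4)(0.6)\left(\frac{1}{400} + \frac{1}{200}\right)}} = 7.07$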

v) Conclusion: Since |Zcal |value > Zα value , we reject 𝐻0


Hence we conclude that there is difference between the men and women in their
attitude towards the bus stop near their residence.
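The pooled two-proportion test can be verified with the short Python sketch below. It is offered as an illustration under the same pooling formula used in Problem 16; the name `two_proportion_z` is hypothetical.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Z statistic for H0: P1 = P2 using the pooled proportion p."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)          # pooled estimate
    q = 1 - p
    standard_error = math.sqrt(p * q * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error

# 200 of 400 men and 40 of 200 women favor the bus stop
z = two_proportion_z(200, 400, 40, 200)
print(round(z, 2))
```

The same function can be reused for Problems 17, 18 and 20; Problem 19 additionally subtracts the claimed difference 0.08 in the numerator.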

17. A machine puts out 16 imperfect articles in a sample of 500 articles . After the
machine is overhauled it puts out 3 imperfect articles in a sample of 100 articles .
Has the machine improved ?
Sol : Let 𝑃1 and 𝑃2 be the proportions of imperfect articles in the population of
articles manufactured by the machine before and after overhauling , respectively.
Given 𝑛1 = Sample size before the machine overhauling = 500
𝑛2 = Sample size after the machine overhauling = 100
𝑥1 = Number of imperfect articles before overhauling = 16
𝑥2 = Number of imperfect articles after overhauling = 3
∴ $p_1 = \frac{x_1}{n_1} = \frac{16}{500} = 0.032$ and $p_2 = \frac{x_2}{n_2} = \frac{3}{100} = 0.03$

i) Null Hypothesis 𝐻0 : 𝑃1 = 𝑃2
ii) Alternative Hypothesis 𝐻1 : 𝑃1 > 𝑃2 ( Right one Tailed test)
iii) Level of Significance : 𝛼 = 0.05 (𝑍𝛼 = 1.645 )
iv) Test Statistic : $Z_{cal} = \frac{p_1 - p_2}{\sqrt{pq\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
We have $p = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{16 + 3}{500 + 100} = \frac{19}{600} = 0.032$
q = 1- p = 0.968
∴ $Z_{cal} = \frac{0.032 - 0.03}{\sqrt{(0.032)(0.968)\left(\frac{1}{500} + \frac{1}{100}\right)}} = \frac{0.002}{0.019} = 0.104$
v) Conclusion: Since |Zcal| value < Zα value , we accept 𝐻0
Hence we conclude that the machine has not improved significantly.

18. In an investigation on the machine performance the following results are
obtained .

              No. of units inspected    No. of defectives
Machine 1     375                       17
Machine 2     450                       22

Test whether there is any significant difference in the performance of the two machines at 𝜶 = 0.05

Sol: Let 𝑃1 and 𝑃2 be the proportions of defective units in the population of units inspected
in Machine 1 and Machine 2 respectively.

Given 𝑛1 = Sample size of the Machine 1 = 375
𝑛2 = Sample size of the Machine 2 = 450
𝑥1 = Number of defectives of the Machine 1 = 17
𝑥2 = Number of defectives of the Machine 2 = 22
∴ $p_1 = \frac{x_1}{n_1} = \frac{17}{375} = 0.045$ and $p_2 = \frac{x_2}{n_2} = \frac{22}{450} = 0.049$

i) Null Hypothesis 𝐻0 : 𝑃1 = 𝑃2
ii) Alternative Hypothesis 𝐻1 : 𝑃1 ≠ 𝑃2 ( Two Tailed test)
iii) Level of Significance : 𝛼 = 0.05 (𝑍𝛼 = 1.96 )
iv) Test Statistic : $Z_{cal} = \frac{p_1 - p_2}{\sqrt{pq\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
We have $p = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{17 + 22}{375 + 450} = \frac{39}{825} = 0.047$
q = 1- p = 1- 0.047 = 0.953
∴ $Z_{cal} = \frac{0.045 - 0.049}{\sqrt{(0.047)(0.953)\left(\frac{1}{375} + \frac{1}{450}\right)}} = \frac{-0.004}{0.0148} = -0.27$
v) Conclusion: Since |Zcal |value < Zα value , we accept 𝐻0
Hence we conclude that there is no significant difference in performance of
machines.

19. A cigarette manufacturing firm claims that its brand A line of cigarettes outsells
its brand B by 8% . It is found that 42 out of 200 smokers prefer brand A and 18
out of another sample of 100 smokers prefer brand B . Test whether the 8%
difference is a valid claim.
Sol: Given 𝑛1 = 200
𝑛2 = 100
𝑥1 = Number of smokers preferring brand A = 42
𝑥2 = Number of smokers preferring brand B = 18
∴ $p_1 = \frac{x_1}{n_1} = \frac{42}{200} = 0.21$ and $p_2 = \frac{x_2}{n_2} = \frac{18}{100} = 0.18$

and 𝑃1 - 𝑃2 = 8% = 0.08

i) Null Hypothesis 𝐻0 : 𝑃1 - 𝑃2 = 0.08


ii) Alternative Hypothesis 𝐻1 : 𝑃1 - 𝑃2 ≠ 0.08 ( Two Tailed test)
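iii) Level of Significance : 𝛼 = 0.05 (𝑍𝛼 = 1.96 )
iv) Test Statistic : $Z_{cal} = \frac{(p_1 - p_2) - (P_1 - P_2)}{\sqrt{pq\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$

We have $p = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{42 + 18}{200 + 100} = \frac{60}{300} = 0.2$
q = 1- p = 1- 0.2 = 0.8
$Z_{cal} = \frac{(0.21 - 0.18) - 0.08}{\sqrt{(0.2)(0.8)\left(\frac{1}{200} + \frac{1}{100}\right)}} = \frac{-0.05}{0.0489} = -1.02$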
v) Conclusion: Since |𝑍𝑐𝑎𝑙 |𝑣𝑎𝑙𝑢𝑒 < 𝑍𝛼 value , we accept 𝐻0
Hence we conclude that 8% difference in the sale of two brands of cigarettes is a
valid claim.

20. In a city A , 20% of a random sample of 900 schoolboys has a certain slight
physical defect . In another city B ,18.5% of a random sample of 1600 school
boys has the same defect . Is the difference between the proportions significant at
5% los?
Sol: Given 𝑛1 = 900
𝑛2 = 1600
𝑥1 = 20% of 900 = 180
𝑥2 = 18.5% of 1600 = 296
∴ $p_1 = \frac{x_1}{n_1} = \frac{180}{900} = 0.2$ and $p_2 = \frac{x_2}{n_2} = \frac{296}{1600} = 0.185$
i) Null Hypothesis 𝐻0 : 𝑃1 = 𝑃2
ii) Alternative Hypothesis 𝐻1 : 𝑃1 ≠ 𝑃2 ( Two Tailed test)
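iii) Level of Significance : 𝛼 = 0.05 (𝑍𝛼 = 1.96 )
iv) Test Statistic : $Z_{cal} = \frac{p_1 - p_2}{\sqrt{pq\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$

We have $p = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{180 + 296}{900 + 1600} = \frac{476}{2500} = 0.19$
q = 1- p = 1- 0.19 = 0.81
$Z_{cal} = \frac{0.2 - 0.185}{\sqrt{(0.19)(0.81)\left(\frac{1}{900} + \frac{1}{1600}\right)}} = \frac{0.015}{0.01634} = 0.918$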
v) Conclusion: Since |𝑍𝑐𝑎𝑙 |𝑣𝑎𝑙𝑢𝑒 < 𝑍𝛼 value , we accept 𝐻0
Hence we conclude that there is no significant difference between the proportions.

SMALL SAMPLES

Introduction When the sample size n < 30, the sample is referred to as a small sample. In this
case the sampling distribution may not be normal, i.e., we are not justified in
estimating the population parameters as equal to the corresponding sample values.

Degree Of Freedom The number of independent variates which make up the statistic is
known as the degrees of freedom (d.f) and it is denoted by 𝜗.

For Example: If 𝑥1 + 𝑥2 + 𝑥3 = 50 and we assign any values to two of the variables (say
x1, x2), then the value of x3 is determined. Thus two of the variables are free and independent
choices, and the number of degrees of freedom is 2.

In general, the number of degrees of freedom is equal to the total number of
observations less the number of independent constraints imposed on the observations.

For example: in a set of data of n observations, if k is the number of independent constraints
then 𝜗 = 𝑛 − 𝑘

Student’s t-Distribution Or t-Distribution

Let $\bar{X}$ be the mean of a random sample of size n, taken from a normal population having the
mean 𝜇 and the variance 𝜎2 , and let the sample variance be $S^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$ , then
$t = \frac{\bar{X} - \mu}{S / \sqrt{n}}$ is a random variable having the 𝑡 − distribution with 𝜗 = 𝑛 − 1 degrees of freedom.

Properties of 𝒕 − Distribution
1. The shape of the 𝑡 − distribution is bell shaped, which is similar to that of the normal
distribution, and it is symmetrical about the mean.
2. The mean of the standard normal distribution as well as of the 𝑡 − distribution is zero, but
the variance of the 𝑡 − distribution depends upon the parameter 𝜗 which is called the
degrees of freedom.
3. The variance of the 𝑡 − distribution exceeds 1, but approaches 1 as 𝑛 → ∞.

Applications Of 𝒕 – Distributions

1. To test the significance of the sample mean, when the population variance
is not given:
Let $\bar{x}$ be the mean of the sample, n the size of the sample, 𝜎 the standard
deviation of the population and 𝜇 the mean of the population.
Then the Student 𝑡 − distribution is defined by the statistic
$t = \frac{\bar{x} - \mu}{s / \sqrt{n - 1}}$ if the sample S.D s (computed with divisor n) is given directly.

If 𝜎 is unknown, then $t = \frac{\bar{X} - \mu}{S / \sqrt{n}}$ where $S^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$

Note : Confidence limits for the mean: $\mu = \bar{x} \pm t_\alpha \frac{S}{\sqrt{n}}$ or $\mu = \bar{x} \pm t_\alpha \frac{s}{\sqrt{n - 1}}$

2. To test the significance of the difference between means of the two
independent samples :
To test the significance of the difference between the sample means $\bar{x}_1$ and $\bar{x}_2$ of two independent
samples of sizes n1 and n2, with the same variance .

We use statistic

$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{S^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$ -------(1) where

$\bar{x}_1 = \frac{\sum x_1}{n_1}$ , $\bar{x}_2 = \frac{\sum x_2}{n_2}$ and

$S^2 = \frac{1}{n_1 + n_2 - 2}\left[\sum (x_1 - \bar{x}_1)^2 + \sum (x_2 - \bar{x}_2)^2\right]$

OR $S^2 = \frac{1}{n_1 + n_2 - 2}\left[n_1 s_1^2 + n_2 s_2^2\right]$

Where s1 and s2 are sample standard deviations.

Note: Confidence limits for difference of means : $\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm t_\alpha \sqrt{S^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$

Paired t- test ( Test the significance of the difference between means of two
dependent samples ) :
Paired observations arise in many practical situations where each homogeneous experimental
unit receives both population conditions.

For Example: To test the effectiveness of a drug, each person’s blood pressure is measured
before and after the intake of the drug. Here the individual person is the experimental unit
and the two populations are blood pressure “before” and “after” the drug is given.

The paired t-test is applied to n paired observations by taking the differences d1, d2, ..., dn of the
paired data. To test whether the differences di form a random sample from a population with
mean 𝜇 = 0, we use the statistic
$t = \frac{\bar{d}}{s / \sqrt{n}}$ where $\bar{d} = \frac{1}{n}\sum d_i$ and $s^2 = \frac{1}{n - 1}\sum (d_i - \bar{d})^2$


Problems:
1. A sample of 26 bulbs gives a mean life of 990 hours with a S.D of 20 hours. The
manufacturer claims that the mean life of bulbs is 1000 hours . Is the sample not
up to the standard?
Sol: Given n = 26 , $\bar{x} = 990$ ,
𝜇 = 1000 and S.D i.e., s = 20
i) Null Hypothesis : 𝐻0 : 𝜇 = 1000
ii) Alternative Hypothesis: 𝐻1 : 𝜇 < 1000 ( Left one tailed test )
(Since the sample is suspected to be below standard)
iii) Level of significance : 𝛼 = 0.05
t tabulated value with 25 degrees of freedom for a one tailed test is 1.708
iv) Test Statistic : $t_{cal} = \frac{\bar{x} - \mu}{s / \sqrt{n - 1}} = \frac{990 - 1000}{20 / \sqrt{25}} = -2.5$

v) Conclusion: Since |t cal| = 2.5 > tα = 1.708 , we reject 𝐻0

Hence we conclude that the sample is not up to the standard.
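The t statistic of Problem 1 can be reproduced with the sketch below. It is an illustration only: the function name `t_from_sample_sd` is an assumption, and it follows the convention of these notes that a directly given sample S.D s is divided by √(n − 1).

```python
import math

def t_from_sample_sd(sample_mean, mu0, s, n):
    """t statistic when the sample S.D. s is given directly: (x - mu) / (s / sqrt(n - 1))."""
    return (sample_mean - mu0) / (s / math.sqrt(n - 1))

# 26 bulbs, mean life 990 hours, S.D. 20 hours, claimed mean 1000 hours
t = t_from_sample_sd(990, 1000, 20, 26)
print(t)

# Left-tailed test at 5% with 25 d.f. (table value 1.708 quoted in the notes)
print("reject H0" if t < -1.708 else "accept H0")
```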

2. A random sample of size 16 values from a normal population showed a mean of 53
and sum of squares of deviations from the mean equal to 150 . Can this sample be
regarded as taken from the population having 56 as mean ? Obtain 95% confidence
limits of the mean of the population.
Sol: a) Given n = 16 , $\bar{x} = 53$ ,
𝜇 = 56 and $\sum (x_i - \bar{x})^2 = 150$
∴ $S^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{150}{15} = 10$ ⇒ S = √10

Degrees of freedom 𝜗 = n-1 = 16-1 = 15

i) Null Hypothesis 𝐻0 : 𝜇 = 56
ii) Alternative Hypothesis 𝐻1 : 𝜇 ≠ 56 (Two tailed test )
iii) Level of significance : 𝛼 = 0.05
t tabulated value with 15 degrees of freedom for two tailed test is 2.13
iv) Test Statistic : $t_{cal} = \frac{\bar{x} - \mu}{S / \sqrt{n}} = \frac{53 - 56}{\sqrt{10} / \sqrt{16}} = -3.79$

v) Conclusion: Since |t cal| = 3.79 > tα = 2.13 , we reject 𝐻0

Hence we conclude that the sample cannot be regarded as taken from a population with mean 56.
b) The 95% confidence limits of the mean of the population are given by
$\bar{x} \pm t_{0.05} \frac{S}{\sqrt{n}} = 53 \pm 2.13 \times 0.79$
= 53 ± 1.6827
= 54.68 and 51.32
∴ 95% confidence limits are ( 51.32, 54.68 )


3. A random sample of 10 boys had the following I.Q’s : 70, 120 ,110, 101,88,
83,95,98,107 and 100.
a) Do these data support the assumption of a population mean I.Q of 100?
b) Find a reasonable range in which most of the mean I.Q values of samples of
10 boys lie
Sol: Since the mean and S.D are not given, we first have to determine them.

x       x − x̄     (x − x̄)²
70      -27.2     739.84
120     22.8      519.84
110     12.8      163.84
101     3.8       14.44
88      -9.2      84.64
83      -14.2     201.64
95      -2.2      4.84
98      0.8       0.64
107     9.8       96.04
100     2.8       7.84
∑x = 972          ∑(x − x̄)² = 1833.60

Mean , $\bar{x} = \frac{\sum x}{n} = \frac{972}{10} = 97.2$ and

$S^2 = \frac{1}{n - 1}\sum (x - \bar{x})^2 = \frac{1833.6}{9} = 203.73$

∴ S = √203.73 = 14.27

i) Null Hypothesis 𝐻0 : 𝜇 = 100


ii) Alternative Hypothesis 𝐻1 : 𝜇 ≠ 100 (Two tailed test )
iii) Level of significance : 𝛼 = 0.05
t tabulated value with 9 degrees of freedom for two tailed test is 2.26
iv) Test Statistic : $t_{cal} = \frac{\bar{x} - \mu}{S / \sqrt{n}} = \frac{97.2 - 100}{14.27 / \sqrt{10}} = -0.62$

v) Conclusion: Since |𝑡 𝑐𝑎𝑙 | 𝑣𝑎𝑙𝑢𝑒 < 𝑡𝛼 value , we accept 𝐻0


Hence we conclude that the data support the assumption of mean I.Q of 100 in the
population.
b) The 95% confidence limits of the mean of the population are given by
$\bar{x} \pm t_{0.05} \frac{S}{\sqrt{n}} = 97.2 \pm 2.26 \times 4.512$
= 97.2 ± 10.198
= 107.4 and 87
∴ 95% confidence limits are ( 87, 107.4 )
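When raw data are given, as in Problem 3, the mean, S and the t statistic can be computed directly. The sketch below uses Python's standard `statistics` module, whose `stdev` function uses the divisor n − 1 and therefore matches S in these notes; the variable names are illustrative choices, and the table value 2.26 is taken from the problem.

```python
import math
import statistics

iq_scores = [70, 120, 110, 101, 88, 83, 95, 98, 107, 100]
mu0 = 100                                   # hypothesised population mean

n = len(iq_scores)
sample_mean = statistics.mean(iq_scores)    # 97.2
S = statistics.stdev(iq_scores)             # divisor n - 1, about 14.27

t_cal = (sample_mean - mu0) / (S / math.sqrt(n))
print(round(t_cal, 2))

# 95% confidence limits using the table value 2.26 for 9 d.f.
margin = 2.26 * S / math.sqrt(n)
print(round(sample_mean - margin, 1), round(sample_mean + margin, 1))
```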

4. Samples of two types of electric bulbs were tested for length of life and the following
data were obtained

                 Type 1          Type 2
Sample size      𝒏𝟏 = 8          𝒏𝟐 = 7
Sample mean      x̄1 = 1234       x̄2 = 1036
Sample S.D       𝒔𝟏 = 36         𝒔𝟐 = 40

Is the difference in the means sufficient to warrant that type 1 is superior to type
2 regarding length of life ?
Sol: i) Null Hypothesis 𝐻0 : The two types of electric bulbs are identical
i.e., 𝐻0 : 𝜇1 = 𝜇2
ii) Alternative Hypothesis 𝐻1 : 𝜇1 ≠ 𝜇2
iii) Test Statistic : $t_{cal} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{S^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
where $S^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2} = \frac{1}{8 + 7 - 2}\left(8(36)^2 + 7(40)^2\right) = 1659.08$
∴ $t = \frac{1234 - 1036}{\sqrt{1659.08\left(\frac{1}{8} + \frac{1}{7}\right)}} = 9.39$

iv) Degrees of freedom = 8+7-2 = 13 ; tabulated value of t for 13 d.f at 5% los is 2.16
v) Conclusion: Since |t cal| value > tα value , we reject 𝐻0

Hence we conclude that the two types 1 and 2 of electric bulbs are not identical .
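The pooled two-sample t statistic from summary data can be checked as follows. This is a sketch under the convention used in Problem 4, where the pooled variance is (n1·s1² + n2·s2²)/(n1 + n2 − 2); the function name `pooled_t` is hypothetical.

```python
import math

def pooled_t(mean1, mean2, s1, s2, n1, n2):
    """Two-sample t statistic with pooled variance (n1*s1^2 + n2*s2^2) / (n1 + n2 - 2)."""
    pooled_variance = (n1 * s1**2 + n2 * s2**2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, pooled_variance

# Type 1 vs type 2 bulbs (Problem 4)
t_cal, S2 = pooled_t(1234, 1036, 36, 40, 8, 7)
print(round(S2, 2), round(t_cal, 2))
```

With 13 degrees of freedom the table value at 5% is 2.16, so the computed statistic lies well inside the rejection region.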

5. Two horses A and B were tested according to the time to run a particular track with
the following results .
Horse A 28 30 32 33 33 29 34

Horse B 29 30 30 24 27 29

Test whether the two horses have the same running capacity
Sol: Given 𝑛1 = 7 , 𝑛2 = 6
We first compute the sample means and standard deviations
$\bar{x}$ = Mean of the first sample = $\frac{1}{7}$ ( 28 + 30 + 32 + 33 + 33 + 29 + 34)
= $\frac{1}{7}$ (219) = 31.286
$\bar{y}$ = Mean of the second sample = $\frac{1}{6}$ ( 29 + 30 + 30 + 24 + 27 + 29 )
= $\frac{1}{6}$ (169) = 28.16

x     x − x̄      (x − x̄)²     y     y − ȳ     (y − ȳ)²
28    -3.286     10.8         29    0.84      0.7056
30    -1.286     1.6538       30    1.84      3.3856
32    0.714      0.51         30    1.84      3.3856
33    1.714      2.94         24    -4.16     17.3056
33    1.714      2.94         27    -1.16     1.3456
29    -2.286     5.226        29    0.84      0.7056
34    2.714      7.366
∑x = 219   ∑(x − x̄)² = 31.4358   ∑y = 169   ∑(y − ȳ)² = 26.8336

Now $S^2 = \frac{1}{n_1 + n_2 - 2}\left[\sum (x - \bar{x})^2 + \sum (y - \bar{y})^2\right]$

= $\frac{1}{11}$ [31.4358 + 26.8336]

= $\frac{1}{11}$ (58.2694)

= 5.30

∴ S = √5.30 = 2.3

i) Null Hypothesis 𝐻0 : 𝜇1 = 𝜇2
ii) Alternative Hypothesis 𝐻1 : 𝜇1 ≠ 𝜇2
iii) Test Statistic : $t_{cal} = \frac{\bar{x}_1 - \bar{x}_2}{S\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{31.286 - 28.16}{2.3\sqrt{\frac{1}{7} + \frac{1}{6}}} = 2.443$

∴ 𝑡𝑐𝑎𝑙 = 2.443


iv)Degrees of freedom = 7+6-2 =11

Tabulated value of t for 11 d.f at 5% los is 2.2

Conclusion: Since |t cal | value > tα value , we reject 𝐻0

Hence we conclude that both horses do not have the same running capacity.

6. Ten soldiers participated in a shooting competition in the first week. After intensive
training they participated in the competition in the second week . Their scores
before and after training are given below :

Scores before   67  24  57  55  63  54  56  68  33  43
Scores after    70  38  58  58  56  67  68  75  42  38

Do the data indicate that the soldiers have been benefited by the training ?
Sol: Given 𝑛1 = 10 , 𝑛2 = 10
We first compute the sample means and standard deviations
$\bar{x}$ = Mean of the first sample = $\frac{1}{10}$ (67 + 24 + 57 + 55 + 63 + 54 + 56 + 68 + 33 + 43)
= $\frac{1}{10}$ (520) = 52
$\bar{y}$ = Mean of the second sample = $\frac{1}{10}$ (70 + 38 + 58 + 58 + 56 + 67 + 68 + 75 + 42 + 38)
= $\frac{1}{10}$ (570) = 57

x     x − x̄    (x − x̄)²    y     y − ȳ    (y − ȳ)²
67    15       225         70    13       169
24    -28      784         38    -19      361
57    5        25          58    1        1
55    3        9           58    1        1
63    11       121         56    -1       1
54    2        4           67    10       100
56    4        16          68    11       121
68    16       256         75    18       324
33    -19      361         42    -15      225
43    -9       81          38    -19      361
∑x = 520   ∑(x − x̄)² = 1882   ∑y = 570   ∑(y − ȳ)² = 1664

Now $S^2 = \frac{1}{n_1 + n_2 - 2}\left[\sum (x - \bar{x})^2 + \sum (y - \bar{y})^2\right]$

= $\frac{1}{18}$ [1882 + 1664]

= $\frac{1}{18}$ (3546)

= 197

∴ S = √197 = 14.0357

i) Null Hypothesis 𝐻0 : 𝜇1 = 𝜇2
ii) Alternative Hypothesis 𝐻1 : 𝜇1 < 𝜇2 (Left one tailed test)
iii) Test Statistic : $t_{cal} = \frac{\bar{x}_1 - \bar{x}_2}{S\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{52 - 57}{14.0357\sqrt{\frac{1}{10} + \frac{1}{10}}} = -0.796$
∴ 𝑡𝑐𝑎𝑙 = -0.796

iv)Degrees of freedom = 10+10-2 =18

Tabulated value of t for 18 d.f at 5% los is -1.734

Conclusion: Since |t cal | value < |tα | value , we accept 𝐻0

Hence we conclude that the soldiers are not benefited by the training.

7. The blood pressure of 5 women before and after intake of a certain drug are given
below:
Before 110 120 123 132 125

After 120 118 125 136 121

Test whether there is significant change in blood pressure at 1% los?


Sol: Given n = 5
i) Null Hypothesis 𝐻0 : 𝜇1 = 𝜇2
ii) Alternative Hypothesis 𝐻1 : 𝜇1 ≠ 𝜇2 (Two tailed test, since we test for any change)
iii) Test Statistic $t_{cal} = \frac{\bar{d}}{S / \sqrt{n}}$

where $\bar{d} = \frac{\sum d}{n}$ and $S^2 = \frac{1}{n - 1}\sum (d - \bar{d})^2$

B.P before drug (x)    B.P after drug (y)    d = y − x    d − d̄    (d − d̄)²
110                    120                   10           8        64
120                    118                   -2           -4       16
123                    125                   2            0        0
132                    136                   4            2        4
125                    121                   -4           -6       36
                                             ∑d = 10               ∑(d − d̄)² = 120

∴ $\bar{d} = \frac{10}{5} = 2$ and $S^2 = \frac{120}{4} = 30$

∴ S = 5.477
$t_{cal} = \frac{\bar{d}}{S / \sqrt{n}} = \frac{2}{5.477 / \sqrt{5}} = 0.816$

iv) Degrees of freedom = 5-1= 4


Tabulated value of t for 4 d.f at 1% los is 4.6
Conclusion: Since |t cal | value < |tα | value , we accept 𝐻0
Hence we conclude that there is no significant difference in Blood pressure after intake of a
certain drug.
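The paired t statistic can be verified with the sketch below, which uses the before/after values as they appear in the worked table of Problem 7 (third "before" reading taken as 123). The variable names are illustrative assumptions.

```python
import math
import statistics

before = [110, 120, 123, 132, 125]
after = [120, 118, 125, 136, 121]

# Differences d = y - x for each woman
differences = [y - x for x, y in zip(before, after)]
n = len(differences)

d_bar = statistics.mean(differences)        # mean difference
S = statistics.stdev(differences)           # divisor n - 1

t_cal = d_bar / (S / math.sqrt(n))
print(d_bar, round(S, 3), round(t_cal, 3))

# Two-tailed test at 1% with 4 d.f. (table value 4.6 quoted in the notes)
print("reject H0" if abs(t_cal) > 4.6 else "accept H0")
```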

8. Memory capacity of 10 students were tested before and after training . State
whether the training was effective or not from the following scores.
Sol : i) Null Hypothesis 𝐻0 : 𝜇1 = 𝜇2
ii) Alternative Hypothesis 𝐻1 : 𝜇1 < 𝜇2 (Left one tailed test)
iii) Test Statistic $t_{cal} = \frac{\bar{d}}{S / \sqrt{n}}$

where $\bar{d} = \frac{\sum d}{n}$ and $S^2 = \frac{1}{n - 1}\sum (d - \bar{d})^2$

Before (x)    After (y)    d = x − y    d²
12            15           -3           9
14            16           -2           4
11            10           1            1
8             7            1            1
7             5            2            4
10            12           -2           4
3             10           -7           49
0             2            -2           4
5             3            2            4
6             8            -2           4
                           ∑d = −12     ∑d² = 84

$\bar{d} = \frac{-12}{10} = -1.2$

$S^2 = \frac{\sum d^2 - n\bar{d}^2}{n - 1} = \frac{84 - 10(-1.2)^2}{9} = 7.73$

∴ S = 2.78
$t_{cal} = \frac{\bar{d}}{S / \sqrt{n}} = \frac{-1.2}{2.78 / \sqrt{10}} = -1.365$ and d.f = n-1 = 9

Tabulated value of t for 9 d.f at 5% los is 1.833

Conclusion: Since |t cal | value < |tα | value , we accept 𝐻0

Hence we conclude that there is no significant difference in memory capacity after the
training program.

Chi-Square (𝝌𝟐 ) Distribution

The chi-square distribution is a continuous probability distribution of a non-negative random
variable. Chi-square tables are cumulative: they give the probability of the random variable
being less than or equal to (or exceeding) a particular value. Since the sum of the probabilities
of every possible value must equal one , the total area under the curve is equal to one .
Chi-square distributions vary depending on the degrees of freedom. In a goodness of fit test the
degrees of freedom are found by subtracting one from the number of categories in the data
(and one more for each parameter estimated from the data).


Applications of Chi – Square Distribution:

1. Chi – Square test as a test of goodness of fit :

The 𝝌𝟐 – test enables us to ascertain how well theoretical distributions such as the
binomial, Poisson, normal etc. fit the distributions obtained from sample data. If the
calculated value of 𝝌𝟐 is less than the table value at the specified level of significance
(generally 5%), the fit is considered to be good.

If the calculated value of 𝝌𝟐 is greater than the table value, the fit is considered to be poor.

i) Null hypothesis: H0 : There is no difference between the given (observed) values and the calculated (expected) values

ii) Alternative hypothesis: H1 : There is some difference between the given values and the calculated values

iii) Test Statistic $\chi^2_{cal} = \sum \frac{(O - E)^2}{E}$

iv) The table value is read at the specified level of significance for n-1 d.f if the fitted
distribution is binomial, and for n-2 d.f if the fitted distribution is Poisson (n = number of classes)

v) Conclusion : If 𝝌𝟐 cal value < 𝝌𝟐 tab value , then we accept H0 , otherwise we reject H0 .

2. Chi – Square test for independence of attributes :

Definition : An attribute means a quality or characteristic
Eg: Drinking, Smoking, blindness, Honesty, beauty etc.,

An attribute may be marked by its presence or absence in the members of a given population.

Let us consider two attributes A and B.

A is divided into two classes and B is divided into two classes. The various cell frequencies
can be expressed in the following table known as 2x2 contingency table.

a        b        a + b
c        d        c + d
a + c    b + d    N = a + b + c + d

The expected frequencies are given by

$E(a) = \frac{(a + c)(a + b)}{N}$
$E(b) = \frac{(b + d)(a + b)}{N}$
$E(c) = \frac{(a + c)(c + d)}{N}$
$E(d) = \frac{(b + d)(c + d)}{N}$

$\chi^2_{cal} = \sum \frac{(O - E)^2}{E}$

The 𝝌𝟐 cal value is to be compared with the 𝝌𝟐 tab value at the 1% (or 5% or 10%) level of significance for
(r-1) (c-1) d.f, where r is the number of rows and c is the number of columns.

Note: In the 𝝌𝟐 test for independence of attributes, we test whether two attributes A and B are
independent or not.

i) Null Hypothesis: H0 : The two attributes are independent

ii) Alternative hypothesis: H1 : The two attributes are not independent

iii) Test Statistic $\chi^2_{cal} = \sum \frac{(O - E)^2}{E}$
where E = (Row total × Column total) / Grand total

iv) The table value is read at the specified level of significance for (m-1) (n-1) d.f, where m is the no. of rows and n is the no. of
columns

v) Conclusion : If 𝛘𝟐 cal value < 𝛘𝟐 tab value , then we accept H0 , otherwise we reject H0 .

Problems :

1. Fit a Poisson distribution to the following data and test for its goodness of fit at 5%
los

x 0 1 2 3 4

f 419 352 154 56 19

Sol:

x       f           fx
0       419         0
1       352         352
2       154         308
3       56          168
4       19          76
        N = 1000    ∑fx = 904

Mean $\lambda = \frac{\sum fx}{N} = \frac{904}{1000} = 0.904$

Theoretical distribution is given by

$N \times p(x) = 1000 \times \frac{e^{-\lambda}\lambda^x}{x!}$

Hence the theoretical frequencies are given by

x                               0        1      2        3       4       Total
f = 1000 × e^(−λ) λ^x / x!      406.2    366    165.4    49.8    12.6    1000

The total of the given frequencies is equal to the total of the calculated frequencies.

To test for goodness of fit:

i) H0 : There is no difference in given values and calculated values

ii) H1 : There is some difference in given values and calculated values

iii) $\chi^2_{cal} = \sum \frac{(O - E)^2}{E}$

O      E        (O − E)²    (O − E)²/E
419    406.2    163.84      0.4033
352    366      196         0.5355
154    165.4    129.96      0.7857
56     49.8     38.44       0.7719
19     12.6     40.96       3.2508
                            ∑ (O − E)²/E = 5.747

Degrees of freedom = 5-2 = 3

𝝌𝟐 𝑡𝑎𝑏 at 5% LOS = 7.82

Since 𝝌𝟐 𝑐𝑎𝑙 value < 𝝌𝟐 𝑡𝑎𝑏 ,we accept 𝐻0 .
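The Poisson fit and the goodness of fit statistic can be sketched in Python as below. This is an illustration, not the method of the notes: `poisson_expected` and `chi_square` are hypothetical helper names. The raw Poisson frequencies differ slightly from the tabulated ones, which appear to have been adjusted so that they total 1000, so the statistic is computed from the frequencies listed in the table.

```python
import math

def poisson_expected(lam, x_values, total):
    """Expected frequencies N * e^(-lam) * lam^x / x! for each x."""
    return [total * math.exp(-lam) * lam**x / math.factorial(x) for x in x_values]

def chi_square(observed, expected):
    """Goodness of fit statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [419, 352, 154, 56, 19]
x_values = [0, 1, 2, 3, 4]
N = sum(observed)
lam = sum(x * f for x, f in zip(x_values, observed)) / N     # mean = 0.904

raw_expected = poisson_expected(lam, x_values, N)
print([round(e, 1) for e in raw_expected])

# Frequencies as tabulated in the notes (total adjusted to 1000)
table_expected = [406.2, 366, 165.4, 49.8, 12.6]
chi2_cal = chi_square(observed, table_expected)
print(round(chi2_cal, 3))
```

The value is compared with 7.82, the table value for 5 − 2 = 3 degrees of freedom at the 5% level.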

3. A die is thrown 264 times with following results. Show that the die is biased [ Given
𝝌𝟐 𝟎.𝟎𝟓 = 11.07 for 5 d.f]
No. appeared 1 2 3 4 5 6
on the die
Frequency 40 32 28 58 54 52

Sol: i) H0 : The die is unbiased

ii) H1 : The die is not unbiased

iii) $\chi^2_{cal} = \sum \frac{(O - E)^2}{E}$

The expected frequency of each of the numbers 1,2,3,4,5,6 is $\frac{264}{6} = 44$

Calculation of 𝝌𝟐 :

O     E     (O − E)²    (O − E)²/E
40    44    16          0.3636
32    44    144         3.2727
28    44    256         5.8181
58    44    196         4.4545
54    44    100         2.2727
52    44    64          1.4545
                        ∑ (O − E)²/E = 17.6362

𝝌𝟐 𝑐𝑎𝑙 = 17.6362

The number of degrees of freedom = n-1 = 5

𝝌𝟐 0.05 = 11.07 for 5 d.f

Since 𝝌𝟐 𝑐𝑎𝑙 value > 𝝌𝟐 𝑡𝑎𝑏 value , we reject 𝐻0

Hence the die is biased

4. On the basis of information given below about the treatment of 200 patients
suffering from disease , state whether the new treatment is comparatively
Superior to the conventional treatment.
Treatment Favorable Not Favorable Total

New 60 30 90

Conventional 40 70 110

Sol: i) H0 : The two attributes are independent

ii) H1 : The two attributes are not independent

iii) $\chi^2_{cal} = \sum \frac{(O - E)^2}{E}$
where E = (Row total × Column total) / Grand total

Expected frequencies:

90 × 100 / 200 = 45      90 × 100 / 200 = 45      90
110 × 100 / 200 = 55     110 × 100 / 200 = 55     110
100                      100                      200

Calculation of 𝝌𝟐 :

O     E     (O − E)²    (O − E)²/E
60    45    225         5
30    45    225         5
40    55    225         4.09
70    55    225         4.09
                        ∑ (O − E)²/E = 18.18

𝝌𝟐 𝑐𝑎𝑙 = 18.18

𝝌𝟐 𝑡𝑎𝑏 for 1 d.f . at 5% los is 3.841

since 𝝌𝟐 𝑐𝑎𝑙 value > 𝝌𝟐 𝑡𝑎𝑏 value , we reject 𝐻0

Hence we conclude that the type of treatment and the outcome are not independent, i.e., the new treatment is superior to the conventional treatment.
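The expected frequencies and the 𝝌𝟐 statistic for a contingency table follow mechanically from the row and column totals. The Python sketch below illustrates this for the treatment data; the function name `chi_square_independence` is an assumption made for the example.

```python
def chi_square_independence(table):
    """Chi-square statistic and d.f. for an r x c contingency table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # E = (row total x column total) / grand total
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof

# Rows: new, conventional; columns: favorable, not favorable
chi2_cal, dof = chi_square_independence([[60, 30], [40, 70]])
print(round(chi2_cal, 2), dof)
```

The statistic is compared with 3.841, the table value for 1 degree of freedom at the 5% level.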

Snedecor’s F- Test of Significance


The F-distribution is also called the Variance Ratio Distribution, as it describes the ratio
of the variances of samples from two normally distributed populations. The F-distribution got its name
after R.A. Fisher, who studied this test for the first time in 1924.

Symbolically, the quantity F is distributed as the F-distribution with 𝜗1 = 𝑛1 − 1 and
𝜗2 = 𝑛2 − 1 degrees of freedom and is represented as:

$F_{cal} = \frac{\text{Greater Variance}}{\text{Smaller Variance}}$

$F_{cal} = \frac{S_1^2}{S_2^2}$ or $\frac{S_2^2}{S_1^2}$

Where,
$S_1^2$ is the unbiased estimator of $\sigma_1^2$ and is calculated as: $S_1^2 = \frac{n_1 s_1^2}{n_1 - 1} = \frac{1}{n_1 - 1}\sum (x_1 - \bar{x}_1)^2$

$S_2^2$ is the unbiased estimator of $\sigma_2^2$ and is calculated as: $S_2^2 = \frac{n_2 s_2^2}{n_2 - 1} = \frac{1}{n_2 - 1}\sum (x_2 - \bar{x}_2)^2$

To test the hypothesis that the two population variances 𝝈𝟏 𝟐 and 𝝈𝟐 𝟐 are
equal

i) H0 : σ1 2 = σ2 2

ii) H1 : σ1 2 ≠ σ2 2
iii) $F_{cal} = \frac{\text{Greater variance}}{\text{Smaller variance}}$

iv) Find the tabulated value $F_{tab}$ at the specified level of significance (1% or 5%) for $(\vartheta_1, \vartheta_2)$ d.f.

v) If $F_{cal}$ value < $F_{tab}$ value, then we accept $H_0$; otherwise we reject $H_0$.

$F_\alpha(\vartheta_1, \vartheta_2)$ is the value of F with $\vartheta_1$ and $\vartheta_2$ degrees of freedom such that the area under the
F-distribution to the right of $F_\alpha$ is $\alpha$.


Problems:

1. In one sample of 8 observations from a normal population, the sum of the squares of
deviations of the sample values from the sample mean is 84.4 and in another sample
of 10 observations it was 102.6. Test at 𝟓% level whether the populations have the
same variance.

Sol: Let 𝜎1 2 and 𝜎2 2 be the variances of the two normal populations from which the
samples are drawn.

Let the Null Hypothesis be 𝐻0 : 𝜎1 2 = 𝜎2 2

Then the Alternative Hypothesis is 𝐻1 : 𝜎1 2 ≠ 𝜎2 2

Here 𝑛1 = 8, 𝑛2 = 10
Also $\sum (x_i - \bar{x})^2 = 84.4$, $\sum (y_i - \bar{y})^2 = 102.6$

If $S_1^2$ and $S_2^2$ are the estimates of $\sigma_1^2$ and $\sigma_2^2$, then

$S_1^2 = \frac{1}{n_1 - 1} \sum (x_i - \bar{x})^2 = \frac{84.4}{7} = 12.057$

and

$S_2^2 = \frac{1}{n_2 - 1} \sum (y_i - \bar{y})^2 = \frac{102.6}{9} = 11.4$

Let $H_0$ be true. Since $S_1^2 > S_2^2$, the test statistic is

$F = \frac{S_1^2}{S_2^2} = \frac{12.057}{11.4} = 1.058$

i.e., calculated F = 1.058.

Degrees of freedom are given by 𝑣1 = 𝑛1 − 1 = 8 − 1 = 7

and 𝑣2 = 𝑛2 − 1 = 10 − 1 = 9

Tabulated value of 𝐹 at 5% level for (7,9) degrees of freedom is 3.29

i.e.,𝐹0.05 (7,9) = 3.29

Since calculated F < tabulated F, we accept the Null Hypothesis $H_0$ and
conclude that the populations have the same variance.
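
The steps of this problem can be sketched in code when only the sums of squared deviations are known. The helper `f_statistic` below is a hypothetical name introduced for illustration; it places the larger unbiased variance estimate in the numerator and returns the degrees of freedom in the same order. The tabulated value 3.29 is the one quoted in the text.

```python
def f_statistic(ss1, n1, ss2, n2):
    """Return (F, numerator d.f., denominator d.f.).

    ss1, ss2 are sums of squared deviations from the sample means;
    n1, n2 are the sample sizes. The larger unbiased variance
    estimate is always placed in the numerator.
    """
    var1 = ss1 / (n1 - 1)
    var2 = ss2 / (n2 - 1)
    if var1 >= var2:
        return var1 / var2, n1 - 1, n2 - 1
    return var2 / var1, n2 - 1, n1 - 1


# Data of Problem 1: n1 = 8 with sum of squares 84.4,
# n2 = 10 with sum of squares 102.6.
f_cal, df_num, df_den = f_statistic(84.4, 8, 102.6, 10)

f_tab = 3.29  # F_0.05(7, 9), as quoted in the text
accept_h0 = f_cal < f_tab
print(f"F = {f_cal:.3f} with ({df_num}, {df_den}) d.f.; accept H0: {accept_h0}")
```

Returning the degrees of freedom together with F keeps the table lookup tied to whichever sample ended up in the numerator.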


2. The time taken by workers in performing a job by method I and method II is given
below:

Method I     20   16   26   27   23   22   -
Method II    27   33   42   35   32   34   38

Do the data show that the variances of the time distributions of the populations from which
these samples are drawn do not differ significantly?

Sol: Let the Null Hypothesis be $H_0: \sigma_1^2 = \sigma_2^2$, where $\sigma_1^2$ and $\sigma_2^2$ are the variances of the
two populations from which the samples are drawn.

The Alternative Hypothesis is 𝐻1 : 𝜎1 2 ≠ 𝜎2 2 .

Calculation of sample variances.

$x$      $x - \bar{x}$    $(x - \bar{x})^2$    $y$      $y - \bar{y}$    $(y - \bar{y})^2$
20       -2.3             5.29                 27       -7.4             54.76
16       -6.3             39.69                33       -1.4             1.96
26       3.7              13.69                42       7.6              57.76
27       4.7              22.09                35       0.6              0.36
23       0.7              0.49                 32       -2.4             5.76
22       -0.3             0.09                 34       -0.4             0.16
                                               38       3.6              12.96
Total:
134                       81.34                241                       133.72

Given 𝑛1 = 6, 𝑛2 = 7

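$\therefore \bar{x} = \frac{\sum x}{n_1} = \frac{134}{6} = 22.3$, $\bar{y} = \frac{\sum y}{n_2} = \frac{241}{7} = 34.4$

And

$\sum (x_i - \bar{x})^2 = 81.34$, $\sum (y_i - \bar{y})^2 = 133.72$

If $S_1^2$ and $S_2^2$ are the estimates of $\sigma_1^2$ and $\sigma_2^2$, then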


$S_1^2 = \frac{1}{n_1 - 1} \sum (x_i - \bar{x})^2 = \frac{81.34}{5} = 16.268$

and

$S_2^2 = \frac{1}{n_2 - 1} \sum (y_i - \bar{y})^2 = \frac{133.72}{6} = 22.29$

Let $H_0$ be true.

Since $S_2^2 > S_1^2$, the statistic is

$F = \frac{S_2^2}{S_1^2} = \frac{22.29}{16.268} = 1.37$

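Since the greater variance $S_2^2$ is in the numerator, the degrees of freedom are $(n_2 - 1, n_1 - 1) = (6, 5)$, and $F_{0.05}(6, 5) = 4.95$

Since calculated F < tabulated F, we accept the null hypothesis $H_0$ at the 5% level of significance, i.e., there is no
significant difference between the variances of the times taken under the two methods.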
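
Starting from the raw times, the whole calculation can be sketched with the Python standard library, where `statistics.variance` uses the divisor n - 1 and so gives the unbiased estimates directly. The variable names are illustrative. The tabulated value 4.95 for (6, 5) d.f., numerator d.f. first, is taken from standard F tables and should be treated as an assumption to check against your own table. The script uses unrounded means, so its variances differ slightly from the hand calculation.

```python
from statistics import variance

# Times taken under the two methods (data of Problem 2).
method_1 = [20, 16, 26, 27, 23, 22]
method_2 = [27, 33, 42, 35, 32, 34, 38]

# Unbiased sample variances (divisor n - 1).
s1_sq = variance(method_1)
s2_sq = variance(method_2)

# Larger variance goes in the numerator; keep the d.f. in the same order.
if s1_sq >= s2_sq:
    f_cal = s1_sq / s2_sq
    dof = (len(method_1) - 1, len(method_2) - 1)
else:
    f_cal = s2_sq / s1_sq
    dof = (len(method_2) - 1, len(method_1) - 1)

# Assumed 5% tabulated value for (6, 5) d.f. from standard F tables.
f_tab = 4.95

accept_h0 = f_cal < f_tab
print(f"S1^2 = {s1_sq:.3f}, S2^2 = {s2_sq:.3f}")
print(f"F = {f_cal:.2f} with {dof} d.f.; accept H0: {accept_h0}")
```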


UNIT-V
STOCHASTIC PROCESS


MARKOV CHAIN:


EXAMPLE 5.1:


EXAMPLE 5.2:

The transition diagram is-
