Distribution Theory - Notes
11/21/2013
Probability distribution of X, the number of heads in three coin tosses (X = 0 corresponds to the outcome TTT):

X    P(X = x)
0    P(X = 0) = P(TTT) = 1/8
1    P(X = 1) = 3/8
2    P(X = 2) = 3/8
3    P(X = 3) = 1/8

If a random variable is a discrete variable, its probability distribution
is called a discrete probability distribution.
An example will make this clear. Suppose you flip a coin two times.
This simple statistical experiment can have four possible outcomes:
HH, HT, TH, and TT. Now, let the random variable X represent the
number of Heads that result from this experiment. The random
variable X can only take on the values 0, 1, or 2, so it is a discrete
random variable.
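The coin-flip distributions above can be reproduced by listing every equally likely outcome and counting heads. The following is only a minimal sketch; the helper name `heads_pmf` is introduced here for illustration and is not part of the notes.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_pmf(n_flips):
    """Return {x: P(X = x)} for X = number of heads in n_flips fair coin flips."""
    outcomes = list(product("HT", repeat=n_flips))  # all equally likely sequences
    counts = Counter(seq.count("H") for seq in outcomes)
    return {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}


pmf_two = heads_pmf(2)    # outcomes HH, HT, TH, TT
pmf_three = heads_pmf(3)  # eight outcomes; TTT is the only one with X = 0

print(pmf_two)
print(pmf_three)
```

Exact fractions are used so the results can be compared directly with values such as 1/8 and 3/8.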
Discrete probability distribution
A p.m.f. p(x) satisfies:
(i) p(x) ≥ 0
(ii) Σp(x) = 1
where p(x) = P(X = x) = probability that the r.v. X takes
the value x.

Eg:- The probability distribution of X, the number of diseased male goats among 3 male
offspring, is given by the following p.m.f. p(x):
The above p.m.f. is represented graphically by the following figure.
[Figure: p.m.f. p(x) plotted against X = 0, 1, 2, 3]
[Table: x, p(x), F(x)]
Probability density function
P(x < X < x+dx) = f(x) dx

Cumulative distribution function (c.d.f)
F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(t) dt
Properties
(1) 0 ≤ F(x) ≤ 1
(2) F(−∞) = 0
(3) F(+∞) = 1
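The relation between a density and its c.d.f. can be illustrated numerically. The sketch below assumes a hypothetical density f(t) = 2t on [0, 1] (chosen only for illustration, it does not appear in the notes) and approximates the integral with the midpoint rule.

```python
def density(t):
    """Illustrative p.d.f. (an assumption, not from the notes): f(t) = 2t on [0, 1]."""
    return 2.0 * t if 0.0 <= t <= 1.0 else 0.0


def cdf(x, lower=-1.0, steps=20_000):
    """Approximate F(x), the integral of f(t) dt from -infinity to x, by the midpoint rule.

    The assumed density is zero below 0, so starting at `lower` loses nothing.
    """
    if x <= lower:
        return 0.0
    width = (x - lower) / steps
    return sum(density(lower + (i + 0.5) * width) for i in range(steps)) * width


for x in (-0.5, 0.5, 1.0, 3.0):
    print(x, round(cdf(x), 4))
```

For this density the exact c.d.f. is F(x) = x² on [0, 1], so the printed values should rise from 0 to 1, in line with properties (1) to (3).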
Binomial Experiment:
A binomial experiment (a sequence of Bernoulli trials) is a statistical
experiment that has the following properties:
•The experiment consists of n repeated trials.
•Each trial can result in just two possible outcomes. We call one of
these outcomes a success and the other, a failure.
•The probability of success, denoted by P, is the same on every trial.
•The trials are independent; that is, the outcome on one trial does
not affect the outcome on other trials.
Consider the following statistical experiment. You flip a coin 2 times
and count the number of times the coin lands on heads. This is a
binomial experiment because:
•The experiment consists of repeated trials. We flip a coin 2 times.
•Each trial can result in just two possible outcomes - heads or tails.
•The probability of success is constant - 0.5 on every trial.
•The trials are independent; that is, getting heads on one trial does
not affect whether we get heads on other trials.

Notation
The following notation is helpful when we talk about binomial
probability.
x: The number of successes that result from the binomial
experiment.
n: The number of trials in the binomial experiment.
P: The probability of success on an individual trial.
Q: The probability of failure on an individual trial. (This is equal to
1 - P.)
b(x; n, P): Binomial probability - the probability that an n-trial
binomial experiment results in exactly x successes, when the
probability of success on an individual trial is P.
nCr: The number of combinations of n things, taken r at a time.
Binomial Distribution
A binomial random variable is the number of successes x in n repeated
trials of a binomial experiment. The probability distribution of a
binomial random variable is called a binomial distribution (the special
case n = 1 is the Bernoulli distribution).
The binomial distribution has the following properties:
The mean of the distribution (μx) is equal to n * P.
The variance (σ²x) is n * P * ( 1 - P ).

Binomial Probability
The binomial probability refers to the probability that a binomial
experiment results in exactly x successes. For example, in the above
table, we see that the binomial probability of getting exactly one head
in two coin flips is 0.50.
Given x, n, and P, we can compute the binomial probability based on
the following formula:
Binomial Formula. Suppose a binomial experiment consists of n trials
and results in x successes. If the probability of success on an
individual trial is P, then the binomial probability is:
b(x; n, P) = nCx * P^x * (1 - P)^(n - x)
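As a quick check, the binomial probability b(x; n, P) = nCx * P^x * (1 - P)^(n - x) and the mean and variance stated above can be evaluated for the two-flip coin example. This is a minimal sketch; the function name `binomial_probability` is chosen here for illustration.

```python
from math import comb


def binomial_probability(x, n, p):
    """b(x; n, P) = nCx * P**x * (1 - P)**(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 2, 0.5  # two flips of a fair coin
pmf = [binomial_probability(x, n, p) for x in range(n + 1)]

# Mean and variance computed directly from the p.m.f.
mean = sum(x * px for x, px in enumerate(pmf))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))

print(pmf)                          # probabilities for x = 0, 1, 2
print(mean, n * p)                  # the two values should agree
print(variance, n * p * (1 - p))    # the two values should agree
```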
Cumulative Binomial Probability
A cumulative binomial probability refers to the probability that the
binomial random variable falls within a specified range (e.g., is greater
than or equal to a stated lower limit and less than or equal to a stated
upper limit).
For example, we might be interested in the cumulative binomial
probability of obtaining 45 or fewer heads in 100 tosses of a coin (see
Example 1 below). This would be the sum of all these individual
binomial probabilities.

Example
The probability that a student is accepted to a prestigious college is 0.3.
If 5 students from the same school apply, what is the probability that at
most 2 are accepted?
Solution: To solve this problem, we compute 3 individual probabilities,
using the binomial formula. The sum of all these probabilities is the
answer we seek. Thus,
b(x ≤ 2; 5, 0.3) = b(x = 0; 5, 0.3) + b(x = 1; 5, 0.3) + b(x = 2; 5, 0.3)
= 0.16807 + 0.36015 + 0.30870 = 0.83692
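A cumulative binomial probability is just a sum of individual binomial probabilities, which is easy to sketch in code. The helper names below are illustrative assumptions; the two calls correspond to the college-admission example and to the 45-or-fewer-heads-in-100-tosses case mentioned above.

```python
from math import comb


def binomial_probability(x, n, p):
    """Probability of exactly x successes in n trials with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def cumulative_binomial(lo, hi, n, p):
    """P(lo <= X <= hi) as a sum of individual binomial probabilities."""
    return sum(binomial_probability(x, n, p) for x in range(lo, hi + 1))


# At most 2 of 5 students accepted, each with probability 0.3
at_most_two = cumulative_binomial(0, 2, 5, 0.3)
print(round(at_most_two, 5))

# 45 or fewer heads in 100 tosses of a fair coin
heads_45_or_fewer = cumulative_binomial(0, 45, 100, 0.5)
print(round(heads_45_or_fewer, 4))
```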
Note:-
A trial of a random experiment that can have only
two possible outcomes (success and failure) is called a
Bernoulli trial (after the Swiss mathematician Jacob Bernoulli
(1654–1705)).
If, for a Bernoulli trial, the value 1 is assigned
to one outcome and 0 to the other, the resulting r.v. is known as a
dichotomous or Bernoulli r.v., with p.m.f.
p(x) = p^x q^(1−x), x = 0, 1; 0 < p < 1, p + q = 1
A binomial experiment consists of a sequence of n
independent and identical Bernoulli trials.

Poisson Distribution
(S.D. Poisson (1781–1840), published in 1837)
Let X be a random variable following the Poisson distribution with p.m.f.
p(x) = e^(−λ) λ^x / x!, x = 0, 1, 2, ...; λ > 0
where e = 1 + 1/1! + 1/2! + 1/3! + ... = 2.718282
The p.m.f. satisfies the following properties:
1. p(x) ≥ 0
2. Σp(x) = 1, i.e., Σ e^(−λ) λ^x / x! = 1
could be a length, an area, a volume, a period of time, etc.
The mean and the variance of the distribution are both equal to λ.
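The Poisson p.m.f. p(x) = e^(−λ) λ^x / x! can be checked numerically: the probabilities should sum to 1, and both the mean and the variance should come out equal to λ. The sketch below assumes an illustrative value λ = 2.5 (not from the notes) and truncates the infinite sums at 60 terms, where the remaining tail is negligible.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """p(x) = e**(-lam) * lam**x / x! for x = 0, 1, 2, ..."""
    return exp(-lam) * lam**x / factorial(x)


# Series for e: 1 + 1/1! + 1/2! + 1/3! + ...
e_series = sum(1 / factorial(k) for k in range(15))

lam = 2.5                   # illustrative value, an assumption
xs = range(60)              # truncation of the infinite sum
total = sum(poisson_pmf(x, lam) for x in xs)
mean = sum(x * poisson_pmf(x, lam) for x in xs)
variance = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in xs)

print(round(e_series, 6))
print(round(total, 6), round(mean, 6), round(variance, 6))
```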
Eg:-
1. No. of telephone calls received by a fire station switchboard
during a given period of time.
2. No. of defective leather footwear among the exported
footwear.
3. No. of E. coli bacteria per small volume of urine.

Q2. A fire station switchboard receives an average of 0.9 calls per
minute. Find the probability that there is
(i) No call in a minute.
(ii) Two or more calls in a minute.
(iii) What is the average no. of calls per minute?
(iv) What is the S.D. of the no. of calls received per minute?
Ans. Let X be the no. of calls per minute. Then X ∼ P(λ = 0.9)
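With X following a Poisson distribution with λ = 0.9, the four parts of Q2 follow from the p.m.f. and from the fact that the mean and variance of a Poisson variable both equal λ. A minimal sketch of the arithmetic (variable names are illustrative):

```python
from math import exp, sqrt

lam = 0.9  # average number of calls per minute

p_no_call = exp(-lam)                        # (i)  P(X = 0)
p_one_call = lam * exp(-lam)                 #      P(X = 1)
p_two_or_more = 1 - p_no_call - p_one_call   # (ii) P(X >= 2) = 1 - P(X = 0) - P(X = 1)
mean_calls = lam                             # (iii) mean of a Poisson variable is lam
sd_calls = sqrt(lam)                         # (iv) S.D. is the square root of the variance lam

print(round(p_no_call, 4))
print(round(p_two_or_more, 4))
print(mean_calls)
print(round(sd_calls, 4))
```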
Normal Distribution
Normal distribution represents the distributions of
continuous variables like height, weight, BP, experimental
errors in scientific measurements etc. and this distribution
is often used to approximate other distributions.
It provides a reasonable approximation to the distribution
of many different variables in the field of science and
engineering, business and commerce, biology, education,
sociology, agriculture etc.
An important use of normal distribution is in
approximating discrete distributions, like binomial
distribution for large values of ‘n’. It plays an important
role in statistical inference and in statistical quality
control.
A continuous random variable X is said to have a
Normal distribution with parameters µ and σ, if its
p.d.f is given by
f(x) = [1 / (σ √(2π))] e^(−(x − µ)² / (2σ²)), −∞ < x < +∞,
where −∞ < µ < +∞, σ > 0.
Notationally, X ∼ N(µ, σ²). Since f(x) is a p.d.f, it is to be noted that
the total area under the curve is 1, i.e., ∫_{−∞}^{+∞} f(x) dx = 1.

[Figure: Probability Distribution]
Normal distribution curve is symmetric and bell shaped. Any
change in the parameter µ merely shifts the curve to the left or
right, but any change in the parameter σ changes the shape
of the curve, i.e., as σ increases the curve becomes more and
more flat. The mean µ describes where the corresponding curve
is centered, and the S.D. σ describes how much the curve spreads out
around the center.
Cumulative distribution function (c.d.f)
F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(t) dt
[Figure: Cumulative Probability Distribution, F(x) plotted against x from −∞ to +∞]
Note:- (i) F(µ) = 1/2
(ii) F(∞) = 1

Properties.
(i) Normal distribution is bell shaped and is symmetric about its mean µ.
(ii) For a normal distribution mean = median = mode.
(iii) The points of inflection of a normal density curve are at x = µ ± σ.
(iv) The ordinate of a normal density is maximum at x = µ and the maximum
ordinate is 1 / (σ √(2π)).
(v) Almost all values of a normal variable lie in (µ−3σ, µ+3σ).
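Several of the stated properties of the normal distribution can be checked numerically with Python's standard-library `statistics.NormalDist`. The parameter values µ = 50 and σ = 4 below are illustrative assumptions, not taken from the notes.

```python
from math import pi, sqrt
from statistics import NormalDist

mu, sigma = 50.0, 4.0            # illustrative parameters, an assumption
dist = NormalDist(mu, sigma)

# Maximum ordinate at x = mu should equal 1 / (sigma * sqrt(2 * pi))
peak = dist.pdf(mu)
expected_peak = 1 / (sigma * sqrt(2 * pi))

# Symmetry about the mean: equal density at equal distances on either side
left, right = dist.pdf(mu - 3.0), dist.pdf(mu + 3.0)

# F(mu) = 1/2, and almost all of the probability lies within 3 sigma of the mean
f_mu = dist.cdf(mu)
within_3sd = dist.cdf(mu + 3 * sigma) - dist.cdf(mu - 3 * sigma)

print(peak, expected_peak)
print(left, right)
print(f_mu, round(within_3sd, 4))
```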
φ(z) = [1 / √(2π)] e^(−z²/2), −∞ < z < +∞
[Figure: standard normal curve, centered at z = 0]
Standard normal distribution possesses the same properties as the general
normal distribution does.
Ans. X ∼ N(µ = 120, σ = 10); we know that Z = (X − µ)/σ ∼ N(0, 1).
In the steps below, Φ(z) = P(Z < z) denotes the standard normal c.d.f.
(i) P(X < 90) = P((X − µ)/σ < (90 − µ)/σ) = P(Z < (90 − 120)/10)
= P(Z < −3) = Φ(−3) = 1 − Φ(3)
= 1 − 0.9987 = 0.0013
(ii) P(X > 140) = P(Z > (140 − 120)/10) = P(Z > 2)
= 1 − P(Z < 2) = 1 − Φ(2)
= 1 − 0.9772 = 0.0228
(iii) P(110 < X < 140) = P((110 − 120)/10 < Z < (140 − 120)/10)
= P(−1 < Z < 2) = Φ(2) − Φ(−1)
= Φ(2) − (1 − Φ(1))
= 0.9772 − 1 + 0.8413 = 0.8185
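The three probabilities for X ∼ N(µ = 120, σ = 10) can be cross-checked without tables. The sketch below uses the standard-library `statistics.NormalDist`, which handles the standardisation Z = (X − µ)/σ internally; small differences from the table-based answers come from rounding in four-figure tables.

```python
from statistics import NormalDist

X = NormalDist(mu=120, sigma=10)

p_below_90 = X.cdf(90)                  # (i)   P(X < 90)  = P(Z < -3)
p_above_140 = 1 - X.cdf(140)            # (ii)  P(X > 140) = 1 - P(Z < 2)
p_between = X.cdf(140) - X.cdf(110)     # (iii) P(110 < X < 140) = P(-1 < Z < 2)

print(round(p_below_90, 4))
print(round(p_above_140, 4))
print(round(p_between, 4))
```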
Sampling Distributions
Student's t Distribution
The t distribution (Student’s t-distribution) is a probability
distribution that is used to estimate population parameters when the
sample size is small and/or when the population variance is
unknown.

Why Use the t Distribution?
According to the central limit theorem, the sampling distribution of a
statistic (like a sample mean) will follow a normal distribution, as long
as the sample size is sufficiently large. Therefore, when we know the
standard deviation of the population, we can compute a z-score, and
use the normal distribution to evaluate probabilities with the sample
mean.
But sample sizes are sometimes small, and often we do not know the
standard deviation of the population. When either of these
problems occurs, statisticians rely on the distribution of the t statistic
(also known as the t score), whose values are given by:
t = [ x̄ - μ ] / [ s / sqrt( n ) ]
where x̄ is the sample mean, μ is the population mean, s is the
standard deviation of the sample, and n is the sample size. The
distribution of the t statistic is called the t distribution or the
Student t distribution.
The t distribution allows us to conduct statistical analyses on certain
data sets that are not appropriate for analysis using the normal
distribution.
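The t statistic t = (sample mean − μ) / (s / sqrt(n)) is straightforward to compute from raw data. The sketch below assumes a small hypothetical sample and a hypothetical population mean, both chosen only to keep the arithmetic easy to follow.

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(sample, mu):
    """t = (sample mean - mu) / (s / sqrt(n)), where s is the sample standard deviation."""
    n = len(sample)
    return (mean(sample) - mu) / (stdev(sample) / sqrt(n))


sample = [1, 2, 3, 4, 5]    # hypothetical data
mu = 2                      # hypothetical population mean
t_value = t_statistic(sample, mu)
degrees_of_freedom = len(sample) - 1

print(round(t_value, 4), degrees_of_freedom)
```

Here the sample mean is 3 and the sample variance is 2.5, so the standard error is sqrt(2.5 / 5) = sqrt(0.5) and t = 1 / sqrt(0.5) = sqrt(2). Note that `statistics.stdev` uses the n − 1 divisor, which is the sample standard deviation the formula calls for.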
The variance is equal to v / ( v - 2 ), where v is the degrees of
freedom (see last section) and v > 2.
The variance is always greater than 1, although it is close to 1 when
there are many degrees of freedom. With infinite degrees of
freedom, the t distribution is the same as the standard normal
distribution.

The sample size is greater than 40, without outliers.
The t distribution should not be used with small samples from
populations that are not approximately normal.
Probability and the Student t Distribution
When a sample of size n is drawn from a
population having a normal (or nearly normal)
distribution, the sample mean can be
transformed into a t score, using the equation
presented at the beginning of this lesson. We
repeat that equation below:
t = [ x̄ - μ ] / [ s / sqrt( n ) ]
where x̄ is the sample mean, μ is the population
mean, s is the standard deviation of the sample,
n is the sample size, and degrees of freedom are
equal to n - 1.

Chi-Square Distribution
The distribution of the chi-square statistic is called the chi-square
distribution. In this lesson, we learn to compute the chi-square
statistic and find the probability associated with the statistic.
The Chi-Square Statistic
Suppose we conduct the following statistical experiment.
We select a random sample of size n from a normal population,
having a standard deviation equal to σ. We find that the standard
deviation in our sample is equal to s.
Given these data, we can define a statistic, called chi-square, using
the following equation:
Χ² = [ ( n - 1 ) * s² ] / σ²
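The chi-square statistic Χ² = (n − 1) s² / σ² can be computed from a sample once the population standard deviation σ is given. The sample and the value of σ below are hypothetical, chosen only to illustrate the calculation.

```python
from statistics import stdev


def chi_square_statistic(sample, sigma):
    """Chi-square = (n - 1) * s**2 / sigma**2, with s the sample standard deviation."""
    n = len(sample)
    s = stdev(sample)
    return (n - 1) * s**2 / sigma**2


sample = [1, 2, 3, 4, 5]    # hypothetical data, sample variance s**2 = 2.5
sigma = 2.0                 # hypothetical population standard deviation
chi_square = chi_square_statistic(sample, sigma)

print(round(chi_square, 4))
```

With n = 5, s² = 2.5 and σ² = 4, the statistic is 4 * 2.5 / 4 = 2.5.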
With n denoting the number of degrees of freedom, the p.d.f. is
pdf = [ 1 / ( 2^(n/2) Γ(n/2) ) ] (χ²)^(n/2 − 1) e^(−χ²/2) ; 0 < χ² < ∞
where e is a constant equal to the base of the natural
logarithm system (approximately 2.71828).
The mean of the distribution is equal to the number of degrees of
freedom: µ = n.
The variance is equal to two times the number of degrees of
freedom: σ² = 2 * n
When the degrees of freedom are greater than or equal to 2, the
maximum value of the p.d.f. occurs when Χ² = n - 2.
As the degrees of freedom increase, the chi-square curve approaches
a normal distribution.
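The stated mean, variance and location of the maximum can be checked numerically against the standard chi-square density, (χ²)^(n/2 − 1) e^(−χ²/2) / (2^(n/2) Γ(n/2)). The sketch below assumes n = 6 degrees of freedom (an illustrative choice) and integrates with the midpoint rule on (0, 100), beyond which the density is negligible for this n.

```python
from math import exp, gamma


def chi_square_pdf(x, n):
    """Chi-square density with n degrees of freedom (zero for x <= 0)."""
    if x <= 0:
        return 0.0
    return x ** (n / 2 - 1) * exp(-x / 2) / (2 ** (n / 2) * gamma(n / 2))


n = 6                               # degrees of freedom, illustrative choice
upper, steps = 100.0, 100_000
step = upper / steps
grid = [(i + 0.5) * step for i in range(steps)]   # midpoints of the subintervals
dens = [chi_square_pdf(x, n) for x in grid]

total = sum(dens) * step                                   # should be close to 1
mean = sum(x * d for x, d in zip(grid, dens)) * step       # should be close to n
variance = sum((x - mean) ** 2 * d for x, d in zip(grid, dens)) * step  # close to 2n
mode = grid[dens.index(max(dens))]                         # should be close to n - 2

print(round(total, 4), round(mean, 3), round(variance, 3), round(mode, 2))
```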
F Distribution
The F distribution is the probability distribution
associated with the f statistic. In this lesson, we show
how to compute an f statistic and how to find
probabilities associated with specific f statistic values.
The f Statistic
The f statistic, also known as an f value, is a random
variable that has an F distribution.
The steps required to compute an f statistic:
•Select a random sample of size n1 from a
normal population, having a standard
deviation equal to σ1.
•Select an independent random sample of size
n2 from a normal population, having a
standard deviation equal to σ2.
•The f statistic is the ratio of s1²/σ1² and s2²/σ2².
The following equivalent equations are commonly used to compute an f
statistic:
f = [ s1²/σ1² ] / [ s2²/σ2² ]
f = [ s1² * σ2² ] / [ s2² * σ1² ]
f = [ Χ1² / v1 ] / [ Χ2² / v2 ]
f = [ Χ1² * v2 ] / [ Χ2² * v1 ]
where σ1 is the standard deviation of population 1, s1 is the standard
deviation of the sample drawn from population 1, σ2 is the standard
deviation of population 2, s2 is the standard deviation of the sample drawn
from population 2, Χ1² is the chi-square statistic for the sample drawn
from population 1, v1 is the degrees of freedom for Χ1², Χ2² is the chi-
square statistic for the sample drawn from population 2, and v2 is the
degrees of freedom for Χ2². Note that degrees of freedom v1 = n1 - 1, and
degrees of freedom v2 = n2 - 1.

The F Distribution
The distribution of all possible values of the f statistic is called an F
distribution, with v1 = n1 - 1 and v2 = n2 - 1 degrees of freedom.
The curve of the F distribution depends on the degrees of freedom, v1
and v2.
When describing an F distribution, the number of degrees of freedom
associated with the standard deviation in the numerator of the f statistic
is always stated first.
Thus, f(5, 9) would refer to an F distribution with v1 = 5 and v2 = 9
degrees of freedom.
The F distribution has the following properties:
The mean of the distribution is equal to v2 / ( v2 - 2 ) for v2 > 2.
The variance is equal to [ 2 * v2² * ( v1 + v2 - 2 ) ] / [ v1 * ( v2 - 2 )² *
( v2 - 4 ) ] for v2 > 4.
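The f statistic and the moments of the F distribution can be sketched as small functions. The sample and population standard deviations below are hypothetical. The variance function uses the standard formula 2 v2² (v1 + v2 − 2) / [ v1 (v2 − 2)² (v2 − 4) ], with v1 + v2 − 2 in the numerator.

```python
def f_statistic(s1, sigma1, s2, sigma2):
    """f = (s1**2 / sigma1**2) / (s2**2 / sigma2**2)."""
    return (s1**2 / sigma1**2) / (s2**2 / sigma2**2)


def f_mean(v2):
    """Mean of the F distribution, defined for v2 > 2."""
    return v2 / (v2 - 2)


def f_variance(v1, v2):
    """Variance of the F distribution, defined for v2 > 4."""
    return 2 * v2**2 * (v1 + v2 - 2) / (v1 * (v2 - 2) ** 2 * (v2 - 4))


# Hypothetical standard deviations: samples 3 and 2, both populations 2
f_value = f_statistic(s1=3.0, sigma1=2.0, s2=2.0, sigma2=2.0)
# Equivalent form: s1**2 * sigma2**2 / (s2**2 * sigma1**2)
f_value_alt = (3.0**2 * 2.0**2) / (2.0**2 * 2.0**2)

print(f_value, f_value_alt)
print(round(f_mean(9), 4), round(f_variance(5, 9), 4))   # F distribution with v1 = 5, v2 = 9
```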