Pattern - Recognition - Module - 2 Notes

The document discusses several discrete and continuous probability distributions including binomial, Poisson, geometric, and uniform distributions. Key aspects such as probability mass functions, cumulative distribution functions, means, variances, and examples are provided for each distribution. Parameters like scale, shape, and location that characterize continuous distributions are also described.

Uploaded by

tejashreddy29
Copyright
© © All Rights Reserved

3 Probability Distributions

3.1 Some Discrete Probability Distributions


3.1.1 Binomial Distribution
The binomial distribution is one of the most important discrete probability distributions because of its
applications in several contexts. A random variable X is said to follow a binomial distribution when
• Each trial has only two possible outcomes, success and failure (such trials are known as Bernoulli
trials).
• The objective is to find the probability of getting k successes out of n trials.
• The probability of success is p and thus the probability of failure is (1 − p).
• The probability p is constant and does not change between trials.
Success and failure are generic terminologies used in binomial distribution; based on the context, the
interpretation will change (winning a lottery can be considered as success and not winning as failure).

Definition 28. The probability mass function (PMF) of the binomial distribution is

\[ P(X = k) = \binom{n}{k} p^{k} (1 - p)^{n-k}, \qquad k = 0, 1, \ldots, n, \]

where

\[ \binom{n}{k} = \frac{n!}{k!\,(n - k)!}. \]

Definition 29. The cumulative distribution function (CDF) of the binomial distribution is

\[ P(X \le x) = \sum_{k=0}^{x} \binom{n}{k} p^{k} (1 - p)^{n-k}. \]

Definition 30. The mean of the binomial distribution is

\[ \mu = E(X) = np. \]

Definition 31. The variance of the binomial distribution is given by

\[ \sigma^2 = \mathrm{Var}(X) = np(1 - p). \]

Definition 32. The standard deviation of the binomial distribution is given by

\[ \sigma = \sqrt{np(1 - p)}. \]

Example: The probability that a certain kind of component will survive a shock test is 3/4. Find
the probability that exactly 2 of the next 4 components tested survive.
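Solution: Assuming that the tests are independent and p = 3/4 for each of the 4 tests, we obtain

\[ P(X = 2) = \binom{4}{2} \left(\frac{3}{4}\right)^{2} \left(\frac{1}{4}\right)^{2} = \frac{27}{128} \approx 0.2109. \]
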
Example: The probability that a patient recovers from a rare blood disease is 0.4. If 15 people are
known to have contracted this disease, what is the probability that

42 Inferential Statistics (4CSDS2011), Lecture Review, Dr. Gourish Goudar. (Homepage).


(a) at least 10 survive,
(b) from 3 to 8 survive, and
(c) exactly 5 survive?
Solution: Let X be the number of people who survive.
Example: It is conjectured that an impurity exists in 30% of all drinking wells in a certain rural
community. In order to gain some insight into the true extent of the problem, it is determined that
some testing is necessary. It is too expensive to test all of the wells in the area, so 10 are randomly
selected for testing.
(a) Using the binomial distribution, what is the probability that exactly 3 wells have the impurity,
assuming that the conjecture is correct?
(b) What is the probability that more than 3 wells are impure?
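
The three binomial examples above can be checked numerically. The following is a minimal sketch, assuming Python with only the standard library; the helper names `binom_pmf` and `binom_cdf` are illustrative and do not come from the notes. It applies Definitions 28 and 29 directly.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p), as in Definition 28."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(x: int, n: int, p: float) -> float:
    """P(X <= x): the PMF summed from k = 0 to x, as in Definition 29."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))


# Shock test: exactly 2 of 4 components survive, p = 3/4.
shock_exactly_2 = binom_pmf(2, 4, 0.75)

# Blood disease: n = 15 patients, recovery probability p = 0.4.
at_least_10 = 1 - binom_cdf(9, 15, 0.4)
from_3_to_8 = binom_cdf(8, 15, 0.4) - binom_cdf(2, 15, 0.4)
exactly_5 = binom_pmf(5, 15, 0.4)

# Wells: n = 10 wells tested, impurity probability p = 0.3.
exactly_3_wells = binom_pmf(3, 10, 0.3)
more_than_3_wells = 1 - binom_cdf(3, 10, 0.3)

for name, value in [
    ("P(exactly 2 of 4 survive)", shock_exactly_2),
    ("P(at least 10 recover)", at_least_10),
    ("P(3 to 8 recover)", from_3_to_8),
    ("P(exactly 5 recover)", exactly_5),
    ("P(exactly 3 impure wells)", exactly_3_wells),
    ("P(more than 3 impure wells)", more_than_3_wells),
]:
    print(f"{name}: {value:.4f}")
```

Note that "at least 10" is computed as the complement of "at most 9", and "from 3 to 8" as the difference of two CDF values, which is how such ranges are usually read from binomial tables.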

3.1.2 Poisson Distribution


Experiments yielding numerical values of a random variable X, the number of outcomes occurring
during a given time interval or in a specified region, are called Poisson experiments. The given time
interval may be of any length, such as a minute, a day, a week, a month, or even a year. For example,
a Poisson experiment can generate observations for the random variable X representing the number
of telephone calls received per hour by an office, the number of days school is closed due to snow
during the winter, or the number of games postponed due to rain during a baseball season. The
specified region could be a line segment, an area, a volume, or perhaps a piece of material. In such
instances, X might represent the number of field mice per acre, the number of bacteria in a given
culture, or the number of typing errors per page. A Poisson experiment is derived from the Poisson
process and possesses the following properties.

• The number of outcomes occurring in one time interval or specified region of space is independent
of the number that occur in any other disjoint time interval or region. In this sense we
say that the Poisson process has no memory.
• The probability that a single outcome will occur during a very short time interval or in a small
region is proportional to the length of the time interval or the size of the region and does not
depend on the number of outcomes occurring outside this time interval or region.
• The probability that more than one outcome will occur in such a short time interval or fall in
such a small region is negligible.
Definition 33. The probability mass function of the Poisson random variable X is

\[ P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!}, \qquad x = 0, 1, 2, 3, \ldots \]

where λ is the average number of outcomes per unit time, distance, area, or volume and e = 2.71828...
Definition 34. The cumulative distribution function (CDF) of the Poisson distribution is

\[ P(X \le x) = \sum_{k=0}^{x} \frac{e^{-\lambda} \lambda^{k}}{k!}. \]

Definition 35. The mean of the Poisson distribution is

\[ \mu = E(X) = \lambda. \]

Definition 36. The variance of the Poisson distribution is given by

\[ \sigma^2 = \mathrm{Var}(X) = \lambda. \]

Definition 37. The standard deviation of the Poisson distribution is given by

\[ \sigma = \sqrt{\lambda}. \]

Example: On average, 20 customers per day cancel their order placed at Fashion Trends Online.
Calculate the probability that the number of cancellations on a day is exactly 20 and the probability
that there are at most 25 cancellations on a day.

Solution:

Example: The number of calls arriving at a call center follows a Poisson distribution at 10 per hour.
Calculate the probability that the number of calls over a 3-hour period will exceed 30.

Solution:

3.1.3 Geometric Distribution


In many contexts, we may like to predict the number of failures before the occurrence of success in
a Bernoulli trial. For example, in a repeat purchase such as grocery purchase, the grocery store may
like to predict gaps between purchase of a specific product by the customer.
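Definition 38. The probability mass function of the geometric distribution is

\[ P(X = x) = (1 - p)^{x-1} p, \qquad x = 1, 2, 3, \ldots \]

where p is the success probability and X is the number of the trial on which the first success
occurs (so that x − 1 failures precede it).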


Definition 39. The cumulative distribution function (CDF) of the geometric distribution is

\[ P(X \le x) = 1 - (1 - p)^{x}. \]

Definition 40. The mean of the geometric distribution is

\[ \mu = E(X) = \frac{1}{p}. \]

Definition 41. The variance of the geometric distribution is given by

\[ \sigma^2 = \mathrm{Var}(X) = \frac{1 - p}{p^{2}}. \]

Example: Local Dhaniawala (LD) is an online grocery store and has an innovative feature which
predicts whether the customer has forgotten to buy an item which is very common among customers
of grocery items. The probability that a customer buys milk in each shopping visit is 0.2.
(a) Calculate the probability that the customer’s first purchase of milk happens during the 5th visit.
(b) Calculate the average time between purchases of milk.
(c) If a customer has not purchased milk during the past 3 shopping visits, what is the probability
that the customer will not buy milk for another 2 visits?

Solution:
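
The milk-purchase example can be sketched as follows, assuming Python and the convention of Definition 38 (X is the visit on which the first purchase occurs). The function names are illustrative. Part (c) uses the memoryless property of the geometric distribution: given no purchase in the past 3 visits, the chance of no purchase in the next 2 is the same as the unconditional chance of no purchase in 2 visits.

```python
def geom_pmf(x: int, p: float) -> float:
    """P(X = x): first success on trial x, as in Definition 38."""
    return (1 - p) ** (x - 1) * p


def geom_cdf(x: int, p: float) -> float:
    """P(X <= x): first success on or before trial x, as in Definition 39."""
    return 1 - (1 - p) ** x


p_milk = 0.2  # probability of buying milk on any one visit

# (a) First purchase of milk happens on the 5th visit.
first_on_5th = geom_pmf(5, p_milk)

# (b) Average number of visits between purchases (Definition 40).
mean_visits = 1 / p_milk

# (c) P(X > 5 | X > 3), computed from the definition of conditional probability.
no_milk_next_2 = (1 - geom_cdf(5, p_milk)) / (1 - geom_cdf(3, p_milk))

print(f"(a) {first_on_5th:.5f}")
print(f"(b) {mean_visits:.1f} visits")
print(f"(c) {no_milk_next_2:.2f}")
```

The conditional probability in (c) equals (1 − p)², which is what the memoryless property predicts.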

3.2 Some Continuous Probability Distributions


3.2.1 Parameters of Continuous Distributions

Figure 9: (a) Scale, (b) Shape, (c) Location.


• Scale parameter: The scale parameter defines the range of the continuous distribution. The
larger the scale parameter value, the larger the spread of the distribution.

• Shape parameter: Shape parameter defines the shape of the probability distribution. The
changes to the value of shape parameter will change the shape of the distribution.
• Location parameter: Location parameter locates (or shifts) the distribution on the horizontal
axis.

3.2.2 Uniform Distribution


A uniform distribution assigns constant probability density to every value in an interval, so that
all values in the interval are equally likely. It is also known as the rectangular distribution
(continuous uniform distribution). It has two parameters a and b: a = minimum and b = maximum.
The distribution is written as U(a, b).
Definition 42. The probability density function (PDF) of the uniform distribution is

\[ f(x) = \begin{cases} \dfrac{1}{b - a}, & a \le x \le b, \\ 0, & x < a \text{ or } x > b. \end{cases} \]

Definition 43. The cumulative distribution function (CDF) of the uniform distribution is

\[ F(x) = \begin{cases} 0, & x < a, \\ \dfrac{x - a}{b - a}, & a \le x < b, \\ 1, & x \ge b. \end{cases} \]

Definition 44. The mean of the uniform distribution is

\[ \mu = E(X) = \frac{1}{2}(a + b). \]

Definition 45. The variance of the uniform distribution is given by

\[ \sigma^2 = \mathrm{Var}(X) = \frac{1}{12}(b - a)^{2}. \]

Example: A random variable X is uniformly distributed over (0, 20). Using the probability density
function of the uniform distribution, find P (3 < X < 16).

Solution:
Here, a = 0, b =20

f(x) = 1/(20 – 0) = 1/20

P (3 < X < 16) = (16 – 3) * (1/20) = 13/20.

Example: A random variable X has a uniform distribution over (-5, 6). Find the value of the
cumulative distribution function at x = 3.

Solution:
Here, a = -5, b = 6, x = 3

CDF = (3 – (-5))/(6 – (-5)) = 8/11
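
Both uniform examples follow from the CDF in Definition 43. A minimal sketch in Python, using exact fractions so the results can be compared directly with 13/20 and 8/11; the name `uniform_cdf` is illustrative, and integer arguments are assumed so that `Fraction` can be used.

```python
from fractions import Fraction


def uniform_cdf(x: int, a: int, b: int) -> Fraction:
    """F(x) for X ~ U(a, b), following the three cases of Definition 43."""
    if x < a:
        return Fraction(0)
    if x >= b:
        return Fraction(1)
    return Fraction(x - a, b - a)


# P(3 < X < 16) for X ~ U(0, 20) is a difference of two CDF values.
p_3_to_16 = uniform_cdf(16, 0, 20) - uniform_cdf(3, 0, 20)

# CDF at x = 3 for X ~ U(-5, 6).
cdf_at_3 = uniform_cdf(3, -5, 6)

print(p_3_to_16, cdf_at_3)
```

For a continuous distribution the strict and non-strict inequalities give the same probability, so P(3 < X < 16) = F(16) − F(3).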

3.2.3 Exponential Distribution
Exponential distribution is a single parameter continuous distribution that is traditionally used for
modelling time to failure of electronic components.

Definition 46. A continuous random variable X is said to have an exponential distribution with
parameter λ > 0, if its PDF is given by

\[ f(x) = \begin{cases} \lambda e^{-\lambda x}, & x > 0, \\ 0, & \text{otherwise.} \end{cases} \]

Definition 47. The cumulative distribution function (CDF) of the exponential distribution is

\[ F(x) = 1 - e^{-\lambda x}, \qquad x \ge 0. \]


Definition 48. The mean of the exponential distribution is

\[ \mu = E(X) = \frac{1}{\lambda}. \]

Definition 49. The variance of an exponential distribution is given by

\[ \sigma^2 = \mathrm{Var}(X) = \frac{1}{\lambda^{2}}. \]

Example: The time to failure of an avionic system follows an exponential distribution with a mean
time between failures (MTBF) of 1000 hours.
(a) Calculate the probability that the system will fail before 1000 hours.
(b) Calculate the probability that it will not fail up to 2000 hours.
(c) Calculate the time by which 10% of the systems will fail (that is, calculate P10 life).

Solution:
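
The avionic-system example can be sketched numerically as below, assuming Python's standard library. Since the mean of an exponential distribution is 1/λ (Definition 48), an MTBF of 1000 hours gives λ = 1/1000 per hour. The P10 life in part (c) comes from solving F(t) = 0.10 for t, which gives t = −ln(0.9)/λ. The names used are illustrative.

```python
from math import exp, log


def expon_cdf(x: float, lam: float) -> float:
    """F(x) = 1 - exp(-lam * x) for x >= 0, and 0 otherwise (Definition 47)."""
    return 1 - exp(-lam * x) if x >= 0 else 0.0


mtbf_hours = 1000.0
lam = 1 / mtbf_hours  # rate parameter, from mean = 1 / lam

# (a) Probability of failure before 1000 hours.
fail_before_1000 = expon_cdf(1000, lam)

# (b) Probability of no failure up to 2000 hours.
survive_2000 = 1 - expon_cdf(2000, lam)

# (c) P10 life: time t with F(t) = 0.10, i.e. t = -ln(1 - 0.10) / lam.
p10_life = -log(1 - 0.10) / lam

print(f"(a) {fail_before_1000:.4f}")
print(f"(b) {survive_2000:.4f}")
print(f"(c) {p10_life:.2f} hours")
```

Part (a) shows that the probability of failing before the mean time is 1 − e⁻¹, which is well above one half because the exponential distribution is right-skewed.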

3.2.4 Normal Distribution
The most important continuous probability distribution in the entire field of statistics is the normal
distribution (also known as Gaussian distribution). Its graph, called the normal curve, is the bell-
shaped curve of Figure 10, which approximately describes many phenomena that occur in nature,
industry, and research. For example, physical measurements in areas such as meteorological
experiments, rainfall studies, and measurements of manufactured parts are often more than adequately
explained with a normal distribution. In addition, errors in scientific measurements are extremely
well approximated by a normal distribution.
Definition 50. For a normal distribution of a random variable X with mean µ and
variance σ², the probability density f(x) is given by

\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x - \mu)^{2}}{2\sigma^{2}}}, \qquad -\infty < x < \infty. \]

Note: No closed-form solution exists for the cumulative distribution function of a normal
distribution. Its values are therefore found using numerical approximations.

Properties of Normal Distribution


• Theoretical normal density functions are defined between −∞ and ∞.

Figure 10: A normal curve.

• It is a two parameter distribution, where the parameter µ is the mean (location parameter)
and the parameter σ is the standard deviation (scale parameter).
• All normal distributions have symmetrical bell shape around mean µ (thus it is also median).
µ is also the mode of the normal distribution, that is, µ is the mean, median as well as the
mode.

• Any linear transformation of a normal random variable is also normal random variable. That
is, if X is a normal random variable, then the linear transformation AX + B (where A and B
are two constants) is also a normal random variable.
• If $X_1$ and $X_2$ are two independent normal random variables with means $\mu_1$ and $\mu_2$ and
variances $\sigma_1^2$ and $\sigma_2^2$, respectively, then $X_1 + X_2$ is also normally distributed,
with mean $\mu_1 + \mu_2$ and variance $\sigma_1^2 + \sigma_2^2$.
• The sampling distribution of mean values of a large sample drawn from a population of any
distribution is likely to follow a normal distribution.
Standard Normal Variable

Every normal distribution can be converted to the standard normal distribution by turning the
individual values into z-scores.
Definition 51. A normal random variable with mean µ = 0 and standard deviation σ = 1 is called the
standard normal variable (distribution) and is usually represented by Z. The probability density
function of a standard normal variable is given by

\[ f(z) = \frac{1}{\sqrt{2\pi}} \, e^{-\frac{z^{2}}{2}}. \]

By using the following transformation, any normal random variable X can be converted into a
standard normal variable:

\[ Z = \frac{X - \mu}{\sigma}. \]

The random variable X can be written in the form of a standard normal random variable using the
relationship
X = µ + σZ
Definition 52. The CDF of a standard normal distribution is given by

\[ F(z) = P(Z \le z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-\frac{x^{2}}{2}} \, dx \approx \frac{e^{2kz}}{1 + e^{2kz}}, \qquad \text{where } k = \sqrt{\frac{2}{\pi}}. \]

Example 1: According to a survey on use of smart phones in India, the smart phone users spend
68 minutes in a day on average in sending messages and the corresponding standard deviation is 12
minutes. Assume that the time spent in sending messages follows a normal distribution.
(a) What proportion of the smart phone users are spending more than 90 minutes in sending
messages daily?
(b) What proportion of customers are spending less than 20 minutes?
(c) What proportion of customers are spending between 50 minutes and 100 minutes?
Solution:

(b) Proportion of customers spending less than 20 minutes is $P(X \le 20) = F(20) = 3.1671 \times 10^{-5}$.
(c) Proportion of customers spending between 50 and 100 minutes is given by
$P(50 \le X \le 100) = F(100) - F(50) = 0.9293$.
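
All three parts of Example 1 can be computed with `statistics.NormalDist` from Python's standard library (available from Python 3.8). The sketch below is one possible way to do it, not the method used in the notes; the variable names are illustrative. It also evaluates the logistic approximation from Definition 52 for part (a), after standardising 90 minutes to a z-score, so that the two can be compared.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

# Time spent sending messages: mean 68 minutes, standard deviation 12 minutes.
usage = NormalDist(mu=68, sigma=12)

# (a) More than 90 minutes, (b) less than 20 minutes, (c) between 50 and 100.
more_than_90 = 1 - usage.cdf(90)
less_than_20 = usage.cdf(20)
between_50_and_100 = usage.cdf(100) - usage.cdf(50)


def logistic_cdf_approx(z: float) -> float:
    """Logistic approximation to the standard normal CDF (Definition 52)."""
    k = sqrt(2 / pi)
    return exp(2 * k * z) / (1 + exp(2 * k * z))


# Part (a) again, via the z-score and the approximation.
z_90 = (90 - 68) / 12
approx_more_than_90 = 1 - logistic_cdf_approx(z_90)

print(f"(a) exact: {more_than_90:.4f}, logistic approximation: {approx_more_than_90:.4f}")
print(f"(b) {less_than_20:.4e}")
print(f"(c) {between_50_and_100:.4f}")
```

The logistic form is only a rough approximation, so its value for part (a) should not be expected to match the exact CDF to several decimal places, particularly in the tails.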

3.2.5 Chi-square Distribution


The chi-square distribution with k degrees of freedom [denoted as the $\chi^2(k)$ distribution] is a non-parametric
distribution which is obtained by adding the squares of k independent standard normal random variables.
Chi-square distribution plays a pivotal role in analytics since it is used in many hypothesis tests
and forms the basis for the classification tree technique chi-square automatic interaction detection
(CHAID). In sentiment analysis, chi-square test of independence is used to identify the features
(words) that can be used for classifying documents into various classes such as positive, negative,
and neutral. In logistic regression model, chi-square test is used to find the statistical significance of
independent variables (predictor variables). Consider a normal random variable $X_1$ with mean $\mu_1$
and standard deviation $\sigma_1$. Then we can define $Z_1$ (the standard normal random variable) as

\[ Z_1 = \frac{X_1 - \mu_1}{\sigma_1}. \]

Then

\[ Z_1^2 = \left( \frac{X_1 - \mu_1}{\sigma_1} \right)^{2} \]

is a chi-square distribution with one degree of freedom [$\chi^2(1)$]. Let $X_2$ be a normal random variable,
independent of $X_1$, with mean $\mu_2$ and standard deviation $\sigma_2$, and let $Z_2$ be the corresponding
standard normal variable. Then the random variable $Z_1^2 + Z_2^2$ given by

\[ Z_1^2 + Z_2^2 = \left( \frac{X_1 - \mu_1}{\sigma_1} \right)^{2} + \left( \frac{X_2 - \mu_2}{\sigma_2} \right)^{2} \]

is a chi-square distribution with 2 degrees of freedom.


Definition 53. A chi-square distribution with k degrees of freedom is given by the sum of squares of
standard normal random variables $Z_1, Z_2, \ldots, Z_k$ obtained by transforming normal random variables
$X_1, X_2, \ldots, X_k$ with mean values $\mu_1, \mu_2, \ldots, \mu_k$ and corresponding standard deviations
$\sigma_1, \sigma_2, \ldots, \sigma_k$. That is,

\[ \chi^2(k) = Z_1^2 + Z_2^2 + \cdots + Z_k^2 = \left( \frac{X_1 - \mu_1}{\sigma_1} \right)^{2} + \left( \frac{X_2 - \mu_2}{\sigma_2} \right)^{2} + \cdots + \left( \frac{X_k - \mu_k}{\sigma_k} \right)^{2}. \]

Properties of Chi-Square Distribution:


• The mean and standard deviation of a chi-square distribution are k and $\sqrt{2k}$, respectively,
where k is the degrees of freedom.

• As the degrees of freedom k increases, the probability density function of a chi-square
distribution approaches that of a normal distribution.
• The chi-square goodness of fit test is one of the popular tests for checking whether data follow
a specific probability distribution.
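
The construction in Definition 53 can be illustrated by simulation. The sketch below, assuming Python's standard library, draws chi-square values as sums of k squared standard normal draws and compares the sample mean and standard deviation with the theoretical values: mean k and standard deviation equal to the square root of 2k (the variance is 2k). The choice k = 5, the sample size, and the seed are arbitrary illustrative assumptions.

```python
import random
from math import sqrt
from statistics import mean, stdev


def chi_square_sample(k: int, rng: random.Random) -> float:
    """One draw from chi-square(k): the sum of k squared standard normal draws."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


rng = random.Random(0)  # fixed seed so the run is reproducible
k = 5                   # degrees of freedom (illustrative choice)
samples = [chi_square_sample(k, rng) for _ in range(20_000)]

sample_mean = mean(samples)
sample_sd = stdev(samples)

print(f"sample mean {sample_mean:.3f} vs theoretical {k}")
print(f"sample sd   {sample_sd:.3f} vs theoretical {sqrt(2 * k):.3f}")
```

With a sample of this size the simulated values should land close to the theoretical ones, though they will not match exactly.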

3.2.6 Student’s t-Distribution


Student's t-distribution (or simply t-distribution) arises when estimating the population mean of a
normal distribution from a sample that is small and/or when the population standard deviation
is unknown. In analytics, the t-distribution plays an important role in hypothesis testing and also in
diagnostics of linear regression models.
Assume that $X_1, X_2, \ldots, X_n$ are n observations from a normal distribution with mean µ and
standard deviation σ. Let

\[ \bar{X} = \sum_{i=1}^{n} \frac{X_i}{n}, \qquad S = \sqrt{\frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar{X})^{2}}, \]

where $\bar{X}$ and S are the mean and standard deviation estimated from the sample $X_1, X_2, \ldots, X_n$. Then
the random variable t defined by

\[ t = \frac{\bar{X} - \mu}{S / \sqrt{n}} \]

follows a t-distribution with (n - 1) degrees of freedom. Here one degree of freedom is lost since the
standard deviation is estimated from the sample (degrees of freedom is the number of observations
in the sample minus number of restrictions or estimates made using the sample).
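
The t statistic defined above can be computed directly from a sample. The following is a minimal sketch in Python; the five observations and the hypothesised mean `mu_0` are made-up illustrative values, not data from the notes. `statistics.stdev` uses the n − 1 divisor, which matches the definition of S.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample and hypothesised population mean (illustrative only).
sample = [9.8, 10.2, 10.4, 9.9, 10.1]
mu_0 = 10.0

n = len(sample)
x_bar = mean(sample)   # sample mean
s = stdev(sample)      # sample standard deviation with divisor n - 1
t_stat = (x_bar - mu_0) / (s / sqrt(n))
dof = n - 1            # one degree of freedom is lost by estimating S

print(f"x_bar = {x_bar:.3f}, S = {s:.4f}, t = {t_stat:.4f}, degrees of freedom = {dof}")
```

The resulting value would then be compared against a t-distribution with n − 1 degrees of freedom, for example using printed tables.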

Properties of t-Distribution:

• The mean of a t-distribution with 2 or more degrees of freedom is 0.


• The variance of t-distribution is n/(n-2) for n > 2, where n is the number of degrees of freedom.
• As the degrees of freedom n increases, the probability density function of a t-distribution
approaches the density function of the standard normal distribution. For n > 120, the area
under the probability density function of a t-distribution is very close to the corresponding area
under a standard normal distribution.
• t-distribution is an important distribution for hypothesis testing of means of a population and
for comparing means of two populations.

3.2.7 F-Distribution
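The F-distribution (short form of Fisher's distribution, named after statistician Ronald Fisher) is the
distribution of a ratio of two chi-square random variables, each divided by its degrees of freedom. Let
$Y_1$ and $Y_2$ be two independent chi-square random variables with $k_1$ and $k_2$ degrees of freedom,
respectively. Then the random variable X defined as

\[ X = \frac{Y_1 / k_1}{Y_2 / k_2} \]

follows an F-distribution with $k_1$ and $k_2$ degrees of freedom.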

Properties of F-Distribution:

• The mean of the F-distribution is $k_2 / (k_2 - 2)$, for $k_2 > 2$.

• The standard deviation of the F-distribution is

\[ \sqrt{\frac{2 k_2^{2} (k_1 + k_2 - 2)}{k_1 (k_2 - 2)^{2} (k_2 - 4)}}, \qquad \text{for } k_2 > 4. \]

• F-distribution is non-symmetrical and the shape of the distribution depends on the values of
k1 and k2 .

• F-distribution is used in Analysis of Variance to test the mean values of multiple groups.
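
The definition of the F-distribution as a ratio of scaled chi-square variables, and the mean and standard deviation formulas listed above, can be checked by simulation. This is a rough sketch assuming Python's standard library; the degrees of freedom (5 and 10), the sample size, and the seed are arbitrary illustrative choices.

```python
import random
from math import sqrt
from statistics import mean


def chi_square_sample(k: int, rng: random.Random) -> float:
    """One chi-square(k) draw: the sum of k squared standard normal draws."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


def f_sample(k1: int, k2: int, rng: random.Random) -> float:
    """One F(k1, k2) draw: ratio of two independent scaled chi-square draws."""
    y1 = chi_square_sample(k1, rng)
    y2 = chi_square_sample(k2, rng)
    return (y1 / k1) / (y2 / k2)


rng = random.Random(1)  # fixed seed so the run is reproducible
k1, k2 = 5, 10          # illustrative degrees of freedom (k2 > 4)
samples = [f_sample(k1, k2, rng) for _ in range(20_000)]

sample_mean = mean(samples)
theoretical_mean = k2 / (k2 - 2)
theoretical_sd = sqrt(2 * k2**2 * (k1 + k2 - 2) / (k1 * (k2 - 2) ** 2 * (k2 - 4)))

print(f"sample mean {sample_mean:.3f} vs theoretical {theoretical_mean:.3f}")
print(f"theoretical standard deviation {theoretical_sd:.3f}")
```

Note that the theoretical mean depends only on the denominator degrees of freedom, which the simulation makes easy to confirm by varying `k1`.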

3.3 Visualization using R.
• Fitting: https://cran.r-project.org/doc/contrib/Ricci-distributions-en.pdf
• Fitting: https://rpubs.com/eraldagjika/715261
• Visualization: https://flowingdata.com/2012/05/15/how-to-visualize-and-compare-distributio
• Plots: http://www.cookbook-r.com/Graphs/Plotting_distributions_(ggplot2)/

3.4 Problems
Problem 59. An employee is selected from a staff of 10 to supervise a certain project by selecting
a tag at random from a box containing 10 tags numbered from 1 to 10. Find the formula for
the probability distribution of X representing the number on the tag that is drawn. What is the
probability that the number drawn is less than 4?

Problem 60. According to Chemical Engineering Progress (November 1990), approximately 30%
of all pipework failures in chemical plants are caused by operator error.
(a) What is the probability that out of the next 20 pipework failures at least 10 are due to operator
error?
(b) What is the probability that no more than 4 out of 20 such failures are due to operator error?
(c) Suppose, for a particular plant, that out of the random sample of 20 such failures, exactly 5 are
due to operator error. Do you feel that the 30% figure stated above applies to this plant? Comment.
Problem 61. The probability that a patient recovers from a delicate heart operation is 0.9. What
is the probability that exactly 5 of the next 7 patients having this operation survive?
Problem 62. In testing a certain kind of truck tire over rugged terrain, it is found that 25% of
the trucks fail to complete the test run without a blowout. Of the next 15 trucks tested, find the
probability that
(a) from 3 to 6 have blowouts;
(b) fewer than 4 have blowouts;
(c) more than 5 have blowouts.

Problem 63. The probability that a student pilot passes the written test for a private pilot’s license
is 0.7. Find the probability that a given student will pass the test
(a) on the third try;
(b) before the fourth try.
Problem 64. On average, 3 traffic accidents per month occur at a certain intersection. What is
the probability that in any given month at this intersection
(a) exactly 5 accidents will occur?
(b) fewer than 3 accidents will occur?
(c) at least 2 accidents will occur?

Problem 65. On average, a textbook author makes two word processing errors per page on the first
draft of her textbook. What is the probability that on the next page she will make
(a) 4 or more errors?
(b) no errors?

Problem 66. The probability that a person living in a certain city owns a dog is estimated to be
0.3. Find the probability that the tenth person randomly interviewed in that city is the fifth one to
own a dog.
Problem 67. Find the probability that a person flipping a coin gets
(a) the third head on the seventh flip;
(b) the first head on the fourth flip.
Problem 68. Three people toss a fair coin and the odd one pays for coffee. If the coins all turn up
the same, they are tossed again. Find the probability that fewer than 4 tosses are needed.
Problem 69. The number of customers arriving per hour at a certain automobile service facility is
assumed to follow a Poisson distribution with mean λ = 7.
(a) Compute the probability that more than 10 customers will arrive in a 2-hour period.
(b) What is the mean number of arrivals during a 2-hour period?
Problem 70. Given a standard normal distribution, find the area under the curve that lies
(a) to the right of z = 1.84 and
(b) between z = -1.97 and z = 0.86.
Problem 71. Given a standard normal distribution, find the value of k such that
(a) P (Z > k) = 0.3015 and
(b) P (k < Z < −0.18) = 0.4197.

Problem 72. Given a random variable X having a normal distribution with µ = 50 and σ = 10,
find the probability that X assumes a value between 45 and 62.
Problem 73. Given a random variable X having a normal distribution with µ = 300 and σ = 50,
find the probability that X assumes a value greater than 362.
Problem 74. Given a normal distribution with µ = 40 and σ = 6, find the value of x that has
(a) 45% of the area to the left and
(b) 14% of the area to the right.
Problem 75. A certain type of storage battery lasts, on average, 3.0 years with a standard deviation
of 0.5 year. Assuming that battery life is normally distributed, find the probability that a given battery
will last less than 2.3 years.

Problem 76. An electrical firm manufactures light bulbs that have a life, before burn-out, that is
normally distributed with mean equal to 800 hours and a standard deviation of 40 hours. Find the
probability that a bulb burns between 778 and 834 hours.
Problem 77. Gauges are used to reject all components for which a certain dimension is not within
the specification 1.50 ± d. It is known that this measurement is normally distributed with mean 1.50
and standard deviation 0.2. Determine the value d such that the specifications “cover” 95% of the
measurements.
Problem 78. A certain machine makes electrical resistors having a mean resistance of 40 ohms and
a standard deviation of 2 ohms. Assuming that the resistance follows a normal distribution and can
be measured to any degree of accuracy, what percentage of resistors will have a resistance exceeding
43 ohms?

Problem 79. The average grade for an exam is 74, and the standard deviation is 7. If 12% of
the class is given As, and the grades are curved to follow a normal distribution, what is the lowest
possible A and the highest possible B?
Problem 80. Given a continuous uniform distribution, show that
(a) $\mu = \dfrac{b + a}{2}$
(b) $\sigma^2 = \dfrac{(b - a)^{2}}{12}$

Problem 81. The daily amount of coffee, in liters, dispensed by a machine located in an airport
lobby is a random variable X having a continuous uniform distribution with A = 7 and B = 10.
Find the probability that on a given day the amount of coffee dispensed by this machine will be
(a) at most 8.8 liters;
(b) more than 7.4 liters but less than 9.5 liters;
(c) at least 8.5 liters.
Problem 82. Given a standard normal distribution, find the value of k such that
(a) P (Z > k) = 0.2946;
(b) P (Z < k) = 0.0427;
(c) P (−0.93 < Z < k) = 0.7235.
Problem 83. The lifetime T (years) of an electronic component is a continuous random variable
with a probability density function given by

\[ f(t) = e^{-t}, \qquad t \ge 0. \]

Find the lifetime L which a typical component is 60% [...] components are sold to a manufacturer, find the
probability that at least one of them will have a lifetime less than L years.
Problem 84. Commonly, car cooling systems are controlled by electrically driven fans. Assuming
that the lifetime T in hours of a particular make of fan can be modelled by an exponential distribution
with λ = 0.0003 find the proportion of fans which will give at least 10000 hours service. If the fan
is redesigned so that its lifetime may be modelled by an exponential distribution with λ = 0.00035,
would you expect more fans or fewer to give at least 10000 hours service?
Problem 85. The time intervals between successive barges passing a certain point on a busy
waterway have an exponential distribution with mean 8 minutes.
(a) Find the probability that the time interval between two successive barges is less than 5 minutes.
(b) Find a time interval t such that we can be 95% sure that the time interval between two successive
barges will be greater than t.
