
CHAPTER 5-Probability Distribution

Chapter 5 covers probability distributions, including discrete and continuous types, with a focus on the binomial and Poisson distributions. It explains key concepts such as random variables, probability distributions, and confidence intervals for estimating population parameters. The chapter also includes examples and exercises to illustrate the application of these statistical concepts.

Uploaded by

mondechipululu
Copyright
© © All Rights Reserved

CHAPTER 5:

PROBABILITY DISTRIBUTIONS
CHAPTER OBJECTIVES
To introduce the basic concepts of discrete probability
distributions.
To present applications of the binomial distribution.
To present applications of the Poisson distribution.
To introduce the basic concepts of continuous probability
distributions.
To present applications of the standard normal distribution.
To develop confidence interval estimates for the mean, proportion
and variance
Introduction
Random variable: a variable whose value depends on the outcome of a
random experiment.
A random variable is denoted by a capital letter (e.g., X).
A value assigned to a random variable is denoted by a small letter (e.g., x).
Types of random variables:
Discrete random variables: These variables usually occur in counting experiments.
Continuous random variables: These variables usually occur in experiments where
measurements are taken.
Introduction
Probability distribution: list of all the possible outcomes of a random
variable and their associated probabilities of occurrence.
Two types of probability distributions: discrete and continuous.
Discrete probability distribution: assumes that the outcomes of a
random variable under study can take on only specific (usually integer)
values.
Examples of discrete probability distribution: Binomial distribution and
Poisson distribution.
Example of continuous probability distribution is the normal distribution.
Introduction
There are two conditions that a probability distribution has to satisfy:
The probability assigned to each value of the random variable lies in the
range 0 to 1, that is, $0 \le P(x) \le 1$ for each $x$.
The sum of the probabilities assigned to all possible values of the random
variable is equal to 1, that is, $\sum P(x) = 1$.
The probability that a random variable $X$ will assume a specified value $x$ is denoted
by $P(X = x)$.
The probability that a random variable takes on a value less than or equal to a
specified value $x$ is denoted by $P(X \le x)$.
The probability that a random variable takes on a value greater than a specified
value $x$ is denoted by $P(X > x)$.
Discrete random variable
Mean of a discrete random variable
The mean is denoted by $\mu$ and is also known as the expected value of a
random variable.
The expected value of a random variable may also be denoted by $E(X)$.
$$\mu = \sum x\,P(x) \quad \text{or} \quad E(X) = \sum x\,P(x)$$
Standard deviation of a discrete random variable
The standard deviation gives a measure of how much a probability
distribution is dispersed or spread around the mean of a discrete random
variable.
Discrete random variable
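The standard deviation is the square root of the variance. The formula for the variance is given
by:
$$\sigma^2 = \sum (x - \mu)^2 P(x) = \sum x^2 P(x) - \mu^2$$
The formula for the standard deviation is given by:
$$\sigma = \sqrt{\sum (x - \mu)^2 P(x)}$$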
Example
Suppose we toss a coin twice; then the sample space is S = {HH, HT, TH, TT} and each of the
outcomes in S has a probability of ¼. If we now define a random variable as
the number of heads observed, this means that:
X = number of heads and x = 0; 1; 2

x      0    1    2
P(x)   1/4  2/4  1/4
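The mean and standard deviation of this coin-toss distribution can be checked with a short Python sketch. The variable names are illustrative only, and the formulas used are the standard ones for a discrete random variable, $\mu = \sum x P(x)$ and $\sigma^2 = \sum (x - \mu)^2 P(x)$.

```python
from fractions import Fraction
from math import sqrt

# Distribution of X = number of heads in two tosses of a fair coin.
dist = {0: Fraction(1, 4), 1: Fraction(2, 4), 2: Fraction(1, 4)}

# A valid probability distribution must sum to 1.
total = sum(dist.values())

# Mean: sum of x * P(x) over all values of x.
mean = sum(x * p for x, p in dist.items())

# Variance: sum of (x - mean)^2 * P(x); the standard deviation is its square root.
variance = sum((x - mean) ** 2 * p for x, p in dist.items())
std_dev = sqrt(variance)

print("sum of probabilities:", total)
print("mean:", mean)
print("variance:", variance)
print("standard deviation:", round(std_dev, 4))
```

Using exact fractions avoids rounding noise in the check that the probabilities sum to 1.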
Example
Exercise
A bank stock is currently selling at R10 a share. An investor plans to buy
shares and to hold the stock for one year. Let X denote the price of the
stock after one year. The probability distribution is shown below:
x      10    11    12    13    14
P(x)   0,35  0,25  0,20  0,15  0,05

1. Find the expected price of the stock after one year.

2. Find the standard deviation of the price of the stock over the one-year
period.
Example
Mr Ndlovu is a quality assurance manager at a manufacturing plant for
memory chips. He keeps a record of the number of defective memory
chips that are returned to the plant by dealers.
Let X denote the number of defective memory chips that are returned to
the plant in a production batch of 300 chips. The number of returns he
receives varies from zero to four in a production batch. The following
table gives the probability distribution of X.
x      0     1     2     3     4
P(x)   0,15  0,30  0,25  0,20  0,10
Example
Determine the probability that the number of defective memory chips that
are returned to the plant in a production batch of 300 memory chips is:
1. Exactly three
2. More than two
3. At least two
4. From one to three
5. Less than two
6. At most two
7. Between one and four
Solution
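One possible way to work through the seven questions is to add up the tabulated probabilities for the values of x that satisfy each condition. The sketch below does this in Python; the helper `prob` is a hypothetical name, and question 7 is read here as excluding the endpoints (x = 2 or 3), which is an assumption about the intended meaning of "between".

```python
# Distribution of X = number of defective chips returned per batch of 300.
dist = {0: 0.15, 1: 0.30, 2: 0.25, 3: 0.20, 4: 0.10}


def prob(condition):
    """Sum P(x) over every value x that satisfies the condition."""
    return round(sum(p for x, p in dist.items() if condition(x)), 2)


exactly_three = prob(lambda x: x == 3)             # 1. P(X = 3)
more_than_two = prob(lambda x: x > 2)              # 2. P(X > 2)
at_least_two = prob(lambda x: x >= 2)              # 3. P(X >= 2)
one_to_three = prob(lambda x: 1 <= x <= 3)         # 4. P(1 <= X <= 3)
less_than_two = prob(lambda x: x < 2)              # 5. P(X < 2)
at_most_two = prob(lambda x: x <= 2)               # 6. P(X <= 2)
between_one_and_four = prob(lambda x: 1 < x < 4)   # 7. assumed to exclude 1 and 4

for name, value in [
    ("exactly three", exactly_three),
    ("more than two", more_than_two),
    ("at least two", at_least_two),
    ("from one to three", one_to_three),
    ("less than two", less_than_two),
    ("at most two", at_most_two),
    ("between one and four", between_one_and_four),
]:
    print(f"{name}: {value}")
```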
Binomial distribution
Let X be a random variable denoting the number of successes in the n trials
of a binomial experiment.
Then X is a binomial random variable which follows a Binomial distribution.
X can take the values { 0, 1, 2, … , n }.
If the probability of success in a single trial is p, then the parameters of the
binomial distribution are n and p.
We say that: “X is binomially distributed with parameters n and p.”
In mathematical symbols, we write:

X ~ Bin (n, p )
The Binomial Mathematical Formula
Let X represent the number of successes observed in n trials of a
Binomial experiment. Then

$$P(X = x) = p(x) = C_x^n\, p^x (1 - p)^{n - x}$$

Where: $n$ = the sample size, i.e. the number of independent trials
(observations)
$x$ = the number of success outcomes in the $n$ independently drawn
objects.
The Binomial Mathematical Formula
$p$ = probability of a success outcome on a single independent object.
$1 - p$ = probability of a failure outcome on a single independent object.

$$C_x^n = \frac{n!}{x!\,(n - x)!} = \frac{n (n - 1)(n - 2) \cdots 2 \cdot 1}{\{x (x - 1) \cdots 2 \cdot 1\}\{(n - x)(n - x - 1) \cdots 2 \cdot 1\}}$$
Mean and Variance of a Binomial Distribution
The mean of the binomial distribution, which is the expected value of the
number of successes, is calculated using:
$$\mu = E(X) = np$$
The standard deviation of the binomial distribution is determined using:
$$\sigma = \sqrt{np(1 - p)}$$

Example
1. The binomial random variable X has a parameters and . Determine .
2. An idea proposed by economist Arthur Laffer suggests that beyond a
certain point, high tax rates depress the economy so much that they
actually reduce tax revenues. His idea is to propose tax reductions as
a means of stimulating the economy and, as a result, increase tax
revenues. In the mid 1990s, Time magazine reported that 30% of the
US Congress supported a tax cut as a means of stimulating the
economy and increasing tax revenues. Suppose that at that time five
members of Congress were randomly selected for an interview and
asked whether they supported a tax cut to stimulate the economy.
Example
a. Find the probability that all of those interviewed were opposed to a tax
cut.
b. Find the probability that one of the five members of congress who were
interviewed was in favour of a tax cut.
c. Find the probability that at least three of the five members of congress
who were interviewed were in favour of a tax cut when interviewed.
d. Find the probability that less than five members of congress who were
interviewed were in favour of a tax cut.
e. Construct the binomial distribution for the random variable X in the
form of a table.
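Solution
X = the number of the five interviewed members of Congress who were in favour of a tax cut, so $X \sim \text{Bin}(5;\ 0{,}3)$.
a. $P(X = 0) = 0{,}1681$
b. $P(X = 1) = 0{,}3602$
c. $P(X \ge 3) = 0{,}1323 + 0{,}0284 + 0{,}0024 = 0{,}1631$
d. $P(X < 5) = 1 - P(X = 5) = 1 - 0{,}0024 = 0{,}9976$
e.
x      0       1       2       3       4       5
P(x)   0,1681  0,3602  0,3087  0,1323  0,0284  0,0024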
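The table above can be reproduced with a few lines of Python. This is a minimal sketch that assumes n = 5 interviews and p = 0,3 (the 30% support figure quoted in the example); the function name `binomial_pmf` is illustrative.

```python
from math import comb, sqrt


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p): C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 5, 0.3  # five members interviewed, 30% support a tax cut

# Full distribution of X for x = 0, 1, ..., n.
table = {x: binomial_pmf(x, n, p) for x in range(n + 1)}
for x, probability in table.items():
    print(x, f"{probability:.5f}")

# Probabilities asked for in parts c and d of the example.
at_least_three = sum(table[x] for x in range(3, n + 1))
fewer_than_five = 1 - table[5]

# Mean and standard deviation of the binomial distribution.
mean = n * p
std_dev = sqrt(n * p * (1 - p))
```

Summing the unrounded probabilities can differ from summing the rounded table entries in the last decimal place.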
Example
Let X be a discrete random variable that possesses a binomial
distribution. Find the mean and standard deviation of the probability
distribution for which and .
Poisson Distribution
The Poisson distribution is a discrete probability distribution for the
counts of events that occur randomly in a given interval of time (or
space).
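The Poisson distribution is given by:
$$P(X = x) = \frac{\lambda^x e^{-\lambda}}{x!}, \quad x = 0, 1, 2, \ldots$$
Where $\lambda$ is the mean number of occurrences in the given interval and $e \approx 2{,}71828$.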
Poisson experiment
A Poisson experiment is an experiment that has the following properties
The experiment consists of counting the number of times an event of
interest (termed a success) occurs in a specified interval of time, distance,
or volume (or any other unit of measurement).
The probability of a success in an interval is the same for all intervals of
equal size.
The number of successes in an interval is independent of the number of
successes that occur in other intervals.
The probability of a success is proportional to the size of the interval.
The intervals do not overlap.
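Mean and standard deviation of
the Poisson distribution
$$\mu = \lambda \quad \text{and} \quad \sigma = \sqrt{\lambda}$$
where $\lambda$ is the mean number of occurrences per interval.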
Example
Assume that the number of patients that arrive at a physiotherapy
practice per hour follows a Poisson distribution. By reviewing past records,
it was determined that the mean is two patients per hour.
a. What is the probability that in a given hour exactly three patients will
arrive?
b. What is the probability that in a given two-hour period exactly eight
patients will arrive?
c. What is the probability that in a given half-hour period exactly two
patients will arrive?
Solution
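X = the number of patients that arrive at a physiotherapy practice per hour.
a. $\lambda = 2$: $P(X = 3) = \frac{2^3 e^{-2}}{3!} = 0{,}1804$
b. $\lambda = 4$ (two hours): $P(X = 8) = \frac{4^8 e^{-4}}{8!} = 0{,}0298$
c. $\lambda = 1$ (half an hour): $P(X = 2) = \frac{1^2 e^{-1}}{2!} = 0{,}1839$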
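The three parts of this example differ only in the length of the interval, so the mean has to be rescaled before the Poisson formula is applied. A minimal Python sketch, assuming the rate of two patients per hour scales proportionally with time (the function name `poisson_pmf` is illustrative):

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson variable with mean lam."""
    return lam**x * exp(-lam) / factorial(x)


rate_per_hour = 2  # mean number of patients per hour

# a. One hour: mean stays 2, exactly three patients.
p_a = poisson_pmf(3, rate_per_hour * 1)

# b. Two hours: mean becomes 4, exactly eight patients.
p_b = poisson_pmf(8, rate_per_hour * 2)

# c. Half an hour: mean becomes 1, exactly two patients.
p_c = poisson_pmf(2, rate_per_hour * 0.5)

print(round(p_a, 4), round(p_b, 4), round(p_c, 4))
```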
Example
The management of a supermarket receives, on average, two requests
per month to order items that the supermarket does not usually hold in
stock. Find the mean and the standard deviation of the probability
distribution.
Solution
X = the number of requests per month to order items that the supermarket
does not usually hold in stock.
$\mu = \lambda = 2$ and $\sigma = \sqrt{\lambda} = \sqrt{2} \approx 1{,}414$
Continuous probability
distribution
The random variable X is said to be continuous if it can take on any
value in a certain interval.
In other words, if X is a continuous random variable in the interval (a; b) then X
can have unlimited values within this interval.
Further it can be shown that for any continuous random variable the following
is valid:
The Normal distribution
The most common continuous distribution is the so-called normal
distribution. The normal distribution is determined by two parameters,
namely the population average, $\mu$, and the population variance, $\sigma^2$. It is
written as:
$$X \sim N(\mu;\ \sigma^2)$$
The Normal distribution
For the normal distribution the following is valid:
the normal curve is bell-shaped and symmetrical about the population
average, $\mu$;
the total area under this curve is exactly one;
the probabilities are directly proportional to the areas under the curve;
if the probability $P(a < X < b)$ must be calculated, the area under the curve between a
and b must be calculated.
The Standard Normal
distribution
If, for a normal distribution, the population average $\mu = 0$ and the population
variance $\sigma^2 = 1$, then the distribution is called the standard normal distribution
and is denoted by:
$$Z \sim N(0;\ 1)$$
the total area under the standard normal curve is exactly 1.
the curve is symmetrical around 0 and it is therefore simple to calculate
different probabilities using simple arithmetic.
If $Z \sim N(0;\ 1)$, what is the probability that Z will have a value less than 2,14?
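The question above is normally answered from the standard normal table. As a cross-check, the same area can be computed with `NormalDist` from the Python standard library; this is only a sketch of the lookup, not a replacement for the table method used in the chapter.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
Z = NormalDist(mu=0, sigma=1)

# Area under the curve to the left of 2.14, i.e. P(Z < 2.14).
p_less = Z.cdf(2.14)

# By symmetry around 0, P(Z < -2.14) equals P(Z > 2.14) = 1 - P(Z < 2.14).
p_left_tail = Z.cdf(-2.14)

print(round(p_less, 4), round(p_left_tail, 4))
```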
Standardizing
Probabilities for a normal distribution can be calculated by first
standardizing to the standard normal distribution:
If $X \sim N(\mu;\ \sigma^2)$ and $Z = \frac{X - \mu}{\sigma}$ is calculated, then $Z \sim N(0;\ 1)$.
Z-scores are always calculated to two decimal places.
Rules
Rule 1:
Rule 2:
Rule 3:
Rule 4:
Rule 5:
Rule 6:
Example
The examination mark in Statistics has a normal distribution with an
average of 60% and a variance of 100. What is the probability that a
student
a. will pass Statistics?
b. will get a distinction in Statistics?
Solution
Exercise
Consider Calculate the probabilities for the following:
THE CENTRAL LIMIT THEOREM
A very important theorem in Statistics is the Central Limit Theorem,
which in its simplest form states that:
If $X \sim N(\mu;\ \sigma^2)$ then $\bar{X} \sim N\left(\mu;\ \frac{\sigma^2}{n}\right)$,
where $\bar{X}$ is the average of a sample of size $n$ drawn from the population.
If $X$ is not normally distributed, then $\bar{X}$ is approximately $N\left(\mu;\ \frac{\sigma^2}{n}\right)$
when the sample size, $n$, is large. Irrespective of the distribution
that $X$ is from, if samples of a large size $n$ are drawn repeatedly from that
distribution, then the sample averages, $\bar{X}$, are approximately normally distributed
with a mean of $\mu$ and a variance of $\sigma^2/n$.
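The theorem can be illustrated with a small simulation. The sketch below is an illustration under assumed settings, not part of the chapter: it draws repeated samples of size n = 40 from an exponential distribution (mean 1, variance 1, clearly not normal) and checks that the sample averages have a mean close to 1 and a variance close to 1/40.

```python
import random
from statistics import mean, pvariance

random.seed(1)       # fixed seed so the run is repeatable

n = 40               # size of each sample (assumed for illustration)
num_samples = 5000   # how many samples are drawn

# Each entry is the average of one sample of n exponential observations.
sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

mean_of_means = mean(sample_means)      # should be close to mu = 1
var_of_means = pvariance(sample_means)  # should be close to sigma^2 / n = 0.025

print(round(mean_of_means, 3), round(var_of_means, 4))
```

A histogram of `sample_means` would look roughly bell-shaped even though the individual observations are strongly skewed.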
Example
The marks obtained in a Statistics test are normally distributed with an
average, and a variance, . If 20 candidates complete the course, what is
the probability that their average mark will be?
a. more than 80%?
b. between 50 and 70%?
Solution
Sampling distribution of the
proportion
Population proportion will be represented by $p$, and the sample proportion
by $\hat{p} = x/n$, where $x$ is the number of items with the characteristic and $n$ is the
sample size.
The standard error of the proportion is given as $\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$
Example
Suppose that in a class of 100, 28 students fail a test. If a sample of 50
students is randomly chosen, what is the probability that more than 25%
will fail the test?
The population proportion of students who fail the test is: $p = 28/100 = 0{,}28$
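A possible way to finish this example is to standardize the sample proportion with the standard error $\sqrt{p(1-p)/n}$ and use the normal approximation. The Python sketch below makes two assumptions that are not stated in the example: the normal approximation is adequate for n = 50, and no finite population correction is applied even though the class has only 100 students.

```python
from math import sqrt
from statistics import NormalDist

p = 28 / 100   # population proportion of students who fail
n = 50         # sample size

# Standard error of the sample proportion (no finite population correction).
standard_error = sqrt(p * (1 - p) / n)

# Standardize the cut-off of 25% and take the area to the right of it.
z = (0.25 - p) / standard_error
prob_more_than_25 = 1 - NormalDist().cdf(z)

print(round(standard_error, 4), round(z, 2), round(prob_more_than_25, 4))
```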
Confidence Intervals
Inferential statistics
Draw conclusion from data.
Sample – Describe data.
Use sample statistic to infer population parameter
Estimation
Hypothesis testing
Confidence Intervals
Estimation - Numerical values assigned to a population parameter using
a sample statistic.
Sample mean $\bar{x}$ used to estimate population mean $\mu$
Sample variance $s^2$ used to estimate population variance $\sigma^2$
Sample standard deviation $s$ used to estimate population standard deviation $\sigma$
Sample proportion $\hat{p}$ used to estimate population proportion $p$
Confidence Intervals
Steps in estimation
Step 1: Select sample
Step 2: Get required information from the sample
Step 3: Calculate sample statistic
Step 4: Assign values to population parameter
Confidence Intervals
There are two major types of estimates:
Point estimates: Consist of a single number (or a single sample
statistic) that is calculated from sample data.
Interval estimates: Consist of two numbers, calculated from sample data, that form the
endpoints of the interval.
Confidence Intervals
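Population parameter | Estimate (value of the statistic) | Estimator (formula)
Mean $\mu$ | $\bar{x}$ | $\bar{x} = \frac{\sum x}{n}$
Variance $\sigma^2$ | $s^2$ | $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
Proportion $p$ | $\hat{p}$ | $\hat{p} = \frac{x}{n}$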
Confidence Intervals
Confidence interval
An upper and lower limit within which the population parameter is
expected to lie
Limits will vary from sample to sample
Specify the probability that the interval will include the parameter
Typically used: 90%, 95%, 99%
Probability denoted by $1 - \alpha$
o $1 - \alpha$ is known as the level of confidence
o $\alpha$ is the significance level
Confidence intervals for the
population mean
Large sample size
A sample of size 30 or more is considered a large sample.
When a large sample is taken from a population that does not have a
normal distribution, the sampling distribution of the sample mean will be approximately
normal.
Therefore, whether or not a population has a normal distribution, the
normal distribution is used to construct a confidence interval for $\mu$ when a
large sample is taken from that population.
Confidence intervals for the
population mean
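The formula for constructing a confidence interval for $\mu$ when $n \ge 30$ is given
by:
$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$ (if $\sigma$ is known)
and
$\bar{x} \pm z_{\alpha/2} \frac{s}{\sqrt{n}}$ (if $\sigma$ is not known).
The value of $z_{\alpha/2}$ is from the standard normal distribution table.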
Confidence Intervals
Four commonly used confidence levels

1 - α   α      α/2     Z
0,90    0,10   0,05    1,645
0,95    0,05   0,025   1,96
0,98    0,02   0,01    2,33
0,99    0,01   0,005   2,575
Example
The management of a large national chain of motels decided to estimate
the mean cost per room of repairing damages caused by its customers. A
random sample of 150 vacated rooms was inspected by
management and this indicated a mean repair cost of R84,30 and a
standard deviation of R37,20. Construct a 95% confidence interval for the
mean repair cost for the population of 2 000 motel rooms.
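Solution
$\bar{x} = 84{,}30$; $s = 37{,}20$; $n = 150$; $z_{0{,}025} = 1{,}96$
$$84{,}30 \pm 1{,}96 \cdot \frac{37{,}20}{\sqrt{150}} = 84{,}30 \pm 5{,}95 = (78{,}35;\ 90{,}25)$$
Therefore, at a 95% level of confidence, the mean repair cost
for the population of motel rooms is between R78,35 and R90,25 per
room.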
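The interval in this solution can be verified with a short Python sketch. The function name `z_interval` is hypothetical, and the sketch uses the large-sample formula $\bar{x} \pm z \cdot s/\sqrt{n}$ with the z value taken from the standard library rather than from the table.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x_bar, s, n, confidence):
    """Large-sample confidence interval for the mean: x_bar +/- z * s / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for 95%
    margin = z * s / sqrt(n)
    return x_bar - margin, x_bar + margin


# Motel example: mean R84.30, standard deviation R37.20, 150 rooms, 95% confidence.
lower, upper = z_interval(x_bar=84.30, s=37.20, n=150, confidence=0.95)
print(round(lower, 2), round(upper, 2))
```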
Exercise
A manufacturer of steel beams has a production process that operates
continuously throughout an entire production shift. The beams that are
produced by the process are expected to have an average length of 295
cm (centimetres) and the standard deviation is known to be 2,5 cm.
At certain time intervals throughout the production process, samples are
selected to determine whether the average length of the beams is still
equal to 295 cm or whether the production process has to be reset in
order to change the length of the beams produced. A random sample of
50 beams is selected and the average beam length is found to be 294,5
cm. Construct a 99% confidence interval estimate for the population
average beam length.
Confidence Intervals
The behaviour of confidence intervals
When constructing a confidence interval, we would always like high
confidence, as well as a small margin of error.
The margin of error is given by: $E = z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
The margin of error becomes smaller when:
o The Z-table value becomes smaller
o $\sigma$ becomes smaller
o The value of $n$ becomes larger.
Confidence Intervals
Small sample
A sample size of less than 30 is considered a small sample size.
For a small sample from a normal population where $\sigma$ is known, the normal
distribution can be used.
If $\sigma$ is unknown we use $s$ to estimate $\sigma$.
We then need to replace the normal distribution with the t-distribution.
Confidence Intervals
The shape of the t-distribution is very similar to that of the normal
distribution; both are bell-shaped and symmetrical.
The only difference is that, because s is used to estimate the unknown $\sigma$,
the t-distribution has more area in the tails and less in the center.
The formula for constructing a confidence interval for $\mu$ is given by:
$$\bar{x} \pm t_{\alpha/2;\ n-1} \frac{s}{\sqrt{n}}$$
Where s = the sample estimate of the population standard deviation $\sigma$.
The value of $t_{\alpha/2;\ n-1}$ is obtained from the t-distribution table.
Example
The human resources director of a large corporation wishes to study
absenteeism among clerical workers at the corporation’s central office
during the previous year. A random sample of 25 clerical workers reveals
a mean absenteeism of 9,7 days, with a variance of 16 days.
Construct a 95% confidence interval for the average number of days of
absence for clerical workers last year. Assume that the population of
absence is normally distributed.
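Solution
$\bar{x} = 9{,}7$; $s = \sqrt{16} = 4$; $n = 25$; $t_{0{,}025;\ 24} = 2{,}064$
$$9{,}7 \pm 2{,}064 \cdot \frac{4}{\sqrt{25}} = 9{,}7 \pm 1{,}65 = (8{,}05;\ 11{,}35)$$
The mean number of days of absence for all clerical workers at the corporation last year was
between 8 days and 11 days at a 95% level of confidence.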
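The small-sample interval for the absenteeism example can be sketched as follows. The Python standard library has no t-distribution, so the critical value 2.064 (t-table, 24 degrees of freedom, 0.025 in the upper tail) is entered by hand; that table lookup and the function name `t_interval` are assumptions of this sketch.

```python
from math import sqrt


def t_interval(x_bar, s, n, t_critical):
    """Small-sample interval for the mean: x_bar +/- t * s / sqrt(n)."""
    margin = t_critical * s / sqrt(n)
    return x_bar - margin, x_bar + margin


variance = 16       # sample variance of days absent
s = sqrt(variance)  # sample standard deviation = 4
t_critical = 2.064  # from the t-table: 24 degrees of freedom, 95% confidence

lower, upper = t_interval(x_bar=9.7, s=s, n=25, t_critical=t_critical)
print(round(lower, 2), round(upper, 2))
```

Rounded to whole days, these limits agree with the interval of 8 to 11 days quoted in the solution.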
Exercise
An auditor of a certain company found that the billing errors that occurred
on a random sample of invoices were:
R32; -R45; R66; R2; -R8; -R51; R12 and R18.
Positive errors indicate that the customer was billed too much and
negative errors indicate that the customer was billed too little. Find the
99% confidence interval for the mean billing error. Assume that the
population of billing errors is normally distributed.
Confidence Intervals
Confidence interval for population proportion
Each element in the population can be classified either as a success or
failure
Sample proportion $\hat{p} = x/n$ (number of successes divided by the sample size)
Proportion always between 0 and 1
For large samples the sampling distribution of the sample proportion is
approximately normal.
Confidence Intervals
The formula for constructing a confidence interval for the population proportion $p$ is given by:
$$\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
Example
A sales manager needs to determine the proportion of defective radio
returns that is made on a monthly basis. In December 65 new radios were
sold and in January 13 were returned for rework.
Estimate with 95% confidence the population proportion of returns for
December.
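Solution
$\hat{p} = 13/65 = 0{,}2$; $n = 65$; $z_{0{,}025} = 1{,}96$
$$0{,}2 \pm 1{,}96 \sqrt{\frac{0{,}2 \cdot 0{,}8}{65}} = 0{,}2 \pm 0{,}097 = (0{,}103;\ 0{,}297)$$
We are 95% confident that the population proportion of returns is between 10,3%
and 29,7%.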
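The limits of 10,3% and 29,7% can be reproduced with the usual large-sample interval for a proportion, $\hat{p} \pm z\sqrt{\hat{p}(1-\hat{p})/n}$. A minimal Python sketch, with the illustrative function name `proportion_interval`:

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Radio example: 13 returns out of 65 radios sold, 95% confidence.
lower, upper = proportion_interval(successes=13, n=65, confidence=0.95)
print(f"{lower:.1%} to {upper:.1%}")
```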
Exercise
According to a recent report by the Census Bureau, 26% of the single
male households own stocks, bonds and mutual funds. Although Census
Bureau estimates are based on very large sample, for convenience,
assume that this result is based on a random sample of 2 000 single male
households. Find a 95% confidence interval for the proportion of all single
male households that own stocks, bonds and mutual funds.
Confidence Intervals
Confidence interval for the population variance
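Population variance very often important
Quality control
Sample variance $s^2$ is based on a sample of size n
Distribution of $\frac{(n-1)s^2}{\sigma^2}$ resulting from repeated sampling is a $\chi^2$ (chi-square)
distribution with $n - 1$ degrees of freedom
Confidence interval for the population variance
The formula for constructing a confidence interval for $\sigma^2$ is given by:
$$\frac{(n-1)s^2}{\chi^2_{\alpha/2;\ n-1}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2;\ n-1}}$$
where the first subscript of $\chi^2$ is the area to the right of the table value.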

Example
For a binding machine to work at its optimum capacity, the variation in the
temperature of the room is vital. The temperature was measured for 30 consecutive
hours and the sample standard deviation was found to be
0,68 degrees. What is a 90% confidence interval for $\sigma^2$?
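Solution
$n = 30$; $s^2 = 0{,}68^2 = 0{,}4624$; $\chi^2_{0{,}05;\ 29} = 42{,}557$; $\chi^2_{0{,}95;\ 29} = 17{,}708$
$$\frac{29 \cdot 0{,}4624}{42{,}557} \le \sigma^2 \le \frac{29 \cdot 0{,}4624}{17{,}708}, \quad \text{that is,} \quad 0{,}315 \le \sigma^2 \le 0{,}757$$
We are 90% confident that the variance of the temperature is between 0,315 and
0,757 (degrees squared).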
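The limits 0,315 and 0,757 can be checked numerically. The Python standard library has no chi-square quantile function, so the two critical values for 29 degrees of freedom (42.557 and 17.708) are entered by hand from a chi-square table; those lookups and the function name `variance_interval` are assumptions of this sketch.

```python
def variance_interval(s, n, chi2_upper, chi2_lower):
    """Confidence interval for the variance: (n - 1) s^2 divided by each critical value."""
    numerator = (n - 1) * s**2
    # The larger critical value gives the lower limit and vice versa.
    return numerator / chi2_upper, numerator / chi2_lower


# Temperature example: n = 30 hours, s = 0.68 degrees, 90% confidence.
# Chi-square table values for 29 degrees of freedom:
chi2_upper = 42.557  # area 0.05 to the right
chi2_lower = 17.708  # area 0.95 to the right

lower, upper = variance_interval(s=0.68, n=30, chi2_upper=chi2_upper, chi2_lower=chi2_lower)
print(round(lower, 3), round(upper, 3))
```

Taking the square root of each limit would give the corresponding interval for the standard deviation.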
Exercise
A precision meter is guaranteed to be accurate within 2%. A sample of
five meter readings on the same object yielded the following
measurements:
350; 348; 348; 352 and 351.
Construct a 99% confidence interval for $\sigma^2$, assuming the population of
measurements is normally distributed.
Determining sample sizes for
estimates
Sample size for estimating means
Confidence level
Accepted sampling error - $E$
Need to know $\sigma$, else use $s$
$$n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$$
Example
The maintenance of charge accounts may become too costly if the
average account purchase falls below a certain level. A department store
manager would like to estimate the average amount of account purchases
per month by its customers who have charge accounts to within R2,50
with a probability of approximately 0,95.
How many accounts should be selected from the store’s records if the
standard deviation of monthly account balances is known to be R16,50?
Solution
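$z_{0{,}025} = 1{,}96$, $\sigma = 16{,}50$ and $E = 2{,}50$
$$n = \left(\frac{1{,}96 \cdot 16{,}50}{2{,}50}\right)^2 = 167{,}34$$
Therefore, at least 168 accounts should be selected.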
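The sample size in this solution follows from solving the margin-of-error formula for n and rounding up. A short Python sketch, using the table value z = 1,96 and the illustrative function name `sample_size_for_mean`:

```python
from math import ceil


def sample_size_for_mean(z, sigma, error):
    """Smallest n for which z * sigma / sqrt(n) does not exceed the accepted error."""
    return ceil((z * sigma / error) ** 2)


# Charge account example: 95% confidence, sigma = R16.50, error = R2.50.
n_required = sample_size_for_mean(z=1.96, sigma=16.50, error=2.50)
print(n_required)
```

The result is always rounded up, because rounding down would leave the margin of error slightly larger than the accepted value.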
Determining sample sizes for
estimates
Sample size for estimating proportions
Confidence level
Accepted sampling error - $E$
Need to know $p$, else use $p = 0{,}5$
$$n = \frac{z_{\alpha/2}^2\, p(1 - p)}{E^2}$$
Example
A cable television company wants to estimate the proportion of its
customers who would purchase a cable television program guide. The
company would like to have 95% confidence that its estimate is correct to
within ±0,05 of the true population proportion.
Past experience in other areas indicates that 30% of the customers will
purchase the program guide. What sample size is required?
Solution
$$n = \frac{1{,}96^2 \cdot 0{,}3 \cdot 0{,}7}{0{,}05^2} = 322{,}69$$
A sample size of at least 323 customers is required.
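The required sample size for a proportion can be sketched in the same way as for a mean. The accepted error of 0,05 used below is an assumption: it is the value consistent with the stated answer of 323 when z = 1,96 and p = 0,3. The function name `sample_size_for_proportion` is illustrative.

```python
from math import ceil


def sample_size_for_proportion(z, p, error):
    """Smallest n for which z * sqrt(p * (1 - p) / n) does not exceed the accepted error."""
    return ceil(z**2 * p * (1 - p) / error**2)


# Cable television example: 95% confidence, prior estimate p = 0.3.
# error = 0.05 is assumed (it reproduces the stated sample size of 323).
n_required = sample_size_for_proportion(z=1.96, p=0.3, error=0.05)
print(n_required)

# With no prior estimate, p = 0.5 gives the most conservative (largest) sample size.
n_conservative = sample_size_for_proportion(z=1.96, p=0.5, error=0.05)
print(n_conservative)
```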
