Probability 2024


Probability Distribution

Probability plays a central role in statistics. We use probability to measure how likely an event is to
occur. If the event is sure to happen then its probability is 1. For example, the probability that a person
will die sooner or later is 1. A probability of 0 means the event will not occur at all. Thus the probability
of any event lies between 0 and 1 inclusive. The idea of probability is simple.

Example 1

Toss a fair coin once. The outcome could be a head (H) or a tail (T). Thus the chance of getting a tail on
this toss is one out of two. We say the probability of getting a tail on a single toss of a coin is 1/2, or
simply write in symbols p(T) = 1/2.

In Example 1, tossing a fair coin can be regarded as an experiment. The set of all possible outcomes of the
experiment is known as the sample space, denoted by S. For the above example S = { H, T }. The collection
of all the outcomes of interest is known as an event, denoted by E. For this example E = { T }.

In general we are looking for the probability of an event. When all outcomes in S are equally likely, this
probability is defined as p(E) = n(E)/n(S), where n(E) and n(S) denote the number of elements in E and S
respectively.

Example 2

Toss a fair coin three times. List all the outcomes with exactly two heads. For this example all the
possible outcomes can be found by simply using the tree diagram shown below.

So S = { HHH, HHT, HTH, HTT, THH, THT, TTH, TTT }. Since our interest is in getting two heads,
E = { HHT, HTH, THH }. Thus p(two heads appear) = 3/8.

For this example let us go a bit further. Let X = number of heads that appear. Thus X = 0, 1, 2, 3. Of
course X = 0 means no heads appear. You can verify that p(X=0) = 1/8, p(X=1) = 3/8, p(X=2) = 3/8 and
p(X=3) = 1/8.

Combining these probabilities leads us to the following table:

X       0     1     2     3
P(X)   1/8   3/8   3/8   1/8

What we have done is that we have built a probability distribution for the experiment.

Using the table you can verify that p(X ≤2) =7/8.

In this example the variable X is also known as a random variable as its value is generated randomly.

In general, the probabilities in any probability distribution of a random variable Y must add up to 1,
that is ∑p(Y) = 1. You can verify that in Example 2, ∑p(X) = 1.
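
The table in Example 2 can also be built by listing the sample space mechanically. The following is a minimal sketch, assuming a fair coin so that all outcomes are equally likely; the function name heads_distribution is only illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def heads_distribution(n_tosses):
    """Return {k: p(X = k)} where X = number of heads in n_tosses fair tosses."""
    outcomes = list(product("HT", repeat=n_tosses))      # the sample space S
    counts = Counter(outcome.count("H") for outcome in outcomes)
    return {k: Fraction(counts[k], len(outcomes)) for k in sorted(counts)}

dist = heads_distribution(3)
for k, p in dist.items():
    print(f"p(X={k}) = {p}")

total = sum(dist.values())                               # must equal 1
p_at_most_2 = sum(p for k, p in dist.items() if k <= 2)  # p(X <= 2)
print("sum =", total, " p(X <= 2) =", p_at_most_2)
```

Exact fractions are used so the results can be compared directly with the table.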

Example 3

A wheel of fortune is divided into three equal parts numbered 1, 2 and 3. When the wheel is turned and
comes to a stop, a needle points to one of the parts. A player turns the wheel 2 times. If Y = sum of the
numbers obtained in the two turns, find the probability distribution of Y. Use your distribution to
obtain p(Y ≥ 5). If you want to bet on which value of Y will appear, which value should you bet on?

Solution

For the two turns of the wheel a tree diagram can be drawn as follows:

You can verify from the tree diagram that Y = 2, 3, 4, 5, 6. Verify that the probability distribution
of Y is

Y       2     3     4     5     6
P(Y)   1/9   2/9   3/9   2/9   1/9

From the table, p(Y ≥ 5) = 2/9 + 1/9 = 1/3. Since Y = 4 has the highest probability, you should bet on this value.
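
The same answers can be checked by enumerating the nine ordered pairs of turns. This is only a sketch, assuming the three parts of the wheel are equally likely; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = (1, 2, 3)                                   # equally likely parts of the wheel
sums = Counter(a + b for a, b in product(faces, repeat=2))
n_pairs = sum(sums.values())                        # 9 ordered pairs
dist_y = {y: Fraction(c, n_pairs) for y, c in sorted(sums.items())}

p_at_least_5 = sum(p for y, p in dist_y.items() if y >= 5)
best_bet = max(dist_y, key=dist_y.get)              # value of Y with highest probability
print(dist_y)
print("p(Y >= 5) =", p_at_least_5, " best bet:", best_bet)
```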

Example 4

A random variable T has a probability distribution as follows.

T       4      7      9     16    20
P(T)   1/16   3/16   1/4    2x    x

(a) Find the value of x.
(b) Find p(T is at least 10).
(c) Find p(T is at most 8).
(d) Find p(T is between 5 and 20 exclusive).
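
One way to approach Example 4 is to use the fact that the probabilities must add up to 1, which fixes x. The sketch below assumes the table reads P(16) = 2x and P(20) = x; the variable names are illustrative.

```python
from fractions import Fraction

known = {4: Fraction(1, 16), 7: Fraction(3, 16), 9: Fraction(1, 4)}

# p(16) = 2x and p(20) = x, so the unknown entries contribute 3x in total.
x = (1 - sum(known.values())) / 3
dist_t = {**known, 16: 2 * x, 20: x}

p_at_least_10 = sum(p for t, p in dist_t.items() if t >= 10)
p_at_most_8 = sum(p for t, p in dist_t.items() if t <= 8)
p_between = sum(p for t, p in dist_t.items() if 5 < t < 20)
print(x, p_at_least_10, p_at_most_8, p_between)
```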

Mean and Variance for Probability Distribution

For any probability distribution we can find a mean and a variance. The mean of a probability distribution
is also known as the expected value of the distribution.

The formula for the expected value is E(X) = ∑x p(x), whereas the variance is V(X) = ∑x² p(x) - [∑x p(x)]².

Hence the standard deviation s is given by s = √V(X).

Example 5

Find the expected value, variance and standard deviation of the distribution in Example 3.

Solution.
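
The formulas E(X) = ∑x p(x) and V(X) = ∑x² p(x) - [∑x p(x)]² can be applied to the wheel distribution of Example 3 as in the sketch below. The helper name mean_and_variance is hypothetical.

```python
from fractions import Fraction
from math import sqrt

# Distribution of Y from Example 3 (sum of two turns of the wheel).
dist_y = {2: Fraction(1, 9), 3: Fraction(2, 9), 4: Fraction(3, 9),
          5: Fraction(2, 9), 6: Fraction(1, 9)}

def mean_and_variance(dist):
    """Return (E(X), V(X)) for a discrete distribution {x: p(x)}."""
    mean = sum(x * p for x, p in dist.items())
    second_moment = sum(x * x * p for x, p in dist.items())
    return mean, second_moment - mean ** 2

mean, variance = mean_and_variance(dist_y)
std_dev = sqrt(variance)
print(mean, variance, round(std_dev, 4))
```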

Binomial Distribution

This is one of the most popular discrete probability distributions and arises often in applications.

This distribution is generated by a binomial experiment, an experiment that possesses the following
properties:

The experiment consists of n identical trials.
Each trial results in one of two outcomes. We will call one outcome a success S and the other a
failure F.
The probability of success on a single trial is equal to p and remains the same from trial to trial.
The probability of a failure is equal to (1 - p) = q.
The trials are independent, i.e. the outcome of each trial does not depend on the other trials.
The random variable of interest is X, the number of successes observed during the n trials.

A simple example of a binomial experiment is that of tossing a coin. If the coin is tossed 5 times then
n = 5. If our interest is in the number of heads that appear, the event of getting a head is a success. We
can let our random variable X = number of successes (heads) that appear. Since for each toss the
probability of getting a head is 1/2 and remains unchanged from toss to toss, we can let p = 1/2.

Of course the outcome of each toss is independent of the other tosses.

A binomial experiment with n trials, probability of success p and random variable X (number of
successes) generates the following probability formula

p(X = k) = nCk p^k (1 - p)^(n-k)

where k = 0, 1, 2, …, n.

The mean (or expected value) of X is np and the variance is np(1 - p).

Example 6

A coin is tossed 10 times.

(a) Find the probability of getting

(i) 6 heads. (ii) at most 3 heads. (iii) no heads at all. (iv) all heads.

(b) Find the mean and variance.

Solution

Solution for Example 6(a) (ii) requires us to add up the probabilities cumulatively. This is quite
laborious when there are many probabilities to add up. A cumulative Binomial table is provided to
ease the job.
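
For comparison with a hand calculation, the quantities in Example 6 can be evaluated directly from the binomial formula. The sketch below assumes a fair coin (p = 0.5); binom_pmf is an illustrative helper name.

```python
from math import comb

def binom_pmf(k, n, p):
    """p(X = k) = nCk p^k (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.5
p_six = binom_pmf(6, n, p)                                    # (a)(i)
p_at_most_three = sum(binom_pmf(k, n, p) for k in range(4))   # (a)(ii): k = 0, 1, 2, 3
p_none = binom_pmf(0, n, p)                                   # (a)(iii)
p_all = binom_pmf(10, n, p)                                   # (a)(iv)
mean, variance = n * p, n * p * (1 - p)                       # (b)
print(p_six, p_at_most_three, p_none, p_all, mean, variance)
```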

How to use the cumulative Binomial table.

Make sure you have a table in front of you. Suppose X has a Binomial distribution with n = 20 and p = 0.4.
Often this is written as X ~ B(n; p). In the table look for n = 20, p = 0.4.

Your table will give the value of p(X ≤ k). For example p(X ≤ 2) = 0.0036 (verify using the binomial
probability formula). Also p(X ≤ 10) = 0.8725.

Note also that p( X =5)= p( X ≤ 5) - p( X ≤ 4) = 0.1256 – 0.0510 = 0.0746.

For p(X ≥ 7), we have p(X ≥ 7) = 1- p( X ≤ 6) = 1- 0.2500 = 0.7500.

Remember that since X is discrete p(X < 8) is the same as p( X ≤ 7). Also p(X > 8) means p(X ≥ 9).
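
The table entries quoted above can be reproduced by summing binomial probabilities. This is a sketch only; binom_cdf is an illustrative name and not part of any table or library.

```python
from math import comb

def binom_cdf(k, n, p):
    """p(X <= k) for X ~ B(n; p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n, p = 20, 0.4
p_le_2 = binom_cdf(2, n, p)
p_le_10 = binom_cdf(10, n, p)
p_eq_5 = binom_cdf(5, n, p) - binom_cdf(4, n, p)   # p(X = 5)
p_ge_7 = 1 - binom_cdf(6, n, p)                    # p(X >= 7)
for value in (p_le_2, p_le_10, p_eq_5, p_ge_7):
    print(round(value, 4))
```

The printed values should agree with the table to four decimal places, up to rounding in the last digit.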

Example 7.

A student decided to guess the answers to a set of multiple choice questions. There are 30 questions and
each question has four choices. He will pass the exam if he gets at least twelve questions correct. Find
the probability that he gets

(a) At least 5 questions correct

(b) Between eight and twelve questions correct inclusive

(c) At least twelve questions correct (in which case he passes the exam).

Example 8.

The probability that a single radar set will detect an enemy plane is 0.9. If we have five radar sets, what
is the probability that

(a) Exactly four sets will detect the plane

(b) At least one set will detect the plane

(c) None of the sets will detect the plane
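
Examples 7 and 8 are both binomial and can be set up as in the following sketch. It assumes each guess is correct with probability 1/4 in Example 7, and that the five radar sets detect independently in Example 8; the helper names are illustrative.

```python
from math import comb

def binom_pmf(k, n, p):
    """p(X = k) for X ~ B(n; p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_range(lo, hi, n, p):
    """p(lo <= X <= hi) for X ~ B(n; p)."""
    return sum(binom_pmf(k, n, p) for k in range(lo, hi + 1))

# Example 7: n = 30 guessed questions, p = 1/4.
ex7_a = binom_range(5, 30, 30, 0.25)     # at least 5 correct
ex7_b = binom_range(8, 12, 30, 0.25)     # between 8 and 12 inclusive
ex7_c = binom_range(12, 30, 30, 0.25)    # at least 12 correct (pass)

# Example 8: n = 5 radar sets, p = 0.9.
ex8_a = binom_pmf(4, 5, 0.9)             # exactly four detect
ex8_b = 1 - binom_pmf(0, 5, 0.9)         # at least one detects
ex8_c = binom_pmf(0, 5, 0.9)             # none detect
print(ex7_a, ex7_b, ex7_c)
print(ex8_a, ex8_b, ex8_c)
```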

Normal Distribution

This is the most popular distribution for continuous variables. Variables such as height and weight
are often treated as approximately normal, since extreme values are quite rare.
There are not many people who are either very short or very tall; most people are close to the
average height. This explains the mound shape of the normal distribution curve. In short, if a
random variable X has a normal distribution with mean µ and variance σ²
we write X ~ N(µ; σ²).

Recall that for a discrete distribution we must have ∑p(X) = 1. For a continuous distribution the
requirement is ∫ f(x)dx = 1, where the integral is taken over all values of x. The function f(x) is
known as the probability density function of the distribution.

In the case of the normal distribution f(x) = exp(-(x - µ)²/(2σ²)) / (σ√(2π)). The graph of this function is
sketched below.

As usual we are interested in finding probabilities associated with this distribution. As with the
Binomial, a cumulative table of the normal distribution is provided.

To use the table we need to standardize X to a Z value using the formula Z = (X - µ)/σ. In any
continuous distribution, probabilities are calculated by finding the area under the graph of
the distribution. Thus p(Z ≥ 1) = area shaded below.

The Z values range from negative to positive. Z = 0 is the middle value. Since the
area under the graph is 1 and the graph is symmetrical about the middle, we must have

p( Z ≥ 0)= 0.5 = p( Z ≤ 0).

Your table will give the value p( 0 ≤ Z ≤ k ). For example p( 0 ≤ Z ≤ 1.25 ) = 0.3944. Due to
symmetry p( 0 ≤ Z ≤ 1.25 ) = p( -1.25 ≤ Z ≤ 0 ). See figure below .

The value p(Z > 1.25 ) = 0.5 - p( 0 ≤ Z ≤ 1.25 ) = 0.5 - 0.3944 = 0.1056.
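
The table lookups above can be reproduced numerically. The sketch below computes the cumulative standard normal probability from the error function in Python's math module; the names phi and table_value are illustrative.

```python
from math import erf, sqrt

def phi(z):
    """Cumulative standard normal probability p(Z <= z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def table_value(k):
    """The tabulated quantity p(0 <= Z <= k) for k >= 0."""
    return phi(k) - 0.5

area_0_to_125 = table_value(1.25)     # compare with the table entry for 1.25
upper_tail = 0.5 - area_0_to_125      # p(Z > 1.25)
print(round(area_0_to_125, 4), round(upper_tail, 4))
```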

Example 9

The mean weight of Malaysian males is 70 kg with a standard deviation of 5.5 kg. Assume that
weight is normally distributed. If a Malaysian male is randomly selected, find the
probability that his weight is

(a) At least 78 kg

(b) Between 65 kg and 73 kg

Suppose the heaviest 20% of Malaysian males are considered overweight. Find the cutoff weight for
"overweight" in this case. If the lightest 15% are considered underweight, find the cutoff weight for "underweight".
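
Example 9 can be checked with the NormalDist class from Python's standard statistics module, as in this sketch. It assumes "overweight" means the heaviest 20% and "underweight" the lightest 15%.

```python
from statistics import NormalDist

weight = NormalDist(mu=70, sigma=5.5)

p_at_least_78 = 1 - weight.cdf(78)                 # (a)
p_65_to_73 = weight.cdf(73) - weight.cdf(65)       # (b)
overweight_cutoff = weight.inv_cdf(0.80)           # 80% of males weigh less than this
underweight_cutoff = weight.inv_cdf(0.15)          # 15% of males weigh less than this
print(round(p_at_least_78, 4), round(p_65_to_73, 4))
print(round(overweight_cutoff, 2), round(underweight_cutoff, 2))
```

By hand, the same cutoffs come from reading the table backwards to find Z and then using X = µ + Zσ.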

Example 10

A machining operation produces bearings with diameters that are normally distributed with a mean of
3.0005 inches and a standard deviation of 0.0010 inch. Specifications require the bearing diameters to
lie in the interval 3.000 ± 0.0020 inches. Those outside the interval are considered scrap and must be
remachined. With the existing machine setting, what fraction of total production will be scrapped?
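
A sketch of the calculation for Example 10 follows. The scrap fraction is the area below the lower limit plus the area above the upper limit; the variable names are illustrative.

```python
from statistics import NormalDist

diameter = NormalDist(mu=3.0005, sigma=0.0010)
lower, upper = 3.000 - 0.0020, 3.000 + 0.0020      # specification limits

# Standardized limits: about Z = -2.5 and Z = 1.5.
z_lower = (lower - diameter.mean) / diameter.stdev
z_upper = (upper - diameter.mean) / diameter.stdev

scrap_fraction = diameter.cdf(lower) + (1 - diameter.cdf(upper))
print(round(z_lower, 2), round(z_upper, 2), round(scrap_fraction, 4))
```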

Example 11

A soft drink machine can be regulated so that it discharges an average of µ ounces per cup. If the
ounces of fill are normally distributed with standard deviation 0.3 ounce, give the setting for µ so that
8-ounce cups will overflow only 1% of the time.
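
In Example 11 we need p(fill > 8) = 0.01, so 8 must sit at the point with 99% of the area below it. A sketch of that reasoning, with illustrative variable names:

```python
from statistics import NormalDist

cup_size, sigma, overflow_rate = 8.0, 0.3, 0.01

# z such that p(Z > z) = 0.01, i.e. p(Z <= z) = 0.99 (about 2.33).
z = NormalDist().inv_cdf(1 - overflow_rate)

# Solve (cup_size - mu) / sigma = z for mu.
mu_setting = cup_size - z * sigma
check = 1 - NormalDist(mu_setting, sigma).cdf(cup_size)   # should be about 0.01
print(round(mu_setting, 3), round(check, 4))
```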

Central Limit Theorem ( CLT )

In statistics we often work with samples. We find the mean and standard deviation of samples. A closer
look at these samples leads us to the following important theorem.

CLT

Given a population distribution with mean µ and standard deviation σ, the sampling distribution of the
sample mean x̄ is approximately normal for a large sample (n ≥ 30), i.e. approximately x̄ ~ N(µ, σ²/n).
However, if the population is normal then there is no restriction on n.

Example 12

A population has a mean of 28 and variance of 81. A sample of size 100 is taken from the population.
Find the probability that the mean of the sample is

(a) More than 30

(b) Between 26 and 30.
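
By the CLT the sample mean in Example 12 is approximately normal with mean 28 and standard deviation √(81/100) = 0.9. The sketch below uses that approximation; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, variance, n = 28, 81, 100
sample_mean = NormalDist(mu=mu, sigma=sqrt(variance / n))   # approximate, by the CLT

p_more_than_30 = 1 - sample_mean.cdf(30)                    # (a)
p_26_to_30 = sample_mean.cdf(30) - sample_mean.cdf(26)      # (b)
print(round(p_more_than_30, 4), round(p_26_to_30, 4))
```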

Confidence Interval for the mean

For the standard normal distribution define z_α to be the point on the x axis for which the area under
the density function to its right is equal to α:

P(Z > z_α) = α

By symmetry P(-z_{α/2} < Z < z_{α/2}) = 1 - α. Assuming normality we have

P(-z_{α/2} < (x̄ - µ)/(σ/√n) < z_{α/2}) = 1 - α

which can then be rewritten as

P(x̄ - z_{α/2} σ/√n < µ < x̄ + z_{α/2} σ/√n) = 1 - α

The interval defined by x̄ ± z_{α/2} σ/√n is called the 100(1 - α)% confidence interval for the mean with
variance known. If the value of α is specified, the upper and lower limits of this interval can be
calculated from the sample average. The probability is 1 - α that the true mean lies between them.

Example 13

The temperature (in degrees Celsius) at ten points chosen at random in a large building is measured,
giving the following list of readings:

18°, 16.5°, 17.5°, 18°, 19.5°, 16.5°, 18°, 17°, 19°, 17.5°

The standard deviation of temperature is known to be 1°C. Find a 90% confidence interval for the
mean temperature in the building.

Example 14

A machine fills cartons of liquid; the mean fill is adjustable but the dial on the gauge is not very accurate.
The standard deviation of the quantity of fill is 6 ml. A sample of 30 cartons gave a measured average
content of 570 ml. Find 90 % and 95 % confidence intervals for the mean.
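
The known-variance interval x̄ ± z_{α/2} σ/√n can be evaluated for Example 14 as in the following sketch. The function name z_interval is hypothetical, and the z value is taken from Python's statistics module rather than from a printed table.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(sample_mean, sigma, n, confidence):
    """Confidence interval for the mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)      # z_{alpha/2}
    half_width = z * sigma / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width

ci_90 = z_interval(570, 6, 30, 0.90)
ci_95 = z_interval(570, 6, 30, 0.95)
print([round(v, 2) for v in ci_90])
print([round(v, 2) for v in ci_95])
```

The 95% interval is wider than the 90% interval because a higher confidence level requires a larger z value.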

Suppose the population standard deviation is unknown, so that we have to estimate it by the sample
standard deviation s. This adds extra uncertainty because the estimate is itself subject to error.
Assuming the population is normal, in this case we need to use the t distribution:

T = (x̄ - µ)/(s/√n)

Here n is the sample size, and T has n - 1 degrees of freedom. The greater the n, the closer the t
distribution is to the normal distribution.

The 100(1 - α)% confidence interval in this case is

x̄ ± t_{α/2, n-1} s/√n    (n - 1 degrees of freedom)

Example 15

The measured lifetimes of a sample of 20 electronic components gave an average of 1250 h with
sample standard deviation of 96 h. Assuming that the lifetime has a normal distribution , find a 95 %
confidence interval for the mean lifetime of the population .
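
A sketch of the calculation for Example 15 follows. Python's standard library has no t distribution, so the critical value 2.093 is assumed here from a standard t table (19 degrees of freedom, upper tail area 0.025); check it against your own table.

```python
from math import sqrt

n, sample_mean, s = 20, 1250.0, 96.0

# Assumed table value: t for n - 1 = 19 degrees of freedom, alpha/2 = 0.025.
t_crit = 2.093

half_width = t_crit * s / sqrt(n)
ci_95 = (sample_mean - half_width, sample_mean + half_width)
print(round(ci_95[0], 2), round(ci_95[1], 2))
```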

Hypothesis Testing

Basically, hypothesis testing involves two hypotheses. One is the null hypothesis (H0) and
the other is the alternative hypothesis (Ha). Ha normally contains the assertion of concern. H0 is the
assertion opposite to Ha; however, we always use the equal sign in H0.

For example, some consumers complain that the net weight of a can of Milo is less than the 500g
written on the can. If µ is the mean net weight, which the manufacturer claims is 500g, our
concern is that the true mean net weight is less than 500g.

In this case we set Ha : µ < 500. H0 will then be H0 : µ = 500.

Now, how do we make the decision? A tolerance has to be set first. If we take a sample of 30
cans, for example, and find that x̄ is 499g, should we accept Ha? This would probably be quite unsafe as
the difference is only 1 gram. So there must be some tolerance.

This tolerance is given by the alpha level (α level). Usually α is set between 0.001 and 0.1. The most
popular is α = 0.05.

In general if Ha contains <, we have a left tailed test; if it contains >, a right tailed test; and if it
contains ≠, a two tailed test.

Next we choose an appropriate test statistic based on the distribution involved. For the normal
distribution we use Zt = (x̄ - µ)/(σ/√n) and compute its value.

Next we find Zα from the expression p( Z ≥ Zα ) = α ( if right tailed test ), or p( Z ≤ Zα ) = α ( if left tailed
test ), or p( Z ≥ Zα/2 ) = α/2 and p( Z ≤ -Zα/2 ) = α/2 ( if two tailed test ).

For the right tailed test, if Zt > Zα then we accept Ha. For the left tailed test, if Zt < Zα then we
accept Ha. If not, then we don't have enough evidence to reject H0.

Example 16.

There are some concerns about the net weight of 250g cans of Milo. Some people feel that the actual net
weight is less than 250g. A consumer society conducted a statistical test on 30 randomly selected cans
and found that the sample mean is 245g. The manufacturer claims that µ = 250g and σ = 10g.

Based on α = 0.05, can we say that the concerns need to be investigated further?
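
The steps described in this section can be sketched for Example 16 as follows. Since the concern is that µ is less than 250g, this is set up as a left tailed test with Ha : µ < 250; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, sample_mean, alpha = 250, 10, 30, 245, 0.05

# Test statistic Zt = (x_bar - mu) / (sigma / sqrt(n)).
z_t = (sample_mean - mu0) / (sigma / sqrt(n))

# Left tailed critical value: p(Z <= z_alpha) = alpha, about -1.645.
z_alpha = NormalDist().inv_cdf(alpha)

accept_ha = z_t < z_alpha
print(round(z_t, 3), round(z_alpha, 3), accept_ha)
```

Here Zt falls below Zα, so at α = 0.05 the data support Ha and the concerns are worth investigating.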
