Probability 2024
Probability plays a central role in statistics. We use probability to measure how likely an event is to
occur. If the event is sure to happen then its probability is 1. For example, the probability that a person
will die sooner or later is 1. A probability of 0 means the event will not occur at all. Thus the probability
of any event lies between 0 and 1 inclusive. The idea of probability is simple.
Example 1
Toss a fair coin once. The outcome could be a head (H) or a tail (T). Thus the chance of getting a tail for
this toss is one out of two. We say the probability of getting a tail in a single toss of a coin is 1/2 or 0.5.
In Example 1, tossing a fair coin can be regarded as an experiment. The set of all possible outcomes of the
experiment is known as the sample space, denoted by S. For the above example S = { H, T }. A collection of
the outcomes of interest is known as an event, denoted by E. For this example E = { T }.
In general we are looking for the probability of an event. When all outcomes are equally likely, this
probability is defined as p(E) = n(E)/n(S), where n(E) and n(S) denote the number of elements in E and S
respectively.
Example 2
Toss a fair coin three times. List all the outcomes of getting two heads. For this example all the possible
outcomes can be found by simply using the tree diagram shown below.
So S = { HHH, HHT, HTH, HTT, THH, THT, TTH, TTT }. Since our interest is to get two heads,
E = { HHT, HTH, THH } and p(E) = n(E)/n(S) = 3/8.
For this example let us go a bit further. Suppose X = number of heads that appear. Thus X = 0, 1, 2, 3. Of
course X = 0 means no heads appear. You can verify that p(X=0) = 1/8, p(X=1) = 3/8, p(X=2) = 3/8
and p(X=3) = 1/8.
Combining these probabilities leads us to the following table:
X       0     1     2     3
P(X)    1/8   3/8   3/8   1/8
What we have done is build a probability distribution for the experiment.
Using the table you can verify that p(X ≤ 2) = 7/8.
In this example the variable X is known as a random variable, as its value is determined by the random
outcome of the experiment.
In general, the probability distribution of any random variable Y must satisfy ∑p(Y) = 1. You can verify
that in Example 2, ∑p(X) = 1.
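The table in Example 2 can be checked by listing the sample space and counting heads. The following is a minimal sketch, assuming a fair coin so that all outcomes are equally likely; the function name `heads_distribution` is hypothetical and chosen only for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def heads_distribution(n_tosses):
    """Return {k: p(X = k)} where X = number of heads in n_tosses fair tosses."""
    # Sample space S: every sequence of H and T of the given length.
    outcomes = list(product("HT", repeat=n_tosses))
    # n(E) for each event "exactly k heads".
    counts = Counter(outcome.count("H") for outcome in outcomes)
    # p(E) = n(E) / n(S), valid because the outcomes are equally likely.
    return {k: Fraction(counts[k], len(outcomes)) for k in sorted(counts)}

dist = heads_distribution(3)
total = sum(dist.values())                                 # should equal 1
p_at_most_2 = sum(p for k, p in dist.items() if k <= 2)    # p(X <= 2)

for k, p in dist.items():
    print(f"p(X={k}) = {p}")
print("sum of probabilities =", total)
print("p(X<=2) =", p_at_most_2)
```

Exact fractions are used instead of floating point so the results can be compared directly with the table.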
Example 3
A wheel of fortune is divided into three equal parts numbered 1, 2, and 3. When the wheel is turned and
comes to a stop, a needle points to one of the parts. A player turns the wheel 2 times. If Y = sum of the
numbers obtained in the two turns, find the probability distribution of Y. Use your distribution to
obtain p(Y ≥ 5). If you want to bet on which value of Y will appear, which value should you bet on?
Solution
For the two turns of the wheel a tree diagram can be drawn as follows:
You can verify from the tree diagram that Y = 2, 3, 4, 5, 6. Verify that the probability distribution
for Y is
Y       2     3     4     5     6
P(Y)    1/9   2/9   3/9   2/9   1/9
From the table, p(Y ≥ 5) = 2/9 + 1/9 = 1/3. Since Y = 4 has the highest probability, you should bet on
this value.
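The distribution of Y can also be checked by enumerating the nine possible pairs of turns. This is an illustrative sketch only, assuming the three parts of the wheel are equally likely and the two turns are independent; the variable names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

parts = (1, 2, 3)                          # numbers on the wheel
turns = list(product(parts, repeat=2))     # 9 equally likely (first, second) pairs
counts = Counter(a + b for a, b in turns)  # how many pairs give each sum Y

y_dist = {y: Fraction(c, len(turns)) for y, c in sorted(counts.items())}
p_at_least_5 = sum(p for y, p in y_dist.items() if y >= 5)
best_bet = max(y_dist, key=y_dist.get)     # value of Y with the largest probability

for y, p in y_dist.items():
    print(f"p(Y={y}) = {p}")
print("p(Y>=5) =", p_at_least_5)
print("best bet: Y =", best_bet)
```

Note that `Fraction` reduces 3/9 to 1/3 automatically, so the printed value for Y = 4 appears in lowest terms.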
Example 4
The random variable T has the following probability distribution:
T       4      7      9     16    20
P(T)    1/16   3/16   1/4   2x    x
Find the value of x.
Find p( T is at least 10)
Find p( T is at most 8)
Find p( T is in between 5 and 20 exclusive)
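One way to approach Example 4 is to use the requirement that the probabilities sum to 1, which gives 1/16 + 3/16 + 1/4 + 2x + x = 1. The sketch below works this through with exact fractions. It is not the original worked solution, and it reads "between 5 and 20 exclusive" as 5 < T < 20, which is an assumption.

```python
from fractions import Fraction

# Known probabilities from the table.
known = {4: Fraction(1, 16), 7: Fraction(3, 16), 9: Fraction(1, 4)}

# P(T=16) = 2x and P(T=20) = x, so sum(known) + 3x = 1.
x = (1 - sum(known.values())) / 3
t_dist = {**known, 16: 2 * x, 20: x}

p_at_least_10 = sum(p for t, p in t_dist.items() if t >= 10)
p_at_most_8 = sum(p for t, p in t_dist.items() if t <= 8)
p_between_5_and_20 = sum(p for t, p in t_dist.items() if 5 < t < 20)

print("x =", x)
print("p(T >= 10) =", p_at_least_10)
print("p(T <= 8) =", p_at_most_8)
print("p(5 < T < 20) =", p_between_5_and_20)
```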
For any probability distribution we can find the mean and the variance. The mean of a probability
distribution is also known as its expected value.
The formula for the expected value is E(X) = ∑ x p(x), whereas the variance is V(X) = ∑ x² p(x) − [∑ x p(x)]².
Hence the standard deviation s is given by s = √V(X).
Example 5
Find the expected value, variance and standard deviation of the distribution in Example 3.
Solution.
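The worked solution is not reproduced here. As a rough check, the formulas E(X) = ∑ x p(x) and V(X) = ∑ x² p(x) − [∑ x p(x)]² can be applied directly to the table from Example 3. The sketch below is illustrative only; the function name `mean_var_sd` is hypothetical.

```python
from fractions import Fraction
from math import sqrt

def mean_var_sd(dist):
    """Return (expected value, variance, standard deviation) of {x: p(x)}."""
    mean = sum(x * p for x, p in dist.items())            # sum of x p(x)
    mean_of_squares = sum(x * x * p for x, p in dist.items())  # sum of x^2 p(x)
    var = mean_of_squares - mean ** 2
    return mean, var, sqrt(var)

# Distribution of Y from Example 3 (sum of two turns of the wheel).
y_dist = {
    2: Fraction(1, 9),
    3: Fraction(2, 9),
    4: Fraction(3, 9),
    5: Fraction(2, 9),
    6: Fraction(1, 9),
}

mean, var, sd = mean_var_sd(y_dist)
print("E(Y) =", mean)
print("V(Y) =", var)
print("s =", round(sd, 4))
```

The mean and variance are kept as exact fractions; only the standard deviation is converted to a floating point number.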
Binomial Distribution
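This is one of the most popular discrete probability distributions and often arises in applications.
This distribution is generated by a binomial experiment, an experiment that possesses the following
properties:
(a) The experiment consists of n identical trials.
(b) Each trial results in one of two outcomes, a success or a failure.
(c) The probability of success, p, remains unchanged from trial to trial.
(d) The trials are independent.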
A simple example of a binomial experiment is that of tossing a coin. If the coin is tossed 5 times then
n = 5. If our interest is the number of heads that appear, then getting a head is a success. We can
let our random variable X = number of successes (heads) that appear. Since for each toss the probability
of getting a head is 1/2 and remains unchanged from toss to toss, we can let p = 1/2.
Of course the outcome of each toss is independent of the other tosses.
X (the number of successes) has the following probability formula:
$p(X = k) = {}^{n}C_{k}\, p^{k} (1 - p)^{n-k}$
where k = 0, 1, 2, ..., n.
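The binomial formula translates directly into code. The sketch below is a minimal illustration using the coin example with n = 5 and p = 1/2; the function name `binomial_pmf` is hypothetical, and `math.comb` (Python 3.8 or later) supplies the nCk term.

```python
from math import comb

def binomial_pmf(k, n, p):
    """p(X = k) for X ~ B(n; p): nCk * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 5, 0.5
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

for k, prob in enumerate(pmf):
    print(f"p(X={k}) = {prob}")
print("sum of probabilities =", sum(pmf))
```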
Example 6
(i) 6 heads. (ii) at most 3 heads. (iii) no heads at all (iv) all heads
Solution
Solution for Example 6(a) (ii) requires us to add up the probabilities cumulatively. This is quite
laborious when there are many probabilities to add up. A cumulative Binomial table is provided to
ease the job.
How to use cumulative Binomial table.
Make sure you have a table in front of you. Suppose X has a Binomial distribution with n = 20 and p = 0.4.
Often this is written as X ~ B(n; p). In the table look for n = 20, p = 0.4.
Your table will give the value of p(X ≤ k). For example p(X ≤ 2) = 0.0036 (verify using the binomial
probability formula).
Also p( X ≤ 10) = 0.8725.
Remember that since X is discrete p(X < 8) is the same as p( X ≤ 7). Also p(X > 8) means p(X ≥ 9).
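The table values quoted above can be reproduced by summing the binomial formula from 0 up to k. This is a sketch under the same setting, X ~ B(20; 0.4); the function name `binomial_cdf` is hypothetical, and a printed table may differ in the last digit because of rounding.

```python
from math import comb

def binomial_cdf(k, n, p):
    """p(X <= k) for X ~ B(n; p), summing the individual probabilities."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

n, p = 20, 0.4
p_le_2 = binomial_cdf(2, n, p)       # table value p(X <= 2)
p_le_10 = binomial_cdf(10, n, p)     # table value p(X <= 10)
p_lt_8 = binomial_cdf(7, n, p)       # p(X < 8) is the same as p(X <= 7)
p_gt_8 = 1 - binomial_cdf(8, n, p)   # p(X > 8) = p(X >= 9) = 1 - p(X <= 8)

print("p(X <= 2)  =", round(p_le_2, 4))
print("p(X <= 10) =", round(p_le_10, 4))
print("p(X < 8)   =", round(p_lt_8, 4))
print("p(X > 8)   =", round(p_gt_8, 4))
```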
Example 7.
A student decides to guess the answers to a set of multiple choice questions. There are 30 questions and
each question has four choices. He will pass the exam if he gets at least twelve questions correct. Find
the probability that he gets
(c) At least twelve questions correct ( in which case he passed the exam).
Example 8.
The probability that a single radar set will detect an enemy plane is 0.9. If we have five radar sets, what
is the probability that
Normal Distribution
This is the most popular distribution for continuous variables. Variables such as height, weight,
wealth etc. can be considered normal, as extreme values of height, for example, are quite rare.
There are not many people who are either very short or very tall. Most people are near the
average height. This explains the mound shape of the normal distribution curve. In short, if a
random variable X has a normal distribution with mean µ and variance σ² we write X ~ N(µ; σ²).
Recall that for a discrete distribution we must have ∑p(X) = 1. For a continuous distribution the
requirement is ∫ f(x) dx = 1, where the integral is taken over all values of x. The function f(x) is
known as the probability density function of the distribution.
In the case of the normal distribution f(x) = exp(−(x − µ)²/(2σ²)) / (σ√(2π)). The graph of this
function is sketched below.
As usual we are interested in finding probabilities associated with this distribution. As with the
Binomial, a table for the normal distribution is provided. The table is given for the standard normal
variable Z = (X − µ)/σ, which has mean 0 and variance 1.
The Z values range from negative to positive values, and Z = 0 is the middle value. Since the
area under the graph is 1 and the graph is symmetrical about the middle, we must have
p(Z ≤ 0) = p(Z ≥ 0) = 0.5.
Your table will give the value p(0 ≤ Z ≤ k). For example p(0 ≤ Z ≤ 1.25) = 0.3944. Due to
symmetry p(0 ≤ Z ≤ 1.25) = p(−1.25 ≤ Z ≤ 0). See figure below.
The value p(Z > 1.25) = 0.5 − p(0 ≤ Z ≤ 1.25) = 0.5 − 0.3944 = 0.1056.
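The table look-ups above can be cross-checked with Python's standard library. The sketch below assumes Python 3.8 or later, where `statistics.NormalDist` is available; the helper name `table_value` is hypothetical and mimics a table that lists p(0 ≤ Z ≤ k).

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def table_value(k):
    """p(0 <= Z <= k), the quantity listed in the table described above."""
    return Z.cdf(k) - 0.5

p_0_to_125 = table_value(1.25)            # p(0 <= Z <= 1.25)
p_minus125_to_0 = Z.cdf(0) - Z.cdf(-1.25)  # equal to the above by symmetry
p_above_125 = 0.5 - p_0_to_125             # p(Z > 1.25)

print("p(0 <= Z <= 1.25)  =", round(p_0_to_125, 4))
print("p(-1.25 <= Z <= 0) =", round(p_minus125_to_0, 4))
print("p(Z > 1.25)        =", round(p_above_125, 4))
```

Small differences in the fourth decimal place compared with a printed table come from rounding in the table.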
Example 9
The mean weight of a Malaysian male is 70 kg with a standard deviation of 5.5 kg. Assume that
weight is normally distributed. If a Malaysian male is randomly selected, find the
probability that his weight is
(a) At least 78 kg
(b) Between 65 kg and 73 kg
20% of Malaysian males are considered overweight. Find the cutoff weight for "overweight"
in this case. If 15% are considered underweight, find the cutoff weight for "underweight".
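Example 9 can be explored numerically by standardizing with Z = (X − µ)/σ, or equivalently by working with N(70; 5.5²) directly. The sketch below is not the original worked solution. It assumes "overweight" means the heaviest 20% and "underweight" the lightest 15%, and it uses `statistics.NormalDist` from Python 3.8 or later.

```python
from statistics import NormalDist

weight = NormalDist(mu=70, sigma=5.5)   # weight in kg

# (a) p(X >= 78): area to the right of 78 kg.
p_at_least_78 = 1 - weight.cdf(78)

# (b) p(65 <= X <= 73): difference of two cumulative areas.
p_65_to_73 = weight.cdf(73) - weight.cdf(65)

# Cutoffs: the heaviest 20% start at the 80th percentile,
# the lightest 15% end at the 15th percentile (assumed reading).
overweight_cutoff = weight.inv_cdf(0.80)
underweight_cutoff = weight.inv_cdf(0.15)

print("p(X >= 78)       =", round(p_at_least_78, 4))
print("p(65 <= X <= 73) =", round(p_65_to_73, 4))
print("overweight cutoff  (kg) =", round(overweight_cutoff, 2))
print("underweight cutoff (kg) =", round(underweight_cutoff, 2))
```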
Example 10
A machining operation produces bearings with diameters that are normally distributed with a mean of
3.0005 inches and a standard deviation of 0.0010 inch. Specifications require the bearing diameters to
lie in the interval of 3.000 ± 0.0020 inches. Those outside the interval are considered scrap and must be
remachined. With the existing machine setting, what fraction of total production will be scrapped ?
Example 11
A soft drink machine can be regulated so that it discharges an average of µ ounces per cup. If the
ounces of fill are normally distributed with standard deviation 0.3 ounce, give the setting for µ so that
8-ounce cups will overflow only 1% of the time.
In statistics we often work with samples. We find the mean and standard deviation of samples. A closer
look at these samples leads us to the following important theorem.
Central Limit Theorem (CLT)
Given a population distribution with mean µ and standard deviation σ, the sampling distribution of the
sample mean x̄ is approximately normal for a large sample (n ≥ 30), i.e. approximately x̄ ~ N(µ, σ²/n).
However, if the population is normal then there is no restriction on n.
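A small simulation can make the theorem more concrete. The sketch below draws many samples of size n = 30 from a population that is not normal and compares the mean and standard deviation of the sample means with µ and σ/√n. The choice of a uniform population on [0, 1), the number of samples and the random seed are all arbitrary assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable

n = 30            # sample size
n_samples = 2000  # number of samples drawn

# Population: uniform on [0, 1), with mean 1/2 and variance 1/12.
pop_mu = 0.5
pop_sigma = sqrt(1 / 12)

# Draw the samples and record the mean of each one.
sample_means = [
    mean([random.random() for _ in range(n)]) for _ in range(n_samples)
]

observed_mean = mean(sample_means)
observed_sd = stdev(sample_means)
predicted_sd = pop_sigma / sqrt(n)   # sigma / sqrt(n) from the theorem

print("mean of sample means:", round(observed_mean, 4), "(population mean", pop_mu, ")")
print("sd of sample means:  ", round(observed_sd, 4))
print("sigma / sqrt(n):     ", round(predicted_sd, 4))
```

A histogram of `sample_means` would show the mound shape expected of a normal distribution, even though the population itself is flat.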
Example 12
A population has a mean of 28 and variance of 81. A sample of size 100 is taken from the population.
Find the probability that the mean of the sample is
Confidence Interval for the mean
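For the normal distribution define $z_{\alpha/2}$ to be the point on the x axis for which the area under the
curve to its right is $\alpha/2$, that is,
$P(Z > z_{\alpha/2}) = \alpha/2$
By symmetry $P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$. Assuming normality we have
$P\left(-z_{\alpha/2} < \frac{\bar{x} - \mu_x}{\sigma_x/\sqrt{n}} < z_{\alpha/2}\right) = 1 - \alpha$
$P\left(\bar{x} - z_{\alpha/2}\frac{\sigma_x}{\sqrt{n}} < \mu_x < \bar{x} + z_{\alpha/2}\frac{\sigma_x}{\sqrt{n}}\right) = 1 - \alpha$
The interval defined by $\bar{x} \pm z_{\alpha/2}\frac{\sigma_x}{\sqrt{n}}$ is called the $100(1-\alpha)\%$ confidence interval
for the mean with variance known. If the value of $\alpha$ is specified, the upper and lower limits of this
interval can be calculated from the sample average. The probability is $1 - \alpha$ that the true mean lies
between them.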
Example 13
The temperature (in degrees Celsius) at ten points chosen at random in a large building is measured,
giving the following list of readings:
18°, 16.5°, 17.5°, 18°, 19.5°, 16.5°, 18°, 17°, 19°, 17.5°
The standard deviation of temperature is known to be 1 °C. Find a 90% confidence interval for the
mean temperature in the building.
Example 14
A machine fills cartons of liquid; the mean fill is adjustable but the dial on the gauge is not very accurate.
The standard deviation of the quantity of fill is 6 ml. A sample of 30 cartons gave a measured average
content of 570 ml. Find 90 % and 95 % confidence intervals for the mean.
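The confidence interval with known standard deviation, sample mean ± z(α/2) · σ/√n, can be computed as in the sketch below, here with the figures from Example 14. This is an illustration rather than the original worked solution; the function name `z_confidence_interval` is hypothetical, and `statistics.NormalDist` requires Python 3.8 or later.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, level):
    """Confidence interval for the mean when the standard deviation is known."""
    alpha = 1 - level
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_(alpha/2): upper-tail area alpha/2
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

# Example 14: sigma = 6 ml, n = 30 cartons, sample mean 570 ml.
ci_90 = z_confidence_interval(570, 6, 30, 0.90)
ci_95 = z_confidence_interval(570, 6, 30, 0.95)

print("90% CI:", tuple(round(v, 2) for v in ci_90))
print("95% CI:", tuple(round(v, 2) for v in ci_95))
```

The 95% interval is wider than the 90% interval, since a higher confidence level requires a larger z value.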
Suppose we have a situation in which the population standard deviation is unknown, so we have to
estimate it by the sample standard deviation $s_x$. This adds extra uncertainty because the
estimate is itself subject to error. Assuming the population is normal, in this case we need to use the t
distribution:
$T = \frac{\bar{x} - \mu_x}{s_x/\sqrt{n}}$
For a sample of size n, T has a t distribution with n − 1 degrees of freedom. The greater the number of
degrees of freedom, the closer the t distribution is to the normal distribution.
The $100(1-\alpha)\%$ confidence interval in this case is
$\bar{x} \pm t_{\alpha/2,\,n-1}\frac{s_x}{\sqrt{n}}$ (n − 1 degrees of freedom).
Example 15
The measured lifetimes of a sample of 20 electronic components gave an average of 1250 h with
sample standard deviation of 96 h. Assuming that the lifetime has a normal distribution , find a 95 %
confidence interval for the mean lifetime of the population .
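The t interval has the same shape as the z interval, with the sample standard deviation and a t critical value in place of σ and z. The sketch below applies it to Example 15. Python's standard library has no t distribution, so the critical value 2.093 (19 degrees of freedom, upper-tail area 0.025) is hard-coded from a standard t table; that value and the function name `t_confidence_interval` are assumptions of this sketch, not part of the original solution.

```python
from math import sqrt

# Critical value t(0.025, 19) read from a standard t table (assumed input).
T_CRIT_0025_DF19 = 2.093

def t_confidence_interval(xbar, s, n, t_crit):
    """Confidence interval for the mean when sigma is estimated by s."""
    half_width = t_crit * s / sqrt(n)
    return xbar - half_width, xbar + half_width

# Example 15: n = 20 components, sample mean 1250 h, sample sd 96 h.
low, high = t_confidence_interval(1250, 96, 20, T_CRIT_0025_DF19)

print("95% CI for the mean lifetime (h):", round(low, 1), "to", round(high, 1))
```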
Hypothesis Testing
Basically hypothesis testing involves two hypotheses. One is the null hypothesis (H0) and the other
is the alternative hypothesis (Ha). Ha normally contains the assertion of concern. H0 is the
assertion opposite to Ha. However, we always use an equal sign for H0.
For example, some consumers have commented that the net weight of Milo they bought was less than what
is written on the can, which is 500 g. If µ is the mean net weight, the manufacturer claims that µ = 500 g,
whereas our concern is that the true mean net weight is less than 500 g. So we test H0: µ = 500 against
Ha: µ < 500.
Now, how do we make the decision? A tolerance has to be set first. If we take a sample of 30
cans, for example, and find that x̄ is 499 g, should we accept Ha? Probably this would be quite unsafe as
the difference is only 1 gram. So there must be some tolerance.
This tolerance is given by the alpha level (α level). Usually α is set between 0.001 and 0.1. The most
popular choice is α = 0.05.
In general, if Ha contains <, we say the test is left tailed; if it contains >, right tailed; and if it
contains ≠, two tailed.
Next we choose an appropriate test statistic based on the distribution involved. For the normal
distribution we use Zt = (x̄ − µ)/(σ/√n) and compute its value.
Next we find Zα from the expression p(Z ≥ Zα) = α (right tailed test), or p(Z ≤ Zα) = α (left tailed
test), or p(Z ≥ Zα/2) = α/2 and p(Z ≤ −Zα/2) = α/2 (two tailed test).
For the right tailed test, if Zt > Zα then we accept Ha. If not, then we do not have enough evidence to
reject H0. Similarly, for the left tailed test we accept Ha if Zt < Zα, and for the two tailed test we
accept Ha if |Zt| > Zα/2.
Example 16.
There are some concerns about the net weight of 250 g Milo. Some people feel that the actual net weight is
less than 250 g. The Consumer Society conducted a statistical test on 30 randomly selected cans and found
that the sample mean is 245 g. The manufacturer claims that µ = 250 g and σ = 10 g.
Based on α = 0.05, can we say that the concerns need to be investigated further?
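The steps described above (compute Zt, find Zα, compare) can be sketched for Example 16 as follows. Since the concern is that the weight is less than 250 g, the sketch treats this as a left tailed test with H0: µ = 250 and Ha: µ < 250. It is an illustration rather than the original worked solution; the function name `left_tailed_z_test` is hypothetical, and `statistics.NormalDist` requires Python 3.8 or later.

```python
from math import sqrt
from statistics import NormalDist

def left_tailed_z_test(xbar, mu0, sigma, n, alpha):
    """Return (Zt, Z_alpha, accept_ha) for H0: mu = mu0 against Ha: mu < mu0."""
    z_t = (xbar - mu0) / (sigma / sqrt(n))   # test statistic Zt
    z_alpha = NormalDist().inv_cdf(alpha)    # p(Z <= Z_alpha) = alpha, so Z_alpha < 0
    return z_t, z_alpha, z_t < z_alpha

# Example 16: sample mean 245 g, claimed mu = 250 g, sigma = 10 g, n = 30, alpha = 0.05.
z_t, z_alpha, accept_ha = left_tailed_z_test(245, 250, 10, 30, 0.05)

print("Zt      =", round(z_t, 3))
print("Z_alpha =", round(z_alpha, 3))
print("accept Ha (mean weight below 250 g)?", accept_ha)
```

Because Zt falls below Zα, the sample gives evidence at the 5% level that the mean net weight is less than claimed.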