Probability Distribution
Basic Concepts of Probability
We will have a quick revision of probability concepts before moving on to
our topic, probability distributions.
Probability
• A measurement of the chance that some event will happen.
E.g.: There is a 70% chance of rain today.
A smoker is 10% more likely to get cancer.
What is the chance that I will live longer than 70 years?
Types of Probability
• Marginal/Simple probability
• Union probability
• Joint probability
• Conditional probability
1.Marginal or Simple Probability
2.Union Probability
Consider one roll of a fair six-sided die. Here S = {1, 2, 3, 4, 5, 6}
Let A be the event of getting an even number. So A = {2, 4, 6}.
Hence we have P(A) = 3/6
Let B be the event of getting a number that is a multiple of 3. So B = {3, 6}
Hence we have P(B) = 2/6
We can clearly see that the events are not mutually exclusive, since 6 belongs to both.
That is, A ∩ B = {6}, so P(A ∩ B) = 1/6
Thus the union probability is given by:
◾ P(A ∪ B) = P(A) + P(B) – P(A ∩ B) = (3/6) + (2/6) – (1/6) = 4/6 = 2/3
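The addition rule can be checked by direct counting. The following is a minimal sketch, assuming a fair die so that every outcome is equally likely; the helper name `prob` is illustrative only.

```python
from fractions import Fraction

# Sample space for one roll of a six-sided die (assumed fair).
S = {1, 2, 3, 4, 5, 6}
A = {x for x in S if x % 2 == 0}   # even numbers: {2, 4, 6}
B = {x for x in S if x % 3 == 0}   # multiples of 3: {3, 6}

def prob(event, space=S):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(space))

# Addition rule: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
p_union_rule = prob(A) + prob(B) - prob(A & B)

# Direct count of the union {2, 3, 4, 6}, as a cross-check.
p_union_direct = prob(A | B)

print(p_union_rule, p_union_direct)  # 2/3 2/3
```

Both routes give the same value, because subtracting P(A ∩ B) removes the outcome 6 that would otherwise be counted twice.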
3.Joint Probability
Probability of two events occurring together at the same time.
Notation: P(A and B) or P(AB) or P(A ∩ B)
The symbol “∩” in a joint probability is called an intersection. The probability of
event A and event B happening is the same thing as the point where A and B
intersect. Hence, the joint probability is also called the intersection of two or more
events.
We can represent this relation using a Venn diagram as shown below
In the case of only two random variables, this is called a bivariate distribution,
otherwise, it is a multivariate distribution.
Uses the probability formula with the multiplication rule (AND rule)
In case of independent events
P(A ∩ B) = P(A) × P(B)
In case of dependent events
P(A ∩ B) = P(A) × P(B∣A) if P(A) ≠ 0
P(A ∩ B) = P(B) × P(A∣B) if P(B) ≠ 0
(For mutually exclusive events, A and B cannot occur together, so P(A ∩ B) = 0.)
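As a hedged illustration of when the simple product P(A) × P(B) gives the joint probability, the sketch below reuses a fair die roll. The events A (even), B (multiple of 3) and C (odd) and the helper `prob` are chosen for illustration.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}          # one roll of a fair die (assumed)

def prob(event):
    """Probability of an event when all outcomes in S are equally likely."""
    return Fraction(len(event & S), len(S))

A = {2, 4, 6}   # even number
B = {3, 6}      # multiple of 3
C = {1, 3, 5}   # odd number

# A and B are independent: the joint probability equals the product.
print(prob(A & B), prob(A) * prob(B))      # 1/6 1/6

# General rule: P(A ∩ B) = P(A) × P(B|A), with P(B|A) counted inside A.
p_b_given_a = Fraction(len(A & B), len(A))
print(prob(A) * p_b_given_a)               # 1/6

# A and C are mutually exclusive: they cannot occur together,
# so the joint probability is 0 and the plain product does not apply.
print(prob(A & C), prob(A) * prob(C))      # 0 1/4
```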
4.Conditional Probability
◾ In a group of 100 sports car buyers, 40 bought alarm systems,
30 purchased bucket seats, and 20 purchased an alarm system
and bucket seats. If a car buyer chosen at random bought an
alarm system, what is the probability they also bought bucket
seats?
◾ Formula: P(B|A) = P(A ∩ B) / P(A), where A = bought an alarm system and B = bought bucket seats
P(B|A) = (20/100) / (40/100) = 20/40 = 0.5
◾ The probability that a buyer bought bucket seats, given
that they purchased an alarm system, is 50%.
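The same calculation can be sketched in code using the counts from the example. The variable names are illustrative, and the independence check at the end is an extra step not on the slide.

```python
from fractions import Fraction

total = 100   # sports car buyers
alarm = 40    # bought an alarm system (event A)
seats = 30    # bought bucket seats (event B)
both = 20     # bought both (A ∩ B)

p_a = Fraction(alarm, total)
p_b = Fraction(seats, total)
p_a_and_b = Fraction(both, total)

# Conditional probability: P(B|A) = P(A ∩ B) / P(A)
p_b_given_a = p_a_and_b / p_a

# A and B are independent only if P(A) × P(B) equals P(A ∩ B).
independent = (p_a * p_b == p_a_and_b)

print(p_b_given_a, independent)  # 1/2 False
```

Here P(A) × P(B) = 0.12 while P(A ∩ B) = 0.20, so buying an alarm system and buying bucket seats are dependent events in this group.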
Bayes’ Theorem on Conditional Probability
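As a minimal sketch, Bayes' theorem in its standard form, P(A|B) = P(B|A) × P(A) / P(B), can be applied to the car buyer figures from the conditional probability example (A = alarm system, B = bucket seats). The variable names are illustrative.

```python
from fractions import Fraction

p_a = Fraction(40, 100)        # P(A): bought an alarm system
p_b = Fraction(30, 100)        # P(B): bought bucket seats
p_b_given_a = Fraction(1, 2)   # P(B|A), from the worked example

# Bayes' theorem reverses the direction of the conditioning.
p_a_given_b = p_b_given_a * p_a / p_b

# Cross-check by direct counting: 20 of the 30 seat buyers have an alarm.
p_direct = Fraction(20, 30)

print(p_a_given_b, p_direct)  # 2/3 2/3
```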
Uses
◾ Different probability distributions help us to know more about the data and its
characteristics.
◾ They help us to understand what the possible outcomes could be if the data follow a
particular distribution.
◾ Explaining levels of risk to patients, accessing clinical guidelines and evidence
summaries, assessing medical marketing and advertising material, interpreting
screening test results, reading research
◾ They are also used in hypothesis testing to determine p values
Notations:
• A discrete distribution has a range of values that are countable.
◾ A continuous distribution has a range of values that are infinite, and
therefore uncountable.
Binomial Distribution
◾ Type of discrete probability distribution
◾ Discrete outcome with dichotomous nature
Binomial distribution:
Notation: B(n, p), where n = number of trials and p = probability of success in each one.
E.g.: X ~ B(10, 0.6)
means that the variable X follows a binomial distribution with 10 trials and a likelihood
of success of 0.6 on each individual trial.
Bernoulli distribution:
We can express a Bernoulli distribution as a binomial distribution with a single
trial.
Bern(p) = B(1, p)
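To make the notation concrete, here is a minimal sketch for X ~ B(10, 0.6). It assumes the standard binomial formula P(X = k) = C(n, k) p^k (1 − p)^(n − k), which is not written out on the slide; the function name `binomial_pmf` is illustrative.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p): k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.6

# Probability of each possible number of successes, 0 to 10.
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Mean of the distribution, computed from the probabilities (equals n × p).
mean = sum(k * pk for k, pk in enumerate(pmf))

print(round(binomial_pmf(6, n, p), 4))  # 0.2508

# A Bernoulli(p) variable is the single-trial case B(1, p).
bernoulli_success = binomial_pmf(1, 1, p)
```

The probabilities in `pmf` sum to 1, and `bernoulli_success` is simply p, which matches Bern(p) = B(1, p).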
Poisson Distribution
◾ A discrete probability distribution
◾ The Poisson distribution gives the probability of a given number of events
happening in a fixed interval of time. (It is used when we want to test how unusual
an event frequency is for a given interval.)
◾ Properties
The occurrences of the events are independent
The probability of a single occurrence of the event in a given
interval is proportional to the length of the interval
The Poisson distribution describes the behaviour of rare events (with small
probabilities) such as patients arriving at an emergency room, decaying
radioactive atoms, bank customers coming to their bank, number of suicide
cases in adolescence, deaths in a calamity
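◾ Variance = mean (the Poisson distribution arises as the limit of the binomial when the no. of trials is very large and the probability of success is small)
i.e. Mean (m) = Variance
SD = √Variance = √m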
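The mean = variance property can be checked numerically. This is a minimal sketch assuming the standard Poisson formula P(X = k) = e^(−m) m^k / k!, which does not appear on the slide; the mean m = 3 is a hypothetical value chosen for illustration.

```python
from math import exp, factorial, sqrt

def poisson_pmf(k, m):
    """P(X = k) for a Poisson variable with mean m."""
    return exp(-m) * m**k / factorial(k)

m = 3.0           # hypothetical mean number of events per interval
ks = range(60)    # truncating the infinite sum; the remaining tail is negligible

mean = sum(k * poisson_pmf(k, m) for k in ks)
variance = sum((k - mean) ** 2 * poisson_pmf(k, m) for k in ks)
sd = sqrt(variance)

print(mean, variance, sd)  # mean and variance both approximately 3
```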
◾ If you want to find the probability of a certain number of events
happening in a period of time (or number of events), then use
the Poisson Distribution.
◾ If you are given an exact probability and you want to find the
probability of the event happening a certain number of times
out of x (e.g. 10 times out of 100, or 99 times out of 1000), use
the Binomial Distribution.
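The two situations can be sketched side by side. The numbers below (a success probability of 0.1 per trial, and an average of 4 events per hour) are hypothetical and the function names are illustrative; the formulas are the standard binomial and Poisson probability functions.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """P(exactly k successes in n trials), each with success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, m):
    """P(exactly k events in an interval) when the mean count is m."""
    return exp(-m) * m**k / factorial(k)

# Known per-trial probability and a fixed number of trials: binomial.
# Hypothetical question: exactly 10 successes out of 100 trials with p = 0.1.
p_binom = binomial_pmf(10, 100, 0.1)

# Known average count per interval, no fixed number of trials: Poisson.
# Hypothetical question: exactly 2 events in an hour when the mean is 4 per hour.
p_pois = poisson_pmf(2, 4)

print(p_binom, p_pois)
```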
Normal Distribution
◾ Most important distribution in all of statistics
◾ It is defined as a continuous frequency distribution of
infinite range
Recap:- Points
• Frequently found in nature, hence named so
• Bell shape with a single peak at the centre of the distribution
• Arithmetic mean, median and mode are equal
• The total area under the curve is 1; half of the area is to the right
of the centre point and half to the left
• Symmetrical about the mean
• Asymptotic: the curve gets closer and closer to the X axis but never
actually touches it. The tails of the curve extend indefinitely in both
directions
• The location of the normal distribution is determined by the mean, and its spread by the
standard deviation
1. Bell shaped curve
2. Continuous probability curve
3. It is symmetrical about its mean – The curve on the either side of mean
is a mirror image of the other side
4. The mean, median and mode are equal
5. Total area under the curve is one square unit or 100%
6. The normal distribution is completely determined by two parameters, mean
(µ) & standard deviation(σ)
7. Curve is symmetrical & asymptotic (it approaches the X axis without ever touching it; range between -∞
and ∞)
8. Three sigma rule: the empirical rule states that in normal
distributions, about 68% of observations fall within one standard deviation (µ
± 1σ), 95% within two standard deviations (µ ± 2σ), and 99.7%
within three standard deviations (µ ± 3σ) of the mean.
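The three sigma percentages can be reproduced with the error function from Python's standard library. This is a minimal sketch; it relies on the standard identity that, for any normal variable, the probability of falling within k standard deviations of the mean is erf(k / √2). The function name is illustrative.

```python
from math import erf, sqrt

def within_k_sigma(k):
    """P(µ - kσ < X < µ + kσ) for a normally distributed X."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(k, round(within_k_sigma(k) * 100, 1))
# 1 68.3
# 2 95.4
# 3 99.7
```

The result does not depend on µ or σ, which is why the rule holds for every normal distribution.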
◾ Data obtained from biological measurements approximately
follow normal distribution
◾ Binomial & Poisson distributions can be approximated by the
normal distribution
◾ For a large sample, any statistic (mean, SD etc.) approximately
follows a normal distribution
◾ Normal curve is used to find the confidence limits of the
population parameters
◾ Normal distribution is the basis of tests of significance
◾ A normal distribution with a mean of zero and a standard deviation
of one is called a standard normal distribution.
Standard Normal Deviate and Z-scores
The transformation that converts every normal distribution to a distribution with mean 0
and standard deviation 1 gives the standard normal distribution, or simply standard
distribution, and the individual values are called
z scores, standard scores or standard normal deviates (SND).
Every normal random variable X can be transformed into a z score via the
equation z = (X − µ) / σ
◾ A z-score tells you where the score lies on a normal distribution curve
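A minimal sketch of computing a z-score, using z = (X − µ) / σ, and locating it on the standard normal curve. The exam-score numbers (mean 70, standard deviation 10, score 85) are hypothetical, and the function names are illustrative.

```python
from math import erf, sqrt

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

def standard_normal_cdf(z):
    """Proportion of the standard normal distribution lying below z."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Hypothetical example: scores with mean 70 and standard deviation 10.
z = z_score(85, 70, 10)

print(z, round(standard_normal_cdf(z), 4))  # 1.5 0.9332
```

A score of 85 therefore lies 1.5 standard deviations above the mean, with about 93% of the distribution below it.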
Recap:-