Probability Distribution

Uploaded by

chippybabu18
Copyright: © All Rights Reserved

PROBABILITY DISTRIBUTIONS

Basic Concepts of Probability
We will have a quick revision of probability concepts before moving on to
our topic, probability distributions.
Probability
• A measurement of the chance that some event will happen.
Eg:- There is a 70% chance of rain today.
A smoker is 10% more likely to get cancer.
What is the chance that I will live longer than 70 years?

• In probability we assign a number to the chance that a particular event
will happen.
Importance of the Concept of Probability

According to Ya-lun Chou:
“Statistics as a method of decision-making under
uncertainty, is founded on probability theory, since
probability is at once the language and the measure of
uncertainty and the risks associated with it.”
Basic Terms in Probability

• Experiment-Any process that produces an outcome that cannot be predicted with
certainty. It consists of a number of trials.
eg:-Doing a survey, conducting experimental studies

• Sample space-Set of all possible outcomes

• Sample point-One element of sample space

• Event- Subset of sample space you are interested in.

• Outcome-Result of one trial in an experiment

See the next slide for further terms


Probability = Possibility

◾ Probability means “chance”.
◾ Probability is the likelihood that the event will occur.
◾ Two conditions:
• The probability of any event lies between 0 and 1.
• The probabilities of all outcomes in the sample space must sum to 1.
◾ Zero probability implies that something is impossible.
◾ A probability of 1 means something is certain.

◾ Probability = (no. of favourable outcomes) / (total no. of all possible outcomes),
when all outcomes are equally likely.
Random Variable

Probability is often associated with at least one event. This event can be
anything.

Examples include rolling a die or pulling a coloured ball out of a bag.

In these examples the outcome is random (you can’t be sure of the value that
the die will show when you roll it), so the variable that represents the
outcome is called a random variable (often abbreviated to RV).
◾ Let A denote an event. The probability of that event is usually
written P(A) or Pr(A).
◾ The complement of an event, Aᶜ, is everything not in that event.
◾ P(Aᶜ) = 1 - P(A)
◾ The complement of ‘rain tomorrow’ is ‘no rain tomorrow’; the
complement of ‘3 or fewer cyclones’ is ‘4 or more cyclones’.
Types of Probability

• Marginal/Simple probability
• Union probability
• Joint probability
• Conditional probability
1.Marginal or Simple Probability

◾ If A is an event, then the marginal probability is just the probability
of that event occurring, P(A).
◾ It is an unconditional probability: it is not conditioned on
another event.
◾ Example: You are rolling a die. What is the probability of
getting 2?
Probability = (no. of favourable outcomes) / (total no. of all possible outcomes)
= 1/6 ≈ 0.167
◾ The probability of getting 2 is about 0.167, or 16.7%.
2.Union Probability

◾ It is the probability that event A or event B happens, that is, the event that
occurs if either A or B or both occur.
◾ The symbol "∪" (union) means "or", i.e., P(A ∪ B).
◾ Uses the additive rule for probability (‘or’ rule):
Rule 1 (general case, events not mutually exclusive): add the individual
probabilities and subtract the intersection.
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
Rule 2 (mutually exclusive events, where P(A ∩ B) = 0):
P(A ∪ B) = P(A) + P(B)
◾ Consider the experiment of rolling a die. Find the probability of getting an
even number or a number that is a multiple of 3.

Here S = {1, 2, 3, 4, 5, 6}
Let A be the event of getting an even number. So A = {2, 4, 6}.
Hence we have P(A) = 3/6
Let B be the event of getting a number that is a multiple of 3. So B = {3, 6}.
Hence we have P(B) = 2/6
The events are not mutually exclusive, since A ∩ B = {6}.
That is, P(A ∩ B) = 1/6
Thus the compound probability is given by:
◾ P(A ∪ B) = P(A) + P(B) – P(A ∩ B) = (3/6) + (2/6) – (1/6) = 4/6 = 2/3
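The die example can be checked by counting outcomes directly. The sketch below is only an illustration: the helper name `prob` is an assumption, and it relies on all six faces being equally likely.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}               # one roll of a fair die
A = {x for x in sample_space if x % 2 == 0}     # even number
B = {x for x in sample_space if x % 3 == 0}     # multiple of 3

def prob(event):
    """Classical probability: favourable outcomes / all possible outcomes."""
    return Fraction(len(event), len(sample_space))

p_union_direct = prob(A | B)                    # count A ∪ B = {2, 3, 4, 6} directly
p_union_rule = prob(A) + prob(B) - prob(A & B)  # additive ('or') rule

print(p_union_direct, p_union_rule)             # both give 2/3
```

Using exact fractions rather than floats keeps the two routes to the answer comparable without rounding.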
3.Joint Probability

◾ The joint probability is a statistical measure of the probability of two
events occurring together at the same time.
◾ Notation:- P(A and B) or P(AB) or P(A ∩ B)
◾ The symbol “∩” in a joint probability is called an intersection. The probability of
event A and event B happening is the same thing as the point where A and B
intersect. Hence, the joint probability is also called the intersection of two or more
events.
◾ We can represent this relation using a Venn diagram.
(Figure: Venn diagram of A and B, with the overlap representing A ∩ B.)
◾ In the case of only two random variables, the joint distribution is called a bivariate
distribution; otherwise, it is a multivariate distribution.
◾ Uses the multiplication rule for probability (‘and’ rule):
If the events are independent
P(A ∩ B) = P(A) × P(B)
If the events are not independent
P(A ∩ B) = P(A) × P(B∣A) if P(A) ≠ 0
P(A ∩ B) = P(B) × P(A∣B) if P(B) ≠ 0
(For mutually exclusive events, P(A ∩ B) = 0.)
4.Conditional Probability

◾ In a group of 100 sports car buyers, 40 bought alarm systems,
30 purchased bucket seats, and 20 purchased an alarm system
and bucket seats. If a car buyer chosen at random bought an
alarm system, what is the probability they also bought bucket
seats?
◾ Let A = bought an alarm system and B = bought bucket seats.
◾ Formula: P(B|A) = P(A∩B) / P(A)
P(B|A) = (20/100) / (40/100) = 20/40 = 0.5
◾ The probability that a buyer bought bucket seats, given
that they purchased an alarm system, is 50%.
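The sports car calculation can be reproduced from the counts in the example. This is a minimal sketch, with variable names chosen only for illustration.

```python
from fractions import Fraction

total_buyers = 100
alarm = 40      # event A: bought an alarm system
both = 20       # event A ∩ B: bought an alarm system and bucket seats

p_a = Fraction(alarm, total_buyers)        # P(A)
p_a_and_b = Fraction(both, total_buyers)   # P(A ∩ B)

# Conditional probability: P(B|A) = P(A ∩ B) / P(A), valid because P(A) != 0
p_b_given_a = p_a_and_b / p_a

print(p_b_given_a)   # 1/2
```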
Bayes’ Theorem on Conditional Probability

Bayes’ Theorem gives the probability of an event conditional on the
occurrence of another event, so it is a result about conditional probability. It helps in
calculating the probability of occurrence of one event given that
another event has occurred. It is named after Thomas Bayes, who
was an English statistician.

P(A|B) = P(A and B) / P(B), if P(B) ≠ 0

P(A and B) = P(A∩B)
P(A∩B) = P(A) × P(B∣A) if P(A) ≠ 0
P(A∩B) = P(B) × P(A∣B) if P(B) ≠ 0
Probability Rules Recap:-
‘and’ rule
When the events are independent, the joint probability is just the
product of the individual marginal probabilities of the events:
P(A ∩ B) = P(A) × P(B).

‘or’ rule – Additive Rule for probability
With the ‘and’ rule we had to multiply the individual probabilities.
But when we’re in the ‘or’ scenario we have to add the individual
probabilities and subtract the intersection. Mathematically we
write this as
Rule 1- P(A ∪ B) = P(A) + P(B) - P(A ∩ B).
Rule 2- P(A ∪ B) = P(A) + P(B) (mutually exclusive events)
General Multiplication Rule
• The general multiplication rule, P(A ∩ B) = P(A) × P(B∣A), is a beautiful equation
that links three types of probability: simple (marginal), joint and conditional.
Probability Theorems
Theorem 1:
The sum of the probabilities of an event happening and not happening is equal to 1.
P(A) + P(A') = 1.
Theorem 2:
The probability of an impossible event is always equal to 0.
P(ϕ) = 0.
Theorem 3:
The probability of a sure event is always equal to 1. P(S) = 1, where S is the sample space.
Theorem 4:
The probability of any event always lies between 0 and 1, inclusive. 0 ≤ P(A) ≤ 1
Theorem 5:
If there are two events A and B, we can apply the formula for the union of two sets to derive the
probability of event A or event B happening:
P(A∪B) = P(A) + P(B) - P(A∩B)
For two mutually exclusive events A and B, we have P(A∪B) = P(A) + P(B)
Bayes' Theorem on Conditional Probability
Bayes' theorem describes the probability of an event based on the condition of occurrence of another
event. It helps in calculating the probability of one event given that another event has happened.
The conditional probability of an event A, given the occurrence of another event B, is equal to the
product of the probability of B given A and the probability of A, divided by the probability of B,
i.e. P(A|B) = P(B|A) × P(A) ÷ P(B), provided P(B) ≠ 0.
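As a worked illustration, Bayes' theorem can be applied to the sports car buyers from the conditional probability example (A = alarm system, B = bucket seats) to reverse the conditioning. Reusing that data here is an assumption made for illustration; the slides do not work this case.

```python
from fractions import Fraction

p_a = Fraction(40, 100)          # P(A): bought an alarm system
p_b = Fraction(30, 100)          # P(B): bought bucket seats
p_b_given_a = Fraction(20, 40)   # P(B|A), from the earlier example

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B), valid because P(B) != 0
p_a_given_b = p_b_given_a * p_a / p_b

# Cross-check by counting: 20 of the 30 bucket-seat buyers also bought an alarm
p_a_given_b_direct = Fraction(20, 30)

print(p_a_given_b, p_a_given_b_direct)   # both 2/3
```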
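Law of Total Probability
If A1, A2, …, An are mutually exclusive and exhaustive events of an experiment, then the sum of
their probabilities is always equal to 1.
P(A1) + P(A2) + P(A3) + … + P(An) = 1
Consequently, for any event B,
P(B) = P(B|A1) × P(A1) + P(B|A2) × P(A2) + … + P(B|An) × P(An)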
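The law of total probability says that if A1, …, An are mutually exclusive and exhaustive, then P(B) = P(B|A1) × P(A1) + … + P(B|An) × P(An). A small sketch, again borrowing the sports car data as an assumed illustration, with the partition "bought an alarm" / "did not buy an alarm":

```python
from fractions import Fraction

# Partition of the 100 buyers: alarm (40) and no alarm (60)
p_alarm = Fraction(40, 100)
p_no_alarm = Fraction(60, 100)

# Bucket seats within each part: 20 of the 40 alarm buyers,
# and 30 - 20 = 10 of the 60 buyers without an alarm
p_seats_given_alarm = Fraction(20, 40)
p_seats_given_no_alarm = Fraction(10, 60)

# The parts are mutually exclusive and exhaustive, so their probabilities sum to 1
partition_total = p_alarm + p_no_alarm

# Law of total probability: weight each conditional probability by its part
p_seats = (p_seats_given_alarm * p_alarm
           + p_seats_given_no_alarm * p_no_alarm)

print(partition_total, p_seats)   # 1 and 3/10
```

The result matches the 30 bucket-seat buyers out of 100 stated in the example.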
Recap :-
Distributions in Statistics
• The distribution of a statistical dataset is the spread of the data, which shows all
possible values or intervals of the data and how often they occur.
• That is, the possible values a variable can take and how frequently they occur.
Definition
“A distribution is simply a collection of data or scores on a variable. Usually,
these scores are arranged in order from ascending to descending and then they
can be presented graphically.”

A frequency distribution describes a specific sample or dataset. It is the
number of times each possible value of a variable occurs in the dataset. (Refer to the
previous class slides.)
Next we will look at probability distributions.
◾ A probability distribution is an idealized frequency distribution.
◾ It is the mathematical function that gives the probabilities of occurrence of
different possible outcomes for an experiment.
◾ It summarizes the probabilities of all possible values of a random variable, the
quantity that represents the outcome of an experiment or situation. (2 types: discrete and continuous)

Uses
◾ Different probability distributions help us to know more about the data and its
characteristics.
◾ They help us to understand what the possible outcomes could be if the data follow a
particular distribution.
◾ They support explaining levels of risk to patients, accessing clinical guidelines and evidence
summaries, assessing medical marketing and advertising material, interpreting
screening test results, and reading research.
◾ They’re also used in hypothesis testing to determine p values.
Notations:

Let Y = the actual outcome of an event
y = one of the possible outcomes (each distinct outcome)
The probability function is written P(Y=y) or P(y).
Eg:
Y = number of red marbles drawn out of a box
A specified number of red marbles, say 5, is represented by y
Therefore we write P(5) or P(Y=5)
Notation for Probability Distributions

• Variables that follow a probability distribution are called random
variables. There’s special notation you can
use to say that a random variable follows a specific distribution.
• Random variables are usually denoted by X.
• The ~ (tilde) symbol means “follows the distribution.”
• The distribution is denoted by a capital letter (usually the first letter
of the distribution’s name), followed by brackets that contain the
distribution’s parameters (for example, the mean and variance).
• For example, the notation X ~ N(µ, σ²) means “the random
variable X follows a normal distribution with a mean of µ and a
variance of σ².”
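The same notation can be mirrored in code. The sketch below uses `NormalDist` from Python's standard `statistics` module; the parameter values (mean 100, variance 225) are hypothetical. Note that `NormalDist` is parameterised by the standard deviation σ, not the variance σ², so the square root is taken first.

```python
from statistics import NormalDist

mu = 100.0        # hypothetical mean
variance = 225.0  # hypothetical variance (sigma squared)

# X ~ N(mu, sigma^2): NormalDist expects sigma, so pass the square root
X = NormalDist(mu=mu, sigma=variance ** 0.5)

print(X.mean, X.stdev, X.variance)
```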
Probability distributions are broadly classified into two types:
• Discrete probability distributions
• Continuous probability distributions

Discrete probability distributions
◾ The range of values is countable.
◾ Made up of discrete variables.
◾ Random variables that take discrete values.
◾ Usually described with a frequency distribution table, or other type of graph or chart.
◾ Described by a Probability Mass Function (PMF).
◾ The probability distribution will be unconnected individual bars.
Eg:- Number of children in a family

Continuous probability distributions
◾ The range of values is infinite, and therefore uncountable.
◾ Made up of continuous variables.
◾ Random variables that take continuous values.
◾ Expressed with a formula called the Probability Density Function, describing the shape of
the distribution.
◾ Described by a Probability Density Function (PDF).
◾ The probability distribution will be a curve.
Eg:- Time, distance
DISCRETE PROBABILITY DISTRIBUTIONS
◾ Binomial distribution
◾ Poisson distribution
◾ Geometric distribution
◾ Hypergeometric distribution
◾ Multinomial distribution

CONTINUOUS PROBABILITY DISTRIBUTIONS
◾ Normal/Gaussian distribution
◾ Student's t
◾ Chi-squared
◾ Uniform
◾ Exponential distribution
Binomial Distribution
◾ A type of discrete probability distribution
◾ Discrete outcome with dichotomous nature
◾ Only two possible outcomes
◾ A binomial distribution can be thought of as simply the
probability of a SUCCESS or FAILURE outcome in an
experiment or survey that is repeated multiple times.
◾ Eg: Taking a test could have two possible outcomes: pass
or fail
Binomial distributions must meet the following criteria:
◾ Similar trials are carried out several times in a row.
◾ Each trial results in one of two possible, mutually exclusive
outcomes: Success and Failure.
◾ Each observation or trial is independent. The outcome of any particular
trial is not affected by the outcome of any other trial.
◾ The probability of success (p) is exactly the same from one trial to
another. The probability of a failure, 1-p, is denoted by q.
Bernoulli distribution
The binomial distribution is built from Bernoulli trials, the individual trials of the
experiment. A Bernoulli distribution comprises one trial and two possible
outcomes. Binomial events are a sequence of identical Bernoulli events: in a binomial
distribution each iteration has two outcomes, as in the Bernoulli distribution, but
there are many iterations.
A variable X that follows a Bernoulli distribution with probability of success p is
written X ~ Bern(p). Here σ² = p(1-p) and σ = √(p(1-p)).

Eg: coin flip, single true/false quiz question
Representing Binomial and Bernoulli distribution

Binomial distribution:-
Notation- B(n, p), where n = no. of trials and p = probability of success in each one.
Eg:- X ~ B(10, 0.6)
means that the variable X follows a binomial distribution with 10 trials and a likelihood
of success of 0.6 on each individual trial.

Bernoulli distribution:-
We can express a Bernoulli distribution as a binomial distribution with a single
trial.
Bern(p) = B(1, p)
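To make the notation X ~ B(10, 0.6) concrete, the sketch below tabulates its probabilities. The slides do not state the probability mass function, so the standard formula P(X = k) = C(n, k) × p^k × q^(n-k) is assumed here, and the function name `binomial_pmf` is illustrative.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p): C(n, k) * p^k * q^(n-k), with q = 1 - p."""
    q = 1 - p
    return comb(n, k) * p**k * q**(n - k)

n, p = 10, 0.6   # X ~ B(10, 0.6), as in the example above
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

mean = sum(k * pk for k, pk in enumerate(pmf))
variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

print(round(sum(pmf), 6))    # the probabilities add up to 1
print(round(mean, 6))        # equals n * p
print(round(variance, 6))    # equals n * p * q

# Bern(p) = B(1, p): a single trial succeeds with probability p
bern_success = binomial_pmf(1, 1, p)
```

With a single trial the same function reproduces the Bernoulli case, which is the sense in which Bern(p) = B(1, p).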
Poisson Distribution
◾ A discrete probability distribution
◾ The Poisson distribution gives the probability of a given number of events
happening in a fixed interval of time. (It is used when we want to test how unusual
an event frequency is for a given interval.)
◾ Properties
• The occurrences of the events are independent.
• The probability of a single occurrence of the event in a given
interval is proportional to the length of the interval.
◾ The Poisson distribution describes the behaviour of rare events (with small
probabilities) such as patients arriving at an emergency room, decaying
radioactive atoms, bank customers coming to their bank, number of suicide
cases in adolescence, deaths in a calamity.
◾ Variance = mean. (The Poisson distribution is the limit of the binomial when the
no. of trials is very large and the probability of success is small.)
i.e. Mean (m) = Variance
SD = √Variance = √m
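The mean = variance property can be checked numerically. The slides do not give the Poisson formula, so this sketch assumes the standard probability function P(Y = k) = e^(-m) × m^k / k!, with a hypothetical mean of m = 3 events per interval. The infinite sum is truncated at k = 59, where the remaining probability is negligible.

```python
from math import exp, factorial

def poisson_pmf(k, m):
    """P(Y = k) for a Poisson variable with mean m: e^(-m) * m^k / k!."""
    return exp(-m) * m**k / factorial(k)

m = 3.0   # hypothetical mean number of events per interval

# Truncate the infinite support; terms beyond k = 59 are negligible for m = 3
pmf = [poisson_pmf(k, m) for k in range(60)]

mean = sum(k * pk for k, pk in enumerate(pmf))
variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))
sd = variance ** 0.5

print(round(mean, 6), round(variance, 6), round(sd, 6))
```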
◾ If you want to find the probability of a certain number of events
happening in a period of time (or other fixed interval), then use
the Poisson Distribution.

◾ If you are given an exact probability and you want to find the
probability of the event happening a certain number of times
out of n trials (i.e. 10 times out of 100, or 99 times out of 1000), use
the Binomial Distribution.
Normal Distribution
◾ Most important distribution in all of statistics
◾ It is defined as a continuous frequency distribution of
infinite range
Recap:- Points
• Frequently found in nature, hence the name “normal”
• Bell shape with a single peak at the centre of the distribution
• Arithmetic mean, median and mode are equal
• The total area under the curve is 1; half of the area lies to the right
of the centre point and half to the left
• Symmetrical about the mean
• Asymptotic- The curve gets closer and closer to the X axis but never
actually touches it. The tail of the curve extends indefinitely in both
directions
• The location of the normal distribution is determined by the mean, and its
spread by the standard deviation
1. Bell shaped curve
2. Continuous probability curve
3. It is symmetrical about its mean – the curve on either side of the mean
is a mirror image of the other side
4. The mean, median and mode are equal
5. Total area under the curve is one square unit or 100%
6. The normal distribution is completely determined by two parameters, mean
(µ) & standard deviation (σ)
7. The curve is symmetrical & asymptotic (it approaches the X axis but meets it only at
infinity; the range is from -∞ to ∞)
8. Three sigma rule - the empirical rule predicts that in normal
distributions, about 68% of observations fall within one standard deviation (µ
± 1σ), about 95% within two standard deviations (µ ± 2σ), and about 99.7%
within three standard deviations (µ ± 3σ) of the mean.
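The three sigma percentages can be recovered from the normal curve itself. For any normal variable, the probability of falling within k standard deviations of the mean equals erf(k/√2); the sketch below evaluates it with Python's standard `math` module (the function name `prob_within` is illustrative).

```python
from math import erf, sqrt

def prob_within(k):
    """P(mu - k*sigma < X < mu + k*sigma) for any normally distributed X."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    # Prints roughly 68.27%, 95.45% and 99.73%
    print(f"within {k} SD: {prob_within(k):.2%}")
```

The result does not depend on µ or σ, which is why the rule applies to every normal distribution.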
◾ Data obtained from biological measurements approximately
follow a normal distribution
◾ Binomial & Poisson distributions can be approximated by the
normal distribution
◾ For a large sample, statistics such as the mean and SD approximately
follow a normal distribution
◾ The normal curve is used to find the confidence limits of the
population parameters
◾ The normal distribution is the basis of tests of significance

◾ A normal distribution with a mean of zero and a standard deviation
of one is called a standard normal distribution.
Standard Normal Deviate and Z-scores
Z scores are also called standard scores or the standard normal deviate (SND).

A normal distribution with mean = 0 and SD = 1 is called the standard normal
distribution. A random variable with the standard normal distribution is called a
standard normal random variable, denoted by Z.

Standardizing transforms every normal distribution to a distribution with mean
zero and standard deviation 1. This distribution is called the standard normal
distribution, or simply the standard distribution, and the individual values are called
standard scores or z scores.

Every normal random variable X can be transformed into a z score via the
following equation: z = (X - μ) / σ, where X is the value of the element, μ is the
mean, and σ is the standard deviation of X.

◾ Z scoring, a.k.a. standardizing, converts all values of
a variable to a new scale where mean = 0 and SD = 1.
• A z-score tells you the number of standard deviations
from the mean a data point is.
◾ The distance of a particular value from the mean, expressed in
terms of SD, is called the z score or SND. It is the key to determining the area of the
normal curve associated with that value.
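The equation z = (X - μ) / σ translates directly into code. In this sketch the population figures (mean weight 140 pounds, SD 20 pounds) are hypothetical values chosen for illustration, and `z_score` is an illustrative name.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Hypothetical population: mean weight 140 lb, SD 20 lb
z = z_score(150, mu=140, sigma=20)

# Area of the standard normal curve to the left of z
area_below = NormalDist().cdf(z)

print(z, round(area_below, 4))   # z is 0.5; the area is about 0.69
```

Here a weight of 150 pounds sits half a standard deviation above the assumed mean, heavier than roughly 69% of that population.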
◾ Z-scores are a way to compare results from a test to a “normal”
population
◾ For example, knowing that someone’s weight is 150 pounds might be
good information, but if you want to compare it to the “average” person’s
weight, looking at a vast table of data can be overwhelming (especially if
some weights are recorded in kilograms)
◾ A z-score can tell you where that person’s weight is compared to the
average population’s mean weight.

◾ A z-score tells you where the score lies on a normal distribution curve
◾ A z-score of 1 is 1 standard deviation above the mean
◾ A z-score of 2 is 2 standard deviations above the mean
◾ A z-score of -1.8 is 1.8 standard deviations below the mean
◾ A z-score of zero tells you the value is exactly average, while a score of +3
tells you that the value is much higher than average
