Stats Fundamentals01

This document discusses probability concepts including conditional probability, law of total probability, additive law, multiplication rule, Bayes' law, discrete and continuous distributions, and uniform distribution. It provides definitions, formulas, and intuition for understanding each concept. Discrete distributions are defined as having a finite number of possible outcomes while continuous distributions have infinitely many possible outcomes.

Uploaded by

Rob
Copyright
© All Rights Reserved

365 DATA SCIENCE

3.3 Conditional Probability

For any two events A and B, such that the likelihood of B occurring is greater than
0 (𝑃(𝐵) > 0), the conditional probability formula states the following.

𝑃(𝐴|𝐵) = 𝑃(𝐴∩𝐵) / 𝑃(𝐵)

where 𝑃(𝐴|𝐵) is the probability of A, given B has occurred, 𝑃(𝐴∩𝐵) is the
probability of the intersection, and 𝑃(𝐵) is the probability of event B.

Intuition behind the formula:

• Only interested in the outcomes where B is satisfied.
• Only the elements in the intersection would satisfy A as well.
• Parallel to the “favoured over all” formula:
  • Intersection = “preferred outcomes”
  • B = “sample space”

Remember:

• Unlike the union or the intersection, changing the order of A and B in
the conditional probability alters its meaning.
• 𝑃(𝐴|𝐵) is not the same as 𝑃(𝐵|𝐴), even if 𝑃(𝐴|𝐵) = 𝑃(𝐵|𝐴) numerically.
• The two conditional probabilities possess different meanings even if
they have equal values.
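
As a minimal sketch of the formula, assume a fair six-sided die and three illustrative events (A, B and C are hypothetical choices, not taken from the notes). The helper names `prob` and `conditional` are also made up for this example.

```python
from fractions import Fraction

# Assumed sample space: the six faces of a fair die.
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # event A: the roll is even
B = {4, 5, 6}   # event B: the roll is greater than 3
C = {6}         # event C: the roll is a six


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))


def conditional(a, b):
    """P(a | b) = P(a ∩ b) / P(b); only defined when P(b) > 0."""
    if prob(b) == 0:
        raise ValueError("P(b) must be greater than 0")
    return prob(a & b) / prob(b)


# Equal values, different meanings: both are 2/3 here.
p_a_given_b = conditional(A, B)
p_b_given_a = conditional(B, A)

# Swapping the order usually changes the value as well.
p_c_given_a = conditional(C, A)   # one six among the three even faces
p_a_given_c = conditional(A, C)   # a six is always even

print(p_a_given_b, p_b_given_a, p_c_given_a, p_a_given_c)
```

Exact fractions are used instead of floats so the comparison between the two conditional probabilities is not blurred by rounding.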

3.4 Law of Total Probability

The law of total probability dictates that for any event A and any mutually exclusive
sets 𝐵1, 𝐵2, ..., 𝐵n whose union contains A, the probability of A equals the
following sum.

𝑃(𝐴) = 𝑃(𝐴|𝐵1) × 𝑃(𝐵1) + 𝑃(𝐴|𝐵2) × 𝑃(𝐵2) + ⋯ + 𝑃(𝐴|𝐵n) × 𝑃(𝐵n)

where 𝑃(𝐴|𝐵1) is the conditional probability of A, given 𝐵1 has occurred, 𝑃(𝐵1) is
the probability of 𝐵1 occurring, 𝑃(𝐴|𝐵2) is the conditional probability of A, given
𝐵2 has occurred, 𝑃(𝐵2) is the probability of 𝐵2 occurring, and so on.

Intuition behind the formula:

• A is the union of the mutually exclusive sets 𝐴∩𝐵1, 𝐴∩𝐵2, ..., 𝐴∩𝐵n, so 𝑃(𝐴) equals
the sum of the probabilities of the individual sets.

• The intersection of a union and one of its subsets is the entire subset.

• We can rewrite the conditional probability formula 𝑃(𝐴|𝐵) = 𝑃(𝐴∩𝐵) / 𝑃(𝐵)
to get 𝑷(𝑨∩𝑩) = 𝑷(𝑨|𝑩) × 𝑷(𝑩).

• Another way to express the law of total probability is:

  𝑃(𝐴) = 𝑃(𝐴∩𝐵1) + 𝑃(𝐴∩𝐵2) + ⋯ + 𝑃(𝐴∩𝐵n)
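
The sum can be checked numerically. The sketch below assumes a fair die and a hypothetical split of its faces into three mutually exclusive sets; the event A and the sets in `partition` are illustrative choices only.

```python
from fractions import Fraction

# Assumed sample space: the six faces of a fair die.
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}                          # event A: the roll is even
partition = [{1, 2}, {3, 4, 5}, {6}]   # mutually exclusive sets B1, B2, B3


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))


def conditional(a, b):
    """P(a | b) = P(a ∩ b) / P(b)."""
    return prob(a & b) / prob(b)


# Law of total probability: sum of P(A|Bi) × P(Bi) over all Bi.
p_a_total = sum(conditional(A, b) * prob(b) for b in partition)

# Equivalent form: sum of P(A ∩ Bi) over all Bi.
p_a_intersections = sum(prob(A & b) for b in partition)

print(p_a_total, p_a_intersections, prob(A))
```

All three printed values agree, which is what the two forms of the law predict.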



3.5 Additive Law

The additive law calculates the probability of the union based on the probabilities of the
individual sets and of their intersection.

𝑃(𝐴∪𝐵) = 𝑃(𝐴) + 𝑃(𝐵) − 𝑃(𝐴∩𝐵)

where 𝑃(𝐴∪𝐵) is the probability of the union and 𝑃(𝐴∩𝐵) is the probability of
the intersection.

Intuition behind the formula

• Recall the formula for finding the size of the Union using the size of the
Intersection:
  |𝐴∪𝐵| = |𝐴| + |𝐵| − |𝐴∩𝐵|
• When all outcomes are equally likely, the probability of each one is simply its size
over the size of the sample space.
• This holds true for any events A and B.
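
A quick check of the additive law, again assuming a fair die with two illustrative events (the events and the helper `prob` are hypothetical, chosen only for this sketch):

```python
from fractions import Fraction

# Assumed sample space: the six faces of a fair die.
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # event A: the roll is even
B = {4, 5, 6}   # event B: the roll is greater than 3


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))


# Left-hand side: probability of the union, counted directly.
p_union_direct = prob(A | B)

# Right-hand side: P(A) + P(B) - P(A ∩ B).
p_union_additive = prob(A) + prob(B) - prob(A & B)

print(p_union_direct, p_union_additive)
```

Subtracting the intersection removes the outcomes 4 and 6, which would otherwise be counted twice.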

3.6 The Multiplication Rule

The multiplication rule calculates the probability of the intersection based on the
conditional probability.

𝑃(𝐴∩𝐵) = 𝑃(𝐴|𝐵) × 𝑃(𝐵)

where 𝑃(𝐴∩𝐵) is the probability of the intersection, 𝑃(𝐴|𝐵) is the conditional
probability, and 𝑃(𝐵) is the probability of event B.

Intuition behind the formula

• We can multiply both sides of the conditional probability formula
𝑃(𝐴|𝐵) = 𝑃(𝐴∩𝐵) / 𝑃(𝐵) by 𝑃(𝐵) to get 𝑷(𝑨∩𝑩) = 𝑷(𝑨|𝑩) × 𝑷(𝑩).
• If event B occurs 40% of the time (𝑃(𝐵) = 0.4) and event A occurs 50% of the
time B occurs (𝑃(𝐴|𝐵) = 0.5), then they would occur simultaneously 20% of the
time (𝑃(𝐴|𝐵) × 𝑃(𝐵) = 0.5 × 0.4 = 0.2).
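
The arithmetic above can be reproduced in a few lines. The function name `joint_probability` is made up for this sketch, and the card example is a hypothetical addition that assumes a standard 52-card deck with draws made without replacement.

```python
from fractions import Fraction


def joint_probability(p_a_given_b, p_b):
    """Multiplication rule: P(A ∩ B) = P(A|B) × P(B)."""
    return p_a_given_b * p_b


# Numbers from the text: P(B) = 0.4 and P(A|B) = 0.5.
p_both = joint_probability(Fraction(1, 2), Fraction(2, 5))

# Hypothetical card example: B = "first card is an ace" (4 of 52),
# A = "second card is an ace", so P(A|B) = 3/51 after one ace is gone.
p_two_aces = joint_probability(Fraction(3, 51), Fraction(4, 52))

print(p_both, float(p_both))
print(p_two_aces)
```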

3.7 Bayes’ Law

Bayes’ Law helps us understand the relationship between two events by computing
the different conditional probabilities. We also call it Bayes’ Rule or Bayes’ Theorem.

𝑃(𝐴|𝐵) = (𝑃(𝐵|𝐴) × 𝑃(𝐴)) / 𝑃(𝐵)

where 𝑃(𝐴|𝐵) is the conditional probability of A, given B, and 𝑃(𝐵|𝐴) is the
conditional probability of B, given A.

Intuition behind the formula

• According to the multiplication rule 𝑃(𝐴∩𝐵) = 𝑃(𝐴|𝐵) × 𝑃(𝐵), so
𝑃(𝐵∩𝐴) = 𝑃(𝐵|𝐴) × 𝑃(𝐴).

• Since 𝑃(𝐴∩𝐵) = 𝑃(𝐵∩𝐴), we plug in 𝑃(𝐵|𝐴) × 𝑃(𝐴) for 𝑃(𝐴∩𝐵) in the conditional
probability formula 𝑃(𝐴|𝐵) = 𝑃(𝐴∩𝐵) / 𝑃(𝐵).

• Bayes’ Law is often used in medical or business analysis to determine which of
two symptoms affects the other one more.
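
To make the formula concrete, here is a sketch of a medical-style calculation. Every number in it is invented for illustration (a 1% prevalence, a 95% chance of a positive test when the condition is present, a 5% chance when it is absent), and the function name `bayes` is hypothetical.

```python
from fractions import Fraction


def bayes(p_b_given_a, p_a, p_b):
    """Bayes' Law: P(A|B) = P(B|A) × P(A) / P(B), defined when P(B) > 0."""
    if p_b == 0:
        raise ValueError("P(B) must be greater than 0")
    return p_b_given_a * p_a / p_b


# Invented inputs: A = "has the condition", B = "test is positive".
p_condition = Fraction(1, 100)            # P(A)
p_pos_given_condition = Fraction(95, 100)  # P(B|A)
p_pos_given_healthy = Fraction(5, 100)     # P(B|not A)

# P(B) from the law of total probability over {A, not A}.
p_positive = (p_pos_given_condition * p_condition
              + p_pos_given_healthy * (1 - p_condition))

p_condition_given_pos = bayes(p_pos_given_condition, p_condition, p_positive)

print(p_positive, p_condition_given_pos, float(p_condition_given_pos))
```

With these assumed inputs, 𝑃(𝐵|𝐴) is 0.95 while 𝑃(𝐴|𝐵) is only about 0.16, which illustrates why the order of the two events matters.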

4. Discrete Distributions

A distribution shows the possible values a random variable can take and how
frequently they occur.

4.1 Important Notation for Distributions:

Y: the actual outcome
y: one of the possible outcomes
𝑃(𝑌 = 𝑦) is equivalent to p(𝑦).

                     Population   Sample
Mean                 𝜇            x̄
Variance             𝜎²           s²
Standard deviation   𝜎            s

A function that assigns a probability to each distinct outcome in the sample
space is called a probability function.
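
A probability function for a small discrete variable can be written out as a table. The sketch below assumes Y is the sum of two fair dice (an illustrative choice) and computes the mean 𝜇 and variance 𝜎² from it; the variable names are made up for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely rolls give each sum y.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# Probability function p(y) = P(Y = y), stored as a dictionary.
p = {y: Fraction(c, 36) for y, c in counts.items()}

# Mean: each outcome weighted by its probability.
mean = sum(y * p_y for y, p_y in p.items())

# Variance: expected squared distance from the mean.
variance = sum((y - mean) ** 2 * p_y for y, p_y in p.items())

print(sum(p.values()), mean, variance)
```

The probabilities add up to 1, which any probability function must satisfy.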

[Figure: distribution graph with 𝜇−𝜎, 𝜇 and 𝜇+𝜎 marked on the horizontal axis]

4.2 Types of Distributions

Certain distributions share characteristics, so we separate them into types. The
well-defined types of distributions we often deal with have elegant statistics. We
distinguish between two big types of distributions based on the type of the possible
values for the variable: discrete and continuous.

Discrete

• Have a finite number of outcomes (or, as with the Poisson distribution, countably
many).
• Use formulas we already talked about.
• Can add up individual values to determine probability of an interval.
• Can be expressed with a table, graph or a piece-wise function.
• Expected Values might be unattainable (for example, 3.5 for a single die roll).
• Graph consists of bars lined up one after the other.

Continuous

• Have infinitely many consecutive possible values.
• Use new formulas for attaining the probability of specific values and
intervals.
• Cannot add up the individual values that make up an interval because there
are infinitely many of them.
• Can be expressed with a graph or a continuous function.
• Graph consists of a smooth curve.

4.2.1 Discrete Distributions

Discrete Distributions have finitely many different possible outcomes (or, as with the
Poisson distribution, countably many). They possess several key characteristics which
separate them from continuous ones.

Key characteristics of discrete distributions
• Have a finite number of outcomes.
• Use formulas we already talked about.
• Can add up individual values to determine probability of an interval.
• Can be expressed with a table, graph or a piece-wise function.
• Expected Values might be unattainable.
• Graph consists of bars lined up one after the other.
• 𝑷(𝒀≤𝒚) = 𝑷(𝒀<𝒚+𝟏) when the outcomes are whole numbers.

Examples of Discrete Distributions:
• Discrete Uniform Distribution
• Bernoulli Distribution
• Binomial Distribution
• Poisson Distribution
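
Two of the characteristics above, adding up individual values over an interval and the identity 𝑷(𝒀≤𝒚) = 𝑷(𝒀<𝒚+𝟏), can be sketched with a small table. The distribution used here (number of heads in two fair coin flips) and the function names are assumptions made for illustration.

```python
from fractions import Fraction

# Assumed probability table: number of heads in two fair coin flips.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}


def prob_interval(pmf, low, high):
    """P(low <= Y <= high), by adding up the individual probabilities."""
    return sum(p for y, p in pmf.items() if low <= y <= high)


def prob_at_most(pmf, y):
    """P(Y <= y)."""
    return sum(p for value, p in pmf.items() if value <= y)


def prob_less_than(pmf, y):
    """P(Y < y)."""
    return sum(p for value, p in pmf.items() if value < y)


p_one_or_two = prob_interval(pmf, 1, 2)

# For whole-number outcomes, P(Y <= y) equals P(Y < y + 1).
identity_holds = all(
    prob_at_most(pmf, y) == prob_less_than(pmf, y + 1) for y in pmf
)

print(p_one_or_two, identity_holds)
```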

4.2.2 Uniform Distribution

A distribution where all the outcomes are equally likely is called a Uniform
Distribution.

[Figure: bar chart with the outcomes 1 to 6 on the horizontal axis]

Notation:
• 𝒀~ 𝑼(𝒂, 𝒃)
• Alternatively, if the values are categorical, we simply indicate the number of
categories, like so: 𝒀~𝑼(𝒂)

Key characteristics
• All outcomes are equally likely.
• All the bars on the graph are equally tall.
• The expected value and variance have no predictive power, because every outcome is
equally likely.

Examples and uses:
• Outcomes of rolling a single die.
• Often used in shuffling algorithms due to its fairness.
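
Both uses can be sketched briefly. The first part builds 𝑼(𝟏, 𝟔) for a single die; the second is a Fisher-Yates shuffle, shown here as one common example of a shuffling algorithm that relies on uniform draws (the notes do not name a specific algorithm, so this choice and all function names are assumptions).

```python
import random
from fractions import Fraction


def discrete_uniform_pmf(a, b):
    """Probability table for U(a, b): every integer from a to b is equally likely."""
    n = b - a + 1
    return {y: Fraction(1, n) for y in range(a, b + 1)}


die = discrete_uniform_pmf(1, 6)
mean = sum(y * p for y, p in die.items())
variance = sum((y - mean) ** 2 * p for y, p in die.items())


def fisher_yates_shuffle(items):
    """Return a shuffled copy; each swap index is drawn uniformly."""
    items = list(items)
    for i in range(len(items) - 1, 0, -1):
        j = random.randint(0, i)   # uniform over 0..i, both ends included
        items[i], items[j] = items[j], items[i]
    return items


shuffled = fisher_yates_shuffle(range(10))

print(mean, variance)
print(shuffled)
```

The mean of 7/2 is not a face of the die, an instance of an expected value that cannot itself occur as an outcome.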

4.2.3 Bernoulli Distribution

A distribution consisting of a single trial and only two possible outcomes, success or
failure, is called a Bernoulli Distribution.

[Figure: bar chart with the outcomes 0 and 1 on the horizontal axis]

Notation:
• 𝒀~ 𝑩𝒆𝒓𝒏(𝒑)

Key characteristics
• One trial.
• Two possible outcomes.
• 𝑬(𝒀) = 𝒑
• 𝑽𝒂𝒓(𝒀) = 𝒑 × (𝟏−𝒑)

Examples and uses:
• Guessing a single True/False question.
• Often used when trying to determine what we expect to get out of a single trial of
an experiment.
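
The two formulas for 𝑬(𝒀) and 𝑽𝒂𝒓(𝒀) can be checked against the general definitions of mean and variance. This is a sketch with made-up function names; the value 𝒑 = 1/2 assumes a pure guess on a True/False question, and 3/10 is an arbitrary second value.

```python
from fractions import Fraction


def bernoulli_pmf(p):
    """Probability table for Bern(p): 1 = success, 0 = failure."""
    return {0: 1 - p, 1: p}


def expected_value(pmf):
    """Sum of each outcome weighted by its probability."""
    return sum(y * prob for y, prob in pmf.items())


def variance(pmf):
    """Expected squared distance from the mean."""
    mu = expected_value(pmf)
    return sum((y - mu) ** 2 * prob for y, prob in pmf.items())


for p in (Fraction(1, 2), Fraction(3, 10)):
    pmf = bernoulli_pmf(p)
    # Compare the definitions with the closed forms p and p × (1 − p).
    print(p, expected_value(pmf) == p, variance(pmf) == p * (1 - p))

guess = bernoulli_pmf(Fraction(1, 2))
```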

4.2.4 Binomial Distribution

A sequence of identical Bernoulli events is called Binomial, and the number of
successes in it follows a Binomial Distribution.

[Figure: bar chart with the outcomes 0 to 10 on the horizontal axis]

Notation:
• 𝒀~ 𝑩(𝒏, 𝒑)

Key characteristics
• Measures the frequency of occurrence of one of the possible outcomes over the n
trials.
• 𝑷(𝒀=𝒚) = 𝑪(𝒏, 𝒚) × 𝒑^𝒚 × (𝟏−𝒑)^(𝒏−𝒚), where 𝑪(𝒏, 𝒚) is the number of ways to
choose which y of the n trials are successes.
• 𝑬(𝒀) = 𝒏 × 𝒑
• 𝑽𝒂𝒓(𝒀) = 𝒏 × 𝒑 × (𝟏−𝒑)

Examples and uses:
• Determining how many times we expect to get heads if we flip a coin 10 times.
• Often used when trying to predict how likely an event is to occur over a series of
trials.
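
The coin example can be worked through directly. The sketch below assumes a fair coin (𝒑 = 1/2) and 𝒏 = 10 flips; `binomial_pmf` is a name made up for this example, and `math.comb` supplies the number of combinations.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(y, n, p):
    """P(Y = y) for Y ~ B(n, p): C(n, y) × p^y × (1 − p)^(n − y)."""
    return comb(n, y) * p**y * (1 - p) ** (n - y)


n, p = 10, Fraction(1, 2)
table = {y: binomial_pmf(y, n, p) for y in range(n + 1)}

# Mean and variance computed from the table, to compare with n × p and n × p × (1 − p).
mean = sum(y * prob for y, prob in table.items())
variance = sum((y - mean) ** 2 * prob for y, prob in table.items())

print(table[5], float(table[5]))   # probability of exactly 5 heads
print(mean, variance)
```

Exactly 5 heads is the single most likely count, yet it occurs in only about a quarter of all 10-flip sequences.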
