Stats Fundamentals01
For any two events A and B such that the probability of B occurring is greater than
0 (𝑃(𝐵) > 0), the conditional probability formula states the following.
𝑃(𝐴|𝐵) = 𝑃(𝐴∩𝐵) / 𝑃(𝐵)
where 𝑃(𝐴|𝐵) is the probability of A, given B has occurred, 𝑃(𝐴∩𝐵) is the
probability of the intersection, and 𝑃(𝐵) is the probability of event B.
• Once B has occurred, B becomes the new “sample space”.
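The conditional probability formula can be checked on a small example. The sketch below assumes one roll of a fair six-sided die, which is an illustration and not an example from these notes; the events A and B and the helper `prob` are hypothetical names.

```python
from fractions import Fraction

# Assumed example: one roll of a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # event A: the roll is even
B = {4, 5, 6}   # event B: the roll is greater than 3

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

# P(A|B) = P(A ∩ B) / P(B), defined only when P(B) > 0
p_a_given_b = prob(A & B) / prob(B)
print(p_a_given_b)  # 2/3
```

Counting directly inside B gives the same answer: two of the three outcomes in B are even.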
365 DATA SCENCE 27
The law of total probability dictates that for any set A, which is a union of many
mutually exclusive sets 𝐵₁, 𝐵₂, ..., 𝐵ₙ, its probability equals the
following sum.
𝑃(𝐴) = 𝑃(𝐴|𝐵₁)×𝑃(𝐵₁) + 𝑃(𝐴|𝐵₂)×𝑃(𝐵₂) + ... + 𝑃(𝐴|𝐵ₙ)×𝑃(𝐵ₙ)
where 𝑃(𝐴|𝐵₁) is the conditional probability of A, given 𝐵₁ has occurred, 𝑃(𝐵₁) is
the probability of 𝐵₁ occurring, and likewise for 𝐵₂ and the remaining sets.
• Since A is the union of mutually exclusive sets, P(A) equals the sum of the
probabilities of the individual sets.
• The intersection of a union and one of its subsets is the entire subset.
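A short numeric sketch of the law of total probability follows. The scenario and all numbers are hypothetical: two machines, 𝐵₁ and 𝐵₂, produce every item between them, and A is the event that an item is defective.

```python
from fractions import Fraction

# Hypothetical numbers: B1 and B2 are mutually exclusive and cover all items.
p_b = [Fraction(3, 5), Fraction(2, 5)]             # P(B1), P(B2); they sum to 1
p_a_given_b = [Fraction(1, 100), Fraction(1, 20)]  # P(A|B1), P(A|B2)

# P(A) = P(A|B1) x P(B1) + P(A|B2) x P(B2)
p_a = sum(pa * pb for pa, pb in zip(p_a_given_b, p_b))
print(p_a)  # 13/500
```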
The additive law calculates the probability of the union based on the probability of the
individual sets it accounts for.
𝑃(𝐴∪𝐵) = 𝑃(𝐴) + 𝑃(𝐵) − 𝑃(𝐴∩𝐵)
• Recall the formula for finding the size of the union using the size of the
intersection:
• 𝐴∪𝐵 = 𝐴 + 𝐵 − 𝐴∩𝐵
• The probability of each one is simply its size over the size of the sample space.
• This holds true for any events A and B.
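The additive law can be verified by counting outcomes. This is a minimal sketch that assumes a fair six-sided die; the events A and B and the helper `prob` are illustrative names, not part of the notes.

```python
from fractions import Fraction

# Assumed example: one roll of a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # the roll is even
B = {4, 5, 6}   # the roll is greater than 3

def prob(event):
    """Size of the event over the size of the sample space."""
    return Fraction(len(event), len(sample_space))

# Left side: probability of the union, counted directly.
p_union = prob(A | B)
# Right side: P(A) + P(B) - P(A ∩ B); the intersection is subtracted
# because its outcomes are counted twice in P(A) + P(B).
p_additive = prob(A) + prob(B) - prob(A & B)
print(p_union, p_additive)  # 2/3 2/3
```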
The multiplication rule calculates the probability of the intersection based on the
conditional probability.
𝑃(𝐴∩𝐵) = 𝑃(𝐴|𝐵) × 𝑃(𝐵)
where 𝑃(𝐴∩𝐵) is the probability of the intersection, 𝑃(𝐴|𝐵) is the conditional
probability, and 𝑃(𝐵) is the probability of event B.
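To illustrate the multiplication rule, the sketch below assumes two cards drawn without replacement from a standard 52-card deck (an example chosen here, not one from the notes). It applies the rule and then cross-checks the result by counting ordered pairs.

```python
from fractions import Fraction
from itertools import permutations

# Assumed example: B = "first card is an ace", A = "second card is an ace".
p_b = Fraction(4, 52)          # P(B)
p_a_given_b = Fraction(3, 51)  # P(A|B): 3 aces remain among 51 cards

# Multiplication rule: P(A ∩ B) = P(A|B) x P(B)
p_both = p_a_given_b * p_b

# Cross-check by counting all ordered pairs of distinct cards.
deck = ["ace"] * 4 + ["other"] * 48
pairs = list(permutations(range(52), 2))
hits = sum(1 for i, j in pairs if deck[i] == "ace" and deck[j] == "ace")
p_counted = Fraction(hits, len(pairs))
print(p_both, p_counted)  # 1/221 1/221
```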
Bayes’ Law helps us understand the relationship between two events by computing
the different conditional probabilities. We also call it Bayes’ Rule or Bayes’ Theorem.
𝑃(𝐴|𝐵) = 𝑃(𝐵|𝐴) × 𝑃(𝐴) / 𝑃(𝐵)
where 𝑃(𝐴|𝐵) is the conditional probability of A, given B, and 𝑃(𝐵|𝐴) is the
conditional probability of B, given A.
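A common way to see Bayes' Law at work is a diagnostic test. The sketch below uses hypothetical numbers throughout: A is "has the condition", B is "tests positive", and 𝑃(𝐵) is obtained from the law of total probability.

```python
from fractions import Fraction

# Hypothetical inputs, chosen only for illustration.
p_a = Fraction(1, 100)               # P(A): prior probability of the condition
p_b_given_a = Fraction(95, 100)      # P(B|A): positive test given the condition
p_b_given_not_a = Fraction(5, 100)   # P(B|not A): false positive rate

# P(B) via the law of total probability over A and not A.
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Bayes' Law: P(A|B) = P(B|A) x P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b
print(p_b, p_a_given_b)  # 59/1000 19/118
```

With these assumed inputs, a positive result raises the probability of the condition from 1% to roughly 16%, which shows how different 𝑃(𝐴|𝐵) and 𝑃(𝐵|𝐴) can be.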
4. Discrete Distributions
A distribution shows the possible values a random variable can take and how
frequently they occur.
Y: the actual outcome
y: one of the possible outcomes
𝑃(𝑌 = 𝑦) is equivalent to p(𝑦).

                     Population   Sample
Mean                 𝜇            x̄
Variance             𝜎²           s²
Standard deviation   𝜎            s
A function that assigns a probability to each distinct outcome in the sample
space is called a probability function.
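A probability function can be represented as a mapping from each outcome y to p(y). The sketch below assumes Y is the sum of two fair six-sided dice (an illustrative choice) and computes the mean and variance from that mapping.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumed example: Y = sum of two fair six-sided dice (36 equally likely rolls).
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
p = {y: Fraction(c, 36) for y, c in counts.items()}  # p(y) = P(Y = y)

# The probabilities over the whole sample space add up to 1.
total = sum(p.values())

mu = sum(y * py for y, py in p.items())               # mean
var = sum((y - mu) ** 2 * py for y, py in p.items())  # variance
print(p[7], total, mu, var)  # 1/6 1 7 35/6
```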
[Figure: a distribution with 𝜇−𝜎, 𝜇, and 𝜇+𝜎 marked on the horizontal axis]
[Figure: Discrete versus Continuous distributions]
Discrete distributions have a finite or countably infinite number of distinct possible
outcomes. They possess several key characteristics which separate them from continuous ones.
A distribution where all the outcomes are equally likely is called a Uniform
Distribution.
[Figure: bar chart with the outcomes 1 to 6 on the horizontal axis]
Notation:
• 𝒀~ 𝑼(𝒂, 𝒃)
• Alternatively, if the values are categorical, we simply indicate the number of
categories, like so: 𝒀~𝑼(𝒂)
Key characteristics
• All outcomes are equally likely.
• All the bars on the graph are equally tall.
• The expected value and variance have no predictive power.
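As a sketch of these characteristics, the code below assumes Y ~ U(1, 6), that is, a fair six-sided die. Every outcome gets the same probability, and the expected value of 7/2 is not even a possible outcome, which illustrates why it has little predictive use here.

```python
from fractions import Fraction

# Assumed example: Y ~ U(a, b) with a = 1 and b = 6 (a fair die).
a, b = 1, 6
outcomes = range(a, b + 1)
p = {y: Fraction(1, len(outcomes)) for y in outcomes}  # equal bars

mean = sum(y * py for y, py in p.items())
var = sum((y - mean) ** 2 * py for y, py in p.items())
print(mean, var)  # 7/2 35/12
```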
A distribution consisting of a single trial with only two possible outcomes, success or
failure, is called a Bernoulli Distribution.
[Figure: graph with the outcomes 0 and 1 on the horizontal axis]
Notation:
• 𝒀~ 𝑩𝒆𝒓𝒏(𝒑)
Key characteristics
• One trial.
• Two possible outcomes.
• 𝑬(𝒀) = 𝒑
• 𝑽𝒂𝒓(𝒀) = 𝒑 × (𝟏−𝒑)
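The expected value and variance of a Bernoulli variable can be recomputed from its two outcomes. In this sketch the success probability p = 3/10 is a hypothetical value, with 1 coded as success and 0 as failure.

```python
from fractions import Fraction

p = Fraction(3, 10)       # hypothetical probability of success
pmf = {1: p, 0: 1 - p}    # 1 = success, 0 = failure

# Compute the mean and variance directly from the probability function.
mean = sum(y * q for y, q in pmf.items())
var = sum((y - mean) ** 2 * q for y, q in pmf.items())
print(mean, var)  # 3/10 21/100
```

Both values agree with the closed forms p and p × (1 − p).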
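A distribution that counts the number of successes over n independent, identical
Bernoulli trials is called a Binomial Distribution.
[Figure: bar chart with the outcomes 0 to 10 on the horizontal axis]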
Notation:
• 𝒀~ 𝑩(𝒏, 𝒑)
Key characteristics
• Measures the frequency of occurrence of one of the possible outcomes over the n
trials.
• 𝑷(𝒀=𝒚) = 𝑪(𝒚,𝒏) × 𝒑^𝒚 × (𝟏−𝒑)^(𝒏−𝒚), where 𝑪(𝒚,𝒏) is the number of ways to
choose y successes out of n trials.
• 𝑬(𝒀) = 𝒏 × 𝒑
• 𝑽𝒂𝒓(𝒀) = 𝒏 × 𝒑 × (𝟏−𝒑)
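The binomial formulas above can be sketched with the standard library alone. The values n = 10 and p = 1/2 are assumptions for illustration, and `binomial_pmf` is a hypothetical helper name; `math.comb(n, y)` counts the ways to choose y successes out of n trials.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(y, n, p):
    """P(Y = y) for Y ~ B(n, p)."""
    return comb(n, y) * p**y * (1 - p)**(n - y)

# Assumed example: 10 fair coin flips, counting heads.
n, p = 10, Fraction(1, 2)
pmf = {y: binomial_pmf(y, n, p) for y in range(n + 1)}

total = sum(pmf.values())                              # should be 1
mean = sum(y * q for y, q in pmf.items())              # should be n x p
var = sum((y - mean) ** 2 * q for y, q in pmf.items()) # should be n x p x (1 - p)
print(pmf[3], total, mean, var)  # 15/128 1 5 5/2
```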