Probability Course Notes - 365 Data
Probability
365 DATA SCIENCE
Table of Contents
Abstract
Words of welcome
1. The Basics of Probability
1.1 Expected Values
1.2 Probability Frequency Distribution
1.3 Complements
2. Combinatorics
2.1 Permutations
2.2 Factorials
2.3 Variations
2.4 Combinations
2.4.1 Combinations with separate sample spaces
2.4.2 Winning the Lottery
2.4.3 Combinations with Repetition
2.4.4 Applications of Combination with Repetition
2.4.5 Pizza Example
2.5 Methodology
2.5.1 Pizza and Sequences
2.5.2 Always Ending in 0
2.5.3 Positions
2.5.4 The Final Step
2.6 Symmetry of Combinations
3. Bayesian Notation
3.1 Multiple Events
3.1.1 Intersection
3.1.2 Union
3.1.3 Mutually Exclusive Sets
3.2 Independent and Dependent Events
3.3 Conditional Probability
3.4 Law of Total Probability
Abstract
Finally, we go over distributions which are the “heart” of probability applied in data
science. You may have heard of many of them, but this is the only place where you’ll
find detailed information about many of the most common distributions.
• Discrete distributions: Uniform distribution, Bernoulli distribution, Binomial
distribution (that’s where you’ll see a lot of the combinatorics from the previous
parts), Poisson distribution
• Continuous distributions: Normal distribution, Standard normal distribution,
Student’s T, Chi-Squared, Exponential, Logistic
Words of welcome
You are here because you want to comprehend the basics of probability before you
can dive into the world of statistics and machine learning. Understanding the driving
forces behind key statistical features is crucial to reaching your goal of mastering data
science. This way you will be able to extract important insights when analysing data
through supervised machine learning methods like regressions, and also make sense of
the outputs that unsupervised or assisted ML gives you.
Distributions are the main way we like to classify sets of data. If a dataset complies with
certain characteristics, we can usually attribute the likelihood of its values to a specific
distribution. Since many of these distributions have elegant relationships between
certain outcomes and their probabilities of occurring, knowing key features of our
data is extremely convenient and useful.
1. The Basics of Probability
Probability is the likelihood of an event occurring. This event can be pretty much
anything – getting heads, rolling a 4 or even bench pressing 225lbs. We measure
probability with numeric values between 0 and 1, because we like to compare the
relative likelihood of events. Observe the general probability formula.
$$P(X) = \frac{\text{Preferred outcomes}}{\text{Sample space}}$$
Probability Formula:
• The Probability of event X occurring equals the number of preferred outcomes
over the number of outcomes in the sample space.
• Preferred outcomes are the outcomes we want to occur or the outcomes we are
interested in. We also refer to such outcomes as “favorable”.
• Sample space refers to all possible outcomes that can occur. Its “size” indicates the
number of elements in it.
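As a minimal sketch of the formula, the snippet below counts preferred outcomes over the size of the sample space for a fair six-sided die. The helper name `probability` and the die example are illustrative assumptions, and the calculation only holds when every outcome is equally likely.

```python
from fractions import Fraction


def probability(preferred, sample_space):
    """P(X) = number of preferred outcomes / size of the sample space.

    Assumes every outcome in the sample space is equally likely.
    """
    preferred = set(preferred)
    sample_space = set(sample_space)
    if not preferred <= sample_space:
        raise ValueError("preferred outcomes must lie in the sample space")
    return Fraction(len(preferred), len(sample_space))


die = range(1, 7)                     # sample space of a fair six-sided die
p_four = probability({4}, die)        # one preferred outcome: rolling a 4
p_even = probability({2, 4, 6}, die)  # three preferred outcomes
print(p_four, p_even)
```

Using `Fraction` keeps the result exact, so it always lands between 0 and 1 without rounding noise.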
experiment we conduct.
experiment.
In this instance, the experimental probability of getting heads equals the number of
heads we record across the 20 trials, divided by 20 (the total number of trials).
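The calculation can be sketched with a simulated coin. Everything here is an assumption made for illustration: a fair coin, a fixed random seed so the run is repeatable, and the helper name `experimental_probability`.

```python
import random


def experimental_probability(outcomes, preferred):
    """Share of the recorded outcomes that match the preferred outcome."""
    return sum(1 for outcome in outcomes if outcome == preferred) / len(outcomes)


rng = random.Random(0)  # arbitrary seed, chosen only for repeatability
flips = [rng.choice(["heads", "tails"]) for _ in range(20)]  # 20 trials

p_heads = experimental_probability(flips, "heads")
print(p_heads)
```

With only 20 trials the result will usually differ from the theoretical value of 0.5, and it will tend to move closer as the number of trials grows.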
Expected value for categorical variables. Expected value for numeric variables.
1.3 Complements
The complement of an event A, written A’, is its opposite: A’ = Not A.
Characteristics of complements:
• Can never occur simultaneously.
• Add up to the sample space. (A + A’ = Sample space)
• Their probabilities add up to 1. (P(A) + P(A’) = 1)
• The complement of a complement is the original event. ((A’)’ = A)
Example:
• Assume event A represents drawing a spade, so P(A) = 0.25.
• Then, A’ represents not drawing a spade, so drawing a club, a diamond or a heart.
P(A’) = 1 – P(A), so P(A’) = 0.75.
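The spade example can be checked with sets. This is a small sketch that assumes a standard 52-card deck with four suits of 13 ranks each, with the complement modelled as a set difference.

```python
from fractions import Fraction

suits = ["spades", "clubs", "diamonds", "hearts"]
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
deck = {(rank, suit) for suit in suits for rank in ranks}  # sample space

A = {card for card in deck if card[1] == "spades"}  # event A: drawing a spade
A_complement = deck - A                              # A': not drawing a spade

p_A = Fraction(len(A), len(deck))
p_not_A = Fraction(len(A_complement), len(deck))
print(p_A, p_not_A, p_A + p_not_A)
```

The same sets also show the other characteristics: `A & A_complement` is empty, `A | A_complement` is the whole deck, and `deck - A_complement` gives back `A`.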
2. Combinatorics
2.1 Permutations
Characteristics of Permutations:
• Arranging all elements within the sample space.
• No repetition.
• $P(n) = n \times (n-1) \times (n-2) \times \cdots \times 1 = n!$ (called “n factorial”)
Example:
• If we need to arrange 5 people, we would have P(5) = 120 ways of doing so.
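To see where the 120 comes from, the sketch below compares the factorial formula with a brute-force listing of every ordering. The five names are hypothetical placeholders.

```python
import math
from itertools import permutations

people = ["Ann", "Ben", "Cat", "Dan", "Eve"]  # hypothetical names

count_by_formula = math.factorial(len(people))           # P(5) = 5!
count_by_listing = sum(1 for _ in permutations(people))  # every ordering, no repetition
print(count_by_formula, count_by_listing)
```

Both counts agree, which reflects that a permutation arranges all elements of the sample space exactly once.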
2.2 Factorials
Factorials express the product of all integers from 1 to n and we denote them with
the “!” symbol.
Key Values:
• 0! = 1.
• If n < 0, n! does not exist.
Examples: n = 7, k = 4
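Using the values n = 7 and k = 4 from the line above, here is a hedged sketch of one standard factorial identity: for n > k, n!/k! equals the product (k+1) × ... × n. Which identities the original examples illustrated is not recoverable from the text, so treat this as one plausible illustration only.

```python
import math

n, k = 7, 4

ratio = math.factorial(n) // math.factorial(k)  # 7! / 4!
product = math.prod(range(k + 1, n + 1))        # 5 * 6 * 7
print(ratio, product)

print(math.factorial(0))  # the key value 0! = 1
```

In Python, `math.factorial` raises a `ValueError` for negative input, which matches the rule that n! does not exist for n < 0.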
2.3 Variations
Variations represent the number of different possible ways we can pick and arrange a
number of elements.
Variations with repetition: $\bar{V}(n, p) = n^p$, where n is the number of different
elements available and p is the number of elements we are arranging.
• We still have n-many options for the second element because repetition is allowed.
Variations without repetition: $V(n, p) = \frac{n!}{(n-p)!}$, where n is the number of
different elements available and p is the number of elements we are arranging.
• We only have (n-1)-many options for the second element because we cannot
repeat the value we chose to start with.
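Both kinds of variations can be checked against a brute-force listing. The sketch below assumes the standard closed forms n^p (with repetition) and n!/(n-p)! (without repetition), and the letters used as elements are hypothetical.

```python
import math
from itertools import permutations, product


def variations_with_repetition(n, p):
    """Ordered picks of p elements from n options, repeats allowed."""
    return n ** p


def variations_without_repetition(n, p):
    """Ordered picks of p elements from n options, no repeats."""
    return math.factorial(n) // math.factorial(n - p)


letters = "ABCD"  # n = 4 hypothetical elements
p = 2             # we pick and arrange 2 of them

with_rep = sum(1 for _ in product(letters, repeat=p))   # AA, AB, ..., DD
without_rep = sum(1 for _ in permutations(letters, p))  # AB, AC, ..., DC
print(with_rep, variations_with_repetition(len(letters), p))
print(without_rep, variations_without_repetition(len(letters), p))
```

The gap between the two counts (16 versus 12 here) is exactly the four picks that repeat a letter: AA, BB, CC and DD.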
2.4 Combinations
Combinations represent the number of different possible ways we can pick a number
of elements.
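Combinations: $C(n, p) = \frac{n!}{(n-p)!\,p!}$, where n is the number of different
elements available and p is the number of elements we pick.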
Characteristics of Combinations:
• Takes into account double-counting. (Selecting Johny, Kate and Marie is the same
as selecting Marie, Kate and Johny)
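The double-counting point can be made concrete with a short sketch. It assumes a pool of five people, of whom only Johny, Kate and Marie come from the text, and it uses Python's `math.comb` for the standard count n!/((n-p)! p!).

```python
import math
from itertools import combinations

people = ["Johny", "Kate", "Marie", "Alex", "Sam"]  # last two names are hypothetical
p = 3

groups = list(combinations(people, p))     # each group listed once, order ignored
count_by_formula = math.comb(len(people), p)
ordered_picks = math.perm(len(people), p)  # variations without repetition

print(len(groups), count_by_formula)
print(ordered_picks // math.factorial(p))  # dividing out the p! orderings per group
```

Each group of three can be arranged in 3! = 6 ways, so dividing the 60 ordered picks by 6 leaves the same 10 combinations.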
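2.4.1 Combinations with separate sample spaces
When each element is picked from its own sample space, the number of combinations
is the product of the sizes of those sample spaces:
$$C = n_1 \times n_2 \times \cdots \times n_p$$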
• The option we choose for any element does not affect the number of options for
the other elements.
• We need to know the size of the sample space for each individual element
$(n_1, n_2, \ldots, n_p)$.
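A minimal sketch of the product rule, using a made-up three-course menu in which each course is its own sample space. The menu and its items are hypothetical and serve only to give the sizes n1, n2 and n3.

```python
import math
from itertools import product

menu = {  # hypothetical sample spaces, one per element we pick
    "starter": ["soup", "salad"],
    "main": ["pasta", "steak", "curry"],
    "dessert": ["cake", "fruit"],
}

sizes = [len(options) for options in menu.values()]         # n1, n2, n3
count_by_formula = math.prod(sizes)                         # n1 * n2 * n3
count_by_listing = sum(1 for _ in product(*menu.values()))  # every full meal
print(sizes, count_by_formula, count_by_listing)
```

The listing matches the product because the starter we choose does not change how many mains or desserts are available.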
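2.4.2 Winning the Lottery
To win the lottery, you need to satisfy two distinct independent events: correctly
guessing the 5 regular numbers and correctly guessing the Powerball number.
$$C = C_{\text{5 numbers}} \times C_{\text{Powerball number}} = \frac{69!}{64!\,5!} \times 26$$
Here C is the total number of combinations.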
• One event has a sample size of 26, the other has a sample size of
69!/(64! 5!) = 11,238,513.
• Using the “favoured over all” formula, we find the probability of any single ticket
winning equals 1/(11,238,513 × 26) = 1/292,201,338.
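The lottery count can be reproduced numerically from the formula 69!/(64! 5!) × 26. The sketch below relies only on the numbers 69, 5 and 26 that appear in the formula. The variable names are illustrative.

```python
import math
from fractions import Fraction

five_numbers = math.comb(69, 5)  # 69! / (64! * 5!): ways to pick the 5 numbers
powerball = 26                   # options for the Powerball number

total = five_numbers * powerball  # total number of combinations
p_win = Fraction(1, total)        # one favoured outcome over all outcomes

print(five_numbers, total)
print(float(p_win))
```

Because the two events are independent, their sample sizes multiply, which is why a single ticket has one chance in `total`.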
2.4.3 Combinations with Repetition
Combinations represent the number of different possible ways we can pick a number
of elements. In special cases we can have repetition in combinations and for those we
use a different formula.
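Combinations with repetition: $\bar{C}(n, p) = \frac{(n+p-1)!}{(n-1)!\,p!}$, where n is the
number of different elements available and p is the number of elements we need to select.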
Now that you know what the formula looks like, we are going to walk you through the
process of deriving this formula from the Combinations without repetition formula.
This way you will be able to fully understand the intuition behind it and not have to
bother memorizing it.
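Before working through the derivation, it can help to check a combinations-with-repetition count by listing the selections directly. The sketch below assumes the standard closed form (n + p - 1)! / ((n - 1)! p!), and the three toppings are hypothetical.

```python
import math
from itertools import combinations_with_replacement


def combinations_with_repetition(n, p):
    """Unordered picks of p elements from n options, repeats allowed."""
    return math.factorial(n + p - 1) // (math.factorial(n - 1) * math.factorial(p))


toppings = ["cheese", "ham", "olives"]  # n = 3 hypothetical options
p = 2                                   # we select 2, repeats allowed

listed = list(combinations_with_replacement(toppings, p))
print(len(listed), combinations_with_repetition(len(toppings), p))
print(math.comb(len(toppings) + p - 1, p))  # same count as a plain combination
```

The last line shows the link to combinations without repetition: picking p elements with repetition from n options gives the same count as picking p without repetition from n + p - 1 options.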