DS ML Probability Statistics Interview
Shubhankar Agrawal
Abstract
This document serves as a quick refresher for Data Science and Machine Learning interviews. It covers mathematical concepts across a range of
probability and statistics topics. It assumes the reader has foundational knowledge of the field from tertiary education. The document
contains revision material on the key concepts that are tested in interviews.
Contents
1 Mathematics
1.1 Basic Formulae
1.2 Combinatorics
1.3 Linear Algebra
2 Probability
2.1 Basic Concepts
2.2 Random Variables
2.3 Moments and Functions
2.4 Distributions
Estimation • Functions
3 Statistics
3.1 Concepts
3.2 Hypothesis Testing
Terminology • Test of Proportions • Errors • Power Analysis
3.3 Analysis of Variance
3.4 Non Parametric Methods
Chi Square • Mann Whitney U Test
3.5 Stochastic and Temporal Models
Markov Chains • ARIMA
4 Contact me

1. Mathematics
1.1. Basic Formulae
Basic mathematical formulae to be known.
Series Progressions
\[
\begin{aligned}
S_n &= a + (a + d) + \dots + [a + (n-1)d] = \frac{n}{2}\,[2a + (n-1)d] \\
S_n &= a + ar + \dots + ar^{n-1} = a\,\frac{1 - r^n}{1 - r} \quad (r \neq 1)
\end{aligned}
\tag{1}
\]
Euler's Number (e) [2.718]
\[
\begin{aligned}
e^x &= \sum_{k=0}^{\infty} \frac{x^k}{k!} \\
e &= \lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^n
\end{aligned}
\tag{2}
\]
1.2. Combinatorics
\[
\begin{aligned}
\text{Permutations (Arrange } n\text{)} &: \; n! \\
\text{Combinations (Choose } r \text{ from } n\text{)} &: \; \binom{n}{r} = \frac{n!}{r!\,(n-r)!} = \frac{\prod_{i=0}^{r-1} (n - i)}{r!}
\end{aligned}
\tag{5}
\]
Coupon Collector
Expected number of draws (n) to collect all k distinct items when each draw is uniform:
\[
n = k \cdot \left(\frac{1}{1} + \frac{1}{2} + \dots + \frac{1}{k}\right)
\tag{6}
\]
Circular Seating
Arrange n − 1 people, giving (n − 1)! arrangements, since one position is fixed.
Stars and Bars
The number of ways to distribute n identical items into k boxes (boxes may be empty) is given by \(\binom{n+k-1}{k-1}\).
1.3. Linear Algebra
Eigenvalues and eigenvectors satisfy the equation below for a non-zero vector x. The eigenvalues are found by solving the determinant equation det(A − λI) = 0.
\[
A \cdot x = \lambda \cdot x
\tag{7}
\]
\[
A^{-1} = \frac{1}{\det(A)} \cdot \operatorname{adj}(A)
\tag{8}
\]
Determinant non-zero => Non-singular and invertible.
2. Probability
2.1. Basic Concepts
\[
\begin{aligned}
P(A \cup B) &= P(A) + P(B) - P(A \cap B) \\
P(A, B) &= P(A \mid B) \cdot P(B) \\
P(A) &= \sum_{B} P(A, B)
\end{aligned}
\tag{9}
\]
Here P(A ∩ B) = P(A, B).
A and B are independent:
\[
\begin{aligned}
P(A \cap B) &= P(A) \cdot P(B) \\
P(A \mid B) &= P(A)
\end{aligned}
\tag{10}
\]
Independence ≠ Mutual Exclusivity. Independence means the events don't depend on each other. Mutually exclusive would mean they cannot occur together.
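The rules in Section 2.1 can be checked by exact enumeration. The sketch below assumes a single fair six-sided die; the events A, B, C and the helper names `prob` and `cond_prob` are illustrative choices, not part of the original text.

```python
from fractions import Fraction

# Sample space of one fair six-sided die (an illustrative assumption).
OMEGA = frozenset(range(1, 7))

def prob(event):
    """P(event) under equally likely outcomes, as an exact fraction."""
    return Fraction(len(event & OMEGA), len(OMEGA))

def cond_prob(a, b):
    """P(a | b) = P(a, b) / P(b)."""
    return prob(a & b) / prob(b)

A = {2, 4, 6}     # even roll
B = {1, 2, 3, 4}  # roll of at most 4
C = {1, 3, 5}     # odd roll

# Addition rule: P(A u B) = P(A) + P(B) - P(A n B)
union_lhs = prob(A | B)
union_rhs = prob(A) + prob(B) - prob(A & B)

# Marginalisation over the partition {B, not B}: P(A) = sum_B P(A, B)
marginal = prob(A & B) + prob(A & (OMEGA - B))

# A and B are independent: the joint probability factorises.
a_b_independent = prob(A & B) == prob(A) * prob(B)

# A and C are mutually exclusive, yet NOT independent.
a_c_exclusive = prob(A & C) == 0
a_c_independent = prob(A & C) == prob(A) * prob(C)

print(union_lhs, union_rhs, marginal, cond_prob(A, B))
print(a_b_independent, a_c_exclusive, a_c_independent)
```

The pair (A, C) illustrates why independence and mutual exclusivity differ: knowing that the roll is odd rules out an even roll entirely, so the two events are strongly dependent.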
Probability and Statistics: Data Science and ML Refresher
2.4. Distributions
The most common continuous and discrete distributions are listed in
Table 8. Apart from these, some CDFs of continuous distributions
are listed in Table 3.
2.4.1. Estimation
Parameter estimation can be done with Maximum Likelihood Estimation (MLE) or Maximum A Posteriori (MAP) algorithms.
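A minimal sketch of MLE, assuming an i.i.d. Normal sample. The sample values and the function names are illustrative, not from the original text. For this model the log-likelihood is maximised in closed form by the sample mean and the 1/n (population) variance, which the code compares against nearby parameter values.

```python
import math

def normal_log_likelihood(xs, mu, sigma2):
    """Sum of log N(x | mu, sigma2) over the sample."""
    n = len(xs)
    squared_error = sum((x - mu) ** 2 for x in xs)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - squared_error / (2 * sigma2)

def normal_mle(xs):
    """Closed-form maximisers of the Normal log-likelihood."""
    n = len(xs)
    mu_hat = sum(xs) / n
    sigma2_hat = sum((x - mu_hat) ** 2 for x in xs) / n  # divides by n, not n - 1
    return mu_hat, sigma2_hat

sample = [2.1, 1.9, 2.4, 2.0, 1.6]  # hypothetical observations
mu_hat, sigma2_hat = normal_mle(sample)

ll_at_mle = normal_log_likelihood(sample, mu_hat, sigma2_hat)
ll_shifted_mean = normal_log_likelihood(sample, mu_hat + 0.1, sigma2_hat)
ll_inflated_var = normal_log_likelihood(sample, mu_hat, sigma2_hat * 1.5)

print(mu_hat, sigma2_hat)
print(ll_at_mle >= ll_shifted_mean, ll_at_mle >= ll_inflated_var)
```

A MAP estimate would add the log of a prior over the parameters to the same objective before maximising.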
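\[
\begin{aligned}
\ell(\theta) &= \log \prod_{i=1}^{n} P(x_i \mid \theta) = \sum_{i=1}^{n} \log P(x_i \mid \theta) \\
\hat{\theta}_{\text{MLE}} &= \arg\max_{\theta} \ell(\theta) = \arg\max_{\theta} \sum_{i=1}^{n} \log P(x_i \mid \theta)
\end{aligned}
\tag{14}
\]
Figure 1. Normal Distribution [3]
3.2. Hypothesis Testing
Testing variables to analyse performance
Population: The entire group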
Figure 2. Power over Distributions [4]
Variance: Hypothesis Test on Sample Variance
3.3. Analysis of Variance
ANOVA is used to compare groups and test whether their means are statistically different from each other. It is used for comparing the groups within a continuous variable when values can be assumed to follow a normal distribution. These steps are needed to calculate the ANOVA statistic.
\[
\begin{aligned}
\text{Sum Squares Between Groups} \quad SSB &= \sum_{j=1}^{k} n_j (\bar{Y}_j - \bar{Y})^2 \\
\text{Sum Squares Within Groups} \quad SSW &= \sum_{j=1}^{k} \sum_{i=1}^{n_j} (Y_{ji} - \bar{Y}_j)^2 \\
\text{Mean Squares Between} \quad MSB &= \frac{SSB}{k - 1} \\
\text{Mean Squares Within} \quad MSW &= \frac{SSW}{N - k} \\
\text{F Statistic} \quad F &= \frac{MSB}{MSW}
\end{aligned}
\tag{25}
\]
where k is the number of groups, n_j is the size of group j, and N is the total number of observations.
Useful link for F Test: F Test Table [2]
3.4. Non Parametric Methods
3.4.1. Chi Square
Chi Square is used to evaluate categorical variables. It is a special case of the Gamma distribution. It requires building the frequency table.
Goodness of Fit: Check if a sample fits a population.
\[
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}, \qquad df = k - 1
\tag{26}
\]
where O = Observed Frequency and E = Expected Frequency.
For a frequency table with r rows and c columns:
\[
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \qquad df = (r - 1) \times (c - 1)
\tag{27}
\]
where r, c = Number of rows and columns.
3.4.2. Mann Whitney U Test
\[
\begin{aligned}
U_1 &= \sum_{\text{Group 1}} \text{rank} - \frac{n_1 (n_1 + 1)}{2} \\
U_2 &= \sum_{\text{Group 2}} \text{rank} - \frac{n_2 (n_2 + 1)}{2} \\
U &= \min(U_1, U_2) \\
\mu_U &= \frac{n_1 n_2}{2} \\
\sigma_U^2 &= \frac{n_1 n_2 (n_1 + n_2 + 1)}{12} \\
Z &= \frac{U - \mu_U}{\sqrt{\sigma_U^2}}
\end{aligned}
\tag{29}
\]
A summary of all tests is provided in Table 9.
3.5. Stochastic and Temporal Models
3.5.1. Markov Chains
Markov Chain problems can be solved by building the transition matrix P.
To calculate the probabilities of reaching a certain end state, extract two matrices from P: Q, the sub-matrix of transitions among transient states (no absorbing state), and R, the sub-matrix of transitions from transient states to absorbing states.
\[
\begin{aligned}
\text{Fundamental Matrix} \quad N &= (I - Q)^{-1} \\
\text{Absorbing Probability} \quad B &= N \cdot R
\end{aligned}
\tag{30}
\]
Irreducible: No state is unreachable
Aperiodic: No periodic self loops
Ergodic: Irreducible (and) Aperiodic. In this case, the steady state equations can be solved.
\[
\pi P = \pi, \qquad \sum_{i=1}^{n} \pi_i = 1
\tag{31}
\]
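The two Markov chain calculations can be sketched with exact fractions and a small Gauss-Jordan inverse. Everything in this block is an illustrative assumption rather than part of the original text: the helper names, the fair gambler's-ruin chain on states 0 to 3 (states 0 and 3 absorbing), and the two-state ergodic chain.

```python
from fractions import Fraction as F

def mat_mul(a, b):
    """Matrix product of two lists-of-rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def mat_inv(m):
    """Gauss-Jordan inverse; assumes m is square, non-singular, with Fraction entries."""
    n = len(m)
    aug = [list(row) + [F(int(i == j)) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def absorption_probabilities(q, r):
    """B = N R with fundamental matrix N = (I - Q)^-1."""
    n = len(q)
    i_minus_q = [[F(int(i == j)) - q[i][j] for j in range(n)] for i in range(n)]
    return mat_mul(mat_inv(i_minus_q), r)

def stationary_distribution(p):
    """Solve pi P = pi together with sum(pi) = 1 for an ergodic chain."""
    n = len(p)
    # First n - 1 rows of (P^T - I) pi = 0; the last row enforces normalisation.
    a = [[p[j][i] - F(int(i == j)) for j in range(n)] for i in range(n - 1)]
    a.append([F(1)] * n)
    rhs = [[F(0)] for _ in range(n - 1)] + [[F(1)]]
    return [row[0] for row in mat_mul(mat_inv(a), rhs)]

# Gambler's ruin: transient states 1 and 2, absorbing states 0 and 3.
Q = [[F(0), F(1, 2)], [F(1, 2), F(0)]]   # transient -> transient
R = [[F(1, 2), F(0)], [F(0), F(1, 2)]]   # transient -> absorbing (columns: 0, 3)
B = absorption_probabilities(Q, R)

# Two-state ergodic chain.
P = [[F(9, 10), F(1, 10)], [F(1, 2), F(1, 2)]]
pi = stationary_distribution(P)

print(B)
print(pi)
```

Row i of B gives the probability of ending in each absorbing state when starting from transient state i. With floating-point matrices, `numpy.linalg.inv` could replace the hand-written inverse.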
3.5.2. ARIMA
Family of models for time series forecasting.
Stationarity: No changing components (trends)
Heteroskedasticity: Non constant variance
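As a hedged illustration of stationarity, the sketch below applies first differencing (the "integrated" step in ARIMA) to a series with a linear trend. The synthetic series, the slope of 0.5, the seed, and the helper names are assumptions made for this example only.

```python
import random

def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every valid t."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]

def mean(values):
    return sum(values) / len(values)

random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(500)]

# Linear trend plus noise: the mean drifts upward, so the series is not stationary.
trended = [0.5 * t + e for t, e in enumerate(noise)]
diffed = difference(trended)

half = len(trended) // 2
trend_gap = mean(trended[half:]) - mean(trended[:half])          # large: mean changes over time
diffed_gap = abs(mean(diffed[half:]) - mean(diffed[:half]))      # small: mean is roughly constant

print(round(trend_gap, 2), round(diffed_gap, 4))
```

Differencing removes the trend but does not address heteroskedasticity. A non-constant variance needs a separate treatment.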
4. Contact me
You can contact me through these methods:
References
[1] Chi Square Table. [Online]. Available: https://math.arizona.edu/~jwatkins/chi-square-table.pdf.
[2] F Test Table. [Online]. Available: https://www.stat.purdue.edu/~lfindsen/stat503/F_alpha_05.pdf.
[3] Key Properties of the Normal Distribution. [Online]. Available: https://analystprep.com/cfa-level-1-exam/wp-content/uploads/2019/10/page-123.jpg.
[4] Power and Sample Size Determination. [Online]. Available: https://sphweb.bumc.bu.edu/otlt/mph-modules/bs/bs704_power/bs704_power_print.html.
[5] T Test Table. [Online]. Available: https://www.sjsu.edu/faculty/gerstman/StatPrimer/t-table.pdf.
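Distribution | PDF | Mean | Variance | Description
Continuous Uniform | \(\frac{1}{b-a}\) | \(\frac{a+b}{2}\) | \(\frac{(b-a)^2}{12}\) | Single uniform outcome
Normal | \(\frac{1}{\sqrt{2\pi\sigma^2}} \cdot e^{-\frac{(x-\mu)^2}{2\sigma^2}}\) | \(\mu\) | \(\sigma^2\) | Gaussian Event
Exponential | \(\lambda \cdot e^{-\lambda x}\) | \(\frac{1}{\lambda}\) | \(\frac{1}{\lambda^2}\) | Waiting time between events (Inverse of Poisson)
Gamma | \(\frac{1}{\Gamma(a)} \cdot \frac{(\lambda x)^a}{x} \cdot e^{-\lambda x}\) | \(\frac{a}{\lambda}\) | \(\frac{a}{\lambda^2}\) | Gamma Event (a > 0); \(\Gamma(a) = (a-1)!\) for integer a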
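As a rough numerical check of the Exponential and Gamma rows, the densities can be integrated with a midpoint rule and compared against the closed-form mean and variance. The parameter values, the integration range, and the helper names below are assumptions chosen for illustration.

```python
import math

def exponential_pdf(x, lam):
    return lam * math.exp(-lam * x)

def gamma_pdf(x, a, lam):
    # Same form as the table row: (lam * x)^a * exp(-lam * x) / (x * Gamma(a))
    return (lam * x) ** a * math.exp(-lam * x) / (x * math.gamma(a))

def moments(pdf, lo, hi, steps=100_000):
    """Midpoint-rule estimates of the mean and variance of a density on [lo, hi]."""
    dx = (hi - lo) / steps
    xs = [lo + (i + 0.5) * dx for i in range(steps)]  # midpoints avoid x = 0
    mean = sum(x * pdf(x) for x in xs) * dx
    var = sum((x - mean) ** 2 * pdf(x) for x in xs) * dx
    return mean, var

lam, a = 2.0, 3.0  # hypothetical rate and shape
exp_mean, exp_var = moments(lambda x: exponential_pdf(x, lam), 0.0, 40.0)
gam_mean, gam_var = moments(lambda x: gamma_pdf(x, a, lam), 0.0, 40.0)

print(exp_mean, exp_var)   # compare with 1/lam and 1/lam^2
print(gam_mean, gam_var)   # compare with a/lam and a/lam^2
print(math.gamma(5))       # compare with 4! = 24
```

The upper limit of 40 truncates a negligible tail for these parameters. Heavier-tailed settings would need a wider range.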