DS ML Probability Statistics Interview

This document is a refresher on Probability and Statistics tailored for Data Science and Machine Learning interviews, requiring foundational knowledge in the field. It covers key mathematical concepts, including combinatorics, probability distributions, and statistical methods such as hypothesis testing and analysis of variance. The material serves as a revision guide for essential topics commonly tested in interviews.

Probability and Statistics: Data Science and ML Refresher

Shubhankar Agrawal

Abstract
This document serves as a quick refresher for Data Science and Machine Learning interviews. It covers mathematical concepts across a range of
probability and statistics topics. It assumes the reader has foundational knowledge of the field at the level of tertiary education. The material
is intended for revising key concepts that are tested in interviews.

Contents
1 Mathematics
1.1 Basic Formulae
1.2 Combinatorics
1.3 Linear Algebra
2 Probability
2.1 Basic Concepts
2.2 Random Variables
2.3 Moments and Functions
2.4 Distributions
Estimation • Functions
3 Statistics
3.1 Concepts
3.2 Hypothesis Testing
Terminology • Test of Proportions • Errors • Power Analysis
3.3 Analysis of Variance
3.4 Non Parametric Methods
Chi Square • Mann Whitney U Test
3.5 Stochastic and Temporal Models
Markov Chains • ARIMA
4 Contact me

1. Mathematics

1.1. Basic Formulae
Basic mathematical formulae to be known.

Series Progressions

$$
\begin{aligned}
\text{Arithmetic:}\quad S_n &= a + (a + d) + \dots + [a + (n-1)d] = \frac{n}{2}\,[2a + (n-1)d] \\
\text{Geometric:}\quad S_n &= a + ar + \dots + ar^{n-1} = a \cdot \frac{1 - r^n}{1 - r}
\end{aligned}
\tag{1}
$$

Euler's Number (e) [2.718]

$$
e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!}, \qquad e = \lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^n \tag{2}
$$

Fibonacci Series

$$
f(n) = f(n-1) + f(n-2) \tag{3}
$$

Taylor Series

$$
f(x) = f(a) + \frac{f'(a)(x-a)}{1!} + \frac{f''(a)(x-a)^2}{2!} + \dots + \frac{f^{(n)}(a)(x-a)^n}{n!} + \dots \tag{4}
$$

1.2. Combinatorics
Basic formulae to generate permutations and combinations.

$$
\begin{aligned}
\text{Permutations (Arrange } n\text{)}&: \quad n! \\
\text{Combinations (Choose } r \text{ from } n\text{)}&: \quad \binom{n}{r} = \frac{n(n-1)\cdots(n-r+1)}{r!} = \frac{n!}{r!\,(n-r)!}
\end{aligned}
\tag{5}
$$

Coupon Collector
Expected number of draws ($n$) needed to collect all $k$ distinct items when each draw is uniform:

$$
n = k \cdot \left(\frac{1}{1} + \frac{1}{2} + \dots + \frac{1}{k}\right) \tag{6}
$$

Circular Seating
Number of ways to seat $n$ people around a table: $(n-1)!$, since one position is fixed to remove rotations.

Stars and Bars
The number of ways to distribute $n$ identical items into $k$ boxes is given by $\binom{n+k-1}{k-1}$.

1.3. Linear Algebra
Eigenvalues and eigenvectors satisfy Equation 7. The eigenvalues are found by solving the determinant equation $\det(A - \lambda I) = 0$.

$$
A \cdot x = \lambda \cdot x \tag{7}
$$

$$
A^{-1} = \frac{1}{\det(A)} \cdot \operatorname{adj}(A) \tag{8}
$$

Determinant non-zero => Non-singular and invertible.

2. Probability

2.1. Basic Concepts

$$
\begin{aligned}
P(A \cup B) &= P(A) + P(B) - P(A \cap B) \\
P(A, B) &= P(A \mid B) \cdot P(B) \\
P(A) &= \sum_{B} P(A, B)
\end{aligned}
\tag{9}
$$

Here $P(A \cap B) = P(A, B)$.

A and B are independent:

$$
\begin{aligned}
P(A \cap B) &= P(A) \cdot P(B) \\
P(A \mid B) &= P(A)
\end{aligned}
\tag{10}
$$

Independence ≠ Mutual Exclusivity. Independence means the events do not depend on each other. Mutually exclusive means they cannot occur together.

Bayes Theorem

$$
P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{\sum_{A} P(B \mid A) \cdot P(A)} \tag{11}
$$

where $P(A \mid B)$ = Posterior, $P(B \mid A)$ = Likelihood, $P(A)$ = Prior, and $P(B)$ = Evidence (the denominator).
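A quick numeric check of Bayes' theorem (Equation 11) is a common interview exercise. The sketch below works through a diagnostic-test style example for a binary hypothesis. The numbers (1% prior, 95% likelihood, 5% false-positive rate) and the function name `posterior` are assumptions made purely for illustration.

```python
def posterior(prior: float, likelihood: float, false_positive_rate: float) -> float:
    """Return P(A | B) for a binary hypothesis A given evidence B."""
    # Evidence P(B) = sum over A of P(B | A) * P(A), with A in {true, false}.
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical inputs: P(A) = 0.01, P(B | A) = 0.95, P(B | not A) = 0.05.
p = posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(round(p, 4))
```

Even with a 95% likelihood, the posterior stays near 16% because the prior is small, which is the point such questions usually probe.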


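2.2. Random Variables

PMF: Probability Mass Function
PDF: Probability Density Function
CDF: Cumulative Distribution Function

The PMF sums to 1 and the PDF integrates to 1 over the entire range; the formulae are in Table 1.

Table 1. Random Variables

| Variable | Discrete | Continuous |
|---|---|---|
| Point | PMF: $P(X = x)$ | PDF: $f(x)$ |
| Cumulative | Sum of PMF: $\sum_{t \le x} P(X = t)$ | CDF: $\int_{-\infty}^{x} f(t)\,dt$ |

2.3. Moments and Functions
Moments summarise the shape of a random variable's distribution and can be obtained from its moment-generating function.

Table 2. Moments

| Moment | Discrete | Continuous |
|---|---|---|
| $E[X]$ | $\sum x \cdot P(x)$ | $\int x \cdot f(x)\,dx$ |
| $E[X^2]$ | $\sum x^2 \cdot P(x)$ | $\int x^2 \cdot f(x)\,dx$ |

Mean: The average value.
Variance/Covariance: Strength of variation with itself/another.
Skewness: Whether the weight of the distribution is skewed left (negative) or right (positive).
Kurtosis: Normal (Mesokurtic) vs Negative (Platykurtic), which is flatter, vs Positive (Leptokurtic), which has a higher peak.

$$
\begin{aligned}
\text{Mean}(\mu) &= E[X] \\
\text{Variance}(\sigma^2) &= E[X^2] - (E[X])^2 \\
\text{Standard Deviation}(\sigma) &= \sqrt{\sigma^2} \\
\text{Skewness} &= E\left[\left(\frac{X - \mu}{\sigma}\right)^3\right] \\
\text{Kurtosis (excess)} &= E\left[\left(\frac{X - \mu}{\sigma}\right)^4\right] - 3
\end{aligned}
\tag{12}
$$

Correlation: Measures both direction and strength of variation (between -1 and 1).

$$
\begin{aligned}
\text{Covariance} &= E[XY] - E[X] \cdot E[Y] \\
\text{Correlation}(\rho) &= \frac{\text{Cov}(X, Y)}{\sigma_x \cdot \sigma_y} \\
\text{(from } n \text{ observed pairs)} &= \frac{\frac{1}{n}\sum_{i} (x_i - \mu_x) \cdot (y_i - \mu_y)}{\sigma_x \cdot \sigma_y}
\end{aligned}
\tag{13}
$$

2.4. Distributions
The most common continuous and discrete distributions are listed in Table 8. Apart from these, some CDFs of continuous distributions are listed in Table 3.

2.4.1. Estimation
Parameter estimation can be done with Maximum Likelihood Estimation (MLE) or Maximum A Posteriori (MAP) algorithms.

$$
\begin{aligned}
\ell(\theta) &= \log \prod_{i=1}^{n} P(x_i \mid \theta) = \sum_{i=1}^{n} \log P(x_i \mid \theta) \\
\hat{\theta}_{\text{MLE}} &= \arg\max_{\theta} \ell(\theta) = \arg\max_{\theta} \sum_{i=1}^{n} \log P(x_i \mid \theta)
\end{aligned}
\tag{14}
$$

MAP incorporates the prior distribution as well.

$$
\begin{aligned}
\hat{\theta}_{\text{MAP}} &= \arg\max_{\theta} \log P(\theta \mid \mathbf{x}) = \arg\max_{\theta} \left[\log P(\mathbf{x} \mid \theta) + \log P(\theta)\right] \\
\hat{\theta}_{\text{MAP}} &= \arg\max_{\theta} \left[\sum_{i=1}^{n} \log P(x_i \mid \theta) + \log P(\theta)\right]
\end{aligned}
\tag{15}
$$

2.4.2. Functions

Table 3. Continuous Distribution CDFs

| Distribution | PDF | CDF |
|---|---|---|
| Uniform | $\frac{1}{b-a}$ | $\frac{x-a}{b-a}$ |
| Normal | $\frac{1}{\sqrt{2\pi\sigma^2}} \cdot e^{-\frac{(x-\mu)^2}{2\sigma^2}}$ | $\Phi\left(\frac{x-\mu}{\sigma}\right)$ |
| Exponential | $\lambda \cdot e^{-\lambda x}$ | $1 - e^{-\lambda x}$ |
| Gamma | $\frac{1}{\Gamma(a)} (\lambda x)^a e^{-\lambda x} \frac{1}{x}$ | $\int_0^x \frac{\lambda^a t^{a-1} e^{-\lambda t}}{\Gamma(a)}\,dt$ |

It is also helpful to know how to derive the PMF, PDF and CDF of the relevant probability distributions.

3. Statistics

3.1. Concepts
Central Tendency Functions: Mean, Median, Mode, Quartiles.

For moderately skewed unimodal distributions, the empirical relation is:

$$
\text{Mode} \approx 3 \cdot \text{Median} - 2 \cdot \text{Mean} \tag{16}
$$

Law of Large Numbers: The sample mean converges to the expected value as the number of trials grows.
Central Limit Theorem: The distribution of the sample mean converges to a Normal distribution as the sample size grows.

Chebyshev's Inequality

$$
P(|X - \mu| \ge k\sigma) \le \frac{1}{k^2} \tag{17}
$$

Normal Distribution: Key concept of statistics.

Figure 1. Normal Distribution [3]

3.2. Hypothesis Testing
Testing variables to analyse performance.
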
Population: The entire group


Sample: A subset of the population used for evaluation.

$$
\begin{aligned}
\text{Population:}\quad \mu &= \frac{1}{N}\sum_{i=1}^{N} x_i, & \sigma^2 &= \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2 \\
\text{Sample:}\quad \bar{x} &= \frac{1}{n}\sum_{i=1}^{n} x_i, & s^2 &= \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2
\end{aligned}
\tag{18}
$$

Standard Error (SE): The SE is the standard deviation of a sample statistic such as the sample mean, so it exists only for the sample. The standard deviation is divided by $\sqrt{n}$, so smaller samples have a larger SE. As the sample size increases, the SE vanishes.

$$
\begin{aligned}
\text{(One Sample)}\quad SE &= \frac{s}{\sqrt{n}} \\
\text{(Two Sample Eq Var)}\quad SE &= \sqrt{S_p^2 \cdot \left(\frac{1}{n_1} + \frac{1}{n_2}\right)} \\
\text{(Two Sample Uneq Var)}\quad SE &= \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
\end{aligned}
\tag{19}
$$

Types of Hypothesis:
Null $H_0$: The variable is as expected (values are equal).
Alternate $H_1$: The variable differs from what is expected (values are higher, lower, or not equal).

Table 4. T Test or Z Test

| Test | $\sigma^2$ Variance Unknown | $\sigma^2$ Variance Known |
|---|---|---|
| $n < 30$ | T - Test | T - Test |
| $n > 30$ | T - Test | Z - Test |

Test Statistic

$$
z \text{ (or) } t = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} \tag{20}
$$

3.2.1. Terminology
Important terms used in Hypothesis Testing:
p-value: Under the null hypothesis, the probability of seeing a value at least as extreme as the observed test statistic.
Significance Level ($\alpha$): Threshold at which the test is conducted, i.e. the accepted probability of a Type I error.
Critical Value: The point on the distribution corresponding to the significance level.
Power ($1 - \beta$): Probability of rejecting the null hypothesis when it is false.
Confidence Interval: Estimated range of values likely to contain the population mean at the chosen confidence level.
Minimum Detectable Effect: Smallest difference that can be detected.
Cohen's D: Effect size expressed in units of standard deviation.

Based on the test statistic formula, these quantities can be estimated with Equation 21.

$$
\begin{aligned}
\text{Confidence Interval} &= \bar{x} \pm t_{\alpha/2,\,n-1} \cdot \frac{s}{\sqrt{n}} \\
\text{Minimum detectable effect (MDE)} &= t_{\alpha/2,\,n-1} \cdot \frac{s}{\sqrt{n}} \\
\text{Cohen's D} &= \frac{\text{MDE}}{\text{Pooled SD}}
\end{aligned}
\tag{21}
$$

There are several types of T-tests.
Sample Greater/Lesser Than: One Tail [Significance = $\alpha$ in one tail]
Sample Not Equal To: Two Tail [Significance = $\alpha/2$ in each tail]

Table 5. Types of T Test

| Sample | DF | Notes |
|---|---|---|
| 1 | $n - 1$ | |
| 2 (Paired) | $n - 1$ | |
| 2 (Unpaired) | $n_1 + n_2 - 2$ | Same Population: Pool variance |

$$
S_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} \tag{22}
$$

Pooled variance is given by Equation 22. Use the appropriate standard error formula based on the samples.
NOTE: The T test uses degrees of freedom (DF), the Z test does not.

3.2.2. Test of Proportions
When comparing proportions, a formula similar to the T test is used, as in Equation 23. This can be used while conducting binomial trials across samples.

$$
\begin{aligned}
z \text{ (or) } t &= \frac{p_2 - p_1}{SE} \\
\text{Independent SE} &= \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}} \\
\text{Pooled SE} &= \sqrt{p(1 - p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} \quad \text{where } p = \frac{p_1 n_1 + p_2 n_2}{n_1 + n_2}
\end{aligned}
\tag{23}
$$

A helpful Z and T test table (for common values): T Test Table [5]

3.2.3. Errors
The different types of outcomes for a test are summarized in Table 6.

Table 6. Hypothesis Testing Outcomes

| Real / Pred | Accept $H_0$ | Reject $H_0$ |
|---|---|---|
| $H_0$ = True | Confidence Level ($1 - \alpha$) | Type I Error: Significance Level ($\alpha$) |
| $H_0$ = False | Type II Error: Fail to Reject ($\beta$) | Power ($1 - \beta$) |

3.2.4. Power Analysis
Power Analysis is used to calculate the sample size needed to observe the Minimum Detectable Effect, as seen in Figure 2.
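As a rough sketch of how a power analysis turns into a number, the snippet below evaluates the usual normal-approximation sample size formula, $n = \left(\frac{Z_{\alpha/2} + Z_{\beta}}{\delta/\sigma}\right)^2$ (Equation 24), using the standard library's `statistics.NormalDist`. The inputs ($\alpha = 0.05$, power $= 0.8$, $\delta = 0.5$, $\sigma = 1$) and the function name `sample_size` are assumptions chosen only for illustration.

```python
import math
from statistics import NormalDist


def sample_size(alpha: float, power: float, delta: float, sigma: float) -> int:
    """Minimum n for a two-sided one-sample Z test (normal approximation)."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = std_normal.inv_cdf(power)           # quantile for the desired power
    n = ((z_alpha + z_beta) / (delta / sigma)) ** 2
    return math.ceil(n)  # round up to a whole observation


# Hypothetical inputs: detect a shift of half a standard deviation.
n = sample_size(alpha=0.05, power=0.8, delta=0.5, sigma=1.0)
print(n)
```

Halving the detectable effect roughly quadruples the required sample, since $n$ scales with $1/\delta^2$.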


Figure 2. Power over Distributions [4]

Deriving from the formula for minimum detectable effect with a small tweak, the minimum sample size needed is calculated. Usually a power of 0.8 is used in hypothesis testing.

$$
\text{Sample Size } n = \left(\frac{Z_{\alpha/2} + Z_{\beta}}{\delta / \sigma}\right)^2 \quad \text{where } \delta = \text{Minimum Detectable Effect} \tag{24}
$$

NOTE: Power Analysis uses the Z test to calculate the sample size. If two independent samples are being sized, the calculated size is doubled.

3.3. Analysis of Variance
ANOVA is used to test whether the means of two or more groups are statistically different from each other. It compares the groups within a continuous variable when the values can be assumed to follow a normal distribution. These steps are needed to calculate the ANOVA statistic, where $k$ is the number of groups, $n_j$ the size of group $j$, and $N$ the total number of observations.

$$
\begin{aligned}
\text{Sum Squares Between Groups}\quad SSB &= \sum_{j=1}^{k} n_j (\bar{Y}_j - \bar{Y})^2 \\
\text{Sum Squares Within Groups}\quad SSW &= \sum_{j=1}^{k} \sum_{i=1}^{n_j} (Y_{ji} - \bar{Y}_j)^2 \\
\text{Mean Squares Between}\quad MSB &= \frac{SSB}{k - 1} \\
\text{Mean Squares Within}\quad MSW &= \frac{SSW}{N - k} \\
\text{F Statistic} &= \frac{MSB}{MSW}
\end{aligned}
\tag{25}
$$

Useful link for F Test: F Test Table [2]

3.4. Non Parametric Methods

3.4.1. Chi Square
Chi Square is used to evaluate categorical variables. It is a special case of the Gamma distribution. It requires building the frequency table.

Goodness of Fit: Check if a sample fits a population.

$$
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} \quad \text{where } df = k - 1,\ O = \text{Observed Frequency},\ E = \text{Expected Frequency} \tag{26}
$$

Independence: Check if two categorical variables are independent.

$$
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \quad df = (r - 1) \times (c - 1), \quad r, c = \text{Number of rows and columns} \tag{27}
$$

Variance: Hypothesis Test on Sample Variance

$$
\chi^2 = \frac{(n - 1)s^2}{\sigma^2}, \quad df = n - 1 \tag{28}
$$

Useful table for Chi Square: Chi Square Table [1]

3.4.2. Mann Whitney U Test
Also known as the Wilcoxon Rank Sum test, it determines whether there is a significant difference between two ordinal samples. All samples from both groups need to be ranked together for this. It uses the Z score table for evaluation.

$$
\begin{aligned}
U_1 &= \sum_{\text{Group 1}} \text{rank} - \frac{n_1(n_1 + 1)}{2} \\
U_2 &= \sum_{\text{Group 2}} \text{rank} - \frac{n_2(n_2 + 1)}{2} \\
U &= \min(U_1, U_2) \\
\mu_U &= \frac{n_1 n_2}{2} \\
\sigma_U^2 &= \frac{n_1 n_2 (n_1 + n_2 + 1)}{12} \\
Z &= \frac{U - \mu_U}{\sqrt{\sigma_U^2}}
\end{aligned}
\tag{29}
$$

A summary of all tests is provided in Table 9.

3.5. Stochastic and Temporal Models

3.5.1. Markov Chains
Markov Chain problems can be solved by building the transition matrix $P$.
To calculate the probabilities of reaching a certain end state, extract two matrices: $Q$, the sub-matrix of transitions between transient states (no absorbing state), and $R$, the sub-matrix of transitions from transient states to absorbing states.

$$
\begin{aligned}
\text{Fundamental Matrix}\quad N &= (I - Q)^{-1} \\
\text{Absorbing Probability}\quad B &= N \cdot R
\end{aligned}
\tag{30}
$$

Irreducible: No state is unreachable from any other state.
Aperiodic: Returns to a state are not restricted to multiples of a fixed period greater than 1.
Ergodic: Irreducible (and) Aperiodic. In this case, the steady state equations can be solved.


$$
\pi P = \pi, \qquad \sum_{i=1}^{n} \pi_i = 1 \tag{31}
$$
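For a small ergodic chain, the steady state equations (Equation 31) can be approximated numerically by applying $P$ repeatedly to any starting distribution. The pure-Python sketch below does this for a two-state chain. The transition probabilities and the helper name `stationary_distribution` are hypothetical, and an exact solution would solve the linear system instead.

```python
def stationary_distribution(P, iterations=1000):
    """Approximate pi satisfying pi P = pi by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iterations):
        # One step of the chain: new pi_j = sum_i pi_i * P[i][j].
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


# Hypothetical two-state transition matrix (each row sums to 1).
P = [[0.9, 0.1],
     [0.5, 0.5]]
pi = stationary_distribution(P)
print([round(x, 4) for x in pi])
```

Solving by hand gives $\pi = (5/6, 1/6)$ for this matrix, which the iteration should reproduce closely.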

3.5.2. ARIMA
Family of models for time series forecasting.
Stationarity: Statistical properties such as the mean and variance do not change over time (no trend).
Heteroskedasticity: Non-constant variance.

Table 7. SARIMAX Components

| Component | Variable | Notes |
|---|---|---|
| Auto-Regressive | p | Partial Autocorrelation (PACF) |
| Differencing | d | Augmented Dickey Fuller (ADF) |
| Moving Average | q | Autocorrelation (ACF) |
| Seasonality | s | Seasonal Decomposition |
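The differencing order d in Table 7 counts how many times the series is differenced to remove trend before the AR and MA terms are fitted. A minimal pure-Python sketch of first (and lagged) differencing follows. The helper name `difference` and the toy linear-trend series are assumptions for illustration, and a real workflow would typically check stationarity with an ADF test from a statistics library.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every valid t."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Hypothetical series with a pure linear trend: y_t = 3 + 2t.
series = [3 + 2 * t for t in range(8)]
diffed = difference(series)
print(diffed)  # every first difference equals the slope of the trend
```

Passing `lag=s` gives a seasonal difference, which plays the same role for the seasonality component.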

4. Contact me
You can contact me through these methods:

Personal Website - astronights.github.io
Email - [email protected]
LinkedIn - linkedin.com/in/shubhankar-agrawal

References
[1] Chi Square Table. [Online]. Available: https://math.arizona.edu/~jwatkins/chi-square-table.pdf.
[2] F Test Table. [Online]. Available: https://www.stat.purdue.edu/~lfindsen/stat503/F_alpha_05.pdf.
[3] Key Properties of the Normal Distribution. [Online]. Available: https://analystprep.com/cfa-level-1-exam/wp-content/uploads/2019/10/page-123.jpg.
[4] Power and Sample Size Determination. [Online]. Available: https://sphweb.bumc.bu.edu/otlt/mph-modules/bs/bs704_power/bs704_power_print.html.
[5] T Test Table. [Online]. Available: https://www.sjsu.edu/faculty/gerstman/StatPrimer/t-table.pdf.


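Table 8. Common Probability Distributions

| Type | Name | PMF/PDF | Mean | Variance | Summary |
|---|---|---|---|---|---|
| Discrete | Uniform | $\frac{1}{n}$ | $\frac{n+1}{2}$ | $\frac{n^2-1}{12}$ | Single uniform outcome |
| | Bernoulli | $p^{x} \cdot (1-p)^{1-x}$, $x \in \{0, 1\}$ | $p$ | $p \cdot (1-p)$ | Single binary outcome |
| | Binomial | $\binom{n}{k} \cdot p^k \cdot (1-p)^{n-k}$ | $n \cdot p$ | $n \cdot p \cdot (1-p)$ | Pick k from n (with replacement) |
| | Geometric | $(1-p)^{n-1} \cdot p$ | $\frac{1}{p}$ | $\frac{1-p}{p^2}$ | # of events till first success |
| | Negative Binomial | $\binom{n+r-1}{r-1} \cdot (1-p)^n \cdot p^r$ | $\frac{r}{p}$ | $\frac{r \cdot (1-p)}{p^2}$ | # of events till r successes ($n$ = failures; mean and variance are for the total $n + r$) |
| | Hyper-geometric | $\frac{\binom{K}{k} \cdot \binom{N-K}{n-k}}{\binom{N}{n}}$ | $\frac{n \cdot K}{N}$ | $n \cdot \frac{K}{N} \cdot \frac{N-K}{N} \cdot \frac{N-n}{N-1}$ | Pick k from n picks of K within N items (no replacement) |
| | Poisson | $\frac{e^{-\lambda} \cdot \lambda^x}{x!}$ | $\lambda$ | $\lambda$ | # of events in a fixed interval with rate $\lambda$ |
| Continuous | Uniform | $\frac{1}{b-a}$ | $\frac{a+b}{2}$ | $\frac{(b-a)^2}{12}$ | Single uniform outcome |
| | Normal | $\frac{1}{\sqrt{2\pi\sigma^2}} \cdot e^{-\frac{(x-\mu)^2}{2\sigma^2}}$ | $\mu$ | $\sigma^2$ | Gaussian Event |
| | Exponential | $\lambda \cdot e^{-\lambda x}$ | $\frac{1}{\lambda}$ | $\frac{1}{\lambda^2}$ | Waiting time until the next event (Inverse of Poisson) |
| | Gamma | $\frac{1}{\Gamma(a)} (\lambda x)^a e^{-\lambda x} \frac{1}{x}$ | $\frac{a}{\lambda}$ | $\frac{a}{\lambda^2}$ | Gamma Event ($a > 0$); $\Gamma(a) = (a-1)!$ for integer $a$ |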
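One way to sanity-check a row of Table 8 is to compute the moments directly from the PMF and compare them with the closed forms. The sketch below does this for the Binomial row using only the standard library. The parameters ($n = 10$, $p = 0.3$) and the helper name `binomial_pmf` are assumptions for illustration.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical parameters.
n, p = 10, 0.3
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# E[X] and E[X^2] from the definition, then Var = E[X^2] - (E[X])^2.
mean = sum(k * pk for k, pk in enumerate(pmf))
second_moment = sum(k**2 * pk for k, pk in enumerate(pmf))
variance = second_moment - mean**2

print(round(mean, 6), round(variance, 6))
print(n * p, n * p * (1 - p))  # closed forms from the table
```

The same pattern works for any discrete row of the table, as long as the support is truncated where the remaining probability mass is negligible.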

Table 9. Statistical Tests

| Test | Data | Comparing | Statistic | Notes |
|---|---|---|---|---|
| Z Test | Numerical | Means | Z | Known $\sigma^2$ |
| T Test | Numerical | Means | T | Unknown $\sigma^2$ |
| ANOVA | Numerical | Means | F | Groups |
| Chi Square | Categorical | Frequency / Variance | Chi2 | Requires Building Frequency Table |
| Mann Whitney U | Ordinal | Medians | Z | Non normal data, Requires Ranking |
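To make the Chi Square row concrete, the sketch below computes the goodness of fit statistic from Equation 26 for a die-rolling example. The observed counts are made up for illustration, the helper name `chi_square_statistic` is hypothetical, and the critical value 11.07 ($df = 5$, $\alpha = 0.05$) is read from a standard chi-square table rather than computed.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts from 120 rolls of a die; a fair die expects 20 per face.
observed = [18, 22, 16, 25, 19, 20]
expected = [20] * 6

stat = chi_square_statistic(observed, expected)
df = len(observed) - 1

# Critical value taken from a chi-square table for df = 5, alpha = 0.05.
critical_value = 11.07
reject_null = stat > critical_value
print(stat, df, reject_null)
```

Here the statistic is well below the critical value, so the null hypothesis of a fair die would not be rejected.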
