Module 8
🔹 Population vs Sample:
✅ Population:
The entire group of individuals or observations we want to study or draw conclusions about.
Notation: population size N, population mean μ, population variance σ².
✅ Sample:
A subset of the population that is actually observed and used to make inferences about the whole population.
Notation: sample size n, sample mean x̄, sample variance s².
Inferential statistics bridges the gap between sample and population using
techniques such as:
Hypothesis Testing
Confidence Intervals
Regression Analysis
ANOVA (Analysis of Variance)
Chi-square tests
Bayesian Inference
🧠 Example:
✅ Summary:
Term | Description
Population | Entire group under study. Often too large to examine fully.
Sample | Subset of the population used to make inferences. Must be randomly and representatively chosen.
Inferential Statistics | Methods for making predictions or generalizations about a population based on a sample. Key in decision-making under uncertainty.
🔹 Central Limit Theorem (CLT):
The sampling distribution of the sample mean (or sum) approaches a normal distribution as the sample size becomes large, regardless of the original distribution of the data, provided the data are independent and identically distributed (i.i.d.) and have finite variance.
📘 Formal Statement:
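One standard formulation (equivalent versions exist): if $X_1, X_2, \ldots$ are i.i.d. with mean $\mu$ and finite variance $\sigma^2$, then
$$\frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \;\xrightarrow{\,d\,}\; N(0, 1) \quad \text{as } n \to \infty.$$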
The individual data might be skewed (e.g., some customers spend way more). But if you repeatedly take samples (e.g., 50 customers at a time) and compute the average for each group, those averages will start to form a normal distribution, even if the original data were not normal!
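A minimal simulation sketch of this effect (hypothetical skewed "spending" data, chosen only for illustration; uses numpy):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical, strongly skewed "customer spending" population (exponential, mean 50).
population = rng.exponential(scale=50, size=100_000)

# Repeatedly take samples of 50 customers and record each sample's average.
sample_means = [rng.choice(population, size=50).mean() for _ in range(2_000)]

print("mean of sample means:", round(float(np.mean(sample_means)), 2))   # close to 50
print("spread of sample means:", round(float(np.std(sample_means)), 2))  # close to 50/sqrt(50)
# A histogram of sample_means looks roughly bell-shaped, even though the raw data are skewed.
```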
📌 Key Requirements for CLT:
Requirement | Description
i.i.d. | The random variables must be independent and identically distributed.
Finite mean (μ) | Each variable must have the same, finite expected value.
Finite variance (σ²) | The variance must be finite.
Large sample size (n ≥ 30) | In practice, n ≥ 30 is often "large enough", but more skewed distributions may need a larger n.
Given:
Step 1: Use Moment Generating Function (MGF) approach
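A compressed sketch of the usual MGF argument (assuming the MGF of the standardized variables exists near 0): let $Y_i = (X_i - \mu)/\sigma$ and $Z_n = \frac{1}{\sqrt{n}}\sum_{i=1}^{n} Y_i$. Since the $Y_i$ are i.i.d. with mean 0 and variance 1,
$$M_{Z_n}(t) = \left[ M_Y\!\left(\frac{t}{\sqrt{n}}\right) \right]^{n} = \left[ 1 + \frac{t^2}{2n} + o\!\left(\frac{1}{n}\right) \right]^{n} \longrightarrow e^{t^2/2} \quad \text{as } n \to \infty,$$
which is the MGF of $N(0,1)$, so $Z_n$ converges in distribution to a standard normal.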
🧮 Numerical Example:
Let’s say we're measuring the number of steps taken daily by users:
Now:
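Suppose, purely for illustration, that daily steps have mean 8,000 and standard deviation 2,000 (hypothetical values), and we average samples of n = 50 users. By the CLT the sample mean is approximately normal, so probabilities about it can be computed directly; a minimal sketch with scipy:

```python
from math import sqrt
from scipy.stats import norm

# Hypothetical population values (for illustration only).
mu, sigma, n = 8000, 2000, 50

# By the CLT, the sample mean is approximately N(mu, sigma/sqrt(n)).
se = sigma / sqrt(n)

# e.g., probability that the average of 50 users exceeds 8,500 steps.
p = 1 - norm.cdf(8500, loc=mu, scale=se)
print(f"standard error = {se:.1f}, P(sample mean > 8500) = {p:.4f}")
```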
✅ Summary:
Concept | Description
CLT | Sampling distribution of the sample mean becomes normal as sample size increases.
Applies to | Any distribution (with finite mean and variance), when using sample means.
Why it's crucial | Enables inference using normal distribution methods even for non-normal data.
In Data Science | Powers confidence intervals, A/B testing, bootstrapping, model evaluation, and more.
🔹 Definition:
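In its standard form: if $Z_1, \ldots, Z_k$ are independent standard normal random variables, then
$$Q = \sum_{i=1}^{k} Z_i^2 \sim \chi^2_k,$$
i.e., $Q$ follows a chi-squared distribution with $k$ degrees of freedom.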
Application | Description
Chi-Squared Test for Independence | Used with contingency tables to test whether two categorical variables are independent.
Chi-Squared Goodness-of-Fit Test | Checks how well observed data fit a specified theoretical distribution.
Test of a Population Variance | Used when testing whether a population's variance equals a specific value.
ANOVA (as part of the F-distribution) | Sums of squares in ANOVA are chi-squared distributed under the null hypothesis.
Given:
📊 Chi-Squared in Data Science & ML:
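As one illustration (a hypothetical 2×2 contingency table of UI variant vs. click outcome), a chi-squared test of independence with scipy:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical contingency table: rows = variant A/B, cols = click / no click.
observed = np.array([[30, 70],
                     [45, 55]])

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.3f}, dof = {dof}, p-value = {p_value:.4f}")
# A small p-value is evidence against independence of the two variables.
```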
✅ Summary:
Concept | Description
Chi-Squared Distribution | Distribution of the sum of squared standard normal variables.
Used for | Goodness-of-fit, tests for independence, variance estimation, model diagnostics.
🔹 What is Estimation?
🔹 1. Point Estimator
📘 Definition:
A point estimator is a single numerical value computed from sample data that
serves as a best guess for an unknown population parameter.
🧠 Examples:
If you have a sample of 100 students' scores and want to estimate the average score of all students in the university, then the sample mean x̄ is your point estimate of the population mean μ.
But this estimate is a single value — it doesn't tell you how confident you are in
the estimate. That’s where interval estimation comes in.
🔹 2. Interval Estimator
📘 Definition:
An interval estimator gives a range of plausible values for the unknown parameter, together with a stated confidence level (e.g., a 95% confidence interval for the population mean).
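A minimal sketch contrasting point and interval estimates, using a hypothetical sample of exam scores and a t-based 95% interval (scipy):

```python
import numpy as np
from scipy import stats

# Hypothetical sample of exam scores.
scores = np.array([72, 85, 91, 64, 78, 88, 95, 70, 83, 77])

point_estimate = scores.mean()   # point estimator of the population mean
se = stats.sem(scores)           # standard error of the mean
ci_low, ci_high = stats.t.interval(0.95, df=len(scores) - 1,
                                   loc=point_estimate, scale=se)

print(f"point estimate = {point_estimate:.2f}")
print(f"95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```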
✅ Summary:
Concept | Description
Point Estimator | Single best guess for a population parameter (e.g., sample mean)
Interval Estimator | Range of plausible values with a specified confidence level (e.g., 95% CI)
Why Important | Point gives the estimate, interval gives a reliability measure
Desirable Qualities | Unbiased, consistent, efficient (for point); accurate, precise (for interval)
🔍 What is MLE?
📘 Definition:
Simply put, MLE finds the parameters that best "explain" the data we observed.
⚙️ Step-by-Step Breakdown of MLE:
🧮 1. Likelihood Function
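For i.i.d. observations $x_1, \ldots, x_n$ drawn from a density or pmf $f(x; \theta)$, the likelihood of the parameter $\theta$ is
$$L(\theta) = \prod_{i=1}^{n} f(x_i; \theta).$$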
🧾 2. Log-Likelihood Function
Because products of probabilities get small and are hard to differentiate, we use the
log-likelihood:
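In the same notation,
$$\ell(\theta) = \log L(\theta) = \sum_{i=1}^{n} \log f(x_i; \theta),$$
and the MLE $\hat{\theta}$ is the value of $\theta$ maximizing $\ell(\theta)$; when $\ell$ is differentiable and the maximum is interior, it solves $\ell'(\theta) = 0$.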
📐 Fisher Information:
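The standard definition (under the usual regularity conditions) measures how sharply the log-likelihood is curved around the true parameter:
$$I(\theta) = -\,E\!\left[\frac{\partial^2}{\partial \theta^2}\log f(X;\theta)\right] = E\!\left[\left(\frac{\partial}{\partial \theta}\log f(X;\theta)\right)^{2}\right],$$
and, asymptotically, $\hat{\theta}_{\text{MLE}}$ is approximately $N\!\left(\theta,\ \tfrac{1}{n\,I(\theta)}\right)$.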
✅ Summary:
Concept | Description
MLE | Estimation method that finds the parameters maximizing the likelihood of observing the sample
Log-Likelihood | Used to simplify computation (product → sum)
Properties | Consistent, efficient, asymptotically normal
Use Cases | All branches of statistics and ML: logistic regression, HMMs, A/B testing, etc.
Invariance | The MLE of a function of a parameter is that function of the MLE
🧮 Example:
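As an illustrative sketch (hypothetical waiting-time data assumed to be Exponential(λ)), the numeric MLE matches the closed-form answer $\hat{\lambda} = 1/\bar{x}$:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Hypothetical waiting times (minutes), assumed Exponential with rate lambda.
data = np.array([2.1, 0.7, 3.4, 1.2, 0.5, 2.8, 1.9, 0.9])

def neg_log_likelihood(lam):
    # -log L(lambda) = -(n*log(lambda) - lambda * sum(x))
    return -(len(data) * np.log(lam) - lam * data.sum())

result = minimize_scalar(neg_log_likelihood, bounds=(1e-6, 10), method="bounded")
print(f"numeric MLE of lambda: {result.x:.4f}")
print(f"closed-form MLE (1 / sample mean): {1 / data.mean():.4f}")
```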
🧾 Interpretation:
✅ Summary Table
📚 Categories of Estimators:
Type | Description
Point Estimator | Gives a single best guess of the parameter
Interval Estimator | Gives a range of values (confidence interval) for the parameter
✅ Examples of Estimators in Statistics
6. Median as Estimator
7. Mode as Estimator
9. Bayesian Estimator
Step 7: Conclusion
📊 6. Graphical Representation
🧪 7. Real-Life Examples
✅ Final Thought:
Assumptions:
🧪 Real-World Example
✅ Conclusion
While HT-I and HT-II focus on one- and two-sample parametric tests, Hypothesis Testing – III includes:
✅ 1. Non-Parametric Tests
These tests do not assume any specific distribution (like normal) and are useful
when:
Test | Purpose
MANOVA (Multivariate ANOVA) | Compares mean vectors across groups
Hotelling's T² | Compares multivariate means between two groups
Wilks' Lambda | Used in MANOVA to test significance
📊 6. Common Use Cases of HT-III Techniques
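As one example of a common use case, comparing two groups without assuming normality (hypothetical task-completion times) with the Mann-Whitney U test in scipy:

```python
from scipy.stats import mannwhitneyu

# Hypothetical task-completion times (seconds) for two UI variants.
variant_a = [12.1, 15.3, 9.8, 22.4, 11.0, 13.7, 18.2]
variant_b = [16.9, 21.5, 14.2, 25.1, 19.8, 17.3, 23.0]

u_stat, p_value = mannwhitneyu(variant_a, variant_b, alternative="two-sided")
print(f"U = {u_stat:.1f}, p-value = {p_value:.4f}")
# No normality assumption is needed; the test compares rank distributions.
```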
🧾 7. Summary Table
✅ Final Thoughts:
These techniques are used across many fields, including:
Medical research
Social sciences
Genetics (e.g., Chi-square for Mendelian ratios)
Machine learning model evaluation (LRT)