1.4.1. Estimation and Inference
Learning Goals
In this section, we will cover:
- Statistical estimation and inference
- Parametric and non-parametric approaches to modeling
- Common statistical distributions
- Frequentist vs. Bayesian statistics
Estimation vs. Inference
Estimation: the application of an algorithm to data, for example taking an average.
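A minimal sketch of the distinction, assuming a small made-up sample (the numbers below are illustrative, not taken from any dataset). Estimation is shown as computing the average; inference is shown in one common form, attaching a standard error to that estimate. The variable names are hypothetical.

```python
import math
import statistics

# Made-up sample values, used only for illustration.
sample = [29.85, 56.95, 53.85, 42.30, 70.70, 99.65, 89.10, 29.75]

# Estimation: apply an algorithm (here, the average) to the data.
mean_estimate = statistics.mean(sample)

# Inference (one common form): quantify how accurate the estimate is,
# here with the standard error of the mean.
std_error = statistics.stdev(sample) / math.sqrt(len(sample))

print(f"estimated mean: {mean_estimate:.2f}")
print(f"standard error: {std_error:.2f}")
```

The estimate alone is a single number; the standard error is what lets us say something about the population the sample came from.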
Machine Learning and Statistical Inference
Machine learning and statistical inference are similar
(a case of computer science borrowing from a long history in statistics).
We may care about either the whole distribution or just some of its features (e.g. the mean).
Example: Customer Churn
Customer churn occurs when a customer leaves a company
Customer Churn: Example Dataset
IBM Cognos Customer Churn Dataset:
- Data from a fictional telecommunications firm
Customer Churn Example: Plotting
Parametric vs. Non-parametric
Non-parametric Statistics
Non-parametric Inference
Parametric Models
A parametric model is a particular type of statistical model: it is also a
set of distributions or regressions, but one described by a finite number
of parameters.
An example of a parametric model: the Normal Distribution, which is fully
described by its mean and standard deviation.
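As a rough illustration, the sketch below fits a normal distribution to a small made-up sample using Python's standard-library `statistics.NormalDist`. The data values and the threshold of 5.2 are assumptions chosen for the example. Once the two parameters are estimated, the whole distribution is determined.

```python
from statistics import NormalDist

# Made-up measurements, used only for illustration.
data = [4.8, 5.1, 5.0, 4.7, 5.3, 5.2, 4.9, 5.0]

# Fitting the parametric model means estimating its two parameters.
model = NormalDist.from_samples(data)
print(f"mean={model.mean:.3f}, stdev={model.stdev:.3f}")

# With the parameters fixed, any probability follows from the model.
prob_below = model.cdf(5.2)
print(f"P(X <= 5.2) under the fitted model: {prob_below:.3f}")
```

Note that `from_samples` uses the sample standard deviation (dividing by n - 1), which is a design choice of that library function rather than part of the definition of a parametric model.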
Example: Customer Lifetime Value
Customer lifetime value is an estimate of the
customer's value to the company
Parametric Models: Maximum Likelihood
The most common way of estimating parameters in a parametric model
is through maximum likelihood estimation (MLE).
We choose the value of θ (the parameters) that maximizes the likelihood function.
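A minimal sketch of maximum likelihood estimation, assuming an exponential model and made-up observations (neither is specified in the slides). Here the single parameter θ is the rate. The code compares the known closed-form maximizer, n divided by the sum of the observations, with a simple grid search over the log-likelihood.

```python
import math

# Made-up waiting times, used only for illustration.
times = [0.5, 1.2, 0.3, 2.1, 0.9, 1.5, 0.7, 0.8]

def log_likelihood(rate, xs):
    """Log-likelihood of an Exponential(rate) model for observations xs."""
    return len(xs) * math.log(rate) - rate * sum(xs)

# Closed-form MLE for the exponential rate: n / sum(x).
rate_closed_form = len(times) / sum(times)

# Numerical check: evaluate the log-likelihood on a grid of candidate
# rates and keep the candidate with the highest value.
grid = [i / 1000 for i in range(1, 5001)]
rate_grid = max(grid, key=lambda r: log_likelihood(r, times))

print(f"closed form: {rate_closed_form:.3f}, grid search: {rate_grid:.3f}")
```

Maximizing the log-likelihood rather than the likelihood itself gives the same θ, because the logarithm is increasing, and it avoids multiplying many small numbers together.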
18
Commonly Used Distributions
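The sketch below draws samples from the five distributions that the Summary lists as most common (uniform, normal, log-normal, exponential, Poisson) and prints their sample mean and standard deviation. All parameter values are arbitrary choices for illustration. Python's `random` module has no Poisson sampler, so `sample_poisson` is a hypothetical helper implementing Knuth's multiplication method.

```python
import math
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 20_000

def sample_poisson(lam, rng):
    """Draw one Poisson(lam) value using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

samples = {
    "uniform(0, 1)": [rng.uniform(0, 1) for _ in range(n)],
    "normal(0, 1)": [rng.gauss(0, 1) for _ in range(n)],
    "log-normal(0, 0.5)": [rng.lognormvariate(0, 0.5) for _ in range(n)],
    "exponential(rate=2)": [rng.expovariate(2) for _ in range(n)],
    "Poisson(3)": [sample_poisson(3, rng) for _ in range(n)],
}

for name, xs in samples.items():
    mean, sd = statistics.mean(xs), statistics.stdev(xs)
    print(f"{name:20s} mean={mean:.3f}  stdev={sd:.3f}")
```

With this many draws the sample means should land close to the theoretical ones, for example 0.5 for the uniform and 3 for the Poisson.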
Frequentist vs. Bayesian Statistics
A Frequentist is concerned with repeated observations in the limit.
Frequentist approach:
1. Derive the probabilistic property of a procedure
2. Apply the probability directly to the observed data
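One way to picture "repeated observations in the limit" is a simulation of the coverage of a confidence interval. The sketch below assumes normally distributed data with a known standard deviation, and all numbers are made up. The procedure (a 95% interval for the mean) has a probabilistic property, and that property shows up as the fraction of repeated experiments in which the interval contains the true mean.

```python
import math
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable
true_mean, sigma, n, repeats = 10.0, 2.0, 30, 2000
z = 1.96  # approximate 97.5th percentile of the standard normal

covered = 0
for _ in range(repeats):
    # One "experiment": draw a fresh sample and build the interval.
    sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
    mean = sum(sample) / n
    half_width = z * sigma / math.sqrt(n)  # assumes sigma is known
    if mean - half_width <= true_mean <= mean + half_width:
        covered += 1

coverage = covered / repeats
print(f"fraction of intervals containing the true mean: {coverage:.3f}")
```

In practice only one sample is observed, and the property derived for the procedure is then applied to the single interval computed from that sample.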
Frequentist vs. Bayesian: Bayesian
A Bayesian describes parameters by probability distributions.
We will consider two examples of probabilistic systems:
● Coin flips - What is the probability of an unfair coin coming up
heads?
● Election of a particular candidate for UK Prime Minister - What
is the probability that an individual candidate who has not stood
before wins?
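For the coin-flip example, a minimal sketch of the two viewpoints, assuming made-up data (7 heads in 10 flips) and a uniform Beta(1, 1) prior. Both the data and the choice of prior are assumptions for illustration. The frequentist estimate is the observed proportion, while the Bayesian result is a full distribution over the heads probability, summarized here by its mean and standard deviation.

```python
import math

heads, flips = 7, 10  # made-up observations, for illustration only

# Frequentist point estimate: the observed proportion of heads.
p_hat = heads / flips

# Bayesian: start from a Beta(a, b) prior over the heads probability.
# Beta(1, 1) is uniform on [0, 1], an assumption of this sketch.
prior_a, prior_b = 1, 1

# The Beta prior is conjugate to coin-flip data, so the posterior is
# again a Beta distribution with the counts added to the parameters.
post_a = prior_a + heads
post_b = prior_b + (flips - heads)

posterior_mean = post_a / (post_a + post_b)
posterior_sd = math.sqrt(
    post_a * post_b / ((post_a + post_b) ** 2 * (post_a + post_b + 1))
)

print(f"frequentist estimate: {p_hat:.3f}")
print(f"posterior: Beta({post_a}, {post_b}), "
      f"mean={posterior_mean:.3f}, sd={posterior_sd:.3f}")
```

The election example is harder to treat with repeated observations, since a candidate who has not stood before has no past trials to count. A prior distribution is one way a Bayesian analysis can still describe that uncertainty.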
Frequentist vs. Bayesian Statistics
We use much of the same math and the same formulas in both
Frequentist and Bayesian statistics.
Summary
● Estimation and Inference
○ Inferential statistics consists of learning characteristics of the population from a
sample. The population characteristics are parameters, while the sample
characteristics are statistics. A parametric model uses a finite number of
parameters, such as the mean and standard deviation.
○ The most common way of estimating parameters in a parametric model is through
maximum likelihood estimation.
○ Through a hypothesis test, you test for a specific value of the parameter.
○ Estimation is the process of determining a population parameter based on a
model fitted to the data.
○ The most common distributions are: uniform, normal, log-normal,
exponential, and Poisson.
○ A frequentist approach focuses on observing many repeats of an experiment. A
Bayesian approach describes parameters through probability distributions.
Learning Recap
In this section, we discussed:
- Statistical estimation and inference
- Parametric and non-parametric approaches to modeling
- Common statistical distributions
- Frequentist vs. Bayesian statistics