
MONASH BUSINESS

Paradigms in Statistical Inference
Week 2: ETM2100 Principles of Statistical Inference

Disclaimer: All logos, images and text used in this presentation are the property of their respective copyright owners and are used here for educational purposes.
Unit Learning Outcomes
On successful completion of this unit, you should be able to:

1. recognize uncertainty as a basic element in statistical science (uncertainty)
2. characterize and articulate various paradigms of inference in statistical analysis (paradigm)
3. identify strengths and limitations of different estimation procedures and approaches to hypothesis testing in statistical analysis (inference)
4. develop a statistical thinking process in the conduct of statistical analysis (thinking)
Learning Outcomes
Week 2: Paradigms in Inference
At the end of the week, you should be able to:

1. explain the nature of statistical inference (uncertainty, paradigm, thinking)
2. distinguish features of different paradigms in statistical inference (uncertainty, paradigm, thinking)
Outline
✓ Nature of Statistical Inference
✓ Summarizing the Data
✓ Frequentist Inference
✓ Bayesian Inference
✓ Parametric vs. Nonparametric
Nature of Statistical Inference
Recall…
Given the data…

Summarize:
- Characteristics
  Central tendency
  Variability
  Association of features
- Patterns
  Seasonality of sales
- Models
  Abstraction of patterns and other characteristics

From the summaries of the data…
✓ Does it characterize the population?
✓ Does the population behave like the data?
✓ What is the likelihood that the data truly represents the population?
Population and Sample
Assumption: the sample represents a population.
Probabilistic nature of the characteristic of interest/phenomenon:
✓ Define a random variable $X$
  $X$ – sales, price, income
✓ Characterize $X$ in terms of its distribution $F_X$ (incl. mean, variance, quantiles, etc.)
✓ Problem: we don't have full information about the population
  ➢ possibly, about the parameter(s) that govern $F_X$
✓ Solution: obtain independent observations from the population
  ➢ Single population ⇒ common distribution ⇒ observations are identically distributed!
✓ Observed data: realizations of IID random variables!
Population and Sample
Assumption: the sample represents a population.

✓ Population: unknown characteristics
✓ Sample: characteristics used to make inferences about the population
Does it make sense? Provided the above assumption is true!

o Target population
o Sampled population (N)
o Sample (n)
Random Sample/The Data
IID Random Variables
❑ Target population
✓ The totality of elements/units under consideration
✓ The totality from which information is desired
✓ Generally impossible to examine in its entirety
❑ Sampled population
✓ The totality from which samples were actually selected
✓ An abstraction of the truth
❑ Sample
✓ A probabilistic representation of the population
✓ Framework: a unit carries information about the population
  o We wish to observe such units
  o Randomly selected from the population
Random Sample/The Data
Mathematical Formulation

❑ $X$: a characteristic of interest (random variable, measurement of a unit)
❑ $F_X$: the distribution of $X$ (population)
❑ $X_1, X_2, \ldots, X_n$: measurements from $n$ independently observed units from $F_X$
  o Each $X_i$ has distribution $F_X$ (identical)
  o $F_X$ is the common distribution of the $X_i$'s
  o The $X_i$'s are assumed unrelated to each other (independent)
❑ The collection $X = (X_1, X_2, \ldots, X_n)$ is a random sample from a population.

The population is associated with the distribution $F_X$.
Random Sample/The Data
Population and Distribution

A population is associated with at least one random variable.

Characteristics of the Population ⇔ Characteristics of the Random Variable
Population ⇔ Distribution
Random Sample/The Data
Visual Representation: Population of N=100,000, Mean=100, SD=5, n=100

[Figure: histograms of the population (10 bins and 100 bins) and of a sample of n=100 (10 bins)]

Random Sample/The Data
Visual Representation: Population of N=100,000, Mean=100, SD=5, n=10

[Figure: histogram of the population (100 bins) and smoothed histogram of a sample of n=10]
Random Sample/The Data
Visual Representation

Example:
$X$ – sales per hour
N = 100,000 hours of selling so far
Population: $F_X$, sales following a normal distribution with mean 100, SD = 5
Sample: $X_1 = 5.70$, $X_2 = 99.23$, $X_3 = 104.38$, $X_4 = 92.46$, $X_5 = 95.95$, $X_6 = 103.18$, $X_7 = 93.30$, $X_8 = 0.70$, $X_9 = 93.81$, $X_{10} = 101.16$, obtained from $F_X$

Can we use this (the sample) to infer on this (the population)?

The population is associated with the distribution $F_X = N(100, 25)$.
Statistics and Sampling Distribution
Summarizing the data
✓ Statistic: a function of the random sample
  o A function of observable random variables
  o Itself a random variable
  o Does not contain any unknown parameter
✓ Observable: can be computed directly from the data/random sample
Example: Given $X_1 = 5.70$, $X_2 = 99.23$, $X_3 = 104.38$, $X_4 = 92.46$, $X_5 = 95.95$, $X_6 = 103.18$, $X_7 = 93.30$, $X_8 = 0.70$, $X_9 = 93.81$, $X_{10} = 101.16$,
summaries such as the sample mean, sample variance, median, minimum, and maximum are all statistics!
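As a quick illustration, here is a minimal Python sketch (not from the slides) computing several such statistics from the ten observations above:

```python
import statistics

# The ten observed sales-per-hour values from the example above
x = [5.70, 99.23, 104.38, 92.46, 95.95,
     103.18, 93.30, 0.70, 93.81, 101.16]

# Each of these is a statistic: a function of the sample alone,
# computable without knowledge of any population parameter.
print("mean:    ", statistics.mean(x))
print("variance:", statistics.variance(x))  # sample variance, divisor n-1
print("median:  ", statistics.median(x))
print("min/max: ", min(x), max(x))
```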
Statistics and Sampling Distribution
Linking the summaries to the population
✓ Sampling distribution: the probability distribution of a statistic (computed from a random sample)
❑ The distribution of the statistic over all possible samples (of the same size) from the same population
❑ Analysis can be based on the sampling distribution of the statistic rather than the joint distribution of all individual data values (the sample).
❑ Depends on:
  o the underlying population distribution
  o the sampling method
  o the sample size
  o the form of the statistic
✓ Standard error: the standard deviation of the statistic
Statistics and Sampling Distribution
Linking the summaries to the population
✓ Knowledge of the sampling distribution is useful in making inferences about the sampled population.
❑ Example:

[Table: two groups of samples (four samples per group) and their observed values; the layout was lost in extraction]

❑ Were the two groups of samples obtained from the same population?
❑ Which group was obtained from a more dispersed population?
❑ Can the data (sample) help us talk about the population?

Inference: inductive reasoning, developing a general conclusion.
Statistics and Sampling Distribution
Example: N=1,000,000, $\mu = 200$, $\sigma = 25$, Normal

n=20, 10,000 samples: mean of the sampling distribution: 199.956; standard error: 5.592
n=20, 1,000 samples: mean of the sampling distribution: 199.896; standard error: 5.394
Statistics and Sampling Distribution
Linking the summaries to the population

Inferential Statistics/Statistical Inference
❑ generalizes beyond the actual observations
❑ provides a form of confidence about our conclusions (about the population)
❑ is a basis for decision-making
On Sample Mean and Variance
Some Important Theory
❑ Sample Mean: Let $X_1, X_2, \ldots, X_n$ be a random sample from $F_X$ with common mean $E(X_i) = \mu$ and common variance $Var(X_i) = \sigma^2$. Then

$$E(\bar{X}) = E\left(\frac{\sum_{i=1}^{n} X_i}{n}\right) = \mu; \qquad V(\bar{X}) = \frac{\sigma^2}{n}$$

❑ Sample Variance: Let $X_1, X_2, \ldots, X_n$ be a random sample from $F_X$ with common mean $E(X_i) = \mu$ and common variance $Var(X_i) = \sigma^2$. Provided the moments (expected values) exist, then

$$E(S^2) = E\left(\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}\right) = \sigma^2; \qquad V(S^2) = \frac{1}{n}\left(\mu_4 - \frac{n-3}{n-1}\,\sigma^4\right)$$

where $\mu_4$ is the fourth central moment (related to kurtosis).
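These identities can be checked by Monte Carlo simulation. A minimal sketch, assuming a Normal(200, 25) population (for which $\mu_4 = 3\sigma^4$, so $V(S^2) = 2\sigma^4/(n-1)$):

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, n, reps = 200.0, 25.0, 20, 50_000

# Draw many samples of size n and compute X-bar and S^2 for each
samples = rng.normal(mu, sigma, size=(reps, n))
xbar = samples.mean(axis=1)
s2 = samples.var(axis=1, ddof=1)          # divisor n-1

print("E[X-bar] ~", xbar.mean(), "(theory:", mu, ")")
print("V[X-bar] ~", xbar.var(),  "(theory:", sigma**2 / n, ")")
print("E[S^2]   ~", s2.mean(),   "(theory:", sigma**2, ")")
# For the normal, mu_4 = 3*sigma^4, so V(S^2) = 2*sigma^4/(n-1)
print("V[S^2]   ~", s2.var(),    "(theory:", 2 * sigma**4 / (n - 1), ")")
```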
On Uncertainty

❑ Uncertainty is always present; variability is present in the sampling distribution.
  o Summaries of samples drawn from a population may vary by chance.
  o Inferences are based on probability; thus, conclusions are made without complete confidence (but with "controlled" uncertainty).
❑ Sampling distribution: a benchmark, a reference in statistical decision making.
On Errors and Means
❑ Standard error: associated with random sampling
❑ Sample mean: the most popular measure of central tendency
✓ Less variable than the values of the variables (random variables) themselves
✓ Sample means cluster around the true mean; most of the sample means are close to the true mean

n=20, 10,000 samples: mean of the sampling distribution: 199.956; standard error: 5.592
Why Normal Assumptions?

❑ With the normality assumption:
✓ Mathematically tractable
✓ Computationally efficient
❑ Linear functions of iid normal random variables are normally distributed.
❑ Some special distributions result from sampling from the normal distribution.
❑ Asymptotics: the limiting behavior of statistics as the sample size becomes large.
From Histogram to ECDF

❑ Histogram: a visualization of the distribution
1. Define bins from the data range.
2. Count the data points within each bin.
3. Height of the bars: frequency.
4. Adjust resolution by adjusting the bin width.

[Figure: histograms of the same data with 10, 50, 100 and 1000 bins]
From Histogram to ECDF

❑ Empirical Cumulative Distribution Function

$$F_n(t) = \frac{\#\{\text{obs in the sample} \le t\}}{n} = \frac{1}{n} \sum_{i=1}^{n} I(x_i \le t)$$

where $I(x_i \le t)$ is the indicator function of the event $x_i \le t$:
$I(x_i \le t) = 1$ if $x_i \le t$, and $I(x_i \le t) = 0$ otherwise.

[Figure: ECDFs of samples from Normal(mean=100, SD=5) with n=100 and n=1000]

❑ The ECDF approximates the true CDF well for large sample sizes.
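A minimal sketch (not from the slides) of computing and plotting the ECDF for the two sample sizes shown:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)

def ecdf(data):
    """Return sorted data x and F_n(x) = proportion of obs <= x."""
    x = np.sort(data)
    y = np.arange(1, len(x) + 1) / len(x)
    return x, y

for n in (100, 1000):
    sample = rng.normal(loc=100, scale=5, size=n)
    x, y = ecdf(sample)
    plt.step(x, y, where="post", label=f"ECDF, n={n}")
plt.legend()
plt.xlabel("t")
plt.ylabel("F_n(t)")
plt.show()
```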
Central Limit Theorem
Given $X_1, X_2, \ldots, X_n$, a random sample (i.e., independent and identically distributed) from $F_X$ with mean $E(X)$ and $V(X) < \infty$, define

$$\bar{X}_n = \frac{X_1 + X_2 + \cdots + X_n}{n} = \frac{\sum_{i=1}^{n} X_i}{n} \quad \text{and} \quad Z_n = \frac{\bar{X}_n - E(\bar{X}_n)}{\sqrt{V(\bar{X}_n)}} = \frac{\bar{X}_n - E(X)}{\sqrt{V(X)/n}}$$

Then as $n \to \infty$, the distribution of $Z_n$ approaches $N(0,1)$, i.e.,

$$Z_n \approx N(0, 1)$$

Equivalently,

$$\bar{X}_n \approx N\!\left(E(X), \frac{V(X)}{n}\right)$$

and, for the sum $S_n = \sum_{i=1}^{n} X_i$,

$$S_n \approx N(nE(X), nV(X)) \quad \text{as } n \to \infty$$
Central Limit Theorem
❑ In inference about population parameter(s):
  o Summarize the data (statistic)
  o Determine the sampling distribution
    ▪ Easy if samples were selected from the normal distribution
    ▪ Otherwise, deriving the sampling distribution can be challenging
❑ Central Limit Theorem:
  o provided the samples were drawn from a population with finite variance,
  o and the sample size is large [rule of thumb: $n \ge 30$],
  ⇒ the approximate distribution of the sample mean is a normal distribution (see the simulation sketch below).
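A minimal sketch (not from the slides) illustrating the CLT with a decidedly non-normal (exponential) population; the standardized mean $Z_n$ looks more and more normal as $n$ grows:

```python
import numpy as np

rng = np.random.default_rng(3)

def clt_demo(n, n_samples=10_000):
    # Non-normal population: Exponential(1), for which E[X] = V[X] = 1
    samples = rng.exponential(scale=1.0, size=(n_samples, n))
    means = samples.mean(axis=1)
    # Standardize: Z_n = (X-bar - E[X]) / sqrt(V[X]/n)
    return (means - 1.0) / np.sqrt(1.0 / n)

for n in (2, 30, 500):
    z = clt_demo(n)
    # As n grows, skewness shrinks toward 0 and the histogram of Z_n
    # approaches the standard normal shape.
    skew = ((z - z.mean()) ** 3).mean() / z.std() ** 3
    print(f"n={n:4d}  mean={z.mean():+.3f}  sd={z.std():.3f}  skew={skew:+.3f}")
```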
Central Limit Theorem
Example: N=1,000,000, $\mu = 200$, $\sigma = 25$, Normal

n=20, 1,000 samples: mean of the sampling distribution: 200.001; standard error: 5.698
n=500, 1,000 samples: mean of the sampling distribution: 199.994; standard error: 1.095
n=1000, 1,000 samples: mean of the sampling distribution: 199.9859; standard error: 0.785
Summarizing the Data
Statistic
Data summaries in terms of statistics!
• Use the information in the sample $X = (X_1, X_2, \ldots, X_n)'$ to make inferences about an unknown parameter $\theta$.
• To get the information in the sample, determine a few key features of the sample values/summaries ⇒ statistics!
• $t = t(X)$ defines a data summary
  o Using only the observed value of $t(X)$, one will treat as "equal" or "similar" two samples, say $X$ and $Y$, that satisfy $t(X) = t(Y)$, even though the actual samples may differ in some ways.
• Example: observing the number of hours spent on social media
  ▪ Define $X_i$ for the $i$th person.
  ▪ Assume the $X_i$'s are independent and identically distributed, i.e., $X_i \sim F_X$ ⇒ a sample.
  ▪ Consider: the mean $\bar{X}$.
What Statistic to Consider?

• There are too many statistics that could be considered.
• Summarize, but do not discard important information about the unknown parameter $\theta$ (the characteristic of the population of interest):
  o Sufficiency
  o Likelihood
What Statistic to Consider?
Sufficiency
• A sufficient statistic for a parameter $\theta$ captures all the information about $\theta$ contained in the sample $x$.
  o A statistic $t(X)$ is a sufficient statistic for $\theta$ if the conditional distribution of the sample $X$ given the value of $t(X)$ does not depend on $\theta$.
  o Let $f(x; \theta)$ be the density of $X = x$ and $q(t; \theta)$ be the density of $t = t(X)$. Then $t$ is a sufficient statistic for $\theta$ if, for every $x \in \mathcal{X}$ (the collection of all possible values of $X$), the ratio $f(x; \theta)/q(t; \theta)$ does not depend on $\theta$.
  o The sum $\sum_{i=1}^{n} x_i$ is often a sufficient statistic for parameters of most exponential families.

❑ The sufficiency principle: if a statistic is sufficient for $\theta$, then any inference about $\theta$ should depend on the sample $x$ only through the value of $t(x)$. A worked example follows.
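A standard worked example (not from the slides), written out in LaTeX, showing via the density-ratio criterion that the sum is sufficient for a Bernoulli proportion:

```latex
% Sufficiency of the sum for iid Bernoulli data: for x_1,...,x_n iid
% Bernoulli(theta) and t = sum_i x_i, with T ~ Binomial(n, theta),
\[
\frac{f(x;\theta)}{q(t;\theta)}
= \frac{\theta^{t}(1-\theta)^{n-t}}{\binom{n}{t}\theta^{t}(1-\theta)^{n-t}}
= \frac{1}{\binom{n}{t}},
\]
% which does not depend on theta, so t(x) = sum_i x_i is sufficient.
```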
What Statistic to Consider?
Minimality
❑ There are many sufficient statistics; which one should we choose?
❑ Some information about a single parameter cannot be summarized in one statistic alone.
❑ Example: $x \sim N(\mu, \sigma^2)$. How do we summarize $x_1, x_2, \ldots, x_n$ so that no information about $\mu$ and $\sigma^2$ is lost?
❑ Summarize without loss of information:
  o The greatest data reduction that still retains all the information is preferred
    ▪ minimal sufficient statistics!

$\bar{x}$ and $S^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}$ are jointly minimal sufficient for $\mu$ and $\sigma^2$!
What Statistic to Consider?
Completeness

• A statistic $t$ for $\theta$ is complete if, for every function $g$ for which $E[g(t)] = 0$ for all possible values of $\theta$, it follows that $P(g(t) = 0) = 1$ for all possible values of $\theta$.

• $t$ is complete if and only if the only estimator of zero that is a function of $t$ and has zero mean is a statistic that is identically zero with probability 1 (a statistic degenerate at the point 0).
Frequentist Inference
Frequentist Inference
Classical Inference

• Assumptions
  o The observed data $x = (x_1, x_2, \ldots, x_n)$ came from a probability model (the distribution of a random variable defined on a population of interest).
  o $X = (X_1, X_2, \ldots, X_n)$ comprises $n$ independent draws from a population with probability distribution $F$, i.e., $F \to X$.
• Inference
  o What properties of $F$ can be inferred from the data $x$?
  o Example: a popular property of $F$ is the expectation of a single draw of $X$ from $F$:
    $\theta = E_F(X)$
  o Note that with $\hat{\theta} = \bar{x}$ and large $n$, the CLT gives $\hat{\theta} \approx \theta$.
Frequentist Inference
Classical Inference
• Algorithm
  o Calculate $\hat{\theta}$ from some known algorithm, e.g.,
    $\hat{\theta} = t(x) = \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$,
    a realization of $\hat{\Theta} = t(X)$, i.e., $t(\cdot)$ applied to the theoretical sample $X$.
❑ Frequentist inference: the accuracy of an observed estimate $\hat{\theta} = t(x)$ is the probabilistic accuracy of $\hat{\Theta} = t(X)$ as an estimator of $\theta$.
❑ $\hat{\theta}$ is a single value from the range of values of $\hat{\Theta}$.
❑ The spread of the values of $\hat{\Theta}$ defines measures of accuracy.
❑ Suppose $\mu = E_F(\hat{\Theta})$. The accuracy of $\hat{\theta}$ is measured by

$$\text{Bias} = \mu - \theta; \qquad \text{Variance} = E_F\!\left[(\hat{\Theta} - \mu)^2\right]$$
Frequentist Inference
Classical Inference

• Frequentism: an infinite sequence of future trials.
• Hypothetical data sets $X^{(1)}, X^{(2)}, X^{(3)}, \ldots$ generated by the same mechanism as $x$ yield $\hat{\Theta}^{(1)}, \hat{\Theta}^{(2)}, \hat{\Theta}^{(3)}, \ldots$
• Frequentist principle: attribute to $\hat{\theta}$ the accuracy properties of the $\hat{\Theta}$ values. E.g., if the $\hat{\Theta}$'s have an empirical variance of 0.04, then the standard error is $\sqrt{0.04} = 0.2$.
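A minimal simulation sketch of the frequentist principle (the normal mechanism and sample size here are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical data sets X^(1), X^(2), ... generated by the same
# mechanism assumed for x (here Normal(200, 25), as in the slides).
n, n_datasets = 20, 10_000
datasets = rng.normal(loc=200, scale=25, size=(n_datasets, n))

# Apply the same algorithm t(.) = sample mean to every data set
theta_hats = datasets.mean(axis=1)

# Frequentist principle: the spread of the Theta-hat values measures
# the accuracy of the single observed theta-hat.
print("empirical variance:", theta_hats.var(ddof=1))
print("standard error:    ", theta_hats.std(ddof=1))  # ~ 25/sqrt(20)
```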
Frequentist Inference
Example: Normal, $\mu = 200$, sd = 25

• Hypothetical data sets $X^{(1)}, X^{(2)}, X^{(3)}, \ldots$ generated by the same mechanism as $x$.

[Figure: hypothetical data sets $X^{(1)}, \ldots, X^{(5)}$ and the corresponding estimates $\hat{\Theta}^{(1)}, \ldots, \hat{\Theta}^{(10)}$]

• In practice, we only have one such $\hat{\Theta}^{(j)}$ ⇒ $\hat{\theta} = t(x)$.
Frequentist Inference
Example: Normal, $\mu = 200$, sd = 25

• Frequentist principle:
  • $\theta = E_F(X)$
  • Suppose $\hat{\theta} = \bar{x}$.
  The accuracy of $\bar{x}$ in characterizing $\theta = E_F(X)$ is based on the sampling distribution of $\bar{x}$.

Mean of the sampling distribution: 199.81
Standard error (SD of the sampling distribution): 3.342
Frequentist Inference
• Criticism: it requires calculating properties of estimators $\hat{\Theta} = t(X)$ obtained from the true distribution $F$ (but $F$ is unknown!).
• Some devices to circumvent this defect (a bootstrap sketch follows):
  o The plug-in principle
  o Taylor series approximations
  o Parametric families and maximum likelihood theory
  o Simulation and the bootstrap
  o Pivotal statistics
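As one illustration of these devices, a minimal bootstrap sketch (not from the slides): since $F$ is unknown, resample from the observed data itself, in line with the plug-in principle:

```python
import numpy as np

rng = np.random.default_rng(5)

# The observed data x: the ten sales-per-hour values used earlier
x = np.array([5.70, 99.23, 104.38, 92.46, 95.95,
              103.18, 93.30, 0.70, 93.81, 101.16])

# Bootstrap: draw B resamples of size n from the data with replacement
# (i.e., sample from the ECDF in place of the unknown F) and recompute
# the statistic on each resample.
B = 10_000
boot_means = np.array([rng.choice(x, size=len(x), replace=True).mean()
                       for _ in range(B)])

print("observed theta-hat:      ", x.mean())
print("bootstrap standard error:", boot_means.std(ddof=1))
```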
Bayesian Inference
Recall: Conditional Probability & Bayes' Theorem
Population 15 Years and Over

Age          Employed   Unemployed   Not in the Labor Force    Total   Percent
15 - 24         5,481        1,323                   13,248   20,052     26.99
25 - 34        10,989        1,252                    4,462   16,704     22.48
35 - 44         9,596          575                    3,046   13,216     17.79
45 - 54         7,634          385                    2,552   10,571     14.23
55 - 64         4,450          236                    2,898    7,584     10.21
65 and over     1,686           42                    4,447    6,176      8.31
Total          39,837        3,813                   30,653   74,303    100.00
Percent         53.61         5.13                    41.25   100.00

$$P(\text{Employed}) = \frac{39837}{74303} = 0.5361$$

$$P(\text{Employed} \mid \text{25 \& over}) = \frac{10989 + 9596 + 7634 + 4450 + 1686}{16704 + 13216 + 10571 + 7584 + 6176} = \frac{34355}{54251} = 0.6333$$

Observe what happens to the probability when additional information about the population is available.
Recall: Conditional Probability & Bayes' Theorem

[Figure: Venn diagram of the population of all individuals 15 years and over, with overlapping sets of employed individuals and individuals 25 years and over]
Recall: Conditional Probability & Bayes' Theorem
❑ Definition of conditional probability:

$$P(AB) = P(A \mid B)\,P(B) = P(B \mid A)\,P(A)$$

❑ Bayes' rule:

$$P(A_j \mid B) = \frac{P(B \mid A_j)\,P(A_j)}{P(B)}$$

❑ Bayes' theorem:

$$P(A_j \mid B) = \frac{P(B \mid A_j)\,P(A_j)}{\sum_i P(B \mid A_i)\,P(A_i)} = \frac{P(B \mid A_j)\,P(A_j)}{P(B)}$$
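A small numerical check (not from the slides) applying Bayes' rule to the employment table above to reverse the conditioning:

```python
# Counts taken from the employment table above
employed_total = 39_837
employed_25_over = 10_989 + 9_596 + 7_634 + 4_450 + 1_686  # 34,355
pop_total = 74_303
pop_25_over = 16_704 + 13_216 + 10_571 + 7_584 + 6_176     # 54,251

p_emp = employed_total / pop_total               # P(Employed) = 0.5361
p_25 = pop_25_over / pop_total                   # P(25 & over)
p_emp_given_25 = employed_25_over / pop_25_over  # P(Emp | 25 & over) = 0.6333

# Bayes' rule: P(25 & over | Emp) = P(Emp | 25 & over) P(25 & over) / P(Emp)
p_25_given_emp = p_emp_given_25 * p_25 / p_emp
print(round(p_25_given_emp, 4))  # equals 34,355 / 39,837 = 0.8624
```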
Fundamental Unit of Statistical Inference
This applies to both frequentist and Bayesian inference.

❑ Family of probability densities (population, distribution):

$$F_X = \{ f(x; \mu) : x \in \mathcal{X}, \ \mu \in \Omega \}$$

$x$ – observed data; $\mathcal{X}$ – sample space; $\Omega$ – parameter space

❑ Inference process: observe $x$ from $F_X$ or $f(x; \mu)$, then infer on $\mu$.
Bayesian Inference
Uniquely Bayesian

❑ Knowledge of a prior density: $g(\mu)$, $\mu \in \Omega$
❑ Inference process: observe $x$ from $F_X$ or $f(x; \mu)$, then infer on $\mu$, but delimit the range of $\mu$ values to be entertained (within the prior density).
❑ Next step: update the prior given the new data in $x$:
  $g(\mu \mid x)$ – the posterior density of $\mu$ (see the conjugate-update sketch below).
❑ In Bayes' rule: $x$ is fixed at its observed value (no longer a random variable) while $\mu$ varies over $\Omega$ (a contradiction to the frequentist point of view).
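A minimal sketch (not from the slides) of one concrete prior-to-posterior update, the conjugate beta-binomial model; the prior parameters and data below are hypothetical:

```python
# Conjugate update for a Bernoulli proportion mu:
#   prior:     mu ~ Beta(a, b)
#   data:      k successes in n trials
#   posterior: mu | x ~ Beta(a + k, b + n - k)
def beta_binomial_update(a, b, k, n):
    """Return the posterior Beta parameters after observing the data."""
    return a + k, b + (n - k)

a, b = 1.0, 1.0   # uniform prior on mu (hypothetical choice)
k, n = 7, 10      # hypothetical data: 7 successes in 10 trials
a_post, b_post = beta_binomial_update(a, b, k, n)
print("posterior:", (a_post, b_post))
print("posterior mean:", a_post / (a_post + b_post))  # 8/12 = 0.667

# When data arrives sequentially, just update again: the posterior
# becomes the prior for the next batch, which is part of the appeal
# of the Bayesian approach in dynamic contexts.
```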
Comparing Frequentist and Bayesian Inference
❑ Bayesian inference requires a prior distribution $g(\mu)$.
❑ Frequentism replaces the choice of a prior with the choice of a method, or algorithm, $t(x)$.
❑ Modern data-analysis problems are often viewed in terms of popular methodology; this plays into the methodological orientation of frequentism, which is more flexible than the Bayesian approach in dealing with specific algorithms.

❑ Having chosen $g(\mu)$, only a single probability distribution is in play for Bayesians. Frequentists, by contrast, must struggle to balance the behavior of $t(x)$ over a family of possible (unknown) distributions.
Comparing Frequentist and Bayesian Inference

❑ The simplicity argument cuts both ways. Bayesians depend on the choice of prior being correct, or at least not harmful. Frequentism takes a more defensive posture, hoping to do well, or at least not poorly.
❑ Bayesian analysis answers all possible questions at once. Frequentism focuses on the problem at hand, requiring different estimators for different questions.
Comparing Frequentist and Bayesian Inference

❑ The simplicity of the Bayesian approach is especially appealing in dynamic contexts, where data arrives sequentially and updating one's beliefs is a natural practice.
❑ In the absence of genuine prior information, a whiff of subjectivity
hangs over Bayesian results, even those based on uninformative
priors. Classical frequentism claimed for itself the high ground of
scientific objectivity.
Frequentist vs Bayesian
The Bayesian approach is opposed to the frequentist one, or at least orthogonal to it!

❑ Computer-age statistical inference at its most successful combines elements of the two philosophies.
❑ Bayesian analysis reveals some worrisome flaws of frequentism, while it is itself exposed to criticism of dangerous overuse.
❑ Challenge: it is crucial to combine the virtues of the two philosophies in an era of massively complicated data sets.
❑ While some new schools of thought are being infused into the frequentist philosophy, the Bayesian school is also moving along a similar trajectory.
❑ Bottom line: we need to address the evolving nature of the data as it is compiled, stored and curated [from the source to the analysts].
Parametric vs. Nonparametric
Statistical Models
❑ Represent a phenomenon with a model:

INPUT → NATURE → OUTPUT or RESPONSE

❑ Data $x$ is collected (observation, experimentation, compilation).
❑ Goals of data analysis:
  • Understanding: extract information on how NATURE links the RESPONSE to the INPUT
  • Prediction: predict the RESPONSE to an INPUT (or a future INPUT)
Statistical Models
❑ Inference: replace the NATURE "black box" (i.e., the unknown mechanism that NATURE uses to associate the responses with the inputs) with a statistical model:

INPUT → NATURE → OUTPUT or RESPONSE

Parametric treatment: assumes the common distribution $F_X$ of $X$ is governed by some parameter (or vector of parameters)
  o The statistical model can be parametrized.
Parametric Models

❑ Assume a random sample from a population that is governed by a parameter.
❑ Inference is about the underlying parameter, using the data.
❑ In a parametric model, the common tasks (for inference) are:
  ❑ Point estimation
  ❑ Interval estimation
  ❑ Hypothesis testing
Parametric Models

❑ Point estimation: come up with a single-value guess or representation of the unknown parameter.
❑ Interval estimation: rather than a single value, an interval is constructed under a certain "level of confidence".
❑ Hypothesis testing: a conjecture about the unknown parameter is tested based on the collected data.
Inference for a Parameter

Example: Suppose $x_1, \ldots, x_n$ is a random sample from $N(\mu, \sigma^2)$.

Point estimation:
• With $\theta = (\mu, \sigma^2)$ unknown, find a point estimator for $\theta$.
• Given $\sigma^2$ known, what are the characteristics of using $\bar{x}$ as an estimator for $\mu$?
• Which of $T_1 = S^2 = \frac{\sum (X_i - \bar{X})^2}{n-1}$ and $T_2 = \frac{n-1}{n} S^2 = \frac{\sum (X_i - \bar{X})^2}{n}$ is the better estimator for $\sigma^2$?
Inference for a Parameter

Example: Suppose $x_1, \ldots, x_n$ is a random sample from $N(\mu, \sigma^2)$.

Hypothesis testing:
• With $\theta = (\mu, \sigma^2)$ unknown, test at the $\alpha$ level of significance
  $H_0: \mu = \mu_0$ vs $H_1: \mu \ne \mu_0$
• Derive tests for $H_0$ vs $H_1$ based on $\bar{x}$ (the sample mean) and on the sample median.
• It was observed that $\bar{X} = \bar{x}$; at $\alpha = 0.05$, will you reject $H_0$ or not reject $H_0$? (A one-sample test sketch follows.)
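A minimal sketch (not from the slides) of the two-sided one-sample t-test for this hypothesis; the data and $\mu_0$ below are hypothetical:

```python
import math
from scipy import stats

# Hypothetical data and null value mu0 for illustration
x = [99.2, 104.4, 92.5, 96.0, 103.2, 93.3, 93.8, 101.2]
mu0, alpha = 100.0, 0.05

n = len(x)
xbar = sum(x) / n
s = math.sqrt(sum((xi - xbar) ** 2 for xi in x) / (n - 1))  # sample SD

t_stat = (xbar - mu0) / (s / math.sqrt(n))       # test statistic
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 1)  # two-sided p-value

print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```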
Nonparametric
• Nonparametric: no assumptions are made about the form of the data (distribution or density function), i.e., the model structure is not specified, so we estimate the form of the model.
• Not necessarily zero parameters; rather, the nature of the parameters is flexible and not fixed in advance.
• Semiparametric: a hybrid of parametric and nonparametric approaches, i.e., the model has parametric and nonparametric components.
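A minimal nonparametric sketch (not from the slides): a Gaussian kernel density estimator, which estimates the density's form directly from the data rather than assuming a parametric family. The bandwidth choice here is arbitrary:

```python
import numpy as np

rng = np.random.default_rng(6)

def kde(data, grid, bandwidth):
    """Gaussian kernel density estimate of the data, evaluated on grid."""
    data = np.asarray(data)[:, None]         # shape (n, 1)
    grid = np.asarray(grid)[None, :]         # shape (1, m)
    z = (grid - data) / bandwidth
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    return kernels.mean(axis=0) / bandwidth  # averaged kernel per grid point

sample = rng.normal(loc=100, scale=5, size=200)
grid = np.linspace(80, 120, 9)
print(np.round(kde(sample, grid, bandwidth=2.0), 4))
```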
MONASH BUSINESS

Thank you.

Disclaimer: All logos, images and text used in this presentation are the property of their respective copyright owners and are used here for educational purposes.
