Lec 28: Stratified Sampling
Email: [email protected]
URL: https://www.zabaras.com/
Following closely:
C. Robert, G. Casella, Monte Carlo Statistical Methods (Ch. 1, 2, 3.1 & 3.2) (google books, slides, video)
J. S. Liu, MC Strategies in Scientific Computing (Chapters 1 & 2)
J-M Marin and C. P. Robert, Bayesian Core (Chapter 2)
A. Doucet, Statistical Computing & Monte Carlo Methods (course notes, 2007)
Statistical Computing and Machine Learning, Fall 2020, N. Zabaras 2
Conditional Monte Carlo
Let
$$\ell = \mathbb{E}[H(\boldsymbol{X})] = \int H(\boldsymbol{x})\, p(\boldsymbol{x})\, d\boldsymbol{x}$$
be some expected performance measure of a computer simulation model, where $\boldsymbol{X}$ is the input random variable (vector) with pdf $p(\boldsymbol{x})$ and $H(\boldsymbol{X})$ is the sample performance measure (the output random variable).
Suppose that there is a random variable (or vector), 𝒀 ~ 𝑔(𝒚), such that the conditional expectation
𝔼[𝐻(𝑿) | 𝒀 = 𝒚 ] can be computed analytically.
Since $\ell = \mathbb{E}[H(\boldsymbol{X})] = \mathbb{E}\big[\mathbb{E}[H(\boldsymbol{X}) \mid \boldsymbol{Y}]\big]$, it follows that $\mathbb{E}[H(\boldsymbol{X}) \mid \boldsymbol{Y}]$ is an unbiased estimator of $\ell$.
Furthermore, it is readily seen that
$$\mathrm{Var}\big(\mathbb{E}[H(\boldsymbol{X}) \mid \boldsymbol{Y}]\big) \le \mathrm{Var}\big(H(\boldsymbol{X})\big).$$
Thus using the random variable 𝔼[𝑯(𝑿) | 𝒀 ] instead of 𝑯(𝑿), leads to variance reduction.
The inequality above is derived from the variance decomposition
$$\mathrm{Var}(U) = \mathbb{E}\big[\mathrm{Var}(U \mid V)\big] + \mathrm{Var}\big(\mathbb{E}[U \mid V]\big),$$
which holds for any pair of random variables $(U, V)$.
The Algorithm requires that a random variable 𝒀 be found, such that 𝔼[𝐻(𝑿) |𝒀 = 𝒚] is
known analytically for all 𝒚. Moreover, for the Algorithm to be of practical use, the following
conditions must be met:
(a) 𝒀 should be easy to generate.
(b) 𝔼[𝐻(𝑿)|𝒀 = 𝒚] should be readily computable for all values 𝒚.
(c) 𝔼[𝑉𝑎𝑟(𝐻(𝑿)|𝒀)] should be large relative to 𝑉𝑎𝑟(𝔼[𝐻(𝑿)|𝒀 ]).
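As a minimal sketch of these conditions in action, consider a hypothetical model chosen only so the conditional expectation is available in closed form: $Y \sim N(0,1)$, $X \mid Y = y \sim N(y, 1)$ and $H(X) = X^2$, so that $\mathbb{E}[H(X) \mid Y] = Y^2 + 1$ and $\ell = \mathbb{E}[X^2] = 2$.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000

# Hypothetical model: Y ~ N(0,1), X | Y=y ~ N(y,1), H(X) = X^2,
# so E[H(X) | Y] = Y^2 + 1 analytically and l = E[X^2] = 2.
Y = rng.standard_normal(N)
X = Y + rng.standard_normal(N)

crude = X**2              # crude Monte Carlo samples of H(X)
conditioned = Y**2 + 1.0  # conditional (Rao-Blackwellized) samples E[H(X)|Y]

print("crude estimate      :", crude.mean(), " sample variance:", crude.var())
print("conditional estimate:", conditioned.mean(), " sample variance:", conditioned.var())
```

Both sample means estimate $\ell = 2$ unbiasedly, but the conditional samples have variance $\mathrm{Var}(Y^2 + 1) = 2$ versus $\mathrm{Var}(X^2) = 8$ for the crude ones.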
Example (random sums). Let $S_R = \sum_{i=1}^{R} X_i$, where $R$ is a random variable with a given distribution and the $\{X_i\}$ are i.i.d. with $X_i \sim F$ and independent of $R$. Let $F_r$ be the cdf of the random variable $S_r$ for fixed $R = r$, and suppose we wish to estimate $\ell = \Pr(S_R \le x)$ for a fixed $x$.
Noting that, conditional on $X_2, \dots, X_r$,
$$F_r(x) = \Pr\Big(\sum_{i=1}^{r} X_i \le x\Big) = \mathbb{E}\Big[F\Big(x - \sum_{i=2}^{r} X_i\Big)\Big],$$
We obtain
$$\ell = \Pr(S_R \le x) = \mathbb{E}\Big[\Pr\big(S_R \le x \mid R, X_2, \dots, X_R\big)\Big] = \mathbb{E}\Big[F\Big(x - \sum_{i=2}^{R} X_i\Big)\Big].$$
As an estimator of $\ell$ based on conditioning, we can take
$$\hat{\ell}_c = \frac{1}{N} \sum_{k=1}^{N} F\Big(x - \sum_{i=2}^{R_k} X_{ki}\Big),$$
where $(R_k;\, X_{k2}, \dots, X_{k R_k})$, $k = 1, \dots, N$, are independent replicates of $(R;\, X_2, \dots, X_R)$.
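A small sketch of $\hat{\ell}_c$ under assumed (hypothetical) choices: $R$ uniform on $\{1,2,3\}$ and $X_i \sim \mathrm{Exp}(1)$, so $F(t) = 1 - e^{-t}$ for $t \ge 0$. The conditional estimator is compared against the crude indicator estimator on the same draws:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 20_000
x = 3.0

def F(t):
    """cdf of Exp(1): F(t) = 1 - exp(-t) for t >= 0, else 0 (assumed law of X_i)."""
    return np.where(t > 0, 1.0 - np.exp(-np.maximum(t, 0.0)), 0.0)

crude = np.empty(N)
cond = np.empty(N)
for k in range(N):
    r = rng.integers(1, 4)             # R uniform on {1, 2, 3} (assumption)
    xs = rng.exponential(1.0, size=r)  # X_1, ..., X_R
    crude[k] = float(xs.sum() <= x)    # crude sample: indicator 1{S_R <= x}
    cond[k] = F(x - xs[1:].sum())      # conditional sample: F(x - sum_{i>=2} X_i)

print("crude      :", crude.mean(), " variance:", crude.var())
print("conditional:", cond.mean(), " variance:", cond.var())
```

Both estimators are unbiased for $\Pr(S_R \le x)$; the conditional one averages a smooth function of $X_2, \dots, X_R$ instead of an indicator, which lowers the variance.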
Stratified Sampling
Suppose the conditioning variable $Y$ is a stratum indicator taking the values $i = 1, \dots, m$ with known probabilities $p_i = \Pr(Y = i)$. Drawing $N_i$ samples $\boldsymbol{X}_{i1}, \dots, \boldsymbol{X}_{iN_i}$ within each stratum gives the stratified estimator
$$\hat{\ell}_s = \sum_{i=1}^{m} p_i \frac{1}{N_i} \sum_{j=1}^{N_i} H(\boldsymbol{X}_{ij}), \qquad \mathrm{Var}(\hat{\ell}_s) = \sum_{i=1}^{m} \frac{p_i^2 \sigma_i^2}{N_i},$$
where $\sigma_i^2 = \mathrm{Var}\big(H(\boldsymbol{X}) \mid Y = i\big)$.
How the strata should be chosen depends very much on the problem at hand. For a particular choice of the strata, however, the sample sizes $\{N_i\}$ can be chosen optimally.
The optimal-allocation result asserts that the minimal variance of $\hat{\ell}_s$ is attained for sample sizes $N_i$ proportional to $p_i \sigma_i$, namely
$$N_i^* = N \frac{p_i \sigma_i}{\sum_{j=1}^{m} p_j \sigma_j}.$$
Although the 𝑝𝑖’s are assumed to be known, the {𝜎𝑖 } are usually unknown. In practice, one
would estimate the {𝜎𝑖 } from "pilot" runs and then proceed to estimate the optimal sample
sizes, 𝑁𝑖∗ , from the equation above.
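A sketch of this two-stage procedure on a hypothetical three-stratum model (the $p_i$ are known; the within-stratum Normal laws and their spreads are invented for the illustration and would be unknown in practice):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical strata: Pr(Y = i) = p[i]; within stratum i,
# H(X) | Y = i ~ N(means[i], sigmas[i]^2).  The sigmas are treated as unknown.
p = np.array([0.5, 0.3, 0.2])
means = np.array([0.0, 1.0, 5.0])
sigmas = np.array([0.1, 1.0, 4.0])

def sample_stratum(i, n):
    """Draw n samples of H(X) conditional on Y = i."""
    return rng.normal(means[i], sigmas[i], size=n)

# Stage 1: pilot runs to estimate the unknown sigma_i.
n_pilot = 100
sigma_hat = np.array([sample_stratum(i, n_pilot).std(ddof=1) for i in range(3)])

# Stage 2: allocate N_i* proportional to p_i * sigma_hat_i (optimal allocation).
N = 10_000
N_i = np.maximum(1, np.round(N * p * sigma_hat / (p * sigma_hat).sum())).astype(int)

# Stratified estimate: sum_i p_i * (within-stratum sample mean).
ell_s = sum(p[i] * sample_stratum(i, N_i[i]).mean() for i in range(3))
print("allocations N_i*:", N_i)
print("stratified estimate:", ell_s, " (true value: 1.3)")
```

Note that the high-variance stratum ($i = 2$) receives most of the sampling budget even though its probability $p_2 = 0.2$ is the smallest.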
A simple stratification procedure, which can achieve variance reduction without requiring prior
knowledge of 𝜎𝑖2 and 𝐻(𝑿), is presented next.
Proportional allocation, $N_i = p_i N$, already guarantees
$$\mathrm{Var}(\hat{\ell}_s) \le \mathrm{Var}(\hat{\ell}).$$
Substituting $N_i = p_i N$ in $\mathrm{Var}(\hat{\ell}_s) = \sum_{i=1}^{m} \frac{p_i^2 \sigma_i^2}{N_i}$ yields
$$\mathrm{Var}(\hat{\ell}_s) = \frac{1}{N} \sum_{i=1}^{m} p_i \sigma_i^2.$$
The result now follows from the variance decomposition:
$$\mathrm{Var}(\hat{\ell}) = \frac{\mathrm{Var}\big(H(\boldsymbol{X})\big)}{N} = \frac{1}{N} \Big( \sum_{i=1}^{m} p_i \sigma_i^2 + \mathrm{Var}\big(\mathbb{E}[H(\boldsymbol{X}) \mid Y]\big) \Big) \ge \frac{1}{N} \sum_{i=1}^{m} p_i \sigma_i^2 = \mathrm{Var}(\hat{\ell}_s).$$
In the special case of equal weights ($p_i = 1/m$ and $N_i = N/m$), the estimator
$$\hat{\ell}_s = \sum_{i=1}^{m} p_i \frac{1}{N_i} \sum_{j=1}^{N_i} H(\boldsymbol{X}_{ij})$$
reduces to
$$\hat{\ell}_s = \frac{1}{N} \sum_{i=1}^{m} \sum_{j=1}^{N/m} H(\boldsymbol{X}_{ij}).$$
More generally, partition the domain as $D = \bigcup_{m=1}^{M} D_m$ with the $D_m$ disjoint, and define
$$Z_m = \Pr(\boldsymbol{x} \in D_m) = \int_{D_m} p(\boldsymbol{x})\, d\boldsymbol{x}, \qquad p_m(\boldsymbol{x}) = \frac{p(\boldsymbol{x})\, \mathbb{1}_{D_m}(\boldsymbol{x})}{Z_m} \ \ (\text{conditional pdf}).$$
The stratified sampling estimator is then
$$\hat{\ell}_s = \sum_{m=1}^{M} Z_m \hat{\ell}_m = \sum_{m=1}^{M} Z_m \frac{1}{N_m} \sum_{i=1}^{N_m} H\big(\boldsymbol{x}_i^{(m)}\big), \qquad \boldsymbol{x}_i^{(m)} \sim p_m(\boldsymbol{x}).$$
If (taking $Z_m = 1/M$ and $N_m = N/M$)
$$\frac{1}{M} \sum_{m=1}^{M} \mathrm{Var}_{D_m}\big[H(\boldsymbol{x})\big] < \mathrm{Var}\big[H(\boldsymbol{x})\big]$$
(e.g. $H(\boldsymbol{x})$ is relatively homogeneous within each $D_m$), then we do achieve variance reduction, i.e. $\mathrm{Var}(\hat{\ell}_s) < \mathrm{Var}(\hat{\ell})$.
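A minimal sketch of this partition-based estimator for $p(x)$ uniform on $[0,1]$ with $M$ equal strata $D_m = [(m-1)/M,\, m/M]$, so $Z_m = 1/M$ and $p_m$ is uniform on $D_m$; the test integrand $H(x) = x^2$ (with $\ell = 1/3$) is an arbitrary assumption:

```python
import numpy as np

rng = np.random.default_rng(3)

H = lambda x: x**2   # test integrand (assumption); exact value of l is 1/3
M, N = 10, 10_000
N_m = N // M         # equal allocation N_m = N / M

# l_s = sum_m Z_m * (1/N_m) * sum_i H(x_i^(m)),  with x_i^(m) ~ p_m
ell_s = 0.0
for m in range(M):
    xm = rng.uniform(m / M, (m + 1) / M, size=N_m)   # samples from stratum D_m
    ell_s += (1.0 / M) * H(xm).mean()                # Z_m = 1/M

# Crude Monte Carlo with the same total budget, for comparison.
crude = H(rng.uniform(0.0, 1.0, size=N)).mean()
print("stratified:", ell_s, " crude:", crude, " exact: 1/3")
```

Because $x^2$ varies little within each narrow stratum, the within-stratum variances are small and the stratified estimate is markedly tighter than the crude one.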
Stratified Sampling: Variance Reduction
Consider the following case:
$$p(x) = 1, \quad x \in [0,1], \qquad H(x) = \begin{cases} 1/k, & 0 \le x < 1/2 \\ k, & 1/2 \le x \le 1. \end{cases}$$
It can easily be shown that
$$\mathrm{Var}_{[0,1]}\big[H(x)\big] = \frac{(k^2 - 1)^2}{4 k^2} \to \infty \quad \text{as } k \to \infty,$$
while
$$\mathrm{Var}_{[0,1/2]}\big[H(x)\big] = \mathrm{Var}_{[1/2,1]}\big[H(x)\big] = 0.$$
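A quick numerical check of these two variances (the draws merely confirm the closed-form values above, here with $k = 10$):

```python
import numpy as np

rng = np.random.default_rng(4)
k = 10.0
H = lambda x: np.where(x < 0.5, 1.0 / k, k)

# Crude sampling over [0,1]: variance should match (k^2 - 1)^2 / (4 k^2) = 24.5025.
x = rng.uniform(0.0, 1.0, size=200_000)
print("empirical Var[0,1]  :", H(x).var())
print("(k^2-1)^2 / (4 k^2) :", (k**2 - 1) ** 2 / (4 * k**2))

# H is constant on each half, so the stratified estimator with Z_1 = Z_2 = 1/2
# returns (1/k + k)/2 exactly, with zero variance, even with one sample per stratum.
ell_s = 0.5 * H(rng.uniform(0.0, 0.5, 1)).mean() + 0.5 * H(rng.uniform(0.5, 1.0, 1)).mean()
print("stratified estimate :", ell_s, " exact:", (1.0 / k + k) / 2.0)
```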
Stratifying on $D_1 = [0, 1/2)$ and $D_2 = [1/2, 1]$ therefore yields a zero-variance estimator. In general, however, the only way to select the partition domains $D_m$ is by drawing samples of $\boldsymbol{x}$ and evaluating $H(\boldsymbol{x})$.