Chapter 3: Variance Reduction
1 Introduction
Variance reduction is the search for alternative and more accurate estimators
of a given quantity. The possibility of variance reduction is what separates
Monte Carlo from direct simulation. Simple variance reduction methods often
are remarkably effective and easy to implement. It is good to think about them
as you wait for a long Monte Carlo computation to finish. In some applications,
such as rare event simulation and quantum chemistry, they make practical what
would be impossible otherwise. Most advanced Monte Carlo is some kind of
variance reduction.
Among the many variance reduction techniques, which may be used in com-
bination, are control variates, partial integration, systematic sampling, and im-
portance sampling. The method of control variates is useful when a crude version
of the problem can be solved explicitly. This is often the case in simple prob-
lems (possibly the definition of “simple”) such as pricing problems in quantitative
finance, where the crude solvable version could be Black-Scholes. Partial inte-
gration, also called Rao-Blackwellization, lowers variance by replacing integrals
over some variables or over parts of space by their averages. Systematic sam-
pling methods range from the simplest, antithetic variates, to the slightly more
sophisticated stratified sampling, to quasi Monte Carlo integration. Importance
sampling has appeared already as sampling with a weight function. It also is the
basis of reweighting and score function strategies for sensitivity analysis. Meth-
ods for rare event sampling mostly use importance functions, often suggested
by the mathematical theory of large deviations.
2 Control variates
Suppose X is a random variable and that we want to evaluate
A = E[V (X)] .
We may estimate A by generating L independent samples of X and taking
\hat{A} = \frac{1}{L} \sum_{k=1}^{L} V(X_k) .    (1)
The error of this estimator is of order \sigma_V / \sqrt{L}, where \sigma_V^2 = var(V(X)). Thus, the
number of samples, and the run time, needed to achieve a given accuracy is
proportional to the variance.
A control variate is an easily evaluated random variable, W(X), so that
B = E[W(X)] is known. If W(X) is correlated with V(X), with covariance
C_{VW} = cov(V(X), W(X)), then for a suitable coefficient \alpha the random variable

Z = V(X) - \alpha \left( W(X) - B \right)    (2)

can have less variance than V(X). This will make the control variate estimator
\hat{A} = \frac{1}{L} \sum_{k=1}^{L} \left( V(X_k) - \alpha W(X_k) \right) + \alpha B    (3)
more accurate than the simple one (1), often dramatically so.
We choose α to minimize the variance of Z in (2). The variance is
\sigma_Z^2 = \sigma_V^2 - 2\alpha C_{VW} + \alpha^2 \sigma_W^2 .
The optimal α is
\alpha^* = \frac{C_{VW}}{\sigma_W^2} ,    (4)
and the corresponding optimal variance is
\sigma_Z^2 = \sigma_V^2 - \frac{C_{VW}^2}{\sigma_W^2} = \sigma_V^2 \left( 1 - \rho_{VW}^2 \right) ,    (5)
where \rho_{VW} = C_{VW} / (\sigma_V \sigma_W) is the correlation coefficient of V and W. In practice
C_{VW} and \sigma_W^2 may not be known, but we can estimate them from the same Monte
Carlo data, which gives

\hat{\alpha}^* = \frac{\hat{C}_{VW}}{\hat{\sigma}_W^2} ,    (6)

and the estimator

\hat{A} = \hat{A}^{(1)} - \hat{\alpha}^* \, \frac{1}{L} \sum_{k=1}^{L} \left( W_k - B \right) ,

where \hat{A}^{(1)} is the simple estimator (1) and W_k = W(X_k).
The estimate (6) may not be a very accurate estimate of (4), but the performance
does not depend strongly on \alpha when \alpha is close to \alpha^*, where the derivative of \sigma_Z^2
with respect to \alpha is zero.
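Here is a minimal Python sketch of the estimators (1), (3), and (6) on a toy problem chosen
only for illustration (estimate A = E[e^U] for U uniform on [0,1], with control variate W = U,
whose mean B = 1/2 is known); it is not one of the posted Matlab scripts, and the variable
names and sample size are arbitrary.

import numpy as np

rng = np.random.default_rng(0)
L = 100_000
U = rng.random(L)            # samples X_k (here X = U, uniform on [0, 1])
V = np.exp(U)                # V(X_k); the exact answer is A = e - 1
W = U                        # control variate W(X_k)
B = 0.5                      # known mean of W

A_simple = V.mean()                              # simple estimator (1)
alpha = np.cov(V, W)[0, 1] / W.var(ddof=1)       # estimated optimal coefficient (6)
A_cv = np.mean(V - alpha * W) + alpha * B        # control variate estimator (3)

print("simple estimate      ", A_simple)
print("control variate est. ", A_cv)
print("var(V)               ", V.var(ddof=1))
print("var(Z)               ", np.var(V - alpha * (W - B), ddof=1))

Because e^U and U are strongly correlated, var(Z) here is far smaller than var(V), so (3) needs
far fewer samples than (1) for the same accuracy.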
One can use more than one control variate. Given W1 (X), . . ., Wn (X), we
can form
Z = V(X) - \sum_{l=1}^{n} \alpha_l W_l(X) .    (7)
The optimal coefficients, the αl that minimize var(Z), are found by solving the
system of linear equations
cov(V, W_l) = \sum_{m=1}^{n} cov(W_l, W_m) \, \alpha_m , \qquad l = 1, \ldots, n .    (8)
Should the coefficients in (8) be unknown, we can estimate them from Monte
Carlo data as above.
Let V_S denote the control variate sum on the right of (7), so that V = Z + V_S.
The optimality condition for the coefficients \alpha_l is that V_S be uncorrelated with
Z. If this were not so, we could use W = V_S as an additional control variate and
further reduce the variance. Because Z and V_S are uncorrelated, var(V) = var(Z) + var(V_S).
In statisticians’ terminology, the total variance of V is the sum of the explained
part, var(V_S), and the unexplained part, var(Z).
Linear algebra has a geometrical way to express this. Given a random
variable, X, there is a vector space consisting of mean zero functions of X
with finite variance. If V(X) - A and W(X) - B are two such, their in-
ner product is \langle V - A, W - B \rangle = cov(V, W). The corresponding squared length is
\| V - A \|^2 = \langle V - A, V - A \rangle = var(V). In this vector space is the subspace, S,
spanned by the vectors W_l(X) - B_l. Minimizing var(Z) in (7) is the same as
finding the V_S \in S that minimizes \| (V - A) - V_S \|^2. This V_S is the element of S
closest to V - A. In this way we write V = Z + V_S with V_S perpendicular to Z.
Example: From the introduction. Let B \subset R^3 be the unit ball of points with
|x| \le 1. Suppose X and Y are independent and uniformly distributed in B and
try to evaluate

E\left[ \frac{e^{-\lambda|X-Y|}}{|X-Y|} \right] .
Since the functional V(X,Y) = e^{-\lambda|X-Y|} / |X-Y| depends on |X-Y|, we seek control
variates that have this dependence, the difficulty being finding functionals whose
expected value is known. One possibility is W_1(X,Y) = |X-Y|^2, with

E[W_1] = E\left[ |X|^2 \right] - 2 E\left[ \langle X, Y \rangle \right] + E\left[ |Y|^2 \right] .
The middle term on the right vanishes because X and Y are independent. The
other two each are equal to 3/5, so E[W_1] = 6/5. With \lambda = .2, the improvement
takes us from var(V) \approx .99 to var(Z) \approx .72, about 26% lower. Another possi-
bility is W_2 = |X-Y|^4, with E[W_2] = 6/7 + 6/5. Using these two control variates
together gives var(Z) \approx .58, an almost 50% reduction. The Matlab program
that does this, CV1.m, is posted.
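A minimal Python sketch of this computation (it is not the posted CV1.m, which is Matlab;
the ball sampler, seed, and sample size are choices made here) estimates the coefficients by
solving the sample version of (8):

import numpy as np

rng = np.random.default_rng(1)
lam, L = 0.2, 200_000

def sample_ball(n):
    # Uniform points in the unit ball: random direction times radius U^(1/3).
    z = rng.standard_normal((n, 3))
    z /= np.linalg.norm(z, axis=1, keepdims=True)
    return z * rng.random((n, 1)) ** (1.0 / 3.0)

X, Y = sample_ball(L), sample_ball(L)
r = np.linalg.norm(X - Y, axis=1)
V = np.exp(-lam * r) / r
W = np.column_stack((r**2, r**4))                # control variates W_1, W_2
B = np.array([6.0/5.0, 6.0/7.0 + 6.0/5.0])       # their known expected values

# Sample covariances, then solve the sample version of (8) for the coefficients.
C = np.cov(np.column_stack((V, W)), rowvar=False)
alpha = np.linalg.solve(C[1:, 1:], C[0, 1:])

Z = V - (W - B) @ alpha
print("estimate        ", Z.mean())
print("var(V), var(Z)  ", V.var(ddof=1), Z.var(ddof=1))

Up to sampling error, this should reproduce the variance reductions quoted above.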
This example shows a relatively modest variance reduction from two not very
insightful control variates. Variance reduction methods that seem impressive
in one dimensional examples may become less effective in higher dimensional
problems, as this relatively modest six dimensional problem illustrates.
3 Partial averaging
Partial averaging, or Rao-Blackwellization, reduces variance by averaging over
some of the variables or over part of the integration domain. For example, sup-
pose (X, Y) is a random variable with probability density f(x, y). Let V(X, Y)
be a random variable and

\tilde{V}(x) = E\left[ V(X,Y) \mid X = x \right] = \frac{ \int V(x,y) f(x,y)\, dy }{ \int f(x,y)\, dy } .    (9)
A simple inequality shows that except in the trivial case where V already was
independent of y,
var(\tilde{V}) < var(V) .    (10)

In fact, the reader can check that

var(V) = var(\tilde{V}) + E\left[ \left( V - \tilde{V} \right)^2 \right] .    (11)
The conclusion is that if a problem can be solved partially, if some of the integrals
(9) can be computed explicitly, the remaining problem is easier.
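As a quick numerical check of (10) and (11), here is a small Python sketch on a toy problem
of our own choosing (V(X,Y) = e^{X+Y} with X, Y independent and uniform on [0,1], for which
the partial average \tilde{V}(x) = E[V | X = x] = e^x (e - 1) is explicit):

import numpy as np

rng = np.random.default_rng(2)
L = 200_000
X, Y = rng.random(L), rng.random(L)

V = np.exp(X + Y)                 # V(X, Y)
Vt = np.exp(X) * (np.e - 1.0)     # partial average over Y, computed exactly

print("estimates of A      ", V.mean(), Vt.mean())   # both are unbiased
print("var(V)              ", V.var(ddof=1))
print("var(V tilde)        ", Vt.var(ddof=1))        # smaller, as in (10)
print("var(V tilde) + E[(V - V tilde)^2]",
      Vt.var(ddof=1) + np.mean((V - Vt) ** 2))       # close to var(V), as in (11)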
A more abstract and general version of the partial averaging method is
that if G is a sub \sigma-algebra and

\tilde{V} = E\left[ V \mid G \right] ,
then we again have (11) and the variance reduction property (10). Of course,
the method still depends on being able to evaluate Ve efficiently.
Subset averaging is another concrete realization of the partial averaging prin-
ciple. Suppose B is a subset (i.e., an event) and that E [V | B] is known. If
\tilde{V}(x) = \begin{cases} E[V \mid B] & \text{if } x \in B, \\ V(x) & \text{if } x \notin B, \end{cases}

then again var(\tilde{V}) < var(V) except in trivial situations. For example, we might
take B to be the largest set for which E [V | B] can be evaluated by symmetry.
Example. Consider just the Y integration in the previous example:

E_Y\left[ \frac{e^{-\lambda|X-Y|}}{|X-Y|} \right] = \frac{3}{4\pi} \int_{|y| \le 1} \frac{e^{-\lambda|x-y|}}{|x-y|}\, dy .

Over the ball B_x = \{ y : |x-y| < 1-|x| \} about x, the largest ball centered at x that
stays inside B, the conditional expectation can be computed explicitly:

E_Y\left[ V(x,Y) \mid B_x \right]
= u(x) = \frac{3}{\lambda^2} \, (1-|x|)^{-3} \left( 1 - e^{-\lambda(1-|x|)} \left( 1 + \lambda(1-|x|) \right) \right) .    (12)
Therefore

A = E_{(X,Y)}\left[ \frac{e^{-\lambda|X-Y|}}{|X-Y|} \right] = E_{(X,Y)}\left[ \tilde{V}(X,Y) \right] ,

where

\tilde{V}(X,Y) = \begin{cases} \dfrac{e^{-\lambda|X-Y|}}{|X-Y|} & \text{if } |X-Y| \ge 1-|X| , \\ u(X) & \text{if } |X-Y| < 1-|X| . \end{cases}
Computational experiments (Matlab script CV3.m posted) with \lambda = .2 show
that var(\tilde{V}) \approx .61. We may further reduce the variance using the earlier control
variates W_1 = |X-Y|^2 and W_2 = |X-Y|^4. Using only W_1 gives var(Z) \approx .35.
Using W_1 and W_2 together gives var(Z) \approx .24. Thus, the combined effects of
not very sophisticated partial averaging and two simple control variates reduce
the variance, and the work needed to achieve a given accuracy, by a factor of 4
(from .99 to .24).
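The following Python sketch (again not the posted CV3.m; the ball sampler and parameter
choices are made here for illustration) implements the partially averaged \tilde{V} using u(x)
from (12):

import numpy as np

rng = np.random.default_rng(3)
lam, L = 0.2, 200_000

def sample_ball(n):
    # Uniform points in the unit ball: random direction times radius U^(1/3).
    z = rng.standard_normal((n, 3))
    z /= np.linalg.norm(z, axis=1, keepdims=True)
    return z * rng.random((n, 1)) ** (1.0 / 3.0)

def u(xn):
    # Conditional expectation (12) over the ball of radius 1 - |x| about x.
    r = 1.0 - xn
    return 3.0 / (lam**2 * r**3) * (1.0 - np.exp(-lam * r) * (1.0 + lam * r))

X, Y = sample_ball(L), sample_ball(L)
d = np.linalg.norm(X - Y, axis=1)
xn = np.linalg.norm(X, axis=1)

V = np.exp(-lam * d) / d
Vt = np.where(d < 1.0 - xn, u(xn), V)   # subset-averaged V tilde; u(xn) is used only where d < 1 - |x|

print("estimates       ", V.mean(), Vt.mean())
print("var(V), var(Vt) ", V.var(ddof=1), Vt.var(ddof=1))

The control variates W_1 and W_2 can then be layered on top of \tilde{V} exactly as before.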
4 Importance sampling

Importance sampling evaluates an expectation with respect to a density f by
sampling a well chosen different distribution, g. To get an unbiased estimator, write
A = \int V(x) f(x)\, dx = \int V(x) \, \frac{f(x)}{g(x)} \, g(x)\, dx = E_g\left[ V(X) L(X) \right] .    (13)
The ratio L(x) = f(x)/g(x) may be called the likelihood ratio, the importance
function, the score function, or the Radon-Nikodym derivative. Of course we
must have g(x) > 0 whenever f (x) > 0. Estimators using (13) are unbiased for
any g. Our task is to find distributions g that we can sample so that
var_g[ V(X) L(X) ] is as small as possible.
A basic application is estimating the probability p = P(X \in B) = E_f[ I_B(X) ] of an
unlikely event B. Sampling f directly, the natural estimator is the fraction of hits,
\hat{p} = N/L, where N = \sum_{k=1}^{L} I_B(X_k) counts the samples that land in B. Since
var(\hat{p}) = p(1-p)/L \approx p/L when p is small,

\frac{\sigma(\hat{p})}{p} \approx \frac{1}{\sqrt{pL}} = \frac{1}{\sqrt{E[N]}} .    (15)
That is, the relative error is governed by the expected number of hits. If I want
10% accuracy in \hat{p}, I need to generate about 100 hits, which could mean very
many mostly fruitless trials.
2 This means I_B(x) = 1 if x \in B and I_B(x) = 0 if x \notin B.
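As a small Python illustration of (15), with an event chosen here for the purpose, consider
estimating p = P(X > 3) for X standard normal by direct sampling:

import numpy as np

rng = np.random.default_rng(4)
L = 100_000
X = rng.standard_normal(L)

N = np.sum(X > 3.0)        # number of hits
p_hat = N / L
print("p_hat                    ", p_hat)
print("hits N                   ", N)
print("predicted relative error ", 1.0 / np.sqrt(N))   # the estimate (15)

Here p is about 1.3e-3, so roughly 135 of the 10^5 samples hit and the predicted relative
error is close to 10%.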
The first goal of importance sampling is to make hits more likely. But this is
not enough. Often the conditional density within B, f_B(x) = f(x)/P(B)
for x \in B, is sharply peaked within B. Informally, we say that rare events
happen in predictable ways. If g is peaked in the wrong parts of B, the resulting
variance could be larger than for the naive estimator. To be a good importance
sampler, g samples must hit B often, and in roughly the same way f samples
that hit B do.
The mathematical study of rare events is the theory of large deviations3 . A
typical large deviations theorem says Pn (B) ∼ e−Rn , where n is some measure
of the problem size. The proof usually involves a change of measure of the kind
we have been discussing, a g that knows how the rare f samples that hit B do so.
4 This is a general way, called Watson’s lemma, to estimate integrals like this one.
The interested reader will be able to show that the error in this approximation
is O(n^{-3/2}), so that

P(S \ge b) = e^{-n b^2 / 2} \, \frac{1}{\sqrt{2\pi}\, b} \, \frac{1}{\sqrt{n}} + O\left( n^{-3/2} \right) .

This is of the general form (16) with R = b^2/2 and C = 1/(\sqrt{2\pi}\, b).
This formula illustrates the predictability of rare events. If B is the event
S \ge b, then most of the hits in B have S only slightly above b. In fact, P(S \ge
b + \epsilon \mid S \ge b) \sim e^{-\epsilon b n}. Moreover, we explore the mechanism for samples with S \ge b
by finding h_b(y), the conditional probability density of, say, Y_1, given that S \ge b.
In this Gaussian world, the density of Y_1 when S = b + \epsilon is Gaussian and
(after some thought) h_b \approx N(b + \epsilon, 1). Since \epsilon is small for large n, this gives
h_b(y) \to N(b, 1) as n \to \infty. This suggests that we can sample more effectively
by drawing the Y_k from N(b, 1) than from N(0, 1).
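A minimal Python sketch of this suggestion (the values of n, b, and the number of trials are
choices made here): estimate P(S \ge b), with S the mean of n standard normals, by drawing
the Y_k from N(b, 1) and weighting with the likelihood ratio (13), which for this Gaussian shift
is L(Y) = exp(-b \sum_k Y_k + n b^2/2):

import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(5)
n, b, M = 20, 1.0, 100_000                 # n summands, threshold b, M Monte Carlo trials

Y = rng.standard_normal((M, n)) + b        # Y_k drawn from N(b, 1) instead of N(0, 1)
S = Y.mean(axis=1)
LR = np.exp(-b * Y.sum(axis=1) + n * b * b / 2.0)    # likelihood ratio f(Y)/g(Y)
samples = (S >= b) * LR
est = samples.mean()                                  # importance sampling estimator (13)

exact = 0.5 * erfc(b * sqrt(n) / sqrt(2.0))           # P(N(0,1) >= b sqrt(n))
print("importance sampling estimate", est)
print("exact value                 ", exact)
print("relative standard error     ", samples.std(ddof=1) / (est * sqrt(M)))

Here p is of order 4e-6, so direct sampling of S would produce a hit only about once in several
hundred thousand trials, while roughly half of the shifted samples land in B.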
The proof of Cramer’s theorem for general h(y) builds on these observations.
The probability density for S is

\phi(s) = n \int \cdots \int h(y_1) \cdots h(y_n)\, \delta(y_1 + \cdots + y_n - ns)\, dy_1 \cdots dy_n .

The analysis uses the exponential moments

Z(\lambda) = \int e^{\lambda y} h(y)\, dy = E\left[ e^{\lambda Y} \right] .    (18)
The hypothesis of Cramer’s theorem is that the exponential moments (18) are
finite, at least for a suitable range of \lambda. This implies that h(y) decays expo-
nentially in some average sense as y \to \infty. Clearly, the force \lambda changes the
expected value of Y. Under the tilted density (17), h_\lambda(y) = e^{\lambda y} h(y) / Z(\lambda), this
expected value is

\mu(\lambda) = E_\lambda[Y] = \frac{1}{Z(\lambda)} \int y\, e^{\lambda y} h(y)\, dy .    (19)

Note that h_b in the Gaussian case has the form (17) with \lambda chosen so that
\mu(\lambda) = b. In general, define \lambda^*(s) by
\mu(\lambda^*(s)) = s .    (20)
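As a simple illustration (an example added here, not one worked out in the notes), take
h(y) = e^{-y} for y \ge 0. Then

Z(\lambda) = \frac{1}{1 - \lambda} \quad (\lambda < 1), \qquad \mu(\lambda) = \frac{1}{1 - \lambda} ,

and solving \mu(\lambda^*(s)) = s gives \lambda^*(s) = 1 - 1/s for s > 1.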
It is easy to see that if \lambda^*(s) exists for a given s, it is unique. Using this \lambda^*, we
have

\frac{1}{n} \, e^{n \lambda^*(s) s} \, Z(\lambda^*(s))^{-n} \, \phi(s)
= \int \cdots \int h_{\lambda^*}(y_1) \cdots h_{\lambda^*}(y_n)\, \delta(y_1 + \cdots + y_n - ns)\, dy_1 \cdots dy_n .
What is special about using \lambda = \lambda^*(s) is that the right side is the probability
density, at the point ns, of a sum of n iid random variables with density h_{\lambda^*} and
mean s. This allows us to use the central limit theorem to approximate the right
side. Let

\sigma_{\lambda^*}^2 = E_{\lambda^*}\left[ (Y - s)^2 \right] = \frac{1}{Z(\lambda^*)} \int (y - s)^2 e^{\lambda^* y} h(y)\, dy .

Then the right side is approximately the peak value of a N(ns, n \sigma_{\lambda^*}^2) density,
which is

\frac{1}{\sqrt{2\pi n}\, \sigma_{\lambda^*}} .
Altogether⁶

\phi(s) = Z(\lambda^*(s))^{n} \, e^{-n \lambda^*(s) s} \, \sqrt{n} \left( \frac{1}{\sqrt{2\pi}\, \sigma_{\lambda^*}} + O\left( n^{-1/2} \right) \right) .
This shows that

\phi(s) = e^{-R(s) n} \, \sqrt{n} \left( d(s) + O\left( n^{-1/2} \right) \right) ,

with

R(s) = s \lambda^*(s) - \ln\left\{ Z(\lambda^*(s)) \right\} ,    (21)

and

d(s) = \frac{1}{\sqrt{2\pi}\, \sigma_{\lambda^*}} .
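Continuing the exponential illustration from above, Z(\lambda^*(s)) = s, and the tilted density
h_{\lambda^*} is exponential with mean s, so \sigma_{\lambda^*} = s. Then (21) gives

R(s) = s - 1 - \ln s , \qquad d(s) = \frac{1}{\sqrt{2\pi}\, s} .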
As in the Gaussian case,

p = \int_{s=b}^{\infty} \phi(s)\, ds ,
and most of the mass is near s = b. Again using the Watson lemma, we expand
6 The error term comes from the error term in the central limit theorem.