
DRAWING FROM DENSITIES

Anna Conte – Applied Economics

Learning outcomes:

Using simulations

• Simulation consists of drawing from a density, calculating a statistic
for each draw, and averaging the results.
• In many cases, we want to calculate an average of the form
$$\bar{t} = \int t(\varepsilon)\, f(\varepsilon)\, d\varepsilon$$
where $t(\cdot)$ is a statistic of interest and $f(\cdot)$ is a density.
• For some densities, this task is simple.
• However, in many situations, it is not immediately clear how to
draw from the relevant density.
• Furthermore, even with simple densities, there may be ways of
taking draws that provide a better approximation to the integral
than a sequence of purely random draws.
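
The averaging step can be sketched in a few lines. This is a minimal illustration, not part of the original slides: the statistic t(ε) = ε², the standard normal density, and the helper name `simulate_mean` are all assumptions chosen so that the true value of the integral (1) is known.

```python
import random

def simulate_mean(t, draw, R, seed=0):
    """Approximate the integral of t(eps) f(eps) d(eps) by averaging t over R draws from f."""
    rng = random.Random(seed)
    return sum(t(draw(rng)) for _ in range(R)) / R

# Illustrative choice (an assumption): t(eps) = eps**2 with a standard normal f,
# so the true value of the integral is 1.
estimate = simulate_mean(lambda e: e * e, lambda rng: rng.gauss(0.0, 1.0), R=100_000)
print(estimate)
```

With purely random draws the estimate should approach the true value only gradually as R grows, which is what motivates the alternative drawing schemes discussed later.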

Random Draws: Standard Normal and Uniform

• If we want to take a draw from a standard normal density or a
standard uniform density, the process from a programming
perspective is very easy.
• Most statistical packages contain random number generators for
these densities.
• We can simply call these routines to obtain a sequence of random
draws.

How to draw from a univariate density

• Suppose that we want to draw a number from a distribution with
density $f(\cdot)$ and cumulative distribution function $F(\cdot)$.
• We first need to draw a number from a standard uniform distribution
using a “random” number generator, say $\mu_1$. In practice, any
statistical package can do this for us.
• If the inverse of the cumulative distribution function is known, then
we simply need to calculate
$$\varepsilon_1 = F^{-1}(\mu_1)$$

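
A minimal sketch of the inverse-CDF procedure follows. The exponential distribution is an assumption picked for illustration because its CDF, F(x) = 1 − exp(−rate · x), has a closed-form inverse; the function name `draw_exponential` is likewise hypothetical.

```python
import math
import random

def draw_exponential(rate, rng):
    """One draw via the inverse CDF: eps = F^{-1}(mu), with F(x) = 1 - exp(-rate * x)."""
    mu = rng.random()                    # standard uniform draw in [0, 1)
    return -math.log(1.0 - mu) / rate    # F^{-1}(mu)

rng = random.Random(42)
draws = [draw_exponential(2.0, rng) for _ in range(50_000)]
sample_mean = sum(draws) / len(draws)    # should be near 1 / rate = 0.5
```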
A graphical representation from Kenneth Train’s book

Application: normal distribution


• We want draws such that $\hat{y}_r \sim N(\mu_y, \sigma_y^2)$.
• We first need to draw from a standard normal.
• Drawing from a standard normal distribution is fairly easy, since
good approximations of the inverse cumulative distribution function
are available in almost all statistical packages.
• Let’s say that we get R draws $\hat{\upsilon}_r$ from a standard normal
distribution.
• We transform them into draws from a normal distribution with
mean $\mu_y$ and variance $\sigma_y^2$:
$$\hat{y}_r = \mu_y + \sigma_y \hat{\upsilon}_r$$

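
The two steps (standard normal draws, then the location-scale transformation) might look as follows. This is a sketch under stated assumptions: the values of `mu_y` and `sigma_y` are illustrative, and Python's `statistics.NormalDist.inv_cdf` stands in for whatever inverse-CDF routine a given package provides.

```python
import random
from statistics import NormalDist, fmean, pstdev

STD_NORMAL = NormalDist()           # mean 0, standard deviation 1

def standard_normal_draw(rng):
    """One standard normal draw obtained by inverting the CDF at a uniform draw."""
    u = rng.random()
    while u == 0.0:                 # inv_cdf requires 0 < u < 1
        u = rng.random()
    return STD_NORMAL.inv_cdf(u)

mu_y, sigma_y = 2.0, 0.5            # illustrative values, not from the slides
rng = random.Random(1)
upsilon = [standard_normal_draw(rng) for _ in range(20_000)]
y = [mu_y + sigma_y * v for v in upsilon]   # y_r = mu_y + sigma_y * upsilon_r
```

The sample mean and standard deviation of `y` (for example via `fmean(y)` and `pstdev(y)`) should be close to `mu_y` and `sigma_y`.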
Application: lognormal distribution


• We want draws such that $\ln(\hat{x}_r) \sim N(\mu_y, \sigma_y^2)$.
• We take normal draws of y with mean $\mu_y$ and variance $\sigma_y^2$ as
derived before.
• Finally, we take the natural exponential of the normal draws to
get our lognormal draws:
$$\hat{x}_r = \exp(\hat{y}_r)$$
• The moments of the lognormal are functions of the mean and
variance of the normal that is exponentiated.
• In particular, the mean of x is $\exp(\mu_y + \sigma_y^2/2)$, and its variance is
$\exp(2\mu_y + \sigma_y^2)\,[\exp(\sigma_y^2) - 1]$.

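
As a rough check of the moment formulas, the sketch below exponentiates normal draws and compares sample moments with the theoretical ones. The parameter values and the seed are assumptions made only for this illustration.

```python
import math
import random
from statistics import fmean, pvariance

mu_y, sigma_y = 0.0, 0.5     # illustrative parameters, not from the slides
rng = random.Random(5)
y = [rng.gauss(mu_y, sigma_y) for _ in range(100_000)]   # normal draws
x = [math.exp(v) for v in y]                              # lognormal draws

# Theoretical moments of the lognormal, as stated above.
theoretical_mean = math.exp(mu_y + sigma_y**2 / 2)
theoretical_var = math.exp(2 * mu_y + sigma_y**2) * (math.exp(sigma_y**2) - 1)
sample_mean, sample_var = fmean(x), pvariance(x)
```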
Truncated Univariate Densities (i)

• Consider a random variable that ranges from a to b with density
proportional to $f(\varepsilon)$ within this range.
• That is, the density is $(1/k)\, f(\varepsilon)$ for $a \le \varepsilon \le b$, and 0 otherwise,
where k is the normalizing constant that ensures that the density
integrates to 1:
$$k = \int_a^b f(\varepsilon)\, d\varepsilon = F(b) - F(a)$$
• A draw from this density can be obtained by applying the procedure
just described while ensuring that the draw is within the appropriate
range.
• Draw $\mu$ from a standard uniform density.
• Calculate the weighted average of F(a) and F(b) as
$\bar{\mu} = (1 - \mu) F(a) + \mu F(b)$ or, equivalently, as $\bar{\mu} = F(a) + \mu\,[F(b) - F(a)]$.
• Then calculate $\varepsilon = F^{-1}(\bar{\mu})$.

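
The three steps above can be sketched generically, given any CDF and its inverse. The exponential density with rate 1, the truncation points 0.5 and 2.0, and the helper name `draw_truncated` are assumptions for illustration only.

```python
import math
import random

def draw_truncated(cdf, inv_cdf, a, b, rng):
    """Draw from the density f truncated to [a, b] using only F and F^{-1}."""
    mu = rng.random()                          # standard uniform draw
    mu_bar = cdf(a) + mu * (cdf(b) - cdf(a))   # lies between F(a) and F(b)
    return inv_cdf(mu_bar)

# Illustrative density (an assumption): exponential with rate 1.
def exp_cdf(x):
    return 1.0 - math.exp(-x)

def exp_inv_cdf(u):
    return -math.log(1.0 - u)

rng = random.Random(7)
draws = [draw_truncated(exp_cdf, exp_inv_cdf, 0.5, 2.0, rng) for _ in range(10_000)]
```

Every element of `draws` should fall between the truncation points, and the normalizing constant k never has to be computed.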
Truncated Univariate Densities (ii)

• Since $\bar{\mu}$ is between F(a) and F(b), $\varepsilon$ is necessarily between a and b.
• Essentially, the draw of $\mu$ determines how far to go between a and b.
• Note that the normalizing constant k is not used in the calculations
and therefore need not be calculated.

Figure 9.2. Draw of $\bar{\mu}_1$ between F(a) and F(b) gives draw $\varepsilon_1$ from $f(\varepsilon)$.

Truncated Univariate Densities (iii)

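• To draw from a normal with mean m and standard deviation s,
left-truncated at a and right-truncated at b, first draw $\mu$ from a
standard uniform density.
• Then compute
$$\varepsilon = m + s \times F^{-1}\left( F\left(\frac{a-m}{s}\right) + \mu \left[ F\left(\frac{b-m}{s}\right) - F\left(\frac{a-m}{s}\right) \right] \right),$$
where F here denotes the standard normal cumulative distribution function.
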
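
A minimal sketch of the truncated normal formula, assuming Python's `statistics.NormalDist` for the standard normal CDF and its inverse. The function name and the example values of m, s, a and b are hypothetical.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def draw_truncated_normal(m, s, a, b, rng):
    """Draw from N(m, s^2) truncated to [a, b] via the standard normal CDF and its inverse."""
    lo = STD_NORMAL.cdf((a - m) / s)
    hi = STD_NORMAL.cdf((b - m) / s)
    mu = rng.random()
    while mu == 0.0:          # keeps the argument of inv_cdf strictly above 0
        mu = rng.random()
    return m + s * STD_NORMAL.inv_cdf(lo + mu * (hi - lo))

rng = random.Random(9)
draws = [draw_truncated_normal(1.0, 2.0, 0.0, 3.0, rng) for _ in range(10_000)]
```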
Multivariate Normals (i)

• To draw from a multivariate normal, it is possible to use a
procedure similar to the one used to draw from a univariate normal
distribution.
• Let $\varepsilon$ be a vector with K elements distributed $N(b, \Omega)$.
• Let L be the lower-triangular Choleski factor of $\Omega$ such that
$LL' = \Omega$.
• A draw of $\varepsilon$ from $N(b, \Omega)$ is obtained as follows.
• Take K draws from a standard normal, and label the vector of
these draws $\eta = (\eta_1, \ldots, \eta_K)'$.
• Calculate $\varepsilon = b + L\eta$.

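
The procedure can be sketched without external libraries. The hand-written `cholesky` routine below is a textbook implementation included only to keep the example self-contained, and the mean vector and covariance matrix are assumptions chosen for illustration.

```python
import math
import random

def cholesky(omega):
    """Lower-triangular L with L L' = omega (omega symmetric positive definite)."""
    K = len(omega)
    L = [[0.0] * K for _ in range(K)]
    for i in range(K):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(omega[i][i] - s)
            else:
                L[i][j] = (omega[i][j] - s) / L[j][j]
    return L

def draw_mvn(b, L, rng):
    """One draw eps = b + L eta, with eta a vector of independent N(0, 1) draws."""
    K = len(b)
    eta = [rng.gauss(0.0, 1.0) for _ in range(K)]
    return [b[i] + sum(L[i][k] * eta[k] for k in range(i + 1)) for i in range(K)]

# Illustrative mean vector and covariance matrix (assumptions, not from the slides).
b = [1.0, -1.0]
omega = [[4.0, 1.2], [1.2, 1.0]]
L = cholesky(omega)
rng = random.Random(3)
draws = [draw_mvn(b, L, rng) for _ in range(50_000)]

# Sample covariance between the two elements; should be near omega[0][1].
m0 = sum(d[0] for d in draws) / len(draws)
m1 = sum(d[1] for d in draws) / len(draws)
sample_cov = sum((d[0] - m0) * (d[1] - m1) for d in draws) / len(draws)
```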
Multivariate Normals (ii)

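• Consider a three-dimensional $\varepsilon$ with zero mean.
• A draw of $\varepsilon$ is calculated as
$$\begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \end{pmatrix} = \begin{pmatrix} s_{11} & 0 & 0 \\ s_{21} & s_{22} & 0 \\ s_{31} & s_{32} & s_{33} \end{pmatrix} \begin{pmatrix} \eta_1 \\ \eta_2 \\ \eta_3 \end{pmatrix}$$
or
$$\begin{aligned} \varepsilon_1 &= s_{11}\eta_1 \\ \varepsilon_2 &= s_{21}\eta_1 + s_{22}\eta_2 \\ \varepsilon_3 &= s_{31}\eta_1 + s_{32}\eta_2 + s_{33}\eta_3 \end{aligned}$$
• Essentially, the Choleski factor expresses K correlated terms as
arising from K independent components, with each component
loading differently onto each term.
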
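
As a short check on the three-dimensional case, and using only the fact that the $\eta_k$ are independent with zero mean and unit variance, the implied second moments of the first two elements are

$$\mathrm{Var}(\varepsilon_1) = s_{11}^2, \qquad \mathrm{Cov}(\varepsilon_1, \varepsilon_2) = s_{11} s_{21}, \qquad \mathrm{Var}(\varepsilon_2) = s_{21}^2 + s_{22}^2 .$$

These are the corresponding entries of $LL'$. More generally, $\mathrm{Cov}(\varepsilon) = L\, E(\eta\eta')\, L' = LL' = \Omega$, so the draws have the intended covariance matrix.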
Example: application to random effects panel data

• Consider this formula, where $g(\cdot)$ is the normal $N(0, \sigma_\alpha^2)$
probability density function of $\alpha_i$:
$$f(y_{i1}, \ldots, y_{iT} \mid x_{i1}, \ldots, x_{iT}, \beta) = \int_{-\infty}^{\infty} \left[ \prod_t f(y_{it} \mid x_{it}, \beta, \alpha) \right] g(\alpha)\, d\alpha$$
• We draw an arbitrarily large number of values of $\alpha$ from its distribution
$g(\alpha)$, denoted as $\hat{\alpha}_r$ with $r = 1, \ldots, R$.
• For every such value, we compute $\prod_t f(y_{it} \mid x_{it}, \beta, \hat{\alpha}_r)$.
• Finally, we calculate the average of the values so obtained, which
approximates the integral above:
$$f(y_{i1}, \ldots, y_{iT} \mid x_{i1}, \ldots, x_{iT}, \beta) \approx \frac{\sum_{r=1}^{R} \left[ \prod_t f(y_{it} \mid x_{it}, \beta, \hat{\alpha}_r) \right]}{R}$$

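
The simulation steps for one subject can be sketched as follows. The slides leave $f(y_{it} \mid x_{it}, \beta, \alpha)$ generic, so the probit form used here, the single regressor, the data and the parameter values are all assumptions made purely for illustration.

```python
import math
import random
from statistics import NormalDist

PHI = NormalDist().cdf   # standard normal CDF

def simulated_likelihood(y, x, beta, sigma_alpha, R, seed=0):
    """Average over R draws of alpha of the product over t of f(y_it | x_it, beta, alpha).

    Assumes a probit form f = Phi((2 y - 1)(beta x + alpha)), chosen only for illustration.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(R):
        alpha_r = rng.gauss(0.0, sigma_alpha)   # draw from g(alpha) = N(0, sigma_alpha^2)
        total += math.prod(
            PHI((2 * y_t - 1) * (beta * x_t + alpha_r)) for y_t, x_t in zip(y, x)
        )
    return total / R

# Hypothetical data for one subject observed over T = 4 periods.
y_i = [1, 0, 1, 1]
x_i = [0.2, -0.5, 1.0, 0.3]
lik_i = simulated_likelihood(y_i, x_i, beta=0.8, sigma_alpha=1.0, R=5_000)
```

In an estimation routine, a function of this kind would be evaluated for every subject at each trial value of the parameters, and the logs of the results summed.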
The crude frequency estimator of $f(y_{i1}, \ldots, y_{iT} \mid x_{i1}, \ldots, x_{iT}, \beta)$

• The R draws so obtained from a normal distribution are then used
to compute an estimate of the joint probability of interest:
$$\hat{f}(y_{i1}, \ldots, y_{iT} \mid x_{i1}, \ldots, x_{iT}, \beta) = \frac{\sum_{r=1}^{R} \left[ \prod_t f(y_{it} \mid x_{it}, \beta, \hat{\alpha}_r) \right]}{R}$$
• This is known as the “crude frequency” estimator.
• There are more advanced, smoother methods, but the crude
frequency estimator seems to work well with a limited number of
draws per subject in several cases.

How large is an “arbitrarily” large number of random draws?

• Generally, a number of independent draws on the order of a few
thousand may work.
• However, other techniques enable us to draw from a distribution
and achieve great accuracy with a much smaller number of
draws.
• Examples include antithetic draws, systematic draws, and Halton
sequences, among others.
• The reason for preferring non-independent draws is that they offer
better coverage and variance reduction.
• It is often the case that a small number of non-independent draws
provides the same accuracy as a much larger number of independent
draws.

Coverage

• Recall that the objective is to approximate an integral of the form
$\int t(\varepsilon)\, f(\varepsilon)\, d\varepsilon$.
• The integral is over the density f.
• It seems reasonable that a more accurate approximation would be
obtained by evaluating $t(\varepsilon)$ at values of $\varepsilon$ that are spread
throughout the domain of f.
• With independent random draws, it is possible that the draws will
be clumped together, with no draws from large areas of the domain.
• Procedures that guarantee better coverage can be expected to
provide a better approximation.

Variance reduction

• With independent draws, the covariance over draws is zero.
• The variance of a simulator based on R independent draws is
therefore the variance based on one draw divided by R.
• If the draws are negatively correlated instead of independent, then
the variance of the simulator is lower.
• For R = 2, the variance of $\check{t} = [t(\varepsilon_1) + t(\varepsilon_2)]/2$ is
$[V(t(\varepsilon_1)) + V(t(\varepsilon_2)) + 2\,\mathrm{Cov}(t(\varepsilon_1), t(\varepsilon_2))]/4$.
• If the draws are independent, then the variance is $V(t(\varepsilon_r))/2$.
• If the two draws are negatively correlated with each other, the
covariance term is negative and the variance becomes less than
$V(t(\varepsilon_r))/2$.
• When the draws are negatively correlated within an unbiased
simulator, a value above $\bar{t} = E_r(t(\varepsilon))$ for one draw will tend to be
associated with a value for the next draw that is below $E_r(t(\varepsilon))$,
such that their average is closer to the true value $\bar{t}$.
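
The R = 2 case can be illustrated with antithetic draws, where the second draw is the mirror image of the first. This is a sketch under stated assumptions: the statistic t(ε) = exp(ε/2) and the standard normal density are illustrative choices, picked because a monotone t makes t(ε) and t(−ε) negatively correlated.

```python
import math
import random
from statistics import fmean, pvariance

def t(eps):
    """Illustrative statistic (an assumption); monotone in eps."""
    return math.exp(0.5 * eps)

rng = random.Random(11)
independent, antithetic = [], []
for _ in range(20_000):
    e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    independent.append((t(e1) + t(e2)) / 2)     # two independent draws
    antithetic.append((t(e1) + t(-e1)) / 2)     # a draw and its mirror image

# Variance of the two-draw simulator under each scheme.
var_independent = pvariance(independent)
var_antithetic = pvariance(antithetic)
```

Both simulators target the same mean, but the variance of the antithetic version should come out clearly smaller because the covariance term is negative.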
How to get Halton draws

• A Halton sequence is defined in terms of a given number,
usually a prime.
• Let’s see an example with the prime 3.
• The Halton sequence for 3 is created by dividing the unit
interval into three parts with breaks at 1/3 and 2/3. The first
terms in the sequence are these breakpoints: 1/3 and 2/3.
• Then each of the three segments is divided into thirds, and the
breakpoints for these segments are added to the sequence in a
particular way: the lower breakpoints in all three segments
(1/9, 4/9, 7/9) are entered in the sequence before the higher
breakpoints (2/9, 5/9, 8/9).
• Then each of the nine segments is divided into thirds, with the
breakpoints added to the sequence, and so on for as many
points as the researcher needs.

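
One compact way to generate these terms is the radical-inverse formulation: write the index in the chosen base and mirror its digits around the decimal point. The sketch below uses that formulation, which is a standard equivalent of the segment-splitting description above rather than a transcription of it; exact fractions are used so the terms can be compared with the slide directly.

```python
from fractions import Fraction

def halton(index, base):
    """Element number `index` (1, 2, 3, ...) of the Halton sequence for `base`."""
    value, weight = Fraction(0), Fraction(1, base)
    while index > 0:
        index, digit = divmod(index, base)
        value += digit * weight      # least significant digit gets the largest weight
        weight /= base
    return value

first_eight = [halton(i, 3) for i in range(1, 9)]
print(", ".join(str(h) for h in first_eight))
# prints: 1/3, 2/3, 1/9, 4/9, 7/9, 2/9, 5/9, 8/9
```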
Halton sequences

• Halton sequences (Halton, 1960) provide coverage and induce a
negative correlation over observations.
• Since a Halton sequence is defined on the unit interval, its elements
can be considered as well-placed “draws” from a standard uniform
density.
• The Halton draws provide better coverage than random draws, on
average, because they are created to progressively fill in the unit
interval evenly and ever more densely.
• The elements in each cycle are equidistant, and each cycle
covers the unit interval in the areas not covered by previous cycles.
• The pattern by which Halton sequences are created makes them
such that each subsequence fills in the gaps of the previous
subsequences.
• If a nonprime is used, then there is a possibility that the cycles will
coincide throughout the entire sequence.

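
Because Halton elements behave like standard uniform draws, they can be pushed through an inverse CDF in the same way as random uniforms. The sketch below does this for the standard normal; combining the two steps, the choice of base 3, and the number of elements are assumptions made for illustration.

```python
from statistics import NormalDist, fmean

def halton(index, base):
    """Element `index` (1, 2, 3, ...) of the Halton sequence for `base`, as a float."""
    value, weight = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        value += digit * weight
        weight /= base
    return value

inv_cdf = NormalDist().inv_cdf
# 80 elements (3**4 - 1) complete the first four cycles for base 3.
normal_draws = [inv_cdf(halton(i, 3)) for i in range(1, 81)]

mean_estimate = fmean(normal_draws)                    # target value: 0
second_moment = fmean(z * z for z in normal_draws)     # target value: 1
```

With complete cycles the points are symmetric around 1/2, so the mean is matched almost exactly. The second moment can fall somewhat short of 1 with so few points, since the extreme tails have not yet been reached.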
Independent draws vs. Halton sequences: bivariate case

How to get Halton draws in Stata

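* mdraws is a user-written command and must be installed before use
* Number of Halton draws
global draws "100"
* One prime per equation
matrix p = (3, 7)
* burn(7) drops the first 7 elements of each sequence
mdraws, neq(2) dr($draws) prefix(h) burn(7) primes(p)
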
Useful references

• Train, Kenneth (2009), Discrete Choice Methods with
Simulation, Cambridge University Press.
• Stern, Steven (1997), “Simulation-Based Estimation”, Journal
of Economic Literature, Vol. 35(4), pp. 2006–2039.
