ST302
Thorsten Rheinländer
London School of Economics and Political Science
August 17, 2006
Example 1.1 We toss a coin three times, with the result of each toss being H (head) or T (tail). Our time set consists of t = 0, 1, 2, 3. The sample space Ω is the set of all eight possible outcomes. At time 0 (just before the first coin toss), we only know that the true ω does not belong to ∅ and does belong to Ω, hence we set

F_0 = {∅, Ω}.
At time 1, after the first coin toss, in addition the following two sets are resolved:

A_H = {HHH, HHT, HTH, HTT}
A_T = {THH, THT, TTH, TTT}.

At time 2, after the second coin toss, the four sets A_HH, A_HT, A_TH, A_TT (for example A_HT = {HTH, HTT}) get resolved, together with their complements and unions. Altogether, we get the σ-algebra F_2 generated by these four sets.
At time 3, after the third coin toss, we know the true realization of ω, and can therefore tell for each subset of Ω whether ω is a member or not. Hence F_3 consists of all subsets of Ω.
Consider now a discrete time stochastic process X = (X_0, X_1, X_2, ...), i.e. a collection of random variables indexed by time. Hence X is a function of a chance parameter ω and a time parameter n. We would write X_n(ω) for a function value, but typically suppress the chance parameter, otherwise the notation gets too heavy. Moreover, the two variables play quite different roles, since the chance parameter ω comes from the sample space Ω (which can be a very large set, with no natural ordering) whereas the time parameter n is an element of the ordered set ℕ₊.
We denote by F_n a family of sets containing all information about X up to time n. In more detail, F_n is the σ-algebra generated by sets of the form (we assume that the starting value X_0 is just a deterministic number)

{X_1 = i_1, X_2 = i_2, ..., X_n = i_n}

if the state space is discrete. If the state space is ℝ, then F_n is the σ-algebra generated by sets of the form

{X_1 ∈ (a_1, b_1), X_2 ∈ (a_2, b_2), ..., X_n ∈ (a_n, b_n)}

with intervals (a_1, b_1), (a_2, b_2), ... We have

F_0 = {∅, Ω} (at time zero, we know nothing)
F_0 ⊆ F_1 ⊆ F_2 ⊆ ... ⊆ F_n (the more time evolves, the more we know)
(F_n) = (F_0, F_1, F_2, ...) is called a filtration. Sometimes we write F_n^X to specify that we are collecting information about the stochastic process X.
We say a random variable H is F_n-measurable if it depends on information about X up to time n only. For example, if f is some function, then f(X_n) is F_n-measurable; and so is max_{1≤i≤n} X_i. Or, in other words, H is F_n-measurable if it is a function of X_1, ..., X_n.
We say that a stochastic process Y = (Y_n)_{n≥0} is adapted to (F_n) if each Y_n is F_n-measurable. F_n^Y is called the filtration generated by Y if it is the smallest filtration to which Y is adapted. Summing up:
The random variable H is measurable with respect to a σ-algebra F if the information in F is sufficient to determine H; F can even contain more information than just about H.
The stochastic process X is adapted to a filtration (F_n) if the information in (F_n) is sufficient to determine all the X_n's, and (F_n) can even contain more information. F_n^X contains just the information about the X_n's; more precisely, the σ-algebra F_n^X contains exactly the information about X_0, X_1, ..., X_n.
1.2 Conditional Expectation
We consider a σ-algebra F and want to make a prediction about some random variable H based on the information contained in F. If H is F-measurable, then we can precisely determine H from the given information in F. If H is independent of F, then the information in F is of no help in predicting the outcome of H. In the intermediate case, we can try to use the information in F to make an 'educated guess' about H, without being able to completely evaluate H.
You cannot avoid this section. Its material is absolutely essential for everything that follows. To quote Thomas Mikosch: "The notion of conditional expectation is one of the most difficult ones in probability theory, but it is also one of the most powerful tools... For its complete theoretical understanding, measure-theoretic probability theory is unavoidable." However, it is perfectly possible, and not even difficult, to get an operational understanding of conditional expectation. You just have to grasp the following intuitive properties; for the aforementioned reason, they will be given without proof.
It turns out that one can define (by the partial averaging property below) the conditional expectation also in the more general case where we only assume that E[|H|] < ∞ (so E[H²] < ∞ need not be satisfied). We give a list of properties of E[H | F]. Here 1_A denotes the indicator function of the set A: 1_A(ω) = 1 if ω ∈ A, 1_A(ω) = 0 if ω ∉ A.
(a) (Measurability) E[H | F] is F-measurable
(the conditional expectation is a predictor based on available information).

(b) (Partial averaging) For every set A ∈ F,

E[1_A E[H | F]] = E[1_A H]

(on every set A ∈ F, the conditional expectation of H given F has the same expectation as H itself). In particular (take A = Ω),

E[E[H | F]] = E[H].
Properties (a) and (b) characterize the random variable E[H | F] uniquely (modulo null sets).

(c) (Linearity) E[a_1 H_1 + a_2 H_2 | F] = a_1 E[H_1 | F] + a_2 E[H_2 | F].

(d) (Positivity) If H ≥ 0, then E[H | F] ≥ 0.

(e) (Jensen's inequality) If f is a convex function such that E[|f(H)|] < ∞, then

f(E[H | F]) ≤ E[f(H) | F].

(f) (Tower property) If G is a sub-σ-algebra of F (contains less information than F), then

E[E[H | F] | G] = E[H | G].

In case we deal with a filtration, this property has the following form: for two time points s ≤ t (hence F_s ⊆ F_t) we have

E[E[H | F_t] | F_s] = E[H | F_s]

(nested conditional expectations are evaluated by taking only one conditional expectation with respect to the earliest time point).

(g) (Taking out what is known) If G is a random variable which is F-measurable, and such that E[|GH|] < ∞, then

E[GH | F] = G E[H | F].

In particular (choose H = 1),

E[G | F] = G

(if G only depends on available information, then we can predict G fully).

(h) (Rôle of independence) If H is independent of F, then

E[H | F] = E[H]

(in that case, the information in F is useless in predicting H).

Let X be any random variable. We will sometimes use the notation E[H | X] if we predict H by using the information about X only (more formally, we condition with respect to the σ-algebra generated by X). Moreover, if A ⊆ Ω we write P(A | F) for E[1_A | F]. An important special case of (h) is when we consider the trivial σ-algebra F_0 = {∅, Ω}. In that case, the conditional expectation with respect to F_0 is just the ordinary expectation (so the result is a number):

E[H | F_0] = E[H].
2 Martingales in Discrete Time
2.1 Definition
Let X = (X_n)_{n≥0} be a discrete time process with E[|X_n|] < ∞ for all n, and let (F_n) be some filtration. We assume that X is adapted to (F_n), that is, each X_n is F_n-measurable. We call X a martingale if

E[X_n | F_{n−1}] = X_{n−1},

a submartingale if

E[X_n | F_{n−1}] ≥ X_{n−1},

and a supermartingale if

E[X_n | F_{n−1}] ≤ X_{n−1}.
The martingale property, rewritten as

E[(X_n − X_{n−1}) | F_{n−1}] = 0

(here we have used properties (c): linearity and (g): taking out what is known, of the conditional expectation), tells us that, based on all the information about the game's progress so far, one can predict that our increase in wealth will be on average zero: we are playing a fair game. In reality, our wealth will be described in most games by a supermartingale: there is a tendency to lose on average. There are some games like Blackjack, however, where you can make very clever use of past events (like memorizing the cards which have already been taken out of the deck) to achieve that your wealth process evolves like a submartingale: you are engaged in a favourable game.
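As a quick numerical sanity check of the fair-game property, one can simulate many paths of the symmetric ±1 random walk and verify that the next increment, even after conditioning on an event from the past, averages close to zero. A minimal Python sketch (all parameter values and names are ours, chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n_paths, n_steps = 100_000, 10

# X_i = +1 or -1 with probability 1/2 each; S_n = X_1 + ... + X_n
X = rng.choice([-1, 1], size=(n_paths, n_steps))
S = X.cumsum(axis=1)

# Condition on an F_5-event, e.g. {S_5 > 0}, and look at the next increment
past_event = S[:, 4] > 0
next_increment = X[past_event, 5]
print(next_increment.mean())   # close to 0, as E[X_6 | F_5] = 0
```

The conditional average is ≈ 0, in line with E[(X_n − X_{n−1}) | F_{n−1}] = 0.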
We will later on introduce the notion of Markov processes: these are stochastic processes where future predictions depend on the past only via the present state; they are memoryless. A martingale need not be a Markov process (because future outcomes may depend on the whole past), and a Markov process need not be a martingale (since it may be related to an unfair game).
Fix some time horizon T ∈ ℕ₀, and let H be an F_T-measurable random variable. Define a stochastic process X for n = 0, 1, 2, ..., T as

X_n = E[H | F_n].

Then X is a martingale.
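The martingale property follows directly from the tower property (f): since F_{n−1} ⊆ F_n,

E[X_n | F_{n−1}] = E[E[H | F_n] | F_{n−1}] = E[H | F_{n−1}] = X_{n−1}.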
Example 2.4 Products of non-negative independent RVs of mean 1. Let Y_1, Y_2, ... be a sequence of independent non-negative RVs with

E[Y_n] = 1 for all n.

Then M_n = Y_1 Y_2 ··· Y_n (with M_0 = 1) is a martingale:

E[M_n | F_{n−1}] = E[M_{n−1} Y_n | F_{n−1}]
= M_{n−1} E[Y_n | F_{n−1}]   (taking out what is known, (g))
= M_{n−1} E[Y_n]   (by independence, (h))
= M_{n−1}.
Example 2.5 Exponential martingale. Let X_1, X_2, ... be independent, identically distributed RVs, and fix θ such that φ(θ) = E[exp(θX_1)] < ∞. We set (S_0 := 0 and) S_n = X_1 + ... + X_n. Let (F_n) = F_n^X, the filtration which is generated by X. Then

M_n = exp(θS_n) / φ(θ)^n

is a martingale. This follows from the previous example, just put

Y_k = exp(θX_k) / φ(θ).
Example 2.6 Exponential martingale for random walk. In the setting of the previous example, let the distribution of the X_n's be given by

P(X = +1) = P(X = −1) = 1/2.

The stochastic process S = (S_n)_{n≥0} is called a symmetric random walk. Then

φ(θ) = E[exp(θX)] = (e^θ + e^{−θ})/2 = cosh θ.

We get that

M_n = exp(θS_n) (1/cosh θ)^n

is a martingale. In case the walk is not symmetric, with P(X = +1) = p and P(X = −1) = q := 1 − p, then we have φ(θ) = p e^θ + q e^{−θ}; choosing e^θ = q/p gives φ(θ) = 1, so that M_n = (q/p)^{S_n} is a martingale.
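Numerically, one can verify the martingale property of M_n = exp(θS_n)/(cosh θ)^n by checking that its sample mean stays at M_0 = 1 for every n. A short Python sketch (θ and the sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
theta, n_paths, n_steps = 0.5, 200_000, 10

X = rng.choice([-1, 1], size=(n_paths, n_steps))
S = X.cumsum(axis=1)
n = np.arange(1, n_steps + 1)

M = np.exp(theta * S) / np.cosh(theta) ** n   # M_n for each path
print(M.mean(axis=0))   # each entry close to 1 = E[M_n] = M_0
```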
Of more practical relevance is therefore the question whether you can turn a game which is considered to be fair on a turn-by-turn basis (your wealth coming from this game is modelled as a martingale) into something favourable for you by using a clever stopping-time-based strategy. In fact, at first sight even a rather dull strategy would serve this purpose, as the following example suggests:
Example 2.10 Symmetric random walk. Imagine a game based on fair coin tossing, that is, the gambler's fortune is modelled by a symmetric random walk S_n = x + X_1 + ... + X_n with S_0 = x. Let a < b and x ∈ (a, b). We are interested in the probability that we reach the higher level b first. To calculate this, we consider the stopping time

T = min{n : S_n ∉ (a, b)},

the first time the random walk exits from the interval (a, b). One can show that E[T] < ∞, so condition (ii) of the optional stopping theorem is fulfilled. We therefore get

x = E[S_T] = a P(S_T = a) + b P(S_T = b).

Since P(S_T = a) = 1 − P(S_T = b), we get that x = a + (b − a) P(S_T = b), hence the probability that starting in x we reach the level b before a is (x − a)/(b − a).
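A Monte Carlo check of this exit probability (the parameter values below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
a, b, x, n_paths = -3, 5, 1, 20_000

hits_b = 0
for _ in range(n_paths):
    s = x
    while a < s < b:
        s += 1 if rng.random() < 0.5 else -1   # one step of the symmetric walk
    hits_b += (s == b)

print(hits_b / n_paths, (x - a) / (b - a))   # both close to 0.5
```

With a = −3, b = 5, x = 1 the formula gives (1 + 3)/(5 + 3) = 0.5.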
Example 2.11 Gambling game. Same setting as in the last example, but here we have (asymmetric walk) that

P(X_i = +1) = p,  P(X_i = −1) = 1 − p =: q

(typically p < q). Using the martingale M_n = (q/p)^{S_n} of Example 2.6, we get

(q/p)^x = (q/p)^a P(S_T = a) + (q/p)^b P(S_T = b).

Since P(S_T = a) = 1 − P(S_T = b), we get

(q/p)^x = (q/p)^a + ((q/p)^b − (q/p)^a) P(S_T = b).

We solve this to get that the probability that starting in x we reach the level b before a is

((q/p)^x − (q/p)^a) / ((q/p)^b − (q/p)^a).
Example 2.12 Duration of fair games. Consider a symmetric simple random walk S_n = X_1 + ... + X_n with S_0 = 0. For a < 0 < b, let

T = min{n : S_n ∉ (a, b)}.

Then we have

E[T] = −ab.

Proof: We have that M_n = S_n² − n is a martingale. Indeed:

M_{n+1} − M_n = (S_n + X_{n+1})² − (n + 1) − (S_n² − n)
= 2 S_n X_{n+1} + X_{n+1}² − 1.

Now, S_n is F_n-measurable, and X_{n+1} is independent of F_n, with E[X_{n+1}] = 0 and E[X_{n+1}²] = 1. This yields

E[M_{n+1} − M_n | F_n] = E[2 S_n X_{n+1} + X_{n+1}² − 1 | F_n]
= 2 S_n E[X_{n+1} | F_n] + E[X_{n+1}² | F_n] − 1
= 2 S_n E[X_{n+1}] + E[X_{n+1}²] − 1
= 0.

One can show that E[T] < ∞, so condition (ii) of the optional stopping theorem is fulfilled. Applying this to the martingale S_n² − n yields

E[T] = E[S_T²] = a² · b/(b − a) + b² · (−a)/(b − a) = (a²b − ab²)/(b − a) = −ab.
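The same Monte Carlo setup also confirms the expected duration (parameters arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
a, b, n_paths = -3, 5, 20_000

durations = []
for _ in range(n_paths):
    s, n = 0, 0
    while a < s < b:
        s += 1 if rng.random() < 0.5 else -1
        n += 1
    durations.append(n)

print(np.mean(durations), -a * b)   # both close to 15
```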
3 Discrete-Time Markov Chains
3.1 Basic Definitions
Example 3.1 (Gambling game) Consider a gambling game in which on any turn you win $1 with probability p = 0.4 or lose $1 with probability 1 − p = 0.6. Let X_n be the amount of money you have after n turns. You stop the game if you are either bankrupt (X_n = 0) or if you have reached a prespecified gain of $N. Then X_0, X_1, X_2, ... is a stochastic process which has the following 'Markov property': given the present state X_n, any other information about the past is irrelevant in predicting the next state X_{n+1}.
In the following, we fix a finite set S, called the state space. The stochastic processes considered in this chapter are assumed to have values in S unless otherwise stated.
An adapted process X is called a Markov chain if

E[h(X_{t+s}) | F_t] = E[h(X_{t+s}) | X_t]

for any s, t ∈ ℕ₊ and any function h on S. If this conditional expectation depends only on the time increment s, but not on t, then the chain is said to be homogeneous.
We shall from now on focus on homogeneous chains. Let i, j ∈ S. We can then form the quantities

p(i, j) = P(X_{t+1} = j | X_t = i),

which do not depend on t since the chain is homogeneous. p(i, j) is called the one-step transition probability from state i to state j. We can form the transition matrix (p(i, j))_{i,j}.
In the gambling game example, the transition probabilities for 0 < i < N are

p(i, i + 1) = 0.4,
p(i, i − 1) = 0.6.

Moreover, we have p(0, 0) = 1 and p(N, N) = 1. The transition matrix for N = 4 is (here the state space is S = {0, 1, 2, 3, 4})

      0    1    2    3    4
0   1.0    0    0    0    0
1   0.6    0  0.4    0    0
2     0  0.6    0  0.4    0
3     0    0  0.6    0  0.4
4     0    0    0    0  1.0
That is, the two-step transition matrix is the matrix product of the one-step transition matrix with itself. It follows by induction:

Theorem 3.4 The m-step transition matrix p^m = (p^m(i, j))_{i,j} is the mth power of the one-step transition matrix.
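For the gambling chain with N = 4, the m-step matrix is quickly computed (here with numpy; the horizon m = 2 is an arbitrary choice):

```python
import numpy as np

p = np.array([
    [1.0, 0.0, 0.0, 0.0, 0.0],
    [0.6, 0.0, 0.4, 0.0, 0.0],
    [0.0, 0.6, 0.0, 0.4, 0.0],
    [0.0, 0.0, 0.6, 0.0, 0.4],
    [0.0, 0.0, 0.0, 0.0, 1.0],
])

m = 2
print(np.linalg.matrix_power(p, m))   # p^m, the m-step transition matrix
# e.g. row 2 gives the probabilities of being in states 0..4 after 2 turns from state 2
```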
Example 3.5 (Coin tossing) Each turn, a fair coin is tossed. We have 3 classes, hence S = {1, 2, 3}. If you score head, you move up one class unless you are in 1st class. If you score tail, you move down unless you are in 3rd class. The transition matrix is

      1    2    3
1   0.5  0.5    0
2   0.5    0  0.5
3     0  0.5  0.5

The two-step transition matrix is obtained by matrix multiplication:

( 0.5 0.5 0   )( 0.5 0.5 0   )   ( 0.5  0.25 0.25 )
( 0.5 0   0.5 )( 0.5 0   0.5 ) = ( 0.25 0.5  0.25 )
( 0   0.5 0.5 )( 0   0.5 0.5 )   ( 0.25 0.25 0.5  )

and so on.
Consider next a chain with transition matrix

     1    2    3
1  1/3  1/3  1/3
2    0  1/2  1/2
3    0  3/4  1/4

Here 2 and 3 are recurrent, while 1 is transient (once you have gone to 2 or 3, there is no way back to 1).
Definition 3.8 We say that state i communicates with state j (write i → j) if one can get with positive probability from i to j, that is, p^n(i, j) > 0 for some n > 0. If i → j and j → i, then i and j intercommunicate. If all states in a chain intercommunicate with each other, then the chain is called irreducible.

In the previous example, state 1 communicates with states 2 and 3, but does not intercommunicate with them. States 2 and 3 intercommunicate. So the example chain is not irreducible.

Theorem 3.9 If i communicates with j, but j does not communicate with i, then i is transient.

The explanation is that with positive probability we get from i to j but can never go back.
Theorem 3.12 If C is a finite, closed, and irreducible set, then all states in C are recurrent.

The explanation consists of two steps: 1) In a finite closed set C there has to be one recurrent state: since C is closed, the chain spends an infinite amount of time in C; if it would spend only a finite amount of time at each of the finitely many states of C, we would have a contradiction. 2) If i is recurrent and intercommunicates with j, then j is also recurrent.
One can show that the state space always decomposes as

S = T ∪ R_1 ∪ ... ∪ R_k,

where T is a set of transient states and the R_i's are closed sets of recurrent states.
Definition 3.15 A stationary distribution is a vector π with 0 ≤ π(j) ≤ 1 and Σ_j π(j) = 1 which solves

Σ_i π(i) p(i, j) = π(j),

or in matrix notation π p = π (with π written as a row vector). If the chain is started in the distribution π_0 = π, then its distribution after m steps is

π_m = π p^m,

where p^m is the mth power of the transition matrix. By definition of the stationary distribution we have π_1 = π p = π, π_2 = π_1 p = π p = π, and so on.
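A stationary distribution can be found numerically as a left eigenvector of p for eigenvalue 1, or equivalently by solving the linear system π(p − I) = 0 together with Σ_j π(j) = 1. For the coin-tossing chain of Example 3.5, whose matrix happens to be doubly stochastic, this yields the uniform distribution:

```python
import numpy as np

p = np.array([[0.5, 0.5, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 0.5, 0.5]])

# Solve pi (p - I) = 0 together with sum(pi) = 1 (least squares on the stacked system)
A = np.vstack([(p - np.eye(3)).T, np.ones(3)])
b = np.array([0.0, 0.0, 0.0, 1.0])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)
print(pi)   # [1/3, 1/3, 1/3]
```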
Theorem 3.17 (Convergence theorem) Suppose we have an irreducible, aperiodic chain which has a stationary distribution π (it follows from irreducibility that π must be unique). Then as n → ∞, p^n(i, j) → π(j). If the state space S is finite, then there exists such a π. Moreover, if μ_i is the expected number of steps it takes to go from i back to i, then π(i) = 1/μ_i.

The number n of turns does not have to be very large for p^n(i, j) to be close to its limiting value π(j). For example, already after 4 steps we have

        ( 0.2278 0.5634 0.2088 )
p^4  =  ( 0.2226 0.5653 0.2121 )
        ( 0.2215 0.5645 0.2140 )
Example (Reflecting random walk) The state space is S = {0, 1, 2, ...}, and the transition probabilities are

p(i, i + 1) = p for i ≥ 0,
p(i, i − 1) = 1 − p for i ≥ 1,
p(0, 0) = 1 − p.
To find a stationary distribution we solve the equations

Σ_i π(i) p(i, j) = π(j) with Σ_j π(j) = 1.

The equation for j = 0 reads π(0)(1 − p) + π(1)(1 − p) = π(0), the one for j = 1 reads π(0) p + π(2)(1 − p) = π(1), and so on. We set π(0) = c for some c to be determined later. By induction, the solution to our equation is

π(i) = c (p/(1 − p))^i.
We have three cases, writing λ = p/(1 − p):

1. p > 1/2: then λ > 1 and π(i) increases exponentially fast. Hence Σ_i π(i) = 1 cannot be satisfied, and consequently, there does not exist a stationary distribution.

2. p = 1/2: λ = 1, so π(i) = c for all i and Σ_i π(i) = ∞ unless c = 0. Again, there is no stationary distribution.

3. p < 1/2: λ < 1, so Σ_i π(i) = c Σ_i λ^i is a convergent geometric series. Recall that we have for such a series

Σ_{i=0}^∞ λ^i = 1/(1 − λ) = (1 − p)/(1 − 2p),

so c = (1 − 2p)/(1 − p) yields a stationary distribution π(i) = c λ^i.
One can show that the reflecting random walk is aperiodic. It is clearly irreducible. Therefore the convergence theorem implies that for p < 1/2, the probability that the walk is after n steps in state i converges for n → ∞ to π(i). In this case, the chain is recurrent. One can show that it is even positively recurrent: the expected return time μ_i to state i (i.e., the expected number of steps it takes to go from i back to i) equals μ_i = 1/π(i) < ∞.

On the other hand, if p > 1/2 then the chain is transient: once we leave a state, with positive probability we will never come back (the chain 'explodes').

The case p = 1/2 is a borderline case between recurrence and transience: the state 0 is recurrent, but μ_0 = ∞. In such a situation one says that 0 is null-recurrent.
4 Poisson Processes

From now on, we will study continuous-time stochastic processes. The time set is now ℝ₊, so that a stochastic process is a collection of random variables X = (X_t)_{t≥0}. The information about X is contained in a filtration (F_t) = (F_t)_{t≥0} where F_s ⊆ F_t for s < t. Sometimes we consider filtrations which contain not only information about the process X but also about some other events or processes. To express that an arbitrary filtration (F_t) contains all information about X (and possibly more), we use the expression 'X is adapted to (F_t)', which means that for each t ≥ 0, X_t is F_t-measurable.
We say that a stochastic process M is a martingale relative to (F_t) if E[|M_t|] < ∞ for all t and

(i) M is adapted to (F_t),
(ii) we have for all (s, t) with s < t that

E[M_t | F_s] = M_s.

Let T be a random time. As the event {T = t} often has probability zero, we formulate the stopping time property now as follows: T is a stopping time if for all t ≥ 0

{T ≤ t} ∈ F_t.

There is also an optional stopping theorem in continuous time. However, sometimes it is a bit technical to check whether its assumptions are satisfied. We will tell you when that is the case.
Suppose f is a (deterministic) function of time. We often want to approximate f(t) for small values of t and collect some remainder terms which are negligible for small t in a 'bin' which we shall denote by the symbol o(t). For example, consider the series expansion of the exponential function,

e^t = 1 + t + t²/2 + t³/6 + ... = 1 + t + o(t),

reflecting that for very small t we have e^t ≈ 1 + t.
More formally, we can define N as follows. Let (T_n)_{n≥1} be an increasing sequence of stopping times (below, the claim arrival times), and recall first the indicator function

1_{T_n ≤ t}(ω) = 1 if t ≥ T_n(ω), and 0 if t < T_n(ω).

Let now N be given as

N_t = Σ_{n≥1} 1_{T_n ≤ t}.

N is called the counting process associated to the stopping time sequence (T_n). As the T_n are stopping times, we have {T_n ≤ t} ∈ F_t for all n. Therefore N_t is F_t-measurable for all t, and hence N is adapted to the filtration (F_t).
Here (i) means that for any s < t we have that N_t − N_s is independent of F_s, and (ii) that for all s ≤ t and u ≤ v such that t − s = v − u, the probability distribution of N_t − N_s is the same as that of N_v − N_u (and hence the same as that of N_{t−s}, since N_0 = 0). The compensated process N_t − λt is then a martingale:

E[(N_t − λt) | F_s] = N_s − λs.

Theorem: for all n ≥ 0,

P(N_t = n) = e^{−λt} (λt)^n / n!.

That is, N_t has the Poisson distribution with parameter λt.
Proof. Step 1. For all t ≥ 0, P(N_t = 0) = e^{−λt}.
Since {N_t = 0} = {N_s = 0} ∩ {N_t − N_s = 0} for 0 ≤ s < t, we get by the independence of increments that

P(N_t = 0) = P(N_s = 0) P(N_t − N_s = 0) = P(N_s = 0) P(N_{t−s} = 0)

by the stationarity of increments. Let f(t) = P(N_t = 0). We have proven that f(t) = f(s) f(t − s) for all 0 ≤ s < t. As we can exclude the case f(t) = 0 (otherwise N_t = ∞ for all t, which is absurd), it follows that P(N_t = 0) = e^{−ct} for a constant c which can be easily identified at the end of the proof with λ.
Step 2. P(N_t ≥ 2) is o(t). That is, for very small t the probability that the Poisson process has more than one jump is negligibly small. (We are not going into details here.)

Step 3.

lim_{t↓0} P(N_t = 1)/t = lim_{t↓0} (1 − P(N_t = 0) − P(N_t ≥ 2))/t
= lim_{t↓0} (1 − e^{−λt} + o(t))/t
= λ.
Step 4. Conclusion. For 0 < α < 1, set

φ(t) = E[α^{N_t}].

We have by the independence and stationarity of the increments

φ(t + s) = E[α^{N_{t+s}}]
= E[α^{(N_{t+s} − N_t) + N_t}]
= E[α^{N_{t+s} − N_t}] E[α^{N_t}]
= E[α^{N_s}] E[α^{N_t}] = φ(s) φ(t).

Moreover, by Steps 1–3, E[α^{N_t}] = e^{−λt} + α λt + o(t) = 1 + λ(α − 1)t + o(t), so ψ(α) := φ'(0) = λ(α − 1), the derivative of φ at zero. Therefore we get that

φ(t) = e^{ψ(α)t} = e^{−λt + λαt},

hence by the exponential series

φ(t) = Σ_{n≥0} α^n P(N_t = n) = e^{−λt} Σ_{n≥0} (λt)^n α^n / n!.

Comparing coefficients of α^n gives

P(N_t = n) = e^{−λt} (λt)^n / n! for n ≥ 0,

as desired.
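To see the theorem 'in action', one can simulate the counting process from iid exponential interarrival times (see below) and compare the empirical distribution of N_t with the Poisson(λt) weights. A sketch in Python (λ, t and the sample size are arbitrary):

```python
import numpy as np
from math import exp, factorial

rng = np.random.default_rng(4)
lam, t, n_paths = 2.0, 3.0, 100_000

# T_n = sum of n iid Exp(lam) waiting times; N_t = #{n : T_n <= t}
gaps = rng.exponential(1 / lam, size=(n_paths, 30))
arrival_times = gaps.cumsum(axis=1)
N_t = (arrival_times <= t).sum(axis=1)

for n in range(4):
    empirical = (N_t == n).mean()
    poisson = exp(-lam * t) * (lam * t) ** n / factorial(n)
    print(n, round(empirical, 4), round(poisson, 4))
```

(The 30 simulated gaps per path suffice here since P(N_3 ≥ 30) is negligible for λ = 2.)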
Recall that the claims in our model arrive at the times T_1, T_2, T_3, ... If the Poisson process N is the associated counting process to this sequence of stopping times, then we get by Step 1 of the previous proof

P(T_1 ≤ t) = P(N_t ≥ 1) = 1 − P(N_t = 0) = 1 − e^{−λt}.

By the stationarity of increments of a Poisson process, we also get that for each n ≥ 0,

P(T_{n+1} − T_n ≤ t) = 1 − e^{−λt}.

That is, the waiting time up to the next arrival of a claim is exponentially distributed. Let us recall the density function f of the exponential distribution:

f(t) = λ e^{−λt} for t ≥ 0, and f(t) = 0 for t < 0.
We can compute the expected waiting time T for the next claim as follows (use integration by parts):

E[T] = ∫ t f(t) dt = ∫_0^∞ t λ e^{−λt} dt
= [−t e^{−λt}]_0^∞ + ∫_0^∞ e^{−λt} dt = 1/λ.

Similarly,

E[T²] = 2/λ².
Now let Y_1, Y_2, ... be independent, identically distributed claim sizes, independent of the Poisson process N, and consider the compound Poisson process

S_t = Y_1 + ... + Y_{N_t}.

One computes E[S_t] = λt E[Y_i].
By a similar computation one can get (if E[N_t²] < ∞) that

var(S_t) = var(Y_i) E[N_t] + var(N_t) (E[Y_i])²
= λt (var(Y_i) + (E[Y_i])²) = λt E[Y_i²].
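A quick simulation check of these two moment formulas for a compound Poisson process (the exponential claim sizes and all parameter values are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(5)
lam, t, mean_claim, n_paths = 2.0, 3.0, 10.0, 100_000

N = rng.poisson(lam * t, size=n_paths)            # N_t for each path
# S_t = sum of N_t iid exponential claims with mean 10
S = np.array([rng.exponential(mean_claim, n).sum() for n in N])

E_Y2 = 2 * mean_claim**2                          # second moment of the claim size
print(S.mean(), lam * t * mean_claim)             # both ~ 60
print(S.var(), lam * t * E_Y2)                    # both ~ 1200
```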
Let us recall that the moment-generating function of the Poisson distribution (here for the random variable N_{t−s}) is given as

E[e^{c N_{t−s}}] = exp(λ(e^c − 1)(t − s)).   (4.1)

We denote by M_Y(r) the moment-generating function of the random variable Y_i (which does not depend on i),

M_Y(r) = E[e^{r Y_i}].
Theorem 4.5 We have

E[exp(r Σ_{i=N_s+1}^{N_t} Y_i)] = e^{λ(M_Y(r) − 1)(t − s)}.
Proof. Recall that we can write a^x = exp(x log(a)). By the stationarity and independence of the increments of N, as well as the fact that the Y_i's are independent of N, we get

E[exp(r Σ_{i=N_s+1}^{N_t} Y_i)] = E[exp(r Σ_{i=1}^{N_{t−s}} Y_i)]
(multiplicative property of exp) = E[Π_{i=1}^{N_{t−s}} exp(r Y_i)]
(Y_i are independent of N) = E[Π_{i=1}^{N_{t−s}} E[exp(r Y_i)]]
= E[Π_{i=1}^{N_{t−s}} M_Y(r)]
= E[(M_Y(r))^{N_{t−s}}]
= E[exp(N_{t−s} log(M_Y(r)))].

We now set c = log(M_Y(r)) in the expression (4.1) for the moment-generating function of N_{t−s} to end up with

E[exp(N_{t−s} log(M_Y(r)))] = e^{λ(M_Y(r) − 1)(t − s)}.
4.3 Ruin Probabilities for the Classical Risk Process

Let us imagine an insurance company which starts with an initial capital u, collects premiums at a rate c, and has payments modelled by a compound Poisson process, as discussed above. Then we get the following model for the surplus of the insurance portfolio:

C_t = u + ct − Σ_{i=1}^{N_t} Y_i.

C is called the classical risk process. We assume that Y_i ≥ 0 and that the Y_i's are independent and identically distributed as in the last subsection. Let

μ = E[Y_i] < ∞.
A somewhat unpleasant event for the company happens at the ruin time

τ = inf{t ≥ 0 : C_t < 0},

where we follow the convention that inf ∅ = ∞. Therefore, the board of the company as well as regulators might want to have a close look at the (ultimate) ruin probability

ψ(u) = P(τ < ∞).

The ruin probability is a function of the initial capital u. If ψ(u) is large, then regulators might force the company to start business with more initial reserves, that is, to work with a larger u. Obviously, ψ(u) is decreasing in u. We are interested in obtaining some description of the dependence of ψ on u. Let us impose one more assumption, the so-called net profit condition:

c > λμ.
We can interpret this condition by recalling from the last section that the expected value of a compound Poisson process is given as

E[Σ_{i=1}^{N_t} Y_i] = λμt.

Therefore, the net profit condition just says that the premium rate (our income) is higher than the expected rate of our claim payments. It is a necessary requirement for an insurance company to be interested in this business. Moreover, we assume that the claims are not too heavy-tailed by requiring the moment-generating function of the claim distribution to be finite: M_Y(r) < ∞ for the values of r considered below.
The crucial observation which will lead to the description of the ruin probability will be obtained by solving the following problem: find r ≠ 0 such that exp(−rC_t) is a martingale. In other words, we are looking for an exponential martingale related to the risk process C. We compute, by using the independence and stationarity of increments, the definition of the risk process C, and the result of Theorem 4.5, that

E[e^{−rC_t} | F_s]
= E[e^{−r(C_t − C_s)} e^{−rC_s} | F_s]
(taking out + independence) = e^{−rC_s} E[e^{−r(C_t − C_s)}]
(definition of C) = e^{−rC_s} e^{−rct + rcs} E[exp(r Σ_{i=N_s+1}^{N_t} Y_i)]
(stationarity) = e^{−rC_s} e^{−rc(t−s)} E[exp(r Σ_{i=1}^{N_{t−s}} Y_i)]
(Theorem 4.5) = e^{−rC_s} e^{−rc(t−s)} exp(λ(M_Y(r) − 1)(t − s)).

Hence exp(−rC_t) is a martingale if and only if

θ(r) := λ(M_Y(r) − 1) − cr = 0.

One solution to this equation is r = 0, which is useless since exp(−rC_t) is just a constant for r = 0. Some analysis, however, shows that there might be one additional solution R > 0 to this equation. In case this solution exists we call it the adjustment coefficient R.
Example 4.6 Let the claims Y_i be Exp(α)-distributed, so that M_Y(r) = α/(α − r) for r < α. The equation θ(R) = 0 becomes

λ(α/(α − R) − 1) − Rc = 0.

Solving for the non-zero root gives

R = α − λ/c;

we have R > 0 (equivalent to c > λ/α) because of the net profit condition (recall that the exponential distribution has a mean of 1/α). However, often the exponential distribution is a rather poor approximation to the 'real' claim size distribution, so we need to consider a more general situation.
Let us reformulate the result which we obtained above as follows.

Theorem 4.7 Assume that the adjustment coefficient R exists. Then exp(−RC_t) is a martingale.

Before we proceed further, let us collect one more result from martingale theory (without proof): if M is a martingale and τ is a stopping time, then the stopped process (M_{τ∧t})_{t≥0} is again a martingale.

Theorem 4.9 Assume that the adjustment coefficient R exists. Then

ψ(u) < e^{−Ru}.
Proof. Recall that ψ(u) = P(τ < ∞) where τ is the time of ruin. By the optional stopping theorem, applied to the stopped martingale exp(−RC_{τ∧t}), we get

e^{−Ru} = e^{−RC_0} = E[e^{−RC_{τ∧t}}] ≥ E[e^{−RC_{τ∧t}} 1_{τ≤t}] = E[e^{−RC_τ} 1_{τ≤t}].

Letting now t tend to infinity, we get (note that C_τ ≤ 0 on {τ < ∞}, hence e^{−RC_τ} ≥ 1)

e^{−Ru} ≥ E[e^{−RC_τ} 1_{τ<∞}] ≥ E[1_{τ<∞}] = P(τ < ∞) = ψ(u).
It follows that the ruin probability ψ(u) decreases rather quickly, namely exponentially, with increasing initial reserve u. So running an insurance company is not too risky a business, provided the company is equipped with plenty of initial capital.
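For claim distributions without a closed-form root, the adjustment coefficient can be found numerically from θ(r) = λ(M_Y(r) − 1) − cr = 0. A sketch for exponential claims, where we can compare with R = α − λ/c (all parameter values arbitrary):

```python
from scipy.optimize import brentq

lam, c, alpha = 1.0, 1.5, 1.0        # intensity, premium rate, claim parameter
M_Y = lambda r: alpha / (alpha - r)  # MGF of Exp(alpha), valid for r < alpha
theta = lambda r: lam * (M_Y(r) - 1.0) - c * r

# theta(0) = 0; look for the second root in (0, alpha)
R = brentq(theta, 1e-9, alpha - 1e-6)
print(R, alpha - lam / c)            # both equal 1/3
```

Note that θ is convex with θ(0) = 0 and θ'(0) = λμ − c < 0 under the net profit condition, which is why a second, positive root can exist.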
Σ_{i=1}^n θ_i (S_{T_{i+1}∧t} − S_{T_i∧t}).

We call this expression the stochastic integral of θ with respect to S. Thus, the stochastic integral ∫_0^t θ dS represents the capital accumulated between 0 and t by following the trading strategy θ.

For the first term we use that S is a martingale (and the optional stopping theorem):

E[θ_0 (S_{T_1} − S_0) | F_s] = θ_0 (E[S_{T_1} | F_s] − S_0) = θ_0 (S_s − S_0).
We evaluate the second term by using the tower property together with 'taking out what is known' and the martingale property of S.

In the limit of finer and finer partitions one obtains integrands evaluated at left limits, θ_{u−} = lim_{ε↓0} θ_{u−ε}. More precisely, one can define a stochastic integral ∫_0^t θ_{u−} dS_u as the limit of the approximating sums in (5.1). Important here is again that the integrand has to be left-continuous, not right-continuous; this is the reason why we work with θ_{u−} instead of θ_u. This guarantees in particular that if S is a martingale, then the stochastic integral ∫ θ_{u−} dS is a martingale as well (the last statement is not completely rigorous, but essentially true).
We now consider processes X whose paths are right-continuous and such that at each t the left limit

X_{t−} = lim_{s↑t} X_s

exists. One also abbreviates this by saying that X is an RCLL process. In case X_{t−} ≠ X_t the process exhibits a jump of size

ΔX_t = X_t − X_{t−}.

A process X is called PD (piecewise differentiable) if, apart from countably many jump points, it is differentiable with derivative x_t; in differential notation,

dX_t = x_t dt + ΔX_t.

Let now f be a real-valued function with continuous derivative. We can form the composed function f(X_t). At the exceptional points, f(X_t) changes only by jumps, of size f(X_t) − f(X_{t−}). In the open intervals where X is continuously differentiable we have the usual chain rule at our disposal. In total we get

df(X_t) = f'(X_t) x_t dt + (f(X_t) − f(X_{t−})),

or in integral form

f(X_t) − f(X_0) = ∫_0^t f'(X_s) x_s ds + Σ_{0<s≤t} (f(X_s) − f(X_{s−})).
Given two stochastic processes X, Y and a time grid 0 = t_0 < t_1 < ... < t_n = t on [0, t], we can define the 'realized covariance'

Σ_i (X_{t_{i+1}} − X_{t_i})(Y_{t_{i+1}} − Y_{t_i})
= Σ_i (X_{t_{i+1}} Y_{t_{i+1}} − X_{t_i} Y_{t_i} − Y_{t_i}(X_{t_{i+1}} − X_{t_i}) − X_{t_i}(Y_{t_{i+1}} − Y_{t_i}))
= X_t Y_t − X_0 Y_0 − Σ_i (Y_{t_i}(X_{t_{i+1}} − X_{t_i}) + X_{t_i}(Y_{t_{i+1}} − Y_{t_i}))
→ X_t Y_t − X_0 Y_0 − ∫_0^t Y_{u−} dX_u − ∫_0^t X_{u−} dY_u.
Considering finer and finer grid sizes, we call the limit the quadratic covariation of X and Y, denoted by [X, Y]. Therefore,

Σ_i (X_{t_{i+1}} − X_{t_i})(Y_{t_{i+1}} − Y_{t_i}) → [X, Y]_t,

and rearranging the limit above gives the integration by parts formula

X_t Y_t − X_0 Y_0 = ∫_0^t X_{u−} dY_u + ∫_0^t Y_{u−} dX_u + [X, Y]_t.

Note that, in contrast to the version of the Itô formula we have proved so far, X and Y do not have to be PD for the integration by parts formula to be valid. However, if X and Y are PD, then we get from the approximation

[X, Y]_t = Σ_{0≤s≤t} ΔX_s ΔY_s.

In particular, since we have for a Poisson process N (jump size one!) that

N_t = Σ_{0≤s≤t} ΔN_s = Σ_{0≤s≤t} (ΔN_s)²,

we get [N, N]_t = N_t.
Theorem 5.5 The stochastic exponential Z of a PD process X, given by Z_t = e^{X_t^c − X_0^c} Π_{0<s≤t} (1 + ΔX_s) with X^c the continuous part of X, is the unique solution to the equation

Z_t = 1 + ∫_0^t Z_{s−} dX_s.
Proof. Let us assume that everywhere ΔX_t = 0. Then we have Z_t = exp(X_t), and the statement is a well-known result from classical analysis. If, on the other hand, dX_t = 0 with the possible exception of countably many jump points where dX_t = ΔX_t, then Z_t / Z_{t−} = 1 + ΔX_t, and we get

dZ_t / Z_{t−} = (Z_t − Z_{t−}) / Z_{t−} = (1 + ΔX_t) − 1 = ΔX_t = dX_t.

The general case is a combination of the two aforementioned special cases.
Example: for X_t = σ(N_t − λt), where N is a Poisson process with intensity λ, the equation

dZ_t = Z_{t−} dX_t, Z_0 = 1

has the solution

Z_t = e^{−σλt} (1 + σ)^{N_t}
= exp(N_t ln(1 + σ) − σλt)
= exp(cN_t − λ(e^c − 1)t)

with c = ln(1 + σ): we recover the exponential martingale of the Poisson process.
5.4 Jump Times and Hazard Rates

We are given a stochastic process X of state variables which influence the economy via stochastic interest rates r(X_t), and via a process λ(X_t) which will later be interpreted as hazard rate process. The filtration generated by X will be denoted by (G_t). Let E_1 be a random variable independent of (G_t) which is exponentially distributed with parameter 1. Let (F_t) be the larger filtration which encompasses (G_t) and the information about E_1. Then we can construct an (F_t)-stopping time as follows: define

τ = inf{t : ∫_0^t λ(X_s) ds ≥ E_1}.

In case λ(X_s) ≡ λ (constant) this would give us the first jump time of a Poisson process with intensity λ. In this sense the definition of τ is a generalization to the case of stochastic intensity. Recall now that, since E_1 ~ Exp(1), we have for an arbitrary time T that

P(T < E_1) = exp(−T).

Therefore, we get by independence, and by the fact that λ(X_s) is G_t-measurable for all s < t, that

P(τ > t | G_t) = P(∫_0^t λ(X_s) ds < E_1 | G_t) = exp(−∫_0^t λ(X_s) ds).
This identity gives us, in analogy with the deterministic case studied in survival analysis, the interpretation of λ(X_t) as hazard rate process. Our goal is now to calculate the expected discounted value p(0, t) of a survivor bond, which pays out one unit if the policyholder has survived up to time t (actuarial interpretation), or of a defaultable bond, which pays out one unit if t is smaller than the default time (credit risk interpretation). In both cases the payout function of the claim is 1_{τ>t}. We get (recall that E[1_{τ>t} | G_t] is just another notation for P(τ > t | G_t))

p(0, t) = E[exp(−∫_0^t r(X_s) ds) 1_{τ>t}]
= E[E[exp(−∫_0^t r(X_s) ds) 1_{τ>t} | G_t]]
= E[exp(−∫_0^t r(X_s) ds) E[1_{τ>t} | G_t]]
= E[exp(−∫_0^t r(X_s) ds) exp(−∫_0^t λ(X_s) ds)]
= E[exp(−∫_0^t (r + λ)(X_s) ds)].
Here the interest rate r(X_s) has been replaced by the actuarial interest rate (r + λ)(X_s). Let now N_t = 1_{τ≤t} be the survival process associated to τ, and write λ_u := λ(X_u). It is possible (but rather tedious) to show that

M_t = N_t − ∫_0^t λ_u 1_{τ>u} du = N_t − ∫_0^{t∧τ} λ_u du

is a martingale with respect to (F_t). We call ∫_0^{t∧τ} λ_u du the compensator of N.
6 Brownian Motion
6.1 Definition and First Properties

Definition 6.1 B is a standard (one-dimensional) Brownian motion if it satisfies the following conditions: (B_0 = 0 and)
(i) Its increments are independent of the past.
(ii) It has stationary increments.
(iii) The mapping t → B_t is continuous.
(iv) B_t is N(0, t) distributed (normal distribution with mean 0 and variance t).

(Recall that the covariance matrix of a random vector Z with means μ_i = E[Z_i] has entries σ_ij = E[(Z_i − μ_i)(Z_j − μ_j)].)
Self-similarity. The processes (B_{ct}) and (√c B_t) have the same distribution (c > 0): speeding up time by a factor of c does not change properties (i)-(iii), but B_{ct} has variance ct. This can also be achieved by multiplying a standard Brownian motion by √c.

Covariance formula. E[B_s B_t] = min(s, t): let s < t. Writing B_t = B_s + (B_t − B_s) and using the fact that B_s and (B_t − B_s) are independent with mean 0, we have

E[B_s B_t] = E[B_s²] + E[B_s (B_t − B_s)] = s + 0 = min(s, t).
Quadratic variation. Note that we had defined the quadratic covariation [X, Y] of two processes X, Y as the limit (the t_i's form a partition Δ_n : 0 = t_0 < t_1 < ... < t_n = t of [0, t])

Σ_i (X_{t_{i+1}} − X_{t_i})(Y_{t_{i+1}} − Y_{t_i}) → [X, Y]_t.

Our goal is now to determine [B, B]_t. We set mesh(Δ_n) = max_i |t_{i+1} − t_i|. Let us now consider finer and finer partitions Δ_m ⊆ Δ_n for m < n such that mesh(Δ_n) → 0 for n → ∞. We have to evaluate the limit of

Q_n(t) = Σ_i (B_{t_{i+1}} − B_{t_i})².
Each increment B_{t_{i+1}} − B_{t_i} is N(0, t_{i+1} − t_i), so E[Q_n(t)] = t and E[(B_{t_{i+1}} − B_{t_i})⁴] = 3(t_{i+1} − t_i)², and consequently

var(Q_n(t)) = Σ_i (3(t_{i+1} − t_i)² − (t_{i+1} − t_i)²)
= 2 Σ_i (t_{i+1} − t_i)² ≤ 2 mesh(Δ_n) t → 0.

Since var(Q_n(t)) = E[(Q_n(t) − t)²], we have proved that Q_n(t) converges to t in mean-square. One can show that this implies that [B, B]_t = t, since [B, B]_t is the limit of Q_n(t) (in an appropriate sense). This result is remarkable: we have seen earlier on that for a PD stochastic process X,

[X, X]_t = Σ_{0≤s≤t} (ΔX_s)²,

which vanishes for continuous paths; so Brownian motion, though continuous, cannot be PD.
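The convergence Q_n(t) → t is easy to observe numerically by summing squared Brownian increments over finer and finer grids (t = 1 and the grid sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(6)
t = 1.0

for n in (10, 100, 1_000, 10_000):
    dB = rng.normal(0.0, np.sqrt(t / n), size=n)  # increments over a grid of n steps
    print(n, (dB ** 2).sum())                     # Q_n(t), approaching t = 1
```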
Martingale property. Using the independence of increments,

E[B_t | F_s] = E[(B_t − B_s) + B_s | F_s]
= E[B_t − B_s] + B_s
= B_s.
Let now

T_a = min{t ≥ 0 : B_t = a}

be the first time the Brownian motion hits a. One can show that

P(T_a < ∞) = 1.

In the following, we will apply the following version of the optional stopping theorem: if M is a martingale with continuous paths and T is a stopping time such that the stopped process (M_{T∧t})_{t≥0} is uniformly integrable, then

E[M_T] = E[M_0].
Example 6.6 Exit distribution from an interval. Define the exit time by (a < 0 < b)

τ = min{t : B_t ∉ (a, b)}.

Then we have

P(B_τ = b) = −a/(b − a) and P(B_τ = a) = b/(b − a).

Proof. Let us first check that τ is a stopping time: whether τ ≤ t can be decided from the path of B up to time t, hence {τ ≤ t} ∈ F_t. The optional stopping theorem then gives

0 = E[B_τ] = a P(B_τ = a) + b P(B_τ = b),

which, together with P(B_τ = a) + P(B_τ = b) = 1, yields the claim.
Next we check that B_t² − t is a martingale. Proof. Given F_s, the value of B_s is known while B_t − B_s is independent of F_s with mean 0 and variance t − s. We get

E[B_t² − t | F_s] = E[(B_s + (B_t − B_s))² | F_s] − t = B_s² + (t − s) − t = B_s² − s.
Example 6.8 Mean exit time from an interval. For a < 0 < b, define the exit time by

τ = min{t : B_t ∉ (a, b)}.

Then

E[τ] = −ab.

Proof. It is possible (but more difficult to show than in the last example) to apply the conclusion of the optional stopping theorem. This gives, with the formula for the exit distribution from the last example,

0 = E[B_τ² − τ] = b² · (−a)/(b − a) + a² · b/(b − a) − E[τ].

Rearranging now gives

E[τ] = (a²b − ab²)/(b − a) = −ab.

We can derive one surprising conclusion from this example. Recall that T_a = min{t ≥ 0 : B_t = a}, and let, for a < 0 < N, τ_{a,N} = min{t : B_t ∉ (a, N)}, so T_a ≥ τ_{a,N} and therefore

E[T_a] ≥ E[τ_{a,N}] = −aN → ∞

as N → ∞. The same conclusion holds for a > 0 by symmetry, so we have for all a ≠ 0 that

E[T_a] = ∞.
7 Stochastic Calculus for Brownian Motion
7.1 Stochastic Integral of Elementary Strategies

Let B_t denote the price of some stock at time t; we assume that B is a Brownian motion (historically, this was one of the oldest models for stock prices). Assume we buy a random amount θ of shares of this stock at time s for the price B_s, wait until time t > s and then sell them for a price B_t. It is possible that θ is negative (this amounts to short-selling the stock). We assume that we have no insider information about future stock prices. Therefore θ should be an F_s-measurable random variable. Our gain (or loss) from this strategy is given as

∫_s^t θ dB_u := θ (B_t − B_s).

We are interested in calculating the conditional mean and variance of our gain in this case. We have, by the tower property, the fact that θ is F_s-measurable, and the independence of B_t − B_s from F_s,

E[θ(B_t − B_s) | F_s] = θ E[(B_t − B_s) | F_s] = θ E[B_t − B_s] = 0.

Moreover,

E[θ²(B_t − B_s)² | F_s] = θ² E[(B_t − B_s)² | F_s] = θ² E[(B_t − B_s)²] = θ²(t − s).
More generally, fix trading dates 0 = T_1 < T_2 < ... < T_{n+1} = T and F_{T_i}-measurable positions θ_i held on (T_i, T_{i+1}], and set

∫_0^T θ_u dB_u := Σ_{i=1}^n θ_i (B_{T_{i+1}} − B_{T_i}).

We call this the stochastic integral of the strategy θ with respect to the Brownian motion B. In this definition it is essential that θ_i is F_{T_i}-measurable. Otherwise we could predict future increments of the stock price and make a riskless profit.
For instance, if we can foresee the future then we can follow the strategy ϑ_i = sign(B_{T_{i+1}} − B_{T_i}) to get

∫_0^T ϑ_u dB_u = Σ_{i=1}^n sign(B_{T_{i+1}} − B_{T_i}) (B_{T_{i+1}} − B_{T_i}) = Σ_{i=1}^n |B_{T_{i+1}} − B_{T_i}| ≥ 0.
The stochastic integral ∫_0^T θ_u dB_u is a random variable. We can define a stochastic integral process as follows for any t ∈ [0, T]:

∫_0^t θ_u dB_u := Σ_{i=1}^n θ_i (B_{T_{i+1}∧t} − B_{T_i∧t}).

Note that if t < T_i < T_{i+1}, then B_{T_{i+1}∧t} − B_{T_i∧t} = 0. If T_i < t < T_{i+1}, then B_{T_{i+1}∧t} − B_{T_i∧t} = B_t − B_{T_i}, and if T_i < T_{i+1} < t, then B_{T_{i+1}∧t} − B_{T_i∧t} = B_{T_{i+1}} − B_{T_i}. We always assume that our strategy is not too 'wild' in the sense that

E[Σ_{i=1}^n θ_i² (T_{i+1} − T_i)] < ∞.
Generalizing our considerations above (trading at one time point only), one can prove the following result.

Theorem 7.1 The stochastic integral process ∫_0^t θ_u dB_u is a martingale with continuous paths. Moreover,

E[(∫_s^t θ_u dB_u)² | F_s] = E[∫_s^t θ_u² du | F_s].

Since we are only trading at n discrete time points, this formula (for s = 0 and t = T) can also be written as

E[(Σ_{i=1}^n θ_i (B_{T_{i+1}} − B_{T_i}))²] = E[Σ_{i=1}^n θ_i² (T_{i+1} − T_i)].
7.2 Stochastic Integral for Left-Continuous Integrands. Itô's Formula.

Let us now consider a left-continuous trading strategy θ which has right limits and which is adapted to (F_t); hence for all t we have that θ_t is F_t-measurable. It is possible to approximate such a strategy by the elementary strategies considered in the last section. One can show that the corresponding elementary stochastic integrals then also approximate a random variable which we call the stochastic integral of θ with respect to B and denote by ∫_0^t θ_u dB_u. If we consider it as a function of t, then we get a stochastic process ∫ θ dB. It has the same two properties as in the last section.

Theorem 7.2 Let θ be left-continuous with right limits, and such that E[∫_0^T θ_u² du] < ∞. Then

(i) ∫ θ dB is a martingale with continuous paths; in particular (for 0 < s < t < T)

E[∫_s^t θ_u dB_u | F_s] = 0,

(ii) (Itô isometry)

E[(∫_s^t θ_u dB_u)² | F_s] = E[∫_s^t θ_u² du | F_s].
Please note that since we assume that F_0 is the trivial σ-algebra (containing only Ω and ∅), in case s = 0 the two formulas read as

E[∫_0^t θ_u dB_u] = 0,
E[(∫_0^t θ_u dB_u)²] = E[∫_0^t θ_u² du].

On the other hand, if one of the integrals is PD (i.e., an integral with respect to du), then we get

[∫ θ dB, ∫ ϑ du]_t = 0.
We would like to develop a calculus for the stochastic integral with respect to Brownian motion. In classical calculus, the key for doing computations is the fundamental theorem relating integration and differentiation, and we want to generalize this. Let f now be a twice continuously differentiable function, and consider the following heuristic motivation. Taylor's theorem tells us that we have

f(y) − f(x) = f'(x)(y − x) + (1/2) f''(x)(y − x)² + R,

where R is some remainder term. We now partition the interval [0, t] as 0 = t_0 < t_1 < ... < t_n = t and write as a telescoping sum

f(B_t) − f(B_0) = Σ_i (f(B_{t_{i+1}}) − f(B_{t_i}))
= Σ_i f'(B_{t_i})(B_{t_{i+1}} − B_{t_i})
+ (1/2) Σ_i f''(B_{t_i})(B_{t_{i+1}} − B_{t_i})² + R.
If we now make the partition finer and finer, the first term on the right-hand side is approximately ∫_0^t f'(B_u) dB_u. What about the second term? We have by the tower property

E[Σ_i f''(B_{t_i})(B_{t_{i+1}} − B_{t_i})²] = E[Σ_i E[f''(B_{t_i})(B_{t_{i+1}} − B_{t_i})² | F_{t_i}]]
= E[Σ_i f''(B_{t_i}) E[(B_{t_{i+1}} − B_{t_i})² | F_{t_i}]]
= E[Σ_i f''(B_{t_i}) E[(B_{t_{i+1}} − B_{t_i})²]]
= E[Σ_i f''(B_{t_i})(t_{i+1} − t_i)],

and the last sum inside the expectation tends with finer and finer partitions to ∫_0^t f''(B_u) du. But one can get even more: it is possible to prove that on a set with probability one we have (at least for certain subsequences of partitions) that

Σ_i f''(B_{t_i})(B_{t_{i+1}} − B_{t_i})² → ∫_0^t f''(B_u) du.
Moreover, the remainder term R goes to zero. We have motivated the most important result in stochastic calculus:

Itô's formula. For f twice continuously differentiable,

f(B_t) − f(B_0) = ∫_0^t f'(B_u) dB_u + (1/2) ∫_0^t f''(B_u) du.
Example 7.4 Take f(x) = x². We have f'(x) = 2x, f''(x) = 2. Itô's formula yields (assuming B_0 = 0)

B_t² = ∫_0^t 2B_u dB_u + (1/2) ∫_0^t 2 du = 2 ∫_0^t B_u dB_u + t.

Rearranging gives us

∫_0^t B_u dB_u = (1/2) B_t² − (1/2) t,

which is indeed surprising if we compare it with

∫_0^t u du = (1/2) t².
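This identity can be checked numerically: the left-point Riemann sums Σ B_{t_i}(B_{t_{i+1}} − B_{t_i}) converge to ½B_t² − ½t (one simulated path, arbitrary step count):

```python
import numpy as np

rng = np.random.default_rng(7)
t, n = 1.0, 100_000

dB = rng.normal(0.0, np.sqrt(t / n), size=n)
B = np.concatenate(([0.0], dB.cumsum()))          # B at the grid points

ito_sum = (B[:-1] * dB).sum()                     # left endpoints: crucial!
print(ito_sum, 0.5 * B[-1] ** 2 - 0.5 * t)        # nearly equal
```

Using right endpoints instead would converge to ½B_t² + ½t, illustrating why the choice of evaluation point matters in stochastic integration.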
There is also a multidimensional, time-dependent generalization of Itô's formula. However, for the purpose of this lecture, we will only need the following corollaries of it.
Theorem 7.5 Time-dependent Itô formula. Let f(t, x) be continuously differentiable: once in t, twice in x. Then

f(t, B_t) − f(0, B_0) = ∫_0^t ∂f/∂t (u, B_u) du + ∫_0^t ∂f/∂x (u, B_u) dB_u + (1/2) ∫_0^t ∂²f/∂x² (u, B_u) du.
Example 7.6 Let f(t, x) = exp(x − t/2). Then ∂f/∂t = −(1/2) exp(x − t/2), ∂f/∂x = exp(x − t/2), ∂²f/∂x² = exp(x − t/2). It follows that

exp(B_t − t/2) = 1 − (1/2) ∫_0^t exp(B_u − u/2) du + ∫_0^t exp(B_u − u/2) dB_u + (1/2) ∫_0^t exp(B_u − u/2) du
= 1 + ∫_0^t exp(B_u − u/2) dB_u.
Theorem 7.7 Let h(t, x) be a space-time harmonic function. That is, it fulfills the partial differential equation

∂h/∂t (t, x) + (1/2) ∂²h/∂x² (t, x) = 0.

Let moreover E[∫_0^t (∂h/∂x (u, B_u))² du] < ∞. Then h(t, B_t) is a martingale, and

h(0, 0) = E[h(t, B_t)].
Theorem 7.8 Integration by parts. We recall the formula from the previous chapter:

X_t Y_t − X_0 Y_0 = ∫_0^t X_u dY_u + ∫_0^t Y_u dX_u + [X, Y]_t.
Let θ, ϑ be strategies such that the stochastic integrals ∫ θ dB, ∫ ϑ dB are well-defined. The derivation of the integration by parts formula in the previous chapter carries over to the Brownian case as well, so we get, noting that all processes involved here are left-continuous and the stochastic integrals have zero initial value, that (setting X = ∫ θ dB, Y = ∫ ϑ dB)

∫_0^t θ_u dB_u · ∫_0^t ϑ_u dB_u = ∫_0^t (∫_0^u θ_v dB_v) ϑ_u dB_u + ∫_0^t (∫_0^u ϑ_v dB_v) θ_u dB_u + ∫_0^t θ_u ϑ_u du.

Example 7.9 We calculate the iterated integral ∫_0^t ∫_0^u B_v dB_v dB_u by using the integration by parts formula for θ = 1, ϑ = B:

∫_0^t ∫_0^u B_v dB_v dB_u = B_t ∫_0^t B_u dB_u − ∫_0^t B_u² dB_u − ∫_0^t B_u du.   (7.1)
Moreover, applying Itô's formula to f(x) = x³/3 gives (1/3) B_t³ = ∫_0^t B_u² dB_u + ∫_0^t B_u du, hence

∫_0^t B_u² dB_u = (1/3) B_t³ − ∫_0^t B_u du.   (7.3)
Example. Let X_t = e^{−λt} ∫_0^t e^{λs} dB_s. We have by integration by parts (using that e^{−λt} is PD with de^{−λt} = −λe^{−λt} dt)

e^{−λt} ∫_0^t e^{λs} dB_s = ∫_0^t e^{−λs} e^{λs} dB_s − ∫_0^t (∫_0^s e^{λu} dB_u) λ e^{−λs} ds
= B_t − λ ∫_0^t X_s ds.
7.3 Itô Processes and Stochastic Differential Equations

Itô processes: these are stochastic processes of the form

X_t = X_0 + ∫_0^t μ_s ds + ∫_0^t σ_s dB_s,

where μ and σ are adapted stochastic processes (which includes the special case when they are deterministic). We sometimes write this formula shorter in differential notation as

dX_t = μ_t dt + σ_t dB_t.

We can calculate the quadratic variation of an Itô process as

[X]_t = [X, X]_t = ∫_0^t σ_s² ds,

or in differential notation

d[X]_t = σ_t² dt.

We have an Itô formula for these kinds of processes as follows:

f(X_t) − f(X_0) = ∫_0^t f'(X_s) dX_s + (1/2) ∫_0^t f''(X_s) d[X]_s
= ∫_0^t f'(X_s) μ_s ds + ∫_0^t f'(X_s) σ_s dB_s + (1/2) ∫_0^t f''(X_s) σ_s² ds,

or in differential notation

df(X_t) = f'(X_t) dX_t + (1/2) f''(X_t) d[X]_t.
A stochastic differential equation (SDE) prescribes the dynamics dX_t = μ(X_t) dt + σ(X_t) dB_t together with a starting value X_0. One can state certain conditions on the functions μ and σ which guarantee the existence of a unique solution, and there is a vast body of knowledge of efficient numerical algorithms to approximate this solution.
Examples:

The SDE

dX_t = −λX_t dt + dB_t

is solved by the Ornstein-Uhlenbeck process X_t = e^{−λt} ∫_0^t e^{λs} dB_s from the previous example (started at X_0 = 0).

The SDE

dX_t = μX_t dt + σX_t dB_t

is solved by geometric Brownian motion,

X_t = X_0 exp((μ − σ²/2) t + σ B_t).
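The second example is a good test case for the numerical schemes mentioned above: an Euler-Maruyama discretization of the SDE can be compared against the exact solution driven by the same Brownian increments (parameter values arbitrary):

```python
import numpy as np

rng = np.random.default_rng(8)
mu, sigma, x0, t, n = 0.1, 0.3, 1.0, 1.0, 10_000

dt = t / n
dB = rng.normal(0.0, np.sqrt(dt), size=n)

x = x0
for db in dB:                       # Euler-Maruyama: X <- X + mu X dt + sigma X dB
    x += mu * x * dt + sigma * x * db

B_t = dB.sum()
exact = x0 * np.exp((mu - 0.5 * sigma**2) * t + sigma * B_t)
print(x, exact)                     # close for small dt
```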
Consider now the SDE

dX_t = μX_t dt + σX_t dB_t.

Fix T > 0, and let 0 ≤ t ≤ T. For any function h, we define a function F(t, x) by considering the conditional expectation

F(t, X_t) = E[e^{−r(T−t)} h(X_T) | F_t].   (7.4)

Then F solves the partial differential equation

∂F/∂t (t, x) + μx ∂F/∂x (t, x) + (1/2) σ²x² ∂²F/∂x² (t, x) = r F(t, x),
F(T, x) = h(x).
Proof. We have

e^{−rt} F(t, X_t) = E[e^{−rT} h(X_T) | F_t].

By the tower property, and the definition of F,

E[e^{−rt} F(t, X_t) | F_s] = E[E[e^{−rT} h(X_T) | F_t] | F_s] = E[e^{−rT} h(X_T) | F_s] = e^{−rs} F(s, X_s),

so e^{−rt} F(t, X_t) is a martingale. On the other hand, by the time-dependent Itô formula,

d(e^{−rt} F(t, X_t)) = e^{−rt} (−rF + ∂F/∂t + μX_t ∂F/∂x + (1/2) σ²X_t² ∂²F/∂x²) dt + e^{−rt} σX_t ∂F/∂x dB_t.

Because we have a martingale, the dt-term must be equal to zero, and the partial differential equation follows.
Remark 7.13 Often the converse of this statement is used: given a solution to the aforementioned PDE, it can be shown under certain conditions that it equals the conditional expectation (7.4).
8 Continuous-Time Markov Chains

Example. The previous example generalizes easily in the following way: take a discrete-time Markov chain and assign to each state a rate λ_i. If X_t is in a state i with λ_i = 0 then X_t stays there forever. If λ_i > 0, X_t stays at i for an exponentially distributed amount of time with rate λ_i, then goes to state j with transition probability p(i, j). The lack of memory property of the exponential distribution implies that given the present state, the rest of the past is irrelevant for predicting the future.
Example 8.2 For a Poisson process with intensity λ, the transition probability to go from n to n + 1 in a small time h is given for all n ≥ 0 as

p_h(n, n + 1) = 1 − e^{−λh} + o(h) = λh + o(h),

so we get for the jump rate q(n, n + 1) = lim_{h↓0} p_h(n, n + 1)/h that

q(n, n + 1) = λ.
We introduce a new matrix Q with entries

Q(i, j) = q(i, j) if j ≠ i, and Q(i, i) = −λ_i,

where λ_i = Σ_{j≠i} q(i, j). Note that the off-diagonal elements q(i, j) with i ≠ j are nonnegative, while the diagonal entry −λ_i is a negative number chosen to make the row sums equal to 0. A computation based on the Chapman-Kolmogorov equation now reveals the connection between the matrices P_t and Q (you can look up the details in the book by Durrett, Ch. 4.2): P_t = e^{Qt}, and the derivative satisfies P'_t = P_t Q = Q P_t.
In general, matrices do not commute: AB ≠ BA; but the explanation is that here P_t = e^{Qt} is made up of powers of Q:

Q exp(Qt) = Q Σ_{n=0}^∞ (Qt)^n/n! = (Σ_{n=0}^∞ (Qt)^n/n!) Q = exp(Qt) Q.
To check this we differentiate the formula for p_t(i, j) as above to get for j > i

d/dt p_t(i, j) = −λ e^{−λt} (λt)^{j−i}/(j − i)! + λ e^{−λt} (λt)^{j−i−1}/(j − i − 1)! = −λ p_t(i, j) + λ p_t(i + 1, j).

When j = i, p_t(i, i) = e^{−λt}, so the derivative is

d/dt p_t(i, i) = −λ e^{−λt} = −λ p_t(i, i) = −λ p_t(i, i) + λ p_t(i + 1, i),

since p_t(i + 1, i) = 0.
A distribution π is stationary if π^T P_t = π^T for all t. This is difficult to check directly since it involves all of the P_t, which in addition are usually not that easy to compute. It would be better to have a condition which only involves the single matrix Q, which we usually get easily from the problem under consideration.

Theorem 8.6 π is a stationary distribution if and only if π^T Q = 0.
Proof. We just prove the 'if' part. We have, since P_t = exp(Qt),

π^T P_t = π^T exp(Qt)
= π^T Σ_{n=0}^∞ (t^n/n!) Q^n
= π^T (I + Σ_{n=1}^∞ (t^n/n!) Q^n)
= π^T + Σ_{n=1}^∞ (t^n/n!) (π^T Q) Q^{n−1}
= π^T.
Definition 8.7 A chain is called irreducible if for any two states i and j it is possible to get from i to j in a finite number of steps.
In the weather chain example, the relation π^T Q = 0 leads to three equations, which we solve to get π(1) = 3/8, π(2) = 4/8, π(3) = 1/8.
Definition 8.9 A birth and death chain has state space S = {0, 1, ..., N} and jump rates

q(n, n + 1) = λ_n for n < N,
q(n, n − 1) = μ_n for n > 0.

For such chains the stationary distribution can be found from the detailed balance condition

π(k) q(k, j) = π(j) q(j, k) for all j, k,

since then

(π^T Q)_j = Σ_{k≠j} π(k) q(k, j) − π(j) λ_j = 0.

Note: the detailed balance condition is not a necessary condition for π being stationary. Indeed, in the weather example it is not satisfied. However, it always holds for birth and death chains.
Example. Barbershop. A barber can cut hair at rate 3, where the units are people per hour, i.e. each haircut requires an exponentially distributed amount of time with mean 20 minutes. Suppose customers arrive according to a Poisson process with rate 2, but leave if both chairs in the waiting room are full. What fraction of time will both chairs be full? In the long run, how many customers does the barber serve per hour?

Solution: we define our state to be the number of customers in the system, so S = {0, 1, 2, 3}. From the problem description it is clear that

q(i, i − 1) = 3 for i = 1, 2, 3,
q(i, i + 1) = 2 for i = 0, 1, 2.
The detailed balance conditions say

2π(0) = 3π(1), 2π(1) = 3π(2), 2π(2) = 3π(3).

This together with π(0) + π(1) + π(2) + π(3) = 1 yields

π(0) = 27/65, π(1) = 18/65, π(2) = 12/65, π(3) = 8/65.

From this we see that 8/65 of the time both chairs are full, so that fraction of the arrivals is lost, and hence 57/65, or 87.7%, of the customers enter service. Since the original arrival rate is 2, this means that the barber serves an average of 114/65 ≈ 1.754 customers per hour.
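The detailed balance recursion makes such birth and death chains easy to solve programmatically; a sketch for the barbershop (rates as above):

```python
import numpy as np

birth, death, N = 2.0, 3.0, 3      # arrival rate, service rate, max queue state

pi = [1.0]                         # unnormalized: pi(0) = 1
for n in range(N):
    pi.append(pi[-1] * birth / death)   # detailed balance: birth*pi(n) = death*pi(n+1)

pi = np.array(pi) / sum(pi)
print(pi)                               # [27/65, 18/65, 12/65, 8/65]
print(birth * (1 - pi[-1]))             # customers served per hour: 114/65 ~ 1.754
```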
Example. Queuing chain. A bank has one teller to serve customers, who need an exponential amount of service with rate μ and queue in a line if the server is busy. Customers arrive at the times of a Poisson process with rate λ, but only join the queue with probability a_n if there are n customers in line. We have

q(n, n + 1) = λ a_n,
q(n, n − 1) = μ.

A natural choice which prevents the queue length from growing out of control is a_n = 1/(n + 1); the detailed balance condition then reads

μ π(n + 1) = λ a_n π(n), i.e. π(n + 1) = (λ a_n/μ) π(n).
Iterating,

π(n) = (λ/(nμ)) π(n − 1)
= ((λ/μ)²/(n(n − 1))) π(n − 2)
= ...
= ((λ/μ)^n/n!) π(0).
To find the stationary distribution we want to find c = π(0) such that

c Σ_{n=0}^∞ (λ/μ)^n/n! = 1.

As the sum is the series for the exponential function, we get c = exp(−λ/μ) and the stationary distribution is Poisson with parameter λ/μ.