6.262 Discrete Stochastic Processes - Notes - 0. Course Content


DISCRETE STOCHASTIC PROCESSES

Draft of 2nd Edition

R. G. Gallager

January 31, 2011

Preface

These notes are a draft of a major rewrite of a text [9] of the same name. The notes and the
text are outgrowths of lecture notes developed over some 20 years for the M.I.T. graduate
subject 6.262, entitled ‘Discrete Stochastic Processes.’ The original motivation for 6.262
was to provide some of the necessary background for Electrical Engineering, Computer
Science, and Operations Research students working in the burgeoning field of computer-
communication networks. Many of the most important and challenging problems in this
area involve queueing and congestion, but the literature on queueing and congestion was
rapidly becoming so diffuse that a more cohesive approach was needed. Queueing problems
are examples of stochastic processes, and more particularly, discrete stochastic processes.
Discrete stochastic processes form a cohesive body of study, allowing queueing and congestion
problems to be discussed where they naturally arise.
In the intervening years, it has become increasingly apparent that many problems involving
uncertainty in almost all branches of technology and human affairs provide interesting and
important examples of discrete stochastic processes. Discussing these problems as they arise
both increases the application domain of this subject and also enhances our intuition and
understanding about the general principles.
The purpose of this text is both to help students understand the general principles of
discrete stochastic processes, and to develop the understanding and intuition necessary
to apply stochastic process models to problems in engineering, science, and operations
research. Although the ultimate objective is applications, there is relatively little detailed
description of real applications. Rather, there are many simple examples designed both to
build insight about particular principles in stochastic processes and to illustrate the generic
effect of these phenomena in real systems. I am convinced that the “case study” method, in
which many applications are studied in the absence of general principles, is inappropriate
for understanding stochastic processes (and similarly inappropriate for any field that has a
rich and highly cohesive mathematical structure).
When we try to either design new kinds of systems or understand physical phenomena, we
usually employ a variety of stochastic process models to gain understanding about different
tradeoffs and aspects of the problem. Creating these models requires deep understanding
both of the application area and of the structure of stochastic processes. The application
areas are too broad, and the structure too deep, to do all this in one text. My experience
indicates that engineers rarely have difficulty applying well-understood theories and tools
to well-understood application areas. The difficulty comes when the theoretical structure
is not understood on both an intuitive and mathematical level. The “back of the envelope
calculations” that we so prize as engineers are the result of this deep understanding of both
application areas and conceptual structure.
I try to present the structural aspects of stochastic processes in the simplest possible light
here, thus helping readers develop insight. This requires somewhat more abstraction than
engineers are used to, but much less than often appears in mathematics texts. It also
requires students to spend less time doing complex calculations and more time drawing
illustrative diagrams and thinking. The proofs and explanations here are meant to be read,
not because students might doubt the result, but to enhance understanding. In order to use
these results in modeling real situations, the robustness of the results must be understood
at an intuitive level, and this is gained only by understanding why the results are true, and
why they fail when the required conditions are unsatisfied.
Students learn about new concepts in many ways, partly by learning facts, partly by doing
exercises, and partly by successively refining an internal structural picture of what the
subject is about. The combination of all of these leads to understanding and the ability to
create models for real problems. This ability to model, however, requires much more than
the “plug and chug” of matching exercises to formulas and theorems. The ability to model
is based on understanding at an intuitive level, backed by mathematics.
Stochastic processes is the branch of probability dealing with probabilistic systems that
evolve in time. By discrete stochastic processes, I mean processes in which changes occur
only at discrete times separated by either deterministic or random intervals. In particular,
we do not treat noise processes such as Gaussian processes. This distinction between
discrete and non-discrete processes is somewhat artificial, but is dictated by two practical
considerations. The first is that many engineering graduate students take a course involving
noise, second moment theory, and inference, including detection and estimation (the material
in such subjects is more standard than the title). Such a course has much cohesion and
fits nicely into one academic term, but has relatively little conceptual overlap with the material
here. The second consideration is that extensions of the material here to continuous
processes often obscure the probabilistic ideas with mathematical technicalities.
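A minimal sketch of the definition above, assuming nothing beyond the statement that changes occur only at discrete times separated by deterministic or random intervals. The function name `change_epochs`, the unit spacing, and the exponential intervals with rate 2.0 are illustrative assumptions, not taken from the notes.

```python
import random

def change_epochs(n, next_interval):
    """Return the first n times at which the process changes.

    next_interval is a zero-argument callable returning the positive
    gap between one change and the next (hypothetical helper).
    """
    t = 0.0
    epochs = []
    for _ in range(n):
        t += next_interval()   # advance by one inter-change interval
        epochs.append(t)
    return epochs

# Deterministic intervals: changes at evenly spaced times.
deterministic = change_epochs(5, lambda: 1.0)

# Random intervals: here exponential with an assumed rate of 2.0.
rng = random.Random(0)
random_gaps = change_epochs(5, lambda: rng.expovariate(2.0))

print(deterministic)
print(random_gaps)
```

In both cases the process is described entirely by the sequence of change times; only the rule that generates the intervals differs.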
The mathematical concepts here are presented without measure theory, but a little mathematical
analysis is required and developed as used. The material requires more patience
and more mathematical abstraction than many engineering students are used to, but that
is balanced by a minimum of ‘plug and chug’ exercises. If you find yourself writing many
equations in an exercise, stop and think, because there is usually an easier way. In the theorems,
proofs, and explanations, I have tried to favor simplicity over generality and clarity
over conciseness (although this will often not be apparent on a first reading). I have provided
references rather than proofs for a number of important results where the techniques
in the proof will not be reused and provide little intuition. Numerous examples are given
showing how results fail to hold when not all the conditions are satisfied. Understanding is
often as dependent on a collection of good counterexamples as on knowledge of theorems.
In engineering, there is considerable latitude in generating mathematical models for real
problems. Thus it is more important to have a small set of well-understood tools than a
large collection of very general but less intuitive results.
Most results here are quite old and well established, so I have not made any effort to
attribute results to investigators, most of whom are long dead or retired. The organization
of the material owes a great deal, however, to Sheldon Ross’s book, Stochastic Processes,
[16] and to William Feller’s classic books, Probability Theory and its Applications, [7] and
[8].
Contents

1 INTRODUCTION AND REVIEW OF PROBABILITY 1

1.1 Probability models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.1.1 The sample space of a probability model . . . . . . . . . . . . . . . . 3

1.1.2 Assigning probabilities for finite sample spaces . . . . . . . . . . . . 4

1.2 The axioms of probability theory . . . . . . . . . . . . . . . . . . . . . . . . 5

1.2.1 Axioms for events . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

1.2.2 Axioms of probability . . . . . . . . . . . . . . . . . . . . . . . . . . 7

1.3 Probability review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

1.3.1 Conditional probabilities and statistical independence . . . . . . . . 9

1.3.2 Repeated idealized experiments . . . . . . . . . . . . . . . . . . . . . 10

1.3.3 Random variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

1.3.4 Multiple random variables and conditional probabilities . . . . . . . 13

1.3.5 Stochastic processes and the Bernoulli process . . . . . . . . . . . . 15

1.3.6 Expectation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

1.3.7 Random variables as functions of other random variables . . . . . . 23

1.3.8 Conditional expectations . . . . . . . . . . . . . . . . . . . . . . . . 25

1.3.9 Indicator random variables . . . . . . . . . . . . . . . . . . . . . . . 28

1.3.10 Moment generating functions and other transforms . . . . . . . . . . 28

1.4 Basic inequalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

1.4.1 The Markov inequality . . . . . . . . . . . . . . . . . . . . . . . . . . 30

1.4.2 The Chebyshev inequality . . . . . . . . . . . . . . . . . . . . . . . . 31

1.4.3 Chernoff bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

1.5 The laws of large numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

1.5.1 Weak law of large numbers with a finite variance . . . . . . . . . . . 35

1.5.2 Relative frequency . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

1.5.3 The central limit theorem . . . . . . . . . . . . . . . . . . . . . . . . 38

1.5.4 Weak law with an infinite variance . . . . . . . . . . . . . . . . . . . 42

1.5.5 Convergence of random variables . . . . . . . . . . . . . . . . . . . . 44

1.5.6 Convergence with probability 1 . . . . . . . . . . . . . . . . . . . . . 47

1.6 Relation of probability models to the real world . . . . . . . . . . . . . . . . 49

1.6.1 Relative frequencies in a probability model . . . . . . . . . . . . . . 50

1.6.2 Relative frequencies in the real world . . . . . . . . . . . . . . . . . . 51

1.6.3 Statistical independence of real-world experiments . . . . . . . . . . 53

1.6.4 Limitations of relative frequencies . . . . . . . . . . . . . . . . . . . 54

1.6.5 Subjective probability . . . . . . . . . . . . . . . . . . . . . . . . . . 55

1.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

1.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

2 POISSON PROCESSES 69

2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

2.1.1 Arrival processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

2.2 Definition and properties of a Poisson process . . . . . . . . . . . . . . . . . 71

2.2.1 Memoryless property . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

2.2.2 Probability density of $S_n$ and $S_1, \ldots, S_n$ . . . . . . . . . . . . . . 75

2.2.3 The PMF for $N(t)$ . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

2.2.4 Alternate definitions of Poisson processes . . . . . . . . . . . . . . . 78

2.2.5 The Poisson process as a limit of shrinking Bernoulli processes . . . 79

2.3 Combining and splitting Poisson processes . . . . . . . . . . . . . . . . . . . 82

2.3.1 Subdividing a Poisson process . . . . . . . . . . . . . . . . . . . . . . 83

2.3.2 Examples using independent Poisson processes . . . . . . . . . . . . 85

2.4 Non-homogeneous Poisson processes . . . . . . . . . . . . . . . . . . . . . . 86

2.5 Conditional arrival densities and order statistics . . . . . . . . . . . . . . . . 89

2.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93

2.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94

3 FINITE-STATE MARKOV CHAINS 103

3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103

3.2 Classification of states . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

3.3 The matrix representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110

3.3.1 Steady state and $[P^n]$ for large $n$ . . . . . . . . . . . . . . . . . . . 111

3.3.2 Steady state assuming $[P] > 0$ . . . . . . . . . . . . . . . . . . . . 113

3.3.3 Ergodic Markov chains . . . . . . . . . . . . . . . . . . . . . . . . . . 114

3.3.4 Ergodic unichains . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115

3.3.5 Arbitrary finite-state Markov chains . . . . . . . . . . . . . . . . . . 117

3.4 The eigenvalues and eigenvectors of stochastic matrices . . . . . . . . . . . 118

3.4.1 Eigenvalues and eigenvectors for M = 2 states . . . . . . . . . . . . . 118

3.4.2 Eigenvalues and eigenvectors for M > 2 states . . . . . . . . . . . . . 120

3.5 Markov chains with rewards . . . . . . . . . . . . . . . . . . . . . . . . . . . 122

3.5.1 Examples of Markov chains with rewards . . . . . . . . . . . . . . . 123

3.5.2 The expected aggregate reward over multiple transitions . . . . . . . 125

3.5.3 The expected aggregate reward with an additional final reward . . . 128

3.6 Markov decision theory and dynamic programming . . . . . . . . . . . . . . 129

3.6.1 Dynamic programming algorithm . . . . . . . . . . . . . . . . . . . . 130

3.6.2 Optimal stationary policies . . . . . . . . . . . . . . . . . . . . . . . 135

3.6.3 Policy improvement and the search for optimal stationary policies . . 137

3.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141

3.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142

4 RENEWAL PROCESSES 156

4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156

4.2 The strong law of large numbers and convergence WP1 . . . . . . . . . . . 159

4.2.1 Convergence with probability 1 (WP1) . . . . . . . . . . . . . . . . . 159

4.2.2 Strong law of large numbers (SLLN) . . . . . . . . . . . . . . . . . . 161

4.3 Strong law for renewal processes . . . . . . . . . . . . . . . . . . . . . . . . 162

4.4 Renewal-reward processes; time-averages . . . . . . . . . . . . . . . . . . . . 167

4.4.1 General renewal-reward processes . . . . . . . . . . . . . . . . . . . . 170

4.5 Random stopping trials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174

4.5.1 Wald’s equality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176

4.5.2 Applying Wald’s equality to $m(t) = E[N(t)]$ . . . . . . . . . . . . . 178

4.5.3 Stopping trials, embedded renewals, and G/G/1 queues . . . . . . . 180

4.5.4 Little’s theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182

4.5.5 Expected queueing time for an M/G/1 queue . . . . . . . . . . . . . 186

4.6 Expected number of renewals . . . . . . . . . . . . . . . . . . . . . . . . . . 188

4.6.1 Laplace transform approach . . . . . . . . . . . . . . . . . . . . . . . 189

4.6.2 The elementary renewal theorem . . . . . . . . . . . . . . . . . . . . 191

4.7 Renewal-reward processes; ensemble-averages . . . . . . . . . . . . . . . . . 193

4.7.1 Age and duration for arithmetic processes . . . . . . . . . . . . . . . 194

4.7.2 Joint age and duration: non-arithmetic case . . . . . . . . . . . . . . 198

4.7.3 Age $Z(t)$ for finite $t$: non-arithmetic case . . . . . . . . . . . . . . 199

4.7.4 Age $Z(t)$ as $t \to \infty$: non-arithmetic case . . . . . . . . . . . . . 202

4.7.5 Arbitrary renewal-reward functions: non-arithmetic case . . . . . . . 204

4.8 Delayed renewal processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206

4.8.1 Delayed renewal-reward processes . . . . . . . . . . . . . . . . . . . . 208

4.8.2 Transient behavior of delayed renewal processes . . . . . . . . . . . . 209

4.8.3 The equilibrium process . . . . . . . . . . . . . . . . . . . . . . . . . 210

4.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210

4.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211

5 COUNTABLE-STATE MARKOV CHAINS 227

5.1 Introduction and classification of states . . . . . . . . . . . . . . . . . . . . 227

5.1.1 Using renewal theory to classify and analyze Markov chains . . . . . 230

5.2 Birth-death Markov chains . . . . . . . . . . . . . . . . . . . . . . . . . . . 239

5.3 Reversible Markov chains . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240

5.4 The M/M/1 sample-time Markov chain . . . . . . . . . . . . . . . . . . . . 244

5.5 Branching processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247

5.6 Round-robin and processor sharing . . . . . . . . . . . . . . . . . . . . . . . 249

5.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254

5.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256

6 MARKOV PROCESSES WITH COUNTABLE STATE SPACES 261

6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261

6.1.1 The sampled-time approximation to a Markov process . . . . . . . . 265

6.2 Steady-state behavior of irreducible Markov processes . . . . . . . . . . . . 266

6.2.1 Renewals on successive entries to a given state . . . . . . . . . . . . 267

6.2.2 The limiting fraction of time in each state . . . . . . . . . . . . . . . 268

6.2.3 Finding $\{p_j(i);\ j \ge 0\}$ in terms of $\{\pi_j;\ j \ge 0\}$ . . . . . . . . . . 269

6.2.4 Solving for the steady-state process probabilities directly . . . . . . 272

6.2.5 The sampled-time approximation again . . . . . . . . . . . . . . . . 273

6.2.6 Pathological cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273

6.3 The Kolmogorov differential equations . . . . . . . . . . . . . . . . . . . . . 274

6.4 Uniformization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 278

6.5 Birth-death processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279

6.6 Reversibility for Markov processes . . . . . . . . . . . . . . . . . . . . . . . 281

6.7 Jackson networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287

6.7.1 Closed Jackson networks . . . . . . . . . . . . . . . . . . . . . . . . . 292

6.8 Semi-Markov processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294

6.8.1 Example — the M/G/1 queue . . . . . . . . . . . . . . . . . . . . . 297

6.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298

6.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300

7 RANDOM WALKS, LARGE DEVIATIONS, AND MARTINGALES 313

7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313

7.1.1 Simple random walks . . . . . . . . . . . . . . . . . . . . . . . . . . 314

7.1.2 Integer-valued random walks . . . . . . . . . . . . . . . . . . . . . . 315

7.1.3 Renewal processes as special cases of random walks . . . . . . . . . . 315

7.2 The queueing delay in a G/G/1 queue . . . . . . . . . . . . . . . . . . . . . 315

7.3 Detection, decisions, and hypothesis testing . . . . . . . . . . . . . . . . . . 319

7.3.1 The error curve and the Neyman-Pearson rule . . . . . . . . . . . . 322

7.4 Threshold crossing probabilities in random walks . . . . . . . . . . . . . . . 328

7.4.1 The Chernoff bound . . . . . . . . . . . . . . . . . . . . . . . . . . . 329

7.4.2 Tilted probabilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . 330

7.4.3 Back to threshold crossings . . . . . . . . . . . . . . . . . . . . . . . 332

7.5 Thresholds, stopping rules, and Wald’s identity . . . . . . . . . . . . . . . . 333

7.5.1 Wald’s identity for two thresholds . . . . . . . . . . . . . . . . . . . 335

7.5.2 The relationship of Wald’s identity to Wald’s equality . . . . . . . . 335

7.5.3 Zero-mean simple random walks . . . . . . . . . . . . . . . . . . . . 336

7.5.4 Exponential bounds on the probability of threshold crossing . . . . . 337

7.5.5 Binary hypotheses testing with IID observations . . . . . . . . . . . 339

7.5.6 Sequential decisions for binary hypotheses . . . . . . . . . . . . . . . 340

7.5.7 Joint distribution of crossing time and barrier . . . . . . . . . . . . . 342

7.6 Martingales . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343

7.6.1 Simple examples of martingales . . . . . . . . . . . . . . . . . . . . . 344

7.6.2 Scaled branching processes . . . . . . . . . . . . . . . . . . . . . . . 345

7.6.3 Partial isolation of past and future in martingales . . . . . . . . . . 345

7.7 Submartingales and supermartingales . . . . . . . . . . . . . . . . . . . . . 346

7.8 Stopped processes and stopping trials . . . . . . . . . . . . . . . . . . . . . 349

7.9 The Kolmogorov inequalities . . . . . . . . . . . . . . . . . . . . . . . . . . 352

7.9.1 The strong law of large numbers (SLLN) . . . . . . . . . . . . . . . . 354

7.9.2 The martingale convergence theorem . . . . . . . . . . . . . . . . . . 355

7.10 Markov modulated random walks . . . . . . . . . . . . . . . . . . . . . . . . 356

7.10.1 Generating functions for Markov random walks . . . . . . . . . . . . 358

7.10.2 Stopping trials for martingales relative to a process . . . . . . . . . . 359

7.10.3 Markov modulated random walks with thresholds . . . . . . . . . . . 360

7.11 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 361

7.12 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 363

A Table of standard random variables 372

MIT OpenCourseWare
http://ocw.mit.edu

6.262 Discrete Stochastic Processes


Spring 2011

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
