C.W. Gardiner
Handbook of Stochastic Methods
for Physics, Chemistry and the Natural Sciences
Second Edition
With 29 Figures

Springer

Professor Dr. Crispin W. Gardiner, Ph.D.
Physics Department, Victoria University of Wellington, Wellington, New Zealand

Series Editor:
Professor Dr. Dr. h.c. mult. Hermann Haken
Institut für Theoretische Physik und Synergetik der Universität Stuttgart, D-70550 Stuttgart, Germany
and
Center for Complex Systems, Florida Atlantic University, Boca Raton, FL 33431, USA

CIP data applied for. Die Deutsche Bibliothek - CIP-Einheitsaufnahme: Gardiner, Crispin W.: Handbook of stochastic methods: for physics, chemistry and the natural sciences / C.W. Gardiner. - 2. ed., 4. printing. - Berlin; Heidelberg; New York; Barcelona; Budapest; Hong Kong; London; Milan; Paris; Santa Clara; Singapore; Tokyo: Springer, 1996 (Springer Series in Synergetics; Vol. 13) ISBN 3-540-61634-9 NE: GT

2nd Edition 1985
4th Printing 1997

ISSN 0172-7389
ISBN 3-540-61634-9 Study Edition Springer-Verlag Berlin Heidelberg New York
ISBN 3-540-15607-0 2nd Edition Springer-Verlag Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law.

© Springer-Verlag Berlin Heidelberg 1983, 1985
Printed in Germany

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

SPIN 10547672 55/3144-543210 - Printed on acid-free paper

Foreword

In the past, the Springer Series in Synergetics has consisted predominantly of conference proceedings on this new interdisciplinary field, a circumstance dictated by its rapid growth. As synergetics matures, it becomes more and more desirable to present the relevant experimental and theoretical results in a coherent fashion and to provide students and research workers with fundamental "know-how" by means of texts and monographs.

From the very beginning, we have stressed that the formation of spatial, temporal, or functional structures by complex systems can be adequately dealt with only if stochastic processes are properly taken into account. For this reason, I gave an introduction to these processes in my book Synergetics. An Introduction, Volume 1 of this series. But research workers and students wanting to penetrate the theory of these processes more deeply were quite clearly in need of a far more comprehensive text. This gap has been filled by the present book by Professor Crispin Gardiner. It provides a solid basis for forthcoming volumes in the series which draw heavily on the methods and concepts of stochastic processes. These include Noise-Induced Transitions, by W. Horsthemke and R. Lefever, The Kinetic Theory of Electromagnetic Processes, by Y. L. Klimontovich, and Concepts and Models of a Quantitative Sociology, by W. Weidlich and G. Haag.

Though synergetics provides us with rather general concepts, it is by no means "art pour l'art".
On the contrary, the processes it deals with are of fundamental importance in self-organizing systems such as those of biology, and in the construction of devices, e.g., in electronics. Therefore I am particularly pleased that the present book has been written by a scientist who has himself applied — and even developed — such methods in the theory of random processes, for example in the fields of quantum optics and chemical reactions. Professor Gardiner's book will prove most useful not only to students and scientists working in synergetics, but also to a much wider audience interested in the theory of random processes and its important applications to a variety of fields.

H. Haken

Preface to the Corrected Printing

Since I started writing this book ten years ago, a great deal has happened. I have been gratified to find how popular my exposition has become, and of course continually bemused that errors still come to light. I am very grateful to all those who have pointed them out to me, in particular to Matthew Collett, Scott Parkins, and Andrew Smith, who, as students and colleagues, over the last five years have kept me aware of everything they noticed. As well, I must also thank Prof. Urbaan Titulaer and Mr. Alexander Kainz, of the Johannes Kepler University of Linz, who sent me a very full and careful list of corrections. As a consequence a number of corrections have been made in this second printing of the second edition. The most significant of these is the removal of the converse result of Sect. 3.7.3b, which was incorrectly derived, and which is probably not true.

At this time I must also express my thanks to my wife Helen May and my youngest daughter Nell, who have been of such support in the years since this book was written.

Pasadena, California, October 1989
C.W. Gardiner

Preface to the Second Edition

In this edition I have corrected a number of misprints, and made a few alterations of a more substantial kind. In particular, I have rewritten Sections 4.2.3 and 4.3.6, using a more correct definition of the Stratonovich stochastic integral; I have clarified a slightly confusing exposition on boundaries in Section 5.2.1c; and I have rewritten Sections 6.3.3 and 6.4.4c to take account of recent progress in these fields. I have also slightly augmented the bibliography and references.

Pasadena, California, March 1985
C. W. Gardiner

Preface to the First Edition

My intention in writing this book was to put down in relatively simple language and in a reasonably deductive form, all those formulae and methods which have been scattered throughout the scientific literature on stochastic methods throughout the eighty years that they have been in use. This might seem an unnecessary aim, since there are scores of books entitled "Stochastic Processes" and similar titles, but careful perusal of these soon shows that their aim does not coincide with mine. There are purely theoretical and highly mathematical books, there are books related to electrical engineering or communication theory, and there are books for biologists — many of them very good, but none of them covering the kind of applications that appear nowadays so frequently in Statistical Physics, Physical Chemistry, Quantum Optics and Electronics, and a host of other theoretical subjects that form part of the subject area of Synergetics, to which series this book belongs.
The main new point of view here is the amount of space which deals with methods of approximating problems, or transforming them for the purpose of approximating them. I am fully aware that many workers will not see their methods here. But my criterion here has been whether an approximation is systematic. Many approximations are based on unjustifiable or uncontrollable assumptions, and are justified a posteriori. Such approximations are not the subject of a systematic book — at least, not until they are properly formulated, and their range of validity controlled. In some cases I have been able to put certain approximations on a systematic basis, and they appear here — in other cases I have not. Others have been excluded on the grounds of space and time, and I presume there will even be some that have simply escaped my attention.

A word on the background assumed. The reader must have a good knowledge of practical calculus including contour integration, matrix algebra, differential equations, both ordinary and partial, at the level expected of a first degree in applied mathematics, physics or theoretical chemistry. This is not a textbook for a particular course, though it includes matter that has been used in the University of Waikato in a graduate course in physics. It contains material which I would expect any student completing a doctorate in our quantum optics and stochastic processes theory group to be familiar with. There is thus a certain bias towards my own interests, which is the prerogative of an author.

I expect the readership to consist mainly of theoretical physicists and chemists, and thus the general standard is that of these people. This is not a rigorous book in the mathematical sense, but it contains results, all of which I am confident are provable rigorously, and whose proofs can be developed out of the demonstrations given.

The organisation of the book is as in the following table, and might raise some eyebrows. For, after introducing the general properties of Markov processes, I have chosen to base the treatment on the conceptually difficult but intuitively appealing concept of the stochastic differential equation. I do this because of my own experience of the simplicity of stochastic differential equation methods, once one has become familiar with the Ito calculus, which I have presented in Chapter 4 in a rather straightforward manner, such as I have not seen in any previous text. It is true that there is nothing in a stochastic differential equation that is not in a Fokker-Planck equation, but the stochastic differential equation is so much easier to write down and manipulate that only an excessively zealous purist would try to eschew the technique. On the other hand, only similar purists of an opposing camp would try to develop the theory without the Fokker-Planck equation, so Chapter 5 introduces this as a complementary and sometimes overlapping method of handling the same problem. Chapter 6 completes what may be regarded as the "central core" of the book with a treatment of the two main analytical approximation techniques: small noise expansions and adiabatic elimination. The remainder of the book is built around this core, since very many methods of treating the jump processes in Chapter 7 and the spatially distributed systems, themselves best treated as jump processes, depend on reducing the system to an approximating diffusion process.
Thus, although logically the concept of a jump process is much simpler than that of a diffusion process, analytically, and in terms of computational methods, the reverse is true. Chapter 9 is included because of the practical importance of bistability and, as indicated, it is almost independent of all but the first five chapters. Again, I have included only systematic methods, for there is a host of ad hoc methods in this field. Chapter 10 requires some knowledge of quantum mechanics. I hope it will be of interest to mathematicians who study stochastic processes because there is still much to be done in this field which is of great practical importance and which naturally introduces new realms in stochastic processes — in particular, the rather fascinating field of stochastic processes in the complex plane, which turn up as the only way of reducing quantum processes to ordinary stochastic processes. It is with some disappointment that I have noted a tendency among mathematicians to look the other way when quantum Markov processes are mentioned, for there is much to be done here. For example, I know of no treatment of escape problems in quantum Markov systems.

It is as well to give some idea of what is not here. I deal entirely with Markov processes, or systems that can be embedded in Markov processes. This means that no work on non-Markovian stochastic differential equations has been included, which I regret. However, van Kampen has covered this field rather well, and it is now well covered in his book on stochastic processes. Other subjects have been omitted because I feel that they are not yet ready for a definitive formulation: for example, the theory of adiabatic elimination in spatially distributed systems, the theory of fluctuating hydrodynamics, renormalisation group methods in stochastic differential equations, and associated critical phenomena. There is a great body of literature on all of these.

[Diagram: organisation of the book — 1. Introduction; 2. Probability Concepts; 3. Markov Processes; 4. Ito Calculus and Stochastic Differential Equations; 5. The Fokker-Planck Equation; 6. Approximation Methods for Diffusion Processes; 7. Master Equations and Jump Processes; 8. Spatially Distributed Systems; 9. Bistability, Metastability, and Escape Problems; 10. Quantum Mechanical Markov Processes]

Further, for the sake of compactness and simplicity I have normally presented only one way of formulating certain methods. For example, there are several different ways of formulating the adiabatic elimination results, though few have been used in this context. My formulation of quantum Markov processes and the use of P-representations is only one of many. To have given a survey of all formulations would have required an enormous and almost unreadable book. However, where appropriate I have included specific references, and further relevant matter can be found in the general bibliography.

Hamilton, New Zealand, January 1983
C. W. Gardiner

Acknowledgements

My warmest appreciation must go to Professor Hermann Haken for inviting me to write this book for the Springer Series in Synergetics, and for helping support a sabbatical leave in Stuttgart where I did most of the initial exploration of the subject and commenced writing the book.
The physical production of the manuscript would not have been possible without the thoroughness of Christine Coates, whose ability to produce a beautiful typescript, in spite of my handwriting and changes of mind, has never ceased to arouse my admiration. The thorough assistance of Moira Steyn-Ross in checking formulae and the consistency of the manuscript has been a service whose essential nature can only be appreciated by an author. Many of the diagrams, and some computations, were prepared with the assistance of Craig Savage, for whose assistance I am very grateful.

Since I first became interested in stochastic phenomena, I have benefitted greatly from contact with a large number of people, and in particular I wish to thank L. Arnold, R. Graham, S. Grossman, F. Haake, P. Hohenberg, W. Horsthemke, N. G. van Kampen, R. Landauer, R. Lefever, M. Malek-Mansour, G. Nicolis, A. Nitzan, P. Ortoleva, J. Ross, F. Schlögl, and U. Titulaer.

To my colleagues, students and former students at the University of Waikato must go a considerable amount of credit for much of the work in this book; in particular to Bruce Liley, whose encouragement and provision of departmental support has been warmly appreciated. I want to express my appreciation to Dan Walls who first introduced me to this field, and with whom I have enjoyed a fruitful collaboration for many years; to Howard Carmichael, Peter Drummond, Ken McNeil, Gerard Milburn, Moira Steyn-Ross, and above all, to Subhash Chaturvedi, whose insights into and knowledge of this field have been of particular value.

It is my wish to dedicate this book to my wife, Heather, and my children, Simon and Amanda, who have been remarkably patient of my single-minded application of my time to this project.

The extract from the paper by A. Einstein which appears in Sect. 1.2.1 is reprinted with the permission of the Hebrew University, Jerusalem, Israel, who hold the copyright. The diagram which appears as Fig. 1.3(b) is reprinted with permission of Princeton University Press.

Contents

1. A Historical Introduction
1.1 Motivation
1.2 Some Historical Examples
1.2.1 Brownian Motion
1.2.2 Langevin's Equation
1.3 Birth-Death Processes
1.4 Noise in Electronic Systems
1.4.1 Shot Noise
1.4.2 Autocorrelation Functions and Spectra
1.4.3 Fourier Analysis of Fluctuating Functions: Stationary Systems
1.4.4 Johnson Noise and Nyquist's Theorem

2. Probability Concepts
2.1 Events, and Sets of Events
2.2 Probabilities
2.2.1 Probability Axioms
2.2.2 The Meaning of P(A)
2.2.3 The Meaning of the Axioms
2.2.4 Random Variables
2.3 Joint and Conditional Probabilities: Independence
2.3.1 Joint Probabilities
2.3.2 Conditional Probabilities
2.3.3 Relationship Between Joint Probabilities of Different Orders
2.3.4 Independence
2.4 Mean Values and Probability Density
2.4.1 Determination of Probability Density by Means of Arbitrary Functions
2.4.2 Sets of Probability Zero
2.5 Mean Values
2.5.1 Moments, Correlations, and Covariances
2.5.2 The Law of Large Numbers
2.6 Characteristic Function
2.7 Cumulant Generating Function: Correlation Functions and Cumulants
2.7.1 Example: Cumulant of Order 4: $\langle\langle X_1 X_2 X_3 X_4 \rangle\rangle$
2.7.2 Significance of Cumulants
2.8 Gaussian and Poissonian Probability Distributions
2.8.1 The Gaussian Distribution
2.8.2 Central Limit Theorem
2.8.3 The Poisson Distribution
2.9 Limits of Sequences of Random Variables
2.9.1 Almost Certain Limit
2.9.2 Mean Square Limit (Limit in the Mean)
2.9.3 Stochastic Limit, or Limit in Probability
2.9.4 Limit in Distribution
2.9.5 Relationship Between Limits

3. Markov Processes
3.1 Stochastic Processes
3.2 Markov Process
3.2.1 Consistency — the Chapman-Kolmogorov Equation
3.2.2 Discrete State Spaces
3.2.3 More General Measures
3.3 Continuity in Stochastic Processes
3.3.1 Mathematical Definition of a Continuous Markov Process
3.4 Differential Chapman-Kolmogorov Equation
3.4.1 Derivation of the Differential Chapman-Kolmogorov Equation
3.4.2 Status of the Differential Chapman-Kolmogorov Equation
3.5 Interpretation of Conditions and Results
3.5.1 Jump Processes: The Master Equation
3.5.2 Diffusion Processes — the Fokker-Planck Equation
3.5.3 Deterministic Processes — Liouville's Equation
3.5.4 General Processes
3.6 Equations for Time Development in Initial Time — Backward Equations
3.7 Stationary and Homogeneous Markov Processes
3.7.1 Ergodic Properties
3.7.2 Homogeneous Processes
3.7.3 Approach to a Stationary Process
3.7.4 Autocorrelation Function for Markov Processes
3.8 Examples of Markov Processes
3.8.1 The Wiener Process
3.8.2 The Random Walk in One Dimension
3.8.3 Poisson Process
3.8.4 The Ornstein-Uhlenbeck Process
3.8.5 Random Telegraph Process

4. The Ito Calculus and Stochastic Differential Equations
4.1 Motivation
4.2 Stochastic Integration
4.2.1 Definition of the Stochastic Integral
4.2.2 Example: $\int W(t')\,dW(t')$
4.2.3 The Stratonovich Integral
4.2.4 Nonanticipating Functions
4.2.5 Proof that $dW(t)^2 = dt$ and $dW(t)^{2+N} = 0$
4.2.6 Properties of the Ito Stochastic Integral
4.3 Stochastic Differential Equations (SDE)
4.3.1 Ito Stochastic Differential Equation: Definition
4.3.2 Markov Property of the Solution of an Ito Stochastic Differential Equation
4.3.3 Change of Variables: Ito's Formula
4.3.4 Connection Between Fokker-Planck Equation and Stochastic Differential Equation
4.3.5 Multivariable Systems
4.3.6 Stratonovich's Stochastic Differential Equation
4.3.7 Dependence on Initial Conditions and Parameters
4.4 Some Examples and Solutions
4.4.1 Coefficients Without x Dependence
4.4.2 Multiplicative Linear White Noise Process
4.4.3 Complex Oscillator with Noisy Frequency
4.4.4 Ornstein-Uhlenbeck Process
4.4.5 Conversion from Cartesian to Polar Coordinates
4.4.6 Multivariate Ornstein-Uhlenbeck Process
4.4.7 The General Single Variable Linear Equation
4.4.8 Multivariable Linear Equations
4.4.9 Time-Dependent Ornstein-Uhlenbeck Process

5. The Fokker-Planck Equation
5.1 Background
5.2 Fokker-Planck Equation in One Dimension
5.2.1 Boundary Conditions
5.2.2 Stationary Solutions for Homogeneous Fokker-Planck Equations
5.2.3 Examples of Stationary Solutions
5.2.4 Boundary Conditions for the Backward Fokker-Planck Equation
5.2.5 Eigenfunction Methods (Homogeneous Processes)
5.2.6 Examples
5.2.7 First Passage Times for Homogeneous Processes
5.2.8 Probability of Exit Through a Particular End of the Interval
5.3 Fokker-Planck Equations in Several Dimensions
5.3.1 Change of Variables
5.3.2 Boundary Conditions
5.3.3 Stationary Solutions: Potential Conditions
5.3.4 Detailed Balance
5.3.5 Consequences of Detailed Balance
5.3.6 Examples of Detailed Balance in Fokker-Planck Equations
5.3.7 Eigenfunction Methods in Many Variables — Homogeneous Processes
5.4 First Exit Time from a Region (Homogeneous Processes)
5.4.1 Solutions of Mean Exit Time Problems
5.4.2 Distribution of Exit Points

6. Approximation Methods for Diffusion Processes
6.1 Small Noise Perturbation Theories
6.2 Small Noise Expansions for Stochastic Differential Equations
6.2.1 Validity of the Expansion
6.2.2 Stationary Solutions (Homogeneous Processes)
6.2.3 Mean, Variance, and Time Correlation Function
6.2.4 Failure of Small Noise Perturbation Theories
6.3 Small Noise Expansion of the Fokker-Planck Equation
6.3.1 Equations for Moments and Autocorrelation Functions
6.3.2 Example
6.3.3 Asymptotic Method for Stationary Distributions
6.4 Adiabatic Elimination of Fast Variables
6.4.1 Abstract Formulation in Terms of Operators and Projectors
6.4.2 Solution Using Laplace Transform
6.4.3 Short-Time Behaviour
6.4.4 Boundary Conditions
6.4.5 Systematic Perturbative Analysis
6.5 White Noise Process as a Limit of Nonwhite Process
6.5.1 Generality of the Result
6.5.2 More General Fluctuation Equations
6.5.3 Time Nonhomogeneous Systems
6.5.4 Effect of Time Dependence in $L_1$
6.6 Adiabatic Elimination of Fast Variables: The General Case
6.6.1 Example: Elimination of Short-Lived Chemical Intermediates
6.6.2 Adiabatic Elimination in Haken's Model
6.6.3 Adiabatic Elimination of Fast Variables: A Nonlinear Case
6.6.4 An Example with Arbitrary Nonlinear Coupling

7. Master Equations and Jump Processes
7.1 Birth-Death Master Equations — One Variable
7.1.1 Stationary Solutions
7.1.2 Example: Chemical Reaction $X \rightleftharpoons A$
7.1.3 A Chemical Bistable System
7.2 Approximation of Master Equations by Fokker-Planck Equations
7.2.1 Jump Process Approximation of a Diffusion Process
7.2.2 The Kramers-Moyal Expansion
7.2.3 Van Kampen's System Size Expansion
7.2.4 Kurtz's Theorem
7.2.5 Critical Fluctuations
7.3 Boundary Conditions for Birth-Death Processes
7.4 Mean First Passage Times
7.4.1 Probability of Absorption
7.4.2 Comparison with Fokker-Planck Equation
7.5 Birth-Death Systems with Many Variables
7.5.1 Stationary Solutions when Detailed Balance Holds
7.5.2 Stationary Solutions Without Detailed Balance (Kirchoff's Solution)
7.5.3 System Size Expansion and Related Expansions
7.6 Some Examples
7.6.1 $X + A \rightleftharpoons 2X$
7.6.2 $X \rightleftharpoons Y \rightleftharpoons A$
7.6.3 Prey-Predator System
7.6.4 Generating Function Equations
7.7 The Poisson Representation
7.7.1 Kinds of Poisson Representations
7.7.2 Real Poisson Representations
7.7.3 Complex Poisson Representations
7.7.4 The Positive Poisson Representation
7.7.5 Time Correlation Functions
7.7.6 Trimolecular Reaction
7.7.7 Third-Order Noise

8. Spatially Distributed Systems
8.1 Background
8.1.1 Functional Fokker-Planck Equations
8.2 Multivariate Master Equation Description
8.2.1 Diffusion
8.2.2 Continuum Form of Diffusion Master Equation
8.2.3 Reactions and Diffusion Combined
8.2.4 Poisson Representation Methods
8.3 Spatial and Temporal Correlation Structures
8.3.1 Reaction $X \rightleftharpoons Y$
8.3.2 Reactions $B + X \rightleftharpoons C$, $A + X \rightleftharpoons 2X$
8.3.3 A Nonlinear Model with a Second-Order Phase Transition
8.4 Connection Between Local and Global Descriptions
8.4.1 Explicit Adiabatic Elimination of Inhomogeneous Modes
8.5 Phase-Space Master Equation
8.5.1 Treatment of Flow
8.5.2 Flow as a Birth-Death Process
8.5.3 Inclusion of Collisions — the Boltzmann Master Equation
8.5.4 Collisions and Flow Together

9. Bistability, Metastability, and Escape Problems
9.1 Diffusion in a Double-Well Potential (One Variable)
9.1.1 Behaviour for D = 0
9.1.2 Behaviour if D is Very Small
9.1.3 Exit Time
9.1.4 Splitting Probability
9.1.5 Decay from an Unstable State
9.2 Equilibration of Populations in Each Well
9.2.1 Kramers' Method
9.2.2 Example: Reversible Denaturation of Chymotrypsinogen
9.2.3 Bistability with Birth-Death Master Equations (One Variable)
9.3 Bistability in Multivariable Systems
9.3.1 Distribution of Exit Points
9.3.2 Asymptotic Analysis of Mean Exit Time
9.3.3 Kramers' Method in Several Dimensions
9.3.4 Example: Brownian Motion in a Double Potential

10. Quantum Mechanical Markov Processes
10.1 Quantum Mechanics of the Harmonic Oscillator
10.1.1 Interaction with an External Field
10.1.2 Properties of Coherent States
10.2 Density Matrix and Probabilities
10.2.1 Von Neumann's Equation
10.2.2 Glauber-Sudarshan P-Representation
10.2.3 Operator Correspondences
10.2.4 Application to the Driven Harmonic Oscillator
10.2.5 Quantum Characteristic Function
10.3 Quantum Markov Processes
10.3.1 Heat Bath
10.3.2 Correlations of Smooth Functions of Bath Operators
10.3.3 Quantum Master Equation for a System Interacting with a Heat Bath
10.4 Examples and Applications of Quantum Markov Processes
10.4.1 Harmonic Oscillator
10.4.2 The Driven Two-Level Atom
10.5 Time Correlation Functions in Quantum Markov Processes
10.5.1 Quantum Regression Theorem
10.5.2 Application to Harmonic Oscillator in the P-Representation
10.5.3 Time Correlations for Two-Level Atom
10.6 Generalised P-Representations
10.6.1 Definition of Generalised P-Representation
10.6.2 Existence Theorems
10.6.3 Relation to Poisson Representation
10.6.4 Operator Identities
10.7 Application of Generalised P-Representations to Time-Development Equations
10.7.1 Complex P-Representation
10.7.2 Positive P-Representation
10.7.3 Example

References
Bibliography
Symbol Index
Author Index
Subject Index

1. A Historical Introduction

1.1 Motivation

Theoretical science up to the end of the nineteenth century can be viewed as the study of solutions of differential equations and the modelling of natural phenomena by deterministic solutions of these differential equations. It was at that time commonly thought that if all initial data could only be collected, one would be able to predict the future with certainty. We now know this is not so, in at least two ways.
Firstly, the advent of quantum mechanics within a quarter of a century gave rise to a new physics, and hence a new theoretical basis for all science, which had as an essential basis a purely statistical element. Secondly, more recently, the concept of chaos has arisen, in which even quite simple differential equation systems have the rather alarming property of giving rise to essentially unpredictable behaviour. To be sure, one can predict the future of such a system given its initial conditions, but any error in the initial conditions is so rapidly magnified that no practical predictability is left. In fact, the existence of chaos is really not surprising, since it agrees with more of our everyday experience than does pure predictability — but it is surprising perhaps that it has taken so long for the point to be made.

[Fig. 1.1. Stochastic simulation of an isomerisation reaction $X \rightleftharpoons A$]

Chaos and quantum mechanics are not the subject of this chapter. Here I wish to give a semihistorical outline of how a phenomenological theory of fluctuating phenomena arose and what its essential points are. The very usefulness of predictable models indicates that life is not entirely chaos. But there is a limit to predictability, and what we shall be most concerned with in this book are models of limited predictability. The experience of careful measurements in science normally gives us data like that of Fig. 1.1, representing the growth of the number of molecules of a substance $X$ formed by a chemical reaction of the form $X \rightleftharpoons A$. A quite well defined deterministic motion is evident, and this is reproducible, unlike the fluctuations around this motion, which are not.

1.2 Some Historical Examples

1.2.1 Brownian Motion

The observation that, when suspended in water, small pollen grains are found to be in a very animated and irregular state of motion, was first systematically investigated by Robert Brown in 1827, and the observed phenomenon took the name Brownian Motion because of his fundamental pioneering work. Brown was a botanist — indeed a very famous botanist — and of course tested whether this motion was in some way a manifestation of life. By showing that the motion was present in any suspension of fine particles — glass, minerals and even a fragment of the sphinx — he ruled out any specifically organic origin of this motion.

[Fig. 1.2. Motion of a point undergoing Brownian motion]

The riddle of Brownian motion was not quickly solved, and a satisfactory explanation did not come until 1905, when Einstein published an explanation under the rather modest title "Über die von der molekularkinetischen Theorie der Wärme geforderte Bewegung von in ruhenden Flüssigkeiten suspendierten Teilchen" (concerning the motion, as required by the molecular-kinetic theory of heat, of particles suspended in liquids at rest) [1.2]. The same explanation was independently developed by Smoluchowski [1.3], who was responsible for much of the later systematic development and for much of the experimental verification of Brownian motion theory.

There were two major points in Einstein's solution to the problem of Brownian motion.

(i) The motion is caused by the exceedingly frequent impacts on the pollen grain of the incessantly moving molecules of liquid in which it is suspended.
(ii) The motion of these molecules is so complicated that its effect on the pollen grain can only be described probabilistically in terms of exceedingly frequent statistically independent impacts.

The existence of fluctuations like these ones calls out for a statistical explanation of this kind of phenomenon. Statistics had already been used by Maxwell and Boltzmann in their famous gas theories, but only as a description of possible states and the likelihood of their achievement and not as an intrinsic part of the time evolution of the system. Rayleigh [1.1] was in fact the first to consider a statistical description in this context, but for one reason or another, very little arose out of his work. For practical purposes, Einstein's explanation of the nature of Brownian motion must be regarded as the beginning of stochastic modelling of natural phenomena.

Einstein's reasoning is very clear and elegant. It contains all the basic concepts which will make up the subject matter of this book. Rather than paraphrase a classic piece of work, I shall simply give an extended excerpt from Einstein's paper (author's translation):

"It must clearly be assumed that each individual particle executes a motion which is independent of the motions of all other particles; it will also be considered that the movements of one and the same particle in different time intervals are independent processes, as long as these time intervals are not chosen too small.

"We introduce a time interval $\tau$ into consideration, which is very small compared to the observable time intervals, but nevertheless so large that in two successive time intervals $\tau$, the motions executed by the particle can be thought of as events which are independent of each other.

"Now let there be a total of $n$ particles suspended in a liquid. In a time interval $\tau$, the $X$-coordinates of the individual particles will increase by an amount $\Delta$, where for each particle $\Delta$ has a different (positive or negative) value. There will be a certain frequency law for $\Delta$; the number $dn$ of the particles which experience a shift which is between $\Delta$ and $\Delta + d\Delta$ will be expressible by an equation of the form

$$dn = n\varphi(\Delta)\,d\Delta, \qquad (1.2.1)$$

where

$$\int_{-\infty}^{+\infty} \varphi(\Delta)\,d\Delta = 1 \qquad (1.2.2)$$

and $\varphi$ is only different from zero for very small values of $\Delta$, and satisfies the condition

$$\varphi(\Delta) = \varphi(-\Delta). \qquad (1.2.3)$$

"We now investigate how the diffusion coefficient depends on $\varphi$. We shall once more restrict ourselves to the case where the number $\nu$ of particles per unit volume depends only on $x$ and $t$.

"Let $\nu = f(x, t)$ be the number of particles per unit volume. We compute the distribution of particles at the time $t + \tau$ from the distribution at time $t$. From the definition of the function $\varphi(\Delta)$, it is easy to find the number of particles which at time $t + \tau$ are found between two planes perpendicular to the $x$-axis and passing through points $x$ and $x + dx$. One obtains

$$f(x, t + \tau)\,dx = dx \int_{-\infty}^{+\infty} f(x + \Delta, t)\,\varphi(\Delta)\,d\Delta. \qquad (1.2.4)$$

But since $\tau$ is very small, we can set

$$f(x, t + \tau) = f(x, t) + \tau \frac{\partial f}{\partial t}. \qquad (1.2.5)$$

Furthermore, we develop $f(x + \Delta, t)$ in powers of $\Delta$:

$$f(x + \Delta, t) = f(x, t) + \Delta \frac{\partial f(x, t)}{\partial x} + \frac{\Delta^2}{2!} \frac{\partial^2 f(x, t)}{\partial x^2} + \cdots. \qquad (1.2.6)$$

We can use this series under the integral, because only small values of $\Delta$ contribute to this equation. We obtain

$$f + \frac{\partial f}{\partial t}\tau = f \int_{-\infty}^{+\infty} \varphi(\Delta)\,d\Delta + \frac{\partial f}{\partial x} \int_{-\infty}^{+\infty} \Delta\,\varphi(\Delta)\,d\Delta + \frac{\partial^2 f}{\partial x^2} \int_{-\infty}^{+\infty} \frac{\Delta^2}{2}\,\varphi(\Delta)\,d\Delta + \cdots. \qquad (1.2.7)$$

Because $\varphi(x) = \varphi(-x)$, the second, fourth, etc., terms on the right-hand side vanish, while out of the 1st, 3rd, 5th, etc., terms, each one is very small compared with the previous.
We obtain from this equation, by taking into consideration

$$\int_{-\infty}^{+\infty} \varphi(\Delta)\,d\Delta = 1 \qquad (1.2.8)$$

and setting

$$\frac{1}{\tau} \int_{-\infty}^{+\infty} \frac{\Delta^2}{2}\,\varphi(\Delta)\,d\Delta = D, \qquad (1.2.9)$$

and keeping only the 1st and third terms of the right-hand side,

$$\frac{\partial f}{\partial t} = D \frac{\partial^2 f}{\partial x^2} \cdots. \qquad (1.2.10)$$

This is already known as the differential equation of diffusion and it can be seen that $D$ is the diffusion coefficient. ...

"The problem, which corresponds to the problem of diffusion from a single point (neglecting the interaction between the diffusing particles), is now completely determined mathematically; its solution is

$$f(x, t) = \frac{n}{\sqrt{4\pi D t}}\, e^{-x^2/4Dt} \cdots. \qquad (1.2.11)$$

"We now calculate, with the help of this equation, the displacement $\lambda_x$ in the direction of the $X$-axis that a particle experiences on the average or, more exactly, the square root of the arithmetic mean of the square of the displacement in the direction of the $X$-axis; it is

$$\lambda_x = \sqrt{\overline{x^2}} = \sqrt{2Dt}." \qquad (1.2.12)$$

Einstein's derivation is really based on a discrete time assumption, that impacts happen only at times $0, \tau, 2\tau, 3\tau, \ldots$, and his resulting equation (1.2.10) for the distribution function $f(x, t)$ and its solution (1.2.11) are to be regarded as approximations, in which $\tau$ is considered so small that $t$ may be considered as being continuous. Nevertheless, his description contains very many of the major concepts which have been developed more and more generally and rigorously since then, and which will be central to this book. For example:

i) The Chapman-Kolmogorov Equation occurs as Einstein's equation (1.2.4). It states that the probability of the particle being at point $x$ at time $t + \tau$ is given by the sum of the probability of all possible "pushes" $\Delta$ from positions $x + \Delta$, multiplied by the probability of being at $x + \Delta$ at time $t$. This assumption is based on the independence of the push $\Delta$ of any previous history of the motion: it is only necessary to know the initial position of the particle at time $t$ — not at any previous time. This is the Markov postulate, and the Chapman-Kolmogorov equation, of which (1.2.4) is a special form, is the central dynamical equation to all Markov processes. These will be studied in detail in Chap. 3.

ii) The Fokker-Planck Equation: Eq. (1.2.10) is the diffusion equation, a special case of the Fokker-Planck equation, which describes a large class of very interesting stochastic processes in which the system has a continuous sample path. In this case, that means that the pollen grain's position, if thought of as obeying a probabilistic law given by solving the diffusion equation (1.2.10), in which time $t$ is continuous (not discrete, as assumed by Einstein), can be written $x(t)$, where $x(t)$ is a continuous function of time — but a random function. This leads us to consider the possibility of describing the dynamics of the system in some direct probabilistic way, so that we would have a random or stochastic differential equation for the path. This procedure was initiated by Langevin with the famous equation that to this day bears his name. We will discuss this in detail in Chap. 4.

iii) The Kramers-Moyal and similar expansions are essentially the same as that used by Einstein to go from (1.2.4) (the Chapman-Kolmogorov equation) to the diffusion equation (1.2.10). The use of this type of approximation, which effectively replaces a process whose sample paths need not be continuous with one whose paths are continuous, has been a topic of discussion in the last decade. Its use and validity will be discussed in Chap. 7.
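Einstein's scaling law (1.2.12) is easy to check numerically. The following sketch is an addition to the text, not part of Einstein's paper: it draws independent Gaussian "pushes" $\Delta$ (one convenient density satisfying (1.2.2), (1.2.3) and (1.2.9), not something Einstein specified), with variance $2D\tau$ per step, and compares the root-mean-square displacement of many particles with $\sqrt{2Dt}$. All parameter values are arbitrary illustrative choices.

```python
# A minimal numerical check of Einstein's result (1.2.12): independent,
# symmetric pushes of variance 2*D*tau per step give an rms displacement
# of sqrt(2*D*t) after time t. Parameters are illustrative, not from the text.
import numpy as np

rng = np.random.default_rng(0)
D, tau = 0.5, 0.01                  # diffusion coefficient, step interval
n_steps, n_particles = 1000, 5000

pushes = rng.normal(0.0, np.sqrt(2 * D * tau), size=(n_particles, n_steps))
x = pushes.sum(axis=1)              # final X-coordinate of each particle

t = n_steps * tau
print("measured rms displacement :", np.sqrt((x ** 2).mean()))
print("Einstein's sqrt(2 D t)    :", np.sqrt(2 * D * t))
```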
1.2.2 Langevin's Equation

Some time after Einstein's original derivation, Langevin [1.4] presented a new method which was quite different from Einstein's and, according to him, "infinitely more simple." His reasoning was as follows.

From statistical mechanics, it was known that the mean kinetic energy of the Brownian particle should, in equilibrium, reach a value

$$\langle \tfrac{1}{2} m v^2 \rangle = \tfrac{1}{2} k T \qquad (1.2.13)$$

($T$: absolute temperature, $k$: Boltzmann's constant). (Both Einstein and Smoluchowski had used this fact.) Acting on the particle, of mass $m$, there should be two forces:

i) a viscous drag: assuming this is given by the same formula as in macroscopic hydrodynamics, this is $-6\pi\eta a\,dx/dt$, $\eta$ being the viscosity and $a$ the diameter of the particle, assumed spherical;

ii) another fluctuating force $X$ which represents the incessant impacts of the molecules of the liquid on the Brownian particle. All that is known about it is that fact, and that it should be positive and negative with equal probability.

Thus, the equation of motion for the position of the particle is given by Newton's law as

$$m \frac{d^2 x}{dt^2} = -6\pi\eta a \frac{dx}{dt} + X \qquad (1.2.14)$$

and multiplying by $x$, this can be written

$$\frac{m}{2} \frac{d^2(x^2)}{dt^2} - m v^2 = -3\pi\eta a \frac{d(x^2)}{dt} + Xx, \qquad (1.2.15)$$

where $v = dx/dt$. We now average over a large number of different particles and use (1.2.13) to obtain an equation for $\langle x^2 \rangle$ [...]

[...] so that $G(s, 0) = 1$. Expanding the solution (1.4.7) in powers of $s$, we find

$$P(n, t) = e^{-\lambda t} (\lambda t)^n / n! \qquad (1.4.8)$$

which is known as a Poisson distribution (Sect. 2.8.3). Let us introduce the variable $N(t)$, which is to be considered as the number of electrons which have arrived up to time $t$, and is a random quantity. Then,

$$P(n, t) = \text{Prob}\{N(t) = n\}, \qquad (1.4.9)$$

and $N(t)$ can be called a Poisson process variable. Then clearly, the quantity $\mu(t)$, formally defined by

$$\mu(t) = dN(t)/dt \qquad (1.4.10)$$

is zero, except when $N(t)$ increases by 1; at that stage it is a Dirac delta function, i.e.,

$$\mu(t) = \sum_k \delta(t - t_k), \qquad (1.4.11)$$

where the $t_k$ are the times of arrival of the individual electrons. We may write

$$I(t) = \int dt'\, F(t - t')\,\mu(t'). \qquad (1.4.12)$$

A very reasonable restriction on $F(t - t')$ is that it vanishes if $t < t'$, for example

$$F(t) = q\,e^{-\alpha t} \quad (t > 0); \qquad F(t) = 0 \quad (t < 0), \qquad (1.4.13)$$

so that (1.4.12) can be rewritten as

$$I(t) = \int_{-\infty}^{t} dt'\, q\, e^{-\alpha(t - t')}\,\mu(t'). \qquad (1.4.14)$$

We can derive a simple differential equation. We differentiate $I(t)$ to obtain

$$\frac{dI(t)}{dt} = q\mu(t) + \int_{-\infty}^{t} dt'\, (-\alpha q)\, e^{-\alpha(t-t')}\,\mu(t') \qquad (1.4.15)$$

so that

$$\frac{dI(t)}{dt} = -\alpha I(t) + q\mu(t). \qquad (1.4.16)$$

This is a kind of stochastic differential equation, similar to Langevin's equation, in which, however, the fluctuating force is given by $q\mu(t)$, where $\mu(t)$ is the derivative of the Poisson process, as given by (1.4.11). However, the mean of $\mu(t)$ is nonzero; in fact, from (1.4.10)

$$\langle dN(t) \rangle = \lambda\,dt \qquad (1.4.17)$$

$$\langle [dN(t) - \lambda\,dt]^2 \rangle = \lambda\,dt \qquad (1.4.18)$$

from the properties of the Poisson distribution, for which the variance equals the mean. Defining, then, the fluctuation as the difference between $dN(t)$ and its mean value, we write

$$d\eta(t) = dN(t) - \lambda\,dt, \qquad (1.4.19)$$

so that the stochastic differential equation (1.4.16) takes the form

$$dI(t) = (\lambda q - \alpha I(t))\,dt + q\,d\eta(t). \qquad (1.4.20)$$

Now how does one solve such an equation? In this case, we have an academic problem anyway since the solution is known, but one would like to have a technique. Suppose we try to follow the method used by Langevin — what will we get as an answer? The short reply to this question is: nonsense. For example, using ordinary calculus and assuming $\langle I(t)\,d\eta(t) \rangle = 0$, we can derive
$$\frac{d\langle I \rangle}{dt} = \lambda q - \alpha \langle I \rangle \qquad (1.4.21)$$

and

$$\frac{1}{2}\frac{d\langle I^2 \rangle}{dt} = \lambda q \langle I \rangle - \alpha \langle I^2 \rangle; \qquad (1.4.22)$$

solving in the limit $t \to \infty$, where the mean values would reasonably be expected to be constant, one finds

$$\langle I(\infty) \rangle = \lambda q / \alpha \qquad (1.4.23)$$

and [...]

$$\cdots + \langle \{(\lambda q - \alpha I(t))\,dt + q\,d\eta(t)\}^2 \rangle. \qquad (1.4.28)$$

We now assume again that [...] (1.4.32) and compute the autocorrelation function of that variable.

A more traditional approach is to compute the spectrum of the quantity $x(t)$. This is defined in two stages. First, define

$$y(\omega) = \int_0^T dt\, e^{i\omega t} x(t), \qquad (1.4.33)$$

then the spectrum is defined by

$$S(\omega) = \lim_{T \to \infty} \frac{1}{2\pi T} |y(\omega)|^2. \qquad (1.4.34)$$

The autocorrelation function and the spectrum are closely connected. By a little manipulation one finds

$$S(\omega) = \lim_{T \to \infty} \frac{1}{\pi} \int_0^T \cos(\omega\tau)\, d\tau\, \frac{1}{T} \int_0^{T-\tau} x(t)\, x(t + \tau)\, dt \qquad (1.4.35)$$

and taking the limit $T \to \infty$ (under suitable assumptions to ensure the validity of certain interchanges of order), one finds

$$S(\omega) = \frac{1}{\pi} \int_0^\infty \cos(\omega\tau)\, G(\tau)\, d\tau. \qquad (1.4.36)$$

This is a fundamental result which relates the Fourier transform of the autocorrelation function to the spectrum. The result may be put in a slightly different form when one notices that

$$G(-\tau) = \lim_{T \to \infty} \frac{1}{T} \int_0^T dt\, x(t + \tau)\, x(t) = G(\tau), \qquad (1.4.37)$$

so we obtain

$$S(\omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-i\omega\tau}\, G(\tau)\, d\tau \qquad (1.4.38)$$

with the corresponding inverse

$$G(\tau) = \int_{-\infty}^{\infty} e^{i\omega\tau}\, S(\omega)\, d\omega. \qquad (1.4.39)$$

This result is known as the Wiener-Khinchin theorem [1.12, 13] and has widespread application. It means that one may either directly measure the autocorrelation function of a signal, or the spectrum, and convert back and forth, which by means of the fast Fourier transform and computer is relatively straightforward.
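As a concrete illustration of (1.4.36), the short sketch below (an addition, not part of the original text) evaluates the cosine transform of an exponential autocorrelation, the form which will reappear below as (1.4.53), and compares it with the Lorentzian spectrum of (1.4.52). Units with $RkT = 1$, the value of $\tau_c$, and the quadrature cutoff are all arbitrary illustrative choices.

```python
# Numerical check of the Wiener-Khinchin form (1.4.36):
#   S(omega) = (1/pi) * Int_0^inf cos(omega*tau) G(tau) dtau
# for G(tau) = (R*k*T/tau_c) * exp(-tau/tau_c), which should reproduce the
# Lorentzian (1.4.52): S(omega) = R*k*T / (pi*(omega^2*tau_c^2 + 1)).
import numpy as np

RkT, tau_c = 1.0, 1.0
tau = np.linspace(0.0, 60.0 * tau_c, 600_001)   # truncate the infinite range
G = (RkT / tau_c) * np.exp(-tau / tau_c)

for omega in (0.0, 0.5, 2.0):
    S_num = np.trapz(np.cos(omega * tau) * G, tau) / np.pi
    S_lor = RkT / (np.pi * (omega**2 * tau_c**2 + 1.0))
    print(f"omega = {omega}: numerical S = {S_num:.6f}, Lorentzian = {S_lor:.6f}")
```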
In the case of light, the frequencies correspond to different colours of light. If we perceive light to be white, it is found that in practice all colours are present in equal proportions—the optical spectrum of white light is thus flat—at least within the visible range. In analogy, the term white noise is applied to a noise voltage (or any other fluctuating quantity) whose spectrum is flat. White noise cannot actually exist. The simplest demonstration is to note that the mean power dissipated in the resistor in the frequency range («,, @;) is given by $f deo S(co)|R = kT(@,-0>)/n (1.4.48) a so that the total power dissipated in all frequencies is infinite! Nyquist realised this, and noted that, in practice, there would be quantum corrections which would, at room temperature, make the spectrum flat only up to 7 x 10!? Hz, which is not detectable in practice, in a radio situation. The actual power dissipated in the resistor would be somewhat less than infinite, 10-'® W in fact! And in practice there are other limiting factors such as the inductance of the system, which would limit the spectrum to even lower frequencies. From the definition of the spectrum in terms of the autocorrelation function given in Sect. 1.4, we have CElt + DE(t)) = G(r) (1.4.49) . _ i deo "2K kT (1.4.50) = 2RKT&), (1.4.51) which implies that no matter how small the time difference t, E(t + t) and E(t) are not correlated. This is, of course, a direct result of the flatness of the spectrum. A typical model of S(c) that is almost flat is S(o) = RKT/{n(w?t2+1))] (1.4.52) 20 1. A Historical Introduction 1.5.a,b. Correlation Functions (——) and corresponding spectra (------) for (a) short correlation time corresponding to an almost flat spectrum; (b) long correlation time, giving a quite rapidly decreasing spectrum This is flat provided w < t¢~!. The Fourier transform can be explicitly evaluated in this case to give = (R KTR) exp (—1/t0) (1.4.53) so that the autocorrelation function vanishes only for t > tc, which is called the correlation time of the fluctuating voltage. Thus, the delta function correlation function appears as an idealisation, only valid on a sufficiently long time scale. This is very reminiscent of Kinstein’s assumption regarding Brownian motion and of the behaviour of Langevin’s fluctuating force. The idealised white noise will play a highly important role in this book but, in just the same way as the fluctuation term that arises in a stochastic differential equation is not the same as an ordinary differential, we will find that differential equations which include white noise as a driving term have to be handled with great care. Such equations arise very naturally in any fluctuating system and it is possible to arrange by means of Stratono- vich’s rules for ordinary calculus rules to apply, but at the cost of imprecise mathe- matical definition and some difficulties in stochastic manipulation. It turns out to be far better to abandon ordinary calculus and use the Jto calculus, which is not very different (it is, in fact, very similar to the calculus presented for shot noise) and to preserve tractable statistical properties. All these matters will be discussed thoroughly in Chap. 4. White noise, as we have noted above, does not exist as a physically realisable process and the rather singular behaviour it exhibits does not arise in any realisable context. It is, however, fundamental in a mathematical, and indeed in a physical sense, in that it is an idealisation of very many processes that do occur. 
White noise, as we have noted above, does not exist as a physically realisable process and the rather singular behaviour it exhibits does not arise in any realisable context. It is, however, fundamental in a mathematical, and indeed in a physical sense, in that it is an idealisation of very many processes that do occur. The slightly strange rules which we will develop for the calculus of white noise are not really very difficult and are very much easier to handle than any method which always deals with a real noise. Furthermore, situations in which white noise is not a good approximation can very often be indirectly expressed quite simply in terms of white noise. In this sense, white noise is the starting point from which a wide range of stochastic descriptions can be derived, and is therefore fundamental to the subject of this book.

2. Probability Concepts

In the preceding chapter, we introduced probability notions without any definitions. In order to formulate essential concepts more precisely, it is necessary to have some more precise expression of these concepts. The intention of this chapter is to provide some background, and to present a number of essential results. It is not a thorough outline of mathematical probability, for which the reader is referred to standard mathematical texts such as those by Feller [2.1] and Papoulis [2.2].

2.1 Events, and Sets of Events

It is convenient to use a notation which is as general as possible in order to describe those occurrences to which we might wish to assign probabilities. For example, we may wish to talk about a situation in which there are $6.4 \times 10^{20}$ molecules in a certain region of space; or a situation in which a Brownian particle is at a certain point $x$ in space; or possibly there are 10 mice and 3 owls in a certain region of a forest. These occurrences are all examples of practical realisations of events. More abstractly, an event is simply a member of a certain space, which in the cases most practically occurring can be characterised by a vector of integers

$$\omega = (n_1, n_2, n_3, \ldots) \qquad (2.1.1)$$

or a vector of real numbers

$$\omega = (x_1, x_2, x_3, \ldots). \qquad (2.1.2)$$

The dimension of the vector is arbitrary. It is convenient to use the language of set theory, introduce the concept of a set of events, and use the notation

$$\omega \in A \qquad (2.1.3)$$

to indicate that the event $\omega$ is one of the events contained in $A$. For example, one may consider the set $A(25)$ of events in the ecological population in which there are no more than 25 animals present; clearly the event that there are 3 mice, a tiger, and no other animals present satisfies

$$\omega \in A(25). \qquad (2.1.4)$$

More significantly, suppose we define the set of events $A(r, \Delta V)$ that a molecule is within a volume element $\Delta V$ centred on a point $r$. In this case, the practical significance of working in terms of sets of events becomes clear, because we should normally be able to determine whether or not a molecule is within a neighbourhood $\Delta V$ of $r$, but to determine whether the particle is exactly at $r$ is impossible. Thus, if we define the event $\omega(y)$ that the molecule is at point $y$, it makes sense to ask whether

$$\omega(y) \in A(r, \Delta V) \qquad (2.1.5)$$

and to assign a certain probability to the set $A(r, \Delta V)$, which is to be interpreted as the probability of the occurrence of (2.1.5).
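To make the set language concrete, here is a small sketch (an added illustration, not from the text) encoding the ecological example above: an event $\omega$ is a vector of integer counts, and membership of the set $A(25)$ is simply a predicate on that vector.

```python
# Events as integer vectors, and sets of events as predicates (Sect. 2.1).
# The event "3 mice, a tiger, and no other animals" is a tuple of counts,
# and A(25) is the set of events with no more than 25 animals present.
def in_A25(omega):
    """Is the event omega = (n_mice, n_tigers, n_owls, ...) in A(25)?"""
    return sum(omega) <= 25

omega = (3, 1, 0)          # 3 mice, a tiger, no owls
print(omega, "in A(25):", in_A25(omega))   # True, as in (2.1.4)
```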
2.2 Probabilities

Most people have an intuitive conception of a probability, based on their own experience. However, a precise formulation of intuitive concepts is fraught with difficulties, and it has been found most convenient to axiomatise probability theory as an essentially abstract science, in which a probability measure $P(A)$ is assigned to every set $A$ in the space of events, including

the set of all events: $\Omega$; (2.2.1)

the set of no events: $\emptyset$. (2.2.2)

In order to define probability, we need our sets of events to form a closed system (known by mathematicians as a $\sigma$-algebra) under the set theoretic operations of union and intersection.

2.2.1 Probability Axioms

We introduce the probability of $A$, $P(A)$, as a function of $A$ satisfying the following probability axioms:

(i) $P(A) \geq 0$ for all $A$; (2.2.3)

(ii) $P(\Omega) = 1$; (2.2.4)

(iii) if $A_i$ ($i = 1, 2, 3, \ldots$) is a countable (but possibly infinite) collection of nonoverlapping sets, i.e., such that

$$A_i \cap A_j = \emptyset \quad \text{for all} \quad i \neq j, \qquad (2.2.5)$$

then

$$P\Big(\bigcup_i A_i\Big) = \sum_i P(A_i). \qquad (2.2.6)$$

These are all the axioms needed. Consequentially, however, we have:

(iv) if $\bar{A}$ is the complement of $A$, i.e., the set of all events not contained in $A$, then

$$P(\bar{A}) = 1 - P(A); \qquad (2.2.7)$$

(v) $P(\emptyset) = 0$. (2.2.8)

2.2.2 The Meaning of P(A)

There is no way of making probability theory correspond to reality without requiring a certain degree of intuition. The probability $P(A)$, as axiomatised above, is the intuitive probability that an "arbitrary" event $\omega$, i.e., an event $\omega$ "chosen at random", will satisfy $\omega \in A$. Or more explicitly, if we choose an event "at random" from $\Omega$ $N$ times, the relative frequency that the particular event chosen will satisfy $\omega \in A$ approaches $P(A)$ as the number of times, $N$, we choose the event, approaches infinity. The number of choices $N$ can be visualised as being done one after the other ("independent" tosses of one die) or at the same time ($N$ dice are thrown at the same time "independently"). All definitions of this kind must be intuitive, as we can see by the way undefined terms ("arbitrary", "at random", "independent") keep turning up. By eliminating what we now think of as intuitive ideas and axiomatising probability, Kolmogorov [2.3] cleared the road for a rigorous development of mathematical probability. But the circular definition problems posed by wanting an intuitive understanding remain. The simplest way of looking at axiomatic probability is as a formal method of manipulating probabilities using the axioms. In order to apply the theory, the probability space must be defined and the probability measure $P$ assigned. These are a priori probabilities, which are simply assumed.

Examples of such a priori probabilities abound in applied disciplines. For example, in equilibrium statistical mechanics one assigns equal probabilities to equal volumes of phase space. Einstein's reasoning in Brownian motion assigned a probability $\varphi(\Delta)$ to the probability of a "push" $\Delta$ from a position $x$ at time $t$. The task of applying probability is

(i) to assume some set of a priori probabilities which seem reasonable and to deduce results from this and from the structure of the probability space;

(ii) to measure experimental results with some apparatus which is constructed to measure quantities in accordance with these a priori probabilities.

The structure of the probability space is very important, especially when the space of events is compounded by the additional concept of time. This extension makes the effective probability space infinite-dimensional, since we can construct events such as "the particle was at points $x_i$ at times $t_i$, $i = 0, 1, 2, \ldots, \infty$".
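The relative-frequency reading of $P(A)$ in Sect. 2.2.2 can be illustrated numerically. In the sketch below (an added illustration, not from the text), the space $\Omega$ is the six faces of a fair die with equal a priori probabilities, $A$ is the set of even faces, and the frequency of $\omega \in A$ over $N$ independent throws approaches $P(A) = 1/2$ as $N$ grows.

```python
# Relative frequency of an event approaching P(A) (Sect. 2.2.2).
# Omega = {1,...,6} with equal a priori probabilities; A = even faces.
import numpy as np

rng = np.random.default_rng(2)
for N in (100, 10_000, 1_000_000):
    omega = rng.integers(1, 7, size=N)       # N events "chosen at random"
    freq = np.mean(omega % 2 == 0)           # fraction with omega in A
    print(f"N = {N:>9}: relative frequency = {freq:.4f}   (P(A) = 0.5)")
```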
2.2.3 The Meaning of the Axioms

Any intuitive concept of probability gives rise to nonnegative probabilities, and the probability that an arbitrary event is contained in the set of all events must be 1 no matter what our definition of the word arbitrary. Hence, axioms (i) and (ii) are understandable. The heart of the matter lies in axiom (iii). Suppose we are dealing with only 2 sets $A$ and $B$, and $A \cap B = \emptyset$. This means there are no events contained in both $A$ and $B$. Therefore, the probability that $\omega \in A \cup B$ is the probability that either $\omega \in A$ or $\omega \in B$. Intuitive considerations tell us this probability is the sum of the individual probabilities, i.e.,

$$P(A \cup B) = P\{(\omega \in A) \text{ or } (\omega \in B)\} = P(A) + P(B) \qquad (2.2.9)$$

(notice this is not a proof — merely an explanation).

The extension now to any finite number of nonoverlapping sets is obvious, but the extension only to any countable number of nonoverlapping sets requires some comment. This extension must be made restrictive because of the existence of sets labelled by a continuous index, for example, $x$, the position in space. The probability of a molecule being in the set whose only element is $x$ is zero; but the probability of being in a region $R$ of finite volume is nonzero. The region $R$ is a union of sets of the form $\{x\}$ — but not a countable union. Thus axiom (iii) is not applicable and the probability of being in $R$ is not equal to the sum of the probabilities of being in $\{x\}$.

2.2.4 Random Variables

The concept of a random variable is a notational convenience which is central to this book. Suppose we have an abstract probability space whose events can be written $x$. Then we can introduce the random variable $F(x)$ which is a function of $x$, which takes on certain values for each $x$. In particular, the identity function of $x$, written $X(x)$, is of interest; it is given by

$$X(x) = x. \qquad (2.2.10)$$

We shall normally use capitals in this book to denote random variables and small letters $x$ to denote their values whenever it is necessary to make a distinction. Very often, we have some quite different underlying probability space $\Omega$ with values $\omega$, and talk about $X(\omega)$ which is some function of $\omega$, and then omit explicit mention of $\omega$. This can be for either of two reasons:

i) we specify the events by the values of $x$ anyway, i.e., we identify $x$ and $\omega$;

ii) the underlying events $\omega$ are too complicated to describe, or sometimes, even to know.

For example, in the case of the position of a molecule in a liquid, we really should interpret each $\omega$ as being capable of specifying all the positions, momenta, and orientations of each molecule in that volume of liquid; but this is simply too difficult to write down, and often unnecessary.

One great advantage of introducing the concept of a random variable is the simplicity with which one may handle functions of random variables, e.g., $X^2$, $\sin(a \cdot X)$, etc., and compute means and distributions of these, as the sketch below illustrates. Further, by defining stochastic differential equations, one can also quite simply talk about time development of random variables in a way which is quite analogous to the classical description by means of differential equations of nonprobabilistic systems.
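As a small illustration of this convenience (an addition, not from the text), the sketch below samples values $x$ of a random variable $X$ taken, for definiteness, uniform on $[0, 1)$, and computes means of the functions $X^2$ and $\sin(aX)$ directly from the samples, with no reference to any underlying events $\omega$; the constant $a$ is an arbitrary illustrative choice.

```python
# Functions of a random variable (Sect. 2.2.4): means of X**2 and sin(a*X)
# computed from sampled values, compared with the exact results for X
# uniform on [0, 1).
import numpy as np

rng = np.random.default_rng(3)
x = rng.random(1_000_000)       # sampled values of X
a = 2.0

print("<X^2>     :", (x**2).mean(),        " exact:", 1.0 / 3.0)
print("<sin(aX)> :", np.sin(a * x).mean(), " exact:", (1.0 - np.cos(a)) / a)
```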
2.3 Joint and Conditional Probabilities: Independence

2.3.1 Joint Probabilities

We explained in Sect. 2.2.3 how the occurrence of mutually exclusive events is related to the concept of nonintersecting sets. We now consider the concept $P(A \cap B)$, where $A \cap B$ is nonempty. An event $\omega$ which satisfies $\omega \in A$ will only satisfy $\omega \in A \cap B$ if $\omega \in B$ as well. Thus,

$P(A \cap B) = P\{(\omega \in A)\ \text{and}\ (\omega \in B)\}$,  (2.3.1)

and $P(A \cap B)$ is called the joint probability that the event $\omega$ is contained in both classes, or, alternatively, that both the events $\omega \in A$ and $\omega \in B$ occur. Joint probabilities occur naturally in the context of this book in two ways:

i) When the event is specified by a vector, e.g., $m$ mice and $n$ tigers. The probability of this event is the joint probability of [$m$ mice (and any number of tigers)] and [$n$ tigers (and any number of mice)]. All vector specifications are implicitly joint probabilities in this sense.

ii) When more than one time is considered: what is the probability that (at time $t_1$ there are $m_1$ tigers and $n_1$ mice) and (at time $t_2$ there are $m_2$ tigers and $n_2$ mice)? To consider such a probability, we have effectively created, out of the events at time $t_1$ and the events at time $t_2$, joint events involving one event at each time. In essence, there is no difference between these two cases except for the fundamental dynamical role of time.

2.3.2 Conditional Probabilities

We may specify conditions on the events we are interested in and consider only these, e.g., the probability of 21 buffaloes given that we know there are 100 lions. What does this mean? Clearly, we will be interested only in those events contained in the set $B = \{$all events where exactly 100 lions occur$\}$. This means that we need to define conditional probabilities, which are defined only on the collection of all sets contained in $B$. We define the conditional probability as

$P(A|B) = P(A \cap B)/P(B)$,  (2.3.2)

and this satisfies our intuitive conception that the conditional probability that $\omega \in A$ (given that we know $\omega \in B$) is given by dividing the probability of joint occurrence by the probability that $\omega \in B$. We can define in both directions, i.e., we have

$P(A \cap B) = P(A|B)P(B) = P(B|A)P(A)$.  (2.3.3)

There is no particular conceptual difference between, say, the probability of {(21 buffaloes) given (100 lions)} and the reversed concept. However, when two times are involved, we do see a difference. For example, the probability that a particle is at position $x_1$ at time $t_1$, given that it was at $x_2$ at the previous time $t_2$, is a very natural thing to consider; indeed, it will turn out to be a central concept in this book. The converse sounds strange, i.e., the probability that a particle is at position $x_1$ at time $t_1$, given that it will be at position $x_2$ at a later time $t_2$. It smacks of clairvoyance: we cannot conceive of any natural way in which we would wish to consider it, although it is, in principle, a quantity very similar to the "natural" conditional probability, in which the condition precedes the event under consideration. The natural definition has already occurred in this book; for example, the $\phi(\Delta)d\Delta$ of Einstein (Sect. 1.2.1) is the probability that a particle at $x$ at time $t$ will be in the range $[x + \Delta, x + \Delta + d\Delta]$ at time $t + \tau$, and similarly in the other examples. Our intuition tells us, as it told Einstein (as can be seen by reading the extract from his paper), that this kind of conditional probability is directly related to the time development of a probabilistic system.
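The definition (2.3.2) can be checked directly by simulation. In the following sketch (the dice example is ours, not the text's), $B$ is the set of even faces of a die and $A$ the set of faces $\ge 5$; the conditional relative frequency of $A$ within $B$ reproduces $P(A \cap B)/P(B) = 1/3$.

```python
import random

rng = random.Random(1)
n = 500_000
n_B = n_AB = 0
for _ in range(n):
    omega = rng.randint(1, 6)       # one event: a fair die roll
    if omega % 2 == 0:              # omega in B (even face)
        n_B += 1
        if omega >= 5:              # omega also in A (face >= 5)
            n_AB += 1

p_B, p_AB = n_B / n, n_AB / n
print(p_AB / p_B)   # P(A|B) by (2.3.2): ~ 1/3, since of {2, 4, 6} only 6 is >= 5
```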
2.3.3 Relationship Between Joint Probabilities of Different Orders

Suppose we have a collection of sets $B_i$ such that

$B_i \cap B_j = \emptyset$,  (2.3.4)

$\bigcup_i B_i = \Omega$,  (2.3.5)

so that the sets divide up the space $\Omega$ into nonoverlapping subsets. Then

$\bigcup_i (A \cap B_i) = A \cap \left(\bigcup_i B_i\right) = A \cap \Omega = A$.  (2.3.6)

Using now probability axiom (iii), we see that the sets $A \cap B_i$ satisfy the conditions on the $A_i$ used there, so that

$\sum_i P(A \cap B_i) = P\left(\bigcup_i (A \cap B_i)\right)$  (2.3.7)

$= P(A)$,  (2.3.8)

and thus

$\sum_i P(A|B_i)P(B_i) = P(A)$.  (2.3.9)

Thus, summing over all mutually exclusive possibilities of $B$ in the joint probability eliminates that variable. Hence, in general,

$\sum_j P(A_i \cap B_j \cap C_k \cap \dots) = P(A_i \cap C_k \cap \dots)$.  (2.3.10)

The result (2.3.9) has very significant consequences in the development of the theory of stochastic processes, which depends heavily on joint probabilities.

2.3.4 Independence

We need a probabilistic way of specifying what we mean by independent events. Two sets of events $A$ and $B$ should represent independent sets of events if the specification that a particular event is contained in $B$ has no influence on the probability of that event belonging to $A$. Thus, the conditional probability $P(A|B)$ should be independent of $B$, and hence

$P(A \cap B) = P(A)P(B)$.  (2.3.11)

In the case of several events, we need a somewhat stronger specification. The events $(\omega \in A_i)$ ($i = 1, 2, \dots, n$) will be considered to be independent if, for any subset $(i_1, i_2, \dots, i_k)$ of the set $(1, 2, \dots, n)$,

$P(A_{i_1} \cap A_{i_2} \cap \dots \cap A_{i_k}) = P(A_{i_1})P(A_{i_2})\cdots P(A_{i_k})$.  (2.3.12)

It is important to require factorisation for all possible combinations, as in (2.3.12). For example, for three sets $A_i$ it is quite conceivable that

$P(A_i \cap A_j) = P(A_i)P(A_j)$  (2.3.13)

for all different $i$ and $j$, but also that $A_1 \cap A_2 = A_2 \cap A_3 = A_3 \cap A_1$ (see Fig. 2.1). This requires

$P(A_1 \cap A_2 \cap A_3) = P(A_2 \cap A_3) = P(A_2)P(A_3) \ne P(A_1)P(A_2)P(A_3)$.  (2.3.14)

We can see that the occurrence of $\omega \in A_2$ and $\omega \in A_3$ necessarily implies the occurrence of $\omega \in A_1$. In this sense the events are obviously not independent.

Fig. 2.1. Illustration of statistical independence in pairs, but not in threes. In the three sets, $A_i \cap A_j$ is, in all cases, the central region. By appropriate choice of probabilities, we can arrange $P(A_i \cap A_j) = P(A_i)P(A_j)$.

Random variables $X_1, X_2, X_3, \dots$ will be said to be independent random variables if, for all sets of the form $A_i = \{x$ such that $a_i \le x \le b_i\}$, the events $X_1 \in A_1$, $X_2 \in A_2$, $X_3 \in A_3, \dots$ are independent events. This will mean that all values of the $X_i$ are assumed independently of those of the remaining $X_j$.

2.4 Mean Values and Probability Density

The mean value of a random variable $R(\omega)$, in which the basic events $\omega$ are countably specifiable, is given by

$\langle R \rangle = \sum_\omega P(\omega)R(\omega)$,  (2.4.1)

where $P(\omega)$ means the probability of the set containing only the single event $\omega$. In the case of a continuous variable, the probability axioms above enable us to define a probability density function $p(\omega)$ such that, if $A(\omega_0, d\omega_0)$ is the set

$(\omega_0 \le \omega < \omega_0 + d\omega_0)$,  (2.4.2)

then

$p(\omega_0)\,d\omega_0 = P[A(\omega_0, d\omega_0)]$  (2.4.3)

$= P(d\omega_0)$.  (2.4.4)

The last is a notation often used by mathematicians. Details of how this is done have been nicely explained by Feller [2.1]. In this case,

$\langle R \rangle = \int d\omega\, R(\omega)\,p(\omega)$.  (2.4.5)

One can often (as mentioned in Sect. 2.2.4) use $R$ itself to specify the event, so we will often write

$\langle f(R) \rangle = \int dR\, f(R)\,p(R)$;  (2.4.8)

and if such mean values are known for a sufficiently wide class of functions $f$, then we know $p(R)$. The proof follows by choosing $f(R) = 1$ for $R_0 \le R < R_0 + dR_0$ and zero otherwise, so that $\langle f(R) \rangle$ picks out the probability density at any desired point.
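The two routes to a mean value can be compared numerically. In the sketch below (the Gaussian density and the choice $f(R) = R^2$ are our own), the density formula (2.4.8) is evaluated by crude quadrature and set against a direct sample average.

```python
import math
import random

# Density formula versus Monte Carlo sampling for <R**2> under a unit Gaussian.
def p(x):
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

dx = 0.001
quad = sum((i * dx) ** 2 * p(i * dx) * dx for i in range(-8000, 8001))

rng = random.Random(2)
n = 200_000
mc = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n)) / n

print(quad, mc)   # both ~ 1.0, the value of <R**2> for a unit Gaussian
```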
2.5.2 The Law of Large Numbers

Suppose $X_1, X_2, X_3, \dots$ is a sequence of random variables, each with the same mean $\langle X \rangle$, whose covariances $\langle X_n, X_m \rangle$ vanish sufficiently rapidly as $|n - m| \to \infty$. Then, defining

$\bar{X}_N = \frac{1}{N}\sum_{n=1}^{N} X_n$,  (2.5.6)

we shall show that

$\lim_{N \to \infty} \bar{X}_N = \langle X \rangle$.  (2.5.7)

It is clear that

$\langle \bar{X}_N \rangle = \langle X \rangle$.  (2.5.8)

We now calculate the variance of $\bar{X}_N$ and show that as $N \to \infty$ it vanishes under certain conditions:

$\mathrm{var}\{\bar{X}_N\} = \langle \bar{X}_N^2 \rangle - \langle \bar{X}_N \rangle^2 = \frac{1}{N^2}\sum_{n,m=1}^{N} \langle X_n, X_m \rangle$.  (2.5.9)

Provided $\langle X_n, X_m \rangle$ falls off sufficiently rapidly as $|n - m| \to \infty$, we find

$\lim_{N \to \infty} \mathrm{var}\{\bar{X}_N\} = 0$,  (2.5.10)

so that $\lim_{N\to\infty} \bar{X}_N$ is a deterministic variable equal to $\langle X \rangle$.

The principal tool for such limiting arguments is the characteristic function of a set of variables $\boldsymbol{X}$,

$\phi(\boldsymbol{s}) = \langle \exp(\mathrm{i}\,\boldsymbol{s}\cdot\boldsymbol{X}) \rangle$,

whose chief properties (numbered (i)-(vi) in the full treatment) are that moments are generated by differentiation at $\boldsymbol{s} = 0$ (2.6.2), that independence of the $X_i$ is equivalent to the factorisation of $\phi$ (2.6.6), that it possesses a convergence property (v), and that it obeys an inversion formula (vi). In particular, if the density factorises,

$p(\boldsymbol{x}) = \prod_i p_i(x_i)$,  (2.6.7)

then

$\phi(\boldsymbol{s}) = \prod_i \phi_i(s_i)$.  (2.6.8)

The characteristic function plays an important role in this book, which arises from the convergence property (v): this allows us to perform limiting processes on the characteristic function rather than on the probability distribution itself, and often makes proofs easier. Further, the fact that the characteristic function is truly characteristic, i.e., the inversion formula (vi), shows that different characteristic functions arise from different distributions. As well as this, the straightforward derivation of the moments by (2.6.2) makes any determination of the characteristic function directly relevant to measurable quantities.

2.7 Cumulant Generating Function: Correlation Functions and Cumulants

A further important property of the characteristic function arises by considering its logarithm

$\Phi(\boldsymbol{s}) = \log\phi(\boldsymbol{s})$,  (2.7.1)

which is called the cumulant generating function. Let us assume that all moments exist, so that $\phi(\boldsymbol{s})$, and hence $\Phi(\boldsymbol{s})$, is expandable in a power series, which can be written as

$\Phi(\boldsymbol{s}) = \sum{}' \frac{\mathrm{i}^{m_1+m_2+\dots+m_n}}{m_1!\,m_2!\cdots m_n!}\,\langle\langle X_1^{m_1} X_2^{m_2}\cdots X_n^{m_n} \rangle\rangle\, s_1^{m_1} s_2^{m_2}\cdots s_n^{m_n}$,  (2.7.2)

where the prime on the sum indicates that the term with all $m_i = 0$ is absent, and where the quantities $\langle\langle X_1^{m_1} X_2^{m_2}\cdots X_n^{m_n} \rangle\rangle$ are called the cumulants of the variables $\boldsymbol{X}$. The notation chosen should not be taken to mean that the cumulants are functions of the particular product of powers of the $X_i$; it rather indicates the moment of highest order which occurs in their expression in terms of moments. Stratonovich [2.4] also uses the term correlation functions, a term which we shall reserve for cumulants which involve more than one $X_i$. For, if the $X_i$ are all independent, the factorisation property (2.6.6) implies that $\Phi(\boldsymbol{s})$ (the cumulant generating function) is a sum of terms, each of which is a function of only one $s_i$; hence the coefficients of mixed terms, i.e., the correlation functions (in our terminology), are all zero, and the converse is also true. Thus, the magnitude of the correlation functions is a measure of the degree of correlation.

The cumulants and correlation functions can be evaluated in terms of moments by expanding the characteristic function as a power series,

$\phi(\boldsymbol{s}) = \sum \frac{\mathrm{i}^{m_1+\dots+m_n}}{m_1!\cdots m_n!}\,\langle X_1^{m_1}\cdots X_n^{m_n} \rangle\, s_1^{m_1}\cdots s_n^{m_n}$,  (2.7.3)

expanding the logarithm in a power series, and comparing it with (2.7.2) for $\Phi(\boldsymbol{s})$. No simple formula can be given, but the first few cumulants can be exhibited; we find

$\langle\langle X_i \rangle\rangle = \langle X_i \rangle$,  (2.7.4)

$\langle\langle X_i X_j \rangle\rangle = \langle X_i X_j \rangle - \langle X_i \rangle\langle X_j \rangle$,  (2.7.5)

$\langle\langle X_i X_j X_k \rangle\rangle = \langle X_i X_j X_k \rangle - \langle X_i X_j \rangle\langle X_k \rangle - \langle X_i X_k \rangle\langle X_j \rangle - \langle X_j X_k \rangle\langle X_i \rangle + 2\langle X_i \rangle\langle X_j \rangle\langle X_k \rangle$.

In general, a cumulant is obtained by summing over all ways of partitioning the variables into subsets: writing $C_p$ for the sum, over all partitions into $p$ subsets, of the products of the moments of the variables in each subset, the partition into $p$ subsets carries the weight $(-1)^{p-1}(p-1)!$, so that at fourth order

$\langle\langle X_1 X_2 X_3 X_4 \rangle\rangle = C_1 - C_2 + 2C_3 - 6C_4$.  (2.7.8)

2.7.2 Significance of Cumulants

From (2.7.4, 5) we see that the first two cumulants are the means $\langle X_i \rangle$ and covariances $\langle X_i, X_j \rangle$. Higher-order cumulants contain information of decreasing significance, unlike higher-order moments. We cannot set all moments higher than a certain order equal to zero, since $\langle X^{2n} \rangle \ge \langle X^n \rangle^2$, and thus all moments contain information about lower moments. For cumulants, however, we can consistently set

$\langle\langle X \rangle\rangle = a$, $\langle\langle X^2 \rangle\rangle = \sigma^2$, $\langle\langle X^n \rangle\rangle = 0$ ($n > 2$),

and we can easily deduce, by using the inversion formula for the characteristic function, that

$p(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp[-(x - a)^2/2\sigma^2]$,  (2.7.9)

a Gaussian probability distribution. It does not, however, seem possible to give more than this intuitive justification. Indeed, the theorem of Marcinkiewicz [2.8, 9] shows that the cumulant generating function cannot be a polynomial of degree greater than 2, that is, either all but the first two cumulants vanish or there are an infinite number of nonvanishing cumulants. The greatest significance of cumulants lies in the definition of the correlation functions of different variables in terms of them; this leads further to important approximation methods.
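The statement that all cumulants beyond the second vanish for a Gaussian can be checked numerically. The sketch below (sampling parameters ours) estimates the first four one-variable cumulants from sample moments, using the standard moment-cumulant relations of which (2.7.4, 5) are the first two.

```python
import random

rng = random.Random(3)
xs = [rng.gauss(1.0, 2.0) for _ in range(400_000)]

def moment(r):
    # Sample moment <X**r>.
    return sum(x ** r for x in xs) / len(xs)

m1, m2, m3, m4 = (moment(r) for r in (1, 2, 3, 4))

# Standard one-variable cumulants in terms of moments:
k1 = m1
k2 = m2 - m1 ** 2
k3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
k4 = m4 - 4 * m3 * m1 - 3 * m2 ** 2 + 12 * m2 * m1 ** 2 - 6 * m1 ** 4

print(k1, k2)   # ~ 1.0 and ~ 4.0: the mean and variance
print(k3, k4)   # ~ 0: higher cumulants vanish for a Gaussian
```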
2.8 Gaussian and Poissonian Probability Distributions

2.8.1 The Gaussian Distribution

By far the most important probability distribution is the Gaussian, or normal, distribution. Here we collect together the most important facts about it. If $\boldsymbol{X}$ is a vector of $n$ Gaussian random variables, the corresponding multivariate probability density function can be written

$p(\boldsymbol{x}) = [(2\pi)^n\det(\sigma)]^{-1/2}\exp[-\tfrac{1}{2}(\boldsymbol{x}-\bar{\boldsymbol{x}})^{\mathrm{T}}\sigma^{-1}(\boldsymbol{x}-\bar{\boldsymbol{x}})]$,  (2.8.1)

so that

$\langle \boldsymbol{X} \rangle = \int d\boldsymbol{x}\,\boldsymbol{x}\,p(\boldsymbol{x}) = \bar{\boldsymbol{x}}$,  (2.8.2)

$\langle \boldsymbol{X}\boldsymbol{X}^{\mathrm{T}} \rangle = \int d\boldsymbol{x}\,\boldsymbol{x}\boldsymbol{x}^{\mathrm{T}}\,p(\boldsymbol{x}) = \bar{\boldsymbol{x}}\bar{\boldsymbol{x}}^{\mathrm{T}} + \sigma$,  (2.8.3)

and the characteristic function is given by

$\phi(\boldsymbol{s}) = \exp(\mathrm{i}\,\boldsymbol{s}^{\mathrm{T}}\bar{\boldsymbol{x}} - \tfrac{1}{2}\boldsymbol{s}^{\mathrm{T}}\sigma\boldsymbol{s})$.

2.8.2 The Central Limit Theorem

The Gaussian distribution owes its central position to the central limit theorem: the sum of a large number of independent random variables of bounded variance is, when suitably normalised, Gaussian. Precisely, let $X_i$ be independent random variables with zero means, variances $b_i$ and densities $p_i(x)$; set $S_n = \sum_{i=1}^n X_i$ and $\sigma_n^2 = \sum_{i=1}^n b_i$, and require the Lindeberg condition: for any $t > 0$,

$\lim_{n\to\infty}\frac{1}{\sigma_n^2}\sum_{i=1}^{n}\int_{|x| > t\sigma_n} dx\,x^2\,p_i(x) = 0$.  (2.8.10)

Then, under these conditions, the distribution of the normalised sums $S_n/\sigma_n$ tends to the Gaussian with zero mean and unit variance.

The proof of the theorem can be found in [2.1]. It is worthwhile commenting on the hypotheses, however. We first note that the summands $X_i$ are required to be independent. This condition is not absolutely necessary; for example, choose

$X_i = \sum_j a_{ij} Y_j$,  (2.8.11)

a finite linear combination, where the $Y_j$ are independent. Since the sum of the $X$'s can be rewritten as a sum of $Y$'s (with certain finite coefficients), the theorem is still true. Roughly speaking, as long as the correlation between $X_i$ and $X_j$ goes to zero sufficiently rapidly as $|i - j| \to \infty$, a central limit theorem is to be expected. The Lindeberg condition (2.8.10) is not an obviously understandable condition, but it is the weakest condition which expresses the requirement that the probability for $|X_i|$ to be large is very small. For example, if all the $b_i$ are infinite or greater than some constant $C$, it is clear that $\sigma_n^2$ diverges as $n \to \infty$. The sum of integrals in (2.8.10) is the sum of contributions to the variances from all $|X_i| > t\sigma_n$, and it is clear that as $n \to \infty$, each contribution goes to zero. The Lindeberg condition requires the sum of all the contributions not to diverge as fast as $\sigma_n^2$. In practice, it is a rather weak requirement, satisfied if $|X_i| < C$ for all $i$, or if the $p_i(x)$ go to zero sufficiently rapidly as $x \to \pm\infty$. An exception is

$p_i(x) = \frac{a_i}{\pi(x^2 + a_i^2)}$,  (2.8.12)

the Cauchy, or Lorentzian, distribution. The variance of this distribution is infinite and, in fact, the sum of all the $X_i$ has a distribution of the same form as (2.8.12), with $a_i$ replaced by $\sum_i a_i$. Obviously, the Lindeberg condition is not satisfied. A related condition, also known as the Lindeberg condition, will arise in Sect. 3.3.1, where we discuss the replacement of a discrete process by one with continuous steps.
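A numerical illustration of the theorem and of the Cauchy exception (the sample sizes below are our own choices): sums of bounded uniform variables are already nearly Gaussian for $n = 30$, while normalised sums of Cauchy variables remain Cauchy for any $n$.

```python
import math
import random

rng = random.Random(4)

# Sums of independent uniforms: S_n/sigma_n should be close to a unit
# Gaussian (the Lindeberg condition is trivially satisfied for bounded
# variables).  var of uniform(0,1) is 1/12.
n = 30
sigma_n = math.sqrt(n / 12.0)
samples = [sum(rng.random() - 0.5 for _ in range(n)) / sigma_n
           for _ in range(100_000)]
print(sum(abs(s) < 1.0 for s in samples) / len(samples))   # ~ 0.6827 = P(|Z|<1)

# For Cauchy variables the theorem fails: the sum of n standard Cauchy
# variables, scaled by n, is again standard Cauchy, however large n is.
cauchy = [sum(math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)) / n
          for _ in range(100_000)]
print(sum(abs(c) < 1.0 for c in cauchy) / len(cauchy))     # ~ 0.5, not 0.6827
```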
2.8.3 The Poisson Distribution

A distribution which plays a central role in the study of random variables which take on positive integer values is the Poisson distribution. If $X$ is the relevant variable, the Poisson distribution is defined by

$P(X = x) \equiv P(x) = \mathrm{e}^{-\alpha}\alpha^x/x!$,  (2.8.13)

and clearly the factorial moments, defined by

$\langle X^r \rangle_{\mathrm{f}} = \langle X(X-1)\cdots(X-r+1) \rangle$,  (2.8.14)

are given by

$\langle X^r \rangle_{\mathrm{f}} = \alpha^r$.  (2.8.15)

For variables whose range is the nonnegative integers, we can very naturally define the generating function

$G(s) = \sum_{x=0}^{\infty} s^x P(x) = \langle s^X \rangle$,  (2.8.16)

which is related to the characteristic function by

$G(s) = \phi(-\mathrm{i}\log s)$.  (2.8.17)

The generating function has the useful property that

$\langle X^r \rangle_{\mathrm{f}} = \left[\left(\frac{d}{ds}\right)^r G(s)\right]_{s=1}$.  (2.8.18)

For the Poisson distribution we have

$G(s) = \sum_{x=0}^{\infty}\frac{(\alpha s)^x \mathrm{e}^{-\alpha}}{x!} = \exp[\alpha(s - 1)]$.  (2.8.19)

We may also define the factorial cumulant generating function $g(s)$ by

$g(s) = \log G(s)$  (2.8.20)

and the factorial cumulants $\langle\langle X^r \rangle\rangle_{\mathrm{f}}$ as the coefficients in its power series expansion,

$g(s) = \sum_r \frac{\langle\langle X^r \rangle\rangle_{\mathrm{f}}}{r!}(s - 1)^r$.

We see that the Poisson distribution has all but the first factorial cumulant zero. The Poisson distribution arises naturally in very many contexts; for example, we have already met it in Sect. 1.2.3 as the solution of a simple master equation. It plays a central role in the study of random variables which take on integer values, similar to that occupied by the Gaussian distribution in the study of variables with a continuous range. However, the only simple multivariate generalisation of the Poisson is simply a product of Poissons, i.e., of the form

$P(x_1, x_2, \dots, x_n) = \prod_i \mathrm{e}^{-\alpha_i}\alpha_i^{x_i}/x_i!$.  (2.8.21)

There is no logical concept of a correlated multipoissonian distribution, similar to that of a correlated multivariate Gaussian distribution.
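The factorial-moment property (2.8.15) is easy to verify by sampling. The sketch below uses Knuth's product method to generate Poisson variates (a standard algorithm; the helper functions and parameters are our own).

```python
import math
import random

rng = random.Random(5)

def poisson(alpha):
    # Knuth's product method for Poisson sampling (fine for moderate alpha).
    L, k, p = math.exp(-alpha), 0, 1.0
    while True:
        p *= rng.random()
        if p < L:
            return k
        k += 1

alpha = 3.0
xs = [poisson(alpha) for _ in range(200_000)]

def factorial_moment(r):
    # Sample mean of the falling factorial X(X-1)...(X-r+1), as in (2.8.14).
    def fall(x):
        out = 1
        for j in range(r):
            out *= (x - j)
        return out
    return sum(fall(x) for x in xs) / len(xs)

for r in (1, 2, 3):
    print(factorial_moment(r))   # ~ alpha**r = 3, 9, 27, as in (2.8.15)
```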
2.9 Limits of Sequences of Random Variables

Much computational work consists of determining approximations to random variables, in which the concept of a limit of a sequence of random variables naturally arises. However, there is no unique way of defining such a limit. For, suppose we have a probability space $\Omega$ and a sequence of random variables $X_n$ defined on $\Omega$. Then by the limit of the sequence as $n \to \infty$,

$X = \lim_{n\to\infty} X_n$,  (2.9.1)

we mean a random variable $X$ which, in some sense, is approached by the sequence of random variables $X_n$. The various possibilities arise when one considers that the probability space $\Omega$ has elements $\omega$ which have a probability density $p(\omega)$. Then we can choose the following definitions.

2.9.1 Almost Certain Limit

$X_n$ converges almost certainly to $X$ if, for all $\omega$ except a set of probability zero,

$\lim_{n\to\infty} X_n(\omega) = X(\omega)$.  (2.9.2)

Thus each realisation of $X_n$ converges to $X$, and we write

$\text{ac-lim}_{n\to\infty} X_n = X$.  (2.9.3)

2.9.2 Mean Square Limit (Limit in the Mean)

Another possibility is to regard the $X_n(\omega)$ as functions of $\omega$ and look for the mean square deviation of $X_n(\omega)$ from $X(\omega)$. Thus, we say that $X_n$ converges to $X$ in the mean square if

$\lim_{n\to\infty}\int d\omega\,p(\omega)[X_n(\omega) - X(\omega)]^2 = \lim_{n\to\infty}\langle (X_n - X)^2 \rangle = 0$.  (2.9.4)

This is the kind of limit which is well known in Hilbert space theory. We write

$\text{ms-lim}_{n\to\infty} X_n = X$.  (2.9.5)

2.9.3 Stochastic Limit, or Limit in Probability

We can consider the possibility that $X_n(\omega)$ approaches $X$ because the probability of deviation from $X$ approaches zero: precisely, this means that if, for any $\varepsilon > 0$,

$\lim_{n\to\infty} P(|X_n - X| > \varepsilon) = 0$,  (2.9.6)

then the stochastic limit of $X_n$ is $X$. Note that the probability can be written as follows. Suppose $\chi_\varepsilon(z)$ is a function such that

$\chi_\varepsilon(z) = 1$ for $|z| > \varepsilon$, and $\chi_\varepsilon(z) = 0$ for $|z| \le \varepsilon$.  (2.9.7)

Then

$P(|X_n - X| > \varepsilon) = \int d\omega\,p(\omega)\,\chi_\varepsilon(X_n(\omega) - X(\omega))$.  (2.9.8)

In this case, we write

$\text{st-lim}_{n\to\infty} X_n = X$.  (2.9.9)

2.9.4 Limit in Distribution

An even weaker form of convergence occurs if, for any continuous bounded function $f(x)$,

$\lim_{n\to\infty}\langle f(X_n) \rangle = \langle f(X) \rangle$.  (2.9.10)

In this case the convergence is said to be in distribution. In particular, using $\exp(\mathrm{i}xs)$ for $f(x)$, we find that the characteristic functions approach each other, and hence the probability density of $X_n$ approaches that of $X$.

2.9.5 Relationship Between Limits

The following relations can be shown:

almost certain convergence $\Longrightarrow$ stochastic convergence;
convergence in mean square $\Longrightarrow$ stochastic convergence;
stochastic convergence $\Longrightarrow$ convergence in distribution.

All of these limits have uses in applications.

3. Markov Processes

3.1 Stochastic Processes

All of the examples given in Chap. 1 can be mathematically described as stochastic processes, by which we mean, in a loose sense, systems which evolve probabilistically in time, or, more precisely, systems in which a certain time-dependent random variable $X(t)$ exists. We can measure values $x_1, x_2, x_3$, etc., of $X(t)$ at times $t_1, t_2, t_3, \dots$, and we assume that a set of joint probability densities exists,

$p(x_1, t_1; x_2, t_2; x_3, t_3; \dots)$,  (3.1.1)

which describe the system completely. In terms of these joint probability density functions, one can also define conditional probability densities:

$p(x_1, t_1; x_2, t_2; \dots \,|\, y_1, \tau_1; y_2, \tau_2; \dots) = p(x_1, t_1; x_2, t_2; \dots; y_1, \tau_1; y_2, \tau_2; \dots)/p(y_1, \tau_1; y_2, \tau_2; \dots)$.  (3.1.2)

These definitions are valid independently of the ordering of the times, although it is usual to consider only times which increase from right to left, i.e.,

$t_1 \ge t_2 \ge \dots \ge \tau_1 \ge \tau_2 \ge \dots$.  (3.1.3)

The concept of an evolution equation leads us to consider the conditional probabilities as predictions of the future values of $X(t)$ (i.e., $x_1, x_2, \dots$ at times $t_1, t_2, \dots$), given the knowledge of the past (values $y_1, y_2, \dots$ at times $\tau_1, \tau_2, \dots$). The concept of a general stochastic process is very loose. To define the process we need to know at least all possible joint probabilities of the kind in (3.1.1). If such knowledge does define the process, it is known as a separable stochastic process. All the processes considered in this book will be assumed to be separable. The simplest kind of stochastic process is that of complete independence:

$p(x_1, t_1; x_2, t_2; x_3, t_3; \dots) = \prod_i p(x_i, t_i)$,  (3.1.4)

which means that the value of $X$ at time $t$ is completely independent of its values in the past (or future). An even more special case occurs when the $p(x_i, t_i)$ are independent of $t_i$, so that the same probability law governs the process at all times. We then have the Bernoulli trials, in which a probabilistic process is repeated at successive times. The next simplest idea is that of the Markov process, in which knowledge of only the present determines the future.

3.2 Markov Process

The Markov assumption is formulated in terms of the conditional probabilities. We require that if the times satisfy the ordering (3.1.3), the conditional probability is determined entirely by the knowledge of the most recent condition, i.e.,

$p(x_1, t_1; x_2, t_2; \dots \,|\, y_1, \tau_1; y_2, \tau_2; \dots) = p(x_1, t_1; x_2, t_2; \dots \,|\, y_1, \tau_1)$.  (3.2.1)

This is simply a more precise statement of the assumptions made by Einstein, Smoluchowski and others. It is, even by itself, extremely powerful. For it means that we can define everything in terms of the simple conditional probabilities $p(x_1, t_1 | y_1, \tau_1)$. For example, by definition of the conditional probability density, $p(x_1, t_1; x_2, t_2 | y_1, \tau_1) = p(x_1, t_1 | x_2, t_2; y_1, \tau_1)\,p(x_2, t_2 | y_1, \tau_1)$, and using the Markov assumption (3.2.1), we find

$p(x_1, t_1; x_2, t_2 | y_1, \tau_1) = p(x_1, t_1 | x_2, t_2)\,p(x_2, t_2 | y_1, \tau_1)$,  (3.2.2)

and it is not difficult to see that an arbitrary joint probability can be expressed simply as

$p(x_1, t_1; x_2, t_2; x_3, t_3; \dots; x_n, t_n) = p(x_1, t_1 | x_2, t_2)\,p(x_2, t_2 | x_3, t_3)\cdots p(x_{n-1}, t_{n-1} | x_n, t_n)\,p(x_n, t_n)$,  (3.2.3)

provided

$t_1 \ge t_2 \ge t_3 \ge \dots \ge t_{n-1} \ge t_n$.  (3.2.4)
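The factorisation (3.2.3) can be checked on the simplest possible example, a two-state Markov chain in discrete time (the transition matrix and path below are our own invention).

```python
import random

rng = random.Random(6)

# Two-state Markov chain with one-step transition matrix
# T[j][i] = P(next = j | current = i).
T = [[0.9, 0.3],
     [0.1, 0.7]]

def step(i):
    return 0 if rng.random() < T[0][i] else 1

# Estimate the joint probability of the path (x1, x2, x3) = (0, 1, 1),
# starting deterministically in state 0, and compare with the Markov
# factorisation (3.2.3): P(x3|x2) P(x2|x1) P(x1).
n, joint = 400_000, 0
for _ in range(n):
    x1 = 0
    x2 = step(x1)
    x3 = step(x2)
    joint += (x1, x2, x3) == (0, 1, 1)

print(joint / n)                 # empirical joint probability
print(T[1][1] * T[1][0] * 1.0)   # P(1|1) P(1|0) P(x1=0) = 0.7 * 0.1 = 0.07
```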
3.2.1 Consistency—the Chapman-Kolmogorov Equation

From Sect. 2.3.3 we require that summing over all mutually exclusive events of one kind in a joint probability eliminates that variable, i.e.,

$\sum_B P(A \cap B \cap C \cap \dots) = P(A \cap C \cap \dots)$;  (3.2.5)

and when this is applied to stochastic processes, we get two deceptively similar equations:

$p(x_1, t_1) = \int dx_2\,p(x_1, t_1; x_2, t_2) = \int dx_2\,p(x_1, t_1 | x_2, t_2)\,p(x_2, t_2)$.  (3.2.6)

This equation is an identity, valid for all stochastic processes, and is the first in a hierarchy of equations, the second of which is

$p(x_1, t_1 | x_3, t_3) = \int dx_2\,p(x_1, t_1; x_2, t_2 | x_3, t_3) = \int dx_2\,p(x_1, t_1 | x_2, t_2; x_3, t_3)\,p(x_2, t_2 | x_3, t_3)$.  (3.2.7)

This equation is also always valid. We now introduce the Markov assumption. If $t_1 \ge t_2 \ge t_3$, we can drop the $t_3$ dependence in the doubly conditioned probability and write

$p(x_1, t_1 | x_3, t_3) = \int dx_2\,p(x_1, t_1 | x_2, t_2)\,p(x_2, t_2 | x_3, t_3)$,  (3.2.8)

which is the Chapman-Kolmogorov equation. What is the essential difference between (3.2.8) and (3.2.6)? The obvious answer is that (3.2.6) is for unconditioned probabilities, whereas (3.2.8) is for conditional probabilities. Equation (3.2.8) is a rather complex nonlinear functional equation relating all conditional probabilities $p(x_i, t_i | x_j, t_j)$ to each other, whereas (3.2.6) simply constructs the one-time probabilities at the time $t_1$ in the future of $t_2$, given the conditional probability $p(x_1, t_1 | x_2, t_2)$. The Chapman-Kolmogorov equation has many solutions. These are best understood by deriving the differential form, which is done in Sect. 3.4.1 under certain rather mild conditions.

3.2.2 Discrete State Spaces

In the case where we have a discrete variable, we will use the symbol $\boldsymbol{N} = (N_1, N_2, N_3, \dots)$, where the $N_i$ are random variables which take on integral values. Clearly, we now replace

$\int dx \to \sum$,  (3.2.9)

and we can write the Chapman-Kolmogorov equation for such a process as

$P(\boldsymbol{n}_1, t_1 | \boldsymbol{n}_3, t_3) = \sum_{\boldsymbol{n}_2} P(\boldsymbol{n}_1, t_1 | \boldsymbol{n}_2, t_2)\,P(\boldsymbol{n}_2, t_2 | \boldsymbol{n}_3, t_3)$.  (3.2.10)

This is now a matrix multiplication, with possibly infinite matrices.

3.2.3 More General Measures

A more general formulation would assume a measure $d\mu(x)$ instead of $dx$, where a variety of choices can be made. For example, if $\mu(x)$ is a step function with steps at integral values of $x$, we recover the discrete state space form. Most mathematical works attempt to be as general as possible. For applications, such generality can lead to lack of clarity so, where possible, we will favour a more specific notation.
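For a discrete, time-homogeneous process, (3.2.10) says that the two-step transition matrix is the square of the one-step matrix. A minimal check (the matrix values are our own):

```python
# The Chapman-Kolmogorov equation (3.2.10) as matrix multiplication.
T = [[0.9, 0.3],
     [0.1, 0.7]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

two_step = matmul(T, T)
print(two_step)
# two_step[j][i] = sum_k T[j][k] T[k][i]: summing over the intermediate
# state k is exactly the integral over x2 in (3.2.8).
```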
3.3 Continuity in Stochastic Processes

Whether or not the random variable $X(t)$ has a continuous range of possible values is a completely different question from whether the sample path of $X(t)$ is a continuous function of $t$. For example, in a gas composed of molecules with velocities $V(t)$, it is clear that all possible values of $V(t)$ are in principle realisable, so that the range of $V(t)$ is continuous. However, a model of collisions in a gas of hard spheres as occurring instantaneously is often considered, and in such a model the velocity before the collision will change instantaneously at the time of impact to another value, so the sample path of $V(t)$ is not continuous. Nevertheless, in such a model, the position of a gas molecule $X(t)$ would be expected to be continuous.

A major question now arises. Do Markov processes with continuous sample paths actually exist in reality? Notice the combination of Markov and continuous. It is almost certainly the case that in a classical picture (i.e., not quantum mechanical), all variables with a continuous range have continuous sample paths. Even the hard-sphere gas mentioned above is an idealisation; more realistically, one should allow some potential to act which would continuously deflect the molecules during a collision. But it would also be the case that, if we observe on such a fine time scale, the process will probably not be Markovian. The immediate history of the whole system will almost certainly be required to predict even the probabilistic future. This is certainly borne out in all attempts to derive Markovian probabilistic equations from mechanics. Equations which are derived are rarely truly Markovian; rather, there is a certain characteristic memory time during which the previous history is important (Haake [3.1]).

This means that there is really no such thing as a Markov process; rather, there may be systems whose memory time is so small that, on the time scale on which we carry out observations, it is fair to regard them as being well approximated by a Markov process. But in this case, the question of whether the sample paths are actually continuous is not relevant. The sample paths of the approximating Markov process certainly need not be continuous. Even if collisions of molecules are not accurately modelled by hard spheres, during the time taken for a collision a finite change of velocity takes place, and this will appear in the approximating Markov process as a discrete step. On this time scale, even the position may change discontinuously, thus giving the picture of Brownian motion as modelled by Einstein. In chemical reactions, for example, the time taken for an individual reaction to proceed to completion (roughly of the same order of magnitude as the collision time for molecules) provides yet another minimum time, since during this time, states which cannot be described in terms of individual molecules exist. Here, therefore, the very description of the state in terms of individual molecules requires a certain minimum time scale to be considered.

However, Markov processes with continuous sample paths do exist mathematically and are useful in describing reality. The model of the gas mentioned above provides a useful example. The position of the molecule is indeed probably best modelled as changing discontinuously by discrete jumps. Compared to the distances travelled, however, these jumps are infinitesimal, and a continuous curve provides a good approximation to the sample path. On the other hand, the velocities can change by amounts which are of the same order of magnitude as typical values attained in practice. The average velocity of a molecule in a gas is about 1000 m/s, and during a collision it can easily reverse its sign. The velocities simply cannot reach (with any significant probability) values for which the changes of velocity can be regarded as very small. Hence, there is no sense in a continuous-path description of velocities in a gas.

3.3.1 Mathematical Definition of a Continuous Markov Process

For a Markov process, it can be shown [3.2] that with probability one the sample paths are continuous functions of $t$ if, for any $\varepsilon > 0$, we have

$\lim_{\Delta t \to 0}\frac{1}{\Delta t}\int_{|x - z| > \varepsilon} dx\,p(x, t + \Delta t | z, t) = 0$  (3.3.1)

uniformly in $z$, $t$ and $\Delta t$. This means that the probability for the final position $x$ to be finitely different from $z$ goes to zero faster than $\Delta t$, as $\Delta t$ goes to zero. [Equation (3.3.1) is sometimes called the Lindeberg condition.]

Examples

i) Einstein's solution for his $f(x, t)$ (Sect. 1.2.1) is really the conditional probability $p(x, t | 0, 0)$. Following his method we would find

$p(x, t + \Delta t | z, t) = (4\pi D\Delta t)^{-1/2}\exp[-(x - z)^2/4D\Delta t]$,  (3.3.2)

and it is easy to check that (3.3.1) is satisfied in this case. Thus, Brownian motion in Einstein's formulation has continuous sample paths.

ii) Cauchy Process: suppose

$p(x, t + \Delta t | z, t) = \frac{\Delta t}{\pi[(x - z)^2 + \Delta t^2]} + o(\Delta t)$.  (3.3.3)

Then this does not satisfy (3.3.1), so the sample paths are discontinuous. However, in both cases we have, as required for consistency,

$\lim_{\Delta t \to 0} p(x, t + \Delta t | z, t) = \delta(x - z)$,  (3.3.4)

and it is easy to show that in both cases the Chapman-Kolmogorov equation is satisfied.

The difference between the two processes just described is illustrated in Fig. 3.1, in which simulations of both processes are given.

Fig. 3.1. Illustration of sample paths of the Cauchy process X(t) (dashed) and Brownian motion W(t) (solid).

The difference between the two is striking. Notice, however, that even the Brownian motion curve is extremely irregular, even though continuous; in fact it is nowhere differentiable. The Cauchy-process curve is, of course, wildly discontinuous.
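The two examples can be simulated directly, in the spirit of Fig. 3.1. In the sketch below (mesh and seed are our own choices), the Brownian increments are Gaussian with variance $\Delta t$, while the Cauchy increments are drawn from (3.3.3) with scale parameter $\Delta t$; the largest single-step jump betrays the discontinuity of the Cauchy paths.

```python
import math
import random

rng = random.Random(7)

dt, n = 0.001, 1000
w = c = 0.0
w_path, c_path = [0.0], [0.0]
for _ in range(n):
    w += math.sqrt(dt) * rng.gauss(0.0, 1.0)                 # Brownian step
    c += dt * math.tan(math.pi * (rng.random() - 0.5))       # Cauchy step, scale dt
    w_path.append(w)
    c_path.append(c)

print(max(abs(w_path[i + 1] - w_path[i]) for i in range(n)))  # small: continuous
print(max(abs(c_path[i + 1] - c_path[i]) for i in range(n)))  # occasional huge jumps
```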
3.4 Differential Chapman-Kolmogorov Equation

Under appropriate assumptions, the Chapman-Kolmogorov equation can be reduced to a differential equation. The assumptions made are closely connected with the continuity properties of the process under consideration. Because of the form of the continuity condition (3.3.1), one is led to divide the differentiability conditions into parts, one corresponding to continuous motion of a representative point and the other to discontinuous motion. We require the following conditions for all $\varepsilon > 0$:

i) $\lim_{\Delta t \to 0} p(x, t + \Delta t | z, t)/\Delta t = W(x | z, t)$  (3.4.1)

uniformly in $x$, $z$ and $t$ for $|x - z| \ge \varepsilon$;

ii) $\lim_{\Delta t \to 0}\frac{1}{\Delta t}\int_{|x - z| < \varepsilon} dx\,(x_i - z_i)\,p(x, t + \Delta t | z, t) = A_i(z, t) + O(\varepsilon)$;  (3.4.2)

iii) $\lim_{\Delta t \to 0}\frac{1}{\Delta t}\int_{|x - z| < \varepsilon} dx\,(x_i - z_i)(x_j - z_j)\,p(x, t + \Delta t | z, t) = B_{ij}(z, t) + O(\varepsilon)$;  (3.4.3)

the last two being uniform in $z$, $\varepsilon$ and $t$. Notice that all higher-order coefficients of the form in (3.4.2, 3) must vanish. For example, consider the third-order quantity defined by

$\lim_{\Delta t \to 0}\frac{1}{\Delta t}\int_{|x - z| < \varepsilon} dx\,(x_i - z_i)(x_j - z_j)(x_k - z_k)\,p(x, t + \Delta t | z, t) = C_{ijk}(z, t) + O(\varepsilon)$.  (3.4.4)

Since $C_{ijk}$ is symmetric in $i, j, k$, consider

$\sum_{i,j,k}\alpha_i\alpha_j\alpha_k C_{ijk}(z, t) \equiv C(\boldsymbol{\alpha}, z, t)$,  (3.4.5)

so that

$C_{ijk}(z, t) = \frac{1}{3!}\frac{\partial^3}{\partial\alpha_i\,\partial\alpha_j\,\partial\alpha_k}C(\boldsymbol{\alpha}, z, t)$.  (3.4.6)

Then, since $|\boldsymbol{\alpha}\cdot(x - z)|^3 \le |\boldsymbol{\alpha}|\,\varepsilon\,[\boldsymbol{\alpha}\cdot(x - z)]^2$ for $|x - z| < \varepsilon$, we have

$|C(\boldsymbol{\alpha}, z, t) + O(\varepsilon)| \le |\boldsymbol{\alpha}|\,\varepsilon\left[\sum_{i,j}\alpha_i\alpha_j B_{ij}(z, t) + O(\varepsilon)\right]$,

so that $C(\boldsymbol{\alpha}, z, t)$ is itself of order $\varepsilon$ and hence vanishes, and similarly for all higher-order quantities. Under these conditions one can derive from the Chapman-Kolmogorov equation its differential form,

$\partial_t p(z, t | y, t') = -\sum_i\frac{\partial}{\partial z_i}[A_i(z, t)\,p(z, t | y, t')] + \frac{1}{2}\sum_{i,j}\frac{\partial^2}{\partial z_i\partial z_j}[B_{ij}(z, t)\,p(z, t | y, t')] + \int dx\,[W(z | x, t)\,p(x, t | y, t') - W(x | z, t)\,p(z, t | y, t')]$,  (3.4.22)

the differential Chapman-Kolmogorov equation, whose three terms describe, respectively, drift, diffusion and jumps.

3.5 Interpretation of Conditions and Results

3.5.2 Diffusion Processes

When the jump term $W$ vanishes, (3.4.22) reduces to the Fokker-Planck equation, which describes a diffusion process with drift vector $A_i(z, t)$ and diffusion matrix $B_{ij}(z, t)$. Heuristically, the sample paths of such a process obey

$y(t + \Delta t) = y(t) + A[y(t), t]\Delta t + \eta(t)\Delta t^{1/2}$,  (3.5.10)

where $\eta(t)$ is a random variable with

$\langle \eta(t) \rangle = 0$ and $\langle \eta(t)\eta(t)^{\mathrm{T}} \rangle = B[y(t), t]$.  (3.5.12)

It is easy to see that this picture gives i) sample paths which are always continuous, for, clearly, as $\Delta t \to 0$, $y(t + \Delta t) \to y(t)$; ii) sample paths which are nowhere differentiable, because of the $\Delta t^{1/2}$ occurring in (3.5.10). We shall see later, in Chap. 4, that the heuristic picture of (3.5.10) can be made much more precise and leads to the concept of the stochastic differential equation.

3.5.3 Deterministic Processes—Liouville's Equation

It is possible that in the differential Chapman-Kolmogorov equation (3.4.22) only the first term is nonzero, so we are led to the special case of a Liouville equation:

$\frac{\partial p(z, t | y, t')}{\partial t} = -\sum_i\frac{\partial}{\partial z_i}[A_i(z, t)\,p(z, t | y, t')]$,  (3.5.13)

which occurs in classical mechanics. This equation describes a completely deterministic motion, i.e., if $x(y, t)$ is the solution of the ordinary differential equation

$\frac{dx(t)}{dt} = A[x(t), t]$  (3.5.14)

with

$x(y, t') = y$,  (3.5.15)

then the solution to (3.5.13) with initial condition

$p(z, t' | y, t') = \delta(z - y)$  (3.5.16)

is

$p(z, t | y, t') = \delta[z - x(y, t)]$.  (3.5.17)

The proof of this assertion is best obtained by direct substitution. For

$-\sum_i\frac{\partial}{\partial z_i}\{A_i(z, t)\,\delta[z - x(y, t)]\}$  (3.5.18)

$= -\sum_i\frac{\partial}{\partial z_i}\{A_i[x(y, t), t]\,\delta[z - x(y, t)]\}$  (3.5.19)

$= -\sum_i A_i[x(y, t), t]\frac{\partial}{\partial z_i}\delta[z - x(y, t)]$,  (3.5.20)

and

$\frac{\partial}{\partial t}\delta[z - x(y, t)] = -\sum_i\frac{dx_i(y, t)}{dt}\frac{\partial}{\partial z_i}\delta[z - x(y, t)]$,  (3.5.21)

and by use of (3.5.14), we see that (3.5.20) and (3.5.21) are equal. Thus, if the particle is in a well-defined initial position $y$ at time $t'$, it stays on the trajectory obtained by solving the ordinary differential equation (3.5.14). Hence, deterministic motion, as defined by a first-order differential equation of the form (3.5.14), is an elementary form of Markov process. The solution (3.5.17) is, of course, merely a special case of the kind of process approximated by equations like (3.5.10), in which the Gaussian part is zero.
3.5.4 General Processes

In general, none of the quantities $A(z, t)$, $B(z, t)$ and $W(x | z, t)$ need vanish, and in this case we obtain a process whose sample paths are as illustrated in Fig. 3.2, i.e., a piecewise continuous path made up of pieces which correspond to a diffusion process with a nonzero drift, onto which is superimposed a fluctuating part.

Fig. 3.2. Illustration of a sample path of a general Markov process, in which drift, diffusion and jumps exist.

Fig. 3.3. Sample path of a Markov process with only drift and jumps.

It is also possible that $A(z, t)$ is nonzero but $B(z, t)$ is zero, and here the sample paths are, as in Fig. 3.3, composed of pieces of smooth curve [solutions of (3.5.14)] with discontinuities superimposed. This is very like the picture one would expect in a dilute gas, where the particles move freely between collisions, which cause an instantaneous change in momentum, though not position.

3.6 Equations for Time Development in Initial Time—Backward Equations

We can derive, much more simply than in Sect. 3.4, some equations which give the time development of $p(x, t | y, t')$ with respect to the initial variables $y, t'$. We consider

$\lim_{\Delta t' \to 0}\frac{1}{\Delta t'}[p(x, t | y, t' + \Delta t') - p(x, t | y, t')]$  (3.6.1)

$= \lim_{\Delta t' \to 0}\frac{1}{\Delta t'}\int dz\,p(z, t' + \Delta t' | y, t')\,[p(x, t | y, t' + \Delta t') - p(x, t | z, t' + \Delta t')]$  (3.6.2)

by use of the Chapman-Kolmogorov equation in the second term and by noting that the first term gives $1 \times p(x, t | y, t' + \Delta t')$. The assumptions that are necessary are now the existence of all relevant derivatives, and that $p(x, t | y, t')$ is continuous and bounded in $x, t, t'$ for some range $t - t' \ge \delta > 0$. We may then write

$= \lim_{\Delta t' \to 0}\frac{1}{\Delta t'}\int dz\,p(z, t' + \Delta t' | y, t')\,[p(x, t | y, t') - p(x, t | z, t')]$.  (3.6.3)

We now proceed using techniques similar to those used in Sect. 3.4.1 and finally derive

$\frac{\partial p(x, t | y, t')}{\partial t'} = -\sum_i A_i(y, t')\frac{\partial p(x, t | y, t')}{\partial y_i} - \frac{1}{2}\sum_{i,j}B_{ij}(y, t')\frac{\partial^2 p(x, t | y, t')}{\partial y_i\partial y_j} + \int dz\,W(z | y, t')\,[p(x, t | y, t') - p(x, t | z, t')]$,  (3.6.4)

which will be called the backward differential Chapman-Kolmogorov equation. In a mathematical sense, it is better defined than the corresponding forward equation (3.4.22). The appropriate initial condition for both equations is

$p(x, t | y, t) = \delta(x - y)$ for all $t$,  (3.6.5)

representing the obvious fact that if the particle is at $y$ at time $t$, the probability density for finding it at $x$ at the same time is $\delta(x - y)$.

The forward and the backward equations are equivalent to each other. For, solutions of the forward equation, subject to the initial condition (3.6.5) and any appropriate boundary conditions, yield solutions of the Chapman-Kolmogorov equation, as noted in Sect. 3.4.2. But these have just been shown to yield the backward equation. (The relation between appropriate boundary conditions for the Fokker-Planck equations is dealt with in Sects. 5.2.1, 4.) The basic difference is which set of variables is held fixed. In the case of the forward equation, we hold $(y, t')$ fixed, and solutions exist for $t \ge t'$, so that (3.6.5) is an initial condition for the forward equation. For the backward equation, solutions exist for $t' \le t$, so that, since the backward equation expresses development in $t'$, (3.6.5) is really better termed a final condition in this case. Since they are equivalent, the forward and backward equations are both useful. The forward equation gives more directly the values of measurable quantities as a function of the observed time $t$, and tends to be used more commonly in applications. The backward equation finds most application in the study of first passage time or exit problems, in which we find the probability that a particle leaves a region in a given time.
3.7 Stationary and Homogeneous Markov Processes

In Sect. 1.4.3 we met the concept of a stationary process, which represents the stochastic motion of a system which has settled down to a steady state, and whose stochastic properties are independent of when they are measured. Stationarity can be defined in various degrees, but we shall reserve the term "stationary process" for a strict definition, namely: a stochastic process $X(t)$ is stationary if $X(t)$ and the process $X(t + \varepsilon)$ have the same statistics for any $\varepsilon$. This is equivalent to saying that all joint probability densities satisfy time translation invariance, i.e.,

$p(x_1, t_1; x_2, t_2; \dots; x_n, t_n) = p(x_1, t_1 + \varepsilon; x_2, t_2 + \varepsilon; \dots; x_n, t_n + \varepsilon)$,  (3.7.1)

and hence such probabilities are only functions of the time differences $t_i - t_j$. In particular, the one-time probability is independent of time and can be simply written as

$p_s(x)$,  (3.7.2)

the two-time joint probability as

$p_s(x_1, t_1 - t_2; x_2, 0)$,  (3.7.3)

and, finally, the conditional probability can also be written as

$p_s(x_1, t_1 - t_2 | x_2, 0)$.  (3.7.4)

For a Markov process, since all joint probabilities can be written as products of the two-time conditional probability and the one-time probability, a necessary and sufficient condition for stationarity is the ability to write the one- and two-time probabilities in the forms given in (3.7.2-4).

3.7.1 Ergodic Properties

If we have a stationary process, it is reasonable to expect that average measurements could be constructed by taking values of the variable $x$ at successive times and averaging various functions of these. This is effectively a belief that the law of large numbers (as explained in Sect. 2.5.2) applies to the variables defined by successive measurements in a stochastic process. Let us define the variable $\bar{X}(T)$ by

$\bar{X}(T) = \frac{1}{T}\int_{-T/2}^{T/2} dt\,x(t)$,  (3.7.5)

where $x(t)$ is a stationary process, and consider the limit $T \to \infty$. This represents a possible model of measurement of the mean by averaging over all times. Clearly,

$\langle \bar{X}(T) \rangle = \langle X \rangle_s$.  (3.7.6)

We now calculate the variance of $\bar{X}(T)$. Thus,

$\langle \bar{X}(T)^2 \rangle = \frac{1}{T^2}\int_{-T/2}^{T/2}\int_{-T/2}^{T/2} dt\,dt'\,\langle x(t)x(t') \rangle$,  (3.7.7)

and if the process is stationary, the integrand depends only on $t - t'$; provided the stationary correlation function $\langle X(\tau), X(0) \rangle_s$ vanishes sufficiently rapidly as $\tau \to \infty$, the variance of $\bar{X}(T)$ vanishes as $T \to \infty$, so that the time average converges (in mean square) to the ensemble average $\langle X \rangle_s$.

3.7.2 Homogeneous Processes

A closely related concept is that of a process whose conditional probability is invariant under time translation, so that $p(x, t | x', t')$ depends on $t$ and $t'$ only through $t - t'$. A stationary process may be used to construct such a nonstationary process as follows. Suppose the conditional probability of the stationary process approaches the stationary distribution for large time differences,

$\lim_{t - t' \to \infty} p_s(x, t - t' | x', 0) = p_s(x)$.  (3.7.22)

We then select only those realisations for which $x$ takes the value $x_0$ at the time $t_0$, and define a process for $t, t' > t_0$  (3.7.23)

by

$P(x, t) = p_s(x, t - t_0 | x_0, 0)$  (3.7.24)

and

$P(x, t | x', t') = p_s(x, t - t' | x', 0)$,  (3.7.25)

and all other joint probabilities are obtained from these in the usual manner for a Markov process. Clearly, if (3.7.22) is satisfied, we find that as $t \to \infty$ or as $t_0 \to -\infty$, $P(x, t) \to p_s(x)$, and all other probabilities become stationary because the conditional probability is stationary. Such a process is known as a homogeneous process. The physical interpretation is rather obvious. We have a stochastic system whose variable $x$ is, by some external agency, fixed to have a value $x_0$ at time $t_0$. It then evolves back to a stationary system with the passage of time. This is how many stationary systems are created in practice. From the point of view of the differential Chapman-Kolmogorov equation, we will find that the stationary distribution function $p_s(x)$ is a time-independent solution of the differential Chapman-Kolmogorov equation, which for a homogeneous process takes the form

$\partial_t p(z, t | y, t') = -\sum_i\frac{\partial}{\partial z_i}[A_i(z)\,p(z, t | y, t')] + \frac{1}{2}\sum_{i,j}\frac{\partial^2}{\partial z_i\partial z_j}[B_{ij}(z)\,p(z, t | y, t')] + \int dx\,[W(z | x)\,p(x, t | y, t') - W(x | z)\,p(z, t | y, t')]$,  (3.7.26)

where we have used the fact that the process is homogeneous to note that $A$, $B$ and $W$, as defined in (3.4.1-3), are independent of $t$. This is an alternative definition of a homogeneous process.
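The ergodic idea of Sect. 3.7.1, that a time average along one realisation reproduces the ensemble average, can be illustrated with a stationary two-state Markov chain (all values and rates below are our own illustrative choices, not from the text).

```python
import random

rng = random.Random(8)

# Two-state chain: stay probabilities 0.9 (state 0) and 0.7 (state 1), so the
# stationary occupation probabilities are (0.3, 0.1)/(0.1 + 0.3) = (0.75, 0.25).
p_stay = {0: 0.9, 1: 0.7}
values = {0: -1.0, 1: +2.0}

x, total, n = 0, 0.0, 1_000_000
for _ in range(n):
    total += values[x]
    if rng.random() > p_stay[x]:
        x = 1 - x

print(total / n)                     # time average over one sample path
print(0.75 * (-1.0) + 0.25 * 2.0)    # ensemble average: -0.25
```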
3.7.3 Approach to a Stationary Process

A converse problem also exists. Suppose $A$, $B$ and $W$ are independent of time and $p_s(z)$ satisfies (3.7.26) with the left-hand side zero. Under what conditions does a solution of the differential Chapman-Kolmogorov equation approach the stationary solution $p_s(z)$? There does not appear to be a complete answer to this problem. However, we can give a reasonably good picture as follows. We define a Lyapunov functional $K$ of any two solutions $p_1$ and $p_2$ of the differential Chapman-Kolmogorov equation by

$K = \int dx\,p_1(x, t)\log[p_1(x, t)/p_2(x, t)]$  (3.7.27)

and assume for the moment that neither $p_1$ nor $p_2$ is zero anywhere. We will now show that $K$ is always positive and $dK/dt$ is always negative. Firstly, noting that both $p_1(x, t)$ and $p_2(x, t)$ are normalised to one, we write

$K[p_1, p_2; t] = \int dx\,p_1(x, t)\{\log[p_1(x, t)/p_2(x, t)] + p_2(x, t)/p_1(x, t) - 1\}$  (3.7.28)

and use the inequality, valid for all $z > 0$,

$-\log z + z - 1 \ge 0$,  (3.7.29)

to show that $K \ge 0$. Let us now show that $dK/dt \le 0$. We can write (using an abbreviated notation)

$\frac{dK}{dt} = \int dx\left\{\frac{\partial p_1}{\partial t}[\log(p_1/p_2) + 1] - \frac{\partial p_2}{\partial t}\,(p_1/p_2)\right\}$.  (3.7.30)

We now calculate one by one the contributions to $dK/dt$ from the drift, diffusion and jump terms in the differential Chapman-Kolmogorov equation:

$\left(\frac{dK}{dt}\right)_{\mathrm{drift}} = \int dx\sum_i\left\{-\frac{\partial}{\partial x_i}(A_i p_1)[\log(p_1/p_2) + 1] + (p_1/p_2)\frac{\partial}{\partial x_i}(A_i p_2)\right\}$,  (3.7.31)

which can be rearranged to give

$\left(\frac{dK}{dt}\right)_{\mathrm{drift}} = -\sum_i\int dx\,\frac{\partial}{\partial x_i}[A_i p_1\log(p_1/p_2)]$.  (3.7.32)

Similarly, we may calculate the diffusion contribution

$\left(\frac{dK}{dt}\right)_{\mathrm{diff}} = \int dx\sum_{i,j}\left\{\frac{1}{2}\frac{\partial^2}{\partial x_i\partial x_j}(B_{ij}p_1)[\log(p_1/p_2) + 1] - \frac{1}{2}(p_1/p_2)\frac{\partial^2}{\partial x_i\partial x_j}(B_{ij}p_2)\right\}$,  (3.7.33)

and after some rearranging we may write

$\left(\frac{dK}{dt}\right)_{\mathrm{diff}} = -\frac{1}{2}\sum_{i,j}\int dx\,p_1 B_{ij}\frac{\partial\log(p_1/p_2)}{\partial x_i}\frac{\partial\log(p_1/p_2)}{\partial x_j} + (\text{total-derivative terms, which yield surface contributions})$.  (3.7.34)

Finally, we may calculate the jump contribution similarly:

$\left(\frac{dK}{dt}\right)_{\mathrm{jump}} = \int dx\,dx'\,\{[W(x|x')p_1(x') - W(x'|x)p_1(x)][\log(p_1(x)/p_2(x)) + 1] - [W(x|x')p_2(x') - W(x'|x)p_2(x)]\,p_1(x)/p_2(x)\}$,  (3.7.35)

and after some rearrangement,

$\left(\frac{dK}{dt}\right)_{\mathrm{jump}} = \int dx\,dx'\,W(x|x')\,p_2(x', t)\,\{\phi'\log(\phi/\phi') - \phi + \phi'\}$,  (3.7.36)

where

$\phi = p_1(x, t)/p_2(x, t)$  (3.7.37)

and $\phi'$ is similarly defined in terms of $x'$.

We now consider the simplest case. Suppose a stationary solution $p_s(x)$ exists which is nonzero everywhere, except at infinity, where it and its first derivative vanish. Then we may choose $p_2(x, t) = p_s(x)$. The contributions to $dK/dt$ from (3.7.32) and the second term in (3.7.34) can be integrated to give surface terms which vanish at infinity, so that we find

$\left(\frac{dK}{dt}\right)_{\mathrm{drift}} = 0$,  (3.7.38a)

$\left(\frac{dK}{dt}\right)_{\mathrm{diff}} \le 0$,  (3.7.38b)

$\left(\frac{dK}{dt}\right)_{\mathrm{jump}} \le 0$,  (3.7.38c)

where the last inequality comes by setting $z = \phi/\phi'$ in (3.7.29). We must now consider under what situations the equalities in (3.7.38) are actually achieved. Inspection of (3.7.36) shows that this term will be zero if and only if $\phi = \phi'$ for almost all $x$ and $x'$ which are such that $W(x|x') \ne 0$.
Thus, if $W(x|x')$ is never zero, i.e., if transitions can take place in both directions between any pair of states, the vanishing of the jump contribution implies that $\phi(x)$ is independent of $x$, i.e.,

$p_1(x, t)/p_s(x) = \text{constant}$.  (3.7.39)

The constant must equal one, since $p_1(x, t)$ and $p_s(x)$ are both normalised. The term arising from diffusion will be strictly negative if $B_{ij}$ is almost everywhere positive definite. Hence, we have now shown that under rather strong conditions, namely,

$p_s(x) \ne 0$ with probability 1,
$W(x|x') \ne 0$ with probability 1,  (3.7.40)
$B_{ij}(x)$ positive definite with probability 1,

any solution of the differential Chapman-Kolmogorov equation approaches the stationary solution $p_s(x)$ as $t \to \infty$. The result fails in two basic kinds of systems.

a) Disconnected State Space

The result is best illustrated when $A_i$ and $B_{ij}$ vanish, so we have a pure jump system. Suppose the space divides into two regions $R_1$ and $R_2$ such that transitions from $R_1$ to $R_2$ and back are impossible; hence, $W(x|x') = 0$ if $x$ and $x'$ are not both in $R_1$ or both in $R_2$. Then it is possible to have $dK/dt = 0$ if

$p_1(x, t) = \lambda_1 p_s(x)$, $x \in R_1$;
$p_1(x, t) = \lambda_2 p_s(x)$, $x \in R_2$,  (3.7.41)

so that there is no unique stationary distribution. The two regions are disconnected, and separate stochastic processes take place in each; in each of these there is a unique stationary solution. The relative probability of being in $R_1$ or $R_2$ is not changed by the process. A similar result holds, in general, if as well we have $B_{ij}$ and $A_i$ vanishing on the boundary between $R_1$ and $R_2$.

b) $p_s(x)$ Vanishes in Some Definite Region

If we have

$p_s(x) = 0$, $x \in R_1$;
$p_s(x) \ne 0$, $x \in R_2$,  (3.7.42)

and again $A_i$ and $B_{ij}$ vanish, then it follows that, since $p_s(x)$ satisfies the stationary equation (3.7.26),

$W(x|y) = 0$, $x \in R_1$, $y \in R_2$.  (3.7.43)

In other words, no transitions are possible from the region $R_2$, where the stationary distribution is positive, to $R_1$, where the stationary distribution vanishes.

3.7.4 Autocorrelation Function for Markov Processes

For any Markov process, we can write a very elegant formula for the autocorrelation function. We define the conditional mean

$\langle X(t)\,|\,[x_0, t_0] \rangle = \int dx\,x\,p(x, t | x_0, t_0)$,

in terms of which the stationary autocorrelation function can be written

$\langle X(t)X(s) \rangle_s = \int dx_0\,x_0\,\langle X(t)\,|\,[x_0, s] \rangle\,p_s(x_0)$.  (3.7.52)

For an arbitrary (non-Markovian) process one would instead have to condition on the whole previous history and obtain the stationary quantity by a limiting procedure,

$\langle X(t)X(s) \rangle_s = \lim_{t_0 \to -\infty}\int dx_0\,x_0\,\langle X(t)\,|\,[x_0, s]; \dots \rangle\,p(x_0, s\,|\,\dots)$.  (3.7.53)

However, when the process is Markovian, this cumbersome limiting procedure is not necessary, since the conditional mean given the most recent value alone already determines the result:

$\langle X(t)\,|\,[x_0, t_0]; [x_1, t_1]; \dots \rangle = \langle X(t)\,|\,[x_0, t_0] \rangle$ for $t \ge t_0 \ge t_1 \ge \dots$.  (3.7.54)

Equation (3.7.52) is a regression theorem when applied to a Markov process and is the basis of a more powerful regression theorem for linear systems. By this we mean systems such that a linear equation of motion exists for the means, i.e.,

$d\langle X(t)\,|\,[x_0, t_0] \rangle/dt = -A\,\langle X(t)\,|\,[x_0, t_0] \rangle$,  (3.7.55)

which is very often the case in systems of practical interest, either as an exact result or as an approximation. The initial condition for (3.7.55) is clearly $\langle X(t_0)\,|\,[x_0, t_0] \rangle = x_0$.

3.8 Examples of Markov Processes

3.8.1 The Wiener Process

The Wiener process is the one-variable diffusion process with zero drift and unit diffusion coefficient; its conditional probability therefore satisfies the Fokker-Planck equation

$\frac{\partial}{\partial t}p(w, t | w_0, t_0) = \frac{1}{2}\frac{\partial^2}{\partial w^2}p(w, t | w_0, t_0)$.  (3.8.1)

Utilising the initial condition

$p(w, t_0 | w_0, t_0) = \delta(w - w_0)$  (3.8.2)

on the conditional probability, we solve (3.8.1) by use of the characteristic function

$\phi(s, t) = \int dw\,p(w, t | w_0, t_0)\exp(\mathrm{i}sw)$,  (3.8.3)

which satisfies

$\frac{\partial\phi}{\partial t} = -\frac{1}{2}s^2\phi$,  (3.8.4)

so that

$\phi(s, t) = \exp\left[-\frac{1}{2}s^2(t - t_0)\right]\phi(s, t_0)$.  (3.8.5)

From (3.8.2), the initial condition is $\phi(s, t_0) = \exp(\mathrm{i}sw_0)$, so that

$\phi(s, t) = \exp\left[\mathrm{i}sw_0 - \frac{1}{2}s^2(t - t_0)\right]$.  (3.8.6)

Performing the Fourier inversion, we have the solution to (3.8.1):

$p(w, t | w_0, t_0) = [2\pi(t - t_0)]^{-1/2}\exp[-(w - w_0)^2/2(t - t_0)]$.  (3.8.7)
This represents a Gaussian, with

$\langle W(t) \rangle = w_0$,  (3.8.8)

$\langle [W(t) - w_0]^2 \rangle = t - t_0$,  (3.8.9)

so that an initially sharp distribution spreads in time, as graphed in Fig. 3.4.

Fig. 3.4. Wiener process: spreading of an initially sharp distribution $p(w, t | w_0, t_0)$ with increasing time $t - t_0$.

A multivariate Wiener process can be defined as

$\boldsymbol{W}(t) = [W_1(t), W_2(t), \dots, W_n(t)]$,  (3.8.10)

which satisfies the multivariable Fokker-Planck equation

$\frac{\partial}{\partial t}p(\boldsymbol{w}, t | \boldsymbol{w}_0, t_0) = \frac{1}{2}\sum_i\frac{\partial^2}{\partial w_i^2}p(\boldsymbol{w}, t | \boldsymbol{w}_0, t_0)$,  (3.8.11)

whose solution is

$p(\boldsymbol{w}, t | \boldsymbol{w}_0, t_0) = [2\pi(t - t_0)]^{-n/2}\exp[-(\boldsymbol{w} - \boldsymbol{w}_0)^2/2(t - t_0)]$,  (3.8.12)

a multivariate Gaussian with

$\langle \boldsymbol{W}(t) \rangle = \boldsymbol{w}_0$  (3.8.13)

and

$\langle [W_i(t) - w_{0i}][W_j(t) - w_{0j}] \rangle = (t - t_0)\,\delta_{ij}$.  (3.8.14)

The one-variable Wiener process is often simply called Brownian motion, since the Wiener process equation (3.8.1) is exactly the same as the differential equation of diffusion, shown by Einstein to be obeyed by Brownian motion, as we noted in Sect. 1.2. The terminology is, however, not universal. Points of note concerning the Wiener process are:

a) Irregularity of Sample Paths

Although the mean value of $W(t)$ is zero, the mean square becomes infinite as $t \to \infty$. This means that the sample paths of $W(t)$ are very variable, indeed surprisingly so. In Fig. 3.5, we have given a few different sample paths with the same initial point to illustrate the extreme non-reproducibility of the paths.

b) Non-differentiability of Sample Paths

The Wiener process is a diffusion process, and hence the sample paths of $W(t)$ are continuous. However, they are not differentiable. Consider

$\mathrm{Prob}\{|[W(t + h) - W(t)]/h| > k\}$.  (3.8.15)

From the solution for the conditional probability, this probability is

$2\int_{kh}^{\infty}dw\,(2\pi h)^{-1/2}\exp(-w^2/2h)$,  (3.8.16)

and in the limit $h \to 0$ this is one. This means that, no matter what value of $k$ we choose, $|[W(t + h) - W(t)]/h|$ is almost certain to be greater than it, i.e., the derivative at any point is almost certainly infinite. This is in agreement with the intuitive picture presented in Sect. 3.5.2, and the simulated paths given in Fig. 3.5 illustrate the point dramatically. This corresponds, of course, to the well-known experimental fact that Brownian particles have an exceedingly irregular motion. However, this is clearly an idealisation, since if $W(t)$ represents the position of the Brownian particle, it means that its speed is almost certainly infinite. The Ornstein-Uhlenbeck process is a more realistic model of Brownian motion (Sect. 3.8.4).

Fig. 3.5. Three simulated sample paths of the Wiener process, illustrating their great variability.

c) Independence of Increments

The Wiener process is fundamental to the study of diffusion processes, and by means of stochastic differential equations we can express any diffusion process in terms of the Wiener process. Of particular importance is the statistical independence of the increments of $W(t)$. More precisely, since the Wiener process is a Markov process, the joint probability density can be written

$p(w_n, t_n; w_{n-1}, t_{n-1}; \dots; w_0, t_0) = \prod_{i=0}^{n-1}p(w_{i+1}, t_{i+1} | w_i, t_i)\;p(w_0, t_0)$,  (3.8.17)

and using the explicit form of the conditional probabilities (3.8.7), we see that

$p(w_n, t_n; w_{n-1}, t_{n-1}; \dots; w_0, t_0) = \prod_{i=0}^{n-1}\left\{[2\pi(t_{i+1} - t_i)]^{-1/2}\exp\left[-\frac{(w_{i+1} - w_i)^2}{2(t_{i+1} - t_i)}\right]\right\}p(w_0, t_0)$.  (3.8.18)

If we define the variables

$\Delta W_i = W(t_i) - W(t_{i-1})$,  (3.8.19)

$\Delta t_i = t_i - t_{i-1}$,  (3.8.20)

then the joint probability density for the $\Delta W_i$ is

$p(\Delta w_n; \Delta w_{n-1}; \dots; \Delta w_1; w_0) = \prod_{i=1}^{n}\{(2\pi\Delta t_i)^{-1/2}\exp(-\Delta w_i^2/2\Delta t_i)\}\;p(w_0, t_0)$,  (3.8.21)

which shows, from the definition of statistical independence given in Sect. 2.3.4, that the variables $\Delta W_i$ are independent of each other and of $W(t_0)$. The property of having independent increments $\Delta W_i$ is very important in the definition of stochastic integration, which is carried out in Sect. 4.2.

d) Autocorrelation Functions

A quantity of great interest is the autocorrelation function, already discussed in Sects. 1.4.2 and 3.7.4. The formal definition, for $t \ge s$, uses the decomposition

$\langle W(t)W(s)\,|\,[w_0, t_0] \rangle = \langle [W(t) - W(s)]W(s) \rangle + \langle W(s)^2 \rangle$.  (3.8.23)

Using the independence of increments, the first average is zero, and the second is given by (3.8.9), so that we have, in general, for any $s$ and $t$,

$\langle W(t)W(s)\,|\,[w_0, t_0] \rangle = \min(t - t_0,\, s - t_0) + w_0^2$.
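Points (b, c) are easy to check numerically. The sketch below (mesh and seed are our own) builds Wiener increments with variance $\Delta t$ and confirms $\langle \Delta W^2 \rangle = \Delta t$ and the vanishing covariance of non-overlapping increments.

```python
import math
import random

rng = random.Random(9)

dt, n, paths = 0.01, 100, 20_000
d1 = d2 = d1d2 = sq = 0.0
for _ in range(paths):
    dws = [math.sqrt(dt) * rng.gauss(0.0, 1.0) for _ in range(n)]
    a, b = dws[10], dws[50]        # two non-overlapping increments
    d1 += a; d2 += b; d1d2 += a * b
    sq += sum(w * w for w in dws) / n

print(sq / paths)                                    # ~ dt = 0.01
print(d1d2 / paths - (d1 / paths) * (d2 / paths))    # ~ 0: independence
```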
3.8.4 The Ornstein-Uhlenbeck Process

A more realistic model of Brownian motion is provided by the Ornstein-Uhlenbeck process, the diffusion process with linear drift $A(x) = -kx$ and constant diffusion $B(x) = D$, whose conditional probability $p(x, t | x_0, t_0)$ obeys the Fokker-Planck equation

$\frac{\partial p}{\partial t} = \frac{\partial}{\partial x}(kx\,p) + \frac{1}{2}D\frac{\partial^2 p}{\partial x^2}$.

The solution is a Gaussian whose mean decays exponentially, $\langle X(t) \rangle = x_0\,\mathrm{e}^{-k(t - t_0)}$, and whose variance approaches $D/2k$. The correlation function with a definite initial condition is not normally of as much interest as the stationary correlation function, which is obtained by allowing the system to approach the stationary distribution. It is achieved by putting the initial condition in the remote past, as pointed out in Sect. 3.7.2. Letting $t_0 \to -\infty$, we find

$\lim_{t_0 \to -\infty} p(x_2, s | x_0, t_0) = p_s(x_2) = (\pi D/k)^{-1/2}\exp(-kx_2^2/D)$,  (3.8.81)

and by straightforward substitution and integration, and noting that the stationary mean is zero, we get

$\langle X(t)X(s) \rangle_s = \langle X(t), X(s) \rangle_s = \frac{D}{2k}\exp(-k|t - s|)$.  (3.8.82)

This result demonstrates the general property of stationary processes: that the correlation functions depend only on time differences. It is also a general result [3.6] that the process we have described in this section is the only stationary Gaussian Markov process in one real variable. The results of this subsection are very easily obtained by the stochastic differential equation methods which will be developed in Chap. 4.

The Ornstein-Uhlenbeck process is a simple, explicitly representable process which has a stationary solution. In its stationary state, it is often used to model a realistic noise signal, in which $X(t)$ and $X(s)$ are only significantly correlated if

$|t - s| \lesssim 1/k \equiv \tau$.  (3.8.83)

(More precisely, $\tau$, known as the correlation time, can be defined for arbitrary processes $X(s)$ by

$\tau = \int_0^\infty dt\,\langle X(t), X(0) \rangle_s/\mathrm{var}\{X\}_s$,  (3.8.84)

which is independent of the precise functional form of the correlation function.)
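The exponential correlation (3.8.82) can be checked by simulating the process with the discretised rule $x \to x - kx\,\Delta t + \sqrt{D\Delta t}\,N(0, 1)$, a small-$\Delta t$ approximation of the kind made precise in Chap. 4 (the parameters and lag below are our own choices).

```python
import math
import random

rng = random.Random(10)

k, D, dt = 1.0, 2.0, 0.001
x, n_burn, n = 0.0, 10_000, 500_000
lag = 500                                   # tau = lag * dt = 0.5
buf, acc, cnt = [], 0.0, 0
for i in range(n_burn + n):
    x += -k * x * dt + math.sqrt(D * dt) * rng.gauss(0.0, 1.0)
    if i >= n_burn:                         # discard the transient
        buf.append(x)
        if len(buf) > lag:
            acc += buf[-1] * buf[-1 - lag]
            cnt += 1

print(acc / cnt)                                 # sample <X(t) X(t+tau)>_s
print(D / (2 * k) * math.exp(-k * lag * dt))     # (D/2k) e^{-k tau} ~ 0.607
```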
3.8.5 Random Telegraph Process

We consider a signal $X(t)$ which can have either of two values, $a$ and $b$, and switches from one to the other with certain probabilities per unit time. Thus, we have a master equation

$\partial_t P(a, t | x, t_0) = -\lambda P(a, t | x, t_0) + \mu P(b, t | x, t_0)$,
$\partial_t P(b, t | x, t_0) = \lambda P(a, t | x, t_0) - \mu P(b, t | x, t_0)$,  (3.8.85)

for which the solution can simply be found by noting that

$P(a, t | x, t_0) + P(b, t | x, t_0) = 1$

and that a simple equation can be derived for $\lambda P(a, t | x, t_0) - \mu P(b, t | x, t_0)$, whose solution is, because of the initial condition

$P(x', t_0 | x, t_0) = \delta_{x',x}$,  (3.8.86)

$\lambda P(a, t | x, t_0) - \mu P(b, t | x, t_0) = \exp[-(\lambda + \mu)(t - t_0)]\,(\lambda\delta_{a,x} - \mu\delta_{b,x})$,  (3.8.87)

so that

$P(a, t | x, t_0) = \frac{\mu}{\lambda + \mu} + \frac{\exp[-(\lambda + \mu)(t - t_0)]}{\lambda + \mu}(\lambda\delta_{a,x} - \mu\delta_{b,x})$,

$P(b, t | x, t_0) = \frac{\lambda}{\lambda + \mu} - \frac{\exp[-(\lambda + \mu)(t - t_0)]}{\lambda + \mu}(\lambda\delta_{a,x} - \mu\delta_{b,x})$.  (3.8.88)

This process clearly has the stationary solution, obtained by letting $t_0 \to -\infty$,

$P_s(a) = \frac{\mu}{\lambda + \mu}$, $P_s(b) = \frac{\lambda}{\lambda + \mu}$,  (3.8.89)

which is, of course, obvious from the master equation. The mean of $X(t)$ is straightforwardly computed:

$\langle X(t)\,|\,[x_0, t_0] \rangle = \sum_x x\,P(x, t | x_0, t_0) = \langle X \rangle_s + \exp[-(\lambda + \mu)(t - t_0)]\,(x_0 - \langle X \rangle_s)$,  (3.8.90)

where

$\langle X \rangle_s = \frac{a\mu + b\lambda}{\lambda + \mu}$.  (3.8.91)

The time-dependent variance can also be computed, but it is a very messy expression. The stationary variance is easily computed to be

$\mathrm{var}\{X\}_s = \frac{(a - b)^2\lambda\mu}{(\lambda + \mu)^2}$.  (3.8.92)

To compute the stationary time correlation function, we write (assuming $t > s$)

$\langle X(t)X(s) \rangle_s = \sum_{x,x'} x\,x'\,P(x, t | x', s)\,P_s(x')$  (3.8.93)

$= \sum_{x'} x'\,\langle X(t)\,|\,[x', s] \rangle\,P_s(x')$.  (3.8.94)

We now use (3.8.90-92) to obtain

$\langle X(t)X(s) \rangle_s = \langle X \rangle_s^2 + \exp[-(\lambda + \mu)(t - s)]\,\mathrm{var}\{X\}_s$,

so that the stationary covariance is

$\langle X(t), X(s) \rangle_s = \langle X(t)X(s) \rangle_s - \langle X \rangle_s^2 = \exp[-(\lambda + \mu)|t - s|]\,\frac{(a - b)^2\lambda\mu}{(\lambda + \mu)^2}$.  (3.8.97)

Notice that this time correlation function is of exactly the same form as that of the Ornstein-Uhlenbeck process. Higher-order correlation functions are not the same, of course, but because of this simple correlation function and the simplicity of the two-state process, the random telegraph signal also finds wide application in model building.
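Since the waiting time in each state of a process like this is exponentially distributed, the telegraph signal is easy to simulate exactly; the sketch below (rates, values and seed are our own) checks the stationary occupation (3.8.89) along a single long path.

```python
import random

rng = random.Random(11)

lam, mu = 2.0, 1.0        # jump rates a -> b and b -> a
a, b = 1.0, -1.0
state, t, t_end, time_in_a = a, 0.0, 50_000.0, 0.0
while t < t_end:
    rate = lam if state == a else mu
    tau = rng.expovariate(rate)          # exponential waiting time
    if state == a:
        time_in_a += min(tau, t_end - t)
    t += tau
    state = b if state == a else a

print(time_in_a / t_end)   # ~ mu/(lam + mu) = 1/3, as in (3.8.89)
```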
= At (4.1.12) so that the drift and diffusion coefficients are ult + At) — tof DY _ 9 Aug, 1) = firm a (4.1.13) Be 1) fin E+ A) — PD can Thus, the Fokker-Planck equation is that of the Wiener process and we can write fede" = ut) = Wer). (4.1.15) 3 Thus, we have the paradox that the integral of ¢(t) is W(t), which is itself not dif- ferentiable, as shown in Sect.3.8.1. Tlis means that mathematically speaking, the Langevin equation (4.1.1) does not exist. However, the corresponding integral equation 2 x(0) — x0) = j ales), slds + { Bte(s), sede (4.1.16) can be interpreted consistently. We make the replacement, which follows directly from the interpretation of the integral of E(t) as the Wiener process W(t), that dW(t) = W(t + dt) — W(t) = E(ede (4.1.17) and thus write the second integral as f Px(s), saws) , (4.1.18) 3 which is a kind of stochastic Stieltjes integral with respect to a sample function W(t). Such an integral can be defined and we will carry this out in the next section. Before doing so, it should be noted that the requirement that u(r) be continuous, while very natural, can be relaxed to yield a way of defining jump processes as stochastic differential equations. This has already been hinted at in the treatment of shot noise in Sect. 1.4.1. However, it does not seem to be nearly so useful and will not be treated in this book. The interested reader is referred to [4.1]. 4.2. Stochastic Integration 83 As a final point, we should note that one normally assumes that &(t) is Gaus- sian, and satisfies the conditions (4.1.2) as well. The above did not require this: the Gaussian nature follows in fact from the assumed continuity of u(t). Which of these assumptions is made is, in a strict sense, a matter of taste. However, the continuity of u(t) seems a much more natural assumption to make than the Gaus- sian nature of &(t), which involves in principle the determination of moments of arbitrarily high order. 4.2 Stochastic Integration 4.2.1 Definition of the Stochastic Integral Suppose G(s) is an arbitrary function of time and W(t) is the Wiener process. We define the stochastic integral J‘ G(s)dW(t) as a kind of Riemann-Stieltjes integral. Namely, we divide the interval [fo, f] into subintervals by means of partitioning points (as in Fig. 4.1) EEG Ot % Fig. 4.1. Partitioning of the time interval used in the definition of stochastic integration = KWH?) — (W(te)*> — (t = to)] = 0 (4.2.21) This is also obvious by definition, since in the individual terms we have 0 for an arbitrary nonanticipating function G(t). The proof is quite straightforward. For N = 0, let us define 7= lim (3) G(AW? — At)P> (4.2.26) oe gat = lim {CSXGia(AW? — At)? +E 2G.1G,(AW} — At)(AW? ~ At)>). “ "Tndependent Independent (4.2.27) The horizontal braces indicate factors which are statistically independent of each other because of the properties of the Wiener process, and because the G, are values of a nonanticipating function which are independent of all AW, for j > i. Using this independence, we can factorise the means, and also using i) (AW = An ii) (AW? — At)*>= 2At? (from Gaussian nature of 4W,), we find I= 2lim( Ar(GY)) « (4.2.28) 88 4. The Ito Calculus and Stochastic Differential Equations Under reasonably mild conditions on G(t) (e.g., boundedness), this means that ms-lim (cS G,_, AW? — a G,, At) =0 (4.2.29) and since msclim 5 G,1A4 = fdrGte) (4.2.30) ve 7 we have frameypac’) = farce). 
Comments i) The proof f G()[dW(nP** = 0 for N> 0 is similar and uses the explicit ex- pressions for the higher moments of a Gaussian (Sect.2.8.1). ii) dW(t) only occurs in integrals so that when we restrict ourselves to nonantici- pating functions, we can simply write dW(ty = dt (4.2.31) dwtyt® = 0(N > 0). (4.2.32) iii) The results are only valid for the Ito integral, since we haye used the fact that AW, is independent of G,_,. In the Stratonovich integral, - AW, = W(t) — Wti1) (4.2.33) Gu = Git + t1)] (4.2.34) and although G(r) is nonanticipating, this is not sufficient to guarantee the indepen- dence of AW, and G,_, as thus defined. iv) By similar methods one can prove that f G@)dr'dW(r’) = ms-lim 5 G,_,AW,At, = 0 (4.2.35) is = and similarly for higher powers. The simplest way of characterising these results is to say that dW(r) is an infinitesimal order of } and that in calculating differ- entials, infinitesimals of order higher than | are discarded. 4.2.6 Properties of the Ito Stochastic Integral a) Existence One can show that the Ito stochastic integral f G(r’)dW(t’) exists whenever the io function G(t’) is continuous and nonanticipating on the closed interval [f, t] [4.3]. 4.2. Stochastic Integration 89 b) Integration of Polynomials We can formally use the result of Sect.4.2.5: amor = 1) + aor — WO = (2) Werawey and using the fact that dW(t)’ — 0 for all r > 2, = nw(rytaw(r) + Me) eed D waytae (4.2.36) so that 1 +1 fweraney = 5 ro = war — Ff wort. | 4237 c) Two Kinds of Integral We note that for each G(r) there are two kinds of integrals, namely, fowya and fauyame). both of which occur in the previous equation. There is, in general, no connection between these two kinds of integral. d) General Differentiation Rules In forming differentials [as in(b) above], one must keep all terms up to second order in dW(t). This means that, for example, d{exp [W(t)]} = exp [W(t) + dW(t)] — exp [W(1)] xp [W na W(t) + td W(0)"] = exp [W()ldW(t) + $at] (4.2.38) or more generally, amo, = Lae + $L cane + Lame + 42h amor ef + ayaa) +. and we use (dt)? 0 didW(t)—-0 (Sect. 4.2.5, comment (iv)] [awnp = dt and all higher powers vanish, to arrive at 90 4. The Ito Calculus and Stochastic Differential Equations df W(0), (4.2.39) e) Mean Value Formula For nonanticipating G(t), = i i dt’ (G(’)H(t')y (4.2.46) which implies KE(t)E(s)> = S(t — 5). An important point of definition arises here, however. In integrals involving delta functions, it frequently occurs in the study of stochastic differential equations that the argument of the delta function is equal to either the upper or the lower limit of the integral, that is, we find integrals like Fat fyace — 11) (4.2.47) or IL dt f(t)8(t — t2) (4.2 48) and various conventions can be made concerning the value of such integrals. We will show that in the present context, we must always make the interpretation h=ft) (4.2.49) h=0 (4.2.50) corresponding to counting all the weight of a delta function at the lower limit of an integral, and none of the weight at the upper limit. To demonstrate this, note that Cf ownawey tf Heaney =0 (4.2.51) This follows, since the function defined by the integral inside the square bracket is, by Sect.4.2.4 comment (v), a nonanticipating function and hence the complete integrand, [obtained by multiplying by G(t’) which is also nonanticipating] is itself nonanticipating. Hence the average vanishes by the result of Sect. 4.2.6. 
Now using the formulation in terms of the Langevin source €(t), we can rewrite (4.2.52) as far as aryHs)yace’ — s’) =0 (4.2.52) which corresponds to not counting the weight of the delta function at the upper limit. Consequently, the full weight must be counted at the lower limit. 92 4. The Ito Calculus and Stochastic Differential Equations This property is a direct consequence of the definition of the Ito integral as in (4.2.10), in which the increment points “towards the future”. That is, we can interpret dW(t) = Wt + dt) — Wa). (4.2.53) In the case of the Stratonovich integral, we get quite a different formula, which is by no means as simple to prove as in the Ito case, but which amounts to choosing t= $f) t= tft). This means that in both cases, the delta function occuring at the limit of an integral has } its weight counted. This formula, although intuitively more satisfying than the Ito form, is more complicated to use, especially in the perturbation theory of stochastic differential equations, where the Ito method makes very many terms vanish. (Stratonovich) (4.2.54) 4.3. Stochastic Differential Equations (SDE) We concluded in Sect.4.1, that the most satisfactory interpretation of the Langevin equation t & = a(x, t) + B(x, NEI) “ (4.3.1) is a stochastic integral equation x(t) — x(0) = f dt'atx(e), 7) + fa )bIX(t), #1 (43.2) 0 0 Unfortunately, the kind of stochastic integral to be used is not given by the reason- ing of Sect.4.1. The Ito integral is mathematically and technically the most satis- factory, but unfortunately, it is not always the most natural choice physically. The Stratonovich integral is the natural choice for an interpretation which assumes &(t) is a real noise (not a white noise) with finite correlation time, which is then allowed to become infinitesimally small after calculating measurable quantities. Furthermore, a Stratonovich interpretation enables us to use ordinary calculus, which is not possible for an Ito interpretation. From a mathematical point of view, the choice is made clear by the near impos- sibility of carrying out proofs using the Stratonovich integral. We will therefore define the Ito SDE, develop its equivalence with the Stratonovich SDE, and use either form depending on circumstances. The relationship between white noise stochastic differential equations and the real noise systems is explained in Sect.6.5. 4.3 Stochastic Differential Equations (SDE) 93 43.1 Ito Stochastic Differential Equation: Definition A stochastic quantity x(t) obeys an Ito SDE written as dx(t) = a{x(t), tldt + b[x(t), a(t) (43.3) if for all ¢ and fo, x1) = alte) + ft alte), 1+ f aC) BIx(e), 11 (4.3.4) Before considering what conditions must be satisfied by the coefficients in (4.3.4), it is wise to consider what one means by a solution of such an equation and what uniqueness of solution would mean in this context. For this purpose, we can con- sider a discretised version of the SDE obtained by taking a mesh of points 1, (as illustrated in Fig. 4.2) such that b t) (thus, the initial conditions if considered random, must be nonanticipating) and (ii) a(x, f) is a nonanticipating function of t for any fixed x. Constructing an approximate solution iteratively by use of (4.3.6), we see that x, is always independent of AW, for j > i. The solution is then formally constructed by letting the mesh size go to zero. 
To say that the solution is unique means that for a given sample function W(t) of the random Wiener process W(t), the particular solution of the equation which arises is unique. To say that the solution exists means that with probability one, a solution exists for any choice of sample function W(t) of the Wiener process W(t). The method of constructing a solution outlined above is called the Cauchy- Euler method, and can be used to generate simulations. However, this is not the way uniqueness and existence are usually demonstrated, though it is possible to demonstrate these properties this way. Existence and unique- ness will not be proved here. The interested reader will find proofs in [4.3]. The conditions which are required for existence and uniqueness in a time interval [to, T] are: i) Lipschitz condition: a K exists such that $ lax, 1) — ay, 1)| + [6G 1) — by, | << Kilx—yl + (4.3.10) for all x and y, and all ¢ in the range [fo, T]. ii) growth condition: a K exists such that for all ¢ in the range [fo, T], la(x, t)|? + [dG 1)? < KC + |x1?)- (4.3.11) Under these conditions there will be a unique nonanticipating solution x(t) in the range (to, T]. Almost every stochastic differential equation encountered in practice satisfies the Lipschitz condition since it is essentially a smoothness condition. However, the growth condition is often violated. This does not mean that no solution exists; rather, it means the solution may “explode” to infinity, that is, the value of x can become infinite in a finite time; in practice, a finite random time. This phenomenon occurs in ordinary differential equations, for example, _ Gazax (4.3.12) has the general solution with an initial condition x = x, at t = 0, x(t) = (— at + 1x3"? (4.3.13) 4.3, Stochastic Differential Equations (SDE) 95 If ais positive, this becomes infinite when x» = (at)-'/ but if a is negative, the solu- tion never explodes. Failing to satisfy the Lipschitz condition does not guarantee the solution will explode. More precise stability results are required for one to be certain of that [4.3]. 4.3.2. Markov Property of the Solution of an Ito Stochastic Differential Equation We now show that x(r), the solution to the stochastic differential equation (4.3.4), is a Markov Process. Heuristically, the result is obvious, since with a given initial condition x(tc), the future time development is uniquely (stochastically) deter- mined, that is, x(t) for f > ft) is determined only by i) the particular sample path of W(t) for t > to ; ii) the value of x(t). Since x(t) is a nonanticipating function of 1, W(t) for t > ty is independent of x(t) for t < ft). Thus, the time development of x(t) for t > f, is independent of x(t) for 1 < ty provided x(t.) is known. Hence, x(t) isa Markov process. For a precise proof see [4.3]. 4.3.3 Change of Variables: Ito’s Formula Consider an arbitrary function of x(t): flx(1)]. What stochastic differential equa- tion does it obey? We use the results of Sect.4.2.5 to expand df[x(t)] to second order in dW(t): Al] = AO + ax()] — FO) =f RMdx(1) + Ef" ROEXC? + =f (Ol {alx(t), t]dt + b[x(1), aw(t)} + AFA), PAW OP + where all other terms have been discarded since they are of higher order. Now use [dW(t)P = dt to obtain Al) = {alx(), tf XO] + 45x), 1Pf'Ex()]} at + O[x(t), tf x(a W(1) . (43.14) This formula is known as Ito’s formula and shows that changing variables is not given by ordinary calculus unless f[x(1)] is merely linear in x(t). Many Variables. 
In practice, Ito’s formula becomes very complicated and the easiest method is to simply use the multivariate form of the rule that dW(t) is an in- finitesmial of order }. By similar methods to those used in Sect.4.2.5, we can show that for an n dimensional Wiener process W(t), 96 4. The Ito Calculus and Stochastic Differential Equations dW(t)dW)(t) = d,dt (4.3.15a) [dw(ny’? =0 (N > 0) (4.3.15b) dW(t)dt =0 (4.3.15c) aen =0 (N>0). (4.3.15d) which imply that dW/(t) is an infinitesimal of order 4. Note, however, that (4.3.15a) is a consequence of the independence of dW((t) and dW,(t). To develop Ito’s for- mula for functions of an n dimensional vector x(t) satisfying the stochastic differen- tial equation dx = A(x, t)dt + B(x, dW), (4.3.16) we simply follow this procedure. The result is Aflx) = (31 Ale, DOA) + FT LBE, OB, 148.8, fla)} dt +E Bule, NILA (0 - (43.17) 4.3.4 Connection Between Fokker-Planck Equation and Stochastic Differential Equation We now consider the time development of an arbitrary f(x()). Using Ito’s formula carta at = (LEO) 4 cpr i ao 1)0,f + 4b[x(), (FOZ) - (4.3.18) However, x(t) has a conditional probability density p(x, t] xo, fo) and Z xd = f dx M098.014, 1x10) = J dxlalx, 18S + $6Cx, N°OE/I0Ca tL x,t). 43.19) This is now of the same form as (3.4.16) Sect.3.4.1. Under the same conditions as there, we integrate by parts and discard surface terms to obtain J dx fo)0.p = f dx f(x) {—A,falx, 1p] + 49216(x, 1)?p]} and hence, since f(x) is arbitrary, 8,P(x, t Xo, to) = —AMalx, t)p(x, 1x0, ta)] + $82(0(x, t)*p(x, t| xo, t0)] - (4.3.20) 4.3 Stochastic Differential Equations (SDE) 97 We have thus a complete equivalence to a diffusion process defined by a drift coefficient a(x, 1) and a diffusion coefficient b(x, 1)?. The results are precisely analogous to those of Sect.3.5.3, in which it was shown that the diffusion process could be locally approximated by an equation resembling an Ito stochastic differential equation. 4.3.5 Multivariable Systems In general, many variable systems of stochastic differential equations can be defined for n variables by dx = A(x, t)dt + B(x, t)dW(t), (4.3.21) where dW(t) is an n variable Wiener process, as defined in Sect.3.8.1. The many variable version of the reasoning used in Sect. 4.3.4 shows that the Fokker-Planck equation for the conditional probability density p(x, t|xo, fo) = p is dep = —SLALACx, Op) + 4 32.9, ((BE, OB, Oy). (4.3.22) Notice that the same Fokker-Planck equation arises from all matrices B such that BB" is the same. This means that we can obtain the same Fokker-Planck equation by replacing B by BS where $ is orthogonal, i.e., SS™ = 1. Notice that § may de- pend on x(r). This can be seen more directly. Suppose $(t) is an orthogonal matrix with an arbitrary nonanticipating dependence on t. Then define aV(t) = S(t)dW(1). (4.3.23) Now the vector dV(t) is a linear combination of Gaussian variables d¥(r) with coe- ficients $(r) which are independent of dW(r), since $(t) is nonanticipating. For any fixed value of S(t), the dM(t) are thus Gaussian and their correlation matrix is CAV ADAV ID) = $2 Sul )Sim() AWW nl)? = TSS (Oat = by de (4.3.24) since $(t) is orthogonal. Hence, all the moments are independent of $(t) and are the same as those of dW(t), so dV(t) is itself Gaussian with the same correlation matrix as dW(t). Finally, averages at different times factorise, for example, if t>fin Ze AMOS OAM S ul » (4.3.25) we can factorise out the averages of dW/(t) to various powers since dW,(t) is in- dependent of all other terms. 
Evaluating these we will find that the orthogonal nature of $(t) gives, after averaging over dW,(t), simply 98 4. The Ito Calculus and Stochastic Differential Equations SAWP (AW SAOY (4.3.26) which similarly gives ({dW,(t)!"[dW,(t')"). Hence, the dV(t) are also increments of a Wiener process. The orthogonal transformation simply mixes up different sample paths of the process, without changing its stochastic nature. Hence, instead of (4.3.21) we can write dx = A(x, t)dt + Bx, )S™()S(Nd W(t) (43.27) = A(x, t)dt + B(x, NST(NdM), (4.3.28) and since V(t) is itself simply a Wiener process, this equation is equivalent to dx = A(x, t)dt + B(x, )S™()dW(t) (4.3.29) which has exactly the same Fokker-Planck equation (4.3.22). We will return to some examples in which this identity is relevant in Sect.4.4.6. 4.3.6 Stratonovich’s Stochastic Differential Equation Stratonovich [4.2] has defined a stochastic integral of an integrand which is a func- tion of x(1) and 1 by 5] Gte("),)ave) = ms slim 36 pees, tf OHO) WC ° _ (4.3.30) It should be noted that only the dependence on x(t) is averaged. If G(z, t) is dif- ferentiable in t, the integral is independent of the particular choice of value for 1 in the range [t;-1, ti]. It is possible to write a stochastic differential equation (SDE) using Strato- novich’s integral x(t) = x(te) + f dt'alx(e’), 1+ S fae BECe’), C1, (43.31) a 4, and we shall show that is equivalent to an appropriate Ito SDE. Let us assume that x(t) is a solution of dx(t) = a[x(t), t]dt + b[x(t), ]dW(t) (4.3.32) and deduce the corresponding a and f. In both cases, the solution x(t) is the same function. We first compute the connection between S fare \BLx('), t’]) and fame b[x(t"), t”]. Then, X(t) + X(G-1) p Sawer pe ),11~ XB a (W(t) —W(4-)). (4.3.33) 4.3 Stochastic Differential Equations (SDE) 99 In (4.3.33) we write X(t) = x (t-1) + dx (4-1) and use the Ito SDE (4.3.32) to write x (t)) =a [X (teas tre M(te— an) + x (Hew) HT WU) — Wi) (4.3.34) Then, applying Ito’s formula, we can write X(t) +X (ti-1) at B 5 it] = BUX (lint) +54 (ti) al = Bltinn) + (atin) BG) +4 (G-DEEG= 4-0) + 3b (t-1) OB (G1) [W(u) — WW) (4.3.35) (For simplicity, we write A(t,) etc, instead of B[x(r,), t,] wherever possible). Putting all these back in the original equation (4.3.32) and dropping as usual dr?, dt dW, and setting dW? = dt, we find SJ=L BG) (WH) — WH} 43D bt-1) OB (4-1) (4-6-1) - Hence we derive SJ Ble’), 1140) = f BEX), 1A) + Hf OLX), 10, BLx(), #1”. |(4.3.37) fA i “0 This formula gives a connection between the Ito and Stratonovich integrals of func- tions B[x(t’), t'], in which x(t’) is the solution of the Ito SDE (4.3.31). It does not give a general connection between the Ito and Stratonovich integrals of arbi- trary functions. If we now make the choice a(x, 1) = a(x, t) — £b(x, 1)8,b(x, 1) B(x, t) = B(x, t) (4.3.38) We see that the Ito SDE dx =adt+b dW(t), (4.3.39a) is the same as the Stratonovich SDE dx = [a — }bd,b]dt + b dW(t), | (4.3.39b) or conversely, the Stratonovich SDE dx =adt+ BdW(t) (4.3.40a) is the same as the Ito SDE dx = [a + $80,B]dt + BdW(t). (4.3.40b) ww 4. tue 10 Carculus ang stuenasuc Wimerenual Equations Comments i) Using Ito’s formula (4.3.14) we can show that the rule for a change of variables in Stratonovich SDE is exactly the same as in ordinary calculus. Start with the Stratonovich SDE (4.3.40a), and convert to the Ito SDE (4.3.40b). Change to the new variable y = f(x) with the inverse x = g(y). 
Define a@(y) = alg(y)] BO) = Bley) - Use Ito’s formula and note that df/dx = (dg/dy)~' to obtain the Ito SDE de\-* _, dg (dey idg\—" 5 sy - (Zz) Fy dt + (3) paw. Now convert back to a Stratonovich equation using (4.3.39); we obtain - . dy =(adt + Baw) (@) or af{x(s)] = falx(t), t]dt + Blx(t), dW ty} f’Ix(t)] (4.3.41) which is the same as in ordinary calculus. ii) Many Variables. If a many variable Ito equation is 7 dx = A(x, t)dt +B(x, tdW(t), (4.3.42) then the corresponding Stratonovich equation can be shown similarly to be given by replacing Ai = Ay ~ 45) BusDBy Bi = Bi. (4.3.43) iii) Fokker-Planck Equation corresponding to the Stratonovich SDE, (S) dx = AX(x, t)dt + B(x, dW) (4.3.44) can, by use of (4.3.43) and the known correspondence (Sect.4.3.5) between the Ito SDE and Fokker-Planck equation, be put in the form | ap = —DatAtp) + 49,189 BRP) (4.3.45) which is often known as the “‘Stratonovich form” of the Fokker-Planck equation. In contrast to the two forms of the SDEs, the two forms of Fokker-Planck equation have a different appearance but are (of course) interpreted with the same rules — 4.3. Stochastic Differential Equation (SDE) 101 those of ordinary calculus. We will find later that the Stratonovich form of the Fokker-Planck equation does arise very naturally in certain contexts (Sect.6.6). iv) Comparison of the Ito and Stratonovich Integrals. The Stratonovich integral as defined in (4.3.30) is quite a specialised concept, for it can only be defined in terms of a function G(z, t) of two variables. The more “obvious” definition in terms of G(x EE (4+ 4-1), 44+ 4-1) was not used by Stratonovich in his original defini- tion, although the view that this provides the definition of the Stratonovich integral is widespread in the literature (including the first edition of this book). Apparently, the more obvious definition cannot be proved to converge — see [4.6]. In practise, the precise definition of the Stratonovich integral from first principles is of no great interest, whereas the property that the rule for change of variables is given by ordinary caléulus is of great significance, and this is ensured not so much by the definition as by the relations (4.3.37, 43) between the two kinds of integral. One could simply choose to define the Stratonovich integral as being given by (4.3.37) when the function obeys the SDE (4.3.31), and this would be mathemat- ically completely satisfactory, and much less confusing. 4.3.7 Dependence on Initial Conditions and Parameters In exactly the same way as in the case of deterministic differential equations, if the functions which occur in a stochastic differential equation depend continuously on parameters, then the solution normally depends continuously on that parameter. Similarly, the solution depends continuously on the initial conditions. Let us formu- late this more precisely. Consider a one-variable equation dx = a(A, x, t)dt + b(A, x, NdW(t) with initial condition (4.3.46) (to) = e(A) where A is a parameter. Let the solution of (4.3.49) be x(, 1). Suppose i) stelim (2) = (do) 5 4g ii) lim {sup f€[to, TIL|a(A, x, £) — ao, x, t)| + [B(A, x, 1) — bo, x, 1]} = 05 Belg Isten! iii) a K independent of 2 exists such that la, x, t)|? + |b, x, )|? < KL + |2#|). hen, tli A, t) — =0. on (sup | x(A, t) — xo, t)|} = 0 (4.3.47) For a proof see [4.1]. 102 4. 
The Ito Calculus and Stochastic Differential Equations Comments i) Recalling the definition of stochastic limit, the interpretation of the limit (4.3.50) is that as A —~ do, the probability that the maximum deviation over any finite in- terval [f., T] between x(A, t) and x(Ap, t) is nonzero, goes to zero. ii) Dependence on the initial condition is achieved by letting a and b be independent of 2. iii) The result will be very useful in justifying perturbation expansions. iv) Condition (ii) is written in the most natural form for the case that the functions a(x, t) and B(x, t) are not themselves stochastic. It often arises that a(x, 1) and (x, t) are themselves stochastic (nonanticipating) functions. In this case, condition (ii) must be replaced by a probabilistic statement. It is, in fact, sufficient to replace lim by st-lim. Aly” dag 4.4 Some Examples and Solutions 4.4.1 Coefficients Without x Dependence The simple equation ' dx = a(t)dt + b(t)dW(t) (4.4.1) with a(t) and B(1) nonrandom functions of time, is solved simply by integrating x(t) = Xo + falt)dt + f b)dW(t) . (4.4.2) to io Here, xo can be either a nonrandom initial condition or may be random, but must be independent of W(t) — W(to) for t > to; otherwise, x(t) is not nonanticipating. As constructed, x(t) is Gaussian, provided x, is either nonrandom or itself Gaus- sian, since J awit) io is simply a linear combination of infinitesimal Gaussian variables.Further, &x()) = Gro) + J altde io (since the mean of the Ito integral vanishes) and 4.4 Some Examples and Solutions 103 x(t) ~ Cx(>IEXE) — Gx(DDD = GO, X(9) = foenamee f oenamey =F"? oerpar’, where we have used the result (4.2.42) with, however, Gr) = i <0 rs Hr’) The process is thus completely determined. 44.2 Multiplicative Linear White Noise Process The equation dx = cx dW(t) (4.4.3) is known as a multiplicative white noise process because it is linear in x, but the “noise term” dW(t) multiplies x. We can solve this exactly by using Ito’s formula Let us define a new variable by y=logx, (4.4.4) so that dy=tar— Loy ly = = dx — 3 (dx =cdW(t) — 4dr. (4.4.5) This equation can now be directly integrated, so we obtain Y(t) = Wto) + W(t) —W(te)] — $e°(t — to) (4.4.6) and hence, x(t) = x(to) exp {[W(1) — W(to)] — 4e°(t — to)}- (4.4.7) We can calculate the mean by using the formula for any Gaussian variable z with zero mean exp [he%(t — to) — $e%(t — &)] = (xlto)> « (4.4.9) This result is also obvious from definition, since dx) = (ex dW(1)> = 0 so that as) dt 0. We can also calculate the autocorrelation function = Crlto) exp (2° (W(1) + W(s) — 2W(te)?> — (t + 5 — 2t0)]} = (x(to)*>exp {he7[t + 5 — 2to + 2min(t, s) — (t + 5 — 2t,)]} = (x(to)*>exp [e?min(t — fo, 5 — to)] « (4.4.10) Stratonovich Interpretation. The solution of this equation interpreted as a Stratono- vich equation can also be obtained, but ordinary calculus would then be valid. Thus, instead of (4.4.5) we would obtain # dy =cdW(t) S and hence, X(t) = x(to) exp el (1) — W(to)]} - (4.4.11) In this case, (x()) = Cx(te) exp [he7(t — t0)] (4.4.12) and &x(1)x(s)) = = exp [Cio — y)t] = <2(0)*) exp [(ia — 7)(¢ + 8) — 2ymin(r, s)] - (4.4.19) In the limit ¢, s+ 00, witht + t=, limée(t + )2(0)> = 0. (4.4.20) However, the correlation function of physical interest is the complex correlation <2(t)2*(8)> = <1 2(0)|?> = <|z(0)|*exp fiat — s) — yft + s — 2min(s, s)}} = ¢2(0)|?>exp [io(t — s) — yt — s|]- (4.4.21) Thus, the complex correlation function has a damping term which arises purely from the noise. 
It may be thought of as a noise induced dephasing effect, whereby for large time differences, z(t) and z*(s) become independent. A realistic oscillator cannot be described by this model of a complex oscillator, as discussed by van Kampen [4.5]. However the qualitative behaviour is very simi- 106 4, The Ito Calculus and Stochastic Differential Equations lar, and this model may be regarded as a prototype model of oscillators with noisy frequency. 4.4.4 Ornstein-Uhlenbeck Process Taking the Fokker-Planck equation given for the Ornstein-Uhlenbeck process (Sect.3.8.4), we can immediately write down the SDE using the result of Sect.4.3.4: x = — kx dt + /B dwn), (4.4.22) and solve this directly. Putting yoxet, (4.4.23) then dy = (dx)d(e*) + (dxje* + xd(e*) = [—kx dt + /DdW(t)\k edt + [-kx dt + /D dW(nle* + kx edt. (4.4.24) We note that the first product vanishgs, involving only dt?, and dW(t)dt (in fact, it can be seen that this will always happen if we simply multiply x by a deterministic function of time). We get dy = JD e'dW(t) (4.4.25) so that integrating and resubstituting for y, we get x(t) = x(O)e + VD fev dWr'). (4.4.26) é If the initial condition is deterministic or Gaussian distributed, then x(t) is clearly Gaussian with mean and variance x(t) = (xe (4.4.27) var {x(#)} = <{[x) — (xO)]e* + vD{e*awey} ». (4.4.28) Taking the initial condition to be nonanticipating, that is, independent of dW(t) for t > 0, we can write using the result of Sect.4.2.6f var {x(t)} = var {x(0)} e-" + D f ereemin dy! 4.4 Some Examples and Solutions 107 = {{var {x(0)} — D/2k}e* + D/2k . (4.4.29) These equations are the same as those obtained directly by solving the Fokker- Planck equation in Sect.3.8.4, with the added generalisation of a nonanticipating random initial condition. Added to the fact that the solution is a Gaussian variable, we also have the correct conditional probability. The time correlation function can also be calculated directly and is, att), x6) = var futoyye-* 4 Defereemane) fee aIH(ey 4 j oon = varfx(Oertn 4 DPT teemaingy! i - — P) p-nets 4 DP e-birns = [var 10) Rl tpenes, (4.4.30) Notice that if k > 0, as t, s— oo with finite |* — s|, the correlation function be- comes stationary and of the form deduced in Sect.3.8.4. In fact, if we set the initial time at — oo rather than 0, the solution (4.4.26) becomes x)= VD f eka’). (4.4.31) in which the correlation function and the mean obviously assume their stationary values. Since the process is Gaussian, this makes it stationary. 4.4.5 Conversion from Cartesian to Polar Coordinates A model often used to describe an optical field is given by a pair of Ornstein-Uh- Ienbeck processes describing the real and imaginary components of the electric field, i.e., dE\(t) = — yE\(t) dt + @ dWi(t) dE,{t) = — yEft) dt + dW{t). (4.4.32) It is of interest to convert to polar coordinates. We set E,(t) = a(t)cos g(t) Ex(t) = a(t)sin g(t) (4.4.33) and for simplicity, also define H(t) = log a(t) (4.4.34) 108 4. The Ito Calculus and Stochastic Differential Equations so that H(t) + ig(t) = log [Ei(t) + iEA(1)) . (4.4.35) We then use the Ito calculus to derive ME, +iE:) 1 (a(E, + iE) £, + iE, 2 (E+ iby — WE + iE2) 4, 4 Law) + idW,(t)] OE, +E, (E, + £2) _ 1 eld (t) + i dwnp 2 (E, + iE? du + ig) = (4.4.36) and noting dW,(t)dW,(t) = 0, dW,(t)? = dW,(1)? = dr, it can be seen that the last term vanishes, so we find d[u(t) + ig(t)] = —ydt + eexp [—H(1) — id (NAW) + i dWA(t)}. 
(4.4.37) We now take the real part, set a(t) = exp [u(t)] and using the Ito calculus find da(t) = {—ya(t) + he*/a(t)} dt + e{dW,(t)cosg(t) + dW2(t)sin g(t)]}. (4.4.38) The imaginary part yields g(t) = [e/a(t)] [—dW(2)sin g(t) i dW.c0s g(t)] . : (4.4.39) We now define : dW(t) = dW,(t)cos g(t) + dW2(t)sin g(t) (4.4.40) dW,(t) = — dW,(1)sin g(t) + dW,(1)cos g(t) . We note that this is an orthogonal transformation of the kind mentioned in Sect. 4.3.5, so that we may take dW,(t) and dW,(t) as increments of independent Wiener processes W,(t) and W(t). Hence, the stochastic differential equations for phase and amplitude are g(t) = [ela(s)]aW(t) (4.4.41a) da(t) = [—ya(t) + fe*/a(ldt + edW.(1). (4.4.41b) Comment. Using the rules given in Sect. 4.3.6 (ii), it is possible to convert both the Cartesian equation (4.4.32) and the polar equations (4.4.41) to the Stratonovich form, and to find that both are exactly the same as the Ito form. Nevertheless, a direct conversion using ordinary calculus is not possible. Doing so we would get the same result until (4.4.38) where the term [}e?/a(t)]dt would not be found. This must be compensated by an extra term which arises from the fact that the Stratonovich increments dW,(t) are correlated with g(t) and thus, dW,(t) and 4.4 Some Examples and Solutions 109 dW(t) cannot simply be defined by (4.4.40). We see the advantage of the Ito method which retains the statistical independence of d W(t) and variables evaluated at time 1. Unfortunately, the equations in Polar form are not soluble, as the corresponding Cartesian equations are. There is an advantage, however, in dealing with polar equations in the laser, whose equations are similar, but have an added term pro- portional to a(t)?dt in (4.4.41b). 4.4.6 Multivariate Ornstein-Uhlenbeck Process we define the process by the SDE dx(t) = — Ax(t)dt + B dW(t), (4.4.42) (A and Bare constant matrices) for which the solution is easily obtained (as in Sect. 4.4.4): x(t) = exp (—At)x(0) + fexp[—A(t — 118 a(t). (4.4.43) The mean is = exp (—A1) s. Then s, s (4.4.54a) and similarly, =o exp[—A%s — 1] t 0 governed by the same law of time development of the mean [e.g.,(4.4.44)]. It is a consequence of the Markovian linear nature of the problem. For Ltosonde = 2 caso, 510)pde = ((-Ax,dt + BdW(0)], x7)> (4.4.59) and since t > 0, dW(z) is uncorrelated with x™(0), so d GIG.) = 4 GO). (4.4.59) Thus, computation of G,(z) requires the knowledge of G,(0) =o and the time development equation of the mean. This result is similar to those of Sect.3.7.4. 4.4.7 The General Single Variable Linear Equation a) Homogeneous Case We consider firstly the homogeneous case t dx = (b(d)dt + g(aW(n)|x (4.4.60) and using the usual Ito rules, write y =logx (4.4.61) so that dy = dx|x — 3(dxy}/x? = [b(t)dt + g(t)dW(t)] — 4g(t)dt, (4.4.62) and integrating and inverting (4.4.61), we get x(t) = x(0) exp {f (B(¢’) — da(t’]de’ + j g(t’) w(r') (4.4.63) 0 = x(06(t) (4.4.64) which serves to define (1). We note that [using (4.4.8)] Ex = Cexp {1 f 180°) — dete"Pde” + nf swam) = CxO) exp {nf Heya’ + n(n — 1) face'prar’} . (4.4.65) 4.4 Some Examples and Solutions 413 b) Inhomogeneous Case Now consider dx =[a(t) + b(t)xldt +.Lf(t) + g()xldt) (4.4.66) and write 2(t) = xIGOr (4.467) with g(t) as defined in (4.4.64) and a solution of the homogeneous equation (4.4.60). Then we write dz = dx{g(t)l' + x dig(t)"] + dx d[gy']. Noting that d[g(s)"! = —dg(Ig(0)? + (46()FIG()I? and using Ito rules, we find = (late) — fing(olar + AAW) go" (4.4.68) which is directly integrable. 
Hence, the solution is x(t) = gt | x0) + j BCC" {lale’) — feet’) dt + fe dW}. (4.4.69) c) Moments and Autocorrelation It is better to derive equations for the moments from (4.4.66) rather than calculate moments and autocorrelation directly from the solution (4.4.69). For we have |. ss nD septate = nx(ty'dx(t) + ms nO =D scryatpcr) + g(t)x(t)Par « (4.4.70) Hence, HOO? = atEy9[nbe) +7 FY acer] + GU Yfna(t) + mfr — DADC] 44.7) + en py, These equations from a hierarchy in which the nth equation involves the solutions of the previous two, and can be integrated successively. 114 4. The Ito Calculus and Stochastic Differential Equations 4.4.8 Multivariable Linear Equations a) Homogeneous Case The equation is x(t) = [BO)dt +H G(NAWO)A(Y) , (4.4.72) where B(t), G(s) are matrices. The equation is not, in general, soluble in closed form unless all the matrices B(t), G,(t’) commute at all times with each other, i.e. GING) = GG (1) BYW)G At’) = GAt')B(1) (4.4.73) Bt) BC’) = B(t’)B(t) In this case, the solution is completely analogous to the one variable case and we have x(t) = &(1)x(0) with a0) = exof fa — E Raw + {Eecoawao. (44.74) b) Inhomogeneous Case ‘ . We can reduce the inhomogeneous case to the homogeneous case in exactly the same way as in one dimension. Thus, we consider dx(t) = [A(t) + B(t)x]dt + x [FRO + G(t)x]dW(t) (4.4.75) and write v(t) = v(t x(t) (4.4.76) where y(t) is a matrix solution of the homogeneous equation (4.4.72). We first have to evaluate d[y~']. For any matrix M we have MM! = 1, so, expanding to second order, Md[M~'] + dM M-! + dM d[M~'] = 0. Hence, d[M~'] = —[M + dM]! dM M~" and again to second order d{M-"] = —M~'dM M-' + M-'dM M-'dM M* (4.4.77) and thus, since y(t) satisfies the homogeneous equation, AY") = VO BO + EGO — Fi GOAW} - and, again taking differentials 4.4 Some Examples and Solutions 115 y(t) = WO" TAD — GEMMA + He F(QAW(O}. Hence, x(t) = w(t) {x) + fuer (4@) — SGM + FAW} (4.4.78) This solution is not very useful for practical purposes, even when the solution for the homogeneous equation is known, because of the difficulty in evaluating means and correlation functions. 4.4.9 Time-Dependent Ornstein-Uhlenbeck Process This is a particular case of the previous general linear equation which is soluble. It is a generalisation of the multivariate Ornstein-Uhlenbeck process (Sect.4.4.6) to include time-dependent parameters, namely, dx(t) = —A(t)x(t)dt + B()\d (1). (4.4.79) This is clearly of the same form as (4.4.75) with the replacements A(t) +0 Bit) — —A(t) Dt Fi()dW(t) — B()dW(t) (4.4.80) Gt) + 0. The corresponding homogeneous equation is simply the deterministic equation dx(t) = — A(t)x(1) dt (4.4.81) which is soluble provided A(t)A(t’) = A(t’)A(1) and has the solution x(0) = w(t)x(0) with w(t) = exp[— j A(t’dt') (4.4.82) Thus, applying (4.4.78), x(t) = exp[— j A(t')dt"|x(0) + j {exp [— { A(s)ds}} Ba')dWt'). (44.83) This is very similar to the solution of the time-independent Ornstein-Uhlenbeck process, as derived in Sect. 4.4.6 (4.4.43). 116 4. The Ito Calculus and Stochastic Differential Equations From this we have att) = exp (= f Ala Kx) (4.4.84) x aed, By (z, t)p(z, t) (5.2.6) We note that this can also be written EDL 2 1G,1) =0 (5.2.1) where we define the probability current lia IGN) = ASD PEN -—Zd az, Bi, (z, t)p(z, t) (5.2.8) Equation (5.2.7) has the form of a local conservation equation, and can be written in an integral form as follows. 
Consider some region R with a boundary S and define P(R, t) = § dz p(z, t) R then (5.2.7) is equivalent to we yo J dS wz, 1) (5.2.9) where n is the outward pointing normal to S. Thus (5.2.9) indicates that the total loss of probability is given by the surface integral of J over the boundary of R. We can show as well that the current J does have the somewhat stronger property, that a surface integral over any surface S gives the net flow of probability across that surface. Fig. 5.1, Regions used to demonstrate that the probability current is the flow of probability For consider two adjacent regions R, and R,, separated by a surface Sz. Let ‘5S; and S; be the surfaces which, together with S,2, enclose respectively R, and Rz (see Fig. 5.1). Then the net flow of probability can be computed by noting that we are dealing here with a process with continuous sample paths, so that, in a sufficiently short 120 5. The Fokker-Planck Equation time At, the probability of crossing S,, from R, to R, is the joint probability of being in R, at time ¢ and R, at time t + At, = f dxf dy p(x, t+ At;y, 1). iss The net flow of probability from R, to R, is obtained by subtracting from this the probability of crossing in the reverse direction, and dividing by Ar; i.e. lim Af df dy (ole, 1 + At; y 1) — ply, t+ At x, D) (5.2.10) _—— Note that fax f dy ple, 59,1) =0 A since this is the probability of being in R, and R, simultaneously. Thus, we can write (5.2.10) = f dx f dy [8 px, t's 9, 1) — By p(y, 05 x, Drne ae and using the Fokker-Planck equation in the form (5.2.7) a : a : =- {areas I(x, 5 Ra fay Dg 6 Rd (5.2.11) where J; (x, t; Rz, t) is formed from DX, t3 Ras t) = f dy p(x. ts y,t) R2 in the same way as J(z, t) is formed from p(z, t)in(5.2.8) and J, (y, t; Ri, t) is defined similarly. We now convert the integrals to surface integrals. The integral over S, vanishes, since it will involve p(x, t; Rz, f), With x not in R, or on its boundary (except for a set of measure zero.) Similarly the integral over S, vanishes, but those over S,, do not, since here the integration is simply over part of the boundaries of R, and Ry. Thus we find, the net flow from R, to R, is f dS ne {I(x, t; Rit) + ICs, ts Re, O} Siz and we finally conclude, since x belongs the union of R, and R,, that the net flow of probability per unit time from R, to R, Slim gf dx f dylple,t + At; 0) — plyt + Atsx,01 =f dS mdz, 0) aso ACR ay Sia where n points from R; to R, (5.2.12) 5.2. Fokker-Planck Equation in One Dimension 121 We can now consider the various boundary conditions separately. a) Reflecting Barrier We can consider the situation where the particle cannot leave a region R, hence there is zero net flow of probability across S, the boundary of R. Thus we require a-J(z,t)=0 forge S, m=normaltoS (6.2.13) where J(z, 1) is given by (5.2.8). Since the particle cannot cross S, it must be reflected there, and hence the name reflecting barrjer for this condition. b) Absorbing Barrier Here, one assumes that the moment the particle reaches S, it is removed from the system, thus the barrier absorbs. Consequently, the probability of being on the boundary is zero, i.e. P(z,t)=0 fozeS (5.2.14) ¢) Boundary Conditions at a Discontinuity It is possible for both the A; and B,, coefficients to be discontinuous at a surface S, but for there to be free motion across S. 
Consequently, the probability and the normal component of the current must both be continuous across S, a J(2)|s, = a-I(@)|s_ (5.2.15) P(2) Is, = PZ) s_ (5.2.16) where S',, S_, as subscripts, mean the limits of the quantities from the left and right hand sides of the surface. The definition (5.2.8) of the current, indicates that the derivatives of p(z) are not necessarily continuous at S. 4) Periodic Boundary Condition We assume that the process takes place on an interval [a, 6] in which the two end points are identified with each other. (this occurs, for example, if the diffusion is on a circle). Then we impose boundary conditions derived from those for a discon- tinuity, i.e., I: lim p(x, ¢) = lim pe, 1) (5.2.17) Mi: lim J(x, #) = lim J(x, 1). (5.2.18) Most frequently, periodic boundary conditions are imposed when the functions A(x, t) and B(x, t) are periodic on the same interval so that we have A(b, t) = A(a, t) (5.2.19) B(b, t) = Bla, t) 122 5. The Fokker-Planck Equation and this means that I and IT simply reduce to an equality of p(x, £) and its derivatives at the points a and b. ©) Prescribed Boundaries If the diffusion coefficient vanishes at a boundary, we have a situation in which the kind of boundary may be automatically prescribed. Suppose the motion occurs only for x > a. If a Lipschitz condition is obeyed by A(x, ) and VB(x, 1) atx=a (Sect. 4.3.1) and B(x, #) is differentiable at x =a then 4,B(a, 1) =0. (5.2.20) The SDE then has solutions, and we may write dx(t)=A(x,)dt+ VB dW (5.2.21) In this rather special case, the situation is determined by the sign of A (x, /). Three cases then occur, as follows. i) Exit boundary. In this case, we suppose A(a,t) <0 (5.2.22) so that if the particle reaches the point a, it will certainly proceed out of region to x 0. (5.2.23) In this case, if the particle reaches the point a, the sign of A(a, 0) is such as to return it to x > a; thus a particle placed to the right of a can never leave the region. However, a particle introduced at x= will certainly enter the region. Hence the name, “entrance boundary”. iii) Natural boundary. Finally consider A(a,t)=0. (5.2.24) The particle, once it reaches x = a, will remain there. However it can be demon- strated that it cannot ever reach this point. This is a boundary from which we can. neither absorb nor at which we can introduce any particles. Feller [5.4] has shown that in general the boundaries can be assigned to one of the four types; regular, entrance, exit and natural. His general criteria for the classification of these boundaries are as follows. Define f(x) = exp [- 2 5 ds A(s)/B(s) (5.2.25) 5.2. Fokker-Planck Equation in One Dimension 123 g(x) = 2B) S(O) (5.2.26) A(x) =F09 J g(s) ds (5.2.27) hy (x) = g(x) J f(s) ds. (5.2.28) Here xo € (a, 6), and is fixed. Denote by F (x1. %) (5.2.29) the space of all functions integrable on the interval (x,, x9); then the boundary at @ can be classified as I: Regular: if f(x) €-/ (a, x9), and g(x) € 7 (a, x0) D: Exit: if g(x) ¢ (a, x9), and hy(x) € 7 (a, xo) Il: Entrance : if g(x) €-/ (a, x9), and hy(x) € / (a, x0) IV: Natural ; all other cases It can be seen from the results of Sect. 5.2.2 that for an exit boundary there is no normalisable stationary solution of the FPE, and that the mean time to reach the boundary, (5.2.161), is finite. Similarly, if the boundary is exit, a stationary solution can exist, but the mean time to reach the boundary is infinite. 
In the case of a regular boundary, the mean time to reach the boundary is finite, but a stationary solution with a reflecting boundary at a does exist. The case of natural boundaries is harder to analyse. The reader is referred to [5.5] for a more complete description. f) Boundaries at Infinity Alll of the above kinds of boundary can occur at infinity, provided we can si- multaneously guarantee the normalisation of the probability which, if p(x) is rea- sonably well behaved, requires lim p(x, 1) = 0. (5.2.30) If 2,p(x) is reasonably well behaved (i.e. does not oscillate infinitely rapidly as x— ©), lim ,p(x, t) = 0 (5.2.31) so that a nonzero current at infinity will usually require either A(x, t) or B(x, t) to become infinite there. Treatment of such cases is usually best carried out by changing to another variable which is finite at x = oo. Where there are boundaries at x = + oo and nonzero currents at infinity are permitted, we have two possibilities which do not allow for loss of probability: 124 5. The Fokker-Planck Equation i) (+ 0, t)=0 (5.2.32) ii) J(+ 0, t) = J(—00, t). (5.2.33) These are the limits of reflecting and periodic boundary conditions, respectively. 5.2.2 Stationary Solutions for Homogeneous Fokker-Planck Equations We recall (Sect.3.7.2) that in a homogeneous process, the drift and diffusion coef- ficients are time independent. In such a case, the equation satisfied by the stationary distribution is d ia =. FAC.) — > Ga lBOr.@)] = (5.2.34) which can also be written simply in terms of the current (as defined in Sect.5.2.1) dita) _ dx 0 (5.2.35) which clearly has the solution J(x) = constant. (5.2.36) Suppose the process takes place on an interval (a, 6). Then we must have Ja) = Hx) = Ib) = I : - (5.2.37) and if one of the boundary conditions is reflecting, this means that both are reflect- ing, and J = 0. If the boundaries are not reflecting, (5.2.37) requires them to be periodic. We then use the boundary conditions given by (5.2.17,18). a) Zero Current—Potential Solution Setting J = 0, we rewrite (5.2.37) as AGps0) = $F tBCp.0)] = 0 (52.38) for which the solution is Nv PAC) = Fy OP? J de’ AC VBOM (5.2.39) where / is a normalisation constant such that fax PAx) = 1. (5.2.40) 5.2. Fokker-Planck Equation in One Dimension 125 Such a solution is known as a potential solution, for various historical reasons, but chiefly because the stationary solution is obtained by a single integration (the full significance of this term will be treated in Sect.5.3.3). b) Periodic Boundary Condition Here we have nonzero current J and we rewrite (5.2.36) as AGie.e) — $4 (Beop.00] (5.2.41) However, J is not arbitrary, but is determined by normalisation and the periodic boundary condition Psa) = p.(b) (5.2.42) I(a) = Jb). (5.2.43) For convenience, define w(x) = exp [2 f dv AGB). (5.2.44) Then we can easily integrate (5.2.41) to get Pa)BOIV() = pa) B(@)/va) ~2.)f de'ly(x’). (5.2.45) By imposing the boundary condition (5.2.42) we find that J = BOM) — BarCorrsay |} dxiu(x)] (5.2.46) so that dv’ BU), td’ Bla) Syeaiv@ * lpr wea) BG) Fax . vx)a wo’) Px) = p.(a) (5.2.47) c) Infinite Range and Singular Boundaries In either of these cases, one or the other of the above possibilities may turn out to be forbidden because of divergences, etc. A full enumeration of the possibilities is, in general, very complicated. We shall demonstrate these by means of the examples given in the next section. 126 5. 
The Fokker-Planck Equation 5.2.3 Examples of Stationary Solutions a) Diffusion in a Gravitational Field A strongly damped Brownian particle moving in a constant gravitational field is often described by the SDE (Sect.6.4) dx = —g dt + /D dW(t) (5.2.48) for which the Fokker-Planck equation is a 2.2 5, (€P) + > 1 n&e ae (5.2.49) On the interval (a, 5) with reflecting boundary conditions, the stationary solution is given by (5.2.39), i.e. PAX) = M exp [—2gx/D] , (5.2.50) where we have absorbed constant factors into the definition of Clearly this solution is normalisable on (a, 6) only if a a is finite, though b may be infinite. The result is no more profound than to say that particles diffusing in a beaker of fluid will fall down, and if the beaker is infinitely deep, they will never stop falling! Diffusion upwards against gravity is possible for any distance but with exponentially small probability. Now assume periodic boundary conditions on (a, b). Substitution into (5.2.47) yields P(x) = pa); (5.2.51) a constant distribution. The interpretation is that the particles pass freely from a to b and back. b) Ornstein Uhlenbeck Process We use the notation of Sect.3.8.4 where the Fokker-Planck equation was op _ ap Aa Zee P+ > 5 Dig? (5.2.52) whose stationary solution on the interval (a, 6) with reflecting barriers is PAx) = WM exp (—kx?/D) . (5.2.53) Provided k > 0, this is normalisable on (— co, oo). If k <0, one can only make sense of it on a finite interval. Suppose a=—b<0. (5.2.54) 5.2. Fokker-Planck Equation in One Dimension 127 Then in this case, k yw) = exo|—F @ «| _ and if we consider the periodic boundary condition on this interval, by noting ya) = w(— 4), (5.2.56) we find that PAX) = play (x)/y(a) = pla exp| —F (= ~ 2] so that the symmetry yields the same solution as in the case of reflecting barriers. Letting a —- oo, we see that we still have the same solution. The result is also true if a— co independently of b —- —co, provided k > 0. c) A Chemical Reaction Model Although chemical reactions are normally best modelled by a birth-death master equation formalism (as in Chap. 7), approximate treatments are often given by means of a FPE. The reaction X+A==2x (5.2.57) is of interest since it possesses an exit boundary at x = 0 (where x is the number of molecules of X). Clearly if there is no X, a collision between X and A cannot occur so no more X is produced. The FPE is derived in Sect.7.6.1 and is ,p(x, t) = —8,[(ax — x*)p(x, t)] + 4 A2(ax + x*)p(x, 1]. (5.2.58) We introduce reflecting boundaries at x = a and x = f. In this case, the stationary solution is pax) = e(a + xyz! (5.2.59) which is not normalisable if a = 0. The pole at x = 0 is a result of the absorption there. In fact, comparing with (5.2.28), we see that BO, t) = (ax + x*),-0 = 0 AQ, t) = (ax — X%)09 = 0 (5.2.60) 9,BO, t) = (a + 2x),-0 > 0 so we indeed have an exit boundary. The stationary solution has relevance only 128 5. The Fokker-Planck Equation if a > Osince it is otherwise not normalisable. The physical meaning of a reflecting barrier is quite simple: whenever a molecule of X disappears, we simply add another one immediately. A plot of p,(x) is given in Fig. 5.2. The time for all x to disappear is in practice extraordinarily long, and the stationary solution (5.2.59) is, in practice, a good representation of the distribution except near x = 0. Ps (x) Fig. 5.2. 
Non-normalisable “stationary” p,(x) for the reaction X + A == 2X x 5.2.4 Boundary Conditions for the Backward Fokker-Planck Equation We suppose that p(x, t|x’, t') obeys the forward Fokker-Planck equation for a set of x,t and x’, 1’, and that the process is confined to a region R with boundary S. Then, if s is a time between ¢ and t’, @ ny 2 yn 0 = 5 plx, tla’) = 5 J dypls, tly, sip(y, six4,0), (5.2.61) as as where we have used the Chapman-Kolmogorov equation. We take the derivative a/As inside the integral, use the forward Fokker-Planck equation for the second factor and the backward equation for the first factor. For brevity, let us write PCy, S) = ply, s| x’, 1’) (5.2.62) B(y, s) = p(x, tly, s)- Then, a # 5 O= dy [- Digy, Av) + Sapa, @n]— (5.2.63) . a + jal Ay, — 3p Bu ios! oy] and after some manipulation = [ard [an +t3[62 .n —00, 2] = [49 DE |—Anp + 53: 0S (Bu) — 78,32) (5.2.64) 5.2. Fokker-Planck Equation in One Dimension 129 = [54s [p[-a9+ 55 3 (B.,0)] ~zJEasv (Ea,d 2) (5.2.65) {yas, 5, 2. We now treat the various cases individually. a) Absorbing Boundaries This requires p = 0 on the boundary. That it also requires p(y, t) = 0 on the boun- dary is easily seen to be consistent with (5.2.65) since on substituting p = 0 in that equation, we get O= fp das By i (5.2.66) However, if the boundary is absorbing, clearly P(x, tly,s)=0, — for y & boundary (5.2.67) since this merely states that the probability of X re-entering R from the boundary is zero. b) Reflecting Boundaries Here the condition on the forward equation makes the first integral vanish in (5.2.65). The final factor vanishes for arbitrary p only if a Dn.BAly) 3y, tly, = 0- (5.2.68) aT yy In one dimension this reduces to 2 w tly,s)=0 (5.2.69) ay x, 0¥, 5s) = 2. unless B vanishes. ©) Other Boundaries We shall not consider these in this section. For further details see [5.4]. 5.2.5 Eigenfunction Methods (Homogeneous Processes) We shall now show how, in the case of homogeneous processes, solutions can most naturally be expressed in terms of eigenfunctions. We consider reflecting and absorbing boundaries. a) Eigenfunctions for Reflecting Boundaries We consider a Fokker-Planck equation for a process on an interval (a, b) with reflecting boundaries. We suppose the FPE to have a stationary solution p,(x) and the form 130 >. the Fokker-Planck Equation 9, p(x, 0) = —A,[A(x)p(x, 1)] + $02 [BO)pC>, 1)] - (5.2.70) We define a function q(x, t) by P(x, t) = p(x)q(x, 1) (5.2.71) and, by direct substitution, find that q(x, 1) satisfies the backward equation g(x, 1) = A(x)0.4(x, 1) + $ BODAA(x, 1). (5.2.72) We now wish to consider solutions of the form P(x, t) = Px(xe*" (5.2.73) W(x, 1) =Q,(xJe* (5.2.74) which obey the eigenfunction equations —0,[A(x)Pi(x)] + 4 02[B(2)P,(X)] = —AP (2) (5.2.75) A(X)0,Oi(x) + § BOX)IIQUAX) = —1'Qu(X) « (5.2.76) Then we can straightforwardly show by partial integration that ' ae af AxP (x) Qu(x) = [Qu(x){— A) PX) + § 0,1B)Ps))} — } BO)PAC)9.Qv)]2, (5.2.77) and using the reflecting boundary condition on the coefficient of Q,,(x), we see that it vanishes. Further, using the definition of g(x, t) in terms of the stationary solution (5.2.71), it is simple to show that $ BOX), Q(x) = —AG)Palx) + $ 0,[B@)P(X)] (5.2.78) so that term vanishes also. Hence, the Q,(x) and P,(x) form a bi-orthogonal system » J dx PxOu(X) = Biv (5.2.79) or, there are two alternative orthogonality systems, fax PAlX)OAX)OAX) = Bua (5.2.80) Jastrscoy'Ps(a)Pu(2) = By (5.2.81) 5.2. 
Fokker-Planck Equation in One Dimension 131 It should be noted that setting 2 = solution p,(x) since = 0 gives the normalisation of the stationary Po(x) = p(x) (5.2.82) Q(x) = (5.2.83) Using this orthogonality we can write any solution in terms of eigenfunctions. For if P(x 1) = = A,Pi(xye™ , (5.2.84) then ’ f dx Q,(x)p(x, 0) = Aa. (5.2.85) For example, the conditional probability p(x, | xo, 0) is given by the initial con- dition P(x, 0] x0, 0) = 8(x — x) (5.2.86) so that A= j dx Q,(x)5(x — Xo) = Qi(%o) (5.2.87) and hence, PAX 10,0) = 3 PAR)QQ)E™ « (6.2.88) We can write the autocorrelation function quite elegantly as Kx(t)x(O)> = J dx J dxo xxop(x, t]X0, Op.) (5.2.89) = Sf dx xPore™, (5.2.90) where we have used the definition of Q,(x) by (5.2.74). b) Eigenfunctions for Absorbing Boundaries This is very similar. We define P, and Q, as above, except that p,(x) is still the sta- tionary solution of the Fokker-Planck equation with reflecting boundary conditions. With this definition, we find that we must have P,(a) = Q,(a) = P,(b) = Q,(b) = 0 (5.2.91) we 2) ker ORneL-rneK Equauon and the orthogonality proof still follows through. Eigenfunctions are then com- puted using this condition and the eigenfunction equations (5.2.75, 76) and all other results look the same, However, the range of A does not include 4 = 0, and hence (x, t|Xo, 0) + 0 as t+ 00. 5.2.6 Examples a) A Wiener Process with Absorbing Boundaries The Fokker-Planck equation Op =} 2p (5.2.92) it treated on the interval (0, 1). The absorbing boundary condition requires PO, t) = pl, t) =0 (5.2.93) and the appropriate eigenfunctions are sin (mmx) so we expand in a Fourier sine series pix, t) = 5° b(t) sin(nnx) (5.2.94) which automatically satisfies (5.2.93). The initial condition is chosen so that P(x, 0) = B(x — xo) (5.2.95) for which the Fourier coefficients afe 1 b,(0) = 2 f dx 8(x — x9) sin (nnx) = 2 sin (nnxp) . (5.2.96) 3 Substituting the Fourier expansion (5.2.94) into (5.2.92) gives b(t) = —Anbilt) (5.2.97) with Ay = nn? /2 (5.2.98) and the solution b,(t) = b,(O)exp(— 4,t) « (5.2.99) So we have the solution [which by the initial condition (5.2.95) is for the conditional probability p(x, t] xo, 0)] P(x, t|xo, 0) = 2 x exp(— 4,t) sin (nmx,) sin (nnx) . (5.2.100) $.2. Fokker-Planck Equation in One Dimension 133 b) Wiener Process with Reflecting Boundaries Here the boundary condition reduces to [on the interval (0, 1)] 4,p(0, 1) = a, p(1, 1) = 0 (5.2.101) and the eigenfunctions are now cos (nx), so we make a Fourier cosine expansion Px, 1) = hay + 5? a,(t) cos (nex) (5.2.102) with the same initial condition P(x, 0) = B(x — x9) (5.2.103) so that a,(0) = 2 f dx cos (nmx)B(x — x9) = 2cos (nm) - (5.2.104) In the same way as before, we find a,(t) = a,(0) exp (— A,t) (5.2.105) with A, = nn?/2 (5.2.106) so that P(x, t|xXo, 0) = 1 +2 x cos (nxq) cos (nmx) exp (—A,f) - (5.2.107) As f — co, the process becomes stationary, with stationary distribution a(x) = lim px, #1 x0, 0) = 1. (5.2.108) We can compute the stationary autocorrelation function by &x(1)x(0)), = ax xy xXop(X, t| Xo, p(x) (5.2.109) and carrying out the integrals explicitly, t) = fap’, t|x, 0) (5.2.141) which means that G(x, t) is the same as Prob(T > #). Since the system is time homogeneous, we can write P(x’, t|x, 0) = P(x, O|x, —t) (5.2.142) and the backward Fokker-Planck equation can be written 4,p(x’, t|x, 0) = A(x)0, p(x’, t|x, 0) + $ B(x) p(x’, tx, 0) (5.2143) and hence, G(x, 1) obeys the equation 9,G(x, 1) = A(x)0,G(x, 1) + } B(x)82G(x, 1) . 
(5.2.144) The boundary conditions are clearly that P(x’, 0|x, 0) = 3x — x’) and hence, Gx,0)=1 a1)=0 whenx=aorb, i.e, Ga, 1) = Gb, 1) =0. (5.2.146) Since G(x, t) is the probability that 7 > ¢, the mean of any function of T is _ Janda, _ (5.2.147) Thus, the mean first passage time T(x) = t) = (5.2.180) We now find an equation for g,(x, t). We use the fact that p(a, t|x, 0) satisfies a backward Fokker-Planck equation. Thus, A(X)0.80(%, 1) + $B(x)822.(x, 1) = — Sdt'dyJ(a, t"|x, 0) Ja, t|x, 0) = g(x, t) . (5.2.181) The mean exit time, given that exit is through a, is T(a, x) =~ §10,Prob (T, >t) dt = f ga(x, Hdt/ga(x, 00). (5.2.182) a 3 Simply integrating (5.2.181) with respect to 1, we get A(x)0, )T(a, x)] + 4B(x)03[2.(x)T(a, x)] (5.2.183) where we define n4(x) = (probability of exit through a) = g,(x, 0) . (5.2.184) 5.3. Fokker-Planck Equations in Several Dimensions 143 The boundary conditions on (5.2.183) are quite straightforward since they follow from those for the backward Fokker-Planck equation, namely, n,(a)T(a, a) = 7,(b)T(a, b) = 0. (5.2.185) In the first of these clearly T(a, a) is zero (the time to reach a from a is zero) and in the second, 2,(b) is zero (the probability of exiting through a, starting from b, is zero). By letting ¢+0 in (5.2.181), we see that J(a,0|x,0) must vanish if a #x. since, p(a,0|x,0) = d(x~ a). Hence, the right-hand side tends to zero and we get A(x)O,7a(x) + 4B(x)02z,(x) = 0, (5.2.186) the boundary condition this time being n,(a) = 1 5.2.187, n(b) =0. ¢ » The solution of (5.2.186) subject to this boundary condition and the condition max) + (x) = 1 (5.2.188) is nx) =f dy vO fay O) (5.2.189) mx) = Lf dy von f dy vO) (5.2.190) with y(x) as defined in (5.2.157). These formulae find application in the problem of relaxation of a distribution initially concentrated at an unstable stationary point (Sect.9.1.4). 5.3. Fokker-Planck Equations in Several Dimensions In many variable situations, Fokker-Planck equations take on an essentially more complex range of behaviour than is possible in the case of one variable. Boundaries are no longer simple end points of a line but rather curves or surfaces, and the nature of the boundary can change from place to place. Stationary solutions even with reflecting boundaries can correspond to nonzero probability currents and eigenfunction methods are no longer so simple. 144 5. The Fokker-Planck Equation Nevertheless, the analogies between one and many dimensions are useful, and this section will follow the same general outline as that on one-variable situations. 5.3.1 Change of Variables Suppose we have a Fokker-Planck equation in variable x,, 8:p(x, 1) = — 3 dlAdx)p(@, 1] + 4 2 9,0,1B,,(x)p(x, 1)] 6.3.1) and we want to know the corresponding equation for the variables Y= Slx), (5.3.2) where f; are certain differentiable independent functions. Let us denote by p(y, t) the probability density for the new variable, which is given by (x1, x: ) AY, Ya + | (5.3.3) 1.0 = pls, 1)| The simplest way to effect the change of variables is to use Ito’s formula on the corresponding SDE dx(t) = A(x)dt + /B@) dW(1) (5.3.4) and then recompute the corresponding FPE for p(y, t) con the resulting SDE as derived in Sect. 4.3.4. The result is rather complicated. In specific situations, direct implementation (5.3.3) may be preferable. There is no way of avoiding a rather messy calculation unless full use of symmetries and simplifications is made. 
Example: Cartesian to Polar Coordinates, As an example, one can consider the transformation to polar coordinates of the Rayleigh process, previously done by the SDE method in Sect.4.4.5. Thus, the Fokker-Planck equation is = ya fo 1a ap 3,0(E, Eas) = v5 EP + Yap Bap + x & (58 + 58) (53.5) and we want to find the FPE for a and ¢ defined by E, = acosg (5.3.6) E, =asing. The Jacobian is cosg —asing sing acosg (5.3.7) 5.3 Fokker-Planck Equations in Several Dimensions 145 We use the polar form of the Laplacian to write fF 1 1aya agit dei aagt a alae) 6.3.8) and inverting (5.3.6) a= SE? + E} | ae g = tan(E,/E\) , 2 we note Bo aE, ~ JEP SS Similarly, (5.3.10) aa 3a sin g and a E, # F+F = cos ga. Similarly, (53.11) a zs =—singja. Hence, a a 5g Ei + 5p. Ese _ @p da, ap 3$) | p (Op da , apap = + Eilagag, + ag an) tale ag, * ae) - op_1a = 4a a1 3 (ae. (5.3.12) Let us use the symbol f(a, g) for the density function in terms of a and g. The Jaco- bian formula (5.5.3) tells us that (Ey, E) (a, $) Ba, $) = P(E, Ex) = ap(E,, E2) . (5.3.13) Putting together (5.3.5, 8, 12, 13), we get : a 2 Lap at B= - Fl (m+ F\a]+5 wag tae (5.3.14) 146 5. The Fokker-Planck Equation which (of course) is the FPE, corresponding to the two SDE’s in Sect.4.4.5, which were derived by changing variables according to Ito’s formula. 5.3.2 Boundary Conditions We have already touched on boundary conditions in general in Sect.5.2.1 where they were considered in terms of probability current. The full range of boundary conditions for an arbitrary multidimensional Fokker-Planck equation does not seem to have been specified yet. In this book we shall therefore consider mostly reflecting barrier boundary conditions at a surface S, namely, nJ=0 onS, (5.3.15) where m is the normal to the surface and 1 a Ii, 1) = Adz, DPE) — > Day Bul® Op, 1) (5.3.16) J OX; and absorbing barrier boundary conditions p(x,t)=0 forxonS. (5.3.17) In practice, some part of the surface may be reflecting and another absorbing. Ata surface S on which A, or B,, age discontinuous, we enforce nd=nd, on : (5.3.18) p(x) = pox) x on S. The tangential current component is permitted to be discontinuous. The boundary conditions on the backward equation have already been derived in Sect.5.2.4. For completeness, they are Absorbing Boundary p(x,tly,t')=0 yes (5.3.19) Reflecting Boundary ¥\n,B,(y) Z p(x, tly, 1 yes. (5.3.20) 7 7 5.3.3 Stationary Solutions: Potential Conditions A large class of interesting systems is described by Fokker-Planck equations which permit a stationary distribution for which the probability current vanishes for all x in R. Assuming this to be the case, by rearranging the definition of J (5.3.16), we obtain a completely equivalent equation L apd) _ tye FEA) BE) — pode) — FEE Bule)]- (63.21) 5.3. Fokker-Planck Equations in Several Dimensions 147 If the matrix B,,(x) has an inverse for all x, we can rewrite (5.3.21) Flow asa] =F: Bit) [24ula) — 2X Ba(0)] (5.3.22) = ZA, B, x]. (5.3.23) This equation cannot be satisfied for arbitrary B,(x) and A,(x) since the left-hand side is explicitly a gradient. Hence, Z, must also be a gradient, and a necessary and sufficient condition for that is the vanishing of the curl, i.e., 9Z, _ 9Z, ox 7 Oa,” (5.3.24) If this condition is satisfied, the stationary solution can be obtained by simple integration of (5.3.22): pax) = exp {f dx! ZA, B, x')). 
(5.3.25) The conditions (5.3.24) are known as potential conditions since we derive the quan- tities Z, from derivatives of log [p,(x)], which, therefore, is often thought of as a potential —¢(x) so that more precisely, P(x) = exp [—¢(x)] (5.3.26) and Hx) = -f dx'.Z[A, B, x']. (5.3.27) Example: Rayleigh Process in Polar Coordinates. From (5.3.14) we find A= [” a “pe (5.3.28) 0 (5.3.29) (5.3.30) Z=2B(A= [re + “4 (53.31) 148, 5. The Fokker-Planck Equation and clearly Ge Fem. (5.3.32) The stationary solution is then pila, $) = expt (ds Z, + da Z,)) (5.3.33) : = Wexp (- 1S + log a) (5.3.34) =a exp( (5.3.35) 5.3.4 Detailed Balance a) Definition of Detailed Balance The fact that the stationary solution of certain Fokker-Planck equations corres- ponds to a vanishing probability current is a particular version of the physical phenomenon of detailed balance. A Markov process satisfies detailed balance if, roughly speaking, in the stationary situation each possible transition balances with the reversed transition. The concept of detailed balance comes from physics, so let us explain more precisely with a physical example. We consider a gas of particles with positions r and velgcities v. Then a transition corresponds to a particle at some time f with position velocity (r,v) having acquired by a later time ¢ + 7 position and velocity (r’, v’). The probability density of this transition is the joint probability density p(r', v', t +t; 1, v, 1). We may symbolically write this transition as (0,014. (5.3.36) The reversed transition is not given simply by interchanging primed and unprimed quantities Rather, it is (,-v,.0-@—-2,t4+9- (5.3.37) It corresponds to the time reversed transition and requires the velocities to be re- versed because the motion from r’ to r is in the opposite direction from that from rtor’. The probability density for the reversed transition is thus the joint probability density Pr, —v, t+ t5r', 02). (5.3.38) The principle of detailed balance requires the equality of these two joint probabilities when the system is in a stationary state. Thus, we may write 5.3. Fokker-Planck Equations in Several Dimensions 149 Pr’, v', t; 7, v, 0) = pr, —v, t; 7’, —v’, 0) (5.3.39) (The principle can be derived under certain conditions from the laws of physics, see [5.7] and Sect.5.3.6b.) More explicitly, for a Markov process we can rewrite (5.3.39) P(r’, v', tT] r, v, O)p,(r, v) = p(r, — v, t]r’, — v’, Ope’, — v’), (5.3.40) where the conditional probabilities now apply to the corresponding homogeneous Makov process (if the process was not Markoy, the conditional probabilities would be for the stationary system only). In its general form, detailed balance is formulated in terms of arbitrary variables X,, which under timereyersal, transform to the reversed variables according to the rule 1 eX; (5.3.41) g=+1 (5.3.42) depending on whether the variable is odd or even under time reversal. In the above, r is even, v is odd. Then by detailed balance we require pAx, t+ 5x’, t) = p(ex', t+ 7; ex, 1). (5.3.43) By ex, we mean (6,1, &2X2. -..)- Notice that setting t = 0 in (5.3.43) we obtain 8(x — x')p(x') = 5(ex — ex’)p.(ex) . (5.3.44) The two delta functions are equal since only sign changes are involved. Hence, P(x) = ps(ex) (5.3.45) is a consequence of the formulation of detailed balance by (5.3.43). Rewriting now in terms of conditional probabilities, we have P(x, t|x', O)p,(x') = plex’, tl ex, 0)p,(x) . 
(5.3.46) b) General Consequences of Detailed Balance An important consequence of (5.3.45) is that )s = 8d), (5.3.47) (hence all odd variables have zero stationary mean), and for the autocorrelation function G(x) = &) 59x, BuPopter) a a 4 +2 z BuP.) gy Plex) + BuPs ae, wes)| Ips. (5.3.66) iii) Jump Term: SdzlW(x | z)p(z, t4x', 0) — W(z|x)p(~, t|x’, )] = J de[W(x|z)p.(z)p(ex’, t|ez, 0) — W(z|x)p.(x)p(ex’, t| ex, 0)1/p, . (5.3.67) We now use the fact that p,(x) is a solution of the stationary differential Chapman- Kolmogorov equation to write 3K Civ) + FB 5S Ours] — fede Wel=ppca) = —J dz Wx|2)p.(2) (5.3.68) and using the detailed balance condition (5.3.53(i)) for W = —J dz W(ez|ex)p,(x). (5.3.69) Now substitute yoer (5.3.70) y= ex’ and all up all three contributions, taking care of (5.3.68, 69): = [-xadeerncn[Z aol See, (Bordo 00)] +5 3 &€,B,,(ey)p.(y) laa | (5.3.71) Oy,dy, + f del Mey 2)p.C2)p (1122.0) — Mey l2)p.@)0C9", t1y, 0) iv We now substitute the detailed balance conditions (5.3.53). = [400 gh 069 119. 0) + 3B BUC) egy OO" #1940) + § de{W(z| yp’ t|z,0) — W(zly)p(y's t1y.0)} pWViply'). (5.3.72) 194 3. Lhe Fokker-Planck Equation The term in the large curly brackets is now recognisable as the backward differential Chapman-Kolmogorov operator (Sect.3.6, (3.6.4)]. Note that the process is homo- geneous, so that PCy’, tly, 0) = p(y, Oly, —1)- We see that a : , a , (5.3.72) = 5 [a #1, Op.Cypw] = 57 BC, #12", 0) (5.3.73) which means that p(x, ¢|x’, 0), defined in (5.3.62), satisfies the forward differential Chapman-Kolmogorov equation. Since the initial condition of p(x, |x’, 0) and A(x, tx’, 0) at ¢ = 0 are the same (5.3.63) and the solutions are unique, we have shown that provided the detailed balance conditions (5.3.53) are satisfied, detailed balance is satisfied. Hence, sufficiency is shown. Comments i) Even variables only: the conditions are considerably simpler if all ¢, are +1. In this case, the conditions reduce to Welx)pAe) = Wo |p@) (53.74) Ada) = Size (Blsga) (63.75) Bix) = By (x), 7 (5.3.76) the last of which is trivial. The condition (5.3.75) is exactly the same as the potential condition (5.3.21) which expresses the vanishing of J, the probability current in the stationary state. The conditions (5.3.74, 75) taken together imply that p,(x) satisfies the stationary differential Chapman-Kolmogorov equation, which is not the case for the general conditions (5.3.53). ii) Fokker-Planck equations: van Kampen, [5.7], and Graham and Haken [5.9] in- troduced the concept of reversible and irreversible drift parts. The irreversible drift is D(x) = } (A(x) + &A((ex)] (5.3.77) and the reversible drift I(x) = }[Adx) — 6, Alex). (5.3.78) Using again the potential defined by plz) = exp [—9(2)], (6.3.79) we see that in the case of a Fokker-Planck equation, we can write the conditions for detailed balance as e:8/By (ex) = By (x) (5.3.80) Fgh (Bylo) = 3 Ba) EO (5.3.81) =| (x) — 14x) B2)] = 9 (5.3.82) 7 Ox, where the last equation is simply the stationary FPE for p,(x), after substituting (3.3.53(i)). As was the case for the potential conditions, it can be seen that (5.3.81) gives an equation for g/8x, which can only be satisfied provided certain conditions on D(x) and B,(x) are satisfied. If B,,(x) has an inverse, these take the form 32. = a, 5 (5.3.83) where 2, = E Bae) [200) — 2 Bf] (5.3.84) and we have paz) = exp[—g(a)] = exp (fds’-2). (5.3.85) Thus, as in the case of a vanishing probability current, p,(x) can be determined explicitly as an integral. 
iii) Connection between backward and forward operators of differential Chapman- Kolmogorov equations is provided by the detailed balance. The proof of sufficient conditions amounts to showing that if f(x, t) is a solution of the forward differential Chapman-Kolmogorov equation, then 4(%, 1) = flex, — Dipl) (5.3.86) is a solution of the backward differential Chapman-Kolmogorov equation. This relationship will be used in Sect.5.3.7 for the construction of eigenfunctions. 5.3.6 Examples of Detailed Balance in Fokker-Planck Equations a) Kramers’ Equation for Brownian Motion [5.10] We take the motion of a particle in a fluctuating environment. The motion is in one dimension and the state of the particle is described by its position x and velocity v. This gives the differential equations 156 5. The Fokker-Planck Equation @ dry (5.3.87) and mi? = —V'@) — o-+ VIET Et) (5.3.88) which are essentially Langevin’s equations (1.2.14) in which for brevity, we write 6nya = B and V(x) is a potential whose gradient V’(x) gives rise to a force on the particle. By making the assumption that the physical fluctuating force ¢(t) is to be interpreted as &(t)dt = dW(t) (5.3.89) as explained in Sect.4.1, we obtain SDE’s dx =v dt (5.3.90) m dv = —[V'(x) + Boldt + /2BKT dW(t) (5.3.91) for which the corresponding FPE is 2B — 20) + LZ ve + puto) + OAT SE (53.2) or m m> dv" The equation can be slightly simplified by introducing new scaled variables y = Xo/mkT (5.3.93) u = v/mkT (5.3.94) Uy) = Vxy/kT (5.3.95) y= Bim (5.3.96) so that the FPE takes the form o_o ay a 22) Som — Gp) + elU'Onal + 75, (ww + (63.97) which we shall call Kramers’ equation. Here, y (the position) is an even variable and u (the velocity) an odd variable, as. explained in Sect.5.3.4. The drift and diffusion can be written 5.3 Fokker-Planck Equations in Several Dimensions 187 u A(y, u) = [_ vy) — wal > (5.3.98) 0 0 BO, uw) = [ ; | (5.3.99) and 7] - [? iF (5.3.10) u —u We can check the conditions one by one. The condition (5.3.53(iii)) is trivially satisfied. The condition (5.3.53(ii)) is somewhat degenerate, since B is not invertible. It can be written eA(y, — u)p,(y, u) = —ACy, up, ¥) + dy Os ” Ou 0 (5.3.101) or, more fully —u —u 0 UO) — I. “| vot | : » w| a The first line is an identity and the second states —up,(y, u) = 9 (5.3.103) au ie., PAY, ¥) = exp (— He) f(y) (5.3.104) which means that if p,(y;u) is written in the form (5.3.104), then the detailed balance conditions are satisfied. One must now check whether (5.3.104) indeed gives a stationary solution of Kramers’ equation (5.3.97) by substitution. The final brac- ket vanishes, leaving O= ul — U'(y)uf (5.3.105) which means SO) = M exp[—UGQ)] (5.3.106) and 158 5. The Fokker-Planck Equation PAC, u) = Mexp[—U(y) — fu’). (5.3.107) In terms of the original (x, v) variables, Ya) er | (5.3.108) Paes 2) = exp [ — HO) — 5 which is the familiar Boltzmann distribution of statistical mechanics. Notice that the denominators kT arise from the assumed coefficient ./28kT of the fluctuating force in (5.3.88). Thus, we take the macroscopic equations and add a fluctuating force, whose magnitude is fixed by the requirement that the solution be the Boltz- mann distribution corresponding to the temperature T. But we have also achieved exactly the right distribution function. This means that the assumption that Brownian motion is described by a Markov process of the form (5.3.87, 88) must have considerable validity. 
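The Boltzmann form (5.3.108) is also easy to confirm by direct simulation of (5.3.90, 91). The sketch below takes $V(x) = x^2/2$ with $m = kT = 1$, purely illustrative values, for which equipartition requires $\langle x^2\rangle_s = \langle v^2\rangle_s = 1$ whatever the value of $\beta$.

```python
import numpy as np

# Brownian motion in a potential, cf. (5.3.90, 91):
#   dx = v dt,   m dv = -[V'(x) + beta*v] dt + sqrt(2*beta*kT) dW.
# With V(x) = x^2/2 and m = kT = 1 the Boltzmann distribution gives
# <x^2> = kT = 1 and <v^2> = kT/m = 1 in the stationary state.
rng = np.random.default_rng(1)
m, kT, beta, dt, nsteps, npaths = 1.0, 1.0, 2.0, 1e-3, 20000, 2000

x = np.zeros(npaths)
v = np.zeros(npaths)
for _ in range(nsteps):
    dW = rng.normal(0.0, np.sqrt(dt), npaths)
    x = x + v * dt
    v = v + (-(x + beta * v) * dt + np.sqrt(2 * beta * kT) * dW) / m
print("<x^2> =", (x**2).mean(), "  <v^2> =", (v**2).mean())   # both close to 1
```

The magnitude of the fluctuating force enters only through the combination $2\beta kT$, which is why the stationary temperature is fixed by the ratio of damping to diffusion, exactly as observed above.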
b) Deterministic Motion Here we have B,(x) and W(x|x’) equal to zero, so the detailed balance conditions are simply ,A(ex) = —A(x) . (5.3.109) Since we are now dealing with a Liouville equation (Sect.3.5.3), the motion of a point whose coordinates are x is described by the ordinary differential equation % 4am = Afx(s)] - : (5.3.110) Suppose a solution of (5.3.110) which passes through the point y at t = 0 is alt, y) (53.111) which therefore satisfies g0,y)=y. (5.3.112) Then the relation (5.3.109) implies that the reversed solution eq(—t, ey) (5.3.113) is also a solution of (5.3.110), and since eq(0, ey) = eey=y, (5.3.114) i.e., the initial conditions are the same, these solutions must be identical, i-e., eq(—t, ey) = g(t, y)- (5.3.115) Now the joint probability in the stationary state can be written as 3.5 POkKer-rianck Equations 1 Several Wutteusivus toy PAX, 13 %',1') = Jf dy p(x, t; x", 1/5 y, 0) = J dy [x — a(t, yx’ — a(t’, »Ipy) (5.3.116) and Pex’, — t'; ex, — t) = f dy bfex — q(—t, y)]6[ex’ — g(—1', y)] p(y). (5.3.17) Change the variables from y to ey and note that p,(y) = p,(ey), and dey = dy, so that (5.3.117) = f dy 8[x — eq(—t, ey)]6[x’ — eg(—’, ep)]p.(») (5.3.18) and using (5.3.115), = J dy S[x — g(t, yl8[x’ — a(t’, »)lp.Cy) (5.3.19) = px, ty x',1'). (5.3.120) Using the stationarity property, that p, depends only on the time difference, we see that detailed balance is satisfied. This direct proof is, of course, unnecessary since the original general proof is valid for this deterministic system. Furthermore, any system of deterministic first- order differential equations can be transformed into a Liouville equation, so this direct proof is in general unnecessary and it is included here merely as a matter of interest. However, it is important to give a brief summary of the philosophy behind this demonstration of detailed balance. In physical systems, which are where de- tailed balance is important, we often have an unbelievably large number of variables, of the order of 10° at least. These variables (say, momentum and velocity of the particles in a gas) are those which occur in the distribution function which obeys a Liouville equation for they follow deterministic equations of motion, like Newton’s laws of motion. It can be shown directly that, for appropriate forms of interaction, Newton’s laws obey the principle of microscopic reversibility which means that they can be put in the form (5.3.110), where A(x) obeys the reversibility condition (5.3.109). The macroscopically observable quantities in such a system are functions of these variables (for example, pressure, temperature, density of particles) and, by appropriate changes of variable, can be represented by the first few components of the vector x. Thus, we assume x can be written x= (a, #) (5.3.121) where the vector @ represents the macroscopically observable quantities and is all the others. Then, in practice, we are interested in BCG 5 15 ay f25 sy b5 ---) = ff [dbs dts... p0er, ths Xa, 3 Xs, 5 (5.3.12) 160 5. The Fokker-Planck Equation From the microscopic reversibility, it follows from our reasoning above that p, and thus also f, both obey the detailed balance conditions but, of course, f does not obey a Liouville equation. /f it turns out or can be proven that p obeys, to some degree approximation, a Markov equation of motion, then we must preserve the detailed balance property, which takes the same form for f as for p. 
In this sense, the condi- tion (5.3.43) for detailed balance may be said to be derived from microscopic re- versibility of the equations of motion. c) Ornstein-Uhlenbeck Process: Onsager Relations in Linear Systems Most systems in which detailed balance is of interest can be approximated by an Ornstein-Uhlenbeck process, i.c., this means we assume Aa) = 3 Ass (5.3.123) B,(x) = By. (5.3.124) The detailed balance conditions are not trivial, however. Namely, a Ei GeAy + Ay)x) =X By x log pil) (5.3.125) 7 7 Ox; and 2¢By = By. ' (5.3.126) Equation (5.3.125) has the qualitative implication that p,(x)-is a Gaussian since derivative of log p,(x) is linear in x. Furthermore, since the left-hand side contains no constant term, this Gaussian must have zero mean, hence, we can write P(e) = Mexp (— $ x7o"!x). (5.3.127) One can now substitute (5.3.127) in the stationary Fokker-Planck equation and re- arrange to obtain Ein — FF Byag' + EH on Ay + 4 Fon Buoy" \xex; = 0 (5:3.128) (we have used the symmetry of the matrix c). The quadratic term vanishes if the symmetric part of its coefficient is zero. This condition may be written in matrix form as oA + Ato“! = —o"'Bo™! (5.3.129) or Ag + cAT = —B. (5.3.130) The constant term also vanishes if (5.3.129) is satisfied. Equation (5.3.130) is, of 5.3 Fokker-Planck Equations in Several Dimensions 161 course, exactly that derived by SDE techniques in Sect. 4.4.6 (4.4.51) with the substitutions A——A (5.3.131) BB" B. We can now write the detailed balance conditions in their most elegant form. We define the matrix ¢ by & = diag (e, &, &, «..) (5.3.132) and clearly ; e=1, (5.3.133) Then the conditions (5.3.125, 126) become in matrix notation ede + A= —Bo™ (5.3134) eBe=B. (5.3.135) The potential condition (5.3.83) is simply equivalent to the symmetry of ¢. As noted in Sect.5.3.4 (5.3.49), detailed balance requires eo =e. (5.3.136) Bearing this in mind, we take (5.3.130) Ag + cAT = —B and from (5.3.134) eAta + Ao = (5.3.137) which yield eAea = cAT (5.3.138) and with (5.3.136) 8(Aa) = (Ao)Fe. (5.3.139) These are the celebrated Onsager relations; Onsager, [5.11]; Casimir, [5.12]. The derivation closely follows van Kampen’s [5.6] work. The interpretation can be made simpler by introducing the phenomenological forces defined as the gradient of the potential ¢ = log[p,(x)]: F(x) = —Vg(x) = or'x (5.3.140) 162 5, The Fokker-Planck Equation (in physics, g/kT is the entropy of the system). Because of the linear form of the A(x) ((5.3.123)], the exact equations of motion for t) = G(x, t). (5.4.3) Since the process is homogeneous, we find that G(x, t) obeys the backward Fokker- Planck equation A,G(x, 1) = Fy Ail), 1) + 2 F By(3)80,G(, 1) - (5.4.4) The initial conditions on (5.4.4) will arise from: i) p(x’, O|x, 0) = &(x — x’) (5.4.5) so that Gx,0)=1 xER (5.4.6) =0 elsewhere; ii) the boundary condition (5.4.1) requires Gan=0 xeS. (5.4.7) 5.4 First Exit Time from a Region 171 As in Sect. 5.2.7, we find that these imply that the mean exit time from R starting at x, for which we shall use the symbol 7(x), satisfies SAH) Ta) + 2 F By(2)98,TH) = —1 (5.4.8) with the boundary condition Tx)=0 xeS (5.4.9) and the nth moments T,(x) = (T") = Jertats, nd (5.4.10) satisfy =A Tya(8) = BAUR) T AH) + Ye By (290,72) (5.4.11) with the boundary conditions T(x) =0 xeES. (5.4.12) Inclusion of Reflecting Regions. It is possible to consider that S, the boundary of R, is divided into two regions S, and S, such that the particle is reflected when it meets S, and is absorbed when it meets S,. 
The boundary conditions on G(x, t) are then, from those derived in Sect. 5.2.4, EnB HAG, H=0 (HES) (5.4.13) Gx,)=0 (@eES,) (5.4.14) and hence, SBT) =0 @ ES) (5.4.15) T(x)=0 (x ES). (5.4.16) 5.4.1 Solutions of Mean Exit Time Problems The basic partial differential equation for the mean first passage time is only simple to solve in one dimension or in situations where there is a particular symmetry available. Asymptotic approximations can provide very powerful results, but these will be dealt with in Chap.9. We will illustrate some methods here with some examples. we D. ENE FOKREI=rlatteR Equauun a) Ornstein-Uhlenbeck Process in Two Dimensions (Rotationally Symmetric) we suppose that a particle moves according to dx = —kx dt + /DdW\(t) (5.4.17) dy = —ky dt + /D dW,{t) and want to know the mean exit time from the region xety 7 Oy (x) f dx’ P,(x'). (5.4.27) i R The success of the method depends on the knowledge of the eigenvalues satisfying the correct boundary conditions on S and normalised on R. c) Asymptotic Result If the first eigenvalue A, is very much less than all other eigenvalues, the series may be approximated by its first term. This will mean that the eigenfunction Q, will be very close to a solution of DAIS) + 7 x B,(x)0,0, f(x) = 0 (5.4.28) since 4, is very small. Hence, Q(x) ~ K (5.4.29) where K is a constant. Taking account of the bi-orthonormality of P, and Q,, we see I= f dx P\(x)O\(x) ~ K J dx P(x) (5.4.30) so that T(x) ~ WA. (5.4.31) The reasoning given here is rather crude. It can be refined by the asymptotic meth- ods of Chap.9. 4) Application of the Eigenfunction Method Two-dimensional Brownian motion: the particle moves in the x y plane within a square whose corners are (0, 0), (0, 1), (1, 0), (1, 1). The sides of this square are absorbing barriers. T(x, y) obeys 24 =) =-1. (5.4.32) ax dy The eigenfunctions satisfying the boundary condition T = 0 on the edges of the square are Prim(%, Y) = sin(nnx) sin(mnx) (5.4.33) 174 5. The Fokker-Planck Equation with Qn m(, }') = 4 sin(nnx) sin(mnx) (5.4.34) and n, m positive and integral. The eigenvalues are = Pot tm). (5.4.35) The coefficient J dx dy Pan(x,y)=0 — (either n or m even) ® (5.4.36) 4 = janqa (mand n both odd). Hence, 32 ; T(x, ») = rp 2 2 akon? vB) MO) sin(mmy) . (5.4.37) 5.4.2 Distribution of Exit Points This problem is the multidimensional analogue of that treated in Sect.5.2.8. Namely, what is the probability of exiting through an element dS(a) at @ of the boundary S of the region R. We assume absorption on all S. The probability that the particle exits through dS (a) after time f is g(a, x, 1)|dS(a)| = —f ai’sa, 1'|x,0)-dS(@). (5.4.38) Fig. 5.5. Region and surface considered in Sect. 5.4.2 5.4. First Exit Time from a Region 175 Similar reasoning to that in Sect.5.2.8 shows that g(a, x, 1) obeys the backward Fokker-Planck equation DAG Ra, x,1) + $Y Byl2IG,8(a, x, 1) = .8(a 2,1). The boundary conditions follow by definition. Initially we have g(a,x,0)=0 for x#a,xER and at all times g(@,x,1)=0 for x#a,xES. If x = a, then exit through dS(a) is certain, hence, g(a,a,t)dS(a)=1 forall or effectively g(a,x,t)=5(a—x) x eS, forallt, where 8,(a — x) is an appropriate surface delta function such that JlaS(a)|8,(a— x) =1. The probability of ultimate exit through dS(a) is ma, x)|dS(a)| = g(a, x, 0) | dS(a)| . 
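Mean exit time results of this kind can be checked against direct simulation. The sketch below treats the rotationally symmetric Ornstein-Uhlenbeck example of Sect. 5.4.1a, starting at the origin. The quadrature used for comparison, $T(0) = k^{-1}\int_0^a dy\,[\exp(k y^2/D) - 1]/y$, follows from the radial reduction of (5.4.8) with a reflecting condition at $r = 0$; it is rederived here rather than quoted from the text, and all parameter values are illustrative.

```python
import numpy as np

# Monte Carlo estimate of the mean exit time of dx = -k*x dt + sqrt(D) dW1,
# dy = -k*y dt + sqrt(D) dW2 from the disc x^2 + y^2 < a^2, starting at the
# origin.  The Euler step introduces a small bias at the absorbing boundary.
rng = np.random.default_rng(2)
k, D, a, dt, npaths = 1.0, 1.0, 1.0, 1e-3, 2000

pos = np.zeros((npaths, 2))
t = np.zeros(npaths)
alive = np.ones(npaths, dtype=bool)
while alive.any():
    n = alive.sum()
    pos[alive] += -k * pos[alive] * dt + np.sqrt(D * dt) * rng.normal(size=(n, 2))
    t[alive] += dt
    alive[alive] = (pos[alive] ** 2).sum(axis=1) < a * a

y = (np.arange(20000) + 0.5) * (a / 20000)            # midpoint quadrature
T0 = np.sum((np.exp(k * y**2 / D) - 1.0) / y) * (a / 20000) / k
print("Monte Carlo:", t.mean(), "  quadrature:", T0)  # both ~ 0.66 here
```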
The mean exit time given that exit occurred at a is T(a, x) = f dt g(a, x, x(a, x) a and in the same way as in Sect.5.2.8, we show that this satisfies (5.4.39) (5.4.40) (5.4.41) (5.4.42) (5.4.43) (5.4.44) (5.4.45) (5.4.46) DA Alx(a, x)T(a, x)] + 3 By(x)dGjlx(a, x)T(a, x)] = —n(a, x) and the boundary conditions is n(a,x)T(a,x)=0, xeES. (5.4.47) (5.4.48) Further, by letting t—- oo in the corresponding Fokker-Planck equation for g(a, x,t), we obtain the equation for x(a, x): 176 5. The Fokker-Planck Equation 2% A(x)dln(a, x)] + 4 z B,,(x)9,0,[x(a, x)] (5.4.49) The boundary condition for (5.4.49) is na,x)=0 a#x, xES (5.4.50) and fldS(a)|x(a, x) = 1. (5.4.51) Thus, we can summarise this as na,x)=6(a—x) xES, (5.4.52) where 8,(a — x) is the surface delta function for the boundary S. 6. Approximation Methods for Diffusion Processes The methods described in the previous two chapters have concentrated on exact results, and of course the results available are limited in their usefulness. Approxi- mation methods are the essence of most applications, where some way of reducing a problem to an exactly soluble one is always sought. It could even be said that most work on applications‘is concerned with the development of various approximations. There are two major approximation methods of great significance. The first is the small noise expansion theory which gives solutions linearised about a deter- ministic equation. Since noise is often small, this is a method of wide practical application, the equations are reduced into a sequence of time-dependent Ornstein- Uhlenbeck processes. Mostly the first order is used. Another large class of methods is given by adiabatic elimination, in which differ- ent time scales are identified and fast variables are eliminated completely. This forms the basis of the second half of the chapter. 6.1 Small Noise Perturbation Theories In many physical and chemical problems, the stochastic element in a dynamical system arises from thermal fluctuations, which are always very small. Unless one measures very carefully, it is difficult to detect the existence of fluctuations. In such a case, the time development of the system will be almost deterministic and the fluctuations will be a small perturbation. With this in mind, we consider a simple linear example which is exactly soluble: a one-variable Ornstein-Uhlenbeck process described by the stochastic differential equation: dx = — kx dt + edW(t) (6.1.1) for which the Fokker-Planck equation is dip = A,(kx p) + }e282p. (6.1.2) The solutions of these have been previously investigated in Sects. 3.8.4, 4.4.4. Here éis a small parameter which is zero in the deterministic limit. However, the limit € — 0 is essentially different in the two cases. In the stochastic differential equation (SDE) (6.1.1), as e—> 0, the differential equation becomes nonstochastic but remains of first order in t, and the limit e— 0 is therefore not singular. In contrast, in the Fokker-Planck equation (FPE) (6.1.2), 178 6. Approximation Methods for Ditfusion Processes the limit e > 0 reduces a second-order differential equation to one of first order. This limit is singular and any perturbation theory is a singular perturbation theory. The solution to (6.1.1) is known exactly—it is (4.4.26) alt) = ce + 6 f e-k—a(e') (6.1.3) i which can be written Xa(t) = Xo(t) + exi(t) (6.1.4) and this is generic; that is, we can normally solve in a power series in the small parameter ¢. 
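For the linear example (6.1.1) the decomposition (6.1.4) is in fact exact: integrating the full equation together with the zero- and first-order equations, using the same noise path, reproduces $x(t) = x_0(t) + \varepsilon x_1(t)$ to rounding error. A minimal sketch (step size and parameter values are illustrative):

```python
import numpy as np

# Check of (6.1.3, 4) for dx = -k*x dt + eps dW: with a shared noise path the
# Euler-Maruyama recursions for x, x0 and x1 satisfy x = x0 + eps*x1
# identically, because the equation is linear.
rng = np.random.default_rng(5)
k, eps, c, dt, nsteps = 1.0, 0.1, 1.0, 1e-3, 5000

x, x0, x1 = c, c, 0.0
for _ in range(nsteps):
    dW = rng.normal(0.0, np.sqrt(dt))
    x += -k * x * dt + eps * dW      # full equation (6.1.1)
    x0 += -k * x0 * dt               # deterministic part (eps = 0)
    x1 += -k * x1 * dt + dW          # first-order equation
print("|x - (x0 + eps*x1)| =", abs(x - (x0 + eps * x1)))   # ~ 1e-16
```

For nonlinear drift the identity becomes an order-$\varepsilon^2$ approximation, which is the content of the expansion developed in the next section.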
Furthermore, the zero-order term x,(f) is the solution of the equation obtained by setting e = 0, i-e., of dx = — kx dt. (6.1.5) The situation is by no means so simple for the Fokker-Planck Equation (6.1.2). Assuming the initial condition c is a nonstochastic variable, the exact solution is the Gaussian with mean and variance given by 0, and the coefficient is the Gaussian random variable y. This is essentially the same conclusion as was reached by the SDE method. The general form of these results is the following. We have a system described by the SDE . dx = a(x)dt + eb(x) dW(t). (6.1.14) Then we can write the solution as X(t) = xo(t) + exi(t) + e?x,(t) + .. (6.1.15) and solve successively for the x,(t). In particular, x,(t) is the solution of the deter- ministic equation dx = a(x)dt. (6.1.16) Alternatively, we consider the Fokker-Planck equation d.p = — A,{a(x)p] + 42*02[(x)*p] - (6.1.17) Then by changing the variable to the scaled variable and thus writing Y= [x — xolt)/e (6.1.18) BAY, t) =ep(x,t|c,0) , (6.1.19) we can write the perturbation expansion BAY, t) = Boy, 1) + ey, t) + ey, 1) +. - (6.1.20) Here we will find that f(y, t) is indeed a genuine probability density, i.e., is positive and normalised, while the higher-order terms are negative in some regions. Thus, it can be said that the Fokker-Planck perturbation theory is not probabilistic. In contrast, the SDE theory expands in a series of random variables x,(t), each of which has its own probability distribution. At every stage, the system is probabilistic. And finally, the most noticeable difference. The first term in the SDE perturba- tion theory is x(t) which is the solution of the SDE obtained by setting ¢ = 0 in (6.1.1). In contrast, the first term fo(y, £) in (6.1.20) is not the solution of the equa- 180 6. Approximation Methods for Diffusion Processes tion obtained by setting ¢ = 0 in (6.1.2). In general, it is a limiting form of a FPE for p,(y, t) obtained by setting ¢ = 0 in the FPE for the scaled variable, y. 6.2 Small Noise Expansions for Stochastic Differential Equations We consider a stochastic differential equation of the form dx = a(x)dt + eb(x)d W(t) (6.2.1) in which ¢ is a small parameter. At this stage we exclude time dependence from a(x) and (x), which is only necessary in order to simplify the algebra. The results and methods are exactly the same. We then assume that the solution x(t) of (6.2.1) can be written x(t) = x9(t) + exy(t) + e?x,(t) +... (6.2.2) We also assume that we can write a(x) = a(x, + ex, + ex, +...) (6.2.3) = (x0) + € a(Xo, X1) + E7a42(Xo5 X15 X2) + +--+ (6.2.4) The particular functional dependence in (6.2.4) is important and is easy to demonstrate, for t a(x) = a(x + 5 Xp) * 1 dra(xo) =D a (3 ern) (6.2.5) Formally resumming is not easy, but it is straightforward to compute the first few powers of ¢ and to obtain (Xo) = a(Xo) de (Xo, 1) = X1 ed 2, alsa sy = ed 4 Lg ii 629 de a Ay(Xoy X15 Xay Xs) = Xs See) +x, oH + +x ato Although it is not easy to write explicitly the full set of terms in general, it is easy to see that we can always write for n > 1, Aly Hy oy) = 4 ED) + An(Xo X1y ++ Xnot) > (6.2.7) 6.2 Small Noise Expansions for Stochastic Differential Equations 181 where it is seen that A, is independent of x,. In fact, it is easy to see directly from (6.2.5) that the coefficient of e” can only involve x, if this is contributed from the term p = |, and the only other possibility for e” to arise is from terms with m l. 
Assuming that the solution of (6.2.9) is given by X(t) = a(t), (6.2.12) the equation (6.2.10a) for n = I can be written as dx, [a(t)}xidt + bla(t)]dW(t) (6.2.13) where we have noted from (6.2.5) that 4g vanishes and by = This equation, the first of a perturbation theory, is a time-dependent Ornstein- Uhlenbeck process whose solution can be obtained straightforwardly by the methods of Sect.4.4.9, reduced to one dimension. The solution is obtained simply by multiplying by the integrating factor 182 6. Approximation Methods for Diffusion Processes exp f ai'Mate} and is x(t) = f Blatt} exp(— f Mats)ds} avr), (6.2.14) where the initial condition x,(0) = 0 has been included. For many purposes this form is quite adequate and amounts to a linearisation of the original equation about the deterministic solution. Higher-order terms are more complex because of the more complicated form of (6.2.10b) but are, in essence, treated in exactly the same way. In order to solve the equation for xy(‘), we assume we know all the x,(t) form < Nso that A, and b,_, become known (sto- chastic) functions of f after substituting these solutions. Then (6.2.10b) becomes dxy = {—Kfa(t)]xw + An(t)} dt + bya(t)dW(t) (6.2.15) whose solution is obtained directly, or from Sect.4.4.9, as ant) = fUAgt dt! + by aed] exp {— f klas)}d}. (6.2.16) oO av Formally, the procedure is now complete. The range of validity of the method and its practicability are yet unanswered. Like all perturbation theories, terms rapidly become unwieldy with increasing order. 6.2.1 Validity of the Expansion The expansion will not normally be a convergent power series. For (6.2.14) shows that x,(‘) is a Gaussian variable, being simply an Ito integral with nonstochastic coefficients, and hence x,(r) can with finite probability assume a value greater than any fixed value. Thus, only if all power series involved in the derivation of (6.2.5) are convergent for all arguments, no matter how large, can we expect the method to yield a convergent expansion. We can, in fact, show that the expansion is asymptotic by using the results on dependence on a parameter given in Sect.4.3.7. We define a remainder by Vale, 1) = [x(t) — > ex (nye, (6.2.17) where the x,(t) are solutions of the set of stochastic differential equations (6.2.10) with initial conditions (6.2.11). We then derive an equation for y,(t). We can write a{x(t)] = af ex.) + yale, ert] (6.2.18) 6.2 Small Noise Expansions for Stochastic Differential Equations 183 and we define a function @jy1[Xo, Xi, Xa ---Xm Y, e] by GnaalXo, Xi, X, Yee] = et! (a eX, tery] = EeaXn, X,..X)}- (6.2.19) We require that for all fixed Xo, X,, ... Xw, Y, lim Xe XY a) (6.2.20) 0 exists. We similarly define 5,[Xo, X1,--- Xm, Y,2] and impose the same condition on it. This condition is not probabilistic, but expresses required analytic properties of the functions a(x) and (x); it requires, in fact, that the expansions (6.2.4, 8) be merely asymptotic expansions. Now we can write the differential equation for y,(e, f) as Dr = GnsilXo (t)s Xi(t), --- Xa(t)s Yor €] at + Bylxolt), xi(t), «+. Xpa(t)s Yow €] AW(t) (6.2.21) The coefficients of dt and dW(t) are now stochastic functions because the x,(f) are stochastic. 
However, the requirement (6.2.20) is now an almost certain limit, and hence implies the existence of the stochastic limits Stelim dng iLXo(t), XC), --- Xa(t)s Yor €] = Greats Yn) (6.2.22) and st-lim By LxoCt)s a(t), --- Xnalt)s Yas €) = Bult Yn) (6.2.23) which is sufficient to satisfy the result of Sect.4.3.7 on the continuity of solutions of the SDE (6.2.21) with respect to the parameter ¢, provided the appropriate Lip- schitz conditions (ii) and (iii) of Sect.4.3.7 are satisfied. Thus, y,(0, t) exists as a solution of the SDE dy,(0, 1) = Gyaalt, yo(0, O] + Salt, yn(0, tA W(t) (6.2.24) which, from the definition (6.2.17) shows that x(t)— Se’x(t) ~ er". (6.2.25) = Hence, the expansion in power of ¢ is an asymptotic expansion. 6.2.2. Stationary Solutions (Homogeneous Processes) A stationary solution is obtained by letting t — oo. If the process is, as written, homogeneous and ergodic, it does not matter what the initial condition is. In this 184 6. Approximation Methods for Diffusion Processes case, one chooses x0(0) so that a[x9(0)] vanishes and the solution to (6.2.10a) is Xolt) = x00) = a (6.2.26) {where we write simply a instead of a(t)]. Because of the initial condition §, at =the solution (6.2.14) to the equation of order one is not a stationary process. One must either let tf —~ co or set the initial condition not at t = 0, but at f = — oo. Choosing the latter, we have 40 = f B(@) exp [= Kad") (62.27) Similarly, ag) = f Asta + bed) exp [= = 1)K(@)), (6.2.28) where by 43 and bs_, we mean the values of A, and 6,_, obtained by inserting the stationary values of all arguments. From (6.2.28) it is clear that x(t) is, by construc- tion, stationary. Clearly the integrals in (6.2.27, 28) converge only if k(a) > 0, which will mean that only a stable stationary solution of the deterministic process generates a stationary solution by this method. This is rather obvious—the addition of fluctuations to an unstable state derives the system away from that state. 6.2.3. Mean, Variance, and Time Correlation Function If the series expansion in ¢ is valid in some sense, it is useful to know the expansion for mean and variance. Clearly &x(D) = BS erate) (6.2.29) var (x(0)) = Ey ot SB Cenl taal) — Ceml OD Ginan( OD] (6.2.30) Since, however, we assume a deterministic initial condition and x(t) is hence deter- ministic, all terms involving xo(t) vanish. We can then work out that var (x(1)) = evar (xy(t)} + 22(0), xa(0)) + ef[2cxi(0), x00) + var fx} +... 6231) and similarly, (Os 4) = P(e 9) + ET (, 2269) + Ga, (1 + ef xei(t), xa(5)) + a(S), x3(1)) + Cralt), 22(5)>] a. (6.2.32) 6.2 Small Noise Expansions for Stochastic Differential Equations 185 6.2.4 Failure of Small Noise Perturbation Theories a) Example: Cubic Process and Related Behaviour Consider the stochastic differential equation dx = —x'dt + edW(t). (6.2.33) It is not difficult to see that the expansion conditions (6.2.20) are trivially satisfied for the coefficients of both dt and dW(t), and in fact for any finite t, an asymptotic expansion with terms xy(t) given by (6.2.16) is valid. However, at x = 0, it is clear that A (6.2.34) Ee@r| = Ko) x0 and because x = 0 is the stationary solution of the deterministic equation, the per- turbation series for stationary solutions is not likely to converge since the exponen- tial time factors are all constant. 
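The failure can be seen numerically before it is demonstrated analytically. For the cubic process the exact stationary density is proportional to $\exp[-x^4/(2\varepsilon^2)]$, given as (6.2.37) below, so the stationary variance scales as the first power of $\varepsilon$ rather than as $\varepsilon^2$. A quadrature sketch:

```python
import numpy as np

# Stationary variance of dx = -x**3 dt + eps dW, by quadrature from
# p_s(x) ~ exp(-x**4/(2*eps**2)).  The ratio var/eps stays constant while
# var/eps**2 diverges as eps -> 0, so an expansion in powers of eps^2 fails.
x = np.linspace(-4, 4, 40001)
for eps in (0.4, 0.2, 0.1, 0.05):
    w = np.exp(-x**4 / (2 * eps**2))
    var = np.sum(x**2 * w) / np.sum(w)
    print(f"eps={eps}: var/eps = {var/eps:.4f}   var/eps^2 = {var/eps**2:.1f}")
```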
For example, the first-order term in the stationary expansion is, from (6.2.27), x) = f dv’) = Wa) — W—~) (6.2.35) which is infinite with probability one (being a Gaussian variable with infinite variance). The problem is rather obvious. Near x = 0, the motion described by (6.2.33) is simply not able to be approximated by an Ornstein-Uhlenbeck process. For example, the stationary probability distribution, which is the stationary solution of the FPE O,p = 0,(x*p) + he*O2p , (6.2.36) is given by P(x) = Mexp (—x*/2e*) (6.2.37) and the moments are et) = (222)""4 rey a ‘) rly ) (n even) oo =0 (n odd). The lowest-order term of the expansion of the variance is proportional to ¢ to the first power, not e? as in (6.2.31). In this case, we must simply regard the cubic process described by (6.2.33) as a fundamental process. If we introduce the new scaled variables through the de- finitions 186 6. Approximation Methods for Diffusion Processes ok (6.2.39) t=t/e and use dW(t/e) = dW@)[/e, (6.2.40) then the cubic process can be recuced to a parameterless form dy = —ydt + dW(). (6.2.41) Regarding the solution of (6.2.41) as a known quantity, we can write x(t) = /e yet) (6.2.42) so that the limit e—0 is approached like ./¢, and also with a slower time scale. This kind of scaling result is the basis of many critical phenomena. A successful perturbation theory in the case where a(x) behaves like x? near x = 0 must involve firstly the change of variables (6.2.39) then a similar kind of pertur- bation theory to that already outlined—but in which the zero-order solution is the cubic process. Thus, let us assume that we can write a(x) = —x*e(x), ‘ (6.2.43) where c(x) is a smooth function with c(0) # 0. Then, using the transformations (6.2.39,40), we can rewrite the SDE as dy = —yay/ ede + Wy EAH). (6.2.44) If we expand y(t), (ye), b(yx/e) as series in ./¢, we obtain a perturbation theory. If we write wt)= > eny,(t) , (6.2.45) then we get for the first two terms dy = —yie(0)dt + B(0)A W(t) (6.2.46) dn = -y] 380) + REO] a + [ZO] are. (62.47) We see that the equation for y, is in fact that of a time-dependent Ornstein-Uhlen- beck process with stochastic coefficients. Thus, in principle, as long as the cubic orocess is known, the rest is easily computed. In practice, not a great deal is in fact x(t) ~—I ii) x90) =0>x(t)=0 forall iii) xo(t) > O> x(t) 1. Thus, depending on the initial condition, we get two different asymptotic ex- pansions, whose stationary limits represent the fluctuations about the two deter- ministic stationary states. There is no information in these solutions about the possible jump from the branch x =1 to the branch x= — 1, or conversely — at least not in any obvious form. In this sense the asymptotic expansion fails, since it does not give a picture of the overall behaviour of the stationary state. We will see in Chap. 9 that this results because an asymptotic expansion of behaviour characteris- tic of jumps from one branch to the other is typically of the order of magnitude of exp (— I/e?), which approaches zero faster than any power as e —~ 0, and thus is not represented in an expansion in powers of e. 6.3 Small Noise Expansion of the Fokker-Planck Equation As mentioned in Sect. 6.1, a small noise expansion of a Fokker-Planck equation is a singular expansion involving the introduction of scaled variables. Let us consider how this is done. 188 6. 
Approximation Methods for Diffusion Processes We consider the Fokker-Planck equation Op = —4,{A(x)p] + 4e*02B(x)p] - (6.3.1) We assume the solution of the deterministic equation to be a(t) so that dat) = Afa(t)) . (6.3.2) and introduce new variables (y, s) by y= be-all (6.3.3) sat (6.3.4) and A(y,s) = e p(x, t)- (6.3.5) We note that _ 4 9 , 9 =e yt Bs (6.3.6) 9.B(Y, 8) = 0,p(y, 8) = 22 (6.3.7) + so that substituting into (6.3.1) we get, with the help of the equation of motion (6.3.2) for a(t) : 7 ~~ ay 3 (Aleto) + erl — Ale) 4 2 = Ales +5 5 {Bla(s) + ey]p} . (6.3.8) We are now in a position to make an expansion in powers of ¢. We assume that A and B have an expansion in powers of ¢ of the form Alas) + ey] = 33 Asdery" (6.3.9) Bla(s) + ey] =F B@ery" (6.3.10) and expand in powers of e: p= Spe. (63.11) Substituting these expansions into the FPE (6.3.8), we get by equating coefficients Mo Po _ 45) 2 (yp) + + Bs) 22 (6.3.12) 6.3 Small Noise Expansion of the Fokker-Planck Equation 189 op, a P z 1 Oo. 5 Fe = —ZtAown, + Alow*pd + J FalBloo. + Bisdvp) —(63.13) and, in general, Op, | ZL a — Be = — 3) 8 md, mai In] + > a] Sy BrenlOWn] (63.18) mao Only the equation for fa is a FPE and, as mentioned in Sect.6.1, only po is a proba- bility. The first equation in the hierarchy, (6.3.12), is a time-dependent Ornstein- Uhlenbeck process which corresponds exactly to (6.2.13), the first equation in the hierarchy for the stochastic differential equation. Thereafter the correspondence ceases. The boundary conditions on the p, do present technical difficulties since the transformation from x to y is time dependent, and a boundary at a fixed position in the x variable corresponds to a moving boundary in the y variable. Further, a boundary at x = a corresponds to one at [a — a(s)] yea @ (6.3.15) which approaches + co as e — 0. There does not seem to be any known technique of treating such boundaries, except when + 00, so that the y boundary is also at +00 and hence constant. Boundary conditions then assume the same form in the y variable as in the x variable. In the case where the boundaries are at infinity, the result of the transformation (6.3.3) is to change a singular perturbation problem (6.3.1). (in which the limit @—~0 yields an equation of lower order) into an ordinary perturbation problem (6.3.8) in which the coefficients of the equation depend smoothly on, and the limit e—0 is an equation of 2nd order. The validity of the expansion method will depend on the form of the coefficients. 6.3.1 Equations for Moments and Autocorrelation Functions The hierarchy (6.3.14) is not very tractable, but yields a relatively straightforward procedure for computing the moments perturbatively. We assume that the boun- daries are at + co so that we can integrate by parts and discard surface terms. Then we define (x)? = (a + ey), — Ca + ey? = ervar{y}, to order & . (6.3.31) The procedure can clearly be carried on to arbitrarily high order. Of course in a one-variable system, the stationary distribution can be evaluated exactly and the moments found by integration. But in many variable systems this is not always possible, whereas the multivariate extension of this method is always able to be carried out. b) Stationary Autocorrelation Function The autocorrelation function of x is simply related to that of y in a stationary state by , = a? + y(t) ¥(0)>, (6.3.32) and a hierarchy of equations for <»(t))(0)> is easily developed. 
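Before developing that hierarchy, the stationary moment result (6.3.31) can be checked on a concrete one-variable model. The drift $a(x) = -(x + x^3)$ with $b = 1$ is an illustrative choice, not taken from the text; expanding about the stable deterministic point $x = 0$ predicts $\mathrm{var}(x) = \varepsilon^2/2 + O(\varepsilon^4)$, which quadrature of the exact stationary density confirms.

```python
import numpy as np

# dx = -(x + x**3) dt + eps dW has exact stationary density
# p_s(x) ~ exp[-2*(x**2/2 + x**4/4)/eps**2]; lowest order gives var = eps**2/2.
x = np.linspace(-3, 3, 30001)
for eps in (0.5, 0.25, 0.1):
    w = np.exp(-2 * (x**2 / 2 + x**4 / 4) / eps**2)
    var = np.sum(x**2 * w) / np.sum(w)
    print(f"eps={eps}: var/eps^2 = {var/eps**2:.4f}   (lowest order: 0.5)")
```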
Notice that Z vor, = (EF PO—4O yoy + 4n(n — 1)Bla + ev(t)]»(t)""7} vO), (6.3.33) which can be derived by using the FPE (6.3.1) for p(y, t|,¥o, f0) and integrating by parts, or by using Ito’s formula for the corresponding SDE. Using the definition of 4,, 8, in (6.3.9.10) and expanding A and B in a power series, we get Fe HOV, = He [nd aCHOrVO, + ma = ) 5 cyeyet2y(0>,]- (6.3.34) These equations themselves form a hierarchy which can be simnlv solved in a power 1926. Approximation Methods for Diffusion Processes series in e. Normally one is most interested in ¢y(t)y(0),, which can be calculated to order e*, provided one knows 0). (6.3.43) ex(t) The initial condition is (0) = 0"), (6.3.44) ¢3(0) = = 1/12. Hence, we obtain 1/6 [| = ayy + att, (6.3.46) which have the solutions a =4(l+VI—avI—@ aan a= i4(-1+VI-@vI—@. The correlation function is, to 2nd order in ¢ (many terms cancel) BO ee (6.3.48) Notice that the eigenvalues 4, and 4, depend on «?. Any attempt to solve the 194 6. Approximation Methods for Diffusion Processes system (6.3.40) perturbatively would involve expanding exp (A,t) and exp(A,t) in po- wers of e? and would yield terms like t%exp (—2r) which would not be an accurate representation of the long-time behaviour of the autocorrelation function. The x correlation function is . = #e,(0) (6.3.49) and the spectrum S(o) = 2 f dt ec\(1)/2n (6.3.50) (63.51) 6.3.3 Asymptotic Method for Stationary Distributions For an arbitrary Fokker-Planck equation ap = — DAAiap + be P9a,By lop, (63.52) one can generate an asymptotic expansion for the stationary solution by setting paz) = exp [$e] ‘ (63.53) in terms of which we find U5 Adx)og + 4X Bylx)0168,91 + #3 Ale) + 2 Byard + 30,8, (x) = 0. (6.3.54) The first term, which is of order e°, is a Hamilton Jacobi equation. The main sig- nificance of the result is that an asymptotic expansion for ¢(x) can be, in prin- ciple, developed: Hx) = Teal) (6.3.55) where g(x) satisfies 3 A@)ho + 4 51 Bu(*)01H09 40 = 0 (6.3.56) Graham and Tel (6.8, 9] have recently shown how equation (6.3.56) may be solved in the general case. Their main result is that solutions, though continuous, in general have infinitely many discontinuities in their derivatives, except in certain special cases, which are closely related to the situation in which the FPE satisfies potential conditions 6.4 Adiabatic Elimination of Fast Variables 195 6.4 Adiabatic Elimination of Fast Variables It is very often the case that a dynamical system can be described by stochastic equa- tions which have widely different response times and in which the behaviour on a very short time scale is not of interest. The most natural example of this occurs in Brownian motion. Here it is normal to observe only the position of the Brownian particle, but the fundamental equations involve the momentum as well, which is normally unobservable. Thus, Langevin’s equation (1.2.14) can be rewritten as an equation for position and velocity as dx _ : (6.4.1) me = — put SIKBT Ee) - (6.4.2) If we interpret the equations as Ito stochastic differential equations, the method of solution has already been given in Sect.4.4.6. However, it is simpler to integrate (6.4.2) first to give the solution 22) = v6 0) exp (— Brim) + PTB f exp [Bor — rhmigtedr'. (64.3) We now want to consider the situation in which the friction coefficient £ is not small but the mass m is very small. 
Then for times $t$ such that

$t \gg m/\beta \equiv \tau$,   (6.4.4)

the exponential in the first term will be negligible and the lower limit in the integral can be extended to $-\infty$ without significant error. Hence,

$v(t) \simeq \dfrac{\sqrt{2\beta kT}}{m} \displaystyle\int_{-\infty}^{t} \exp[-\beta(t - t')/m]\,\xi(t')\,dt'$.   (6.4.5)

Here $\tau$ will be called the relaxation time, since it determines the time scale of relaxation to (6.4.5). Let us define

$\eta_\tau(t) = \tau^{-1} \displaystyle\int_{-\infty}^{t} \exp[-(t - t')/\tau]\,\xi(t')\,dt'$,   (6.4.6)

which is, from Sect. 4.4.4, a stationary Ornstein-Uhlenbeck process. The correlation function is

$\langle \eta_\tau(t)\,\eta_\tau(t')\rangle = \tfrac{1}{2}\tau^{-1} \exp(-|t - t'|/\tau)$   (6.4.7)

$\to \delta(t - t')$  as  $\tau \to 0$.   (6.4.8)

We see that the limit $\tau \to 0$ corresponds to a white noise limit, in which the correlation function becomes a delta function. Thus, we can write (6.4.1) as

$\dfrac{dx}{dt} = \sqrt{\dfrac{2kT}{\beta}}\,\eta_\tau(t)$   (6.4.9)

and in the limit $\tau \to 0$ this should become

$\dfrac{dx}{dt} = \sqrt{\dfrac{2kT}{\beta}}\,\xi(t)$.   (6.4.10)

An alternative, and much more transparent, way of looking at this is to say that in (6.4.2) the limit $m \to 0$ corresponds to setting the left-hand side equal to zero, so that

$v(t) = \sqrt{\dfrac{2kT}{\beta}}\,\xi(t)$.   (6.4.11)

The reasoning here is very suggestive but completely nonrigorous, and it gives no idea of any systematic approximation method, which should presumably be some asymptotic expansion in a small dimensionless parameter. Furthermore, there does not seem to be any way of implementing such an expansion directly on the stochastic differential equation; at least to the author's knowledge, no one has successfully developed such a scheme.

The Fokker-Planck equation equivalent to (6.4.1, 2) for the distribution function $p(x, v, t)$ is

$\dfrac{\partial p}{\partial t} = -\dfrac{\partial}{\partial x}(v p) + \dfrac{\partial}{\partial v}\left(\dfrac{\beta v}{m}\,p\right) + \dfrac{\beta kT}{m^2}\,\dfrac{\partial^2 p}{\partial v^2}$.   (6.4.12)

We define the position distribution function $\bar p(x, t)$ by

$\bar p(x, t) = \displaystyle\int dv\; p(x, v, t)$.   (6.4.13)

Then we expect that, corresponding to the "reduced" Langevin equation (6.4.10), the FPE for $\bar p(x, t)$ is

$\dfrac{\partial \bar p}{\partial t} = \dfrac{kT}{\beta}\,\dfrac{\partial^2 \bar p}{\partial x^2}$.   (6.4.14)

We seek a way of deriving (6.4.14) from (6.4.12) in some perturbative manner, so that we obtain higher corrections in powers of the appropriate small parameter. More generally, we can consider Brownian motion in a potential, for which the Langevin equations are (Sect. 5.3.6)

$\dfrac{dx}{dt} = v$,   $m\,\dfrac{dv}{dt} = -V'(x) - \beta v + \sqrt{2\beta kT}\,\xi(t)$.   (6.4.15)

The limit of large $\beta$ should result in very rapid relaxation of the second equation to a quasistationary state in which $dv/dt \to 0$. Hence, we assume that for large enough $\beta$,

$v = \beta^{-1}\left[-V'(x) + \sqrt{2\beta kT}\,\xi(t)\right]$   (6.4.16)

and substituting in (6.4.15) we get

$\dfrac{dx}{dt} = -\dfrac{V'(x)}{\beta} + \sqrt{\dfrac{2kT}{\beta}}\,\xi(t)$,   (6.4.17)

corresponding to a FPE for $\bar p(x, t)$ known as the Smoluchowski equation:

$\dfrac{\partial \bar p}{\partial t} = \beta^{-1}\,\dfrac{\partial}{\partial x}\left[V'(x)\,\bar p + kT\,\dfrac{\partial \bar p}{\partial x}\right]$.   (6.4.18)

In this case we have eliminated the fast variable $v$, which is assumed to relax very rapidly to the value given by (6.4.16). This procedure is the prototype of all adiabatic elimination procedures, which have been used as the basis of Haken's slaving principle [6.1]. The basic physical assumption is that large $\beta$ (or, in general, a short relaxation time) forces the variables governed by equations involving large $\beta$ (here, $v$) to relax to a value given by assuming the slow variable (here, $x$) to be constant. Such fast variables are then effectively slaved by the slow variables.

Surprisingly, the problem of a rigorous derivation of the Smoluchowski equation, and an estimation of corrections to it, has only rather recently been solved. The first treatment was by Brinkman [6.2], who estimated the order of magnitude of corrections to (6.4.18) but did not give all the correction terms to lowest order. The first correct solution was by Stratonovich [Ref. 6.3, Chap. 4, Sect. 11.1]. Independently, Wilemski [6.4] and Titulaer [6.5] have also given correct treatments.
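The quality of the Smoluchowski approximation is easily explored numerically. The sketch below integrates the full equations (6.4.15) and the reduced equation (6.4.17) for $V(x) = x^2/2$ with $m = kT = 1$, all values illustrative, and compares a low-order moment at a fixed time; the agreement improves as $\beta$ grows.

```python
import numpy as np

# Kramers (6.4.15) versus Smoluchowski (6.4.17) for V(x) = x**2/2, m = kT = 1.
rng = np.random.default_rng(6)
m, kT, dt, T, npaths = 1.0, 1.0, 1e-4, 2.0, 4000
for beta in (2.0, 10.0, 50.0):
    x = np.full(npaths, 1.0)            # full phase-space dynamics
    v = np.zeros(npaths)
    for _ in range(int(T / dt)):
        dW = rng.normal(0.0, np.sqrt(dt), npaths)
        x, v = x + v * dt, v + (-(x + beta * v) * dt + np.sqrt(2 * beta * kT) * dW) / m
    xs = np.full(npaths, 1.0)           # overdamped (Smoluchowski) dynamics
    for _ in range(int(T / dt)):
        dW = rng.normal(0.0, np.sqrt(dt), npaths)
        xs += -xs / beta * dt + np.sqrt(2 * kT / beta) * dW
    print(f"beta={beta}: <x^2> Kramers {np.mean(x**2):.3f}, Smoluchowski {np.mean(xs**2):.3f}")
```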
In the following sections we will present a systematic and reasonably general theory of the derivation of the Smoluchowski equation and corrections to it, and will then proceed to more general adiabatic elimination problems. The procedure used is an adaptation of projection operator methods, which have been used in statistical physics, quantum optics, and related fields for many years. These methods can be formulated directly in the time domain, but we will find it more convenient to use a Laplace transform method, which was that originally used by Wilemski. The manner of presentation is similar to that of Papanicolaou [6.6], who has given a rigorous basis to its use in some problems. However, the demonstrations used here will be largely formal in character.

6.4.1 Abstract Formulation in Terms of Operators and Projectors

Let us consider the rescaled form of the Fokker-Planck equation (6.4.12), derived as (5.3.97) in Sect. 5.3.6, which we can write in the form

$\dfrac{\partial p}{\partial t} = (\gamma L_1 + L_2)\,p$,   (6.4.19)

where $L_1$ and $L_2$ are differential operators given by

$L_1 = \dfrac{\partial}{\partial u}\left(u + \dfrac{\partial}{\partial u}\right)$   (6.4.20)

$L_2 = -u\,\dfrac{\partial}{\partial y} + U'(y)\,\dfrac{\partial}{\partial u}$.   (6.4.21)

We would like to derive an equation for the distribution function in $y$,

$\bar p(y, t) = \displaystyle\int du\; p(u, y, t)$,   (6.4.22)

which would be valid in the limit where $\gamma$ becomes very large. It is expected that an approximate solution to (6.4.19) would be obtained by multiplying $\bar p(y, t)$ by the stationary distribution of

$\dfrac{\partial p}{\partial t} = L_1 p$,   (6.4.23)

that is, by

$(2\pi)^{-1/2} \exp(-\tfrac{1}{2}u^2)$.   (6.4.24)

The reasoning is that for large $\gamma$ the velocity distribution is very rapidly thermalised or, more crudely, that in (6.4.19) we can neglect $L_2$ compared to $\gamma L_1$, so that the solution is a function of $y$ multiplied by a solution of (6.4.23), which approaches a stationary solution in a time of order $\gamma^{-1}$, which will be very small.

We formalise this by defining a projection operator $P$ by

$(Pf)(u, y) = (2\pi)^{-1/2} \exp(-\tfrac{1}{2}u^2) \displaystyle\int du'\; f(u', y)$,   (6.4.25)

where $f(u, y)$ is an arbitrary function. The reader may easily check that

$P^2 = P$.   (6.4.26)

In terms of the vector space of all functions of $u$ and $y$, $P$ is an operator which projects any vector into the subspace of all vectors which can be written in the form

$g(u, y) = (2\pi)^{-1/2} \exp(-\tfrac{1}{2}u^2)\,\bar g(y)$,   (6.4.27)

where $\bar g(y)$ is an arbitrary function of $y$. However, functions of the form (6.4.27) are all solutions of

$L_1 g = 0$,   (6.4.28)

that is, the space into which $P$ projects is the null space of $L_1$. We may also note that in this case

$P = \lim_{t \to \infty}\, [\exp(L_1 t)]$.   (6.4.29)

To prove this, expand any function of $u$ and $y$ in eigenfunctions $P_n(u)$ of $L_1$ as in Sect. 5.2.5:

$f(u, y) = \sum_n A_n(y)\, P_n(u)$,   (6.4.30)

where

$A_n(y) = \displaystyle\int du\; Q_n(u)\, f(u, y)$.   (6.4.31)

Then

$\lim_{t \to \infty}\, [\exp(L_1 t)\, f(u, y)] = \sum_n A_n(y) \lim_{t \to \infty} \mathrm e^{-nt}\, P_n(u)$   (6.4.32)

$= P_0(u) \displaystyle\int du'\; Q_0(u')\, f(u', y)$,   (6.4.33)

and noting that for this process (Ornstein-Uhlenbeck)

$P_0(u) = (2\pi)^{-1/2} \exp(-\tfrac{1}{2}u^2)$   (6.4.34)

$Q_0(u) = 1$,   (6.4.35)

we see that (6.4.29) follows. In this case, and in all other cases, we also have the essential relation

$P L_2 P = 0$.   (6.4.36)

For, considering $P L_2 P f(u, y)$, we see from the definition (6.4.21) of $L_2$ that $L_2 P f(u, y)$ is proportional to

$u \exp(-\tfrac{1}{2}u^2) \propto P_1(u)$   (6.4.37)

and

$P\, P_1(u) = 0$   (6.4.38)

either by explicit substitution or by noting that $P_1(u)$ is not in the null space of $L_1$.
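These statements are easy to verify in a discretised form, where $P$ becomes a matrix acting on grid values of a function of $u$. In the sketch below (grid range and spacing are arbitrary choices) $P^2 = P$ holds to quadrature accuracy, and $P$ annihilates $u\,\mathrm e^{-u^2/2} \propto P_1(u)$, which is the content of (6.4.36-38).

```python
import numpy as np

# Discretised projector (6.4.25): (P f)(u) = phi0(u) * sum_j f(u_j) * du,
# with phi0 the Gaussian (6.4.24).
u = np.linspace(-8.0, 8.0, 1601)
du = u[1] - u[0]
phi0 = np.exp(-u**2 / 2) / np.sqrt(2 * np.pi)

P = np.outer(phi0, np.ones_like(u)) * du       # matrix form of the projector
print("max |P^2 - P|    =", np.abs(P @ P - P).max())         # ~ 0: idempotent
print("max |P (u*phi0)| =", np.abs(P @ (u * phi0)).max())    # ~ 0: P P_1 = 0
```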
Let us define v= Pp (6.4.39) w=(1—P)p (6.4.40) so that p=vtw and vis in the null space of Z,, while w is not in the null space of L,. We can now note that, from (6.4.29), PL, =L,P =0 (6.4.41) and from the original equation we have 2 PeyL, + Lip = POL, + Li[Pp + (1 — Pip] = PL{I — Pop, [where we have used (6.4.41) and (6.4.36)] so that 2 — Phy. (6.4.42) Similarly, Y= (1 PXyL, + Lap = (1 = PXpLs + Lap + (1 — Ppl = yLi(l = Pyp + = PULA = Pp + — PLP (6.443) and using PL,P = 0, we have ea yLw + (1 — P)Lw + Ly. (6.4.44) 6.4.2. Solution Using Laplace Transform The fundamental equations (6.4.42, 44) can be solved in a number of iterative ways. However, since they are linear equations, a solution in terms of the Laplace trans- form can be very appropriate and readily yields a perturbation expansion. 6.4 Adiabatic Elimination of Fast Variables 201 The Laplace transform of any function of time f(t) is defined by f(s) = s eft) (6.4.45) and it may quite readily be defined for operators and abstract vectors in exactly the same way. Our fundamental equations become, on using few ¢ =sf(s)—f), (6.4.46) a $0(s) = PL,Ws) + 0), (6.4.47) $s) = [pL + (b — P)La}W(s) + Lyi(s) + (0). These equations are linear operator equations, whose formal solution is straight- forward. For simplicity, let us first assume w(0) =0 (6.4.48) which means that the initial distribution is assumed to be of the form PU, y, 0) = (20)? exp (—4u*) p(y, 0) (6.4.49) that is, the initial thermalisation of the velocity u is assumed. Then we have formally Ws) = [s — yL, — (1 — P)LJ“'L,0(5) (6.4.50) and hence, 5 Hs) = PL[s — yL, — (1 — P)L,J'L,0(s) + v0). (6.4.51) We have here, at least formally, the complete solution of the problem. For any finite s, we can take the large y limit to find $5 Ws) = —y"PLL;'L,a(s) + (0). (6.4.52) Notice that Ly! does not always exist. However, we know that PL,i(s) = PL,P p(s) = 0 (6.4.53) from (6.4.36). Hence, Lz0(s) contains no component in the null space of L, and thus L7'L,0(s) exists. In this case of Kramers’ equation, let us now see what (6.4.52) looks like. It is equivalent to the differental equation ov (6.4.54) 202 6. Approximation Methods for Diffusion Processes Now note that a Sout UZ] Polad f a wo,» 1 wo=[- [with P,(u) defined by (6.4.34)]. (6.4.55) For this problem is it useful to bring in the eigenfunctions of the Ornstein-Uh- lenbeck process [Sect.5.2.6c] which for these parameters, jD=k=1 take the form Plu) = (2m)? exp (—}12)Q,(u) with Ou) = (2a!) H, (ul / 2) - Using L,P,(u) = —nP,(u) and the recursion formulae for Hermite polynomials 7 XH, (x) = $Hyyi(x) + 2H 1(x) pena -x2} le 7H,0)] = — eH (2). we see that 7 o. (0) + 5 Pw) Ly = so that, using (6.4.59), tho =[U') + 5] Papo. We now apply L, once more and use the relations (6.4.60, 61) again. We find = a — Expo) = —[VT Pu + Palin| 2 — VT PALOU'Y) apy —-—2ly a PLti'he = — 2luon + Z| perro (6.4.56) (6.4.57) (6.4.58) (6.4.59) (6.4.60) (6.4.61) (6.4.62) (6.4.63) (6.4.64) (6.4.65) 6.4 Adiabatic Elimination of Fast Variables 203 and the equation of motion (6.4.54) takes the form [after cancelling the factor P,(u)] (6.4.66) which is exactly the Smoluchowski equation, derived using the naive elimination given in Sect. 6.4. 6.4.3 Short-Time Behaviour One should note that,the limit y— oo in (6.4.52) implies that s is finite, or roughly speaking y>s. 
(6.4.67) Thus, the solution for the Lapace transform will only be valid provided sy (6.4.68) or for the solution, this will mean that ty (6.4.69) Let us define sy = sy (6.4.70) so that (6.4.51) becomes ys0 = PLils,y — Liy — (1 — P)LJ"Ly0 + v0). (6.4.71) The limit »— co gives ys = y PLAS, — L,)1L,6 + 0). (6.4.72) Using the fact that L,0 is proportional to P,(u) (6.4.62), we see that 510 = ys, + 1I)-PLZ6 + 00). (6.4.73) Changing back to the variable s again and rearranging, we find + 1) "PL + v0) (6.4.74) which is equivalent to a = f dt’ exp be’ — PLio(eyat . (6.4.75) 204 6. Approximation Methods for Diffusion Processes Alternatively, we can rewrite (6.4.74) in the form a [s* — sv(0)] + [50 — 2(0)] = PLZ (6.4.76) which, using (6.4.46) and the same working out as in Sect. 6.4.2, is equivalent to the equation for p: (6.4.77) in which the initial condition @ — ar (0) =0 is implied because fe" = 8 f0) — sf0 -—FO j and no constant term appears in the first bracket in (6.4.76). We may smiilarly rewrite (6.4.75) or integrate (6.4.77) to get + 2% Slvo+z ali fat’ exp bt’ — Ope. . (6.4.78) Equations (6.4.77, 78) demonstrate a non-Markov nature, seen explicitly in (6.4.78) which indicates that the prediction of A(t + Af) requires the knowledge of A(t’) for 0 <1 < +. However, the kernel exp [)(t’ — 1)] is significantly different from zero only for |t’ — t| ~ y7! and on a time scale much longer than this, (6.4.78) is approximated by the Smoluchowski equation (6.4.66). Formally, we achieve this by integrating by parts in (6.4.78): fat’ exp tye’ — 9) pen Ae) _ 7 fexp ine’ — 12a. 6.4.79) Neglecting the last term as being of order y~?, we find the Smoluchowski equation replaced by % Sle “+ AL ee "p0)| (6.4.80) This equation is to lowest order in y equivalent to (6.4.78) for all times, that is, very short ( 0 and u < 0 are distinct. From the stochastic differential equations y= udt . (6.4.81) du = —[U'(y) + yuldt + /2y dW(t) We see that foru > 0, y = ais an exit boundary for u <0, y = ais an entrance boundary. since a particle with u > 0 at y = a must proceed to y > a or be absorbed. Simi- larly, particles to the left of the boundary can never reach y = aif w <0. Conven- tionally, we describe y = a as absorbing or reflecting as follows: i) absorbing barrier: particle absorbed for u > 0, no particles with u <0 >plu,a,t)=0 u>0 (6.4.82) =0 uw 1 does not occur. This is not possible, of course, for an expansion to higher powers of y. b) Application to Brownian Motion For Brownian motion we have already computed the operator A of (6.4.99); it is given by (6.4.65) A=—PLLj'L, =z + 2): (6.4.102) The other operators can be similarly calculated. For example, from (6.4.63, 64), Li'l, = — [pwnd + v7 evn +2] [von + 2}. Multiplying by (1 — P) simply removes the term involving P,, and multiplying by L;! afterwards multiplies by —}. Hence, _ a} LMU = Pylaki'Ls = VT PAO[O') + ay. We now use the Hermite polynomial recursion relations (6.4.60, 61) when multi- plying by Lz: we derive LP) =[- Zu + U0) A] Pa (6.4.103) =- VT VON —Z[VF PW + VF Uo: Finally, multiplying by P annihilates all terms since Po(u) does not occur. Hence, B= PL,L;(1 — P)L,Li'L, =0. (6.4.104) The computation of C and DA follow similarly. One finds that c= F[vw +g] ew +3] (6.4.105) and 6.4 Adiabatic Elimination of Fast Variables 209 aly, aja, a DA= -Z[vm+g)Z[ew +3 (6.4.106) so that Olan 7 a C+DA =Zuwlvw+Z] (6.4.107) and (6.4.101) is equivalent to the differential equation oo on 19 ey zym " ap ‘gl +y ur Urore +2] (6.4.108) with the initial condition tim p(y.) 
= {t= 9°? PaLao + al 1,0). (6.4.109) This alteration of the initial condition is a reflection of a “layer” phenomenon. Equation (6.4.108) is valid for ¢ > y~! and is known as the corrected Smoluchowski equation. The exact solution would take account of the behaviour in the time up tot ~ y" in which terms like exp (—yt) occur. Graphically, the situation is as in Fig. 6.1. the changed initial condition accounts for the effect of the initial layer near t ~ y~! ply,t) Fig. 6.1e Formation of a layer at a boundary. The exact solution (—) changes rapidly near the boundary on the left. The approximation (------) is good except near the boundary. The appropriate boundary condition for the approximation is thus given by the smaller value, where the dashed line meets the boundary (c) Boundary Conditions The higher order implementation of boundary conditions cannot be carried out by the methods of this section, since a rapidly varying layer occurs in the variable x near the boundary, and the assumption that the operator d/dy is bounded becomes unjustified. Significant progress has been made by Titulaer and co-workers 210 6. Approximation Methods for Diffusion Processes [6.10- 12}. Suppose the boundary is at y=0, then we can substitute z= yy, r= 1 into the Kramers equation (6.4.19) to obtain oP é é a 1 oP Sa} fut Sus] Pt — ey — 4. or [z (« 4) 4 | 7 VON a, (64.110) so that the zero order problem is the solution of that part of the equation independent of y: to this order the potential is irrelevant. Only the stationary solution of (6.4.110) has so far been amenable to treatment. It can be shown that the stationary solution to the y + oo limit of (6.4.110) can be written in the form P(u, 2) = wo(u, 2) + dB (u, 2) + D di Un (u, 2) (64.111) el where Vo(u, 2) =(2 x)? exp(— 34?) (6.4.112) Yo (u, 2) =(2 x)? (2 — u) exp(— $4?) (6.4.13) and the y,(u, z) are certain complicated functions related to Hermite polynomials. The problem of determining the coefficients d; is not straightforward, and the reader is referred to [6.12] for a treatment. It is found that the solution has an infinite derivative at z = 0, and for small z is of the form a+ bz! . ‘’ 6.5 White Noise Process as a Limit of Nonwhite Process The relationship between real noise and white noise has been mentioned previously in Sects.1.4.4, 4.1. We are interested in a limit of a differential equation % = as) + Boda), (6.5.1) where a,(t) is a stochastic source with some nonzero correlation time. We will show that if a(t) is a Markov process, then in the limit that it becomes a delta correlated process, the differential equation becomes a Stratonovich stochastic differential equation with the same coefficients, that is, it becomes (S) dx = alx)dt + b(t) (6.5.2) which is equivalent to the Ito equation dx = [a(x) + $b(x)b()]dt + b@)AW(t) . (6.5.3) To achieve the limit of a delta correlated Markov process, we must take the large y limit of a(t) = ya(y*t) , (6.5.4) 6.5 White Noise Process as a Limit of Nonwhite Process 2 where a(z) is a stationary stochastic process with , = g(t) - (6.5.6) Then, (a(t) = 0 Kao(t)ao(0)>. = y*g(7"t) « (6.5.7) In the limit of infinite 7, the correlation function becomes a delta function. More precisely, suppose f sar =1 (6.5.8) and i Itlg(a)dt = 1. (6.5.9) defines the correlation time of the process a(t). [If g(t) is exponential, then t, as defined in (6.5.9) requires that g(t) < exp (—t/t.) which agrees with the usage in Sects.1.4.4, 3.7.1.] 
Then clearly, the correlation time of a9(z) is t./7?, which becomes zero as y —- 00; further, lim ag(t)a(0)>, =0 (#0). (6.5.10) ea But at all stages, [cao(t)ao(0)>,dt = feta _ (6.5.11) so that we can write Tim (ag(t)ae(0)>, = 8(0) « (65.12) Therefore, the limit y — 00 of ag(t) does correspond to a normalised white noise limit. The higher-order correlation functions might be thought to be important too, but this turns out not to be the case. We will give a demonstration in the case where a(r) is a Markov diffusion pro- cess whose Fokker-Planck equation is 2126. Approximation Methods for Diffusion Processes BO _ _ 3 ta(aro(an] + 5 ZB. (65.13) This means that the FPE for the pair (x, a) is 9P AL, + pla + Ls)plx: @) (65.14) with t= —-La@+t5a@ L= — Zu) (6.5.15) L=— 2 a(x). The asymptotic analysis now proceeds similarly to that used in Sects.6.4.1, 6.4.3, with a slight modification to take account of the operator L;. Analogously to Sect. 6.4.1, we define a projector P on the space of functions of x and a by (PA\(x, a) = pa) f da fix, a), (6.5.16) where p,(a) is the solution of ‘ Lip{a) =0. (6.5.17) We assume that in the stationary distribution of a, the mean (a), is zero. This means that the projection operator P satisfies the essential condition PL,P=0. (6.5.18) Since (PL:Pf) (x, a) = pa) § dal - 2 boars] J de'fie, a’) I ~pla), 2 bx) J def a) = 0. (65.19) Also, it is obvious that PL, = LP (6.520) and, as before, PL,=L,P=0. (6.5.21) 6.5 White Noise Process as a Limit of Nonwhite Process rie] Defining, as before v= Pp (6.5.22) w=(1— Pp (6.5.23) and using the symbols 4, # for the corresponding Laplace transforms, we find 5 Os)= POPL, + pL + Ls)p(s) + (0) (6.5.24) = yPL{Pp(s) + (1 — P)p(s)] + LsP p(s) + v0) so that 5 0(s) = yPL,Ws) + Lyi(s) + v(0) (6.5.25) and similarly, 5 Ws) = [y°L, + pl — PL, + L3)W(s) + yLz0(s) + w(0) (6.5.26) which differ from (6.4.47) only by the existence of the L3(s) term in (6.5.25) and Lyi in (6.5.26). We again assume w(0) = 0, which means that a(t) is a stationary Markov process, so that 50(s) = LyO(s) — yPLal—s + °Ly + pl — P)L, + Ls 'yLz0(s) + v(0). (6.5.27) Now the limit »— co gives si(s) = (Ls — PL,Lj"L,)0(s) + v(0) . (6.5.28) We now compute PL,L;1Lys. We write Hs) = p(x)p,(a) (6.5.29) PLaL;' Lyi = pa) § da’ [- = B60" | Le [- 2 50a] paa’)p(x). (6.5.30) We now need to evaluate fda al;ap(a) = —D (6.5.31) and to do this we need a convenient expression for Lz. Consider fexp (Lit')dt’ = Lz? exp (L,t) — Lz! (6.5.32) 2 and using (6.4.29) 214 6. Approximation Methods for Diffusion Processes fexp(L,t) de = —L7(1 — P). a Since, by assumption, Papa) = p,(a) { da'a'p,(a’) = p,(a’), = we have D = [daa exp (Lit)ap,(a) dt We note that exp (L,!)ap.(a) is the solution of the Fokker-Planck equation af=Lf with initial condition f(a, 0) = apa). Hence, | exp (Lyt)ap,(a) = f da‘p(a, t|a’, O)a’p,(@’) and substituting in (6.5.35), we find D = {dt fda da! aa’p(a’, t\a, 0)p,(a) , 3 dt a(t)a(0)) and from (6.5.8) and the symmetry of the correlation function, D=1/2. Using (6.5.42) as the value of D, we find = PhabitLit = + pao) 2[ 010) 3. 060)0(0)| 2 Ox Ox so that the differential equation corresponding to (6.5.28) for (6.5.33) (6.5.34) (6.5.35) (6.5.36) (6.5.37) (6.5.38) (6.5.39) (6.5.40) (6.5.41) (6.5.42) (6.5.43) 6.5 White Noise Process as a Limit of Nonwhite Process 215 (x, 1) = f da p(x, a) (6.5.44) is 2 = 2 anon +z ZomZ ooo. 
(6.5.45) This is, of course, the FPE in the Stratonovich form which corresponds to (S) dx = alx)dt + &(x)dW(t) (6.5.46) or which has the Ito form dx = [alx) + $b'(X)b(@)]dt + BOO) , (6.5.47) as originally asserted. 6.5.1 Generality of the Result A glance at the proof shows that all we needed was for a(t) to form a stationary Markov process with zero mean and with an evolution equation of the form 2@) = Lipla), (6.5.48) where L, is a linear operator. This is possible for any kind of Markov process, in particular, for example, the random telegraph process in which a(t) takes on values +a. In the limit y —- co, the result is still a Fokker-Planck equation. This is a reflection of the central limit theorem. For, the effective Gaussian white noise is made up of the sum of many individual components, as y —- co, and the net result is still effectively Gaussian. In fact, Papanicolaou and Kohler [6.7] have rigoro- usly shown that the result is valid even if a(t) is a non-Markov process, provided it is “strongly mixing” which, loosely speaking, means that all its correlation func- tions decay rapidly for large time differences. 6.5.2. More General Fluctuation Equations Notice that in (6.5.1), instead of defining a,(t) as simply ya(t/y*), we can use the more general form a(t, x) = yx, a(t/)")] (6.5.49) and now consider only B(x) = 1, since all x dependence can be included in y. We assume that fda y(x, apa) = 0 (6.5.50) in analogy to the previous assumption (a). 216 6. Approximation Methods for Diffusion Processes Then D becomes x dependent, and we have to use D(x) = f dt Cvlx, als, a(O]) (65.51) and £3) = far (SE x, tiv, (0) (6.5.52) and the Fokker-Planck equation becomes a = — Frias + B+ ZS at [D@)A] - (6.5.53) In this form we have agreement with the form derived by Stratonovich [Ref. 6.3, Eq.(4.4.39)]. 6.5.3 Time Nonhomogeneous Systems If instead of (6.5.1) we have & = as) + 0G, Dalt), 4 (6.5.54) the Laplace transform method cannot be used simply. We can evade this difficulty by the following trick. Introduce the extra variable t so that the equations become dx = (ax, 1) + yb(x, Dalat (6.5.55) da = yA(a)dt + y/B(a) dW(t) (6.5.56) dc=dt. (6.5.57) The final equation constrains f to be the same as t, but the system now forms a homogeneous Markov process in the variables (x, a, 1). Indeed, any nonhomo- geneous Markov process can be written as a homogeneous Markov process using this trick. The Fokker-Planck equation is now op gp Ph twat Ls (6.5.58) with le 1 =—24@+tZ%@ (6.5.59) 6.5 White Noise Process a Limit of Nonwhite Process 217 y= — 20x, 00 (6.5.60) aaa — — 8 a 9- (6.5.61) Using the same procedure as before, we obtain ae [aa La a P[-F Fal + Zo Ngo] (6.5.62) which yields : de=dt (6.5.63) so that we have, after eliminating t in terms of t, | 2a + phox, n Zoe, ole (6.5.64) in exact analogy to (6.5.45). 6.5.4 Effect of Time Dependence in L, Suppose, in addition, that A and B depend on time as well, so that a 1 @ L, =~ x, Ale) + 7 5B Blwd- (6.5.65) In this case, we find P is a function of r and hence does not commute with Ls. Thus, PL, # LP. (6.5.66) Nevertheless, we can take care of this. Defining 0(s) and #(s) as before, we have 5 H(s) = P(yLz + Ls)W(s) + PLs0(s) + (0) (6.5.67) 5 Ws) = [pL + (1 — PL, + (1 — P)Ls}W(s) + yLr0(s) + (1 — P)L30(5) (6.5.68) so that 5 Hs) = PL;i(s) + P(yL, + Ls)[s — YL, — xl — P)L, — (1 — P)L;J* X [pla + (1 — P)L,J0(s) + v(0) - (6.5.69) We see that because L, is multiplied by y and L, is not, we get in the limit of large 6H) ~ (PT. — PT TAT Vale) 4 0 (6.5.70) 210 0. 
Approximation Methods tor Dittusion Processes In this case we will not assume that we can normalise the autocorrelation function to a constant. The term — PL,L;'L, gives a a 7 an 1) 5, OCs 1) f atiax(tax()> > (6.5.71) where by a(t) we mean the random variable whose FPE is op or 2 (z A(@, t) + 44 Bla, a) . (6.5.72) Thus, the limit y —- oo effectively makes the random motion of a infinitely faster than the motion due to the time dependence of a arising from the time dependence of A and B. Defining D(a) = fdtalja(O)> , (6.5.73) we find, by eliminating z as before, 2 [- Zax, 1) + DDS 16,02 ox, ole . (6.5.74) 6.6 Adiabatic Elimination of Fast Variables: The General Case We now want to consider the general case of two variables x and @ which are coupled together in such a way that each affects the other. This is now a problem analogous to the derivation of the Smoluchowski equation with nonvanishing V'(x), whereas the previous section was a generalization of the same equation with V’(x) set equal to zero. The most general problem of this kind would be so complex and unwieldy as to be incomprehensible. In order to introduce the concepts involved, we will first con- sider an example of a linear chemical system and then develop a generalised theory. 6.6.1 Example: Elimination of Short-Lived Chemical Intermediates We consider the example of a chemically reacting system ? k X= Y==4A (6.6.1) k where X and Y are chemical species whose quantities vary, but A is by some means held fixed. The deterministic rate equations for this system are oe (6.6.2a) 0.0. Aaldpatic Elumination OF rast Variaoles: ine General Case IY d oa —2ytxta. (6.6.2b) Here x, y, a, are the concentrations of X, Y, A. The rate constants have been cho- sen so that k = I, for simplicity. The physical situation is often that Y is a very short-lived intermediate state which can decay to X or A, with time constant y~'. Thus, the limit of large y in which the short-lived intermediate Y becomes even more short lived, and its concentration negligible, is of interest. This results in the situation where we solve (6.6.2b) with dy/dt = 0 so that y=(e+a)/2y, (6.6.3) and substitute this in (6.6.2a) to get dx_x_a G=F-F- (6.6.4) The stochastic analogue of this procedure is complicated by the fact that the white noises to be added to (6.6.2) are correlated, and the stationary distribution of y depends on ». More precisely, the stochastic differential equations corresponding to (6.3.2) are usually chosen to be (Sect.7.6.1) dx = (—x + py)dt + &By,dW,(t) + €By,dW.(t) (6.6.5) dy = (—2yy + x + a)dt + eB, dW,(t) + eBr,dW,(t), where the matrix B satisfies (6.6.6) 2a —2a BBT= | —2a 4a} Here ¢ is a parameter, which is essentially the square root of the inverse volume of the reacting system and is usually small, though we shall not make use of this fact in what follows. We wish to eliminate the variable y, whose mean value would be given by (6.6.3) and becomes vanishingly small in the limit. It is only possible to apply the ideas we have been developing if the variable being eliminated has a distribution function in the stationary state which is independent of y. We will thus have to define a new variable as a function of y and x which possesses this desirable property. The Fokker-Planck equation corresponding to (6.6.5) is a a ar [z OW + gate — 25, ae a + Row -x-a+ retaZ|p. (6.6.7) It seems reasonable to define a new variable z by 220 6. 
Approximation Methods for Diffusion Processes z=ly-—x-a (6.6.8) which is proportional to the difference between y and its stationary value. Thus, we formally define a pair of new variables (x,, z) by =X x=X (6.6.9) z =2yy—x-—a y=@+m + a)/2y so that we can transform the FPE using o_8 8 ax x, ~ 32 a ana ay az to obtain p_ [2 [nse 2) 4 gag? y F (_a62q— aye%a) ar |ax,| 2 2 i eS az a + (2 "5+ 2)+2 + iS (82°y7a + ea + 4yea)|p. (6.6.11) The limit of y —~ co does not yet give a Fokker-Planck operator in z, which is simply proportional to a fixed operator: we see that the drift and diffusion terms for z are proportional to y and »’, respectively. However, the substitution a@azy? (6.6.12) changes this. In terms of a, the drift and diffusion coefficients become proportional to y and we can see (now writing x instead of x,) BP = ty, + 72a) + Llp (66.13) in which a a Ly =2 za + Base (6.6.14) LQ) -2 [- aead — te] are a2 -; Es + [rn 2e- pared sna, # foe teas a (6.6.15) 2 i ) + ead. (6.6.16) 6.6 Adiabatic Elimination of Fast Variables: The General Case 221 Notice that L,(y) has a large y limit given by the first term of the first line of (6.6.15). The only important property of Z, is that PL,P = 0. Defining P as usual Pf(x, @) = pa) J da’ fix, a’) (6.6.17) where p,(q) is the stationary solution for L,, we see that for any operator beginning with 4/@a [such as the y dependent part of L,(y)] we have pose (a) § da’ 2hs a)=0, (6.6.18) provided we can drop boundary terms. Hence, the y dependent part of L,(y) satis- fies PL,P = 0. Further, it is clear from (6.6.14) that (a), = 0, so we find that the first, y independent part, also satisfies this condition. Thus PL))P =0. (6.6.19) Nevertheless, it is worth commenting that the y dependent part of L,(y) contains terms which look more appropriate to L;, that is, terms not involving any x deriva- tives. However, by moving these terms into L,, we arrange for L, to be independent of y. Thus, P is independent of y and the limits are clearer. The procedure is now quite straightforward. Defining, as usual PP(s) = Ws) (6.6.20) (1 — P)p(s) = Ws) and assuming, as usual, (0) = 0, we find 5 (8) = PlypLy + y'?La(y) + Ls] [6(s) + (s)] + (0) (6.6.21) and using PL,=L,P=0 PL,P (6.6.22) PL; =L;P, we obtain 5 O(s) = Py"? La(y)W(s) + Ls0(s) + v(0) (6.6.23) and similarly, 5 Hs) = [pL, + (1 — P)La(y) + La]i(s) + y!'*La(y)0(s) (6.6.24) so that 222 6. Approximation Methods for Diffusion Processes $ (8) = {La + yPLAQ ls — yLy — y'°(1 — PLAY) — Li'L,} Hs) + v0). (6.6.25) Now taking the large y limit, we get sis) = (Ls — PLL 7'L,)a(s) + (0), (6.6.26) where : a a L= lim L3() = z(- aca e — 4a) (6.6.27) Equation (6.6.26) is of exactly the same form as (6.5.28) and indeed the formal derivation from (6.6.7) is almost identical. The evaluation of PL,L;'L, is, however, slightly different because of the existence of terms involving 9/da. Firstly, notice that since P /da = 0, we can write = PLali*av = —psa) § da! (— ta’ 2) 13+ 3 (48a 2, — 4a’) pfa'os) (6.6.28) and from the definition of p,(a) as satisfying L,p,(a) = 0, we see from (6.6.14) that a 2, ' 9q?:(@) = —ap,(a)/4era (6.6.29) and hence that —PLL;' Ly = ne & J da’a' Lz 'a'p,(a’)p(x) (6.6.30) = = Fe GO FarcatatO>, (66.31) where we have used the reasoning given in Sect.6.5 to write the answer in terms of the correlation function. Here, L, is the generator of an Ornstein-Uhlenbeck process (Sect 3.8.4) with k = 2, D = 16¢%a, so that from (3.8.2), —PhilitLv = — 4 F ps(a) Po ra 42a j dte™ (6.6.32) 1 7 9 =-+ (a). 
Hence, from (6.6.26), the effective Fokker-Planck equation is ax Ox Op _ a (6.6.33) 6.6 Adiabatic Elimi tion of Fast Variables: The General Case 223 Comments i) This is exactly the equation expected from the reaction k/2 X=A (6.6.34) A/2 (with k = 1) (Sect.7.5.3). It is expected because general principles tell us that the stationary variance of the concentration fluctuations is given by var (x(0)}. 2x), . (6.6.35) ii) Notice that the net effect of the adiabatic elimination is to reduce the coefficient of 8?/Ax?, a result of the correlation between the noise terms in x and y in the ori- ginal equations. iii) This result differs from the usual adiabatic elimination in that the noise term in the eliminated variable is important. There are cases where this is not so; they will be treated shortly. 6.6.2 Adiabatic Elimination in Haken’s Model Haken has introduced a simple model for demonstrating adiabatic elimination [Ref. 6.1, Sect.7.2]. The deterministic version of the model is a pair of coupled equations which may be written x= —ex — axa (6.6.36) i = —Ka + bx. (6.6.37) One assumes that if x is sufficiently large, we may, as before, replace @ by the stationary solution (6.6.37) in terms of x to obtain ante (6.6.38) #= ex — 2 x. (6.6.39) The essential aim of the model is to obtain the cubic form on the right-hand side of (6.6.39). In making the transition to a stochastic system, we find that there are various possibilities available. The usual condition for the validity of adiabatic elimination is en. (6.6.40) 224 6. Approximation Methods for Diffusion Processes Ina stochastic version, all other parameters come into play as well, and the condi- tion (6.6.40) is, in fact, able to be realised in at least three distinct ways with characteristically different answers. Let us write stochastic versions of (6.6.36, 37): dx = —(ex + axa)dt + C dW\(t) (6.6.41) da = (—Ka + bx*)dt + D dW,(t) and we assume here, for simplicity, that C and D are constants and W,(t) and W(t) are independent of each other. The Fokker-Planck equation is a a oe iz (ex + axa) +4072 + 2 (a — bx) + Dt a De (6.6.42) We wish to eliminate a. It is convenient to define a new variable 8 by _ 7 2 (6.6.43) so that, for fixed x, the quantity 8 has zero mean. In terms of this variable, we can write a FPE: 9 _ gy + 18 : 6.6.44 p= Lt +L + Lp (6.6.44) De R= = Hes Sim (6.6.45) a 2bx a b 13 = Fpaps — 2 Silex + 2 + ax) bx oa bxy , 2BxC? ~¢(Tarapt ap x n) toe ap (6.6.46) a b =[Z (x +2 ») (6.6.47) In terms of these variables, the limit e — 0 is not interesting since we simply get the same system with e = 0. No elimination is possible since Z, is not multiplied by a large parameter. In order for the limit e —- 0 to have the meaning deterministically that (6.6.39) is a valid limiting form, there must exist an A such that aed, as e+0. (6.6.48) 6.6 Adiabatic ination of Fast Variables: The General Case 225 For this limit to be recognisable deterministically, it must not be swamped by noise so one must also have 2 ¢ =eB as e—~0 (6.6.49) which means, as e—- 0 Be\2@+ 4x48 (6.6.50) Belay ax: : However, there are two distinct possibilities for L9. In order for L? to be inde- pendent of e, we musi have x independent of e, which is reasonable. Thus, the limit (6.6.48) must be achieved by the product ab being proportional to ¢. We con- sider various possibilities. a) The Silent Slave: a Proportional to ¢ We assume we can write a=ea. (6.6.51) We see that L9 is independent of ¢ while Z2 and L$ are proportional to ¢. 
If we rescale time by t=e (6.6.52) then ap _ (1 P= (Pu+ +L). (6.6.53) where L,=L L, = Le (6.6.54) L, = Lie. Clearly, the usual elimination procedure gives to lowest order oF = Lip = [Re + Ad) + a3,|6 (6.6.55) since L, does not become infinite as ¢ — 0. This corresponds exactly to eliminating a adiabatically, ignoring the fluctuations in a and simply setting the deterministic value in the x equation. I call it the ‘silent slave’, since (in Haken’s terminology) a is slaved by x and makes no contribution 226 6. Approximation Methods for Diffusion Processes to the noise in the x equation. This is the usual form of slaving, as considered by Haken. b) The Noisy Slave: a Proportional to e!/* If we alternatively assume that both a and b are proportional to é!’?, we can write a= Ge! (6.6.56) b= be, where ab =KA. (6.6.57) L{ stays constant, L$ is proportional to e and L = e!"L, + higher order terms in e, (6.6.58) where 1.=aphx (6.6.59) ae ax* 6. Thus, the limiting equation is % = (Ls — PL,Lj'L,)p : (6.6.60) The term PL,L7'L, can be worked out as previously; we find = Phyl; (6.6.61) so ap _ (a @p Fy, @Dt a E[-{1 — FF |+ 40] + ola+ Se >. (6.6.62) I call this the “noisy slave”, since the slave makes his presence felt in the final equation by adding noise (and affecting the drift, though this appears only in the Ito form as written; as a Stratonovich form, there would be no extra drift). ©) The General Case Because we assume ab cx ¢, it can be seen that the second two terms in (6.6.46) are always proportional to €”, where p > 1, and hence are negligible (provided b is bounded). Thus, the only term of significance in L3 is the first. Then it follows that if 6.6 Adiabatic Eli tion of Fast Variables: The General Case 227 we have the following possibilities: r > }: no effect from Ly: limiting equation is (6.6.55), r = $: limiting equation is (6.6.62)—a noisy slave, r<}: the term PL,L;'L, becomes of order e””-!— oo and is dominant. The equation is asymptotically (for r < }) winD(a a ae OZ oa Fe i) o (6.6.63) These are quite distinct differences, all of which can be incorporated in the one formula, namely, in general 9b ar a ay & By gar 6.6.6 + AM) + SB +e (6.6.64) In applying adiabatic elimination techniques, in general, one simply must take par- ticular care to ensure that the correct dependence on small parameters of all constants in the system has been taken. 6.6.3 Adiabatic Elimination of Fast Variables: A Nonlinear Case We want to consider the general case of two variables x and @ which are coupled together in such a way that each affects the other, though the time scale of a is considerably faster than that of x. Let us consider a system described by a pair of stochastic differential equations: dx = [a(x) + b(x)aldt + c(x)dW(t) (6.6.65) da = y[A(a) — fix)]dt + y/2B(@) dW2(t) . (6.6.66) If we naively follow the reasoning of Sect.6.4, we immediately meet trouble. For in this limit, one would put 4@ — 9) = — [BORO (6667) on the assumption that for large a, (6.6.66) is always such that da/dt = 0. But then to solve (6.6.67) for @ in terms x yields, in general, some complicated nonlinear function of x and dW,(t)/dt whose behaviour is inscrutable. However, if B(a) is zero, then we can define u(x) to be Aluo(x)] = £0) (6.6.68) and substitute in (6.6.65) to obtain 228 6. Approximation Methods for Diffusion Processes dx = [a(x) + b(x)uo(x)]dt + e(x)dW,(t) (6.6.69) We shall devise a somewhat better procedure based on our previous methodology which can also take the effect of fluctuations in (6.6.66) into account. 
The Fokker-Planck equation equivalent to (6.6.65, 66) is B =(%L+L+ Lip, (6.6.70) where a a 11 =3,/@) — A(a)] + 55 Bla) (6.6.71) and L, and L, are chosen with hindsight in order for the requirement PL,P = 0 to be satisfied. Firstly, we choose P, as usual, to be the projector into the null space of L,. We write p,(a) for the stationary solution i.e., the solution of Lipa) = 0 (6.6.72) so that p,(a) explicitly depends on x, because L, explicitly depends on x through the function f(x). The projector P is defined by (PF) (x, a) = p.(a) J da’ F(x, a’) (6.6.73) for any function F(x,a). ‘ We now define the function u(x) as u(x) = f da ap,(a) = Za lelxy | B(x) (6.6.85) so the lowest-order differential equation is FO) — 3 fata) + Ho Ae) + FF coe) (6.6.86) 230 6. Approximation Methods for Diffusion Processes which is equivalent to the stochastic differential equation dx = (a(x) + b(x)u(x)]dt + o(x)dW(t) . (6.6.87) To this order, the equation of motion contains no fluctuating term whose origin is in the equation for a. However, the effect of completely neglecting fluctuations is given by (6.6.69) which is very similar to (6.6.87) but has w(x) instead of u(x). While it is expected that the average value of a in the stationary state would be similar to u(x), it is not the same, and the similarity would only be close when the noise term B(a) was small. Second-Order Corrections It is possible to evaluate (6.6.83) to order py. At first glance, the occurrence of 2nd derivatives in L, would seem to indicate that to this order, 4th derivatives occur since L, occurs twice. However, we can show that the fourth-order terms vanish. Consider the expression P(Lz + Ls)Ly"[L2 + (1 — P)Ls}i(s) « (6.6.88) We know i) PHs) = Hs) (6.6.89) ii) (1 — P)L,Pa(s) = L, Pos) , where we have used PL,P = 0. Thus, (6.6.88) becomes P(L, + Ls)L71(1 — P)(L, + Ls)Pi(s) = P{PL; + [P, Ls] + LsP} x (1 = P)Ly (1 — P){L,P + [La, P] + PLs} Hs) (6.6.90) where the commutator [A, B] is defined by [4, B] = AB— BA. (6.6.91) We have noted that Ly! commutes with (1 — P) and used (1 — P)* = (1 — P)in (6.6.90) to insert another (1 — P) before L;'. We have also inserted another P in front of the whole expression, since P? = P. Using now P(l — P)=(1 — P)P=0, (6.6.90) becomes P{PL, + (P, Lal} Li! (1 — P){La + [Ls, P]} Hs) . (6.6.92) We will now compute [P, L;]: (Ps fax, a) = pla) {= Z tats) + 6(3)u(9 i Gd + FHalecor| J Fes a) de! (6.6.93) 6.0 Adiabatic Elimination of Fast varianies. Le Genial Cae a and (LPyfte, a) = [— 2 ace) + Bue] + FZ recor] pe) f daiftx, 2). (6.6.94) Subtracting these, and defining, ra) = PLO |p.(a) (6.6.95) s(a) = © an 2) Ip fa), (6.6.96) One finds (UP, Esl A) Ge, @) = rs(a)falx) + b(xu(X)]P/C, a) — $s,(a)e(x)P f(x, a) — rap 2 lef, a]. (6.6.97) The last term can be simplified even further since we are only interested in the case where f(x, a) is 2, ie., Sx, a) = p.(a)p(x) - (6.6.98) Then, PE dx)*p.(@0) (6.6.99) = pala) £00} da’p.(o*)6(2) (6.6.100) = p.{a) 2 (x)*A(x) - (6.6.101) We can further show that P[P, L;]) =0. (6.6.102) For since J da pa) =1, (6.6.103) it follows that J da r,(a)p(a) = § da s,(a)p,(a) = 0 (6.6.104) 232 6. Approximation Methods for Diffusion Processes which is sufficient to demonstrate (6.6.102). 
Hence, instead of (6.6.92) we may write (6.6.92) in the form PL Ly" {Lz + (Ls, P}} (5) (6.6.105) and [L;, P]o(s) = —p,(a) {r(a)fa(x) + b)u(x)] — 45,(a)e(x)?} BO) + parse) 2 fels9P00) (66.106) 6.6.4 An Example with Arbitrary Nonlinear Coupling We consider the pair of equations dx = yb(x)adt (6.6.107) da = —yA(x, a, y)dt + y/2B(, a, ») dW(t) and assume the existence of the following limits and asymptotic expansions AC, a 7) ~ 32 Anly ay™ (6.6.108) Bx, D~T BGO These expansions imply that there is an asymptotic stationary distribution of a at fixed x given by P(a, x) = lim Pla, x, 7) (6.6.109) Pala, x) cc Bo(x, a)" exp {f dalAo(x, a)/Bo(x, «)]} - (6.6.10) We assume that A,(x, a) and Bo(x, a) are such that , = J da ap,(a, x, 9) ~ ao(x)y* (6.6.112) where a(x) can be determined from (6.6.108). We define the new variables B =x. L ye) (6.6.113) 6.6 Adiabatic Elimination of Fast Variables: ‘rhe General Case 233 In terms of which the Fokker-Planck operator becomes (the Jacobian is a con- stant, as usual) on changing x, back to x L=—Zacoox) = BE.) + La adobe) Fy + 5 Bac (296) (ao(x) (a(x) +9 [554 (3 thx + pet + Ax) (6.6.114) and, by using the asymptotic expansions (6.6.108), we can write this as L=1;,4+ pL) +L (6.6.15) with a Ly = — 5 a0(x)0(x) (6.6.116) _@ a = 5p 0 y+ op BAB, x) (6.6.117) Ly) = La + OY") (6.6.118) = BZ 66) — FAME) ala) + A629] ABAB, x) ~ al ap HO) + BB, »): (6.6.19) We note that L, and L, do not commute, but, as in Sect. 6.5.4, this does not affect the limiting result, = (Ly — PLL;'La)p . (6.6.120) The evaluation of the PL,L;1L, term is straightforward, but messy. We note that the terms involving 0/08 vanish after being operated on by P. From the explicit form of p,(a, x) one can define G(B, x) by [AG ase + Ath, 2) 2.6, 9} @ (ABB, x) [i Gases) + BB, 9) | WB. I= GE...) 6 01) and one finds that op anl 234 6. Approximation Methods for Diffusion Processes Phrti*tap =(2 mon 2 ww + Zoo) |p (6.6.122) with D(x) = j dt (6.6.123) Ex) = [ atk BU0), GB, D1) where ¢...|x) indicates an average over p,(B, x). This is a rather strong adiabatic elimination result, in which an arbitrary nonlinear elimination can be handled and a finite resulting noise dealt with. The calculation is simpler than that in the pre- vious section, since the terms involving L; are of lower order here than those involving L,. 7. Master Equations and Jump Processes It is very often the case that in systems involving numbers of particles, or individual objects (animals, bacteria, etc) that a description in terms of a jump process can be very plausibly made. In such cases we find, as first mentioned in Sect. 1.1, that in an appropriate limit macroscopic deterministic laws of motion arise, about which the random nature of the process generates a fluctuating part. However the determinis- tic motion and the fluctuations arise directly out of the same description in terms of individual jumps, or transitions. In this respect, a description in terms of a jump. process (and its corresponding master equation) is very satisfactory. In contrast, we could model such a system approximately in terms of stochastic differential equations, in which the deterministic motion and the fluctuations have a completely independent origin. In such a model this independent description of fluctuations and deterministic motion is an embarrassment, and fluctuation dissi- pation arguments are necessary to obtain some information about the fluctuations. In this respect the master equation approach is a much more complete description. 
However the existence of the macroscopic deterministic laws is a very significant result, and we will show in this chapter that there is often a limit in which the solution of a master equation can be approximated asymptotically (in terms of a large parameter Q describing the system size) by a deterministic part (which is the solution of a deterministic differential equation), plus a fluctuating part, describa- ble by a stochastic differential equation, whose coefficients are given by the original master equation. Such asymptotic expansions have already been noted in Sect. 3.8.3, when we dealt with the Poisson process, a very simple jump process, and are dealt with in detail in Sect. 7.2. The result of these expansions is the development of rather simple rules for writing Fokker-Planck equations equivalent (in an asymptotic approximation) to master equations, and in fact it is often in practice quite simple to write down the appropriate approximate Fokker-Planck equation without ever formulating the master equation itself. There are several different ways of formulating the first- order approximate Fokker-Planck equation, all of which are equivalent. However, there is as yet only one way of systematically expanding in powers of Q-', and that is the system size expansion of van Kampen. The chapter concludes with an outline of the Poisson representation, a method devised by the author and co-workers, which, for a class of master equations, sets up a Fokker-Planck equation exactly equivalent to the master equation. In this special case, the system size expansion arises as a small noise expansion of the Poisson representation Fokker-Planck equation. 236 7. Master Equations and Jump Processes 7.1 Birth-Death Master Equations— One Variable The one dimensional prototype of all birth-death systems consists of a population of individuals X in which the number that can occur is called x, which is a non- negative integer. We are led to consider the conditional probability P(x, t|x’, ’) and its corresponding master equation. The concept of birth and death is usually that only a finite number of X are created (born) or destroyed (die) in a given event. The simplest case is when the X are born or die one at a time, with a time independent probability so that the transition probabilities W(x|x’, f) can be written Wx|x', 1) = 1). are + OO) st » 1) Thus there are two processes, X—ox+tl: t*(x) = transition probability per unit time. (7.1.2) x—x—I1: t-(x) = transition probability per unit time. (7.1.3) The general master equation (3.5.5) then takes the form O,P(x, t/t) = 18 — IP — 1, tl, 0) + + P(X + 1, the, 0+) — [t*@) + OD) PC, tI, 1). (7.1.4) There are no general methods of golving this equation, except in the time-inde- pendent situation. 7.1.1 Stationary Solutions We can write the equation for the stationary solution P,(x) as 0=J(x + 1)—JO@) (7.1.5) with IG.) = (x) P(x) — #*(@ — IPA — 1). (7.1.6) We now take note of the fact that x is a non-negative integer; we cannot have a negative number of individuals. This requires () 7° =0: no probability of an individual dying if there are none present; (7.1.7) (ii) P(x, t|x',)=0 forx Wz, 1) = Flexp (—kyt + 2) es = Fie — et] ete so G(s, 1) =F [(s— Ve] exp [(s — Dkralki) « (Ls. Normalisation requires G(1, 1) = 1, and hence FO) =1. (7.1.33) b) Conditional Probability The initial condition determines F; this is (setting 1’ = 0) P(x, O|N, 0) = dy. 
(7.1.34) which means G(s, 0) = 5% = F(s — 1) exp [(s — Ika/ki] (7.1.35) so that kya : os G(s, t) = exp (‘2 (s— Du —e “yt +(s— Ne*yy. (7.1.36) This can now be expanded in a power series in s giving x M (NDI x (Lenin tstrenki | P(x, t|N, 0) = exp|— a en] (7.1.37) 240 7. Master Equations and Jump Processes This very complicated answer is a complete solution to the problem but is of very little practical use. It is better to work either directly from (7.1.36), the generating function, or from the equations for mean values. From the generating function we can compute = 4,G(s = 1, th = ka — et) + Net (7.1.38) is linear, we can apply the methods of Sect.3.7.4, namely, the regression theorem, which states that the stationary autocorrelation function has the same time dependence as the mean, and its value at t = Ois the stationary variance. Hence, ce = kaalky (7.1.46) var{x(t)}, = ka/k, (7.1.47) 1 so that we set x(x — 1) (x — 2) 3, etc. The solu- tion of this equation, with the initial condition x(0) = Xo, is given by Gan) Gow Gow Xo — Xi Xo — Xs = exp [—ki(e — x2) — xa)(xs — xt] « (7.1.60) Here, x;, x2, x3 are roots of kx? — ky Ax + kx — x34 = 0 (7.1.61) with x5 > x > x. Clearly these roots are the stationary values of the solutions x(t) of (7.1.59). From (7.1.59) we see that dx x <0 (7.1.62) a. xy >x>m ae dx x> Xx; =F <0: Thus, in the region x < x2, x(t) will be attracted to x, and in the region x > x2, x(t) will be attracted to x3. The solution x(t) = x, will be unstable to small pertur- bations. This yields a system with two deterministically stable stationary states. a) Stochastic Stationary Solution From (7.1.13) (7.1.63) where 7.1 Birth-Death Master Equations —One Variable 243 B= kAlk R=kjk, (7.1.64) P=kj|k,. Notice that if P = R, the solution (7.1.63) is Poissonian with mean B. In this case, we have a stationary state in which reactions (7.1.56, 57) are simultaneously in balance. This is chemical equilibrium, in which, as we will show later, there is always a Poissonian solution (Sects. 7.5.1 and 7.7b). The maxima of (7.1.63) occur, according to (7.1.21), when B= x(x — 1) & — 2) + RIP + x(x — 1). (7.1.65) The function x = x(B), found by inverting (7.1.65), gives the maxima (or minima) corresponding to that value of B for a given P and R. There are the two asymptotic forms: x(B) ~ B large B (7.1.66) x(B) ~~ PB/R small B If R > 9P, we can show that the slope of x(B) becomes negative for some range of x > Oand thus we get three solutions for a given B, as shown in Fig. 7.1. The transition from one straight line to the other gives the kink that can be seen. Notice also that for the choice of parameters shown, the bimodal shape is signi- ficant over a very small range of B. This range is very much narrower than the range over which P(x) is two peaked, since the ratio of the heights of the peaks can be very high. Fig. 7.1. Plot of x(B) against B, as given by the solution of (7.1.65) for various values of R/P, and cco) oun P = 10,000 _ Fo taser LQuauUuls anu JULIP FLUcesses A more precise result can be given. Suppose the volume V of the system be- comes very large and the concentration y of X given by yar, is constant. Clearly the transition probabilities must scale like V, since the rate of production of X will scale like x = yV. Hence, k,A~1/V kA~V (7.1.67) k,~ 1/V? kye~ which means that B~v R~v* (7.1.68) (ge fe We then write B= BV ' R= Rv? P= by? so that (7.1.65) becomes B= yQ? + RYU? + P). And if y, and y, are two values of y, wag log [P.(y.)/POn)] = 24, (log BV + log (2? + PY?) Si — log [2(@? 
+ RV?)]} (7.1.69) and we now approximate by an integral ~V hay [toe Fee) | : yO? + R) Hence, Piya) _ Bi? + Py PA) exp [Vf tog (oe + R). ] (7.1.70) Ja piUrLea maser CyUsLONs—Une varlaore “a> and as V — co, depending on the sign of the integral, this ratio becomes either zero or infinity. Thus, in a large volume limit, the two peaks, unless precisely equal in height, become increasingly unequal and only one survives. The variance of the distribution can be obtained by a simple trick. Notice from (7.1.63) that P,(x) can be written P(x) = B°G() , (7.1.71) where G(x) is a function defined through (7.1.63). Then, oxy [& #B*G(x)| [= paw)" | and : (1.1.72) BE ah = Gh) — ur | so that var{x} = Bow : (7.1.73) From this we note that as V — 00, var(y} ~+—0. (7.1.74) So a deterministic limit is approached. Further, notice that if = var{x(B)} = B large B axt (x) P(x) . (7.2.18) In this form, we see that we have a possible tool for simulating a diffusion process by an approximating birth-death process. The method fails if B(x) = 0 anywhere in the range of x, since this leads to negative Ws(x'|x). _ Notice that the stationary solution of the Master equation in this case is — pe iy (4G =O) + BE = 9 Pe = Of ae = —9A(0) + BQO)] # [1 + 5AC)/B(ey ~ | 5A(x) + a I li = BAC EO 0 (7.2.19) 7.2. Approximation of Master Equations by Fokker-Planck Equations 249 so that, for small enough 6 log P,(x) —~ const — log B(x) + 3326 A(z)/B(z) , (7.2.20) a ie, P, (x) = exp [2 j dz A(z)/B(2)) (7.2.21) im as required. The limit is clearly uniform in any finite interval of x provided A(x)/B(x) is bounded there. 7.2.2 The Kramers-Moyal Expansion A simple but nonrigorous derivation was given by Kramers [7.3] and considerably improved by Moyal [7.4]. It was implicitly used by Einstein [7.5] as explained in Sect. 1.2.1. In the Master equation (7.2.7), we substitute x’ by defining y=x—x' — in the first term, and pox —-x in the second term. Defining ty, x) = Wx + ylx), (7.2.22) the master equation becomes SFO) = J dy [t(y, x — y)P(x — y) — Hy, PQ) - (7.2.23) We now expand in power series, =f ays + 7 & UO, P(X] (7.2.24) = Ere Fe (aPC, (7.2.28) where a(x) = f dx'(x! — x)" W(x" |x) = J dy y" t(y, x) - (7.2.26) By terminating the series (7.2.25) at the second term, we obtain the Fokker-Planck equation (7.2.8). In introducing the system size expansion, van Kampen criticised this “proof”, because there is no consideration of what small parameter is being considered. 250 7. Master Equations and Jump Processes Nevertheless, this procedure enjoyed wide popularity—mainly because of the convenience and simplicity of the result. However, the demonstration in Sect.7.2.1 shows that there are limits to its validity. Indeed, if we assume that W(x’ |x) has the form(7.2.1), we find that a(x) = 8°" f dy y"B(y, x). (7.2.27) So that as 6-0, terms higher than the second in the expansion (7.2.25) (the Kramers-Moyal expansion) do vanish. And indeed in his presentation, Moyal [7.4] did require conditions equivalent to (7.2.4, 5). 7.2.3 Van Kampen’s System Size Expansion [7.2] Birth-death master equations provide good examples of cases where the Kramers- Moyal expansion fails, the simplest being the Poisson process mentioned in Sect. 721. In all of these, the size of the jump is +1 or some small integer, whereas typical sizes of the variable may be large, e.g., the number of molecules or the position of the random walker on a long lattice. 
In such cases, we can introduce a system size parameter Q such that the transi- tion probabilities can be written in terms of the intensive variables x/Q etc. For example, in the reaction of Sect.7.1.3, @ was the volume V and x/Q the concentra- tion. Let us use van Kampen’s notation: = extensive variable (number fof molecules, etc oc 2) a * = a/Q intensive variable (concentration of molecules). The limit of interest is large @ at fixed x. This corresponds to the approach to a macroscopic system. We can rewrite the transition probability as Wala’) = Wa’; Aa) Aa=a—d. (7.2.28) The essential point is that the size of the jump is expressed in terms of the extensive quantity Aa, but the dependence on a’ is better expressed in terms of the intensive variable x. Thus, we assume that we can write Waa’; Aa) = Qy (5 a) (7.2.29) If this is the case, we can now make an expansion. We choose a new variable z so that a= O¢(t) + 2"2, (7.2.30) where g(t) is a function to be determined. It will now be the case that the a,(a) are proportional to 2: we will write 7.2. Approximation of Master Equations by Fokker-Planck Equations 251 @,(a) = 24,(x) . (7.2.31) We now take the Kramers-Moyal expansion (7.2.25) and change the variable to get aP(z, 1yOP(2t) _ ey QIN? 5 agit 5 ae x (- Fal “alot + Q7"2z]P(z, t). (7.2.32) The terms of order 2" on either side will cancel if 4(t) obeys 81) = HIG) « (7.2.33) which is the deterministic equation expected. We expand @,[¢(t) + Q-'/2z] in powers of Q-"?, rearrange and find aP@,t)_ 2 Q-9-P2 w a mm! ke mt atm-mtgit)) (-2)'-re. 1). (12.34) — ni" Taking the large @ limit, only the m = 2 term survives giving : PED — angen 2 2 P00) +S adgiol SP 0. (7238) a) Comparison with Kramers-Moyal Result The Kramers-Moyal Fokker-Planck equation, obtained by terminating (7.2.25) after two terms, is oan a Flaca (a) + + Fiero (7.2.36) and changing variables to x = a/@, we get oF) =— 2 taeyPoo) + » & [a(x)P(2)] - (7.2.37) We can now use the small noise theory of Sect. 6.3, with (7.2.38) and we find that substituting z= 2x — gO], (7.2.39) the lowest-order FPE for z is exactly the same as the lowest-order term in van 252 7, Master Equations and Jump Processes Kampen’s method (7.2.35). This means that if we are only interested in the lowest order, we may use the Kramers-Moyal Fokker-Planck equation which may be easier to handle than van Kampen’s method. The results will differ, but to lowest order in Q-'? will agree, and each will only be valid to this order. Thus, if a FPE has been obtained from a Master equation, its validity depends on the kind of limiting process used to derive it. If it has been derived in a limit 6—0 of the kind used in Sect.7.2.1, then it can be taken seriously and the full nonlinear dependence of a,(a) and @,(a) on a can be exploited. On the other hand, if it arises as the result of an Q expansion like that in Sect. 7.2.3, only the small noise approximation has any validity. There is no point in considering anything more than the linearisation, (7.2.35), about the deterministic solution. The solution of this equation is given in terms of the corresponding stochastic differential equation dz = a[g(le dt + Sagi) dW) . (7.2.40) by the results of Sect. 4.4.7 (4.4.69), or Sect. 4.4.9 (4.4.83). b) Example: Chemical Reaction X= A From Sect. 7.1.2, we have Wx|x') = bx xryikad + 5y.1akix’ . (7.2.41) The assumption is a=aV . (7.2.42) X=XV, where V is the volume of the system. 
This means that we assume the total amounts of A and X to be proportional to V (a reasonable assumption) and that the rates of production and decay of X are proportional to a and x, respectively. Thus, W(xg; Ax) = VU keadoS4x1 + kiX0540-1) (7.2.43) which is in the form of (7.2.29), with @ — V, a— x, etc. Thus, a(x) = D(x! — WO! |x) = kya — kx = Vkaag — 1X0) a(x) = SX! — xP W(x" |x) = ka + yx = Vado + kx) - (7.2.44) The deterministic equation is $'() = Ueaay — kg(0)) (7.2.45) whose solutions is Bt) = ger + bea em), (7.2.46) 7.2. Approximation of Master Equations by Fokker-Planck Equations 253 The Fokker-Planck equation is PFE) 4, 2 200) + 42 thaae + hig IPC) (7247 From (4.4.84, 85) we can compute that K2(t)> = z(O)e“*" (7.2.48) Usually, one would assume z(0) = 0, since the initial condition can be fully dealt with by the initial condition on ¢. Assuming 2(0) is zero, we find var {eto} = [22+ gcoyer* |r = em (7.2.49) so that aly = ALE) = VACOVerHe + HCI — eh (1.2.50) var (x(2)} = ¥ var (a(n) = [822 + rgojer"]al — ey. (72.51) With the identificationVg(0) = N, these are exactly the same as the exact solutions (7.1.38-40) in Sect. 7.1.2. The stationary solution of (7.2.47) is Pe) = Hexp (- iS) (7.2.52) which is Gaussian approximation to the exact Poissonian. The stationary solution of the Kramers-Moyal equation is ae a(x’) Poe) = Zev [fe] = A (kega + kx) tae, (7.2.53) In fact, one can explicitly check the limit by setting x = V(kyaolk) +6 (7.2.54) so that (7.2.53) = W (2V kay + ky5)-1*4Y#2800ig-2¥A200/41- 28 | (7.2.55) Then, log P,(x) = const — (6— &). (7.2.56) a 254 7. Master Equations and Jump Processes Using the exact Poissonian solution, making the same substitution and using Stirling’s formula log x! ~ (x + $) log x — x + const, (7.2.57) one finds the same result as (7.2.56), but the exact results are different, in the sense that even the ratio of the logarithms is different. The term linear in d is, in fact, of lower order in V: because using (7.2.39), we find 5 = 2/V and _ (4 7 log P,(z) ~ const ia wad >) (7.2.58) so that in the large V limit, we have a simple Gaussian with zero mean. ©) Moment Hierarchy From the expansion (7.2.34), we can develop equations for the moments ket) = f dz Plz, te (7.2.59) by direct substitution and integration by parts: oe oe ik! # O=d z mt alee Ne yi BP MUP 2"# 2). (7.2.60) One can develop a hierarchy by expanding Q'/? does t become a significant size, and thus it is only for very long times f that any significant time development of the system takes place. Thus, the motion of the system becomes very slow at large Q. The condition (7.2.68) is normally controllable by some external parameter, (say, for example, the temperature), and the point in the parameter space where (7.2.68) is satisfied is called a critical point. This property of very slow time development at a critical point is known as critical slowing down. 7.3, Boundary Conditions for Birth-Death Processes For birth-death processes, we have a rather simple way of implementing boundary conditions. For a process confined within an interval (a, 5], it is clear that reflecting and absorbing boundary conditions are obtained by forbidding the exit from the interval or the return to it, respectively. Namely, Reflecting Absorbing Boundary at a t-(a)=0 ta@—1)=0 (73.1) Boundary at b 1*(6) =0 (b+ 1)=0. 
It is sometimes useful, however, to insert boundaries in a process and, rather than set certain transition probabilities equal to zero, impose boundary conditions similar to those used for Fokker-Planck equations (Sect.5.2.1) so that the resulting solution in the interval [a, 5] is a solution of the Master equation witH the ap- propriate vanishing transition probabilities. This may be desired in order to pre- serve the particular analytic form of the transition probabilities, which may have a certain convenience. a) Forward Master Equation We can write the forward Master equation as P(x, t]x', 1!) = (x — DPQ — Atl) +O + DPOF Lele) = tO) + -OOIPCs, t12', 1"). (7.3.2) 258 7. Master Equations and Jump Processes Suppose we want a reflecting barrier at x = a. Then this could be obtained by re- quiring t(a)=0 and P(a—1,t|x’,t)=0. (7.3.3) The only equation affected by this requirement is that for 0,P(a, tx’, t’) for which the same equation can be obtained by not setting ¢-(a) = 0 but instead introducing a fictitious P(a — 1, t|x’, t') such that tt(a— 1)P(a— 1, tx’, t) = @PCa, txt’). (7.3.4) This can be viewed as the analogue of the zero current requirement for a reflecting barrier in a Fokker-Planck equation. If we want an absorbing barrier at x = a, we can set r@—)= (7.3.5) After reaching the point a — 1, the process never returns and its behaviour is now of no interest. The only equation affected by this is that for 0,P(a, t|x’, t’) and the same equation can be again obtained by introducing a fictitious P(a — 1, t|x’, t’) such that ’ Pla—1,t\|x,’)=0. (73.6) Summarising, we have the alternative formulation of imposed boundary condi- tions which yield the same effect in {a, 6] as (7.3.1): Foward Master Equation on interval [a, b] Reflecting Absorbing (7.3.7) Boundary at a | #(a)P(a) = t*(a— 1)P@— 1) P(a@—1)=0 Boundary at b | 1*(6)P(b) = "(6+ I)P(/ +1) P(b+1)=0 b) Backward Master Equation The backward Master equation is (see Sect. 3.6) Bu P(x, t1x’, t!) = P(e t]x' + 1,1) — PQs tl’, 09] + PC, t|x’ — 1, 1) — PQ, t]x', 2). (7.3.8) In the case of a reflecting barrier, at x = a, it is clear that t-(a) = 0 is equivalent to constructing a fictitious P(x, t|a — 1, t’) such that Ple tha — 1 t\ = Ply tla ey (7720 7.4 Mean First Passage Times 259 In the absorbing barrier case, none of the equations for P(x, t|x’, t’) with xx" & [a, b] involve t*(a — 1). However, because 1*(a — 1) = 0, the equations in which x’ < a — 1 will clearly preserve the condition P(x, t1x’, t’) 0, xelab, x x3) ~ PED 262 7. Master Equations and Jump Processes m = 34 Pe) (7.4.19) a and is the total probability of being in the lower peak of the stationary distribution. The result is a discrete analogue of those obtained in Sect. 5.2.7c. 7.5 Birth-Death Systems with Many Variables There is a very wide class of systems whose time development can be considered as the result of individual encounters between members of some population. These include, for example, —chemical reactions, which arise by transformations of molecules on collision; —population systems, which die, give birth, mate and consume each other; —systems of epidemics, in which diseases are transmitted from individual to individual by contact. All of these can usually be modelled by what I call combinatorial kinetics, in which the transition probability for a certain transformation consequent on that en- counter is proportional to the number of possible encounters of that type. 
For example, in a chemical reaction X =: 2Y, the reaction ¥ — 2Y occurs by spontaneous decay, a degenerate kind of encounter, involving only one individual. The number of encounters of this kind is the number of X; hence, we say t(x—x—Lyryt 2) = ky. (7.5.1) For the reverse reaction, one can assemble pairs of molecules of ¥ in p(y — 1)/2 different ways. Hence (x—+x+Lysy—2)=bhyy— 1). (7.5.2) In general, we can consider encounters of many kinds between molecules, species, etc., of many kinds. Using the language of chemical reactions, we have the general formulation as follows. Consider an n-component reacting system involving s different reactions: . SINAN, = MAK, (A =1,2,04.5). (1.5.3) ky The coefficient N4 of X, is the number of molecules of X, involved on the left and M¢ is the number involved on the right. We introduce a vector notation so that if x, is the number of molecules of X,, then B= (iy Xay 05 Xn) N4 = (NG, NY, - ND (7.5.4) M*= (Mt, Mj, ..., Ma) and we also define 7.5 Birth-Death Systems with Many Variables 263 r= M4A_ NA, (7.5.5) Clearly, as reaction A proceeds one step in the forward direction, xoxtrt (7.5.6) and in the backward direction, xo+x—rt. (15.7) The rate constants are defined by , 14@) = ki Te Gs (7.5.8) OG) = ka which are proportional, respectively, to the number of ways of choosing the com- bination N¢ or Mé from x molecules. The Master equation is thus O,P(x, 1) = Sifltae + e)P(e + 04,0) — 132) P, OD] + [the — P(e — 4,1) — 132) P(x, 1))}- (7.5.9) This form is, of course, a completely general way of writing a time-homogeneous Master equation for an integer variable x in which steps of size r4 can occur. It is only by making the special choice (7.5.8) for the transition probabilities per unit time that the general combinatorial Master equation arises. Another name is the chemical Master equation, since such equations are particularly adapted to chemical reactions. 7.5.1 Stationary Solutions when Detailed Balance Holds In general, there is no explicit way of writing the stationary solution in a practical form. However, if detailed balance is satisfied, the stationary solution is easily derived. The variable x, being simply a vector of numbers, can only be an even variable, hence, detailed balance must take the form (from Sect. 5.3.5) 13x + PAP Cx + 14) = ti(x)Px) (7.5.10) for all A. The requirement that this holds for all A puts quite stringent requirements on the £4. This arises from the fact that (7.5.10) provides a way of calculating P,(xo + nr4) for any n and any initial x5. Using this method for all available 4, we can generate P,(x) on the space of all x which can be written r= x9+ Dy; (ng integral) (7.5.11) but the solutions obtained may be ambiguous since, for example, from (7.5.10) we may write 264 7. Master Equations and Jump Processes Pl +r) + © tae Hr) tale + t+ P(x) (x) te + 4) | but (7.5.12) P(x) tx) 14x + 1?) | r—~—_Cs;s qe re Using the combinatorial forms (7.5.8) and substituting in (7.5.12), we find that this condition is automatically satisfied. The condition becomes nontrivial when the same two points can be connected to each other in two essentially different ways, i-e., if, for example, N4 x N® M4 MP (7.5.13) but r4= In this case, uniqueness of P,(x + r4) in (7.5.10) requires tix) __1il2) Ae +e +H G5.) and this means 4 ke (7.5.15) If there are two chains A, B,C,..., and A’, B’, C’,..., of reactions such that repre eh Se be EH. (7.5.16) Direct substitution shows that Px tort tir? 4 ro $0.) = Pe tr! tr! 4 r+.) 
(7.5.17) only if (7.5.18) which is, therefore, the condition for detailed balance in a Master equation with combinatorial kinetics. A solution for P,(x) in this case is a multivariate Poisson P(x) = ee (7.5.19) 2 Xa! 7.5 Birth-Death Systems with Many Variables 265 which we check by substituting into (7.5.10) which gives ae ta : (7.5.20) Using the fact that ri= Mi—Né, we find that ki TLaN# = kG Tah. (7.5.21) However, the most general solution will have this form only subject to conser- vation laws of various kinds. For example, in the reaction X=——2Y, (7.5.22) the quantity 2x + y is conserved. Thus, the stationary distribution is oe Pe ore ee (7.5.23) where ¢ is an arbitrary function. Choosing ¢(2x + y) = 1 gives the Poissonian solution. Another choice is $x + y) = (2x + y, N) (7.5.24) which corresponds to fixing the total of 2x + y at N. As a degenerate form of this, one sometimes considers a reaction written as A=2Y (7.5.25) in which, however, A is considered a fixed, deterministic number and the possible reactions are A—-2Y: ty) = ka (7.5.26) 2¥+A 10) =kyy— 1). In this case, the conservation law is now simply that y is always even, or always odd. The stationary solution is of the form PAY) = Svs @) (7.5.27) where y(y, a) is a function which depends on y only through the evenness or oddness of y. 266 7. Master Equations and Jump Processes 7.5.2. Stationary Solutions Without Detailed Balance (Kirchoff’s Solution) There is a method which, in principle, determines stationary solutions in general, though it does not seem to have found great practical use. The interested reader is referred to Haken [Ref. 7.8, Sect.4.8] and Schnakenberg [7.4] for a detailed treat- ment. In general, however, approximation methods have more to offer. 7.5.3. System Size Expansion and Related Expansions In general we find that in chemical Master equations a system size expansion does exist. The rate of production or absorption is expected to be proportional to @, the size of the system. This means that as 2 — oo, we expect x~Qp, (7.5.28) where p is the set of chemical concentrations. Thus, we must have f(x) proportional to Q as Q — 00, so that this requires key ~ 1 Q ENE | (7.5.29) ky~wgQ “pant Under these circumstances, a multivariate form of van Kampen’s system size expan- sion can be developed. This is so complicated that it will not be explicitly derived here, but as in the single variable case, we have a Kramers-Moyal expansion whose first two terms give a diffusion process whose asymptotic form is the same as that arising from a system size expansion. The Kramers-Moyal expansion from (7.5.9) can be derived in exactly the same way as in Sect.7.2.2, in fact, rather more easily, since (7.5.9) is already in the ap- propriate form. Thus, we have ape.) => {AP parce, n+ AY raeppee, 0) (7.5.30) and we now truncate this to second order to obtain O,PCx, 1) = — Zi Bal Aa(e)PCH, )) + $7 0.0s[Basle)P(x, 1], (7.5.31) A(x) = Yiratta(e) — 12@)) 7 | (7.5.32) Bale) = Trdrilea(e) + ta] - In this form we have the chemical Fokker-Planck equation corresponding to the ~ Hawever wa nate that this is really only valid as an approxima- 7.6 Some Examples 267 tion, whose large volume asymptotic expansion is identical to order 1/Q with that of the corresponding Master equation. If one is satisfied with this degree of approximation, it is often simpler to use the Fokker-Planck equation than the Master equation. 7.6 Some Examples 761 X+A==2X Here, t*(x) = kyax (7.6.1) (x) = kaxtx— 1). 
Hence, A(x) = kyax — kax(x — 1) ~ kyax — knx* to order 1/Q (7.6.2) B(x) = kyax + kax(x — 1) ~ kyax + k,x? to order 1/2. yk 76.2 X= Y=>A4 k y Here we have ioe eI r=(,-1 (7.6.3) ty(x) = kx 1) = P=(@,1). (7.6.4) 13%) = Hence, A(x) = [_ io» —kxy+ [i | (ka — yy) (7.6.5) -| ie | (7.6.6) kx + ka — 2yy BU) = {ile a, | Gy + kx) + (i) 0,0] a+») (76.1) -[_ w+kx —yy — kx | (7.6.8) yy —kx — Qyy + kx + kal” 268 7. Master Equations and Jump Processes If we now use the linearised form about the stationary state, w=kx=ka (7.6.9) B= (7.6.10) 2ka “i —2ka — 4ka] 7.6.3. Prey-Predator System The prey-predator system of Sect. 1.3 provides a good example of the kind of system in which we are interested. As a chemical reaction we can write it as i) X4A—2x fF ii) ¥+ Y—2¥ iii) YB (1,0) (-1,) (7.6.11) @—-. The reactions are all irreversible (though reversibility may be introduced) so we have t7(x) =0 (A = 1, 2, 3) but tt(x) = hae x! _ °&=—DIO-D HQ) = kaxy (7.6.12) The Master equation can now be explicitly written out using (7.5.9): one obtains 8,P(x, y) = kya(x — 1)P(x — 1, ») + kale + DY — DP@ + Ly — 1) +k(y + 1) P(x, y + 1) — (kyax + kaxy + kay) Px, y). (7.6.13) There are no exact solutions of this equation, so approximation methods must be used. Kramers-Moyal. From (7.5.32) 1 -1 0 A(x) = [| bax +| ike +| tise (7.6.14) (7.6.15) [ier . io] koxy — kay |’ 7.6 Some Examples 269 B(x) = (I (1, O)kyax + C1] (FL, Diaxy + [tle —Dhay (7.6.16) _ [her +kaxy = —kyxy i (7.6.17) —kxy kaxy + kay, The deterministic equations are Kyax — k, ai] = ed ”). (7.6.18) ly) Lexy — kay Stationary State at x) [halk [ ] ~ (re : (7.6.19) To determine the stability of this state, we check the stability of the iinearised deterministic equation d (6x) _ aA(x,) ate) ral |- bx + 24) gy oy. on kya — kay, lex, = 6: 6 7.6.20) tn. as ah . wl : ae 0 —k, -[ i [a (7.6.21) kya 0 | (dy. The eigenvalues of the matrix are 2= + i(kykya)"? (7.6.22) which indicates a periodic motion of any small deviation from the stationary state. We thus have neutral stability, since the disturbance neither grows nor decays. This is related to the existence of a conserved quantity V =k(x + y) — ks log x — kya log y (7.6.23) which can readily be checked to satisfy dV/dt = 0. Thus, the system conserves V and this means that there are different circular trajectories of constant V. Writing again x= x, + 6x (7.6.24) yay, t+ dy and expanding to second order, we see that 270 7. Master Equations and Jump Processes _ KR (dx? | by? v=3G aaa) (7.6.25) so that the orbits are initially elliptical (this can also be deduced from the linearised analysis). As the orbits become larger, they become less elliptic and eventually either x or y may become zero. If x is the first to become zero (all the prey have been eaten), one sees that y inevitably proceeds to zero as well. If y becomes zero (all predators have starved to death), the prey grow unchecked with exponential growth. Stochastic Behaviour. Because of the conservation of the quantity V, the orbits have neutral stability which means that when the fluctuations are included, the system. will tend to change the size of the orbit with time. We can see this directly from the equivalent stochastic differential equations dx kyax — kyxy dW(t) = dt + (x,y) , (7.6.26) dy) Ukaxy — key dwt) where Cs EC)" = BUx). (7.6.27) Then using Ito’s formula $ ie Neve Oe V(x, 9) = 5 de +5 dy + a (Fa ae + aa dey + Fee) (7.6.28) so that Cartas a> = (BE eax — kar) + FE thay — kay) dt (7.6.29) + (ou 8+ Bn $4) dt. 
The first average vanishes since V is deterministically conserved and we find okay, kiko , kykya 7.6.30) x y 5 ¢ ) Gata, yy = (REE All of these terms are of order Q-! and are positive when x and y are positive. Thus, in the mean, V(x, y) increases steadily. Of course, eventually one or other of the axes is hit and similar effects occur to the deterministic case. We see that when x or y vanish, V = 0. Direct implementation of the system size expansion is very cumbersome in this case, and moment equations prove more useful. These can be derived directly from the Master equation or from the Fokker-Planck equation. The results differ slightly 7.6 Some Examples an from each other, by terms of order inverse volume. For simplicity, we use the FPE so that d [Xo kyaXx) — kadxyy Hol ~ lop — k ) <2y dy + dy) ‘2k ,a¢x?) — 2k2¢x?y) + kyadx) + krXxyy =lkGtly — yx) + (ka — ky —k)xyy (7.6.33) 2kaXxy") — 2ks Knowing a system size expansion is valid means that we know all correlations and variances are of order 1/@ compared with the means. We therefore write x = (x) + dx (7.6.34) y= (7.6.36) ‘2kia — 2k2, —2ka »0 >» kya ks + ka{ » 2ka¢x) — 2ks JL dy") We note that the means, to lowest order, obey the deterministic equations, but to next order, the term containing G(s, 1) = a/[B(L — e-™*) — s(a — pe~*)} (7.6.65) x [(B — ae“) — as(l — eye (with 2 = B— a). As t — oo, a stationary state exists only if 8 > a and is G,(s, 00) = (B — as)"”/*(B — a)"'* (7.6.66) > Poy = PE Heal BY og gyre (7.6.67) L(yla)x! We can also derive moment equations from the generating function equations by noting 9,G(S, t)| sar = Cx(t)> (7.6.68) BG Maer = GOLD) = MD - Proceeding this way we have OW) = (hed — k BVA) + KC (7.6.69) and fp OIE) — ID = 20h ~ ky BYALA — UD + kA CX(t)) + 2kyCKx(t)) - (7.6.70) These equations have a stable stationary solution provided 276 7. Master Equations and Jump Processes kA k,B, an explosive chain reaction occurs. Notice that ¢x,) and var {x}, both become very large as a critical point is approached and, in fact, kB me maT (7.6.73) Thus, there are very large fluctuations in ¢x,) near the critical point. Note also that the system has linear equations for the mean and is Markovian, so the methods of Sect. 3.7.4 (the regression theorem) show that , = exp [(k24 — 1B) evar {x}, (7.6.74) so that the fluctuations become vanishingly slow as the critical point is approached, ie., the time correlation function decays very slowly with time. k, b) Chemical Reaction X, —= X; One reaction kt=k, ki=k (7.6.75) 9,G(s15 S25 t) = (82 — 51)(K194, — K29,,)G(S1, S25 t) can be solved by characteristics. The generating function is an arbitrary function of solutions of dt ds, ds, dt_ ds) ds 1.6. 1 kG — 5) BG — 5) Lea) Two integrals are solutions of k.ds, + kids, = 0 => kas, + kis, = 0. (7.6.77) 7.1 The Poisson Representation 217 (hs + kat = M9 72 (7.6.78) => (5,— settee! = ws GS15 S25 t) = Flkas, + Kisa, (82 — sien]. (7.6.79) The initial condition (Poissonian) G51, 52,0) = exp [a(s, — 1) + B(s2 — 1] (7.6.80) gives the Poissonian solution: (kB — k G(S1s Sas t) = exp ie z i (52 = sierra atBp _ + pe Ua — D+ ks: — DI} (7.6.81) In this case, the stationary solution is not unique because x + y is a conserved quantity. From (7.6.79) we see that the general stationary solution is of the form G(5,, 52, 00) = Flkas; + ky52, 0). 
(7.6.82) Thus, (7.6.83) which implies that, setting 5, = s, = 1, Dy = BODDs- ey) 7.7 The Poisson Representation [7.10] This is a particularly elegant technique which generates Fokker-Planck equations which are equivalent to chemical Master equations of the form (7.5.9). We assume that we can expand P(x, 1) as a superposition of multivariate uncor- related Poissons: P(t) = fel ‘fla, t). (1.7.1) This means that the generating function G(s, t) can be written G(s, t) = § da exp[X (sa — Nad fla, t). (7.7.2) We substitute this in the generating function equation (7.6.54) to get 278 7. Master Equations and Jump Processes 8,005, 1) = & fda | (i . 1)" Zi)" o x (ka Tel? — kz Tet “) exp (SG — daa] flat). (7.73) We now integrate by parts, drop surface terms and finally equate coefficients of the exponential to obtain (an) x [xi I att — kz ql at| fa, t). (1.7.4) a) Fokker-Planck Equations for Bimolecular Reaction Systems This equation is of the Fokker-Planck form if we have, as is usual in real chemical reactions, UMi<2 DNS <2 (1.7.5) which indicates that only pairs of molecules at the most participate in reactions. The FPE can then be simplified'as follows. Define the currents Ida) = ki TI (7.7.6) the drifts ALJ(a)] = S) r2l a), (1.7.7) and the diffusion matrix elements by Bal I(@)] = YI Aa) MIMs — NING — 5,0rd) - (7.7.8) Then the Poisson representation FPE is Be) — _ 92 tatuelfa,) az + 4D aT (Bala) f(a, 1} - =3da,0a, (7.7.9) Notice also that if we use the explicit volume dependence of the parameters given in Sect 7 § 27 § 20) and define 7.7 The Poisson Representation 279 Na = @[V (7.7.10) ean (7.7.11) and F(q, t) is the quasiprobability in the 7 variable, then the FPE for the 7 variable takes the form of aF(n, 1) _ a _ | RO SF hr 1+ F Bee gg, BoalDAHD]| (7-712) with AQ = Siriha) (7.7.13a) Lan) = 4 TL ne — 1g Wl Pi (7.7.13b) Buln) = a S\(MEMb — NAN — 6,04) - (7.7.13¢) In this form we see how the system size expansion in V~'/? corresponds exactly to a small noise expansion in 7 of the FPE (7.7.12). For such birth-death Master equations, this method is technically much simpler than a direct system size ex- pansion. b) Unimolecular Reactions If for all A, SiMs<1 and DNS <1, then it is easily checked that the diffusion coefficient B,,(m) in (7.7.13) vanishes, and we have a Liouville equation. An initially Poissonian P(x, fo), corresponds to a delta function F(x, fo), and the time evolution generated by this Liouville equation will generate a delta function solution, 5(7 — 4(t)), where 4(t) is the solution of dn|dt = A(n) This means that P(x, f) will preserve a Poissonian form, with mean equal to 4(f). Thus we derive the general result, that there exist propagating multipoissonian solutions for any unimolecular reaction system. Non Poissonian solutions also exist—these correspond to initial F(, to) which are not delta functions. ©) Example As an example, consider the reaction pair i) A+X 2x he (7.7.14) ky ii) B+} XC ky 280 7. Master Equations and Jump Processes Nt=1, 0 M'=2,) ki=khA, kpoky Wooly MeO, kB | kG so that (7.7.4) takes the form x = [( 7 zy) ~ (1 ~ a) ede oe (7.7.15) +fi-(t- 2) ete — kos af ot 2 [- Ztsc + (kA — kyB)a — kya?) + J lida — kel} f (7.7.16) which is of the Fokker-Planck form, provided k,4a — k,a? > 0. Furthermore, there is the simple relationship between moments, which takes the form (in the case of one variable) xy = Bf dalaee = 1). + NE fla) x! = f daa'fla) = 0 (7.7.24) ky > 0. Clearly, by definition, k; must be positive. 
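The moment relationship above is easily verified by direct sampling. In the following minimal sketch, f(a) is taken, purely for illustration, to be a Gamma density (any normalisable positive quasiprobability would serve equally well); x is drawn as a Poisson deviate whose mean is the random variable a, and the factorial moments of x are compared with the ordinary moments of a.

import numpy as np

rng = np.random.default_rng(2)
n = 10**6

alpha = rng.gamma(shape=3.0, scale=2.0, size=n)   # a convenient positive f(alpha)
x = rng.poisson(alpha)                            # x drawn as Poisson with mean alpha

print(x.mean(), alpha.mean())                               # <x>       vs <alpha>
print((x * (x - 1)).mean(), (alpha ** 2).mean())            # <x(x-1)>  vs <alpha^2>
print((x * (x - 1) * (x - 2)).mean(), (alpha ** 3).mean())  # third factorial moment

The positivity conditions above are, however, not the whole story.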
It must further be checked that the integrations by parts used to derive the FPE (7.7.4) are such that under these conditions, surface terms vanish. For an interval (a, 6) the surface terms which would arise in the case of the reaction (7.7.14) can be written. [((eaAa — kya? — k, Bar + kyC)f — Oal(koae — kaa?) f)} {0-9} 15 + [era — kaa? f(s — Neo? Tg - (7.7.25) Because of the extra factor (s — 1) on the second line, each line must vanish separa- tely. It is easily checked that on the interval (0, k,4/k,), each term vanishes at each end of the interval for the choice (7.7.22) of f, provided 6 and k, are both greater than zero. In the case where k; and 6 are both positive, we have a genuine FPE equivalent to the stochastic differential equation da = [ksC + (kxA — kyB)a — kyat|dt + Jka — ke dt). (7.7.26) The motion takes place on the range (0, k,A/ks) and both boundaries satisfy the cri- teria for entrance boundaries, which means that it is not possible to leave the range (0, kzA/k,) (Sect.5.2.1). If either of the conditions (7.7.24) is violated, it is found that the drift vector is such as to take the point outside the interval (0, k,4/k,). For example, near a = 0 we have da ~ k,C dt (1.1.21) and if kC is negative, a will proceed to negative values. In this case, the coefficient of dW(t) in (7.7.26) becomes imaginary and interpretation is no longer possible without further explanation. Of course, viewed as a SDE in the complex variable @ a + ity 5 (7.7.28) the SDE is perfectly sensible and is really a pair of stochastic differential equations for the two variables a, and a,. However, the corresponding FPE is no longer the 282 7. Master Equations and Jump Processes one variable equation (7.7.16) but a two-variable FPE. We can derive such a FPE in terms of variations of the Poisson representation, which we now treat. 7.1.1 Kinds of Poisson Representations Let us consider the case of one variable and write P(x) = f dula)(e~*a*/x!)f(a) . (7.7.29) gs Then y(q) is a measure which we will show may be chosen in three ways which all lead to useful representations, and & is the domain of integration, which can take on various forms, depending on the choice of measure. 7.1.2 Real Poisson Representations Here we choose dua) = da (7.7.30) and @ isa section of the real line. As noted in the preceding example, this represen- tation does not always exist, but where it does, a simple interpretation in terms of Fokker-Planck equations is possible. 7.1.3 Complex Poisson Representations Here, dy(a) = da (7.731) and & is a contour C in the complex plane. We can show that this exists under certain restrictive conditions. For, instead of the form (7.7.18), we can choose fla) = z# ater (7.7.32) and C to be a contour surrounding the origin. This means that agg tartan PH) = 55 § GT = Sar (7.7.33) By appropriate summation, we may express a given P(x) in terms of an f(a) given by fla) = 5S Peay! (1.7.34) If the P(y) are such that for all y, y!P(y) is bounded, the series has a finite radius of convergence outside which f(a) is analytic. By choosing C to be outside this circle of convergence, we can take the integration inside the summation to find that P(x) is given by J.J tne Poisson Kepresenauon zo P(x) = § da(e-*a*/x!)f(a) . (7.7.35) é a) Example: Reactions (1) A + Y= 2X, 2) B+ X==C We use the notation of Sect. 7.7 and distinguish three cases, depending on the magnitude of 5. The quantity d gives a measure of the direction in which the reac- tion system (7.7.14) is proceeding when a steady state exists. 
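Before treating the separate cases, note that whenever (7.7.26) holds as a genuine real stochastic differential equation, as in the first case below, it can be integrated numerically. The sketch below uses a simple Euler-Maruyama discretisation, with the rate constants labelled as in (7.7.14, 26); the numerical values are illustrative choices for which the drift is positive at a = 0 and negative at a = k2A/k1, so that a remains inside the interval, and the clipping step is only a guard against discretisation overshoot near the entrance boundaries, which the exact process cannot cross. The moments of x then follow from <x> = <a> and var{x} = <a> + var{a}.

import numpy as np

rng = np.random.default_rng(3)

# Illustrative constants: the drift is positive at alpha = 0 (k3*C > 0) and
# negative at alpha = k2*A/k1, so alpha is held inside (0, k2*A/k1).
k1, k2, k3, k4 = 1.0, 1.0, 1.0, 1.0
A, B, C = 10.0, 2.0, 5.0
amax = k2 * A / k1

dt, nsteps, nburn = 1.0e-3, 500_000, 50_000
a, samples = 0.5 * amax, []
for step in range(nsteps):
    drift = k3 * C + (k2 * A - k4 * B) * a - k1 * a * a
    bvar = max(2.0 * (k2 * A * a - k1 * a * a), 0.0)   # noise variance per unit time
    a += drift * dt + np.sqrt(bvar * dt) * rng.standard_normal()
    a = min(max(a, 0.0), amax)       # guard against discretisation overshoot only
    if step >= nburn:
        samples.append(a)

s = np.array(samples)
print("<x> =", s.mean(), "  var{x} =", s.mean() + s.var())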
If 6 > 0, we find that when x has its steady state value, reaction (1) is producing X while reaction (2) consumes Y. When 6=0, both reactions balance separately—thus we have chemical equilibrium. When 6 < 0, reaction (1) consumes X while reaction (2) pro- duces X. i) 5 > 0: according to (7.7.24), this is the condition for f,(a) to be a valid quasipro- bability on the real interval (0, k,4/k,). In this range, the diffusion coefficient (k,Aa — k,a?) is positive. The deterministic mean of a, given by = kB + [kad — KiB) + Aka? 2k, (7.7.36) lies within the interval (0, kA/k,). We are therefore dealing with the case of a genu- ine FPE and f(a) is a function vanishing at both ends of the interval and peaked near the deterministic steady state. ii) 6 =0: since both reactions now balance separately, we expect a Poissonian steady state. We note that f,(a) in this case has a pole at a = k,A/k, and we choose the range of a to be a contour in the complex plane enclosing this pole. Since this is a closed contour, there are no boundary terms arising from partial integration and P,(x) given by choosing this type of Poisson representation clearly satisfies the steady state Master equation. Now using the calculus of residues, we see that er *eag Px) = oe (7.737) with a = kyAlka « iii) 6 < 0: when 6 < 0 we meet some very interesting features. The steady state solution (7.7.22) now no longer satisfies the condition 5 > 0. However, if the range of @ is chosen to be a contour C in the complex plane (Fig. 7.3) and we employ the complex Poisson representation, P,(x) constructed as P(x) = fda Shae (7.7.38) is a solution of the Master equation. The deterministic steady state now occurs at a point on the real axis to the right of the singularity at a = k,A/k,, and asymp- totic evaluations of means, moments, etc., may be obtained by choosing C to pass through the saddle point that occurs there. In doing so, one finds that the variance of a, defined as 284 7. Master Equations and Jump Processes Fig. 7.3. Contour C in the complex plane for the evaluation of (7.7.38) var {al ae (7.7.39) is negative, so that var {x} = (x7) — (x)? = (a?) — (a)? + (a) < 0, the singularity at @ = k,A/k, is now integrable so the contour may be collapsed onto the cut and the integral evaluated as a discontinuity integral over the range [0, k,A/k,]. (When 6 is a positive integer, this argument requires modification). b) Example: Reactions B—*!. x, 2x “2. 4 For which the Fokker-Planck equation is MOD — — 8 (6,7 — 2V WP fle, 1 — Favela. 1, (1-741) where x,V = k,B, «,V~! = k, and Vis the system volume. Note that the diffusion coefficient in the above FPE is negative on all the real lines. The potential solution of (7.7.41) is (up to a normalisation factor) f(a) = a? exp (2a + aV*/a) (7.7.42) with a = 2x,/x, and the a integration is to be performed along a closed contour encircling the origin. Of course, in principle, there is another solution obtained by solving the stationary FPE in full. However, only the potential solution is single valued and allows us to choose an acceptable contour on which partial integration is permitted. Thus, by putting a = 7V, we get Vt § dy eV 2ntel yf p= Paper bn (7.7.43) The function (27 + a/y) does not have a maximum at the deterministic steady state. In fact, it has a minimum at the deterministic steady state 7 = + (a/2)"!?. J.0 Ane roissun neprescuauuis a However, in the complex 7 plane this point is a saddle point and provides the dominant contribution to the integral. 
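This saddle-point structure has a directly observable consequence, which can be checked on the underlying jump process. The following sketch simulates the two reactions B -> X and 2X -> A with illustrative rate values, accumulates the time-averaged mean and variance of x along the jump path, and prints the ratio var{x}/<x>, which comes out below unity (about 3/4 for these parameter values, as a linearised analysis predicts); since var{x} = <x> + var{a}, this is precisely the statement that the variance of a is negative.

import numpy as np

rng = np.random.default_rng(4)

kb = 50.0                    # t+(x) = kb          (B -> X)
k2 = 0.1                     # t-(x) = k2*x*(x-1)  (2X -> A, removing two molecules)
x, t, t_end = 0, 0.0, 1000.0
ts, xs = [0.0], [0]
while t < t_end:
    up, down = kb, k2 * x * (x - 1)
    total = up + down
    t += rng.exponential(1.0 / total)
    if rng.random() < up / total:
        x += 1
    else:
        x -= 2               # a pair-annihilation step removes two X molecules
    ts.append(t)
    xs.append(x)

w = np.diff(ts)
s = np.array(xs[:-1], dtype=float)
m = np.average(s, weights=w)
v = np.average((s - m) ** 2, weights=w)
print(m, v, v / m)           # Fano factor v/m below 1: sub-Poissonian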
Thus, the negative diffusion coefficient in (7.7.41) reflects itself by giving rise to a saddle point at the deterministic steady state, which results in the variance in X being less than I vanish when integrated in (7.7.50). Noting that the Poisson form e~*a*/x! is itself analytic in @, we obtain for any positive value of o? P(x) = f da f,(a)e-*a*]/x! = e-0ar/x!. (7.7.51) In practice, this nonuniqueness is an advantage rather than a problem. a) Fokker-Planck Equations We make use of the analyticity of the Poisson and its generating function to produce Fokker-Planck equations with positive diffusion matrices. A FPE of the form of (7.7.9) arises from a generating function equation G6, 1) = fdPe flat) (52 Aa ge + 324 Bageez) exp [El64— ag} (7.7.52) We now take explicit account of the fact that @ is a complex variable =a, +ia, (7.7.53) and also write Aa) = A,(@) + iA,(@) « (7.7.54) We further write ‘ B(a) = C(a)C*(@) (7.7.55) and (a) = C.(a) + iC,(@) (7.7.56) For brevity we use =~ * Bae a= x (7.7.57) a Oey." Because of the analyticity of exp [37(s. — !)a,], in the generating function equation (7.7.52) we can always make the interchangeable choice 8, 05 > — id. (7.7.58) We then substitute the form (7.7.54) for B,,, and replace , by either 03 or —idz according to whether the corresponding index on A or Cis x or y respectively. We then derive 4,G(s, 1) = § dea fl@, 1D (AaixdE + Aaiy2) FES Care xCe.t:2805 + CoreryCe.0: 0208 + Co iC, 4;9%08] EXP [3 (Se — Naz]}- (7.7.59) Integrating by parts and discarding the surface terms to get a FPE in the variables CA I fle, 1) =[- Ei Oa Aa:xt Fa Ae») + 4 Fi a95Cae:xCe.b:x + B29$Ca,c:yCe,by + 29G95Ca,c;xCe,0; yI f(a t) « (7.7.60) In the space of doubled dimensions, this is a FPE with positive semidefinite diffusion. For, we have for the variable (a, @,) the drift vector Aa) = [A,(a), 4,(@)] (7.7.61) and the diffusion matrix Cc Fa) = (ee cee ale Fay¥ay" 7.762) where (a) = ° 1.7.63 ole 3 cs so that Ba) is explicitly positive semidefinite. b) Stochastic Differential Equation (SDE) Corresponding to the drift and diffusion (7.7.61, 62) we have a stochastic differential equation de.) [Al] 5, [OAM 1768 ie! 7 We i+ ee a where W(t) is a Wiener process of the same dimension as @,. Note that the same Wiener process occurs in both lines because of the two zero entries ¥(a) as written in (7.7.63). Recombining real and imaginary parts, we find the SDE for the complex variable @: da = A(a)dt + C(a)dW(t) . (7.7.65) This is of course, exactly the same SDE which would arise if we used the usual rules for converting Fokker-Planck equations to stochastic differential equations directly on the Poisson representation FPE (7.7.9), and ignored the fact that C(@) 288 7. Master Equations and Jump Processes so defined would have complex elements if B was not a positive semidefinite diffu sion matrix. c) Examples of Stochastic Differential Equations in the Complex Plane We again consider the reactions (sect. 7.7.b) ka A+ X 22x ke (7.1.66) ky BLX=C. ky The use of the positive Poisson representation applied to this system yields the SDE, arising from the FPE (7.7.16): da = [ksC + (ka — k:B)a — kya? Jt +[2(k,Aa — ka?)}7dW(t). (7.7.67) In the case 6>0, we note that the noise term vanishes at a = 0 and at a =k,A/k,, is positive between these points and the drift term is such as to return @ to the range [0, k,A/k,] whenever it approaches the end points. Thus, for 5 > 0, (7.7.67) represents a real SDE on the real interval [0, k,A/k«]. 
In the case 6 < 0, the stationary point lies outside the interval (0, k,A/k,], and a point initially in this interval will move along this interval governed by (7.7.67) until it meets the right-hand end, where the noise vanishes and the drift continues to drive it towards the right. One leaving the interval, the noise becomes imaginary and the point will follow a path like that shown in Fig. 7.4 until it eventually reaches the interval (0, k,A/k,] again. The case of 5 = 0 is not very dissimilar, except that once the point reaches the right-hand end of the interval [0, k,A/k,], both drift and diffusion vanish so it re- mains there from then on. In the case of the system BX 2X—A, (7.7.68) Fig. 7.4. Path followed by a point obeying the stochastic differential equation (7.7.67) Fig. 7.5. Simulation of the path of a point obeying the stochastic differential equation (7.7.69) > 7.17 The Poisson Representation 289 the SDE coming from the FPE (7.7.41) is dnfdt = Ky — Ikan? + ic(m,) "7ne(t), (7.7.69) where @ = Vande = V-!?. The SDE (7.7.69) can be computer simulated and a plot of motion in the com- plex 7 plane generated. Figure 7.5 illustrates the behaviour. The point is seen to remain in the vicinity of Re {a} = (a/2)'? but to fluctuate mainly in the imaginary direction on either side, thus giving rise to a negative variance in a. 7.15 Time Correlation Functions The time correlation function of a Poisson variable a is not the same as that for the variable x. This can be seen, for example, in the case of a reaction X + Y which gives a Poisson Representation Fokker-Planck equation with no diffusion term. Hence, the Poisson variable does not fluctuate. We now show what the relationship is. For clarity, the demonstration is carried out for one variable only. We define Kat)a(s)> = f du(a)du(a’)aa' f(a, tla’, s)f(a’, 5). (7.7.70) We note that Slay s\a’, s) = 8,(a — a’) which means that J aula) e-*(a*/x!)f(a, sa’, 8) = e7*"a"*/x! (7.1.11) so that J dula) af(a, ta’, 8) =D xP(x, t|x’, sle-*'a'x!/x!! Hence, alt dats)) = 3} xPCs thx’, 8) f duel (ar"*¥e-*'/x' fle, 8) =D xPcetly’,s) fdutad (-a' 2, + x)erren ia’ |fte'.9) = Sx x PC tx’, PC, 3) (7.7.12) = J dula’) fla’, s)a! 2% SyxPCs tla’, Meer*’x!). (7.7.73) We define a(t) |[a’, s]) = J daaf(a, tla’, s) (7.1.74) 290 7. Master Equations and Jump Processes as the mean of a(t) given the initial condition a’ at s. Then the second term can be written = f dutatya 32 cote)’, sb fte’,) = (a! 32 date) Ila, sly) (7.715) so we have = + (a Z = always , we have lt), 20> = Cel, 9) + (0h Cul e',Sl>) 7.7) This formula explicitly shows the fact that the Poisson representation gives a process which is closely related to the Birth-Death Master equation, but not isomorphic to it. The stochastic quantities of interest, such as time correlation functions, can all be calculated bugare not given directly by those of the Poisson variable. a) Interpretation in Terms of Statistical Mechanics We assume for the moment that the reader is acquainted with the statistical mechanics of chemical systems. If we consider a system composed of chemically reacting components A, B, C, ..., the distribution function in the grand canonical ensemble is given by POD) = exp (B12 + X wx) — EM}, (7.1.78) where J is an index describing the microscopic state of the system, x,(J) is the number of molecules of X, in the state J, E(7) is the energy of the state, jz, is the chemical potential of component X;, 2 is a normalization factor, and BH AkT. 
(7.1.79) The fact that the components can react requires certain relationships between the chemical potentials to be satisfied, since a state J can be transformed into a state J only if Dox) = Lote, A= 12,3, (7.7.80) where v4 are certain integers. The relations (7.7.80) are the stoichiometric con- straints. 7.7 The Poisson Representation 291 The canonical ensemble for a reacting system is defined by requiring DaAO=, (7.7.81) 7 for some t4, whereas the grand canonical ensemble is defined by requiring SPw =x vie) = x vitx) = 14, (7.7.82) Maximization of entropy subject to the constraint (7.7.82) (and the usual con- straints of fixed total probability and mean energy) gives the grand canonical form (7.7.78) in which the chemical potentials also satisfy the relation m= Tikes. (7.1.83) 7 When one takes the ideal solution or ideal gas limit, in which interaction ener- gies (but not kinetic or internal energies) are neglected, there is no difference between the distribution function for an ideal reacting system and an ideal nonre- acting system, apart from the requirement that the chemical potentials be ex- pressible in the form of (7.7.83). The canonical ensemble is not so simple, since the constraints must appear ex- plicitly as a factor of the form Ul ah vix(1), t4] (7.7.84) and the distribution function is qualitatively different for every kind of reacting system (including a nonreacting system as a special case). The distribution in total numbers x of molecules of reacting components in the grand canonical ensemble of an ideal reacting system is easily evaluated, namely, P(x) = exp [AQ + Fi wxd] DTI xD), xd exp [—BEW)) . (7.7.85) The sum over states is the same as that for the canonical ensemble of an ideal non- reacting mixture so that P(x) = exp [B(@ +E wexd) TS (Lexp (PEO (7.7.86) 1 il where E,(i) are the energy eigenstates of a single molecule of the substance A. This result is a multivariate Poisson with mean numbers given by logdx,) = Bu; — log [ eo PERO] (7.7.87) which, as is well known, when combined with the requirement (7.7.82) gives the law of mass action. 2927. Master Equations and Jump Processes The canonical ensemble is obtained by maximizing entropy subject to the stronger constraint (7.7.81), which implies the weak constraint (7.7.82). Thus, the distribution function in total numbers for the canonical ensemble will simply be given by Pea) ce {IT See] ST te eA. (7.788) In terms of the Poisson representation, we have just shown that in equilibrium situations, the quasiprobability (in a grand canonical ensemble) is S(@)eq = Sf — a(eq)] (7.7.89) since the x space distribution is Poissonian. For the time correlation functions there are two results of this. i) The variables a(t) and a(s) are nonfluctuating quantities with values a(eq). Thus, (7.7.94) to be the mean values of the quantities x, at time ¢ under the condition that the system was in a configuration / at time s. Then a quantity of interest is the mean value of (7.7.94) over the distribution (7.7.92) of initial conditions, namely, 7.17 The Poisson Representation 293 Xo tl [Hs s)) = x xn tI, sPZ-M(H) x exp as Hyx(J) — EV}. (7.1.95) When the chemical potentials satisfy the equilibrium constraints, this quantity will be time independent and equal to the mean of x, in equilibrium, but otherwise it will have a time dependence. 
Then, with a little manipulation one finds that [7 Gon tlt sD] = lO Hae (1196) a= we The left-hand side is a reponse function of the mean value to the change in the chemical potentials around equilibrium and is thus a measure of dissipation, while the right-hand side, the two-time correlation function in equilibrum, is a measure of fluctuations. To make contact with the Poisson representation result (7.7.91) we note that the chemical potentials y, in ideal solution theory are given by HAGE) = KT log , =V = [exp (—F tyler tents Marrs (7.7.106) and Hed Ie’ OY = Fr ClO, OP = Lemp (— FO. (7.1107) Hence, = [Ct — t0)/6]"(3m) fm! . (7.7.144) 300 7. Master Equations and Jump Processes Further, we assume the process (7.7.141) is some kind of generalized Markov process, for which the joint probability distribution is given by P(Vata? Yt) = P(ata| Mh )PM, fH) (7.7.145) and from (7.7.142) we see that the first factor is a function of only v, — v, and tz — h, so that the variable V(t,) — V(t,) is statistically independent of V(t,) and that this process is a process with independent increments. Thus, dV(t) will be independent of V(t). The rigorous definition of stochastic integration with respect to V(t) is a task that we shall not attempt at this stage. However, it is clear that it will not be too dissimilar to Ito integration and, in fact, Hochberg [7.12] has rigorously defined higher-order noises of even degree and carried out stochastic integration with respect to them. We can show, however, that a stochastic differential equation of the form dy(t) = a(y)dt + B(y)dW(t) + ey) dV(t) (7.7.146) [with W(1) and V(t) independent processes] is equivalent to a third-order Fokker- Planck equation. It is clear that because W(t) and V(t) are processes with inde- pendent increments, y(s) is a Markov process. We then calculate tim (LO = HOON — tim tt0 t—to arq-0 <[dvRto)"> ie (7.7.147) where y(t) is a numerical initial value, not a stochastic variable. From (7.7.146), y(t) depends on W(t’) and V(t") for only t’ < t and, since dW(t) and dV(t) are inde- pendent of y(t), we find dy (to)> = Cal v(te)]>dto + + = BLW(ta)’ = O[y(to)F?dto (7.7.149) = cL y(ta) AV (t0)?> = H(to)}°dto - (7.7.150) Thus, we find i [Kx(t) — (te) Mt — to)] = al W(to)] Hes [OO — vGo)P>/( — t0)] = b1y(t0) (7.7.151) lim [<19(t) — v(to)P Mt — t0)] = cL y(t0)? 7.7 The Poisson Representation 301 and all higher powers give a zero result. By utilising a similar analysis to that of Sect.3.4, this is sufficient to show that y(¢) is a generalized diffusion process whose generalized FPE is : 7 G9) = — 3 rato) + FZ BoA — § Holey. (7.7.152) We define a noise source ¢(t) by V(t) = C(t)dt , (7.7.153) where eM) = KNEE) = 0 (7.7.154) KELL = Bt — 1/)8(' — t”) (7.7.155) and higher moments can be readily calculated from the moments of dV(t). The independence of increments means that, as with the Ito integral, integrals that have a delta-function singularity at their upper limit are to be taken as zero. Example of the Use of Third-Order Noise. Consider the chemical process k, A+ 2X = 3X (7.7.156) ka ks A==X ke whose Poisson representation FPE is BOD — 2 ty, y-tat — raV Aa? + KV — Kefle, I 1 & lg? Ig? +> Zqaldoa Ve? — V0") flay) = a & [6 Va? — 1, a) fla, 1)], (7.7.157) where V7! = kA, V%=ky, KV=ky Ke = kee In the steady state, (7.7.157) reduces to a linear second-order differential equa- tion which may be solved in terms of hypergeometric functions, and an asymptotic expansion for the various moments can be obtained using steepest descent methods. 
This procedure, although possible in principle, is not very practicable. It is in such cases that the method of stochastic differential equations proves to be very useful because of its ease of application. 302 7. Master Equations and Jump Processes The stochastic differential equation equivalent to (7.7.157) is adnft)idt = kin(t)? — Kan(t? + Ks — Kant) + Alin)? — Kant} 76(0) + wt 6liM(t)? — Kant} PC(t) (7.7.158) where a = V, « = V-"/6 and the noise source {(t), henceforth referred to as the “third-order noise”, has been defined in (7.7.153-155) Equation (7.7.158) may be solved iteratively by expanding (t): mt) = molt) + 2omalt) + wnt) + Honlt) + wnat) + went) +... (7.7.159) which, when substituted in (7.7.158), yields the deterministic equation in the lowest order and linear stochastic differential equations in the higher orders which may be solved as in Sect.6.2. In the stationary state the results are 2 &x) = Vito + <6) +o = Vito + 2b (7.7.160a) 47) — 3) = Vin) + [2Coms> + 2am + <6) + 3M + Cod + -- = ve eb m+, (7.7.160¢) where a = K,3 — Kang, b = 2K, — 3x2mMo, C= Ky — 2eiNo + 3x22 and Np is the solution of the steady-state deterministic equation xm — Kam + Ks — Kato = 0- (7.7.161) Here a few remarks are in order. The third-order noise ¢(t) contributes to O(V-') to the mean and to O(1) to the variance, but contributes to O(V) to the skewness coefficient. If one is only interested in calculating the mean and the variance to O(V), the third-order noise may be dropped from (7.7158) and the expansiou carried out in powers of e = V~'/?. Also note that as c —- 0, the variance and the higher order corrections become divergent. This of, course, is due to the fact that in this limit, the reaction system exhibits a first-order phase transition type behaviour. 8. Spatially Distributed Systems Reaction diffusion systems are treated in this chapter as a prototype of the host of spatially distributed systems that occur in nature. We introduce the subject heuristically by means of spatially dependent Langevin equations, whose inade- quacies are explained. The more satisfactory multivariate master equation descrip- tion is then introduced, and the spatially dependent Langevin equations formulated as an approximation to this description, based upon a system size expansion. It is also shown how Poisson representation methods can give very similar spatially dependent Langevin equations without requiring any approximation. We next investigate the consequences of such equations in the spatial and temporal correlation structures which can arise, especially near instability points. The connection between local and global descriptions is then shown. The chapter concludes with a treatment of systems described by a distribution in phase space (ie. the space of velocity and position). This is done by means of the Boltzmann Master equation. 8.1 Background The concept of space is central to our perception of the world, primarily because well-separated objects do not, in general, have a great deal of influence on each other. This leads to the description of the world, on a macroscopic deterministic level, by Jocal quantities such as local density, concentration, temperature, electro- magnetic potentials, and so on. Deterministically, these are normally thought of as obeying partial differential equations such as the Navier-Stokes equations of hydrodynamics, the reaction diffusion equations of chemistry or Maxwell's equa- tions of classical electromagnetism. 
The simplest cases to consider are reaction diffusion equations, which describe chemical reactions and which form the main topic of this chapter. In order to get some feel of the concept, let us first consider a Langevin equation description for the time evolution of the concentration p of a chemical substance. Then the classical reaction-diffusion equation can be derived as follows. A diffusion current d(r, t) exists such that de, t) = —DP ptr, 1) (8.1.1) and (8.1.1) is called Fick’s law. If there is no chemical reaction, this current obeys a conservation equation. For, considering an arbitrary volume V, the total amount of chemical in this volume can only change because of transport across the bound- ary S, of V. Thus, if N is the total amount in V, 304 8. Spatially Distributed Systems Watt ar ple, t) = = [dS-i(e,0) (8.1.2) = — farv-jrt). Hence, since V is arbitrary, O.plr, t) + V-5(r, t) = (8.1.3) Substituting Fick’s law (8.1.1) into the conservation equation (8.1.3) we get 4,p(r, t) = DP*plr, t), (8.1.4) the diffusion equation. Now how can one add fluctuations? First notice that the conservation equation (8.1.3) is exact; this follows from its derivation. We cannot add a fluctuating term to it. However, Fick’s law could well be modified by adding a stochastic source. Thus, we rewrite S(t, t) = —DP plr,t) + fale, t)- (8.1.5) Here, f,(r, t) is a vector Langevin source. The simplest assumption to make con- cerning its stochastic properties is = 0 and Salts fas 1 = Kalr, Ndy5(r — 2’ — 1’) , (8.1.6) * that is, the different components are independent of each other at the same time and place, and all fluctuations at different times or places are independent. This is a locality assumption. The fluctuating diffusion equation is then d,p(r, t) = DV*p(r, t) — 0 -Falt, 1) - (8.1.7) Notice that GD fale, OV’ -Fale’ 1Y =O -VIK Ar, O8(r — rVS(t — t') « (8.1.8) Now consider including a chemical reaction. Fick’s law still applies, but instead of the conservation equation we need an equation of the form aN al Pr ple) = — [AS-i + [ dr Flor, 0, (6.1.9) where Flp(r,1)] is a function of the concentration and represents the production of the chemical by a local chemical reaction. Hence we find, before taking fluctuations into account, O,p(r, t) + 0-5, t) = Flo(r, 1) - (8.1.10) 8.1 Background 305 The production of the chemical by a chemical reaction does generate fluctuations, so we can add to (8.1.10) a term f.(r, t) which satisfies = 0 fe, OFA, YY = Kr, NB(e = ‘BCE = 1°) ay which expresses the fact that the reaction is /ocal (i.e., fluctuations at different points are uncorrelated) and Markov (delta correlated in time). The full reaction-diffusion chemical equation now becomes O,p(r, 1) = DV*p(r, t) + Flolr, 1)] + g(r, t) (8.1.12) ° where ar, 1) = —V-Falr, 1) + fr, 1) (8.1.13) and dele, Nate, t)>=KeP— 7, NERF) +P PK ale, 1)8(r—r’)}} 8(t— 2’). | (8.1.14) The simplest procedure for turning a classical reaction diffusion equation into a Langevin equation yields a rather complex expression Further, we know nothing about K,(r) or K,(r), and this procedure is based on very heuristic models. Nevertheless, the form derived is essentially correct in that it agrees with the results arising from a more microscopic approach based on Master equations, which, however, specifies all arbitrary constants precisely. 8.1.1 Functional Fokker-Planck Equations By writing a stochastic partial differential equation such as (8.1.12), we immediately raise the question: what does the corresponding Fokker-Planck equation look like? 
It must be a partial differential equation in a continuously infinite number of variables p(r), where r is the continuous index which distinguishes the various variables. A simple-minded way of defining functional derivatives is as follows. First, divide space into cubic cells of side / labelled i with position r,, and introduce the variables x = Pole) (8.1.15) and consider functions of the variables x = {x,}. We now consider calculus of functions F(x) of all these cell variables. Partial derivatives are easily defined in the usual way and we formally introduce the functional derivative by SEC) fim 1-2 ED) Spr.) 0 x (8.1.16) 306 8, Spatially Distributed Systems In what sense this limit exists is, in most applied literature, left completely unde- fined. Precise definitions can be given and, as is usual in matters dealing with functionals, the precise definition of convergence is important. Further, the “obvious” definition (8.1.16) is not used. The precise formulation of functional calculus is not within the scope of this book, but an indication of what is normally done by workers who write such equations is appropriate. Effectively, the functional derivative is formally defined by (8.1.16) and a corresponding discretised version of the stochastic differential equation such as (8.1.12) is formulated. Using the same notation, this would be dey = [EI Dyxy + FO) at +O Bud WO) (8.1.17) In this equation, D,, are coefficients which yield a discretised approximation to DV?. The coefficients F and g are chosen so that Flor 1] = tim Feat (8.1.18) a(t 1) = lim 9S ay dW (0). (8.1.19) More precisely, we assume a more general correlation formula than (8.1.14), i.e., = Gr, r')(t — t'), (8.1.20) and require ‘ Glo r) = lim 1-6 Bubp (8.1.21) In this case, the FPE for x, is 0,P(x) = -EZ Wr, + byF(x)P@)} + = > ae BB nP(x)- (8.1.22) Now consider the limit /? - 0. Some manipulation gives 9,P(p) = — fare EO) {(D7°(r) + Flo@)IP(p)} +4Sfaer or | saa alr, “Pe: (8.1.23) ia P(p) is now a kind of functional probability and the definition of its normalisation requires a careful statement of the probability measure on p(r). This can be done [8.1] but what is normally understood by (8.1.23) is really the discrete version (8.1.22), and almost all calculations implicitly discretise. The situation is clearly unsatisfactory. The formal mathematical existence of stochastic partial differential equations and their solutions has now been establi- shed, but as an everyday computational tool this has not been developed. We refer the reader to [8.1] for more information on the mathematical formulation. Since, however, most work is implicitly discretised, we will mostly formulate 8.2 Multivariate Master Equation Description 307 matters directly in a discretised form, using continuum notations simply as a convenience in order to give a simpler notation. 8.2 Multivariate Master Equation Description 8.2.1 Diffusion We assume that the space is divided into cubic cells of volume AV and side length 1. The cells are labelled by an index i and the number of molecules of a chemical X inside cell i is called x,. Thus we introduce a multivariate probability P(x, 1) = P(x, Xn 2 ot) = POX, £,1)- (8.2.1) In the last expression, # means the vector of all x’s not explicitly written. We can model diffusion as a Markov process in which a molecule is transferred from cell i to cell j with probability per unit time dix, i.e., the probability of transfer is proportional to the number of molecules in the cell. 
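Such a jump model is simulated in exactly the same way as the single-cell processes of Chap. 7. The sketch below makes the simplest illustrative choice, nearest-neighbour hopping at a uniform rate d on a ring of cells; all molecules are placed initially in one cell and the direct method relaxes the profile, so that after a few multiples of 1/d the occupation numbers are nearly flat, with Poisson-like scatter about the uniform value.

import numpy as np

rng = np.random.default_rng(5)

ncell, d = 20, 1.0              # ring of cells; hop rate d per molecule, per direction
x = np.zeros(ncell, dtype=int)
x[ncell // 2] = 200             # all molecules start in the central cell
t, t_end = 0.0, 5.0
ntot = x.sum()                  # diffusion conserves the total number
while t < t_end:
    t += rng.exponential(1.0 / (2.0 * d * ntot))   # total jump rate is 2*d*sum(x)
    i = rng.choice(ncell, p=x / ntot)              # source cell, chosen with weight x_i
    j = (i + rng.choice([-1, 1])) % ncell          # one molecule hops to a neighbour
    x[i] -= 1
    x[j] += 1

print(x)   # nearly flat profile, fluctuating Poisson-wise about ntot/ncell

The mean occupation numbers extracted from many such runs obey the discrete diffusion equation derived below.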
For a strictly local description, we expect that d,, will be nonzero only when i and j are neighbouring cells, but this is not necessary and will not always be assumed in what follows. In terms of the notation of Sect. 7.5, we can write a birth-death Master equation with parameters given by the replacements: Ne =O M“? =(.. rP =O... c Kin = dy kgy =0. Hence, the Master equation becomes 0,P(x, t) = Ddyl + DP, x + 1, x, — 1,1) — x, P(x, ) (8.2.3) This equation is a simple linear Master equation and can be solved by various means. Notice that since PP = pun (8.2.4) NU = Moo, we can also restrict to i > j and set kj) = dj. From (7.5.15,18) we see that in this form, detailed balance is satisfied provided dKxde = UydXiDa- (8.2.5) 308 8. Spatially Distributed Systems In a system which is diffusing, the stationary solution is homogenous, i-e., a= De Hence, detailed balance requires dy = dy, (8.2.6) and (8.2.3) possesses a multivariate Poisson stationary solution. The mean-value equation is AKAD) = Sr PPL) — HY] a === Hence, d = = dy — 5p a dix) - (8.2.8) 8.2.2 Continuum Form of Diffusion Master Equation Suppose the centre of cell i is located at r, and we make the replacement x(t) = P(r, t) ' (8.2.9) and assume that d,, = 0 (i, j not nearest neighbours) d (i,j adjacent). Then (8.2.8) becomes, in the limit /— 0, 9. with (8.2.10) D=Pd. (8.2.11) Thus, the diffusion equation is recovered. We will generalise this result shortly. a) Kramers-Moyal or System Size Expansion Equations We need a parameter in terms of which the numbers and transition probabilities scale appropriately. There are two limits which are possible, both of which corres- pond to increasing numbers of molecules: i) limit of large cells: ! —~ co, at fixed concentration; ii) limit of high concentration at fixed cell size. The results are the same for pure diffusion. In either case, tix) + 0 (8.2.12) and a system size expansion is possible, To lowest order, this will be equivalent to a Kramers-Moyal expansion. From (7.5.31,32) we find 8.2. Multivariate Master Equation Description 309 A(x) = Y Dy x; (8.2.13) i Binl®) = 51m Dox + Dyx)) = DimX1 — DyuiXm » (8.2.14) where Dn = dn ~ 5S de (8.2.15) and thus, in this limit, P(x, 1) obeys the Fokker-Planck equation OP = —S2dA(e)P +} TI:DyBinl=)P - (8.2.16) b) Continuum Form of Kramers-Moyal Expansion The continuum form is introduced by associating a point r with a cell i and writing PY fdr (8.2.17) Dj —~ D(r',1) = Wr, — r+) (8.2.18) 15, (r — 1’). (8.2.19) At this stage we make no particular symmetry assumptions on D,,, etc, so that anisotropic inhomogeneous diffusion is included. However, there are some requirements brought about by the meaning of the concept “diffusion.” i) Diffusion is observed only when a concentration gradient exists. This means that the stationary state corresponds to constant concentration and from (8.2.13,15), this means that SD, =0, (8.2.20) 7 ie, Dd = Dy. (8.2.21) Note that detailed balance (8.2.6) implies these. ii) Diffusion does not change the total amount of substance in the system, i.e., 455-0 (8.2.22) aa” and this must be true for any value of x;. From the equation for the mean values, this requires xD, =0 (8.2.23) which follows from (8.2.15) and (8.2.21) iii) Jn the continuum notation, (8.2.20) implies that for any r, 310 8. Spatially Distributed Systems Ja Dr + 5,6) =0 (8.2.24) and from (8.2.23), we also have f.d°6 Hr, 6) =0. 
(8.2.25) iv) If detailed balance is true, (8.2.24) is replaced by the equation obtained by substituting (8.2.6) in the definition of D, i-e., Dy = Dy (8.2.26) which gives in the continuum form P(r + 6, —6) = Dr, 4). (8.2.27) The derivation of a continuum form now follows in a similar way to that of the Kramers-Moyal expansion. We define the derivate moments M(r) = sf P65 5D (r, 6) (8.2.28) D(r) = $f a5 65D, 8) (8.2.29) and it is assumed that derivate moments of higher order vanish in some appro- priate limit, similar to those used#in the Kramers-Moyal expansion. The detailed balance requirement (8.2.27) gives M(r) = [ d°65O(r + 6, —8) = f 05 [O(r, —8) + 6-PHr, —6) + ...] (8.2.30) = —M() + 2p-D(n) + Hence, detailed balance requires Mr) =7-D(r). (8.2.31) The weaker requirement (8.2.24) similarly requires the weaker condition V-[M(r) — 7-D(r)] = 0. (8.2.32) We now can make the continuum form of A,(x): Aix) —~ [ 25 Hr, 6)pr + 6) = M@)-Volr) + Die): PF pr) (8.2.33) If detailed balance is true, we can rewrite, from (8.2.31) Ai(x)>V [D(r)-Fa(r)) (8.2.34) 8.2. Multivariate Master Equation Description 31 The general form, without detailed balance, can be obtained by defining Jr) = M(r) — V-D(r) (8.2.35) from (8.2.32) V-I(r) =0 (8.2.36) so that we can write Ur) =0-E(), ' (8.2.37) where E(r) is an antisymmetric tensor. Substituting, we find that by defining H(r) = Dir) + EW), (8.2.38) we have defined a nonsymmetric diffusion tensor H(r) and that Afx) + 7 [H(r)-Fo(r,t)) « (8.2.39) This means that, deterministically, 8.p(r, t) = P-LH(r)-P lr, 1)) (8.2.40) where H(r) is symmetric, if detailed balance holds. We now come to the fluctuation term, given by (8.2.14). To compute Bim(x), we first consider the limit of PY Bing f dr’ B(r, r')g(r’) (8.2.41) where J» is an arbitrary function. By similar, but much more tedious computation, we eventually find PD Bingm > —2P [D(r)e(r)-73)] (8.2.42) so that Br, 29'P: [D(r)p(r)8(r (8.2.43) The phenomenological theory of Sect. 8.1 now has a rational basis since (8.2.43) is what arises from assuming siz 8. Spatially Distributed Systems ite.) = HOP oC, 1) — Er) (8.2.44) in which Cele, DEC, > = 28(t — FDL — Pole) (8.2.45) and hence, .o(r, 1) = 0-H(r)-Vp +P -er, t). (8.2.46) This corresponds to a theory of inhomogeneous anisotropic diffusion without detailed balance. This is usually simplified by setting Din) = Dt (8.2.47) and this gives a more familiar equation. Notice that according to (8.2.45), fluctua- tions in different components of the current are in general correlated, unless D is diagonal. Comparison with Fluctuation-Dissipation Argument. The result (8.2.43) can almost be obtained from a simple fluctuation-dissipation argument in the stationary state, where we know the fluctuations are Poissonian. In that case, Be AD = KOSy, (8.2.48) corresponding to &(r, 1) = alr), pl’) = Br — Fp) - Since the theory is linear, we can apply (4.4.51) of Sect. 4.4.6. Here the matrices A and AT become A —V-H(n)-V AT 7 Hr)". (8.2.49) Thus, Bir, r') = BB™ (8.2.50) = Ao + oA* = (-7-H()-V — 9 H()-P'lelr, £')- We note that in the stationary state, (p(r)) = dr — r')} (8.2.51) = 279": [D(r)
