m Fixing broken anchor: Reminder of an inactive anchor: equilibrium distribution |
(12 intermediate revisions by 6 users not shown)
Line 1:
{{short description|Class of dependent sampling algorithms}}
{{Bayesian statistics}}
In [[statistics]], '''Markov chain Monte Carlo''' ('''MCMC''')
Markov chain Monte Carlo methods are used to study probability distributions that are too complex or too highly [[N-dimensional space|dimensional]] to study with analytic techniques alone. Various algorithms exist for constructing such Markov chains, including the [[Metropolis–Hastings algorithm]].
== Applications ==
MCMC methods are primarily used for calculating [[Numerical analysis|numerical approximations]] of [[Multiple integral|multi-dimensional integrals]], for example in [[Bayesian statistics]], [[computational physics]],<ref>{{Cite journal|last1=Kasim|first1=M.F.|last2=Bott|first2=A.F.A.|last3=Tzeferacos|first3=P.|last4=Lamb|first4=D.Q.|last5=Gregori|first5=G.|last6=Vinko|first6=S.M. | date = September 2019 |title=Retrieving fields from proton radiography without source profiles |journal=Physical Review E|volume=100|issue=3|page=033208|doi=10.1103/PhysRevE.100.033208|pmid=31639953|arxiv=1905.12934|bibcode=2019PhRvE.100c3208K|s2cid=170078861}}</ref> [[computational biology]]<ref>{{Cite journal|last1=Gupta|first1=Ankur|last2=Rawlings|first2=James B. | date = April 2014 |title=Comparison of Parameter Estimation Methods in Stochastic Chemical Kinetic Models: Examples in Systems Biology |journal=AIChE Journal|volume=60|issue=4|pages=1253–1268|doi=10.1002/aic.14409 |pmc=4946376|pmid=27429455}}</ref> and [[computational linguistics]].<ref>See Gill 2008.</ref><ref>See Robert & Casella 2004.</ref>
In Bayesian statistics,
In [[rare event sampling]], they are also used for generating samples that gradually populate the rare failure region.{{Citation needed|date=June 2021}}
Line 19 ⟶ 21:
Random walk Monte Carlo methods are a kind of random [[Computer simulation|simulation]] or [[Monte Carlo method]]. However, whereas the random samples of the integrand used in a conventional [[Monte Carlo integration]] are [[statistically independent]], those used in MCMC are [[autocorrelation|autocorrelated]]. Correlation between samples introduces the need to use the [[Markov chain central limit theorem]] when estimating the error of mean values.
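The effect of autocorrelation on error estimates can be illustrated with a short sketch. The example below is not taken from any particular library: it assumes a toy AR(1) sequence as a stand-in for MCMC output, and the function names (<code>ar1_chain</code>, <code>naive_standard_error</code>, <code>batch_means_standard_error</code>) are hypothetical. It compares the standard error computed as if the draws were independent with a batch-means estimate, one common way of applying the Markov chain central limit theorem in practice.

```python
import math
import random
import statistics


def ar1_chain(n, phi, rng):
    """Toy autocorrelated sequence standing in for MCMC output (an assumption)."""
    x = 0.0
    chain = []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        chain.append(x)
    return chain


def naive_standard_error(chain):
    """Standard error of the mean that treats the draws as independent."""
    return statistics.stdev(chain) / math.sqrt(len(chain))


def batch_means_standard_error(chain, n_batches=50):
    """Standard error of the mean estimated from non-overlapping batch means."""
    batch_size = len(chain) // n_batches
    means = [
        statistics.fmean(chain[i * batch_size:(i + 1) * batch_size])
        for i in range(n_batches)
    ]
    # When each batch is long compared with the correlation time, the batch
    # means are roughly independent, so the usual formula can be applied to them.
    return statistics.stdev(means) / math.sqrt(n_batches)


if __name__ == "__main__":
    rng = random.Random(0)
    chain = ar1_chain(50_000, phi=0.9, rng=rng)
    print("naive SE:      ", naive_standard_error(chain))
    print("batch-means SE:", batch_means_standard_error(chain))
```

For a positively correlated sequence such as this one, the batch-means estimate is expected to be noticeably larger than the naive one, which reflects the smaller effective number of independent draws.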
These algorithms create [[Markov chains]] such that they have an [[Markov chain#Steady-state analysis and limiting distributions|equilibrium distribution]]{{Broken anchor|date=2024-06-13|bot=User:Cewbot/log/20201008/configuration|target_link=Markov chain#Steady-state analysis and limiting distributions|reason= The anchor (Steady-state analysis and limiting distributions) [[Special:Diff/970694186|has been deleted]].}} which is proportional to the function given.
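A minimal sketch of this idea is a random-walk Metropolis sampler, which needs the target function only up to a constant factor. The target <code>unnormalized_density</code>, the step size and the function names below are illustrative assumptions, not part of any specific package.

```python
import math
import random


def unnormalized_density(x):
    """Hypothetical target known only up to a constant: two Gaussian bumps."""
    return math.exp(-0.5 * (x + 2.0) ** 2) + math.exp(-0.5 * (x - 2.0) ** 2)


def random_walk_metropolis(f, n_samples, rng, x0=0.0, step=2.5):
    """Random-walk Metropolis chain whose equilibrium density is proportional to f."""
    x, fx = x0, f(x0)
    chain = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        f_prop = f(proposal)
        # Accept with probability min(1, f_prop / fx); only the ratio matters,
        # so the normalising constant of f never has to be computed.
        if rng.random() * fx < f_prop:
            x, fx = proposal, f_prop
        chain.append(x)
    return chain


if __name__ == "__main__":
    rng = random.Random(0)
    chain = random_walk_metropolis(unnormalized_density, 40_000, rng)
    print("sample mean:", sum(chain) / len(chain))
```

Because the proposal is symmetric, the acceptance rule reduces to a ratio of function values; an asymmetric proposal would require the full Metropolis–Hastings correction.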
==Reducing correlation==
Line 30 ⟶ 32:
**[[Gibbs sampling]]: When the target distribution is multi-dimensional, the Gibbs sampling algorithm<ref>{{Cite journal |last1=Geman |first1=Stuart |last2=Geman |first2=Donald |date=November 1984 |title=Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images |url=https://ieeexplore.ieee.org/document/4767596 |journal=IEEE Transactions on Pattern Analysis and Machine Intelligence |volume=PAMI-6 |issue=6 |pages=721–741 |doi=10.1109/TPAMI.1984.4767596 |pmid=22499653 |s2cid=5837272 |issn=0162-8828}}</ref> updates each coordinate from its full [[conditional distribution]] given the other coordinates. Gibbs sampling can be viewed as a special case of the Metropolis–Hastings algorithm with acceptance rate uniformly equal to 1. When drawing from the full conditional distributions is not straightforward, other samplers-within-Gibbs are used (e.g., see <ref>{{Cite journal|title = Adaptive Rejection Sampling for Gibbs Sampling|journal = Journal of the Royal Statistical Society. Series C (Applied Statistics)|date = 1992-01-01|pages = 337–348|volume = 41|issue = 2|doi = 10.2307/2347565|first1 = W. R.|last1 = Gilks|first2 = P.|last2 = Wild|jstor=2347565}}</ref><ref>{{Cite journal|title = Adaptive Rejection Metropolis Sampling within Gibbs Sampling|journal = Journal of the Royal Statistical Society. Series C (Applied Statistics)|date = 1995-01-01|pages = 455–472|volume = 44|issue = 4|doi = 10.2307/2986138|first1 = W. R.|last1 = Gilks|first2 = N. G.|last2 = Best|author2-link= Nicky Best |first3 = K. K. C.|last3 = Tan|jstor=2986138}}</ref>). Gibbs sampling is popular partly because it does not require any 'tuning'. The algorithmic structure of Gibbs sampling closely resembles that of coordinate ascent variational inference, in that both algorithms use the full conditional distributions in their updating procedures.<ref>{{Cite journal |last=Lee|first=Se Yoon| title = Gibbs sampler and coordinate ascent variational inference: A set-theoretical review|journal=Communications in Statistics - Theory and Methods|year=2021|volume=51 |issue=6 |pages=1–21|doi=10.1080/03610926.2021.1921214|arxiv=2008.01006|s2cid=220935477}}</ref>
** [[Metropolis-adjusted Langevin algorithm]] and other methods that rely on the gradient (and possibly second derivative) of the log target density to propose steps that are more likely to be in the direction of higher probability density.<ref>See Stramer 1999.</ref>
**[[Pseudo-Marginal Metropolis–Hastings algorithm|Pseudo-marginal Metropolis–Hastings]]: This method replaces the evaluation of the density of the target distribution with an unbiased estimate and is useful when the target density is not available analytically, e.g. in [[latent variable model]]s.
* [[Slice sampling]]: This method depends on the principle that one can sample from a distribution by sampling uniformly from the region under the plot of its density function. It alternates uniform sampling in the vertical direction with uniform sampling from the horizontal 'slice' defined by the current vertical position.
* [[Multiple-try Metropolis]]: This method is a variation of the Metropolis–Hastings algorithm that allows multiple trials at each point. By making it possible to take larger steps at each iteration, it helps address the curse of dimensionality.
* [[Reversible-jump]]: This method is a variant of the Metropolis–Hastings algorithm that allows proposals that change the dimensionality of the space.<ref>See Green 1995.</ref> Markov chain Monte Carlo methods that change dimensionality have long been used in [[statistical physics]] applications, where for some problems a distribution that is a [[grand canonical ensemble]] is used (e.g., when the number of molecules in a box is variable). The reversible-jump variant is also useful when doing Markov chain Monte Carlo or Gibbs sampling over [[nonparametric]] Bayesian models such as those involving the [[Dirichlet process]] or [[Chinese restaurant process]], where the number of mixture components, clusters, etc. is automatically inferred from the data.
* [[Hamiltonian Monte Carlo|Hamiltonian (or hybrid) Monte Carlo]] (HMC): Tries to avoid random walk behaviour by introducing an auxiliary [[momentum]] vector and implementing [[Hamiltonian dynamics]], so that the potential energy function is the negative logarithm of the target density. The momentum samples are discarded after sampling. As a result, proposals move across the sample space in larger steps; the samples are therefore less correlated and converge to the target distribution more rapidly.
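As an illustration of the Gibbs sampling entry in the list above, the following sketch draws from a standard bivariate normal distribution with correlation <code>rho</code>, a textbook case in which both full conditional distributions are themselves normal and can be sampled directly. The function name and the burn-in length are assumptions made for the example.

```python
import math
import random


def gibbs_bivariate_normal(n_samples, rho, rng, burn_in=500):
    """Gibbs sampler for a standard bivariate normal with correlation rho."""
    # For this target, x | y ~ N(rho * y, 1 - rho^2) and symmetrically for y | x.
    cond_sd = math.sqrt(1.0 - rho * rho)
    x, y = 0.0, 0.0
    samples = []
    for i in range(n_samples + burn_in):
        # Update each coordinate in turn from its full conditional distribution.
        x = rng.gauss(rho * y, cond_sd)
        y = rng.gauss(rho * x, cond_sd)
        if i >= burn_in:
            samples.append((x, y))
    return samples


if __name__ == "__main__":
    rng = random.Random(0)
    draws = gibbs_bivariate_normal(20_000, rho=0.8, rng=rng)
    print("first draws:", draws[:3])
```

Every update is accepted, which is consistent with viewing Gibbs sampling as a Metropolis–Hastings scheme with acceptance rate 1. No step size needs tuning, although successive draws become more strongly correlated as <code>rho</code> approaches ±1.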
=== Interacting particle methods ===
Interacting MCMC methodologies are a class of [[mean-field particle methods]] for obtaining [[Pseudo-random number sampling|random samples]] from a sequence of probability distributions with an increasing level of sampling complexity.<ref name="dp13">{{cite book|last = Del Moral|first = Pierre|title = Mean field simulation for Monte Carlo integration|year = 2013|publisher = Chapman & Hall/CRC Press |url = http://www.crcpress.com/product/isbn/9781466504059|pages = 626}}</ref> These probabilistic models include path space state models with increasing time horizon, posterior distributions with respect to a sequence of partial observations, increasing constraint level sets for conditional distributions, decreasing temperature schedules associated with some Boltzmann–Gibbs distributions, and many others. In principle, any Markov chain Monte Carlo sampler can be turned into an interacting Markov chain Monte Carlo sampler. These interacting Markov chain Monte Carlo samplers can be interpreted as a way to run a sequence of Markov chain Monte Carlo samplers in parallel. For instance, interacting [[simulated annealing]] algorithms are based on independent Metropolis–Hastings moves interacting sequentially with a selection-resampling type mechanism. In contrast to traditional Markov chain Monte Carlo methods, the precision parameter of this class of interacting Markov chain Monte Carlo samplers is ''only'' related to the number of interacting Markov chain Monte Carlo samplers. These advanced particle methodologies belong to the class of Feynman–Kac particle models,<ref name="dp04">{{cite book|last = Del Moral|first = Pierre|title = Feynman–Kac formulae. Genealogical and interacting particle approximations|year = 2004|publisher = Springer |url = https://www.springer.com/mathematics/probability/book/978-0-387-20268-6|pages = 575}}</ref><ref name="dmm002">{{cite book|last1 = Del Moral|first1 = Pierre|last2 = Miclo|first2 = Laurent|contribution = Branching and Interacting Particle Systems Approximations of Feynman-Kac Formulae with Applications to Non-Linear Filtering|title=Séminaire de Probabilités XXXIV |editor=Jacques Azéma |editor2=Michel Ledoux |editor3=Michel Émery |editor4=Marc Yor|series = Lecture Notes in Mathematics|date = 2000|volume = 1729|pages = 1–145|url = http://archive.numdam.org/ARCHIVE/SPS/SPS_2000__34_/SPS_2000__34__1_0/SPS_2000__34__1_0.pdf|doi = 10.1007/bfb0103798|isbn = 978-3-540-67314-9}}</ref> also called Sequential Monte Carlo or [[particle filter]] methods in [[Bayesian inference]] and [[signal processing]] communities.<ref name=":3">{{Cite journal|title = Sequential Monte Carlo samplers | doi=10.1111/j.1467-9868.2006.00553.x|volume=68|issue = 3|year=2006|journal=Journal of the Royal Statistical Society. Series B (Statistical Methodology)|pages=411–436 | last1 = Del Moral | first1 = Pierre|arxiv=cond-mat/0212648| s2cid=12074789}}</ref> Interacting Markov chain Monte Carlo methods can also be interpreted as a mutation-selection [[Genetic algorithm|genetic particle algorithm]] with Markov chain Monte Carlo mutations.
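The mutation-selection interpretation can be sketched as a small tempered sequential Monte Carlo sampler. The example is a simplified illustration under stated assumptions, not the formulation of any cited reference: the Gaussian starting and target distributions, the linear temperature schedule, resampling at every step, and all function names are chosen only for the demonstration.

```python
import math
import random


def log_start(x):
    """Log-density (up to a constant) of the assumed starting law N(0, 3^2)."""
    return -0.5 * (x / 3.0) ** 2


def log_target(x):
    """Log-density (up to a constant) of the assumed target N(3, 0.5^2)."""
    return -0.5 * ((x - 3.0) / 0.5) ** 2


def log_tempered(x, beta):
    """Intermediate distribution interpolating between start (0) and target (1)."""
    return (1.0 - beta) * log_start(x) + beta * log_target(x)


def smc_sampler(n_particles, n_steps, rng, mh_moves=5, step=0.5):
    """Selection by resampling, then mutation by Metropolis-Hastings moves."""
    particles = [rng.gauss(0.0, 3.0) for _ in range(n_particles)]
    betas = [k / n_steps for k in range(n_steps + 1)]
    for beta_prev, beta in zip(betas, betas[1:]):
        # Selection: weight each particle by the ratio of successive tempered
        # densities, then resample in proportion to the weights.
        log_w = [(beta - beta_prev) * (log_target(x) - log_start(x))
                 for x in particles]
        shift = max(log_w)
        weights = [math.exp(lw - shift) for lw in log_w]
        particles = rng.choices(particles, weights=weights, k=n_particles)
        # Mutation: random-walk Metropolis moves that leave the current
        # tempered distribution invariant.
        for i in range(n_particles):
            x = particles[i]
            for _ in range(mh_moves):
                proposal = x + rng.gauss(0.0, step)
                log_ratio = log_tempered(proposal, beta) - log_tempered(x, beta)
                if rng.random() < math.exp(min(0.0, log_ratio)):
                    x = proposal
            particles[i] = x
    return particles


if __name__ == "__main__":
    rng = random.Random(0)
    final = smc_sampler(1000, 20, rng)
    print("particle mean:", sum(final) / len(final))
```

In this sketch the accuracy of the final approximation is governed mainly by the number of particles, while the Metropolis–Hastings moves serve to restore diversity after each resampling step.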
=== Quasi-Monte Carlo ===
The
== Convergence ==
Line 55 ⟶ 57:
Several software programs provide MCMC sampling capabilities, for example:
* [https://github.com/cdslaborg/paramonte ParaMonte] parallel Monte Carlo software available in multiple programming languages including [[C (programming language)|C]], [[C++]], [[Fortran]], [[MATLAB]], and [[Python (programming language)|Python]].
* Packages that use dialects of the [[Bayesian inference using Gibbs sampling|BUGS]] model language:
** [[WinBUGS]] / [[OpenBUGS]] / [https://www.multibugs.org/ MultiBUGS]
Line 64 ⟶ 65:
** [https://juliahub.com/ui/Packages/General/DynamicHMC/ DynamicHMC.jl]
** [https://github.com/madsjulia/AffineInvariantMCMC.jl AffineInvariantMCMC.jl]
** [https://github.com/probcomp/Gen.jl Gen.jl]
** and the packages in the StanJulia repository.
* [[Python (programming language)]] with the packages:
Line 121 ⟶ 123:
| isbn = 978-0-470-04609-8
}}
*Carlin, Brad; Chib, Siddhartha (1995). [https://wwwf.imperial.ac.uk/~das01/MyWeb/SCBI/Papers/CarlinChib.pdf "Bayesian Model Choice via Markov Chain Monte Carlo Methods"]. ''[[Journal of the Royal Statistical Society|Journal of the Royal Statistical Society, Series B]]'', 57(3), 473–484
*{{cite journal
| first1 = George
Line 137 ⟶ 139:
| citeseerx = 10.1.1.554.3993
}}
*{{cite journal
| first1 = Siddhartha
| last1 = Chib
| author1-link = Siddhartha Chib
| first2 = Edward
| last2 = Greenberg
| title = Understanding the Metropolis–Hastings Algorithm
| journal = The American Statistician
| volume = 49
| issue = 4
| pages = 327–335
| year = 1995
| doi = 10.1080/00031305.1995.10476177
| jstor = 2684568
}}
*{{cite journal
| first1 = A.E.
|