


Mechanical Systems and Signal Processing 52-53 (2015) 133–146

Contents lists available at ScienceDirect

Mechanical Systems and Signal Processing

journal homepage: www.elsevier.com/locate/ymssp

Bayesian system identification of a nonlinear dynamical system using a novel variant of Simulated Annealing
P.L. Green
Department of Mechanical Engineering, University of Sheffield, Mappin Street, Sheffield S1 3JD, United Kingdom

Article info

Article history:
Received 28 March 2013
Received in revised form 20 March 2014
Accepted 9 July 2014
Available online 4 August 2014

Keywords:
Bayesian model updating
Nonlinear system identification
Markov chain Monte Carlo
Simulated Annealing
Deviance Information Criterion

Abstract

This work details the Bayesian identification of a nonlinear dynamical system using a novel MCMC algorithm: ‘Data Annealing’. Data Annealing is similar to Simulated Annealing in that it allows the Markov chain to easily clear ‘local traps’ in the target distribution. To achieve this, training data is fed into the likelihood such that its influence over the posterior is introduced gradually - this allows the annealing procedure to be conducted with reduced computational expense. Additionally, Data Annealing uses a proposal distribution which allows it to conduct a local search accompanied by occasional long jumps, reducing the chance that it will become stuck in local traps. Here it is used to identify an experimental nonlinear system. The resulting Markov chains are used to approximate the covariance matrices of the parameters in a set of competing models before the issue of model selection is tackled using the Deviance Information Criterion.

© 2014 The Author. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/3.0/).

1. Introduction

This paper is concerned with the system identification of a nonlinear dynamical system using experimentally obtained
training data. A probabilistic, Bayesian approach is utilised throughout. Such an approach is now well established in the
structural dynamics community – relatively recent advances include the use of Bayesian methods in structural health
monitoring [1], modal identification [2], state-estimation [3] (through use of the particle filter), the sensitivity analysis of
large bifurcating nonlinear models [4] as well as an interesting study investigating the relations between frequentist and
Bayesian approaches to probabilistic parameter estimation [5].
The identification problem detailed herein is one of model selection as well as parameter estimation such that, using
experimental data D, one must endeavor to find the optimum model M from a set of competing model structures as well as
estimate the parameter vector θ of that particular model. Using Bayes' theorem a measure of the plausibility of a parameter
vector θ, given experimental data D and assumed model structure M, is given by
$$P(\theta \mid D, M) = \frac{P(D \mid \theta, M)\, P(\theta \mid M)}{P(D \mid M)} \tag{1}$$
where $P(\theta \mid D, M)$ is the posterior probability density function (PDF) which one wishes to evaluate, $P(D \mid \theta, M)$ is termed the likelihood, $P(\theta \mid M)$ the prior and $P(D \mid M)$ the evidence. The likelihood represents the probability that the experimental training data D was witnessed according to the model M with parameters θ. Defining the likelihood requires the selection of an error-prediction model which describes the uncertainties present in the measurement and modelling processes (see [6] for a detailed discussion of error-prediction models). The prior is a PDF which represents one's parameter estimates for model M before the training data was known. The evidence is a normalising constant which ensures that the posterior PDF integrates to one.

E-mail address: [email protected]
http://dx.doi.org/10.1016/j.ymssp.2014.07.010

This paper makes two main contributions. Firstly, a novel variant of Simulated Annealing (referred to as Data Annealing)
is proposed and applied to a real system identification problem. It is shown to be computationally cheap and easy to tune.
Secondly, it is shown that the issue of model selection of a real nonlinear dynamical system can be addressed using the
Deviance Information Criterion (DIC). For the sake of readability the remainder of the introduction is split into two sections.
The first outlines the motivation for the Data Annealing algorithm while the second focuses on the issue of model selection.

1.1. Motivation for the data annealing algorithm

For the case where one is attempting to identify $N_D$ parameters (such that $\theta \in \mathbb{R}^{N_D}$), the evidence is given by
$$P(D \mid M) = \int \cdots \int P(D \mid \theta, M)\, P(\theta \mid M)\, d\theta_1 \cdots d\theta_{N_D}. \tag{2}$$

This integral is usually intractable and its multidimensional nature makes it too computationally expensive to evaluate numerically (if $N_D > 2$). Relatively early papers such as [7] made use of the property that the maximum a posteriori (MAP) parameter vector remains the same regardless of whether the posterior distribution has been normalised such that, through locating the MAP, a Taylor series expansion of the log posterior could be used to approximate the posterior PDF as a Gaussian (see footnote 1). Since then, an increase in computing power has allowed the adoption of Markov chain Monte Carlo (MCMC) methods. These involve the creation of an ergodic Markov chain whose stationary distribution is equal to the posterior PDF such that, once converged, the Markov chain is generating samples from $P(\theta \mid D, M)$ (see [9] for more information on the convergence of Markov chains). This can be achieved without having to evaluate the evidence term. While many MCMC methods are available in the literature (Hamiltonian Monte Carlo for example [10]), by far the most popular is the Metropolis algorithm. Although well-established, a brief description of the Metropolis algorithm is given here as it helps to establish the motivation for the Data Annealing algorithm presented in Section 2 of this work.
Essentially, the aim of MCMC methods is to generate a sequence of samples $\{\theta^{(1)}, \theta^{(2)}, \ldots\}$ from a target PDF $\pi(\theta)/Z$ (where Z is a normalising constant). In the context of this paper, $\pi(\theta)$ represents the unnormalised posterior PDF and Z represents the evidence term. Initialising the Metropolis algorithm from parameter vector $\theta^{(i)}$, a new state $\theta'$ is proposed using a user-defined proposal PDF. The proposal PDF is conditional on the current state $\theta^{(i)}$. For example, in the case where a Gaussian proposal is used then the new state is generated according to
$$\theta' \sim \mathcal{N}(\theta^{(i)}, \Sigma) \tag{3}$$
(where $\Sigma$ is a user-defined covariance matrix). The new state is then accepted with probability:
$$a = \min\left\{1, \frac{\pi(\theta')}{\pi(\theta^{(i)})}\right\}. \tag{4}$$
If accepted then $\theta^{(i+1)} = \theta'$, else $\theta^{(i+1)} = \theta^{(i)}$. This has the property that if the proposed state $\theta'$ is in a region of higher probability density than the current state then it is always accepted. However, the Markov chain is also able to move into regions of lower probability density. One of the benefits of using such an acceptance rule is that the acceptance probability a can be computed without having to evaluate the evidence term. It can be shown that such an acceptance rule allows the chain to generate samples from $\pi(\theta)$ (for more information references [8,11] are recommended).
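
As an illustration of the Metropolis scheme described above, the following is a minimal Python sketch under stated assumptions. The function name `metropolis`, the toy two-dimensional Gaussian target and the proposal covariance are hypothetical choices made purely for demonstration; they are not taken from the paper. The acceptance test is carried out on the logarithm of the target, which is equivalent to Eq. (4).

```python
import numpy as np

def metropolis(log_target, theta0, step_cov, n_samples, rng):
    """Random-walk Metropolis with a Gaussian proposal (Eqs. (3) and (4))."""
    theta = np.atleast_1d(np.asarray(theta0, dtype=float))
    log_p = log_target(theta)
    chain = np.empty((n_samples, theta.size))
    n_accepted = 0
    for i in range(n_samples):
        # Eq. (3): propose a new state centred on the current one
        proposal = rng.multivariate_normal(theta, step_cov)
        log_p_new = log_target(proposal)
        # Eq. (4) in log form: accept if ln(u) < ln pi(theta') - ln pi(theta_i)
        if np.log(rng.uniform()) < log_p_new - log_p:
            theta, log_p = proposal, log_p_new
            n_accepted += 1
        chain[i] = theta  # a rejected proposal repeats the current state
    return chain, n_accepted / n_samples

# Hypothetical toy target: an unnormalised standard Gaussian in two dimensions
def log_target(theta):
    return -0.5 * np.sum(theta ** 2)

chain, acceptance_rate = metropolis(
    log_target, [3.0, -3.0], 0.5 * np.eye(2), 20000, np.random.default_rng(0)
)
```

Note that the normalising constant Z never appears: only differences of the log target are required.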
The advantages of using MCMC are numerous. Recalling that the purpose of system identification is usually to establish a reliable model which can be used to accurately and robustly predict the system's future response then, using the notation outlined in [12], one may want to predict a structural quantity of interest $h(\theta)$ using
$$R = \int \cdots \int h(\theta)\, P(\theta \mid D, M)\, d\theta_1 \cdots d\theta_{N_D}. \tag{5}$$

While evaluating Eq. (5) is difficult (for the same reason it is difficult to evaluate the evidence term), if one has used an MCMC algorithm to generate samples $\{\theta^{(1)}, \ldots, \theta^{(M)}\}$ from the posterior parameter distribution then Eq. (5) can be approximated by
$$R \approx \frac{1}{M} \sum_{i=1}^{M} h\left(\theta^{(i)}\right). \tag{6}$$

Additionally, it has been shown that important information with regard to parameter correlations can be realised through the use of MCMC methods [13] (this is also demonstrated in Section 4 of the present work). However, MCMC also has its disadvantages. Before samples from the target distribution can be drawn in an effective manner, the Markov chain must converge on the globally optimum region of the parameter space. This region can be difficult to locate as it is often very concentrated relative to the size of one's prior distribution. Additionally, the Markov chain may become ‘stuck’ in a region of probability density which is not the global optimum. Throughout this paper these regions are referred to as ‘local traps’.

Footnote 1: For more information the reader may wish to consult the description of the Laplace approximation given in reference [8].

The issue of local trapping led to the development of the Simulated Annealing algorithm [14]. This involves the introduction of a factitious temperature variable T (see footnote 2) such that, at high temperatures, the Markov chain is able to easily travel over local traps in the parameter space. The temperature variable is then reduced such that the fine details of the target distribution are gradually introduced; this is demonstrated graphically for a bimodal target PDF in Fig. 1 (where $\pi_T$ represents one's target distribution at temperature T). The rate at which T is reduced is commonly referred to as the annealing schedule.
Although this does not guarantee that the chain will converge on the optimum region of parameter space, Simulated
Annealing has been established as a reliable optimisation algorithm. Soon after it was introduced several variants of
Simulated Annealing were proposed [15,16] in which the spread of the proposal PDF is initially set to be large but then
reduces with temperature T (at a user-defined rate), thus encouraging the Markov chain to make large jumps at higher
temperatures but conduct a more local search at lower temperatures.
When applied to Bayesian inference, the variable T can be introduced such that it controls the influence of the likelihood
on the posterior:
$$\pi_T(\theta) \propto P(D \mid \theta, M)^{T}\, P(\theta \mid M). \tag{7}$$
Through using Eq. (7) as one's target distribution and defining an annealing schedule where T varies monotonically between
0 and 1, a gradual transition between the prior and posterior distribution can be realised. This concept was utilised in
[12,17,18] where, by exploiting this gradual transition from prior to posterior, MCMC algorithms were developed which can
be used to sample from posterior parameter distributions with complex geometries (where multiple, or even a continuum
of optimum parameter vectors exist).
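
To make the role of T in Eq. (7) concrete, the sketch below evaluates the logarithm of the tempered target for a hypothetical one-dimensional problem. The bimodal likelihood, the broad Gaussian prior and the helper `barrier` are illustrative assumptions only. The quantity reported is the difference in log density between a mode and the valley separating the two modes, which a Markov chain must overcome to move between them.

```python
import numpy as np

def log_tempered_target(theta, log_likelihood, log_prior, T):
    """Logarithm of Eq. (7), up to an additive constant."""
    return T * log_likelihood(theta) + log_prior(theta)

# Hypothetical bimodal likelihood with narrow modes at theta = -2 and theta = +2
def log_likelihood(theta):
    return np.logaddexp(-0.5 * ((theta - 2.0) / 0.2) ** 2,
                        -0.5 * ((theta + 2.0) / 0.2) ** 2)

# Hypothetical broad Gaussian prior
def log_prior(theta):
    return -0.5 * (theta / 5.0) ** 2

def barrier(T):
    """Log-density difference between a mode (theta = 2) and the valley (theta = 0)."""
    at_mode = log_tempered_target(2.0, log_likelihood, log_prior, T)
    at_valley = log_tempered_target(0.0, log_likelihood, log_prior, T)
    return at_mode - at_valley

barriers = {T: barrier(T) for T in (0.0, 0.1, 1.0)}
```

At T = 0 the target reduces to the prior and there is no valley to cross; as T approaches 1 the barrier grows to that of the full posterior.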
The performance of any Simulated Annealing algorithm will be sensitive to the choice of annealing schedule –
annealing too fast places one at risk of becoming stuck in a local trap (such that a long time is required for the
Markov chain to converge to its stationary distribution) while annealing too slowly will prove to be computationally
expensive. It is possible to overcome this issue through the use of ‘adaptive’ annealing schedules such as those proposed
in [17–19].
While the afore-mentioned algorithms are undoubtedly powerful, they can prove to be computationally expensive. One
of the main aims of the current paper is to present a relatively cheap annealing algorithm which, within the context of
Bayesian inference, can be applied to computationally demanding models.

1.2. Model selection

The issue of model selection occurs when one must choose from a variety of competing model structures. This is
complicated by the fact that models with more parameters will likely be able to better replicate some training data than
models with fewer parameters. Consequently, if one judges models simply on their ability to replicate training data, then the
most complex of the competing structures will always be accepted. Models which are overly complex for the problem at
hand are referred to as overfitted. Such models are often poor representations of the physics involved in the system of
interest and, as a result, are poorly suited to making future predictions.
For a scenario where different model structures are available, the probability that the model $M_i$ is suitable given the data D can also be written using Bayes' theorem:
$$P(M_i \mid D) = \frac{P(D \mid M_i)\, P(M_i)}{P(D)} \tag{8}$$
thus allowing one to write the relative probability of two different models, given data D, as
$$\frac{P(M_i \mid D)}{P(M_j \mid D)} = \frac{P(D \mid M_i)\, P(M_i)}{P(D \mid M_j)\, P(M_j)} \tag{9}$$

where $P(M_i)$ and $P(M_j)$ represent one's prior beliefs in the suitability of each model (typically set equal to one another) and $P(D \mid M)$ is the evidence term in Eq. (1). It is possible to show that the Bayesian approach to model selection automatically prevents overfitting (see [8,20] for more information). However, as was described in the previous section, the evidence term is difficult to evaluate. As a result, one may instead choose to use a different model selection paradigm which is easier to evaluate than Eq. (9) but also retains the same model selection properties. In this work the Deviance Information Criterion (DIC) [21] is used as a model selection criterion.
Before describing the Deviance Information Criterion (DIC) it is convenient to first define the deviance:
$$D(\theta) = -2 \ln P(D \mid \theta, M) \tag{10}$$
where, as stated previously, $P(D \mid \theta, M)$ is the likelihood. The expected deviance $E[D(\theta)]$ is a measure of how well the model structure M fits the data (as the parameter vector has been marginalised). The DIC is then defined as
$$\mathrm{DIC} = 2E[D(\theta)] - D(\hat{\theta}), \tag{11}$$
where
$$E[D(\theta)] = \int P(\theta \mid D, M)\, D(\theta)\, d\theta \tag{12}$$
and
$$\hat{\theta} = E[P(\theta \mid D, M)] = \int P(\theta \mid D, M)\, \theta\, d\theta \tag{13}$$
such that the ‘best’ estimate parameters ($\hat{\theta}$) are defined as the expected value of the posterior parameter distribution. Essentially, the lower the DIC, the more favourable the model. It also has the desired property that it rewards model fidelity while penalising model complexity (see reference [22] for a more detailed discussion).

Fig. 1. Graphical example of simulated annealing when $\theta \in \mathbb{R}^1$. [Figure omitted: $\pi_T(\theta)$ plotted against θ and T.]

Footnote 2: The phrases ‘annealing’ and ‘temperature’ are used as the Simulated Annealing algorithm was originally developed by drawing analogies with statistical physics [14]. The relations between Bayesian inference and statistical physics are discussed in [11].

The DIC lends itself well to situations where one has sampled from the posterior parameter distribution using MCMC as, using the successive parameter vectors realised by the MCMC algorithm $\{\theta^{(1)}, \theta^{(2)}, \ldots, \theta^{(M)}\}$, the optimum parameter vector $\hat{\theta}$ can be approximated by
$$\hat{\theta} \approx \frac{1}{M} \sum_{i=1}^{M} \theta^{(i)} \tag{14}$$

while the expected deviance can also be approximated by
$$E[D(\theta)] \approx \frac{1}{M} \sum_{i=1}^{M} D\left(\theta^{(i)}\right) \tag{15}$$

thus allowing one to approximate the DIC. While this has been applied to synthetic data in [13], the current work
demonstrates its application to real experimentally obtained data.
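
The sample-based approximation of the DIC can be sketched in a few lines of Python. The example below assumes the deviance is defined with the usual factor of −2 on the log-likelihood; the function name, the toy Gaussian-mean problem and the synthetic "posterior samples" are hypothetical and serve only to show how Eqs. (11), (14) and (15) fit together. The second returned value, the difference between the expected deviance and the deviance at the mean parameter vector, is commonly interpreted as an effective number of parameters.

```python
import numpy as np

def deviance_information_criterion(log_likelihood, samples):
    """Approximate the DIC from posterior samples (Eqs. (11), (14), (15))."""
    samples = np.asarray(samples, dtype=float)
    deviances = np.array([-2.0 * log_likelihood(theta) for theta in samples])
    expected_deviance = deviances.mean()                 # Eq. (15)
    theta_hat = samples.mean(axis=0)                     # Eq. (14)
    deviance_at_mean = -2.0 * log_likelihood(theta_hat)
    dic = 2.0 * expected_deviance - deviance_at_mean     # Eq. (11)
    p_d = expected_deviance - deviance_at_mean           # complexity penalty term
    return dic, p_d

# Hypothetical toy problem: unknown mean of Gaussian data with unit variance
rng = np.random.default_rng(1)
data = rng.normal(0.5, 1.0, size=50)

def log_likelihood(theta):
    return -0.5 * np.sum((data - theta[0]) ** 2) - 0.5 * data.size * np.log(2 * np.pi)

# Under a flat prior the posterior of the mean is Gaussian; it is sampled
# directly here as a stand-in for the output of an MCMC run.
posterior_samples = rng.normal(data.mean(), 1.0 / np.sqrt(data.size), size=(4000, 1))
dic, p_d = deviance_information_criterion(log_likelihood, posterior_samples)
```

For this one-parameter toy model the penalty term should be close to one, which is a convenient sanity check when adapting the function to other models.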
The paper is organised as follows. In Section 2 the novel annealing algorithm is presented. In Section 3 the experimental
system of interest is described. In Section 4 the results of the new annealing algorithm are analysed. This includes an
analysis of the parameter correlations and predictive capabilities of competing model structures. The issue of model
selection is then addressed using the Deviance Information Criterion (DIC). Section 5 is concerned with presenting possible
future work while the conclusions are presented in Section 6.

2. Data annealing

As stated in the previous section, MCMC methods can be used to generate samples from an unnormalised target PDF $\pi(\theta)$. In the context of this paper the target PDF is given by
$$\pi(\theta) = P(D \mid \theta, M)\, P(\theta \mid M). \tag{16}$$

In practice it is usually desirable to evaluate the logarithm of the target PDF:
$$\ln(\pi(\theta)) = \ln(P(D \mid \theta, M)) + \ln(P(\theta \mid M)) \tag{17}$$
as, by first finding $\ln a = \ln \pi(\theta') - \ln \pi(\theta^{(i)})$ before evaluating $a = \exp(\ln a)$, one can often avoid numerical overflow/underflow issues when calculating Eq. (4).
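
The numerical issue can be reproduced with a short experiment. In the sketch below the residuals, the noise level and the uniform prior (which cancels in the ratio) are hypothetical values chosen for illustration. Forming the ratio of two likelihoods directly fails in double precision, whereas the log-domain calculation remains well behaved.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 1e-3                                         # hypothetical noise standard deviation
resid_current = rng.normal(0.0, sigma, size=3000)    # hypothetical model residuals
resid_proposed = 1.01 * resid_current                # slightly worse fit

def likelihood(resid):
    # Product of 3000 Gaussian densities: overflows in double precision here
    return np.prod(np.exp(-0.5 * (resid / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma))

def log_likelihood(resid):
    # Sum of log densities: remains finite
    return np.sum(-0.5 * (resid / sigma) ** 2 - np.log(np.sqrt(2 * np.pi) * sigma))

with np.errstate(over="ignore", invalid="ignore"):
    naive_ratio = likelihood(resid_proposed) / likelihood(resid_current)

log_a = log_likelihood(resid_proposed) - log_likelihood(resid_current)
a = np.exp(min(0.0, log_a))   # equals min(1, exp(ln a)) without overflow in exp
```

Clipping the log ratio at zero before exponentiating is a small design choice that also guards against overflow when the proposed state is much more probable than the current one.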
For the case where there are N measurements in the training data:
$$D = \{D_1, \ldots, D_N\} \tag{18}$$
then, assuming that each measurement is mutually independent, the likelihood is given by
$$P(D \mid \theta, M) = \prod_{i=1}^{N} P(D_i \mid \theta, M). \tag{19}$$

In the case investigated here the training data D consists of a vector of inputs $\{y_1, y_2, \ldots, y_N\}$ and a vector of measured outputs $\{x_1, x_2, \ldots, x_N\}$ (the physical meaning of x and y is discussed in Section 3). Using a Gaussian error-prediction model allows the likelihood to be written as
$$P(D \mid \theta, M) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{1}{2\sigma^2}\left(x_i - \hat{x}_i(\theta)\right)^2\right) \tag{20}$$

where $\hat{x}_i(\theta)$ represents the response of the model with parameters θ and $\sigma^2$ is the likelihood variance (which can be treated as another parameter to be found). Consequently, a single evaluation of the likelihood requires the simulation of N data points. It is suggested here that, rather than using T to control the influence of the likelihood on the posterior (as with Simulated Annealing), a similar effect can be achieved by varying the amount of data used in the likelihood. In other words, it is possible to increase the influence of the likelihood through the introduction of additional data points into D. The rate at which the data points are introduced can be controlled according to a user-defined schedule; this is conceptually similar to the annealing schedule used in Simulated Annealing. The major advantage of this method is that it is computationally fast: in the early stages of the algorithm relatively few points need to be simulated by the model per evaluation of the likelihood. Throughout the current work this method is referred to as Data Annealing. It should be noted that the concept of annealing through the gradual addition of data points in the likelihood was proposed but not actually implemented in [12].
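
A possible way of coding the data-annealed likelihood is sketched below. It evaluates the logarithm of Eq. (20) using only the first `n_points` measurements, so that the model need only be simulated over that portion of the input. The function names and the static-gain `simulate` stand-in are hypothetical; in practice `simulate` would integrate the equation of motion of the model under consideration.

```python
import numpy as np

def data_annealed_log_likelihood(theta, sigma, y, x, simulate, n_points):
    """Log of Eq. (20) restricted to the first n_points data points."""
    x_hat = simulate(theta, y[:n_points])      # only n_points outputs are simulated
    resid = x[:n_points] - x_hat
    return (-n_points * np.log(np.sqrt(2 * np.pi) * sigma)
            - 0.5 * np.sum(resid ** 2) / sigma ** 2)

# Hypothetical stand-in for a model simulation: a static gain on the input
def simulate(theta, y):
    return theta[0] * y

rng = np.random.default_rng(2)
y = rng.normal(size=3000)                              # synthetic input
x = 0.7 * y + rng.normal(0.0, 0.1, size=3000)          # synthetic noisy output
theta = np.array([0.7])

log_l_early = data_annealed_log_likelihood(theta, 0.1, y, x, simulate, n_points=2)
log_l_full = data_annealed_log_likelihood(theta, 0.1, y, x, simulate, n_points=3000)
```

With few data points the log-likelihood surface is comparatively flat, so the prior dominates; as `n_points` grows the surface sharpens, mirroring the effect of the temperature variable in Simulated Annealing.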
As was stated in Section 1, the Metropolis algorithm requires a user-defined proposal PDF to generate candidate parameter vectors $\theta'$; this is often chosen to be a Gaussian. In the current work the proposal PDF will be denoted $q(\theta' \mid \theta^{(i)})$.
In [15] it was suggested that, to reduce the probability of the Markov chain becoming stuck in a local trap, a proposal
distribution with larger tails should be used in place of a Gaussian distribution. Specifically, it was suggested that a Cauchy
distribution could be utilised as, while it is locally similar to a Gaussian, it possesses larger tails (as shown in Fig. 2). This is
desirable as, while the resulting Markov chain will spend the majority of the time conducting a local search of the parameter
space, it will also occasionally propose relatively large jumps (thus increasing its ability to escape from local traps).
A disadvantage of this method becomes apparent when the dimension of the parameter space is greater than one as samples from the multidimensional Cauchy distribution are not uncorrelated: large jumps in one parameter will often be accompanied by large jumps in all of the other parameters [11]. In the author's opinion this seems rather restrictive. Here, it is proposed that each parameter in θ can be sampled independently from a one-dimensional Cauchy distribution such that, for parameter $\theta_n$:
$$q\left(\theta'_n \mid \theta_n^{(i)}\right) = \left[\pi \lambda_n \left(1 + \left(\frac{\theta'_n - \theta_n^{(i)}}{\lambda_n}\right)^2\right)\right]^{-1} \tag{21}$$

(where $\lambda_n$ controls the width of the distribution). Consequently, for the case where $\theta \in \mathbb{R}^{N_D}$, the complete proposal distribution is simply the product of $N_D$ Cauchy distributions:
$$q(\theta' \mid \theta^{(i)}) = \prod_{n=1}^{N_D} q(\theta'_n \mid \theta_n^{(i)}). \tag{22}$$

The result is a valid PDF which integrates to one, maintains the irreducibility of the Markov chain, allows one to perform a local search with occasional long jumps and does not have the afore-mentioned restrictive properties of the multidimensional Cauchy distribution. In fact, this property is so useful that an effective exploration of the parameter space can be achieved without having to vary the spread of the independent distributions $\{\lambda_1, \ldots, \lambda_{N_D}\}$ with annealing time; this is demonstrated in Section 4 of the current work. It should be noted that in Eq. (22) one has the option of choosing different proposal widths for different parameters. This may be advantageous when the parameters are of very different scales. However, it was found here that simply running the Data Annealing algorithm using the logarithm of the parameter vector allowed one to achieve good mixing despite using the same distribution width for each parameter.
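
The independent Cauchy proposal of Eqs. (21) and (22) can be sampled as sketched below. The function names are hypothetical, and the four-dimensional parameter vector is an arbitrary illustration. Because each component is drawn separately, a long jump in one parameter does not force long jumps in the others, and because the density depends only on the squared difference between the two states it is symmetric, so the acceptance rule of Eq. (4) can be used unchanged.

```python
import numpy as np

def propose(theta, lam, rng):
    """Draw theta' with each component sampled from its own 1D Cauchy (Eq. (22))."""
    return theta + lam * rng.standard_cauchy(size=theta.shape)

def log_proposal_density(theta_new, theta, lam):
    """Logarithm of Eq. (22): sum of the logs of the 1D densities in Eq. (21)."""
    z = (theta_new - theta) / lam
    return -np.sum(np.log(np.pi * lam * (1.0 + z ** 2)))

rng = np.random.default_rng(3)
theta = np.zeros(4)          # hypothetical current state
lam = 0.005                  # same width for every parameter
steps = np.array([propose(theta, lam, rng) - theta for _ in range(20000)])

# Half of all single-parameter steps are smaller than lam in magnitude,
# while a small fraction are very much larger (the heavy tails).
median_step = np.median(np.abs(steps))
largest_step = np.max(np.abs(steps))
```

Passing an array for `lam` would give each parameter its own width, should the parameters differ greatly in scale.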
Fig. 2. Comparison between Gaussian and Cauchy probability density functions. [Figure omitted: P(θ) plotted against θ for the two distributions.]

3. Nonlinear system

A schematic of the nonlinear dynamical system of interest is shown in Fig. 3. A ‘centre magnet’ is positioned such that it
is free to slide along an aluminium rod via a set of linear bearings. Two ‘outer magnets’ are attached to the aluminium rod –
they are positioned such that their poles oppose that of the centre magnet (thus creating a magnetic restoring force on the
centre magnet). Consequently, when excited by the shaker, the centre magnet experiences oscillatory motion relative to the
shaker table. Originally developed in the context of nonlinear energy harvesting, it is known that the magnetic restoring
force on the centre magnet can be closely approximated using a linear and cubic stiffness term (similar to the hardening
spring Duffing oscillator) [23]. As a result, the equation of motion of the system is
$$m\ddot{x} = -c\dot{z} - kz - k_3 z^3 - mg - F, \qquad z = x - y \tag{23}$$
where x is the absolute displacement of the centre magnet, y is the displacement of the shaker table, m is the mass of the
centre magnet, c is viscous damping, k is the linear stiffness, k3 is the cubic stiffness and g is gravity. The training data D is
made up of discretely sampled values of the excitation y (measured using the LVDT in Fig. 3) and of the centre magnet
response x (measured using the laser in Fig. 3). The quantity F represents the force on the centre magnet as a result of
friction effects. Three different friction models were considered. Firstly it was investigated whether the friction effects could
be modelled simply using the viscous damping term c. Secondly, the Coulomb damping model was utilised such that
$$F = F_c\, \mathrm{sgn}(\dot{z}) \tag{24}$$
where $F_c$ is a parameter to be estimated. Finally, it was hypothesised that the hyperbolic tangent model was appropriate:
$$F = F_c \tanh(\beta \dot{z}) \tag{25}$$
(where $F_c$ and β are parameters to be estimated). Throughout this paper these candidate models are referred to as the
viscous, Coulomb and hyperbolic tangent models respectively. The hyperbolic tangent model has the property that
$$\lim_{\beta \to \infty} \tanh(\beta \dot{z}) = \mathrm{sgn}(\dot{z}) \tag{26}$$

such that it is able to form a close approximation to the signum function without being discontinuous at $\dot{z} = 0$. It should be
noted that the mass of the centre magnet was measured accurately before testing and so, in the following analysis, it is not
included in the vector of parameters to be estimated.
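
The three candidate friction models and the resulting acceleration of the centre magnet can be written compactly as follows. This is a sketch only: it assumes that every force term on the right-hand side of Eq. (23) opposes the motion (and so carries a negative sign), and all numerical values, including the mass and the value of g, are hypothetical placeholders rather than identified parameters.

```python
import numpy as np

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)

def friction_force(z_dot, model, F_c=0.0, beta=1.0):
    """Friction force F for the viscous, Coulomb and hyperbolic tangent models."""
    if model == "viscous":
        return 0.0 * z_dot                   # friction represented by c alone
    if model == "coulomb":
        return F_c * np.sign(z_dot)          # Eq. (24)
    if model == "tanh":
        return F_c * np.tanh(beta * z_dot)   # Eq. (25)
    raise ValueError(f"unknown friction model: {model}")

def centre_magnet_acceleration(z, z_dot, m, c, k, k3, model, F_c=0.0, beta=1.0):
    """Absolute acceleration of the centre magnet from Eq. (23), with z = x - y."""
    F = friction_force(z_dot, model, F_c, beta)
    return (-c * z_dot - k * z - k3 * z ** 3 - m * G - F) / m

# Illustrative relative velocities and friction parameters (hypothetical values)
z_dot = np.array([-0.05, 0.0, 0.05])
coulomb = friction_force(z_dot, "coulomb", F_c=0.006)
smooth = friction_force(z_dot, "tanh", F_c=0.006, beta=1e4)
at_rest = centre_magnet_acceleration(0.0, 0.0, m=0.02, c=0.1, k=57.0, k3=1e5,
                                     model="viscous")
```

For a large value of β the two friction forces agree away from zero velocity, which is the limiting behaviour stated in Eq. (26).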
With regard to the applied excitation, a signal generator was used in conjunction with a PID controller to create a band-
limited white noise acceleration. For a more detailed discussion of this experiment (which was also developed in the context
of energy harvesting) the reader is directed towards references [24,25]. Two seconds of data measured at 1500 Hz was used
as training data (this is shown in Fig. 4).

4. Results

4.1. Markov chain Monte Carlo

Uniform (but not improper) prior distributions were used in all runs of the Data Annealing algorithm. The upper and
lower limits of the priors for each parameter are shown in Table 1. An uncorrelated Gaussian error-prediction model (as
described in Section 2) was used in the likelihood. It was assumed that the standard deviation of the likelihood (σ) was
constant throughout the experimental test. In each of the following cases the value of σ was estimated alongside the other
model parameters.
Fig. 3. Schematic of experimental apparatus. [Figure omitted: labelled components are the LVDT, laser, outer magnets, centre magnet, linear bearings, signal generator, PID controller and shaker.]

Fig. 4. Two seconds of training data. [Figure omitted: absolute displacement (m) plotted against time (s).]

For each model the Data Annealing algorithm was used to generate 50 000 samples of θ. The proposal distribution shown in Eq. (22) was used with $\lambda = 0.005$ for each parameter. For the initial sample the data D used in the likelihood consisted of 2 points ($\{y_1, y_2\}$ and $\{x_1, x_2\}$). Additional data points were then introduced into the likelihood in a linear fashion for the first 2000 samples until the data D contained 3000 values of input (y) and 3000 values of the corresponding response (x). The amount of data D was then held constant for the remaining samples. The nonstationary portion of each resulting Markov chain was removed. To increase the independence between samples only every tenth sample from the resulting Markov chain was used to approximate the marginal PDFs of the posterior distribution.
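
The linear data schedule and the subsequent removal of the nonstationary portion and thinning can be expressed as below. The function names are hypothetical, the rounding to whole data points is an assumption, and the burn-in length of 6000 samples is an illustrative value only, since the length of the discarded portion is chosen by inspecting the chain.

```python
import numpy as np

def n_data_points(i, n_start=2, n_final=3000, n_anneal=2000):
    """Number of data points used in the likelihood at sample index i (0-based)."""
    if i >= n_anneal - 1:
        return n_final
    return int(round(n_start + (n_final - n_start) * i / (n_anneal - 1)))

def burn_and_thin(chain, burn_in, thin=10):
    """Discard the first burn_in samples, then keep every thin-th sample."""
    return chain[burn_in::thin]

schedule = [n_data_points(i) for i in range(50000)]

# Placeholder chain of 50 000 samples of four parameters
chain = np.zeros((50000, 4))
retained = burn_and_thin(chain, burn_in=6000)   # hypothetical burn-in length

# Fraction of data points simulated during annealing relative to using all
# 3000 points from the first sample onwards
annealing_cost_ratio = sum(schedule[:2000]) / (2000 * 3000)
```

With this linear schedule the annealing stage requires roughly half the number of simulated points that would be needed if the full data set were used from the start.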
The resulting Markov chains and parameter histograms for the viscous damping, Coulomb and hyperbolic tangent models are shown in Figs. 5, 6 and 7 respectively. As desired, use of the Data Annealing algorithm has allowed the Markov chain to make large jumps across the parameter space during the early stages while also allowing it to conduct a more local search once the chain has become stationary. To reiterate, this was achieved without having to vary the width of the proposal density.

Table 1
Limits of uniform prior distribution.

Parameter   Prior lower limit   Prior upper limit
c           0                   0.2
F_c         0                   0.01
β           0                   1 × 10^7
k           0                   80
k_3         0                   1 × 10^7
σ           0                   0.001

Fig. 5. Results of the Data Annealing algorithm for the viscous model. The first row shows the burnt data during the annealing stage of the algorithm, the second row shows the thinned Markov chain with the burn period removed and the third row shows the resulting parameter histograms. [Figure omitted: columns show c (Ns/m), k (N/m), k_3 (N/m^3) and σ.]

With regard to Fig. 7 it should be noted that the Markov chain for the β parameter did not appear to become stationary.
This demonstrates an interesting flaw in the MCMC algorithm used in this paper: it is not clear whether the non-stationarity
of the Markov chain is a result of β being a nuisance parameter or of a poorly tuned MCMC algorithm. Upon closer inspection
it became apparent that at no point did the chain transition into a region lower than β ≈ 1000. Recalling that the hyperbolic
tangent model forms a close approximation to the Coulomb model when a large value of β is utilised allows one to
hypothesise that the Coulomb model may be more appropriate in this case (the ability of all the models to predict future
response and the issues of model selection are discussed in the subsequent sections).
One of the advantages of using MCMC methods is that one can approximate the covariance matrix of the model
parameters of a particular system. This is achieved by computing the correlation coefficients between the Markov chains of
the different parameters. The resulting covariance matrices for the viscous, Coulomb and hyperbolic tangent models are
shown in Figs. 8, 9 and 10 respectively. For all three models it is interesting to note that there appears to be a strong negative
correlation between the linear stiffness k and the nonlinear stiffness term k3. This is a relation which is possible to show
using the technique of equivalent linearisation: the situation where one is attempting to model the response of a system
with a nonlinear hardening spring as accurately as possible using an equivalent linear system. In such a case one must
compensate for the lack of a nonlinear spring term via an increase in the linear spring term (see [26] for more details). In
Figs. 9 and 10 it is also shown that there is a strong negative correlation between the viscous damping term c and Fc which
controls the magnitude of friction in the system. This indicates that one may be able to compensate for the lack of a friction
model in a linear system through an increase in viscous damping. Again, this is something which can be shown using
equivalent linearisation.
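
Given a thinned chain stored as an array with one row per sample and one column per parameter, the covariance matrix and correlation coefficients can be estimated directly. The sketch below uses a synthetic two-parameter chain with a known negative correlation as a stand-in for MCMC output; the function name and the numerical values are hypothetical.

```python
import numpy as np

def chain_covariance_and_correlation(chain):
    """Sample covariance and correlation matrices (columns are parameters)."""
    cov = np.cov(chain, rowvar=False)
    corr = np.corrcoef(chain, rowvar=False)
    return cov, corr

# Synthetic stand-in for a thinned Markov chain of two parameters
rng = np.random.default_rng(4)
true_cov = np.array([[1.0, -0.8],
                     [-0.8, 1.0]])
chain = rng.multivariate_normal([0.0, 0.0], true_cov, size=5000)

cov, corr = chain_covariance_and_correlation(chain)
```

The off-diagonal entry of `corr` recovers the negative correlation built into the synthetic samples, analogous to the relation observed between k and k3.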
Fig. 6. Results of the Data Annealing algorithm for the Coulomb model. The first row shows the burnt data during the annealing stage of the algorithm, the second row shows the thinned Markov chain with the burn period removed and the third row shows the resulting parameter histograms. [Figure omitted: columns show c (Ns/m), F_c (N), k (N/m), k_3 (N/m^3) and σ.]

Fig. 7. Results of the Data Annealing algorithm for the hyperbolic tangent model. The first row shows the burnt data during the annealing stage of the algorithm, the second row shows the thinned Markov chain with the burn period removed and the third row shows the resulting parameter histograms. [Figure omitted: columns show c (Ns/m), F_c (N), β, k (N/m), k_3 (N/m^3) and σ.]

4.2. Response predictions

Having obtained probabilistic estimates for the parameters, each model was used to predict the response of the system to
59 seconds of a new excitation (which was part of a different set of experimental data). This data set will be denoted Dnew to
distinguish it from the training data D. As stated in [20], the Theorem of Total Probability can be used to obtain probabilistic
estimates of Dnew :
$$P(D_{new} \mid D, \mathcal{M}) = \int P(D_{new} \mid D, \theta, \mathcal{M})\, P(\theta \mid D, \mathcal{M})\, d\theta \qquad (27)$$

Fig. 8. Covariance matrix for the viscous model.

Fig. 9. Covariance matrix for the Coulomb model.

$$\approx \frac{1}{M} \sum_{i=1}^{M} P\left(D_{new} \mid \theta^{(i)}, \mathcal{M}\right) \qquad (28)$$
where $\theta^{(i)}$, $i = 1, \ldots, M$, are the posterior samples generated by the Data Annealing algorithm.
An alternative method was suggested in [13] where, to account for the assumption that the system parameters are time-
independent, one could sample a new parameter vector from the posterior after every time step of the
model simulation. In the current work, both methods of uncertainty propagation were investigated (using a total ensemble of
50 model predictions), although the results were found to be indistinguishable.
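The sample-based approximation of Eq. (28) can be sketched as follows. This is a minimal illustration only: `simulate` is a hypothetical stand-in (a linear oscillator integrated with a semi-implicit Euler scheme), not the models identified in this work, and the function names, the parameter layout `theta = (c, k)` and the default ensemble size of 50 are assumptions made for the example. The bounds returned here reflect parameter uncertainty only.

```python
import numpy as np


def simulate(theta, u, dt):
    """Hypothetical stand-in model: x'' + c x' + k x = u (unit mass).

    Integrated with a semi-implicit Euler scheme; theta = (c, k).
    """
    c, k = theta
    x = np.zeros(len(u))
    v = 0.0
    for n in range(1, len(u)):
        # Update velocity first, then use the new velocity for displacement.
        v += dt * (u[n - 1] - c * v - k * x[n - 1])
        x[n] = x[n - 1] + dt * v
    return x


def predictive_ensemble(samples, u, dt, n_draws=50, seed=0):
    """Propagate posterior samples through the model.

    Returns the ensemble mean and the mean -/+ 3 standard deviations.
    """
    rng = np.random.default_rng(seed)
    # Draw a subset of posterior samples without replacement.
    idx = rng.choice(len(samples), size=n_draws, replace=False)
    runs = np.array([simulate(samples[i], u, dt) for i in idx])
    mean = runs.mean(axis=0)
    std = runs.std(axis=0, ddof=1)
    return mean, mean - 3.0 * std, mean + 3.0 * std
```

Each posterior sample is weighted equally, which mirrors the 1/M factor in Eq. (28). Resampling a new parameter vector at every time step, as in [13], would require moving the draw inside the time loop of `simulate`.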
Figs. 11 and 12 show the ability of the viscous and Coulomb models to replicate one second of the experimentally
obtained response (with confidence bounds). It can be seen that both models have replicated the response of the system to a
good level of accuracy. The prediction made by the hyperbolic tangent model is not shown here as it was indistinguishable
from that of the Coulomb model. This strengthens the hypothesis that the Coulomb damping model is preferable to the
hyperbolic tangent model as it is able to generate a very similar response despite having fewer parameters.

Fig. 10. Covariance matrix for the hyperbolic tangent model.

The mean square error (MSE) between the predicted future response from each model and the measured experimental
response was calculated. This was taken over the entire 59 seconds of data. The MCMC samples realised in the previous
section were then used to calculate the Deviance Information Criterion. The results are shown in Table 2. The MSE for the
Coulomb and hyperbolic tangent models is identical (0.0085) and roughly half that of the viscous model (0.0175).
This indicates that while the inclusion of a friction model has enhanced
performance, the hyperbolic tangent model is simply acting as an approximation for the Coulomb model. This is confirmed
by the Deviance Information Criterion which indicates that the Coulomb model is the most appropriate (thus confirming
what was already suspected). For the sake of completeness, the ability of the Coulomb model to replicate the full 59 seconds
of experimental data is shown in Fig. 13.
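The two quantities compared in Table 2 can be estimated from model predictions and MCMC samples along the following lines. This is a hedged sketch: it assumes the standard deviance D(θ) = −2 ln P(D | θ, M), uses the DIC in the form 2E[D(θ)] − D(θ̂), and takes θ̂ to be the posterior mean, which may differ from the estimate used in this work. The function names are illustrative, and the plain MSE below is unnormalised, whereas the tabulated values may use a normalised definition.

```python
import numpy as np


def mse(prediction, measurement):
    """Plain (unnormalised) mean square error between two signals."""
    prediction = np.asarray(prediction, dtype=float)
    measurement = np.asarray(measurement, dtype=float)
    return float(np.mean((prediction - measurement) ** 2))


def deviance(theta, log_likelihood):
    """Deviance D(theta) = -2 ln P(D | theta, M)."""
    return -2.0 * log_likelihood(theta)


def dic(samples, log_likelihood):
    """Deviance Information Criterion estimated from posterior samples.

    Assumption: theta_hat is taken as the posterior mean of the samples.
    """
    samples = np.asarray(samples, dtype=float)
    # Monte Carlo estimate of the expected deviance E[D(theta)].
    expected_dev = np.mean([deviance(t, log_likelihood) for t in samples])
    theta_hat = samples.mean(axis=0)
    return 2.0 * expected_dev - deviance(theta_hat, log_likelihood)
```

Here `samples` is an array with one posterior sample per row and `log_likelihood` is any callable returning ln P(D | θ, M) for a single parameter vector.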

5. Discussion and future work

One of the disadvantages of Data Annealing is that, relative to algorithms such as Transitional MCMC (TMCMC) [17] and
Asymptotically Independent Markov Sampling (AIMS) [18], the user has less control over the rate at which the influence of
the likelihood is increased during the annealing process. This is because TMCMC and AIMS utilise the temperature variable
in such a way that the transition from prior to posterior can be controlled in a continuous manner. The ability to select each
temperature T from the set T ∈ [0, 1] (subject to the constraint that the sequence of temperature values must increase
monotonically from 0 to 1) essentially means that the user has an uncountably infinite set of possible annealing schedules
available to them. This flexibility is lost when utilising the Data Annealing algorithm as the transition from prior to posterior
is influenced by the sensitivity of one's parameter estimates to the introduction of a new data set. As a topic of future work
the author aims to develop a version of the Data Annealing algorithm which allows the user to have greater control over the
annealing schedule.
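The difference in control between the two kinds of schedule can be made concrete with a small sketch. It assumes a toy Gaussian likelihood and a uniform prior, and the function names are hypothetical. The temperature-based target raises the likelihood to a power T that can take any value in [0, 1], whereas the data-based target can only move in whole-data-point steps by including the first `n_points` observations.

```python
import numpy as np


def log_prior(theta, lower, upper):
    """Uniform prior between lower and upper (log density up to a constant)."""
    inside = np.all((theta >= lower) & (theta <= upper))
    return 0.0 if inside else -np.inf


def log_likelihood(theta, data, n_points=None):
    """Toy Gaussian likelihood, data ~ N(mu, sigma^2), theta = (mu, sigma).

    Only the first n_points observations are used when n_points is given.
    """
    y = data if n_points is None else data[:n_points]
    mu, sigma = theta
    n = len(y)
    return float(-0.5 * np.sum(((y - mu) / sigma) ** 2)
                 - n * np.log(sigma) - 0.5 * n * np.log(2.0 * np.pi))


def log_target_tempered(theta, data, T, lower, upper):
    """Temperature-based annealing: likelihood raised to the power T."""
    lp = log_prior(theta, lower, upper)
    if not np.isfinite(lp):
        return -np.inf
    return T * log_likelihood(theta, data) + lp


def log_target_data_annealed(theta, data, n_points, lower, upper):
    """Data-based annealing: full-strength likelihood of a data subset."""
    lp = log_prior(theta, lower, upper)
    if not np.isfinite(lp):
        return -np.inf
    return log_likelihood(theta, data, n_points) + lp
```

Both targets reduce to the prior at the start of the schedule (T = 0 or `n_points = 0`) and to the full posterior at the end (T = 1 or all data included). Only the intermediate targets differ.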
Throughout this paper the DIC was used as a model selection criterion. The disadvantage of this approach is that,
although it can be estimated using samples from the posterior, it is an ad hoc penalty term which can only be used when
each model has a single optimum parameter vector. A more complete approach would involve a variation of Data Annealing
which was also able to estimate the model evidence (Eq. (2)) (thus allowing the relative plausibility of competing model
structures to be investigated within a Bayesian framework). Consequently, for future work the author intends to investigate
whether Data Annealing can be combined with other MCMC methods which are capable of estimating the model
evidence – such methods could include Simulated Tempering [27,28], Reversible Jump MCMC [29], TMCMC [17], AIMS [18]
and Nested Sampling [30].

6. Conclusions

In this paper the system identification of an experimental nonlinear dynamical system was investigated using three
competing model structures. A new MCMC algorithm named ‘Data Annealing’ was proposed. Being conceptually similar to
Simulated Annealing, Data Annealing is designed such that, at its initial stages, the prior distribution dominates the shape of
the target distribution. This allows the Markov chain to move freely around the parameter space. Additional training data is

[Plot for Fig. 11: Absolute Displacement (m) against Time (s) from 10 s to 11 s; legend: Model, Experiment, ±3σ.]
Fig. 11. Comparison between one second of viscous model prediction (black) and one second of experimental data (grey) where dashed black lines
represent 3σ confidence bounds.

[Plot for Fig. 12: Absolute Displacement (m) against Time (s) from 10 s to 11 s; legend: Model, Experiment, ±3σ.]
Fig. 12. Comparison between one second of Coulomb model prediction (black) and one second of experimental data (grey) where dashed black lines
represent 3σ confidence bounds.

Table 2
Mean square error between model and experiment and Deviance Information
Criterion for the viscous, Coulomb and hyperbolic tangent models.

Model                 Parameter number    MSE       DIC
Viscous               3                   0.0175    1.1047 × 10^6
Coulomb               4                   0.0085    1.1449 × 10^6
Hyperbolic tangent    5                   0.0085    1.3139 × 10^5

then progressively introduced into the likelihood such that the influence of the likelihood on the posterior is gradually
increased. This computationally cheap method improves the ability of the Markov chain to converge on the globally
optimum region of the parameter space without getting stuck in ‘local traps’. Additionally, the Data Annealing algorithm
utilises a proposal distribution which allows it to conduct a local search of the parameter space accompanied by occasional
long jumps. It was shown that this proposal distribution is well suited to the problem at hand as it initially allows the
Markov chain to explore large regions of the parameter space while also being capable of providing a more local search once the
chain has converged. This was achieved without having to alter the width of the proposal distribution. Having demonstrated
the Data Annealing algorithm on a real system identification problem, the resulting Markov chains were used to extract
approximate covariance matrices for all of the models investigated, thus revealing information about parameter correlations

[Plot for Fig. 13: Absolute Displacement (m) against Time (s) in four panels covering 0 s to 60 s; legend: Model, Experiment, ±3σ.]

Fig. 13. Comparison between 59 seconds of Coulomb model prediction (black) and 59 seconds of experimental data (grey) where dashed black lines
represent 3σ confidence bounds.

induced by the data. Finally, a model selection criterion known as the Deviance Information Criterion was used to select the
most appropriate model from the set of competing structures. It was shown that the DIC can be used to identify a model
which can accurately replicate a set of training data without being overfitted (relative to the other elements in a set of user-
defined model structures).
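The combination of local moves with occasional long jumps, and the extraction of an approximate covariance matrix from the resulting chain, can be illustrated with the sketch below. It is not the Data Annealing algorithm itself: it is a plain random-walk Metropolis sampler in which a heavy-tailed Cauchy proposal is assumed for illustration, and the function names, burn-in length and thinning interval are hypothetical choices.

```python
import numpy as np


def metropolis_cauchy(log_target, theta0, scale, n_steps, seed=0):
    """Random-walk Metropolis with a fixed-width Cauchy proposal.

    Most Cauchy draws are small (local search), while the heavy tails
    occasionally produce long jumps without changing the proposal width.
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    log_p = log_target(theta)
    chain = np.empty((n_steps, theta.size))
    for i in range(n_steps):
        proposal = theta + scale * rng.standard_cauchy(theta.size)
        log_p_new = log_target(proposal)
        # The proposal is symmetric, so the acceptance ratio is the
        # ratio of target densities.
        if np.log(rng.uniform()) < log_p_new - log_p:
            theta, log_p = proposal, log_p_new
        chain[i] = theta
    return chain


def chain_covariance(chain, burn, thin):
    """Approximate parameter covariance from the retained, thinned chain."""
    return np.cov(chain[burn::thin], rowvar=False)
```

The off-diagonal entries of the matrix returned by `chain_covariance` play the role of the parameter correlations discussed above. A negative entry indicates that an increase in one parameter can be offset by a decrease in the other.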

Acknowledgements

The author would like to thank James L. Beck from the California Institute of Technology for his talk at IMAC XXXI which
inspired much of the work shown in this paper.
This work was conducted as part of an EPSRC fellowship and is also closely aligned to the EPSRC Programme Grant
‘Engineering Nonlinearity’ EP/K003836/1.

References

[1] M.W. Vanik, J.L. Beck, S.-K. Au, Bayesian probabilistic approach to structural health monitoring, J. Eng. Mech. 126 (7) (2000) 738–745.
[2] K.-V. Yuen, L.S. Katafygiotis, Bayesian fast Fourier transform approach for modal updating using ambient data, Adv. Struct. Eng. 6 (2) (2003) 81–95.
[3] J. Ching, J.L. Beck, K.A. Porter, Bayesian state and parameter estimation of uncertain dynamical systems, Probab. Eng. Mech. 21 (1) (2006) 81–96.
[4] W. Becker, K. Worden, J. Rowson, Bayesian sensitivity analysis of bifurcating nonlinear models, Mech. Syst. Signal Process. 34 (1) (2013) 57–75.
[5] S.-K. Au, Connecting Bayesian and frequentist quantification of parameter uncertainty in system identification, Mech. Syst. Signal Process. 29 (2012)
328–342.
[6] E. Simoen, C. Papadimitriou, G. Lombaert, On prediction error correlation in Bayesian model updating, J. Sound Vib. 332 (18) (2013) 4136–4152.
[7] J.L. Beck, L.S. Katafygiotis, Updating models and their uncertainties. I: Bayesian statistical framework, J. Eng. Mech. 124 (4) (1998) 455–461.
[8] D.J.C. MacKay, Information Theory, Inference and Learning Algorithms, Cambridge University Press, Cambridge, CB2 8RU, UK, 2003.
[9] J.L. Doob, Stochastic Processes, Wiley Publications in Statistics, Wiley, Oxford, OX4 2DQ, UK, 1953.
[10] S. Cheung, J.L. Beck, Bayesian model updating using hybrid Monte Carlo simulation with application to structural dynamic models with many
uncertain parameters, J. Eng. Mech. 135 (4) (2009) 243–255.
[11] R.M. Neal, Probabilistic Inference using Markov Chain Monte Carlo Methods, Technical Report, University of Toronto, 1993.
[12] J.L. Beck, S.-K. Au, Bayesian updating of structural models and reliability using Markov chain Monte Carlo simulation, J. Eng. Mech. 128 (4) (2002)
380–391.
[13] K. Worden, J.J. Hensman, Parameter estimation and model selection for a class of hysteretic systems using Bayesian inference, Mech. Syst. Signal
Process. 32 (2012) 153–169.
[14] S. Kirkpatrick, C.D. Gelatt, M.P. Vecchi, Optimization by simulated annealing, Science 220 (4598) (1983) 671–680.
[15] H. Szu, R. Hartley, Fast simulated annealing, Phys. Lett. A 122 (3–4) (1987) 157–162.
[16] L. Ingber, Very fast simulated re-annealing, Math. Comput. Modell. 12 (8) (1989) 967–973.
[17] J. Ching, Y.C. Chen, Transitional Markov chain Monte Carlo method for Bayesian model updating, model class selection, and model averaging, J. Eng.
Mech. 133 (7) (2007) 816–832.
[18] J.L. Beck, K.M. Zuev, Asymptotically independent Markov sampling: a new Markov chain Monte Carlo scheme for Bayesian inference, Int. J. Uncertain
Quantif. 3 (5) (2013).
[19] P. Salamon, J.D. Nulton, J.R. Harland, J. Pedersen, G. Ruppeiner, L. Liao, Simulated annealing with constant thermodynamic speed, Comput. Phys.
Commun. 49 (3) (1988) 423–428.
[20] M. Muto, J.L. Beck, Bayesian updating and model class selection for hysteretic structural models using stochastic simulation, J. Vib. Control 14 (1–2)
(2008) 7–34.
[21] D.J. Spiegelhalter, N.G. Best, B.P. Carlin, A. Van Der Linde, Bayesian measures of model complexity and fit, J. R. Stat. Soc.: Ser. B (Stat. Methodol.) 64 (4)
(2002) 583–639.
[22] A. Gelman, J.B. Carlin, H.S. Stern, D.B. Rubin, Bayesian Data Analysis, Chapman & Hall, CRC, Boca Raton, Florida 33431, US, 2003.
[23] B.P. Mann, N.D. Sims, Energy harvesting from the nonlinear oscillations of magnetic levitation, J. Sound Vib. 319 (1) (2009) 515–530.
[24] P.L. Green, K. Worden, K. Atallah, N.D. Sims, The effect of Duffing-type non-linearities and Coulomb damping on the response of an energy harvester to
random excitations, J. Intell. Mater. Syst. Struct. 23 (18) (2012) 2039–2054.
[25] P.L. Green, K. Worden, K. Atallah, N.D. Sims, The benefits of Duffing-type nonlinearities and electrical optimisation of a mono-stable energy harvester
under white Gaussian excitations, J. Sound Vib. 331 (20) (2012) 4504–4517.
[26] K. Worden, G.R. Tomlinson, Nonlinearity in Structural Dynamics: Detection, Identification and Modelling, Taylor & Francis, Bristol, BS1 6BE, UK, 2010.
[27] E. Marinari, G. Parisi, Simulated tempering: a new Monte Carlo scheme, EPL (Europhys. Lett.) 19 (6) (1992) 451.
[28] C.J. Geyer, E.A. Thompson, Annealing Markov chain Monte Carlo with applications to ancestral inference, J. Am. Stat. Assoc. 90 (431) (1995) 909–920.
[29] P.J. Green, Reversible jump Markov chain Monte Carlo computation and Bayesian model determination, Biometrika 82 (4) (1995) 711–732.
[30] J. Skilling, Nested sampling for general Bayesian computation, Bayesian Anal. 1 (4) (2006) 833–859.
