Non Stochastic Volatility Rev
Student's Name
Institutional Affiliation
Professor's Name
Introduction
Models with time-varying parameters are commonly divided into parameter-driven and observation-driven specifications (Koopman et al., 2016). In an observation-driven model, the current parameters are deterministic functions of lagged dependent and lagged exogenous variables. The parameters evolve randomly but are perfectly predictable one step ahead given past information. The prediction-error decomposition yields a closed-form conditional likelihood for observation-driven models (Koopman et al., 2016). This property has made the framework popular in applied econometrics and statistics, since it permits straightforward estimation procedures.
In parameter-driven models, by contrast, the parameters are themselves stochastic processes and change randomly over time. There are no closed-form analytical expressions for the likelihood function of these models, so a more computationally demanding approach to likelihood evaluation is needed; efficient simulation methods are generally required for parameter-driven specifications.
Examples of both classes range from volatility models to copula models (Koopman et al., 2016). In light of the vast amount of time and effort that has been invested in studying and implementing these specifications, it is critical to compare their out-of-sample performance (Koopman et al., 2016). Any time series model's practical usefulness hinges on robust out-of-sample performance.
The generalized autoregressive score (GAS) model belongs to the family of observation-driven models and offers considerable generality. The GAS approach updates the time-varying parameters using a scaled score vector of the model's conditional density (Blasques et al., 2018). The GAS model can be used with any observation density. Although the GARCH model belongs to the GAS class, new models such as mixed-measurement dynamic factor structures can also be developed within the GAS framework. As a natural alternative to the state-space framework, the GAS framework can accommodate a wide variety of data-generating processes (Blasques et al., 2018). In parameter-driven models, the predictive density is a mixture of observation distributions over the random time-varying parameter, whereas in observation-driven models, the forecast density is simply the observation distribution given a perfectly predictable parameter (Blasques et al., 2018). Overdispersion, heavier tails, and other such properties arise naturally in parameter-driven models.
Research Motivation
Since the financial crisis of 2008, stock risk analysis has become increasingly important. Financial institutions and regulators focus on determining the common variation in corporate failures (Blasques et al., 2018). This study develops a novel observation-driven modeling approach for mixed-measurement time series data. The framework accommodates a wide range of distributions, the possibility of missing data, and cross-sectional dependence arising from shared exposure to dynamic common components. The primary goal is to build a versatile framework for modeling the estimation, assessment, and projection of stock risk. A significant advantage of this framework is that the likelihood is available in closed form and does not need to be evaluated by simulation (Blasques et al., 2018). As a result, estimation remains computationally straightforward.
Stochastic volatility models can be used to price options. Option prices, however, may embed volatility forecasts with large standard errors. This averaging is expected to lower the standard errors in option prices, which depend on the average volatility over the life of the option contract. If the process is considered over a long horizon, volatility converges to the process's unconditional variance, which is known without error. The smoothed estimator's errors will be highly autocorrelated, and its convergence to the unconditional volatility will be slow if the persistence parameter of the volatility process is close to unity. In that case, large standard errors will be present in both the option prices and the deltas.
Stochastic volatility models are a natural alternative to GARCH-type time-varying volatility models. In SVOL models, there is a distinct error process for the conditional variance and the conditional mean. The SVOL model is based on an autoregressive lognormal process for the variance with independent innovations in the mean and variance equations. The SVOL family of models has proven more flexible than the GARCH family: an SVOL model extends naturally to conditional mean innovations with fat tails or a "leverage" effect. Gallant and coworkers have reported findings in favor of fat tails. When the mean and variance errors are negatively correlated, volatility increases (decreases) for negative (positive) shocks to the mean, an effect investigated by Black. The GARCH model and the modified GARCH of Glosten et al. (1991) are two models that capture this asymmetry. According to Dumas et al. (1998), index option prices are negatively correlated with the underlying index return.
Academic Contribution
Using volatility models to analyze financial time series data has become widespread practice in recent years, and a large literature has developed. The ARCH model is a critical tool for studying changes in variance over time. Using the ARCH process, Engle (1982) explains time-varying conditional variance in terms of past disturbances. Early empirical evidence suggests that a high ARCH order is required to capture the persistence of conditional variance; Bollerslev's (1986) GARCH model provides an answer to this problem. Excellent surveys of ARCH/GARCH models may be found in Bollerslev, Chou, and Kroner (1992), and Pantula (1986) describes maximum likelihood inference strategies for ARCH models under conditional normality. Related estimation approaches have been discussed in Mark (1988), Bodurtha and Mark (1991), and Simon (1989). In addition, Geweke (1988) established Bayesian inference procedures for ARCH models, using Monte Carlo methods to compute exact posterior distributions. Robinson (1987) uses a nonparametric method, while Gallant and Nychka (1987) and Gallant, Rossi, and Tauchen (1990) use a semiparametric approach.

Recent studies have found that a vector autoregression (VAR) that performs well for forecasting and structural modeling should allow for time variation in volatilities. According to Banbura, Giannone, and Reichlin (2010), larger systems perform better in forecasting and structural analysis. Volatility changes over time, and Primiceri (2005) emphasizes the importance of this variation. Bayesian estimation relies on the posterior density, the product of the likelihood and the prior, to accommodate both of these components (Asai & McAleer, 2009). The many parameters of a large model necessitate Bayesian shrinkage.
Literature Review
Observation-Driven Models
In observation-driven models, the time-varying parameters are functions of lagged dependent variables, serial correlation patterns, and exogenous variables (Creal et al., 2013). In this setting, parameters evolve randomly yet are perfectly predictable one step ahead given past information.
i. GARCH
According to Creal et al. (2013), heteroskedasticity means that the error variances are expected to be larger for some points or ranges of the data than for others, so the variances of the disturbances are not constant. Although ordinary least squares estimates remain unbiased in the presence of heteroskedasticity, the standard errors and confidence intervals produced by conventional techniques will be too narrow, giving a false sense of precision. Rather than treating heteroskedasticity as a problem to be fixed, ARCH and GARCH models treat it as a variance to be modeled: a prediction is produced for the variance of each error term, which is frequently of interest in the financial industry.
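As a concrete illustration of this idea, the sketch below shows a GARCH(1,1) conditional-variance recursion in Python; the function name and parameter values are hypothetical and purely illustrative, not estimates from any data discussed here.

```python
import numpy as np

def garch11_variance(returns, omega=0.05, alpha=0.08, beta=0.90):
    """Conditional variances from a GARCH(1,1) recursion:
    h_t = omega + alpha * r_{t-1}**2 + beta * h_{t-1}.
    Parameter values are illustrative only."""
    h = np.empty(len(returns))
    h[0] = np.var(returns)                      # initialize at the sample variance
    for t in range(1, len(returns)):
        h[t] = omega + alpha * returns[t - 1] ** 2 + beta * h[t - 1]
    return h

# Example with simulated data
rng = np.random.default_rng(0)
r = rng.standard_normal(1000) * 0.01
h = garch11_variance(r)
```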
A class of GARCH-X models can be defined by the return and GARCH equations. The "X" in GARCH-X refers to the exogenous variable xt. Chen et al. (2009) developed the HYBRID GARCH framework, which nests GARCH-X models and related specifications. Because xt can be viewed as a measurement of ht, we call the equation linking them the measurement equation; its most basic form is xt = ht + ut. The measurement equation is a crucial part of the model and completes it. Since rt and xt are clearly dependent, the measurement equation offers a simple way to model this relationship, and the presence of zt in the measurement equation captures this dependence, which our empirical results show to be quite strong. The Realized GARCH framework nests most, if not all, ARCH and GARCH model variants; in those nested models the measurement equation reduces to a degenerate, noise-free form.
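For concreteness, a log-linear Realized GARCH(1,1) along the lines of Hansen et al. (2010) can be sketched as follows (the symbols match the text's r_t, h_t, x_t, z_t, u_t; the Greek coefficients are generic):

\[
r_t = \sqrt{h_t}\, z_t, \qquad
\log h_t = \omega + \beta \log h_{t-1} + \gamma \log x_{t-1}, \qquad
\log x_t = \xi + \varphi \log h_t + \tau(z_t) + u_t,
\]

where z_t is an i.i.d.(0,1) return shock, u_t an i.i.d.(0, \(\sigma_u^2\)) measurement shock, and \(\tau(z)\) a leverage function. The simplest linear measurement equation mentioned above, x_t = h_t + u_t, corresponds to a levels specification without a leverage term.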
With suitable notation, the multivariate Realized GARCH model can now be introduced; it comprises the return equations, the GARCH equations, and the measurement equations. For example, under the factor structure the original measurement equation can be replaced by the simplified one. In most applications, a good model for returns matters more than a model for the realized measurements, because the latter are of secondary interest. Using transformed variance and correlation measurement equations, we can incorporate the realized measurements into the model more easily. As long as a factor structure is in place, the lower-dimensional measurement equation can be used for this purpose, making estimation more tractable.
It is important to keep in mind which measurement equations enter the likelihood in each case. To compare the overall log-likelihood of several model specifications, all models must use the same measurement equations. Different factor structures can be combined with distinct measurement equations, but if models with distinct measurement equations are compared in terms of their total likelihood, one is comparing models of objects with different dimensions; such a comparison would be like comparing apples and oranges. If the total likelihoods of different models are to be compared, a common set of measurement equations, such as the first one, must therefore be adopted. Alternatively, one can evaluate the model specifications with the partial log-likelihood for returns, that is, take the full likelihood and omit the part pertaining to the realized measurements. Since multivariate GARCH models are designed to model returns, this may be the ideal basis for comparison. The discussion below turns to forecasting and log-likelihood-based evaluation.
It is simple to forecast the return distribution from the model because all dynamic variables, namely the variances and correlations for period t+1, are given by the GARCH equations. The elements of Ht+h are not determined beyond horizon h = 1, since the future values of zt and ut cannot be predicted; nevertheless, simulation-based predictions can be obtained from this model at any forecasting horizon. Such Realized GARCH forecasting methodologies are discussed in depth by Lunde and Olesen. The bootstrap method has the additional advantage of not relying on the distributional assumptions described above. The evaluation objective is defined by the log-likelihood function for returns used in the estimation. There are many ways to evaluate the different specifications, but comparing their return log-likelihoods, which represent their ability to predict the distribution of the return vector, is an excellent starting point. Since the parameter vector is estimated from in-sample data, specifications should be evaluated and compared using the average value of ℓr,t both in and out of sample. The mean predictive log-likelihood then serves as the gain function for this type of model evaluation, which is similar to one-day-ahead density forecasting of the return vector.
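A minimal sketch of this evaluation criterion, assuming Gaussian one-step-ahead predictive densities and hypothetical inputs (a vector of realized returns and the corresponding predicted variances), is shown below.

```python
import numpy as np

def mean_predictive_loglik(returns, predicted_variances):
    """Average one-day-ahead Gaussian log-likelihood of returns given the
    predicted conditional variances; usable as a gain function for comparing
    model specifications in and out of sample."""
    ll = -0.5 * (np.log(2.0 * np.pi * predicted_variances)
                 + returns ** 2 / predicted_variances)
    return ll.mean()
```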
ii. GAS
Huang et al. (2014) assert that observation-driven models such as GAS have advantages over the alternatives. Evaluating the likelihood is straightforward, and asymmetry, long memory, and other more intricate dynamics can be introduced without additional complexity. Because it is based on the score, the GAS model exploits the full density structure rather than means and higher-order moments alone. According to Benjamin et al. (2003) and Cipollini et al. (2012), this distinguishes it from conditional mean models and vector multiplicative error models. The general formulation of the GAS model is given below.
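The intended formula is not reproduced in the text; following Creal et al. (2013), the GAS(1,1) updating equation can be written as

\[
f_{t+1} = \omega + A\, s_t + B\, f_t, \qquad
s_t = S_t \nabla_t, \qquad
\nabla_t = \frac{\partial \log p(y_t \mid f_t; \theta)}{\partial f_t},
\]

where f_t is the time-varying parameter, \(\nabla_t\) the score of the conditional observation density, and S_t a scaling matrix, commonly chosen as the inverse of the conditional Fisher information.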
Consider, as an example, a volatility model with leverage in this framework. The bivariate Gaussian distribution of zt and ut is assumed, and the leverage function d1(zt² − 1) + d2 zt introduces dependence between the return shock and the volatility shock. The joint density p(y1t, y2t) describes the distribution of y1t and y2t, and the pair (zt, ut) collects the innovations at time t. The volatility changes as a result of these innovations and its own previous value, and the exact functional form of the update follows directly from the observation density, in line with the GAS definition. The GAS model is thus an observation-driven model for latent dynamic components.
A major advantage of the GAS(p, q) specification is that it can be applied to a wide range of models and parameterizations: the recursion applies to any model with a parametric likelihood specification. The GAS model's score is scaled by the inverse of the information matrix. The identity matrix, St = I, provides a simpler scaling option; unscaled gradients are then used, so the updating step resembles a steepest-descent optimization step. Our experience has shown, however, that this updating mechanism is generally less reliable than the alternatives. For these reasons, we believe it is best to scale the score with the inverse information matrix, St = I_{t|t-1}^{-1}, rather than leaving it unscaled, and care is needed when performing the inversion if the information matrix is not of full rank or is nearly singular.
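As a sketch of how inverse-information scaling works in practice, the Python fragment below applies the GAS(1,1) recursion to a time-varying variance under a Gaussian observation density; the function and parameter names are hypothetical.

```python
import numpy as np

def gas_gaussian_volatility(y, omega=0.05, a=0.08, b=0.90):
    """Score-driven (GAS) recursion for a time-varying variance f_t with
    y_t ~ N(0, f_t) and inverse-information scaling.

    Score: (y_t**2 - f_t) / (2 * f_t**2); Fisher information: 1 / (2 * f_t**2);
    hence the scaled score is s_t = y_t**2 - f_t, and the update
    f_{t+1} = omega + a * s_t + b * f_t coincides with a GARCH(1,1)
    recursion up to a reparameterization.
    """
    f = np.empty(len(y))
    f[0] = np.var(y)
    for t in range(len(y) - 1):
        s_t = y[t] ** 2 - f[t]                   # scaled score
        f[t + 1] = omega + a * s_t + b * f[t]
    return f
```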
The GAS model has the advantage of using all of the available likelihood information: the time-varying parameter takes a scaled (local density) score step that reduces the one-step-ahead estimation error at the current observation. Despite being built on a fundamentally different paradigm, the GAS model offers a powerful and highly competitive alternative to conventional parameter-driven models, as a number of empirical and computational examples have demonstrated. Time-varying parameters for other classes of observations, as well as time-varying copula structures, offer intriguing extensions and alternative specifications.
This versatility and relevance for a wide range of models make it challenging to develop a standard set of stationarity and regularity conditions that applies to all relevant scenarios. A more promising approach may be to establish such conditions for specific subsets of GAS specifications. A second research direction is to examine the finite-sample properties of GAS models in greater detail. The statistical properties of parameter estimates for GAS models may benefit from a more comprehensive investigation than the few empirical and computational examples supplied so far. As long as the information matrix is unambiguously non-singular for all admissible parameter values, the likelihood maximization converges quickly and reliably, even when there is little or no prior information about suitable starting values. The case in which the information matrix degenerates at some points is especially pertinent here; in our opinion, some form of smoothing is essential in these situations.
Automated smoothing, in which the smoothing parameter is computed directly from the data, has also been proposed. Realized kernel estimators recover the quadratic variation of the efficient price process from high-frequency, noisy data. With the help of alternative methodologies such as subsampling and pre-averaging, we can better measure time-varying volatility and better anticipate future conditions, much as if prices were observed continuously and without measurement error.
Since realized variance (RV) is computed as a sum of squared returns, this finding suggests sampling prices as frequently as possible when calculating RV. However, market microstructure noise then introduces a bias. Awartani, Corradi, and Distaso have documented the presence of such noise in the volatility signature plots introduced by Andersen, Bollerslev, Diebold, and Labys (2000) (Barndorff‐Nielsen et al., 2009). As a result, returns are often sampled at a modest frequency, such as every five minutes, to balance the trade-off between bias and variance. Filtering techniques have also been employed in earlier studies to mitigate the noise.
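A minimal sketch of the five-minute realized variance computation, assuming a pandas Series of intraday transaction prices indexed by timestamp (a hypothetical input), is given below.

```python
import numpy as np
import pandas as pd

def daily_realized_variance(prices: pd.Series, rule: str = "5min") -> pd.Series:
    """Daily realized variance as the sum of squared log returns sampled at a
    modest frequency (e.g., five minutes) to limit microstructure-noise bias."""
    sampled = prices.resample(rule).last().dropna()      # sparse sampling
    log_returns = np.log(sampled).diff().dropna()
    return log_returns.pow(2).groupby(log_returns.index.date).sum()
```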
Parameter-Driven Models
In parameter-driven models, the parameters themselves are stochastic processes that change over time. The likelihood function of these models is not available in closed form, and efficient simulation methodologies are generally required to evaluate it. Time-varying parameters can be specified for any conditional observation density, as long as the parameter process itself is stochastic, which makes parameter-driven models applicable in a wide range of settings. By contrast, for each new observation density and parameterization, an observation-driven model requires a new updating function for the time-varying parameter, and it is often not obvious what the right function is in a given application.
According to Hansen et al. (2010), simplicity, flexibility, and robustness are the hallmarks of the multivariate factor stochastic volatility (SV) model (Abbara & Zevallos, 2019). The model decomposes the observations into a low-dimensional factor space, much like a factor model, which keeps it simple. The factors are allowed to display volatility clustering and to follow stochastic volatility processes, so the extent of volatility co-movement can itself vary over time. This makes the framework both flexible and robust.
Stochastic volatility models allow volatility to revert toward its long-run mean. They can be used to address the limitations of derivative-pricing models that assume constant volatility over a given time frame (Hansen et al., 2010), and they are used to value and manage the risk associated with derivative contracts. In an ARCH model, the variance of the error term is a function of past squared errors. Volatility clustering refers to the empirical regularity that volatile periods tend to be followed by volatile periods and calm periods by calm ones. In a generalized autoregressive (GARCH) model, for example, the error variance is a function of the previous period's squared error and the previous period's estimated variance (Hansen et al., 2010). Periods of high volatility thus tend to follow periods of high volatility, and periods of low volatility tend to follow periods of low volatility.
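To make the contrast with GARCH concrete, the sketch below simulates a basic SVOL model in which log-volatility follows a mean-reverting AR(1) process with its own innovation; the parameter values are illustrative only.

```python
import numpy as np

def simulate_svol(n=1000, mu=-9.0, phi=0.97, sigma_eta=0.15, seed=0):
    """Simulate r_t = exp(h_t / 2) * z_t with log-variance
    h_t = mu + phi * (h_{t-1} - mu) + eta_t, eta_t ~ N(0, sigma_eta**2).
    Unlike GARCH, the variance equation carries its own error term."""
    rng = np.random.default_rng(seed)
    h = np.empty(n)
    h[0] = mu
    for t in range(1, n):
        h[t] = mu + phi * (h[t - 1] - mu) + sigma_eta * rng.standard_normal()
    returns = np.exp(h / 2) * rng.standard_normal(n)
    return returns, h
```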
A state-space representation can be obtained from either the time-domain system model given by a differential (or difference) equation or from its transfer-function representation (Hansen & Lunde, 2006). This section deals with scenarios that involve the controller form, the observer form, the modal form, and the Jordan form, four state-space forms commonly employed in modern control theory and its applications. The general formula is represented below.
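The formula itself is not reproduced in the source; in standard control-theory notation, a discrete-time state-space realization of a transfer function takes the form

\[
x_{t+1} = A x_t + B u_t, \qquad y_t = C x_t + D u_t,
\]

where x_t is the state vector, u_t the input, y_t the output, and A, B, C, D the state, input, output, and feedthrough matrices; the controller, observer, modal, and Jordan forms correspond to particular structures imposed on A, B, and C.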
Depending on how the state-space model is constructed, it may or may not incorporate all of the modes of the underlying transfer function, i.e., all of its poles (before any zero-pole cancellation takes place) (Hansen & Lunde, 2006). If some of the transfer function's zeros and poles cancel, the state-space model will have a lower order, and the corresponding modes will not appear in the transition matrix.
According to Costa & Alpuim (2010), the Kalman filter, developed by Kalman in 1960, has been widely applied to the study of evolving dynamic systems. Using a set of equations called a state-space model, the technique derives estimates of unobservable variables from associated observable variables. The equations that constitute such a model are given below.
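The equations themselves did not survive in the text; a standard rendering consistent with the notation described below is

\[
(1)\quad Y_t = H_t\, b_t + e_t, \qquad e_t \sim N(0, \Sigma_e),
\]
\[
(2)\quad b_t = U\, b_{t-1} + \varepsilon_t, \qquad \varepsilon_t \sim N(0, \Sigma_\varepsilon),
\]

where the normality of the disturbances is an additional assumption made here for concreteness.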
The measurement equation (1) connects the n×1 vector of observed variables, Yt, to the m×1 vector of unobservable state variables, bt. The white-noise n×1 vector et is referred to as the measurement error, and the n×m matrix Ht contains the coefficients that relate the states to the observations (Costa & Alpuim, 2010). Equation (2), the transition or state equation, describes how the state vector bt evolves over time through the autoregressive coefficient matrix U and the covariance matrix of the state disturbance εt. The two disturbances et and εt are assumed to be uncorrelated. When the state vector reduces to a deterministic process with a constant mean, one obtains a simpler special case.
According to Wooldridge (2014), two-stage least squares (2SLS) is the most common method for estimating a linear model with one or more endogenous explanatory variables (EEVs). The limited-information maximum likelihood estimator, computed under the nominal assumption of jointly normally distributed unobservables, has better small-sample properties, however, as many authors have shown, particularly when multiple overidentifying restrictions are present. The estimator can be computed as shown below.
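The source's exact formula is not recoverable here; as an illustration, the familiar 2SLS estimator of the linear model y = Xβ + u with instrument matrix Z is

\[
\hat{\beta}_{2SLS} = \left( X' P_Z X \right)^{-1} X' P_Z\, y, \qquad
P_Z = Z \left( Z'Z \right)^{-1} Z',
\]

where X contains the explanatory variables (including the EEVs) and Z contains the instruments together with the exogenous regressors.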
Although identification and estimation in nonlinear models are notoriously challenging, solutions have been provided for specific instances, and in nonlinear settings some mix of the available approaches is often of immediate relevance (Durbin & Koopman, 2000). 2SLS can be used irrespective of the nature of the EEVs. In the second stage, however, the structural parameters and other quantities of interest, such as average partial effects, are frequently inconsistent with the values obtained in the first stage. Most of the time, two strategies are used: maximum likelihood and control function methods. Maximum likelihood requires an explicit specification of the EEV distribution and of the response distribution conditional on the EEVs (Durbin & Koopman, 2000). There are various downsides to the MLE technique, especially when dealing with binary responses: managing many EEVs can be computationally taxing, and if the distributional assumptions are erroneous, the resulting estimates are likely to be inconsistent.
In a control function approach, residuals from the first-stage estimation involving the EEVs are used in a second-stage estimation problem. Nonlinear models with cross-section data and panel data are just two of the settings in which researchers use the control function (CF) technique (Wooldridge, 2014), and Wooldridge (2014) shows that the method extends to semiparametric and nonparametric settings. According to Blundell and Powell (BP), quantities of interest need not be subject to distributional or functional-form restrictions, which means they can be identified quite generally. Average partial effects (APEs) allow for unobservables that are not independent of the external instruments, and Durbin & Koopman (2000) treat the ordinary structural coefficients and average partial effects as closely related notions. The method of inserting first-stage fitted values for the EEVs can generate consistent estimators of parameters up to a common scale factor in some cases, but the conditions under which this occurs are quite restrictive, and average partial effects cannot be easily recovered (Wooldridge, 2014). It is also difficult to test the hypothesis that the EEVs are exogenous.
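A minimal sketch of the two-step CF procedure for a linear model with a single EEV, using only numpy and hypothetical array inputs, is shown below.

```python
import numpy as np

def control_function_2step(y, x_endog, z):
    """Two-step control function estimator (sketch).
    Step 1: regress the endogenous explanatory variable on a constant and the
            instrument(s) z; keep the residuals v_hat.
    Step 2: regress y on a constant, the EEV, and v_hat; the coefficient on
            v_hat controls for (and allows a test of) endogeneity."""
    const = np.ones(len(y))
    Z = np.column_stack([const, z])
    pi_hat, *_ = np.linalg.lstsq(Z, x_endog, rcond=None)
    v_hat = x_endog - Z @ pi_hat                     # first-stage residuals

    X = np.column_stack([const, x_endog, v_hat])
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta_hat      # [intercept, EEV coefficient, CF coefficient]
```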
According to Gourieroux et al. (1993), indirect inference exploits the ease with which data can be simulated from even complex structural models. The central idea is to view both the observed data and the simulated data through the lens of an auxiliary (or instrumental) statistical model with auxiliary parameters. The structural parameters are then chosen so that, when viewed through this lens, the simulated outcomes resemble the observed data as closely as possible. To formalize these notions, suppose the actual choices {yit}, i = 1, . . . , n, t = 1, . . . , T, are generated by the structural discrete choice model specified in (2.1) for a given value β0 of the structural parameter. The auxiliary model can then be fitted to the observed values to obtain parameter estimates θ̂n.
Here the x's are observable exogenous variables, while the u's and ε's are not observed. Suppose, under assumption (A1), that (yt, xt) is a stationary Markov process and that (εt) is white noise with known distribution G0. We further assume under (A1) that (xt) is itself a homogeneous Markov process. Notably, in the parametric case it is not necessary to assume that εt is white noise with a known distribution such as the standard normal; its distribution may instead depend on an unknown parameter that can be estimated.
Because any simulated choice is a step function of the parameters, step functions appear naturally when indirect inference is applied to discrete choice models. The prototype binding function is therefore discontinuous, and so are the criterion functions of the II estimators. Because of II's discrete outcomes, gradient-based optimization methods cannot be used, leaving only a few options: derivative-free approaches, random search algorithms (such as simulated annealing), or MCMC-type methods (Gourieroux et al., 1993). MCMC, however, can deliver in finite samples an approximation that differs significantly from the optimum of the statistical criterion, and it may converge slowly. Since non-smooth criterion functions define II estimators, their use with this form of nonlinear data is extremely problematic (Gourieroux et al., 1993). Despite the challenges of applying II to discrete choice models, several authors have persisted in doing so because of the attraction of the II approach.
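The sketch below illustrates the indirect inference idea for a simple binary-choice model, using a derivative-free grid search precisely because the simulated criterion is a step function; the model, the auxiliary statistic, and all names are hypothetical.

```python
import numpy as np

def simulate_choices(beta, x, u):
    """Structural model: y = 1{x * beta + u > 0} (a step function of beta)."""
    return (x * beta + u > 0).astype(float)

def auxiliary_slope(y, x):
    """Auxiliary model: OLS slope of a linear probability regression of y on x."""
    X = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]

def indirect_inference(y_obs, x, n_sim=20, grid=None, seed=0):
    """Pick the structural beta whose simulated auxiliary slope best matches
    the slope estimated from the observed data (grid search, no gradients)."""
    grid = np.linspace(-2.0, 2.0, 401) if grid is None else grid
    rng = np.random.default_rng(seed)
    theta_hat = auxiliary_slope(y_obs, x)
    u_sim = rng.standard_normal((n_sim, len(x)))      # common random numbers
    def criterion(beta):
        sims = [auxiliary_slope(simulate_choices(beta, x, u), x) for u in u_sim]
        return (np.mean(sims) - theta_hat) ** 2
    return min(grid, key=criterion)
```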
v. Importance Sampling
The convergence of a Monte Carlo estimate can be accelerated if the samples are drawn from a distribution that resembles the function in the integrand; importance sampling exploits this fact. The underlying principle is that an accurate approximation is produced more quickly by concentrating effort where the integrand's value is relatively high. The importance-sampling estimator for such an integral is given by the formula below.
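The formula itself is missing from the text; what is being described is the generic importance-sampling estimator of an integral \(\int f(x)\,dx\),

\[
F_N = \frac{1}{N} \sum_{i=1}^{N} \frac{f(X_i)}{p(X_i)}, \qquad X_i \sim p,
\]

which is unbiased whenever the sampling density p is positive wherever f is nonzero, and whose variance shrinks as p is chosen more nearly proportional to f.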
The scattering equation from light transport provides a concrete example. Suppose a random direction is selected that is approximately perpendicular to the surface normal; the cosine term in the integrand is then close to zero. Durbin & Koopman (2000) note that evaluating the BSDF and then tracing a ray to determine the incident radiance at the sample point would then be largely wasted effort, because the contribution to the estimate would be so small. It is better to sample directions in such a way that directions close to the horizon are chosen less often. In general, efficiency improves when directions are drawn from distributions that match other factors in the integrand (the BSDF or the distribution of incoming illumination). Variance is reduced as long as the random variables are sampled from a distribution similar to the integrand.
According to Durbin & Koopman (2000), the treatment of the local level model encompasses the Kalman filter, the regression lemma, a Bayesian treatment, minimum-variance linear unbiased estimation, and smoothed state variances, among other results. In a time series, the observations are ordered sequentially from y1 to yn, and the additive decomposition yt = μt + γt + εt is the most fundamental representation of such a series. These three components are called the trend, the seasonal, and the error or disturbance, respectively; μt is the slowly changing trend component. For this model, the observation yt and the other quantities listed in (2.1) are assumed to be scalar. In many applications, especially in economics, the components are instead multiplied together (Durbin & Koopman, 2000). The resulting multiplicative model can usually be brought back to additive form by taking logarithms.
This model, despite its simplicity, is not a contrived special case; rather, it serves as a foundation for studying key practical problems in time series analysis. Deriving the relevant means, variances, and covariance matrices is in principle routine, based on properties of the multivariate normal distribution that could be applied here (Durbin & Koopman, 2000). However, when the series is long, this naïve approach to estimation becomes cumbersome; using the filtering and smoothing techniques discussed in the following sections, it can be greatly improved (Durbin & Koopman, 2000). These techniques give the same results as multivariate normal theory while providing fast computational algorithms.
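As a sketch of those fast algorithms, the fragment below runs the Kalman filter for the local level model y_t = μ_t + ε_t, μ_{t+1} = μ_t + η_t, with a large initial variance standing in for a diffuse prior; the variance inputs are assumed known.

```python
import numpy as np

def local_level_filter(y, sigma2_eps, sigma2_eta, a1=0.0, p1=1e7):
    """Kalman filter for the local level model:
        y_t      = mu_t + eps_t,  eps_t ~ N(0, sigma2_eps)
        mu_{t+1} = mu_t + eta_t,  eta_t ~ N(0, sigma2_eta)
    Returns one-step-ahead state predictions a_t and their variances p_t."""
    n = len(y)
    a = np.empty(n + 1)
    p = np.empty(n + 1)
    a[0], p[0] = a1, p1                     # diffuse-like initialization
    for t in range(n):
        f_t = p[t] + sigma2_eps             # prediction-error variance
        k_t = p[t] / f_t                    # Kalman gain
        v_t = y[t] - a[t]                   # prediction error
        a[t + 1] = a[t] + k_t * v_t
        p[t + 1] = p[t] * (1.0 - k_t) + sigma2_eta
    return a[1:], p[1:]
```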
According to Zhu & Galbraith (2010), the t distribution resembles the normal distribution curve but is somewhat shorter and thicker in the tails. The t distribution is preferred over the normal distribution when working with small samples, and it approaches the normal distribution as the sample size increases; for sample sizes greater than about 20, the two distributions are nearly identical. The density of this distribution is given below.
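The density was not reproduced in the text; the symmetric Student-t density with ν degrees of freedom, of which Zhu & Galbraith's (2010) asymmetric version is a generalization, is

\[
f(t) = \frac{\Gamma\!\left(\tfrac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\;\Gamma\!\left(\tfrac{\nu}{2}\right)}
\left(1 + \frac{t^{2}}{\nu}\right)^{-\tfrac{\nu+1}{2}}.
\]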
The t distribution (and its associated t-scores) is used in hypothesis testing to decide whether to accept or reject the null hypothesis. The center of the distribution represents the acceptance region, while the tails represent the rejection region(s); in a two-tailed test the rejection regions lie in both tails (Zhu & Galbraith, 2010). Z-scores and t-scores play analogous roles in such tests.
Gaussian Distribution
The Gaussian (normal) distribution is a probability distribution used to compare scores and to support statistical decisions involving the mean and standard deviation. Its shape implies that the bulk of the observations lie near the center of the distribution and that the frequency of observations diminishes as they deviate from the center. The normal distribution is the most common continuous probability distribution; its density expresses the relative likelihood of the random variable taking a specific value, and plotting the variable x against its density y produces the familiar graph. Normal densities have a symmetric bell shape and can have any real mean and any positive standard deviation. Normal distributions describe continuous data, which means that any value in the relevant range can occur. In the special standardized case, the normal distribution has mean zero and standard deviation one, and any normal distribution can be standardized to fit this standard normal curve. Its condensed formula is given below.
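The condensed formula referred to above is the normal density

\[
f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,
\exp\!\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right),
\]

and standardization uses z = (x − μ)/σ to map any N(μ, σ²) variable onto N(0, 1).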
The mean (μ), standard deviation (σ), and variance (σ²) parameterize this exponential function, abbreviated N(μ, σ²). With parameter values zero and one, N(0, 1) is the standard normal distribution (Giner & Smyth, 2016). The mean and the standard deviation determine the normal distribution's shape: the mean controls the location along the x-axis, while the standard deviation controls the spread. Because the mean determines the location of the distribution's peak, it is referred to as the location parameter; the parameter that determines how spread out the distribution is, is referred to as the scale parameter. The larger the variance, the broader the bell curve.
Data Cleaning
Volatility estimation from high-frequency data requires careful data cleaning, and high-frequency data cleaning has received considerable attention. Although discarding part of a large data set can improve the resulting estimates, this conclusion may initially appear counterintuitive, yet the explanation is rather simple. As a rule of thumb, an estimator that uses all available data to the fullest extent is more likely to place a high weight on noisy observations (Barndorff‐Nielsen et al., 2009). The conventional least squares estimator, for example, has lower precision when it includes noisy data: it can suffer more than it benefits from using low-quality observations, which is a reasonable analogy in the present context. Outliers can significantly affect the realized kernel and related estimators, which treat all observations equally (Barndorff‐Nielsen et al., 2009). The following steps can be followed to clean high-frequency trade data.
T1: Retain only correct trades, i.e., delete entries with a nonzero Correction Indicator (CORR ≠ 0).
T2: Delete entries with an abnormal Sale Condition, i.e., letter-coded COND transactions other than those marked "E" or "F." The TAQ 3 User's Guide gives more detail on how sale conditions are coded (Barndorff‐Nielsen et al., 2009).
T3: If there are multiple transactions with the same time stamp, use the median price.
T4: Delete entries with prices above the ask plus the bid-ask spread; treat prices below the bid minus the bid-ask spread analogously. A sketch implementing these rules is given below.
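A minimal sketch of rules T1-T4 in pandas follows; the column names (PRICE, CORR, COND, BID, ASK) are hypothetical stand-ins for whatever the data vendor provides, and the bid/ask series are assumed to have been matched to the trades beforehand.

```python
import pandas as pd

def clean_trades(trades: pd.DataFrame) -> pd.Series:
    """Apply the T1-T4 cleaning rules to a timestamp-indexed trade table."""
    t = trades.copy()
    # T1: keep only trades that were not subsequently corrected.
    t = t[t["CORR"] == 0]
    # T2: drop abnormal sale conditions, keeping blank codes and 'E'/'F'.
    t = t[t["COND"].isin(["", "E", "F"])]
    # T3: collapse trades sharing a time stamp to their median price.
    price = t.groupby(t.index)["PRICE"].median()
    bid = t.groupby(t.index)["BID"].median()
    ask = t.groupby(t.index)["ASK"].median()
    # T4: discard prices outside [bid - spread, ask + spread].
    spread = ask - bid
    keep = (price <= ask + spread) & (price >= bid - spread)
    return price[keep]
```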
Conclusion
This paper has exploratively discussed stochastic and non-stochastic volatility models. In observation-driven models, parameters are functions of lagged dependent and exogenous variables; they change randomly yet can be predicted exactly one step ahead based on prior information. In parameter-driven models, time-varying parameters can be specified for any conditional observation density, so long as the parameter process itself is stochastic.
References
Abbara, O., & Zevallos, M. (2019). A note on stochastic volatility model estimation. Brazilian
Asai, M., & McAleer, M. (2009). The structure of dynamic correlations in multivariate stochastic
Barndorff‐Nielsen, O. E., Hansen, P. R., Lunde, A., & Shephard, N. (2009). Realized kernels in
Blasques, F., Gorgi, P., Koopman, S. J., & Wintenberger, O. (2018). Feasible invertibility
Costa, M., & Alpuim, T. (2010). Parameter estimation of state-space models for univariate
Creal, D., Koopman, S. J., & Lucas, A. (2013). Generalized autoregressive score models with
Durbin, J., & Koopman, S. J. (2000). Time series analysis of non‐Gaussian observations based on state space models from both classical and Bayesian perspectives.
Giner, G., & Smyth, G. K. (2016). statmod: probability calculations for the inverse Gaussian distribution.
Gourieroux, C., Monfort, A., & Renault, E. (1993). Indirect inference. Journal of applied
econometrics, 8(S1), S85-S118.
Gustafsson, F. (2010). Particle filter theory and practice with positioning applications. IEEE
Hansen, P. R., & Lunde, A. (2006). Realized variance and market microstructure noise. Journal
Hansen, P. R., Huang, Z., & Shek, H. H. (2010). Realized GARCH: A complete model of returns
Huang, Z., Wang, T., & Zhang, X. (2014). Generalized autoregressive score model with realized
Koopman, S. J., Lucas, A., & Scharth, M. (2016). Predicting time-varying parameters with parameter-driven and observation-driven models. Review of Economics and Statistics, 98(1), 97-110.
Yuan, C., & Druzdzel, M. J. (2012). An importance sampling algorithm based on evidence pre-
Zhu, D., & Galbraith, J. W. (2010). A generalized asymmetric Student-t distribution with