entropy

Article
Surrogate Data Preserving All the Properties of
Ordinal Patterns up to a Certain Length
Yoshito Hirata 1,2, * , Masanori Shiro 3 and José M. Amigó 4
1 Mathematics and Informatics Center, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku,
Tokyo 113-8656, Japan
2 Faculty of Engineering, Information and Systems, University of Tsukuba, 1-1-1 Tennodai, Tsukuba,
Ibaraki 305-8573, Japan
3 Human Informatics Research Institute, National Institute of Advanced Industrial Science and Technology,
Ibaraki 305-8568, Japan
4 Centro de Investigación Operativa, Universidad Miguel Hernández, Avda. de la Universidad s/n,
03202 Elche, Spain
* Correspondence: [email protected]

Received: 15 June 2019; Accepted: 19 July 2019; Published: 22 July 2019

Abstract: We propose a method for generating surrogate data that preserves all the properties of
ordinal patterns up to a certain length, such as the numbers of allowed/forbidden ordinal patterns
and transition likelihoods from ordinal patterns into others. The null hypothesis is that the details
of the underlying dynamics do not matter beyond the refinements of ordinal patterns finer than a
predefined length. The proposed surrogate data help construct a test of determinism that is free from
the common linearity assumption for a null-hypothesis.

Keywords: time series analysis; determinism; stochasticity; permutations; hypothesis testing

1. Introduction
Judging whether the underlying dynamics are deterministic or stochastic based on a given time
series is an old problem and the first step for modelling such a time series. The current standard
approach uses iterative amplitude adjusted Fourier transform (IAAFT) surrogates [1] with some
statistics characterizing determinism, such as prediction errors [2] and the Wayland statistic [3].
With this approach, however, we cannot distinguish nonlinear stochasticity from linear stochasticity or
nonlinear determinism.
Recently, we proposed an alternative approach in which we prepare two independent tests, one for
linearity-nonlinearity and one for determinism-stochasticity [4]. For the test of linearity-nonlinearity,
we use truncated Fourier transform surrogates (TFTS) [5], an extension of IAAFT surrogates, with
the mean of s(t)^2 s(t + 1)^2 over a time series s(t) as a test statistic, which is not directly related to
the determinism that may exist. For the test of determinism-stochasticity, we use the properties of
permutations [6–8], which are inequality relations among consecutive measurements: if the underlying
dynamics is deterministic and satisfies some assumptions (see Section 3 for details), then the number
of appearing permutations increases exponentially as the length of permutations increases.
Currently, however, this approach has a problem: we need a long time series of length 1,000,000 to
classify stationary time series appropriately [4].
Thus, we propose another approach for testing determinism-stochasticity for the underlying
dynamics using the permutation properties of a time series. In this paper, we generate surrogate
data which preserve the series of permutations for a given time series almost perfectly, and thus
the stochastic properties of the underlying dynamics fully, up to a certain pre-defined length of the
permutations. We call the proposed surrogate data entropy preserving surrogates (EPS).
Based on the proposed method, we will be able to identify determinism in the underlying
dynamics from a time series more firmly than with the existing methods in the literature,
helping researchers to build their mathematical models with more confidence.

Entropy 2019, 21, 713; doi:10.3390/e21070713 www.mdpi.com/journal/entropy

For the test of linearity-nonlinearity, we continue using TFTS with the mean of s(t)^2 s(t + 1)^2 as the
statistic. If (s(t), s(t + 1)) follows the multivariate Gaussian distribution, a higher-order moment such
as s(t)^2 s(t + 1)^2 can be characterized with the means and variances [9] and becomes pivotal [10] and
constant if the underlying dynamics are kept. Thus, any variation can be attributed to a deviation from
linear Gaussianity. Hence, the mean of s(t)^2 s(t + 1)^2 can be used as a test statistic for nonlinearity.
We demonstrate the proposed set of methods using time series of length 1000.
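
As a minimal sketch of the nonlinearity statistic described above, the mean of s(t)^2 s(t + 1)^2 can be computed as follows. The function name is hypothetical and not taken from the authors' code. For a zero-mean jointly Gaussian pair (X, Y), Isserlis' theorem gives E[X^2 Y^2] = E[X^2] E[Y^2] + 2 E[XY]^2, which is why this moment is fixed by second-order quantities in the linear Gaussian case.

```python
def nonlinearity_statistic(s):
    """Mean of s(t)^2 * s(t+1)^2 over all consecutive pairs of the series."""
    if len(s) < 2:
        raise ValueError("need at least two samples")
    products = [(s[t] ** 2) * (s[t + 1] ** 2) for t in range(len(s) - 1)]
    return sum(products) / len(products)


# Tiny worked example: pairs (1, 2) and (2, 3) give 1*4 and 4*9.
example_value = nonlinearity_statistic([1, 2, 3])
```

In a surrogate test, the same function would be applied to the original series and to each surrogate, and the resulting values compared.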

2. Our Mathematical Settings

Before starting the main parts of this manuscript, we define our mathematical settings
more rigorously.
We are interested in a dynamical system f : X × P → X on a manifold X driven by a parameter
p(t) in a parameter space P, which may change over time. Thus, typically, we have

x(t + 1) = f(x(t), p(t)) (1)

for x(t) ∈ X and p(t) ∈ P, starting from the initial conditions x(0) ∈ X and p(0) ∈ P. We cannot
directly observe x(t). Instead, we have an observation function g : X → R such that s(t) = g(x(t)).
When g is given by a skew product of the state X and its disturbance Q, we can model
observational noise as well.
Our question is then whether p(t) is constant throughout time or changes over time.
If p(t) is constant throughout time, then we call the underlying dynamics deterministic. If p(t)
changes over time in a deterministic way, then we also call the underlying dynamics deterministic.
If p(t) changes over time randomly, then we call the underlying dynamics stochastic.
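
The setting above can be illustrated with a short sketch. Everything concrete here is an assumption made for illustration only: the logistic family as f, the identity as the observation function g, and the size of the random parameter fluctuation.

```python
import random


def simulate(f, g, x0, params):
    """Iterate x(t+1) = f(x(t), p(t)) and return the observations s(t) = g(x(t))."""
    x, series = x0, []
    for p in params:
        series.append(g(x))
        x = f(x, p)
    return series


def logistic(x, p):
    # Illustrative choice of f; p plays the role of the driving parameter p(t).
    return p * x * (1.0 - x)


def identity(x):
    # Illustrative observation function g.
    return x


T = 200
constant_p = [3.8] * T                                          # p(t) constant: deterministic
rng = random.Random(0)
random_p = [3.8 + rng.uniform(-0.05, 0.05) for _ in range(T)]   # p(t) random: stochastic

deterministic_series = simulate(logistic, identity, 0.3, constant_p)
stochastic_series = simulate(logistic, identity, 0.3, random_p)
```

Both series come from the same map; they differ only in whether p(t) is constant or random, which is exactly the distinction the tests aim to detect.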

3. Background
A number of studies in the existing literature have discussed how to characterize
determinism and/or stochasticity. The best-known approaches are probably those using the parallelness
of neighboring orbits [3,11] and the optimal neighborhood size for local linear predictions [12]. Recently,
the most popular approach has probably been that of Amigó et al. [8,13,14], which uses the fact that
there exist forbidden ordinal patterns. To explain the approach of Amigó et al. (2008) [8] in more detail,
we first define ordinal patterns or permutations [6].
Suppose that a time series is given by s(t) ∈ R. We focus on inequality relations among
consecutive measurements s(t), s(t + 1), ..., s(t + L − 1) over the time period between t and t + L − 1.
Namely, if we arrange these measurements in ascending order, we have s(t + i_1) ≤ s(t + i_2) ≤
··· ≤ s(t + i_L), where the indices i_j ∈ {0, 1, 2, ..., L − 1} are all distinct. For convenience, ties are
broken by time order: if s(t + i) = s(t + j) and i < j, we place s(t + i) before s(t + j). Then, the
corresponding permutation is π({s}, t, L) = (i_1, i_2, ..., i_L).
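
The definition above amounts to a stable argsort of each window. A minimal sketch follows; the function names are hypothetical, and the code uses 0-based time indices.

```python
def ordinal_pattern(s, t, L):
    """Return pi({s}, t, L) = (i_1, ..., i_L) for the window s[t], ..., s[t + L - 1].

    Python's sort is stable, so equal values keep their time order,
    which matches the tie-breaking rule in the text.
    """
    if t < 0 or t + L > len(s):
        raise ValueError("window runs outside the time series")
    return tuple(sorted(range(L), key=lambda i: s[t + i]))


def permutation_series(s, L):
    """Ordinal patterns of all windows of length L, in temporal order."""
    return [ordinal_pattern(s, t, L) for t in range(len(s) - L + 1)]
```

For example, the window (0.3, 0.1, 0.2) has its smallest value at offset 1, then offset 2, then offset 0, so its permutation is (1, 2, 0).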
The number of appearing permutations increases exponentially as the length L of permutations
increases if the underlying dynamics is one-dimensional, deterministic and piecewise monotone [13]
or, in any dimension, if the underlying dynamics is deterministic and expansive [7]. Because the
total number L! of possible permutations grows faster than exponentially, some permutations never
appear (are forbidden) for sufficiently large L.
Thus, Amigó et al. (2008) [8] use the existence of forbidden permutations as the signature of a
deterministic system. The contrapositive of this theorem was previously used for identifying whether a
given time series is generated from a nonlinear and stochastic system [4].
Moreover, the entropies obtained using the permutation statistics can be used for estimating the
metric and topological entropies [6,7,15].
Therefore, permutations are good tools for characterizing time series generated from the
underlying dynamics.
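
To make the role of forbidden permutations concrete, the following sketch counts the ordinal patterns of length 3 in a logistic-map series and in uniform white noise, and computes a permutation entropy from the pattern frequencies. The helper names, the seed, and the series length are illustrative assumptions. For the logistic map, three consecutive decreasing values cannot occur, so the pattern (2, 1, 0) is forbidden.

```python
import math
import random
from collections import Counter


def pattern_counts(s, L):
    """Count how often each ordinal pattern of length L appears in s."""
    return Counter(
        tuple(sorted(range(L), key=lambda i: s[t + i]))
        for t in range(len(s) - L + 1)
    )


def permutation_entropy(s, L):
    """Shannon entropy (in nats) of the empirical ordinal-pattern distribution."""
    counts = pattern_counts(s, L)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Logistic map x(t+1) = 3.8 x(t)(1 - x(t)) versus uniform white noise.
x, logistic = 0.3, []
for _ in range(1000):
    logistic.append(x)
    x = 3.8 * x * (1.0 - x)
rng = random.Random(0)
noise = [rng.random() for _ in range(1000)]

n_logistic = len(pattern_counts(logistic, 3))   # fewer than 3! = 6 patterns appear
n_noise = len(pattern_counts(noise, 3))         # all 6 patterns appear
```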

4. Methods
Here we propose to generate surrogate data that preserve all the statistical properties of
permutations up to a certain predefined length L. We call such surrogate data entropy
preserving surrogates (EPS). Our method is quite simple and follows a general principle proposed by
Schreiber (1998) [16]: we randomly exchange the temporal order of the time series through the method of
simulated annealing [17] so that we preserve the series of permutations for the given time series as well as
the series of permutations for a moving average of the given time series over length L subsampled at
an interval of L (see Figure 1). In this way, we generate 39 surrogate datasets for each time series,
which gives a significance level of 2/(39 + 1) = 5%. Since the series of permutations is preserved in the
entropy preserving surrogates, all the transitions from every permutation to another are preserved.
Therefore, our null hypothesis for the entropy preserving surrogates is that the underlying dynamics
has significant historical dependence only up to the length L and that the dependence beyond L does not
matter for the underlying dynamics. As a by-product, permutation entropies calculated up to length L
are preserved. Details of how to generate EPS are given in Appendix A.
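
The significance level quoted above corresponds to a two-sided rank test: the null hypothesis is rejected when the statistic of the original series falls outside the range spanned by the surrogates. A minimal sketch, with hypothetical function names:

```python
def significance_level(n_surrogates):
    """Two-sided level of a rank test with n_surrogates surrogates: 2 / (n + 1)."""
    return 2.0 / (n_surrogates + 1)


def rejects_null(original_stat, surrogate_stats):
    """Reject when the original statistic lies outside the surrogate range."""
    return original_stat < min(surrogate_stats) or original_stat > max(surrogate_stats)
```

With 39 surrogates, `significance_level(39)` evaluates to 0.05, matching the 5% level used throughout.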

Figure 1. Schematic figure showing how we generate an entropy preserving surrogate: time points are
exchanged randomly so as to preserve the series of permutations for the original time series as well as
that for its sub-sampled moving average (panels: original time series; sub-sampled moving average;
horizontal axis: time).

To compare an original time series with its surrogate data and tell whether the original time
series is statistically different from its surrogate data, we estimate the maximal Lyapunov
exponent in the following way. First, we fit the parameters a(t) and b(t) of the following local linear
model for each time t using 20 neighboring points in infinite-dimensional delay coordinates [18]:

s(t + 1) − s(τ + 1) ≈ a(t) + b(t)(s(t) − s(τ)), (2)

where s(τ) is one of the 20 spatial neighbors of s(t). Then, we evaluate the following quantity as a test
statistic for the second half of each dataset:

E_t[log |b(t)|]. (3)

This statistic can be regarded as a proxy for the maximal Lyapunov exponent. We decided to
use 20 neighbors for the above estimation because if the number of neighbors is less than 20, then the
estimation would heavily depend on the closer neighbors, while the estimation would not be able to
characterize local states well if the number of neighbors is greater than 20.
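
The following is a simplified sketch of the statistic in Equation (3), not the authors' implementation. As loudly stated assumptions, neighbors are searched on the scalar values s(t) rather than in the infinite-dimensional delay coordinates of [18], and the average runs over all available times. The function name is hypothetical.

```python
import math


def lyapunov_proxy(s, n_neighbors=20):
    """Average of log|b(t)| from local linear fits of Equation (2) on scalar states."""
    T = len(s) - 1                       # times t that have a successor s(t + 1)
    logs = []
    for t in range(T):
        # Nearest neighbours of s(t) in value, excluding t itself.
        others = sorted((tau for tau in range(T) if tau != t),
                        key=lambda tau: abs(s[tau] - s[t]))[:n_neighbors]
        dx = [s[t] - s[tau] for tau in others]
        dy = [s[t + 1] - s[tau + 1] for tau in others]
        mx, my = sum(dx) / len(dx), sum(dy) / len(dy)
        sxx = sum((u - mx) ** 2 for u in dx)
        sxy = sum((u - mx) * (v - my) for u, v in zip(dx, dy))
        if sxx == 0.0 or sxy == 0.0:
            continue                     # degenerate fit: slope undefined or zero
        logs.append(math.log(abs(sxy / sxx)))   # least-squares slope b(t)
    return sum(logs) / len(logs)
```

For a one-dimensional chaotic map such as the logistic map, the fitted slope b(t) approximates the local derivative, so the average is expected to be positive.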

5. Results

5.1. Toy Examples

First, we show some numerical experiments for datasets generated from toy models which are free
from observational noise. We set L = 30 throughout the paper because we would like to investigate
the deterministic structure which finely persists over the pseudo-periodicity evaluated by the
pseudo-periodic surrogates [19]. To obtain pseudo-periodic surrogates, we used the three-dimensional
delay coordinates with delay 8.
Our first toy model is the first-order autoregressive linear (AR(1)) model [20]. The model we used
is as follows:
x (t + 1) = 0.8x (t) + η (t), (4)

where η (t) follows the Gaussian distribution of mean 0 and standard deviation 1.
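
A minimal sketch of generating a series from Equation (4) follows. The function name, seed handling, zero initial condition, and burn-in length are assumptions for illustration.

```python
import random


def ar1_series(T, phi=0.8, seed=0, burn_in=100):
    """Generate x(t+1) = phi * x(t) + eta(t) with standard Gaussian eta(t)."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for step in range(T + burn_in):
        x = phi * x + rng.gauss(0.0, 1.0)
        if step >= burn_in:          # discard the initial transient (assumed choice)
            out.append(x)
    return out


ar1_example = ar1_series(1000)
```

The stationary variance of this process is 1/(1 − 0.8^2) ≈ 2.78, which gives a quick sanity check on a generated series.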
The second toy model is the GARCH model [21]. The model equations are

y(t) = 0.409933 + 0.095 y(t − 1) + e(t), (5)

h(t) = 14.4038 + 0.095 e(t − 1)^2 + 0.895 h(t − 1), (6)

where e(t) follows the Gaussian distribution of mean 0 and standard deviation √h(t). We observe
y(t) to generate a time series.
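
Equations (5) and (6) can be iterated as in the sketch below. The starting values and the burn-in length are assumptions (the text does not state them); the variance h is started at the unconditional value 14.4038/(1 − 0.095 − 0.895).

```python
import math
import random


def garch_series(T, seed=0, burn_in=1000):
    """Generate y(t) from Equations (5) and (6)."""
    rng = random.Random(seed)
    h = 14.4038 / (1.0 - 0.095 - 0.895)   # assumed start: unconditional variance of e(t)
    e, y = 0.0, 0.0                        # assumed initial values
    out = []
    for step in range(T + burn_in):
        h = 14.4038 + 0.095 * e ** 2 + 0.895 * h   # Equation (6), uses e(t-1) and h(t-1)
        e = rng.gauss(0.0, math.sqrt(h))            # standard deviation sqrt(h(t))
        y = 0.409933 + 0.095 * y + e                # Equation (5)
        if step >= burn_in:
            out.append(y)
    return out


garch_example = garch_series(1000)
```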
The third toy model is the model for noise-induced order [22]. We use the following equations:

x(t + 1) = f(x(t)) + b + 10^{−2.5} u(t), (7)

\[
f(x) =
\begin{cases}
\left(-(0.125 - x)^{1/3} + 0.50607357\right)\exp(-x), & \text{if } x < 0.125, \\
\left((x - 0.125)^{1/3} + 0.50607357\right)\exp(-x), & \text{if } 0.125 \le x < 0.3, \\
0.121205602\,\left(10x\exp\left(-\tfrac{10x}{3}\right)\right)^{19}, & \text{otherwise,}
\end{cases}
\qquad (8)
\]

where u(t) follows the uniform distribution between −1 and 1.
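
A sketch of Equations (7) and (8) follows, with two loudly stated assumptions. First, the piecewise map is parenthesized so that its branches join continuously at x = 0.3. Second, the value of b is not given in this text, so it is left as a required argument; the value used in the usage line is the one commonly quoted for this map in the literature and should be treated as an assumption. Function names, seed, and initial condition are also illustrative.

```python
import math
import random


def bz_map(x):
    """Piecewise map f of Equation (8), under the parenthesization assumed above."""
    if x < 0.125:
        return (-((0.125 - x) ** (1.0 / 3.0)) + 0.50607357) * math.exp(-x)
    if x < 0.3:
        return ((x - 0.125) ** (1.0 / 3.0) + 0.50607357) * math.exp(-x)
    return 0.121205602 * (10.0 * x * math.exp(-10.0 * x / 3.0)) ** 19


def noise_induced_order_series(T, b, seed=0, x0=0.3):
    """Iterate x(t+1) = f(x(t)) + b + 10^(-2.5) u(t), with u(t) uniform on [-1, 1]."""
    rng = random.Random(seed)
    x, out = x0, []
    for _ in range(T):
        out.append(x)
        x = bz_map(x) + b + 10.0 ** -2.5 * rng.uniform(-1.0, 1.0)
    return out


# b = 0.0232885279 is an assumed value, not stated in this text.
nio_example = noise_induced_order_series(1000, b=0.0232885279)
```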

The fourth toy model is the logistic map [23]. We use the following equation:

x (t + 1) = 3.8x (t)(1 − x (t)). (9)

We also use time-continuous models for testing the proposed method. Our fifth toy model is the
Lorenz model [24]. We use the following equations:

ẋ = −10(x − y), (10)

ẏ = −xz + 28x − y, (11)

ż = xy − (8/3)z. (12)

We sampled x every 0.1 unit time.
The sixth model is the Rössler model [25]. Here we use the following equations:

ẋ = −(y + z), (13)

ẏ = x + 0.36y, (14)

ż = 0.4 + z(x − 4.5). (15)

We sampled x every 1 unit time.
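
The two flows can be integrated and sampled as in the following sketch. The integration scheme (fourth-order Runge-Kutta), the step size, the initial condition, and the discarded transient are all assumptions; the text specifies only the equations and the sampling intervals.

```python
def lorenz(state):
    """Right-hand side of Equations (10)-(12)."""
    x, y, z = state
    return (-10.0 * (x - y), -x * z + 28.0 * x - y, x * y - (8.0 / 3.0) * z)


def rossler(state):
    """Right-hand side of Equations (13)-(15)."""
    x, y, z = state
    return (-(y + z), x + 0.36 * y, 0.4 + z * (x - 4.5))


def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def sample_x(f, state, n_samples, sampling, dt=0.01, transient=1000):
    """Integrate the flow and record the x component every `sampling` time units."""
    steps = round(sampling / dt)
    for _ in range(transient):           # discard an initial transient (assumed length)
        state = rk4_step(f, state, dt)
    out = []
    for _ in range(n_samples):
        for _ in range(steps):
            state = rk4_step(f, state, dt)
        out.append(state[0])
    return out


lorenz_x = sample_x(lorenz, (1.0, 1.0, 1.0), 200, sampling=0.1)
rossler_x = sample_x(rossler, (1.0, 1.0, 1.0), 20, sampling=1.0)
```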

For each model, we generated 20 time series of length 1000 to examine the robustness of
the proposed test. In this paper, we also used pseudo-periodic surrogates [19] with correlation
dimensions [26] as test statistics. For pseudo-periodic surrogates, the null hypothesis is that the
underlying dynamics has no determinism beyond pseudo-periodicity. Such surrogate data can be
generated by connecting segments of the time series, choosing a neighboring point at each step with
a Gaussian uncertainty. If we generate surrogate data in this way, a rough periodicity related to the
underlying dynamics is preserved, while fine structure is destroyed. Thus, we can judge whether there
is determinism beyond this rough periodicity. For the cases without observational noise, we also
use the proxy for the maximal Lyapunov exponent as a test statistic for pseudo-periodic surrogates
for comparison.
When we use TFTS, we apply end-to-end matching [27] using the first and last 20 points
to suppress the artificial high-frequency components which might be generated when applying
the Fourier transforms. When we generate the proposed entropy preserving surrogates, we use the
same segments of time series, which could be the reason why there are slight differences in
the values of the proxy for the maximal Lyapunov exponent between Figures 6 and 7, as we will
discuss later.
In all the model analyses below, we used the whole dataset for each time series, meaning
that we did not divide each time series into halves or the like.
Examples of entropy preserving surrogates are shown in Figures 2 and 3. In particular, the time
series of an entropy preserving surrogate looks similar to the original time series (Figure 2).
When we look at their return plots, we can see that an entropy preserving surrogate (Figure 3B) seems
to be a perturbed version of the original time series (Figure 3A).
Figure 2. Example of an entropy preserving surrogate for the logistic map (x(t) against time, showing
the original time series and an entropy preserving surrogate).

Figure 3. Return plot (x(t + 1) against x(t)) for the original time series of the logistic map (A) and
that for one of its entropy preserving surrogates (B).

The results of the surrogate tests are summarized in Figures 4–7 as well as Table 1. In most
cases, the tested time series were classified into the correct classes for the corresponding toy models.
To evaluate the numbers of rejections appropriately, consider the binomial distribution with N = 20
trials and p = 0.05. Under this distribution, the cumulative probability of observing 3 or fewer
rejections exceeds 95%, so the number of rejections is statistically significant only if it is 4 or greater.
For example, 3 rejections for the test of linearity for the AR(1) model
are not statistically significant. For the same reason, 3 rejections for the proposed test of determinism
beyond 30 steps for the model of noise-induced order are not statistically significant.
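
The threshold of 4 rejections can be checked with a short calculation. The sketch below finds the smallest number of rejections k whose upper-tail probability P(X ≥ k) under the binomial distribution falls below 5%; the function name is hypothetical.

```python
from math import comb


def min_significant_rejections(n_trials, p=0.05, alpha=0.05):
    """Smallest k such that P(X >= k) < alpha for X ~ Binomial(n_trials, p)."""
    cdf = 0.0                                  # holds P(X <= k - 1)
    for k in range(n_trials + 1):
        tail = 1.0 - cdf                       # P(X >= k)
        if tail < alpha:
            return k
        cdf += comb(n_trials, k) * p ** k * (1.0 - p) ** (n_trials - k)
    return None


threshold_20 = min_significant_rejections(20)
```

For N = 20, the upper-tail probability is about 7.5% for 3 or more rejections and about 1.6% for 4 or more, so `threshold_20` is 4.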
The results presented in Figure 4 and Table 1 show that the nonlinearity test examined here
is robust.
Table 1 and a comparison of Figures 5B,C and 6B,C show that the results of pseudo-periodic
surrogates depend heavily on the test statistic we use. These results may be due to the fact that
pseudo-periodic surrogates are typical realizations and need a pivotal statistic for the test [10].
Figure 7C,D indicate that a time series with the same values for the permutation entropies up
to length 30 is likely to have a positive Lyapunov exponent even if the underlying dynamics is
stochastic. This usage could lead to another implication obtained from the entropy preserving
surrogates. The results shown in Table 1 indicate that the proposed method has some skill in detecting
determinism in the underlying dynamics.
The results presented in Table 1 also show that the proposed method works well even for flows
such as the Lorenz and Rössler models as long as sampling intervals are chosen appropriately.

Figure 4. Examples of tests for nonlinearity for various models when the datasets are free from
observational noise. The important point here is whether or not the value obtained from each original
time series, shown as the red vertical dashed line, lies within the interval between the minimum and
the maximum of the test statistic E[x(t)^2 x(t + 1)^2] for the 39 truncated Fourier transform
surrogates (TFTS), which can be read from each histogram. Therefore, it does not matter much
whether the test statistic obtained from the original data is smaller or greater than those obtained from
the TFTS. (A) result for the AR(1) model; (B) result for the GARCH model; (C) result for the
model of noise-induced order; (D) result for the logistic map; (E) result for the Lorenz model; (F) result
for the Rössler model.

Table 1. Results of noise free data summarized as classifications. We counted the number of rejections
for each test for each model. The italic numbers correspond to the significant numbers of rejections
based on the calculations using the binomial distributions.

Property \ Model                                                      AR(1)  GARCH  Noise-Induced Order  Logistic  Lorenz  Rössler
Nonlinearity with E[x(t)^2 x(t + 1)^2]                                    3     20                   20        20      20       20
Determinism beyond pseudo-periodicity with correlation dimensions         0     20                    0         0       0       20
Determinism beyond pseudo-periodicity with maximal Lyapunov exponent      1      1                    7        17       2        0
Determinism beyond 30 steps with maximal Lyapunov exponent                2      1                    3        20       8        6
Total                                                                    20     20                   20        20      20       20

Figure 5. Examples of tests of determinism beyond pseudo-periodicity using pseudo-periodic
surrogates for various models when the datasets are free from observational noise. Here we use
the correlation dimensions as test statistics. In this surrogate data, rough periodic behavior is preserved,
while fine structure related to the possible underlying determinism in question is destroyed. Correlation
dimensions are normalized so that the minimum and the maximum values for the correlation
dimensions of the 39 pseudo-periodic surrogates for each dimension become 0 and 1, respectively. (A)
result for the AR(1) model; (B) result for the GARCH model; (C) result for the model of noise-induced
order; (D) result for the logistic map; (E) result for the Lorenz model; (F) result for the Rössler model.

We also tested the cases where we added Gaussian observational noise of
mean 0 and standard deviation equal to 5% of the standard deviation of the original time series
(Figures 8–10 and Table 2). Even so, the proposed method seems to work properly. Determinism
beyond pseudo-periodicity was detected for the GARCH model and the model of noise-induced
order, while the determinism was weak in the sense that the dependence did not persist beyond 30
steps in a statistically significant way. On the other hand, the logistic map tended to exhibit determinism
beyond 30 steps (Table 2). Overall, Table 2 shows the robustness of the proposed method against
observational noise.
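
Adding observational noise at the 5% level described above can be sketched as follows. The function name and the seed handling are assumptions for illustration.

```python
import random
import statistics


def add_observational_noise(s, level=0.05, seed=0):
    """Add zero-mean Gaussian noise whose standard deviation is level * std(s)."""
    rng = random.Random(seed)
    sigma = level * statistics.pstdev(s)
    return [v + rng.gauss(0.0, sigma) for v in s]
```

The noise is added after the series has been generated, so it does not feed back into the dynamics.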

Figure 6. Examples of tests of determinism beyond pseudo-periodicity using pseudo-periodic
surrogates when we use the proxy for the maximal Lyapunov exponent as a test statistic. In each
panel, the red dashed line corresponds to the value obtained from the original time series and the
histogram, obtained from the pseudo-periodic surrogates. (A) result for the AR(1) model; (B) result for
the GARCH model; (C) result for the model of noise-induced order; (D) result for the logistic map; (E)
result for the Lorenz model; (F) result for the Rössler model.

Table 2. Results of 5% observational noise data summarized as classifications. See the caption of Table 1
to interpret the results.

Property \ Model                                                      AR(1)  GARCH  Noise-Induced Order  Logistic  Lorenz  Rössler
Nonlinearity with E[x(t)^2 x(t + 1)^2]                                    1     19                   20        20      20       20
Determinism beyond pseudo-periodicity with correlation dimensions         0     20                   20         0      20       11
Determinism beyond 30 steps with maximal Lyapunov exponent                3      0                    2        16       6        7
Total                                                                    20     20                   20        20      20       20

Figure 7. Examples of tests of determinism beyond 30 steps using the proposed entropy preserving
surrogates for various models when the datasets are free from observational noise. In each panel,
the red dashed line corresponds to the value of test statistic obtained from the original data. (A) result
for the AR(1) model; (B) result for the GARCH model; (C) result for the model of noise-induced order;
(D) result for the logistic map; (E) result for the Lorenz model; (F) result for the Rössler model.

Figure 8. Examples of tests of nonlinearity for various models when 5% observational noise is added.
See the caption of Figure 5 to interpret the results. (A) result for the AR(1) model; (B) result for the
GARCH model; (C) result for the model of noise-induced order; (D) result for the logistic map; (E) result
for the Lorenz model; (F) result for the Rössler model.

Figure 9. Examples of tests of determinism beyond pseudo-periodicity using pseudo-periodic
surrogates for various models when 5% observational noise is added. (A) result for the AR(1) model;
(B) result for the GARCH model; (C) result for the model of noise-induced order; (D) result for the
logistic map; (E) result for the Lorenz model; (F) result for the Rössler model.

Figure 10. Examples of tests of determinism using the proposed entropy preserving surrogates for
various models when 5% observational noise is added. (A) result for the AR(1) model; (B) result for
the GARCH model; (C) result for the model of noise-induced order; (D) result for the logistic map;
(E) result for the Lorenz model; (F) result for the Rössler model.

5.2. Real Data Example of the USD/JPY Market

We analyzed the dataset of the USD/JPY market compiled by the Thomson Reuters Corporation.
The record starts on 1 January 2006 and ends on 31 December 2015. We use the first 100,000 quotes for
the analysis here. We divided the dataset into 100 segments of 1000 quotes each and took the inter-quote
intervals for each segment.
For the first segment, one of the generated entropy preserving surrogates is shown in
Figures 11 and 12. We can see that the typical characteristics of the time series as well as of the
return plot are almost preserved.
The results are summarized in Table 3. Nonlinearity was detected in 24 out of 100 cases, while
determinism beyond 30 steps was detected in 12 out of 100 cases. These numbers are significant
from the viewpoint of the binomial distribution with 100 trials and probability 0.05 for each test,
given that each test has a 5% significance level and the time segments are independent
of each other. Overall, therefore, the dataset of the USD/JPY market seems nonlinear with determinism
beyond 30 quotes.

Figure 11. Example of an entropy preserving surrogate for a part of the USD/JPY data (inter-event
interval in seconds against time, showing the original time series and an entropy preserving surrogate).

Table 3. Results of the USD/JPY data summarized as classifications. See the caption of Table 1 to
interpret the results.

Property                                                              Number of Time Segments
Nonlinearity with E[x(t)^2 x(t + 1)^2]                                                     24
Determinism beyond pseudo-periodicity with correlation dimensions                           0
Determinism beyond 30 steps with maximal Lyapunov exponent                                 12
Total                                                                                     100

Figure 12. Return plot (s(t + 1) against s(t), in seconds) for the original time series of a USD/JPY
data part (A) and that for one of its entropy preserving surrogates (B).

6. Discussions
Although we set L = 30 in this manuscript, we may vary the length L of permutations to
elucidate the effect and the length of dynamical dependence. By choosing L, we can control the
length of dependence that should have significant meaning. Thus, by varying L, we can narrow
down where a target time series lies, mostly within the intersection of the nonlinear and deterministic
regions, to a region that could be smaller than the one specified with pseudo-periodic surrogates, as
shown in Figure 13. Hence, together with the methods [5,19] in the existing literature, the proposed
entropy preserving surrogates help us to specify the assumptions of a model more finely when we try
to construct a model based on a time series.
For pseudo-periodic surrogates, the length 1000 of the time series might have been too short to
show determinism beyond pseudo-periodicity for the USD/JPY data, while it was
sufficient to show determinism beyond 30 quotes using the proposed method. Thus, we would
like to explore the effect of the length of time series more deeply in the future.
The proposed method preserves the series of permutations for the original time series as well as that
for its sub-sampled moving average. Thus, the proposed method of entropy preserving surrogates
can be regarded as a constrained realization [10] rather than a typical realization. Among methods that
generate surrogate data using permutations, there are those of References [28–30].
Because these methods generate surrogate data as typical realizations, the proposed method is the first
method that generates surrogate data with permutations as a constrained realization. As a constrained
realization, the proposed method can formally be used with a non-pivotal statistic [10], which does
not have to provide a consistent value for a class of null models. Thus, we hope that the proposed
method will be powerful for investigating the deterministic properties beyond a pre-defined length for a
given time series.
If there is a pair of time series and we generate entropy preserving surrogates for both, then we
can also preserve symbolic transfer entropies [31] and transcripts [32]. Therefore, applying entropy
preserving surrogates to multivariate time series could be an interesting open problem.

Figure 13. The Venn diagram describing the relationship among original properties for the underlying
dynamics such as nonlinearity and determinism against properties we can identify with surrogate data
such as determinism beyond pseudo-periodicity (pseudo-periodic surrogates [19]) and determinism
beyond L steps (the proposed entropy preserving surrogates).

7. Conclusions
We have proposed a method for generating surrogate data such that all the properties of
permutations up to a certain length are preserved. Such surrogate data look very similar to the
original data, as shown in Figures 2 and 11, but contain dynamical noise, as demonstrated in particular in
Figure 3. Using the toy models, we confirmed that the proposed method works well. Then,
we applied the proposed method to inter-quote interval data in the USD/JPY market and found that
the market behaved in a nonlinear and deterministic manner, which is consistent with our previous
findings [33].

Author Contributions: Conceptualization, Y.H., M.S. and J.M.A.; methodology, Y.H.; numerical experiments,
Y.H.; writing–original draft Y.H.; writing–review and editing, Y.H., M.S. and J.M.A.; supervision, J.M.A.; funding
acquisition, Y.H.
Funding: The research of Y.H. is supported by JSPS KAKENHI Grant Number JP18K11461.
Acknowledgments: We thank Michael Small (University of Western Australia) very much for his making his codes
freely available, with which we generated pseudo-periodic surrogates [19] as well as calculated the correlation
dimensions [26] throughout this paper.
Conflicts of Interest: The authors declare no conflict of interest.

Appendix A
Let {s(t) ∈ R | t = 1, 2, ..., T} be a given time series. Then, its moving average over L consecutive
points, sub-sampled every L time points, can be defined by

\[
\bar{s}(u) = \frac{1}{L} \sum_{i=1}^{L} s((u - 1)L + i), \qquad u = 1, 2, \ldots, \lfloor T/L \rfloor .
\]

First, we convert the given time series {s(t)} and its moving average {s̄(u)} to the corresponding
permutation series {π({s}, t, L) | t = 1, 2, ..., T − L + 1} and {π({s̄}, u, L) | u = 1, 2, ..., ⌊T/L⌋ − L + 1}.
Second, we initialize our simulated annealing algorithm by setting the current time series {c(t)}
to the original time series {s(t)}.
Third, we repeat the following process until the number of iterations reaches (N_S + 10) × S,
where we set N_S = 39, which is the number of surrogate data, and S = 10,000, which is the number of
iterations we skip:

1. Increment the current number i of iterations by 1.
2. Prepare an attempt {a(t)} for replacement by swapping two elements of {c(t)}.
3. Calculate {π({a}, t, L) | t = 1, 2, ..., T − L + 1} and {π({ā}, u, L) | u = 1, 2, ..., ⌊T/L⌋ − L + 1}.
4. Calculate the number of differences between [{π({s}, t, L) | t = 1, 2, ..., T − L + 1}, {π({s̄}, u, L) | u =
1, 2, ..., ⌊T/L⌋ − L + 1}] and [{π({a}, t, L) | t = 1, 2, ..., T − L + 1}, {π({ā}, u, L) | u =
1, 2, ..., ⌊T/L⌋ − L + 1}]. Let #n denote this number.
5. Let p be the probability of accepting the attempt, which is calculated as exp[−iβ#n].
6. Generate a uniform random number between 0 and 1. If the random number is less than p,
then replace the current time series {c(t)} by the attempt {a(t)}.
7. If i is a multiple of S and i > 10S, then record the current {c(t)} as the (i/S − 10)-th surrogate data.
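
The steps above can be sketched as follows. This is an illustrative reading of the appendix, not the authors' code: the function names are hypothetical, indices are 0-based, the value of β is not given in the text and is left as a parameter with an arbitrary default, and the two swapped positions are drawn uniformly at random.

```python
import math
import random


def patterns(s, L):
    """Permutation series: stable argsort of every window of length L."""
    return [tuple(sorted(range(L), key=lambda i: s[t + i]))
            for t in range(len(s) - L + 1)]


def block_means(s, L):
    """Moving average over L points, sub-sampled every L points."""
    return [sum(s[u * L:(u + 1) * L]) / L for u in range(len(s) // L)]


def mismatch(target, a, L):
    """#n: number of patterns of the attempt that differ from the original ones."""
    fine, coarse = target
    n = sum(p != q for p, q in zip(fine, patterns(a, L)))
    n += sum(p != q for p, q in zip(coarse, patterns(block_means(a, L), L)))
    return n


def entropy_preserving_surrogates(s, L, n_surrogates=39, skip=10_000, beta=1.0, seed=0):
    """Simulated-annealing sketch of steps 1-7; beta is an assumed parameter value."""
    rng = random.Random(seed)
    target = (patterns(s, L), patterns(block_means(s, L), L))
    c = list(s)                                   # current series starts at the original
    surrogates = []
    for i in range(1, (n_surrogates + 10) * skip + 1):
        a = list(c)
        j, k = rng.sample(range(len(a)), 2)       # swap two elements (step 2)
        a[j], a[k] = a[k], a[j]
        n = mismatch(target, a, L)                # steps 3 and 4
        if rng.random() < math.exp(-i * beta * n):    # steps 5 and 6
            c = a
        if i % skip == 0 and i > 10 * skip:       # step 7
            surrogates.append(list(c))
    return surrogates
```

Recomputing every pattern at each iteration keeps the sketch short but is slow for T = 1000 and L = 30; since a swap only affects the windows containing the two swapped positions, an incremental update of #n would be a natural optimization.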

References
1. Schreiber, T.; Schmitz, A. Improved surrogate data for nonlinearity tests. Phys. Rev. Lett. 1996 77, 635–638.
[CrossRef]
2. Theiler, J.; Eubank, S.; Longtin, A.; Galdrikian, B.; Farmer, J.D. Testing for nonlinearity in time series:
The method of surrogate data. Phys. D 1992, 58, 77–94. [CrossRef]
3. Wayl, R.; Bromley, D.; Pickett, D.; Passamante, A. Recognizing determinism in a time series. Phys. Rev. Lett.
1993, 70, 580–582.
4. Hirata, Y.; Shiro, M. Detecting nonlinear stochastic systems using two independent hypothesis tests.
Phys. Rev. E 2019, in press.
5. Nakamura, T.; Small, M.; Hirata, Y. Testing for nonlinearity in irregular fluctuations with long-term trends.
Phys. Rev. E 2006, 74, 026205. [CrossRef] [PubMed]
6. Bandt, C.; Pompe, B. Permutation entropy: A natural complexity measure for time series. Phys. Rev. Lett.
2002, 88, 174102. [CrossRef] [PubMed]
7. Amigó J.M.; Kennel, M.B. Topological permutation entropy. Phys. D 2007, 231, 137–142. [CrossRef]
8. Amigó, J.M.; Zambrano, S.; Sanjuán, M.A.F. Combinatorial detection of determinism in noisy time series.
EPL 2008, 83, 60005. [CrossRef]
9. Michalowicz, J.V.; Nichols, J.M.; Bucholtz, F.; Olson, C.C. An Isserlis’ theorem for mixed Gaussian variables:
Application to the auto-bispectral density. J. Stat. Phys. 2009, 136, 89–102. [CrossRef]
10. Theiler, J.; Prichard, D. Constrained-realization Monte-Carlo method for hypothesis testing. Phys. D
1996, 94, 221–235. [CrossRef]
11. Kaplan, D.T.; Glass, L. Direct test for determinism in a time series. Phys. Rev. Lett. 1992, 68, 427–430.
[CrossRef] [PubMed]
12. Casdagli, M.C.; Weigend, A.S. Exploring the continuum between deterministic and stochastic
modeling. In Time Series Prediction: Forecasting the Future and Understanding the Past; Weigend, A.S.,
Gershenfeld, N.A., Eds.; Westview Press: New York, NY, USA, 1993; pp. 347–366.
13. Amigó, J.M.; Kocarev, L.; Szczepanski, J. Order patterns and chaos. Phys. Lett. A 2006, 355, 27–36. [CrossRef]
14. Amigó, J.M.; Zambrano, S.; Sanjuán, M.A.F. Detecting determinism with ordinal patterns: A comparative
study. Int. J. Bifurcat. Chaos 2010, 20, 2915–2924. [CrossRef]
15. Amigó, J.M.; Kennel, M.B.; Kocarev, L. The permutation entropy rate equals the metric entropy rate for
ergodic information sources and ergodic dynamical systems. Phys. D 2005, 210, 77–95. [CrossRef]
16. Schreiber, T. Constrained randomization of time series data. Phys. Rev. Lett. 1998, 80, 2105–2108. [CrossRef]
17. Gershenfeld, N. The Nature of Mathematical Modeling; Cambridge University Press: Cambridge, UK, 1998.
18. Hirata, Y.; Takeuchi, T.; Horai, S.; Suzuki, H.; Aihara, K. Parsimonious description for predicting
high-dimensional dynamics. Sci. Rep. 2015, 5, 15736. [CrossRef] [PubMed]
19. Small, M.; Yu, D.; Harrison, R.G. Surrogate test for pseudoperiodic time series data. Phys. Rev. Lett.
2001, 87, 188101. [CrossRef]
20. Hamilton, J.D. Time Series Analysis; Princeton University Press: Princeton, NJ, USA, 1994.
21. Lamoureux, C.G.; Lastrapes, W.D. Persistence in variance, structural change, and the GARCH model.
J. Bus. Econ. Stat. 1990, 8, 225–234.
22. Matsumoto, K.; Tsuda, I. Noise-induced order. J. Stat. Phys. 1983, 31, 87–106. [CrossRef]
23. May, R.M. Simple mathematical models with very complicated dynamics. Nature 1976, 261, 459–467.
[CrossRef]
24. Lorenz, E.N. Deterministic nonperiodic flow. J. Atmos. Sci. 1963, 20, 130–141. [CrossRef]
25. Rössler, O.E. An equation for continuous chaos. Phys. Lett. 1976, 57A, 397–398. [CrossRef]
26. Yu, D.J.; Small, M.; Harrison, R.G.; Diks, C. Efficient implementation of the Gaussian kernel algorithm in
estimating invariants and noise level from noisy time series data. Phys. Rev. E 2000, 61, 3750–3756. [CrossRef]
27. Schreiber, T.; Schmitz, A. Surrogate time series. Phys. D 2000, 142, 346–382. [CrossRef]
28. Hirata, Y.; Amigó, J.M.; Matsuzaka, Y.; Yokota, R.; Mushiake, H.; Aihara, K. Detecting causality by combined
use of multiple methods: climate and brain examples. PLoS ONE 2016, 11, e0158572. [CrossRef] [PubMed]
29. McCullough, M.; Sakellariou, K.; Stemler, T.; Small, M. Regenerating time series from ordinal networks.
Chaos 2017, 27, 035814. [CrossRef]
30. Small, M.; McCullough, M.; Sakellariou, K. Ordinal network measures: Quantifying determinism in data.
In Proceedings of the 2018 IEEE International Symposium on Circuits and Systems (ISCAS), Florence, Italy,
27–30 May 2018.
31. Staniek, M.; Lehnertz, K. Symbolic transfer entropy. Phys. Rev. Lett. 2008, 100, 158101. [CrossRef] [PubMed]
32. Amigó, J.M.; Monetti, R.; Aschenbrenner, T.; Bunk, W. Transcripts: An algebraic approach to coupled time
series. Chaos 2012, 22, 013105. [CrossRef]
33. Hirata, Y.; Aihara, K. Timing matters in foreign exchange markets. Phys. A 2012, 391, 760–766. [CrossRef]

Sample Availability: Matlab codes are available from the corresponding author's website:
https://sites.google.com/view/yoshitohirata/home.

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).

