Richman Moorman 2000 Physiological Time Series Analysis Using Approximate Entropy and Sample Entropy
Richman, Joshua S., and J. Randall Moorman. Physiological time-series analysis using approximate and sample entropy. Am J Physiol Heart Circ Physiol 278: H2039–H2049, 2000. Entropy, as it relates to dynamical systems, is the rate of information production. Methods for estimation of the entropy of a system represented by a time series are not, however, well suited to analysis of the short and noisy data sets encountered in cardiovascular and other biological studies. Pincus introduced approximate entropy (ApEn), a set of measures of system complexity closely related to entropy, which is easily applied to clinical cardiovascular and other time series. ApEn statistics, however, lead to inconsistent results. We have developed a new and related complexity measure, sample entropy (SampEn), and have compared ApEn and SampEn by using them to analyze sets of random numbers with known probabilistic character. We have also evaluated cross-ApEn and cross-SampEn, which use cardiovascular data sets to measure the similarity of two distinct time series. SampEn agreed with theory much more closely than ApEn over a broad range of conditions. The improved accuracy of SampEn statistics should make them useful in the study of experimental clinical cardiovascular and other biological time series.

probability; nonlinear dynamics

0363-6135/00 $5.00 Copyright © 2000 the American Physiological Society

NONLINEAR DYNAMICAL ANALYSIS is a powerful approach to understanding biological systems. The calculations, however, usually require very long data sets that can be difficult or impossible to obtain. Pincus (21, 22) devised the theory and method for a measure of regularity closely related to the Kolmogorov entropy, the rate of generation of new information, that can be applied to the typically short and noisy time series of clinical data. This family of statistics, named approximate entropy (ApEn), is rooted in the work of Grassberger and Procaccia (10) and Eckmann and Ruelle (5) and has been widely applied in clinical cardiovascular studies (3, 6–8, 12–19, 24, 26, 29, 32–34).

The method examines time series for similar epochs: more frequent and more similar epochs lead to lower values of ApEn. Informally, given N points, the family of statistics ApEn(m, r, N) is approximately equal to the negative average natural logarithm of the conditional probability that two sequences that are similar for m points remain similar, that is, within a tolerance r, at the next point. Thus a low value of ApEn reflects a high degree of regularity. Importantly, the ApEn algorithm counts each sequence as matching itself, a practice carried over from the work of Eckmann and Ruelle (5) to avoid the occurrence of ln (0) in the calculations. This step has led to discussion of the bias of ApEn (22, 23, 27). In practice, we find that this bias causes ApEn to lack two important expected properties. First, ApEn is heavily dependent on the record length and is uniformly lower than expected for short records. Second, it lacks relative consistency. That is, if ApEn of one data set is higher than that of another, it should, but does not, remain higher for all conditions tested (22). This shortcoming is particularly important, because ApEn has been repeatedly recommended as a relative measure for comparing data sets (22–24).

To reduce this bias, we have developed and characterized a new family of statistics, sample entropy (SampEn), that does not count self-matches. SampEn is derived from approaches developed by Grassberger and co-workers (2, 9–11). SampEn(m, r, N) is precisely the negative natural logarithm of the conditional probability that two sequences similar for m points remain similar at the next point, where self-matches are not included in calculating the probability. Thus a lower value of SampEn also indicates more self-similarity in the time series. In addition to eliminating self-matches, the SampEn algorithm is simpler than the ApEn algorithm, requiring approximately one-half as much time to calculate. SampEn is largely independent of record length and displays relative consistency under circumstances where ApEn does not.

Cross-ApEn is a recently introduced technique for analyzing two related time series to measure the degree of their asynchrony (20, 28). Cross-ApEn is very similar to ApEn in design and intent, differing only in that it compares sequences from one series with those of the second. Because it does not compare a series with itself, bias from self-matches does not arise. A potential problem, however, remains in the necessity for each template to generate a defined, nonzero probability. Thus each template must find at least one match for m + 1 points, or a probability must be assigned to it according to a "correction" strategy. We tested the effect of two extremes of correction strategies on cross-ApEn analysis. We find that cross-ApEn analysis lacks relative consistency, and conclusions about relative synchrony of pairs of time series depend on the unguided selection of analysis schemes. Cross-SampEn, on the
other hand, is defined as long as one template finds a match, and we find that cross-SampEn remains relatively consistent for conditions where cross-ApEn does not.

THEORY

ApEn reports on similarity in time series. We employ the terminology and notation of Grassberger and Procaccia (10), Eckmann and Ruelle (5), and Pincus (21) in describing techniques for estimating the Kolmogorov entropy of a process represented by a time series and the related statistics ApEn and SampEn. The parameters N, m, and r must be fixed for each calculation. N is the length of the time series, m is the length of sequences to be compared, and r is the tolerance for accepting matches. It is convenient to set the tolerance as r × SD, where SD is the standard deviation of the data set, allowing measurements on data sets with different amplitudes to be compared. Throughout this work, all time series have been normalized to have SD = 1.

We proceed as follows: For a time series of N points, $\{u(j): 1 \le j \le N\}$ forms the $N - m + 1$ vectors $x_m(i)$ for $\{i \mid 1 \le i \le N - m + 1\}$, where $x_m(i) = \{u(i + k): 0 \le k \le m - 1\}$ is the vector of m data points from $u(i)$ to $u(i + m - 1)$. The distance between two such vectors is defined to be $d[x(i), x(j)] = \max\{|u(i + k) - u(j + k)|: 0 \le k \le m - 1\}$, the maximum difference of their corresponding scalar components. Let $B_i$ be the number of vectors $x_m(j)$ within r of $x_m(i)$ and let $A_i$ be the number of vectors $x_{m+1}(j)$ within r of $x_{m+1}(i)$. Define the function $C_i^m(r) = B_i/(N - m + 1)$. In calculating $C_i^m(r)$, the vector $x_m(i)$ is called the template, and an instance where a vector $x_m(j)$ is within r of it is called a template match. $C_i^m(r)$ is the probability that any vector $x_m(j)$ is within r of $x_m(i)$. The function $\Phi^m(r) = (N - m + 1)^{-1} \sum_{i=1}^{N-m+1} \ln[C_i^m(r)]$ is the average of the natural logarithms of the functions $C_i^m(r)$. Eckmann and Ruelle (5) suggest approximating the entropy of the underlying process as $\lim_{r \to 0} \lim_{m \to \infty} \lim_{N \to \infty} [\Phi^m(r) - \Phi^{m+1}(r)]$. Because of these limits, this definition is not suited to the analysis of the finite and noisy time series derived from experiments.

Pincus (21) saw that the calculation of $\Phi^m(r) - \Phi^{m+1}(r)$ for fixed parameters m, r, and N had intrinsic interest as a measure of regularity and complexity. He defines the related parameter $\mathrm{ApEn}(m, r) = \lim_{N \to \infty} [\Phi^m(r) - \Phi^{m+1}(r)]$, which for finite data sets is estimated by the statistic $\mathrm{ApEn}(m, r, N) = \Phi^m(r) - \Phi^{m+1}(r)$. Algebraic manipulation reveals that $\mathrm{ApEn}(m, r, N) = (N - m + 1)^{-1} \sum_{i=1}^{N-m+1} \ln[C_i^m(r)] - (N - m)^{-1} \sum_{i=1}^{N-m} \ln[C_i^{m+1}(r)]$. When N is large, ApEn(m, r, N) is approximately equal to $(N - m)^{-1} \sum_{i=1}^{N-m} [-\ln(A_i/B_i)]$, the average over i of the negative natural logarithm of the conditional probability that $d[x_{m+1}(i), x_{m+1}(j)] \le r$ given that $d[x_m(i), x_m(j)] \le r$. The difference between ApEn(m, r, N) and this average logarithmic probability is less than $(N - m + 1)^{-1} \ln(N - m + 1)$, which is <0.05 for $N - m + 1 \ge 90$ and <0.02 for $N - m + 1 \ge 283$. They differ because there are $N - m + 1$ templates of length m, but only $N - m$ templates of length m + 1 (27). Thus, although the quantity $C_{N-m+1}^m(r)$ is defined, $C_{N-m+1}^{m+1}(r)$ is not, simply because the vector $x_{m+1}(N - m + 1)$ does not exist. For practical purposes, ApEn can be thought of as the negative natural logarithm of the probability that sequences that are close for m points remain close for an additional point.

Because conditional probabilities lie between 0 and 1, the parameter ApEn(m, r) is a positive number of infinite range. For finite N, however, the largest possible value of ApEn(m, r, N) is $-\Phi^{m+1}(r) \le -\ln[(N - m)^{-1}]$, so $\mathrm{ApEn}(m, r, N) \le \ln(N - m)$.

ApEn(m, r, N) is biased and suggests more similarity than is present. It is important to note that ApEn takes a template-wise approach to calculating this average logarithmic probability, first calculating a probability for each template. The ApEn algorithm thus requires that each template contribute a defined, nonzero probability. This constraint is overcome by allowing each template to match itself. Formally, because $d[x_m(i), x_m(i)] = 0 \le r$, the ApEn algorithm counts each template as matching itself, a practice we will refer to as self-matching. This ensures that the functions $C_i^m(r)$ are > 0 for all i, thereby avoiding the occurrence of ln (0) in the calculation. Thus ApEn(m, r, N) is certain to be defined under all circumstances. Pincus and Goldberger (23, 27) note that, as a consequence of this practice, ApEn(m, r, N) is a biased statistic. Formally, a statistic is biased if its expected value is not equal to the parameter it estimates. ApEn statistics are biased, because the expected value of ApEn(m, r, N) is less than the parameter ApEn(m, r) (27).

To discuss the bias caused by including self-matches, let us redefine the conditional probability associated with the template $x_m(i)$ by letting $B_i$ denote the number of vectors $x_m(j)$ with $j \ne i$ such that $d[x_m(i), x_m(j)] \le r$, and denoting the number of vectors $x_{m+1}(j)$ with $j \ne i$ such that $d[x_{m+1}(i), x_{m+1}(j)] \le r$ by $A_i$. The ApEn algorithm thus assigns to the template $x_m(i)$ a biased conditional probability of $(A_i + 1)/(B_i + 1)$, which is always greater than the unbiased $A_i/B_i$. In the limit as N approaches infinity, $A_i$ and $B_i$ will generally be large, making the biased and unbiased probabilities asymptotically equivalent. Therefore, this bias is evident only for the analysis of finite data sets and is a characteristic of the statistic ApEn(m, r, N), rather than the parameter ApEn(m, r). For a finite N, however, the result is that ApEn(m, r, N) is biased toward lower values of ApEn and returns values below those predicted by theory.

The largest deviation occurs when a large proportion of templates have $B_i = A_i = 0$, since these templates are assigned a conditional probability of 1, corresponding to perfect order. Furthermore, the difference between the biased and unbiased conditional probabilities assigned to individual templates makes the calculation sensitive to record length in a way that depends on the conditional probability. Suppose that the unbiased conditional probability is known and denote it by CP. For a given $x_m(i)$ let $B_i$ denote the number of template matches without counting self-matches. The original algorithm estimates CP as $(1 + A_i)/(1 + B_i) = (1 + B_i \times \mathrm{CP})/(1 + B_i)$. The fractional error of this
relative to CP is $\mathrm{Err} = \{[(1 + B_i \times \mathrm{CP})/(1 + B_i)] - \mathrm{CP}\}/\mathrm{CP}$. To find the value of $B_i$ necessary to keep the fractional error below a threshold $\mathrm{Err}_{\max}$, we isolate $B_i$ from the inequality $\mathrm{Err} \le \mathrm{Err}_{\max}$, yielding $B_i \ge [1 - \mathrm{CP}(\mathrm{Err}_{\max} + 1)]/(\mathrm{CP} \times \mathrm{Err}_{\max})$. For independent, identically distributed (iid) random numbers, obtaining $B_i$ matches of length m requires a data set containing, on average, $B_i/(\mathrm{CP})^m$ templates. For iid random numbers and m = 2, estimating CP = 0.368 [ApEn(m, r, N) = 1] within $\mathrm{Err}_{\max} = 0.05$ requires $B_i \ge 33$ and a data set of >240 points. Estimating CP = 0.135 [ApEn(m, r, N) = 2] with similar resolution requires $B_i \ge 127$ and >6,900 points.

The most straightforward way to eliminate the bias would be to remove self-matching from the ApEn algorithm, leaving it otherwise unaltered. However, without the inclusion of self-matches, the ApEn algorithm is not defined unless $C_i^{m+1}(r) > 0$ for every i. Removing self-matches would make ApEn statistics highly sensitive to outliers; if there were a single template that matched no other vector, ApEn could not be calculated because of the occurrence of ln (0). For the set of uniform random numbers shown in Fig. 1A, without self-matches, ApEn(1, r, N) would not have been defined for r ≤ 0.63 and N < 4,000; ApEn(2, r, N) would have required r ≥ 1 for N < 4,000. Thus, for many practical applications, self-matches cannot simply be excluded when ApEn statistics are calculated.

Can the bias be corrected? It is suggested that this bias can be reduced with a family of estimators ε by defining $C_{i,\varepsilon}^m(r) = (N - m + 1)^{-1} \{\varepsilon + \text{number of } j \ne i \text{ such that } d[x_m(i), x_m(j)] \le r\}$ (27). Then $\Phi_\varepsilon^m(r) = (N - m + 1)^{-1} \sum_{i=1}^{N-m+1} \ln C_{i,\varepsilon}^m(r)$ and $\mathrm{ApEn}_\varepsilon(m, r, N) = \Phi_\varepsilon^m(r) - \Phi_\varepsilon^{m+1}(r)$. For large N, this is very close to estimating the conditional probabilities by $(\varepsilon + A_i)/(\varepsilon + B_i)$. When ε < 1, the error due to counting self-matches is reduced, but this algorithm would still report a conditional probability of 1 in the event that only self-matches are encountered. Another strategy would be to define $\mathrm{ApEn}_{\varepsilon_2,\varepsilon_1}(m, r, N) = \Phi_{\varepsilon_2}^m(r) - \Phi_{\varepsilon_1}^{m+1}(r)$, where $\varepsilon_1$ and $\varepsilon_2$ are small and $\varepsilon_1 \le \varepsilon_2$. This is similar to estimating the conditional probabilities by $(\varepsilon_1 + A_i)/(\varepsilon_2 + B_i)$. Thus a template reporting only self-matches would be assigned a probability of $\varepsilon_1/\varepsilon_2$. This remains a biased estimate of ApEn, with bias toward $-\ln(\varepsilon_1/\varepsilon_2)$ when a large proportion of templates report only self-matches. If $\varepsilon_1/\varepsilon_2 = (N - m + 1)^{-1}$ and $A_i = B_i = 0$, the bias is toward the lowest observable probability. This is the opposite bias of the original formulation of ApEn, where the bias is toward the highest possible probability. Still another approach would be to use the estimators $\varepsilon_1$ and $\varepsilon_2$ but incorporate them into the calculation only when $A_i$ or $B_i = 0$, respectively. This would estimate the probabilities as $A_i/B_i$ and would simply redefine $A_i = \varepsilon_1$ when $A_i = 0$ and $B_i = \varepsilon_2$ when $B_i = 0$.

These approaches reduce the bias in the estimation of the individual conditional probabilities and ensure that perfect order would not be reported where none had been detected. None of the corrections eliminates bias; the bias is minimized only if $\varepsilon_1/\varepsilon_2 = \mathrm{CP}$, but CP is not known beforehand. Thus no family of estimators ε that minimizes bias can be chosen a priori.

SampEn statistics have reduced bias. We developed SampEn statistics to be free of the bias caused by self-matching. The name refers to the applicability to time series data sampled from a continuous process. In addition, the algorithm suggests ways to employ sample statistics to evaluate the results, as explained below.

Fig. 1. Sample entropy (SampEn) and approximate entropy (ApEn) of random numbers with a uniform distribution. A: frequency histogram of 1,000 numbers. Inset: 25-point excerpt from time series. B: SampEn and ApEn as functions of r (m = 2). Straight line, calculated theoretical values for ApEn and SampEn parameters. In this special case of uniformly distributed random numbers, their theoretical values coincide. Values of r are displayed logarithmically, so that theoretical values appear linear. C: SampEn and ApEn as functions of N (m = 2, r = 0.2). In B and C, confidence intervals for SampEn statistics are displayed as error bars. Straight line is theoretically predicted value of parameter SampEn(2, 0.2). Sets of uniformly distributed random numbers were generated using a minimal standard number generator with added random shuffling (30), and all sets passed a runs test for random arrangement (1). For data shown above and for similar tests in which random data with different probabilistic distributions were used, theoretical values were calculated from expressions for SampEn(m, r) and ApEn(m, r) by use of trapezoidal numerical integration. We verified our calculation of ApEn statistics by comparing our results with published values for Henon and logistic map data (21).

There are two major differences between SampEn and ApEn statistics. First, SampEn does not count self-matches. We justified discounting self-matches on the grounds that entropy is conceived as a measure of the rate of information production (5), and in this
context comparing data with themselves is meaningless. Furthermore, self-matches are explicitly dismissed in the later work of Grassberger and co-workers (2, 9, 11). Second, SampEn does not use a template-wise approach when estimating conditional probabilities. To be defined, SampEn requires only that one template find a match of length m + 1.

We began from the work of Grassberger and Procaccia (10), who defined $C^m(r) = (N - m + 1)^{-1} \sum_{i=1}^{N-m+1} C_i^m(r)$, the average of the $C_i^m(r)$ defined above. This differs from $\Phi^m(r)$ only in that $\Phi^m(r)$ is the average of the natural logarithms of the $C_i^m(r)$. They suggest approximating the Kolmogorov entropy of a process represented by a time series by $\lim_{r \to 0} \lim_{m \to \infty} \lim_{N \to \infty} -\ln[C^{m+1}(r)/C^m(r)]$; self-matches are counted, and $C^{m+1}(r)/C^m(r) = (N - m + 1) \sum_{i=1}^{N-m} A_i / [(N - m) \sum_{i=1}^{N-m+1} B_i]$.

In this form, however, the limits render it unsuitable for the analysis of finite time series with noise. We therefore made two alterations to adapt it to this purpose. First, we followed their later practice in calculating correlation integrals (2, 9, 11) and did not consider self-matches when computing $C^m(r)$. Second, we considered only the first N − m vectors of length m, ensuring that, for $1 \le i \le N - m$, $x_m(i)$ and $x_{m+1}(i)$ were defined.

We defined $B_i^m(r)$ as $(N - m - 1)^{-1}$ times the number of vectors $x_m(j)$ within r of $x_m(i)$, where j ranges from 1 to N − m, and $j \ne i$ to exclude self-matches. We then defined $B^m(r) = (N - m)^{-1} \sum_{i=1}^{N-m} B_i^m(r)$. Similarly, we defined $A_i^m(r)$ as $(N - m - 1)^{-1}$ times the number of vectors $x_{m+1}(j)$ within r of $x_{m+1}(i)$, where j ranges from 1 to N − m ($j \ne i$), and set $A^m(r) = (N - m)^{-1} \sum_{i=1}^{N-m} A_i^m(r)$. $B^m(r)$ is then the probability that two sequences will match for m points, whereas $A^m(r)$ is the probability that two sequences will match for m + 1 points. We then defined the parameter $\mathrm{SampEn}(m, r) = \lim_{N \to \infty} \{-\ln[A^m(r)/B^m(r)]\}$, which is estimated by the statistic $\mathrm{SampEn}(m, r, N) = -\ln[A^m(r)/B^m(r)]$. Where there is no confusion about the parameter r and the length m of the template vector, we set $B = \{[(N - m - 1)(N - m)]/2\} B^m(r)$ and $A = \{[(N - m - 1)(N - m)]/2\} A^m(r)$, so that B is the total number of template matches of length m and A is the total number of forward matches of length m + 1. We note that $A/B = A^m(r)/B^m(r)$, so SampEn(m, r, N) can be expressed as $-\ln(A/B)$.

The quantity A/B is precisely the conditional probability that two sequences within a tolerance r for m points remain within r of each other at the next point. In contrast to ApEn(m, r, N), which calculates probabilities in a template-wise fashion, SampEn(m, r, N) calculates the negative logarithm of a probability associated with the time series as a whole. SampEn(m, r, N) will be defined except when B = 0, in which case no regularity has been detected, or when A = 0, which corresponds to a conditional probability of 0 and an infinite value of SampEn(m, r, N). The lowest nonzero conditional probability that this algorithm can report is $2[(N - m - 1)(N - m)]^{-1}$. Thus, the statistic SampEn(m, r, N) has $\ln(N - m) + \ln(N - m - 1) - \ln(2)$ as an upper bound, nearly doubling ln (N − m), the dynamic range of ApEn(m, r, N).

Confidence intervals inform the implementation of SampEn statistics. SampEn is not defined unless template and forward matches occur and is not necessarily reliable for small numbers of matches. We have reviewed SampEn(m, r, N) calculation as a process of sampling information about regularity in the time series and used sample statistics to inform us of the reliability of the calculated result. For example, say that we find B template matches, allowing for no more than B forward matches, A of which actually occur. We assign a value of 1 to the A forward matches and a value of 0 to the (B − A) potential forward matches that do not occur and compute the conditional probability measured by SampEn as the average of this sample of 0s and 1s. For operational purposes, we will assume that the sample averages follow a Student's $t_d$ distribution, where d is the number of degrees of freedom. We can then say with 95% confidence that the "true" average conditional probability of the process is within $\mathrm{SD} \cdot t_{B-1,0.975}/\sqrt{B}$ of the sample average, where SD is the sample standard deviation and $t_{B-1,0.975}$ is the upper 2.5th percentile of a t distribution with B − 1 degrees of freedom (31). The size of the confidence intervals depends on the number B and the number of forward matches. Informally, large confidence intervals around SampEn(m, r, N) indicate that there are insufficient data to estimate the conditional probability with confidence for that choice of m and r. In addition, confidence intervals allow standard statistical tests of the significance of differences between data sets.

The confidence intervals for SampEn are displayed as error bars in the figures. For some small values of N and r, no value of SampEn is given. This indicates that B = 0, A = 0, or the confidence intervals extended to a probability of >1 or <0. In these cases, no value of SampEn(m, r, N) can be assigned with confidence.

Cross-ApEn and cross-SampEn measure asynchrony. Cross-ApEn is a recently introduced technique for comparing two different time series to assess their degree of asynchrony or dissimilarity (20, 28). The definition of cross-ApEn is very similar to ApEn. Given two time series of N points $\{u(j): 1 \le j \le N\}$ and $\{v(j): 1 \le j \le N\}$, form the vectors $x_m(i) = \{u(i + k): 0 \le k \le m - 1\}$ and $y_m(i) = \{v(i + k): 0 \le k \le m - 1\}$. The distance between two such vectors is defined as $d[x_m(i), y_m(j)] = \max\{|u(i + k) - v(j + k)|: 0 \le k \le m - 1\}$. Define $C_i^m(r)$(v‖u) as the number of $y_m(j)$ within r of $x_m(i)$ divided by (N − m + 1), then define $\Phi^m(r)$(v‖u) $= (N - m + 1)^{-1} \sum_{i=1}^{N-m+1} \ln[C_i^m(r)$(v‖u)$]$, and cross-ApEn(m, r, N)(v‖u) $= \Phi^m(r)$(v‖u) $- \Phi^{m+1}(r)$(v‖u). This is identical to the definition of the statistic ApEn, except the templates are chosen from the series u and compared with vectors from v. There is thus a directionality to this analysis, and we will call the series that contributes the templates the template series and the series with which they are compared the target series. We can then refer to cross-ApEn(m, r, N)(target‖template).
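The definitions of ApEn(m, r, N) and SampEn(m, r, N) given in THEORY can be sketched in a few lines of Python. This is a minimal, unoptimized illustration under stated assumptions, not a definitive implementation: the names `apen`, `sampen`, and `_max_dist` are hypothetical, the series is assumed to be already normalized to SD = 1 so that r is used directly as the tolerance, and returning `None` when B = 0 and infinity when A = 0 is only one possible convention for those two cases.

```python
import math


def _max_dist(u, i, j, length):
    """Maximum absolute difference between the vectors of `length` points
    starting at (0-based) positions i and j of the series u."""
    return max(abs(u[i + k] - u[j + k]) for k in range(length))


def apen(u, m, r):
    """ApEn(m, r, N) = Phi^m(r) - Phi^(m+1)(r), with self-matches counted."""
    n = len(u)

    def phi(length):
        count = n - length + 1  # number of templates of this length
        total = 0.0
        for i in range(count):
            # j == i is included, so every template has at least one match
            matches = sum(
                1 for j in range(count) if _max_dist(u, i, j, length) <= r
            )
            total += math.log(matches / count)
        return total / count

    return phi(m) - phi(m + 1)


def sampen(u, m, r):
    """SampEn(m, r, N) = -ln(A/B), with self-matches excluded.

    Returns (value, A, B). Only the first N - m templates are used, so the
    vectors of length m and m + 1 are both defined for every template.
    """
    n = len(u)
    a = b = 0
    for i in range(n - m):
        for j in range(i + 1, n - m):  # j > i: each pair once, no self-match
            if _max_dist(u, i, j, m) <= r:
                b += 1  # template match of length m
                if abs(u[i + m] - u[j + m]) <= r:
                    a += 1  # forward match of length m + 1
    if b == 0:
        return None, a, b  # assumed convention: no regularity detected
    if a == 0:
        return math.inf, a, b  # conditional probability of 0
    return -math.log(a / b), a, b
```

Because `sampen` also returns the counts A and B, a caller can judge how many matches support the estimate before interpreting the value.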
We made two observations. First, because no template is compared with itself, there are no self-matches. Consequently, $C_i^m(r)$(v‖u) can equal 0, and there is no assurance that cross-ApEn(m, r, N)(v‖u) will be defined. Second, there is a "direction dependence" of cross-ApEn analysis. We defined cross-SampEn to avoid both potential problems.

Cross-ApEn is not always defined. As noted above, cross-ApEn does not include self-matching and, thus, does not inherently suffer from the same bias as ApEn. A potential problem, however, remains in the necessity for each template to generate a defined, nonzero conditional probability. Thus each template must find at least one match for m + 1 points, or a probability must be assigned to it. No guidelines have been suggested for handling this potential difficulty. Cross-SampEn, on the other hand, requires only that one pair of vectors in the two series match for m + 1 points.

The family of MIX(P) stochastic processes (21) provided a testing ground for cross-ApEn. Informally, the MIX(P) time series of N points, where P is between 0 and 1, is a sine wave, where N × P randomly chosen points have been replaced with random noise. We calculated cross-ApEn(1, r, 250) for the pair [MIX(Q)‖MIX(P)] and its direction conjugate [MIX(P)‖MIX(Q)] for 16 realizations of each of the 6 combinations of P = 0.1, 0.2, 0.3 and Q = 0.5, 0.7 over a range of values of r from 0.01 to 1.0. Cross-ApEn(1, r, 250) [MIX(Q)‖MIX(P)] was not defined for any of the 96 pairs for r ≤ 0.16 and was defined for all of them only for r ≥ 0.50. Cross-ApEn(1, r, 250) [MIX(P)‖MIX(Q)] was not defined for any values of r ≤ 0.32 and was defined for all pairs only for r = 1.0.

To broaden the conditions for which cross-ApEn was defined, we introduced a correction factor into its algorithm. To avoid ln (0) whenever $C_i^m(r)$(v‖u) = 0 or $C_i^{m+1}(r)$(v‖u) = 0, we redefined them to be positive and nonzero. Rather than include these factors into the calculation of each $C_i^m(r)$(v‖u) and $C_i^{m+1}(r)$(v‖u), we minimized their impact on the overall calculation by including them only when needed to ensure that cross-ApEn is defined. The cost of this adjustment, however, is the introduction of bias. For this reason, we tested cross-ApEn with two different correction strategies.

The first strategy, which we called bias 0, was very similar to self-matching; we simply set any $C_i^m(r)$(v‖u) or $C_i^{m+1}(r)$(v‖u) = 1 if it would otherwise have been 0. Thus, if a template matched no other at all, it would be assigned a conditional probability of 1, as with the original description of ApEn. If, however, $C_i^m(r)$(v‖u) ≠ 0 but $C_i^{m+1}(r)$(v‖u) = 0, we redefined $C_i^{m+1}(r)$(v‖u) $= (N - m)^{-1}$, so that the probability assigned would be the lowest observable, nonzero probability given the nonzero value of $C_i^m(r)$(v‖u).

The second approach, which we called bias max, also only modified the functions $C_i^m(r)$(v‖u) and $C_i^{m+1}(r)$(v‖u) that would otherwise have been 0. Here, $C_i^m(r)$(v‖u) was redefined to be 1 when it would otherwise have been 0 and $C_i^{m+1}(r)$(v‖u) was redefined to be $(N - m + 1)^{-1}$, as for bias 0. The only difference between the two strategies is that bias max assigns to a template yielding no matches at all a probability of $(N - m)^{-1}$, the lowest nonzero probability allowed by the length of the time series. Thus bias 0 sets the bias toward a cross-ApEn value of 0 in the absence of any matches, whereas bias max sets the bias toward the highest observable value of cross-ApEn.

Cross-ApEn is direction dependent; cross-SampEn is not. Because of the logarithms inside the summation, $\Phi^m(r)$(v‖u) will not generally be equal to $\Phi^m(r)$(u‖v). Thus cross-ApEn(m, r, N)(v‖u) and its direction conjugate cross-ApEn(m, r, N)(u‖v) are unequal in most cases.

In defining cross-SampEn, we set $B_i^m(r)$(v‖u) as $(N - m)^{-1}$ times the number of vectors $y_m(j)$ within r of $x_m(i)$, where j ranges from 1 to N − m. We then defined $B^m(r)$(v‖u) $= (N - m)^{-1} \sum_{i=1}^{N-m} B_i^m(r)$(v‖u). Similarly, we set $A_i^m(r)$(v‖u) as $(N - m)^{-1}$ times the number of vectors $y_{m+1}(j)$ within r of $x_{m+1}(i)$, where j ranges from 1 to N − m. We then defined $A^m(r)$(v‖u) $= (N - m)^{-1} \sum_{i=1}^{N-m} A_i^m(r)$(v‖u). Finally, we set cross-SampEn(m, r, N)(v‖u) $= -\ln\{[A^m(r)$(v‖u)$]/[B^m(r)$(v‖u)$]\}$. Examining this definition for direction dependence, we see that $(N - m) B_i^m(r)$(v‖u) is the number of vectors from v within r of the ith template of the series u. Summing over the templates, we see that $\sum_{i=1}^{N-m} (N - m) B_i^m(r)$(v‖u) simply counts the number of pairs of vectors from the two series that match within r. The number of pairs that match is clearly independent of which series is the template and which is the target. Because the last summation is equal to $(N - m)^2 B^m(r)$(v‖u), it follows that $B^m(r)$(v‖u) is also direction independent, implying that cross-SampEn(m, r, N)(v‖u) = cross-SampEn(m, r, N)(u‖v). We further noted that cross-SampEn will be defined provided that $A^m(r)$(v‖u) ≠ 0. Cross-SampEn thus requires only that one pair of vectors in the two series match for m + 1 points.

We calculated cross-SampEn(1, r, 250) for the same realizations of the [MIX(Q)‖MIX(P)] pairs used above to test cross-ApEn and over the same range of r. In contrast to cross-ApEn(1, r, 250) [MIX(P)‖MIX(Q)] and cross-ApEn(1, r, 250) [MIX(Q)‖MIX(P)], cross-SampEn(1, r, 250) [MIX(P)‖MIX(Q)], which is identical to cross-SampEn(1, r, 250) [MIX(Q)‖MIX(P)], was defined for all 96 pairs over the entire range of r considered.

ApEn and SampEn can be calculated analytically for series of random numbers. ApEn and SampEn derive from formulas suggested to estimate the Kolmogorov entropy of a process represented by a time series. At their root, each is a measurement of the conditional probability that two vectors that are close to each other for m points will remain close at the next point. There are several models, including sets of iid random numbers, for which the theoretical values of the parameters ApEn(m, r) (21) and SampEn(m, r) can be calculated. We show here the case of uniform random numbers, for which the theoretical values of ApEn and SampEn are nearly identical.
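The contrast between cross-ApEn and cross-SampEn described above can be illustrated with a short Python sketch. It is an illustration under stated assumptions rather than a reference implementation: the names `cross_apen`, `cross_sampen`, and `_within` are hypothetical, both series are assumed to have the same length N and to be normalized already, no correction strategy (bias 0 or bias max) is applied, and `None` is used here to mark an undefined result.

```python
import math


def _within(u, v, i, j, length, r):
    """True if the vector of `length` points starting at u[i] is within r of
    the vector starting at v[j] (maximum absolute difference)."""
    return max(abs(u[i + k] - v[j + k]) for k in range(length)) <= r


def cross_apen(u, v, m, r):
    """Uncorrected cross-ApEn with templates from u and targets from v.

    Returns None as soon as any template finds no match, because the
    logarithm of its probability would be ln(0).
    """
    n = len(u)

    def phi(length):
        count = n - length + 1
        total = 0.0
        for i in range(count):
            matches = sum(
                1 for j in range(count) if _within(u, v, i, j, length, r)
            )
            if matches == 0:
                return None
            total += math.log(matches / count)
        return total / count

    phi_m, phi_m1 = phi(m), phi(m + 1)
    if phi_m is None or phi_m1 is None:
        return None
    return phi_m - phi_m1


def cross_sampen(u, v, m, r):
    """Cross-SampEn = -ln(A/B), pooling matches over all template/target pairs."""
    n = len(u)
    a = b = 0
    for i in range(n - m):
        for j in range(n - m):
            if _within(u, v, i, j, m, r):
                b += 1  # pair matches for m points
                if abs(u[i + m] - v[j + m]) <= r:
                    a += 1  # pair also matches at the next point
    if b == 0:
        return None  # assumed convention: no matching pair at all
    if a == 0:
        return math.inf
    return -math.log(a / b)


# One series is constant; the other leaves the tolerance band halfway through.
flat = [0.0] * 10
step = [0.0] * 5 + [5.0] * 5
print(cross_apen(flat, step, 1, 0.2))    # defined: every template finds a match
print(cross_apen(step, flat, 1, 0.2))    # None: templates equal to 5.0 match nothing
print(cross_sampen(flat, step, 1, 0.2) == cross_sampen(step, flat, 1, 0.2))  # True
```

The pooled counts A and B are sums over pairs of vectors, so swapping the two series leaves them unchanged, which is why the last comparison holds exactly.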
The expected value of the key probability can be RESULTS AND DISCUSSION
calculated analytically for series of iid numbers based
only on their probabilistic distribution. The numbers’ SampEn agrees with theory more closely than ApEn.
independence implies that the probability that two For most processes, ApEn statistics are expected to
randomly selected sequences within rSD of each other have two properties. First, the conditional probability
for the first m points will remain within r at their next points is simply the probability that any two points will be within a distance rSD of each other. For random numbers with density function p(x) and standard deviation SD, the expression for this probability is $\int_{-\infty}^{\infty} p(t) \left[ \int_{t - r\,\mathrm{SD}}^{t + r\,\mathrm{SD}} p(x)\,dx \right] dt$. Because the parameter ApEn(m, r) measures the average of the negative logarithm of this probability, it is expected to return a value of $-\int_{-\infty}^{\infty} p(t) \ln \left[ \int_{t - r\,\mathrm{SD}}^{t + r\,\mathrm{SD}} p(x)\,dx \right] dt$ (21). SampEn(m, r), on the other hand, measures the logarithm of the conditional probability and is thus expected to return a value of $-\ln \left\{ \int_{-\infty}^{\infty} p(t) \left[ \int_{t - r\,\mathrm{SD}}^{t + r\,\mathrm{SD}} p(x)\,dx \right] dt \right\}$.

Because these expressions for the expected values of the parameters ApEn and SampEn depend solely on r and the probabilistic character of the data, uniform random numbers provide a benchmark for testing the estimation of ApEn over a range of parameters. In particular, ApEn and SampEn are expected to give identical results for uniformly distributed random numbers. Figure 1A shows a histogram of the random numbers used and an excerpt from the sequence (inset). Taking the derivative of $-\int_{-\infty}^{\infty} p(t) \ln \left[ \int_{t - r\,\mathrm{SD}}^{t + r\,\mathrm{SD}} p(x)\,dx \right] dt$ with respect to ln (r) reveals that, for small r (in this case r ≤ 1), ApEn(m, r) decreases in proportion to ln (r) for small r. This result generalizes to random numbers with other distributions provided that their density functions are smoothly continuous and bounded. The expected values are shown as the straight line in Fig. 1B. For random numbers with other distributions, the theoretical values of ApEn and SampEn will differ slightly. This is explained by noting that because −ln is a convex function, Jensen’s inequality (4) applied to each of the integral expressions implies that the ApEn is expected to return a value greater than or equal to SampEn.

that sequences within r of each other remain within r should decrease as r decreases and the criterion for matching becomes more stringent. In other words, ApEn(m, r, N) should increase as r decreases (22, 27). This expected property is demonstrated in Fig. 1B by the straight line, which plots the theoretically predicted values of ApEn and SampEn. Second, ApEn should be independent of record length. The plot of theoretical values of ApEn and SampEn in Fig. 1B illustrates this expectation. Because of their similarity, we expect SampEn statistics to exhibit similar properties. It has been suggested that record lengths of 10^m–20^m should be sufficient to estimate ApEn(m, r) (27).

We tested ApEn and SampEn statistics on uniform, iid random numbers, because the results could be compared with the analytically calculated expected values. Figure 1, B and C, shows the performance of ApEn(2, r, N) and SampEn(2, r, N) on uniform iid random numbers. SampEn(2, r, N) very closely matches the expected results for r ≥ 0.03 and N ≥ 100 (Fig. 1, B and C), whereas ApEn(2, r, N) differs markedly from expectations for N < 1,000 and r < 0.2. Figure 2 shows SampEn and ApEn as functions of r (m = 2) for three sets of uniform random numbers consisting of 100, 5,000, and 20,000 points. SampEn statistics for r = 0.2 are in agreement with theory for much shorter data sets. We investigated the general applicability of SampEn and ApEn statistics to random numbers with other distributions. The analysis of numbers with Gaussian, exponential, and γ-distributions with parameter λ = 1, 2, . . . , 10 gave results essentially identical to those shown in Fig. 1.
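The integral expressions for the expected values of ApEn(m, r) and SampEn(m, r) can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes standard Gaussian data (SD = 1), truncates the outer integral at ±6 SD, and uses a simple midpoint rule. The function names `window_prob` and `expected_values_gaussian` are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def window_prob(t, r):
    """P(|X - t| <= r) for X ~ N(0, 1); erfc keeps precision in the tails."""
    a = abs(t)  # the probability is symmetric in t
    return 0.5 * (math.erfc((a - r) / SQRT2) - math.erfc((a + r) / SQRT2))

def expected_values_gaussian(r, half_width=6.0, n_grid=12000):
    """Midpoint-rule estimates of the two integral expressions (assumed SD = 1).

    Returns (expected ApEn, expected SampEn, matching probability).
    """
    h = 2.0 * half_width / n_grid
    mean_log_q = 0.0  # approximates the integral of p(t) * ln q(t)
    mean_q = 0.0      # approximates the integral of p(t) * q(t)
    for i in range(n_grid):
        t = -half_width + (i + 0.5) * h
        p_t = math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)
        q = window_prob(t, r)
        mean_log_q += p_t * math.log(q) * h
        mean_q += p_t * q * h
    return -mean_log_q, -math.log(mean_q), mean_q

apen_theory, sampen_theory, match_prob = expected_values_gaussian(0.2)
print(apen_theory, sampen_theory, match_prob)
```

Under these assumptions the matching probability agrees with the closed form erf(r/2) for Gaussian data (about 0.112 at r = 0.2), and the ApEn expectation comes out larger than the SampEn expectation, as Jensen's inequality requires.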
Fig. 2. Effect of record length on SampEn and ApEn as a function of r with m = 2 for uniform random numbers.
Confidence intervals of SampEn calculations are displayed as error bars. Plots are SampEn and ApEn as functions
of r for sets of 100 (A), 5,000 (B), and 20,000 (C) points. Straight lines, theoretically predicted values of ApEn and
SampEn statistics.
Fig. 3. Improved relative consistency but residual bias in SampEn. A and B: relative consistency is maintained by
SampEn but not by ApEn for MIX(0.1) and MIX(0.9) processes. A: ApEn statistics as a function of r (m = 2,
N = 1000) for MIX(0.1) and MIX(0.9). Inset: 25 sample points of MIX(0.1) (top) and MIX(0.9) (bottom). B: SampEn
statistics for MIX(0.1) and MIX(0.9). C: residual bias of SampEn(2, 0.2, N) statistics for short record lengths of
independent identically distributed Gaussian numbers. For N = 4–10^2, SampEn(2, 0.2, N) was calculated for ≥10^5
sets of random numbers and averaged.
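The SampEn(2, 0.2, N) statistic referred to in these captions can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the authors' implementation: templates are compared with the maximum (Chebyshev) distance, the tolerance is r times the population SD of the record, a match means a distance less than or equal to the tolerance, each template pair is counted once, self-matches are excluded, and only the first N − m templates are used. The function name `sampen` and the choice to return `None` when no matches are found are hypothetical.

```python
import math
import random
import statistics

def sampen(u, m=2, r=0.2):
    """Sketch of SampEn(m, r, N) = -ln(A / B) for a list of floats u."""
    n = len(u)
    tol = r * statistics.pstdev(u)  # assumed: tolerance scaled by the record SD
    a = 0  # pairs that match for m + 1 points (forward matches)
    b = 0  # pairs that match for m points (template matches)
    for i in range(n - m):
        for j in range(i + 1, n - m):  # j > i, so no self-matches
            if all(abs(u[i + k] - u[j + k]) <= tol for k in range(m)):
                b += 1
                if abs(u[i + m] - u[j + m]) <= tol:
                    a += 1
    if a == 0 or b == 0:
        return None  # statistic undefined: no template or forward matches
    return math.log(b / a)  # equal to -ln(A / B)

rng = random.Random(1)
record = [rng.gauss(0.0, 1.0) for _ in range(1000)]
estimate = sampen(record, m=2, r=0.2)
theory = -math.log(math.erf(0.2 / 2))  # expected value for independent Gaussian templates
print(estimate, theory)
```

For a long Gaussian record the estimate should fall near −ln erf(r/2). For very short records the counts A and B are small, and the statistic may be biased or undefined.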
SampEn shows relative consistency where ApEn does not. A critically important expected feature of ApEn is relative consistency (22, 27). That is, it is expected that, for most processes, if ApEn(m_1, r_1)(S) ≤ ApEn(m_1, r_1)(T), then ApEn(m_2, r_2)(S) ≤ ApEn(m_2, r_2)(T). That is, if record S exhibits more regularity than record T for one pair of parameters m and r, it is expected to do so for all other pairs. Graphically, plots of ApEn as a function of r for different data sets should not cross over one another. The determination that one set of data exhibits greater regularity than another can be made only when this condition is met. We tested this expectation using 1,000-point realizations of the MIX(P) process, where the degree of order could be specified. Figure 3A shows that the ApEn statistics of the less-ordered MIX(0.9), which has, on average, few matches of length m for a given template (small B_i) for small r, rises as a function of r and crosses over a plot of ApEn statistics of the more-ordered MIX(0.1). For r < 0.05, one would conclude incorrectly that MIX(0.9) was more ordered than MIX(0.1). Thus relative consistency does not hold for ApEn statistics.

We investigated the mechanism responsible for this lack of relative consistency. Note that, for r = 0.5, ApEn statistics correctly distinguished MIX(0.1) and MIX(0.9). For this value of r, the MIX(0.1) data yielded an average of >46 matches per template and <28 forward matches per template, whereas the MIX(0.9) data yielded <37 and 10, respectively. These large numbers of matches render the bias insignificant; that is, the unbiased A_i/B_i (28/46 and 10/37) is not very different from the biased (A_i + 1)/(B_i + 1) (29/47 and 11/38). As r is decreased, however, the number of template matches decreased more for the MIX(0.9) data than for the MIX(0.1) data. This made the bias significant for MIX(0.9) data under conditions for which it was insignificant in the MIX(0.1) data. For example, when r = 0.05, ApEn(2, r, 1,000) of MIX(0.1) was 0.463, whereas the value for MIX(0.9) was 0.505. The spurious similarity of the ApEn statistics is due to bias. The MIX(0.1) data yielded <27 matches per template and 22 forward matches, whereas the MIX(0.9) data yielded only 0.45 and 0.01, respectively. Thus more than one-half of the templates of the MIX(0.9) data matched no other templates and were assigned a conditional probability of 1. In this example, ApEn statistics lack relative consistency, because less-ordered data sets have fewer matching templates and are more vulnerable to the bias generated by self-matches. SampEn analysis of the same data, on the other hand, reports correctly over the whole range of r (Fig. 3B).

Are SampEn statistics relatively consistent? We have shown that SampEn statistics appear to be relatively consistent over the family of MIX(P) processes, whereas ApEn statistics are not. Although we believe that relative consistency should be preserved for processes for which probabilistic character is understood, we see no general reason why ApEn or SampEn statistics should remain relatively consistent for all time series and all choices of parameters.

We propose a general, but by no means exhaustive, explanation for this phenomenon. SampEn is, in essence, an event-counting statistic, where the events are instances of vectors being similar to one another. When these events are sparse, the statistics are expected to be unstable, which might lead to a lack of relative consistency. Recall that SampEn(m, r, N) is less than or equal to ln (B), the natural logarithm of the number of template matches. Suppose SampEn(m, r, N)(S) < SampEn(m, r, N)(T) and that the number of T’s template matches, B_T, is less than the number of S’s template matches, B_S, which would be consistent with T displaying less order than S. Provided that A_T and A_S, the number of forward matches, are relatively large, both
SampEn statistics will be considerably lower than their upper bounds. As r decreases, B_T and A_T are expected to decrease more rapidly than B_S and A_S. Thus, as B_T becomes very small, SampEn(m, r, N)(T) will begin to decrease, approaching the value ln (B_T), and could cross over a graph of SampEn(m, r, N)(S), where or while B_S is still relatively large. Furthermore, as the number of template matches decreases, small changes in the number of forward matches can have a large effect on the observed conditional probability. Thus the discrete nature of the SampEn probability estimation could lead to small degrees of crossover and intermittent failure of relative consistency, and we cannot say that SampEn will always be relatively consistent. We have shown, however, that SampEn is relatively consistent for conditions where ApEn is not, and we have not observed any circumstance where ApEn maintains relative consistency and SampEn does not.

One source of the residual bias in SampEn is correlation of templates. Although more consonant with theory than ApEn, we found that SampEn statistics deviated from predictions for very short data sets. For 10^5 sets of Gaussian random numbers with m = 2 and r = 0.2, we found that the deviation was <3% for record lengths >100 but as high as 35% for sets of 15 points. Figure 3C shows the biased results of SampEn(2, 0.2, N) for the range of 4 ≤ N ≤ 100. We suspected that the integral expressions for the parameters ApEn(m, r) and SampEn(m, r) could not be used as expected values of the statistics ApEn(m, r, N) and SampEn(m, r, N) under all conditions, because the expressions relied on the assumption that the templates were independent of one another. As N decreases and m increases, however, a larger proportion of templates are comprised of overlapping segments of the record and are thus not independent. Because of this correlation, results might deviate from these predictions for short data sets. We thus tested the hypothesis that the majority of the bias results from nonindependence of the templates.

One way to test the hypothesis is to compare observed values of SampEn with those obtained from a model accounting for template correlation. The hypothesis predicts that the values should match. We tested the simplest case of m = 2 and N = 4, where a data set can be represented by {w, x, y, z}, so that there are exactly two vectors of length m + 1 = 3, (w, x, y) and (x, y, z). If (w, x) is close to (x, y), it stands to reason that (w, x, y) will have a higher than expected probability of being close to (x, y, z). Formally, the conditional probability that (w, x, y) and (x, y, z) will be close given (w, x) and (x, y) are close is

$$\frac{\int_{-\infty}^{\infty} p(w) \left( \int_{w - r\,\mathrm{SD}}^{w + r\,\mathrm{SD}} p(x) \left\{ \int_{x - r\,\mathrm{SD}}^{x + r\,\mathrm{SD}} p(y) \left[ \int_{y - r\,\mathrm{SD}}^{y + r\,\mathrm{SD}} p(z)\,dz \right] dy \right\} dx \right) dw}{\int_{-\infty}^{\infty} p(w) \left\{ \int_{w - r\,\mathrm{SD}}^{w + r\,\mathrm{SD}} p(x) \left[ \int_{x - r\,\mathrm{SD}}^{x + r\,\mathrm{SD}} p(y)\,dy \right] dx \right\} dw}$$

where p(z) is the Gaussian probability density function. For r = 0.2, this was evaluated by numerical integration and found to be 0.137. For 10^6 sets of 4 iid Gaussian numbers, we calculated SampEn(2, 0.2, 4) to be 0.140. Thus the observed and expected results nearly agree, indicating that bias due to nonindependence of templates accounts for most of the observed deviation from the conditional probability of 0.112 expected for independent templates in this case.

A second way to test the hypothesis is to calculate SampEn statistics for one set of data under two conditions: with and without overlapping templates. The hypothesis predicts that the results should not match. For this test, we chose the case of m = 2 and N = 6, the shortest record containing two nonoverlapping templates of length m + 1 = 3. We can represent each set as {a, b, c, x, y, z}. For 10^6 sets of six Gaussian random numbers, we calculated the conditional probability that (a, b, c) was within r of (x, y, z) given that their first two points were close, thus calculating the probability for pairs of disjoint templates. The result was 0.111, very close to the expected 0.112 for independent templates. For the same number sets, we then calculated the average value of SampEn(2, 0.2, 6), and the result was 0.094. Thus the two results do not match, in support of the hypothesis. We conclude from this analysis that the statistics SampEn(m, r, N) are not completely unbiased under all conditions and that the bias of SampEn for very small data sets is largely due to nonindependence of templates.

One method for removing this bias would be to partition the time series $\{u_j \mid 1 \le j \le N\}$ into the m + 1 sets of neighboring, disjoint vectors of length m + 1, $X_i = \{[u_{i+k(m+1)}, u_{i+k(m+1)+1}, \ldots, u_{i+k(m+1)+m}] : 0 \le k \le [N - (m + i)]/(m + 1)\}$, where i, the initial point of the first template, ranges from 1 to m + 1. The conditional probability that vectors close for m points remain close at the next point would be calculated for each of the sets of vectors X_i and then averaged. Because this calculation compares only disjoint templates, it will not suffer from the bias introduced by nonindependent templates. This truly unbiased approach has the potentially severe limitation of reducing the number of possible template matches and enlarging the confidence intervals about the SampEn estimate. Because this bias appears to be present only for very small N, the disjoint template approach does not appear necessary in usual practice.

Cross-SampEn shows relative consistency where cross-ApEn does not. As noted above, an essential feature of the measures of order is their relative consistency. That is, if one series is more ordered than another, it should have lower values of ApEn and SampEn for all conditions. We can extend this idea to cross-ApEn and cross-SampEn; if a pair of series is more synchronous than another pair, it should have lower values of cross-ApEn and cross-SampEn statistics for all conditions tested.

We tested the ability of cross-ApEn and cross-SampEn to distinguish between MIX(0.1) and the less-ordered MIX(0.6) processes. The strategy was to compare each with the intermediate MIX(0.3) process. The expected result was that the [MIX(0.3), MIX(0.1)] pair should appear more ordered than the [MIX(0.3), MIX(0.6)] pair, because MIX(0.1) is significantly more ordered than MIX(0.6). That is, cross-ApEn(2, r, 250) [MIX(0.3) ‖ MIX(0.1)] should be less than cross-ApEn(2, r, 250) [MIX(0.3) ‖ MIX(0.6)], and cross-
SampEn(2, r, 250) [MIX(0.3) ‖ MIX(0.1)] should be less than cross-SampEn(2, r, 250) [MIX(0.3) ‖ MIX(0.6)].

We tested this prediction bidirectionally, that is, MIX(0.3) served as the template series for one analysis and as the target series for the next, and over a range of tolerances r by using the bias-max and bias-0 strategies for ensuring that cross-ApEn was defined. The results are shown in Fig. 4. MIX(0.3) was the template series in Fig. 4, C and E, and the target series in Fig. 4, D and F. The expected result is that cross-ApEn and cross-SampEn should be less for the [MIX(0.3), MIX(0.1)] pair than for the [MIX(0.3), MIX(0.6)] pair. That is, the circles should always be below the squares. We found this to be true for only one of the four tests of cross-ApEn, the case of using MIX(0.3) as the target series with the bias-max correction strategy (Fig. 4D). In the other cases, the order of results was reversed (Fig. 4C) or crossed over (Fig. 4, E and F). As shown in Fig. 4B, cross-SampEn returned the expected results with a high degree of confidence across the range of tolerances r.

Thus cross-ApEn statistics fail as a means of judging the relative order of two time series by their similarity to a third series. In practice, however, cross-ApEn has been used differently: to determine the relative synchrony of two pairs of time series of clinical data from different patients. We thus tested cross-ApEn and cross-SampEn on two sections of a long multivariate cardiovascular time series used in the 1991 Santa Fe competition for time series forecasting (35). The series consisted of concurrent measurements of a sleeping patient’s heart rate (hr) and chest volume (cv). We
compared the pairs (hr1, cv1) and (hr2, cv2) shown in Fig. 5A, excerpted from the larger time series. Here, the expected result was not known beforehand, and our question was whether one pair consistently appeared more synchronous than the other. This is an extension of the expected relative consistency of ApEn discussed above.

Figure 5 shows the results for N = 250, m = 1, and a range of r. We set m = 1, in accordance with published practice for analyzing series of similar length (28), and the record length of N = 250 exceeds the 10^m–20^m points recommended for ApEn analysis (27). Time series are shown in Fig. 5A. The question is whether the pair (hr1, cv1) has more joint synchrony than the pair (hr2, cv2). The expected result is that the circles should be consistently higher or lower than the squares for Fig. 5, D–F. This was not the case; conclusions about relative synchrony by use of cross-ApEn analysis depended on which series served as the template and on the correction strategy, and there was no consistent result. Cross-SampEn, on the other hand, consistently reported that (hr1, cv1) had more joint synchrony than (hr2, cv2) (Fig. 5B).

For 32 sets of these data, we further examined the correction methods, calculating cross-ApEn(m, r, N)(cv ‖ hr) and cross-ApEn(m, r, N)(hr ‖ cv) for each set. For the relaxed condition of r = 1.0, we found that cross-ApEn(1, 1, 250)(cv ‖ hr) was defined for only 19 of the 32 cases, whereas cross-ApEn(1, 1, 250)(hr ‖ cv) was defined for only 12 cases. For the more stringent case of r = 0.2, cross-ApEn(1, 0.2, 250)(cv ‖ hr) was never defined, whereas cross-ApEn(1, 0.2, 250)(hr ‖ cv) was defined for 2 of the 32 cases. By contrast, for all r ≥ 0.08, cross-SampEn(1, r, 250)(cv ‖ hr) was defined for each of the 32 pairs. Thus cross-SampEn had a more consistent performance for evaluating these clinical cardiovascular data.

Summary. We have developed and characterized SampEn, a new family of statistics measuring complexity and regularity of clinical and experimental time series data and compared it with ApEn, a similar family. We find that SampEn statistics 1) agree much better than ApEn statistics with theory for random numbers with known probabilistic character over a broad range of operating conditions, 2) maintain relative consistency where ApEn statistics do not, and 3) have residual bias for very short record lengths, in a large part because of nonindependence of templates. Furthermore, cross-SampEn is a more consistent measure of joint synchrony of pairs of clinical cardiovascular time series. We attribute the difficulties of ApEn analysis to the practice of counting self-matches and of cross-ApEn to the problem of unmatched templates resulting in undefined probabilities. The differences are that SampEn does not count templates as matching themselves and does not employ a template-wise strategy for calculating probabilities. SampEn statistics provide an improved evaluation of time series regularity and should be a useful tool in studies of the dynamics of human cardiovascular physiology.

We thank L. Pitt, D. Scollan, and Rizwan-uddin for advice and Virginia’s Center for Innovative Technology for support.
Address for reprint requests and other correspondence: J. R. Moorman, Box 6012, MR4 Bldg., UVAHSC, Charlottesville, VA 22908 (E-mail: [email protected]).
Received 2 August 1999; accepted in final form 13 December 1999.

REFERENCES
1. Bendat JS and Piersol AG. Random Data. Analysis and Measurement Procedures. New York: Wiley, 1986.
2. Ben-Mizrachi A, Procaccia I, and Grassberger P. The characterization of experimental (noisy) strange attractors. Phys Rev A 29A: 975, 1984.
3. Dawes GS, Moulden M, Sheil O, and Redman CW. Approximate entropy, a statistic of regularity, applied to fetal heart rate data before and during labor. Obstet Gynecol 80: 763–768, 1992.
4. Durrett R. Probability Theories and Examples. Belmont, CA: Duxbury, 1996, p. 14.
5. Eckmann JP and Ruelle D. Ergodic theory of chaos and strange attractors. Rev Modern Phys 57: 617–654, 1985.
6. Fleisher LA, DiPietro JA, Johnson TR, and Pincus S. Complementary and noncoincident increases in heart rate variability and irregularity during fetal development. Clin Sci 92: 345–349, 1997.
7. Fleisher LA, Pincus SM, and Rosenbaum SH. Approximate entropy of heart rate as a correlate of postoperative ventricular dysfunction. Anesthesiology 78: 683–692, 1993.
8. Goldberger AL, Mietus JE, Rigney DR, Wood ML, and Fortney SM. Effects of head-down bed rest on complex heart rate variability: response to LBNP testing. J Appl Physiol 77: 2863–2869, 1994.
9. Grassberger P. Finite sample corrections to entropy and dimension estimates. Phys Lett A 128: 369, 1988.
10. Grassberger P and Procaccia I. Estimation of the Kolmogorov entropy from a chaotic signal. Phys Rev A 28: 2591–2593, 1983.
11. Grassberger P, Schreiber T, and Schaffrath C. Nonlinear time sequence analysis. Int J Bifur Chaos 1: 547, 1991.
12. Ho KK, Moody GB, Peng CK, Mietus JE, Larson MG, Levy D, and Goldberger AL. Predicting survival in heart failure case and control subjects by use of fully automated methods for deriving nonlinear and conventional indices of heart rate dynamics. Circulation 96: 842–848, 1997.
13. Hogue CJ, Domitrovich PP, Stein PK, Despotis GD, Re L, Schuessler RB, Kleiger RE, and Rottman JN. RR interval dynamics before atrial fibrillation in patients after coronary artery bypass graft surgery. Circulation 98: 429–434, 1998.
14. Korpelainen JT, Sotaniemi KA, Makikallio A, Huikuri HV, and Myllyla VV. Dynamic behavior of heart rate in ischemic stroke. Stroke 30: 1008–1013, 1999.
15. Lipsitz LA, Pincus SM, Morin RJ, Tong S, Eberle LP, and Gootman PM. Preliminary evidence for the evolution in complexity of heart rate dynamics during autonomic maturation in neonatal swine. J Auton Nerv Syst 65: 1–9, 1997.
16. Makikallio TH, Ristimae T, Airaksinen KE, Peng CK, Goldberger AL, and Huikuri HV. Heart rate dynamics in patients with stable angina pectoris and utility of fractal and complexity measures. Am J Cardiol 81: 27–31, 1998.
17. Makikallio TH, Seppanen T, Niemela M, Airaksinen KE, Tulppo M, and Huikuri HV. Abnormalities in beat-to-beat complexity of heart rate dynamics in patients with a previous myocardial infarction. J Am Coll Cardiol 28: 1005–1011, 1996.
18. Nelson JC, Rizwan-uddin, Griffin MP, and Moorman JR. Probing the order within neonatal heart rate variability. Pediatr Res 43: 823–831, 1998.
19. Palazzolo JA, Estafanous FG, and Murray PA. Entropy measures of heart rate variation in conscious dogs. Am J Physiol Heart Circ Physiol 274: H1099–H1105, 1998.
20. Pincus S and Singer BH. Randomness and degrees of irregularity. Proc Natl Acad Sci USA 93: 2083–2088, 1995.
21. Pincus SM. Approximate entropy as a measure of system complexity. Proc Natl Acad Sci USA 88: 2297–2301, 1991.
22. Pincus SM. Approximate entropy (ApEn) as a complexity measure. Chaos 5: 110–117, 1995.
23. Pincus SM. Quantifying complexity and regularity of neurobiological systems. Methods Neurosci 28: 336–363, 1995.
24. Pincus SM, Cummins TR, and Haddad GG. Heart rate control in normal and aborted-SIDS infants. Am J Physiol Regulatory Integrative Comp Physiol 264: R638–R646, 1993.
26. Pincus SM, Gladstone IM, and Ehrenkranz RA. A regularity statistic for medical data analysis. J Clin Monit 7: 335–345, 1991.
27. Pincus SM and Goldberger AL. Physiological time-series analysis: what does regularity quantify? Am J Physiol Heart Circ Physiol 266: H1643–H1656, 1994.
28. Pincus SM, Mulligan T, Iranmanesh A, Gheorghiu S, Godschalk M, and Veldhuis JD. Older males secrete luteinizing hormone and testosterone more irregularly, and jointly more asynchronously, than younger males. Proc Natl Acad Sci USA 93: 14100–14105, 1996.
29. Pincus SM and Viscarello RR. Approximate entropy: a regularity measure for fetal heart rate analysis. Obstet Gynecol 79: 249–255, 1992.
30. Press WH, Teukolsky SA, Vetterling WT, and Flannery BP. Numerical Recipes in FORTRAN. The Art of Scientific Computing. New York: Cambridge University Press, 1994.
31. Rosner B. Fundamentals of Biostatistics. Boston, MA: PWS-Kent, 1990, p. 161.
32. Ryan SM, Goldberger AL, Pincus SM, Mietus J, and Lipsitz LA. Gender- and age-related differences in heart rate dynamics: are women more complex than men? J Am Coll Cardiol 24: 1700–1707, 1994.
33. Schuckers SA. Use of approximate entropy measurements to classify ventricular tachycardia and fibrillation. J Electrocardiol 31 Suppl: 101–105, 1998.
34. Tulppo MP, Makikallio TH, Takala TE, Seppanen T, and Huikuri HV. Quantitative beat-to-beat analysis of heart rate dynamics during exercise. Am J Physiol Heart Circ Physiol 271: H244–H252, 1996.
35. Weigend AS and Gershenfeld NA. Time Series Prediction: Forecasting the Future and Understanding the Past. Boston, MA: Addison-Wesley, 1994.