Creating New Distributions Using Integration and Summation by Parts

All content following this page was uploaded by Rose Dawn Baker on 23 May 2019.
Rose Baker, School of Business
University of Salford, UK
email [email protected]
April 4, 2019
Abstract
Methods for generating new distributions from old can be thought
of as techniques for simplifying integrals used in reverse. Hence inte-
grating a probability density function (pdf) by parts provides a new
way of modifying distributions; the resulting pdfs are integrals that
sometimes require computation as special functions. Summation by
parts can be used similarly for discrete distributions. The general
methodology is given, with some examples of distribution classes and
of specific distributions, and fits to data.
Keywords
Mixture distribution; partial integration; stochastic dominance; summation
by parts; discrete distribution; special functions
1 Introduction
Parametric models of probability distributions are essential for statistical
inference. Hence a vast number of distributions of all types has been cre-
ated, and a common way to generate new distributions is to modify an old
one. Jones (2015) reviews the main techniques for generalizing univariate
symmetric distributions, and Lai (2012) gives a comprehensive account of
ways to modify survival distributions.
Transforming the random variable is probably the most popular method.
It is a technique that is often used to evaluate unknown integrals, and is
used ‘in reverse’ where the integral of the pdf (unity) is already known, to
generate new pdfs. For example, the exponential distribution with survival
function F̄ (x) = exp(−αx) becomes the Weibull distribution with F̄ (y) =
exp(−(αy)β ) on setting x = α−1 (αy)β .
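This change of variable is easy to check numerically. The sketch below (an illustration, not part of the paper; the parameter values are arbitrary) draws exponential variates, maps them through y = (αx)^{1/β}/α, and compares the sample mean with the known Weibull mean Γ(1 + 1/β)/α.

```python
import math
import random

def weibull_via_transform(alpha, beta, n, seed=0):
    """Draw X ~ exponential with survival exp(-alpha*x), then set
    y = (alpha*x)**(1/beta)/alpha, so Y has survival exp(-(alpha*y)**beta)."""
    rng = random.Random(seed)
    return [(alpha * rng.expovariate(alpha)) ** (1.0 / beta) / alpha
            for _ in range(n)]
```

With α = 1 and β = 2 the sample mean should be close to Γ(1.5) ≈ 0.886.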
Other integration techniques can be similarly used to create new distribu-
tions. For example, Azzalini’s method (e.g. Azzalini and Capitanio, 2018) of
transforming a symmetric pdf f (x) to 2w(x)f (x), where w(−x) = 1 − w(x),
can be thought of as a trick to simplify asymmetric integrands of type
w(x)f (x) used in reverse. These reflections prompt the thought that since
a probability density function (pdf) must integrate to unity, all the ‘tricks’
used to simplify unknown integrals, if used in reverse, can be used to gen-
erate more complex integrands (pdfs).
A method for simplifying integrands is integration by parts (IBP), and
the creation of new distributions using this method was introduced by Baker
(2019). This article further explores the use of IBP and its discrete analogue, summation by parts (SBP), to generate new distributions.
Before giving the general methodology, we show the power of the method
with an example, starting with the exponential distribution. This is a special
case of a distribution given in Baker (2019). Write the exponential pdf f(t) = α exp(−αt) for α > 0 and t ≥ 0 as f(t) = −uv′, where u = αt^{1/2}, v′ = −exp(−αt)/t^{1/2}; this is just one of many choices of u and v that could be made.

Then v(t) = ∫_t^∞ x^{−1/2} exp(−αx) dx, and on evaluating the integral by changing variable to y, where αx = y²/2, we see that v(t) = 2α^{−1/2}√π Φ(−√(2αt)), where Φ is the normal distribution function. We have that u(0) = 0, v(∞) = 0, and hence, using the method of parts, integrating v′ and differentiating u, the integrand (the new pdf) is

g(t) = α√π Φ(−√(2αt)) / √(αt).
This is a 1-parameter distribution that can be used for modelling lifetimes when the hazard function is initially high; the pdf and hazard function are shown in figure 1. Further details are given in appendix A.
This example shows how potentially useful and fairly tractable new dis-
tributions can be generated quite easily using IBP, and illustrates the point
that the pdf is often a special function.
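As a numerical sanity check (an illustrative sketch, not part of the paper's analysis), note that Φ(−√(2αt)) = erfc(√(αt))/2, so both g(t) above and the survival function Ḡ(t) = exp(−αt) − 2√π (αt)^{1/2} Φ(−√(2αt)) given in appendix A need only math.erfc; a central finite difference of Ḡ should then reproduce −g.

```python
import math

def g_pdf(t, alpha):
    """Modified exponential pdf g(t) = alpha*sqrt(pi)*Phi(-sqrt(2*alpha*t))/sqrt(alpha*t)."""
    z = math.sqrt(alpha * t)
    return alpha * math.sqrt(math.pi) * 0.5 * math.erfc(z) / z

def G_bar(t, alpha):
    """Survival function from appendix A, written via erfc."""
    z = math.sqrt(alpha * t)
    return math.exp(-alpha * t) - math.sqrt(math.pi) * z * math.erfc(z)
```

The derivative check confirms that the survival function in appendix A and the pdf above are consistent.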
The next section gives the general methodology and some general prop-
erties of the new distributions for the continuous case, and then some promis-
ing distribution classes are introduced. For discrete distributions, the method
of summation by parts can be used analogously and gives broadly similar
results, discussed in section 6.
Figure 1: The modified exponential pdf for α = 1 and its hazard function,
along with the pdf and hazard function for the exponential distribution.
2 General theory
2.1 Deriving probability density functions
Some notation is now introduced and the basic idea described more formally.
The method gives a transformation of the integrand (pdf) leading to a new
pdf, and the two pdfs can conveniently be called ‘L’ and ‘R’, where the R
form stochastically dominates the L form, so that the probability mass is
further to the right. The original distribution is sometimes referred to here
as the ‘base’ distribution.
Let the support of a distribution be (xl , xh ), which covers distributions
defined on the whole real line, survival distributions, and doubly-bounded
distributions. Write the R-pdf as f (x) = −u(x)v ′ (x), where u is a positive
monotone increasing function so that u′ ≥ 0, with u(xl ) = 0, and v a positive
decreasing function so that v ′ (x) ≤ 0, with v(xh ) = 0. Then applying
integration by parts,

∫_{x_l}^{x_h} f(x) dx = 1 = [−u(x)v(x)]_{x_l}^{x_h} + ∫_{x_l}^{x_h} u′(x)v(x) dx. (1)
and Ḡ(x) = F̄(x) − u(x)v(x). The L-distribution has probability mass shifted to the left, and the R-distribution dominates it stochastically, i.e. F̄(x) > Ḡ(x). The mean can be written as E_g(X) = E_f(X) − ∫_{x_l}^{x_h} u(y)v(y) dy. The R-distribution may or may not dominate when (3) is used.
3 Some distribution classes
3.1 Using a function of the distribution function for u
With u a function of F , G will be a function of F , so this yields general
families of distributions. The most interesting classes have G → F as the
parameter λ → ∞. In general, when du(x)/ dF (x) → ∞ as F → 0, the pdf
g(x) will be infinite at the lower limit xl , if f (xl ) is finite.
3.1.2 u = exp(λF )
On taking u(x) = exp(λF(x)) and using (5),

G = [λF − exp(−λ(1 − F)) + exp(−λ)] / [λ − 1 + exp(−λ)]. (8)

Since H = [exp(−λ(1 − F)) − exp(−λ)]/[1 − exp(−λ)] is a distribution function, G is the negative mixture

G = {λ/[λ − (1 − exp(−λ))]} F − {[1 − exp(−λ)]/[λ − (1 − exp(−λ))]} H.
When F is exponential, H is a truncated extreme-value distribution.
As λ → ∞, (8) yields G → F, and as λ → 0, Ḡ → F̄², e.g. a component
lifetime distribution tends to the distribution of the lifetime of a series system
where either of two components must fail for a failure of the system.
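These limits are easy to check numerically. The sketch below (an illustration, not from the paper; λ values are arbitrary) evaluates (8) directly and verifies that G is a distribution function, that G → F for large λ, and that Ḡ ≈ F̄² for small λ.

```python
import math

def G_of_F(F, lam):
    """Descendant distribution function (8) for u = exp(lam*F)."""
    num = lam * F - math.exp(-lam * (1.0 - F)) + math.exp(-lam)
    den = lam - 1.0 + math.exp(-lam)
    return num / den
```

Because G depends on the base distribution only through F, the check applies to any base distribution at once.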
In the tail where F̄ is small, expanding (8) shows that the hazard function
is twice that of the base distribution (which is true for all x as λ → 0).
In the left tail,

G ≃ (1 − exp(−λ))λF / [λ − 1 + exp(−λ)],
i.e. the pdf is scaled up from the base distribution by a factor not exceeding
2. For survival distributions as base distribution, this distribution therefore
tends to give an increasing hazard function.
Other increasing functions of F can also be used, e.g. u = exp(λF ) − 1.
This gives

G = F + (exp(λF) − 1) ln{[1 − exp(−λ)]/[1 − exp(−λF)]}/λ,
which is messy. Any increasing function can be used for u, leading to a vast
number of possible (and messy) distributions.
We must have that β − 1 + λ > 0 so that the pdf is positive, hence λ > 1 − β.
Baker (2019) discusses this case in detail and defines reliability growth
as
ξ = 1/(β − 1 + λ). (10)
The mean lifetime is E_g(T) = [(β − 1 + λ)/(β + λ)] E_f(T), so the proportional increase in expected lifetime on eliminating inferior items is

∆E(T) = [E_f(T) − E_g(T)]/E_g(T) = 1/(β − 1 + λ) = ξ,
and the proportional decrease in lifetime caused by inferior items is 1/(β +
λ) = ξ/(ξ + 1). As λ increases and the base distribution is regained, ∆E(T )
goes to zero.
The Stacy distribution has pdf

f(t) = αγ(αt)^{βγ−1} exp(−(αt)^γ)/Γ(β),
where α > 0, β > 0, γ > 0 and Γ is the gamma function. It includes gamma,
Weibull and lognormal distributions as special cases. Going L → R with
v(t) = exp(−(αt)γ )/(αt)λ yields a simple mixture of Stacy distributions,
but going R → L is more productive. In integrating v ′ (t), the incomplete
gamma function will be needed, defined as
Γ(a; t) = ∫_t^∞ x^{a−1} exp(−x) dx, (11)
where a > 0. The gamma function itself is Γ(a) = Γ(a; 0). If a ≤ 0, Γ(a; t)
is still defined but cannot be computed using software that computes the
incomplete gamma function.
The resulting pdf is

g(t) = α(λ + βγ − 1)(αt)^{βγ+λ−2} Γ((1 − λ)/γ; (αt)^γ) / Γ(β), (12)

and the survival function is

Ḡ(t) = F̄(t) − (αt)^{βγ+λ−1} Γ((1 − λ)/γ; (αt)^γ) / Γ(β).
The relation between the moments for L and R-distributions for the u = tλ
case was given in section 3.2.
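As a numerical sanity check on (12) (an illustrative sketch using scipy, with arbitrary parameter values chosen so that (1 − λ)/γ is negative), Γ(a; t) can be reduced to positive a with the recurrence Γ(a; t) = [Γ(a + 1; t) − t^a e^{−t}]/a, after which standard incomplete-gamma routines apply; integrating the pdf should then give 1.

```python
import math
from scipy.integrate import quad
from scipy.special import gammaincc

def upper_gamma(a, z):
    """Incomplete gamma (11); for a <= 0, recurse up to a > 0 via
    Gamma(a; z) = (Gamma(a+1; z) - z**a * exp(-z)) / a."""
    if a > 0:
        return gammaincc(a, z) * math.gamma(a)
    return (upper_gamma(a + 1.0, z) - z ** a * math.exp(-z)) / a

def g12(t, alpha, beta, gam, lam):
    """Modified Stacy pdf (12)."""
    z = (alpha * t) ** gam
    return (alpha * (lam + beta * gam - 1.0)
            * (alpha * t) ** (beta * gam + lam - 2.0)
            * upper_gamma((1.0 - lam) / gam, z) / math.gamma(beta))

# illustrative parameters; (1 - lam)/gam = -1/3 < 0
alpha, beta, gam, lam = 1.0, 1.5, 1.2, 1.4
# split the range at t = 1; beyond t = 50 the tail mass is negligible
total = (quad(lambda t: g12(t, alpha, beta, gam, lam), 0.0, 1.0)[0]
         + quad(lambda t: g12(t, alpha, beta, gam, lam), 1.0, 50.0)[0])
```

The total probability comes out equal to 1 to within quadrature tolerance, confirming the normalizing constant in (12).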
The beta distribution is the most common doubly-bounded distribution.
Here the pdf f(x) = B(α, β)^{−1} x^{α−1}(1 − x)^{β−1}, where B denotes the beta function. Taking u = x^{α+λ−1}, we have that v(x) = B(α, β)^{−1} ∫_x^1 y^{−λ}(1 − y)^{β−1} dy + c, where c ≥ 0. The constant of integration had to be zero for survival distributions, as u(∞) = ∞, but here it need not be. Hence

v(x) = B(1 − λ, β; x)/B(α, β) + c,
where B(1 − λ, β; x) is the complement of the unregularized incomplete beta
function. This yields the pdf

g(x) = u′(x)v(x)/(1 + c) = (α + λ − 1)x^{α+λ−2} v(x)/(1 + c).
The distribution function is

G(x) = {F(x) + x^{α+λ−1} [B(1 − λ, β; x)/B(α, β) + c]}/(1 + c),

where F(x) = 1 − B(α, β; x)/B(α, β).
On integrating g(x)x^n by parts, we have that

E_g(X^n) = [(α − 1 + λ)/(1 + c)] {c/(α − 1 + λ + n) + E_f(X^n)/(α − 1 + λ + n)}.
We have that g(1) = (α + λ − 1)c/(1 + c). This 4-parameter distribution
allows a much more flexible pdf than the beta distribution. The transforma-
tion can of course also be applied to 1 − X by changing X ↔ 1 − X in the
transformed distribution. Note that the beta distribution is label-invariant
(1 − X also follows a beta distribution) but the transformed distribution is
not.
Random numbers can be generated as follows:
1. With U a uniform r.v., if U < c/(1 + c), return Y = V^{1/(α−1+λ)}, where V is a uniform r.v.;
2. if U ≥ c/(1 + c), generate X, a r.v. from the parent beta distribution;
3. then return Y = W^{1/(α+λ−1)} X, where W is a uniform r.v.
To keep generation of random numbers to a minimum, V and W could be
generated by affine transformations on U .
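The algorithm can be sketched in code (an illustration with arbitrary parameter values; the moment formula above with n = 1 gives E_g(X) = (α − 1 + λ)(c + E_f(X))/[(1 + c)(α + λ)], which serves as a check).

```python
import random

def rng_transformed_beta(alpha, beta, lam, c, n, seed=1):
    """Sample the IBP-transformed beta distribution by the mixture
    algorithm in the text (illustrative sketch)."""
    rng = random.Random(seed)
    k = alpha + lam - 1.0            # exponent alpha + lambda - 1
    out = []
    for _ in range(n):
        if rng.random() < c / (1.0 + c):
            # power-law component with cdf x**k: inverse transform
            out.append(rng.random() ** (1.0 / k))
        else:
            # multiply a parent beta variate by W**(1/k), W uniform
            x = rng.betavariate(alpha, beta)
            out.append(rng.random() ** (1.0 / k) * x)
    return out
```

The sample mean agrees with the moment formula, supporting both the generator and the expression for E_g(X^n).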
On the whole real line, the normal distribution is the most important by
far, and has been skewed in many ways, e.g. Azzalini and Capitanio (2018).
Taking u(x) = exp(λx) leads to the lagged-normal or normal-exponential
distribution (e.g. Johnson et al 1995). This is exponential in the tail, and
can also be derived as the sum of normal and exponential random variables,
so is not new. The u = F λ transformation yields the class (7) which with
F (x) = Φ(x), where Φ is the normal distribution function, yields a skew
distribution where G(x) = (Φ(x)λ − λΦ(x))/(1 − λ). The moments cannot
be found simply.
5 Data fitting
The method generates vast numbers of distributions, some of which are new.
The aim of the analysis was merely to show by an example that the new
distributions can be useful.
In their book on survival analysis, Klein and Moeschberger (2003) use a
dataset of days to death of 863 kidney-transplant patients whose transplants
were performed at the Ohio State University Transplant Center between
1982 and 1992. Available covariates are age, gender and white/black. Only
140 patients’ survival times were not censored. We discretized age into 5
bands: 1-16, 17-32, 33-48, 49-64 and 65+ and fitted the distribution from
(12) with β = 1, i.e. a modified Weibull distribution. An accelerated time
model was used (e.g. Chiou et al, 2014), so that α = α0 exp(η T X), where X
is a vector of covariates and η a vector of regression coefficients. The model parameters are then η, the baseline time-scale α0, the Weibull shape parameter γ, and λ, reparameterized so that ξ as defined in (10) is used instead of λ. The fit was done by maximum likelihood in a purpose-written Fortran
program. The incomplete gamma function for negative argument is not
available as a standard special function, and so was evaluated as an integral
for a ≤ 0 in (11). The NAG library routine D01AMF was used. This routine
transforms the integration range to be finite and then integrates adaptively.
Similar routines exist on many platforms.
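The same evaluation can be sketched with scipy's adaptive quadrature in place of the NAG routine (an assumption of this note, not the paper's code; like D01AMF, quad transforms the infinite range internally). A useful check is the recurrence Γ(a + 1; t) = aΓ(a; t) + t^a e^{−t}.

```python
import math
from scipy.integrate import quad

def inc_gamma_neg(a, t):
    """Gamma(a; t) of (11) by direct adaptive quadrature; usable for a <= 0
    provided t > 0, where standard incomplete-gamma routines give up."""
    val, err = quad(lambda x: x ** (a - 1.0) * math.exp(-x), t, math.inf)
    return val
```

For a > 0 the result can also be compared with closed forms such as Γ(1/2; t) = √π erfc(√t).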
Figure 2: Hazard function for the modified Weibull distribution fitted to the
kidney transplant data of Klein and Moeschberger (2003) for the 33-48 and
65+ age groups.
The fit to data cannot be shown because of the censoring, but figure 2
shows the hazard function of the fitted distribution for the central and oldest
age-bands. From this it can be seen that the hazard of death decreases
steadily after transplant, effectively to a constant for the central age-band,
but starts increasing again for the top age band. This result is consistent
with hazard plots using all patients produced by various methods by Klein
and Moeschberger (2003).
The table shows fitted parameters and standard errors. Standard errors
are large, sometimes very large, because little information comes from the
censored survival times. The baseline age group was 33-48 years. It can be
seen that the hazard of death increases rapidly from the lowest age band, and
then increases more slowly with age. Women have a slightly higher hazard
of death than men, and people of colour higher than white, but these effects
are not statistically significant. Because γ > 1, the hazard of death would
rise eventually for all age groups. The value of 1.6 for ξ means that, for
a given set of covariate values, the expected lifetime could be increased by
160% of its current value, if all patients could be made to respond as well
as the best ones. This suggests that much could potentially be gained by
studying and improving survival rates.
Table 1: Estimated parameters and standard errors for the fit to the kidney
transplant dataset.
in terms of model parameters. The class of r-distributions has the advantage
of tractable moments and so allows more straightforward inference.
The general methodology is next given, followed by the properties of the two classes of distributions, and an example of a fit to data.
Σ_{i=m+1}^n u_i(v_{i+1} − v_i) + Σ_{i=m+1}^n v_i(u_i − u_{i−1}) = u_n v_{n+1} − u_m v_{m+1} (13)

for any u_i, v_i. The proof follows from the telescoping property of the sum Σ_{i=m+1}^n (u_i v_{i+1} − u_{i−1} v_i) = u_n v_{n+1} − u_m v_{m+1}. Distributions are commonly
defined on the integers 0 to n, where usually, n = ∞, the main exception
being the binomial distribution. Let a ‘parent’ probability mass function
(pmf) pi = −ui (vi+1 − vi ) ≥ 0, ui ≥ 0 and let ui+1 − ui ≥ 0 with vn+1 = 0.
Since pi ≥ 0, vi ≥ vi+1 ≥ 0. Then from (13)
q_i = v_i(u_i − u_{i−1})/(1 − u_m v_m) (14)
is also a pmf.
Set m = −1 and set the pmf p_{−1} = 0. Then from (14), for i ≥ 0,

q_i = v_i(u_i − u_{i−1})/(1 − u_{−1}v_{−1}).
Since v_{i+1} − v_i = −p_i/u_i and v_{n+1} = 0, it follows that v_i = Σ_{j=i}^n p_j/u_j. Since p_{−1} = 0, v_{−1} = v_0. The function u_i can be chosen to be zero at i = −1, e.g. u_i = (i + 1)^λ where λ > 0, or r^i − 1/r where r > 1. In this case simply q_i = (Σ_{j=i}^n p_j/u_j)(u_i − u_{i−1}). Otherwise

q_i = (Σ_{j=i}^n p_j/u_j)(u_i − u_{i−1}) / (1 − u_{−1}v_0). (15)
Using (13) in this way, both parent and descendant distributions have the
same upper limit n and hence the same support.
Note that for q_i ≥ 0 we require u_{−1}v_0 < 1. Since v_0 = Σ_{j=0}^n p_j/u_j < 1/u_0 and u_{−1} < u_0, we have that u_{−1}v_0 < 1 as required. If u is indexed by a parameter r so that u_{i+1}/u_i → ∞ as r → ∞, then from (15) q_i → p_i as r → ∞. This is so because v_i → p_i/u_i, u_i − u_{i−1} → u_i, and u_{−1}v_0 → p_0(u_{−1}/u_0) → 0. Hence the class of descendant distributions generalizes the parent distribution.
6.2 The discrete distribution function and moments
Denoting distribution functions for parent and descendant distributions by
Fk , Gk respectively, we have from (13) that
G_k = (F_k + u_k v_{k+1} − u_{−1}v_0)/(1 − u_{−1}v_0). (16)
Hence the distribution function is readily calculable once the vi are calcu-
lated and the pmf is known. If u−1 = 0, Gk > Fk and the parent stochasti-
cally dominates the descendant distribution.
From (15) the mean µ_g is

µ_g = [Σ_{i=0}^n i(u_i − u_{i−1}) Σ_{j=i}^n p_j/u_j] / (1 − u_{−1}v_0), (17)

or, reversing the order of summation,

µ_g = [µ − Σ_{i=1}^n (Σ_{j=0}^{i−1} u_j) p_i/u_i] / (1 − u_{−1}v_0). (18)
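The agreement of (18) with the mean computed directly from the q_i (which is (17) once v_i is substituted) is easy to verify numerically. The sketch below (illustrative choices, not from the paper) uses a truncated Poisson parent and u_i = (i + 1)², for which u_{−1} = 0 and the normalizer is 1.

```python
import math

mu, n, lam = 2.1, 60, 2.0
p = [math.exp(-mu) * mu ** j / math.factorial(j) for j in range(n + 1)]
u = [(i + 1) ** lam for i in range(n + 1)]

# v_i = sum_{j>=i} p_j/u_j (tail sums); with u_{-1} = 0 the normalizer is 1
v = [0.0] * (n + 2)
for i in range(n, -1, -1):
    v[i] = v[i + 1] + p[i] / u[i]
q = [(u[i] - (u[i - 1] if i > 0 else 0.0)) * v[i] for i in range(n + 1)]

direct = sum(i * q[i] for i in range(n + 1))   # this is (17)

# (18): mu - sum_{i>=1} (sum_{j<i} u_j) p_i/u_i, mu = truncated parent mean
mu_par = sum(j * p[j] for j in range(n + 1))
prefix, corr = 0.0, 0.0
for i in range(1, n + 1):
    prefix += u[i - 1]
    corr += prefix * p[i] / u[i]
mean18 = mu_par - corr
```

Both routes give the same mean, and the q_i carry the same total mass as the p_i.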
14
6.4 The u_i = (i + 1)^λ distribution
With the choice u_i = (i + 1)^λ, u_{−1} = 0 and the new system of distributions

q_i = {(i + 1)^λ − i^λ} Σ_{j=i}^n p_j/(j + 1)^λ (20)
is obtained.
As λ → ∞, qi → pi , and as λ → 0, q0 → 1 and all the probability mass
is at zero.
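Both limits can be checked with a short computation (an illustrative sketch with a truncated Poisson parent; values are arbitrary). By the telescoping identity, the q_i of (20) carry exactly the same total mass as the p_j.

```python
import math

def descendant_pmf(p, lam):
    """q_i = ((i+1)**lam - i**lam) * sum_{j>=i} p_j/(j+1)**lam, eq. (20)."""
    n = len(p) - 1
    v = [0.0] * (n + 2)                      # v[i] = tail sum from i
    for i in range(n, -1, -1):
        v[i] = v[i + 1] + p[i] / (i + 1) ** lam
    return [((i + 1) ** lam - i ** lam) * v[i] for i in range(n + 1)]
```

For large λ the descendant pmf is close to the parent; for small λ nearly all the mass sits at zero.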
The case where λ = 1 is briefly mentioned in Johnson et al (2005). In general the moments are intractable, unless λ = 1, when (18) enables the mean to be found and higher moments are also tractable. Letting p_j → p_{j+1} and setting p_0 = 0 gives
q_i = {(i + 1)^λ − i^λ} Σ_{j=i+1}^∞ p_j/j^λ.
With the choice u_i = r^i, where r > 1, and writing H_i(s) = Σ_{j=i}^∞ p_j s^j,

q_i = H_i(1/r) r^i (1 − 1/r) / (1 − H_0(1/r)/r), (21)

as u_{−1} = 1/r. Also,

G_i = [F_i + r^i H_{i+1}(1/r) − H_0(1/r)/r] / (1 − H_0(1/r)/r).

The pmf for a Poisson parent distribution is shown in figure 3.
The function H0 is the probability generating function (pgf) for the
parent distribution. From (21), on reversing the order of summation, the pgf M(s) for the derived distribution is

M(s) = (1 − 1/r)[srH_0(s) − H_0(1/r)] / [(sr − 1)(1 − H_0(1/r)/r)].

From this, the moments can be read off, e.g. the mean µ_g is

µ_g = (µ + 1)/(1 − H_0(1/r)/r) − r/(r − 1). (22)
Figure 3: Probabilities for the Poisson distribution with µ = 2.1, and for
the r-distribution with r = 1.5 and r = 5.
The pgf H0 is known in analytic form for the major discrete distributions,
e.g. for the Poisson, H0 (1/r) = exp(−(1 − 1/r)µ). Hence the moments of
the descendant distribution can be found as functions of the parameters of
the parent distribution.
Random numbers can be computed using the general method described
earlier. This now particularizes to the following: generate a random number
K from the parent distribution. Then compute
m = ln{(r^K − r^{−1})U + r^{−1}} / ln(r),
where U is a uniform [0, 1] random number. Then the integer part of m + 1
is M , a r.v. from the descendant distribution. This is the inverse-transform
method (Shmerling, 2013).
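The generator can be sketched as follows (an illustrative implementation, not the paper's code). The two-step scheme is exact when u_{−1} = 0, so the sketch takes u_i = r^i − 1/r, the choice mentioned in section 6.1 as vanishing at i = −1, with a Poisson parent, and compares empirical frequencies against the pmf q_i = (u_i − u_{i−1}) Σ_{j≥i} p_j/u_j built directly from the SBP construction.

```python
import math
import random

def pmf_r(p, r):
    """q_i = (u_i - u_{i-1}) * sum_{j>=i} p_j/u_j with u_i = r**i - 1/r."""
    n = len(p) - 1
    u = [r ** i - 1.0 / r for i in range(n + 1)]
    v = [0.0] * (n + 2)
    for i in range(n, -1, -1):
        v[i] = v[i + 1] + p[i] / u[i]
    return [(u[i] - (u[i - 1] if i > 0 else 0.0)) * v[i] for i in range(n + 1)]

def sample_r(mu, r, size, seed=3):
    """Two-step generator: K from the Poisson parent, then the conditional
    inverse transform m = ln((r**K - 1/r)*U + 1/r)/ln(r), M = floor(m) + 1."""
    rng = random.Random(seed)
    out = []
    for _ in range(size):
        k, term, acc, u0 = 0, math.exp(-mu), math.exp(-mu), rng.random()
        while u0 > acc:                  # sequential-search Poisson sampler
            k += 1
            term *= mu / k
            acc += term
        m = math.log((r ** k - 1.0 / r) * rng.random() + 1.0 / r) / math.log(r)
        out.append(math.floor(m) + 1)
    return out
```

Conditional on K, the descendant value has cdf (r^i − 1/r)/(r^K − 1/r) at integer i, which the logarithmic inverse transform reproduces exactly.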
As r → ∞ we have that q_i → p_i, while when r → 1 the numerator and denominator of (21) go to zero. Applying L'Hospital's rule and using dH_0(x)/dx|_{x=1} = µ, q_i → (p_i + Σ_{j=i+1}^n p_j)/(1 + µ), a partial-sum distribution corresponding to the parent distribution.
As n → ∞ this simplifies to

p_i = {(i + 1)^{−λ} − (i + 2)^{−λ}} Σ_{j=0}^i q_j (j + 1)^λ.
the product of the parent pgf and the pgf of a geometric distribution with Prob(X = k) = (1 − 1/r) r^{−k}. Hence the effect of SBP has been to add a geometric random variable to the original.
With the choice u_i = ∏_{k=1}^t (i + k), for which u_{−1} = 0,

q_i = t ∏_{s=1}^{t−1} (i + s) Σ_{j=i}^n p_j / ∏_{k=1}^t (j + k).

The mean is tractable, and from (17), µ_g = tµ/(t + 1). From (19) the variance is

σ_g² = tσ²/(t + 2) + tµ²/[(t + 1)²(t + 2)] + tµ/[(t + 1)(t + 2)].
7 Inference
We usually wish to study the effect of covariates on the mean µg . The
mean µg is therefore predicted as µg = µ0 exp(β T X), where X is a vector of
covariates. For the example, for the r-distributions, the model parameter µ
was found for each case from (22), using Newton-Raphson iteration, which
usually converged in 4 iterations. The pgf H0 (x) = {1 + µα(1 − x)}−1/α for
the negative binomial distribution. The probabilities qi were then computed
and fits were by maximum likelihood.
Note that for the λ-distributions the mean is not readily calculable in
terms of parent distribution parameters. Model fitting is still possible how-
ever, because the mean µg can be computed using the probabilities qi taken
up to some large cutoff value of i, and the Newton-Raphson iteration still
done for µ.
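The Newton-Raphson step for the r-distributions can be sketched as follows (an illustration: a central-difference derivative is used rather than an analytic one, and the parameter values are arbitrary).

```python
import math

def H0_nb(s, mu, alpha):
    """Negative binomial pgf H0(s) = (1 + mu*alpha*(1-s))**(-1/alpha), as in the text."""
    return (1.0 + mu * alpha * (1.0 - s)) ** (-1.0 / alpha)

def mean_r_dist(mu, alpha, r):
    """mu_g from (22) for a negative binomial parent."""
    return (mu + 1.0) / (1.0 - H0_nb(1.0 / r, mu, alpha) / r) - r / (r - 1.0)

def solve_mu(target, alpha, r, mu0=1.0, tol=1e-12):
    """Find the parent mean mu giving a required mu_g by Newton-Raphson."""
    mu = mu0
    for _ in range(100):
        f = mean_r_dist(mu, alpha, r) - target
        h = 1e-6 * max(1.0, abs(mu))
        df = (mean_r_dist(mu + h, alpha, r) - mean_r_dist(mu - h, alpha, r)) / (2.0 * h)
        step = f / df
        mu -= step
        if abs(step) < tol:
            break
    return mu
```

Since µ_g is smooth and increasing in µ, the iteration converges in a handful of steps, as reported in the text.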
8 Example
Hilbe (2011) fits negative binomial distributions to a number of datasets.
One is the ‘affairs’ dataset, with 601 observations from Fair (1978), reporting
counts of extramarital affairs over a year in the USA. Table 2 shows results
of fitting the Poisson and NB distributions, and on adding the r parameter,
and table 3 shows the covariates and the regression coefficients. The mean
quoted is for the average values of the covariates.
Model -ℓ AIC
Poisson 1426.8 2871.54
Poisson+r 1126.62 2273.24
Negative Binomial 728.10 1476.20
Negative Binomial + r 711.45 1444.91
Table 2: Model fits for models of the extramarital affairs data of Fair (1978),
showing minus the log-likelihood and the Akaike Information Criterion.
Variable Coeff p
Mean 1.873 (.388) -
α 3.02 (.155) -
r 1 -
Gender -.064 (.265) .81
Age -.0229 (.0188) .226
Years married .107 (.0355) .0024
Children? .113 (.307) .366
Religious? -.415 (.100) .000036
Educ. level -.000610 (.0560) .991
Occupation .0737 (.0801) .920
Rating -.447 (.099) .0000003
Table 3: Fitted parameter values for the NB+r model of the extramarital
affairs data. Standard errors are given in parentheses. The last column is
the p-value for a test that the regression coefficient is zero.
function minimisers.
9 Conclusions
Integration by parts is a general method that yields a cornucopia of distri-
butions, only a few of which have been explored here. Some are new, while
some have been derived by other methods and now have a new characteri-
zation. This could be useful e.g. in generating random numbers.
IBP is an addition to other general methods for modifying distributions,
such as transforming variables. Summation by parts can be used similarly
for discrete distributions, and its usefulness may be relatively greater, as the
variety of methods for generating new distributions from old is more limited
in the discrete case.
One can shift the probability mass of a distribution left or right. Shift-
ing it left is useful when dealing with failure-time distributions in reliability,
where the left-shifted distributions can reproduce the bathtub-shaped haz-
ard functions sometimes seen in practice (e.g. Lai and Xie, 2006). For
discrete distributions, the left shifted distributions have a higher probability
of zero events occurring. This zero-inflation can substantially improve the
model fit to data. Shifting probability mass right gives long-tailed distribu-
tions, which are needed in many areas, e.g. finance.
Comparing modifying distributions by integration by parts to using the
transformation of variables method, one could sum up as follows:
1. transformation gives a simple pdf, whereas IBP gives the pdf as an
integral, which may be simple or may be a special function;
3. the IBP method has the further property of yielding a new distribution
that either stochastically dominates the original, or is dominated by
it. This property can be used in a test of stochastic dominance as
described in appendix B.
binomial has been fitted. The class of long-tailed distributions of the form q_i = Σ_{j≤i} f(i, j)p_j is completely new.
Further work could include derivation of new classes of distribution using
other functions for ui , and more detailed exploration of their properties.
Bivariate distributions have not been considered here, but new bivariate
distributions could be derived from old by applying SBP to X for each level
of Y .
References
[1] Azzalini, A. and Capitanio, A. (2018). The skew normal and related
families, Cambridge University Press, Cambridge UK.
[2] Baker, R. D. (2019). New survival distributions that quantify the gain
from eliminating flawed components, Reliability Engineering and Sys-
tem Safety, in press.
[4] Chiou, S. H., Kang, S. and Yan, J. (2014). Fast accelerated failure time
modeling for case-cohort data, Statistics and Computing, 24, 559–568.
[7] Johnson, N.L., Kemp, A. W., Kotz, S. (2005). Univariate Discrete Dis-
tributions, 3rd ed., Wiley, New York.
[9] Johnson, N. L., Kotz, S. and Balakrishnan, N. (1995). Continuous Univariate Distributions, 2nd ed., Wiley, New York.
[12] Lai, C-D., Xie, M. (2006). Stochastic ageing and Dependence for Reli-
ability, Springer, Berlin.
[15] Wimmer, G. and Altmann, G. (2001). A new type of partial-sums dis-
tributions, Statistics and Probability Letters, 52, 359-364.
Appendix A: the modified exponential distribution
The distribution is a special case of the modified Stacy distribution where
β = γ = 1, λ = 1/2. Some properties of this distribution are now given
without proof, for completeness, and to show the tractability of the distri-
bution. The pdf is initially infinite, and is exponential in the tail. The
survival function is

Ḡ(t) = exp(−αt) − 2√π (αt)^{1/2} Φ(−√(2αt)).