Publicaciones Matemáticas del Uruguay
Volume 12, 2011
Editorial Board
J. Rodriguez Hertz
A. Treibich
J. Vieitez
Editorial board
J. Rodriguez Hertz
IMERL
A. Treibich
Université d'Artois / Regional Norte
J. Vieitez
Regional Norte
Published by:
IMERL-Facultad de Ingeniería
CMAT-Facultad de Ciencias
Universidad de la República
http://imerl.fing.edu.uy/pmu
ISSN: 0797-1443
Credits:
Cover design: J. Rodriguez Hertz
LaTeX editor: J. Rodriguez Hertz
using LaTeX's confproc package, version 0.7 (by V. Verfaille)
Printed in Montevideo by Mastergraf © 2011
Mario Wschebor, former editor of the Publicaciones Matemáticas del Uruguay, and founder of the IFUM, passed away while this volume was in print. We dedicate this volume to his memory.
Contents
Preface
A review of some recent results on Random Polynomials over R and over C.
DIEGO ARMENTANO
Rice formulas and Gaussian waves II.
JEAN-MARC AZAÏS, JOSÉ R. LEÓN, and MARIO WSCHEBOR
On automorphism groups of fiber bundles
MICHEL BRION
On the focusing of Cramér-von Mises test
ALEJANDRA CABAÑA and ENRIQUE CABAÑA
Feuilletage de Hirsch, mesures harmoniques et g-mesures
BERTRAND DEROIN and CONSTANTIN VERNICOS
On existence of smooth critical subsolutions of the Hamilton-Jacobi Equation
ALBERT FATHI
Paths towards adaptive estimation for Instrumental Variable Regression
JEAN-MICHEL LOUBES and CLÉMENT MARTEAU
Semisimple Hopf algebras and their representations
SONIA NATALE
An example concerning the Theory of Levels for codimension-one foliations
ANDRÉS NAVAS
Accessibility and abundance of ergodicity in dimension three: a survey.
FEDERICO RODRIGUEZ HERTZ, JANA RODRIGUEZ HERTZ, and RAÚL URES
PREFACE
This volume contains the proceedings of the Colloquium celebrating the opening of the Franco-Uruguayan Institute of Mathematics (IFUM), which is an International Associate Laboratory (LIA) of the French National Center for Scientific Research (CNRS). This meeting took place on December 8-11, 2009, in Punta del Este, Uruguay, and was enriched by the participation of many specialists in the areas of Probability, Algebra and Dynamical Systems, from Argentina, France and Uruguay.
We are grateful to the Scientific Committee, especially to Viviane Baladi, for entrusting us with the edition of these proceedings. We are also indebted to CSIC, PEDECIBA-Matemática and IFUM for supporting the edition of this volume.
Last but not least, we counted on the generous collaboration of the authors and the referees, without whom this volume would not have been possible. We wish to express our gratitude to all of them.
Jana Rodriguez Hertz
Montevideo, November 2011.
A REVIEW OF SOME RECENT RESULTS ON
RANDOM POLYNOMIALS OVER R AND OVER C.
DIEGO ARMENTANO
Abstract. This article is divided into two parts. In the first part we review some recent results concerning the expected number of real roots of random systems of polynomial equations. In the second part we deal with a different problem, namely, the distribution of the roots of certain complex random polynomials. We discuss a recent result in this direction, which shows that the associated points in the sphere (via the stereographic projection) are surprisingly well-suited with respect to the minimal logarithmic energy on the sphere.
1. Introduction
Let us consider a system of $m$ polynomial equations in $m$ unknowns over a field $\mathbb{K}$,
$$f_i(x) := \sum_{\|j\| \le d_i} a^{(i)}_j x^j \qquad (i = 1, \ldots, m). \tag{1}$$
The notation in (1) is the following: $x := (x_1, \ldots, x_m)$ denotes a point in $\mathbb{K}^m$, $j := (j_1, \ldots, j_m)$ a multi-index of non-negative integers, $\|j\| = \sum_{h=1}^{m} j_h$, $x^j = x_1^{j_1} \cdots x_m^{j_m}$, $a^{(i)}_j = a^{(i)}_{j_1, \ldots, j_m}$, and $d_i$ is the degree of the polynomial $f_i$.
We are interested in the solutions of the system of equations
$$f_i(x) = 0 \qquad (i = 1, \ldots, m), \tag{2}$$
lying in some subset $V$ of $\mathbb{K}^m$. Throughout this review we are mainly concerned with the case $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$.
Key words and phrases. Random Polynomials; System of Random Equations;
Bernstein Basis, Logarithmic Energy, Elliptic Fekete Points.
If we choose the coefficients $\{a^{(i)}_j\}$ at random, then the solution set of the system (2) becomes a random subset of $\mathbb{K}^m$. This is the main object of this review.
In the first part of this paper we focus on the real case. The main problem we consider is that of understanding $N_f(V)$: the number of solutions lying in the Borel subset $V$ of $\mathbb{R}^m$.
In the second part we deal with a different problem: How are the roots of complex polynomials distributed?
This article is organized as follows:
In Section 2 we start with some historical remarks on random polynomials. After that we move to the case of random systems of equations. We mention some recent results for centered Gaussian distributions. In Section 2.1 we consider the non-centered case, which has also been called smooth analysis in recent years. That is, we start with a fixed (non-random) polynomial system, then we perturb it with a polynomial noise, and we ask what can be said about the number of roots of the perturbed system. In Section 2.2 we review a result which computes the expected number of roots of a random system of polynomial equations expressed in a different basis, namely, the Bernstein basis. Finally, in Section 3 we focus on the complex case. We discuss a recent result concerning the distribution of points in the sphere associated with roots of random complex polynomials.
This review follows the talk given by the author at the colloquium held for the inauguration of the Franco-Uruguayan Institute of Mathematics, in Punta del Este, Uruguay, in December 2009.
2. The Number of Real Roots of Random Polynomials
The study of the expectation of the number of real roots of a random polynomial started in the thirties with the work of Bloch and Pólya [7]. Further investigations were made by Littlewood and Offord [14]. However, the first sharp result is due to M. Kac (see Kac [11, 12]), who gives the asymptotic value
$$\mathbb{E}\left(N_f(\mathbb{R})\right) \sim \frac{2}{\pi} \log d, \quad \text{as } d \to +\infty,$$
when the coefficients of the degree $d$ univariate polynomial $f$ are Gaussian centered independent random variables $N(0, 1)$ (see the book by Bharucha-Reid and Sambandham [6]).
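The logarithmic growth can be observed numerically. The sketch below is not taken from the text: it assumes Kac's classical integral representation of the expectation for a degree $d$ polynomial with independent $N(0,1)$ coefficients, and evaluates it with a plain midpoint rule. The function name and the number of quadrature steps are illustrative choices.

```python
import math

def kac_expected_real_roots(d, steps=100_000):
    """Midpoint-rule evaluation of Kac's integral (assumed form)
    E N = (4/pi) * int_0^1 sqrt(1/(1-t^2)^2 - (d+1)^2 t^(2d) / (1-t^(2d+2))^2) dt
    for a degree-d polynomial with i.i.d. N(0,1) coefficients."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h  # midpoints never hit the endpoint t = 1
        a = 1.0 / (1.0 - t * t) ** 2
        b = (d + 1) ** 2 * t ** (2 * d) / (1.0 - t ** (2 * d + 2)) ** 2
        # a - b is non-negative in exact arithmetic; guard against rounding.
        total += math.sqrt(max(a - b, 0.0))
    return 4.0 / math.pi * total * h

for d in (1, 10, 100):
    print(d, kac_expected_real_roots(d), 2.0 / math.pi * math.log(d))
```

For $d = 1$ the integral equals 1, as it should, since a linear polynomial with a nonzero leading coefficient has exactly one real root. For larger $d$ the computed value stays within a bounded distance of $(2/\pi)\log d$.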
The first important result in the study of real roots of random systems of polynomial equations is due to Shub and Smale [20] in 1992, where the authors computed the expectation of $N_f(\mathbb{R}^m)$ when the coefficients are Gaussian centered independent random variables having variances:
$$\mathbb{E}\left[\left(a^{(i)}_j\right)^2\right] = \frac{d_i!}{j_1! \cdots j_m! \, (d_i - \|j\|)!}. \tag{3}$$
Their result was
$$\mathbb{E}\left(N_f(\mathbb{R}^m)\right) = \sqrt{d_1 \cdots d_m}, \tag{4}$$
that is, the square root of the Bézout number associated to the system. The proof is based on a double fibration manipulation of the co-area formula. Some extensions of their work, including new results for one polynomial in one variable, can be found in Edelman-Kostlan [10]. There are also other extensions to multi-homogeneous systems in McLennan [16], and, partially, to sparse systems in Rojas [17] and Malajovich-Rojas [15]. A similar question for the number of critical points of real-valued polynomial random functions has been considered in Dedieu-Malajovich [9].
The probability law of the Shub-Smale model defined in (3) has the simplifying property of being invariant under the action of the orthogonal group in $\mathbb{R}^m$. In Kostlan [13] one can find the classification of all Gaussian probability distributions over the coefficients with this geometric invariance property.
In 2005, Azaïs and Wschebor gave a new and deep insight into this problem. The key point is using the Rice formula for random Gaussian fields (cf. Azaïs-Wschebor [5]). This formula allows one to extend the Shub-Smale result to other probability distributions over the coefficients. A general formula for $\mathbb{E}(N_f(V))$ when the random functions $f_i$ ($i = 1, \ldots, m$) are stochastically independent and their law is centered and invariant under the orthogonal group on $\mathbb{R}^m$ can be found in Azaïs-Wschebor [4]. This includes the Shub-Smale formula (4) as a special case. Moreover, the Rice formula appears to be the instrument to consider a major problem in the subject, which is to find the asymptotic distribution of $N_f(V)$ (under some normalization). The only published results of which the author is aware concern asymptotic variances as $m \to +\infty$. (See Wschebor [25] for a detailed description in this direction and a simpler proof of the Shub-Smale result.)
2.1. Non-centered Systems. The aim of this section is to remove the hypothesis that the coefficients have zero expectation.
One way to look at this problem is to start with a non-random system of equations (the signal)
$$P_i(x) = 0 \qquad (i = 1, \ldots, m), \tag{5}$$
perturb it with a polynomial noise $X_i(x)$ ($i = 1, \ldots, m$), that is, consider
$$P_i(x) + X_i(x) = 0 \qquad (i = 1, \ldots, m),$$
and ask what one can say about the number of roots of the new system, or, how much the noise modifies the number of roots of the deterministic part. (For short, we denote $N_f = N_f(\mathbb{R}^m)$.)
Roughly speaking, we prove in Theorem 1 that if the relation signal over noise is neither too big nor too small, in a sense that will be made precise later on, there exist positive constants $C$, $\theta$, where $0 < \theta < 1$, such that
$$\mathbb{E}(N_{P+X}) \le C\, \theta^m\, \mathbb{E}(N_X). \tag{6}$$
Inequality (6) becomes of interest if the starting non-random system (5) has a large number of roots, possibly infinite, and $m$ is large. In this situation, the effect of adding polynomial noise is a reduction at a geometric rate of the expected number of roots, as compared to the centered case in which all the $P_i$'s are identically zero.
For simplicity we assume that the polynomial noise $X$ has the Shub-Smale distribution. However, one should keep in mind that the result can be extended to other orthogonally invariant distributions (cf. Armentano-Wschebor [2]).
Before the statement of Theorem 1 below, we need to introduce some additional notation.
In this simplified situation, one only needs hypotheses concerning the relation between the signal $P$ and the Shub-Smale noise $X$, which roughly speaking should neither be too small nor too big.
Since $X$ has the Shub-Smale distribution, from (3) we get
$$\mathrm{Var}(X_i(x)) = (1 + \|x\|^2)^{d_i}, \qquad x \in \mathbb{R}^m \quad (i = 1, \ldots, m).$$
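The variance identity is the multinomial theorem applied to $(1 + x_1^2 + \cdots + x_m^2)^{d}$, since the coefficients are independent with the variances (3). A minimal numerical check, with illustrative function names and an arbitrary test point that are not part of the text, could look as follows.

```python
import itertools
import math

def shub_smale_variance(j, d):
    """Variance d! / (j_1! ... j_m! (d - |j|)!) of the coefficient a_j, as in (3)."""
    denom = math.factorial(d - sum(j))
    for jh in j:
        denom *= math.factorial(jh)
    return math.factorial(d) // denom

def variance_of_noise(x, d):
    """Sum over |j| <= d of Var(a_j) * (x^j)^2, valid for independent coefficients."""
    m = len(x)
    total = 0.0
    for j in itertools.product(range(d + 1), repeat=m):
        if sum(j) <= d:
            monomial_sq = math.prod(xh ** (2 * jh) for xh, jh in zip(x, j))
            total += shub_smale_variance(j, d) * monomial_sq
    return total

x = (0.3, -1.2, 0.7)   # arbitrary point of R^3 (illustrative)
d = 4                  # arbitrary degree (illustrative)
lhs = variance_of_noise(x, d)
rhs = (1.0 + sum(xh * xh for xh in x)) ** d
print(lhs, rhs)
```

Both printed numbers agree up to rounding, for any choice of point and degree.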
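Define
$$H(P_i) := \sup_{x \in \mathbb{R}^m} \left\{ (1 + \|x\|) \cdot \left\| \nabla\!\left( \frac{P_i}{(1 + \|x\|^2)^{d_i/2}} \right)(x) \right\| \right\},$$
$$K(P_i) := \sup_{x \in \mathbb{R}^m \setminus \{0\}} \left\{ (1 + \|x\|^2) \cdot \left| \frac{\partial}{\partial \rho}\!\left( \frac{P_i}{(1 + \|x\|^2)^{d_i/2}} \right)(x) \right| \right\},$$
for $i = 1, \ldots, m$, where $\|\cdot\|$ is the Euclidean norm, and $\frac{\partial}{\partial \rho}$ denotes the derivative in the direction defined by $\frac{x}{\|x\|}$, at each point $x \ne 0$.
For $r > 0$, put:
$$L(P_i, r) := \inf_{\|x\| \ge r} \frac{P_i(x)^2}{(1 + \|x\|^2)^{d_i}} \qquad (i = 1, \ldots, m).$$
One can check by means of elementary computations that for each $P$ as above, one has
$$H(P) < \infty, \qquad K(P) < \infty.$$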
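With these notations, we introduce the following hypotheses on the systems as $m$ grows:
$H_1$)
$$A_m = \frac{1}{m} \sum_{i=1}^{m} \frac{H^2(P_i)}{i} = o(1) \quad \text{as } m \to +\infty, \tag{7a}$$
$$B_m = \frac{1}{m} \sum_{i=1}^{m} \frac{K^2(P_i)}{i} = o(1) \quad \text{as } m \to +\infty. \tag{7b}$$
$H_2$) There exist positive constants $r_0$, $\ell$ such that if $r \ge r_0$:
$$L(P_i, r) \ge \ell \quad \text{for all } i = 1, \ldots, m.$$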
Theorem 1. Under the hypotheses $H_1$) and $H_2$), one has
$$\mathbb{E}(N_{P+X}) \le C\, \theta^m\, \mathbb{E}(N_X), \tag{8}$$
where $C$, $\theta$ are positive constants, $0 < \theta < 1$.
2.1.1. Remarks on the statement of Theorem 1.
• It is obvious that our problem does not depend on the order in which the equations
$$P_i(x) + X_i(x) = 0 \qquad (i = 1, \ldots, m)$$
appear. However, conditions (7a) and (7b) in hypothesis $H_1$) do depend on the order. One can state them by saying that there exists an order $i = 1, \ldots, m$ on the equations, such that (7a) and (7b) hold true.
• Condition $H_1$) can be interpreted as a bound on the quotient signal over noise. In fact, it concerns the gradient of this quotient. In (7b) the radial derivative appears, which happens to decrease faster as $\|x\| \to \infty$ than the other components of the gradient.
Clearly, if $H(P_i)$, $K(P_i)$ are bounded by fixed constants, (7a) and (7b) are verified. Also, some of them may grow as $m \to +\infty$ provided (7a) and (7b) remain satisfied.
• Hypothesis $H_2$) goes in some sense in the opposite direction: For large values of $\|x\|$ we need a lower bound of the relation signal over noise.
• A result of the type of Theorem 1 cannot be obtained without putting some restrictions on the relation signal over noise. In fact, consider the system
$$P_i(x) + \sigma X_i(x) = 0 \qquad (i = 1, \ldots, m), \tag{9}$$
where $\sigma$ is a positive real parameter. If we let $\sigma \to +\infty$, the relation signal over noise tends to zero and the expected number of roots will tend to $\mathbb{E}(N_X)$. On the other hand, if $\sigma \downarrow 0$, $\mathbb{E}(N_{P+\sigma X})$ can have different behaviours. For example, if $P$ is a regular system, the expected value of the number of roots of (9) tends to the number of roots of $P_i(x) = 0$ ($i = 1, \ldots, m$), which may be much bigger than $\mathbb{E}(N_X)$. In this case, the relation signal over noise tends to infinity.
• As was mentioned before, we can extend Theorem 1 to other orthogonally invariant distributions. However, for the general version we need to add more hypotheses.
In the next paragraphs we are going to give two simple examples. For the proof of Theorem 1 and more examples with different noises see Armentano-Wschebor [2].
2.1.2. Some Examples. We assume that the degrees $d_i$ are uniformly bounded.
For the first example, let
$$P_i(x) = \|x\|^{d_i} - r^{d_i},$$
where $d_i$ is even and $r$ is positive and remains bounded as $m$ varies. Then, one has:
$$\frac{\partial}{\partial \rho}\!\left( \frac{P_i}{(1 + \|x\|^2)^{d_i/2}} \right)(x) = \frac{d_i \|x\|^{d_i - 1} + d_i r^{d_i} \|x\|}{(1 + \|x\|^2)^{\frac{d_i}{2} + 1}} \le \frac{d_i (1 + r^{d_i})}{(1 + \|x\|^2)^{3/2}},$$
$$\nabla\!\left( \frac{P_i}{(1 + \|x\|^2)^{d_i/2}} \right)(x) = \frac{d_i \|x\|^{d_i - 2} + d_i r^{d_i}}{(1 + \|x\|^2)^{\frac{d_i}{2} + 1}}\, x,$$
which implies
$$\left\| \nabla\!\left( \frac{P_i}{(1 + \|x\|^2)^{d_i/2}} \right)(x) \right\| \le \frac{d_i (1 + r^{d_i})}{(1 + \|x\|^2)^{3/2}}.$$
Since the degrees $d_1, \ldots, d_m$ are bounded by a constant that does not depend on $m$, $H_1$) follows. $H_2$) also holds under the same hypothesis.
Notice that the interest of this choice of the $P_i$'s lies in the fact that the system $P_i(x) = 0$ ($i = 1, \ldots, m$) obviously has an infinite number of roots (all points in the sphere of radius $r$ centered at the origin are solutions), but the expected number of roots of the perturbed system is geometrically smaller than the Shub-Smale expectation when $m$ is large.
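The radial-derivative formula and its bound in the first example can be checked numerically. The sketch below is only a sanity check under the assumption that the signal is the radial function $\rho \mapsto \rho^{d} - r^{d}$ normalized by $(1+\rho^2)^{d/2}$. The function names and the grid of test values are illustrative.

```python
def normalized_signal(rho, d, r):
    """g(rho) = (rho^d - r^d) / (1 + rho^2)^(d/2), the signal divided by the noise scale."""
    return (rho ** d - r ** d) / (1.0 + rho * rho) ** (d / 2)

def radial_derivative(rho, d, r):
    """Closed form of g'(rho) used in the first example."""
    return (d * rho ** (d - 1) + d * r ** d * rho) / (1.0 + rho * rho) ** (d / 2 + 1)

def bound(rho, d, r):
    """Upper bound d (1 + r^d) / (1 + rho^2)^(3/2)."""
    return d * (1.0 + r ** d) / (1.0 + rho * rho) ** 1.5

def finite_difference(rho, d, r, eps=1e-6):
    """Central difference approximation of g'(rho)."""
    return (normalized_signal(rho + eps, d, r) - normalized_signal(rho - eps, d, r)) / (2 * eps)

worst_ratio = 0.0
for d in (2, 4, 6):
    for r in (0.5, 1.0, 2.0):
        for rho in (0.01, 0.5, 1.0, 3.0, 10.0):
            worst_ratio = max(worst_ratio, radial_derivative(rho, d, r) / bound(rho, d, r))
print("largest derivative/bound ratio on the grid:", worst_ratio)
```

On this grid the ratio stays below 1, in agreement with the displayed inequality.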
Our second example is the following: Let $T$ be a polynomial of degree $d$ in one variable that has $d$ distinct real roots. Define:
$$P_i(x_1, \ldots, x_m) = T(x_i) \qquad (i = 1, \ldots, m).$$
One can easily check that the system verifies our hypotheses, so that there exist positive constants $C$, $\theta$, $0 < \theta < 1$, such that
$$\mathbb{E}(N_{P+X}) \le C\, \theta^m\, d^{m/2},$$
where we have used the Shub-Smale formula when the degrees are all the same. On the other hand, it is clear that $N_P = d^m$, so that the diminishing effect of the noise on the number of roots can be observed. A number of variations of these examples for $P$ can be constructed, but we will not pursue the subject here.
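To get a feeling for the orders of magnitude in the second example, the following sketch tabulates the number of roots $d^m$ of the unperturbed system, the Shub-Smale expectation $d^{m/2}$ and the bound $C\theta^m d^{m/2}$. The values of `C` and `theta` below are purely hypothetical placeholders, since the text only asserts their existence.

```python
import math

def roots_of_signal(d, m):
    """Number of real roots of P_i(x) = T(x_i), i = 1..m, when T has d distinct real roots."""
    return d ** m

def shub_smale_expectation(d, m):
    """Expected number of real roots of the centered system: d^(m/2)."""
    return math.sqrt(d) ** m

def perturbed_upper_bound(d, m, C, theta):
    """Bound C * theta^m * d^(m/2); C and theta are hypothetical inputs."""
    return C * theta ** m * shub_smale_expectation(d, m)

C, theta = 1.0, 0.9  # hypothetical constants, for illustration only
for m in (2, 10, 50):
    print(m, roots_of_signal(3, m), shub_smale_expectation(3, m),
          perturbed_upper_bound(3, m, C, theta))
```

Whatever the actual constants are, the ratio between the bound and $d^{m}$ decays geometrically in $m$.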
2.2. Other Polynomial Bases. Up to now all probability measures were introduced in a particular basis, namely, the monomial basis $\{x^j\}_{\|j\| \le d}$. However, in many situations, polynomial systems are expressed in different bases, for example, orthogonal polynomials, harmonic polynomials, Bernstein polynomials, etc. So, it is natural to ask: What can be said about $N_f(V)$ when the randomization is performed in a different basis?
For the case of random orthogonal polynomials see Bharucha-Reid and Sambandham [6], and Edelman-Kostlan [10] for random harmonic polynomials.
In this section, following Armentano-Dedieu [3], we give an answer for the average number of real roots of a random system of equations expressed in the Bernstein basis. Let us be more precise:
The Bernstein basis is given by:
$$b_{d,k}(x) = \binom{d}{k} x^k (1 - x)^{d-k}, \qquad 0 \le k \le d,$$
in the case of univariate polynomials, and
$$b_{d,j}(x_1, \ldots, x_m) = \binom{d}{j} x_1^{j_1} \cdots x_m^{j_m} (1 - x_1 - \cdots - x_m)^{d - \|j\|}, \qquad \|j\| \le d,$$
for polynomials in $m$ variables, where $j = (j_1, \ldots, j_m)$ is a multi-integer, and $\binom{d}{j}$ is the multinomial coefficient.
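As an illustration of the definition, the sketch below evaluates the multivariate Bernstein basis and checks that its elements add up to 1 at any point, which is again the multinomial theorem. It assumes that the multinomial coefficient is $d!/(j_1! \cdots j_m! (d - \|j\|)!)$. The function names are illustrative.

```python
import itertools
import math

def multinomial(d, j):
    """d! / (j_1! ... j_m! (d - |j|)!), assumed meaning of the multinomial coefficient."""
    denom = math.factorial(d - sum(j))
    for jh in j:
        denom *= math.factorial(jh)
    return math.factorial(d) // denom

def bernstein(d, j, x):
    """Value of b_{d,j}(x_1, ..., x_m) for a multi-index j with |j| <= d."""
    value = multinomial(d, j) * (1.0 - sum(x)) ** (d - sum(j))
    for xh, jh in zip(x, j):
        value *= xh ** jh
    return value

def basis_sum(d, x):
    """Sum of all Bernstein basis elements of degree d at the point x."""
    m = len(x)
    return sum(bernstein(d, j, x)
               for j in itertools.product(range(d + 1), repeat=m)
               if sum(j) <= d)

print(bernstein(2, (1,), (0.5,)))   # univariate case: 2 * x * (1 - x) at x = 1/2
print(basis_sum(5, (0.2, 0.3)))     # sum of the basis at a point of the simplex
```

The same identity holds outside the simplex, although there the basis elements are no longer all non-negative.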
Let us consider the set of real polynomial systems in $m$ variables,
$$f_i(x_1, \ldots, x_m) = \sum_{\|j\| \le d_i} a^{(i)}_j\, b_{d_i, j}(x_1, \ldots, x_m) \qquad (i = 1, \ldots, m).$$
Take the coefficients $a^{(i)}_j$ to be independent standard Gaussian random variables.
Define
$$\tau : \mathbb{R}^m \to \mathbb{P}\left(\mathbb{R}^{m+1}\right)$$
by
$$\tau(x_1, \ldots, x_m) = [x_1, \ldots, x_m, 1 - x_1 - \cdots - x_m].$$
Here $\mathbb{P}(\mathbb{R}^{m+1})$ is the projective space associated with $\mathbb{R}^{m+1}$, $[y]$ is the class of the vector $y \in \mathbb{R}^{m+1}$, $y \ne 0$, for the equivalence relation defining this projective space. The (unique) orthogonally invariant probability measure in $\mathbb{P}(\mathbb{R}^{m+1})$ is denoted by $\lambda_m$.
With the above notation the following theorem holds:
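Theorem 2. (1) For any Borel set $V$ in $\mathbb{R}^m$ we have
$$\mathbb{E}\left(N_f(V)\right) = \lambda_m(\tau(V)) \sqrt{d_1 \cdots d_m}.$$
In particular
(2) $\mathbb{E}\left(N_f\right) = \sqrt{d_1 \cdots d_m}$,
(3) $\mathbb{E}\left(N_f(\Delta^m)\right) = \sqrt{d_1 \cdots d_m} / 2^m$, where
$$\Delta^m = \{x \in \mathbb{R}^m : x_i \ge 0 \text{ and } x_1 + \cdots + x_m \le 1\},$$
(4) When $m = 1$, for any interval $I = [\alpha, \beta] \subset \mathbb{R}$, one has
$\mathbb{E}\left(N_f(I)\right) =$ [...]
[...] $\prod_{1 \le i < j \le N} \frac{1}{\|x_i - x_j\|}$ [...] $\sum_{1 \le i < j \le N} \ln \|x_i - x_j\|$ [...]
$$-\frac{N \ln N}{4} + C_N N,$$
then,
$$-0.112768770\ldots \le \liminf_{N \to \infty} C_N \le \limsup_{N \to \infty} C_N \le -0.0234973\ldots$$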
Let $X_1, \ldots, X_N$ be independent random variables with common uniform distribution over the sphere. One can easily show that the expected value of the function $V(X_1, \ldots, X_N)$ in this case is
$$\mathbb{E}(V(X_1, \ldots, X_N)) = -\frac{N^2}{4} \ln\!\left(\frac{4}{e}\right) + \frac{N}{4} \ln\!\left(\frac{4}{e}\right). \tag{13}$$
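Formula (13) can be recovered from a one-dimensional integral. The sketch below assumes that $V$ is the logarithmic energy $-\sum_{i<j} \ln \|X_i - X_j\|$ on the unit sphere $S^2$, and uses the fact that for independent uniform points the inner product $t = \langle X, Y \rangle$ is uniform on $[-1, 1]$, with $\|X - Y\|^2 = 2 - 2t$. The function names are illustrative.

```python
import math

def mean_log_distance(steps=200_000):
    """Midpoint-rule value of E ln||X - Y|| for independent uniform X, Y on S^2."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        t = -1.0 + (i + 0.5) * h
        total += 0.5 * math.log(2.0 - 2.0 * t)   # ln||X - Y|| = (1/2) ln(2 - 2t)
    return total * h * 0.5                        # density of t on [-1, 1] is 1/2

def expected_energy_uniform(N):
    """-(number of pairs) * E ln||X - Y||, using the closed form (1/2) ln(4/e)."""
    return -math.comb(N, 2) * 0.5 * math.log(4.0 / math.e)

def formula_13(N):
    """Right-hand side of (13)."""
    c = math.log(4.0 / math.e)
    return -N * N / 4.0 * c + N / 4.0 * c

print(mean_log_distance(), 0.5 * math.log(4.0 / math.e))
print(expected_energy_uniform(50), formula_13(50))
```

The quadrature reproduces $\tfrac12 \ln(4/e)$, and multiplying by the number of pairs $N(N-1)/2$ gives exactly the two terms of (13).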
Thus, this random choice of points in the sphere with independent uniform distribution already provides a reasonable approach to the minimal value $V_N$, accurate to the order of $O(N \ln N)$.
On the one hand, this probability distribution has an important property, namely, invariance under the action of the orthogonal group on the sphere. On the other hand, it lacks correlation between points. More precisely, in order to obtain well-suited configurations one needs some kind of repelling property between points, and in this direction independence is not favorable. Hence, it is a natural question whether other handy orthogonally invariant probability distributions may yield better expected values.
Here is where complex random polynomials come into play.
Given $z \in \mathbb{C}$, let
$$\hat{z} := \frac{(z, 1)}{1 + |z|^2} \in \mathbb{C} \times \mathbb{R} \cong \mathbb{R}^3$$
be the associated point in the Riemann sphere, i.e. the sphere of radius $1/2$ centered at $(0, 0, 1/2)$. Finally, let
$$X = 2\hat{z} - (0, 0, 1) \in S^2$$
be the associated point in the unit sphere.
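The two maps can be written down directly. The following sketch, with illustrative function names, checks on a few sample values of $z$ that the first point lies on the sphere of radius $1/2$ centered at $(0, 0, 1/2)$ and that the second one has norm 1.

```python
import math

def riemann_sphere_point(z):
    """(z, 1) / (1 + |z|^2), viewed as a point of R^3."""
    s = 1.0 + abs(z) ** 2
    return (z.real / s, z.imag / s, 1.0 / s)

def unit_sphere_point(z):
    """X = 2 * z_hat - (0, 0, 1)."""
    a, b, c = riemann_sphere_point(z)
    return (2.0 * a, 2.0 * b, 2.0 * c - 1.0)

for z in (0j, 1 + 0j, 0.3 - 2j, 10 + 10j):
    a, b, c = riemann_sphere_point(z)
    dist_to_center = math.sqrt(a * a + b * b + (c - 0.5) ** 2)
    norm_on_unit_sphere = math.sqrt(sum(t * t for t in unit_sphere_point(z)))
    print(z, dist_to_center, norm_on_unit_sphere)
```

Note that $z = 0$ is sent to the north pole $(0, 0, 1)$, and large $|z|$ approaches the south pole $(0, 0, -1)$.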
Given a polynomial $f$ in one complex variable of degree $N$, we consider the mapping
$$f \mapsto V(X_1, \ldots, X_N),$$
where $X_i$ ($i = 1, \ldots, N$) are the points in the unit sphere associated with the roots of $f$. Notice that this map is well defined in the sense that it does not depend on the way we choose to order the roots.
Theorem 3. Let $f(z) = \sum_{k=0}^{N} a_k z^k$ be a complex random polynomial, such that the coefficients $a_k$ are independent complex random variables, such that the real and imaginary parts of $a_k$ are independent (real) Gaussian random variables centered at 0 with variance $\binom{N}{k}$. Then, with the notations above,
$$\mathbb{E}(V(X_1, \ldots, X_N)) = -\frac{N^2}{4} \ln\!\left(\frac{4}{e}\right) - \frac{N \ln N}{4} + \frac{N}{4} \ln\frac{4}{e}.$$
Comparing Theorem 3 with equations (12) and (13), we see that the value of $V$ is surprisingly small at points coming from the solution set of these random polynomials. More precisely, necessarily many random realizations of the coefficients will produce values of $V$ below the average and very close to $V_N$, possibly close enough to satisfy equation (11).
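The size of the improvement can be read off by subtracting the expected energy (13) of independent uniform points from the expected energy in Theorem 3. The short computation below only rearranges those two formulas, with minus signs placed as the logarithmic energy requires:

```latex
\begin{align*}
\left[-\frac{N^2}{4}\ln\frac{4}{e} - \frac{N\ln N}{4} + \frac{N}{4}\ln\frac{4}{e}\right]
 - \left[-\frac{N^2}{4}\ln\frac{4}{e} + \frac{N}{4}\ln\frac{4}{e}\right]
 = -\frac{N \ln N}{4}.
\end{align*}
```

So the points coming from the random polynomial gain exactly $\frac{N \ln N}{4}$ on average over independent uniform points, while the $N^2$ and $N\ln(4/e)$ terms coincide.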
Notice that, taking the homogeneous counterpart of $f$, Theorem 3 can be restated for random homogeneous polynomials, considering their complex projective solutions, under the identification of $\mathbb{P}(\mathbb{C}^2)$ with the Riemann sphere. In this fashion, the induced probability distribution over the space of homogeneous polynomials in two complex variables corresponds to the classical unitarily invariant Hermitian structure of the respective space (see Blum-Cucker-Shub-Smale [8]). Therefore, the probability distribution of the roots in $\mathbb{P}(\mathbb{C}^2)$ is invariant under the action of the unitary group.
It is not difficult to prove that the unitary group action over $\mathbb{P}(\mathbb{C}^2)$ corresponds to the special orthogonal group of the unit sphere. Hence, the distribution of the associated random roots on the sphere is orthogonally invariant. Thus, Theorem 3 is another geometric confirmation of the repelling property of the roots of these Gaussian random polynomials.
For a proof of Theorem 3 and a more detailed discussion on this account see Armentano-Beltrán-Shub [1]. See also Shub-Smale [21].
References
[1] Armentano D., Beltr
s J.-M and Wschebor M., Level sets and extrema of random processes and fields. John Wiley and Sons, 2009.
[6] Bharucha-Reid A. T. and Sambandham M., Random Polynomials, Probability and Mathematical Statistics, Academic Press, Orlando, FL, 1986.
[7] Bloch A. and Pólya G., On the number of real roots of a random algebraic equation. Proc. Cambridge Philos. Soc., 33:102-114, 1932.
[8] Blum L., Cucker F., Shub M. and Smale S., Complexity and real computation, Springer-Verlag, New York, 1998.
[9] Dedieu J.-P. and Malajovich G., On the number of minima of a random polynomial. Journal of Complexity 24 (2008) 89-108.
[10] Edelman A. and Kostlan E., How many zeros of a random polynomial are real?, Bull. Amer. Math. Soc. (N.S.) 32 (1), (1995), 1-37.
[11] Kac M., On the average number of real roots of a random algebraic equation. Bull. Am. Math. Soc. 49 (1943) 314-320 and 938.
[12] Kac M., On the average number of real roots of a random algebraic equation (II). Proc. London Math. Soc. 50 (1949) 390-408.
[13] Kostlan E., On the expected number of real roots of a system of random polynomial equations. In: Foundations of Computational Mathematics, Hong Kong 2002, 149-188. World Sci. Pub., 2002.
[14] Littlewood J. E. and Offord A. C., On the number of real roots of a random algebraic equation. J. London Math. Soc., 13:288-295, 1938.
[15] Malajovich G. and Rojas J.M., High probability analysis of the condition number of sparse polynomial systems, in Theoret. Comput. Sci., 315(2-3), pp. 524-555, (2004).
[16] McLennan A., The expected number of real roots of a multihomogeneous system of polynomial equations, in Amer. J. Math., 124(1), pp. 49-73, (2002).
[17] Rojas J.M., On the average number of real roots of certain random sparse polynomial systems, in The mathematics of numerical analysis, (Park City, UT, 1995) vol. 32, Lectures in Appl. Math., pp. 689-699, Amer. Math. Soc., Providence, RI, (1996).
[18] Rakhmanov E.A., Saff E.B. and Zhou Y.M., Minimal discrete energy on the sphere, Math. Res. Letters 1 (1994), 647-662.
[19] Saff E.B. and Kuijlaars A.B.J., Distributing many points on a sphere. Math. Intelligencer 19 (1997), no. 1, 5-11.
[20] Shub M. and Smale S., Complexity of Bézout's theorem. II. Volumes and probabilities, Computational algebraic geometry (Nice, 1992), Progr. Math., vol. 109, Birkhäuser Boston, Boston, MA, 1993, pp. 267-285.
[21] Shub M. and Smale S., Complexity of Bézout's theorem. III. Condition number and packing, J. Complexity 9 (1993), no. 1, 4-14, Festschrift for Joseph F. Traub, Part I.
[22] Smale S., Mathematical problems for the next century, Mathematics: frontiers and perspectives, Amer. Math. Soc., Providence, RI, 2000, pp. 271-294.
[23] Whyte L.L., Unique arrangements of points on a sphere, Amer. Math. Monthly 59 (1952), 606-611.
[24] Wschebor M., On the Kostlan-Shub-Smale model for random polynomial systems. Variance of the number of roots, in J. of Complexity, 21, pp. 773-789, (2005).
[25] Wschebor M., Systems of random equations. A review of some recent results. In and out of equilibrium. 2, 559-574, Progr. Probab., 60, Birkhäuser, Basel, 2008.
Centro de Matem
ublica. Montevideo,
Uruguay
RICE FORMULAS AND GAUSSIAN WAVES II.
JEAN-MARC AZAÏS, JOSÉ R. LEÓN, and MARIO WSCHEBOR
$$W'(x) = \frac{\alpha_2 r_1 - \alpha_1 r_2}{x (r_2 - r_1)}, \tag{1}$$
where $\alpha_i := h_i - W(x)$ and $r_i := \sqrt{x^2 + \alpha_i^2}$, $i = 1, 2$.
The points $(x, W(x))$ of the curve such that $x$ is a solution of (1) are called specular points. When the curve is random, one of our aims is to study the probability distribution of the number of specular points such that the abscissa $x \in A$, where $A$ is a Borel subset of the line.
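The following approximation is due to M.S. Longuet-Higgins (see [7], [8]): Suppose that $h_1$ and $h_2$ are big with respect to $W(x)$ and $x$; then $r_i = \alpha_i + x^2/(2\alpha_i) + O(h_i^{-3})$. Then, (1) can be approximated by
$$W'(x) \approx \frac{x}{2}\, \frac{\alpha_1 + \alpha_2}{\alpha_1 \alpha_2} \approx \frac{x}{2}\, \frac{h_1 + h_2}{h_1 h_2} = kx, \tag{2}$$
where
$$k := \frac{1}{2} \left( \frac{1}{h_1} + \frac{1}{h_2} \right).$$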
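A quick numerical comparison shows how good the approximation is when the heights are large. The sketch below assumes that the exact specular condition reads $W'(x) = (\alpha_2 r_1 - \alpha_1 r_2)/(x(r_2 - r_1))$ with $\alpha_i = h_i - W(x)$ and $r_i = \sqrt{x^2 + \alpha_i^2}$. The sample values of $x$, $W(x)$, $h_1$, $h_2$ are arbitrary, with $h_1 \ne h_2$ to avoid a vanishing denominator.

```python
import math

def exact_slope(x, w, h1, h2):
    """Right-hand side of the exact specular equation at abscissa x and height w = W(x)."""
    a1, a2 = h1 - w, h2 - w
    r1, r2 = math.hypot(x, a1), math.hypot(x, a2)
    return (a2 * r1 - a1 * r2) / (x * (r2 - r1))

def longuet_higgins_slope(x, h1, h2):
    """Approximate right-hand side k * x with k = (1/h1 + 1/h2) / 2."""
    k = 0.5 * (1.0 / h1 + 1.0 / h2)
    return k * x

def relative_error(x, w, h1, h2):
    approx = longuet_higgins_slope(x, h1, h2)
    return abs(exact_slope(x, w, h1, h2) - approx) / approx

for h1, h2 in ((100.0, 150.0), (1000.0, 1500.0)):
    print(h1, h2, relative_error(1.0, 0.2, h1, h2))
```

The relative error shrinks as the heights grow, which is the regime $k \to 0$ considered in the rest of the section.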
Set Y (x) := W
$$= \frac{1}{k}\, \theta + O(1) \quad \text{as } k \to 0,$$
where $\theta$ is a constant that can be computed by means of an explicit formula from the covariance of the given Gaussian process, which is well-adapted to numerical computation.
This implies that the coefficient of variation of the random variable $SP(\mathbb{R})$ tends to zero in a controlled manner, namely:
$$\frac{\sqrt{\mathrm{Var}(SP(\mathbb{R}))}}{\mathbb{E}(SP(\mathbb{R}))} \approx \sqrt{\frac{\theta k \pi}{2 \lambda_4}} \quad \text{as } k \to 0, \tag{3}$$
since
$$\mathbb{E}(SP(\mathbb{R})) \approx \sqrt{\frac{2 \lambda_4}{\pi}}\, \frac{1}{k}$$
((3) corrects a small error in [1]).
(2) With some additional requirement on the smoothness of the paths of the process, under the same asymptotic, the natural renormalization of $SP(\mathbb{R})$ tends to the standard normal distribution $\Phi(x)$, that is, for every $x \in \mathbb{R}$:
$$P\left( \frac{SP(\mathbb{R}) - (2 \lambda_4 / \pi)^{1/2} / k}{(\theta / k)^{1/2}} \le x \right) \to \Phi(x) \quad \text{as } k \to 0.$$
2.2. Specular points for two-parameter processes. Let us consider in $\mathbb{R}^3$ a coordinate system $Oxyz$, and a $C^1$-function $z = W(x, y)$. The following definition of specular points of the graph extends naturally the one we gave above for functions of one real variable.
The source of light is placed at the point $(0, 0, h_1)$ and the observer at $(0, 0, h_2)$. The point $(x, y)$ is said to be a specular point if the normal vector $n(x, y) = (-W_x, -W_y, 1)$ to the graph at $(x, y, W(x, y))$ satisfies the following two conditions:
• the angles with the incident ray $I = (-x, -y, h_1 - W)$ and the reflected ray $R = (-x, -y, h_2 - W)$ are equal (for short, the argument $(x, y)$ has been removed),
• it belongs to the plane generated by $I$ and $R$.
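Setting $\alpha_i = h_i - W$ and $r_i = \sqrt{x^2 + y^2 + \alpha_i^2}$, $i = 1, 2$, as in the one-parameter case we have:
$$W_x = \frac{x}{x^2 + y^2}\, \frac{\alpha_2 r_1 - \alpha_1 r_2}{r_2 - r_1}, \qquad W_y = \frac{y}{x^2 + y^2}\, \frac{\alpha_2 r_1 - \alpha_1 r_2}{r_2 - r_1}. \tag{4}$$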
When $h_1$ and $h_2$ are large, the system above can be approximated by
$$W_x = kx, \qquad W_y = ky, \tag{5}$$
under the same conditions as in dimension 1. This is the Longuet-Higgins approximation for two-parameter functions.
For each subset $Q$ of $\mathbb{R}^2$, we denote by $SP(Q)$ the number of approximate specular points in the sense of (5) such that $(x, y) \in Q$.
In the remainder of this paragraph we limit our attention to this approximation and to the case in which $\{W(x, y) : (x, y) \in \mathbb{R}^2\}$ is a
$$\lambda_{ij} = \int_{\mathbb{R}^2} u^i v^j \, \mu(du, dv),$$
whenever they are well-defined.
In [1] one can find the statement of certain results on the behavior of the expectation and variance of $SP(Q)$ under the asymptotic $k \to 0$. We give full proofs of these results below. For the time being, what is known for the variance and the coefficient of variation is weaker than in the one-dimensional parameter case.
Let us define:
$$Y(x, y) := \begin{pmatrix} W_x(x, y) - kx \\ W_y(x, y) - ky \end{pmatrix}. \tag{6}$$
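Under the non-degeneracy condition $\lambda_{20} \lambda_{02} - \lambda_{11}^2 \ne 0$, the random field $\{Y(x, y) : x, y \in \mathbb{R}\}$ satisfies the hypotheses of Theorem 6.2. in [2], and we can write the Rice formula:
$$\mathbb{E}\left(SP(Q)\right) = \int_Q \mathbb{E}\left( |\det Y'(x, y)| \,\big|\, Y(x, y) = 0 \right) p_{Y(x,y)}(0)\, dx\, dy = \int_Q \mathbb{E}\left( |\det Y'(x, y)| \right) p_{Y(x,y)}(0)\, dx\, dy, \tag{7}$$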
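since for fixed $(x, y)$ the random matrix $Y'$ [...]
$$p_{Y(x,y)}(0) = \frac{1}{2\pi \sqrt{\lambda_{20} \lambda_{02} - \lambda_{11}^2}} \exp\left[ -\frac{k^2}{2 (\lambda_{20} \lambda_{02} - \lambda_{11}^2)} \left( \lambda_{02} x^2 - 2 \lambda_{11} xy + \lambda_{20} y^2 \right) \right]. \tag{8}$$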
To compute the expectation of the absolute value of the determinant in the right-hand side of (7), which does not depend on $x, y$, we use the method of [3] (see also [6]). Set $\Delta := \det Y'(x, y) = (W_{xx} - k)(W_{yy} - k) - W_{xy}^2$.
We have
$$\mathbb{E}(|\Delta|) = \mathbb{E}\left[ \frac{2}{\pi} \int_0^{+\infty} \frac{1 - \cos(\Delta t)}{t^2}\, dt \right]. \tag{9}$$
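Formula (9) rests on an elementary integral identity for the absolute value. As a reminder, and writing $a$ for an arbitrary real number (a symbol introduced here only for illustration), one can verify it as follows:

```latex
\begin{align*}
\int_0^{+\infty} \frac{1 - \cos(a t)}{t^2}\, dt
  &= |a| \int_0^{+\infty} \frac{1 - \cos s}{s^2}\, ds && (s = |a|\, t) \\
  &= |a| \int_0^{+\infty} \frac{\sin s}{s}\, ds && \text{(integration by parts)} \\
  &= \frac{\pi}{2}\, |a| .
\end{align*}
```

Hence $|a| = \frac{2}{\pi} \int_0^{+\infty} \frac{1 - \cos(at)}{t^2}\, dt$, and taking $a$ equal to the random determinant and then expectations gives the stated expression.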
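Define
$$h(t) := \mathbb{E}\left[ \exp\left( it \left[ (W_{xx} - k)(W_{yy} - k) - W_{xy}^2 \right] \right) \right].$$
Then
$$\mathbb{E}(|\Delta|) = \frac{2}{\pi} \int_0^{+\infty} \frac{1 - \mathrm{Re}[h(t)]}{t^2}\, dt. \tag{10}$$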
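To compute $h(t)$ we define
$$A = \begin{pmatrix} 0 & 1/2 & 0 \\ 1/2 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix},$$
and $\Sigma$, the variance matrix of $(W_{xx}, W_{yy}, W_{xy})$:
$$\Sigma := \begin{pmatrix} \lambda_{40} & \lambda_{22} & \lambda_{31} \\ \lambda_{22} & \lambda_{04} & \lambda_{13} \\ \lambda_{31} & \lambda_{13} & \lambda_{22} \end{pmatrix}.$$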
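$$\left[ \prod_{j=1}^{3} \frac{d_j(t, k)}{(1 + 4 \Delta_j^2 t^2)^{1/4}} \right] \cos\left( \sum_{j=1}^{3} \left( \varphi_j(t) + k^2 t\, \psi_j(t) \right) \right),$$
where, for $j = 1, 2, 3$:
$$d_j(t, k) = \exp\left[ -\frac{k^2 t^2}{2}\, \frac{(s_{1j} + s_{2j})^2}{1 + 4 \Delta_j^2 t^2} \right],$$
$$\varphi_j(t) = \frac{1}{2} \arctan(2 \Delta_j t), \qquad 0 < \varphi_j < \pi/4,$$
$$\psi_j(t) = \frac{1}{3} - t^2\, \frac{(s_{1j} + s_{2j})^2\, \Delta_j}{1 + 4 \Delta_j^2 t^2}.$$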
Introducing these expressions in (10) and using (8) we obtain a new formula which has the form of a rather complicated integral. However, it is well adapted to numerical evaluation.
On the other hand, this formula allows us to compute the equivalent as $k \to 0$ of the expectation of the total number of specular points under the Longuet-Higgins approximation. In fact, a first order expansion of the terms in the integrand gives a somewhat more accurate result, that we state as a theorem:
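Theorem 1.
$$\mathbb{E}\left(SP(\mathbb{R}^2)\right) = \frac{m_2}{k^2} + O(1), \tag{13}$$
where
$$m_2 = \int_0^{+\infty} \frac{1 - \left[ \prod_{j=1}^{3} (1 + 4 \Delta_j^2 t^2)^{-1/2} \right]^{1/2} \cos\left( \sum_{j=1}^{3} \varphi_j(t) \right)}{t^2}\, dt = \int_0^{+\infty} \frac{1 - 2^{-3/2} \left[ \prod_{j=1}^{3} \sqrt{A_j (1 + A_j)} \right] (1 - B_1 B_2 - B_2 B_3 - B_3 B_1)}{t^2}\, dt, \tag{14}$$
where
$$A_j = A_j(t) = \left( 1 + 4 \Delta_j^2 t^2 \right)^{-1/2}, \qquad B_j = B_j(t) = \sqrt{(1 - A_j)/(1 + A_j)}.$$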
Notice that $m_2$ only depends on the eigenvalues $\Delta_1, \Delta_2, \Delta_3$ and is easily computed numerically.
We now consider the variance of the total number of specular points in two dimensions, looking for results analogous to the one-dimensional case, in view of their interest for statistical applications. It turns out that the computations become much more involved. The statements on variance and speed of convergence to zero of the coefficient of variation that we give below include only the order of the asymptotic behavior in the Longuet-Higgins approximation, but not the constant. However, we still consider them to be useful. If one refines the computations one can give rough bounds on the generic constants in Theorem 2 and Corollary 1 on the basis of additional hypotheses on the random field.
$$W''(0) = \begin{pmatrix} W_{xx}(0) & W_{xy}(0) \\ W_{xy}(0) & W_{yy}(0) \end{pmatrix}.$$
The function
$$z \mapsto \Delta(z) = \det\left[ \mathrm{Var}\left( W''(0) z \right) \right],$$
defined on $z = (z_1, z_2)^T \in \mathbb{R}^2$, is a non-negative homogeneous polynomial of degree 4 in the pair $z_1, z_2$. We will assume the non-degeneracy condition:
$$\min\{\Delta(z) : \|z\| = 1\} = \Delta > 0. \tag{15}$$
Theorem 2. Let us assume that $\{W(x) : x \in \mathbb{R}^2\}$ satisfies the above conditions and that it is also $\delta$-dependent, $\delta > 0$, that is,
$$\mathbb{E}\left( W(x) W(y) \right) = 0 \quad \text{whenever } \|x - y\| > \delta.$$
Then, for $k$ small enough:
$$\mathrm{Var}\left( SP(\mathbb{R}^2) \right) \le \frac{L}{k^2},$$
where $L$ is a positive constant depending upon the law of the random field.
A direct consequence of Theorems 1 and 2 is the following:
Corollary 1. Under the same hypotheses of Theorem 2, for $k$ small enough, one has:
$$\frac{\sqrt{\mathrm{Var}\left( SP(\mathbb{R}^2) \right)}}{\mathbb{E}\left( SP(\mathbb{R}^2) \right)} \le L_1 k,$$
where $L_1$ is a new positive constant.
Proof of Theorem 2. Let us denote $T = SP(\mathbb{R}^2)$. We have:
$$\mathrm{Var}(T) = \mathbb{E}(T(T - 1)) + \mathbb{E}(T) - [\mathbb{E}(T)]^2. \tag{16}$$
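Identity (16) holds for any integer-valued random variable with finite second moment. A minimal exact check on a toy distribution, chosen here only for illustration (a Binomial(3, 1/2) law, unrelated to the law of the number of specular points), could be:

```python
from fractions import Fraction

def variance_direct(pmf):
    """Var(T) computed as E[(T - E T)^2]."""
    mean = sum(p * t for t, p in pmf.items())
    return sum(p * (t - mean) ** 2 for t, p in pmf.items())

def variance_via_factorial_moment(pmf):
    """Var(T) computed as E[T(T-1)] + E[T] - (E[T])^2, as in (16)."""
    mean = sum(p * t for t, p in pmf.items())
    second_factorial = sum(p * t * (t - 1) for t, p in pmf.items())
    return second_factorial + mean - mean ** 2

# Toy probability mass function (hypothetical example).
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}
print(variance_direct(pmf), variance_via_factorial_moment(pmf))
```

The point of writing the variance this way is that the second factorial moment $\mathbb{E}(T(T-1))$ is exactly the quantity given by a Rice formula.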
We have already computed the equivalents as $k \to 0$ of the second and third terms on the right-hand side of (16). Our task in what follows is to consider the first term.
The proof is performed using the Rice formula for the second factorial moment of the number of roots of the random field $Y$. We apply Theorem 6.3. of [2] for dimension $d = 2$ and $k = 2$. Then,
$$\mathbb{E}(T(T - 1)) = \iint_{\mathbb{R}^2 \times \mathbb{R}^2} \mathbb{E}\left( |\det Y'(x)|\, |\det Y'(y)| \,\big|\, Y(x) = 0, Y(y) = 0 \right) p_{Y(x), Y(y)}(0, 0)\, dx\, dy$$
$$= \iint_{\|x - y\| > \delta} \ldots\, dx\, dy + \iint_{\|x - y\| \le \delta} \ldots\, dx\, dy = J_1 + J_2.$$
For $J_1$ we proceed as in the proof of Theorem 1 of [1], using the $\delta$-dependence and the evaluations therein. We obtain:
$$J_1 = \frac{m_2^2}{k^4} + \frac{O(1)}{k^2}. \tag{17}$$
Let us show that for small $k$,
$$J_2 = \frac{O(1)}{k^2}. \tag{18}$$
In view of (16), (13) and (17) this suffices to prove the theorem.
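For the reader's convenience, here is how the pieces fit together. The computation below only substitutes the expansions (13), (17) and (18) into (16), with $\mathbb{E}(T(T-1)) = J_1 + J_2$:

```latex
\begin{align*}
\mathrm{Var}(T) &= J_1 + J_2 + \mathbb{E}(T) - [\mathbb{E}(T)]^2 \\
  &= \frac{m_2^2}{k^4} + \frac{O(1)}{k^2} + \frac{O(1)}{k^2}
     + \frac{m_2}{k^2} + O(1) - \left( \frac{m_2}{k^2} + O(1) \right)^2 \\
  &= \frac{O(1)}{k^2} .
\end{align*}
```

The leading terms $m_2^2 / k^4$ cancel, and every remaining term is at most of order $k^{-2}$ as $k \to 0$.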
We do not perform all detailed computations. The key point consists in evaluating the behavior of the integrand that appears in $J_2$ near the diagonal $x = y$, where the density $p_{Y(x), Y(y)}(0, 0)$ degenerates and the conditional expectation tends to zero.
For the density, using the invariance under translations of the law of $\{W'(x) : x \in \mathbb{R}^2\}$, we have:
$$p_{Y(x), Y(y)}(0, 0) = p_{W'(x), W'(y)}(kx, ky) = p_{W'(0), W'(y - x)}(kx, ky) = p_{W'(0), [W'(y - x) - W'(0)]}(kx, k(y - x)).$$
$$W'(z) = W'(0) + W''(0) z + O(\|z\|^2).$$
Using the non-degeneracy assumption (15) and the fact that $W'(0)$ and $W$ [...], where $C_1, C_2, C_3$ are positive constants.
Let us consider the conditional expectation. For each pair $x, y$ of different points in $\mathbb{R}^2$, denote by $\tau$ the unit vector $(y-x)/\|y-x\|$ and by $n$ a unit vector orthogonal to $\tau$. We denote respectively by $\partial_\tau Y$, $\partial_{\tau\tau} Y$, $\partial_n Y$ the first and second partial derivatives of the random field in the directions given by $\tau$ and $n$.
Under the condition
$$Y(x) = 0, \quad Y(y) = 0,$$
we have the following simple bound on the determinant, based upon its definition and Rolle's Theorem applied to the segment $[x,y] = \{\lambda x + (1-\lambda)y\}$:
(19) $|\det Y'(x)| \le \|\partial_\tau Y(x)\|\,\|\partial_n Y(x)\| \le \|y-x\| \sup_{s\in[x,y]} \|\partial_{\tau\tau} Y(s)\|\,\|\partial_n Y(x)\|.$
So,
$$E\big(|\det Y'(x)||\det Y'(y)| \,\big|\, Y(x)=0, Y(y)=0\big)$$
$$\le \|y-x\|^2\, E\Big(\sup_{s\in[x,y]}\|\partial_{\tau\tau}Y(s)\|^2\, \|\partial_n Y(x)\|\,\|\partial_n Y(y)\| \,\Big|\, W'(x)=kx,\ W'(y)=ky\Big)$$
$$= \|z\|^2\, E\Big(\sup_{s\in[0,z]}\|\partial_{\tau\tau}Y(s)\|^2\, \|\partial_n Y(0)\|\,\|\partial_n Y(z)\| \,\Big|\, W'(0)=kx,\ \frac{W'(z)-W'(0)}{\|z\|} = k\tau\Big),$$
where the last equality is again a consequence of the stationarity of the random field $\{W(x) : x \in \mathbb{R}^2\}$.
At this point, we perform a Gaussian regression on the condition. For the condition, use again Taylor expansion, the non-degeneracy hypothesis and the independence of $W'(0)$ and $W''$ […]
$$E\big(|\det Y'(x)||\det Y'(y)| \,\big|\, Y(x)=0, Y(y)=0\big) \le C_4\, \|z\|^2 \big(1 + k\|x\|\big)^4,$$
where $C_4$ is a positive constant. Summing up, we have the following bound for $J_2$:
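(21) $$J_2 \le C_1 C_4\, \pi\delta^2 \int_{\mathbb{R}^2} \big(1 + k\|x\|\big)^4 \exp\big(-C_2 k^2(\|x\| - C_3)^2\big)\,dx = C_1 C_4\, 2\pi^2\delta^2 \int_0^{+\infty} \big(1 + k\rho\big)^4 \exp\big(-C_2 k^2(\rho - C_3)^2\big)\,\rho\,d\rho.$$
Performing the change of variables $w = k\rho$, (18) follows.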
[…] $\int_S E\big[|\det(Z'(x))| \,\big|\, Z(x) = 0\big]\, p_{Z(x)}(0)\,dx$,
where $p_{Z(x)}(\cdot)$ is the density of $Z(x)$. One can easily check that this density is non-degenerate. Moreover, one has (use Proposition 6.5 of [2]) $P(\exists x,\ Z(x) = 0,\ \det[Z'$ […] $|\det(Z'(x))|$ is the area of the parallelogram generated by two independent standard Gaussian variables in $\mathbb{R}^2$. Using invariance of the distribution, the distribution of this volume is the product of independent square roots of a $\chi^2(2)$ and a $\chi^2(1)$ distributed random variables. An elementary calculation gives then: $E[|\det(Z'(x))|] = \lambda_2$. Finally, we get
$$d_2 = \frac{1}{2\pi}\,\lambda_2.$$
This quantity is equal to $\dfrac{K_2}{4\pi}$ in Berry and Dennis [3] notation, giving their formula (4.6).
3.2. Mean length of dislocation curve. Now suppose that the space variable is of dimension 3 and the random field $\{Z(x) : x \in \mathbb{R}^3\}$ satisfies the same hypotheses as in the 2-dimensional parameter case. Generically the dislocation points form a curve $C$:
$$C = \{x : Z(x) = 0\}.$$
[…] $\int_S E\big[(\det Z'(x)Z'(x)^T)^{1/2} \,\big|\, Z(x) = 0\big]\, p_{Z(x)}(0)\,dx$,
and the verification of the validity is performed in a similar way to the 2-dimensional case above. For simplicity, we may assume again that $S$ has Lebesgue measure equal to 1. The expression can be simplified using the stationarity and the normalization of the variance, to get
$$d_3 = \frac{1}{2\pi}\, E\big[(\det Z'(x)Z'(x)^T)^{1/2}\big],$$
with
$$E\big[(\det Z'(x)Z'(x)^T)^{1/2}\big] = \lambda_2\, E(V),$$
where $V$ is the surface area of the parallelogram generated by two standard Gaussian variables in $\mathbb{R}^3$. The projection method gives
$$E(V) = E(XY) = \frac{4}{\sqrt{2\pi}}\sqrt{\frac{\pi}{2}} = 2.$$
Here $X$ and $Y$ are independent and $X$ (resp. $Y$) is the square root of a $\chi^2(3)$-distributed (resp. $\chi^2(2)$-distributed) random variable. So,
$$d_3 = \frac{\lambda_2}{\pi}.$$
In Berry and Dennis notations [3] the last quantity is denoted by $\dfrac{k_2}{3\pi}$, giving their formula (4.5).
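The two "elementary calculations" of Sections 3.1 and 3.2 both rest on the mean of the square root of a chi-square variable. As a sketch, using the standard formula for the first moment of the chi distribution (the formula itself is not stated in the text):

```latex
\[
  E\sqrt{\chi^2(n)} \;=\; \sqrt{2}\,\frac{\Gamma\!\big(\tfrac{n+1}{2}\big)}{\Gamma\!\big(\tfrac{n}{2}\big)},
  \qquad
  E\sqrt{\chi^2(1)} = \sqrt{\tfrac{2}{\pi}},\quad
  E\sqrt{\chi^2(2)} = \sqrt{\tfrac{\pi}{2}},\quad
  E\sqrt{\chi^2(3)} = \frac{4}{\sqrt{2\pi}} .
\]
\begin{align*}
  \text{dimension 2:}\quad & E\sqrt{\chi^2(2)}\; E\sqrt{\chi^2(1)}
      = \sqrt{\tfrac{\pi}{2}}\,\sqrt{\tfrac{2}{\pi}} = 1, \\
  \text{dimension 3:}\quad & E(V) = E\sqrt{\chi^2(3)}\; E\sqrt{\chi^2(2)}
      = \frac{4}{\sqrt{2\pi}}\,\sqrt{\tfrac{\pi}{2}} = 2 .
\end{align*}
```

By independence the expectation of each product factorizes, which gives the values 1 (planar case) and 2 (spatial case) for the standard Gaussian parallelograms.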
3.3. Variance. In this section, we limit ourselves to dimension 2 and the random field satisfies the hypotheses we introduced to compute the expectation of the number of dislocation points. We further assume that for $s_1, s_2 \in \mathbb{R}^2$, $s_1 \ne s_2$, the joint distribution of $\xi(s_1), \xi(s_2)$ does not degenerate. Let $S$ be again a measurable subset of $\mathbb{R}^2$ having Lebesgue measure equal to 1.
The variance of the number of dislocation points is an important issue that can be obtained via the second factorial moment of the number of zeroes. More precisely:
$$\operatorname{Var}\big(N^Z_S(0)\big) = E\Big(N^Z_S(0)\big(N^Z_S(0) - 1\big)\Big) + d_2 - d_2^2,$$
and using Theorem 6.3 of [2], we can write the formula:
$$E\Big(N^Z_S(0)\big(N^Z_S(0) - 1\big)\Big) = \int_{S\times S} A(s_1,s_2)\,ds_1\,ds_2,$$
where
$$A(s_1,s_2) = E\big(|\det Z'(s_1)\,\det Z'(s_2)| \,\big|\, Z(s_1) = Z(s_2) = 0\big)\, p_{Z(s_1),Z(s_2)}(0,0).$$
Taking into account that the law of the random field is invariant under translations and orthogonal transformations of $\mathbb{R}^2$, we have
$$A(s_1,s_2) = A\big((0,0),(r,0)\big) = A(r) \quad\text{with } r = \|s_1 - s_2\|.$$
The function $A(r)$ has two intuitive interpretations. First it can be viewed as
$$A(r) = \lim_{\varepsilon\to 0} \frac{1}{\pi^2\varepsilon^4}\, E\Big(N\big(B((0,0),\varepsilon)\big)\, N\big(B((r,0),\varepsilon)\big)\Big).$$
Second it is the density of the Palm distribution (a generalization of horizontal window conditioning of [5]) of the number of zeroes of $Z$ per unit of surface, locally around the point $(r,0)$ given that there is a zero at $(0,0)$. $A(r)/d_2^2$ is called the correlation density function in [3].
To compute $A(r)$, we recall that $\xi_1, \xi_2, \eta_1, \eta_2$ denote the partial derivatives of $\xi, \eta$ with respect to the first and second coordinate.
[…]
$$A(r) = E\big[|\det Z'(0,0)\,\det Z'(r,0)| \,\big|\, Z(0,0) = Z(r,0) = 0_2\big]\, p_{Z(0,0),Z(r,0)}(0_4)$$
$$= E\big[|(\xi_1\eta_2 - \eta_1\xi_2)(0,0)|\,|(\xi_1\eta_2 - \eta_1\xi_2)(r,0)| \,\big|\, Z(0,0) = Z(r,0) = 0_2\big]\, p_{Z(0,0),Z(r,0)}(0_4), \qquad (23)$$
where $0_p$ denotes the null vector in dimension $p$.
The density is easy to compute:
$$p_{Z(0,0),Z(r,0)}(0_4) = \frac{1}{(2\pi)^2\big(1 - \rho^2(r)\big)}, \quad\text{where } \rho(r) = \int_0^\infty J_0(kr)\,\Pi(dk).$$
Here, $J_0$ is the Bessel function of the first kind of order 0. The spectral measure $\mu$ is invariant under the isometries of $\mathbb{R}^2$, so that the measure $\Pi$ on $\mathbb{R}^+$ is defined to be such that for every $w \ge 0$, $\mu(\{\lambda \in \mathbb{R}^2 : \|\lambda\| \le w\}) = 2\,\Pi([0,w])$.
To compute the conditional expectation of the product of the absolute value of the determinants, we use again the same device as in [3], as well as the same notations. We have:
(24) $|w| = \dfrac{1}{\pi}\displaystyle\int_{-\infty}^{+\infty} \big(1 - \cos(wt)\big)\,t^{-2}\,dt.$
$$C := \rho(r), \qquad E = \rho'(r), \qquad H = -E/r, \qquad F = -\rho''(r), \qquad F_0 = -\rho''(0).$$
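Formula (24) can be checked by a scaling argument. The sketch below assumes only the classical value $\int_{-\infty}^{+\infty} (1-\cos u)\,u^{-2}\,du = \pi$, which is not proved in the text.

```latex
% For w \neq 0, substitute u = |w| t, so that dt = du/|w| and t^{-2} = w^2 u^{-2}:
\begin{align*}
  \frac{1}{\pi}\int_{-\infty}^{+\infty} \big(1-\cos(wt)\big)\,t^{-2}\,dt
   &= \frac{1}{\pi}\int_{-\infty}^{+\infty} \big(1-\cos u\big)\,\frac{w^2}{u^2}\,\frac{du}{|w|} \\
   &= \frac{|w|}{\pi}\int_{-\infty}^{+\infty} \frac{1-\cos u}{u^2}\,du
    \;=\; \frac{|w|}{\pi}\cdot \pi \;=\; |w| .
\end{align*}
% For w = 0 both sides vanish.
```

The point of the device is that the absolute value, which is not smooth, is replaced by cosines, whose expectations are real parts of characteristic functions.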
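The regression formulas imply that the conditional variance matrix of the vector
$$W = \big(\xi_1(0),\ \xi_1(r,0),\ \xi_2(0),\ \xi_2(r,0),\ \eta_1(0),\ \eta_1(r,0),\ \eta_2(0),\ \eta_2(r,0)\big),$$
is given by
$$\Sigma = \operatorname{Diag}\big(A, B, A, B\big)$$
with
$$A = \begin{pmatrix} F_0 - \dfrac{E^2}{1-C^2} & F - \dfrac{E^2C}{1-C^2} \\[2mm] F - \dfrac{E^2C}{1-C^2} & F_0 - \dfrac{E^2}{1-C^2} \end{pmatrix}, \qquad B = \begin{pmatrix} F_0 & H \\ H & F_0 \end{pmatrix}.$$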
Using formula (24) the expectation we have to compute is equal to
(25) $$\frac{1}{\pi^2}\int_{-\infty}^{+\infty} dt_1 \int_{-\infty}^{+\infty} dt_2\ t_1^{-2}\,t_2^{-2}\,\Big[1 - \tfrac12 T(t_1,0) - \tfrac12 T(-t_1,0) - \tfrac12 T(0,t_2) - \tfrac12 T(0,-t_2)$$
$$+ \tfrac14 T(t_1,t_2) + \tfrac14 T(-t_1,t_2) + \tfrac14 T(t_1,-t_2) + \tfrac14 T(-t_1,-t_2)\Big],$$
where
$$T(t_1,t_2) = E\Big(\exp\big(i(w_1 t_1 + w_2 t_2)\big)\Big),$$
with
$$w_1 = \xi_1(0)\eta_2(0) - \eta_1(0)\xi_2(0) = W_1W_7 - W_3W_5,$$
$$w_2 = \xi_1(r,0)\eta_2(r,0) - \eta_1(r,0)\xi_2(r,0) = W_2W_8 - W_4W_6.$$
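The passage from (24) to (25) can be spelled out as follows. This is a sketch of the intermediate step, with $T(t_1,t_2) = E\exp\big(i(w_1t_1 + w_2t_2)\big)$ understood as the characteristic function of $(w_1,w_2)$ under the conditional law, and with the exchange of expectation and integration taken for granted.

```latex
\begin{align*}
 |w_1|\,|w_2| &= \frac{1}{\pi^2}\iint \big(1-\cos(w_1t_1)\big)\big(1-\cos(w_2t_2)\big)\,
                 t_1^{-2}t_2^{-2}\,dt_1\,dt_2, \\
 \big(1-\cos a\big)\big(1-\cos b\big) &= 1 - \cos a - \cos b + \cos a\cos b, \\
 E\cos(w_1t_1) &= \tfrac12\big[T(t_1,0) + T(-t_1,0)\big], \qquad
 E\cos(w_2t_2) = \tfrac12\big[T(0,t_2) + T(0,-t_2)\big], \\
 E\big[\cos(w_1t_1)\cos(w_2t_2)\big]
   &= \tfrac14\big[T(t_1,t_2) + T(-t_1,t_2) + T(t_1,-t_2) + T(-t_1,-t_2)\big].
\end{align*}
```

Taking expectations term by term in the second line and inserting the last two lines gives exactly the nine terms inside the bracket of (25).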
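$T(t_1,t_2) = E\big(\exp(iW^T \mathcal{H} W)\big)$ where $W$ has the distribution $N(0,\Sigma)$ and
$$\mathcal{H} = \begin{pmatrix} 0 & 0 & 0 & D \\ 0 & 0 & -D & 0 \\ 0 & -D & 0 & 0 \\ D & 0 & 0 & 0 \end{pmatrix}, \qquad D = \frac12\begin{pmatrix} t_1 & 0 \\ 0 & t_2 \end{pmatrix}.$$
A standard diagonalization argument shows that
$$T(t_1,t_2) = E\big(\exp(iW^T \mathcal{H} W)\big) = E\Big(\exp\Big(i\sum_{j=1}^{8} \lambda_j \zeta_j^2\Big)\Big),$$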
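[…] $\displaystyle\prod_{j=1}^{8} (1 - 2i\lambda_j)^{-1/2}.$
Clearly
$$\Sigma^{1/2} = \operatorname{Diag}\big(A^{1/2}, B^{1/2}, A^{1/2}, B^{1/2}\big),$$
and
$$\Sigma^{1/2}\mathcal{H}\Sigma^{1/2} = \begin{pmatrix} 0 & 0 & 0 & M \\ 0 & 0 & -M^T & 0 \\ 0 & -M & 0 & 0 \\ M^T & 0 & 0 & 0 \end{pmatrix},$$
with $M = A^{1/2} D B^{1/2}$.
Let $\lambda$ be an eigenvalue of $\Sigma^{1/2}\mathcal{H}\Sigma^{1/2}$. It is easy to check that $\lambda^2$ is an eigenvalue of $MM^T$. Conversely, if $\lambda_1^2$ and $\lambda_2^2$ are the eigenvalues of $MM^T$, those of $\Sigma^{1/2}\mathcal{H}\Sigma^{1/2}$ are $\pm\lambda_1$ (twice) and $\pm\lambda_2$ (twice).
Note that $\lambda_1^2$ and $\lambda_2^2$ are the eigenvalues of $MM^T = A^{1/2}DBDA^{1/2}$ or equivalently, of $DBDA$. Using (26)
$$E\big(\exp(iW^T\mathcal{H}W)\big) = \big[1 + 4(\lambda_1^2 + \lambda_2^2) + 16\lambda_1^2\lambda_2^2\big]^{-1} = \big[1 + 4\operatorname{tr}(DBDA) + 16\det(DBDA)\big]^{-1}$$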
where, writing $a = F_0 - \dfrac{E^2}{1-C^2}$ and $b = F - \dfrac{E^2C}{1-C^2}$ for the entries of $A$,
$$DBDA = \frac14\begin{pmatrix} t_1^2F_0\,a + t_1t_2H\,b & t_1^2F_0\,b + t_1t_2H\,a \\[1mm] t_1t_2H\,a + t_2^2F_0\,b & t_1t_2H\,b + t_2^2F_0\,a \end{pmatrix}.$$
So,
(27) $4\operatorname{tr}(DBDA) = (t_1^2 + t_2^2)\,F_0\Big(F_0 - \dfrac{E^2}{1-C^2}\Big) + 2t_1t_2\,H\Big(F - \dfrac{E^2C}{1-C^2}\Big),$
(28) $16\det(DBDA) = t_1^2t_2^2\,\big(F_0^2 - H^2\big)\Big[\Big(F_0 - \dfrac{E^2}{1-C^2}\Big)^2 - \Big(F - \dfrac{E^2C}{1-C^2}\Big)^2\Big],$
giving
(29) $$T(t_1,t_2) = E\big(\exp(iW^T\mathcal{H}W)\big) = \Big[1 + (t_1^2+t_2^2)\,F_0\Big(F_0 - \frac{E^2}{1-C^2}\Big) + 2t_1t_2\,H\Big(F - \frac{E^2C}{1-C^2}\Big)$$
$$+\ t_1^2t_2^2\,\big(F_0^2 - H^2\big)\Big[\Big(F_0 - \frac{E^2}{1-C^2}\Big)^2 - \Big(F - \frac{E^2C}{1-C^2}\Big)^2\Big]\Big]^{-1}.$$
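The step from the eigenvalue product to the closed form (29) is short enough to write out. The sketch below assumes the standard characteristic function $E\exp(i\lambda\zeta^2) = (1-2i\lambda)^{-1/2}$ of a scaled $\chi^2(1)$ variable and the eigenvalue pattern $\pm\lambda_1$ (twice), $\pm\lambda_2$ (twice) described above.

```latex
\begin{align*}
 \prod_{j=1}^{8}(1-2i\lambda_j)^{-1/2}
   &= \Big[(1-2i\lambda_1)(1+2i\lambda_1)\Big]^{-1}
      \Big[(1-2i\lambda_2)(1+2i\lambda_2)\Big]^{-1} \\
   &= \Big[(1+4\lambda_1^2)(1+4\lambda_2^2)\Big]^{-1}
    = \Big[1 + 4(\lambda_1^2+\lambda_2^2) + 16\,\lambda_1^2\lambda_2^2\Big]^{-1}, \\
 \lambda_1^2+\lambda_2^2 &= \operatorname{tr}(DBDA), \qquad
 \lambda_1^2\lambda_2^2 = \det(DBDA) = \det(D)^2\,\det(A)\,\det(B).
\end{align*}
% With det D = t_1 t_2 / 4, det B = F_0^2 - H^2 and det A = a^2 - b^2
% (a, b the diagonal and off-diagonal entries of A), this is formula (28).
```

Each eigenvalue of multiplicity two contributes the factor $(1 \mp 2i\lambda)^{-1}$, so the product is real, as it must be for a characteristic function that is even in $(t_1,t_2)$.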
Performing the change of variable $t' = \sqrt{A_1}\,t$ with $A_1 = F_0\Big(F_0 - \dfrac{E^2}{1-C^2}\Big)$ the integral (25) becomes
(30) $$\frac{A_1}{\pi^2}\int_{-\infty}^{+\infty} dt_1 \int_{-\infty}^{+\infty} dt_2\ t_1^{-2}t_2^{-2}\Big[1 - \frac{1}{1+t_1^2} - \frac{1}{1+t_2^2}$$
$$+ \frac12\Big(\frac{1}{1 + (t_1^2+t_2^2) - 2A_2t_1t_2 + t_1^2t_2^2Z} + \frac{1}{1 + (t_1^2+t_2^2) + 2A_2t_1t_2 + t_1^2t_2^2Z}\Big)\Big]$$
$$= \frac{A_1}{\pi^2}\int_{-\infty}^{+\infty} dt_1 \int_{-\infty}^{+\infty} dt_2\ t_1^{-2}t_2^{-2}\Big[1 - \frac{1}{1+t_1^2} - \frac{1}{1+t_2^2} + \frac{1 + (t_1^2+t_2^2) + t_1^2t_2^2Z}{\big(1 + (t_1^2+t_2^2) + t_1^2t_2^2Z\big)^2 - 4A_2^2t_1^2t_2^2}\Big],$$
where
$$A_2 = \frac{H}{F_0}\cdot\frac{F(1-C^2) - E^2C}{F_0(1-C^2) - E^2}, \qquad Z = \frac{F_0^2 - H^2}{F_0^2}\Big[1 - \Big(F - \frac{E^2C}{1-C^2}\Big)^2\Big(F_0 - \frac{E^2}{1-C^2}\Big)^{-2}\Big].$$
In this form, and up to a sign change, this result is equivalent to Formula (4.43) of [3] (note that $A_2^2 = Y$ in [3]).
In order to compute the integral (30), first we obtain
$$\int_{-\infty}^{+\infty} \frac{1}{t_2^2}\Big[1 - \frac{1}{1+t_2^2}\Big]\,dt_2 = \pi.$$
We split the other term into two integrals, thus we have for the first one
$$\frac12\int_{-\infty}^{+\infty} \frac{1}{t_2^2}\Big[\frac{1}{1 + (t_1^2+t_2^2) - 2A_2t_1t_2 + t_1^2t_2^2Z} - \frac{1}{1+t_1^2}\Big]\,dt_2$$
$$= -\frac{1}{2(1+t_1^2)}\int_{-\infty}^{+\infty} \frac{1}{t_2^2}\,\frac{(1+t_1^2Z)\,t_2^2 - 2A_2t_1t_2}{1 + t_1^2 - 2A_2t_1t_2 + (1+t_1^2Z)\,t_2^2}\,dt_2$$
$$= -\frac{1}{2(1+t_1^2)}\int_{-\infty}^{+\infty} \frac{1}{t_2^2}\,\frac{t_2^2 - 2Z_1t_1t_2}{t_2^2 - 2Z_1t_1t_2 + Z_2}\,dt_2 = I_1,$$
where $Z_2 = \dfrac{1+t_1^2}{1+Zt_1^2}$ and $Z_1 = \dfrac{A_2}{1+Zt_1^2}$.
Similarly, for the second integral we get
$$\frac12\int_{-\infty}^{+\infty} \frac{1}{t_2^2}\Big[\frac{1}{1 + (t_1^2+t_2^2) + 2A_2t_1t_2 + t_1^2t_2^2Z} - \frac{1}{1+t_1^2}\Big]\,dt_2 = -\frac{1}{2(1+t_1^2)}\int_{-\infty}^{+\infty} \frac{1}{t_2^2}\,\frac{t_2^2 + 2Z_1t_1t_2}{t_2^2 + 2Z_1t_1t_2 + Z_2}\,dt_2 = I_2.$$
$$I_1 + I_2 = -\frac{1}{2(1+t_1^2)}\int_{-\infty}^{+\infty} \frac{1}{t_2^2}\Big[\frac{t_2^2 - 2Z_1t_1t_2}{t_2^2 - 2Z_1t_1t_2 + Z_2} + \frac{t_2^2 + 2Z_1t_1t_2}{t_2^2 + 2Z_1t_1t_2 + Z_2}\Big]\,dt_2$$
$$= -\frac{1}{1+t_1^2}\int_{-\infty}^{+\infty} \frac{t_2^2 + (Z_2 - 4Z_1^2t_1^2)}{t_2^4 + 2(Z_2 - 2Z_1^2t_1^2)\,t_2^2 + Z_2^2}\,dt_2$$
$$= -\frac{\pi}{1+t_1^2}\,\frac{Z_2 - 2Z_1^2t_1^2}{Z_2\sqrt{Z_2 - Z_1^2t_1^2}}.$$
In the third line we have used the formula provided by the method of residues. In fact, if the polynomial $X^2 - SX + P$ with $P > 0$ has no root in $[0,\infty)$, then
$$\int_{-\infty}^{+\infty} \frac{t^2 - \gamma}{t^4 - St^2 + P}\,dt = \frac{\pi}{\sqrt{P}\,\sqrt{-S + 2\sqrt{P}}}\,\big(\sqrt{P} - \gamma\big).$$
In our case $\gamma = -(Z_2 - 4Z_1^2t_1^2)$, $S = -2(Z_2 - 2Z_1^2t_1^2)$ and $P = Z_2^2$.
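The substitution into the residue formula can be checked line by line. The sketch below reads the signs lost in the display as $\gamma = -(Z_2 - 4Z_1^2t_1^2)$, $S = -2(Z_2 - 2Z_1^2t_1^2)$, $P = Z_2^2$, and assumes $Z_2 > 0$ and $Z_2 - Z_1^2t_1^2 > 0$.

```latex
\begin{align*}
 \sqrt{P} - \gamma &= Z_2 + (Z_2 - 4Z_1^2t_1^2) = 2\,(Z_2 - 2Z_1^2t_1^2), \\
 -S + 2\sqrt{P} &= 2\,(Z_2 - 2Z_1^2t_1^2) + 2Z_2 = 4\,(Z_2 - Z_1^2t_1^2), \\
 \int_{-\infty}^{+\infty}\frac{t^2-\gamma}{t^4 - St^2 + P}\,dt
   &= \frac{\pi\cdot 2\,(Z_2 - 2Z_1^2t_1^2)}{Z_2\cdot 2\sqrt{Z_2 - Z_1^2t_1^2}}
    = \pi\,\frac{Z_2 - 2Z_1^2t_1^2}{Z_2\sqrt{Z_2 - Z_1^2t_1^2}} .
\end{align*}
% Sanity check of the general formula: S = -2, P = 1, gamma = 0 gives
% \int t^2/(1+t^2)^2 dt = \pi/2, in agreement with \pi/(1 \cdot \sqrt{4}) \cdot 1.
```

Multiplying by the prefactor $-1/(1+t_1^2)$ gives the closed form of $I_1 + I_2$ used in the final expression for $A(r)$.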
Therefore we get
$$A(r) = \frac{A_1}{4\pi^3(1-C^2)}\int_{-\infty}^{+\infty} \frac{1}{t_1^2}\Big[1 - \frac{1}{1+t_1^2}\,\frac{Z_2 - 2Z_1^2t_1^2}{Z_2\sqrt{Z_2 - Z_1^2t_1^2}}\Big]\,dt_1.$$
Acknowledgement. This work has received financial support from European Marie Curie Network SEAMOCS.
References
[1] J.-M. Azaïs, J. León and M. Wschebor, Rice formulae and Gaussian waves, Bernoulli, vol. 17, No. 1, 170-193 (2011).
[2] J.-M. Azaïs and M. Wschebor, Level sets and extrema of random processes and fields, Wiley (2009).
[3] M.V. Berry and M.R. Dennis, Phase singularities in isotropic random waves, Proc. R. Soc. Lond. A, 456, 2059-2079 (2000).
[4] L. Callenbach, P. Hänggi and S.J. Linz, Oscillatory systems driven by noise: Frequency and phase synchronization, Physical Review E, vol. 65, 051110 (2002).
[5] H. Cramér and M.R. Leadbetter, Stationary and Related Stochastic Processes, Wiley (1967).
[6] Li and Wei, Gaussian integrals involving absolute value functions. IMS Lecture Notes Monograph Series. Proceedings of the Conference in Luminy (IMS) (2009).
[7] M.S. Longuet-Higgins, Reflection and refraction at a random surface. I, II, III, Journal of the Optical Society of America, vol. 50, No. 9, 838-856 (1960).
[8] M.S. Longuet-Higgins, The statistical geometry of random surfaces. Proc. Symp. Appl. Math., Vol. XIII, AMS Providence R.I., 105-143 (1962).
[9] A.O. Petters, B. Rider and A.M. Teguia, A Theory of Stochastic Microlensing I. Random Images, Shear and the Kac-Rice Formula. J. Math. Physics, vol. 50, 122501 (2009).
[10] S.O. Rice, Mathematical Analysis of Random Noise, Bell System Tech. J., 23, 282-332; 24, 45-156 (1944-1945).
[…] $\pi_*(O_X) = O_Y$ induces a homomorphism $\pi_* : \mathrm{Aut}^o(X) \to \mathrm{Aut}^o(Y)$ between the neutral components of the automorphism group schemes (Corollary 2.2). Our proof is an adaptation of that given in [Ak] in the setting of complex spaces.
In Section 3, we consider a torsor $\pi : X \to Y$ under a connected group scheme $G$, and show the existence of the associated fiber bundle $X \times^G G/H = X/H$ for any subgroup scheme $H \subset G$ (Theorem 3.3). As a consequence, $X \times^G Z$ exists when $Z$ is the total space of a $G$-torsor, or a group scheme where $G$ acts via a homomorphism (Corollary 3.4). Another application of Theorem 3.3 concerns the quasi-projectivity of torsors (Corollary 3.5); it builds on work of Raynaud, who showed e.g. the local quasi-projectivity of homogeneous spaces over a normal scheme (see [Ra]).
The automorphism groups of torsors are studied in Section 4. In particular, we obtain a version of Morimoto's theorem: the equivariant automorphisms of a torsor over a proper scheme form a group scheme, locally of finite type (Theorem 4.2). Here our proof, based on an equivariant completion of the structure group, is quite different from the original one. We also analyze the relative equivariant automorphism group of such a torsor; this yields a version of Chevalley's structure theorem for algebraic groups in that setting (Proposition 4.3).
The final Section 5 contains a full description of relative equivariant automorphisms for torsors under abelian varieties (Proposition 5.1) and our lifting result for automorphisms of the base (Theorem 5.4).
Acknowledgements. Many thanks to Gaël Rémond for several clarifying discussions, and special thanks to the referee for very helpful comments and corrections. In fact, the final step of the proof of Theorem 3.3 is taken from the referee's report; the end of the proof of Corollary 2.2, and the proof of Corollary 3.4 (ii), closely follow his/her suggestions.
Notation and conventions. Throughout this article, we consider algebraic varieties, schemes, and morphisms over an algebraically closed field $k$. Unless explicitly mentioned, we will assume that the considered schemes are of finite type over $k$ (such schemes are also called algebraic schemes). By a point of a scheme $X$, we will mean a closed point unless explicitly mentioned. A variety is an integral separated scheme.
We will use [DG] as a general reference for group schemes. Given such a group scheme $G$, we denote by $\mu_G : G \times G \to G$ the multiplication and by $e_G \in G(k)$ the neutral element. The neutral component of $G$ is denoted by $G^o$, and the Lie algebra by $\mathrm{Lie}(G)$.
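We recall that an action of $G$ on a scheme $X$ is a morphism
$$\alpha : G \times X \to X, \quad (g,x) \mapsto g\cdot x$$
such that the composite map
$$X \xrightarrow{\ e_G \times \mathrm{id}_X\ } G \times X \xrightarrow{\ \alpha\ } X$$
is the identity, and the square
$$\begin{array}{ccc} G\times G\times X & \xrightarrow{\ \mathrm{id}_G\times\alpha\ } & G\times X \\ {\scriptstyle \mu_G\times\mathrm{id}_X}\downarrow & & \downarrow{\scriptstyle \alpha} \\ G\times X & \xrightarrow{\ \alpha\ } & X \end{array}$$
commutes. We then say that $X$ is a $G$-scheme. A morphism $f : X \to Y$ between two $G$-schemes is called equivariant if the square
$$\begin{array}{ccc} G\times X & \xrightarrow{\ \alpha\ } & X \\ {\scriptstyle \mathrm{id}_G\times f}\downarrow & & \downarrow{\scriptstyle f} \\ G\times Y & \xrightarrow{\ \beta\ } & Y \end{array}$$
commutes (with the obvious notation). We then say that $f$ is a $G$-morphism.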
A smooth group scheme will be called an algebraic group. By Chevalley's structure theorem (see [Ro, Theorem 16], or [Co] for a modern proof), every connected algebraic group $G$ has a largest closed connected normal affine subgroup $G_{\mathrm{aff}}$; moreover, the quotient $G/G_{\mathrm{aff}} =: A(G)$ is an abelian variety. This yields an exact sequence of connected algebraic groups
$$1 \to G_{\mathrm{aff}} \to G \to A(G) \to 1.$$
2. Descending automorphisms for fiber spaces
We begin with the following scheme-theoretic version of a result of Blanchard (see [Bl, Section I.1] and also [Ak, Lemma 2.4.2]).
Proposition 2.1. Let $G$ be a connected group scheme, $X$ a $G$-scheme, $Y$ a scheme, and $\pi : X \to Y$ a proper morphism such that $\pi_*(O_X) = O_Y$. Then there is a unique $G$-action on $Y$ such that $\pi$ is equivariant.
Proof. We will consider a scheme $Z$ as the ringed space $(Z(k), O_Z)$ where the set $Z(k)$ is equipped with the Zariski topology; this makes sense as $Z$ is of finite type.
We first claim that the abstract group $G(k)$ permutes the fibers of $\pi : X(k) \to Y(k)$ (note that these fibers are non-empty and connected, since $\pi_*(O_X) = O_Y$). Let $y \in Y(k)$ and denote by $F_y$ the set-theoretic fiber of $\pi$ at $y$, viewed as a closed reduced subscheme of $X$. Then the map
$$\varphi : G_{\mathrm{red}} \times F_y \to Y, \quad (g,x) \mapsto \pi(g\cdot x)$$
maps $\{e_G\} \times F_y$ to the point $y$. Moreover, $G_{\mathrm{red}}$ is a variety, and $F_y$ is connected and proper. By the rigidity lemma (see [Mu, p. 43]), it follows that $\varphi$ maps $\{g\} \times F_y$ to a point for any $g \in G(k)$, i.e., $g\cdot F_y \subset F_{g\cdot y}$. Thus, $g^{-1}\cdot F_{g\cdot y} \subset F_y$ and hence $g\cdot F_y = F_{g\cdot y}$. This implies our claim.
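That claim yields a commutative square
$$\begin{array}{ccc} G(k)\times X(k) & \xrightarrow{\ \alpha\ } & X(k) \\ {\scriptstyle \mathrm{id}_G\times\pi}\downarrow & & \downarrow{\scriptstyle \pi} \\ G(k)\times Y(k) & \xrightarrow{\ \beta\ } & Y(k), \end{array}$$
where $\beta$ is an action of the (abstract) group $G(k)$.
Next, we show that $\beta$ is continuous. It suffices to show that $\beta^{-1}(Z)$ is closed for any closed subset $Z \subset Y(k)$. But $(\mathrm{id}_G,\pi)^{-1}\beta^{-1}(Z) = \alpha^{-1}\pi^{-1}(Z)$ is closed, and $(\mathrm{id}_G,\pi)$ is proper and surjective; this yields our assertion.
Finally, we define a morphism of sheaves of $k$-algebras
$$\beta^\# : O_Y \to \beta_*(O_{G\times Y}).$$
For this, to any open subset $V \subset Y$, we associate a homomorphism of algebras
$$\beta^\#(V) : O_Y(V) \to O_{G\times Y}\big(\beta^{-1}(V)\big).$$
By assumption, the left-hand side is isomorphic to $O_X\big(\pi^{-1}(V)\big)$, and the right-hand side to
$$O_{G\times X}\big((\mathrm{id}_G,\pi)^{-1}\beta^{-1}(V)\big) = O_{G\times X}\big(\alpha^{-1}\pi^{-1}(V)\big).$$
We define $\beta^\#(V) := \alpha^\#\big(\pi^{-1}(V)\big)$ […] $G\times Y \to Y$ commutes.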
It remains to show that is an action of the group scheme G.
Note that e
G
acts on X(k) via the identity; moreover, the composite
morphism of sheaves
O
Y
(O
GY
)
(e
G
id
Y
)
#
(O
{e
G
}Y
)
= O
Y
is the identity, since so is the analogous morphism
O
X
(O
GX
)
(e
G
id
X
)
#
(O
{e
G
}X
)
= O
X
and
(O
X
) = O
Y
. Likewise, the square
GGY
id
G
GY
G
id
Y
GY
Y
commutes on closed points, and the corresponding square of mor-
phisms of sheaves commutes as well, since the analogous square with
Y replaced by X commutes.
This proposition will imply a result of descent for group scheme actions, analogous to [Bl, Proposition I.1] (see also [Ak, Proposition 2.4.1]). To state that result, we need some recollections on automorphism functors.
Given a scheme $S$, we denote by $\mathrm{Aut}_S(X \times S)$ the group of automorphisms of $X \times S$ viewed as a scheme over $S$. The assignment $S \mapsto \mathrm{Aut}_S(X \times S)$ yields a group functor $\underline{\mathrm{Aut}}(X)$, i.e., a contravariant functor from the category of schemes to that of groups. If $X$ is proper, then $\underline{\mathrm{Aut}}(X)$ is represented by a group scheme $\mathrm{Aut}(X)$, locally of finite type (see [MO, Theorem 3.7]). In particular, the neutral component $\mathrm{Aut}^o(X)$ is a group scheme of finite type. Also, recall that
(1) $\mathrm{Lie}\,\mathrm{Aut}(X) \cong \Gamma(X, T_X)$
where the right-hand side denotes the Lie algebra of global vector fields on $X$, that is, of derivations of $O_X$.
We now are in a position to state:
Corollary 2.2. Let $\pi : X \to Y$ be a morphism of proper schemes such that $\pi_*(O_X) = O_Y$. Then $\pi$ induces a homomorphism of group schemes
$$\pi_* : \mathrm{Aut}^o(X) \to \mathrm{Aut}^o(Y).$$
Proof. This is a formal consequence of Proposition 2.1. Specifically, let $G := \mathrm{Aut}^o(X)$ and consider the $G$-action on $Y$ obtained in that proposition. This yields an automorphism of $Y \times G$ as a scheme over $G$,
$$(y,g) \mapsto (g\cdot y, g),$$
and in turn a morphism (of schemes)
$$\pi_* : G \to \mathrm{Aut}(Y).$$
Moreover, $\pi_*(e_G) = e_{\mathrm{Aut}(Y)}$ since $e_G$ acts via the identity. As $G$ is connected, it follows that the image of $\pi_*$ is contained in $\mathrm{Aut}^o(Y) =: H$. In other words, we have a morphism of schemes $\pi_* : G \to H$ such that $\pi_*(e_G) = e_H$. It remains to check that $\pi_*$ is a homomorphism; but this follows from the fact that
: Aut
o
(X)Aut
o
(Y ) Aut
o
(X), q
: Aut
o
(X)Aut
o
(Y ) Aut
o
(Y ).
This implies readily the following analogue of [Bl, Corollaire, p. 161]:
Corollary 2.3. Let $X$ and $Y$ be complete varieties. Then the homomorphism
$$(p_*, q_*) : \mathrm{Aut}^o(X \times Y) \to \mathrm{Aut}^o(X) \times \mathrm{Aut}^o(Y)$$
is an isomorphism, with inverse the natural homomorphism
$$\mathrm{Aut}^o(X) \times \mathrm{Aut}^o(Y) \to \mathrm{Aut}^o(X \times Y), \quad (g,h) \mapsto g \times h.$$
More generally, the isomorphism
$$\mathrm{Aut}^o(X \times Y) \cong \mathrm{Aut}^o(X) \times \mathrm{Aut}^o(Y)$$
holds for those proper schemes $X$ and $Y$ such that $O(X) = O(Y) = k$, but may fail for arbitrary proper schemes. Indeed, let $X$ be a complete variety having non-zero global vector fields, and let $Y := \mathrm{Spec}\, k[\epsilon]$ where $\epsilon^2 = 0$; denote by $y$ the closed point of $Y$. Then we have an exact sequence
$$1 \to \Gamma(X, T_X) \to \mathrm{Aut}_Y(X \times Y) \to \mathrm{Aut}(X) \to 1,$$
where the map on the right is obtained by restricting to $X \times \{y\}$. This identifies the vector group $\Gamma(X, T_X)$ to a closed subgroup of $\mathrm{Aut}^o(X \times Y)$, which is not in the image of the natural homomorphism.
Likewise, $\mathrm{Aut}(X \times Y)$ is generally strictly larger than $\mathrm{Aut}(X) \times \mathrm{Aut}(Y)$ (e.g. take $Y = X$ and consider the automorphism $(x,y) \mapsto (y,x)$).
3. Torsors and associated fiber bundles
Consider a group scheme $G$, a $G$-scheme $X$, and a $G$-invariant morphism
(2) $\pi : X \to Y,$
where $Y$ is a scheme. We say that $X$ is a $G$-torsor over $Y$, if $\pi$ is faithfully flat and the morphism
(3) $\alpha \times p_2 : G \times X \to X \times_Y X, \quad (g,x) \mapsto (g\cdot x, x)$
is an isomorphism. The latter condition is equivalent to the existence of a faithfully flat morphism $f : Y$
: X
Y
Y
(O
X
)
G
(the subsheaf of G-invariants in
(O
X
)). Thus, we
will also denote Y by X/G.
Remark 3.1. If $G$ is an affine algebraic group, then every $G$-torsor (2) is locally isotrivial, i.e., for any point $y \in Y$ there exist an open subscheme $V \subset Y$ containing $y$ and a finite étale surjective morphism $f : V$ […] is trivial (this result is due to Grothendieck, see [Ra, Lemme XIV 1.4] for a detailed proof). The local isotriviality of $\pi$ also holds if $G$ is an algebraic group and $Y_{\mathrm{red}}$ is normal, as a consequence of [loc. cit., Théorème XIV 1.2]. In particular, $\pi$ is locally trivial for the étale topology in both cases.
Yet there exist torsors under algebraic groups that are not locally isotrivial, see [loc. cit., XIII 3.1] (reformulated in more concrete terms in [Br1, Example 6.2]) for an example where $Y$ is a rational nodal curve, and $G$ is an abelian variety having a point of infinite order.
Given a $G$-torsor (2) and a $G$-scheme $Z$, we may view $X \times Z$ as a $G$-scheme for the diagonal action, and ask if there exist a $G$-torsor $\varpi : X \times Z \to W$ where $W$ is a scheme, and a morphism $q : W \to Y$ such that the square
$$\begin{array}{ccc} X\times Z & \xrightarrow{\ p_1\ } & X \\ {\scriptstyle \varpi}\downarrow & & \downarrow{\scriptstyle \pi} \\ W & \xrightarrow{\ q\ } & Y \end{array}$$
is cartesian; here $p_1$ denotes the first projection. Then $q$ is called the associated fiber bundle with fiber $Z$. The quotient scheme $W$ will be denoted by $X \times^G Z$.
The answer to this question is positive if $Z$ admits an ample $G$-linearized invertible sheaf (as follows from descent theory; see [SGA1, Proposition 7.8] and also [MFK, Proposition 7.1]). In particular, the answer is positive if $Z$ is affine. Yet the answer is generally negative, even if $Z$ is a smooth variety; see [Bi]. However, associated fiber bundles do exist in the category of algebraic spaces, see [KM, Corollary 1.2].
Of special interest is the case that the ber is a group scheme G
. Then X
:=
X
G
G
is a G
, then X
comes with a G
-morphism to G
G
q
X
f
G/H
where q denotes the quotient map. Then X
is a G-scheme and f
is a
G-morphism with ber Z at e
G
. It follows readily that the morphism
GZ X
, (g, z) g z
is an isomorphism, with inverse
X
GZ, x
(x
), f
(x
)
1
x
.
This identies f
X X
Y
Y
and let
U := U
Y
Y . Then is a G-torsor, and hence
X is normal.
Moreover,
X = G
U contains
U as an open ane subset. Hence
is quasi-projective by [Ra, Theor`eme VI 2.3]. Therefore, to show
that X/H is a scheme, it suces to check that
X/H is a scheme in
view of [loc. cit., Lemme XI.3.2]. Thus, we may assume that X is
normal and is quasi-projective. Then we may further assume that
Y is quasi-projective, and hence so is X. Now X is the disjoint union
of its irreducible components, and each of them is G-stable; thus, we
may assume that X is irreducible. This yields the desired reduction.
Thus, we may assume that there exists an ample invertible sheaf $L$ on $X$; since $X$ is normal, we may assume that $L$ is $G_{\mathrm{aff}}$-linearized. In view of [Br1, Lemma 3.2], it follows that there exists a $G$-morphism $X \to G/G_1$, where $G_1 \subset G$ is a subgroup scheme containing $G_{\mathrm{aff}}$ and such that $G_1/G_{\mathrm{aff}}$ is finite. By Lemma 3.2, this yields a $G$-isomorphism
$$X \cong G \times^{G_1} X_1$$
where $X_1 \subset X$ is a closed subscheme, stable under $G_1$. Moreover, the restriction $\pi_1 : X_1 \to Y$ is a $G_1$-torsor. Since $G_1$ is affine, so is the morphism $\pi_1$ and hence $X_1$ is quasi-projective.
We now show that
1
factors as a G
a
-torsor p
1
: X
1
X
1
/G
a
,
where X
1
/G
a
is a quasi-projective scheme, followed by a G
1
/G
a
-
torsor q
1
: X
1
/G
a
Y . Indeed, the associated ber bundle X
1
G
1
G
1
/G
a
is a quasi-projective scheme, since G
1
/G
a
is ane; we then
take for p
1
the composite of the morphism id
X
1
e
G
1
: X
1
X
1
G
1
with the natural morphism X
1
G
1
X
1
G
1
G
1
/G
a
. Then p
1
is
G
a
-invariant and ts into a commutative diagram
X
1
G
1
X
1
G
1
/G
a
X
1
X
1
p
1
X
1
G
1
G
1
/G
a
q
1
X
1
/G
1
= Y
where the top horizontal arrows are the natural projections, and the
vertical arrows are G
1
-torsors; thus, p
1
is a G
a
-torsor.
Next, note that the smooth, quasi-projective G
a
-variety G admits
a G
a
-linearized ample invertible sheaf. By the preceding step and
[MFK, Proposition 7.1], it follows that G
G
a
X
1
is a quasi-projective
scheme; it is the total space of a G
1
/G
a
-torsor over X = G
G
1
X
1
.
Likewise, G/H
G
a
X
1
is a quasi-projective scheme, the total space
of a G
1
/G
a
-torsor over
(G/H
G
a
X
1
)/(G
1
/G
a
) =: Z
It follows that Z = G/H
G
1
X
1
ts into a cartesian square
GX
1
rid
X
1
G/H X
1
X
p
Z
where the vertical arrows are G
1
-torsors; therefore, p is an H-torsor.
Finally, in the general case, we may assume that k has characteris-
tic p > 0. For any positive integer n, we then have the n-th Frobenius
morphism
F
n
G
: G G
(n)
.
Its kernel G
n
is a nite local subgroup scheme of G. Likewise, we
have the n-th Frobenius morphism
F
n
X
: X X
(n)
and G
(n)
acts on X
(n)
compatibly with the G-action on X. In par-
ticular, F
n
X
is invariant under G
n
. Since the morphism F
n
X
is nite,
the sheaf of O
X
(n) -algebras
(F
n
X
)
O
X
G
n
is of nite type. Thus, the
scheme
X/G
n
:= Spec
X
(n)
(F
n
X
)
O
X
G
n
is of nite type, and F
n
X
is the composite of the natural morphisms
X X/G
n
X
(n)
. Clearly, the formation of X/G
n
commutes with
faithfully at base change; thus, the morphism
n
: X X/G
n
is a G
n
-torsor, since this holds for the trivial G-torsor G Y Y .
As a consequence, factors through
n
, the G-action on X descends
to an action of G/G
n
= G
(n)
on X/G
n
, and the map X/G
n
Y
is a G
(n)
-torsor. Note that G
(n)
is reduced, and hence a connected
algebraic group, for n 0.
Now consider the restriction
F
n
H
: H H
(n)
with kernel H
n
= H G
n
. Then H acts on X/G
n
via its quotient
H/H
n
= H
(n)
G
n)
. By the preceding step, there exists an H
(n)
-
torsor X/G
n
(X/G
n
)/H
(n)
= X/G
n
H, and hence a G
n
H-torsor
p
n
: X X/G
n
H
where X/G
n
H is a scheme (of nite type). We now set
Z := Spec
X/G
n
H
(p
n
)
O
X
H
so that p
n
factors through a morphism p : X Z. Then p is an H-
torsor, since the formations of X/G
n
H and Z commute with faithfully
at base change, and p is just the natural map G Y G/H Y
when is the trivial torsor over Y . Likewise, the morphism Z
X/G
n
H is nite, and hence the scheme Z is of nite type.
(ii) The composite map
GX
X
p
X/H
is invariant under the action of HH on GX via (h
1
, h
2
) (g, x) =
(gh
1
1
, h
2
x). This yields a morphism : G/HX/H X/H which
is readily seen to be an action.
Corollary 3.4. Let again $G$ be a connected group scheme.
(i) Given two $G$-torsors $\pi_1 : X_1 \to Y_1$ and $\pi_2 : X_2 \to Y_2$, the associated torsor $X_1 \times X_2 \to X_1 \times^G X_2$ exists.
(ii) Given a homomorphism of group schemes $f : G \to G'$ and a $G$-torsor $\pi : X \to Y$, the $G'$-torsor $\pi' : G' \times^G X \to Y$ (obtained by extension of structure groups) exists.
Proof. (i) Apply Theorem 3.3 to the $G \times G$-torsor $X_1 \times X_2 \to Y_1 \times Y_2$ and to the diagonal embedding of $G$ into $G \times G$.
(ii) Denote by $\bar G$ the (scheme-theoretic) image of $f$ and by $p : G' \to G'/\bar G$ the quotient morphism. Then $p \times \pi : G' \times X \to G'/\bar G \times Y$ is a $\bar G \times G$-torsor. Moreover, $\bar G \times G$ is a connected group scheme, and contains $G$ viewed as the image of the homomorphism $f \times \mathrm{id}$. Applying Theorem 3.3 again yields a $G$-torsor $G' \times X \to G' \times^G X$. Moreover, the trivial $G'$-torsor $G' \times X \to X$ descends to a $G'$-torsor $G' \times^G X \to X/G = Y$.
Corollary 3.5. Let $G$ be a connected algebraic group. Then every $G$-torsor (2) factors uniquely as the composite
(5) $X \xrightarrow{\ p\ } Z \xrightarrow{\ q\ } Y,$
where $Z$ is a scheme, $p$ is a $G_{\mathrm{aff}}$-torsor, and $q$ is an $A(G)$-torsor. Here $p$ is affine and $q$ is proper.
Moreover, the following conditions are equivalent:
(1) $\pi$ is quasi-projective.
(2) $q$ is projective.
(3) $q$ admits a reduction of structure group to a finite subgroup scheme $F \subset A(G)$.
(4) $\pi$ admits a reduction of structure group to an affine subgroup scheme $H \subset G$.
These conditions hold if $X$ is smooth. In characteristic 0, they imply that $q$ is isotrivial and $\pi$ is locally isotrivial.
Proof. The existence and uniqueness of the factorization are direct consequences of Theorem 3.3. The assertions on $p$ and $q$ follow by descent theory (see [SGA1, Exposé VIII, Corollaires 4.8, 5.6]).
(1)$\Rightarrow$(2) is a consequence of [Ra, Lemme XIV 1.5 (ii)].
(2)$\Rightarrow$(1) holds since $p$ is affine.
(2)$\Rightarrow$(3) follows from [Br1, Lemma 3.2].
(3)$\Rightarrow$(4) Let $H \subset G$ be the preimage of $F$. Then $G/H \cong A(G)/F$. By assumption, $X/G_{\mathrm{aff}}$ admits an $A(G)$-morphism to $A(G)/F$; this yields a $G$-morphism $X \to G/H$.
(4)$\Rightarrow$(3) Since $G_{\mathrm{aff}}H$ is affine (as a quotient of the affine group scheme $G_{\mathrm{aff}} \times H$), we may replace $H$ with $G_{\mathrm{aff}}H$. Thus, we may assume that $H$ is the preimage of a finite subgroup scheme $F \subset A(G)$. Then $q$ admits a reduction of structure group to $A(G)/F$.
If $X$ is smooth, then so are $Y$ and $Z$; in that case, (3) follows from [Ro, Theorem 14] or alternatively from [Ra, Proposition XIII 2.6].
Also, (3) means that Y
= A(G)
F
Z
as A(G)-torsors over Z
=
Z
/F, where Z
p
2
Z
Y
q
Z
where the vertical arrows are F-torsors, and hence etale in character-
istic 0. This shows the isotriviality of q. Since p is locally isotrivial,
so is .
Remarks 3.6. (i) The equivalent conditions in the preceding result do not generally hold in the setting of normal varieties. Specifically, given an elliptic curve $G$, there exists a $G$-torsor $\pi : X \to Y$ where $Y$ is a normal affine surface and $X$ is not quasi-projective; then of course $\pi$ is not projective (see [Br1, Example 6.4], adapted from [Ra, XIII 3.2]).
(ii) If (4) holds, one may ask whether $\pi$ admits a reduction of structure group to some affine algebraic subgroup $H \subset G$. The answer is trivially positive in characteristic 0, but negative in characteristic $p > 0$, as shown by the following example.
Choose an integer $n \ge 2$ not divisible by $p$, and let $C$ denote the curve of equation $y^p = x^n - 1$ in the affine plane $\mathbb{A}^2$, minus all points $(x,0)$ where $x$ is an $n$-th root of unity. The group scheme $\mu_p$ of $p$-th roots of unity acts on $\mathbb{A}^2$ via $t\cdot(x,y) = (x,ty)$, and this action leaves $C$ stable. The morphism $\mathbb{A}^2 \to \mathbb{A}^2$, $(x,y) \mapsto (x,y^p)$ restricts to a $\mu_p$-torsor
$$q : C \to Y$$
where $Y \subset \mathbb{A}^2$ denotes the curve of equation $y = x^n - 1$ minus all points $(x,0)$ with $x^n = 1$. Note that $Y$ is smooth, whereas $C$ is singular; both curves are rational, since the equation of $C$ may be rewritten as $x^n = (y+1)^p$.
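The rewriting of the equation of $C$ uses only the additivity of the Frobenius map in characteristic $p$; as a short worked step:

```latex
\[
  (y+1)^p \;=\; \sum_{j=0}^{p} \binom{p}{j}\, y^{j} \;=\; y^p + 1
  \quad\text{in characteristic } p,
  \qquad\text{since } p \ \Big|\ \binom{p}{j} \ \text{ for } 0 < j < p .
\]
\[
  \text{Hence}\qquad y^p = x^n - 1 \iff x^n = y^p + 1 = (y+1)^p .
\]
```

The same identity shows that $\mu_p$ preserves $C$: for $t^p = 1$ one has $(ty)^p = y^p$, so $(x,ty)$ satisfies the equation whenever $(x,y)$ does.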
Next, let $G$ be an ordinary elliptic curve, so that $G$ contains $\mu_p$, and denote by
$$\pi : X = G \times^{\mu_p} C \to Y$$
the $G$-torsor obtained by extension of structure group (which exists since $C$ is affine). Then $X$ is a smooth surface.
We show that there exists no $G$-morphism $f : X \to G/H$, where $H$ is an affine algebraic (or equivalently, finite) subgroup of $G$. Indeed, $f$ would map the rational curve $C \subset X$ and all its translates by $G$ to points of the elliptic curve $G/H$, and hence $f$ would factor through a $G$-morphism $G/\mu_p \to G/H$. As a consequence, $\mu_p \subset H$, a contradiction.
(iii) Given a torsor (2) under a group scheme (of finite type) $G$, there exists a unique factorization
(6) $X \xrightarrow{\ p\ } Z = X/G^o_{\mathrm{red}} \xrightarrow{\ q\ } Y$
where $Z$ is a scheme, $p$ is a torsor under the connected algebraic group $G^o_{\mathrm{red}}$, and $q$ is finite. (Indeed, $Z = X \times^G G/G^o_{\mathrm{red}}$ as in the proof of Theorem 3.3).
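4. Automorphism groups of torsors
To any $G$-torsor $\pi : X \to Y$ as in Section 3, one associates several groups of automorphisms:
• the automorphism group of $X$ as a scheme over $Y$, denoted by $\mathrm{Aut}_Y(X)$ and called the relative automorphism group,
• the automorphism group of the pair $(X,Y)$, denoted by $\mathrm{Aut}(X,Y)$: it consists of those pairs $(\varphi,\psi) \in \mathrm{Aut}(X) \times \mathrm{Aut}(Y)$ such that the square
$$\begin{array}{ccc} X & \xrightarrow{\ \varphi\ } & X \\ {\scriptstyle \pi}\downarrow & & \downarrow{\scriptstyle \pi} \\ Y & \xrightarrow{\ \psi\ } & Y \end{array}$$
commutes,
• the automorphism group of $X$ viewed as a $G$-scheme, denoted by $\mathrm{Aut}^G(X)$ and called the equivariant automorphism group.
Clearly, the projection $p_2 : \mathrm{Aut}(X) \times \mathrm{Aut}(Y) \to \mathrm{Aut}(Y)$ yields an exact sequence of (abstract) groups
$$1 \to \mathrm{Aut}_Y(X) \to \mathrm{Aut}(X,Y) \xrightarrow{\ p_2\ } \mathrm{Aut}(Y).$$
Also, note that each $G$-morphism $\varphi : X \to X$ descends to a morphism $\psi : Y \to Y$, since $\pi\circ\varphi : X \to Y$ is $G$-invariant and $\pi$ is a categorical quotient. The assignment $\varphi \in \mathrm{Aut}^G(X) \mapsto \psi =:$ […] $\mathrm{Aut}(Y)$.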
Moreover, we may view the equivariant automorphisms as those
pairs (, ) where Aut(Y ), and : X X
is a G-morphism.
Here X
F : X X, x f(x)x
Aut(Y ).
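Also, by Lemma 4.1, we have a functorial isomorphism
$$\operatorname{Aut}_{Y \times S}(X \times S) \cong \operatorname{Hom}(X \times S, G).$$
In other words, $\operatorname{Aut}_Y(X)$ is isomorphic to the group functor
$$\operatorname{Hom}(X, G) : S \longmapsto \operatorname{Hom}(X \times S, G).$$
As a consequence, $\operatorname{Aut}^G_Y(X)$ is isomorphic to $\operatorname{Hom}^G(X, G) : S \mapsto \operatorname{Hom}^G(X \times S, G)$. This readily yields isomorphisms
$$\operatorname{Lie} \operatorname{Aut}_Y(X) \cong \operatorname{Hom}\big(X, \operatorname{Lie}(G)\big) \cong \mathcal{O}(X) \otimes \operatorname{Lie}(G),$$
$$\operatorname{Lie} \operatorname{Aut}^G_Y(X) \cong \operatorname{Hom}^G\big(X, \operatorname{Lie}(G)\big) \cong \big(\mathcal{O}(X) \otimes \operatorname{Lie}(G)\big)^G.$$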
We now obtain a finiteness result for $\operatorname{Aut}^G(X)$, analogous to a theorem of Morimoto (see [Mo, Théorème, p. 158]):
Theorem 4.2. Consider a $G$-torsor $\pi : X \to Y$ where $G$ is a group scheme, $X$ a scheme, and $Y$ a proper scheme. Then the functor $\operatorname{Aut}^G(X)$ is represented by a group scheme, locally of finite type, with Lie algebra $\Gamma(X, T_X)^G$.
Proof. The assertion on the Lie algebra follows from the $G$-isomorphism (1).
To show the representability assertion, we first reduce to the case that $G$ is a connected affine algebraic group. Let $G_{\mathrm{aff}}$ denote the largest closed normal affine subgroup of $G$, or equivalently of $G^o_{\mathrm{red}}$. Then $\operatorname{Aut}^G(X)$ is a closed subfunctor of $\operatorname{Aut}^{G_{\mathrm{aff}}}(X)$. Moreover, the factorizations (5) and (6) yield a factorization of $\pi$ as
$$X \xrightarrow{\ p\ } X/G_{\mathrm{aff}} \xrightarrow{\ q\ } X/G^o_{\mathrm{red}} \xrightarrow{\ r\ } Y$$
where $p$ is a torsor under $G_{\mathrm{aff}}$, $q$ a torsor under $G^o_{\mathrm{red}}/G_{\mathrm{aff}}$, and $r$ is a finite morphism. Since $q$ and $r$ are proper, $X/G_{\mathrm{aff}}$ is proper as well. This yields the desired reduction.
Next, we may embed $G$ as a closed subgroup of $\mathrm{GL}(V)$ for some finite-dimensional vector space $V$. Let $Z$ denote the closure of $G$ in the projective completion of $\operatorname{End}(V)$. Then $Z$ is a projective variety equipped with an action of $G \times G$ (arising from the $G \times G$-action on $\operatorname{End}(V)$ via left and right multiplication) and with an ample $G \times G$-linearized invertible sheaf. By construction, $G$ (viewed as a $G \times G$-variety via left and right multiplication) is the open dense $G \times G$-orbit in $Z$.
As seen in Section 3, the associated fiber bundle $X \times^G Z$ (for the left $G$-action on $Z$) exists; it is equipped with a $G$-action arising from the right $G$-action on $Z$. Moreover, $X \times^G Z$ contains $X \times^G G \cong X$ as a dense open $G$-stable subscheme. Also, recall the cartesian square
$$\begin{array}{ccc} X \times Z & \xrightarrow{\ p\ } & X \\ \downarrow & & \downarrow{\scriptstyle\pi} \\ X \times^G Z & \xrightarrow{\ q\ } & Y. \end{array}$$
Since $Z$ is complete and $\pi$ is faithfully flat, it follows that $q$ is proper, and hence so is $X \times^G Z$.
Now let S be a scheme, and Aut
G
S
(X S). Then yields an
S-automorphism
: X Z S X Z S, (x, z, s)
g
1
x, (g
1
, g
2
) z, s
.
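Then this $S$-automorphism is $G \times G$-equivariant, and hence yields an automorphism in $\operatorname{Aut}^G_S(X \times^G Z \times S)$ which stabilizes $X \times^G (Z \setminus G) \times S$. Moreover, this assignment identifies $\operatorname{Aut}^G_S(X \times S)$ with the stabilizer of $X \times^G (Z \setminus G) \times S$ in $\operatorname{Aut}^G_S(X \times^G Z \times S)$. Thereby, $\operatorname{Aut}^G(X)$ is identified with a closed subfunctor of $\operatorname{Aut}(X \times^G Z)$; the latter is represented by a group scheme, locally of finite type, since $X \times^G Z$ is proper.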
For simplicity, we denote by $\operatorname{Aut}^G(X)$ the group scheme defined in the preceding theorem. Since $\operatorname{Aut}^G_Y(X)$ is a closed subfunctor of $\operatorname{Aut}^G(X)$, it is also represented by a group scheme (locally of finite type) that we denote likewise by $\operatorname{Aut}^G_Y(X)$. Further properties of this relative automorphism group scheme are gathered in the following:
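Proposition 4.3. Let $\pi : X \to Y$ be a torsor under a connected algebraic group $G$, where $Y$ is a proper scheme. Then the factorization
$$X \xrightarrow{\ p\ } Z = X/G_{\mathrm{aff}} \xrightarrow{\ q\ } Y$$
(obtained in Corollary 3.5) yields an exact sequence of group schemes
(10) $$1 \longrightarrow \operatorname{Aut}^{G_{\mathrm{aff}}}_Z(X) \longrightarrow \operatorname{Aut}^G_Y(X) \xrightarrow{\ p_*\ } \operatorname{Aut}^{A(G)}_Y(Z).$$
Moreover, $\operatorname{Aut}^{G_{\mathrm{aff}}}_Z(X)$ is affine of finite type.
If $Y$ is a (complete) variety, then the neutral component of $\operatorname{Aut}^{A(G)}_Y(Z)$ is just $A(G)$; it is contained in the image of $p_*$.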
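Proof. We first show that $\operatorname{Aut}^{G_{\mathrm{aff}}}_Z(X)$ is affine of finite type. By Lemma 4.1, we have
$$\operatorname{Aut}^{G_{\mathrm{aff}}}_Z(X) \cong \operatorname{Hom}^{G_{\mathrm{aff}}}(X, G_{\mathrm{aff}}).$$
Moreover, there exists a closed $G_{\mathrm{aff}}$-equivariant immersion of $G_{\mathrm{aff}}$ into an affine space $V$ where $G_{\mathrm{aff}}$ acts via a representation. Thus, $\operatorname{Aut}^{G_{\mathrm{aff}}}_Z(X)$ is a closed subfunctor of $\operatorname{Hom}^{G_{\mathrm{aff}}}(X, V)$. But the latter is represented by an affine space (of finite dimension), namely, the space of global sections of the associated vector bundle $X \times^{G_{\mathrm{aff}}} V$ over the proper scheme $X/G_{\mathrm{aff}} = Z$. This completes the proof of the first assertion.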
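Next, we obtain (10). We start with the exact sequence (9) for the $G_{\mathrm{aff}}$-torsor $p$, which translates into an exact sequence of group schemes
$$1 \longrightarrow \operatorname{Aut}^{G_{\mathrm{aff}}}_Z(X) \longrightarrow \operatorname{Aut}^{G_{\mathrm{aff}}}(X) \xrightarrow{\ p_*\ } \operatorname{Aut}(Z).$$
Taking $G$-invariants yields the exact sequence of group schemes
$$1 \longrightarrow \operatorname{Aut}^G_Z(X) \longrightarrow \operatorname{Aut}^G_Y(X) \xrightarrow{\ p_*\ } \operatorname{Aut}(Z).$$
But $G$ acts on the affine scheme $\operatorname{Aut}^{G_{\mathrm{aff}}}_Z(X)$ through its quotient $G/G_{\mathrm{aff}} = A(G)$, an abelian variety. So this $G$-action must be trivial, that is, $\operatorname{Aut}^G_Z(X) = \operatorname{Aut}^{G_{\mathrm{aff}}}_Z(X)$.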
We now show that $A(G) = \operatorname{Aut}^{A(G),o}_Y(Z)$ if $Y$ (or equivalently $Z$) is a variety. Since $A(G)$ is commutative, we have a homomorphism $f : A(G) \to \operatorname{Aut}^{A(G)}_Y(Z)$. The induced homomorphism of Lie algebras is the natural map
$$\operatorname{Lie} A(G) \longrightarrow \operatorname{Lie} \operatorname{Aut}^{A(G)}_Y(Z) = \big(\mathcal{O}(Z) \otimes \operatorname{Lie} A(G)\big)^{A(G)}$$
which is an isomorphism since $\mathcal{O}(Z) = k$. This yields our assertion.
Finally, we show that $A(G)$ is contained in the image of $p_*$. Indeed, the neutral component of the center of $G$ is identified with a subgroup of $\operatorname{Aut}^G_Y(X)$, and is mapped onto $A(G)$ under the quotient homomorphism $G \to G/G_{\mathrm{aff}}$ (as follows from [Ro, Corollary 5, p. 440]).
Observe that the exact sequence (10) yields an analogue for torsors of Chevalley's structure theorem; it gives back that theorem when applied to the trivial torsor $G$.
5. Lifting automorphisms for abelian torsors
We begin by determining the relative equivariant automorphism groups of torsors under abelian varieties:
Proposition 5.1. Let $G$ be an abelian variety and $\pi : X \to Y$ a $G$-torsor, where $X$ and $Y$ are complete varieties. Then the group scheme $\operatorname{Aut}^G_Y(X)$ is isomorphic to $\operatorname{Hom}_{\mathrm{gp}}\big(A(Y), G\big) \times G$. Here $A(Y)$ denotes the Albanese variety of $Y$, and $\operatorname{Hom}_{\mathrm{gp}}\big(A(Y), G\big)$ denotes the space of homomorphisms of algebraic groups $A(Y) \to G$; this is a free abelian group of finite rank, viewed as a constant group scheme.
Proof. By Lemma 4.1, we have a functorial isomorphism
$$\operatorname{Aut}^G_{Y \times S}(X \times S) \cong \operatorname{Hom}(Y \times S, G).$$
Choose a point $y_0 \in Y$. For any $f \in \operatorname{Hom}(Y \times S, G)$, consider the morphism
$$\varphi : Y \times S \longrightarrow G, \quad (y, s) \longmapsto f(y, s) - f(y_0, s)$$
where the group law of the abelian variety $G$ is denoted additively. We claim that $\varphi$ factors through the projection $Y \times S \to Y$. For this, we may replace $k$ with a larger field, and assume that $S$ has a $k$-rational point $s_0$; we may also assume that $S$ is connected. Then the morphism
$$\psi : Y \times S \longrightarrow G, \quad (y, s) \longmapsto f(y, s) - f(y, s_0)$$
maps $Y \times \{s_0\}$ to a point. By a scheme-theoretic version of the rigidity lemma (see [SS, Theorem 1.7]), it follows that $\psi$ factors through the projection $Y \times S \to S$. Thus, $f(y, s) - f(y, s_0) = f(y_0, s) - f(y_0, s_0)$ which shows the claim.
By that claim, we may write
$$f(y, s) = \varphi(y) + \psi(s)$$
where $\varphi : Y \to G$ and $\psi : S \to G$ are morphisms such that $\varphi(y_0) = 0$. Now let $a : Y \to A(Y)$ be the Albanese morphism, normalized so that $a(y_0) = 0$. Then $\varphi$ factors through a unique homomorphism $\Phi : A(Y) \to G$, and $f = (\Phi \circ a) + \psi$ where $\Phi$ is an $S$-point of $\operatorname{Hom}_{\mathrm{gp}}\big(A(Y), G\big)$, and $\psi$ an $S$-point of $G$.
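The rearrangement behind the claim in the proof above can be spelled out as a short worked step. The symbol $\varphi$ for the morphism $(y, s) \mapsto f(y, s) - f(y_0, s)$ is a notational assumption made here for readability.

```latex
\begin{align*}
% Output of the rigidity lemma, valid for all (y, s):
f(y,s) - f(y,s_0) &= f(y_0,s) - f(y_0,s_0) \\
% Exchange the terms f(y, s_0) and f(y_0, s):
f(y,s) - f(y_0,s) &= f(y,s_0) - f(y_0,s_0) \\
% The right-hand side no longer involves s, hence
\varphi(y,s) &= \varphi(y,s_0) \quad \text{for all } s .
\end{align*}
```

So $\varphi$ depends on $y$ only, which is exactly the statement that it factors through the projection $Y \times S \to Y$.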
Next, we obtain a preliminary result which again is certainly well-known, but for which we could not locate any reference:
Lemma 5.2. Assume that $k$ has characteristic 0. Let $\pi : Z \to Y$ be a finite étale morphism, where $Y$ and $Z$ are complete varieties.
Then the natural homomorphism
restricts
to a surjective homomorphism Aut
o
(Z, Y ) Aut
o
(Y ); its kernel is
nite by Galois theory.
If is an F-torsor, then
: Aut
G
(X) Aut(Y ) restricts to an isogeny
|H
: H Aut
o
(Y ) for any quasi-complement H as above.
Proof. The assertion that $G$ is central in $\operatorname{Aut}^o(X)$ and admits a quasi-complement follows from [Ro, Corollary, p. 434].
By Proposition 4.3 or alternatively Proposition 5.1, G is the neutral
component of the kernel of
g, (z)
.
This is a $G \times F$-automorphism of $X \times Z$, and hence descends to a $G$-automorphism of $X$. This assignment yields the desired identification. This proves the claim and, in turn, the surjectivity of $\pi_*|_H$.
Remarks 5.5. (i) With the notation and assumptions of the preced-
ing theorem, the surjectivity of
|H
also holds when X (or equiva-
lently Y ) is normal. Choose indeed an Aut
o
(Y )-equivariant desingu-
larization
f : Y
Y,
that is, f is proper and birational, and the action of Aut
o
(Y ) on Y
lifts to an action on Y
(O
Y
) = O
Y
. In view of Proposition 2.1, this yields a homomor-
phism
f
: Aut
o
(Y
) Aut
o
(Y )
which is injective (on closed points) as f is birational, and surjective
by construction. Thus, f
) Aut
o
(X) is an isomorphism, where X
:= X
Y
Y
: X
. Now the
desired surjectivity follows from Theorem 5.4.
We do not know whether $\pi_*|_H$ is surjective for arbitrary (complete) varieties $X$, $Y$. Also, we do not know whether the characteristic-0 assumption can be omitted.
(ii) The preceding theorem may be reformulated in terms of vector fields only: let $X$, $Y$ be smooth complete varieties over an algebraically closed field of characteristic 0, and $\pi : X \to Y$ a smooth morphism such that the relative tangent bundle $T_\pi$ is trivial. Then every global vector field on $Y$ lifts to a global vector field on $X$.
Consider indeed the Stein factorization of ,
X
p
Y.
Then one easily checks that p is etale; thus, X
is smooth and
, as follows e.g.
from Lemma 5.2. Thus, we may replace with
: Aut
G
(X)
Aut(Y ).
Here the assumption that $G$ is proper cannot be omitted. For example, let $Y$ be an abelian variety, so that $\operatorname{Aut}^o(Y)$ is the group of translations. Let also $G$ be the multiplicative group $\mathbb{G}_m$, so that $G$-torsors $\pi : X \to Y$ correspond bijectively to invertible sheaves $L$ on $Y$. Then $\operatorname{Aut}^o(Y)$ lifts to an isomorphic (resp. isogenous) subgroup of $\operatorname{Aut}^G(X)$ if and only if $L$ is trivial (resp. of finite order). Also, the image of $\pi_*$ contains $\operatorname{Aut}^o(Y)$ if and only if $L$ is algebraically trivial (see [Mu] for these results).
This is the starting point of the theory of homogeneous bundles over abelian varieties, to be developed in [Br2].
References
[Ak] D. N. Akhiezer, Lie group actions in complex analysis, Aspects of Mathematics E 27, Vieweg, Braunschweig/Wiesbaden, 1995.
[Bi] A. Białynicki-Birula, On induced actions of algebraic groups, Ann. Inst. Fourier (Grenoble) 43 (1993), no. 2, 365–368.
[Bl] A. Blanchard, Sur les variétés analytiques complexes, Ann. Sci. École Norm. Sup. (3) 73 (1956), 157–202.
[Br1] M. Brion, Some basic results on actions of non-affine algebraic groups, in: Symmetry and Spaces (in honor of Gerry Schwarz), 1–20, Progr. Math. 278, Birkhäuser, Boston, MA, 2009.
[Br2] M. Brion, Homogeneous bundles over abelian varieties, in preparation.
[Co] B. Conrad, A modern proof of Chevalley's theorem on algebraic groups, J. Ramanujan Math. Soc. 17 (2002), 1–18.
[DG] M. Demazure, P. Gabriel, Groupes algébriques, Masson, Paris, 1970.
[EV] S. Encinas, O. Villamayor, A course on constructive desingularization and equivariance, in: Resolution of singularities (Obergurgl, 1997), 147–227, Progr. Math. 181, Birkhäuser, Basel, 2000.
[KM] S. Keel, S. Mori, Quotients by groupoids, Ann. of Math. (2) 145 (1997), no. 1, 193–213.
[Ma] H. Matsumura, On algebraic groups of birational transformations, Atti Accad. Naz. Lincei Rend. Cl. Sci. Fis. Mat. Natur. (8) 34 (1963), 151–155.
[MO] H. Matsumura, F. Oort, Representability of group functors, and automorphisms of algebraic schemes, Invent. Math. 4 (1967), 1–25.
[Mo] A. Morimoto, Sur le groupe d'automorphismes d'un espace fibré principal analytique complexe, Nagoya Math. J. 13 (1958), 157–168.
[Mu] D. Mumford, Abelian Varieties, Oxford University Press, Oxford, 1970.
[MFK] D. Mumford, J. Fogarty, F. Kirwan, Geometric Invariant Theory. Third Edition, Ergeb. der Math., Springer, 1994.
[Ra] M. Raynaud, Faisceaux amples sur les schémas en groupes et les espaces homogènes, Lecture Notes in Math. 119, Springer-Verlag, New York, 1970.
[Ro] M. Rosenlicht, Some basic theorems on algebraic groups, Amer. J. Math. 78 (1956), 401–443.
[SGA1] Revêtements étales et groupe fondamental, Séminaire de géométrie algébrique du Bois Marie 1960–61 dirigé par A. Grothendieck, Documents Mathématiques 3, Soc. Math. France, 2003.
[SS] C. Sancho de Salas, F. Sancho de Salas, Principal bundles, quasi-abelian varieties and structure of algebraic groups, J. Algebra 322 (2009), 2751–2772.
Universit
ON THE FOCUSING OF CRAMÉR - VON MISES TEST.
ALEJANDRA CABAÑA and ENRIQUE CABAÑA
Abstract. The statistical bibliography frequently refers to omnibus tests intended to be sensitive to all or at least a wide variety of alternatives, and focused or directional tests directed to detect efficiently some specific alternatives.
In fact, the apparent opposition between omnibus and focused is artificial, and, for instance, the K-S test is focused on changes in position of the Double Exponential distribution, just as the Cramér - von Mises test is focused on changes in position of the distribution with density $f(t) = 1/(2\cosh(\pi t/2))$.
We provide in this article a simple proof of this latter fact.
1. Introduction
In the statistical literature, referring to a test as being omnibus or directional often implies opposite categories.
Omnibus tests are able to detect a wide range of alternatives, and no special ability to detect any particular one is intended.
When statistical practitioners wish to detect specific alternatives they can use directional tests. These focus their power in the direction of the interesting alternatives.
The former tests are not expected to be efficient in the detection of particular alternatives. On the other hand, it is generally claimed that the latter have the drawback of poor power against alternatives other than the ones on which they were focused.
Research partially supported by TIN2008-06582-C03-02/TIN, Ministerio de Ciencia y Tecnología.
Partially supported by CSIC-Udelar, Uruguay, Centre de Recerca Matemàtica, Barcelona, Spain and Carolina Foundation, Spain.
Notwithstanding, it is well established that a test can be both omnibus and focused: this is the case of the well known omnibus Kolmogorov - Smirnov goodness-of-fit test, which is also focused to detect changes in position of samples of the Double - Exponential Distribution, as shown by J. Capon ([3]) by computing lower bounds for the asymptotic efficiency of the test for several alternatives.
In this short note, we show that the well known Cramér - von Mises goodness-of-fit test, also reputed to be an omnibus test, is also focused to detect changes in position of random samples of another family of distributions obtained by changes in location and scale from the distribution with probability density
(1) $$g(t) = \frac{1}{2\cosh(\pi t/2)}.$$
It is known (see [8]) that there is one direction with the highest asymptotic power that is possible for the Cramér - von Mises test. We present here a straightforward computation of such direction.
The principal result is that the asymptotic power of the Cramér - von Mises test for those alternatives is almost optimal. This statement is made precise in §4, where the power of the test is compared with the power of the two-sided test based on the likelihood ratio.
This kind of quasi-optimal behaviour characterises several tests of goodness-of-fit developed by the authors in which a quadratic statistic of Watson type is employed in such a way that the resulting tests are consistent against any alternative, and also have a near optimum efficiency for some alternative of focusing arbitrarily selected by the user (see [1], [2] and references therein).
The tuning on the interesting alternatives is a part of the design of our tests, but the quasi-optimum efficiency is inherent to the statistic in use.
The efficiency of our tests is described in the already cited articles. But the fact that the efficiency of the classical Cramér - von Mises test shares such kind of properties does not appear to us to be widely discussed in the statistical literature, and motivates this article.
The power of Cramer - von Mises test has been analysed by sev-
eral authors, and is fully described by Durbin and Knott ([5]), for
$$\omega_n^2 = n \int (F_n(t) - F_0(t))^2\, dF_0(t)$$
quantifies a quadratic distance between the probability distribution function $F_0$ and the empirical distribution function $F_n(t) = \frac{1}{n} \sum_{i=1}^n \mathbf{1}_{\{X_i \le t\}}$ of the sample of i.i.d. random variables $X_1, X_2, \dots, X_n$ with probability distribution $F$.
By introducing the empirical process $b_n(t) = \sqrt{n}\,(F_n(t) - F_0(t))$, $\omega_n^2$ is written as
$$\omega_n^2 = \int b_n^2(t)\, dF_0(t).$$
We shall assume that $F_0$ is continuous, with density $f_0$, finite first- and second-order moments, and, with no loss of generality, that
$$\int t\, dF_0(t) = 0, \qquad \int t^2\, dF_0(t) = 1.$$
Let the probability distribution of $\omega_n^2$ be denoted by $P(t, F, n) = P\{\omega_n^2 \le t\}$.
The Cramér - von Mises test of the null hypothesis $H_0 : F = F_0$, with significance level $\alpha$, rejects $H_0$ when $\omega_n^2 > c_n(\alpha)$, where $c_n(\alpha)$ solves the equation $P(c_n(\alpha), F_0, n) = 1 - \alpha$, and its power for the alternative $F$ is $1 - P(c_n(\alpha), F, n)$.
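As an illustration of the statistic defined above, here is a minimal sketch that evaluates $n \int (F_n - F_0)^2\, dF_0$ for a continuous $F_0$ through the standard computing formula $\frac{1}{12n} + \sum_i \big(F_0(X_{(i)}) - \frac{2i-1}{2n}\big)^2$, where $X_{(1)} \le \dots \le X_{(n)}$ is the ordered sample. The function names and the uniform null law are illustrative assumptions, not part of the article.

```python
def cramer_von_mises(sample, cdf):
    """Return n * integral of (F_n - F_0)^2 dF_0 for a continuous null CDF `cdf`."""
    n = len(sample)
    # Probability-integral transform of the sample, then ordering.
    u = sorted(cdf(x) for x in sample)
    # Closed form of the integral: 1/(12n) + sum_i (u_(i) - (2i - 1)/(2n))^2.
    return 1.0 / (12 * n) + sum(
        (u_i - (2 * i - 1) / (2 * n)) ** 2 for i, u_i in enumerate(u, start=1)
    )


def uniform_cdf(x):
    """CDF of the uniform law on [0, 1], used here only as a toy null hypothesis."""
    return min(1.0, max(0.0, x))


if __name__ == "__main__":
    # A sample sitting exactly on the ideal quantiles gives the minimum 1/(12n).
    print(cramer_von_mises([0.125, 0.375, 0.625, 0.875], uniform_cdf))
```

The test then rejects when this value exceeds the critical value for the chosen level; the critical value itself is not computed in this sketch.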
2.1. The asymptotic law of $\omega_n^2$ under $H_0$. Since $b_n$ converges in law to a Brownian bridge associated to $F_0$, that is, to a centred Gaussian process $b^{(F_0)}$ with covariances $E\, b^{(F_0)}(s)\, b^{(F_0)}(t) = F_0(s \wedge t) - F_0(s) F_0(t)$, $\omega_n^2$ has the asymptotic law of
$$\int \big(b^{(F_0)}(t)\big)^2\, dF_0(t) \sim \int_0^1 b^2(u)\, du,$$
where $b$ denotes a standard Brownian bridge, because $b^{(F_0)}$ has the same law as $b \circ F_0$.
In order to obtain the distribution of $Q_0 = \int_0^1 b^2(u)\, du = \|b\|^2$, the squared $L^2$ norm of the standard Brownian bridge $b$ in $L^2([0, 1])$ with the Lebesgue measure, let us follow Durbin ([4]) and compute the Fourier expansion
(2) $$b(u) = \sum_{j=1}^\infty \Big( \int_0^1 b(v) \psi_j(v)\, dv \Big) \psi_j(u)$$
of $b$ in terms of the complete orthonormal system $\{\psi_j(u) = \sqrt2 \sin j\pi u : j = 1, 2, \dots\}$ of eigenfunctions of the covariance kernel, which admits the expansion
$$E\, b(u) b(v) = u \wedge v - uv = \sum_{j=1}^\infty \frac{1}{j^2 \pi^2} \psi_j(u) \psi_j(v).$$
The random coefficients in (2) are independent centred Gaussian variables with variances
$$E \Big( \int_0^1 b(u) \psi_j(u)\, du \Big)^2 = \int_0^1 \int_0^1 (u \wedge v - uv) \psi_j(u) \psi_j(v)\, du\, dv = \frac{1}{j^2 \pi^2}$$
and hence we may rewrite (2) as $b(u) = \sum_{j=1}^\infty \frac{B_j}{j\pi} \psi_j(u)$, by introducing the i.i.d. standard Gaussian variables $B_j = j\pi \int_0^1 b(u) \psi_j(u)\, du$, leading us to conclude
(3) $$Q_0 = \|b\|^2 = \sum_{j=1}^\infty \frac{B_j^2}{j^2 \pi^2}.$$
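As a quick consistency check on (3), assuming as stated that the $B_j$ are i.i.d. standard Gaussian, the expectation of $Q_0$ can be computed term by term and compared with the integral of the variance $u - u^2$ of the standard Brownian bridge:

```latex
\begin{align*}
E\,Q_0 &= \sum_{j=1}^{\infty} \frac{E\,B_j^2}{j^2\pi^2}
        = \frac{1}{\pi^2}\sum_{j=1}^{\infty}\frac{1}{j^2}
        = \frac{1}{\pi^2}\cdot\frac{\pi^2}{6} = \frac16, \\
\int_0^1 E\,b^2(u)\,du &= \int_0^1 (u - u^2)\,du = \frac12 - \frac13 = \frac16 .
\end{align*}
```

Both computations give $1/6$, as they must, since the eigenvalues $1/(j^2\pi^2)$ sum to the trace of the covariance kernel.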
2.2. The limiting law of $\omega_n^2$ under sequences of contiguous alternatives. Let us assume now that for each $n$, the sample has a probability law $F^{(n)}$ with density $f_n(t)$ satisfying
$$\sqrt{\frac{f_n(t)}{f_0(t)}} = 1 + \frac{\delta\, k_n(t)}{2\sqrt{n}}$$
for a sequence of functions $k_n$ such that
$$\int (k_n(t) - k(t))^2\, dF_0(t) \to 0, \qquad \int k^2(t)\, dF_0(t) = 1.$$
When this happens, we shall say that the alternative $H(k, \delta)$ holds. These alternatives are contiguous to the null hypothesis (see [9]) and therefore the asymptotic law of $b_n$ under $H(k, \delta)$ is the same one corresponding to $H_0 = H(k, 0)$ plus a deterministic term, according to Le Cam's Third Lemma ([6], [7]).
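In the decomposition
$$b_n(t) = \sqrt{n}\,(F_n(t) - F^{(n)}(t)) + \sqrt{n}\,(F^{(n)}(t) - F_0(t)),$$
the first term tends in law to $b^{(F_0)}(t)$, and the second one is written as
$$\sqrt{n} \int_{-\infty}^t (f_n(s) - f_0(s))\, ds = \sqrt{n} \int_{-\infty}^t \frac{\delta\, k_n(s)}{\sqrt{n}}\, dF_0(s) + o(1) \to \delta \int_{-\infty}^t k(s)\, dF_0(s)$$
so that, with the change of variables $u = F_0(t)$ and the new function $K$ defined by
$$K(u) = \int_0^u \kappa(v)\, dv, \qquad \kappa(F_0(t)) = k(t),$$
we get
(4) $$b_n(t) \xrightarrow{\ \mathcal{L}\ } b^{(F_0)}(t) + \delta \int_{-\infty}^t k(s)\, dF_0(s) = b(u) + \delta \int_0^u \kappa(v)\, dv = b(u) + \delta K(u).$$
The assumptions on $k$ imply that $\kappa$ satisfies $\int_0^1 \kappa(u)\, du = 0$, $\int_0^1 \kappa^2(u)\, du = 1$, and, in particular, $K(0) = K(1) = 0$. The function $\kappa$ shall be called the standardized shape of the alternative $H(k, \delta)$.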
From (4), we obtain
$$\omega_n^2 \xrightarrow{\ \mathcal{L}\ } \int_0^1 (b(u) + \delta K(u))^2\, du.$$
Let us notice that this expression of the limit law of $\omega_n^2$ leads us to conclude that when the null hypothesis is replaced by $H(k, \delta)$, the asymptotic expectation of $\omega_n^2$ increases by the amount
(5) $$\Delta(\delta) = \delta^2 \int_0^1 K^2(u)\, du.$$
It is reasonable to expect that larger values of $\Delta(\delta)$ be associated with larger powers of the tests comparing $H_0$ with $H(k, \delta)$. Therefore, we search in the next section for the function $K$ that maximises $\Delta(\delta)$ for given $\delta$.
3. The focused alternatives.
3.1. The standardized shape of the alternative that produces the largest increment in the asymptotic expectation of $\omega_n^2$. We shall obtain the function $K(u) = \int_0^u \kappa(s)\, ds$ that maximises $\int_0^1 K^2(u)\, du$ with the restrictions
$$\int_0^1 \kappa^2(u)\, du = 1, \qquad \int_0^1 \kappa(u)\, du = 0.$$
The associated Euler equations express that for each continuously differentiable $g$ such that
$$g(0) = g(1) = 0, \qquad \int_0^1 K'(u) g'(u)\, du = 0$$
the condition
$$\int_0^1 K(u) g(u)\, du = 0$$
must hold.
Integrating by parts, the condition on $g$ reads
$$\int_0^1 K'(u) g'(u)\, du = [g(u) K'(u)]_0^1 - \int_0^1 K''(u) g(u)\, du = 0.$$
Since the integrated term in the right-hand side vanishes, we find that when $g$ is orthogonal to $K''$ in $L^2([0, 1])$, it is also orthogonal to $K$, and this means that $K$ and $K''$ are proportional, that is, $K'' = -\mu^2 K$.
The solutions of $K'' = -\mu^2 K$ in $[0, 1]$ with border conditions $K(0) = K(1) = 0$, satisfying $\int_0^1 (K'(u))^2\, du = 1$ are
$$K(u) = \frac{\sqrt2}{j\pi} \sin j\pi u, \quad j = 1, 2, \dots.$$
The solution with maximum norm is the one with $j = 1$, hence
(6) $$\kappa(u) = \sqrt2 \cos \pi u.$$
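The comparison of norms behind the choice $j = 1$ can be made explicit. In the following worked step, $K_j$ denotes the $j$-th solution $\frac{\sqrt2}{j\pi} \sin j\pi u$; this indexing is a notational assumption introduced here.

```latex
\begin{align*}
\int_0^1 K_j^2(u)\,du
  &= \frac{2}{j^2\pi^2}\int_0^1 \sin^2(j\pi u)\,du
   = \frac{2}{j^2\pi^2}\cdot\frac12
   = \frac{1}{j^2\pi^2}, \\
\max_{j \ge 1} \int_0^1 K_j^2(u)\,du &= \frac{1}{\pi^2} \quad (j = 1), \qquad
K_1'(u) = \sqrt2\cos \pi u .
\end{align*}
```

The squared norm decreases like $1/j^2$, so the first eigenfunction gives the largest increment in (5), namely $1/\pi^2$ times the square of the size parameter of the alternative.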
For location alternatives with density $f_n(t) = f_0\big(t + \frac{c\delta}{\sqrt n}\big)$ we have
$$\sqrt{\frac{f_n(t)}{f_0(t)}} = 1 + \frac{c\delta}{2\sqrt n}\, \frac{f_0'(t)}{f_0(t)} + o\Big(\frac{1}{\sqrt n}\Big)$$
so that $k(t) = c\, \frac{f_0'(t)}{f_0(t)}$. The constant $c$ is introduced in order to be able to impose $\|k\|^2 = 1$.
It follows that $\kappa(u) = c\, \frac{f_0'(F_0^{-1}(u))}{f_0(F_0^{-1}(u))}$ and Equation (6) shows that the alternative shall be detected by the Cramér - von Mises statistic with maximum asymptotic increment of the expectation when
$$c\, \frac{f_0'(F_0^{-1}(u))}{f_0(F_0^{-1}(u))} = \sqrt2 \cos \pi u.$$
In order to solve this differential equation in $F_0$, we return to the variable $t = F_0^{-1}(u)$, and get
$$c f_0'(t) = \sqrt2\, f_0(t) \cos \pi F_0(t),$$
which, integrated in $(-\infty, t]$, gives
$$c f_0(t) = \frac{\sqrt2}{\pi} \sin \pi F_0(t).$$
A further integration leads to
$$\frac{\sqrt2\, t}{c} = \int_0^t \frac{\pi\, dF_0(s)}{\sin \pi F_0(s)} = \int_{\pi F_0(0)}^{\pi F_0(t)} \frac{du}{\sin u} = \frac12 \log \frac{(\cos \pi F_0(t) - 1)(\cos \pi F_0(0) + 1)}{(\cos \pi F_0(t) + 1)(\cos \pi F_0(0) - 1)}.$$
By imposing with no loss of generality that $F_0$ is centred in 0, the simpler expression
$$\lambda t = \log \frac{1 - \cos \pi F_0(t)}{1 + \cos \pi F_0(t)}$$
follows, in which the parameter $\lambda = \frac{2\sqrt2}{c}$ determines the dispersion.
By solving in $F_0$ and choosing $\lambda = \pi$ to get a distribution with variance equal to one, we conclude
(7) $$F_0(t) = \frac{1}{\pi} \arccos \frac{1 - e^{\pi t}}{1 + e^{\pi t}}, \qquad f_0(t) = \frac{1}{2 \cosh(\pi t/2)}.$$
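Equation (7) can be checked numerically. The sketch below assumes the reading of (7) in which the dispersion parameter equals $\pi$, that is $F_0(t) = \frac{1}{\pi} \arccos \frac{1 - e^{\pi t}}{1 + e^{\pi t}}$ and $f_0(t) = 1/(2\cosh(\pi t/2))$, and verifies that $F_0' = f_0$, that $f_0$ integrates to one and that the variance is one. The helper names and the midpoint rule are illustrative choices.

```python
import math


def F0(t):
    """Distribution function of Equation (7), with dispersion parameter pi."""
    # (1 - e^{pi t}) / (1 + e^{pi t}) equals -tanh(pi t / 2); this form avoids overflow.
    return math.acos(-math.tanh(math.pi * t / 2.0)) / math.pi


def f0(t):
    """Density of Equation (7): 1 / (2 cosh(pi t / 2))."""
    return 1.0 / (2.0 * math.cosh(math.pi * t / 2.0))


def integrate(g, a, b, steps=20000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))


if __name__ == "__main__":
    h = 1e-5
    print("F0(0)       =", F0(0.0))
    print("F0'(0.3)    =", (F0(0.3 + h) - F0(0.3 - h)) / (2 * h))
    print("f0(0.3)     =", f0(0.3))
    print("total mass  =", integrate(f0, -30.0, 30.0))
    print("variance    =", integrate(lambda t: t * t * f0(t), -30.0, 30.0))
```

The truncation of the integrals to $[-30, 30]$ is harmless here because the density decays like $e^{-\pi |t|/2}$.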
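3.3. Asymptotic law of $\omega_n^2$ under changes in location for samples with the law of Equation (7), and power of the test. The statistic $\omega_n^2$ has the asymptotic law of
$$Q(\delta) = \int_0^1 (b(u) + \delta K(u))^2\, du = \int_0^1 \Big( b(u) + \frac{\delta}{\pi} \psi_1(u) \Big)^2 du.$$
Since $b(u) + \frac{\delta}{\pi} \psi_1(u) = \sum_{j=1}^\infty \frac{1}{j\pi} B_j \psi_j(u) + \frac{\delta}{\pi} \psi_1(u)$, then
$$Q(\delta) = \Big\| b + \frac{\delta}{\pi} \psi_1 \Big\|^2 = \frac{1}{\pi^2} \Big[ (B_1 + \delta)^2 + \sum_{j=2}^\infty \frac{1}{j^2} B_j^2 \Big].$$
The Cramér - von Mises test of $F^{(n)}(t) = F_0(t)$ against $F^{(n)}(t) = F_0\big(t + \frac{c\delta}{\sqrt n}\big)$, $c = 2\sqrt2/\pi$, with significance level $\alpha$ is asymptotically equivalent to the test of $H_0 : \delta = 0$ with critical region $Q(\delta) > c(\alpha)$ where $c(\alpha)$ solves $P\{Q(0) > c(\alpha)\} = \alpha$. The power, that we have computed by a numerical convolution for the purposes discussed in the next section, is
$$\beta(\delta, \alpha) = P\{Q(\delta) > c(\alpha)\}.$$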
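The article evaluates this power by numerical convolution. As a rough cross-check only, the following sketch swaps in a plain Monte Carlo simulation of the truncated series $\frac{1}{\pi^2}\big[(B_1 + \delta)^2 + \sum_{j \ge 2} B_j^2 / j^2\big]$, where $\delta$ is the size parameter of the alternative. The truncation level, the number of draws, the seed and the function names are all assumptions of this sketch.

```python
import math
import random


def sample_q(delta, rng, terms=100):
    """Draw one value of the series for Q(delta), truncated after `terms` terms."""
    total = (rng.gauss(0.0, 1.0) + delta) ** 2
    for j in range(2, terms + 1):
        total += rng.gauss(0.0, 1.0) ** 2 / (j * j)
    return total / math.pi ** 2


def cvm_asymptotic_power(delta, critical_value, draws=4000, seed=0):
    """Monte Carlo estimate of P{Q(delta) > critical_value}."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(draws) if sample_q(delta, rng) > critical_value)
    return hits / draws


if __name__ == "__main__":
    c05 = 0.46136  # commonly tabulated asymptotic 5% critical value of the statistic
    for d in (0.0, 1.0, 2.0, 3.0):
        print(d, cvm_asymptotic_power(d, c05))
```

With the tabulated 5% critical value the estimate at $\delta = 0$ should be close to 0.05, up to simulation and truncation error; the remaining values trace the power curve.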
4. Comparison with the two-sided test based on Neyman and Pearson statistic.
The Neyman and Pearson test of $H_0$ against the alternatives $H_n$ that the true density of the sample distribution is $g_n(t) = f_0\big(t + \frac{c\delta}{\sqrt n}\big)$ has critical region
$$\sum_{i=1}^n \log \Big[ f_0\Big(X_i + \frac{c\delta}{\sqrt n}\Big) \Big/ f_0(X_i) \Big] > \text{constant},$$
asymptotically equivalent to
$$\frac{\delta}{\sqrt n} \sum_{i=1}^n \frac{c f_0'(X_i)}{f_0(X_i)} > \text{constant}.$$
When $H_0$ holds, the variables $k(X_i) = c f_0'(X_i)/f_0(X_i)$ are centred, with variance 1, and therefore the asymptotic law of the statistic $T_n = \frac{1}{\sqrt n} \sum_{i=1}^n \frac{c f_0'(X_i)}{f_0(X_i)}$ is standard normal.
If the sequence of alternatives $H_n$ holds, then
$$E\, T_n = \sqrt n\, E\, \frac{c f_0'(X_1)}{f_0(X_1)} = \sqrt n \int \frac{c f_0'(x)}{f_0(x)}\, f_0\Big(x + \frac{c\delta}{\sqrt n}\Big)\, dx$$
has limit $\delta$, $E\big(c f_0'(X_i)/f_0(X_i)\big)^2$ tends to 1, hence $T_n$ converges in law to $Z + \delta$, $Z$ standard Gaussian.
As a consequence, the test of $\delta = 0$ against $\delta > 0$ with optimal asymptotic power is the one with critical region $T_n >$ constant.
While there is no optimal test for $\delta = 0$ against $\delta \ne 0$, the usual practice if there are no significant differences between the cases $\delta > 0$ or $\delta < 0$ is to reject $\delta = 0$ when $|T_n| >$ constant. In that case, if $\Phi$ denotes as usual the standard normal cumulative distribution function, the asymptotic power of the two-sided test with asymptotic level $\alpha$ is
$$\beta^*(\delta, \alpha) = P\{Z + \delta > \Phi^{-1}(1 - \tfrac{\alpha}{2})\} + P\{Z + \delta < \Phi^{-1}(\tfrac{\alpha}{2})\} = \Phi(\Phi^{-1}(\tfrac{\alpha}{2}) + \delta) + \Phi(\Phi^{-1}(\tfrac{\alpha}{2}) - \delta).$$
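The closed form above is straightforward to evaluate. The following minimal sketch computes $\Phi(\Phi^{-1}(\alpha/2) + \delta) + \Phi(\Phi^{-1}(\alpha/2) - \delta)$ with the standard library, where $\delta$ is the shift of the limiting Gaussian statistic and $\alpha$ the asymptotic level; the function name is an illustrative assumption.

```python
from statistics import NormalDist


def two_sided_power(delta, alpha):
    """Asymptotic power of the two-sided test |T_n| > constant at level alpha."""
    std = NormalDist()  # standard normal law
    z = std.inv_cdf(alpha / 2.0)  # lower alpha/2 quantile (a negative number)
    return std.cdf(z + delta) + std.cdf(z - delta)


if __name__ == "__main__":
    for d in (0.0, 1.0, 2.0, 3.0):
        print(d, two_sided_power(d, 0.05))
```

At $\delta = 0$ the function returns the level $\alpha$ itself, and it is symmetric in $\delta$, as expected for a two-sided test.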
The practically coincident plots of the functions $\beta(\delta, .05)$ and $\beta^*(\delta, .05)$ in Figure 1 show that the Cramér - von Mises test against the alternative of displacement of samples with distribution (7) is almost optimal, in the sense that its performance is almost asymptotically equivalent to the performance of the test with critical region $T_n >$ constant.
The relationship between the asymptotic powers (and the intended meaning of almost optimal) is better shown in the second diagram
Figure 1. Almost coincident asymptotic powers
$ds^2_z = ds^2_{i(z)}$.
In a neighbourhood of $\partial_3 P$, one has $\big(\frac{xz}{4} + \frac12\big)^* ds^2_{z^2} = ds^2_z$ and $\big(\frac{xz}{4} - \frac12\big)^* ds^2_{z^2} = ds^2_{i(z)}$.
A simple way to construct such families is to consider metrics on $P$ that we shall call admissible. An admissible metric is a metric $ds^2$ on a neighbourhood of $P$ in $\mathbb{C}$ which, in a neighbourhood of $\partial_3 P$, has the expression
$$|ds| = \frac{|dz|}{|z| \big(2\pi + \log \frac{1}{|z|}\big)},$$
and which moreover satisfies $\big(\frac{x}{4} + \frac12\big)^* ds^2 = ds^2$ and $\big(\frac{x}{4} - \frac12\big)^* ds^2 = ds^2$. Then, to construct a metric on the tangent bundle of $\mathcal{F}$, it suffices to construct a family $\{ds^2_z\}_{z \in S^1}$ of admissible metrics which in addition satisfy the condition $ds^2_z = ds^2_{i(z)}$. Indeed, an admissible metric is invariant under rotation in a neighbourhood of $\partial_3 P$, which shows that for every pair of admissible metrics $ds^2_j$, $j = 0, 1$, and every $z$ of the circle, one has $\big(\frac{xz}{4} \pm \frac12\big)^* ds^2_0 = ds^2_1$.
3. g-measures and harmonic measures
We keep the notation of the preceding section. A g-function$^{2}$ is a continuous function $g : S^1 \to (1, +\infty)$ such that for every point $z \in S^1$,
$$\frac{1}{g(z)} + \frac{1}{g(i(z))} = 1.$$
A g-measure is a probability measure $\mu$ on the circle such that the Radon-Nikodym derivative of $\mu \circ T$ with respect to $\mu$ is the function $g$. Recall that this means that, if $B$ is a Borel subset of the circle on which $T$ is injective, then $\mu(TB) = \int_B g\, d\mu$. The existence of a g-measure follows from Kakutani's fixed point theorem, see [Ke].
2. The terminology is unfortunate, but it is the one classically used.
Lucy Garnett proves in [Ga] that there always exists a harmonic measure on a foliation equipped with a metric on its tangent bundle which is smooth along the leaves and transversally continuous. In what follows, we explicitly construct harmonic measures in the particular case of the Hirsch foliation. More precisely, given a g-function associated with $T$, we produce a Riemannian metric on $T\mathcal{F}$, which is smooth along the leaves and has the same transverse regularity as $g$, in such a way that every g-measure gives rise to a harmonic measure on $\mathcal{F}$.
Proposition 3.1. For every $\varepsilon > 0$, there exists a neighbourhood $U$ of $P$ in $\mathbb{C}$ such that for every pair $(L_1, L_2)$ of real numbers greater than $\varepsilon$ and satisfying
$$e^{-L_1} + e^{-L_2} = 1,$$
there exist an admissible Riemannian metric $ds^2_{L_1,L_2}$ on $U$ and a $ds^2_{L_1,L_2}$-harmonic function $h_{L_1,L_2} : U \to \mathbb{R}_{>0}$ satisfying the following conditions:
• For every $x$ in a neighbourhood of $\partial_3 P$, one has
$$h_{L_1,L_2}(x) = 1 + \frac{1}{2\pi} \log \frac{1}{|x|}.$$
• For every $x$ in a neighbourhood of $\partial_3 P$, one has
$$h_{L_1,L_2}\big(\tfrac{x}{4} + \tfrac12\big) = e^{-L_1} h_{L_1,L_2}(x), \quad\text{and}\quad h_{L_1,L_2}\big(\tfrac{x}{4} - \tfrac12\big) = e^{-L_2} h_{L_1,L_2}(x).$$
Moreover, one may assume that the metrics $ds^2_{L_1,L_2}$ and the functions $h_{L_1,L_2}$ depend analytically on $L_1$ and $L_2$, and that for every $(L_1, L_2)$, one has
$$ds^2_{L_1,L_2} = ds^2_{L_2,L_1}.$$
Proof. On the cylinders $C_i = S^1 \times [0, L_i]$, $i = 1, 2$, consider the metric of curvature $-1$ defined by:
$$e^{2(v - L_i)}\, du^2 + dv^2.$$
Indeed, this is the metric obtained by starting from $\frac{du^2 + dy^2}{y^2}$ and performing the change of variables $y = e^{L_i - v}$. The boundaries $\partial^- C_i = S^1 \times 0$ and $\partial^+ C_i = S^1 \times L_i$ are then horocycles, respectively negative of length $e^{-L_i}$ and positive of length 1$^{3}$.
We cut $C_1$ and $C_2$ along the geodesics $1 \times [0, \varepsilon]$ and $1 \times [0, \varepsilon]$, and we glue the segment $1^+ \times [0, \varepsilon]$ of $C_1$ (resp. $1^- \times [0, \varepsilon]$ of $C_2$) to the segment $1^- \times [0, \varepsilon]$ of $C_2$ (resp. $1^+ \times [0, \varepsilon]$ of $C_1$) isometrically and reversing the orientation. In this way we construct a pair of pants $P_{L_1,L_2}$ with a metric $ds^2$ of curvature $-1$ and a conical singularity of angle $4\pi$. This pair of pants $P_{L_1,L_2}$ has three boundary components: the components $\partial_1 P_{L_1,L_2} = \partial^+ C_1$ and $\partial_2 P_{L_1,L_2} = \partial^+ C_2$, which are positive horocycles of length 1, and the component $\partial_3 P_{L_1,L_2}$, which is a negative horocycle whose length is the sum of the lengths of the boundaries $\partial^- C_1$ and $\partial^- C_2$, that is, $e^{-L_1} + e^{-L_2} = 1$.
The metric with conical singularity endows $P_{L_1,L_2}$ with a structure of smooth Riemann surface: the atlas of orientation-preserving charts in which the metric is conformal to the flat metric $|dz|$ on $\mathbb{C}$. The function $h_{L_1,L_2} : P_{L_1,L_2} \to \mathbb{R}$ defined on each $C_i$ by $e^{-v}$ is then a harmonic function on $P_{L_1,L_2}$, which equals $e^{-L_i}$ on $\partial_i P_{L_1,L_2}$ for $i = 1, 2$, and 1 on $\partial_3 P_{L_1,L_2}$.
Since the boundaries of $P_{L_1,L_2}$ are horocyclic of length 1, there exists a diffeomorphism $\Phi : P \to P_{L_1,L_2}$ such that $\Phi^* ds^2$ is an admissible metric, except that it has a conical singularity. One may choose $\Phi$ so that the latter lies at the origin, and so that in its neighbourhood one has $\Phi^* ds^2 = |x|^2 |dx|^2$. One then considers a metric of the form $ds^2_{L_1,L_2} = \rho\, \Phi^* ds^2$, where $\rho : P \setminus \{0\} \to \mathbb{R}_{>0}$ is a smooth function which is identically 1 outside a small neighbourhood of the origin and which, in an even smaller neighbourhood, is of the form
$$\rho(x) = \frac{1}{|x|^2}.$$
The function $h_{L_1,L_2} \circ \Phi$ is then $ds^2_{L_1,L_2}$-harmonic, and satisfies the conditions of the proposition.
We choose $\varepsilon$ so that $0 < \varepsilon < \inf_{z \in S^1} \log g(z)$. For each point $z$ of the circle, we set
$$L_1(z) = \log g(z), \quad\text{and}\quad L_2(z) = \log g(i(z)).$$
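That this choice fits the hypotheses of Proposition 3.1 is a one-line computation, written out here as a worked step. It uses only the defining identity of a g-function; the sign conventions in the exponents are those of the reconstruction $e^{-L_1} + e^{-L_2} = 1$, which is an assumption about the garbled source.

```latex
\begin{align*}
e^{-L_1(z)} + e^{-L_2(z)}
  &= e^{-\log g(z)} + e^{-\log g(i(z))}
   = \frac{1}{g(z)} + \frac{1}{g(i(z))} = 1, \\
L_1(z),\ L_2(z) &\ge \inf_{w \in S^1} \log g(w) > \varepsilon .
\end{align*}
```

So for every $z$ the pair $(L_1(z), L_2(z))$ is admissible in the proposition, with the same neighbourhood $U$ for all $z$.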
3. By positive or negative horocycle we mean a smooth curve of signed curvature 1 or $-1$.
The family of admissible metrics $\{ds^2_z\}_{z \in S^1}$ defined by $ds^2_z = ds^2_{L_1(z),L_2(z)}$ then defines a metric on the Hirsch foliation. On the other hand, if $\mu$ is a g-measure on the circle, then the measure
$$m = h_{L_1(z),L_2(z)}\, \mathrm{vol}(ds^2_z) \otimes \mu(dz)$$
defines a harmonic measure on the Hirsch foliation. To prove our theorem, it suffices to take a continuous g-function for which there exist several different g-measures; the existence of such a function is guaranteed by a theorem of Anthony N. Quas [Qu].
References
[De] B. Deroin. Non unique-ergodicity of harmonic measures: smoothing Samuel Petite's examples. Differential Geometry. Proc. VIII Intern. Col. Santiago de Compostela (2008).
[DK] B. Deroin & V. Kleptsyn. Random conformal dynamical systems. Geom. Funct. Anal. Vol. 17 (2007) no. 4, p. 1043-1105.
[Ga] L. Garnett. Foliations, the ergodic theorem and Brownian motion. J. Funct. Anal. 51 (1983), no. 3, p. 285-311.
[Gh] É. Ghys. Laminations par surfaces de Riemann. Dynamique et géométrie complexes (Lyon, 1997), 49-95, Panor. Synthèses, 8, Soc. Math. France, Paris, 1999.
[Ke] M. Keane. Strongly mixing g-measures. Inventiones Math. 16 (1972), p. 309-324.
[KP] V. Kleptsyn and S. Petite. Personal communication.
[Qu] A. N. Quas. Non-ergodicity for $C^1$ expanding maps and g-measures. Ergodic Th. Dyn. Syst. 16 (1996), p. 531-543.
CNRS UMR 8628, Département de Mathématique d'Orsay, Bâtiment 425, Universit
ematique et de mod
M by
(x, p), with x M, and p T
x
M =
1
(x). With this notation the
canonical projection : T
x
M coming
from a Riemannian metric on $M$. Since $M$ is compact, all Riemannian metrics are equivalent, and this last condition 3) is satisfied by all Riemannian metrics as soon as it is satisfied by one of them.
As is usual now, see [7, 13, 14], we define the Mañé critical value $c[0]$ of $H$ by
$$c[0] = \inf\{\sup_{x \in M} H(x, d_x u) \mid u \in C^1(M, \mathbb{R})\}.$$
Of course by density of $C^\infty(M, \mathbb{R})$ in $C^1(M, \mathbb{R})$ for the $C^1$ topology, we could have taken the inf on $C^\infty(M, \mathbb{R})$.
Recall, see for example [13, 14], that we say that $u : M \to \mathbb{R}$ is a critical subsolution (of the Hamilton-Jacobi Equation) if it is Lipschitz, and $H(x, d_x u) \le c[0]$, for (Lebesgue) almost every $x \in M$. Due to the coercivity and convexity of $H$ in $p$, a critical subsolution is nothing but a (global) viscosity subsolution of the Hamilton-Jacobi Equation
$$H(x, d_x u) = c[0],$$
see [1, Chapter II] or [2, Chapitre 2].
It is not difficult to obtain from the stability of viscosity
subsolutions, again see [1, Proposition 2.2 page 35] or [2, Théorème 2.3
page 21], that there exists a critical subsolution. A much stronger
result has been obtained by Patrick Bernard: there always exist $C^{1,1}$
critical subsolutions (as usual a $C^{1,1}$ function is a $C^1$ function
whose derivative is locally Lipschitz).
To a critical subsolution $u$, we can associate a specific compact
non-empty subset $I(u)$ called the projected Aubry set of $u$, such that
$d_xu$ exists at every $x \in I(u)$, and moreover $x \mapsto d_xu$ is Lipschitz
on $I(u)$. In fact, one has $H(x, d_xu) = c[0]$ for every $x \in I(u)$;
therefore it is very difficult to perturb a critical subsolution near
$I(u)$, while keeping it a critical subsolution. There are several
possible descriptions for $I(u)$, see for example [10, 13]. We will give
some of these descriptions in §3.
Here is the main result of this paper. It shows that the problem of
existence of smoother critical subsolutions is localized to a neighbor-
hood of Aubry sets.
CRITICAL SUBSSOLUTIONS OF THE HAMILTON-JACOBI EQUATION 89
Theorem 1.1. Let $u : M \to \mathbb{R}$ be a critical subsolution for the
Tonelli Hamiltonian $H$ on $M$. Suppose we can find an open subset
$U$ of $M$, with $I(u) \subset U$, and a $C^k$ map $\tilde u : U \to \mathbb{R}$ such that:
1) $\tilde u = u$ on $I(u)$,
2) $H(x, d_x\tilde u) \le c[0]$, for every $x \in U$,
then there exists a $C^k$ critical subsolution $\hat u : M \to \mathbb{R}$ with $\hat u = u$ on
$I(u)$. Moreover, we can find such a critical subsolution $\hat u : M \to \mathbb{R}$
which is $C$ […] consists of
a finite number of hyperbolic orbits then we can find a $C^k$ critical
subsolution $u : M \to \mathbb{R}$ which is strict outside the projected Aubry
set $A = \pi^*(\tilde A) \subset M$.
For the definitions of the Aubry set $\tilde A \subset T^*M$, see section 3
below.
2. An obvious way to combine subsolutions
This section contains the simple main idea of this work. It shows
how to combine viscosity subsolutions to obtain a new one. Here
we only need to assume that $H : T^*M \to \mathbb{R}$ is convex in $p$. Recall
that, under this convexity assumption, a locally Lipschitz function
$u : U \to \mathbb{R}$, defined on the open subset $U \subset M$, is a viscosity
subsolution of
$$H(x, d_xu) = c,$$
where $c \in \mathbb{R}$ is fixed, if and only if $H(x, d_xu) \le c$ (Lebesgue) almost
everywhere on $U$, again see [1, Chapter II] or [2, Chapitre 2].
Proposition 2.1. Suppose $c \in \mathbb{R}$ is fixed. Let $U \subset M$ be an open
subset of $M$, and $u_1, u_2 : U \to \mathbb{R}$ be two locally Lipschitz maps
satisfying $H(x, d_xu_i) \le c$, $i = 1, 2$, (Lebesgue) almost everywhere on $U$.
For any Lipschitz function $\varphi : \mathbb{R} \to \mathbb{R}$, with $\varphi$ non-decreasing and
$\mathrm{Lip}(\varphi) \le 1$, the function $u_\varphi = u_1 + \varphi(u_2 - u_1)$ also satisfies
$H(x, d_xu_\varphi) \le c$ (Lebesgue) almost everywhere on $U$.
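The statement can be checked numerically on a toy case. The sketch below is only an illustration under assumptions that are not in the paper: the Hamiltonian $H(x, p) = p^2/2$ on the circle, the level $c = 1/2$, the subsolutions $u_1 = \sin$ and $u_2 = \cos$, and the smooth non-decreasing function $\varphi(t) = (t + \sqrt{t^2 + 1})/2$, whose derivative lies in $]0, 1[$.

```python
import math

def H(x, p):
    # Hamiltonian convex in p, chosen for illustration: H(x, p) = p^2 / 2.
    return 0.5 * p * p

c = 0.5  # u1 and u2 below satisfy H(x, d_x u_i) <= c because |u_i'| <= 1

def u1(x): return math.sin(x)
def u2(x): return math.cos(x)
def du1(x): return math.cos(x)
def du2(x): return -math.sin(x)

def dphi(t):
    # Derivative of phi(t) = (t + sqrt(t^2 + 1)) / 2: smooth, with values
    # strictly between 0 and 1, so phi is non-decreasing and Lip(phi) <= 1.
    return 0.5 * (1.0 + t / math.sqrt(t * t + 1.0))

def du_phi(x):
    # Chain rule: d(u_phi) = (1 - phi') du1 + phi' du2, a convex combination.
    w = dphi(u2(x) - u1(x))
    return (1.0 - w) * du1(x) + w * du2(x)

grid = [2 * math.pi * i / 1000 for i in range(1000)]
worst = max(H(x, du_phi(x)) for x in grid)
print("max of H(x, d_x u_phi) on the grid:", worst, "<= c =", c)
```

The maximum stays below $c$ because the derivative of $u_\varphi$ is, pointwise, a convex combination of the two derivatives.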
90 ALBERT FATHI
Proof. We first consider the case where $\varphi$ is differentiable everywhere
(for example $C^1$). The set
$$U' = \{x \in U \mid d_xu_1, d_xu_2 \text{ both exist, and } H(x, d_xu_i) \le c,\ i = 1, 2\}$$
is of full Lebesgue measure in $U$. For every $x \in U'$, we have
$$d_xu_\varphi = d_xu_1 + \varphi'[(u_2 - u_1)(x)](d_xu_2 - d_xu_1)
= \bigl(1 - \varphi'[(u_2 - u_1)(x)]\bigr)\, d_xu_1 + \varphi'[(u_2 - u_1)(x)]\, d_xu_2.$$
Since $\varphi$ is non-decreasing and $\mathrm{Lip}(\varphi) \le 1$, we have
$0 \le \varphi'(t) \le 1$. Therefore $d_xu_\varphi$ is a convex combination
of $d_xu_1$ and $d_xu_2$. The convexity of $H(x, p)$ in $p$ implies that
$H(x, d_xu_\varphi) \le c$, for every $x \in U'$ […]
approximation of the identity for the convolution, then
$\varphi_n = \rho_n * \varphi$ is also non-decreasing and has Lipschitz constant $\le 1$.
The function $\varphi_n$ is $C^\infty$, and $\varphi_n \to \varphi$ uniformly on any compact
subset of $\mathbb{R}$. By the first part of the proof $u_{\varphi_n} = u_1 + \varphi_n(u_2 - u_1)$
is locally Lipschitz and satisfies $H(x, d_xu_{\varphi_n}) \le c$ almost everywhere
on $U$. Therefore $u_{\varphi_n}$ is a viscosity subsolution of $H(x, d_xu) = c$ on $U$.
Since $u_{\varphi_n} \to u_\varphi$ uniformly on compact subsets of $U$, we obtain that the
limit $u_\varphi$ is also a viscosity subsolution of $H(x, d_xu) = c$, see
[1, Proposition 2.2 page 35] or [2, Théorème 2.3 page 21]. Therefore at
each point $x$ where the derivative of the locally Lipschitz function $u_\varphi$
exists, we have $H(x, d_xu_\varphi$ […]
$T^*M \to \mathbb{R}$ is a Tonelli Hamiltonian. We will call $\phi^H_t$ the
Hamiltonian flow of $H$. This flow is the flow defined by the ODE
$$\dot x = \frac{\partial H}{\partial p}(x, p), \qquad \dot p = -\frac{\partial H}{\partial x}(x, p).$$
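For readers who want to experiment with such a flow, here is a minimal numerical sketch. It assumes a pendulum-type Hamiltonian $H(x, p) = p^2/2 - \cos x$ (an illustrative choice, not from the paper) and approximates the time-$t$ map with the Störmer-Verlet (leapfrog) scheme; the function names are hypothetical. Since $H$ is constant along exact orbits, the energy drift gives a quick sanity check.

```python
import math

def H(x, p):
    # Pendulum-type Tonelli Hamiltonian on the circle (illustrative choice).
    return 0.5 * p * p - math.cos(x)

def dH_dx(x, p):
    return math.sin(x)

def dH_dp(x, p):
    return p

def leapfrog_step(x, p, h):
    # One Stormer-Verlet step for x' = dH/dp, p' = -dH/dx (separable H).
    p_half = p - 0.5 * h * dH_dx(x, p)
    x_new = x + h * dH_dp(x, p_half)
    p_new = p_half - 0.5 * h * dH_dx(x_new, p_half)
    return x_new, p_new

def flow(x, p, t, steps=10000):
    # Approximation of the time-t map of the Hamiltonian flow.
    h = t / steps
    for _ in range(steps):
        x, p = leapfrog_step(x, p, h)
    return x, p

x0, p0 = 1.0, 0.5
x1, p1 = flow(x0, p0, t=10.0)
energy_drift = abs(H(x1, p1) - H(x0, p0))
print("energy drift along the orbit:", energy_drift)
```

The scheme is time-reversible, so flowing back with `t=-10.0` from `(x1, p1)` returns to the starting point up to rounding.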
If $u : M \to \mathbb{R}$ is a Lipschitz function, the derivative $d_xu$ exists for
(Lebesgue) almost every $x \in M$, and we set
$$\mathrm{Graph}(du) = \{(x, d_xu) \mid x \text{ where } d_xu \text{ exists}\}.$$
If $u : M \to \mathbb{R}$ is a critical subsolution, we can define the Aubry set
$\tilde I(u)$ by
$$\tilde I(u) = \bigcap_{t \in \mathbb{R}} \phi^H_t[\mathrm{Graph}(du) \cap H^{-1}(c[0])].$$
Although this is not obvious, this set is compact and non-empty. By
its definition it is invariant under the Hamiltonian flow of $H$. The
projected Aubry set of $u$ is $I(u) = \pi^*(\tilde I(u))$, where $\pi^* : T^*M \to M$ is
the canonical projection.
The Aubry set $\tilde A$ of $H$ is
$\tilde A = \bigcap \tilde I$ […]
[…] $L(x, v) = \sup_{p \in T^*_xM}\, p(v) - H(x, p)$.
The map $L$ is as smooth as the Tonelli Hamiltonian $H$. Moreover,
it satisfies the analogue of properties 2) and 3) of a Tonelli
Hamiltonian. The Legendre transform $\mathcal{L} : TM \to T^*M$ defined by
$$\mathcal{L}(x, v) = \Bigl(x, \frac{\partial L}{\partial v}(x, v)\Bigr),$$
is a global diffeomorphism whose inverse is given by
$$\mathcal{L}^{-1}(x, p) = \Bigl(x, \frac{\partial H}{\partial p}(x, p)\Bigr).$$
One has the Fenchel inequality
$$\forall x \in M,\ \forall v \in T_xM,\ \forall p \in T^*_xM, \quad p(v) \le L(x, v) + H(x, p).$$
The Fenchel inequality is an equality if and only if $(x, p) = \mathcal{L}(x, v)$
($\iff p = \partial L/\partial v(x, v) \iff v = \partial H/\partial p(x, p)$).
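The Fenchel inequality and its equality case are easy to test on a mechanical example. The sketch below assumes $H(x, p) = \frac12 p^2 + V(x)$ with a hypothetical potential $V = \cos$, for which the Lagrangian is $L(x, v) = \frac12 v^2 - V(x)$ and the Legendre transform is $(x, v) \mapsto (x, v)$; the gap $L + H - p(v)$ then equals $\frac12 (p - v)^2$.

```python
import math

def V(x):
    # Hypothetical potential (an assumption for illustration).
    return math.cos(x)

def H(x, p):
    return 0.5 * p * p + V(x)

def L(x, v):
    # Lagrangian associated with H: L(x, v) = sup_p [p v - H(x, p)].
    return 0.5 * v * v - V(x)

def fenchel_gap(x, v, p):
    # L(x, v) + H(x, p) - p(v); here it equals (p - v)^2 / 2 >= 0.
    return L(x, v) + H(x, p) - p * v

def legendre(x, v):
    # Legendre transform (x, v) -> (x, dL/dv(x, v)); for this L, dL/dv = v.
    return x, v

samples = [(0.3, v / 4.0, p / 4.0) for v in range(-8, 9) for p in range(-8, 9)]
min_gap = min(fenchel_gap(x, v, p) for x, v, p in samples)
gap_on_legendre_graph = max(
    abs(fenchel_gap(x, v, legendre(x, v)[1])) for x, v, _ in samples
)
print("smallest gap:", min_gap, "| gap on the graph of the transform:", gap_on_legendre_graph)
```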
Suppose now that $u$ is a strict subsolution, and fix $(x, p) \in \tilde I(u)$. If
we write $\phi^H_t(x, p) = (\gamma(t), p(t))$ then $p(t) = d_{\gamma(t)}u$, and
$H(\gamma(t), d_{\gamma(t)}u) = c[0]$. Since the Legendre transform exchanges speed
curves of extremals of the Lagrangian $L$ and orbits of $\phi^H_t$, we have
$d_{\gamma(t)}u = p(t) = \partial L/\partial v(\gamma(t), \dot\gamma(t))$. Therefore using the equality
case in the Fenchel inequality, we get
$$d_{\gamma(t)}u(\dot\gamma(t)) = L(\gamma(t), \dot\gamma(t)) + c[0].$$
By integration this implies that $\gamma : ]-\infty, +\infty[ \to M$ is $(u, L, c[0])$-calibrated.
Recall that a curve $\gamma : I \to M$, where $I$ is an interval in
$\mathbb{R}$, is said to be $(u, L, c[0])$-calibrated if for all $t, t' \in I$, with $t \le t'$,
we have
$$u(\gamma(t')) - u(\gamma(t)) = \int_t^{t'} L(\gamma(s), \dot\gamma(s))\, ds + c[0](t' - t).$$
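Conversely, if $\gamma : ]-\infty, +\infty[ \to M$ is $(u, L, c[0])$-calibrated, using the
properties of calibrated curves, we have that $d_{\gamma(t)}u$ exists, and
$$d_{\gamma(t)}u = \frac{\partial L}{\partial v}(\gamma(t), \dot\gamma(t)) \quad\text{and}\quad H(\gamma(t), d_{\gamma(t)}u) = c[0].$$
Since a calibrated curve is an extremal, it follows that $t \mapsto (\gamma(t), d_{\gamma(t)}u)$
is an orbit of $\phi^H_t$, contained in $\tilde I(u)$. Hence $\tilde I(u) = \mathcal{L}(I(u))$, where
[…] $T^-_t, T^+_t : C^0(M, \mathbb{R}) \to C^0(M, \mathbb{R})$.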
If $t \ge 0$ and $u \in C^0(M, \mathbb{R})$, we have
$$T^-_t(u)(x) = \inf_\gamma\; u(\gamma(-t)) + \int_{-t}^0 L(\gamma(s), \dot\gamma(s))\, ds,$$
where the infimum is taken over all curves $\gamma : [-t, 0] \to M$, with
$\gamma(0) = x$. In the same way
$$T^+_t(u)(x) = \sup_\gamma\; u(\gamma(t)) - \int_0^t L(\gamma(s), \dot\gamma(s))\, ds,$$
where the supremum is taken over all curves $\gamma : [0, t] \to M$, with
$\gamma(0) = x$. The function $u : M \to \mathbb{R}$ is a critical subsolution if and
only if $u \le T^-_t(u) + c[0]t$, for every $t \ge 0$ ($\iff u \ge T^+_t(u) - c[0]t$, for
every $t \ge 0$).
A negative (resp. positive) weak KAM solution is a function
$u_- : M \to \mathbb{R}$ (resp. $u_+ : M \to \mathbb{R}$) such that $u_- = T^-_tu_- + c[0]t$
(resp. $u_+ = T^+_tu_+ - c[0]t$), for every $t \ge 0$. By what we said above weak
KAM solutions are automatically critical subsolutions.
Given a critical subsolution $u$, then $T^-_tu + c[0]t$ (resp. $T^+_t(u) - c[0]t$)
is non-decreasing (resp. non-increasing) in $t$, and converges uniformly
to a negative (resp. positive) weak KAM solution $u_- \ge u$ (resp.
$u_+ \le u$). For proof of the convergence see [10], or arguments in [9].
In particular, we have $u_+ \le u \le u_-$ […]$_t\,u(x) + c[0]t$, for every $t \ge 0$.
It follows that $u$ […], we obtain
that […] $u_+(x) = u_-(x)\}$.
In particular $u_- - u_+ > 0$ on $M \setminus I(u)$. It can be shown that
$I(u) = I(u_-) = I(u_+)$.
It is useful to introduce the concept for a critical subsolution of
being strict on an open subset, see [13, 14]. Since we will use this
concept when the function is at least $C^1$ on the open set, we can
give the following definition: we will say that the critical subsolution
$u : M \to \mathbb{R}$ is strict on the open subset $U \subset M$ if it is $C^1$ on $U$ and
$$\forall x \in U, \quad H(x, d_xu) < c[0].$$
We will need the following density theorem. For a proof see for
example [13, §7] and [14, §6].
Theorem 3.1. If $u : M \to \mathbb{R}$ is a critical subsolution, and $\epsilon > 0$ is
given, we can find a critical subsolution $u_\epsilon : M \to \mathbb{R}$ such that:
1) the function $u_\epsilon$ is $C^\infty$ and strict on $M \setminus A$;
2) $\|u_\epsilon - u\|_\infty < \epsilon$.
4. Proof of Theorem 1.1
In this section we will assume that $u : M \to \mathbb{R}$ is a critical
subsolution. By what was recalled in the previous section 3, we can find
a pair $(u_-, u_+)$ of negative and positive weak KAM solutions, with
$u_- \ge u \ge u_+$ and $I(u) = \{x \mid u_+(x) = u_-(x)\}$.
We will further assume that $\tilde u : U \to \mathbb{R}$ is a $C^k$ function such that:
1) $U$ is an open subset of $M$ containing $I(u)$,
2) $\tilde u = u\ (= u_- = u_+)$ on $I(u)$,
3) $H(x, d_x\tilde u) \le c[0]$ on $U$.
Lemma 4.1. Let $K$ be a compact subset of $M \setminus I(u)$. We can find
a $C^k$ function $u_1 : U \to \mathbb{R}$, such that $H(x, d_xu_1) \le c[0]$, for every
$x \in U$, $u_1 = \tilde u$ on a neighborhood of $I(u)$, and $u_1 < u_-$ on $U \cap K$.
Proof. Since $u_- > u_+$ on the compact set $K$, we can choose $\epsilon > 0$
such that $u_+ + 4\epsilon < u_-$ on $K$. Since $u_+$ is a critical subsolution, by
Theorem 3.1, we can find a global critical subsolution $\hat u_+ : M \to \mathbb{R}$
which is $C$ […].
Since $u_+ = u = \tilde u$ on $I(u)$, we can find an open neighborhood $V \subset U$
of $I(u)$ such that $\tilde u \le u_+ + \epsilon$ on $V$. Therefore $\tilde u \le \hat u_+ + 2\epsilon$ on $V$.
Let $\theta : \mathbb{R} \to [0, 1]$ be a $C$ […] $\int_0^t \theta(s)\, ds$.
The function $\varphi$ is clearly $C$ […] $\int_0^{3\epsilon} \theta(s)\, ds \le \int_0^{3\epsilon} ds = 3\epsilon$.
By Proposition 2.1, the function $u_1 = \hat u_+ + \varphi(\tilde u - \hat u_+)$ satisfies
$H(x, d_xu_1) \le c[0]$ on $U$. We also have
$u_1 \le \hat u_+ + \max\varphi \le \hat u_+ + 3\epsilon \le u_+ + 4\epsilon$.
Therefore, by the choice of $\epsilon$, we obtain $u_1 < u_-$ on $K \cap U$. Note
that $u_1$ is $C^k$ outside of $I(u)$. On the open set $V \supset I(u)$, we have
$\tilde u - \hat u_+ \le 2\epsilon$. Since on $]-\infty, 2\epsilon]$ the derivative $\varphi' = \theta$ is
identically 1, we have $\varphi(t) = t$, for every $t \in ]-\infty, 2\epsilon]$. On $V$, we
therefore get $\varphi(\tilde u - \hat u_+) = \tilde u - \hat u_+$, and $u_1 = \tilde u$. In particular, the
function $u_1$ is also $C^k$ on $V$, hence on $U = V \cup (U \setminus I(u))$.
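The role of the cut-off in this proof can be made concrete with a small numerical sketch. It assumes one particular construction of a smooth function $\theta$ with values in $[0, 1]$, equal to 1 on $]-\infty, 2\epsilon]$ and to 0 on $[3\epsilon, +\infty[$, built from the standard function $e^{-1/s}$; the paper does not prescribe this choice, and the value of `EPS` is arbitrary. Its primitive $\varphi$ is then non-decreasing, 1-Lipschitz, equal to the identity up to $2\epsilon$, and bounded above by $3\epsilon$.

```python
import math

EPS = 0.1  # plays the role of epsilon in the proof (arbitrary value)

def _h(s):
    # Standard smooth function vanishing for s <= 0.
    return math.exp(-1.0 / s) if s > 0 else 0.0

def smooth_step(s):
    # C-infinity step: 0 for s <= 0, 1 for s >= 1.
    return _h(s) / (_h(s) + _h(1.0 - s))

def theta(s):
    # Cut-off with values in [0, 1]: 1 on ]-inf, 2 EPS], 0 on [3 EPS, +inf[.
    return 1.0 - smooth_step((s - 2 * EPS) / EPS)

def phi(t, steps=2000):
    # phi(t) = integral of theta from 0 to t (signed midpoint rule).
    h = t / steps
    return h * sum(theta((i + 0.5) * h) for i in range(steps))

for t in (-3.0, 0.15, 0.25, 0.3, 1.0):
    print(f"phi({t}) = {phi(t):.6f}")
```

With this particular $\theta$ the plateau value is $\varphi(t) = 2.5\,\epsilon$ for $t \ge 3\epsilon$, which is indeed below the bound $3\epsilon$ used in the proof.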
Lemma 4.2. For any neighborhood $W$ of $I(u)$, we can find a $C^k$
function $u_2 : M \to \mathbb{R}$ such that
1) $u_2 = \tilde u$ in a neighborhood of $I(u)$,
2) $u_2$ is a critical subsolution,
3) $u_2$ is a strict critical subsolution outside of $W$, i.e. $H(x, d_xu_2) < c[0]$,
for every $x \in M \setminus W$.
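Proof. We can assume $\bar W \subset U$. Moreover, by Lemma 4.1, applied
with $K = M \setminus W$, replacing $\tilde u$ by $u_1$ if necessary, we can also assume
$\tilde u < u_-$ on $U \setminus W$. Choose $U'$ a neighborhood of $\bar W$ with
$\bar W \subset U' \subset \bar U' \subset U$. Define $\epsilon$ by
$$3\epsilon = \inf_{\bar U' \setminus W} u_- - \tilde u.$$
Note that $\epsilon > 0$, since $\bar U' \setminus W$ is a compact subset of $U \setminus W$, on
which the continuous function $u_- - \tilde u$ is $> 0$. Since $u_-$ is a critical
subsolution, by Theorem 3.1, we can choose a critical subsolution
$\hat u_- : M \to \mathbb{R}$ which is $C$ […]), and
satisfies […] $u_- = u = \tilde u$ on $I(u)$. We
obtain $\hat u_- - \tilde u \le 2\epsilon$ on $V$. Let $\theta : \mathbb{R} \to [0, 1]$ be a $C^\infty$ function such
that $\theta = 0$ on $]-\infty, 2\epsilon]$ and $\theta = 1$ on $[3\epsilon, +\infty[$. We define $\varphi : \mathbb{R} \to \mathbb{R}$
by
$$\varphi(t) = \int_0^t \theta(s)\, ds.$$
The function $\varphi$ is $C$ […] $\bar u = \tilde u + \varphi(\hat u_- - \tilde u)$
is defined on $U$ and $C^k$ on $U \setminus I(u)$. By Proposition 2.1, the function
$\bar u$ satisfies $H(x, d_x\bar u$ […] $\bar u = \tilde u$ on $V$ and $\bar u = \hat u_- + \varphi(3\epsilon) - 3\epsilon$ on
$\bar U' \setminus W$. In particular $\bar u$ is $C^k$ on the whole of $U$. Since
$\bar W \subset U'$, we can define a $C^k$ function
$u_2 : M \to \mathbb{R}$ such that $u_2 = \bar u$ on $U'$ and $u_2 = \hat u_- + \varphi(3\epsilon) - 3\epsilon$ on
$M \setminus W$. Note that $u_2$ is a critical subsolution which is strict outside
$W$ like $\hat u_-$.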
Proof of Theorem 1.1. Choose a sequence $V_n$, $n \ge 0$, of neighborhoods
of $I(u)$ such that $V_{n+1} \subset V_n$ and $\bigcap_{n \in \mathbb{N}} V_n = I(u)$. By Lemma
4.2, we can find a sequence $u_n : M \to \mathbb{R}$ of $C^k$ critical subsolutions,
such that $u_n = u$ on $I(u)$, and $u_n$ is strict outside $V_n$. We can pick
a converging series $\epsilon_n > 0$, $n \ge 0$, such that $\sum_{n \in \mathbb{N}} \epsilon_nu_n$ converges in
the $C^k$ topology (this is easy to show, see for example [12, Lemma
3.3, page 722]). Changing a finite number of terms, we can assume
$\sum_{n \in \mathbb{N}} \epsilon_n = 1$. By the convexity of $H$ in $p$, the sum $\sum_{n \in \mathbb{N}} \epsilon_nu_n$ is also
a critical subsolution. It is strict outside $V_n$, $n \ge 0$, since $\epsilon_n > 0$.
Hence it is strict outside $\bigcap_{n \in \mathbb{N}} V_n = I(u)$. Of course
$\sum_{n \in \mathbb{N}} \epsilon_nu_n = u$ on $I(u)$.
To make $\hat u = \sum_{n \in \mathbb{N}} \epsilon_nu_n$ of class $C$ […] such that
$I(u$ […]$) = A$, and $u$ […] is $C^k$ in a neighborhood of $A$. It therefore suffices
to apply Theorem 1.1 with $u = \tilde u = u$ […]
JEAN-MICHEL LOUBES AND CLÉMENT MARTEAU
Abstract. We tackle the problem of estimating a regression
function observed in an instrumental regression framework. This
model is an inverse problem with unknown operator. We provide
a spectral cut-off estimation procedure and derive oracle
inequalities which guarantee that our estimate, built without
any prior knowledge, behaves as well as if the best model were
known, up to a log term.
Introduction
An economic relationship between a response variable $Y$ and a
vector of explanatory variables $X$ is often represented by an equation
$$Y = \varphi(X) + U,$$
where $\varphi$ is the parameter of interest which models the relationship
while $U$ is an error term. Contrary to usual statistical regression
models, the error term is correlated with the explanatory variables
$X$, hence $E(U|X) \ne 0$, preventing direct estimation of $\varphi$. To overcome
the endogeneity of $X$, we assume that there exists an observed
random variable $W$, called the instrument, which decorrelates the
effects of the two variables $X$ and $Y$ in the sense that $E(U|W) = 0$.
This is often the case in economics, where the practical construction of
instrumental variables plays an important part, for instance in practical
situations where prices and quantities of goods can be
explained using an instrument. This situation is also encountered
2000 Mathematics Subject Classification. 62G05, 62G20.
Key words and phrases. Inverse Problems, Instrumental Variables, Model
Selection, Econometrics.
100 JEAN-MICHEL LOUBES AND CLÉMENT MARTEAU
when dealing with simultaneous equations, error-in-variable models,
or treatment models with endogenous effects. It defines the so-called
instrumental variable regression model, which has received growing
interest over the last decade and has turned out to be a challenging
issue in statistics. In particular, we refer to [NP03] for general
references on the use of instrumental variables in economics, while
[CFR06] deals with the statistical estimation problem.
More precisely, we aim at estimating a function $\varphi$ from the
observations of $(Y, X, W)$ satisfying the following condition
(1)   $Y = \varphi(X) + U, \qquad \begin{cases} E(U|X) \ne 0 \\ E(U|W) = 0. \end{cases}$
Hence, the model (1) can be rewritten as an inverse problem using
the conditional expectation operator with respect to $W$, which will
be denoted $T$, as follows:
(2)   $r := E(Y|W) = E(\varphi(X)|W) = T\varphi.$
The function $r$ is not known and only an observation $\hat r$ is available,
leading to the usual inverse problem setting
(3)   $\hat r = T\varphi + \delta,$
where $\varphi$ is defined as the solution of a noisy Fredholm equation of
the first kind, which may generate an ill-posed inverse problem. The
literature on inverse problems in statistics is large, but contrary to
most of the problems tackled in this literature
(see [EHN96], [MR96], [CGPT02], [CHR03], [LL08] and [OS86] for
general references), the operator $T$ is also unknown, which transforms
the model into an inverse problem with unknown operator.
Few results exist in this setting and only very recently have new methods
arisen. In particular [CH05], [Mar06, Mar09], or [EK01] and
[HR08] in a more general case, construct estimators which make it
possible to solve inverse problems with unknown operators in an adaptive
way, i.e. getting optimal rates of convergence without prior knowledge
of the regularity of the functional parameter of interest.
ADAPTATIVE ESTIMATION FOR INSTRUMENTAL REGRESSION 101
In this work, we are facing an even more difficult situation since
both $r$ and the operator $T$ have to be estimated from the same sample.
Some attention has been paid to this estimation issue, by estimating
the joint density with different kinds of techniques such as kernel
based Tikhonov regularization [CFR06], regularization in Hilbert
scales, or finite dimensional sieve minimum distance estimators [NP03],
with different rates and different smoothness assumptions, providing
sometimes minimax rates of convergence. But, to our knowledge, all
the proposed estimators rely on prior knowledge of the regularity
of the function $\varphi$ expressed through an embedding condition into a
smoothness space or a Hilbert scale, or a condition linking the
regularity of $\varphi$ to the regularity of the operator, namely a link condition
or source condition (see [CR08] for general and insightful
comments on such assumptions). In a first part, we explain how to
use a general penalized approach to turn any regularization scheme
into an adaptive procedure when the operator is known. But the
extension of this method to the case of IV regression fails; hence we
provide, under some conditions on the SVD, an adaptive
estimation procedure for the function $\varphi$ which converges, without
prior regularity assumption, at the optimal rate of convergence, up to
a logarithmic term. Moreover, we derive an oracle inequality which
ensures optimality among the different choices of estimators.
Hence, the objective of this work is twofold: first, extending the
estimation procedure for inverse problems with unknown operator to
the case of correlated data while still obtaining an oracle inequality;
then, providing a tractable adaptive estimator for some cases of
instrumental variable regression.
1. A statistical framework for instrumental variable
(IV) regression
1.1. Mathematical model. We observe an i.i.d. sample $(Y_i, X_i, W_i)$
for $i = 1, \dots, n$ with unknown distribution $f$. Define the following
Hilbert spaces
$$L^2_X = \{h : \mathbb{R} \to \mathbb{R},\ \|h\|^2_X := E(h^2(X)) < +\infty\}$$
$$L^2_W = \{g : \mathbb{R} \to \mathbb{R},\ \|g\|^2_W := E(g^2(W)) < +\infty\},$$
with the corresponding scalar products $\langle\cdot,\cdot\rangle_X$ and $\langle\cdot,\cdot\rangle_W$. For the sake
of convenience, we only consider in this paper the case where $\varphi$ is
univariate. The approach presented in this paper can certainly be
extended to the multivariate case (i.e. with a variable $X$ of dimension
$d > 1$).
Then the conditional expectation operator of $X$ with respect to $W$
is defined as an operator $T$
$$T : L^2_X \to L^2_W, \qquad g \mapsto E[g(X)|W = \cdot\,].$$
The model (1) can be written, as discussed in [CR08], as
$Y_i = \varphi(X_i) + E[\varphi(X_i)|W_i] - E[\varphi(X_i)|W_i] + U_i$
$\phantom{Y_i} = E[\varphi(X_i)|W_i] + V_i$
$\phantom{Y_i} = T\varphi(W_i) + V_i$,   (4)
where $V_i = \varphi(X_i) - E[\varphi(X_i)|W_i] + U_i$ is such that $E(V|W) = 0$. The
parameter of interest is the unknown function $\varphi$. Hence, the
observation model turns out to be an inverse problem with unknown operator
$T$ and a correlated noise $V$. Solving this issue amounts to dealing with
the estimation of the operator and then controlling the correlation
with respect to the noise.
The operator $T$ is unknown since it depends on the unknown
distribution of the observed variables $Y, X, W$, denoted $f_{(Y,X,W)}$. The
estimation of this operator can be performed either by directly using
an estimate of $f_{(Y,X,W)}$ or, if it exists, by estimating the singular value
decomposition of the operator.
Assume that $T$ is compact and admits a singular value decomposition
(SVD) $(\lambda_j, \phi_j, \psi_j)_{j \ge 1}$, which provides a natural basis adapted to
the operator for representing the function $\varphi$, see for instance [EHN96].
More precisely, let $T^*$ […] $T^*T$ is a
compact operator on $L^2_X$ with eigenvalues $\lambda_j^2$, $j \ge 1$, associated to the
corresponding eigenfunctions $\phi_j$, while the $\psi_j$ are defined by
$\psi_j = T\phi_j / \|T\phi_j\|$.
So we obtain
$$T\phi_j = \lambda_j\psi_j, \qquad T^*\psi_j = \lambda_j\phi_j.$$
The decay of the eigenvalues defines the difficulty of the inverse
problem. Hereafter, we only consider the case of mildly ill-posed inverse
problems, i.e. when the eigenvalues decay at a polynomial rate.
IP: Degree of ill-posedness: We assume that there exists $t$,
called the degree of ill-posedness of the operator, which
controls the decay of the eigenvalues of the operator $T$. More
precisely, there are constants $\lambda_L, \lambda_U$ such that
(5)   $\lambda_L k^{-t} \le \lambda_k \le \lambda_U k^{-t}, \quad \forall k \ge 1.$
We assume some conditions on the observation errors in order
to obtain Hoeffding-type concentration bounds. Other equivalent
conditions can be used.
Exponential Moment conditions: The observations $Y$ satisfy
the following moment condition. There exist some
positive numbers $v \ge E(Y_j^2)$ and $c$ such that
(6)   $\forall j \ge 1,\ \forall k \ge 2, \quad E(Y_j^k) < \frac{k!}{2}\, v c^{k-2}.$
1.2. An econometric example. Instrumental variable regression
is used in econometrics when modeling a relationship between
correlated variables. It usually occurs when considering the econometric
problem of estimating the price and demand of goods. If $Q$ is the
quantity of a good with price $P$ observed over the years, the usual
linear regression model
$$\log Q_i = \beta_0 + \beta_1 \log P_i + U_i = f(P_i) + u_i,$$
where the coefficients $\beta_0$ and $\beta_1$ are the elasticities, faces the difficulty
that $E(U|P) \ne 0$. Hence it is necessary to de-correlate the
effects using an auxiliary variable which should be highly correlated
with $P$ but uncorrelated with the error term $U$. This variable is
called an instrument.
Examples are numerous when studying the variation of price and
demand. For example, consider $Q$ the annual sales of wheat and $P$
the prices. An instrument could be in that case $PL$, the rain level
in the production region. It is obvious that the level of rain does
not change the demand, hence $\mathrm{Corr}(PL, U) = 0$, while the lack of
rain decreases the production, which in turn increases the prices, so
$\mathrm{Corr}(PL, \log P) \ne 0$.
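The effect of endogeneity in such a linear demand model can be seen on simulated data. The following sketch is purely illustrative: the data-generating process, the coefficient values and the variable names are assumptions, not taken from the paper. It compares the ordinary least squares slope, which is biased because the price is correlated with the error, with the simple instrumental variable slope $\mathrm{Cov}(W, Y)/\mathrm{Cov}(W, X)$.

```python
import random

random.seed(0)
n = 20000
beta0, beta1 = 1.0, -2.0  # hypothetical "true" coefficients

def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

rain, log_p, log_q = [], [], []
for _ in range(n):
    w = random.gauss(0.0, 1.0)            # instrument (rain level), independent of the error
    e = random.gauss(0.0, 1.0)            # shock shared by price and demand
    u = e + 0.5 * random.gauss(0.0, 1.0)  # demand error, correlated with the price
    p = -w + e                            # less rain -> higher price; e creates endogeneity
    rain.append(w)
    log_p.append(p)
    log_q.append(beta0 + beta1 * p + u)

ols_slope = cov(log_p, log_q) / cov(log_p, log_p)  # biased: Cov(P, U) != 0
iv_slope = cov(rain, log_q) / cov(rain, log_p)     # consistent: Cov(W, U) = 0
print("OLS slope:", round(ols_slope, 3), "| IV slope:", round(iv_slope, 3))
```

In this design the least squares slope converges to $\beta_1 + \mathrm{Cov}(P, U)/\mathrm{Var}(P) = -1.5$, whereas the instrumental variable slope converges to $\beta_1 = -2$.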
However, to solve this practical example, the specific link between
the covariates and the instrumental variable is required.
Assume that the link between $X$ and the instrument $W$ is of the
form $X = \mathcal{L}(W, Z)$ with $Z$ an independent random variable with
distribution $P_Z$. Then the operator has the following form
$$T\varphi(w) = \int \varphi(\mathcal{L}(w, Z))\, dP_Z(Z) = \int \varphi(x) K_{\mathcal{L}}(x, w)\, dx$$
with a change of variable under some differentiability conditions on
$\mathcal{L}$. Under technical assumptions, the operator defines a Fredholm
integral operator with kernel $K_{\mathcal{L}}$ depending on the link function $\mathcal{L}$
and the distribution of $Z$. Such operators are well studied in […] and, in
many cases, the SVD will be available, which makes it possible
to use the estimation procedure developed in this paper.
As a practical example, one may be interested in the particular
case where $X$ is uniform on $[0, 1]$ and $W = X + Z$, where $Z$ is a
random variable independent of $X$ with unknown density $g_Z$. We
point out that the model $Y_i = \varphi(W_i - Z_i) + U_i$ is also at the core
of curve registration issues when curves are warped through random
shifts $W_i$'s.
In this example, both $\varphi$ and $g_Z$ are supposed to be 1-periodic. The
conditional operator $T : L^2(X) \to L^2(W)$ can be written as
$$Tf(w) = E(f(X)|W = w) = E(f(w - Z)) = \int_0^1 f(w - z) g(z)\, dz,$$
with adjoint
$$T^*h(x) = E(h(W)|X = x) = \int_0^1 h(z + x) g(z)\, dz,$$
for all periodic functions $f, h$ belonging respectively to $L^2(X)$ and
$L^2(W)$. Hence, $T$ is a deconvolution type operator, up to some change
of variable. Let $(\phi_k)_{k \in \mathbb{N}}$ be the usual real trigonometric basis on $[0, 1]$:
$$\phi_1(t) \equiv 1, \quad \phi_{2p}(t) = \sqrt{2}\cos(2\pi pt), \quad \phi_{2p+1}(t) = \sqrt{2}\sin(2\pi pt), \quad p \in \mathbb{N}.$$
Since $X$ is uniform on $[0, 1]$, $(\phi_k)_{k \in \mathbb{N}}$ is an orthonormal basis of $L^2(X)$.
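The orthonormality claim can be verified numerically. The sketch below is a simple check, not part of the paper: it evaluates the Gram matrix $E[\phi_j(X)\phi_k(X)]$ for $X$ uniform on $[0, 1]$ with an equispaced Riemann sum, which is exact up to rounding for trigonometric polynomials of low degree. The indexing assumes $p \ge 1$ in the cosine and sine pairs.

```python
import math

def phi(k, t):
    # Real trigonometric basis on [0, 1]: phi_1 = 1, phi_{2p} = sqrt(2) cos(2 pi p t),
    # phi_{2p+1} = sqrt(2) sin(2 pi p t), for p >= 1.
    if k == 1:
        return 1.0
    p = k // 2
    if k % 2 == 0:
        return math.sqrt(2.0) * math.cos(2 * math.pi * p * t)
    return math.sqrt(2.0) * math.sin(2 * math.pi * p * t)

def inner(j, k, n=4096):
    # E[phi_j(X) phi_k(X)] for X uniform on [0, 1], by an equispaced Riemann sum.
    return sum(phi(j, i / n) * phi(k, i / n) for i in range(n)) / n

gram = [[inner(j, k) for k in range(1, 8)] for j in range(1, 8)]
max_error = max(
    abs(gram[j][k] - (1.0 if j == k else 0.0)) for j in range(7) for k in range(7)
)
print("largest deviation of the Gram matrix from the identity:", max_error)
```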
With simple algebra, it is possible to prove that this sequence
corresponds to the eigenvectors of $T^*$ […]
approximation spaces as projection spaces in order to study the data.
So, denote the projection of any space $W$ over any subspace $Z$ by
$\Pi_ZW$. Let $\Pi^n_{Y_m}$ stand for the projection in the empirical norm. Set
also the corresponding projected operator $T_m = \Pi^n_{Y_m}T$.
In this part, we impose some smoothness condition on the function
to be estimated, namely
SC source condition:
There exists $\nu > 0$ such that $\varphi \in \mathrm{Range}((T^*T)^\nu) := \mathcal{R}((T^*T)^\nu)$.
This condition, widely used in the field of inverse problems, links the
smoothness of the function to the regularity of the operator. The
relationships with other kinds of regularity assumptions are described
in [LR09].
Using a sieve of the space $\mathcal{Y}$, we consider the corresponding
approximation spaces in the space $\mathcal{X}$, defined as $X_m = T_m^*\mathcal{Y}$. By
construction
$$\Pi_{X_m} = (\Pi^n_{Y_m}T)^+\, \Pi^n_{Y_m}T.$$
Hereafter, we consider a class of regularized estimators built using a
projection and a regularization procedure. Hence the first step is to
project the data onto a well chosen space. Namely let $Y_{m_0}$ be a big
enough space in the sense that $m_0$ is such that
$$\|(I - \Pi_{X_{m_0}})\varphi\| \le \inf_{m \in \mathcal{M}_n}\Bigl[\|(I - \Pi_{X_m})\varphi\| + \sqrt{\frac{d_m}{n}}\, \gamma_m^{-1}\Bigr],$$
with
$$\gamma_m := \inf_{g \in Y_m,\ \|g\| = 1} \|T_m^*g\|,$$
which expresses the effect of the operator $T_m^*$ over the approximating
subspace $Y_m$. This quantity can be chosen so as not to depend on the
unknown regularity of the solution $\varphi$, but only on the ill-posedness of
the inverse problem, namely $\gamma_m = O(d_m^{-t})$ as shown in [LL08, LL10].
This bound leads to the usual optimal rate of convergence for inverse
problems. Under assumption SC the above inequality is satisfied if
the dimension of the set is such that
$$d_{m_0}^{-2\nu t} \le n^{-\frac{2\nu t}{4\nu t + 2t + 1}}.$$
Thus it is enough to choose $m_0$ such that $d_{m_0}$ […] $n \ge n^{1/(2t+1)}$.
The second step is obtained by considering, for $\mathcal{K}_n$ a set of indices,
a collection $\tilde R_k$, $k \in \mathcal{K}_n$, of regularization operators which depend
on different values of the smoothing parameters. For instance
consider Tikhonov regularization operators, which rely on the choice of
a smoothing sequence, Landweber iteration operators, which rely on
the choice of a stopping index, or other general smoothing operators
described in [EHN96]. Consider the corresponding estimators
(7)   $\hat\varphi_k := \tilde R_k\, \Pi^n_{Y_{m_0}}\, y = R_ky,$
where we have written $R_k := \tilde R_k\, \Pi^n_{Y_{m_0}}$. The behavior of such general
estimators depends on the choice of the regularization sequence.
From the theory of inverse problems, we know that it is possible to
choose a regularization operator for which the corresponding estimator
achieves the optimal rate of convergence, but this choice depends
on $\nu$ defined in SC, which characterizes the regularity of the solution.
Our aim is to build a method that picks, according to the data,
an optimal $R_k$ among all the $R_k$, $k \in \mathcal{K}_n$, in such a way that
optimal rates are maintained. This choice must also not depend on a
priori regularity assumptions. We point out that selecting the
optimal smoothing parameter in a collection of sequences belongs to
model selection theory, since it is equivalent to selecting a good model
among a collection of sets.
For this consider the following penalized procedure. For a given
constant $r > 2$ and weights $L_k$, $k \in \mathcal{K}_n$, to be chosen, define the
penalty as
$$\mathrm{pen}(k) := r\sigma^2(1 + L_k)[\mathrm{Tr}(R_k^tR_k) + \rho^2(R_k)],$$
where $\mathrm{Tr}(R_k^tR_k)$ is the trace and $\rho(R_k^tR_k) = \rho^2(R_k)$ is the spectral
radius. Finally $\hat k$ is selected as the solution of
(8)   $\hat k := \arg\min_{k \in \mathcal{K}_n}\bigl\{\|R_k(y - T(\hat\varphi_k))\|^2 + \mathrm{pen}(k)\bigr\},$
which defines the estimator $\hat\varphi_{\hat k} = R_{\hat k}y$. Let $R_kT\varphi$ be the regularized
true function, which measures the accuracy of the estimation
procedure without observation noise. The following result states the
asymptotic behaviour of the estimator $\hat\varphi_{\hat k}$.
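Theorem 2.1. Under some technical conditions, there exists a
constant $C$ which depends on $r$ and on $T$, such that the following
inequality holds true
(9)   $E\|\hat\varphi_{\hat k} - \varphi\|^2 \le 2\|(I - \Pi_{X_{m_0}})\varphi\|^2 + C\inf_{k \in \mathcal{K}_n}\bigl\{\|R_kT\varphi - \varphi\|^2 + 2\,\mathrm{pen}(k)\bigr\} + \frac{\Sigma(d)}{n},$
where we have set
$$\Sigma(d) = \sum_{k \in \mathcal{K}_n}\sigma^2\Bigl\{\frac{d\,\mathrm{Tr}(R_k^tR_k)}{\rho^2(R_k)} + 1\Bigr\}\bigl[d\,\rho^2(nR_k)\bigr]^{-1} e^{-dL_k[\mathrm{Tr}(R_k^tR_k) + \rho^2(R_k)]/\rho^2(R_k)},$$
for $d$ properly chosen.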
Hence, the estimator is optimal in the sense that the adaptive
estimator achieves the best rate of convergence among all the regularized
estimators, up to an error of order $\mathrm{pen}(k)$ and $\Sigma(d)/n$. This bound
is non-asymptotic and the rate of convergence depends on both
previous terms.
We also point out that $\rho^2(nR_k)$ and $\mathrm{Tr}(R_k^tR_k)/\rho^2(R_k)$ do not
depend on $n$.
The main ingredients of the proof can be found in [LL08, LL10].
When the operator $T$ is unknown, one could be tempted to use
the same ideas, just replacing $T$ by an estimator $\hat T$. However, the
whole procedure becomes more difficult since the term in $R_kT$ cannot be
bounded as easily as previously. Recent results on concentration for
random matrices provide some hope of extending this general adaptive
procedure to these cases, but work is still in progress. However, in
the following section, we provide a general methodology built using
the SVD of the operator.
3. An oracle inequality with partially known SVD
In this part, we assume that the SVD is partially known in the sense
that the basis of eigenvectors $(\phi_j)$ is known but that the eigenvalues,
the $\lambda_j$'s, are not observed. This assumption, though restrictive, still
makes it possible to handle some useful cases. It will be discussed in
detail at the end of section 4.
3.1. General estimation approach. This case is inspired by the
pioneering work of [CH05]. It is fully described in [LM09].
We can write the following decompositions
(10)   $r(w) = E(Y|W = w) = T\varphi(w) = \sum_{j \ge 1}\lambda_j\langle\varphi, \phi_j\rangle\psi_j(w),$
(11)   and $r(w) = \sum_{j \ge 1} r_j\psi_j(w),$
with $r_j = \langle Y, \psi_j\rangle_W$ that can thus be estimated by
$$\hat r_j = \frac{1}{n}\sum_{i=1}^n Y_i\psi_j(W_i).$$
Hence the noisy observations are the $\hat r_j$'s, which will be used to
estimate the regression function in an inverse problem framework.
Note first that, if the operator were known we could provide an
estimator using the spectral decomposition of the function $\varphi$ as follows.
For a given decomposition level $m$, define the projection estimator
(also called spectral cut-off [EHN96])
(12)   $\hat\varphi^0_m = \sum_{j=1}^m \frac{\hat r_j}{\lambda_j}\phi_j.$
Since the $\lambda_j$'s are unknown, our first task is to build an estimator of
the eigenvalues. For this, using the decomposition (10), we obtain
$\lambda_j = \langle T\phi_j, \psi_j\rangle_W = E[T\phi_j(W)\psi_j(W)]$
$\phantom{\lambda_j} = E[E[\phi_j(X)|W]\psi_j(W)]$
$\phantom{\lambda_j} = E[\phi_j(X)\psi_j(W)].$   (13)
So, following (13), a natural estimator for the eigenvalue $\lambda_j$ is given
by
(14)   $\hat\lambda_j = \frac{1}{n}\sum_{i=1}^n \psi_j(W_i)\phi_j(X_i).$
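A minimal simulation can illustrate the estimator (14) in the deconvolution example of Section 1.2. The sketch rests on assumptions that are not in the paper: $Z$ is taken Gaussian with a small scale `sigma` and $W = X + Z$ is reduced modulo 1 to stay in the 1-periodic setting, so that $\psi_j = \phi_j$ and $\lambda_{2p} = \lambda_{2p+1} = E\cos(2\pi pZ) = e^{-2\pi^2p^2\sigma^2}$. This toy operator is severely rather than mildly ill-posed; it is used only because its singular values are known in closed form.

```python
import math
import random

random.seed(1)
n, sigma = 20000, 0.1  # sample size and scale of Z (both arbitrary)

def phi(k, t):
    # Trigonometric basis of the deconvolution example (phi_1 = 1, then cos/sin pairs).
    if k == 1:
        return 1.0
    p = k // 2
    trig = math.cos if k % 2 == 0 else math.sin
    return math.sqrt(2.0) * trig(2 * math.pi * p * t)

# X uniform on [0, 1], W = X + Z modulo 1 with Z symmetric, so that psi_j = phi_j here.
X = [random.random() for _ in range(n)]
W = [(x + random.gauss(0.0, sigma)) % 1.0 for x in X]

def lambda_hat(j):
    # Empirical version of lambda_j = E[phi_j(X) psi_j(W)], as in (14).
    return sum(phi(j, w) * phi(j, x) for x, w in zip(X, W)) / n

def lambda_true(j):
    # E[cos(2 pi p Z)] for Gaussian Z: exp(-2 pi^2 p^2 sigma^2), with p = j // 2.
    p = j // 2
    return math.exp(-2.0 * (math.pi * p * sigma) ** 2)

for j in range(1, 6):
    print(j, round(lambda_hat(j), 3), round(lambda_true(j), 3))
```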
As studied in [CH05], replacing directly the eigenvalues by their
estimates in (12) does not yield a consistent estimator; hence, using the
same strategy, we define an upper bound for the resolution level
(15)   $M = \inf\bigl\{k \le N : |\hat\lambda_k| \le \tfrac{1}{\sqrt{n}}\log n\bigr\} - 1,$
for $N$ any integer chosen greater than $n$. The parameter $N$ provides
an upper bound for $M$ in order to ensure that $M$ is not too large.
The main idea behind this definition is that when the estimates of the
eigenvalues are too small with respect to the observation noise, trying
to still provide an estimate of the inverse $\lambda_k^{-1}$ only amplifies the
estimation error. To avoid this trouble, we truncate the sequence
of the estimated eigenvalues when their estimate is too small, i.e.
smaller than the noise level. We point out that this parameter $M$
is a random variable which we will have to control. More precisely,
define two deterministic lower and upper bounds $M_0, M_1$ as
(16)   $M_0 = \inf\bigl\{k : |\lambda_k| \le \tfrac{1}{\sqrt{n}}\log^2 n\bigr\} - 1,$
and
(17)   $M_1 = \inf\bigl\{k : |\lambda_k| \le \tfrac{1}{\sqrt{n}}\log^{3/4} n\bigr\},$
we can show that with high probability $M_0 \le M < M_1$, as proved in
Lemma 5.1. Note that if in the definition (15) the set is empty, we
set $M = 0$. However, from the remark above, this case happens with
very small probability.
Now, thresholding the spectral decomposition in (12) leads to the
following estimator
(18)   $\hat\varphi_m = \sum_{j=1}^m \frac{\hat r_j}{\hat\lambda_j}\,\mathbf{1}_{j \le M}\,\phi_j.$
The asymptotic behaviour of this estimate depends on the choice of
$m$. In the next section, we provide an optimal procedure to select
the parameter $m$ that gives rise to an adaptive estimator $\varphi^\star$ and an
oracle inequality.
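The whole chain (coefficients $\hat r_j$, singular values $\hat\lambda_j$, cut-off $M$, thresholded estimator) can be sketched on simulated data. Everything specific below is an assumption made for illustration: the periodic deconvolution design with Gaussian $Z$ and $W = X + Z$ modulo 1, the target function, the endogenous error $U$ built so that $E(U|W) = 0$ while $E(U|X) \ne 0$, the threshold $\log n/\sqrt{n}$ in the cut-off, and all function names.

```python
import math
import random

random.seed(2)
n, sigma, N = 20000, 0.1, 15  # sample size, scale of Z, largest index tried (assumptions)

def phi(k, t):
    # Trigonometric basis: phi_1 = 1, phi_{2p} = sqrt(2) cos, phi_{2p+1} = sqrt(2) sin.
    if k == 1:
        return 1.0
    trig = math.cos if k % 2 == 0 else math.sin
    return math.sqrt(2.0) * trig(2 * math.pi * (k // 2) * t)

def regression(x):
    # Hypothetical target: coefficient 1.0 on phi_1 and 0.5 on phi_2.
    return 1.0 * phi(1, x) + 0.5 * phi(2, x)

lam2 = math.exp(-2.0 * (math.pi * sigma) ** 2)  # singular value lambda_2 of this toy model
X, W, Y = [], [], []
for _ in range(n):
    x = random.random()
    w = (x + random.gauss(0.0, sigma)) % 1.0   # W = X + Z mod 1, Z symmetric => psi_j = phi_j
    u = 0.5 * (phi(2, x) - lam2 * phi(2, w))   # E(U | W) = 0 but E(U | X) != 0
    X.append(x)
    W.append(w)
    Y.append(regression(x) + u)

r_hat = {j: sum(y * phi(j, w) for y, w in zip(Y, W)) / n for j in range(1, N + 1)}
lam_hat = {j: sum(phi(j, w) * phi(j, x) for x, w in zip(X, W)) / n for j in range(1, N + 1)}

# Cut-off: first index whose estimated singular value is below log(n)/sqrt(n), minus 1.
threshold = math.log(n) / math.sqrt(n)
small = [k for k in range(1, N + 1) if abs(lam_hat[k]) <= threshold]
M = (min(small) - 1) if small else 0  # empty set: M = 0, following the text

def estimator(m):
    # Thresholded spectral cut-off: coefficients r_hat_j / lam_hat_j for j <= min(m, M).
    coeffs = {j: r_hat[j] / lam_hat[j] for j in range(1, min(m, M) + 1)}
    return coeffs, (lambda t: sum(c * phi(j, t) for j, c in coeffs.items()))

coeffs, phi_hat = estimator(3)
print("M =", M, "| estimated coefficients:", {j: round(c, 3) for j, c in coeffs.items()})
```

Despite the correlation between $U$ and $X$, the ratios $\hat r_j/\hat\lambda_j$ recover the coefficients of the target because only the moment condition $E(U|W) = 0$ enters $E[Y\psi_j(W)]$.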
3.2. Oracle inequality. All the estimation errors will be given with
respect to the $L^2_X$ norm, which is a natural choice for this kind of
problem. Another possibility would have been to place the issue in
$L^2([0, 1])$.
First, let $R_0(m, \varphi)$ be the quadratic estimation risk for the naive
estimator $\hat\varphi^0_m$ (12), defined for all $m \in \mathbb{N}$ by
$$R_0(m, \varphi) = E\|\hat\varphi^0_m - \varphi\|^2_X = \sum_{k > m}\varphi_k^2 + \frac{1}{n}\sum_{k=1}^m \lambda_k^{-2}\sigma_k^2, \quad \forall m \in \mathbb{N},$$
with $\sigma_k^2 = \mathrm{Var}(Y\psi_k(W))$. The best model would be obtained by
choosing a minimizer of this quantity, namely
(19)   $m_0 = \arg\min_m R_0(m, \varphi).$
This risk depends on the unknown function $\varphi$, hence $m_0$ is referred
to as the oracle. We aim at constructing an estimator of $R_0(m, \varphi)$
which, by minimization, could give rise to a convenient choice for $m$,
i.e. as close as possible to $m_0$. The first step would be to replace $\varphi_k$
by their estimates $\hat\lambda_k^{-1}\hat r_k$ and take for estimator of $\sigma_k^2$, $\hat\sigma_k^2$, defined by
$$\hat\sigma_k^2 = \frac{1}{n}\sum_{i=1}^n\Bigl(Y_i\psi_k(W_i) - \frac{1}{n}\sum_{i=1}^n Y_i\psi_k(W_i)\Bigr)^2 = \frac{1}{n}\sum_{i=1}^n (Y_i\psi_k(W_i) - \hat r_k)^2.$$
This would lead us to consider the empirical risk for any $m \le M$, the
cut-off which warrants a good behaviour for the $\hat\lambda_j$'s,
$$U_0(m, \hat r, \hat\lambda) = -\sum_{k=1}^m \hat\lambda_k^{-2}\hat r_k^2 + \frac{c}{n}\sum_{k=1}^m \hat\lambda_k^{-2}\hat\sigma_k^2, \quad \forall m \in \mathbb{N},$$
for a well chosen constant $c$. The corresponding random oracle within
the range of models which are considered would be
(20)   $m_1 = \arg\min_{m \le M} R_0(m, \varphi).$
Unfortunately, the correlation between the errors $V_i$ and the
observations $Y_i$ prevents an estimator defined as a minimizer of $U_0(m, \hat r, \hat\lambda)$
from achieving the quadratic risk $R_0(m, \varphi)$. Indeed, we have to use a
stronger penalty, leading to an extra error in the estimation that
shall be discussed later in the paper. More precisely, $c$ in the penalty
is not a constant anymore but is allowed to depend on the number
of observations $n$.
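Hence, we now define the penalized estimation risk $R(m, \varphi)$ as
$$R(m, \varphi) = \sum_{k>m} \varphi_k^2 + \frac{\log^2 n}{n}\sum_{k=1}^{m} \lambda_k^{-2}\sigma_k^2, \quad m \in \mathbb{N}. \tag{21}$$
The best choice for $m$ would be a minimizer of this quantity, which
however depends on the unknown regression function $\varphi$. Hence, to mimic
this risk, define the following empirical criterion:
$$U(m, \hat r, \hat\lambda) = -\sum_{k=1}^{m} \hat\lambda_k^{-2}\hat r_k^2 + \frac{\log^2 n}{n}\sum_{k=1}^{m} \hat\lambda_k^{-2}\hat\sigma_k^2, \quad m \in \mathbb{N}. \tag{22}$$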
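Then, the best estimator is selected by minimizing this quantity as
follows:
$$m^\star := \arg\min_{m \le M} U(m, \hat r, \hat\lambda). \tag{23}$$
Finally, the corresponding adaptive estimator $\hat\varphi^\star$ is defined as
$$\hat\varphi^\star = \sum_{k=1}^{m^\star} \hat\lambda_k^{-1}\hat r_k\,\phi_k. \tag{24}$$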
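The selection rule (23) and the estimator (24) can be illustrated numerically. The Python fragment below is only a sketch under stated assumptions, not the authors' implementation: the function name `select_model`, the list-based inputs, and the cut-off rule for $M$ (the number of leading estimated eigenvalues whose absolute value stays above $\log n/\sqrt{n}$, standing in for definition (15)) are hypothetical choices made for the example. The criterion is minimized over $0 \le m \le M$, with the convention that an empty sum equals 0.

```python
import math

def select_model(r_hat, lam_hat, sigma2_hat, n):
    """Return (M, m_star, coeffs) for a thresholded, penalized selection rule.

    r_hat[k-1], lam_hat[k-1], sigma2_hat[k-1] hold the estimates for index k.
    """
    thresh = math.log(n) / math.sqrt(n)
    # Assumed cut-off: count leading eigenvalue estimates with |lam_hat| >= thresh.
    M = 0
    for lam in lam_hat:
        if abs(lam) < thresh:
            break
        M += 1
    pen = math.log(n) ** 2 / n          # penalty factor log^2(n) / n
    u, best_u, m_star = 0.0, 0.0, 0     # criterion at m = 0 is 0 (empty sums)
    for m in range(1, M + 1):
        lam, r, s2 = lam_hat[m - 1], r_hat[m - 1], sigma2_hat[m - 1]
        # Add the m-th term: -(r/lam)^2 + pen * sigma^2 / lam^2.
        u += -(r / lam) ** 2 + pen * s2 / lam ** 2
        if u < best_u:
            best_u, m_star = u, m
    # Coefficients of the selected estimator on the first m_star basis functions.
    coeffs = [r_hat[k] / lam_hat[k] for k in range(m_star)]
    return M, m_star, coeffs
```

In this sketch the penalty term grows quickly once the estimated eigenvalues become small, so indices with a weak signal $\hat r_k/\hat\lambda_k$ are left out of the selected model even when they pass the threshold.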
The performances of
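$$E\|\hat\varphi^\star - \varphi\|_X^2 \le B_0 \log^2(n)\Big[\inf_{m} R(m, \varphi)\Big] + \frac{B_1}{n}\Big[\log(n)\,\|\varphi\|_X^2\Big]^{2t} + \Omega + \log^2(n)\,\Gamma(\varphi),$$
where $\Omega \le B_2(1 + \|\varphi\|_X^2)\exp\{-\log^{1+\tau} n\}$, $m_0$ denotes the oracle
bandwidth and
$$\Gamma(\varphi) = \sum_{k=\min(M_0, m_0)}^{m_0}\Big[\varphi_k^2 + \frac{1}{n}\lambda_k^{-2}\sigma_k^2\Big], \tag{25}$$
with the convention $\sum_{a}^{b} = 0$ if $a = b$.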
We obtain a non-asymptotic inequality which guarantees that the
estimator achieves the optimal bound, up to a logarithmic factor,
among all the estimators that could be constructed. We point out
that we lose a $\log^2(n)$ factor when compared with the bound obtained
in [CH05]. This loss comes partly from the fact that the error on the
operator is neither deterministic nor even due to an independent noisy
observation of the eigenvalues. Here, the $\lambda_k$'s have to be estimated
from the available data by $\hat\lambda_k$. In the econometric model, both the
operator and the regression function are estimated on the same sample,
which leads to high correlation effects that are made explicit in
Model (4), hampering the rate of convergence of the corresponding
estimator.
An oracle inequality only provides some information on the asymptotic
behaviour of the estimator if the remainder term $\Gamma(\varphi)$ is
of smaller order than the risk of the oracle. This remainder term
models the error made when truncating the eigenvalues, i.e. the error
made when selecting a model close to the random oracle $m_1 \le M$ and not
the true oracle $m_0$. In the next section, we prove that, under some
assumptions, this extra term is smaller than the risk of the estimator.
Proof. The full proof of this result can be found in [LM09]. We
provide here the general ideas. First, the decay of the eigenvalues and
of the estimated eigenvalues is controlled in probability as follows. Set
$\mathcal{M} = \{M_0 \le M < M_1\}$, where $M$, $M_0$, $M_1$ are respectively defined in
(15), (16) and (17). Then, for all $n \ge 1$,
$$P(\mathcal{M}^c) \le C M_0\, e^{-\log^{1+\tau} n},$$
where $C$ and $\tau$ denote positive constants independent of $n$, as proved
in Lemma 5.1.
Then, the proof of our main result can be decomposed into four steps.
First, we prove that the quadratic risk of $\hat\varphi^\star$ is close, up to
some residual terms, to $E\bar R(m^\star, \varphi)$ where
$$\bar R(m, \varphi) = \sum_{k>m} \varphi_k^2 + \frac{\log^2 n}{n}\sum_{k=1}^{m} \hat\lambda_k^{-2}\sigma_k^2, \quad m \in \mathbb{N}. \tag{26}$$
This result is uniform in $m$ and justifies our choice of $\bar R(m, \varphi)$ as a
criterion for the bandwidth selection.
Second, we show that $E\bar R(m^\star, \varphi)$ and $EU(m^\star, \hat r, \hat\lambda)$ are
in some sense comparable. Then, according to the definition of $m^\star$
in (23),
$$U(m^\star, \hat r, \hat\lambda) \le U(m, \hat r, \hat\lambda), \quad \forall m \le M.$$
We conclude the proof by proving that for all $m \le M$, $EU(m, \hat r, \hat\lambda) = E\|\hat\varphi_m - \varphi\|^2$, up to a log term and some residual terms.
Some additional assumptions are required on both the data $Y_i$, $i = 1, \dots, n$,
and the eigenfunctions $\phi_k$ and $\psi_k$ for $k \ge 1$.
Bounded SVD functions: There exists a finite constant $C_1$
such that
$$\forall j \ge 1, \quad \|\phi_j\|_\infty < C_1, \quad \|\psi_j\|_\infty < C_1. \tag{27}$$
Requiring bounded SVD functions may be seen as a restrictive condition.
Yet it is met when the eigenvectors are trigonometric functions.
Moreover, this condition can also be turned into a moment condition
if we replace the concentration bound by a Bernstein type
inequality. Note also that the moment conditions on $Y$ amount to
requiring a bounded regression function and equivalent moment conditions
on the errors $U_j$.
Enough ill-posedness: Let $\sigma_j^2 = \mathrm{Var}(Y\psi_j(W))$. We assume
that there exist two positive constants $\sigma_L^2$ and $\sigma_U^2$ such
that
$$\forall j \ge 1, \quad \sigma_L^2 \le \sigma_j^2 \le \sigma_U^2. \tag{28}$$
Note that Condition (6) implies the upper bound of Condition (28),
which is also a direct consequence of Assumption A.2 in [HH05]. Both
the upper and lower bounds are similar to Assumption 4.1 and the
variance condition in Assumption 3.1 in [CR08]. We also point out
that this condition is not needed when building an estimator for the
regression function. However, it becomes necessary when obtaining the
lower bound to get a minimax result, or when obtaining an oracle
inequality.
3.3. Rate of convergence. To get a rate of convergence for the
estimator, we need to specify the regularity of the unknown function
$\varphi$ and compare it with the degree of ill-posedness of the operator $T$,
following the usual conditions in the statistical literature on inverse
problems; see for example [MR96], [CT02] or [BHMR07].
Regularity Condition: Assume that the function $\varphi$ is such
that there exist $s$ and a constant $C$ such that
$$\sum_{k \ge 1} k^{2s}\varphi_k^2 < C. \tag{29}$$
This Assumption corresponds to functions whose regularity is gov-
erned by the smoothness index s. This parameter is unknown and
yet governs the rate of convergence. In the special cases where the
eigenfunctions are the Fourier basis, this set corresponds to Sobolev
classes. We prove that our estimator achieves the optimal rate of
convergence without prior assumption on s.
Corollary 3.2. Let
|
2
X
= O
_
_
n
log
2
n
_ 2s
2s+2t+1
_
,
with = 2 + 2s + 2t.
We point out that
k=1
exp(2k
t
)
2
k
< C.
Following the guidelines of the proof of Corollary 3.2 and Theorem
2.1, we obtain that M
0
> m
0
(a2 log n)
1/t
with 2a > 1, leading
to the optimal recovery rate for super smooth functions in inverse
problems.
4. Conclusion and comments
In conclusion, this work shows that, provided the eigenvectors are
known, for smooth functions $\varphi$, estimating the eigenvalues and using
a threshold suffices to get a good estimator of the regression function
in the instrumental variable framework. The price to pay for not
knowing the operator is only an extra $\log^2 n$ with respect to usual inverse
problems and is only due to the correlation induced by the $V_i$'s.
Note that this log term could be avoided by splitting the data.
One may use a training set for the construction of the bandwidth
$m^\star$ and the remaining data for the recovery of $\varphi$. In this case, the
quadratic risks of both our estimator and the oracle are comparable,
up to some computable constant. Nevertheless, this approach is not
satisfying from a mathematical point of view since the underlying
problem of adaptation is hidden.
One could object that the knowledge of the eigenvectors is a huge
hint and thus, the operator is not totally unknown. Still, in the
following examples, we present a class of cases where this situation
happens, mainly when the relationship between the variable $X$ and
the instrument $W$ has a particular form. However, some papers have
considered the case of completely unknown operators, using a functional
approach, see for instance [CFR06], but their estimates clearly
rely on smoothness assumptions for the regression. Hence the two
approaches are complementary since we provide more refined adaptive
results under stronger assumptions. Nevertheless, using similar
techniques to develop a fully adaptive estimation procedure would be
a next step toward a full understanding of the IV regression model.
To our knowledge, we provide the first adaptive estimation procedure
for IV regression in some particular cases which nevertheless present
some interest from an econometric point of view. We are aware that
we do not handle the estimation problem in the general case but this
work only claims to be a first step towards an adaptive estimation
procedure for this difficult problem.
5. Appendix
Lemma 5.1. Set $\mathcal{M} = \{M_0 \le M < M_1\}$, where $M$, $M_0$, $M_1$ are
respectively defined in (15), (16) and (17). Then, for all $n \ge 1$,
$$P(\mathcal{M}^c) \le C M_0\, e^{-\log^{1+\tau} n},$$
where $C$ and $\tau$ denote positive constants independent of $n$.
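PROOF. It is easy to see that
$$P(\mathcal{M}^c) = P(\{M < M_0\} \cup \{M \ge M_1\}) \le P(M < M_0) + P(M \ge M_1).$$
Using (15) and (17),
$$P(M \ge M_1) = P\Big(\bigcap_{k=1}^{M_1}\Big\{|\hat\lambda_k| \ge \frac{1}{\sqrt{n}}\log n\Big\}\Big) \le P\Big(|\hat\lambda_{M_1}| \ge \frac{1}{\sqrt{n}}\log n\Big).$$
The definition of $\hat\lambda_{M_1}$ yields
$$P(M \ge M_1) \le P\Big(\big|\hat\lambda_{M_1} - \lambda_{M_1} + \lambda_{M_1}\big| \ge \frac{1}{\sqrt{n}}\log n\Big)$$
$$\le P\Big(\big|\hat\lambda_{M_1} - \lambda_{M_1}\big| \ge \frac{1}{\sqrt{n}}\log n - |\lambda_{M_1}|\Big)$$
$$\le P\Big(\Big|\frac{1}{n}\sum_{i=1}^{n} \phi_{M_1}(X_i)\psi_{M_1}(W_i) - E[\phi_{M_1}(X)\psi_{M_1}(W)]\Big| \ge b_n\Big),$$
where $b_n = n^{-1/2}\log n - |\lambda_{M_1}|$ for all $n \in \mathbb{N}$. Let $k \in \mathbb{N}$ and $x \in [0, 1]$
be fixed. Assumption (27) and Hoeffding inequality yield
$$P(|\hat\lambda_k - \lambda_k| > x) \le 2\exp\Big\{-\frac{(nx)^2}{2\sum_{i=1}^{n}\mathrm{Var}(\phi_{M_1}(X_i)\psi_{M_1}(W_i)) + 2nCx/3}\Big\}$$
$$= 2\exp\Big\{-\frac{nx^2}{2\mathrm{Var}(\phi_{M_1}(X)\psi_{M_1}(W)) + 2Cx/3}\Big\}.$$
Using again the assumption (27) on the bases $(\phi_k)_{k \in \mathbb{N}}$ and $(\psi_k)_{k \in \mathbb{N}}$,
$$\mathrm{Var}(\phi_{M_1}(X)\psi_{M_1}(W)) \le E[\phi^2_{M_1}(X)\psi^2_{M_1}(W)] \le C_1^4.$$
Hence,
$$P(|\hat\lambda_k - \lambda_k| > x) \le 2\exp\{-Cnx^2\}, \quad \forall x \in [0, 1], \tag{30}$$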
for some constant $C$ depending on $C_1$ but independent of $n$. Using
(17), we obtain $1 > b_n > 0$ for all $n \in \mathbb{N}$. Therefore, using (30) with
$x = b_n$, we obtain
$$P(M \ge M_1) \le 2\exp\{-Cnb_n^2\} \le 2\exp\{-C(\log n - \log^{3/4} n)^2\} \le C\exp\{-\log^{1+\tau} n\},$$
where $C$ and $\tau$ denote positive constants independent of $n$.
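The bound of $P(M < M_0)$ follows the same lines:
$$P(M < M_0) = P\Big(\bigcup_{j=1}^{M_0}\Big\{|\hat\lambda_j| \le \frac{\log n}{\sqrt{n}}\Big\}\Big) \le \sum_{j=1}^{M_0} P\Big(|\hat\lambda_j| \le \frac{\log n}{\sqrt{n}}\Big) \le \sum_{j=1}^{M_0} P\Big(\hat\lambda_j \le \frac{\log n}{\sqrt{n}}\Big).$$
Let $j \in \{1, \dots, M_0\}$ be fixed. Then
$$P\Big(\hat\lambda_j \le \frac{\log n}{\sqrt{n}}\Big) = P\big(\hat\lambda_j - \lambda_j \le \tilde b_{n,j}\big),$$
where $\tilde b_{n,j} = n^{-1/2}\log n - \lambda_j$ for all $n \in \mathbb{N}$. Thanks to (16), $\tilde b_{n,j} < 0$
for all $n \in \mathbb{N}$. Using (30) with $x = -\tilde b_{n,j}$, we get
$$P\Big(\hat\lambda_j \le \frac{\log n}{\sqrt{n}}\Big) \le \exp\{-Cn\tilde b_{n,j}^2\} \le C\exp\{-\log^{1+\tau} n\},$$
for some $C, \tau > 0$. This concludes the proof of Lemma 5.1. $\square$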
Proof of Corollary 3.2 We start by recalling the oracle inequal-
ity obtained for the estimator
.
E|
|
2
C
0
log
2
(n).
_
inf
m
R(m, )
_
+
C
1
n
_
log(n).||
2
_
2
+ + log
2
(n).(),
We have to bound the risk under the regularity condition and the
extra term $\log^2(n)\,\Gamma(\varphi)$. Recall that the risk is given by
$$R(m, \varphi) = \sum_{k>m} \varphi_k^2 + \frac{\log^2 n}{n}\sum_{k=1}^{m} \lambda_k^{-2}\sigma_k^2.$$
Hence under (29), we obtain the following upper bounds for two constants
$C_1$ and $C_2$:
$$\sum_{k>m} \varphi_k^2 \le m^{-2s} C_1, \qquad \frac{\log^2 n}{n}\sum_{k=1}^{m} \lambda_k^{-2}\sigma_k^2 \le C_2\,\frac{\log^2 n}{n}\,\sigma_U^2\, m^{2t+1}.$$
An optimal choice is given by $m = [(n/\log n)^{\frac{1}{1+2s+2t}}]$, leading to the
desired rate of convergence.
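The trade-off between the bias bound $m^{-2s}$ and the variance bound $(\log^2 n/n)\,m^{2t+1}$ can be checked numerically. The following Python sketch is an illustration only: the constants $C_1$, $C_2$ and $\sigma_U^2$ are all set to 1, and the function names `risk_bound`, `best_m` and `rate_order` are hypothetical. It minimizes the sum of the two bounds over integers by brute force and compares the result with the order $(n/\log^2 n)^{1/(1+2s+2t)}$ obtained by equating the two terms.

```python
import math

def risk_bound(m, n, s, t):
    """Bias bound m^(-2s) plus variance bound (log^2 n / n) * m^(2t+1), constants set to 1."""
    return m ** (-2 * s) + (math.log(n) ** 2 / n) * m ** (2 * t + 1)

def best_m(n, s, t, m_max=1000):
    """Integer m in 1..m_max minimizing the risk bound (brute-force search)."""
    return min(range(1, m_max + 1), key=lambda m: risk_bound(m, n, s, t))

def rate_order(n, s, t):
    """Order of the balancing bandwidth, (n / log^2 n)^(1 / (1 + 2s + 2t))."""
    return (n / math.log(n) ** 2) ** (1.0 / (1 + 2 * s + 2 * t))
```

For instance, calling `best_m(10**6, 1, 1)` and `rate_order(10**6, 1, 1)` gives values of the same magnitude; they differ only through the constants that the order-of-magnitude computation ignores.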
Now consider the remainder term $\Gamma(\varphi)$. Under Assumption [IP],
$M_0 \ge [n^{1/2s}/\log^2 n]$, but since $m_0 = [n^{\frac{1}{1+2s+2t}}]$ we clearly get that
$m_0 \le M_0$, which entails that $\Gamma(\varphi) = 0$.
References
[BHMR07] N. Bissantz, T. Hohage, A. Munk, and F. Ruymgaart. Convergence
rates of general regularization methods for statistical inverse problems
and applications. SIAM J. Numer. Anal., 45(6):2610-2636 (electronic), 2007.
[Cav08] L. Cavalier. Nonparametric statistical inverse problems. Inverse Problems,
24(3):034004, 19, 2008.
[CFR06] M. Carrasco, J-P. Florens, and E. Renault. Linear Inverse Problems
in Structural Econometrics: Estimation Based on Spectral Decomposition
and Regularization, volume 6. North Holland, 2006.
[CGPT02] L. Cavalier, G. K. Golubev, D. Picard, and A. B. Tsybakov. Oracle
inequalities for inverse problems. Ann. Statist., 30(3):843-874, 2002.
Dedicated to the memory of Lucien Le Cam.
[CH05] L. Cavalier and N. W. Hengartner. Adaptive estimation for inverse
problems with noisy operators. Inverse Problems, 21(4):1345-1361,
2005.
[CHR03] A. Cohen, M. Hoffmann, and M. Reiss. Adaptive wavelet Galerkin
methods for linear inverse problems. SIAM, 1(3):323-354, 2003.
[CR08] X. Chen and M. Reiss. On rate optimality for ill-posed inverse problems
in econometrics. Cowles Foundation Discussion Paper No. 1626,
2008.
[CT02] L. Cavalier and A. Tsybakov. Sharp adaptation for inverse problems
with random noise. Probab. Theory Related Fields, 123(3):323-354,
2002.
[EHN96] H. Engl, M. Hanke, and A. Neubauer. Regularization of inverse problems,
volume 375 of Mathematics and its Applications. Kluwer Academic
Publishers Group, Dordrecht, 1996.
[EK01] S. Efromovich and V. Koltchinskii. On inverse problems with unknown
operators. IEEE Trans. Inform. Theory, 47(7):2876-2894,
2001.
[HH05] P. Hall and J. L. Horowitz. Nonparametric methods for inference
in the presence of instrumental variables. Ann. Statist., 33(6):2904-2929,
2005.
[HR08] M. Hoffmann and M. Reiss. Nonlinear estimation for linear inverse
problems with error in the operator. Ann. Statist., 36(1):310-336,
2008.
[LL08] J-M. Loubes and C. Ludeña. Adaptive complexity regularization for
inverse problems. Electronic Journal of Statistics, 2:661-677, 2008.
[LL10] J-M. Loubes and C. Ludeña. Adaptive complexity regularization for
inverse problems. ESAIM Probab. Statist., 2:661-677, 2010.
[LM09] Jean-Michel Loubes and Clément Marteau. Oracle Inequality for Instrumental
Variable Regression. hal-00356428, 2009. 62G05; 62G20.
[LR09] Jean-Michel Loubes and Vincent Rivoirard. Review of rates of convergence
and regularity conditions for inverse problems. Int. J. Tomogr.
Stat., 11(W09):61-82, 2009.
[Mar06] C. Marteau. Regularization of inverse problems with unknown operator.
Math. Methods Statist., 15(4):415-443, 2006.
[Mar09] C. Marteau. On the stability of the risk hull method. Journal of
Statistical Planning and Inference, (139):1821-1835, 2009.
[MR96] B. Mair and F. Ruymgaart. Statistical inverse estimation in Hilbert
scales. SIAM J. Appl. Math., 56(5):1424-1444, 1996.
[NP03] W. K. Newey and J. L. Powell. Instrumental variable estimation of
nonparametric models. Econometrica, 71(5):1565-1578, 2003.
[OS86] F. O'Sullivan. A statistical perspective on ill-posed inverse problems.
Statist. Sci., 1(4):502-527, 1986. With comments and a rejoinder by
the author.
Equipe de probabilit
ematique
de Toulouse, UMR5219, Universit
ematique
de Toulouse, UMR5219, Universit
.
2. Semisimple Hopf algebras and fusion categories
A Hopf algebra is called semisimple (respectively, cosemisimple)
if it is semisimple as an algebra (respectively, if it is cosemisimple
as a coalgebra). A semisimple Hopf algebra is automatically finite
dimensional. Let $H$ be a finite-dimensional Hopf algebra over $k$. By
a result of Larson and Radford, it is known that $H$ is semisimple if
and only if $H$ is cosemisimple, if and only if $\mathcal{S}^2 = \mathrm{id}$ [LR, LR2].
2.1. Rep H as a tensor category. Let $H$ be a finite dimensional
Hopf algebra. The category $\operatorname{Rep} H$ of its finite dimensional representations
is a finite tensor category with tensor product given by the
diagonal action of $H$ and unit object $k$. The antipode implements
the $H$-action on the dual vector space.
Finite tensor categories of the form $\operatorname{Rep} H$ are characterized, using
tannakian reconstruction arguments, as those possessing a fiber functor
with values in the category of vector spaces over $k$. The forgetful
functor $\operatorname{Rep} H \to \operatorname{Vec}_k$ is a fiber functor and other fiber functors correspond
to twisting the comultiplication of $H$ in the following sense.
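Definition 2.1. A twist in $H$ is an invertible element $J \in H \otimes H$
satisfying:
$$(\Delta \otimes \mathrm{id})(J)(J \otimes 1) = (\mathrm{id} \otimes \Delta)(J)(1 \otimes J), \tag{2.1}$$
$$(\epsilon \otimes \mathrm{id})(J) = 1 = (\mathrm{id} \otimes \epsilon)(J). \tag{2.2}$$
Dually, an invertible normalized 2-cocycle on $H$ is a convolution invertible
linear map $\sigma : H \otimes H \to k$, such that, for all $g, h, t \in H$,
$$\sigma(h_{(1)}, g_{(1)})\,\sigma(t, h_{(2)}g_{(2)}) = \sigma(t_{(1)}, h_{(1)})\,\sigma(t_{(2)}h_{(2)}, g), \tag{2.3}$$
$$\sigma(h, 1) = \epsilon(h) = \sigma(1, h). \tag{2.4}$$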
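For a group algebra $kG$, where every group element is group-like, the cocycle conditions reduce to $\sigma(h, g)\sigma(t, hg) = \sigma(t, h)\sigma(th, g)$ and $\sigma(h, 1) = 1 = \sigma(1, h)$ on group elements. The Python sketch below checks these identities by brute force for one concrete choice that is not taken from the text: $G = \mathbb{Z}_2 \times \mathbb{Z}_2$ with the bicharacter $\sigma((a, b), (c, d)) = (-1)^{ad}$. The names `mul`, `sigma` and `is_normalized_2_cocycle` are hypothetical helpers introduced only for this illustration.

```python
from itertools import product

# Elements of Z_2 x Z_2, written additively as pairs of bits.
G = list(product((0, 1), repeat=2))
E = (0, 0)  # identity element

def mul(g, h):
    """Group law of Z_2 x Z_2 (componentwise addition mod 2)."""
    return ((g[0] + h[0]) % 2, (g[1] + h[1]) % 2)

def sigma(g, h):
    """Illustrative bicharacter sigma((a, b), (c, d)) = (-1)^(a*d)."""
    return (-1) ** (g[0] * h[1])

def is_normalized_2_cocycle(sigma, G, mul, e):
    """Check the cocycle and normalization identities on all group elements."""
    cocycle = all(
        sigma(h, g) * sigma(t, mul(h, g)) == sigma(t, h) * sigma(mul(t, h), g)
        for t, h, g in product(G, repeat=3)
    )
    normalized = all(sigma(h, e) == 1 == sigma(e, h) for h in G)
    return cocycle and normalized
```

Any bicharacter passes this check; the chosen one is not symmetric, which is why it is a convenient small test case.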
If $J \in H \otimes H$ is a twist, then $(H^J, m, \Delta^J, S^J)$ is a Hopf algebra with
$H^J = H$ as algebras, $\Delta^J(h) = J^{-1}\Delta(h)J$, and $S^J(h) = v^{-1}S(h)v$,
for all $h \in H$, where $v = m(S \otimes \mathrm{id})(J)$.
The Hopf algebras $H$ and $H'$ are called twist equivalent if $H' \cong H^J$.
This type of deformation was originally introduced by Drinfeld [Dr2]
in the context of quasi-Hopf algebras.
The following theorem is a consequence of a more general result
of Schauenburg [S]. An analogous statement for finite dimensional
quasi-Hopf algebras has been proved by Etingof and Gelaki.
Theorem 2.2. The finite dimensional Hopf algebras $H$ and $H'$ are
twist equivalent if and only if $\operatorname{Rep} H \cong \operatorname{Rep} H'$ as tensor categories.
In particular, properties like (quasi)triangularity, semisimplicity or
the structure of the Grothendieck ring are preserved under twisting
deformations.
Dually, if : H H k is an invertible normalized 2-cocycle
on H, then (H
, m
, , o
= H as
coalgebras with multiplication and antipode
h.
g = (h
1
, g
1
)h
2
g
2
1
(h
3
, g
3
), o
(h) = u
1
(h
(1)
)o(h
(2)
)u(h
(3)
),
for all h, g H, where u(h) = (o(h
(1)
), h
(2)
), h H.
The Hopf algebra H
is a twisting deformation
of H
.
Let H, H
,
respectively, are equivalent as tensor categories if and only if H
= H
is a cocycle deformation of H.
The 2-cocycle : H H k gives rise to a ber functor U
:
H Corep Vec
k
whose underlying functor is the forgetful functor
HCorep Vec
k
with monoidal structure f : U
(U
)
induced by . That is,
(2.5) f(u v) = (u
(1)
v
(1)
)u
(0)
v
(0)
,
for all u U, v V , U, V H Corep, where u u
(1)
u
(0)
,
denotes the H-coaction on u U.
Using tannakian reconstruction, one recovers the Hopf algebra H
: H
End(U
).
This defines a bijective correspondence between equivalence classes
of invertible 2-cocycles on $H$ and isomorphism classes of fiber functors
on $H$-Corep.
By results of Ulbrich, generalizing ideas of Grothendieck, isomorphism
classes of fiber functors on $H$-Corep correspond bijectively to
isomorphism classes of $H$-Galois extensions of $k$, also called $H$-Galois
objects.
Recall that the extension of $k$-algebras $B \subseteq A$ is called a right
$H$-Galois extension if $A$ is a right $H$-comodule algebra such that
$B = A^{\mathrm{co}\,H}$ and the canonical map
$$\mathrm{can} : A \otimes_B A \to A \otimes H, \quad x \otimes y \mapsto x y_{(0)} \otimes y_{(1)},$$
is bijective. Here, $\rho : A \to A \otimes H$, $\rho(a) = a_{(0)} \otimes a_{(1)}$, denotes the
$H$-coaction on $A$. Left $H$-Galois extensions and left $H$-Galois objects
are defined similarly.
The right $H$-Galois object $A$ corresponds to the fiber functor $U_A :=
A\,\square_H\,{-} : H\text{-Corep} \to \operatorname{Vec}_k$, where $\square_H$ denotes the cotensor product
of $H$-comodules.
Let H, H
be Hopf algebras. An (H
, H)-bigalois object is an
(H
-
Galois object and right H-Galois object.
For instance, the Hopf algebra H is itself an (H, H)-bigalois object
with respect to the left and right H-coactions given by the comulti-
plication : H H H.
More generally, let : HH k be an invertible 2-cocycle. Then
the crossed product
H = k#
= L(A, H), called the left Galois Hopf algebra, such that A is in
a natural way an (H
S : f(sg) = s f(g),
where k
) denes a twist J
c
kGkG. This twist has the form
(2.6) J =
c(, )e
,
where e
=
1
|G|
hG
(h
1
)h,
is.
Example 3.2. (Kobayashi-Masuoka.) Let $H$ be a semisimple Hopf
algebra and $K$ a Hopf subalgebra. Suppose that the index of $K$ in
$H$, that is, the quotient $\dim H/\dim K$, is the smallest prime number
dividing $\dim H$. Then $K$ is normal in $H$. This generalizes a well-known
fact for finite groups.
The concept of solvability of groups translates into the notion of
semisolvability of Hopf algebras, due to Montgomery and Witherspoon,
in such a way that if $H$ is semisolvable, then it can be obtained
from group algebras and their duals via a finite number of extensions.
A related, although not comparable, notion of solvability of a fusion
category is introduced and studied by Etingof, Nikshych and Ostrik
in [ENO2]. This notion will be discussed later on in Section 4.
Denition 3.3. [MW]. A lower normal series for H is a series
of Hopf subalgebras H
n
= k H
n1
H
1
H
0
= H,
where H
i+1
is normal in H
i
, for all i. The factors are the quotients
H
i
= H
i
/H
i
H
+
i+1
.
An upper normal series is inductively defined as follows. Let
$H_{(0)} = H$. Let $H_i$ be a normal Hopf subalgebra of $H_{(i-1)}$ and define
$H_{(i)} = H_{(i-1)}/H_{(i-1)}H_i^+$. Assume that $H_n = H_{(n-1)}$, for some
positive integer $n$, so that $H_{(n)} = k$. The factors are the Hopf subalgebras
$H_i$ of the quotients $H_{(i-1)}$.
The Hopf algebra $H$ is called lower (respectively, upper) semisolvable
if it possesses a lower (respectively, upper) normal series such
that all factors are commutative or cocommutative. If $H$ is both
lower and upper semisolvable, then it is called semisolvable.
We have that $H$ is upper semisolvable if and only if $H^*$ is lower
semisolvable [MW].
Remark 3.4. Note that, equivalently, an upper normal series can be
dened as a sequence of quotient Hopf algebra maps H
(0)
= H
H
(1)
H
(n)
= k such that each of the maps H
(i1)
H
(i)
is normal. In this case, the factors are H
i
:= H
co
i
(i1)
=
co
i
H
(i1)
,
where H
co
i
(i1)
,
co
i
H
(i1)
are the spaces of (right, respectively left)
coinvariants of the map
i
. They coincide and form a Hopf subalgebra
of H
(i1)
, by normality of the map
i
.
A result due to Masuoka [M2] says that a semisimple Hopf algebra
of dimension $p^n$, $p$ prime, contains a nontrivial central group-like
element $g$. This implies that the group algebra $k\langle g\rangle$ is a central Hopf
subalgebra of $H$. Inductively, this implies that $H$ is semisolvable
[MW].
It was shown in [N4] that in dimension $< 60$ every semisimple Hopf
algebra is obtained, up to a twisting deformation, from group algebras
and their duals through iterated extensions. That is, they are
(either upper or lower) semisolvable, except possibly after a twisting
deformation. This result answered a question formulated by S.
Montgomery in [Mo3].
Some nontrivial examples of semisimple Hopf algebras which are
simple as Hopf algebras arise as twisting deformations of simple
groups.
For instance, the twisting $H = (kA_5)^J$ of the alternating group
$A_5$, where $J$ is the twist lifted from the unique nontrivial 2-cocycle
in a Klein subgroup of $A_5$, is a nontrivial simple Hopf algebra of
dimension 60. This example is due to Nikshych [Nk]. The Hopf
algebra $H$ is not (upper or lower) semisolvable.
In the papers [GN, GN2] certain twisting deformations of a family
of supersolvable groups which are simple Hopf algebras were constructed.
These groups are direct products of two generalized dihedral
subgroups.
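Let $p$, $q$ and $r$ be prime numbers such that $q$ divides $p - 1$ and
$r - 1$. Let $G_1 = \mathbb{Z}_p \rtimes \mathbb{Z}_q$ and $G_2 = \mathbb{Z}_r \rtimes \mathbb{Z}_q$ be the only nonabelian
groups of orders $pq$ and $rq$, respectively. Let $G = G_1 \times G_2$ and let
$\mathbb{Z}_q \times \mathbb{Z}_q \subseteq G$ be a subgroup of order $q^2$.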
Let also 1 ,= H
2
(
, k
, (x, y) =
s
(x, y)e
s
, and :
(k
F
)
, (s, t) =
x
(s, t)e
x
, be normalized 2-cocycles with
the respect to the actions aorded, respectively, by and , subject
to appropriate compatibility conditions [M3]. Here, e
y
k
F
, y F,
are the canonical idempotents dened by e
y
(x) =
x,y
, and similarly
for e
s
k
.
The bicrossed product H = k
#
s
(x, y) e
s
#xy, (3.2)
(e
s
#x) =
gh=s
x
(g, h) e
g
#(h x) e
h
#x, (3.3)
The Hopf algebra H ts into an abelian exact sequence k k
, kF) associated to
the matched pair (F, ).
The class of an element of Opext(k
, kF) can
also be described as the rst cohomology group of a certain double
complex [M3, Proposition 5.2].
The following result is proved in [ENO2]. This leads to the full
classification of semisimple Hopf algebras of the prescribed dimensions.
Theorem 3.8. Let $p$, $q$, $r$ be distinct prime numbers. Then every
semisimple Hopf algebra of dimension $pqr$ or $pq^2$ is an abelian extension.
Indeed, it is shown in [ENO2, Corollary 9.7] that a semisimple Hopf
algebra of dimension pq
2
is either an abelian extension or a twist of
a group algebra or the dual of such a twist. But in the last cases, H
and H
)
res
H
1
(F, k
) H
1
(, k
) Aut(k
#kF)
H
2
(G, k
)
res
H
2
(F, k
) H
2
(, k
) Opext(k
, kF)
H
3
(G, k
)
res
H
3
(F, k
) H
3
(, k
) . . .
This is an important tool in calculations related to the Opext group.
See [S3] for a generalization, as well as a conceptual explanation of
the Kac exact sequence in terms of related monoidal categories.
A consequence of the results of [S3] is the following description,
given in [N2], of the representation category of an abelian extension,
in terms of the map : Opext(k
, kF) H
3
(G, k
) be
the 3-cocycle corresponding to H via the map .
Let us denote by ((G, , F) the category of kF-bimodules in the
tensor category ((G, ) of G-graded vector spaces with associativity
given by (this is a special case of a group-theoretical fusion category,
discussed in Subsection 4.4).
Theorem 3.9. There is an equivalence of fusion categories Rep H
((G, , F).
Also, there is an equivalence of fusion categories Rep D(H)
Rep D
(G).
Here, the twisted quantum double $D^\omega(G)$, $\omega \in H^3(G, k^\times)$, is the
quasi-Hopf algebra introduced by Dijkgraaf, Pasquier and Roche [DPR].
For the case of split extensions, that is when $(\sigma, \tau) = 1$ and hence
$\omega = 1$, this result was obtained previously in [BGM].
4. Extensions of fusion categories by finite groups
In this section we review some important recent results from [GNi,
DGNO, ENO2]. These concern certain classes of extensions of a
fusion category by finite groups. We also discuss some connections
with the results of Section 3.
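4.1. G-extensions. Let $G$ be a finite group. A $G$-grading of a fusion
category $\mathcal{C}$ is a decomposition of $\mathcal{C}$ as a direct sum of full abelian subcategories
$\mathcal{C} = \bigoplus_{g \in G} \mathcal{C}_g$, such that $\mathcal{C}_g^* = \mathcal{C}_{g^{-1}}$ and the tensor product
$\otimes : \mathcal{C} \times \mathcal{C} \to \mathcal{C}$ maps $\mathcal{C}_g \times \mathcal{C}_h$ to $\mathcal{C}_{gh}$. The neutral component $\mathcal{C}_e$ is
thus a full fusion subcategory of $\mathcal{C}$.
The grading is called faithful if $\mathcal{C}_g \neq 0$, for all $g \in G$. In this case,
$\mathcal{C}$ is called a $G$-extension of $\mathcal{C}_e$ [ENO2].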
Proposition 4.1. Let ( = Rep H be the representation category of a
semisimple Hopf algebra. Then a faithful G-grading on ( corresponds
to a central exact sequence of Hopf algebras k k
G
H H k,
such that Rep H = (
e
.
Dually, a faithful G-grading on ( = Corep H corresponds to a
cocentral exact sequence of Hopf algebras k K H kG k,
such that Corep K = (
e
.
Here, the sequence k K H kG k is called cocentral if
the dual sequence is central (see Subsection 4.3 below).
Proof. See [GNi, Proof of Theorem 3.8] for the statement on Rep H.
The dual statement follows from this, since $H$ fits into a cocentral
extension k K H kG k if and only if the dual Hopf
algebra H
k,
if and only if the category Rep H
= Corep H is a G-extension of
Rep K
= Corep K.
Let ( be a fusion category and let (
ad
be the adjoint subcategory
of (. That is, (
ad
is the full fusion subcategory of ( generated by the
subobjects of X X
(, with f
V
g,h
: T
g
(V ) T
h
(V )
T
gh
(V ).
Given an action of $G$ on $\mathcal{C}$, the $G$-equivariantization of $\mathcal{C}$, denoted
$\mathcal{C}^G$, is the category of $G$-equivariant objects and $G$-equivariant morphisms,
defined as follows. A $G$-equivariant object in $\mathcal{C}$ is a pair
$(V, (u_g^V)_{g \in G})$, where $V$ is an object of $\mathcal{C}$ and $u_g^V : T_g(V) \to V$, $g \in G$,
are isomorphisms such that, for all $g, h \in G$,
$$u_g^V\, T_g(u_h^V) = u_{gh}^V\, f^V_{g,h}. \tag{4.2}$$
A $G$-equivariant morphism $\phi : (U, u_g^U) \to (V, u_g^V)$ is a morphism
$\phi : U \to V$ in $\mathcal{C}$ such that $\phi\, u_g^U = u_g^V\, T_g(\phi)$, for all $g \in G$.
This is a tensor category with tensor product defined as $(U, u_g^U) \otimes
(V, u_g^V) = (U \otimes V, (u_g^U \otimes u_g^V)\, j_g|_{U,V})$, where $j_g|_{U,V} : T_g(U \otimes V) \to
T_g(U) \otimes T_g(V)$ are the isomorphisms giving the monoidal structure
on $T_g$.
The category $\mathcal{C}^G$ is Morita equivalent, in the sense of Müger, to a
certain $G$-extension $\mathcal{C} \rtimes G$ of $\mathcal{C}$ with respect to the indecomposable
module category $\mathcal{C}$ [Nk2].
In the case of the representation category of a semisimple Hopf
algebra $K$, there is a duality between $G$-graded fusion categories with
trivial component $\operatorname{Rep} K$ and $G$-equivariantizations of $\operatorname{Rep} K$.
Suppose $H$ fits into a cocentral extension $k \to K \to H \to kG \to k$.
Then $\operatorname{Rep} H \cong (\operatorname{Rep} K)^G$; see Subsection 4.3 below. On the other
hand, $\operatorname{Corep} H$ is a $G$-extension of $\operatorname{Corep} K$, by Proposition 4.1.
4.3. G-actions on Rep H and cocentral extensions. An exact
sequence of finite dimensional Hopf algebras
k H
H
kG k
is called cocentral if (h
1
) h
2
= (h
2
) h
1
, for all h
H (equiva-
lently, the dual inclusion
: (kG)
is central).
In [N7, Proposition 3.5] we showed that every such cocentral ex-
act sequence gave rise to a G-action on Rep H such that Rep
H
(Rep H)
G
as tensor categories. In this subsection we shall show that
the converse is also true, up to twisting deformations. This gives a
characterization of cocentral extensions in terms of equivariantiza-
tions.
Let G be a finite group and let H be a semisimple Hopf algebra
(although the semisimplicity of H is not crucial in our arguments).
Consider an action of G on Rep H by tensor autoequivalences T :
G Aut
Rep H.
For each g G, consider the tensor functor (T
g
, j
g
) : Rep H
Rep H. By [S] there exist a twist J(g) H H and a Hopf algebra
isomorphism
g
: H H
J(g)
such that (T
g
, j
g
) is isomorphic as a
tensor functor to (
g
, J(g)
1
), where
g
is the direct image functor
and, by abuse of notation, J(g)
1
:
g
(U V )
g
(U)
g
(V )
is the isomorphism given by the action of J(g)
1
. In particular,
g
are algebra automorphisms of H, for all g G, and the map
J : G H H is invertible. Let us denote g.a =
g
(a), g G,
a H.
In particular, the following hold, for all g G, a, b H:
g.(ab) = (g.a)(g.b), g.1 = 1, (4.3)
(g.a) = J(g)(g.a
1
g.a
2
)J(g)
1
. (4.4)
For each g G, let (g) : (T
g
, j
g
) (
g
, J(g)
1
) be an isomor-
phism of tensor functors. Then we have natural isomorphisms of
tensor functors f
g,h
:
gh
, for all g, h G, dened for an
H-module X as
(f
g,h
)
X
= (gh)
X
(f
g,h
)
X
T
g
((h)
1
X
)(g)
1
h
(X)
.
The isomorphisms f
g,h
determine an invertible map : G G H
such that (g, h)
1
[
X
= f
g,h
[
X
, for all g, h G, and for all H-module
X.
The data , , J satisfy the following conditions:
(g.(h, t))(g, ht) = (gh, t)(g, h), (1, g) = (g, 1) = 1, (4.5)
g.(h.a) = (g, h)(gh.a)(g, h)
1
, (4.6)
((g, h))J(gh) = J(g)(g.J(h)) ((g, h) (g, h)), (4.7)
for all g, h, t G, a, b H. Indeed, (4.5) and (4.6) are equivalent to
(g, h) being isomorphisms of k-linear functors, and (4.7) is equiva-
lent to (g, h) being morphisms of tensor functors.
Conditions (4.3)(4.7), together with the twist conditions for J(g),
imply that the vector space H
= H
J
#
kG k.
Proposition 4.2. Let G be a finite group and let
H be a semisimple
Hopf algebra. Then the following are equivalent:
(i) There exists a semisimple Hopf algebra H and an action of
G on Rep H by tensor autoequivalences, such that Rep
H
(Rep H)
G
.
(ii)
H is twist equivalent to a Hopf algebra H
that fits into a
cocentral exact sequence k H H
kG k.
Proof. The implication (ii) (i) is [N7, Proposition 3.5]. We shall
show that (i) (ii). The proof is based on the relation between
G-actions and Hopf monads, as studied in [BrN], see Subsection 5.1.
Keep the notation above. Let T
G
be the Hopf monad on Rep H cor-
responding to the given action of G. Consider the bicrossed product
Hopf algebra H
= H
J
#
G
on Rep H.
By denition of and , there exists an isomorphism of Hopf
monads (that is, a morphism of monads which is monoidal) =
g
(g) : T
G
T
G
, where (g) : (T
g
, j
g
) (
g
, J(g)
1
) are the
given isomorphisms of tensor functors. Indeed, by denition of the
isomorphisms f
g,h
, is a morphism of monads, and it is comonoidal
because (g) is an isomorphism of tensor functors, for all g G.
Hence, (Rep H)
T
G
(Rep H)
T
G
as tensor categories [BV]. The
proposition follows from the fact that Rep
H (Rep H)
T
G
, while
(Rep H)
T
G
Rep H
, by [N7].
4.4. Weakly group-theoretical fusion categories. The concepts
of G-extension and G-equivariantization discussed previously lead to
the notions of nilpotent and solvable fusion categories.
Definition 4.3. [ENO2, GNi]. A fusion category $\mathcal{C}$ is called (cyclically)
nilpotent if there is a sequence of fusion categories
$$\mathcal{C}_0 = \operatorname{Vec}_k,\ \mathcal{C}_1,\ \dots,\ \mathcal{C}_n = \mathcal{C},$$
such that $\mathcal{C}_i$ is a $G_i$-extension of $\mathcal{C}_{i-1}$, for some finite (cyclic) groups
$G_1, \dots, G_n$.
This definition extends the definition of nilpotency of a finite group,
that is, the group $G$ is nilpotent if and only if $\operatorname{Rep} G$ is a nilpotent
fusion category. We have in addition:
Proposition 4.4. Let $\mathcal{C} = \operatorname{Rep} H$, where H is a semisimple Hopf algebra. Then $\mathcal{C}$ is nilpotent if and only if there is a sequence of (normal) quotient Hopf algebras
$$H_{(n)} = H \to H_{(n-1)} \to \dots \to H_{(0)} = k,$$
such that $H_i = H_{(i)}^{\operatorname{co} H_{(i-1)}} \simeq k^{G_i}$ is a central Hopf subalgebra of $H_{(i)}$, for all i = 1, . . . , n.
Dually, the category Corep H is nilpotent if and only if there is a sequence of (normal) Hopf subalgebras
$$k = H_0 \subseteq H_1 \subseteq \dots \subseteq H_n = H,$$
such that $H_i / H_i H_{i-1}^{+} \simeq kG_i$ is a cocentral Hopf algebra quotient of $H_i$, for all i = 1, . . . , n.
Proof. Suppose $\mathcal{C} = \operatorname{Rep} H$ is nilpotent. Let $\mathcal{C}_0 = \mathrm{Vec}_k, \mathcal{C}_1, \dots, \mathcal{C}_n = \mathcal{C}$ be a sequence of fusion categories such that $\mathcal{C}_i$ is a $G_i$-extension of $\mathcal{C}_{i-1}$, where $G_1, \dots, G_n$ are finite groups. In particular, $\mathcal{C}_{n-1}$ is isomorphic to a full fusion subcategory of $\mathcal{C}_n = \operatorname{Rep} H$ (the trivial component with respect to the $G_n$-grading), hence $\mathcal{C}_{n-1} \simeq \operatorname{Rep} H_{(n-1)}$, for some quotient Hopf algebra $H = H_{(n)} \to H_{(n-1)}$. Furthermore, by Proposition 4.1 there is a central exact sequence $k \to k^{G_n} \to H_{(n)} \to H_{(n-1)} \to k$. The claim follows by induction on n. Note that each of the factors of the resulting series is one of the Hopf subalgebras $H_{(i)}^{\operatorname{co} H_{(i-1)}} = k^{G_i}$, which is a central Hopf subalgebra of $H_{(i)}$, i = 1, . . . , n.
For the statement on the category of corepresentations, apply the above to $\operatorname{Rep} H^{*} = \operatorname{Corep} H$.
Corollary 4.5. Let H be a semisimple Hopf algebra. Then we have:
(i) Rep H is nilpotent if and only if H is upper semisolvable with central factors $k^{G_i}$.
(ii) Corep H is nilpotent if and only if H is lower semisolvable with cocentral factors $kG_i$.
Let us say that a semisimple Hopf algebra H is nilpotent if the
category Rep H is nilpotent.
In view of the results of Masuoka [M2], every semisimple Hopf algebra of dimension $p^n$, p a prime number, is nilpotent. More generally, every fusion category of dimension $p^n$ is nilpotent [GNi, Example 4.5].
Remark 4.6. Nilpotency of a semisimple Hopf algebra is not a self-dual notion. Indeed, if G is a finite group, then the Hopf algebra $k^G$ is always nilpotent. However, the group algebra kG is nilpotent if and only if G is a nilpotent group.
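The group-theoretic side of this remark can be checked by hand on small examples. The following is a minimal sketch, not part of the source, that tests nilpotency of a finite permutation group through its lower central series; the function names and the encoding of permutations as tuples are assumptions made only for this illustration.

```python
def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[i] for i in q)


def inverse(p):
    """Return the inverse of the permutation p."""
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)


def generate(gens, n):
    """Subgroup of S_n generated by gens (closure under right multiplication)."""
    identity = tuple(range(n))
    elems = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return elems


def commutator_subgroup(G, H, n):
    """Subgroup generated by all commutators g h g^-1 h^-1 with g in G, h in H."""
    comms = {
        compose(compose(g, h), compose(inverse(g), inverse(h)))
        for g in G
        for h in H
    }
    return generate(list(comms), n)


def is_nilpotent(gens, n):
    """True if the lower central series of <gens> reaches the trivial group."""
    G = generate(gens, n)
    term = G
    while len(term) > 1:
        nxt = commutator_subgroup(G, term, n)
        if len(nxt) == len(term):  # the series has stabilised at a non-trivial term
            return False
        term = nxt
    return True


# Hypothetical test data: S_3 acting on 3 points, the dihedral group of order 8 on 4 points.
S3_GENS = [(1, 0, 2), (1, 2, 0)]
D4_GENS = [(1, 2, 3, 0), (0, 3, 2, 1)]
```

With these generators, `is_nilpotent(D4_GENS, 4)` holds while `is_nilpotent(S3_GENS, 3)` fails, matching the statement that $kS_3$ is not nilpotent although $k^{S_3}$ is.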
Example 4.7. The universal grading group of a group-theoretical
category ( = ((G, , S, ) is computed in [GNa]. It is shown in
[GNa, Corollary 4.3] that ( is a nilpotent fusion category, if and only
if the normal closure of S in G is nilpotent.
Let G be a nite group and let A = A(S, ) be the k
G
-Galois object
corresponding to a subgroup S G and a nondegenerate 2-cocycle
H
2
(S, k
= (kG)
J
is a twisting of kG, so that H
is nilpotent
if and only if G is nilpotent. In particular, in this case, H is nilpotent
if H
is not.
In the paper [ENO2], the authors define a fusion category to be simple if it contains no proper fusion subcategories.
When $\mathcal{C} = \operatorname{Rep} H$ for a semisimple Hopf algebra H, $\mathcal{C}$ is simple if and only if H has no Hopf algebra quotients at all (normal or not). In particular, if G is a finite group, Rep G is simple if and only if G is a simple group, but the category $\mathcal{C}(G)$ of G-graded vector spaces is simple if and only if G is a cyclic group of prime order (that is, G has no proper subgroups). A different notion of simplicity of a tensor category, discussed later on in Section 5, is given in [BrN].
The following corollary can be seen as a consequence of the results
of [N4].
Corollary 4.8. Let H be a semisimple Hopf algebra of dimension < 60. If Rep H is simple in the sense of [ENO2], then $H \simeq k\mathbb{Z}_p$, p prime.
More generally, by [ENO2, 9.5] the only simple fusion categories with integer Frobenius-Perron dimension < 60 are the categories $\mathcal{C}(G, \omega)$, where G is a cyclic group of prime order and $\omega \in H^3(G, k^{\times})$. Indeed, it follows from the results loc. cit. that a fusion category of dimension < 60 is always solvable (dimension $p^a q^b$) or group-theoretical (dimension pqr).
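The arithmetic behind this dichotomy is elementary: every integer below 60 has at most two distinct prime factors or is a product of three distinct primes. The sketch below, which is only an illustration and not part of the source, verifies this by brute force; the function names and the labels `"paqb"`, `"pqr"`, `"other"` are assumptions of the example.

```python
def distinct_prime_factors(n):
    """Return the list of distinct primes dividing n (empty for n = 1)."""
    primes = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def classify(n):
    """Label n as 'paqb' (at most two distinct primes), 'pqr' or 'other'."""
    primes = distinct_prime_factors(n)
    if len(primes) <= 2:
        return "paqb"
    product = 1
    for p in primes:
        product *= p
    if len(primes) == 3 and product == n:  # square-free with exactly three primes
        return "pqr"
    return "other"


# Dimensions below 60 that are products of three distinct primes.
PQR_BELOW_60 = [n for n in range(1, 60) if classify(n) == "pqr"]
```

Here `PQR_BELOW_60` is `[30, 42]`, no integer below 60 is labelled `"other"`, and 60 itself is the first integer that falls outside both families.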
On the other hand, a simple fusion category of (Frobenius-Perron) dimension 60 is necessarily isomorphic to the representation category $\operatorname{Rep} A_5$ [ENO2, Theorem 9.12]. In particular, a semisimple Hopf algebra H of dimension 60 such that Rep H is simple in the sense of [ENO2] is a twisting of the alternating group $A_5$.
Definition 4.9. [ENO2]. A fusion category $\mathcal{C}$ is called weakly group-theoretical if there exists an indecomposable algebra A in $\mathcal{C}$ such that ${}_A\mathcal{C}_A$ is a nilpotent fusion category. In the case where ${}_A\mathcal{C}_A$ is a cyclically nilpotent fusion category, $\mathcal{C}$ is called solvable.
Here, ${}_A\mathcal{C}_A$ is the category of A-bimodules in $\mathcal{C}$ with tensor product $\otimes_A$. This definition can be rephrased by saying that $\mathcal{C}$ is Morita equivalent to a nilpotent fusion category in the sense of Müger [Mg2].
Solvable fusion categories can be alternatively defined as follows [ENO2, Proposition 4.4]: $\mathcal{C}$ is solvable if and only if there is a sequence of fusion categories
$$\mathcal{C}_0 = \mathrm{Vec}_k,\ \mathcal{C}_1,\ \dots,\ \mathcal{C}_n = \mathcal{C},$$
such that $\mathcal{C}_i$ is obtained from $\mathcal{C}_{i-1}$ either by a $G_i$-equivariantization or as a $G_i$-extension, where $G_1, \dots, G_n$ are cyclic groups of prime order.
If G is a nite group and H
3
(G, k
is a subobject of
F(X) for some X in $\mathcal{C}$. F is called normal if for any object X of $\mathcal{C}$, there exists a subobject $X_0 \subseteq X$ such that $F(X_0)$ is the largest trivial subobject of F(X).
Let $\mathrm{Ker}_F$ denote the full subcategory of $\mathcal{C}$ whose objects are those X such that F(X) is trivial, that is, isomorphic to $\mathbf{1}^n$, $n \geq 1$. When
(, (
, (, (
be tensor categories over k. A sequence of tensor functors $\mathcal{C}' \xrightarrow{f} \mathcal{C} \xrightarrow{F} \mathcal{C}''$ is called an exact sequence of tensor categories if the following conditions hold:
(i) F is dominant and normal;
(ii) f is a full embedding;
(iii) The essential image of f is $\mathrm{Ker}_F$.
This definition leads to the related notions of normal fusion subcategory and simple fusion category. A fusion subcategory $\mathcal{C}' \subseteq \mathcal{C}$ is normal if $\mathcal{C}'$ fits into an exact sequence of fusion categories $\mathcal{C}' \to \mathcal{C} \to \mathcal{C}''$. $\mathcal{C}$ is simple if it has no non-trivial normal fusion subcategory. This notion of simplicity differs from the one introduced in [ENO2]. For instance, when G is a finite group, the simplicity of Rep G is equivalent to the simplicity of G and also to the simplicity of the fusion category $\mathcal{C}(G)$ of G-graded vector spaces.
It follows from [BrN, Proposition 3.9] that every exact sequence of
nite dimensional (semisimple) Hopf algebras k K
i
H
H
k gives rise to an exact sequence of tensor (fusion) categories
(5.1) Rep H
Rep H
i
Rep K.
Here, the functors
and i
f
(
F
(
f
(
F
gG
T
g
. In this case we have $\mathcal{C}^G = \mathcal{C}^T$. Hopf monads on $\mathcal{C}$ corresponding to a group action are characterized in [BrN, Theorem 4.24].
Consider for instance an exact sequence of nite groups 1 G
G
G
Rep G Rep G
= Res
G
G
. Let Y be a kG
-module. As a consequence
of Mackeys Subgroup Theorem, there is a natural isomorphism
Res
G
G
Ind
G
G
(Y )
G/G
Y,
where
Y denotes the kG
, by:
T(Y ) =
G
Y.
This comes in fact from the action by tensor autoequivalences of G
on Rep G
by conjugation.
5.2. Extensions and commutative central algebras. We next
discuss another characterization of exact sequences of fusion cate-
gories from [BrN], in terms of commutative central algebras. This
relies on results of [BLV].
Let $\mathcal{C}$ be a fusion category. A central algebra of $\mathcal{C}$ is a pair $(A, \sigma)$, where A is an algebra in $\mathcal{C}$ endowed with natural isomorphisms (half-braiding) $\sigma_X : A \otimes X \to X \otimes A$, $X \in \mathcal{C}$, such that the pair $(A, \sigma)$ is an algebra in the center $\mathcal{Z}(\mathcal{C})$ of $\mathcal{C}$.
A central algebra $(A, \sigma)$ is called commutative if $m\sigma_A = m$, where $m : A \otimes A \to A$ denotes the multiplication in A.
Let $(A, \sigma)$ be a commutative central algebra of $\mathcal{C}$. Assume A is semisimple. Then the category $\mathrm{mod}_{\mathcal{C}} A = \mathrm{mod}_{\mathcal{C}}(A, \sigma)$ of right A-modules in $\mathcal{C}$ is a fusion category with tensor product $\otimes_A$ and unit object 1 [BrN, Proposition 5.5].
There is a free module functor $F_A : \mathcal{C} \to \mathrm{mod}_{\mathcal{C}} A$, $X \mapsto X \otimes A$, which is a tensor functor. The central algebra $(A, \sigma)$ is called the induced central algebra of $F = F_A$.
The algebra A is called self-trivializing if $F_A(A)$ is a trivial object of $\mathrm{mod}_{\mathcal{C}} A$. The following proposition is contained in [BrN, Proposition 5.7].
Proposition 5.6. Suppose F : ( T is an exact tensor functor
between fusion categories. Let (A, ) be its induced central algebra.
Then F is normal if and only if the algebra A is self-trivializing. In
that case, Ker
F
= A ( and we have an exact sequence of tensor
categories A ( T.
Here, A denotes the smallest abelian subcategory of ( containing
A and stable by direct sums, subobjects and quotients.
The following characterization is contained in [BrN, Corollary 5.8].
Theorem 5.7. An exact sequence of fusion categories (
(
F
(
V
(h v) = v
(0)
S(v
(1)
) hv
(2)
.
for any right H-comodule V . We have mod
Corep H
(A, ) Vec
k
as
tensor categories.
Let f : H H
:
Corep H Corep H
) Corep H
.
Note that the k-linear category mod
Corep H
(B,
H kG k is a cocentral exact
sequence of Hopf algebras. If is cyclic, then H is cocommutative.
Proof. The coalgebra structure of H is that of a crossed product H
k
st=h
(s, t)(g) e
s
#(t g) e
t
#g,
for all h , g G. Here, e
h
k
(.
Moreover, this exact sequence comes from an equivariantization.
In fact, T is a tannakian category, so that T Rep G as symmetric
tensor categories, where G is a nite group that acts on
( and such
that ( =
(
G
.
As an application of the notion of exact sequence of fusion categories, the following classification result was proved in [BrN].
Theorem 5.9. Let $\mathcal{C}$ be a braided fusion category over k such that $\operatorname{FPdim} \mathcal{C}$ is odd and square-free. Then $\mathcal{C}$ is equivalent to $\operatorname{Rep} \Gamma$ as a fusion category, for some finite group $\Gamma$.
The proof relies on the concept of modularization and on the fact that a quasitriangular Hopf algebra whose dimension is odd and square-free is in fact a group algebra [N6].
In [N6], a construction of certain canonical quotients of a finite
dimensional quasitriangular Hopf algebra, related to modularization,
was given. This construction is based on properties of the transmutation
studied by S. Majid and on a correspondence between Hopf
algebra quotients and coideal subalgebras, due to Takeuchi. The first
notion concerns a natural map
R
: H
divides dimH.
A generalization of this result to spherical fusion categories appears
in [ENO, Proposition 5.7].
The degree of the character $\chi$ is defined as $\deg \chi = \chi(1) = \dim V$, if $\chi = \chi_V$. Let Irr(H) denote the set of irreducible characters of H. Following [I, Chapter 12], let us consider the set
$$\operatorname{cd}(H) = \{\deg \chi \mid \chi \in \operatorname{Irr}(H)\}.$$
For a finite group, the knowledge of the set cd(G) = cd(kG) gives in some cases substantial information about the structure of G. For instance, if $\operatorname{cd}(G) = \{1, m\}$, $m \geq 1$, then either G has an abelian normal subgroup of index m or m is a power of a prime p and G is the direct product of a p-group and an abelian group [I, Theorem 12.5].
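For finite groups the set cd(G) is easy to tabulate from a list of irreducible character degrees. The sketch below is an illustration only: it takes well-known degree lists for $S_3$, $Q_8$ and $A_5$ as hard-coded input (the dictionary `DEGREES` and the function names are assumptions of this example, not notation from the source) and checks two classical constraints, namely that the squares of the degrees sum to |G| and that each degree divides |G|.

```python
# Irreducible character degrees (with multiplicity) and group orders.
# These are standard character-table facts entered by hand.
DEGREES = {
    "S3": ([1, 1, 2], 6),
    "Q8": ([1, 1, 1, 1, 2], 8),
    "A5": ([1, 3, 3, 4, 5], 60),
}


def cd(degrees):
    """The set of character degrees, returned as a sorted list."""
    return sorted(set(degrees))


def satisfies_degree_constraints(degrees, order):
    """Check sum of squares equals |G| and every degree divides |G|."""
    squares_ok = sum(d * d for d in degrees) == order
    divides_ok = all(order % d == 0 for d in degrees)
    return squares_ok and divides_ok


def has_two_degrees(degrees):
    """True when cd(G) has the form {1, m} with m > 1."""
    values = cd(degrees)
    return len(values) == 2 and values[0] == 1
```

For instance `cd(DEGREES["S3"][0])` is `[1, 2]`, so $S_3$ falls under the case $\operatorname{cd}(G) = \{1, m\}$ with m = 2, and indeed it has the abelian normal subgroup $A_3$ of index 2.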
For semisimple Hopf algebras, a result in this direction is the following one:
Theorem 6.3. [BN, Corollary 6.6]. Suppose that $\operatorname{cd}(H^{*}) = \{1, 2\}$. Then H is lower semisolvable.
The proof of this theorem relies on a refinement of the above mentioned result of Nichols and Richmond given in [BN].
6.2. Module categories. A generalized fiber functor is an exact faithful tensor functor $\mathcal{C} \to R$-Bimod, where R is a separable algebra. These functors play the role of representations of $\mathcal{C}$. They correspond to so-called module categories over $\mathcal{C}$, that is, semisimple k-linear categories $\mathcal{M}$ endowed with an exact functor $\mathcal{C} \times \mathcal{M} \to \mathcal{M}$, $(X, M) \mapsto X \otimes M$, satisfying appropriate associativity and unit axioms. See [O1] and references therein. In this way, module categories can be seen as an analogue of the notion of modules over rings.
When $\mathcal{C}$ is the representation category of a semisimple Hopf algebra H, these functors are also in correspondence with H-Galois extensions $R \subseteq A$ [S2, Theorem 2.5.3]. Some references for the theory of Hopf Galois extensions of a Hopf algebra and their most important features are [B, Mo2, S2, SS].
Indecomposable module categories over Rep G, where G is a nite
group, are classied in [O1, Theorem 3.2]. They are in one-to-one
correspondence with conjugacy classes of pairs (, ), where G
is a subgroup and H
2
(, k
is trivial, and H
2
(, k
).
The corresponding module category is the category of (k
F, k
)-
bimodules in ((G, ). This module category is of rank one (that is,
it corresponds to a ber functor on (, and thus to a semisimple Hopf
algebra H with Rep H (), if and only if F = G and the cocycle
1
is non-degenerate on F .
Module categories over an equivariantized category $\mathcal{C}^G$ are classified in [ENO2, Proposition 5.4], generalizing results of [Nk2]. More recently, the classification has also been obtained in [MS] for any G-extension of a fusion category.
6.3. Frobenius-Schur indicators. These invariants are defined for
semisimple pivotal tensor categories (that is, categories which admit
a tensor isomorphism between the identity functor and the functor
V V
G := mcm(e(
g
)[g[ : g G); where e(
g
) denotes the order
of the cohomology class of the restriction of to the subgroup gen-
erated by g G. Moreover, exp ( = exp
G in certain cases. As a
consequence, the exponent of a group-theoretical quasi-Hopf algebra
divides the square of its dimension and, in addition, this bound is
optimal.
In the paper [LMS] the authors studied the properties of the so-called Hopf order of an element $h \in H$: this is the least n such that $h_{(1)} \cdots h_{(n)} = \epsilon(h)1$. Hopf orders are investigated for some split abelian extensions, including Drinfeld doubles of certain groups (in particular, a semisimple Hopf algebra H may have elements of prime Hopf order p, even when p does not divide dim H). The spaces of elements with trivial n-th Hopf powers are discussed, showing, however, that they do not give a twist invariant of H.
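In the simplest case of a group algebra the Hopf order can be computed directly, since $\Delta(g) = g \otimes g$ gives $h_{(1)} \cdots h_{(n)} = \sum_g a_g g^n$ for $h = \sum_g a_g g$. The following sketch works in $k\mathbb{Z}_m$ only and is not taken from [LMS]; the dictionary encoding of elements (residue mapped to coefficient) and the function names are assumptions made for this illustration.

```python
def counit(h):
    """Counit of kZ_m: the sum of the coefficients of h."""
    return sum(h.values())


def hopf_power(h, n, m):
    """n-th Hopf power of h in kZ_m: each basis element r is sent to n*r mod m."""
    result = {}
    for r, coeff in h.items():
        key = (n * r) % m
        result[key] = result.get(key, 0) + coeff
    # Drop zero coefficients so that equal elements have equal dictionaries.
    return {r: c for r, c in result.items() if c != 0}


def hopf_order(h, m):
    """Least n >= 1 with h_(1)...h_(n) = counit(h) * 1 in kZ_m."""
    eps = counit(h)
    target = {0: eps} if eps != 0 else {}
    for n in range(1, m + 1):
        if hopf_power(h, n, m) == target:
            return n
    # Unreachable for valid input: n = m always sends every residue to 0.
    raise ValueError("no Hopf order found; is h an element of kZ_m?")
```

For a group-like element the Hopf order is just its order in the group, for example `hopf_order({1: 1}, 6)` is 6, while `hopf_order({2: 1, 4: 1}, 6)` is 3. The phenomenon mentioned above, a prime Hopf order not dividing dim H, cannot occur in this commutative and cocommutative toy case.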
7. Some further questions
As already explained, there exist group-theoretical Hopf algebras (specifically, twistings of group algebras) which are not semisolvable. We believe it would be interesting to describe those semisimple Hopf algebras that can be obtained as extensions from (weakly) group-theoretical Hopf algebras. In particular, we do not know the answer to the following question:
Question 7.1. Let $k \to A \to H \to B \to k$ be an extension of Hopf algebras. Suppose A and B are weakly group-theoretical. Is it true that H is weakly group-theoretical?
It is known [N2] that if the extension is abelian then the answer is affirmative. In any case, if the answer were affirmative in general, this would imply that the class of semisimple Hopf algebras which are semisolvable would be contained in the class of weakly group-theoretical Hopf algebras.
In relation with the classification of semisimple Hopf algebras from their character degrees, we do not know the answers to the following questions:
Question 7.2. Let p be a prime number. Let H be a semisimple Hopf algebra such that $\operatorname{cd}(H) = \{1, p\}$. Is it true that H is semisolvable?
It is known [I, Theorem (12.11)] that a finite group G whose irreducible character degrees are either 1 or p must be an extension of an abelian group by $\mathbb{Z}_p$ or else $|G : Z(G)| = p^3$. So these groups are solvable.
Also, the result in [IK, Theorem IX.8 (iii)] implies that the answer to Question 7.2 is yes for Kac algebras H when $|G(H^{*})| = p$.
Question 7.2 also makes sense in the context of fusion categories,
considering solvability instead of semisolvability.
In the context of the exact sequences of tensor categories introduced in [BrN], we think it would be interesting to extend the semisolvability results in low dimension of [N4] to fusion categories. We know that the notion of simplicity of fusion categories considered in [BrN] extends that of finite groups. In particular, the category of representations of the alternating group $A_5$ is a simple fusion category.
Question 7.3. Does there exist a fusion category of dimension < 60 which is simple in the sense of [BrN]?
As pointed out in Subsection 4.4, the answer is no if one considers instead the notion of simplicity studied in [ENO2]. In view of the main result of [N4] the answer is also no if one considers fusion categories that admit a fiber functor, that is, categories of representations of semisimple Hopf algebras.
In the same spirit, the following questions are natural:
Question 7.4. Does there exist a fusion category of dimension $p^a q^b$, where p and q are distinct prime numbers, which is simple in the sense of [BrN]?
Question 7.5. Does there exist a fusion category of prime power dimension $p^n$, n > 1, which is simple in the sense of [BrN]?
It is clear that fusion categories of prime dimension are simple (according to both definitions of simplicity). On the other hand, fusion categories of dimension $p^a q^b$ are always solvable. In particular, they are not simple in the sense of [ENO2] if a + b > 1.
In relation with the invariants of fusion categories described in Section 6, we believe it would be of interest to compute them for the category $\mathcal{C}^T$, where T is a semisimple faithful (normal) Hopf monad on a fusion category $\mathcal{C}$. In particular, an answer to the following question would give a generalization of the description of module categories for equivariantized categories given in [ENO2]:
Question 7.6. What are module categories for the category $\mathcal{C}^T$?
Concerning extensions of fusion categories, as explained in Section
5, we have the following natural question:
Question 7.7. Let (
( (
and (
es, S
4
-
symmetries of 6j-symbols and Frobenius-Schur indicators in rigid monoidal
C
uger, Galois theory for braided tensor categories and the modular
closure, Adv. Math. 150, 151–201 (2000).
[Mg2] M. M
atica, Astronom
a y F
ordoba, Argentina
E-mail address: [email protected],
URL: http://www.famaf.unc.edu.ar/natale
AN EXAMPLE CONCERNING THE THEORY OF
LEVELS FOR CODIMENSION-ONE FOLIATIONS
ANDRÉS NAVAS
An important aspect of foliations concerns the existence of local minimal sets. Recall that a foliated manifold has the LMS property if, for every open, saturated set W and every leaf $L \subset W$, the relative closure $\overline{L} \cap W$ contains a minimal set of $\mathcal{F}|_W$. A fundamental result (due to Cantwell-Conlon [2] and Duminy-Hector [5]) establishes the LMS property for codimension-one foliations that are transversely of class $C^{1+\mathrm{Lipschitz}}$. This is the basic tool of the so-called Theory of Levels.
A classical example due to Hector (which corresponds to the suspension of a group action on the interval) shows that the LMS property is no longer true for codimension-one foliations which are transversely only continuous (see [1, Example 8.1.13]). Despite this, in recent years, the possibility of extending some of the results of the Theory of Levels to smoothness smaller than $C^{1+\mathrm{Lipschitz}}$ has been naturally addressed [3, 4]. In this Note we will show that, however, analogues of Hector's example appear in class $C^1$ (and actually in class $C^{1+\alpha}$ for small values of $\alpha$).
1. A General Construction
Let $(a_n)_{n \in \mathbb{Z}}$ be a sequence such that $a_{n+1} < a_n$ for all $n \in \mathbb{Z}$, $a_n \to 0$ as $n \to \infty$, and $a_n \to 1$ as $n \to -\infty$. Let $(n_k)$ be a strictly increasing sequence of positive integers, and let $f : [0, 1] \to [0, 1]$ be a homeomorphism such that $f(a_{n+1}) = a_n$ for all $n \in \mathbb{Z}$. For each k > 0, we let $u_k, v_k, b_k, c_k$ be such that $a_{n_k+1} < b_k < u_k < v_k < c_k < a_{n_k}$. For each $i \in \{0, \dots, n_{k+1} - n_k\}$, we set $u_k^i := f^i(u_k)$ and $v_k^i := f^i(v_k)$.
Partially funded by the Math-AMSUD Project DySET.
Notice that
$$f^i([u^0_{k+1}, v^0_{k+1}]) = [u^i_{k+1}, v^i_{k+1}] \subset f^i([a_{1+n_{k+1}}, a_{n_{k+1}}]) = [a_{n_{k+1}-i+1}, a_{n_{k+1}-i}].$$
Now, we let $g : [0, 1] \to [0, 1]$ be a homeomorphism such that:
- g = Id on $[a_{n+1}, a_n]$ for each n < 0, as well as each n > 0 such that $n \neq n_k$ for every k;
- g = Id on $[a_{1+n_k}, b_k] \cup [c_k, a_{n_k}]$, $g(u^0_k) = v^0_k$, and g has no fixed point on $]b_k, c_k[$.
Main assumption: In order that f, g generate a group of homeomorphisms of [0, 1] whose associated suspension does not have the LMS property, we assume that (see Figure 1)
$$u^{\,n_{k+1}-n_k}_{k+1} = b_k \quad \text{and} \quad v^{\,n_{k+1}-n_k}_{k+1} = c_k.$$
With these general notations, Hector's example corresponds to the choice $n_k = k$. We will show that, by taking $n_k = 2^k$, one may perform this construction in such a way that the resulting maps f and g are diffeomorphisms of class $C^1$ (actually, of class $C^{1+\alpha}$ for any
$\alpha$ < (
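$(1+|n|)^{1+\varepsilon}$, where $c_{\varepsilon}$ is chosen so that $\sum_{n \in \mathbb{Z}} |[a_{n+1}, a_n]| = 1$;
$|[b_k, c_k]| := \frac{1}{2} |[a_{2^k+1}, a_{2^k}]| = \dfrac{c_{\varepsilon}}{2(1+2^k)^{1+\varepsilon}}$, where k > 0;
$|[u_k, v_k]| := |[b_k, c_k]|^{1+\theta}$.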
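The normalization of these lengths can be explored numerically. The sketch below assumes, as the displayed formula for $|[b_k, c_k]|$ suggests, that the gaps have length $c/(1+|n|)^{1+\text{eps}}$ with c chosen so that they sum to 1; since the exponent symbols are lost in this copy, the names `eps` and `theta` are placeholders, and all function names are hypothetical. The tail of the series is estimated by an integral, so the constant is an approximation.

```python
import math


def normalizing_constant(eps, terms=100_000):
    """Approximate c with sum over n in Z of c / (1 + |n|)**(1 + eps) equal to 1."""
    # sum_{n in Z} (1 + |n|)^-(1+eps) = 2 * sum_{m >= 1} m^-(1+eps) - 1
    partial = math.fsum(m ** -(1.0 + eps) for m in range(1, terms + 1))
    tail = terms ** -eps / eps  # integral estimate of the terms with m > terms
    return 1.0 / (2.0 * (partial + tail) - 1.0)


def gap_length(n, eps, c):
    """Length of [a_{n+1}, a_n] under the assumed formula."""
    return c / (1 + abs(n)) ** (1 + eps)


def bc_length(k, eps, c):
    """Length of [b_k, c_k]: half of the gap with index 2**k."""
    return 0.5 * gap_length(2 ** k, eps, c)


def uv_length(k, eps, theta, c):
    """Length of [u_k, v_k]: the length of [b_k, c_k] raised to the power 1 + theta."""
    return bc_length(k, eps, c) ** (1 + theta)
```

For any positive `theta` the nesting $|[u_k, v_k]| < |[b_k, c_k]| < |[a_{2^k+1}, a_{2^k}]|$ holds automatically, because $|[b_k, c_k]| < 1$; this is what makes the concentric placement of the three intervals possible.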
We assume that the center of $[a_{2^k+1}, a_{2^k}]$ coincides with the center of $[b_k, c_k]$ and with that of $[u_k, v_k]$. Furthermore, we assume that for each $i \in \{0, \dots, 2^k\}$, the centers of $[u^i_{k+1}, v^i_{k+1}]$ and $[a_{2^{k+1}-i+1}, a_{2^{k+1}-i}]$ coincide.
For the estimates concerning regularity, we will strongly use the
following lemma from [6].
Technical Lemma. Let $\omega : [0, \eta] \to [0, \omega(\eta)]$ be a function (modulus of continuity) such that $s \mapsto s/\omega(s)$ is non-increasing. If I, J are closed non-degenerate intervals such that $1/2 \leq |I|/|J| \leq 2$ and
$$\left| \frac{|J|}{|I|} - 1 \right| \frac{1}{\omega(|I|)} \leq M,$$
then there exists a $C^{1+\omega}$ diffeomorphism $f : I \to J$ that is tangent to the identity at the endpoints and whose derivative has $\omega$-norm bounded from above by 6M.
Actually, for I :=[a, b] and J :=[a
, b
,b
a,b
,
where
a,b
is dened by (a similar denition stands for
a
,b
)
a,b
(x) =
1
(b a)
ctg
_
_
x a
b a
_
_
.
The condition on the derivative at the endpoints allows us to fit together the maps in order to create a diffeomorphism of a larger interval. Actually, if all of the involved sub-intervals of type I, J satisfy the hypothesis of the lemma above for the same constant M, then the $\omega$-norm of the derivative of the induced diffeomorphism is bounded from above by 12M.
In what follows, we will deal with the modulus of continuity (s)=
s
|[u
i+1
k+1
, v
i+1
k+1
]|
|[u
i
k+1
, v
i
k+1
]|
1
1
|[u
i
k+1
, v
i
k+1
]|
= |
k
1|
1
(
i
k
|[u
0
k+1
, v
0
k+1
]|)
|
k
1|
1
|[b
k+1
, c
k+1
]|
(1+)
.
Now from (1) one obtains
2
k
k
=
c
2(1+2
k
)
1+
(
c
2(1+2
k+1
)
1+
)
1+
M
_
(1 + 2
k+1
)
1+
1 + 2
k
_
1+
M2
k(1+)
.
From the inequality |2
|[u
i+1
k+1
, v
i+1
k+1
]|
|[u
i
k+1
, v
i
k+1
]|
1
1
|[u
i
k+1
, v
i
k+1
]|
M
k
2
k
2
k(1+)(1+)
.
[Figure 2. The map f sends the interval $[a_{2^{k+1}-i}, a_{2^{k+1}-i-1}]$ (length B), which contains $[u^i_{k+1}, v^i_{k+1}]$ (length A), onto $[a_{2^{k+1}-i-1}, a_{2^{k+1}-i-2}]$ (length D), which contains $[u^{i+1}_{k+1}, v^{i+1}_{k+1}]$ (length C).]
Now, for (ii), set $A := |[u^i_{k+1}, v^i_{k+1}]|$, $B := |[a_{2^{k+1}-i}, a_{2^{k+1}-i-1}]|$, $C := |[u^{i+1}_{k+1}, v^{i+1}_{k+1}]|$, and $D := |[a_{2^{k+1}-i-1}, a_{2^{k+1}-i-2}]|$. Then
|[a
2
k+1
i1
, u
i+1
k+1
]|
|[a
2
k+1
i
, u
i
k+1
]|
1
1
|[a
2
k+1
i
, u
i
k+1
]|
D C
B A
1
(B A)
.
Moreover, since A B/2 and C =
k
A,
D C
B A
1
D B
B A
C A
B A
D B
B
+|
k
1|
=
M
B
_
1
(2
k+1
i 2)
1+
1
(2
k+1
i 1)
1+
_
+ M
k
2
k
MB
_
(2
k+1
i 1)
1+
(2
k+1
i 2)
1+
+ M
k
2
k
M
2
k(1+)
2
k
+ M
k
2
k
M
k
2
k
.
Therefore,
D C
B A
1
(B A)
M
k
2
k
2
k(1+)
,
hence
(3)
|[a
2
k+1
i1
, u
i+1
k+1
]|
|[a
2
k+1
i
, u
i
k+1
]|
1
1
|[a
2
k+1
i
, u
i
k+1
]|
M
k
2
k(1(1+))
.
Finally, notice that by construction, the estimates for (iii) are the
same as those for (ii).
Estimates for g: The diffeomorphism g is obtained by fitting together the maps provided by the Technical Lemma sending:
(i) $[b_k, u^0_k]$ into $[b_k, v^0_k]$,
(ii) $[u^0_k, c_k]$ into $[v^0_k, c_k]$,
(iii) $[a_{2^k+1}, b_k]$ and $[c_k, a_{2^k}]$ into themselves as the identity.
For (i), notice that
|[b
k
, v
0
k
]|
[b
k
, u
0
k
]
1
1
|[b
k
, u
0
k
]|
=
|[u
0
k
, v
0
k
]|
|[b
k
, u
0
k
]|
1+
2
1+
|[u
0
k
, v
0
k
]|
(|[b
k
, c
k
]| |[u
0
k
, v
0
k
]|)
1+
=
2
1+
|[b
k
, c
k
]|
1+
(|[b
k
, c
k
]| |[b
k
, c
k
]|
1+
)
1+
,
thus
(4)
|[b
k
, v
0
k
]|
[b
k
, u
0
k
]
1
1
|[b
k
, u
0
k
]|
M|[b
k
, c
k
]|
.
The estimates for (ii) are similar to those for (i) and we leave them
to the reader.
The choice of the parameters: According to our Technical Lemma, and due to (2), (3), and (4), sufficient conditions for the $C^{1+\alpha}$ smoothness of f, g are:
(1 + )(1 + ) < 1,
1
1+
>
> .
Now, for 0 < < (
ES NAVAS
[6] Navas, A. Growth of groups and diffeomorphisms of the interval.
Geom. and Functional Analysis 18 (2008), 988-1028.
[7] Tsuboi, T. Homological and dynamical study on certain groups
of Lipschitz homeomorphisms of the circle. J. Math. Soc. Japan 47
(1995), 1-30.
Univ. de Santiago de Chile, Alameda 3363, Santiago, Chile
E-mail address: [email protected]
ACCESSIBILITY AND ABUNDANCE OF
ERGODICITY IN DIMENSION THREE: A SURVEY.
FEDERICO RODRIGUEZ HERTZ, JANA RODRIGUEZ HERTZ,
AND RAÚL URES
Abstract. In [18] the authors proved the Pugh-Shub conjecture for partially hyperbolic diffeomorphisms with 1-dimensional center, i.e. stably ergodic diffeomorphisms are dense among the partially hyperbolic ones and, in subsequent results [20, 21], they obtained a more accurate description of this abundance of ergodicity in dimension three. This work is a survey of this subject.
1. Introduction
The purpose of this survey is to present the state of the art in the study of the ergodicity of conservative partially hyperbolic diffeomorphisms on three-dimensional manifolds. In fact, we shall mainly describe the results contained in [20, 21]. The study of partial hyperbolicity has been one of the most active topics in dynamics over the last years and we do not intend to describe all the related results, even for 3-manifolds. Some of the important themes excluded from this survey are entropy maximizing measures, absolute continuity of center foliations, co-cycles over partially hyperbolic systems, SRB-measures, dynamical coherence, classification, etc.
A diffeomorphism $f : M \to M$ of a closed smooth manifold M is partially hyperbolic if TM splits into three invariant bundles such that one of them is contracting, the other is expanding, and the
Date: August 24, 2011.
2000 Mathematics Subject Classification. Primary: 37D30, Secondary: 37A25.
Key words and phrases. partial hyperbolicity; accessibility property; ergodicity; laminations.
third, called the center bundle, has an intermediate behavior, that is,
not as contracting as the first, nor as expanding as the second (see
Subsection 2.3 for a precise definition). The first and second bundles
are called strong bundles.
A central point in dynamics is to find conditions that guarantee ergodicity. In 1994, the pioneering work of Grayson, Pugh and Shub [9] suggested that partial hyperbolicity could be essentially a sufficient condition for ergodicity. Indeed, soon afterwards, Pugh and Shub conjectured that stable ergodicity (open sets of ergodic diffeomorphisms) is dense among partially hyperbolic systems. They proposed as an important tool the accessibility property (see also the previous work by Brin and Pesin [2]): f is accessible if any two points of M can be joined by a curve that is a finite union of arcs tangent to the strong bundles. Essential accessibility is the weaker property that any two measurable sets of positive measure can be joined by such a curve. In fact, accessibility will play a key role in this survey. Pugh and Shub split their Conjecture into two sub-conjectures: (1) essential accessibility implies ergodicity, (2) the set of partially hyperbolic diffeomorphisms contains an open and dense set of accessible diffeomorphisms.
Many advances have been made since then in the ergodic theory of partially hyperbolic diffeomorphisms. In particular, there is a result by Burns and Wilkinson [4] proving that essential accessibility plus a bunching condition (trivially satisfied if the center bundle is one-dimensional) implies ergodicity. There is also a result by the authors [18] obtaining the complete Pugh-Shub conjecture for one-dimensional center bundle. See [19] for a survey on the subject.
We have therefore that almost all partially hyperbolic diffeomorphisms with one-dimensional center bundle are ergodic. This means that the non-ergodic partially hyperbolic systems are very few. Can we describe them? Concretely,
Question 1.1. Which manifolds support a non-ergodic partially hyperbolic diffeomorphism? What do they look like?
In this survey we give a description of what is known about this question for three-dimensional manifolds. We study the sets of points that can be joined by paths everywhere tangent to the strong bundles (accessibility classes), and arrive, using tools of geometry of laminations and topology of 3-manifolds, at the somewhat surprising conclusion that there are strong obstructions to the non-ergodicity of a partially hyperbolic diffeomorphism. See Theorems 1.4, 1.6 and 1.7.
This gave us enough evidence to conjecture the following:
Conjecture 1.2 ([20]). The only orientable manifolds supporting non-ergodic partially hyperbolic diffeomorphisms in dimension 3 are the mapping tori of diffeomorphisms of surfaces which commute with Anosov diffeomorphisms.
Specifically, they are (1) the mapping tori of Anosov diffeomorphisms of $\mathbb{T}^2$, (2) $\mathbb{T}^3$, and (3) the mapping torus of $-\mathrm{id}$, where $\mathrm{id} : \mathbb{T}^2 \to \mathbb{T}^2$ is the identity map on the 2-torus.
Indeed, we believe that for 3-manifolds, all partially hyperbolic diffeomorphisms are ergodic, unless the manifold is one of those listed above.
In the case that $M = \mathbb{T}^3$ we can be more specific and we also conjecture that:
Conjecture 1.3. Let $f : \mathbb{T}^3 \to \mathbb{T}^3$ be a conservative partially hyperbolic diffeomorphism homotopic to a hyperbolic automorphism. Then, f is ergodic.
In [20] we proved Conjecture 1.2 when the fundamental group of the manifold is nilpotent:
Theorem 1.4. All the conservative $C^2$ partially hyperbolic diffeomorphisms of a compact orientable 3-manifold with nilpotent fundamental group are ergodic, unless the manifold is $\mathbb{T}^3$.
A paradigmatic example is the following. Let M be the mapping
torus of A
k
: T
2
T
2
, where A
k
is the automorphism given by the
matrix
1 k
0 1
, the
transition functions
: R
n
T R
n
T are homeomorphisms
and take the form:
(u, v) = (l
(u, v), t
(v)),
where l
are C
1
with respect to the u variable. No dierentiability
is required in the transverse direction T. The sets
1
(R
n
{t}) are
called plaques. Each point x of a lamination belongs to a maximal connected injectively immersed n-submanifold, called the leaf of x in L. The leaves are unions of plaques. Observe that the leaves are $C^1$, but vary only continuously. The number n is the dimension of the lamination. If n = dim M − 1, we say $\Lambda$ is a codimension-one lamination. The set L is an f-invariant lamination if it is a lamination such that f takes leaves into leaves.
We call a lamination a foliation if $\Lambda = M$. In this case, we shall denote by $\mathcal{F}$ the set of leaves. In principle, we shall not assume any transverse differentiability. However, in case $l$ is $C^r$ with respect to the v variable, we shall say that the foliation is $C^r$. Note that even purely $C^0$ codimension-one foliations admit a transverse 1-dimensional foliation (see Siebenmann [25]). In our case the existence of this 1-dimensional foliation is trivial thanks to the existence of the 1-dimensional center bundle $E^c$. This allows us to translate many local deformation arguments, usually given in the $C^2$ category, into the $C^0$ category as observed, for instance, by Solodov [26]. In particular, Theorems 2.1 and 2.3, which were originally formulated for $C^2$ foliations, hold in the $C^0$ case. We shall say that a codimension-one foliation $\mathcal{F}$ is transversely orientable if the transverse 1-dimensional foliation mentioned above is orientable. An invariant foliation is a foliation that is an invariant lamination.
Let $\Lambda$ be a codimension-one lamination that is not a foliation. A
complementary region V is a component of $M \setminus \Lambda$. A closed
complementary region $\hat{V}$ is the metric completion of a complementary region
V with the path metric induced by the Riemannian metric, the
distance between two points being the infimum of the lengths of paths in
V connecting them. A closed complementary region is independent
of the metric. Note that they are not necessarily compact. If $\Lambda$ does
not have compact leaves, then every closed complementary region
decomposes into a compact gut piece and non-compact interstitial
regions which are I-bundles over non-compact surfaces, and get thinner
and thinner as they go away from the gut (see [13] or [8]). The
interstitial regions meet the gut along annuli. The decomposition into
interstitial regions and guts is unique up to isotopy. Moreover, one
can take the interstitial regions as thin as one wishes.
A boundary leaf is a leaf corresponding to a component of $\partial \hat{V}$, for
$\hat{V}$ a closed complementary region. That is, a leaf is a non-boundary
leaf if it is not contained in a closed complementary region.
Figure 1. A Reeb component
The geometry of codimension-one foliations is deeply related to the
topology of the manifold that supports them. The following subset
of a foliation is important in their description. A Reeb component
is a solid torus whose interior is foliated by planes transverse to the
core of the solid torus, such that each leaf limits on the boundary
torus, which is also a leaf (see Figure 1). A foliation that has no Reeb
components is called Reebless.
The following theorems make the above-mentioned relation more precise:
Theorem 2.1 (Novikov). Let M be a compact orientable 3-manifold
and $\mathcal{F}$ a transversely orientable codimension-one foliation. Then each
of the following implies that $\mathcal{F}$ has a Reeb component:
(1) There is a closed, nullhomotopic transversal to $\mathcal{F}$
(2) There is a leaf L in $\mathcal{F}$ such that $\pi_1(L)$ does not inject in $\pi_1(M)$
The statement of this theorem can be found, for instance, in [6,
Theorems 9.1.3 & 9.1.4, p. 288]. We shall also use the following
theorem:
Theorem 2.2 (Haefliger). Let $\Lambda$ be a codimension-one lamination in
M. Then the set of points belonging to compact leaves is compact.
This theorem was originally formulated for foliations [10]. How-
ever, it also holds for laminations, see for instance [13].
We have the following consequence of Novikov's Theorem about
Reebless foliations. This theorem is stated in [24] as Corollary 2 on
page 44.
Theorem 2.3. If M is a compact 3-manifold and $\mathcal{F}$ is a transversely
orientable codimension-one Reebless foliation, then either $\mathcal{F}$ is the
product foliation of $S^2 \times S^1$, or $\tilde{\mathcal{F}}$, the foliation induced by $\mathcal{F}$ on the
universal cover $\tilde{M}$ of M, is a foliation by planes $\mathbb{R}^2$. In particular,
if $M \neq S^2 \times S^1$ then M is irreducible.
This theorem was originally stated for $C^2$ foliations, but it also
holds for $C^0$ foliations, due to Siebenmann's theorem mentioned
above.
2.2. Topologic preliminaries. Let M be a 3-dimensional manifold.
A manifold M is irreducible if every 2-sphere $S^2$ embedded in
the manifold bounds a 3-ball. Recall that a 2-torus T embedded in
M is an Anosov torus if there exists a diffeomorphism $f : M \to M$
such that f(T) = T and the action induced by f on $\pi_1(T)$, that is,
$f_\#|_T : \pi_1(T) \to \pi_1(T)$, is a hyperbolic automorphism. Equivalently,
f restricted to T is isotopic to a hyperbolic automorphism.
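The hyperbolicity condition in the definition of an Anosov torus can be checked directly on the matrix of the induced action on $\pi_1(T) \cong \mathbb{Z}^2$. The following is a minimal sketch, assuming the action is given as an integer 2x2 matrix of determinant +1 or -1; the function name `is_hyperbolic` and the sample matrices are illustrative and do not come from the text.

```python
def is_hyperbolic(matrix):
    """Return True if an integer 2x2 matrix with det = +1 or -1
    has no eigenvalue of modulus 1 (illustrative helper)."""
    (a, b), (c, d) = matrix
    det = a * d - b * c
    trace = a + d
    if det not in (1, -1):
        raise ValueError("expected an integer matrix with determinant +1 or -1")
    if det == 1:
        # Eigenvalues are l and 1/l; they leave the unit circle iff |trace| > 2.
        return abs(trace) > 2
    # det = -1: the eigenvalues are real with product -1, so they have
    # modulus 1 only when they are +1 and -1, that is, when the trace is 0.
    return trace != 0


cat_map = [[2, 1], [1, 1]]   # a standard hyperbolic automorphism of the 2-torus
shear = [[1, 3], [0, 1]]     # a matrix of the form A_k with k = 3

print(is_hyperbolic(cat_map))  # True
print(is_hyperbolic(shear))    # False
```

In particular, the shear matrices $A_k$ used to build mapping tori are never hyperbolic, since their trace is 2.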
We shall assume from now on that M is an irreducible 3-manifold,
since this is the case for 3-manifolds supporting partially hyperbolic
diffeomorphisms. In this subsection, we will focus on what is called
the JSJ-decomposition of M (see below). That is, we will cut M
along a certain kind of tori, called incompressible, and will obtain
certain 3-manifolds with boundary that are easier to handle, which are,
respectively, Seifert manifolds, and atoroidal and acylindrical
manifolds. Let us introduce these definitions first.
An orientable surface S embedded in M is incompressible if the
homomorphism induced by the inclusion map $i_\# : \pi_1(S) \to \pi_1(M)$ is
injective; or, equivalently, if there is no embedded disc $D^2 \subset M$ such
that $D \cap S = \partial D$ and $\partial D \not\sim 0$ in S (see, for instance, [12, Page 10]).
We also require that $S \neq S^2$.
A manifold with or without boundary is Seifert if it admits a
one-dimensional foliation by closed curves, called a Seifert fibration. The
boundary of an orientable Seifert manifold with boundary consists of
a finite union of tori. There are many examples of Seifert manifolds,
for instance $S^3$, $T_1 S$ where S is a surface, etc.
The other type of manifold obtained in the JSJ-decomposition consists of
atoroidal and acylindrical manifolds. A 3-manifold with boundary N
is atoroidal if every incompressible torus is $\partial$-parallel, that is, isotopic
to a subsurface of $\partial N$. A 3-manifold with boundary N is acylindrical
if every incompressible annulus A that is properly embedded, i.e.
$\partial A \subset \partial N$, is $\partial$-parallel, by an isotopy fixing $\partial A$.
As we mentioned before, a closed irreducible 3-manifold admits a
natural decomposition into Seifert pieces on one side, and atoroidal
and acylindrical components on the other:
Theorem 2.4 (JSJ-decomposition [14], [15]). If M is an irreducible
closed orientable 3-manifold, then there exists a collection of disjoint
incompressible tori $\mathcal{T}$ such that each component of $M \setminus \mathcal{T}$ is either
Seifert, or atoroidal and acylindrical. Any minimal such collection is
unique up to isotopy. This means that if $\mathcal{T}$ is a collection as described
above, it contains a minimal sub-collection $m(\mathcal{T})$ satisfying the same
claim. All collections $m(\mathcal{T})$ are isotopic.
Any minimal family of incompressible tori as described above is
called a JSJ-decomposition of M. When it is clear from the context we
shall also call JSJ-decomposition the set of pieces obtained by cutting
the manifold along these tori. Note that if M is either atoroidal or
Seifert, then $\mathcal{T} = \emptyset$.
2.3. Dynamic preliminaries. Throughout this paper we shall work
with a partially hyperbolic diffeomorphism f, that is, a diffeomorphism
admitting a non-trivial Tf-invariant splitting of the tangent bundle
$TM = E^s \oplus E^c \oplus E^u$, such that all unit vectors $v^\sigma \in E^\sigma_x$ ($\sigma = s, c, u$)
with $x \in M$ verify:
$$\|T_x f v^s\| < \|T_x f v^c\| < \|T_x f v^u\|$$
for some suitable Riemannian metric. f also must satisfy that
$\|Tf|_{E^s}\| < 1$ and $\|Tf^{-1}|_{E^u}\| < 1$. We shall say that a partially hyperbolic
diffeomorphism f that satisfies
$$\|T_x f v^s\| < \|T_y f v^c\| < \|T_z f v^u\| \quad \forall x, y, z \in M$$
is absolutely partially hyperbolic.
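As an illustration of the definition, consider the following standard example on $\mathbb{T}^3 = \mathbb{T}^2 \times S^1$, which is not taken from the text above: a hyperbolic automorphism on the first factor times the identity on the second. The particular matrix $A$ and the use of the Euclidean metric are choices made here for concreteness.

```latex
\[
f(p,\theta) = (Ap,\theta), \qquad
A = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}, \qquad
\lambda = \frac{3+\sqrt{5}}{2} \approx 2.618 .
\]
% A is symmetric with eigenvalues \lambda and \lambda^{-1}; E^s and E^u are the
% corresponding eigendirections, and E^c is tangent to the circle factor.
\[
\|T_x f\, v^s\| = \lambda^{-1} \;<\; \|T_y f\, v^c\| = 1 \;<\; \|T_z f\, v^u\| = \lambda
\qquad \forall x, y, z \in \mathbb{T}^3 .
\]
```

Since the three rates are constants, this f is absolutely partially hyperbolic. It preserves Lebesgue measure but is not ergodic, because any function of $\theta$ alone is invariant; this is consistent with $\mathbb{T}^3$ being excluded in Theorem 1.4.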
We shall also assume that f is conservative, i.e. it preserves Lebesgue
measure associated to a smooth volume form.
It is a known fact that there are foliations W
(x).
In general it is not true that there is a foliation tangent to $E^c$. It is
false even in case $\dim E^c = 1$ (see [22]). However, in Proposition 3.4
of [1] it is shown that if $\dim E^c = 1$, then f is weakly dynamically
coherent. This means that for each $x \in M$ there are complete immersed
$C^1$ manifolds which contain x and are everywhere tangent to $E^c$, $E^{cs}$
and $E^{cu}$, respectively. We will call a center curve any curve which is
everywhere tangent to $E^c$. Moreover, we will use the following fact:
Proposition 2.5 ([1]). If $\gamma$ is a center curve through x, then
$$W^s(\gamma) = \bigcup_{y \in \gamma} W^s(y) \quad \text{and} \quad W^u(\gamma) = \bigcup_{y \in \gamma} W^u(y)$$
are $C^1$ immersed manifolds everywhere tangent to $E^s \oplus E^c$ and $E^c \oplus E^u$
respectively.
We shall say that a set X is s-saturated or u-saturated if it is a union
of leaves of the strong foliations $W^s$ or $W^u$ respectively. We also say
that X is su-saturated if it is both s- and u-saturated. The
accessibility class AC(x) of the point $x \in M$ is the minimal su-saturated
set containing x. Note that the accessibility classes form a partition
of M. If there is some $x \in M$ whose accessibility class is M, then
the diffeomorphism f is said to have the accessibility property. This
is equivalent to saying that any two points of M can be joined by a
path which is piecewise tangent to $E^s$ or to $E^u$. A diffeomorphism is
said to be essentially accessible if any su-saturated set has full or null
measure.
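To make these notions concrete, consider the standard example $f(p,\theta) = (Ap,\theta)$ on $\mathbb{T}^2 \times S^1$, with $A$ a hyperbolic automorphism of $\mathbb{T}^2$. This example is an assumption introduced here for illustration and is not discussed in the text above; it is only a sketch of how accessibility can fail.

```latex
% Strong stable and unstable leaves stay inside horizontal tori:
\[
W^s(p,\theta) \cup W^u(p,\theta) \subset \mathbb{T}^2 \times \{\theta\},
\qquad \text{so} \qquad
AC(p,\theta) = \mathbb{T}^2 \times \{\theta\} .
\]
% An su-saturated set of intermediate measure (m = normalized Lebesgue measure):
\[
X = \mathbb{T}^2 \times [0, \tfrac{1}{2}] \subset \mathbb{T}^2 \times (\mathbb{R}/\mathbb{Z}),
\qquad m(X) = \tfrac{1}{2} .
\]
```

In this example no accessibility class is open, and f is neither accessible nor essentially accessible.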
The theorem below relates accessibility with ergodicity. In fact it
is proven in a more general setting, but we shall use the following
formulation:
Theorem 2.6 ([4],[18]). If f is a $C^2$ conservative partially
hyperbolic diffeomorphism with the (essential) accessibility property and
$\dim E^c = 1$, then f is ergodic.
In [20] it is proved that there are manifolds whose topology implies
that the accessibility property holds for all partially hyperbolic
diffeomorphisms. In these manifolds, all partially hyperbolic diffeomorphisms
are ergodic.
Sometimes we will focus on the openness of the accessibility classes.
Note that the accessibility classes form a partition of M. If all of them
are open then, in fact, f has the accessibility property. We will call
$U(f) = \{x \in M;\ AC(x) \text{ is open}\}$ and $\Gamma(f) = M \setminus U(f)$. Note that f
has the accessibility property if and only if $\Gamma(f) = \emptyset$. We have the
following property of non-open accessibility classes:
Proposition 2.7 ([18]). The set $\Gamma(f)$ is a codimension-one
lamination, having the accessibility classes as leaves.
In fact, any compact su-saturated subset of $\Gamma(f)$ is a lamination.
The above proposition is Proposition A.3 of [18]. The fact that the
leaves of $\Gamma(f)$ are $C^1$ may be found in [7]. The following proposition
is Proposition A.5 of [18]:
Proposition 2.8 ([18]). If $\Lambda$ is an invariant sub-lamination of $\Gamma(f)$,
then each boundary leaf of $\Lambda$ is periodic and the periodic points are
dense in it (with the induced topology).
Moreover, the stable and unstable manifolds of each periodic point
are dense in each plaque of a boundary leaf of $\Lambda$.
Observe that the proof of Proposition A.5 of [18] shows in fact that
periodic points are dense in the accessibility classes of the boundary
leaves of V endowed with its intrinsic topology. In other words,
periodic points are dense in each plaque of the boundary leaves of
V .
We shall also use the following theorem by Brin, Burago and Ivanov,
whose proof is in [1], after Proposition 2.1.
Theorem 2.9 ([1]). If $f : M^3 \to M^3$ is a partially hyperbolic
diffeomorphism, and there is an open set V foliated by center-unstable
leaves, then there cannot be a closed center-unstable leaf bounding a
solid torus in V.
3. Anosov tori
In this section we will say a few words about the proof of Theorem
1.6. The idea of its proof is that, given an Anosov torus T, we can
place T so that either T belongs to the family $\mathcal{T}$ given by the
JSJ-decomposition (Theorem 2.4), or else T is in a Seifert component,
and it is either transverse to all fibers, or it is a union of fibers of this
Seifert component. See Proposition 3.3.
It is important to note the following property of Anosov tori:
Theorem 3.1 ([20]). Anosov tori are incompressible.
An Anosov torus in an atoroidal component will then be $\partial$-parallel
to a component of its boundary. In this case, we can assume $T \in \mathcal{T}$.
On the other hand, the theorem of Waldhausen below guarantees
that we can always place an incompressible torus in a Seifert manifold
in a standard form. Namely, a surface is horizontal
in a Seifert manifold if it is transverse to all fibers, and vertical if it
is a union of fibers:
Theorem 3.2 (Waldhausen [27]). Let M be a compact connected
Seifert manifold, with or without boundary. Then any incompressible
surface can be isotoped to be horizontal or vertical.
The architecture of the proof of Theorem 1.6 is contained in the
following proposition.
Proposition 3.3. Let T be an Anosov torus of a closed irreducible
orientable manifold M. Then, there exists a diffeomorphism $f : M \to M$
and a JSJ-decomposition $\mathcal{T}$ such that
(1) $f|_T$ is a hyperbolic toral automorphism,
(2) $f(\mathcal{T}) = \mathcal{T}$, and
(3) one of the following holds:
(a) $T \in \mathcal{T}$,
(b) T is a vertical torus in a Seifert component of $M \setminus \mathcal{T}$,
and T is not $\partial$-parallel in this component,
(c) M is a Seifert manifold ($\mathcal{T} = \emptyset$), and T is a horizontal
torus.
The proposition above allows us to split the proof of Theorem 1.6
into cases. Note that case (3b) includes the case in which M is a
Seifert manifold and T is a vertical torus.
In the case that T is a vertical torus in a Seifert component we
can cut this component along T. Then we can suppose that T is in
the boundary. We take advantage of the fact that in most manifolds the
Seifert fibration is unique up to isotopy. Since the dynamics restricted
to T is Anosov, we have that the manifold has more than one Seifert
fibration. This leads us to show that this Seifert component must be
$\mathbb{T}^2 \times [0, 1]$. This gives that the whole manifold must be one of the
manifolds of Theorem 1.6.
If T is a horizontal torus then the manifold M is Seifert and T
intersects all the fibers. This case is ruled out by a case-by-case study,
thanks to the fact that the Seifert manifolds having a horizontal torus
are finite in number.
The last and most difficult case is when T is part of the
JSJ-decomposition but it is not the boundary of a Seifert component.
The proof in this case is complicated, but a very rough idea is to take
a properly embedded surface S with an essential circle of T in its
boundary. Taking a large iterate $f^n(S)$ and considering $S \cap f^n(S)$,
it is possible to construct a non-parallel incompressible cylinder as a
union of a band in S and a band in $f^n(S)$. This leads to a contradiction
because the component is not Seifert and hence it is acylindrical.
4. The su-lamination $\Gamma(f)$
Let f be a partially hyperbolic diffeomorphism of a compact
3-manifold M. From Subsection 2.3 it follows that we have three
possibilities: (1) f has the accessibility property, (2) the union of all
non-open accessibility classes is a strict lamination, $\emptyset \neq \Gamma(f) \subsetneq M$, or (3)
the union of all non-open accessibility classes foliates M: $\Gamma(f) = M$.
Now, we shall distinguish two possible cases in situations (2) and
(3):
(a) the lamination $\Gamma(f)$ does not contain compact leaves
(b) the lamination $\Gamma(f)$ contains compact leaves
In this section we deal with the case (2a). In fact, for our purposes
it will be sufficient to assume that there exists an f-invariant
sub-lamination $\Lambda$ of $\Gamma(f)$ without compact leaves. Section 5 treats the
cases (2b) and (3b). Section 6 treats the case (3a).
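The case analysis above can be summarized as a small lookup. The sketch below is only a bookkeeping aid: the function `classify` and its three boolean arguments are hypothetical names introduced here, and deciding the actual values of those booleans for a given diffeomorphism is the mathematical content of the paper, not something this code does.

```python
def classify(gamma_is_empty, gamma_is_all_of_m, has_compact_leaf):
    """Map the three possibilities and two sub-cases for the lamination of
    non-open accessibility classes to the section that treats them."""
    if gamma_is_empty and gamma_is_all_of_m:
        raise ValueError("the lamination cannot be both empty and all of M")
    if gamma_is_empty:
        # Possibility (1): every accessibility class is open.
        return ("1", "f has the accessibility property")
    situation = "3" if gamma_is_all_of_m else "2"
    sub_case = "b" if has_compact_leaf else "a"
    sections = {
        "2a": "Section 4",
        "2b": "Section 5",
        "3b": "Section 5",
        "3a": "Section 6",
    }
    label = situation + sub_case
    return (label, sections[label])


print(classify(False, False, False))  # ('2a', 'Section 4')
print(classify(False, True, True))    # ('3b', 'Section 5')
```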
In this section, we will prove that the complement of $\Lambda$ consists
of I-bundles. To this end, we shall assume that the bundles E
V = I(V ) G(V ).
The following statement is rather standard:
Lemma 4.3. Let $f : M \to M$ be a partially hyperbolic diffeomorphism.
If U is an open invariant set such that U (f), then the closure of
U is su-saturated.
Let us observe that if $\hat{V}$ is connected then there are only two
boundary leaves of $\hat{V}$. Indeed, as we mentioned before, periodic points are
dense in boundary leaves. This fact, jointly with the local product
structure, implies, using standard arguments, that the stable and
unstable leaves of periodic points are dense too. Take a periodic point
p in a boundary leaf and in the interstitial region. There are center
curves joining the points in the local stable manifold of p with another
boundary leaf $L_1$ of $\hat{V}$ (the same property holds for the local
unstable manifold). Invariance of the stable manifold of p and of the
boundary leaves gives that the center curve of any point of the stable
manifold joins the boundary leaf $L_0$ containing p with $L_1$. Denseness of the
stable and unstable manifolds of p implies that the complement of
the set of points whose center manifold joins $L_0$ with $L_1$ is
totally disconnected. Then, it is not difficult to see that $L_0$ and $L_1$
are the unique boundary leaves of $\hat{V}$.
Also, since periodic points are dense in the boundary leaves due
to Proposition 2.8, there is an iterate of f that fixes all connected
components of $\partial \hat{V}$, so we will assume when proving Theorem 4.1 that
a, Universidad de la República, CC 30 Montevideo, Uruguay
The goal of Publicaciones Matemáticas del Uruguay (PMU) is to
reflect part of the mathematics research activities taking place in
Uruguay. It is our interest to publish research articles, survey-type
articles, research announcements and other papers considered
suitable by the Editorial Board.
The editorial process may or may not involve a revision by referees.
This will be carefully indicated in each volume.
All papers in this volume have been peer-reviewed.