
Trends in Mathematics

Richard M. Aron
Mohammad Sal Moslehian
Ilya M. Spitkovsky
Hugo J. Woerdeman
Editors

Operator and Norm Inequalities and Related Topics
Trends in Mathematics
Trends in Mathematics is a series devoted to the publication of volumes arising
from conferences and lecture series focusing on a particular topic from any area of
mathematics. Its aim is to make current developments available to the community as
rapidly as possible without compromise to quality and to archive these for reference.

Proposals for volumes can be submitted using the Online Book Project Submission
Form at our website www.birkhauser-science.com.

Material submitted for publication must be screened and prepared as follows:

All contributions should undergo a reviewing process similar to that carried out by
journals and be checked for correct use of language which, as a rule, is English.
Articles without proofs, or which do not contain any significantly new results,
should be rejected. High quality survey papers, however, are welcome.

We expect the organizers to deliver manuscripts in a form that is essentially ready
for direct reproduction. Any version of TeX is acceptable, but the entire collection
of files must be in one particular dialect of TeX and unified according to simple
instructions available from Birkhäuser.

Furthermore, in order to guarantee the timely appearance of the proceedings it is
essential that the final version of the entire material be submitted no later than one
year after the conference.
Editors
Richard M. Aron
Department of Mathematical Sciences
Kent State University
Kent, OH, USA

Mohammad Sal Moslehian
Department of Pure Mathematics
Ferdowsi University of Mashhad
Mashhad, Iran

Ilya M. Spitkovsky
Department of Mathematics
New York University Abu Dhabi
Abu Dhabi, United Arab Emirates

Hugo J. Woerdeman
Department of Mathematics
Drexel University
Philadelphia, PA, USA

ISSN 2297-0215 ISSN 2297-024X (electronic)

Trends in Mathematics
ISBN 978-3-031-02103-9 ISBN 978-3-031-02104-6 (eBook)
https://doi.org/10.1007/978-3-031-02104-6

Mathematics Subject Classification: 46L08, 15A09, 47A30, 47A55, 47B35, 47B38, 47B32, 30H10,
42B35, 44A15, 46A16, 16W80, 46L10, 46L54, 47L55; Primary: 15A42, 15A45, 47A63, 47A64, 46L30,
47A60, 47B49, 47A30, 46B20, 46C15, 52A21, 46C50, 47B38, 47A10, 47A11, 47B48, 46E05, 46E10,
46E15, 46E40, 47E38, 47H30, 46B04, 26D10, 34A40, 35A23, 47B37, 46L05, 42B35, 35B40, 45A07,
45G10; Secondary: 47A64, 47A30, 81P17, 15A45, 47A63, 47B65, 47A56, 47A60, 47B10, 47B15,
47B20, 47B44, 47B47, 05C20, 46B10, 46B28, 46C05, 47L25, 47L30, 47B6, 47A53, 47A55, 06F20,
47H07, 47J05, 46B07, 46B20, 34L10, 46L60, 42B30, 60G42, 42B25, 46E30, 47H06, 47H20

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland
AG 2022
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.

This book is published under the imprint Birkhäuser, www.birkhauser-science.com by the registered
company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface

Inequalities play a central role in mathematics with various applications in other
disciplines. The main goal of this contributed volume is to present several important
matrix, operator, and norm inequalities in a systematic and self-contained fashion.
The volume includes contributions by a number of the world’s leading specialists
in functional analysis and operator theory. It contains the latest developments of
significant mathematical inequalities in numerous fields in the last decades that are
of interest to a wide audience of pure and applied mathematicians.
This book consists of 5 parts and includes a total of 23 chapters. The chapters
are written in a reader-friendly style and can be read independently. Each chapter
contains a rich bibliography.

Part I: Matrix and Operator Inequalities

Whenever we see an inequality concerning real or complex numbers, an interesting
question is to ask ourselves whether it is true for matrices or bounded linear
operators on a Hilbert space. This is based on the fact that the real linear space
of self-adjoint operators (Hermitian matrices) can be regarded as a generalization
of the real line. One of the most significant notions in this part is the concept of
operator monotone function, which was first studied by C. Löwner [Math. Z. 38
(1934), 177–216], and its connection with operator means was introduced by F.
Kubo and T. Ando [Math. Ann. 246 (1980), no. 3, 205–224]; [cf. Simon, Barry.
Loewner’s theorem on monotone matrix functions. Grundlehren der mathematis-
chen Wissenschaften, 354. Springer, Cham, 2019].
Chapter “Log-majorization Type Inequalities” is devoted to studying the link
between majorization theory and several matrix inequalities such as Araki’s log
majorization, the Löwner–Heinz, the Furuta, the Golden-Thompson, the von Neu-
mann trace, and their extensions.


In Chapter “Ando-Hiai Inequality: Extensions and Applications”, extensions
and applications of the Ando-Hiai inequality are investigated and the Furuta, the
Bebiano–Lemos–Providência, and the grand Furuta inequalities are explored.
Chapter “Relative Operator Entropy” presents the relative operator entropy
as the tangent vector of the geodesic in the manifold of positive invertible
operators. Tsallis relative entropy is studied as the secant of a path of geometric
matrix means.
Chapter “Matrix Inequalities and Characterizations of Operator Monotone Func-
tions” includes various characterizations of operator monotone functions using
matrix inequalities involving matrix means. A trace monotonicity inequality and the
Powers-Størmer inequality are used to characterize operator monotone functions.
Chapter “Perspectives, Means and Their Inequalities” focuses on the operator
perspective and its extensions including operator means and the Pusz–Woronowicz
functional calculus.
Chapter “Cauchy–Schwarz Operator and Norm Inequalities for Inner Product
Type Transformers in Norm Ideals of Compact Operators, with Applications”
provides an overview of operator and norm inequalities of Cauchy–Schwarz type for
strongly square integrable operator families and symmetrically norming functions.
Some applications to the Aczél–Bellman, Grüss–Landau, arithmetic–geometric,
Young, Heinz, and Heron inequalities are presented.
Chapter “Norm Estimations for the Moore-Penrose Inverse of the Weak Perturbation
of Hilbert C*-module Operators” defines multiplicative perturbations
and studies representations and norm estimations for the Moore-Penrose inverse
associated with the multiplicative perturbation.

Part II: Orthogonality and Inequalities

There are several ways to extend the notion of orthogonality from inner product
spaces to the framework of normed spaces. The most developed one is the Birkhoff–
James orthogonality. It was introduced by Birkhoff [Duke Math. J. 1 (1935), 169–
172] and extensively studied by R.C. James [Duke Math. J. 12 (1945), 291–302]; cf.
C. Alsina, J. Sikorska, M.S. Tomás [Norm derivatives and characterizations of inner
product spaces. World Scientific Publishing Co. Pte. Ltd., Hackensack, NJ, 2010].
Chapter “Birkhoff–James Orthogonality: Characterizations, Preservers, and
Orthogonality Graphs” reviews the Birkhoff–James orthogonality starting from
historical perspectives throughout the current development and presents several
characterizations of Birkhoff–James orthogonality in classical Banach spaces, C*-algebras,
and Hilbert C*-modules. In addition, some characterizations of preservers
of Birkhoff–James orthogonality are given.
Chapter “Approximate Birkhoff-James Orthogonality in Normed Linear Spaces
and Related Topics” is an introduction to approximate Birkhoff–James orthogonal-
ity in real normed spaces and its characterizations.

Chapter “Orthogonally Additive Operators on Vector Lattices” focuses on
the vector lattice structure of different partial subclasses of the vector space of
all orthogonally additive operators, certain domination problems, representation
theorems, and Banach lattice structure of orthogonally additive operators.

Part III: Inequalities Related to Types of Operators

This part mainly studies inequalities concerning closed range, normal, and Toeplitz
operators (cf. A. Böttcher and B. Silbermann [Analysis of Toeplitz operators.
Second edition. Prepared jointly with Alexei Karlovich. Springer Monographs in
Mathematics. Springer-Verlag, Berlin, 2006] and I. Gohberg, S. Goldberg, and M.
A. Kaashoek [Basic classes of linear operators. Birkhäuser Verlag, Basel, 2003]).
In Chapter “Normal Operators and Their Generalizations”, some aspects of local
spectral theory and Fredholm theory of certain classes of operators that generalize
normal operators on Hilbert spaces are studied.
Chapter “On Wold Type Decomposition for Closed Range Operators” surveys
Wold-type decomposition for closed range operators satisfying certain operator
inequalities. Several results on left invertible operators close to isometries are listed
and extended to the case of regular operators.
Chapter “(Asymmetric) Dual Truncated Toeplitz Operators” considers properties
of asymmetric dual truncated Toeplitz operators acting between the orthogonal
complements of two model spaces.
Chapter “Boundedness of Toeplitz Operators in Bergman-Type Spaces” is
devoted to the open problem of characterization of the bounded Toeplitz operators
$T_a$ in Bergman spaces. Based on the structure of the Bergman spaces, a
characterization of the boundedness and compactness is presented in the case of operators in
spaces with weighted sup-norms.

Part IV: Inequalities in Various Banach Spaces

This part deals with miscellaneous inequalities concerning topological and geomet-
rical properties of various Banach spaces and operator algebras (cf. W.B. Johnson
and J. Lindenstrauss (ed.) [Handbook of the geometry of Banach spaces. Vol. I.
North-Holland Publishing Co., Amsterdam, 2001]).
In Chapter “Disjointness Preservers and Banach-Stone Theorems”, the so-called
weak and strong Banach-Stone theorems are given. In addition, it is proved
that in many cases lattice isomorphisms (Kaplansky’s Theorem), ring isomor-
phisms (Gelfand-Kolmogorov Theorem), multiplicative isomorphisms (Milgram’s
Theorem), isometries (Banach-Stone Theorem), and nonvanishing preservers are
⊥-isomorphisms.

Chapter “The Bishop–Phelps–Bollobás Theorem: An Overview” provides a
comprehensive survey of the Bishop–Phelps–Bollobás theorem from 2008 to 2021.
Chapter “A New Proof of the Power Weighted Birman–Hardy–Rellich Inequal-
ities” introduces a new proof of the optimal version of the power-weighted
Birman–Hardy–Rellich integral inequalities. Extensions to homogeneous Sobolev
spaces and the vector-valued case are also discussed.
Chapter “An Excursion to Multiplications and Convolutions on Modulation
Spaces” is devoted to reviewing results on boundedness for multiplications and con-
volutions in (quasi-)Banach modulation spaces of ultradistributions. Furthermore,
the Gabor product is investigated.
Chapter “The Hardy-Littlewood Inequalities in Sequence Spaces” presents
modern proofs of m-linear versions of the results of Hardy and Littlewood and the
state of the art of the subject, as well as an application to the combinatorial Gale-
Berlekamp switching game.
Chapter “Symmetries of C*-algebras and Jordan Morphisms” illustrates interrelations
between symmetries of various structures attached to C*-algebras and von Neumann
algebras and Jordan *-isomorphisms. In this direction, one-dimensional
projections in a Hilbert space with transition probability, projection lattices of von
Neumann algebras, and measures on state spaces endowed with the Choquet order
are extensively studied.

Part V: Inequalities in Commutative and Noncommutative Probability Spaces

This part includes generalizations of Doob’s maximal inequality, the Burkholder–Davis–Gundy
inequality, and several inequalities related to Markov processes and
noncommutative free probability (cf. P. Jorgensen and F. Tian [Non-commutative
analysis. With a foreword by Wayne Polyzou. World Scientific Publishing Co. Pte.
Ltd., Hackensack, NJ, 2017]).
In Chapter “Mixed Norm Martingale Hardy Spaces and Applications in Fourier
Analysis”, martingale Hardy spaces defined with the help of a mixed $L_p$-norm
are investigated. Two different generalizations of Doob’s maximal inequality for
mixed-norm $L_p$ spaces and two versions of atomic decompositions are given.
Several martingale inequalities and a generalization of the Burkholder–Davis–Gundy
inequality are also presented. As an application in Fourier analysis, the
boundedness of the Fejér maximal operator from $H_p$ to $L_p$ is obtained
whenever 1/2 < p < ∞.
Chapter “The First Eigenvalue for Nonlocal Operators” presents some results
concerning the first eigenvalue for a nonlocal operator in convolution form with a
smooth kernel and gives information on the asymptotic behavior of some natural
Markov processes.

Chapter “Comparing Banach Spaces for Systems of Free Random Variables
Followed by the Semicircular Law” studies certain Banach-space operators from
noncommutative free probability, acting on systems of free random variables whose
free distributions are followed by the semicircular law.
The editors are grateful for the hard work of numerous mathematicians who
carefully reviewed the chapters and gave insightful comments to improve them.
The book can be used as an introduction to several active research areas within
operator theory. It is intended for use by both researchers and graduate students in
mathematics, scientific computing, physics, statistics, and engineering who have a
basic grasp of the fundamentals in functional analysis and operator theory.

Kent, OH, USA                        Richard M. Aron
Mashhad, Iran                        Mohammad Sal Moslehian
Abu Dhabi, United Arab Emirates      Ilya M. Spitkovsky
Philadelphia, PA, USA                Hugo J. Woerdeman
Contents

Part I Matrix and Operator Inequalities

Log-majorization Type Inequalities
N. Bebiano, R. Lemos, and G. Soares
Ando-Hiai Inequality: Extensions and Applications
Masatoshi Fujii and Ritsuo Nakamoto
Relative Operator Entropy
Jun Ichi Fujii and Yuki Seo
Matrix Inequalities and Characterizations of Operator Monotone Functions
Trung Hoa Dinh, Hiroyuki Osaka, and Oleg E. Tikhonov
Perspectives, Means and their Inequalities
Hiroyuki Osaka and Shuhei Wada
Cauchy–Schwarz Operator and Norm Inequalities for Inner Product Type Transformers in Norm Ideals of Compact Operators, with Applications
Danko R. Jocić and Milan Lazarević
Norm Estimations for the Moore-Penrose Inverse of the Weak Perturbation of Hilbert C*-Module Operators
Chunhong Fu, Dingyi Du, Liuhui Huang, and Qingxiang Xu

Part II Orthogonality and Inequalities

Birkhoff–James Orthogonality: Characterizations, Preservers, and Orthogonality Graphs
Ljiljana Arambašić, Alexander Guterman, Bojan Kuzma, and Svetlana Zhilina
Approximate Birkhoff-James Orthogonality in Normed Linear Spaces and Related Topics
Jacek Chmieliński
Orthogonally Additive Operators on Vector Lattices
Marat Pliev and Mikhail Popov

Part III Inequalities Related to Types of Operators

Normal Operators and their Generalizations
Pietro Aiena
On Wold Type Decomposition for Closed Range Operators
H. Ezzahraoui, M. Mbekhta, and E. H. Zerouali
(Asymmetric) Dual Truncated Toeplitz Operators
M. Cristina Câmara, Kamila Kliś-Garlicka, and Marek Ptak
Boundedness of Toeplitz Operators in Bergman-Type Spaces
Jari Taskinen and Jani A. Virtanen

Part IV Inequalities in Various Banach Spaces

Disjointness Preservers and Banach-Stone Theorems
Denny H. Leung and Wee Kee Tang
The Bishop–Phelps–Bollobás Theorem: An Overview
Sheldon Dantas, Domingo García, Manuel Maestre, and Óscar Roldán
A New Proof of the Power Weighted Birman–Hardy–Rellich Inequalities
Fritz Gesztesy, Isaac Michael, and Michael M. H. Pang
An Excursion to Multiplications and Convolutions on Modulation Spaces
Nenad Teofanov and Joachim Toft
The Hardy-Littlewood Inequalities in Sequence Spaces
Daniel Núñez-Alarcón, Daniel M. Pellegrino, and Anselmo B. Raposo Jr.
Symmetries of C*-algebras and Jordan Morphisms
Jan Hamhalter and Ekaterina Turilova

Part V Inequalities in Commutative and Noncommutative Probability Spaces

Mixed Norm Martingale Hardy Spaces and Applications in Fourier Analysis
Ferenc Weisz
The First Eigenvalue for Nonlocal Operators
Julio D. Rossi
Comparing Banach Spaces for Systems of Free Random Variables Followed by the Semicircular Law
Ilwoo Cho and Palle Jorgensen
Part I
Matrix and Operator Inequalities
Log-majorization Type Inequalities

N. Bebiano, R. Lemos, and G. Soares

Abstract Several inequalities have been established in the context of Hilbert space
operators or operator algebras. Our discussion will be limited to matrices. Important
inequalities in mathematics and other sciences, such as the Golden-Thompson inequality
or the von Neumann trace inequality, and their extensions, are revisited. Our main goal is
to emphasize the link between majorization theory and other relevant inequalities.

Keywords Eigenvalues · Singular values · Majorization · Log-majorization ·
Norm inequalities · Determinant and trace inequalities · Operator connections ·
Ando-Hiai inequality · Araki’s log-majorization · Löwner-Heinz inequality ·
Furuta inequality · Golden-Thompson inequality · von Neumann trace inequality

Notation
$\mathbb{N}$ Set of natural numbers
$\mathbb{N}_0$ Set of nonnegative integer numbers
$\mathbb{R}$ Set of real numbers
$\mathbb{R}_0^+$ Set of nonnegative real numbers
$\mathbb{C}$ Set of complex numbers
$\mathbb{R}^n$ Vector space of real n-tuples
$\mathbb{C}^n$ Vector space of complex n-tuples
$\|\cdot\|$ Euclidean norm; spectral norm or operator norm

N. Bebiano (✉)
CMUC, Departamento de Matemática, Faculdade de Ciências e Tecnologia da Universidade de
Coimbra, Coimbra, Portugal
e-mail: [email protected]
R. Lemos
CIDMA, Department of Mathematics, University of Aveiro, Aveiro, Portugal
e-mail: [email protected]
G. Soares
CMAT-UTAD, Universidade de Trás-os-Montes e Alto Douro, Vila Real, Portugal
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022

R. M. Aron et al. (eds.), Operator and Norm Inequalities and Related Topics,
Trends in Mathematics, https://doi.org/10.1007/978-3-031-02104-6_1
$|||\cdot|||$ Unitarily invariant norm
$\|\cdot\|_{(k)}$ Ky Fan k-norm
$\|\cdot\|_p$ Schatten p-norm
$\|\cdot\|_2$ Frobenius norm, Hilbert-Schmidt norm or Schur norm
$M_n(\mathbb{C})$ Algebra of n × n complex matrices
$M_{m \times n}(\mathbb{C})$ Vector space of m × n complex matrices
$\Omega_n$ Set of n × n doubly stochastic matrices
$A = (a_{ij})$ Matrix A with entries $a_{ij}$
$A^*$ Adjoint of a matrix A
$A^T$ Transpose of a matrix A
$\overline{A}$ Entrywise conjugate of A
$|A|$ Unique positive semidefinite square root of $A^*A$
$A^{\wedge k}$ kth compound or kth antisymmetric tensor power of A
$\rho(A)$ Spectral radius of A
$f(A)$ Functional calculus applied to a function f
$A \ge 0$ Positive semidefinite matrix A
$A > 0$ Positive definite matrix A
$A \ge B$ $A - B \ge 0$
$\operatorname{Re} A$ ($\operatorname{Im} A$) Real (imaginary) part of A
$\operatorname{tr}(A)$ Trace of a matrix A
$\det(A)$ Determinant of a matrix A
$\lambda_i(A)$ Eigenvalue of A
$\lambda_1(A)$ Largest eigenvalue of A if A is Hermitian
$s_i(A)$ Singular value of A
$s_1(A)$ Largest singular value of A
$I_n$ Identity matrix of order n
$A \circ B$ Hadamard product of matrices A and B
$|x|$ Absolute value vector $(|x_1|, \ldots, |x_n|)$
$x \prec y$ x is majorized by y
$x \prec_w y$ x is weakly majorized by y
$x \prec_{\log} y$ x is log-majorized by y
$x \prec_{w\log} y$ x is weakly log-majorized by y
$\#_\alpha$ α-weighted geometric mean for $\alpha \in [0, 1]$
$\#$ Geometric mean
$\sigma$ Operator connection
$\sigma^\perp$ Dual of an operator connection σ
$f_\sigma$ Representing function of an operator connection σ
$S_n$ Symmetric group of degree n
$S(A, B)$ Umegaki relative entropy
$X^\sim$ $X$ or $X^T$
$H_n$ Set of n × n Hermitian matrices
$H_n^T$ Set of n × n symmetric matrices
$\operatorname{per} A$ Permanent of a matrix A
$J$ Hermitian involutive matrix
$\sigma_J^{\pm}(A)$ Set of eigenvalues with eigenvectors x, such that $x^* J x = \pm 1$
$A \ge^J B$ $J(A - B) \ge 0$

1 Introduction

The concept of majorization was introduced by Hardy, Littlewood and Pólya
[43]. Since then various majorizations have been obtained for the eigenvalues and
singular values of matrices and compact operators [72]. These majorizations are
powerful devices for the derivation of several norm inequalities, as well as trace or
determinant inequalities for matrices or operators. In this section, we review in a
concise way the majorization theory used throughout this chapter.
Any vector $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$ is assumed to have its components sorted in
non-increasing order, that is, $x_1 \ge \cdots \ge x_n$.
Let $x, y \in \mathbb{R}^n$. We say that x is majorized by y and write $x \prec y$ if

$$\sum_{i=1}^{k} x_i \le \sum_{i=1}^{k} y_i, \qquad k = 1, \ldots, n, \tag{1}$$

and equality occurs in (1) for k = n. Further, if (1) holds without the equality
requirement, then x is said to be weakly majorized or submajorized by y and the
notation $x \prec_w y$ is used. We remark that $x \prec y$ is equivalent to

$$\sum_{i=k}^{n} x_i \ge \sum_{i=k}^{n} y_i, \qquad k = 1, \ldots, n, \tag{2}$$

with equality in (2) for k = 1. If (2) holds, then x is said to be supermajorized by y
and we write $x \prec^w y$.
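The three relations just defined can be tested numerically. The following is a minimal Python sketch, not taken from the chapter: the function names and the tolerance parameter `tol` are our own assumptions, and the inputs are sorted inside the functions, so they need not be given in non-increasing order.

```python
def is_weakly_majorized(x, y, tol=1e-12):
    """Condition (1): partial sums of the k largest entries of x never exceed those of y."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    xs = sorted(x, reverse=True)
    ys = sorted(y, reverse=True)
    sx = sy = 0.0
    for a, b in zip(xs, ys):
        sx += a
        sy += b
        if sx > sy + tol:
            return False
    return True


def is_majorized(x, y, tol=1e-12):
    """Condition (1) together with equality of the total sums (the case k = n)."""
    return is_weakly_majorized(x, y, tol) and abs(sum(x) - sum(y)) <= tol


def is_supermajorized(x, y, tol=1e-12):
    """Condition (2): sums of the k smallest entries of x dominate those of y.

    This is the same as weak majorization of -x by -y.
    """
    return is_weakly_majorized([-t for t in x], [-t for t in y], tol)


print(is_majorized([0.5, 0.5], [1.0, 0.0]))         # uniform vector vs. point mass
print(is_weakly_majorized([0.4, 0.4], [1.0, 0.0]))  # sums differ, only (1) holds
print(is_supermajorized([0.6, 0.6], [1.0, 0.0]))    # only (2) holds
```

In the first call the uniform vector is majorized by the point mass, which matches the reading of majorization as a comparison of disorder.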
Naively, vector majorization means that one vector is more disordered than the
other. For instance, a physics interpretation may be that x describes a more chaotic
state than y, thinking of $x_i$ as the probability of the system described by x being in
state i.
Two important resources on the topic of majorization are [21, 72].
A square matrix with non-negative entries is called doubly stochastic if all its
row and column sums are one. The class $\Omega_n$ of doubly stochastic matrices of order
n is a convex set, whose extreme points are the permutation matrices, as stated
by the famous Birkhoff’s Theorem [24]. In fact, there is a close relation between
majorization and doubly stochastic matrices [72].
Proposition 1.1 A matrix $A \in \Omega_n$ if and only if $Ax \prec x$ for all $x \in \mathbb{R}^n$.
Proposition 1.2 For $x, y \in \mathbb{R}^n$, the following statements are equivalent:

i. $x \prec y$;
ii. x is in the convex hull of all the vectors obtained by permuting the coordinates
of y;
iii. $x = Ay$ for some $A \in \Omega_n$.
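As a small numerical illustration of Propositions 1.1 and 1.2, the sketch below applies a doubly stochastic matrix to a vector and checks that the image is majorized by the original vector. It is only a sketch under our own assumptions: the helper names, the tolerance and the example data A and x are hypothetical and do not come from the chapter.

```python
def is_doubly_stochastic(A, tol=1e-12):
    """Check nonnegative entries and unit row and column sums of a square matrix."""
    n = len(A)
    if any(len(row) != n for row in A):
        return False
    if any(entry < -tol for row in A for entry in row):
        return False
    rows_ok = all(abs(sum(row) - 1.0) <= tol for row in A)
    cols_ok = all(abs(sum(A[i][j] for i in range(n)) - 1.0) <= tol
                  for j in range(n))
    return rows_ok and cols_ok


def is_majorized(x, y, tol=1e-12):
    """Compare partial sums of the entries sorted in non-increasing order."""
    xs, ys = sorted(x, reverse=True), sorted(y, reverse=True)
    sx = sy = 0.0
    for a, b in zip(xs, ys):
        sx, sy = sx + a, sy + b
        if sx > sy + tol:
            return False
    return abs(sx - sy) <= tol  # equality of the total sums


# Hypothetical example data.
A = [[0.5, 0.5, 0.0],
     [0.25, 0.25, 0.5],
     [0.25, 0.25, 0.5]]
x = [3.0, 1.0, 0.0]
Ax = [sum(a * t for a, t in zip(row, x)) for row in A]

print(is_doubly_stochastic(A), Ax, is_majorized(Ax, x))
```

The averaging performed by A flattens x, which is the mechanism behind statement iii of Proposition 1.2.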
For any real valued function f defined on an interval containing all the
components of the real n-tuple x, we adopt the notation

$$f(x) = \bigl(f(x_1), \ldots, f(x_n)\bigr).$$

Proposition 1.3 Let $x, y \in \mathbb{R}^n$ and f be a convex function on an interval
containing all the components of x and y. Then
i. If $x \prec y$, then $f(x) \prec_w f(y)$.
ii. If $x \prec_w y$ and f is also non-decreasing, then $f(x) \prec_w f(y)$.
Log-majorization can be defined as a multiplicative version of majorization. If
$x, y \in \mathbb{R}^n$ have nonnegative components, $x \prec_{\log} y$ means that

$$\prod_{i=1}^{k} x_i \le \prod_{i=1}^{k} y_i, \qquad k = 1, \ldots, n, \tag{3}$$

and equality occurs in (3) for k = n. If x, y > 0, i.e., all the components of x, y are
positive, this is clearly equivalent to

$$\prod_{i=k}^{n} x_i \ge \prod_{i=k}^{n} y_i, \qquad k = 1, \ldots, n, \tag{4}$$

with equality in (4) for k = 1. If x, y > 0, then

$$x \prec_{\log} y \;\Leftrightarrow\; \log x \prec \log y,$$

which justifies the log-majorization terminology. When equality between the products
of all the components of x and y is not required, the following parallel notations
are used:

$$x \prec_{w\log} y \ \text{ for (3)} \quad \text{and} \quad x \prec^{w\log} y \ \text{ for (4)}.$$

Proposition 1.4 Let $x, y \in \mathbb{R}^n$ have all the components positive and f be a
non-decreasing continuous function on an interval containing all the components of x, y,
such that $f(e^t)$ is convex. Then

$$x \prec_{w\log} y \;\Rightarrow\; f(x) \prec_w f(y).$$

In particular, f(t) = t in the previous proposition shows that the weak
log-majorization $\prec_{w\log}$ is stronger than the weak majorization $\prec_w$.
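The last remark can be checked on small examples. The sketch below is an illustration under our own assumptions (function names, tolerances and test vectors are hypothetical): it tests condition (3) without the equality requirement, then tests weak majorization, and includes one pair of vectors for which weak majorization holds while weak log-majorization fails.

```python
def partial_products(v):
    """Products of the k largest entries of v, for k = 1, ..., n."""
    out, p = [], 1.0
    for t in sorted(v, reverse=True):
        p *= t
        out.append(p)
    return out


def partial_sums(v):
    """Sums of the k largest entries of v, for k = 1, ..., n."""
    out, s = [], 0.0
    for t in sorted(v, reverse=True):
        s += t
        out.append(s)
    return out


def is_weakly_log_majorized(x, y, rel_tol=1e-12):
    """Condition (3) for positive vectors, without equality for k = n."""
    return all(px <= py * (1.0 + rel_tol)
               for px, py in zip(partial_products(x), partial_products(y)))


def is_weakly_majorized(x, y, tol=1e-12):
    """Condition (1), without equality for k = n."""
    return all(sx <= sy + tol
               for sx, sy in zip(partial_sums(x), partial_sums(y)))


x, y = [2.0, 2.0], [4.0, 1.0]
print(is_weakly_log_majorized(x, y), is_weakly_majorized(x, y))

# Weak majorization alone does not give weak log-majorization.
u, v = [2.5, 2.5], [4.0, 1.0]
print(is_weakly_majorized(u, v), is_weakly_log_majorized(u, v))
```

Applying the function f(t) = t² entrywise to x and y also preserves weak majorization here, as Proposition 1.4 predicts, since f(e^t) = e^{2t} is convex and f is non-decreasing on the positive reals.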

2 Matrix Majorization

If $A = (a_{ij})$, $B = (b_{ij})$ are m × n complex matrices, let $A \circ B = (a_{ij} b_{ij})$ be
the Hadamard product of A and B. Let $M_n(\mathbb{C})$ be the algebra of n-square complex
matrices and $I_n$ be the identity matrix of order n. If $A \in M_n(\mathbb{C})$, then its eigenvalues
are denoted by $\lambda_1(A), \ldots, \lambda_n(A)$ and

$$\rho(A) = \max_{i=1,\ldots,n} |\lambda_i(A)|$$

is the spectral radius of A. Further, considering the Euclidean norm $\|x\|$ of a vector
$x \in \mathbb{C}^n$, let

$$\|A\| = \max_{\|x\|=1} \|Ax\|$$

be the spectral norm or operator norm of A. It is clear that

$$\rho(A) \le \|A\|. \tag{5}$$

If $A \in M_n(\mathbb{C})$ has real eigenvalues, denote by $\lambda(A)$ the n-tuple of eigenvalues of A
arranged in non-increasing order:

$$\lambda_1(A) \ge \cdots \ge \lambda_n(A).$$

For $A \in M_n(\mathbb{C})$, the unique positive semidefinite square root of $A^*A$ is denoted by
$|A|$. The eigenvalues of $|A|$ are the singular values of A, which are arranged in the
vector s(A) as follows:

$$s_1(A) \ge \cdots \ge s_n(A).$$

A norm $|||\cdot|||$ in $M_n(\mathbb{C})$ is said to be unitarily invariant if $|||UAV||| = |||A|||$ for any
$A, U, V \in M_n(\mathbb{C})$ with U, V unitary. Examples of unitarily invariant norms are the
Schatten p-norms given by

$$\|A\|_p = \Bigl(\sum_{i=1}^{n} s_i(A)^p\Bigr)^{1/p} = \bigl(\operatorname{tr} |A|^p\bigr)^{1/p}, \qquad p \ge 1,$$

and the Ky Fan k-norms defined by

$$\|A\|_{(k)} = \sum_{i=1}^{k} s_i(A), \qquad k = 1, \ldots, n,$$

including the spectral norm $\|A\| = \|A\|_{(1)} = s_1(A)$. The Schatten 2-norm

$$\|A\|_2 = \sqrt{\operatorname{tr}(A^*A)},$$

also called Frobenius norm, Hilbert-Schmidt norm or Schur norm, is the norm
induced by the Frobenius or Hilbert-Schmidt inner product in $M_n(\mathbb{C})$:

$$\langle A, B \rangle = \operatorname{tr}(B^*A).$$
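To make these norms concrete, the following sketch evaluates them for a real 2 × 2 matrix, whose singular values are available in closed form from the two relations $s_1^2 + s_2^2 = \operatorname{tr}(A^T A)$ and $s_1 s_2 = |\det A|$. The restriction to real 2 × 2 matrices, the function names and the example matrix are our own assumptions, chosen so that only the Python standard library is needed.

```python
import math


def singular_values_2x2(A):
    """Singular values s1 >= s2 of a real 2x2 matrix, in closed form."""
    (a, b), (c, d) = A
    t = a * a + b * b + c * c + d * d   # s1^2 + s2^2 = tr(A^T A)
    det = abs(a * d - b * c)            # s1 * s2 = |det A|
    disc = math.sqrt(max(t * t - 4.0 * det * det, 0.0))
    s1 = math.sqrt((t + disc) / 2.0)
    s2 = math.sqrt(max((t - disc) / 2.0, 0.0))
    return [s1, s2]


def schatten_norm(s, p):
    """Schatten p-norm computed from the list s of singular values, p >= 1."""
    return sum(t ** p for t in s) ** (1.0 / p)


def ky_fan_norm(s, k):
    """Ky Fan k-norm: sum of the k largest singular values."""
    return sum(sorted(s, reverse=True)[:k])


# Hypothetical example matrix; its singular values are sqrt(45) and sqrt(5).
A = [[3.0, 0.0], [4.0, 5.0]]
s = singular_values_2x2(A)

print(ky_fan_norm(s, 1))    # spectral norm s1
print(schatten_norm(s, 2))  # Frobenius norm, equal to sqrt(tr(A^T A))
print(ky_fan_norm(s, 2))    # trace norm, equal to the Schatten 1-norm
```

For larger matrices the singular values would have to come from a numerical singular value decomposition; the two norm functions above only need the resulting list.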

The notion of majorization provides a means of comparing two probability distributions
or two density matrices, that is, positive semidefinite matrices of trace one,
using the eigenvalues, in an elegant way. It arises in fields like computer science,
economics or quantum mechanics.
Important sources on majorization for eigenvalues and singular values of matrices
are [21, 46, 47, 55, 72] and two survey articles of T. Ando [2, 3].
For simplicity, if $A, B \in M_n(\mathbb{C})$ have real eigenvalues, then $\lambda(A) \prec \lambda(B)$ and
$\lambda(A) \prec_w \lambda(B)$ are abbreviated to $A \prec B$ and $A \prec_w B$, respectively.
The main diagonal entries and the eigenvalues of a Hermitian matrix are related
through majorization. This classical result due to I. Schur [84] can be briefly stated
as follows.
Theorem 2.1 (Schur Majorization Theorem, 1923) If $A \in M_n(\mathbb{C})$ is Hermitian,
then $I_n \circ A \prec A$.
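Schur's theorem is easy to observe for a real symmetric 2 × 2 matrix, whose eigenvalues have a closed form. The sketch below is an illustration under our own assumptions (helper names, tolerance and example matrix are hypothetical); it compares the diagonal entries with the eigenvalues.

```python
import math


def eigenvalues_sym_2x2(A):
    """Eigenvalues (largest first) of a real symmetric 2x2 matrix [[a, b], [b, d]]."""
    (a, b), (_, d) = A
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return [mean + radius, mean - radius]


def is_majorized(x, y, tol=1e-12):
    """Partial sums of sorted entries, with equality of the total sums."""
    xs, ys = sorted(x, reverse=True), sorted(y, reverse=True)
    sx = sy = 0.0
    for p, q in zip(xs, ys):
        sx, sy = sx + p, sy + q
        if sx > sy + tol:
            return False
    return abs(sx - sy) <= tol


# Hypothetical example: diagonal (2, 0), eigenvalues 1 + sqrt(2) and 1 - sqrt(2).
A = [[2.0, 1.0], [1.0, 0.0]]
diagonal = [A[0][0], A[1][1]]
spectrum = eigenvalues_sym_2x2(A)

print(diagonal, spectrum, is_majorized(diagonal, spectrum))
```

The equality of the total sums in the majorization is the trace identity: the diagonal entries and the eigenvalues both add up to tr(A).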
In 1954, A. Horn [51] proved the converse, giving rise to the next fundamental
result, which received considerable attention and led to generalizations in several
directions.
Theorem 2.2 (Schur-Horn Theorem) Let $x, y \in \mathbb{R}^n$. There exists a Hermitian
matrix with prescribed diagonal entries and prescribed eigenvalues arranged,
respectively, in x and y if and only if $x \prec y$.
After this, Horn’s subsequent work on the eigenvalues of sums of Hermitian
matrices culminated in the inequalities conjectured in [53]. The solution to Horn’s
conjecture appeared in two papers, one by A. Klyachko [60] and the other one by
A. Knutson and T. Tao [61].
Another relevant result in matrix majorization is due to Ky Fan [28].
Theorem 2.3 (Ky Fan Dominance Theorem, 1951) Let $A, B \in M_n(\mathbb{C})$. Then the
following are equivalent statements:
i. $|A| \prec_w |B|$;
ii. $|||A||| \le |||B|||$ for any unitarily invariant norm $|||\cdot|||$ in $M_n(\mathbb{C})$.
If $A, B \in M_n(\mathbb{C})$ have nonnegative eigenvalues, $A \prec_{\log} B$ stands for

$$\lambda(A) \prec_{\log} \lambda(B).$$

Abbreviated notations for the weaker versions, involving either $\prec_{w\log}$ or $\prec^{w\log}$, are
analogously used. Clearly, if A, B have positive eigenvalues, then

$$A \prec_{w\log} B \;\Leftrightarrow\; A^{-1} \prec^{w\log} B^{-1}.$$

Matrix log-majorization is a powerful tool for establishing trace, determinantal and
matrix norm inequalities. For instance,

$$A \prec_{\log} B \;\Rightarrow\; \det(I_n + A) \le \det(I_n + B).$$

On the other hand, some classical determinantal inequalities can find their majoriza-
tion counterparts.
As usual, $A > 0$ means that A is a positive definite matrix and $A \ge B$ means that
$A - B$ is a positive semidefinite matrix. Real-valued continuous functions f defined
on a real interval, such that

$$A \ge B \;\Rightarrow\; f(A) \ge f(B)$$

for all Hermitian $A, B \in M_n(\mathbb{C})$ with spectra in that interval and all $n \in \mathbb{N}$, are said to
be operator monotone on the interval. A useful and fundamental tool for treating operator
inequalities is the Löwner-Heinz inequality. Löwner’s original proof [69] used an
integral representation for operator monotone functions and an alternative proof was
given by Heinz [44]. It states that

$$A \ge B \ge 0 \;\Rightarrow\; A^\alpha \ge B^\alpha, \tag{6}$$

that is, $f(t) = t^\alpha$ is operator monotone on $\mathbb{R}_0^+$, for $\alpha \in [0, 1]$. In general, (6) is not
true for α > 1.
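A 2 × 2 example shows both halves of this statement. The sketch below is an illustration under our own assumptions: the matrices A and B are hypothetical example data, positive semidefiniteness of a real symmetric 2 × 2 matrix is tested through its diagonal entries and determinant, and the square root uses the closed form $(S + \sqrt{\det S}\, I)/\sqrt{\operatorname{tr} S + 2\sqrt{\det S}}$, which follows from the Cayley-Hamilton theorem and is valid for nonzero positive semidefinite S.

```python
import math


def sub(A, B):
    """Entrywise difference of two 2x2 matrices."""
    return [[A[i][j] - B[i][j] for j in range(2)] for i in range(2)]


def matmul2(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def is_psd_2x2(S, tol=1e-12):
    """A real symmetric 2x2 matrix is PSD iff its diagonal and determinant are >= 0."""
    (a, b), (_, d) = S
    return a >= -tol and d >= -tol and a * d - b * b >= -tol


def sqrt_psd_2x2(S):
    """Closed-form square root of a nonzero real PSD 2x2 matrix."""
    (a, b), (_, d) = S
    r = math.sqrt(max(a * d - b * b, 0.0))   # sqrt(det S)
    t = math.sqrt(a + d + 2.0 * r)           # trace of the square root
    return [[(a + r) / t, b / t], [b / t, (d + r) / t]]


# Hypothetical example data with A >= B >= 0.
A = [[2.0, 1.0], [1.0, 1.0]]
B = [[1.0, 0.0], [0.0, 0.0]]

print(is_psd_2x2(sub(A, B)))                              # A >= B
print(is_psd_2x2(sub(sqrt_psd_2x2(A), sqrt_psd_2x2(B))))  # alpha = 1/2
print(is_psd_2x2(sub(matmul2(A, A), matmul2(B, B))))      # alpha = 2
```

For α = 2 the difference A² − B² equals [[4, 3], [3, 2]], whose determinant is −1, so it is not positive semidefinite even though A ≥ B ≥ 0.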
For k = 1, …, n and $n_k = \binom{n}{k}$, the kth compound or kth antisymmetric tensor
power of $A \in M_n(\mathbb{C})$ is the matrix $A^{\wedge k} \in M_{n_k}(\mathbb{C})$ with entries given by the
minors $\det A(\mathbf{i}, \mathbf{j})$, where the index sets $\mathbf{i}, \mathbf{j} \subset \{1, \ldots, n\}$ have cardinality k and
are lexicographically ordered. As usual, $A(\mathbf{i}, \mathbf{j})$ denotes the submatrix of A that lies
in rows and columns indexed, respectively, by $\mathbf{i}, \mathbf{j}$. Some essential properties of these
matrices [21] are listed below:
∧k ∧k B ∧k (Binet-Cauchy formula);
P1. ∧k ∗ = A ∗ ∧k
(AB)
P2. A = (A ) ;
∧k r
P3. A = (Ar )∧k , r > 0;
−1
P4. A∧k = (A−1 )∧k if A is invertible;
10 N. Bebiano et al.

k
P5. λi A∧k = λij (A), where i = (i1 , . . . , ik ) and 1 ≤ i1 < · · · < ik ≤ n;
j =1
k
P6. A∧k = s1 A∧k = sj (A), k = 1, . . . , n.
j =1

Thus, a useful tool in log-majorization is provided by the next lemma.

Lemma 2.4 Let A, B ∈ Mn (C) have nonnegative eigenvalues. The following are
equivalent:
i. A ≺log B;

ii. $\lambda_1(A^{\wedge k}) \le \lambda_1(B^{\wedge k})$, k = 1, . . . , n, and det(A) = det(B).
A basic log-majorization in matrix theory is Weyl’s relation between eigenvalues
and singular values [98]. Let |λ(A)| be the vector of the absolute values of the
eigenvalues of A ∈ Mn (C) arranged in non-increasing order of magnitude:

|λ1 (A)| ≥ · · · ≥ |λn (A)|.

Theorem 2.5 (Weyl’s Majorant Theorem, 1949) If A ∈ Mn (C), then

|λ(A)| ≺log s(A). (7)

Proof Apply (5), that is,

\[ \rho(A) = |\lambda_1(A)| \le s_1(A), \]

to the kth antisymmetric tensor power of A, k = 1, . . . , n, use properties P5 and P6, and observe that

\[ \prod_{i=1}^{n} |\lambda_i(A)| = |\det(A)| = \left(\det(A)\,\overline{\det(A)}\right)^{1/2} = \det(A^{*}A)^{1/2} = \det |A| \]

is the product of all the singular values of A.
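As a quick sanity check of (7), one may work out a 2 × 2 case by hand. The matrix below is an illustrative choice made for this sketch, not an example from the source.

```latex
% Illustrative non-normal matrix, chosen for this sketch.
\[
A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \qquad
\lambda_1(A) = \lambda_2(A) = 1, \qquad
A^{*}A = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}.
\]
% The eigenvalues of A^*A are (3 \pm \sqrt{5})/2; their square roots are the singular values.
\[
s_1(A) = \frac{1+\sqrt{5}}{2}, \qquad s_2(A) = \frac{\sqrt{5}-1}{2}.
\]
% k = 1: strict inequality; k = 2: equality, both sides equal |det A| = 1.
\[
|\lambda_1(A)| = 1 \le \frac{1+\sqrt{5}}{2} = s_1(A), \qquad
|\lambda_1(A)\lambda_2(A)| = 1 = s_1(A)\, s_2(A).
\]
```

The case k = 2 is exactly the statement for the second compound, which here is the 1 × 1 matrix det A.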

In 1954, A. Horn proved the converse [52], that is, there exists a square matrix
with prescribed eigenvalues and singular values arranged in vectors x and y if the
log-majorization |x| ≺log y is satisfied.
In the sequel, we illustrate the potential of using the previous antisymmetric
tensor power technique, also called Weyl trick, by using Lemma 2.4 to derive some
other log-majorizations for expressions involving products and fractional matrix
powers, having in mind that these “commute” with the kth antisymmetric tensor
power. As a consequence, some known results will be revisited. Some classical
inequalities for the trace and the determinant are meanwhile surveyed in the next
section.
3 Trace and Determinantal Inequalities

Von Neumann’s trace inequality was first published in 1937 by von Neumann
[96] with a complicated proof. Other proofs were given in 1959 and subsequently
in 1975, based on doubly stochastic matrices, by Mirsky [76, 77]. However, these
proofs only work in the finite dimensional case. A simple proof, which also extends
to the infinite dimensional setting, was finally obtained in 1991 by R. D. Grigorieff
[41].
Theorem 3.1 (von Neumann’s Inequality, 1937) Let A, B ∈ Mn (C). Then

\[ |\mathrm{tr}(AB)| \le \sum_{i=1}^{n} s_i(A)\, s_i(B) \]

and equality occurs if A, B share a joint set of singular vectors.

This result is an important tool with various applications in pure and applied
mathematics. For instance, just to mention a few, it is useful in Schatten’s theory of
cross spaces and in Ball’s approach of the equations of nonlinear elasticity. Inspired
by this famous inequality, further singular value inequalities have been meanwhile
derived, among them Horn’s multiplicative inequalities (see, e.g. [72]).
Theorem 3.2 (Horn, 1950) If A, B ∈ Mn (C), then

s(AB) ≺log s(A) ◦ s(B).

Proof By the submultiplicativity of the operator norm, we have

\[ s_1(AB) = \|AB\| \le \|A\|\,\|B\| = s_1(A)\, s_1(B). \]

Apply the antisymmetric tensor power technique to the previous inequality, that is,
replace A, B by their kth compounds, k = 1, . . . , n, and use P6. Equality for k = n
is immediate by properties of the determinants.

Remark 3.3 For A, B ∈ Mn (C), the log-majorizations of Horn and Weyl stated before,
the second applied to the product AB, imply the corresponding weak majorizations,
so we easily find for k = 1, . . . , n that

\[ \left|\sum_{i=1}^{k} \lambda_i(AB)\right| \le \sum_{i=1}^{k} |\lambda_i(AB)| \le \sum_{i=1}^{k} s_i(AB) \le \sum_{i=1}^{k} s_i(A)\, s_i(B). \tag{8} \]

In particular, von Neumann’s trace inequality is obtained when k = n in (8).

In 1958, Richter [81] proved a related trace inequality for the product of
two Hermitian matrices. Other contributions in this vein are due to Marcus [70],
Mirsky [76] and Theobald [88]. Ruhe [82] reobtained it under the more restrictive
assumption of positive semidefiniteness of both matrices.
Theorem 3.4 If A, B ∈ Mn (C) are Hermitian, then

\[ \sum_{i=1}^{n} \lambda_i(A)\,\lambda_{n-i+1}(B) \le \mathrm{tr}(AB) \le \sum_{i=1}^{n} \lambda_i(A)\,\lambda_i(B). \tag{9} \]

We remark that the lower bound is an immediate consequence of the
upper bound in (9) with the Hermitian matrix B replaced by −B, since
λi (−B) = −λn−i+1 (B), i = 1, . . . , n. Note that the previous inequality is a
matrix version of the following classical rearrangement inequality [43]. Let Sn be
the symmetric group of degree n of all permutations of {1, . . . , n}.
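Theorem 3.5 (Hardy-Littlewood-Pólya Rearrangement Inequality, 1929) If
x, y ∈ Rn have their coordinates arranged in non-increasing order, then

\[ \sum_{i=1}^{n} x_i\, y_{n-i+1} \le \sum_{i=1}^{n} x_i\, y_{\sigma(i)} \le \sum_{i=1}^{n} x_i\, y_i \]

for any permutation σ ∈ Sn .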

Having in mind that the trace of a matrix is the sum of the eigenvalues while
the determinant is the product, we can think of “dual inequalities” in the sense of
replacing sums by products and products by sums. In fact, the determinant of the
sum of matrices has no simple relation with the determinants of the summands. We
recall some inequalities in this avenue. We start with a remarkable result due to
Fiedler [29], after the following remark.
Remark 3.6 A continuity argument will be repeatedly used, when possible, in
the proofs of some of the results involving eigenvalues of Hermitian matrices. In
such cases, we only need to prove the results for nonsingular matrices. Otherwise,
we may replace in the inequalities each singular matrix A by A + εIn for ε > 0
and then take the limit as ε converges to 0.
Theorem 3.7 If A, B ∈ Mn (C) are Hermitian with eigenvalues α1 ≥ · · · ≥ αn and
β1 ≥ · · · ≥ βn , respectively, then

\[ \min_{\sigma \in S_n} \prod_{j=1}^{n} \left(\alpha_j + \beta_{\sigma(j)}\right) \le \det(A + B) \le \max_{\sigma \in S_n} \prod_{j=1}^{n} \left(\alpha_j + \beta_{\sigma(j)}\right). \]

If αn + βn ≥ 0, then the minimum is attained when σ is the identity permutation
and the maximum is attained when σ (j ) = n − j + 1, j = 1, . . . , n.
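A small numerical instance may make the bounds of Theorem 3.7 concrete. The matrices below are illustrative choices made for this sketch and do not appear in the source.

```latex
% Illustrative Hermitian matrices, chosen for this sketch (n = 2).
\[
A = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}, \quad
B = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \quad
(\alpha_1, \alpha_2) = (2, 1), \quad (\beta_1, \beta_2) = (3, 1).
\]
% The two permutations of S_2 give the candidate products:
\[
\sigma = \mathrm{id}: \ (2+3)(1+1) = 10, \qquad
\sigma = (1\,2): \ (2+1)(1+3) = 12 .
\]
% The determinant of the sum lies between them:
\[
\det(A + B) = \det \begin{pmatrix} 4 & 1 \\ 1 & 3 \end{pmatrix} = 11, \qquad 10 \le 11 \le 12 .
\]
```

Since α2 + β2 = 2 ≥ 0, the minimum 10 corresponds to the identity and the maximum 12 to the reversal, in agreement with the last sentence of the theorem.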
Proof If A and B commute, they are simultaneously unitarily diagonalizable and
the result easily follows. Otherwise, there exists U ∈ Mn (C) unitary, such that

\[ \det(A + B) = \det(A_0 + U^{*} B_0 U), \]

where A0 , B0 are the diagonal forms of A, B. Since the unitary group is compact and
the determinant is continuous, $\det(A_0 + V^{*} B_0 V)$ attains its maximum and minimum
values for some unitary matrix V ∈ Mn (C). Take

\[ U = e^{i\epsilon S} = I_n + i\epsilon S + O(\epsilon^{2}), \]

where ε is a small quantity and S ∈ Mn (C) is Hermitian. Assuming that
$A_0 + V^{*} B_0 V$ is nonsingular and calculating

\[ \det(A_0 + U^{*} V^{*} B_0 V U) \]

to the first order in ε, it can be easily shown that $V^{*} B_0 V$ commutes with the inverse
of $A_0 + V^{*} B_0 V$. Thus $V^{*} B_0 V$ commutes with A0 and the theorem follows. If
$A_0 + V^{*} B_0 V$ is singular, then the result follows by a limiting argument.

Remark 3.8 A natural generalization of Fiedler’s Theorem would be the following.
If A, B ∈ Mn (C) are normal matrices with eigenvalues α1 , . . . , αn , β1 , . . . , βn ,
respectively, then det(A + B) lies in the convex hull of the products

\[ \prod_{j=1}^{n} \left(\alpha_j + \beta_{\sigma(j)}\right), \quad \sigma \in S_n . \]

This is the Marcus-de Oliveira conjecture [71, 79], a longstanding open problem.

Concerning more general multiplicative inequalities, involving singular values of
matrices, we state the Gelfand-Naimark Theorem (see, e.g. [47, 72]).
Theorem 3.9 (Gelfand-Naimark, 1950) For A, B ∈ Mn (C),

\[ \prod_{j=1}^{k} s_{i_j}(A)\, s_{n-i_j+1}(B) \le \prod_{j=1}^{k} s_j(AB), \quad k = 1, \dots, n, \]

equivalently,

\[ \prod_{j=1}^{k} s_{i_j}(AB) \le \prod_{j=1}^{k} s_j(A)\, s_{i_j}(B), \quad k = 1, \dots, n, \]

for 1 ≤ i1 < i2 < · · · < ik ≤ n, with equality for k = n.

The next result [9, 21] has a simple proof, using majorization theory.
Theorem 3.10 If A, B ≥ 0 have eigenvalues a1 ≥ · · · ≥ an and b1 ≥ · · · ≥ bn ,
respectively, then

\[ \prod_{j=1}^{n} \left(a_j^{2} + b_j^{2}\right) \le |\det(A + iB)|^{2} \le \prod_{j=1}^{n} \left(a_j^{2} + b_{n-j+1}^{2}\right). \]

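Proof We may assume A, B > 0. We easily find that

\[ |\det(A + iB)|^{2} = \det(A)^{2} \det\!\left(I_n + (A^{-1}B)^{2}\right) = \prod_{j=1}^{n} a_j^{2}\, \prod_{j=1}^{n} \left(1 + \lambda_j^{2}(A^{-1}B)\right) \]

and

\[ \lambda_j(A^{-1}B) = s_j^{2}\!\left(A^{-1/2} B^{1/2}\right), \quad j = 1, \dots, n. \]

By the Gelfand-Naimark Theorem with A, B replaced by $B^{1/2}$, $A^{-1/2}$, respectively,

\[ \prod_{j=1}^{k} s_{n-j+1}(A^{-1/2})\, s_j(B^{1/2}) \le \prod_{j=1}^{k} s_j(A^{-1/2} B^{1/2}) \le \prod_{j=1}^{k} s_j(A^{-1/2})\, s_j(B^{1/2}) \]

hold for k = 1, . . . , n, with equality for k = n. Clearly,

\[ s_{n-j+1}^{2}(A^{-1/2})\, s_j^{2}(B^{1/2}) = \frac{b_j}{a_j}, \qquad s_j^{2}(A^{-1/2})\, s_j^{2}(B^{1/2}) = \frac{b_j}{a_{n-j+1}}, \quad j = 1, \dots, n. \]

Thus, the previous singular value inequalities are equivalent to

\[ \left(\frac{b_1}{a_1}, \dots, \frac{b_n}{a_n}\right) \prec_{\log} \lambda(A^{-1}B) \prec_{\log} \left(\frac{b_1}{a_n}, \dots, \frac{b_n}{a_1}\right). \]

Since the function $f(x) = \log(1 + x^{2})$ is a continuous increasing function on
(0, ∞), such that $f(e^{t})$ is convex in t, by Proposition 1.4 applied to the previous
log-majorization, we obtain

\[ \sum_{j=1}^{n} \log\!\left(1 + \frac{b_j^{2}}{a_j^{2}}\right) \le \sum_{j=1}^{n} \log\!\left(1 + \lambda_j^{2}(A^{-1}B)\right) \le \sum_{j=1}^{n} \log\!\left(1 + \frac{b_{n-j+1}^{2}}{a_j^{2}}\right). \]

Thus,

\[ \prod_{j=1}^{n} \left(1 + \frac{b_j^{2}}{a_j^{2}}\right) \le \prod_{j=1}^{n} \left(1 + \lambda_j^{2}(A^{-1}B)\right) \le \prod_{j=1}^{n} \left(1 + \frac{b_{n-j+1}^{2}}{a_j^{2}}\right) \]

and the result easily follows.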

Remark 3.11 If one of the two matrices in Theorem 3.10 is not positive semidefinite,
the lower bound is not necessarily true. Indeed, let A, B be Hermitian matrices
with eigenvalues a1 = 1, a2 = −1 and b1 = 2, b2 = 1, respectively. As the Marcus-de
Oliveira conjecture holds for n = 2, det(A + iB) lies in the line segment with
endpoints −3 − i and −3 + i; consequently, we have

\[ 3 \le |\det(A + iB)| \le \sqrt{10}. \]

If Theorem 3.10 were true in this case, the lower bound of | det(A + iB)| would be $\sqrt{10}$, but in
the previous case the lower bound 3 is attained.
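The numbers in Remark 3.11 can be checked directly. In the sketch below the endpoints are the two products from the Marcus-de Oliveira convex hull; the specific matrix B realizing the value 3 is an illustrative choice made here and is not taken from the source.

```latex
% Endpoints: products (a_1 + i b_{\sigma(1)})(a_2 + i b_{\sigma(2)}) for the two permutations.
\[
(1 + 2i)(-1 + i) = -3 - i, \qquad (1 + i)(-1 + 2i) = -3 + i .
\]
% Both endpoints have modulus \sqrt{10}; the midpoint -3 has the smallest modulus on the segment.
% Illustrative matrices attaining the midpoint (B has eigenvalues 2 and 1):
\[
A = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad
B = \frac{1}{2}\begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix},
\]
\[
\det(A + iB) = \left(1 + \tfrac{3}{2} i\right)\!\left(-1 + \tfrac{3}{2} i\right) - \left(\tfrac{i}{2}\right)^{2}
= -\tfrac{13}{4} + \tfrac{1}{4} = -3 .
\]
```

So |det(A + iB)| = 3 < √10 for this pair, which is why the lower bound of Theorem 3.10 cannot be kept when A is indefinite.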
Now, considering the Cartesian decomposition A = Re A + i Im A of
A ∈ Mn (C), where

\[ \mathrm{Re}\, A = \frac{A + A^{*}}{2} \quad \text{and} \quad \mathrm{Im}\, A = \frac{A - A^{*}}{2i} \]

are Hermitian matrices, the next corollary is easy to derive.
Corollary 3.12 If A ∈ Mn (C) is such that Re A > 0 then

det |Re A| ≤ |det A| .

We remark that related inequalities are surveyed in [101, Section 3.4].

4 Golden-Thompson Inequality and Araki’s Log-majorization

For matrices A, B that commute, the following identity holds:

\[ e^{A+B} = e^{A} e^{B}. \]

In the noncommutative case, the situation is not so simple.
Let H, K be Hermitian matrices of the same order. It is obvious that

\[ \det e^{H+K} = \det e^{H} \det e^{K}. \]

Furthermore, the following remarkable trace inequality, motivated by considerations
in statistical mechanics, states a relation between $e^{H+K}$ and $e^{H} e^{K}$, even when these
matrices do not commute.
Theorem 4.1 (Golden-Thompson Inequality, 1965) If H, K ∈ Mn (C) are Hermitian,
then

\[ \mathrm{tr}\, e^{H+K} \le \mathrm{tr}\, e^{H} e^{K}. \tag{10} \]

Nowadays, (10) is a basic tool in quantum statistical mechanics. Some historical
aspects and its applications in random matrix theory are collected in [30], including
a not previously published proof due to Dyson. Golden [40] proved (10) for positive
semidefinite matrices and observed that it may be used to get lower bounds for the
Helmholtz free energy by partitioning the Hamiltonian. C. J. Thompson [89] showed
(10) for Hermitian matrices, independently of the semidefiniteness condition, and
applied it to obtain an upper bound for the partition function of an antiferromagnetic
chain, that is, for $z = \mathrm{tr}\, e^{-\beta H}$, where H is the Hamiltonian of the physical system
and β = 1/(kB T ), with kB the Boltzmann constant and T the absolute temperature.
Symanzik [86] obtained (10) for particular selfadjoint Hilbert space operators, in
that context showing that the classical partition function is an upper bound for the
corresponding quantum partition function. Some discussion on Golden-Thompson
inequality can also be found in the expository blog post by Terence Tao [87].
The direct extension of the Golden-Thompson inequality to three or more
matrices fails. If any two of the Hermitian matrices H, K, L commute, then

\[ \mathrm{tr}\, e^{H+K+L} \le \mathrm{tr}\, e^{H} e^{K} e^{L} \]

obviously holds, but this is not true in general as the next counter-example, due to
C. J. Thompson [90], shows.
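Example Consider the Pauli matrices

\[ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \]

the real vector a = (a1 , a2 , a3 ) and the matrix A = a1 σ1 + a2 σ2 + a3 σ3 . Then

\[ e^{A} = \cosh\|a\|\; I_2 + \frac{\sinh\|a\|}{\|a\|}\, A. \]

Here ‖a‖ is the Euclidean norm of a. For ε ∈ R \ {0}, let

\[ H = \epsilon\sigma_1, \quad K = \epsilon\sigma_2, \quad L = \epsilon(\sigma_3 - \sigma_2 - \sigma_1). \]

In this case,

\[ \mathrm{tr}\, e^{H+K+L} = 2\cosh\epsilon \]

and, after some calculations, we find

\[ \mathrm{tr}\, e^{H} e^{K} e^{L} = 2\cosh\epsilon \left(1 - \frac{\epsilon^{4}}{12} + O(\epsilon^{6})\right). \]

Therefore, for ε small enough, we have

\[ \mathrm{tr}\, e^{H} e^{K} e^{L} < \mathrm{tr}\, e^{H+K+L}. \]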

Nevertheless, there is a nontrivial generalization of the Golden-Thompson inequality
to a triple of Hermitian matrices by Lieb [67], as well as recent multivariate versions
[42, 85].
We notice in passing the interesting related result due to R. C. Thompson [91]: if
H, K are Hermitian matrices, then

\[ e^{iH} e^{iK} = e^{i(UHU^{*} + VKV^{*})}, \]

for some unitary matrices U, V . This result has applications in quantum computing.
We observe that Thompson’s result was obtained before the long-standing Horn
conjecture on eigenvalues of sums of Hermitian matrices was solved (see [20]
for more details).
Several trace inequalities may be strengthened in the setting of majorization. This
is the case of the Golden-Thompson inequality. In fact, it was proved by Lenard [66]
and Thompson [90] that

\[ e^{H+K} \prec_{w} e^{H/2} e^{K} e^{H/2} \]

holds for Hermitian matrices H, K.
Closely related, Araki [7] obtained a log-majorization presented in the next
theorem that extends the Lieb-Thirring trace inequality:

\[ \mathrm{tr}\,(AB)^{r} \le \mathrm{tr}\, A^{r} B^{r}, \quad r \in \mathbb{N}, \]

for A, B ≥ 0, firstly used to derive inequalities for the moments of the eigenvalues
of the Schrödinger Hamiltonian [68].
Theorem 4.2 (Araki’s Log-majorization, 1990) Let A, B ≥ 0. Then

\[ (A^{1/2} B A^{1/2})^{r} \prec_{\log} A^{r/2} B^{r} A^{r/2}, \quad r \ge 1, \tag{11} \]

or equivalently

\[ (A^{q/2} B^{q} A^{q/2})^{1/q} \prec_{\log} (A^{p/2} B^{p} A^{p/2})^{1/p}, \quad 0 < q \le p. \tag{12} \]

Proof It is clear that $(A^{1/2} B A^{1/2})^{r}$ and $A^{r/2} B^{r} A^{r/2}$ have the same determinant. Assuming
A invertible, let us prove that

\[ \lambda_1\!\left((A^{1/2} B A^{1/2})^{r}\right) \le \lambda_1\!\left(A^{r/2} B^{r} A^{r/2}\right). \tag{13} \]

To do so, it suffices to prove that $A^{r/2} B^{r} A^{r/2} \le I_n$ implies $A^{1/2} B A^{1/2} \le I_n$, because
both sides of (13) have the same order of homogeneity for A and B, so that we can
multiply A, B by a positive constant. Indeed, $A^{r/2} B^{r} A^{r/2} \le I_n$ means $B^{r} \le A^{-r}$ and, since r ≥ 1, the
Löwner-Heinz inequality implies $B \le A^{-1}$, that is, $A^{1/2} B A^{1/2} \le I_n$. If A is not invertible, by a continuity argument,
we obtain (13). By properties P1 and P3,

\[ (A^{\wedge k})^{r/2} (B^{\wedge k})^{r} (A^{\wedge k})^{r/2} = \left(A^{r/2} B^{r} A^{r/2}\right)^{\wedge k}, \]

\[ \left((A^{\wedge k})^{1/2}\, B^{\wedge k}\, (A^{\wedge k})^{1/2}\right)^{r} = \left((A^{1/2} B A^{1/2})^{r}\right)^{\wedge k}. \]

Hence, (11) follows from (13) applied to the matrices $A^{\wedge k}$, $B^{\wedge k}$, k = 1, . . . , n,
using Lemma 2.4.
For 0 < q ≤ p, we may replace A and B by $A^{q}$, $B^{q}$ and take r = p/q in (11) so
that (12) follows.
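The homogeneity reduction used in the proof can be spelled out as follows. This is only a sketch of the standard scaling argument; the auxiliary scalar t is introduced here for illustration and does not appear in the source.

```latex
% Sketch of the scaling step; t is an auxiliary symbol introduced for this illustration.
% Assume A > 0, B \ge 0, B \ne 0, r \ge 1, and set
\[
t = \lambda_1\!\left(A^{r/2} B^{r} A^{r/2}\right)^{1/r} > 0 .
\]
% Replacing B by B/t normalizes the right-hand side of (13):
\[
A^{r/2} (B/t)^{r} A^{r/2} \le I_n
\;\Longleftrightarrow\; (B/t)^{r} \le A^{-r}
\;\Longrightarrow\; B/t \le A^{-1}
\;\Longleftrightarrow\; A^{1/2} (B/t) A^{1/2} \le I_n ,
\]
% where the middle implication is (6) with exponent 1/r \in (0, 1]. Undoing the scaling:
\[
\lambda_1\!\left((A^{1/2} B A^{1/2})^{r}\right)
= \lambda_1\!\left(A^{1/2} B A^{1/2}\right)^{r}
\le t^{r}
= \lambda_1\!\left(A^{r/2} B^{r} A^{r/2}\right).
\]
```

The same normalization pattern (prove "right side ≤ I implies left side ≤ I", then rescale) is what the later proofs in this chapter rely on when they reduce a λ1 inequality to an operator inequality.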

Araki’s log-majorization readily implies the next trace inequality.
Corollary 4.3 (Araki-Lieb-Thirring Inequality) If A, B ≥ 0, r ≥ 1 and s > 0,
then

\[ \mathrm{tr}\,(A^{1/2} B A^{1/2})^{rs} \le \mathrm{tr}\,(A^{r/2} B^{r} A^{r/2})^{s}. \]

The next extension of the Golden-Thompson inequality is now easy to derive, as
observed by Ando and Hiai [5].
Corollary 4.4 If H, K ∈ Mn (C) are Hermitian and p > 0, then

\[ e^{H+K} \prec_{\log} \left(e^{pH/2} e^{pK} e^{pH/2}\right)^{1/p}. \tag{14} \]

Proof Consider $A = e^{H}$ and $B = e^{K}$ in (12). Further, keep in mind the continuous
parameter version of the Lie-Trotter formula [5, Lemma 1.6]:

\[ \lim_{q \to 0} \left(e^{qH/2} e^{qK} e^{qH/2}\right)^{1/q} = e^{H+K}, \]

and the result follows.

Let H, K be Hermitian matrices. If p = 1, then (14) can be written as

\[ e^{H+K} \prec_{\log} e^{H} e^{K}, \tag{15} \]

since $e^{H/2} e^{K} e^{H/2}$ and $e^{H} e^{K}$ have the same eigenvalues. From the previous results, we
can see that the Golden-Thompson inequality is strengthened to

\[ |||\, e^{H+K} \,||| \le |||\, (e^{pH/2} e^{pK} e^{pH/2})^{1/p} \,|||, \quad p > 0, \]

for any unitarily invariant norm, and the right hand side decreases to the left hand
side as p converges to 0.

5 Ando-Hiai Inequality

The axiomatic theory of operator connections was developed by F. Kubo and
T. Ando [62]. A matrix connection of order n is a binary operation σ on the
cone of positive semidefinite matrices in Mn (C), satisfying for any A, B, C, D,
Ak , Bk ≥ 0:
C1. (joint monotonicity) A ≤ C and B ≤ D ⇒ A σ B ≤ C σ D;
C2. (transformer inequality) X∗ (A σ B)X ≤ (X∗ AX) σ (X∗ BX) for any
X ∈ Mn (C);
C3. (joint continuity from above) Ak ↓ A and Bk ↓ B ⇒ Ak σ Bk ↓ A σ B,
where Ak ↓ A means that A1 ≥ A2 ≥ · · · ≥ Ak ≥ · · · and Ak converges strongly
to A as k → ∞. An operator connection is a matrix connection of every order
n ∈ N. An operator mean is an operator connection σ , satisfying the normalization
property In σ In = In . For instance, for α ∈ [0, 1],

\[ A \,\nabla_{\alpha}\, B = (1 - \alpha)A + \alpha B \quad \text{and} \quad A \,!_{\alpha}\, B = \left((1 - \alpha)A^{-1} + \alpha B^{-1}\right)^{-1} \]

are the weighted arithmetic and harmonic means, respectively; $A \,\omega_l\, B = A$ and
$A \,\omega_r\, B = B$ are the left and right trivial operator means, respectively.
Kubo and Ando proved that there is a one-to-one correspondence between
operator connections and operator monotone functions on $\mathbb{R}_0^+$.
Theorem 5.1 ([62]) For each operator connection σ , there exists a unique operator
monotone function $f : \mathbb{R}_0^+ \to \mathbb{R}_0^+$, satisfying

\[ f(t)\, I_n = I_n \,\sigma\, (t I_n), \quad t > 0, \]

and for A, B > 0 the formula

\[ A \,\sigma\, B = A^{1/2} f\!\left(A^{-1/2} B A^{-1/2}\right) A^{1/2} \]

holds, with the right hand side defined via functional calculus, and extended to
A, B ≥ 0 as follows

\[ A \,\sigma\, B = \lim_{\epsilon \to 0^{+}} (A + \epsilon I_n) \,\sigma\, (B + \epsilon I_n). \]

Let α ∈ [0, 1]. In particular, associated with the operator monotone function
$f(t) = t^{\alpha}$, the α-weighted geometric mean is

\[ A \,\#_{\alpha}\, B = A^{1/2} \left(A^{-1/2} B A^{-1/2}\right)^{\alpha} A^{1/2}. \]

It is easy to see that $A \,\#_{\alpha}\, B = B \,\#_{1-\alpha}\, A$ and when A commutes with B, then
$A \,\#_{\alpha}\, B = A^{1-\alpha} B^{\alpha}$. The geometric mean, simply denoted by #, is the mean
corresponding to $f(t) = t^{1/2}$. It is the unique positive semidefinite solution of the
Riccati equation $X A^{-1} X = B$, also characterized [80] as

\[ A \,\#\, B = \max\left\{ X \in H_n : \begin{pmatrix} A & X \\ X & B \end{pmatrix} \ge 0 \right\}. \tag{16} \]

Further, there is a unitary matrix U such that $A \,\#\, B = A^{1/2} U B^{1/2}$.
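In the commuting case these formulas can be checked by hand. The diagonal matrices below are an illustrative choice made for this sketch (not from the source), with # denoting the geometric mean and #_α its weighted version.

```latex
% Illustrative commuting (diagonal) example, chosen for this sketch.
\[
A = \begin{pmatrix} 4 & 0 \\ 0 & 1 \end{pmatrix}, \qquad
B = \begin{pmatrix} 1 & 0 \\ 0 & 9 \end{pmatrix}, \qquad
A \,\#\, B = A^{1/2} B^{1/2} = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}.
\]
% X = A # B solves the Riccati equation X A^{-1} X = B:
\[
\begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}
\begin{pmatrix} 1/4 & 0 \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & 9 \end{pmatrix} = B .
\]
% Weighted version and its symmetry under (A, alpha) <-> (B, 1 - alpha):
\[
A \,\#_{\alpha}\, B = A^{1-\alpha} B^{\alpha}
= \begin{pmatrix} 4^{\,1-\alpha} & 0 \\ 0 & 9^{\,\alpha} \end{pmatrix}
= B^{\,1-(1-\alpha)} A^{\,1-\alpha} = B \,\#_{1-\alpha}\, A .
\]
```

For non-commuting A, B the mean is no longer a product of powers, which is exactly why the log-majorizations of the following sections are needed to compare it with such products.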
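Ando and Hiai [5] proved the following interesting result, concerning the
weighted geometric mean.
Theorem 5.2 (Ando-Hiai Inequality, 1994) For A, B ≥ 0 and α ∈ [0, 1],

\[ A^{r} \,\#_{\alpha}\, B^{r} \prec_{\log} (A \,\#_{\alpha}\, B)^{r}, \quad r \ge 1, \tag{17} \]

or, equivalently,

\[ \left(A^{p} \,\#_{\alpha}\, B^{p}\right)^{1/p} \prec_{\log} \left(A^{q} \,\#_{\alpha}\, B^{q}\right)^{1/q}, \quad 0 < q \le p. \tag{18} \]

Proof If 1 ≤ r ≤ 2, then r = 1 + ε with ε ∈ [0, 1]. Suppose

\[ A \,\#_{\alpha}\, B \le I_n. \tag{19} \]

Let $C = A^{-1/2} B A^{-1/2}$. By continuity, we may assume that A, B are invertible.
It follows from (19) that $A \le C^{-\alpha}$. By the Löwner-Heinz inequality, we have
$A^{\epsilon} \le C^{-\alpha\epsilon}$. In this case,

\[ \begin{aligned}
A^{r} \,\#_{\alpha}\, B^{r} &= A^{1/2} \left\{ A^{\epsilon} \,\#_{\alpha}\, \left[ (CAC) \,\#_{1-\epsilon}\, C \right] \right\} A^{1/2} \\
&\le A^{1/2} \left\{ C^{-\alpha\epsilon} \,\#_{\alpha}\, \left[ C^{2-\alpha} \,\#_{1-\epsilon}\, C \right] \right\} A^{1/2} \\
&= A^{1/2} C^{\alpha} A^{1/2} = A \,\#_{\alpha}\, B,
\end{aligned} \]

by the joint monotonicity of the weighted geometric means $\#_{\alpha}$ and $\#_{1-\epsilon}$. Therefore,
$A^{r} \,\#_{\alpha}\, B^{r} \le I_n$. We have just proved that $\lambda_1(A \,\#_{\alpha}\, B) \le 1$ implies $\lambda_1(A^{r} \,\#_{\alpha}\, B^{r}) \le 1$.
Thus,

\[ \lambda_1(A^{r} \,\#_{\alpha}\, B^{r}) \le \lambda_1(A \,\#_{\alpha}\, B)^{r} \]

and applying the antisymmetric tensor power trick, having also in mind that

\[ \det(A^{r} \,\#_{\alpha}\, B^{r}) = \det(A \,\#_{\alpha}\, B)^{r}, \]

then (17) holds for 1 ≤ r ≤ 2. If r > 2, then $r = 2^{m} s$ for m ∈ N and 1 ≤ s ≤ 2. By
repeated use of the above case, we find that

\[ A^{2^{m} s} \,\#_{\alpha}\, B^{2^{m} s} \prec_{\log} \left(A^{2^{m-1} s} \,\#_{\alpha}\, B^{2^{m-1} s}\right)^{2} \prec_{\log} \cdots \prec_{\log} \left(A^{s} \,\#_{\alpha}\, B^{s}\right)^{2^{m}} \prec_{\log} (A \,\#_{\alpha}\, B)^{2^{m} s}. \]

This proves that (17) also holds for r > 2. Now, for 0 < q ≤ p, the result easily
follows.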

The following corollary complements the previous log-majorizations of Golden-
Thompson type [5].
Corollary 5.3 If H, K are Hermitian matrices and α ∈ [0, 1], then

\[ \left(e^{pH} \,\#_{\alpha}\, e^{pK}\right)^{1/p} \prec_{\log} e^{(1-\alpha)H + \alpha K}, \quad p > 0. \]

Proof Consider (18) applied to $A = e^{H}$ and $B = e^{K}$, then use

\[ \lim_{q \to 0} \left(e^{qH} \,\#_{\alpha}\, e^{qK}\right)^{1/q} = e^{(1-\alpha)H + \alpha K}, \]

that is, the Lie-Trotter like formula for the weighted geometric mean obtained in
[45, Lemma 3.3].

The corresponding inequality for unitarily invariant norms holds, with the left
hand side norm increasing to the right hand side as p converges to 0, in view of (18).
A celebrated development of the Löwner-Heinz inequality established by T. Furuta
[35] is the next order preserving operator inequality, which was motivated by a
previous conjecture by Chan and Kwong [25].
Theorem 5.4 (Furuta Inequality, 1987) Let A ≥ B ≥ 0. Then

\[ A^{\frac{p+r}{q}} \ge \left(A^{r/2} B^{p} A^{r/2}\right)^{1/q} \quad \text{and} \quad \left(B^{r/2} A^{p} B^{r/2}\right)^{1/q} \ge B^{\frac{p+r}{q}} \tag{20} \]

hold for r ≥ 0, p ≥ 0 and q ≥ 1 with (1 + r)q ≥ p + r.

The case of p, q, r all equal to 2 in (20) affirmatively answers Chan and Kwong’s
conjecture:

\[ A \ge B \ge 0 \;\Rightarrow\; A^{2} \ge \left(A B^{2} A\right)^{1/2}. \]

Furuta and many other researchers refined and generalized (20) and applied these
results to produce new inequalities [36].
The essential part of Furuta inequality is the case q = (p + r)/(1 + r), which can be
formulated for invertible A, using the weighted geometric mean, as follows:

\[ A \ge B \ge 0 \;\Rightarrow\; A^{-r} \,\#_{\frac{1+r}{p+r}}\, B^{p} \le A, \quad p \ge 1, \; r \ge 0. \tag{21} \]

Fujii and Kamei [33] showed that Ando-Hiai inequality is equivalent to Furuta
inequality. Next, we show this direct implication. Indeed, let A ≥ B > 0 and p ≥ 1.
Firstly, if 0 ≤ r ≤ 1, then $A^{r} \ge B^{r}$ by the Löwner-Heinz inequality. Consequently,

\[ A^{-r} \,\#_{\frac{r}{p+r}}\, B^{p} \le B^{-r} \,\#_{\frac{r}{p+r}}\, B^{p} = I_n. \]

On the other hand, for r > 1, observe that $A^{-1} \le B^{-1}$ yields

\[ A^{-1} \,\#_{\frac{r}{p+r}}\, B^{p/r} \le B^{-1} \,\#_{\frac{r}{p+r}}\, B^{p/r} = I_n \]

and so Ando-Hiai inequality implies that

\[ A^{-r} \,\#_{\frac{r}{p+r}}\, B^{p} \le I_n. \]

Therefore, for any r ≥ 0 we find that

\[ A^{-r} \,\#_{\frac{1+r}{p+r}}\, B^{p} = \left(A^{-r} \,\#_{\frac{r}{p+r}}\, B^{p}\right) \#_{\frac{1}{p}}\, B^{p} \le I_n \,\#_{\frac{1}{p}}\, B^{p} = B \le A, \]

that is, the essential part of Furuta inequality (21) holds. The remaining part follows
readily from Löwner-Heinz inequality.
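The first equality in the last display rests on a composition rule for weighted geometric means, which the text uses without comment. A brief sketch follows; the generic symbols X, Y, M, u, v are introduced here for illustration, and the rule is stated under the assumption X, Y > 0 with u, v ∈ [0, 1].

```latex
% Composition rule (assumed setting: X, Y > 0 and u, v in [0,1]); M is an auxiliary symbol.
% With M = X^{-1/2} Y X^{-1/2}, both means are congruences by X^{1/2} of functions of M:
\[
(X \,\#_{u}\, Y) \,\#_{v}\, Y
= X^{1/2} \left( M^{u} \,\#_{v}\, M \right) X^{1/2}
= X^{1/2} M^{\,u(1-v) + v} X^{1/2}
= X \,\#_{\,u + v(1-u)}\, Y .
\]
% Exponent bookkeeping for u = r/(p+r), v = 1/p:
\[
u + v(1-u) = \frac{r}{p+r} + \frac{1}{p}\cdot\frac{p}{p+r} = \frac{1+r}{p+r},
\qquad
I_n \,\#_{\frac{1}{p}}\, B^{p} = (B^{p})^{1/p} = B .
\]
```

The first step uses that congruence by the invertible matrix X^{1/2} passes through the mean (C2 with equality for invertible X), and the second uses that M^u and M commute.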
Extensions of Furuta inequality and Ando-Hiai log-majorization were given by
Furuta [37] and afterwards by other authors. Nowadays, the multivariate geometric
mean as settled in [22, 78], following a Riemannian geometric approach, is often
called the Karcher mean [63]. It is also called Cartan mean and Riemannian mean.
Extensions of Ando-Hiai inequality to the Karcher mean and generalized Karcher
mean are due to Yamazaki [99, 100]. Other Ando-Hiai type inequalities have
meanwhile been obtained. See the recent works by Hiai, Seo and Wada [49, 50],
Kian, Moslehian and Seo [57–59] and references therein.

6 BLP and Matharu-Aujla Inequalities

Using the previous techniques of Ando and Hiai, Bebiano, Lemos and Providência
[12, Theorem 2.1] obtained the next log-majorization of Araki’s type.
Theorem 6.1 (BLP Inequality, 2005) For A, B ≥ 0,

\[ A^{\frac{1+q}{2}} B^{q} A^{\frac{1+q}{2}} \prec_{\log} A^{1/2} \left(A^{r/2} B^{r} A^{r/2}\right)^{q/r} A^{1/2}, \quad 0 < q \le r. \tag{22} \]

Proof The equality of the determinants is clear. It is enough to prove that

\[ \lambda_1\!\left(A^{\frac{1+q}{2}} B^{q} A^{\frac{1+q}{2}}\right) \le \lambda_1\!\left(A^{1/2} \left(A^{r/2} B^{r} A^{r/2}\right)^{q/r} A^{1/2}\right) \tag{23} \]

when A is invertible, otherwise a continuity argument is used. Assume that

\[ A^{1/2} \left(A^{r/2} B^{r} A^{r/2}\right)^{q/r} A^{1/2} \le I_n, \]

that is,

\[ A^{-1} \ge \left(A^{r/2} B^{r} A^{r/2}\right)^{q/r} \ge 0. \]

By Furuta inequality, since r > 0, r/q ≥ 1 and (1 + r) r/q ≥ r/q + r, we find

\[ A^{-(1+q)} = A^{-\frac{r/q + r}{r/q}} \ge \left[ A^{-r/2} \left( \left(A^{r/2} B^{r} A^{r/2}\right)^{q/r} \right)^{r/q} A^{-r/2} \right]^{q/r} = B^{q}, \]

that is,

\[ A^{\frac{1+q}{2}} B^{q} A^{\frac{1+q}{2}} \le I_n \]

and then (23) holds. Using Lemma 2.4, the result follows from (23) replacing A, B
by $A^{\wedge k}$, $B^{\wedge k}$, respectively, for k = 1, . . . , n, by properties P1, P3, P5.

For convenience of notation, for α ∈ R, let

\[ A \,\natural_{\alpha}\, B = A^{1/2} \left(A^{-1/2} B A^{-1/2}\right)^{\alpha} A^{1/2}, \]

which is just the α-weighted geometric mean of A, B ≥ 0 when α ∈ [0, 1].

Corollary 6.2 If A, B > 0 and 1 ≤ α ≤ 2, then

\[ A^{1-\alpha} B^{\alpha} \prec_{\log} A \,\natural_{\alpha}\, B. \]

Proof The case α = 1 is trivial. For α > 1, let q = α − 1, r = 1, replace A, B by
$B$, $A^{-1}$, respectively, in Theorem 6.1 and note that $B \,\natural_{1-\alpha}\, A = A \,\natural_{\alpha}\, B$.

A. Matsumoto, R. Nakamoto and M. Fujii [75, Theorem 1] proved that

\[ \left\| A^{\frac{s+q}{2}} B^{q} A^{\frac{s+q}{2}} \right\| \le \left\| A^{s/2} \left(A^{p/2} B^{p} A^{p/2}\right)^{q/p} A^{s/2} \right\|, \quad 0 < q \le p, \; s \ge 0, \tag{24} \]

for A, B ≥ 0, which reduces to the Araki-Cordes inequality [31] if s = 0. They also
proved (24) with the reverse inequality sign if 0 ≤ s ≤ p ≤ q and p > 0 [75,
Theorem 2]. Furuta [39, Corollary 3.1 iii.] obtained a norm inequality that yields
the reverse of (24) for 0 ≤ q ≤ p and −s ≥ q (see [64, p.28]). These norm
inequalities can be restated as follows.
Theorem 6.3 Let A, B ≥ 0. If 0 < q ≤ p and s ≥ 0, then

\[ A^{\frac{s+q}{2}} B^{q} A^{\frac{s+q}{2}} \prec_{\log} A^{s/2} \left(A^{p/2} B^{p} A^{p/2}\right)^{q/p} A^{s/2}. \tag{25} \]

If either 0 ≤ s ≤ p ≤ q and p > 0 or 0 ≤ q ≤ p and −s ≥ q, then (25) holds with
reversed log-majorization.
Araki’s log-majorization is obtained if s = 0 and BLP inequality if s = 1.
Corollary 6.4 If A > 0, B ≥ 0 and α ≥ 2, then

\[ A \,\natural_{\alpha}\, B \prec_{\log} A^{1-\alpha} B^{\alpha}. \]

Proof Let q = α − 1, p = s = 1 and replace A, B by $B$, $A^{-1}$, respectively, in
Theorem 6.3.

Clearly, if α = 2, the matrices in both hand sides of the log-majorizations given
in Corollary 6.2 and Corollary 6.4 have the same eigenvalues.
The Umegaki relative entropy [92] of the density matrices A, B is

S(A, B) = tr (A (log A − log B)).

Fujii and Kamei [32] introduced the variant

\[ \hat{S}(A|B) = A^{1/2} \log\!\left(A^{-1/2} B A^{-1/2}\right) A^{1/2}. \]

A logarithmic trace inequality [64] is now presented.

Theorem 6.5 Let A, B > 0. If q, s ≥ 0, then

\[ \mathrm{tr}\left(A^{s} (\log A^{q} + \log B^{q})\right) \le \mathrm{tr}\left(A^{s} \log \left(A^{p/2} B^{p} A^{p/2}\right)^{q/p}\right), \quad p > 0, \tag{26} \]

and the right hand side converges to the left hand side as p converges to 0.
Proof The log-majorization (25) implies the trace inequality

\[ \mathrm{tr}\left(A^{s} A^{q} B^{q}\right) \le \mathrm{tr}\left(A^{s} \left(A^{p/2} B^{p} A^{p/2}\right)^{q/p}\right), \quad 0 \le q \le p, \; s \ge 0, \]

with equality when q = 0. Taking the derivatives of the right and left hand
sides of the previous inequality at q = 0, observing that

\[ \frac{d}{dq} \left(A^{q} B^{q}\right) \Big|_{q=0} = \log A + \log B, \tag{27} \]

\[ \frac{d}{dq} \left(A^{p/2} B^{p} A^{p/2}\right)^{q/p} \Big|_{q=0} = \frac{1}{p} \log\left(A^{p/2} B^{p} A^{p/2}\right), \quad p > 0, \tag{28} \]

yields a trace inequality. Multiplying both sides of the obtained trace inequality
by q provides (26). By the parametric Lie-Trotter formula, we may see that (28)
converges to (27) as p converges to 0.

The case q = s = 1 in Theorem 6.5 is due to Hiai and Petz [45]. It was later
complemented in [5]. Using relative entropy terminology, Theorem 6.5 for q = s,
replacing B by $B^{-1}$, may be written in the condensed form

\[ S(A^{s}, B^{s}) \le -\frac{s}{p}\, \mathrm{tr}\left(\hat{S}(A^{p}|B^{p})\, A^{s-p}\right), \quad s \ge 0, \; p > 0, \]

which provides an upper bound for the relative entropy S(A, B) when s = 1.
Fujii, Nakamoto and Tominaga [34] improved BLP inequality as follows.
Theorem 6.6 If A, B ≥ 0, p ≥ 1, q ≥ 0, then

\[ \left\| A^{\frac{1+q}{2}} B^{1+q} A^{\frac{1+q}{2}} \right\|^{\frac{p+q}{p(1+q)}} \le \left\| A^{1/2} \left(A^{q/2} B^{q+p} A^{q/2}\right)^{1/p} A^{1/2} \right\|. \]

The next log-majorization [73] is obtained from Furuta inequality too.

Theorem 6.7 (Matharu-Aujla Inequality, 2012) Let A, B ≥ 0 and 0 ≤ α ≤ 1,
then

\[ A \,\#_{\alpha}\, B \prec_{\log} A^{1-\alpha} B^{\alpha}. \]

Proof If α = 0 or α = 1, the result is trivial. Let 0 < α < 1. Clearly, $A \,\#_{\alpha}\, B$ and
$A^{1-\alpha} B^{\alpha}$ have the same determinant. Let us prove that

\[ \lambda_1(A \,\#_{\alpha}\, B) \le \lambda_1(A^{1-\alpha} B^{\alpha}). \tag{29} \]

If A is invertible and $\lambda_1(A^{1-\alpha} B^{\alpha}) \le 1$, then

\[ A^{-(1-\alpha)} \ge B^{\alpha}. \]

By Furuta inequality with p = q = 1/α ≥ 1 and r ≥ 0, we find

\[ \left(A^{-(1-\alpha)}\right)^{1+\alpha r} \ge \left[ \left(A^{-(1-\alpha)}\right)^{r/2} B \left(A^{-(1-\alpha)}\right)^{r/2} \right]^{\alpha}. \]

Taking r = 1/(1 − α) yields

\[ A^{-1} \ge \left(A^{-1/2} B A^{-1/2}\right)^{\alpha}, \]

so that $\lambda_1(A \,\#_{\alpha}\, B) \le 1$ holds. If A is not invertible, by a continuity argument, (29)
is obtained. Using Lemma 2.4, the result follows from (29) replacing A, B by $A^{\wedge k}$,
$B^{\wedge k}$, respectively, for k = 1, . . . , n, by properties P1, P3, P5.

Furuta considered other operator inequalities implying the generalized BLP
inequality [38] as well as Matharu-Aujla inequality [39].

7 Inequalities for Operator Connections

In this section, some inequalities involving operator connections are presented.
For that purpose, we recall that the dual of a nonzero operator connection σ is the
operator connection σ⊥ defined by

\[ A \,\sigma^{\perp}\, B = \left(B^{-1} \,\sigma\, A^{-1}\right)^{-1} \]

for A, B > 0 and extended by continuity to A, B ≥ 0 as usual. Its representing
function satisfies

\[ f_{\sigma^{\perp}}(t) = t / f_{\sigma}(t), \quad t > 0. \]

In the sequel, for C, X ∈ Mn (C) the condensed notation $X^{\sim}$ stands for X or $X^{T}$,
whereas C ∈ Hn∼ means that if the symbol ∼ is omitted along the stated result, then
C is assumed Hermitian, and if ∼ acts as the transpose along the result, then C is
assumed symmetric.

Theorem 7.1 Let A, B ≥ 0 and C ∈ Hn∼ . If the representing functions of the
nonzero operator connections σ, τ, ρ satisfy $f_{\sigma}^{2} \le f_{\tau} f_{\rho}$, then

\[ s_1\!\left( (A \,\tau^{\perp}\, B)^{1/2}\, C^{*} (A \,\sigma\, B)^{\sim} C\, (A \,\rho^{\perp}\, B)^{1/2} \right) \le \lambda_1(A C^{*} B^{\sim} C). \tag{30} \]

Proof For C Hermitian, there exists U unitary, such that $U^{*} C U = D$ is real
diagonal and it is enough to prove that

\[ s_1\!\left( (A \,\tau^{\perp}\, B)^{1/2}\, D (A \,\sigma\, B) D\, (A \,\rho^{\perp}\, B)^{1/2} \right) \le \lambda_1(A D B D), \]

since we may replace A, B by $U^{*} A U$, $U^{*} B U$, respectively, and apply the transformer
inequality. If C is symmetric, by Takagi’s factorization [54, Corollary 4.4.4],
there exist V unitary and D diagonal with the singular values of C in its main
diagonal, such that $C = V D V^{T}$. In this case, we need to show that

\[ s_1\!\left( (A \,\tau^{\perp}\, B)^{1/2}\, D (A \,\sigma\, B)^{T} D\, (A \,\rho^{\perp}\, B)^{1/2} \right) \le \lambda_1(A D B^{T} D), \]

from which the result follows, replacing A, B by $V^{T} A \overline{V}$, $V^{T} B \overline{V}$, respectively.
Thus, assuming D to be a real diagonal matrix, we will check that

\[ \lambda_1(A D B^{\sim} D) \le 1 \;\Rightarrow\; s_1\!\left( (A \,\tau^{\perp}\, B)^{1/2}\, D (A \,\sigma\, B)^{\sim} D\, (A \,\rho^{\perp}\, B)^{1/2} \right) \le 1. \tag{31} \]

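Firstly, let A, B > 0. If $\lambda_1(A D B^{\sim} D) \le 1$, equivalently, $\lambda_1(D A^{\sim} D B) \le 1$, then
$D A^{\sim} D \le B^{-1}$ and $D B^{\sim} D \le A^{-1}$. By the transformer inequality C2 and the joint
monotonicity C1, we find

\[ \begin{aligned}
D (A \,\rho^{\perp}\, B)^{\sim} D = D \left(A^{\sim} \,\rho^{\perp}\, B^{\sim}\right) D &\le (D A^{\sim} D) \,\rho^{\perp}\, (D B^{\sim} D) \\
&\le B^{-1} \,\rho^{\perp}\, A^{-1} \\
&= (A \,\rho\, B)^{-1}.
\end{aligned} \]

Analogously, $D (A \,\tau^{\perp}\, B)^{\sim} D \le (A \,\tau\, B)^{-1}$. Under the hypothesis, we see that

\[ A^{1/2} f_{\sigma}(M)\, f_{\rho}(M)^{-1} f_{\sigma}(M)\, A^{1/2} \le A^{1/2} f_{\tau}(M)\, A^{1/2} = A \,\tau\, B, \tag{32} \]

where $M = A^{-1/2} B A^{-1/2}$. Therefore

\[ \lambda_1\!\left( (A \,\sigma\, B)(A \,\rho\, B)^{-1} (A \,\sigma\, B)(A \,\tau\, B)^{-1} \right) \le 1. \tag{33} \]

Moreover,

\[ s_1\!\left( (A \,\tau^{\perp}\, B)^{1/2}\, D (A \,\sigma\, B)^{\sim} D\, (A \,\rho^{\perp}\, B)^{1/2} \right) \]

is equal to the square root of

\[ \lambda_1\!\left( (A \,\sigma\, B)\, D (A \,\rho^{\perp}\, B)^{\sim} D\, (A \,\sigma\, B)\, D (A \,\tau^{\perp}\, B)^{\sim} D \right). \tag{34} \]

Now, it is clear that (34) is not greater than (33) and the implication (31) holds. If
A, B ≥ 0, by a continuity argument, the result is obtained.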

If the nonzero operator connections σ, τ, ρ satisfy $f_{\sigma}^{2} \ge f_{\tau} f_{\rho}$, then (30) holds
with each connection replaced by its dual [65].
Applying Theorem 7.1 for A, B ≥ 0, C ∈ Hn∼ and σ ≤ τ = ρ yields

\[ \lambda_1\!\left( (A \,\tau^{\perp}\, B)\, C^{*} (A \,\sigma\, B)^{\sim} C \right) \le \lambda_1(A C^{*} B^{\sim} C). \tag{35} \]

If τ = σ and ∼ is absent, this was observed in [64, Theorem 2.1] for C ≥ 0.
Corollary 7.2 If A, B ≥ 0, C ∈ Hn∼ and σ is a nonzero operator connection, then

\[ \begin{aligned}
(A \,\#\, B)\, C^{*} (A \,\#\, B)^{\sim} C &\prec_{\log} (A \,\sigma^{\perp}\, B)^{1/2}\, C^{*} (A \,\#\, B)^{\sim} C\, (A \,\sigma\, B)^{1/2} \\
&\prec_{\log} (A \,\sigma^{\perp}\, B)\, C^{*} (A \,\sigma\, B)^{\sim} C.
\end{aligned} \]

Proof We can see that

\[ \lambda_1\!\left( (A \,\#\, B)\, C^{*} (A \,\#\, B)^{\sim} C \right) \le s_1\!\left( A^{1/2}\, C^{*} (A \,\#\, B)^{\sim} C\, B^{1/2} \right) \le \lambda_1(A C^{*} B^{\sim} C). \tag{36} \]

The last inequality in (36) is the case τ, ρ as the trivial operator means $\omega_l$, $\omega_r$ and
σ = # in Theorem 7.1. The first inequality in (36) follows after taking square roots
of the obtained eigenvalues from the case τ = σ = # with ∼ deleted in (35),
then replacing C by $C^{*} (A \,\#\, B)^{\sim} C$. Applying Weyl’s trick to (36) and observing the
equality of the determinants of the matrices involved, a log-majorization is obtained.
Next, replace A by $A \,\sigma^{\perp}\, B$, B by $A \,\sigma\, B$ in that log-majorization and use the identity
$(A \,\sigma^{\perp}\, B) \,\#\, (A \,\sigma\, B) = A \,\#\, B$.

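Corollary 7.3 If A, B ≥ 0, C ∈ Hn∼ and 0 ≤ α ≤ β ≤ 1, then

\[ (A \,\#_{1-\alpha}\, B)^{1/2}\, C^{*} (A \,\#_{\beta/2}\, B)^{\sim} C\, (A \,\#_{1+\alpha-\beta}\, B)^{1/2} \prec_{\log} A C^{*} B^{\sim} C. \tag{37} \]

Proof If $\sigma = \#_{\beta/2}$, $\tau = \#_{\alpha}$, $\rho = \#_{\beta-\alpha}$ in Theorem 7.1, since β − α ∈ [0, 1], then

\[ s_1\!\left( (A \,\#_{1-\alpha}\, B)^{1/2}\, C^{*} (A \,\#_{\beta/2}\, B)^{\sim} C\, (A \,\#_{1+\alpha-\beta}\, B)^{1/2} \right) \le \lambda_1(A C^{*} B^{\sim} C). \]

Replace A, B, C by their kth compounds and apply properties P1–P6. The determinants
of the matrices in both hand sides of (37) are equal.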

Remark 7.4 Let A, B ≥ 0, C ∈ Hn∼ and α ∈ [0, 1]. If $\sigma = \#_{\alpha}$ in Corollary 7.2 and
β = 1 in Corollary 7.3, then

\[ (A \,\#_{1-\alpha}\, B)^{1/2}\, C^{*} (A \,\#\, B)^{\sim} C\, (A \,\#_{\alpha}\, B)^{1/2} \]

is log-majorized by $(A \,\#_{1-\alpha}\, B)\, C^{*} (A \,\#_{\alpha}\, B)^{\sim} C$ and $A C^{*} B^{\sim} C$, respectively, these
two matrices being related as follows:

\[ (A \,\#_{1-\alpha}\, B)\, C^{*} (A \,\#_{\alpha}\, B)^{\sim} C \prec_{\log} A C^{*} B^{\sim} C \tag{38} \]

as a consequence of applying Weyl’s trick to (35) when $\sigma = \tau = \#_{\alpha}$. In particular,
this implies the next trace inequality observed by Bhatia, Lim and Yamazaki [23]:

\[ \mathrm{tr}\left( (A \,\#_{1-\alpha}\, B)(A \,\#_{\alpha}\, B) \right) \le \mathrm{tr}(AB). \]

The question on whether it is possible to extend (38) to

\[ (A \,\sigma^{\perp}\, B)\, C^{*} (A \,\sigma\, B)^{\sim} C \prec_{\log} A C^{*} B^{\sim} C \]

for other operator connections σ naturally arises.

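Theorem 7.5 If A, B ≥ 0 and r ∈ N0 , then

\[ (A \,\#\, B)^{r+1} \prec_{\log} \left| A^{1/2} (A \,\#\, B)^{r} B^{1/2} \right| \prec_{\log} (AB)^{\frac{r+1}{2}}. \]

Proof We can check that

\[ \lambda_1\!\left( (A \,\#\, B)^{2(r+1)} \right) \le \lambda_1\!\left( A (A \,\#\, B)^{r} B (A \,\#\, B)^{r} \right) \le \lambda_1\!\left( (AB)^{r+1} \right). \tag{39} \]

The first inequality in (39) follows from (36) when the symbol ∼ is absent and
$C = (A \,\#\, B)^{r}$. Concerning the second, if A > 0 and $\lambda_1\!\left((AB)^{r+1}\right) \le 1$, then $B \le A^{-1}$
implies

\[ (A \,\#\, B)\, B\, (A \,\#\, B) \le (A \,\#\, B)\, A^{-1} (A \,\#\, B) = B \le A^{-1}. \]

By induction on r ∈ N0 , we easily prove that $(A \,\#\, B)^{r} B (A \,\#\, B)^{r} \le A^{-1}$, so

\[ \lambda_1\!\left( A (A \,\#\, B)^{r} B (A \,\#\, B)^{r} \right) \le 1. \]

Thus, the last inequality in (39) is true. By continuity, it remains valid for A ≥ 0.
After applying Weyl’s trick to (39), the obtained log-majorization implies the
log-majorization between the corresponding square roots.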


If $A, B \ge 0$, the last log-majorization in Theorem 7.5 and $AB \prec_{\log} |AB|$, which
follows readily from Weyl’s Majorant Theorem, yield

\[
\prod_{i=1}^{k} s_i\big( A^{\frac12} (A \sharp B)^{r} B^{\frac12} \big) \le \prod_{i=1}^{k} s_i^{\frac{r+1}{2}}(AB), \qquad k = 1, \dots, n,
\]

for $r \in \mathbb{N}_0$. If $r = 1$, these inequalities were obtained by Zou [102].

Conjecture 7.6 If $A, B \ge 0$ then

\[
A^{\alpha} (A \sharp_{\alpha} B) B^{1-\alpha} \prec_{\log} |AB|
\]

for all $\alpha \in [0, 1]$.

8 Ando and Visick’s Inequalities for the Hadamard Product

In this section, Ando and Visick’s inequalities [4, 94] for the Hadamard product
of positive definite matrices, which settled affirmatively Bapat and Johnson’s
conjecture [56], are revisited and weighted interpolations are presented.
We recall that a map $\Phi : M_m(\mathbb{C}) \to M_n(\mathbb{C})$ is called positive if $A \ge 0$ implies
$\Phi(A) \ge 0$, and it is called unital if $\Phi(I_m) = I_n$.
Lemma 8.1 ([1]) If $\Phi : M_m(\mathbb{C}) \to M_n(\mathbb{C})$ is a unital positive linear map and $f$ is
operator monotone on $\mathbb{R}_0^{+}$, then

\[
f(\Phi(A)) \ge \Phi(f(A)), \qquad A \ge 0.
\]

The proof presented below of Ando and Visick’s inequalities follows Ando’s
arguments [4]. We state these results in the following condensed form, where ∼ is
either omitted or acts as the transpose.
Theorem 8.2 If $A, B > 0$, then $A \circ B \prec_{w\log} AB^{\sim}$, that is,

\[
\prod_{i=k}^{n} \lambda_i(A \circ B) \ge \prod_{i=k}^{n} \lambda_i(AB^{\sim}), \qquad k = 1, \dots, n. \tag{40}
\]

Proof There exists a unital positive linear map $\Phi$ such that $\Phi(X \otimes Y) = X \circ Y$ for
all $X, Y \in M_n(\mathbb{C})$. For $A, B > 0$,

\[
\log(A \otimes B) = \log A \otimes I_n + I_n \otimes \log B.
\]

Then $H = \log A$, $K = \log B$ are Hermitian and

\[
\Phi(\log(A \otimes B)) = H \circ I_n + I_n \circ K = I_n \circ (H + K).
\]

Using Lemma 8.1 with $f(t) = \log t$, $t > 0$, we have

\[
\log \Phi(A \otimes B) = \log(A \circ B) \ge I_n \circ (H + K).
\]

By the Schur Majorization Theorem, $I_n \circ (H + K) \prec H + K$ holds, as $H + K$ is
Hermitian. Therefore,

\[
\sum_{i=k}^{n} \lambda_i(\log(A \circ B)) \ge \sum_{i=k}^{n} \lambda_i(I_n \circ (H + K)) \ge \sum_{i=k}^{n} \lambda_i(H + K)
\]

for $k = 1, \dots, n$. It follows that

\[
\prod_{i=k}^{n} \lambda_i(A \circ B) \ge \prod_{i=k}^{n} e^{\lambda_i(I_n \circ (H + K))} \ge \prod_{i=k}^{n} e^{\lambda_i(H + K)} = \prod_{i=k}^{n} \lambda_i\big(e^{H+K}\big)
\ge \prod_{i=k}^{n} \lambda_i\big(e^{H} e^{K}\big) = \prod_{i=k}^{n} \lambda_i(AB), \qquad k = 1, \dots, n.
\]

The last inequality is a consequence of the Golden-Thompson type log-majorization
(15). Then (40) with ∼ deleted is proved.
Since $K$ and $K^{T}$ have the same diagonal entries, $I_n \circ (H + K)$ may be replaced by
$I_n \circ (H + K^{T})$ in the proof above. In that case, $e^{H} e^{K}$ is replaced by $e^{H} e^{K^{T}} = AB^{T}$
and (40) with ∼ acting as the transpose is fulfilled.
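As an informal illustration of (40) with ∼ deleted, the products of the smallest eigenvalues of $A \circ B$ and of $AB$ can be compared on random real symmetric positive definite matrices. This is only a sketch: the helper names are hypothetical, and the spectrum of $AB$ is computed through the similar Hermitian matrix $A^{1/2} B A^{1/2}$.

```python
import numpy as np

def tail_products(eigs):
    """Products of the 1, 2, ..., n smallest eigenvalues (the products over i = k..n in (40))."""
    return np.cumprod(np.sort(eigs))

def random_spd(n, rng):
    """Random real symmetric positive definite test matrix (an arbitrary choice)."""
    X = rng.standard_normal((n, n))
    return X @ X.T + np.eye(n)

def hadamard_dominates_product(A, B, tol=1e-9):
    """Check prod_{i>=k} lambda_i(A o B) >= prod_{i>=k} lambda_i(AB) for every k."""
    had = np.linalg.eigvalsh(A * B)          # A * B is the entrywise (Hadamard) product
    w, V = np.linalg.eigh(A)
    A_half = (V * np.sqrt(w)) @ V.T          # A^{1/2}
    prod = np.linalg.eigvalsh(A_half @ B @ A_half)  # same spectrum as AB
    return bool(np.all(tail_products(had) >= tail_products(prod) * (1 - tol)))

rng = np.random.default_rng(0)
print(all(hadamard_dominates_product(random_spd(4, rng), random_spd(4, rng))
          for _ in range(20)))
```

For $k = 1$ the comparison reduces to $\det(A \circ B) \ge \det A \cdot \det B$.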

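Remark 8.3 For $A, B > 0$, by the Lie-Trotter formula, we have
\[
\lim_{p \to 0} \big( A^{p} B^{p} \big)^{\frac1p} = e^{\log A + \log B}
\]

and a Lie-Trotter type formula for the Hadamard product [97] is

\[
\lim_{p \to 0} \big( A^{p} \circ B^{p} \big)^{\frac1p} = e^{I_n \circ (\log A + \log B)}
\]

(see also [95, Theorem 1]). According to the previous proof, we can write
\[
A \circ B \prec_{w\log} \lim_{p \to 0} \big( A^{p} \circ B^{p} \big)^{\frac1p} \prec_{w\log} \lim_{p \to 0} \big( A^{p} (B^{\sim})^{p} \big)^{\frac1p} \prec_{\log} AB^{\sim}.
\]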

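Moreover, for $A, B > 0$ and $r > 0$, Visick [94] obtained

\[
\sum_{i=k}^{n} \lambda_i^{-r}(A \circ B) \le \sum_{i=k}^{n} \lambda_i^{-r}(AB^{\sim}), \qquad k = 1, \dots, n,
\]

and deduced Theorem 8.2 from it. In fact, this is equivalent to Theorem 8.2 as shown
by Bebiano and Perdigão [10]. One of the implications is a trivial consequence of
the following limit:

\[
\lim_{r \to 0} \frac{\lambda^{-r} - 1}{r} = -\log \lambda, \qquad \lambda > 0.
\]

To prove the other, note that (40) implies

\[
(A \circ B)^{-1} \prec_{w\log} (AB^{\sim})^{-1}.
\]

Considering the function $f(t) = \log(1 + \varepsilon e^{rt})$, which is convex and increasing for
$t > 0$, with $\varepsilon, r > 0$, by Proposition 1.3 ii, we obtain

\[
\prod_{i=k}^{n} \big( 1 + \varepsilon \lambda_i^{-r}(A \circ B) \big) \le \prod_{i=k}^{n} \big( 1 + \varepsilon \lambda_i^{-r}(AB^{\sim}) \big), \qquad k = 1, \dots, n.
\]

The implication follows, because

\[
\lim_{\varepsilon \to 0} \frac{1}{\varepsilon} \Big( \prod_{i=k}^{n} \big( 1 + \varepsilon \lambda_i^{-r} \big) - 1 \Big) = \sum_{i=k}^{n} \lambda_i^{-r}.
\]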

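Theorem 8.4 Let $A, B > 0$ and $D \in M_n(\mathbb{C})$ be a diagonal matrix, assumed real
when ∼ is absent. If $\alpha \in [0, 1]$, then

\[
\begin{aligned}
\prod_{i=k}^{n} \lambda_i\big( (A \circ B) |D|^2 \big) &\ge \prod_{i=k}^{n} \lambda_i\big( (A \sharp B)\, D\, (A \sharp B)^{\sim} D \big) \\
&\ge \prod_{i=k}^{n} \lambda_i\big( (A \sharp_{1-\alpha} B)\, D\, (A \sharp_{\alpha} B)^{\sim} D \big) \\
&\ge \prod_{i=k}^{n} \lambda_i\big( A\, D\, B^{\sim} D \big), \qquad k = 1, \dots, n,
\end{aligned} \tag{41}
\]

equality occurring for $k = 1$ in the last two inequalities.

Proof Let $D \in M_n(\mathbb{C})$ be a diagonal matrix. Then

\[
D (A \circ B) D \ge D \big( (A \sharp B) \circ (A \sharp B) \big) D = \big( D (A \sharp B) D \big) \circ (A \sharp B)
\]

and replacing $A$, $B$ in Theorem 8.2 by $D (A \sharp B) D$, $A \sharp B$, respectively, yields

\[
(A \circ B) |D|^2 \prec_{w\log} \big( D (A \sharp B) D \big) \circ (A \sharp B) \prec_{w\log} (A \sharp B)\, D\, (A \sharp B)^{\sim} D.
\]

Further, if $C = D \in H_n^{\sim}$ in Corollary 7.2 with $\sigma = \sharp_\alpha$, $\alpha \in [0, 1]$, and in (38), the
result is obtained.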

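If ∼ is deleted and $D = I_n$, then (41) was previously given by Ando [4,
Theorem 2] and, in this case, the remaining inequalities were obtained by Hiai and
Lin [48]. The complete version in Theorem 8.4 is derived in [65]. The inequalities
in (41) hold for $A, B > 0$, $k = 1, \dots, n$ and $D$ diagonal, but

\[
\prod_{i=k}^{n} \lambda_i\big( (A \circ B) |D|^2 \big) \ge \prod_{i=k}^{n} \lambda_i\big( (A \sharp B)\, D^{*} (A \sharp B)^{\sim} D \big) \tag{42}
\]

does not remain true, in general, when $D$ is replaced by an arbitrary Hermitian matrix.
Example Consider

\[
A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad
B = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}, \qquad
D = \begin{pmatrix} 1 & 1+i \\ 1-i & -3 \end{pmatrix}.
\]

In this case, $A \sharp B = B^{\frac12}$ and (42) with ∼ absent does not hold, because
\[
\lambda_2\big( (A \circ B) D^2 \big) \approx 3.783 \le \lambda_2\big( (B^{\frac12} D)^2 \big) \approx 4.095.
\]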
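The two numerical values in the Example can be reproduced with a few lines of NumPy. The sketch below assumes only what the Example states: $A = I_2$, so the geometric mean of $A$ and $B$ is $B^{1/2}$, and $D$ is Hermitian, so $D^{*} = D$. The function name `second_eig` is illustrative.

```python
import numpy as np

A = np.eye(2)
B = np.array([[2.0, 1.0], [1.0, 1.0]])
D = np.array([[1.0, 1.0 + 1.0j], [1.0 - 1.0j, -3.0]])

# B^{1/2} through the spectral decomposition of the real symmetric matrix B
w, V = np.linalg.eigh(B)
B_half = (V * np.sqrt(w)) @ V.T

def second_eig(M):
    """Second largest eigenvalue of a 2x2 matrix with real spectrum."""
    return np.sort(np.linalg.eigvals(M).real)[::-1][1]

lhs = second_eig((A * B) @ D @ D)            # lambda_2((A o B) D^2)
rhs = second_eig(B_half @ D @ B_half @ D)    # lambda_2((B^{1/2} D)^2)
print(round(lhs, 3), round(rhs, 3), lhs < rhs)
```

Since the left-hand value is strictly smaller than the right-hand one, the case $k = n = 2$ of (42) fails for this Hermitian, non-diagonal $D$.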

9 Indefinite Inequalities

The permanent of $A = (a_{ij}) \in M_n(\mathbb{C})$ is denoted and defined as

\[
\operatorname{per} A = \sum_{\sigma \in S_n} \prod_{j=1}^{n} a_{j\sigma(j)}.
\]

Although permanents and determinants have similar definitions and share some
common properties, they exhibit substantial differences, such as the
non-multiplicativity of the permanent.
In 1926, van der Waerden raised a question [93] and motivated a conjecture:
the minimum of the permanent of an n-square doubly stochastic matrix is $\frac{n!}{n^n}$ and
equality occurs when every entry of the matrix equals $\frac{1}{n}$. This conjecture attracted
the attention of mathematicians all over the world, although it remained open for
more than fifty years. The proof of this famous conjecture by G. P. Egorychev [26] in
1981, also proved by Falikman [27], is based on an inequality for permanents, which
is a special case of a result of A. D. Alexandroff on positive definite quadratic forms.
In what follows we write $\operatorname{per} A = \operatorname{per}(a_1, \dots, a_n)$ with $a_i$ the ith column of $A$.
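For small matrices the permanent can be evaluated directly from its definition as a sum over permutations. The sketch below (function names are illustrative) evaluates the permanent of the doubly stochastic matrix with all entries $1/n$, which attains the van der Waerden bound $n!/n^n$, and shows on a $2 \times 2$ example that the permanent is not multiplicative. The brute-force sum has $n!$ terms, so it is meant for illustration only.

```python
from itertools import permutations
from math import factorial, prod

def permanent(M):
    """Permanent of a square matrix given as a list of rows, by the defining sum."""
    n = len(M)
    return sum(prod(M[j][s[j]] for j in range(n)) for s in permutations(range(n)))

def uniform_doubly_stochastic(n):
    """The n x n matrix with every entry equal to 1/n."""
    return [[1.0 / n] * n for _ in range(n)]

def matmul(X, Y):
    """Plain matrix product for lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

n = 4
print(abs(permanent(uniform_doubly_stochastic(n)) - factorial(n) / n**n) < 1e-12)

ones = [[1, 1], [1, 1]]
print(permanent(matmul(ones, ones)), permanent(ones) ** 2)  # the two values differ
```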
Theorem 9.1 (Alexandroff Permanental Inequality) For a1 , . . . , an ∈ Rn ,

per(a1 , . . . , an−1 , b)2 ≥ per(a1 , . . . , an−1 , an−1 )per(a1 , . . . , b, b),

with equality if and only if b = λan−1 for some constant λ.

This inequality resembles the Schwarz inequality, but the direction of the inequality
sign is reversed. The reason is the following. Taking the permanent as the inner
product in $\mathbb{R}^n$:

\[
\langle x, y \rangle = \operatorname{per}(a_1, \dots, a_{n-2}, x, y),
\]

the space $\mathbb{R}^n$ is no longer Euclidean but Lorentzian, according to whether the length of the
vector $(x_1, \dots, x_n)$ is

\[
x_1^2 + \cdots + x_n^2 \quad \text{or} \quad x_1^2 - x_2^2 - \cdots - x_{n-1}^2 - x_n^2.
\]

That is, we are now dealing with a so-called indefinite inner product space.
In this section, we present miscellaneous indefinite inequalities obtained in this
set up. First, we introduce some definitions and notations.
Let $J$ be a Hermitian involutive matrix, that is, $J^{*} = J$ and $J^2 = I_n$. Consider
$\mathbb{C}^n$ endowed with the indefinite inner product induced by $J$, given by $[x, y] = y^{*} J x$
for all $x, y \in \mathbb{C}^n$. Let $A^{\#} = J A^{*} J$. A matrix $A \in M_n(\mathbb{C})$ is said to be $J$-Hermitian
if $A^{\#} = A$, that is, if $JA$ is Hermitian. These matrices appear in several problems
of relativistic quantum mechanics and quantum physics. Let A, B ∈ Mn (C) be J -
Hermitian and consider A ≥J B defined by

[Ax, x] ≥J [Bx, x], x ∈ Cn ,

which means that $J(A - B) \ge 0$. A matrix $A \in M_n(\mathbb{C})$ is called a $J$-contraction
if $I_n \ge^{J} A^{\#} A$. It is well known that the eigenvalues of a $J$-Hermitian matrix
$A \in M_n(\mathbb{C})$ may not be real; nevertheless, its spectrum is symmetric relative to the
real axis. If $A$ is $J$-Hermitian and $I_n \ge^{J} A$, then all the eigenvalues of $A$ are real.
In fact, in this case, $I_n - A$ is the product of the Hermitian matrix $J$ and a positive
semidefinite matrix. If $A$ is a $J$-contraction, then, by a theorem of Potapov-Ginzburg [8,
Chapter 2, Section 4], all the eigenvalues of $A^{\#} A$ are nonnegative. Sano [83,
Theorem 2.6] obtained the next indefinite version of the Löwner-Heinz inequality.

Theorem 9.2 (Löwner Inequality of Indefinite Type, 2007) If A, B ∈ Mn (C)

are J -Hermitian matrices with nonnegative eigenvalues, In ≥J A ≥J B and
0 < α < 1, then the J -Hermitian powers Aα , B α are well defined and

In ≥J Aα ≥J B α .

The case $\alpha = \frac12$ in Theorem 9.2 is due to Ando [6, Theorem 6], while the cases
α = 0 and α = 1 hold trivially. Motivated by these results, the Furuta inequality
of indefinite type in (43) and (44) was established by Sano [83, Theorem 3.4] and
Bebiano et al. [14, Theorem 2.1], respectively.
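Theorem 9.3 (Furuta Inequality of Indefinite Type) Let $A, B \in M_n(\mathbb{C})$ be
$J$-Hermitian with nonnegative spectra, $\mu I_n \ge^{J} A \ge^{J} B$ (or $A \ge^{J} B \ge^{J} \mu I_n$) for
some $\mu > 0$. Then for each $r \ge 0$,

\[
A^{\frac{p+r}{q}} \ge^{J} \big( A^{\frac{r}{2}} B^{p} A^{\frac{r}{2}} \big)^{\frac1q} \tag{43}
\]

and
\[
\big( B^{\frac{r}{2}} A^{p} B^{\frac{r}{2}} \big)^{\frac1q} \ge^{J} B^{\frac{p+r}{q}} \tag{44}
\]

hold for $p \ge 0$ and $q \ge 1$ with $(1 + r)q \ge p + r$.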

In particular, Löwner-Heinz inequality of indefinite type is recovered by Theo-
rem 9.3 for r = 0.
In order to present the indefinite version of Theorem 3.4 obtained in [13,
Corollary 1.2], assume (r, n − r) to be the inertia of J and 0 < r < n. Without
loss of generality, we may consider

J = Ir ⊕ −In−r , 0 < r < n.

For an arbitrary J -Hermitian matrix A ∈ Mn (C), we denote by σJ± (A) the set
of eigenvalues of A with eigenvectors x, such that x ∗ J x = ±1. We say that A is
J -unitarily diagonalizable if every eigenvalue of A belongs to either σJ+ (A) or to
σJ− (A). In this case, σJ+ (A) and σJ− (A) have r and n − r eigenvalues, respectively.
Consider a J -Hermitian matrix A, whose eigenvalues α1 ≥ · · · ≥ αr belong to
σJ+ (A) and αr+1 ≥ · · · ≥ αn belong to σJ− (A). In this case, the eigenvalues of A
are said to not interlace if either αr > αr+1 or αn > α1 , otherwise, they are said to
interlace.
Theorem 9.4 Let J = Ir ⊕ −In−r , 0 < r < n, and A, C ∈ Mn (C) be
non-scalar J -Hermitian and J -unitarily diagonalizable matrices with eigenvalues
α1 ≥ · · · ≥ αr (c1 ≥ · · · ≥ cr ) in σJ+ (A) (σJ+ (C)) and αr+1 ≥ · · · ≥ αn
(cr+1 ≥ · · · ≥ cn ) in σJ− (A) (σJ− (C)). If the eigenvalues of A and C do not interlace,
then statements i. and ii. hold.

i. If (αk − αl )(ck − cl ) < 0 for all 1 ≤ k, k ≤ r, r + 1 ≤ l, l ≤ n, then

\[
\operatorname{tr}(CA) \le \sum_{i=1}^{n} c_i \alpha_i.
\]

ii. If (αk − αl )(ck − cl ) > 0 for all 1 ≤ k, k ≤ r, r + 1 ≤ l, l ≤ n, then

\[
\sum_{i=1}^{r} c_i \alpha_{r-i+1} + \sum_{i=r+1}^{n} c_i \alpha_{n+r-i+1} \le \operatorname{tr}(CA).
\]

Several other inequalities of indefinite type have been studied. To mention
just a few, we refer to spectral inequalities for the trace of the
exponential or the logarithm of J -Hermitian matrices [15], operator inequalities
associated with the Furuta inequality of indefinite type [16], a reversed Heinz-Kato-
Furuta inequality [17] and indefinite versions of some determinantal inequalities
[19], including a Fiedler-type theorem for the determinant of J -positive matrices
[18].
Recently, Matharu, Malhotra and Moslehian [74] defined a J -mean associated
with a positive matrix monotone function f on (0, ∞), such that f (1) = 1, for J -
Hermitian matrices with spectra in (0, ∞). Fundamental properties of this J -mean,
such as the power monotonicity and an indefinite version of Ando-Hiai inequality
[74, Theorem 3.11] were obtained.

Acknowledgments The authors would like to thank the referees for their valuable comments
which helped to improve the presentation of this chapter. The work of the first author was partially
supported by the Centre for Mathematics of the University of Coimbra - UIDB/00324/2020,
funded by the Portuguese Government through FCT/MCTES. The work of the second author was
supported by Portuguese funds through the Center for Research and Development in Mathematics
and Applications (CIDMA) and the Portuguese Foundation for Science and Technology (FCT-
Fundação para a Ciência e a Tecnologia), project UIDB/04106/2020. The work of the third author
was financed by CMAT-UTAD and Portuguese funds through the Portuguese Foundation for Sci-
ence and Technology (FCT-Fundação para a Ciência e a Tecnologia), reference UIDB/00013/2020.

References

1. T. Ando, Concavity of certain maps of positive definite matrices and applications to Hadamard
products. Linear Algebra Appl. 26, 203–241 (1979)
2. T. Ando, Majorization, doubly stochastic matrices and comparison of eigenvalues. Linear
Algebra Appl. 118, 163–248 (1989)
3. T. Ando, Majorizations and inequalities in matrix theory. Linear Algebra Appl. 199, 17–67
(1994)
4. T. Ando, Majorization relations for Hadamard products. Linear Algebra Appl. 223/224, 57–64
(1995)
5. T. Ando, F. Hiai, Log-majorization and complementary Golden-Thompson type inequality.
Linear Algebra Appl. 197, 113–131 (1994)

6. T. Ando, Löwner inequality of indefinite type. Linear Algebra Appl. 385, 73–80 (2004)
7. H. Araki, On an inequality of Lieb and Thirring. Lett. Math. Phys. 19, 167–170 (1990)
8. T. Azizov, I. Iokhvidov, Linear Operators in Spaces with an Indefinite Metric (Nauka, Moscow,
1986). English Translation: Wiley, New York, 1989
9. N. Bebiano, Contradomínios Numéricos Generalizados: variações sobre este tema, PhD Thesis,
Universidade de Coimbra, 1984
10. N. Bebiano, C. Perdigão, Extremal matrices in certain determinantal inequalities. Linear
Multilinear Algebra 44, 261–276 (1998)
11. N. Bebiano, J. da Providência Jr., R. Lemos, Matrix inequalities in statistical mechanics. Linear
Algebra Appl. 376, 265–273 (2004)
12. N. Bebiano, R. Lemos, J. da Providência, Inequalities for quantum relative entropy. Linear
Algebra Appl. 401, 159–172 (2005)
13. N. Bebiano, H. Nakazato, J. da Providência, R. Lemos, G. Soares, Inequalities for J -Hermitian
matrices. Linear Algebra Appl. 407, 125–139 (2005)
14. N. Bebiano, R. Lemos, J. da Providência, G. Soares, Further developments of Furuta inequality
of indefinite type. Math. Inequal. Appl. 13, 523–535 (2010)
15. N. Bebiano, R. Lemos, J. da Providência, G. Soares, Trace inequalities for logarithms and
powers of J -Hermitian matrices. Linear Algebra Appl. 432, 3172–3182 (2010)
16. N. Bebiano, R. Lemos, J. da Providência and G. Soares, Operator inequalities for J -
contractions. Math. Inequal. Appl. 12, 883–897 (2012)
17. N. Bebiano, R. Lemos, J. da Providência, On a reverse Heinz-Kato-Furuta inequality. Linear
Algebra Appl. 437, 1892–1905 (2012)
18. N. Bebiano, J. da Providência, A Fiedler-type theorem for the determinant of J -positive
matrices. Math. Inequal. Appl. 19, 663–669 (2016)
19. N. Bebiano, J. da Providência, Determinantal inequalities for J -accretive dissipative matrices.
Studia Universitatis Babeş-Bolyai Mathematica 62, 119–125 (2017)
20. R. Bhatia, Linear Algebra to Quantum Cohomology: the story of Alfred Horn’s inequalities,
Am. Math. Monthly 108, 289–318 (2001)
21. R. Bhatia, Matrix Analysis. Graduate Texts in Mathematics, vol. 169 (Springer, New York,
1997)
22. R. Bhatia, J. Holbrook, Riemannian geometry and matrix geometric means. Linear Algebra
Appl. 413, 594–618 (2006)
23. R. Bhatia, Y. Lim, T. Yamazaki, Some norm inequalities for matrix means. Linear Algebra
Appl. 501, 112–122 (2016)
24. G. Birkhoff, Tres observaciones sobre el algebra lineal. Rev. Universidad Nacional de
Tucumán, Ser. A 5, 147–151 (1946)
25. N.N. Chan, K. Kwong, Hermitian matrix inequalities and a conjecture. Am. Math. Monthly
92, 533–541 (1985)
26. G.P. Egorychev, The solution of van der Waerden’s problem for permanents. Adv. Math. 42,
299–305 (1981)
27. D.I. Falikman, The proof of the van der Waerden’s conjecture regarding the permanent of
doubly stochastic matrices. Mat. Zametki 29(6), 931–938 (1981); Math. Notes 29, 475–479
(1981)
28. K. Fan, Maximum properties and inequalities for the eigenvalues of completely continuous
operators. Proc. Nat. Acad. Sci. U.S.A. 37, 760–766 (1951)
29. M. Fiedler, Bounds for the determinant of the sum of Hermitian matrices. Proc. Am. Math.
Soc. 30(1), 27–31 (1971)
30. P.J. Forrester, C.J. Thompson, The Golden-Thompson inequality: historical aspects and random
matrix applications. J. Math. Phys. 55, 023503 (2014)
31. J.I. Fujii, T. Furuta, Löwner-Heinz, Cordes and Heinz-Kato inequalities. Math. Jpn. 38, 73–78
(1993)
32. J.I. Fujii, E. Kamei, Relative operator entropy in non-commutative information theory. Math.
Jpn. 34, 341–348 (1989)

33. M. Fujii, E. Kamei, Ando-Hiai inequality and Furuta inequality. Linear Algebra Appl. 416,
541–545 (2006)
34. M. Fujii, R. Nakamoto, M. Tominaga, Generalized Bebiano-Lemos-Providência inequalities
and their reverses. Linear Algebra Appl. 426, 33–39 (2007)
35. T. Furuta, $A \ge B \ge 0$ assures $(B^{r} A^{p} B^{r})^{1/q} \ge B^{(p+2r)/q}$ for $r \ge 0$, $p \ge 0$, $q \ge 1$ with
$(1 + 2r)q \ge p + 2r$. Proc. Am. Math. Soc. 101, 85–88 (1987)
36. T. Furuta, Invitation to Linear Operators: From Matrices to Bounded Linear Operators on a
Hilbert Space (CRC Press, Boca Raton, 2001)
37. T. Furuta, An extension of the Furuta inequality and Ando-Hiai log-majorization. Linear
Algebra Appl. 219, 139–155 (1995)
38. T. Furuta, Operator inequality implying generalized Bebiano-Lemos-Providência one, Linear
Algebra Appl. 426, 342–348 (2007)
39. T. Furuta, Extensions of inequalities for unitarily invariant norms via log-majorization. Linear
Algebra Appl. 436, 3463–3468 (2012)
40. S. Golden, Lower bounds for Helmholtz functions. Phys. Rev. B 137, 1127–1128 (1965)
41. R.D. Grigorieff, Note on von Neumann’s trace inequality. Math. Nachr. 151, 327–328 (1991)
42. F. Hansen, Multivariate extensions of the Golden-Thompson inequality. Ann. Funct. Anal. 6(4),
301–310 (2015)
43. G.H. Hardy, J.E. Littlewood, G. Pólya, Inequalities (Cambridge University Press, Cambridge,
1952)
44. E. Heinz, Beiträge zur Störungstheorie der Spektralzerlegung. Math. Ann. 123, 415–438
(1951)
45. F. Hiai, D. Petz, The Golden-Thompson trace inequality is complemented. Linear Algebra
Appl. 181, 153–185 (1993)
46. F. Hiai, Matrix analysis: matrix monotone functions, matrix means and majorization. Interdis-
cipl. Inf. Sci. 16(2), 139–248 (2010)
47. F. Hiai, D. Petz, Introduction to Matrix Analysis and Applications. Universitext (Springer, New
York, 2014)
48. F. Hiai, M. Lin, On an eigenvalue inequality involving the Hadamard product. Linear Algebra
Appl. 515, 313–320 (2017)
49. F. Hiai, Y. Seo, S. Wada, Ando-Hiai type inequalities for multivariate operator means. Linear
Multilinear Algebra 67, 2253–2281 (2019)
50. F. Hiai, Y. Seo, S. Wada, Ando-Hiai type inequalities for operator means and operator
perspectives. Int. J. Math. 31(1), 2050007 (2020)
51. A. Horn, Doubly stochastic matrices and the diagonal of a rotation matrix. Am. J. Math. 76,
620–630 (1954)
52. A. Horn, On the eigenvalues of a matrix with prescribed singular values. Proc. Am. Math. Soc.
5, 4–7 (1954)
53. A. Horn, Eigenvalues of sums of Hermitian matrices. Pac. J. Math. 12, 225–241 (1962)
54. R.A. Horn, C.R. Johnson, Matrix Analysis (Cambridge University Press, Cambridge, 1985)
55. R.A. Horn, C.R. Johnson, Topics in Matrix Analysis (Cambridge University Press, Cambridge,
1991)
56. C.R. Johnson, R.B. Bapat, A weak multiplicative majorization conjecture for Hadamard
products. Linear Algebra Appl. 104, 246–247 (1988)
57. M. Kian, M.S. Moslehian, Y. Seo, Variants of Ando-Hiai inequality for operator power means.
Linear Multilinear Algebra 69, 1694–1704 (2021)
58. M. Kian, M.S. Moslehian, Y. Seo, Variants of Ando-Hiai type inequalities for deformed means
and applications. Glasgow Math. J. 63, 622–639 (2021)
59. M. Kian, M.S. Moslehian, Matrix inequalities related to power means of probability measures.
Linear Multilinear Algebra (2021), https://doi.org/10.1080/03081087.2021.1930991
60. A. Klyachko, Stable bundles, representation theory and Hermitian operators. Selecta Math.,
New Series 4, 419–445 (1998)
61. A. Knutson, T. Tao, The Honeycomb model of the Berenstein-Zelevinsky cone I: proof of the
saturation conjecture. J. Am. Math. Soc. 12, 1055–1090 (1999)

62. F. Kubo, T. Ando, Means of positive linear operators. Math. Ann. 246, 205–224 (1980)
63. J. Lawson, Y. Lim, Karcher means and Karcher equations of positive definite operators. Trans.
Am. Math. Soc. Ser. B. 1, 1–22 (2014)
64. R. Lemos, G. Soares, Some log-majorizations and an extension of a determinantal inequality.
Linear Algebra Appl. 547, 19–31 (2018)
65. R. Lemos, G. Soares, Spectral inequalities for Kubo-Ando operator means. Linear Algebra
Appl. 607, 29–44 (2020)
66. A. Lenard, Generalization of the Golden-Thompson inequality tr(eA eB ) ≥ tr(eA+B ). Indiana
Univ. Math. J. 21, 457–467 (1971)
67. E. Lieb, Convex trace functions and the Wigner-Yanase-Dyson conjecture. Adv. Math. 11,
267–288 (1973)
68. E.H. Lieb, W.E. Thirring, Studies in Mathematical Physics, ed. by E. Lieb, B. Simon, A.S.
Wightman (Princeton University Press, Princeton, 1976), pp. 269–303
69. K. Löwner, Über monotone Matrixfunktionen. Math. Z. 38, 177–216 (1934)
70. M. Marcus, An eigenvalue inequality for the product of normal matrices. Am. Math. Monthly 63,
173–174 (1956)
71. M. Marcus, Derivations, Plücker relations and the numerical range. Indiana Univ. Math. J. 22,
1137–1149 (1973)
72. A.W. Marshall, I. Olkin, B.C. Arnold, Inequalities: Theory of Majorization and Its Applica-
tions, 2nd edn. (Springer, New York, 2011)
73. J.S. Matharu, J.S. Aujla, Some inequalities for unitarily invariant norms. Linear Algebra Appl.
436, 1623–1631 (2012)
74. J.S. Matharu, C. Malhotra, M.S. Moslehian, Indefinite matrix inequalities via matrix means.
Bull. Sci. Math. 171, 103036 (2021)
75. A. Matsumoto, R. Nakamoto, M. Fujii, Reverse of Bebiano-Lemos-Providência inequality
and Complementary Furuta inequality (Inequalities on Linear Operators and its Applications),
Departmental Bulletin Paper, Kyoto University, 2008, 91–98
76. L. Mirsky, On the trace of matrix products. Math. Nachr. 20, 171–174 (1959)
77. L. Mirsky, A trace inequality of John von Neumann. Monatshefte für Mathematik 79, 303–306
(1975)
78. M. Moakher, A differential geometric approach to the geometric mean of symmetric positive-
definite matrices. SIAM J. Matrix Anal. Appl. 26, 735–747 (2005)
79. G.N. de Oliveira, Normal matrices (research problem). Linear Multilinear Algebra 12, 153–154
(1982)
80. W. Pusz, S.L. Woronowicz, Functional calculus for sesquilinear forms and the purification
map. Rep. Math. Phys. 8, 159–170 (1975)
81. H. Richter, Zur Abschätzung von Matrizennormen. Math. Nachr. 18, 178–187 (1958)
82. A. Ruhe, Perturbation bounds for means of eigenvalues and invariant subspaces. BIT Numer.
Math. 10, 343–354 (1970)
83. T. Sano, Furuta inequality of indefinite type. Math. Inequal. Appl. 10, 381–387 (2007)
84. I. Schur, Über eine Klasse von Mittelbildungen mit Anwendungen auf die Determinantenthe-
orie. Sitzungsber. Berl. Math. Ges. 22, 9–20 (1923)
85. D. Sutter, M. Berta, M. Tomamichel, Multivariate trace inequalities. Commun. Math. Phys.
352, 37–58 (2017)
86. K. Symanzik, Proofs and refinements of an inequality of Feynman. J. Math. Phys. 6, 1155–
1156 (1965)
87. T. Tao, The Golden-Thompson inequality. What’s new, terrytao.wordpress.com/2010/07/15/the-golden-thompson-inequality (2010)
88. C.M. Theobald, An inequality for the trace of the product of two symmetric matrices. Math.
Proc. Cambridge Philos. Soc. 77, 265–267 (1975)
89. C.J. Thompson, Inequality with applications in statistical mechanics. J. Math. Phys. 6, 1812–
1813 (1965)
90. C.J. Thompson, Inequalities and partial orders on matrix spaces. Indiana Univ. Math. J. 21,
469–480 (1971)

91. R.C. Thompson, Proof of a conjectured exponential formula. Linear Multilinear Algebra 19,
187–197 (1986)
92. H. Umegaki, Conditional expectation in an operator algebra IV. Kodai Math. Semin. Rep. 14,
59–85 (1962)
93. B.L. van der Waerden, Aufgabe 45. Jber. Deutsch. Math. Verein. 35, 117 (1926)
94. G. Visick, A weak majorization involving the matrices A ◦ B and AB. Linear Algebra Appl.
223/224, 731–744 (1995)
95. G. Visick, Majorizations of Hadamard products of matrix powers. Linear Algebra Appl. 269,
233–240 (1998)
96. J. von Neumann, Some matrix-inequalities and metrization of matrix-space. Tomsk Univ. Rev.
1, 286–300 (1937)
97. J. Wang, Y. Li, H. Sun, Lie-Trotter Formula for the Hadamard Product. Acta Math. Sci. 40,
659–669 (2020)
98. H. Weyl, Inequalities between the two kinds of eigenvalues of a linear transformation Proc.
Natl. Acad. Sci. U. S. A. 35, 408–411 (1949)
99. T. Yamazaki, The Riemannian mean and matrix inequalities related to the Ando-Hiai inequality
and chaotic order. Oper. Matrices 6, 577–588 (2012)
100. T. Yamazaki, The Ando-Hiai inequality for the solution of the generalized Karcher equation
and related results. J. Math. Anal. Appl. 479, 531–545 (2019)
101. X. Zhan, Matrix Inequalities. Lecture Notes in Mathematics (Springer, Berlin, 2002)
102. L. Zou, An arithmetic geometric mean inequality for singular values and its applications.
Linear Algebra Appl. 528, 25–32 (2017)
Ando-Hiai Inequality: Extensions and
Applications

Masatoshi Fujii and Ritsuo Nakamoto

Abstract The Ando-Hiai inequality says that if $A \#_\alpha B \le I$ for a fixed $\alpha \in [0, 1]$
and positive invertible operators A, B on a Hilbert space, then $A^r \#_\alpha B^r \le I$ for
$r \ge 1$, where $\#_\alpha$ is the α-geometric mean defined by $A \#_\alpha B = A^{\frac12} (A^{-\frac12} B A^{-\frac12})^{\alpha} A^{\frac12}$.
This chapter is devoted to extensions and applications of the Ando-Hiai inequality. It
is closely related to the Furuta inequality, the Bebiano-Lemos-Providência inequality and
the grand Furuta inequality. Consequently, useful extensions of these inequalities are given.

Keywords Ando-Hiai inequality · Furuta inequality · Grand Furuta inequality ·
Bebiano-Lemos-Providência inequality

1 Introduction

Throughout this chapter, an operator A means a bounded linear operator acting on a
complex Hilbert space H. An operator A is positive, denoted by $A \ge 0$, if $\langle Ax, x \rangle \ge 0$
for all $x \in H$. We write $A > 0$ if A is positive and invertible. The α-geometric
mean $\#_\alpha$ is defined by $A \#_\alpha B = A^{\frac12} (A^{-\frac12} B A^{-\frac12})^{\alpha} A^{\frac12}$ for $A > 0$ and $B \ge 0$.
A log-majorization theorem due to Ando-Hiai [1] is expressed as follows: For
$\alpha \in [0, 1]$ and positive definite matrices A and B,

\[
(A \#_\alpha B)^{r} \succ_{(\log)} A^{r} \#_\alpha B^{r} \qquad (r \ge 1).
\]

The core of the proof is that

\[
A \#_\alpha B \le I \ \Rightarrow\ A^{r} \#_\alpha B^{r} \le I \qquad (r \ge 1).
\]

M. Fujii ()
Department of Mathematics, Osaka Kyoiku University, Kashiwara, Osaka, Japan
e-mail: [email protected]
R. Nakamoto
3-4-13 Daihara-cho, Hitachi, Japan
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
R. M. Aron et al. (eds.), Operator and Norm Inequalities and Related Topics,
Trends in Mathematics, https://doi.org/10.1007/978-3-031-02104-6_2

It holds for positive operators A, B on a Hilbert space, and is called the Ando-Hiai
inequality, denoted simply by (AH).
The original proof of (AH) is a typical application of the Löwner-Heinz
inequality (LH), i.e.,

\[
A \ge B \ge 0 \ \Rightarrow\ A^{r} \ge B^{r} \qquad (0 \le r \le 1).
\]

See [18, 25] and [27] for (LH).
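The implication (AH) can be illustrated numerically. The sketch below uses the fact that the α-geometric mean is jointly homogeneous, so dividing both matrices by the largest eigenvalue of $A \#_\alpha B$ produces a pair satisfying the hypothesis with equality in norm; all function names are illustrative and the random matrices are arbitrary test data.

```python
import numpy as np

def mpow(M, t):
    """Real power of a symmetric positive definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * w**t) @ V.T

def gmean(A, B, t):
    """alpha-geometric mean A #_t B = A^{1/2} (A^{-1/2} B A^{-1/2})^t A^{1/2}."""
    Ah, Aih = mpow(A, 0.5), mpow(A, -0.5)
    return Ah @ mpow(Aih @ B @ Aih, t) @ Ah

def lmax(M):
    """Largest eigenvalue of a symmetric matrix; M <= I exactly when lmax(M) <= 1."""
    return np.linalg.eigvalsh(M).max()

def random_spd(n, rng):
    X = rng.standard_normal((n, n))
    return X @ X.T + np.eye(n)

def normalized_pair(A, B, alpha):
    """Rescale (A, B) so that the largest eigenvalue of A #_alpha B equals 1."""
    c = lmax(gmean(A, B, alpha))
    return A / c, B / c

rng = np.random.default_rng(0)
A, B = normalized_pair(random_spd(3, rng), random_spd(3, rng), 0.3)
print([bool(lmax(gmean(mpow(A, r), mpow(B, r), 0.3)) <= 1 + 1e-8)
       for r in (1.0, 2.0, 4.0)])
```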

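Related to (AH), we should mention the Furuta inequality, because it is a
beautiful extension of (LH). It is presented as follows:
Furuta Inequality (FI)
If $A \ge B \ge 0$, then for each $r \ge 0$,
\[
\text{(i)} \quad \big( B^{\frac{r}{2}} A^{p} B^{\frac{r}{2}} \big)^{\frac1q} \ge \big( B^{\frac{r}{2}} B^{p} B^{\frac{r}{2}} \big)^{\frac1q}
\]

and
\[
\text{(ii)} \quad \big( A^{\frac{r}{2}} A^{p} A^{\frac{r}{2}} \big)^{\frac1q} \ge \big( A^{\frac{r}{2}} B^{p} A^{\frac{r}{2}} \big)^{\frac1q}
\]

hold for $p \ge 0$ and $q \ge 1$ with

\[
(1 + r)q \ge p + r.
\]

Related to Furuta inequality, see [4, 5, 15, 16, 31] and [9].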
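Part (ii) of (FI) lends itself to a quick numerical experiment. The following sketch builds a pair $A \ge B > 0$ by adding a positive semidefinite matrix to $B$ and returns the smallest eigenvalue of the difference of the two sides, which (FI) predicts to be nonnegative; the helper names and the sampled parameters are assumptions made for illustration.

```python
import numpy as np

def mpow(M, t):
    """Real power of a symmetric positive definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * w**t) @ V.T

def furuta_gap(A, B, p, q, r):
    """Smallest eigenvalue of A^{(p+r)/q} - (A^{r/2} B^p A^{r/2})^{1/q}."""
    Ar2 = mpow(A, r / 2)
    lhs = mpow(A, (p + r) / q)                  # equals (A^{r/2} A^p A^{r/2})^{1/q}
    rhs = mpow(Ar2 @ mpow(B, p) @ Ar2, 1 / q)
    return np.linalg.eigvalsh(lhs - rhs).min()

def ordered_pair(n, rng):
    """Random pair with A >= B > 0 (A = B + positive semidefinite perturbation)."""
    X, Y = rng.standard_normal((n, n)), rng.standard_normal((n, n))
    B = X @ X.T + np.eye(n)
    return B + Y @ Y.T, B

rng = np.random.default_rng(0)
A, B = ordered_pair(3, rng)
# (p, q, r) chosen with p >= 0, q >= 1 and (1 + r) q >= p + r
print([bool(furuta_gap(A, B, p, q, r) >= -1e-8)
       for (p, q, r) in [(2.0, 1.5, 1.0), (3.0, 5.0 / 3.0, 2.0), (0.5, 1.0, 0.0)]])
```

The last parameter triple has $r = 0$ and $q = 1$, so it reduces to the Löwner-Heinz inequality $A^{p} \ge B^{p}$ with $p = \frac12$.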
After the publication of (AH), Furuta himself [17] presented the so-called “grand Furuta
inequality”, which interpolates between (AH) and his inequality (FI); see also [8] and [9].
Grand Furuta Inequality (GFI) If $A \ge B > 0$ and $t \in [0, 1]$, then
\[
\Big[ A^{\frac{r}{2}} \big( A^{-\frac{t}{2}} B^{p} A^{-\frac{t}{2}} \big)^{s} A^{\frac{r}{2}} \Big]^{\frac{1-t+r}{(p-t)s+r}} \le A^{1-t+r}
\]
holds for $r \ge t$ and $p, s \ge 1$.

The relations among (AH), (FI) and (GFI) are as follows:
(GFI) for t = 1, r = s ⇐⇒ (AH)
(GFI) for t = 0, (s = 1) ⇐⇒ (FI).
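On the other hand, we discussed the equivalence between (AH) and (FI) in [10].
Moreover, we gave a two-variable version (GAH):
If $A \#_\alpha B \le I$ for $\alpha \in [0, 1]$ and $A, B \ge 0$, then $A^{r} \#_\beta B^{s} \le I$ for $r, s \ge 1$, where
$\beta = \frac{\alpha r}{\alpha r + (1-\alpha)s}$.
The one-sided versions are also worth considering:

\[
A \#_\alpha B \le I \ \Rightarrow\ A^{r} \#_{\frac{\alpha r}{\alpha r + (1-\alpha)}} B \le I \qquad (r \ge 1);
\]

\[
A \#_\alpha B \le I \ \Rightarrow\ A \#_{\frac{\alpha}{\alpha + (1-\alpha)s}} B^{s} \le I \qquad (s \ge 1).
\]

It is known that both one-sided versions are equivalent, and that they are
alternative expressions of (FI).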

2 Extensions

This section is based on our paper [11]. First of all, a binary operation $\natural_\alpha$ is defined,
for real α, by the same formula as the α-geometric mean for $\alpha \in [0, 1]$, that is,

\[
A \natural_\alpha B = A^{\frac12} \big( A^{-\frac12} B A^{-\frac12} \big)^{\alpha} A^{\frac12} \qquad \text{for } A, B > 0.
\]

Recently (AH) was extended by Seo [29] and [23] as follows: For $\alpha \in [-1, 0]$,
$A \natural_\alpha B \le I$ for $A, B > 0$ implies $A^{r} \natural_\alpha B^{r} \le I$ for $r \in [0, 1]$.
So, following our previous work, we present a two-variable version of it. For this,
we mention the following useful identity on the binary operation $\natural$: For $\beta \in \mathbb{R}$ and
positive invertible operators X and Y,

\[
X \natural_\beta Y = X \big( X^{-1} \natural_{-\beta} Y^{-1} \big) X. \tag{1}
\]
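Identity (1) follows from $(X^{1/2} Y^{-1} X^{1/2})^{-\beta} = (X^{-1/2} Y X^{-1/2})^{\beta}$, and it can also be confirmed numerically for exponents outside $[0, 1]$. In the sketch below the binary operation is implemented by its defining formula; the function names `natural` and `identity_residual` are illustrative.

```python
import numpy as np

def mpow(M, t):
    """Real power of a symmetric positive definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * w**t) @ V.T

def natural(X, Y, beta):
    """X^{1/2} (X^{-1/2} Y X^{-1/2})^beta X^{1/2} for any real beta."""
    Xh, Xih = mpow(X, 0.5), mpow(X, -0.5)
    return Xh @ mpow(Xih @ Y @ Xih, beta) @ Xh

def identity_residual(X, Y, beta):
    """Relative difference between the two sides of identity (1)."""
    lhs = natural(X, Y, beta)
    rhs = X @ natural(np.linalg.inv(X), np.linalg.inv(Y), -beta) @ X
    return np.linalg.norm(lhs - rhs) / np.linalg.norm(lhs)

def random_spd(n, rng):
    G = rng.standard_normal((n, n))
    return G @ G.T + np.eye(n)

rng = np.random.default_rng(0)
X, Y = random_spd(3, rng), random_spd(3, rng)
print(max(identity_residual(X, Y, b) for b in (-1.0, -0.5, 0.3, 1.7)) < 1e-8)
```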

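Lemma 2.1 If $A \natural_\alpha B \le I$ for $\alpha \in [-1, 0]$ and $A, B > 0$, then $A^{r} \natural_\beta B \le I$ for
$r \in [0, 1]$, where $\beta = \frac{\alpha r}{\alpha r + (1-\alpha)}$.

Proof For convenience, we show that if $A^{-1} \natural_\alpha B \le I$, then $A^{-r} \natural_\beta B \le I$ for
$r \in [0, 1]$. Thus the assumption ensures that $C^{\alpha} \le A$, where $C = A^{\frac12} B A^{\frac12}$. Note that
$\beta \in [-1, 0]$.
Now we first assume that $r = 1 - \varepsilon \in [\frac12, 1]$, i.e., $\varepsilon \in [0, \frac12]$. Then we have

\[
\begin{aligned}
A^{\varepsilon} \natural_\beta C &= A^{\varepsilon} \big( A^{-\varepsilon} \#_{-\beta} C^{-1} \big) A^{\varepsilon} \\
&\le A^{\varepsilon} \big( C^{-\alpha\varepsilon} \#_{-\beta} C^{-1} \big) A^{\varepsilon} \\
&= A^{\varepsilon} C^{\alpha(1-2\varepsilon)} A^{\varepsilon} \\
&\le A^{\varepsilon} A^{1-2\varepsilon} A^{\varepsilon} = A.
\end{aligned}
\]

Hence it follows that

\[
A^{-r} \natural_\beta B = A^{-\frac12} \big( A^{\varepsilon} \natural_\beta C \big) A^{-\frac12} \le A^{-\frac12} A A^{-\frac12} = I.
\]

In particular, we note that $A^{-r} \natural_\beta B \le I$ for $r = \frac12$, that is, $A^{-\frac12} \natural_{\alpha_1} B \le I$ holds for
$\alpha_1 = \frac{\alpha}{2-\alpha}$. Hence it follows from the preceding paragraph that for $r \in [\frac12, 1]$,

\[
I \ge \big( A^{-\frac12} \big)^{r} \natural_{\beta_1} B = A^{-\frac{r}{2}} \natural_{\beta_1} B,
\]

where $\beta_1 = \frac{\alpha_1 r}{\alpha_1 r + (1-\alpha_1)} = \frac{\alpha r/2}{\alpha r/2 + (1-\alpha)}$. This means that the desired inequality holds
for $r \in [\frac14, \frac12]$. Finally we have the conclusion by induction.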

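Lemma 2.2 If $A \natural_\alpha B \le I$ for $\alpha \in [-1, 0]$ and $A, B > 0$, then $A \natural_\beta B^{s} \le I$ for
$s \in [\frac{-2\alpha}{1-\alpha}, 1]$, where $\beta = \frac{\alpha}{\alpha + (1-\alpha)s}$.

Proof For convenience, we show that if $A \natural_\alpha B^{-1} \le I$, then $A \natural_\beta B^{-s} \le I$ for
$s \in [\frac{-2\alpha}{1-\alpha}, 1]$. Thus the assumption is understood as $D^{1-\alpha} \le B$, where $D = B^{\frac12} A B^{\frac12}$.
We first note that $\beta \in [-1, 0]$ by $s \in [\frac{-2\alpha}{1-\alpha}, 1]$. So we put $s = 1 - \varepsilon$ for some
$\varepsilon \in [0, 1 - \frac{-2\alpha}{1-\alpha}]$. Then we have

\[
D \natural_\beta B^{\varepsilon} = D \big( D^{-1} \#_{-\beta} B^{-\varepsilon} \big) D \le D \big( D^{-1} \#_{-\beta} D^{-\varepsilon(1-\alpha)} \big) D = D^{1-\alpha} \le B,
\]

so that

\[
A \natural_\beta B^{-s} = B^{-\frac12} \big( D \natural_\beta B^{\varepsilon} \big) B^{-\frac12} \le B^{-\frac12} B B^{-\frac12} = I.
\]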

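Theorem 2.3 If $A \natural_\alpha B \le I$ for $\alpha \in [-1, 0]$ and $A, B > 0$, then $A^{r} \natural_\beta B^{s} \le I$ for
$r \in [0, 1]$ and $s \in [\frac{-2\alpha r}{1-\alpha}, 1]$, where $\beta = \frac{\alpha r}{\alpha r + (1-\alpha)s}$.

Proof Suppose that $A \natural_\alpha B \le 1$. Then Lemma 2.1 says that $A^{r} \natural_\gamma B \le 1$ for
$r \in [0, 1]$, where $\gamma = \frac{\alpha r}{\alpha r + (1-\alpha)}$. Next we apply Lemma 2.2 to this obtained inequality.
Then we have

\[
1 \ge A^{r} \natural_{\frac{\gamma}{\gamma + (1-\gamma)s}} B^{s} = A^{r} \natural_{\frac{\alpha r}{\alpha r + (1-\alpha)s}} B^{s}
\]

for $s \in [\frac{-2\gamma}{1-\gamma}, 1] = [\frac{-2\alpha r}{1-\alpha}, 1]$.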

As a special case $s = r$ in the above, we obtain Seo’s original extension of (AH)
because $\beta = \alpha$ (by $s = r$) and $r \in [\frac{-2\alpha r}{1-\alpha}, 1]$.
Corollary 2.4 If $A \natural_\alpha B \le I$ for $\alpha \in [-1, 0]$ and $A, B > 0$, then $A^{r} \natural_\alpha B^{r} \le I$ for
$r \in [0, 1]$.
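A small numerical experiment can accompany Theorem 2.3 and Corollary 2.4. The sketch below assumes the defining formula of the binary operation for negative exponents and uses its homogeneity to rescale a random pair so that the hypothesis holds with largest eigenvalue equal to 1; the function names and the sampled parameters are illustrative only.

```python
import numpy as np

def mpow(M, t):
    """Real power of a symmetric positive definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * w**t) @ V.T

def natural(X, Y, beta):
    """X^{1/2} (X^{-1/2} Y X^{-1/2})^beta X^{1/2} for any real beta."""
    Xh, Xih = mpow(X, 0.5), mpow(X, -0.5)
    return Xh @ mpow(Xih @ Y @ Xih, beta) @ Xh

def random_spd(n, rng):
    G = rng.standard_normal((n, n))
    return G @ G.T + np.eye(n)

def theorem_2_3_lmax(A, B, alpha, r, s):
    """Largest eigenvalue of A1^r natural_beta B1^s after rescaling (A, B) to (A1, B1)
    so that the largest eigenvalue of A1 natural_alpha B1 is 1."""
    beta = alpha * r / (alpha * r + (1 - alpha) * s)
    c = np.linalg.eigvalsh(natural(A, B, alpha)).max()
    A1, B1 = A / c, B / c
    return np.linalg.eigvalsh(natural(mpow(A1, r), mpow(B1, s), beta)).max()

rng = np.random.default_rng(0)
A, B = random_spd(3, rng), random_spd(3, rng)
# s = r gives beta = alpha, which is the situation of Corollary 2.4
print([bool(theorem_2_3_lmax(A, B, -0.5, r, r) <= 1 + 1e-8)
       for r in (0.25, 0.5, 0.75, 1.0)])
```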
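Remark 2.5 We here consider the condition $s \in [\frac{-2\alpha}{1-\alpha}, 1]$ in Lemma 2.2. In
particular, take $\alpha = -1$. Then the assumption $A \natural_\alpha B \le 1$ means that $B \ge A^2$, and
$\beta = \frac{\alpha}{\alpha + (1-\alpha)s} = \frac{1}{1-2s}$. Though $s = 1$ in this case by $s \in [\frac{-2\alpha}{1-\alpha}, 1]$, the inequality
in Lemma 2.2 still holds for $s \in [\frac34, 1]$. We use the formula $X \natural_\gamma Y = Y \natural_{1-\gamma} X =
Y (Y^{-1} \natural_{\gamma-1} X^{-1}) Y$. Note that $-\beta \in [1, 2]$. Therefore we have

\[
\begin{aligned}
A \natural_\beta B^{s} &= A \big( A^{-1} \natural_{-\beta} B^{-s} \big) A = A B^{-s} \big( B^{s} \#_{-\beta-1} A \big) B^{-s} A \\
&\le A B^{-s} \big( B^{s} \#_{-\beta-1} B^{\frac12} \big) B^{-s} A = A B^{-1} A \le A A^{-2} A = I.
\end{aligned}
\]

On the other hand, it is false for $s \in [0, \frac14]$. Note that $\beta = \frac{1}{1-2s} \in [1, 2]$. Suppose
to the contrary that $A \natural_\beta B^{s} \le 1$ holds under the assumption $B \ge A^2$. Then it follows
that $1 \ge A \natural_\beta B^{s} = B^{s} (B^{-s} \#_{\beta-1} A^{-1}) B^{s}$ and so

\[
B^{-2s} \ge B^{-s} \#_{\beta-1} A^{-1} \ge B^{-s} \#_{\beta-1} B^{-\frac12} = B^{-2s},
\]

so that $B = A^2$ follows, which is impossible in general.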

Next we consider representations of Furuta type associated with extensions of
(AH) obtained in the preceding discussion.
We here remark that the optimal case $(1 + r)q = p + r$ is essential in (FI), which
is realized as a beautiful formula by the use of the α-geometric mean:
If $A \ge B \ge 0$, then for each $r \ge 0$

\[
A^{-r} \#_{\frac{1+r}{p+r}} B^{p} \le A
\]

holds for $p \ge 1$.
More precisely, the conclusion above is improved as follows:

\[
A^{-r} \#_{\frac{1+r}{p+r}} B^{p} \le B \ (\le A)
\]

holds for $p \ge 1$, due to Kamei [21].

The following result also follows from Lemma 2.1.

Theorem 2.6 If $A \ge B > 0$, then

\[ A^{-r} \natural_{\frac{1+r}{p+r}} B^p \le A \]

holds for $p \le -1$ and $r \in [-1, 0]$.

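Proof Lemma 2.1, applied to $A^{-1}$ in place of $A$, says that if $A^{-1} \natural_\alpha B \le I$, then $A^{-r} \natural_\beta B \le I$ for $r \in [0, 1]$, where $\beta = \frac{\alpha r}{\alpha r + (1-\alpha)}$. The assumption means that $C^\alpha \le A$, where $C = A^{\frac12} B A^{\frac12}$. So we put $B_1 = C^\alpha \le A$, $p = \frac{1}{\alpha}$ and $r_1 = r - 1$. Then $p \le -1$, $r_1 \in [-1, 0]$ and $\beta = \frac{1+r_1}{p+r_1}$. Moreover the conclusion is rephrased as

\[ A^{-r+1} \natural_\beta C \le A, \quad \text{or} \quad A^{-r_1} \natural_{\frac{1+r_1}{p+r_1}} B_1^{\,p} \le A. \]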

Like (FI), (GFI) also has a mean theoretic expression as follows:
If $A \ge B > 0$ and $t \in [0, 1]$, then

\[ A^{-r+t} \#_{\frac{1-t+r}{(p-t)s+r}} (A^t \natural_s B^p) \le A \]

holds for $r \ge t$ and $p, s \ge 1$.

In succession with the above discussion, Theorem 2.3 gives us the following inequality of (GFI)-type:
Theorem 2.7 If $A \ge B > 0$, then

\[ A^{-r+1} \natural_{\frac{r}{r+(p-1)s}} (A \#_s B^p) \le A \]

holds for $p \le -1$, $r \in [0, 1]$ and $s \in [\frac{-2r}{p-1}, 1]$.

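Proof Theorem 2.3 says that if $A^{-1} \natural_\alpha B \le I$, then $A^{-r} \natural_\beta B^s \le I$ for $r \in [0, 1]$ and $s \in [\frac{-2\alpha r}{1-\alpha}, 1]$, where $\beta = \frac{\alpha r}{\alpha r + (1-\alpha)s}$. So the assumption is that $B_1 = C^\alpha \le A$, where $C = A^{\frac12} B A^{\frac12}$. On the other hand, putting $\alpha = \frac1p$, it follows that

\[ I \ge A^{-r} \natural_{\frac{\alpha r}{\alpha r + (1-\alpha)s}} B^s = A^{-r} \natural_{\frac{r}{r+(p-1)s}} (A^{-\frac12} B_1^{\,p} A^{-\frac12})^s \]

or equivalently

\[ A \ge A^{-r+1} \natural_{\frac{r}{r+(p-1)s}} (A \#_s B_1^{\,p}), \]

which is the conclusion by $B_1^{\,p} = C = A^{\frac12} B A^{\frac12}$.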

From the viewpoint of (GFI), the following extension might be expected:

Conjecture If $A \ge B > 0$ and $t \in [0, 1]$, then

\[ A^{-r+t} \natural_{\frac{1-t+r}{r+(p-t)s}} (A^t \#_s B^p) \le A \]

holds for $p \le -1$, $r \in [0, t]$ and $s \in [\frac{-2r}{p-t}, 1]$.
We can prove it under a restriction as follows:
Theorem 2.8 If $A \ge B > 0$ and $t \in [0, 1]$, then

\[ A^{-r+t} \natural_{\frac{1-t+r}{r+(p-t)s}} (A^t \#_s B^p) \le A \]

holds for $p \le -1$, $r \in [0, t]$ and $s \in [\max\{\frac{-t}{p-t}, \frac{-2r-(1-t)}{p-t}\}, 1]$.
−t
Proof First of all, we note that $r + (p-t)s \le t + (p-t)s \le 0$ by $s \ge \frac{-t}{p-t}$. So we have $\frac{1-t+r}{r+(p-t)s} \le 0$. On the other hand, $-1 \le \frac{1-t+r}{r+(p-t)s}$ is obtained from $-(r + (p-t)s) \ge 1 - t + r$, which holds by the assumption $s \ge \frac{-2r-(1-t)}{p-t}$. Namely $\frac{-(1-t+r)}{r+(p-t)s} \in [0, 1]$.
Hence we have

\[ A^{r-t} \#_{\frac{-(1-t+r)}{r+(p-t)s}} (A^{-t} \#_s B^{-p}) \le A^{r-t} \#_{\frac{-(1-t+r)}{r+(p-t)s}} B^{-(p-t)s-t} \le A^{2(r-t)+1}. \]

The second inequality above is shown as follows: The exponent $-(p-t)s - t$ of $B$ is nonnegative as mentioned first. Thus, if $-(p-t)s - t \le 1$, the second inequality holds. On the other hand, if $-(p-t)s - t \ge 1$, then the Furuta inequality assures that

\[ (A^{\frac{t-r}{2}} B^{-(p-t)s-t} A^{\frac{t-r}{2}})^{\frac{1-t+r}{-(p-t)s-r}} \le A^{1-t+r}, \]

or equivalently

\[ A^{r-t} \#_{\frac{1-t+r}{-(p-t)s-r}} B^{-(p-t)s-t} \le A^{2(r-t)+1}. \]

Hence, noting that $X \natural_{-q} Y = X(X^{-1} \natural_q Y^{-1})X$, it follows that

\[ A^{-r+t} \natural_{\frac{1-t+r}{r+(p-t)s}} (A^t \#_s B^p) = A^{-r+t} \{A^{r-t} \#_{\frac{-(1-t+r)}{r+(p-t)s}} (A^{-t} \#_s B^{-p})\} A^{-r+t} \le A^{-r+t} A^{2(r-t)+1} A^{-r+t} = A. \]

The following theorems show that Theorem 2.8 holds for the critical points $s = \frac{-t}{p-t}$ and $s = \frac{-2r-(1-t)}{p-t}$.

Theorem 2.9 If $A \ge B > 0$ and $t \in [0, 1]$, then

\[ A^{-r+t} \natural_{\frac{1-t+r}{r+(p-t)s}} (A^t \#_s B^p) \le A \]

holds for $p \le -1$, $r \in [0, t]$ and $s = \frac{-2r-(1-t)}{p-t}$.
1−t +r
Proof First of all, we note that $\frac{1-t+r}{r+(p-t)s} = -1$ and $X \natural_{-1} Y = XY^{-1}X$. Therefore the conclusion is arranged as

\[ A^{-r+t} \natural_{-1} (A^t \#_s B^p) \le A, \]

\[ A^{-r+t} (A^{-t} \#_s B^{-p}) A^{-r+t} \le A \]

and so

\[ A^{-t} \#_s B^{-p} \le A^{1+2r-2t}. \tag{$*$} \]

To prove this, we recall the Furuta inequality, i.e., if $A \ge B \ge 0$, then

\[ (A^{\frac t2} B^P A^{\frac t2})^{\frac1q} \le A^{\frac{P+t}{q}} \]

holds for $t, P \ge 0$ and $q \ge 1$ with $(1 + t)q \ge P + t$. Taking $P = -p$ and $q = \frac1s$, the required condition $(1 + t)q \ge P + t$ is satisfied and we obtain

\[ (A^{\frac t2} B^{-p} A^{\frac t2})^s \le A^{1+2r-t}, \]

which is equivalent to ($*$).

In succession to the preceding theorem, the other case $s = \frac{-t}{p-t}$ can be proved as in the discussion below:
Theorem 2.10 If $A \ge B > 0$ and $t \in [0, 1]$, then

\[ A^{-r+t} \natural_{\frac{1-t+r}{r+(p-t)s}} (A^t \#_s B^p) \le A \]

holds for $p \le -1$, $r \in [0, t]$ and $s = \frac{-t}{p-t}$.
By Theorem 2.8, we have only to consider the case $\frac{-t}{p-t} < \frac{-2r-(1-t)}{p-t}$, that is, $0 \le t - r < \frac12$ can be assumed. Hence we have

\[ \frac{1-t+r}{r+(p-t)s} = 1 - \frac{1}{t-r} < -1. \]

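As a special case, we take $t = \frac23$, $r = \frac13$ and $p = -2$. Then $s = \frac14$ and $\frac{1-t+r}{r+(p-t)s} = -2$, so that the statement in this case reads as follows:
If $A \ge B > 0$, does

\[ A^{\frac13} \natural_{-2} (A^{\frac23} \#_{\frac14} B^{-2}) \le A \]

hold? It is proved by using the Furuta inequality twice: First of all, since $A \ge B > 0$, (FI) ensures that

\[ (A^{\frac13} B^2 A^{\frac13})^{\frac58} \le A^{\frac53}; \qquad (A^{\frac13} B^2 A^{\frac13})^{\frac18} \le A^{\frac13}. \]

So we have

\begin{align*}
A^{\frac13} \natural_{-2} (A^{\frac23} \#_{\frac14} B^{-2})
 &= A^{\frac16} (A^{-\frac16} (A^{\frac23} \#_{\frac14} B^{-2}) A^{-\frac16})^{-2} A^{\frac16} \\
 &= A^{\frac16} (A^{\frac16} (A^{-\frac23} \#_{\frac14} B^{2}) A^{\frac16})^{2} A^{\frac16} \\
 &= A^{\frac16} (A^{-\frac13} \#_{\frac14} A^{\frac16} B^{2} A^{\frac16})^{2} A^{\frac16} \\
 &= A^{\frac16} (A^{-\frac16} (A^{\frac13} B^{2} A^{\frac13})^{\frac14} A^{-\frac16})^{2} A^{\frac16} \\
 &= (A^{\frac13} B^{2} A^{\frac13})^{\frac14} A^{-\frac13} (A^{\frac13} B^{2} A^{\frac13})^{\frac14} \\
 &\le (A^{\frac13} B^{2} A^{\frac13})^{\frac14 - \frac18 + \frac14} \\
 &= (A^{\frac13} B^{2} A^{\frac13})^{\frac38} \\
 &\le A,
\end{align*}

as desired.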
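The special case $t = \frac23$, $r = \frac13$, $p = -2$ is concrete enough to test numerically. The sketch below is an illustration only, under stated assumptions: NumPy, real symmetric positive definite matrices, the convention $X \natural_\mu Y = X^{1/2}(X^{-1/2} Y X^{-1/2})^{\mu} X^{1/2}$ (which is $\#_\mu$ for $\mu \in [0,1]$), and hypothetical helper names `mpow`, `nat`, `special_case_gap`.

```python
import numpy as np

def mpow(X, p):
    """Real power of a symmetric positive definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh((X + X.T) / 2)
    return (V * w**p) @ V.T

def nat(X, Y, mu):
    """X natural_mu Y = X^{1/2} (X^{-1/2} Y X^{-1/2})^mu X^{1/2}."""
    Xh, Xih = mpow(X, 0.5), mpow(X, -0.5)
    return Xh @ mpow(Xih @ Y @ Xih, mu) @ Xh

def special_case_gap(n=4, seed=0):
    """Smallest eigenvalue of A - A^{1/3} nat_{-2} (A^{2/3} #_{1/4} B^{-2}) for random A >= B > 0."""
    rng = np.random.default_rng(seed)
    G, H = rng.standard_normal((n, n)), rng.standard_normal((n, n))
    B = G @ G.T + 0.5 * np.eye(n)      # B > 0
    A = B + H @ H.T                    # A >= B
    inner = nat(mpow(A, 2 / 3), mpow(B, -2.0), 0.25)
    lhs = nat(mpow(A, 1 / 3), inner, -2.0)
    gap = A - lhs
    return np.linalg.eigvalsh((gap + gap.T) / 2).min()

print(special_case_gap(seed=0))
```

A nonnegative value (up to rounding) is what the chain of inequalities above predicts.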
To prove Theorem 2.10, we cite a lemma obtained by the Furuta inequality.
Lemma 2.11 If $A \ge B > 0$, $t \ge 0$ and $p \le -1$, then

\[ (A^{\frac t2} B^{-p} A^{\frac t2})^{\frac{1+t}{-p+t}} \le A^{1+t}; \]

in particular, $(A^{\frac t2} B^{-p} A^{\frac t2})^s \le A^t$ holds for $s = \frac{t}{-p+t}$.
To show Theorem 2.10, we reformulate it as follows:
Theorem 2.12 If $A \ge B > 0$, $t \ge \frac{c-1}{c+1}$ for some $c \ge 2$, $1 \ge t > r \ge 0$ with $t - r = \frac{1}{c+1}$ and $p \le -1$, then

\[ A^{\frac{1}{c+1}} \natural_{-c} (A^t \#_s B^p) \le A \]

holds for $s = \frac{t}{-p+t}$.

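Proof Put $\alpha = t - r$. Then $\alpha = \frac{1}{c+1} < \frac12$, $c = \frac{1-\alpha}{\alpha}$ and the assumption $t \ge \frac{c-1}{c+1}$ means $\alpha(c-1) \le t$, which plays a role when we use the Löwner-Heinz inequality below. We put $X = A^{\frac t2} B^{-p} A^{\frac t2}$ and $Y = A^{-\frac r2} X^s A^{-\frac r2}$. Then $A^{\frac{1}{c+1}} \natural_{-c} (A^t \#_s B^p) = A^{\frac\alpha2} Y^c A^{\frac\alpha2}$, and $X^{\frac st} = X^{\frac{1}{-p+t}} \le A$; in particular, $X^s \le A^t$ and $X^{\frac{st'}{t}} \le A^{t'}$ for $0 \le t' \le 1 + t$ by Lemma 2.11.
(1) First we suppose that $2n \le c < 2n + 1$ for some $n$, i.e., $c = 2n + \varepsilon$ for some $\varepsilon \in [0, 1)$. Since $\alpha(c-2) \le t - \alpha = r$ by $\alpha(c-1) \le t$, we have $\alpha\varepsilon \le \alpha(2(n-1) + \varepsilon) = \alpha(c-2) \le r$ and so

\[ -1 \le \frac{\alpha\varepsilon - r}{t} \le \frac{\alpha(2(n-k) + \varepsilon) - r}{t} \le 0 \]

for $k = 1, 2, \cdots, n$. Noting that $0 \le 2s + [\alpha(2(n-1) + \varepsilon) - r]\frac st \le \frac{1+t}{-p+t}$ by $\frac{c-1}{c+1} \le 1$, it follows that

\begin{align*}
Y^c &= Y^n Y^{\varepsilon} Y^n = Y^n (A^{-\frac r2} X^s A^{-\frac r2})^{\varepsilon} Y^n \\
 &\le Y^n (A^{-\frac r2} A^t A^{-\frac r2})^{\varepsilon} Y^n = Y^n A^{\alpha\varepsilon} Y^n
   && \text{by } X^s \le A^t \text{ and (LH)} \\
 &= Y^{n-1} A^{-\frac r2} X^s A^{\alpha\varepsilon - r} X^s A^{-\frac r2} Y^{n-1} \\
 &\le Y^{n-1} A^{-\frac r2} X^{2s + (\alpha\varepsilon - r)\frac st} A^{-\frac r2} Y^{n-1}
   && \text{by } X^s \le A^t,\ \tfrac{\alpha\varepsilon - r}{t} \in [-1, 0] \\
 &\le Y^{n-1} A^{2t + \alpha\varepsilon - 2r} Y^{n-1}
   && \text{by putting } t' = 2t + \alpha\varepsilon - r \le 1 + t \\
 &= Y^{n-1} A^{\alpha(2 + \varepsilon)} Y^{n-1} \\
 &\le Y^{n-2} A^{\alpha(4 + \varepsilon)} Y^{n-2} \\
 &\ \ \vdots \\
 &\le Y A^{\alpha(2(n-1) + \varepsilon)} Y \\
 &\le A^{\alpha(2n + \varepsilon)} = A^{\alpha c}.
\end{align*}

Hence we have

\[ A^{\frac{1}{c+1}} \natural_{-c} (A^t \#_s B^p) = A^{\frac\alpha2} Y^c A^{\frac\alpha2} \le A^{\alpha c + \alpha} = A, \]

as desired.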

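(2) Next we suppose that $2n + 1 \le c < 2n + 2$ for some $n$, i.e., $c = 2n + 1 + \varepsilon$ for some $\varepsilon \in [0, 1)$. For this case, we prepare the inequality

\[ Y^{1+\varepsilon} \le A^{\alpha(1+\varepsilon)}. \]

It is proved as follows:

\begin{align*}
Y^{1+\varepsilon} &= (A^{-\frac r2} X^s A^{-\frac r2})^{1+\varepsilon} \\
 &= A^{-\frac r2} X^{\frac s2} (X^{\frac s2} A^{-r} X^{\frac s2})^{\varepsilon} X^{\frac s2} A^{-\frac r2} \\
 &\le A^{-\frac r2} X^{\frac s2} (X^{\frac s2} X^{-\frac{sr}{t}} X^{\frac s2})^{\varepsilon} X^{\frac s2} A^{-\frac r2} \\
 &= A^{-\frac r2} X^{s + (s - \frac{sr}{t})\varepsilon} A^{-\frac r2} \\
 &\le A^{-\frac r2} A^{t + \alpha\varepsilon} A^{-\frac r2} = A^{\alpha(1+\varepsilon)}.
\end{align*}

Now, if $n = 0$, i.e., $c = 1 + \varepsilon$, then

\[ A^{\frac\alpha2} Y^{1+\varepsilon} A^{\frac\alpha2} \le A^{\frac\alpha2} A^{\alpha(1+\varepsilon)} A^{\frac\alpha2} = A^{\alpha(2+\varepsilon)} = A. \]

Next, if $c = 2n + 1 + \varepsilon$ for some $\varepsilon \in [0, 1)$ with $n \ne 0$, then

\begin{align*}
Y^c &= Y^n Y^{1+\varepsilon} Y^n \le Y^n A^{\alpha(1+\varepsilon)} Y^n \\
 &= Y^{n-1} A^{-\frac r2} X^s A^{\alpha(1+\varepsilon) - r} X^s A^{-\frac r2} Y^{n-1} \\
 &\le Y^{n-1} A^{-\frac r2} X^{2s + (\alpha(1+\varepsilon) - r)\frac st} A^{-\frac r2} Y^{n-1} \\
 &\le Y^{n-1} A^{2t + \alpha(1+\varepsilon) - 2r} Y^{n-1} \\
 &= Y^{n-1} A^{\alpha(3+\varepsilon)} Y^{n-1} \\
 &\le Y^{n-2} A^{\alpha(5+\varepsilon)} Y^{n-2} \\
 &\ \ \vdots \\
 &\le Y A^{\alpha(2(n-1)+1+\varepsilon)} Y \\
 &\le A^{\alpha(2n+1+\varepsilon)} = A^{\alpha c},
\end{align*}

in which $(-1 \le -r \le)\ \alpha(2(n-1) + 1 + \varepsilon) - r \le 0$ is required in order to use the Löwner-Heinz inequality. (Fortunately it is assured by the assumption $t \ge \frac{c-1}{c+1}$.)
Hence we have

\[ A^{\frac{1}{c+1}} \natural_{-c} (A^t \#_s B^p) = A^{\frac\alpha2} Y^c A^{\frac\alpha2} \le A^{\alpha c + \alpha} = A, \]

as desired.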

Recently Ito and Kamei [20] gave an improvement of the above results. For this, they rewrite it as follows:

\[ A \ge B > 0 \ \text{implies} \ A^{-r} \#_{\frac{1-r}{p+r}} B^p \le A^{1-2r} \ \text{for } p \ge 1 \text{ and } 0 \le r \le 1. \tag{FN} \]

Moreover we recall a satellite of (FI) due to Kamei [21]:

\[ A \ge B > 0 \ \Rightarrow \ A^{-r} \#_{\frac{1+r}{p+r}} B^p \le B \le A \le B^{-r} \#_{\frac{1+r}{p+r}} A^p \ \text{for } p \ge 1 \text{ and } r \ge 0. \tag{SF} \]
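Under such preparation, the following improvement of Theorem 2.6 is proposed:
Theorem 2.13 Let $A \ge B > 0$ and $r > 0$. Then for $p \ge 1$, the following inequalities hold.

\[ A^{-r} \#_{\frac{1-r}{p+r}} B^p \le \begin{cases} B^{1-2r} \le A^{1-2r} & \text{if } 0 \le r \le \frac12, \\ A^{1-2r} \le B^{1-2r} & \text{if } \frac12 \le r \le 1, \end{cases} \tag{2} \]

\[ A^{-r} \natural_{\frac{1-r}{p+r}} B^p \ge A^{1-2r} \quad \text{if } r > 1. \tag{3} \]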

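Proof If $0 \le r \le 1$, then we have

\[ A^{-r} \#_{\frac{1-r}{p+r}} B^p \le B^{-r} \#_{\frac{1-r}{p+r}} B^p = B^{1-2r} \]

and

\begin{align*}
A^{-r} \#_{\frac{1-r}{p+r}} B^p &= A^{-r} \#_{\frac{1-r}{1+r}} (A^{-r} \#_{\frac{1+r}{p+r}} B^p) \\
 &\le A^{-r} \#_{\frac{1-r}{1+r}} (B^{-r} \#_{\frac{1+r}{p+r}} B^p) \\
 &= A^{-r} \#_{\frac{1-r}{1+r}} B \\
 &\le A^{-r} \#_{\frac{1-r}{1+r}} A = A^{1-2r}.
\end{align*}

Therefore we obtain (2) since $B^{1-2r} \le A^{1-2r}$ holds if $0 \le r \le \frac12$ and $A^{1-2r} \le B^{1-2r}$ holds if $\frac12 \le r \le 1$.
If $r > 1$, then we have (3) because

\begin{align*}
A^{-r} \natural_{\frac{1-r}{p+r}} B^p &= A^{-r} (A^{r} \#_{\frac{r-1}{p+r}} B^{-p}) A^{-r} \\
 &= A^{-r} (B^{-p} \#_{\frac{1+p}{r+p}} A^{r}) A^{-r} \ge A^{-r} A A^{-r} = A^{1-2r}
\end{align*}

by (SF).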

An improvement of Theorem 2.8 is given as follows:

Theorem 2.14 Let $A \ge B > 0$ and $0 \le r \le t \le 1$. Then

\[ A^{r-t} \#_{\frac{1-t+r}{(p+t)s-r}} (A^{-t} \#_s B^p) \le \begin{cases} B^{1-2(t-r)} \le A^{1-2(t-r)} & \text{if } 0 \le t - r \le \frac12, \\ A^{1-2(t-r)} \le B^{1-2(t-r)} & \text{if } \frac12 \le t - r \le 1 \end{cases} \]

holds for $p \ge 1$ and $\frac{1-t+2r}{p+t} \le s \le 1$.
1−t +r
Proof Noting that $0 \le \frac{1-t+r}{(p+t)s-r} \le 1$ and $0 \le t - r \le 1$ hold, we have

\[ A^{r-t} \#_{\frac{1-t+r}{(p+t)s-r}} (A^{-t} \#_s B^p) \le B^{r-t} \#_{\frac{1-t+r}{(p+t)s-r}} (B^{-t} \#_s B^p) = B^{1-2(t-r)}. \]

Next we show $A^{r-t} \#_{\frac{1-t+r}{(p+t)s-r}} (A^{-t} \#_s B^p) \le A^{1-2(t-r)}$ by dividing into three cases:
(i) If $(p + t)s - t \ge 1$ holds, then

\begin{align*}
A^{r-t} \#_{\frac{1-t+r}{(p+t)s-r}} (A^{-t} \#_s B^p)
 &\le A^{r-t} \#_{\frac{1-t+r}{(p+t)s-r}} (B^{-t} \#_s B^p) \\
 &= A^{r-t} \#_{\frac{1-t+r}{(p+t)s-r}} B^{(p+t)s-t} \\
 &= A^{r-t} \#_{\frac{1-t+r}{1+t-r}} (A^{r-t} \#_{\frac{1+(t-r)}{(p+t)s-t+(t-r)}} B^{(p+t)s-t}) \\
 &\le A^{r-t} \#_{\frac{1-t+r}{1+t-r}} (B^{r-t} \#_{\frac{1+(t-r)}{(p+t)s-t+(t-r)}} B^{(p+t)s-t}) \\
 &= A^{r-t} \#_{\frac{1-t+r}{1+t-r}} B \\
 &\le A^{r-t} \#_{\frac{1-t+r}{1+t-r}} A = A^{1-2(t-r)}.
\end{align*}

(ii) If $0 \le (p + t)s - t \le 1$ holds, then

\begin{align*}
A^{r-t} \#_{\frac{1-t+r}{(p+t)s-r}} (A^{-t} \#_s B^p)
 &\le A^{r-t} \#_{\frac{1-t+r}{(p+t)s-r}} (B^{-t} \#_s B^p) \\
 &= A^{r-t} \#_{\frac{1-t+r}{(p+t)s-r}} B^{(p+t)s-t} \\
 &\le A^{r-t} \#_{\frac{1-t+r}{(p+t)s-r}} A^{(p+t)s-t} = A^{1-2(t-r)}.
\end{align*}

(iii) If $(p + t)s - t \le 0$ holds, then

\begin{align*}
A^{-t} \#_s B^p &= A^{-t} \#_{\frac{(p+t)s}{t}} (A^{-t} \#_{\frac{t}{p+t}} B^p) \\
 &\le A^{-t} \#_{\frac{(p+t)s}{t}} (B^{-t} \#_{\frac{t}{p+t}} B^p) \\
 &= A^{-t} \#_{\frac{(p+t)s}{t}} I = A^{(p+t)s-t},
\end{align*}

so that we have

\[ A^{r-t} \#_{\frac{1-t+r}{(p+t)s-r}} (A^{-t} \#_s B^p) \le A^{r-t} \#_{\frac{1-t+r}{(p+t)s-r}} A^{(p+t)s-t} = A^{1-2(t-r)}. \]

Therefore we obtain the desired result since $B^{1-2(t-r)} \le A^{1-2(t-r)}$ holds if $0 \le t - r \le \frac12$ and $A^{1-2(t-r)} \le B^{1-2(t-r)}$ holds if $\frac12 \le t - r \le 1$.

3 Applications

This section is based on our recent paper [12]. Bebiano-Lemos-Providência [2] proposed the following norm inequality:

\[ \| A^{\frac{1+t}{2}} B^t A^{\frac{1+t}{2}} \| \le \| A^{\frac12} (A^{\frac s2} B^s A^{\frac s2})^{\frac ts} A^{\frac12} \| \]

holds for $A, B \ge 0$ and $s \ge t \ge 0$. We call it the BLP inequality. We generalized it in [13] from the viewpoint of the Furuta inequality. In this section, we discuss further generalizations of the BLP inequality as applications of the results in the preceding section.
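As a quick plausibility check of the BLP inequality $\| A^{\frac{1+t}{2}} B^t A^{\frac{1+t}{2}} \| \le \| A^{\frac12} (A^{\frac s2} B^s A^{\frac s2})^{\frac ts} A^{\frac12} \|$ for $s \ge t \ge 0$, both sides can be evaluated on random positive definite matrices. This is a sketch under assumptions, not the authors' method: NumPy, real symmetric matrices, the operator norm taken as the spectral norm, and hypothetical helper names `mpow`, `random_pd`, `blp_sides`.

```python
import numpy as np

def mpow(X, p):
    """Real power of a symmetric positive definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh((X + X.T) / 2)
    return (V * w**p) @ V.T

def random_pd(n, rng):
    """Random symmetric positive definite matrix (illustrative choice)."""
    G = rng.standard_normal((n, n))
    return G @ G.T + 0.5 * np.eye(n)

def blp_sides(A, B, s, t):
    """Return (lhs, rhs) of the BLP norm inequality for s >= t > 0."""
    At = mpow(A, (1 + t) / 2)
    lhs = np.linalg.norm(At @ mpow(B, t) @ At, 2)          # spectral norm
    inner = mpow(mpow(A, s / 2) @ mpow(B, s) @ mpow(A, s / 2), t / s)
    Ah = mpow(A, 0.5)
    rhs = np.linalg.norm(Ah @ inner @ Ah, 2)
    return lhs, rhs

rng = np.random.default_rng(0)
A, B = random_pd(4, rng), random_pd(4, rng)
lhs, rhs = blp_sides(A, B, s=2.0, t=1.0)
print(lhs <= rhs * (1 + 1e-9))
```

For commuting $A$ and $B$ the two sides coincide, so any gap observed here reflects non-commutativity.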
We first mention the useful identity on the binary operation $\natural_\beta$ again: For $\beta \in \mathbb{R}$ and positive invertible operators $X$ and $Y$,

\[ X \natural_\beta Y = X(X^{-1} \natural_{-\beta} Y^{-1})X. \]

This means that if $\beta \in [-1, 0]$, then $\natural_\beta$ looks like an operator mean in some sense.
We also restate Lemma 2.1 and Theorem 2.6 for convenience.
Lemma 3.1 If $A \natural_\alpha B \le 1$ for $\alpha \in [-1, 0]$ and $A, B > 0$, then $A^r \natural_\beta B \le 1$ for $r \in [0, 1]$, where $\beta = \frac{\alpha r}{\alpha r + (1-\alpha)}$.
It is reformulated in Furuta type as follows:
Theorem 3.2 If $A \ge B > 0$, then

\[ A^{-r} \natural_{\frac{1+r}{p+r}} B^p \le A \]

holds for $p \le -1$ and $r \in [-1, 0]$.

We remark that they are equivalent. Moreover it suggests that the domain in which (FI) holds is extendable.
Theorem 3.3 If $A \ge B > 0$, then

\[ A^{-r} \#_{\frac{1+r}{p+r}} B^p \le A \]

holds for $p \in [0, 1]$ and $r \le -1$.

Proof It is easily checked:

\[ A^{-r} \#_{\frac{1+r}{p+r}} B^p \le A^{-r} \#_{\frac{1+r}{p+r}} A^p = A \]

by (LH).

As an application, we show a generalization of the BLP inequality below. For this, we need the following lemma:
Lemma 3.4 Suppose that $A, B > 0$.
(1) If $A^r \natural_{\frac1p} B^{p+r} \le A^{1+r}$ for some $p \le -1$ and $r \in [-1, 0]$, then $B^{1+r} \le A^{1+r}$.
(2) If $A^r \natural_{\frac1p} B^{p+r} \le A^{1+r}$ for some $p \in [0, 1]$ and $r \le -1$, then $B^{1+r} \le A^{1+r}$.

Proof (1) Since the assumption is rephrased as $B_1 = (A^{-\frac r2} B^{p+r} A^{-\frac r2})^{\frac1p} \le A$, it follows from Theorem 3.2 that

\[ A \ge A^{-r} \natural_{\frac{1+r}{p+r}} B_1^{\,p} = A^{-r} \natural_{\frac{1+r}{p+r}} A^{-\frac r2} B^{p+r} A^{-\frac r2} = A^{-\frac r2} B^{1+r} A^{-\frac r2}, \]

so that we have the conclusion $B^{1+r} \le A^{1+r}$.

(2) is proved in the same way as (1) with the use of Theorem 3.3.

By the use of (LH), we have the following.
Corollary 3.5 Suppose that $A, B > 0$.
(1) If $A^r \natural_{\frac1p} B^{p+r} \le A^{1+r}$ for some $p \le -1$ and $r \in [-1, 0]$, then $B^{1+s} \le A^{1+s}$ for $-1 \le s \le r$.
(2) If $A^r \natural_{\frac1p} B^{p+r} \le A^{1+r}$ for some $p \in [0, 1]$ and $r \le -1$, then $B^{1+s} \le A^{1+s}$ for $r \le s \le -1$.
Consequently we have a generalized BLP inequality as follows:
Theorem 3.6 If $A, B > 0$, then

\[ \| A^{\frac{1+r}{2}} B^{1+r} A^{\frac{1+r}{2}} \|^{\frac{p+r}{p(1+r)}} \le \| A^{\frac12} (A^{\frac r2} B^{p+r} A^{\frac r2})^{\frac1p} A^{\frac12} \| \]

holds for either $p \le -1$ and $r \in [-1, 0]$ or $p \in [0, 1]$ and $r \le -1$.

Proof Lemma 3.4 implies that if $A^{-r} \natural_{\frac1p} B^{p+r} \le A^{-(1+r)}$, then $B^{1+r} \le A^{-(1+r)}$. It says that if $A^{\frac12} (A^{\frac r2} B^{p+r} A^{\frac r2})^{\frac1p} A^{\frac12} \le I$, then $A^{\frac{1+r}{2}} B^{1+r} A^{\frac{1+r}{2}} \le I$. Consequently we have the desired norm inequality.

In addition, we also obtain norm inequalities corresponding to Corollary 3.5:
Theorem 3.7 If $A, B > 0$, then

\[ \| A^{\frac{1+s}{2}} B^{1+s} A^{\frac{1+s}{2}} \|^{\frac{p+r}{p(1+s)}} \le \| A^{\frac12} (A^{\frac r2} B^{p+r} A^{\frac r2})^{\frac1p} A^{\frac12} \| \]

holds for either $p \le -1$ and $-1 \le s \le r \le 0$ or $p \in [0, 1]$ and $r \le s \le -1$.

Next we consider a reverse inequality of the generalized BLP inequality. For this, we cite a reverse inequality of the Araki-Cordes inequality (AC), i.e.,

\[ \| ABA \|^p \le \| A^p B^p A^p \| \quad \text{for } A, B \ge 0. \]

Theorem R-AC ([14]) If $A \ge 0$, $0 < mI \le B \le MI$ for some $M > m > 0$ and $h = \frac Mm$, then

\[ (\| ABA \|^p \le)\ \| A^p B^p A^p \| \le K(h, p) \| ABA \|^p \]

holds for $p \ge 1$, where $K(h, p)$ is the generalized Kantorovich constant defined by

\[ K(h, p) = \frac{1}{h-1} \, \frac{h^p - h}{p-1} \left( \frac{p-1}{h^p - h} \, \frac{h^p - 1}{p} \right)^{p}. \]
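The generalized Kantorovich constant is easy to evaluate numerically. The following short sketch (plain Python; the function name `kantorovich` is hypothetical) implements the formula above and checks two elementary consequences of it: for $p = 2$ it reduces to the classical Kantorovich constant $\frac{(h+1)^2}{4h}$, and $K(h, p) = K(h^{-1}, p)$.

```python
def kantorovich(h, p):
    """Generalized Kantorovich constant K(h, p) for h > 0, h != 1 and p not in {0, 1}."""
    hp = h ** p
    return (hp - h) / ((p - 1) * (h - 1)) * ((p - 1) / p * (hp - 1) / (hp - h)) ** p

h = 3.0
k2 = kantorovich(h, 2.0)             # should agree with (h + 1)^2 / (4 h)
k_sym = kantorovich(1.0 / h, 2.5)    # should agree with kantorovich(h, 2.5)
print(k2, (h + 1) ** 2 / (4 * h))
print(k_sym, kantorovich(h, 2.5))
```

The formula is singular at $h = 1$ and at $p \in \{0, 1\}$; these limits are not handled in this sketch.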

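Theorem 3.8 Suppose that $A \ge 0$, $0 < mI \le B \le MI$ for some $M > m > 0$ and $h = \frac Mm$. Then

\[ \| A^{\frac12} (A^{\frac r2} B^{p+r} A^{\frac r2})^{\frac1p} A^{\frac12} \| \le K\!\left(h^{1+s}, -\frac{p+r}{1+s}\right)^{-\frac1p} \| A^{\frac{1+s}{2}} B^{1+s} A^{\frac{1+s}{2}} \|^{\frac{p+r}{p(1+s)}} \]

holds for $p \le -1$, $-1 < s$ and $r \le -(p + 1 + s)$.

Proof It is proved by (AC) and Theorem R-AC. As a matter of fact, since $p \le -1$ and $-1 < s \le r \le 0$, we have $-p \ge 1$ and $\frac{-(p+r)}{1+s} \ge 1$, and so

\begin{align*}
\| A^{\frac12} (A^{\frac r2} B^{p+r} A^{\frac r2})^{\frac1p} A^{\frac12} \|
 &\le \| A^{-\frac p2} (A^{\frac r2} B^{p+r} A^{\frac r2})^{-1} A^{-\frac p2} \|^{-\frac1p} \\
 &= \| A^{-\frac{p+r}{2}} (B^{1+s})^{\frac{-(p+r)}{1+s}} A^{-\frac{p+r}{2}} \|^{-\frac1p} \\
 &\le K\!\left(h^{1+s}, \frac{-(p+r)}{1+s}\right)^{-\frac1p} \| A^{\frac{1+s}{2}} B^{1+s} A^{\frac{1+s}{2}} \|^{\frac{p+r}{p(1+s)}}.
\end{align*}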

We note that Theorem 3.8 can be expressed as an operator inequality:
Corollary 3.9 Suppose that $A > 0$, $0 < mI \le B \le MI$ for some $M > m > 0$ and $h = \frac Mm$. If $B^{1+s} \le A^{1+s}$ for some $-1 < s \le 0$, then

\[ A^r \natural_{\frac1p} B^{p+r} \le K\!\left(h^{1+s}, -\frac{p+r}{1+s}\right)^{-\frac1p} A^{1+r} \]

for $p \le -1$ and $r \le -(p + 1 + s)$.

Taking $s = 0$ in Corollary 3.9, we have the following.
Corollary 3.10 Suppose that $0 < mI \le B \le MI$ for some $M > m > 0$ and $h = \frac Mm$. If $A \ge B$, then

\[ A^r \natural_{\frac1p} B^{p+r} \le K(h, -(p + r))^{-\frac1p} A^{1+r} \]

for $p \le -1$ and $r \le -(p + 1)$.

We remark that if we take $p = -1$ in the above, then we obtain a well-known result:

\[ A \ge B > 0 \ \Rightarrow \ K(h, 1 + r) A^{1+r} \ge B^{1+r} \quad \text{for } r \ge 0. \]

It is a complementary inequality related to (LH). As a matter of fact, we know that

\[ A \ge B > 0 \ \not\Rightarrow \ A^{1+r} \ge B^{1+r} \quad \text{for } r > 0 \]

in general.
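The specialization $p = -1$ of Corollary 3.10 can be written out in one line; the following worked step is added for the reader and uses only the identity $X \natural_{-1} Y = XY^{-1}X$ recalled in the proof of Theorem 2.9. For $p = -1$ the exponent $-\frac1p$ equals $1$ and $-(p + r) = 1 - r$, with $r \le 0$.

```latex
\begin{align*}
A^{r} \natural_{-1} B^{r-1} = A^{r} B^{1-r} A^{r} \le K(h, 1-r)\, A^{1+r}
\quad &\Longleftrightarrow \quad
B^{1-r} \le K(h, 1-r)\, A^{1-r} \qquad (r \le 0),
\end{align*}
```

and replacing $r$ by $-r$ gives $B^{1+r} \le K(h, 1+r) A^{1+r}$ for $r \ge 0$.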
By the way, Theorem 3.7 can be expressed as follows:

\[ \| A^{\frac{1+s}{2}} B^{1+s} A^{\frac{1+s}{2}} \| \le \| A^{\frac12} (A^{\frac r2} B^{p+r} A^{\frac r2})^{\frac1p} A^{\frac12} \|^{\frac{p(1+s)}{p+r}} \]

holds for $p \le -1$ and $-1 \le s \le r \le 0$.

Next we discuss reverse inequalities with respect to the difference; precisely, we estimate an upper bound of the difference

\[ \| A^{\frac12} (A^{\frac r2} B^{p+r} A^{\frac r2})^{\frac1p} A^{\frac12} \|^{\frac{p(1+s)}{p+r}} - \lambda \| A^{\frac{1+s}{2}} B^{1+s} A^{\frac{1+s}{2}} \| \]

for a given $\lambda > 0$.

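For this, we put, for $q > 1$, $M > m > 0$ and $h = \frac Mm$,

\[ j_1 = j^{(1)}_{q,h} = \frac{h^q - 1}{q(h^q - h^{q-1})}, \qquad j_2 = j^{(2)}_{q,h} = \frac{h^q - 1}{q(h - 1)} \]

and

\[ \beta(m, M, q; \lambda) = \begin{cases} (1 - \lambda)M & \text{if } 0 < \lambda < j_1, \\[2pt] \dfrac{q-1}{q} \left( \dfrac{M^q - m^q}{\lambda q (M - m)} \right)^{\frac{1}{q-1}} + \dfrac{\lambda (M m^q - m M^q)}{M^q - m^q} & \text{if } \lambda \in [j_1, j_2], \\[2pt] (1 - \lambda)m & \text{if } \lambda > j_2. \end{cases} \]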
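The piecewise constant $\beta(m, M, q; \lambda)$ can be coded directly. The sketch below (plain Python; the names `j_bounds` and `beta` are hypothetical) follows the three cases above and verifies numerically that the middle branch matches the outer branches at $\lambda = j_1$ and $\lambda = j_2$, which is a useful consistency check on the formula as written here.

```python
def j_bounds(h, q):
    """Thresholds j1 <= j2 for q > 1 and h = M/m > 1."""
    j1 = (h**q - 1) / (q * (h**q - h**(q - 1)))
    j2 = (h**q - 1) / (q * (h - 1))
    return j1, j2

def beta(m, M, q, lam):
    """beta(m, M, q; lambda) for M > m > 0, q > 1, lambda > 0."""
    j1, j2 = j_bounds(M / m, q)
    if lam < j1:
        return (1 - lam) * M
    if lam > j2:
        return (1 - lam) * m
    ratio = (M**q - m**q) / (lam * q * (M - m))
    return (q - 1) / q * ratio ** (1 / (q - 1)) + lam * (M * m**q - m * M**q) / (M**q - m**q)

m, M, q = 2.0, 6.0, 2.5
j1, j2 = j_bounds(M / m, q)
print(beta(m, M, q, j1), (1 - j1) * M)   # the two values should agree
print(beta(m, M, q, j2), (1 - j2) * m)   # likewise
```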

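Theorem 3.11 Suppose that $A \ge 0$, $0 < mI \le B \le MI$ for some $M > m > 0$ and $h = \frac Mm$. If $p \le -1$, $-1 < s$ and $r \le -(p + 1 + s)$, then for each $\lambda > 0$

\[ \| A^{\frac12} (A^{\frac r2} B^{p+r} A^{\frac r2})^{\frac1p} A^{\frac12} \|^{\frac{p(1+s)}{p+r}} \le \lambda \| A^{\frac{1+s}{2}} B^{1+s} A^{\frac{1+s}{2}} \| + \beta\!\left(m^{1+s}, M^{1+s}, -\frac{p+r}{1+s}, \lambda\right) \| A \|^{1+s}. \]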

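Proof We first refer to [14, Theorem 6]: If $A_1 > 0$, $0 < m_1 I \le B_1 \le M_1 I$ for some $M_1 > m_1 > 0$ and $q > 1$, then for each $\lambda > 0$

\[ \| A_1^q B_1^q A_1^q \|^{\frac1q} \le \lambda \| A_1 B_1 A_1 \| + \beta(m_1, M_1, q; \lambda) \| A_1 \|^2. \]

We apply it for $A_1 = A^{\frac{1+s}{2}}$, $B_1 = B^{1+s}$ and $q = -\frac{p+r}{1+s}$. Then it follows that

\begin{align*}
\| A^{\frac12} (A^{\frac r2} B^{p+r} A^{\frac r2})^{\frac1p} A^{\frac12} \|^{\frac{p(1+s)}{p+r}}
 &= \| A^{\frac12} (A^{\frac r2} B^{p+r} A^{\frac r2})^{\frac1p} A^{\frac12} \|^{\frac{-p}{q}} \\
 &\le \| A^{-\frac p2} (A^{\frac r2} B^{p+r} A^{\frac r2})^{-1} A^{-\frac p2} \|^{\frac1q} && \text{by (AC)} \\
 &= \| A^{-\frac{p+r}{2}} B^{-(p+r)} A^{-\frac{p+r}{2}} \|^{\frac1q} \\
 &= \| A_1^q B_1^q A_1^q \|^{\frac1q} \\
 &\le \lambda \| A_1 B_1 A_1 \| + \beta(m^{1+s}, M^{1+s}, q, \lambda) \| A_1 \|^2 \\
 &= \lambda \| A^{\frac{1+s}{2}} B^{1+s} A^{\frac{1+s}{2}} \| + \beta\!\left(m^{1+s}, M^{1+s}, -\frac{p+r}{1+s}, \lambda\right) \| A \|^{1+s}.
\end{align*}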

For convenience, we cite the original form of the "grand Furuta inequality".

Grand Furuta Inequality If $A \ge B > 0$ and $t \in [0, 1]$, then

\[ [A^{\frac r2} (A^{-\frac t2} B^p A^{-\frac t2})^s A^{\frac r2}]^{\frac1q} \le A^{\frac{(p-t)s+r}{q}} \]

holds for $r \ge t$, $p \ge 0$, $q \ge 1$ and $s \ge 1$ with $(1 - t + r)q \ge (p - t)s + r$.

The core of the grand Furuta inequality is the case $(1 - t + r)q = (p - t)s + r$ by virtue of (LH). So we call it (GFI). That is,
Grand Furuta Inequality (GFI) If $A \ge B > 0$ and $t \in [0, 1]$, then

\[ [A^{\frac r2} (A^{-\frac t2} B^p A^{-\frac t2})^s A^{\frac r2}]^{\frac{1-t+r}{(p-t)s+r}} \le A^{1-t+r} \]

holds for $r \ge t$ and $p, s \ge 1$.

As in the case of the Furuta inequality, a mean theoretic expression of (GFI) is given as follows:
If $A \ge B > 0$ and $t \in [0, 1]$ is given, then

\[ A^{-r+t} \#_{\frac{1-t+r}{(p-t)s+r}} (A^t \natural_s B^p) \le A \]

holds for $r \ge t$ and $p, s \ge 1$.

Below, we discuss a generalization of the BLP inequality corresponding to (GFI). To do this, we prepare the following operator inequality:
Theorem 3.12 Suppose that $A, B > 0$ and $t \in [0, 1]$ satisfy

\[ A^{r-t} \#_{\frac1p} (A^r \#_{\frac1s} B^{(p-t)s+r}) \le A^{1-t+r} \]

for some $p, s \ge 1$ and $r \ge t$. Then $B^{1-t+r} \le A^{1-t+r}$.

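Proof Multiplying both sides of the assumption by $A^{\frac{t-r}{2}}$, we have

\[ B_2 = [A^{\frac{t-r}{2}} (A^r \#_{\frac1s} B^{(p-t)s+r}) A^{\frac{t-r}{2}}]^{\frac1p} \le A. \]

Applying (GFI) for $B_2 \le A$, it follows that

\[ A^{t-r} \#_{\frac{1-t+r}{(p-t)s+r}} (A^t \natural_s B_2^{\,p}) \le A. \]

Moreover we have

\begin{align*}
A^t \natural_s B_2^{\,p} &= A^t \natural_s [A^{\frac{t-r}{2}} (A^r \#_{\frac1s} B^{(p-t)s+r}) A^{\frac{t-r}{2}}] \\
 &= A^{\frac t2} [A^{-\frac r2} (A^r \#_{\frac1s} B^{(p-t)s+r}) A^{-\frac r2}]^s A^{\frac t2} \\
 &= A^{\frac t2} [A^{-\frac r2} B^{(p-t)s+r} A^{-\frac r2}]^{\frac1s \times s} A^{\frac t2} \\
 &= A^{\frac{t-r}{2}} B^{(p-t)s+r} A^{\frac{t-r}{2}},
\end{align*}

so that

\begin{align*}
A &\ge A^{t-r} \#_{\frac{1-t+r}{(p-t)s+r}} (A^t \natural_s B_2^{\,p}) \\
 &= A^{t-r} \#_{\frac{1-t+r}{(p-t)s+r}} A^{\frac{t-r}{2}} B^{(p-t)s+r} A^{\frac{t-r}{2}} \\
 &= A^{\frac{t-r}{2}} B^{1-t+r} A^{\frac{t-r}{2}},
\end{align*}

as desired.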

As a corollary, we have a norm inequality of BLP type corresponding to (GFI):
Corollary 3.13 Suppose that $A, B > 0$ and $t \in [0, 1]$. Then

\[ \| A^{\frac{1-t+r}{2}} B^{1-t+r} A^{\frac{1-t+r}{2}} \|^{\frac{(p-t)s+r}{ps(1-t+r)}} \le \| A^{\frac12} \{A^{-\frac t2} (A^{\frac r2} B^{(p-t)s+r} A^{\frac r2})^{\frac1s} A^{-\frac t2}\}^{\frac1p} A^{\frac12} \| \]

holds for $p, s \ge 1$ and $r \ge t$.

Proof It suffices to show that $A^{\frac12} \{A^{-\frac t2} (A^{\frac r2} B^{(p-t)s+r} A^{\frac r2})^{\frac1s} A^{-\frac t2}\}^{\frac1p} A^{\frac12} \le I$ implies $A^{\frac{1-t+r}{2}} B^{1-t+r} A^{\frac{1-t+r}{2}} \le I$. Since the assumption is equivalent to

\[ A^{-(r-t)} \#_{\frac1p} (A^{-r} \#_{\frac1s} B^{(p-t)s+r}) \le A^{-(1-t+r)}, \]

the conclusion is ensured by Theorem 3.12 with $A$ replaced by $A^{-1}$.

In Theorem 2.8, we extended the domain where (GFI) holds:
If $A \ge B > 0$ and $t \in [0, 1]$, then

\[ A^{-r+t} \natural_{\frac{1-t+r}{r+(p-t)s}} (A^t \#_s B^p) \le A \]

holds for $p \le -1$, $r \in [0, t]$ and $s \in [\max\{\frac{-t}{p-t}, \frac{-2r-(1-t)}{p-t}\}, 1]$.
Consequently, similarly to (GFI) itself, we have the following operator inequality and norm inequality.
Theorem 3.14 Suppose that $A, B > 0$ and $t \in [0, 1]$ satisfy

\[ A^{r-t} \natural_{\frac1p} (A^r \natural_{\frac1s} B^{(p-t)s+r}) \le A^{1-t+r} \]

for some $p \le -1$, $r \in [0, t]$ and $s \in [\max\{\frac{-t}{p-t}, \frac{-2r-(1-t)}{p-t}\}, 1]$. Then $B^{1-t+r} \le A^{1-t+r}$.
Corollary 3.15 Suppose that $A, B > 0$ and $t \in [0, 1]$. Then

\[ \| A^{\frac{1-t+r}{2}} B^{1-t+r} A^{\frac{1-t+r}{2}} \|^{\frac{(p-t)s+r}{ps(1-t+r)}} \le \| A^{\frac12} \{A^{-\frac t2} (A^{\frac r2} B^{(p-t)s+r} A^{\frac r2})^{\frac1s} A^{-\frac t2}\}^{\frac1p} A^{\frac12} \| \]

holds for $p \le -1$, $r \in [0, t]$ and $s \in [\max\{\frac{-t}{p-t}, \frac{-2r-(1-t)}{p-t}\}, 1]$.
Proof of Theorem 3.14 The proof is similar to that of Theorem 3.12. As in that proof, we have

\[ B_2 = [A^{\frac{t-r}{2}} (A^r \natural_{\frac1s} B^{(p-t)s+r}) A^{\frac{t-r}{2}}]^{\frac1p} \le A. \]

Applying Theorem 2.8 for $B_2 \le A$, it follows that

\[ A^{t-r} \natural_{\frac{1-t+r}{(p-t)s+r}} (A^t \#_s B_2^{\,p}) \le A. \]

Moreover, since we have $A^t \#_s B_2^{\,p} = A^{\frac{t-r}{2}} B^{(p-t)s+r} A^{\frac{t-r}{2}}$, it follows that

\[ A \ge A^{t-r} \natural_{\frac{1-t+r}{(p-t)s+r}} (A^t \#_s B_2^{\,p}) = A^{t-r} \natural_{\frac{1-t+r}{(p-t)s+r}} A^{\frac{t-r}{2}} B^{(p-t)s+r} A^{\frac{t-r}{2}}. \]

By multiplying both sides by $A^{-\frac{t-r}{2}}$, we obtain the conclusion.

In succession, we discuss some inequalities on the logarithm. The chaotic order $A \gg B$ for $A, B > 0$ is defined by $\log A \ge \log B$. It is weaker than the usual Löwner order $A \ge B$. The Furuta inequality for chaotic order is known in [6]:
Chaotic Furuta Inequality (CFI)
If $A \gg B$ for $A, B > 0$, then for each $r \ge 0$,

\[ \text{(i)} \quad (B^{\frac r2} A^p B^{\frac r2})^{\frac1q} \ge (B^{\frac r2} B^p B^{\frac r2})^{\frac1q} \]

and

\[ \text{(ii)} \quad (A^{\frac r2} B^p A^{\frac r2})^{\frac1q} \le (A^{\frac r2} A^p A^{\frac r2})^{\frac1q} \]

hold for $p \ge 0$ and $q \ge 1$ with

\[ rq \ge p + r. \]

As in (FI), the optimal case $rq = p + r$ is the most important, which is expressed in the following way by the use of the $\alpha$-geometric mean:
If $A \gg B$ for $A, B > 0$, then for each $r \ge 0$

\[ A^{-r} \#_{\frac{r}{p+r}} B^p \le I \quad \text{and} \quad B^p \#_{\frac{p}{p+r}} A^{-r} \le I \]

hold for $p \ge 0$.
As an application of (CFI), we obtain that if $A \gg B$ for $A, B > 0$, then

\[ A^{-r} \#_{\frac{1+r}{p+r}} B^p = B^p \#_{\frac{p-1}{p+r}} A^{-r} = B^p \#_{\frac{p-1}{p}} (B^p \#_{\frac{p}{p+r}} A^{-r}) \le B^p \#_{\frac{p-1}{p}} I = B. \]

Namely, the satellite of (FI) is refined as follows:

If $A \gg B$ for $A, B > 0$, then

\[ A^{-r} \#_{\frac{1+r}{p+r}} B^p \le B \]

holds for $p \ge 1$ and $r \ge 0$.

Now, based on the theory of operator means, the relative operator entropy was introduced by Fujii-Kamei [3]:

\[ S(A|B) = A^{\frac12} \log(A^{-\frac12} B A^{-\frac12}) A^{\frac12} \quad \text{for } A, B > 0. \]
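For matrices, the definition $S(A|B) = A^{\frac12} \log(A^{-\frac12} B A^{-\frac12}) A^{\frac12}$ can be evaluated with an eigendecomposition. The following is a minimal sketch, assuming NumPy and real symmetric positive definite inputs; the names `mfun` and `relative_entropy` are hypothetical. It also illustrates two elementary facts that follow from the definition: $S(A|A) = 0$, and $S(A|B) \le B - A$ because $\log x \le x - 1$.

```python
import numpy as np

def mfun(X, f):
    """Apply a scalar function f to a symmetric matrix via its eigendecomposition."""
    w, V = np.linalg.eigh((X + X.T) / 2)
    return (V * f(w)) @ V.T

def relative_entropy(A, B):
    """Relative operator entropy S(A|B) = A^{1/2} log(A^{-1/2} B A^{-1/2}) A^{1/2}."""
    Ah = mfun(A, np.sqrt)
    Aih = mfun(A, lambda w: 1.0 / np.sqrt(w))
    return Ah @ mfun(Aih @ B @ Aih, np.log) @ Ah

rng = np.random.default_rng(0)
G, H = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
A = G @ G.T + 0.5 * np.eye(3)
B = H @ H.T + 0.5 * np.eye(3)

S = relative_entropy(A, B)
gap = B - A - S                      # should be positive semidefinite
print(np.linalg.eigvalsh((gap + gap.T) / 2).min())
```

In the commuting scalar case the formula reduces to $a \log(b/a)$, for example $S(2I|3I) = 2\log\frac32 \, I$.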

As an application of (CFI), we show the following inequality, which is regarded as a chaotic order version of Corollary 3.5.
Theorem 3.16 Let $A, B > 0$ and $r > 0$ be given. Then, if $S(A^r|A^{p+r}) \ge S(A^r|B^{p+r})$ for some $p > 0$, then $A^r \ge B^r$.

Proof Since $\frac1p S(A^r|B^{p+r}) = A^{\frac r2} \log(A^{-\frac r2} B^{p+r} A^{-\frac r2})^{\frac1p} A^{\frac r2}$, the assumption is rephrased as

\[ \log A \ge \log(A^{-\frac r2} B^{p+r} A^{-\frac r2})^{\frac1p}. \]

Putting $B_1 = (A^{-\frac r2} B^{p+r} A^{-\frac r2})^{\frac1p}$, we have $A \gg B_1$. Hence it follows from (CFI) that

\[ I \ge A^{-r} \#_{\frac{r}{p+r}} B_1^{\,p} = A^{-r} \#_{\frac{r}{p+r}} A^{-\frac r2} B^{p+r} A^{-\frac r2}, \]

so that $A^r \ge I \#_{\frac{r}{p+r}} B^{p+r} = B^r$.

Next we show a generalization of the above, which is of (GFI) type.
Theorem 3.17 Let $A, B > 0$ and $t, r \ge 0$ be given. Then, if

\[ S(A^{t+r}|A^{p+t+r}) \ge S(A^{t+r}|A^r \natural_{\frac1s} B^{(p+t)s+r}) \]

holds for some $p, s > 0$ with $(p + t)s \ge t$, then $A^{t+r} \ge B^{t+r}$.

Proof We first note that the assumption is rephrased as

\[ \log A \ge \log[A^{-\frac{t+r}{2}} (A^r \natural_{\frac1s} B^{(p+t)s+r}) A^{-\frac{t+r}{2}}]^{\frac1p}. \]

We here recall the following operator inequality of (GFI) type, see [9, Theorem 3.16]: If $A \gg X$ for $A, X > 0$, then

\[ A^{\frac{(p+t)s+r}{q}} \ge [A^{\frac r2} (A^{\frac t2} X^p A^{\frac t2})^s A^{\frac r2}]^{\frac1q} \]

holds for $p, t, r, s \ge 0$, $q \ge 1$ with $(t + r)q \ge (p + t)s + r$. Hence, taking $q = \frac{(p+t)s+r}{t+r}$ and $X = [A^{-\frac{t+r}{2}} (A^r \natural_{\frac1s} B^{(p+t)s+r}) A^{-\frac{t+r}{2}}]^{\frac1p}$, it follows that

\begin{align*}
A^{t+r} &\ge [A^{\frac r2} (A^{\frac t2} X^p A^{\frac t2})^s A^{\frac r2}]^{\frac{t+r}{(p+t)s+r}} \\
 &= [A^{\frac r2} (A^{\frac t2} [A^{-\frac{t+r}{2}} (A^r \natural_{\frac1s} B^{(p+t)s+r}) A^{-\frac{t+r}{2}}] A^{\frac t2})^s A^{\frac r2}]^{\frac{t+r}{(p+t)s+r}} \\
 &= [A^{\frac r2} (A^{-\frac r2} (A^r \natural_{\frac1s} B^{(p+t)s+r}) A^{-\frac r2})^s A^{\frac r2}]^{\frac{t+r}{(p+t)s+r}} \\
 &= [A^{\frac r2} (A^{-\frac r2} B^{(p+t)s+r} A^{-\frac r2}) A^{\frac r2}]^{\frac{t+r}{(p+t)s+r}} \\
 &= B^{t+r},
\end{align*}

which completes the proof.

Finally we discuss log-majorization related to an operator inequality obtained in [11]. Below, $A$ and $B$ are positive definite $n \times n$ matrices, denoted by $A, B > 0$. We denote the order of log-majorization by $A \succ_{(\log)} B$, i.e., $A$ and $B$ satisfy

\[ \prod_{i=1}^{k} \lambda_i(A) \ge \prod_{i=1}^{k} \lambda_i(B) \quad \text{for } k = 1, \cdots, n-1 \]

and

\[ \prod_{i=1}^{n} \lambda_i(A) = \prod_{i=1}^{n} \lambda_i(B), \]

where $\{\lambda_i(X);\ i = 1, \cdots, n\}$ are the eigenvalues of $X > 0$, arranged in decreasing order.
For convenience, we briefly explain a relation between log-majorization and the $k$-fold antisymmetric tensor power of matrices. For an $n \times n$ matrix $X$, let $C_k(X)$ for $k = 1, \cdots, n$ be the $k$-fold antisymmetric tensor power of $X$. Then it has the following properties:
(1) $C_k(X^*) = C_k(X)^*$ for $k = 1, \cdots, n$.
(2) $C_k(XY) = C_k(X) C_k(Y)$ for $k = 1, \cdots, n$.
(3) $C_k(X^{-1}) = C_k(X)^{-1}$ for $k = 1, \cdots, n$ if $X$ is invertible.
(4) $C_k(X^p) = C_k(X)^p$ for $k = 1, \cdots, n$ if $X > 0$ and $p \ne 0$.
(5) $\prod_{i=1}^{k} \lambda_i(A) = \lambda_1(C_k(A))$ for $k = 1, \cdots, n$ if $A > 0$.
Consequently, for $A, B > 0$, $A \succ_{(\log)} B$ if and only if $\det A = \det B$ and $\lambda_1(C_k(A)) \ge \lambda_1(C_k(B))$ for $k = 1, \cdots, n$. Incidentally we note that $C_k(A \natural_\alpha B) = C_k(A) \natural_\alpha C_k(B)$ for $A, B > 0$ by (2)-(4), so that matrix inequalities of Ando-Hiai type imply the corresponding log-majorization inequalities.
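Properties (2) and (5) can be observed concretely by representing $C_k(X)$ in the standard basis as the $k$-th compound matrix, whose entries are the $k \times k$ minors of $X$. The sketch below assumes NumPy and real symmetric positive definite test matrices; the function names `compound`, `random_pd` and `top_product_gap` are hypothetical. Multiplicativity is then the Cauchy-Binet formula.

```python
import numpy as np
from itertools import combinations

def compound(X, k):
    """k-th compound matrix: all k x k minors of X, with rows and columns indexed
    by the k-subsets of {0, ..., n-1} in lexicographic order."""
    n = X.shape[0]
    subsets = list(combinations(range(n), k))
    return np.array([[np.linalg.det(X[np.ix_(I, J)]) for J in subsets]
                     for I in subsets])

def random_pd(n, rng):
    """Random symmetric positive definite matrix (illustrative choice)."""
    G = rng.standard_normal((n, n))
    return G @ G.T + 0.5 * np.eye(n)

def top_product_gap(A, k):
    """lambda_1(C_k(A)) minus the product of the k largest eigenvalues of A."""
    eig = np.sort(np.linalg.eigvalsh(A))[::-1]
    Ck = compound(A, k)
    return np.linalg.eigvalsh((Ck + Ck.T) / 2).max() - np.prod(eig[:k])

rng = np.random.default_rng(0)
A, B = random_pd(4, rng), random_pd(4, rng)
print(np.allclose(compound(A @ B, 2), compound(A, 2) @ compound(B, 2)))  # property (2)
print(top_product_gap(A, 2))                                             # property (5)
```

The compound matrix has size $\binom{n}{k}$, so this direct construction is only practical for small $n$.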
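The following log-majorization inequality corresponds to Theorem 2.3:
If $A \natural_\alpha B \le 1$ for $\alpha \in [-1, 0]$ and positive invertible operators $A$ and $B$, then $A^r \natural_\beta B^s \le 1$ for $r \in [0, 1]$ and $s \in [\frac{-2\alpha r}{1-\alpha}, 1]$, where $\beta = \frac{\alpha r}{\alpha r + (1-\alpha)s}$.

Theorem 3.18 For $\alpha \in [-1, 0]$ and $A, B > 0$,

\[ (A \natural_\alpha B)^{\frac{rs}{\alpha r + (1-\alpha)s}} \succ_{(\log)} A^r \natural_\beta B^s \]

holds for $r \in [0, 1]$ and $s \in [\frac{-2\alpha r}{1-\alpha}, 1]$, where $\beta = \frac{\alpha r}{\alpha r + (1-\alpha)s}$.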
Moreover we obtain an extension of [17, Theorem 2.1], which corresponds to Theorem 2.8:
Theorem 3.19 For $\alpha \in [-1, 0]$ and $A, B > 0$,

\[ (A \natural_\alpha B)^{\frac{(1-t+r)s}{\alpha r + (1-\alpha t)s}} \succ_{(\log)} A^{1-t+r} \natural_\beta (A^{1-t} \#_s B) \]

holds for $r \in [0, t]$ and $s \in [\max\{\frac{-t}{p-t}, \frac{-2r-(1-t)}{p-t}\}, 1]$, where $p = \frac1\alpha$ and $\beta = \frac{\alpha(1-t+r)}{\alpha r + (1-\alpha t)s}$.
Proof Theorem 2.8 says that if $A \ge B > 0$ and $t \in [0, 1]$, then

\[ A^{-r+t} \natural_{\frac{1-t+r}{r+(p-t)s}} (A^t \#_s B^p) \le A \]

holds for $p \le -1$, $r \in [0, t]$ and $s \in [\max\{\frac{-t}{p-t}, \frac{-2r-(1-t)}{p-t}\}, 1]$.
Putting $A_1 = A^{-1}$, $B_1 = A^{-\frac12} B^p A^{-\frac12}$ and $p = \frac1\alpha$, it implies that

\[ A_1 \natural_\alpha B_1 \le I \ \Longrightarrow \ A_1^{1-t+r} \natural_\beta (A_1^{1-t} \#_s B_1) \le I. \]

Then we have the conclusion.

4 Concluding Remarks

We mention some remarkable results on the Ando-Hiai inequality. Wada [33] gave the completion of (AH) in the sense that $r \ge 1$ if and only if (AH) holds, i.e., for a fixed $\alpha \in (0, 1)$

\[ A, B > 0, \ A \#_\alpha B \ge I \ \Rightarrow \ A^r \#_\alpha B^r \ge I. \]

He also generalized it as follows: For a fixed $\alpha \in (0, 1)$ and an operator convex function $f$ on $[0, \infty)$ with $f(0) = 0$ and $f(1) = 1$, $f \ge \psi_\alpha(f) := f(t^{\frac{-\alpha}{1-\alpha}})^{\frac{1-\alpha}{-\alpha}}$ if and only if

\[ A, B > 0, \ A \#_\alpha B \ge I \ \Rightarrow \ f(A) \#_\alpha f(B) \ge I. \]

We here note that $t^r$ is operator convex for $1 \le r \le 2$. In [32], he proposed a mean theoretic generalization. For a nonnegative operator monotone function $f$ on $[0, \infty)$ with $f(1) = 1$, $f(t)^r \le f(t^r)$ for $t > 0$ and $r \ge 1$ if and only if

\[ A, B > 0, \ A \,\sigma_f\, B \ge I \ \Rightarrow \ A^r \,\sigma_f\, B^r \ge I \quad \text{for } r \ge 1, \]

where $\sigma_f$ is the operator mean corresponding to the function $f$, i.e., $A \,\sigma_f\, B = A^{\frac12} f(A^{-\frac12} B A^{-\frac12}) A^{\frac12}$, see [24].
1 1 1 1

Another improvement of (AH) is posed by Seo [28]: For a fixed α ∈ (0, 1),

A, B > 0, A#α B ≥ I ⇒ Ar #α B r ≤ ((A#α B)−1 −1 )r−1 I for r ≥ 1.

Reverse inequalities for the Ando-Hiai inequality are presented by several authors, e.g. [26], [30] and [28]. A basic result is as follows: If $M \ge A, B \ge m > 0$ and $h = \frac Mm$, then for each $\alpha \in (0, 1)$

\[ K(h^{2r}, \alpha) \| A \#_\alpha B \|^r \le \| A^r \#_\alpha B^r \| \le \| A \#_\alpha B \|^r \quad \text{for } r \ge 1. \]

There are deep discussions on the Ando-Hiai inequality for $n$-variable operator means in [34] and [22].

References

1. T. Ando, F. Hiai, Log majorization and complementary Golden-Thompson type inequalities.
Linear Algebra Appl. 197/198, 113–131 (1994)
2. N. Bebiano, R. Lemos, J. Providência, Inequalities for quantum relative entropy. Linear
Algebra Appl. 401, 159–172 (2005)
3. J.I. Fujii, E. Kamei, Relative operator entropy in noncommutative information theory. Math. Jpn.
34, 341–348 (1989)
4. M. Fujii, Furuta’s inequality and its mean theoretic approach. J. Oper. Theory 23, 67–72 (1990)
5. M. Fujii, Furuta inequality and its related topics. Ann. Funct. Anal. 1, 28–45 (2010)
6. M. Fujii, T. Furuta, E. Kamei, Furuta’s inequality and its application to Ando’s theorem. Linear
Algebra Appl. 179, 161–169 (1993)
7. M. Fujii, M. Ito, E. Kamei, A. Matsumoto, Operator inequalities related to Ando-Hiai
inequality. Sci. Math. Jpn. 70, 229–232 (2009)
8. M. Fujii, E. Kamei, Mean theoretic approach to the grand Furuta inequality. Proc. Am. Math.
Soc. 124, 2751–2756 (1996)
9. M. Fujii, J. Mićić Hot, J. Pečarić, Y. Seo, Recent Developments of Mond-Pečarić Method in
Operator Inequalities. Monographs in Inequalities, vol. 4 (Element, Zagreb, 2012)
10. M. Fujii, E. Kamei, Ando-Hiai inequality and Furuta inequality. Linear Algebra Appl. 416,
541–545 (2006)
11. M. Fujii, R. Nakamoto, Extensions of Ando-Hiai inequality with negative power. Sci. Math.
Jpn. 83, 211–223 (2020)
12. M. Fujii, A. Matsumoto, R. Nakamoto, Further generalizations of Bebiano-Lemos-Providência
inequality. Adv. Oper. Theory 7, Paper No. 34 (2022)
13. M. Fujii, R. Nakamoto, M. Tominaga, Generalized Bebiano-Lemos-Providência inequalities
and their reverses. Linear Algebra Appl. 426, 33–39 (2007)
14. M. Fujii, Y. Seo, Reverse inequalities of Cordes and Löwner-Heinz inequalities. Nihonkai
Math. J. 16, 145–154 (2005)
15. T. Furuta, $A \ge B \ge 0$ assures $(B^r A^p B^r)^{1/q} \ge B^{(p+2r)/q}$ for $r \ge 0$, $p \ge 0$, $q \ge 1$ with
$(1 + 2r)q \ge p + 2r$. Proc. Am. Math. Soc. 101, 85–88 (1987)
16. T. Furuta, Elementary proof of an order preserving inequality. Proc. Jpn. Acad. 65, 126 (1989)
17. T. Furuta, Extension of the Furuta inequality and Ando-Hiai log-majorization. Linear Algebra
Appl. 219, 139–155 (1995)
18. E. Heinz, Beiträge zur Störungstheorie der Spektralzerlegung. Math. Ann. 123, 415–438 (1951)
19. M. Ito, E. Kamei, Ando-Hiai inequality and a generalized Furuta-type operator function. Sci.
Math. Jpn. 70, 43–52 (2009)
20. M. Ito, E. Kamei, Furuta type inequalities related to Ando-Hiai inequality with negative
powers. Sci. Math. Jpn. 84, 23–32 (2021)
21. E. Kamei, A satellite to Furuta’s inequality. Math. Jpn. 33, 883–886 (1988)

22. M. Kian, M.S. Moslehian, Y. Seo, Variants of Ando-Hiai inequality for operator power means.
Linear Multilinear Algebra 69, 1694–1704 (2021)
23. M. Kian, Y. Seo, Norm inequalities related to the matrix geometric mean of negative power.
Sci. Math. Jpn. (in Editione Electronica), e-2018. article 2018-7
24. F. Kubo, T. Ando, Means of positive linear operators. Math. Ann. 246, 205–224 (1980)
25. K. Löwner, Über monotone Matrixfunktionen. Math. Z. 38, 177–216 (1934)
26. R. Nakamoto, Y. Seo, A complement of the Ando-Hiai inequality and norm inequalities for the
geometric mean. Nihonkai Math. J. 18, 43–50 (2007)
27. G.K. Pedersen, Some operator monotone functions. Proc. Am. Math. Soc. 36, 309–310 (1972)
28. Y. Seo, On a reverse of Ando-Hiai inequality. Banach J. Math. Anal. 4, 87–91 (2010)
29. Y. Seo, Matrix trace inequalities related to the Tsallis relative entropy of negative order. J.
Math. Anal. Appl. 472, 1499–1508 (2019)
30. Y. Seo, M. Tominaga, A complement of the Ando-Hiai inequality. Linear Algebra Appl. 429,
1546–1554 (2008)
31. K. Tanahashi, Best possibility of the Furuta inequality. Proc. Am. Math. Soc. 124, 141–146
(1996)
32. S. Wada, Some ways of constructing Furuta-type inequalities. Linear Algebra Appl. 457, 276–
286 (2014)
33. S. Wada, When does Ando-Hiai inequality hold? Linear Algebra Appl. 540, 234–243 (2018)
34. T. Yamazaki, The Riemannian mean and matrix inequalities related to the Ando-Hiai inequality
and chaotic order. Oper. Matrices 6, 577–588 (2012)
Relative Operator Entropy

Jun Ichi Fujii and Yuki Seo

Abstract The relative operator entropy S(A|B) is an operator version of (the minus
of) the Kullback-Leibler divergence in information theory. It is introduced through an
extension, called solidarities A s B, of the Kubo-Ando operator means A m B, and
moreover it is a tangent vector of the path of geometric means A #t B. So we discuss
mean-like properties and geometric ones in the manifold of the positive invertible
operators. In fact, this path A #t B is a geodesic and S(A|B) is the initial tangent
vector in this manifold with the principal fiber bundle, say the CPR geometry. The
former defines the multivariate power mean and the latter the Karcher mean. Related
to quantum information theory, we discuss the Tsallis operator entropy and its
trace as the secant vector between A and A #t B.

Keywords Relative operator entropy · Solidarity · CPR geometry · Tsallis

entropy

1 Introduction

The relative operator entropy is derived from the Kubo-Ando operator means and
the Uhlmann relative operator entropy [18, 19]. So we begin with the Kubo-Ando
means [35] and the solidarities as their generalization in Sect. 2. We note that a
solidarity can be defined among the positive invertible operators, while it is not
always defined for non-invertible ones.

J. I. Fujii
Department of Educational Collaboration (Sciences; Mathematics and Information), Osaka
Kyoiku University, Kashiwara, Osaka, Japan
e-mail: [email protected]
Y. Seo ()
Department of Mathematics Education, Osaka Kyoiku University, Kashiwara, Osaka, Japan
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
R. M. Aron et al. (eds.), Operator and Norm Inequalities and Related Topics, Trends in Mathematics, https://doi.org/10.1007/978-3-031-02104-6_3

Typical solidarities are given as derivatives at $t = 0$ (i.e., the tangent vectors at the initial point) of differentiable paths $P(t)$ of operator means. In particular, the
relative operator entropy is the tangent vector of a path of the geometric operator
means. We discuss in Sect. 3 an existence condition and related ones.
It is noteworthy that the relative operator entropy is indeed the tangent vector of the geodesic in the manifold of positive invertible operators, whose geometry we call the CPR geometry, named after Corach et al. [9], who first pointed out this fact. Their geometric theory is based on that of fiber bundles, which frees us from complicated coordinate calculations. This geometric view might be due to E. Cartan, and it is discussed in the famous standard text by Kobayashi–Nomizu [33]. Unfortunately this geometry might not be familiar, so we discuss it in detail in Sect. 4, including a general framework of this geometric view. Consequently this consideration gives us a new meaningful representation of the Karcher equation, which defines a multivariate geometric operator mean, the Karcher mean.
In the final section, as an application, we show the importance of the concept
of relative operator entropy in the framework of the matrix theory. We discuss the
Tsallis relative entropy as the secant of a path of geometric matrix means, while the
relative operator entropy as the tangent of it. Here we concentrate on the numerical
version of it since it is now related to various relative entropies in the quantum
information theory.

2 Operator Means and Solidarities

The theory of operator means started with Ando's lecture notes [4] and was established as the Kubo-Ando theory [35]. For positive operators on a Hilbert space, operator means are defined axiomatically: An (operator) connection m is a binary operation on positive operators satisfying the following axioms:
• monotonicity: A1 ≤ A2 and B1 ≤ B2 imply A1 m B1 ≤ A2 m B2 .
• semi-continuity: An ↓ A and Bn ↓ B imply An m Bn ↓ A m B.
• transformer inequality: T ∗ (A m B)T ≤ (T ∗ AT ) m (T ∗ BT ).
An operator mean is a connection m satisfying
• normalization: A m A = A.
It is easy to show that the transformer inequality becomes equality if T is invertible.
For an operator mean $m$, the representing function $f_m(x) = 1\,m\,x$ is operator monotone:

\[ 0 \le A \le B \quad\text{implies}\quad f_m(A) \le f_m(B). \]

This correspondence $m \mapsto f_m$ is bijective. In fact, if $f$ is a continuous nonnegative operator monotone function on $[0,\infty)$ with $f(1) = 1$, then the binary operation $m$ defined by

\[ A\,m\,B = A^{1/2} f\bigl(A^{-1/2} B A^{-1/2}\bigr) A^{1/2} \]

for positive invertible operators $A$ and $B$ induces an operator mean $A\,m\,B$. Kubo and Ando also introduced three operations on operator means. The transpose $m^{\circ}$, the adjoint $m^{*}$ and the dual $m^{\perp}$ are defined by:

\[ A\,m^{\circ} B = B\,m\,A, \qquad f^{\circ}(x) = x f(1/x), \]
\[ A\,m^{*} B = (A^{-1}\,m\,B^{-1})^{-1}, \qquad f^{*}(x) = \frac{1}{f(1/x)}, \]
\[ A\,m^{\perp} B = (B^{-1}\,m\,A^{-1})^{-1}, \qquad f^{\perp}(x) = \frac{x}{f(x)}. \]

Each of the above operations is the composition of the other two. Self-transpose means are called symmetric, and the geometric (operator) mean $\#$ (i.e., the mean corresponding to $f(x) = \sqrt{x}$) is invariant under all the above operations. The arithmetic and the harmonic means, other typical symmetric means, are adjoint (or dual) to each other.
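The correspondence between representing functions and means can be checked numerically on matrices. The following is a minimal sketch in Python with NumPy, not part of the original text: the helper names `funm`, `mean` and `random_pd` are our own, and real symmetric positive definite matrices stand in for positive invertible operators. It checks that the geometric mean is invariant under the transpose and the adjoint, and that the harmonic mean is the adjoint of the arithmetic mean.

```python
import numpy as np

def funm(X, f):
    # Functional calculus f(X) for a real symmetric matrix X (hypothetical helper).
    X = (X + X.T) / 2
    w, V = np.linalg.eigh(X)
    return (V * f(w)) @ V.T

def mean(A, B, f):
    # A m B = A^{1/2} f(A^{-1/2} B A^{-1/2}) A^{1/2} for the representing function f.
    Ah = funm(A, np.sqrt)
    Aih = funm(A, lambda w: 1 / np.sqrt(w))
    return Ah @ funm(Aih @ B @ Aih, f) @ Ah

rng = np.random.default_rng(0)

def random_pd(n):
    # Random, well-conditioned positive definite test matrix.
    G = rng.standard_normal((n, n))
    return G @ G.T + n * np.eye(n)

A, B = random_pd(3), random_pd(3)
inv = np.linalg.inv

geo = lambda x: np.sqrt(x)         # geometric mean
ari = lambda x: (1 + x) / 2        # arithmetic mean
har = lambda x: 2 * x / (1 + x)    # harmonic mean

G = mean(A, B, geo)
geo_transpose_gap = np.linalg.norm(G - mean(B, A, geo))
geo_adjoint_gap = np.linalg.norm(G - inv(mean(inv(A), inv(B), geo)))
ari_har_gap = np.linalg.norm(mean(A, B, har) - inv(mean(inv(A), inv(B), ari)))
print(geo_transpose_gap, geo_adjoint_gap, ari_har_gap)
```

All three gaps should be at the level of rounding error; the dual is then covered as well, since it is the composition of the transpose and the adjoint.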
As an extension of this class of binary operations on the positive invertible
operators, we define a solidarity s by the following axioms:
• right monotonicity: B1 ≤ B2 implies A s B1 ≤ A s B2 .
• right semi-continuity: Bn ↓ B implies A s Bn ↓ A s B.
• transformer inequality: T ∗ (A s B)T ≤ (T ∗ AT ) s (T ∗ BT ).
The corresponding normalized condition for solidarities is the following one which
is often assumed:
• normalization: I s I = 0.
This condition holds for a subclass of solidarities, the derivative solidarities: For a differentiable path of operator means $A\,m_t\,B$ for positive invertible operators $A$ and $B$ and $t \in [0, 1]$ with $A = A\,m_0\,B$ and $B = A\,m_1\,B$, the derivative at $t = 0$ defines a solidarity $s_m$:

\[ A\,s_m\,B = \lim_{t \downarrow 0} \frac{A\,m_t\,B - A}{t}. \]

Then it is clear that $s_m$ satisfies the above normalization.

Though a solidarity can be extended to noninvertible cases, note that it does not always exist as a bounded operator (we discuss this in detail later for the relative operator entropy), in contrast to operator means. For a solidarity $s$, the representing function $f(x) = 1\,s\,x$ is operator monotone on $(0, \infty)$ and its transpose $F(x) = f^{\circ}(x) = x\,s\,1 = x f(1/x)$ is operator concave, as we will show later. The normalization implies $f(1) = 0$. Putting $f_\varepsilon(x) = f(x + \varepsilon)$, we see that $f_\varepsilon$ is operator monotone on $[0, \infty)$ and $\tilde f_\varepsilon(x) = f_\varepsilon(x) - f_\varepsilon(0)$ is nonnegative operator monotone and hence defines an operator mean. So $\tilde f_\varepsilon$ is also operator concave and hence so is $f$. Note that $F(0)$ is a nonnegative number by

\[ \lim_{\varepsilon \to 0} F(\varepsilon) = \lim_{\varepsilon \to 0} \frac{f(1/\varepsilon)}{1/\varepsilon} = \lim_{x \to \infty} \frac{f(x)}{x} = \lim_{x \to \infty} f'(x) \]

by l'Hospital's theorem, where the limit always exists since $f'$ is monotone nonincreasing. In general, $f$ is not defined at $0$, but this fact shows that $F$ can always be defined there.
Lemma 2.1 If $f$ is operator monotone on $(0, \infty)$, then the corresponding solidarity $s$ is defined by $f(x) = 1\,s\,x$ and the transpose $F(x) = x\,s\,1$ is operator concave on $[0, \infty)$. Conversely, if $F$ is operator concave on $[0, \infty)$ and $F(0) \ge 0$, then $F(x) = x\,s\,1$ defines a solidarity, i.e., $f(x) = 1\,s\,x$ is operator monotone on $(0, \infty)$.
Proof Suppose $f$ is operator monotone. We may assume $A$ and $B$ are positive invertible operators. Then $(A + B)^{-1/2} A^{1/2}$ and $(A + B)^{-1/2} B^{1/2}$ are contractions. Since $f$ is also operator concave, the Jensen inequality of Davis-Hansen-Pedersen type [10, 28, 29] shows

\[ (A + B)^{-1/2} \bigl( A^{1/2} f(A^{-1}) A^{1/2} + B^{1/2} f(B^{-1}) B^{1/2} \bigr) (A + B)^{-1/2} \le f\bigl( (A + B)^{-1} + (A + B)^{-1} \bigr) = f\bigl( 2(A + B)^{-1} \bigr). \]

Multiplying by $(A + B)^{1/2}$ from both sides in the above inequality, we have

\[ \frac{A f(A^{-1}) + B f(B^{-1})}{2} \le \frac{A + B}{2}\, f\bigl( 2(A + B)^{-1} \bigr), \]

that is, $\frac{F(A) + F(B)}{2} \le F\bigl( \frac{A + B}{2} \bigr)$, which shows the operator concavity of $F$.
Conversely suppose $F$ is operator concave and $F(0) \ge 0$. Let $A \le B$ for positive invertible operators $A$ and $B$. Then $B^{-1/2} A^{1/2}$ is contractive and then we also have

\begin{align*}
B^{-1/2} f(A) B^{-1/2} &= B^{-1/2} A^{1/2} F(A^{-1}) A^{1/2} B^{-1/2} \\
&\le B^{-1/2} \bigl( A^{1/2} F(A^{-1}) A^{1/2} + (B - A)^{1/2} F(0) (B - A)^{1/2} \bigr) B^{-1/2} \\
&\le F(B^{-1} + 0) = F(B^{-1}),
\end{align*}

so that

\[ f(A) \le B F(B^{-1}) = f(B), \]

which completes the proof.

To see the existence condition of $s$, we consider the tangent function $G_\alpha$ at $x = \alpha$ for $F$. Since

\[ F'(x) = f(1/x) - \frac{1}{x} f'(1/x), \quad\text{or}\quad f'(x) = F(1/x) - \frac{1}{x} F'(1/x), \]

we obtain

\[ G_\alpha(x) = F'(\alpha)(x - \alpha) + F(\alpha) = F'(\alpha) x - \alpha F'(\alpha) + F(\alpha) = F'(\alpha) x + f'(1/\alpha). \]

So we have the following existence condition:

Theorem 2.2 For a solidarity $s$, let $f(x) = 1\,s\,x$ and $F(x) = x\,s\,1$. Then $A\,s\,B$ exists if and only if $G_\alpha(x) = F'(\alpha) x + f'(1/\alpha)$ is bounded below for all $\alpha > 0$, which is precisely expressed as the existence of a selfadjoint operator $C$ with

\[ F'(\alpha) A + f'(1/\alpha) B \ge C \]

for all $\alpha > 0$.

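Proof For $\varepsilon > 0$, put $B_\varepsilon = B + \varepsilon$ and $X_\varepsilon = B_\varepsilon^{-1/2} A B_\varepsilon^{-1/2}$. Suppose $A\,s\,B$ exists. Then

\[ A\,s\,B \le A\,s\,B_\varepsilon = B_\varepsilon^{1/2} F(X_\varepsilon) B_\varepsilon^{1/2} \le B_\varepsilon^{1/2} G_\alpha(X_\varepsilon) B_\varepsilon^{1/2} = F'(\alpha) A + f'(1/\alpha) B_\varepsilon \]

for all $\alpha > 0$. Letting $\varepsilon \downarrow 0$, we have by $f'(x) \ge 0$

\[ A\,s\,B \le F'(\alpha) A + f'(1/\alpha) B, \]

and hence it is bounded below.

Conversely suppose

\[ F'(\alpha) A + f'(1/\alpha) B \ge C \]

for all $\alpha > 0$. Since $f'(1/\alpha) > 0$, we have

\[ F'(\alpha) A + f'(1/\alpha) B_\varepsilon \ge C, \quad\text{that is,}\quad F'(\alpha) X_\varepsilon + f'(1/\alpha) \ge B_\varepsilon^{-1/2} C B_\varepsilon^{-1/2}. \]

The left-hand side of the latter inequality attains $F(X_\varepsilon)$ as its infimum over $\alpha > 0$ (for the relative operator entropy this is $\eta(X_\varepsilon) = -X_\varepsilon \log X_\varepsilon$), that is, $A\,s\,B_\varepsilon \ge C$. Therefore $A\,s\,B_\varepsilon$ is bounded below and $A\,s\,B$ exists with $A\,s\,B \ge C$.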
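As a worked instance of Theorem 2.2, the tangent functions can be written out explicitly for the relative operator entropy. This is our own illustration, not taken from the original text; it assumes only the identification $f = \log$, $F = \eta$ used in Sect. 3.

```latex
% Assumption: s is the relative operator entropy, so that
%   f(x) = 1 s x = \log x   and   F(x) = x s 1 = \eta(x) = -x \log x.
\begin{align*}
F'(\alpha) &= -\log\alpha - 1, \qquad f'(1/\alpha) = \alpha, \\
G_\alpha(x) &= F'(\alpha)\,x + f'(1/\alpha) = -(\log\alpha + 1)\,x + \alpha, \\
\inf_{\alpha > 0} G_\alpha(x) &= G_x(x) = -x \log x = \eta(x) \qquad (x > 0).
\end{align*}
```

The existence condition then reads $\alpha B - (\log\alpha + 1) A \ge C$ for all $\alpha > 0$. Replacing $\alpha$ by $1/\alpha$ gives $\frac{1}{\alpha} B + (\log\alpha) A \ge C + A$, and since $A$ is bounded this is a scalar lower bound of the kind that appears in Corollary 3.5.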


Remark 2.3 In the proof of the above in [17], the only if part is a little ambiguous.
The above proof is a complete version of [17, Theorem 1]. In the case of relative
operator entropy, we showed it in [22].
If $F$ is nonnegative, then $F$ is also operator monotone, so that $F$ defines the transpose of some operator mean. When $s$ is an operator mean, the above property is clear, so that we may assume that $F(x_0) = 0$ for some $x_0 \ge 0$. If the set of zero points $\{x_0\}$ consists of $0$ only, then $F$ is nonpositive and so is $f$. In the case $x_0 > 0$,

\[ f(1/x_0) = \frac{F(x_0)}{x_0} = 0. \]

Then we can normalize

\[ \tilde F(x) = F(x x_0) \quad\text{and}\quad \tilde f(x) = f(x/x_0), \]

so that $\tilde F(1) = F(x_0) = 0$ and $\tilde f(1) = f(1/x_0) = 0$, like derivative solidarities: For a differentiable path of operator means $m_t$, we may define the corresponding solidarity $s_m$ by

\[ A\,s_m\,B = \lim_{t \downarrow 0} \frac{A\,m_t\,B - A}{t} = \frac{\partial}{\partial t}\, A\,m_t\,B \,\Big|_{t = 0} \]

if the limit exists. Since $f_m(1) = 1$, we have $f(1) = F(1) = 0$ for the corresponding solidarity $s$.

3 Relative Operator Entropy

A typical and important example of solidarities is the relative operator entropy.

First we review the relative operator entropy $S(A|B)$ for positive (bounded linear) operators $A$, $B$ on a Hilbert space, see [13, 14, 18–20]. If $B$ is invertible, then it is defined by

\[ S(A|B) = B^{1/2}\, \eta\bigl( B^{-1/2} A B^{-1/2} \bigr)\, B^{1/2}, \]

where $\eta$ is the entropy function:

\[ \eta(x) = -x \log x \ \text{ if } x > 0, \qquad \eta(0) = 0. \]

In addition, if $A$ is invertible, then

\[ S(A|B) = A^{1/2} \log\bigl( A^{-1/2} B A^{-1/2} \bigr) A^{1/2}. \]

Since $S(A|B + \varepsilon)$ is monotone decreasing in the right-hand term as $\varepsilon \downarrow 0$, we define for non-invertible $A$ and $B$

\[ S(A|B) = \text{s-}\lim_{\varepsilon \downarrow 0} S(A|B + \varepsilon) \]

if the limit (in the strong operator topology) exists as a bounded operator. But, in general, $S(A|B)$ does not always exist. On the other hand, based on the fact that $\frac{x^t - 1}{t} \downarrow \log x$ as $t \downarrow 0$, it follows that $\frac{A \#_t B - A}{t}$ is monotone decreasing as $t \downarrow 0$, so that another equivalent definition of Uhlmann's type is the derivative one for the path of geometric means $A \#_t B$:

\[ S(A|B) = \text{s-}\lim_{t \downarrow 0} \frac{A \#_t B - A}{t} \]

if the limit exists. If $A$ and $B$ are commuting and $S(A|B)$ is defined, then

\[ S(A|B) = A \log B - A \log A, \]

in particular, $S(0|B) = 0$ for all positive operators $B \ge 0$. Though we often use unbounded expressions like $\log A$ from now on, these conventions are justified by the boundedness of the total expression, such as $A \log A$. Under the existence, we have the following properties of $S(A|B)$ for positive operators $A$ and $B$ by those for operator means:
Lemma 3.1 Under the existence of relative operator entropies, the following properties like those of operator means hold:
(1) right monotonicity: If $B \le B'$, then $S(A|B) \le S(A|B')$.
(2) transformer inequality: $T^* S(A|B) T \le S(T^* A T \,|\, T^* B T)$ for all $T$ (the equality holds for invertible $T$).
(2') informational monotonicity: $\Phi(S(A|B)) \le S(\Phi(A)|\Phi(B))$ for all normal positive linear maps $\Phi$.
(3) sub-additivity: $S(A_1|B_1) + S(A_2|B_2) \le S(A_1 + A_2 \,|\, B_1 + B_2)$.
(3') joint concavity: For all $t \in [0, 1]$,
$(1 - t) S(A_1|B_1) + t S(A_2|B_2) \le S((1 - t) A_1 + t A_2 \,|\, (1 - t) B_1 + t B_2)$.
(4) upper bound: $S(A|B) \le B - A$.
(5) kernel inclusion: $\ker S(A|B) \supset \ker A$.
(6) orthogonality: $S\bigl( \bigoplus_k A_k \,\big|\, \bigoplus_k B_k \bigr) = \bigoplus_k S(A_k|B_k)$.
(7) affine parametrization: $S(A \,|\, A \#_t B) = t\, S(A|B)$ for all $t \in [0, 1]$.
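For matrices, the two formulas for $S(A|B)$, the derivative definition and some items of Lemma 3.1 can be checked numerically. The following is a minimal sketch with NumPy under our own assumptions: the helper names (`funm`, `power`, `geodesic`, `rel_entropy`, `rel_entropy_eta`, `random_pd`) are hypothetical, and real symmetric positive definite matrices are used as test data.

```python
import numpy as np

def funm(X, f):
    # Functional calculus f(X) for a real symmetric matrix X (hypothetical helper).
    X = (X + X.T) / 2
    w, V = np.linalg.eigh(X)
    return (V * f(w)) @ V.T

def power(X, t):
    return funm(X, lambda w: w ** t)

def geodesic(A, B, t):
    # A #_t B = A^{1/2} (A^{-1/2} B A^{-1/2})^t A^{1/2}
    Ah, Aih = power(A, 0.5), power(A, -0.5)
    return Ah @ power(Aih @ B @ Aih, t) @ Ah

def rel_entropy(A, B):
    # S(A|B) = A^{1/2} log(A^{-1/2} B A^{-1/2}) A^{1/2}
    Ah, Aih = power(A, 0.5), power(A, -0.5)
    return Ah @ funm(Aih @ B @ Aih, np.log) @ Ah

def rel_entropy_eta(A, B):
    # S(A|B) = B^{1/2} eta(B^{-1/2} A B^{-1/2}) B^{1/2} with eta(x) = -x log x
    Bh, Bih = power(B, 0.5), power(B, -0.5)
    return Bh @ funm(Bih @ A @ Bih, lambda w: -w * np.log(w)) @ Bh

rng = np.random.default_rng(1)

def random_pd(n):
    G = rng.standard_normal((n, n))
    return G @ G.T + n * np.eye(n)

A, B = random_pd(3), random_pd(3)
S = rel_entropy(A, B)

# The eta form and the log form agree for invertible A and B.
two_forms_gap = np.linalg.norm(S - rel_entropy_eta(A, B))
# Difference quotient of the path of geometric means at a small t.
t = 1e-6
derivative_gap = np.linalg.norm((geodesic(A, B, t) - A) / t - S)
# (4) upper bound: B - A - S(A|B) should be positive semidefinite.
upper_bound_min_eig = np.linalg.eigvalsh(B - A - S).min()
# (7) affine parametrization at t = 0.3.
affine_gap = np.linalg.norm(rel_entropy(A, geodesic(A, B, 0.3)) - 0.3 * S)
print(two_forms_gap, derivative_gap, upper_bound_min_eig, affine_gap)
```

The difference quotient is only a first-order approximation, so `derivative_gap` is expected to be small but well above rounding error, while the other gaps should vanish up to rounding.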
Here we recall the equality condition in the transformer inequality (2) of Lemma 3.1 [12, Theorem 3]: If $\ker T^* \subset \ker A \cap \ker B$ for an operator $T$, then $T^* (A\,m\,B) T = (T^* A T)\,m\,(T^* B T)$ holds for all operator means $m$. Moreover this equality also holds for $S(A|B)$, since $S(A|B) = \text{s-}\lim_{t \downarrow 0} \frac{A \#_t B - A}{t}$:
Theorem 3.2 Let $A$ and $B$ be positive operators. If $S(A|B)$ exists and $\ker T^* \subset \ker A \cap \ker B$ for an operator $T$, then

\[ T^* S(A|B) T = S(T^* A T \,|\, T^* B T). \]

Then we have one of the (sufficient) conditions under which $S(A|B)$ exists:

Lemma 3.3 If $A$ is majorized by $B$, i.e., $A \le \alpha B$ for some $\alpha > 0$, then $S(A|B)$ exists.
In fact, by Douglas' majorization theorem [10], we have $A^{1/2} = D B^{1/2}$ for some 'derivative' operator $D$ with $\ker D = \ker A \supset \ker B$ and so $\ker B = \ker A \cap \ker B$. Then, for the support projection $P_B$ for $B$, we have $P_B A P_B = A$ and $P_B D^* D P_B = D^* D$. Hence it follows from Theorem 3.2 that

\[ S(A|B) = S(B^{1/2} D^* D B^{1/2} \,|\, B) = B^{1/2} S(D^* D \,|\, P_B) B^{1/2} = B^{1/2}\, \eta(D^* D)\, B^{1/2} \]

and so $S(A|B)$ exists.

It is also shown that the majorization $A \le \alpha B$ is equivalent to the range inclusion

\[ \operatorname{ran} A^{1/2} \subset \operatorname{ran} B^{1/2}. \]

But it is too strong for the existence of $S(A|B)$. In fact, $A$ is not majorized by $A^2$ if $\sigma(A) = [0, 1]$, while we easily see $S(A|A^2) = A \log A$.
Another candidate is the kernel inclusion

ker A ⊃ ker B,

which is weaker than the range inclusion. In fact, the kernel condition does not
guarantee the existence: For B with σ (B) = [0, 1] where 0 is not an eigenvalue, it
follows that S(I |B) = log B diverges while both kernels are trivial.
The third condition, between the above ones, is $B$-absolute continuity in the sense of Ando's Lebesgue decomposition [3]:

\[ A = [B]A \equiv \text{s-}\lim_{n \to \infty} A : nB \]

where $A : B$, defined by

\[ \langle (A : B) z, z \rangle = \inf_{x + y = z} \bigl[ \langle A x, x \rangle + \langle B y, y \rangle \bigr], \tag{†} \]

is the parallel addition [2], which is the half of the harmonic mean $A\,h\,B$ [4]. Kosaki [34] showed that

\[ [B]A = A^{1/2} P_M A^{1/2} \]

for the projection $P_M$ onto the closed subspace

\[ M = \{ y \mid A^{1/2} y \in \operatorname{ran} B \}. \]
This result implies $A = [B]A = \lim_{t \downarrow 0} A \#_t B$, and hence $B$-absolute continuity guarantees the continuity of $A \#_t B$ at $t = 0$, and it is a necessary condition for the existence of $S(A|B)$ as the above derivative. In fact, this continuity is in the norm topology:
Lemma 3.4 If $S(A|B)$ exists, then $A \#_t B$ converges uniformly to $A$ as $t \downarrow 0$.
Proof Since there are scalars $c_1$ and $c_2$ with

\[ c_1 \le S(A|B) \le \frac{A \#_t B - A}{t} \le B - A \le c_2 \]

for all $t \in (0, 1)$, we have $t c_1 \le A \#_t B - A \le t c_2$, so that the required convergence follows.

Since ker A#t B ⊃ ker A ∨ ker B for all t ∈ (0, 1) as in [14] (as we will see later,
these are equal indeed) and it is related to the ranges, it is a stronger condition than
the kernel inclusion. But it is weaker than the existence condition: If A is the range
projection PB for B with σ (B) = [0, 1], then S(PB |B) = PB log B is not bounded.
The existence condition for solidarity shows that of S(A|B):
Corollary 3.5 The relative operator entropy $S(A|B)$ exists if and only if there exists a real number $c$ with $\frac{1}{\alpha} B + (\log \alpha) A \ge c$ for all $\alpha > 0$.
Summing up, we have the following relations around the existence condition:
Theorem 3.6 The implications (1) ⇒ (2) ⇒ (3) ⇒ (4) hold among the following conditions for a pair $A, B \ge 0$, and each converse does not always hold.
(1) majorization or range inclusion: $\exists \alpha > 0$; $A \le \alpha B$, i.e., $\operatorname{ran} A^{1/2} \subset \operatorname{ran} B^{1/2}$.
(2) existence condition: $S(A|B)$ exists as a bounded operator, i.e., $\frac{1}{\alpha} B + (\log \alpha) A > \exists c \in \mathbb{R}$ $(\forall \alpha > 0)$.
(3) $B$-absolute continuity: $A = [B]A = A^{1/2} P_M A^{1/2} = \lim_{t \downarrow 0} A \#_t B$.
(4) kernel inclusion: $\ker A \supset \ker B$.
Remark 3.7 If both ranges of $A$ and $B$ are closed, in particular, for the case of matrices, the above conditions in Theorem 3.6 are mutually equivalent since the relation $\operatorname{ran} A^{1/2} = \operatorname{ran} A = (\ker A)^{\perp}$ holds for all positive operators $A$ with closed range.

4 CPR Geometry

The path of geometric means $m_t$ and the relative operator entropy are important concepts in the geometric structure of the manifold of the positive invertible operators discussed by Corach et al. [8, 9], which we call the CPR geometry. In this section, we first describe the general structure of the geometry of fiber bundles for readers not familiar with it. Here the CPR geometry means the one on the Finsler manifold $\mathcal{A}^+$, the positive invertible elements in a unital C*-algebra $\mathcal{A}$, which we review below. Corach himself reformulated it in [8]: The base manifold is $\mathcal{A}^+$ with the tangent vector bundle $\mathcal{A}^h$ (the selfadjoint operators in $\mathcal{A}$). As we confirm later, the set of invertible elements $\mathcal{G} = G(\mathcal{A})$ is the principal fiber bundle (of $\mathcal{A}^+$). Thus the total space $P = \{\mathcal{G}, \mathcal{A}^+, \mathcal{U}, \pi\}$ is defined by
projection: $\pi\colon \mathcal{G} \to \mathcal{A}^+$, $g \mapsto g g^*$
structure group: the unitary group $\mathcal{U} = U(\mathcal{A})$
principal fiber: $\pi^{-1}(A) = A^{1/2}\, \mathcal{U}$
This definition of the fiber, which is homeomorphic to $\mathcal{U}$, is consistent. In fact, for each unitary $U$, we obtain

\[ \pi(A^{1/2} U) = A^{1/2} U (A^{1/2} U)^* = A^{1/2} U U^* A^{1/2} = A. \]

Conversely, take $g \in \pi^{-1}(A)$ for some $A \in \mathcal{A}^+$. Then, the polar decomposition of the adjoint $g^*$, which is the fundamental idea of the above correspondence, is $g^* = U^* (g g^*)^{1/2} = U^* A^{1/2}$ and hence $g = A^{1/2} U$. So the principal fiber bundle is the set of invertible operators $\mathcal{G}$, written as

\[ \mathcal{G} = \{ A^{1/2} U \mid A \in \mathcal{A}^+,\ U \in \mathcal{U} \}. \]

As in Fig. 1, the projection $\pi\colon \mathcal{G} \to \mathcal{A}^+$ naturally induces the derivative $\pi_*$ from the tangent bundle $T(\mathcal{G})$ to the tangent vector bundle $T(\mathcal{A}^+) = \mathcal{A}^h$.

Fig. 1 Basic concept of the principal fiber bundle $P = \{\mathcal{G}, \mathcal{A}^+, \mathcal{U}, \pi\}$

Fig. 2 Connection and the horizontal lift of $\gamma$

When we regard $\mathcal{G}$ as an upper structure of the base space $\mathcal{A}^+$, the kernel $\ker \pi_*$ consists of the vectors that are 'vertical' with respect to $\pi_*$. So the subspace $\ker \pi_* \cap T_g$ of the tangent space $T_g(\mathcal{G})$ at $g \in \mathcal{G}$ is called the vertical space $V_g$. In this case, since the Lie algebra of a unitary group consists of the skew-hermitian elements, we have

\[ V_g = \{ g X \mid X^* = -X \}. \]

In $T_g(\mathcal{G})$, if a 'horizontal space' $H_g$ is given (and it is compatible with the right action of $\mathcal{U}$: $H_{gU} = H_g U$), then the principal fiber bundle $\mathcal{G}$ is said to have a connection (in the sense of manifolds). In the CPR geometry, it is naturally determined by $H_g = g \mathcal{A}^h$ considering the vertical one $V_g$. In fact, the compatibility with the right action is shown as follows: Take $gH \in H_g$ with $H = H^*$. Then, for all $U \in \mathcal{U}$, $g H U = g U\, U^* H U$ with $U^* H U \in \mathcal{A}^h$, which shows $H_g U = H_{gU}$.
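Now, for a (differentiable) curve $\gamma(t)$ on $\mathcal{A}^+$, consider a horizontal lift $\Gamma(t) \in \pi^{-1}(\gamma(t))$. The term 'lift' means

\[ \gamma(t) = \pi(\Gamma(t)) = \Gamma(t) \Gamma(t)^* \]

and the term 'horizontal' means that the tangent vector $\dot\Gamma(t) \in H_{\Gamma(t)}$. In other words, $\Gamma(t)^{-1} \dot\Gamma(t)$ is hermitian, i.e.,

\[ \dot\Gamma^*(t) (\Gamma^*)^{-1}(t) = \bigl( \Gamma(t)^{-1} \dot\Gamma(t) \bigr)^* = \Gamma(t)^{-1} \dot\Gamma(t). \]

This is equivalent to $\Gamma \dot\Gamma^* = \dot\Gamma \Gamma^*$. Then we have

\[ \dot\gamma = \dot\Gamma \Gamma^* + \Gamma \dot\Gamma^* = 2 \dot\Gamma \Gamma^* \]

and hence

\[ \dot\gamma \gamma^{-1} = (2 \dot\Gamma \Gamma^*) (\Gamma^*)^{-1} \Gamma^{-1} = 2 \dot\Gamma \Gamma^{-1}, \]

which is called the transport equation, and it characterizes a horizontal lift.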

As we will see later, the geodesic from $A$ to $B$ is given by the path $\gamma(t) = A \#_t B$ of the geometric means. One of the horizontal lifts of the geodesic is given by the following simple form, which is our unpublished result:
Lemma 4.1 The curve $\Gamma(t) = A^{1/2} C^{t/2}$ for $C = A^{-1/2} B A^{-1/2}$ is a horizontal lift of $\gamma(t) = A \#_t B$.
Proof The transport equation follows from:

\begin{align*}
2 \dot\Gamma(t) \Gamma(t)^{-1} &= A^{1/2} (\log C) C^{t/2} C^{-t/2} A^{-1/2} = A^{1/2} (\log C) A^{-1/2}, \\
\dot\gamma(t) \gamma(t)^{-1} &= A^{1/2} (\log C) C^{t} A^{1/2} A^{-1/2} C^{-t} A^{-1/2} = A^{1/2} (\log C) A^{-1/2}.
\end{align*}
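The lift of Lemma 4.1 can also be tested numerically. The sketch below is our own illustration with NumPy, using hypothetical helper names and central finite differences in place of exact derivatives. It checks the lift property $\gamma = \Gamma\Gamma^*$, the horizontality of $\dot\Gamma$, the transport equation, and the geodesic equation $\ddot\gamma = \dot\gamma\gamma^{-1}\dot\gamma$ for $\gamma(t) = A \#_t B$.

```python
import numpy as np

def funm(X, f):
    # Functional calculus f(X) for a real symmetric matrix X (hypothetical helper).
    X = (X + X.T) / 2
    w, V = np.linalg.eigh(X)
    return (V * f(w)) @ V.T

def power(X, t):
    return funm(X, lambda w: w ** t)

rng = np.random.default_rng(4)

def random_pd(n):
    G = rng.standard_normal((n, n))
    return G @ G.T + n * np.eye(n)

A, B = random_pd(3), random_pd(3)
Ah, Aih = power(A, 0.5), power(A, -0.5)
C = Aih @ B @ Aih

def lift(t):
    # Gamma(t) = A^{1/2} C^{t/2}
    return Ah @ power(C, t / 2)

def gamma(t):
    # gamma(t) = A #_t B
    return Ah @ power(C, t) @ Ah

def ddt(F, t, h=1e-5):
    # Central finite difference, an approximation of the derivative.
    return (F(t + h) - F(t - h)) / (2 * h)

t, inv, norm = 0.3, np.linalg.inv, np.linalg.norm
Gam, dGam = lift(t), ddt(lift, t)
gam, dgam = gamma(t), ddt(gamma, t)

lift_gap = norm(Gam @ Gam.T - gam)            # gamma = Gamma Gamma^*
H = inv(Gam) @ dGam
horizontal_gap = norm(H - H.T)                # Gamma^{-1} Gamma' is hermitian
transport_gap = norm(dgam @ inv(gam) - 2 * dGam @ inv(Gam))
h2 = 1e-3
d2gam = (gamma(t + h2) - 2 * gam + gamma(t - h2)) / h2 ** 2
geodesic_gap = norm(d2gam - dgam @ inv(gam) @ dgam)
print(lift_gap, horizontal_gap, transport_gap, geodesic_gap)
```

The finite-difference step sizes are ad hoc choices, so the last three gaps are small but not at rounding level.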

In order to obtain further results in this geometry, we identify the tangent vector bundle, which is $\mathcal{A}^h$, as the associated vector bundle for $\mathcal{G}$. Note that $\mathcal{G}$ acts on $A \in \mathcal{A}^+$ by $A \mapsto g A g^*$ and hence $X \mapsto g X g^* \equiv \rho(g) X$. Regarding $\mathcal{A}^h$ as the tangent vector space, consider the associated bundle: it is the quotient bundle $\mathcal{G} \times_\rho \mathcal{A}^h$ of $\mathcal{G} \times \mathcal{A}^h$ with the equivalence relation of the right action by $f \in \mathcal{G}$

\[ (g, X) f = (g f, \rho(f) X) \sim (g, X) \sim (I_{\mathcal{A}}, \rho(g^{-1}) X). \]

Roughly speaking, at the point $\pi(g)$, we see $\rho(g^{-1}) X = g^{-1} X (g^*)^{-1}$ as a tangent vector. Considering the connection of $\mathcal{G}$, we reflect it on the tangent bundle via a horizontal lift: If $\gamma$ is a path on $\mathcal{A}^+$ and $\Gamma$ is a horizontal lift of it, then we observe $\Gamma^{-1}(t) \dot\gamma(t) (\Gamma(t)^{-1})^*$ instead of the tangent vector $\dot\gamma(t)$, and thereby the translation $\Gamma(t) \Gamma(0)^{-1}\colon \Gamma(0) \mapsto \Gamma(t)$ yields the parallel displacement of a tangent vector $X$ along $\gamma$ from $0$ to $t$:

\[ P_t X = \Gamma(t) \Gamma(0)^{-1} X (\Gamma(0)^*)^{-1} \Gamma(t)^*. \]

Note that

\[ \dot\Gamma \Gamma^{-1} = \frac{1}{2} \dot\gamma \gamma^{-1} \]

by the transport equation, and recall the noncommutative inverse differential formula

\[ (\Gamma^{-1})^{\cdot} = -\Gamma^{-1} \dot\Gamma \Gamma^{-1}. \]

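So the covariant derivative $D_t$ of a tangent field $X(t)$ along the curve $\gamma(t)$ in $\mathcal{A}^+$ is given by

\begin{align*}
D_t X &= \lim_{\varepsilon \to 0} \frac{\Gamma(t) \Gamma(t + \varepsilon)^{-1} X(t + \varepsilon) (\Gamma(t + \varepsilon)^*)^{-1} \Gamma(t)^* - X(t)}{\varepsilon} \\
&= \lim_{\varepsilon \to 0} \Bigl[ \Gamma(t) \Gamma(t + \varepsilon)^{-1} \cdot \frac{X(t + \varepsilon) - X(t)}{\varepsilon} \cdot (\Gamma(t + \varepsilon)^*)^{-1} \Gamma(t)^* \\
&\qquad + \Gamma(t) \cdot \frac{\Gamma(t + \varepsilon)^{-1} - \Gamma(t)^{-1}}{\varepsilon} \cdot X(t) (\Gamma(t + \varepsilon)^*)^{-1} \Gamma(t)^* \\
&\qquad + X(t) \cdot \frac{(\Gamma(t + \varepsilon)^*)^{-1} - (\Gamma(t)^*)^{-1}}{\varepsilon} \cdot \Gamma(t)^* \Bigr] \\
&= \dot X - \bigl( \dot\Gamma \Gamma^{-1} X + X (\dot\Gamma \Gamma^{-1})^* \bigr)(t) = \dot X - \frac{1}{2} \bigl( \dot\gamma \gamma^{-1} X + X \gamma^{-1} \dot\gamma \bigr)(t).
\end{align*}

Then, as the self-parallel curve, the geodesic equation is

\[ O = D_t \dot\gamma = \ddot\gamma - \dot\gamma \gamma^{-1} \dot\gamma \]

and we can obtain the geodesic: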

Theorem 4.2 (Corach-Porta-Recht) The geodesic from $A$ to $B$ in the above geometry in $\mathcal{A}^+$ is the path of geometric Kubo-Ando means:

\[ \gamma(t) \equiv A \#_t B = A^{1/2} (A^{-1/2} B A^{-1/2})^{t} A^{1/2}. \]

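To see this, we confirm the following approximation:

Lemma 4.3 Let $f$ be a (norm differentiable) path in $\mathcal{A}^+$. If $f'(t)$ commutes with $f(t)$ for each $t \in [0, 1]$, then

\[ \frac{d}{dt} \log f(t) = f'(t) f^{-1}(t). \]

Proof For arbitrary $\varepsilon > 0$, there is $\delta_\varepsilon > 0$ with

\[ \Bigl\| \frac{f(t + \delta_\varepsilon) - f(t)}{\delta_\varepsilon} - f'(t) \Bigr\| < \varepsilon, \]

or equivalently

\[ \delta_\varepsilon (f'(t) - \varepsilon) \le f(t + \delta_\varepsilon) - f(t) \le \delta_\varepsilon (f'(t) + \varepsilon). \]

We can choose $\delta_\varepsilon$ such that $\delta_\varepsilon \downarrow 0$ if $\varepsilon \downarrow 0$. Putting $h = \delta_\varepsilon (f'(t) + \varepsilon)$, we see that $h$ commutes with $f(t)$ and hence

\begin{align*}
\frac{\log f(t + \delta_\varepsilon) - \log f(t)}{\delta_\varepsilon} &\le \frac{\log(f(t) + h) - \log f(t)}{\delta_\varepsilon} = \frac{\log(f(t) + h) - \log f(t)}{h} \cdot \frac{h}{\delta_\varepsilon} \\
&= \frac{\log(f(t) + h) - \log f(t)}{h}\, (f'(t) + \varepsilon) \longrightarrow f(t)^{-1} f'(t)
\end{align*}

as $\varepsilon \to 0$, since then $\delta_\varepsilon \to 0$ and $h \to 0$. Thus $(\log f(t))' \le f'(t) f(t)^{-1}$.

Similarly, considering the left-hand inequality in the above estimate, we have $(\log f(t))' \ge f'(t) f(t)^{-1}$, so that $(\log f(t))' = f'(t) f(t)^{-1}$.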

Then we give a proof of Theorem 4.2:
Proof of Theorem 4.2 By the geodesic equation, for

\[ f(t) = \gamma(0)^{-1/2} \gamma(t) \gamma(0)^{-1/2}, \]

we have $f(0) = I$ and $f$ also satisfies the geodesic equation (indeed, it is a curve from $I$ to $A^{-1/2} B A^{-1/2}$). By $f'' f^{-1} = (f' f^{-1})^2$, we have

\[ (f' f^{-1})' = f'' f^{-1} + f' (f^{-1})' = f'' f^{-1} - (f' f^{-1})^2 = 0 \]

and then there exists $C$ with $f' f^{-1} = C$. Since $C = f'(0) f(0)^{-1} = f'(0)$, we have $C = C^*$ and hence $f(t)$ and $f'(t)$ (also $C$) are commuting for all $t \in [0, 1]$. Note that $(\log f)' = f' f^{-1} = C$ by the above lemma, so that there exists an operator $D$ with $\log f(t) = tC + D$. Moreover, the case $t = 0$ shows $D = 0$: $f(t) = e^{tC}$. Thereby

\[ \gamma(t) = \gamma(0)^{1/2} e^{tC} \gamma(0)^{1/2}. \]

Considering the terminal conditions $A = \gamma(0)$ and $B = \gamma(1) = A^{1/2} e^{C} A^{1/2}$ shows

\[ \gamma(t) = A^{1/2} (A^{-1/2} B A^{-1/2})^{t} A^{1/2} = A \#_t B. \]

The parallel transport of the tangent vector $X$ at $\gamma(t_1)$ to that at $\gamma(t_2)$ along the geodesic $\gamma$ is denoted by $P_{t_1}^{t_2} X$. Now by Lemma 4.1, we can obtain the parallel transport along the geodesic:
Theorem 4.4 The parallel transport of the tangent vector $X$ at $A$ to that at $B$ along the geodesic is given by

\[ P_0^1 X = A^{1/2} C^{1/2} A^{-1/2}\, X\, A^{-1/2} C^{1/2} A^{1/2} \]

for $C = A^{-1/2} B A^{-1/2}$.
In particular, in the case $B = I$, we have $P_0^1 X = A^{-1/2} X A^{-1/2} = \rho^{-1}(A^{1/2}) X$. This manifold is a symmetric space and hence a homogeneous space: $\mathcal{A}^+ = \mathcal{G}/\mathcal{U}$. Thus properties around the identity $I$ reflect on those around other points. Moreover, every symmetric space is geodesically complete, that is, the domain $[0, 1]$ is extended to $\mathbb{R}$:

\[ \gamma(t) = A\, \natural_t\, B \equiv A^{1/2} (A^{-1/2} B A^{-1/2})^{t} A^{1/2} \]

for $t \in \mathbb{R}$.
From this viewpoint, the definition

\[ L(X; A) = \|X\|_A = \|A^{-1/2} X A^{-1/2}\| \]

at each point $A \in \mathcal{A}^+$ makes the above manifold $\mathcal{A}^+$ a Finsler space with a Finsler metric, as in the main result of CPR. Since $\|X\|_A$ is equivalent to the operator norm $\|X\|$, it is a Finsler metric if the

Finsler condition: $L(P_t X; \gamma(t)) = \|P_t X\|_{\gamma(t)} = \|X\|_{\gamma(0)} = L(X; \gamma(0))$

holds for all curves $\gamma$ and parallel transports $P_t$ along $\gamma$. The CPR geometry does not always determine a unique Finsler metric. In fact, we show that each unitarily invariant norm $|||\cdot|||$ also gives a Finsler metric for the CPR geometry:
Theorem 4.5 For a unitarily invariant norm $|||\cdot|||$ on $\mathcal{A}$, the function

\[ L_{|||\cdot|||}(X; A) = |||X|||_A = |||A^{-1/2} X A^{-1/2}||| \]

determines a Finsler metric on $\mathcal{A}^+$ for the CPR geometry.

Proof Since $U_t = \gamma(t)^{-1/2} \Gamma(t)$ defines a unitary for each $t$ by $\gamma = \Gamma \Gamma^*$, the Finsler condition is satisfied by

\begin{align*}
|||P_t X|||_{\gamma(t)} &= |||U_t U_0^* \gamma(0)^{-1/2} X \gamma(0)^{-1/2} U_0 U_t^*||| \\
&= |||\gamma(0)^{-1/2} X \gamma(0)^{-1/2}||| = |||X|||_{\gamma(0)}.
\end{align*}

A Finsler metric $L_{|||\cdot|||}(X; A)$, which is called a unitarily invariant Finsler one, is homogeneous like $L(X; A)$:
Theorem 4.6 For any invertible operator $Y$,

\[ L_{|||\cdot|||}(Y^* X Y; Y^* A Y) = L_{|||\cdot|||}(X; A). \]

Proof Since $|||Z||| = |||\,|Z|\,||| = |||\sqrt{Z^* Z}||| = |||\sqrt{Z Z^*}|||$, we have

\begin{align*}
L_{|||\cdot|||}(Y^* X Y; Y^* A Y) &= |||(Y^* A Y)^{-1/2} Y^* X Y (Y^* A Y)^{-1/2}||| \\
&= |||\sqrt{(Y^* A Y)^{-1/2} Y^* X Y (Y^* A Y)^{-1} Y^* X Y (Y^* A Y)^{-1/2}}\,||| \\
&= |||\sqrt{(Y^* A Y)^{-1/2} Y^* X A^{-1} X Y (Y^* A Y)^{-1/2}}\,||| \\
&= |||\sqrt{A^{-1/2} X Y (Y^* A Y)^{-1} Y^* X A^{-1/2}}\,||| \\
&= |||\sqrt{A^{-1/2} X A^{-1} X A^{-1/2}}\,||| = |||A^{-1/2} X A^{-1/2}||| \\
&= L_{|||\cdot|||}(X; A).
\end{align*}
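The invariance in Theorem 4.6 is easy to observe numerically for matrices. The following NumPy sketch is our own illustration under stated assumptions: the helper names `inv_sqrt` and `finsler` are hypothetical, the matrices are real (so $Y^* = Y^T$), and three common unitarily invariant norms are tried (operator, Hilbert-Schmidt and trace norm).

```python
import numpy as np

def inv_sqrt(A):
    # A^{-1/2} for a real symmetric positive definite matrix A (hypothetical helper).
    A = (A + A.T) / 2
    w, V = np.linalg.eigh(A)
    return (V / np.sqrt(w)) @ V.T

def finsler(X, A, kind):
    # L(X; A) = ||| A^{-1/2} X A^{-1/2} ||| for the chosen unitarily invariant norm.
    R = inv_sqrt(A)
    return np.linalg.norm(R @ X @ R, kind)

rng = np.random.default_rng(2)
G = rng.standard_normal((3, 3))
A = G @ G.T + 3 * np.eye(3)          # base point in the positive definite cone
S = rng.standard_normal((3, 3))
X = S + S.T                          # selfadjoint tangent vector
Y = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 3.0],
              [1.0, 0.0, 1.0]])      # invertible (det = 5), not symmetric

norms = [("operator", 2), ("Hilbert-Schmidt", "fro"), ("trace", "nuc")]
gaps = {name: abs(finsler(Y.T @ X @ Y, Y.T @ A @ Y, kind) - finsler(X, A, kind))
        for name, kind in norms}
print(gaps)
```

All three gaps should be at rounding level, reflecting that $A^{-1/2} X A^{-1/2}$ and its transformed counterpart have the same eigenvalues.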

In particular, the case of the Hilbert-Schmidt norm shows that it is a Riemannian manifold, which is discussed by Bhatia-Holbrook [7].
Finally in this section, we emphasise that the relative operator entropy $S(A|B)$ is the (initial) tangent vector of the geodesic $A \#_t B$, and adding the initial point $A$, we reconstruct the geodesic by the exponential map Exp of this manifold:

\[ \operatorname{Exp}_A tS(A|B) \equiv \rho(A^{1/2}) \exp\bigl( t\, \rho^{-1}(A^{1/2}) (S(A|B)) \bigr) = A \#_t B. \]
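This reconstruction can be sketched numerically as follows. The code is our own illustration with NumPy and hypothetical helper names; it uses $\rho(g)X = gXg^*$, so that $\operatorname{Exp}_A(V) = A^{1/2} \exp(A^{-1/2} V A^{-1/2}) A^{1/2}$, and compares it with the path $A^{1/2} C^t A^{1/2}$, including parameters outside $[0, 1]$.

```python
import numpy as np

def funm(X, f):
    # Functional calculus f(X) for a real symmetric matrix X (hypothetical helper).
    X = (X + X.T) / 2
    w, V = np.linalg.eigh(X)
    return (V * f(w)) @ V.T

def power(X, t):
    return funm(X, lambda w: w ** t)

rng = np.random.default_rng(5)

def random_pd(n):
    G = rng.standard_normal((n, n))
    return G @ G.T + n * np.eye(n)

A, B = random_pd(3), random_pd(3)
Ah, Aih = power(A, 0.5), power(A, -0.5)
C = Aih @ B @ Aih
S = Ah @ funm(C, np.log) @ Ah        # relative operator entropy S(A|B)

def exp_map(V):
    # Exp_A(V) = A^{1/2} exp(A^{-1/2} V A^{-1/2}) A^{1/2}
    return Ah @ funm(Aih @ V @ Aih, np.exp) @ Ah

ts = (-0.5, 0.0, 0.3, 1.0, 2.0)
gaps = [np.linalg.norm(exp_map(t * S) - Ah @ power(C, t) @ Ah) for t in ts]
endpoint_gap = np.linalg.norm(exp_map(S) - B)
print(gaps, endpoint_gap)
```

In particular $\operatorname{Exp}_A S(A|B) = B$, so the pair (initial point, initial tangent vector) determines the terminal point.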

Thus this geometric consideration says that the Karcher equation should be

\[ \sum_n w(n)\, S(A_n|X) = O \]

as a barycenter of the terminal tangent vectors, which is the geometric meaning of the Karcher equation, cf. [37, 38]. It is also consistent with the power mean equation

\[ X = \sum_n w(n)\, (A_n \#_t X), \]

or

\[ \sum_n w(n)\, \frac{A_n \#_t X - X}{t} = O. \]

5 Tsallis Relative Entropy

In this section, we consider quantum relative entropies in the framework of matrix theory. Let $M_n(\mathbb{C}) = M_n$ be the algebra of $n \times n$ complex matrices, $P_n$ the set of positive definite matrices in $M_n$ and Tr the usual trace. We denote the set of all density matrices (positive definite matrices with trace one) by $S_n(\mathbb{C}) = S_n$. As a quantum extension of the Shannon entropy [43], von Neumann [46] defined the entropy of the density matrix $A$ in $S_n$ by the formula

\[ S(A) = \operatorname{Tr}[\eta(A)], \]

where the entropy function is $\eta(t) = -t \log t$. As for the Shannon entropy, it is extremely useful to define a quantum version of the relative entropy. Suppose that $A$ and $B$ are density matrices in $S_n$. As a quantum generalization of the relative entropy due to Kullback and Leibler [36], Umegaki [45] first introduced in 1962, in the setting of von Neumann algebras, the quantum relative entropy of $A$ with respect to $B$, which is defined by

\[ S_U(A|B) = \begin{cases} \operatorname{Tr}[A(\log A - \log B)] & \text{if } \operatorname{supp} A \subset \operatorname{supp} B, \\ +\infty & \text{otherwise.} \end{cases} \tag{1} \]

We call (1) the Umegaki relative entropy. In [1], for any $A$ and $B$ in $S_n$, the Tsallis relative entropy of $A$ to $B$ is defined by

\[ D_\alpha(A|B) = \frac{1 - \operatorname{Tr}[A^{1-\alpha} B^{\alpha}]}{\alpha} = \operatorname{Tr}[A^{1-\alpha} (\ln_\alpha A - \ln_\alpha B)] \tag{2} \]

for any $0 < \alpha \le 1$, where $\ln_\alpha t = \frac{t^{\alpha} - 1}{\alpha}$ is the $\alpha$-logarithmic function. The Tsallis relative entropy (2) is a 1-parameter extension of (1), and Ruskai and Stillinger [41] showed the following relation between the Tsallis relative entropy and the Umegaki relative entropy:

\[ D_\alpha(A|B) \le S_U(A|B) \le D_{-\alpha}(A|B) \]

for all $0 < \alpha \le 1$, and $\lim_{\alpha \to 0} D_\alpha(A|B) = S_U(A|B)$.

On the other hand, there is another formulation of the quantum relative entropy. By Theorem 4.2, the path $\gamma(t) = A \#_t B$ for $t \in \mathbb{R}$ is the geodesic from $A$ to $B$ with $\gamma(0) = A$ and $\gamma(1) = B$, and the relative operator entropy $S(A|B)$ is the velocity vector of the geodesic $A \#_t B$ at $t = 0$. By virtue of the relative operator entropy, we define the quantum relative entropy as

\[ S_{FK}(A|B) = -\operatorname{Tr}[S(A|B)]. \tag{3} \]

The quantum quantity $\operatorname{Tr}[A \log(A^{1/2} B^{-1} A^{1/2})]$ was first proposed by Belavkin and Staszewski [6] in the framework of C*-algebras. Since we treat $S_{FK}(A|B)$ as the minus of the trace of the relative operator entropy $S(A|B)$, we call (3) the FK relative entropy; it is called the BS relative entropy in [39, p. 125]. If $A$ and $B$ commute, then we have $S_U(A|B) = S_{FK}(A|B)$. Generally, the two quantum formulations of the relative entropy are different. In fact, Hiai and Petz [30] showed the following relation:

\[ S_U(A|B) \le S_{FK}(A|B) \tag{4} \]

and a 1-parameter extension of (4):

\[ S_U(A|B) \le -\frac{1}{q} \operatorname{Tr}[A^{1-q} S(A^{q}|B^{q})] \quad\text{for all } q > 0. \tag{5} \]

In fact, if $q = 1$ in (5), then we have the Hiai-Petz inequality (4), and as $q \to 0$ the right-hand side of (5) converges to the Umegaki relative entropy $S_U(A|B)$.
Moreover, Yanagi et al. [47] have been advancing research on the Tsallis relative operator entropy as an operator generalization of the Tsallis relative entropy, which is regarded as a 1-parameter extension of the relative operator entropy: For positive definite matrices $A$ and $B$ in $P_n$, the Tsallis relative operator entropy is defined by

\[ T_\alpha(A|B) = \frac{A\, \#_\alpha\, B - A}{\alpha} \quad\text{for } 0 < \alpha \le 1. \]

We recall the notation $\natural_\alpha$ for the binary operation

\[ A\, \natural_\alpha\, B = A^{1/2} (A^{-1/2} B A^{-1/2})^{\alpha} A^{1/2} \quad\text{for } \alpha \notin [0, 1], \]

which has its formula in common with $\#_\alpha$. Then the Tsallis relative operator entropy of negative order is defined by

\[ T_\alpha(A|B) = \frac{A\, \natural_\alpha\, B - A}{\alpha} \quad\text{for } \alpha \in [-1, 0). \]

For convenience, we denote another quantum Tsallis relative entropy of order $\alpha \in [-1, 1] \setminus \{0\}$ by

\[ NT_\alpha(A|B) = -\operatorname{Tr} T_\alpha(A|B) \]

for positive definite matrices $A$ and $B$ in $P_n$. For the two relative entropies, we know the following two relations:

\[ NT_\alpha(A|B) \le S_{FK}(A|B) \le NT_{-\alpha}(A|B) \quad\text{for all } \alpha \in (0, 1] \]

(see Fig. 3) and

\[ D_\alpha(A|B) \le NT_\alpha(A|B) \quad\text{for all } \alpha \in [-1, 1]. \tag{6} \]

Fig. 3 Geometric structure of Tsallis relative entropy
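The chain of inequalities between these entropies can be observed on random density matrices. The following NumPy sketch is our own illustration, with hypothetical helper names and real symmetric density matrices as test data; it checks (4), the Ruskai-Stillinger relation, the relation $NT_\alpha \le S_{FK} \le NT_{-\alpha}$, and (6) for a few orders $\alpha$.

```python
import numpy as np

def funm(X, f):
    # Functional calculus f(X) for a real symmetric matrix X (hypothetical helper).
    X = (X + X.T) / 2
    w, V = np.linalg.eigh(X)
    return (V * f(w)) @ V.T

def power(X, t):
    return funm(X, lambda w: w ** t)

def logm(X):
    return funm(X, np.log)

def conj_power(A, B, a):
    # A^{1/2} (A^{-1/2} B A^{-1/2})^a A^{1/2}, the common formula of #_a and natural_a.
    Ah, Aih = power(A, 0.5), power(A, -0.5)
    return Ah @ power(Aih @ B @ Aih, a) @ Ah

def D(A, B, a):       # Tsallis relative entropy (2)
    return (1 - np.trace(power(A, 1 - a) @ power(B, a))) / a

def NT(A, B, a):      # minus the trace of the Tsallis relative operator entropy
    return -np.trace(conj_power(A, B, a) - A) / a

def S_U(A, B):        # Umegaki relative entropy (1)
    return np.trace(A @ (logm(A) - logm(B)))

def S_FK(A, B):       # FK (BS) relative entropy (3)
    Ah, Aih = power(A, 0.5), power(A, -0.5)
    return -np.trace(Ah @ logm(Aih @ B @ Aih) @ Ah)

rng = np.random.default_rng(3)

def random_density(n):
    G = rng.standard_normal((n, n))
    P = G @ G.T + np.eye(n)
    return P / np.trace(P)

A, B = random_density(3), random_density(3)
tol = 1e-10           # slack for rounding and for the equality cases at |alpha| = 1
checks = [S_U(A, B) <= S_FK(A, B) + tol]                      # (4)
for a in (0.25, 0.5, 1.0):
    checks += [
        NT(A, B, a) <= S_FK(A, B) + tol,
        S_FK(A, B) <= NT(A, B, -a) + tol,
        D(A, B, a) <= S_U(A, B) + tol,
        S_U(A, B) <= D(A, B, -a) + tol,
        D(A, B, a) <= NT(A, B, a) + tol,                      # (6), positive order
        D(A, B, -a) <= NT(A, B, -a) + tol,                    # (6), negative order
    ]
print(all(checks))
```

A single random pair is of course no proof; the sketch only illustrates how the secant quantities $NT_{\pm\alpha}$ bracket the tangent quantity $S_{FK}$.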

We have the following properties of the quantum relative entropy NTα of order
α ∈ [−1, 1]\{0}, also see [21]:
Theorem 5.1 Let A and B be positive definite matrices in Pn and α ∈ [−1, 1]\{0}.
Then the following properties of the quantum Tsallis relative entropy NTα hold:
(1) (Non-negativity) NTα (A|B) ≥ 0 if A ≥ B.
(2) (Pseudoadditivity)
NTα (A1 ⊗ A2 |B1 ⊗ B2 ) = NTα (A1 |B1 ) + NTα (A2 |B2 ) + αNTα (A1 |B1 )NTα (A2 |B2 ).

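(3) (Joint convexity) $NT_\alpha\bigl( \sum_j \lambda_j A_j \,\big|\, \sum_j \lambda_j B_j \bigr) \le \sum_j \lambda_j NT_\alpha(A_j|B_j)$.
(4) (Monotonicity) For any trace-preserving positive linear map $\Phi$,

\[ NT_\alpha(\Phi(A)|\Phi(B)) \le NT_\alpha(A|B). \]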

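Proof For (1), if $\alpha \in [-1, 0)$, then it follows that $(1 - \alpha) A + \alpha B \le A\, \natural_\alpha\, B$ and so

\[ NT_\alpha(A|B) = -\operatorname{Tr}\Bigl[ \frac{A\, \natural_\alpha\, B - A}{\alpha} \Bigr] \ge -\operatorname{Tr}\Bigl[ \frac{(1 - \alpha) A + \alpha B - A}{\alpha} \Bigr] = -\operatorname{Tr}[B - A] \ge 0. \]

If $\alpha \in (0, 1]$, then it follows that $(1 - \alpha) A + \alpha B \ge A\, \#_\alpha\, B$ and so

\[ NT_\alpha(A|B) = -\operatorname{Tr}\Bigl[ \frac{A\, \#_\alpha\, B - A}{\alpha} \Bigr] \ge -\operatorname{Tr}\Bigl[ \frac{(1 - \alpha) A + \alpha B - A}{\alpha} \Bigr] = -\operatorname{Tr}[B - A] \ge 0. \]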
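For (2), suppose that $\alpha \in [-1, 1] \setminus \{0\}$. Then we have

\[ (A_1 \otimes A_2)\, \natural_\alpha\, (B_1 \otimes B_2) = (A_1\, \natural_\alpha\, B_1) \otimes (A_2\, \natural_\alpha\, B_2). \]

Hence it follows that

\[ T_\alpha(A_1 \otimes A_2 | B_1 \otimes B_2) = \alpha\, T_\alpha(A_1|B_1) \otimes T_\alpha(A_2|B_2) + T_\alpha(A_1|B_1) \otimes A_2 + A_1 \otimes T_\alpha(A_2|B_2) \]

and we have the desired equality (2).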

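For (3), it follows from the joint convexity of $\natural_\alpha$ for $\alpha \in [-1, 0)$ and the joint concavity of $\#_\alpha$ for $\alpha \in (0, 1]$ that (writing $\natural_\alpha$ for both cases, since the formula is the same)

\begin{align*}
NT_\alpha\Bigl( \sum_j \lambda_j A_j \,\Big|\, \sum_j \lambda_j B_j \Bigr) &= -\operatorname{Tr}\Bigl[ \frac{(\sum_j \lambda_j A_j)\, \natural_\alpha\, (\sum_j \lambda_j B_j) - \sum_j \lambda_j A_j}{\alpha} \Bigr] \\
&\le -\operatorname{Tr}\Bigl[ \frac{\sum_j \lambda_j (A_j\, \natural_\alpha\, B_j) - \sum_j \lambda_j A_j}{\alpha} \Bigr] \\
&= -\operatorname{Tr}\Bigl[ \sum_j \lambda_j T_\alpha(A_j|B_j) \Bigr] = \sum_j \lambda_j NT_\alpha(A_j|B_j)
\end{align*}

and thus we have (3).

For (4), it follows from the information monotonicity of $\natural_\alpha$ for $\alpha \in [-1, 1] \setminus \{0\}$ that

\begin{align*}
NT_\alpha(\Phi(A)|\Phi(B)) &= -\operatorname{Tr} T_\alpha(\Phi(A)|\Phi(B)) = -\operatorname{Tr}\Bigl[ \frac{\Phi(A)\, \natural_\alpha\, \Phi(B) - \Phi(A)}{\alpha} \Bigr] \\
&\le -\operatorname{Tr}\Bigl[ \frac{\Phi(A\, \natural_\alpha\, B) - \Phi(A)}{\alpha} \Bigr] = -\operatorname{Tr}\Bigl[ \Phi\Bigl( \frac{A\, \natural_\alpha\, B - A}{\alpha} \Bigr) \Bigr] \\
&= -\operatorname{Tr}\Bigl[ \frac{A\, \natural_\alpha\, B - A}{\alpha} \Bigr] = NT_\alpha(A|B)
\end{align*}

and hence we have (4).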

To prove the main theorem, we need some preliminaries. Bebiano et al. [5, Theorem 2.1] showed the following norm inequality, say the BLP inequality: If $A$ and $B$ are positive definite matrices in $P_n$, then

\[ |||A^{\frac{1+t}{2}} B^{t} A^{\frac{1+t}{2}}||| \le |||A^{1/2} (A^{s/2} B^{s} A^{s/2})^{t/s} A^{1/2}||| \tag{7} \]

for all $s \ge t \ge 0$ and any unitarily invariant norm $|||\cdot|||$.

The Furuta inequality [24, p. 49] says that if $A \ge B \ge 0$, then

\[ A^{-r}\, \#_{\frac{1+r}{p+r}}\, B^{p} \le A \quad\text{for } p \ge 1 \text{ and } r > 0. \tag{8} \]

We show the following variant of the BLP inequality (7):

Lemma 5.2 [23, Theorem 3.2] Let $A$ and $B$ be positive definite matrices in $P_n$. Then

\[ |||A^{1/2} (A^{-p/2} B^{p} A^{-p/2})^{q/p} A^{1/2}||| \le |||A^{\frac{1-q}{2}} B^{q} A^{\frac{1-q}{2}}||| \tag{9} \]

for all $p \ge q > 0$ and $0 < q \le 1$, and any unitarily invariant norm $|||\cdot|||$.
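Proof By the antisymmetric tensor technique, in order to prove (9), it suffices to show that

\[ \lambda_1\bigl( A^{1/2} (A^{-p/2} B^{p} A^{-p/2})^{q/p} A^{1/2} \bigr) \le \lambda_1\bigl( A^{\frac{1-q}{2}} B^{q} A^{\frac{1-q}{2}} \bigr) \tag{10} \]

for all $0 < q \le p$, where $\lambda_1(A)$ is the maximal eigenvalue of $A$.

For this purpose we may prove that

\[ A^{\frac{1-q}{2}} B^{q} A^{\frac{1-q}{2}} \le I \quad\text{implies}\quad A^{1/2} (A^{-p/2} B^{p} A^{-p/2})^{q/p} A^{1/2} \le I, \]

or equivalently, replacing $A$ and $B$ by $A^{\frac{1}{q-1}}$ and $B^{\frac{1}{q}}$ respectively,

\[ B \le A \implies A^{\frac{p}{q-1}}\, \#_{\frac{q}{p}}\, B^{\frac{p}{q}} \le A^{\frac{p-1}{q-1}} \]

for all $0 < q \le p$ and $0 < q \le 1$, because both sides of (10) have the same order of homogeneity for $A$, $B$, so that we can multiply $A$, $B$ by a positive constant.

Put $r = \frac{p}{1-q} > 0$ and $p' = \frac{p}{q} \ge 1$. Then $\frac{p-1}{q-1} = \frac{p' + r - p' r}{p'}$. It follows from the Furuta inequality (8) that

\begin{align*}
A^{\frac{p}{q-1}}\, \#_{\frac{q}{p}}\, B^{\frac{p}{q}} &= A^{-r}\, \#_{\frac{1}{p'}}\, B^{p'} \\
&= A^{-r}\, \#_{\frac{p' + r}{p'(1+r)}} \bigl( A^{-r}\, \#_{\frac{1+r}{p' + r}}\, B^{p'} \bigr) && \text{by the multiplicativity of } \#_\alpha \\
&\le A^{-r}\, \#_{\frac{p' + r}{p'(1+r)}}\, A && \text{by the Furuta inequality (8)} \\
&= A^{\frac{p' + r - p' r}{p'}} = A^{\frac{p-1}{q-1}}
\end{align*}

and so the proof is complete.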

We show a 1-parameter extension of the inequality (6), which is a generalization of the inequality (5) due to Hiai-Petz:

Theorem 5.3 Let A and B be positive definite matrices in $P_n$. Then

\[ D_\alpha(A|B) \le -\frac{1}{q} \mathrm{Tr}\big[A^{1-q}\, T_{\alpha/q}(A^q|B^q)\big] \]

for all 0 < α ≤ 1 and q ≥ α > 0, or −1 ≤ α < 0 and q ≥ −α > 0.

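Proof Suppose that 0 < α ≤ 1. Since the trace norm Tr [|A|] is a unitarily invariant norm, it follows from Lemma 5.2 that

\[ \mathrm{Tr}[A^{1-\alpha} B^\alpha] = \mathrm{Tr}\big[A^{\frac{1-\alpha}{2}} B^\alpha A^{\frac{1-\alpha}{2}}\big] \ge \mathrm{Tr}\big[A^{1/2}(A^{-q/2} B^q A^{-q/2})^{\alpha/q} A^{1/2}\big] \]

for all q ≥ α > 0. Hence for each α ∈ (0, 1]

\begin{align*}
D_\alpha(A|B) &= -\mathrm{Tr}\left[\frac{A^{1-\alpha} B^\alpha - A}{\alpha}\right] \\
&\le -\mathrm{Tr}\left[\frac{A^{1/2}(A^{-q/2} B^q A^{-q/2})^{\alpha/q} A^{1/2} - A}{\alpha}\right] \\
&= -\mathrm{Tr}\left[\frac{A^{1-q}}{q} \cdot \frac{A^{q/2}(A^{-q/2} B^q A^{-q/2})^{\alpha/q} A^{q/2} - A^q}{\alpha/q}\right] \\
&= -\mathrm{Tr}\left[\frac{A^{1-q}}{q}\, T_{\alpha/q}(A^q|B^q)\right]
\end{align*}

for all q ≥ α > 0.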


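Suppose that −1 ≤ α < 0. If we put s = q and t = −α in (7), with B replaced by $B^{-1}$, then we have

\begin{align*}
\mathrm{Tr}[A^{1-\alpha} B^\alpha] &= \mathrm{Tr}\big[A^{\frac{1-\alpha}{2}} B^\alpha A^{\frac{1-\alpha}{2}}\big] \\
&\le \mathrm{Tr}\big[A^{1/2}(A^{q/2} B^{-q} A^{q/2})^{-\alpha/q} A^{1/2}\big] \\
&= \mathrm{Tr}\big[A^{1/2}(A^{-q/2} B^q A^{-q/2})^{\alpha/q} A^{1/2}\big]
\end{align*}

for all q ≥ −α > 0. Hence for each α ∈ [−1, 0)

\begin{align*}
D_\alpha(A|B) &= -\mathrm{Tr}\left[\frac{A^{1-\alpha} B^\alpha - A}{\alpha}\right] \\
&\le -\mathrm{Tr}\left[\frac{A^{1/2}(A^{-q/2} B^q A^{-q/2})^{\alpha/q} A^{1/2} - A}{\alpha}\right] \\
&= -\mathrm{Tr}\left[\frac{A^{1-q}}{q}\, T_{\alpha/q}(A^q|B^q)\right]
\end{align*}

for all q ≥ −α > 0. Hence the proof of Theorem 5.3 is complete.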

Remark 5.4 If we put q = 1 in Theorem 5.3, then we have (6). If we let α → 0 in
Theorem 5.3, then we have (5).

6 Concluding Remarks

There are many related topics on the relative operator entropy and the Tsallis relative
operator entropy, see [32, 40] and [11]. Among others, we present generalizations
of the relative operator entropy and the Tsallis relative operator entropy for positive
invertible operators on a Hilbert space due to Isa et al. [31]. We treat $T_\alpha(A|B)$ in which the range of α is extended from [−1, 1] to R, that is,

\[ T_x(A|B) \equiv \frac{A \natural_x B - A}{x} \quad (x \in \mathbb{R}\setminus\{0\}) \quad \text{and} \quad T_0(A|B) \equiv \lim_{x \to 0} T_x(A|B) = S(A|B). \]

We regard the Tsallis relative operator entropy $T_x(A|B)$ as the average rate of change of the path $A \natural_t B$ over the interval [0, x] and the relative operator entropy S(A|B) as the rate of change of the path at t = 0. Based on this viewpoint, we define the n-th Tsallis relative operator entropy and the n-th relative operator entropy.
We begin by defining the first relative operator entropy $S^{[1]}(A|B)$ and the first Tsallis relative operator entropy $T_x^{[1]}(A|B)$ as S(A|B) and $T_x(A|B)$, respectively, that is,

\[ S^{[1]}(A|B) \equiv S(A|B) = A^{1/2}\big(\log A^{-1/2} B A^{-1/2}\big)A^{1/2} \quad \text{and} \]

\[ T_x^{[1]}(A|B) \equiv T_x(A|B) = A^{1/2}\, \frac{(A^{-1/2} B A^{-1/2})^x - I}{x}\, A^{1/2} \quad (x \in \mathbb{R}\setminus\{0\}). \]

The corresponding functions to $S^{[1]}(A|B)$ and $T_x^{[1]}(A|B)$ are $\log a$ and $\frac{a^x - 1}{x}$ for a > 0, respectively. Since $\lim_{x \to 0} \frac{a^x - 1}{x} = \log a$, it follows that $T_0^{[1]}(A|B) \equiv \lim_{x \to 0} T_x^{[1]}(A|B) = S^{[1]}(A|B)$. Next, we define the second Tsallis relative operator entropy $T_x^{[2]}(A|B)$ as the average rate of change of $T_x^{[1]}(A|B)$ over the interval [0, x], that is,

\[ T_x^{[2]}(A|B) \equiv \frac{T_x^{[1]}(A|B) - S^{[1]}(A|B)}{x} = A^{1/2}\, \frac{(A^{-1/2} B A^{-1/2})^x - I - x \log(A^{-1/2} B A^{-1/2})}{x^2}\, A^{1/2}. \]

Since its corresponding function is $\frac{a^x - 1 - x \log a}{x^2}$ and $\lim_{x \to 0} \frac{a^x - 1 - x \log a}{x^2} = \frac{1}{2}(\log a)^2$, we define the second relative operator entropy as

\[ S^{[2]}(A|B) \equiv \frac{1}{2} A^{1/2}\big(\log A^{-1/2} B A^{-1/2}\big)^2 A^{1/2} = \frac{1}{2} S(A|B) A^{-1} S(A|B), \]

and then $T_0^{[2]}(A|B) \equiv \lim_{x \to 0} T_x^{[2]}(A|B) = S^{[2]}(A|B)$.
Based on this consideration, it is natural to define the n-th relative operator entropy $S^{[n]}(A|B)$ by using $\frac{1}{n!}(\log a)^n$ as its corresponding function, and the n-th Tsallis relative operator entropy $T_x^{[n]}(A|B)$.
Definition 6.1 Let A and B be positive invertible operators and n ∈ N. The n-th relative operator entropy $S^{[n]}(A|B)$ is defined by

\[ S^{[n]}(A|B) \equiv \frac{1}{n!} A^{1/2}\big(\log A^{-1/2} B A^{-1/2}\big)^n A^{1/2} = \frac{1}{n!} A\big(A^{-1} S(A|B)\big)^n \]

and the n-th Tsallis relative operator entropy $T_x^{[n]}(A|B)$ is inductively defined by

\[ T_x^{[1]}(A|B) \equiv T_x(A|B) \]

and for n ≥ 2,

\[ T_x^{[n]}(A|B) \equiv \frac{T_x^{[n-1]}(A|B) - S^{[n-1]}(A|B)}{x}. \]

Since $\frac{d^n}{dx^n} a^x = a^x (\log a)^n$ holds for a > 0, we have

\[ \frac{d^n}{dx^n} A \natural_x B = A^{1/2}(A^{-1/2} B A^{-1/2})^x \big(\log A^{-1/2} B A^{-1/2}\big)^n A^{1/2} = (A \natural_x B)\big(A^{-1} S(A|B)\big)^n \]

and so $S^{[n]}(A|B) = \frac{1}{n!} \frac{d^n}{dx^n} A \natural_x B \,\big|_{x=0}$.
Then we have

\[ A \natural_x B = A + \sum_{k=1}^{n-1} x^k S^{[k]}(A|B) + x^n T_x^{[n]}(A|B), \]

which is the Taylor expansion of $A \natural_x B$ around 0. We remark that the k-th relative operator entropy $S^{[k]}(A|B)$ appears as the coefficient of the $x^k$-term and the n-th Tsallis relative operator entropy $T_x^{[n]}(A|B)$ appears in the residual term. So we can call $T_x^{[n]}(A|B)$ the n-th residual relative operator entropy.
The n-th relative operator entropy and the n-th Tsallis relative operator entropy are discussed in depth in [31] and [44].
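
The expansion above can be sanity-checked in the commuting scalar case A = 1, B = a > 0, where the path is simply a^x and S(1|a) = log a. The following is a minimal sketch under that assumption; the function names `tsallis_n` and `taylor_with_residual` are illustrative and do not come from [31] or [44].

```python
import math

def tsallis_n(a: float, x: float, n: int) -> float:
    """Scalar analogue of T_x^{[n]}(1|a), built by the recursion in Definition 6.1."""
    if n == 1:
        return (a**x - 1.0) / x
    # Scalar S^{[n-1]}(1|a) = (log a)^(n-1) / (n-1)!
    s_prev = math.log(a) ** (n - 1) / math.factorial(n - 1)
    return (tsallis_n(a, x, n - 1) - s_prev) / x

def taylor_with_residual(a: float, x: float, n: int) -> float:
    """Right-hand side 1 + sum_{k<n} x^k S^{[k]} + x^n T_x^{[n]} for A = 1, B = a."""
    head = 1.0 + sum(x**k * math.log(a) ** k / math.factorial(k) for k in range(1, n))
    return head + x**n * tsallis_n(a, x, n)

# The identity holds for every order n, not only asymptotically.
a, x = 2.5, 0.7
for n in (1, 2, 3, 4):
    assert math.isclose(taylor_with_residual(a, x, n), a**x, rel_tol=1e-9)
```

Because the residual term is defined by exact division, the identity is algebraic rather than approximate, which is why the check passes for a fixed x that is not small.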

Acknowledgments The authors are partially supported by JSPS KAKENHI Grant Number
JP19K03542.

References

1. S. Abe, Monotone decrease of the quantum nonadditive divergence by projective measurements. Phys. Lett. A 312, 336–338 (2003)
2. W.N. Anderson, R.J. Duffin, Series of parallel addition of matrices. J. Math. Anal. Appl. 26,
576–594 (1969)
3. T. Ando, Lebesgue type decomposition of positive operators. Acta Sci. Math. 38, 253–260
(1976)
4. T. Ando, Topics on Operator Inequalities, Hokkaido Univ. Lecture Note (1978)
5. N. Bebiano, R. Lemos, J. da Providencia, Inequalities for quantum relative entropy. Linear
Algebra Appl. 401, 159–172 (2005)
6. V.P. Belavkin, P. Staszewski, C ∗ -algebraic generalization of relative entropy and entropy. Ann.
Inst. Poincare Sect. A 37, 51–58 (1982)
7. R. Bhatia, J.A.R. Holbrook, Riemannian geometry and matrix geometric means. Linear
Algebra Appl. 423, 594–618 (2006)
8. G. Corach, A.L. Maestripieri, Differential and metrical structure of positive operators. Positivity 3, 297–315 (1999)
9. G. Corach, H. Porta, L. Recht, Geodesics and operator means in the space of positive operators.
Int. J. Math. 4, 193–202 (1993)
10. C. Davis, A Schwarz inequality for convex operator functions. Proc. Am. Math. Soc. 8, 42–44
(1957)

11. S.S. Dragomir, A survey of recent inequalities for relative operator entropy, in Operations
Research, Engineering, and Cyber Security, Springer Optim. Appl., vol. 113 (Springer, Cham,
2017), pp. 199–229
12. J.I. Fujii, Izumino’s view of operator means. Math. Japon. 33, 671–675 (1988)
13. J.I. Fujii, Operator means and the relative operator entropy, in Operator Theory and Complex
Analysis (Sapporo, 1991), Oper. Theory Adv. Appl., vol. 59 (Birkhäuser, Basel, 1992), pp.
161–172
14. J.I. Fujii, Structure of Hiai-Petz parametrized geometry for positive definite matrices. Linear
Algebra Appl. 432 , 318–326 (2010)
15. J.I. Fujii, Path of quasi-means as a geodesic. Linear Algebra Appl. 434, 542–558 (2011)
16. J.I. Fujii, Interpolationality for symmetric operator means. Sci. Math. Japon. 75, 267–274
(2012)
17. J.I. Fujii, M. Fujii, Y. Seo, An extension of the Kubo-Ando theory: Solidarities. Math. Japon.
35, 509–512 (1990)
18. J.I. Fujii, E. Kamei, Relative operator entropy in noncommutative information theory. Math.
Japon. 34, 341–348 (1989)
19. J.I. Fujii, E. Kamei, Uhlmann’s interpolational method for operator means. Math. Japon. 34,
541–547 (1989)
20. J.I. Fujii, E. Kamei, Interpolational paths and their derivatives. Math. Japon. 36, 557–560
(1994)
21. J.I. Fujii, Y. Seo, Tsallis relative operator entropy with negative parameters. Adv. Oper. Theory
1(2), 219–236 (2016)
22. J.I. Fujii, Y. Seo, The relative operator entropy and the Karcher mean. Linear Algebra Appl.
542, 4–34 (2018)
23. M. Fujii, Y. Seo, Matrix trace inequalities related to the Tsallis relative entropies of real order.
J. Math. Anal. Appl. 498, Article 124877 (2021)
24. M. Fujii, J. Mićić Hot, J. Pečarić, Y. Seo, Recent Developments of Mond-Pečarić Method in
Operator Inequalities (Element, Zagreb, 2012)
25. S. Furuichi, Matrix trace inequalities on the Tsallis entropies. J. Inequal. Pure Appl. Math. 9,
Article 1, 7pp. (2008)
26. S. Furuichi, K. Yanagi, K. Kuriyama, Fundamental properties of Tsallis relative entropy. J.
Math. Phys. 45, 4868–4877 (2004)
27. S. Furuichi, K. Yanagi, K. Kuriyama, A note on operator inequalities of Tsallis relative operator
entropy. Linear Algebra Appl. 407, 19–31 (2005)
28. F. Hansen, An operator inequality. Math. Ann. 246, 249–250 (1980)
29. F. Hansen, G.K. Pedersen, Jensen’s inequality for operators and Löwner’s theorem. Math. Ann.
258, 229–241 (1982)
30. F. Hiai, D. Petz, The Golden-Thompson trace inequality is complemented. Linear Algebra
Appl. 181, 153–185 (1993)
31. H. Isa, E. Kamei, H. Tohyama, M. Watanabe, The n-th relative operator entropies and the n-th
operator divergences. Ann. Funct. Anal. 11(2), 298–313 (2020)
32. S. Kim, H. Lee, Relative operator entropy related with the spectral geometric mean. Anal.
Math. Phys. 5(3), 233–240 (2015)
33. S. Kobayashi, K. Nomizu, Foundations of Differential Geometry (John Wiley & Sons, New
York, vol. 1 (1963), vol. 2 (1969))
34. H. Kosaki, Remarks on Lebesgue-type decomposition of positive operators. J. Oper. Theory
11, 137–143 (1984)
35. F. Kubo, T. Ando, Means of positive linear operators. Math. Ann. 246, 205–224 (1980)
36. S. Kullback, R. Leibler, On information and sufficiency. Ann. Math. Stat. 22, 79–86 (1951)
37. J. Lawson, Y. Lim, Karcher means and Karcher equations of positive definite operators. Trans.
Am. Math. Soc., Ser. B 1, 1–22 (2014)
38. Y. Lim, M. Pálfia, Matrix power means and the Karcher mean. J. Funct. Anal. 262, 1498–1514
(2012)

39. M. Ohya, D. Petz, Quantum Entropy and Its Use (Springer, Berlin Heidelberg, 2004), Corrected
Second Printing
40. M. Raïssouli, M.S. Moslehian, S. Furuichi, Relative entropy and Tsallis entropy of two
accretive operators. C. R. Math. Acad. Sci. Paris 355(6), 687–693 (2017)
41. M.B. Ruskai, F.H. Stillinger, Convexity inequalities for estimating free energy and relative
entropy. J. Phys. A 23, 2421–2437 (1990)
42. Y. Seo, Matrix trace inequalities on Tsallis relative entropy of negative order. J. Math. Anal.
Appl. 472, 1499–1508 (2019)
43. C.E. Shannon, A mathematical theory of communication. Bell Syst. Tech. J. 27, 623–656
(1948)
44. H. Tohyama, E. Kamei, M. Watanabe, The n-th residual relative operator entropy $R^{[n]}_{x,y}(A|B)$.
Adv. Oper. Theory 6(1), No. 18, 11 pp. (2021)
45. H. Umegaki, Conditional expectation in an operator algebra, IV (Entropy and information),
Kodai Math. Sem. Rep. 14, 59–85 (1962)
46. J. von Neumann, Mathematical Foundations of Quantum Mechanics (Princeton University
Press, Princeton, NJ, 1955). (Originally appeared in German in 1932)
47. K. Yanagi, K. Kuriyama, S. Furuichi, Generalized Shannon inequalities based on Tsallis
relative operator entropy. Linear Algebra Appl. 394, 109–118 (2005)
Matrix Inequalities and
Characterizations of Operator
Monotone Functions

Trung Hoa Dinh, Hiroyuki Osaka, and Oleg E. Tikhonov

Abstract In this chapter, we give a series of new characterizations of operator

monotone functions using matrix inequalities involving different Kubo-Ando matrix
means. We also use a trace monotonicity inequality and the Powers-Størmer
inequality to characterize operator monotone functions.

Keywords Operator monotone functions · Kubo-Ando matrix means ·

Powers-Stormer’s inequality · Matrix Heinz mean · Matrix Heron mean · Matrix
geometric means · Matrix power means · Matrix inequalities

1 Introduction

Throughout this chapter, M n stands for the algebra of n × n matrices over C and Pn
denotes the cone of positive definite elements in M n . Denote by In the identity
matrix of M n . For a Hermitian matrix A with eigenvalues in the domain of a
function f , the matrix f (A) is defined by means of the functional calculus.
Definition 1.1 A continuous function f on I (⊂ R) is called n-monotone if

\[ A \le B \implies f(A) \le f(B) \]

for any pair of self-adjoint matrices A, B ∈ $M_n$ with σ(A), σ(B) ⊂ I, where σ(A)
stands for the spectrum of A.

T. H. Dinh
Department of Mathematics, Troy University, Troy, AL, USA
e-mail: [email protected]
H. Osaka
Department of Mathematical Sciences, Ritsumeikan University, Kusatsu, Shiga, Japan
e-mail: [email protected]
O. E. Tikhonov
Institute of Computer Mathematics and Information Technologies, Kazan Federal University,
Kazan, Russia
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
R. M. Aron et al. (eds.), Operator and Norm Inequalities and Related Topics,
Trends in Mathematics, https://doi.org/10.1007/978-3-031-02104-6_4
Definition 1.2 A continuous function f on I (⊂ R) is called n-convex if the
inequality

f (λA + (1 − λ)B) ≤ λf (A) + (1 − λ)f (B)

holds for all self-adjoint matrices A, B ∈ $M_n$ with σ(A), σ(B) ⊂ I and for all
λ ∈ [0, 1]. Also, f is called n-concave on I if −f is n-convex on I.
Let f : I (⊂ R) → R. We call a function f operator convex if f is n-convex for
any n ∈ N, and operator monotone if f is n-monotone for any n ∈ N.
Operator monotone functions were first introduced and studied by Löwner in
1934 [35]. He completely characterized the operator monotone functions as the
class of Pick functions. These functions play an essential role in the theory of
analytic functions. He also proved that if f is operator monotone on [0, ∞),
then there exists a positive measure μ on [0, ∞) such that

\[ f(t) = a + bt + \int_0^\infty \frac{st}{t+s}\, d\mu(s), \]

where a is a real number and b ≥ 0. According to this representation, for r ∈ [0, 1],
the function $f(t) = t^r$ is operator monotone on [0, ∞). This is the well-known
Löwner-Heinz inequality [42, Theorem 1.1] for positive semidefinite matrices,
which states that $A^r \le B^r$ for any 0 ≤ A ≤ B and r ∈ [0, 1].
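
To make the role of the exponent range concrete, here is a small illustrative check on 2 × 2 real symmetric matrices, written with the standard library only. It is a sketch, not part of the chapter: the matrices A and B are chosen for illustration, the square root uses the closed form available for 2 × 2 positive semidefinite matrices, and positivity is tested through trace and determinant. It shows that A ≤ B gives A^{1/2} ≤ B^{1/2} in this instance, while A² ≤ B² fails, so t ↦ t² is not 2-monotone.

```python
import math

def sqrt_psd_2x2(m):
    """Square root of a nonzero 2x2 symmetric PSD matrix via the closed form
    sqrt(M) = (M + sqrt(det M) I) / sqrt(tr M + 2 sqrt(det M))."""
    (a, b), (_, d) = m
    s = math.sqrt(max(a * d - b * b, 0.0))
    t = math.sqrt(a + d + 2.0 * s)
    return [[(a + s) / t, b / t], [b / t, (d + s) / t]]

def square_2x2(m):
    """Matrix square M @ M for a 2x2 matrix."""
    (a, b), (c, d) = m
    return [[a * a + b * c, a * b + b * d], [c * a + d * c, c * b + d * d]]

def is_psd_2x2(m, tol=1e-12):
    """A symmetric 2x2 matrix is PSD iff its trace and determinant are nonnegative."""
    (a, b), (_, d) = m
    return a + d >= -tol and a * d - b * b >= -tol

def sub(x, y):
    return [[x[i][j] - y[i][j] for j in range(2)] for i in range(2)]

A = [[1.0, 1.0], [1.0, 1.0]]
B = [[2.0, 1.0], [1.0, 1.0]]

print(is_psd_2x2(sub(B, A)))                              # True:  A <= B
print(is_psd_2x2(sub(sqrt_psd_2x2(B), sqrt_psd_2x2(A))))  # True:  A^(1/2) <= B^(1/2)
print(is_psd_2x2(sub(square_2x2(B), square_2x2(A))))      # False: A^2 <= B^2 fails
```

Here B² − A² has determinant −1, which is why the last test fails even though B − A is positive semidefinite.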
In [31], Hansen and Pedersen considered some basic equivalent assertions for
operator monotone functions and operator convex functions. They showed that for
a strictly positive, continuous function f on (0, ∞), the following statements are
equivalent:
(a) f is operator concave.
(b) $\frac{t}{f(t)}$ is operator monotone.
In [37], Osaka and Tomiyama discussed some similar assertions at each level n
in order to see clearly the inside of the double piling structure of matrix monotone
functions and of matrix convex functions. More precisely, the main results in [37]
are the following.

Theorem 1.3 Let n ∈ N and f : [0, α) → R. Let us consider the following assertions:
(i) f(0) ≤ 0 and f is n-convex in [0, α).
(ii) For each self-adjoint matrix A with its spectrum in [0, α) and a contraction C
in $M_n$,

\[ f(C^* A C) \le C^* f(A) C. \]

(iii) The function $g(t) = \frac{f(t)}{t}$ is n-monotone on (0, α).
Then

\[ (i)_{n+1} \implies (ii)_n \implies (iii)_n \implies (i)_{[\frac{n}{2}]}, \]

where the notation $(A)_m \Rightarrow (B)_n$ means that "if (A) holds for the matrix algebra $M_m$,
then (B) holds for the matrix algebra $M_n$".
In 1980, Kubo and Ando [34] introduced the theory of operator means. Let
$B(H)^+$ be the set of positive invertible operators in a Hilbert space H. A binary
operation $\sigma : B(H)^+ \times B(H)^+ \to B(H)^+$, $(A, B) \mapsto A\sigma B$, is called a connection
if the following requirements are fulfilled:
(I) If A ≤ C and B ≤ D, then AσB ≤ CσD;
(II) $C^*(A\sigma B)C \le (C^*AC)\sigma(C^*BC)$;
(III) If $A_n \downarrow A$ and $B_n \downarrow B$, then $A_n \sigma B_n \downarrow A\sigma B$.
Further, a mean is a connection satisfying the normalized condition:
(IV) 1σ1 = 1.
Kubo and Ando showed that there exists an affine order-isomorphism from the
class of connections to the class of positive operator monotone functions, which is
given by $\sigma \mapsto f_\sigma(t) = 1\sigma t$. If f is an operator monotone function, then for
positive definite matrices A and B,

\[ A\sigma B := A^{1/2} f(A^{-1/2} B A^{-1/2}) A^{1/2}. \tag{1} \]

In 1996, Petz [38] introduced the theory of monotone metric in quantum infor-
mation theory which was based on operator monotone functions. Therefore, such
functions are important in matrix analysis, quantum information and other areas as
well. The authors refer readers to the books of William Donoghue [29], Barry Simon
[41] and Bhatia [5] for more details about operator monotone functions.
It is well-known that if σ is a symmetric matrix Kubo-Ando mean, i.e., Aσ B =
Bσ A, then the representing function fσ satisfies the following inequalities

\[ \frac{2x}{1+x} \le f_\sigma(x) \le \frac{1+x}{2}. \]

Consequently, for any positive operators A and B,

A!B ≤ Aσ B ≤ A∇B, (2)

where A!B = 2(A−1 + B −1 )−1 is the harmonic mean of A and B, and A∇B =
(A + B)/2 is the arithmetic mean of A and B. Obviously, if f : [0, ∞) → [0, ∞)
is operator monotone, we have

f (A!B) ≤ f (Aσ B) ≤ f (A∇B). (3)
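
For positive scalars, formula (1) reduces to a σ b = a f(b/a), so the bounds (2) can be checked directly from a representing function. The sketch below does this for two symmetric means, the geometric mean and a Heinz mean with s = 1/4; the sample values and the helper names `mean_from_function` and `check_bounds` are assumptions made for illustration only.

```python
import math

def harmonic(a, b):
    return 2.0 * a * b / (a + b)

def arithmetic(a, b):
    return (a + b) / 2.0

def mean_from_function(f, a, b):
    """Scalar form of (1): a sigma b = a * f(b / a) for a, b > 0."""
    return a * f(b / a)

# Representing functions f_sigma of two symmetric means.
REPRESENTING = {
    "geometric": math.sqrt,
    "heinz(s=1/4)": lambda x: (x**0.25 + x**0.75) / 2.0,
}

def check_bounds(a, b, tol=1e-12):
    """True if a!b <= a sigma b <= a nabla b for each sample mean."""
    lo, hi = harmonic(a, b), arithmetic(a, b)
    return all(lo - tol <= mean_from_function(f, a, b) <= hi + tol
               for f in REPRESENTING.values())

print(all(check_bounds(a, b) for a in (0.1, 1.0, 7.0) for b in (0.2, 1.0, 50.0)))
```

Applying a monotone increasing scalar function to each side preserves the ordering, which is the scalar content of (3).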

Interestingly, if a continuous function f satisfies either of the inequalities

\[ f(a!b) \le f(a\sigma b) \le f(a\nabla b) \tag{4} \]

for any positive numbers a and b, then f is monotonically increasing. Matrix
generalizations of this observation for Kubo-Ando means were discussed by Hiai
and Ando in [1, Proposition 4.1]. Namely, they showed that a continuous function
f on (0, ∞) is operator monotone if and only if one of the following conditions
holds:
(A) f(A∇B) ≥ f(AσB) for all positive definite matrices A, B and for some
symmetric operator mean σ ≠ ∇;
(B) f(A!B) ≤ f(AσB) for all positive definite matrices A, B and for some
symmetric operator mean σ ≠ !.
One of the most important matrix means is the geometric mean,

A#B := A1/2(A−1/2 BA−1/2 )1/2 A1/2

as the mid-point of the geodesic,

A#t B := A1/2(A−1/2 BA−1/2 )t A1/2 (t ∈ [0, 1])

connecting two matrices A and B in the Riemannian manifold of positive matrices.

It is natural to consider a similar characterization using this mid-point. This
importance becomes more evident when one considers that # is not only symmetric
but also self-adjoint i.e. (A#B)−1 = A−1 #B −1 , so it seems as a natural candidate
to extend this characterization to other classes of means.
Now assume that H is an infinite-dimensional Hilbert space. Let Tr be the
canonical trace on the algebra B(H). It is well-known that for a monotone function
f on [0, ∞),

\[ 0 \le A \le B \implies \mathrm{Tr}(f(A)) \le \mathrm{Tr}(f(B)). \]

For any normal state φ on B(H), a positive kernel operator $S_\varphi$ with Tr($S_\varphi$) = 1
such that φ(A) = Tr($S_\varphi$ A) (A ∈ B(H)) is uniquely defined. If f is an operator
monotone function, and φ is a positive linear functional on B(H), then for any 0 ≤ A ≤ B,

\[ \varphi(f(A)) \le \varphi(f(B)). \tag{5} \]

In this chapter, we give a series of new characterizations of operator monotone

functions using matrix inequalities involving Kubo-Ando matrix means. In addition,
we also use trace inequality (5) and the Powers-Størmer inequality in quantum
hypothesis testing theory [3] to characterize operator monotone functions.

2 Matrix Inequalities and Characterizations of Operator Functions

2.1 Heinz Mean, Heron Mean, and Operator Monotone Functions

2.1.1 Scalar Inequality for Heinz Mean and Heron Mean

For two non-negative numbers x and y let us denote by

\[ G_s(x, y) = \frac{x^s y^{1-s} + x^{1-s} y^s}{2} \]

the Heinz means and by

\[ H_s(x, y) = s\,\frac{x+y}{2} + (1-s)\,x^{1/2} y^{1/2} \]

the Heron means.
The families of Heron means and Heinz means are clearly interpolations between
the arithmetic and the geometric means. In [6], Bhatia obtained a relation between
the Heinz mean and the Heron mean which states that for t ∈ [0, 1],

\[ G_t(a, b) \le H_{(2t-1)^2}(a, b). \tag{6} \]

Therefore, for any t ∈ [0, 1], we have

\[ \sqrt{ab} \le G_t(a, b) \le H_{(2t-1)^2}(a, b) \le H_{|2t-1|}(a, b) \le \frac{a+b}{2}. \tag{7} \]

Now, we show that the inequality between the Heinz mean and the Heron mean of
scalars also characterizes monotonicity.
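
The chain (7) is easy to probe numerically. The following sketch evaluates its five terms on a small grid and confirms they are non-decreasing; the grid and the tolerance are arbitrary choices made for illustration.

```python
import math

def heinz(s, x, y):
    """Heinz mean G_s(x, y)."""
    return (x**s * y**(1 - s) + x**(1 - s) * y**s) / 2.0

def heron(s, x, y):
    """Heron mean H_s(x, y)."""
    return s * (x + y) / 2.0 + (1 - s) * math.sqrt(x * y)

def chain_7(t, a, b):
    """The five terms of (7), from the geometric to the arithmetic mean."""
    return [
        math.sqrt(a * b),
        heinz(t, a, b),
        heron((2 * t - 1) ** 2, a, b),
        heron(abs(2 * t - 1), a, b),
        (a + b) / 2.0,
    ]

def is_nondecreasing(values, tol=1e-12):
    return all(u <= v + tol * max(1.0, abs(v)) for u, v in zip(values, values[1:]))

ok = all(
    is_nondecreasing(chain_7(t, a, b))
    for t in (0.0, 0.1, 0.3, 0.5, 0.8, 1.0)
    for a in (0.5, 1.0, 4.0)
    for b in (0.5, 2.0, 100.0)
)
print(ok)
```

At t = 1/2 the first three terms coincide with the geometric mean, and at t ∈ {0, 1} the last four coincide with the arithmetic mean, so the tolerance only guards against rounding.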

Theorem 2.1 A continuous function f on [0, ∞) is monotone increasing if and
only if for any pair of positive numbers x, y and s ∈ (0, 1/2) ∪ (1/2, 1),

\[ f\left(\frac{x^s y^{1-s} + x^{1-s} y^s}{2}\right) \le f\left(\alpha(s)^2\, \frac{x+y}{2} + (1 - \alpha(s)^2)\sqrt{xy}\right), \tag{8} \]

where α(s) = 2s − 1.
Proof The forward implication follows from (7) and monotonicity, so we only need to show
the converse. Given two positive numbers a ≤ b, it suffices to show that there exist
positive numbers x and y such that

\[ a = \frac{x^s y^{1-s} + x^{1-s} y^s}{2}, \qquad b = \alpha(s)^2\, \frac{x+y}{2} + (1 - \alpha(s)^2)\sqrt{xy}, \tag{9} \]

as this would imply f(a) ≤ f(b), showing the desired monotonicity. If such x and
y exist, from (9) we would have

\begin{align*}
\frac{a}{b} &= \frac{x^s y^{1-s} + x^{1-s} y^s}{\alpha(s)^2 (x+y) + 2(1 - \alpha(s)^2)\sqrt{xy}} \\
&= \frac{(y/x)^{\alpha(s)/2} + (y/x)^{-\alpha(s)/2}}{\alpha(s)^2 \big((y/x)^{1/2} + (y/x)^{-1/2}\big) + 2(1 - \alpha(s)^2)} \\
&= \frac{\cosh(\alpha(s) c)}{\alpha(s)^2 \cosh(c) + (1 - \alpha(s)^2)},
\end{align*}

where $e^{2c} = y/x$. We define

\[ f_\alpha(c) = \frac{\cosh(\alpha c)}{\alpha^2 \cosh(c) + (1 - \alpha^2)} \]

and show that $f_\alpha : [0, \infty) \to (0, 1]$ is bijective. Indeed, notice that

\[ f_\alpha(0) = 1 \quad \text{and} \quad \lim_{c \to \infty} f_\alpha(c) = 0. \]

Continuity and the Intermediate Value Theorem imply that the function $f_\alpha : [0, \infty) \to (0, 1]$ is surjective. Moreover, we can show that $f_\alpha$ is also injective. To do this, it is enough to show that the function
is monotonic on [0, ∞). So, note that

\[ \frac{d}{dc} f_\alpha(c) \le 0 \]

if and only if

\[ g_\alpha(c) := \alpha \sinh(\alpha c)\big(\alpha^2 \cosh(c) + (1 - \alpha^2)\big) - \alpha^2 \sinh(c) \cosh(\alpha c) \le 0. \]

Since $g_\alpha(0) = 0$, it suffices to show that $g_\alpha$ is monotonically decreasing on [0, ∞).
Taking the derivative with respect to c we obtain

\[ \frac{d}{dc} g_\alpha(c) = 2\alpha^2(-1 + \alpha^2) \cosh(c\alpha) \sinh(c/2)^2, \]

which is clearly non-positive when c ≥ 0, because 0 < α² < 1. Hence, the function $f_\alpha : [0, \infty) \to (0, 1]$
is bijective. To obtain a solution for (9), fix s ∈ (0, 1/2) ∪ (1/2, 1) and set $c = f_{\alpha(s)}^{-1}(a/b)$. With this, we can obtain the desired x and y satisfying (9).
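
The construction in this proof can be carried out numerically. The sketch below inverts f_α by bisection, which is justified by the monotonicity just shown, and then rescales so that the Heinz mean equals a; the function name `solve_xy` and the iteration count are illustrative assumptions, not part of the original argument.

```python
import math

def f_alpha(alpha, c):
    """f_alpha(c) = cosh(alpha c) / (alpha^2 cosh(c) + 1 - alpha^2)."""
    return math.cosh(alpha * c) / (alpha**2 * math.cosh(c) + (1.0 - alpha**2))

def solve_xy(a, b, s, iters=200):
    """Given 0 < a <= b and s in (0, 1) with s != 1/2, return (x, y) satisfying (9)."""
    alpha = 2.0 * s - 1.0
    target = a / b
    lo, hi = 0.0, 1.0
    while f_alpha(alpha, hi) > target:   # f_alpha decreases to 0, so this loop ends
        hi *= 2.0
    for _ in range(iters):               # bisection on the decreasing function f_alpha
        mid = 0.5 * (lo + hi)
        if f_alpha(alpha, mid) > target:
            lo = mid
        else:
            hi = mid
    c = 0.5 * (lo + hi)
    ratio = math.exp(2.0 * c)            # ratio = y / x
    x = 2.0 * a / (ratio**(1 - s) + ratio**s)   # scale so the Heinz mean equals a
    return x, x * ratio

a, b, s = 1.0, 3.0, 0.3
x, y = solve_xy(a, b, s)
alpha_sq = (2.0 * s - 1.0) ** 2
heinz_value = (x**s * y**(1 - s) + x**(1 - s) * y**s) / 2.0
heron_value = alpha_sq * (x + y) / 2.0 + (1.0 - alpha_sq) * math.sqrt(x * y)
assert math.isclose(heinz_value, a, rel_tol=1e-8)
assert math.isclose(heron_value, b, rel_tol=1e-8)
```

Only the ratio y/x is determined by a/b; the overall scale is then fixed by the first equation in (9), and the second holds automatically.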

Let 0 ≤ p ≤ 1 ≤ q. It is well-known that for non-negative numbers a and b,

\[ \sqrt{ab} \le \left(\frac{a^p + b^p}{2}\right)^{1/p} \le \frac{a+b}{2} \le \left(\frac{a^q + b^q}{2}\right)^{1/q}, \]

or,

\[ \sqrt{ab} \le \mu(p, a, b) \le \mu(1, a, b) \le \mu(q, a, b), \tag{10} \]

where $\mu(p, a, b) = \left(\frac{a^p + b^p}{2}\right)^{1/p}$ is the power mean (or binomial mean). Using
similar arguments as in the proof of Theorem 2.1 one can obtain the following
theorem.
Theorem 2.2 Let f be a continuous function on [0, ∞). For 0 ≤ p ≤ 1 ≤ q,
suppose that one of the following inequalities holds for all non-negative
numbers a ≤ b:
(1) $f(a) \le f(\sqrt{ab})$;
(2) $f(\mu(1, a, b)) \le f(b)$;
(3) $f\left(\frac{a^s b^{1-s} + a^{1-s} b^s}{2}\right) \le f\left(|2s-1|\,\frac{a+b}{2} + (1 - |2s-1|)\sqrt{ab}\right)$;
(4) $f(\sqrt{ab}) \le f(\mu(p, a, b))$;
(5) $f(\mu(p, a, b)) \le f(\mu(1, a, b))$;
(6) $f(\mu(1, a, b)) \le f(\mu(q, a, b))$.
Then the function f is monotone increasing on [0, ∞).
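
The ordering (10) of power means that underlies conditions (4) to (6) can be checked on sample values. This is a minimal sketch for exponents p ∈ (0, 1] and q ≥ 1; the case p = 0 is excluded because the formula for μ is not defined there, and the sample grid is an arbitrary choice.

```python
import math

def power_mean(p, a, b):
    """mu(p, a, b) = ((a^p + b^p) / 2)^(1/p) for p != 0."""
    return ((a**p + b**p) / 2.0) ** (1.0 / p)

def chain_10_holds(p, q, a, b, tol=1e-12):
    """True if sqrt(ab) <= mu(p) <= mu(1) <= mu(q) up to rounding."""
    terms = [math.sqrt(a * b), power_mean(p, a, b), power_mean(1.0, a, b), power_mean(q, a, b)]
    return all(u <= v + tol * max(1.0, v) for u, v in zip(terms, terms[1:]))

print(all(
    chain_10_holds(p, q, a, b)
    for p in (0.1, 0.5, 1.0)
    for q in (1.0, 2.0, 5.0)
    for a in (0.2, 1.0, 3.0)
    for b in (0.2, 8.0)
))
```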

2.1.2 Matrix Inequalities and Operator Monotone Functions

Notice that from (7) we have the following inequalities for matrix means:

\[ A\#B \le \frac{A\#_s B + A\#_{1-s} B}{2} \le \alpha(s)^2\, \frac{A+B}{2} + (1 - \alpha(s)^2)\, A\#B \le \frac{A+B}{2}. \]

In this section, using the above inequalities, we establish new characterizations of
operator monotone functions.
Theorem 2.3 Let f be a continuous function on [0, ∞), s ∈ (0, 1/2) ∪ (1/2, 1)
and α(s) = 1 − 2s. The following statements are equivalent:
(i) f is operator monotone on [0, ∞);
(ii) For any positive definite matrices A and B,

\[ f(A\#B) \le f\left(\frac{A\#_s B + A\#_{1-s} B}{2}\right); \tag{11} \]

(iii) For any positive definite matrices A and B,

\[ f\left(\frac{A\#_s B + A\#_{1-s} B}{2}\right) \le f\left(\alpha(s)^2\, \frac{A+B}{2} + (1 - \alpha(s)^2)\, A\#B\right); \tag{12} \]

(iv) For any positive definite matrices A and B,

\[ f\left(\alpha(s)^2\, \frac{A+B}{2} + (1 - \alpha(s)^2)\, A\#B\right) \le f\left(\frac{A+B}{2}\right). \]

Proof It is obvious that (i) implies (ii), (iii), and (iv). Let us show that (iii) implies
(i) first and then we show (ii) implies (i). That would complete the proof since
(iv) implies (i) follows from [1, Proposition 4.1] since the matrix Heron mean is
symmetric.
Suppose (12) holds for any positive definite matrices A and B. We need to show
that for any 0 < X ≤ Y ,

f (X) ≤ f (Y ).

Firstly, let us consider the case when $Y = I_n$. We now show that there exist positive
definite matrices $A_0$, $B_0$ such that

\[ \frac{A_0 \#_s B_0 + A_0 \#_{1-s} B_0}{2} = A_0^{1/2}\, \frac{C_0^s + C_0^{1-s}}{2}\, A_0^{1/2} = X \tag{13} \]

and

\[ A_0^{1/2}\left(\alpha(s)^2\, \frac{I_n + C_0}{2} + (1 - \alpha(s)^2)\, C_0^{1/2}\right) A_0^{1/2} = I_n, \tag{14} \]

where $C_0 = A_0^{-1/2} B_0 A_0^{-1/2}$. From (14), we get

\[ A_0^{1/2} = \left(\alpha(s)^2\, \frac{I_n + C_0}{2} + (1 - \alpha(s)^2)\, C_0^{1/2}\right)^{-1/2}. \]

Substituting the last identity into (13), we get

\[ X = \frac{C_0^s + C_0^{1-s}}{2} \left(\alpha(s)^2\, \frac{I_n + C_0}{2} + (1 - \alpha(s)^2)\, C_0^{1/2}\right)^{-1}. \tag{15} \]

From the proof of Theorem 2.1 the function

\[ f(x) = \frac{x^s + x^{1-s}}{2} \left(\alpha(s)^2\, \frac{1+x}{2} + (1 - \alpha(s)^2)\sqrt{x}\right)^{-1} \]

is bijective and takes values in (0, 1]. Therefore, for any $0 < X \le I_n$ there exists
a unique matrix $C_0$ satisfying (15). Hence, the matrix $A_0$ is obtained from (14) and
the matrix $B_0$ equals $A_0^{1/2} C_0 A_0^{1/2}$.
In general, for 0 < X ≤ Y we have $0 < Y^{-1/2} X Y^{-1/2} \le I_n$. By the above
arguments, we can find $A_0, B_0 \in M_n^+$ such that

\[ \frac{A_0 \#_s B_0 + A_0 \#_{1-s} B_0}{2} = Y^{-1/2} X Y^{-1/2} \]

and

\[ \alpha(s)^2\, \frac{A_0 + B_0}{2} + (1 - \alpha(s)^2)\, A_0 \# B_0 = I_n. \]

Consequently, applying (12) to the matrices $A = Y^{1/2} A_0 Y^{1/2}$, $B = Y^{1/2} B_0 Y^{1/2}$ we
obtain that f(X) ≤ f(Y). Finally, by the continuity of f we conclude that the
function f is operator monotone on [0, ∞).
To show that (ii) implies (i), following the same argument, it suffices to show that
the function $k_s : (0, 1] \to (0, 1]$ defined by

\[ k_s(x) = \frac{2\sqrt{x}}{x^s + x^{1-s}} \]

is bijective. However, by realizing $k_s$ as a hyperbolic secant, this is obvious.
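
For completeness, the hyperbolic-secant form mentioned above can be written out as follows; this short derivation is added for illustration and uses only the definition of $k_s$.

```latex
\[
k_s(x) = \frac{2\sqrt{x}}{x^{s} + x^{1-s}}
       = \frac{2}{x^{s-1/2} + x^{-(s-1/2)}}
       = \frac{1}{\cosh\big((s-\tfrac12)\ln x\big)}
       = \operatorname{sech}\big((s-\tfrac12)\ln x\big).
\]
```

As x runs over (0, 1], the quantity $|s - \tfrac12| \ln(1/x)$ runs over [0, ∞), and sech is continuous and strictly decreasing from 1 to 0 on [0, ∞), which gives the bijection onto (0, 1] for s ≠ 1/2.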


Remark 2.4 There are numerous papers on the matrix Heron mean and the matrix
Heinz mean and related questions. We refer the readers to [8, 16, 20, 22, 24–26] and
the references therein.

2.2 Symmetric and Self-adjoint Means via Integral Representations

In this section we use characterizations of symmetric means given in [4] and of

self-adjoint means given in [30] to characterize operator monotone functions via the
geometric mean and a mean σ under certain constraints. More precisely, we use

g(A#B) ≤ g(Aσ B),

g(A#B) ≥ g(Aσ B),

to characterize operator monotone functions.

2.2.1 Symmetric Means

Definition 2.5 ([4]) Let f : R+ → R+ , where R+ = (0, ∞). We say that f ∈ Fop
if it satisfies the following conditions:
1. f is operator monotone,
2. tf (t −1 ) = f (t) for all t ∈ R+ , and
3. f (1) = 1.
Notice, that functions in f ∈ Fop are in one-to-one correspondence with
symmetric means.
Definition 2.6 For f, g ∈ $F_{op}$, define

\[ \psi(t) = \frac{t+1}{2}\, \frac{f(t)}{g(t)}, \quad t > 0. \]

We say $f \preceq g$ if and only if ψ ∈ $F_{op}$.

It is clear that if f ∈ $F_{op}$, then $\frac{2t}{1+t} \preceq f(t) \preceq \frac{1+t}{2}$, as $\psi(t) = t f(t)^{-1}$ or $\psi(t) = f(t)$
in these particular cases, both of which are operator monotone. It is shown in [4] that
$F_{op}$ forms a lattice under $\preceq$. It is worth noting that this order is stronger than the
regular point-wise order ≤. That is, if $f \preceq g$ then f ≤ g because $\psi(t) \le \frac{1+t}{2}$ (t ∈ R+).

It is shown in [4, Proposition 2.1] that f ∈ $F_{op}$ implies that f has an integral
representation of the form

\[ f(t) = \frac{1+t}{2}\, e^{H(t)} \tag{16} \]

where

\[ H(t) = \int_0^1 \frac{(\lambda^2 - 1)(1-t)^2}{(t+\lambda)(1+t\lambda)(\lambda+1)^2}\, h(\lambda)\, d\lambda \]

and h : [0, 1] → [0, 1] is a measurable function that is uniquely determined by f
a.e. We notice that if h(λ) = 1/2, then $f(t) = \sqrt{t}$.
In [4, Theorem 2.4], it is shown that $f \preceq g$ implies the following a.e.
inequality at the level of the corresponding measurable functions: $h_f \ge h_g$. If $f \preceq g$
and $h_f \ne h_g$ on a set of non-zero measure, we will say $f \prec g$.
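
The remark that h ≡ 1/2 yields the square root in (16) can be confirmed by numerical quadrature. The following is an illustrative sketch using a plain midpoint rule; the number of nodes and the tolerance in the final check are arbitrary choices, and the function names are hypothetical.

```python
import math

def H_half(t, n=20000):
    """Midpoint-rule approximation of H(t) in (16) with h(lambda) = 1/2."""
    total = 0.0
    for i in range(n):
        lam = (i + 0.5) / n
        kernel = (lam**2 - 1.0) * (1.0 - t) ** 2 / (
            (t + lam) * (1.0 + t * lam) * (lam + 1.0) ** 2
        )
        total += 0.5 * kernel
    return total / n

def f_from_h_half(t):
    """f(t) = (1 + t) / 2 * exp(H(t)) for the constant weight h = 1/2."""
    return (1.0 + t) / 2.0 * math.exp(H_half(t))

for t in (0.25, 1.0, 4.0):
    assert math.isclose(f_from_h_half(t), math.sqrt(t), rel_tol=1e-6)
```

The same routine with h ≡ 0 returns the arithmetic mean function (1 + t)/2, since then H vanishes identically.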
Lemma 2.7 Let f ∈ $F_{op}$ and define

\[ \varphi(t) = t^{-1} f(t^2). \]

Then,
1. If $\sqrt{\cdot} \prec f$ then, as a real function, φ is monotonically decreasing on (0, 1) and
monotonically increasing on (1, ∞).
2. If $\sqrt{\cdot} \succ f$ then φ is monotonically increasing on (0, 1) and monotonically
decreasing on (1, ∞).
Proof Consider the derivative

\[ \varphi'(t) = -t^{-2} f(t^2) + 2 f'(t^2). \]

To show monotonicity as a real function, it suffices to show

\[ 2t f'(t) \lessgtr f(t) \]

depending on the interval and the order relationship considered. Based on (16) we
consider

\[ 2t f'(t) = t e^{H(t)}\big(1 + (1+t) H'(t)\big) \lessgtr f(t), \text{ respectively,} \]

if and only if

\[ H'(t) \lessgtr \frac{1-t}{2t(1+t)}, \text{ respectively.} \]

Explicitly calculating $H'(t)$ we obtain

\[ H'(t) = \int_0^1 \left(\frac{1}{(t+\lambda)^2} - \frac{1}{(1+t\lambda)^2}\right) h(\lambda)\, d\lambda = \int_0^1 \frac{(1-\lambda^2)(1-t^2)}{(t+\lambda)^2 (1+t\lambda)^2}\, h(\lambda)\, d\lambda. \]

An easy calculation shows that when h(λ) is replaced by the constant function
1/2, the integral becomes

\[ \frac{1}{2} \int_0^1 \frac{(1-\lambda^2)(1-t^2)}{(t+\lambda)^2 (1+t\lambda)^2}\, d\lambda = \frac{1-t}{2t(1+t)}. \]

So now we apply [4, Theorem 2.4] to determine the monotonicity of φ in each case.
So, let $\sqrt{\cdot} \prec f$ and t ∈ (0, 1). In this case h(λ) ≤ 1/2 and the integrand satisfies

\[ \frac{(1-\lambda^2)(1-t^2)}{(t+\lambda)^2 (1+t\lambda)^2}\, h(\lambda) \ge 0 \]

for all (t, λ) ∈ (0, 1) × [0, 1]. Therefore,

\[ H'(t) \le \frac{1-t}{2t(1+t)}, \]

which implies that φ is monotonically decreasing on (0, 1). When t ∈ (1, ∞)
the integrand is non-positive and the inequality is reversed, yielding that φ is
monotonically increasing on that interval. The analysis for $\sqrt{\cdot} \succ f$ is similar, but in
this case h(λ) ≥ 1/2.
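
Both cases of Lemma 2.7 can be illustrated with the two extreme elements of the class, the arithmetic mean function (1 + t)/2 and the harmonic mean function 2t/(1 + t). The sketch below tabulates φ on two grids and checks the direction of monotonicity; the grids and helper names are assumptions made for this example.

```python
def phi(f, t):
    """phi(t) = f(t^2) / t from Lemma 2.7."""
    return f(t * t) / t

def arithmetic(t):
    """Representing function of the arithmetic mean (lies above the square root)."""
    return (1.0 + t) / 2.0

def harmonic(t):
    """Representing function of the harmonic mean (lies below the square root)."""
    return 2.0 * t / (1.0 + t)

def is_monotone(values, increasing):
    pairs = list(zip(values, values[1:]))
    if increasing:
        return all(u <= v for u, v in pairs)
    return all(u >= v for u, v in pairs)

left = [0.1 * k for k in range(1, 11)]          # grid in (0, 1]
right = [1.0 + 0.5 * k for k in range(0, 20)]   # grid in [1, 10.5]

# Case 1: decreasing on (0, 1), increasing on (1, infinity).
assert is_monotone([phi(arithmetic, t) for t in left], increasing=False)
assert is_monotone([phi(arithmetic, t) for t in right], increasing=True)

# Case 2: increasing on (0, 1), decreasing on (1, infinity).
assert is_monotone([phi(harmonic, t) for t in left], increasing=True)
assert is_monotone([phi(harmonic, t) for t in right], increasing=False)
```

For these two functions φ(t) equals (t + 1/t)/2 and its reciprocal, so the limits at infinity are ∞ and 0, respectively.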

Remark 2.8 Another way to obtain the previous result would be by using the
monotonicity on one interval and using the fact that $\varphi(t) = \varphi(t^{-1})$ by the symmetry
condition 2 in the definition of the class $F_{op}$. As a corollary, φ has an absolute
minimum/maximum at the point (1, 1).

Suppose that $\sqrt{\cdot} \prec f$. Then $\sqrt{t} < f(t)$ for some t ∈ (1, ∞). By the preceding
lemma φ is monotonically increasing on this interval, so

\[ \gamma := \lim_{t \to \infty} \varphi(t) = \lim_{t \to \infty} t^{-1} f(t^2) > 1. \]

As a result the interval (1, γ) is non-empty.

On the other hand, suppose that $\sqrt{\cdot} \succ f$. Then $\sqrt{t} > f(t)$ for some t ∈ (1, ∞).
In this case, however, φ is monotonically decreasing on this interval, so

\[ \gamma := \lim_{t \to \infty} \varphi(t) < 1 \]

and (γ, 1) is non-empty.


Lemma 2.9 Let σ be some symmetric operator mean on R+ with representing
function f such that $\sqrt{\cdot} \prec f$ (resp. $\sqrt{\cdot} \succ f$) and let $\gamma = \lim_{t \to \infty} f(t^2)/t$.
Then, if X and Y are positive definite operators such that X ≤ Y < γX (resp.
γX < Y ≤ X), then there exist positive operators A and B such that

\[ X = A\#B \quad \text{and} \quad Y = A\sigma B. \]

Proof Note that if we show that for $I_n \le X^{-1/2} Y X^{-1/2} := Y_0 \le \gamma I_n$ we can find
positive operators $A_0$ and $B_0$ such that

\[ I_n = A_0 \# B_0 \quad \text{and} \quad Y_0 = A_0 \sigma B_0, \]

we can obtain the desired result by choosing $A := X^{1/2} A_0 X^{1/2}$ and $B := X^{1/2} B_0 X^{1/2}$. This is equivalent to the following problem: Given $I_n \le Y_0 \le \gamma I_n$
find $A_0 \ge 0$ such that

\[ Y_0 = A_0 \sigma A_0^{-1}. \]

So, define $\varphi(t) := t \sigma t^{-1} = t f(t^{-2})$. By symmetry, we have that $\varphi(t) = t^{-1} f(t^2)$.
Since φ(t) is continuous on [1, ∞) and φ(1) = f(1) = 1, the function is bijective
from [1, ∞) onto [1, γ) and so we can define $A_0 = \varphi^{-1}(Y_0)$. This gives the desired
result. The proof for the case when $\sqrt{\cdot} \succ f$ is identical, but uses the fact that in this
case φ : [1, ∞) → (γ, 1] is bijective instead.

Theorem 2.10 Let σ be some symmetric operator mean on R+ with representing
function f such that $\sqrt{\cdot} \prec f$. Then, if

\[ g(A\#B) \le g(A\sigma B) \tag{17} \]

for any positive operators A and B, then the function g is operator monotone on
R+. If, on the other hand, $\sqrt{\cdot} \succ f$ and

\[ g(A\#B) \ge g(A\sigma B), \tag{18} \]

then g is operator monotone on R+.

Proof First we prove (17). Let f and φ be as in the proof of Lemma 2.9. Assume
that $\sqrt{\cdot} \prec f$ and choose $\gamma_0 \in (1, \gamma)$. Let 0 < X ≤ Y and consider the sequence:

\[ 0 < X \le \gamma_0 X \le \gamma_0^2 X \le \dots \]

Clearly, there exists k ∈ N such that:

\[ 0 \le X \le \gamma_0 X \le \gamma_0^2 X \le \dots \le \gamma_0^k X \le Y \le \gamma_0^{k+1} X. \]

Since $\gamma_0^i X \le \gamma_0^{i+1} X \le \gamma \gamma_0^i X$, Lemma 2.9 implies that there exist positive operators
$A_i$ and $B_i$ such that:

\[ \gamma_0^i X = A_i \# B_i \quad \text{and} \quad \gamma_0^{i+1} X = A_i \sigma B_i \]

and so

\[ g(X) \le g(\gamma_0 X) \le g(\gamma_0^2 X) \le \dots \le g(\gamma_0^k X). \]

Since $\gamma_0^k X \le Y \le \gamma \gamma_0^k X$, the lemma gives

\[ g(X) \le g(\gamma_0 X) \le g(\gamma_0^2 X) \le \dots \le g(\gamma_0^k X) \le g(Y). \]

The proof of (18) is similar, hence omitted.

2.2.2 Self-adjoint Means

A mean σ is said to be self-adjoint if it satisfies

(Aσ B)−1 = A−1 σ B −1 .

There exists a one-to-one correspondence between self-adjoint means and the

class of operator monotone functions E defined below. This correspondence was
considered by Hansen in [30] and a characterization was given in terms of the
exponential of an integral.
Definition 2.11 Let f : R+ → R+ . We say that f ∈ E if it satisfies the following
conditions:
1. f is operator monotone and
2. $f(t^{-1}) = f(t)^{-1}$ for all t ∈ R+ .
The aforementioned characterization is proved in [30, Theorem 1.1] and it states
that

$f(t) = \exp \int_{-1}^{0} \left( \frac{1}{\lambda - t} + \frac{t}{1 - \lambda t} \right) h(\lambda)\, d\lambda,$

where h : [−1, 0] → [0, 1] is a measurable function. We notice that if h(λ) = 1/2,
then $f(t) = \sqrt{t}$.
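The claim for h ≡ 1/2 can be checked by a direct computation. The following worked derivation is a sketch that assumes the integrand has the form $\frac{1}{\lambda - t} + \frac{t}{1 - \lambda t}$ on [−1, 0], for t > 0:

```latex
\begin{align*}
\int_{-1}^{0} \frac{d\lambda}{\lambda - t}
  &= \Big[ \ln|\lambda - t| \Big]_{-1}^{0} = \ln t - \ln(1+t), \\
\int_{-1}^{0} \frac{t\, d\lambda}{1 - \lambda t}
  &= \Big[ -\ln(1 - \lambda t) \Big]_{-1}^{0} = \ln(1+t), \\
f(t) &= \exp\Big( \tfrac{1}{2} \big[ \ln t - \ln(1+t) + \ln(1+t) \big] \Big) = \sqrt{t}.
\end{align*}
```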
Definition 2.12 For f, g ∈ E, we say $f \succeq_{sa} g$ if and only if $fg^{-1}$ is operator
monotone.
In the following, we show that the relation so defined satisfies the same
properties as the order defined in [4] on Fop that we introduced earlier in this section.

Note that f, g ∈ E implies that $(f/g)(t^{-1}) = ((f/g)(t))^{-1}$. So, requiring $fg^{-1}$
to be operator monotone is equivalent to requiring $fg^{-1} \in E$. Therefore, there exists a
class of measurable functions $h_{fg^{-1}} : [-1, 0] \to [0, 1]$ such that

$(fg^{-1})(t) = \exp \int_{-1}^{0} \left( \frac{1}{\lambda - t} + \frac{t}{1 - \lambda t} \right) h_{fg^{-1}}(\lambda)\, d\lambda,$

and

$h_{fg^{-1}}(\lambda) = h_f(\lambda) - h_g(\lambda)$ a.e.

So, clearly $f \succeq_{sa} g$ if and only if $h_f \ge h_g$ a.e. Therefore, $\succeq_{sa}$ defines an order
relation on E. Moreover, it is easy to see that f ∈ E implies that $1 \preceq_{sa} f(t) \preceq_{sa} t$.
Indeed, $1 \preceq_{sa} f(t)$ follows from the monotonicity of f, and $f(t) \preceq_{sa} t$ follows
from the monotonicity of $t/f(t)$.
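We can also define the meet and join of any two elements in a similar fashion as
in [4]. Let f, g ∈ E; then define:

$f \wedge g = \exp \int_{-1}^{0} \left( \frac{1}{\lambda - t} + \frac{t}{1 - \lambda t} \right) \min\{h_f(\lambda), h_g(\lambda)\}\, d\lambda,$

$f \vee g = \exp \int_{-1}^{0} \left( \frac{1}{\lambda - t} + \frac{t}{1 - \lambda t} \right) \max\{h_f(\lambda), h_g(\lambda)\}\, d\lambda.$

As an immediate result we obtain that E with $\preceq_{sa}$ is a lattice. That is, for any two
f, g ∈ E,

$f \wedge g \preceq_{sa} f \preceq_{sa} f \vee g.$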

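Moreover, the map

$f(t) \mapsto f^{\dagger}(t) = \frac{t}{f(t)}$

is an involutive order reversing map on E. Indeed, it is easy to see $f^{\dagger\dagger} = f$,

$f^{\dagger}(t^{-1}) = \frac{1}{t f(t^{-1})} = \frac{f(t)}{t} = (f^{\dagger}(t))^{-1},$

and

$f \preceq_{sa} g \implies g^{\dagger} \preceq_{sa} f^{\dagger}.$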

Now we turn to a characterization of operator monotone functions using self-adjoint
means. As before, if $f \preceq_{sa} g$ and $h_f \ne h_g$ on a set of non-zero measure,
we will say $f \prec_{sa} g$.
Lemma 2.13 Let f ∈ E and define

$\varphi(t) = t^{-1} f(t^2).$

Then,
1. If $\sqrt{\cdot} \preceq_{sa} f$ then, as a real function, ϕ is monotonically increasing on R+ .
2. If $\sqrt{\cdot} \succeq_{sa} f$ then ϕ is monotonically decreasing on R+ .
Proof As before, to show the monotonicity of ϕ as a real function, it suffices to
show

$2t f'(t) \lessgtr f(t)$ (19)

depending on the interval and the order relationship considered. With the integral
expression of f, (19) becomes:

$2t \int_{-1}^{0} \left( \frac{1}{(\lambda - t)^2} + \frac{1}{(1 - \lambda t)^2} \right) h(\lambda)\, d\lambda \lessgtr 1.$

The result now follows from the fact that the integrand is non-negative and, for
h(λ) = 1/2, $f(t) = \sqrt{t}$ and

$\int_{-1}^{0} \left( \frac{1}{(\lambda - t)^2} + \frac{1}{(1 - \lambda t)^2} \right) d\lambda = \frac{1}{t}.$
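The value 1/t of the last integral can be verified with elementary antiderivatives. This is a worked check for t > 0, assuming the integrand $\frac{1}{(\lambda - t)^2} + \frac{1}{(1 - \lambda t)^2}$ on [−1, 0]:

```latex
\begin{align*}
\int_{-1}^{0} \frac{d\lambda}{(\lambda - t)^2}
  &= \Big[ -\frac{1}{\lambda - t} \Big]_{-1}^{0} = \frac{1}{t} - \frac{1}{1+t}, \\
\int_{-1}^{0} \frac{d\lambda}{(1 - \lambda t)^2}
  &= \Big[ \frac{1}{t(1 - \lambda t)} \Big]_{-1}^{0} = \frac{1}{t} - \frac{1}{t(1+t)}, \\
\text{and the sum is}\quad
  & \frac{2}{t} - \frac{t + 1}{t(1+t)} = \frac{1}{t}.
\end{align*}
```

In particular, for h ≡ 1/2 the left-hand side of the inequality equals $2t \cdot \tfrac{1}{2} \cdot \tfrac{1}{t} = 1$, which is the borderline case.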

Using the same arguments as in Lemma 2.9 and Theorem 2.10 we can show the
following result.
Theorem 2.14 Let σ be some self-adjoint operator mean on R+ with representing
function f such that $\sqrt{\cdot} \prec_{sa} f$ and let A and B be positive operators such that
A < B. Then, if

$g(A \# B) \le g(A \,\sigma\, B),$ (20)

then the function g is operator monotone on R+ . If, on the other hand, $\sqrt{\cdot} \prec_{sa} f^{\dagger}$
and

$g(A \# B) \ge g(A \,\sigma\, B),$ (21)

then g is operator monotone on R+ .


2.2.3 Kubo-Ando Condition

There is yet another class of means to consider. Let τ and τ ⊥ be the means
represented by operator monotone functions g and g † , respectively. Kubo and Ando
showed in [34, Theorem 5.4] that if an operator mean σ with representing function
f satisfies

(Aτ B)σ (Aτ ⊥ B) ≤ Aσ B (22)

for a non-trivial mean τ and all positive operators A and B, then $f \ge \sqrt{\cdot}$. Moreover,
in [34, Theorem 5.7], they showed that whenever σ satisfies (22) for every operator
mean τ, its representing function f is such that $t^{-1} f(t^2)$ is non-increasing on (0, 1)
and non-decreasing on (1, ∞). In subsequent corollaries, they also showed
that if the inequality (22) is reversed then $f \le \sqrt{\cdot}$ and $t^{-1} f(t^2)$ is non-decreasing
on (0, 1) and non-increasing on (1, ∞).
These are precisely the behaviors needed in the proof of Lemma 2.9 and
consequently Theorem 2.10. Therefore, this allows us to follow the same arguments
to show a similar result for this particular class of means.
Theorem 2.15 Let A and B be positive operators and σ be an operator mean on
R+ satisfying (22) for every operator mean τ. Assume further that the representing
function f satisfies $f(x) > \sqrt{x}$ for all x ∈ (1, ∞). Then, if

$g(A \# B) \le g(A \,\sigma\, B),$ (23)

then the function g is operator monotone on R+ . If, on the other hand, the reversed
inequality is satisfied in (22) for every operator mean τ, $f(x) < \sqrt{x}$ for all x ∈
(0, 1), and

$g(A \# B) \ge g(A \,\sigma\, B),$ (24)

then g is operator monotone on R+ .

2.2.4 General Symmetric Means

In this section, we prove Theorem 2.10 for general symmetric Kubo-Ando means.
We used monotonicity of the function ϕ on certain intervals to obtain bijectivity,
thus obtaining a well-defined ϕ −1 when restricted to the appropriate intervals. With
this function, we were able to solve the problem in Lemma 2.9, which then allowed
us to obtain the desired characterization. With a little care, it is possible to obtain
the same result when ϕ is only surjective on the prescribed intervals.

We recall some of our notation from Sect. 2.2. Suppose that $\sqrt{\cdot} \lessgtr f$, respectively, and
as before define $\varphi(t) = t^{-1} f(t^2)$. Then, we have

$\gamma := \lim_{t \to \infty} \varphi(t) = \lim_{t \to \infty} t^{-1} f(t^2) \gtrless 1$, respectively.

With this we can show a lemma which is stronger than Lemma 2.9.
Lemma 2.16 Let σ be some symmetric operator mean on R+ with representing
function f such that $\sqrt{\cdot} < f$ (resp. $\sqrt{\cdot} > f$) and let $\gamma = \lim_{t \to \infty} f(t^2)/t$. Then, if
X and Y are positive definite matrices such that X ≤ Y < γ X (resp. γ X < Y ≤ X),
then there exist positive matrices A and B such that

X = A#B and Y = Aσ B.
Proof As in the proof of Lemma 2.9, we show the lemma when $\sqrt{\cdot} < f$. In this
case, it suffices to show that given $I_n \le Y_0 = U \operatorname{diag}(\{\lambda_i(Y_0)\})\, U^* \le \gamma I_n$, we can
find $A_0 \ge 0$ such that

$Y_0 = A_0 \,\sigma\, A_0^{-1} = \varphi(A_0).$

While ϕ(t) is not necessarily bijective in this case, it is continuous on [1, ∞) and
ϕ(1) = f (1) = 1. Therefore, the restriction of ϕ to some subset of [1, ∞) is
surjective onto [1, γ ).
Since σ (Y0 ) ⊂ [1, γ ), surjectivity of the restriction of ϕ implies that the set

$\varphi^{-1}(\lambda_i(Y_0)) := \{x \in [1, \infty) \mid \varphi(x) = \lambda_i(Y_0)\} \ne \emptyset.$

In particular, if we choose δi (Y0 ) ∈ ϕ −1 (λi (Y0 )) for each i, the matrix

A0 := U diag({δi (Y0 )})U ∗

satisfies

ϕ(A0 ) = U diag({ϕ(δi (Y0 ))})U ∗ = U diag({λi (Y0 )})U ∗ ,

and the result follows as in Lemma 2.9.

Now using the same argument as in the proof of Theorem 2.10 we can show the
following theorem.
Theorem 2.17 ([19]) Let f be a continuous function on (0, ∞). Then, f is
operator monotone if and only if either one of the following holds:
1. f (A#B) ≤ f (Aσ B) for all positive definite A and B and some symmetric
operator mean # < σ .
2. f (A#B) ≥ f (Aσ B) for all positive definite A and B and some symmetric
operator mean # > σ .

2.3 Matrix Power Means and Operator Monotone Functions

Now, let p be a real number, and a, b be positive. According to the relation (1), the
power mean μ(p, a, b) corresponds to the monotone function

$f_{\mu,p}(t) = \left( \frac{1 + t^p}{2} \right)^{1/p}.$

Then for positive definite matrices A and B, the Kubo-Ando matrix power mean is
defined as

$P_\mu(p, A, B) = A^{1/2} \left( \frac{I + (A^{-1/2} B A^{-1/2})^p}{2} \right)^{1/p} A^{1/2}.$

Therefore, from the chain of inequalities (10), for an operator monotone function f
on (0, ∞) we have

$f(A \# B) \le f\left(A^{1/2} f_{\mu,p}(A^{-1/2} B A^{-1/2}) A^{1/2}\right)$ (25)
$\le f\left(A^{1/2} f_{\mu,1}(A^{-1/2} B A^{-1/2}) A^{1/2}\right)$
$\le f\left(A^{1/2} f_{\mu,q}(A^{-1/2} B A^{-1/2}) A^{1/2}\right)$

whenever A, B are positive definite matrices.

Notice that the matrix power mean Pμ (p, A, B) is not either a symmetric or
self-adjoint Kubo-Ando mean. In this section, we investigate new characterizations
of operator monotone functions by using inequalities in (25). We show that if one
of the inequalities in (25) holds for any positive definite matrices A and B, then the
function f is operator monotone on (0, ∞).
The more difficult case is the one involving the naive matrix extension of the
power means. Let 1/2 ≤ p ≤ 1 ≤ q. The function $t^{1/q}$ is operator concave, while
the function $t^{1/p}$ is operator convex. Then we have

$\left( \frac{A^p + B^p}{2} \right)^{1/p} \le \frac{A + B}{2} \le \left( \frac{A^q + B^q}{2} \right)^{1/q}$ (26)

whenever A and B are positive semidefinite. It is worth noting that the inequalities
in (26) were discussed by Audenaert and Hiai [2], where they obtained conditions
on p and q such that (26) holds true. In [10, 18] the authors also studied new types
of operator convex functions and related inequalities.

2.3.1 Kubo-Ando Matrix Power Means and Characterizations

In this section we study the problem of characterization of operator monotone
functions using Kubo-Ando matrix power means.
Theorem 2.18 ([28]) Let 0 < p ≤ 1, and f be a continuous function on [0, ∞)
that satisfies the following inequality

$f(A \# B) \le f\left(A^{1/2} f_{\mu,p}(A^{-1/2} B A^{-1/2}) A^{1/2}\right),$ (27)

for any positive definite matrices A and B. Then f is operator monotone on (0, ∞).
Proof Suppose that the inequality (27) holds. It suffices to show that for any 0 ≤
X ≤ Y , there exist two positive semidefinite matrices A and B such that

$X = A \# B, \quad Y = A^{1/2} f_{\mu,p}(A^{-1/2} B A^{-1/2}) A^{1/2}.$

Firstly, let us consider the case when $X = I_n$. We now show that there exist positive
definite matrices $A_0$ and $B_0$ such that $A_0 \# B_0 = I_n$ and

$Y = A_0^{1/2} f_{\mu,p}(A_0^{-1/2} B_0 A_0^{-1/2}) A_0^{1/2} = A_0 f_{\mu,p}(A_0^{-2}).$ (28)

Since the function $h(x) = x f_{\mu,p}(x^{-2}) = \left( \frac{x^p + x^{-p}}{2} \right)^{1/p}$ is surjective from
(0, ∞) to [1, ∞), we obtain that for any $I_n \le Y$ there exists a matrix $A_0 > 0$
satisfying (28). The matrix $B_0$ is equal to $A_0^{-1}$.
In general, for 0 < X ≤ Y we have $I_n \le X^{-1/2} Y X^{-1/2}$. By the above
arguments, we can find positive definite matrices $A_0$ and $B_0$ such that $A_0 \# B_0 = I_n$ and

$X^{-1/2} Y X^{-1/2} = A_0^{1/2} f_{\mu,p}(A_0^{-1/2} B_0 A_0^{-1/2}) A_0^{1/2} = P_\mu(p, A_0, B_0).$

Consequently, applying (27) to matrices $A = X^{1/2} A_0 X^{1/2}$ and $B = X^{1/2} B_0 X^{1/2}$,
we obtain that f (X) ≤ f (Y ). In other words, f is operator monotone.
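The inverse step in this proof can be illustrated numerically in the scalar case. The sketch below assumes numbers in place of matrices and uses bisection, which is a choice made here for illustration and not the method of the text; the names `h` and `solve_a0` are hypothetical.

```python
import math


def h(x: float, p: float) -> float:
    """Scalar map h(x) = ((x**p + x**(-p)) / 2) ** (1/p), for x > 0 and p > 0."""
    return ((x ** p + x ** (-p)) / 2.0) ** (1.0 / p)


def solve_a0(y: float, p: float, tol: float = 1e-12) -> float:
    """Return a0 >= 1 with h(a0, p) = y, for y >= 1.

    Uses bisection on [1, hi]; h is increasing on [1, inf) because
    x**p + x**(-p) = 2*cosh(p*ln(x)).
    """
    if y < 1.0:
        raise ValueError("need y >= 1")
    lo, hi = 1.0, 2.0
    while h(hi, p) < y:  # enlarge the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if h(mid, p) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


p, y = 0.5, 3.0
a0 = solve_a0(y, p)
b0 = 1.0 / a0                                   # scalar analogue of B0 = A0^{-1}
geometric_mean = math.sqrt(a0 * b0)             # scalar analogue of A0 # B0
power_mean = ((a0 ** p + b0 ** p) / 2.0) ** (1.0 / p)
```

Here `geometric_mean` equals 1 up to rounding and `power_mean` recovers `y`, mirroring the pair of identities required of $A_0$ and $B_0$.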

Theorem 2.19 Let 0 < p ≤ 1 ≤ q, and f be a continuous function on [0, ∞) that
satisfies one of the following inequalities

$f\left( \frac{A + B}{2} \right) \le f\left(A^{1/2} f_{\mu,q}(A^{-1/2} B A^{-1/2}) A^{1/2}\right),$ (29)

$f\left(A^{1/2} f_{\mu,p}(A^{-1/2} B A^{-1/2}) A^{1/2}\right) \le f\left( \frac{A + B}{2} \right),$ (30)

for any positive definite matrices A and B. Then f is operator monotone on (0, ∞).

To prove Theorem 2.19, we will need the following lemma, which is the inverse
problem for the arithmetic mean and the power mean. Recently, in [27] we also
studied the inverse problem for the generalized contraharmonic means.
Lemma 2.20 Suppose X and Y are positive definite matrices satisfying X ≤ Y <
γ X (resp. γ X < Y ≤ X) where $\gamma = 2^{1-1/q}$ (resp. $\gamma = 2^{1-1/p}$). Then there exist
positive matrices A and B such that

$X = \frac{A + B}{2}$ and $Y = P_\mu(q, A, B)$ (resp. $Y = P_\mu(p, A, B)$),

where 0 < p ≤ 1 ≤ q.
Proof We show the lemma when X ≤ Y < γ X; the remaining case can be obtained
similarly. Firstly, let us consider the case when $X = I_n$. Then it suffices to show that
given $I_n \le Y = U \operatorname{diag}(\{\lambda_i(Y)\})\, U^* \le \gamma I_n$, we can find $A_0, B_0 \ge 0$ such that

$I_n = \frac{A_0 + B_0}{2}$ and $Y = P_\mu(q, A_0, B_0).$

Or, equivalently, there exists $0 < A_0 \le 2I_n$ such that $Y = P_\mu(q, A_0, 2I_n - A_0) = \varphi(A_0)$, where

$\varphi(x) = x^{1/2} f_{\mu,q}\left(x^{-1/2}(2 - x)x^{-1/2}\right) x^{1/2} = \left( \frac{x^q + (2 - x)^q}{2} \right)^{1/q}.$

Note that ϕ is continuous on [0, 2] and surjective from [0, 2] onto [1, γ ]. Since
λi (Y ) ∈ [1, γ ], for each i, we can choose δi (Y ) ∈ [0, 2] such that ϕ (δi (Y )) =
λi (Y ). The matrix

A0 := U diag ({δi (Y )}) U ∗

satisfies

ϕ(A0 ) = U diag ({ϕ (δi (Y ))}) U ∗ = U diag ({λi (Y )}) U ∗ .

In general, for 0 < X ≤ Y < γ X, we have In ≤ X−1/2 Y X−1/2 < γ In . By

the above arguments, we can find positive definite matrices A0 and B0 such that
(A0 + B0 )/2 = In and X−1/2 Y X−1/2 = Pμ (q, A0, B0 ). Now, let A = X1/2 A0 X1/2
and B = X1/2 B0 X1/2 , we have (A + B)/2 = X and

Y = X1/2 Pμ (q, A0 , B0 )X1/2 = Pμ (q, A, B),

which completes the proof of Lemma 2.20.
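For a concrete scalar instance of the lemma, one can take q = 2, so that γ = √2, and X = 1. The value q = 2 is chosen here only because the equation φ(a) = y then becomes a quadratic; it is an illustration, not part of the proof.

```latex
\begin{align*}
\varphi(a) = \Big( \frac{a^2 + (2 - a)^2}{2} \Big)^{1/2} = y
  &\iff a^2 - 2a + 2 = y^2 \\
  &\iff a = 1 \pm \sqrt{y^2 - 1}.
\end{align*}
% For 1 \le y < \sqrt{2} both roots lie in (0, 2). Taking
% a = 1 - \sqrt{y^2 - 1} and b = 2 - a = 1 + \sqrt{y^2 - 1} gives
\[
\frac{a + b}{2} = 1, \qquad
\Big( \frac{a^2 + b^2}{2} \Big)^{1/2} = \big( 1 + (y^2 - 1) \big)^{1/2} = y .
\]
```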


We are now ready to prove Theorem 2.19.

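Proof of Theorem 2.19 First we prove the case when f satisfies the inequality (29).
Let 0 ≤ X ≤ Y and $Y_0 = X^{-1/2} Y X^{-1/2}$, and choose $\gamma_0 \in (1, 2^{1-1/q})$. Consider
the spectral decomposition $Y_0 = \sum_{i=1}^{r} \lambda_i E_i$ with the eigenvalues $\lambda_i$ listed in
decreasing order. Then, there exists a set of non-ascending integers $\{m_i \mid 1 \le i \le r\}$
such that

$\gamma_0^{m_i} < \lambda_i \le \gamma_0^{m_i + 1}.$

Let $\ell_1 < \ell_2 < \cdots < \ell_t = r$ be the sequence of indexes such that

$m_1 = \cdots = m_{\ell_1} > m_{\ell_1 + 1} = \cdots = m_{\ell_2} > m_{\ell_2 + 1} = \cdots = m_{\ell_3} > \cdots > m_{\ell_{t-1} + 1} = \cdots = m_{\ell_t} = m_r.$

We have

$\gamma_0^{m_{\ell_t}} < \lambda_r < \cdots < \lambda_{\ell_{t-1}+1} \le \gamma_0^{m_{\ell_t}+1} \le \gamma_0^{m_{\ell_{t-1}}} < \lambda_{\ell_{t-1}} < \cdots < \lambda_{\ell_{t-2}+1} \le \gamma_0^{m_{\ell_{t-1}}+1} < \cdots < \lambda_1 \le \gamma_0^{m_{\ell_1}+1}.$

It follows that

\begin{align*}
I &< \gamma_0 I < \gamma_0^2 I < \cdots < \gamma_0^{m_{\ell_t}} I = \gamma_0^{m_{\ell_t}} (E_1 + E_2 + \cdots + E_r) \\
&\le \lambda_r E_r + \cdots + \lambda_{\ell_{t-1}+1} E_{\ell_{t-1}+1} + \gamma_0^{m_{\ell_t}} (E_{\ell_{t-1}} + E_{\ell_{t-1}-1} + \cdots + E_1) \\
&\le \lambda_r E_r + \cdots + \lambda_{\ell_{t-1}+1} E_{\ell_{t-1}+1} + \gamma_0^{m_{\ell_t}+1} (E_{\ell_{t-1}} + E_{\ell_{t-1}-1} + \cdots + E_1) \\
&\;\;\vdots \\
&\le \lambda_r E_r + \cdots + \lambda_{\ell_{t-1}+1} E_{\ell_{t-1}+1} + \gamma_0^{m_{\ell_{t-1}}} (E_{\ell_{t-1}} + E_{\ell_{t-1}-1} + \cdots + E_1) \\
&\le \lambda_r E_r + \cdots + \lambda_{\ell_{t-2}+1} E_{\ell_{t-2}+1} + \gamma_0^{m_{\ell_{t-1}}} (E_{\ell_{t-2}} + E_{\ell_{t-2}-1} + \cdots + E_1) \\
&\le \lambda_r E_r + \cdots + \lambda_{\ell_{t-2}+1} E_{\ell_{t-2}+1} + \gamma_0^{m_{\ell_{t-1}}+1} (E_{\ell_{t-2}} + E_{\ell_{t-2}-1} + \cdots + E_1) \\
&\;\;\vdots \\
&\le \lambda_r E_r + \cdots + \lambda_{\ell_1+1} E_{\ell_1+1} + \gamma_0^{m_{\ell_2}} (E_{\ell_1} + E_{\ell_1-1} + \cdots + E_1) \\
&\le \lambda_r E_r + \cdots + \lambda_{\ell_1+1} E_{\ell_1+1} + \gamma_0^{m_{\ell_2}+1} (E_{\ell_1} + E_{\ell_1-1} + \cdots + E_1) \\
&\;\;\vdots \\
&\le \lambda_r E_r + \cdots + \lambda_{\ell_1+1} E_{\ell_1+1} + \gamma_0^{m_{\ell_1}} (E_{\ell_1} + E_{\ell_1-1} + \cdots + E_1) \\
&\le \lambda_r E_r + \cdots + \lambda_1 E_1 = Y_0 \le \gamma_0^{m_{\ell_1}+1} I.
\end{align*}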

After multiplying each term of the chain of inequalities on both sides by $X^{1/2}$, and letting
$Z_k$ be the k-th expression of the chain, we obtain the following chain of inequalities

$0 \le X = Z_1 \le Z_2 \le \cdots \le Z_{m-1} \le Y \le Z_m = \gamma_0^{m_{\ell_1}+1} X,$

where m is a positive integer. The previous calculation gives us $Z_k \le Z_{k+1} \le \gamma Z_k$.

Therefore, it follows from Lemma 2.20 that there exist positive definite matrices A
and B such that

Zk = (A + B)/2 and Zk+1 = Pμ (q, A, B).

Consequently,

f (X) ≤ f (Z1 ) ≤ f (Z2 ) ≤ · · · ≤ f (Zm−1 ) ≤ f (Y ).

In other words, f is operator monotone.

The proof in the case when (30) holds is similar.

2.3.2 The Inverse Problem for Non-Kubo-Ando Matrix Power Means

In [2] Audenaert and Hiai determined values of p and q such that the following
inequality holds true

$\left( \frac{A^p + B^p}{2} \right)^{1/p} \le \left( \frac{A^q + B^q}{2} \right)^{1/q}$

whenever A, B are positive semidefinite matrices. When 1/2 ≤ p ≤ 1 ≤ q,
according to the operator convexity of $t^{1/p}$ and operator concavity of $t^{1/q}$ we have

$\left( \frac{A^p + B^p}{2} \right)^{1/p} \le \frac{A + B}{2} \le \left( \frac{A^q + B^q}{2} \right)^{1/q}.$ (31)

Suppose that 0 ≤ X ≤ Y . Solving the inverse mean problem is to find positive
definite matrices A and B such that

$X = \left( \frac{A^p + B^p}{2} \right)^{1/p}, \quad Y = \left( \frac{A^q + B^q}{2} \right)^{1/q}.$ (32)

If this system has a positive solution, then we may use the result to characterize
operator monotone functions. Unfortunately, inequalities in (31) do not characterize
operator monotone functions, in general.

Proposition 2.21 For any q > 1, there exists a function f that is not operator monotone
and satisfies

$f\left( \frac{A + B}{2} \right) \le f\left( \left( \frac{A^q + B^q}{2} \right)^{1/q} \right)$

for all positive definite matrices A and B.

Proof For a fixed number 1 < r ≤ min{2, q}, we consider the function $f(x) = x^r$,
which is not an operator monotone function. It follows from the operator convexity
of $x^r$ and the operator concavity of $x^{r/q}$ that

$\left( \frac{A + B}{2} \right)^r \le \frac{A^r + B^r}{2} \le \left( \frac{A^q + B^q}{2} \right)^{r/q} \le \left[ \left( \frac{A^q + B^q}{2} \right)^{1/q} \right]^r.$

Hence, the non-operator monotone function f satisfies

$f\left( \frac{A + B}{2} \right) \le f\left( \left( \frac{A^q + B^q}{2} \right)^{1/q} \right),$

which completes the proof of Proposition 2.21.

Similarly, one would expect to have the same conclusion for the first inequality,
namely, for 1/2 ≤ p ≤ 1, there exists a function f that is not operator monotone and satisfies

$f\left( \left( \frac{A^p + B^p}{2} \right)^{1/p} \right) \le f\left( \frac{A + B}{2} \right).$

However, when p = 1/2 we were able to solve the inverse problem and establish a
new characterization of operator monotone functions.
Theorem 2.22 Let f be a continuous function on [0, ∞) that satisfies the following
inequality

$f\left( \left( \frac{A^{1/2} + B^{1/2}}{2} \right)^2 \right) \le f\left( \frac{A + B}{2} \right),$

for any positive semidefinite matrices A and B. Then f is operator monotone.


Proof Firstly, we show that f (X) ≤ f (Y ) for any positive semidefinite matrices
X, Y with 0 ≤ X ≤ Y ≤ 2X. Indeed, we need to solve the following system

$\left( \frac{A^{1/2} + B^{1/2}}{2} \right)^2 = X, \qquad \frac{A + B}{2} = Y.$ (33)

Subtracting the first equation from the second, we obtain

$Y - X = \left( \frac{A^{1/2} - B^{1/2}}{2} \right)^2.$

Therefore, system (33) is equivalent to

$\frac{A^{1/2} + B^{1/2}}{2} = X^{1/2}, \qquad \frac{A^{1/2} - B^{1/2}}{2} = (Y - X)^{1/2}.$

The last system has a unique positive solution, namely

$A = \left( X^{1/2} + (Y - X)^{1/2} \right)^2, \quad B = \left( X^{1/2} - (Y - X)^{1/2} \right)^2.$

Notice that the condition 0 ≤ X ≤ Y < 2X guarantees the positive semidefiniteness
of A and B. Thus, f (X) ≤ f (Y ).
Now, for any positive semidefinite matrices 0 ≤ X ≤ Y , we apply the same
arguments as in the proof of Theorem 2.19 to obtain a positive integer m and positive
semidefinite matrices Z1 , . . . , Zm such that

0 ≤ X = Z1 ≤ Z2 ≤ Z3 ≤ · · · ≤ Zm−1 ≤ Y ≤ Zm

with Zk ≤ Zk+1 < 2Zk for all k = 1, 2, . . . , m. Therefore, combining with the
previous arguments, we get

f (X) = f (Z1 ) ≤ f (Z2 ) ≤ · · · ≤ f (Zm−1 ) ≤ f (Y ).
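The solution formulas for system (33) can be verified directly, without assuming that X and Y commute. In the check below, the abbreviations S and D are introduced here only for brevity and do not appear in the text.

```latex
% Abbreviations (introduced for this check): S = X^{1/2}, D = (Y - X)^{1/2}.
\begin{align*}
A^{1/2} = S + D, \quad B^{1/2} = S - D
  &\;\Longrightarrow\;
  \frac{A^{1/2} + B^{1/2}}{2} = S, \quad \frac{A^{1/2} - B^{1/2}}{2} = D, \\
\Big( \frac{A^{1/2} + B^{1/2}}{2} \Big)^2 &= S^2 = X, \\
\frac{A + B}{2} = \frac{(S + D)^2 + (S - D)^2}{2} &= S^2 + D^2 = X + (Y - X) = Y.
\end{align*}
```

The cross terms SD + DS cancel in the last line, so no commutativity is needed. Moreover, Y ≤ 2X gives Y − X ≤ X, and since the square root is operator monotone, D ≤ S, so S − D is indeed positive semidefinite and is the square root of B.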


3 Powers-Størmer’s Inequality and Characterizations of Operator Monotone Functions

Powers-Størmer’s inequality [3, 39] asserts that for s ∈ [0, 1] the following
inequality

$2\,\mathrm{Tr}(A^s B^{1-s}) \ge \mathrm{Tr}(A + B - |A - B|)$ (34)

holds for any pair of positive matrices A, B. This is a key inequality in the proof of the
upper bound in the quantum Chernoff bound in quantum hypothesis testing theory [3]. This
inequality was first proved in [3], using an integral representation of the function
$t^s$. After that, N. Ozawa [32, Proposition 1.1] gave a much simpler proof of the
same inequality, using the fact that, for s ∈ [0, 1], the function $f(t) = t^s$ (t ∈
[0, +∞)) is operator monotone. Recently, Ogata [36] extended this inequality
to standard von Neumann algebras. The motivation of this section is that if the
function $f(t) = t^s$ is replaced by another operator monotone function (this class is
intensively studied), then Tr (A + B − |A − B|) may admit a smaller upper bound, which
can be used in quantum hypothesis testing. Based on Ozawa’s proof, we formulate
Powers-Størmer’s inequality for an arbitrary operator monotone function on (0, +∞).
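In the scalar case, inequality (34) reduces to $2a^s b^{1-s} \ge a + b - |a - b| = 2\min\{a, b\}$, which is easy to test numerically. The sketch below checks only this scalar reduction on a small grid; the function name `powers_stormer_gap` and the grid values are assumptions made for illustration.

```python
import itertools


def powers_stormer_gap(a: float, b: float, s: float) -> float:
    """Return 2*a**s*b**(1-s) - (a + b - |a - b|) for scalars a, b > 0, s in [0, 1].

    A non-negative value means the scalar Powers-Stormer inequality holds.
    """
    return 2.0 * a ** s * b ** (1.0 - s) - (a + b - abs(a - b))


grid = [0.1, 0.5, 1.0, 2.0, 7.5]
exponents = [0.0, 0.25, 0.5, 0.75, 1.0]
gaps = [
    powers_stormer_gap(a, b, s)
    for a, b, s in itertools.product(grid, grid, exponents)
]
min_gap = min(gaps)  # non-negative up to floating-point rounding
```

The gap vanishes when a = b, so `min_gap` should be compared against a small negative tolerance rather than exactly zero.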
Lemma 3.1 ([31, Theorem 2.5]) Let f be a strictly positive, continuous function
on [0, ∞). If f is 2n-monotone, then for any positive semidefinite A and a
contraction C in M n

C ∗ f (A)C ≤ f (C ∗ AC).

Lemma 3.2 ([14]) Let f be a strictly positive, continuous function on (0, ∞). Then
f is n-monotone if and only if $-\frac{1}{f(t)}$ is n-monotone.
Proof For any $t_1, t_2, \ldots, t_n \in (0, \infty)$ we have

$\frac{\frac{1}{f(t_i)} - \frac{1}{f(t_j)}}{t_i - t_j} = \frac{\frac{f(t_j) - f(t_i)}{f(t_i) f(t_j)}}{t_i - t_j} = -\frac{1}{f(t_i) f(t_j)} \cdot \frac{f(t_i) - f(t_j)}{t_i - t_j}.$

Since f is n-monotone, the matrix $\left[ \frac{f(t_i) - f(t_j)}{t_i - t_j} \right]$ is positive semidefinite by Löwner [35];
hence, we have

\begin{align*}
\left[ \frac{\left(-\frac{1}{f(t_i)}\right) - \left(-\frac{1}{f(t_j)}\right)}{t_i - t_j} \right]
&= -\left[ \frac{\frac{1}{f(t_i)} - \frac{1}{f(t_j)}}{t_i - t_j} \right] \\
&= -\left[ -\frac{1}{f(t_i) f(t_j)} \cdot \frac{f(t_i) - f(t_j)}{t_i - t_j} \right] \\
&= \left[ \frac{1}{f(t_i) f(t_j)} \right] \circ \left[ \frac{f(t_i) - f(t_j)}{t_i - t_j} \right] \\
&\ge 0,
\end{align*}

where ◦ means the Hadamard product.

Therefore, the function $-\frac{1}{f(t)}$ is n-monotone by Löwner [35].

Proposition 3.3 Let f be a strictly positive, continuous function on (0, ∞). If f is
2n-monotone, the function $g(t) = \frac{t}{f(t)}$ is n-monotone.
Proof Let A, B be positive matrices in $M_n$ such that 0 < A ≤ B.
Let $C = B^{-1/2} A^{1/2}$. Then $\|C\| \le 1$. Since f is 2n-monotone, −f satisfies the
Jensen type inequality from Lemma 3.1, that is,

\begin{align*}
-f(A) = -f(C^* B C) &\le -C^* f(B) C, \\
-f(A) &\le -A^{1/2} B^{-1/2} f(B) B^{-1/2} A^{1/2}, \\
-A^{-1/2} f(A) A^{-1/2} &\le -B^{-1/2} f(B) B^{-1/2}, \\
-A^{-1} f(A) &\le -B^{-1} f(B).
\end{align*}

Hence, the function $-\frac{f(t)}{t}$ is n-monotone. Therefore, from Lemma 3.2 we conclude
that

$-\frac{1}{-\frac{f(t)}{t}} = \frac{t}{f(t)}$

is n-monotone.

Theorem 3.4 ([12]) Let f be a strictly positive, 2n-monotone function on (0, ∞).
Then for any pair of positive definite matrices $A, B \in M_n$ such that A ≤ B

$\mathrm{Tr}(A) + \mathrm{Tr}(B) - \mathrm{Tr}(|A - B|) \le 2\,\mathrm{Tr}\left( f(A)^{1/2} g(B) f(A)^{1/2} \right),$ (35)

where $g(t) = \frac{t}{f(t)}$.
Proof Let A, B be positive matrices in $M_n$ such that A ≤ B. We may assume that
A and B are invertible. For the operator A − B let us denote by $P = (A - B)_+$ and
$Q = (A - B)_-$ its positive and negative parts, respectively. Then we have

$A - B = P - Q$ and $|A - B| = P + Q,$ (36)

from which it follows that

$A + Q = B + P.$ (37)

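On account of (37) the inequality (35) is equivalent to the following

$\mathrm{Tr}(A) - \mathrm{Tr}\left( f(A)^{1/2} g(B) f(A)^{1/2} \right) \le \mathrm{Tr}(P).$

Since B + P ≥ B ≥ 0 and B + P = A + Q ≥ A ≥ 0 we have g(A) ≤ g(B + P )
by Proposition 3.3 and

\begin{align*}
&\mathrm{Tr}(A) - \mathrm{Tr}\left( f(A)^{1/2} g(B) f(A)^{1/2} \right) \\
&\quad= \mathrm{Tr}\left( f(A)^{1/2} g(A) f(A)^{1/2} \right) - \mathrm{Tr}\left( f(A)^{1/2} g(B) f(A)^{1/2} \right) \\
&\quad\le \mathrm{Tr}\left( f(A)^{1/2} g(B + P) f(A)^{1/2} \right) - \mathrm{Tr}\left( f(A)^{1/2} g(B) f(A)^{1/2} \right) \\
&\quad= \mathrm{Tr}\left( f(A)^{1/2} (g(B + P) - g(B)) f(A)^{1/2} \right) \\
&\quad\le \mathrm{Tr}\left( f(B + P)^{1/2} (g(B + P) - g(B)) f(B + P)^{1/2} \right) \\
&\quad= \mathrm{Tr}\left( f(B + P)^{1/2} g(B + P) f(B + P)^{1/2} \right) - \mathrm{Tr}\left( f(B + P)^{1/2} g(B) f(B + P)^{1/2} \right) \\
&\quad\le \mathrm{Tr}(B + P) - \mathrm{Tr}\left( f(B)^{1/2} g(B) f(B)^{1/2} \right) \\
&\quad= \mathrm{Tr}(B + P) - \mathrm{Tr}(B) \\
&\quad= \mathrm{Tr}(P).
\end{align*}

Thus, we have the conclusion.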

Remark 3.5 In [9, 13] we studied the interpolation classes and matrix means,
and proved the Powers-Størmer inequality for interpolation functions. Some new
characterizations of operator monotone functions using the arithmetic-geometric
means inequality were obtained in [17].
Now let ϕ be a normal state on the algebra B(H) of all bounded operators on
a Hilbert space H, f a strictly positive, continuous function on (0, ∞), and g a
function on (0, ∞) defined by $g(t) = \frac{t}{f(t)}$. In the following theorem, we show that the
Powers-Størmer type inequality

$\varphi(A + B) - \varphi(|A - B|) \le 2\varphi\left( f(A)^{1/2} g(B) f(A)^{1/2} \right)$ (38)

characterizes the matrix monotonicity.

The following two lemmas are obvious.
Lemma 3.6 Let A and B be positive semidefinite matrices in $M_n$ such that $A \nleq B$.
Then there is a unitary U in $M_n$ such that $a_{11} > b_{11}$ for $[a_{ij}] = U A U^*$ and
$[b_{ij}] = U B U^*$.

Lemma 3.7 Let $A = (a_{ij})$, $B = (b_{ij})$ be positive invertible matrices in $M_n$ and S a
non-finite rank density operator on an infinite dimensional, separable Hilbert space H.
Suppose that $a_{11} > b_{11}$. Then there exist an orthogonal system $\{\xi_i\}_{i=1}^{\infty} \subset H$ and
$\{\lambda_i\}_{i=1}^{\infty} \subset [0, 1)$ such that $\sum_{i=1}^{\infty} \lambda_i = 1$, $S\xi_i = \lambda_i \xi_i$, and $\sum_{i=1}^{n} a_{ii} \lambda_i > \sum_{i=1}^{n} b_{ii} \lambda_i$.
Theorem 3.8 Let H be an infinite dimensional, separable Hilbert space and ϕ a
normal state on B(H) such that its corresponding density operator is not finite rank.
Let f be a strictly positive, continuous function on (0, ∞), and g be a function on
(0, ∞) defined by $g(t) = \frac{t}{f(t)}$. Suppose that

$\varphi(A + B) - \varphi(|A - B|) \le 2\varphi\left( f(A)^{1/2} g(B) f(A)^{1/2} \right)$ (39)

for any positive invertible A, B ∈ B(H). Then both functions f and g on (0, ∞)
are operator monotone.
Proof Suppose that g is not operator monotone. Then there exist n ∈ N and
invertible positive matrices A, B in $M_n$ with A ≤ B such that $g(A) \nleq g(B)$.
Hence,

$A \nleq f(A)^{1/2} g(B) f(A)^{1/2}.$

Put $A = [a_{ij}]$ and $f(A)^{1/2} g(B) f(A)^{1/2} = [b_{ij}] = B'$. Note that for any unitary U in $M_n$

\begin{align*}
U A U^* &\nleq U f(A)^{1/2} g(B) f(A)^{1/2} U^* = U f(A)^{1/2} U^* \, U g(B) U^* \, U f(A)^{1/2} U^* \\
&= f(U A U^*)^{1/2} g(U B U^*) f(U A U^*)^{1/2}.
\end{align*}

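Hence without loss of generality, we can assume that $a_{11} > b_{11}$ by Lemma 3.6.
Let $S_\varphi$ be a density operator on H such that $\varphi(X) = \mathrm{Tr}(S_\varphi X)$ for all X ∈ B(H).
By Lemma 3.7, there exist an orthogonal system $\{\xi_i\}_{i=1}^{\infty} \subset H$ and $\{\lambda_i\}_{i=1}^{\infty} \subset [0, 1)$
such that $\sum_{i=1}^{\infty} \lambda_i = 1$ and $\sum_{i=1}^{n} a_{ii} \lambda_i > \sum_{i=1}^{n} b_{ii} \lambda_i$.
Let us consider the following canonical inclusion:

$\rho : M_n \longrightarrow \left( \sum_{i=1}^{n} |\xi_i\rangle\langle\xi_i| \right) B(H) \left( \sum_{i=1}^{n} |\xi_i\rangle\langle\xi_i| \right),$

$\rho([x_{ij}]) = \sum_{i,j=1}^{n} x_{ij} |\xi_i\rangle\langle\xi_j|.$

Put

$C = \rho(A) + \sum_{i=n+1}^{\infty} |\xi_i\rangle\langle\xi_i|$ and $D = \rho(B) + \sum_{i=n+1}^{\infty} |\xi_i\rangle\langle\xi_i|.$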

Then both operators C and D are invertible on H and C ≤ D. That means,
inequality (39) holds true for the selected C, D, that is,

$\varphi(C) \le \varphi\left( f(C)^{1/2} g(D) f(C)^{1/2} \right).$

On the other hand, note that

$\rho(f(A)^{1/2}) \rho(g(B)) \rho(f(A)^{1/2}) = \rho\left( f(A)^{1/2} g(B) f(A)^{1/2} \right) = \sum_{i,j=1}^{n} b_{ij} |\xi_i\rangle\langle\xi_j|.$

Then by straightforward calculations, we obtain

$f(C)^{1/2} g(D) f(C)^{1/2} = \sum_{i,j=1}^{n} b_{ij} |\xi_i\rangle\langle\xi_j| + \sum_{i=n+1}^{\infty} |\xi_i\rangle\langle\xi_i|.$

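Consequently,

\begin{align*}
\varphi(C) &= \mathrm{Tr}\left( S_\varphi \Big( \rho(A) + \sum_{i=n+1}^{\infty} |\xi_i\rangle\langle\xi_i| \Big) \right) \\
&= \sum_{i=1}^{n} (S_\varphi \rho(A) \xi_i \mid \xi_i) + \sum_{i=n+1}^{\infty} \lambda_i \\
&= \sum_{i=1}^{n} a_{ii} \lambda_i + \sum_{i=n+1}^{\infty} \lambda_i \\
&> \sum_{i=1}^{n} b_{ii} \lambda_i + \sum_{i=n+1}^{\infty} \lambda_i \\
&= \sum_{i=1}^{n} \left( S_\varphi \rho\big( f(A)^{1/2} g(B) f(A)^{1/2} \big) \xi_i \mid \xi_i \right) + \sum_{i=n+1}^{\infty} \lambda_i \\
&= \mathrm{Tr}\left( S_\varphi \Big( \rho\big( f(A)^{1/2} g(B) f(A)^{1/2} \big) + \sum_{i=n+1}^{\infty} |\xi_i\rangle\langle\xi_i| \Big) \right) \\
&= \varphi\left( f(C)^{1/2} g(D) f(C)^{1/2} \right).
\end{align*}

The last inequality contradicts (39). Therefore, the function g is operator
monotone.
Moreover, the monotonicity of f follows from [31, Corollary 6].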

Remark 3.9 The Powers-Størmer inequality for interpolation functions was proved
in [9]. The matrix Powers-Størmer inequality was proved in [15]. Namely, we
showed that for a strictly positive operator monotone function f on (0, ∞) with
f ((0, ∞)) ⊂ (0, ∞) and f (1) = 1,

$\frac{A + B}{2} - A \,\sigma_f\, B \le \frac{1}{2} |A - B|$

for any positive semidefinite matrices A and B satisfying the condition AB + BA ≥
0. Using the same method as in Sect. 2, the first author [7] also obtained a new
characterization of operator monotone functions using the matrix Powers-Størmer
inequality. It was shown that for a nonnegative function f on [0, ∞), if

$f\left( \frac{A + B}{2} \right) \le f\left( A \# B + \frac{1}{2} A^{1/2} \left| I_n - A^{-1/2} B A^{-1/2} \right| A^{1/2} \right)$

for any positive definite matrices A and B, then f is operator monotone on [0, ∞).
Finally, we show that if the monotonicity inequality holds for at least one normal
state on the algebra B(H), where H is some infinite-dimensional Hilbert space, then
we also have a new characterization of operator monotone functions.
Theorem 3.10 ([11]) Let ϕ be a normal state on B(H). The following condition is
sufficient (and, evidently, necessary) for a continuous function $f : \Omega \to \mathbb{R}$ (where
Ω is a subset of R) to be an operator monotone function:
(∗): for any $A, B \in B(H)^{sa}$ such that σ (A), σ (B) ⊂ Ω,

$A \le B \implies \varphi(f(A)) \le \varphi(f(B)).$

Proof Let f be a continuous function that satisfies the condition (∗), and suppose
that f is not an operator monotone function on Ω. Therefore, there exist a
natural number n and Hermitian matrices $A' = [a_{ij}]_{i,j=1}^{n}$ and $B' = [b_{ij}]_{i,j=1}^{n}$
such that σ (A′), σ (B′) ⊂ Ω, A′ ≤ B′, and $\alpha_{11} > \beta_{11}$, where $[\alpha_{ij}]_{i,j=1}^{n} = f(A')$ and
$[\beta_{ij}]_{i,j=1}^{n} = f(B')$.
Let $\xi_k$ be the eigenvectors of the density operator $S_\varphi$ corresponding to eigenvalues
(possibly, zero ones) $\lambda_k$. In the space H we choose an orthonormal system
$\{\xi_k\}_{k=1}^{n}$ of eigenvectors of the operator $S_\varphi$ such that $\sum_{k=1}^{n} \alpha_{kk} \lambda_k > \sum_{k=1}^{n} \beta_{kk} \lambda_k$.
We can always do this, because the sum of all eigenvalues of the operator $S_\varphi$, taking
into account their multiplicities, equals one; therefore we can choose $\lambda_1$ sufficiently
large, and $\lambda_k$ (k = 2, 3, . . . , n) arbitrarily small. Let us complement the system
$\{\xi_k\}_{k=1}^{n}$ to the orthonormal basis $\{\xi_k\}_{k=1}^{n} \cup \{\xi_k\}_{k \in K}$ (where K is a set of indexes)
consisting of eigenvectors of $S_\varphi$. Consider H as the direct sum $H_1 \oplus H_2$ of the
n-dimensional Hilbert space $H_1$ with the basis $\{\xi_k\}_{k=1}^{n}$ and the Hilbert space $H_2$ with
the basis $\{\xi_k\}_{k \in K}$. Choose some $\eta_0 \in \Omega$ and put $A = A' \oplus \eta_0 E$, $B = B' \oplus \eta_0 E$,
where E is the unit operator in the space $H_2$. Then $A, B \in B(H)^{sa}$; σ (A), σ (B) ⊂ Ω
and A ≤ B; but

\begin{align*}
\varphi(f(A)) = \mathrm{Tr}(S_\varphi f(A)) &= \sum_{k=1}^{n} \alpha_{kk} \lambda_k + \sum_{k \in K} f(\eta_0) \lambda_k \\
&> \sum_{k=1}^{n} \beta_{kk} \lambda_k + \sum_{k \in K} f(\eta_0) \lambda_k = \varphi(f(B)),
\end{align*}

which contradicts the condition (∗).

Remark 3.11 In [21] we studied monotonicity inequalities for the extended part of a
von Neumann algebra that involve operator monotone functions. In a recent paper
[23] we studied functions preserving operator means, and also established new
characterizations of operator monotone functions.

References

1. T. Ando, F. Hiai, Operator log-convex functions and operator means. Math. Ann. 350(3), 611–
630 (2011)
2. K.M.R. Audenaert, F. Hiai, On matrix inequalities between the power means: Counterexamples.
Linear Algebra Appl. 439(5), 1590–1604 (2013)
3. K.M.R. Audenaert, J. Calsamiglia, L.I. Masanes, R. Munoz-Tapia, A. Acin, E. Bagan, F.
Verstraete, Discriminating States: The quantum Chernoff bound. Phys. Rev. Lett. 98, 160501
(2007)
4. K.M.R. Audenaert, L. Cai, F. Hansen, Inequalities for quantum skew information. Lett. Math.
Phys. 85, 135–146 (2008)
5. R. Bhatia, Matrix Analysis, Graduate Texts in Mathematics (Springer, New York, 1997)
6. R. Bhatia, Interpolating the arithmetic-geometric mean inequality and its operator version.
Linear Algebra Appl. 413, 355–363 (2006)
7. T.H. Dinh, On characterization of operator monotone functions. Linear Algebra Appl. 487,
260–267 (2015)
8. T.H. Dinh, Some inequalities for the matrix Heron mean. Linear Algebra Appl. 528, 321–330
(2017)
9. T.H. Dinh, H. Osaka, Interpolation functions and inequalities. Banach J. Math. Anal. 9(1),
67–74 (2015)
10. T.H. Dinh, B.K. Vo, Some inequalities for operator (p, h)-convex functions. Linear Multilinear
Algebra 66(3), 580–592 (2018)

11. T.H. Dinh, O.E. Tikhonov, To the theory of operator monotone and operator convex functions.
Izv. Vyssh. Uchebn. Zaved. Mat. 54(3), 9–14 (2010) [Translation: Russian Mathematics. 54(3),
7–11 (2010)]
12. T.H. Dinh, H. Osaka, H.M. Toan, On generalized Powers-Størmer’s inequality. Linear Algebra
Appl. 438(1), 242–249 (2013)
13. T.H. Dinh, M.T. Ho, H. Osaka, Interpolation classes and matrix means. Banach J. Math. Anal.
9(3), 140–152 (2015)
14. T.H. Dinh, H. Osaka, J. Tomiyama, Characterization of operator monotone functions by
Powers-Størmer type inequalities. Linear Multilinear Algebra 63(8), 1577–1589 (2015)
15. T.H. Dinh, H. Osaka, B.K. Vo, A generalized reverse Cauchy inequality for matrices. Linear
Multilinear Algebra 64(7), 1415–1423 (2016)
16. T.H. Dinh, M.S. Moslehian, C. Conde, P. Zhang, An extension of the Polya-Szego operator
inequality. Expo. Math. 35(2), 212–220 (2017)
17. T.H. Dinh, M.T. Ho, H.B. Du, On some matrix mean inequalities with Kantorovich constant.
Sci. Math. Jap. 80(2), 139–151 (2017)
18. T.H. Dinh, T.D. Dinh, B.K. Vo, A new type of operator convexity. Acta Math. Viet. 4(43),
595–605 (2018)
19. T.H. Dinh, R. Dumitru, J. Franco, New characterizations of operator monotone functions.
Linear Algebra Appl. 546, 169–186 (2018)
20. T.H. Dinh, B.K. Vo, T.Y. Tam, In-sphere property and reverse inequalities for matrix means.
Electr. J. Linear Algebra 35(35), Article 3 (2019)
21. T.H. Dinh, O.E. Tikhonov, L. Veselova, Inequalities for the extended part of a von Neumann
algebra, related to operator monotone and operator convex functions. Ann. Funct. Anal. 10(3),
425–432 (2019)
22. T.H. Dinh, M.T. Ho, C.T. Le, B.K. Vo. Two trace inequalities for operator functions. Math.
Ineq. Appl. 22(3), 1021–1026 (2019)
23. T.H. Dinh, H. Osaka, S. Wada, Functions preserving operator means. Ann. Funct. Anal. 11,
1203–1219 (2020)
24. T.H. Dinh, R. Dumitru, J. Franco, On the matrix Heron means and Renyi divergences. Linear
Multilinear Algebra (2020). https://doi.org/10.1080/03081087.2020.1763239
25. T.H. Dinh, H.R. Moradi, M. Sababheh, On the Pólya-Szegö operator inequality. Hacettepe J.
Math. Stat. 49(5), 1744–1752 (2020)
26. T.H. Dinh, A.V. Le, C.T. Le, N.Y. Phan. The matrix Heinz mean and related divergence.
Hacettepe J. Math. Stat. 49(5), 1744–1752 (2020)
27. T.H. Dinh, C.T. Le, B.K Vo. The inverse problem for generalized contraharmonic means,
accepted to Russian Math (2022)
28. T.H. Dinh, C.T. Le, V.T. Nguyen, B.K. Vo. Matrix power means and new characterizations of
operator monotone functions, accepted to Linear Multilinear Algebra (2022)
29. W. Donoghue, Monotone Matrix Functions and Analytic Continuation (Springer, New York,
1974)
30. F. Hansen, Selfadjoint means and operator monotone functions. Math. Ann. 256, 29–35 (1981)
31. F. Hansen, G.K. Pedersen, Jensen’s inequality for operator and Löwner’s theorem. Math. Ann.
258, 229–241 (1982)
32. V. Jakšic, Y. Ogata, C.A. Pillet, R. Seiringer, Quantum hypothesis testing and non-equilibrium
statistical mechanics. Rev. Math. Phys. 24(6), 1230002, 67 pp. (2012)
33. Y. Kapil, C. Conde, M.S. Moslehian, M. Singh, M. Sababheh. Norm inequalities related to the
Heron and Heinz means. Mediterr. J. Math. 14(5), paper no. 213, 18 pp. (2017)
34. F. Kubo, T. Ando, Means of positive linear operators. Math. Ann. 246(3), 205–224 (1980)
35. C. Löwner, Über monotone matrix funktionen. Math. Z. 38, 177–216 (1934)
36. Y. Ogata, A generalization of Powers-Størmer inequality. Lett. Math. Phys. 97(3), 339–346
(2011)
37. H. Osaka, J. Tomiyama, Double piling structure of matrix monotone functions and of matrix
convex functions. Linear Algebra Appl. 431, 1825–1832 (2009)
38. D. Petz, Monotone metrics on matrix spaces. Linear Algebra Appl. 244, 81–96 (1996)
39. R.T. Powers, E. Størmer, Free states of the canonical anti-commutation relations. Commun.
Math. Phys. 16, 1–33 (1970)
40. W. Pusz, S.L. Woronowicz. Functional calculus for sesquilinear forms and the purification
map. Rep. Math. Phys. 5, 159–170 (1975)
41. B. Simon, Löwner’s Theorem on Monotone Matrix Functions, Grundlehren der mathematis-
chen Wissenschaften, vol. 354 (Springer Nature Switzerland, Cham, 2019)
42. X. Zhan, Matrix Inequalities (Springer, 2002)
Perspectives, Means and their Inequalities

Hiroyuki Osaka and Shuhei Wada

Abstract The perspective function is useful in convex analysis. In operator theory, an operator mean can be realized as an operator perspective and its limits. On the other hand, an operator mean can be regarded as a two-variable functional calculus for positive operators. In this chapter, we study the operator perspective and its extensions, including operator means and the Pusz–Woronowicz functional calculus. We also discuss the related operator inequalities.

Keywords Operator perspective · Operator connection · Functional calculus · Operator mean · Operator convex · Operator monotone · Ando–Hiai inequality · Furuta inequality

H. Osaka
Department of Mathematics Sciences, Ritsumeikan University, Kusatsu, Japan
e-mail: [email protected]

S. Wada
Department of Information and Computer Engineering, National Institute of Technology, Kisarazu College, Kisarazu, Japan
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
R. M. Aron et al. (eds.), Operator and Norm Inequalities and Related Topics, Trends in Mathematics, https://doi.org/10.1007/978-3-031-02104-6_5

1 Introduction

The perspective method is the standard operation that transforms a convex function of $n$ variables into one of $n + 1$ variables [12, Lemma 2]. Applying the same method to a given operator function yields a binary operation on the set of invertible positive operators that inherits the properties of the original function. The operator geometric mean, introduced by Pusz and Woronowicz in 1975, can be identified with the operator perspective of the square root function. This observation was applied to more general operator monotone functions, and as a result, operator means and related operations have been defined by several authors [1, 18, 39].
The convexity of a scalar function is inherited by its perspective function. This was extended to the framework of operator functions by Effros [13] and Ebadian et al. [15]. That is, for a given real valued continuous function $f$ on $(0, \infty)$, the operator perspective $P_f$ is jointly convex if and only if $f$ is operator convex. This observation is related to the joint convexity of the quantum $f$-divergence in quantum information theory [28, 29]. Recently, the perspective theory has been further discussed in [23, 32, 33, 37].
A perspective function is defined for two invertible positive operators. Under some conditions, it can be extended to a binary operation on the set of non-negative operators, like an operator mean. We discuss such conditions in Sect. 3.

2 Perspectives for Invertible Operators

Throughout this chapter, $H$ is a Hilbert space. $B(H)_{sa}$ is the set of bounded self-adjoint operators on $H$, $B(H)_+$ is the set of bounded positive operators on $H$, and $B(H)_{++}$ is the set of invertible elements in $B(H)_+$. We also write $A \ge 0$ when $A \in B(H)_+$, and $A > 0$ when $A \in B(H)_{++}$.
Forming the perspective function of a convex map is a useful trick in convex analysis [34, 35]. For a real valued continuous function $f$ on $(0, \infty)$, the two-variable operator function $P_f$ defined by

$$P_f(A, B) = B^{1/2} f(B^{-1/2} A B^{-1/2}) B^{1/2} \qquad (A, B > 0)$$

is called the operator perspective of $f$ [13–15, 32, 33]. In this section, some properties of operator perspectives are given.
Example 2.1 Consider the operator convex function $f(t) = -\log t$ defined on $(0, \infty)$. Then the perspective function $P_f$ is given by

$$P_f(B, A) = -A^{1/2} \log(A^{-1/2} B A^{-1/2}) A^{1/2}$$

for $A, B > 0$.
Note that when $A$ and $B$ are commuting positive definite matrices,

$$\mathrm{Tr}(P_f(B, A)) = S(A, B),$$

where $S(A, B)$ is the relative entropy defined by $S(A, B) = \mathrm{Tr}(A \log A) - \mathrm{Tr}(A \log B)$.
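
The trace identity in Example 2.1 can be checked numerically in finite dimensions. The following is a minimal sketch, assuming real symmetric positive definite matrices and NumPy; the helper names `fun_calc`, `perspective`, `relative_entropy` and `commuting_pair` are illustrative and do not come from the chapter.

```python
import numpy as np

def fun_calc(M, f):
    # Functional calculus f(M) for a real symmetric matrix M.
    w, V = np.linalg.eigh(M)
    return (V * f(w)) @ V.T

def perspective(f, A, B):
    # P_f(A, B) = B^{1/2} f(B^{-1/2} A B^{-1/2}) B^{1/2} for positive definite A, B.
    B_half = fun_calc(B, np.sqrt)
    B_inv_half = fun_calc(B, lambda w: 1.0 / np.sqrt(w))
    X = B_inv_half @ A @ B_inv_half
    return B_half @ fun_calc((X + X.T) / 2, f) @ B_half

def relative_entropy(A, B):
    # S(A, B) = Tr(A log A) - Tr(A log B).
    return np.trace(A @ fun_calc(A, np.log)) - np.trace(A @ fun_calc(B, np.log))

def commuting_pair(n, seed=0):
    # Two positive definite matrices with a common orthonormal eigenbasis,
    # so that they commute.
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    a = rng.uniform(0.5, 2.0, n)
    b = rng.uniform(0.5, 2.0, n)
    return (Q * a) @ Q.T, (Q * b) @ Q.T

A, B = commuting_pair(4)
lhs = np.trace(perspective(lambda w: -np.log(w), B, A))  # Tr P_f(B, A)
rhs = relative_entropy(A, B)                             # S(A, B)
print(lhs, rhs)
```

The two printed numbers should agree up to rounding error. The sketch only covers the commuting case stated in the example.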

2.1 Homogeneity

We first note the homogeneity of the operator perspective. If $f$ is a polynomial function such as $f(t) = \sum_k \alpha_k t^k$, it is clear that

$$
\begin{aligned}
B^{1/2} f(B^{-1/2} A B^{-1/2}) B^{-1/2} &= B^{1/2} \sum_k \alpha_k (B^{-1/2} A B^{-1/2})^k B^{-1/2} \\
&= \sum_k \alpha_k (A B^{-1})^k.
\end{aligned}
$$

In the following, we conventionally denote the last term by $f(AB^{-1})$, so that $P_f(A, B) = f(AB^{-1}) B$.

The following is obtained by a similar argument.
Proposition 2.1 Let $C$ be an invertible operator and $f$ be a real valued continuous function on $(0, \infty)$. Then $C^* P_f(A, B) C = P_f(C^* A C, C^* B C)$ for all $A, B > 0$.
Proof It suffices to treat the case where $f$ is a polynomial function. Using the above argument and the identity $(C^* A C)(C^* B C)^{-1} = C^* A B^{-1} (C^*)^{-1}$, we have

$$
\begin{aligned}
C^* P_f(A, B) C &= C^* f(AB^{-1}) B C \\
&= C^* f(AB^{-1}) (C^*)^{-1} C^* B C \\
&= P_f(C^* A C, C^* B C).
\end{aligned}
$$

Corollary 2.1

$$C > 0 \ \Rightarrow\ C P_f(A, B) C = P_f(CAC, CBC).$$
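
The congruence property of Proposition 2.1 can be illustrated numerically. This is a hedged sketch for real matrices (so $C^*$ becomes the transpose), using NumPy; the helpers `fun_calc`, `perspective`, `random_pd` and `congruence_gap` are hypothetical names, and the matrix `C` is chosen close to the identity so that it is invertible for the fixed seed.

```python
import numpy as np

def fun_calc(M, f):
    # Functional calculus f(M) for a real symmetric matrix M.
    w, V = np.linalg.eigh(M)
    return (V * f(w)) @ V.T

def perspective(f, A, B):
    # P_f(A, B) = B^{1/2} f(B^{-1/2} A B^{-1/2}) B^{1/2} for positive definite A, B.
    B_half = fun_calc(B, np.sqrt)
    B_inv_half = fun_calc(B, lambda w: 1.0 / np.sqrt(w))
    X = B_inv_half @ A @ B_inv_half
    return B_half @ fun_calc((X + X.T) / 2, f) @ B_half

def random_pd(n, rng):
    # G G^T + I is symmetric with all eigenvalues >= 1.
    G = rng.standard_normal((n, n))
    return G @ G.T + np.eye(n)

rng = np.random.default_rng(1)
n = 4
A, B = random_pd(n, rng), random_pd(n, rng)
C = np.eye(n) + 0.1 * rng.standard_normal((n, n))  # non-symmetric, near the identity

def congruence_gap(f):
    # Largest entry of C^T P_f(A, B) C - P_f(C^T A C, C^T B C).
    lhs = C.T @ perspective(f, A, B) @ C
    rhs = perspective(f, C.T @ A @ C, C.T @ B @ C)
    return np.max(np.abs(lhs - rhs))

gap_sqrt = congruence_gap(np.sqrt)
gap_tlogt = congruence_gap(lambda t: t * np.log(t))
print(gap_sqrt, gap_tlogt)
```

Both gaps should be at the level of rounding error. The proposition requires only continuity of $f$, which is why one operator monotone and one operator convex example are tried.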

Remark 2.1 Let $f$ be a continuous function on $(0, \infty)$ which has an analytic continuation. If we regard $f(AB^{-1})$ as the holomorphic functional calculus of $AB^{-1}$, the function $(A, B) \mapsto f(AB^{-1}) B$ has the homogeneity property. Some authors define the operator perspective of non-self-adjoint operators in this way (cf. [8]).

2.2 Convexity

In convex analysis, the convexity of a given function is equivalent to that of its perspective function. Similar results are known for the convexity of a two-variable operator function [33, Theorem 10.1].
Definition 2.1 Let $\Omega$ be a convex subset of $B(H)_{sa} \times B(H)_{sa}$ and let $P$ be a function from $\Omega$ to $B(H)_{sa}$. The function $P$ is said to be jointly convex if

$$P(\alpha A_1 + (1 - \alpha) A_2, \alpha B_1 + (1 - \alpha) B_2) \le \alpha P(A_1, B_1) + (1 - \alpha) P(A_2, B_2)$$

holds for all $(A_i, B_i) \in \Omega$ ($i = 1, 2$) and $\alpha \in [0, 1]$.

In the above definition, if the function $P$ satisfies the reverse inequality, $P$ is said to be jointly concave.
Definition 2.2 Let $J$ be an interval in $\mathbb{R}$ and $f$ be a real valued continuous function on $J$. The function $f$ is said to be operator convex if

$$f(\alpha A + (1 - \alpha) B) \le \alpha f(A) + (1 - \alpha) f(B)$$

holds for all $A, B \in B(H)_{sa}$ with $\sigma(A), \sigma(B) \subset J$ and $\alpha \in [0, 1]$.
The function $f$ is said to be operator concave if the reverse inequality holds. It is known that a positive valued function $f$ on $(0, \infty)$ is operator concave if and only if $f$ is operator monotone (i.e., $f(A) \le f(B)$ holds if $0 < A \le B$) [6, Theorem V.2.5].
The following result is well known [24].
Proposition 2.2 Let $f$ be a real valued continuous function on $[0, \infty)$ with $f(0) \le 0$. Then $f$ is operator convex if and only if the inequality

$$f(D^* A D + E^* B E) \le D^* f(A) D + E^* f(B) E$$

holds for all $A, B \ge 0$ and for all $D, E \in B(H)$ with $D^* D + E^* E \le I$.

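Using this observation, we have the following:
Proposition 2.3 Let $f$ be an operator convex function on $[0, \infty)$ with $f(0) \le 0$. Then the perspective function $P_f$ is jointly convex on $B(H)_{++} \times B(H)_{++}$.
Proof Let $(A_1, B_1), (A_2, B_2)$ be in $B(H)_{++} \times B(H)_{++}$ and let $\alpha \in [0, 1]$. Put $C := \alpha B_1 + (1 - \alpha) B_2$, $D := (\alpha B_1)^{1/2} C^{-1/2}$ and $E := ((1 - \alpha) B_2)^{1/2} C^{-1/2}$. Since $D^* D + E^* E = I$,

$$
\begin{aligned}
& C^{-1/2} P_f(\alpha A_1 + (1 - \alpha) A_2, \alpha B_1 + (1 - \alpha) B_2) C^{-1/2} \\
&\quad = f(C^{-1/2} (\alpha A_1 + (1 - \alpha) A_2) C^{-1/2}) \\
&\quad = f(C^{-1/2} (\alpha A_1) C^{-1/2} + C^{-1/2} ((1 - \alpha) A_2) C^{-1/2}) \\
&\quad = f(D^* B_1^{-1/2} A_1 B_1^{-1/2} D + E^* B_2^{-1/2} A_2 B_2^{-1/2} E) \\
&\quad \le D^* f(B_1^{-1/2} A_1 B_1^{-1/2}) D + E^* f(B_2^{-1/2} A_2 B_2^{-1/2}) E \\
&\quad = \alpha C^{-1/2} P_f(A_1, B_1) C^{-1/2} + (1 - \alpha) C^{-1/2} P_f(A_2, B_2) C^{-1/2}.
\end{aligned}
$$

Multiplying both sides by $C^{1/2}$ from the left and from the right gives the desired inequality.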
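
Joint convexity can be observed numerically for a concrete operator convex function. The sketch below assumes $f(t) = t^2$ (operator convex on $[0, \infty)$ with $f(0) = 0$), for which $P_f(A, B) = A B^{-1} A$, and checks that the convexity gap is positive semidefinite for random positive definite matrices. The helper names are illustrative, and a numerical check of this kind is of course not a proof.

```python
import numpy as np

def fun_calc(M, f):
    # Functional calculus f(M) for a real symmetric matrix M.
    w, V = np.linalg.eigh(M)
    return (V * f(w)) @ V.T

def perspective(f, A, B):
    # P_f(A, B) = B^{1/2} f(B^{-1/2} A B^{-1/2}) B^{1/2} for positive definite A, B.
    B_half = fun_calc(B, np.sqrt)
    B_inv_half = fun_calc(B, lambda w: 1.0 / np.sqrt(w))
    X = B_inv_half @ A @ B_inv_half
    return B_half @ fun_calc((X + X.T) / 2, f) @ B_half

def random_pd(n, rng):
    # G G^T + I is symmetric with all eigenvalues >= 1.
    G = rng.standard_normal((n, n))
    return G @ G.T + np.eye(n)

rng = np.random.default_rng(2)
n = 4
A1, B1, A2, B2 = (random_pd(n, rng) for _ in range(4))

def convexity_gap(f, alpha):
    # Smallest eigenvalue of
    # alpha P_f(A1, B1) + (1 - alpha) P_f(A2, B2) - P_f(convex combination).
    mix = perspective(f, alpha * A1 + (1 - alpha) * A2, alpha * B1 + (1 - alpha) * B2)
    gap = alpha * perspective(f, A1, B1) + (1 - alpha) * perspective(f, A2, B2) - mix
    return np.linalg.eigvalsh((gap + gap.T) / 2).min()

square = lambda t: t ** 2
min_eigs = [convexity_gap(square, a) for a in np.linspace(0.0, 1.0, 11)]
print(min(min_eigs))
```

The smallest eigenvalue should be non-negative up to rounding error for every value of `alpha` in the grid.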

Let $f$ be an operator convex function on $(0, \infty)$. For an arbitrary $\varepsilon > 0$, the function $f_\varepsilon(t) := f(t + \varepsilon) - f(\varepsilon)$ satisfies the assumption of the above result. So the following is obtained.
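Corollary 2.2 Let $f$ be an operator convex function on $(0, \infty)$. Then the perspective function $P_f$ is jointly convex on $B(H)_{++} \times B(H)_{++}$.
Proof Let $(A_1, B_1), (A_2, B_2)$ be in $B(H)_{++} \times B(H)_{++}$ and let $\alpha \in [0, 1]$. Put $A := \alpha A_1 + (1 - \alpha) A_2$ and $B := \alpha B_1 + (1 - \alpha) B_2$. Applying Proposition 2.3 to $f_\varepsilon(t) := f(t + \varepsilon) - f(\varepsilon)$ with $\varepsilon > 0$, we have

$$
\begin{aligned}
& P_{f_\varepsilon}(\alpha A_1 + (1 - \alpha) A_2, \alpha B_1 + (1 - \alpha) B_2) \\
&\quad = B^{1/2} f(B^{-1/2} A B^{-1/2} + \varepsilon I) B^{1/2} - f(\varepsilon) B \\
&\quad \le \alpha P_{f_\varepsilon}(A_1, B_1) + (1 - \alpha) P_{f_\varepsilon}(A_2, B_2) \\
&\quad = \alpha B_1^{1/2} f(B_1^{-1/2} A_1 B_1^{-1/2} + \varepsilon I) B_1^{1/2} - \alpha f(\varepsilon) B_1 \\
&\qquad + (1 - \alpha) B_2^{1/2} f(B_2^{-1/2} A_2 B_2^{-1/2} + \varepsilon I) B_2^{1/2} - (1 - \alpha) f(\varepsilon) B_2.
\end{aligned}
$$

Thus

$$
\begin{aligned}
B^{1/2} f(B^{-1/2} A B^{-1/2} + \varepsilon I) B^{1/2}
&\le \alpha B_1^{1/2} f(B_1^{-1/2} A_1 B_1^{-1/2} + \varepsilon I) B_1^{1/2} \\
&\quad + (1 - \alpha) B_2^{1/2} f(B_2^{-1/2} A_2 B_2^{-1/2} + \varepsilon I) B_2^{1/2},
\end{aligned}
$$

which implies the desired result as $\varepsilon$ goes to 0.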

The following is a fundamental result in the study of operator perspectives, and it is an easy consequence of the above [13, 15].
Theorem 2.1 Let $f$ be a real valued continuous function on $(0, \infty)$. Then the operator perspective $P_f$ of $f$ is jointly convex (resp. jointly concave) on $B(H)_{++} \times B(H)_{++}$ if and only if $f$ is operator convex (resp. operator concave).
Corollary 2.3 Let $f$ be a real valued operator convex function on $(0, \infty)$. Then

$$P_f(A_1 + A_2, B_1 + B_2) \le P_f(A_1, B_1) + P_f(A_2, B_2)$$

for all $(A_i, B_i) \in B(H)_{++} \times B(H)_{++}$ ($i = 1, 2$).

Remark 2.2 For a real valued continuous function $f$, we define $\tilde f$ by $\tilde f(t) := t f(1/t)$. If $f$ is a polynomial function, the equation

$$P_{\tilde f}(A, B) = \tilde f(AB^{-1}) B = f(BA^{-1}) A = P_f(B, A)$$

holds. So the equation $P_{\tilde f}(A, B) = P_f(B, A)$ holds for an arbitrary $f$.
We next consider the case where $f$ is an operator convex function on $(0, \infty)$. By the last theorem, the map

$$(A, B) \mapsto P_f(B, A) \ (= P_{\tilde f}(A, B))$$

is jointly convex, which shows that $\tilde f$ is also an operator convex function.

2.3 Monotonicity and Convergence

In the following, $OC$, $OC_0$, and $OM$ denote, respectively, all the (real valued) operator convex functions on $(0, \infty)$, all the functions in $OC$ that take the value 0 at 1, and all the (real valued) operator monotone functions on $(0, \infty)$.

2.3.1 Monotonicity for One Direction

Let $f \in OC_0$ and $A, B > 0$. Corollary 2.3 implies that

$$
\begin{aligned}
P_f(A + tI, B + tI) &= P_f(A + t'I + (t - t')I, B + t'I + (t - t')I) \\
&\le P_f(A + t'I, B + t'I) + P_f((t - t')I, (t - t')I) \\
&= P_f(A + t'I, B + t'I) + 0 \cdot I \qquad (0 \le t' < t).
\end{aligned}
$$

This can be extended as follows:

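Proposition 2.4 Let $f \in OC_0$ and let $A, B \in B(H)_{++}$ and $X, X' \in B(H)_+$. If $X' \le X$, then $P_f(A + X, B + X) \le P_f(A + X', B + X')$.
Proof Note that $P_f(X - X' + \varepsilon I, X - X' + \varepsilon I) = 0$ for all $\varepsilon > 0$. Thus

$$
\begin{aligned}
\langle P_f(A + X, B + X) x \mid x \rangle
&= \sup_{\varepsilon > 0} \langle P_f(A + X + \varepsilon I, B + X + \varepsilon I) x \mid x \rangle \\
&\le \sup_{\varepsilon > 0} \langle (P_f(A + X', B + X') + P_f(X - X' + \varepsilon I, X - X' + \varepsilon I)) x \mid x \rangle \\
&= \langle P_f(A + X', B + X') x \mid x \rangle
\end{aligned}
$$

holds for all $x \in H$.

Corollary 2.4 Let $\alpha, \beta > 0$ and let $f \in OC$ with $f(\alpha/\beta) = 0$. Then the map $X \ge 0 \mapsto P_f(A + \alpha X, B + \beta X)$ is decreasing.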
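
The monotonicity in Proposition 2.4 can be illustrated for $f(t) = t \log t$, which is operator convex with $f(1) = 0$. The following is a small numerical sketch under the assumption of real symmetric matrices; `X_small` and `X_big` play the roles of $X'$ and $X$, and all helper names are hypothetical.

```python
import numpy as np

def fun_calc(M, f):
    # Functional calculus f(M) for a real symmetric matrix M.
    w, V = np.linalg.eigh(M)
    return (V * f(w)) @ V.T

def perspective(f, A, B):
    # P_f(A, B) = B^{1/2} f(B^{-1/2} A B^{-1/2}) B^{1/2} for positive definite A, B.
    B_half = fun_calc(B, np.sqrt)
    B_inv_half = fun_calc(B, lambda w: 1.0 / np.sqrt(w))
    X = B_inv_half @ A @ B_inv_half
    return B_half @ fun_calc((X + X.T) / 2, f) @ B_half

def random_pd(n, rng):
    # G G^T + I is symmetric with all eigenvalues >= 1.
    G = rng.standard_normal((n, n))
    return G @ G.T + np.eye(n)

rng = np.random.default_rng(3)
n = 4
A, B = random_pd(n, rng), random_pd(n, rng)
G = rng.standard_normal((n, 2))
H = rng.standard_normal((n, 2))
X_small = G @ G.T            # positive semidefinite, rank at most 2
X_big = X_small + H @ H.T    # X_small <= X_big in the Loewner order

f = lambda t: t * np.log(t)  # operator convex on (0, inf) with f(1) = 0
diff = perspective(f, A + X_small, B + X_small) - perspective(f, A + X_big, B + X_big)
min_eig = np.linalg.eigvalsh((diff + diff.T) / 2).min()
print(min_eig)
```

A non-negative smallest eigenvalue (up to rounding) reflects $P_f(A + X, B + X) \le P_f(A + X', B + X')$ for $X' \le X$.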
Example 2.2 Take either $f(t) = -\log t$ or $f(t) = t \log t$. From the proposition above, the sequence $P_f(A + (1/n)I, B + (1/n)I)$ is increasing for all $A, B \ge 0$. Moreover, if $A, B$ are invertible, $P_f(A + (1/n)I, B + (1/n)I) \nearrow P_f(A, B)$.

2.3.2 Monotonicity for Each Variable

Let $f \in OC$ with $|f(0+)| < \infty$. Then $f_0 := f - f(0+)$ is operator convex and $f_0(0+) = 0$. Thus there exists $h \in OM$ such that $f_0(t) = t h(t)$ and $\tilde f_0(t) = h(1/t)$ [1], which implies

$$P_f(B, A_1) \ge P_f(B, A_2) - f(0+)(A_2 - A_1) \qquad (A_1 \le A_2), \qquad (1)$$

because $P_f(B, A) = P_{f_0}(B, A) + f(0+) A$ and

$$
\begin{aligned}
P_{f_0}(B, A_2) &= B^{1/2} h(B^{1/2} A_2^{-1} B^{1/2}) B^{1/2} \\
&\le B^{1/2} h(B^{1/2} A_1^{-1} B^{1/2}) B^{1/2} \\
&= P_{f_0}(B, A_1).
\end{aligned}
$$

For an arbitrary positive operator monotone function $k$, the operator convex function $-k$ satisfies the condition above, so we have

$$
\begin{aligned}
P_k(B_1, A_1) &\le P_k(B_2, A_1) \\
&\le P_k(B_2, A_2) - k(0+)(A_2 - A_1) \\
&\le P_k(B_2, A_2) \qquad (A_1 \le A_2,\ B_1 \le B_2).
\end{aligned}
$$

Proposition 2.5 Let $f$ be a positive continuous function on $(0, \infty)$. Then $f$ is operator monotone if and only if $P_f$ is monotone in each variable.
Corollary 2.5 Let $a_n$ and $b_n$ be (strictly) positive decreasing sequences, let $f \in OM$ be a positive function, and let $A, B \ge 0$. Then $P_f(A + a_n I, B + b_n I)$ converges strongly.
We next consider the convergence of

$$P_f(A + \varepsilon I, B + \varepsilon I)$$

for $A, B \ge 0$.
From the above discussion, if an operator convex function $f$ on $(0, \infty)$ satisfies $|f(0+)| < \infty$, then the function $-\tilde f_0$ is operator monotone. For an arbitrary $\alpha \in (0, 1)$, we define $g$ and $g_\alpha$ by $g(t) := -\tilde f_0(t)$ and $g_\alpha(t) := g(t + \alpha) - g(\alpha)$. Then $g_\alpha$ is a non-negative operator monotone function. The perspective function $P_{g_\alpha}$ is calculated as follows:

$$
\begin{aligned}
P_{g_\alpha}(A, B) &= B^{1/2} g(B^{-1/2} A B^{-1/2} + \alpha) B^{1/2} - g(\alpha) B \\
&= P_g(A + \alpha B, B) - g(\alpha) B \\
&= -P_f(B, A + \alpha B) + f(0+)(A + \alpha B) - g(\alpha) B
\end{aligned}
$$

for $A, B > 0$. Using this, the condition for $P_f(A + \varepsilon I, B + \varepsilon I)$ to converge is obtained.
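Proposition 2.6 ([32, Theorem 6.2]) Let $f$ be an operator convex function on $(0, \infty)$. Then the following are equivalent:
(1) $P_f(A + \varepsilon I, B + \varepsilon I)$ converges strongly as $\varepsilon \downarrow 0$ for every $A, B \ge 0$ such that $\alpha A \le B$ for some $\alpha > 0$;
(2) $|f(0+)| < \infty$.
Proof (1) ⇒ (2). Taking $A = 0$ and $B = I$, we see that $P_f(A + \varepsilon I, B + \varepsilon I) = (1 + \varepsilon) f\left(\frac{\varepsilon}{1 + \varepsilon}\right) I$ converges to $f(0+) I$ as $\varepsilon \downarrow 0$.
(2) ⇒ (1). We use the notation $g$ and $g_\alpha$ from the above argument and put $A_\varepsilon := A + \varepsilon I$ and $B_\varepsilon := B + \varepsilon I$. Since $\beta A \le \alpha A \le B$ holds for $0 < \beta \le \alpha$, we can assume that $\alpha \in (0, 1)$. It follows from the above argument and Corollary 2.5 that

$$
\begin{aligned}
P_f(A_\varepsilon, B_\varepsilon) &= -P_{g_\alpha}(B_\varepsilon - \alpha A_\varepsilon, A_\varepsilon) - g(\alpha) A_\varepsilon + f(0+) B_\varepsilon \\
&= -P_{g_\alpha}(B - \alpha A + (1 - \alpha) \varepsilon I, A_\varepsilon) - g(\alpha) A_\varepsilon + f(0+) B_\varepsilon
\end{aligned}
$$

converges strongly.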

Corollary 2.6 ([32, Corollary 6.4]) Let $f$ be an operator convex function on $(0, \infty)$. Then the following are equivalent:
(1) $P_f(A + \varepsilon I, B + \varepsilon I)$ converges strongly as $\varepsilon \downarrow 0$ for every $A, B \ge 0$ such that $\alpha B \le A$ for some $\alpha > 0$;
(2) $|\tilde f(0+)| < \infty$.
As we said above, the function $g$ defined by $g(t) := -\tilde f_0(t)$ is operator monotone. If $|g(0+)| = |\tilde f(0+)| < \infty$, then $g_0 := g - g(0+)$ is a non-negative operator monotone function. Thus, with $A_\varepsilon := A + \varepsilon I$ and $B_\varepsilon := B + \varepsilon I$,

$$P_f(A_\varepsilon, B_\varepsilon) = -P_{g_0}(B_\varepsilon, A_\varepsilon) - g(0+) A_\varepsilon + f(0+) B_\varepsilon$$

converges strongly as $\varepsilon \downarrow 0$.
In the next statement, we say that a binary operation $\sigma$ on $B(H)_{++}$ is an operator connection if it can be described as $A \sigma B = P_g(B, A)$ for some non-negative function $g \in OM$.
Corollary 2.7 ([32, Proposition 6.1]) Let $f$ be an operator convex function on $(0, \infty)$. Then the following are equivalent:
(1) $P_f(A + \varepsilon I, B + \varepsilon I)$ converges strongly as $\varepsilon \downarrow 0$ for every $A, B \ge 0$;
(2) $|f(0+)| < \infty$ and $|\tilde f(0+)| < \infty$;
(3) There exist $\lambda, \mu \in \mathbb{R}$ and an operator connection $\sigma$ such that

$$P_f(A, B) = -A \sigma B + \lambda A + \mu B$$

for all $A, B > 0$.

Put $f(t) := -\frac{t}{t + 1}$ for $t > 0$. Then $f$ is an operator convex function with $f(0+) = \tilde f(0+) = 0$. So, from Corollary 2.7, the strong limit of $-P_f(A + \varepsilon I, B + \varepsilon I)$ as $\varepsilon \downarrow 0$ exists for $A, B \ge 0$, and we denote this limit by $A : B$.
Remark 2.3 If $A, B$ are invertible, $A : B$ can be written as $(A^{-1} + B^{-1})^{-1}$.
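
The operation $A : B$ can be explored numerically. The sketch below, which assumes real symmetric matrices and NumPy, compares $-P_f(A, B)$ for $f(t) = -t/(t+1)$ with $(A^{-1} + B^{-1})^{-1}$ in the invertible case, and then approximates $A : B$ for a singular $A$ by a small $\varepsilon$. The specific $2 \times 2$ matrices and the helper names are illustrative choices, and the comparison value $B - B(A + B)^{-1}B$ is the closed form that the chapter uses for $A : B$ later on.

```python
import numpy as np

def fun_calc(M, f):
    # Functional calculus f(M) for a real symmetric matrix M.
    w, V = np.linalg.eigh(M)
    return (V * f(w)) @ V.T

def perspective(f, A, B):
    # P_f(A, B) = B^{1/2} f(B^{-1/2} A B^{-1/2}) B^{1/2} for positive definite A, B.
    B_half = fun_calc(B, np.sqrt)
    B_inv_half = fun_calc(B, lambda w: 1.0 / np.sqrt(w))
    X = B_inv_half @ A @ B_inv_half
    return B_half @ fun_calc((X + X.T) / 2, f) @ B_half

def parallel_sum(A, B, eps=0.0):
    # -P_f(A + eps I, B + eps I) with f(t) = -t / (1 + t).
    I = np.eye(A.shape[0])
    return -perspective(lambda t: -t / (1.0 + t), A + eps * I, B + eps * I)

# Invertible case: compare with (A^{-1} + B^{-1})^{-1}.
rng = np.random.default_rng(4)
G1, G2 = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
A = G1 @ G1.T + np.eye(3)
B = G2 @ G2.T + np.eye(3)
ps = parallel_sum(A, B)
closed_form = np.linalg.inv(np.linalg.inv(A) + np.linalg.inv(B))

# Singular A: approximate the strong limit by a small eps.
A0 = np.array([[1.0, 0.0], [0.0, 0.0]])
B0 = np.array([[2.0, 1.0], [1.0, 2.0]])
approx = parallel_sum(A0, B0, eps=1e-6)
limit = B0 - B0 @ np.linalg.inv(A0 + B0) @ B0
print(approx)
print(limit)
```

For the singular example, the approximation and the closed form should differ only by a quantity of order `eps`.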
Remark 2.4 In Sect. 3, an operator connection is defined as a binary operation on $B(H)_+$.
Remark 2.5 For a function $f \in OC$ satisfying (2) of the above result, we can define a continuous homogeneous function $\Phi$ on $[0, \infty)^2$ by

$$
\Phi(x, y) =
\begin{cases}
y f(x/y), & \text{if } x, y \in (0, \infty), \\
0, & \text{if } x \cdot y = 0.
\end{cases}
$$

Remark 2.6 It is obvious that the perspective function is continuous with respect to the operator norm topology (i.e., if $\|A_n - A\| \to 0$ and $\|B_n - B\| \to 0$, then $\|P_f(A_n, B_n) - P_f(A, B)\| \to 0$). Moreover, if $f$ is operator convex, then, from [33, Theorem 6.1], the following property (upper continuity) holds:

$$A_n \searrow A > 0 \text{ and } B_n \searrow B > 0 \ \Rightarrow\ P_f(A_n, B_n) \to P_f(A, B)$$

in the strong operator topology.

The proof will be given in the next section.

3 An Extension of the Perspective Function

3.1 A Functional Calculus for Commuting Positive Operators

In [49], Pusz and Woronowicz defined a two-variable functional calculus $f(A, B)$ for $A, B \in B(H)_+$ and for a homogeneous Borel measurable locally bounded function $f$ on $[0, \infty)^2$. In this section, we first introduce the Pusz–Woronowicz functional calculus (PW-functional calculus, for short). Then we show the properties of this functional calculus in the case where the function $f$ and the pair $(A, B)$ are restricted.
Let $A, B$ be positive operators with $AB = BA$. Then $N := A + iB$ is a normal operator. So there is an isometric *-isomorphism $\varphi_N$ from the function space $C(\sigma(N))$ of continuous functions on $\sigma(N)$ onto the C*-algebra $C^*(N)$ generated by $N$ (see [11, Corollary I.3.3]).
Note that $\sigma(N) \subset \sigma(A) + i\sigma(B)$ and $C(\sigma(A) \times \sigma(B)) \cong C(\sigma(A) + i\sigma(B))$. So, for $\Phi \in C(\sigma(A) \times \sigma(B))$, we can define a functional calculus $\Phi(A, B)$ by

$$\Phi(A, B) := \varphi_N(\hat\Phi|_{\sigma(N)}),$$

where $\hat\Phi(z) := \Phi(\mathrm{Re}\, z, \mathrm{Im}\, z)$.

The map $\varphi_N$ can be extended to a norm-decreasing unital *-homomorphism on the set $B_b(\sigma(N))$ of bounded Borel functions as

$$\varphi_N : B_b(\sigma(N)) \to B(H)$$

having the following property: if $f_n$ is a bounded increasing sequence and $f = \sup f_n$, then $\varphi_N(f_n) \nearrow \varphi_N(f)$ (cf. [48, Theorem 4.5.4]). As stated in [27], the monotone class theorem guarantees that such homomorphisms take the same value on $B_b(\sigma(N))$, and then, for every bounded Borel function $h$ with $\hat h \in B_b(\sigma(N))$, where $\hat h(z) := h(\mathrm{Re}\, z, \mathrm{Im}\, z)$, the bounded Borel functional calculus $h(A, B)$ is defined by

$$h(A, B) := \varphi_N(\hat h|_{\sigma(N)}).$$

Remark 3.1 We shall mainly treat the bounded Borel function $\Phi$ ($= \Phi_f$) on $[0, \infty)^2$ defined by using a real valued continuous function $f$ on $(0, \infty)$ as

$$
\Phi_f(x, y) =
\begin{cases}
y f(x/y), & \text{if } x, y \in (0, \infty), \\
0, & \text{if } x \cdot y = 0.
\end{cases}
$$

It is easy to verify that $\Phi_f$ is homogeneous.

3.2 Pusz–Woronowicz Functional Calculus

3.2.1 The Commuting Pair (R, S)

Let $A, B \ge 0$. We note that the obvious relation $A \le A + B$ holds. By Douglas's range inclusion theorem [10, Theorem 17.1], there exists a bounded operator $C : H \to \ker(A + B)^\perp$ such that $A^{1/2} = (A + B)^{1/2} C = C^* (A + B)^{1/2}$, which implies

$$A = (A + B)^{1/2} C C^* (A + B)^{1/2}.$$

Thus we have the following.

Proposition 3.1 Let $A, B \ge 0$ be such that $A + B \ne 0$. Then there exists a unique pair $(R, S)$ of positive operators on the Hilbert space $\ker(A + B)^\perp$ such that

$$A = (A + B)^{1/2} R (A + B)^{1/2}, \qquad B = (A + B)^{1/2} S (A + B)^{1/2}. \qquad (2)$$

Proof It is enough to show the uniqueness. If $(R, S)$ and $(R', S')$ are such pairs, then

$$0 = A - A = (A + B)^{1/2} (R - R') (A + B)^{1/2}.$$

So the operator $R - R'$ must be 0. The relation $S = S'$ can be proved in a similar fashion.

The pair $(R, S)$ in the above proposition satisfies

$$A + B = (A + B)^{1/2} (R + S) (A + B)^{1/2}.$$

So we have $R + S = I_{\ker(A + B)^\perp}$ and $RS = SR$.

If $A > 0$, then the positive operator $R$ is invertible and is written as $R = (A + B)^{-1/2} A (A + B)^{-1/2}$. So we have

$$S R^{-1} = (A + B)^{-1/2} B A^{-1} (A + B)^{1/2}$$

and

$$
\begin{aligned}
p(S R^{-1}) &= (A + B)^{-1/2} p(B A^{-1}) (A + B)^{1/2} \\
&= (A + B)^{-1/2} A^{1/2} p(A^{-1/2} B A^{-1/2}) A^{-1/2} (A + B)^{1/2}
\end{aligned}
$$

for every polynomial $p$. Thus, for every continuous function $f$ on $[0, \infty)$,

$$f(S R^{-1}) = (A + B)^{-1/2} A^{1/2} f(A^{-1/2} B A^{-1/2}) A^{-1/2} (A + B)^{1/2}$$

and

$$f(S R^{-1}) R = (A + B)^{-1/2} A^{1/2} f(A^{-1/2} B A^{-1/2}) A^{1/2} (A + B)^{-1/2}.$$

Proposition 3.2 If $A > 0$, then for every continuous function $f$ on $[0, \infty)$,

$$(A + B)^{1/2} f(S R^{-1}) R (A + B)^{1/2} = A^{1/2} f(A^{-1/2} B A^{-1/2}) A^{1/2}.$$

Corollary 3.1 If $B > 0$, then for every continuous function $f$ on $[0, \infty)$,

$$(A + B)^{1/2} f(R S^{-1}) S (A + B)^{1/2} = B^{1/2} f(B^{-1/2} A B^{-1/2}) B^{1/2}.$$

Furthermore, if $A, B > 0$, then

$$(A + B)^{1/2} P_f(R, S) (A + B)^{1/2} = P_f(A, B)$$

for every continuous function $f$ on $(0, \infty)$.
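
In finite dimensions with $A + B$ invertible, the pair $(R, S)$ can be written down explicitly and the identities above checked directly. The following sketch assumes positive definite real matrices and uses $f = \log$ as a test function; the helper names are illustrative and not part of the chapter.

```python
import numpy as np

def fun_calc(M, f):
    # Functional calculus f(M) for a real symmetric matrix M.
    w, V = np.linalg.eigh(M)
    return (V * f(w)) @ V.T

def perspective(f, A, B):
    # P_f(A, B) = B^{1/2} f(B^{-1/2} A B^{-1/2}) B^{1/2} for positive definite A, B.
    B_half = fun_calc(B, np.sqrt)
    B_inv_half = fun_calc(B, lambda w: 1.0 / np.sqrt(w))
    X = B_inv_half @ A @ B_inv_half
    return B_half @ fun_calc((X + X.T) / 2, f) @ B_half

def random_pd(n, rng):
    # G G^T + I is symmetric with all eigenvalues >= 1.
    G = rng.standard_normal((n, n))
    return G @ G.T + np.eye(n)

rng = np.random.default_rng(5)
n = 4
A, B = random_pd(n, rng), random_pd(n, rng)

T = fun_calc(A + B, np.sqrt)                         # (A + B)^{1/2}
T_inv = fun_calc(A + B, lambda w: 1.0 / np.sqrt(w))  # (A + B)^{-1/2}
R = T_inv @ A @ T_inv
S = T_inv @ B @ T_inv

sum_is_identity = np.allclose(R + S, np.eye(n))      # R + S = I
commute = np.allclose(R @ S, S @ R)                  # RS = SR
transfer_ok = np.allclose(T @ perspective(np.log, R, S) @ T,
                          perspective(np.log, A, B)) # last identity of Corollary 3.1
print(sum_is_identity, commute, transfer_ok)
```

Since $R$ and $S$ commute, $P_f(R, S)$ reduces to an ordinary function of the single operator $R$, which is the point of passing from $(A, B)$ to $(R, S)$.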

Example 3.1 Take $f(t) = \frac{t}{1 + t}$. If either $A > 0$ or $B > 0$, then

$$(A + B)^{1/2} \frac{RS}{R + S} (A + B)^{1/2} = B - B(A + B)^{-1} B = (A : B).$$

Example 3.2 Take $f(t) = \log t$. If $A, B > 0$, then

$$(A + B)^{1/2} S(\log R - \log S) (A + B)^{1/2} = B^{1/2} \log(B^{-1/2} A B^{-1/2}) B^{1/2}.$$

Remark 3.2 Suppose $0 \le cA \le B$ for some $c > 0$. Then it is clear that $cR \le S$ and $cI = c(R + S) \le (c + 1) S$ hold. Thus $S$ is invertible, and we can define a bounded self-adjoint operator

$$(A + B)^{1/2} f(R S^{-1}) S (A + B)^{1/2}$$

for every continuous function $f$ on $[0, \infty)$. We shall discuss later the corresponding mapping $F$ of $(A, B)$, namely

$$F(A, B) = (A + B)^{1/2} f(R S^{-1}) S (A + B)^{1/2}.$$

3.2.2 Variational Expression

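To define the Pusz–Woronowicz functional calculus, we show the following variational expression. Let $A, B \in B(H)_{++}$ with $AB = BA$ and let $z \in H$. If $x, y \in H$ satisfy $z = x + y$, then by a simple calculation,

$$\left\langle \frac{AB}{A + B} z \,\middle|\, z \right\rangle + \langle (A + B) u \mid u \rangle = \langle A x \mid x \rangle + \langle B y \mid y \rangle$$

holds, where $u := \frac{B}{A + B} z - x$.
So we have

$$\inf_{x + y = z} (\langle A x \mid x \rangle + \langle B y \mid y \rangle) = \left\langle \frac{AB}{A + B} z \,\middle|\, z \right\rangle. \qquad (3)$$

This can be generalized to any pair of positive operators.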

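Proposition 3.3 Let $A, B \in B(H)_+$. For $z \in H$,

$$\inf_{x + y = z} (\langle A x \mid x \rangle + \langle B y \mid y \rangle) = \langle (A : B) z \mid z \rangle.$$

Proof We first consider the case where $A, B$ are invertible. Since $(A : B) = B - B(A + B)^{-1} B$,

$$
\begin{aligned}
& \langle A x \mid x \rangle + \langle B(z - x) \mid (z - x) \rangle - \langle (A : B) z \mid z \rangle \\
&\quad = \|(A + B)^{-1/2} B z\|^2 + \|(A + B)^{1/2} x\|^2 - 2 \mathrm{Re} \langle (A + B)^{-1/2} B z \mid (A + B)^{1/2} x \rangle \\
&\quad \ge 0.
\end{aligned}
$$

Equality is attained if $x = (A + B)^{-1} B z$.

For $A, B \ge 0$ and for $z \in H$, writing $A_\varepsilon := A + \varepsilon I$ and $B_\varepsilon := B + \varepsilon I$,

$$
\begin{aligned}
\langle (A : B) z \mid z \rangle &= \inf_{\varepsilon > 0} \langle (A_\varepsilon : B_\varepsilon) z \mid z \rangle \\
&= \inf_{\varepsilon > 0} \inf_{x} (\langle A_\varepsilon x \mid x \rangle + \langle B_\varepsilon (z - x) \mid (z - x) \rangle) \\
&\ge \inf_{x} (\langle A x \mid x \rangle + \langle B(z - x) \mid (z - x) \rangle) \\
&\ge \langle (A : B) z \mid z \rangle.
\end{aligned}
$$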
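
The variational expression can be tested numerically in the invertible case. The sketch below assumes real positive definite matrices, uses the closed form $A : B = B - B(A + B)^{-1}B$ and the minimizer $x = (A + B)^{-1} B z$ from the proof, and compares the optimal value with the cost at randomly perturbed points. Names such as `cost`, `x_star` and `trial_costs` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 4
G1, G2 = rng.standard_normal((n, n)), rng.standard_normal((n, n))
A = G1 @ G1.T + np.eye(n)   # positive definite
B = G2 @ G2.T + np.eye(n)   # positive definite
z = rng.standard_normal(n)

def cost(x):
    # <A x | x> + <B (z - x) | (z - x)>, i.e. the objective with y = z - x.
    y = z - x
    return x @ A @ x + y @ B @ y

parallel = B - B @ np.linalg.solve(A + B, B)  # A : B for invertible A, B
x_star = np.linalg.solve(A + B, B @ z)        # minimizer given in the proof
optimum = z @ parallel @ z                    # <(A : B) z | z>

# The cost at perturbed points should never fall below the optimum.
trial_costs = np.array([cost(x_star + rng.standard_normal(n)) for _ in range(200)])
print(cost(x_star), optimum, trial_costs.min())
```

The cost at `x_star` should match `optimum`, and every perturbed cost exceeds it by $\langle (A + B) d \mid d \rangle$ for the perturbation $d$, in line with the computation in the proof.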

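As stated in the last subsection, for every $(A, B) \ne (0, 0)$, there exists a unique commuting pair $(R, S)$ of positive contractive operators such that (2) holds. This pair $(R, S)$ satisfies the following:
Corollary 3.2 Let $H_0 = \operatorname{ran}(A + B)^{1/2}$. For $z \in H_0$,

$$\inf_{x, y \in H_0,\ x + y = z} (\langle R x \mid x \rangle + \langle S y \mid y \rangle) = \langle (R : S) z \mid z \rangle.$$

Proof From the preceding proposition, for every $\varepsilon \in (0, 1)$, there exist $x_\varepsilon, y_\varepsilon \in \overline{H_0}$ such that $x_\varepsilon + y_\varepsilon = z$ and

$$\langle R x_\varepsilon \mid x_\varepsilon \rangle + \langle S y_\varepsilon \mid y_\varepsilon \rangle < \langle (R : S) z \mid z \rangle + \varepsilon.$$

Let $\delta := \langle (R : S) z \mid z \rangle + \varepsilon - (\langle R x_\varepsilon \mid x_\varepsilon \rangle + \langle S y_\varepsilon \mid y_\varepsilon \rangle)$ and let $K > \varepsilon$ be a positive number such that

$$(2\varepsilon/\delta)(2 \max\{\|x_\varepsilon\|, \|y_\varepsilon\|\} + 1) < K.$$

Then there exists $x'_\varepsilon \in H_0$ such that $\|x_\varepsilon - x'_\varepsilon\| < \varepsilon/K$, and so, since $R$ is a contraction,

$$\langle R x'_\varepsilon \mid x'_\varepsilon \rangle \le \langle R x_\varepsilon \mid x_\varepsilon \rangle + \|x'_\varepsilon - x_\varepsilon\| (\|x'_\varepsilon\| + \|x_\varepsilon\|) < \delta/2 + \langle R x_\varepsilon \mid x_\varepsilon \rangle.$$

We next put $y'_\varepsilon := z - x'_\varepsilon$. Then $\|y_\varepsilon - y'_\varepsilon\| = \|y_\varepsilon - (z - x'_\varepsilon)\| = \|-x_\varepsilon + x'_\varepsilon\| < \varepsilon/K$.

A similar argument implies $\langle S y'_\varepsilon \mid y'_\varepsilon \rangle \le \delta/2 + \langle S y_\varepsilon \mid y_\varepsilon \rangle$. Thus we have

$$
\begin{aligned}
\langle (R : S) z \mid z \rangle &\le \langle R x'_\varepsilon \mid x'_\varepsilon \rangle + \langle S y'_\varepsilon \mid y'_\varepsilon \rangle \\
&< \langle R x_\varepsilon \mid x_\varepsilon \rangle + \langle S y_\varepsilon \mid y_\varepsilon \rangle + \delta \\
&= \langle (R : S) z \mid z \rangle + \varepsilon.
\end{aligned}
$$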

3.2.3 Pusz–Woronowicz Functional Calculus

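Proposition 3.4 For $A, B \in B(H)_+$,

$$(A : B) = (A + B)^{1/2} \frac{RS}{R + S} (A + B)^{1/2}.$$

Proof Let $z \in H$. Put $w := (A + B)^{1/2} z$. Then

$$
\begin{aligned}
\langle (A : B) z \mid z \rangle
&= \inf_{x + y = z} (\langle A x \mid x \rangle + \langle B y \mid y \rangle) \\
&= \inf_{x + y = z} (\langle (A + B)^{1/2} R (A + B)^{1/2} x \mid x \rangle + \langle (A + B)^{1/2} S (A + B)^{1/2} y \mid y \rangle) \\
&= \inf_{x, y \in \operatorname{ran}(A + B)^{1/2},\ x + y = w} (\langle R x \mid x \rangle + \langle S y \mid y \rangle) \\
&= \inf_{x, y \in \overline{\operatorname{ran}}(A + B)^{1/2},\ x + y = w} (\langle R x \mid x \rangle + \langle S y \mid y \rangle) \\
&= \left\langle \frac{RS}{R + S} w \,\middle|\, w \right\rangle \\
&= \left\langle (A + B)^{1/2} \frac{RS}{R + S} (A + B)^{1/2} z \,\middle|\, z \right\rangle.
\end{aligned}
$$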

From this, we have

$$
\begin{aligned}
A - (A : B) &= (A + B)^{1/2} \left( R - \frac{RS}{R + S} \right) (A + B)^{1/2} \qquad (4) \\
&= (A + B)^{1/2} R^2 (A + B)^{1/2}. \qquad (5)
\end{aligned}
$$

Similarly, put

$$e_0 := R + S \ (= I), \qquad e_1 := R, \qquad e_{k+1} := e_k - \frac{e_k (e_{k-1} - e_k)}{e_k + (e_{k-1} - e_k)} \quad (k = 1, 2, \ldots).$$

By a simple calculation, $e_k = R^k$, and so, for any polynomial $p$,

$$(A + B)^{1/2} p(R) (A + B)^{1/2}$$

can be written as a two-variable function of $(A, B)$.
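
The recursion for $e_k$ can be checked on a matrix example. The following sketch assumes that $R$ is a positive definite contraction (so that every $e_{k-1} = R^{k-1}$ is invertible and the quotient can be formed with a matrix inverse) and compares the recursion with ordinary matrix powers. The function name `powers_by_recursion` is a hypothetical choice.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 4
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
r = rng.uniform(0.2, 0.9, n)
R = (Q * r) @ Q.T   # symmetric with spectrum in (0.2, 0.9), so S = I - R >= 0

def powers_by_recursion(R, kmax):
    # e_0 = I, e_1 = R,
    # e_{k+1} = e_k - e_k (e_{k-1} - e_k) (e_k + (e_{k-1} - e_k))^{-1}.
    # All matrices involved are functions of R, so they commute.
    e = [np.eye(R.shape[0]), R]
    for k in range(1, kmax):
        d = e[k - 1] - e[k]
        e.append(e[k] - e[k] @ d @ np.linalg.inv(e[k] + d))
    return e

e = powers_by_recursion(R, 5)
errors = [np.max(np.abs(e[k] - np.linalg.matrix_power(R, k))) for k in range(6)]
print(max(errors))
```

The quotient in the recursion is the parallel sum $e_k : (e_{k-1} - e_k)$, which is why each $(A + B)^{1/2} R^k (A + B)^{1/2}$ can be expressed through operations on $(A, B)$.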

Definition 3.3 Let $\Phi$ be a real valued homogeneous continuous function on $[0, \infty)^2$, i.e., $\Phi(\lambda r, \lambda s) = \lambda \Phi(r, s)$ for all $\lambda, r, s \ge 0$. The two-variable map

$$(A, B) \in B(H)_+ \times B(H)_+ \mapsto (A + B)^{1/2} \Phi(R, S) (A + B)^{1/2} \in B(H)_{sa}$$

is said to be the Pusz–Woronowicz functional calculus (PW-functional calculus, for short) associated with $\Phi$. We write

$$\Phi(A, B) = (A + B)^{1/2} \Phi(R, S) (A + B)^{1/2}.$$

Put $f_\Phi(t) := \Phi(t, 1)$ and $\tilde f_\Phi(t) := \Phi(1, t)$. It is obvious that $\tilde f_\Phi(t) = \Phi(1, t) = t \Phi(1/t, 1) = t f_\Phi(1/t)$ for $t > 0$. From the continuity of $\Phi$, $f_\Phi$ and $\tilde f_\Phi$ are continuous functions on $(0, \infty)$ and

$$|f_\Phi(0+)| < \infty, \qquad |\tilde f_\Phi(0+)| < \infty.$$

Conversely, if a continuous function $f$ on $(0, \infty)$ has the properties

$$|f(0+)| < \infty, \qquad |\tilde f(0+)| = \lim_{t \searrow 0} |t f(1/t)| < \infty,$$

then the two-variable function $\Phi_f$ defined by

$$
\Phi_f(r, s) :=
\begin{cases}
s f(r/s), & \text{if } r, s \in (0, \infty), \\
0, & \text{if } r \cdot s = 0
\end{cases}
$$

is a continuous function on $[0, \infty)^2$. As a conclusion, $\Phi_f$ is a continuous function on $[0, \infty)^2$ if and only if

$$|f(0+)| < \infty \quad \text{and} \quad |\tilde f(0+)| < \infty. \qquad (6)$$

Let $A, B$ be invertible positive operators. Then $R, S$ are also invertible, and so, from Corollary 3.1,

$$
\begin{aligned}
\Phi_f(A, B) &= (A + B)^{1/2} \Phi_f(R, S) (A + B)^{1/2} \\
&= (A + B)^{1/2} S f(R/S) (A + B)^{1/2} \\
&= (A + B)^{1/2} P_f(R, S) (A + B)^{1/2} \\
&= P_f(A, B).
\end{aligned}
$$

Proposition 3.5 Let $f$ be a continuous function on $(0, \infty)$ with (6). If $A, B > 0$, then $\Phi_f(A, B) = P_f(A, B)$.
In the following, we show some properties of PW-functional calculus.

3.2.4 Homogeneity, Upper Continuity and Convexity

Proposition 3.6 Let $H$ and $K$ be Hilbert spaces and $A, B \in B(H)_+$. Let $\Phi$ be a real valued homogeneous continuous function and $C : K \to H$ be a bounded operator. If $\operatorname{ran}(A + B) \subset \operatorname{ran}(C)$, then

$$\Phi(C^* A C, C^* B C) = C^* \Phi(A, B) C.$$

Proof The operator $(A + B)^{1/2} C$ has the polar decomposition as follows:

$$(A + B)^{1/2} C = U (C^* (A + B) C)^{1/2},$$

where $U$ is a bounded operator from $K$ into $H$ with

$$U^* U = P_{\overline{\operatorname{ran}}\,|(A + B)^{1/2} C|}$$

and

$$U U^* = P_{\overline{\operatorname{ran}}((A + B)^{1/2} C)} = P_{\overline{\operatorname{ran}}(A + B)^{1/2}}.$$

The last equality comes from the assumption. Note that by Proposition 3.1 there exists a commuting pair $(R, S)$ of positive contractive operators on $\ker(A + B)^\perp$ such that

$$A = (A + B)^{1/2} R (A + B)^{1/2}, \qquad B = (A + B)^{1/2} S (A + B)^{1/2}.$$

So, we have

$$
\begin{aligned}
C^* A C &= C^* (A + B)^{1/2} R (A + B)^{1/2} C \\
&= (C^* A C + C^* B C)^{1/2} U^* R U (C^* A C + C^* B C)^{1/2}
\end{aligned}
$$

and similarly

$$C^* B C = (C^* A C + C^* B C)^{1/2} U^* S U (C^* A C + C^* B C)^{1/2}.$$

Thus

$$
\begin{aligned}
\Phi(C^* A C, C^* B C) &= (C^* A C + C^* B C)^{1/2} \Phi(U^* R U, U^* S U) (C^* A C + C^* B C)^{1/2} \\
&= (C^* A C + C^* B C)^{1/2} U^* \Phi(R, S) U (C^* A C + C^* B C)^{1/2} \\
&= C^* (A + B)^{1/2} \Phi(R, S) (A + B)^{1/2} C = C^* \Phi(A, B) C.
\end{aligned}
$$

The next theorem plays an important role in the study of PW-functional calculus.
Theorem 3.1 ([33, Theorem 6.1]) Let $\Phi$ be a real valued homogeneous continuous function on $[0, \infty)^2$ and let $A_n, B_n, A, B$ be in $B(H)_+$. If $A_n \searrow A$ and $B_n \searrow B$, then $\Phi(A_n, B_n)$ converges strongly to $\Phi(A, B)$.
We need some lemmas. Here, $(R_n, S_n)$ is the pair of positive contractive operators on $\overline{\operatorname{ran}}(A + B)^{1/2}$ corresponding to $(A_n, B_n)$, namely,

$$A_n = (A_n + B_n)^{1/2} R_n (A_n + B_n)^{1/2}, \qquad B_n = (A_n + B_n)^{1/2} S_n (A_n + B_n)^{1/2}.$$

Lemma 3.1 $\langle R_n \xi \mid \eta \rangle \to \langle R \xi \mid \eta \rangle$ for $\xi, \eta \in \overline{\operatorname{ran}}(A + B)^{1/2}$.

Proof Put $T_n := (A_n + B_n)^{1/2}$ and $T := (A + B)^{1/2}$. Since $\xi, \eta$ ($\in \overline{\operatorname{ran}}\, T$) can be approximated by elements of $\operatorname{ran} T$, it is enough to show the case where $\xi, \eta \in \operatorname{ran} T$.
From the assumption, it is clear that

$$A_n = T_n R_n T_n, \qquad A = T R T, \qquad T_n \searrow T,$$

and for every $x, y \in H$,

$$\langle R_n T x \mid T y \rangle = \langle R_n (T - T_n) x \mid T y \rangle + \langle R_n T_n x \mid (T - T_n) y \rangle + \langle R_n T_n x \mid T_n y \rangle$$

holds. Using the fact that $R_n$ and $R$ are contractions,

$$
\begin{aligned}
|\langle R_n T x \mid T y \rangle - \langle R T x \mid T y \rangle|
&\le \|(T - T_n) x\| \|T y\| + \|T_n x\| \|(T - T_n) y\| + \|(T_n R_n T_n - T R T) x\| \|y\| \\
&\to 0 \quad \text{as } n \to \infty.
\end{aligned}
$$

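Using the fact that $T_n R_n^2 T_n = A_n - (A_n : B_n)$ converges strongly to $A - (A : B) = T R^2 T$ (see (5)), we have the following.
Lemma 3.2 $\langle R_n^2 \xi \mid \eta \rangle \to \langle R^2 \xi \mid \eta \rangle$ for $\xi, \eta \in \overline{\operatorname{ran}}(A + B)^{1/2}$.
Proof It is enough to show the case where $\xi, \eta \in \operatorname{ran}(A + B)^{1/2}$. By a calculation similar to that in the above proof, we have

$$
\begin{aligned}
|\langle R_n^2 T x \mid T y \rangle - \langle R^2 T x \mid T y \rangle|
&\le \|(T - T_n) x\| \|T y\| + \|T_n x\| \|(T - T_n) y\| + \|(T_n R_n^2 T_n - T R^2 T) x\| \|y\| \\
&= \|(T - T_n) x\| \|T y\| + \|T_n x\| \|(T - T_n) y\| \\
&\quad + \|(A_n - (A_n : B_n) - (A - (A : B))) x\| \|y\|.
\end{aligned}
$$

Thus $|\langle R_n^2 T x \mid T y \rangle - \langle R^2 T x \mid T y \rangle|$ tends to 0 as $n \to \infty$.

Lemma 3.3 For $k \in \mathbb{N}$, $R_n^k \xi \to R^k \xi$ for all $\xi \in \overline{\operatorname{ran}}(A + B)^{1/2}$.
Proof Let us prove this by induction on $k$. We first show the case $k = 1$. By using the above two lemmas, for $\xi \in \overline{\operatorname{ran}}(A + B)^{1/2}$,

$$\|R_n \xi - R \xi\|^2 = \langle R_n^2 \xi \mid \xi \rangle - 2 \mathrm{Re} \langle R_n \xi \mid R \xi \rangle + \langle R^2 \xi \mid \xi \rangle$$

tends to 0 as $n \to \infty$.
Assume the statement holds for some $k$. Then for $\xi \in \overline{\operatorname{ran}}(A + B)^{1/2}$,

$$
\begin{aligned}
\|(R_n^{k+1} - R^{k+1}) \xi\| &\le \|(R_n^{k+1} - R_n R^k) \xi\| + \|(R_n - R) R^k \xi\| \\
&\le \|(R_n^k - R^k) \xi\| + \|(R_n - R) R^k \xi\| \to 0 \quad \text{as } n \to \infty.
\end{aligned}
$$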

Proof of Theorem 3.1 We use the notations $T_n$ and $T$ defined in the proof of Lemma 3.1. Put $f(t) := \Phi(t, 1 - t)$. Then $f$ is a continuous function on $[0, 1]$ and

$$\Phi(A_n, B_n) = T_n f(R_n) T_n, \qquad \Phi(A, B) = T f(R) T.$$

Since $f$ is uniformly approximated by polynomials, the above lemma implies

$$f(R_n) \xi \to f(R) \xi \quad \text{for all } \xi \in \overline{\operatorname{ran}}(A + B)^{1/2}.$$

Using the property $T_n \searrow T$,

$$
\begin{aligned}
\|\Phi(A_n, B_n) \xi - \Phi(A, B) \xi\|
&= \|T_n f(R_n) T_n \xi - T f(R) T \xi\| \\
&\le \|T_n f(R_n) (T_n - T) \xi\| + \|T_n (f(R_n) - f(R)) T \xi\| + \|(T_n - T) f(R) T \xi\| \\
&\le \|T_1\| \|f\|_\infty \|(T_n - T) \xi\| + \|T_1\| \|(f(R_n) - f(R)) T \xi\| + \|(T_n - T) f(R) T \xi\|
\end{aligned}
$$

converges to 0 as $n \to \infty$, where $\|f\|_\infty := \max\{|f(t)| : t \in [0, 1]\}$.

In the following, we denote the operator $A + \varepsilon I$ by $A_\varepsilon$.
Corollary 3.3 Let $f$ be a real valued continuous function on $(0, \infty)$. Then the following are equivalent:
(1) The strong limit of $P_f(A_\varepsilon, B_\varepsilon)$ as $\varepsilon \searrow 0$ exists for all $(A, B) \in B(H)_+ \times B(H)_+$;
(2) $f$ satisfies (6).
Proof (2) ⇒ (1). Immediate from Theorem 3.1.
(1) ⇒ (2). Take $(A, B) = (I, 0)$ (resp. $(A, B) = (0, I)$). Then $P_f(A_\varepsilon, B_\varepsilon)$ converges to $\tilde f(0+) \cdot I$ (resp. $f(0+) \cdot I$) as $\varepsilon \searrow 0$.

Remark 3.3 The statement in Corollary 3.3 is similar to Corollary 2.7, except that here the function $f$ is not assumed to be operator convex.
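Proposition 3.7 Let $f$ be a real valued continuous function on $(0, \infty)$ with (6). Then the map $(A, B) \mapsto \Phi_f(A, B)$ is jointly convex if and only if $f$ is operator convex on $(0, \infty)$.
Proof It is enough to show the "if" part. Assume $f$ is operator convex on $(0, \infty)$ with (6). We fix positive operators $A_i, B_i$ ($i = 1, 2$) and $x \in H$, and write $A_{i,\varepsilon} := A_i + \varepsilon I$ and $B_{i,\varepsilon} := B_i + \varepsilon I$. Put

$$x_\alpha := \alpha \Phi_f(A_1, B_1) x + (1 - \alpha) \Phi_f(A_2, B_2) x - \Phi_f(\alpha A_1 + (1 - \alpha) A_2, \alpha B_1 + (1 - \alpha) B_2) x$$

and

$$x_\alpha(\varepsilon) := \alpha \Phi_f(A_{1,\varepsilon}, B_{1,\varepsilon}) x + (1 - \alpha) \Phi_f(A_{2,\varepsilon}, B_{2,\varepsilon}) x - \Phi_f(\alpha A_{1,\varepsilon} + (1 - \alpha) A_{2,\varepsilon}, \alpha B_{1,\varepsilon} + (1 - \alpha) B_{2,\varepsilon}) x.$$

By using Proposition 3.5 and Theorem 2.1, we have $\langle x_\alpha(\varepsilon) \mid x \rangle \ge 0$ for all $\alpha \in [0, 1]$. Thanks to Theorem 3.1, this converges to $\langle x_\alpha \mid x \rangle \ge 0$ as $\varepsilon \searrow 0$.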

From Theorem 2.1, it is clear that the former properties of an operator perspective are inherited by PW-functional calculus.
Corollary 3.4 Let $f$ be a positive valued continuous function on $(0, \infty)$ with (6). Then the map $(A, B) \mapsto \Phi_f(A, B)$ is jointly concave if and only if $f$ is operator monotone on $(0, \infty)$.
Corollary 3.5 Let $f$ be a continuous function on $(0, \infty)$ with (6). Then $\Phi_f(A, B) = \Phi_{\tilde f}(B, A)$ holds for all $(A, B) \in B(H)_+ \times B(H)_+$.

Corollary 3.6 Let $f, g$ be continuous functions on $(0, \infty)$ with (6). If $f \le g$, then $\Phi_f(A, B) \le \Phi_g(A, B)$ holds for all $(A, B) \in B(H)_+ \times B(H)_+$.

Corollary 3.7 Let $\alpha, \beta > 0$ and let $f \in OC$ with (6). If $f(\alpha/\beta) = 0$, then the map $X \ge 0 \mapsto \Phi_f(A + \alpha X, B + \beta X)$ is decreasing.
We denote by $OM_+$ the set of all positive (non-negative) operator monotone functions on $(0, \infty)$. If $f \in OM_+$, then $\tilde f$ is also in $OM_+$, so $f$ satisfies (6).
Corollary 3.8 Let $f \in OM_+$. If $0 \le A \le C$ and $0 \le B \le D$, then $0 \le \Phi_f(A, B) \le \Phi_f(C, D)$.

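Proof Let $\varepsilon > 0$. The facts $0 < A_\varepsilon \le C_\varepsilon$, $0 < B_\varepsilon \le D_\varepsilon$ and Proposition 2.5 imply the following:

$$0 \le P_f(A_\varepsilon, B_\varepsilon) = \Phi_f(A_\varepsilon, B_\varepsilon)$$

and

$$0 \le \Phi_f(C_\varepsilon, D_\varepsilon) - \Phi_f(A_\varepsilon, B_\varepsilon),$$

which implies the desired result by Theorem 3.1.

Corollary 3.9 Let $f \in OM_+$. Then, for $A, B, C \ge 0$,

$$C \Phi_f(A, B) C \le \Phi_f(CAC, CBC)$$

holds.
Proof For $\varepsilon_1, \varepsilon_2, \varepsilon_3 > 0$,

$$
\begin{aligned}
C_{\varepsilon_3} \Phi_f(A, B) C_{\varepsilon_3}
&\le C_{\varepsilon_3} \Phi_f(A_{\varepsilon_1}, B_{\varepsilon_1}) C_{\varepsilon_3} && \text{(Corollary 3.8)} \\
&= \Phi_f(C_{\varepsilon_3} A_{\varepsilon_1} C_{\varepsilon_3}, C_{\varepsilon_3} B_{\varepsilon_1} C_{\varepsilon_3}) && \text{(Proposition 3.6)} \\
&\le \Phi_f(C_{\varepsilon_3} A_{\varepsilon_1} C_{\varepsilon_3} + \varepsilon_2 I, C_{\varepsilon_3} B_{\varepsilon_1} C_{\varepsilon_3} + \varepsilon_2 I) \\
&= P_f(C_{\varepsilon_3} A_{\varepsilon_1} C_{\varepsilon_3} + \varepsilon_2 I, C_{\varepsilon_3} B_{\varepsilon_1} C_{\varepsilon_3} + \varepsilon_2 I) && \text{(Proposition 3.5)}
\end{aligned}
$$

hold. So, we have

$$\langle C_{\varepsilon_3} \Phi_f(A, B) C_{\varepsilon_3} \xi \mid \xi \rangle \le \langle P_f(C_{\varepsilon_3} A_{\varepsilon_1} C_{\varepsilon_3} + \varepsilon_2 I, C_{\varepsilon_3} B_{\varepsilon_1} C_{\varepsilon_3} + \varepsilon_2 I) \xi \mid \xi \rangle$$

for all $\xi \in H$. By letting $\varepsilon_1 \searrow 0$,

$$\langle C_{\varepsilon_3} \Phi_f(A, B) C_{\varepsilon_3} \xi \mid \xi \rangle \le \langle P_f(C_{\varepsilon_3} A C_{\varepsilon_3} + \varepsilon_2 I, C_{\varepsilon_3} B C_{\varepsilon_3} + \varepsilon_2 I) \xi \mid \xi \rangle,$$

and then letting $\varepsilon_3 \searrow 0$, we have

$$\langle C \Phi_f(A, B) C \xi \mid \xi \rangle \le \langle P_f(CAC + \varepsilon_2 I, CBC + \varepsilon_2 I) \xi \mid \xi \rangle,$$

which implies the desired result.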

3.2.5 Restricted Domain

We denote by (B(H )+ × B(H )+ )≤ (resp. (B(H )+ × B(H )+ )≥ ) the set of all

pairs of positive operators (A, B) such that cA ≤ B (resp. A ≥ cB) for some
c > 0. These domains have appeared several times in this chapter. We show some
properties of PW-functional calculus of a pair of positive operators in these domain.
Let (A, B) be in B(H )+ × B(H )+ and let (R, S) be a pair of positive operators
that correspond to (A, B).
Lemma 3.4 Let c > 0 be a positive number. Then the followings are equivalent:
(1) cA ≤ B (resp. cB ≤ A) ;
(2) cR ≤ S (resp. cS ≤ R) ;
c
(3) c+1 I ≤ S (resp. c+1
c
I ≤ R).
Proof Immediate (see Remark 3.2).

By using this lemma, if the pair (A, B) is in (B(H )+ × B(H )+ )≤ , then PW-
functional calculus (A, B) is written as follows:
Proposition 3.8 Let f be a continuous function on (0, ∞) with (6). Then

\[ f(A, B) = (A + B)^{1/2} f(RS^{-1})\, S\, (A + B)^{1/2}. \]

We next treat the domain (B(H )+ × B(H )+ )≥ . Let f be a real valued continuous
function on (0, ∞) with $|\tilde f(0+)| < \infty$ and let α, c > 0. Assume that positive
operators A, B satisfy A ≥ cB. Then, from the above lemma, we have R ≥ cS and
R > 0, which implies $0 \le S/R \le (1/c)I$.
The functions $\tilde f$ and $\tilde f_\alpha$ extend to continuous functions on [0, 1/c], where
$\tilde f(0) := \tilde f(0+)$ and $\tilde f_\alpha(0) := \tilde f_\alpha(0+) = \tilde f(0+)$. Here, we define a two variable
function ϕ(α, t) on [0, 1] × [0, 1/c] by

\[ \varphi(\alpha, t) := \begin{cases} \tilde f_\alpha(t) = t f\!\left(\dfrac{1}{t} + \alpha\right), & \text{if } \alpha \in [0, 1],\ t \in (0, 1/c], \\[1ex] \tilde f_\alpha(0+), & \text{if } \alpha \in [0, 1],\ t = 0. \end{cases} \]

This function is continuous on the compact set [0, 1] × [0, 1/c], so it is
uniformly continuous, i.e., for every $\varepsilon > 0$, there exists δ > 0 such that if
$d((\alpha, t), (\alpha', t')) = |(\alpha - \alpha')^2 + (t - t')^2|^{1/2} < \delta$, then $|\varphi(\alpha, t) - \varphi(\alpha', t')| < \varepsilon$.
If $\alpha\,(= d((\alpha, t), (0, t))) < \delta$, then

\[ |\tilde f_\alpha(t) - \tilde f(t)| = |\varphi(\alpha, t) - \varphi(0, t)| < \varepsilon \quad (t \in [0, 1/c]), \]

which shows that $\tilde f_\alpha$ converges uniformly to $\tilde f$ on [0, 1/c] as $\alpha \searrow 0$.

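Put $X := (A + B)^{1/2} \tilde f(S/R) R (A + B)^{1/2}$. Then

\begin{align*}
\|X - f_\alpha(A, B)\| &= \|X - \tilde f_\alpha(B, A)\| \\
&\le \|(A + B)^{1/2}\|\, \|\tilde f(S/R) - \tilde f_\alpha(S/R)\|\, \|R\|\, \|(A + B)^{1/2}\| \\
&\le \|\tilde f - \tilde f_\alpha\|_\infty \|A + B\|,
\end{align*}

where $\|\tilde f - \tilde f_\alpha\|_\infty := \max\{|\tilde f(t) - \tilde f_\alpha(t)| : t \in [0, 1/c]\}$.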

Proposition 3.9 Let f be a continuous function on (0, ∞) with $|\tilde f(0+)| < \infty$
and let (A, B) ∈ (B(H )+ × B(H )+ )≥ . Then $f_\alpha(A, B)$ converges to
$(A + B)^{1/2} \tilde f(S/R) R (A + B)^{1/2}$ in the operator norm topology as $\alpha \searrow 0$.
Corollary 3.10 Let f be a continuous function on (0, ∞) with $|\tilde f(0+)| < \infty$ and
let (A, B) ∈ (B(H )+ × B(H )+ )≥ . Then $P_f(A_\varepsilon, B_\varepsilon)$ strongly converges to
$(A + B)^{1/2} \tilde f(S/R) R (A + B)^{1/2}$ as $\varepsilon \searrow 0$.
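Proof Assume that c ∈ (0, 1), α > 0 and A ≥ cB. Then, for every $\varepsilon > 0$, we have
$A_\varepsilon \ge cB_\varepsilon$, which implies that

\[ 0 \le \frac{S(\varepsilon)}{R(\varepsilon)} \le (1/c) I \]

holds, where $(R(\varepsilon), S(\varepsilon))$ is the pair of positive operators corresponding to
$(A_\varepsilon, B_\varepsilon)$.
Put $X := (A + B)^{1/2} \tilde f(S/R) R (A + B)^{1/2}$. Using the argument before the above
proposition,

\begin{align*}
\|X - f_\alpha(A, B)\| &\le \|\tilde f - \tilde f_\alpha\|_\infty \|A + B\|, \\
\|P_{f_\alpha}(A_\varepsilon, B_\varepsilon) - P_f(A_\varepsilon, B_\varepsilon)\| &\le \|\tilde f - \tilde f_\alpha\|_\infty \|A_\varepsilon + B_\varepsilon\| \\
&\le \|\tilde f - \tilde f_\alpha\|_\infty (\|A + B\| + 2) \quad (1 > \varepsilon > 0).
\end{align*}

So, for every ξ ∈ H and for every $\varepsilon \in (0, 1)$,

\begin{align*}
\|X\xi - P_f(A_\varepsilon, B_\varepsilon)\xi\| &\le \|X\xi - f_\alpha(A, B)\xi\| + \|f_\alpha(A, B)\xi - P_{f_\alpha}(A_\varepsilon, B_\varepsilon)\xi\| \\
&\quad + \|P_{f_\alpha}(A_\varepsilon, B_\varepsilon)\xi - P_f(A_\varepsilon, B_\varepsilon)\xi\| \\
&\le \|\tilde f - \tilde f_\alpha\|_\infty (2\|A + B\| + 2)\|\xi\| \\
&\quad + \|f_\alpha(A, B)\xi - P_{f_\alpha}(A_\varepsilon, B_\varepsilon)\xi\|,
\end{align*}

where $\|\tilde f - \tilde f_\alpha\|_\infty := \max\{|\tilde f(t) - \tilde f_\alpha(t)| : t \in [0, 1/c]\}$. Thus the desired result
is obtained by Theorem 3.1 and Proposition 3.9.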

Corollary 3.11 Let f be a continuous function on (0, ∞). Then the following are
equivalent:
(1) $P_f(A_\varepsilon, B_\varepsilon)$ converges strongly as $\varepsilon \searrow 0$ for all (A, B) ∈ (B(H )+ × B(H )+ )≥ ;
(2) $|\tilde f(0+)| < \infty$.
Proof (2)⇒(1). Immediate from the above proposition. (1)⇒(2). Take (A, B) =
(I, 0).

Corollary 3.12 Let f be a continuous function on (0, ∞). Then the following are
equivalent:
(1) $P_f(A_\varepsilon, B_\varepsilon)$ converges strongly as $\varepsilon \searrow 0$ for all (A, B) ∈ (B(H )+ × B(H )+ )≤ ;
(2) $|f(0+)| < \infty$.
Example 3.3 Take $f(t) = t^\alpha$ (α ∈ R). Then the following are equivalent:
(1) $P_f(A_\varepsilon, B_\varepsilon)$ converges strongly as $\varepsilon \searrow 0$ for all (A, B) ∈ (B(H )+ × B(H )+ )≤
(resp. for all (A, B) ∈ (B(H )+ × B(H )+ )≥ );
(2) α ≥ 0 (resp. α ≤ 1).
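Remark 3.4 When the Hilbert space H is finite dimensional, (A, B) ∈ (B(H )+ ×
B(H )+ )≥ if and only if $P_{(\ker A)^\perp} \ge P_{(\ker B)^\perp}$. Since the “only if” part is obvious,
we show the “if” part. The positive operators A, B have the following spectral
decompositions:

\[ A = \sum_{i=1}^{n} \alpha_i P_i \quad (\alpha_1 \ge \alpha_2 \ge \cdots \ge \alpha_n > 0), \]

\[ B = \sum_{j=1}^{m} \beta_j Q_j \quad (\beta_1 \ge \beta_2 \ge \cdots \ge \beta_m > 0). \]

So if $P_{(\ker A)^\perp} \ge P_{(\ker B)^\perp}$, then

\[ A \ge \alpha_n \sum_{i=1}^{n} P_i \ge \alpha_n \sum_{j=1}^{m} Q_j = \frac{\alpha_n}{\beta_1} \sum_{j=1}^{m} \beta_1 Q_j \ge \frac{\alpha_n}{\beta_1} \sum_{j=1}^{m} \beta_j Q_j = \frac{\alpha_n}{\beta_1} B. \]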

4 Theory of Operator Means

4.1 Kubo-Ando’s Axiomatization

When a function f is in OM+ , f satisfies (6) and f˜ (A, B) is well-defined for

all (A, B) ∈ B(H )+ × B(H )+ . As stated before, the binary operation (A, B) ↦
Aσ B(:= f˜ (A, B)) satisfies the following statements:
(i) A ≤ C, B ≤ D ⇒ Aσ B ≤ Cσ D,
(ii) C(Aσ B)C ≤ (CAC)σ (CBC) for all C ≥ 0,
(iii) An ↘ A ≥ 0, Bn ↘ B ≥ 0 ⇒ An σ Bn ↘ Aσ B.
We call such a binary operation σ an operator connection and denote the set
of operator connections by . In [39], Kubo and Ando show that the above three
statements characterize the class { f˜ | f ∈ OM+ }.
Theorem 4.1 For every σ ∈ , there uniquely exists fσ ∈ OM+ such that Aσ B =
f˜σ (A, B) for all A, B ∈ B(H )+ . The map σ $→ fσ is an affine order isomorphism
from onto OM+ .
Lemma 4.1 For C ∈ B(H )++ , C(Aσ B)C = (CAC)σ (CBC).
Proof From the statement (ii), we have

\begin{align*}
A\sigma B &= (C^{-1}CACC^{-1})\,\sigma\,(C^{-1}CBCC^{-1}) \\
&\ge C^{-1}\big((CAC)\,\sigma\,(CBC)\big)C^{-1} \\
&\ge C^{-1}C(A\sigma B)CC^{-1} = A\sigma B.
\end{align*}

Lemma 4.2 Let σ ∈ and A, B ∈ B(H )+ . If an orthogonal projection P
commutes with A, B, then ((AP )σ (BP ))P = (Aσ B)P = P (Aσ B) holds.
Proof Since the condition (i) and (ii) hold, we have

P (Aσ B)P ≤ (P AP )σ (P BP ) ≤ Aσ B.

This implies

\begin{align*}
\|(A\sigma B)P - P(A\sigma B)P\|^2 &= \|((A\sigma B) - P(A\sigma B)P)P\|^2 \\
&\le \|(A\sigma B) - P(A\sigma B)P\|\, \|P(A\sigma B - P(A\sigma B)P)P\| = 0.
\end{align*}

Thus (Aσ B)P = P (Aσ B)P = (P (Aσ B)P )∗ = P (Aσ B).

Lemma 4.3 For t ≥ 0, there exists αt ≥ 0 such that I σ (tI ) = αt I .

Proof Let P be an orthogonal projection. It follows from the fact that P commutes
with I and tI and the preceding lemma that P commutes with I σ (tI ). Thus I σ (tI )
reduces all closed subspaces in H .

Put f (t) := I σ (tI ). From the statements (i) and (iii), f (t) is right continuous on [0, ∞).
On the other hand, from Lemma 4.1, f (t)/t (= ((1/t)I )σ I ) is left continuous on
(0, ∞). Combining them, f is continuous on [0, ∞).
Lemma 4.4 f (t)(:= I σ (tI )) is an operator monotone function on [0, ∞).
Proof We first show the case that there exist orthogonal projections $\{P_i\}$ such that
$A = \sum_i \alpha_i P_i$, $\sum_i P_i = I$ and $P_i P_j = 0$ (i ≠ j). From Lemma 4.2,

\begin{align*}
I\sigma A &= (I\sigma A)\Big(\sum_i P_i\Big) = \sum_i (I\sigma A)P_i \\
&= \sum_i (P_i\,\sigma\,(\alpha_i P_i)) = \sum_i (I\,\sigma\,(\alpha_i I))P_i \\
&= \sum_i f(\alpha_i)P_i = f(A).
\end{align*}

For general A, there exists a sequence $A_n$ of the above form such that $A_n \searrow A$. So,
$I\sigma A_n = f(A_n)$ converges to $I\sigma A\ (= f(A))$ strongly. From the statement (i), it
follows that f is operator monotone.

Proof of Theorem 4.1 Let σ ∈ . From the above lemmas, f (t)(:= I σ (tI )) is
operator monotone and f (A) = I σ A for all A ≥ 0. Thus we have

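\begin{align*}
A_\varepsilon\,\sigma\,B_\varepsilon &= A_\varepsilon^{1/2}\big(I\,\sigma\,(A_\varepsilon^{-1/2} B_\varepsilon A_\varepsilon^{-1/2})\big)A_\varepsilon^{1/2} \\
&= A_\varepsilon^{1/2} f(A_\varepsilon^{-1/2} B_\varepsilon A_\varepsilon^{-1/2}) A_\varepsilon^{1/2} \\
&= \tilde f(A_\varepsilon, B_\varepsilon)
\end{align*}

hold for all A, B ∈ B(H )+ and $\varepsilon > 0$. Taking the strong limit of each side, $A\sigma B = \tilde f(A, B)$
holds for all A, B ∈ B(H )+ .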
Let us prove the second half of the theorem. The equivalences

σ = ασ1 + (1 − α)σ2 ⇐⇒ fσ = αfσ1 + (1 − α)fσ2 ,

σ = 0 ⇐⇒ fσ = 0

and

σ1 ≤ σ2 ⇐⇒ fσ1 ≤ fσ2

are obvious. So, it is enough to show that the map σ ↦ fσ is surjective. For
every f ∈ OM+ , put Aσ B := f˜ (A, B). This binary operation σ satisfies the

statements (i) and (ii) by Corollary 3.8 and Corollary 3.9, respectively. The proof of
the statement (iii) comes from Theorem 3.1 and Corollary 3.7 This implies σ ∈ .

From the above argument, a normalized positive valued operator monotone function
on (0, ∞) is identified with an operator mean. The binary operation σ satisfying
Aσ B = A (resp. Aσ B = B) is a trivial example for an operator mean and is
denoted by l (resp. r).

4.2 Operator Means

An operator connection σ having I σ I = I is called an operator mean. From the
theorem in the last subsection, there exists an affine order isomorphism σ ↔ fσ
from the set 1 of all operator means onto the set $OM_+^1 := \{f \in OM_+ \mid f(1) = 1\}$.

4.2.1 Integral Representation

It is known that a necessary and sufficient condition for a positive continuous
function f on (0, ∞) to be operator monotone is that f has an integral representation as
follows:

\[ f(t) = \int_{[0,\infty]} \frac{x : t}{x : 1}\, dm(x) \quad (t > 0), \tag{7} \]

where m is a positive finite Borel measure on [0, ∞] (see [6, V.53, p. 144]).
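Example 4.1 We give some examples of the correspondence m ↔ f .

\[ (1 - \alpha)\delta_{\{0\}} + \alpha\delta_{\{\infty\}} \longleftrightarrow f_{\nabla_\alpha}(t) := (1 - \alpha) + \alpha t \quad (0 \le \alpha \le 1), \]

\[ \frac{\sin \alpha\pi}{\pi} \cdot \frac{x^{\alpha-1}}{1 + x}\, dx \longleftrightarrow f_{\#_\alpha}(t) := t^\alpha \quad (0 < \alpha < 1), \]

\[ \delta_{\{\alpha/(1-\alpha)\}} \longleftrightarrow f_{!_\alpha}(t) := ((1 - \alpha) + \alpha t^{-1})^{-1} \quad (0 < \alpha < 1). \]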

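Example 4.2 We introduced $f_{\nabla_\alpha}$, $f_{\#_\alpha}$ and $f_{!_\alpha}$ above. The corresponding operator
means can be written as follows:

\begin{align*}
A \nabla_\alpha B &= (1 - \alpha)A + \alpha B \\
&= (A + B)^{1/2} ((1 - \alpha)R + \alpha S)(A + B)^{1/2},
\end{align*}

\begin{align*}
A \#_\alpha B &= \text{s-}\lim_{\varepsilon \searrow 0} A_\varepsilon^{1/2} (A_\varepsilon^{-1/2} B_\varepsilon A_\varepsilon^{-1/2})^\alpha A_\varepsilon^{1/2} \\
&= (A + B)^{1/2} R^{1-\alpha} S^\alpha (A + B)^{1/2},
\end{align*}

where s-lim denotes the strong limit. Moreover,

\begin{align*}
A\, !_\alpha\, B &= \text{s-}\lim_{\varepsilon \searrow 0} ((1 - \alpha)A_\varepsilon^{-1} + \alpha B_\varepsilon^{-1})^{-1} \\
&= (A + B)^{1/2}\, \frac{RS}{\alpha R + (1 - \alpha)S}\, (A + B)^{1/2}.
\end{align*}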

Remark 4.1 In the following, we write f!0 (t) := 1, f#0 (t) := 1, f!1 (t) := t and
f#1 (t) := t.
The operator means ∇α , #α and !α are called the arithmetic mean, the geometric
mean and the harmonic mean, respectively. By a simple calculation, we have f!α ≤ f#α ≤ f∇α ,
which implies !α ≤ #α ≤ ∇α for all α ∈ [0, 1].
In the following, we denote ∇ := ∇1/2 , # := #1/2 and ! :=!1/2.
Example 4.3 Since the power functions $t^\alpha$ (0 ≤ α ≤ 1) are in $OM_+^1$, the integral
$\int_0^1 t^\alpha\, d\alpha = (t - 1)/\log t$ is also in $OM_+^1$. The corresponding operator mean
is denoted by λ and is called the logarithmic mean. The relation between the
logarithmic mean and the operator means stated above is # ≤ λ ≤ ∇.
Remark 4.2 The upper and lower bounds for the logarithmic mean have been
studied [18, 57]. The following is curious in this respect [38, 42]:

\[ \min\{r \ge 0 \mid \lambda(a, b) \le p_{r,1/2}(a, b)\ (\forall a, b \in (0, \infty))\} = 1/3, \]

where $p_{r,1/2}(a, b) := ((a^r + b^r)/2)^{1/r}$.

4.2.2 Geometric Mean

We now turn to statements that characterize the geometric mean. Recall the
fundamental formula for the operator geometric mean:

\[ A \# B = A^{1/2}(A^{-1/2} B A^{-1/2})^{1/2} A^{1/2} \quad (A > 0,\ B \ge 0). \]

We first show that it is the unique positive solution of the Riccati equation.
Proposition 4.1 Let A, B, X be positive operators. If A is invertible, then $X = A \# B$
if and only if $XA^{-1}X = B$.

Proof

\begin{align*}
X = A \# B &\iff A^{-1/2} X A^{-1/2} = (A^{-1/2} B A^{-1/2})^{1/2} \\
&\iff A^{-1/2} X A^{-1} X A^{-1/2} = (A^{-1/2} X A^{-1/2})^2 = A^{-1/2} B A^{-1/2} \\
&\iff X A^{-1} X = B.
\end{align*}

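Proposition 4.2 Let A, B, X be positive operators. Consider the following
statements:

(1) $\begin{bmatrix} A & X \\ X & B \end{bmatrix} \ge 0$;
(2) $X A_\varepsilon^{-1} X \le B$ ($\forall \varepsilon > 0$);
(3) $X \le A \# B$.
Then (1) ⇐⇒ (2) ⇒ (3) hold.
Proof (1) ⇐⇒ (2). Let $\varepsilon, \delta > 0$. Put $S := A_\varepsilon^{-1/2} X B_\delta^{-1/2}$. Since

\[ \begin{bmatrix} I & S \\ S^* & I \end{bmatrix} = \begin{bmatrix} A_\varepsilon^{-1/2} & \\ & B_\delta^{-1/2} \end{bmatrix} \begin{bmatrix} A_\varepsilon & X \\ X & B_\delta \end{bmatrix} \begin{bmatrix} A_\varepsilon^{-1/2} & \\ & B_\delta^{-1/2} \end{bmatrix} \ge 0, \]

we have

\[ \left\langle \begin{bmatrix} I & -S \\ -S^* & I \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \,\Big|\, \begin{bmatrix} x \\ y \end{bmatrix} \right\rangle = \left\langle \begin{bmatrix} I & S \\ S^* & I \end{bmatrix} \begin{bmatrix} x \\ -y \end{bmatrix} \,\Big|\, \begin{bmatrix} x \\ -y \end{bmatrix} \right\rangle \ge 0 \]

for x, y ∈ H . So, the inequalities

\[ \begin{bmatrix} I & \\ & I \end{bmatrix} \ge \begin{bmatrix} & S \\ S^* & \end{bmatrix} \ge \begin{bmatrix} -I & \\ & -I \end{bmatrix} \]

and $\|S\| = \|A_\varepsilon^{-1/2} X B_\delta^{-1/2}\| \le 1$ hold, which is equivalent to

\[ X A_\varepsilon^{-1} X = B_\delta^{1/2} (B_\delta^{-1/2} X A_\varepsilon^{-1} X B_\delta^{-1/2}) B_\delta^{1/2} \le B_\delta. \]

(2) ⇒ (3).

\begin{align*}
X A_\varepsilon^{-1} X \le B &\iff (A_\varepsilon^{-1/2} X A_\varepsilon^{-1/2})^2 = A_\varepsilon^{-1/2} X A_\varepsilon^{-1} X A_\varepsilon^{-1/2} \le A_\varepsilon^{-1/2} B A_\varepsilon^{-1/2} \\
&\implies A_\varepsilon^{-1/2} X A_\varepsilon^{-1/2} \le (A_\varepsilon^{-1/2} B A_\varepsilon^{-1/2})^{1/2} \\
&\iff X \le A_\varepsilon \# B.
\end{align*}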


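Theorem 4.2 Let A, B be positive operators. Then

\[ A \# B = \max\left\{ X \ge 0 \;\Big|\; \begin{bmatrix} A & X \\ X & B \end{bmatrix} \ge 0 \right\}. \]

Proof From the above proposition,

\[ \left\{ X \ge 0 \;\Big|\; \begin{bmatrix} A & X \\ X & B \end{bmatrix} \ge 0 \right\} \subseteq \{X \ge 0 \mid X \le A \# B\}. \]

So, to show this theorem, it is enough to prove

\[ \begin{bmatrix} A & A \# B \\ A \# B & B \end{bmatrix} \ge 0. \]

Thanks to $(A_\varepsilon \# B_\delta) A_\varepsilon^{-1} (A_\varepsilon \# B_\delta) = B_\delta$, we have

\[ \begin{bmatrix} A_\varepsilon & A_\varepsilon \# B_\delta \\ A_\varepsilon \# B_\delta & B_\delta \end{bmatrix} \ge 0, \]

which implies the desired result.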
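
The Riccati characterization and the block-matrix positivity can be checked numerically in finite dimensions. The following is a minimal sketch, assuming NumPy and real symmetric positive definite matrices; the helper names `sym_fun`, `random_pd` and `geometric_mean` are illustrative and do not come from the text.

```python
import numpy as np

def sym_fun(M, f):
    """Apply the scalar function f to a real symmetric matrix M via its eigendecomposition."""
    w, V = np.linalg.eigh((M + M.T) / 2)
    return (V * f(w)) @ V.T

def random_pd(n, rng):
    """Random positive definite n x n matrix (illustrative test data only)."""
    G = rng.standard_normal((n, n))
    return G @ G.T + np.eye(n)

def geometric_mean(A, B):
    """A # B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2} for A > 0, B >= 0."""
    A_half = sym_fun(A, np.sqrt)
    A_inv_half = sym_fun(A, lambda w: 1.0 / np.sqrt(w))
    C = A_inv_half @ B @ A_inv_half
    return A_half @ sym_fun(C, np.sqrt) @ A_half

rng = np.random.default_rng(0)
A, B = random_pd(4, rng), random_pd(4, rng)
X = geometric_mean(A, B)

# Riccati equation X A^{-1} X = B: the residual should vanish up to rounding.
riccati_residual = np.linalg.norm(X @ np.linalg.inv(A) @ X - B)

# Block matrix [[A, X], [X, B]] should be positive semidefinite.
block = np.block([[A, X], [X, B]])
min_block_eig = np.linalg.eigvalsh((block + block.T) / 2).min()

print(riccati_residual, min_block_eig)
```

Because the Schur complement of the block matrix is exactly zero at X = A # B, its smallest eigenvalue is expected to be zero up to floating-point error rather than strictly positive.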

4.2.3 Mean of Projections

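Let P , Q be orthogonal projections with P + Q ≠ 0 and let (R, S) be positive
operators on $\ker(P + Q)^\perp$ such that $P = (P + Q)^{1/2} R (P + Q)^{1/2}$ and
$Q = (P + Q)^{1/2} S (P + Q)^{1/2}$. On the direct sum
$\ker(P + Q)^\perp = (\operatorname{ran}(P) \cap \operatorname{ran}(Q)) \oplus (\operatorname{ran}(P) \cap \operatorname{ran}(Q)^\perp) \oplus (\operatorname{ran}(P)^\perp \cap \operatorname{ran}(Q))$,
the operators R, S can be written as

\[ R = \begin{bmatrix} 2^{-1} I & & \\ & I & \\ & & 0 \end{bmatrix}, \qquad S = \begin{bmatrix} 2^{-1} I & & \\ & 0 & \\ & & I \end{bmatrix}. \]

Thus, for f ∈ OM+ ,

\begin{align*}
\tilde f(P, Q) &= (P + Q)^{1/2} \begin{bmatrix} 2^{-1} I & & \\ & \tilde f(1, 0) I & \\ & & \tilde f(0, 1) I \end{bmatrix} (P + Q)^{1/2} \\
&= P \wedge Q + \tilde f(1, 0)(P - P \wedge Q) + \tilde f(0, 1)(Q - P \wedge Q),
\end{align*}

where P ∧ Q is the orthogonal projection onto ran(P ) ∩ ran(Q).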


Proposition 4.3 Let P , Q be orthogonal projections. Then

\[ P \sigma Q = P \wedge Q + a(P - P \wedge Q) + b(Q - P \wedge Q), \]

where $a := f_\sigma(0+)$ and $b := \tilde f_\sigma(0+)$.

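4.2.4 Transforms on $OM_+^1$

Let σ be an operator mean. The following binary operations:

\[ A \tilde\sigma B := B \sigma A, \qquad A \sigma^* B := \text{s-}\lim_{\varepsilon \searrow 0} (A_\varepsilon^{-1} \sigma B_\varepsilon^{-1})^{-1}, \qquad A \sigma^\perp B := A \tilde\sigma^* B \]

correspond to operator monotone functions

\[ \tilde f_\sigma(t) = t f_\sigma(1/t), \qquad f_\sigma^*(t) = f_\sigma(1/t)^{-1}, \qquad f_\sigma^\perp(t) = t / f_\sigma(t), \]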

which implies the binary operations (σ̃ , σ ∗ , σ ⊥ ) are in 1 . We show some properties
of these transforms (σ̃ , σ ∗ , σ ⊥ ) on 1 . It follows from the fact σ̃˜ = σ , (σ ∗ )∗ =
σ and (σ ⊥ )⊥ = σ that these transforms are bijective map on 1 . By a simple
calculation, we have the following.
Proposition 4.4 Let σ1 , σ2 ∈ 1 . If σ1 ≤ σ2 , then

σ˜1 ≤ σ˜2 , σ2∗ ≤ σ1∗ and σ2⊥ ≤ σ1⊥ .

Corollary 4.1

! = ∇ ∗ ≤ λ∗ = λ⊥ ≤ #∗ = # ≤ λ ≤ ∇.
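Remark 4.3 The injective map $f \mapsto \hat f := \frac{t + f}{1 + f}$ is called the Barbour transform on
OM+ . This map plays an important role in the analysis of OM+ ([40, 46]). The
Barbour transform has the following properties:

\[ \widehat{OM_+} = OM_+^1 \setminus \{1\}, \qquad \widehat{OM_+^1} = \{f \in OM_+^1 \mid f_! \le f \le f_\nabla\}, \]

\[ \widehat{f^\perp} = (\hat f)^\perp, \qquad \widehat{(\tilde f)} = (\hat f)^*, \qquad \widehat{(f^*)} = \widetilde{(\hat f)} \qquad (f \in OM_+). \]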
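
The three commutation rules between the Barbour transform and the transforms f ↦ f^⊥, f ↦ f̃ and f ↦ f^* are purely algebraic, so they can be spot-checked on scalars. The sketch below assumes the definitions f̂ = (t + f)/(1 + f), f̃(t) = t f(1/t), f^*(t) = 1/f(1/t) and f^⊥(t) = t/f(t); the function names and the sample function are illustrative choices, not notation from the text.

```python
import math

def barbour(f):
    """Barbour transform: t -> (t + f(t)) / (1 + f(t))."""
    return lambda t: (t + f(t)) / (1.0 + f(t))

def transpose(f):
    """Transpose: t -> t * f(1/t)."""
    return lambda t: t * f(1.0 / t)

def adjoint(f):
    """Adjoint: t -> 1 / f(1/t)."""
    return lambda t: 1.0 / f(1.0 / t)

def dual(f):
    """Dual: t -> t / f(t)."""
    return lambda t: t / f(t)

# A sample positive operator monotone function on (0, infinity), chosen for illustration.
f = lambda t: 0.3 + 0.7 * math.sqrt(t)

max_err = 0.0
for t in [0.1, 0.5, 1.0, 2.0, 7.0]:
    max_err = max(
        max_err,
        abs(barbour(dual(f))(t) - dual(barbour(f))(t)),          # hat(f^perp) = (hat f)^perp
        abs(barbour(transpose(f))(t) - adjoint(barbour(f))(t)),  # hat(f~) = (hat f)^*
        abs(barbour(adjoint(f))(t) - transpose(barbour(f))(t)),  # hat(f^*) = (hat f)~
    )

# The Barbour transform of the arithmetic mean function (t + 1)/2 is (3t + 1)/(t + 3).
hat_arith_at_2 = barbour(lambda t: (t + 1.0) / 2.0)(2.0)

print(max_err, hat_arith_at_2)
```

A scalar check of this kind only confirms the algebraic identities; it says nothing about operator monotonicity of the transformed functions.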

4.2.5 Weight and Symmetricity

Let f be a positive function on (0, ∞). It is known [6, Theorem V. 25] that f is
operator monotone if and only if f is operator concave. Using this, the following is
obtained.

Proposition 4.5 Let σ be an operator mean. Then $!_\alpha \le \sigma \le \nabla_\alpha$ holds, where
$\alpha := \frac{df_\sigma}{dx}\big|_{x=1}$.
Proof Put $f := f_\sigma$. Note that the tangent line of f (x) at x = 1 is written as
$y = f'(1)x + (1 - f'(1))$. Since f is an operator monotone function, and hence concave, we have

\[ f'(1)x + (1 - f'(1)) \ge f(x) \]

for all x > 0, which implies $1 - f'(1) \ge f(0+) \ge 0$ and $f'(1) \ge 0$. So $\sigma \le \nabla_{f'(1)}$.
Applying this argument to $f^*$, we have $\sigma^* = \sigma_{f^*} \le \nabla_{(f^*)'(1)} = \nabla_{f'(1)} = (!_{f'(1)})^*$,
which implies $!_{f'(1)} \le \sigma$.

Remark 4.4 We call the positive number $\alpha := \frac{df_\sigma}{dx}\big|_{x=1}$ the weight of σ.

An operator mean σ having σ̃ = σ is called a symmetric operator mean. Since
$(\tilde f_\sigma)'(1) = 1 - f_\sigma'(1) = f_\sigma'(1)$, we have $f_\sigma'(1) = \frac{1}{2}$ and ! ≤ σ ≤ ∇.

Proposition 4.6

{σ ∈ 1 | σ̃ = σ } ⊊ {σ ∈ 1 | ! ≤ σ ≤ ∇}.

Proof Put $f(t) := \frac{3t + 1}{t + 3}$. Then the operator mean $\sigma_f$ which corresponds to f is in

{σ ∈ | ! ≤ σ ≤ ∇}\{σ ∈ 1 | σ̃ = σ } .
1

Remark 4.5 Recall the Barbour transform $f \mapsto \hat f = \frac{t + f}{1 + f}$. For $f \in OM_+^1$ with
$f^* \ne f$, $\sigma_{\hat f}$ is not symmetric, but $! \le \sigma_{\hat f} \le \nabla$.

Remark 4.6 An operator mean σ having σ∗ = σ is called a self-adjoint operator
mean. The geometric mean is an easy-to-understand and important example of a
self-adjoint operator mean. A non-trivial operator mean σ (i.e., σ ≠ l, σ ≠ r) is
self-adjoint if and only if it can be written as the Barbour transform of a symmetric
operator connection, namely

\[ \{f \in OM_+^1 \setminus \{1\} \mid f = f^*\} = \{\hat f \mid f = \tilde f,\ f \in OM_+\}. \]

For example, for r ∈ [0, 1], the function $\widehat{t^{1-r}} = \frac{t + t^{1-r}}{t^{1-r} + 1}$ is symmetric [7, 43]. An
operator mean which corresponds to this function is called the Lehmer mean. The
exact definition of this mean will be given later.
Example 4.4 For an arbitrary operator mean σ , the operator mean (σ + σ̃ )/2 is a
symmetric mean. The Heinz mean σhα defined by

Aσhα B := (A#α B + A#1−α B)/2 (0 ≤ α ≤ 1)

is a typical example.

Let $f \in OM_+^1$. The two variable positive function

\[ \tilde f(s, t) := s f(t/s) \quad (s, t > 0) \]

is homogeneous and is monotone in each variable. Furthermore, it satisfies

\[ \min\{s, t\} \le \tilde f(s, t) \le \max\{s, t\}. \]

So, the function $\tilde f$ can be viewed as a numerical mean.

In what follows, we treat an operator mean as a numerical mean induced from an
element of $OM_+^1$.

4.2.6 Power Means

Let α ∈ [0, 1] and r ∈ R. For s, t > 0,

\[ p_{r,\alpha}(s, t) := ((1 - \alpha)s^r + \alpha t^r)^{1/r} \quad (r \ne 0) \]

and

\[ p_{0,\alpha}(s, t) := \lim_{r \to 0} p_{r,\alpha}(s, t) = s^{1-\alpha} t^\alpha. \]

The function $p_{r,\alpha}(s, t)$ is increasing w.r.t. r and $(s, t) \mapsto p_{r,\alpha}(s, t)$ is an operator
mean for all α ∈ [0, 1] if and only if r ∈ [−1, 1]. The map $r \mapsto p_{r,\alpha}$ is a path
connecting familiar operator means. For example,

\[ !_\alpha\ (r = -1), \quad \#_\alpha\ (r = 0) \quad \text{and} \quad \nabla_\alpha\ (r = 1). \]

An operator mean $p_{r,\alpha}$ (r ∈ [−1, 1]) is called the power mean [45].
Proposition 4.7 Let A, B, X be positive invertible operators and let (r, α) ∈
[−1, 1] × [0, 1]. Then $X = A\, \sigma_{p_{r,\alpha}} B$ if and only if

\[ (1 - \alpha)P_{f_r}(A, X) + \alpha P_{f_r}(B, X) = 0, \tag{8} \]

where $\sigma_{p_{r,\alpha}}$ is an operator mean which corresponds to the function $t \mapsto p_{r,\alpha}(1, t)$,
$f_r(t) := (t^r - 1)/r$ (r ≠ 0) and $f_0(t) := \log t$.
The proof is left to the reader.
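
For readers who want a concrete check of equation (8), the following sketch verifies it numerically for r ≠ 0 on random positive definite matrices. It assumes NumPy, the formula A σ B = A^{1/2} f(A^{-1/2} B A^{-1/2}) A^{1/2} with f(t) = p_{r,α}(1, t), and the perspective convention P_f(A, X) = X^{1/2} f(X^{-1/2} A X^{-1/2}) X^{1/2}; all helper names are hypothetical.

```python
import numpy as np

def sym_fun(M, f):
    """Apply the scalar function f to a real symmetric matrix M via its eigendecomposition."""
    w, V = np.linalg.eigh((M + M.T) / 2)
    return (V * f(w)) @ V.T

def mat_pow(M, p):
    """Real power of a symmetric positive definite matrix."""
    return sym_fun(M, lambda w: w ** p)

def random_pd(n, rng):
    """Random positive definite n x n matrix (illustrative test data only)."""
    G = rng.standard_normal((n, n))
    return G @ G.T + np.eye(n)

def power_mean(A, B, r, alpha):
    """A sigma_{p_{r,alpha}} B = A^{1/2} ((1-alpha) I + alpha C^r)^{1/r} A^{1/2}, C = A^{-1/2} B A^{-1/2}, r != 0."""
    A_half, A_inv_half = mat_pow(A, 0.5), mat_pow(A, -0.5)
    C = A_inv_half @ B @ A_inv_half
    inner = (1 - alpha) * np.eye(len(A)) + alpha * mat_pow(C, r)
    return A_half @ mat_pow(inner, 1.0 / r) @ A_half

def perspective_fr(A, X, r):
    """P_{f_r}(A, X) = X^{1/2} f_r(X^{-1/2} A X^{-1/2}) X^{1/2} with f_r(t) = (t^r - 1)/r, r != 0."""
    X_half, X_inv_half = mat_pow(X, 0.5), mat_pow(X, -0.5)
    D = X_inv_half @ A @ X_inv_half
    return X_half @ sym_fun(D, lambda w: (w ** r - 1.0) / r) @ X_half

rng = np.random.default_rng(1)
A, B = random_pd(3, rng), random_pd(3, rng)
alpha = 0.3

residuals = {}
for r in (-1.0, 0.5, 1.0):
    X = power_mean(A, B, r, alpha)
    lhs = (1 - alpha) * perspective_fr(A, X, r) + alpha * perspective_fr(B, X, r)
    residuals[r] = np.linalg.norm(lhs)  # should be zero up to rounding

print(residuals)
```

The case r = 0 (the geometric mean, with f_0 = log) is left out of the sketch because it needs a matrix logarithm instead of a power.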
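Corollary 4.2 Let A, B, X be positive operators and let r > 0. Then

\[ X = A\, \sigma_{p_{r,\alpha}} B \ \Rightarrow\ (1 - \alpha) f_r(A, X) + \alpha f_r(B, X) = 0. \]

Proof Since $X(\varepsilon)\ (:= A_\varepsilon\, \sigma_{p_{r,\alpha}} B_\varepsilon) \searrow X$ as $\varepsilon \searrow 0$,

\[ (1 - \alpha) f_r(A_\varepsilon, X(\varepsilon)) + \alpha f_r(B_\varepsilon, X(\varepsilon))\ (= 0) \]

strongly converges to $(1 - \alpha) f_r(A, X) + \alpha f_r(B, X)$ (= 0).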


Remark 4.7 Let $X, A_i \in B(H)_{++}$ (i = 1, 2, · · · , n) and let $\alpha_i \in (0, 1)$
(i = 1, 2, · · · , n) with $\sum_i \alpha_i = 1$. A generalized equation of (8)

\[ \sum_i \alpha_i P_{f_r}(A_i, X) = 0 \]

is discussed in [41, Theorem 5.6]. The existence and uniqueness of the positive
solution X is not trivial in general, whereas the proof in the case when n = 2 is very
easy. When r = 0, the solution X is a multivariate extension of the geometric mean
and it is called the Karcher mean.

4.2.7 Stolarsky Means

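For α ∈ R\{0, 1}, the Stolarsky mean is defined by

\[ S_\alpha(s, t) := \begin{cases} \left( \dfrac{s^\alpha - t^\alpha}{\alpha(s - t)} \right)^{\frac{1}{\alpha - 1}}, & \text{if } s \ne t, \\[1.5ex] s, & \text{if } s = t, \end{cases} \]

and

\[ S_0(s, t) := \lim_{\alpha \to 0} S_\alpha(s, t) = s\, \lambda\, t, \]

\[ S_1(s, t) := \lim_{\alpha \to 1} S_\alpha(s, t) = \frac{1}{e} \left( \frac{s^s}{t^t} \right)^{1/(s - t)}. \]

The function $S_\alpha(s, t)$ is monotone increasing w.r.t. α ([50]) and is an operator
mean if and only if −2 ≤ α ≤ 2 ([45]). Thus we have

\[ \#\,(= S_{-1}) \le p_{1/2,1/2}\,(= S_{1/2}) \le S_1 \le \nabla\,(= S_2). \]

The operator mean $S_1$ is called the identric mean.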
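
The identifications and the ordering in the chain above can be confirmed on scalars. The following is a small sketch using only the Python standard library; the function name `stolarsky` and the sample point (s, t) = (2, 5) are illustrative choices.

```python
import math

def stolarsky(s, t, a):
    """Scalar Stolarsky mean S_a(s, t), with the limiting cases a = 0 and a = 1."""
    if s == t:
        return s
    if a == 0:
        # Logarithmic mean, the limit as a -> 0.
        return (s - t) / (math.log(s) - math.log(t))
    if a == 1:
        # Identric mean, the limit as a -> 1.
        return math.exp(-1.0) * (s ** s / t ** t) ** (1.0 / (s - t))
    return ((s ** a - t ** a) / (a * (s - t))) ** (1.0 / (a - 1.0))

s, t = 2.0, 5.0
geometric = math.sqrt(s * t)
arithmetic = (s + t) / 2.0
power_half = ((math.sqrt(s) + math.sqrt(t)) / 2.0) ** 2  # p_{1/2,1/2}(s, t)

chain = [stolarsky(s, t, a) for a in (-1.0, 0.5, 1.0, 2.0)]
is_increasing = all(x <= y for x, y in zip(chain, chain[1:]))

print(chain, is_increasing)
```

A single sample point illustrates the chain but is of course not a proof of monotonicity in α.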

4.2.8 Means of Szabó Type

In [51], V.E. Szabó defines the following function and discusses its operator
monotonicity:

\[ t \mapsto t^\gamma\, \frac{(t^{\alpha_1} - 1)(t^{\alpha_2} - 1) \cdots (t^{\alpha_n} - 1)}{(t^{\beta_1} - 1)(t^{\beta_2} - 1) \cdots (t^{\beta_n} - 1)}, \]

where γ ∈ R, $\alpha_i, \beta_j > 0$ with $\alpha_i \ne \beta_j$ (i, j = 1, 2, . . . , n). Several familiar
operator means are induced from functions of this type as follows.

The function $pd_p$ defined by

\[ pd_p(s, t) := \begin{cases} \dfrac{p - 1}{p} \cdot \dfrac{s^p - t^p}{s^{p-1} - t^{p-1}}, & \text{if } s \ne t, \\[1.5ex] s, & \text{if } s = t, \end{cases} \]

is increasing w.r.t. p and interpolates some familiar means. For example,

\[ !\,(= pd_{-1}) \le \#\,(= pd_{1/2}) \le \lambda\,(= pd_1) \le \nabla\,(= pd_2). \]

This function is called the power difference mean and is an operator mean if and
only if p ∈ [−1, 2] [18, 30, 44, 52].
The Lehmer mean defined by

\[ l_p(s, t) := \frac{s^p + t^p}{s^{p-1} + t^{p-1}} = \frac{s^{2p} - t^{2p}}{s^{2p-2} - t^{2p-2}} \cdot \frac{s^{p-1} - t^{p-1}}{s^p - t^p} \]

is an operator mean if and only if p ∈ [0, 1] [45]. Clearly,

\[ !\,(= l_0) \le \#\,(= l_{1/2}) \le \nabla\,(= l_1). \]

The following mean

\[ ph_p(s, t) := \begin{cases} p(1 - p)\, \dfrac{(s - t)^2}{(s^p - t^p)(s^{1-p} - t^{1-p})}, & \text{if } p \ne 0,\ p \ne 1,\ s \ne t, \\[1.5ex] \dfrac{s - t}{\log s - \log t}, & \text{if } p = 0 \text{ or } p = 1,\ s \ne t, \\[1.5ex] t, & \text{if } s = t, \end{cases} \]

is introduced in [25, 26]. We call this the Petz-Hasegawa mean. This function is an
operator mean if and only if p ∈ [−1, 2] ([5, 25]). An elementary calculation shows
that this function equals the harmonic mean if p = −1, 2, the logarithmic mean if
p = 0, 1 and the power mean $p_{1/2,1/2}$ if p = 1/2.
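
The special cases listed for the power difference, Lehmer and Petz-Hasegawa means are easy to confirm on scalars. The sketch below uses only the Python standard library; the function names are illustrative, and the case p = 0 of the power difference mean is deliberately left unimplemented because the text does not state its limit.

```python
import math

def log_mean(s, t):
    """Logarithmic mean of two positive numbers."""
    return s if s == t else (s - t) / (math.log(s) - math.log(t))

def power_difference(s, t, p):
    """pd_p(s, t); p = 1 is treated as the logarithmic mean, as stated in the chain of inequalities."""
    if s == t:
        return s
    if p == 1:
        return log_mean(s, t)
    if p == 0:
        raise ValueError("p = 0 is a limiting case not covered in this sketch")
    return (p - 1) / p * (s ** p - t ** p) / (s ** (p - 1) - t ** (p - 1))

def lehmer(s, t, p):
    """l_p(s, t) = (s^p + t^p) / (s^(p-1) + t^(p-1))."""
    return (s ** p + t ** p) / (s ** (p - 1) + t ** (p - 1))

def petz_hasegawa(s, t, p):
    """ph_p(s, t), with the logarithmic mean at p = 0 and p = 1."""
    if s == t:
        return t
    if p in (0, 1):
        return log_mean(s, t)
    return p * (1 - p) * (s - t) ** 2 / ((s ** p - t ** p) * (s ** (1 - p) - t ** (1 - p)))

s, t = 2.0, 5.0
harmonic = 2.0 * s * t / (s + t)
geometric = math.sqrt(s * t)
arithmetic = (s + t) / 2.0
power_half = ((math.sqrt(s) + math.sqrt(t)) / 2.0) ** 2

checks = {
    "pd_-1 = harmonic": math.isclose(power_difference(s, t, -1.0), harmonic),
    "pd_1/2 = geometric": math.isclose(power_difference(s, t, 0.5), geometric),
    "pd_2 = arithmetic": math.isclose(power_difference(s, t, 2.0), arithmetic),
    "l_0 = harmonic": math.isclose(lehmer(s, t, 0.0), harmonic),
    "l_1/2 = geometric": math.isclose(lehmer(s, t, 0.5), geometric),
    "l_1 = arithmetic": math.isclose(lehmer(s, t, 1.0), arithmetic),
    "ph_-1 = harmonic": math.isclose(petz_hasegawa(s, t, -1.0), harmonic),
    "ph_2 = harmonic": math.isclose(petz_hasegawa(s, t, 2.0), harmonic),
    "ph_1/2 = p_{1/2,1/2}": math.isclose(petz_hasegawa(s, t, 0.5), power_half),
}
print(checks)
```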
Remark 4.8 Let (a, b) be a pair of real numbers with |a|, |b| ≤ 2. Put

\[ m_{a,b}(t) := \frac{b}{a} \cdot \frac{t^a - 1}{t^b - 1}. \]

A necessary and sufficient condition for this function to be in $OM_+^1$ is that (a, b) is
in the following set [44]:

\[ \{(a, b) \mid 0 < a - b \le 1,\ a \ge -1,\ b \le 1\} \cup ([0, 1] \times [-1, 0]) \setminus \{(0, 0)\}. \]

5 Operator Inequalities

5.1 Positive Maps

In this section, we study some inequalities involving an operator mean. A linear map
$\Phi : B(H) \to B(K)$ is said to be positive if $\Phi(A) \in B(K)_+$ for every $A \in B(H)_+$.
For a positive map Φ, by estimating $\Phi(I_H)$, some properties of Φ become clear.
The continuity of a positive map Φ is guaranteed by the fact $\|\Phi\| = \|\Phi(I_H)\|$ (cf.
[47]). A positive map Φ is said to be strictly positive (resp. unital) if $\Phi(I_H) \in B(K)_{++}$
(resp. $\Phi(I_H) = I_K$). Since a positive map preserves the order relation,
we have

\[ \Phi(A) \ge \Phi(\alpha I_H) \ge \alpha \Phi(I_H) \]

for all $A \in B(H)_{++}$, where α := min sp(A). So, a positive unital map is strictly
positive.
In what follows, we denote by P(H, K) (resp. $P(H, K)_1$) the set of strictly positive
(resp. unital positive) maps from B(H ) to B(K).
If A is a positive definite matrix having the spectral decomposition $A = \sum_i \alpha_i P_i$,
then, for $\Phi \in P(H, K)_1$,

\begin{align*}
&\begin{bmatrix} I_K & \Phi(A^{-1})^{-1} \\ 0 & I_K \end{bmatrix} \begin{bmatrix} \Phi(A) - \Phi(A^{-1})^{-1} & 0 \\ 0 & \Phi(A^{-1}) \end{bmatrix} \begin{bmatrix} I_K & 0 \\ \Phi(A^{-1})^{-1} & I_K \end{bmatrix} \\
&\qquad = \begin{bmatrix} \Phi(A) & I_K \\ I_K & \Phi(A^{-1}) \end{bmatrix} \\
&\qquad = \sum_i \begin{bmatrix} \alpha_i \Phi(P_i) & \Phi(P_i) \\ \Phi(P_i) & \alpha_i^{-1} \Phi(P_i) \end{bmatrix} = \sum_i \begin{bmatrix} \alpha_i & 1 \\ 1 & \alpha_i^{-1} \end{bmatrix} \otimes \Phi(P_i) \ge 0.
\end{align*}

By a similar argument, the following two lemmas (Choi’s inequality and Kadison’s
inequality) are obtained.
Lemma 5.1 ([9, Theorem 2.1]) Let $\Phi \in P(H, K)_1$. Then

\[ \Phi(A^{-1}) \ge \Phi(A)^{-1} \]

for all $A \in B(H)_{++}$.

Lemma 5.2 ([36, Theorem 1]) Let $\Phi \in P(H, K)_1$. Then

\[ \Phi(A^2) \ge \Phi(A)^2 \]

for all $A \in B(H)_{sa}$.
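
Choi’s and Kadison’s inequalities can be illustrated with one concrete unital positive map. The sketch below assumes NumPy and uses the compression X ↦ VᵀXV onto a random k-dimensional subspace as the map; this choice of map, and all names in the code, are illustrative assumptions rather than anything prescribed by the text.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 5, 2

# V has orthonormal columns (reduced QR), so V.T @ V = I_k.
V, _ = np.linalg.qr(rng.standard_normal((n, k)))

def phi(X):
    """Compression X -> V^T X V: a unital positive linear map from n x n to k x k matrices."""
    return V.T @ X @ V

def min_eig(M):
    """Smallest eigenvalue of the symmetric part of M."""
    return np.linalg.eigvalsh((M + M.T) / 2).min()

G = rng.standard_normal((n, n))
A_pd = G @ G.T + np.eye(n)   # positive definite, for Choi's inequality
A_sa = (G + G.T) / 2         # self-adjoint, for Kadison's inequality

# Choi: phi(A^{-1}) - phi(A)^{-1} >= 0.
min_choi = min_eig(phi(np.linalg.inv(A_pd)) - np.linalg.inv(phi(A_pd)))

# Kadison: phi(A^2) - phi(A)^2 >= 0.
min_kadison = min_eig(phi(A_sa @ A_sa) - phi(A_sa) @ phi(A_sa))

print(min_choi, min_kadison)
```

For this particular map the Kadison gap equals VᵀA(I − VVᵀ)AV, which makes its positivity visible directly.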


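Proposition 5.1 Let $\Phi \in P(H, K)_1$ and let f be an operator convex function on
the open interval (α, β). Then,

\[ f(\Phi(A)) \le \Phi(f(A)) \]

for all $A \in B(H)_{sa}$ with sp(A) ⊆ (α, β).

Proof It is enough to assume that f is not a linear function. We first show the case
when (α, β) = (−1, 1). Note that

\[ I_K \pm \Phi(A) = \Phi(I_H \pm A) > 0. \]

From the assumption, there exist a, b ∈ R such that

\[ f(t) = a + bt + \int_{-1}^{1} \frac{t^2}{1 - xt}\, dm(x), \]

where m is a finite positive Borel measure on [−1, 1]. So, we have

\[ \Phi(f(A)) = a + b\Phi(A) + \int_{-1}^{1} \Phi\!\left( \frac{A^2}{I_H - xA} \right) dm(x). \]

Here, for x ≠ 0, using Choi’s inequality,

\begin{align*}
\Phi\!\left( \frac{A^2}{I_H - xA} \right) &= \frac{-1}{x} \Phi(A) + \frac{-1}{x^2} \Phi(I_H) + \frac{1}{x^2} \Phi\!\left( \frac{1}{I_H - xA} \right) \\
&\ge \frac{-1}{x} \Phi(A) + \frac{-1}{x^2} \Phi(I_H) + \frac{1}{x^2} \cdot \frac{1}{I_K - x\Phi(A)} \\
&= \frac{\Phi(A)^2}{I_K - x\Phi(A)}.
\end{align*}

For x = 0, using Kadison’s inequality,

\[ \Phi\!\left( \frac{A^2}{I_H - xA} \right) = \Phi(A^2) \ge \Phi(A)^2 = \frac{\Phi(A)^2}{I_K - x\Phi(A)}. \]

Thus $\Phi(f(A)) \ge f(\Phi(A))$.

In the general case, if we put $g(t) := f\!\left( \frac{\beta - \alpha}{2} t + \frac{\alpha + \beta}{2} \right)$, then g is operator convex
on (−1, 1) and we have

\begin{align*}
f(\Phi(A)) &= g\!\left( \frac{2}{\beta - \alpha} \Phi(A) - \frac{\alpha + \beta}{\beta - \alpha} I_K \right) \\
&= g\!\left( \Phi\!\left( \frac{2}{\beta - \alpha} A - \frac{\alpha + \beta}{\beta - \alpha} I_H \right) \right) \\
&\le \Phi\!\left( g\!\left( \frac{2}{\beta - \alpha} A - \frac{\alpha + \beta}{\beta - \alpha} I_H \right) \right) = \Phi(f(A)).
\end{align*}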

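Theorem 5.1 Let $\Phi \in P(H, K)$ and let f be a real valued operator convex
function on (0, ∞). Then

\[ \Phi(P_f(A, B)) \ge P_f(\Phi(A), \Phi(B)) \]

holds for all $A, B \in B(H)_{++}$.

Proof Put

\[ \Psi(X) := \Phi(B)^{-1/2}\, \Phi(B^{1/2} X B^{1/2})\, \Phi(B)^{-1/2}, \qquad C := B^{-1/2} A B^{-1/2}. \]

Then Ψ is in $P(H, K)_1$. So using the preceding proposition,

\begin{align*}
\Phi(P_f(A, B)) &= \Phi(B^{1/2} P_f(C, I_H) B^{1/2}) \\
&= \Phi(B)^{1/2}\, \Psi(P_f(C, I_H))\, \Phi(B)^{1/2} \\
&\ge \Phi(B)^{1/2}\, P_f(\Psi(C), I_K)\, \Phi(B)^{1/2} \\
&= P_f(\Phi(B)^{1/2} \Psi(C) \Phi(B)^{1/2}, \Phi(B)) = P_f(\Phi(A), \Phi(B)).
\end{align*}

Corollary 5.1 Let σ be an operator mean and let $\Phi \in P(H, K)$. Then

\[ \Phi(A \sigma B) \le \Phi(A)\, \sigma\, \Phi(B) \]

for all A, B ∈ B(H )+ .

Proof Since $-f_\sigma$ is operator convex, for A, B ≥ 0,

\begin{align*}
\Phi(A \sigma B) &\le \Phi(A_\varepsilon\, \sigma\, B_\varepsilon) \\
&= -\Phi(P_{-f_\sigma}(B_\varepsilon, A_\varepsilon)) \\
&\le -P_{-f_\sigma}(\Phi(B_\varepsilon), \Phi(A_\varepsilon)) \\
&= \Phi(A_\varepsilon)\, \sigma\, \Phi(B_\varepsilon) \\
&= (\Phi(A) + \varepsilon \Phi(I_H))\, \sigma\, (\Phi(B) + \varepsilon \Phi(I_H)).
\end{align*}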

5.2 Power Monotonicity

Let t be a positive real number with t ≥ 1 (resp. t ≤ 1). The function $[1, \infty) \ni r \mapsto t^r$
is monotone increasing (resp. decreasing). A certain numerical mean
$(a, b) \mapsto a\, \sigma_f\, b\ (:= a f(b/a))$ also satisfies this property. If the positive function f satisfies
$f(t)^r \le f(t^r)$ (r ≥ 1), then

\[ a^r\, \sigma_f\, b^r \ge (a\, \sigma_f\, b)^r \quad (r \ge 1), \]

which is equivalent to

\[ a\, \sigma_f\, b \ge 1 \ \Rightarrow\ a^r\, \sigma_f\, b^r \ge 1 \quad (r \ge 1). \]

In [4], Ando and Hiai prove that an operator version of this inequality holds for
the weighted geometric mean. In this section, we study the similar statements

\[ A, B > 0,\ A \sigma B \ge I \ \Rightarrow\ A^r \sigma B^r \ge I \tag{9} \]

and

\[ A, B > 0,\ A \sigma B \le I \ \Rightarrow\ A^r \sigma B^r \le I. \tag{10} \]

5.2.1 Ando–Hiai Type Inequalities

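For a positive operator X, we denote the minimum value of sp(X) by $\lambda_{\min}(X)$.
The following inequality holds for an arbitrary operator mean.
Proposition 5.2 Let σ be an operator mean and A, B > 0. Then

\[ A^r \sigma B^r \ge \lambda_{\min}\!\left( \frac{f_\sigma(C^r)}{f_\sigma(C)^r} \right) \lambda_{\min}(A \sigma B)^{r-1} (A \sigma B) \quad \text{for } 1 \le r \le 2, \]

where $C := A^{-1/2} B A^{-1/2}$ and $f_\sigma(t) := 1 \sigma t$ (t > 0).

Proof We first show the case when $\lambda_{\min}(A \sigma B) = 1$. Since $A \sigma B \ge I$, we have
$f_\sigma(C)^{-1} \le A$. Set $\varepsilon := 2 - r$. Then

\begin{align*}
A^r \sigma B^r &= A^{r/2} f_\sigma\big( A^{-r/2+1/2}\, C A^{1/2} B^{-\varepsilon} A^{1/2} C\, A^{-r/2+1/2} \big) A^{r/2} \\
&= A^{r/2} f_\sigma\big( A^{-r/2+1/2}\, C A^{1/2} (A^{-1/2} C^{-1} A^{-1/2})^{\varepsilon} A^{1/2} C\, A^{-r/2+1/2} \big) A^{r/2} \\
&= A^{1/2} A^{(1-\varepsilon)/2} f_\sigma\big( A^{(-1+\varepsilon)/2}\, C [A \#_\varepsilon C^{-1}] C\, A^{(-1+\varepsilon)/2} \big) A^{(1-\varepsilon)/2} A^{1/2} \\
&= A^{1/2} \Big\{ A^{1-\varepsilon}\, \sigma\, \{ C [A \#_\varepsilon C^{-1}] C \} \Big\} A^{1/2} \\
&\ge A^{1/2} \Big\{ \frac{1}{f_\sigma(C)^{1-\varepsilon}}\, \sigma\, C \Big[ \frac{1}{f_\sigma(C)} \#_\varepsilon C^{-1} \Big] C \Big\} A^{1/2} \\
&= A^{1/2}\, \frac{f_\sigma(C^{2-\varepsilon})}{f_\sigma(C)^{1-\varepsilon}}\, A^{1/2} = A^{1/2}\, \frac{f_\sigma(C^r)}{f_\sigma(C)^{r-1}}\, A^{1/2} \\
&\ge \lambda_{\min}\!\left( \frac{f_\sigma(C^r)}{f_\sigma(C)^r} \right) A^{1/2} f_\sigma(C) A^{1/2} = \lambda_{\min}\!\left( \frac{f_\sigma(C^r)}{f_\sigma(C)^r} \right) (A \sigma B).
\end{align*}

In the general case, we put $\alpha := \lambda_{\min}(A \sigma B)$. Then

\[ \lambda_{\min}((A/\alpha)\, \sigma\, (B/\alpha)) = \lambda_{\min}((1/\alpha) A \sigma B) = 1. \]

Thus the above argument implies

\[ (A/\alpha)^r\, \sigma\, (B/\alpha)^r \ge \lambda_{\min}\!\left( \frac{f_\sigma(D^r)}{f_\sigma(D)^r} \right) (A/\alpha)\, \sigma\, (B/\alpha), \]

where $D := (A/\alpha)^{-1/2} (B/\alpha) (A/\alpha)^{-1/2} = A^{-1/2} B A^{-1/2} = C$. The last inequality
is equivalent to

\[ A^r \sigma B^r \ge \lambda_{\min}\!\left( \frac{f_\sigma(C^r)}{f_\sigma(C)^r} \right) \alpha^{r-1} (A \sigma B). \]

Using this result, the condition $f_\sigma(t^r) \ge f_\sigma(t)^r$ (t > 0, 2 ≥ r ≥ 1) clearly
implies

\[ A^r \sigma B^r \ge \lambda_{\min}(A \sigma B)^{r-1} (A \sigma B). \]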

Corollary 5.2 Let $f \in OM_+^1$ and $\sigma_f$ be an operator mean which corresponds to
f . The following are equivalent:
(i) $f(t^r) \ge f(t)^r$ (r ≥ 1);
(ii) $A^r \sigma_f B^r \ge \lambda_{\min}(A \sigma_f B)^{r-1} (A \sigma_f B)$ (A, B > 0, 1 ≤ r ≤ 2);
(iii) A, B > 0, $A \sigma_f B \ge I \Rightarrow A^r \sigma_f B^r \ge I$ (r ≥ 1).

Proof (i) ⇒ (ii). Immediate from the above proposition.

(ii) ⇒ (iii). It is enough to show the case when r > 2. There exist a positive
integer n and $1 \le r_0 \le 2$ such that $r = 2^n r_0$. Iterating (ii) gives (iii).
(iii) ⇒ (i). Take A = (1/f (t))I and B = (t/f (t))I .

In the statement of the preceding proposition, replacing σ by σ∗ , the following
is obtained.
Corollary 5.3 Let σ be an operator mean and A, B > 0. Then

\[ A^r \sigma B^r \le \left\| \frac{f_\sigma(C^r)}{f_\sigma(C)^r} \right\| \|A \sigma B\|^{r-1} (A \sigma B) \quad \text{for } 1 \le r \le 2. \]

Corollary 5.4 Let $f \in OM_+^1$ and $\sigma_f$ be an operator mean which corresponds to
f . The following are equivalent:
(i) $f(t^r) \le f(t)^r$ (r ≥ 1);
(ii) $A^r \sigma_f B^r \le \|A \sigma_f B\|^{r-1} (A \sigma_f B)$ (A, B > 0, 1 ≤ r ≤ 2);
(iii) A, B > 0, $A \sigma_f B \le I \Rightarrow A^r \sigma_f B^r \le I$ (r ≥ 1).
Remark 5.9 In the case when dim H < ∞, for A, B ≥ 0, $\lim_{\varepsilon \searrow 0} A_\varepsilon\, \sigma_f\, B_\varepsilon = A \sigma_f B$.
So, the statements (i)–(iii) above are equivalent to

\[ A, B \ge 0,\ A \sigma_f B \le I \ \Rightarrow\ A^r \sigma_f B^r \le I \quad (r \ge 1). \]

In what follows, we denote the set of all functions in $OM_+^1$ satisfying (i) in
Corollary 5.2 (resp. (i) in Corollary 5.4) by PMI (resp. PMD). Note that

\[ f \in PMI \iff f^* \in PMD. \]

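Corollary 5.5 For f ∈ PMD and A, B > 0,

\[ \|(A^p \sigma_f B^p)^{1/p}\| \le \|(A^q \sigma_f B^q)^{1/q}\| \quad (0 < q \le p). \]

Proof Using Corollary 5.3,

\[ \|A^r \sigma_f B^r\| \le \|A \sigma_f B\|^r \quad (1 \le r). \]

Since p/q ≥ 1, applying this with r = p/q to the pair $(A^q, B^q)$ gives

\begin{align*}
\|A^p \sigma_f B^p\|^{1/p} &= \|(A^q)^{p/q}\, \sigma_f\, (B^q)^{p/q}\|^{1/p} \\
&\le \|A^q \sigma_f B^q\|^{(1/p)(p/q)} = \|A^q \sigma_f B^q\|^{1/q}.
\end{align*}

Since the function $t \mapsto t^\alpha$ (α ∈ [0, 1]) is in PMI ∩ PMD, we have the following.

Corollary 5.6 (The Ando–Hiai Inequality [4]) Let α ∈ [0, 1]. Then the following
hold.

\[ A, B > 0,\ A \#_\alpha B \ge I \ \Rightarrow\ A^r \#_\alpha B^r \ge I \quad (r \ge 1), \tag{11} \]

\[ A, B > 0,\ A \#_\alpha B \le I \ \Rightarrow\ A^r \#_\alpha B^r \le I \quad (r \ge 1). \tag{12} \]

Remark 5.10 Statements (11) and (12) are equivalent. Assume (11). Then, for
A, B > 0 with $A \#_\alpha B \le I$, we have $A^{-1} \#_\alpha B^{-1} \ge I$. So $A^{-r} \#_\alpha B^{-r} \ge I$ holds
by (11). This implies $A^r \#_\alpha B^r \le I$.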
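
Statement (11) can be observed numerically by rescaling a random pair so that the hypothesis holds with equality at the bottom of the spectrum. The following sketch assumes NumPy and the formula A #_α B = A^{1/2}(A^{-1/2} B A^{-1/2})^α A^{1/2}; the helper names and the sample parameters are illustrative.

```python
import numpy as np

def sym_fun(M, f):
    """Apply the scalar function f to a real symmetric matrix M via its eigendecomposition."""
    w, V = np.linalg.eigh((M + M.T) / 2)
    return (V * f(w)) @ V.T

def mat_pow(M, p):
    """Real power of a symmetric positive definite matrix."""
    return sym_fun(M, lambda w: w ** p)

def weighted_geometric_mean(A, B, alpha):
    """A #_alpha B = A^{1/2} (A^{-1/2} B A^{-1/2})^alpha A^{1/2}."""
    A_half, A_inv_half = mat_pow(A, 0.5), mat_pow(A, -0.5)
    return A_half @ mat_pow(A_inv_half @ B @ A_inv_half, alpha) @ A_half

def min_eig(M):
    """Smallest eigenvalue of the symmetric part of M."""
    return np.linalg.eigvalsh((M + M.T) / 2).min()

rng = np.random.default_rng(3)
n, alpha = 3, 0.3
G1, G2 = rng.standard_normal((n, n)), rng.standard_normal((n, n))
A = G1 @ G1.T + np.eye(n)
B = G2 @ G2.T + np.eye(n)

# Rescale so that the smallest eigenvalue of A #_alpha B is 1; the mean is
# positively homogeneous, so (A/m) #_alpha (B/m) = (A #_alpha B)/m >= I.
m = min_eig(weighted_geometric_mean(A, B, alpha))
A, B = A / m, B / m

min_eigs = {
    r: min_eig(weighted_geometric_mean(mat_pow(A, r), mat_pow(B, r), alpha))
    for r in (1.0, 1.5, 2.0, 3.0)
}
print(min_eigs)  # each value should be at least 1, up to rounding
```

By Remark 5.10, the same experiment applied to the inverses of A and B illustrates statement (12).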

5.3 Furuta Inequality

By using the Ando–Hiai inequality, the essential part of the Furuta inequality [19]
is obtained.
Proposition 5.3 If A ≥ B > 0, then

\[ A^{-r} \#_{\frac{1+r}{p+r}} B^p \le B \]

for p ≥ 1 and r ≥ 0.
Proof It is enough to assume r > 0. We first show

\[ A^{-r} \#_{\frac{r}{p+r}} B^p \le I \quad (p \ge 1,\ r > 0). \tag{13} \]

Note that there exists s ∈ (0, 1] such that r/s ≥ 1. Set q := p/r. It follows from
A ≥ B > 0 that

\[ A^{-s} \#_{\frac{1}{1+q}} B^{sq} \le B^{-s} \#_{\frac{1}{1+q}} B^{sq} = I, \]

which implies

\[ A^{-r} \#_{\frac{r}{p+r}} B^p = A^{-s(r/s)} \#_{\frac{1}{1+q}} B^{sq(r/s)} \le I \]

by the Ando–Hiai inequality. So (13) is obtained. From this,

\begin{align*}
A^{-r} \#_{\frac{1+r}{p+r}} B^p &= B^p \#_{\frac{p-1}{p+r}} A^{-r} \\
&= B^p \#_{\frac{p-1}{p}} \Big( B^p \#_{\frac{p}{p+r}} A^{-r} \Big) \\
&= B^p \#_{\frac{p-1}{p}} \Big( A^{-r} \#_{\frac{r}{p+r}} B^p \Big) \\
&\le B^p \#_{\frac{p-1}{p}} I = B.
\end{align*}

Proposition 5.4 (The Furuta Inequality) If A ≥ B ≥ 0, then

\[ (A^{r/2} B^p A^{r/2})^{1/q} \le A^{\frac{p+r}{q}} \]

for p ≥ 0, r ≥ 0 and q ≥ 1 with q ≥ (p + r)/(r + 1).

Lemma 5.3 If A ≥ B ≥ 0, then

\[ (A^{r/2} B^p A^{r/2})^{1/q} \le A^{\frac{p+r}{q}} \]

for p ≥ 1, r ≥ 0 and q > 0 with q ≥ (p + r)/(r + 1).

Proof It is enough to prove the case when A ≥ B > 0. Proposition 5.3 gives

\[ (A^{r/2} B^p A^{r/2})^{\frac{1+r}{p+r}} \le A^{r/2} B A^{r/2} \le A^{r+1}. \]

Thus, by taking the $\frac{p+r}{q(1+r)}$ power of each side,

\[ (A^{r/2} B^p A^{r/2})^{1/q} = \Big( (A^{r/2} B^p A^{r/2})^{\frac{1+r}{p+r}} \Big)^{\frac{p+r}{q(1+r)}} \le A^{\frac{p+r}{q}}. \]

Proof of Proposition 5.4 Suppose (p, q, r) satisfies the condition. If p = 0 or r =
0, then the desired inequality clearly holds. So we assume p > 0 and r > 0. Put
s := min{p, 1} (≤ 1). Then $p' := p/s \ge 1$ and $r' := r/s > 0$. From the above
lemma,

\begin{align*}
(A^{r/2} B^p A^{r/2})^{\frac{s+r}{p+r}} &= \Big( (A^s)^{r'/2} (B^s)^{p'} (A^s)^{r'/2} \Big)^{\frac{1+r'}{p'+r'}} \\
&\le (A^s)^{(p'+r') \frac{1+r'}{p'+r'}} \\
&= A^{s+r}.
\end{align*}

Since $q \ge \max\left\{ \frac{p+r}{1+r}, 1 \right\}$, we have $1 \ge \frac{p+r}{q(s+r)}$. Thus, by taking the $\frac{p+r}{q(s+r)}$ power of
each side, the desired inequality is obtained.
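
A quick numerical illustration of the Furuta inequality in the boundary case q = (p + r)/(1 + r) is sketched below. It assumes NumPy; the way the ordered pair A ≥ B > 0 is generated and the parameter values p = 3, r = 2 are arbitrary illustrative choices.

```python
import numpy as np

def sym_fun(M, f):
    """Apply the scalar function f to a real symmetric matrix M via its eigendecomposition."""
    w, V = np.linalg.eigh((M + M.T) / 2)
    return (V * f(w)) @ V.T

def mat_pow(M, p):
    """Real power of a symmetric positive definite matrix."""
    return sym_fun(M, lambda w: w ** p)

def min_eig(M):
    """Smallest eigenvalue of the symmetric part of M."""
    return np.linalg.eigvalsh((M + M.T) / 2).min()

rng = np.random.default_rng(4)
n = 3
G1, G2 = rng.standard_normal((n, n)), rng.standard_normal((n, n))
B = G1 @ G1.T / n + 0.5 * np.eye(n)   # B > 0
A = B + G2 @ G2.T / n                 # A - B is positive semidefinite, so A >= B

p, r = 3.0, 2.0
q = (p + r) / (1.0 + r)               # boundary exponent, here q = 5/3 >= 1

A_r2 = mat_pow(A, r / 2.0)
lhs = mat_pow(A_r2 @ mat_pow(B, p) @ A_r2, 1.0 / q)   # (A^{r/2} B^p A^{r/2})^{1/q}
rhs = mat_pow(A, (p + r) / q)                          # A^{(p+r)/q}

furuta_gap_min = min_eig(rhs - lhs)
print(furuta_gap_min)  # should be nonnegative up to rounding
```

Note that the naive inequality B^p ≤ A^p can fail for p > 1, which is what makes the sandwiched form above nontrivial.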


5.4 Chaotic Order

Let A, B > 0. In [3], Ando proved the following equivalence:

\begin{align*}
\log A \ge \log B &\iff A^{-p} \# B^p \le I \quad (p \ge 0) \\
&\iff A^{-p} \# B^p \ge A^{-q} \# B^q \quad (0 \le p \le q).
\end{align*}

In this subsection, a generalization of this result will be given. We first introduce, as
a preliminary, a Lie-Trotter-type formula for an operator perspective.
Proposition 5.5 ([32, Theorem 5.1]) Assume that f is a $C^1$ function on (0, ∞)
with f > 0, f (1) = 1. Then for every A, B > 0,

\[ \lim_{p \to 0} P_f(A^p, B^p)^{1/p} = \exp(f'(1) \log A + (1 - f'(1)) \log B) \]

in the operator norm topology.

The proof is omitted.
Let f ∈ PMD. The function $p \mapsto \|P_f(A^p, B^p)^{1/p}\|$ is decreasing (see
Corollary 5.5). It follows from this fact and the preceding proposition that

\[ \|(A^p\, \sigma_{\tilde f}\, B^p)^{1/p}\| = \|P_f(A^p, B^p)^{1/p}\| \le \|\exp(f'(1) \log A + (1 - f'(1)) \log B)\|. \]

Proposition 5.6 ([32]) Let α ∈ [0, 1] and $PMD_\alpha$ be the set of all $f \in OM_+^1$ such
that $f'(1) = \alpha$. Let A, B > 0. Then the following are equivalent:
(i) $(1 - \alpha) \log A + \alpha \log B \le 0$;
(ii) $A^p \sigma_f B^p \le I$ for all p > 0 and for all $f \in PMD_\alpha$;
(iii) $A^p \#_\alpha B^p \le I$ for all p > 0;
(iv) $p \mapsto A^p \sigma_f B^p$ is a decreasing map from [0, ∞) into $B(H)_{++}$ for all
$f \in PMD_\alpha$;
(v) $p \mapsto A^p \#_\alpha B^p$ is a decreasing map from [0, ∞) into $B(H)_{++}$.

Proof The equivalence of (i)–(iii) is immediate from the above argument.
(iii) ⇒ (iv). From Corollary 5.4, we have $A^r \sigma_f B^r \le A \sigma_f B$ (r ∈ [1, 2]). Thus
for all r ≥ 1, $A^r \sigma_f B^r \le A \sigma_f B$ holds. Using this,

\[ A^p \sigma_f B^p = A^{q(p/q)} \sigma_f B^{q(p/q)} \le A^q \sigma_f B^q \quad (p \ge q > 0). \]

Combining this and the fact that $A^p \sigma_f B^p \le I = A^0 \sigma_f B^0$ (p > 0), the desired
result is obtained.
(iv) ⇒ (v). Immediate.
(v) ⇒ (i). From Proposition 5.5,

\[ I \ge \lim_{p \to 0} (A^p \#_\alpha B^p)^{1/p} = \exp((1 - \alpha) \log A + \alpha \log B). \]


Replacing A, B by $A^{-1}$, $B^{-1}$, we have the following.

Corollary 5.7 Let α ∈ [0, 1] and A, B > 0. Then the following are equivalent:
(i) $(1 - \alpha) \log A + \alpha \log B \ge 0$;
(ii) $A^p \sigma_{f^*} B^p \ge I$ for all p > 0 and for all $f \in PMD_\alpha$;
(iii) $A^p \#_\alpha B^p \ge I$ for all p > 0;
(iv) $p \mapsto A^p \sigma_{f^*} B^p$ is an increasing map from [0, ∞) into $B(H)_{++}$ for all
$f \in PMD_\alpha$;
(v) $p \mapsto A^p \#_\alpha B^p$ is an increasing map from [0, ∞) into $B(H)_{++}$.

The next result is a generalization for the Ando–Hiai inequality.

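Corollary 5.8 ([21]) Let A, B > 0. Suppose that

\[ A \#_{\frac{\beta}{\alpha+\beta}} B \ge I \quad \text{for fixed } \alpha \ge 0 \text{ and } \beta \ge 0 \text{ with } \alpha + \beta > 0. \]

Then the following inequality holds:

\[ A^\mu \#_{\frac{\beta\mu}{\alpha\lambda+\beta\mu}} B^\lambda \ge I \quad \text{for } \lambda \ge 1 \text{ and } \mu \ge 1. \]

Proof Put $W := A \#_{\frac{\beta}{\alpha+\beta}} B\ (\ge I)$. Since

\[ (W^{-1/2} A W^{-1/2}) \#_{\frac{\beta}{\alpha+\beta}} (W^{-1/2} B W^{-1/2}) = I, \]

we have

\[ \alpha\mu\lambda \log(W^{-1/2} A W^{-1/2}) + \beta\mu\lambda \log(W^{-1/2} B W^{-1/2}) = 0. \tag{14} \]

Then, there exists a non-negative integer n such that $1 \le \lambda_0 := \lambda/2^n \le 2$, and

\begin{align*}
\beta\mu\lambda \log(W^{-1/2} B W^{-1/2}) &= \beta\mu 2^n \log (W^{-1/2} B W^{-1/2})^{\lambda_0} \\
&\le \beta\mu 2^n \log(W^{-1/2} B^{\lambda_0} W^{-1/2}) \\
&\le \beta\mu \log(W^{-1/2} B^\lambda W^{-1/2}).
\end{align*}

Thus, from (14),

\[ \alpha\lambda \log(W^{-1/2} A^\mu W^{-1/2}) + \beta\mu \log(W^{-1/2} B^\lambda W^{-1/2}) \ge 0, \]

\[ \frac{\alpha\lambda}{\alpha\lambda + \beta\mu} \log(W^{-1/2} A^\mu W^{-1/2}) + \frac{\beta\mu}{\alpha\lambda + \beta\mu} \log(W^{-1/2} B^\lambda W^{-1/2}) \ge 0, \]

which implies the desired result by using the preceding corollary.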


5.5 Notes and Remarks

The perspective function for a real valued continuous function often appears in
the theory of convex analysis. The operator perspective of invertible positive
operators is a non-commutative analogue of this. In Sect. 2, we state some fundamental
properties of an operator perspective such as the equivalence between the operator
convexity of the representation function and the joint convexity of the perspective
[13, 15].
In Sect. 3, we study a two-variable functional calculus (A, B) of positive
operators A, B developed by W. Pusz and S. L. Woronowicz (PW-functional
calculus for short)[49]. We introduce some important properties in [27, 33]. We
defined the PW-functional calculus for a real valued continuous function with the
condition (6). The set of such functions does not contain some important functions
such as log t and t log t, but (A, B) can often be calculated in some restricted
domain (Sect. 3.2.5).
The operator connection in the sense of Kubo and Ando [39] can be considered
as the PW-functional calculus for a positive operator monotone function on [0, ∞).
Their axiomatization not only teaches us the essence of an operator connection, but
also serves as a tool for checking whether a given binary operation is an operator
connection. In Sect. 4, we introduce some fundamental properties of an operator
mean and give some examples.
Thanks to the basic results of the operator perspective described in the previous
section, some of the properties of the operator mean become apparent by taking
limits.
Transformations ( ˜, ∗ , ⊥ ) on the set of operator means are well-known [39]. In
Sect. 4.2, the Barbour transform is introduced. Although it was first defined in [40],
its origins can be seen earlier in the papers of some authors [7, 43].
A fixed point with respect to the transformation $\sigma \mapsto \sigma^*$ is called a self-adjoint
operator mean. F. Hansen's work concerning its characterization
was groundbreaking [22]. We briefly mention the relations between a self-adjoint
operator mean and a symmetric one proved by H. Osaka and S. Wada [46].
Inequalities involving operator perspectives and positive linear maps have been
studied (e.g., [1, 2, 9]). In Sect. 5.1, we discuss how the operator convexity and
the operator concavity of a representing function are reflected in the inequalities.
Almost all statements are classical and fundamental.
For an operator mean σ, some statements comparing $A\,\sigma\,B$ and $A^r\,\sigma\,B^r$ are
treated in Sect. 5.2. A central result (Proposition 5.2) is stated in [4, Theorem 2.1].
Functions satisfying the condition (i) in Corollary 5.2 (resp. Corollary 5.4) are said
to be power monotone increasing (resp. decreasing) [54] and have been studied in
[31, 32, 55].
The Furuta inequality was developed as a generalization of the Löwner–Heinz
inequality: $A \ge B \ge 0 \Rightarrow A^s \ge B^s \ge 0$ ($s \in [0, 1]$) [19]. Although the original
proof is elegant and not very difficult, we gave a slightly longer proof using the
Ando–Hiai inequality in Sect. 5.3. The proof is due to M. Fujii and E. Kamei
[17]. They showed that the essential part of the Furuta inequality (Proposition 5.3)
and the Ando–Hiai inequality (Corollary 5.6) imply each other.
For $A, B \in B(H)^{++}$, the (weaker) order defined by log A ≥ log B is called the
chaotic order. Many equivalent conditions have been studied [16, 20, 21, 32, 53].
Among them, we introduce some statements that come from the Ando–Hiai
inequality [32].
The last statement given here is developed by T. Furuta, M. Yanagida and T.
Yamazaki [21]. We show this result using the method developed by T. Yamazaki in
the theory of the multivariate Ando–Hiai type inequality [56].

Acknowledgments The first author’s research is supported by KAKENHI grant No. JP20K03644.

References

1. T. Ando, Topics on Operator Inequalities, Lecture Note (Hokkaido University, Sapporo, 1978)
2. T. Ando, Concavity of certain maps on positive definite matrices and applications to Hadamard
products. Linear Algebra Appl. 26, 203–241 (1979)
3. T. Ando, On some operator inequalities. Math. Ann. 279(1), 157–159 (1987)
4. T. Ando, F. Hiai, Log majorization and complementary Golden–Thompson type inequalities.
Linear Algebra Appl. 197, 113–131 (1994)
5. Á. Besenyei, The Hasegawa–Petz mean: properties and inequalities. J. Math. Anal. Appl.
391(2), 441–450 (2012)
6. R. Bhatia, Matrix Analysis. Graduate Texts in Mathematics, vol. 169 (Springer, New York,
1997)
7. J.L. Brenner, W.A. Newcomb, O.G. Ruler, An inequality: problem 85–20. SIAM Rev. 28, 573
(1987)
8. H.J. Carlin, G.A. Noble, Circuit properties of coupled dispersive lines with applications to
wave guide modelling, in Proceedings on Network and Signal Theory, ed. by J.K. Skwirzynski,
J.O. Scanlan. (Peter Peregrinus, Stevenage, 1973), pp. 258–269
9. M.-D. Choi, A Schwarz inequality for positive linear maps on C ∗ -algebras. Illinois J. Math.
18, 565–574 (1974)
10. J.B. Conway, A Course in Operator Theory. Graduate Studies in Mathematics, vol. 21
(American Mathematical Society, Providence, RI, 2000)
11. K.R. Davidson, C*-Algebras by Example. Fields Institute Monographs, vol. 6 (American
Mathematical Society, Providence, RI, 1996)
12. B. Dacorogna, P. Maréchal, The role of perspective functions in convexity, polyconvexity, rank-
one convexity and separate convexity. J. Convex Anal. 15(2), 271–284 (2008)
13. E.G. Effros, A matrix convexity approach to some celebrated quantum inequalities. Proc. Natl.
Acad. Sci. U. S. A. 106(4), 1006–1008 (2009)
14. E. Effros, F. Hansen, Non-commutative perspectives. Ann. Funct. Anal. 5(2), 74–79 (2014)
15. A. Ebadian, I. Nikoufar, M.E. Gordji, Perspectives of matrix convex functions. Proc. Natl.
Acad. Sci. U. S. A. 108(18), 7313–7314 (2011)
16. M. Fujii, T. Furuta, E. Kamei, Furuta’s inequality and its application to Ando’s theorem. Linear
Algebra Appl. 179, 161–169 (1993)
17. M. Fujii, E. Kamei, Ando–Hiai inequality and Furuta inequality. Linear Algebra Appl. 416(2–
3), 541–545 (2006)
18. J.I. Fujii, M. Fujii, Y. Seo, An extension of the Kubo–Ando theory: solidarities. Math. Jpn. 35,
509–512 (1990)

19. T. Furuta, $A \ge B \ge 0$ assures $(B^rA^pB^r)^{1/q} \ge B^{(p+2r)/q}$ for $r \ge 0$, $p \ge 0$, $q \ge 1$ with
$(1+2r)q \ge p+2r$. Proc. Am. Math. Soc. 101(1), 85–88 (1987)
20. T. Furuta, Applications of order preserving operator inequalities, in Operator Theory and
Complex Analysis (Sapporo, 1991), 180–190. Oper. Theory Adv. Appl. 59 (Birkhäuser, Boston,
1991)
21. T. Furuta, T. Yamazaki, M. Yanagida, Operator functions implying generalized Furuta inequal-
ity. Math. Inequal. Appl. 1(1), 123–130 (1998)
22. F. Hansen, Selfadjoint means and operator monotone functions. Math. Ann. 256(1), 29–35
(1981)
23. F. Hansen, Perspectives and completely positive maps. Ann. Funct. Anal. 8(2), 168–176 (2017)
24. F. Hansen, G.K. Pedersen, Jensen’s inequality for operators and Löwner’s theorem. Math. Ann.
258(3), 229–241 (1982)
25. H. Hasegawa, D. Petz, On the Riemannian metric of α-entropies of density matrices. Lett.
Math. Phys. 38(2), 221–225 (1996)
26. H. Hasegawa, D. Petz, Non-commutative extension of the information geometry II, in Quantum
Communication and Measurement, ed. by O. Hirota (Plenum, New York, 1997), pp. 109–118
27. K. Hatano, Y. Ueda, Pusz–Woronowicz’s functional calculus revisited. Acta Sci. Math.
(Szeged) 87, 485–503 (2021)
28. F. Hiai, Log-majorization related to Rényi divergences. Linear Algebra Appl. 563, 255–276
(2019)
29. F. Hiai, Quantum f-Divergences in von Neumann Algebras-Reversibility of Quantum Opera-
tions. Mathematical Physics Studies (Springer, Singapore, 2021)
30. F. Hiai, H. Kosaki, Means for matrices and comparison of their norms. Indiana Univ. Math. J.
48, 899–936 (1999)
31. F. Hiai, Y. Seo, S. Wada, Ando–Hiai type inequalities for multivariate operator means. Linear
Multilinear Algebra 67(11), 2253–2281 (2019)
32. F. Hiai, Y. Seo, S. Wada, Ando–Hiai type inequalities for operator means and operator
perspective. Int. J. Math. 31, 2050007, 44 pp. (2020)
33. F. Hiai, Y. Ueda, S. Wada, Pusz–Woronowicz functional calculus and extended operator convex
functions. Integral Equa. Oper. Theory 94(1), 66 pp. (2022). Paper No. 1
34. J.-B. Hiriart-Urruty, C. Lemarechal, Convex analysis and Minimization Algorithms. I. Fun-
damentals. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of
Mathematical Sciences], vol. 305 (Springer, Berlin, 1993)
35. J.-B. Hiriart-Urruty, J.-E. Martínez-Legaz, Convex solutions of a functional equation arising
in information theory. J. Math. Anal. Appl. 328(2), 1309–1320 (2007)
36. R.V. Kadison, A generalized Schwarz inequality and algebraic invariants for operator algebras.
Ann. Math. (2) 56, 494–503 (1952)
37. M. Kian, M.S. Moslehian, Y. Seo, Variants of Ando–Hiai type inequalities for deformed means
and applications. Glasg. Math. J. 63(3), 622–639 (2021)
38. F. Kubo, On logarithmic means, in Tenth-Symp. Appl. Func. Anal. (1987), pp. 47–60
39. F. Kubo, T. Ando, Means of positive linear operators. Math. Ann. 246(3), 205–224 (1980)
40. F. Kubo, N. Nakamura, K. Ohno, S. Wada, Barbour path of operator monotone functions. Far
East J. Math. Sci. (FJMS) 57(2), 181–192 (2011)
41. J. Lawson, Y. Lim, Karcher means and Karcher equations of positive definite operators. Trans.
Am. Math. Soc. Ser. B 1, 1–22 (2014)
42. T.-P. Lin, The power mean and the logarithmic mean. Am. Math. Mon. 81, 879–883 (1974)
43. M. Mays, Functions which parametrize means, Am. Math. Mon. 90, 677–683 (1983)
44. M. Nagisa, S. Wada, Operator monotonicity of some functions. Linear Algebra Appl. 486,
389–408 (2015)
45. Y. Nakamura, Classes of operator monotone functions and Stieltjes functions, in The Gohberg
Anniversary Collection, vol. II, ed. by H. Dym, et al. Oper. Theory Adv. Appl., vol. 41
(Birkhäuser, Basel, 1989), pp. 395–404
46. H. Osaka, S. Wada, Unexpected relations which characterize operator means. Proc. Am. Math.
Soc. Ser. B 3, 9–17 (2016)

47. V.I. Paulsen, Completely Bounded Maps and Operator Algebras (Cambridge University Press,
Cambridge, 2002)
48. G.K. Pedersen, Analysis Now. Graduate Texts in Mathematics, vol. 118 (Springer, New York,
1989)
49. W. Pusz, S.L. Woronowicz, Functional calculus for sesquilinear forms and the purification
map. Rep. Math. Phys. 5, 159–170 (1975)
50. K.B. Stolarsky, Generalizations of the logarithmic mean. Math. Mag. 48, 87–92 (1975)
51. V.E.S. Szabó, A class of matrix monotone functions. Linear Algebra Appl. 420, 79–85 (2007)
52. S.-E. Takahasi, M. Tsukada, K. Tanahashi, T. Ogiwara, An inverse type of Jensen’s inequality.
Math. Jpn. 50, 85–91 (1999)
53. M. Uchiyama, Some exponential operator inequalities. Math. Inequal. Appl. 2, 469–471 (1999)
54. S. Wada, Some ways of constructing Furuta-type inequalities. Linear Algebra Appl. 457, 276–
286 (2014)
55. S. Wada, When does Ando–Hiai inequality hold?. Linear Algebra Appl. 540, 234–243 (2018)
56. T. Yamazaki, The Riemannian mean and matrix inequalities related to the Ando–Hiai
inequality and chaotic order. Oper. Matrices 6, 577–588 (2012)
57. Z.-H. Yang, New sharp bounds for logarithmic mean and identric mean. J. Inequal. Appl.
2013(1), 116 (2013)
Cauchy–Schwarz Operator and Norm
Inequalities for Inner Product Type
Transformers in Norm Ideals of Compact
Operators, with Applications

Danko R. Jocić and Milan Lazarević

Abstract In this survey paper we present operator and norm inequalities of
Cauchy–Schwarz type:
$$\Bigl\|\int_\Omega A_tXB_t\,d\mu(t)\Bigr\|_\Phi \le \Bigl\|\Bigl(\int_\Omega A_t^*A_t\,d\mu(t)\Bigr)^{1/2} X \Bigl(\int_\Omega B_tB_t^*\,d\mu(t)\Bigr)^{1/2}\Bigr\|_\Phi,$$
for strongly square integrable operator families $\{A_t\}_{t\in\Omega}$, $\{B_t^*\}_{t\in\Omega}$ and symmetrically
norming functions $\Phi$, such that the associated unitarily invariant norm is
nuclear, Q∗ or arbitrary, under some additional commutativity conditions. The
applications of this and complementary inequalities for Q and Schatten–von
Neumann norms to Aczél–Bellman, Grüss–Landau, arithmetic–geometric, Young,
Minkowski, Heinz, Zhan, Heron, and generalized derivation norm inequalities are
also presented.

Keywords Norm inequalities · Elementary operators · i.p.t. transformers ·

Q and Q∗ -norms · Generalized derivations · Operator monotone functions ·
Subnormal · N -hyper-accretive and N -hyper-contractive operators

1 Introduction

In this paper we will denote respectively by B(H) and K(H) the spaces of all
bounded and all compact linear operators on a separable, complex Hilbert space H.
Each symmetrically norming (s.n.) function $\Upsilon$, defined on sequences of complex
numbers, gives rise to the associated symmetric or unitarily invariant (u.i.) norm
$\|\cdot\|_\Upsilon$ on operators, defined on the naturally associated norm ideal $C_\Upsilon(H) \subset K(H)$.
The most important example of s.n. functions is the trace s.n. function $\ell$ (denoted also by
$\ell_1$ or $\ell^1$), defined by $\ell\bigl((\lambda_n)_{n=1}^\infty\bigr) := \sum_{n=1}^\infty |\lambda_n|$. For any s.n. function $\Upsilon$, there is its
adjoint s.n. function, which we denote by $\Upsilon^*$. For any p ≥ 1 an s.n. function $\Upsilon$ can
be p-modified, and this p-modification $\Upsilon^{(p)}$ is again an s.n. function, for
which the corresponding ideal of compact operators will be denoted by $C_{\Upsilon^{(p)}}(H)$.
Similarly, the ideal of compact operators associated to the dual s.n. function $\Upsilon^{(p)*}$
will be denoted by $C_{\Upsilon^{(p)*}}(H)$, and by $C^{\circ}_{\Upsilon^{(p)}}(H)$ we will denote the closure of
finite rank operators in the $\|\cdot\|_{\Upsilon^{(p)}}$ norm. The Schatten–von Neumann trace classes
$C_p(H) := C_{\ell^{(p)}}(H)$ are the most important and the best known examples of norm
ideals associated to degree p modified norms (i.e. to the p-modification of the s.n. function $\ell$). Amongst them,
$C_1(H)$ is also known as the class of nuclear operators, while $C_2(H)$ is known as the
Hilbert–Schmidt class. For the norm in $C_p(H)$ we will use the simplified notation
$\|\cdot\|_p$. For p ≥ 2, all norms $\|\cdot\|_{\Upsilon^{(p)}}$ are also known as Q-norms, as $\Upsilon^{(p)} = (\Upsilon^{(p/2)})^{(2)}$
and $\Upsilon^{(p/2)}$ is also an s.n. function, while their associated dual norms $\|\cdot\|_{\Upsilon^{(p)*}}$ are also
known as Q∗-norms.

D. R. Jocić · M. Lazarević
Department of Mathematics, University of Belgrade, Belgrade, Serbia
e-mail: [email protected]; [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
R. M. Aron et al. (eds.), Operator and Norm Inequalities and Related Topics,
Trends in Mathematics, https://doi.org/10.1007/978-3-031-02104-6_6
If $(\Omega, M, \mu)$ is a space with a measure μ on a σ-algebra M, consisting of
(measurable) subsets of $\Omega$, then we will refer to a function $A : \Omega \to B(H) : t \mapsto A_t$
as ([M]) weakly∗-measurable whenever $t \mapsto \langle A_t g, h\rangle$ is ([M]) measurable
for all g, h ∈ H. If, in addition, those functions are [μ] integrable on $\Omega$, then A is
called ([μ]) weakly∗-integrable on $\Omega$, in which case there is a unique operator in B(H)
(known as the Gel'fand or weak∗-integral and denoted by $\int_\Omega A_t\,d\mu(t)$), having
the property
$$\Bigl\langle \int_\Omega A_t\,d\mu(t)\,h, k\Bigr\rangle = \int_\Omega \langle A_t h, k\rangle\,d\mu(t) \quad\text{for all } h, k \in H.$$
Note that $\int_\Omega A_t\,d\mu(t)$ also satisfies the definition of the Pettis integral. For a more
complete account of weak∗-integrability of operator valued (o.v.) functions
the interested reader is referred to [5, p.53], [15, p.320] and [20, lemma 1.2].
Let also $L^2(\Omega, \mu, H)$ denote the space of all (weakly) measurable functions
$f : \Omega \to H$ such that $\int_\Omega \|f(t)\|^2\,d\mu(t) < +\infty$, and similarly, let $L^2_G(\Omega, \mu, B(H))$
denote the space of all weak∗-measurable functions $A : \Omega \to B(H)$ such that
$\int_\Omega \|A_t f\|^2\,d\mu(t) < +\infty$ for all f ∈ H. In this case we say that A is ([μ])
square integrable (s.i.), and by analogy, a family $\{A_n\}_{n=1}^\infty$ in B(H) will be called
(strongly) square summable (s.s.) if $\sum_{n=1}^\infty \|A_n h\|^2 < +\infty$ for all h ∈ H. Also,
$A^*A$ is Gel'fand integrable iff $A \in L^2_G(\Omega, \mu, B(H))$, as shown (amongst others) in
[15, ex. 2]. If a family $\{A_t\}_{t\in\Omega}$ consists of mutually commuting normal operators,
we will refer to it as an m.c.n.o. family. If for $A \in L^2_G(\Omega, \mu, B(H))$ the associated
family $\{A_t\}_{t\in\Omega}$ is an m.c.n.o. family, then we will refer to $\{A_t\}_{t\in\Omega}$ (and A) as an s.i.
and m.c.n.o. family (o.v. function).

For A, B ∈ B(H) the bilateral multiplier A ⊗ B is defined by $A \otimes B : B(H) \to B(H) : X \mapsto AXB$
and the generalized derivation $\Delta_{A,B}$ by $\Delta_{A,B} := A\otimes I + I\otimes B$.
For sequences $\{A_n\}_{n=1}^\infty$, $\{B_n\}_{n=1}^\infty$ in B(H) and a norm ideal $C_\Upsilon(H)$ the associated
transformer $\sum_{n=1}^\infty A_n\otimes B_n$ is called a σ-elementary transformer on $C_\Upsilon(H)$ iff for
every $X \in C_\Upsilon(H)$ there is $Y \in C_\Upsilon(H)$ such that
$$Y = [w]\sum_{n=1}^\infty A_nXB_n := [w]\lim_{n\to\infty}\sum_{k=1}^n A_kXB_k,$$
where $[w]\lim_{n\to\infty}$ (resp. $[s]\lim_{n\to\infty}$) denotes the weak
(resp. strong) operator limit. More generally, if $A, B : \Omega \to B(H)$ are weak∗-measurable,
such that $t \mapsto A_tXB_t$ is weak∗-integrable on $\Omega$ for all $X \in C_\Upsilon(H)$,
then $\int_\Omega A_t\otimes B_t\,d\mu(t) : C_\Upsilon(H) \to B(H) : X \mapsto \int_\Omega A_tXB_t\,d\mu(t)$ is called an inner
product type (i.p.t.) transformer on $C_\Upsilon(H)$ if $\int_\Omega A_tXB_t\,d\mu(t) \in C_\Upsilon(H)$ for
all $X \in C_\Upsilon(H)$. For related questions about the existence and different types
of convergence for σ-elementary transformers, weak∗-integrability, convergence
properties for weak∗-integrals and boundedness of i.p.t. transformers the interested
reader is referred to [13, th. 2.2], [14, th. 2.1, cor. 2.1], [15, lemma 3.1, th. 3.2,
th. 3.3, th. 3.4], [16, th. 2.1, th. 3.1, th. 3.2, cor. 3.1] and [26, lemma 2.1, th. 3.1].
Cauchy–Schwarz norm inequalities for i.p.t. transformers appear in different
forms, depending on the ideals of compact operators on which the transformers act, as
well as on the normality and commutativity properties of the s.i. families involved therein.
These new tools for the investigation of i.p.t. transformers make it possible
(amongst other things) to treat derivation inequalities for different classes of operators,
including N-hyper-accretive, N-hyper-contractive, quasinormal, hyponormal, subnormal
operators and operators with the contractive real part. This gives new contributions
to the perturbation theory for not necessarily normal operators in ideals of
compact operators, well beyond the scope of the standard theory of double operator
integrals (D.O.I.), developed earlier by Birman, Solomyak and their collaborators.
For a successful adaptation of D.O.I. and its applications to means inequalities for
operators see [10, 11, 30].
Next, we recall definitions of some important subclasses of bounded operators
on Hilbert spaces, which will be investigated in the sequel.
Definition 1.1 For operators A, C ∈ B(H) and n, N ∈ ℕ we say that A is:
• accretive if $A^* + A \ge 0$;
• strictly accretive if there is a constant c > 0, such that $A^* + A \ge cI$, which will
be denoted by $A^* + A > 0$;
• N-hyper-accretive if, and only if, $\sum_{k=0}^n \binom{n}{k} A^{*k}A^{n-k} \ge 0$ for all 1 ≤ n ≤ N;
• dissipative if $\Im A := A_\Im := \frac{A - A^*}{2i} \ge 0$;
• (co)hyponormal if ($A^*A \le AA^*$) $A^*A \ge AA^*$;
• subnormal if there is a Hilbert space K containing H and a normal operator N
on K such that $N H \subset H$ and $A = N|_H$;
• quasinormal if $A^*A$ commutes with A, i.e. $A^*AA = AA^*A$.
C is an N-hyper-contraction if $\sum_{k=0}^n (-1)^k\binom{n}{k} C^{*k}C^k \ge 0$ for all 1 ≤ n ≤ N.

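Throughout this paper we use conventions $\mathbb{R}_+ := [0, +\infty)$, $\mathbb{C}_+ := \{z \in \mathbb{C} : \Im z := \frac{z - z^*}{2i} > 0\}$,
$\mathbb{C}_- := \{z \in \mathbb{C} : \Im z < 0\}$ and $\ell^2_{\mathbb{Z}}(H)$ for the Hilbert space of all sequences
$\{h_n\}_{n\in\mathbb{Z}}$ in H, satisfying $\sum_{n\in\mathbb{Z}} \|h_n\|^2 < +\infty$.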

We also emphasize that we will treat (refer to) every unnumbered line in a
multiline formula as a part of the subsequent numbered one.

2 Cauchy–Schwarz Operator and Norm Inequalities for i.p.t. Transformers

We start with different types of operator Cauchy–Schwarz inequalities.

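Theorem 2.1 Let X ∈ B(H) and $A^*, B \in L^2_G(\Omega, \mu, B(H))$.
(a1) Then $t \mapsto A_tXB_t$, acting on $\Omega$, is weak∗-integrable and for all f, g ∈ H
$$\Bigl|\int_\Omega \langle A_tXB_t f, g\rangle\,d\mu(t)\Bigr|^2 \le \int_\Omega \langle A_t|X^*|^{2-2\theta}A_t^* g, g\rangle\,d\mu(t) \int_\Omega \langle B_t^*|X|^{2\theta}B_t f, f\rangle\,d\mu(t) \quad\text{for all } \theta \in [0, 1], \tag{1}$$
$$\Bigl|\int_\Omega A_tXB_t\,d\mu(t)\Bigr|^2 \le \Bigl\|\int_\Omega A_tA_t^*\,d\mu(t)\Bigr\| \int_\Omega B_t^*X^*XB_t\,d\mu(t). \tag{2}$$
(a2) For every ε > 0
$$\Bigl|\Bigl(\varepsilon I + \int_\Omega A_tA_t^*\,d\mu(t)\Bigr)^{-1/2} \int_\Omega A_tXB_t\,d\mu(t)\Bigr|^2 \le \int_\Omega B_t^*X^*XB_t\,d\mu(t). \tag{3}$$
(a3) If $\int_\Omega A_tA_t^*\,d\mu(t)$ is additionally invertible, then εI appearing in the inequality (3)
could be omitted.
(b) If, in addition, $\{A_t\}_{t\in\Omega}$ is an m.c.n.o. family, then
$$\Bigl|\int_\Omega A_tXB_t\,d\mu(t)\Bigr|^2 \le \int_\Omega B_t^*X^*\Bigl(\int_\Omega A_t^*A_t\,d\mu(t)\Bigr)XB_t\,d\mu(t). \tag{4}$$

The inequality (1) is the special case $X_t := X$, $C_t := A_t$ and $D_t := B_t$ for all $t \in \Omega$
in [15, th. 3.1(a)], inequalities (2), (3) and (a3) are proved in [26, lemma 2.1], while
the inequality (4) is proved in [26, cor. 2.3].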
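
One standard route to the inequality (1) goes through the polar decomposition of X and the scalar Cauchy–Schwarz inequality. The sketch below is not taken from the cited proofs; it writes Ω for the underlying measure space, and the partial isometry U is notation introduced here only for illustration.

```latex
% Assumptions (this sketch only): X = U|X| is the polar decomposition, so that
% |X^*|^{s} U = U |X|^{s} for s > 0 and therefore
%   X = |X^*|^{1-\theta}\, U\, |X|^{\theta}   for every \theta \in [0,1].
% Requires amsmath.
\begin{align*}
\Bigl|\int_\Omega \langle A_t X B_t f, g\rangle \, d\mu(t)\Bigr|^2
 &= \Bigl|\int_\Omega \bigl\langle U|X|^{\theta} B_t f,\; |X^*|^{1-\theta} A_t^* g \bigr\rangle \, d\mu(t)\Bigr|^2 \\
 &\le \int_\Omega \bigl\| U|X|^{\theta} B_t f \bigr\|^2 d\mu(t)
      \int_\Omega \bigl\| |X^*|^{1-\theta} A_t^* g \bigr\|^2 d\mu(t) \\
 &\le \int_\Omega \langle B_t^* |X|^{2\theta} B_t f, f\rangle \, d\mu(t)
      \int_\Omega \langle A_t |X^*|^{2-2\theta} A_t^* g, g\rangle \, d\mu(t).
\end{align*}
% The last step uses \|U\| \le 1.
```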

For elementary operators the inequality (2) is further refined in [19, lemma 2.2],
saying that
Lemma 2.2 If X ∈ B(H) and $\{A_1, \dots, A_N\}$, $\{B_1, \dots, B_N\}$ are families in B(H)
for N ∈ ℕ, then for all $c \ge \bigl\|\sum_{n=1}^N A_n^*A_n\bigr\|^{1/2} > 0$ we have the identity
$$\Bigl|\sum_{n=1}^N A_n^*XB_n\Bigr|^2 + \sum_{n=1}^N \Bigl|A_n\Bigl(cI + \Bigl(c^2I - \sum_{k=1}^N A_k^*A_k\Bigr)^{1/2}\Bigr)^{-1}\sum_{m=1}^N A_m^*XB_m - cXB_n\Bigr|^2 = c^2\sum_{n=1}^N |XB_n|^2.$$
Consequently,
$$\Bigl|\sum_{n=1}^N A_n^*XB_n\Bigr|^2 \le \Bigl\|\sum_{n=1}^N A_n^*A_n\Bigr\| \sum_{n=1}^N B_n^*X^*XB_n, \tag{5}$$
and thus necessary and sufficient conditions for equality to take place in (5) are
that there exists D ∈ B(H) such that $A_nD = XB_n$ for all n = 1, . . . , N and
$\bigl(\bigl\|\sum_{n=1}^N A_n^*A_n\bigr\| - \sum_{n=1}^N A_n^*A_n\bigr)D = 0$.
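
The identity in Lemma 2.2 can be checked by a direct expansion. The following sketch is not quoted from [19]; the abbreviations S, P, R and T are introduced here only for illustration.

```latex
% Notation of this sketch only:
%   S := \sum_{m=1}^N A_m^* X B_m,   P := \sum_{n=1}^N A_n^* A_n,
%   R := (c^2 I - P)^{1/2},          T := (cI + R)^{-1}.
% P, R, T commute and are self-adjoint; P = c^2 I - R^2 = (cI - R)(cI + R),
% hence T P T = (cI - R) T.  Requires amsmath.
\begin{align*}
\sum_{n=1}^N \bigl| A_n T S - c X B_n \bigr|^2
  &= S^* T P T S - 2c\, S^* T S + c^2 \sum_{n=1}^N |X B_n|^2 \\
  &= S^* \bigl( (cI - R) T - 2c T \bigr) S + c^2 \sum_{n=1}^N |X B_n|^2 \\
  &= - S^* (cI + R) T S + c^2 \sum_{n=1}^N |X B_n|^2
   = - |S|^2 + c^2 \sum_{n=1}^N |X B_n|^2 .
\end{align*}
% Moving |S|^2 to the left gives the identity.  Dropping the non-negative sum
% on the left and taking c = \|P\|^{1/2} gives (5).
```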
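The fundamental role in the investigation of i.p.t. transformers is played by the
following list of Cauchy–Schwarz norm inequalities.
Theorem 2.3 Let p ≥ 2, $\Phi$, $\Upsilon$ be s.n. functions and X ∈ B(H), then
$\int_\Omega A_tXB_t\,d\mu(t) \in C_\Phi(H)$:
(a) if $A, B^* \in L^2_G(\Omega, \mu, B(H))$ and $\bigl(\int_\Omega A_t^*A_t\,d\mu(t)\bigr)^{1/2} X \bigl(\int_\Omega B_tB_t^*\,d\mu(t)\bigr)^{1/2} \in C_\Phi(H)$,
in which case
$$\Bigl\|\int_\Omega A_tXB_t\,d\mu(t)\Bigr\|_\Phi \le \Bigl\|\Bigl(\int_\Omega A_t^*A_t\,d\mu(t)\Bigr)^{1/2} X \Bigl(\int_\Omega B_tB_t^*\,d\mu(t)\Bigr)^{1/2}\Bigr\|_\Phi, \tag{6}$$
under any of the following conditions:
(a1) μ is a σ-finite measure on $\Omega$ and $\|\cdot\|_\Phi := \|\cdot\|_1$,
(a2) $L^2(\Omega, \mu)$ is separable, $\Phi := \Upsilon^{(p)*}$ and (at least) one of families $\{A_t\}_{t\in\Omega}$
or $\{B_t\}_{t\in\Omega}$ is an m.c.n.o. family. If, in addition, also $X \in C_{\Upsilon^{(p)*}}(H)$,
then (6) remains valid if (at least) one of families $\{A_t\}_{t\in\Omega}$ or $\{B_t^*\}_{t\in\Omega}$ is a
u.e.2s.i.m.c.n.o. family,
(a3) μ is a σ-finite measure on $\Omega$, $\|\cdot\|_\Phi := |||\cdot|||$ and both $\{A_t\}_{t\in\Omega}$ and $\{B_t\}_{t\in\Omega}$ are
m.c.n.o. families. If, in addition, $X \in C_{|||\cdot|||}(H)$, then (6) remains valid if both
families $\{A_t\}_{t\in\Omega}$ and $\{B_t^*\}_{t\in\Omega}$ are u.e.2s.i.m.c.n.o. families;
(b) if $A, B \in L^2_G(\Omega, \mu, B(H))$ and $\bigl(\int_\Omega A_t^*A_t\,d\mu(t)\bigr)^{1/2} X \in C_\Phi(H)$, in which case
$$\Bigl\|\int_\Omega A_tXB_t\,d\mu(t)\Bigr\|_\Phi \le \Bigl\|\Bigl(\int_\Omega A_t^*A_t\,d\mu(t)\Bigr)^{1/2} X\Bigr\|_\Phi \Bigl\|\int_\Omega B_t^*B_t\,d\mu(t)\Bigr\|^{1/2}, \tag{7}$$
under any of the following conditions:
(b1) $\|\cdot\|_\Phi := \|\cdot\|_2$,
(b2) $\Phi := \Upsilon^{(p)}$ and $\{A_t\}_{t\in\Omega}$ is additionally an m.c.n.o. family,
(b3) $\Phi := \Upsilon^{(p)}$, $X \in C_{\Upsilon^{(p)}}(H)$ (additionally) and $\{A_t\}_{t\in\Omega}$ is additionally a
u.e.2s.i.m.c.n.o. family;
(c) if under conditions of (b) the operator $\int_\Omega B_t^*B_t\,d\mu(t)$ is invertible, then
$$\Bigl\|\int_\Omega A_tXB_t\,d\mu(t)\Bigl(\int_\Omega B_t^*B_t\,d\mu(t)\Bigr)^{-1/2}\Bigr\|_\Phi \le \Bigl\|\Bigl(\int_\Omega A_t^*A_t\,d\mu(t)\Bigr)^{1/2} X\Bigr\|_\Phi, \tag{8}$$
under any of the following conditions:
(c1) $\|\cdot\|_\Phi := \|\cdot\|_2$,
(c2) $\Phi := \Upsilon^{(p)}$ and $\{A_t\}_{t\in\Omega}$ is additionally an m.c.n.o. family;
(d) if $A^*, B^* \in L^2_G(\Omega, \mu, B(H))$ and $X\bigl(\int_\Omega B_tB_t^*\,d\mu(t)\bigr)^{1/2} \in C_\Phi(H)$, in which case
$$\Bigl\|\int_\Omega A_tXB_t\,d\mu(t)\Bigr\|_\Phi \le \Bigl\|\int_\Omega A_tA_t^*\,d\mu(t)\Bigr\|^{1/2} \Bigl\|X\Bigl(\int_\Omega B_tB_t^*\,d\mu(t)\Bigr)^{1/2}\Bigr\|_\Phi, \tag{9}$$
under any of the following conditions:
(d1) $\|\cdot\|_\Phi := \|\cdot\|_2$,
(d2) $\Phi := \Upsilon^{(p)}$ and $\{B_t\}_{t\in\Omega}$ is additionally an m.c.n.o. family,
(d3) $\Phi := \Upsilon^{(p)}$, $X \in C_{\Upsilon^{(p)}}(H)$ (additionally) and $\{B_t^*\}_{t\in\Omega}$ is additionally a
u.e.2s.i.m.c.n.o. family;
(e) if under conditions of (d) the operator $\int_\Omega A_tA_t^*\,d\mu(t)$ is invertible, then
$$\Bigl\|\Bigl(\int_\Omega A_tA_t^*\,d\mu(t)\Bigr)^{-1/2}\int_\Omega A_tXB_t\,d\mu(t)\Bigr\|_\Phi \le \Bigl\|X\Bigl(\int_\Omega B_tB_t^*\,d\mu(t)\Bigr)^{1/2}\Bigr\|_\Phi, \tag{10}$$
under any of the following conditions:
(e1) $\|\cdot\|_\Phi := \|\cdot\|_2$,
(e2) $\Phi := \Upsilon^{(p)}$ and $\{B_t\}_{t\in\Omega}$ is additionally an m.c.n.o. family.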
The inequality:
• (6) in the case (a1) of Theorem 2.3 is exactly the case p := q := r := 1 of the
inequality (24) in [15, th. 3.3];
• (6) in the case (a2) of Theorem 2.3 is the inequality (32) in [26, th. 3.1(d)]. If
$X \in C_{\Upsilon^{(p)*}}(H)$, then (6) is the inequality (3.3) in [18, th. 3.1(c)];
• (6) in the case (a3) of Theorem 2.3 is the inequality (23) in [15, th. 3.2]. In the
case of the counting measure μ on $\Omega := \mathbb{N}$, the inequality (6) was proved earlier
in [13, th. 2.2]. If $X \in C_{|||\cdot|||}(H)$, then (6) is the inequality (3.4) in [18, th. 3.1(d)];
• (7) in the case (b1) (resp. (b2)) of Theorem 2.3 is exactly the inequality (28) in
[26, th. 3.1(b)] (resp. the special case $C_t := A_t^*$ and $D_t := B_t$ for all $t \in \Omega$ of the
inequality (33) in [21, lemma 3.4]). In the case (b3) the inequality (7) is exactly
the inequality (3.1) in [18, th. 3.1(a)];
• (9) in the case (d1) (resp. (d2)) of Theorem 2.3 is exactly the inequality (29) in
[26, th. 3.1(b)] (resp. the special case $C_t := A_t^*$ and $D_t := B_t$ for all $t \in \Omega$ of the
inequality (34) in [21, lemma 3.4]). In the case (d3) the inequality (9) is exactly
the inequality (3.2) in [18, th. 3.1(b)];
• (8) in the case (c1) (resp. (c2)) of Theorem 2.3 is derived similarly by applying
the inequality (28) in [26, th. 3.1(b)] to $L^2_G(\Omega, \mu, B(H))$ families $\{A_t\}_{t\in\Omega}$ and
$\bigl\{B_t\bigl(\int_\Omega B_t^*B_t\,d\mu(t)\bigr)^{-1/2}\bigr\}_{t\in\Omega}$ (instead of $\{B_t\}_{t\in\Omega}$) (resp. the inequality (33) in [21,
lemma 3.4] to s.i. families $\{C_t^*\}_{t\in\Omega}$, $\{D_t\}_{t\in\Omega}$, given by $C_t := A_t^*$ and
$D_t := B_t\bigl(\int_\Omega B_t^*B_t\,d\mu(t)\bigr)^{-1/2}$ for all $t \in \Omega$);
• (10) in the case (e1) (resp. (e2)) of Theorem 2.3 is proved by analogy with the proof
of the case (c1) (resp. (c2)).
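We conclude our list of Cauchy–Schwarz inequalities for i.p.t. transformers
within the context of Schatten–von Neumann ideals.
Theorem 2.4 If μ is a σ-finite measure on $\Omega$ and $A, A^*, B, B^* \in L^2_G(\Omega, \mu, B(H))$,
then for all 1 ≤ p, q, r < +∞ satisfying $\frac{2}{p} = \frac{1}{q} + \frac{1}{r}$ and $X \in C_p(H)$
$$\Bigl\|\int_\Omega A_tXB_t\,d\mu(t)\Bigr\|_p \le \Bigl\|\Bigl(\int_\Omega A_t^*\Bigl(\int_\Omega A_tA_t^*\,d\mu(t)\Bigr)^{q-1}A_t\,d\mu(t)\Bigr)^{\frac{1}{2q}} X \Bigl(\int_\Omega B_t\Bigl(\int_\Omega B_t^*B_t\,d\mu(t)\Bigr)^{r-1}B_t^*\,d\mu(t)\Bigr)^{\frac{1}{2r}}\Bigr\|_p, \tag{11}$$
while for all 2 ≤ p < +∞
$$\Bigl\|\int_\Omega A_tXB_t\,d\mu(t)\Bigr\|_p \le \Bigl\|\int_\Omega A_tA_t^*\,d\mu(t)\Bigr\|^{\frac12} \Bigl\|X\Bigl(\int_\Omega B_t\Bigl(\int_\Omega B_t^*B_t\,d\mu(t)\Bigr)^{\frac{p}{2}-1}B_t^*\,d\mu(t)\Bigr)^{\frac1p}\Bigr\|_p, \tag{12}$$
$$\Bigl\|\int_\Omega A_tXB_t\,d\mu(t)\Bigr\|_p \le \Bigl\|\Bigl(\int_\Omega A_t^*\Bigl(\int_\Omega A_tA_t^*\,d\mu(t)\Bigr)^{\frac{p}{2}-1}A_t\,d\mu(t)\Bigr)^{\frac1p} X\Bigr\|_p \Bigl\|\int_\Omega B_t^*B_t\,d\mu(t)\Bigr\|^{\frac12}. \tag{13}$$
If, in addition, $\int_\Omega A_tA_t^*\,d\mu(t)$ and $\int_\Omega B_t^*B_t\,d\mu(t)$ are invertible, then
$$\Bigl\|\Bigl(\int_\Omega A_tA_t^*\,d\mu(t)\Bigr)^{\frac{1}{2q}-\frac12}\int_\Omega A_tXB_t\,d\mu(t)\Bigl(\int_\Omega B_t^*B_t\,d\mu(t)\Bigr)^{\frac{1}{2r}-\frac12}\Bigr\|_p \le \Bigl\|\Bigl(\int_\Omega A_t^*A_t\,d\mu(t)\Bigr)^{\frac{1}{2q}} X \Bigl(\int_\Omega B_tB_t^*\,d\mu(t)\Bigr)^{\frac{1}{2r}}\Bigr\|_p, \tag{14}$$
while for all 2 ≤ p < +∞
$$\Bigl\|\Bigl(\int_\Omega A_tA_t^*\,d\mu(t)\Bigr)^{\frac1p-\frac12}\int_\Omega A_tXB_t\,d\mu(t)\Bigl(\int_\Omega B_t^*B_t\,d\mu(t)\Bigr)^{-\frac12}\Bigr\|_p \le \Bigl\|\Bigl(\int_\Omega A_t^*A_t\,d\mu(t)\Bigr)^{\frac1p} X\Bigr\|_p, \tag{15}$$
$$\Bigl\|\Bigl(\int_\Omega A_tA_t^*\,d\mu(t)\Bigr)^{-\frac12}\int_\Omega A_tXB_t\,d\mu(t)\Bigl(\int_\Omega B_t^*B_t\,d\mu(t)\Bigr)^{\frac1p-\frac12}\Bigr\|_p \le \Bigl\|X\Bigl(\int_\Omega B_tB_t^*\,d\mu(t)\Bigr)^{\frac1p}\Bigr\|_p. \tag{16}$$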

The inequality (11) was proved in [15, th. 3.3], while the inequality (12) (resp. (13))
represents the special case q := +∞ (resp. r := +∞), which actually follows from
the proof of [15, th. 3.3]. In the case of the counting measure μ on $\Omega := \mathbb{N}$, the
special case p := q := r of the inequality (11) was proved earlier in [14, th. 2.1].
The inequality (14) follows directly by applying the inequality (11) to the
s.i. family $\bigl\{\bigl(\int_\Omega A_tA_t^*\,d\mu(t)\bigr)^{\frac{1}{2q}-\frac12}A_t\bigr\}_{t\in\Omega}$ instead of $\{A_t\}_{t\in\Omega}$ and to the s.i. family
$\bigl\{B_t\bigl(\int_\Omega B_t^*B_t\,d\mu(t)\bigr)^{\frac{1}{2r}-\frac12}\bigr\}_{t\in\Omega}$ instead of $\{B_t\}_{t\in\Omega}$.
Similarly, the inequality (15) (resp. (16)) follows immediately from the
inequality (13) (resp. (12)) applied to s.i. families $\bigl\{\bigl(\int_\Omega A_tA_t^*\,d\mu(t)\bigr)^{\frac1p-\frac12}A_t\bigr\}_{t\in\Omega}$ and
$\bigl\{B_t\bigl(\int_\Omega B_t^*B_t\,d\mu(t)\bigr)^{-\frac12}\bigr\}_{t\in\Omega}$ (resp. to s.i. families $\bigl\{\bigl(\int_\Omega A_tA_t^*\,d\mu(t)\bigr)^{-\frac12}A_t\bigr\}_{t\in\Omega}$ and
$\bigl\{B_t\bigl(\int_\Omega B_t^*B_t\,d\mu(t)\bigr)^{\frac1p-\frac12}\bigr\}_{t\in\Omega}$).
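
The substitution behind the passage from (11) to (14) can be made explicit. The sketch below writes Ω for the underlying measure space; the abbreviations M, N and the rescaled families marked with a tilde are introduced here only for illustration.

```latex
% Notation of this sketch only:
%   M := \int_\Omega A_t A_t^* \, d\mu(t),   N := \int_\Omega B_t^* B_t \, d\mu(t)   (both invertible),
%   \tilde A_t := M^{\frac{1}{2q}-\frac12} A_t,   \tilde B_t := B_t N^{\frac{1}{2r}-\frac12}.
% Requires amsmath.
\begin{align*}
\int_\Omega \tilde A_t \tilde A_t^* \, d\mu(t)
  &= M^{\frac{1}{2q}-\frac12}\, M\, M^{\frac{1}{2q}-\frac12} = M^{\frac1q}, \\
\int_\Omega \tilde A_t^* \bigl(M^{\frac1q}\bigr)^{q-1} \tilde A_t \, d\mu(t)
  &= \int_\Omega A_t^*\, M^{(\frac{1}{2q}-\frac12) + (1-\frac1q) + (\frac{1}{2q}-\frac12)} A_t \, d\mu(t)
   = \int_\Omega A_t^* A_t \, d\mu(t), \\
\int_\Omega \tilde B_t \bigl(N^{\frac1r}\bigr)^{r-1} \tilde B_t^* \, d\mu(t)
  &= \int_\Omega B_t B_t^* \, d\mu(t), \\
\int_\Omega \tilde A_t X \tilde B_t \, d\mu(t)
  &= M^{\frac{1}{2q}-\frac12} \int_\Omega A_t X B_t \, d\mu(t)\; N^{\frac{1}{2r}-\frac12}.
\end{align*}
% Inserting these four expressions into (11), written for the tilde families, gives (14).
```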

3 Aczél–Bellman Type Norm Inequalities for i.p.t. Transformers

The Aczél–Bellman u.i. norm inequality for i.p.t. transformers was presented in [15,
th. 4.1], saying that
Theorem 3.1 If μ is a σ-finite measure on $\Omega$, $\{A_t\}_{t\in\Omega}$ and $\{B_t\}_{t\in\Omega}$ are m.c.n.o.
families in $L^2_G(\Omega, \mu, B(H))$, such that $\int_\Omega A_t^*A_t\,d\mu(t)$ and $\int_\Omega B_tB_t^*\,d\mu(t)$ are
contractions and $X - \int_\Omega A_tXB_t\,d\mu(t) \in C_{|||\cdot|||}(H)$ for some X ∈ B(H), then
$$\Bigl|\Bigl|\Bigl|\Bigl(I - \int_\Omega A_t^*A_t\,d\mu(t)\Bigr)^{1/2} X \Bigl(I - \int_\Omega B_tB_t^*\,d\mu(t)\Bigr)^{1/2}\Bigr|\Bigr|\Bigr| \le \Bigl|\Bigl|\Bigl|X - \int_\Omega A_tXB_t\,d\mu(t)\Bigr|\Bigr|\Bigr|. \tag{17}$$

This complements the Cauchy–Schwarz u.i. norm inequality [15, th. 3.2], also
generalizing the previous inequality in [13, th. 2.3] in two directions.
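
In the simplest commutative setting the inequality (17) reduces to the classical Aczél–Bellman inequality. A sketch under the following assumptions: H is one-dimensional, μ is the counting measure on {1, …, N}, X = 1, and the scalars a_n, b_n (names introduced here for illustration) play the roles of the operators A_t, B_t.

```latex
% Assumptions of this sketch: a, b \in \mathbb{C}^N with
%   \|a\|^2 := \sum_{n=1}^N |a_n|^2 \le 1   and   \|b\|^2 := \sum_{n=1}^N |b_n|^2 \le 1.
\[
  \sqrt{1 - \|a\|^2}\,\sqrt{1 - \|b\|^2}
  \;\le\; 1 - \|a\|\,\|b\|
  \;\le\; \Bigl| 1 - \sum_{n=1}^N a_n b_n \Bigr| .
\]
% First inequality: (1 - xy)^2 - (1 - x^2)(1 - y^2) = (x - y)^2 \ge 0 for x, y \in [0,1].
% Second inequality: Cauchy–Schwarz gives |\sum a_n b_n| \le \|a\|\,\|b\| \le 1.
```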
In the sequel, we will need the following terminology.
Definition 3.2 For an analytic function f on some neighborhood of zero, let
$R_f := \bigl(\limsup_{n\to\infty}\sqrt[n]{|c_n|}\bigr)^{-1}$ denote the radius of convergence of its Taylor
(power) series $(f(z) =) \sum_{n=0}^\infty c_nz^n$, with $0 < R_f \le +\infty$. For $A \in L^2_G(\Omega, \mu, B(H))$
and for an analytic function f with non-negative Taylor coefficients, satisfying
$r\bigl(\int_\Omega A_t^*\otimes A_t\,d\mu(t)\bigr) := \inf_{n\in\mathbb{N}}\bigl\|\bigl(\int_\Omega A_t^*\otimes A_t\,d\mu(t)\bigr)^n(I)\bigr\|^{\frac1n} \le R_f$ (when the transformer
$\int_\Omega A_t^*\otimes A_t\,d\mu(t)$ acts on B(H)) and f(0) > 0, we define its associated generalized
spectral defect operator $\Delta_{f,A}$ by
$$\Delta_{f,A} := [s]\lim_{\rho\nearrow 1}\Bigl(f\Bigl(\rho^2\int_\Omega A_t^*\otimes A_t\,d\mu(t)\Bigr)(I)\Bigr)^{-1/2}.$$
For the function $f : \mathbb{D} \to \mathbb{C} : z \mapsto \sum_{n=0}^\infty z^n$ (in which case $R_f = 1$) $\Delta_A := \Delta_{f,A}$
was introduced in [21, def. 2.1], while in the case that μ is the counting measure
on ℕ the operator $\Delta_{f,A}$ was given in [23, def. 2.3]. In both cases above it follows from
[21, lemma 2.2] and [23, rem. 2, rem. 7] that $\Delta_{f,A} = f\bigl(\int_\Omega A_t^*A_t\,d\mu(t)\bigr)^{-1/2}$ if
$A := \{A_t\}_{t\in\Omega}$ is an s.i. and m.c.n.o. family.
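
For the geometric series the defect operator of Definition 3.2 can be computed by hand in the commuting normal case. The sketch below writes Ω for the measure space and Δ for the defect operator; the abbreviation P is introduced here only for illustration.

```latex
% Assumptions of this sketch: f(z) = \sum_{n \ge 0} z^n = (1 - z)^{-1}, so R_f = 1;
% \{A_t\} is an s.i. and m.c.n.o. family; P := \int_\Omega A_t^* A_t \, d\mu(t) satisfies 0 \le P \le I.
% By the Fuglede theorem every A_t commutes with every A_s^*, hence with P, and so
\[
  \Bigl( \int_\Omega A_t^* \otimes A_t \, d\mu(t) \Bigr)^{n} (I) = P^{n}, \qquad n \ge 0 .
\]
% Therefore, for 0 < \rho < 1,
\[
  f\Bigl( \rho^2 \int_\Omega A_t^* \otimes A_t \, d\mu(t) \Bigr)(I)
  = \sum_{n=0}^{\infty} \rho^{2n} P^{n} = (I - \rho^2 P)^{-1},
\]
% and, since \sqrt{1 - \rho^2 x} \to \sqrt{1 - x} uniformly on [0,1] as \rho \nearrow 1,
\[
  \Delta_{f,A} = \lim_{\rho \nearrow 1} (I - \rho^2 P)^{1/2} = (I - P)^{1/2} = f(P)^{-1/2} .
\]
% This is the factor that appears on the left-hand side of (17).
```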
The above definition helps us formulate the following generalization of
Theorem 3.1, which reformulates [21, th. 3.5]:
Theorem 3.3 Let $\Upsilon$ be an s.n. function and p ≥ 2. If $A^*, B \in L^2_G(\Omega, \mu, B(H))$ are
such that $r\bigl(\int_\Omega A_t\otimes A_t^*\,d\mu(t)\bigr) \le 1$, $r\bigl(\int_\Omega B_t^*\otimes B_t\,d\mu(t)\bigr) \le 1$ and at least one of
families $\{A_t\}_{t\in\Omega}$ or $\{B_t\}_{t\in\Omega}$ is an m.c.n.o. family, then for $X \in C_{\Upsilon^{(p)}}(H)$
$$\bigl\|\Delta_{A^*}X\Delta_B\bigr\|_{\Upsilon^{(p)}} \le \Bigl\|X - \int_\Omega A_tXB_t\,d\mu(t)\Bigr\|_{\Upsilon^{(p)}}.$$

Theorem 3.4 Let $\Upsilon$ be an s.n. function, p ≥ 2, $X \in C_{\Upsilon^{(p)*}}(H)$, $L^2(\Omega, \mu)$ be
separable and let $A, B^* \in L^2_G(\Omega, \mu, B(H))$ be such that at least one of families
$\{A_t\}_{t\in\Omega}$ or $\{B_t^*\}_{t\in\Omega}$ is an m.c.n.o. family.
(a) If, in addition, $\sum_{n=1}^\infty\int_{\Omega^n}\bigl(\|A_{t_1}\cdots A_{t_n}h\|^2 + \|B_{t_1}^*\cdots B_{t_n}^*h\|^2\bigr)\,d\mu^n(t_1, \dots, t_n) < +\infty$
for all h ∈ H, then $r\bigl(\int_\Omega A_t^*\otimes A_t\,d\mu(t)\bigr) \le 1$, $r\bigl(\int_\Omega B_t\otimes B_t^*\,d\mu(t)\bigr) \le 1$ and
$$\|X\|_{\Upsilon^{(p)*}} \le \Bigl\|\Delta_A^{-1}\Bigl(X - \int_\Omega A_tXB_t\,d\mu(t)\Bigr)\Delta_{B^*}^{-1}\Bigr\|_{\Upsilon^{(p)*}}.$$
(b) Alternatively, if additionally $\int_\Omega A_t^*A_t\,d\mu(t) \le I$ and $\int_\Omega B_tB_t^*\,d\mu(t) \le I$, then
the inequality (17) remains valid if $|||\cdot|||$ is replaced by $\|\cdot\|_{\Upsilon^{(p)*}}$.
The case (a) (resp. (b)) of Theorem 3.4 is a reformulation of the case (a) (resp. (c))
of [26, th. 3.2].
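In the context of Schatten–von Neumann ideals the counterpart of
Theorems 3.1, 3.3 and 3.4 says:
Theorem 3.5 Let $A^*, B \in L^2_G(\Omega, \mu, B(H))$ be such that $r\bigl(\int_\Omega A_t\otimes A_t^*\,d\mu(t)\bigr) \le 1$
and $r\bigl(\int_\Omega B_t^*\otimes B_t\,d\mu(t)\bigr) \le 1$. Then for all X ∈ B(H)
$$\bigl\|\Delta_{A^*}X\Delta_B\bigr\| \le \Bigl\|X - \int_\Omega A_tXB_t\,d\mu(t)\Bigr\|.$$
If additionally p ≥ 2 and $\sum_{n=1}^\infty\int_{\Omega^n}\|A_{t_1}\cdots A_{t_n}h\|^2\,d\mu^n(t_1, \dots, t_n) < +\infty$ for all
h ∈ H, then $r\bigl(\int_\Omega A_t^*\otimes A_t\,d\mu(t)\bigr) \le 1$ and for all $X \in C_p(H)$
$$\bigl\|\Delta_{A^*}^{1-\frac2p}X\Delta_B\bigr\|_p \le \Bigl\|\Delta_A^{-\frac2p}\Bigl(X - \int_\Omega A_tXB_t\,d\mu(t)\Bigr)\Bigr\|_p.$$
Similarly, if p ≥ 2 and $\sum_{n=1}^\infty\int_{\Omega^n}\|B_{t_1}^*\cdots B_{t_n}^*h\|^2\,d\mu^n(t_1, \dots, t_n) < +\infty$ for all
h ∈ H, then $r\bigl(\int_\Omega B_t\otimes B_t^*\,d\mu(t)\bigr) \le 1$ and for all $X \in C_p(H)$
$$\bigl\|\Delta_{A^*}X\Delta_B^{1-\frac2p}\bigr\|_p \le \Bigl\|\Bigl(X - \int_\Omega A_tXB_t\,d\mu(t)\Bigr)\Delta_{B^*}^{-\frac2p}\Bigr\|_p.$$
If $\sum_{n=1}^\infty\int_{\Omega^n}\bigl(\|A_{t_1}\cdots A_{t_n}h\|^2 + \|B_{t_1}^*\cdots B_{t_n}^*h\|^2\bigr)\,d\mu^n(t_1, \dots, t_n) < +\infty$ for all
h ∈ H and p, q, r ≥ 1 are such that $\frac2p = \frac1q + \frac1r$, then $r\bigl(\int_\Omega A_t^*\otimes A_t\,d\mu(t)\bigr) \le 1$,
$r\bigl(\int_\Omega B_t\otimes B_t^*\,d\mu(t)\bigr) \le 1$ and for all $X \in C_p(H)$
$$\bigl\|\Delta_{A^*}^{1-\frac1q}X\Delta_B^{1-\frac1r}\bigr\|_p \le \Bigl\|\Delta_A^{-\frac1q}\Bigl(X - \int_\Omega A_tXB_t\,d\mu(t)\Bigr)\Delta_{B^*}^{-\frac1r}\Bigr\|_p.$$

Theorem 3.5 is just a modest reformulation of [21, th. 3.1].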


4 Norm Inequalities for Transformers Generated by Analytic Functions with Non-negative Taylor Coefficients

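In this section we will show that Aczél–Bellman type norm inequalities in the
previous section are just a part of a wider family of Cauchy–Schwarz norm
inequalities.
Theorem 4.1 Let $\Phi$, $\Upsilon$ be s.n. functions, let p ≥ 2, let f be an analytic function
with non-negative Taylor coefficients and $X \in C_\Phi(H)$. Then
$$\Bigl\|f\Bigl(\sum_{n=1}^\infty A_n\otimes B_n\Bigr)X\Bigr\|_\Phi \le \Bigl\|\sqrt{f\Bigl(\sum_{n=1}^\infty A_n^*\otimes A_n\Bigr)(I)}\;X\sqrt{f\Bigl(\sum_{n=1}^\infty B_n\otimes B_n^*\Bigr)(I)}\Bigr\|_\Phi,$$
if both $\{A_n\}_{n=1}^\infty$ and $\{B_n^*\}_{n=1}^\infty$ are s.s. families such that $\bigl\|\sum_{n=1}^\infty A_n^*A_n\bigr\| < R_f$,
$\bigl\|\sum_{n=1}^\infty B_nB_n^*\bigr\| < R_f$ and one of the following additional conditions is satisfied:
(a) $\Phi := \Upsilon^{(p)*}$ and at least one of $\{A_n\}_{n=1}^\infty$ or $\{B_n^*\}_{n=1}^\infty$ is an m.c.n.o. family,
(b) $\|\cdot\|_\Phi := |||\cdot|||$ and both $\{A_n\}_{n=1}^\infty$ and $\{B_n^*\}_{n=1}^\infty$ are m.c.n.o. families.
(c) If $\Phi := \Upsilon^{(p)}$, $\{A_n^*\}_{n=1}^\infty$ and $\{B_n\}_{n=1}^\infty$ are s.s. families, such that
$\bigl\|\sum_{n=1}^\infty A_nA_n^*\bigr\| < R_f$, $\bigl\|\sum_{n=1}^\infty B_n^*B_n\bigr\| < R_f$, then:
$$\Bigl\|f\Bigl(\sum_{n=1}^\infty A_n\otimes B_n\Bigr)X\Bigr\|_{\Upsilon^{(p)}} \le \Bigl\|\sqrt{f\Bigl(\sum_{n=1}^\infty A_n^*A_n\Bigr)}\;X\Bigr\|_{\Upsilon^{(p)}}\Bigl\|f\Bigl(\sum_{n=1}^\infty B_n^*\otimes B_n\Bigr)(I)\Bigr\|^{\frac12}$$
(c1) if, in addition, $\{A_n\}_{n=1}^\infty$ is an m.c.n.o. family;
$$\Bigl\|f\Bigl(\sum_{n=1}^\infty A_n\otimes B_n\Bigr)X\Bigr\|_{\Upsilon^{(p)}} \le \Bigl\|f\Bigl(\sum_{n=1}^\infty A_n\otimes A_n^*\Bigr)(I)\Bigr\|^{\frac12}\Bigl\|X\sqrt{f\Bigl(\sum_{n=1}^\infty B_n^*B_n\Bigr)}\Bigr\|_{\Upsilon^{(p)}}$$
(c2) if, in addition, $\{B_n\}_{n=1}^\infty$ is an m.c.n.o. family.
$$\Bigl\|f\Bigl(\sum_{n=1}^\infty A_n\otimes B_n\Bigr)(X)\Bigl(f\Bigl(\sum_{n=1}^\infty B_n^*\otimes B_n\Bigr)(I)\Bigr)^{-\frac12}\Bigr\|_{\Upsilon^{(p)}} \le \Bigl\|\sqrt{f\Bigl(\sum_{n=1}^\infty A_n^*A_n\Bigr)}\;X\Bigr\|_{\Upsilon^{(p)}}$$
(d1) if $f\bigl(\sum_{n=1}^\infty B_n^*\otimes B_n\bigr)(I)$ is invertible in addition to conditions (c) and (c1);
$$\Bigl\|\Bigl(f\Bigl(\sum_{n=1}^\infty A_n\otimes A_n^*\Bigr)(I)\Bigr)^{-\frac12}f\Bigl(\sum_{n=1}^\infty A_n\otimes B_n\Bigr)X\Bigr\|_{\Upsilon^{(p)}} \le \Bigl\|X\sqrt{f\Bigl(\sum_{n=1}^\infty B_n^*B_n\Bigr)}\Bigr\|_{\Upsilon^{(p)}}$$
(d2) if, in addition to conditions (c) and (c2), $f\bigl(\sum_{n=1}^\infty A_n\otimes A_n^*\bigr)(I)$ is invertible.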

Theorem 4.1(a) (resp. (b)) is a reformulation of [23, th. 2.4(a)] (resp. [23,
th. 2.2]), while cases (c1) and (c2) are based on [23, th. 2.4(b)]. The case (d1)
(resp. (d2)) follows by applying the inequality (8) in Theorem 2.3(c2) (resp.
the inequality (10) in Theorem 2.3(e2)) to the counting measure μ on
$\Omega := \{0\} \cup \bigcup_{m=1}^\infty \mathbb{N}^m$ and to families
$\{\sqrt{c_0}\,I\} \cup \bigl\{\sqrt{c_m}\,A_{n_m^{(m)}}\cdots A_{n_1^{(m)}}\bigr\}$ and
$\{\sqrt{c_0}\,I\} \cup \bigl\{\sqrt{c_m}\,B_{n_1^{(m)}}\cdots B_{n_m^{(m)}}\bigr\}$
(both indexed by $m \ge 1$, $1 \le k \le m$, $1 \le n_k^{(m)}$), combined with the arguments already used in the proof
of [23, th. 2.2].
For two-sided multipliers $A \otimes B$, Theorem 4.1 remains valid under some additional
subnormality conditions for $A$ and $B^*$, as stated in [18, th. 3.2]:
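Theorem 4.2 Let $\Phi, \Upsilon$ be s.n. functions, $p \geq 2$, let $f(z) \stackrel{\mathrm{def}}{=} \sum_{n=0}^\infty c_n z^n$ be an analytic
function with non-negative Taylor coefficients $c_n \geq 0$ and let $A, B \in \mathcal{B}(\mathcal{H})$ have their
spectra $\sigma(A) \cup \sigma(B) \subset D\bigl(0, R_f^{1/2}\bigr)$. Then for all $X \in \mathcal{C}_\Phi(\mathcal{H})$
\[
\Bigl\| \sum_{n=0}^\infty c_n A^n X B^n \Bigr\|_\Phi = \bigl\| f(A \otimes B) X \bigr\|_\Phi
\leq \Bigl\| \sqrt{f(A^* \otimes A)(I)}\; X \Bigr\|_\Phi \bigl\| f(B^* \otimes B)(I) \bigr\|^{\frac12}
= \Bigl\| \Bigl( \sum_{n=0}^\infty c_n A^{*n} A^n \Bigr)^{\frac12} X \Bigr\|_\Phi \Bigl\| \sum_{n=0}^\infty c_n B^{*n} B^n \Bigr\|^{\frac12}
\]
(a1) if $\Phi := \Upsilon^{(p)}$ and $A$ is subnormal,
(a2) if $\|\cdot\|_\Phi := \|\cdot\|_2$;
\[
\Bigl\| \sum_{n=0}^\infty c_n A^n X B^n \Bigr\|_\Phi \leq
\Bigl\| \sqrt{f(A^*A)}\; X \Bigr\|_\Phi \Bigl\| \sum_{n=0}^\infty c_n B^{*n} B^n \Bigr\|^{\frac12}
\]
(b) if $\Phi := \Upsilon^{(p)}$ and $A$ is quasinormal;
\[
\Bigl\| \sum_{n=0}^\infty c_n A^n X B^n \Bigr\|_\Phi \leq
\bigl\| f(A \otimes A^*)(I) \bigr\|^{\frac12} \Bigl\| X \sqrt{f(B \otimes B^*)(I)} \Bigr\|_\Phi
= \Bigl\| \sum_{n=0}^\infty c_n A^n A^{*n} \Bigr\|^{\frac12} \Bigl\| X \Bigl( \sum_{n=0}^\infty c_n B^n B^{*n} \Bigr)^{\frac12} \Bigr\|_\Phi
\]
(c1) if $\Phi := \Upsilon^{(p)}$ and $B^*$ is subnormal,
(c2) if $\|\cdot\|_\Phi := \|\cdot\|_2$;
\[
\Bigl\| \sum_{n=0}^\infty c_n A^n X B^n \Bigr\|_\Phi \leq
\Bigl\| \sum_{n=0}^\infty c_n A^n A^{*n} \Bigr\|^{\frac12} \Bigl\| X \sqrt{f(BB^*)} \Bigr\|_\Phi
\]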
Cauchy–Schwarz Inequalities for i.p.t. Transformers 191

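(d) if $\Phi := \Upsilon^{(p)}$ and $B^*$ is quasinormal;
\[
\Bigl\| \sum_{n=0}^\infty c_n A^n X B^n \Bigr\|_\Phi \leq
\Bigl\| \sqrt{f(A^* \otimes A)(I)}\; X \sqrt{f(B \otimes B^*)(I)} \Bigr\|_\Phi
= \Bigl\| \Bigl( \sum_{n=0}^\infty c_n A^{*n} A^n \Bigr)^{\frac12} X \Bigl( \sum_{n=0}^\infty c_n B^n B^{*n} \Bigr)^{\frac12} \Bigr\|_\Phi
\]
(e1) if $\Phi := \Upsilon^{(p)^*}$ and $A$ or $B^*$ is subnormal,
(e2) if $\|\cdot\|_\Phi := |||\cdot|||$ and both $A$ and $B^*$ are subnormal,
(e3) if $\|\cdot\|_\Phi := \|\cdot\|_1$;
\[
\Bigl\| \sum_{n=0}^\infty c_n A^n X B^n \Bigr\|_\Phi \leq
\Bigl\| \Bigl( \sum_{n=0}^\infty c_n A^{*n} A^n \Bigr)^{\frac12} X \sqrt{f(BB^*)} \Bigr\|_\Phi
\]
(f1) if $\Phi := \Upsilon^{(p)^*}$ and $B^*$ is quasinormal,
(f2) if $\|\cdot\|_\Phi := |||\cdot|||$, $B^*$ is quasinormal and $A$ is subnormal;
\[
\Bigl\| \sum_{n=0}^\infty c_n A^n X B^n \Bigr\|_\Phi \leq
\Bigl\| \sqrt{f(A^*A)}\; X \Bigl( \sum_{n=0}^\infty c_n B^n B^{*n} \Bigr)^{\frac12} \Bigr\|_\Phi
\]
(g1) if $\Phi := \Upsilon^{(p)^*}$ and $A$ is quasinormal,
(g2) if $\|\cdot\|_\Phi := |||\cdot|||$, $A$ is quasinormal and $B^*$ is subnormal;
\[
\Bigl|\Bigl|\Bigl| \sum_{n=0}^\infty c_n A^n X B^n \Bigr|\Bigr|\Bigr| \leq
\Bigl|\Bigl|\Bigl| \sqrt{f(A^*A)}\; X \sqrt{f(BB^*)} \Bigr|\Bigr|\Bigr|
\]
(h) if $\|\cdot\|_\Phi := |||\cdot|||$ and both $A$ and $B^*$ are quasinormal.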
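As a numerical sanity check (not taken from [18]), the Hilbert–Schmidt case (a2) of Theorem 4.2 can be tested on random matrices. The sketch below assumes NumPy, takes $f$ to be a polynomial with non-negative coefficients (so that $R_f = +\infty$ and the spectral condition holds automatically), and reads the inequality as $\|\sum_n c_n A^n X B^n\|_2 \leq \|(\sum_n c_n A^{*n}A^n)^{1/2} X\|_2\, \|\sum_n c_n B^{*n}B^n\|^{1/2}$. All function names are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def poly_transformer(coeffs, A, X, B):
    """Return sum_n c_n A^n X B^n for a finite list of coefficients c_n >= 0."""
    dim = A.shape[0]
    total = np.zeros((dim, dim), dtype=complex)
    An = np.eye(dim, dtype=complex)
    Bn = np.eye(dim, dtype=complex)
    for c in coeffs:
        total = total + c * (An @ X @ Bn)
        An = An @ A
        Bn = Bn @ B
    return total

def gram_sum(coeffs, A):
    """Return sum_n c_n (A^*)^n A^n, i.e. f(A^* (x) A)(I) for a polynomial f."""
    dim = A.shape[0]
    total = np.zeros((dim, dim), dtype=complex)
    An = np.eye(dim, dtype=complex)
    for c in coeffs:
        total = total + c * (An.conj().T @ An)
        An = An @ A
    return total

def psd_sqrt(P):
    """Square root of a positive semidefinite Hermitian matrix via eigh."""
    w, V = np.linalg.eigh((P + P.conj().T) / 2)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.conj().T

def hilbert_schmidt_sides(coeffs, A, X, B):
    """Return (lhs, rhs) of the assumed Hilbert-Schmidt form of case (a2)."""
    lhs = np.linalg.norm(poly_transformer(coeffs, A, X, B), 'fro')
    weighted = psd_sqrt(gram_sum(coeffs, A)) @ X
    # Operator (spectral) norm of the positive matrix sum_n c_n B^{*n} B^n.
    op_norm = np.linalg.norm(gram_sum(coeffs, B), 2)
    rhs = np.linalg.norm(weighted, 'fro') * np.sqrt(op_norm)
    return lhs, rhs
```

In the Hilbert–Schmidt norm this bound follows from writing $\sum_n A_n X B_n$ as a row of the $A_n$ times a column of the $B_n$, with $A_n := \sqrt{c_n} A^n$ and $B_n := \sqrt{c_n} B^n$, which is why no subnormality is needed in this particular case.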

For Schatten–von Neumann ideals Theorem 4.1 has the following counterpart:
Theorem 4.3 Let $p, q, r \geq 1$ satisfy $\frac{2}{p} = \frac{1}{q} + \frac{1}{r}$, $X \in \mathcal{C}_p(\mathcal{H})$, let $\{A_n\}_{n=1}^\infty$, $\{A_n^*\}_{n=1}^\infty$,
$\{B_n\}_{n=1}^\infty$ and $\{B_n^*\}_{n=1}^\infty$ be s.s. families in $\mathcal{B}(\mathcal{H})$ and let $f$ be an analytic function with
non-negative Taylor coefficients.
6 ∞ ∞ ∞ ∞ 7
(a) If max A∗n An , An A∗n , Bn∗ Bn , Bn Bn∗
∞ n=1
∗
∞ ∗
n=1 n=1 n=1
≤ Rf , n=1 An An n=1 Bn Bn < Rf , f (0) > 0 and for all h ∈ H
2

∞
∞ 2 2
cm A (m) · · · A (m) h + B ∗(m) · · · B ∗(m) h < +∞, then
m=1 (m) (m) n1 nm n1 nm
n1 ,...,nm =1

∞

1− 1 1− 1 −1 −1
f,Aq∗ f An ⊗Bn (X)f,Br ≤ f,A
q r
Xf,B ∗
p
. (18)
p
n=1

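(b) If $\max\Bigl\{ \bigl\| \sum_{n=1}^\infty A_n^* A_n \bigr\|, \bigl\| \sum_{n=1}^\infty A_n A_n^* \bigr\|, \bigl\| \sum_{n=1}^\infty B_n^* B_n \bigr\|, \bigl\| \sum_{n=1}^\infty B_n B_n^* \bigr\| \Bigr\} < R_f$, then
\[
\Bigl\| f\Bigl(\sum_{n=1}^\infty A_n \otimes B_n\Bigr) X \Bigr\|_p \leq
\Bigl\| \Bigl( f\Bigl(\sum_{n=1}^\infty A_n^* \otimes A_n\Bigr) \Bigl( f\Bigl(\sum_{n=1}^\infty A_n \otimes A_n^*\Bigr)(I)^{q-1} \Bigr) \Bigr)^{\frac{1}{2q}}
X \Bigl( f\Bigl(\sum_{n=1}^\infty B_n \otimes B_n^*\Bigr) \Bigl( f\Bigl(\sum_{n=1}^\infty B_n^* \otimes B_n\Bigr)(I)^{r-1} \Bigr) \Bigr)^{\frac{1}{2r}} \Bigr\|_p , \tag{19}
\]
and if additionally $f(0) > 0$, then also
\[
\Bigl\| f\Bigl(\sum_{n=1}^\infty A_n \otimes A_n^*\Bigr)(I)^{\frac{1}{2q} - \frac12}\, f\Bigl(\sum_{n=1}^\infty A_n \otimes B_n\Bigr)(X)\, f\Bigl(\sum_{n=1}^\infty B_n^* \otimes B_n\Bigr)(I)^{\frac{1}{2r} - \frac12} \Bigr\|_p
\leq \Bigl\| f\Bigl(\sum_{n=1}^\infty A_n^* \otimes A_n\Bigr)(I)^{\frac{1}{2q}}\, X\, f\Bigl(\sum_{n=1}^\infty B_n \otimes B_n^*\Bigr)(I)^{\frac{1}{2r}} \Bigr\|_p . \tag{20}
\]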

The part (a), including the inequality (18), is a simple reformulation of [23,
th. 2.9], while the inequality (19) in part (b) is also a reformulation of [23, th. 2.8].
∞ ∗ −1/2
According to [23, rem. 7] we can conclude that f,A = f n=1 An⊗An (I )
based on the conditions given in part (b). With analogous formulas for f,A∗ , f,B
and f,B ∗ and by the fact that conditions in (b) provides the fulfilment of all
requirements in (a), then the inequality (19) becomes the proclaimed inequality (20).
The next is a list of selected inequalities from [23, cor. 2.5, cor. 2.6]
Corollary 4.4 Let ϒ be a s.n. function, let p ≥ 2, α, β ∈ [−1, 1], γ 0 and let
A, B ∈ B(H), X ∈ Cϒ (p)(H) and Y ∈ Cϒ (p)∗(H).
(a) If A and B are strict contractions, i.e., max{||A||, ||B ||} < 1, with B being
additionally normal, then

(A⊗B)X+ (I − A ⊗ B) log I − A⊗B X (p)
≤
ϒ
∞
AnA∗n 1/2
X B ∗B + (I − B ∗B) log I − B ∗B ,
n(n − 1) ϒ
(p)
n=2
γ γ
I − A⊗B Y + α I + βA⊗B Y ϒ (p) ∗ ≤
∞
1/2 γ γ
(−1)n+αβ n γn A∗nAn Y I − B ∗B + α I + βB ∗B .
(p) ∗
n=0 ϒ

(b) If A is normal, then

exp(A⊗B)X + α exp(βA⊗B)X (p)

≤
ϒ

∞
1/2
1+αβ n
exp(A∗A) + α exp(βA∗A)X (p) n! B ∗nB n ,
ϒ
n=0

exp(A⊗B)Y + α exp(βA⊗B)Y (p) ∗ ≤

∞
1/2
1+αβ n
exp(A∗A) + α exp(βA∗A) Y n! B nB ∗n ,
(p) ∗
n=0 ϒ

n
Ak ⊗B k
X − (I − A⊗B) exp X
k ϒ
(p)
k=1
9
:
:
n
(A∗A)k
≤ ;I − (I − A∗A) exp X
k ϒ
(p)
k=1

n
B ∗k ⊗B k 1/2
∗
× I − (I − B ⊗B) exp (I ) ,
k
k=1

n
Ak ⊗B k
Y − (I − A⊗B) exp Y
k (p) ∗
k=1 ϒ
9
:
:
n
(A∗A)k
≤ ;I − (I − A∗A) exp Y
k
k=1
9
:
:
n
B k ⊗ B ∗k
× ;I − (I − B⊗B ∗ ) exp (I ) .
k ϒ
(p) ∗
k=1

Also, the list of selected inequalities from [18, th. 3.4] is displayed in
Theorem 4.5 Let ϒ be a s.n. function, let p ≥ 2, let α, β ∈ [−1, 1] and let
A, B, C, D, T ∈ B(H), with A, T ∗ being subnormal and C, D ∗ being quasinormal,
such that their spectra satisfy σ (A) ∪ σ (B) ∪ σ (C) ∪ σ (D) ∪ σ (T ) ⊂ D. Then for
all X ∈ Cϒ (p)(H), Y ∈ Cϒ (p)∗(H) and Z ∈ Cϒ (H)

α log I + βC⊗B X − log I − C⊗B X (p)
≤
ϒ
∞
1/2
1−α(−β)n
α log(I + βC ∗ C) − log I − C ∗ C X n B ∗nB n ,
(p)
ϒ n=1

arcsin(C⊗B)X + α log βC⊗B + I + β 2 C 2 ⊗B 2 X (p)
≤
ϒ
=

arcsin(C ∗ C) + α log βC ∗ C + I + β 2 (C ∗ C)2 X
(p)
ϒ
∞
1/2
(1+(−1)nαβ 2n+1 )(2n)!
× 22n (n!)2 (2n+1)
(B ∗ )2n+1 B 2n+1 ,
n=0

π βπ
tan C⊗B X + α tanh C⊗B X ≤
2 2 ϒ
(p)
=
π ∗ βπ ∗
tan C C + α tanh C C X
2 2 ϒ
(p)

∞
1/2
(22n −1)π 2n−1 (|B 2n |+αβ
2n−1 B )
× 2(2n)!
2n
(B ∗ )2n−1 B 2n−1 ,
n=1

α log I + βA⊗B Y − log I − A⊗B Y ϒ
(p) ∗ ≤
∞
1/2 ∞
1/2
1−α(−β)n 1−α(−β)n
n A∗nAn Y n B n B ∗n ,
(p) ∗
n=1 n=1 ϒ

α log I + βA⊗T Z − log I − A⊗T Z ≤
∞
1/2 ∞
1/2
1−α(−β)n 1−α(−β)n
n A∗nAn Z n T n T ∗n ,
n=1 n=1

arcsin(C⊗D)Z + α log βC⊗D + I + β 2 C 2 ⊗D 2 Z ≤
=

arcsin(C ∗ C) + α log βC ∗ C + I + β 2 (C ∗ C)2 Z
=

× arcsin(DD ∗ ) + α log βDD ∗ + I + β 2 (DD ∗ )2 ,

π βπ
tan C⊗D Z + α tanh C⊗D Z ≤
2 2
= =
π ∗ βπ ∗ π βπ
tan C C + α tanh C C Z tan DD ∗ + α tanh DD ∗ .
2 2 2 2

5 Grüss–Landau Type Norm Inequalities for i.p.t. Transformers

We begin this section by presenting [20, th. 2.4]:

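Theorem 5.1 Let $\mu$ be a probability measure on $\Omega$, let $1 \leq p, q, r < +\infty$ satisfy
$\frac{2}{p} = \frac{1}{q} + \frac{1}{r}$ and let $A, A^*, B, B^* \in L^2_G(\Omega, \mu, \mathcal{B}(\mathcal{H}))$. Then for all $X \in \mathcal{C}_p(\mathcal{H})$
\[
\Bigl\| \int_\Omega A_t X B_t \,d\mu(t) - \int_\Omega A_t \,d\mu(t)\, X \int_\Omega B_t \,d\mu(t) \Bigr\|_p \leq
\]
\[
\Bigl\| \Bigl( \int_\Omega \Bigl| \Bigl( \int_\Omega \Bigl| A_t^* - \int_\Omega A_t^* \,d\mu(t) \Bigr|^2 d\mu(t) \Bigr)^{\frac{q-1}{2}} \Bigl( A_t - \int_\Omega A_t \,d\mu(t) \Bigr) \Bigr|^2 d\mu(t) \Bigr)^{\frac{1}{2q}} X
\]
\[
\times \Bigl( \int_\Omega \Bigl| \Bigl( \int_\Omega \Bigl| B_t - \int_\Omega B_t \,d\mu(t) \Bigr|^2 d\mu(t) \Bigr)^{\frac{r-1}{2}} \Bigl( B_t^* - \int_\Omega B_t^* \,d\mu(t) \Bigr) \Bigr|^2 d\mu(t) \Bigr)^{\frac{1}{2r}} \Bigr\|_p .
\]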

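For some other types of u.i. norms the counterpart of Theorem 5.1 says:
Theorem 5.2 Let $\mu$ be a probability measure on $\Omega$, $p \geq 2$, $\Phi, \Upsilon$ be s.n. functions
and $X \in \mathcal{B}(\mathcal{H})$.
(a) If $A, B^* \in L^2_G(\Omega, \mu, \mathcal{B}(\mathcal{H}))$, then
\[
\Bigl\| \int_\Omega A_t X B_t \,d\mu(t) - \int_\Omega A_t \,d\mu(t)\, X \int_\Omega B_t \,d\mu(t) \Bigr\|_\Phi \leq
\Bigl\| \sqrt{ \int_\Omega A_t^* A_t \,d\mu(t) - \Bigl| \int_\Omega A_t \,d\mu(t) \Bigr|^2 }\; X \sqrt{ \int_\Omega B_t B_t^* \,d\mu(t) - \Bigl| \int_\Omega B_t^* \,d\mu(t) \Bigr|^2 } \Bigr\|_\Phi , \tag{21}
\]
under any of the following conditions:

(a1) $\|\cdot\|_\Phi := \|\cdot\|_1$,
(a2) $L^2(\Omega, \mu)$ is separable, $\Phi := \Upsilon^{(p)^*}$ and (at least) one of the families $\{A_t\}_{t \in \Omega}$ or
$\{B_t\}_{t \in \Omega}$ is a m.c.n.o. family,
(a3) $\|\cdot\|_\Phi := |||\cdot|||$ and both $\{A_t\}_{t \in \Omega}$ and $\{B_t\}_{t \in \Omega}$ are m.c.n.o. families.
(b) If $A, B \in L^2_G(\Omega, \mu, \mathcal{B}(\mathcal{H}))$, then
\[
\Bigl\| \int_\Omega A_t X B_t \,d\mu(t) - \int_\Omega A_t \,d\mu(t)\, X \int_\Omega B_t \,d\mu(t) \Bigr\|_\Phi \leq
\Bigl\| \sqrt{ \int_\Omega A_t^* A_t \,d\mu(t) - \Bigl| \int_\Omega A_t \,d\mu(t) \Bigr|^2 }\; X \Bigr\|_\Phi
\Bigl\| \int_\Omega B_t^* B_t \,d\mu(t) - \Bigl| \int_\Omega B_t \,d\mu(t) \Bigr|^2 \Bigr\|^{1/2} , \tag{22}
\]
under any of the following conditions:

(b1) $\|\cdot\|_\Phi := \|\cdot\|_2$,
(b2) $\Phi := \Upsilon^{(p)}$ and $\{A_t\}_{t \in \Omega}$ is additionally a m.c.n.o. family.

(c) If $A^*, B^* \in L^2_G(\Omega, \mu, \mathcal{B}(\mathcal{H}))$, then
\[
\Bigl\| \int_\Omega A_t X B_t \,d\mu(t) - \int_\Omega A_t \,d\mu(t)\, X \int_\Omega B_t \,d\mu(t) \Bigr\|_\Phi \leq
\Bigl\| \int_\Omega A_t A_t^* \,d\mu(t) - \Bigl| \int_\Omega A_t^* \,d\mu(t) \Bigr|^2 \Bigr\|^{1/2}
\Bigl\| X \sqrt{ \int_\Omega B_t B_t^* \,d\mu(t) - \Bigl| \int_\Omega B_t^* \,d\mu(t) \Bigr|^2 } \Bigr\|_\Phi , \tag{23}
\]
under any of the following conditions:

(c1) $\|\cdot\|_\Phi := \|\cdot\|_2$,
(c2) $\Phi := \Upsilon^{(p)}$ and $\{B_t\}_{t \in \Omega}$ is additionally a m.c.n.o. family.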

Since we can confine ourselves to the case in which the right-hand side of the inequality (21) is
finite, (21) in the case (a1) is the special case p := q := r := 1 of Theorem 5.1.
The proof of [20, th. 2.1] provides the proof for the inequality (21) in the case (a3),
together with the proof for the case (a2), with the only difference that we now apply
(to the same families) the Cauchy–Schwarz inequality [26, th. 3.1(d)] instead of
[15, th. 3.2]. For σ-elementary transformers the case (a2) was proved earlier in [32,
th. 2.10].
The case (b1) (resp. (c1)) was proved in [32, th. 2.6(21)] (resp. [32, th. 2.6(20)]),
while the case (b2) (resp. (c2)) was proved in [32, th. 2.7(28)] (resp. [32, th. 2.7(27)]).
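Definition 5.3 For a bounded field of operators $A := \{A_t\}_{t \in \Omega}$, the radius of
its essential range is given by
\[
r_\infty(A) \stackrel{\mathrm{def}}{=} \inf_{B \in \mathcal{B}(\mathcal{H})} \operatorname*{ess\,sup}_{t \in \Omega} \|A_t - B\|
= \inf_{B \in \mathcal{B}(\mathcal{H})} \|A - B\|_\infty = \min_{B \in \mathcal{B}(\mathcal{H})} \|A - B\|_\infty ,
\]
while its essential diameter is given by
$\operatorname{diam}_\infty(A) \stackrel{\mathrm{def}}{=} \operatorname*{ess\,sup}_{s,t \in \Omega} \|A_s - A_t\|$.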

This definition helps us to abbreviate the formulation of the following:

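Theorem 5.4 If $\mu$ is a σ-finite measure on $\Omega$, $A := \{A_t\}_{t \in \Omega}$ and $B := \{B_t\}_{t \in \Omega}$
are $[\mu]$ a.e. bounded fields of operators and $X \in \mathcal{C}_{|||\cdot|||}(\mathcal{H})$, then
\[
\sup_{\substack{\delta \in \mathfrak{M} \\ 0 < \mu(\delta) < +\infty}}
\Bigl|\Bigl|\Bigl| \frac{1}{\mu(\delta)} \int_\delta A_t X B_t \,d\mu(t) - \frac{1}{\mu(\delta)} \int_\delta A_t \,d\mu(t)\; X\; \frac{1}{\mu(\delta)} \int_\delta B_t \,d\mu(t) \Bigr|\Bigr|\Bigr|
\leq \min\Bigl\{ r_\infty(A)\, r_\infty(B), \frac{\operatorname{diam}_\infty(A) \operatorname{diam}_\infty(B)}{2} \Bigr\} |||X||| . \tag{24}
\]
In particular, if $\mu$ is a probability measure on $\Omega$ and if $\{A_t\}_{t \in \Omega}$ and $\{B_t\}_{t \in \Omega}$ are
bounded self-adjoint fields satisfying $C \leq A_t \leq D$ and $E \leq B_t \leq F$ for all
$t \in \Omega$ and some bounded self-adjoint operators $C, D, E, F$, then
\[
\Bigl|\Bigl|\Bigl| \int_\Omega A_t X B_t \,d\mu(t) - \int_\Omega A_t \,d\mu(t)\, X \int_\Omega B_t \,d\mu(t) \Bigr|\Bigr|\Bigr|
\leq \frac{\|D - C\| \cdot \|F - E\|}{4}\, |||X||| . \tag{25}
\]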

The inequality (24) (resp. (25)) was proved in [20, th. 2.8] (resp. [20, cor. 2.9]).
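The Grüss-type bound (25) can be illustrated numerically for a discrete probability measure. The following is a minimal sketch, assuming NumPy, real symmetric fields with $C = cI$ and $D = dI$ (so that $\|D - C\| = d - c$, and similarly for $E$, $F$), and only the operator and Hilbert–Schmidt norms as instances of $|||\cdot|||$; the function names are hypothetical and not taken from [20].

```python
import numpy as np

def random_symmetric_between(rng, dim, low, high):
    """Random real symmetric matrix S with low*I <= S <= high*I."""
    Q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    return (Q * rng.uniform(low, high, size=dim)) @ Q.T

def gruss_sides(weights, As, Bs, X, bounds_A, bounds_B, norm_ord=2):
    """Return (lhs, rhs) of (25) for the measure with atoms `weights`.

    bounds_A = (c, d) means c*I <= A_t <= d*I; bounds_B = (e, f) likewise.
    norm_ord is passed to numpy.linalg.norm (2 = operator norm, 'fro' = Hilbert-Schmidt).
    """
    mean_A = sum(w * A for w, A in zip(weights, As))
    mean_B = sum(w * B for w, B in zip(weights, Bs))
    mixed = sum(w * (A @ X @ B) for w, A, B in zip(weights, As, Bs))
    lhs = np.linalg.norm(mixed - mean_A @ X @ mean_B, norm_ord)
    (c, d), (e, f) = bounds_A, bounds_B
    rhs = (d - c) * (f - e) / 4.0 * np.linalg.norm(X, norm_ord)
    return lhs, rhs
```

For scalar fields and $X = 1$ this reduces to the classical Grüss inequality $|\mathrm{cov}(a, b)| \leq (d - c)(f - e)/4$.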
The operator Grüss–Landau inequality is given by:
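Theorem 5.5 Let $\mu$ be a probability measure on $\Omega$, $A^*, B \in L^2_G(\Omega, \mu, \mathcal{B}(\mathcal{H}))$, $X \in \mathcal{B}(\mathcal{H})$
and $\eta \in [0, 1]$. Then
\[
\Bigl| \int_\Omega A_t X B_t \,d\mu(t) - \int_\Omega A_t \,d\mu(t)\, X \int_\Omega B_t \,d\mu(t) \Bigr|^{2\eta} \leq
\Bigl\| \int_\Omega A_t A_t^* \,d\mu(t) - \Bigl| \int_\Omega A_t^* \,d\mu(t) \Bigr|^2 \Bigr\|^{\eta}
\Bigl( \int_\Omega B_t^* X^* X B_t \,d\mu(t) - \Bigl| X \int_\Omega B_t \,d\mu(t) \Bigr|^2 \Bigr)^{\eta} . \tag{26}
\]
In particular, if $A := \{A_t\}_{t \in \Omega}$ and $B := \{B_t\}_{t \in \Omega}$ are (bounded) self-adjoint fields
satisfying $C \leq A_t \leq D$ and $E \leq B_t \leq F$ for all $t \in \Omega$ and some bounded self-adjoint
operators $C, D, E, F$, such that $C, D$ (resp. $E, F$) commute with $A_t$ (resp. $B_t$) for all
$t \in \Omega$, satisfying $CD = DC$ and $EF = FE$, then
\[
\Bigl| \int_\Omega A_t B_t \,d\mu(t) - \int_\Omega A_t \,d\mu(t) \int_\Omega B_t \,d\mu(t) \Bigr|^{2\eta} \leq
\Bigl\| D - \int_\Omega A_t \,d\mu(t) \Bigr\|^{\eta} \Bigl\| \int_\Omega A_t \,d\mu(t) - C \Bigr\|^{\eta}
\Bigl( F - \int_\Omega B_t \,d\mu(t) \Bigr)^{\eta} \Bigl( \int_\Omega B_t \,d\mu(t) - E \Bigr)^{\eta}
\leq \frac{1}{4^{2\eta}} \|D - C\|^{2\eta} (F - E)^{2\eta} . \tag{27}
\]
The inequality (26) (resp. (27)) was proved in [32, th. 2.1] (resp. [32, th. 2.2]).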
The refined Grüss–Landau operator and norm inequalities are presented in:
N
Theorem 5.6 If {α1 , . . . , αN } are in (0, 1], satisfying n=1 αn = 1 for some
N ∈ N, and if X, {A1 , . . . , AN } and {B1 , . . . , BN } are in B(H), then for all
N −1 ∗
N 2 12
c≥ n=1 αn An An − n=1 An >0

N

N

N 2
αn−1 A∗n XBn − A∗n X Bn +
n=1 n=1 n=1

N

N
2 1 −1
−1
Am −αn−1 An ) cI + c2 I − αn−1 A∗n An +
2
αm αn (αm An
1≤m<n≤N n=1 n=1

N

N

N 2
× αn−1 A∗n XBn − A∗n X Bn −1
− cX(αm Bm − αn−1 Bn )
n=1 n=1 n=1

N
N
2
= c2 αn−1 Bn∗ X∗XBn − X Bn . (28)
n=1 n=1

If 2 ≤ p ≤ q < +∞ and ϒ is a s.n. function, then

N

N

N p
αn−1A∗n XBn − A∗n X Bn ≤ (29)
(q )
n=1 n=1 n=1 ϒ

N

N

N p
αn−1A∗n XBn − A∗n X Bn +
n=1 n=1 n=1

p p
N

N
2 1 −1
−1
Am − αn−1An ) cI + c2 I − αn−1A∗n An +
2
αm2 αn2 (αm An
1≤m<n≤N n=1 n=1

N

N

N p
× αn−1A∗n XBn − A∗n X −1
Bn −cX(αm Bm − αn−1Bn ) q
(p)
n=1 n=1 n=1 ϒ
p

N

N

N 2

≤c p
αn−1 Bn∗ X∗XBn − Bn∗ XX ∗
Bn . (30)
q
n=1 n=1 n=1 ( )
ϒ 2

If {Bn }Nn=1 is additionally a m.c.n.o. family, then the expression appearing in (30)
N 2 1/2 p
further estimates by cp X Nn=1 αn−1 Bn∗ Bn − n=1 Bn ϒ
(q ) .

The identity (28) is a subject of [22, lemma 2.1], while the chain of inequalities (29)–
(30) was proven in [22, th. 2.2] and was further generalized in [22, th. 2.3].

6 Norm Inequalities for Holomorphic Functions on Simply Connected Domains in the Complex Plane

Important applications of the previous Cauchy–Schwarz norm inequalities to generalized
derivations (which include perturbations) are presented in:
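Theorem 6.1 Let $A, B \in \mathcal{B}(\mathcal{H})$ be contractions, $p \geq 2$, $\Phi, \Upsilon$ be s.n. functions,
$\sum_{n=0}^\infty c_n$ be an absolutely summable complex series satisfying $\sum_{n=0}^\infty |c_n| \leq 1$ and
$f(z) \stackrel{\mathrm{def}}{=} \sum_{n=0}^\infty c_n z^n$ for all $|z| \leq 1$. If $AX - XB \in \mathcal{C}_\Phi(\mathcal{H})$ and $Y - AYB \in \mathcal{C}_\Phi(\mathcal{H})$
for some $X, Y \in \mathcal{B}(\mathcal{H})$, then

(a) $\sqrt{I - A^*A}\,\bigl( f(A)X - Xf(B) \bigr) \sqrt{I - BB^*} \in \mathcal{C}_\Phi(\mathcal{H})$ and
$\sqrt{I - A^*A}\,\bigl( f(I)Y - f(A \otimes B)(Y) \bigr) \sqrt{I - BB^*} \in \mathcal{C}_\Phi(\mathcal{H})$, satisfying
\[
\Bigl\| \sqrt{I - A^*A}\,\bigl( f(A)X - Xf(B) \bigr) \sqrt{I - BB^*} \Bigr\|_\Phi \leq
\Bigl\| \sqrt{I - f(A)^* f(A)}\, (AX - XB) \sqrt{I - f(B) f(B)^*} \Bigr\|_\Phi , \tag{31}
\]
\[
\Bigl\| \sqrt{I - A^*A}\,\bigl( f(I)Y - f(A \otimes B)(Y) \bigr) \sqrt{I - BB^*} \Bigr\|_\Phi \leq
\Bigl\| \sqrt{I - f(A)^* f(A)}\, (Y - AYB) \sqrt{I - f(B) f(B)^*} \Bigr\|_\Phi , \tag{32}
\]
under any of the following conditions:

(a1) $\|\cdot\|_\Phi := \|\cdot\|_1$,
(a2) $\Phi := \Upsilon^{(p)^*}$ and (at least) one of $A$ or $B$ is normal,
(a3) $\|\cdot\|_\Phi := |||\cdot|||$ and both $A$ and $B$ are normal;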
(b) if ||A|| < 1, then f (A)X − Xf (B) I − BB ∗ ∈ C (H) and
√
f (I )Y − f (A ⊗ B)(Y ) I − BB ∗ ∈ C (H), satisfying
√
f (A)X − Xf (B) I − BB ∗
≤

I − f (A)f (A)∗ (I − AA∗ )−1/2 (AX − XB) I − f (B)f (B)∗
1/2

,
(33)
√
f (I )Y − f (A⊗B)(Y ) I − BB ∗
≤

I − f (A)f (A)∗ (I − AA∗ )−1/2 (Y − AYB) I − f (B)f (B)∗
1/2

,
(34)

under any of the following conditions:

(b1) ||·|| ··= ||·||2 ,
(b2) ··= ϒ and B is√normal;
(p)

(c) if ||B || < 1, then I − A∗A f (A)X − Xf (B) ∈ C (H) and
√
I − A∗A f (I )Y − f (A⊗B)(Y ) ∈ C (H), satisfying
√
I − A∗A f (A)X − Xf (B) ≤

I − f (A)∗ f (A) (AX − XB)(I − B ∗B)−1/2 I − f (B)∗ f (B)
1/2

,
(35)

√
I − A∗A f (I )Y − f (A⊗B)(Y ) ≤

I − f (A)∗ f (A)(Y − AYB)(I − B ∗B)−1/2 I − f (B)∗ f (B)
1/2

,
(36)

under any of the following conditions:

(c1) ||·|| ··= ||·||2 ,
(c2) ··= ϒ and A is normal.
(p)

The inequality
• (31) in the case (a1) (resp. (a2) and (a3)) was proved in [24, th. 2.1(a3)] (resp.
[24, th. 2.1(a2)] and [24, th. 2.1(a1)]);
• (32) in the case (a1) (resp. (a2) and (a3)) was proved in [24, th. 2.1(b3)] (resp.
[24, th. 2.1(b2)] and [24, th. 2.1(b1)]);
• (33) in cases (b1) and (b2) is just the inequality (6) in [25, th. 2.1(a4)];
• (34) in cases (b1) and (b2) is a reformulation of the inequality (9) in [25,
th. 2.1(b4)];
• (35) in cases (c1) and (c2) is just the inequality (7) in [25, th. 2.1(a4)];
• (36) in cases (c1) and (c2) is a reformulation of the inequality (10) in [25,
th. 2.1(b4)].
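Parts of Theorem 6.1 remain valid for the subclass of functions in the disc algebra
$A(\mathbb{D})$ possessing non-negative Taylor coefficients, if at least one of the operators $A$ or
$B^*$ is quasinormal, as presented in [18, th. 3.6]:
Theorem 6.2 Let $\Upsilon$ be a s.n. function, $p \geq 2$, let $f(z) \stackrel{\mathrm{def}}{=} \sum_{n=0}^\infty c_n z^n$ be an analytic
function with non-negative Taylor coefficients $c_n \geq 0$, $0 < \sum_{n=0}^\infty c_n < +\infty$, and let
$A, B, X, Y \in \mathcal{B}(\mathcal{H})$ be such that $A, B^*$ are contractions and at least one of them is
quasinormal. If $AY - YB \in \mathcal{C}_{\Upsilon^{(p)^*}}(\mathcal{H})$, then
\[
\Bigl\| \sqrt{I - A^*A}\,\bigl( f(A)Y - Yf(B) \bigr) \sqrt{I - BB^*} \Bigr\|_{\Upsilon^{(p)^*}} \leq
\Bigl\| \sqrt{f(1)I - |f(A)|^2/f(1)}\, (AY - YB) \sqrt{f(1)I - |f(B^*)|^2/f(1)} \Bigr\|_{\Upsilon^{(p)^*}} .
\]
If $A$ and $B^*$ are both quasinormal and $AX - XB \in \mathcal{C}_{|||\cdot|||}(\mathcal{H})$, then
\[
\Bigl|\Bigl|\Bigl| \sqrt{I - A^*A}\,\bigl( f(A)X - Xf(B) \bigr) \sqrt{I - BB^*} \Bigr|\Bigr|\Bigr| \leq
\Bigl|\Bigl|\Bigl| \sqrt{f(1)I - f(A^*A)}\, (AX - XB) \sqrt{f(1)I - f(BB^*)} \Bigr|\Bigr|\Bigr| \leq
\Bigl|\Bigl|\Bigl| \sqrt{f(1)I - |f(A)|^2/f(1)}\, (AX - XB) \sqrt{f(1)I - |f(B^*)|^2/f(1)} \Bigr|\Bigr|\Bigr| .
\]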

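Definition 6.3 Let $\mu$ be a complex Borel measure on $\mathbb{R}$ and $A, B, X \in \mathcal{B}(\mathcal{H})$. If the
function $t \mapsto e^{itA}$ is Gel'fand $\mu$-integrable on $\mathbb{R}$, then
\[
\hat{\mu}(A) \stackrel{\mathrm{def}}{=} \mathcal{F}\mu(A) \stackrel{\mathrm{def}}{=} \int_{\mathbb{R}} e^{itA} \,d\mu(t)
\]
will denote the operator valued (o.v.) Fourier transform of $\mu$ evaluated in $A$.

For dissipative operators we present the following Q∗ norm inequality for
generalized derivations, given in [26, th. 4.2]:
Theorem 6.4 Let $p \geq 2$, $\Upsilon$ be a s.n. function, $\mu$ be a complex Borel measure on
$\mathbb{R}_+$, with its total variation $|\mu|(\mathbb{R}_+) \leq 1$, and $A, B, X \in \mathcal{B}(\mathcal{H})$ be such that $A, B$
are dissipative and at least one of them is normal. If $AX - XB \in \mathcal{C}_{\Upsilon^{(p)^*}}(\mathcal{H})$, then
\[
\Bigl\| \sqrt{iA^* - iA}\,\bigl( \hat{\mu}(A)X - X\hat{\mu}(B) \bigr) \sqrt{iB^* - iB} \Bigr\|_{\Upsilon^{(p)^*}} \leq
\Bigl\| \sqrt{I - |\hat{\mu}(A)|^2}\, (AX - XB) \sqrt{I - |\hat{\mu}(B)^*|^2} \Bigr\|_{\Upsilon^{(p)^*}} .
\]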

7 Mean Value Norm Inequalities for Operator Monotone Functions and Heinz Inequalities, with Applications

Definition 7.1 For an interval I ⊂ R and a Borel measurable function ϕ : I → R

we say that ϕ is operator (increasingly) monotone (OM) on I if ϕ(A) ≤ ϕ(B) for
all A, B ∈ B(H) satisfying σ (A) ∪ σ (B) ⊂ I and A ≤ B.
Every OM function $\varphi$ on $I$ uniquely extends to a Pick class $P(I)$ function $\tilde{\varphi}$,
which is analytic in $\mathbb{C}_+ \cup I \cup \mathbb{C}_-$, so it satisfies $\Im \tilde{\varphi}(z) > 0$ for all $\Im z > 0$ (see [9,
th. 2.7.7]). Thus, we will also use the simplified notation $\varphi$ for $\tilde{\varphi}$.
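To illustrate Definition 7.1, here is a small numerical sketch (assuming NumPy; the matrices and helper names are chosen only for this illustration) showing that $t \mapsto \sqrt{t}$ respects the order $A \leq B$ on a standard $2 \times 2$ example, as the Löwner–Heinz inequality predicts, while $t \mapsto t^2$ does not, so the latter is not operator monotone on $[0, +\infty)$.

```python
import numpy as np

def psd_sqrt(P):
    """Square root of a real symmetric positive semidefinite matrix."""
    w, V = np.linalg.eigh(P)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def min_eig(S):
    """Smallest eigenvalue of the symmetric part of S."""
    return float(np.linalg.eigvalsh((S + S.T) / 2).min())

A = np.array([[1.0, 1.0], [1.0, 1.0]])
B = np.array([[2.0, 1.0], [1.0, 1.0]])

order_AB = min_eig(B - A)                        # non-negative: A <= B
order_sqrt = min_eig(psd_sqrt(B) - psd_sqrt(A))  # non-negative: sqrt(A) <= sqrt(B)
order_square = min_eig(B @ B - A @ A)            # negative: A^2 <= B^2 fails
```

Here $B - A$ is a rank one projection, yet $B^2 - A^2$ has determinant $-1$ and is therefore indefinite.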
The first result in this section is the following extension of “mean value” norm
inequalities (68) and (69) for operator monotone functions on R+ in [15, th. 4.4],
from positive operators to normal strictly accretive operators.
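Theorem 7.2 Let $\Phi, \Upsilon$ be s.n. functions, $p \geq 2$ and $A, B, X \in \mathcal{B}(\mathcal{H})$ be such
that $AX - XB \in \mathcal{C}_\Phi(\mathcal{H})$. If $A$ and $B$ have strictly contractive real parts and
$\varphi \in P(-1, 1)$ is a non-constant function, or $A$ and $B$ are strictly accretive and
$\varphi \in P[0, +\infty)$ is a non-constant function, then $\varphi'\bigl(\frac{A^*+A}{2}\bigr)$ and $\varphi'\bigl(\frac{B+B^*}{2}\bigr)$ are invertible.
Also, $\varphi(A)X - X\varphi(B) \in \mathcal{C}_\Phi(\mathcal{H})$:
(a) if $A$ is cohyponormal and $B$ is hyponormal, in which case
\[
\bigl\| \varphi(A)X - X\varphi(B) \bigr\|_\Phi \leq
\Bigl\| \sqrt{\varphi'\bigl(\tfrac{A^*+A}{2}\bigr)}\, (AX - XB) \sqrt{\varphi'\bigl(\tfrac{B+B^*}{2}\bigr)} \Bigr\|_\Phi , \tag{37}
\]
under any of the following conditions:

(a1) $\|\cdot\|_\Phi := \|\cdot\|_1$,
(a2) $\Phi := \Upsilon^{(p)^*}$ and (at least) one of $A$ or $B$ is normal,
(a3) $\|\cdot\|_\Phi := |||\cdot|||$ and both $A$ and $B$ are normal.
Moreover, the rightmost expression in (37) can be further estimated by
$\Bigl\| \bigl(\tfrac{A^*+A}{2}\bigr)^{-\frac12} \varphi\bigl(\tfrac{A^*+A}{2}\bigr)^{\frac12} (AX - XB) \bigl(\tfrac{B+B^*}{2}\bigr)^{-\frac12} \varphi\bigl(\tfrac{B+B^*}{2}\bigr)^{\frac12} \Bigr\|_\Phi$
if $\varphi \in P[0, +\infty)$ and $\varphi(0) = 0$;
(b) if $A$ is hyponormal, in which case
\[
\Bigl\| \varphi'\bigl(\tfrac{A+A^*}{2}\bigr)^{-\frac12} \bigl( \varphi(A)X - X\varphi(B) \bigr) \Bigr\|_\Phi \leq
\Bigl\| (AX - XB)\, \varphi'\bigl(\tfrac{B+B^*}{2}\bigr)^{\frac12} \Bigr\|_\Phi , \tag{38}
\]
under any of the following conditions:

(b1) $\|\cdot\|_\Phi := \|\cdot\|_2$ and $B$ is hyponormal,
(b2) $\Phi := \Upsilon^{(p)}$ and $B$ is normal;

(c) if $B$ is cohyponormal, in which case
\[
\Bigl\| \bigl( \varphi(A)X - X\varphi(B) \bigr) \varphi'\bigl(\tfrac{B+B^*}{2}\bigr)^{-\frac12} \Bigr\|_\Phi \leq
\Bigl\| \varphi'\bigl(\tfrac{A+A^*}{2}\bigr)^{\frac12} (AX - XB) \Bigr\|_\Phi , \tag{39}
\]
under any of the following conditions:

(c1) $\|\cdot\|_\Phi := \|\cdot\|_2$ and $A$ is cohyponormal,
(c2) $\Phi := \Upsilon^{(p)}$ and $A$ is normal;

(d) if $\varphi \in P[0, +\infty)$ and $\varphi(0) = 0$, $A$ is hyponormal and $B$ is cohyponormal, in
which case
\[
\Bigl\| \bigl(\tfrac{A^*+A}{2}\bigr)^{\frac12} \varphi\bigl(\tfrac{A^*+A}{2}\bigr)^{-\frac12} \bigl( \varphi(A)X - X\varphi(B) \bigr) \bigl(\tfrac{B+B^*}{2}\bigr)^{\frac12} \varphi\bigl(\tfrac{B+B^*}{2}\bigr)^{-\frac12} \Bigr\|_\Phi \leq
\]
\[
\Bigl\| \varphi'\bigl(\tfrac{A^*+A}{2}\bigr)^{-\frac12} \bigl( \varphi(A)X - X\varphi(B) \bigr) \varphi'\bigl(\tfrac{B+B^*}{2}\bigr)^{-\frac12} \Bigr\|_\Phi \leq \|AX - XB\|_\Phi , \tag{40}
\]
under any of the following conditions:

(d1) $\|\cdot\|_\Phi$ is the operator norm $\|\cdot\|$,
(d2) $\Phi := \Upsilon^{(p)}$ and (at least) one of $A$ or $B$ is normal,
(d3) $\|\cdot\|_\Phi := |||\cdot|||$ and both $A$ and $B$ are normal.

Moreover, the last inequality in (40) is valid for all $\varphi \in P[0, +\infty) \cup P(-1, 1)$.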
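If $J := (-1, 1)$ or $J := (0, +\infty)$, then $\varphi \in P(J)$ satisfies the conditions of [9,
th. 2.4.1], implying that $\varphi' > 0$ on $J$ and $\varphi'$ is continuous on $J$, so
$\varphi' \geq m_A \stackrel{\mathrm{def}}{=} \min_{\sigma(\frac{A^*+A}{2})} \varphi' > 0$ on $\sigma\bigl(\frac{A^*+A}{2}\bigr)$.
Thus, for every $h \in \mathcal{H}$ there is a positive measure $\mu_h$, such that
$\bigl\langle \varphi'\bigl(\tfrac{A^*+A}{2}\bigr) h, h \bigr\rangle = \int_{\sigma(\frac{A^*+A}{2})} \varphi'(\lambda) \,d\mu_h(\lambda) \geq m_A \|h\|^2$,
implying that $\varphi'\bigl(\tfrac{A^*+A}{2}\bigr) \geq m_A I$, so $\varphi'\bigl(\tfrac{A^*+A}{2}\bigr)$ is invertible, as claimed.
Similarly we also conclude that $\varphi'\bigl(\tfrac{B+B^*}{2}\bigr)$ is positive and invertible.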
• The inequality (37) in the case (a1) (resp. (a2) and (a3)) is stated at the end of
[24, th. 3.1(c)] (resp. is given by the inequality (37) in [24, th. 3.1(c)] and by the
inequality (39) in [24, th. 3.1(d)]). The rest of the statement in the part (a) is based
on the estimates
\[
\varphi'\bigl(\tfrac{A^*+A}{2}\bigr) \leq \bigl(\tfrac{A^*+A}{2}\bigr)^{-1} \varphi\bigl(\tfrac{A^*+A}{2}\bigr), \qquad
\varphi'\bigl(\tfrac{B+B^*}{2}\bigr) \leq \bigl(\tfrac{B+B^*}{2}\bigr)^{-1} \varphi\bigl(\tfrac{B+B^*}{2}\bigr), \tag{41}
\]
presented in the proof of [24, th. 3.1(c),(d)], additionally combined with the (double)
monotonicity property [26, (1)].
• The inequality (38) in the case (b1) (resp. (b2)) is stated at the end of [24,
th. 3.1(a)] (resp. is given by the inequality (35) in [24, th. 3.1(a)]).
• The inequality (39) in the case (c1) (resp. (c2)) is stated at the end of [24,
th. 3.1(b)] (resp. is given by the inequality (36) in [24, th. 3.1(b)]).
• The first inequality in (40) in the case (d) is again based on the inequalities in (41),
combined with application of the monotonicity property [26, (1)].
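The proof of the second inequality in (40) in the case (d1) for $\varphi \in P(-1, 1)$
relies on the estimate
\[
\Bigl\| \varphi'\bigl(\tfrac{A^*+A}{2}\bigr)^{-\frac12} \bigl( \varphi(A)X - X\varphi(B) \bigr) \varphi'\bigl(\tfrac{B+B^*}{2}\bigr)^{-\frac12} \Bigr\| =
\]
\[
\Bigl\| \varphi'(0) \int_{-1}^{1} \varphi'\bigl(\tfrac{A^*+A}{2}\bigr)^{-\frac12} (I - tA)^{-1} (AX - XB) (I - tB)^{-1} \varphi'\bigl(\tfrac{B+B^*}{2}\bigr)^{-\frac12} \,d\mu(t) \Bigr\|
\]
\[
\leq \Bigl\| \varphi'\bigl(\tfrac{A^*+A}{2}\bigr)^{-\frac12}\, \varphi'(0) \int_{-1}^{1} (I - tA)^{-1} (I - tA^*)^{-1} \,d\mu(t)\; \varphi'\bigl(\tfrac{A^*+A}{2}\bigr)^{-\frac12} \Bigr\|^{\frac12} \tag{42}
\]
\[
\times \|AX - XB\| \, \Bigl\| \varphi'\bigl(\tfrac{B+B^*}{2}\bigr)^{-\frac12}\, \varphi'(0) \int_{-1}^{1} (I - tB^*)^{-1} (I - tB)^{-1} \,d\mu(t)\; \varphi'\bigl(\tfrac{B+B^*}{2}\bigr)^{-\frac12} \Bigr\|^{\frac12} ,
\]
where the inequality in (42) is based on an application of the inequality (12) in
[15, lemma 3.1(a1)], applied to the s.i. families
$\bigl\{ \varphi'\bigl(\tfrac{A^*+A}{2}\bigr)^{-1/2} (I - tA)^{-1} \bigr\}_{t \in [-1,1]}$ and
$\bigl\{ (I - tB)^{-1} \varphi'\bigl(\tfrac{B+B^*}{2}\bigr)^{-1/2} \bigr\}_{t \in [-1,1]}$.
With the norms of the expression appearing
before (42) and the second expression appearing after (42) not exceeding 1, based on
the same arguments used to justify inequalities (47)–(48) in the proof of [24, th. 3.1],
this confirms the second inequality in (40) in the case (d1). The case $\varphi \in P[0, +\infty)$
is proved by analogy.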
The second inequality in (40) in the case (d2) follows from the inequality (38)
in the case (b2) (resp. (39) in the case (c2)), if applied to $X \varphi'\bigl(\tfrac{B+B^*}{2}\bigr)^{-1/2}$ (resp.
$\varphi'\bigl(\tfrac{A^*+A}{2}\bigr)^{-1/2} X$) instead of $X$. The second inequality in (40) in the case (d3) is given
by the inequality (42) in [24, th. 3.1(d)].
The previous Theorem 7.2 provides the following generalization of some well
known norm inequalities to normal operators, which are either accretive or have
strictly contractive real parts.
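Corollary 7.3 Let $A, B \in \mathcal{B}(\mathcal{H})$ be normal, such that $AX - XB \in \mathcal{C}_{|||\cdot|||}(\mathcal{H})$ for
some $X \in \mathcal{B}(\mathcal{H})$.
(a) If $A$ and $B$ are accretive operators, then for all $\theta \in [0, 1]$
\[
\Bigl|\Bigl|\Bigl| \bigl(\tfrac{A^*+A}{2}\bigr)^{\frac{1-\theta}{2}} \bigl( A^\theta X - X B^\theta \bigr) \bigl(\tfrac{B+B^*}{2}\bigr)^{\frac{1-\theta}{2}} \Bigr|\Bigr|\Bigr| \leq \theta\, |||AX - XB||| , \tag{43}
\]
\[
\Bigl|\Bigl|\Bigl| \sqrt{A^* + A}\, \bigl( \log(A) X - X \log(B) \bigr) \sqrt{B + B^*} \Bigr|\Bigr|\Bigr| \leq 2\, |||AX - XB||| . \tag{44}
\]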

If, in addition, A and B are strictly accretive operators, then

−1 −1
A log(I + A) X − XB log(I + B) ≤

∗ ∗ ∗ −1 ∗ −1
log I + A 2+A − A 2+A I + A 2+A log I + A 2+A (AX − XB)

∗ ∗ ∗ −1 ∗ −1
× log I + B+B 2 − B+B2 I + B+B2 log I + B+B2 .

(b) If A and B have strictly contractive real parts and 0 ≤ α, β ≤ 1, then

A∗+A 2 I +A I +B B+B ∗ 2
I− 2 log I −A X−X log I −B I− 2 ≤ 2|||AX−XB|||,
∗ B+B ∗
π
2 cos A π+A tan 2A
π X − X tan π cos π
2B
≤ |||AX−XB |||,

(I + A)α − (I − A) β X − X (I + B)α − (I − B) β ≤

A∗+A α−1

A∗+A β−1
1/2
αI+ 2 +β I − 2 (AX − XB)

B+B ∗ α−1

B+B ∗ β−1
1/2
× αI+ 2 + β I− 2 .

Corollary 7.3 represents a reformulation of [24, cor. 3.2].
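The inequality (43) can be checked numerically in the simplest setting. The sketch below assumes NumPy, real symmetric positive definite matrices $A$ and $B$ (so that $\frac{A^*+A}{2} = A$), and the Hilbert–Schmidt norm as the u.i. norm; it reads (43) as $\|A^{(1-\theta)/2}(A^\theta X - XB^\theta)B^{(1-\theta)/2}\|_2 \leq \theta \|AX - XB\|_2$. The helper names are hypothetical.

```python
import numpy as np

def random_positive_definite(rng, dim):
    """Random real symmetric positive definite matrix."""
    G = rng.standard_normal((dim, dim))
    return G @ G.T + 0.1 * np.eye(dim)

def power(P, s):
    """Real power P^s of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(P)
    return (V * w**s) @ V.T

def heinz_difference_sides(A, B, X, theta):
    """Return (lhs, rhs) of the assumed Hilbert-Schmidt form of (43)."""
    half = (1.0 - theta) / 2.0
    inner = power(A, theta) @ X - X @ power(B, theta)
    lhs = np.linalg.norm(power(A, half) @ inner @ power(B, half), 'fro')
    rhs = theta * np.linalg.norm(A @ X - X @ B, 'fro')
    return lhs, rhs
```

In this special case the inequality reduces, in the eigenbases of $A$ and $B$, to the scalar estimate $(ab)^{(1-\theta)/2} |a^\theta - b^\theta| \leq \theta |a - b|$ for $a, b > 0$, which follows from $\sinh(\theta x) \leq \theta \sinh(x)$ for $x \geq 0$ and $\theta \in [0, 1]$.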

As noted in [24, rem. 3], the inequality (43) extends the “difference” version of the
celebrated Heinz inequality in [8, Hilfssatz 3.] and its norm ideal version (9) in
[3, th. 2], [9, (5.3.2)] and [30, (4.1)] to accretive normal operators. Its weakened
version
\[
\bigl|\bigl|\bigl| H^{\frac12 + \beta} X K^{\frac12 - \beta} - H^{\frac12 - \beta} X K^{\frac12 + \beta} \bigr|\bigr|\bigr| \leq |||HX - XK||| \tag{45}
\]
in [30, (4.1)] and another important generalization of the first inequality in [8,
Hilfssatz 3.] and the Heinz-type means inequality [9, (5.3.1)] from positive to
self-adjoint operators is given by the special case p := 1 of [12, lemma 3.2]:
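Lemma 7.4 For self-adjoint $A, B \in \mathcal{B}(\mathcal{H})$ and $X \in \mathcal{B}(\mathcal{H})$, $p \geq 1$ and u.i. norms
$|||\cdot|||$, the function $f : [0, p] \to [0, +\infty]$, defined by
\[
f(s) \stackrel{\mathrm{def}}{=} \bigl|\bigl|\bigl| |A|^{s-1} A X |B|^{p-s} + |A|^{p-s} X B |B|^{s-1} \bigr|\bigr|\bigr| \quad \text{for all } s \in [0, p],
\]
is convex and symmetric on $[0, p]$, non-increasing on $[0, p/2]$ and non-decreasing
on $[p/2, p]$, implying that for all $s \in [0, p]$
\[
\bigl|\bigl|\bigl| |A|^{s-1} A X |B|^{p-s} + |A|^{p-s} X B |B|^{s-1} \bigr|\bigr|\bigr| \leq \bigl|\bigl|\bigl| |A|^{p-1} A X + X B |B|^{p-1} \bigr|\bigr|\bigr| . \tag{46}
\]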

The inequality (46) follows from the first part of the above lemma, as
$f(s) \leq \max_{[0,p]} f = f(0) = f(p) = \bigl|\bigl|\bigl| |A|^{p-1} A X + X B |B|^{p-1} \bigr|\bigr|\bigr|$ for all $s \in [0, p]$. The
inequality (45) follows from the inequality (46) by taking $p := 1$, $A := H^{\frac{1+2\beta}{2s}}$ and
$B := -K^{\frac{1+2\beta}{2s}}$ for $s \in (0, 1)$. Inequality (46) plays an important role in the proofs
of the norm inequality for self-adjoint derivations
\[
\bigl|\bigl|\bigl| |AX + XB|^p \bigr|\bigr|\bigr| \leq 2^{p-1} \|X\|^{p-1} \bigl|\bigl|\bigl| |A|^{p-1} A X + X B |B|^{p-1} \bigr|\bigr|\bigr|
\]
in [12, th. 3.1] and the perturbation norm inequality in [12, th. 3.3]:
\[
\bigl|\bigl|\bigl| |A - B|^p \bigr|\bigr|\bigr| \leq 2^{p-1} \bigl|\bigl|\bigl| A |A|^{p-1} - B |B|^{p-1} \bigr|\bigr|\bigr| .
\]
It is also noted in [24, rem. 3] that the special case of (44) for strictly positive
operators $A$ and $B$ can also be derived from the geometric-logarithmic mean inequality
[30, (5.2)].

8 Laplace Transformers, Arithmetic-Geometric and Young Norm Inequalities

In this section we show some applications of Theorem 2.3 to Laplace transformers,

which represent an important subclass of i.p.t. transformers.
Definition 8.1 Let $f : \mathbb{R}_+ \to \mathbb{C}$ be a Lebesgue measurable function and $A, B, X \in
\mathcal{B}(\mathcal{H})$. If the function $t \mapsto e^{-tA} f(t)$ is Gel'fand integrable on $\mathbb{R}_+$ (with respect to the
Lebesgue measure), then
\[
\mathcal{L}_{\mathcal{B}(\mathcal{H})} f(A) \stackrel{\mathrm{def}}{=} \int_{\mathbb{R}_+} e^{-tA} f(t) \,dt
\]
will denote the operator valued Laplace transform of $f$ (in $A$), and, similarly, if the
function $t \mapsto e^{-tA} X e^{-tB} f(t)$ is Gel'fand integrable on $\mathbb{R}_+$, then
\[
\mathcal{L}_{\mathcal{B}(\mathcal{B}(\mathcal{H}))} f(A \otimes I + I \otimes B)(X) \stackrel{\mathrm{def}}{=} \int_{\mathbb{R}_+} e^{-tA} X e^{-tB} f(t) \,dt
\]
will denote the Laplace transformer of $f$ (in a generalized derivation $A \otimes I + I \otimes B$,
evaluated in $X$).
In the sequel, we will use the simplified notation $\mathcal{L}f$ instead of both $\mathcal{L}_{\mathcal{B}(\mathcal{H})} f$ and
$\mathcal{L}_{\mathcal{B}(\mathcal{B}(\mathcal{H}))} f$, as their (exact) meaning will be clear from the context. As is usual
for linear transformations, we will also often write $\mathcal{L}f(A \otimes I + I \otimes B)X$ instead of
$\mathcal{L}f(A \otimes I + I \otimes B)(X)$, except in the case $X := I$, in which the brackets will not be
omitted.
For Laplace transformers we have the following reformulation of [28, th. 2.2]:
Theorem 8.2 Let ,ϒ be s.n. functions, p ≥ 2, A, B ∈ B(H), f, g : R+ → C be
some Lebesgue measurable functions and X ∈ C (H).
) ∗
(a) If R+ (||e−t Ah||2 |f (t)|2 + ||e−t B h||2 |g(t)|2 )dt < +∞ for all h ∈ H, then

1 1
L(fg)(A,B )X
≤ (L|f |2(A∗,A )(I )) 2 X(L|g|2(B,B ∗)(I )) 2
, (47)

under any of the following conditions:

(a1) ||·|| ··= ||·||1 ,
(p) ∗
(a2) ··= ϒ and (at least) one of A or B is normal operator,
(a3) ||·||) ··= |||·||| and both A and B are normal operators;
(b) If R+ (||e−t Ah||2 |f (t)|2 + ||e−t B h||2 |g(t)|2 )dt < +∞ for all h ∈ H, then

1 1
L(fg)(A,B )X
≤ (L|f |2 (A∗,A )(I )) 2 X
L|g|2 (B ∗,B )(I ) 2 ,
(48)

under any of the following conditions:

(b1) ||·|| ··= ||·||2 ,
(b2) )··= ϒ and A is normal operator;
(p)

∗ ∗
(c) If R+ (||e−t A h||2 |f (t)|2 + ||e−t B h||2 |g(t)|2 )dt < +∞ for all h ∈ H, then

1 1
L(fg)(A,B )X
≤ L|f |2 (A,A∗)(I ) 2 X(L|g|2 (B,B ∗)(I )) 2
,
(49)

under any of the following conditions:

(c1) ||·|| ··= ||·||2 ,
(c2) ··= ϒ and B is normal operator.
(p)

As LB(B(H)) |f |2 (A∗,A ) (I ) = LB(H) |f |2 (A∗ + A) if A is normal and

LB(B(H)) |g|2 (B,B ∗) (I ) = LB(H) |g|2 (B + B ∗) if B is normal, let us note that
inequalities (47), (48) and (49) can be simplified in those cases.
The following Aczél–Bellman type norm inequality for Laplace transformers
reformulates [28, th. 2.3]:
Theorem 8.3 Let p ≥ 2, , ϒ be s.n. functions, f, g : R+ → C be Lebesgue
measurable functions and A, B ∈ B(H), such that
$ $
−t A ∗
||e h|| |f (t)| dt ≤ 1
2 2
and ||e−t B h||2 |g(t)|2 dt ≤ 1
R+ R+

for all h ∈ H satisfying ||h|| ≤ 1. Then for all X ∈ C (H)

X − L(fg)(A,B )X
≥

(I − L|f |2 (A∗,A )(I ))1/2 X(I − L|g|2 (B,B ∗)(I ))1/2

, (50)

under any of the following conditions:

(p) ∗
(a) ··= ϒ and (at least) one of A or B is normal operator,
(b) ||·|| ··= |||·||| and both A and B are normal operators.
More precisely, L|f |2 (A∗,A )(I ) (resp. L|g|2 (B,B ∗)(I )) in (50) can be
replaced by L|f |2 (A∗ + A)(I ) (resp. L|g|2 (B + B ∗ )) if A (resp. B) is normal.
Jocić et al. [28, th. 2.9] for convolutional norm inequalities for Laplace trans-
formers can be reformulated as:
Theorem 8.4 Let , ϒ be s.n. functions, p ≥ 2, α, β ∈ [0, 1], A, B ∈ B(H),
f, g : R+ → C be some Lebesgue measurable functions and X ∈ C (H).
) ∗
(a) If R+ (||e−t Ah||2 |f |2−2α ∗ |g|2−2β (t) + ||e−t B h||2 |f |2α ∗ |g|2β (t))dt < +∞
for all h ∈ H, then

||L(f ∗ g)(A,B )X|| ≤ (51)

2−2α 1 1
L |f | ∗ |g|2−2β (A∗,A )(I ) X L |f |2α ∗ |g|2β (B,B ∗)(I )
2 2

,

under any of the following conditions:

(a1) ||·|| ··= ||·||1 ,
(p) ∗
(a2) ··= ϒ and (at least) one of A or B is normal operator,
(a3) ||·|| ··= |||·||| and both A and B are normal operators;

)
(b) If R+ (||e−t Ah||2 |f |2−2α ∗ |g|2−2β (t) + ||e−t B h||2 |f |2α ∗ |g|2β (t)) dt < +∞
for all h ∈ H, then

||L(f ∗ g)(A,B )X|| ≤ (52)

2−2α 1 1
L |f | ∗ |g|2−2β (A∗,A )(I ) 2 X
L |f |2α ∗ |g|2β (B ∗,B )(I ) 2 .

under any of the following conditions:

(b1) ||·|| ··= ||·||2 ,
(b2) )··= ϒ and A is normal operator;
(p)

∗ ∗
(c) If R+ (||e−t A h||2 |f |2−2α ∗ |g|2−2β (t) + ||e−t B h||2 |f |2α ∗ |g|2β (t))dt < +∞
for all h ∈ H, then

||L(f ∗ g)(A,B )X|| ≤ (53)

1 1
L |f |2−2α ∗ |g|2−2β (A,A∗)(I ) 2 X L |f |2α ∗ |g|2β (B,B ∗)(I ) 2
.

under any of the following conditions:

(c1) ||·|| ··= ||·||2 ,
(c2) ··= ϒ and B is normal operator.
(p)

As L |f |2−2α ∗ |g|2−2β (A∗,A )(I ) = L |f |2−2α ∗ |g|2−2β (A∗ + A) if A is normal

and L |f |2α ∗ |g|2β (B,B ∗)(I ) = L |f |2α ∗ |g|2β (B + B ∗) if B is normal,
inequalities (51), (52) and (53) can be simplified in those cases.
Jocić et al. [27, lemma 2.5] guarantees the correctness of the following
Definition 8.5 For accretive operators A, B ∈ B(H) and X ∈ B(H) we define
√ √
V −(A, B)X = A∗ + Ae−T A Xe−T B B + B ∗,
def
[s ] lim
T →+∞

−
(A, B)X = lim e−T A Xe−T B .
def
[s ]
T →+∞

The previous Definition 8.5 is useful for the formulation of the next:
Theorem 8.6 Let α ∈ (0, 1), p ≥ 2, ϒ be a s.n. function and A, B, X ∈ B(H),
such that A and B are accretive.
(a) If AX + XB ∈ C1(H), then
√ √
A∗ + AX B + B ∗ − V −(A, B)X 1 ≤ (54)
1/2 1/2
I − −(A∗, A)I (AX + XB) I − −(B, B ∗)I 1 ≤ ||AX + XB ||1 .
√ √
Consequently, the left side of (54) can be replaced by A∗ + AX B + B ∗ 1
under additional conditions required in (b1) or (b2) of [27, lemma 2.7].

(b1) If, in addition, A is strictly accretive, B is normal and AX + XB ∈ Cϒ (p)(H),

then

X(B + B ∗)1/2 ϒ
(p) ≤ (A∗ + A) −1/2 (AX + XB)PR(B+B ∗) (p)
ϒ

≤ (A∗ + A) −1/2 (AX + XB) ϒ

(p) . (55)

(b2) Alternatively, if A is normal, B is strictly accretive and AX + XB ∈ Cϒ (p)(H),

then

(A∗ + A)1/2 X ϒ
(p) ≤ PR(A∗+A) (AX + XB)(B + B ∗)−1/2 (p)
ϒ

≤ (AX + XB)(B + B ∗)−1/2 ϒ

(p) . (56)

(c) Specially, in the case of the Hilbert-Schmidt class C2 (H) (i.e. if ··= and
p ··= 2), the requirement of normality for A or B can be omitted, with (55)
and (56) still remaining valid, if PR(B+B ∗) (resp. PR(A∗+A) ) is replaced by
1/2 1/2
I − −(B, B ∗)I (resp. I − −(A∗, A)I ).
(d) If at least one of operators A or B is additionally normal and AX + XB ∈
Cϒ (p)∗(H), then
√ √
A∗ + AX B + B ∗ ϒ
(p) ∗ ≤
1 1
I − −(A∗, A)I (AX + XB) I − −(B, B ∗)I 2
2
ϒ
(p) ∗ ≤ ||AX + XB ||ϒ (p) ∗ .

(e) If A and B are both normal, then

(A∗ + A)1−α X(B + B ∗)α ≤

2
(57)
$
∗
(2 − 2α) e−t B (B ∗ + B)α |AX + XB|2 (B ∗ + B)αe−t B t 2α−1 dt,
R+

and, if additionally AX + XB ∈ C|||·|||(H), then also

(A∗+A)1−αX(B +B ∗)α ≤ (2 − 2α)(2α) PR(A∗+A) (AX+XB)PR(B+B ∗)

≤ (2 − 2α)(2α)|||AX + XB ||| = (1−2α)π
sin 2απ |||AX + XB |||. (58)

(f) If p > 1 and p = p−1

p def
, then for all normal accretive operators A and B,
such that AX + XB ∈ C|||·|||(H),

√
p A B
(A∗ + A)1/pX(B + B ∗)1/p ≤ p
p p (2 − 2α)(2α) X+X .
p p

For the proof see [27, th. 2.9] and its proof, as well as [27, rem. 2.2]. In addition, for
the inequality (57) see [27, th. 2.8(d)] and its proof.
Thus, Theorem 8.6 provides extensions of Young's norm inequality [15,
cor. 4.1] in various directions, where the norm inequality (58) in part (e) extends
it from positive definite to normal accretive operators.
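The closed form of the constant in (58) can be checked with a few lines of code. The sketch below assumes that the constant is $\Gamma(2 - 2\alpha)\Gamma(2\alpha)$ (the Gamma symbols appear to have been lost in the extracted text, so this reading is an assumption) and compares it with $\frac{(1 - 2\alpha)\pi}{\sin 2\alpha\pi}$, which follows from $\Gamma(2 - 2\alpha) = (1 - 2\alpha)\Gamma(1 - 2\alpha)$ and Euler's reflection formula. The function names are hypothetical.

```python
import math

def young_constant(alpha):
    """Gamma(2 - 2*alpha) * Gamma(2*alpha) for 0 < alpha < 1."""
    return math.gamma(2.0 - 2.0 * alpha) * math.gamma(2.0 * alpha)

def closed_form(alpha):
    """(1 - 2*alpha) * pi / sin(2*alpha*pi), with the limit value 1 at alpha = 1/2."""
    if abs(alpha - 0.5) < 1e-12:
        return 1.0
    return (1.0 - 2.0 * alpha) * math.pi / math.sin(2.0 * alpha * math.pi)
```

For example, at $\alpha = 1/4$ both expressions equal $\Gamma(3/2)\Gamma(1/2) = \pi/2$.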
The next Theorem 8.7 represents the main part of [27, th. 2.10].
Theorem 8.7 Let X ∈ C|||·|||(H), (αn )∞
n=1 be a sequence
∞
∞ in (0, 1), (tn )n=1 be a
summable sequence in (0, +∞), f : C → C : z $→ n=1 (1 + tn z) and let each
of the absolutely summable (in B(H)) families {An }∞ ∞
n=1 and {Bn }n=1 consists of
normal, accretive and mutually commuting operators. Then
∞
1/2 ∞
1/2
|||X||| ≤ (I + A∗n + An ) X (I + Bn + Bn∗)
n=1 n=1
∞

≤ (I ⊗I + An ⊗I + I ⊗Bn )X .
n=1

Specially, if A and B are normal, accretive operators, then

|||X||| ≤ f (A∗ + A)X f (B + B ∗)

≤ f (A⊗I + I ⊗B)X ≤ f (I + A∗A)X f (I + BB ∗ ) .

The entire function s : C → C : z $→ ∞ zn
n=0 (2n+1)! satisfies a functional relation
= zs(z
sinh(z) 2 ) for all z ∈ C and it admits an infinite product representation
∞
s(z) = n=1 1 + n2zπ 2 , playing the prominent role in
Corollary 8.8 If X ∈ C|||·|||(H) and A, B, C, D ∈ B(H) are such that A and B are
normal and accretive, C = C ∗ and D = D ∗, then

|||X||| ≤ s(A∗ + A)X s(B + B ∗) ≤

|||s(A⊗I + I ⊗B)X||| ≤ s(I + A∗A)X s(I + BB ∗ ) , (59)
|||CX + XD||| ≤ |||sinh(C⊗I + I⊗D)X||| = (1/2) |||e^{C} X e^{D} − e^{−C} X e^{−D}|||. (60)

Corollary 8.8 was earlier presented in [27, cor. 2.13]. For its connections with
[31, prop. 21(1)] see [27, rem. 2.5].
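The two properties of the function s quoted before Corollary 8.8 can be checked numerically. The following is only a sketch: the helper names `s_series` and `s_product`, the test point `z`, and the truncation levels are illustrative assumptions, not part of the cited results.

```python
import cmath

def s_series(z, terms=60):
    """Partial sum of s(z) = sum_{n>=0} z**n / (2n+1)!  (hypothetical helper)."""
    total = 0j
    term = 1 + 0j  # n = 0 term: z**0 / 1!
    for n in range(terms):
        total += term
        # Ratio of consecutive terms is z / ((2n+2)(2n+3)).
        term *= z / ((2 * n + 2) * (2 * n + 3))
    return total

def s_product(z, factors=200_000):
    """Truncated product prod_{n=1}^{factors} (1 + z / (n*pi)**2)  (hypothetical helper)."""
    value = 1 + 0j
    for n in range(1, factors + 1):
        value *= 1 + z / (n * cmath.pi) ** 2
    return value

z = 0.7 + 0.3j  # arbitrary test point
# Functional relation sinh(z) = z * s(z**2).
relation_gap = abs(cmath.sinh(z) - z * s_series(z * z))
# The truncated product converges slowly (tail of order |w| / (pi**2 * factors)).
product_gap = abs(s_product(z * z) - s_series(z * z))
print(relation_gap, product_gap)
```

The product is truncated, so only moderate agreement with the series should be expected from it, whereas the functional relation holds up to rounding.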
Cauchy–Schwarz Inequalities for i.p.t. Transformers 211

For 2-hyper-accretive and 2-hyper-contractive operators [28, th. 3.6] says:

Theorem 8.9 Let A, B, C, D, X ∈ B(H), ,ϒ be s.n. functions and p ≥ 2.
If A, B ∗ are 2-hyper-accretive and A2X + 2AXB + XB 2 ∈ C (H), then

(A∗2 + 2A∗A + A2 )1/2 X(B 2 + 2BB ∗ + B ∗2 )1/2

≤

A2X + 2AXB + XB 2
,

under any of the following conditions:

(a1) ||·|| ··= ||·||1 and X ∈ K(H),
(p) ∗
(a2) ··= ϒ and (at least) one of operators A or B is normal,
(a3) ||·|| ··= |||·||| and both A and B are normal operators.
If C, D ∗ are 2-hyper-contractions and X − 2CXD + C 2XD 2 ∈ C (H), then

(I − 2C ∗C + C ∗2 C 2 )1/2 X(I − 2DD ∗ + D 2D ∗2 )1/2

≤

X − 2CXD + C 2XD 2
,

under any of the following conditions:

(b1) ||·|| ··= ||·||1 and X ∈ K(H),
(p) ∗
(b2) ··= ϒ and (at least) one of operators C or D is normal,
(b3) ||·|| ··= |||·||| and both C and D are normal operators.
Under stronger conditions X ∈ C (H) we have the following reformulation of [27,
th. 2.15]:
Theorem 8.10 If , ϒ are s.n. functions, p ≥ 2, M , N ∈ N are such that (M +
N )/2 ∈ N and A, B ∈ B(H) are such that A is M -hyper-accretive and B ∗ is N -
hyper-accretive, then for all X ∈ C (H)

M
M 1/2
N
N 1/2
∗m M −m
m A A X n B N −n B ∗n
m=0 n=0

√ +N )/2
(M
M+N M+N
−k
≤ M+N
(M−1)!(N−1)! 2 A 2 XB k ,
k
2 −1 !
k=0

under any of the following conditions:

(a1) ||·|| ··= ||·||1 ,
(p) ∗
(a2) ··= ϒ and (at least) one of operators A or B is normal,
(a3) ||·|| ··= |||·||| and both A and B are normal operators.

9 Applications to Refinements and Generalizations

of Minkowski, Zhan and Heron Inequalities

We start this section with the shortened version of [17, th. 2.5], which provides
Minkowski inequality for p-modified u.i. norms.

Theorem 9.1 Let p ≥ 2, {Bn }n∈N be a sequence in B(H), such that n∈N Bn
in the strong operator topology and let {βn }n∈N be a sequence in (0, +∞)
converges
satisfying n∈N βn < +∞. Then

1
p p p p 1− p1 1
Bm Bn p
1−p p
Bn + 12 βm2 βn2 − ≤ βn βn |Bn |p .
βm βn
n∈N m,n∈N n∈N n∈N

If 0 = Bn ∈ C|||·||| (H) for all n ∈ N and n∈N |||Bn |||(p) < +∞, then
(p)

1
p p p
Bm Bn p p
Bn ≤ Bn + 12 |||Bm |||(p)
2
|||Bn |||(p)
2
−
(p)
|||Bm |||(p) |||Bn |||(p)
n∈N n∈N m,n∈N

1− p1
1
p
≤ |||Bn |||(p) |||Bn |||(p)
1−p
|Bn |p ≤ |||Bn |||(p) . (61)
n∈N n∈N n∈N

Here we used the notation $|||\cdot|||_{(p)} \overset{\mathrm{def}}{=} \|\cdot\|_{\Upsilon^{(p)}}$, where the s.n. function ϒ is uniquely
determined by $\|\cdot\|_{\Upsilon} = |||\cdot|||$, as well as the monotonicity property of u.i. norms to
get the first of the three inequalities in (61).
We also have the following refined norm inequality for bi-infinite operator
matrices, given as a part of [17, th. 2.8]:
Theorem 9.2 If 2 ≤ p < +∞, then for all [Am,n ]m,n∈Z ∈ C|||·||| (2Z (H))
(p)

2 p
p p
[Am,n ]m,n∈Z ≤ [Am,n ]m,n∈Z + 12 [Ak,n ]n∈Z 2 [Al,n ]n∈Z 2
(p) (p) (p)
k,l∈Z
2 2 p 2
[Ak,n ]n∈Z [Al,n ]n∈Z p 2
× 2
− 2
≤ [Am,n ]n∈Z ≤
[Ak,n ]n∈Z [Al,n ]n∈Z m∈Z
(p)
(p) (p)

2
p p |A∗k,m |2 |A∗l,m |2 p p
[A∗n,m ]n∈Z + 12 |||A∗k,m |||(p) |||A∗l,m |||(p)
p 2 2
−
|||A∗k,m |||(p)
2 |||A∗l,m |||(p)
2
m∈Z k,l∈Z

≤ |||Am,n |||(p)
2
.
m,n∈Z

p
If q ≥ p = if C|||·||| (2Z (H)) is the associated ideal of compact operators
def
p−1 , (q)
∗ ( 2 (H)), then for all [A
and |||·|||(q)∗ is a norm on its dual space C|||·||| Z m,n ]m,n∈Z ∈
(q)
∗ (2 (H))
C|||·||| Z
(q)

p
p
p
Am,n m,n∈Z ≥ Am,n ≥ Am,n .
(q)∗ n∈Z (q)∗ (q)∗
m∈Z m,n∈Z

We proceed with another generalization of the case n ··= 1 in Young norm

inequality (60) in [15, th. 4.3] and its consequences to refinements and general-
izations of Zhan inequality, provided in the wider form in [27, th. 2.11]:
Theorem 9.3 Let A, B, X ∈ B(H), A ≥ 0, B ≥ 0, r1 , . . . , rN ∈ (0,+∞), η, θ,
θ1 , . . . , θN ∈ [−π, π), α, α1 , . . . , αN ∈ (0, 1) for some N ∈ N and β ∈ [0, 1].
(a) If eiηAX + eiθXB ∈ C|||·|||(H), then

eiη + eiθ A1−α XB α ≤ (2 − 2α)(2α) eiηAX + eiθXB .

(b) If A2 X + 2 cos θ AXB + XB 2 ∈ C|||·|||(H), then

(2 + 2 cos θ )|||A2−2α XB 2α ||| ≤ (2 − 2α)(2α) A2X+ 2 cos θ AXB +XB 2 ,

√ √
(2 + 2 cos θ )|||AXB ||| ≤ (1 + cos θ ) A A1−β XB β + Aβ XB 1−β B
≤ 1+cos θ
2 A1−β (AX + XB)B β + Aβ (AX + XB)B 1−β
≤ 1+cos
2
θ
A2X + 2AXB + XB 2 ≤ A2X + 2 cos θ AXB + XB 2 .

(c) If c0 , . . . , cN ∈ C satisfy cN = 0, Nn=0 cn zn = cN Nn=1 z− rn eiθn for all
def

N
z ∈ C and n=0 cn AnXB N −n ∈ C|||·|||(H), then

N
N N
2 |cN |
N
|sin(θn /2)|rnαn AN − n=1 αn XB n=1 αn
n=1

N

N

≤ (2 − 2αn )(2αn ) cn AnXB N −n .

n=1 n=0

N
def
N
2
(d) If a sequence {cn }Nn=−N in C satisfies cn eint = eit − rn eiθn for all
n=−N n=1
t ∈ R, cN = 0 and Nn=−N cn AN +nXB N −n ∈ C|||·|||(H), then

N

N

2N |cN | (1 − cos θn ) ANXB N ≤ cn AN +nXB N −n .

n=1 n=−N

N N
(e) If n=0 n einη+i(N −n)θ AnXB N −n ∈ C|||·|||(H), then
N N
2N cos η−θ AN −
N
n=1 αn XB n=1 αn
2

N

N
N
≤ (2 − 2αn )(2αn ) n einη+i(N −n)θAnXB N −n and
n=1 n=0
N +1 N N
sin 2 η◦
η AN − n=1 αn XB n=1 αn
sin 2◦

N

N

≤ (2 − 2αn )(2αn ) einη◦AnXB N −n ,

n=1 n=0

N inη◦ AnXB N −n
if n=0 e ∈ C|||·|||(H) for some η◦ ∈ (0, 2π).
2N

(f) If θ◦ ∈ (−π, π) and 2N i(N −n)θ◦ AnXB 2N −n − T AN XB N ∈ C (H) for
n e |||·|||
n=0
2N

0 ≤ T < 22N cos2N θ2◦ , then 2N
n ei(2N −n)θ◦AnXB 2N −n ∈ C|||·|||(H) and
n=0

2N
2N
n ei(N −n)θ◦ − T ANXB N ≤
n=0

2N
2N
1− θ
T
n ei(N −n)θ◦AnXB 2N −n ≤
22N cos2N 20
n=0

2N
2N
n ei(N −n)θ◦AnXB 2N −n − T ANXB N .
n=0

Some relations between Young and Heron means norm inequalities are investi-
gated in [27, cor. 2.12], which we present here in the next reduced form:
Corollary 9.4 A, B, X ∈ B(H), A ≥ 0, B ≥ 0, 0 ≤ β < 1/2 ≤ α ≤ α , such
√ If √
that (1 − α ) AX B + α (AX + XB)/2 ∈ C|||·|||(H), then

√ √ $
β 1 1−β−t t−β t−β 1−t−β β 1
AX B ≤ 2(1−1−2β) A 2 + 4 A 2 XB 2 +A 2 XB 2 dt B 2 +4 ≤
[β,1−−β]
$
3 1 1 3
A 4 − 2 XB 4 + 2 + A 4 + 2 XB 4 − 2 dt ≤
t t t t
1
2(1−−2β)
[β,1−−β]
3 β β β β
4−2
XB 4 + 2 + A 4 + 2 XB 4 − 2 ≤
1 1 3
1
2 A
1−β 1 1 β β 1 1 1−β
1
4 A 2 A 2 X + XB 2 B 2 + A 2 A 2 X + XB 2 B 2 ≤

√ √
1
4 AX + 2 AX B + XB ≤
√ √ √ √ α
(1 − α) AX B + α2 (AX+XB) ≤ (1 − α ) AX B + 2 (AX+XB) .

For additional insight into these topics see also [4, 11], [27, rem. 2.3, rem. 2.4],
[29, 34] and references therein.

10 Connections with Cauchy–Schwarz Inequalities for

Hilbert Modules

I.p.t. transformers on B(H) belong to a more general class of transformers
on C∗-algebras in the framework of Hilbert C∗-modules. Namely, the space
$L^2_G(\Omega, \mu, B(H))$ of weak∗-measurable, square integrable functions represents a
Hilbert C∗-module over the C∗-algebra B(H), with the inner product defined by
\[
\langle A, B\rangle \overset{\mathrm{def}}{=} \int_{\Omega}^{w^*} A_t^* B_t \, d\mu(t) \quad \text{for all } A, B \in L^2_G(\Omega, \mu, B(H)), \tag{62}
\]
where $\int_{\Omega}^{w^*} A_t^* B_t \, d\mu(t)$ denotes the w∗ (or Gel’fand) integral of the B(H)-valued w∗ (or
Gel’fand) integrable function $A^*B : \Omega \to B(H) : t \mapsto A_t^* B_t$. A wider class of
semi-inner product modules is treated in [15, th. 2.1] and [20, lemma 1.3]. Thus,
\[
\int_{\Omega}^{w^*} A_t^* \otimes B_t \, d\mu(t) : B(H) \to B(H) : X \mapsto \int_{\Omega}^{w^*} A_t^* X B_t \, d\mu(t) = \langle A, X\cdot B\rangle,
\]
where $X\cdot B : \Omega \to B(H) : t \mapsto XB_t$, which explains the origin of the name for the inner product
type transformers considered in this and previous papers.
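If $(\mathcal{M}, \langle\cdot,\cdot\rangle)$ is a semi-inner product module over a C∗-algebra $\mathcal{A}$, then the Cauchy–
Schwarz inequality asserts that
\[
|\langle A, B\rangle|^2 = \langle B, A\rangle\langle A, B\rangle \le \|\langle A, A\rangle\| \, \langle B, B\rangle \quad \text{for all } A, B \in \mathcal{M}. \tag{63}
\]
By applying (63) to X·B instead of B it follows that
\[
|\langle A, X\cdot B\rangle|^2 = \langle X\cdot B, A\rangle\langle A, X\cdot B\rangle \le \|\langle A, A\rangle\| \, \langle X\cdot B, X\cdot B\rangle \quad \text{for all } A, B \in \mathcal{M},\ X \in \mathcal{A},
\]
which for $\mathcal{M} := L^2_G(\Omega, \mu, B(H))$ becomes the Cauchy–Schwarz inequality (2).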
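In [1] semi-inner product $\mathcal{A}$-modules are considered with respect to alternated
semi-inner products. For an arbitrary $C \in \mathcal{M}$ another semi-inner product on
$\mathcal{M}$ is correctly defined by $\langle\cdot,\cdot\rangle_C : \mathcal{M}\times\mathcal{M} \to \mathcal{A} : (A, B) \mapsto \|\langle C, C\rangle\|\langle A, B\rangle - \langle A, C\rangle\langle C, B\rangle$.
Applying (63) to X·B instead of B and to $\langle\cdot,\cdot\rangle_C$ instead of $\langle\cdot,\cdot\rangle$
gives
\[
\bigl| \|\langle C, C\rangle\| \langle A, X\cdot B\rangle - \langle A, C\rangle\langle C, X\cdot B\rangle \bigr|^2
\le \bigl\| \|\langle C, C\rangle\| \langle A, A\rangle - |\langle C, A\rangle|^2 \bigr\| \, \bigl( \|\langle C, C\rangle\| \langle X\cdot B, X\cdot B\rangle - |\langle C, X\cdot B\rangle|^2 \bigr) \tag{64}
\]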

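for all $A, B \in \mathcal{M}$ and $X \in \mathcal{A}$. A special case $\mathcal{M} := L^2_G(\Omega, \mu, B(H))$ for a
probability measure μ on Ω, $A^*, B \in L^2_G(\Omega, \mu, B(H))$ and $C_t := I$ for all t ∈ Ω
provides the Grüss–Landau operator inequality (26) for η := 1, which then implies
the validity of Theorem 5.5 for all η ∈ [0, 1]. If in addition $\int_\Omega A_t \, d\mu(t) = 0$ then
we have the strengthened Cauchy–Schwarz inequality
\[
\Bigl| \int_\Omega A_t X B_t \, d\mu(t) \Bigr|^2 \le \Bigl\| \int_\Omega A_t A_t^* \, d\mu(t) \Bigr\| \Bigl( \int_\Omega B_t^* X^* X B_t \, d\mu(t) - \Bigl| X \int_\Omega B_t \, d\mu(t) \Bigr|^2 \Bigr),
\]
which improves the inequality (2) in Theorem 2.1 if $\int_\Omega A_t \, d\mu(t) = 0$. Another
special case X := I coincides with the inequality [1, (2.7)]. More generally, in
the special case when $\langle C, A\rangle = 0$ the inequality (64) provides the i.p.t. transformer’s
Ostrowski inequality on a semi-inner product module over a C∗-algebra
\[
|\langle A, X\cdot B\rangle|^2 \le \frac{\|\langle A, A\rangle\|}{\|\langle C, C\rangle\|} \bigl( \|\langle C, C\rangle\| \langle X\cdot B, X\cdot B\rangle - |\langle C, X\cdot B\rangle|^2 \bigr), \tag{65}
\]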

which also generalizes the inequality (63). Again, the special case X := I coincides
with the inequality [1, (2.8)]. Alternatively, if $\langle C, B\rangle = 0$ and X := I, then (64)
implies another Ostrowski inequality on C∗-modules
\[
|\langle A, B\rangle|^2 \le \Bigl\| \langle A, A\rangle - \frac{|\langle C, A\rangle|^2}{\|\langle C, C\rangle\|} \Bigr\| \, \langle B, B\rangle, \tag{66}
\]
which also improves the Cauchy–Schwarz inequality (63) if $\langle C, B\rangle = 0$.

Alternated semi-inner products also enable the authors of [1] to improve an inequality
related to the Gram matrix and to obtain a sequence of nested inequalities that
emerges from the Cauchy–Schwarz inequality.
The framework of Hilbert C∗-modules and semi-inner product C∗-modules over
unital C∗-algebras proves to be very useful for generalizing many other classical
inequalities. In [7] the polar decomposition and the operator geometric mean are used to
present the Cauchy–Schwarz inequality and to give several additive and multiplicative
type reverses of it in this setting. Further, this allows the authors to obtain various
operator inequalities on Hilbert C∗-modules, including the Kantorovich inequality,
the Pólya–Szegö inequality, the covariance-variance inequality, the Ozeki–Izumino–Mori–
Seo inequality, the Wielandt inequality, the Heinz–Kato–Furuta inequality and the Malamud
inequality.
Another alternated semi-inner product $\langle\cdot,\cdot\rangle_T : \mathcal{M}\times\mathcal{M} \to \mathcal{A} : (A, B) \mapsto \langle A, TB\rangle$
for an arbitrary positive $T \in B(\mathcal{M})$ was considered in [6], where a polar
decomposition $\langle A, TB\rangle = V|\langle A, TB\rangle|$ with a partial isometry $V \in \mathcal{A}$ is used
to prove the generalized Cauchy–Schwarz inequality on a semi-inner product module over a C∗-algebra
\[
|\langle A, TB\rangle| \le \bigl(V^*\langle A, TA\rangle V\bigr) \# \langle B, TB\rangle. \tag{67}
\]

Here, $C\#D \overset{\mathrm{def}}{=} C^{1/2}(C^{-1/2} D C^{-1/2})^{1/2} C^{1/2}$ denotes the operator geometric mean
for positive $C, D \in \mathcal{A}$, if C is invertible. For the definition of C#D if C is
not invertible see [9, ex. 3.3.1(3)]. The special case in which T is the identity
operator $I_{\mathcal{M}}$ on $\mathcal{M}$ gives the Cauchy–Schwarz inequality [6, (3.1)], and the Kantorovich
inequality on Hilbert C∗-modules is also given by [6, th. 4.4]. An application of (67)
to $T := I_{\mathcal{M}}$ and to the semi-inner product $\langle\cdot,\cdot\rangle_C$ gives

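\[
\begin{aligned}
\bigl| \|\langle C, C\rangle\| \langle A, B\rangle - \langle A, C\rangle\langle C, B\rangle \bigr|
&\le V^*\bigl( \|\langle C, C\rangle\| \langle A, A\rangle - |\langle C, A\rangle|^2 \bigr) V \,\#\, \bigl( \|\langle C, C\rangle\| \langle B, B\rangle - |\langle C, B\rangle|^2 \bigr) \\
&\le \bigl\| \|\langle C, C\rangle\| \langle A, A\rangle - |\langle C, A\rangle|^2 \bigr\|^{1/2} \bigl( \|\langle C, C\rangle\| \langle B, B\rangle - |\langle C, B\rangle|^2 \bigr)^{1/2},
\end{aligned} \tag{68}
\]
which applied to X·B instead of B for $X \in \mathcal{A}$ leads to a refined version of the
inequality (64).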
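For matrices, the operator geometric mean used in (67) and (68) can be computed directly from its defining formula. The sketch below is illustrative only: the helper names `psd_sqrt`, `geometric_mean` and `random_pd`, the matrix size and the random test data are assumptions. It checks the symmetry C#D = D#C, the Riccati identity (C#D)C⁻¹(C#D) = D, and the estimate C#D ≤ ‖C‖^{1/2} D^{1/2} that underlies the last step of (68).

```python
import numpy as np

def psd_sqrt(P):
    """Square root of a Hermitian positive semidefinite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(P)
    w = np.clip(w, 0.0, None)          # guard against tiny negative rounding errors
    return (V * np.sqrt(w)) @ V.conj().T

def geometric_mean(C, D):
    """C # D = C^{1/2} (C^{-1/2} D C^{-1/2})^{1/2} C^{1/2} for invertible positive C."""
    C_half = psd_sqrt(C)
    C_inv_half = np.linalg.inv(C_half)
    inner = C_inv_half @ D @ C_inv_half
    inner = (inner + inner.conj().T) / 2  # re-symmetrize before taking the root
    return C_half @ psd_sqrt(inner) @ C_half

rng = np.random.default_rng(0)

def random_pd(n):
    """Random Hermitian positive definite matrix (illustrative test data)."""
    G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return G @ G.conj().T + np.eye(n)

C, D = random_pd(4), random_pd(4)
G = geometric_mean(C, D)

symmetry_gap = np.linalg.norm(G - geometric_mean(D, C))
riccati_gap = np.linalg.norm(G @ np.linalg.inv(C) @ G - D)
# Smallest eigenvalue of ||C||^{1/2} D^{1/2} - C#D; nonnegative up to rounding.
bound_margin = np.linalg.eigvalsh(
    np.sqrt(np.linalg.norm(C, 2)) * psd_sqrt(D) - G
).min()
print(symmetry_gap, riccati_gap, bound_margin)
```

The eigendecomposition route is chosen here only because it keeps the sketch dependent on NumPy alone.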
Like all C∗-algebras, B(H) itself represents a Hilbert C∗-module over B(H) with the
inner product $\langle A, B\rangle \overset{\mathrm{def}}{=} A^*B$ for all A, B ∈ B(H), so the special case T := I
in (67) says $|A^*B| \le (V^*A^*AV)\#B^*B$, which generalizes the matrix inequality
(4.2) in [33, lemma 4.2].

We conclude this paper with the following
Remark 10.1 The inequality |||A∗XB|||² ≤ |||AA∗X||| · |||XBB∗||| for all A, B, X
∈ B(H) and u.i. norms |||·||| in [3, (7)] is also known as the Cauchy–Schwarz
inequality for u.i. norms, which for X := I implies |||A∗B|||² ≤ |||AA∗||| · |||BB∗||| =
|||A∗A||| · |||B∗B|||. In the framework of matrices this inequality was generalized in
[2], saying that |||A∗B|||² ≤ |||ηAA∗ + (1 − η)BB∗||| · |||(1 − η)AA∗ + ηBB∗||| for
all η ∈ [0, 1].
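The two inequalities of Remark 10.1 can be spot-checked for matrices on a few familiar unitarily invariant norms. This is a sketch under stated assumptions: the helper `ui_norms`, the selection of norms, the matrix size and the random data are all illustrative choices, and a numerical check on one sample is of course no substitute for the proofs in [2, 3].

```python
import numpy as np

def ui_norms(M):
    """A few unitarily invariant norms of M, computed from its singular values."""
    s = np.linalg.svd(M, compute_uv=False)  # sorted in decreasing order
    return {
        "operator": s[0],
        "ky_fan_2": s[:2].sum(),
        "frobenius": np.sqrt((s ** 2).sum()),
        "trace": s.sum(),
    }

rng = np.random.default_rng(2)

def crandn(n):
    """Random complex n x n matrix (illustrative test data)."""
    return rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

A, B, X = crandn(5), crandn(5), crandn(5)
Ah, Bh = A.conj().T, B.conj().T

# Cauchy-Schwarz inequality for u.i. norms: |||A*XB|||^2 <= |||AA*X||| |||XBB*|||.
lhs = ui_norms(Ah @ X @ B)
left, right = ui_norms(A @ Ah @ X), ui_norms(X @ B @ Bh)
cs_margins = {k: left[k] * right[k] - lhs[k] ** 2 for k in lhs}

# Interpolated version: |||A*B|||^2 <= |||eta AA* + (1-eta) BB*||| |||(1-eta) AA* + eta BB*|||.
ab = ui_norms(Ah @ B)
interp_margins = []
for eta in np.linspace(0.0, 1.0, 11):
    p = ui_norms(eta * A @ Ah + (1 - eta) * B @ Bh)
    q = ui_norms((1 - eta) * A @ Ah + eta * B @ Bh)
    interp_margins.extend(p[k] * q[k] - ab[k] ** 2 for k in ab)

print(cs_margins, min(interp_margins))
```

All printed margins should be nonnegative up to rounding.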

Acknowledgments Authors were partially supported by MPNTR grant No. 174017, Serbia

References

1. L. Arambašić, D. Bakić, M.S. Moslehian, A treatment of the Cauchy–Schwarz inequality in
C∗-modules. J. Math. Anal. Appl. 381, 546–556 (2011)
2. K.M.R. Audenaert, Interpolating between the arithmetic–geometric mean and Cauchy–
Schwarz matrix norm inequalities. Oper. Matr. 9, 475–479 (2015)
3. R. Bhatia, C. Davis, A Cauchy–Schwarz inequality for operators with applications. Linear
Algebra Appl. 223/224, 119–129 (1995)

4. R. Bhatia, F. Kittaneh, The matrix arithmetic–geometric mean inequality revisited. Linear
Algebra Appl. 428, 2177–2191 (2008)
5. J. Diestel, J.J. Uhl, Vector Measures. Mathematical Surveys and Monographs, vol. 15
(American Mathematical Society, Providence, 1977), MR56:12216
6. J.I. Fujii, M. Fujii, M.S. Moslehian, Y. Seo, Cauchy–Schwarz inequality in semi-inner product
C∗ -modules via polar decomposition, J. Math. Anal. Appl. 394, 835–840 (2012)
7. J.I. Fujii, M. Fujii, Y. Seo, Operator inequalities in Hilbert C∗ -modules via the Cauchy–
Schwarz inequality. Math. Inequal. Appl. 17, 295–315 (2014)
8. E. Heinz, Beiträge zur Störungstheorie der Spektralzerlegung. Math. Ann. 123, 415–438
(1951)
9. F. Hiai, Matrix analysis: matrix monotone functions, matrix means and majorization. Interdis-
cip. Inf. Sci. 16, 139–248 (2010)
10. F. Hiai, H. Kosaki, Comparison of various means for operators. J. Funct. Anal. 163, 300–323
(1999)
11. F. Hiai, H. Kosaki, Means of Hilbert Space Operators. Lecture Notes in Mathematics (Springer,
Berlin, 2003)
12. D.R. Jocić, Norm inequalities for self-adjoint derivations. J. Funct. Anal. 145, 24–34 (1997)
13. D.R. Jocić, Cauchy–Schwarz and means inequalities for elementary operators into norm ideals.
Proc. Am. Math. Soc. 126, 2705–2711 (1998)
14. D.R. Jocić, The Cauchy–Schwarz norm inequality for elementary operators in Schatten ideals.
J. London. Math. Soc. 60, 925–934 (1999)
15. D.R. Jocić, Cauchy–Schwarz norm inequalities for weak*-integrals of operator valued func-
tions. J. Funct. Anal. 218, 318–346 (2005)
16. D.R. Jocić, Interpolation norms between row and column spaces and the norm problem for
elementary operators. Linear Algebra Appl. 430, 2961–2974 (2009)
17. D.R. Jocić, Clarkson-McCarthy inequalities for several operators and related norm inequalities
for p-modified unitarily invariant norms. Complex Anal. Oper. Theory 13, 583–613 (2019)
18. D.R. Jocić, M. Lazarević, Cauchy-Schwarz norm inequalities for elementary operators and
inner product type transformers generated by families of subnormal operators. Mediterr. J.
Math. 19, 49 (2022). https://doi.org/10.1007/s00009-021-01919-x
19. D.R. Jocić, S. Milošević, Refinements of operator Cauchy–Schwarz and Minkowski inequal-
ities for p-modified norms and related norm inequalities. Linear Algebra Appl. 488, 284–301
(2016)
20. D.R. Jocić, D. Krtinić, M. Sal Moslehian, Landau and Grüss type inequalities for inner product
type integral transformers in norm ideals. Math. Ineq. Appl. 16, 109–125 (2013)
21. D.R. Jocić, S. Milošević, V. Durić, Norm inequalities for elementary operators and other inner
product type integral transformers with the spectra contained in the unit disc. Filomat 31, 197–
206 (2017)
22. D.R. Jocić, D. Krtinić, M. Lazarević, P. Melentijević, S. Milošević, Refinements of inequalities
related to Landau-Grüss inequalities for elementary operators acting on ideals associated to p-
modified unitarily invariant norms. Complex Anal. Oper. Theory 12, 195–205 (2018)
23. D.R. Jocić, M. Lazarević, S. Milošević, Norm inequalities for a class of elementary operators
generated by analytic functions with non-negative Taylor coefficients in ideals of compact
operators related to p-modified unitarily invariant norms. Linear Algebra Appl. 540, 60–83
(2018)
24. D.R. Jocić, M. Lazarević, S. Milošević, Inequalities for generalized derivations of operator
monotone functions in norm ideals of compact operators. Linear Algebra Appl. 586, 43–63
(2020)
25. D.R. Jocić, M. Lazarević, S. Milošević, Corrigendum to “Inequalities for generalized deriva-
tions of operator monotone functions in norm ideals of compact operators” [Linear Algebra
Appl. 586 (2020) 43–63]. Linear Algebra Appl. 599, 201–204 (2020)
26. D.R. Jocić, D. Krtinić, M. Lazarević, Cauchy–Schwarz inequalities for inner product type
transformers in Q∗ norm ideals of compact operators. Positivity 24, 933–956 (2020)

27. D.R. Jocić, D. Krtinić, M. Lazarević, Extensions of the arithmetic-geometric means and
Young’s norm inequalities to accretive operators, with applications. Linear Multilinear Algebra
(2021). https://doi.org/10.1080/03081087.2021.1900049
28. D.R. Jocić, D. Krtinić, M. Lazarević, Laplace transformers in norm ideals of compact
operators. Banach J. Math. Anal. 15, 67 (2021). https://doi.org/10.1007/s43037-021-00149-3
29. R. Kaur, M.S. Moslehian, M. Singh, C. Conde, Further refinements of the Heinz inequality.
Linear Algebra Appl. 447, 26–37 (2014)
30. H. Kosaki, Positive Definiteness of Functions with Applications to Operator Norm Inequalities.
Memoirs of the American Mathematical Society, vol. 212, no. 997 (American Mathematical
Society, Providence, 2011)
31. G. Larotonda, Norm inequalities in operator ideals. J. Funct. Anal. 255, 3208–3228 (2008)
32. M. Lazarević, Grüss-Landau inequalities for elementary operators and inner product type
transformers in Q and Q* norm ideals of compact operators. Filomat 33, 2447–2455 (2019)
33. R. Nakayama, Y. Seo, R. Tojo, Matrix Ostrowski inequality via the matrix geometric mean. J.
Math. Inequal. 14, 1375–1382 (2020)
34. J. Zhao, J. Wu, Some operator inequalities for unitarily invariant norms. Ann. Funct. Anal. 8,
240–247 (2017)
Norm Estimations for the Moore-Penrose
Inverse of the Weak Perturbation
of Hilbert C ∗ -Module Operators

Chunhong Fu, Dingyi Du, Liuhui Huang, and Qingxiang Xu

Abstract Let A be a C∗-algebra, H and K be Hilbert A-modules and L(H, K) be
the set of all adjointable operators from H to K. A multiplicative perturbation M of
a Moore-Penrose invertible operator T ∈ L(H, K) has the form M = ETF∗ with
E ∈ L(K) and F ∈ L(H), which can be expressed alternately as
\[
M = ETT^\dagger \cdot T \cdot (FT^\dagger T)^* = L_{E,T} \cdot T \cdot R_{F,T}^*,
\]
where T† is the Moore-Penrose inverse of T and
\[
L_{E,T} = ETT^\dagger + I_K - TT^\dagger, \qquad R_{F,T} = FT^\dagger T + I_H - T^\dagger T.
\]
In view of the above ETT†, FT†T, L_{E,T} and R_{F,T}, the relationship between
various types of multiplicative perturbations is investigated, and formulas for M†,
MM† and M†M are derived in the case that M is a weak perturbation of T. Based
on these derived formulas, some norm computations are carried out by using certain
C∗-algebraic techniques, through which some norm estimations for the Moore-
Penrose inverse are obtained.

Keywords Hilbert C ∗ -module · Moore–Penrose inverse · Multiplicative

perturbation · Weak perturbation · Strong perturbation

1 Introduction

Let Mm×n(C) be the set of all m × n complex matrices, and let In and 0n be the identity
matrix and zero matrix in Mn(C), respectively. For every A ∈ Mm×n(C), let A† and
‖A‖ denote the Moore-Penrose inverse and the spectral norm of A, respectively.

C. Fu
Health School Attached to Shanghai University of Medicine & Health Sciences, Shanghai, PR
China
D. Du · L. Huang · Q. Xu ()
Department of Mathematics, Shanghai Normal University, Shanghai, PR China

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
R. M. Aron et al. (eds.), Operator and Norm Inequalities and Related Topics,
Trends in Mathematics, https://doi.org/10.1007/978-3-031-02104-6_7

Given any M, T ∈ Mm×n (C), if ran(M) = ran(T ), then clearly there exist two
matrices E ∈ Mm (C) and F ∈ Mn (C) such that

M = ET F ∗ , where both E and F are nonsingular. (1)

The matrix M of the form (1) is called a multiplicative perturbation of T .

One research field associated to the multiplicative perturbation (1) is the study
of representations for M † , which can be applied to the derivation of formulas for
the Moore-Penrose inverse of certain 2 by 2 block matrices [3]. Another research
field associated to (1) is the study of norm estimations for M † − T † , which can be
carried out by directly using the SVD of M and T [8, 20] or by using the Halmos’
two projections theorem [5] (a theorem also known for Hilbert space operators
[1]).
One generalization of (1) is the case that both E and F may fail to be square. For
instance, the rank-revealing decomposition of a matrix was considered in [2], which
can be applied to the study of the accurate solutions of structured least squares
problems. Another generalization of (1) is the case that E and F are still square
matrices, whereas both of them may be singular [4].
Let M be a multiplicative perturbation of T ∈ Mm×n(C) given by (12), where
both E ∈ Mm(C) and F ∈ Mn(C) may be singular, so it may happen that ran(M) ≠
ran(T). Inspired by [3], an alternative expression of M was given in [21] as
\[
M = L_{E,T} \cdot T \cdot R_{F,T}^*, \tag{2}
\]
where L_{E,T} and R_{F,T} are given by (13) below, in which H = Cⁿ and K = Cᵐ.
Based on the expression (2), the notions of strong perturbation and weak
perturbation were introduced in [21], through which representations for M† as
well as norm estimations for M† − T† were carried out therein. Recently, another
expression of M was introduced in [17] as

M = ET T † · T · (F T † T )∗ .

As a result, the representation theory for M † obtained in [21] has been improved in
[17], which leads to some new norm estimation for M † − T † in [4].
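In the matrix case, the two alternative expressions of M recalled above can be confirmed numerically. The sketch below uses randomly generated complex matrices; the sizes, the rank, the random seed and the helper `crandn` are illustrative assumptions, and `numpy.linalg.pinv` stands in for the Moore-Penrose inverse.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, r = 5, 4, 2  # T is m x n of rank r (illustrative sizes)

def crandn(*shape):
    """Random complex matrix of the given shape (illustrative test data)."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

T = crandn(m, r) @ crandn(r, n)        # rank-deficient T
E, F = crandn(m, m), crandn(n, n)      # E and F are allowed to be arbitrary here
T_pinv = np.linalg.pinv(T, 1e-10)      # second positional argument: relative cutoff
I_K, I_H = np.eye(m), np.eye(n)

M = E @ T @ F.conj().T                 # M = E T F*

# L_{E,T} and R_{F,T} as in (13).
L = E @ T @ T_pinv + I_K - T @ T_pinv
R = F @ T_pinv @ T + I_H - T_pinv @ T

M_via_LR = L @ T @ R.conj().T                                    # expression (2)
M_via_17 = (E @ T @ T_pinv) @ T @ (F @ T_pinv @ T).conj().T      # expression from [17]

print(np.linalg.norm(M - M_via_LR), np.linalg.norm(M - M_via_17))
```

Both differences should vanish up to rounding, for singular as well as nonsingular E and F.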
The Moore-Penrose inverse associated to the rank-preserving perturbation and
the stable perturbation is considered originally for the additive perturbation [7,
11, 13, 18, 19], which can also be dealt with in the multiplicative perturbation
case. Given a multiplicative perturbation (12) in the matrix case, as shown in

[4, Lemma 2.1 and Corollary 2.1], the relationship between various types of
multiplicative perturbations for matrices can be figured out as follows:

            M is a stable perturbation of T
                         ⇑
            M is a weak perturbation of T
                         ⇑
‖M − T‖ · ‖T†‖ < 1  ⟹  M is a strong perturbation of T
                         ⇕
M is a weak perturbation of T and ran(M) = ran(T).

The diagram above, together with [4, Examples 2.1 and 2.2], indicates that the weak
perturbation is actually parallel to the rank-preserving perturbation. It is notable that
much progress has been made on norm estimations for M † − T † in the case of
rank-preserving perturbation, yet little has been done up to now in the case of the
weak perturbation. The purpose of this paper is, in the general setting of Hilbert
C ∗ -module operators, to make some generalizations of the main results originally
obtained in [4, 17] for matrices.
Let A be a C ∗ -algebra, H and K be two Hilbert A -modules, and M be a
weak perturbation of T ∈ L(H, K) given by (12). Formulas for M † , MM † and
M † M are derived in Theorem 4.5, hence a generalization of [17, Corollary 3.6] is
obtained from the matrix case to the case of Hilbert C ∗ -module operators. Based on
these formulas, some norm computations of the associated operators can be carried
out in Lemmas 5.1 and 5.2 by using certain C ∗ -algebraic techniques employed in
[4]. Consequently in Theorem 5.3, an elegant estimation is derived in the weak
perturbation case as
\[
\max\bigl\{ \|MM^\dagger - TT^\dagger\|, \ \|M^\dagger M - T^\dagger T\| \bigr\} \le \|T^\dagger\| \cdot \|M - T\|. \tag{3}
\]

Thus, a generalization of [4, Theorem 3.1] is obtained from the matrix case to
the case of Hilbert C ∗ -module operators. Note that [4, Theorem 3.2] indicates an
interesting phenomenon that the similar estimation
\[
\max\bigl\{ \|MM^\dagger - TT^\dagger\|, \ \|M^\dagger M - T^\dagger T\| \bigr\} \le \|M^\dagger\| \cdot \|M - T\| \tag{4}
\]

may be false for a general weak perturbation. It is gratifying that (4) is always
true for every strong perturbation (see Theorem 5.4 for the details). So in the
strong perturbation case, another elegant estimation for M † − T † is derived
in Theorem 5.6 by using both (3) and (4). This shows theoretically that some
well-known results, associated to the stable additive perturbation satisfying a norm
inequality (28) [11, 13, 18, 19], have been generalized in this paper to the strong

perturbation case, since as is shown in Sect. 3.2, every stable additive perturbation
is essentially a strong perturbation whenever this widely used norm inequality
is satisfied. Furthermore, some new results concerning the strong perturbation of
operators are also obtained; see Theorem 5.7 for the details.
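The estimate (3) can be illustrated on a small matrix example. The following sketch takes E and F close to the identity, so that M = ETF∗ is a small multiplicative perturbation of a rank-deficient T; the sizes, the perturbation scale 0.01, the seed and the helper names are illustrative assumptions. It only reports both sides of (3) for one sample and does not replace Theorem 5.3.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, r = 5, 4, 2  # illustrative sizes and rank

def crandn(*shape):
    """Random complex matrix of the given shape (illustrative test data)."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

def spec(A):
    """Spectral norm (largest singular value)."""
    return np.linalg.norm(A, 2)

T = crandn(m, r) @ crandn(r, n)
E = np.eye(m) + 0.01 * crandn(m, m)    # near-identity factors
F = np.eye(n) + 0.01 * crandn(n, n)
M = E @ T @ F.conj().T

T_pinv = np.linalg.pinv(T, 1e-10)      # second positional argument: relative cutoff
M_pinv = np.linalg.pinv(M, 1e-10)

lhs = max(spec(M @ M_pinv - T @ T_pinv), spec(M_pinv @ M - T_pinv @ T))
rhs_T = spec(T_pinv) * spec(M - T)     # right-hand side of (3)
rhs_M = spec(M_pinv) * spec(M - T)     # right-hand side of (4), shown for comparison
print(lhs, rhs_T, rhs_M)
```

The explicit cutoff passed to `pinv` keeps numerically zero singular values of the rank-deficient matrices from being inverted.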
The paper is organized as follows. In Sect. 2, we recall some basic knowledge
about the Moore-Penrose inverse of Hilbert C ∗ -module operators. In Sect. 3, we
study the relationship between various types of multiplicative perturbations. In
Sects. 4 and 5, we focus on the study of representations and norm estimations for the
Moore-Penrose inverse associated to the multiplicative perturbation, respectively.

2 Some Basic Knowledge About the Moore-Penrose Inverse

Let C be the complex field and A be a C∗-algebra [10]. An inner-product A-module
[6] is a linear space E which is a right A-module, together with a map
E × E → A, (x, y) ↦ ⟨x, y⟩ such that for every x, y, z ∈ E, α, β ∈ C and
a ∈ A, the following conditions hold:
(i) ⟨x, αy + βz⟩ = α⟨x, y⟩ + β⟨x, z⟩;
(ii) ⟨x, ya⟩ = ⟨x, y⟩a;
(iii) ⟨y, x⟩ = ⟨x, y⟩∗;
(iv) ⟨x, x⟩ ≥ 0, and ⟨x, x⟩ = 0 ⟺ x = 0.
An inner-product A-module E which is complete with respect to the induced
norm ‖x‖ = √‖⟨x, x⟩‖ (x ∈ E) is called a (right) Hilbert A-module.
Throughout the rest of this paper, H and K are Hilbert A -modules. Let L(H, K)
be the set of operators T : H → K for which there is an operator T ∗ : K → H
such that

⟨Tx, y⟩ = ⟨x, T∗y⟩ for every x ∈ H and y ∈ K.

It is known that each element T of L(H, K) is a bounded linear operator. We call

L(H, K) the set of adjointable operators from H to K. For every A ∈ L(H, K), its
range and null space are denoted by R(A) and N (A), respectively. In case H = K,
L(H, H ) which we abbreviate to L(H ), is a C ∗ -algebra. Let L(H )sa and L(H )+
denote the sets of self-adjoint elements and positive elements in L(H ), respectively.
The identity operator on H is denoted by IH . An element M of L(H ) is said to be
positive definite or strictly positive [9], if M is positive and invertible in L(H ). By
a projection P ∈ L(H ), we always mean that P = P ∗ and P 2 = P . In the special
case that H is a Hilbert space, L(H ) consists of all bounded linear operators on H ,
and in this case we use the notation B(H ) instead of L(H ).

The notations of “⊕” and “∔” are used in this paper with different meanings for
the sake of reader’s convenience. Given Hilbert A-modules H1 and H2, let
\[
H_1 \oplus H_2 = \bigl\{ (h_1, h_2)^T : h_i \in H_i, \ i = 1, 2 \bigr\},
\]
which is also a Hilbert A-module whose A-valued inner product is given by
\[
\bigl\langle (x_1, y_1)^T, (x_2, y_2)^T \bigr\rangle = \langle x_1, x_2\rangle + \langle y_1, y_2\rangle, \quad \forall\, x_i \in H_1, \ y_i \in H_2, \ i = 1, 2.
\]
On the other hand, if both H1 and H2 are closed submodules of a Hilbert A-module
H such that H1 ∩ H2 = {0}, then we write
\[
H_1 \dotplus H_2 = \{ h_1 + h_2 : h_i \in H_i, \ i = 1, 2 \}.
\]

Definition 2.1 ([12, 15]) Let A ∈ L(H, K). The Moore-Penrose inverse of A,
written A† , is the unique element X ∈ L(K, H ) which satisfies

AXA = A, XAX = X, (AX)∗ = AX and (XA)∗ = XA. (5)
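For matrices, the four Penrose equations in (5) are easy to verify numerically. The sketch below uses `numpy.linalg.pinv` as a stand-in for A†; the matrix sizes, the rank, the seed and the name `penrose_residuals` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

# A rank-deficient complex 4 x 6 matrix (rank 3), as illustrative test data.
A = (rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))) @ \
    (rng.standard_normal((3, 6)) + 1j * rng.standard_normal((3, 6)))
X = np.linalg.pinv(A, 1e-10)  # candidate for the Moore-Penrose inverse of A

# Residuals of the four equations in (5).
penrose_residuals = {
    "AXA = A": np.linalg.norm(A @ X @ A - A),
    "XAX = X": np.linalg.norm(X @ A @ X - X),
    "(AX)* = AX": np.linalg.norm((A @ X).conj().T - A @ X),
    "(XA)* = XA": np.linalg.norm((X @ A).conj().T - X @ A),
}
print(penrose_residuals)
```

All four residuals should be at the level of rounding error.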

Lemma 2.2 ([16, Theorem 1.3]) For every A ∈ L(H, K), A† exists if and only if
R(A) is closed.
Lemma 2.3 (cf. [6, Theorem 3.2] and [15, Remark 1.1]) Let A ∈ L(H, K).
Then the closedness of any one of the following sets implies the closedness of the
remaining three sets:

R(A), R(A∗ ), R(AA∗ ) and R(A∗ A).

If R(A) is closed, then R(A) = R(AA∗), R(A∗) = R(A∗A) and the following
orthogonal decompositions hold:

H = N(A) ∔ R(A∗) and K = R(A) ∔ N(A∗).

Remark 2.4 ([14, Section 1]) Let A ∈ L(H, K). If A† exists, then A is said to be
Moore-Penrose invertible (briefly, M-P invertible). In such case,

(A† )∗ = (A∗ )† , (A∗ A)† = A† (A∗ )† , R(A† ) = R(A∗ ), N (A† ) = N (A∗ ). (6)

If A ∈ L(H)sa, then AA† = A†A. If furthermore A ∈ L(H)+, then A† ∈ L(H)+
and $(A^\dagger)^{1/2} = (A^{1/2})^\dagger$, since
\[
(A^{1/2})^* = A^{1/2} \quad \text{and} \quad A^\dagger = (A^{1/2} \cdot A^{1/2})^\dagger = (A^{1/2})^\dagger \cdot (A^{1/2})^\dagger.
\]

Lemma 2.5 (cf. [18, Lemma 4.1]) For every A ∈ L(K, H), let
\[
\rho(A) = \begin{pmatrix} 0 & A \\ A^* & 0 \end{pmatrix} \in L(H \oplus K)_{sa}. \tag{7}
\]
Then ρ(A)† exists if and only if A† exists. In such case,
\[
\rho(A)^\dagger = \begin{pmatrix} 0 & (A^\dagger)^* \\ A^\dagger & 0 \end{pmatrix}. \tag{8}
\]
Proof Note that
\[
\rho(A)\rho(A)^* = \begin{pmatrix} AA^* & 0 \\ 0 & A^*A \end{pmatrix}, \tag{9}
\]
so R(ρ(A)ρ(A)∗) is closed if and only if both R(AA∗) and R(A∗A) are closed.
Thus, from Lemmas 2.2 and 2.3 we conclude that ρ(A)† exists if and only if A†
exists, in which case (8) can be verified directly from the four equations
in (5).

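Lemma 2.6 (cf. [18, Lemma 4.1]) For every A ∈ L(K, H), let ρ(A) be defined by
(7). Then
\[
\|\rho(A)\| = \|A\|. \tag{10}
\]
If furthermore A is M-P invertible, then
\[
\|\rho(A)^\dagger\| = \|A^\dagger\| \quad \text{and} \quad \|\rho(A)\rho(A)^\dagger\| = \max\bigl\{ \|AA^\dagger\|, \|A^\dagger A\| \bigr\}. \tag{11}
\]
Proof It follows from (9) that
\[
\|\rho(A)\|^2 = \|\rho(A)\rho(A)^*\| = \max\{ \|AA^*\|, \|A^*A\| \} = \|A\|^2.
\]
If furthermore A is M-P invertible, then by (8), (7) and (10) we have
\[
\|\rho(A)^\dagger\| = \bigl\|\rho\bigl((A^\dagger)^*\bigr)\bigr\| = \|(A^\dagger)^*\| = \|A^\dagger\|,
\qquad
\|\rho(A)\rho(A)^\dagger\| = \max\bigl\{ \|AA^\dagger\|, \|A^\dagger A\| \bigr\}.
\]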
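Lemmas 2.5 and 2.6 can be illustrated for matrices with a short numerical experiment. In the sketch below the function `rho` builds the block matrix (7); the sizes, the seed and the helper `spec` are illustrative assumptions, and `numpy.linalg.pinv` plays the role of the Moore-Penrose inverse.

```python
import numpy as np

def rho(A):
    """Block matrix [[0, A], [A*, 0]] from (7); A has shape (dim H, dim K)."""
    h, k = A.shape
    return np.block([[np.zeros((h, h)), A],
                     [A.conj().T, np.zeros((k, k))]])

def spec(B):
    """Spectral norm (largest singular value)."""
    return np.linalg.norm(B, 2)

rng = np.random.default_rng(5)
A = rng.standard_normal((3, 5)) + 1j * rng.standard_normal((3, 5))  # illustrative data

A_pinv = np.linalg.pinv(A, 1e-10)          # second positional argument: relative cutoff
rho_pinv = np.linalg.pinv(rho(A), 1e-10)

formula_gap = np.linalg.norm(rho_pinv - rho(A_pinv.conj().T))        # equation (8)
norm_gap = abs(spec(rho(A)) - spec(A))                               # equation (10)
pinv_norm_gap = abs(spec(rho_pinv) - spec(A_pinv))                   # first part of (11)
proj_gap = abs(spec(rho(A) @ rho_pinv)
               - max(spec(A @ A_pinv), spec(A_pinv @ A)))            # second part of (11)
print(formula_gap, norm_gap, pinv_norm_gap, proj_gap)
```

Since ρ(A) always has a nontrivial kernel when A is not square, an explicit cutoff is passed to `pinv` for the block matrix.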


3 Various Types of Perturbations

In this section, we study relationships between various perturbations. Throughout

the rest of this paper, T ∈ L(H, K) is M-P invertible.

3.1 The Multiplicative Perturbation Case

Let M be a multiplicative perturbation of T given by

M = ET F ∗ , where E ∈ L(K) and F ∈ L(H ). (12)

Definition 3.1 ([21]) The operator M given by (12) is said to be a semi-strong
perturbation of T if both LE,T and RF,T are M-P invertible and injective, where

LE,T = ET T † + IK − T T † and RF,T = F T † T + IH − T † T . (13)

If furthermore LE,T and RF,T are both invertible, then M is said to be a strong
perturbation of T .
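The operators in (13) are designed so that M = ETF∗ factors through them. A short derivation, written out here for convenience and using only TT†T = T together with the self-adjointness of T†T from (5), reads as follows.

```latex
\[
\begin{aligned}
L_{E,T}\,T &= (ETT^\dagger + I_K - TT^\dagger)\,T = ET + T - T = ET,\\
T\,R_{F,T}^* &= T\,(T^\dagger T F^* + I_H - T^\dagger T) = TF^* + T - T = TF^*,\\
L_{E,T}\,T\,R_{F,T}^* &= E\,(T R_{F,T}^*) = ETF^* = M.
\end{aligned}
\]
```

In the second line, $R_{F,T}^* = (FT^\dagger T)^* + I_H - (T^\dagger T)^* = T^\dagger T F^* + I_H - T^\dagger T$ is used.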
Definition 3.2 ([4, 21]) The operator M given by (12) is said to be a weak
perturbation of T if both LE,T and RF,T defined by (13) are M-P invertible, and
the following three conditions are satisfied:
(i) $TT^\dagger L_{E,T}^\dagger (I_K - TT^\dagger) = 0$;
(ii) $T^\dagger T R_{F,T}^\dagger (I_H - T^\dagger T) = 0$;
(iii) $L_{E,T}^\dagger L_{E,T} T = T R_{F,T}^\dagger R_{F,T}$.
Definition 3.3 ([4, 18, 19]) The operator M given by (12) is said to be a stable
perturbation of T if R(M) ∩ R(T )⊥ = {0}.
Lemma 3.4 (cf. [4, Lemma 2.1]) Let M be the multiplicative perturbation of T ∈
L(H, K) given by (12). Then the following statements are valid:
(i) If M is a semi-strong perturbation of T , then M is a weak perturbation of T ;
(ii) If M is a weak perturbation of T , then M is a stable perturbation of T ;
(iii) If ‖T†‖ · ‖M − T‖ < 1, then both $L_{E,T}^*$ and $R_{F,T}^*$ are injective, where $L_{E,T}$
and $R_{F,T}$ are defined by (13).
Proof
(i) Suppose that M is a semi-strong perturbation of T. Then
\[
L_{E,T}^\dagger L_{E,T} = I_K \quad \text{and} \quad R_{F,T}^\dagger R_{F,T} = I_H, \tag{14}
\]
hence Definition 3.2 (iii) is satisfied. Since $L_{E,T}(I_K - TT^\dagger) = I_K - TT^\dagger$, by
(14) we get $L_{E,T}^\dagger (I_K - TT^\dagger) = I_K - TT^\dagger$ and thus $TT^\dagger L_{E,T}^\dagger (I_K - TT^\dagger) = 0$.
This shows the validity of Definition 3.2 (i). Similarly, Definition 3.2 (ii) is also
satisfied. So M is a weak perturbation of T.
(ii) Suppose that M is a weak perturbation of T. By (13) M can be expressed
alternately as (2). So given every u ∈ R(M) ∩ R(T)⊥, there exist x ∈ H and
y ∈ K such that
\[
u = L_{E,T}\, T R_{F,T}^*\, x = (I_K - TT^\dagger)\, y. \tag{15}
\]
Then Definition 3.2 (iii) and (i) yield
\[
\begin{aligned}
T R_{F,T}^* x &= TT^\dagger \cdot T R_{F,T}^\dagger R_{F,T} \cdot R_{F,T}^* x = TT^\dagger \cdot L_{E,T}^\dagger L_{E,T} T \cdot R_{F,T}^* x \\
&= TT^\dagger L_{E,T}^\dagger u = TT^\dagger L_{E,T}^\dagger (I_K - TT^\dagger) y = 0,
\end{aligned}
\]
hence by (15) we conclude that u = 0. This completes the proof that R(M) ∩
R(T)⊥ = {0}.
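(iii) Suppose that ‖T†‖ · ‖M − T‖ < 1. Let
\[
\alpha = \bigl\| T^\dagger (L_{E,T}\, T R_{F,T}^* - T) \bigr\|.
\]
Then by (2), we have
\[
\alpha = \| T^\dagger (M - T) \| \le \|T^\dagger\| \cdot \|M - T\| < 1.
\]
Given every $x \in N(R_{F,T}^*)$, since $(I_H - T^\dagger T) R_{F,T}^* = I_H - T^\dagger T$, we have
$x = T^\dagger T x$, hence
\[
\|x\| = \bigl\| T^\dagger (L_{E,T}\, T R_{F,T}^* - T)\, x \bigr\| \le \alpha \|x\|,
\]
which happens only if x = 0. This completes the proof that $N(R_{F,T}^*) = \{0\}$.
Similarly, due to $I_K - TT^\dagger = (I_K - TT^\dagger) L_{E,T}^*$ and
\[
\bigl\| (T^\dagger)^* (R_{F,T}\, T^* L_{E,T}^* - T^*) \bigr\| = \| (T^\dagger)^* (M - T)^* \| \le \|T^\dagger\| \cdot \|M - T\| < 1,
\]
we can also conclude that $N(L_{E,T}^*) = \{0\}$.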


An application of the preceding lemma is as follows.

Corollary 3.5 ([4, Corollary 2.1]) Let M be the multiplicative perturbation of T ∈
Mm×n (C) given by (12), where E ∈ Mm (C) and F ∈ Mn (C). Then the following
statements are valid:
(i) If $\|T^\dagger\| \cdot \|M - T\| < 1$, then M is a strong perturbation of T ;
(ii) M is a strong perturbation of T if and only if ran(M) = ran(T ) and M is a
weak perturbation of T .
Proof
(i) The conclusion follows immediately from Lemma 3.4 (iii).
(ii) Suppose that M is a strong perturbation of T . Then by Lemma 3.4 (i) M is a weak
perturbation of T . Furthermore, since both $L_{E,T}$ and $R_{F,T}$ are nonsingular, we
know from (2) that ran(M) = ran(T ).
Conversely, suppose that M is a weak perturbation of T such that ran(M) =
ran(T ). Then clearly $\operatorname{ran}(T R_{F,T}^*) = \operatorname{ran}(T)$, which means by Definition 3.2 (iii)
that
\[ \mathcal{R}(T) = \mathcal{R}(T R_{F,T}^*) = \mathcal{R}(T R_{F,T}^\dagger R_{F,T}) = \mathcal{R}(L_{E,T}^\dagger L_{E,T} T), \]
hence $\mathcal{N}(T^\dagger) = \mathcal{N}(T^*) = \mathcal{N}(T^* L_{E,T}^\dagger L_{E,T})$. Now, given every $x \in \mathcal{N}(L_{E,T})$, we
have $T^\dagger x = 0$, hence $x = ETT^\dagger x + (I_K - TT^\dagger)x = L_{E,T} x = 0$. Therefore, $L_{E,T}$
is nonsingular. The proof of the nonsingularity of $R_{F,T}$ is similar.

Lemma 3.6 (cf. [17, Theorem 3.2]) Let M be the multiplicative perturbation of
T ∈ L(H, K) given by (12) such that both $L_{E,T}$ and $R_{F,T}$ defined by (13) are M-P
invertible. Then M is a strong (weak) perturbation of T if and only if
\[ \rho(M) = H\rho(T)H^* \quad\text{with}\quad H = \begin{pmatrix} E & 0 \\ 0 & F \end{pmatrix} \tag{16} \]
is a strong (weak) perturbation of ρ(T ), where ρ(T ) and ρ(M) are defined by (7).
Proof Direct computation yields (16). Also, from (8) we can obtain
\[ L_{H,\rho(T)} = H\rho(T)\rho(T)^\dagger + I_{K\oplus H} - \rho(T)\rho(T)^\dagger = \begin{pmatrix} L_{E,T} & 0 \\ 0 & R_{F,T} \end{pmatrix}, \tag{17} \]
which clearly indicates that M is a strong perturbation of T if and only if
$\rho(M) = H\rho(T)H^*$ is a strong perturbation of ρ(T ). Furthermore, by (17) we can obtain
\[ L_{H,\rho(T)}^\dagger = \begin{pmatrix} L_{E,T}^\dagger & 0 \\ 0 & R_{F,T}^\dagger \end{pmatrix}, \]
hence from (7) and (8) we know that Definition 3.2 (i)–(iii) can be rephrased as
\[ \rho(T)\rho(T)^\dagger L_{H,\rho(T)}^\dagger \bigl(I_{K\oplus H} - \rho(T)\rho(T)^\dagger\bigr) = 0, \]
\[ L_{H,\rho(T)}^\dagger L_{H,\rho(T)}\, \rho(T) = \rho(T)\, L_{H,\rho(T)}^\dagger L_{H,\rho(T)}, \]
which means that M is a weak perturbation of T if and only if $\rho(M) = H\rho(T)H^*$
is a weak perturbation of ρ(T ).

Before ending this section, we provide an interpretation of the weak (strong)
perturbation by using operator block matrices. To this end, a lemma is stated
as follows, whose proof is the same as that of the matrix case initiated in [21,
Lemma 3.3].
Lemma 3.7 (cf. [21, Lemma 3.3]) Let $A = \begin{pmatrix} B & 0 \\ C & I_K \end{pmatrix} \in L(H \oplus K)$, where B ∈
L(H ). Then the following statements are equivalent:
(i) B is M-P invertible such that $\mathcal{R}(C^*) \subseteq \mathcal{R}(B^*)$;
(ii) B is M-P invertible such that $C = CB^\dagger B$;
(iii) A is M-P invertible such that $A^\dagger$ has the form $\begin{pmatrix} Z_{11} & 0 \\ Z_{21} & Z_{22} \end{pmatrix}$, where $Z_{11} \in L(H)$.
In each case,
\[ A^\dagger = \begin{pmatrix} B^\dagger & 0 \\ -CB^\dagger & I_K \end{pmatrix}. \tag{18} \]
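For matrices, formula (18) is easy to test against a general-purpose pseudoinverse routine. The following sketch uses hypothetical helper names and small hand-picked matrices: one pair (B, C) satisfies $C = CB^\dagger B$ and one does not.

```python
import numpy as np

def block_pinv_candidate(B, C):
    """Right-hand side of (18) for A = [[B, 0], [C, I_K]]."""
    n, k = B.shape[0], C.shape[0]
    Bp = np.linalg.pinv(B, rcond=1e-10)
    return np.block([[Bp, np.zeros((n, k))], [-C @ Bp, np.eye(k)]])

def formula_18_holds(B, C):
    """Compare the candidate from (18) with the numerically computed A^+."""
    n, k = B.shape[0], C.shape[0]
    A = np.block([[B, np.zeros((n, k))], [C, np.eye(k)]])
    return np.allclose(block_pinv_candidate(B, C), np.linalg.pinv(A, rcond=1e-10))

B = np.diag([2.0, 0.0])                               # singular, hence only M-P invertible
print(formula_18_holds(B, np.array([[3.0, 0.0]])))    # C = C B^+ B holds
print(formula_18_holds(B, np.array([[3.0, 1.0]])))    # C = C B^+ B fails
```

The second call illustrates that (18) cannot be used once the range condition $\mathcal{R}(C^*) \subseteq \mathcal{R}(B^*)$ is dropped.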

Next, we recall some elementary results on operator block matrices. Let $H_1$ be a
closed submodule of H . Then for every A ∈ L(H, K), we use the notation $A|_{H_1}$ to
denote the restriction of A to $H_1$.
Suppose that P ∈ L(H ) and Q ∈ L(K) are two projections. Let
\[ H_1 = PH, \quad H_2 = (I_H - P)H, \quad K_1 = QK, \quad K_2 = (I_K - Q)K, \]
and let $U_P : H \to H_1 \oplus H_2$ be the unitary operator defined by
\[ U_P h = \bigl(Ph, (I_H - P)h\bigr)^T \quad\text{for every } h \in H. \tag{19} \]
Then $U_P^* \in L(H_1 \oplus H_2, H)$ is given by
\[ U_P^* (h_1, h_2)^T = h_1 + h_2 \quad\text{for every } h_i \in H_i,\ i = 1, 2. \]
The unitary operator $U_Q \in L(K, K_1 \oplus K_2)$ can be defined similarly.

With the notations as above, it is easy to verify that for every S ∈ L(H, K), the
operator matrix $(S_{ij})_{1\le i,j\le 2}$ corresponding to the operator $U_Q S U_P^*$ is given
by
\[ \begin{aligned} S_{11} &= QSP|_{H_1}, & S_{12} &= QS(I_H - P)|_{H_2}, \\ S_{21} &= (I_K - Q)SP|_{H_1}, & S_{22} &= (I_K - Q)S(I_H - P)|_{H_2}. \end{aligned} \tag{20} \]
Conversely, given every $X = (X_{ij})_{1\le i,j\le 2} \in L(H_1 \oplus H_2, K_1 \oplus K_2)$ and every
h ∈ H , we have
\[ U_Q^* X U_P h = X_{11} P h + X_{12} (I_H - P) h + X_{21} P h + X_{22} (I_H - P) h. \]
It follows that
\[ U_Q^* X U_P = (X_{11} + X_{21}) P + (X_{12} + X_{22})(I_H - P). \tag{21} \]

An interpretation of the weak (strong) perturbation for matrices was given in [4,
Remark 2.1], which can also be carried out for operators.
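Remark 3.8 Let M be the multiplicative perturbation of T ∈ L(H, K) given by (12)
such that both $L_{E,T}$ and $R_{F,T}$ defined by (13) are M-P invertible. Let $P_T \in L(H)$
and $Q_T \in L(K)$ be two projections defined by
\[ P_T = T^\dagger T \quad\text{and}\quad Q_T = TT^\dagger, \tag{22} \]
and put $H_1 = \mathcal{R}(P_T) = \mathcal{R}(T^*) \subseteq H$ and $K_1 = \mathcal{R}(Q_T) = \mathcal{R}(T) \subseteq K$. By (20) we
have
\[ U_{Q_T} T U_{P_T}^* = \begin{pmatrix} T_{11} & 0 \\ 0 & 0 \end{pmatrix} \quad\text{and}\quad U_{P_T} T^\dagger U_{Q_T}^* = \begin{pmatrix} T_{11}^{-1} & 0 \\ 0 & 0 \end{pmatrix}, \tag{23} \]
where $T_{11} = T|_{H_1,K_1}$ is invertible such that $T_{11}^{-1} = T^\dagger|_{K_1,H_1}$. Then
\[ U_{Q_T} TT^\dagger U_{Q_T}^* = \begin{pmatrix} I_{K_1} & 0 \\ 0 & 0 \end{pmatrix} \quad\text{and}\quad U_{P_T} T^\dagger T U_{P_T}^* = \begin{pmatrix} I_{H_1} & 0 \\ 0 & 0 \end{pmatrix}, \tag{24} \]
which lead to
\[ U_{Q_T} ETT^\dagger U_{Q_T}^* = \begin{pmatrix} B & 0 \\ C & 0 \end{pmatrix} \quad\text{and}\quad U_{P_T} FT^\dagger T U_{P_T}^* = \begin{pmatrix} D & 0 \\ G & 0 \end{pmatrix} \tag{25} \]
for some $B \in L(K_1)$, $C \in L(K_1, K_1^\perp)$, $D \in L(H_1)$ and $G \in L(H_1, H_1^\perp)$. It follows
that
\[ U_{Q_T} L_{E,T} U_{Q_T}^* = \begin{pmatrix} B & 0 \\ C & I_{K_1^\perp} \end{pmatrix} \quad\text{and}\quad U_{P_T} R_{F,T} U_{P_T}^* = \begin{pmatrix} D & 0 \\ G & I_{H_1^\perp} \end{pmatrix}. \tag{26} \]
Note that
\[ \bigl(U_{Q_T} L_{E,T} U_{Q_T}^*\bigr)^\dagger = U_{Q_T} L_{E,T}^\dagger U_{Q_T}^* \quad\text{and}\quad \bigl(U_{P_T} R_{F,T} U_{P_T}^*\bigr)^\dagger = U_{P_T} R_{F,T}^\dagger U_{P_T}^*, \]
so (26) and Lemma 3.7 indicate that

Definition 3.2 (i) is satisfied ⇐⇒ B is M-P invertible such that $C = CB^\dagger B$,

Definition 3.2 (ii) is satisfied ⇐⇒ D is M-P invertible such that $G = GD^\dagger D$.

In such case, by (18) and (26) we obtain
\[ L_{E,T}^\dagger L_{E,T} T = U_{Q_T}^* \begin{pmatrix} B^\dagger & 0 \\ -CB^\dagger & I_{K_1^\perp} \end{pmatrix} \begin{pmatrix} B & 0 \\ C & I_{K_1^\perp} \end{pmatrix} \begin{pmatrix} T_{11} & 0 \\ 0 & 0 \end{pmatrix} U_{P_T}
   = U_{Q_T}^* \begin{pmatrix} B^\dagger B T_{11} & 0 \\ 0 & 0 \end{pmatrix} U_{P_T}. \]
Similarly, we have
\[ T R_{F,T}^\dagger R_{F,T} = U_{Q_T}^* \begin{pmatrix} T_{11} D^\dagger D & 0 \\ 0 & 0 \end{pmatrix} U_{P_T}. \]
Therefore, Definition 3.2 (iii) is furthermore satisfied if and only if
\[ B^\dagger B T_{11} = T_{11} D^\dagger D. \tag{27} \]
Moreover, it easily follows from (26) that M is a strong perturbation of T if and
only if both B and D are invertible.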
It is remarkable that in the matrix case, a weak perturbation may fail to be rank-
preserving, and vice versa; see [4, Examples 2.1 and 2.2] for two such examples. An
example of the strong perturbation of an operator is as follows.
Example Let $M = \begin{pmatrix} M_{11} & M_{12} \\ M_{12}^* & M_{22} \end{pmatrix} \in L(H \oplus K)_+$ be such that both $M_{11} \in L(H)$
and S ∈ L(K) are M-P invertible, where S is the generalized Schur complement of
$M_{11}$ in M defined by
\[ S = M_{22} - M_{12}^* M_{11}^\dagger M_{12}. \]
Put
\[ E = \begin{pmatrix} I_H & 0 \\ M_{12}^* M_{11}^\dagger & I_K \end{pmatrix} \quad\text{and}\quad T = \begin{pmatrix} M_{11} & 0 \\ 0 & S \end{pmatrix}. \]
By [15, Corollary 3.5] we know that
\[ M_{11} \in L(H)_+, \quad M_{12} = M_{11} M_{11}^\dagger M_{12}, \quad S \in L(K)_+, \]
hence $M = ETE^*$. Direct computation yields $L_{E,T} = E$, which is invertible.
Therefore, M is a strong perturbation of T .
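The example can be reproduced for a positive semidefinite matrix with singular leading block. The sketch below is illustrative only: the function name `schur_factors`, the split index and the test matrix are choices made here, and $L_{E,T} = ETT^\dagger + I - TT^\dagger$ is assumed to be the form given in (13).

```python
import numpy as np

def schur_factors(M, h):
    """Split a positive semidefinite M at index h and build E, T with M = E T E^*."""
    M11, M12, M22 = M[:h, :h], M[:h, h:], M[h:, h:]
    M11p = np.linalg.pinv(M11, rcond=1e-10)
    S = M22 - M12.conj().T @ M11p @ M12              # generalized Schur complement
    k = M.shape[0] - h
    E = np.block([[np.eye(h), np.zeros((h, k))], [M12.conj().T @ M11p, np.eye(k)]])
    T = np.block([[M11, np.zeros((h, k))], [np.zeros((k, h)), S]])
    return E, T

X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0], [0.0, 3.0]])
M = X @ X.T + np.diag([0.0, 0.0, 0.0, 0.5, 0.5])     # positive semidefinite, M11 singular
E, T = schur_factors(M, 3)
Tp = np.linalg.pinv(T, rcond=1e-10)
L = E @ T @ Tp + np.eye(5) - T @ Tp                  # L_{E,T} (assumed form of (13))
print(np.allclose(E @ T @ E.T, M), np.allclose(L, E))
```

Both comparisons are expected to succeed up to rounding, which mirrors the two claims $M = ETE^*$ and $L_{E,T} = E$.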

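3.2 The Stable Additive Perturbation and the Strong Perturbation

Let $M = T + \Delta$ be a stable additive perturbation of T ∈ L(H, K) such that
\[ \|T^\dagger\| \cdot \|\Delta\| < 1. \tag{28} \]
With the conditions as above, norm estimations for $M^\dagger$ were considered in [19] and
[18] for Hilbert space operators and Hilbert C ∗ -module operators, respectively. In
this subsection, we will show that M can in fact be expressed as a strong perturbation
of T .
Let $P_T$ and $Q_T$ be defined by (22) such that $U_{Q_T} T U_{P_T}^*$ and $U_{P_T} T^\dagger U_{Q_T}^*$ are
given by (23), where $H_1 = \mathcal{R}(T^*)$, $K_1 = \mathcal{R}(T)$ and $T_{11} \in L(H_1, K_1)$ is invertible.
Put
\[ U_{Q_T} M U_{P_T}^* = \begin{pmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{pmatrix} \quad\text{and}\quad \Delta_{11} = M_{11} - T_{11}. \]
Then
\[ \|T_{11}^{-1}\| \cdot \|\Delta_{11}\| = \|T^\dagger\| \cdot \|\Delta_{11}\| \le \|T^\dagger\| \cdot \|U_{Q_T}\| \cdot \|\Delta\| \cdot \|U_{P_T}^*\| < 1, \]
therefore $M_{11}$ is invertible such that $M_{11}^{-1} = \bigl(I_{H_1} + T_{11}^{-1}\Delta_{11}\bigr)^{-1} T_{11}^{-1}$, hence
\[ U_{Q_T} M U_{P_T}^* = W_1 \begin{pmatrix} M_{11} & 0 \\ 0 & S \end{pmatrix} W_2, \tag{29} \]
where $S = M_{22} - M_{21} M_{11}^{-1} M_{12} \in L(H_1^\perp, K_1^\perp)$ and
\[ W_1 = \begin{pmatrix} I_{K_1} & 0 \\ M_{21} M_{11}^{-1} & I_{K_1^\perp} \end{pmatrix} \quad\text{and}\quad W_2 = \begin{pmatrix} I_{H_1} & M_{11}^{-1} M_{12} \\ 0 & I_{H_1^\perp} \end{pmatrix}. \tag{30} \]
By assumption M is a stable perturbation of T , so $\mathcal{R}(M) \cap \mathcal{R}(I_K - TT^\dagger) = \{0\}$,
which clearly gives
\[ \mathcal{R}\bigl(U_{Q_T} M U_{P_T}^* W_2^{-1}\bigr) \cap \mathcal{R}\bigl(U_{Q_T} (I_K - TT^\dagger) U_{Q_T}^*\bigr) = \{0\}. \]
It follows from (24), (29) and (30) that
\[ W_1 \begin{pmatrix} M_{11} & 0 \\ 0 & S \end{pmatrix} \begin{pmatrix} 0 \\ x \end{pmatrix} \ne \begin{pmatrix} 0 \\ y \end{pmatrix} \quad\text{for every } x \in H_1^\perp \text{ and } y \in K_1^\perp \setminus \{0\}, \]
which happens only if S = 0. Accordingly, by (29) and (30) we have
\[ \begin{aligned} U_{Q_T} M U_{P_T}^* &= W_1 \begin{pmatrix} M_{11} & 0 \\ 0 & 0 \end{pmatrix} W_2 \\ &= \begin{pmatrix} I_{K_1} & 0 \\ M_{21} M_{11}^{-1} & 0 \end{pmatrix} \begin{pmatrix} T_{11} + \Delta_{11} & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} I_{H_1} & M_{11}^{-1} M_{12} \\ 0 & 0 \end{pmatrix} \\ &= \begin{pmatrix} B & 0 \\ C & 0 \end{pmatrix} \begin{pmatrix} T_{11} & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} D & 0 \\ G & 0 \end{pmatrix}^*, \end{aligned} \]
where $B = I_{K_1}$ and $D = \bigl(I_{H_1} + T_{11}^{-1}\Delta_{11}\bigr)^*$. It follows from (21) that $M = ETF^*$,
where
\[ E = \bigl(I_{K_1} + M_{21} M_{11}^{-1}\bigr) Q_T, \]
\[ F = \bigl[ I_{H_1} + T_{11}^{-1}\Delta_{11} + \bigl(I_{H_1} + T_{11}^{-1}\Delta_{11}\bigr) M_{11}^{-1} M_{12} \bigr]^* P_T. \]
Furthermore, by Remark 3.8 we conclude that M is a strong perturbation of T , since
both B and D are invertible.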

4 Representations for the Moore-Penrose Inverse

In this section, we study representations for the Moore-Penrose inverse associated
to the weak perturbation of operators.

Lemma 4.1 ([10, Proposition 1.3.5]) Let x and y be two positive elements in a
C ∗ -algebra. If x ≤ y, then $\|x\| \le \|y\|$.
Definition 4.2 Given every B ∈ L(H ) and C ∈ L(H, K), let $\Omega \in L(H)$ be defined
by
\[ \Omega = \Omega(B, C) = B^* B + C^* C. \tag{31} \]

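Lemma 4.3 (cf. [17, Lemma 2.4]) Given every B ∈ L(H ) and C ∈ L(H, K), let
$\Omega = \Omega(B, C)$ be defined by (31) and let $\Phi$ be defined by
\[ \Phi = \begin{pmatrix} B & 0 \\ C & 0 \end{pmatrix} \in L(H \oplus K). \]
Then $\Phi$ is M-P invertible if and only if $\Omega$ is M-P invertible. In such case,
\[ \Phi^\dagger = \begin{pmatrix} \Omega^\dagger B^* & \Omega^\dagger C^* \\ 0 & 0 \end{pmatrix}. \tag{32} \]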
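In finite dimensions every matrix is M-P invertible, so only formula (32) needs checking. The sketch below builds the block operator and the candidate from (32) and compares it with a numerically computed pseudoinverse; the function names are hypothetical, and the symbols Ω and Φ follow the notation used in the statement above.

```python
import numpy as np

def phi(B, C):
    """Block matrix [[B, 0], [C, 0]] acting on H ⊕ K."""
    n, k = B.shape[0], C.shape[0]
    return np.block([[B, np.zeros((n, k))], [C, np.zeros((k, k))]])

def phi_pinv_candidate(B, C):
    """Right-hand side of (32) with Omega = B^* B + C^* C."""
    n, k = B.shape[0], C.shape[0]
    Omega = B.conj().T @ B + C.conj().T @ C
    Op = np.linalg.pinv(Omega, rcond=1e-10)
    top = np.hstack([Op @ B.conj().T, Op @ C.conj().T])
    return np.vstack([top, np.zeros((k, n + k))])

B = np.diag([1.0, 0.0])                                   # singular B
C = np.array([[2.0, 0.0], [0.0, 0.0], [1.0, 0.0]])        # maps H = R^2 into K = R^3
print(np.allclose(phi_pinv_candidate(B, C), np.linalg.pinv(phi(B, C), rcond=1e-10)))
```

Note that no relation between B and C is needed here; Ω may itself be singular, as in this example.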
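Proof Note that $\Phi^* \Phi = \begin{pmatrix} \Omega & 0 \\ 0 & 0 \end{pmatrix}$, so by Lemmas 2.2 and 2.3 we know that
\[ \Phi^\dagger \text{ exists} \iff \mathcal{R}(\Phi^* \Phi) \text{ is closed} \iff \mathcal{R}(\Omega) \text{ is closed} \iff \Omega^\dagger \text{ exists}. \]
Now, suppose that $\Omega^\dagger$ exists. Since $\Omega$ is self-adjoint, we have
\[ (\Omega^\dagger)^* = \Omega^\dagger \quad\text{and}\quad \Omega\Omega^\dagger = \Omega^\dagger\Omega. \]
Let $P = \Omega\Omega^\dagger$ and $H_1 = PH = \mathcal{R}(\Omega)$. Then for every x ∈ H ,
\[ B^* x = u + v, \tag{33} \]
where $u = P(B^* x) \in H_1$ and $v = (I - P)(B^* x) \in H_1^\perp$. Note that
\[ 0 = \langle \Omega v, v \rangle = \langle Bv, Bv \rangle + \langle Cv, Cv \rangle, \]
which implies by Lemma 4.1 that $\|\langle Bv, Bv \rangle\| = 0$, hence Bv = 0. Therefore, by
(33) we have
\[ \langle v, v \rangle = \langle v, B^* x - u \rangle = \langle Bv, x \rangle - \langle v, u \rangle = 0, \]
hence v = 0 and thus $B^* x = u \in H_1$. Since x ∈ H is arbitrary, we have $\mathcal{R}(B^*) \subseteq
\mathcal{R}(\Omega)$. Similarly, we have $\mathcal{R}(C^*) \subseteq \mathcal{R}(\Omega)$. It follows that
\[ \Omega\Omega^\dagger B^* = B^* \quad\text{and}\quad \Omega\Omega^\dagger C^* = C^*. \tag{34} \]
Taking ∗-operation yields
\[ B\Omega^\dagger\Omega = B \quad\text{and}\quad C\Omega^\dagger\Omega = C. \tag{35} \]
In view of (34) and (35), the four Penrose equations for $\Phi$ and $\Phi^\dagger$ stated in (5) are
satisfied.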

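Lemma 4.4 Let B ∈ L(H ) and C ∈ L(H, K) be such that B is M-P invertible and
$C = CB^\dagger B$. Then the operator $\Omega = \Omega(B, C)$ defined by (31) is also M-P invertible
such that
\[ \Omega\Omega^\dagger = \Omega^\dagger\Omega = B^\dagger B. \]
Proof Let $P_B$ and $Q_B$ be two projections in L(H ) defined by
\[ P_B = BB^\dagger \quad\text{and}\quad Q_B = B^\dagger B, \]
and put $H_1 = \mathcal{R}(P_B) = \mathcal{R}(B)$ and $H_2 = \mathcal{R}(Q_B) = \mathcal{R}(B^*)$. Then by (20) we have
\[ U_{P_B} B U_{Q_B}^* = \begin{pmatrix} B_{11} & 0 \\ 0 & 0 \end{pmatrix} \quad\text{and}\quad U_{Q_B} B^\dagger U_{P_B}^* = \begin{pmatrix} B_{11}^{-1} & 0 \\ 0 & 0 \end{pmatrix}, \tag{36} \]
where $B_{11} = B|_{H_2,H_1}$ is invertible such that $B_{11}^{-1} = B^\dagger|_{H_1,H_2}$. It follows that
\[ U_{Q_B} B^\dagger B U_{Q_B}^* = U_{Q_B} B^\dagger U_{P_B}^* \cdot U_{P_B} B U_{Q_B}^* = \begin{pmatrix} I_{H_2} & 0 \\ 0 & 0 \end{pmatrix}. \tag{37} \]
Since $C = CB^\dagger B$, we have $C^* C = B^\dagger B\, C^* C\, B^\dagger B$, and thus
\[ C^* C = U_{Q_B}^* \begin{pmatrix} W & 0 \\ 0 & 0 \end{pmatrix} U_{Q_B}, \quad\text{where } W \in L(H_2) \text{ is positive}. \tag{38} \]
Furthermore, from (36) we know that $B_{11}^* B_{11}$ is invertible in $L(H_2)$, which
obviously leads to the invertibility of $B_{11}^* B_{11} + W$ in $L(H_2)$. In view of (31), (36)
and (38), we have
\[ \Omega = U_{Q_B}^* \begin{pmatrix} B_{11}^* B_{11} + W & 0 \\ 0 & 0 \end{pmatrix} U_{Q_B}, \tag{39} \]
and thus $\Omega$ is M-P invertible such that
\[ \Omega^\dagger = U_{Q_B}^* \begin{pmatrix} (B_{11}^* B_{11} + W)^{-1} & 0 \\ 0 & 0 \end{pmatrix} U_{Q_B}. \tag{40} \]
It follows from (39), (40) and (37) that
\[ \Omega\Omega^\dagger = \Omega^\dagger\Omega = U_{Q_B}^* \begin{pmatrix} I_{H_2} & 0 \\ 0 & 0 \end{pmatrix} U_{Q_B} = B^\dagger B. \]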
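Theorem 4.5 (cf. [17, Corollary 3.6]) Suppose that T ∈ L(H, K) is M-P invert-
ible. Let M be a weak perturbation of T given by (12). Then
\[ ETT^\dagger, \quad FT^\dagger T, \quad TT^\dagger ETT^\dagger \quad\text{and}\quad T^\dagger T F T^\dagger T \]
are all M-P invertible, and
\[ M^\dagger = \bigl[(FT^\dagger T)^\dagger\bigr]^* \cdot T^\dagger \cdot (ETT^\dagger)^\dagger, \tag{41} \]
\[ MM^\dagger = ETT^\dagger \cdot (ETT^\dagger)^\dagger \quad\text{and}\quad M^\dagger M = FT^\dagger T \cdot (FT^\dagger T)^\dagger. \tag{42} \]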
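The representation (41) can be compared with a directly computed pseudoinverse in the matrix case. The following is a sketch under stated assumptions: $M = ETF^*$ is taken as the form of (12), the helper names are hypothetical, and the random matrices are used on the expectation that generic E and F give invertible $L_{E,T}$ and $R_{F,T}$, hence a strong (and therefore weak) perturbation.

```python
import numpy as np

def pinv(A):
    return np.linalg.pinv(A, rcond=1e-10)

def mp_inverse_via_41(E, T, F):
    """Candidate for M^+ from (41), where M = E T F^*."""
    Tp = pinv(T)
    return pinv(F @ Tp @ T).conj().T @ Tp @ pinv(E @ T @ Tp)

def projections_via_42(E, T, F):
    """Candidates for M M^+ and M^+ M from (42)."""
    Tp = pinv(T)
    A, B = E @ T @ Tp, F @ Tp @ T
    return A @ pinv(A), B @ pinv(B)

rng = np.random.default_rng(0)
T = rng.standard_normal((3, 2)) @ rng.standard_normal((2, 4))   # rank 2, from R^4 to R^3
E = rng.standard_normal((3, 3))
F = rng.standard_normal((4, 4))
M = E @ T @ F.T
print(np.allclose(mp_inverse_via_41(E, T, F), pinv(M)))
```

The practical appeal of (41) is that the three factors involve E, T and F separately, so bounds on each factor translate into bounds on $M^\dagger$.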

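Proof Let $\Psi_1 = TT^\dagger E TT^\dagger$ and $\Psi_2 = T^\dagger T F T^\dagger T$. Following the notations in
Remark 3.8, by (24) and (25) we have
\[ \Psi_1 = U_{Q_T}^* \begin{pmatrix} B & 0 \\ 0 & 0 \end{pmatrix} U_{Q_T} \quad\text{and}\quad \Psi_2 = U_{P_T}^* \begin{pmatrix} D & 0 \\ 0 & 0 \end{pmatrix} U_{P_T}, \]
where both B and D are M-P invertible, hence
\[ \Psi_1^\dagger = U_{Q_T}^* \begin{pmatrix} B^\dagger & 0 \\ 0 & 0 \end{pmatrix} U_{Q_T} \quad\text{and}\quad \Psi_2^\dagger = U_{P_T}^* \begin{pmatrix} D^\dagger & 0 \\ 0 & 0 \end{pmatrix} U_{P_T}. \]
Furthermore, by Remark 3.8 we have
\[ C = CB^\dagger B \quad\text{and}\quad G = GD^\dagger D. \tag{43} \]
We may then combine (25), (43), Lemmas 4.3 and 4.4 to conclude that
\[ (ETT^\dagger)^\dagger = U_{Q_T}^* \begin{pmatrix} \Omega_E^\dagger B^* & \Omega_E^\dagger C^* \\ 0 & 0 \end{pmatrix} U_{Q_T}, \tag{44} \]
\[ (FT^\dagger T)^\dagger = U_{P_T}^* \begin{pmatrix} \Omega_F^\dagger D^* & \Omega_F^\dagger G^* \\ 0 & 0 \end{pmatrix} U_{P_T}, \tag{45} \]
where $\Omega_E$ and $\Omega_F$ are defined by
\[ \Omega_E = B^* B + C^* C \quad\text{and}\quad \Omega_F = D^* D + G^* G \tag{46} \]
such that
\[ \Omega_E \Omega_E^\dagger = \Omega_E^\dagger \Omega_E = B^\dagger B \quad\text{and}\quad \Omega_F \Omega_F^\dagger = \Omega_F^\dagger \Omega_F = D^\dagger D. \tag{47} \]
It follows from (25), (23), (41), (45) and (44) that
\[ \begin{aligned} M &= ETT^\dagger \cdot T \cdot (FT^\dagger T)^* \qquad (48) \\ &= U_{Q_T}^* \begin{pmatrix} B & 0 \\ C & 0 \end{pmatrix} \begin{pmatrix} T_{11} & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} D^* & G^* \\ 0 & 0 \end{pmatrix} U_{P_T} \\ &= U_{Q_T}^* \begin{pmatrix} BT_{11}D^* & BT_{11}G^* \\ CT_{11}D^* & CT_{11}G^* \end{pmatrix} U_{P_T}, \qquad (49) \end{aligned} \]
\[ \begin{aligned} M^\dagger &= U_{P_T}^* \begin{pmatrix} D\Omega_F^\dagger & 0 \\ G\Omega_F^\dagger & 0 \end{pmatrix} \begin{pmatrix} T_{11}^{-1} & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} \Omega_E^\dagger B^* & \Omega_E^\dagger C^* \\ 0 & 0 \end{pmatrix} U_{Q_T} \\ &= U_{P_T}^* \begin{pmatrix} D\Omega_F^\dagger T_{11}^{-1} \Omega_E^\dagger B^* & D\Omega_F^\dagger T_{11}^{-1} \Omega_E^\dagger C^* \\ G\Omega_F^\dagger T_{11}^{-1} \Omega_E^\dagger B^* & G\Omega_F^\dagger T_{11}^{-1} \Omega_E^\dagger C^* \end{pmatrix} U_{Q_T}. \qquad (50) \end{aligned} \]
Then we may combine (49), (50), (46), (47) and (27) to get an expression of $MM^\dagger$
as
\[ MM^\dagger = U_{Q_T}^* \begin{pmatrix} \Lambda_{11} & \Lambda_{12} \\ \Lambda_{21} & \Lambda_{22} \end{pmatrix} U_{Q_T}, \]
where
\[ \begin{aligned} \Lambda_{11} &= BT_{11}(D^* D + G^* G)\Omega_F^\dagger T_{11}^{-1} \Omega_E^\dagger B^* = BT_{11}\, \Omega_F \Omega_F^\dagger\, T_{11}^{-1} \Omega_E^\dagger B^* \\ &= B \cdot T_{11} D^\dagger D \cdot T_{11}^{-1} \Omega_E^\dagger B^* = B \cdot B^\dagger B T_{11} \cdot T_{11}^{-1} \Omega_E^\dagger B^* = B\Omega_E^\dagger B^*, \\ \Lambda_{12} &= B\Omega_E^\dagger C^*, \quad \Lambda_{21} = CB^\dagger B\, \Omega_E^\dagger B^* = C\Omega_E^\dagger B^* \quad\text{and}\quad \Lambda_{22} = C\Omega_E^\dagger C^*. \end{aligned} \]
Therefore,
\[ MM^\dagger = U_{Q_T}^* \begin{pmatrix} B\Omega_E^\dagger B^* & B\Omega_E^\dagger C^* \\ C\Omega_E^\dagger B^* & C\Omega_E^\dagger C^* \end{pmatrix} U_{Q_T} = ETT^\dagger \cdot (ETT^\dagger)^\dagger \tag{51} \]
by (25) and (32). As a result, $(MM^\dagger)^* = MM^\dagger$. Moreover, (51) together with (48)
yields
\[ MM^\dagger M = (ETT^\dagger)(ETT^\dagger)^\dagger (ETT^\dagger)\, T\, (FT^\dagger T)^* = (ETT^\dagger)\, T\, (FT^\dagger T)^* = M. \]
Similarly, we can prove that
\[ M^\dagger M = U_{P_T}^* \begin{pmatrix} D\Omega_F^\dagger D^* & D\Omega_F^\dagger G^* \\ G\Omega_F^\dagger D^* & G\Omega_F^\dagger G^* \end{pmatrix} U_{P_T} = FT^\dagger T \cdot (FT^\dagger T)^\dagger, \]
hence $(M^\dagger M)^* = M^\dagger M$, and
\[ \begin{aligned} M^\dagger M M^\dagger &= FT^\dagger T \cdot (FT^\dagger T)^\dagger \cdot \bigl[(FT^\dagger T)^\dagger\bigr]^* \cdot T^\dagger \cdot (ETT^\dagger)^\dagger \\ &= \bigl[(FT^\dagger T)^\dagger\bigr]^* \cdot T^\dagger \cdot (ETT^\dagger)^\dagger = M^\dagger. \end{aligned} \]
Therefore, the four equations in (5) are satisfied for M and $M^\dagger$.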

5 Norm Estimations for the Moore-Penrose Inverse

In this section, we study norm estimations for the Moore-Penrose inverse. First, we
present a technical result, which was originally obtained in [4] for matrices.
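Lemma 5.1 ([4, Lemma 3.3]) Let B ∈ L(H ) be M-P invertible and C ∈ L(H, K)
be such that $C = CB^\dagger B$. Then
\[ \|C\Omega^\dagger C^*\| = \frac{\|CB^\dagger\|^2}{1 + \|CB^\dagger\|^2} \quad\text{and}\quad \|I_H - B\Omega^\dagger B^*\| = \theta(B, C), \tag{52} \]
where $\Omega = \Omega(B, C)$ is defined by (31) and
\[ \theta(B, C) = \begin{cases} 1, & \text{if } B \text{ is not surjective}, \\[2pt] \dfrac{\|CB^\dagger\|^2}{1 + \|CB^\dagger\|^2}, & \text{if } B \text{ is surjective}. \end{cases} \tag{53} \]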
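Both identities in (52) can be observed on small matrices. The sketch below uses spectral norms and a hypothetical helper `lemma_5_1_sides`; for a square matrix B, surjectivity is the same as invertibility, which is what separates the two cases of (53).

```python
import numpy as np

def lemma_5_1_sides(B, C):
    """Return (||C Ω^+ C^*||, ||I - B Ω^+ B^*||, t/(1+t)) with t = ||C B^+||^2."""
    Omega = B.conj().T @ B + C.conj().T @ C
    Op = np.linalg.pinv(Omega, rcond=1e-10)
    t = np.linalg.norm(C @ np.linalg.pinv(B, rcond=1e-10), 2) ** 2
    lhs1 = np.linalg.norm(C @ Op @ C.conj().T, 2)
    lhs2 = np.linalg.norm(np.eye(B.shape[0]) - B @ Op @ B.conj().T, 2)
    return lhs1, lhs2, t / (1 + t)

# Surjective (invertible) B: both norms equal t / (1 + t).
print(lemma_5_1_sides(np.array([[2.0, 1.0], [0.0, 1.0]]), np.array([[1.0, 3.0]])))
# Non-surjective B with C = C B^+ B: the second norm jumps to 1.
print(lemma_5_1_sides(np.diag([2.0, 0.0]), np.array([[3.0, 0.0]])))
```

In the second call, $\|CB^\dagger\|^2 = 9/4$, so the first norm is $9/13$ while the second equals 1.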

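Proof Following the notations as in the proof of Lemma 4.4, we know that $\Omega$ is
M-P invertible such that $\Omega^\dagger$ is given by (40), which can be expressed alternately as
\[ \Omega^\dagger = U_{Q_B}^* \begin{pmatrix} B_{11}^{-1} (I_{H_1} + Z)^{-1} (B_{11}^*)^{-1} & 0 \\ 0 & 0 \end{pmatrix} U_{Q_B}, \tag{54} \]
where $H_1 = \mathcal{R}(B)$, $H_2 = \mathcal{R}(B^*)$ and
\[ S = (B_{11}^*)^{-1} W^{1/2} \in L(H_2, H_1), \quad Z = SS^* = (B_{11}^*)^{-1} W B_{11}^{-1} \in L(H_1). \tag{55} \]
Meanwhile, by (36), (38) and (55) we obtain
\[ (CB^\dagger)^* (CB^\dagger) = (B^\dagger)^* (C^* C) B^\dagger = U_{P_B}^* \begin{pmatrix} Z & 0 \\ 0 & 0 \end{pmatrix} U_{P_B}, \]
which implies that
\[ \|Z\| = \|(CB^\dagger)^* (CB^\dagger)\| = \|CB^\dagger\|^2. \tag{56} \]
Since Z is positive, we have $\|Z\| = \max\{t : t \in \sigma(Z)\}$, where σ (Z) is the spectrum
of Z in $L(H_1)$. Note that the function $t \mapsto \frac{t}{1+t}$ is monotonically increasing
on [0, +∞), so by the spectral theory for normal elements in a C ∗ -algebra [10,
Section 1], we have
\[ \|Z (I_{H_1} + Z)^{-1}\| = \max\Bigl\{ \frac{t}{1+t} : t \in \sigma(Z) \Bigr\} = \frac{\|Z\|}{1 + \|Z\|}. \tag{57} \]
Now, we prove the first equation in (52). By (40), (38) and (54)–(56), we have
\[ \begin{aligned} \|C\Omega^\dagger C^*\| &= \bigl\| C(\Omega^\dagger)^{1/2} \cdot \bigl[ C(\Omega^\dagger)^{1/2} \bigr]^* \bigr\| = \bigl\| \bigl[ C(\Omega^\dagger)^{1/2} \bigr]^* \cdot C(\Omega^\dagger)^{1/2} \bigr\| \\ &= \bigl\| (\Omega^\dagger)^{1/2} C^* C (\Omega^\dagger)^{1/2} \bigr\| = \bigl\| (B_{11}^* B_{11} + W)^{-1/2}\, W\, (B_{11}^* B_{11} + W)^{-1/2} \bigr\| \\ &= \bigl\| (B_{11}^* B_{11} + W)^{-1/2} W^{1/2} \cdot \bigl[ (B_{11}^* B_{11} + W)^{-1/2} W^{1/2} \bigr]^* \bigr\| \\ &= \bigl\| \bigl[ (B_{11}^* B_{11} + W)^{-1/2} W^{1/2} \bigr]^* \cdot (B_{11}^* B_{11} + W)^{-1/2} W^{1/2} \bigr\| \\ &= \bigl\| W^{1/2} (B_{11}^* B_{11} + W)^{-1} W^{1/2} \bigr\| = \bigl\| W^{1/2} B_{11}^{-1} (I_{H_1} + Z)^{-1} (B_{11}^*)^{-1} W^{1/2} \bigr\| \\ &= \bigl\| S^* (I_{H_1} + SS^*)^{-1} \cdot S \bigr\| = \bigl\| (I_{H_2} + S^* S)^{-1} S^* \cdot S \bigr\| = \frac{\|S^* S\|}{1 + \|S^* S\|} \\ &= \frac{\|SS^*\|}{1 + \|SS^*\|} = \frac{\|Z\|}{1 + \|Z\|} = \frac{\|CB^\dagger\|^2}{1 + \|CB^\dagger\|^2}. \end{aligned} \]
Finally, we prove the second equation in (52). First, we consider the case that B
is not surjective. In this case, $H_1 \ne H$ and thus $I_{H_1^\perp} \ne 0$. Therefore, by (36) and
(54) we have
\[ \begin{aligned} I_H - B\Omega^\dagger B^* &= U_{P_B}^* \begin{pmatrix} I_{H_1} - (I_{H_1} + Z)^{-1} & 0 \\ 0 & I_{H_1^\perp} \end{pmatrix} U_{P_B} \\ &= U_{P_B}^* \begin{pmatrix} Z (I_{H_1} + Z)^{-1} & 0 \\ 0 & I_{H_1^\perp} \end{pmatrix} U_{P_B}. \end{aligned} \]
The equation above together with (57) yields
\[ \|I_H - B\Omega^\dagger B^*\| = \max\bigl\{ \|Z (I_{H_1} + Z)^{-1}\|, \|I_{H_1^\perp}\| \bigr\} = 1. \]
Next, we consider the case that B is surjective. In this case $I_{H_1^\perp} = 0$, so the
argument above indicates that
\[ \|I_H - B\Omega^\dagger B^*\| = \|Z (I_H + Z)^{-1}\| = \frac{\|Z\|}{1 + \|Z\|} = \frac{\|CB^\dagger\|^2}{1 + \|CB^\dagger\|^2}. \]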
Our next technical result is as follows.
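Lemma 5.2 Let M be a weak perturbation of T ∈ L(H, K) given by (12) and let
$L_{E,T}$ and $R_{F,T}$ be defined by (13). Put
\[ \pi_L = (I_K - TT^\dagger) L_{E,T}^\dagger TT^\dagger, \tag{58} \]
\[ \pi_R = (I_H - T^\dagger T) R_{F,T}^\dagger T^\dagger T, \tag{59} \]
\[ \theta_L = \begin{cases} 1, & \text{if } L_{E,T} \text{ is not surjective}, \\[2pt] \dfrac{\|\pi_L\|^2}{1 + \|\pi_L\|^2}, & \text{if } L_{E,T} \text{ is surjective}, \end{cases} \tag{60} \]
\[ \theta_R = \begin{cases} 1, & \text{if } R_{F,T} \text{ is not surjective}, \\[2pt] \dfrac{\|\pi_R\|^2}{1 + \|\pi_R\|^2}, & \text{if } R_{F,T} \text{ is surjective}. \end{cases} \tag{61} \]
Then
\[ \|MM^\dagger - TT^\dagger\| = \|TT^\dagger (I_K - MM^\dagger)\| = \sqrt{\theta_L}, \tag{62} \]
\[ \|MM^\dagger (I_K - TT^\dagger)\| = \frac{\|\pi_L\|}{\sqrt{1 + \|\pi_L\|^2}}, \tag{63} \]
\[ \|M^\dagger M - T^\dagger T\| = \|T^\dagger T (I_H - M^\dagger M)\| = \sqrt{\theta_R}, \tag{64} \]
\[ \|M^\dagger M (I_H - T^\dagger T)\| = \frac{\|\pi_R\|}{\sqrt{1 + \|\pi_R\|^2}}. \tag{65} \]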

Proof For a proof, see [4, Lemma 3.5].
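A small matrix experiment illustrates the left-hand identities. The sketch below assumes the forms $M = ETF^*$ and $L_{E,T} = ETT^\dagger + I_K - TT^\dagger$ for (12) and (13), and it reads (62) and (63) with square roots, that is, $\|MM^\dagger - TT^\dagger\| = \sqrt{\theta_L}$; since the printed formulas are damaged in this copy, that reading is an assumption which the computation is meant to support. The helper name is hypothetical.

```python
import numpy as np

def pinv(A):
    return np.linalg.pinv(A, rcond=1e-10)

def projection_gaps(E, T, F):
    """Return (||MM^+ - TT^+||, ||MM^+(I - TT^+)||, ||pi_L|| / sqrt(1 + ||pi_L||^2))."""
    m = T.shape[0]
    Tp = pinv(T)
    Q = T @ Tp
    L = E @ Q + np.eye(m) - Q                      # L_{E,T} (assumed form of (13))
    M = E @ T @ F.conj().T
    PM = M @ pinv(M)
    pi_L = np.linalg.norm((np.eye(m) - Q) @ pinv(L) @ Q, 2)
    gap = np.linalg.norm(PM - Q, 2)
    cross = np.linalg.norm(PM @ (np.eye(m) - Q), 2)
    return gap, cross, pi_L / np.sqrt(1 + pi_L ** 2)

# L_{E,T} = E is invertible here, so the "surjective" branch of (60) applies.
print(projection_gaps(np.array([[2.0, 0.0], [1.0, 1.0]]), np.diag([1.0, 0.0]), np.eye(2)))
```

For this choice, $\|\pi_L\| = 1/2$ and all three returned numbers coincide with $1/\sqrt{5}$.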

Theorem 5.3 Let M be a weak perturbation of T ∈ L(H, K) given by (12). Then
upper bound (3) is valid.
Proof For a proof, see [4, Theorem 3.1].

Theorem 5.4 Let M be a strong perturbation of T ∈ L(H, K) given by (12). Then
upper bound (4) is valid.
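Proof We first consider the self-adjoint case that H = K, $T = T^*$ and E = F . In
this case $L_{E,T} = R_{F,T}$, which is invertible by assumption, hence by (58)–(61) we
have
\[ \sqrt{\theta_L} = \frac{\|\pi_L\|}{\sqrt{1 + \|\pi_L\|^2}} = \frac{\|\pi_R\|}{\sqrt{1 + \|\pi_R\|^2}} = \sqrt{\theta_R}. \]
Then it follows from (62) and (65) that
\[ \begin{aligned} \|MM^\dagger - TT^\dagger\| &= \|M^\dagger M (I_H - T^\dagger T)\| \\ &= \|M^\dagger (M - T)(I_H - T^\dagger T)\| \\ &\le \|M^\dagger\| \cdot \|M - T\| \cdot \|I_H - T^\dagger T\| \\ &\le \|M^\dagger\| \cdot \|M - T\|. \end{aligned} \tag{66} \]
Next, we consider the general case that M is given by (12). By Lemma 3.6
we know that $\rho(M) = H\rho(T)H^*$ is also a strong perturbation of ρ(T ), where
ρ(T ), ρ(M) and H are given by (7) and (16), respectively. Using (9), (11), (66) and
(10), we obtain
\[ \begin{aligned} \max\bigl\{ \|MM^\dagger - TT^\dagger\|, \|M^\dagger M - T^\dagger T\| \bigr\} &= \|\rho(M)\rho(M)^\dagger - \rho(T)\rho(T)^\dagger\| \\ &\le \|\rho(M)^\dagger\| \cdot \|\rho(M) - \rho(T)\| \\ &= \|M^\dagger\| \cdot \|M - T\|. \end{aligned} \]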
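The final chain of inequalities can be observed numerically. In the sketch below, `rho` is assumed to be the Hermitian dilation $\rho(X) = \begin{pmatrix} 0 & X \\ X^* & 0 \end{pmatrix}$, since (7) is not restated in this section, and $M = ETF^*$ is assumed for (12); the example matrices are chosen here so that $L_{E,T}$ and $R_{F,T}$ are invertible.

```python
import numpy as np

def pinv(A):
    return np.linalg.pinv(A, rcond=1e-10)

def rho(X):
    """Hermitian dilation [[0, X], [X^*, 0]] (assumed form of rho in (7))."""
    m, n = X.shape
    return np.block([[np.zeros((m, m)), X], [X.conj().T, np.zeros((n, n))]])

def projection_bound_sides(E, T, F):
    """Return (max of the two projection gaps, the dilated gap, ||M^+|| ||M - T||)."""
    M = E @ T @ F.conj().T
    Mp, Tp = pinv(M), pinv(T)
    lhs = max(np.linalg.norm(M @ Mp - T @ Tp, 2), np.linalg.norm(Mp @ M - Tp @ T, 2))
    dil = np.linalg.norm(rho(M) @ pinv(rho(M)) - rho(T) @ pinv(rho(T)), 2)
    rhs = np.linalg.norm(Mp, 2) * np.linalg.norm(M - T, 2)
    return lhs, dil, rhs

T = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
E = np.array([[2.0, 0.0], [1.0, 1.0]])
F = np.array([[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
print(projection_bound_sides(E, T, F))
```

The first two returned values agree, and both stay below the third, as the proof predicts for a strong perturbation.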

Remark 5.5 It is remarkable that upper bound (4) may be false for a general weak
perturbation. For a counterexample, see [4, Theorem 3.2].

Theorem 5.6 Let M be a strong perturbation of T ∈ L(H, K) given by (12). Then
\[ \|M^\dagger - T^\dagger\| \le \frac{\sqrt{5} + 1}{2}\, \|M^\dagger\| \cdot \|T^\dagger\| \cdot \|M - T\|. \tag{67} \]
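Proof First we consider the case that H = K. Following the line in the proof of
[19, Theorem 3.3.5], we give a detailed proof of (67) based on (3) and (4). Note that
$T, E, M, T^\dagger, M^\dagger$ all belong to L(H ), which is a unital C ∗ -algebra. So there exist a
Hilbert space $\mathcal{L}$ and a C ∗ -morphism $\pi : L(H) \to B(\mathcal{L})$ such that π is faithful [10,
Corollary 3.7.5]. Replacing $\mathcal{L}$ with $\pi(I_H)\mathcal{L}$ if necessary, we may assume that π is
unital.
Since $\pi : L(H) \to \mathcal{R}(\pi)$ is a C ∗ -isomorphism, we have $\|\pi(X)\| = \|X\|$ for
every X ∈ L(H ) [10, Theorem 1.5.7], and from (5) we know that both π(T ) and
π(M) are M-P invertible such that $\pi(T)^\dagger = \pi(T^\dagger)$ and $\pi(M)^\dagger = \pi(M^\dagger)$. Direct
computation yields
\[ M^\dagger - T^\dagger = A + B = M^\dagger M A + (I_H - M^\dagger M) B, \tag{68} \]
where
\[ A = -M^\dagger (M - T) T^\dagger + M^\dagger (I_K - TT^\dagger) \quad\text{and}\quad B = -(I_H - M^\dagger M) T^\dagger. \tag{69} \]
Now, given any $x \in \mathcal{L}$ with $\|x\| = 1$, we put
\[ \sin\varphi = \|\pi(T)\pi(T)^\dagger x\| \quad\text{and}\quad \cos\varphi = \bigl\| \bigl( I_{\mathcal{L}} - \pi(T)\pi(T)^\dagger \bigr) x \bigr\| \]
for some $\varphi \in [0, \frac{\pi}{2}]$. Let
\[ \lambda = \|M^\dagger\| \cdot \|T^\dagger\| \cdot \|M - T\|. \]
Then by (69) we have
\[ \begin{aligned} \pi(A)x = {} & -\pi(M^\dagger)\, \pi(M - T)\, \pi(T)^\dagger \bigl[ \pi(T)\pi(T)^\dagger x \bigr] \\ & + \pi(M^\dagger)\, \pi\bigl( MM^\dagger - TT^\dagger \bigr) \bigl[ \bigl( I_{\mathcal{L}} - \pi(T)\pi(T)^\dagger \bigr) x \bigr], \\ \pi(B)x = {} & -\pi\bigl( T^\dagger T - M^\dagger M \bigr)\, \pi(T)^\dagger \bigl[ \pi(T)\pi(T)^\dagger x \bigr]. \end{aligned} \]
It follows from (3) and (4) that
\[ \begin{aligned} \|\pi(A)x\| &\le \|M^\dagger\| \cdot \|M - T\| \cdot \|T^\dagger\| \cdot \sin\varphi + \|M^\dagger\| \cdot \|MM^\dagger - TT^\dagger\| \cdot \cos\varphi \\ &\le \lambda \cdot (\sin\varphi + \cos\varphi), \end{aligned} \tag{70} \]
\[ \|\pi(B)x\| \le \|T^\dagger T - M^\dagger M\| \cdot \|T^\dagger\| \cdot \sin\varphi \le \lambda \cdot \sin\varphi. \tag{71} \]
In view of (68), (70) and (71), we have
\[ \begin{aligned} \|\pi(M^\dagger - T^\dagger)x\|^2 &= \|\pi(A)x\|^2 + \|\pi(B)x\|^2 \\ &\le \bigl[ (\sin\varphi + \cos\varphi)^2 + \sin^2\varphi \bigr] \lambda^2 \\ &= \Bigl[ \frac{3}{2} + \sin(2\varphi) - \frac{1}{2}\cos(2\varphi) \Bigr] \lambda^2 \\ &\le \frac{3 + \sqrt{5}}{2}\, \lambda^2 = \Bigl( \frac{\sqrt{5} + 1}{2}\, \lambda \Bigr)^2. \end{aligned} \]
Therefore,
\[ \|M^\dagger - T^\dagger\| = \|\pi(M^\dagger - T^\dagger)\| \le \frac{\sqrt{5} + 1}{2}\, \lambda. \]
This completes the proof of (67) in the case that H = K.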
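The constant $\frac{\sqrt{5}+1}{2}$ comes from maximizing $(\sin\varphi + \cos\varphi)^2 + \sin^2\varphi$ over $[0, \frac{\pi}{2}]$. The following sketch checks this on a fine grid; the grid size is an arbitrary choice made here, and the closed-form maximizer $\varphi = \frac{1}{2}(\pi - \arctan 2)$ is obtained by setting the derivative of $\frac{3}{2} + \sin(2\varphi) - \frac{1}{2}\cos(2\varphi)$ to zero.

```python
import math

def f(phi):
    """(sin phi + cos phi)^2 + sin^2 phi, the factor multiplying lambda^2 in the estimate."""
    return (math.sin(phi) + math.cos(phi)) ** 2 + math.sin(phi) ** 2

grid = [k * (math.pi / 2) / 100000 for k in range(100001)]
best = max(grid, key=f)                       # grid point with the largest value
golden_sq = (3 + math.sqrt(5)) / 2            # equals ((sqrt(5) + 1) / 2) ** 2
phi_star = (math.pi - math.atan(2)) / 2       # stationary point: tan(2 phi) = -2

print(f(best), golden_sq)
print(best, phi_star)
```

The maximum is attained strictly inside the interval, which is why neither $\varphi = 0$ nor $\varphi = \frac{\pi}{2}$ alone gives the sharp constant.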
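Next we consider the general case that H ≠ K. Let $\widetilde{M} = \widetilde{E} \cdot \widetilde{T} \cdot \widetilde{F}^* \in L(H \oplus K)$,
where
\[ \widetilde{E} = \begin{pmatrix} 0 & 0 \\ 0 & E \end{pmatrix}, \quad \widetilde{T} = \begin{pmatrix} 0 & 0 \\ T & 0 \end{pmatrix} \quad\text{and}\quad \widetilde{F} = \begin{pmatrix} F & 0 \\ 0 & 0 \end{pmatrix}. \tag{72} \]
Then
\[ \widetilde{T}^\dagger = \begin{pmatrix} 0 & T^\dagger \\ 0 & 0 \end{pmatrix}, \quad \widetilde{M} = \begin{pmatrix} 0 & 0 \\ M & 0 \end{pmatrix}, \quad \widetilde{M}^\dagger = \begin{pmatrix} 0 & M^\dagger \\ 0 & 0 \end{pmatrix}, \tag{73} \]
\[ L_{\widetilde{E},\widetilde{T}} = \widetilde{E}\widetilde{T}\widetilde{T}^\dagger + I_{H \oplus K} - \widetilde{T}\widetilde{T}^\dagger = \begin{pmatrix} I_H & 0 \\ 0 & L_{E,T} \end{pmatrix}, \tag{74} \]
\[ R_{\widetilde{F},\widetilde{T}} = \widetilde{F}\widetilde{T}^\dagger\widetilde{T} + I_{H \oplus K} - \widetilde{T}^\dagger\widetilde{T} = \begin{pmatrix} R_{F,T} & 0 \\ 0 & I_K \end{pmatrix}. \tag{75} \]
Therefore, both $L_{\widetilde{E},\widetilde{T}}$ and $R_{\widetilde{F},\widetilde{T}}$ are invertible in L(H ⊕ K), hence $\widetilde{M}$ is a strong
perturbation of $\widetilde{T}$. It follows that
\[ \begin{aligned} \|M^\dagger - T^\dagger\| = \|\widetilde{M}^\dagger - \widetilde{T}^\dagger\| &\le \frac{\sqrt{5} + 1}{2}\, \|\widetilde{M}^\dagger\| \cdot \|\widetilde{T}^\dagger\| \cdot \|\widetilde{M} - \widetilde{T}\| \\ &= \frac{\sqrt{5} + 1}{2}\, \|M^\dagger\| \cdot \|T^\dagger\| \cdot \|M - T\|. \end{aligned} \]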
In the rest of this section, we study further the perturbation estimation for the
strong perturbation. In this case, a technical result can be provided as follows.

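Theorem 5.7 Let M be a strong perturbation of T ∈ L(H, K) given by (12). Then
\[ T = L_{E,T}^{-1} \cdot M \cdot \bigl( R_{F,T}^{-1} \bigr)^* \tag{76} \]
is a strong perturbation of M such that
\[ L_{L_{E,T}^{-1},M}^{-1} = ETT^\dagger + \bigl[ I_K - ETT^\dagger \bigr] (I_K - MM^\dagger) \bigl( L_{E,T}^{-1} \bigr)^* (I_K - TT^\dagger), \tag{77} \]
\[ R_{R_{F,T}^{-1},M}^{-1} = FT^\dagger T + \bigl[ I_H - FT^\dagger T \bigr] (I_H - M^\dagger M) \bigl( R_{F,T}^{-1} \bigr)^* (I_H - T^\dagger T), \tag{78} \]
\[ \bigl\| (I_K - MM^\dagger)\, L_{L_{E,T}^{-1},M}^{-1}\, MM^\dagger \bigr\| = \|\pi_L\|, \tag{79} \]
\[ \bigl\| L_{L_{E,T}^{-1},M} \bigr\| \le \sqrt{ \frac{\|\pi_L\|}{\sqrt{1 + \|\pi_L\|^2}} + \max\bigl\{ \|(ETT^\dagger)^\dagger\|, \|I_K - TT^\dagger\| \bigr\}^2 }, \tag{80} \]
\[ \bigl\| L_{L_{E,T}^{-1},M} - I_K \bigr\| \le \|L_{E,T} - I_K\| \cdot \|(ETT^\dagger)^\dagger\|, \tag{81} \]
\[ \bigl\| L_{L_{E,T}^{-1},M}^{-1} \bigr\| \le \sqrt{ \bigl( 1 + \|\pi_L\| + \|\pi_L\|^2 \bigr) \cdot \max\bigl\{ \|ETT^\dagger\|^2, 1 + \|\pi_L\|^2 \bigr\} }, \tag{82} \]
\[ \bigl\| L_{L_{E,T}^{-1},M}^{-1} - I_K \bigr\| \le \sqrt{1 + \|\pi_L\|^2} \cdot \|L_{E,T} - I_K\|, \tag{83} \]
where $L_{E,T}$, $R_{F,T}$ and $\pi_L$ are defined by (13) and (58) respectively, and
\[ L_{L_{E,T}^{-1},M} = L_{E,T}^{-1} MM^\dagger + I_K - MM^\dagger, \tag{84} \]
\[ R_{R_{F,T}^{-1},M} = R_{F,T}^{-1} M^\dagger M + I_H - M^\dagger M. \tag{85} \]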
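Identities (76), (77) and (79) lend themselves to a quick matrix check. The sketch below is an illustration under the same assumptions as before: (12) and (13) are taken to be $M = ETF^*$, $L_{E,T} = ETT^\dagger + I_K - TT^\dagger$ and $R_{F,T} = FT^\dagger T + I_H - T^\dagger T$; the function name and the test matrices are choices made here.

```python
import numpy as np

def pinv(A):
    return np.linalg.pinv(A, rcond=1e-10)

def theorem_5_7_check(E, T, F):
    m, n = T.shape
    Tp = pinv(T)
    Q, P = T @ Tp, Tp @ T
    L = E @ Q + np.eye(m) - Q                       # L_{E,T}
    R = F @ P + np.eye(n) - P                       # R_{F,T}
    M = E @ T @ F.conj().T
    PM = M @ pinv(M)
    Li, Ri = np.linalg.inv(L), np.linalg.inv(R)     # strong perturbation: both invertible
    T_back = Li @ M @ Ri.conj().T                   # right-hand side of (76)
    L2 = Li @ PM + np.eye(m) - PM                   # operator defined in (84)
    L2_inv = E @ Q + (np.eye(m) - E @ Q) @ (np.eye(m) - PM) @ Li.conj().T @ (np.eye(m) - Q)  # (77)
    pi_L = np.linalg.norm((np.eye(m) - Q) @ Li @ Q, 2)
    lhs_79 = np.linalg.norm((np.eye(m) - PM) @ L2_inv @ PM, 2)
    return np.allclose(T_back, T), np.allclose(L2 @ L2_inv, np.eye(m)), lhs_79, pi_L

T = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
E = np.array([[2.0, 0.0], [1.0, 1.0]])
F = np.array([[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
print(theorem_5_7_check(E, T, F))
```

The first two entries confirm that T is recovered from M and that (77) really inverts the operator in (84); the last two entries are the two sides of (79).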

Proof
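(1) Note that (76) can be derived directly from (2). Put
\[ P_M = MM^\dagger, \quad K_M = \mathcal{R}(P_M) = \mathcal{R}(M) \]
and define the unitary $U_{P_M} : K \to K_M \oplus K_M^\perp$ as (19). By (42) and (13), we
have
\[ \begin{aligned} (I_K - P_M) L_{E,T} &= \bigl[ I_K - ETT^\dagger (ETT^\dagger)^\dagger \bigr] (ETT^\dagger + I_K - TT^\dagger) \\ &= \bigl[ I_K - ETT^\dagger (ETT^\dagger)^\dagger \bigr] (I_K - TT^\dagger) \\ &= (I_K - P_M)(I_K - TT^\dagger). \end{aligned} \tag{86} \]
Taking ∗-operation, we get
\[ L_{E,T}^* (I_K - P_M) = (I_K - TT^\dagger)(I_K - P_M). \tag{87} \]
Let A ∈ L(K) be defined by
\[ A = P_M + L_{E,T} (I_K - P_M). \tag{88} \]
Then from (88), (20) and (86) we have
\[ U_{P_M} A U_{P_M}^* = \begin{pmatrix} I_{K_M} & A_{12} \\ 0 & A_{22} \end{pmatrix}, \tag{89} \]
where $A_{12} = P_M L_{E,T} (I_K - P_M)|_{K_M^\perp}$ and
\[ A_{22} = (I_K - P_M)(I_K - TT^\dagger)(I_K - P_M)|_{K_M^\perp} = SS^*|_{K_M^\perp} \in L(K_M^\perp), \tag{90} \]
in which
\[ S = (I_K - P_M)(I_K - TT^\dagger) \in L(K). \tag{91} \]
We prove that $A_{22}$ is invertible in $L(K_M^\perp)$. Indeed, given any $\xi \in K_M^\perp$ such
that $A_{22}\xi = 0$, by (90) we have
\[ (I_K - TT^\dagger)(I_K - P_M)\xi = S^* \xi = 0, \]
which leads by (87) to $\xi = (I_K - P_M)\xi = 0$, since $\xi \in K_M^\perp$ and $L_{E,T}^*$ is
invertible. Therefore, $A_{22}$ is injective. Following the notations in Remark 3.8
and Theorem 4.5, we know from (24), (51), (26) and (18) that
\[ \begin{aligned} & U_{Q_T} (I_K - TT^\dagger)(I_K - P_M) \bigl( L_{E,T}^{-1} \bigr)^* (I_K - TT^\dagger) U_{Q_T}^* \\ &\quad = \begin{pmatrix} 0 & 0 \\ 0 & I_{K_1^\perp} \end{pmatrix} \cdot \begin{pmatrix} I_{K_1} - B\Omega^{-1}B^* & -B\Omega^{-1}C^* \\ -C\Omega^{-1}B^* & I_{K_1^\perp} - C\Omega^{-1}C^* \end{pmatrix} \cdot \begin{pmatrix} (B^*)^{-1} & -(B^*)^{-1}C^* \\ 0 & I_{K_1^\perp} \end{pmatrix} \cdot \begin{pmatrix} 0 & 0 \\ 0 & I_{K_1^\perp} \end{pmatrix} \\ &\quad = \begin{pmatrix} 0 & 0 \\ 0 & I_{K_1^\perp} \end{pmatrix} = U_{Q_T} (I_K - TT^\dagger) U_{Q_T}^*, \end{aligned} \]
where $\Omega = B^* B + C^* C$. Therefore,
\[ (I_K - TT^\dagger)(I_K - P_M) \bigl( L_{E,T}^{-1} \bigr)^* (I_K - TT^\dagger) = I_K - TT^\dagger. \tag{92} \]
It follows that
\[ A_{22} (I_K - P_M) \bigl( L_{E,T}^{-1} \bigr)^* (I_K - TT^\dagger) = S, \tag{93} \]
which means that $A_{22}$ is surjective, since by (86) and (91) we have $\mathcal{R}(S) = K_M^\perp$.
Therefore, $A_{22}$ is invertible in $L(K_M^\perp)$, and then (93) gives
\[ A_{22}^{-1} S = (I_K - P_M) \bigl( L_{E,T}^{-1} \bigr)^* (I_K - TT^\dagger). \tag{94} \]
It follows from (89) that the operator A defined by (88) is invertible such that
\[ U_{P_M} A^{-1} U_{P_M}^* = \begin{pmatrix} I_{K_M} & -A_{12} A_{22}^{-1} \\ 0 & A_{22}^{-1} \end{pmatrix}. \tag{95} \]
In view of (84) and (88), we have $L_{L_{E,T}^{-1},M} = L_{E,T}^{-1} A$, hence $L_{L_{E,T}^{-1},M}$ is
invertible such that
\[ L_{L_{E,T}^{-1},M}^{-1} = A^{-1} L_{E,T}. \tag{96} \]
Similarly, we can prove that the operator $R_{R_{F,T}^{-1},M}$ defined by (85) is also
invertible. Therefore, the operator T written in the form (76) is a strong
perturbation of M.
Now we are ready to prove the validity of (77). Indeed, from (42) and (92)
we have
\[ \begin{aligned} & P_M L_{E,T} (I_K - P_M) \bigl( L_{E,T}^{-1} \bigr)^* (I_K - TT^\dagger) \\ &\quad = P_M \cdot ETT^\dagger \cdot (I_K - P_M) \bigl( L_{E,T}^{-1} \bigr)^* (I_K - TT^\dagger) + P_M \cdot (I_K - TT^\dagger) \cdot (I_K - P_M) \bigl( L_{E,T}^{-1} \bigr)^* (I_K - TT^\dagger) \\ &\quad = ETT^\dagger \cdot (I_K - P_M) \bigl( L_{E,T}^{-1} \bigr)^* (I_K - TT^\dagger) + P_M (I_K - TT^\dagger). \end{aligned} \tag{97} \]
Therefore, it can be deduced by (96), (95), (21), (86), (91), (94) and (97) that
\[ \begin{aligned} L_{L_{E,T}^{-1},M}^{-1} &= \bigl[ P_M - A_{12} A_{22}^{-1} (I_K - P_M) + A_{22}^{-1} (I_K - P_M) \bigr] L_{E,T} \\ &= P_M L_{E,T} - P_M L_{E,T} (I_K - P_M) A_{22}^{-1} S + A_{22}^{-1} S \\ &= P_M L_{E,T} + \bigl[ I_K - P_M L_{E,T} \bigr] (I_K - P_M) \bigl( L_{E,T}^{-1} \bigr)^* (I_K - TT^\dagger) \\ &= ETT^\dagger + \bigl[ I_K - ETT^\dagger \bigr] (I_K - P_M) \bigl( L_{E,T}^{-1} \bigr)^* (I_K - TT^\dagger). \end{aligned} \]
This completes the proof of (77). The proof of (78) is similar.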


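(2) We prove the norm equation (79). For simplicity, we put
\[ X = (I_K - P_M)\, L_{L_{E,T}^{-1},M}^{-1}\, P_M, \quad Y = A_{22}^{-1} S P_M|_{K_M} \in L(K_M, K_M^\perp), \tag{98} \]
where $A_{22}$ is formulated by (90) and S is defined by (91). Note that both $I_K -
P_M$ and $I_K - TT^\dagger$ are idempotent, so by (90) we have
\[ \begin{aligned} YY^* &= A_{22}^{-1} (I_K - P_M)(I_K - TT^\dagger) \bigl[ -(I_K - P_M) + I_K \bigr] (I_K - TT^\dagger)(I_K - P_M) A_{22}^{-1} \\ &= -I_{K_M^\perp} + A_{22}^{-1}. \end{aligned} \tag{99} \]
Since the operator $A_{22}$ defined by (90) is positive definite and $\|A_{22}\| \le 1$, we
know that $A_{22}^{-1}$ is also positive definite. Then (99) indicates that each spectral
point λ of $A_{22}^{-1}$ satisfies λ ≥ 1, which means that
\[ \|Y\|^2 = \|YY^*\| = \|A_{22}^{-1}\| - 1. \tag{100} \]
Once again from (90) we have
\[ \|A_{22}^{-1}\| = \|(SS^*)^\dagger\| = \|(S^\dagger)^* S^\dagger\| = \|S^\dagger (S^\dagger)^*\| = \|(S^* S)^\dagger\|. \tag{101} \]
Following the notations as in the proof of Lemma 5.2, we have
\[ S^* S = (I_K - TT^\dagger)(I_K - P_M)(I_K - TT^\dagger) = U_{Q_T}^* \begin{pmatrix} 0 & 0 \\ 0 & I_{K_1^\perp} - C\Omega^\dagger C^* \end{pmatrix} U_{Q_T}. \]
By (52) we know that the norm of $C\Omega^\dagger C^*$ is less than 1, so $I_{K_1^\perp} - C\Omega^\dagger C^*$ is
invertible, hence
\[ (S^* S)^\dagger = U_{Q_T}^* \begin{pmatrix} 0 & 0 \\ 0 & (I_{K_1^\perp} - C\Omega^\dagger C^*)^{-1} \end{pmatrix} U_{Q_T}. \tag{102} \]
Furthermore, since $C\Omega^\dagger C^*$ is positive and $\|C\Omega^\dagger C^*\| < 1$, we have
\[ \bigl\| (I_{K_1^\perp} - C\Omega^\dagger C^*)^{-1} \bigr\| = \frac{1}{1 - \|C\Omega^\dagger C^*\|}. \]
The equation above together with (52) and $\|\pi_L\| = \|CB^\dagger\|$ yields
\[ \bigl\| (I_{K_1^\perp} - C\Omega^\dagger C^*)^{-1} \bigr\| = 1 + \|CB^\dagger\|^2 = 1 + \|\pi_L\|^2. \tag{103} \]
Moreover, by (98), (96), (95) and (86), we have
\[ U_{P_M} \cdot X \cdot U_{P_M}^* = \begin{pmatrix} 0 & 0 \\ 0 & I_{K_M^\perp} \end{pmatrix} \begin{pmatrix} I_{K_M} & -A_{12} A_{22}^{-1} \\ 0 & A_{22}^{-1} \end{pmatrix} \begin{pmatrix} * & * \\ S P_M|_{K_M} & A_{22} \end{pmatrix} \begin{pmatrix} I_{K_M} & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ Y & 0 \end{pmatrix}. \]
So we may combine the equation above with (100), (101), (102) and (103) to
conclude that
\[ \|X\| = \|Y\| = \sqrt{\|A_{22}^{-1}\| - 1} = \sqrt{\|(S^* S)^\dagger\| - 1} = \sqrt{(1 + \|\pi_L\|^2) - 1} = \|\pi_L\|. \]
This completes the proof of (79).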

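(3) We prove the norm estimations (80) and (81). Let $L_{L_{E,T}^{-1},M}$ be given by (84).
Put
\[ Z = U_{Q_T}\, L_{L_{E,T}^{-1},M}\, U_{Q_T}^* \quad\text{and}\quad \theta = \max\bigl\{ \|(ETT^\dagger)^\dagger\|, \|I_K - TT^\dagger\| \bigr\}. \]
Then direct computation yields
\[ Z = \begin{pmatrix} I_{K_1} + (I_{K_1} - B)\Omega^{-1}B^* & (I_{K_1} - B)\Omega^{-1}C^* \\ -C\Omega^{-1}B^* & I_{K_1^\perp} - C\Omega^{-1}C^* \end{pmatrix}, \]
which gives
\[ \begin{aligned} ZZ^* &= \begin{pmatrix} I_{K_1} - B\Omega^{-1}B^* + \Omega^{-1} & -B\Omega^{-1}C^* \\ -C\Omega^{-1}B^* & I_{K_1^\perp} - C\Omega^{-1}C^* \end{pmatrix} \\ &= \begin{pmatrix} I_{K_1} - B\Omega^{-1}B^* & -B\Omega^{-1}C^* \\ -C\Omega^{-1}B^* & -C\Omega^{-1}C^* \end{pmatrix} + \begin{pmatrix} \Omega^{-1} & 0 \\ 0 & I_{K_1^\perp} \end{pmatrix} \\ &= U_{Q_T} (TT^\dagger - P_M) U_{Q_T}^* + \begin{pmatrix} \Omega^{-1} & 0 \\ 0 & I_{K_1^\perp} \end{pmatrix}. \end{aligned} \]
Note that
\[ \begin{pmatrix} \Omega^{-1} & 0 \\ 0 & 0 \end{pmatrix} = U_{Q_T} (ETT^\dagger)^\dagger U_{Q_T}^* \cdot \bigl[ U_{Q_T} (ETT^\dagger)^\dagger U_{Q_T}^* \bigr]^*, \]
hence by (62) and (60), we obtain
\[ \bigl\| L_{L_{E,T}^{-1},M} \bigr\|^2 = \|ZZ^*\| \le \|TT^\dagger - P_M\| + \theta^2 = \frac{\|\pi_L\|}{\sqrt{1 + \|\pi_L\|^2}} + \theta^2. \]
Similarly,
\[ \begin{aligned} Z - I_K &= \begin{pmatrix} (I_{K_1} - B)\Omega^{-1}B^* & (I_{K_1} - B)\Omega^{-1}C^* \\ -C\Omega^{-1}B^* & -C\Omega^{-1}C^* \end{pmatrix} \\ &= \begin{pmatrix} I_{K_1} - B & 0 \\ -C & 0 \end{pmatrix} \cdot \begin{pmatrix} \Omega^{-1}B^* & \Omega^{-1}C^* \\ 0 & 0 \end{pmatrix} \\ &= U_{Q_T} (I_K - L_{E,T}) U_{Q_T}^* \cdot U_{Q_T} (ETT^\dagger)^\dagger U_{Q_T}^*. \end{aligned} \]
Therefore, $\|Z - I_K\| \le \|I_K - L_{E,T}\| \cdot \|(ETT^\dagger)^\dagger\|$.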

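(4) We prove norm estimations (82) and (83). Let
\[ W = U_{Q_T}\, L_{L_{E,T}^{-1},M}^{-1}\, U_{Q_T}^* \quad\text{and}\quad W_1 = \begin{pmatrix} I_{K_1} & 0 \\ CB^{-1} & I_{K_1^\perp} \end{pmatrix}. \tag{104} \]
Then by direct computation, from (77) we can obtain
\[ W = \begin{pmatrix} B & -(I_{K_1} - B)(CB^{-1})^* \\ C & I_{K_1^\perp} + C(B^*)^{-1}C^* \end{pmatrix}, \tag{105} \]
hence
\[ \begin{aligned} W^* W &= \begin{pmatrix} \Omega & \Omega (CB^{-1})^* \\ CB^{-1}\Omega & I_{K_1^\perp} + CB^{-1}(I_{K_1} + \Omega)(CB^{-1})^* \end{pmatrix} \\ &= W_1 \begin{pmatrix} \Omega & 0 \\ 0 & I_{K_1^\perp} + CB^{-1}(CB^{-1})^* \end{pmatrix} W_1^*. \end{aligned} \tag{106} \]
Note that $\|CB^{-1}\| = \|\pi_L\|$, so by (10) we can get $\bigl\| \rho\bigl( (CB^{-1})^* \bigr) \bigr\| = \|\pi_L\|$, where
$\rho\bigl( (CB^{-1})^* \bigr)$ is defined by (7). It follows from (104) that
\[ \|W_1 W_1^*\| = \Bigl\| I_{K_1 \oplus K_1^\perp} + \rho\bigl( (CB^{-1})^* \bigr) + \begin{pmatrix} 0 & 0 \\ 0 & CB^{-1}(CB^{-1})^* \end{pmatrix} \Bigr\| \le 1 + \|\pi_L\| + \|\pi_L\|^2. \]
Therefore (106) gives
\[ \begin{aligned} \bigl\| L_{L_{E,T}^{-1},M}^{-1} \bigr\|^2 = \|W\|^2 &\le \|W_1\|^2 \cdot \max\bigl\{ \|\Omega\|, 1 + \|\pi_L\|^2 \bigr\} \\ &\le \bigl( 1 + \|\pi_L\| + \|\pi_L\|^2 \bigr) \cdot \max\bigl\{ \|\Omega\|, 1 + \|\pi_L\|^2 \bigr\} \\ &= \bigl( 1 + \|\pi_L\| + \|\pi_L\|^2 \bigr) \cdot \max\bigl\{ \|ETT^\dagger\|^2, 1 + \|\pi_L\|^2 \bigr\}. \end{aligned} \]
This finishes the proof of (82).

Finally, from (105) we can obtain
\[ W - I_{K_1 \oplus K_1^\perp} = \begin{pmatrix} B - I_{K_1} & 0 \\ C & 0 \end{pmatrix} \cdot W_2, \tag{107} \]
where $W_2 = \begin{pmatrix} I_{K_1} & (CB^{-1})^* \\ 0 & 0 \end{pmatrix}$. As is shown above, $\|W_2\| = \sqrt{\|W_2 W_2^*\|} =
\sqrt{1 + \|\pi_L\|^2}$. Therefore (107) yields
\[ \bigl\| L_{L_{E,T}^{-1},M}^{-1} - I_K \bigr\| = \bigl\| W - I_{K_1 \oplus K_1^\perp} \bigr\| \le \|L_{E,T} - I_K\| \cdot \sqrt{1 + \|\pi_L\|^2}. \]
This completes the proof of all the assertions.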

Acknowledgments This work was supported by the National Natural Science Foundation of
China (11971136) and a grant from Shanghai Municipal Science and Technology Commission
(18590745200).

References

1. A. Böttcher, I.M. Spitkovsky, A gentle guide to the basics of two projections theory. Linear
Algebra Appl. 432, 1412–1459 (2010)
2. N. Castro-González, J. Ceballos, F.M. Dopico, J.M. Molera, Accurate solution of structured
least squares problems via rank-revealing decompositions. SIAM J. Matrix Anal. Appl. 34,
1112–1128 (2013)
3. N. Castro-González, M.F. Martı́nez-Serrano, J. Robles, Expressions for the Moore-Penrose
inverse of block matrices involving the Schur complement. Linear Algebra Appl. 471, 353–
368 (2015)
4. C. Fu, C. Song, G. Wang, Q. Xu, Norm estimations for the Moore-Penrose inverse of the weak
perturbation of matrices. Linear Multilinear Algebra. 70, 125–237 (2022)
5. N. Hu, W. Luo, C. Song, Q. Xu, Norm estimations for the Moore-Penrose inverse of
multiplicative perturbations of matrices. J. Math. Anal. Appl. 437, 498–512 (2016)
6. E.C. Lance, Hilbert C ∗ -modules–A Toolkit for Operator Algebraists (Cambridge University
Press, Cambridge, 1995)
7. Z. Li, Q. Xu, Y. Wei, A note on stable perturbations of Moore-Penrose inverses. Numer. Linear
Algebra Appl. 20, 18–26 (2013)

8. L. Meng, B. Zheng, New multiplicative perturbation bounds of the Moore-Penrose inverse.

Linear Multilinear Algebra 63, 1037–1048 (2015)
9. M.S. Moslehian, M. Kian, Q. Xu, Positivity of 2 × 2 block matrices of operators. Banach J.
Math. Anal. 13(3), 726–743 (2019)
10. G.K. Pedersen, C ∗ -Algebras and their Automorphism Groups (Academic Press, New York,
1979)
11. G.W. Stewart, On the perturbation of pseudo-inverses, projections and linear least squares
problems. SIAM Rev. 19, 635–662 (1977)
12. G. Wang, Y. Wei, S. Qiao, Generalized Inverses: Theory and Computations. Developments in
Mathematics, vol. 53 (Springer/Science Press, Singapore/Beijing, 2018)
13. P.Å. Wedin, Perturbation theory for pseudo-inverses. BIT 13, 217–232 (1973)
14. Q. Xu, Common hermitian and positive solutions to the adjointable operator equations AX =
C, XB = D. Linear Algebra Appl. 429, 1–11 (2008)
15. Q. Xu, L. Sheng, Positive semi-definite matrices of adjointable operators on Hilbert C ∗ -
modules. Linear Algebra Appl. 428, 992–1000 (2008)
16. Q. Xu, Y. Chen, C. Song, Representations for weighted Moore-Penrose inverses of partitioned
adjointable operators. Linear Algebra Appl. 438, 10–30 (2013)
17. Q. Xu, C. Song, G. Wang, Multiplicative perturbations of matrices and the generalized triple
reverse order law for the Moore-Penrose inverse. Linear Algebra Appl. 530, 366–383 (2017)
18. Q. Xu, Y. Wei, Y. Gu, Sharp norm-estimations for Moore-Penrose inverses of stable perturba-
tions of Hilbert C ∗ -module operators. SIAM J. Numer. Anal. 47, 4735–4758 (2010)
19. Y. Xue, Stable Perturbations of Operators and Related Topics (World Scientific Publishing,
Singapore, 2012)
20. H. Yang, P. Zhang, A note on multiplicative perturbation bounds for the Moore-Penrose
inverse. Linear Multilinear Algebra 62, 831–838 (2014)
21. X. Zhang, X. Fang, C. Song, Q. Xu, Representations and norm estimations for the Moore-
Penrose inverse of multiplicative perturbations of matrices. Linear Multilinear Algebra 65,
555–571 (2017)
Part II
Orthogonality and Inequalities
Birkhoff–James Orthogonality:
Characterizations, Preservers,
and Orthogonality Graphs

Ljiljana Arambašić, Alexander Guterman, Bojan Kuzma,

and Svetlana Zhilina

Abstract We present Birkhoff–James orthogonality from historical perspectives to

the current development. We compare it with some other orthogonalities, present
its properties and its applications, and review the characterizations of Birkhoff–
James orthogonality in classical Banach spaces like B(H), C ∗ -algebras, Hilbert
C ∗ -modules, or the space of rectangular matrices normed with Schatten norms.
We also present the results on characterizations of preservers of Birkhoff–James
orthogonality and, by devising a directed graph of the relation, show that in smooth
spaces it can completely determine the norm up to (conjugate) linear isometry.
Most, though not all, of the results that we state are supplied with (sketches of)
the proof.

Keywords Normed vector space · Birkhoff–James orthogonality · C ∗ -algebra ·

Hilbert C ∗ -module · Preservers · Graph · Clique

L. Arambašić
Department of Mathematics, Faculty of Science, University of Zagreb, Zagreb, Croatia
e-mail: [email protected]
A. Guterman · S. Zhilina
Department of Mathematics and Mechanics, Lomonosov Moscow State University, Moscow,
Russia
Moscow Center for Fundamental and Applied Mathematics, Moscow, Russia
Moscow Center for Continuous Mathematical Education, Moscow, Russia
e-mail: [email protected]
B. Kuzma ()
University of Primorska, Koper, Slovenia
IMFM, Ljubljana, Slovenia
Moscow Center for Fundamental and Applied Mathematics, Moscow, Russia
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
R. M. Aron et al. (eds.), Operator and Norm Inequalities and Related Topics,
Trends in Mathematics, https://doi.org/10.1007/978-3-031-02104-6_8

1 Introduction

In a real or complex inner product space (H, ⟨·, ·⟩), two vectors x, y are orthogonal
(denoted by x ⊥ y) if ⟨x, y⟩ = 0. The familiar, and mostly trivial, properties of
the orthogonality relation are:
Homogeneity: x ⊥ y implies (αx) ⊥ (βy) for all scalars α, β.
Left additivity: x1 ⊥ y and x2 ⊥ y imply (x1 + x2) ⊥ y.
Right additivity: x ⊥ y1 and x ⊥ y2 imply x ⊥ (y1 + y2).
Symmetry: x ⊥ y implies y ⊥ x.
Is it possible to extend the orthogonality relation from inner product
spaces to a more general normed space (X, ‖·‖)? The lack of an inner product
in such spaces suggests that, if there is any chance of success, one should
express orthogonality in terms of the norm alone. This is possible in many ways
and gives rise to several nonequivalent extensions of orthogonality. We present a
few of them, each with its own virtues and vices. Unless explicitly stated otherwise,
our claims hold equally well for real and complex normed spaces. Thus, we
will occasionally omit specifying the underlying field and will simply say that
something holds for all scalars, or that there exists a scalar with a particular property,
etc. Let F denote the real or the complex field.

1.1 Roberts Orthogonality

In 1934 Roberts [76] observed that in inner product spaces x ⊥ y if and only if

$$\|x + \lambda y\| = \|x - \lambda y\| \tag{1}$$

for every scalar λ. Clearly, this condition can be verified in every normed space and
gives rise to Roberts orthogonality: x ⊥R y if (1) holds.
It is an easy exercise to see that Roberts orthogonality is homogeneous and
symmetric. It does not have an additivity property in general: for instance, in
three-dimensional real space equipped with the supremum norm, (R³, ‖·‖∞), the vector
x = (1, 1, 1) is Roberts orthogonal to y1 = (1, −1, 0) and to y2 = (0, 2, −2) but
not to y1 + y2.
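The failure of additivity in this example can be checked numerically. The following is a minimal sketch, assuming real scalars and a finite grid of values of λ; the function names are illustrative and do not come from the text. A grid test can refute condition (1) with certainty, whereas passing it is only numerical evidence.

```python
def sup_norm(v):
    """Supremum norm of a real vector given as a tuple."""
    return max(abs(c) for c in v)


def combo(x, y, lam):
    """Return x + lam * y componentwise."""
    return tuple(a + lam * b for a, b in zip(x, y))


def is_roberts_orthogonal(x, y, norm=sup_norm, lams=None, tol=1e-12):
    """Test ||x + lam*y|| == ||x - lam*y|| on a finite grid of real lam.

    Assumption: the grid [-5, 5] with step 0.01 is only a sample of the
    scalars; condition (1) itself quantifies over every scalar.
    """
    if lams is None:
        lams = [k / 100.0 for k in range(-500, 501)]
    return all(
        abs(norm(combo(x, y, lam)) - norm(combo(x, y, -lam))) <= tol
        for lam in lams
    )


x = (1.0, 1.0, 1.0)
y1 = (1.0, -1.0, 0.0)
y2 = (0.0, 2.0, -2.0)
y_sum = tuple(a + b for a, b in zip(y1, y2))  # (1, 1, -2)

print(is_roberts_orthogonal(x, y1))     # condition (1) holds on the grid
print(is_roberts_orthogonal(x, y2))     # condition (1) holds on the grid
print(is_roberts_orthogonal(x, y_sum))  # fails, e.g. at lam = 1: 2 versus 3
```

For y1 + y2 = (1, 1, −2) the failure is already visible at λ = 1, where ‖(2, 2, −1)‖∞ = 2 while ‖(0, 0, 3)‖∞ = 3.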
Further vices of Roberts orthogonality are that condition (1) is very restrictive,
so much so that in some normed spaces x ⊥R y holds only when either x or y is
the zero vector. An example of such a (real, two-dimensional) space was given by
James [45]; it consists of all real polynomials of degree at most two, restricted
to the unit interval [0, 1] and vanishing at 0, equipped with the supremum norm.
In the same paper, James also proved the next proposition.

Proposition 1.1 (Cf. [45, Corollary 4.7]) The following conditions are equivalent
for a real and at least two-dimensional normed space X:
(i) X is an inner product space.
(ii) In every two-dimensional plane Π ⊆ X and for every x ∈ Π there exists a
nonzero y ∈ Π which is Roberts orthogonal to x.
The proof will be given within Sect. 1.3.
A similar characterization holds also for complex normed spaces provided that
we take for Π two-dimensional real-linear planes. Namely, each complex normed
space (X, ‖·‖) can be considered as a real normed space (denoted temporarily by
(X, ‖·‖_R)) by restricting the scalars, and the following general result applies:
Proposition 1.2 (Cf. [31, Theorem 7.2]) A complex normed space (X, ‖·‖) is an
inner product space if and only if the associated real normed space (X, ‖·‖_R) is
an inner product space. When this happens, the inner products ⟨·, ·⟩ and ⟨·, ·⟩_R are
related by the equations ⟨a, b⟩_R = Re⟨a, b⟩ and ⟨a, b⟩ = ⟨a, b⟩_R − i⟨ia, b⟩_R.
Sketch of the Proof If ‖·‖_R is induced by a real inner product ⟨·, ·⟩_R, then 2‖x‖² =
‖(1 + i)x‖² = ‖x‖² + ‖ix‖² + 2⟨ix, x⟩_R, so ⟨ix, x⟩_R = 0. Linearization gives
further ⟨x, iy⟩_R + ⟨ix, y⟩_R = 0. From here it is straightforward that ⟨x, y⟩ :=
⟨x, y⟩_R − i⟨ix, y⟩_R is an inner product over C which induces the given norm.

1.2 Isosceles Orthogonality

James in [45] observed that one does not need the full set of scalars in order that (1)
be equivalent to the usual orthogonality in real inner product spaces; it suffices that

$$\|x + y\| = \|x - y\|. \tag{2}$$

This condition gives another orthogonality relation, denoted by ⊥I and termed
isosceles (also known as James, see [5]) orthogonality, by which x ⊥I y if (2)
holds. As for the terminology: isosceles means that the diagonals of a parallelogram
spanned by x, y can form an isosceles triangle.
Isosceles orthogonality is clearly symmetric. Also, unlike Roberts orthogonality,
it is always nontrivial. This was observed by James [45] who argued as follows:
Given two vectors x, y, a positive number α, and a large enough integer n, we have,
from |‖a‖ − ‖b‖| ≤ ‖a − b‖, that

$$\left|\, n\left\|x + \tfrac{1}{n+\alpha}\,y\right\| - n\left\|x + \tfrac{1}{n}\,y\right\| \,\right| \le n\left|\tfrac{1}{n+\alpha} - \tfrac{1}{n}\right| \|y\| = \left|\tfrac{n}{n+\alpha} - 1\right| \|y\| \xrightarrow{\;n\to+\infty\;} 0.$$

Therefore,

$$\|(n+\alpha)x + y\| - \|nx + y\| = (n+\alpha)\left\|x + \tfrac{1}{n+\alpha}\,y\right\| - n\left\|x + \tfrac{1}{n}\,y\right\| \xrightarrow{\;n\to+\infty\;} \alpha\|x\|.$$

This shows that

$$f(k) := \|x + (kx + y)\| - \|x - (kx + y)\|$$

satisfies

$$\lim_{k\to+\infty} f(k) = \lim_{k\to+\infty} \bigl(\|(k+1)x + y\| - \|(k-1)x + y\|\bigr) = \lim_{n\to+\infty} \bigl(\|(n+2)x + y\| - \|nx + y\|\bigr) = 2\|x\|.$$

Similarly, lim_{k→−∞} f(k) = −2‖x‖.
By continuity, the function f must take the zero value, so there is a real scalar λ with
x ⊥I (λx + y).
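James's continuity argument is constructive enough to be turned into a bisection search for the scalar λ. The sketch below is only an illustration under stated assumptions: real scalars, a nonzero x, the supremum norm as the default, and hypothetical helper names (isosceles_gap, isosceles_shift) that do not appear in the text.

```python
def sup_norm(v):
    """Supremum norm of a real vector given as a tuple."""
    return max(abs(c) for c in v)


def isosceles_gap(k, x, y, norm=sup_norm):
    """f(k) = ||x + (k*x + y)|| - ||x - (k*x + y)||; zero iff x is isosceles
    orthogonal to k*x + y."""
    z = tuple(k * a + b for a, b in zip(x, y))
    plus = norm(tuple(a + c for a, c in zip(x, z)))
    minus = norm(tuple(a - c for a, c in zip(x, z)))
    return plus - minus


def isosceles_shift(x, y, norm=sup_norm, iters=200):
    """Bisection for a real lam with x isosceles orthogonal to lam*x + y.

    Relies on f(k) -> -2||x|| as k -> -inf and f(k) -> 2||x|| as k -> +inf
    (x is assumed nonzero), so a sign change can always be bracketed.
    """
    lo, hi = -1.0, 1.0
    while isosceles_gap(lo, x, y, norm) > 0:
        lo *= 2.0
    while isosceles_gap(hi, x, y, norm) < 0:
        hi *= 2.0
    # Invariant: f(lo) <= 0 <= f(hi).
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if isosceles_gap(mid, x, y, norm) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


x, y = (1.0, 0.0), (1.0, 1.0)
lam = isosceles_shift(x, y)
print(lam, isosceles_gap(lam, x, y))
```

For x = (1, 0) and y = (1, 1) in (R², ‖·‖∞) the search converges to λ = −1: then λx + y = (0, 1), and ‖(1, 1)‖∞ = ‖(1, −1)‖∞ = 1.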
However, isosceles orthogonality is neither homogeneous nor additive in general.
In fact, James in [45] obtained the following characterization of inner product
spaces:
Proposition 1.3 (Cf. [45, Theorem 4.7, 4.8]) The following are equivalent for a
normed space X over the field F:
(i) X is an inner product space.
(ii) Isosceles orthogonality is real-homogeneous in X.
(iii) Isosceles orthogonality is additive in X.
Sketch of the Proof (i) ⟹ (ii) and (iii). Given any x, y ∈ X, the condition x ⊥I y
is equivalent to Re⟨x, y⟩ = 0 and is clearly homogeneous under multiplication by
real scalars and additive.
(ii) ⟹ (i). Take any two normalized vectors x, y. Then (x + y) ⊥I (x − y), so,
by real-homogeneity, also (1 + α)(x + y) ⊥I (1 − α)(x − y), that is,

$$2\|x + \alpha y\| = \|(1+\alpha)(x+y) + (1-\alpha)(x-y)\| = \|(1+\alpha)(x+y) - (1-\alpha)(x-y)\| = 2\|\alpha x + y\|; \qquad \alpha \in \mathbb{R}.$$

It now follows by Ficken's [36] characterization of (real or complex) inner product
spaces as the ones where ‖x‖ = ‖y‖ implies that ‖αx + βy‖ = ‖βx + αy‖ for all
scalars α, β ∈ R that X is an inner product space.
(iii) ⟹ (ii). By additivity, x ⊥I y implies (nx) ⊥I (my) for all positive integers
n, m. Therefore, ‖nx + my‖ = ‖nx − my‖, and so ‖x + (m/n)y‖ = ‖x − (m/n)y‖. By
continuity, and as x ⊥I y trivially implies x ⊥I (−y), we see that ‖x + λy‖ =
‖x − λy‖ for every λ ∈ R. Then, by the implication (ii) ⟹ (i), X is an inner
product space.

1.3 Birkhoff–James Orthogonality

Birkhoff in [22], based on geometric considerations, introduced another orthogonality
relation as follows: “A vector pq issuing from a point p is perpendicular to a
second such vector pr if and only if there is no point on the extended line through
pr nearer to q than p.” In modern terminology, x ⊥ y (since we will consider only
this orthogonality in the sequel, we write ⊥ instead of ⊥B) if

$$\|x + \lambda y\| \ge \|x\|$$

for every scalar λ. This orthogonality was more thoroughly investigated by James
in [46, 47], and because of this it is often known as Birkhoff–James (B-J for short)
orthogonality. We remark, however, that B-J orthogonality can be traced back at
least as far as Carathéodory (see [5]).
One can see from its very definition that B-J orthogonality is homogeneous,
but we prefer to verify this from its equivalent geometrical reformulation (see
Proposition 1.4 below), due to James [47, Corollary 2.2], which will reveal much
more than just homogeneity. To do that we need to introduce some terminology.
Recall that F denotes either the field of real or the field of complex numbers; also,
given a vector x in a normed space (X, ‖·‖) over F, a linear functional
f : X → F for which

$$\|f\| = 1 \quad\text{and}\quad f(x) = \|x\|$$

will be called a supporting functional at the vector x.

It is at least intuitively geometrically clear that, for a normalized vector x and a
vector y, we have x ⊥ y if and only if the line passing through x in the direction
of y does not contain interior points of the norm's unit ball (see Fig. 1). That is,
this line must be a supporting line to the norm's unit ball. The intuition is correct:
Proposition 1.4 (Cf. [47, Corollary 2.2]) Let X be a normed space over F. Then
x ⊥ y if and only if there exists a supporting functional f_x at the point x which
annihilates y.
Sketch of the Proof If x is a nonzero vector and its supporting functional f_x
annihilates y, then ‖x + λy‖ ≥ |f_x(x + λy)| = |f_x(x)| = ‖x‖, so x ⊥ y.
Conversely, by its definition, x ⊥ y for nonzero x, y ∈ X implies that the linear
functional f : span{x, y} → F, defined by f(x) = ‖x‖ and f(y) = 0, is norm-one
and can be extended by the Hahn–Banach theorem to a supporting functional at x
which annihilates y.

Corollary 1.5 In inner product spaces Birkhoff–James orthogonality is equivalent
to the usual one.

Fig. 1 Birkhoff–James orthogonality: x ⊥ y

Proof Let (H, ⟨·, ·⟩) be an inner product space and H̄ its completion. By the
Riesz representation theorem and the equality case of the Bunyakovsky–Cauchy–Schwarz
inequality (equality can hold only for linearly dependent vectors [42, p. 4]),
the only supporting functional at a normalized vector x ∈ H ⊆ H̄ is given by
⟨·, x⟩. By Proposition 1.4, x ⊥ y therefore holds if and only if ⟨y, x⟩ = 0.

From the equivalence stated in Proposition 1.4, the homogeneity of B-J orthogonality
is evident; it is also clear that in every two-dimensional plane Π and for
every x ∈ Π there exists a nonzero y ∈ Π with x ⊥ y. That is, similarly to isosceles
orthogonality, B-J orthogonality is always nontrivial. It is easily seen that B-J
orthogonality is in general nonsymmetric, that is, x ⊥ y does not always imply y ⊥ x
(e.g., in the space (F², ‖·‖∞), where ‖·‖∞ is the maximum norm, we have (1, 1) ⊥ (1, 0),
but (1, 0) is not B-J orthogonal to (1, 1)). We will discuss the properties of B-J
orthogonality more thoroughly in Sect. 2.
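The asymmetry in the maximum-norm example can be observed directly from the defining inequality ‖x + λy‖ ≥ ‖x‖. The following sketch assumes real scalars and samples λ on a finite grid; the name is_bj_orthogonal and its parameters are illustrative choices, not notation from the text. Since λ ↦ ‖x + λy‖ is convex, a violation of the inequality, if there is one, already occurs for λ arbitrarily close to 0, so a grid that is fine near the origin is a reasonable heuristic.

```python
def sup_norm(v):
    """Maximum norm of a real vector given as a tuple."""
    return max(abs(c) for c in v)


def is_bj_orthogonal(x, y, norm=sup_norm, radius=5.0, steps=2001, tol=1e-12):
    """Check ||x + lam*y|| >= ||x|| for real lam on a uniform grid.

    Assumption: real scalars only; the grid covers [-radius, radius].
    A returned False is a genuine counterexample; True is numerical evidence.
    """
    base = norm(x)
    for k in range(steps):
        lam = -radius + 2.0 * radius * k / (steps - 1)
        if norm(tuple(a + lam * b for a, b in zip(x, y))) < base - tol:
            return False
    return True


u, v = (1.0, 1.0), (1.0, 0.0)
print(is_bj_orthogonal(u, v))  # (1, 1) against (1, 0)
print(is_bj_orthogonal(v, u))  # (1, 0) against (1, 1)
```

In the second call the inequality fails, for instance, at λ = −1/2, where ‖(1/2, −1/2)‖∞ = 1/2 < 1 = ‖(1, 0)‖∞.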
Proposition 1.4 yields a simple procedure to find all vectors y in a normed space
which are B-J orthogonal to a given normalized x, namely:

$$N_x := \{y \in X;\ x \perp y\} = \bigcup_{\substack{f \in X^* \\ \|f\| = f(x) = 1}} \ker f.$$

But one can do slightly better: Singer [82] (see also his monograph [84, Lemma 1.3,
p. 169]) observed that if M ⊆ X is an n-dimensional subspace of a real normed
space X, and f is a bounded linear functional on X such that ‖f|_M‖ = 1, then
f coincides on M with some convex combination of at most n extremal points
of the dual norm. This works also for complex normed spaces, except that in this
case one requires at most 2n − 1 extremal points of the dual norm. By taking
M = span{x, y}, one obtains the following result, which was the starting point of
Li and Schneider's [60] investigation into B-J orthogonality on rectangular matrices
equipped with the Schatten p-norm (see also the next chapter).
Proposition 1.6 (Cf. [60, Proposition 2.1]) Let X be a normed space over F and
let x, y ∈ X be linearly independent. Then the following are equivalent:
(i) x ⊥ y.
(ii) There exist h (h ≤ 2 if F = R, respectively, h ≤ 3 if F = C) extremal points
f_1, …, f_h in the unit sphere of the dual norm and h numbers λ_1, …, λ_h > 0 with
λ_1 + ⋯ + λ_h = 1 such that

$$\sum_{i=1}^{h} \lambda_i f_i(y) = 0 \quad\text{and}\quad f_1(x) = \dots = f_h(x) = \|x\|.$$

In [13] a symmetrized version of B-J orthogonality was introduced. Two vectors
x, y ∈ X are mutually B-J orthogonal (denoted by x ⊥⊥ y) if x ⊥ y and y ⊥ x.
It is immediate that Roberts orthogonality is finer than isosceles. It is also finer
than mutual B-J:
Proposition 1.7 Let X be a normed space over F. If x, y ∈ X are Roberts
orthogonal, then they are also isosceles and mutual B-J orthogonal.
Proof For any λ ∈ F it holds

$$2\|x\| = \|(x + \lambda y) + (x - \lambda y)\| \le \|x + \lambda y\| + \|x - \lambda y\| = 2\|x + \lambda y\|,$$

so x ⊥ y. Since Roberts orthogonality is symmetric, we similarly obtain y ⊥ x, and
thus x ⊥⊥ y.

The proposition above is required to prove Proposition 1.1. We remark that James
in his paper [45] mentions that this is a trivial consequence of Proposition 1.3.
Proof of Proposition 1.1 Choose any nonzero x ∈ X and a two-dimensional plane
Π ⊆ X with x ∈ Π. By the assumptions of Proposition 1.1, there exists a nonzero
y ∈ Π such that x ⊥R y. We now show that z = αx + βy ∈ Π \ {0} and x ⊥I z
imply z ∥ y. By the definition of isosceles orthogonality,

$$\|\beta y + (1+\alpha)x\| = \|x + z\| = \|x - z\| = \|\beta y - (1-\alpha)x\|.$$

Since x ⊥R y implies βy ⊥R x, we also have

$$\|\beta y - (1+\alpha)x\| = \|\beta y + (1+\alpha)x\| = \|\beta y - (1-\alpha)x\| = \|\beta y + (1-\alpha)x\|.$$

Assume that α ≠ 0. Then βy ± (1 + α)x and βy ± (1 − α)x are at least three
distinct points on one line which have the same norm. Thus, the norm's unit sphere
S contains a nontrivial line segment inside Π. Consider a maximal line segment I
of S ∩ Π and let w ∈ I be an interior point which is not the midpoint of I. We show
below that no nonzero u ∈ Π is Roberts orthogonal to w. This violates the assumptions
of Proposition 1.1. Thus, α = 0 and hence z ∈ Ry. This shows that, on an arbitrary
two-dimensional space Π, isosceles orthogonality implies Roberts orthogonality, so
they are equivalent, and in particular, isosceles orthogonality is homogeneous. It
remains to apply Proposition 1.3.
To finish the proof, assume that w ⊥R u for some nonzero u ∈ Π. Then w ⊥ u (in
the B-J sense), so, by Proposition 1.4, u is parallel to I. However, since w is not
the midpoint of I, there exists λ ∈ R such that w + λu ∈ I but w − λu ∉ I.
Therefore, ‖w + λu‖ ≠ ‖w − λu‖, so w is not Roberts orthogonal to u, a contradiction.

1.4 A Plethora of Orthogonalities

There are many additional possibilities to define orthogonality in general normed
spaces which on real inner product spaces agree with the usual one. We list a few of
them; a more detailed study can be found in the survey papers [3] and [5]. One possibility
is Pythagorean orthogonality, introduced in [47] by x ⊥P y if

$$\|x + y\|^2 = \|x\|^2 + \|y\|^2.$$

Day [31] obtained another nice characterization of inner product spaces based on
the (non)equivalence between Pythagorean and isosceles orthogonalities.
Proposition 1.8 (Cf. [31, Theorems 5.1 and 5.2]) The following are equivalent for
a normed space X.
(i) X is an inner product space.
(ii) Isosceles orthogonality implies Pythagorean orthogonality.
(iii) Pythagorean orthogonality implies isosceles orthogonality.
Sketch of the Proof (i) ⟹ (ii) and (iii). Given any x, y ∈ X, both x ⊥I y and
x ⊥P y are equivalent to Re⟨x, y⟩ = 0.
(ii) ⟹ (i). Choose any normalized x, y ∈ X. Then (x + y) ⊥I (x − y), so by
the assumptions also (x + y) ⊥P (x − y), that is,

$$4 = \|2x\|^2 = \|(x + y) + (x - y)\|^2 = \|x + y\|^2 + \|x - y\|^2.$$

We first assume that X is a real normed space. It can be shown [31, Theorem 2.2]
that a two-dimensional real normed space Y is an inner product space (equivalently,
the norm's unit sphere is an ellipse) if and only if ‖x + y‖² + ‖x − y‖² = 4 for all
normalized vectors x, y ∈ Y.
Consider now two arbitrary vectors u, v ∈ X and a two-dimensional subspace
Y ⊆ X which contains them. Then Y is an inner product space, so ‖u + v‖² +
‖u − v‖² = 2(‖u‖² + ‖v‖²), and thus the parallelogram identity holds in X. Therefore,
by the Jordan–von Neumann characterization [49], X is an inner product space.
Assume now that X is a complex normed space. We have already proved that X
is a real inner product space, so it remains to apply Proposition 1.2.
(iii) ⟹ (i). Assume that x ⊥P y, that is, ‖x + y‖² = ‖x‖² + ‖y‖². Then we also
have x ⊥I y, so ‖x + y‖ = ‖x − y‖, and thus ‖x − y‖² = ‖x‖² + ‖y‖², which means
that x ⊥P (−y). By [31, Lemma 5.3], X is a real inner product space whenever the
conditions x ⊥P y and x ⊥P (−y) are equivalent. Since Pythagorean orthogonality
does not depend on the field F, for a complex normed space X it remains to apply
Proposition 1.2.

In 1957 Singer [83] introduced a homogeneous version of isosceles orthogonality
by x ⊥S y if either ‖x‖ · ‖y‖ = 0 or x/‖x‖ ⊥I y/‖y‖. It is additive in dimension two.
The question of its additivity in general was finally put to rest by Lin [61], who
found that in higher dimensions Singer orthogonality is additive if and only if the
norm is induced by an inner product.
In 1984 Alonso [2] introduced area orthogonality in real normed spaces by
x ⊥A y if either ‖x‖ · ‖y‖ = 0 or x, y are linearly independent and the lines Rx,
Ry divide the norm's unit ball of the plane span{x, y} ≅ R² into four parts of equal
area. Together with Benítez, Alonso obtained yet another interesting characterization of
real inner product spaces, whose proof we omit, as follows:
Proposition 1.9 (Cf. [4, Proposition 3]) The following are equivalent for a real
normed space X :
(i) X is an inner product space.
(ii) B-J orthogonality implies area orthogonality.
(iii) Area orthogonality implies B-J orthogonality.
An interesting equivalent description of orthogonality in inner product spaces
is presented in [40]. It relies neither on the inner product nor on the norm
but rather on isometries of the norm. To state it properly, we require the following
terminology. Given a vector x, denote by $\vec{x} := \{\lambda x;\ \lambda \ge 0\}$ the closed ray in
the direction of x. Observe that $\vec{0} = \{0\}$ and that, for x nonzero, $\vec{x} \setminus \{0\} = \{\lambda x;\ \lambda > 0\}$
is an open ray.
Proposition 1.10 (Cf. [40]) Two vectors x, y in a real inner product space H are
orthogonal (in the classical sense) if and only if there exists a rigid motion, i.e., an
isometry T of H, such that

$$\overrightarrow{Tx} \cup \overrightarrow{Ty} \cup \vec{x} \setminus (\vec{y} \setminus \{0\}) = \vec{x} \cup \overrightarrow{(-x)}. \tag{3}$$

Two vectors x, y in a complex inner product space H are orthogonal if and only if
there exists an isometry T for which, simultaneously, (3) and

$$\overrightarrow{T(ix)} \cup \overrightarrow{Ty} \cup \overrightarrow{(ix)} \setminus (\vec{y} \setminus \{0\}) = \overrightarrow{(ix)} \cup \overrightarrow{(-ix)} \tag{3$'$}$$

are satisfied.
The condition in (3) is clearly homogeneous, and a moment's thought reveals that
an isometry T satisfies (3) if and only if the isometry T′ = −T satisfies (3) with
x and y swapped. Hence, the relation based on (3) is always symmetric. However,
in general normed spaces it might be trivial, i.e., if vectors x, y satisfy (3), then at
least one of them must be zero. This can happen when the isometry group consists
of scalar operators only. Such normed spaces do exist: Davis [29] was the first to
construct a separable Banach space over the reals whose only isometries (surjective
or not) are ±I. We also refer to Gordon and Loewy [38, Theorem 3.1] who, by
answering a question of Lindenstrauss, showed much more: Any finite subgroup of
(real) orthogonal n-by-n matrices which contains −I is the isometry group of some
norm on Rⁿ. Moreover, any real or complex Banach space can be renormed so that
its isometry group consists of scalar operators only, see Jarosz [48].
The following proposition presents several other nice criteria for a normed space
to be an inner product space.

Proposition 1.11 Let X be a real or complex normed space. Then the following
conditions are equivalent:
(i) X is an inner product space.
(ii) ‖x‖ = ‖y‖ = 1 implies ‖αx + βy‖ = ‖βx + αy‖ for all α, β ∈ R (Ficken [36]).
(iii) There exists some γ ∈ R, γ ≠ 0, 1, such that ‖x‖ = ‖y‖ = 1 implies
‖x + γy‖ = ‖γx + y‖ (Lorch [62]).
(iv) ‖x‖ = ‖y‖ = 1 and x ⊥ y imply (x + y) ⊥ (x − y) (Oman [70,
Theorem 5.21]).
The interested reader is referred to a monograph by Amir [7] for hundreds of
additional conditions on the norm which force it to be induced by an inner product.

2 Properties of B-J Orthogonality

2.1 Gateaux Derivatives and Complex Case Related to Real Orthogonality

A normalized vector x ∈ (X, ‖·‖) is said to be a smooth point of the norm if it
has a unique supporting functional; if every normalized vector is a smooth point,
then one says that the norm (or, more colloquially, the space) is smooth. As in
Proposition 1.2, each complex normed space (X, ‖·‖) can be regarded as a real
normed space by restricting the scalars. There is an elegant relationship between
smooth points in (X, ‖·‖) and in the real counterpart (X, ‖·‖_R), akin to Proposition 1.2:
Proposition 2.1 A normalized vector x in a complex normed space (X, ‖·‖) is a
smooth point if and only if it is a smooth point in (X, ‖·‖_R).
Sketch of the Proof Each complex-linear functional f : X → C is uniquely
determined by its real part (Re f) : X → R, the correspondence being f(x) =
(Re f)(x) − i(Re f)(ix); moreover, ‖Re f‖ = ‖f‖. Thus, if x ∈ (X, ‖·‖_R) has
two different complex-linear supporting functionals, then their real parts are two
different real-linear supporting functionals for x. Conversely, let x ∈ (X, ‖·‖) have two
different real-linear supporting functionals g_1, g_2, and let f_k(z) := g_k(z) − ig_k(iz).
Since |g_k(x) − ig_k(ix)| ≤ ‖f_k‖ = ‖g_k‖ = g_k(x), we have g_1(ix) = g_2(ix) = 0,
and thus f_1, f_2 are two different complex-linear supporting functionals for x.

It is immediate from Propositions 1.4 and 2.1 that, in smooth complex normed
spaces, x ⊥ y if and only if x ⊥R y and x ⊥R (iy).
It is well known that supporting functionals are intimately related to the (one-sided)
derivative of the norm. Let x, y ∈ X. Notice that t ↦ ‖x + ty‖ is a
convex function of a real variable t and as such has left and right derivatives at each
point [77, Theorem 23.1]. In particular, there exist the norm's directional derivatives
at x in the direction of y:

$$D_-(x; y) := \lim_{t \nearrow 0} \frac{\|x + ty\| - \|x\|}{t} \quad\text{and}\quad D_+(x; y) := \lim_{t \searrow 0} \frac{\|x + ty\| - \|x\|}{t}.$$

Observe that D−(x; y) = −D+(x; −y). We show next their geometric significance.
It will be beneficial to denote by J(x) the set of all supporting functionals at a
vector x. Recall that φ ∈ J(x) if and only if φ(x) = ‖x‖ and ‖φ‖ = 1.
Proposition 2.2 (Cf. [33, Lemma 1 and Theorem 15]) Let x be a nonzero vector
in X. Then

$$D_-(x; y) = \min_{\varphi \in J(x)} \operatorname{Re} \varphi(y), \qquad D_+(x; y) = \max_{\varphi \in J(x)} \operatorname{Re} \varphi(y).$$

Sketch of the Proof If φ ∈ J(x), then ‖φ‖ = ‖Re φ‖ = 1 and φ(x) = ‖x‖. So, for
t ∈ R sufficiently close to zero, ‖x + ty‖ ≥ |Re φ(x + ty)| = ‖x‖ + t Re φ(y). After
dividing by t and keeping an eye on its sign we get

$$\lim_{t \nearrow 0} \frac{\|x + ty\| - \|x\|}{t} \le \operatorname{Re} \varphi(y) \le \lim_{t \searrow 0} \frac{\|x + ty\| - \|x\|}{t}.$$

To show that both inequalities are achieved, define two real-linear functionals on
span_R{x, y} by ψ±(αx + βy) := α‖x‖ + βD±(x; y), (α, β ∈ R). Since t ↦ ‖x + ty‖
is convex, the function β ↦ (‖x + βy‖ − ‖x‖)/β increases to D−(x; y) as β ↗ 0 and
decreases to D+(x; y) as β ↘ 0, respectively. Thus, with β ≥ 0 we have

$$\psi_+(x + \beta y) = \|x\| + \beta D_+(x; y) \le \|x\| + (\|x + \beta y\| - \|x\|) = \|x + \beta y\|$$

with a similar derivation and the same conclusion also for β < 0. Moreover,

$$\psi_+(-x - \beta y) = -\|x\| - \beta D_+(x; y) \le -\|x\| + (\|x - \beta y\| - \|x\|) = \|x - \beta y\| - 2\|x\| \le \|{-x} - \beta y\|.$$

Thus, ψ+(αx + βy) ≤ ‖αx + βy‖ (α, β ∈ R). Combined with
ψ+(x) = ‖x‖, we get ‖ψ+‖ = 1. By the Hahn–Banach theorem, we can enlarge
the domain of ψ+ to X without affecting its norm, and (if X is a complex space)
make it into a complex-linear functional φ+ ∈ J(x) with Re φ+ = ψ+.
One argues similarly with ψ− to construct φ− ∈ J(x) with Re φ−(y) = ψ−(y) =
D−(x; y).

Proposition 2.3 (Cf. [47, Theorem 3.1]) Let X be a real normed space, x, y ∈ X.
Then x ⊥ (y − αx) if and only if D−(x; y) ≤ α‖x‖ ≤ D+(x; y).
Sketch of the Proof Apply Propositions 1.4 and 2.2.
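The criterion of Proposition 2.3 lends itself to a quick numerical sanity check. The sketch below is an illustration only, under several assumptions: the space is (R², ‖·‖∞), the one-sided derivatives are replaced by finite-difference quotients with a small step h, B-J orthogonality is tested on a grid of real λ, and all function names are hypothetical.

```python
def sup_norm(v):
    """Supremum norm on R^n (vectors are tuples of floats)."""
    return max(abs(c) for c in v)


def axpy(x, y, t):
    """Return x + t*y componentwise."""
    return tuple(a + t * b for a, b in zip(x, y))


def one_sided_derivatives(x, y, norm=sup_norm, h=1e-7):
    """Finite-difference estimates of (D_-(x; y), D_+(x; y))."""
    base = norm(x)
    d_plus = (norm(axpy(x, y, h)) - base) / h
    d_minus = (norm(axpy(x, y, -h)) - base) / (-h)
    return d_minus, d_plus


def is_bj_orthogonal(x, y, norm=sup_norm, radius=5.0, steps=2001, tol=1e-12):
    """Grid test of ||x + lam*y|| >= ||x|| for real lam in [-radius, radius]."""
    base = norm(x)
    for k in range(steps):
        lam = -radius + 2.0 * radius * k / (steps - 1)
        if norm(axpy(x, y, lam)) < base - tol:
            return False
    return True


x, y = (1.0, 1.0), (1.0, 0.0)
d_minus, d_plus = one_sided_derivatives(x, y)
print(d_minus, d_plus)

results = {}
for alpha in (-0.5, 0.0, 0.5, 1.0, 1.5):
    shifted = axpy(y, x, -alpha)  # the vector y - alpha*x
    predicted = d_minus - 1e-6 <= alpha * sup_norm(x) <= d_plus + 1e-6
    results[alpha] = (is_bj_orthogonal(x, shifted), predicted)
    print(alpha, results[alpha])
```

For x = (1, 1) and y = (1, 0) the estimates are close to D−(x; y) = 0 and D+(x; y) = 1, and the grid test agrees with the criterion: x ⊥ (y − αx) is observed exactly for the sampled values of α in [0, 1].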

Definition 2.4 Let (X, ‖·‖) be a real or complex normed space. The norm is
Gateaux differentiable at a point x if for all y ∈ X there exists

$$D(x; y) = \lim_{\mathbb{R} \ni t \to 0} \frac{\|x + ty\| - \|x\|}{t}.$$

Clearly, the norm is Gateaux differentiable at x ∈ X if and only if D+(x; y) =
D−(x; y) for all y ∈ X. In this case the Gateaux derivative induces a bounded
linear functional y ↦ D(x; y), see [73, Corollary 4.7]. We present below another
proof of this fact.
Proposition 2.5 (Cf. [18, part 3, ch. 1, §2, Proposition 2 and Remark 1])
A normalized vector x in a real or complex normed space X is a smooth point if
and only if the norm is Gateaux differentiable at x.
Proof In view of Proposition 2.1, it is sufficient to consider only a real-linear
normed space X . Assume first that the norm is Gateaux differentiable at x ∈ X ,
and consider any supporting functional φ at x. Then, by Proposition 2.2, for all
y ∈ X we have

D(x; y) = D− (x; y) ≤ φ(y) ≤ D+ (x; y) = D(x; y),

so φ(y) = D(x; y). It follows that the Gateaux derivative y ↦ D(x; y) is the
unique supporting functional at x, and thus x is a smooth point.
Assume now that x is not a smooth point, that is, there exist two different
supporting functionals φ1, φ2 at x. Then there exists some y ∈ X such that
φ1(y) < φ2(y). Hence it follows from Proposition 2.2 that

D−(x; y) ≤ φ1(y) < φ2(y) ≤ D+(x; y),

so D−(x; y) ≠ D+(x; y), and the norm is not Gateaux differentiable at x.

Since the norm is a convex function, it follows from [77, Theorem 25.2] that
the norm on a real finite-dimensional normed space X = (Rⁿ, ‖·‖) is Gateaux
differentiable if and only if it is differentiable at all nonzero points of X. In this case
the unique supporting functional f_x at a normalized vector x = (ξ_1, …, ξ_n) ∈ Rⁿ
is given by

$$f_x \colon y \mapsto \langle y, \partial x \rangle, \tag{4}$$

where ⟨u, v⟩ := uvᵗ is the standard scalar product of row vectors in Rⁿ and
$\partial x := \left(\frac{\partial \|\cdot\|}{\partial \xi_1}, \dots, \frac{\partial \|\cdot\|}{\partial \xi_n}\right)(x)$ is the norm's differential evaluated at the vector x.
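A complex normed space (Cⁿ, ‖·‖) can be regarded as a real normed space
(R²ⁿ, ‖·‖) by restricting the scalars. We write x = (ξ_1, …, ξ_n) ∈ Cⁿ and
consider the real and imaginary parts of ξ_k = Re ξ_k + i Im ξ_k as independent
real variables. The R-linear supporting functional g_x, which depends on 2n real
variables (Re ξ_1, Im ξ_1, …, Re ξ_n, Im ξ_n), is the real part of a C-linear functional
given by f_x : z ↦ g_x(z) − ig_x(iz). By the proof of Proposition 2.1, f_x is a C-linear
supporting functional at a normalized vector x. If the norm on X is smooth, then it
follows easily that f_x is given by

$$f_x \colon z \mapsto \left\langle z,\ \left( \frac{\partial \|\cdot\|}{\partial \operatorname{Re} \xi_1} + i \frac{\partial \|\cdot\|}{\partial \operatorname{Im} \xi_1}, \dots, \frac{\partial \|\cdot\|}{\partial \operatorname{Re} \xi_n} + i \frac{\partial \|\cdot\|}{\partial \operatorname{Im} \xi_n} \right)(x) \right\rangle,$$

where ⟨u, v⟩ := uv* is the standard scalar product of row vectors in Cⁿ. This can
be simplified if one introduces the complex partial derivatives

$$\frac{\partial \|\cdot\|}{\partial \xi_k} := \frac{\partial \|\cdot\|}{\partial \operatorname{Re} \xi_k} - i \frac{\partial \|\cdot\|}{\partial \operatorname{Im} \xi_k} \quad\text{and}\quad \frac{\partial \|\cdot\|}{\partial \overline{\xi_k}} := \frac{\partial \|\cdot\|}{\partial \operatorname{Re} \xi_k} + i \frac{\partial \|\cdot\|}{\partial \operatorname{Im} \xi_k};$$

then we can identify the C-linear supporting functional f_x with the norm's complex
conjugate differential

$$\partial_{\mathbb{C}}\, x := \left( \frac{\partial \|\cdot\|}{\partial \overline{\xi_1}}, \dots, \frac{\partial \|\cdot\|}{\partial \overline{\xi_n}} \right)(x) = \left( \frac{\partial \|\cdot\|}{\partial \operatorname{Re} \xi_1} + i \frac{\partial \|\cdot\|}{\partial \operatorname{Im} \xi_1}, \dots, \frac{\partial \|\cdot\|}{\partial \operatorname{Re} \xi_n} + i \frac{\partial \|\cdot\|}{\partial \operatorname{Im} \xi_n} \right)(x). \tag{5}$$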

We refer the interested reader to the monograph [6] for additional results based on
the (one-sided) norm derivatives D±. In particular, one can find characterizations of
continuous maps T : (R², ‖·‖) → (R², ‖·‖) satisfying D±(Tx; Ty) = D±(x; y)
(a generalization of Wigner's unitary-antiunitary theorem [92], see also [37, 66])
as well as additional characterizations of inner product spaces in terms of specific
triangle points.

2.2 Symmetry, Additivity and Uniqueness of B-J Orthogonality

We will need two additional properties of orthogonality which hold trivially for
orthogonality in inner product spaces.
Left uniqueness: For any x, y ∈ X, x ≠ 0, there exists at most one α ∈ F such
that (αx + y) ⊥ x.
Right uniqueness: For any x, y ∈ X, x ≠ 0, there exists at most one α ∈ F such
that x ⊥ (αx + y).
Theorem 2.6 (Cf. [46, Theorems 1 and 2], [47, Theorems 4.1–4.3 and 5.1]) Let
X be a real or complex normed space.
(i) B-J orthogonality in X is left unique if and only if the norm on X is strictly
convex.
(ii) B-J orthogonality in X is right unique if and only if it is right additive if and
only if the norm on X is smooth.

(iii) If dim X ≥ 3 then B-J orthogonality in X is symmetric if and only if X is an

inner product space.
(iv) If dim X = 2 then B-J orthogonality in X is left additive if and only if the norm
on X is strictly convex.
(v) If dim X ≥ 3 then B-J orthogonality in X is left additive if and only if X is an
inner product space.
Sketch of the Proof
(i) Assume first that the norm is not strictly convex, that is, the norm's unit sphere
contains some nontrivial line segment. Then there exist two distinct normalized
vectors x, y ∈ X such that ‖(1 − λ)x + λy‖ = 1 for all λ ∈ (−ε, 1 + ε), where
ε > 0 is small enough. Consider an arbitrary supporting functional f at x, that
is, f(x) = ‖f‖ = 1. Then

|1 + λ(f(y) − 1)| = |f((1 − λ)x + λy)| ≤ ‖f‖ · ‖(1 − λ)x + λy‖ = 1

for any λ ∈ (−ε, ε), so f(y) = 1. Hence f((1 − λ)x + λy) = 1 for all
λ ∈ (−ε, 1 + ε), and thus f is also supporting at (1 − λ)x + λy. Since f(x − y) =
0, we obtain from Proposition 1.4 that ((1 − λ)x + λy) ⊥ (x − y) for all
λ ∈ (−ε, 1 + ε). Therefore, B-J orthogonality in X is not left unique.
Assume now that B-J orthogonality in X is not left unique. Then there exist
a two-dimensional subspace Y ⊆ X, two linearly independent normalized
vectors x, y ∈ Y and a normalized vector z ∈ Y such that x ⊥ z and y ⊥ z.
Consider two supporting functionals f_x, f_y : Y → F at x and y, respectively,
such that f_x(z) = f_y(z) = 0. Since Y is two-dimensional, f_y is a scalar
multiple of f_x, so we may assume that f_x = f_y = f. Hence

1 = f((1 − λ)x + λy) ≤ ‖f‖ · ‖(1 − λ)x + λy‖ ≤ 1

for all λ ∈ [0, 1]. Therefore, the norm's unit sphere contains the line segment
[x, y], and the norm on X is not strictly convex.
(ii) We first show that if B-J orthogonality is right unique, then for any nonzero
x ∈ X there exists a unique supporting functional f_x at x. Assume to
the contrary that f_1 and f_2 are two distinct supporting functionals for some
normalized vector x ∈ X. Then there exists y ∈ X such that f_1(y) ≠ f_2(y).
We have f_1(y − f_1(y)x) = 0, so x ⊥ (y − f_1(y)x). Similarly, f_2(y − f_2(y)x) =
0, so x ⊥ (y − f_2(y)x). Hence B-J orthogonality in X is not right unique.
It follows from Proposition 1.4 that uniqueness of the supporting functional
f_x for any nonzero x ∈ X implies right additivity of B-J orthogonality in X.
We next show that right additivity implies right uniqueness.
Assume that B-J orthogonality is right additive and that x ⊥ (αx + y),
x ⊥ (βx + y) for some α, β ∈ F and x, y ∈ X with x ≠ 0. Then x ⊥ −(βx + y), so
x ⊥ ((αx + y) − (βx + y)) = (α − β)x. This is only possible for α − β = 0,
so B-J orthogonality in X is right unique.
(iii) Let Y be an arbitrary three-dimensional subspace of X, and consider any two
elements x, y ∈ Y. Let f be a norm-one linear functional on Y such that
Z = span{x, y} ⊆ ker f. There exists a normalized vector z ∈ Y such that
f(z) = ‖f‖. By Proposition 1.4, z ⊥ Z. If B-J orthogonality is symmetric,
then we also have Z ⊥ z.
Hence, if a projection P : Y → Z is defined by w = P(w) + k_w z, k_w ∈ F,
then ‖P(w)‖ ≤ ‖w‖ for all w ∈ Y. Therefore, ‖P‖ = 1. Now, if X is a
real normed space, then, by Kakutani's result [50, Theorem 4], the existence
of such a normalized projection implies that the norm's unit ball in Y is an
ellipsoid, so Y is an inner product space. The same conclusion holds true also
if X is a complex normed space, as shown by Bohnenblust [26, Theorem A]
(but see also [7, §12] for a modern treatment). Since Y is arbitrary, it follows
from the Jordan–von Neumann characterization [49] that X is an inner product
space.
(iv) If dim X = 2 then B-J orthogonality is left additive if and only if it is left
unique. Then the statement follows from (i).
(v) Consider two arbitrary linearly independent elements y_1, y_2 ∈ X. If f_i is any
supporting functional at y_i and Y_i = ker f_i, then y_i ⊥ Y_i, i ∈ {1, 2}. Let
Y = Y_1 ∩ Y_2. If B-J orthogonality is left additive, then Z ⊥ Y, where Z =
span{y_1, y_2}. Hence Z ∩ Y = {0}, so dim Z = 2 and codim Y ≤ 2 imply that
X = Z ⊕ Y. We can now define a norm-one projection P : X → Z along Y.
It follows again from Kakutani's [50, Theorem 4] and Bohnenblust's [26,
Theorem A] results that X is an inner product space.

Note that Theorem 2.6(iii) does not hold for dim X = 2. In other words, there
exist two-dimensional normed spaces in which B-J orthogonality is symmetric,
however, the norm is not induced by an inner product. Such spaces are called Radon
planes (see [5]), and their first examples are due to Birkhoff [22] and James [46,
p. 561]. We provide a complete characterization of real Radon planes which was
obtained by Day [31].
Remark 2.7 (Cf. [31, pp. 330–333]) Let X be a two-dimensional real normed
space. Then B-J orthogonality in X is symmetric if and only if modulo a linear
transformation its unit sphere SX can be obtained by the following procedure:
(1) Choose any (auxiliary) two-dimensional real normed space Y.
(2) Find any two normalized vectors x, y ∈ Y with x ⊥⊥ y. They exist by
Proposition 2.13(i) below.
(3) Find supporting functionals fx , fy ∈ Y ∗ with fx (x) = fy (y) = 1 and fx (y) =
fy (x) = 0.
(4) Choose a coordinate system in Y with x and y at (1, 0) and (0, 1), correspond-
ingly. Similarly, choose a coordinate system in Y ∗ with fx and fy at (0, 1) and
(−1, 0).
(5) Set the first and the third quadrants of SX equal to the first and the third
quadrants of SY , set the second and the fourth quadrants of SX equal to the
second and the fourth quadrants of SY ∗ .
270 L. Arambašić et al.

Fig. 2 Real Radon plane based on the ℓ1–ℓ∞ norm

Example If one takes Y = (R², ‖·‖_p) and Y* = (R², ‖·‖_q) with p, q ∈ [1, ∞]
and 1/p + 1/q = 1 in Remark 2.7, then the resulting Radon plane X is the one which
was constructed by James (Fig. 2).
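The case p = 1, q = ∞ of this example can be checked by hand or by machine. The sketch below is only an illustration under stated assumptions: the function names `radon_norm` and `bj_orthogonal` are hypothetical, the norm is taken to be ℓ1 on the first and third quadrants and ℓ∞ on the second and fourth (so its unit sphere is a hexagon), and the test vectors are small integer vectors. Because the norm is convex and piecewise linear, and the nearest nonzero breakpoint of λ ↦ ‖x + λy‖ for such vectors is at distance at least 1/6 from 0, comparing ‖x ± εy‖ with ‖x‖ for ε = 1/100 in exact rational arithmetic decides x ⊥ y.

```python
from fractions import Fraction
from itertools import product

def radon_norm(v):
    """Assumed norm: l1 on quadrants I and III, l-infinity on quadrants II and IV."""
    a, b = v
    if a * b >= 0:
        return abs(a) + abs(b)
    return max(abs(a), abs(b))

def bj_orthogonal(x, y, eps=Fraction(1, 100)):
    """Exact B-J test for integer vectors with entries in [-3, 3].

    lambda -> ||x + lambda*y|| is convex and piecewise linear with no nonzero
    breakpoint in (-1/6, 1/6), so 0 is a global minimizer exactly when the
    norm does not decrease at lambda = +eps and lambda = -eps.
    """
    base = radon_norm(x)
    for s in (eps, -eps):
        if radon_norm((x[0] + s * y[0], x[1] + s * y[1])) < base:
            return False
    return True

grid = [v for v in product(range(-3, 4), repeat=2) if v != (0, 0)]

# Symmetry of B-J orthogonality on all pairs of grid vectors.
symmetric = all(bj_orthogonal(x, y) == bj_orthogonal(y, x) for x in grid for y in grid)

# The parallelogram law fails for u = (1, 0), v = (0, 1), so the norm
# is not induced by an inner product.
lhs = radon_norm((1, 1)) ** 2 + radon_norm((1, -1)) ** 2
rhs = 2 * (radon_norm((1, 0)) ** 2 + radon_norm((0, 1)) ** 2)

print(symmetric, lhs, rhs)
```

On this finite sample the relation is symmetric although the parallelogram law fails, which is consistent with the plane being a Radon plane; it is of course not a proof of symmetry for all vectors.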
Some examples of complex Radon planes were given by Oman [70, pp. 43–48,
Constructions III and V], though he did not obtain a complete characterization for
them. His constructions use the following result:
Proposition 2.8 (Cf. [70, Theorem 3.4]) A two-dimensional real or complex
normed space X is a Radon plane if and only if there exists an isometry Φ : X →
X* such that Φ(x)(x) = 0, x ∈ X.
Consider a general real normed space. How far is isosceles orthogonality from
satisfying the B-J condition? One way to quantify this is to ask: how
deep inside the norm's unit sphere can a line go if it contains two points at the same
distance from the origin and placed symmetrically relative to a given normalized
vector x? The answer was given by James [45, Theorem 4.2]: If ‖x + y‖ = ‖x − y‖,
then 1/2 < inf_{λ∈R} ‖x + λy‖/‖x‖. For real Radon planes this estimate was improved by
Mizuguchi [65, Theorem 2.9] to 8/9 ≤ inf_{λ∈R} ‖x + λy‖/‖x‖; the equality holds for some
nonzero x ⊥_I y if and only if the norm is hexagonal (see [65, Theorem 2.10]).

2.3 Mutual B-J Orthogonality

We have already mentioned that if dim X ≥ 2, then for every nonzero x ∈ X
there is a nonzero y ∈ X such that x ⊥ y. However, as the following example
shows, it is possible that for some nonzero x there is no nonzero y with x ⊥⊥ y. By
Corollary 2.10 below, this can happen only in two-dimensional spaces.
Example Let X = (F², ‖·‖₁). Then the only supporting functional at x = (1/3, 2/3) ∈
X is given by f(ξ1, ξ2) = ξ1 + ξ2. By Proposition 1.4, if y = (ζ1, ζ2) is a normalized
vector such that x ⊥ y, then y ∈ Ker f, i.e., ζ1 + ζ2 = 0. We may assume that
y = (1/2, −1/2). The only supporting functional at y is given by g(ξ1, ξ2) = ξ1 − ξ2,
and since x ∉ Ker g, we conclude that y ⊥̸ x. Hence for this x there exists no
nonzero vector y such that x ⊥⊥ y.
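The asymmetry in this example is easy to see numerically. The following sketch (helper names `norm1` and `min_norm_along` are illustrative, and only real values of λ on a finite grid are sampled) confirms that ‖x + λy‖₁ never drops below ‖x‖₁ on the grid, whereas ‖y + λx‖₁ does drop below ‖y‖₁ for small positive λ. The second statement is a genuine witness that y is not B-J orthogonal to x; the first is only consistent with x ⊥ y.

```python
from fractions import Fraction

def norm1(v):
    """The l1 norm of a finite vector."""
    return sum(abs(c) for c in v)

def min_norm_along(x, y, lambdas):
    """Smallest value of ||x + lam*y||_1 over the sampled scalars lam."""
    return min(norm1([a + lam * b for a, b in zip(x, y)]) for lam in lambdas)

x = (Fraction(1, 3), Fraction(2, 3))
y = (Fraction(1, 2), Fraction(-1, 2))
lambdas = [Fraction(k, 100) for k in range(-300, 301)]  # real grid on [-3, 3]

x_perp_y = min_norm_along(x, y, lambdas) >= norm1(x)   # no decrease found
y_perp_x = min_norm_along(y, x, lambdas) >= norm1(y)   # a decrease is found

print(x_perp_y, y_perp_x)
```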
Theorem 2.9 (Cf. [13, Theorem 2.3]) Let n ∈ N, and let X be a real or complex
normed space with dim X ≥ 2n + 1. Then for any normalized vectors x1, . . . , xn ∈
X there exists a normalized vector y ∈ X such that y ⊥⊥ xk for all k = 1, . . . , n.
Sketch of the Proof We are going to prove a stronger result, namely, we will find a
normalized vector y ∈ X such that

\[
\begin{aligned}
& x_k \perp y \quad \text{for all } k = 1, \dots, n; \\
& y \perp \sum_{k=1}^{n} \gamma_k x_k \quad \text{for all } \gamma_1, \dots, \gamma_n \in \mathbb{F}.
\end{aligned}
\tag{6}
\]

We can assume that dim X = 2n + 1, since we may pass from X to any
(2n+1)-dimensional subspace of X which contains x1, . . . , xn and look for y there.
Moreover, it is enough to prove the theorem for the case when the norm on X is
smooth. Indeed, if the norm on X is nonsmooth, we can construct a sequence of
smooth norms ‖·‖_(m) converging uniformly to ‖·‖ on compact subsets of X (see
[32, p. 52–53] or [41, Theorem 2.10]), and for any m ∈ N choose a normalized (in
the norm ‖·‖) vector ym ∈ X which satisfies condition (6) with respect to the norm
‖·‖_(m). Since the unit sphere in X is compact, passing to a subsequence if necessary,
we may assume that lim_{m→∞} ym = y for some y ∈ X. This y satisfies condition (6)
with respect to the original norm ‖·‖.
Let now X be a (2n + 1)-dimensional space with a smooth norm. For each k ∈
{1, . . . , n} let fk be a supporting functional at xk, and

\[
N := \bigcap_{k=1}^{n} \ker f_k. \tag{7}
\]

Then dim X = 2n + 1 implies dim N ≥ n + 1, and each vector z ∈ N satisfies
xk ⊥ z for all k = 1, . . . , n. In order to find z ∈ N which satisfies the second
condition in (6), we split into two cases.
Case 1: X is a real normed space.
We identify R^{2n+1} with X and the standard basis vectors e1, . . . , e_{2n+1} ∈ R^{2n+1}
with a particular basis of X such that e1, . . . , e_{n+1} ∈ N. Hence each z ∈ R^{n+1} ⊕
0_n ⊆ X (here 0_n denotes the zero vector in R^n) satisfies xk ⊥ z.
The gradient function of the norm ‖·‖,

\[ z \mapsto \partial\|z\|, \]

exists everywhere on the slice of the norm's unit sphere

\[ \Gamma := \{ z = (\zeta_1, \dots, \zeta_{n+1}, 0, \dots, 0) \in \mathbb{R}^{2n+1};\ \|z\| = 1 \} \]

and it is continuous there by [77, Theorem 25.5]. Also, ‖−z‖ = ‖z‖ implies that

\[ \partial\|{-z}\| = -\partial\|z\| \tag{8} \]

for all z ∈ Γ. Note that the Euclidean unit n-sphere S^n ⊂ R^{n+1} is homeomorphic
to Γ (via the radial projection
r : (ζ1, . . . , ζ_{n+1}) ↦ (ζ1, . . . , ζ_{n+1}, 0, . . . , 0)/‖(ζ1, . . . , ζ_{n+1}, 0, . . . , 0)‖). In particular,
by Borsuk–Ulam's theorem applied to the function

\[
z \mapsto \bigl( \langle x_1, \partial\|r(z)\| \rangle, \dots, \langle x_n, \partial\|r(z)\| \rangle \bigr) : S^n \to \mathbb{R}^n, \tag{9}
\]

which is odd (by (8)) and continuous, there exists a point y ∈ Γ ⊆ R^{n+1} ⊕ 0_n
on the unit sphere where the function (9) is equal to 0_n. By Proposition 2.5,
g : x ↦ ⟨x, ∂‖y‖⟩ is a supporting functional at y, and for all γ1, . . . , γn ∈ R we
have g(∑_{k=1}^n γk xk) = 0, so y ⊥ ∑_{k=1}^n γk xk. Since y ∈ N, we finally obtain that y
satisfies condition (6).
Case 2: X is a complex normed space.
Recall that dimC X = 2n + 1 and that each z ∈ N satisfies xk ⊥ z in a complex
normed space X . It remains to find an element in N which satisfies the second
condition in (6). To do this, we consider X and N as real normed spaces. We may
apply the construction from the real case to 2n vectors x1 , ix1 , . . . , xn , ixn , since
dim_R X = 2(2n + 1) ≥ 2(2n) + 1 and dim_R N ≥ 2(n + 1) ≥ (2n) + 1. This
gives us a normalized vector y ∈ N such that, for all α1, . . . , αn, β1, . . . , βn ∈ R,
it holds y ⊥ ∑_{k=1}^n (αk xk + βk ixk) in a real vector space X. This is equivalent to
y ⊥ ∑_{k=1}^n γk xk for all γ1, . . . , γn ∈ C in a complex vector space X, so y satisfies
both conditions in (6).

Example 2.4 in [13] demonstrates that the lower bound on dim X in Theorem 2.9
is exact. We state two immediate corollaries.
Corollary 2.10 Let X be a normed space over F with dim X ≥ 3. Then for every
normalized vector x there is a normalized vector y with x ⊥⊥ y.
Corollary 2.11 Let X be a normed space over F with dim X ≥ 5. Then for every
two normalized vectors x, y ∈ X there is a normalized vector z ∈ X with
x ⊥⊥ z ⊥⊥ y.
Another corollary to Theorem 2.9 is that in infinite-dimensional spaces we can
find infinitely many pairwise B-J orthogonal nonzero vectors.
Corollary 2.12 Let X be an infinite-dimensional normed space over F. Then there
exists an infinite number of nonzero vectors which are pairwise B-J orthogonal.
Proof We construct a sequence (xn )n of pairwise B-J orthogonal normalized
vectors in X recursively. Let x1 ∈ X be an arbitrary normalized vector. Assume
now that we already have x1, . . . , xn. Since dim X > 2n + 1, Theorem 2.9 implies
that there exists a normalized vector x_{n+1} ∈ X such that x_{n+1} ⊥⊥ xk for all
k = 1, . . . , n.

In the case of finite-dimensional spaces, the lower bound on the number of pair-
wise B-J orthogonal nonzero vectors was obtained by Taylor [90, Theorem 2], see
Proposition 2.13(i) below. This result is also known as Auerbach’s lemma [16, 17].
The main idea of Taylor’s proof [90, Theorem 1] was to maximize the volume of
a parallelepiped inscribed into the norm’s unit ball. It has also a nice geometrical
consequence mentioned by Day [30, Theorem 4.1]. Namely, let S be a unit ball of
some norm ‖·‖ in R^n. Then S can be inscribed into an n-dimensional parallelepiped
P centered at 0, so that the middle point of every hyperface of P belongs to S.
Proposition 2.13 (Cf. [90, Theorem 2], [13, Proposition 3.3, Theorem 3.5]) Let
n ≥ 2 and let X be a real or complex n-dimensional normed space.
(i) There exist at least n pairwise B-J orthogonal normalized vectors in X .
(ii) If the norm on X is smooth then there exist exactly n pairwise B-J orthogonal
nonzero vectors in X .
(iii) There exists a positive integer γn,F (it depends only on the dimension n and on
the field F but not on the normed space X ) such that any set of pairwise B-J
orthogonal nonzero vectors in X has at most γn,F members.

3 Examples of B-J Orthogonality

3.1 B-J Orthogonality in B(H) and Beyond

What does B-J orthogonality look like in classical Banach spaces? For Hilbert spaces
it coincides with the usual inner product orthogonality. The next important example
for which we provide the answer is B(H), the space of bounded linear operators on a
complex Hilbert space H (equipped with the usual operator norm). Let us begin
with two examples.
Example
(a) Let A, B ∈ B(H) be operators with orthogonal ranges, that is, such that A*B =
0. Then, by the Pythagorean identity, for all λ ∈ C it holds

\[
\|A + \lambda B\|^2 = \sup_{\|x\|=1} \|(A + \lambda B)x\|^2 = \sup_{\|x\|=1} \bigl( \|Ax\|^2 + \|\lambda Bx\|^2 \bigr) \ge \|A\|^2,
\]

so A ⊥ B.
(b) Let A ∈ B(H) be an operator which is not bounded from below. Let (xn)n
be a sequence of normalized vectors such that lim_{n→∞} ‖Axn‖ = 0. Then
lim_{n→∞} ‖(I + λA)xn‖ = 1 = ‖I‖ for all λ ∈ C, so ‖I + λA‖ ≥ ‖I‖ and
I ⊥ A.
B-J orthogonality in B(H) was studied by several authors. A characterization in
the special case when one of the operators is the identity was obtained by Stampfli
[86, Theorem 2] in the study on derivations. It was stated in terms of the maximal
numerical range of A defined as

\[
W_0(A) = \Bigl\{ \lambda \in \mathbb{C};\ \exists (x_n)_n \in H,\ \|x_n\| = 1,\ \lim_{n\to\infty} \langle Ax_n, x_n \rangle = \lambda,\ \lim_{n\to\infty} \|Ax_n\| = \|A\| \Bigr\}.
\]

Stampfli first proved that W0(A) is a closed convex subset of C, and this was used
in the proof of the equivalence d(A, CI) = ‖A‖ ⇔ 0 ∈ W0(A).
Later, in his study on the distance to finite-dimensional subspaces in operator
algebras, Magajna [63] introduced, for A, B ∈ B(H), the notion of the maximal
numerical range of B*A relative to A in the following way:

\[
W_A(B^*A) = \Bigl\{ \mu \in \mathbb{C};\ \exists (x_n)_n \in H,\ \|x_n\| = 1,\ \lim_{n\to\infty} \langle B^*Ax_n, x_n \rangle = \mu,\ \lim_{n\to\infty} \|Ax_n\| = \|A\| \Bigr\}.
\]

Obviously, if B = I, this reduces to Stampfli's maximal numerical range of A.

Magajna observed that Stampfli’s results hold, with the same arguments, in the
more general case, and this implies a complete characterization of B-J orthogonal
operators in B(H). The same characterization was obtained with the help of different
methods by Bhatia and Šemrl [20] in their study of the diameter of a unitary orbit of
a matrix, and also by Roy, Bagchi, and Sain [78] within their directional approach
to B-J orthogonality.
The following proof is an adaptation of Theorem 2 from [86].
Theorem 3.1 Let A, B ∈ B(H). Then A ⊥ B if and only if 0 ∈ W_A(B*A). In
other words, A ⊥ B if and only if there is a sequence of normalized vectors (xn)n
in H such that

\[
\lim_{n\to\infty} \|Ax_n\| = \|A\| \quad\text{and}\quad \lim_{n\to\infty} \langle B^*Ax_n, x_n \rangle = 0.
\]

Proof We may assume that ‖B‖ = 1. By using the same arguments as in [86,
Lemma 2], it can be shown that the set W_A(B*A) is a nonempty, closed and convex
subset of C.
Suppose that 0 ∈ W_A(B*A). Let (xn)n be a sequence of normalized vectors in
H such that lim_{n→∞} ⟨B*Axn, xn⟩ = 0 and lim_{n→∞} ‖Axn‖ = ‖A‖. Then for each n
we have

\[
\begin{aligned}
\|A + \lambda B\|^2 &\ge \|(A + \lambda B)x_n\|^2 \\
&= \|Ax_n\|^2 + 2\,\mathrm{Re}\bigl(\bar{\lambda}\langle B^*Ax_n, x_n \rangle\bigr) + |\lambda|^2 \|Bx_n\|^2 \\
&\ge \|Ax_n\|^2 + 2\,\mathrm{Re}\bigl(\bar{\lambda}\langle B^*Ax_n, x_n \rangle\bigr) \xrightarrow{\,n\to\infty\,} \|A\|^2 .
\end{aligned}
\]
Conversely, suppose that 0 ∉ W_A(B*A). Since W_A(B*A) is closed and
convex, without loss of generality we may assume that there is τ > 0 such that
Re W_A(B*A) ≥ τ (by changing B with e^{iφ}B for an appropriate φ). Let

\[ S = \{ x \in H;\ \|x\| = 1,\ \mathrm{Re}\langle B^*Ax, x \rangle \le \tau/2 \} \]

and η = sup{‖Ax‖; x ∈ S}. Then Re W_A(B*A) ≥ τ implies that η < ‖A‖. Let

\[ \mu = \min\Bigl\{ \frac{\tau}{2}, \frac{\|A\| - \eta}{2} \Bigr\} > 0. \]

We will show that ‖A − μB‖ < ‖A‖, which will imply that A ⊥̸ B.
Let x ∈ H be a normalized vector. If x ∈ S then

\[ \|(A - \mu B)x\| \le \|Ax\| + \mu\|Bx\| \le \eta + \mu \le \|A\| - \mu. \]

Suppose now that x ∉ S. Let us write Ax = (a + ib)Bx + y with ⟨Bx, y⟩ = 0.
Then 0 < μ ≤ τ/2 < Re⟨B*Ax, x⟩ = a‖Bx‖² ≤ a. After some manipulation this
implies μ(μ − 2a)‖Bx‖² < −μ². Then

\[
\begin{aligned}
\|(A - \mu B)x\|^2 &= \|(a + ib - \mu)Bx + y\|^2 = \bigl((a - \mu)^2 + b^2\bigr)\|Bx\|^2 + \|y\|^2 \\
&= \bigl((a^2 + b^2)\|Bx\|^2 + \|y\|^2\bigr) + \mu(\mu - 2a)\|Bx\|^2 \\
&= \|Ax\|^2 + \mu(\mu - 2a)\|Bx\|^2 < \|A\|^2 - \mu^2 .
\end{aligned}
\]

This proves that ‖A − μB‖ < ‖A‖, so A ⊥̸ B.

Directly from this characterization we obtain a family of B-J orthogonal pairs of
operators on a complex Hilbert space.
Corollary 3.2 For all A, B ∈ B(H) it holds A ⊥ B(‖A‖² I − A*A).
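For 2 × 2 matrices the statement of Corollary 3.2 can be sampled numerically with nothing beyond the standard library. The sketch below is an illustration only: the matrices A and B are arbitrary choices, the helper names are hypothetical, the spectral norm is computed from the closed form for the largest eigenvalue of M*M in dimension two, and λ runs over a finite grid of complex numbers, so the output is evidence rather than proof.

```python
import cmath
import math

def matmul(P, Q):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(tuple(sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def adjoint(M):
    """Conjugate transpose of a 2x2 matrix."""
    return tuple(tuple(complex(M[j][i]).conjugate() for j in range(2)) for i in range(2))

def spectral_norm(M):
    """Operator norm of a 2x2 matrix: square root of the top eigenvalue of M*M."""
    t = sum(abs(M[i][j]) ** 2 for i in range(2) for j in range(2))   # tr(M*M)
    d = abs(M[0][0] * M[1][1] - M[0][1] * M[1][0]) ** 2              # det(M*M)
    return math.sqrt((t + math.sqrt(max(t * t - 4 * d, 0.0))) / 2)

A = ((2, 1), (0, 1))
B = ((1, 2j), (3, 4))            # arbitrary illustrative choice
norm_A = spectral_norm(A)

# C = B(||A||^2 I - A*A), the operator appearing in Corollary 3.2.
AstarA = matmul(adjoint(A), A)
defect = tuple(tuple((norm_A ** 2 if i == j else 0) - AstarA[i][j] for j in range(2))
               for i in range(2))
C = matmul(B, defect)

# Sample complex scalars lambda = r * exp(i*theta).
lambdas = [r * cmath.exp(2j * math.pi * k / 36)
           for r in (0.01, 0.1, 0.5, 1.0, 2.0) for k in range(36)]
min_norm = min(
    spectral_norm(tuple(tuple(A[i][j] + lam * C[i][j] for j in range(2))
                        for i in range(2)))
    for lam in lambdas
)

print(norm_A, min_norm)
```

The reason behind the corollary is visible in the construction: a unit vector on which A attains its norm is annihilated by ‖A‖² I − A*A, hence by C, so it serves as the vector required in the finite-dimensional form of Theorem 3.1.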
Another corollary is a simpler characterization of B-J orthogonality in the case of
compact operators on a complex Hilbert space and, in particular, when dim H < ∞.
B-J orthogonality of compact operators on a separable, complex Hilbert space was
discussed by Kečkić, see [54, Corollary 2.8]. We present here a short proof based
on Theorem 3.1. Also, in our proof only A needs to be compact.
Corollary 3.3 Let A, B ∈ B(H), with A compact. The following are equivalent.
(i) A ⊥ B.
(ii) There exists a normalized vector x ∈ H such that ‖A‖ = ‖Ax‖ and
⟨Ax, Bx⟩ = 0.
Proof Suppose that A ⊥ B. By the preceding result, then 0 ∈ W_A(B*A), so
there is a sequence (xn)n of normalized vectors such that lim_{n→∞} ‖Axn‖ = ‖A‖
and lim_{n→∞} ⟨B*Axn, xn⟩ = 0. Since |A| = √(A*A) is compact and (xn)n is
bounded, we may assume that (|A|xn)n is convergent. By [91, Lemma 2.1], it holds
lim_{n→∞} (|A|xn − ‖A‖xn) = 0, so (xn)n is convergent as well. Then (ii) holds with
x := lim xn. The converse is obvious.
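Condition (ii) is convenient to test in small examples. The sketch below works in M_2(C) with illustrative matrices (A = diag(2, 1), B the coordinate swap, and the identity as a non-orthogonal comparison); all helper names are hypothetical and the grid of complex scalars is finite, so the first check is only consistent with A ⊥ B, while the last line exhibits an explicit scalar showing that A is not B-J orthogonal to the identity.

```python
import cmath
import math

def spectral_norm(M):
    """Operator norm of a 2x2 matrix via the top eigenvalue of M*M."""
    t = sum(abs(M[i][j]) ** 2 for i in range(2) for j in range(2))
    d = abs(M[0][0] * M[1][1] - M[0][1] * M[1][0]) ** 2
    return math.sqrt((t + math.sqrt(max(t * t - 4 * d, 0.0))) / 2)

def apply(M, v):
    """Matrix-vector product for a 2x2 matrix."""
    return tuple(M[i][0] * v[0] + M[i][1] * v[1] for i in range(2))

def inner(u, v):
    """Inner product <u, v>, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def euclid(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(abs(c) ** 2 for c in v))

def plus(A, lam, B):
    """The matrix A + lam * B."""
    return tuple(tuple(A[i][j] + lam * B[i][j] for j in range(2)) for i in range(2))

A = ((2, 0), (0, 1))
B = ((0, 1), (1, 0))
I2 = ((1, 0), (0, 1))
x = (1.0, 0.0)                       # A attains its norm at x

attains_norm = abs(euclid(apply(A, x)) - spectral_norm(A)) < 1e-12
inner_AB = inner(apply(A, x), apply(B, x))   # <Ax, Bx>, expected to vanish

lambdas = [r * cmath.exp(2j * math.pi * k / 36)
           for r in (0.01, 0.1, 0.5, 1.0, 2.0) for k in range(36)]
min_norm_B = min(spectral_norm(plus(A, lam, B)) for lam in lambdas)

# For the identity, the scalar -1/2 already decreases the norm.
norm_shifted = spectral_norm(plus(A, -0.5, I2))

print(attains_norm, inner_AB, min_norm_B, norm_shifted)
```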

In the case when dim H < ∞, this corollary can also be stated in terms of
matrices (see [20, Theorem 1.1]). Moreover, Li and Schneider used Proposition 1.6
to generalize it to rectangular (real or complex) matrices M_{m×n}(F) with Schatten
p-norm

\[
\|A\|_p := \sqrt[p]{\sum_{i=1}^{n} (\sigma_i(A))^p} = \Bigl( \operatorname{tr} \bigl(\sqrt{A^*A}\,\bigr)^{p} \Bigr)^{1/p},
\]

where σ1(A) ≥ σ2(A) ≥ · · · ≥ σn(A) ≥ 0 are the singular values of A (the square
roots of eigenvalues of A*A). Note that ‖A‖_∞ = σ1(A) is the spectral norm. By
[79, Theorem 9], Schatten norms ‖·‖_p and ‖·‖_q on M_{m×n}(F) with 1/p + 1/q = 1
are dual to each other; the duality is given by (A, B) ↦ tr(B*A).
Except for p = 1 and p = ∞, Schatten p-norm is differentiable, hence smooth.
This can be seen directly for p > 2, because X ↦ (tr(√(X*X))^p)^{1/p} is a
composition of real differentiable functions: X ↦ H(X) = X*X is clearly real
differentiable, while

\[
H = U \Bigl( \sum_{i=1}^{n} \lambda_i(H) E_{ii} \Bigr) U^* \mapsto \sqrt[p]{\operatorname{tr}(|H|^{p/2})} := \sqrt[p]{\sum_{i=1}^{n} |\lambda_i(H)|^{p/2}}
\]

is also a differentiable map on the real space of Hermitian matrices by, e.g.,
Lewis's [59, Theorem 1.1]. One should mention that the differentiability of
H ↦ (tr(|H|^{p/2}))^{1/p} follows also from Rellich's result [75] (a simplified proof can be
found in Kato's monograph [53, Theorem II.6.8]). Namely, for each Hermitian
H, H′ ∈ M_n(C), there exist n continuously differentiable functions in a real variable
t which represent n eigenvalues (counted with multiplicities) of H + tH′. It is then
immediate that H ↦ |H|^{p/2} has a continuous directional derivative in direction
H′, so it must be differentiable. We remark that [59, Theorem 1.1] gives a much
more general differentiability result for matrix functions. The differentiability for
1 < p < 2 was proved in [15], but the reader is again referred to Lewis [58,
Corollary 2.6] for a more general result.
Let us introduce a bit more notation: Given a matrix A ∈ M_n(F), we denote by

W_C(A) := {x*Ax; x ∈ C^n, x*x = 1}

its numerical range. If F = R we let

W_R(A) := {x*Ax; x ∈ R^n, x*x = 1}

be the restricted numerical range. Recall that W_C(A) is convex, and so is W_R(A)
because it is an interval (it is the image of a continuous function x ↦ x*Ax whose
domain is the Euclidean (n − 1)-sphere).
We proceed to prove Li and Schneider's extension for spectral norm. Recall
that the dual of spectral norm is the trace norm (i.e., Schatten 1-norm). Also, as
required by Proposition 1.6, the extreme points of the trace norm on M_{m×n}(F) are
the operators yx* for x ∈ F^n and y ∈ F^m with ‖x‖ = ‖y‖ = 1 (see Ziętak [95]
or [58, Theorem 3.4]).
Proposition 3.4 (Cf. [60, Theorem 3.1]) Let m ≤ n. The following are equivalent
for A, B ∈ (M_{m×n}(F), ‖·‖_∞), equipped with Schatten ∞-norm (i.e., spectral
norm):
(i) A ⊥ B.
(ii) There exists a normalized vector x ∈ F^n such that ‖A‖_∞ = ‖Ax‖ and
⟨Ax, Bx⟩ = 0.
(iii) There exist normalized vectors y ∈ F^m and x ∈ F^n such that ‖A‖_∞ = y*Ax
and y*Bx = 0.
(iv) For any U ∈ M_{m×r}(F) whose columns form an orthonormal basis for the
eigenspace of AA* corresponding to the largest eigenvalue, we have

0 ∈ W_F(U*BA*U).

Sketch of the Proof (i) ⟹ (iv). By Proposition 2.1, there exist extreme points
Fk = yk xk* ∈ M_{m×n}(F) (‖xk‖ = ‖yk‖ = 1) for 1 ≤ k ≤ h and positive scalars λk
summing to one, such that

\[
\|A\|_\infty = \langle F_k, A \rangle := \operatorname{tr}(F_k^* A) = y_k^* A x_k \quad\text{and}\quad \sum_{k=1}^{h} \lambda_k \langle F_k, B \rangle = 0. \tag{10}
\]

Thus, A achieves its norm on xk, and yk is proportional to Axk. By the assumption
on U in item (iv) (together with the singular value decomposition of A), there exist
normalized vectors vk ∈ F^r with

\[
U v_k = y_k \quad\text{and}\quad x_k = \frac{1}{\|A\|_\infty} A^* y_k = \frac{1}{\|A\|_\infty} V v_k; \qquad V = A^* U \in M_{n\times r}(\mathbb{F}).
\]

Thus, by the second condition in (10), 0 = ∑_{k=1}^h λk vk* U*BV vk, which is a point in
the convex hull of W_F(U*BV). Due to its convexity, 0 ∈ W_F(U*BV), giving (iv).
(iv) ⟹ (iii). Choose a normalized vector v ∈ F^r with 0 = v*U*BA*Uv; then
y := Uv and x := (1/‖A‖_∞) A*Uv are normalized vectors satisfying (iii).
The remaining implications (iii) ⟹ (ii) ⟹ (i) are straightforward.

Proposition 3.5 (Cf. [60, Theorem 3.2]) Let m ≤ n and let p ∈ (1, ∞). The
following are equivalent for A, B ∈ (M_{m×n}(F), ‖·‖_p), equipped with Schatten
p-norm:
(i) A ⊥ B.
(ii) tr(P^{p−1}UB*) = 0 where A = PU is the polar decomposition (P ∈ M_m(F) is
positive semidefinite and U ∈ M_{m×n}(F) satisfies UU* = I_m).
Sketch of the Proof Assume that A ≠ 0 and write P = ∑_{i=1}^m σi(A) xi xi* for some
orthonormal basis (xi)i. One can calculate that

\[
T := \frac{1}{\|A\|_p^{p}} P^{p-1} U = \frac{1}{\|A\|_p^{p}} \sum_{i=1}^{m} \sigma_i(A)^{p-1} x_i x_i^* U
\]

satisfies tr(T*A) = 1 and ‖T‖_q = 1/‖A‖_p, where 1/p + 1/q = 1. Therefore, by
smoothness of the norm, ‖A‖_p T induces the only supporting functional at A, and
the result follows from Proposition 1.4.

In the same paper [60], Li and Schneider also characterized B-J orthogonality of
Mm×n (F) in trace norm. Note that it corresponds to Schatten 1-norm which is not
differentiable.
Spectral and trace norms are the first and the last among Ky Fan norms. These are
defined for any positive integer k, not larger than the size of a matrix, by (see [43,
Theorem 3.4.1])

\[
\|A\|_{(k)} := \sum_{i=1}^{k} \sigma_i(A) = \max_{\substack{U, V \in M_{n\times k}(\mathbb{C}) \\ U^*U = V^*V = I_k}} |\operatorname{tr}(U^* A V)|; \qquad A \in M_n(\mathbb{C}).
\]

The special importance of Ky Fan norms is their dominance property: Given two
matrices A, B ∈ M_n(C), then ‖A‖ ≤ ‖B‖ for every unitarily invariant norm ‖·‖ if
and only if this inequality holds for all n Ky Fan norms (see [43, Corollary 3.5.9]).
Grover in [39] gave the following characterization of B-J orthogonality in Ky Fan
norms, which we state without a proof:
Proposition 3.6 (Cf. [39, Theorem 1.1 and Theorem 3.2]) Let 1 ≤ k ≤ n, let
A = U|A| be a polar decomposition of A ∈ M_n(F), and let B ∈ M_n(F). If there
exist k orthonormal vectors u1, . . . , uk ∈ F^n such that

\[
|A| u_i = \sigma_i(A) u_i \ \text{ for } i = 1, \dots, k \quad\text{and}\quad \sum_{i=1}^{k} \langle u_i, U^* B u_i \rangle = 0,
\]

then A is B-J orthogonal to B in ‖·‖_(k). If in addition σk(A) > 0, then the converse
is also true.
The condition (ii) from Corollary 3.3 can be restated as:
(ii′) There exists a normalized vector x ∈ F^n such that ‖Ax‖ = ‖A‖ and Ax ⊥ Bx.
Now (ii′), unlike (ii), makes sense not only in inner product spaces but also in
general normed spaces. It is easy to see that, in any norm ‖·‖ on F^n, (ii′) implies
(i) of Corollary 3.3 (i.e., A ⊥ B) provided that M_n(F) is equipped with the induced
operator norm ‖T‖ := sup_{‖x‖=1} ‖Tx‖. Bhatia and Šemrl conjectured in [20] that,
conversely, (i) implies (ii′) in any norm on F^n.
The first counterexample to this conjecture was provided by Li and Schneider
[60, Example 4.3], in the ℓ_p norm on F^n for p ≠ 2, and with matrices
A = $\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$ ⊕ 0_{n−2} and B = I_2 ⊕ 0_{n−2}. Later, it was shown in Benítez, Fernández,
and Soriano [19] that a counterexample to this conjecture exists in R^n whenever
the regarded norm is not induced by an inner product, that is, the implication (i)
of Corollary 3.3 ⇒ (ii′) is a characterization of real inner product spaces of finite
dimension.
The conditions on an individual operator A which force that (i) implies (ii′)
were studied in a series of papers by Ghosh, Hait, Paul, Sain, and others. In real
Banach spaces X of finite dimension one such neat sufficient condition is that
the norm-attaining set M_A := {x ∈ X; ‖x‖ = 1, ‖Ax‖ = ‖A‖} contains at
most two connected components, see the survey paper [72, Theorem 8.2.1]. When
dim X = 2, this condition is also necessary, see [72, Theorem 8.4.8]. Moreover,
by [72, Theorem 8.8.2], for arbitrary operators A, B acting on a finite-dimensional
real Banach space we have A ⊥ B if and only if there exist x, y ∈ M_A such that

\[
\|Ax + \lambda Bx\| \ge \|Ax\| = \|A\| \ \text{ for } \lambda \ge 0 \quad\text{and}\quad \|Ay + \lambda By\| \ge \|Ay\| = \|A\| \ \text{ for } \lambda \le 0
\]

(sufficiency is trivial: ‖A + λB‖ ≥ ‖Ax + λBx‖ ≥ ‖A‖ for λ ≥ 0 and ‖A + λB‖ ≥
‖Ay + λBy‖ ≥ ‖A‖ for λ ≤ 0 combined give ‖A + λB‖ ≥ ‖A‖ for each λ ∈ R).
Similar results for operators on an infinite-dimensional X as well as the description
of smooth points in B(X) can also be found in this survey.

3.2 B-J Orthogonality in Commutative C*-Algebras and Function Spaces

The case of commutative C*-algebras was discussed by Kečkić. Recall that for
each unital commutative C*-algebra A there is a compact Hausdorff space K such
that A is ∗-isomorphic to the C*-algebra C(K) of all continuous complex valued
functions on K with the maximum norm ‖f‖ = max{|f(t)|; t ∈ K}, see [23,
II.2.2.4 and II.1.1.3.(2)]. If A is a nonunital commutative C*-algebra, then there
is a noncompact locally compact Hausdorff space Ω such that A is isomorphic
to C0(Ω), the C*-algebra of all continuous complex functions on Ω vanishing at
"infinity". If K = Ω ∪ {s0} is the one-point compactification of Ω, then we can
identify C0(Ω) with the C*-subalgebra {f ∈ C(K); f(s0) = 0} of C(K). In this
way, since B-J orthogonality of two elements "happens" in the subspace spanned by
these two elements, it is enough to obtain the characterization of B-J orthogonality
in C(K).
Let us begin with some examples.
Example
(a) Let f, g ∈ C(K) be such that there is t0 ∈ K satisfying |f(t0)| = ‖f‖ and
g(t0) = 0. Then

‖f + λg‖ ≥ |f(t0) + λg(t0)| = ‖f‖, ∀λ ∈ C,

so f ⊥ g.
(b) Let f, g ∈ C(K). It follows from the first example that f ⊥ (‖f‖² − |f|²)g.
Observe the similarity with Corollary 3.2.
(c) Let f, g ∈ C([0, 1]) be defined as f(t) = 1 and g(t) = e^{2πit}. It is easy to see
that f ⊥ g. This example shows that the sufficient conditions stated in the first
example are not necessary.
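Example (c) can be explored on a discretization of [0, 1]. The sketch below is an illustration under explicit assumptions: the interval is replaced by N = 360 equally spaced sample points, the scalars λ are taken from a small finite set, and the variable names are hypothetical. It checks that the sampled value of ‖f + λg‖ stays at or above ‖f‖ = 1, that g has no zero on the sample, and that the values of conj(f(t))g(t) over the norm-attaining set of f average to zero, which is the discrete counterpart of integrating against a probability measure.

```python
import cmath
import math

N = 360
ts = [k / N for k in range(N)]                       # sample points in [0, 1)
f = [1.0 + 0j for _ in ts]                           # f(t) = 1
g = [cmath.exp(2j * math.pi * t) for t in ts]        # g(t) = exp(2*pi*i*t)

def sup_norm(values):
    """Maximum modulus over the sample."""
    return max(abs(v) for v in values)

# Sampled complex scalars lambda = r * exp(i*theta).
lambdas = [r * cmath.exp(2j * math.pi * k / 24)
           for r in (0.1, 0.5, 1.0, 2.0) for k in range(24)]
min_norm = min(sup_norm([fv + lam * gv for fv, gv in zip(f, g)]) for lam in lambdas)

# Norm-attaining set of f on the sample (here: every sample point).
norm_f = sup_norm(f)
E_f = [i for i, fv in enumerate(f) if abs(abs(fv) - norm_f) < 1e-12]

# Uniform probability measure on E_f: mean of conj(f(t)) * g(t).
mean_point = sum(f[i].conjugate() * g[i] for i in E_f) / len(E_f)

print(min_norm, abs(mean_point), min(abs(gv) for gv in g))
```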
Theorem 3.7 (Cf. [55, Corollary 2.1]) Let C(K) be the Banach space of all
continuous complex valued functions on a compact Hausdorff space K, with the
norm ‖f‖ = max{|f(t)|; t ∈ K}. For f ∈ C(K) define

E_f = {t ∈ K; |f(t)| = ‖f‖}.

Then the following are equivalent for f, g ∈ C(K):

(i) f ⊥ g.
(ii) The set F = {\overline{f(t)} g(t); t ∈ E_f} is not contained in an open half plane in C
with a boundary that contains the origin, that is, the closed convex hull of F
contains the origin.
(iii) There exists a probability measure μ concentrated at E_f, such that

\[ \int_K \overline{f(t)}\, g(t)\, d\mu(t) = 0. \]

Therefore, if f ≠ 0 and E_f = {t0} is a singleton, then f ⊥ g if and only if
g(t0) = 0.
Sketch of the Proof By [55, Theorem 2.1, Proposition 1.3] f is orthogonal to g if
and only if

\[
\inf_{\varphi \in [0, 2\pi)} \max_{t \in E_f} \mathrm{Re}\bigl( e^{i\varphi} e^{-i \arg f(t)} g(t) \bigr) \ge 0,
\]

that is, if and only if it holds that, under every rotation around the origin, the set
{e^{−i arg f(t)} g(t); t ∈ E_f} contains at least one value with nonnegative real part.
Since \overline{f(t)} g(t) = ‖f‖ e^{−i arg f(t)} g(t) for each t ∈ E_f, this is equivalent to (ii).
Since the convex hull of the set F is the set of points of the form
∫_K \overline{f(t)} g(t) dλ(t) where λ is a probability measure supported on a finite subset
of E_f, there is a sequence (λn)n of such measures with

\[
0 = \lim_{n\to\infty} \int_K \overline{f(t)}\, g(t)\, d\lambda_n(t).
\]

By the Banach–Alaoglu theorem, there is a subsequence (λ_{n_k}) which w*-converges
to some probability measure μ. Obviously, the support of μ is contained in E_f and
(iii) holds. Conversely, suppose that (iii) holds. Then for all λ ∈ C we have

\[
\|f + \lambda g\|^2 \ge \int_{E_f} |f + \lambda g|^2(t)\, d\mu(t) = \|f\|^2 + |\lambda|^2 \int_{E_f} |g(t)|^2\, d\mu(t) \ge \|f\|^2,
\]

which gives (i).

In the same paper, Kečkić also considered a more general space Cb (X) of
bounded, continuous complex valued functions on a locally compact, Hausdorff
space X, equipped with the supremum norm. His result is as follows:
Proposition 3.8 (Cf. [55, Corollary 3.1]) The following are equivalent for f, g ∈
Cb(X):
(i) f ⊥ g.
(ii) There exists a sequence of probability measures μn concentrated at E_δ := {x ∈
X; |f(x)| ≥ ‖f‖ − δ} such that

\[ \lim_{n\to\infty} \int_X \overline{f(x)}\, g(x)\, d\mu_n(x) = 0. \]

For c0, that is, the space of all complex-valued sequences which converge to zero,
equipped with the supremum norm, the characterization of B-J orthogonality is, as
expected, easier.
Proposition 3.9 (Cf. [54, Example 1.7]) The following are equivalent for x =
(xn)n, y = (yn)n ∈ c0:
(i) x ⊥ y.
(ii) There does not exist an acute open angle D = {z; α < arg z < β} with
β − α < π, such that \overline{x_n} y_n ∈ D for all those n for which |xn| = ‖x‖ holds.
Some other classical spaces are ℓ_p and L_p(μ) with p ∈ [1, ∞). Here L_p(μ)
denotes the space of complex valued functions on a measurable space Ω with a
positive measure μ whose p-th power is integrable. In these spaces B-J orthogonality
was characterized by James [47] and Kečkić [54] by means of supporting
functionals and Gateaux derivatives of the norm. We remark that James considered
the real case and Ω = [0, 1] with Lebesgue measure μ only, but the proof can be
transferred easily to the complex case and an arbitrary measurable space Ω.
Proposition 3.10 (Cf. [47, Example 8.1]) Let x = (xj)_{j∈N} and y = (yj)_{j∈N}
belong to ℓ_p with p ≥ 1. Then x ⊥ y if and only if one of the following conditions
holds:
(i) p = 1 and

\[
\Bigl| \sum_{x_j \ne 0} \frac{\overline{x_j}}{|x_j|}\, y_j \Bigr| \le \sum_{x_j = 0} |y_j|;
\]

(ii) p > 1 and

\[
\sum_{j \in \mathbb{N}} |x_j|^{p-2}\, \overline{x_j}\, y_j = 0,
\]

where any occurrence of |0|^{p−2}·0 in the sum above is interpreted as zero.

It follows that if p = 1 then x ≠ 0 is a smooth point of the norm if and only if
xj ≠ 0 for every j ∈ N, and if p > 1 then the norm on ℓ_p is smooth.
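Condition (ii) also makes the lack of symmetry of B-J orthogonality in ℓ_p easy to exhibit. The sketch below is an illustration only, with hypothetical helper names, the real two-dimensional vectors x = (2, 1) and y = (1, −4), the exponent p = 3, and a finite grid of real scalars. The functional from (ii) vanishes for the pair (x, y) but not for (y, x); the grid search is consistent with x ⊥ y and finds an explicit scalar that decreases ‖y + λx‖₃ below ‖y‖₃.

```python
def lp_norm(v, p):
    """The l_p norm of a finite vector."""
    return sum(abs(c) ** p for c in v) ** (1.0 / p)

def james_functional(x, y, p):
    """Sum of |x_j|^(p-2) * conj(x_j) * y_j over indices with x_j != 0 (p > 1)."""
    return sum(abs(a) ** (p - 2) * complex(a).conjugate() * b
               for a, b in zip(x, y) if a != 0)

def min_norm_along(x, y, p, lambdas):
    """Smallest value of ||x + lam*y||_p over the sampled scalars lam."""
    return min(lp_norm([a + lam * b for a, b in zip(x, y)], p) for lam in lambdas)

p = 3
x = (2.0, 1.0)
y = (1.0, -4.0)
lambdas = [k / 100 for k in range(-300, 301)]        # real grid on [-3, 3]

value_xy = james_functional(x, y, p)   # 2*2*1 + 1*1*(-4) = 0
value_yx = james_functional(y, x, p)   # 1*1*2 + 4*(-4)*1 = -14

x_perp_y = min_norm_along(x, y, p, lambdas) >= lp_norm(x, p) - 1e-12
y_perp_x = min_norm_along(y, x, p, lambdas) >= lp_norm(y, p) - 1e-12

print(value_xy, value_yx, x_perp_y, y_perp_x)
```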
Proposition 3.11 (Cf. [47, Example 8.2] and [54, Example 1.6]) Let μ be a
positive measure on a measurable space Ω. Then for f, g ∈ L_p(μ), p ≥ 1, it
holds f ⊥ g if and only if one of the following conditions is satisfied:
(i) p = 1 and

\[
\Bigl| \int_{\Omega \setminus f^{-1}(0)} \frac{\overline{f(t)}}{|f(t)|}\, g(t)\, d\mu(t) \Bigr| \le \int_{f^{-1}(0)} |g(t)|\, d\mu(t);
\]

(ii) p > 1 and

\[
\int_{\Omega} |f(t)|^{p-2}\, \overline{f(t)}\, g(t)\, d\mu(t) = 0.
\]

Similarly, if p = 1 then f ≠ 0 is a smooth point of the norm if and only if
μ(f^{−1}(0)) = 0, and if p > 1 then the norm on L_p(μ) is smooth.

3.3 B-J Orthogonality in General C*-Algebras and Hilbert C*-Modules

A characterization of B-J orthogonality in general C*-algebras (and, more generally,
in Hilbert C*-modules) was discussed in Arambašić–Rajić [10] and later, by using a
different approach, in Bhattacharyya–Grover [21]. They obtained a characterization
in terms of states on a C*-algebra A, that is, positive linear functionals on A of
norm one.
For example, if x ∈ H is a normalized vector, then φ : B(H) → C given by
φ(T) = ⟨Tx, x⟩ is a state on B(H), while in the case of C(K) examples of states are
evaluations at a point. There are other states than these (for example, if T ∈ B(H) is
a positive operator which does not achieve its norm, then by [68, Theorem 5.1.11]
there exists a state φ such that φ(T²) = ‖T‖², while ⟨T²x, x⟩ = ‖Tx‖² < ‖T‖² for
each normalized x).
Observe that Theorem 3.1 can be rewritten in terms of states. Namely, 0 ∈
W_A(B*A) means that there exists a sequence of normalized vectors (xn)n such that

\[
\langle A^*Ax_n, x_n \rangle \xrightarrow{\,n\to\infty\,} \|A\|^2 \quad\text{and}\quad \langle B^*Ax_n, x_n \rangle \xrightarrow{\,n\to\infty\,} 0.
\]

Then for each n, φn : T ↦ ⟨Txn, xn⟩ is a state. Recall that the set of states is closed
in w*-topology, hence compact by the Banach–Alaoglu theorem.
Proposition 3.12 (Cf. [10, Theorem 2.7], [21, Proposition 4.1]) The following are
equivalent for elements a, b of a C*-algebra A:
(i) a ⊥ b.
(ii) There exists a state φ on A such that

φ(a*a) = ‖a‖² and φ(a*b) = 0.

Proof Let φ be a state as in (ii). Then for all λ ∈ C we have

‖a‖² = |φ(a*(a + λb))| ≤ ‖a*(a + λb)‖ ≤ ‖a‖ ‖a + λb‖,

which gives a ⊥ b.
Conversely, suppose that a ⊥ b. By the Gelfand–Naimark theorem, we embed A
into B(H) and then use a sequence of states φn : T ↦ ⟨Txn, xn⟩ provided by
Theorem 3.1. Due to w*-compactness, this sequence has a subsequence which w*-
converges to a desired state φ.
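A small worked instance may help to see how a state certifies B-J orthogonality. The matrices below are chosen here purely for illustration and do not come from the cited sources; the algebra is M_2(C) and the state is the vector state at the first basis vector e_1.

```latex
\[
a = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}, \qquad
b = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
\varphi(T) = \langle T e_1, e_1 \rangle = T_{11}.
\]
\[
a^*a = \begin{pmatrix} 4 & 0 \\ 0 & 1 \end{pmatrix}, \qquad
a^*b = \begin{pmatrix} 0 & 2 \\ 1 & 0 \end{pmatrix}, \qquad
\varphi(a^*a) = 4 = \|a\|^2, \qquad \varphi(a^*b) = 0.
\]
\[
4 = |\varphi(a^*(a + \lambda b))| \le \|a\|\,\|a + \lambda b\| = 2\,\|a + \lambda b\|
\quad\Longrightarrow\quad \|a + \lambda b\| \ge 2 = \|a\| \quad \text{for all } \lambda \in \mathbb{C}.
\]
```

So a ⊥ b, in agreement with the chain of inequalities used in the first part of the proof.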

By using the linking algebra of a Hilbert C*-module, this result can be extended
from C*-algebras to Hilbert C*-modules. The concept of a Hilbert C*-module has
been introduced by Kaplansky [52] and Paschke [71] in an investigation of right
modules over a C*-algebra which possess a C*-valued inner product respecting the
module action. More precisely, a Hilbert C*-module X over a C*-algebra A is a
right A-module equipped with an A-valued inner product ⟨· , ·⟩ : X × X → A
which satisfies
(1) ⟨x, αy + βz⟩ = α⟨x, y⟩ + β⟨x, z⟩ for x, y, z ∈ X, α, β ∈ C,
(2) ⟨x, ya⟩ = ⟨x, y⟩a for x, y ∈ X, a ∈ A,
(3) ⟨x, y⟩* = ⟨y, x⟩ for x, y ∈ X,
(4) ⟨x, x⟩ ≥ 0 for x ∈ X; if ⟨x, x⟩ = 0 then x = 0,
and which is a Banach space with respect to the norm defined as ‖x‖ = √‖⟨x, x⟩‖.
We say that a Hilbert C*-module X over a C*-algebra A is full if the inner products
of elements from X span a dense subset in A, in short, if \overline{⟨X, X⟩} = A. A left
Hilbert C*-module can be defined in a similar way. Every right Hilbert A-module
is also a left Hilbert C*-module over a C*-algebra K(X) of 'compact' operators
on X, that is, the C*-algebra spanned by the operators θ_{x,y}, x, y ∈ X, defined as
θ_{x,y}(z) = x⟨y, z⟩. It is easy to show that ‖x‖² = ‖θ_{x,x}‖.
Besides Hilbert spaces, one of the most important examples of Hilbert C*-modules
are C*-algebras. If A is a C*-algebra, then we can regard it as a Hilbert
C*-module over itself with the algebra multiplication as a (right) module action and
an inner product defined as ⟨a, b⟩ = a*b.
The linking algebra L(X) of a Hilbert A-module X is defined as the C*-algebra
of all 'compact' operators acting on the Hilbert A-module A ⊕ X. It can be written
in the matrix form

\[
L(X) = \left\{ \begin{pmatrix} T_a & l_y \\ r_x & T \end{pmatrix};\ a \in A,\ x, y \in X,\ T \in K(X) \right\},
\]

where the maps r_x : A → X, l_y : X → A and T_a : A → A are given by r_x(a) =
xa, l_y(z) = ⟨y, z⟩ and T_a(b) = ab. For more details on Hilbert C*-modules we
refer to [23] and [64].
The following result was proved in [10] and [21].
Theorem 3.13 (Cf. [10, Theorem 2.7] and [21, Theorem 4.4]) Let X be a Hilbert
C*-module over a C*-algebra A. Let x, y ∈ X. Then x ⊥ y if and only if there is
a state φ on A such that φ(⟨x, x⟩) = ‖x‖² and φ(⟨x, y⟩) = 0.

Sketch of the Proof Let x, y ∈ X be such that x ⊥ y. Then for the matrices

\[
X = \begin{pmatrix} 0 & 0 \\ r_x & 0 \end{pmatrix} \quad\text{and}\quad Y = \begin{pmatrix} 0 & 0 \\ r_y & 0 \end{pmatrix},
\]

which are elements of the linking C*-algebra of the module, it holds X ⊥ Y. Applying
Proposition 3.12 to the linking algebra, we get a state Φ on it such that
Φ(X*X) = ‖X‖² and Φ(X*Y) = 0. If φ : A → C is defined by the formula

\[
\varphi(a) = \Phi \begin{pmatrix} T_a & 0 \\ 0 & 0 \end{pmatrix},
\]

we easily see that φ is a state on A satisfying φ(⟨x, x⟩) = ‖x‖² and φ(⟨x, y⟩) = 0.
The converse is similar to the C*-algebra case.

If H and K are Hilbert spaces, then the Banach space B(H, K) of all bounded
linear operators from H into K is a Hilbert C ∗ -module over B(H). Therefore, as a
corollary of the previous theorem we get a generalization of Theorem 3.1.
Corollary 3.14 Let A, B ∈ B(H, K). Then A ⊥ B if and only if there is a
sequence of normalized vectors (xn )n in H such that limn→∞ ‖Axn‖ = ‖A‖ and
limn→∞ ⟨B ∗ Axn , xn⟩ = 0.
Birkhoff–James Orthogonality: Characterizations, Preservers, and Orthogonality Graphs 285

We remark that Singla [85, Theorem 1.3] very recently extended Corollary 3.14 and
classified when an operator A ∈ B(H, K), which is not far away from compact
operators, is B-J orthogonal to a subspace B ⊆ B(H, K). This was achieved as
a consequence of his study of norm’s directional derivatives D+ at elements in a
C ∗ -algebra (which are not far from some, possibly nonproper, closed ideal).

3.4 Strong B-J Orthogonality on Hilbert C ∗ -Modules

In addition to the types of orthogonality that we have already mentioned, in a Hilbert
C ∗ -module X over a C ∗ -algebra A there are two types of orthogonality that rely on
the C ∗ -modular structure of X . The first one is orthogonality which directly generalizes
orthogonality in inner product spaces: two elements x and y of a Hilbert C ∗ -module
X are orthogonal with respect to the C ∗ -valued inner product in X if ⟨x, y⟩ = 0.
This is a very strong type of orthogonality; for example, in a Hilbert C ∗ -module
B(H, K) over B(H ), where the inner product of A and B is defined as A∗ B, this
is exactly the range orthogonality of operators. The second one generalizes B-J
orthogonality in a way that the role of scalars is taken by elements of the underlying
C ∗ -algebra A (see [11]): for x, y ∈ X we say that x is strongly B-J orthogonal to
y, denoted as x ⊥s y, if ‖x + ya‖ ≥ ‖x‖ for every a ∈ A , that is, if the distance
from x to the A -submodule yA of X generated by y is exactly ‖x‖.
Obviously, x ⊥s y holds if and only if x ⊥ ya for every a ∈ A . However, as
we show in the following theorem, it turns out that instead of checking that x ⊥ ya
holds for every a ∈ A , it is enough to check that x ⊥ ya holds for one special
element a ∈ A . Also, this result gives a characterization of strong B-J orthogonality.
Theorem 3.15 (Cf. [11, Theorem 2.5]) Let X be a Hilbert C ∗ -module over a
C ∗ -algebra A , and let x, y ∈ X . Then the following statements are equivalent:
(a) x ⊥s y.
(b) x ⊥ y⟨y, x⟩.
(c) There is a state φ on A such that φ(⟨x, x⟩) = ‖x‖² and φ(⟨x, y⟩⟨y, x⟩) = 0.
Proof The implication (a) ⇒ (b) is obvious, while the equivalence (b) ⇔ (c)
follows from Theorem 3.13. It remains to prove (c) ⇒ (a).
Suppose that there is a state φ on A such that φ(⟨x, x⟩) = ‖x‖² and
φ(⟨x, y⟩⟨y, x⟩) = 0. Let a ∈ A be arbitrary. By the Bunyakovsky-Cauchy-Schwarz
inequality applied to (a, b) ↦ φ(a ∗ b) we get

|φ(⟨x, ya⟩)|² = |φ(⟨x, y⟩a)|² ≤ φ(⟨x, y⟩⟨y, x⟩)φ(a ∗ a) = 0,

so φ(⟨x, ya⟩) = 0, and therefore x ⊥ ya by Theorem 3.13. This gives that x ⊥s y.

In combination with Theorems 3.1 and 3.7 this result gives the following
characterizations of strong B-J orthogonality in C ∗ -algebras C(K) and B(H )
(regarded as Hilbert C ∗ -modules over themselves):

(1) If f, g ∈ C(K), then f ⊥s g if and only if there is t0 ∈ K such that |f (t0 )| =
‖f‖ and g(t0 ) = 0 (cf. [13, Proposition 4.2]).
(2) If A, B ∈ B(H ), then A ⊥s B if and only if there is a sequence of normalized
vectors (xn ) in H such that limn→∞ ‖Axn‖ = ‖A‖ and limn→∞ B ∗ Axn = 0.
In particular, if dim H < ∞, then A ⊥s B if and only if there is a normalized
vector x ∈ H such that ‖Ax‖ = ‖A‖ and B ∗ Ax = 0 (cf. [11, Proposition 2.8]).
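As a hedged numerical illustration of the finite-dimensional criteria, the following sketch works with diagonal 2 × 2 matrices, stored as pairs of diagonal entries so that the spectral norm is simply the largest modulus. The helper names (`norm`, `add`, `mul`, `scale`), the matrices A = I and B = diag(1, −1), and the sampled scalar set are assumptions made for this example only. Passing a test on sampled scalars supports, but does not prove, B-J orthogonality.

```python
import cmath

# Diagonal 2x2 matrices are stored as pairs of diagonal entries (an assumption
# made to keep the example dependency-free); products act entrywise.
def norm(d):
    return max(abs(d[0]), abs(d[1]))  # spectral norm of a diagonal matrix

def add(p, q):
    return (p[0] + q[0], p[1] + q[1])

def mul(p, q):
    return (p[0] * q[0], p[1] * q[1])

def scale(lam, p):
    return (lam * p[0], lam * p[1])

A = (1.0, 1.0)   # the identity matrix
B = (1.0, -1.0)  # diag(1, -1)

# Sampled complex scalars; this is evidence for A ⊥ B, not a proof.
SCALARS = [r * cmath.exp(2j * cmath.pi * k / 24) for r in (0.1, 1.0, 5.0) for k in range(24)]
bj_holds_on_samples = all(norm(add(A, scale(lam, B))) >= norm(A) - 1e-12 for lam in SCALARS)

# Criterion for A ⊥ B: a unit vector x with ||Ax|| = ||A|| and <B*Ax, x> = 0.
x = (2 ** -0.5, 2 ** -0.5)
Ax = (A[0] * x[0], A[1] * x[1])
norm_Ax = (abs(Ax[0]) ** 2 + abs(Ax[1]) ** 2) ** 0.5
BstarAx = (B[0].conjugate() * Ax[0], B[1].conjugate() * Ax[1])
inner = BstarAx[0] * x[0].conjugate() + BstarAx[1] * x[1].conjugate()

# Strong B-J orthogonality fails: the element a = -B gives A + B a = I - B^2 = 0.
a = scale(-1.0, B)
strong_witness_norm = norm(add(A, mul(B, a)))
```

Here the vector x satisfies ⟨B∗Ax, x⟩ = 0 but B∗Ax ≠ 0, and since B is invertible no unit vector can satisfy B∗Ax = 0; this matches the fact that A ⊥ B holds while A ⊥s B fails for this pair.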
It follows from Theorem 3.15 and the definition of strong B-J orthogonality that in
every Hilbert C ∗ -module X the following implications hold for all x, y ∈ X :

⟨x, y⟩ = 0 ⇒ x ⊥s y ⇒ x ⊥ y.

Now that we have characterizations of B-J and strong B-J orthogonality in X , we
easily see that the converse implications do not hold in general. For example, let
X = C([0, 1]). In order to see that the converse of the first implication does not
hold, take f, g ∈ C([0, 1]) defined as f (t) = 1 and g(t) = 2t − 1 for t ∈ [0, 1].
Then ⟨f, g⟩ = f̄ g ≠ 0, while f ⊥s g holds because |f (1/2)| = ‖f‖ and g(1/2) = 0.
As an example of functions which show that the converse of the second implication
does not hold, we can use the same function f and the function h(t) = e^{2πit} . By
Theorem 3.7, it follows that f ⊥ h, while f is not strongly B-J orthogonal to h,
since h(t) ≠ 0 for all t ∈ [0, 1].
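The two counterexamples in C([0, 1]) can be checked numerically. The sketch below is only illustrative: the grid `GRID`, the sampled scalars `SCALARS`, the sample module elements in `samples_a`, and all helper names are assumptions of this example, and sup norms are approximated on the grid.

```python
import cmath

GRID = [k / 1000 for k in range(1001)]  # sample points of [0, 1]; contains t = 1/2

def sup_norm(func):
    return max(abs(func(t)) for t in GRID)

f = lambda t: 1.0
g = lambda t: 2 * t - 1
h = lambda t: cmath.exp(2j * cmath.pi * t)

SCALARS = [r * cmath.exp(2j * cmath.pi * k / 16) for r in (0.1, 0.5, 1.0, 3.0) for k in range(16)]

def bj_on_samples(u, v, scalars, tol=1e-9):
    """True if ||u + lam*v|| >= ||u|| for every sampled scalar lam (evidence only)."""
    base = sup_norm(u)
    return all(sup_norm(lambda t, lam=lam: u(t) + lam * v(t)) >= base - tol for lam in scalars)

f_bj_g = bj_on_samples(f, g, SCALARS)
f_bj_h = bj_on_samples(f, h, SCALARS)

# f ⊥s g: for any continuous a, (f + g a)(1/2) = 1, so ||f + g a|| >= ||f||.
samples_a = [lambda t: 1.0, lambda t: t, h, lambda t: 5 - 3 * t]
f_strong_g_on_samples = all(
    sup_norm(lambda t, a=a: f(t) + g(t) * a(t)) >= sup_norm(f) - 1e-9 for a in samples_a
)

# f is not strongly B-J orthogonal to h: a = -conj(h) gives f + h a = 1 - |h|^2 = 0.
witness_norm = sup_norm(lambda t: f(t) + h(t) * (-h(t).conjugate()))
```

The last line exhibits an explicit element a of the algebra with ‖f + ha‖ < ‖f‖, which is a conclusive witness, whereas the sampled tests for f ⊥ g and f ⊥ h are only consistent with the theory.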
It is evident that these three types of orthogonality coincide in a Hilbert space
(regarded as a Hilbert C ∗ -module over the C ∗ -algebra of complex numbers). Now
it is natural to ask if there is any other example of a Hilbert C ∗ -module in which
two of these three orthogonalities coincide. These questions were discussed in [12]
for the class of full Hilbert C ∗ -modules, and the following results were obtained.
Theorem 3.16 (Cf. [12, Theorem 3.5, Corollary 4.9]) Let X ≠ {0} be a full
Hilbert C ∗ -module over a C ∗ -algebra A .
(1) The equivalence x ⊥ y ⇔ x ⊥s y holds for every x, y ∈ X if and only if A
is isomorphic to C.
(2) If X is not singly generated, then the equivalence ⟨x, y⟩ = 0 ⇔ x ⊥s y holds
for every x, y ∈ X if and only if A is isomorphic to C. If X is singly generated,
then the equivalence ⟨x, y⟩ = 0 ⇔ x ⊥s y holds for every x, y ∈ X if and
only if K(X ) is isomorphic to C.
Recall that a right Hilbert C ∗ -module X over a C ∗ -algebra A is also a full left
Hilbert C ∗ -module over the C ∗ -algebra K(X ). The norm on X is defined as ‖x‖ =
√‖⟨x, x⟩‖, so we can say that it is induced by the C ∗ -norm on A and the A -valued
inner product (x, y) ↦ ⟨x, y⟩ on X . It is easy to see that for every x ∈ X it holds
‖x‖ = √‖θx,x‖, which can be interpreted as if the norm on X were induced by
the C ∗ -norm on K(X ) and the K(X )-valued inner product (x, y) ↦ θx,y on X .
Therefore, if strong B-J orthogonality coincides on X with either B-J orthogonality
or orthogonality with respect to the A -valued inner product, then X is an ordinary
Hilbert space. In other words, whenever neither A nor K(X ) is isomorphic to C,
strong B-J orthogonality is a new type of orthogonality on X . Therefore, this is a
very interesting topic and many papers discuss this subject in general or for some
special classes of Hilbert C ∗ -modules. For example, in [67] the authors characterize
strong B-J orthogonality for elements of a general C ∗ -algebra (regarded as a Hilbert
C ∗ -module over itself) as well as for the special classes of elements of B(H). In
the same paper some characterizations of standard B-J orthogonality for elements
of a Hilbert K(H)-module and of B(H) are obtained. Let us also mention the paper
[93] where the author studies B-J orthogonality for elements and finite-dimensional
subspaces of a pre-Hilbert C ∗ -module in terms of a convex hull of continuous linear
functionals.

4 Applications of B-J Orthogonality

Of all the inequivalent orthogonalities listed in Sect. 1, B-J orthogonality has
arguably found the most applications. Part of the reason is that several key properties
of orthogonality in inner product spaces are inherited by B-J orthogonality. For
example, given a vector x in an inner product space X , its orthogonal projection to
a subspace Y ⊆ X is its best approximation (within Y). That is,

\[ \|x - y_0\| = \inf_{y \in \mathcal{Y}} \|x - y\| =: d(x, \mathcal{Y}) \tag{11} \]

if and only if ⟨x − y0 , y⟩ = 0 for each y ∈ Y. This holds also for general normed
spaces:
Proposition 4.1 Let X be a normed space. Among all the vectors from a subspace
Y ⊆ X a vector y0 ∈ Y is the best approximation to x in the sense of (11) precisely
when x − y0 is B-J orthogonal to Y.
Proof From the definition of B-J orthogonality we have that (x − y0 ) ⊥ Y if and
only if ‖(x − y0 ) + y‖ ≥ ‖x − y0‖ for every y ∈ Y. Since Y is a subspace, y0 − y
runs over all of Y as y does, so this is exactly (11).
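A small numerical sketch of Proposition 4.1 in the plane with the maximum norm may help. The vector x = (1, 2), the subspace spanned by (1, 1), the grid of candidate coefficients, and the helper names are all assumptions chosen for this illustration; the grid search only approximates the infimum in general, although here the minimizer lies on the grid.

```python
def sup_norm(v):
    return max(abs(v[0]), abs(v[1]))

x = (1.0, 2.0)
y_dir = (1.0, 1.0)  # the subspace Y is spanned by this vector

def residual(t):
    """x - t * y_dir, the error of the candidate approximation t * y_dir."""
    return (x[0] - t * y_dir[0], x[1] - t * y_dir[1])

# Grid search for the best approximation of x within Y.
candidates = [k / 1000 for k in range(-5000, 5001)]
best_t = min(candidates, key=lambda t: sup_norm(residual(t)))
distance = sup_norm(residual(best_t))

# The residual x - y0 should be B-J orthogonal to Y: adding any multiple of
# y_dir must not decrease its norm (checked on the sampled coefficients only).
r = residual(best_t)
orthogonal_on_samples = all(
    sup_norm((r[0] + s * y_dir[0], r[1] + s * y_dir[1])) >= distance - 1e-12
    for s in candidates
)
```

In this example the error norm equals |t − 3/2| + 1/2, so the best coefficient is t = 3/2 and the residual (−1/2, 1/2) keeps norm at least 1/2 under every shift along Y.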

In infinite-dimensional Banach spaces the sum of two closed subspaces may fail
to be closed.
For a classical concrete example, consider the Hilbert space H := ℓ²
with $(e_n)_{n \ge 0}$ as a standard orthonormal basis, let M be the closed subspace spanned
by the pairwise orthogonal vectors $\bigl(e_{2n} + \tfrac{1}{n+1} e_{2n+1}\bigr)_n$ and let N be the closed subspace
spanned by the orthonormal vectors $(e_{2n})_n$. Then M ∩ N = 0 and M + N is dense
in ℓ², since it contains $e_n$ for any n ≥ 0. However, M + N does not contain the
vector $\sum_{n \ge 0} \tfrac{1}{n+1} e_{2n+1} \in \ell^2$, because otherwise the only way to write it would be
$\sum_{n \ge 0} \bigl(e_{2n} + \tfrac{1}{n+1} e_{2n+1}\bigr) - \sum_{n \ge 0} e_{2n}$, but $\sum_{n \ge 0} e_{2n} \notin \ell^2$.
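The failure can be made quantitative with a short computation. The sketch below, with the hypothetical helper `decomposition_norms`, looks at the truncated vectors v_k = Σ_{n<k} e_{2n+1}/(n+1), which do lie in M + N, and compares their norms with the norms of their (unique) components in M and N. Only the squared coefficients with respect to the orthonormal basis are needed, so no vectors are stored.

```python
import math

def decomposition_norms(k):
    """Norms of v_k, m_k, n_k, where v_k = sum_{n<k} e_{2n+1}/(n+1) = m_k - n_k."""
    # m_k = sum_{n<k} (e_{2n} + e_{2n+1}/(n+1)) lies in M,
    # n_k = sum_{n<k} e_{2n} lies in N.
    v_sq = sum(1.0 / (n + 1) ** 2 for n in range(k))
    n_sq = float(k)
    m_sq = n_sq + v_sq  # squared coefficients add up in an orthonormal basis
    return math.sqrt(v_sq), math.sqrt(m_sq), math.sqrt(n_sq)

# Rows (k, ||v_k||, ||m_k||, ||n_k||) for a few truncation levels.
table = [(k, *decomposition_norms(k)) for k in (1, 10, 100, 10000)]
```

The norms ‖v_k‖ stay below π/√6 while ‖n_k‖ = √k grows without bound, so the map m + n ↦ n is unbounded on M + N. This is consistent with M + N not being closed and contrasts with the norm-one projection available for B-J orthogonal subspaces.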
However, it is easily seen that, in Hilbert spaces, the sum of two orthogonal,
closed subspaces is again closed. Again, this can be generalized easily to general
Banach spaces (see Anderson [8, Remark 1.3]). To avoid misunderstanding, we say
that a subset S1 of a normed space X is B-J orthogonal to a subset S2 ⊆ X (in
symbols: S1 ⊥ S2 ) if s1 ⊥ s2 for every pair of vectors (s1 , s2 ) ∈ S1 × S2 .
Proposition 4.2 Let M, N be closed subspaces in a (real or complex) Banach
space X . If M ⊥ N , then M ∩ N = 0 and M + N is closed.
Proof Firstly, if m = n ∈ M ∩ N , then 0 = ‖m − n‖ ≥ ‖m‖ implies m = n = 0, so B-J
orthogonal subspaces intersect trivially. Secondly, the projection P from M + N
onto M along N is bounded with norm at most one because ‖P (m + n)‖ = ‖m‖ ≤
‖m + n‖ (by the definition of m ⊥ n). Therefore, if a sequence (mk + nk )k in M + N
is convergent, then mk = P (mk + nk ) ∈ M is a Cauchy sequence, so it converges to some
m ∈ M (M is a closed, hence Banach, subspace of a Banach space X ). Then also
nk = (nk + mk ) − mk converges to an element in N (as N is closed), so limk (nk + mk ) ∈
M + N.

Bhatia and Šemrl [20] used B-J orthogonality to prove that the diameter of
the unitary orbit of a given complex matrix A ∈ Mn (C) equals 2d(A, CI ). The
unitary orbit of a complex square matrix A ∈ Mn (C) is the set {U AU ∗ ; U ∈
Mn (C) is unitary}, so its diameter is given by

dA = max{‖V AV ∗ − U AU ∗‖; U, V are unitary}
= max{‖A − U AU ∗‖; U is unitary}.

Here the norm is assumed to be the spectral norm.

Theorem 4.3 (Cf. [20, Theorem 1.2]) For each A ∈ Mn (C) it holds dA =
2d(A, CI ).
Proof Notice that dA = 0 if and only if A is a scalar matrix, so the statement holds
for scalar matrices.
Suppose that A is not a scalar matrix. The inequality dA ≤ 2d(A, CI ) is easy to
prove, since for every unitary matrix U and λ ∈ C we have

‖A − U AU ∗‖ = ‖(A − λI ) − U (A − λI )U ∗‖ ≤ 2‖A − λI‖.

For the converse inequality, let λ0 ∈ C be such that d(A, CI ) = ‖A + λ0 I‖. Then
A0 := A + λ0 I ≠ 0, and for all λ ∈ C it holds

‖A0 + λI‖ = ‖A + (λ + λ0 )I‖ ≥ ‖A + λ0 I‖ = ‖A0‖,

so A0 ⊥ I. By Corollary 3.3, there is a normalized vector x ∈ Cn such that
‖A0 x‖ = ‖A0‖ and ⟨A0 x, x⟩ = 0. Denote y = (1/‖A0‖) A0 x. Then x and y
are normalized vectors such that ⟨x, y⟩ = 0 and ⟨A0 x, y⟩ = ‖A0‖. It follows
from the second equality that ⟨A0 x, y⟩ = ‖A0 x‖‖y‖, so we have the case of
equality in the Bunyakovsky-Cauchy-Schwarz inequality. Therefore, it has to be
A0 x = ‖A0‖y. Now for a unitary U such that U x = x and Uy = −y we have
U A0 U ∗ x = −‖A0‖y, and then

dA = dA0 ≥ ‖A0 x − U A0 U ∗ x‖ = 2‖A0‖ = 2d(A, CI ).
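The construction in the proof can be traced on a concrete 2 × 2 matrix. The following sketch is an illustration under explicit assumptions: the matrix A (a nilpotent Jordan block), the grid used to estimate d(A, CI), and the helper names are chosen for this example, and the grid search gives an upper bound for the distance in general (for this A it is exact, since the minimum is attained at λ = 0).

```python
import math

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def adjoint(P):
    return [[complex(P[j][i]).conjugate() for j in range(2)] for i in range(2)]

def sub(P, Q):
    return [[P[i][j] - Q[i][j] for j in range(2)] for i in range(2)]

def spectral_norm(P):
    # Largest singular value of a 2x2 matrix, from ||P||_F^2 and |det P|^2.
    t = sum(abs(P[i][j]) ** 2 for i in range(2) for j in range(2))
    d = abs(P[0][0] * P[1][1] - P[0][1] * P[1][0]) ** 2
    return math.sqrt((t + math.sqrt(max(t * t - 4 * d, 0.0))) / 2)

A = [[0, 1], [0, 0]]
I2 = [[1, 0], [0, 1]]

# Grid estimate of d(A, CI) = min over complex lam of ||A - lam I||.
grid = [k * 0.05 for k in range(-40, 41)]
dist = min(
    spectral_norm(sub(A, [[complex(a, b) * I2[i][j] for j in range(2)] for i in range(2)]))
    for a in grid for b in grid
)

# With A0 = A, x = e2 and y = A x = e1, the unitary U fixes x and flips y.
U = [[-1, 0], [0, 1]]
gap = spectral_norm(sub(A, matmul(matmul(U, A), adjoint(U))))
```

For this matrix the estimate is d(A, CI) = 1 and the single unitary U already realizes ‖A − UAU∗‖ = 2, in agreement with dA = 2d(A, CI).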

This result can be used for calculating the norm of inner derivations. Recall that,
for a given A ∈ B(H), an inner derivation is the operator δA given by δA (X) =
AX − XA. In [86] Stampfli proved that the norm of δA is equal to 2d(A, CI ).
Bhatia and Šemrl gave a simpler proof of this by using Theorem 4.3. Namely, each
X ∈ Mn (C) such that ‖X‖ = 1 can be written in the form X = ½(V + W ), where
V and W are unitary matrices (this follows from singular value decomposition of X
by writing each singular value, which is a number between 0 and 1, as ½(e^{iθ} + e^{−iθ})
for some θ ∈ R). Then we have

‖δA‖ = max{‖AX − XA‖; ‖X‖ = 1}
= max{‖AU − U A‖; U is unitary} = dA = 2d(A, CI ).

This can be extended to the infinite-dimensional case by a limiting argument,
see [20, Remark 3.2]. Namely, one can consider an increasing sequence {Pn } of
finite rank projections, set An = Pn A and show that ‖δA‖ ≥ 2d(An , CI ) for all
n ∈ N. It then follows that ‖δA‖ ≥ 2d(A, CI ). For the converse inequality, it is
sufficient to note that for any X ∈ B(H) with ‖X‖ = 1 and any λ ∈ C it holds
‖AX − XA‖ = ‖(A − λI )X − X(A − λI )‖ ≤ 2‖A − λI‖.
In this respect we mention also a classical result by Anderson [8] that, for a
normal operator N, the range of δN is always B-J orthogonal to its kernel. In other
words, if NS = SN, then for every X ∈ B(H) it holds

‖NX − XN + S‖ ≥ ‖S‖.

This result has been greatly extended in various directions, see [25, 34, 56].
It was shown in [10] that B-J orthogonality provides a convenient criterion for
two elements of a normed space X to satisfy the equality in the triangle inequality.
Proposition 4.4 (Cf. [10, Proposition 4.1]) Let X be a real or complex normed
space, x, y ∈ X . Then the following conditions are equivalent:
(i) ‖x + y‖ = ‖x‖ + ‖y‖;
(ii) x ⊥ (‖y‖x − ‖x‖y);
(iii) y ⊥ (‖y‖x − ‖x‖y).
Proof It follows from the Hahn–Banach theorem that ‖x + y‖ = ‖x‖ + ‖y‖ if and
only if there exists a norm-one linear functional f : X → F such that f (x) = ‖x‖
and f (y) = ‖y‖, see [69, Theorem 2]. If x ≠ 0, then the latter condition can be
restated as f (x) = ‖x‖ and f (‖y‖x − ‖x‖y) = 0. Such a linear functional f exists
if and only if x ⊥ (‖y‖x − ‖x‖y), so the equivalence (i) ⇔ (ii) is proved. Similarly
for (i) ⇔ (iii).
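Proposition 4.4 is easy to probe numerically in the real plane with the maximum norm. In the sketch below the two pairs of vectors, the sampled scalars, and the helper names are assumptions of this example. A sampled test returning False is conclusive (a scalar violating the inequality was found), whereas True is only supporting evidence.

```python
def sup_norm(v):
    return max(abs(v[0]), abs(v[1]))

def bj_on_samples(u, w, scalars):
    """False as soon as some sampled lam gives ||u + lam*w|| < ||u||."""
    base = sup_norm(u)
    return all(
        sup_norm((u[0] + lam * w[0], u[1] + lam * w[1])) >= base - 1e-12
        for lam in scalars
    )

def triangle_equality(x, y):
    return abs(sup_norm((x[0] + y[0], x[1] + y[1])) - sup_norm(x) - sup_norm(y)) < 1e-12

def difference_vector(x, y):
    """The vector ||y|| x - ||x|| y appearing in conditions (ii) and (iii)."""
    nx, ny = sup_norm(x), sup_norm(y)
    return (ny * x[0] - nx * y[0], ny * x[1] - nx * y[1])

SCALARS = [k / 100 for k in range(-500, 501)]  # real scalars suffice in a real space

def check(x, y):
    w = difference_vector(x, y)
    return triangle_equality(x, y), bj_on_samples(x, w, SCALARS), bj_on_samples(y, w, SCALARS)

case_equal = check((1.0, 0.0), (1.0, 1.0))   # ||x + y|| = 2 = ||x|| + ||y||
case_strict = check((1.0, 0.0), (0.0, 1.0))  # ||x + y|| = 1 < 2
```

In the first case all three conditions come out true together, and in the second all three fail together, as the proposition predicts.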

This proposition, together with the obtained characterizations of B-J orthogo-
nality in different normed spaces, gives characterizations of the case of equality in
the triangle inequality, which were directly obtained earlier in [9] and [69]. Let us
formulate the version for C ∗ -algebras.
Corollary 4.5 (Cf. [69, Theorem 1], [9, Remark 2.2]) Let A be a C ∗ -algebra and
a, b ∈ A be nonzero. Then the equality ‖a + b‖ = ‖a‖ + ‖b‖ holds if and only if
there is a state φ on A such that φ(a ∗ b) = ‖a‖‖b‖.
Proof Suppose that ‖a + b‖ = ‖a‖ + ‖b‖ holds. Then, by the previous proposition,
a ⊥ (‖b‖a − ‖a‖b) and, by Proposition 3.12, there is a state φ on A such that
φ(a ∗ a) = ‖a‖² and φ(a ∗ (‖b‖a − ‖a‖b)) = 0. These two relations give φ(a ∗ b) =
‖a‖‖b‖.
Conversely, suppose there is a state φ on A such that φ(a ∗ b) = ‖a‖‖b‖. From
the Bunyakovsky-Cauchy-Schwarz inequality applied to (a, b) ↦ φ(a ∗ b) we get

‖a‖‖b‖ = |φ(a ∗ b)| ≤ φ(a ∗ a)^{1/2} φ(b∗ b)^{1/2} ≤ ‖a ∗ a‖^{1/2} ‖b∗ b‖^{1/2} = ‖a‖‖b‖,

wherefrom φ(a ∗ a) = ‖a‖² and φ(b∗ b) = ‖b‖². Thus, φ(a ∗ (‖b‖a − ‖a‖b)) =
‖b‖φ(a ∗ a) − ‖a‖φ(a ∗ b) = 0, so, by Proposition 3.12, a ⊥ (‖b‖a − ‖a‖b), and
Proposition 4.4 yields ‖a + b‖ = ‖a‖ + ‖b‖.

Among other applications of B-J orthogonality we mention also Cheng–
Mashreghi–Ross’s [28] bounds for the zeros of an analytic function on a disk,
where B-J orthogonality on ℓp with p ∈ (1, ∞) is used, see Proposition 3.10 above.

5 Preservers of B-J Orthogonality

It is an easy exercise to prove that a linear map between inner product spaces which
preserves orthogonality must be a scalar multiple of an isometry. This result can
also be generalized to B-J orthogonality. Koldobsky [57] was the first to show that
linear preservers of B-J orthogonality on real normed spaces are scalar multiples of
isometries. Later, Blanco and Turnšek [24] extended his result to complex normed
spaces. We present a simplified proof due to Wójcik [94, Theorem 2.1]. To counter
nonsmooth norms, Wójcik relied on the following standard identities:
Proposition 5.1 (Cf. [33, Theorem 18]) In the notations of Sect. 2.1, for any
nonzero x, y ∈ X it holds that

\[ \lim_{t \nearrow 0} D_-(x + ty; y) = D_-(x; y) \quad\text{and}\quad \lim_{t \searrow 0} D_+(x + ty; y) = D_+(x; y). \tag{12} \]

Sketch of the Proof For t ∈ R let φt ∈ J (x + ty) be a supporting functional such
that Re φt (y) = D+ (x + ty; y). Then | Re φt (x)| ≤ ‖Re φt‖ · ‖x‖ = ‖x‖, so
‖x + ty‖ = Re φt (x + ty) = Re φt (x) + t Re φt (y) ≤ ‖x‖ + t Re φt (y), and
therefore, with t > 0,

\[ \frac{\|x + ty\| - \|x\|}{t} \le \operatorname{Re} \varphi_t(y) = D_+(x + ty; y). \tag{13} \]

Moreover, for each fixed t > 0 and each x we have $D_+(x; y) \le \frac{\|x + ty\| - \|x\|}{t}$. By
replacing x with x + ty we get, in combination with (13),

\[ \frac{\|x + ty\| - \|x\|}{t} \le D_+(x + ty; y) \le \frac{\|x + 2ty\| - \|x + ty\|}{t}. \]

Both the left and the right sides converge to D+ (x; y) as t ↘ 0, which gives the
second estimate of (12); the first one is completely similar.
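The two-sided estimate used in this sketch of the proof can be observed numerically. The example below assumes a particular nonsmooth norm on R², namely ‖(a, b)‖ = |a| + √(a² + b²), the point x = (0, 1) and the direction y = (1, 0); these choices, the step size `h`, and the helper names are illustrative assumptions. The one-sided derivative is approximated by a forward difference quotient, which by convexity slightly overestimates D+.

```python
import math

def norm(v):
    # A nonsmooth norm on R^2 chosen for illustration: |a| + sqrt(a^2 + b^2).
    return abs(v[0]) + math.hypot(v[0], v[1])

def shift(x, t, y):
    return (x[0] + t * y[0], x[1] + t * y[1])

def d_plus(z, y, h=1e-7):
    # Forward difference quotient approximating the right derivative D+(z; y).
    return (norm(shift(z, h, y)) - norm(z)) / h

x, y = (0.0, 1.0), (1.0, 0.0)  # here D+(x; y) = 1 and D-(x; y) = -1

def sandwich(t):
    """Left side, middle term and right side of the two-sided estimate for t > 0."""
    lower = (norm(shift(x, t, y)) - norm(x)) / t
    middle = d_plus(shift(x, t, y), y)
    upper = (norm(shift(x, 2 * t, y)) - norm(shift(x, t, y))) / t
    return lower, middle, upper

rows = [(t, *sandwich(t)) for t in (0.5, 0.1, 0.01, 0.001)]
```

As t decreases, all three columns of `rows` approach D+(x; y) = 1, even though the norm is not smooth at x.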

Theorem 5.2 (Cf. [24, 57, 94]) Let T : X → Y be a (conjugate) linear map
between two normed spaces X , Y over the field F. Then the following are equiv-
alent:
(i) T preserves B-J orthogonality, i.e., x ⊥ y ⇒ T x ⊥ T y for every x, y ∈ X .
(ii) T is a scalar multiple of an isometry, i.e., there is γ ≥ 0 such that ‖T x‖ =
γ‖x‖ for every x ∈ X .
Sketch of the Proof Injectivity is straightforward: Indeed, assume T c = 0 for some
nonzero c. Notice that 0 = ‖c + (−1)c‖ < ‖c‖, so there exists a small enough ε > 0
such that if ‖x‖ < ε, then ‖(c + x) + (−1)c‖ < ‖c + x‖, i.e., c + x is not B-J
orthogonal to c. By Proposition 1.4, we can find a scalar λ = λx with
(c + x) ⊥ (λ(c + x) + c), giving
T x = T (c + x) ⊥ T (λ(c + x) + c) = σ(λ)T x, where σ : F → F is either identity or
a complex conjugation. This is possible only if λ = 0 (contradicting the fact that
c + x is not B-J orthogonal to c) or if T x = 0. Hence, if T is not injective, then T = 0.
Choose linearly independent x, y ∈ X and fix φ ∈ J (x). By Proposition 1.4 with
α := φ(y)/‖x‖, we have x ⊥ (αx − y), so also T x ⊥ (σ(α)T x − T y). By Proposition 1.4
again, there exists f ∈ J (T x) which annihilates σ(α)T x − T y. Therefore, σ(φ(y)) =
σ(α)‖x‖ = (‖x‖/‖T x‖) f (T y), so also

\[ \operatorname{Re} \varphi(y) = \frac{\|x\|}{\|Tx\|} \operatorname{Re} f(Ty) \in \frac{\|x\|}{\|Tx\|} \Bigl[ \inf_{f \in J(Tx)} \operatorname{Re} f(Ty),\ \sup_{f \in J(Tx)} \operatorname{Re} f(Ty) \Bigr] = \frac{\|x\|}{\|Tx\|} \bigl[ D_-(Tx; Ty),\ D_+(Tx; Ty) \bigr]. \]

By taking infimum and supremum, respectively, over all φ ∈ J (x), we get

\[ \frac{\|x\|}{\|Tx\|} D_-(Tx; Ty) \le D_-(x; y) \le D_+(x; y) \le \frac{\|x\|}{\|Tx\|} D_+(Tx; Ty). \]
If the norm is smooth at T x, i.e., if D− (T x; T y) = D+ (T x; T y), then the above
inequality simplifies into $D_\pm(Tx; Ty) = \frac{\|Tx\|}{\|x\|} D_\pm(x; y)$. This holds also if T x
is a nonsmooth point, namely, one can consider a two-dimensional subspace of Y
spanned by T x, T y; smooth points are dense there, and then (12) can be applied.
Let $b(x) := \frac{\|Tx\|}{\|x\|}$. Then

\[ \begin{aligned}
0 &= D_+(Tx; Ty) - \frac{\|Tx\|}{\|x\|} D_+(x; y) \\
  &= \lim_{t \searrow 0} \Bigl( \frac{\|T(x + ty)\| - \|Tx\|}{t} - \frac{\|Tx\|}{\|x\|} \cdot \frac{\|x + ty\| - \|x\|}{t} \Bigr) \\
  &= \lim_{t \searrow 0} \frac{\|T(x + ty)\| \cdot \|x\| - \|Tx\| \cdot \|x + ty\|}{t \cdot \|x\|} \\
  &= \lim_{t \searrow 0} \frac{b(x + ty) - b(x)}{t} \cdot \|x + ty\| = \|x\| \cdot \lim_{t \searrow 0} \frac{b(x + ty) - b(x)}{t}.
\end{aligned} \]

Likewise one shows that $\lim_{t \nearrow 0} \frac{b(x + ty) - b(x)}{t} = 0$. Hence, the function b is
constant.

Blanco and Turnšek considered in [24] (possibly nonlinear) bi-preservers of B-J
orthogonality on projective Banach space PX := {[x] = Fx; x ∈ X \ {0}}. Their
result is the following:
Theorem 5.3 (Cf. [24, Corollary 3.4]) Let X be an infinite-dimensional, reflexive,
smooth Banach space. If Φ : PX → PX is a bijective map such that

[x] ⊥ [y] ⇔ Φ([x]) ⊥ Φ([y]),

then there exists a linear or conjugate linear surjective isometry U : X → X such
that

Φ([x]) = [U x].

The smoothness assumption is indispensable here; for example, in Pc0 there exist
bijective bi-preservers which are not induced by a (conjugate) linear map, let alone
an isometry (see [24, Example 3.5]). It turns out, however, that the assumption about
infinite dimensionality is not required. We will show this in the last section.
We also remark that there do exist complex reflexive Banach spaces which
are conjugate-linear isometric but are not even isomorphic. The examples were
constructed by Bourgain [27] and Kalton [51].

6 Graph Induced by B-J Orthogonality

It is possible to study B-J orthogonality on a normed space (X , ‖·‖) over the field
F also with the help of a (directed) graph, Γ̂ = Γ̂(X ). Its vertex set consists of all
lines, i.e., points in a projective space PX = {[x] = Fx; x ∈ X \ {0}}, and vertices
[x], [y] form a directed edge ([x], [y]) if x ⊥ y. Notice that its edges are well
defined because of the homogeneity of B-J orthogonality. To simplify things, we
will frequently not distinguish between a nonzero vector x ∈ X and the line [x] =
Fx ⊆ X passing through it. Thus, we will often denote vertices of the di-orthograph
Γ̂(X ) simply by x instead of [x] but we do call them lines. The results of this
section are from our recent paper [14].

6.1 Property Recognition

The fundamental question which we address here is the following:

Let P be a given property on a normed space. Can we decide, using B-J orthogonality alone,
if a space has property P or not?

Below we collect, mostly without proofs, some partial answers; we refer to [14]
for proofs. The first one is merely a restatement of Corollary 2.12.
Proposition 6.1 A normed space X over F is infinite-dimensional if and only if
Γ̂(X ) contains an infinite clique.
Di-orthograph alone can compute the dimension of the underlying space.
Proposition 6.2 (Cf. [14, Lemma 2.3]) Let X be a normed space over F and
n ≥ 2. Then the following statements are equivalent.
(i) For any (n − 1)-tuple of lines (x1 , . . . , xn−1 ), one can always find xn ∈ Γ̂(X )
with xi ⊥ xn , 1 ≤ i ≤ n − 1.
(ii) dim X ≥ n.
Corollary 6.3 A normed space X has dimension n < ∞ if and only if its
di-orthograph Γ̂(X ) satisfies item (i) of Proposition 6.2 for tuples of k = 1, . . . , n − 1
lines and does not satisfy it for larger k.
Di-orthograph can detect the presence of nonsmooth points. Compare with
Theorem 2.6(ii) where smoothness was related to right uniqueness and right
additivity of B-J orthogonality.
Proposition 6.4 (Cf. [14, Lemma 2.5]) Let X be a normed space over F with 2 ≤
dim X = n < ∞. Then the norm in X is nonsmooth if and only if there exist
n − 1 lines x1 , . . . , xn−1 ∈ Γ̂(X ) and two additional distinct lines yn , y′n such that
xi ⊥ xj for 1 ≤ i < j ≤ n − 1, and xi ⊥ yn , xi ⊥ y′n for 1 ≤ i ≤ n − 1.
Di-orthograph alone can also detect if the norm is strictly convex (compare again
with Theorem 2.6 (i) or (iv)). Given a vertex z in a di-orthograph Γ̂, let Nz :=
{v ∈ Γ̂; (z, v) ∈ E(Γ̂)} denote its neighborhood; here E(Γ̂) is the set of all
directed edges of Γ̂. Note that Nz ≠ ∅ if the underlying normed space is at least
two-dimensional.
Proposition 6.5 (Cf. [14, Lemma 2.6]) A normed space X with dim X ≥ 2 over
F is strictly convex if and only if the function Γ̂(X ) → 2^{Γ̂(X )} which maps a vertex
z to its neighborhood Nz is injective.

6.2 Isomorphism Problem

How much information on the norm is encoded in the di-orthograph Γ̂(X )? Clearly,
if A : X → Y is a linear bijective isometry between two Banach spaces, then
A induces an isomorphism of di-orthographs Γ̂(X ) and Γ̂(Y). We show next the
converse of this fact. Note that this problem is closely related to characterization of
preservers of B-J orthogonality which was discussed in Sect. 5. Namely, bijective
bi-preservers of B-J orthogonality between projective Banach spaces P(X ) and P(Y)
are exactly isomorphisms between Γ̂(X ) and Γ̂(Y).
In the lemma below, a curve is a subset in C, which is the image of a path, i.e.,
the image of a continuous map r : [a, b] → C where [a, b] ⊆ R is an interval with
at least two different points. We say that a function f : C → C is bounded from
above on a subset Γ ⊆ C if sup_{z∈Γ} |f (z)| < ∞.
Lemma 6.6 (Cf. [14, Lemma 3.1]) Suppose that a nonzero ring homomorphism
σ : C → C is bounded from above on a curve Γ ⊆ C with more than one point.
Then σ is continuous, hence either identity or a complex conjugation.
Sketch of the Proof The full proof is relatively long and can be found in [14].
We first manipulate Γ by a finite number of rotations/translations/taking unions to
obtain a closed curve Γ̂ which separates the complex plane. Observe that the ring
homomorphism σ remains bounded from above on Γ̂. Then we use the fact, inspired
by a paper of Simon and Taylor [81], that Γ̂ − Γ̂ := {γ1 − γ2 ; γ1 , γ2 ∈ Γ̂} has a
nonempty interior. Since σ is clearly bounded from above also on Γ̂ − Γ̂, it is hence
bounded on an open set, and hence it must be continuous (see, e.g., [1, Corollary 5,
p. 15]).

We remark that, with the help of dimension theory for separable metric spaces
(see, e.g., a monograph by Hurewicz and Wallman [44]), an even more general result
was obtained by Shchepin [80]: If Γ1 , . . . , Γn are n ≥ 1 compact connected subsets
in Rⁿ such that (i) 0 ∈ ⋂_{i=1}^{n} Γi and (ii) there exist n linearly independent points
ai ∈ Γi , then the sum Σ_{i=1}^{n} Γi := {γ1 + · · · + γn ; γi ∈ Γi } ⊆ Rⁿ is of dimension
at least n and therefore it contains an open ball (see [44, Theorem IV.3]).
It was shown by Rätz [74] (see also Sundaresan [87, Lemma 1]) that B-J
orthogonality in real normed spaces is Thalesian, that is, if x, y are B-J orthogonal
vectors in a real normed space, then for every λ0 > 0 there exists a scalar α
such that (x + αy) ⊥ (λ0 x − αy). This fact was used by Wójcik [94, Proof of
Theorem 3.1] to give an alternative proof that linear maps which preserve B-J
orthogonality between real normed spaces are scalar multiples of isometries. The
main idea of the proof of the next lemma comes from Wójcik’s paper (see [94,
Proof of Theorem 3.1]) and uses a partial extension of Thalesian property for B-J
orthogonality in complex normed spaces: we show that if normalized vectors x, y
are mutually B-J orthogonal, then there exists a curve Γ ⊆ C with more than one
point, such that for every λ ∈ Γ we can find α ∈ C with (x + αy) ⊥ (λx − αy).
Recall that an additive map Φ : X → Y between F-vector spaces X and Y is
σ -quasilinear if Φ(λx) = σ (λ)Φ(x) holds where σ : F → F is a field homomorphism
(if σ is surjective, such maps are semilinear).
Proposition 6.7 (Cf. [14, Lemma 3.4]) Let X , Y be smooth normed complex
spaces of dimension at least two, and let σ : C → C be a field homomorphism.
If a nonzero σ -quasilinear map Φ : X → Y preserves B-J orthogonality, then σ is
identity or a complex conjugation.
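Sketch of the Proof By Proposition 2.13(i), there exist two mutually B-J orthogonal
normalized vectors x, y ∈ X such that Φ(x) ≠ 0. Identify the two-dimensional
subspace spanC {x, y} with (C2 , ‖·‖), so that x = (1, 0) and y = (0, 1). By (5), a
C-linear supporting functional at a point (1, α) ∈ C2 equals

\[ f_\alpha = \Bigl( \frac{\partial \|(1,\alpha)\|}{\partial z_1},\ \frac{\partial \|(1,\alpha)\|}{\partial z_2} \Bigr)^{*}, \qquad (z_k = x_k + i y_k), \]

and its kernel contains a row vector

\[ \Bigl( \frac{\partial \|(1,\alpha)\|}{\partial z_2},\ -\frac{\partial \|(1,\alpha)\|}{\partial z_1} \Bigr). \tag{14} \]

Therefore, $(1, \alpha) \perp \bigl( \frac{\partial \|(1,\alpha)\|}{\partial z_2},\ -\frac{\partial \|(1,\alpha)\|}{\partial z_1} \bigr)$. Since x, y are mutually B-J orthogonal,
their supporting functionals equal

\[ \Bigl( \frac{\partial \|(1,0)\|}{\partial z_1},\ \frac{\partial \|(1,0)\|}{\partial z_2} \Bigr)^{*} = \mu_x \cdot (1, 0)^{*} \quad\text{and}\quad \Bigl( \frac{\partial \|(0,1)\|}{\partial z_1},\ \frac{\partial \|(0,1)\|}{\partial z_2} \Bigr)^{*} = \mu_y \cdot (0, 1)^{*}, \]

for some nonzero μx , μy ∈ C, respectively. Since partial derivatives of a smooth
norm are continuous (see Rockafellar [77, Theorem 25.5]), the first supporting
functional equals the limit of $\bigl( \frac{\partial \|(1,\alpha)\|}{\partial z_1}, \frac{\partial \|(1,\alpha)\|}{\partial z_2} \bigr)^{*}$ as ℝ ∋ α → 0. Next, by positive
homogeneity of the norm, $\frac{\partial \|(1,\alpha)\|}{\partial z_k} = \frac{\partial \|(1/\alpha,1)\|}{\partial z_k}$ (α > 0), so the second supporting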
functional equals the limit of $\bigl( \frac{\partial \|(1,\alpha)\|}{\partial z_1}, \frac{\partial \|(1,\alpha)\|}{\partial z_2} \bigr)^{*}$ as ℝ ∋ α → ∞. Hence,

\[ w_\alpha = \frac{\partial \|(1,\alpha)\| / \partial z_2}{\partial \|(1,\alpha)\| / \partial z_1}\, \alpha x - \alpha y \]

is a well-defined vector-valued function of α around α = 0 and parallel to (14), so
(x + αy) ⊥ wα . Also, its first component, i.e.,

\[ \lambda(\alpha) := \frac{\partial \|(1,\alpha)\| / \partial z_2}{\partial \|(1,\alpha)\| / \partial z_1}\, \alpha, \]

is a continuous function which cannot vanish identically. As such, with α restricted
to a suitable closed interval I = [0, ε], its range is a curve Γ ⊆ C with more than
one point.
Thus, for every α ∈ [0, ε] we have (x + αy) ⊥ (λ(α)x − αy), and therefore
Φ(x + αy) ⊥ Φ(λ(α)x − αy). From here, the very definition of B-J orthogonality
of Φ(x) and Φ(y) gives

‖Φ(x)‖ ≤ ‖Φ(x) + σ (α)Φ(y)‖ = ‖Φ(x + αy)‖
≤ ‖Φ(x + αy) + Φ(λ(α)x − αy)‖ = |1 + σ (λ(α))| · ‖Φ(x)‖,

so |σ | is bounded from below by 1 on the curve 1 + Γ ⊆ C. Being multiplicative,
σ is then bounded from above on the reciprocal curve, 1/(1 + Γ). The result then follows
from Lemma 6.6.

We can now prove a version of Theorem 5.3, which is valid also for finite-
dimensional complex normed spaces. The proof follows almost verbatim Blanco
and Turnšek [24]. The only major difference is that at a final step of the proof
they relied on Molnár’s result, [66, Corollary 1], to show that a σ -quasilinear
bijection which preserves B-J orthogonality must be (conjugate) linear. Instead,
we replace [66, Corollary 1] with Proposition 6.7 which is valid also in finite-
dimensional complex spaces.
Theorem 6.8 (Cf. [24, Corollary 3.4]) Let (X , ‖·‖X ) and (Y, ‖·‖Y ) be smooth
Banach spaces over F. Suppose that X is reflexive and 3 ≤ dim X ≤ ∞. If
T : Γ̂(X ) → Γ̂(Y) is an isomorphism of di-orthographs, then there exists a linear
or conjugate-linear surjective map U : X → Y such that

T [x] = [U x] and ‖U x‖Y = ‖x‖X .

Proof By Proposition 6.1, dim X < ∞ if and only if dim Y < ∞. It then follows
from Corollary 6.3 that dim X = dim Y.
Define a map S : PX ∗ → PY ∗ between projectivizations of dual spaces of X
and Y by [f ] ↦ [fT [x] ]. Here, x is (any) normalized vector where f achieves
its norm (it exists because of reflexivity of X ) and fT [x] is a unique supporting
functional at some nonzero vector in a line T [x] ∈ PY. Note that S is well-defined
because if f achieves its norm on normalized vectors x, y ∈ X , then f is a unique
supporting functional for [x] and [y], giving that N[x] = N[y] = ker f (notations
copied from Proposition 6.5). Therefore, also NT [x] = NT [y] and, since Y is smooth,
the supporting functionals at T [x], T [y] ∈ PY coincide.
The last argument shows at once that S is not only well defined but also injective.
Also, the surjectivity of T implies that the range of S consists of all lines spanned
by norm-attaining functionals. It is easily seen that, given f ∈ X ∗ and z ∈ X , we
have f (z) = 0 if and only if, for every functional g ∈ S[f ] and vector y ∈ T [z],
we have g(y) = 0.
We claim that T is a morphism of projective spaces. Namely, assume to the
contrary that there exist lines [x], [y], [z] ∈ Γ̂(X ) which satisfy

[z] ⊆ [x] + [y] (15)

but their images T [x], T [y], T [z] span a three-dimensional space. Then we can
choose a norm-attaining normalized functional g : Y → F which annihilates T [x]
and T [y] but not T [z]. Clearly, the line spanned by such g belongs to the range of S,
so [g] = S[f ], and f annihilates [x] and [y] but not the line [z] contradicting (15).
Indeed, T is a bijective morphism of projective spaces. By the (nonsurjective
version of) Fundamental Theorem of Projective Geometry (see Faure [35]), there
exist a field isomorphism σ : F → F and a σ -quasilinear map V : X → Y such that
T [x] = [V x] for x ∈ X \ {0}. Clearly, V must be orthogonality preserving, so, by
Proposition 6.7, σ is either identity or a complex conjugation. By Theorem 5.2, the
(conjugate) linear orthogonality preserving map V : (X , ‖·‖X ) → (Y, ‖·‖Y ) is a
scalar multiple of an isometry (i.e., there exists a scalar μ ∈ C such that U := μV
satisfies ‖U x‖Y = ‖x‖X ).

We remark that, if X is finite-dimensional and smooth, then in the theorem
above the smoothness of Y need not be assumed in advance; it follows from
Proposition 6.4 (after Corollary 6.3 establishes that dim X = dim Y). Note also that
Theorem 6.8 is not valid for dim X = dim Y = 2 because of the Radon planes; see
also [14, Example 4.1] or [88, Theorem 4.1]. Here is a restatement of Theorem 6.8
in terms of preservers of B-J orthogonality.
Corollary 6.9 Let X , Y be normed spaces over F, with 3 ≤ dim X < ∞. Assume
also that X is smooth. If there exists a bijection Φ : PX → PY such that

[x] ⊥ [y] ⇔ Φ([x]) ⊥ Φ([y]),

then there exists a surjective (conjugate) linear isometry U : X → Y such that

Φ([x]) = [U x] and ‖U x‖Y = ‖x‖X .

298 L. Arambašić et al.

We have similar results valid in Banach spaces rather than their projectivizations.
Let us show first that, in smooth spaces, B-J orthogonality can detect linear
independence of two vectors:
Lemma 6.10 Let x, y be nonzero vectors in a smooth normed space X over F.
Then the following conditions are equivalent:
(i) Fx ≠ Fy.
(ii) There exists a vector z ∈ X such that z ⊥ x but z ⊥̸ y.
Proof (i) ⇒ (ii). Choose a normalized linear functional f which attains its norm
at some vector z ∈ X and which satisfies f(x) = 0 and f(y) ≠ 0. Then f is the
unique supporting functional at z, and (ii) follows from Proposition 1.4.
(ii) ⇒ (i). Follows from homogeneity of B-J orthogonality in any normed
space X .

Example The above lemma does not hold in nonsmooth spaces. Say, in (R³, ‖·‖∞)
we have that u = (1, 1/2, 0) and v = (1, 1/3, 0) are independent, and yet they are
B-J orthogonal to the same vectors: Nu = Nv = {0} × R², and the same vectors are
B-J orthogonal to them: v N = u N = {(a, b, c); (|c| ≥ max{|a|, |b|}) ∨ (a + b =
0 ∧ |c| ≤ |b|)}, and also they are mutually B-J orthogonal to the same set of vectors,
i.e., to {(0, b, c); |b| ≤ |c|}.
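The example can be checked on a finite sample of vectors. The sketch below is only an illustration and not part of the original text: it assumes the standard description of the supporting functionals of a nonzero x in (R³, ‖·‖∞) as convex combinations of the functionals w ↦ sign(x_i) w_i over the coordinates i with |x_i| = ‖x‖∞, and the names `bj_orthogonal`, `sample`, `out_u`, `in_u` are hypothetical helpers.

```python
from fractions import Fraction as F
from itertools import product

def sign(c):
    return (c > 0) - (c < 0)

def bj_orthogonal(x, y):
    """x ⊥ y in (R^3, sup norm), x != 0: some supporting functional of x vanishes at y."""
    r = max(abs(c) for c in x)
    # Values at y of the extreme supporting functionals of x (assumed description).
    vals = [sign(a) * b for a, b in zip(x, y) if abs(a) == r]
    return min(vals) <= 0 <= max(vals)

u = (F(1), F(1, 2), F(0))
v = (F(1), F(1, 3), F(0))

# A finite sample of nonzero integer vectors; this is a spot check, not a proof.
sample = [w for w in product(range(-2, 3), repeat=3) if any(w)]

out_u = [w for w in sample if bj_orthogonal(u, w)]  # vectors that u is orthogonal to
out_v = [w for w in sample if bj_orthogonal(v, w)]
in_u = [w for w in sample if bj_orthogonal(w, u)]   # vectors orthogonal to u
in_v = [w for w in sample if bj_orthogonal(w, v)]
```

On this sample the outgoing and incoming sets of u and v coincide, in agreement with the description given in the example.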
Corollary 6.11 Let X , Y be smooth normed spaces over F, with 3 ≤ dim X < ∞.
If there exists a bijection Φ : X → Y such that

x ⊥ y ⇐⇒ Φ(x) ⊥ Φ(y),

then there exist a surjective (conjugate) linear isometry U : X → Y and a
scalar-valued function γ : X → F such that

Φ(x) = γ(x)Ux and ‖Ux‖_Y = ‖x‖_X.

Proof Since x ⊥ x if and only if x = 0, we see that Φ(0) = 0, and hence
Φ(X \{0}) = Y\{0}. By Lemma 6.10, Φ preserves in both directions linear
dependence among two vectors, so it induces an isomorphism of di-orthographs
T : Γ̂(X) → Γ̂(Y). The rest follows from Corollary 6.9.

By applying Theorem 6.8 instead of Corollary 6.9 we could state a similar
result for reflexive, smooth, infinite-dimensional Banach spaces X , Y. Moreover,
it turns out that reflexivity and completeness assumptions can be omitted, see
[14, Theorem 3.11] for more details.
Finally, we remark that Tanaka [88, 89] has also studied preservers of B-J
orthogonality between Banach spaces without the linearity assumption. In [88,
Theorem 2.5] he managed to prove a version of Theorem 6.8 for real smooth Banach
spaces without the reflexivity assumption. Besides, in [88, Theorem 4.3] he gave
an example of strongly B-J isomorphic (see [88, Definition 3.8]) but not isometric
nonsmooth real Banach spaces of arbitrary dimension. This shows that the smoothness
assumption cannot be omitted. We note that his construction uses Radon planes and
“extends” them in a suitable way.
Birkhoff–James Orthogonality: Characterizations, Preservers, and Orthogonality Graphs 299

Acknowledgments The authors would like to express their deep gratitude to Professor Rajna
Rajić for many insightful conversations and suggestions while preparing this manuscript. They are
also grateful to the referee for suggesting additional references.
The work of the second and the fourth authors is supported by the RSF grant 21-11-00283.
The third author acknowledges the financial support from the Slovene Research Agency, ARRS
(research programs No. P1-0222, No. P1-0285, and research project No. N1-0210).

References

1. J. Aczél, J. Dhombres, Functional equations in several variables. Series: Encyclopedia of
Mathematics and its Applications, vol. 31 (Cambridge University Press, Cambridge, 1989)
2. J. Alonso, Ortogonalidad en espacios normados, Ph.D. Thesis, University of Extremadura,
Badajoz, Spain, 1984
3. J. Alonso, C. Benítez, Orthogonality in normed linear spaces: a survey. I. Main properties.
Extracta Math. 3(1), 1–15 (1988)
4. J. Alonso, C. Benítez, Area orthogonality in normed linear spaces. Arch. Math. 68 (1), 70–76
(1997)
5. J. Alonso, H. Martini, S. Wu, On Birkhoff orthogonality and isosceles orthogonality in normed
linear spaces. Aequationes Math. 83(1–2), 153–189 (2012)
6. C. Alsina, J. Sikorska, T.M. Santos, Norm Derivatives and Characterizations of Inner Product
Spaces (World Scientific Publishing, Hackensack, 2010)
7. D. Amir, Characterizations of inner product spaces, in Operator Theory: Advances and
Applications, vol. 20 (Birkhäuser Verlag, Basel, 1986)
8. J. Anderson, On normal derivations. Proc. Am. Math. Soc. 38, 135–140 (1973)
9. L. Arambašić, R. Rajić, On some norm equalities in pre-Hilbert C*-modules. Linear Algebra
Appl. 414(1), 19–28 (2006)
10. L. Arambašić, R. Rajić, The Birkhoff–James orthogonality in Hilbert C*-modules. Linear
Algebra Appl. 437(7), 1913–1929 (2012)
11. L. Arambašić, R. Rajić, A strong version of the Birkhoff-James orthogonality in Hilbert
C*-modules. Ann. Funct. Anal. 5(1), 109–120 (2014)
12. L. Arambašić, R. Rajić, On three concepts of orthogonality in Hilbert C*-modules. Linear
Multilinear Algebra 63(7), 1485–1500 (2015)
13. L. Arambašić, A. Guterman, R. Rajić, B. Kuzma, S. Zhilina, Symmetrized Birkhoff–James
orthogonality in arbitrary normed spaces. J. Math. Anal. Appl. 502(1), 125203 (2021)
14. L. Arambašić, A. Guterman, R. Rajić, B. Kuzma, S. Zhilina, What does Birkhoff–James
orthogonality know about the norm? (2021). Publ. Math. Debr., To appear, arXiv:2109.09414
15. J. Arazy, On the geometry of the unit ball of unitary matrix spaces. Integr. Equ. Oper. Theory
4, 151–171 (1981)
16. H. Auerbach, Über eine Eigenschaft der Eilinien mit Mittelpunkt. Ann. Soc. Polon. Math. 9,
204 (1930)
17. H. Auerbach, On the area of convex curves with conjugate diameters, Ph.D. Thesis, L’vov
University, L’vov, Ukraine, 1930, in Polish
18. B. Beauzamy, Introduction to Banach Spaces and Their Geometry. North-Holland Mathemat-
ics Studies, vol. 68 (North-Holland, Amsterdam, 1982)
19. C. Benítez, M. Fernández, M.L. Soriano, Orthogonality of matrices. Linear Algebra Appl. 422,
155–163 (2007)
20. R. Bhatia, P. Šemrl, Orthogonality of matrices and some distance problems. Linear Algebra
Appl. 287, 77–86 (1999)
21. T. Bhattacharyya, P. Grover, Characterization of Birkhoff–James orthogonality. J. Math. Anal.
Appl. 407(2), 350–358 (2013)
22. G. Birkhoff, Orthogonality in linear metric spaces. Duke Math. J. 1, 169–172 (1935)

23. B. Blackadar, Operator Algebras. Theory of C*-Algebras and von Neumann Algebras.
Encyclopaedia of Mathematical Sciences, vol. 122 (Springer, Berlin and Heidelberg, 2006)
24. A. Blanco, A. Turnšek, On maps that preserve orthogonality in normed spaces. Proc. Roy. Soc.
Edinburgh Sect. A 136(4), 709–716 (2006)
25. A. Blanco, A. Turnšek, On the converse of Anderson’s theorem. Linear Algebra Appl. 424(2–
3), 384–389 (2007)
26. F. Bohnenblust, A characterization of complex Hilbert spaces. Portugal Math. 3, 103–109
(1942)
27. J. Bourgain, Real isomorphic complex Banach spaces need not be complex isomorphic. Proc.
Am. Math. Soc. 96(2), 221–226 (1986)
28. R. Cheng, J. Mashreghi, W.T. Ross, Birkhoff–James orthogonality and the zeros of an analytic
function. Comput. Methods Funct. Theory 17, 499–523 (2017)
29. W.J. Davis, Separable Banach spaces with only trivial isometries. Rev. Roumaine Math. Pures
Appl. 16, 1051–1054 (1971)
30. M.M. Day, Polygons circumscribed about convex closed curves. Trans. Am. Math. Soc. 62,
315–319 (1947)
31. M.M. Day, Some characterizations of inner-product spaces. Trans. Am. Math. Soc. 62, 320–
337 (1947)
32. R. Deville, G. Godefroy, V. Zizler, Smoothness and Renormings in Banach Spaces. Pitman
Monographs and Surveys in Pure and Applied Mathematics, vol. 64 (Longman Scientific &
Technical, Harlow, 1993)
33. S.S. Dragomir, S. Boriotti, D. Dennis, Semi-inner Products and Applications (Nova Science
Publishers, Hauppauge, 2004)
34. H.-K. Du, Another generalization of Anderson’s theorem. Proc. Am. Math. Soc. 123(9), 2709–
2714 (1995)
35. C.-A. Faure, An elementary proof of the fundamental theorem of projective geometry. Geom.
Dedicata. 90, 145–151 (2002)
36. F.A. Ficken, Note on the existence of scalar products in normed linear spaces. Ann. of Math.
45(2), 362–366 (1944)
37. G. P. Gehér, An elementary proof for the non-bijective version of Wigner’s theorem. Phys. Lett.
A 378(30–31), 2054–2057 (2014)
38. Y. Gordon, R. Loewy, Uniqueness of () spaces. Math. Ann. 241(2), 159–180 (1979)
39. P. Grover, Orthogonality of matrices in the Ky Fan k-norms. Linear Multilinear Algebra 65(3),
496–509 (2017)
40. P. Grover, S. Singla, Birkhoff–James orthogonality and applications: a survey, in Operator
Theory, Functional Analysis and Applications. Operator Theory: Advances and Applications,
vol. 282 (Springer International Publishing, New York, 2021), pp. 293–315
41. P. Hájek, J. Talponen, Smooth approximations of norms in separable Banach spaces. Q. J.
Math. 65, 957–969 (2014)
42. P.R. Halmos, A Hilbert Space Problem Book (Springer, Berlin, 1982)
43. R.A. Horn, C.R. Johnson, Topics in Matrix Analysis (Cambridge University Press, Cambridge,
1991)
44. W. Hurewicz, H. Wallman, Dimension Theory (Princeton University Press, Princeton, 1941)
45. R.C. James, Orthogonality in normed linear spaces. Duke Math. J. 12, 291–302 (1945)
46. R.C. James, Inner product in normed linear spaces. Bull. Am. Math. Soc. 53, 559–566 (1947)
47. R.C. James, Orthogonality and linear functionals in normed linear spaces. Trans. Am. Math.
Soc. 61, 265–292 (1947)
48. K. Jarosz, Any Banach space has an equivalent norm with trivial isometries. Isr. J. Math. 64,
49–56 (1988)
49. P. Jordan, J. von Neumann, On inner products in linear metric spaces. Ann. Math. 36(2), 719–
723 (1935)
50. S. Kakutani, Some characterizations of Euclidean space. Jap. J. Math. 16, 93–97 (1939)
51. N.J. Kalton, An elementary example of a Banach space not isomorphic to its complex
conjugate. Canad. Math. Bull. 38(2), 218–222 (1995)

52. I. Kaplansky, Modules over operator algebras. Am. J. Math. 75, 839–858 (1953)
53. T. Kato, Perturbation Theory for Linear Operators (Springer, Berlin, 1980)
54. D.J. Kečkić, Orthogonality in S1 and S∞ spaces and normal derivations. J. Oper. Theory
51(1), 89–104 (2004)
55. D.J. Kečkić, Orthogonality and smooth points in C(K) and C_b(Ω). Eur. Math. J. 3(4), 44–52
(2012)
56. F. Kittaneh, Normal derivations in norm ideals. Proc. Am. Math. Soc. 123(6), 1779–1785
(1995)
57. A. Koldobsky, Operators preserving orthogonality are isometries. Proc. Roy. Soc. Edinburgh
Sect. A 123, 835–837 (1993)
58. A.S. Lewis, The convex analysis of unitarily invariant matrix functions. J. Convex Anal. 2(1–
2), 173–183 (1995)
59. A.S. Lewis, Derivatives of spectral functions. Math. Oper. Res. 21(3), 576–588 (1996)
60. C.K. Li, H. Schneider, Orthogonality of matrices. Linear Algebra Appl. 347, 115–122 (2002)
61. P.K. Lin, A remark on the Singer-orthogonality in normed linear spaces. Math. Nachr. 160,
325–328 (1993)
62. E.R. Lorch, On certain implications which characterize Hilbert spaces. Ann. Math. 49(3), 523–
532 (1948)
63. B. Magajna, On the distance to finite-dimensional subspaces in operator algebras. J. London
Math. Soc. 47(2), 516–532 (1993)
64. V.M. Manuilov, E.V. Troïtsky, Hilbert C*-Modules. Translations of Mathematical
Monographs, vol. 22 (American Mathematical Society, Providence, 2005)
65. H. Mizuguchi, The differences between Birkhoff and isosceles orthogonalities in Radon planes.
Extracta Math. 32(2), 173–208 (2017)
66. L. Molnár, Orthogonality preserving transformations on indefinite inner product spaces:
generalization of Uhlhorn’s version of Wigner’s theorem. J. Funct. Anal. 194(2), 248–262
(2002)
67. M.S. Moslehian, A. Zamani, Characterizations of operator Birkhoff-James orthogonality.
Canad. Math. Bull. 60(4), 816–829 (2017)
68. G.J. Murphy, C*-Algebras and Operator Theory (Academic Press, San Diego, 1990)
69. R. Nakamoto, S.-E. Takahasi, Norm equality condition in triangular inequality. Sci. Math. Jpn.
55(3), 463–466 (2002)
70. J.A. Oman, Characterizations of inner product spaces, Ph.D. Thesis. Michigan State University,
Michigan, MI, 1969
71. W.L. Paschke, Inner product modules over B*-algebras. Trans. Am. Math. Soc. 182, 443–468
(1973)
72. K. Paul, D. Sain, Birkhoff-James orthogonality and its application in the study of geometry of
Banach space, in Advanced Topics in Mathematical Analysis (CRC Press, Boca Raton, 2019),
pp. 245–284
73. R.R. Phelps, Convex Functions, Monotone Operators and Differentiability (Springer, Berlin,
1989)
74. J. Rätz, On orthogonally additive mapping. Aequationes Math. 28, 35–49 (1985)
75. F. Rellich, Perturbation Theory of Eigenvalue Problems. Lecture Notes (New York University,
New York, 1953)
76. B.D. Roberts, On the geometry of abstract vector spaces. Tôhoku Math. J. 39, 42–59 (1934)
77. R.T. Rockafellar, Convex Analysis (Princeton University Press, Princeton, 1970)
78. S. Roy, S. Bagchi, D. Sain, Birkhoff–James orthogonality in complex Banach spaces and
Bhatia–Šemrl Theorem revisited (2021). arXiv:2109.12775
79. R. Schatten, Norm Ideals of Completely Continuous Operators (Springer, Berlin, 1960)
80. E.V. Shchepin, The dimension of a sum of curves. Uspekhi Mat. Nauk 30(4)(184), 267–268
(1975) (in Russian)
81. K. Simon, K. Taylor, Interior of sums of planar sets and curves. Math. Proc. Cambridge Philos.
Soc. 168(1), 119–148 (2020)

82. I. Singer, Sur le L-problème de la théorie des moments dans les espaces de Banach. Acad.
Repub. Popul. Romîne. Bul. Şti. Secţ. Şti. Mat. Fiz. 9, 19–28 (1957)
83. I. Singer, Angles abstraits et fonctions trigonométriques dans les espaces de Banach. Acad.
Repub. Popul. Romîne. Bul. Şti. Secţ. Şti. Mat. Fiz. 9, 29–42 (1957)
84. I. Singer, Best Approximation in Normed Linear Spaces by Elements of Linear Subspaces
(Springer, New York, 1970)
85. S. Singla, Gateaux derivative of C* norm. Linear Algebra Appl. 629, 208–218 (2021)
86. J.G. Stampfli, The norm of a derivation. Pac. J. Math. 33, 737–747 (1970)
87. K. Sundaresan, Orthogonality and nonlinear functionals on Banach spaces. Proc. Am. Math.
Soc. 34, 187–190 (1972)
88. R. Tanaka, On Birkhoff–James orthogonality preservers between real non-isometric Banach
spaces (2021). arXiv:2108.00655
89. R. Tanaka, Nonlinear equivalence of Banach spaces based on Birkhoff–James orthogonality. J.
Math. Anal. Appl. 505(1), 125444 (2022)
90. A.E. Taylor, A geometric theorem and its application to biorthogonal systems. Bull. Am. Math.
Soc. 53, 614–616 (1947)
91. A. Turnšek, On operators preserving James’ orthogonality. Linear Algebra Appl. 407, 189–195
(2005)
92. E.P. Wigner, Gruppentheorie und ihre Anwendung auf die Quantenmechanik der
Atomspektren (Friedrich Vieweg und Sohn, Braunschweig, 1931)
93. P. Wójcik, The Birkhoff orthogonality in pre-Hilbert C*-modules. Oper. Matrices 10(3), 713–
729 (2016)
94. P. Wójcik, Mappings preserving B-orthogonality. Indag. Math. 30(1), 197–200 (2019)
95. K. Ziętak, On the characterization of the extremal points of the unit sphere of matrices. Linear
Algebra Appl. 106, 57–75 (1988)
Approximate Birkhoff-James
Orthogonality in Normed Linear Spaces
and Related Topics

Jacek Chmieliński

Abstract The classical Birkhoff-James orthogonality (BJ-orthogonality) in a real
normed linear space is one of many possible, but arguably the most adequate,
generalizations of the usual orthogonality relation in an inner product space. In
this work, however, we deal not so much with the exact BJ-orthogonality as
with its approximate version. In the first section of this chapter we introduce basic
definitions connected with the notion of approximate BJ-orthogonality. Then we
present a package of equivalent statements, defining the introduced concept in
various ways. Some of these characterizations are known, while others are new. The
second part of the paper is a survey of selected results depicting the areas where
the approximate BJ-orthogonality can be applied or where it stimulates further
studies.

Keywords Birkhoff-James orthogonality · Approximate Birkhoff-James
orthogonality · Normed spaces · Inner product spaces · Linear operators ·
Orthogonality preserving mappings · Orthogonality of operators

1 Introduction and Preliminaries

This work is concentrated around the notions of orthogonality and approximate
orthogonality. Whereas in inner product spaces both can be defined naturally
by comparing the value of the inner product with zero, transferring these
concepts to normed spaces is a challenge. Although this can be done in many
ways, none of them is fully satisfactory.

J. Chmieliński
Department of Mathematics, Pedagogical University of Krakow, Kraków, Poland

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
R. M. Aron et al. (eds.), Operator and Norm Inequalities and Related Topics,
Trends in Mathematics, https://doi.org/10.1007/978-3-031-02104-6_9

1.1 Orthogonality in Inner Product Spaces

In an inner product space (H, ⟨·|·⟩) with the standard orthogonality relation
x ⊥ y ⇔ ⟨x|y⟩ = 0, we have an equally natural notion of approximate orthogonality or,
more specifically, ε-orthogonality (with a given ε ∈ [0, 1)):

\[ x \perp^{\varepsilon} y \iff |\langle x|y\rangle| \le \varepsilon \|x\|\,\|y\|, \qquad x, y \in H. \tag{1} \]

Of course, 0-orthogonality is just orthogonality. Equivalently, one can write (1) in
the form

\[ x \perp^{\varepsilon} y \iff \frac{|\langle x|y\rangle|}{\|x\|\,\|y\|} \le \varepsilon, \qquad x, y \in H \setminus \{0\}, \]

where the quotient ⟨x|y⟩/(‖x‖ ‖y‖) can be interpreted as the cosine of the angle
between the vectors x and y.
It is easy to see that approximate orthogonality means the same as exact
orthogonality to some nearby vector; more precisely:

\[ x \perp^{\varepsilon} y \iff \exists\, z \in \operatorname{Lin}\{x, y\} : x \perp z \ \text{ and } \ \|z - y\| \le \varepsilon \|y\|. \tag{2} \]

Indeed, if x ⊥^ε y, then the vector z = −(⟨x|y⟩/‖x‖²) x + y (or z = y if x = 0) does
the job. Conversely, assuming x ⊥ z and ‖z − y‖ ≤ ε‖y‖, we have

\[ |\langle x|y\rangle| = |\langle x|y - z\rangle| \le \|x\|\,\|y - z\| \le \varepsilon \|x\|\,\|y\|, \]

i.e., x ⊥^ε y. Since the relations ⊥ and ⊥^ε are symmetric here, the roles of x, y and z
in the above statement can be interchanged.
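As a small numerical illustration of (1) and (2), the following sketch works in the Euclidean plane. It is not part of the original text; the helper names `is_eps_orthogonal` and `nearby_orthogonal_vector` are hypothetical, and floating-point arithmetic is assumed to be accurate enough for the chosen integer data.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def is_eps_orthogonal(x, y, eps):
    """Condition (1): |<x|y>| <= eps * ||x|| * ||y||."""
    return abs(dot(x, y)) <= eps * norm(x) * norm(y)

def nearby_orthogonal_vector(x, y):
    """The vector z used for (2): z = y - (<x|y>/||x||^2) x, or z = y when x = 0."""
    xx = dot(x, x)
    if xx == 0:
        return list(y)
    c = dot(x, y) / xx
    return [b - c * a for a, b in zip(x, y)]

x, y = [2.0, 0.0], [3.0, 4.0]                 # cosine of the angle: 6 / (2 * 5) = 0.6
z = nearby_orthogonal_vector(x, y)             # z = (0, 4), so <x|z> = 0
dist = norm([a - b for a, b in zip(z, y)])     # ||z - y|| = 3 = 0.6 * ||y||
```

Here x is ε-orthogonal to y exactly when ε ≥ 0.6, and the distance from y to the exactly orthogonal vector z equals 0.6‖y‖, as (2) predicts.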

1.2 Orthogonalities in Normed Spaces

Although inner product spaces are the most natural venue for orthogonality,
analogous relations may be considered also in normed linear spaces (cf. e.g. the
survey [1]). Among the most natural, with clear geometrical background, are the
isosceles orthogonality:

\[ x \perp_i y \iff \|x + y\| = \|x - y\| \]

or the Pythagorean orthogonality:

\[ x \perp_P y \iff \|x - y\|^2 = \|x\|^2 + \|y\|^2. \]


Many other, not considered here, orthogonality relations are defined, including
axiomatic definitions in linear spaces or other structures (cf. [24, 26, 44]).
Birkhoff-James Orthogonality One of the most important orthogonality relations
in a normed space is the Birkhoff-James orthogonality (sometimes called the
Birkhoff orthogonality). This concept is well known, extensively studied (cf.
[1, 2, 6, 29–31]) and crucial for the present paper.
Let X be a real normed linear space and x, y ∈ X . The Birkhoff-James
orthogonality (BJ-orthogonality) of x and y (in the given order) is defined as
follows:

\[ x \perp_B y \iff \forall\, \lambda \in \mathbb{R} : \|x + \lambda y\| \ge \|x\|. \]

For a fixed x ∈ X \ {0} we consider the (always nonempty) set of its supporting
functionals:

\[ J(x) = \{\varphi \in X^* : \|\varphi\| = 1,\ \varphi(x) = \|x\|\}, \]

where X ∗ denotes the dual space. As a consequence of the Hahn-Banach theorem, the
following characterization can be given [30, Corollary 2.2]:

\[ x \perp_B y \iff \exists\, \varphi \in J(x) : \varphi(y) = 0. \tag{3} \]
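The characterization (3) can be tried out in the plane with the maximum norm. The sketch below is an illustration under stated assumptions, not the author's method: it uses the standard fact that the supporting functionals of a nonzero x in (R^n, ‖·‖∞) are the convex combinations of w ↦ sign(x_i) w_i over the coordinates with |x_i| = ‖x‖∞, and it compares the result with a brute-force test of the definition on a finite grid of λ. All function names are hypothetical.

```python
from fractions import Fraction as F

def norm_inf(v):
    return max(abs(c) for c in v)

def sign(c):
    return (c > 0) - (c < 0)

def functional_range(x, y):
    """Range [m, M] of phi(y) over the supporting functionals phi of x != 0."""
    r = norm_inf(x)
    vals = [sign(a) * b for a, b in zip(x, y) if abs(a) == r]
    return min(vals), max(vals)

def bj_orthogonal(x, y):
    """Characterization (3): some supporting functional of x vanishes at y."""
    m, M = functional_range(x, y)
    return m <= 0 <= M

def bj_on_grid(x, y, steps=300, scale=100):
    """The definition, tested only for lambda = k/scale with |k| <= steps."""
    r = norm_inf(x)
    return all(
        norm_inf([a + F(k, scale) * b for a, b in zip(x, y)]) >= r
        for k in range(-steps, steps + 1)
    )

x, y = [F(1), F(1)], [F(1), F(0)]
```

For these vectors `bj_orthogonal(x, y)` holds while `bj_orthogonal(y, x)` fails, which illustrates that BJ-orthogonality is not symmetric in general. Exact rational arithmetic is used so that boundary cases are not affected by rounding.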

Norm Derivatives and Semi-inner Product To introduce yet another type of
orthogonality, let us recall the notion of norm derivatives in a real normed linear
space X (cf. [3] for the background and properties of these functionals):

\[ \rho_{\pm}(x, y) = \lim_{t \to 0^{\pm}} \frac{\|x + ty\|^2 - \|x\|^2}{2t} = \|x\| \lim_{t \to 0^{\pm}} \frac{\|x + ty\| - \|x\|}{t}, \qquad x, y \in X. \]

We define the related orthogonality relations

\[ x \perp_{\rho_{\pm}} y \iff \rho_{\pm}(x, y) = 0 \]

and then the corresponding approximate orthogonalities (cf. [17, 18])

\[ x \perp^{\varepsilon}_{\rho_{\pm}} y \iff |\rho_{\pm}(x, y)| \le \varepsilon \|x\|\,\|y\|. \]

Since not every norm is generated by an inner product, the following notion
is often helpful. By [25, 35] (see also the monograph [23]), each
norm in a linear space X (over K ∈ {R, C}) admits a semi-inner product generating
this norm, that is, a functional [·|·] : X × X → K satisfying the conditions:
• [λx + μy|z] = λ[x|z] + μ[y|z], x, y, z ∈ X , λ, μ ∈ K;
• [x|λy] = λ̄[x|y], x, y ∈ X , λ ∈ K;
• |[x|y]| ≤ ‖x‖ ‖y‖, x, y ∈ X ;
• [x|x] = ‖x‖², x ∈ X .
Generally, unless the norm is smooth, there could be many different semi-inner
products related to a given norm in X . If the norm in X comes from an inner product,
then the inner product itself is the unique semi-inner product. For a fixed semi-inner
product [·|·] and x, y ∈ X , we define the semi-orthogonality

\[ x \perp_s y \iff [y|x] = 0 \]

and the ε-semi-orthogonality (approximate semi-orthogonality)

\[ x \perp_s^{\varepsilon} y \iff |[y|x]| \le \varepsilon \|x\|\,\|y\|. \]

It is known (cf. [23]) that for a real normed space X the values of ρ+ and ρ− can
be obtained as the supremum or infimum, respectively, over the values taken by all
semi-inner products in X on a given pair of vectors. Namely,

ρ+ (x, y) = sup{[y|x] : [·|·] is a semi-inner product on X } (4)

and

ρ− (x, y) = inf{[y|x] : [·|·] is a semi-inner product on X }. (5)

There is an apparent connection of the BJ-orthogonality and semi-orthogonality

(cf. [23]): for any x, y ∈ X with x⊥B y there exists a semi-inner product [·|·] such
that x⊥s y (i.e., [y|x] = 0).

2 Approximate Birkhoff-James Orthogonality in Normed Linear Spaces

The objective of this part of the paper is to define and develop the concept
of an approximate Birkhoff-James orthogonality. Various approaches and several
characterizations will be discussed. While many of the facts presented here are
already known, there is some novelty in this section, both in the results and in the
presentation and proofs. For the convenience of the reader, we have tried to make this
part as detailed and self-contained as possible; only in a few places do we refer to
outside results. From now on, let X be a real normed space with dim X ≥ 2.

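There are at least two notions of ε-BJ-orthogonality (with a given ε ∈ [0, 1)). The
first one was given by Dragomir [22]:

\[ x \underset{\varepsilon}{\perp}{}_{B}\, y \iff \forall\, \lambda \in \mathbb{R} : \|x + \lambda y\| \ge (1 - \varepsilon)\|x\| \]

and the other one by Chmieliński [10]:

\[ x \perp_B^{\varepsilon} y \iff \forall\, \lambda \in \mathbb{R} : \|x + \lambda y\|^2 \ge \|x\|^2 - 2\varepsilon \|x\|\,\|\lambda y\|. \tag{6} \]

One can check that if the norm comes from an inner product, then (6) coincides
exactly with (1), whereas $\underset{\varepsilon}{\perp}{}_{B}$ gives $\perp^{\eta}$ with
$\eta = \sqrt{1 - (1 - \varepsilon)^2}$. For this reason it is sometimes more convenient
to use a modification of Dragomir's definition, replacing 1 − ε by $\sqrt{1 - \varepsilon^2}$:

\[ x \perp_D^{\varepsilon} y \iff \forall\, \lambda \in \mathbb{R} : \|x + \lambda y\| \ge \sqrt{1 - \varepsilon^2}\,\|x\|. \]

Some relationships between $\perp_B^{\varepsilon}$ and $\underset{\varepsilon}{\perp}{}_{B}$ were
established in [10, 19, 38]. In particular, if X is a real normed space and
ε ∈ [0, 1/2), then (cf. [19])

\[ x \perp_B^{\varepsilon} y \implies x \underset{2\varepsilon}{\perp}{}_{B}\, y. \]

If X is a real uniformly smooth normed space and δ_{X∗} denotes the modulus of
convexity for the dual space X ∗, then for ε ∈ [0, 2δ_{X∗}(1)) and for any x, y ∈ X we
have (cf. [38]):

\[ x \underset{\varepsilon}{\perp}{}_{B}\, y \implies x \perp_B^{\eta} y \qquad \text{with } \eta = \delta_{X^*}^{-1}\!\left(\frac{\varepsilon}{2}\right). \]

In the sequel we will consider mainly the approximate BJ-orthogonality ⊥_B^ε,
defined by (6).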
It is known [18, Theorem 3.2] that if X is a real normed space, then for any semi-
inner product on X and ε ∈ [0, 1), we have ⊥_s^ε ⊂ ⊥_B^ε (in particular ⊥_s ⊂ ⊥_B), and in
a smooth space both relations coincide, that is, ⊥_s^ε = ⊥_B^ε (cf. also [50]).
In our considerations we will also use the following characterization of the ε-
BJ-orthogonality (actually, it is a special case of a more general result given in [18,
Theorem 3.1]):
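Lemma 2.1 Let X be a real normed linear space and let ε ∈ [0, 1). Then, for
arbitrary x, y ∈ X we have

\[ x \perp_B^{\varepsilon} y \iff \rho_-(x, y) - \varepsilon \|x\|\,\|y\| \le 0 \le \rho_+(x, y) + \varepsilon \|x\|\,\|y\|. \tag{7} \]

Proof Assume x ⊥_B^ε y. Then, from (6), we have

\[ -\varepsilon \|x\|\,\|y\| \le \frac{\|x + \lambda y\|^2 - \|x\|^2}{2\lambda} \quad \text{for } \lambda > 0 \]

and

\[ \frac{\|x + \lambda y\|^2 - \|x\|^2}{2\lambda} \le \varepsilon \|x\|\,\|y\| \quad \text{for } \lambda < 0. \]

Letting λ → 0±, we get −ε‖x‖ ‖y‖ ≤ ρ+(x, y) and ρ−(x, y) ≤ ε‖x‖ ‖y‖,
respectively, as required.
Now, let us prove the converse. Since it is obvious for x = 0 or y = 0, we assume
that x, y ∈ X \ {0}. The first inequality in (7) can be written as

\[ \lim_{\lambda \to 0^-} \frac{\|x + \lambda y\|^2 - \|x\|^2}{\lambda} \le 2\varepsilon \|x\|\,\|y\| \]

and with an arbitrarily chosen γ ∈ (0, 1) we have

\[ \lim_{\lambda \to 0^-} \frac{\|x + \lambda y\|^2 - \|x\|^2}{\lambda} < 2(\varepsilon + \gamma) \|x\|\,\|y\|. \]

It follows that there exists δ1 < 0 such that

\[ \forall\, \lambda \in [\delta_1, 0) : \frac{\|x + \lambda y\|^2 - \|x\|^2}{\lambda} < 2(\varepsilon + \gamma) \|x\|\,\|y\|, \]

whence

\[ \forall\, \lambda \in [\delta_1, 0) : \|x\|^2 < \|x + \lambda y\|^2 + 2(\varepsilon + \gamma) \|x\|\,\|\lambda y\|. \tag{8} \]

Analogously, from the second inequality in (7) we get (for the same γ as above and
for some δ2 > 0)

\[ \forall\, \lambda \in (0, \delta_2] : \|x\|^2 < \|x + \lambda y\|^2 + 2(\varepsilon + \gamma) \|x\|\,\|\lambda y\|. \tag{9} \]

Define ϕ : R → R by ϕ(λ) := ‖x + λy‖² + 2(ε + γ)‖x‖ ‖λy‖. It can be
easily shown that this mapping is convex. Inequalities (8) and (9) yield ϕ(0) =
min{ϕ(λ) : λ ∈ [δ1 , δ2 ]} and convexity of ϕ gives ϕ(0) = min{ϕ(λ) : λ ∈ R}. Thus

\[ \|x\|^2 < \|x + \lambda y\|^2 + 2(\varepsilon + \gamma) \|x\|\,\|\lambda y\|, \qquad \lambda \in \mathbb{R} \setminus \{0\}. \tag{10} \]

Since γ was arbitrarily chosen from the interval (0, 1), letting γ → 0+ in (10) we
obtain

\[ \|x\|^2 \le \|x + \lambda y\|^2 + 2\varepsilon \|x\|\,\|\lambda y\|, \qquad \lambda \in \mathbb{R} \setminus \{0\}. \]

Obviously, the above inequality holds true also for λ = 0, thus finally we get x ⊥_B^ε y.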
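Criterion (7) can be made concrete in a space where the norm derivatives are explicit. The following sketch is an illustration only: it assumes the ℓ1 norm on R^n, for which the one-sided derivative of t ↦ |x_i + t y_i| at 0 equals sign(x_i) y_i when x_i ≠ 0 and ±|y_i| when x_i = 0, and it compares (7) with definition (6) tested on a finite grid of λ. The function names are hypothetical.

```python
from fractions import Fraction as F

def norm1(v):
    return sum(abs(c) for c in v)

def sign(c):
    return (c > 0) - (c < 0)

def rho_pm(x, y):
    """Return (rho_minus, rho_plus) for the l_1 norm, using the coordinatewise formula."""
    smooth = sum(sign(a) * b for a, b in zip(x, y) if a != 0)
    kink = sum(abs(b) for a, b in zip(x, y) if a == 0)
    r = norm1(x)
    return r * (smooth - kink), r * (smooth + kink)

def approx_bj_lemma(x, y, eps):
    """Criterion (7): rho_minus - eps*||x||*||y|| <= 0 <= rho_plus + eps*||x||*||y||."""
    lo, hi = rho_pm(x, y)
    bound = eps * norm1(x) * norm1(y)
    return lo - bound <= 0 <= hi + bound

def approx_bj_grid(x, y, eps, steps=1000, scale=100):
    """Definition (6), tested only for lambda = k/scale with |k| <= steps."""
    r = norm1(x)
    for k in range(-steps, steps + 1):
        lam = F(k, scale)
        lhs = norm1([a + lam * b for a, b in zip(x, y)]) ** 2
        if lhs < r * r - 2 * eps * r * norm1([lam * b for b in y]):
            return False
    return True

x, y = [F(2), F(1)], [F(1), F(-1, 2)]   # rho_minus = rho_plus = 3/2, ||x|| * ||y|| = 9/2
```

For this pair both tests agree: x is ε-BJ-orthogonal to y for ε = 1/3 and not for ε = 1/4, the threshold being ρ(x, y)/(‖x‖ ‖y‖) = 1/3.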

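Now we formulate a useful auxiliary result (originally stated in [15]).
Lemma 2.2 Let X be a real normed space and let x, y ∈ X be such that x ⊥_B^ε y
(with some ε ∈ [0, 1)). Then, for each n ∈ N there exists a semi-inner product [·|·]_n
in X such that

\[ |[y|x]_n| \le \left(\varepsilon + \frac{1}{n}\right) \|x\|\,\|y\|. \tag{11} \]

Proof Applying (7) and (4)–(5), for each positive integer n we may choose semi-inner
products [·|·]′_n and [·|·]″_n such that

\[ [y|x]'_n < \left(\varepsilon + \frac{1}{n}\right) \|x\|\,\|y\| \quad \text{and} \quad -\left(\varepsilon + \frac{1}{n}\right) \|x\|\,\|y\| < [y|x]''_n. \]

It follows that for some λ_n ∈ [0, 1] we have

\[ -\left(\varepsilon + \frac{1}{n}\right) \|x\|\,\|y\| \le \lambda_n [y|x]'_n + (1 - \lambda_n) [y|x]''_n \le \left(\varepsilon + \frac{1}{n}\right) \|x\|\,\|y\|. \]

Now, we consider the semi-inner product [·|·]_n := λ_n [·|·]′_n + (1 − λ_n) [·|·]″_n and
from the above inequalities it follows that

\[ -\left(\varepsilon + \frac{1}{n}\right) \|x\|\,\|y\| \le [y|x]_n \le \left(\varepsilon + \frac{1}{n}\right) \|x\|\,\|y\|. \]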

Characterizations of the Approximate Orthogonality The notion of approximate
BJ-orthogonality ⊥_B^ε, as defined by (6), can be characterized in various ways.
Now, we collect the already known as well as new characterizations in one theorem.
Theorem 2.3 (Characterization of ⊥_B^ε) Let X be a real normed linear space,
x, y ∈ X and let ε ∈ [0, 1). The following conditions are equivalent (each of them
can be treated as a definition of x ⊥_B^ε y):
(i) ∃ c > 0 ∀ λ ∈ [−c, c] : ‖x + λy‖² ≥ ‖x‖² − 2ε‖x‖ ‖λy‖,
(ii) ∀ λ ∈ R : ‖x + λy‖² ≥ ‖x‖² − 2ε‖x‖ ‖λy‖,
(iii) ∃ z ∈ Lin {x, y} : x ⊥_B z, ‖z − y‖ ≤ ε‖y‖,
(iv) ∃ ϕ ∈ J (x) : |ϕ(y)| ≤ ε‖y‖,
(v) ∀ λ ∈ R : ‖x + λy‖ ≥ ‖x‖ − ε‖λy‖,
(vi) ∃ c > 0 ∀ λ ∈ [−c, c] : ‖x + λy‖ ≥ ‖x‖ − ε‖λy‖.
Proof (i)⇒(ii) Assume (i) and define

\[ f(\lambda) := \|x + \lambda y\|^2 + 2\varepsilon \|x\|\,\|\lambda y\|, \qquad \lambda \in \mathbb{R}. \]

Clearly, the mapping f : R → [0, ∞) is convex and f (0) = ‖x‖². Moreover, it
follows from (i) that f (λ) ≥ ‖x‖² whenever |λ| ≤ c. Assume, contrary to our
claim, that (ii) does not hold and ‖x + λ0 y‖² < ‖x‖² − 2ε‖x‖ ‖λ0 y‖ for some
λ0 ∈ R. That would mean f (λ0 ) < ‖x‖². Taking n ∈ N big enough so that |λ0 /n| ≤ c
and using convexity of f we would have then:

\[ \|x\|^2 \le f\!\left(\frac{\lambda_0}{n}\right) = f\!\left(\frac{1}{n}\lambda_0 + \left(1 - \frac{1}{n}\right) \cdot 0\right) \le \frac{1}{n} f(\lambda_0) + \left(1 - \frac{1}{n}\right) f(0) < \frac{1}{n}\|x\|^2 + \left(1 - \frac{1}{n}\right)\|x\|^2 = \|x\|^2, \]

a contradiction. Thus (ii) holds true.

(ii)⇒(iii) Suppose that (ii) holds and that x ≠ 0 (otherwise the result is trivial).
It follows from Lemma 2.2 that for an arbitrary n ∈ N there exists a semi-inner
product [·|·]_n in X such that (11) holds. Let ⊥_{s,n} denote the corresponding semi-
orthogonality relation. Defining

\[ z_n := -\frac{[y|x]_n}{\|x\|^2}\, x + y \in \operatorname{Lin}\{x, y\}, \]

it is easy to see that x ⊥_{s,n} z_n and since ⊥_{s,n} ⊂ ⊥_B, it follows that x ⊥_B z_n.
Applying (11) we estimate ‖z_n − y‖ and we get

\[ x \perp_B z_n \quad \text{and} \quad \|z_n - y\| \le \left(\varepsilon + \frac{1}{n}\right) \|y\|. \tag{12} \]

Notice that ‖z_n‖ ≤ 2‖y‖ (since |[y|x]_n| ≤ ‖x‖ ‖y‖ gives ‖z_n − y‖ ≤ ‖y‖), whence the
elements of the sequence (z_n)_{n=1,2,...} belong to a closed ball in the two-dimensional
space Lin {x, y}. Thus there exist z ∈ Lin {x, y} and a subsequence (z_{n_k}) convergent
to z. Finally, (12) and continuity of the norm yield x ⊥_B z and ‖z − y‖ ≤ ε‖y‖.
(iii)⇒(iv) Assuming (iii) and applying (3), there exists ϕ ∈ J (x) such that
ϕ(z) = 0. Therefore, |ϕ(y)| = |ϕ(z) − ϕ(y)| ≤ ‖z − y‖ ≤ ε‖y‖, as claimed.
(iv)⇒(v) For an arbitrary λ ∈ R we have from (iv)

\[ \|x + \lambda y\| \ge |\varphi(x + \lambda y)| \ge |\varphi(x)| - |\varphi(\lambda y)| \ge \|x\| - \varepsilon \|\lambda y\|, \]

i.e., (v) is satisfied.

(v)⇒(vi) Trivial.
(vi)⇒(i) Assuming that (vi) holds true, let d := min{‖x‖/(ε‖y‖), c} (we assume that
y ≠ 0 and ε > 0; if y = 0, then (i) follows trivially, and if ε = 0, one can take
d := c). Now, if |λ| ≤ d, then ‖x‖ − ε‖λy‖ ≥ 0, whence:

\[ \|x + \lambda y\|^2 \ge (\|x\| - \varepsilon \|\lambda y\|)^2 = \|x\|^2 - 2\varepsilon \|x\|\,\|\lambda y\| + \varepsilon^2 \|\lambda y\|^2 \ge \|x\|^2 - 2\varepsilon \|x\|\,\|\lambda y\| \]

and we are done.

Remarks 2.4
1. The equivalence of (ii), (iii) and (iv) was proved already in [15].
2. Conditions (v) and (vi) are new characterizations of the ⊥_B^ε relation, perhaps more
convenient to use.
3. Notice also that (i) and (vi) prove that it is sufficient to verify either of the
inequalities ‖x + λy‖² ≥ ‖x‖² − 2ε‖x‖ ‖λy‖ or ‖x + λy‖ ≥ ‖x‖ − ε‖λy‖
for λ in some neighbourhood of zero only.
4. It is clear that (iii) generalizes (2). However, since the BJ-orthogonality is not
symmetric, the roles of x and y cannot be interchanged now.
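Conditions (iii), (iv) and (v) of Theorem 2.3 can be compared on a concrete pair of vectors. The sketch below is illustrative and rests on assumptions not made in the text: it works in (R², ‖·‖∞), uses the standard description of supporting functionals there (convex combinations of w ↦ sign(x_i) w_i over coordinates with |x_i| = ‖x‖∞), and tests (v) only on a finite grid of λ. The names `cond_iv` and `cond_v_grid` are hypothetical.

```python
from fractions import Fraction as F

def norm_inf(v):
    return max(abs(c) for c in v)

def sign(c):
    return (c > 0) - (c < 0)

def cond_iv(x, y, eps):
    """(iv): some supporting functional phi of x != 0 satisfies |phi(y)| <= eps*||y||."""
    r = norm_inf(x)
    vals = [sign(a) * b for a, b in zip(x, y) if abs(a) == r]
    bound = eps * norm_inf(y)
    # phi(y) ranges over [min(vals), max(vals)]; it must meet [-bound, bound].
    return min(vals) <= bound and max(vals) >= -bound

def cond_v_grid(x, y, eps, steps=500, scale=100):
    """(v), tested only for lambda = k/scale with |k| <= steps."""
    r, ny = norm_inf(x), norm_inf(y)
    for k in range(-steps, steps + 1):
        lam = F(k, scale)
        if norm_inf([a + lam * b for a, b in zip(x, y)]) < r - eps * abs(lam) * ny:
            return False
    return True

x, y = [F(2), F(1)], [F(1), F(3)]
# Here the only supporting functional of x is phi(w) = w[0], so phi(y) = 1.
# A witness for (iii): z = y - (phi(y)/||x||) x, which satisfies phi(z) = 0.
z = [b - F(1, 2) * a for a, b in zip(x, y)]     # z = (0, 5/2)
dist = norm_inf([p - q for p, q in zip(z, y)])   # ||z - y|| = |phi(y)| = 1
```

In this example x ⊥_B^ε y holds exactly for ε ≥ 1/3, and for ε = 1/3 the vector z satisfies x ⊥_B z with ‖z − y‖ = ε‖y‖, so the bound in (iii) is attained.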
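Another Definition of an Approximate Birkhoff-James Orthogonality
We conclude this section with an attempt to propose a new definition of an
approximate BJ-orthogonality, joining the advantages of the definitions
$\underset{\varepsilon}{\perp}{}_{B}$ and $\perp_B^{\varepsilon}$.
Definition 2.5 Let X be a real normed linear space. For x, y ∈ X and ε ∈ [0, 1)
we define:

\[ x \overset{\varepsilon}{\perp}{}_{B}\, y \iff \forall\, \lambda \in \mathbb{R} : \|x + \lambda y\| \ge \|x\| - \varepsilon \cdot \min\{\|x\|, \|\lambda y\|\}. \tag{13} \]

Remark 2.6 It is quite straightforward that for x, y ∈ X \ {0}

\[ x \overset{\varepsilon}{\perp}{}_{B}\, y \iff \|x + \lambda y\| \ge \begin{cases} \|x\| - \varepsilon \|\lambda y\|, & |\lambda| \le \frac{\|x\|}{\|y\|}, \\ (1 - \varepsilon)\|x\|, & |\lambda| > \frac{\|x\|}{\|y\|}. \end{cases} \tag{14} \]

Notice that if |λ| ≥ (2 − ε)‖x‖/‖y‖, then we always have

\[ \|x + \lambda y\| \ge \|\lambda y\| - \|x\| \ge (2 - \varepsilon)\|x\| - \|x\| = (1 - \varepsilon)\|x\|, \]

so in (13) (or (14)) one can restrict to considering |λ| < (2 − ε)‖x‖/‖y‖.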
Proposition 2.7 For an arbitrary real normed space X , x, y ∈ X and ε ∈ [0, 1)
we have:

\[ x \overset{\varepsilon}{\perp}{}_{B}\, y \iff x \underset{\varepsilon}{\perp}{}_{B}\, y \ \text{ and } \ x \perp_B^{\varepsilon} y. \]

Proof Assume that $x \overset{\varepsilon}{\perp}{}_{B}\, y$. The assertion is trivial for x = 0
or y = 0, so we assume that x, y ∈ X \ {0}. If |λ| ≤ ‖x‖/‖y‖, then −ε‖λy‖ ≥ −ε‖x‖ and
from (14) we have

\[ \|x + \lambda y\| \ge \|x\| - \varepsilon \|\lambda y\| \ge (1 - \varepsilon)\|x\|. \]

Thus ‖x + λy‖ ≥ (1 − ε)‖x‖ holds true for any λ and we get
$x \underset{\varepsilon}{\perp}{}_{B}\, y$.
It follows from (14) that ‖x + λy‖ ≥ ‖x‖ − ε‖λy‖ for |λ| ≤ c where c := ‖x‖/‖y‖,
whence the condition (vi) in Theorem 2.3 yields x ⊥_B^ε y.
The reverse implication follows easily from the definition of
$\underset{\varepsilon}{\perp}{}_{B}$ and the characterization (v) of ⊥_B^ε in Theorem 2.3.
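The three relations appearing in Proposition 2.7 can be compared on an example. The sketch below is only a finite spot check under stated assumptions: it works in (R², ‖·‖∞), tests each defining inequality on a grid of λ rather than for all real λ, and uses characterization (v) of Theorem 2.3 for ⊥_B^ε. The function names `dragomir`, `chmielinski` and `combined` are hypothetical labels for Dragomir's relation, the relation (6) and the relation (13).

```python
from fractions import Fraction as F

def norm_inf(v):
    return max(abs(c) for c in v)

def holds_on_grid(x, y, rhs, steps=500, scale=100):
    """Check ||x + lam*y|| >= rhs(lam) for lam = k/scale with |k| <= steps."""
    for k in range(-steps, steps + 1):
        lam = F(k, scale)
        if norm_inf([a + lam * b for a, b in zip(x, y)]) < rhs(lam):
            return False
    return True

def dragomir(x, y, eps):
    nx = norm_inf(x)
    return holds_on_grid(x, y, lambda lam: (1 - eps) * nx)

def chmielinski(x, y, eps):
    nx, ny = norm_inf(x), norm_inf(y)
    return holds_on_grid(x, y, lambda lam: nx - eps * abs(lam) * ny)

def combined(x, y, eps):
    nx, ny = norm_inf(x), norm_inf(y)
    return holds_on_grid(x, y, lambda lam: nx - eps * min(nx, abs(lam) * ny))

x, y = [F(2), F(1)], [F(1), F(3)]   # min over lam of ||x + lam*y|| is 5/4, at lam = -3/4
```

For ε = 1/3 the relation (6) holds for this pair, while Dragomir's relation fails at λ = −3/4, so the relation (13) fails as well. For ε = 3/8 all three hold, consistently with the proposition.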

3 Applications and Generalizations—Review of Selected Results

This part of the paper is a survey of known results, aiming to give
at least an impression of the various places where the notions of BJ-orthogonality
and its approximate counterpart can be considered or applied. We
concentrate here on approximate orthogonality. The role of the (exact)
BJ-orthogonality in studies on the geometry of Banach spaces is well described,
e.g., in [42].
This section is organized as follows. First, we discuss the natural extension
from general normed spaces to the space of bounded linear operators. Then we
consider an issue belonging to linear preserver problems, namely preservation or
approximate preservation (by linear operators) of the BJ-orthogonality relation. In
the third subsection we point out some recent studies on the approximate
symmetry of the BJ-orthogonality, a topic closely connected to
approximate orthogonality as well as to the geometry of the considered space.

3.1 Approximate Birkhoff-James Orthogonality in Operator Theory

There is a vast literature devoted to various aspects of BJ-orthogonality in the space
of bounded linear operators. Let H be a Hilbert space. Denote by B(H) the space of
all bounded linear operators on H and by K(H) the subspace of B(H) consisting
of all compact operators. For a given T ∈ B(H) we denote by M_T the set of unit
vectors at which T attains its norm, i.e.,

M_T = {x ∈ S_H : ‖Tx‖ = ‖T‖},

where S_H stands for the unit sphere in H. We begin by presenting a canonical
result given by Bhatia and Šemrl [8] (and independently by Paul [41]).
Theorem 3.1 ([8, Theorem 1.1, Remark 3.1]) Let H be a Hilbert space and let
T, A ∈ B(H). Then the following conditions are equivalent:
(1) $T \perp_B A$;
(2) there exists $(x_n)_{n=1}^{\infty} \subset S_H$ such that
$\lim_{n\to\infty} \|Tx_n\| = \|T\|$ and $\lim_{n\to\infty} \langle Tx_n \mid Ax_n \rangle = 0$.
Moreover, if dim H < ∞, then each of the above conditions is equivalent to:
(3) there exists $x_0 \in M_T$ such that $Tx_0 \perp Ax_0$.
The above result was developed further by various authors. Benítez, Fernández and
Soriano [7] showed that the equivalence (1)⇔(3) is valid if and only if H is a Hilbert
space (that is, H cannot be replaced by a general Banach space). Generalizations of
Theorem 3.1 have been obtained, e.g., by Arambašić and Rajić [4], Sain, Paul and
Hait [45, 46], Grover [27] and Wójcik [51].
In [15] the authors provided the first extension of the result of Bhatia and Šemrl to
approximate orthogonality in B(H).
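Theorem 3.2 ([15, Theorem 3.2]) Let H be a real Hilbert space, let T, A ∈ B(H)
and let ε ∈ [0, 1). Then the following conditions are equivalent:
(1) $T \perp_B^{\varepsilon} A$;
(2) there exists $(x_n)_{n=1}^{\infty} \subset S_H$ such that

$$\lim_{n\to\infty} \|Tx_n\| = \|T\|, \qquad \lim_{n\to\infty} |\langle Tx_n \mid Ax_n \rangle| \le \varepsilon \|T\| \|A\|.$$

Moreover, if dim H < ∞, then each of the above conditions is equivalent to:
(3) there exists $x_0 \in M_T$ such that $|\langle Tx_0 \mid Ax_0 \rangle| \le \varepsilon \|T\| \|A\|$.
If dim H < ∞ and, additionally, $M_T \subset M_A$, then each of the above three conditions
is also equivalent to:
(4) there exists $x_0 \in M_T$ such that $Tx_0 \perp^{\varepsilon} Ax_0$.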
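Condition (3) can be tried out numerically in the simplest finite-dimensional case. The following is a minimal sketch, not taken from [15]: it assumes H = R^2, uses the definition of approximate BJ-orthogonality in the form $\|T + \lambda A\|^2 \ge \|T\|^2 - 2\varepsilon\|T\|\|\lambda A\|$ for all real λ, and only samples λ on a finite grid, so a positive answer is numerical evidence rather than a proof. The matrices T and A, the grid LAMS and all function names are illustrative choices. For ε = 0 the same test corresponds to condition (3) of Theorem 3.1.

```python
import math

def op_norm(m):
    """Spectral (operator) norm of a real 2x2 matrix ((a, b), (c, d))."""
    (a, b), (c, d) = m
    tr = a * a + b * b + c * c + d * d      # trace of M^T M
    det = a * d - b * c                     # det(M^T M) equals det(M)^2
    disc = max(tr * tr - 4.0 * det * det, 0.0)
    return math.sqrt((tr + math.sqrt(disc)) / 2.0)

def combo(t, lam, a):
    """Return the matrix T + lam * A."""
    return tuple(tuple(t[i][j] + lam * a[i][j] for j in range(2)) for i in range(2))

def approx_bj_on_grid(t, a, eps, lams, tol=1e-12):
    """Test ||T + lam A||^2 >= ||T||^2 - 2 eps ||T|| ||lam A|| for each lam in lams."""
    nt, na = op_norm(t), op_norm(a)
    return all(
        op_norm(combo(t, lam, a)) ** 2 >= nt * nt - 2.0 * eps * nt * abs(lam) * na - tol
        for lam in lams
    )

# Illustrative data (an assumption of this sketch, not from the source).
T = ((1.0, 0.0), (0.0, 0.5))    # ||T|| = 1, attained only at +-e1, so M_T = {e1, -e1}
A = ((0.25, 0.0), (0.0, 1.0))   # ||A|| = 1
# <T e1 | A e1>: inner product of the first columns of T and A.
inner = T[0][0] * A[0][0] + T[1][0] * A[1][0]
LAMS = [k / 100.0 for k in range(-500, 501)]

def condition_3(eps):
    """Condition (3) of Theorem 3.2 at the norming vector x0 = e1."""
    return abs(inner) <= eps * op_norm(T) * op_norm(A)

for eps in (0.2, 0.3):
    print(eps, condition_3(eps), approx_bj_on_grid(T, A, eps, LAMS))
```

In this example the two columns of the printout agree for both values of ε, as the equivalence (1)⇔(3) predicts; the grid test can only refute approximate orthogonality, never certify it.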
The condition dim H < ∞ can be replaced by compactness of T .

Theorem 3.3 ([15, Theorem 3.4]) Let H be a real Hilbert space, let T , A ∈ B(H)
and let ε ∈ [0, 1). Assume that MT ⊂ MA and T ∈ K(H). Then T ⊥εB A if and only
if

∃ x0 ∈ MT : T x0 ⊥ε Ax0 .

The following characterization of the approximate orthogonality in B(H) was
given later by Paul et al. [43].
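Theorem 3.4 ([43, Theorem 3.1])
(1) Let T ∈ B(H). Then for any A ∈ B(H),
$T \perp_B^{\varepsilon} A \Leftrightarrow |\langle Tx \mid Ax \rangle| \le \varepsilon \|T\| \|A\|$ for
some $x \in M_T$, if and only if $M_T = S_{H_0}$ for some finite-dimensional subspace
$H_0$ of H and $\|T\|_{H_0^{\perp}} < \|T\|$.
(2) Moreover, if $M_T \subset M_A$, then
$T \perp_B^{\varepsilon} A \Leftrightarrow Tx \perp^{\varepsilon} Ax$ for some $x \in M_T$, if and only
if $M_T = S_{H_0}$ for some finite-dimensional subspace $H_0$ of H and
$\|T\|_{H_0^{\perp}} < \|T\|$.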
Now, let X , Y be normed linear spaces. By B(X , Y) we denote the space of all
bounded linear operators from X to Y and by K(X , Y) its subspace consisting of
compact operators.
In the above quoted paper [43], some characterizations were given for the
approximate BJ-orthogonality for compact operators defined on a reflexive Banach
space.
Theorem 3.5 ([43, Theorem 3.2]) Let X be a reflexive Banach space and let Y
be a normed space. Let T, A ∈ K(X, Y) and $M_T = D \cup (-D)$, where D is a
nonempty compact connected subset of $S_X$. Then $T \perp_B^{\varepsilon} A$ if and only if there exists
$x \in M_T$ such that $\|Tx + \lambda Ax\|^2 \ge \|T\|^2 - 2\varepsilon \|T\| \|\lambda A\|$ for all scalars λ.
Moreover, if $M_T \subset M_A$, then $T \perp_B^{\varepsilon} A$ if and only if
$Tx \perp_B^{\varepsilon} Ax$ for some $x \in M_T$.
Theorem 3.6 ([43, Theorem 3.3]) Let X be a reflexive Banach space and Y be a
normed space. Let T, A ∈ K(X, Y). Then $T \perp_B^{\varepsilon} A$ if and only if there exist
$x, y \in M_T$ such that $\|Tx + \lambda Ax\|^2 \ge \|T\|^2 - 2\varepsilon \|T\| \|\lambda A\|$ for all λ ≥ 0 and
$\|Ty + \lambda Ay\|^2 \ge \|T\|^2 - 2\varepsilon \|T\| \|\lambda A\|$ for all λ ≤ 0.
For arbitrary normed spaces and bounded linear operators the following
characterization was given.
Theorem 3.7 ([43, Theorem 3.4]) Let X, Y be two normed spaces and let
T ∈ B(X, Y) be nonzero. Then for any A ∈ B(X, Y), $T \perp_B^{\varepsilon} A$ if and only if
either (a) or (b) holds.
(a) There exists a sequence $(x_n)$ of unit vectors such that

$$\|Tx_n\| \to \|T\| \quad \text{and} \quad \lim_{n\to\infty} \|Ax_n\| \le \varepsilon \|A\|.$$
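(b) There exist two sequences $(x_n)$, $(y_n)$ of unit vectors and two sequences of
positive real numbers $(\varepsilon_n)$, $(\delta_n)$ such that
(i) $\varepsilon_n \to 0$, $\delta_n \to 0$, $\|Tx_n\| \to \|T\|$, $\|Ty_n\| \to \|T\|$ as n → ∞;
(ii) $\|Tx_n + \lambda Ax_n\|^2 \ge (1 - \varepsilon_n^2)\|Tx_n\|^2 - 2\varepsilon \sqrt{1 - \varepsilon_n^2}\, \|Tx_n\| \|\lambda A\|$ for all λ ≥ 0;
(iii) $\|Ty_n + \lambda Ay_n\|^2 \ge (1 - \delta_n^2)\|Ty_n\|^2 - 2\varepsilon \sqrt{1 - \delta_n^2}\, \|Ty_n\| \|\lambda A\|$ for all λ ≤ 0.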
Under some additional conditions on the norm attainment set, further
characterizations of the approximate BJ-orthogonality of operators between normed
linear spaces were obtained, in particular, by Mal et al. [36]. For bilinear operators
the topic was studied recently by Khurana and Sain [33]. For a positive operator
A ∈ B(H) acting on a Hilbert space H, Sen et al. [48] defined A-orthogonality in H and
A-BJ-orthogonality in B(H). Next, they introduced the notions of (ε, A)-orthogonality
and (ε, A)-BJ-orthogonality in H, and finally (ε, A)-BJ-orthogonality in B(H). In
particular, some characterizations of those approximate orthogonalities were given.
Investigations concerning BJ-orthogonality in semi-Hilbertian spaces were also
carried out by Zamani [52].
Now, let us consider the space C(K) of all real continuous mappings defined
on a locally compact topological space K, endowed with the supremum norm. The
subspace $C_0(K)$ of C(K) defined by

$$C_0(K) := \{f \in C(K) : \forall\, \varepsilon > 0, \text{ the set } \{t \in K : |f(t)| \ge \varepsilon\} \text{ is compact}\}$$

has the property that for $f \in C_0(K)$ the set $M_f := \{t \in K : |f(t)| = \|f\|\}$ is always
nonempty and compact. In [15] a characterization of approximate BJ-orthogonality
on $C_0(K)$ was given.
Theorem 3.8 ([15, Theorem 3.6]) Let $f, g \in C_0(K)$, $f \ne 0 \ne g$. Assume that $M_f$
is connected. Then the following conditions are equivalent:
(a) $f \perp_B^{\varepsilon} g$;
(b) there exists $t_1 \in M_f$ such that $|g(t_1)| \le \varepsilon \|g\|$.
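The criterion (b) is easy to test in a toy situation. The sketch below is an illustration under stated assumptions, not part of [15]: K is taken to be a finite set with the discrete topology (so that $C_0(K) = C(K)$ is R^n with the maximum norm and $M_f$ is connected exactly when it is a single point), approximate BJ-orthogonality is used in the form $\|f + \lambda g\|^2 \ge \|f\|^2 - 2\varepsilon\|f\|\|\lambda g\|$, and λ is sampled on a finite grid only. The functions f, g and all helper names are hypothetical.

```python
def sup_norm(f):
    """Supremum norm of a function on a finite set, stored as a list of values."""
    return max(abs(v) for v in f)

def approx_bj_on_grid(f, g, eps, lams, tol=1e-12):
    """Test ||f + lam g||^2 >= ||f||^2 - 2 eps ||f|| ||lam g|| for each lam in lams."""
    nf, ng = sup_norm(f), sup_norm(g)
    return all(
        sup_norm([a + lam * b for a, b in zip(f, g)]) ** 2
        >= nf * nf - 2.0 * eps * nf * abs(lam) * ng - tol
        for lam in lams
    )

# Illustrative data: K = {0, 1, 2}, f attains its norm only at t = 0.
f = [1.0, 0.5, -0.3]
g = [0.2, 1.0, -1.0]
M_f = [t for t, v in enumerate(f) if abs(v) == sup_norm(f)]
LAMS = [k / 100.0 for k in range(-500, 501)]

def criterion_b(eps):
    """Condition (b) of Theorem 3.8: some t1 in M_f with |g(t1)| <= eps * ||g||."""
    return any(abs(g[t]) <= eps * sup_norm(g) for t in M_f)

for eps in (0.1, 0.25):
    print(eps, criterion_b(eps), approx_bj_on_grid(f, g, eps, LAMS))
```

Here $|g(t_1)| = 0.2$ at the unique norming point, so (b) fails for ε = 0.1 and holds for ε = 0.25, and the grid test of the defining inequality gives the same answers.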

3.2 Operators Approximately Preserving Orthogonality

A linear mapping T between two inner product spaces H and K which preserves
orthogonality, i.e., such that

$$x \perp y \implies Tx \perp Ty, \qquad x, y \in H,$$

is necessarily a linear similarity, i.e., an isometry multiplied by a constant (a quite
elementary proof can be found, e.g., in [11, Theorem 2.1]). Koldobsky [34] and
then Blanco and Turnšek [9] extended this result to normed linear spaces with
BJ-orthogonality.
A related study of linear mappings approximately preserving orthogonality in
inner product spaces was started in [11, 12] and continued in [49]. A generalization
to C*-modules was given by Ilišević and Turnšek [28].
Another generalization, toward normed spaces, was tackled first for the
isosceles-orthogonality in [16]. Later, Mojškerc and Turnšek [38] showed that each
linear mapping between normed spaces that approximately preserves BJ-orthogonality
is an approximate similarity.
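Theorem 3.9 ([38, Theorem 3.5]) Let X, Y be normed spaces, $\varepsilon \in [0, \tfrac{1}{2})$ and
T : X → Y a linear mapping satisfying

$$x \perp_B y \implies Tx \perp_B^{\varepsilon} Ty, \qquad x, y \in X.$$

Then

$$(1 - 16\varepsilon)\|T\|\|x\| \le \|Tx\| \le \|T\|\|x\|, \qquad x \in X.$$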

The assumption $\varepsilon < \tfrac{1}{2}$ is needed to prove that T is bounded, i.e., that
$\|Tx\| \le \|T\|\|x\|$ holds true. The left-hand side inequality in the assertion is trivial
unless $\varepsilon < \tfrac{1}{16}$. The constant 16ε can be reduced to 8ε for real spaces
(cf. [38, Remark 3.1]).
A more extensive review of the results described above and of related topics can be
found in [13].

3.3 Approximate Symmetry of the Birkhoff-James Orthogonality

Approximate orthogonality has been used to introduce the notion of approximate
symmetry of the BJ-orthogonality. It is known that BJ-orthogonality is generally
not symmetric; moreover, its symmetry characterizes inner product spaces among
normed spaces of dimension greater than or equal to 3. Only on a 2-dimensional
linear space is it possible to find a norm which does not come from an inner product
but for which the corresponding $\perp_B$ orthogonality is symmetric (more on such norms,
the so-called Radon norms, in [37]). In [19] the following definition was introduced.
Definition 3.10 The BJ-orthogonality relation in a normed linear space X is called
approximately symmetric, or more precisely ε-symmetric for some ε ∈ [0, 1), if for
any x, y ∈ X

$$x \perp_B y \implies y \perp_B^{\varepsilon} x.$$
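To see what can go wrong, here is a small worked computation. It is an illustration chosen for this purpose, not an example quoted from [19]; the space $\ell_\infty^2$ and the vectors x, y are assumptions of the sketch, and approximate BJ-orthogonality is used in the form $\|u + \lambda v\|^2 \ge \|u\|^2 - 2\varepsilon\|u\|\|\lambda v\|$ for all real λ.

```latex
% Setting: \ell_\infty^2 = (\mathbb{R}^2, \|\cdot\|_\infty), x = (1,1), y = (1,0), \|x\| = \|y\| = 1.
\begin{align*}
  \|x + \lambda y\|_\infty &= \max\{|1+\lambda|,\, 1\} \ \ge\ 1 = \|x\|_\infty
      \quad (\lambda \in \mathbb{R}),
      && \text{so } x \perp_B y. \\
  \|y - t x\|_\infty^2 &= \max\{1-t,\, t\}^2 = (1-t)^2 = 1 - 2t + t^2
      && \text{for } 0 < t \le \tfrac12, \\
  \|y\|_\infty^2 - 2\varepsilon \|y\|_\infty \|t x\|_\infty &= 1 - 2\varepsilon t .
\end{align*}
% Hence y \perp_B^{\varepsilon} x would force 1 - 2t + t^2 \ge 1 - 2\varepsilon t,
% i.e. \varepsilon \ge 1 - t/2, for every t \in (0, 1/2]; letting t \to 0 gives \varepsilon \ge 1.
```

So for this particular pair no ε ∈ [0, 1) satisfies the implication in Definition 3.10, which shows directly that the BJ-orthogonality of $\ell_\infty^2$ is not ε-symmetric for any ε ∈ [0, 1).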

Several conditions sufficient for approximate symmetry of $\perp_B$ were given in [19].
On the other hand, examples of (classes of) normed spaces for which the
BJ-orthogonality is not approximately symmetric were also given there. Moreover,
approximate symmetry of $\perp_B$ does not imply inner product structure (regardless of
the dimension of the underlying space).
It is obvious that symmetry of the approximate orthogonality implies approximate
symmetry of the orthogonality. The converse fails: even exact symmetry of the
orthogonality does not imply symmetry of the approximate orthogonality
(cf. [19, Example 2.5]).
In [19, Theorem 4.1] it was proved that in each real uniformly convex normed
space, BJ-orthogonality is approximately symmetric. The same assertion is true
(cf. [19, Theorem 4.2]) for real finite-dimensional and smooth normed spaces.
Moreover, if X is a real uniformly convex and smooth Banach space, then (cf. [19,
Theorem 4.6]) there exists ε ∈ [0, 1) such that the BJ-orthogonality relations in X,
X* and X** are ε-symmetric.
Whereas the approximate symmetry of $\perp_B$ described above has a global setting,
its local version was considered in [14]. Namely, the following notion was
introduced.
Definition 3.11 Let X be a normed linear space and let x ∈ X. We say that x is
approximately left-symmetric if there exists $\varepsilon_x \in [0, 1)$ such that whenever y ∈ X
and $x \perp_B y$, it follows that $y \perp_B^{\varepsilon_x} x$. Analogously, we define the approximate
right-symmetry of x.
In particular, it was proved in [14, Theorem 3.10] that the approximate symmetry
of the orthogonality and the local approximate left-symmetry at each point of the
unit sphere are equivalent properties of any finite-dimensional polyhedral Banach
space. Moreover, a geometric characterization of this property was given. An
analogous definition was also introduced for Dragomir’s definition of approximate
orthogonality, and it was proved that the BJ-orthogonality is approximately
symmetric in the sense of Dragomir in all finite-dimensional Banach spaces.
Recently, approximate symmetry of the BJ-orthogonality has been studied by
Sen et al. [48] for semi-Hilbertian structures induced by positive operators acting
on a Hilbert space. The notion of (ε, A)-approximate right (left) symmetry of the
BJ-orthogonality of linear A-bounded operators on H was introduced.

3.4 Varia

Apart from those mentioned above, there are other areas of research where the notion
of approximate BJ-orthogonality is involved. Without giving too many details, we only
signal their presence.
Approximate Birkhoff-James Orthogonality in Hilbert Modules. In an inner
product module over a C*-algebra one can define an orthogonality relation by
using the inner product as well as the corresponding norm. This gives rise
to various types of orthogonality (see Arambašić and Rajić [4, 5]). Approximate
BJ-orthogonality in this setting has also been considered by Moslehian and Zamani
[39].
Orthogonality Sets. Two notions of approximate Birkhoff-James orthogonality sets
have been introduced by Sain et al. [47]. Given x ∈ X and ε ∈ [0, 1) one can
consider

$$F(x, \varepsilon) := \{y \in X : x \perp_D^{\varepsilon} y\}; \qquad G(x, \varepsilon) := \{y \in X : x \perp_B^{\varepsilon} y\}.$$

A geometrical description of these two sets in an arbitrary normed space was given:
each of them is a union of two-dimensional normal cones.
A similar notion of Birkhoff-James ε-orthogonality sets for matrices and for
matrix polynomials, based on Dragomir’s definition of the approximate
BJ-orthogonality, was defined and studied in [20] (see also [21, 32, 40]).
The list of topics and results connected with the approximate BJ-orthogonality
which we dealt with in this section is by no means complete. It was our purpose just
to show that the considered concepts may be applied in a variety of further studies.

References

1. J. Alonso, C. Benitez, Orthogonality in normed linear spaces: a survey. Part I: main properties.
Extracta Math. 3(1), 1–15 (1988). Part II: Relations between main orthogonalities. Extracta
Math. 4(3), 121–131 (1989)
2. J. Alonso, H. Martini, S. Wu, On Birkhoff orthogonality and isosceles orthogonality in normed
linear spaces. Aequationes Math. 83(1–2), 153–189 (2012)
3. C. Alsina, J. Sikorska, M. Santos Tomás, Norm Derivatives and Characterizations of Inner
Product Spaces (World Scientific, Hackensack, 2010)
4. Lj. Arambašić, R. Rajić, The Birkhoff-James orthogonality in Hilbert C*-modules. Linear
Algebra Appl. 437(7), 1913–1929 (2012)
5. Lj. Arambašić, R. Rajić, On three concepts of orthogonality in Hilbert C*-modules. Linear
Multilinear Algebra 63(7), 1485–1500 (2015)
6. G. Birkhoff, Orthogonality in linear metric spaces. Duke Math. J. 1(2), 169–172 (1935)
7. C. Benítez, M. Fernández, M.L. Soriano, Orthogonality of matrices. Linear Algebra Appl.
422(1), 155–163 (2007)
8. R. Bhatia, P. Šemrl, Orthogonality of matrices and some distance problems. Linear Algebra
Appl. 287(1–3), 77–85 (1999)
9. A. Blanco, A. Turnšek, On maps that preserve orthogonality in normed spaces. Proc. R. Soc.
Edinburgh Sect. A 136(4), 709–716 (2006)
10. J. Chmieliński, On an ε-Birkhoff orthogonality. J. Inequal. Pure Appl. Math. 6(3) (2005). Art.
79
11. J. Chmieliński, Linear mappings approximately preserving orthogonality. J. Math. Anal. Appl.
304(1), 158–169 (2005)
12. J. Chmieliński, Stability of the orthogonality preserving property in finite-dimensional inner
product spaces. J. Math. Anal. Appl. 318(2), 433–443 (2006)
13. J. Chmieliński, Orthogonality Preserving Property and Its Ulam Stability. Functional
Equations in Mathematical Analysis, Springer Optimization and Its Applications, vol. 52
(Springer, New York, 2012), pp. 33–58
14. J. Chmieliński, D. Khurana, D. Sain, Local approximate symmetry of Birkhoff-James
orthogonality in normed linear spaces. Results Math. 76(3) (2021). Paper No. 136
15. J. Chmieliński, T. Stypuła, P. Wójcik, Approximate orthogonality in normed spaces and its
applications. Linear Algebra Appl. 531, 305–317 (2017)
16. J. Chmieliński, P. Wójcik, Isosceles-orthogonality preserving property and its stability.
Nonlinear Anal. 72(3–4), 1445–1453 (2010)
17. J. Chmieliński, P. Wójcik, On a ρ-orthogonality. Aequationes Math. 80(1–2), 45–55 (2010)
18. J. Chmieliński, P. Wójcik, ρ-orthogonality and its preservation – revisited, in Recent
Developments in Functional Equations and Inequalities, vol. 99 (Banach Center Publications,
Warszawa, 2013), pp. 17–30
19. J. Chmieliński, P. Wójcik, Approximate symmetry of Birkhoff orthogonality. J. Math. Anal.
Appl. 461, 625–640 (2018)
20. Ch. Chorianopoulos, P. Psarrakos, Birkhoff-James approximate orthogonality sets and
numerical ranges. Linear Algebra Appl. 434(9), 2089–2108 (2011)
21. Ch. Chorianopoulos, P. Psarrakos, On the continuity of Birkhoff-James ε-orthogonality sets.
Linear Multilinear Algebra 61(11), 1447–1454 (2013)
22. S.S. Dragomir, On approximation of continuous linear functionals in normed linear spaces.
An. Univ. Timişoara Ser. Ştiinţ. Mat., 29(1), 51–58 (1991)
23. S.S. Dragomir, Semi-Inner Products and Applications (Nova Science Publishers, Inc.,
Hauppauge, NY, 2004)
24. W. Fechner, J. Sikorska, On the stability of orthogonal additivity. Bull. Pol. Acad. Sci. Math.
58(1), 23–30 (2010)
25. J.R. Giles, Classes of semi–inner–product spaces. Trans. Am. Math. Soc., 129, 436–446 (1967)
26. S. Gudder, D. Strawther, Orthogonally additive and orthogonally increasing functions on vector
spaces. Pac. J. Math. 58(2), 427–436 (1975)
27. P. Grover, Orthogonality to matrix subspaces and a distance formula. Linear Algebra Appl.
445, 280–288 (2014)
28. D. Ilišević, A. Turnšek, Approximately orthogonality preserving mappings on C*-modules. J.
Math. Anal. Appl. 341(1), 298–308 (2008)
29. R.C. James, Orthogonality in normed linear spaces. Duke Math. J. 12, 291–301 (1945)
30. R.C. James, Orthogonality and linear functionals in normed linear spaces. Trans. Am. Math.
Soc. 61, 265–292 (1947)
31. R.C. James, Inner products in normed linear spaces. Bull. Am. Math. Soc. 53, 559–566 (1947)
32. M. Karamanlis, P.J. Psarrakos, Birkhoff-James ε-orthogonality sets in normed linear spaces,
in The Natália Bebiano Anniversary Volume. Textos Mat. Sér. B, vol. 44 (University of
Coimbra, Coimbra, 2013), pp. 81–92
33. D. Khurana, D. Sain, Norm derivatives and geometry of bilinear operators. Ann. Funct. Anal.
12(3) (2021). Paper No. 49
34. A. Koldobsky, Operators preserving orthogonality are isometries. Proc. R. Soc. Edinburgh
Sect. A 123(5), 835–837 (1993)
35. G. Lumer, Semi-inner-product spaces. Trans. Am. Math. Soc. 100, 29–43 (1961)
36. A. Mal, K. Paul, T.S.S.R.K. Rao, D. Sain, Approximate Birkhoff-James orthogonality and
smoothness in the space of bounded linear operators. Monatsh. Math. 190(3), 549–558 (2019)
37. H. Martini, K.J. Swanepoel, Antinorms and Radon curves. Aequationes Math. 72(1–2), 110–
138 (2006)
38. B. Mojškerc, A. Turnšek, Mappings approximately preserving orthogonality in normed spaces.
Nonlinear Anal. 73(12), 3821–3831 (2010)
39. M.S. Moslehian, A. Zamani, Characterizations of operator Birkhoff-James orthogonality.
Canad. Math. Bull. 60(4), 816–829 (2017)
40. V. Panagakou, P. Psarrakos, N. Yannakakis, Birkhoff-James ε-orthogonality sets of vectors and
vector-valued polynomials. J. Math. Anal. Appl. 454(1), 59–78 (2017)
41. K. Paul, Translatable radii of an operator in the direction of another operator. Sci. Math. 2(1),
119–122 (1999)
42. K. Paul, D. Sain, Birkhoff-James Orthogonality and Its Application in the Study of Geometry
of Banach Space. Advanced Topics in Mathematical Analysis (CRC Press, Boca Raton, FL,
2019), pp. 245–284
43. K. Paul, D. Sain, A. Mal, Approximate Birkhoff-James orthogonality in the space of bounded
linear operators. Linear Algebra Appl. 537, 348–357 (2018)
44. J. Rätz, On orthogonally additive mappings. Aequationes Math. 28(1–2), 35–49 (1985)
45. D. Sain, K. Paul, Operator norm attainment and inner product spaces. Linear Algebra Appl.
439(8), 2448–2452 (2013)
46. D. Sain, K. Paul, S. Hait, Operator norm attainment and Birkhoff-James orthogonality. Linear
Algebra Appl. 476, 85–97 (2015)
47. D. Sain, K. Paul, A. Mal, On approximate Birkhoff-James orthogonality and normal cones in
a normed space. J. Convex Anal. 26(1), 341–351 (2019)
48. J. Sen, D. Sain, K. Paul, On approximate orthogonality and symmetry of operators in semi-
Hilbertian structure. Bull. Sci. Math. 170 (2021). Paper No. 102997
49. A. Turnšek, On mappings approximately preserving orthogonality. J. Math. Anal. Appl. 336(1),
625–631 (2007)
50. P. Wójcik, Characterization of smooth spaces by approximate orthogonalities. Aequationes
Math. 89(4), 1189–1194 (2015)
51. P. Wójcik, Orthogonality of compact operators. Expo. Math. 35(1), 86–94 (2017)
52. A. Zamani, Birkhoff-James orthogonality of operators in semi-Hilbertian spaces and its
applications. Ann. Funct. Anal. 10(3), 433–445 (2019)
Orthogonally Additive Operators
on Vector Lattices

Marat Pliev and Mikhail Popov

Abstract An orthogonally additive operator T : E → F between vector lattices
E, F is a map which satisfies T (x + y) = T (x) + T (y) for all disjoint elements
x, y ∈ E. We summarize some results and open problems on this class of operators.
We focus mainly on the vector lattice structure of different partial subclasses of
the vector space of all orthogonally additive operators, some versions of order
continuity, certain domination problems, representation theorems and Banach lattice
structure of orthogonally additive operators.

Keywords Orthogonally additive operator · Abstract Uryson operator ·
Nemytskii operator · C-compact operator · AM-compact operator · Narrow
operator · Vector lattice · Lateral ideal · Lateral band

1 Introduction

The work of Drewnowski et al. [15, 16, 32, 33] led to the appearance and study of
orthogonally additive operators (OAOs) on vector lattices. OAOs generalize linear
operators (see the definition below) and naturally appear in different areas of modern
mathematics, e.g. partial differential equations, convex geometry, dynamical systems
and stochastic processes [31, 50, 57]. It is worth mentioning that some classical
operators of nonlinear analysis, including Uryson, Hammerstein and Nemytskii
operators, are orthogonally additive in appropriate function spaces [35]. The theory
of OAOs is being developed by different authors in many directions
[1, 4, 5, 19–21, 47, 49].

M. Pliev
Southern Mathematical Institute of the Russian Academy of Sciences, Vladikavkaz, Russia
North Caucasus Center for Mathematical Research of the Vladikavkaz Scientific Center of the
Russian Academy of Sciences, Vladikavkaz, Russia
M. Popov
Institute of Exact and Technical Sciences, Pomeranian University in Słupsk, Słupsk, Poland
Vasyl Stefanyk Precarpathian National University, Ivano-Frankivsk, Ukraine

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
R. M. Aron et al. (eds.), Operator and Norm Inequalities and Related Topics,
Trends in Mathematics, https://doi.org/10.1007/978-3-031-02104-6_10

The aim of this paper is to discuss some known results and state open problems
which can be useful for further development of the theory. Let us describe the
content of this article. In the next section, we briefly present the necessary background
on vector lattices and orthogonally additive operators. In Sect. 3, we discuss the
lateral partial order on vector lattices, which is useful for the study of OAOs.
Section 4 presents several extension theorems. In Sect. 5 we provide results on the
order structure of different classes of OAOs, including formulas of
Riesz-Kantorovich type for the lattice operations over OAOs. In Sect. 6, we discuss
C-compact and AM-compact OAOs. In particular, we show that, under some mild
conditions, the set of all C-compact OAOs is a projection band in $OA_r(E, F)$ and
present a solution to the domination problem for AM-compact abstract Uryson
operators. Section 7 is devoted to the relationships between different kinds of order
continuity, which are clear for linear operators but become involved for
OAOs. In Sect. 8 we present some theorems on narrow OAOs, including a deep
result on the representation of regular operators as the sum of a pseudo-embedding
and a diffuse (= narrow) operator, both in the linear and in the orthogonally additive
setting. In Sect. 9, we endow the vector lattice of order bounded OAOs with a norm
such that the set of all OAOs having finite norm becomes a Banach lattice in which the
subspace of all bounded linear operators is contractively complemented by means of
many projections, called linear sections of OAOs. The final section contains some open
problems. Several results are provided with proofs; these proofs are not new.
The standard reference books on the theory of vector and Banach lattices are
[8, 37]. All vector lattices we consider below are supposed to be Archimedean. We
write $x = \bigsqcup_{i=1}^{n} x_i$ to express that $x = \sum_{i=1}^{n} x_i$ and $x_i \perp x_j$ for all $i \ne j$. In
particular, for n = 2 we use the notation $x = x_1 \sqcup x_2$. We say that y is a fragment
(a component) of x ∈ E, and use the notation $y \sqsubseteq x$, if y ⊥ (x − y). The set of
all fragments of x ∈ E is denoted by $F_x$. We say that $x_1, x_2 \in F_x$ are mutually
complemented if $x = x_1 \sqcup x_2$. It is a standard exercise to show that $\sqsubseteq$ is a partial
order on E, called the lateral order (see [38] for a detailed study of this order).

2 Definition and Main Examples of OAOs

Definition 2.1 Let E be a vector lattice and X a real vector space. A function
T : E → X is called an orthogonally additive operator (OAO for short) provided
T (x + y) = T (x) + T (y) for any disjoint elements x, y ∈ E.
It is not hard to check that T (0) = 0. The set of all OAOs from E to X is a real
vector space with respect to the natural linear operations.
Definition 2.2 Let E, F be vector lattices. An OAO T : E → F is said to be:

• positive if T x ≥ 0 holds in F for all x ∈ E;
• regular if T = S1 − S2 , where S1 , S2 are positive OAOs from E to F ;
• order bounded, or an abstract Uryson operator, if it maps order bounded sets in
E to order bounded sets in F ;
• disjointness preserving, if T x ⊥ T y for every disjoint x, y ∈ E;
• non-expanding, if E = F and $Tx \in \{x\}^{dd}$ for every x ∈ E;
• C-bounded or a Popov operator, if the set T (Fx ) is order bounded in F for every
x ∈ E.
Observe that if T : E → F is a positive OAO and x ∈ E is such that T (x) ≠ 0,
then T (−x) ≠ −T (x). So, the positivity of OAOs is completely different from that
of linear operators, and the only linear operator which is positive in the sense of
OAOs is zero. A positive OAO need not be order bounded. Indeed, every function
T : R → R with T (0) = 0 is an OAO, and, obviously, not all such functions are
order bounded.
The cone of all positive OAOs from E to F is denoted by OA+ (E, F ). The
vector spaces of all regular, abstract Uryson and C-bounded operators we denote
by OAr (E, F ), U(E, F ) and P(E, F ) respectively. We note that the inclusion
U(E, F ) ⊂ P(E, F ) is strict even in the one-dimensional case.
Example Let E = F = R and let the map T : R → R be defined by the formula

$$T(x) = \begin{cases} \dfrac{1}{x^2} & \text{if } x \ne 0, \\ 0 & \text{if } x = 0. \end{cases}$$

Since for every x ∈ R the set $F_x$ contains only the two elements 0 and x, one has
T ∈ P(E, F). On the other hand, for the order bounded subset (−1, 1) ⊂ R, the image
T((−1, 1)) is not order bounded, and hence $T \notin U(E, F)$.
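The three claims of this example (orthogonal additivity, C-boundedness, failure of order boundedness) can be checked on sample points. The following is a small sketch for illustration only; the sample values and helper names are arbitrary choices, and the finite checks are of course not a proof.

```python
def T(x):
    """The map of the example: T(x) = 1/x^2 for x != 0 and T(0) = 0."""
    return 0.0 if x == 0 else 1.0 / (x * x)

def disjoint(x, y):
    """In the vector lattice R, |x| ^ |y| = 0 holds iff x == 0 or y == 0."""
    return min(abs(x), abs(y)) == 0

def fragments(x):
    """y is a fragment of x iff y is disjoint from x - y; in R this forces y in {0, x}."""
    return {0.0, float(x)}

samples = [-2.0, -0.5, 0.0, 0.25, 3.0]

# Orthogonal additivity: T(x + y) = T(x) + T(y) whenever x and y are disjoint.
additive_on_disjoint = all(
    T(x + y) == T(x) + T(y)
    for x in samples for y in samples if disjoint(x, y)
)

# C-boundedness: T(F_x) = {0, T(x)} lies in the order interval [0, T(x)].
c_bounded = all(0.0 <= T(y) <= T(x) for x in samples for y in fragments(x))

# No order boundedness: on the order bounded set (-1, 1) the values blow up near 0.
values_near_zero = [T(1.0 / n) for n in (10, 100, 1000)]

# T is far from additive on non-disjoint elements.
not_additive = T(1.0 + 1.0) != T(1.0) + T(1.0)

print(additive_on_disjoint, c_bounded, values_near_zero, not_additive)
```

The list `values_near_zero` grows like n², which is the reason why T((−1, 1)) has no upper bound in R.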
Proposition 2.3 Let E, F be vector lattices. Then $OA_+(E, F) \subseteq P(E, F)$.
Proof Given any $T \in OA_+(E, F)$, x ∈ E and $y \in F_x$, one has

$$T(x) = T\big(y \sqcup (x - y)\big) = T(y) + T(x - y) \ge T(y) \ge 0,$$

hence $T(x) = \sup T(F_x)$, and T ∈ P(E, F) is proved.

Consider some traditional examples of OAOs which motivate the development
of the general theory of OAOs.
Definition 2.4 Let (A, Σ, μ) and (B, Ξ, ν) be finite measure spaces. By
(A × B, μ ⊗ ν) we denote the completion of their product measure space. A map
K : A × B × R → R is said to be a Carathéodory function if the following conditions
hold:
(C1) K(·, ·, r) is μ ⊗ ν-measurable for all r ∈ R;
(C2) K(s, t, ·) is continuous on R for μ ⊗ ν-almost all (s, t) ∈ A × B.
We say that a Carathéodory function K is normalized if K(s, t, 0) = 0 for
μ ⊗ ν-almost all (s, t) ∈ A × B.
Example ([18, Proposition 3.2]) Let E be an order ideal of $L_0(\nu)$, K : A × B × R → R
a normalized Carathéodory function and let the inequality

$$\int_B |K(s, t, f(t))| \, d\nu(t) < \infty$$

hold for every f ∈ E and almost all s ∈ A. Then the formula

$$Tf(s) = \int_B K(s, t, f(t)) \, d\nu(t) \qquad (1)$$

defines a regular OAO $T : E \to L_0(\mu)$. A special case is the Hammerstein operator
defined by setting

$$(Tf)(s) := \int_B L(s, t)\, N(t, f(t)) \, d\nu(t),$$

where L(·, ·) is a μ ⊗ ν-measurable function on A × B and N : B × R → R is a
normalized Carathéodory function.
We note that the integral operator T defined by (1) is known in the literature as
a Uryson integral operator and the normalized Carathéodory function K is called
the kernel of T . The theory of such operators is widely represented in the literature
[28, 35, 36, 55].
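A discrete analogue makes the orthogonal additivity of formula (1) transparent. The sketch below is an illustration under simplifying assumptions that are not in the source: A and B are finite sets with counting measures, so that functions are lists and the integral is a finite sum, and the kernel K is a hypothetical normalized kernel chosen with integer values to keep the arithmetic exact.

```python
def K(s, t, r):
    """Hypothetical kernel: continuous in r and normalized, i.e. K(s, t, 0) = 0."""
    return (s + 1) * r * r - (t + 1) * abs(r)

def uryson(f, m):
    """Discrete form of (1): (Tf)(s) = sum over t of K(s, t, f(t)), for s = 0, ..., m - 1."""
    return [sum(K(s, t, ft) for t, ft in enumerate(f)) for s in range(m)]

def disjoint(f, g):
    """f and g are disjoint iff their supports do not meet."""
    return all(a == 0 or b == 0 for a, b in zip(f, g))

def add(f, g):
    return [a + b for a, b in zip(f, g)]

m = 3                     # number of points of A
f = [2, 0, -1, 0]         # functions on B = {0, 1, 2, 3}
g = [0, 3, 0, 0]          # disjoint from f
h = [1, 0, 0, 0]          # not disjoint from f

additive_fg = uryson(add(f, g), m) == add(uryson(f, m), uryson(g, m))
additive_fh = uryson(add(f, h), m) == add(uryson(f, m), uryson(h, m))
print(disjoint(f, g), additive_fg, disjoint(f, h), additive_fh)
```

For disjoint f and g each summand K(s, t, f(t) + g(t)) equals K(s, t, f(t)) + K(s, t, g(t)), because one of the two terms is K(s, t, 0) = 0; this is exactly where normalization is used. For the overlapping pair f, h additivity fails, since K is nonlinear in its last argument.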
Example Let (A, Σ, μ) be a finite measure space. We say that N : A × R → R
is a superpositionally measurable function, or sup-measurable for short, if
N(·, f (·)) is μ-measurable for every $f \in L_0(\mu)$. A sup-measurable function N
is said to be normalized if N(s, 0) = 0 for μ-almost all s ∈ A. Every normalized
sup-measurable function N generates an OAO $\mathcal{N} : L_0(\mu) \to L_0(\mu)$ defined by
setting $\mathcal{N}(f)(s) = N(s, f(s))$, $f \in L_0(\mu)$.
We remark that the operator $\mathcal{N}$ is known in the literature as a nonlinear
superposition operator or Nemytskii operator. This operator arises in various problems
of modern mathematics (see [9, 10, 31]).

3 The Lateral Order and Related Notions

3.1 Basic Properties

The given partial order ≤ on a vector lattice E induces another partial order $\sqsubseteq$ on
E, which was formally introduced and studied in [38].
Definition 3.1 Let E be a vector lattice. The partial order $\sqsubseteq$ on E is called the lateral
order on E. A subset G ⊆ E is said to be laterally bounded in E if $G \subseteq F_x$ for some
x ∈ E. We do not say “from above” here because every subset is automatically
laterally bounded from below by zero.
The lateral supremum and infimum with respect to the lateral order $\sqsubseteq$ on E are
denoted by the bold symbols $\boldsymbol{\bigcup}$, $\boldsymbol{\cup}$ and $\boldsymbol{\bigcap}$, $\boldsymbol{\cap}$, respectively.
Proposition 3.2 ([38, 52]) Let E be a vector lattice and e ∈ E. Then the following
assertions hold.
1. The set $F_e$ of all fragments of e is a Boolean algebra with zero 0 and unit e with
respect to the operations $\boldsymbol{\cup}$ and $\boldsymbol{\cap}$. Moreover,
$x \boldsymbol{\cup} y = (x^+ \vee y^+) - (x^- \vee y^-)$ and
$x \boldsymbol{\cap} y = (x^+ \wedge y^+) - (x^- \wedge y^-)$ for all $x, y \in F_e$.
2. Assume e ≥ 0. Then the following holds.
(a) The lateral order $\sqsubseteq$ on $F_e$ coincides with the lattice order ≤.
(b) Let a nonempty subset A of $F_e$ have a lateral supremum $a = \boldsymbol{\bigcup} A$
(respectively, a lateral infimum $a = \boldsymbol{\bigcap} A$).
(i) If y = sup A (respectively, y = inf A) exists in E then y = a.
(ii) If, moreover, E has the principal projection property then sup A
(respectively, inf A) exists in E and by (i) equals a.
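Item 1 can be explored by brute force in the simplest vector lattice. The following sketch assumes E = R^n with the coordinatewise order (an illustrative choice, as are the element e and the function names); there a fragment of e is obtained by replacing some coordinates of e by zero, and the two formulas of item 1 can be compared with the lateral order directly.

```python
from itertools import product

def is_fragment(y, x):
    """y is a fragment of x iff y is disjoint from x - y; in R^n this is coordinatewise."""
    return all(yi == 0 or xi - yi == 0 for yi, xi in zip(y, x))

def fragments(e):
    """All fragments of e: each coordinate is either 0 or the coordinate of e."""
    return [tuple(c) for c in product(*[(0, ei) if ei != 0 else (0,) for ei in e])]

def pos(v):
    return tuple(max(t, 0) for t in v)

def neg(v):
    return tuple(max(-t, 0) for t in v)

def lat_sup(x, y):
    """The formula (x+ v y+) - (x- v y-) of Proposition 3.2(1)."""
    return tuple(max(a, b) - max(c, d) for a, b, c, d in zip(pos(x), pos(y), neg(x), neg(y)))

def lat_inf(x, y):
    """The formula (x+ ^ y+) - (x- ^ y-) of Proposition 3.2(1)."""
    return tuple(min(a, b) - min(c, d) for a, b, c, d in zip(pos(x), pos(y), neg(x), neg(y)))

e = (2, -3, 0, 1)
F_e = fragments(e)
zero = (0,) * len(e)

def complement(x):
    """The complementary fragment e - x."""
    return tuple(ei - xi for ei, xi in zip(e, x))

# lat_sup(x, y) is an upper bound of x and y in the lateral order, and the least one in F_e.
sup_ok = all(
    is_fragment(x, lat_sup(x, y)) and is_fragment(y, lat_sup(x, y))
    and all(is_fragment(lat_sup(x, y), z)
            for z in F_e if is_fragment(x, z) and is_fragment(y, z))
    for x in F_e for y in F_e
)
# Boolean complements: x joined with e - x gives e, x met with e - x gives 0.
complement_ok = all(
    lat_sup(x, complement(x)) == e and lat_inf(x, complement(x)) == zero
    for x in F_e
)
print(len(F_e), sup_ok, complement_ok)
```

Since e has three nonzero coordinates, $F_e$ has 2³ elements, in agreement with its being a finite Boolean algebra with three atoms.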
We remark that there exist a vector lattice E, an element $e \in E_+$ and subsets A and
B of $F_e$ such that $\boldsymbol{\bigcup} A$ and $\boldsymbol{\bigcap} B$ exist, while sup A and inf B do not exist in E [52,
Example 1.2].
A vector lattice E is said to be
• C-complete if every nonempty laterally bounded subset G of E has a lateral
supremum $\boldsymbol{\bigcup} G \in E$;
• laterally complete if every disjoint family from $E_+$ has a supremum.
If a vector lattice E is either Dedekind complete or laterally complete then E is
C-complete [38, Corollary 5.8]. The Banach lattice C[0, 1] is a C-complete vector
lattice which is neither Dedekind complete, nor laterally complete.
The following statement is easy to prove (cf. [8, Theorem 1.49], [18]).
Proposition 3.3 Let E be a C-complete vector lattice. Then for every x ∈ E the
Boolean algebra Fx is Dedekind complete.
The lateral order is of great importance for the study of OAOs.
Proposition 3.4 Let E, F be vector lattices, e, f ∈ E with $e \sqsubseteq f$ and let T : E → F
be a positive OAO. Then T e ≤ Tf.
Proof $f = (f - e) \sqcup e$ and T (f − e) ≥ 0 imply Tf = T (f − e) + T e ≥ T e.

Definition 3.5 Let E be a vector lattice. We say that a net $(e_\alpha)_{\alpha \in A}$ in E
horizontally converges (or laterally converges, in another terminology) to an element
e ∈ E (notation $e_\alpha \xrightarrow{h} e$) if the net $(e_\alpha)_{\alpha \in A}$ order converges to e and
$e_\alpha \sqsubseteq e_\beta$ for all α, β ∈ A with α ≤ β.
By Proposition 3.3, if $e_\alpha \xrightarrow{h} e$ then $e_\alpha \sqsubseteq e$ for all α ∈ A.
Definition 3.6 Let E be a vector lattice and X either a normed space or a vector
lattice, depending on each case. An OAO T : E → X is said to be:
1. horizontally-to-norm continuous if for every net $(x_\alpha)_{\alpha \in A}$ in E and every x ∈ E
the relation $x_\alpha \xrightarrow{h} x$ implies $\|T(x_\alpha) - T(x)\| \to 0$;
2. horizontally continuous if for every net $(x_\alpha)_{\alpha \in A}$ in E and every x ∈ E the
condition $x_\alpha \xrightarrow{h} x$ implies $T(x_\alpha) \xrightarrow{h} T(x)$;
3. horizontally-to-order continuous if for every net $(x_\alpha)_{\alpha \in A}$ in E horizontally
convergent to x ∈ E the net $(T(x_\alpha))_{\alpha \in A}$ order converges to T (x).
Example ([18, Proposition 4.7]) Let (A, Σ, μ) be a finite measure space and let
N : A × R → R be a sup-measurable function. Then the Nemytskii operator
$\mathcal{N} : L_0(\mu) \to L_0(\mu)$ associated with N is horizontally-to-order continuous.
Example ([18, Proposition 4.8]) Let (A, Σ, μ), (B, Ξ, ν) be finite measure spaces,
let E be an order ideal of $L_0(\nu)$ and let $T : E \to L_0(\mu)$ be an integral Uryson
operator with a kernel K. Then T is a horizontally-to-order continuous OAO.
Some properties of horizontally-to-order continuous OAOs were studied in [22,
23, 47, 49, 51].

3.2 Lateral Ideal and Lateral Bands

Definition 3.7 Let E be a vector lattice. A subset I of E is said to be a lateral ideal
if the following hold:
1. $x \sqcup y \in I$ for every disjoint x, y ∈ I;
2. if x ∈ I then y ∈ I for all $y \in F_x$.
Example Let E be a vector lattice and I be an order ideal of E. Then I is a lateral
ideal of E.
Example Let E be a vector lattice, x ∈ E. Then Fx is a lateral ideal of E.
Example Let E, F be vector lattices and T : E → F a positive OAO. Then the
kernel ker(T ) = {y ∈ E : T (y) = 0} is a lateral ideal of E.
Theorem 3.8 ([39, Theorem 3.1]) A subset I of a vector lattice E is a lateral ideal
if and only if I is the kernel of some positive OAO T : E → F where F is a suitable
Dedekind complete vector lattice.
We recall that a net $(x_\lambda)_{\lambda \in \Lambda}$ in a vector lattice E is called order fundamental if
the net $(x_\lambda - x_{\lambda'})_{(\lambda, \lambda') \in \Lambda \times \Lambda}$ order converges to zero.
Definition 3.9 An order fundamental net $(x_\lambda)_{\lambda \in \Lambda}$ in E is called horizontally
fundamental if $x_\lambda \sqsubseteq x_{\lambda'}$ for all $\lambda, \lambda' \in \Lambda$ with $\lambda \le \lambda'$. A subset D of the vector
lattice E is called horizontally closed if every horizontally fundamental net $(x_\lambda)_{\lambda \in \Lambda}$
in D order converges to some x ∈ D. A horizontally closed lateral ideal B is said to
be a lateral band of E.
Example Every band B of a Dedekind complete vector lattice E is a lateral band
of E.
Example Let E be a Dedekind complete vector lattice. Since for every x ∈ E the
lateral ideal Fx is laterally closed (see Proposition 3.3), it follows that Fx is a lateral
band.
Obviously, the intersection of any nonempty family of lateral ideals (or lateral
bands) is a lateral ideal (respectively, a lateral band). The lateral ideal (or lateral
band) generated by a nonempty subset A of E is defined to be the intersection of
all lateral ideals (respectively, lateral bands) of E including A. For every e ∈ E the
set Fe is simultaneously the lateral ideal and lateral band generated by the singleton
{e}, and is called the principal lateral ideal and principal lateral band of E.
Remark 3.10 Let B be a lateral band of a vector lattice E and x ∈ E. Then the
set-theoretical intersection Fx ∩ B contains zero and hence is nonempty.
Proposition 3.11 Suppose E is a C-complete vector lattice, x ∈ E and B is a
lateral band of E. Then $F_x \cap B$ has a ⊑-greatest element, which we denote by
$x^B := \bigcup (F_x \cap B)$. In particular, for $B = F_y$, y ∈ E, one has $x^{F_y} = x \cap y$.
Proof The first part is a consequence of Proposition 3.3, and the second one follows
from the remark before Theorem 3.14 below.

Proposition 3.12 ([47, Lemma 3.5]) Let E be a vector lattice, x, y, z, v ∈ E and
z ⊔ v = x ⊔ y. Then there exist elements z1 , z2 , v1 , v2 ∈ E such that
(i) z = z1 ⊔ z2 ; v = v1 ⊔ v2 ;
(ii) x = z1 ⊔ v1 ; y = z2 ⊔ v2 .
Proposition 3.13 Let E be a vector lattice, B a lateral band of E, x ∈ E and
x = y ⊔ z. Then $x^B = y^B \sqcup z^B$.
Proof Since $x^B \sqsubseteq x$, by Proposition 3.12 there exists a decomposition $x^B = u \sqcup v$
where u ⊑ y and v ⊑ z. We claim that $u = y^B$ and $v = z^B$. Indeed, it is clear that
u ∈ B ∩ Fy and v ∈ B ∩ Fz . Thus $u \sqsubseteq y^B$ and $v \sqsubseteq z^B$. Assume that either $u \neq y^B$
or $v \neq z^B$. Then $x^B \sqsubseteq y^B \sqcup z^B \in B \cap F_x$ and $x^B \neq y^B \sqcup z^B$, a contradiction with
the maximality of $x^B$.

3.3 The Intersection Property

By (1) of Proposition 3.2, every finite laterally bounded subset of E has a lateral
supremum and lateral infimum. However, there is a vector lattice E and a two-point
subset {x, y} of E which (being laterally bounded from below by 0) has no lateral
infimum [38, Example 3.11]. A vector lattice E is said to have the intersection
property if every two-point subset {x, y} of E has a lateral infimum x ∩ y. Remark
that if x ∩ y exists for some x, y ∈ E then $x \cap y = \bigcup (F_x \cap F_y)$ is the ⊑-maximal
common fragment of x and y [52, Proposition 1.12]. The intersection property is a
lateral analogue of the principal projection property, see Proposition 4.9.
The following result describes the relationships between the intersection property
and some other known properties of vector lattices.
Theorem 3.14 Let E be a vector lattice.
1. If E has the principal projection property then E possesses the intersection
property. Moreover, (∀x, y ∈ E) $F_{x\cap y} = F_x \cap F_y$.
2. The C-completeness of E implies the intersection property of E.
3. The vector lattice C[0, 1] is C-complete and does not have the principal
projection property. As a consequence, the intersection property does not imply
the principal projection property.
4. There exists a vector lattice with the principal projection property which is not
C-complete. As a consequence, the intersection property does not imply the C-
completeness.
Item (1) follows from [38, Theorem 3.13]. (2) is a part of [52, Proposition 1.12].
(3) The C-completeness of C[0,1] is proved in [43, Proposition 4.2], and the fact
that C[0, 1] fails the principal projection property is well known and easily seen. (4)
A corresponding example is provided in [39, Proposition 2.5].

4 Extension of Orthogonally Additive Maps

Definition 4.1 Let E be a vector lattice and I a lateral ideal of E. We say that a
subset D of I is absolutely order bounded in I, provided

(∃y ∈ I)(∀x ∈ D) |x| ≤ |y|.

Definition 4.2 Let E, F be vector lattices and I a lateral ideal of E. A map T : I → F is said to be
• orthogonally additive provided T (x + y) = T x + T y for all disjoint elements x,
y ∈ I;
• positive provided T x ≥ 0 for every x ∈ I;
• order bounded provided T maps absolutely order bounded in I subsets of I to
order bounded subsets of F .
Theorem 4.3 ([45, Theorems 1,2]) Let E, F be vector lattices with F Dedekind
complete, I ⊆ E a lateral ideal and T : I → F a positive order bounded
orthogonally additive map. Then the map $\widetilde{T}_I : E \to F$ defined by

$\widetilde{T}_I x = \sup\{T y : y \in F_x \cap I\}$ (x ∈ E),

is a positive OAO from E to F , that is, $\widetilde{T}_I \in OA_+(E, F)$. Moreover, $\widetilde{T}_I x = T x$ for
all x ∈ I.
Now we present a refinement of [24, Theorem 3]. Given a vector lattice E, an
OAO T : E → E is said to be laterally non-expanding if T (x) ⊑ x for all x ∈ E.
Obviously, every laterally non-expanding OAO preserves disjointness. A laterally
non-expanding projection (that is, $T^2 = T$) is called a lateral retraction. A subset A
of E is called a lateral retract if A is the image of some lateral retraction T : E →
E, that is, T (E) = A. A lateral band A of E, which is a lateral retract, is called a
projection lateral band, and the lateral retraction of E onto A is called the lateral
band projection of E onto A.
Theorem 4.4 ([27, Theorem 2.6]) Let E be a vector lattice.
1. For each lateral retract A in E there is a unique lateral retraction of E onto A.
2. Every lateral retraction is horizontally continuous.
3. Every lateral retract in E is a lateral band.
The following theorem generalizes Theorem 3 of [24] and asserts, in particular,
that every lateral band in a C-complete vector lattice is a lateral retract, and hence,
a projection lateral band.
Theorem 4.5 Every lateral band B of a C-complete vector lattice E is a lateral
retract, and the function $p_B : E \to E$ defined by setting for every x ∈ E

$p_B(x) = \bigcup (F_x \cap B) = x^B$ (2)

is the lateral band projection of E onto B.

Proof By Proposition 3.11, the map $p_B$ is well defined. By Theorem 3 of [24], we
have to prove the orthogonal additivity of $p_B$ only. Fix disjoint y, z ∈ E and let
x = y ⊔ z. Then by Proposition 3.13 we have that

$p_B(x) = x^B = y^B \sqcup z^B = p_B(y) \sqcup p_B(z)$.

Hence, $p_B$ is an OAO.

The following theorem provides a special case of formula (2) for principal
lateral bands, proved, however, under the less restrictive assumption that E has the
intersection property.
Theorem 4.6 ([52, Theorem 1.6]) Let E be a vector lattice with the intersection
property. Then for every e ∈ E the function Qe : E → E defined by setting

Qe x = x ∩ e for all x ∈ E

is a lateral retraction, the image of which is the principal lateral band Fe .

To provide an interesting consequence of Theorem 4.6, we need the following
definition.
Definition 4.7 Let E be a vector lattice and x, y ∈ E. We say that x is laterally
disjoint to y and write x†y if Fx ∩ Fy = {0}. We say that two subsets H and D
of E are laterally disjoint and use the notation H †D if x†y for every x ∈ H and
y ∈ D. The laterally disjoint complement to a subset A of E is defined as follows:
A† := {x ∈ E : (∀a ∈ A) x†a}.
Observe that x ⊥ y implies x†y for all x, y ∈ E, and the converse implication is
false. However, one can show that x†y implies x ⊥ y for every laterally bounded
pair x, y ∈ E.
Now we show that the intersection property is a lateral analogue of the principal
projection property.
Definition 4.8 An element e of a vector lattice E is called a laterally projective
element provided E is decomposed into a nonlinear direct sum $E = F_e \sqcup F_e^{\dagger}$, that
is, every x ∈ E has a unique representation x = y ⊔ z, where $y \in F_e$ and $z \in F_e^{\dagger}$.
Proposition 4.9 A vector lattice E has the intersection property if and only if every
element of E is laterally projective.
Proof Let E have the intersection property and e ∈ E and x ∈ E. Then $x = Q_e x \sqcup (x - Q_e x)$, which gives the desired decomposition by Theorem 4.6.
Assume that every element of E is laterally projective and fix any x, y ∈ E.
Our goal is to show that there exists the lateral infimum x ∩ y, which is the ⊑-maximal
common fragment of x and y. Using the decomposition $E = F_y \sqcup F_y^{\dagger}$,
write x = y′ ⊔ z, where y′ ∈ Fy and z†y. Then y′ ∈ Fx ∩ Fy , and we prove that
y′ is the maximal common fragment of x and y. Assume on the contrary this is
false. Then there exists t ∈ Fx ∩ Fy such that t ⋢ y′. Hence for w := y′ ∪ t we
obtain w ∈ Fx ∩ Fy , y′ ⊑ w ⊑ x and y′ ≠ w. By the above, x = y′ ⊔ (x − y′) and
x = w ⊔ (x − w) = y′ ⊔ (w − y′) ⊔ (x − w) and therefore, x − y′ = (w − y′) ⊔ (x − w),
which yields w − y′ ⊑ x − y′ = z. On the other hand, 0 ≠ w − y′ ⊑ y, which
contradicts z†y.

5 The Order Structure on the Vector Lattice of OAOs

5.1 Order Calculus and Riesz-Kantorovich Type Formulas

Two fundamental results in this direction were obtained by Mazón and Segura de
León in 1990.
Theorem 5.1 ([35, Theorem 3.2]) Let E and F be vector lattices with F Dedekind
complete. Then U(E, F ) is a Dedekind complete vector lattice. Moreover, for each
S, T ∈ U(E, F ) and x ∈ E the following conditions hold:
1. (T ∨ S)(x) = sup{T (y) + S(z) : x = y ⊔ z}.
2. (T ∧ S)(x) = inf{T (y) + S(z) : x = y ⊔ z}.
3. $T^+(x) = \sup\{T y : y \sqsubseteq x\}$.
4. $T^-(x) = -\inf\{T y : y \sqsubseteq x\}$.
5. |T (x)| ≤ |T |(x).
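The formulas of Theorem 5.1 can be evaluated by enumeration when E is finite-dimensional. The following sketch is an illustration under assumptions not in the source: E = ℝ² with the coordinatewise order, F = ℝ, and two hypothetical operators `T` and `S` that are sums of coordinate functions vanishing at 0 (hence orthogonally additive). The helpers `decompositions`, `join`, `meet` and `positive_part` are names chosen here.

```python
from itertools import product


def decompositions(x):
    """All pairs (y, z) of disjoint fragments of x with x = y + z."""
    for mask in product((False, True), repeat=len(x)):
        y = tuple(v if m else 0 for v, m in zip(x, mask))
        z = tuple(0 if m else v for v, m in zip(x, mask))
        yield y, z


def join(T, S, x):
    """(T v S)(x) = sup{T(y) + S(z) : x = y ⊔ z}; a max, since E is finite-dimensional."""
    return max(T(y) + S(z) for y, z in decompositions(x))


def meet(T, S, x):
    """(T ^ S)(x) = inf{T(y) + S(z) : x = y ⊔ z}."""
    return min(T(y) + S(z) for y, z in decompositions(x))


def positive_part(T, x):
    """T^+(x) = sup{T(y) : y is a fragment of x}."""
    return max(T(y) for y, _ in decompositions(x))


def T(x):
    # Hypothetical OAO on R^2: t1(a) = a**2, t2(b) = -|b|.
    return x[0] ** 2 - abs(x[1])


def S(x):
    # Hypothetical OAO on R^2: s1(a) = |a|, s2(b) = 2*b.
    return abs(x[0]) + 2 * x[1]


x = (3, -2)
print(join(T, S, x), meet(T, S, x), positive_part(T, x))
```

For operators of this coordinatewise form the results agree with taking the maximum (respectively minimum) of the coordinate functions separately, for instance max(9, 3) + max(−2, −4) for the join at x = (3, −2).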
The second one represents the lattice operations on U(E, F ) in terms of directed
systems.
Theorem 5.2 ([36, Lemma 3.2]) Let E and F be vector lattices with F Dedekind
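complete. Then for all T , S ∈ U(E, F ) and x ∈ E one has
1. $\left\{\sum_{i=1}^{n} T(y_i) \wedge S(y_i) : x = \bigsqcup_{i=1}^{n} y_i;\ n \in \mathbb{N}\right\} \downarrow (S \wedge T)(x)$.
2. $\left\{\sum_{i=1}^{n} T(y_i) \vee S(y_i) : x = \bigsqcup_{i=1}^{n} y_i;\ n \in \mathbb{N}\right\} \uparrow (S \vee T)(x)$.
3. $\left\{\sum_{i=1}^{n} |T(y_i)| : x = \bigsqcup_{i=1}^{n} y_i;\ n \in \mathbb{N}\right\} \uparrow |T|(x)$.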
Similar results, obtained by the first named author and Ramdane in 2018, concern
a wider class P(E, F ) of C-bounded OAOs.
Theorem 5.3 ([47, Theorem 3.6]) Let E and F be vector lattices with F
Dedekind complete. Then P(E, F ) is a Dedekind complete vector lattice. Moreover,
P(E, F ) = OAr (E, F ) and for all S, T ∈ P(E, F ) and x ∈ E conditions 1–5
from Theorem 5.1 hold.
The following proposition strengthens the inclusion U(E, F ) ⊂ P(E, F ).
Proposition 5.4 ([47, Proposition 3.7]) Let E, F be vector lattices with F
Dedekind complete. Then U(E, F ) is an order ideal of P(E, F ).
The next example shows that U(E, F ) need not be a band in P(E, F ).
Example Let E = F = ℝ and let the operator T ∈ P(ℝ) be defined by

$T(x) = \begin{cases} \dfrac{1}{|x|}, & \text{if } x \neq 0, \\ 0, & \text{if } x = 0. \end{cases}$

It is clear that T is a C-bounded OAO and T ∉ U(ℝ). Let

$T_n(x) = \begin{cases} T x, & \text{if } T x \le n, \\ 0, & \text{if } T x > n. \end{cases}$

It is not difficult to check that $T_n \in U(\mathbb{R})$ for every n ∈ ℕ and $T_n \uparrow T$.
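A quick numerical look at this example, given as a sketch on a finite sample grid rather than a proof: the functions `T` and `T_n` follow the definitions above, while the grid of sample points in the order interval [−1, 1] is an assumption made for illustration.

```python
def T(x):
    """T(x) = 1/|x| for x != 0 and T(0) = 0."""
    return 0.0 if x == 0 else 1.0 / abs(x)


def T_n(n, x):
    """Truncation: keep T(x) where it does not exceed n, otherwise return 0."""
    value = T(x)
    return value if value <= n else 0.0


# Sample points of the order interval [-1, 1] (illustrative grid only).
grid = [k / 1000 for k in range(-1000, 1001)]

sup_T = max(T(x) for x in grid)        # grows without bound as the grid is refined
sup_T5 = max(T_n(5, x) for x in grid)  # never exceeds 5, whatever the grid
print(sup_T, sup_T5)
```

Each T_n is bounded by n on every order interval, which is why it is order bounded, whereas T is unbounded near 0; for a fixed x the value T_n(x) is nondecreasing in n and equals T(x) as soon as n ≥ T(x).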

We remark that, without the assumption of Dedekind completeness of F , the
vector spaces U(E, F ) and P(E, F ) are, in general, not vector lattices. Nevertheless,
for any S ∈ P(E, F ) and T ∈ U(E, F ) the relation 0 ≤ S ≤ T implies
the inclusion S ∈ U(E, F ). Indeed, take an order bounded subset D of E and let
z ∈ F+ be an upper bound of the set T (D). Then 0 ≤ Sx ≤ T x ≤ z for any x ∈ D, and hence
S ∈ U(E, F ).
The set of all horizontally-to-order continuous (σ -continuous) C-bounded OAOs
is denoted by Pc (E, F ) (Pσ c (E, F )).
Proposition 5.5 Let E, F be vector lattices. Then every horizontally-to-order
continuous OAO T : E → F is C-bounded.
Proof By Proposition 3.2, for every x ∈ E the set Fx is directed with respect to the
partial order ⊑. Consider Λ = Fx × Fx as a directed set with the lexicographical
order. Define a net $(x_{(u,v)})_{(u,v)\in\Lambda}$ by setting $x_{(u,v)} = u \cap v$ for all (u, v) ∈ Λ.
Note that $x_{(u,v)} \xrightarrow{h} x$. Moreover, for every $u_0 \in F_x$ one has $(x_{(u_0,v)})_{v\in F_x} \xrightarrow{h} u_0$ and
for every $v_0 \in F_x$, $(x_{(u,v_0)})_{u\in F_x} \xrightarrow{h} v_0$. By the horizontal-to-order continuity
of T , there exists a net $(e_{(u,v)})_{(u,v)\in\Lambda} \subset F_+$ with the same index set Λ such that
$|T x - T x_{(u,v)}| \le e_{(u,v)} \le e_{(u_0,v_0)}$ for all $(u, v) \ge (u_0, v_0)$. Given any v ∈ Fx , for
$u := u_0 \cup v$ one has $u_0 \sqsubseteq u$ and v ⊑ u. Then $x_{(u,v)} = u \cap v = v$ and we may write
$|T x - T v| = |T x - T x_{(u,v)}| \le e_{(u_0,v_0)}$ and hence $|T v| \le e_{(u_0,v_0)} + |T x|$. Thus,
T (Fx ) is order bounded in F .

Theorem 5.6 ([47, Theorem 3.13]) Let E, F be vector lattices with F Dedekind
complete. Then Pc (E, F ) and Pσ c (E, F ) are bands in the vector lattice P(E, F ).

5.2 The Boolean Algebra of Fragments of a Positive OAO

Let E, F be vector lattices with F Dedekind complete and T ∈ U+ (E, F ). The
purpose of this section is to describe the set of fragments of T , that is,

FT = {S ∈ U+ (E, F ) : S ∧ (T − S) = 0}.

First we consider elementary fragments. For a subset A of a vector lattice W we
use the following notation: A↑ = {x ∈ W : ∃ a net (xα ) in A with xα ↑ x}. The
meaning of A↓ is analogous. As usual, we also write A↓↑ = (A↓ )↑ . It is clear
that A↓↓ = A↓ , A↑↑ = A↑ . Since FT is a Boolean algebra, it is closed under
finite suprema and infima. In particular, all “ups and downs” of FT are likewise
closed under finite suprema and infima, and therefore are also directed upward and,
respectively, downward.
Let T ∈ U+ (E, F ) and D ⊆ E be a lateral ideal. Then for every x ∈ E the
following formula defines a map $\pi^D T : E \to F_+$:

$\pi^D T(x) = \sup\{T y : y \in F_x \cap D\}$. (3)

Proposition 5.7 ([11, Lemma 3.6]) Let E, F be vector lattices with F Dedekind
complete, ρ ∈ B(F ), T ∈ U+ (E, F ) and D be a lateral ideal. Then $\pi^D T$ is a
positive abstract Uryson operator and $\rho\pi^D T \in F_T$.
If D = Fx then the operator $\pi^D T$ is denoted by $\pi^x T$. Let F be a vector lattice.
Any fragment of the form $\sum_{i=1}^{n} \rho_i \pi^{x_i} T$, n ∈ N, where ρ1 , . . . , ρn is a finite family
of mutually disjoint order projections of F , is called an elementary fragment of T .
We denote the set of all elementary fragments of T by AT . The following theorem
describes the structure of FT for a positive abstract Uryson operator T .
Theorem 5.8 ([42, Theorem 3.12]) Let E, F be vector lattices, F Dedekind
complete and T ∈ U+ (E, F ). Then $F_T = A_T^{\uparrow\downarrow\uparrow}$.
Remark that, for linear positive operators a similar theorem and its modifications
were proved by de Pagter, Aliprantis and Burkinshaw, Kusraev and Strizhevski, see
[7, 13, 29].

6 Compact Orthogonally Additive Operators

In this section we study C-compact and AM-compact OAOs taking values in Banach lattices.

6.1 The Projection Band of C-Compact Orthogonally Additive Operators

In this subsection, following [43] we show that the set of all C-compact regular
OAOs from a vector lattice E to a Banach lattice F with an order continuous norm
is a band in the vector lattice of all OAOs from E to F .
Definition 6.1 Let E be a vector lattice and Y a normed space. An OAO T : E → Y
is said to be C-compact if T (Fx ) is relatively compact in Y for all x ∈ E. For a
Banach lattice F , by COAr (E, F ) we denote the set of all C-compact regular OAOs
from E to F .
Example We note that OAr (ℝ, ℝ) is exactly the set of all real-valued functions f on ℝ
such that f (0) = 0. Define an OAO T : ℝ → ℝ by

$T(x) = \begin{cases} \dfrac{1}{x^2}, & \text{if } x \neq 0, \\ 0, & \text{if } x = 0. \end{cases}$

Since any element 0 ≠ x ∈ ℝ is an atom, one has Fx = {0, x} for any x ∈ ℝ. It
follows that T is C-compact. On the other hand, T ([0, 1]) is an unbounded set in ℝ
and therefore T is not AM-compact.
Let (A, Σ, μ) be a σ-finite complete measure space. A Banach space E which
is a linear subspace of L0 (μ) is called a Banach function space, provided for every
x ∈ L0 (μ) and y ∈ E the condition |x| ≤ |y| implies that x ∈ E and ‖x‖ ≤ ‖y‖.
Obviously, a Banach function space is a Banach lattice.
Proposition 6.2 Let E be a Banach function space on a σ-finite measure space
(B, Σ, ν) and T : E → ℝ the Uryson integral functional defined by

$T f = \int_B K(t, f(t))\, d\nu(t)$, f ∈ E,

with a kernel K. Then T is C-compact.

Proof Given any f ∈ E, one has $F_f = \{f 1_D : D \in \Sigma\}$. Then for every D ∈ Σ

$T(f 1_D) = \int_B K(t, f 1_D(t))\, d\nu(t) = \int_D K(t, f(t))\, d\nu(t) \le \int_B |K(t, f(t))|\, d\nu(t) = M.$

Similarly, $T(f 1_D) \ge -M$. Hence the set T (Ff ) is order bounded in ℝ and therefore T is C-compact.

We mention that a C-compact order bounded OAO T : E → F from a Banach
lattice E to a σ -Dedekind complete Banach lattice F is AM-compact if, in addition,
T is uniformly continuous on order bounded subsets of E [36, Theorem 3.4].
Recall that, a Banach lattice with an order continuous norm is Dedekind complete
(see [8, Theorem 12.9]).
Next is the main result of the subsection, which we provide with a proof.
Theorem 6.3 ([43, Theorem 3.9]) Let E be a vector lattice and F a Banach lattice
with an order continuous norm. Then the set of all C-compact regular OAOs from
E to F is a projection band in OAr (E, F ).

In our proof we use the following lemma.

Lemma 6.4 ([43, Lemma 2.2]) Let T ∈ COAr (E, F ) under the assumptions of
Theorem 6.3. Then FT ⊂ COAr (E, F ).
Proof of Theorem 6.3 We prove some properties of COAr (E, F ):
(a) Clearly, COAr (E, F ) is a vector subspace of OAr (E, F ).
(b) We show that COAr (E, F ) is even a vector sublattice of OAr (E, F ). Take
S, T ∈ COAr (E, F ). Then T − S ∈ COAr (E, F ). By Lemma 6.4 FT ⊂
COAr (E, F ) and hence T+ ∈ COAr (E, F ). Therefore due to the equalities

$S + (T - S)^+ = S + ((T - S) \vee 0) = T \vee S$, $S \wedge T = -((-S) \vee (-T))$

which are valid in OAr (E, F ) we obtain that COAr (E, F ) is a sublattice of
OAr (E, F ).
(c) Now we show that if 0 ≤ Tλ ↑ T in OAr (E, F ) and every Tλ ∈ COAr (E, F ),
then T ∈ COAr (E, F ). Indeed, take x ∈ E and ε > 0. Since the Banach lattice
F is order continuous, it follows from Tλ x ↑ T x that $\|T x - T_{\lambda_0} x\| < \varepsilon/4$ for
some λ0 . We claim that, moreover, $\|T y - T_{\lambda_0} y\| < \varepsilon/4$ for any y ∈ Fx . Indeed,
consider x = y ⊔ z for some z ∈ E. Then

$0 \le T y - T_{\lambda_0} y \le T y - T_{\lambda_0} y + T z - T_{\lambda_0} z = T x - T_{\lambda_0} x$

implies $\|T y - T_{\lambda_0} y\| \le \|T x - T_{\lambda_0} x\|$. Since $T_{\lambda_0}$ ∈ COAr (E, F ), there exists
a finite subset D of Fx with the property that for any y ∈ Fx there exists u ∈ D
satisfying

$\|T_{\lambda_0} u - T_{\lambda_0} y\| < \varepsilon/2$.

So we obtain $\|T u - T y\| \le \|T u - T_{\lambda_0} u\| + \|T y - T_{\lambda_0} y\| + \|T_{\lambda_0} u - T_{\lambda_0} y\| < \varepsilon$,
which establishes the relative compactness of T (Fx ) in F .
(d) Finally we prove that COAr (E, F ) is an order ideal in OAr (E, F ). Let 0 ≤
R ≤ T , where R ∈ OAr (E, F ) and T ∈ COAr (E, F ). Then R ∈ IT and,
by Freudenthal’s spectral theorem [8, Theorem 2.8], there exists a sequence
(Sn )n∈N in OAr (E, F ) of T -step-functions with 0 ≤ Sn ↑ R. Taking into
account that Sn ∈ COAr (E, F ) for all n ∈ N (this follows from the fact that,
together with T , each fragment Ti of T belongs to COAr (E, F )) and what has been
established in (c), we deduce that R ∈ COAr (E, F ). So, COAr (E, F ) is a band in
OAr (E, F ).
(e) Due to the Dedekind completeness of OAr (E, F ), it is a projection band.

6.2 Domination Problem for AM-Compact Abstract Uryson Operator

In this section, we consider the following domination problem for AM-compact
abstract Uryson operators. Let E, F be vector lattices and S, T : E → F be OAOs
with 0 ≤ S ≤ T . Let P be some property of OAOs, so P(R) means that an OAO
R : E → F possesses P. Does P(T ) imply P(S)?
Definition 6.5 Let E be a vector lattice and Y a normed space. An OAO T : E → Y
is said to be AM-compact provided T maps order bounded subsets of E to relatively
compact subsets of Y .
Example ([36, Theorem 3.5]) Let E, F be Banach function spaces with F having
an order continuous norm. Then every integral Uryson operator T ∈ U(E, F ) is
AM-compact.
The next theorem is the main result of the subsection.
Theorem 6.6 ([40, Theorem 3.19]) Let E be a Dedekind complete vector lattice,
F a Banach lattice with an order continuous norm, and T ∈ U+ (E, F ) an AM-
compact operator. Then every operator S ∈ U+ (E, F ) with 0 ≤ S ≤ T is AM-
compact.
Remark that the same property for linear operators was proved earlier by Dodds
and Fremlin in [14].
Taking into account that every integral Uryson operator T : E → F is
AM-compact (see [36, Theorem 3.5]), we obtain the following consequence of
Theorem 6.6.
Corollary 6.7 ([40, Corollary 3.21]) Let E, F be Banach function spaces with
F having an order continuous norm and T ∈ U+ (E, F ) be an integral Uryson
operator. Then every abstract Uryson operator S ∈ U(E, F ) such that 0 ≤ S ≤ T
is AM-compact.

7 Partial Order Continuities of Orthogonally Additive Operators

Throughout this section, let E, F be vector lattices with F Dedekind complete. Here
we discuss the relationships between the order continuity of an abstract Uryson
operator T : E → F and its modulus |T |, as well as some partial order continuities,
like horizontal-to-order and uniformly-to-order ones. Similar questions for linear
operators are simpler, see the next proposition.

Proposition 7.1 ([35, Proposition 3.9]) Let E be a vector lattice with the principal
projection property, F a Dedekind complete vector lattice and S : E → F a regular
linear operator. Then the following assertions hold:
1. if S is horizontally-to-order continuous then S is order continuous;
2. if S is horizontally-to-order σ -continuous then S is order σ -continuous.
This is no longer true for OAOs, as the following example shows.
Example Let 0 ≤ p ≤ ∞. There exists a horizontally-to-order continuous
orthogonally additive functional f : Lp → ℝ which is not order continuous.
Moreover, f is not uniformly-to-order continuous.
Define a function ϕ : ℝ → [−1, 1] by setting ϕ(t) = t for |t| ≤ 1 and ϕ(t) = 0
for |t| > 1. Then define f : Lp → ℝ by setting

$f(x) = \int_{[0,1]} \varphi(x(t))\, d\mu(t)$.

A detailed proof that f has the desired properties can be found in [22, Example 2.1].
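The failure of order continuity can be seen on constant functions. The sketch below is a discretized illustration, not the argument of [22]: it assumes that elements of Lp are replaced by step functions on m equal cells of [0, 1], stored as Python lists, so that the integral becomes an average. The names `phi` and `f` mirror the example; everything else is hypothetical.

```python
def phi(t):
    """phi(t) = t for |t| <= 1 and phi(t) = 0 for |t| > 1."""
    return t if abs(t) <= 1 else 0.0


def f(x):
    """Discretized functional: average of phi over the m cells of a step function x."""
    return sum(phi(v) for v in x) / len(x)


# Orthogonal additivity: x and y have disjoint supports, and phi(0) = 0.
x = [0.5, 0.0, 0.0, -0.25]
y = [0.0, 2.0, 0.75, 0.0]
x_plus_y = [a + b for a, b in zip(x, y)]
print(f(x_plus_y), f(x) + f(y))

# Order continuity fails: the constants 1 + 1/n decrease to the constant 1,
# yet f vanishes on every term of the sequence while f(1) = 1.
values_along_sequence = [f([1 + 1 / n] * 4) for n in range(1, 50)]
value_at_limit = f([1.0] * 4)
print(set(values_along_sequence), value_at_limit)
```

The jump of ϕ at |t| = 1 is what breaks order continuity here, while cutting a function down to a fragment never moves values across that threshold.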
We say that a function f : E → F is uniformly-to-order continuous if for every
net (xα ) in E and every x ∈ E the condition $x_\alpha \Rightarrow x$ implies $f(x_\alpha) \xrightarrow{o} f(x)$.
Assume T ∈ U(E, F ). Then the function $\widetilde{T} : E \to F$ defined by setting

$\widetilde{T}(x) = \sup_{|y| \le |x|} |T|(y)$, x ∈ E (4)

is a positive abstract Uryson operator (that is, $\widetilde{T} \in U_+(E, F)$) [35, Proposition 3.4].
Following [52], we say that $\widetilde{T}$ is the envelope of T . Remark that the envelope has
the following properties (see Propositions 3.4 and 3.5 of [52] for details).
Proposition 7.2 Let E, F be vector lattices with F Dedekind complete and S, T ∈
U(E, F ). Then
1. $T(x) \le \widetilde{T}(x)$ for all x ∈ E;
2. if x ≤ y for x, y ∈ E then $\widetilde{T}(x) \le \widetilde{T}(y)$;
3. if 0 ≤ S ≤ T then $\widetilde{S}(x) \le \widetilde{T}(x)$ for all x ∈ E;
4. $(S + T)(x) \le \widetilde{S}(x) + \widetilde{T}(x)$ for all x ∈ E;
5. if, moreover, E has the principal projection property then $\widetilde{\widetilde{T}} = \widetilde{T}$.
Theorem 7.3 ([22, Theorem 2.2]) Let E be a vector lattice with the principal
projection property, F a Dedekind complete vector lattice and T ∈ U(E, F ).
Consider the following statements.
(i) T is order continuous.
(ii) |T | is order continuous.
(iii) The envelope $\widetilde{T}$ of T is horizontally-to-order continuous.

Then the following assertions hold.

(A) (i) and (ii) imply (iii).

(B) Suppose in addition that E has the horizontal Egorov property and (iii) holds.
Then the uniformly-to-order continuity of T implies (i), and the uniformly-to-
order continuity of |T | yields (ii).
Remark that conditions (i) and (ii) are equivalent for linear operators [8,
Theorem 1.56]; however, for OAOs this is not true, see the next example.
Example Set E = F = ℝ and define an order bounded orthogonally additive
functional f : E → F by setting

$f(x) = \begin{cases} 0 & \text{for } -\infty < x \le 0, \\ x & \text{for } 0 < x \le 1, \\ -1 & \text{for } 1 < x < +\infty. \end{cases}$

Then

$|f|(x) = \begin{cases} 0 & \text{for } -\infty < x \le 0, \\ x & \text{for } 0 < x \le 1, \\ 1 & \text{for } 1 < x < +\infty. \end{cases}$

Obviously, |f | is order continuous, however f is not.
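The stated formula for |f| can be recomputed from items 3 and 4 of Theorem 5.1. The following sketch assumes E = F = ℝ, where every x has only the fragments 0 and x, and uses |f| = f⁺ + f⁻; the helper name `modulus` is hypothetical.

```python
def f(x):
    """The functional of the example: 0 on (-inf, 0], x on (0, 1], -1 on (1, +inf)."""
    if x <= 0:
        return 0.0
    if x <= 1:
        return x
    return -1.0


def modulus(g, x):
    """|g|(x) = g^+(x) + g^-(x), with sup and inf taken over the fragments {0, x} of x."""
    frags = (0.0, x)
    positive_part = max(g(y) for y in frags)
    negative_part = -min(g(y) for y in frags)
    return positive_part + negative_part


for point in (-3.0, 0.5, 1.0, 1.0001, 2.0):
    print(point, f(point), modulus(f, point))
```

Across x = 1 the values of f jump from 1 to −1, while the computed modulus stays equal to 1, in line with the example.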

Limited space does not allow us to present all the results of [22]. We just
remark that not everything is yet clear in this direction; see the open problems in the
final section.

8 Narrow OAOs and Representation of Regular Operators

8.1 Narrow Operators

Narrow linear operators were introduced and studied in 1990 by Plichko and the
second named author in [41] as a generalization of compact operators defined on
symmetric function spaces. In fact, these operators had been investigated by different
mathematicians earlier. In 2009 narrow operators were naturally generalized
to linear operators defined on vector lattices in [34] (see also [53] and references
therein). After the monograph [53] was published, the notion was (no less naturally)
generalized to OAOs in a paper by the authors [44], and then developed in some other
papers (see e.g., [21]). A recent paper by the authors [46] contains a representation
theorem for regular operators (two versions for both linear and orthogonally additive
settings) which generalizes a number of known results in this direction.
Definition 8.1 An OAO T : E → F between vector lattices E, F is said to be
order narrow if for every e ∈ E there is a net of decompositions $e = e'_\alpha \sqcup e''_\alpha$ such
that the net $(T(e'_\alpha) - T(e''_\alpha))_\alpha$ order converges to zero. An OAO T : E → G from a
vector lattice E to a Banach space G is called narrow if for every e ∈ E and every
ε > 0 there is a decomposition e = e′ ⊔ e″ such that ‖T (e′) − T (e″)‖ < ε. An OAO
T : E → V from a vector lattice E to a linear space V is called strictly narrow if
for every e ∈ E there is a decomposition e = e′ ⊔ e″ such that T (e′) = T (e″).
Obviously, every strictly narrow operator is both narrow and order narrow
for suitable range lattices. Let us briefly demonstrate why every “small”
operator is strictly narrow. Let E be a Banach function space on [0, 1]. Then every
horizontally-to-norm continuous OAO f : E → ℝ is strictly narrow. Indeed,
given any e ∈ E, the function ϕ : [0, 1] → ℝ defined by $\varphi(t) = f(e \cdot 1_{[0,t]})$ is
continuous, ϕ(0) = 0 and ϕ(1) = f (e). Choosing s ∈ [0, 1] so that ϕ(s) = f (e)/2,
we obtain $e = e \cdot 1_{[0,s]} \sqcup e \cdot 1_{(s,1]}$ and

$f(e \cdot 1_{[0,s]}) = \dfrac{f(e)}{2} = f(e \cdot 1_{(s,1]})$.
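The intermediate value argument above can be carried out numerically. The sketch below makes several assumptions that are not in the source: the functional is the hypothetical choice f(x) = ∫ sin(x(t)) dt (orthogonally additive because sin 0 = 0), the element is e(t) = 1 + t, the integral is approximated by a midpoint rule, and the point s is located by bisection. The names `f_on_interval` and `halving_point` are invented for this illustration.

```python
import math


def e(t):
    """Hypothetical element of the function space: e(t) = 1 + t on [0, 1]."""
    return 1.0 + t


def f_on_interval(e, a, b, steps=2000):
    """Midpoint-rule value of the integral of sin(e(t)) over [a, b], i.e. f(e * 1_[a,b])."""
    if b <= a:
        return 0.0
    h = (b - a) / steps
    return h * sum(math.sin(e(a + (k + 0.5) * h)) for k in range(steps))


def halving_point(e, tol=1e-10):
    """Bisection for s in [0, 1] with phi(s) = f(e)/2, where phi(t) = f(e * 1_[0,t])."""
    target = f_on_interval(e, 0.0, 1.0) / 2

    def g(s):
        return f_on_interval(e, 0.0, s) - target

    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:   # sign change on [lo, mid]
            hi = mid
        else:                     # otherwise the sign change is on [mid, hi]
            lo = mid
    return (lo + hi) / 2


s = halving_point(e)
left = f_on_interval(e, 0.0, s)    # f(e * 1_[0,s])
right = f_on_interval(e, s, 1.0)   # f(e * 1_(s,1])
print(s, left, right)
```

Bisection only needs the continuity of ϕ and the sign change of ϕ − f(e)/2 between 0 and 1, which is exactly what the argument in the text uses; the two printed values agree up to quadrature error.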
Theorem 8.2 ([53, Proposition 2.1]) If a Banach function space E on a finite
measure space (Ω, Σ, μ) has an absolutely continuous norm on the unit (i.e.,
$\lim_{\mu(A)\to 0} \|1_A\| = 0$) and X is a Banach space then every compact and every
AM-compact linear operator T : E → X is narrow.
A strictly narrow operator need not have “small” range and can be “very non-compact”:
under the same assumptions on the measure space (Ω, Σ, μ), for every rearrangement
invariant Banach space E on (Ω, Σ, μ) there exists a strictly narrow linear projection of
E onto a subspace E0 which is isometrically isomorphic to E [53, Theorem 4.17].
The assumption on E to have an absolutely continuous norm on the unit is essential:
there are nonnarrow continuous linear functionals on L∞ [53, Example 10.12].
The following theorem extends Theorem 8.2 to OAOs.
Theorem 8.3 ([44, Theorem 3.2]) Let E be an atomless Dedekind complete vector
lattice and X a Banach space. Then every orthogonally additive horizontally-to-
norm continuous C-compact operator T : E → X is narrow.

8.2 Representation of Regular Operators

In this subsection, we present some recent (unpublished) results of the authors [46]
which generalize known representation theorems by different authors: Kwapień
[30] and Kalton’s representation theorem for continuous linear operators on Lp (μ)
for 0 ≤ p ≤ 1 [26], Rosenthal’s version of the same theorem for operators on
L1 [0, 1] [54], Weis’ representation theorem for order continuous linear operators on
function vector lattices [58], Huijsmans and de Pagter’s theorem on representation
of regular linear operators on vector lattices [25], O. Maslyuchenko, Mykhaylyuk
and the second named author’s representation theorem of order continuous linear
operators on vector lattices [34] and the authors’ representation of order bounded
OAOs [44]. The main contribution of the result under presentation is getting rid of
the order continuity assumption on an operator and the extension to OAOs. The earliest
of the above-mentioned results are formulated in the language of measures
and integral operators on suitable function spaces, and the later ones use lattice
terminology, which is shorter and allows obtaining more general results. The general
idea of all representation theorems is to split the vector lattice X(E, F ) of all
operators T : E → F from a certain class into a direct sum of orthogonal bands
X(E, F ) = Y (E, F ) ⊕ Z(E, F ), where operators from Y (E, F ) are in some sense
atomic and operators from Z(E, F ) are kind of continuous, which then yields a
unique desirable representation of every operator T = Ta + Tc .
Any representation theorem for linear operators can be stated as follows: let
H(E, F ) be the band generated by all lattice homomorphisms (in other words, by all
disjointness preserving operators) from E to F , and D(E, F ) = H(E, F )d be the
disjoint complement of H(E, F ) in Lb (E, F ). Then, by the Dedekind completeness
of F , we obtain the following decomposition of Lb (E, F ) into orthogonal bands

Lb (E, F ) = H(E, F ) ⊕ D(E, F ). (5)

Now any representation theorem is reduced to a characterization of the operators
which belong to the summands of (5). One convenient characterization of operators
from the first summand [53, Theorem 1.33] asserts: for every T ∈ Lb (E, F )
the relation T ∈ H(E, F ) holds if and only if T is the sum $T = \sum_{j\in J} T_j$ of an absolutely
order summable family of disjointness preserving operators Tj ∈ Lb (E, F ).
The very effective characterizations of the summands of (5) concern the case
E = F = L1 = L1 [0, 1] and were obtained by Kalton [26] and Rosenthal [54] due
to the fact that the set L(L1 ) of all continuous linear operators on L1 coincides with
Lb (L1 ). Representation (5) for this particular case has the following form

L(L1 ) = H(L1 ) ⊕ D(L1 ). (6)

The next two combined theorems by Enflo, Kalton, Rosenthal and Starbird
characterize summands of (6) (cf. theorems 7.38 and 7.39 in [53]).
Theorem 8.4 ([54, Theorem 3.2]) For every operator T ∈ L(L1 ) the following
assertions are equivalent:
1. T ∈ H(L1 );
2. T equals a pointwise absolutely convergent (strong ℓ1-convergent, in Rosenthal’s
original terminology) series $T = \sum_{n=1}^{\infty} T_n$ of disjointness
preserving operators Tn ∈ L(L1 ).
Moreover, every nonzero operator T ∈ H(L1 ) possesses the following property: for
every ε > 0 there exists a measurable subset A of [0, 1] such that the restriction
$T|_{L_1(A)}$ is an into isomorphism with
(i) $\|T|_{L_1(A)}\| \ge \|T\| - \varepsilon$;
(ii) $\|T|_{L_1(A)}\| \cdot \|(T|_{L_1(A)})^{-1}\| < 1 + \varepsilon$.
Before stating the second theorem, we give the definition of the Enflo-Starbird
function for OAOs, which is the same as for linear operators.
Definition 8.5 Let E, F be vector lattices with F Dedekind complete and T ∈
OAr (E, F ). We define a function λT : E+ → F+ , called the Enflo-Starbird
function of T , by setting for all x ∈ E+

$\lambda_T(x) = \inf\left\{\sup_{1\le i\le m} |T| x_i : x = \bigsqcup_{i=1}^{m} x_i,\ x_i \in E_+,\ m \in \mathbb{N}\right\}$.
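For intuition, the infimum in Definition 8.5 can be computed by enumeration in a purely atomic setting. The sketch below is an illustration under assumptions not in the source: E = ℝⁿ with the coordinatewise order, F = ℝ, and a positive linear functional T(x) = Σ a_j x_j with a_j ≥ 0, so that |T| = T and disjoint decompositions of x ≥ 0 correspond to partitions of its support. The names `set_partitions` and `enflo_starbird` are hypothetical.

```python
def set_partitions(items):
    """Yield every partition of the list `items` into nonempty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        # Either `first` forms a block of its own ...
        yield [[first]] + part
        # ... or it joins one of the existing blocks.
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]


def enflo_starbird(a, x):
    """lambda_T(x) for T(x) = sum_j a[j] * x[j], with a[j] >= 0 and x >= 0 in R^n."""
    support = [j for j, v in enumerate(x) if v != 0]
    if not support:
        return 0
    best = None
    for part in set_partitions(support):
        # sup over the pieces x_i of |T| x_i, one piece per block of the partition
        worst_block = max(sum(a[j] * x[j] for j in block) for block in part)
        best = worst_block if best is None else min(best, worst_block)
    return best


print(enflo_starbird((1, 2, 3), (1, 1, 1)))
print(enflo_starbird((1, 1), (2, 5)))
```

In this atomic model the infimum is attained at the finest partition and equals the largest single term a_j x_j, so λ_T does not vanish unless T kills every atom in the support of x; on atomless lattices the pieces can be made arbitrarily fine, which is what allows λ_T = 0.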

Theorem 8.6 ([26, 54]) For every T ∈ L(L1 ) the following assertions are
equivalent:
1. T ∈ D(L1 );
2. T is narrow;
3. the Enflo-Starbird function λT of T equals zero;
4. for every measurable subset A of [0, 1] the restriction $T|_{L_1(A)}$ is not an into
isomorphism.
Implication (4) ⇒ (3) and the technique of the λ-function are due to Enflo and Starbird
[17]; equivalences (1) ⇔ (3) ⇔ (4) can be deduced from Kalton’s results [26], and
the final equivalence (1) ⇔ (2), as well as the statement of the entire theorem, can be
found in Rosenthal’s paper [54].
In view of Theorem 8.4, elements of H(E, F ) we call pseudo-embeddings, and
following terminology of Weis [58] and Huijsmans-de Pagter [25], elements of
D(E, F ) we call diffuse operators. Using this terminology, we provide below one
of the main results of [34] generalizing Theorem 8.6 to vector lattices (see also [53,
Theorem 10.40]).
Theorem 8.7 ([34]) Let E, F be Dedekind complete vector lattices such that E
is atomless and F is an ideal of some order continuous Banach lattice. Then for
every regular order continuous operator T : E → F the following assertions are
equivalent:
1. T ∈ D(E, F );
2. T is order narrow;
3. the Enflo-Starbird λ-function of T is zero: λT = 0.
Hence, each regular order continuous operator T : E → F is uniquely
represented in the form T = Ta + Tc where Ta is a sum of an order absolutely
summable family of disjointness preserving order continuous operators and Tc is an
order continuous order narrow operator.
The main contribution of Theorem 8.7 was the equivalence (1) ⇔ (2), because
the equivalence (1) ⇔ (3) follows easily (even without the assumption of order
continuity on T ) from the results of [25]. The main questions left open
in [34] (see also [53, Problem 10.42] and [53, Problem 10.43]) are the following. Is
Theorem 8.7 true for regular operators which are not order continuous?³ Is the set
of all order narrow regular operators T : L∞ → L∞ a band in the vector lattice
Lr (L∞ ) of all regular linear operators on L∞ ? Note that the set of all narrow
regular operators T : L∞ → L∞ is not a band in Lr (L∞ ) [34], [53, Theorem 10.5].
The results presented below give affirmative answers to both questions, not
only for linear operators but also for OAOs. Moreover, in these results the assumption
that E is Dedekind complete is replaced with the less restrictive assumption that E
has the principal projection property, and the atomlessness assumption on
E is removed (the latter adjustment is minor, because every narrow and every order
narrow operator must send atoms to zero, making the representation trivial on the
atomic part of E). Although the results for OAOs look more general at first glance, they do not
formally imply the corresponding results for linear operators: the set of linear operators
is only a linear subspace of the vector lattice of OAOs, not a sublattice, because the
two orders differ.
Theorem 8.8 ([46, Theorem A]) Let E be a vector lattice with the principal
projection property and F a Dedekind complete vector lattice that is an ideal of
some order continuous Banach lattice G. Then for every regular OAO T : E → F
the following assertions are equivalent:
1. T is diffuse;
2. T : E → F is order narrow;
3. T : E → G is order narrow;
4. T : E → G is narrow;
5. the Enflo-Starbird λ-function of T is zero: λT = 0 (both in F and G).
Hence, every regular OAO T : E → F is uniquely represented as follows T =
Ta + Tc , where Ta is an absolutely order convergent sum of disjointness preserving
regular OAOs and Tc is a regular order narrow OAO.
A similar result holds for linear operators.
Theorem 8.9 ([46, Theorem B]) Let E be a vector lattice with the principal
projection property and F a Dedekind complete vector lattice being an ideal of
some order continuous Banach lattice G. Then for every T ∈ Lb (E, F ) assertions
1-5 of Theorem 8.8 are equivalent.
3 Weis’ representation theorem [58] was proved under the assumption of order continuity.

To get Theorem 8.9 as a consequence of Theorem 8.8, we define the canonical
embedding ϕ : Lb (E, F ) → OAr (E, F ) by setting $\varphi(T)x = T|x|$ for a given
$T \in L_b^+(E, F)$ and all x ∈ E, and by $\varphi(T) = \varphi(T^+) - \varphi(T^-)$ for an arbitrary
T ∈ Lb (E, F ), that is, $\varphi(T)x = T^+|x| - T^-|x|$ for all x ∈ E. If E has the
principal projection property then ϕ is a lattice monomorphism, that is, an injective
lattice homomorphism.
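The following short verification, added here as an illustration, indicates why ϕ(T ) is orthogonally additive. It assumes the standard definition that a map S is orthogonally additive when S(x + y) = Sx + Sy for disjoint x, y, together with the lattice identity |x + y| = |x| + |y| for disjoint elements.

```latex
% Assumptions: T \in L_b(E,F) with positive and negative parts T^+, T^-;
% x, y \in E are disjoint, i.e. |x| \wedge |y| = 0, which gives |x + y| = |x| + |y|.
\begin{align*}
  \varphi(T)(x + y)
    &= T^+|x + y| - T^-|x + y| \\
    &= T^+\bigl(|x| + |y|\bigr) - T^-\bigl(|x| + |y|\bigr) \\
    &= \bigl(T^+|x| - T^-|x|\bigr) + \bigl(T^+|y| - T^-|y|\bigr) \\
    &= \varphi(T)x + \varphi(T)y .
\end{align*}
% If T \ge 0 as a linear operator, then \varphi(T)x = T|x| \ge 0 for every x \in E,
% so \varphi(T) takes only positive values.
```

The second observation is the reason the absolute value appears in the definition: T itself takes negative values on −E+ even when T ≥ 0 as a linear operator.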

9 Banach Lattices of OAOs

Throughout this section, we consider a pair of normed lattices E, F with F both
norm complete and Dedekind complete. There is a naturally defined norm on the vector lattice
U(E, F ), called the absolute norm. However, the normed sublattice AB(E, F ) of
U(E, F ), consisting of all OAOs that are bounded with respect to this norm, need not be
norm complete. Another natural, somewhat greater norm, called the uniform norm,
turns the sublattice UB(E, F ) of both AB(E, F ) and U(E, F ), consisting of all OAOs
that are bounded with respect to the latter norm, into a Banach lattice, which therefore
deserves investigation.

9.1 Absolute and Uniform Norms of an Abstract Uryson Operator

Definition 9.1 Let E be a normed lattice and F a Dedekind complete Banach
lattice. An OAO T ∈ U(E, F ) is said to be absolutely norm bounded if there is
M ∈ [0, +∞) such that for every x ∈ E one has $\||T|(x)\| \le M\|x\|$.
The set of all such operators is denoted by AB(E, F ) and endowed with the
following nonnegative value

\[ \|T\|_{\mathrm{abs}} := \sup_{x \in E \setminus \{0\}} \frac{\||T|(x)\|}{\|x\|}, \]

which we call the absolute norm. The following statement, in particular, asserts that
it is a norm.
Theorem 9.2 ([52, Theorem 3.2]) Let E be a normed lattice and F a Dedekind
complete Banach lattice. Then AB(E, F ) is a normed lattice with respect to the
absolute norm, and is a sublattice of U(E, F ).
The following example (with non-obvious proof) shows that the normed lattice
AB(E, F ) need not be norm complete.
Example ([52, Example 3.3]) Let E = F = L1 [0, 1]. Then the normed space
AB(E, F ) is not norm complete.
To introduce a complete norm on U(E, F ), we must restrict the sublattice
AB(E, F ) to a much narrower class of operators.
Definition 9.3 Let E be a normed lattice and F a Dedekind complete Banach
lattice. An abstract Uryson operator T : E → F is said to be uniformly order
bounded if there is L ∈ [0, +∞) such that for every x ∈ E one has

\[ \Bigl\| \sup_{|y| \le |x|} |T|(y) \Bigr\| \le L\|x\|. \]

In other words, T is uniformly order bounded provided its envelope⁴ is absolutely
norm bounded, that is, $T^B \in AB(E, F)$. The set of all uniformly order bounded
abstract Uryson operators is denoted by UB(E, F ) and endowed with the following
nonnegative value

\[ \|T\|_u := \sup_{x \in E \setminus \{0\}} \frac{\|T^B(x)\|}{\|x\|} = \sup_{x \in E \setminus \{0\}} \frac{\bigl\| \sup_{|y| \le |x|} |T|(y) \bigr\|}{\|x\|}, \]

which we call the uniform norm. Obviously, UB(E, F ) ⊆ AB(E, F ) and
$\|T\|_{\mathrm{abs}} \le \|T\|_u = \|T^B\|_{\mathrm{abs}}$ for every T ∈ UB(E, F ).
The inclusion UB(E, F ) ⊂ AB(E, F ) is strict for the class of AL-spaces [52,
Example 3.8].
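The inequality between the two norms is stated without proof. A minimal justification can be sketched as follows, under the assumption (suggested by the displayed formula for the uniform norm) that the envelope is given by $T^B(x) = \sup_{|y| \le |x|} |T|(y)$; the definition in Sect. 7 should be consulted for the precise form.

```latex
% Assumption: T^B(x) = \sup_{|y| \le |x|} |T|(y), and |T| is a positive OAO.
% Taking y = x in the supremum gives, for every x \in E,
\[
  0 \le |T|(x) \le \sup_{|y| \le |x|} |T|(y) = T^B(x).
\]
% The norm of F is a lattice norm (0 \le a \le b implies \|a\| \le \|b\|), so
\[
  \bigl\| |T|(x) \bigr\| \le \bigl\| T^B(x) \bigr\| \le \|T\|_u \, \|x\| .
\]
% Dividing by \|x\| \ne 0 and taking the supremum over x yields
\[
  \|T\|_{\mathrm{abs}} \le \|T\|_u ,
  \qquad\text{hence}\qquad UB(E,F) \subseteq AB(E,F).
\]
```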
Theorem 9.4 ([52, Theorem 3.9]) Let E be a normed lattice and F a Dedekind
complete Banach lattice. Then UB(E, F ) is a Dedekind complete Banach lattice
with respect to the uniform norm, and is a sublattice of U(E, F ).

9.2 Consistent Sets and Levels in a Vector Lattice

A subset G of a vector lattice E is said to be consistent if every two-point subset
{x, y} of G is laterally bounded in E, that is, there exists e ∈ E such that
x ⊑ e and y ⊑ e (equivalently, every finite subset of G is laterally bounded [38,
Proposition 5.2]). The lateral band in E generated by a consistent set G is consistent
[38, Theorem 6.10].
Definition 9.5 A consistent lateral band in a vector lattice E is called a level of E.
A level which is not included in another level is called a maximal level. A level L in
E is called a principal level provided L = Fe for some e ∈ E.

4 See Sect. 7 for the definition of the envelope.

Example Let (Ω, Σ, μ) be a finite atomless measure space, 0 ≤ p ≤ ∞ and E =
Lp (μ). Fix any z ∈ L0 (μ) and set Lz = {x ∈ E : x ⊑ z}. Then
1. Lz is a level in E;
2. Lz is a maximal level in E if and only if supp z = Ω;
3. Lz is a principal level Lz = Fz if and only if z ∈ E.
Proposition 9.6 ([52, Proposition 2.4]) A vector lattice E is laterally complete if
and only if every maximal level is a principal level.
Obviously, if L′ and L″ are orthogonal levels in a vector lattice E, that is, e′ ⊥ e″
for all e′ ∈ L′ and e″ ∈ L″, then the direct sum defined by setting

L′ ⊕ L″ = {x + y : x ∈ L′, y ∈ L″}

is a level as well.
A level L in a vector lattice E is said to be positive (respectively, negative)
provided L ⊂ E+ (respectively, x ≤ 0 for each x ∈ L). The relation L ≥ 0
(respectively, L ≤ 0) means that the level L is positive (respectively, negative).
Proposition 9.7 ([52, Proposition 2.5]) Every level L in a vector lattice E admits
a unique decomposition into a direct sum of levels L = L+ ⊕ L− , where L+ ≥ 0
and −L− ≥ 0. In particular, for any principal level L = Fe one has (Fe )+ = F(e+ )
and (Fe )− = F(−e− ) .

9.3 Linear Sections of Orthogonally Additive Operators

This subsection is devoted to the construction of norm one projections of the Banach
lattice UB(E, F ) onto its subspace L(E, F ) of all bounded linear operators. We
present some results asserting the existence of plenty of norm one projections of
UB(E, F ) onto L(E, F ).
Given an OAO T : E → F between vector lattices and a level L in E, we
construct a linear operator S : E → F that has the same values as T on L and vanishes
on the disjoint complement Ld of L. First, we define such an operator on the linear
subspace EL ⊕ Ld of E, and then look for a way to extend it to the entire space
E while preserving some important properties (by EL we denote the minimal ideal of
E including L). Our final purpose is to find assumptions on E, F , T and L under
which there is a unique linear operator with the desired properties. Such a
linear operator is then chosen as the image of T under a canonical projection of
UB(E, F ) onto L(E, F ).
Definition 9.8 Let E be a vector lattice, L a level in E, F a linear space and
T : E → F an OAO. A linear operator S : EL ⊕ Ld → F is called a linear section
of T by L if S|L = T |L and S|Ld = 0. A linear operator S : E → F is called an
extended linear section of T by L if S|L = T |L and S|Ld = 0.
The first theorem concerns the very general case and asserts the existence of an
extended linear section without any additional properties.
Theorem 9.9 ([52, Theorem 4.2]) Let E be a vector lattice, F a vector space and
T : E → F an OAO. Then for every level L of E there exists an extended linear
section S : E → F of T by L.
Definition 9.10 Let E, F be vector lattices and D ⊆ E. A function f : D → F is
said to be vertically order σ-continuous on D if D is an ideal of E and for every
w ∈ D+ , every x ∈ Ew and every increasing sequence $(x_n)_{n=1}^{\infty}$ in Ew such that
$0 \le x - x_n \le \frac{1}{n} w$ one has $f(x_n) \xrightarrow{o} f(x)$.
One can easily show that every regular linear operator T : E → F is vertically
order σ -continuous on E, once F is Archimedean [52, Proposition 4.6].
Next is the main technical tool for the construction of linear sections.
Theorem 9.11 ([52, Theorem 4.7]) Let E, F be vector lattices. Assume E has the
principal projection property, F is Dedekind complete and T ∈ U(E, F ). Then for
every level L of E there is a unique regular linear section S = L (T ) : EL ⊕ Ld →
F of T by L. Moreover, if L ≥ 0 then S + = (L (T ))+ = L (T + ). In particular, if
T is positive as an OAO and L ≥ 0 then S is positive as a linear operator.
Definition 9.12 Let E be a vector lattice with the principal projection property, F
a Dedekind complete vector lattice, T ∈ U(E, F ) and L a level of E. The regular
linear section S = L (T ) : EL ⊕Ld → F of T by L, the existence and uniqueness
of which Theorem 9.11 asserts, is called the canonical linear section of T by L.
Theorem 9.11 yields the following properties of the canonical linear section
S = L (T ) of T by L:
1◦ S is a regular linear operator.
If, in addition, L ≥ 0 then
2◦ S + = (L (T ))+ = L (T + );
3◦ if T ≥ 0 as an OAO then S ≥ 0 as a linear operator.
The next property is expressed in the following theorem.
Theorem 9.13 ([52, Theorem 4.12]) Let E be a vector lattice with the principal
projection property, F a Dedekind complete vector lattice, T ∈ U(E, F ) and L a
positive level of E. If T is horizontally-to-order continuous (horizontally-to-order
σ -continuous) on L then the canonical linear section S = L (T ) of T by L is order
continuous (order σ -continuous) on its domain.
The following theorem asserts that the canonical linear section is a linear operator
from UB(E, F ) to Lr (EL ⊕ Ld , F ) with some useful properties.
Theorem 9.14 ([52, Theorem 4.15]) Let E, F be vector lattices. Assume E has the
principal projection property, F is Dedekind complete and L is a level in E. Then the
corresponding canonical linear section as a mapping L : UB(E, F ) → Lr (EL ⊕
Ld , F ) is a disjointness preserving linear operator. If, moreover, L ≥ 0 then L
is a lattice homomorphism, and if L ≤ 0 then −L is a lattice homomorphism.
Consequently, in the general case L is a difference of two lattice homomorphisms.
The existence of a unique extended linear section, which gives a linear projection of
UB(E, F ) onto L(E, F ), can be obtained in some special cases.
Theorem 9.15 ([52, Theorem 5.1]) Let E be an AL-space, F a Dedekind com-
plete Banach lattice and T ∈ UB(E, F ). Then for every positive level L of E there
is a unique extended linear bounded regular section S = L (T ) : E → F of T by
L with S ≤ T u . Moreover, S = L (T ) is linear with respect to T and if T ≥ 0
then S ≥ 0.
In some natural cases, an extended linear section does not exist.
Example Let 1 ≤ p < r < ∞. Denote by 1 the characteristic function of [0, 1],
by L the principal level L = F1 in Lp , and by J : L → Lr the identity
embedding. Define a function T : Lp → Lr by setting

T (x) = J (x ∩ 1) for all x ∈ Lp . (7)

Then T ∈ U(Lp , Lr ) and there is no extended linear section of T by L.

Proof Since Lp is Dedekind complete, Lp has the intersection property and T is
well defined by (7). By Theorem 4.6, T is an OAO, and the inequality |T x| ≤ |x|
implies that T is order bounded. Thus, T ∈ U(Lp , Lr ). Observe that Ld = {0},
EL = L∞ and the canonical linear section of T by L is the identity operator S =
L (T ) : L∞ → Lr , Sx = x for all x ∈ L∞ ⊂ Lp . So S, being unbounded on its
domain, cannot be extended to the entire Lp .

As a vector subspace of UB(E, F ), the Banach lattice Lr (E, F ) is not a
sublattice of UB(E, F ), because the lattice order on Lr (E, F ) is completely
different from that of UB(E, F ). Moreover, UB(E, F )+ ∩ Lr (E, F ) = {0}.
However, for the next result, it is enough to consider the case where

E is an AL-space and F a Dedekind complete Banach lattice that satisfy
Lr (E, F ) = L(E, F ) and $\||T|\| = \|T\|$ for all T ∈ L(E, F ). (8)

Observe that, under the above assumptions on E and F , the Banach space
L(E, F ) is a subspace of UB(E, F ).
Recall that a Banach lattice F is called a KB-space if every increasing norm
bounded sequence in F+ is norm convergent (equivalently, if the canonical image
of F in its second dual F″ is a band [8, Theorem 4.60]). By [8, Theorem 4.75], every
KB-space F satisfies (8) for any AL-space E. Nevertheless, there is a strictly wider
class of Banach lattices F than the KB-spaces (including e.g. infinite dimensional
L∞ (μ)-spaces which are not KB-spaces), possessing (8) for any AL-space E.
A norm $\|\cdot\|$ on a Banach lattice F is said to be:
• a Fatou norm provided for every net (fα ) in F+ and f ∈ F+ the condition
fα ↑ f implies $\|f\| = \lim_\alpha \|f_\alpha\|$;
• a Levi norm provided every net (fα ) in F with 0 ≤ fα ↑ and $\|f_\alpha\| \le 1$ for all α
has a supremum in F ;
• a Fatou-Levi norm provided it is both a Fatou norm and a Levi norm.
By [6, Theorem 4.1], a Banach lattice F satisfies (8) for every AL-space E if and
only if F has a Fatou-Levi norm.
Next is the main result of the section.
Theorem 9.16 ([52, Theorem 5.4]) Let E be an AL-space, F a Banach lattice
with a Fatou-Levi norm. Then L(E, F ) is a 1-complemented subspace of UB(E, F ).
Moreover, for every maximal positive level L in E, L is a contractive projection
of UB(E, F ) onto L(E, F ) such that
(i) L (T )|L = T |L for all T ∈ UB(E, F );
(ii) if L′ and L″ are distinct maximal levels in E then the corresponding projections are distinct.

10 Open Problems

10.1 An Analytic Representation of OAOs

Criteria for integral representability of Uryson and Hammerstein type for order
bounded OAOs were obtained in [55, 56].
Problem 10.1 Obtain criteria for integral representability of regular (in
general, order unbounded) OAOs.

10.2 Disjointness Preserving OAOs

Different classes of disjointness preserving OAOs were investigated in [2, 4, 12]. It is
well known that the sum of disjointness preserving OAOs need not be a disjointness
preserving operator.
Problem 10.2 Obtain a criterion for an OAO T to be the sum of n disjointness
preserving OAOs.
We recall that an OAO T : E → E is non-expanding if $Tx \in \{x\}^{dd}$ for all x ∈ E. By N (E)
we denote the set of all non-expanding OAOs on E. It is not hard to verify that
T + S ∈ N (E) and T ◦ S ∈ N (E) for every T , S ∈ N (E). In fact, N (E) is a
(noncommutative) algebra over R.
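The closure properties claimed above can be checked in a few lines. The sketch below is our own illustration and assumes that non-expanding means $Tx \in \{x\}^{dd}$ for every x ∈ E, and that $\{x\}^{dd}$ denotes the band generated by x.

```latex
% Assumption: T, S \in N(E), i.e. Tx \in \{x\}^{dd} and Sx \in \{x\}^{dd} for all x \in E.
% Sum: \{x\}^{dd} is a band, in particular a linear subspace, so
\[
  (T + S)x = Tx + Sx \in \{x\}^{dd}.
\]
% Composition: Sx \in \{x\}^{dd} implies \{Sx\}^{dd} \subseteq \{x\}^{dd}, hence
\[
  (T \circ S)x = T(Sx) \in \{Sx\}^{dd} \subseteq \{x\}^{dd}.
\]
% T \circ S is again orthogonally additive: if x \perp y, then Sx \in \{x\}^{dd} and
% Sy \in \{y\}^{dd} are disjoint, so
\[
  T\bigl(S(x + y)\bigr) = T(Sx + Sy) = T(Sx) + T(Sy).
\]
```

The last step is where the non-expanding property of S is essential: for general OAOs the composition need not be orthogonally additive.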
Problem 10.3 Investigate algebraic properties of N (E). In particular, describe
the automorphisms of N (E) and the left (right) ideals of N (E).

10.3 Compact OAOs

Suppose E is a vector lattice and F is a Banach lattice with an order continuous
norm. By Theorem 6.3, COAr (E, F ) is a projection band of OAr (E, F ).
Problem 10.4 Does Theorem 6.3 continue to hold if F is a Dedekind complete
Banach lattice?
Problem 10.5 Obtain a formula for the band projection from OAr (E, F ) onto
COAr (E, F ).

10.4 Order Projections

Order projections in different spaces of OAOs were studied in [2, 3, 12, 48].
Problem 10.6 Obtain a formula for the order projection of OAr (E, F ) onto the
band generated by an arbitrary positive OAO T : E → F .
Problem 10.7 Obtain a formula for the order projection in OAr (E, F ) onto the
band generated by all disjointness preserving OAOs from E to F .

10.5 Partial Order Continuities of Orthogonally Additive Operators

Problem 10.8 Under what assumptions on vector lattices E, F with F Dedekind
complete is every abstract Uryson operator T : E → F which is both horizontally-
to-order continuous and uniformly-to-order continuous necessarily order continuous?
Problem 10.9 Do there exist a vector lattice with the principal projection property
E, a Dedekind complete vector lattice F and an order continuous abstract Uryson
(or, at least, laterally-to-order bounded) operator T : E → F such that |T | is not
order continuous?

References

1. N. Abasov, Completely additive and C-compact operators in lattice-normed spaces. Ann.
Funct. Anal. 11(4), 914–928 (2020)
2. N. Abasov, On band preserving orthogonally additive operators. Sib. Electron. Math. Rep. 18,
1 (2021)
3. N. Abasov, On a band generated by a disjointness preserving orthogonally additive operator.
Lobachevskii J. Math. 42(5), 851–856 (2021)
4. N. Abasov, M. Pliev, On extensions of some nonlinear maps in vector lattices. J. Math. Anal.
Appl. 455(1), 516–527 (2017)
5. N. Abasov, M. Pliev, Dominated orthogonally additive operators in lattice-normed spaces. Adv.
Oper. Theory 4(3), 251–264 (2019)
6. Yu. Abramovich, Z. Chen, A. Wickstead, Regular-norm balls can be closed in the strong
operator topology. Positivity 1(1), 75–96 (1997)
7. C.D. Aliprantis, O. Burkinshaw, The components of a positive operator. Math. Z. 184(2), 245–
257 (1983)
8. C.D. Aliprantis, O. Burkinshaw, Positive Operators (Springer, Dordrecht, 2006)
9. J. Appell, P.P. Zabrejko, Nonlinear Superposition Operators (Cambridge University Press,
Cambridge, 1990)
10. J. Appell, J. Banas, N. Merentes, Bounded Variation and Around (De Gruyter, Berlin, 2014)
11. M.A. Ben Amor, M. Pliev, Laterally continuous part of an abstract Uryson operator. Int. J.
Math. Anal. 7(58), 2853–2860 (2013)
12. R. Chill, M. Pliev, Atomic operators in vector lattices. Mediterr. J. Math. 17, article 138 (2020)
13. B. de Pagter, The components of a positive operator. Indag. Math. 48(2), 229–241 (1983)
14. P.G. Dodds, D.H. Fremlin, Compact operators in Banach lattices. Israel J. Math. 34, 287–320
(1979)
15. L. Drewnowski, W. Orlicz, On orthogonally-additive functionals. Bull. Acad. Polon. Sci. Ser.
Sci. Math. Astron. Phys. 16, 883–888 (1968)
16. L. Drewnowski, W. Orlicz, Continuity and representation of orthogonally-additive functionals.
Bull. Acad. Polon. Sci. Ser. Sci. Math. Astron. Phys. 17, 647–653 (1969)
17. P. Enflo, T. Starbird, Subspaces of L1 containing L1 . Stud. Math. 65(2), 203–225 (1979)
18. N. Erkursun-Özcan, M. Pliev, On orthogonally additive operators in C-complete vector lattices.
Banach J. Math. Anal. 16(1), article number 6 (2022)
19. W.A. Feldman, A characterization of non-linear maps satisfying orthogonality properties.
Positivity 21(1), 85–97 (2017)
20. W.A. Feldman, A factorization for orthogonally additive operators on Banach lattices. J. Math.
Anal. Appl. 472(1), 238–245 (2019)
21. O. Fotiy, A. Gumenchuk, I. Krasikova, M. Popov, On sums of narrow and compact operators.
Positivity 24(1), 69–80 (2020)
22. O. Fotiy, I. Krasikova, M. Pliev, M. Popov, Order continuity of orthogonally additive operators.
Results Math. 77, article 5 (2022)
23. A.I. Gumenchuk, Lateral continuity and orthogonally additive operators. Carpathian Math.
Publ. 7(1), 49–56 (2015)
24. A.I. Gumenchuk, M.A. Pliev, M.M. Popov, Extensions of orthogonally additive operators. Mat.
Stud. 42(2), 214–219 (2014)
25. C.B. Huijsmans, B. de Pagter, Disjointness preserving and diffuse operators. Compos. Math.
79(3), 351–374 (1991)
26. N.J. Kalton, The endomorphisms of Lp (0 ≤ p ≤ 1). Indiana Univ. Math. J. 27(3), 353–381
(1978)
27. A. Kamińska, I. Krasikova, M. Popov, Projection lateral bands and lateral retracts. Carpathian
Math. Publ. 12(2), 333–339 (2020)
28. M.A. Krasnosel’skii, P.P. Zabrejko, E.I. Pustil’nikov, P.E. Sobolevski, Integral Operators in
Spaces of Summable Functions (Noordhoff, Leiden, 1976)
29. A.G. Kusraev, M.A. Strizhevski, Lattice-normed spaces and dominated operators, Studies on
Geometry and Functional Analysis. Trudy Inst. Mat. (Novosibirsk), Novosibirsk 7, 132–158
(1987) (in Russian)
30. S. Kwapien, On the form of a linear operator in the space of all measurable functions. Bull.
Acad. Polon. Sci. Sér. Sci. Math. Astronom. Phys. 21, 951–954 (1973)
31. H. Le Dret, Nonlinear Elliptic Partial Differential Equations (Springer, Berlin, 2018)
32. M. Marcus, V. Mizel, Representation theorem for nonlinear disjointly additive functionals and
operators on Sobolev spaces. Trans. Am. Math. Soc. 226, 1–45 (1977)
33. M. Marcus, V. Mizel, Extension theorem of Hahn-Banach type for nonlinear disjointly additive
functionals and operators in Lebesgue spaces. J. Funct. Anal. 24, 303–335 (1977)
34. O.V. Maslyuchenko, V.V. Mykhaylyuk, M.M. Popov, A lattice approach to narrow operators.
Positivity 13(3), 459–495 (2009)
35. J.M. Mazón, S. Segura de León, Order bounded orthogonally additive operators. Rev. Roumaine
Math. Pures Appl. 35(4), 329–353 (1990)
36. J.M. Mazón, S. Segura de León, Uryson operators. Rev. Roumaine Math. Pures Appl. 35(5),
431–449 (1990)
37. P. Meyer-Nieberg, Banach Lattices (Springer, Berlin, 1991)
38. V. Mykhaylyuk, M. Pliev, M. Popov, The lateral order on Riesz spaces and orthogonally
additive operators. Positivity 25(2), 291–327 (2021)
39. V. Mykhaylyuk, M. Pliev, M. Popov, The lateral order on Riesz spaces and orthogonally
additive operators. II. Preprint
40. V. Orlov, M. Pliev, D. Rode, Domination problem for AM-compact abstract Uryson operators.
Arch. Math. 107(5), 543–552 (2016)
41. A.M. Plichko, M.M. Popov, Symmetric function spaces on atomless probability spaces.
Dissertationes Math. (Rozprawy Mat.) 306, 1–85 (1990)
42. M. Pliev, Domination problem for narrow orthogonally additive operators. Positivity 21(1),
23–33 (2017)
43. M. Pliev, On C-compact orthogonally additive operators. J. Math. Anal. Appl. 494(1), 124594
(2021)
44. M. Pliev, M. Popov, Narrow orthogonally additive operators. Positivity 18(4), 641–667 (2014)
45. M. Pliev, M. Popov, Extension of abstract Urysohn operators. Sib. Math. J. 57(3), 552–557
(2016)
46. M. Pliev, M. Popov, Representation theorems for regular linear and orthogonally additive
operators. Preprint, 26 p. (2020)
47. M. Pliev, K. Ramdane, Order unbounded orthogonally additive operators in vector lattices.
Mediterr. J. Math. 15(2), article 55 (2018)
48. M. Pliev, M. Weber, Disjointness and order projections in the vector lattices of abstract Uryson
operators. Positivity 20(3), 695–707 (2016)
49. M.A. Pliev, F. Polat, M.R. Weber, Narrow and C-compact orthogonally additive operators in
lattice-normed spaces. Results Math. 74, article 157 (2019)
50. A. Ponosov, E. Stepanov, Atomic operators, random dynamical systems and invariant mea-
sures. St. Petersburg Math. J. 26(4), 607–642 (2015)
51. M. Popov, Horizontal Egorov property of Riesz spaces. Proc. Am. Math. Soc. 149(1), 323–332
(2021)
52. M. Popov, Banach lattices of orthogonally additive operators. J. Math. Anal. Appl. 514(1),
126279 (2022)
53. M. Popov, B. Randrianantoanina, Narrow Operators on Function Spaces and Vector Lattices
(De Gruyter, Berlin, 2013)
54. H.P. Rosenthal, Embeddings of L1 in L1 . Contemp. Math. 26, 335–349 (1984)
55. S. Segura de León, Bukhvalov type characterizations of Urysohn operators. Stud. Math. 99(3),
199–220 (1991)
56. S. Segura de León, Characterizations of Hammerstein operators. Rend. Math. 7(12), 689–707
(1992)
57. P. Tradacete, I. Villanueva, Valuations on Banach lattices. Int. Math. Res. Not. 2020(1), 287–
319 (2020)
58. L. Weis, On the representation of order continuous operators by random measures. Trans. Am.
Math. Soc. 285(2), 535–563 (1984)
Part III
Inequalities Related to Types of Operators
Normal Operators and their Generalizations

Pietro Aiena

Abstract In this chapter we study the spectral properties of some classes of
operators which generalize normal operators on Hilbert spaces. In particular, we
consider for these operators some aspects of local spectral theory and Fredholm
theory.

Keywords Normal operators · Fredholm theory · Localized single valued
extension property · Weyl type theorems

1 Introduction

It is well-known that a normal operator on a Hilbert space possesses a rich spectral
theory. Many classes of operators that generalize normal operators have been
introduced and studied in recent years. These classes of operators are defined
by means of some (order) inequalities that involve the operator T and its adjoint
T ∗ . More precisely, most of them are defined by relaxing the normality condition
T T ∗ = T ∗T .
In this chapter we consider the spectral properties of these classes of
operators, and we show that such operators share with the normal operators many
spectral properties, most of them concerning Fredholm theory and local spectral
theory.
Our main interest is in the isolated points of the spectrum of these operators,
as well as the isolated points of the approximate point spectrum. In many cases these
points are poles (or left poles) of the resolvent. In this context the concept of
polaroid operator (or a-polaroid operator), together with the single-valued extension
property (SVEP), provides a very useful tool for studying the structure of the
spectrum. We also introduce the quasi-THN operators (quasi totally hereditarily
normaloid operators), in order to establish a general theoretical framework from
which we can deduce that several versions of Weyl type theorems hold for all these
operators in their classical form, as well as in their generalized form.

P. Aiena
Dipartimento d’Ingegneria, Università di Palermo (Italia), Palermo, Italy
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
R. M. Aron et al. (eds.), Operator and Norm Inequalities and Related Topics,
Trends in Mathematics, https://doi.org/10.1007/978-3-031-02104-6_11

2 Notations and Preliminary Results on Spectral Theory

Since this chapter concerns the spectral theory of bounded linear operators, we
always assume that the Banach spaces, or the Hilbert spaces, are complex and infinite-
dimensional. If X, Y are Banach spaces, by L(X, Y ) we denote the Banach space of
all bounded linear operators from X into Y . Recall that if T ∈ L(X, Y ), the norm of
T is defined by

\[ \|T\| := \sup_{x \ne 0} \frac{\|Tx\|}{\|x\|}. \]

If X = Y we write L(X) := L(X, X). By X∗ := L(X, C) we denote the dual of X.
If T ∈ L(X, Y ), by T′ ∈ L(Y ∗ , X∗ ) we denote the dual operator of T , defined by

(T′f )(x) := f (T x) for all x ∈ X, f ∈ Y ∗ .

The identity operator on X will be denoted by IX , or simply I if no confusion can
arise.
We now recall some of the basic definitions concerning Hilbert space operators. We
refer to the books by Rudin [53] and Heuser [39] for details and proofs. Let H be a complex
Hilbert space with inner product $\langle \cdot, \cdot \rangle$. The inner product satisfies the Schwarz
inequality, i.e.,

$|\langle x, y \rangle| \le \|x\| \|y\|$ for all x, y ∈ H.

The dual of a Hilbert space is described by the following theorem.

Theorem 2.1 (Fréchet-Riesz Representation Theorem) For each fixed element
z ∈ H the map $f : x \in H \mapsto \langle x, z \rangle$ defines a continuous linear form on H .
Conversely, for every continuous linear form f on H there exists exactly one vector
z ∈ H such that $f(x) = \langle x, z \rangle$ for all x ∈ H . Furthermore, $\|f\| = \|z\|$.
A consequence of this theorem is that every Hilbert space is isometrically
isomorphic to its dual. If T ∈ L(H ), for a fixed y ∈ H define

$f(x) := \langle Tx, y \rangle$.

According to Theorem 2.1, there exists a unique element z ∈ H such that

$f(x) = \langle Tx, y \rangle = \langle x, z \rangle$.

The adjoint operator T ∗ ∈ L(H ) is then defined by

$\langle Tx, y \rangle = \langle x, z \rangle = \langle x, T^* y \rangle$ for all x ∈ H.

By the Fréchet-Riesz representation theorem there exists a conjugate-linear
isometry U : H → H′, where H′ is the dual of H , that associates to every y ∈ H the linear
form defined by $f_y(x) := \langle x, y \rangle$. The dual T′ and the adjoint T ∗ of T are related as
follows:

$(\bar{\lambda} I - T^*) = U^{-1} (\lambda I - T') U$ for every λ ∈ C.

Hence

$U(\bar{\lambda} I - T^*) = (\lambda I - T') U$ and $(\bar{\lambda} I - T^*) U^{-1} = U^{-1} (\lambda I - T')$. (1)
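The relation between the dual and the adjoint is stated without proof. A short verification can be sketched as follows, writing T′ for the dual operator and T ∗ for the Hilbert space adjoint (the prime is our notational assumption for the dual), with the inner product linear in the first variable and conjugate-linear in the second.

```latex
% Notation (assumed): (T'f)(x) = f(Tx) is the dual operator, T^* the Hilbert space
% adjoint, and U y = f_y with f_y(x) = \langle x, y \rangle.
% Step 1: T'U = U T^*. For all x, y \in H,
\[
  (T' U y)(x) = (U y)(T x) = \langle T x, y \rangle
              = \langle x, T^{*} y \rangle = (U T^{*} y)(x).
\]
% Step 2: U is conjugate-linear. For \lambda \in \mathbb{C},
\[
  \bigl(U(\bar{\lambda} y)\bigr)(x) = \langle x, \bar{\lambda} y \rangle
   = \lambda \langle x, y \rangle = (\lambda U y)(x).
\]
% Combining both steps, for every y \in H:
\[
  U(\bar{\lambda} I - T^{*}) y = \lambda U y - T' U y = (\lambda I - T') U y .
\]
% This is the first identity in (1). Multiplying it by U^{-1} on the left and on the
% right gives the second identity.
```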

Given a bounded operator T ∈ L(X, Y ) between Banach spaces, the kernel of T is
the set

ker T := {x ∈ X : T x = 0},

while the range of T is denoted by T (X). In the sequel, for every bounded operator
T ∈ L(X, Y ), we shall denote by α(T ) the nullity of T , defined as α(T ) :=
dim ker T , while the deficiency β(T ) of T is the dimension of the cokernel Y/T (X),
i.e., β(T ) := dim Y/T (X) = codim T (X). The spectrum of T ∈ L(X) is defined as

σ (T ) := {λ ∈ C : λI − T is not bijective}.

It is well-known that the spectrum is a compact subset of C and that σ (T′) = σ (T )
for the dual T′ of every T ∈ L(X), while for the adjoint T ∗ of a Hilbert space operator T we have
$\sigma(T^*) = \overline{\sigma(T)}$, the set of complex conjugates of the points of σ (T ). If X is a
complex Banach space then every T ∈ L(X) has non-empty spectrum. The complement
ρ(T ) := C \ σ (T ) is called the resolvent set of T .
Recall that an operator T ∈ L(X) is said to be bounded below if T is injective
and has closed range. A basic result shows that T ∈ L(X, Y ) is bounded below if
and only if there exists K > 0 such that

$\|Tx\| \ge K\|x\|$ for all x ∈ X. (2)

The classical approximate point spectrum σap (T ) is defined by

σap (T ) := {λ ∈ C : λI − T is not bounded below},

while the surjectivity spectrum σsu (T ) is defined by

σsu (T ) := {λ ∈ C : λI − T is not onto}.

These two spectra are nonempty and dual to each other, i.e., for the dual T′ of T we have
σap (T′) = σsu (T ) and σsu (T′) = σap (T ). Both spectra σap (T ) and σsu (T ) contain the
boundary ∂σ (T ) of the spectrum, see [4, Theorem 1.12]. Furthermore, it is easily seen
from (1) that for a Hilbert space operator T the adjoint T ∗ and the dual T′ satisfy

$\sigma_{ap}(T') = \overline{\sigma_{ap}(T^*)}$ and $\sigma_{su}(T') = \overline{\sigma_{su}(T^*)}$,

where the bar denotes complex conjugation of the set.

Given a linear operator T on a vector space X, let $N^\infty(T) := \bigcup_{n \ge 1} \ker T^n$ and
$T^\infty(X) := \bigcap_{n \ge 1} T^n(X)$ denote the hyper-kernel and the hyper-range of T , respectively.
Then T is said to have finite ascent if $N^\infty(T) = \ker T^k$ for some positive integer k.
Clearly, in such a case there is a smallest positive integer p = p(T ) such that
$\ker T^p = \ker T^{p+1}$. The positive
integer p is called the ascent of T . If there is no such integer we set p(T ) := ∞.
Analogously, T is said to have finite descent if $T^\infty(X) = T^k(X)$ for some k. The
smallest integer q = q(T ) such that $T^{q+1}(X) = T^q(X)$ is called the descent of T .
If there is no such integer we set q(T ) := ∞. The proofs of the following basic results
may be found in [4, Chapter 1].
Theorem 2.2 Let T be a linear operator on a vector space X. If both p(T ) and
q(T ) are finite then p(T ) = q(T ).
The defects α(T ), β(T ), the ascent and the descent are related as follows:
Theorem 2.3 If T is a linear operator on a vector space X then the following
properties hold:
(i) If p(T ) < ∞ then α(T ) ≤ β(T );
(ii) If q(T ) < ∞ then β(T ) ≤ α(T );
(iii) If p(T ) = q(T ) < ∞ then α(T ) = β(T ) (possibly infinite);
(iv) If α(T ) = β(T ) < ∞ and if either p(T ) or q(T ) is finite then p(T ) = q(T ).
The finiteness of the ascent and the descent of a linear operator T is related to a
certain decomposition of X.
Theorem 2.4 Suppose that T is a linear operator on a vector space X. If p =
p(T) = q(T) < ∞ then we have the decomposition

X = T^p(X) ⊕ ker T^p.

Conversely, if for a natural number m we have the decomposition X = T^m(X) ⊕
ker T^m then p(T) = q(T) ≤ m. In this case T|T^p(X) is bijective.
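As a finite-dimensional illustration of Theorems 2.2 and 2.4, the following Python sketch computes the ascent and the descent of a square matrix from the ranks of its powers and checks the decomposition X = T^p(X) ⊕ ker T^p. It is only a sketch under stated assumptions: the helper names (matmul, power, rank, ascent_descent, is_direct_sum) and the sample matrix are hypothetical and not taken from the text, exact rational arithmetic is used, and the convention p(T) = q(T) = 0 for a bijective T is assumed.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def power(t, k):
    """Return T^k, with T^0 the identity matrix."""
    n = len(t)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, t)
    return result


def rank(m):
    """Rank of a matrix, by Gauss-Jordan elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in m]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c] / rows[r][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def ascent_descent(t):
    """Return (p, q): the first k where the kernels, resp. the ranges, of T^k stabilize."""
    n = len(t)
    ranks = [rank(power(t, k)) for k in range(n + 2)]
    nullities = [n - r for r in ranks]
    p = next(k for k in range(n + 1) if nullities[k] == nullities[k + 1])
    q = next(k for k in range(n + 1) if ranks[k] == ranks[k + 1])
    return p, q


def is_direct_sum(t, p):
    """The dimensions of T^p(X) and ker T^p always add up to n, so the sum is
    direct exactly when T^p is injective on T^p(X), i.e. rank T^(2p) = rank T^p."""
    return rank(power(t, 2 * p)) == rank(power(t, p))


# Hypothetical sample: a nilpotent 2x2 Jordan block next to an invertible 1x1 block.
T = [[0, 1, 0],
     [0, 0, 0],
     [0, 0, 2]]
p, q = ascent_descent(T)
print(p, q, is_direct_sum(T, p))
```

For this sample matrix the ranks of T^0, T^1, T^2, T^3 are 3, 2, 1, 1, so p(T) = q(T) = 2 and the decomposition of Theorem 2.4 holds with p = 2.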
The following subspace has been introduced by Vrbová [59] and later studied by
Mbekhta [47].
Definition 2.5 Let X be a Banach space and T ∈ L(X). The analytic core of T is
the set K(T) of all x ∈ X such that there exists a sequence (u_n) ⊂ X and a constant
δ > 0 such that:
(1) x = u_0, and T u_{n+1} = u_n for every n ∈ Z+;
(2) ‖u_n‖ ≤ δ^n ‖x‖ for every n ∈ Z+.
Normal Operators and their Generalizations 359

It is easily seen that K(T ) is a linear subspace of X and T (K(T )) = K(T ).

Another important invariant subspace for a bounded operator T ∈ L(X), X a
Banach space, is defined as follows:
Definition 2.6 Let T ∈ L(X), X a Banach space. The quasi-nilpotent part of T is
defined to be the set

H0(T) := {x ∈ X : lim_{n→∞} ‖T^n x‖^{1/n} = 0}.

Clearly H0 (T ) is a linear subspace of X, generally not closed. Obviously

ker(T^m) ⊆ H0(T) for every m ∈ N.
Theorem 2.7 Let X be a Banach space. Then T ∈ L(X) is quasi-nilpotent if and
only if H0 (T ) = X.
Proof Suppose that T is quasi-nilpotent, i.e., lim_{n→∞} ‖T^n‖^{1/n} = 0. Then
‖T^n x‖ ≤ ‖T^n‖ ‖x‖ for every x ∈ X, so lim_{n→∞} ‖T^n x‖^{1/n} = 0. This shows
that H0(T) = X.
Conversely, assume that H0(T) = X. By the n-th root test the series

\[ \sum_{n=0}^{\infty} \frac{\|T^n x\|}{|\lambda|^{n+1}} \]

converges for each x ∈ X and λ ≠ 0. Define

\[ y := \sum_{n=0}^{\infty} \frac{T^n x}{\lambda^{n+1}}. \]

It is easy to verify that (λI − T)y = x, thus λI − T is surjective for all λ ≠ 0. On
the other hand, for every λ ≠ 0 we have that

{0} = ker(λI − T) ∩ H0(T) = ker(λI − T) ∩ X = ker(λI − T),

which shows that λI − T is invertible and therefore σ(T) = {0}, i.e. T is
quasi-nilpotent.
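The series used in the proof can be tested on a small example. The following Python sketch, under the assumption that T is a nilpotent matrix (so the series terminates after finitely many terms), forms y = Σ T^n x / λ^{n+1} in exact rational arithmetic and checks that (λI − T)y = x. The helper names (apply, resolvent_series, residual) and the sample data are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def apply(t, v):
    """Matrix-vector product T v."""
    return [sum(t[i][j] * v[j] for j in range(len(v))) for i in range(len(t))]


def resolvent_series(t, x, lam, terms):
    """Partial sum of sum_{n < terms} T^n x / lam^(n+1)."""
    lam = Fraction(lam)
    y = [Fraction(0)] * len(x)
    power_x = [Fraction(v) for v in x]  # holds T^n x, starting with n = 0
    for n in range(terms):
        y = [yi + pi / lam ** (n + 1) for yi, pi in zip(y, power_x)]
        power_x = apply(t, power_x)
    return y


def residual(t, x, lam, y):
    """Coordinates of (lam*I - T) y - x; all zero when y solves the equation."""
    ty = apply(t, y)
    return [Fraction(lam) * yi - tyi - Fraction(xi)
            for yi, tyi, xi in zip(y, ty, x)]


# Hypothetical sample: a 3x3 nilpotent Jordan block, so T^3 = 0.
T = [[0, 1, 0],
     [0, 0, 1],
     [0, 0, 0]]
x = [1, 2, 3]
y = resolvent_series(T, x, 2, terms=3)
print(y, residual(T, x, 2, y))
```

Since T^3 = 0 here, three terms already give the exact solution; for a general quasi-nilpotent operator the partial sums only converge to it.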
The chain lengths of λI − T are intimately related to the poles of the resolvent
R(λ, T). If f : D(λ0, δ) \ {λ0} → X, X a Banach space, is an analytic function
defined in the punctured open disc centered at λ0 with values in X, then, by the Laurent
expansion, f can be represented in the form

\[ f(\lambda) = \sum_{k=0}^{\infty} a_k (\lambda-\lambda_0)^k + \sum_{k=1}^{\infty} \frac{b_k}{(\lambda-\lambda_0)^k}, \]

with a_k, b_k ∈ X, and λ ∈ D(λ0, δ) \ {λ0}. The coefficients are given by the formulas

\[ a_k = \frac{1}{2\pi i} \int_{\Gamma} \frac{f(\lambda)}{(\lambda-\lambda_0)^{k+1}}\, d\lambda, \quad\text{and}\quad b_k = \frac{1}{2\pi i} \int_{\Gamma} f(\lambda)(\lambda-\lambda_0)^{k-1}\, d\lambda, \]

where Γ is a positively oriented circle |λ − λ0| = r, with 0 < r < δ, see
Proposition 46.7 of [39] for details. We say that λ0 is a pole of order p if b_p ≠ 0
and b_n = 0 for all n > p.
Let λ0 be an isolated point of σ(T) and let us consider the particular case of the
Laurent expansion of the analytic function R_λ : λ ∈ ρ(T) → (λI − T)^{-1} ∈ L(X)
in a neighborhood of λ0. According to the previous considerations, we have

\[ R_\lambda = \sum_{k=0}^{\infty} Q_k (\lambda-\lambda_0)^k + \sum_{k=1}^{\infty} \frac{P_k}{(\lambda-\lambda_0)^k} \quad\text{with } P_k, Q_k \in L(X), \]

for all 0 < |λ − λ0| < δ. The coefficients are calculated according to the formulas

\[ Q_k = \frac{1}{2\pi i} \int_{\Gamma} \frac{R_\lambda}{(\lambda-\lambda_0)^{k+1}}\, d\lambda, \tag{3} \]

\[ P_k = \frac{1}{2\pi i} \int_{\Gamma} R_\lambda (\lambda-\lambda_0)^{k-1}\, d\lambda, \tag{4} \]

where Γ is a sufficiently small, positively oriented circle around λ0.

Let H(σ(T)) be the set of all complex-valued functions which are locally
analytic on an open set containing σ(T). Suppose that f ∈ H(σ(T)), let Δ(f) be
the domain of f, and let Γ denote a contour in Δ(f) that surrounds σ(T). This
means a positively oriented finite system Γ = {γ1, ..., γm} of closed rectifiable
curves in Δ(f) \ σ(T) such that σ(T) is contained in the inside of Γ and C \ Δ(f)
in the outside of Γ. Then, from the classical functional calculus,

\[ f(T) := \frac{1}{2\pi i} \int_{\Gamma} f(\lambda)(\lambda I - T)^{-1}\, d\lambda \]

is well-defined and does not depend on the particular choice of Γ. It should be
noted that, mutatis mutandis, all the arguments and notions introduced above may be
extended to Banach algebras with unit u: if f ∈ H(σ(a)) is an analytic function
defined on an open set containing σ(a) then

\[ f(a) := \frac{1}{2\pi i} \int_{\Gamma} f(\lambda)(\lambda u - a)^{-1}\, d\lambda \]

is defined in a similar way as we have done for the elements of the Banach algebra
L(X).

In the particular case of functions which are equal to 1 on certain parts of
σ(T) and equal to 0 on others we get idempotent operators. To see this, suppose
that σ is a spectral set (i.e. σ and σ(T) \ σ are both closed) and Δ := Δ1 ∪ Δ2 is an
open covering of σ(T) such that Δ1 ∩ Δ2 = ∅ and σ ⊆ Δ1. Define h(λ) := 1 for λ
in Δ1 and h(λ) := 0 for λ in Δ2. Consider the operator Pσ := h(T). It is easy to
check that Pσ^2 = Pσ, so Pσ is a projection, called the spectral projection associated
with σ, and obviously

\[ P_\sigma = \frac{1}{2\pi i} \int_{\Gamma} (\lambda I - T)^{-1}\, d\lambda, \tag{5} \]

where Γ is a curve enclosing σ which separates σ from the remaining part of
the spectrum.
Let us consider again the case of an isolated point λ0 of σ(T). Then {λ0} is a
spectral set, so we can consider the spectral projection P0 associated with {λ0}. It is
easy to check that if the P_k are defined according to (4) then

P_1 = P0, P_k = (T − λ0 I)^{k−1} P0 (k = 1, 2, ...). (6)

Equation (6) shows that either P_k ≠ 0 for all k, or there exists a natural number p such that
P_k ≠ 0 for k = 1, ..., p but P_k = 0 for k > p. In the second case the isolated point
λ0 is a pole of order p of the resolvent of T.
The spectral sets produce the following decomposition, see [39, §49].
Theorem 2.8 If σ is a spectral set (possibly empty) of T ∈ L(X) then the projection
in (5) generates the decomposition X = Pσ (X) ⊕ ker Pσ . The subspaces Pσ (X)
and ker Pσ are invariant under every f (T ) with f ∈ H(σ (T )); the spectrum of the
restriction T |Pσ (X) is σ and the spectrum of T | ker Pσ is σ (T ) \ σ .
The proof of the following basic result may be found in [39, Proposition 50.2].
Theorem 2.9 Let T ∈ L(X). Then λ0 ∈ σ (T ) is a pole of R(λ, T ) if and only if
0 < p(λ0 I − T ) = q(λ0 I − T ) < ∞. Moreover, if p := p(λ0 I − T ) = q(λ0 I − T )
then p is the order of the pole, every pole λ0 ∈ σ (T ) is an eigenvalue of T , and if
P0 is the spectral projection associated with {λ0 } then

P0(X) = ker(λ0 I − T)^p, ker P0 = (λ0 I − T)^p(X).

In the following result, due to Schmoeger [54], we show that for an isolated point
λ0 of σ (T ) the quasi-nilpotent part H0 (λ0 I −T ) and the analytical core K(λ0 I −T )
may be precisely described as a range or a kernel of a projection.

Theorem 2.10 Let T ∈ L(X), where X is a Banach space, and suppose that λ0
is an isolated point of σ (T ). If P0 is the spectral projection associated with {λ0 },
then:
(i) P0 (X) = H0 (λ0 I − T );
(ii) ker P0 = K(λ0 I − T ). Consequently,

X = H0 (λ0 I − T ) ⊕ K(λ0 I − T ).

In particular, if λ0 is a pole of the resolvent, or equivalently
p := p(λ0 I − T) = q(λ0 I − T) < ∞, then

H0(λ0 I − T) = ker(λ0 I − T)^p,

and

K(λ0 I − T) = (λ0 I − T)^p(X).

Recall that T ∈ L(X) is said to be upper semi-Fredholm, T ∈ Φ+(X), if
α(T) < ∞ and T(X) is closed, while T ∈ L(X) is said to be lower
semi-Fredholm, T ∈ Φ−(X), if β(T) < ∞. The class of Fredholm operators is defined by
Φ(X) := Φ+(X) ∩ Φ−(X), while the class of semi-Fredholm operators is defined
by Φ±(X) := Φ+(X) ∪ Φ−(X). If T ∈ Φ±(X) then the index is defined by
ind(T) := α(T) − β(T). The semi-Fredholm spectrum is defined as the set

σsf (T ) := {λ ∈ C : λI − T is not semi-Fredholm}.

A bounded operator T ∈ L(X), X a Banach space, is said to be a semi-regular

operator if T has closed range T(X) and ker T ⊆ T^n(X) for every n ∈ N.
Elementary examples of semi-regular operators are a surjective operator or an
operator that is bounded below.
Definition 2.11 An operator T ∈ L(X), X a Banach space, is said to admit a
generalized Kato decomposition, abbreviated as GKD, if there exists a pair of
T-invariant closed subspaces (M, N) such that X = M ⊕ N, the restriction T|M is
semi-regular and T|N is quasi-nilpotent. If T|N is assumed to be nilpotent of order
d then T is said to be of Kato type of order d. An operator is said
to be essentially semi-regular if it admits a GKD (M, N) such that N is
finite-dimensional.
Note that if T is essentially semi-regular then T|N is nilpotent, since every
quasi-nilpotent operator on a finite-dimensional space is nilpotent. A celebrated result
of Kato shows that every semi-Fredholm operator is essentially semi-regular, in
particular of Kato type, see Müller [51] for details.

Note that if T is of Kato type then also the dual T′ is of Kato type. More precisely, the
pair (N⊥, M⊥) is a GKD for T′ with T′|N⊥ semi-regular and T′|M⊥ nilpotent, see
Theorem 1.43 of [1].

3 Some Notions of Local Spectral Theory

For many reasons the most satisfactory generalization to the general Banach space
setting of the normal operators on a Hilbert space is the concept of decomposable
operator. In fact the class of these operators possesses a spectral theorem and a rich
lattice structure for which it is possible to develop what is called a local spectral
theory, i.e. a local analysis of their spectra. Decomposability may be defined in
several ways, for instance by means of the concept of glocal spectral subspace.
Definition 3.1 For an arbitrary bounded linear operator T ∈ L(X) on a Banach space X
and a closed subset F of C, the glocal spectral subspace 𝒳_T(F) is defined as
the set of all x ∈ X such that there is an analytic X-valued function f : C \ F → X
for which

(λI − T )f (λ) = x

for all λ ∈ C \ F .
The quasi-nilpotent part may be described as a glocal spectral subspace; indeed, writing
𝒳_T(F) for the glocal spectral subspace of Definition 3.1, we have

H0(λI − T) = 𝒳_T({λ}) for all λ ∈ C,

see [4, Chapter 2].

Another important concept of local spectral theory is that of local spectrum of an
operator T ∈ L(X) at a point x ∈ X.
Definition 3.2 Given an arbitrary operator T ∈ L(X), X a Banach space, let ρT (x)
denote the set of all λ ∈ C for which there exists an open neighborhood Uλ of λ in
C and an analytic function f : Uλ → X such that the equation

(μI − T )f (μ) = x holds for all μ ∈ Uλ . (7)

If the function f is defined on the set ρT (x) then f is called a local resolvent
function of T at x. The set ρT (x) is called the local resolvent of T at x. The local
spectrum σT (x) of T at the point x ∈ X is defined to be the set

σT (x) := C \ ρT (x).

Definition 3.3 For every subset F of C the local spectral subspace of an operator
T ∈ L(X) associated with F ⊆ C is the set

XT (F ) := {x ∈ X : σT (x) ⊆ F }.

It is easily seen that XT(F) is a linear subspace, and that

𝒳_T(F) ⊆ XT(F) for every closed subset F ⊆ C,

where 𝒳_T(F) denotes the glocal spectral subspace of Definition 3.1.
Note that T has SVEP if and only if 𝒳_T(F) = XT(F) for every closed subset
F ⊆ C, see [4, Theorem 2.23]. Obviously, if F1 ⊆ F2 ⊆ C then XT(F1) ⊆ XT(F2)
and

XT(F) = XT(F ∩ σ(T)).

Indeed, XT(F ∩ σ(T)) ⊆ XT(F). Conversely, if x ∈ XT(F) then σT(x) ⊆ F ∩ σ(T),
and hence x ∈ XT(F ∩ σ(T)). Moreover, from the basic properties of the
local spectrum it is easily seen that XλI+T(F) = XT(F \ {λ}).
The analytic core is a local spectral subspace, precisely:

K(T) = XT(C \ {0}) = {x ∈ X : 0 ∉ σT(x)},

see [4, Theorem 2.20].

We now introduce an important property, defined for bounded linear operators
on complex Banach spaces, the so called single-valued extension property (SVEP).
This property dates back to the early days of local spectral theory and has received
a more systematic treatment in the classical texts by Dunford and Schwartz [25], as
well as those by Colojoară and Foiaş [23], by Vasilescu [58] and, more recently, by
Laursen and Neumann [43], and Aiena [1]. The single-valued extension property
has a basic importance in local spectral theory since it is satisfied by a wide variety
of linear bounded operators in the spectral decomposition problem.
Definition 3.4 T ∈ L(X) is said to have the single-valued extension property at
λ0 ∈ C (abbreviated SVEP at λ0), if for every open disc U centered at λ0, the only analytic
function f : U → X which satisfies the equation

(λI − T )f (λ) = 0 for all λ ∈ U

is the function f ≡ 0. An operator T ∈ L(X) is said to have SVEP if T has SVEP

at every point λ ∈ C.
Evidently, both T and its dual T′ have SVEP at the points λ which belong to the boundary
∂σ(T) of the spectrum. In particular both T and T′ have SVEP at the isolated points
of the spectrum. Furthermore, if σap(T) ⊆ ∂σ(T) then T has SVEP, and dually the
inclusion σsu(T) ⊆ ∂σ(T) entails that T′ has SVEP. Furthermore, it is not difficult
to see that

σap(T) does not cluster at λ ⇒ T has SVEP at λ,

and dually,

σsu(T) does not cluster at λ ⇒ T′ has SVEP at λ.

From the equality (1) it easily follows that for Hilbert space operators we have

T′ has SVEP ⇔ T∗ has SVEP.

The proof of the following theorem may be found in [1, Theorem 3.16, Theo-
rem 3.17].
Theorem 3.5 Let T ∈ L(X), X a Banach space and suppose that λI −T is of Kato
type . Then we have:
(i) T has SVEP at λ ⇔ p(λI − T ) < ∞.
(ii) T ∗ has SVEP at λ ⇔ q(λI − T ) < ∞.
In particular the equivalences (i) and (ii) hold for semi-Fredholm operators.
Recall that a bounded operator K ∈ L(X) is said to be algebraic if there exists a
non-constant polynomial h such that h(K) = 0. Trivially, every nilpotent operator
is algebraic and it is well-known that if K^n(X) has finite dimension for some n ∈ N
then K is algebraic. An operator T ∈ L(X) is said to be Riesz if λI − T is Fredholm
for every λ ≠ 0.
The SVEP is also stable under algebraic commuting or Riesz commuting
perturbations, see [5, 6]:
Theorem 3.6 Let T ∈ L(X) and let K ∈ L(X) be an algebraic operator which commutes
with T. If T has SVEP then T + K has SVEP. An analogous result holds for Riesz commuting
perturbations.
Definition 3.7 A bounded operator T is said to be decomposable if, for any open
covering {U1 , U2 } of the complex plane C there are two closed T -invariant subspaces
Y1 and Y2 of X such that Y1 + Y2 = X and σ (T |Yk ) ⊆ Uk for k = 1, 2.
A bounded operator T ∈ L(X) on a Banach space X is said to have the Dunford
property (C) if every glocal spectral subspace is closed. We have

T is decomposable ⇔ T has both property (C) and property (δ),

see [43], where property (δ) means that for every open covering (U, V) of C we
have X = 𝒳_T(Ū) + 𝒳_T(V̄), with 𝒳_T(F) the glocal spectral subspace of Definition 3.1
and Ū, V̄ the closures of U and V. Standard examples of decomposable operators are
normal operators on Hilbert spaces and operators which have totally disconnected
spectra, as for instance compact operators.

An important property in local spectral theory related to property (C) is the
so-called property (β). This property has been introduced by Bishop [16] and is
defined as follows. Let U be an open subset of C and denote by H(U, X) the Fréchet
space of all analytic functions f : U → X with respect to the pointwise vector space
operations and the topology of locally uniform convergence. T ∈ L(X) has Bishop’s
property (β) if for every open U ⊆ C and every sequence (fn) ⊆ H(U, X) for
which (λI − T)fn(λ) converges to 0 uniformly on every compact subset of U, we
also have fn → 0 in H(U, X).
Examples of operators which have property (β) are provided by the weighted
right shifts on ℓ²(N) for which the weight sequence is increasing, see [43]. Note that

property (β) ⇒ property (C) ⇒ SVEP,

and

T is decomposable ⇔ T has both property (β) and property (δ),

see [43].
Let T′ denote the dual of T. Property (β) and property (δ) are dual to each
other, i.e., T ∈ L(X) satisfies (β) (respectively, (δ)) if and only if T′ satisfies (δ)
(respectively, (β)), see [43]. Consequently,

T ∈ L(X) is decomposable ⇔ both T and T′ have property (β).

Examples of operators satisfying property (β) but not decomposable may be found
among multipliers of semi-simple commutative Banach algebras, see [43].
In the sequel, for every set F ⊆ C we set F̄ := {λ̄ : λ ∈ F} and denote by F^cl the closure
of F. In the case of Hilbert space operators the dual T′ may be replaced by the
adjoint T∗. Let x belong to the glocal spectral subspace H_{T∗}(F), for some closed F ⊆ C.
Then there exists an analytic function f : C \ F → H such that (λI − T∗)f(λ) = x for
all λ ∈ C \ F. From (1) we know that U(λI − T∗) = (λ̄I − T′)U, so

Ux = U(λI − T∗)f(λ) = (λ̄I − T′)Uf(λ) for all λ ∈ C \ F.

The function g(λ̄) := Uf(λ) for λ̄ ∈ C \ F̄ is analytic, so Ux ∈ H′_{T′}(F̄). This
shows that U H_{T∗}(F) ⊆ H′_{T′}(F̄). Analogously, it can be shown that H′_{T′}(F̄) ⊆
U H_{T∗}(F) for every closed set F ⊆ C, so U H_{T∗}(F) = H′_{T′}(F̄).
Now, if T∗ has property (δ) then H = H_{T∗}(V^cl) + H_{T∗}(W^cl) for every open cover
{V, W} of C, so

H′ = UH = U H_{T∗}(V^cl) + U H_{T∗}(W^cl) = H′_{T′}(V̄^cl) + H′_{T′}(W̄^cl),

and hence T′ has property (δ). An analogous argument shows that if T′ has property
(δ) then T∗ has property (δ), so we have

T′ has property (δ) ⇔ T∗ has property (δ). (8)

By duality, T′ has property (β) ⇔ T = (T∗)∗ has property (δ), and hence, by (8),
if and only if (T∗)′ has property (δ), from which we conclude that

T′ has property (β) ⇔ T∗ has property (β).

Consequently,

T is decomposable ⇔ T∗ is decomposable.

Lemma 3.8 Let T ∈ L(X), X a Banach space, and λ ∈ ρ(T). Then
λ(λI − T)^{-1}x → x for every x ∈ X as |λ| → +∞.
Proof Fix x ∈ X and define f(λ) := (λI − T)^{-1}x : ρ(T) → X. It is known that
f(λ) → 0 when |λ| → +∞. We have, for every λ ∈ ρ(T),

λ(λI − T)^{-1}x − x = λ(λI − T)^{-1}x − (λI − T)(λI − T)^{-1}x = T(λI − T)^{-1}x = Tf(λ),

hence, since T is bounded, λ(λI − T)^{-1}x − x → 0, so λ(λI − T)^{-1}x → x as |λ| → +∞.
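The convergence in Lemma 3.8 can be observed numerically in the simplest possible setting. The following Python sketch, assuming a 2×2 matrix T and exact rational arithmetic, solves (λI − T)y = x by Cramer's rule and reports the largest coordinate of λy − x for growing real λ. The function names (resolvent_apply, error) and the sample matrix are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def resolvent_apply(t, lam, x):
    """Solve (lam*I - T) y = x for a 2x2 matrix T by Cramer's rule."""
    a, b = lam - t[0][0], -t[0][1]
    c, d = -t[1][0], lam - t[1][1]
    det = a * d - b * c
    if det == 0:
        raise ValueError("lam belongs to the spectrum of T")
    return [(d * x[0] - b * x[1]) / det, (a * x[1] - c * x[0]) / det]


def error(t, lam, x):
    """Largest coordinate, in absolute value, of lam * (lam*I - T)^{-1} x - x."""
    y = resolvent_apply(t, lam, x)
    return max(abs(lam * yi - xi) for yi, xi in zip(y, x))


# Hypothetical sample: an upper triangular matrix with spectrum {1, 3}.
T = [[Fraction(1), Fraction(2)], [Fraction(0), Fraction(3)]]
x = [Fraction(1), Fraction(2)]
for lam in (10, 1000, 10**6):
    print(lam, float(error(T, Fraction(lam), x)))
```

By the identity in the proof, the quantity printed is a norm of T(λI − T)^{-1}x, which decays like 1/|λ|.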

Theorem 3.9 Let T ∈ L(H). Then H0(T∗) ⊆ K(T)⊥.
Proof Let x ∈ H0(T∗) = H_{T∗}({0}) and fix an arbitrary y ∈ K(T). We have to
show that ⟨x, y⟩ = 0. As already observed K(T) = {x : 0 ∈ ρT(x)}, so there exist
two analytic functions f : C \ {0} → H and g : D0 → H, D0 an open disc centered
at 0, such that

(λ̄I − T∗)f(λ̄) = x, λ ∈ C \ {0}, and (λI − T)g(λ) = y, λ ∈ D0.

Both the functions f and g are defined in D0 \ {0}, and for μ ∈ D0 \ {0} we
have

⟨f(μ̄), y⟩ = ⟨f(μ̄), (μI − T)g(μ)⟩ = ⟨(μ̄I − T∗)f(μ̄), g(μ)⟩ = ⟨x, g(μ)⟩.

Define

\[ h(\mu) := \begin{cases} \langle f(\bar\mu), y \rangle & \text{if } \mu \notin D_0, \\ \langle x, g(\mu) \rangle & \text{if } \mu \in D_0. \end{cases} \]

The function h(μ) is well-defined and is analytic on C. Since f(μ̄) = (μ̄I − T∗)^{-1}x
for all μ̄ ∈ ρ(T∗), see [4, Remark 2.11], and f(μ̄) → 0 for |μ̄| → +∞, we have
h(μ) → 0 as |μ| → +∞, so, by the classical Liouville theorem, h ≡ 0 on C. From
Lemma 3.8 we also have μ̄(μ̄I − T∗)^{-1}x → x as |μ| → +∞, μ̄ ∈ ρ(T∗),
hence

\[ \langle x, y \rangle = \lim_{|\mu|\to+\infty} \langle \bar\mu(\bar\mu I - T^*)^{-1} x, y \rangle = \lim_{|\mu|\to+\infty} \langle \bar\mu f(\bar\mu), y \rangle = \lim_{|\mu|\to+\infty} \bar\mu\, h(\mu) = 0, \]

so x ∈ K(T)⊥, as desired.
Next we want to show that if T is decomposable then H0(T∗) = K(T)⊥. To
do this we need some preliminary results. Suppose that M is a closed T-invariant
subspace of a Banach space X and denote by T/M : X/M → X/M the canonical
quotient mapping defined on the quotient X/M by (T/M)(x + M) := Tx + M.
For an open disc D of C centered at 0, let D̄ denote its closure.
Lemma 3.10 Suppose that T ∈ L(X), X a Banach space, is decomposable. If
M := XT(C \ D) then σ(T/M) is contained in D̄.
Proof This follows as a particular case of Theorem 1.2.23, part (b), of [43], by
taking F = C \ D.
If Y is a closed T-invariant subspace, by T|Y we denote the restriction of T to Y.
Lemma 3.11 Suppose that T ∈ L(H) is decomposable. If D is an open disc
centered at 0 then HT(C \ D)⊥ ⊆ HT∗(D̄).
Proof Let M := HT(C \ D). Recall that M is a closed invariant subspace of T,
since a decomposable operator has property (C), while M⊥ is a closed subspace
invariant under T∗. We show first that

σ(T∗|M⊥) ⊆ D̄. (9)

If S : M⊥ → H/M is defined by S(x) = x + M for every x ∈ M⊥, then S is
bijective and an isometry. It is easily seen that S(T∗|M⊥) = (T/M)∗S, so T∗|M⊥
and (T/M)∗ are similar. Therefore,

σ(T∗|M⊥) = σ((T/M)∗) = {λ̄ : λ ∈ σ(T/M)}.

By Lemma 3.10 then

σ(T∗|M⊥) = σ(T∗|HT(C \ D)⊥) ⊆ D̄,

since D̄, being a closed disc centered at 0, is invariant under complex conjugation.
Thus, the inclusion (9) is proved. From part (e) of
[43, Proposition 1.2.16] we then obtain HT(C \ D)⊥ = M⊥ ⊆ HT∗(D̄).
Theorem 3.12 Let T ∈ L(H) be decomposable. Then H0(T∗) = K(T)⊥ and
H0(T) = K(T∗)⊥.
Proof To show the equality H0(T∗) = K(T)⊥, let {Dα}α denote the set of all closed
discs of C centered at 0. Since T∗ is decomposable, and hence has SVEP, the glocal and
local spectral subspaces of T∗ coincide, and we have

\[ H_0(T^*) = H_{T^*}(\{0\}) = \bigcap_{\alpha} H_{T^*}(D_\alpha), \]

see [4, Theorem 2.13, part (iv)]. To show the equality H0(T∗) = K(T)⊥ we need to
prove, by Theorem 3.9, the inclusion K(T)⊥ ⊆ H0(T∗), and for this it suffices to
prove that K(T)⊥ ⊆ HT∗(D̄), where D̄ is the closure of any open disc D centered at 0.
Evidently,

HT(C \ D) ⊆ HT(C \ {0}) = K(T),

so K(T)⊥ ⊆ HT(C \ D)⊥ and HT(C \ D)⊥ ⊆ HT∗(D̄), by Lemma 3.11, so the
proof of the first equality is complete. The second equality is clear, since T, and
hence T∗, is decomposable, so we have

H0(T) = H0((T∗)∗) = K(T∗)⊥.

Remark 3.13 It should be noted that the identity K(T) = H0(T∗)⊥ in general does
not hold even if T is decomposable. For instance, if T ∈ L(H) is a Riesz operator
which has infinite spectrum then T is decomposable, but K(T) is not closed, since
otherwise σ(T) would be finite, see [50]. Hence K(T) ≠ H0(T∗)⊥, since H0(T∗)⊥
is closed.
Corollary 3.14 If T ∈ L(H ) is self-adjoint then H0 (T ) = K(T )⊥ .
Proof T is decomposable and T = T ∗ .

4 Normal Type Operators

Perhaps the most important class of operators in Hilbert spaces is given by the
normal operators defined on a Hilbert space. Recall that if H is a complex infinite-
dimensional Hilbert space a bounded linear operator T ∈ L(H ) is said to be normal
if

T T ∗ = T ∗T (10)

Normal operators have several important spectral properties that will be next
recalled.
(A) Every isolated spectral point of a normal operator T is a simple pole of the
resolvent. If every isolated spectral point of an operator T on a Banach space
is a pole then T is said to be polaroid.

(B) ‖T‖ = r(T), where r(T) denotes the spectral radius of T defined as

r(T) := sup{|λ| : λ ∈ σ(T)}.

An operator T ∈ L(X), X a Banach space, for which ‖T‖ = r(T) is said to be
normaloid.
(C) An operator T ∈ L(X) is said to be Weyl, (T ∈ W(X)), if T ∈ Φ(X) and
ind T = 0. The Weyl spectrum is denoted by σw(T). For a normal operator T
we have

σ(T) \ σw(T) = π00(T), (11)

where

π00(T) := {λ ∈ iso σ(T) : 0 < α(λI − T) < ∞},

i.e., the spectral points for which λI − T ∈ W(X) are exactly the isolated eigenvalues
which have finite multiplicity. An operator T ∈ L(X), X a Banach space, for
which the equality (11) holds is said to satisfy Weyl's theorem.
(D) T is decomposable; in particular both T and T ∗ have SVEP.
Normal operators may be generalized in several ways:
Hyponormal Operators This class of operators on Hilbert spaces is defined
whenever the condition of normality (10) is relaxed to the inequality

T ∗T ≥ T T ∗. (12)

An operator T ∈ L(H) which satisfies (12) is said to be hyponormal. It is easily
seen that T is hyponormal if and only if

‖T∗x‖ ≤ ‖Tx‖ for all x ∈ H.

Indeed, T∗T ≥ TT∗ means that

⟨T∗Tx, x⟩ ≥ ⟨TT∗x, x⟩ for all x ∈ H,

or equivalently

‖T∗x‖² = ⟨T∗x, T∗x⟩ = ⟨TT∗x, x⟩ ≤ ⟨T∗Tx, x⟩ = ⟨Tx, Tx⟩ = ‖Tx‖².

Thus, ‖T∗x‖ ≤ ‖Tx‖.
A routine computation shows that a weighted right shift on the Hilbert space
ℓ²(N) is hyponormal if and only if the corresponding weight sequence is increasing.
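The criterion ‖T∗x‖ ≤ ‖Tx‖ can be checked by hand on finitely supported sequences. The following Python sketch, assuming real weights and real finitely supported vectors stored as lists, implements a weighted right shift and its adjoint and compares the two squared norms. The two weight sequences (w_n = n/(n+1), increasing and bounded, and w_n = 1/n, decreasing) and all function names are hypothetical choices made only for illustration.

```python
from fractions import Fraction


def increasing_weight(n):
    """Hypothetical bounded increasing weights w_n = n / (n + 1), n >= 1."""
    return Fraction(n, n + 1)


def decreasing_weight(n):
    """Hypothetical decreasing weights w_n = 1 / n, n >= 1."""
    return Fraction(1, n)


def shift(x, w):
    """Weighted right shift: (x_1, x_2, ...) -> (0, w_1 x_1, w_2 x_2, ...)."""
    return [Fraction(0)] + [w(n) * xn for n, xn in enumerate(x, start=1)]


def shift_adjoint(x, w):
    """Adjoint for real weights: (x_1, x_2, ...) -> (w_1 x_2, w_2 x_3, ...)."""
    return [w(n) * xn for n, xn in enumerate(x[1:], start=1)]


def norm_sq(x):
    """Squared l^2 norm of a finitely supported real sequence."""
    return sum(abs(v) ** 2 for v in x)


def inner(x, y):
    """Real inner product; missing trailing entries are treated as zeros."""
    return sum(a * b for a, b in zip(x, y))


x = [3, -1, 4, 1, -5]
# Increasing weights: the hyponormality inequality holds for this vector.
print(norm_sq(shift_adjoint(x, increasing_weight)) <= norm_sq(shift(x, increasing_weight)))
# Decreasing weights: the inequality already fails at the second basis vector.
e2 = [0, 1]
print(norm_sq(shift_adjoint(e2, decreasing_weight)), norm_sq(shift(e2, decreasing_weight)))
```

A finite check of this kind cannot prove hyponormality; it only illustrates why monotonicity of the weights is the relevant condition, since ‖Tx‖² = Σ w_n² x_n² while ‖T∗x‖² = Σ w_{n−1}² x_n².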

Other examples of hyponormal operators are the quasi-normal operators, see

Conway [22] or Furuta [33], where T ∈ L(H ) is said to be quasi-normal if

T (T ∗ T ) = (T ∗ T )T .

A very easy example of a quasi-normal operator is given by the unilateral right shift
R on the Hilbert space ℓ²(N). Recall that such an operator is defined by

R(x1, x2, ...) := (0, x1, x2, ...) for all (xn) ∈ ℓ²(N).

The adjoint of R is the left shift L, defined by

L(x1, x2, ...) := (x2, x3, ...) for all (xn) ∈ ℓ²(N),

and obviously R(R∗R) = (R∗R)R. Note that R is not normal, since

RR∗ = RL ≠ R∗R = LR = I.
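The relations between R and L can be verified directly on finitely supported sequences. In the following Python sketch a sequence is represented by the list of its leading entries (all later entries being zero); the function names right_shift and left_shift are hypothetical and chosen only for illustration.

```python
def right_shift(x):
    """R(x_1, x_2, ...) = (0, x_1, x_2, ...)."""
    return [0] + list(x)


def left_shift(x):
    """L(x_1, x_2, ...) = (x_2, x_3, ...)."""
    return list(x[1:])


x = [1, 2, 3]
print(left_shift(right_shift(x)))   # LR x, which equals x
print(right_shift(left_shift(x)))   # RL x, which loses the first coordinate
# Quasi-normality R(R*R) = (R*R)R evaluated at x:
print(right_shift(left_shift(right_shift(x))) == left_shift(right_shift(right_shift(x))))
```

The second line of output differs from x in its first entry, which is exactly the failure of RR∗ = I.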

An operator T ∈ L(H) is said to be subnormal if there exists a normal extension
N, i.e. there exist a Hilbert space K such that H ⊆ K and a normal operator
N ∈ L(K) such that N|H = T. We have

T quasi-normal ⇒ T subnormal ⇒ T hyponormal,

for details see Furuta [33, p. 105]. In the sequel we show some relevant properties
of hyponormal operators.
Theorem 4.1 Let T ∈ L(H ) be hyponormal. Then we have:
(i) λI − T is hyponormal for every λ ∈ C.
(ii) If M is a closed invariant subspace of H then T |M is hyponormal.
Proof
(i) We have

(λI − T)∗(λI − T) − (λI − T)(λI − T)∗ =
(λ̄I − T∗)(λI − T) − (λI − T)(λ̄I − T∗) = T∗T − TT∗ ≥ 0,

thus, λI − T is hyponormal.
(ii) If P_M is the orthogonal projection of H onto M, then (T|M)∗ = (P_M T∗)|M. For every
x ∈ M we then have

‖(T|M)∗x‖ = ‖(P_M T∗)x‖ ≤ ‖P_M‖ ‖T∗x‖ ≤
‖T∗x‖ ≤ ‖Tx‖ = ‖(T|M)x‖,

thus T|M is hyponormal.

Lemma 4.2 Let T ∈ L(H) be a self-adjoint operator such that λI ≤ T for some
λ > 0. Then T is invertible. In particular, if I ≤ T then 0 ≤ T^{-1} ≤ I.
Proof To show the first assertion, observe that by the Schwarz inequality we have

‖Tx‖ ‖x‖ ≥ (Tx, x) ≥ λ‖x‖²,

so ‖Tx‖ ≥ λ‖x‖, and hence T is bounded below. Let y be an element orthogonal to
T(H), that is

0 = (y, Tx) = (Ty, x) for all x ∈ H.

Then Ty = 0 and since T is injective we then have y = 0. Therefore T(H)⊥ = {0},
so T(H) is dense in H; since T(H) is also closed, T is surjective, and thus T is invertible.
To show the second assertion, note that if I ≤ T then T is invertible and T^{-1} is
also positive. Since the product of two commuting positive operators is also positive,
it then follows that

T^{-1}(T − I) = I − T^{-1} ≥ 0,

thus T^{-1} ≤ I.
It is easily seen that if T is self-adjoint then ST S ∗ is also self-adjoint for every
S ∈ L(H ). Moreover, if T is positive then ST S ∗ ≥ 0 for all S ∈ L(H ).
Theorem 4.3 If T ∈ L(H ) is an invertible hyponormal operator then its inverse
T −1 is also hyponormal.
Proof Suppose that T is hyponormal. Then T ∗ T − T T ∗ ≥ 0 and hence, as noted
above, the product

T −1 (T ∗ T − T T ∗ )(T −1 )∗

is positive. From this we obtain

T −1 (T ∗ T )(T −1 )∗ − I ≥ 0,

and hence

T −1 (T ∗ T )(T −1 )∗ ≥ I,

thus, by Lemma 4.2, the product T −1 (T ∗ T )(T −1 )∗ is invertible with

0 ≤ [T −1 (T ∗ T )(T −1 )∗ ]−1 ≤ I.

From the last inequality we then obtain that

S := I − T∗(T^{-1}(T∗)^{-1})T

is positive, so

(T^{-1})∗ S T^{-1} ≥ 0,

from which we easily obtain that

(T^{-1})∗T^{-1} − T^{-1}(T^{-1})∗ ≥ 0.

Hence T^{-1} is hyponormal.
Paranormal Operators An operator T ∈ L(X) on a Banach space X is said to be
paranormal if

‖Tx‖² ≤ ‖T²x‖ for all unit vectors x ∈ X, (13)

or, equivalently, ‖Tx‖² ≤ ‖T²x‖ ‖x‖ for all x ∈ X.
Evidently, the restriction T|M of a paranormal operator T ∈ L(X) to a closed
invariant subspace M is paranormal. Moreover, any scalar multiple, and the inverse
(if it exists) of a paranormal operator, is paranormal.
Theorem 4.4 If T ∈ L(X) is paranormal then we have:
(i) Every power T n is paranormal.
(ii) T is normaloid.
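Proof
(i) Observe that from the definition (13) we have

\[ \frac{\|T^{k+1}x\|}{\|T^k x\|} \le \frac{\|T^{k+2}x\|}{\|T^{k+1}x\|}, \]

from which we obtain

\[ \frac{\|T^n x\|}{\|x\|} = \frac{\|Tx\|}{\|x\|}\cdot\frac{\|T^2x\|}{\|Tx\|}\cdots\frac{\|T^n x\|}{\|T^{n-1}x\|} \le \frac{\|T^{n+1}x\|}{\|T^n x\|}\cdot\frac{\|T^{n+2}x\|}{\|T^{n+1}x\|}\cdots\frac{\|T^{2n}x\|}{\|T^{2n-1}x\|} = \frac{\|T^{2n}x\|}{\|T^n x\|}. \]

Therefore, ‖T^n x‖² ≤ ‖(T^n)²x‖ ‖x‖.
(ii) For every paranormal operator we have

‖Tx‖² ≤ ‖T²x‖ ‖x‖ ≤ ‖T²‖ ‖x‖²,

thus ‖T‖² = ‖T²‖. Since T^n is paranormal then ‖T^{2^n}‖ = ‖T‖^{2^n} for every
n ∈ N. Hence

\[ r(T) = \lim_{n\to\infty} \|T^{2^n}\|^{1/2^n} = \|T\|. \]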

Definition 4.5 Given a class of operators L ⊆ L(X), an operator T is said to be
algebraically L if there exists a non-trivial polynomial h for which h(T) belongs
to L. T is said to be analytically L if there exists an analytic function h such that
h(T) belongs to L, where h is defined on an open neighborhood of σ(T) and is
non-constant on each of the components of its domain.
Since every paranormal operator T ∈ L(X) is normaloid, we have

T quasi-nilpotent paranormal ⇒ T = 0. (14)

Recall that an invertible operator T ∈ L(X) is said to be doubly power-bounded if
sup{‖T^n‖ : n ∈ Z} < ∞. Algebraically paranormal operators have been studied in [2].
Theorem 4.6 Suppose that T ∈ L(X) is algebraically paranormal and quasi-
nilpotent. Then T is nilpotent.
Proof Suppose that h is a polynomial for which h(T) is paranormal. From the
spectral mapping theorem we have

σ(h(T)) = h(σ(T)) = {h(0)}.

We claim that h(T) = h(0)I. To see that let us consider the two possibilities:
h(0) = 0 or h(0) ≠ 0.
If h(0) = 0 then h(T) is quasi-nilpotent, so from the implication (14), we deduce
that h(T) = 0, hence the equality h(T) = h(0)I trivially holds.
Suppose the other case h(0) ≠ 0, and set h1(T) := (1/h(0)) h(T). Clearly, h1(T) has
spectrum {1} and ‖h1(T)‖ = 1. Moreover, h1(T) is invertible and also its inverse
h1(T)^{-1} has norm 1. The operator h1(T) is then doubly power-bounded and by
a classical theorem due to Gelfand, see [43, Theorem 1.5.14] for a proof, it then
follows that h1(T) = I, and hence h(T) = h(0)I, as claimed.
Now, from the equality h(0)I − h(T) = 0, we see that there exist some natural numbers
m, n ∈ N and μ ∈ C for which

\[ 0 = h(0)I - h(T) = \mu\, T^m \prod_{i=1}^{n} (\lambda_i I - T) \quad\text{with } \lambda_i \neq 0, \]

where all λ_i I − T are invertible. This obviously implies that T^m = 0, so T is
nilpotent.

Theorem 4.7 Every algebraically paranormal operator T ∈ L(X), X a Banach
space, has SVEP.
Proof We show first the SVEP for paranormal operators. If λ ≠ 0 and λ ≠ μ then,
by Theorem 2.6 of [20], we have

‖x + y‖ ≥ ‖y‖ whenever x ∈ ker(μI − T), y ∈ ker(λI − T).

It then follows that if U is an open disc and f : U → X is an analytic function such
that 0 ≠ f(z) ∈ ker(zI − T) for all z ∈ U, then f fails to be continuous at every
0 ≠ λ ∈ U. Finally, if T is algebraically paranormal then h(T) is paranormal for
some non-trivial polynomial h, and hence h(T) has SVEP. This implies that T has
SVEP, see [4, Corollary 2.89].
Theorem 4.8 If T ∈ L(X) is algebraically paranormal then every isolated point
of the spectrum σ (T ) is a pole of the resolvent; i.e. T is polaroid.
Proof For every isolated point λ of σ(T) we have p(λI − T) = q(λI − T) < ∞.
Indeed, let λ be an isolated point of σ(T). If M := K(λI − T) and N := H0(λI − T),
then X = H0(λI − T) ⊕ K(λI − T), by Theorem 2.10. Furthermore,
σ(T|N) = {λ}, while σ(T|M) = σ(T) \ {λ}, so the restriction λI − T|N is
quasi-nilpotent and λI − T|M is invertible. Since λI − T|N is algebraically paranormal
then Lemma 5.2 implies that λI − T|N is nilpotent. In other words, λI − T is an
operator of Kato type.
Now, both T and its dual T∗ have SVEP at λ, since λ is isolated in σ(T) =
σ(T∗), and this implies, by Theorem 3.5, that both p(λI − T) and q(λI − T) are
finite. Therefore, λ is a pole of the resolvent.

The property of being paranormal is not translation-invariant, see Chō and J. I.

Lee [18]. An operator T ∈ L(X) is called totally paranormal if λI −T is paranormal
for all λ ∈ C.
Theorem 4.9 Every hyponormal operator T ∈ L(H ) is totally paranormal.
Proof To show that T is totally paranormal it suffices, by Theorem 4.1, to
prove that every hyponormal operator is paranormal. Since T is hyponormal we
have, for every x ∈ H,

‖Tx‖² = (Tx, Tx) = (T∗Tx, x) ≤ ‖T∗(Tx)‖ ‖x‖
≤ ‖T(Tx)‖ ‖x‖ = ‖T²x‖ ‖x‖.

Taking ‖x‖ = 1 we then have ‖Tx‖² ≤ ‖T²x‖, so T is paranormal.

Remark 4.10 In [33, p. 113] it is shown that there exists a hyponormal operator T
for which T 2 is not hyponormal. Since every hyponormal operator is paranormal,
then T is paranormal and hence, by Theorem 4.4, T 2 is paranormal. Therefore T 2
provides an example of operator which is paranormal, but not hyponormal.

Theorem 4.11 For every totally paranormal operator T ∈ L(X) we have

H0 (λI − T ) = ker (λI − T ) for every λ ∈ C. (15)

Proof In fact, if x ∈ H0 (λI − T ), where we may assume ‖x‖ = 1, then
‖(λI − T )^n x‖^{1/n} → 0 and, since T is totally paranormal,

\[ \|(\lambda I - T)^n x\|^{1/n} \ge \|(\lambda I - T)x\|. \]

Therefore, H0 (λI − T ) ⊆ ker(λI − T ), and since the reverse inclusion holds for
every operator, then we have H0 (λI − T ) = ker(λI − T ).
Equation (15) has a remarkable consequence. To see this, observe that if λ ∈
iso σ (T ), where iso K denotes the isolated points of a set K ⊆ C, then by
Theorem 2.10, we have

H = H0 (λI − T ) ⊕ K(λI − T ) = ker (λI − T ) ⊕ K(λI − T ),

and hence

(λI − T )(H ) = (λI − T )(K(λI − T )) = K(λI − T ).

Therefore,

H = ker (λI − T ) ⊕ (λI − T )(H ).

By Theorem 2.4 then p(λI − T ) = q(λI − T ) = 1, so λ is a simple pole of the

resolvent.
Corollary 4.12 Every isolated point of the spectrum of a totally paranormal
operator is a simple pole of the resolvent.
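As a quick sanity check of equality (15) and of Corollary 4.12, the following worked example may help. It is not taken from the source: the matrix N is chosen here purely for illustration, on X = C² with the Euclidean norm. It shows a nilpotent matrix for which (15) fails, so that N cannot be totally paranormal.

```latex
% Illustrative example (an assumption of this sketch, not from the text).
\[
N = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad N^2 = 0 .
\]
% N is nilpotent, so \|N^n x\|^{1/n} \to 0 for every x, and therefore
\[
H_0(N) = \mathbb{C}^2 \neq \ker N = \operatorname{span}\{e_1\}.
\]
% Directly: for the unit vector e_2 we have N e_2 = e_1, hence
\[
\|N e_2\|^2 = 1 > 0 = \|N^2 e_2\|,
\]
% so the paranormal inequality \|Nx\|^2 \le \|N^2 x\| fails at x = e_2.
% Consistently, (\lambda I - N)^{-1} = \lambda^{-1} I + \lambda^{-2} N,
% so 0 is a pole of order 2 of the resolvent, not a simple pole.
```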
An operator T ∈ L(X) is said to be hereditarily polaroid if every restriction
T |M on a closed invariant subspace is polaroid. From Lemma 4.1 we know that
the restriction T |M on a closed invariant subspace of a hyponormal operator is
hyponormal too, so, by Theorem 4.9 and Corollary 4.12 we deduce:
Corollary 4.13 Every hyponormal operator is hereditarily polaroid.
If T ∈ L(X) and S ∈ L(Y ), a quasi-affinity is an injective operator A ∈ L(X, Y )
with dense range for which SA = AT . It is easily seen that the property of
being hereditarily polaroid is similarity invariant, but is not preserved by a
quasi-affinity. We now want to show that every
hereditarily polaroid operator has SVEP. First we need to introduce two concepts of
orthogonality on Banach spaces.
Normal Operators and their Generalizations 377

Definition 4.14 A closed subspace M of a Banach space X is said to be orthogonal
to a closed subspace N of X in the sense of Birkhoff and James, in symbols M ⊥ N,
if ‖x‖ ≤ ‖x + y‖ for all x ∈ M and y ∈ N.
A study of this concept of orthogonality may be found in [25]. Note that this
concept of orthogonality is asymmetric and reduces to the usual definition of
orthogonality in the case of Hilbert spaces. This concept of orthogonality may be
weakened as follows:
Definition 4.15 A closed subspace M of a Banach space X is said to be approximately
orthogonal to a closed subspace N of X, in symbols M ⊥a N, if there exists a
scalar α ≥ 1 such that ‖x‖ ≤ α‖x + y‖ for all x ∈ M and y ∈ N.
What M ⊥a N means is that M meets N at an angle θ , 0 ≤ θ ≤ π/2, where by
definition

\[ \sin\theta = \inf\{\|x - y\| : x \in M,\ y \in N,\ \|y\| = 1\}. \]

If θ = π/2, then M is orthogonal to N in the Birkhoff-James sense. If M meets N at
an angle θ > 0 then N meets M at an angle φ > 0, where in general θ ≠ φ.
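The asymmetry of Birkhoff-James orthogonality, and the way approximate orthogonality weakens it, can be seen in a small example. The sketch below is an illustration added here, not part of the source; the space (R² with the max norm) and the vectors u, v are assumptions chosen for convenience.

```latex
% Illustration in X = \mathbb{R}^2 with \|(a,b)\|_\infty = \max\{|a|,|b|\}.
% Take M = \operatorname{span}\{u\}, N = \operatorname{span}\{v\}, u = (1,1), v = (1,0).
\[
\|u + t v\|_\infty = \max\{|1+t|,\,1\} \ge 1 = \|u\|_\infty \qquad (t \in \mathbb{R}),
\]
% so, by homogeneity, \|x\| \le \|x + y\| for all x \in M, y \in N: M \perp N.
\[
\|v + t u\|_\infty = \max\{|1+t|,\,|t|\} = \tfrac12 < 1 = \|v\|_\infty
\qquad \text{at } t = -\tfrac12 ,
\]
% so N is NOT orthogonal to M. However, \min_t \max\{|1+t|,|t|\} = \tfrac12, hence
\[
\|v\|_\infty \le 2\,\|v + t u\|_\infty \qquad (t \in \mathbb{R}),
\]
% i.e. N \perp_a M with \alpha = 2 in the sense of Definition 4.15.
```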
Theorem 4.16 Every hereditarily polaroid operator T ∈ L(X) has SVEP.
Proof Let T be hereditarily polaroid. For distinct eigenvalues λ and μ of T , let
M denote the subspace generated by ker (λI − T ) and ker (μI − T ). Set S :=
T |M. Then S is polaroid and σ (S) = {λ, μ}. Denote by Pμ the spectral projection
corresponding to the spectral set {μ}. Then

Pμ (M) = ker (μI − S) = ker (μI − T ),

while

ker Pμ = (I − Pμ )(M) = ker (λI − S) = ker(λI − T ).

Set α := ‖Pμ‖. Then α ≥ 1, and

\[ \|x\| = \|P_\mu x\| = \|P_\mu(x - y)\| \le \alpha\|x - y\| \]

for all x ∈ Pμ (M) = ker (μI − T ) and y ∈ (I − Pμ )(M) = ker (λI − T ).

Now, suppose that T does not have SVEP at a point δ0 ∈ C. Then there exists an
open disc D0 centered at δ0 and a non-trivial analytic function f : D0 → X such
that

f (δ) ∈ ker (δI − T ) for all δ ∈ D0 .


Let λ ∈ D0 and μ ∈ D0 be two distinct complex numbers such that f (λ) and f (μ)
are non-zero. Since ker (μI − T ) ⊥a ker (λI − T ), then

\[ 0 < \|f(\mu)\| \le \alpha\|f(\mu) - f(\lambda)\|. \]

But then f is not continuous at μ, a contradiction. Hence T has SVEP.

We have seen that T ∗ is polaroid if and only if T is polaroid. An immediate
consequence of Theorem 4.16 is that this equivalence in general does not hold
for hereditarily polaroid operators. Indeed, the right shift R is trivially hereditarily
polaroid while its dual, the left shift L, cannot be hereditarily polaroid, since it does
not have SVEP.
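The failure of SVEP for the left shift can be made explicit. The following is a standard computation written out as an illustration; the setting (L acting on ℓ²(N)) is an assumption of this sketch, since the text does not fix the underlying sequence space here.

```latex
% L is the left shift on \ell^2(\mathbb{N}): L(x_1, x_2, x_3, \dots) = (x_2, x_3, \dots).
\[
f(\lambda) := (1, \lambda, \lambda^2, \lambda^3, \dots), \qquad |\lambda| < 1 .
\]
% f(\lambda) \in \ell^2 because \sum_n |\lambda|^{2n} < \infty, and f is analytic
% on the open unit disc (it is given by a convergent power series in \lambda).
\[
L f(\lambda) = (\lambda, \lambda^2, \lambda^3, \dots) = \lambda f(\lambda)
\;\Longrightarrow\; (\lambda I - L) f(\lambda) = 0 .
\]
% Since f(\lambda) \neq 0 for every |\lambda| < 1, L does not have SVEP
% at any point of the open unit disc.
```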
A class of hereditarily polaroid operators may be defined by extending the
property (15) observed in Theorem 4.11. Consider a function p : λ ∈ C →
p(λ) ∈ N.
Definition 4.17 An operator T ∈ L(X), X a Banach space, is said to be an H (p)-
operator if

\[ H_0(\lambda I - T) = \ker(\lambda I - T)^{p(\lambda)} \quad \text{for every } \lambda \in \mathbb{C}. \]

Obviously, every hyponormal operator is H (1), where 1 is the constant function
1(λ) = 1.
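To see that the exponent p(λ) may genuinely exceed 1, consider a Jordan block. This small example is added for illustration only and is not taken from the source; the scalar λ0 and the matrix J are hypothetical.

```latex
% Illustration on X = \mathbb{C}^2, with a fixed \lambda_0 \in \mathbb{C}.
\[
J = \begin{pmatrix} \lambda_0 & 1 \\ 0 & \lambda_0 \end{pmatrix}, \qquad
(\lambda_0 I - J)^2 = 0 .
\]
% At \lambda = \lambda_0 the operator \lambda_0 I - J is nilpotent of order 2:
\[
H_0(\lambda_0 I - J) = \mathbb{C}^2 = \ker(\lambda_0 I - J)^2
\neq \ker(\lambda_0 I - J) = \operatorname{span}\{e_1\}.
\]
% For \lambda \neq \lambda_0 the operator \lambda I - J is invertible, so
\[
H_0(\lambda I - J) = \{0\} = \ker(\lambda I - J).
\]
% Hence J is H(p) with p(\lambda_0) = 2 and p(\lambda) = 1 otherwise, but J is not H(1).
```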
The property H (p) is inherited by the restrictions on closed invariant subspaces:
Theorem 4.18 Let T ∈ L(X) be a bounded operator on a Banach space X. If T
has the property H (p) and Y is a closed T -invariant subspace of X then T |Y has
the property H (p).
Proof If $H_0(\lambda I - T) = \ker(\lambda I - T)^{p(\lambda)}$ then

\[ H_0((\lambda I - T)|Y) \subseteq \ker(\lambda I - T)^{p(\lambda)} \cap Y = \ker((\lambda I - T)|Y)^{p(\lambda)}, \]

from which we obtain $H_0((\lambda I - T)|Y) = \ker((\lambda I - T)|Y)^{p(\lambda)}$.

Theorem 4.19 Every H (p)-operator T is hereditarily polaroid.
Proof By Theorem 4.18 it suffices to prove that an H (p)-operator T is polaroid. If
λ ∈ iso σ (T ), then by Theorem 2.10, we have

\[ H = H_0(\lambda I - T) \oplus K(\lambda I - T) = \ker(\lambda I - T)^{p(\lambda)} \oplus K(\lambda I - T), \]

and hence

\[ (\lambda I - T)^{p(\lambda)}(H) = (\lambda I - T)^{p(\lambda)}(K(\lambda I - T)) = K(\lambda I - T). \]


Therefore,

\[ H = \ker(\lambda I - T)^{p(\lambda)} \oplus (\lambda I - T)^{p(\lambda)}(H). \]

By Theorem 2.4 then p(λI − T ) = q(λI − T ) = p(λ), so λ is a pole of the

resolvent.
The class of H (p)-operators is very large. To see this, we first introduce a special
class of operators which has an important role in local spectral theory. Let C ∞ (C)
denote the Fréchet algebra of all infinitely differentiable complex-valued functions
on C.
Definition 4.20 An operator T ∈ L(X), X a Banach space, is said to be generalized
scalar if there exists a continuous algebra homomorphism $\Phi : C^\infty(\mathbb{C}) \to L(X)$
such that

\[ \Phi(1) = I \quad\text{and}\quad \Phi(Z) = T, \]

where Z denotes the identity function on C.

The interested reader can find a well-organized treatment of generalized scalar
operators in Laursen and Neumann [43, Section 1.5]. It should be noted that:
(a) every quasi-nilpotent generalized scalar operator is nilpotent, [43, Proposi-
tion 1.5.10].
(b) An operator similar to a restriction of a generalized scalar operator to one of its
closed invariant subspaces is called subscalar.
(c) If T is generalized scalar then T is decomposable and hence has Dunford
property (C), from which it follows that H0 (λI − T ) = XT ({λ}) is closed,
see [43, Theorem 1.5.4 and Proposition 1.4.3], or [4, Chapter 4].
By an important result due to Putinar [52], every hyponormal operator is similar
to a subscalar operator, see also [43, Section 2.4]; hence any hyponormal operator is
decomposable.
Theorem 4.21 Every generalized scalar, as well as every subscalar operator, T ∈
L(X) is H (p). Consequently, every generalized scalar and every subscalar operator
is hereditarily polaroid.
Proof By Theorem 4.18 and Theorem 4.22 we may assume that T is generalized
scalar. Consider a continuous algebra homomorphism $\Phi : C^\infty(\mathbb{C}) \to L(X)$ such
that Φ(1) = I and Φ(Z) = T . Let λ ∈ C. Since every generalized scalar operator
has the property (C), H0 (λI − T ) is closed. On the other hand, if $f \in C^\infty(\mathbb{C})$
then

\[ \Phi(f)(H_0(\lambda I - T)) \subseteq H_0(\lambda I - T), \]


because T = Φ(Z) commutes with Φ(f ). Define $\tilde{\Phi} : C^\infty(\mathbb{C}) \to L(H_0(\lambda I - T))$ by

\[ \tilde{\Phi}(f) := \Phi(f)|H_0(\lambda I - T) \quad \text{for every } f \in C^\infty(\mathbb{C}). \]

Clearly, (λI − T )|H0 (λI − T ) is generalized scalar and quasi-nilpotent, so it is nilpotent.
Thus there exists p ≥ 1 for which $H_0(\lambda I - T) = \ker(\lambda I - T)^p$.
Two operators T ∈ L(X), S ∈ L(Y ), X and Y Banach spaces, are said to be
intertwined by A ∈ L(X, Y ) if SA = AT ; and A is said to be a quasi-affinity if it
has a trivial kernel and dense range. If T and S are intertwined by a quasi-affinity
then T is called a quasi-affine transform of S, and we write T ≺ S. If both T ≺ S
and S ≺ T hold then T and S are said to be quasi-similar.
The next result shows that property H (p) is preserved by quasi-affine transforms.
Theorem 4.22 Suppose that S ∈ L(Y ) has property H (p) and T ≺ S. Then T
has property H (p).
Proof Suppose S has property H (1), SA = AT , with A injective. If λ ∈ C and
x ∈ H0 (λI − T ) then

\[ \|(\lambda I - S)^n Ax\|^{1/n} = \|A(\lambda I - T)^n x\|^{1/n} \le \|A\|^{1/n}\,\|(\lambda I - T)^n x\|^{1/n}, \]

from which it follows that Ax ∈ H0 (λI − S) = ker (λI − S). Hence A(λI − T )x =
(λI − S)Ax = 0 and, since A is injective, this implies that (λI − T )x = 0, i.e.
x ∈ ker (λI − T ). Therefore H0 (λI − T ) = ker (λI − T ) for all λ ∈ C.
The more general case of H (p)-operators is proved by using a similar argument.
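For the reader's convenience, the "similar argument" for a general function p can be sketched as follows. This is a reconstruction under the same hypotheses as the proof (SA = AT with A injective), not a quotation of the author's argument; p abbreviates p(λ).

```latex
% Sketch: S has property H(p), SA = AT, A injective, p := p(\lambda).
% Step 1. The norm estimate in the proof does not use p, so x \in H_0(\lambda I - T) gives
\[
Ax \in H_0(\lambda I - S) = \ker(\lambda I - S)^{p}.
\]
% Step 2. The intertwining relation passes to powers:
% (\lambda I - S)^{p} A = A (\lambda I - T)^{p}, hence
\[
A(\lambda I - T)^{p} x = (\lambda I - S)^{p} A x = 0 .
\]
% Step 3. Since A is injective,
\[
(\lambda I - T)^{p} x = 0, \quad\text{i.e.}\quad
H_0(\lambda I - T) \subseteq \ker(\lambda I - T)^{p}.
\]
% The reverse inclusion holds for every operator, so T has property H(p).
```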

Log-Hyponormal Operators In order to introduce a new class of operators we
first need to recall some basic facts. For any positive operator T ∈ L(H ) there exists
a unique positive operator S such that S² = T . The operator S is called the square root
of T and is denoted by $T^{1/2}$. Let M be a closed subspace of H . Then H = M ⊕ M⊥ ,
where M⊥ is the orthogonal complement of M, i.e.,

\[ M^\perp := \{y \in H : (x, y) = 0 \text{ for all } x \in M\}. \]

The projection PM of H onto M along M ⊥ is called the orthogonal projection from

H onto M. Recall that a projection P is orthogonal if and only if P is self-adjoint.
Every orthogonal projection PM has norm equal to 1, moreover

0 ≤ PM ≤ I,

see Furuta [33].

An operator U ∈ L(H ) is said to be a partial isometry if there exists a closed
subspace M such that

\[ \|Ux\| = \|x\| \ \text{for any } x \in M, \quad\text{and}\quad Ux = 0 \ \text{for any } x \in M^\perp. \]

The subspace M is said to be the initial space of U , while the range N := U (H )

is said to be the final space of U . Evidently, U is an isometry if and only if U is a
partial isometry and M = H , while U is unitary if and only if U is a partial isometry
and M = N = H , see [33] .
Theorem 4.23 Let U ∈ L(H ) be a partial isometry with initial space M and final
space N. Then we have
(i) U PM = U and U ∗ U = PM .
(ii) N is a closed subspace of H .
(iii) The adjoint U ∗ is a partial isometry with initial space N and final space M.
Note that an operator U ∈ L(H ) is a partial isometry if and only if U ∗ is a partial
isometry, and in this case U U ∗ and U ∗ U are projections. Set $|T| := (T^*T)^{1/2}$. It is
easily seen that ker T = ker |T |.
Theorem 4.24 (Polar Decomposition) For every T ∈ L(H ) there exists a partial
isometry U such that T = U |T |. The initial space of U is $M := \overline{|T|(H)} = \overline{T^*(H)}$,
the final space is $N := \overline{T(H)}$. Moreover,

ker U = ker |T | and U ∗ U |T | = |T |.

If U is as in Theorem 4.24 the product T = U |T | is called the polar

decomposition of T . The partial isometry U is uniquely determined. If T = U |T | is
the polar decomposition of T then T ∗ = U ∗ |T ∗ | is the polar decomposition of T ∗ .
Some important properties are transmitted from T to U , for instance if T is normal
then U is normal, if T is self-adjoint then U is self-adjoint, if T is positive then U
is positive.
For T ∈ L(H ) let T = W |T | be the polar decomposition of T . The operator
defined by Aluthge in [11] as

\[ R := |T|^{1/2}\, W\, |T|^{1/2} \]

is said to be the Aluthge transform of T .
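A small worked example may clarify how the polar decomposition and the Aluthge transform interact. The matrix below is chosen here for illustration and does not appear in the source.

```latex
% Illustration on H = \mathbb{C}^2: T e_1 = e_2, T e_2 = 0.
\[
T = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \qquad
T^*T = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad
|T| = (T^*T)^{1/2} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = |T|^{1/2}.
\]
% W := T is a partial isometry with initial space \operatorname{span}\{e_1\}
% and \ker W = \ker |T| = \operatorname{span}\{e_2\}; moreover
\[
W|T| = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}
       \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}
     = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} = T ,
\]
% so T = W|T| is the polar decomposition. The Aluthge transform is
\[
R = |T|^{1/2}\, W\, |T|^{1/2}
  = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}
    \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}
    \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}
  = 0 .
\]
% Here T is not normal (T^*T \neq TT^*), while its Aluthge transform R = 0 is normal.
```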

Definition 4.25 An operator T ∈ L(H ) is said to be log-hyponormal if T is
invertible and satisfies

log (T ∗ T ) ≥ log (T T ∗ ).

If R = V |R| is the polar decomposition of R, define

\[ T_1 := |R|^{1/2}\, V\, |R|^{1/2}. \]

If T is log-hyponormal then T1 is hyponormal and $T = K T_1 K^{-1}$, where
$K := |R|^{1/2}|T|^{1/2}$, see Tanahashi [56] and M. Chō, Jeon and J. I. Lee [19]. Hence T
is similar to a hyponormal operator and therefore, by Theorem 4.22, has property
H (1).
p-Hyponormal Operators An operator T ∈ L(H ) is said to be p-hyponormal,
with 0 < p ≤ 1, if

\[ (T^*T)^p \ge (TT^*)^p. \]

If p = 1/2, T is said to be semi-hyponormal. The class of p-hyponormal
operators has been studied by Aluthge [11], while semi-hyponormal operators
have been introduced by Xia [61]. Any p-hyponormal operator is q-hyponormal
if q < p, but there are examples to show that the converse is not true, see [11].
Every invertible p-hyponormal operator is subscalar, see Ko [41], and is quasi-similar to
a log-hyponormal operator. Consequently, by Theorem 4.22, every invertible
p-hyponormal operator has property H (1). This is also true for p-hyponormal
operators which are not invertible, see Duggal and Jeon [30]. Every p-hyponormal
operator is paranormal, see [12] or [17].
M-Hyponormal Operators Recall that T ∈ L(H ) is said to be M-hyponormal if
there exists M > 0 such that

T T ∗ ≤ MT ∗ T .

Every M-hyponormal operator is subscalar ([43, Proposition 2.4.9]) and hence

H (p).
w-Hyponormal Operators If T ∈ L(H ) and T = U |T | is the polar decomposi-
tion, define
\[ \hat{T} := |T|^{1/2}\, U\, |T|^{1/2}. \]

T ∈ L(H ) is said to be w-hyponormal if

|T̂ | ≥ |T | ≥ |T̂ ∗ |.

Examples of w-hyponormal operators are p-hyponormal operators and log-

hyponormal operators. All w-hyponormal operators are subscalar, together with
their Aluthge transforms, see M. Chō, H. Jeon, and J. I. Lee [46], and hence
H (p). In [38, Theorem 2.5] it is shown that for every isolated point λ of the

spectrum of a w-hyponormal operator T we have H0 (λI − T ) = ker(λI − T ) and

hence λ is a simple pole of the resolvent.
p-Quasihyponormal Operators A Hilbert space operator T ∈ L(H ) is said to be
p-quasihyponormal for some 0 < p ≤ 1 if

T ∗ |T ∗ |2p T ≤ T ∗ |T |2p T .

Every p-quasihyponormal operator is paranormal [45].

Class A Operators An operator T ∈ L(H ) is said to be a class A operator if

\[ |T^2| \ge |T|^2. \]

Every log-hyponormal operator is a class A operator [34], but the converse is not true,
see [33, p. 176]. Every class A operator is paranormal (an example of a paranormal
operator which is not a class A operator can be found in [33, p. 177]). Therefore
every class A operator, as well as every algebraically class A operator, is polaroid.
Quasi-Class A Operators An operator T ∈ L(H ) is said to be a quasi-class A
operator if $T^*|T^2|T \ge T^*|T|^2T$. The class of quasi-class A operators contains the class of
all p-quasinormal operators and the class of all class A operators. An example of a
quasi-class A operator which is not paranormal is given in [31]. Every quasi-class
A operator has SVEP, since p(λI − T ) ≤ 1 for all λ ∈ C, while every non-zero
isolated point λ0 of the spectrum is a simple pole of the resolvent of T and H0 (λ0 I − T ) = ker(λ0 I − T ),
see [31]. It has been observed in [31, Example 0.2] that a quasi-class A operator need
not be normaloid.
∗-Paranormal Operators A bounded operator T ∈ L(H ) is said to be
∗-paranormal if ‖T ∗ x‖² ≤ ‖T ² x‖ for every unit vector x ∈ H . Paranormality
is independent of ∗-paranormality and, evidently, hyponormal operators are both
paranormal and ∗-paranormal. It is known [13] that

\[ T \text{ is } *\text{-paranormal} \iff T^{*2}T^2 - 2\lambda TT^* + \lambda^2 \ge 0 \ \text{ for each } \lambda > 0. \]

Every ∗-paranormal operator T is normaloid. Moreover, $\ker(\lambda I - T) \subseteq \ker(\bar{\lambda} I - T^*)$
for all λ ∈ C, from which it easily follows that p(λI − T ) < ∞ for all λ ∈ C,
thus T has SVEP.
Totally ∗-Paranormal T ∈ L(H ) is said to be totally ∗-paranormal if λI − T
is ∗-paranormal for every λ ∈ C. An example of a ∗-paranormal operator
which is not totally ∗-paranormal may be found in [37]. It is not known to the
author if every totally ∗-paranormal operator has property (C). Observe that every
totally algebraically ∗-paranormal operator is H (1) and hence hereditarily polaroid.
Indeed, μI − T is normaloid for all μ ∈ C, so

\[ \|(\lambda I - T)x\| \le \|(\lambda I - T)^n x\|^{1/n} \quad \text{for all } x \in X,\ \lambda \in \mathbb{C}, \]

so that H0 (λI − T ) ⊆ ker(λI − T ) for all λ ∈ C.

The class of p-quasihyponormal may be extended as follows:
(p,k)-Quasihyponormal Operators An operator T ∈ L(H ) is said to be (p,k)-
quasihyponormal for some 0 < p ≤ 1 and k ∈ N if

\[ T^{*k}|T^*|^{2p}T^k \le T^{*k}|T|^{2p}T^k. \]

Evidently,
1. a (1, 1)-quasihyponormal operator is quasihyponormal;
2. a (p, 1)-quasihyponormal operator is p-quasihyponormal;
3. a (p, 0)-quasihyponormal operator is p-hyponormal if 0 < p < 1 and
hyponormal if p = 1.
In [57, Theorem 6] it has been proved that every (p, k)-quasihyponormal
operator T ∈ L(H ) is hereditarily polaroid. Moreover, every isolated point λ ≠ 0
of the spectrum is a simple pole of the resolvent.
It should be noted that the class of totally ∗-paranormal operators, as well as
the class of M-hyponormal operators, is independent of the class of
(p, k)-quasihyponormal operators.

5 Totally Hereditarily Normaloid Operators

We have already observed that a normal operator is normaloid, i.e. ‖T ‖ = r(T ).
Since the restriction of a normal operator is still normal, this motivates the
following definition.
Definition 5.1 An operator T ∈ L(X), X a Banach space, is said to be hereditarily
normaloid (T ∈ HN ) if the restriction T |M of T to any closed T -invariant
subspace M is normaloid. T ∈ L(X) is said to be totally hereditarily
normaloid (T ∈ T HN ) if T ∈ HN and every invertible restriction T |M has a
normaloid inverse.
Totally hereditarily normaloid operators were introduced by Duggal and S.V. Djordjević in
[29].
Recall that an invertible operator T ∈ L(X) is said to be doubly power-bounded
if sup{‖T^n‖ : n ∈ Z} < ∞.
Theorem 5.2 Suppose that T ∈ L(X) is quasi-nilpotent. If T is an analytically
T HN operator, then T is nilpotent.
Proof Let T ∈ L(X) and suppose that f (T ) is a T HN -operator for some f ∈
Hnc (σ (T )). From the spectral mapping theorem we have

σ (f (T )) = f (σ (T )) = {f (0)}.

We claim that f (T ) = f (0)I . To see this, let us consider the two possibilities:
f (0) = 0 or f (0) ≠ 0.
If f (0) = 0 then f (T ) is quasi-nilpotent and normaloid, and hence
f (T ) = 0. The equality f (T ) = f (0)I then trivially holds.
Suppose, in the other case, that f (0) ≠ 0, and set $f_1(T) := \frac{1}{f(0)} f(T)$. Clearly,
σ (f1 (T )) = {1} and ‖f1 (T )‖ = 1. Further, f1 (T ) is invertible and is T HN .
This easily implies that its inverse $f_1(T)^{-1}$ has norm 1. The operator f1 (T )
is then doubly power-bounded and, by a classical theorem due to Gelfand (see
[43, Theorem 1.5.14] for an elegant proof), it then follows that f1 (T ) = I , and
consequently f (T ) = f (0)I , as claimed.
Now, define g(λ) := f (0) − f (λ). Clearly, g(0) = 0, and g may have only a
finite number of zeros in σ (T ). Let {0, λ1 , . . . , λn } be the set of all zeros of g, where
λi ≠ λj for all i ≠ j , and λi has multiplicity ni ∈ N. Write

\[ g(\lambda) = \mu \lambda^m \prod_{i=1}^{n} (\lambda_i - \lambda)^{n_i} h(\lambda), \]

where h(λ) has no zeros in σ (T ). The equality g(T ) = f (0)I − f (T ) = 0 then
implies that

\[ 0 = g(T) = \mu\, T^m \prod_{i=1}^{n} (\lambda_i I - T)^{n_i} h(T) \quad \text{with } \lambda_i \ne 0, \]

where all the operators λi I − T and h(T ) are invertible. This, obviously, implies
that T^m = 0, i.e. T is nilpotent.
If T ∈ L(X) the numerical range of T is defined as

\[ W(T) := \{f(T) : f \in L(X)^*,\ \|f\| = f(I) = 1\}, \]

while the numerical radius of T is defined by

w(T ) := sup{|λ| : λ ∈ W (T )}.

In the case of Hilbert space operator the numerical range may be described as the
set

\[ W(T) = \{(Tx, x) : \|x\| = 1\}, \]

and the well-known Toeplitz-Hausdorff theorem establishes that W (T ) is a convex

set in the complex plane (for a proof, see Furuta [33, p. 91]). Furthermore,

\[ r(T) \le w(T) \le \|T\|. \]
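Both inequalities between the spectral radius, the numerical radius and the norm can be strict. The following worked example, added here as an illustration and not taken from the source, computes the three quantities for a 2 × 2 nilpotent matrix on C² with the Euclidean inner product.

```latex
% Illustration on H = \mathbb{C}^2.
\[
N = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
x = (a, b), \quad |a|^2 + |b|^2 = 1 .
\]
% Then Nx = (b, 0) and (Nx, x) = b\,\bar{a}, with |a||b| \le \tfrac12 and
% equality exactly when |a| = |b| = 1/\sqrt{2}. All phases of b\bar{a} occur, so
\[
W(N) = \{ z \in \mathbb{C} : |z| \le \tfrac12 \}, \qquad w(N) = \tfrac12 .
\]
% Since \sigma(N) = \{0\} and \|N\| = 1, we get strict inequalities:
\[
r(N) = 0 \;<\; w(N) = \tfrac12 \;<\; \|N\| = 1 .
\]
% In particular N is not normaloid, and W(N) is convex,
% in agreement with the Toeplitz-Hausdorff theorem.
```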

The next nontrivial result has been proved by Sinclair [55]; we omit the proof,
which is not simple.
Theorem 5.3 Let T ∈ L(X) and suppose that 0 is in the boundary of the numerical
range of T . Then the kernel of T is orthogonal (in the sense of Birkhoff-James) to the
range of T .
In the case of paranormal operators we have:
Theorem 5.4 Suppose that T ∈ L(X) is totally hereditarily normaloid and λ, μ ∈ C,
with λ ≠ 0 and λ ≠ μ. Then ker (λI − T ) ⊥ ker (μI − T ), i.e. ker (λI − T ) is
orthogonal to ker (μI − T ) in the Birkhoff and James sense.
Proof Suppose first that |λ| ≥ |μ| and let x ∈ ker (λI − T ), y ∈ ker (μI − T ).
Then T x = λx and T y = μy. Denote by M the subspace generated by x and y and
set TM := T |M. Clearly, σ (T |M) = {λ, μ} and, T |M being normaloid,

\[ \|T|M\| = r(T|M) = |\lambda|, \]

so that w(T |M) = |λ|. Consequently, λ belongs to the boundary of the numerical
range of T |M and hence, by Theorem 5.3, ker(λI − T |M) ⊥ (λI − T |M)(M).
Evidently, λ and μ are poles of the resolvent of T |M having order 1. Denoting by Pλ
and Pμ the spectral projections for T |M associated with {λ} and {μ}, respectively,
we then have

\[ (\lambda I - T|M)(M) = (I - P_\lambda)(M) = P_\mu(M) = \ker(\mu I - T|M). \]

Now, x ∈ ker (λI − T |M) and y ∈ ker(μI − T |M), so ‖x + y‖ ≥ ‖x‖.

Consider now the case where |λ| < |μ|. Then |μ| > 0, so T |M is invertible and

\[ \sigma((T|M)^{-1}) = \left\{ \frac{1}{\lambda}, \frac{1}{\mu} \right\}, \]

with |1/λ| > |1/μ|. Since T is totally hereditarily normaloid, the inverse $(T|M)^{-1}$ is also
normaloid. As in the first case we then see that the kernels $\ker(\frac{1}{\lambda} I - (T|M)^{-1})$
and $\ker(\frac{1}{\mu} I - (T|M)^{-1})$ are orthogonal. Obviously, $x \in \ker(\frac{1}{\lambda} I - (T|M)^{-1})$ and
$y \in \ker(\frac{1}{\mu} I - (T|M)^{-1})$, so the proof is complete.
Theorem 5.5 Every totally hereditarily normaloid operator T on a separable
Banach space has SVEP.
Proof We observe first that the point spectrum σp (T ) is
countable, hence its interior is empty. Indeed, if σp (T ) were not countable then,
by Theorem 5.4, we would have an uncountable set of unit vectors {xi} such that

\[ \|x_i - x_j\| \ge 1 \quad \text{for } i \ne j. \]

Since X is separable this is not possible. Since an operator whose point spectrum
has empty interior has SVEP, the assertion follows.

Remark 5.6 It is rather simple to see that if T ∈ L(X) is T HN and M is a
T -invariant closed subspace of X then the restriction T |M is also T HN . In the sequel
by $\overline{Y}$ we denote the closure of Y ⊆ X.
Definition 5.7 An operator T ∈ L(X), X a Banach space, is said to be k-quasi
totally hereditarily normaloid, k a nonnegative integer, if the restriction $T|\overline{T^k(X)}$ is
T HN .
Evidently, every T HN -operator is quasi-T HN , and if $T^k(X)$ is dense in X then
a quasi-T HN operator T is T HN .
Lemma 5.8 If T ∈ L(X) is quasi-T HN and M is a closed T -invariant subspace
of X, then T |M is quasi-T HN .
Proof Let k be a nonnegative integer such that $T_k := T|\overline{T^k(X)}$ is T HN . Let TM
denote the restriction T |M. Clearly, $T_M^k(M) \subseteq T^k(X)$, so $\overline{T_M^k(M)}$ is a closed
$T_k$-invariant subspace of $\overline{T^k(X)}$. By Remark 5.6 it then follows that

\[ T_M|\overline{T_M^k(M)} = T_k|\overline{T_M^k(M)} \]

is T HN .
We recall now some elementary algebraic facts. Suppose that T ∈ L(X) and
X = M ⊕ N, with M and N closed subspaces of X, M invariant under T . With
respect to this decomposition of X it is known that T may be represented by an
upper triangular operator matrix

\[ T = \begin{pmatrix} A & B \\ 0 & C \end{pmatrix}, \]

where A ∈ L(M), C ∈ L(N) and B ∈ L(N, M). It is easily seen that for every
$x = \begin{pmatrix} x \\ 0 \end{pmatrix} \in M$ we have T x = Ax,
so A = T |M. Let us consider now the case of operators T acting on a Hilbert space
H , and suppose that $T^k(H)$ is not dense in H . In this case we can consider the
nontrivial orthogonal decomposition

\[ H = \overline{T^k(H)} \oplus \overline{T^k(H)}^{\perp}, \tag{16} \]

where $\overline{T^k(H)}^{\perp} = \ker (T^*)^k$, T ∗ the adjoint of T . Note that the subspace $\overline{T^k(H)}$ is
T -invariant, since

\[ T(\overline{T^k(H)}) \subseteq \overline{T(T^k(H))} = \overline{T^{k+1}(H)} \subseteq \overline{T^k(H)}. \]

Thus we can represent T , with respect to the decomposition (16), as an upper triangular
operator matrix

\[ T = \begin{pmatrix} T_1 & T_2 \\ 0 & T_3 \end{pmatrix}, \tag{17} \]

where $T_1 = T|\overline{T^k(H)}$. Moreover, T3 is nilpotent. Indeed, if $x \in \overline{T^k(H)}^{\perp}$, an easy
computation yields

\[ T^k x = T^k \begin{pmatrix} 0 \\ x \end{pmatrix} = T_3^k x. \]

Hence $T_3^k x = 0$, since $T^k x \in \overline{T^k(H)} \cap \overline{T^k(H)}^{\perp} = \{0\}$. Therefore we have:
Theorem 5.9 Suppose that T ∈ L(H ) and that $T^k(H)$ is not dense in H . Then,
according to the decomposition (16), $T = \begin{pmatrix} T_1 & T_2 \\ 0 & T_3 \end{pmatrix}$ is quasi-T HN if and only if
T1 is T HN . Furthermore,

\[ \sigma(T) = \sigma(T_1) \cup \sigma(T_3) = \sigma(T_1) \cup \{0\}. \]

Proof The first assertion is clear, since $T_1 = T|\overline{T^k(H)}$. The second assertion
follows from the following general result: if $T := \begin{pmatrix} A & C \\ 0 & B \end{pmatrix}$ is an upper triangular
operator matrix acting on some direct sum of Banach spaces and σ (A) ∩ σ (B) has
no interior points, then σ (T ) = σ (A) ∪ σ (B); see [44] for a proof.
In the sequel we give some examples of operators which are quasi totally
hereditarily normaloid.
1. The class of quasi-paranormal operators may be extended as follows: T ∈ L(H )
is said to be (n, k)-quasiparanormal if

\[ \|T^{k+1}x\| \le \|T^{1+n}(T^k x)\|^{\frac{1}{1+n}}\, \|T^k x\|^{\frac{n}{1+n}} \quad \text{for all } x \in H. \]

The class of (1, k)-quasiparanormal operators has been studied in [48]. If $T^k(H)$
is not dense then, in the triangulation $T = \begin{pmatrix} T_1 & T_2 \\ 0 & T_3 \end{pmatrix}$, $T_1 = T|\overline{T^k(H)}$ is
n-quasiparanormal, and hence T HN , see [62].
2. An extension of class A operators is given by the class of all k-quasiclass A
operators, where T ∈ L(H ), H a separable infinite dimensional Hilbert space,
is said to be a k-quasiclass A operator if

\[ T^{*k}(|T^2| - |T|^2)T^k \ge 0. \]

Every k-quasiclass A operator is quasi-T HN . Indeed, if T has dense range then
T is a class A operator and hence paranormal. If T does not have dense range then
T , with respect to the decomposition $H = \overline{T^k(H)} \oplus \ker T^{*k}$, may be represented as
a matrix $T = \begin{pmatrix} T_1 & T_2 \\ 0 & T_3 \end{pmatrix}$, where $T_1 := T|\overline{T^k(H)}$ is a class A operator, and hence
T HN , see [56].

We have observed that a quasi-class A operator (i.e. k = 1) need not be
normaloid. This shows that, in general, a quasi-T HN operator is not normaloid,
so the class of quasi-T HN operators properly contains the class of T HN
operators.
3. An operator T ∈ L(H ), H a separable infinite dimensional Hilbert space, is said
to be k-quasi ∗-paranormal, k ∈ N, if

\[ \|T^*T^k x\|^2 \le \|T^{k+2}x\|\,\|T^k x\| \quad \text{for all unit vectors } x \in H. \]

This class of operators contains the class of all quasi-∗-paranormal operators
(which corresponds to the value k = 1). Every k-quasi ∗-paranormal operator is
quasi-T HN . Indeed, if $T^k$ has dense range then T is ∗-paranormal and hence
T HN . If $T^k$ does not have dense range then T may be decomposed, according to
the decomposition $H = \overline{T^k(H)} \oplus \ker T^{*k}$, as $T = \begin{pmatrix} T_1 & T_2 \\ 0 & T_3 \end{pmatrix}$, where
$T_1 = T|\overline{T^k(H)}$ is ∗-paranormal, hence T HN , see [49, Lemma 2.1].
Every (p, k)-quasihyponormal operator T , with respect to the decomposition
$H = \overline{T^k(H)} \oplus \ker T^{*k}$, may be represented as a matrix $T = \begin{pmatrix} T_1 & T_2 \\ 0 & 0 \end{pmatrix}$, where
$T_1 := T|\overline{T^k(H)}$ is k-hyponormal (hence paranormal) and consequently T HN ,
see [40]. The next result generalizes the result of Theorem 5.2.
Theorem 5.10 Suppose that T ∈ L(H ), H a Hilbert space, is analytically quasi-
T HN and quasi-nilpotent. Then T is nilpotent.
Proof Suppose first that T is quasi-nilpotent and k-quasi T HN . If $T^k(H)$ is dense
then T is T HN , so T is nilpotent by Theorem 5.2. Suppose that $T^k(H)$ is not dense
and write $T = \begin{pmatrix} T_1 & T_2 \\ 0 & T_3 \end{pmatrix}$, where T1 is T HN , $T_3^k = 0$, and σ (T ) = σ (T1 ) ∪ {0}.
Since σ (T ) = {0} and σ (T1 ) is not empty, we then have σ (T1 ) = {0}, thus T1 is
a quasi-nilpotent T HN operator and hence T1 = 0. Therefore $T = \begin{pmatrix} 0 & T_2 \\ 0 & T_3 \end{pmatrix}$. An
easy computation yields that

\[ T^{k+1} = \begin{pmatrix} 0 & T_2 \\ 0 & T_3 \end{pmatrix}^{k+1} = \begin{pmatrix} 0 & T_2 T_3^{k} \\ 0 & T_3^{k+1} \end{pmatrix} = 0, \]

so that T is nilpotent.
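The "easy computation" for the power of the block matrix can be checked by induction. The sketch below spells it out; it is an added verification using only the block form with vanishing first column that appears in the proof.

```latex
% Claim: for every integer j \ge 1,
\[
\begin{pmatrix} 0 & T_2 \\ 0 & T_3 \end{pmatrix}^{j}
= \begin{pmatrix} 0 & T_2 T_3^{\,j-1} \\ 0 & T_3^{\,j} \end{pmatrix}.
\]
% The case j = 1 is trivial (T_3^0 = I). For the inductive step, multiply on the left:
\[
\begin{pmatrix} 0 & T_2 \\ 0 & T_3 \end{pmatrix}
\begin{pmatrix} 0 & T_2 T_3^{\,j-1} \\ 0 & T_3^{\,j} \end{pmatrix}
= \begin{pmatrix} 0 & T_2 T_3^{\,j} \\ 0 & T_3^{\,j+1} \end{pmatrix}.
\]
% Taking j = k+1 and using T_3^k = 0 gives T_2 T_3^k = 0 and T_3^{k+1} = 0,
% so the (k+1)-th power of the block matrix is zero.
```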
Finally, suppose that T is quasi-nilpotent and analytically k-quasi T HN . Let h,
analytic on an open neighborhood of σ (T ) and non-constant on the components
of its domain, be such that h(T ) is quasi-T HN . We claim that h(T ) is nilpotent.
If $h(T)^k$ has dense range then h(T ) is T HN and hence, by Theorem 5.2, h(T )
is nilpotent. Suppose that $h(T)^k$ does not have dense range. Then, with respect to the
decomposition $H = \overline{h(T)^k(H)} \oplus \overline{h(T)^k(H)}^{\perp}$, the operator h(T ) has a triangulation
$h(T) = \begin{pmatrix} A & B \\ 0 & C \end{pmatrix}$, such that $A = h(T)|\overline{h(T)^k(H)}$ is T HN and

\[ \sigma(h(T)) = \sigma(A) \cup \{0\}. \]

By the spectral mapping theorem we have

σ (h(T )) = h(σ (T )) = {h(0)}.

Consequently, 0 ∈ {h(0)}, i.e. h(0) = 0, and therefore h(T ) is quasi-nilpotent.
Since h(T ) is quasi-T HN , by the first part of the proof it then follows that h(T ) is
nilpotent. Now, h(0) = 0 so we can write

\[ h(\lambda) = \mu \lambda^m \prod_{i=1}^{n} (\lambda_i - \lambda)^{n_i} g(\lambda), \]

where g(λ) has no zeros in σ (T ) and λi ≠ 0 are the other zeros of h, with
multiplicity ni . Hence

\[ h(T) = \mu\, T^m \prod_{i=1}^{n} (\lambda_i I - T)^{n_i} g(T), \]

where all λi I − T and g(T ) are invertible. Since h(T ) is nilpotent then also T is
nilpotent.
Theorem 5.11 If T ∈ L(H ) is an analytically quasi T HN operator, then T is
polaroid.
Proof We show that for every isolated point λ of σ (T ) we have p(λI − T ) =
q(λI − T ) < ∞. Let λ be an isolated point of σ (T ). Then H = H0 (λI − T ) ⊕
K(λI − T ), by Theorem 2.10. Furthermore, σ (T |H0 (λI − T )) = {λ}, while
σ (T |K(λI − T )) = σ (T ) \ {λ}, so the restriction λI − T |H0 (λI − T ) is
quasi-nilpotent and λI −T |K(λI −T ) is invertible. Since λI −T |H0 (λI −T ) is analytically
quasi T HN , Theorem 5.10 implies that λI −T |H0 (λI −T ) is nilpotent. In other
words, λI − T is an operator of Kato type.
Now, both T and the dual T ∗ have SVEP at λ, since λ is isolated in σ (T ) =
σ (T ∗ ), and this implies, by Theorem 3.5, that both p(λI − T ) and q(λI − T ) are
finite. Therefore, λ is a pole of the resolvent.
Lemma 5.12 Suppose that T ∈ L(X) admits, with respect to the decomposition
X = M ⊕ N, the representation $T = \begin{pmatrix} T_1 & T_2 \\ 0 & T_3 \end{pmatrix}$, where T3 is nilpotent. Then T has
SVEP if and only if T1 has SVEP.

Proof Suppose that T1 has SVEP. Fix arbitrarily λ0 ∈ C and let f : U → X be an
analytic function defined on an open disc U centered at λ0 such that (λI − T )f (λ) = 0
for all λ ∈ U . Set f (λ) := f1 (λ) ⊕ f2 (λ) on X = M ⊕ N. Then we can write

\[ 0 = (\lambda I - T)f(\lambda) = \begin{pmatrix} \lambda I - T_1 & -T_2 \\ 0 & \lambda I - T_3 \end{pmatrix} \begin{pmatrix} f_1(\lambda) \\ f_2(\lambda) \end{pmatrix} = \begin{pmatrix} (\lambda I - T_1)f_1(\lambda) - T_2 f_2(\lambda) \\ (\lambda I - T_3)f_2(\lambda) \end{pmatrix}. \]

Then (λI − T3 )f2 (λ) = 0 and (λI − T1 )f1 (λ) − T2 f2 (λ) = 0. Since a nilpotent
operator has SVEP, f2 (λ) = 0, and consequently (λI − T1 )f1 (λ) = 0. But T1
has SVEP at λ0 , so f1 (λ) = 0 and hence f (λ) = 0 on U . Thus, T has SVEP at λ0 .
Since λ0 is arbitrary, T has SVEP.
Conversely, suppose that T has SVEP. Since T1 is the restriction of T to M and
the SVEP from T is inherited by the restriction to closed invariant subspaces, then
T1 has SVEP.
Theorem 5.13 If T ∈ L(H ) is analytically quasi T HN , then T is hereditarily
polaroid and hence has SVEP.
Proof Let f ∈ Hnc (σ (T )) be such that f (T ) is quasi T HN . If M is a closed
T -invariant subspace of H , we know that f (T )|M is quasi T HN , by Lemma 5.8, and
f (T )|M = f (T |M), so f (T |M) is polaroid, by Theorem 5.11, and consequently,
T |M is polaroid, see [4, Theorem 4.19].

6 Weyl Type Theorems for Analytically Quasi T HN Operators

An operator T ∈ L(X), X a Banach space, is said to be a-polaroid if every λ ∈
iso σa (T ) is a pole of the resolvent of T . It is easily seen that every a-polaroid operator
is polaroid. Indeed, if λ ∈ iso σ (T ) then λ belongs to the boundary of the spectrum,
so λ ∈ iso σa (T ) and hence is a pole. Evidently, T is polaroid if and only if T′ is
polaroid (T ∗ in the case of Hilbert space operators). If T′ has SVEP then σ (T ) =
σap (T ), so if T is polaroid then T is a-polaroid, and analogously, if T has SVEP
then σ (T′) = σap (T′), so if T′ is polaroid then T′ is a-polaroid.

Recall that an operator T ∈ L(X) is said to be Weyl (T ∈ W (X)), if T is

Fredholm and the index ind T := α(T ) − β(T ) = 0. The Weyl spectrum of
T ∈ L(X) is defined by

\[ \sigma_w(T) := \{\lambda \in \mathbb{C} : \lambda I - T \notin W(X)\}. \]

The class of upper semi-Weyl operators is defined by W+ (X) := {T ∈ Φ+ (X) :
ind T ≤ 0}, and the class of lower semi-Weyl operators is defined by W− (X) := {T ∈
Φ− (X) : ind T ≥ 0}. Clearly, W (X) = W+ (X) ∩ W− (X). The classes of operators
above defined generate the following spectra: the upper semi-Weyl spectrum

\[ \sigma_{uw}(T) := \{\lambda \in \mathbb{C} : \lambda I - T \notin W_+(X)\}, \]

and the lower semi-Weyl spectrum defined as

\[ \sigma_{lw}(T) := \{\lambda \in \mathbb{C} : \lambda I - T \notin W_-(X)\}. \]

An operator T ∈ L(X) is said to be Browder (T ∈ B(X)), if T is Fredholm and

p(T ) = q(T ) < ∞. The Browder spectrum of T ∈ L(X) is defined by

\[ \sigma_b(T) := \{\lambda \in \mathbb{C} : \lambda I - T \notin B(X)\}. \]

Following Coburn [21], we say that Weyl’s theorem holds for T ∈ L(X) (in symbol,
(W )) if

σ (T ) \ σw (T ) = π00 (T ), (18)

where

π00 (T ) := {λ ∈ iso σ (T ) : 0 < α(λI − T ) < ∞}.

Note that T satisfies (W ) if and only if T satisfies Browder’s theorem, (i.e., σb (T ) =

σw (T )) and π00 (T ) = p00 (T ), where p00 (T ) := σ (T ) \ σb (T ), see for instance [4,
Theorem 6.40].
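To see what condition (18) excludes, it may help to look at a simple operator for which Weyl's theorem fails. The example below is a standard-type construction written out here as an illustration; the specific operator Q and the space are assumptions of this sketch and do not appear in the source.

```latex
% Illustration on H = \ell^2(\mathbb{N}) \oplus \mathbb{C}.
% Q is the weighted right shift Q e_n = \tfrac{1}{n} e_{n+1} (n \ge 1):
% Q is injective and compact, and \|Q^n\| = 1/n!, so Q is quasi-nilpotent.
\[
T := Q \oplus 0, \qquad \sigma(T) = \{0\}.
\]
% T is compact on an infinite-dimensional space, hence not Fredholm:
\[
\sigma_w(T) = \{0\}, \qquad \sigma(T) \setminus \sigma_w(T) = \emptyset .
\]
% On the other hand \ker T = \{0\} \oplus \mathbb{C}, so \alpha(T) = 1 and 0 is isolated:
\[
\pi_{00}(T) = \{0\} \neq \emptyset = \sigma(T) \setminus \sigma_w(T).
\]
% Thus Weyl's theorem fails for T. Note that T is not nilpotent although
% \sigma(T) = \{0\}, so 0 is not a pole of the resolvent and T is not polaroid.
```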
The concept of Fredholm operators has been generalized in the following way [15]: for every T ∈ L(X) and every nonnegative integer n, denote by T[n] the restriction of T to T^n(X), viewed as a map from the space T^n(X) into itself (we set T[0] = T). T ∈ L(X) is said to be B-Fredholm if for some integer n ≥ 0 the range T^n(X) is closed and T[n] is a Fredholm operator. In this case T[m] is a Fredholm operator for all m ≥ n [15]. This enables one to define the index of a B-Fredholm operator T as ind T = ind T[n]. A bounded operator T ∈ L(X) is said to be B-Weyl (T ∈ BW(X)) if for some integer n ≥ 0 the range T^n(X) is closed and T[n] is Weyl. The B-Weyl spectrum σbw(T) is defined by

σbw(T) := {λ ∈ C : λI − T ∉ BW(X)}.
Normal Operators and their Generalizations 393

Another version of Weyl’s theorem has been introduced by Berkani and Koliha ([14]) as follows: T ∈ L(X) is said to verify generalized Weyl’s theorem (in symbols, (gW)) if

σ (T ) \ σbw (T ) = E(T ), (19)

where

E(T ) := {λ ∈ iso σ (T ) : 0 < α(λI − T )}.
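The difference between π00(T) and E(T) can be seen on a very simple operator. The following worked example is only an illustrative sketch, not taken from the text: it assumes X is infinite-dimensional and takes T = 0, using the definitions of σw, σbw, π00 and E given above.

```latex
% Illustrative example (an assumption of this sketch): T = 0 on an
% infinite-dimensional Banach space X.
\begin{align*}
&\sigma(T) = \{0\}, \qquad \alpha(T) = \dim X = \infty, \\
&T \text{ is not Fredholm} \implies \sigma_w(T) = \{0\}, \qquad
  \sigma(T) \setminus \sigma_w(T) = \emptyset = \pi_{00}(T)
  \quad \text{(so (W) holds)}, \\
&T(X) = \{0\} \text{ is closed and } T_{[1]} \text{ acts on } \{0\}
  \implies T \in BW(X), \qquad \sigma_{bw}(T) = \emptyset, \\
&\sigma(T) \setminus \sigma_{bw}(T) = \{0\} = E(T)
  \quad \text{(so (gW) holds)}.
\end{align*}
```

Here 0 is an isolated eigenvalue of infinite multiplicity, so it is excluded from π00(T) but belongs to E(T).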

An operator T ∈ L(X) is said to be Drazin invertible if p(T) and q(T) are finite, and this happens if and only if T is invertible or 0 is a pole of the resolvent. The Drazin spectrum is denoted by σd(T). Note that (gW) holds for T if and only if T satisfies generalized Browder’s theorem (i.e., σbw(T) = σd(T), or equivalently, Browder’s theorem, see [4, Theorem 5.15]) and E(T) = Π(T), where Π(T) is the set of all poles of the resolvent of T. Note that generalized Weyl’s theorem entails Weyl’s theorem, see [4, Theorem 6.60].
The following result shows that in the presence of SVEP the polaroid condition entails Weyl type theorems.
Theorem 6.1 Let T ∈ L(X) be polaroid and suppose that either T or T′ has SVEP. Then both T and T′ satisfy generalized Weyl’s theorem.
Proof If T is polaroid then T′ is also polaroid, and Weyl’s theorem and generalized Weyl’s theorem for T, or T′, are equivalent, see [10, Theorem 3.7]. The assertion then follows from [10, Theorem 3.3].
Remark 6.2 In the case of a Hilbert space operator T ∈ L(H) it is more appropriate to consider the Hilbert adjoint T∗ instead of the dual T′. Note that T satisfies (gW) if and only if T∗ does. This easily follows from the well-known equalities $\sigma_w(T^*) = \overline{\sigma_w(T)}$, where $\overline{E}$ denotes the set of complex conjugates of the elements of E ⊆ C, $\sigma_b(T^*) = \overline{\sigma_b(T)}$, $E(T^*) = \overline{E(T)}$, and $\Pi(T^*) = \overline{\Pi(T)}$. Furthermore, T′ satisfies SVEP if and only if T∗ satisfies SVEP, so, in the statement of Theorem 6.1, T′ may be replaced by the Hilbert adjoint T∗.
In [3] it is shown that if T is hereditarily polaroid and has SVEP, and K is an algebraic operator which commutes with T, then T + K is polaroid and T′ + K′ is a-polaroid.
The following perturbation result has been proved in [3, Theorem 3.12].
Theorem 6.3 Suppose that T ∈ L(X) and that K ∈ L(X) is an algebraic operator commuting with T. If T, or T′, has SVEP and T, or T′, is hereditarily polaroid, then f(T + K) and f(T′ + K′) satisfy (gW) for every f ∈ Hnc(σ(T + K)).

Observe that in the case of Hilbert space operators

T ∗ + K ∗ is a-polaroid ⇔ T + K is a-polaroid,

see [10, Theorem 2.3].

Theorem 6.4 Let T ∈ L(H) be an analytically quasi-THN operator on a Hilbert space H, and let K ∈ L(H) be an algebraic operator commuting with T. Then both f(T + K) and f(T′ + K′) satisfy (gW) for every f ∈ Hnc(σ(T + K)).
Proof Suppose that T ∈ L(H) is analytically quasi-THN, and let f ∈ Hnc(σ(T)) be such that f(T) is quasi-THN. Since T has SVEP, f(T) has SVEP. Now, by Theorem 5.13, T is hereditarily polaroid, and hence Theorem 6.3 applies.
Theorem 6.4 gives us a general framework and applies to all the classes of operators considered above in this paper (and many more). Moreover, Theorem 6.4 considerably improves most of the existing results in the literature concerning Weyl type theorems for these classes of operators. Observe that, still in the situation of Theorem 6.4, the fact that f(T + K) is polaroid entails that all Weyl type theorems (such as property (gw) and a-Weyl’s theorem) hold for f(T′ + K′); see [10] for definitions and details, in particular Theorem 3.10.

References

1. P. Aiena, Fredholm and Local Spectral Theory, with Application to Multipliers (Kluwer,
Dordrecht, 2004)
2. P. Aiena, Algebraically paranormal operators on Banach spaces. Banach J. Math. Anal. 7(2),
136–145 (2013)
3. P. Aiena, E. Aponte, Polaroid type operators under perturbations. Stud. Math. 214(2), 121–136
(2013)
4. P. Aiena, Fredholm and Local Spectral Theory II, with Application to Weyl-Type Theorems.
Springer Lecture Notes of Mathematics, vol. 2235 (Springer, Berlin, 2018)
5. P. Aiena, V. Muller, The localized single-valued extension property and Riesz operators. Proc.
Am. Math. Soc. 143(5), 2051–2055 (2015)
6. P. Aiena, M.M. Neumann, On the stability of the localized single-valued extension property
under commuting perturbations. Proc. Am. Math. Soc. 141(6), 2039–2050 (2013)
7. P. Aiena, P. Peña, A variation on Weyl’s theorem. J. Math. Anal. Appl. 324, 566–579 (2006)
8. P. Aiena, S. Triolo, Weyl type theorems on Banach spaces under compact perturbations.
Mediterr. J. Math. 15(3), 1–18 (2018). https://doi.org/10.1007/s00009-018-1176-y
9. P. Aiena, S. Triolo, Some remarks on the spectral properties of Toeplitz operators. Mediterr. J.
Math. 16, 135 (2019). http://dx.doi.org/10.1007/s00009-019-1397-8
10. P. Aiena, E. Aponte, E. Bazan, Weyl type theorems for left and right polaroid operators. Integr.
Equ. Oper. Theory 66, 1–20 (2010)
11. A. Aluthge, On p-hyponormal operators for 0 < p < 1. Integr. Equ. Oper. Theory 13, 307–315
(1990)
12. T. Ando, Operators with a norm condition. Acta Sci. Math. 33, 169–178 (1972)
13. S.C. Arora, J.K. Thukral, On a class of operators. Glasnik Math. 21, 381–386 (1986)

14. M. Berkani, J.J. Koliha, Weyl type theorems for bounded linear operators. Acta Sci. Math.
69(1–2), 359–376 (2003)
15. M. Berkani, M. Sarih, On semi B-Fredholm operators. Glasgow Math. J. 43, 457–465 (2001)
16. E. Bishop, A duality theorem for an arbitrary operator. Pac. J. Math. 9, 379–397 (1959)
17. M. Chō, Spectral properties of p-hyponormal operators. Glasgow Math. J. 36, 117–122
(1994)
18. M. Chō, J.I. Lee, p-hyponormality is not translation-invariant. Proc. Am. Math. Soc. 131(10),
3109–3111 (2003)
19. M. Chō, I.H. Jeon, J.I. Lee, Spectral and structural properties of log-hyponormal operators.
Glasgow Math. J. 42, 345–350 (2000)
20. N.N. Chourasia, P.B. Ramanujan, Paranormal operators on Banach spaces. Bull. Austral. Math.
Soc. 21, 161–168 (1980)
21. L.A. Coburn, Weyl’s theorem for nonnormal operators. Michigan Math. J. 13(3), 285–288
(1966)
22. J.B. Conway, The Theory of Subnormal Operators. Mathematical Survey and Monographs,
vol. 36 (American Mathematical Society, Providence, 1992)
23. I. Colojoară, C. Foiaş, Theory of Generalized Spectral Operators (Gordon and Breach, New
York, 1968)
24. D.S. Djordjević, Operators obeying a-Weyl’s theorem. Publ. Math. Debrecen 55(3–4), 283–
298 (1999)
25. N. Dunford, J.T. Schwartz, Linear Operators. Part I (1967), Part II (1967), Part III (Wiley, New
York, 1967)
26. R.G. Douglas, Banach Algebra Techniques in Operator Theory. Graduate Texts in Mathematics,
vol. 179, 2nd edn. (Springer, New York, 1998)
27. B.P. Duggal, Isolated eigenvalues, poles, compact perturbation of Banach space operators.
Oper. Matrices Debrecen 13(4), 966–973 (2019)
28. B.P. Duggal, S.V. Djordjević, Weyl’s theorems and continuity of the spectra in the class of
p-hyponormal operators. Stud. Math. 143(1), 23–32 (2000)
29. B.P. Duggal, S.V. Djordjević, Generalized Weyl’s theorem for a class of operators satisfying a
norm condition. Math. Proc. Royal Irish Acad. 104A, 75–81 (2004)
30. B.P. Duggal, H. Jeon, Remarks on spectral properties of p-hyponormal and log-hyponormal
operators. Bull. Kor. Math. Soc. 42, 541–552 (2005)
31. B.P. Duggal, I.H. Jeon, H. Kim, On Weyl’s theorem for quasi-class A operators. J. Korean
Math. Soc. 43, 899–909 (2006)
32. D.R. Farenick, W.Y. Lee, Hyponormality and spectra of Toeplitz operators. Trans. Am. Math.
Soc. 348(10), 4153–4174 (1996)
33. T. Furuta, Invitation to Linear Operators (Taylor and Francis, London, 2001)
34. T. Furuta, M. Ito, T. Yamazaki, A subclass of paranormal operators including class of
log-hyponormal and several related classes. Sci. Math. 1, 389–403 (1998)
35. S.R. Garcia, M. Putinar, Complex symmetric operators and applications I. Trans. Am. Math.
Soc. 358, 1285–1315 (2006)
36. S.R. Garcia, M. Putinar, Complex symmetric operators and applications II. Trans. Am. Math.
Soc. 359, 3913–3931 (2007)
37. Y.M. Han, A.-H. Kim, A note on ∗-paranormal operators. Integr. Equ. Oper. Theory 49, 435–
444 (2004)
38. Y.M. Han, J.I. Lee, D. Wang, Riesz idempotent and Weyl’s theorem for w-hyponormal
operator. Integr. Equ. Oper. Theory 53, 51–60 (2005)
39. H. Heuser, Functional Analysis (Marcel Dekker, New York, 1982)
40. I.H. Kim, On (p, k)-quasihyponormal operators. Math. Inequal. Appl. 4, 629–638 (2004)
41. E. Ko, On p-hyponormal operators. Proc. Am. Math. Soc. 128(3), 775–780 (2000)
42. D. Lay, A. Taylor, Introduction to Functional Analysis (Wiley, New York, 1980)
43. K.B. Laursen, M.M. Neumann, An Introduction to Local Spectral Theory. London Mathemat-
ical Society Monographs, vol. 20 (Clarendon Press, Oxford, 2000)
44. W.Y. Lee, Weyl spectra of operator matrices. Proc. Am. Math. Soc. 129, 131–138 (2001)

45. M.Y. Lee, S.H. Lee, Some generalized theorems on p-quasihyponormal operators for 0 < p <
1. Nihonkai Math. J. 8, 109–115 (1997)
46. C. Lin, Y. Ruan, Z. Yan, w-hyponormal operators are subscalar. Integr. Equ. Oper. Theory 50,
165–168 (2004)
47. M. Mbekhta, Local spectrum and generalized spectrum. Proc. Am. Math. Soc. 112, 457–463
(1991)
48. S. Mecheri, Bishop’s property (β) and Riesz idempotent for k-quasiparanormal operators.
Banach J. Math. Anal. 6, 147–154 (2012)
49. S. Mecheri, On a new class of operators and Weyl type theorems. Filomat 27(4), 629–636
(2013)
50. T.L. Miller, V.G. Miller, M.M. Neumann, Operators with closed analytic core. Rend. Circolo
Mat. Palermo. 51(3), 495–502 (2003)
51. V. Müller, Spectral Theory of Linear Operators and Spectral Systems on Banach Algebras.
Operator Theory: Advances and Applications, 2nd edn. (Birkhäuser, Berlin, 2007)
52. M. Putinar, Hyponormal operators are subscalar. J. Operator Theory 12, 385–395 (1984)
53. W. Rudin, Functional Analysis, 2nd edn. (McGraw-Hill, New York, 1991)
54. C. Schmoeger, On isolated points of the spectrum of a bounded operator. Proc. Am. Math. Soc.
117, 715–719 (1993)
55. A.M. Sinclair, Eigenvalues in the boundary of the numerical range. Pac. J. Math. 81, 231–234
(1970)
56. K. Tanahashi, On log-hyponormal operators. Integral Equ. Oper. Theory 34, 364–372 (1999)
57. K. Tanahashi, A. Uchiyama, M. Chō, Isolated points of the spectrum of (p, k)-quasi-
hyponormal operators. Linear Algebra Appl. 382, 221–229 (2004)
58. F.H. Vasilescu, Analytic Functional Calculus and Spectral Decompositions (Editura
Academiei/D. Reidel Publishing Company, Bucharest/Dordrecht, 1982)
59. P. Vrbová, On local spectral properties of operators in Banach spaces. Czechoslov. Math. J.
23(98), 483–492 (1973)
60. H. Widom, On the spectrum of Toeplitz operators. Pac. J. Math. 14, 365–375 (1964)
61. D. Xia, Spectral Theory of Hyponormal Operators. (Birkhauser, Boston, 1993)
62. J.T. Yuan, G.X. Ji, On (n, k)-quasi paranormal operators. Stud. Math. 209, 289–301 (2012)
On Wold Type Decomposition for Closed
Range Operators

H. Ezzahraoui, M. Mbekhta, and E. H. Zerouali

Abstract This survey aims to give a brief introduction to Wold-type decomposition for some closed range operators satisfying some operator inequalities. As a cornerstone in the theory of the Hardy space, the Beurling theorem for the unweighted shift is our starting point, which we try to transfer to regular operators. Also, several results on left invertible operators close to isometries, as extensions of the Hardy shift, are listed and extended to the case of regular operators. We define and study the Cauchy dual for such operators by using the Moore-Penrose inverse of closed range operators. The Cauchy dual plays the role of the left inverse in our approach for this general setting.

Keywords Wold-type decomposition · Beurling-type theorem · Regular

operator · Moore-Penrose inverse · Cauchy dual

1 Introduction

Let us denote first by H a Hilbert space and by L(H) the algebra of all bounded linear operators on H. For an operator T ∈ L(H), we denote by R(T) and N(T) the range and the kernel subspaces of T, respectively. We also write

\[ R^\infty(T) = \bigcap_{k=0}^{\infty} R(T^k) \quad\text{and}\quad N^\infty(T) = \bigcup_{n \ge 0} N(T^n) \]

for the generalized range and the generalized kernel of T, respectively. We will say that T is a pure operator if R^∞(T) = {0}.
A subspace E ⊂ H is said to be T-invariant if T(E) ⊆ E; the lattice of all closed T-invariant subspaces of H will be denoted by Lat(T, H). For any given subspace E of H, we set [E]_T = ⋁_{k=0}^{∞} T^k(E) for the smallest closed T-invariant subspace of H containing E. A subspace E is called a reducing subspace for T if it is invariant for both T and T∗, where T∗ is the adjoint operator of T.

H. Ezzahraoui · M. Mbekhta
Mohammed V University, Faculty of Sciences, Rabat, Morocco
e-mail: [email protected]; [email protected]
E. H. Zerouali
Université de Lille, Département de Mathématiques UMR-CNRS 8524, Villeneuve d’Ascq,
France
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
R. M. Aron et al. (eds.), Operator and Norm Inequalities and Related Topics,
Trends in Mathematics, https://doi.org/10.1007/978-3-031-02104-6_12
Recall that the operator T ∈ L(H) is bounded below if there is c > 0 satisfying ‖Tx‖ ≥ c‖x‖ for every x ∈ H. It is not difficult to see that an operator T is bounded below if and only if it is one-to-one and has closed range. This is also equivalent to T∗T being invertible, and hence to T being left invertible. A standard left inverse of T is given by L = (T∗T)^{-1}T∗. The reduced minimum modulus γ(T) of T is defined by the formula

γ(T) = inf{‖Tx‖ : ‖x‖ = 1, x ∈ N(T)^⊥}

if T is not the null operator, and γ(T) = ∞ if T = 0. The reduced minimum modulus encodes several properties of T. It is not difficult to show that γ(T) = γ(T∗) for every T ∈ L(H); for more information see [17]. Clearly, T has closed range if and only if γ(T) > 0. The operator T ∈ L(H) is contractive if ‖Tx‖ ≤ ‖x‖ for every x ∈ H and is said to be expansive if ‖Tx‖ ≥ ‖x‖ for every x ∈ H. It is not difficult to see that T is an expansive operator if and only if T is injective and γ(T) ≥ 1. In particular, expansive operators are left invertible with contractive left inverses.
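The last claim can be checked in a few lines. The derivation below is a short sketch; it uses only the standard left inverse L = (T∗T)^{-1}T∗ introduced above and the assumption that T is expansive.

```latex
% Sketch: an expansive operator has a contractive standard left inverse.
% Assumption: \|Tx\| \ge \|x\| for all x in H.
\begin{align*}
\|Tx\| \ge \|x\| \ \ (x \in H)
  &\iff \langle (T^*T - I)x, x \rangle \ge 0 \ \ (x \in H)
   \iff T^*T \ge I, \\
T^*T \ge I
  &\implies T^*T \text{ is invertible and } 0 \le (T^*T)^{-1} \le I, \\
L = (T^*T)^{-1}T^*
  &\implies LL^* = (T^*T)^{-1}\,T^*T\,(T^*T)^{-1} = (T^*T)^{-1} \le I
   \implies \|L\|^2 = \|LL^*\| \le 1 .
\end{align*}
```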
The classical Wold decomposition theorem states that if U is an isometry on a Hilbert space H, then H is the direct sum of two reducing subspaces for U, H = H_u ⊕ H_p, such that U|H_u ∈ L(H_u) is unitary and U|H_p ∈ L(H_p) is unitarily equivalent to a unilateral shift. This decomposition is unique and the canonical subspaces are given by

\[ H_u := \bigcap_{n=1}^{\infty} U^n H \quad\text{and}\quad H_p := \bigoplus_{n=0}^{\infty} U^n E, \]

where E := N(U∗) = H ⊖ UH. The subspace E is called a wandering subspace for U; it is characterized by the property U^n E ⊥ U^m E for n ≠ m.
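For orientation, the decomposition can be written out for the simplest isometry. The example below is an illustrative sketch (the choice of the unilateral shift S on ℓ²(N) with standard orthonormal basis (e_n)_{n≥0} is an assumption of this sketch, not part of the text).

```latex
% Illustrative example: the unilateral shift S e_n = e_{n+1} on \ell^2(\mathbb{N}).
\begin{align*}
E &= N(S^*) = H \ominus SH = \mathbb{C} e_0, \qquad
  S^n E = \mathbb{C} e_n \perp \mathbb{C} e_m = S^m E \quad (n \ne m), \\
H_u &= \bigcap_{n \ge 1} S^n H
     = \bigcap_{n \ge 1} \overline{\operatorname{span}}\{ e_k : k \ge n \} = \{0\}, \\
H_p &= \bigoplus_{n \ge 0} S^n E = \bigoplus_{n \ge 0} \mathbb{C} e_n = \ell^2(\mathbb{N}).
\end{align*}
```

In this case the unitary part is trivial and the whole space is generated by the one-dimensional wandering subspace Ce_0.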
In 1961 Halmos published a paper concerning shift operators on Hilbert spaces
(see [9]). The paper introduced the notion of the wandering subspace and its
connections with invariant subspaces of unilateral and bilateral shifts. This notion is
closely related to the famous Wold decomposition of isometries.
A famous result is the theorem of A. Aleman, S. Richter and C. Sundberg. In the standard Bergman space L^2_a(D) of square area integrable analytic functions on the unit disc, if T is the operator of multiplication by the complex coordinate z acting on this space and M is an arbitrary z-invariant subspace, then

M = [M ⊖ TM]_T.

Later on, S. Shimorin introduced a more general property, named the Wold-type decomposition, which generalizes the classical Wold decomposition; see [24, Theorem 3.6]. Shimorin utilized a concept from the 1988 paper of Richter, who showed that an analytic 2-concave operator has the wandering subspace property (see [20]). Shimorin obtained a weak analog of the Wold decomposition theorem, representing an operator close to an isometry, in some sense, as a direct sum of a unitary operator and a shift operator acting on some reproducing kernel Hilbert space of vector-valued holomorphic functions defined on a disc. Shimorin’s model for a left-invertible analytic operator has become a powerful tool in the model theory of left-invertible operators.

1.1 Beurling Theorems for Shift Operators

Let ω = (ω(n))_{n≥0} be a sequence of nonnegative numbers and let

\[ H_\omega = \Bigl\{ x = \sum_{n \ge 0} x_n e_n : \sum_{n \ge 0} |x_n|^2 \omega(n)^2 < \infty \Bigr\} \]

be a Hilbert space endowed with some orthonormal basis (e_n)_{n≥0}. The weighted shift S_ω on H_ω is defined by S_ω e_n = ω(n)e_{n+1}. We devote this section to some classical results concerning Beurling theorems for weighted shift operators.

1.1.1 The Hardy Space

The Hardy space H^2(D) of analytic functions on the unit disc D is given as the Hilbert space

\[ H^2(\mathbb{D}) := \Bigl\{ f(z) = \sum_{n \ge 0} a_n z^n : \|f\|^2_{H^2(\mathbb{D})} := \sum_{n \ge 0} |a_n|^2 < \infty \Bigr\}. \]

The family e_n(z) = z^n, n ≥ 0, is hence an orthonormal basis of H^2(D).

The unilateral shift operator on H^2(D) is the linear operator defined by M_z(e_n) = e_{n+1}. It is clear that M_z(f) = zf for every f ∈ H^2(D).
The shift operator on a Hilbert space H of analytic functions is defined by this last formula, provided that zH ⊂ H. Since its introduction, the shift operator has become a principal tool in the study of the Hardy spaces (and indeed of all spaces of analytic functions). It is a natural bridge between the theory of analytic functions and operator theory. As an example we cite the fact that every cyclic subnormal operator is unitarily equivalent to the shift operator on some Hilbert space of analytic functions.
Let T be a bounded operator on H. Following P. Halmos [12], a subspace E is called wandering (for T) if

E ⊥ T^k(E) for every k ≥ 1.

Clearly, E = N(T∗) is always a wandering subspace for T, since T^k E ⊂ R(T) ⊥ N(T∗). Notice that in the case where T is an isometry, E is wandering if and only if T^i E ⊥ T^j E for every j ≠ i.
Also, a vector e is said to be wandering for the operator T if the subspace Ce is a wandering subspace for T; the latter is equivalent to e ⊥ T^j e for every j ≥ 1.
In [4], Arne Beurling described the lattice of invariant subspaces of Mz in the
Hardy space H 2 (D).
Theorem 1.1 (Beurling’s Theorem) A closed subspace E is invariant for M_z in the Hardy space H^2(D) if and only if there exists an inner function θ such that E = θH^2(D). Moreover, E ⊖ M_z E = Cθ is a one-dimensional wandering subspace such that

E = [E ⊖ M_z E]_{M_z}.

1.1.2 The Bergman Space

The standard Bergman space L^2_a(D) of square area integrable analytic functions on the unit disc is given as the Hilbert space

\[ L^2_a(\mathbb{D}) = \Bigl\{ f(z) = \sum_{n \ge 0} a_n z^n : \|f\|^2 = \frac{1}{\pi} \int_{\mathbb{D}} |f(z)|^2 \, dA(z) = \sum_{n \ge 0} \frac{1}{n+1} |a_n|^2 < \infty \Bigr\}. \]

Here dA denotes the Lebesgue area measure on the complex plane C.

The family e_n(z) = √(n+1) z^n, n ≥ 0, is hence an orthonormal basis of L^2_a(D). For the unilateral shift operator M_z(z^n) = z^{n+1} on the standard Bergman space L^2_a(D), the situation is quite different. A. Aleman, S. Richter and C. Sundberg showed the next Beurling-type theorem.
Theorem 1.2 ([2, Theorem 3.5]) Let E be an invariant subspace of M_z in the Bergman space L^2_a(D). Then E = [E ⊖ M_z E]_{M_z}.
In contrast with the Hardy case, the dimension of the wandering subspace E ⊖ zE for the Bergman shift M_z ranges from 1 to ∞.

1.1.3 The Dirichlet Space

The Dirichlet space D(D) of analytic functions on the unit disc D is

\[ D(\mathbb{D}) = \Bigl\{ f(z) = \sum_{n \ge 0} a_n z^n : D(f) := \int_{\mathbb{D}} |f'(z)|^2 \, dA(z) < \infty \Bigr\}, \]

where dA(z) = (1/π) r dr dt denotes the normalized area measure on D. A norm on D(D) is defined by

\[ \|f\|_D^2 = \|f\|^2_{H^2(\mathbb{D})} + D(f) = \sum_{n=0}^{\infty} (n+1) |a_n|^2 . \]

Endowed with this norm, D(D) is a Hilbert space in which

\[ \Bigl\{ e_n(z) = \frac{1}{\sqrt{n+1}} \, z^n : n \ge 0 \Bigr\} \]

is a canonical orthonormal basis. The main theorem of Beurling type in the case of the Dirichlet space is given by Richter in [20]. In this space, every z-invariant subspace E of D(D) is generated by an extremal function. More precisely, E = φD(m_φ), where φ is a normalized extremal function, m_φ is a certain absolutely continuous measure on the unit circle T, and D(m_φ) is a Dirichlet-type space associated with m_φ. Moreover, E ⊖ zE = Cφ is a one-dimensional wandering subspace such that

E = [E ⊖ M_z E]_{M_z}.

For more information, see for example [20].

1.1.4 More on Beurling’s Theorem for Hilbert Spaces of Analytic Functions

In the Hardy space on the bi-disc H^2(D^2), the Beurling theorem fails in general. Indeed, W. Rudin provided two examples showing that none of the equalities in the Beurling theorem hold. Recall that a closed subspace E ⊂ H^2(D^2) is invariant under the bi-shift M_{(z1,z2)} = (M_{z1}, M_{z2}) if and only if (z1E + z2E) ⊂ E. Again, E ⊖ (z1E + z2E) is a wandering subspace.
Example ([21]) The invariant subspace [z1 − z2]_{M(z1,z2)} is not of the form θH^2(D^2) for any two-variable inner function θ ∈ H^2(D^2).
Example ([21]) Let E be the set of all functions f ∈ H^2(D^2) which have a zero of order greater than or equal to n at (1 − n^{-3}, 0) for n = 1, 2, .... Then E is an invariant subspace of the bi-shift that is not finitely generated, i.e., there exists no finite set f_1, f_2, ..., f_n ∈ H^2(D^2) such that E = [f_1, f_2, ..., f_n]_{M(z1,z2)}.
We also have the next result.
Theorem 1.3 ([13, Theorem 3.6]) There exists a nontrivial function f ∈ H^2(D^2) such that [f]_{M(z1,z2)} ⊖ (z1[f]_{M(z1,z2)} + z2[f]_{M(z1,z2)}) does not generate [f]_{M(z1,z2)}.

1.2 Beurling’s Type Theorem for Left Invertible Operators Close to Isometries

Motivated by the previous discussion, the next definition has been introduced in
several papers.
Definition 1.4 We shall say that an operator T ∈ L(H) admits a Wold-type decomposition if R^∞(T) is closed and
(i) R^∞(T) is reducing for T and the restriction operator T|R^∞(T) is unitary;
(ii) H = [H ⊖ TH]_T ⊕ R^∞(T).
Definition 1.5 An operator T ∈ L(H) is said to have the wandering subspace property if H = [H ⊖ TH]_T, and we say that the Beurling-type theorem holds for T if M = [M ⊖ TM]_T for every M ∈ Lat(T, H).
It is clear that if the Beurling-type theorem holds for T, then T has the wandering subspace property. Also, a pure operator T has the wandering subspace property if and only if it admits a Wold-type decomposition. From the preceding remarks, the Hardy shift, the Bergman shift and the Dirichlet shift satisfy the Beurling-type theorem. We discuss below the contributions of several authors who have been interested in the class of operators satisfying the Beurling-type theorem. The problem of describing all weighted shifts that satisfy the Beurling-type theorem remains open.
The case of left invertible operators has been widely studied in the last two decades. It is always assumed that T satisfies some operator inequalities close to isometries. A pioneering result is due to S. Richter (see [20]), who provided a sufficient condition for an operator S ∈ L(H) to have the wandering subspace property. More precisely,
Theorem 1.6 ([20, Theorem 1]) Let S ∈ L(H) be pure such that

‖S^2 x‖^2 + ‖x‖^2 ≤ 2‖Sx‖^2 for every x ∈ H.

If M ∈ Lat(S, H), then there exists a wandering subspace B for S such that

\[ M = \bigvee_{n \ge 0} S^n B. \]

In particular, Richter’s result states that the Dirichlet shift satisfies the Wandering
subspace property.
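As a sanity check of the hypothesis of Theorem 1.6, one can compute with Taylor coefficients in the Dirichlet space, using the norm ‖f‖²_D = Σ (n+1)|a_n|² recalled in Sect. 1.1.3. The computation below is a sketch showing that the Dirichlet shift M_z satisfies the concavity inequality, in fact with equality.

```latex
% For f(z) = \sum_{n \ge 0} a_n z^n in the Dirichlet space and k \ge 0,
% the function z^k f has coefficient a_n at z^{n+k}, hence:
\begin{align*}
\|M_z^k f\|_D^2 &= \sum_{n \ge 0} (n + k + 1)\, |a_n|^2, \\
\|M_z^2 f\|_D^2 + \|f\|_D^2
  &= \sum_{n \ge 0} \bigl[ (n+3) + (n+1) \bigr] |a_n|^2
   = 2 \sum_{n \ge 0} (n+2)\, |a_n|^2
   = 2\, \|M_z f\|_D^2 .
\end{align*}
```

The shift is also pure here, since a function lying in every z^k D(D) vanishes to infinite order at the origin and is therefore zero.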
Later, in [24], S. Shimorin gave a different approach to prove the following theorem.
Theorem 1.7 ([24]) Let T ∈ L(H) be pure such that

‖Tx + y‖^2 ≤ 2(‖x‖^2 + ‖Ty‖^2) (1)

for every x, y ∈ H. Then T satisfies the wandering subspace property.
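To see that Inequality (1) is indeed a condition "close to isometries", note that every isometry satisfies it. The one-line verification below is a sketch relying only on the elementary inequality ‖u + v‖² ≤ 2(‖u‖² + ‖v‖²).

```latex
% If T is an isometry, then \|Tx\| = \|x\| and \|Ty\| = \|y\| for all x, y in H, so
\[
\|Tx + y\|^2 \le 2\bigl( \|Tx\|^2 + \|y\|^2 \bigr)
             = 2\bigl( \|x\|^2 + \|Ty\|^2 \bigr),
\qquad x, y \in H .
\]
```

For a pure isometry, Theorem 1.7 then recovers the shift part of the classical Wold decomposition.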

This result extends, and provides a simpler proof of, the Aleman-Richter-Sundberg theorem to more general weighted spaces of analytic functions. On the other hand, A. Olofsson in [19] extended Richter’s theorem in the following way.
Theorem 1.8 ([19, Theorem 2.1]) Let T ∈ L(H) be pure such that
(i) T is expansive;
(ii) there exist positive constants c_k, c with Σ_{k≥2} 1/c_k = ∞ such that

‖T^k x‖^2 ≤ c_k(‖Tx‖^2 − ‖x‖^2) + c‖x‖^2 (2)

for every x ∈ H. Then T satisfies the wandering subspace property.

Also, A. Olofsson gave a more precise relation between these conditions. His result can be stated as follows.
Proposition 1.9 ([19, Proposition 1.2]) Let T ∈ L(H) be left invertible, T′ = T(T∗T)^{-1}, and let c be a positive constant. Then the following two statements are equivalent:
(i) ‖T^2 x‖^2 − ‖Tx‖^2 ≤ c(‖Tx‖^2 − ‖x‖^2) for every x ∈ H;
(ii) ‖Q(T′x + y)‖^2 ≤ (1 + 1/c)(‖x‖^2 + c‖T′y‖^2) for every x, y ∈ H,
where Q is the orthogonal projection of H onto R(T).
Moreover, from [19, Corollary 2.1], if T ∈ L(H) is an expansive operator which satisfies Inequality (2) for c = 1, then H = [H ⊖ TH]_T ⊕ R^∞(T) and the restriction T|R^∞(T) is unitary; that is, T admits a Wold-type decomposition.
Remark 1.10 We notice at this point that, in Theorem 1.8, it is necessary that c ≥ 1. To see this, observe first that inf_{‖x‖=1} ‖Tx‖ = 1. Otherwise, there exists r > 1 such that ‖Tx‖ ≥ r‖x‖ for every x ≠ 0. It would follow by induction that r^{2k} − c ≤ c_k(‖T‖^2 − 1), and hence that Σ_{k≥2} 1/c_k is finite. Now, taking the infimum over unit vectors x in

1 = ‖x‖^2 ≤ ‖T^k x‖^2 ≤ c_k(‖Tx‖^2 − 1) + c,

we get 1 ≤ c.
In 2009, S. Sun and D. Zheng (see [25]) gave another proof of the Beurling-type theorem by proving some new identities in the Bergman space, and later K. J. Izuchi, K. H. Izuchi and Y. I. Izuchi used these ideas in [14] to prove the next theorem.
Theorem 1.11 ([14, Theorem 1.1]) Let T ∈ L(H). If T satisfies the following conditions:
(i) ‖Tx‖^2 + ‖(T∗)^2 Tx‖^2 ≤ 2‖T∗Tx‖^2 for all x ∈ H;
(ii) T is bounded below;
(iii) ‖T‖ ≤ 1;
(iv) ‖(T∗)^k x‖ → 0 as k → ∞ for every x ∈ H,
then H = [H ⊖ TH]_T.
The main purpose of this survey is to present the abstract approach to the problem.
We extend the previous results to the more general class of regular operators
introduced by M. Mbekhta in [17] and developed in [8].
We devote Sect. 2 to some well known properties of regular operators and the
basic tools of this class of operators. Section 3 is focused on the generalization of
the previous results to the class of regular operators. More precisely, we give, under the same conditions on orbits, an extension of the Wold-type decomposition for regular operators (see Theorem 3.9).
Section 4 is devoted to the duality between a bi-regular operator T and its Cauchy
dual ω(T ). This duality is reflected in terms of extended Wold-type decomposition.
Some applications and examples are widely given. In particular, we apply our results
to regular bilateral weighted shifts.

2 Regular Operators

2.1 Moore-Penrose Generalized Inverse

An operator S ∈ L(H) is a generalized inverse of T if TST = T and STS = S. It is not difficult to see that an operator admits a generalized inverse if and only if it has closed range. We will focus in this survey on a particular generalized inverse for T ∈ L(H) with closed range. More precisely, a standard generalized inverse for T can be built as follows. We consider the operator T_0 = T|N(T)^⊥ : N(T)^⊥ → R(T), which is clearly bijective. Define T† by

\[ T^\dagger x = \begin{cases} T_0^{-1} x & \text{if } x \in R(T), \\ 0 & \text{if } x \in R(T)^\perp. \end{cases} \]

We have T† = T_0^{-1} P_{R(T)}, where P_E denotes the orthogonal projection onto a given closed subspace E. It is easy to see that TT†T = T and T†TT† = T†, and thus T† is a generalized inverse of T such that TT† and T†T are orthogonal projections. The operator T† is called the Moore-Penrose inverse of T, and has been widely studied in the literature. It is usually defined as the unique solution of the following four operator equations:

TT†T = T,  T†TT† = T†,  (TT†)∗ = TT†,  (T†T)∗ = T†T. (3)

Among all generalized inverses, the Moore-Penrose inverse T† has received special attention from several authors. In the case of left invertible operators, T† coincides with the standard left inverse (T∗T)^{-1}T∗, and in the case where T is right invertible, T† = T∗(TT∗)^{-1}. The next well-known result links the reduced minimum modulus and the Moore-Penrose inverse; it can be found in [18, Corollary 2.3]. For T ∈ L(H) with closed range, we have

‖T†‖ = 1/γ(T).

We summarize in the proposition below some further properties of the Moore-

Penrose inverse of T which will be used in the sequel.
Proposition 2.1 Let T ∈ L(H) be with closed range. We have
(a) T T † = PR(T ) , T † T = PN(T )⊥ ,
(b) R(T † ) = R(T ∗ ) = N(T )⊥ ,
(c) N(T † ) = N(T T † ) = N(T ∗ ) = R(T )⊥ ,
(d) R(T ) = R(T T † ) = R(T †∗ ),
(e) N(T ) = N(T † T ) = N(T †∗ ),
(f) (T ∗ )† = (T † )∗ ,
(g) (T † )† = T ,
(h) T ∗T T † = T †T T ∗ = T ∗.
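The items of Proposition 2.1 and the identity ‖T†‖ = 1/γ(T) can be checked by hand on a small matrix. The example below is an illustrative sketch only; the rank-one operator T on C² is a choice made here and does not come from the text.

```latex
% Illustrative example: a rank-one operator on \mathbb{C}^2 and its Moore-Penrose inverse.
\[
T = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
T^\dagger = \frac{1}{2} \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}.
\]
% Item (a): both products are orthogonal projections, with
% R(T) = \mathbb{C}(1,0) and N(T)^\perp = \mathbb{C}(1,1).
\[
T T^\dagger = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = P_{R(T)}, \qquad
T^\dagger T = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} = P_{N(T)^\perp}.
\]
% Norm identity: the unit vector u = (1,1)/\sqrt{2} spans N(T)^\perp, so
\[
\gamma(T) = \|Tu\| = \sqrt{2}, \qquad
\|T^\dagger\| = \|T^\dagger e_1\| = \frac{1}{\sqrt{2}} = \frac{1}{\gamma(T)}.
\]
```

The four equations in (3) follow at once from the two displayed products, since TT†T = P_{R(T)}T = T and T†TT† = T†P_{R(T)} = T†.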

2.2 Some Basic Properties of Regular Operators

Regular operators have been introduced as a natural family of operators close to

semi invertible ones, where an operator is said to be semi invertible if it is left or
right invertible. Since then, they have been widely studied, see [15] for example. We
recall the definition of regular operators from [16],
Definition 2.2 An operator T ∈ L(H) is said to be regular if R(T ) is closed and if
N(T k ) ⊂ R(T ), for every k ≥ 1.
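Two small examples may help to fix the definition. They are illustrative sketches chosen here (the unilateral shift on ℓ²(N) and a 2×2 nilpotent matrix), not examples taken from the text; the second one shows that closed range together with N(T) ⊂ R(T) alone is not enough.

```latex
% (a) Unilateral shift S e_n = e_{n+1} on \ell^2(\mathbb{N}): injective with closed range.
\[
N(S^k) = \{0\} \subset R(S) \quad (k \ge 1)
\quad \Longrightarrow \quad S \text{ is regular, and } R^\infty(S) = \{0\}.
\]
% (b) Nilpotent Jordan block on \mathbb{C}^2: closed range, N(T) \subset R(T),
%     but the inclusion fails for k = 2, so T is not regular.
\[
T = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
N(T) = R(T) = \mathbb{C} e_1, \qquad
N(T^2) = \mathbb{C}^2 \not\subset R(T).
\]
```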
The regular resolvent, denoted reg(T ), is defined as the set of complex numbers
λ for which, there is a neighborhood Uλ of λ and an analytic function Rλ : Uλ −→
L(H) such that Rλ (μ) is a generalized inverse of T − μI for every μ ∈ Uλ . If
0 ∈ reg(T ), the operator T is said to be Kato invertible, see [16]. The generalized
spectrum of T is defined as σg (T ) = C \ reg(T ).

It is clear that T is regular if and only if T ∗ is regular, that all injective operators
with closed range and that all surjective operators are regular. We give next some
classical known facts on regular operators. We refer to the corresponding books and
papers for proofs and further information.
Following Saphar [22], the algebraic core C(T ) of T , is the greatest subspace M
of H for which T (M) = M. In terms of sequences, we have

C(T ) = {x ∈ H : ∃(un )n ⊂ H such that x = u0 and T un+1 = un } .

Proposition 2.3 ([7, Proposition 1]) Let T ∈ L(H) be a regular operator. We have

R ∞ (T ) = T (R ∞ (T )) = C(T ).

In particular, if T is regular, then x ∈ R ∞ (T ) if and only if T x ∈ R ∞ (T ).

In the next proposition, we summarize some properties of the generalized range
R ∞ (T ) in the case of regular operators.
Proposition 2.4 Let T ∈ L(H) be a regular operator. We have

(i) R^∞(T) is closed;
(ii) if R^∞(T) = {0}, then T is left invertible.
For further information, we refer to [1], [3] and [15].
The next proposition is given in [3] and will be useful in the sequel,
Proposition 2.5 ([7, Proposition 3]) Let T ∈ L(H) be regular. If S is such that TST = T, then

T^n S^n T^n = T^n for every n ≥ 1.

In particular, if S is also regular, then S^n is a generalized inverse of T^n for every n ≥ 1; that is,

T^n S^n T^n = T^n and S^n T^n S^n = S^n for every n ≥ 1.

Generalized inverses have various interesting properties. For example, if S is a generalized inverse of an operator T, then TSx = x for every x ∈ R(T). The next description of the generalized range for regular operators is useful for the proof of our main results.
Proposition 2.6 Let T ∈ L(H) be a regular operator with a generalized inverse S. Then

R^∞(T) = {x ∈ H : T^n S^n x = x for every n ≥ 0}.

The next proposition from [3] summarizes additional properties of regular

operators. See also [8].
Proposition 2.7 Let T ∈ L(H) be regular and S be a generalized inverse of T . We
have the following.
(i) T (N ∞ (T )) = N ∞ (T );
(ii) S(R ∞ (T )) ⊆ R ∞ (T );
(iii) S(N ∞ (T )) ⊆ N ∞ (T );
(iv) R ∞ (T )⊥ = N ∞ (T ∗ ).
We use generalized inverses to provide necessary and sufficient conditions for an
operator to be regular.
Proposition 2.8 Let T ∈ L(H) have closed range and let S be a generalized inverse
of T . The following are equivalent:
(i) T is regular;
(ii) S k N(T ) ⊆ R ∞ (T ) for every k ≥ 0;
(iii) S k N(T ) ⊆ R(T ) for every k ≥ 0.
Proof (i) ⇒ (ii). Since T is regular, we have N(T ) ⊆ R ∞ (T ), and the result follows
by induction from (ii) in the previous proposition.
(ii) ⇒ (iii). Obvious.
(iii) ⇒ (i). Since R(T ) is assumed to be closed, it remains to show that
N(T n ) ⊆ R(T ) for every n ≥ 1. So, suppose that S k N(T ) ⊆ R(T ) for every
k ≥ 0. For n ≥ 1 and x ∈ N(T n ), as in [8, Lemma 2], we have

\[ x = x - S^n T^n x = \sum_{k=0}^{n-1} S^k P_{N(T)} T^k x. \]

Since PN(T ) T k x ∈ N(T ) and by our assumption S k N(T ) ⊆ R(T ) for every k ≥ 0,
we obtain x ∈ R(T ) and hence N(T n ) ⊆ R(T ).

2.3 Restrictions of Regular Operators

The restriction of a regular operator to an invariant subspace need not be regular
(consider, for instance, the restriction to the kernel). In the following proposition,
we consider the restriction of T to R ∞ (T ).
Proposition 2.9 ([7, Proposition 6]) Let T ∈ L(H) be a regular operator and
n ≥ 1. The restriction operator

Tn = T|R ∞ (T )∩R(T †n ) : R ∞ (T ) ∩ R(T †n ) → R ∞ (T )


is bijective. In particular,

T1 = T|R ∞ (T )∩N(T )⊥ : R ∞ (T ) ∩ N(T )⊥ → R ∞ (T )

is bijective and hence

R ∞ (T ) ∩ R(T †n ) = R ∞ (T ) ∩ N(T )⊥ for every n ≥ 1.

Proof Since T is regular, we have T (R ∞ (T )) = R ∞ (T ), and the inclusion
T (R ∞ (T ) ∩ R(T †n )) ⊂ T (R ∞ (T )) = R ∞ (T ) follows immediately.
Let x ∈ R ∞ (T ). From Proposition 2.6, we have x = T n T †n x. Since R ∞ (T )
is invariant for T † , we obtain T †n x ∈ R ∞ (T ) ∩ R(T †n ). Finally, x = T n T †n x ∈
T (R ∞ (T ) ∩ R(T †n )) and thus T (R ∞ (T ) ∩ R(T †n )) = R ∞ (T ). Hence
T|R ∞ (T )∩R(T †n ) is onto. To show that it is one to one, let x ∈ R ∞ (T ) ∩
R(T †n ) be such that T x = 0. Then x ∈ R ∞ (T ) ∩ R(T †n ) ∩ N(T ).
Since R(T †n ) ∩ N(T ) ⊂ R(T † ) ∩ N(T ) = N(T )⊥ ∩ N(T ) = {0}, we have x = 0,
and so T|R ∞ (T )∩R(T †n ) is one to one. Therefore, T|R ∞ (T )∩R(T †n ) is bijective.
We derive the second assertion by taking n = 1, because N(T )⊥ = R(T † ), and the
third one by taking the inverse of T1 .

Theorem 2.10 ([7, Theorem 4]) Let T ∈ L(H) be a regular operator. Then
\[ (T|_{R^\infty(T)})^\dagger = T^\dagger|_{R^\infty(T)}. \]

We also have the next characterization of regular operators.

Theorem 2.11 ([7, Theorem 5]) Let T ∈ L(H) have closed range. The following
conditions are equivalent:
(i) T is regular.
(ii) For every $n \ge 0$, the map $\widehat{T} : R(T)^\perp \to R(T^n) \cap R(T^{n+1})^\perp$ defined by

\[ \widehat{T}x = P_{R(T^n) \cap R(T^{n+1})^\perp} T^n x \]

is one to one.
(iii) $R(T)^\perp$ is in bijection with $R(T^n) \cap R(T^{n+1})^\perp$ for every $n \ge 0$.

Corollary 2.12 For T ∈ L(H) regular and for every integer n ≥ 1, we have

dim R(T n ) ∩ R(T n+1 )⊥ = dim R(T )⊥ .


We also have
Corollary 2.13 Let T ∈ L(H) be regular such that dim R(T )⊥ = 1. Then there is
a sequence of orthogonal wandering vectors {ek }k≥1 such that
\[ H = \bigoplus_{k \ge 1} \mathbb{C} e_k \oplus R^\infty(T). \]

Proof By Corollary 2.12, dim R(T n ) ∩ R(T n+1 )⊥ = 1 for every n ≥ 0. Let en be
a nonzero vector in R(T n−1 ) ∩ R(T n )⊥ for n ≥ 1. Since

H = ((H ∩ R(T )⊥ ) ⊕ (R(T ) ∩ R(T 2 )⊥ ) ⊕ (R(T 2 ) ∩ R(T 3 )⊥ ) ⊕ · · · ) ⊕ R ∞ (T ),

the relation H = C.e1 ⊕ C.e2 ⊕ C.e3 ⊕ · · · ⊕ R ∞ (T ) follows. But we know that
T j en ∈ R(T n ) ∀j ≥ 1, thus we have en ⊥ T j en ∀n, j ≥ 1, and so en is a
wandering vector for T for all n.

3 Wold Type Decomposition for Regular Operators

In the sequel, consider T ∈ L(H) such that γ (T ) ≥ 1. We fix the following notation:

\[ E = H \ominus TH = N(T^*) \quad \text{and} \quad E^\dagger = H \ominus T^\dagger H = N(T^{\dagger *}) = N(T). \]

It follows from the identity $\|T^\dagger\| = 1/\gamma(T)$ that $T^\dagger$ is contractive, and hence that
$\|T^\dagger T x\|^2 \le \|Tx\|^2$ for every $x \in H$. Since $T^\dagger T$ is an orthogonal projection, we
conclude that $T^*T - T^\dagger T \ge 0$. We denote by $D_T$ the operator given by

\[ D_T = (T^*T - T^\dagger T)^{1/2}. \]

Clearly $\|D_T x\|^2 = \|Tx\|^2 - \|T^\dagger T x\|^2$ for all $x \in H$.

3.1 The Main Results

We start with the following proposition involving DT .

Proposition 3.1 ([7, Proposition 7]) Let T ∈ L(H) be an operator with γ (T ) ≥ 1.
Let $c > 0$ and let $(c_k)_{k \ge 2}$ be a positive sequence. The following are
equivalent:
(i)

\[ \|T^k x\|^2 \le c_k \|D_T x\|^2 + c\,\|T^\dagger T x\|^2 \quad \text{for every } x \in H; \tag{4} \]

(ii)

\[ \|T^k x\|^2 \le c_k (\|Tx\|^2 - \|x\|^2) + c\,\|x\|^2 \quad \text{for every } x \in N(T)^\perp. \tag{5} \]

Proof We notice first that since γ (T ) > 0, the operator T † exists. Suppose that the
inequality (4) holds and let x ∈ N(T )⊥ . Since N(T )⊥ = R(T † ), we get T † T x =
PR(T † ) x = x, and (5) follows because $\|D_T x\|^2 = \|Tx\|^2 - \|T^\dagger T x\|^2$. Conversely, let x ∈ H.
Then T † T x ∈ N(T )⊥ . By substituting T † T x for x in (5) and by using the identity
T T † T = T , we obtain (4).

In the case of expansive operators, these inequalities are equivalent to Inequality (2)
introduced in [19].
We extend next some known results of left invertible operators to our setting.
Lemma 3.2 ([7, Lemma 1]) Let T ∈ L(H) be such that γ (T ) ≥ 1 and let n ≥ 1
be an integer. For every x ∈ H, we have

\[ \|x\|^2 = \sum_{i=0}^{n-1} \|P_E (T^\dagger)^i x\|^2 + \|(T^\dagger)^n x\|^2 + \sum_{i=1}^{n} \|D_T (T^\dagger)^i x\|^2, \tag{6} \]

where $P_E = I - TT^\dagger$.
Lemma 3.3 Let T ∈ L(H) have closed range and n ≥ 1. We have

(i) $x - T^n (T^\dagger)^n x = \sum_{i=0}^{n-1} T^i P_E (T^\dagger)^i x$ for every $x \in H$;

(ii) $x - (T^\dagger)^n T^n x = \sum_{i=0}^{n-1} (T^\dagger)^i P_{E^\dagger} T^i x$ for every $x \in H$.

Proof We have:

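\[
\begin{aligned}
I - T^n (T^\dagger)^n &= \sum_{i=0}^{n-1} \big( T^i (T^\dagger)^i - T^{i+1} (T^\dagger)^{i+1} \big) \\
&= \sum_{i=0}^{n-1} T^i (I - TT^\dagger)(T^\dagger)^i \\
&= \sum_{i=0}^{n-1} T^i P_E (T^\dagger)^i .
\end{aligned}
\]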

The second equality is obtained in a similar way.
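
The telescoping identity in this proof can be checked numerically. The following is a minimal sketch in exact rational arithmetic, assuming the 2×2 matrix T = [[1, −1], [0, 0]] and its Moore-Penrose inverse, which are borrowed from the counterexample in Section 5; the helper names (`matmul`, `matpow`, `left_side`, `right_side`) are illustrative and not part of the source.

```python
from fractions import Fraction as F

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[F(int(i == j)) for j in range(n)] for i in range(n)]

def matpow(A, p):
    """p-th power of A, with A^0 equal to the identity."""
    out = identity(len(A))
    for _ in range(p):
        out = matmul(out, A)
    return out

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

# Assumed sample operator and its Moore-Penrose inverse (hard-coded, not computed).
T = [[F(1), F(-1)], [F(0), F(0)]]
T_dag = [[F(1, 2), F(0)], [F(-1, 2), F(0)]]

# P_E = I - T T^dagger, the orthogonal projection onto E = R(T)^perp.
P_E = sub(identity(2), matmul(T, T_dag))

def left_side(n):
    """I - T^n (T^dagger)^n."""
    return sub(identity(2), matmul(matpow(T, n), matpow(T_dag, n)))

def right_side(n):
    """Sum over i = 0..n-1 of T^i P_E (T^dagger)^i."""
    total = [[F(0)] * 2 for _ in range(2)]
    for i in range(n):
        term = matmul(matmul(matpow(T, i), P_E), matpow(T_dag, i))
        total = add(total, term)
    return total
```

Comparing `left_side(n)` with `right_side(n)` for small n confirms identity (i) on this sample. The identity is purely algebraic, so the same check works for any pair of square matrices in place of T and T †.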

As a consequence we have the following useful result.
Proposition 3.4 ([7, Proposition 8]) Let T ∈ L(H) have closed range and n ≥ 1
be an integer. We have:
(i) $N((T^\dagger)^n) \subset \bigvee_{i=0}^{n-1} \{ T^i x_i ,\ x_i \in E \}$.

If moreover T is regular, we get

(ii) $N(T^n) = \bigvee_{i=0}^{n-1} \{ (T^\dagger)^i x_i ,\ x_i \in E^\dagger \}$,
(iii) $N((T^*)^n) = \bigvee_{i=0}^{n-1} \{ (T^{\dagger *})^i x_i ,\ x_i \in E \}$.
For an arbitrary operator, using the equalities:
\[ \bigvee_{n \ge 0} N(T^n) = \bigvee_{n \ge 0} \big( R((T^*)^n) \big)^\perp = \Big\{ \bigcap_{n \ge 0} R((T^*)^n) \Big\}^\perp = R^\infty(T^*)^\perp , \]

we get the following duality formulas.

Corollary 3.5 Let T ∈ L(H) be with closed range, then
(i) R ∞ (T †∗ )⊥ ⊂ [E]T .
If moreover T is regular, we have
(ii) R ∞ (T ∗ )⊥ = [E † ]T † ,
(iii) R ∞ (T )⊥ = [E]T †∗ .
To give further information for regular operators, we need the next lemma.
Lemma 3.6 Let T ∈ L(H). If R ∞ (T ) reduces T , then

[E]T ⊂ R ∞ (T )⊥ .

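Proof We have $E = R(T)^\perp \subset R^\infty(T)^\perp$. For $n \ge 1$, $x \in E$ and $y \in R^\infty(T)$ we
have $\langle T^n x, y \rangle = \langle x, T^{*n} y \rangle = 0$ because $T^{*n} y \in R^\infty(T) \subset R(T)$. Finally $[E]_T \subset R^\infty(T)^\perp$.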

We have
Theorem 3.7 ([7, Theorem 6]) Let T ∈ L(H) be regular. If R ∞ (T ) reduces T ,
then T † is regular.
Theorem 3.8 ([7, Theorem 7]) Let T ∈ L(H) be a regular operator with γ (T ) ≥
1 and such that

\[ \|T^k x\|^2 \le c_k (\|Tx\|^2 - \|T^\dagger T x\|^2) + \|T^\dagger T x\|^2 \quad \text{for every } x \in H \tag{7} \]

with $\sum_{k \ge 2} \frac{1}{c_k} = \infty$. Then

\[ H = [E]_T + R^\infty(T) \]

with $E = H \ominus TH$.

We will show next that under the assumptions of Theorem 3.8, we have H =
[E]T ⊕ R ∞ (T ). We collect next some additional results provided by the same
assumptions.
Theorem 3.9 ([7, Theorem 8]) Let T ∈ L(H) be a regular operator with γ (T ) ≥
1. Under the assumptions of Theorem 3.8, the following assertions hold.
(i) The subspace R ∞ (T ) is reducing for T ,
(ii) $(T|_{R^\infty(T)})^\dagger = T^\dagger|_{R^\infty(T)} = T^*|_{R^\infty(T)} = (T|_{R^\infty(T)})^*$,

(iii) T † is regular,
(iv) the restriction

T|R ∞ (T )∩N(T )⊥ : R ∞ (T ) ∩ N(T )⊥ → R ∞ (T )

is a unitary operator,
(v) H has an orthogonal decomposition

H = [E]T ⊕ R ∞ (T ).

The next duality formulas are immediate.

Corollary 3.10 ([7, Corollary 7]) Let T ∈ L(H). Under the assumptions of
Theorem 3.8, we have
(i) [E]T = [E]T †∗ and [E † ]T † = [E † ]T ∗ ;
(ii) R ∞ (T †∗ ) = R ∞ (T ) and R ∞ (T ∗ ) = R ∞ (T † );
(iii) H has an orthogonal decomposition

H = [E † ]T † ⊕ R ∞ (T † ).

We derive that:
Corollary 3.11 ([7, Corollary 8]) Let T ∈ L(H). Under the assumptions of
Theorem 3.8, we have

T †n = T n† on R ∞ (T ) ∩ R ∞ (T ∗ ).

The case of expansive operators is a particular case of γ (T ) ≥ 1, and in this case
T † is a left inverse of T , thus T † T x = x, ∀x ∈ H. We retrieve Theorem 1.8 from
[19, Theorem 2.1].
Moreover, we have R ∞ (T ) ∩ N(T )⊥ = R ∞ (T ), and using Theorem 3.9 we
deduce the following corollary from [19, Corollary 2.1].
Corollary 3.12 ([7, Corollary 9]) Let T ∈ L(H) be an expansive operator such
that the inequality (4) holds for x ∈ H with $c = 1$ and $\sum_{k \ge 2} \frac{1}{c_k} = \infty$. Then, the
space H has an orthogonal sum decomposition

H = [E]T ⊕ R ∞ (T ),

and the restriction

T|R ∞ (T ) : R ∞ (T ) → R ∞ (T )

is a unitary operator. That is, T admits a Wold-type decomposition.

3.2 Applications to Weighted Shifts

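We apply our results to weighted shift operators that are not necessarily left invertible.
For this purpose, we assume that H is a Hilbert space and $(e_n)_{n \in \mathbb{Z}}$ is an orthonormal
basis of H.
Let $(\omega_n)_{n \in \mathbb{Z}}$ be a bounded sequence and $S_\omega : H \to H$ be the bilateral weighted
shift defined by $S_\omega e_n = \omega_n e_{n+1}$. It is well known that $S_\omega$ is one to one if and
only if $\omega_n \ne 0$ for every n. Indeed, let $x \in H$ be such that $x = \sum_{n \in \mathbb{Z}} x_n e_n$. We have

\[ \|S_\omega x\|^2 = \sum_{n \in \mathbb{Z}} |x_n|^2 |\omega_n|^2 \ge \inf_{n \in \mathbb{Z}} |\omega_n|^2 \, \|x\|^2 . \]

Thus, $S_\omega$ is left invertible whenever $\inf_{n \in \mathbb{Z}} |\omega_n| > 0$.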

Regular bilateral weighted shift operators Sω that are not left invertible are easily
characterized as those shifts for which there exists a unique n0 ∈ Z such that ωn0 =
0. In this case, Sω is the direct sum of a unilateral weighted shift (on the subspace
spanned by (en )n>0 ) and the adjoint of a unilateral weighted shift (on the orthogonal
complement).
It is well known [23, Corollary 1] that Sω is unitarily equivalent to the weighted
shift operator with weight sequence (|ωn |)n∈Z . So, we can assume that ωn ≥ 0 for
every n ∈ Z.
For more information about weighted shift operators, see [23].
For simplicity we assume that $\omega_0 = 0$ and $\omega_n \ne 0$ for every $n \ne 0$.
Proposition 3.13 ([7, Proposition 9]) Let (ωk )k∈Z be a bounded sequence such
that
(i) $\omega_k = 1$ for $k < 0$, $\omega_0 = 0$ and $\omega_k \ge 1$ for $k \ge 1$;
(ii) $\omega_k^2\, \omega_{k+1}^2 \cdots \omega_{k+n-1}^2 - 1 \le c_n (\omega_k^2 - 1)$ for all $k \ge 1$ and $n \ge 2$, where $(c_n)_n$ is a
positive sequence such that $\sum_{n \ge 2} \frac{1}{c_n} = \infty$.
Then
(a) $S_\omega$ satisfies the assumptions of Theorem 3.9. In particular, the inequality (4)
holds for $S_\omega$ with $c = 1$ and for the sequence $(c_n)_n$;
(b) $S_\omega^\dagger$ is a regular contraction;
(c) the subspace $\bigvee_{j \le 0} \{e_j\}$ is reducing for $S_\omega$ and the restriction of $S_\omega$ to $\bigvee_{j < 0} \{e_j\}$
is unitary.

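Proof Notice first that since $\omega_k^2\, \omega_{k+1}^2 \cdots \omega_{k+n-1}^2$ equals 1 or 0 for all $k < 0$ and
$n \ge 2$, we get

\[ \omega_k^2\, \omega_{k+1}^2 \cdots \omega_{k+n-1}^2 - 1 \le c_n (\omega_k^2 - 1) \quad \forall k \in \mathbb{Z}^*,\ n \ge 2. \tag{8} \]

Now, let $x \in H$ be such that $x = \sum_{k \in \mathbb{Z}} a_k e_k$. Clearly, we have

\[ \|S_\omega^n x\|^2 = \sum_{k \ne 0} |a_k|^2\, \omega_k^2 \cdots \omega_{k+n-1}^2 , \quad n \ge 1, \]

and

\[ \|S_\omega^\dagger S_\omega x\|^2 = \sum_{k \ne 0} |a_k|^2 . \]

(a) Since $\|S_\omega^\dagger x\|^2 = \sum_{k \ne 1} \frac{1}{\omega_{k-1}^2} |a_k|^2$ and $\omega_k \ge 1$ for all integers $k \ne 0$, we
conclude that $S_\omega^\dagger$ is a contraction (and so $\gamma(S_\omega) \ge 1$). Now, from Inequality (8),
we get

\[ \|S_\omega^n x\|^2 \le c_n (\|S_\omega x\|^2 - \|S_\omega^\dagger S_\omega x\|^2) + c\,\|S_\omega^\dagger S_\omega x\|^2 \]

where $c = 1$.
(b) From Theorem 3.9, we conclude that $S_\omega^\dagger$ is a regular operator.
(c) By Theorem 3.9, the subspace $R^\infty(S_\omega)$ is reducing for $S_\omega$ and the restriction
of $S_\omega$ to $R^\infty(S_\omega) \cap N(S_\omega)^\perp$ is a unitary operator. On the other hand, clearly
we have $R^\infty(S_\omega) = \bigvee_{j \le 0} \{e_j\}$, and since $N(S_\omega)^\perp = \bigvee_{k \ne 0} \{e_k\}$, we get
$R^\infty(S_\omega) \cap N(S_\omega)^\perp = \bigvee_{j < 0} \{e_j\}$, and so the restriction of $S_\omega$ to $\bigvee_{j < 0} \{e_j\}$ is unitary.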

Remark 3.14 If we assume that $\omega_k = 1$ for all $k \ne 0$ and $\omega_0 = 0$, then we have
$S_\omega^\dagger = S_\omega^*$; in this case, $S_\omega$ is a partial isometry.
Example ([7]) Let H be a Hilbert space and (en )n∈Z an orthonormal basis of H.
Let Sω : H −→ H be a bilateral shift defined by Sω (en ) = ωn en+1 where (ωk )k∈Z
is defined by:
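\[ \omega_k = \begin{cases} 1 & \text{for } k < 0, \\ 0 & \text{for } k = 0, \\ \sqrt{\dfrac{k+1}{k}} & \text{for } k \ge 1. \end{cases} \]

The sequence $(\omega_k)_{k \in \mathbb{Z}}$ is bounded. More precisely, we have $1 \le \omega_k \le \sqrt{2}$ for all
$k \ne 0$. Take $c_n = n$. Clearly, we have

\[ \sum_{n \ge 2} \frac{1}{c_n} = \sum_{n \ge 2} \frac{1}{n} = \infty . \]

For $k < 0$, we have $c_n (\omega_k^2 - 1) = 0$ and a simple computation shows that
$\omega_k^2\, \omega_{k+1}^2 \cdots \omega_{k+n-1}^2 - 1 \le 0$. Thus

\[ \omega_k^2\, \omega_{k+1}^2 \cdots \omega_{k+n-1}^2 - 1 \le c_n (\omega_k^2 - 1). \]

For $k \ge 1$, we have

\[ \omega_k^2\, \omega_{k+1}^2 \cdots \omega_{k+n-1}^2 - 1 = \frac{k+1}{k} \cdot \frac{k+2}{k+1} \cdots \frac{k+n}{k+n-1} - 1 = \frac{n}{k} , \]

and since $c_n (\omega_k^2 - 1) = n \big( \frac{k+1}{k} - 1 \big) = \frac{n}{k}$, we conclude that

\[ \omega_k^2\, \omega_{k+1}^2 \cdots \omega_{k+n-1}^2 - 1 \le c_n (\omega_k^2 - 1). \]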

Thus, the inequality (8) is satisfied. Consequently, the operator Sω satisfies all
properties of Proposition 3.13.
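
The verification of inequality (8) for this example can be reproduced mechanically. The sketch below uses exact rational arithmetic and works with the squared weights, so no square roots are needed; the function names and the finite test window are illustrative assumptions, and a finite check is of course no substitute for the proof above.

```python
from fractions import Fraction

def omega_sq(k):
    """Squared weight of the example: 1 for k < 0, 0 for k = 0, (k+1)/k for k >= 1."""
    if k < 0:
        return Fraction(1)
    if k == 0:
        return Fraction(0)
    return Fraction(k + 1, k)

def lhs(k, n):
    """omega_k^2 * omega_{k+1}^2 * ... * omega_{k+n-1}^2 - 1."""
    prod = Fraction(1)
    for j in range(k, k + n):
        prod *= omega_sq(j)
    return prod - 1

def rhs(k, n):
    """c_n * (omega_k^2 - 1) with the choice c_n = n."""
    return n * (omega_sq(k) - 1)

def check(max_abs_k=30, max_n=30):
    """Test inequality (8) for all nonzero |k| <= max_abs_k and 2 <= n <= max_n."""
    for k in range(-max_abs_k, max_abs_k + 1):
        if k == 0:
            continue
        for n in range(2, max_n + 1):
            if lhs(k, n) > rhs(k, n):
                return False
    return True
```

For k ≥ 1 both sides equal n/k, so the inequality holds with equality there; for k < 0 the left side is 0 or −1 and the right side is 0.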

4 Wold-Type Decomposition for Bi-Regular Operators

4.1 First Properties

It is not known whether the Moore-Penrose inverse of a regular operator remains
regular. This fact motivates the introduction of the class of bi-regular operators.
Definition 4.1 An operator T ∈ L(H) is said to be bi-regular if both T and T † are
regular.
We list below some examples of bi-regular operators.
Examples
(i) Left invertible operators and right invertible operators are bi-regular.
(ii) Regular partial isometries are bi-regular. In fact, a partial isometry satisfies
$T^* T T^* = T^*$ and $T T^* T = T$, and it follows that $T^\dagger = T^*$ is regular.
(iii) Next, we provide an example of a bi-regular operator that is neither left
invertible nor right invertible and that is not a partial isometry. Let H be a
Hilbert space with an orthonormal basis $(e_{i,j})$, where $(i, j) \in \mathbb{Z}^2$. Consider the
operator $T_\alpha = \alpha (S^* \otimes S) P$, where S is the bilateral shift and P is the orthogonal
projection onto $\bigvee \{ e_i \otimes e_j : i, j \in \mathbb{Z},\ i \ne j \}$, defined on H by

\[ T_\alpha e_{i,j} = \alpha (1 - \delta_{i,j}) e_{i-1,j+1} , \]

where $\delta_{i,j}$ is the Kronecker symbol and $\alpha$ is a nonzero real number.
We first observe that $N(T_\alpha) = \bigvee_{i \in \mathbb{Z}} \{ e_{i,i} \}$. Indeed, $T_\alpha e_{i,i} = 0$ for every
$i \in \mathbb{Z}$ and hence $\bigvee_{i \in \mathbb{Z}} \{ e_{i,i} \} \subset N(T_\alpha)$. Conversely, for $x = \sum_{i,j \in \mathbb{Z}} a_{i,j} e_{i,j} \in N(T_\alpha)$
with $a_{i,j} \in \mathbb{C}$, we have

\[ 0 = T_\alpha x = \alpha \sum_{i,j \in \mathbb{Z}} (1 - \delta_{i,j}) a_{i,j} e_{i-1,j+1} . \]

This implies that if $i \ne j$ we have $a_{i,j} = 0$. Thus, $x = \sum_{i \in \mathbb{Z}} a_{i,i} e_{i,i} \in \bigvee_{i \in \mathbb{Z}} \{ e_{i,i} \}$.
Since $T_\alpha e_{i+1,i-1} = \alpha (1 - \delta_{i+1,i-1}) e_{i,i} = \alpha e_{i,i}$ and $T_\alpha e_{i+2,i-2} = \alpha e_{i+1,i-1}$,
an induction argument shows that

\[ e_{i,i} = \frac{1}{\alpha^n} T_\alpha^n e_{i+n,i-n} \quad \text{for every } n \ge 1. \]

Hence $N(T_\alpha) \subseteq R^\infty(T_\alpha)$. On the other hand, since $R(T_\alpha) = \bigvee_{j \ne i+2} \{ e_{i,j} \}$,
the range $R(T_\alpha)$ is closed. Therefore, $T_\alpha$ is regular.
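Now, simple computations show that

\[ T_\alpha^* e_{i,j} = \alpha (1 - \delta_{i+2,j}) e_{i+1,j-1} , \tag{9} \]

and thus

\[ N(T_\alpha^*) = \bigvee_{i \in \mathbb{Z}} \{ e_{i,i+2} \} . \]

Let $(i, j) \in \mathbb{Z}^2$ be such that $j \ne i + 2$ and let $(m, n) \in \mathbb{Z}^2$. If $m = n$ we get

\[ e_{m,n} \in N(T_\alpha) = R(T_\alpha^\dagger)^\perp , \]

and hence $\langle T_\alpha^\dagger e_{i,j}, e_{m,n} \rangle = 0$. Suppose now that $m \ne n$. We have

\[ \langle T_\alpha^\dagger e_{i,j}, e_{m,n} \rangle = \Big\langle T_\alpha^\dagger e_{i,j}, \frac{1}{\alpha} T_\alpha^* e_{m-1,n+1} \Big\rangle = \frac{1}{\alpha} \langle T_\alpha T_\alpha^\dagger e_{i,j}, e_{m-1,n+1} \rangle . \]

Since $j \ne i + 2$, we have $e_{i,j} \in R(T_\alpha)$ and thus $T_\alpha T_\alpha^\dagger e_{i,j} = P_{R(T_\alpha)} e_{i,j} = e_{i,j}$.
This implies $\langle T_\alpha^\dagger e_{i,j}, e_{m,n} \rangle = \frac{1}{\alpha} \langle e_{i,j}, e_{m-1,n+1} \rangle = \frac{1}{\alpha} \langle e_{i+1,j-1}, e_{m,n} \rangle$, and
finally

\[ T_\alpha^\dagger e_{i,j} = \frac{1}{\alpha} (1 - \delta_{i+2,j}) e_{i+1,j-1} = \frac{1}{\alpha^2} T_\alpha^* e_{i,j} . \tag{10} \]

It follows that $T_\alpha^\dagger$ is regular and hence that $T_\alpha$ is bi-regular.

Since $T_\alpha$ is a partial isometry if and only if $T_\alpha^* = T_\alpha^\dagger$, it follows from (10)
that $T_\alpha$ is a partial isometry if and only if $\alpha^2 = 1$, that is, $|\alpha| = 1$.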
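
Formula (10) can be tested against the Penrose equations on basis vectors. The sketch below represents finitely supported vectors as dictionaries keyed by the index pair (i, j); the value α = 3/2, the window size and all function names are illustrative assumptions rather than part of the source.

```python
from fractions import Fraction

ALPHA = Fraction(3, 2)  # hypothetical nonzero real parameter

def T(vec):
    """T_alpha: e_{i,j} -> alpha * (1 - delta_{i,j}) * e_{i-1,j+1}."""
    out = {}
    for (i, j), c in vec.items():
        if i != j:
            key = (i - 1, j + 1)
            out[key] = out.get(key, 0) + ALPHA * c
    return out

def T_star(vec):
    """Adjoint from (9): e_{i,j} -> alpha * (1 - delta_{i+2,j}) * e_{i+1,j-1}."""
    out = {}
    for (i, j), c in vec.items():
        if j != i + 2:
            key = (i + 1, j - 1)
            out[key] = out.get(key, 0) + ALPHA * c
    return out

def T_dagger(vec):
    """Candidate Moore-Penrose inverse from (10): T^dagger = T^* / alpha^2."""
    return {key: c / ALPHA**2 for key, c in T_star(vec).items()}

def basis(i, j):
    return {(i, j): Fraction(1)}

def penrose_holds(window=4):
    """Check T T^dagger T = T and T^dagger T T^dagger = T^dagger on basis vectors,
    and that T T^dagger and T^dagger T act as 0/1 diagonal maps (hence are
    orthogonal projections) on the tested window."""
    for i in range(-window, window + 1):
        for j in range(-window, window + 1):
            e = basis(i, j)
            if T(T_dagger(T(e))) != T(e):
                return False
            if T_dagger(T(T_dagger(e))) != T_dagger(e):
                return False
            if T(T_dagger(e)) not in ({}, e):
                return False
            if T_dagger(T(e)) not in ({}, e):
                return False
    return True
```

Because each of the three maps sends a basis vector to a multiple of a single basis vector, checking the identities on basis vectors in a window is a reasonable sanity test of (10), although it does not replace the argument above.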
We start with some elementary properties of bi-regular operators.
Proposition 4.2 Let T be regular. We have
(i) T is bi-regular ⇐⇒ T † is bi-regular ⇐⇒ T ∗ is bi-regular.
(ii) If T is bi-regular, then T n is bi-regular for every n ≥ 2,
(iii) If T1 and T2 are bi-regular, then T1 ⊕ T2 is bi-regular,
(iv) If T is bi-regular and E is a reducing subspace for T , then T|E is bi-regular.
We provide next a sufficient condition for an operator to be bi-regular.
Theorem 4.3 Let T ∈ L(H) be regular. If R ∞ (T ∗ ) is T -invariant, then

(i) R ∞ (T ∗ ) ⊆ C(T † ). In particular, R ∞ (T ∗ ) ⊆ R ∞ (T † ).

(ii) T is bi-regular.
Proof
(i) Since T is regular, T ∗ is also regular and by Proposition 2.7 we have C(T ∗ ) =
R ∞ (T ∗ ). For x ∈ R ∞ (T ∗ ), and vn := T n x ∈ T n R ∞ (T ∗ ) ⊆ R ∞ (T ∗ ) ⊆
R(T ∗ ), we have x = v0 and T † vn+1 = T † T n+1 x = T † T vn = vn . Thus
x ∈ C(T † ) and hence R ∞ (T ∗ ) ⊆ C(T † ).
(ii) Since T ∗ is regular, R(T † ) = R(T ∗ ) is closed. Since moreover
N(T † ) = N(T ∗ ) ⊆ R ∞ (T ∗ ) ⊆ C(T † ) ⊆ R ∞ (T † ), we derive that T † is
regular.

Remark 4.4 Under the assumptions of Theorem 4.3, since T † is regular, we have
C(T † ) = R ∞ (T † ).
Theorem 4.3 implies the following corollary.
Corollary 4.5 Let T ∈ L(H) be regular. If R ∞ (T ) or R ∞ (T ∗ ) is reducing for T ,
then T is bi-regular.

4.2 The Cauchy Dual of a Closed Range Operator

The Cauchy dual of an operator T ∈ L(H) with closed range is introduced in [10]
by the next formula

ω(T ) := T †∗ .

This definition extends the case where T is left invertible, in which T † =
(T ∗ T )−1 T ∗ and ω(T ) = T †∗ = T (T ∗ T )−1 . The Cauchy dual of a left invertible
operator is introduced in [24] as a powerful tool in the model theory of left-invertible
operators. The reader is referred for instance to [5, 19, 24] for more information.
In the following, we extend the result given in [24, Proposition 2.10].
Corollary 4.6 Let T ∈ L(H) be a bi-regular operator. If R ∞ (ω(T )) is T † -invariant,
then

R ∞ (ω(T )) ⊆ R ∞ (T ).

Proof Since T † is regular and R ∞ ((T † )∗ ) = R ∞ (ω(T )) is T † -invariant, by
Theorem 4.3 we have R ∞ (ω(T )) ⊆ R ∞ (T ).

We also have the next extension of Olofsson’s result, which will allow us to transfer the Wold
decomposition to Cauchy duals.
Proposition 4.7 Let T ∈ L(H) be with closed range, ω(T ) its Cauchy dual and let
c be a nonnegative constant. Then the following statements are equivalent:
(i)

\[ \|\omega(T)^2 x\|^2 - \|\omega(T) x\|^2 \le c\, (\|\omega(T) x\|^2 - \|T^\dagger T x\|^2), \quad \forall x \in H. \tag{11} \]

(ii)

\[ \|Q(Tx + T^\dagger T y)\|^2 \le \Big( 1 + \frac{1}{c} \Big) (\|x\|^2 + c\,\|Ty\|^2), \quad \forall x, y \in H, \tag{12} \]

where Q is the orthogonal projection of H onto R(T ). In particular, if T satisfies (1),
then T is bounded below and ω(T ) is concave.

4.3 Extended Wold-Type Decomposition

Definition 4.8 We shall say that an operator T ∈ L(H) admits the extended Wold-
type decomposition if
(i) R ∞ (T ) is closed and reduces T ,
(ii) T|R ∞ (T )∩N(T )⊥ : R ∞ (T ) ∩ N(T )⊥ −→ R ∞ (T ) is unitary,
(iii) H = [E]T ⊕ R ∞ (T ).
Notice in passing that if T satisfies the extended Wold-type decomposition and
if T is one to one or T|R ∞ (T ) is unitary, then T admits the classical Wold type
decomposition property investigated in [24].
For example, for a regular operator T and under the assumptions of Theorem 3.8,
by Theorem 3.9, T admits the extended Wold-type decomposition.
Remark 4.9
(i) Corollary 3.5 implies that a regular operator T is pure if and only if ω(T ) has
the wandering subspace property.
(ii) If T is an operator with closed range, then R(ω(T )) is closed. So, since
ω(ω(T )) = T and from (i) in Corollary 3.5 we have

R ∞ (T )⊥ ⊆ [E]ω(T ) .

The previous inclusion is strict in general as the following example shows.

Example Consider on C3 , endowed with an orthonormal basis {e1 , e2 , e3 }, the
operator defined by T e1 = T e2 = 0 and T e3 = e1 . It is immediate that T T ∗ = PCe1
and T ∗ T = PCe3 . It follows that T † = T ∗ and hence that T = ω(T ). Now T 2 = 0
implies R ∞ (ω(T )) = {0}, and we obtain

R ∞ (ω(T ))⊥ = C3 = [E]T = C{e2 , e3 }.

It is not clear whether the inclusion in Corollary 3.5, (i) can be replaced by the
equality if T is an arbitrary regular operator. However, in the following example, we
show that the equality holds if T is a regular weighted shift on l 2 (Z).
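Example Let H be a Hilbert space endowed with an orthonormal basis $(e_n)_{n \in \mathbb{Z}}$ and
let $\alpha = (\alpha_n)_{n \in \mathbb{Z}}$ be a bounded sequence. The weighted shift $S_\alpha$ on H associated with
$\alpha$ is the bounded linear operator $S_\alpha : H \to H$ defined by $S_\alpha e_n = \alpha_n e_{n+1}$. It is
proved in [8] that if $S_\alpha$ is regular, then there exists at most one $n_0 \in \mathbb{Z}$ such that $\alpha_{n_0} = 0$.
Let $S_\alpha$ be a regular bilateral weighted shift such that $\alpha_0 = 0$. It is easy to see that

\[ (S_\alpha^\dagger)^n e_k = \begin{cases} \big( \prod_{i=1}^{n} \alpha_{k-i} \big)^{-1} e_{k-n} , & \text{if } k \notin \{1, \dots, n\}; \\ 0 , & \text{if } k \in \{1, \dots, n\}. \end{cases} \]

We derive that $N((S_\alpha^\dagger)^n) = \bigvee_{1 \le j \le n} \{ e_j \}$ for every $n \ge 1$, and in particular, we
have

\[ E = H \ominus S_\alpha H = N(S_\alpha^\dagger) = \operatorname{span} \{ e_1 \} . \]

For $n \ge 1$ and $0 \le i \le n - 1$, we have

\[ (S_\alpha^\dagger)^n S_\alpha^i e_1 = \Big( \prod_{j=1}^{i} \alpha_j \Big) (S_\alpha^\dagger)^n e_{i+1} . \]

Since $i + 1 \in \{1, \dots, n\}$, we get $(S_\alpha^\dagger)^n S_\alpha^i e_1 = 0$, and thus $S_\alpha^i e_1 \in N((S_\alpha^\dagger)^n)$ for
every $0 \le i \le n - 1$. Hence $\bigvee_{i=0}^{n-1} \{ S_\alpha^i e_1 \} \subset N((S_\alpha^\dagger)^n)$. On the other hand, from
[8, Proposition 8], we have $N((S_\alpha^\dagger)^n) \subset \bigvee_{i=0}^{n-1} \{ S_\alpha^i e_1 \}$, and thus

\[ N((S_\alpha^\dagger)^n) = \bigvee_{i=0}^{n-1} \{ S_\alpha^i e_1 \} . \]

Therefore

\[ \bigvee_{n \ge 0} N((S_\alpha^\dagger)^n) = [e_1]_{S_\alpha} . \]

It is easy to check that $\bigvee_{j \le 0} \{ e_j \} = R^\infty(S_\alpha) = R^\infty(\omega(S_\alpha))$ and that $\bigvee_{j \ge 1} \{ e_j \} =
[e_1]_{S_\alpha} = R^\infty(S_\alpha)^\perp$.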
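
The closed form for the powers of the Moore-Penrose inverse of the shift in this example can be checked on finitely supported vectors. The sketch below assumes a sample weight sequence with α_0 = 0 and α_n = 1 + 1/|n| for n ≠ 0; this particular choice of weights, like the function names, is hypothetical and only serves the illustration.

```python
from fractions import Fraction

def alpha(n):
    """Hypothetical weights: alpha_0 = 0 and alpha_n = 1 + 1/|n| for n != 0."""
    return Fraction(0) if n == 0 else 1 + Fraction(1, abs(n))

def shift(vec):
    """S_alpha e_n = alpha_n e_{n+1}, applied to a dict {index: coefficient}."""
    return {n + 1: alpha(n) * c for n, c in vec.items() if alpha(n) != 0}

def shift_dagger(vec):
    """S_alpha^dagger e_k = e_{k-1} / alpha_{k-1} for k != 1, and 0 for k = 1."""
    return {k - 1: c / alpha(k - 1) for k, c in vec.items() if k != 1}

def power(op, n, vec):
    """Apply the operator op to vec n times."""
    for _ in range(n):
        vec = op(vec)
    return vec

def closed_form(n, k):
    """Right-hand side of the displayed formula for (S_alpha^dagger)^n e_k."""
    if 1 <= k <= n:
        return {}
    prod = Fraction(1)
    for i in range(1, n + 1):
        prod *= alpha(k - i)
    return {k - n: 1 / prod}
```

Iterating `shift_dagger` on a basis vector and comparing with `closed_form` reproduces the case distinction k ∈ {1, …, n} versus k ∉ {1, …, n}, and applying `power(shift_dagger, n, ...)` to `power(shift, i, {1: Fraction(1)})` for i < n returns the zero vector, in line with the inclusion of the vectors S_α^i e_1 in N((S_α†)^n).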
Proposition 4.10 If T ∈ L(H) is a bi-regular operator, then

H = [E]T ⊕ R ∞ (ω(T )) = [E]ω(T ) ⊕ R ∞ (T ). (13)

Proof Since T † is regular, ω(T ) = T †∗ is also regular. Now, by substituting
ω(T ) for T in Corollary 3.5 and using the identities E = R(ω(T ))⊥ and ω(ω(T )) =
T , we get the desired result.

Proposition 4.10 extends the following duality result given by S. Shimorin in [24].
Corollary 4.11 ([24, Proposition 2.7]) Let T be a left-invertible operator and let
L be its left inverse defined by L = (T ∗ T )−1 T ∗ . Then

H = [E]T ⊕ R ∞ (L∗ ) = [E]L∗ ⊕ R ∞ (T ). (14)

Corollary 4.5 and Proposition 4.10 imply the following results.

Corollary 4.12 Let T ∈ L(H) be a regular operator such that R ∞ (T ) reduces T .
Then

H = [E]T ⊕ R ∞ (ω(T )) = [E]ω(T ) ⊕ R ∞ (T ).

Corollary 4.13 Let T ∈ L(H) be regular such that R ∞ (T ) reduces T and
R ∞ (ω(T )) reduces ω(T ). Then

[E]T = [E]ω(T ) and R ∞ (T ) = R ∞ (ω(T )).

In particular,

H = [E]T ⊕ R ∞ (T ).

Proof Since R ∞ (T ) reduces T , the operator T is bi-regular. The result follows from
Corollaries 4.6 and 4.12.

Remark 4.14

1. We notice that if T is a regular operator such that $T^*$ and $T^\dagger$ are equal on $R^\infty(T)$,
then by Proposition 2.7, $R^\infty(T)$ reduces T and then T will be bi-regular. We
see easily that the equality $T^\dagger|_{R^\infty(T)} = T^*|_{R^\infty(T)}$ is equivalent to each one of the
following.
(i) $T T^* P_{R^\infty(T)} = P_{R^\infty(T)}$, that is, $T^*|_{R^\infty(T)}$ is isometric;
(ii) $P_{R^\infty(T)} T^* T = P_{R^\infty(T)} T^\dagger T$;
(iii) $T^* P_{R^\infty(T)} = T^\dagger P_{R^\infty(T)}$;
(iv) $T^* T P_{R^\infty(T)} = T^\dagger T P_{R^\infty(T)}$;
(v) $T|_{R^\infty(T) \cap N(T)^\perp} : R^\infty(T) \cap N(T)^\perp \to R^\infty(T)$ is unitary.

2. Under any of conditions (i)–(v), we have

\[ (T|_{R^\infty(T)})^\dagger = T^\dagger|_{R^\infty(T)} = T^*|_{R^\infty(T)} = (T|_{R^\infty(T)})^* . \]

In particular, $T|_{R^\infty(T)}$ is a surjective partial isometry.

Moreover, we have the following result.
Proposition 4.15 Let T ∈ L(H) be regular. If one of the conditions (i)–(v) in
Remark 4.14 is fulfilled, then

(i) T is bi-regular;
(ii) R ∞ (T † ) = R ∞ (T ∗ ).
Proof
(i) Follows from Remark 4.14.
(ii) Since T is bi-regular, R ∞ (T † ) and R ∞ (T ∗ ) are closed. By Proposition 4.10
and since ω(T ∗ ) = T † , we have R ∞ (T † )⊥ = [E † ]T ∗ and
R ∞ (T ∗ )⊥ = [E † ]T † . On the other hand, because E † = N(T ) ⊆ R ∞ (T ),
we get T ∗n x = T †n x for every x ∈ E † and for every n ≥ 0. Then we have
[E † ]T † = [E † ]T ∗ , which proves that R ∞ (T † ) = R ∞ (T ∗ ).


Since T is regular if and only if its adjoint T ∗ is regular, then it is clear from
Proposition 4.15 that if T is regular and one of the conditions (i)–(v) in Remark 4.14
is satisfied for T and for T ∗ , then R ∞ (ω(T )) = R ∞ (T ). So, by Proposition 4.10,
we get the following result.
Corollary 4.16 Let T ∈ L(H) be regular. If one of the conditions (i)–(v) in
Remark 4.14 is fulfilled for T and for T ∗ , then T and T ∗ admit the extended Wold-
type decomposition.
The duality between T and ω(T ) is reflected in terms of extended Wold-type
decomposition as follows.
Proposition 4.17 Let T ∈ L(H) be bi-regular. Then T admits the extended Wold-
type decomposition if and only if ω(T ) admits it. In this case, we have

R ∞ (T ) = R ∞ (ω(T )) and [E]T = [E]ω(T ) .

Proof Suppose that T admits the extended Wold-type decomposition. Then we
have H = [E]T ⊕ R ∞ (T ), and from Proposition 4.10 we get

R ∞ (T ) = R ∞ (ω(T )) and [E]T = [E]ω(T ) .

On the other hand, since ω(T ) is regular, then

ω(T )(R ∞ (ω(T ))) = R ∞ (ω(T )).

Let x ∈ R ∞ (ω(T )). We have

ω(T )∗ x = T † x ∈ T † R ∞ (T ) ⊆ R ∞ (T ) = R ∞ (ω(T )).

So, R ∞ (ω(T )) reduces ω(T ).

It remains to show that

ω(T )|R ∞ (T )∩N(T )⊥ : R ∞ (T ) ∩ N(T )⊥ −→ R ∞ (T )

is unitary. Let x ∈ R ∞ (ω(T )) ∩ N(ω(T ))⊥ = R ∞ (T ) ∩ N(T )⊥ . Since by
assumption T|R ∞ (T )∩N(T )⊥ : R ∞ (T ) ∩ N(T )⊥ −→ R ∞ (T ) is unitary, we get
T T ∗ x = T ∗ T x = x and thus,

T ∗ x = T † x.

Now, from the identities $T^* T = \omega(T)^\dagger T$ and $\omega(T) \omega(T)^\dagger T = T$, it is clear that
$T^* T x = x$ implies

\[ \omega(T) x = \omega(T) \omega(T)^\dagger T x = T x . \]

It follows that for every $x \in R^\infty(\omega(T)) \cap N(\omega(T))^\perp$ we have

\[ \|\omega(T) x\| = \|T x\| = \|x\| . \]

On the other hand, by Remark 4.14, we have $T^* x = T^\dagger x$ for every $x \in R^\infty(T)$.
Thus,

\[ \|\omega(T)^* x\| = \|T^\dagger x\| = \|T^* x\| = \|x\| , \quad \forall x \in R^\infty(T). \]

So, ω(T )|R ∞ (T )∩N(T )⊥ : R ∞ (T ) ∩ N(T )⊥ −→ R ∞ (T ) is unitary.

The converse follows by the duality ω(ω(T )) = T .

Remark 4.15 In the case of left invertible operators, we retrieve Corollary 2.9 in
[24], from the previous proposition.
We provide for bi-regular operators the next duality properties in the line of the
ones given by D. Sutton for left invertible operators in [26].
Theorem 4.18 Let T ∈ L(H) be a bi-regular operator. The following statements
are equivalent:
(i) H ≠ [E]T ;
(ii) R ∞ (ω(T )) ≠ {0};
(iii) there exists a closed subspace M ≠ {0} such that M ⊆ ω(T )M;
(iv) there exists a closed subspace M ⊆ R(T ), M ≠ {0}, such that T ∗ M ⊆ M;
(v) there exists a closed subspace M ⊇ E, M ≠ H, such that T M ⊆ M;
(vi) there exists a closed subspace M ⊇ E, M ≠ H, such that M ⊆ T † M;
(vii) there exist closed subspaces A, B ≠ {0} such that T A ⊆ A, T † A ⊆ A ⊕
E, PB T B = B and R(T ) = A ⊕ B, where PB is the orthogonal projection
onto B.
Proof (i)⇐⇒(ii): This equivalence follows from Proposition 4.10.
(ii)⟹(iii): Let M = R ∞ (ω(T )). By assumption we have M ≠ {0}. Since
T † is regular, ω(T ) = T †∗ is also regular and then R ∞ (ω(T )) is closed. Moreover,
by Proposition 2.7, we have ω(T )R ∞ (ω(T )) = R ∞ (ω(T )). Thus the condition
M ⊆ ω(T )M is satisfied.
(ii)⟹(iv): Let M = R ∞ (ω(T )) ≠ {0}. Since R ∞ (ω(T )) ⊆ R(ω(T ))
and R(ω(T )) = R(T ), we have M ⊆ R(T ). Now, since ω(T ) is regular and
T ∗ is a generalized inverse of ω(T ), by Proposition 2.7, we get T ∗ R ∞ (ω(T )) ⊆
R ∞ (ω(T )). That is, T ∗ M ⊆ M.
(i)⟹(v): Let M = [E]T . Then E ⊆ M ≠ H. Moreover, from the definition
of [E]T , it is clear that T M ⊆ M.
(i)⟹(vi): Again, let M = [E]T . Then E ⊆ M ≠ H. Since ω(T ) is regular,
we have R ∞ (ω(T ))⊥ ⊆ N(ω(T ))⊥ = R(T † ). But by Proposition 4.10, we have
R ∞ (ω(T ))⊥ = [E]T , thus M ⊆ R(T † ). Since T M ⊆ M and T † T = PR(T † ) we
conclude that M = T † T M ⊆ T † M.
(i)⟹(vii): Let $A = \bigvee_{i=1}^{\infty} T^i E$ and $B = R^\infty(\omega(T))$. It follows from (i) and
(ii) that $A, B \ne \{0\}$, and by Proposition 4.10 we have $A \perp B$. Since $R(T)$ is closed
and $E \perp T^i E$ for all $i \ge 1$, by Proposition 4.10 again we obtain

\[ R(T) = E^\perp = \big( [E]_T \oplus R^\infty(\omega(T)) \big) \ominus E = \bigvee_{i=1}^{\infty} T^i E \oplus R^\infty(\omega(T)) = A \oplus B . \]

Since T is continuous, $TA = T \bigvee_{i=1}^{\infty} T^i E \subseteq \bigvee_{i=2}^{\infty} T^i E \subseteq \bigvee_{i=1}^{\infty} T^i E = A$. On
the other hand, $E \perp T^i E$ for all $i \ge 1$, thus $A \oplus E = \bigvee_{i=1}^{\infty} T^i E \oplus E = [E]_T$. So, by
Proposition 4.10 we have $(A \oplus E)^\perp = R^\infty(\omega(T))$. But $\omega(T)$ is regular, so by
Proposition 2.7 we obtain $T^\dagger R^\infty(\omega(T))^\perp \subseteq R^\infty(\omega(T))^\perp$. Equivalently, $T^\dagger (A \oplus E) \subseteq A \oplus E$,
which implies that $T^\dagger A \subseteq A \oplus E$. It remains to show the equality
$P_B T B = B$. Since $P_B T B \subseteq B$, it suffices to show that $B \subseteq P_B T B$. Let $x \in B =
R^\infty(\omega(T)) \subseteq R(T)$ and write $x = Ty$ for some $y \in H = [E]_T \oplus R^\infty(\omega(T))$ with
$y = y_1 + y_2$, where $y_1 \in [E]_T$ and $y_2 \in R^\infty(\omega(T)) = B$. Since $T y_1 \in [E]_T \perp B$,
we get $x = P_B x = P_B T y_2 \in P_B T B$. Since $x \in B$ is arbitrary, it follows that
$P_B T B = B$.
(iii)⟹(ii): Applying $\omega(T)^j$ to the relation $M \subseteq \omega(T) M$, we obtain
$M \subseteq \omega(T)^j M$ for every $j \ge 0$, and hence $M \subseteq \bigcap_{j=0}^{\infty} \omega(T)^j M$. Since
$\bigcap_{j=0}^{\infty} \omega(T)^j M \subseteq R^\infty(\omega(T))$, we get $M \subseteq R^\infty(\omega(T))$. Now $M \ne \{0\}$ implies
that $R^\infty(\omega(T)) \ne \{0\}$.
(iv)⟹(iii): We recall that $\omega(T) T^* = (T T^\dagger)^* = T T^\dagger$ is the orthogonal
projection onto $R(T)$. Thus, since $M \subseteq R(T)$ we have $\omega(T) T^* M = M$. By
applying $\omega(T)$ to the relation $T^* M \subseteq M$ we get $M = \omega(T) T^* M \subseteq \omega(T) M$.
By (iv) we have $M \ne \{0\}$.
(v)⟹(iv): Suppose that there exists a closed subspace M satisfying the required
properties of (v). Since E ⊆ M ≠ H, we have {0} ≠ M⊥ ⊆ R(T ), and from
T M ⊆ M we get T ∗ M⊥ ⊆ M⊥ . In particular, (iv) is satisfied with the closed
subspace M⊥ .
(vi)⟹(iii): Since E ⊆ M ≠ H, we have {0} ≠ M⊥ ⊆ R(T ). We shall
prove that M⊥ ⊆ ω(T )M⊥ . To this aim, let x ∈ M⊥ ⊆ R(T ) = R(ω(T )).
There is y ∈ H such that x = ω(T )y. Let y = m1 + m2 with m1 ∈ M and
m2 ∈ M⊥ , and suppose that m1 ≠ 0. Then for every m ∈ M we have 0 =
⟨x, m⟩ = ⟨ω(T )(m1 + m2 ), m⟩, which implies that ⟨m1 + m2 , T † m⟩ = 0 for every
m ∈ M. Then m1 + m2 ∈ (T † M)⊥ ⊆ M⊥ because by assumption we have
M ⊆ T † M. But ⟨m1 + m2 , m1 ⟩ = ⟨m1 , m1 ⟩ = ‖m1 ‖² ≠ 0, which contradicts the
fact that m1 + m2 ∈ M⊥ . We derive that m1 = 0 and hence x = ω(T )m2 . Since
x ∈ M⊥ is arbitrary, we get that M⊥ ⊆ ω(T )M⊥ . Therefore (iii) is satisfied with
the closed subspace M⊥ .
(vii)⟹(v): Suppose that there exist closed subspaces A, B ≠ {0} such that
T A ⊆ A, T † A ⊆ A ⊕ E, PB T B = B and R(T ) = A ⊕ B. We shall prove first
that T E ⊆ A. Let x ∈ E. Since T x ∈ R(T ), we have T x = a + b where a ∈ A and
b ∈ B = PB T B. Thus b = PB T b1 for some b1 ∈ B. Now, from the orthogonal
decomposition of R(T ), we see that T b1 = a2 + b for some a2 ∈ A. It follows that
T x = a + b = a − a2 + T b1 , and then

T x = a1 + T b1 , (15)

where a1 = a − a2 . Thus, T † T x = T † a1 + T † T b1 . Now, since T † is regular, we
have x ∈ E = N(T † ) ⊆ R(T † ), and so T † T x = x = T † a1 + T † T b1 . On the
other hand, we have T † a1 ∈ T † A ⊆ A ⊕ E = B⊥ . Now, since x ∈ E ⊆ B⊥ , we get
T † T b1 ∈ B⊥ = A ⊕ E. Thus T † T b1 = a3 + y where a3 ∈ A and y ∈ E. By applying
T to this last equality and using the fact that T T † T = T , we get T b1 = T a3 + T y
and thus we have that a3 − b1 + y ∈ N(T ). Now, since T is regular, we have
N(T ) ⊆ R(T ) and hence a3 − b1 + y ∈ R(T ). But a3 − b1 ∈ A ⊕ B = R(T ),
which yields y ∈ R(T ). The fact that y ∈ E = R(T )⊥ implies that y = 0 and
hence that T b1 = T a3 . By (15) we have T x = a1 + T a3 . Since a3 ∈ A and by
assumption T a3 ∈ T A ⊆ A, we get T x ∈ A. But x ∈ E is arbitrary, so we deduce
that T E ⊆ A. Now, because T A ⊆ A, we have T (A ⊕ E) ⊆ A ⊆ A ⊕ E. So, for
M := A ⊕ E, we have T M ⊆ M and E ⊆ M. Since M ⊥ B and B ≠ {0}, we
have M ≠ H. Therefore, (v) is satisfied with the closed subspace M = A ⊕ E.

5 Some Open Questions

Several natural questions arise from this survey.

Problem 5.1 Let T be an operator with closed range. Under which conditions do
we have (T † )n = (T n )† for every n ≥ 1?
An intermediate question is the following: suppose (T † )n = (T n )† for some n ≥ 1. Is it true
that (T † )k = (T k )† for some k < n?
A more general question is the so called reverse order law problem that
investigates the identity (AB)† = B † A† for A, B with closed range. See the survey
[6] for example.
We show first that regular weighted shifts answer this question positively. Let
$H$ be a Hilbert space, $(e_n)_{n\in\mathbb Z}$ be an orthonormal basis of $H$, $(\omega_n)_{n\in\mathbb Z}$ be a bounded
sequence and $S_\omega$ be the weighted shift associated with $(\omega_n)_n$.
It is not difficult to see that $R(S_\omega)$ is closed if and only if

\[ \lim_{n\to+\infty}\, \inf_{k>0}\, (\omega_k \cdots \omega_{k+n-1})^{1/n} > 0 \]

and

\[ \lim_{n\to+\infty}\, \inf_{k\ge 0}\, (\omega_{-k-1} \cdots \omega_{-k-n})^{1/n} > 0. \]
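
If $\omega_0 = 0$ and $\omega_n \neq 0$ for every $n \neq 0$, then $S_\omega$ is regular. Moreover,

\[ S_\omega^\dagger e_k = \begin{cases} (\omega_{k-1})^{-1} e_{k-1} & \text{for } k \neq 1;\\ 0 & \text{for } k = 1. \end{cases} \]

$S_\omega^n$ is the weighted shift of multiplicity $n$ and it is easy to check that

\[ (S_\omega^\dagger)^n = (S_\omega^n)^\dagger. \]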

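On the other hand, the following example disproves the equality in general. For

\[ T = \begin{pmatrix} 1 & -1\\ 0 & 0 \end{pmatrix} \quad\text{we have}\quad T^\dagger = \tfrac12 \begin{pmatrix} 1 & 0\\ -1 & 0 \end{pmatrix} \quad\text{and}\quad (T^\dagger)^2 T^2 = \tfrac12\, T^\dagger T, \]

which is not a projection. Thus $(T^\dagger)^2 \neq (T^2)^\dagger$.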
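
The computation behind this counterexample can be written out as follows. This is only a sketch; it uses the standard formula $T^\dagger = T^*(TT^*)^\dagger$ for the Moore–Penrose inverse, which is an assumption about the intended route and is not stated in the text.

```latex
% Assumption: T^\dagger = T^*(TT^*)^\dagger (standard Moore-Penrose formula).
\begin{align*}
TT^* &= \begin{pmatrix} 1 & -1\\ 0 & 0 \end{pmatrix}\begin{pmatrix} 1 & 0\\ -1 & 0 \end{pmatrix}
      = \begin{pmatrix} 2 & 0\\ 0 & 0 \end{pmatrix},
&
T^\dagger &= \begin{pmatrix} 1 & 0\\ -1 & 0 \end{pmatrix}\begin{pmatrix} \tfrac12 & 0\\ 0 & 0 \end{pmatrix}
      = \tfrac12\begin{pmatrix} 1 & 0\\ -1 & 0 \end{pmatrix},\\
T^2 &= T,
&
(T^\dagger)^2 &= \tfrac14\begin{pmatrix} 1 & 0\\ -1 & 0 \end{pmatrix} = \tfrac12\, T^\dagger,\\
P := T^\dagger T &= \tfrac12\begin{pmatrix} 1 & -1\\ -1 & 1 \end{pmatrix}, \quad P^2 = P,
&
(T^\dagger)^2 T^2 &= \tfrac12 P, \quad \bigl(\tfrac12 P\bigr)^2 = \tfrac14 P \neq \tfrac12 P.
\end{align*}
```

Since $T^2 = T$, one also sees directly that $(T^2)^\dagger = T^\dagger$, while $(T^\dagger)^2 = \tfrac12 T^\dagger$.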
The equality (T n )† = (T † )n may fail even for left invertible operators as shown
by the examples below.
Recall that for T left invertible, we have T † = (T ∗ T )−1 T ∗ . Since T n is also left
invertible, we get T n† = (T n∗ T n )−1 T ∗n and T †n = ((T ∗ T )−1 T ∗ )n .
Problem 5.2 When is the restriction of a bi-regular operator bi-regular?
As for regular operators, the restriction to the kernel is not bi-regular. It is
interesting to determine under which conditions on an invariant subspace $M$ of $T$ the restriction $T|_M$ is
bi-regular. For example, if $T$ is regular and $E$ is an invariant subspace such that
$N^\infty(T) \subset E$, then $T|_E$ is regular. Is this fact true for bi-regular operators?
Problem 5.3 Is every regular operator bi-regular?
It is known that $T$ is regular if and only if $T^*$ is regular. In contrast, we do not know whether $T$ is
regular if and only if $T^\dagger$ is regular. A positive answer is given when $T$ is
regular and $(T^\dagger)^n = (T^n)^\dagger$ for every $n \ge 1$. In particular, regular weighted shifts
are bi-regular.

References

1. P. Aiena, Fredholm and Local Spectral Theory with Applications to Multipliers (Kluwer,
Dordrecht, 2004)
2. A. Aleman, S. Richter, C. Sundberg, Beurling’s theorem for the Bergman space. Acta Math.
177(2), 275–310 (1996)
3. C. Badea, M. Mbekhta, Operators similar to partial isometries. Acta Sci. 71, 663–680 (2005)
4. A. Beurling, On two problems concerning linear transformations in Hilbert space. Acta Math.
81, 239–255 (1949)
5. S. Chavan, On operators Cauchy dual to 2-hyperexpansive operators. Proc. Edinburgh Math.
Soc. 50, 637–652 (2007)
6. N.C. Dincic, D.S. Djordjevic, Basic reverse order law and its equivalencies. Aequationes Math.
85(3), 505–517 (2013)
7. H. Ezzahraoui, M. Mbekhta, E.H. Zerouali, Wold-type decomposition for some regular
operators. J. Math. Anal. Appl. 430, 483–499 (2015)
8. H. Ezzahraoui, M. Mbekhta, E.H. Zerouali, Operator inequalities related to Beurling-type
theorem. Int. J. Funct. Anal. Oper. Theory Appl. 7, 107–119 (2015)
9. H. Ezzahraoui, M. Mbekhta, A. Salhi, E.H. Zerouali, A note on roots and powers of partial
isometries. Arch. Math. 110, 251–259 (2018)
10. H. Ezzahraoui, M. Mbekhta, E.H. Zerouali, On the Cauchy Dual of closed range operators.
Acta Sci. Math. 85, 231–248 (2019)
11. H. Ezzahraoui, M. Mbekhta, E.H. Zerouali, Wold-type decomposition for bi-regular operators.
Acta Sci. Math. 87, 463–483 (2021)
12. P.R. Halmos, Shifts on Hilbert spaces. J. Reine Angew. Math. 208, 102–112 (1961)
13. K.H. Izuchi, Cyclicity of reproducing kernels in weighted Hardy spaces over the bidisc. J.
Funct. Anal. 272, 546–558 (2017)
14. K.J. Izuchi, K.H. Izuchi, Y. Izuchi, Wandering subspaces and the Beurling type theorem I.
Arch. Math. 95, 439–446 (2010)
15. M. Mbekhta, Généralisation de la décomposition de Kato aux opérateurs paranormaux et
spectraux. Glasgow Math. J. 29, 159–175 (1987)
16. M. Mbekhta, Résolvant généralisé et théorie spectrale. J. Oper. Theory. 21, 69–105 (1989)
17. M. Mbekhta, On the generalized resolvent in Banach spaces. J. Math. Anal. Appl. 189, 362–
377 (1995)
18. M. Mbekhta, Partial isometries and generalized inverses. Acta Sci. Math. Szeged 70, 767–781
(2004)
19. A. Olofsson, Wandering subspace theorems. Integr. Equ. Oper. Theory 51, 395–409 (2005)
20. S. Richter, Invariant subspaces of the Dirichlet shift. J. Reine Angew. Math. 386, 205–220
(1988)
21. W. Rudin, Function Theory in Polydiscs (W.A. Benjamin, New York, 1969)
22. P. Saphar, Contribution à l’étude des applications linéaires dans un espace de Banach. Bull.
Soc. Math. 92, 363–384 (1964)
23. A.L. Shields, Weighted shift operators and analytic function theory. Topics in operator theory,
pp. 49–128. Math. Surveys, No. 13, Am. Math. Soc., Providence, R.I., 1974.
24. S. Shimorin, Wold-type decompositions and wandering subspaces for operators close to
isometries. J. Reine Angew. Math. 531, 147–189 (2001)
25. S. Sun, D. Zheng, Beurling type theorem on the Bergman space via the Hardy space of the
bidisk. Sci. China. Ser. 52, 2517–2529 (2009)
26. D.J. Sutton, Structure of invariant subspaces for left-invertible operators on Hilbert space. Ph.D.
Thesis, Faculty of the Virginia Polytechnic Institute and State University, Blacksburg, Virginia
(2010)
(Asymmetric) Dual Truncated Toeplitz
Operators

M. Cristina Câmara, Kamila Kliś-Garlicka, and Marek Ptak

Abstract Multiplication operators on the space L2 (T) on the unit circle T with
Lebesgue measure are classical operators. So are Toeplitz operators on the Hardy
space $H^2 \subset L^2(\mathbb T)$. Sarason’s paper (Oper. Matrices 1:491–526, 2007) started
investigations of truncated Toeplitz operators (TTO), i.e., compressions of these
multiplication operators to model spaces. If operators act between two different
model spaces they are called asymmetric truncated Toeplitz operators (ATTO).
Naturally the compressions of multiplication operators between orthogonal comple-
ments of model spaces can be investigated. They are called dual truncated Toeplitz
operators (DTTO), or asymmetric dual truncated Toeplitz operators (ADTTO) if
orthogonal complements to different model spaces are considered. In this chapter
the properties of ADTTO are presented.

Keywords Model space · Multiplication operator · Dual truncated Toeplitz
operator · Conjugation · Intertwining property · Commutativity of operators

M. C. Câmara
Center for Mathematical Analysis, Geometry and Dynamical Systems Mathematics Department,
Instituto Superior Técnico, Universidade de Lisboa Av. Rovisco Pais, Lisboa, Portugal
e-mail: [email protected]
K. Kliś-Garlicka · M. Ptak
Department of Applied Mathematics, University of Agriculture, Kraków, Poland
e-mail: [email protected]; [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
R. M. Aron et al. (eds.), Operator and Norm Inequalities and Related Topics,
Trends in Mathematics, https://doi.org/10.1007/978-3-031-02104-6_13

1 Motivation and Basic Notations

TTO and ATTO are natural generalizations of Toeplitz matrices which appear in
many contexts, such as in the study of finite-interval convolution equations, signal
processing, control theory, probability and diffraction problems [10, 11, 19]. Model
spaces, which provide the natural setting for TTO and ATTO, have generated
enormous interest and they are relevant in connection with a variety of topics
such as the Schrödinger operator, classical extremal problems in control theory,
Hankel operators and Toeplitz matrices (see for instance [12] and [10]). Natural
conjugations, which model spaces and the whole L2 (T) possess (see [3]), make
model spaces even more natural in the context of physics [11]. Their orthogonal
complements in L2 (T) also appear in numerous applications. In the equivalent
setting of the real line [9, 14], using time and frequency as the natural variables,
and taking the inner function θ = θλ with θλ = exp(iλξ ) for ξ ∈ R, they appear
via the Fourier transform, for instance, as high frequency signals, which are of
decisive importance in electronics, or as outputs of high–pass filters. DTTO and
ADTTO, acting on these spaces have realizations, for example, in long distance
communication links with several regenerators along the path that cancel low–
frequency noise using high–pass filters, or in the description of wave propagation in
the presence of finite–length obstacles. The chapter is mostly based on the papers
[4–7, 18].
Let $\mathbb D = \{z \in \mathbb C : |z| < 1\}$ be the unit disk and $\mathbb T$ the unit circle. Denote by
$L^2(\mathbb T) := L^2(\partial\mathbb D)$ the space of measurable and square integrable functions on $\mathbb T$
with respect to the normalized Lebesgue measure, by $H^2$ the classical Hardy
space, and let $H^2_- = L^2(\mathbb T) \ominus H^2$. Let $P^+$ be the orthogonal projection from $L^2(\mathbb T)$
onto $H^2$ and $P^- = I_{L^2(\mathbb T)} - P^+$.
Let $\varphi \in L^\infty(\mathbb T)$. Recall that the multiplication operator on $L^2(\mathbb T)$ is defined as
$M_\varphi f = \varphi f$ for $f \in L^2(\mathbb T)$. The Toeplitz operator is defined by $T_\varphi f = P^+(\varphi f)$,
and the operator $H_\varphi f = P^-(\varphi f)$, for $f \in H^2$, is called the Hankel operator (with
symbol $\varphi$). The space of all Toeplitz operators is denoted by $\mathcal T(H^2)$ and the space of
all Hankel operators is denoted by $\mathcal H(H^2, H^2_-)$.
Recall that the operator $J : L^2(\mathbb T) \to L^2(\mathbb T)$, defined by $Jf(z) = \bar z\,\overline{f(z)}$,
$z \in \mathbb T$, is an antilinear involution. Moreover, $J^{-1} = J = J^\sharp$ (by $\sharp$ we denote the
antilinear adjoint) and $J(H^2) = H^2_-$, $J(H^2_-) = H^2$. More properties of antilinear
operators can be found for example in [17]. Let $\theta$ be an inner function. Note that
the multiplication operator $M_\theta$ maps $H^2$ bijectively onto $\theta H^2$ and $M_\theta^{-1} = M_{\bar\theta}$.
Moreover, each of the operators $J$, $M_\theta$ and $M_{\bar\theta}$ preserves $L^\infty(\mathbb T)$.
The following properties can be easily verified.
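Proposition 1.1 Let $\theta$ be an inner function. Then
(a) $\langle f_1, f_2\rangle = \langle \theta f_1, \theta f_2\rangle = \langle Jf_2, Jf_1\rangle$ for $f_1, f_2 \in L^2(\mathbb T)$;
(b) $P_{\theta H^2} = M_\theta P^+ M_{\bar\theta}$;
(c) $P^- = J P^+ J$;
(d) $M_\theta (f_1 \otimes f_2) M_{\bar\theta} = \theta f_1 \otimes \theta f_2$ for $f_1, f_2 \in L^2(\mathbb T)$;
(e) $J (f_1 \otimes f_2) J = Jf_1 \otimes Jf_2$ for $f_1, f_2 \in L^2(\mathbb T)$;
(f) $M_\theta J M_\theta = J$;
(g) $J M_\varphi = M_{\bar\varphi} J$ for $\varphi \in L^\infty(\mathbb T)$.
In particular $J1 = \bar z$ and $J(1 \otimes 1)J = \bar z \otimes \bar z$. Here, for $f_1, f_2 \in L^2(\mathbb T)$, $f_1 \otimes f_2$
denotes the operator defined on $L^2(\mathbb T)$ by $(f_1 \otimes f_2)(f) = \langle f, f_2\rangle f_1$.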
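
As an illustration, items (f) and (g) can be checked pointwise. The sketch below assumes that $J$ acts by $Jf(z) = \bar z\,\overline{f(z)}$ and uses only that $|\theta| = 1$ almost everywhere on $\mathbb T$.

```latex
% Assumptions: Jf(z) = \bar z\,\overline{f(z)} and |\theta(z)| = 1 for a.e. z \in \mathbb{T}.
\begin{align*}
(M_\theta J M_\theta f)(z) &= \theta(z)\,\bar z\,\overline{\theta(z) f(z)}
   = |\theta(z)|^2\,\bar z\,\overline{f(z)} = (Jf)(z),\\
(J M_\varphi f)(z) &= \bar z\,\overline{\varphi(z) f(z)}
   = \overline{\varphi(z)}\,\bar z\,\overline{f(z)} = (M_{\bar\varphi} J f)(z).
\end{align*}
```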
For a nonconstant inner function $\theta$ denote by $K_\theta$ the model space defined
as the orthogonal complement of $\theta H^2$ in $H^2$, i.e., $K_\theta = H^2 \ominus \theta H^2$. Let
$P_\theta : L^2(\mathbb T) \to K_\theta$ be the orthogonal projection and let $P_\theta^\perp = I_{L^2(\mathbb T)} - P_\theta$ be the
orthogonal projection from $L^2(\mathbb T)$ onto $(K_\theta)^\perp$. Hence we have the following natural
decompositions:

\[ H^2 = K_\theta \oplus \theta H^2 \quad\text{and}\quad L^2(\mathbb T) = K_\theta \oplus (K_\theta)^\perp = K_\theta \oplus \theta H^2 \oplus H^2_-. \]

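There is a natural conjugation (an antiunitary involution) connected with a model
space (see for instance [3, 11]). For an inner function $\theta$ define $C_\theta : L^2(\mathbb T) \to L^2(\mathbb T)$
by

\[ C_\theta f(z) = \theta(z)\,\bar z\,\overline{f(z)}, \quad |z| = 1. \tag{1} \]

Then $C_\theta$ is an antilinear isometric involution on $L^2(\mathbb T)$, which implies that
$\langle C_\theta f, C_\theta g\rangle = \langle g, f\rangle$ for $f, g \in L^2(\mathbb T)$. One can easily verify that

\[ C_\theta M_\varphi C_\theta = M_{\bar\varphi}. \tag{2} \]

It is well known [11] that $C_\theta$ preserves $K_\theta$. Moreover, $C_\theta(\theta H^2) = H^2_-$ and
$C_\theta(H^2_-) = \theta H^2$, so $C_\theta$ also preserves $(K_\theta)^\perp$. Hence,

\[ C_\theta = \begin{pmatrix} C_\theta|_{K_\theta} & 0\\ 0 & C_\theta|_{(K_\theta)^\perp} \end{pmatrix}. \]

Recall that $k_w = \frac{1}{1 - \bar w z}$ is a reproducing kernel for all functions $f \in H^2$, i.e.,
$f(w) = \langle f, k_w\rangle$ for $w \in \mathbb D$. Let $\theta$ be a nonconstant inner function. Then
$k_w^\theta = P_\theta k_w = (1 - \overline{\theta(w)}\,\theta)k_w$ is a reproducing kernel for all functions $f \in K_\theta$, i.e.,
$f(w) = \langle f, k_w^\theta\rangle$ for $w \in \mathbb D$. Denote $\tilde k_w^\theta = C_\theta k_w^\theta$, so that
$\tilde k_w^\theta(z) = \frac{\theta(z) - \theta(w)}{z - w}$. Note that
$\overline{(C_\theta f)(w)} = \langle f, \tilde k_w^\theta\rangle$ for $f \in K_\theta$ and $w \in \mathbb D$.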
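
For orientation, these kernels can be computed explicitly in the simplest finite-dimensional case. The choice $\theta(z) = z^2$ below is purely illustrative and does not come from the text; it gives $K_\theta = \mathrm{span}\{1, z\}$.

```latex
% Illustrative assumption: \theta(z) = z^2, so K_\theta = \mathrm{span}\{1, z\}.
\begin{align*}
k_w^\theta(z) &= \frac{1 - \bar w^2 z^2}{1 - \bar w z} = 1 + \bar w z,\\
\tilde k_w^\theta(z) &= \frac{z^2 - w^2}{z - w} = z + w,\\
(C_\theta k_w^\theta)(z) &= z^2\,\bar z\,\overline{(1 + \bar w z)} = z\,(1 + w \bar z) = z + w \qquad (|z| = 1),\\
\langle a + b z,\, k_w^\theta\rangle &= a + b w \qquad (a, b \in \mathbb{C}).
\end{align*}
```

The third line confirms $\tilde k_w^\theta = C_\theta k_w^\theta$ in this case, and the last line is the reproducing property for $f(z) = a + bz$.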

2 Restrictions of Multiplication Operators and Their Basic Properties

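Let $\theta, \alpha$ be nonconstant inner functions. Recall that $K_\theta^\infty := K_\theta \cap L^\infty(\mathbb T)$ is a dense
subset of $K_\theta$ (see [10]). Since $\overline{zH^\infty}$ is a dense subset of $H^2_-$ and $\theta H^\infty$ is a dense
subset of $\theta H^2$, it follows that $K_\theta^\perp \cap L^\infty(\mathbb T)$ is a dense subset of $K_\theta^\perp$. For $\varphi \in L^2(\mathbb T)$
we can consider the densely defined multiplication operator

\[ M_\varphi : (K_\theta \cap L^\infty(\mathbb T)) \oplus (K_\theta^\perp \cap L^\infty(\mathbb T)) \to K_\alpha \oplus K_\alpha^\perp. \tag{3} \]

Define also

\[ A_\varphi^{\theta,\alpha} = P_\alpha M_\varphi|_{K_\theta \cap L^\infty(\mathbb T)}, \qquad \tilde B_\varphi^{\theta,\alpha} = P_\alpha M_\varphi|_{K_\theta^\perp \cap L^\infty(\mathbb T)} \]

and

\[ B_\varphi^{\theta,\alpha} = P_\alpha^\perp M_\varphi|_{K_\theta \cap L^\infty(\mathbb T)}, \qquad D_\varphi^{\theta,\alpha} = P_\alpha^\perp M_\varphi|_{K_\theta^\perp \cap L^\infty(\mathbb T)}. \]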

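Remark 2.1 Note that for $g \in K_\theta^\perp \cap L^\infty(\mathbb T)$, $h \in K_\alpha \cap L^\infty(\mathbb T)$ we have

\begin{align*}
\langle \tilde B_\varphi^{\theta,\alpha} g, h\rangle &= \langle P_\alpha M_\varphi|_{K_\theta^\perp \cap L^\infty(\mathbb T)}\, g, h\rangle = \langle \varphi g, h\rangle = \int \varphi g \bar h \, dm\\
&= \langle g, \bar\varphi h\rangle = \langle g, P_\theta^\perp M_{\bar\varphi}|_{K_\alpha \cap L^\infty(\mathbb T)}\, h\rangle = \langle (B_{\bar\varphi}^{\alpha,\theta})^* g, h\rangle.
\end{align*}

Hence, according to the decomposition (3), the action of the operator $M_\varphi$ is given
by the matrix

\[ \begin{pmatrix} P_\alpha M_\varphi|_{K_\theta \cap L^\infty(\mathbb T)} & P_\alpha M_\varphi|_{K_\theta^\perp \cap L^\infty(\mathbb T)}\\ P_\alpha^\perp M_\varphi|_{K_\theta \cap L^\infty(\mathbb T)} & P_\alpha^\perp M_\varphi|_{K_\theta^\perp \cap L^\infty(\mathbb T)} \end{pmatrix} \begin{pmatrix} g\\ h \end{pmatrix} = \begin{pmatrix} A_\varphi^{\theta,\alpha} & (B_{\bar\varphi}^{\alpha,\theta})^*\\ B_\varphi^{\theta,\alpha} & D_\varphi^{\theta,\alpha} \end{pmatrix} \begin{pmatrix} g\\ h \end{pmatrix}. \tag{4} \]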

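If $A_\varphi^{\theta,\alpha}$ extends to the whole $K_\theta$ as a bounded operator, it is called an asymmetric
truncated Toeplitz operator (ATTO). Similarly, if $B_\varphi^{\theta,\alpha}$ extends to a bounded
operator from $K_\theta$ to $K_\alpha^\perp$, it is called an asymmetric big truncated Hankel operator
(ATHO) (see [13, 15]), and if $D_\varphi^{\theta,\alpha}$ extends to the whole $K_\theta^\perp$ as a bounded operator,
it is called an asymmetric dual truncated Toeplitz operator (ADTTO).
Let us fix the notation

\begin{align*}
\mathcal T(K_\theta, K_\alpha) &= \{A_\varphi^{\theta,\alpha} : \varphi \in L^2(\mathbb T) \text{ and } A_\varphi^{\theta,\alpha} \text{ is bounded}\},\\
\mathcal T(K_\theta, K_\alpha^\perp) &= \{B_\varphi^{\theta,\alpha} : \varphi \in L^2(\mathbb T) \text{ and } B_\varphi^{\theta,\alpha} \text{ is bounded}\},\\
\mathcal T(K_\theta^\perp, K_\alpha^\perp) &= \{D_\varphi^{\theta,\alpha} : \varphi \in L^2(\mathbb T) \text{ and } D_\varphi^{\theta,\alpha} \text{ is bounded}\}.
\end{align*}

In case $\theta = \alpha$ we will use the shorter notation $A_\varphi^\theta$, $B_\varphi^\theta$, $D_\varphi^\theta$ and $\mathcal T(K_\theta)$,
$\mathcal T(K_\theta, K_\theta^\perp)$, $\mathcal T(K_\theta^\perp)$, respectively.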
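The following basic properties of restrictions of multiplication operators hold.
Lemma 2.2 Let $A_\varphi^{\theta,\alpha} \in \mathcal T(K_\theta, K_\alpha)$, $B_\varphi^{\theta,\alpha} \in \mathcal T(K_\theta, K_\alpha^\perp)$, $D_\varphi^{\theta,\alpha} \in \mathcal T(K_\theta^\perp, K_\alpha^\perp)$.
(a) Then $(A_\varphi^{\theta,\alpha})^* = A_{\bar\varphi}^{\alpha,\theta}$ and $(D_\varphi^{\theta,\alpha})^* = D_{\bar\varphi}^{\alpha,\theta}$.
(b) If $\psi \in L^\infty(\mathbb T)$ then $A_{\psi\varphi}^{\theta,\alpha} \in \mathcal T(K_\theta, K_\alpha)$ and $D_{\psi\varphi}^{\theta,\alpha} \in \mathcal T(K_\theta^\perp, K_\alpha^\perp)$.
The relations between restrictions of $M_\varphi$ and conjugations will now be considered.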
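
Proposition 2.3 Let $\alpha, \theta$ be inner functions and $\varphi \in L^2(\mathbb T)$. Assume that
$A_\varphi^{\theta,\alpha} \in \mathcal T(K_\theta, K_\alpha)$, $B_\varphi^{\theta,\alpha} \in \mathcal T(K_\theta, K_\alpha^\perp)$, $D_\varphi^{\theta,\alpha} \in \mathcal T(K_\theta^\perp, K_\alpha^\perp)$. Then
(a) $C_\alpha A_\varphi^{\theta,\alpha} = A_{\alpha\bar\varphi\bar\theta}^{\theta,\alpha} C_\theta$;
(b) $C_\alpha D_\varphi^{\theta,\alpha} = D_{\alpha\bar\varphi\bar\theta}^{\theta,\alpha} C_\theta$;
(c) $C_\alpha B_\varphi^{\theta,\alpha} = B_{\alpha\bar\varphi\bar\theta}^{\theta,\alpha} C_\theta$.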

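Proof For $f \in L^\infty(\mathbb T)$ we have

\[ C_\alpha M_\varphi f = C_\alpha(\varphi f) = \alpha \bar z \bar\varphi \bar f = M_{\alpha\bar\varphi\bar\theta}\, \theta \bar z \bar f = M_{\alpha\bar\varphi\bar\theta}\, C_\theta f. \tag{5} \]

Now, since $K_\theta$ and $K_\theta^\perp$ are invariant for $C_\theta$ (the same holds for $\alpha$), for $g \in K_\theta \cap L^\infty(\mathbb T)$,
$h \in K_\theta^\perp \cap L^\infty(\mathbb T)$, using the matrix representation (4) we get

\[ \begin{pmatrix} C_\alpha|_{K_\alpha} & 0\\ 0 & C_\alpha|_{K_\alpha^\perp} \end{pmatrix} \begin{pmatrix} A_\varphi^{\theta,\alpha} & (B_{\bar\varphi}^{\alpha,\theta})^*\\ B_\varphi^{\theta,\alpha} & D_\varphi^{\theta,\alpha} \end{pmatrix} \begin{pmatrix} g\\ h \end{pmatrix} = \begin{pmatrix} A_{\alpha\bar\varphi\bar\theta}^{\theta,\alpha} & (B_{\bar\alpha\varphi\theta}^{\alpha,\theta})^*\\ B_{\alpha\bar\varphi\bar\theta}^{\theta,\alpha} & D_{\alpha\bar\varphi\bar\theta}^{\theta,\alpha} \end{pmatrix} \begin{pmatrix} C_\theta|_{K_\theta} & 0\\ 0 & C_\theta|_{K_\theta^\perp} \end{pmatrix} \begin{pmatrix} g\\ h \end{pmatrix}. \]

Hence
(a) $C_\alpha A_\varphi^{\theta,\alpha} g = A_{\alpha\bar\varphi\bar\theta}^{\theta,\alpha} C_\theta g$ for $g \in K_\theta \cap L^\infty(\mathbb T)$;
(b) $C_\alpha D_\varphi^{\theta,\alpha} h = D_{\alpha\bar\varphi\bar\theta}^{\theta,\alpha} C_\theta h$ for $h \in K_\theta^\perp \cap L^\infty(\mathbb T)$;
(c) $C_\alpha (B_{\bar\varphi}^{\alpha,\theta})^* h = (B_{\bar\alpha\varphi\theta}^{\alpha,\theta})^* C_\theta h$ for $h \in K_\theta^\perp \cap L^\infty(\mathbb T)$;
(d) $C_\alpha B_\varphi^{\theta,\alpha} g = B_{\alpha\bar\varphi\bar\theta}^{\theta,\alpha} C_\theta g$ for $g \in K_\theta \cap L^\infty(\mathbb T)$.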

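Corollary 2.4 Let $\theta$ be an inner function and $\varphi \in L^2(\mathbb T)$. Assume that
$A_\varphi^\theta \in \mathcal T(K_\theta)$, $D_\varphi^\theta \in \mathcal T(K_\theta^\perp)$. Then $A_\varphi^\theta$ and $D_\varphi^\theta$ are $C_\theta$-symmetric, i.e.,
$C_\theta A_\varphi^\theta = A_{\bar\varphi}^\theta C_\theta$ and $C_\theta D_\varphi^\theta = D_{\bar\varphi}^\theta C_\theta$.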

The following properties of the operators Bzθ and Bz̄θ can be verified.
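Lemma 2.5 Let $\theta$ be an inner function. Then
(a) $B_z^\theta = \theta \otimes \tilde k_0^\theta$;
(b) $(B_z^\theta)^* = \tilde k_0^\theta \otimes \theta$;
(c) $B_{\bar z}^\theta = \bar z \otimes k_0^\theta$;
(d) $(B_{\bar z}^\theta)^* = k_0^\theta \otimes \bar z$.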

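
Proof To show (a) let $f \in K_\theta$. Then $zf \in H^2$ and $\bar\theta z f = \overline{C_\theta f}$, so

\begin{align*}
B_z^\theta f &= P_\theta^\perp(zf) = \theta P^+ \bar\theta (zf) = \theta P^+ \overline{C_\theta f} = \overline{(C_\theta f)(0)}\, \theta\\
&= \overline{\langle C_\theta f, k_0^\theta\rangle}\, \theta = \overline{\langle C_\theta k_0^\theta, f\rangle}\, \theta = \langle f, \tilde k_0^\theta\rangle\, \theta = (\theta \otimes \tilde k_0^\theta) f.
\end{align*}

(c) is a consequence of Proposition 2.3, since

\[ B_{\bar z}^\theta = C_\theta B_z^\theta C_\theta = C_\theta (\theta \otimes \tilde k_0^\theta) C_\theta = C_\theta \theta \otimes C_\theta \tilde k_0^\theta = \bar z \otimes k_0^\theta. \]

(b) and (d) are straightforward.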
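
A quick sanity check of Lemma 2.5 (a) and (c) in a concrete case may be helpful. The inner function $\theta(z) = z^2$ used below is an illustrative assumption, not taken from the text; for it $K_\theta = \mathrm{span}\{1, z\}$, $k_0^\theta = 1$ and $\tilde k_0^\theta(z) = z$.

```latex
% Illustrative assumption: \theta(z) = z^2, so k_0^\theta = 1 and \tilde k_0^\theta(z) = z.
% Take f = a + bz \in K_\theta with a, b \in \mathbb{C}.
\begin{align*}
B_z^\theta f &= P_\theta^\perp(a z + b z^2) = b z^2 = \langle f, z\rangle\,\theta = (\theta \otimes \tilde k_0^\theta) f,\\
B_{\bar z}^\theta f &= P_\theta^\perp(a \bar z + b) = a \bar z = \langle f, 1\rangle\,\bar z = (\bar z \otimes k_0^\theta) f.
\end{align*}
```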

Proposition 2.6 Let θ be a nonconstant inner function. Then
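(a) $D_z^\theta D_{\bar z}^\theta = I_{K_\theta^\perp} - (1 - |\theta(0)|^2)\, \theta \otimes \theta$;
(b) $D_{\bar z}^\theta D_z^\theta = I_{K_\theta^\perp} - (1 - |\theta(0)|^2)\, \bar z \otimes \bar z$;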
(c) Bzθ (Bz̄θ )∗ = θ (0) θ ⊗ z̄;
(d) (Bz̄θ )∗ Bzθ = 0.
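Proof Note that for $f, g \in K_\theta^\perp$ we have

\begin{align*}
\langle D_z^\theta D_{\bar z}^\theta f, g\rangle &= \langle z(\bar z f - P_\theta(\bar z f)), g\rangle\\
&= \langle f - z P_\theta(\bar z f), g\rangle = \langle f, g\rangle - \langle z P_\theta(\bar z f), g\rangle.
\end{align*}

Since

\begin{align*}
P_\theta(\bar z f) &= P^+ \theta P^- \bar\theta (\bar z P^- f + \bar z P_{\theta H^2} f)\\
&= P^+ \theta P^- \bar\theta \bar z P_{\theta H^2} f = \langle \bar\theta f, 1\rangle P^+(\theta \bar z) = \langle f, \theta\rangle\, \tilde k_0^\theta,
\end{align*}

we have

\[ \langle D_z^\theta D_{\bar z}^\theta f, g\rangle = \langle f, g\rangle - \langle f, \theta\rangle \langle z \tilde k_0^\theta, g\rangle = \langle f, g\rangle - \langle f, \theta\rangle \langle P_\theta^\perp(z \tilde k_0^\theta), g\rangle. \]

Note that $P_\theta^\perp(z \tilde k_0^\theta) = (1 - |\theta(0)|^2)\theta$. Hence

\begin{align*}
\langle D_z^\theta D_{\bar z}^\theta f, g\rangle &= \langle f, g\rangle - \langle f, \theta\rangle \langle (1 - |\theta(0)|^2)\theta, g\rangle\\
&= \langle (I_{K_\theta^\perp} - (1 - |\theta(0)|^2)\, \theta \otimes \theta) f, g\rangle.
\end{align*}

To prove (b) note that, by Proposition 2.3,

\begin{align*}
D_{\bar z}^\theta D_z^\theta &= C_\theta D_z^\theta D_{\bar z}^\theta C_\theta = C_\theta (I_{K_\theta^\perp} - (1 - |\theta(0)|^2)\, \theta \otimes \theta) C_\theta\\
&= I_{K_\theta^\perp} - (1 - |\theta(0)|^2)\, C_\theta \theta \otimes C_\theta \theta = I_{K_\theta^\perp} - (1 - |\theta(0)|^2)\, \bar z \otimes \bar z.
\end{align*}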

Calculating (c) we will use Lemma 2.5 and the formula for multiplication of rank-
one operators (see [17])

Bzθ (Bz̄θ )∗ = (θ ⊗ 1
k0θ )(k0θ ⊗ z̄) = k0θ , k̃0θ θ ⊗ z̄ = θ (0) θ ⊗ z̄.

The last formula can be obtained similarly.
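
Items (a) and (b) of Proposition 2.6 can be tested on basis vectors in a simple case. The choice $\theta(z) = z^2$ is an illustrative assumption (not from the text); then $\theta(0) = 0$, $K_\theta = \mathrm{span}\{1, z\}$, and $K_\theta^\perp$ is spanned by $z^n$ ($n \ge 2$) and $\bar z^m$ ($m \ge 1$).

```latex
% Illustrative assumption: \theta(z) = z^2, hence 1 - |\theta(0)|^2 = 1.
\begin{align*}
D_z^\theta D_{\bar z}^\theta z^2 &= D_z^\theta P_\theta^\perp(z) = 0 = z^2 - \langle z^2, \theta\rangle\,\theta,
& D_z^\theta D_{\bar z}^\theta z^n &= D_z^\theta z^{n-1} = z^n \quad (n \ge 3),\\
D_{\bar z}^\theta D_z^\theta \bar z &= D_{\bar z}^\theta P_\theta^\perp(1) = 0 = \bar z - \langle \bar z, \bar z\rangle\,\bar z,
& D_{\bar z}^\theta D_z^\theta \bar z^m &= D_{\bar z}^\theta \bar z^{m-1} = \bar z^m \quad (m \ge 2).
\end{align*}
```

On the remaining basis vectors both products act as the identity, in agreement with $I_{K_\theta^\perp} - \theta \otimes \theta$ and $I_{K_\theta^\perp} - \bar z \otimes \bar z$ respectively.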

3 Basic Properties of ADTTO

It was shown that an asymmetric truncated Toeplitz operator can be bounded even if
it has no bounded symbol. The same is true for an asymmetric big truncated Hankel
operator. In the case of bounded asymmetric dual truncated Toeplitz operators the symbol is
always bounded and unique.
Proposition 3.1 Let $\varphi \in L^2(\mathbb T)$. Then $D_\varphi^{\theta,\alpha}$ is bounded if and only if $\varphi \in L^\infty(\mathbb T)$.
Moreover, in that case, $\|D_\varphi^{\theta,\alpha}\| = \|\varphi\|_\infty$.
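Proof Let $f \in H^\infty$. Then $\theta f \in \theta H^\infty \subset \theta H^2$ and

\begin{align*}
\|D_\varphi^{\theta,\alpha}(\theta f)\|^2 &= \|(P^- + \alpha P^+ \bar\alpha)(\varphi\theta f)\|^2 = \|P^-(\varphi\theta f)\|^2 + \|\alpha P^+ \bar\alpha\varphi\theta f\|^2\\
&\ge \|\alpha P^+ \bar\alpha\varphi\theta f\|^2 = \|T_{\bar\alpha\varphi\theta} f\|^2.
\end{align*}

If the operator $D_\varphi^{\theta,\alpha}$ is bounded, then there is a constant $C > 0$ such that
$\|D_\varphi^{\theta,\alpha}(\theta f)\| \le C\|f\|$. Hence $\|T_{\bar\alpha\varphi\theta} f\| \le C\|f\|$ for every $f \in H^\infty$, which implies
that $T_{\bar\alpha\varphi\theta}$ is bounded and in consequence $\bar\alpha\varphi\theta \in L^\infty(\mathbb T)$. Thus $\varphi \in L^\infty(\mathbb T)$ and
$\|\varphi\|_\infty = \|T_{\bar\alpha\varphi\theta}\| \le \|D_\varphi^{\theta,\alpha}\|$.
If now $\varphi \in L^\infty(\mathbb T)$, then for any $f \in K_\theta^\perp$ we have

\[ \|D_\varphi^{\theta,\alpha} f\| = \|P_\alpha^\perp(\varphi f)\| \le \|\varphi\|_\infty \|f\|. \]

Hence $D_\varphi^{\theta,\alpha}$ is bounded and, moreover, $\|D_\varphi^{\theta,\alpha}\| \le \|\varphi\|_\infty$.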

The only compact Toeplitz operator is the zero operator. The same is true for
asymmetric dual truncated Toeplitz operators.
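Proposition 3.2 Let $\varphi \in L^\infty(\mathbb T)$. Then $D_\varphi^{\theta,\alpha}$ is compact if and only if $\varphi = 0$.
Proof Let $f_n \in \theta H^2 \subset K_\theta^\perp$ be weakly convergent to $0$ ($f_n \rightharpoonup 0$). Then
$f_n = \theta \tilde f_n$ for $\tilde f_n \in H^2$. Note that $f_n \rightharpoonup 0$ if and only if $\tilde f_n \rightharpoonup 0$. If
$D_\varphi^{\theta,\alpha}$ is compact, then $D_\varphi^{\theta,\alpha} f_n = D_\varphi^{\theta,\alpha} \theta \tilde f_n \to 0$ in norm. Since (as in the proof of
Proposition 3.1) $\|T_{\bar\alpha\theta\varphi} \tilde f_n\| \le \|D_\varphi^{\theta,\alpha}(\theta \tilde f_n)\| \to 0$, compactness of $D_\varphi^{\theta,\alpha}$
implies compactness of the Toeplitz operator $T_{\bar\alpha\theta\varphi}$, which leads to the conclusion
that $\varphi = 0$.
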

An important consequence of Proposition 3.2 is that the only symbol for the zero
asymmetric dual truncated Toeplitz operator is ϕ = 0. What follows is that each
ADTTO has a unique symbol.
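Proposition 3.3 Let $\alpha, \theta$ be inner functions and let $\varphi \in L^\infty(\mathbb T)$. If $D_\varphi^{\theta,\alpha}$ is
invertible, then $\varphi$ is invertible in $L^\infty(\mathbb T)$.
Proof By Douglas [8, Corollary 4.24] we know that the operator $M_\varphi$ is invertible in
$L^2(\mathbb T)$ if and only if $\varphi$ is invertible in $L^\infty(\mathbb T)$. Assume that $D_\varphi^{\theta,\alpha}$ is invertible. Then
there is a constant $c > 0$ such that

\[ \|D_\varphi^{\theta,\alpha} f\| \ge c\|f\| \tag{6} \]

for all $f \in K_\theta^\perp$. Hence for all integers $k$ and $g \in H^2$ we have

\[ \|M_\varphi z^k g\| = \|\varphi z^k g\| = \|\varphi\theta g\| \ge \|P_\alpha^\perp \varphi\theta g\| = \|D_\varphi^{\theta,\alpha} \theta g\| \ge c\|\theta g\| = c\|z^k g\|. \]

Since the set $\{z^k g : k \in \mathbb Z,\ g \in H^2\}$ is dense in $L^2(\mathbb T)$, we have for $x \in L^2(\mathbb T)$

\[ \|M_\varphi x\| \ge c\|x\|. \]

Since $(D_\varphi^{\theta,\alpha})^* = D_{\bar\varphi}^{\alpha,\theta}$ is also invertible, the same argument gives $\|M_{\bar\varphi} x\| \ge c'\|x\|$ for some
constant $c' > 0$ and all $x \in L^2(\mathbb T)$. We conclude that
$M_\varphi$ is invertible by Douglas [8, Corollary 4.9].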

4 Intertwining Property for ADTTO

Since the unilateral shift $S$ is unitarily equivalent to the Toeplitz operator $T_z$, we are
able to describe the commutant of the unilateral shift as

\[ \{S\}' = \{T_z\}' = \{T_\varphi : \varphi \in H^\infty := L^\infty(\mathbb T) \cap H^2\}. \]

Now considering the compressions $A_z^\theta$ and $D_z^\theta$, it is natural to try to describe
the commutants $\{A_z^\theta\}'$ and $\{D_z^\theta\}'$. In the more general asymmetric setting we are
searching for all operators intertwining $A_z^\theta$ and $A_z^\alpha$ in the case of two model spaces,
or for all operators intertwining $D_z^\theta$ and $D_z^\alpha$ in the case of orthogonal
complements of two model spaces, i.e., we try to describe the following sets of
operators

I(Kθ , Kα ) = {B ∈ B(Kθ , Kα ) : Aαz B = BAθz },

I(Kθ⊥ , Kα⊥ ) = {B ∈ B(Kθ⊥ , Kα⊥ ) : Dzα B = BDzθ }.

The following result describes T (Kθ⊥ , Kα⊥ ) ∩ I(Kθ⊥ , Kα⊥ ).

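Theorem 4.1 Let $\alpha, \theta$ be nonconstant inner functions and $\varphi \in L^\infty(\mathbb T)$, $\varphi \neq 0$.
Then
(a) $D_\varphi^{\theta,\alpha} D_z^\theta - D_z^\alpha D_\varphi^{\theta,\alpha} = \alpha \otimes P_\theta^\perp(\bar\varphi \tilde k_0^\alpha) - P_\alpha^\perp(\varphi k_0^\theta) \otimes \bar z$;
(b) $D_\varphi^{\theta,\alpha} D_{\bar z}^\theta - D_{\bar z}^\alpha D_\varphi^{\theta,\alpha} = \bar z \otimes P_\theta^\perp(\bar\varphi k_0^\alpha) - P_\alpha^\perp(\varphi \tilde k_0^\theta) \otimes \theta$.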
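Proof Let $\varphi \in L^\infty(\mathbb T) \setminus \{0\}$. The commutation relation $M_\varphi M_z = M_z M_\varphi$ can be
written as

\[ \begin{pmatrix} A_\varphi^{\theta,\alpha} & (B_{\bar\varphi}^{\alpha,\theta})^*\\ B_\varphi^{\theta,\alpha} & D_\varphi^{\theta,\alpha} \end{pmatrix} \begin{pmatrix} A_z^\theta & (B_{\bar z}^\theta)^*\\ B_z^\theta & D_z^\theta \end{pmatrix} = \begin{pmatrix} A_z^\alpha & (B_{\bar z}^\alpha)^*\\ B_z^\alpha & D_z^\alpha \end{pmatrix} \begin{pmatrix} A_\varphi^{\theta,\alpha} & (B_{\bar\varphi}^{\alpha,\theta})^*\\ B_\varphi^{\theta,\alpha} & D_\varphi^{\theta,\alpha} \end{pmatrix}. \]

Comparing the lower right entries, we have

\[ B_\varphi^{\theta,\alpha} (B_{\bar z}^\theta)^* + D_\varphi^{\theta,\alpha} D_z^\theta = B_z^\alpha (B_{\bar\varphi}^{\alpha,\theta})^* + D_z^\alpha D_\varphi^{\theta,\alpha}. \]

Clearly,

\[ B_\varphi^{\theta,\alpha} k_0^\theta = P_\alpha^\perp(\varphi k_0^\theta) \quad\text{and}\quad B_{\bar\varphi}^{\alpha,\theta} \tilde k_0^\alpha = P_\theta^\perp(\bar\varphi \tilde k_0^\alpha), \]

and, by Proposition 2.3,

\[ B_{\bar\varphi}^{\alpha,\theta} \tilde k_0^\alpha = B_{\bar\varphi}^{\alpha,\theta} C_\alpha k_0^\alpha = C_\theta B_{\theta\varphi\bar\alpha}^{\alpha,\theta} k_0^\alpha = C_\theta P_\theta^\perp(\theta\varphi\bar\alpha k_0^\alpha). \]

Therefore, by Lemma 2.5, we get

\begin{align*}
D_\varphi^{\theta,\alpha} D_z^\theta - D_z^\alpha D_\varphi^{\theta,\alpha} &= B_z^\alpha (B_{\bar\varphi}^{\alpha,\theta})^* - B_\varphi^{\theta,\alpha} (B_{\bar z}^\theta)^*\\
&= (\alpha \otimes \tilde k_0^\alpha)(B_{\bar\varphi}^{\alpha,\theta})^* - B_\varphi^{\theta,\alpha}(k_0^\theta \otimes \bar z)\\
&= \alpha \otimes (B_{\bar\varphi}^{\alpha,\theta} \tilde k_0^\alpha) - (B_\varphi^{\theta,\alpha} k_0^\theta) \otimes \bar z\\
&= \alpha \otimes P_\theta^\perp(\bar\varphi \tilde k_0^\alpha) - P_\alpha^\perp(\varphi k_0^\theta) \otimes \bar z,
\end{align*}

which proves (a).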

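To see (b) we start with the relation $M_\varphi M_{\bar z} = M_{\bar z} M_\varphi$, which can be written as

\[ \begin{pmatrix} A_\varphi^{\theta,\alpha} & (B_{\bar\varphi}^{\alpha,\theta})^*\\ B_\varphi^{\theta,\alpha} & D_\varphi^{\theta,\alpha} \end{pmatrix} \begin{pmatrix} A_{\bar z}^\theta & (B_z^\theta)^*\\ B_{\bar z}^\theta & D_{\bar z}^\theta \end{pmatrix} = \begin{pmatrix} A_{\bar z}^\alpha & (B_z^\alpha)^*\\ B_{\bar z}^\alpha & D_{\bar z}^\alpha \end{pmatrix} \begin{pmatrix} A_\varphi^{\theta,\alpha} & (B_{\bar\varphi}^{\alpha,\theta})^*\\ B_\varphi^{\theta,\alpha} & D_\varphi^{\theta,\alpha} \end{pmatrix}. \]

Comparing the lower right entries, we have

\[ B_\varphi^{\theta,\alpha} (B_z^\theta)^* + D_\varphi^{\theta,\alpha} D_{\bar z}^\theta = B_{\bar z}^\alpha (B_{\bar\varphi}^{\alpha,\theta})^* + D_{\bar z}^\alpha D_\varphi^{\theta,\alpha}. \]

Clearly,

\[ B_\varphi^{\theta,\alpha} \tilde k_0^\theta = P_\alpha^\perp(\varphi \tilde k_0^\theta) \quad\text{and}\quad B_{\bar\varphi}^{\alpha,\theta} k_0^\alpha = P_\theta^\perp(\bar\varphi k_0^\alpha). \]

Therefore, by Lemma 2.5, we get

\begin{align*}
D_\varphi^{\theta,\alpha} D_{\bar z}^\theta - D_{\bar z}^\alpha D_\varphi^{\theta,\alpha} &= B_{\bar z}^\alpha (B_{\bar\varphi}^{\alpha,\theta})^* - B_\varphi^{\theta,\alpha} (B_z^\theta)^*\\
&= (\bar z \otimes k_0^\alpha)(B_{\bar\varphi}^{\alpha,\theta})^* - B_\varphi^{\theta,\alpha}(\tilde k_0^\theta \otimes \theta)\\
&= \bar z \otimes (B_{\bar\varphi}^{\alpha,\theta} k_0^\alpha) - (B_\varphi^{\theta,\alpha} \tilde k_0^\theta) \otimes \theta\\
&= \bar z \otimes P_\theta^\perp(\bar\varphi k_0^\alpha) - P_\alpha^\perp(\varphi \tilde k_0^\theta) \otimes \theta.
\end{align*}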

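Theorem 4.2 Let $\alpha, \theta$ be nonconstant inner functions and $\varphi \in L^\infty(\mathbb T)$, $\varphi \neq 0$.
Then $D_\varphi^{\theta,\alpha} D_z^\theta = D_z^\alpha D_\varphi^{\theta,\alpha}$ if and only if one of the following holds:
(a) $\alpha(0) = 0 = \theta(0)$ and $\varphi \in \frac{\alpha}{\gcd(\alpha,\theta)} K_{z\cdot\gcd(\alpha,\theta)}$, or
(b) $\alpha = \theta$ and $\varphi \in (k_0^\theta)^{-1} K_{z\theta}$.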
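Proof By Theorem 4.1,

\[ D_\varphi^{\theta,\alpha} D_z^\theta = D_z^\alpha D_\varphi^{\theta,\alpha} \tag{7} \]

if and only if there is a constant $c \in \mathbb C$ such that

\[ P_\alpha^\perp(\varphi k_0^\theta) = c\,\alpha, \qquad P_\theta^\perp(\bar\varphi \tilde k_0^\alpha) = \bar c\,\bar z. \tag{8} \]

Since

\begin{align*}
C_\theta P_\theta^\perp(\bar\varphi \tilde k_0^\alpha) &= P_\theta^\perp C_\theta(\bar\varphi \tilde k_0^\alpha) = P_\theta^\perp(\theta\bar\alpha\varphi\, \alpha \bar z\, \overline{\tilde k_0^\alpha})\\
&= P_\theta^\perp(\theta\bar\alpha\varphi\, C_\alpha \tilde k_0^\alpha) = P_\theta^\perp(\theta\bar\alpha\varphi k_0^\alpha),
\end{align*}

the above is equivalent to

\[ P_\alpha^\perp(\varphi k_0^\theta) = c\,\alpha, \qquad P_\theta^\perp(\theta\bar\alpha\varphi k_0^\alpha) = c\,\theta. \tag{9} \]

Hence, there are $g \in K_\alpha$ and $h \in K_\theta$ such that

\begin{align*}
\varphi k_0^\theta &= c\alpha + g \quad\text{and} \tag{10}\\
\theta\bar\alpha\varphi k_0^\alpha &= c\theta + h. \tag{11}
\end{align*}

Since the functions $k_0^\alpha, k_0^\theta$ are bounded from below and analytic (in consequence we
have $(k_0^\alpha)^{-1}, (k_0^\theta)^{-1} \in H^\infty$), we get by (10) that $\varphi \in H^2$. Let $\gamma = \gcd(\alpha, \theta)$. Then,
by (11),

\[ \varphi k_0^\alpha - c\alpha = \bar\theta\alpha h = \frac{\bar\theta}{\bar\gamma}\,\frac{\alpha}{\gamma}\, h \in H^2. \]

Hence $h$ is divisible by $\frac{\theta}{\gamma}$ and since $h \in K_\theta = K_{\theta/\gamma} \oplus \frac{\theta}{\gamma} K_\gamma$, we have

\[ h = \frac{\theta}{\gamma}\, h_1 \quad\text{with } h_1 \in K_\gamma. \]

Therefore, by (11),

\[ \frac{\bar\alpha}{\bar\gamma}\, \varphi k_0^\alpha = c\gamma + h_1 \in H^2. \]

Since $k_0^\alpha$ is an outer function, it cannot be divisible by $\frac{\alpha}{\gamma}$, which implies that

\[ \varphi = \frac{\alpha}{\gamma}\, \varphi_1 \quad\text{with } \varphi_1 \in H^2 \setminus \{0\}. \]

Hence by (10) we get $\frac{\alpha}{\gamma} \varphi_1 k_0^\theta = c\alpha + g$. Therefore $g = \frac{\alpha}{\gamma} g_1$ with $g_1 \in K_\gamma$, and (10) and
(11) are equivalent to

\begin{align*}
\varphi_1 k_0^\theta &= c\gamma + g_1, \quad g_1 \in K_\gamma, \tag{12}\\
\varphi_1 k_0^\alpha &= c\gamma + h_1, \quad h_1 \in K_\gamma. \tag{13}
\end{align*}

Moreover,

\[ P_\gamma(\varphi_1) = P_\gamma(\varphi_1 k_0^\theta + \overline{\theta(0)}\,\theta\varphi_1) = P_\gamma(c\gamma + g_1 + \overline{\theta(0)}\,\theta\varphi_1) = g_1. \]

Similarly, $P_\gamma(\varphi_1) = h_1$, so $g_1 = h_1$. Comparing (12) with (13) we thus get
$\varphi_1(k_0^\alpha - k_0^\theta) = 0$. Since $\varphi_1, (k_0^\alpha - k_0^\theta) \in H^2$ and $\varphi_1 \neq 0$, we must have

\[ k_0^\alpha - k_0^\theta = \overline{\theta(0)}\,\theta - \overline{\alpha(0)}\,\alpha = 0. \]

This is possible only in two cases:

1. if $\alpha(0) = 0 = \theta(0)$, then $k_0^\alpha = k_0^\theta = 1$ and by (12),

\[ \varphi = \frac{\alpha}{\gamma}\, \varphi_1 \in \frac{\alpha}{\gamma}(K_\gamma \oplus \mathbb C\gamma) = \frac{\alpha}{\gamma}\, K_{z\cdot\gamma}; \]

2. if $\alpha = \theta$, then (12) and (13) become the same condition, equivalent to $\varphi k_0^\theta \in
K_\theta \oplus \mathbb C\theta = K_{z\theta}$, which leads to $\varphi \in (k_0^\theta)^{-1} K_{z\theta}$.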

Corollary 4.3 Let $\theta$ be a nonconstant inner function and let $\varphi \in L^\infty(\mathbb T)$, $\varphi \neq 0$.
Then $D_\varphi^\theta \in \{D_z^\theta\}'$ if and only if $\varphi \in (k_0^\theta)^{-1} K_{z\theta}$.
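Example Let $a \in \mathbb D$; we denote by $B_a$ the Blaschke factor with zero at $a$, i.e.,
$B_a(z) = \frac{a - z}{1 - \bar a z}$. Let $\theta = zB_b$ and $\alpha = zB_a$ with $a \neq b$, $a \neq 0$, $b \neq 0$, $a, b \in \mathbb D$.
Then $\gcd(\theta, \alpha) = z$. By Theorem 4.2, for $\varphi \in L^\infty(\mathbb T)$ we have $D_\varphi^{\theta,\alpha} \in \mathcal I(K_\theta^\perp, K_\alpha^\perp)$
if and only if

\[ \varphi \in B_a K_{z^2} = \left\{ \frac{a a_0 + (a a_1 - a_0) z - a_1 z^2}{1 - \bar a z} : a_0, a_1 \in \mathbb C \right\}. \]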
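
The explicit form of the set above follows from a one-line expansion, sketched here under the assumption that $K_{z^2}$ is identified with the polynomials of degree at most one (which is what $H^2 \ominus z^2 H^2$ gives).

```latex
% Here \gamma = \gcd(\theta, \alpha) = z, so \alpha/\gamma = B_a and K_{z\gamma} = K_{z^2}
% = \{a_0 + a_1 z : a_0, a_1 \in \mathbb{C}\}; also \alpha(0) = 0 = \theta(0), i.e. case (a).
\[
B_a(z)\,(a_0 + a_1 z) = \frac{(a - z)(a_0 + a_1 z)}{1 - \bar a z}
  = \frac{a a_0 + (a a_1 - a_0)\, z - a_1 z^2}{1 - \bar a z}.
\]
```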

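Example Let $\theta(z) = \exp\{\frac{z+1}{z-1}\}$, $\alpha(z) = \exp\{a\frac{z+1}{z-1}\}$ with $0 < a < 1$. Here $\theta(0) = e^{-1} \neq 0$
and $\alpha \neq \theta$, so neither condition of Theorem 4.2 can hold. Hence there are
no nonzero dual truncated Toeplitz operators intertwining $D_z^\theta$ and $D_z^\alpha$.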

5 Other Relations with ADTTO

A classical result shows that an operator $T$ on $H^2$ is Toeplitz if and only if
$T - S^* T S = 0$. Sarason in [19] showed that if $A$ is a TTO in the model space $K_\theta$ then
$A - (A_z^\theta)^* A A_z^\theta$ is a specific rank two operator with easily seen symbol. First we try
to calculate this expression for DTTO.
Proposition 5.1 Let Dϕθ,α ∈ T (Kθ⊥ , Kα⊥ ). Then

(a) Dϕθ,α − Dzα Dϕθ,α Dz̄θ = Pα⊥ (ϕ z̄k̃0θ ) ⊗ θ + α ⊗ Pθ⊥ (ϕ̄ z̄k̃0α );
(b) Dϕθ,α − Dz̄α Dϕθ,α Dzθ = Pα⊥ (ϕ z̄k0θ ) ⊗ z̄ + z̄ ⊗ Pθ⊥ (ϕ̄ z̄k0α ).
Proof By Theorem 4.1 we have

Dϕθ,α Dzθ Dz̄θ − Dzα Dϕθ,α Dz̄θ = α ⊗ Dzθ Pθ⊥ (ϕ̄ k̃0α ) − Pα⊥ (ϕk0θ ) ⊗ Dzθ z̄.

Since Dzθ z̄ = Pθ⊥ 1 = θ (0)θ , using Proposition 2.6 (a) we get

Dϕθ,α − Dzθ Dϕθ,α Dz̄θ = (1 − |θ (0)|2 )Dϕθ,α θ

− θ (0)Pα⊥ (ϕk0θ ) ⊗ θ + α ⊗ Dzθ Pθ⊥ (ϕ̄ k̃0α ). (14)

Since Dϕθ,α (θ ) = Pα⊥ (θ ϕ), we obtain

(1 − |θ (0)|2)Dϕθ θ, α−θ (0)Pα⊥ (ϕk0θ ) = Pα⊥ (θ ϕ − |θ (0)|2θ ϕ − θ (0)ϕk0θ )

= Pα⊥ (θ ϕ − |θ (0)|2 θ ϕ − θ (0)ϕ + |θ (0)|2θ ϕ)

= Pα⊥ (ϕ(θ − θ (0))) = Pα⊥ (ϕ z̄k̃0θ ).

The proof will be completed with

Dzθ Pθ⊥ (ϕ̄ k̃0α ) = Pθ⊥ (Dzθ (ϕ̄ z̄(α − α(0))) = Pθ⊥ (ϕ̄(α − α(0))) = Pθ⊥ (ϕ̄ z̄k̃0α ).

To prove (b) write (a) for α ϕ̄ θ̄ ∈ L∞ (T) and apply Cα and Cθ , then

Cα Dαθ,α C − (Cα Dzα Cα )(Cα Dαθ,α

ϕ̄ θ̄ θ
C )(Cθ Dz̄θ Cθ )
ϕ̄ θ̄ θ

= Cα Pα⊥ (α ϕ̄ θ̄ (θ − θ (0))) ⊗ θ + α ⊗ Pθ⊥ (ᾱϕθ (α − α(0))) Cθ .

Hence by Proposition 2.3 we have

Dϕθ,α − Dz̄α Dϕθ,α Dzθ

= Cα Pα⊥ (α ϕ̄ θ̄ (θ − θ (0))) ⊗ Cθ θ + Cα α ⊗ Cθ Pθ⊥ (ᾱϕθ(α − α(0)))

= Pα⊥ Cα (α ϕ̄(1 − θ (0)θ̄ )) ⊗ z̄ + z̄ ⊗ Pθ⊥ Cθ (ϕθ (1 − α(0)ᾱ))

= Pα⊥ (ϕ z̄k0θ ) ⊗ z̄ + z̄ ⊗ Pθ⊥ (ϕ̄ z̄k0α ).

Corollary 5.2 Let Dϕθ ∈T (Kθ⊥ ). Then

(a) Dϕθ − Dzθ Dϕθ Dz̄θ = Pθ⊥ (ϕ z̄k̃0θ ) ⊗ θ + θ ⊗ Pθ⊥ (ϕ̄ z̄k̃0θ );
(b) Dϕθ − Dz̄θ Dϕθ Dzθ = Pθ⊥ (ϕ z̄k0θ ) ⊗ z̄ + z̄ ⊗ Pθ⊥ (ϕ̄ z̄k0θ ).
1−|a|2 |a|2 −1
Example Let θ = Ba with a ∈ D, then k0θ = 1−āz and k̃0θ = 1−āz . Consider
ϕ ≡ 1, then

Pθ⊥ (ϕ z̄k̃0θ ) = Pθ⊥ (z̄k̃0θ ) = Pθ⊥ (|a|2 − 1)(z̄ + ā 1−1āz ) = (|a|2 − 1)z̄.

Thus

D1θ − Dzθ D1θ Dz̄θ = Pθ⊥ (z̄k̃0θ ) ⊗ θ + θ ⊗ Pθ⊥ (z̄k0θ ) = (|a|2 − 1)(θ ⊗ z̄ + z̄ ⊗ θ ).
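
The two kernels quoted at the start of this example can be recovered from the general formulas $k_0^\theta = 1 - \overline{\theta(0)}\,\theta$ and $\tilde k_0^\theta(z) = (\theta(z) - \theta(0))/z$. The sketch below assumes the normalization $B_a(z) = \frac{a - z}{1 - \bar a z}$ of the Blaschke factor, so that $\theta(0) = a$.

```latex
% Assumption: \theta = B_a with B_a(z) = (a - z)/(1 - \bar a z), hence \theta(0) = a.
\begin{align*}
k_0^\theta(z) &= 1 - \bar a\, B_a(z) = \frac{(1 - \bar a z) - \bar a (a - z)}{1 - \bar a z}
   = \frac{1 - |a|^2}{1 - \bar a z},\\
\tilde k_0^\theta(z) &= \frac{B_a(z) - a}{z} = \frac{(a - z) - a(1 - \bar a z)}{z\,(1 - \bar a z)}
   = \frac{|a|^2 - 1}{1 - \bar a z}.
\end{align*}
```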

Consider now ϕ(z) = z(1 − āz). Then

Pθ⊥ (ϕ z̄k̃0θ ) = Pθ⊥ ((1 − āz)k̃0θ ) = Pθ⊥ (|a|2 − 1)

= (|a|2 − 1)(1 − k0θ ) = (|a|2 − 1)āθ

and

Pθ⊥ (ϕ̄ z̄k̃0θ ) = (|a|2 − 1)Pθ⊥ z̄2 (1 − a z̄) 1−1āz = (|a|2 − 1) p(z̄),

where p(z̄) = (ā(1 − |a|2))z̄ + (1 − |a|2)z̄2 − a z̄3 . Therefore

āz) − Dz Dz(1−āz) Dz̄ = (|a| − 1) ā θ ⊗ θ + θ ⊗ p(z̄) .
θ θ θ θ 2
Dz(1−

We see the variety of the above expressions.

6 A Characterization of ADTTO

In this section our goal is to give a characterization of (asymmetric) dual truncated
Toeplitz operators. The simplest approach is to require that the given operator
fulfills some equation(s). Then we must have the possibility to find the symbol of
the given operator. The classical Brown–Halmos result shows that a bounded linear
operator T ∈ B(H 2 ) is a Toeplitz operator if and only if T = (Tz )∗ T Tz . Similar
characterizations (in terms of compressions of Mz ) are known for Hankel operators
and dual Toeplitz operators. In [19] D. Sarason characterized bounded truncated
Toeplitz operators in terms of the compressions of Mz to Kθ . In particular, he proved
that a bounded operator A ∈ B(Kθ ) is a truncated Toeplitz operator if and only if

A − Aθz AAθz̄ = ψ ⊗ k0θ + k0θ ⊗ χ (15)

for some ψ, χ ∈ Kθ . In other words, the left hand side of (15) can be expressed as
an operator of rank at most two. In this section our aim is to give similar expressions
for operators from T (Kθ⊥ , Kα⊥ ) using operators of rank at most two.
The characterization (15) proved in [19] for truncated Toeplitz operators imme-
diately gives a symbol of the truncated Toeplitz operator A = Aθψ+χ̄ . Moreover, the
relation between ψ and χ is simple, see [19, Corollary after Theorem 3.1]. However,
for any asymmetric dual truncated Toeplitz operator, the functions μ = Pα⊥ (ϕ z̄k̃0θ ),
ν = Pθ⊥ (ϕ̄ z̄k̃0α ) in the formula (a) in Proposition 5.1, strongly and in a very
complicated way depend on each other. Moreover, in case of dual truncated Toeplitz
operators, having the rank-two operator on the right hand side of (5.1), μ⊗θ +α ⊗ν
with μ, ν ∈ Kθ⊥ , we are far from obtaining the symbol of D. For this reason, to
answer the natural question of when an operator $D \in \mathcal B(K_\theta^\perp, K_\alpha^\perp)$ is an ADTTO and to
find its symbol, we will consider the matrix decomposition of $D$. This will be done
in Theorem 6.7.
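First the compressions of ADTTO’s to certain subspaces of $K_\theta^\perp$ and $K_\alpha^\perp$ will
be considered. Let $\theta, \alpha$ be two inner functions. Using the decompositions
$K_\theta^\perp = \theta H^2 \oplus H^2_-$ and $K_\alpha^\perp = \alpha H^2 \oplus H^2_-$ one can write each operator $D \in \mathcal B(K_\theta^\perp, K_\alpha^\perp)$ as
a matrix

\[ D = \begin{pmatrix} P_{\alpha H^2} D|_{\theta H^2} & P_{\alpha H^2} D|_{H^2_-}\\ P^- D|_{\theta H^2} & P^- D|_{H^2_-} \end{pmatrix}. \]

In particular, for $\varphi \in L^\infty(\mathbb T)$, we obtain

\[ D_\varphi^{\theta,\alpha} = \begin{pmatrix} \hat T_\varphi^{\theta,\alpha} & \check\Gamma_\varphi^\alpha\\ \hat\Gamma_\varphi^\theta & \check T_\varphi \end{pmatrix} = \begin{pmatrix} \hat T_\varphi^{\theta,\alpha} & (\hat\Gamma_{\bar\varphi}^\alpha)^*\\ \hat\Gamma_\varphi^\theta & \check T_\varphi \end{pmatrix}, \tag{16} \]

where

\[ \hat T_\varphi^{\theta,\alpha} = P_{\alpha H^2} M_\varphi|_{\theta H^2}, \qquad \hat\Gamma_\varphi^\theta = P^- M_\varphi|_{\theta H^2} \tag{17} \]

and

\[ \check\Gamma_\varphi^\alpha = P_{\alpha H^2} M_\varphi|_{H^2_-}, \qquad \check T_\varphi = P^- M_\varphi|_{H^2_-}. \tag{18} \]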

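Let us denote

\begin{align*}
\mathcal T(\theta H^2, \alpha H^2) &= \{\hat T \in \mathcal B(\theta H^2, \alpha H^2) : \hat T = \hat T_\varphi^{\theta,\alpha} \text{ for some } \varphi \in L^\infty(\mathbb T)\},\\
\mathcal T(H^2_-) &= \{\check T \in \mathcal B(H^2_-) : \check T = \check T_\varphi \text{ for some } \varphi \in L^\infty(\mathbb T)\},\\
\mathcal T(\theta H^2, H^2_-) &= \{\hat\Gamma \in \mathcal B(\theta H^2, H^2_-) : \hat\Gamma = \hat\Gamma_\varphi^\theta = P^- M_\varphi|_{\theta H^2} \text{ for } \varphi \in L^\infty(\mathbb T)\},\\
\mathcal T(H^2_-, \alpha H^2) &= \{\check\Gamma \in \mathcal B(H^2_-, \alpha H^2) : \check\Gamma = \check\Gamma_\varphi^\alpha = P_{\alpha H^2} M_\varphi|_{H^2_-} \text{ for } \varphi \in L^\infty(\mathbb T)\}.
\end{align*}

As in [2] we will write $\mathcal T(\theta H^2)$ instead of $\mathcal T(\theta H^2, \theta H^2)$ and $\hat T_\varphi^\theta$ instead of $\hat T_\varphi^{\theta,\theta}$.

Each of the operators in (17) and (18) can be similarly defined for arbitrary $\varphi \in
L^2(\mathbb T)$. In that case $\hat T_\varphi^{\theta,\alpha}$ and $\hat\Gamma_\varphi^\theta$ are defined on a dense subset $\theta H^\infty$ of $\theta H^2$, while
$\check\Gamma_\varphi^\alpha$ and $\check T_\varphi$ are defined on $H^\infty_- = H^2_- \cap L^\infty(\mathbb T) = \overline{zH^\infty}$, which is a dense subset of
$H^2_-$. However, in a moment we will justify the fact that, in a sense, one needs only
to consider symbols from $L^\infty(\mathbb T)$.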
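The following proposition is not difficult to verify using Proposition 1.1.
Proposition 6.1 Let $\theta$ and $\alpha$ be two nonconstant inner functions and let $\varphi \in L^2(\mathbb T)$.
Then
(a) $\hat T_\varphi^{\theta,\alpha}|_{\theta H^\infty} = P_{\alpha H^2} M_\varphi|_{\theta H^\infty} = M_\alpha T_{\bar\alpha\varphi\theta} M_{\bar\theta}|_{\theta H^\infty} = M_\alpha T_{\bar\alpha\varphi}|_{\theta H^\infty}$;
(b) $\hat\Gamma_\varphi^\theta|_{\theta H^\infty} = P^- M_\varphi|_{\theta H^\infty} = H_{\varphi\theta} M_{\bar\theta}|_{\theta H^\infty}$;
(c) $\check T_\varphi|_{H^\infty_-} = P^- M_\varphi|_{H^\infty_-} = J T_{\bar\varphi} J|_{H^\infty_-}$;
(d) $\check\Gamma_\varphi^\alpha|_{H^\infty_-} = P_{\alpha H^2} M_\varphi|_{H^\infty_-} = M_\alpha H_{\alpha\bar\varphi}^*|_{H^\infty_-}$.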
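It now follows from Proposition 6.1(a) that $\hat T \in \mathcal T(\theta H^2, \alpha H^2)$ if and only
if $M_{\bar\alpha} \hat T M_\theta|_{H^2} \in \mathcal T(H^2)$, and so $\mathcal T(\theta H^2, \alpha H^2)$ is isomorphic to the space of
classical Toeplitz operators $\mathcal T(H^2)$. Moreover, each $\hat T_\varphi^{\theta,\alpha}$ is uniquely determined by
its symbol and extends to a bounded operator on $\theta H^2$ if and only if $\varphi \in L^\infty(\mathbb T)$
(since classical Toeplitz operators have similar properties). Similarly, $\mathcal T(H^2_-)$ is
isomorphic to $\mathcal T(H^2)$, each $\check T_\varphi$ is uniquely determined by its symbol and extends
to a bounded operator on $H^2_-$ if and only if $\varphi \in L^\infty(\mathbb T)$.
On the other hand, both $\mathcal T(\theta H^2, H^2_-)$ and $\mathcal T(H^2_-, \alpha H^2)$ are isomorphic to the
space of all Hankel operators $\mathcal H(H^2, H^2_-)$. It follows from properties of classical
Hankel operators that $\hat\Gamma_\varphi^\theta$ and $\check\Gamma_\varphi^\alpha$ may be bounded even for $\varphi \notin L^\infty(\mathbb T)$, and

\[ \hat\Gamma_\varphi^\theta = \hat\Gamma_\psi^\theta \quad\text{if and only if}\quad (\varphi - \psi) \perp \overline{\theta z H^2} \tag{19} \]

and

\[ \check\Gamma_\varphi^\alpha = \check\Gamma_\psi^\alpha \quad\text{if and only if}\quad (\varphi - \psi) \perp \alpha z H^2. \]

In particular, $\hat\Gamma_\varphi^\theta = 0$ if $\varphi \in \bar\theta H^\infty \supset H^\infty$ and $\check\Gamma_\varphi^\alpha = 0$ if $\varphi \in \alpha\overline{H^\infty} \supset
\overline{H^\infty}$. However, since each bounded Hankel operator has a symbol from $L^\infty(\mathbb T)$
[16, Theorem 1.3, Chapter 1], we see that the same is true for operators from
$\mathcal T(\theta H^2, H^2_-)$ or $\mathcal T(H^2_-, \alpha H^2)$. Thus operators with bounded symbols form the
spaces $\mathcal T(\theta H^2, H^2_-)$ and $\mathcal T(H^2_-, \alpha H^2)$.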
Proposition 6.1 implies the following.
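Corollary 6.2
(a) $\hat T_z^\theta = M_z|_{\theta H^2}$, $\hat T_{\bar z}^\theta = M_{\bar z}|_{\theta H^2} - (\theta\bar z \otimes \theta)|_{\theta H^2}$;
(b) $\check T_z = M_z|_{H^2_-} - (1 \otimes \bar z)|_{H^2_-}$, $\check T_{\bar z} = M_{\bar z}|_{H^2_-}$;
(c) $\hat\Gamma_z^\theta = 0$, $\hat\Gamma_{\bar z}^\theta = (\bar z \otimes 1)|_{\theta H^2}$, and $\hat\Gamma_{\bar z}^\theta = 0$ if $\theta(0) = 0$;
(d) $\check\Gamma_z^\theta = (\overline{\theta(0)}\, \theta \otimes \bar z)|_{H^2_-}$, $\check\Gamma_{\bar z}^\theta = 0$, and $\check\Gamma_z^\theta = 0$ if $\theta(0) = 0$.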

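Proof For (a) take $f \in \theta H^2$. Then

\begin{align*}
\hat T_{\bar z}^\theta f &= \theta P^+ \bar\theta \bar z f = \theta P^+ \bar z(\bar\theta f) = \theta \bar z(\bar\theta f - (\bar\theta f)(0))\\
&= \bar z f - \theta\bar z \langle \bar\theta f, 1\rangle = \bar z f - \theta\bar z \langle f, \theta\rangle = \bar z f - (\theta\bar z \otimes \theta) f.
\end{align*}

To obtain (b) take $f \in H^2_-$. Then

\[ \check T_z f = P^-(zf) = zf - P^+(zf) = zf - \langle f, \bar z\rangle = zf - (1 \otimes \bar z) f. \]

Observe that (c) follows, since for $f \in H^2$ we have

\[ \hat\Gamma_{\bar z}^\theta(\theta f) = P^-(\bar z\theta f) = \theta(0) f(0) \bar z = \langle \theta f, 1\rangle \bar z = (\bar z \otimes 1)\theta f. \]

To see (d) take $f \in H^2_-$. Then

\begin{align*}
\check\Gamma_z^\theta f &= \theta P^+ \bar\theta z f = \theta P^+ z(\bar\theta f) = \theta \langle \bar\theta f, \bar z\rangle = \theta \langle f, \theta\bar z\rangle\\
&= \langle f, P^-(\theta\bar z)\rangle \theta = \langle f, \theta(0)\bar z\rangle \theta = \overline{\theta(0)} \langle f, \bar z\rangle \theta = \overline{\theta(0)}\, (\theta \otimes \bar z) f.
\end{align*}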

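From Proposition 6.1 (a), (c) we obtain, in particular, the following.
Corollary 6.3 For $\varphi_1, \varphi_2 \in L^\infty(\mathbb T)$,
(a) $\hat T_{\varphi_1}^\theta \hat T_{\varphi_2}^\theta = M_\theta T_{\varphi_1} T_{\varphi_2} M_{\bar\theta}|_{\theta H^2}$;
(b) $\check T_{\varphi_1} \check T_{\varphi_2} = J T_{\bar\varphi_1} T_{\bar\varphi_2} J$;
(c) $\hat T_{\bar z}^\theta \hat T_z^\theta = I_{\theta H^2}$;
(d) $\hat T_z^\theta \hat T_{\bar z}^\theta = M_\theta (I - 1 \otimes 1) M_{\bar\theta}|_{\theta H^2} = I_{\theta H^2} - (\theta \otimes \theta)|_{\theta H^2}$;
(e) $\check T_{\bar z} \check T_z = J (I - 1 \otimes 1) J|_{H^2_-} = I_{H^2_-} - (\bar z \otimes \bar z)|_{H^2_-}$;
(f) $\check T_z \check T_{\bar z} = I_{H^2_-}$.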

It is well known that the space $\mathcal{T}(H^2)$ of classical Toeplitz operators is isomorphic
to each of the spaces $\mathcal{T}(\theta H^2)$, $\mathcal{T}(\theta H^2, \alpha H^2)$ and
$\mathcal{T}(H^2_-)$. Nevertheless, we present some lemmas for completeness and to fix the notation for
these spatial isomorphisms. Recall also the notation $\mathcal{H}(H^2, H^2_-)$ for the space of all
Hankel operators.
The following properties can be easily verified.
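Proposition 6.4 Let $\theta$ and $\alpha$ be nonconstant inner functions. Then
(a) $\hat T \in \mathcal{T}(\theta H^2, \alpha H^2)$ if and only if $M_{\bar\alpha}\hat T M_\theta|_{H^2} \in \mathcal{T}(H^2)$;
(b) $\check T \in \mathcal{T}(H^2_-)$ if and only if $J\check T J|_{H^2} \in \mathcal{T}(H^2)$;
(c) $\hat\Gamma \in \mathcal{T}(\theta H^2, H^2_-)$ if and only if $\hat\Gamma M_\theta|_{H^2} \in \mathcal{H}(H^2, H^2_-)$;
(d) $\check\Gamma \in \mathcal{T}(H^2_-, \theta H^2)$ if and only if $(M_{\bar\theta}\check\Gamma)^* \in \mathcal{H}(H^2, H^2_-)$.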
Observe also that for ϕ1 , ϕ2 ∈ L∞ (T),

T̂ϕθ1 T̂ϕθ2 = Mθ Tϕ1 Tϕ2 Mθ̄|θH 2 and Ťϕ1 Ťϕ2 = J Tϕ̄1 Tϕ̄2 J.

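It thus follows from the properties of classical Toeplitz operators that if one of the
functions $\varphi_1, \varphi_2$ belongs to $H^\infty$, then

\[ \hat T^\theta_{\bar\varphi_1}\hat T^\theta_{\varphi_2} = \hat T^\theta_{\bar\varphi_1\varphi_2} \quad\text{and}\quad \check T_{\varphi_1}\check T_{\bar\varphi_2} = \check T_{\varphi_1\bar\varphi_2}. \tag{20} \]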

It is well known that classical Toeplitz and Hankel operators can be characterized
in terms of compressions of Mz to H 2 and H−2 .
Recall that
(A) if $T \in B(H^2)$, then $T \in \mathcal{T}(H^2)$ if and only if $T = T_z^* T T_z$, and in that case
$T = T_\varphi$ with $\varphi = T(1) + \overline{T^*(1) - \langle T^*1, 1\rangle}$ (this is the Brown–Halmos result, see
[10, Theorem 4.16]);
(B) if $H \in B(H^2, H^2_-)$, then $H \in \mathcal{H}(H^2, H^2_-)$ if and only if $P^- zH = H T_z$, and in
that case $H = H_\varphi$ with $P^-\varphi = H(1)$, see [16, Theorem 1.8, Chapter 1].
As a consequence of Proposition 6.1 we get the following.
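Theorem 6.5 Let $\theta$ and $\alpha$ be two nonconstant inner functions.
(a) Let $\hat T \in B(\theta H^2, \alpha H^2)$. Then $\hat T \in \mathcal{T}(\theta H^2, \alpha H^2)$ if and only if $\hat T = \hat T^\alpha_{\bar z}\hat T\hat T^\theta_z$.
In that case $\hat T = \hat T^{\theta,\alpha}_\varphi$ with $\varphi = \bar\theta\hat T(\theta) + \alpha\overline{\hat T^*(\alpha)} - \alpha\bar\theta\langle\hat T\theta, \alpha\rangle \in L^\infty(\mathbb{T})$.
(b) Let $\check T \in B(H^2_-)$. Then $\check T \in \mathcal{T}(H^2_-)$ if and only if $\check T = \check T_z\check T\check T_{\bar z}$, and in that case
$\check T = \check T_\varphi$ with $\varphi = z\check T\bar z + \bar z\,\overline{\check T^*\bar z} - \langle\check T\bar z, \bar z\rangle \in L^\infty(\mathbb{T})$.
(c) Let $\hat\Gamma \in B(\theta H^2, H^2_-)$. Then $\hat\Gamma \in \mathcal{T}(\theta H^2, H^2_-)$ if and only if $\check T_z\hat\Gamma = \hat\Gamma\hat T^\theta_z$,
and in that case $\hat\Gamma = \hat\Gamma^\theta_\varphi$ with $P^-(\theta\varphi) = \hat\Gamma\theta$. Moreover, there exists such a $\varphi$
belonging to $L^\infty(\mathbb{T})$.
(d) Let $\check\Gamma \in B(H^2_-, \alpha H^2)$. Then $\check\Gamma \in \mathcal{T}(H^2_-, \alpha H^2)$ if and only if $\check\Gamma\check T_{\bar z} = \hat T^\alpha_{\bar z}\check\Gamma$, and
in that case $\check\Gamma = \check\Gamma^\alpha_\varphi$ with $P^-(\alpha\bar\varphi) = \check\Gamma^*\alpha$. Moreover, there exists such a $\varphi$
belonging to $L^\infty(\mathbb{T})$.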
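Proof By Proposition 6.4 (a) and (A), if $\hat T \in \mathcal{T}(\theta H^2, \alpha H^2)$, then $\mathcal{T}(H^2) \ni
M_{\bar\alpha}\hat T M_\theta|_{H^2} = T_z^* M_{\bar\alpha}\hat T M_\theta T_z$. Equivalently,

\[ \hat T = (M_\alpha T_{\bar z} M_{\bar\alpha})\hat T(M_\theta T_z M_{\bar\theta})|_{\theta H^2} = \hat T^\alpha_{\bar z}\hat T\hat T^\theta_z . \]

The symbol $\varphi$ of $\hat T$ can be obtained from the symbol $\psi$ of $T = M_{\bar\alpha}\hat T M_\theta$, so

\[ \varphi = \alpha\psi\bar\theta = \alpha\Big((M_{\bar\alpha}\hat T M_\theta)(1) + \overline{(M_{\bar\alpha}\hat T M_\theta)^*(1)} - \overline{\langle(M_{\bar\alpha}\hat T M_\theta)^*1, 1\rangle}\Big)\bar\theta \]
\[ = \alpha\Big(\bar\alpha\hat T\theta + \overline{\bar\theta\hat T^*\alpha} - \overline{\langle\bar\theta\hat T^*\alpha, 1\rangle}\Big)\bar\theta = \bar\theta\hat T\theta + \alpha\overline{\hat T^*\alpha} - \alpha\bar\theta\,\overline{\langle\hat T^*\alpha, \theta\rangle}. \]

To show (b) note that by Proposition 6.4 (b) and (A), if $\check T \in \mathcal{T}(H^2_-)$, then $\mathcal{T}(H^2) \ni
J\check T J|_{H^2} = T_z^* J\check T J T_z$. Equivalently, $\check T = (J T_{\bar z} J)\check T(J T_z J)|_{H^2_-} = \check T_z\check T\check T_{\bar z}$. In that case
its symbol is the conjugate of the symbol of $T = J\check T J \in \mathcal{T}(H^2)$, hence

\[ \bar\varphi = J\check T J(1) + \overline{J\check T^* J(1)} - \overline{\langle J\check T^* J1, 1\rangle} = \bar z\,\overline{\check T(\bar z)} + z\check T^*(\bar z) - \langle\check T^*\bar z, \bar z\rangle. \]

To prove (c) we apply Proposition 6.4 (c) and (B). We have that $\hat\Gamma \in
\mathcal{T}(\theta H^2, H^2_-)$ if and only if $P^- z\hat\Gamma M_\theta|_{H^2} = \hat\Gamma M_\theta T_z$. Equivalently,

\[ \check T_z\hat\Gamma = P^- z P^-\hat\Gamma = \hat\Gamma M_\theta T_z M_{\bar\theta}|_{\theta H^2} = \hat\Gamma\hat T^\theta_z . \]

In that case $\hat\Gamma = \hat\Gamma^\theta_\varphi$ where $\theta\varphi$ is a symbol for the Hankel operator $\hat\Gamma M_\theta|_{H^2}$ (by
Proposition 6.1 (c)), thus $P^-(\theta\varphi) = \hat\Gamma M_\theta(1) = \hat\Gamma\theta$.
To obtain the last condition we apply Proposition 6.4 (d) and (B). Note
that $\check\Gamma \in \mathcal{T}(H^2_-, \alpha H^2)$ if and only if $\check\Gamma^* M_\alpha|_{H^2} \in \mathcal{H}(H^2, H^2_-)$. Equivalently,
$P^- z\check\Gamma^* M_\alpha|_{H^2} = \check\Gamma^* M_\alpha T_z$. Hence $\check T_z\check\Gamma^* = \check\Gamma^* M_\alpha T_z M_{\bar\alpha}|_{\alpha H^2}$. Finally, $\check T_z\check\Gamma^* =
\check\Gamma^*\hat T^\alpha_z$, which is the same as $\check\Gamma\check T_{\bar z} = \hat T^\alpha_{\bar z}\check\Gamma$. In that case $\check\Gamma = \check\Gamma^\alpha_\varphi$ where $\alpha\bar\varphi$ is a symbol
of the Hankel operator $(M_{\bar\alpha}\check\Gamma)^* = \check\Gamma^* M_\alpha|_{H^2}$, so $P^-(\alpha\bar\varphi) = \check\Gamma^* M_\alpha 1 = \check\Gamma^*\alpha$.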

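We will now consider operators of the form

\[ D = \begin{bmatrix} \hat T^{\theta,\alpha}_{\varphi_1} & \check\Gamma^\alpha_{\varphi_2} \\ \hat\Gamma^\theta_{\varphi_3} & \check T_{\varphi_4} \end{bmatrix} \]

with $\varphi_i \in L^2(\mathbb{T})$ for $i = 1, 2, 3, 4$. Note that if $D$ given above is bounded, then $\hat T^{\theta,\alpha}_{\varphi_1}$
and $\check T_{\varphi_4}$ are also bounded and so, as mentioned above, necessarily $\varphi_1, \varphi_4 \in L^\infty(\mathbb{T})$.
On the other hand, even though for bounded $D$ the compressions $\check\Gamma^\alpha_{\varphi_2}$ and $\hat\Gamma^\theta_{\varphi_3}$ are
also bounded, the functions $\varphi_2$ and $\varphi_3$ may not belong to $L^\infty(\mathbb{T})$ (but there exist
$\psi_2, \psi_3 \in L^\infty(\mathbb{T})$ such that $\check\Gamma^\alpha_{\varphi_2} = \check\Gamma^\alpha_{\psi_2}$ and $\hat\Gamma^\theta_{\varphi_3} = \hat\Gamma^\theta_{\psi_3}$).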
We will now study relations of the operators (17) and (18) with respect to the
conjugation Cθ (see (1)). Recall that Cθ can be expressed as Cθ = Mθ J = J Mθ̄ ,
hence we obtain the following.
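Proposition 6.6 For $\varphi \in L^\infty(\mathbb{T})$,
(a) $\hat T^{\theta,\alpha}_\varphi = C_\alpha\check T_{\alpha\bar\varphi\bar\theta}C_\theta|_{\theta H^2} = C_\alpha\check T_\alpha\check T_{\bar\varphi}\check T_{\bar\theta}C_\theta|_{\theta H^2}$;
(b) $\check T_\varphi = (P^- C_\alpha M_{\bar\theta})|_{\theta H^2}\,\hat T^{\alpha,\theta}_{\bar\varphi}\,(M_\theta C_\alpha)|_{H^2_-}$;
(c) $\hat\Gamma^\theta_\varphi = C_\theta\check\Gamma^\theta_{\bar\varphi}C_\theta|_{\theta H^2}$.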
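Proof The proof of (c) can be found in [2]. A slight modification of the proof of [2,
Proposition 22] (for $\alpha = \theta$) gives

\[ \hat T^{\theta,\alpha}_\varphi = M_\alpha T_{\bar\alpha\varphi\theta}M_{\bar\theta}|_{\theta H^2} = M_\alpha(J P^- J)M_{\bar\alpha\varphi\theta}J(J M_{\bar\theta})|_{\theta H^2} \]
\[ = (M_\alpha J)P^- M_{\alpha\bar\varphi\bar\theta}P^- J M_{\bar\theta}|_{\theta H^2} = C_\alpha\check T_{\alpha\bar\varphi\bar\theta}C_\theta|_{\theta H^2} = C_\alpha\check T_\alpha\check T_{\bar\varphi}\check T_{\bar\theta}C_\theta|_{\theta H^2}, \]

where the last equality follows from (20). Hence (a) holds.

To prove (b) recall that $J M_{\bar\theta} = M_\theta J$ and $J P^+ = P^- J$, which implies that

\[ J T_{\bar\theta} = P^- M_\theta J|_{H^2} \quad\text{and}\quad T_\alpha J|_{H^2_-} = J P^- M_{\bar\alpha}|_{H^2_-}. \]

Since $J = C_\alpha M_\alpha = M_{\bar\theta}C_\theta$, we get by Proposition 6.1

\[ \check T_\varphi = J T_{\bar\varphi}J|_{H^2_-} = J T_{\bar\theta}T_{\bar\alpha\bar\varphi\theta}T_\alpha J|_{H^2_-} = P^- M_\theta J T_{\bar\alpha\bar\varphi\theta}J P^- M_{\bar\alpha}|_{H^2_-} \]
\[ = P^- M_\theta C_\alpha M_\alpha T_{\bar\alpha\bar\varphi\theta}M_{\bar\theta}C_\theta P^- M_{\bar\alpha}|_{H^2_-} = P^- M_\theta C_\alpha\hat T^{\theta,\alpha}_{\bar\varphi}C_\theta M_{\bar\alpha}|_{H^2_-}. \]

The result follows since

\[ M_\theta C_\alpha = M_\alpha C_\theta = C_\theta M_{\bar\alpha} = C_\alpha M_{\bar\theta}. \]

Note that for arbitrary $\varphi \in L^2(\mathbb{T})$ equalities (a)–(c) in Proposition 6.6 hold on
the set of bounded functions.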
Finally, we are ready to give the following characterization of ADTTO’s.
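Theorem 6.7 Let $\theta$ and $\alpha$ be inner functions and let $D \in B(K_\theta^\perp, K_\alpha^\perp)$. Then the
operator $D$ is an asymmetric dual truncated Toeplitz operator, $D \in \mathcal{T}(K_\theta^\perp, K_\alpha^\perp)$, if
and only if the following conditions hold:
(a) $P_{\alpha H^2}D|_{\theta H^2} = \hat T^\alpha_{\bar z}\,P_{\alpha H^2}D|_{\theta H^2}\,\hat T^\theta_z$;
(b) $P^- D|_{H^2_-} = (P^- C_\alpha M_{\bar\theta})|_{\theta H^2}\,(P_{\alpha H^2}D|_{\theta H^2})^*\,(M_\theta C_\alpha)|_{H^2_-}$;
(c) $P^- D|_{\theta H^2}\hat T^\theta_z = \check T_z P^- D|_{\theta H^2}$ and $(P_{\alpha H^2}D|_{H^2_-})^*\hat T^\alpha_z = \check T_z(P_{\alpha H^2}D|_{H^2_-})^*$;
(d) $P^-(D(\theta)) = P^-(\theta\alpha\overline{D^*(\alpha)})$ and $P^-(D^*(\alpha)) = P^-(\theta\alpha\overline{D(\theta)})$.
In that case, $D = D^{\theta,\alpha}_\varphi$ with $\varphi \in L^\infty(\mathbb{T})$ given by

\[ \varphi = \bar\theta P_{\alpha H^2}D(\theta) + \alpha\overline{P_{\theta H^2}(D^*(\alpha))} - \alpha\bar\theta\langle D(\theta), \alpha\rangle. \tag{21} \]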

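Proof Assume firstly that $D = D^{\theta,\alpha}_\varphi$ with $\varphi \in L^\infty(\mathbb{T})$. Then

\[ D^{\theta,\alpha}_\varphi = \begin{bmatrix} \hat T^{\theta,\alpha}_\varphi & \check\Gamma^\alpha_\varphi \\ \hat\Gamma^\theta_\varphi & \check T_\varphi \end{bmatrix} = \begin{bmatrix} \hat T^{\theta,\alpha}_\varphi & (\hat\Gamma^\alpha_{\bar\varphi})^* \\ \hat\Gamma^\theta_\varphi & \check T_\varphi \end{bmatrix}. \]

Note that (a) follows from Theorem 6.5 (a). Moreover, (b) is satisfied by Proposition 6.6 (b)
and (c) is satisfied by Theorem 6.5 (c)–(d).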
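Moreover,

\[ D(\theta) = P_\theta^\perp(\varphi\theta) = \theta P^+(\varphi) + P^-(\varphi\theta) \]

and

\[ D^*(\theta) = P_\theta^\perp(\bar\varphi\theta) = \theta P^+(\bar\varphi) + P^-(\bar\varphi\theta). \]

It follows that $P^-(D(\theta)) = P^-(\varphi\theta)$ and

\[ P^-(\theta^2\overline{D^*(\theta)}) = P^-(\theta\overline{P^+(\bar\varphi)}) = P^-(\varphi\theta) = P^-(D(\theta)). \]

Similarly, $P^-(D^*(\theta)) = P^-(\bar\varphi\theta)$ and

\[ P^-(\theta^2\overline{D(\theta)}) = P^-(\theta\overline{P^+(\varphi)}) = P^-(\bar\varphi\theta) = P^-(D^*(\theta)). \]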

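Thus (d) is also satisfied.

Conversely, assume that $D$ satisfies (a)–(d). By (a) and Theorem 6.5 (a), $P_{\alpha H^2}D|_{\theta H^2} = \hat T^{\theta,\alpha}_\varphi$ with $\varphi \in L^\infty(\mathbb{T})$ given by

\[ \varphi = \bar\theta\,P_{\alpha H^2}D|_{\theta H^2}(\theta) + \alpha\overline{(P_{\alpha H^2}D|_{\theta H^2})^*(\alpha)} - \alpha\bar\theta\langle P_{\alpha H^2}D|_{\theta H^2}(\theta), \alpha\rangle. \tag{22} \]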

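By (b) and Proposition 6.6 (b),

\[ P^- D|_{H^2_-} = (P^- C_\alpha M_{\bar\theta})|_{\theta H^2}\,(P_{\alpha H^2}D|_{\theta H^2})^*\,(M_\theta C_\alpha)|_{H^2_-} = (P^- C_\alpha M_{\bar\theta})|_{\theta H^2}\,\hat T^{\alpha,\theta}_{\bar\varphi}\,(M_\theta C_\alpha)|_{H^2_-} = \check T_\varphi. \]

By (c) and Theorem 6.5 (c)–(d) there are functions $\psi, \chi \in L^\infty(\mathbb{T})$ such that
$P^- D|_{\theta H^2} = \hat\Gamma^\theta_\psi$ with $P^-(\theta\psi) = P^- D|_{\theta H^2}(\theta)$ and $(P_{\alpha H^2}D|_{H^2_-})^* = \hat\Gamma^\alpha_\chi$ with
$P^-(\alpha\chi) = (P_{\alpha H^2}D|_{H^2_-})^*(\alpha)$. We will now use (d) to show that

\[ (\varphi - \psi) \perp \overline{\theta z H^2} \quad\text{and}\quad (\bar\varphi - \chi) \perp \overline{\alpha z H^2}. \tag{23} \]

Since $\varphi$ is given by (22), using the first equality in (d) we get

\[ P^-(\theta\varphi) = P^-\big(\alpha\theta\,\overline{(P_{\alpha H^2}D|_{\theta H^2})^*(\alpha)}\big) = P^-(\alpha\theta\overline{D^*(\alpha)}) = P^-(D(\theta)) = P^- D|_{\theta H^2}(\theta) = P^-(\theta\psi), \]

and so $(\theta\varphi - \theta\psi) \perp H^2_-$. Similarly, using the second equality in (d) we get

\[ P^-(\alpha\bar\varphi) = P^-\big(\alpha\theta\,\overline{P_{\alpha H^2}D|_{\theta H^2}(\theta)}\big) = P^-(\alpha\theta\overline{D(\theta)}) = P^-(D^*(\alpha)) = (P_{\alpha H^2}D|_{H^2_-})^*(\alpha) = P^-(\alpha\chi), \]

hence $(\alpha\bar\varphi - \alpha\chi) \perp H^2_-$. Thus, by (19), we proved that $P^- D|_{\theta H^2} = \hat\Gamma^\theta_\varphi$ and
$(P_{\alpha H^2}D|_{H^2_-})^* = \hat\Gamma^\alpha_{\bar\varphi}$, that is, $P_{\alpha H^2}D|_{H^2_-} = (\hat\Gamma^\alpha_{\bar\varphi})^* = \check\Gamma^\alpha_\varphi$. Therefore,

\[ D = D^{\theta,\alpha}_\varphi = \begin{bmatrix} \hat T^{\theta,\alpha}_\varphi & \check\Gamma^\alpha_\varphi \\ \hat\Gamma^\theta_\varphi & \check T_\varphi \end{bmatrix}. \]

Moreover, by (22), we have

\[ \varphi = \bar\theta\,P_{\alpha H^2}D|_{\theta H^2}(\theta) + \alpha\overline{(P_{\alpha H^2}D|_{\theta H^2})^*(\alpha)} - \alpha\bar\theta\langle P_{\alpha H^2}D|_{\theta H^2}(\theta), \alpha\rangle \]
\[ = \bar\theta P_{\alpha H^2}D(\theta) + \alpha\overline{P_{\theta H^2}D^*(\alpha)} - \alpha\bar\theta\langle D(\theta), \alpha\rangle. \]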

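Remark 6.8 By (21), the symbol $\varphi \in L^\infty(\mathbb{T})$ of an asymmetric dual truncated
Toeplitz operator $D$ can be obtained by calculating $D(\theta)$ and $D^*(\alpha)$. The symbol
$\varphi$ can also be calculated using $D(\bar z)$ and $D^*(\bar z)$. To see this let $\varphi = \varphi^- + \varphi^+$,
$\varphi^- \in H^2_-$, $\varphi^+ \in H^2$, and let $\hat\varphi(0)$ denote the 0-th Fourier coefficient of $\varphi$. Then, by
the fact that $C_\alpha D^{\theta,\alpha}_\varphi C_\theta|_{K_\theta^\perp} = D^{\theta,\alpha}_{\alpha\bar\varphi\bar\theta}$, we have

(a) $\hat\varphi(0) = \langle 1, \bar\varphi\rangle = \langle\alpha, D^{\theta,\alpha}_{\alpha\bar\varphi\bar\theta}\theta\rangle = \langle\alpha, C_\alpha D^{\theta,\alpha}_\varphi C_\theta\theta\rangle = \langle D\bar z, \bar z\rangle$,
(b) $\varphi^+ = P^+(\bar\theta D^{\alpha,\theta}_{\bar\alpha\varphi\theta}\alpha) = P^+(\bar\theta C_\theta D^{\alpha,\theta}_{\bar\varphi}C_\alpha\alpha) = J P^-(D^*\bar z)$,
(c) $\varphi^- = \overline{P^+(\bar\alpha D^{\theta,\alpha}_{\alpha\bar\varphi\bar\theta}\theta)} - \hat\varphi(0) = \overline{P^+(\bar\alpha C_\alpha D^{\theta,\alpha}_\varphi C_\theta\theta)} - \hat\varphi(0) = \overline{J P^-(D(\bar z))} - \hat\varphi(0) = P^-(zD(\bar z))$.
Hence,

\[ \varphi = P^-(zD(\bar z)) + J P^-(D^*\bar z) \in H^2_- + H^2. \tag{24} \]

Note that the decomposition (24) is orthogonal while (21) in general is not.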
Remark 6.9 Let α and θ be nonconstant inner functions. If D ∈ B(Kθ⊥ , Kα⊥ ), then
D is an asymmetric dual truncated Toeplitz operator with an analytic symbol if and
only if D satisfies conditions of Theorem 6.7 and moreover P − (zD(z̄)) = 0. The
last condition means that D(z̄) ⊥ z̄H−2 .

7 A Brown–Halmos Type Theorem for DTTO

It is a classical result of Brown and Halmos [1] that the product of two Toeplitz
operators is zero if and only if at least one of them has a zero symbol. The product
of two Toeplitz operators $T_\varphi T_\psi$ ($\varphi, \psi \in L^\infty(\mathbb{T})$) is a Toeplitz operator if and only
if $\bar\varphi$ or $\psi$ is analytic. They also gave necessary and sufficient conditions for the
commutativity of two Toeplitz operators. They proved that, for $\varphi, \psi \in L^\infty(\mathbb{T})$,
$T_\varphi T_\psi = T_\psi T_\varphi$ if and only if either $\varphi, \psi \in H^2$ or $\bar\varphi, \bar\psi \in H^2$ or a nontrivial linear
combination of $\varphi$ and $\psi$ is constant. In this section we will present similar results
for dual truncated Toeplitz operators. The results are based on the paper [18].
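Let $\varphi \in L^\infty(\mathbb{T})$. Then according to the decomposition $L^2(\mathbb{T}) = H^2 \oplus H^2_-$ the
multiplication operator $M_\varphi$ can be expressed as the operator matrix of the form

\[ \begin{bmatrix} T_\varphi & H^*_{\bar\varphi} \\ H_\varphi & \check T_\varphi \end{bmatrix}, \tag{25} \]

where $H_\varphi : H^2 \to H^2_-$ is the Hankel operator and $H^*_{\bar\varphi} : H^2_- \to H^2$, $H^*_{\bar\varphi}h = P^+(\varphi h)$.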

Assume now that ϕ, ψ ∈ L∞ (T). Since Mϕ Mψ = Mϕψ , we have

Tϕψ = Tϕ Tψ + Hϕ̄∗ Hψ ; (26)

Hϕψ = Hϕ Tψ + Ťϕ Hψ ; (27)

Ťϕψ = Ťϕ Ťψ + Hϕ Hψ̄∗ . (28)
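The identity (26) can be checked numerically on matrix entries when the symbols are trigonometric polynomials, because then every sum involved is finite. The following is only a sketch under that assumption; the helper names `multiply`, `block`, `semicommutator_block` and `check_identity_26` are hypothetical and not part of the source. A symbol is stored as a dictionary of Fourier coefficients, and the matrix of $M_a$ in the basis $\{z^k\}$ has entries $\hat a(i-j)$.

```python
import numpy as np

def multiply(a, b):
    """Fourier coefficients of the product of two trigonometric polynomials."""
    out = {}
    for i, x in a.items():
        for j, y in b.items():
            out[i + j] = out.get(i + j, 0) + x * y
    return out

def block(a, rows, cols):
    """Entries a_hat(i - j) of the multiplication operator M_a, i in rows, j in cols."""
    return np.array([[a.get(i - j, 0) for j in cols] for i in rows], dtype=complex)

def semicommutator_block(phi, psi, n=6):
    """Top-left n x n block of H*_{conj(phi)} H_psi (the term T_{phi psi} - T_phi T_psi)."""
    d = max(1, max(abs(k) for k in list(phi) + list(psi)))
    out = list(range(n))
    neg = list(range(-d, 0))  # indices below -d contribute nothing for degree <= d
    return block(phi, out, neg) @ block(psi, neg, out)

def check_identity_26(phi, psi, n=6):
    """Compare the top-left n x n blocks of T_{phi psi} and T_phi T_psi + H*_{conj(phi)} H_psi."""
    d = max(1, max(abs(k) for k in list(phi) + list(psi)))
    out = list(range(n))
    pos = list(range(n + d))  # intermediate analytic indices that can contribute
    lhs = block(multiply(phi, psi), out, out)
    rhs = block(phi, out, pos) @ block(psi, pos, out) + semicommutator_block(phi, psi, n)
    return np.allclose(lhs, rhs)
```

For instance, with $\varphi = z$ and $\psi = \bar z$ the block returned by `semicommutator_block` is the rank-one matrix $1 \otimes 1$, in line with the classical identity $T_z T_{\bar z} = I - 1 \otimes 1$, while for $\varphi = \bar z$, $\psi = z$ it vanishes, as expected when $\bar\varphi$ is analytic.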

Note that

J Hϕ J = Hϕ∗ (29)

and

J Ťϕ J = Tϕ̄ . (30)

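Theorem 7.1 Let $\theta$ be a nonconstant inner function and let $\varphi, \psi \in L^\infty(\mathbb{T})$. If
$D^\theta_\varphi D^\theta_\psi = D^\theta_{\varphi\psi}$, then one of the following conditions holds:

(a) $\varphi, \psi \in H^2$;
(b) $\bar\varphi, \bar\psi \in H^2$;
(c) either $\varphi$ or $\psi$ is constant.
Let $\lambda \in \mathbb{D}$ be fixed and let $K_\lambda = \dfrac{(1-|\lambda|^2)^{1/2}}{1 - z\bar\lambda}$ be the normalized reproducing kernel.
For each function $f \in L^2(\mathbb{T})$, denote $f_+ = P^+ f$ and $f_- = P^- f$.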
Lemma 7.2 Let θ be a nonconstant inner function, and let ϕ, ψ ∈ L∞ (T). Then

Tθ̄ϕ Tθψ = Tϕ Tψ (31)

if and only if ϕ̄ ∈ H 2 or ψ ∈ H 2 .

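Proof Assume $T_\varphi T_\psi = T_{\bar\theta\varphi}T_{\theta\psi}$. Since $T_{\bar\theta}T_\varphi = T_{\bar\theta\varphi}$ and $T_\psi T_\theta = T_{\theta\psi}$, we have $T_\varphi T_\psi =
T_{\bar\theta^n\varphi}T_{\theta^n\psi}$ for each positive integer $n$. Hence

\[ \langle T_\varphi T_\psi K_\lambda, K_\lambda\rangle = \langle T_{\bar\theta^n\varphi}T_{\theta^n\psi}K_\lambda, K_\lambda\rangle. \]

Now we will use the properties of the Berezin transform. As in [20, proof of
Theorem 4.3], we have

\[ \varphi_+(\lambda)\psi_-(\lambda) - [\bar\theta^n\varphi]_+(\lambda)[\theta^n\psi]_-(\lambda) = \big\langle([\bar\theta^n\varphi]_-[\theta^n\psi]_+ - \varphi_-\psi_+)K_\lambda, K_\lambda\big\rangle \]
\[ \qquad + [\bar\theta^n\varphi]_+(\lambda)[\theta^n\psi]_+(\lambda) + [\bar\theta^n\varphi]_-(\lambda)[\theta^n\psi]_-(\lambda) - \varphi_+(\lambda)\psi_+(\lambda) - \varphi_-(\lambda)\psi_-(\lambda). \]

The right hand side of the above equation is harmonic on $\mathbb{D}$. Thus the left hand side
is harmonic on $\mathbb{D}$ too. For any harmonic function $h(\lambda)$ we have $\langle hK_\lambda, K_\lambda\rangle = h(\lambda)$.
Therefore

\[ \big\langle(\varphi_+\psi_- - [\bar\theta^n\varphi]_+[\theta^n\psi]_-)K_\lambda, K_\lambda\big\rangle = \varphi_+(\lambda)\psi_-(\lambda) - [\bar\theta^n\varphi]_+(\lambda)[\theta^n\psi]_-(\lambda). \]

Since $T_\theta$ is an isometry on $H^2$, by the Wold decomposition, we have

\[ H^2 = \bigcap_{n=0}^\infty \theta^n H^2 \oplus \bigoplus_{n=0}^\infty \theta^n K_\theta. \]

An isometry $T$ on $H^2$ is pure if $\bigcap_{n=0}^\infty T^n H^2 = \{0\}$. A Toeplitz operator with an
analytic symbol is a pure isometry if and only if its symbol is a nonconstant inner
function. Thus

\[ H^2 = \bigoplus_{n=0}^\infty \theta^n K_\theta. \]

For any $\varphi \in L^2(\mathbb{T}) = H^2 \oplus \overline{zH^2}$, there are $\{x_j, y_j\} \subseteq K_\theta$ such that

\[ \varphi = \sum_{j=0}^\infty \theta^j x_j + \bar z\sum_{l=0}^\infty \bar\theta^l\bar y_l \]

with $\|\varphi\|^2 = \sum_{j=0}^\infty\|x_j\|^2 + \sum_{l=0}^\infty\|y_l\|^2$. Hence

\[ P^+\bar\theta^n\varphi = P^+\bar\theta^n\sum_{j=0}^\infty\theta^j x_j = P^+\Big(\bar\theta^n\sum_{j\ge n}\theta^j x_j\Big) + T_{\bar\theta^n}\Big(\sum_{j=0}^{n-1}\theta^j x_j\Big). \]

Note that $K_\theta = \ker T_{\bar\theta}$, thus

\[ T_{\bar\theta^n}\Big(\sum_{j=0}^{n-1}\theta^j x_j\Big) = \sum_{j=0}^{n-1}T_{\bar\theta^{n-j-1}}T_{\bar\theta}x_j = 0. \]

Therefore

\[ P^+\bar\theta^n\varphi = [\bar\theta^n\varphi]_+ = P^+\Big(\bar\theta^n\sum_{j\ge n}\theta^j x_j\Big) \]

and

\[ \|P^+\bar\theta^n\varphi\| \le \Big\|\sum_{j\ge n}\theta^j x_j\Big\| = \Big(\sum_{j\ge n}\|x_j\|^2\Big)^{1/2}. \]

In consequence we have that $\lim_{n\to\infty}\|P^+\bar\theta^n\varphi\| = 0$. For any fixed $\lambda \in \mathbb{D}$,

\[ |[\bar\theta^n\varphi]_+(\lambda)| = |\langle P^+(\bar\theta^n\varphi), k_\lambda\rangle| \le \|P^+(\bar\theta^n\varphi)\|\,\|k_\lambda\| \]

and

\[ |[\theta^n\psi]_-(\lambda)| \le \|[\theta^n\psi]_-\|\,\|k_\lambda\| \le \|\psi\|\,\|k_\lambda\|, \]

where $k_\lambda$ is the Hardy reproducing kernel. Hence,

\[ \lim_{n\to\infty}[\bar\theta^n\varphi]_+(\lambda)[\theta^n\psi]_-(\lambda) = 0. \]

Moreover,

\[ \big|\langle[\bar\theta^n\varphi]_+[\theta^n\psi]_-K_\lambda, K_\lambda\rangle\big| = \Big|\int_{\partial\mathbb{D}}[\bar\theta^n\varphi]_+[\theta^n\psi]_-|K_\lambda|^2\,dm\Big| \le \frac{1+|\lambda|}{1-|\lambda|}\,\|P^+\bar\theta^n\varphi\|\,\|\psi\|, \]

so

\[ \lim_{n\to\infty}\langle[\bar\theta^n\varphi]_+[\theta^n\psi]_-K_\lambda, K_\lambda\rangle = 0. \]

Thus for every fixed $\lambda \in \mathbb{D}$, as $n \to \infty$, we obtain

\[ \langle\varphi_+\psi_-K_\lambda, K_\lambda\rangle = \varphi_+(\lambda)\psi_-(\lambda). \]

Hence $\varphi_+(\lambda)\psi_-(\lambda)$ is harmonic on $\mathbb{D}$. It follows that $\bar\varphi$ or $\psi$ is analytic on $\mathbb{D}$.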


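Proof of Theorem 7.1 By assumption, for any $f \in H^2$ we have

\[ P^+ D^\theta_\varphi D^\theta_\psi(\theta f) = P^+ D^\theta_{\varphi\psi}(\theta f), \]

thus

\[ P_{\theta H^2}\varphi P_\theta^\perp\psi\theta f = P_{\theta H^2}\varphi\psi\theta f. \]

We have

\[ P_{\theta H^2}\varphi P_\theta^\perp\psi\theta f = \theta P^+\big(\bar\theta\varphi(I - P^+ + \theta P^+\bar\theta)(\theta\psi f)\big) = \theta\big(P^+(\varphi\psi f) - P^+(\bar\theta\varphi P^+(\theta\psi f)) + P^+(\varphi P^+(\psi f))\big). \]

Hence

\[ \theta\big(P^+(\varphi\psi f) - P^+(\bar\theta\varphi P^+(\theta\psi f)) + P^+(\varphi P^+(\psi f))\big) = \theta P^+(\bar\theta\varphi\psi\theta f), \]

which gives

\[ P^+(\bar\theta\varphi P^+(\theta\psi f)) = P^+(\varphi P^+(\psi f)). \]

Thus $T_{\bar\theta\varphi}T_{\theta\psi} = T_\varphi T_\psi$, so by Lemma 7.2 $\bar\varphi \in H^2$ or $\psi \in H^2$.

Let now $g \in H^2_-$; then by assumption $P^- D^\theta_\varphi D^\theta_\psi g = P^- D^\theta_{\varphi\psi}g$. Hence
$P^-(\varphi P_\theta^\perp(\psi g)) = P^-(\varphi\psi g)$. It follows that

\[ 0 = P^-(\varphi\theta P^+(\bar\theta\psi g)) + P^-(\varphi P^-(\psi g)) - P^-(\varphi\psi g) = P^-(\varphi\theta P^+(\bar\theta\psi g)) - P^-(\varphi P^+(\psi g)). \]

Thus

\[ H_{\theta\varphi}H^*_{\theta\bar\psi} = H_\varphi H^*_{\bar\psi}. \tag{32} \]

By (28) we obtain that

\[ \check T_{\varphi\psi} - \check T_\varphi\check T_\psi = \check T_{\theta\varphi\bar\theta\psi} - \check T_{\theta\varphi}\check T_{\bar\theta\psi}, \]

which implies that

\[ \check T_\varphi\check T_\psi = \check T_{\theta\varphi}\check T_{\bar\theta\psi}. \]

Using the property (30) of the conjugation $J$ we get

\[ T_{\bar\varphi}T_{\bar\psi} = J\check T_\varphi JJ\check T_\psi J = J\check T_{\theta\varphi}JJ\check T_{\bar\theta\psi}J = T_{\bar\theta\bar\varphi}T_{\theta\bar\psi}. \]

So, using Lemma 7.2 again, we have that $\varphi \in H^2$ or $\bar\psi \in H^2$.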


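Lemma 7.3 Let $\theta$ be a nonconstant inner function and let $\varphi, \psi \in H^\infty$. The
following conditions are equivalent:
(a) $D^\theta_{\bar\varphi}D^\theta_{\bar\psi} = D^\theta_{\overline{\varphi\psi}}$,
(b) $D^\theta_\psi D^\theta_\varphi = D^\theta_{\varphi\psi}$,
(c) $H_{\bar\varphi}H^*_{\bar\theta}H_{\bar\psi} = 0$.

Proof Note that (a) is equivalent to (b), since $(D^\theta_\varphi)^* = D^\theta_{\bar\varphi}$. Using (16) in that case
we obtain

\[ D^\theta_{\bar\varphi} = \begin{bmatrix} \hat T^\theta_{\bar\varphi} & 0 \\ \hat\Gamma^\theta_{\bar\varphi} & \check T_{\bar\varphi} \end{bmatrix}, \qquad D^\theta_{\bar\psi} = \begin{bmatrix} \hat T^\theta_{\bar\psi} & 0 \\ \hat\Gamma^\theta_{\bar\psi} & \check T_{\bar\psi} \end{bmatrix}, \]

and

\[ D^\theta_{\overline{\varphi\psi}} = \begin{bmatrix} \hat T^\theta_{\overline{\varphi\psi}} & 0 \\ \hat\Gamma^\theta_{\overline{\varphi\psi}} & \check T_{\overline{\varphi\psi}} \end{bmatrix}. \]

Since for $\varphi, \psi \in H^\infty$ we have $\hat T^\theta_{\bar\varphi}\hat T^\theta_{\bar\psi} = \hat T^\theta_{\overline{\varphi\psi}}$ and $\check T_{\bar\varphi}\check T_{\bar\psi} = \check T_{\overline{\varphi\psi}}$, it follows that $D^\theta_{\bar\varphi}D^\theta_{\bar\psi} =
D^\theta_{\overline{\varphi\psi}}$ if and only if

\[ \hat\Gamma^\theta_{\bar\varphi}\hat T^\theta_{\bar\psi} + \check T_{\bar\varphi}\hat\Gamma^\theta_{\bar\psi} = \hat\Gamma^\theta_{\overline{\varphi\psi}}. \tag{33} \]

Note that for any $f \in H^2$

\[ \hat\Gamma^\theta_{\bar\varphi}\hat T^\theta_{\bar\psi}(\theta f) + \check T_{\bar\varphi}\hat\Gamma^\theta_{\bar\psi}(\theta f) = P^-(\bar\varphi\theta P^+(\bar\theta\bar\psi\theta f)) + \check T_{\bar\varphi}P^-(\bar\psi\theta f) = (H_{\theta\bar\varphi}T_{\bar\psi} + \check T_{\bar\varphi}H_{\theta\bar\psi})f \]

and

\[ \hat\Gamma^\theta_{\overline{\varphi\psi}}(\theta f) = P^-(\overline{\varphi\psi}\theta f) = H_{\theta\overline{\varphi\psi}}f. \]

Hence (33) is equivalent to

\[ H_{\theta\bar\varphi}T_{\bar\psi} + \check T_{\bar\varphi}H_{\theta\bar\psi} = H_{\theta\overline{\varphi\psi}}. \tag{34} \]

Note that $H_{\theta\bar\varphi} = H_{\bar\varphi}T_\theta$ and by (26) $T_\theta T_{\bar\psi} = T_{\theta\bar\psi} - H^*_{\bar\theta}H_{\bar\psi}$. Therefore by (27)

\[ H_{\theta\bar\varphi}T_{\bar\psi} + \check T_{\bar\varphi}H_{\theta\bar\psi} = H_{\bar\varphi}T_\theta T_{\bar\psi} + \check T_{\bar\varphi}H_{\theta\bar\psi} = H_{\bar\varphi}(T_{\theta\bar\psi} - H^*_{\bar\theta}H_{\bar\psi}) + \check T_{\bar\varphi}H_{\theta\bar\psi} \]
\[ = H_{\bar\varphi}T_{\theta\bar\psi} + \check T_{\bar\varphi}H_{\theta\bar\psi} - H_{\bar\varphi}H^*_{\bar\theta}H_{\bar\psi} = H_{\theta\bar\varphi\bar\psi} - H_{\bar\varphi}H^*_{\bar\theta}H_{\bar\psi}. \tag{35} \]

Hence (34) holds if and only if $H_{\bar\varphi}H^*_{\bar\theta}H_{\bar\psi} = 0$.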

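Let $\lambda \in \mathbb{D}$. Denote by

\[ \omega_\lambda(z) = \frac{\lambda - z}{1 - \bar\lambda z} \]

the Möbius transform. We recall the next lemma without proof.

Lemma 7.4 [21, Lemma 2.2] For $f_1, f_2, f_3 \in L^\infty(\mathbb{T})$ and $\lambda \in \mathbb{D}$ we have

\[ T^*_{\omega_\lambda}H^*_{f_1}H_{f_2}H^*_{f_3}\check T_{\omega_\lambda} - H^*_{f_1}H_{f_2}H^*_{f_3} = -(H^*_{f_1}H_{f_2}H^*_{f_3}JK_\lambda) \otimes (JK_\lambda) \]
\[ \qquad - (JH_{f_1}K_\lambda) \otimes (JT_{\omega_\lambda}T^*_{\omega_\lambda}H^*_{f_3}H_{f_2}K_\lambda) + (T^*_{\omega_\lambda}H^*_{f_1}H_{f_2}K_\lambda) \otimes (\check T^*_{\omega_\lambda}H_{f_3}K_\lambda). \]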

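Lemma 7.5 Let $\theta$ be a nonconstant inner function and let $\varphi, \psi \in H^\infty$. If neither
$\varphi$ nor $\psi$ is constant, then $H_{\bar\varphi}H^*_{\bar\theta}H_{\bar\psi} = 0$ if and only if $\bar\varphi(\theta - \lambda)$, $\bar\psi(\theta - \lambda)$ and
$\overline{\varphi\psi}(\theta - \lambda)$ are analytic for some $\lambda \in \mathbb{C}$.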
Proof Assume that Hϕ̄ Hθ̄∗ Hψ̄ = 0. Using a property of the conjugation J we have
that

0 = J Hϕ̄ Hθ̄∗ Hψ̄ J = Hϕ̄∗ Hθ̄ Hψ̄∗ .

Applying Lemma 7.4 with ω0 (z) = −z we have

Tz∗ Hϕ̄∗ Hθ̄ Hψ̄∗ Ťz − Hϕ̄∗ Hθ̄ Hψ̄∗ = − (Hϕ̄∗ Hθ̄ Hψ̄∗ J 1) ⊗ (J 1)

− (J Hϕ̄ 1) ⊗ (J Tz Tz∗ Hψ̄∗ Hθ̄ 1)

+ (Tz∗ Hϕ̄∗ Hθ̄ 1) ⊗ (Ťz∗ Hψ̄ 1).

Since Hϕ̄∗ Hθ̄ Hψ̄∗ = 0, thus

(Tz∗ Hϕ̄∗ Hθ̄ 1) ⊗ (Ťz∗ Hψ̄ 1) = (J Hϕ̄ 1) ⊗ (J Tz Tz∗ Hψ̄∗ Hθ̄ 1). (36)
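
Note that $\check T_z^*H_{\bar\psi}1 = \bar z\,\overline{(\psi(z) - \psi(0))}$ and $JH_{\bar\varphi}1 = \bar z(\varphi(z) - \varphi(0))$ are not zero. Let
$f \in K_\theta^\perp$ be such that $\langle f, \check T_z^*H_{\bar\psi}1\rangle \neq 0$. Then

\[ (T_z^*H^*_{\bar\varphi}H_{\bar\theta}1) \otimes (\check T_z^*H_{\bar\psi}1)f = \langle f, \check T_z^*H_{\bar\psi}1\rangle\,T_z^*H^*_{\bar\varphi}H_{\bar\theta}1 \]

and

\[ (JH_{\bar\varphi}1) \otimes (JT_zT_z^*H^*_{\bar\psi}H_{\bar\theta}1)f = \langle f, JT_zT_z^*H^*_{\bar\psi}H_{\bar\theta}1\rangle\,JH_{\bar\varphi}1, \]

so by (36)

\[ T_z^*H^*_{\bar\varphi}H_{\bar\theta}1 = \frac{\langle f, JT_zT_z^*H^*_{\bar\psi}H_{\bar\theta}1\rangle}{\langle f, \check T_z^*H_{\bar\psi}1\rangle}\,JH_{\bar\varphi}1 = \lambda_1 JH_{\bar\varphi}1, \tag{37} \]

where $\lambda_1 = \dfrac{\langle f, JT_zT_z^*H^*_{\bar\psi}H_{\bar\theta}1\rangle}{\langle f, \check T_z^*H_{\bar\psi}1\rangle}$. Then (36) is equivalent to

\[ (\lambda_1 JH_{\bar\varphi}1) \otimes (\check T_z^*H_{\bar\psi}1) = (JH_{\bar\varphi}1) \otimes (JT_zT_z^*H^*_{\bar\psi}H_{\bar\theta}1); \]
\[ (JH_{\bar\varphi}1) \otimes (\bar\lambda_1\check T_z^*H_{\bar\psi}1) = (JH_{\bar\varphi}1) \otimes (JT_zT_z^*H^*_{\bar\psi}H_{\bar\theta}1). \]

Hence

\[ JT_zT_z^*H^*_{\bar\psi}H_{\bar\theta}1 = \bar\lambda_1\check T_z^*H_{\bar\psi}1. \tag{38} \]

Since $JT_zJ = \check T_z^*$ and the kernel of $T_z$ is zero, the kernel of $\check T_z^*$ is also zero.
Thus (38) is equivalent to

\[ JT_zT_z^*H^*_{\bar\psi}H_{\bar\theta}1 = \bar\lambda_1 JT_zJH_{\bar\psi}1 = JT_z\lambda_1 JH_{\bar\psi}1, \]

that is,

\[ T_z^*H^*_{\bar\psi}H_{\bar\theta}1 = \lambda_1 JH_{\bar\psi}1. \]

On the other hand,

\[ T_z^*H^*_{\bar\psi}H_{\bar\theta}1 = P^+(\bar zP^+(\psi P^-\bar\theta)) = P^+\big(\bar z\psi(\bar\theta - \overline{\theta(0)})\big) \]

and

\[ \lambda_1 JH_{\bar\psi}1 = \lambda_1 JP^-\bar\psi = \lambda_1P^+(\bar z\psi). \]

Therefore $P^+\big(\bar z\psi(\bar\theta - \overline{\theta(0)} - \lambda_1)\big) = 0$, which implies that $\bar z\psi(\bar\theta - \bar\lambda) \in \overline{zH^2}$, i.e.,
$\bar\psi(\theta - \lambda) \in H^2$, where $\lambda = \bar\lambda_1 + \theta(0)$.
Similarly, using (37) one can show that $\bar\varphi(\theta - \lambda) \in H^2$. To prove the last
condition note that

\[ 0 = H_{\bar\varphi}H^*_{\bar\theta}H_{\bar\psi}1 = H_{\bar\varphi}H^*_{\bar\theta - \bar\lambda}H_{\bar\psi}1 = P^-\big(\bar\varphi P^+\big((\theta - \lambda)(\bar\psi - \overline{\psi(0)})\big)\big). \]

Since $(\theta - \lambda)(\bar\psi - \overline{\psi(0)}) \in H^2$, we get

\[ 0 = P^-\big(\bar\varphi(\theta - \lambda)(\bar\psi - \overline{\psi(0)})\big) = P^-(\bar\varphi\bar\psi(\theta - \lambda)) - \overline{\psi(0)}P^-(\bar\varphi(\theta - \lambda)). \]

As we already have $\bar\varphi(\theta - \lambda) \in H^2$, it follows that $\bar\varphi\bar\psi(\theta - \lambda) \in H^2$.

For the converse implication assume that there is $\lambda \in \mathbb{C}$ such that $\bar\varphi(\theta - \lambda)$,
$\bar\psi(\theta - \lambda)$ and $\overline{\varphi\psi}(\theta - \lambda)$ are analytic. For a reproducing kernel $k_w$ in $H^2$, $w \in \mathbb{D}$,
we have

\[ H_{\bar\varphi}H^*_{\bar\theta}H_{\bar\psi}k_w = H_{\bar\varphi}H^*_{\bar\theta - \bar\lambda}H_{\bar\psi}k_w = P^-\big(\bar\varphi P^+((\theta - \lambda)P^-(\bar\psi k_w))\big) \]
\[ = P^-\big(\bar\varphi P^+((\theta - \lambda)(\bar\psi k_w - \overline{\psi(w)}k_w))\big) = P^-\big(\bar\varphi(\theta - \lambda)(\bar\psi - \overline{\psi(w)})k_w\big) \]
\[ = P^-(\bar\varphi\bar\psi(\theta - \lambda)k_w) - \overline{\psi(w)}P^-(\bar\varphi(\theta - \lambda)k_w) = 0. \]

Since the set $\{k_w : w \in \mathbb{D}\}$ is linearly dense in $H^2$, we see that $H_{\bar\varphi}H^*_{\bar\theta}H_{\bar\psi} = 0$.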

As a conclusion of Theorem 7.1, Lemma 7.3 and Lemma 7.5 we have the
following.
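Theorem 7.6 Let $\theta$ be a nonconstant inner function and $\varphi, \psi \in L^\infty(\mathbb{T})$. Then
$D^\theta_\varphi D^\theta_\psi = D^\theta_{\varphi\psi}$ if and only if one of the following conditions holds:

(a) $\varphi, \psi, \bar\varphi(\theta - \lambda), \bar\psi(\theta - \lambda), \overline{\varphi\psi}(\theta - \lambda) \in H^2$ for some constant $\lambda$,
(b) $\bar\varphi, \bar\psi, \varphi(\theta - \lambda), \psi(\theta - \lambda), \varphi\psi(\theta - \lambda) \in H^2$ for some constant $\lambda$,
(c) either $\varphi$ or $\psi$ is constant.
Example Let $\alpha, \beta$ be nonconstant inner functions and let $\theta = \alpha\beta$. It is easy to see
that condition (a) of Theorem 7.6 is then satisfied for the operators $D^\theta_\alpha$ and $D^\theta_\beta$.
Hence

\[ D^\theta_\alpha D^\theta_\beta = D^\theta_{\alpha\beta}. \]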
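The check behind this example is short. The following worked computation is a sketch that takes $\lambda = 0$ in condition (a) of Theorem 7.6 and uses only that $|\alpha| = |\beta| = 1$ almost everywhere on $\mathbb{T}$:

```latex
\begin{align*}
\varphi = \alpha,\quad \psi = \beta,\quad \theta = \alpha\beta,\quad \lambda = 0:\qquad
\bar\varphi(\theta - \lambda) &= \bar\alpha\,\alpha\beta = \beta \in H^2,\\
\bar\psi(\theta - \lambda) &= \bar\beta\,\alpha\beta = \alpha \in H^2,\\
\overline{\varphi\psi}(\theta - \lambda) &= \overline{\alpha\beta}\,\alpha\beta = 1 \in H^2 .
\end{align*}
```

Together with $\varphi = \alpha \in H^2$ and $\psi = \beta \in H^2$, these are the five memberships required in (a).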

Theorem 7.7 Let $\theta$ be a nonconstant inner function and let $\varphi, \psi \in L^\infty(\mathbb{T})$. Then

\[ D^\theta_\varphi D^\theta_\psi = D^\theta_\psi D^\theta_\varphi \tag{39} \]

if and only if one of the following conditions holds:

(a) $\varphi, \psi, \bar\varphi(\theta - \lambda), \bar\psi(\theta - \lambda) \in H^2$ for some constant $\lambda$,
(b) $\bar\varphi, \bar\psi, \varphi(\theta - \lambda), \psi(\theta - \lambda) \in H^2$ for some constant $\lambda$,
(c) a nontrivial linear combination of $\varphi$ and $\psi$ is constant.
The following easy example shows that the assumption that both symbols of dual
truncated Toeplitz operators are analytic is not sufficient for their commutativity.
Example Let $\theta$ be an inner function such that $\theta(0) = 0$. Note that $D^\theta_zD^\theta_{\theta z}\bar z = \theta z$
and $D^\theta_{\theta z}D^\theta_z\bar z = 0$. Hence even if both symbols $z$, $\theta z$ are analytic, $D^\theta_zD^\theta_{\theta z} \neq D^\theta_{\theta z}D^\theta_z$.
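The two values in this example can be traced step by step. The sketch below assumes, as in the proof of Theorem 6.7, that $D^\theta_\varphi f = P_\theta^\perp(\varphi f)$ for $f \in K_\theta^\perp = \theta H^2 \oplus H^2_-$, and uses the standard fact that the projection of the constant function $1$ onto $K_\theta$ is $1 - \overline{\theta(0)}\theta$:

```latex
\begin{align*}
D^\theta_{\theta z}\bar z &= P_\theta^\perp(\theta z\bar z) = P_\theta^\perp(\theta) = \theta,
 & D^\theta_z\theta &= P_\theta^\perp(z\theta) = z\theta,\\
D^\theta_z\bar z &= P_\theta^\perp(1) = 1 - \big(1 - \overline{\theta(0)}\theta\big) = \overline{\theta(0)}\,\theta = 0,
 & D^\theta_{\theta z}0 &= 0 .
\end{align*}
```

The first row gives $D^\theta_zD^\theta_{\theta z}\bar z = \theta z$, and the second row gives $D^\theta_{\theta z}D^\theta_z\bar z = 0$ when $\theta(0) = 0$.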

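Comparing Theorems 7.6 and 7.7 note that if $D^\theta_\varphi D^\theta_\psi = D^\theta_{\varphi\psi}$, then the operators $D^\theta_\varphi$
and $D^\theta_\psi$ commute. However, the converse is not true.
Example Let $\alpha, \beta$ be nonconstant inner functions and let $\theta = \alpha\beta$. Note that then by
Theorem 7.7 the operators $D^\theta_\theta$ and $D^\theta_\alpha$ commute. On the other hand, $\bar\alpha\bar\theta\theta = \bar\alpha$ is not
analytic, hence, by Theorem 7.6, $D^\theta_\theta D^\theta_\alpha \neq D^\theta_{\theta\alpha}$.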

Acknowledgments The work of the first author was partially supported by FCT/Portugal through
UID/MAT/04459/2020. The research of the second and the third authors was financed by the
Ministry of Science and Higher Education of the Republic of Poland.

References

1. A. Brown, P. Halmos, Algebraic properties of Toeplitz operators. J. Reine Angew. Math. 213,
89–102 (1964)
2. M.C. Câmara, K. Kliś-Garlicka, B. Łanucha, M. Ptak, Compressions of multiplication
operators and their characterizations. Results Math. 75, 157 (2020). https://doi.org/10.1007/s00025-020-01283-4
3. C. Câmara, K. Kliś-Garlicka, B. Łanucha, M. Ptak, Conjugations in L2 (T) and their invariants.
Anal. Math. Phys. 10, 22 (2020). https://doi.org/10.1007/s13324-020-00364-5
4. C. Câmara, K. Kliś-Garlicka, B. Łanucha, M. Ptak, Intertwining property for compressions of
multiplication operators (2020). arXiv:2012.05330
5. M.C. Câmara, K. Kliś-Garlicka, B. Łanucha, M. Ptak, Invertibility, Fredholmness and kernels
of dual truncated Toeplitz operators. Banach J. Math. Anal. 14, 1558–1580 (2020)
6. M.C. Câmara, K. Kliś-Garlicka, B. Łanucha, M. Ptak, Shift invariance and reflexivity of
compressions of multiplication operators. Forum Math. (2022). https://doi.org/10.1515/forum-2021-0129
7. X. Ding, Y. Sang, Dual truncated Toeplitz operators. J. Math. Anal. Appl. 461, 929–946 (2018)
8. R.G. Douglas, Banach Algebra Techniques in Operator Theory, vol. 179 (Springer, Berlin,
2012)
9. P.L. Duren, Theory of H p Spaces. Pure and Applied Mathematics, vol. 38 (Academic Press,
New York, 1970)
10. S.R. Garcia, J.E. Mashreghi, W. Ross, Introduction to Model Spaces and Their Operators.
Cambridge Studies in Advanced Mathematics, vol. 148 (Cambridge University Press, Cam-
bridge, 2016)
11. S.R. Garcia, M. Putinar, Complex symmetric operators and applications. Trans. Am. Math.
Soc. 358, 1285–1315 (2006)

12. S. Garcia, W.T. Ross, Model Spaces: A Survey. Invariant Subspaces of the Shift Operator.
Contemp. Math. vol. 638 (Am. Math. Soc., Providence, RI, 2015), pp. 197–245
13. S. Garcia, W.T. Ross, W.R. Wogen, C*-Algebras Generated by Truncated Toeplitz Operators,
vol. 8 (Math and Computer Science Faculty Publications, 2014). http://scholarship.richmond.edu/mathcs-faculty-publications/8
14. P. Koosis, Introduction to H p Spaces, 2nd edn. (Cambridge University Press, Cambridge,
1998)
15. P. Ma, F. Yan, D. Zhang, Zero, finite rank, and compact big truncated Hankel operators on
model spaces. Proc. Am. Math. Soc. 146, 5235–5242 (2018)
16. V.V. Peller, Hankel Operators and Their Applications (Springer, New York, 2003)
17. M. Ptak, K. Simik, A. Wicher, C–normal operators. Electron. J. Linear Algebra 36, 67–79
(2020)
18. Y. Sang, Y. Qin, X. Ding, A theorem of Brown–Halmos type for dual truncated Toeplitz
operators. Ann. Funct. Anal. 11, 271–284 (2020)
19. D. Sarason, Algebraic properties of truncated Toeplitz operators, Oper. Matrices 1, 491–526
(2007)
20. K. Stroethoff, The Berezin transform and operators on spaces of analytic functions. Banach
Center Publ. Linear Oper. 85, 361–380 (1997)
21. D. Xia, D. Zheng, Products of Hankel operators. Integr. Equ. Oper. Theory 29, 339–363 (1997)
Boundedness of Toeplitz Operators
in Bergman-Type Spaces

Jari Taskinen and Jani A. Virtanen

Abstract The characterization of the bounded Toeplitz operators $T_a$ in Bergman
spaces is an open problem even in the simplest case of the unweighted Bergman-Hilbert
space $A^2(\mathbb{D})$. We consider here recent partial results on the topic. These
include sufficient conditions for the boundedness and compactness of $T_a$ in terms of
weak Carleson-type conditions for the symbol $a$. The results were recently generalized
to the case of spaces on the unit ball $\mathbb{B}_N$ of $\mathbb{C}^N$. The second approach is based
on certain results on the structure of the Bergman spaces, namely, representations of
their weighted norms using finite-dimensional decompositions of the spaces. This
approach provides a characterization of the boundedness and compactness in the
case of operators in spaces with weighted sup-norms.

Keywords Bergman space · Weighted norm · Toeplitz operator · Little Hankel

operator · Bounded operator · Compact operator

1 Introduction: The Spaces and Operators

The focus of this article is on recent results on the boundedness of Toeplitz operators
on weighted Bergman spaces of holomorphic functions, mainly on the open unit
disk D of the complex plane C, although some of the results are also formulated on
the unit ball BN of CN , N = 2, 3, . . . . The related question on the compactness is
only considered when it can be dealt with parallel to boundedness, and certain more
special recent results for compactness will remain out of this review.
We will concentrate on two circles of ideas. First, we deal with Toeplitz
operators with oscillating symbols and weak Carleson-type sufficient conditions for

J. Taskinen
Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland
J. A. Virtanen
Department of Mathematics and Statistics, University of Reading, Reading, England

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 461
R. M. Aron et al. (eds.), Operator and Norm Inequalities and Related Topics,
Trends in Mathematics, https://doi.org/10.1007/978-3-031-02104-6_14

boundedness. The starting point of this direction of research is the article [30]. The
second approach applies to operators with radial symbols, and it is based on the
results on the structure of weighted Bergman spaces which were pioneered in the
works of W. Lusky [17–19] and adapted to the study of Toeplitz operators recently in
the papers [4, 5]. This led to a characterization of the boundedness and compactness
of Toeplitz operators in weighted H ∞ -spaces.
Let us present the basic notation and definitions. The notation concerning the
spaces on the unit ball BN will only be needed and thus given at the end of Sect. 2.
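The normalized area measure on $\mathbb{D}$ is denoted by $dA = \pi^{-1}r\,dr\,d\theta$, where $r$ and $\theta$
are the polar coordinates of $z = re^{i\theta} \in \mathbb{C}$. Given $1 \le p < \infty$ and the real parameter
$\alpha > -1$ we define the weighted area measure by $dA_\alpha(z) = (1+\alpha)(1-r^2)^\alpha\,dA(z)$
and set

\[ L^p_\alpha(\mathbb{D}) = \Big\{g : \mathbb{D} \to \mathbb{C} \text{ measurable} : \|g\|^p_{p,\alpha} := \int_{\mathbb{D}}|g|^p\,dA_\alpha < \infty\Big\} \quad\text{and} \]

\[ A^p_\alpha(\mathbb{D}) = \{g \in L^p_\alpha(\mathbb{D}) : g \text{ holomorphic}\}; \]

in the case $\alpha = 0$ these spaces are denoted by $L^p(\mathbb{D})$ and $A^p(\mathbb{D})$, respectively. Here, the
functions $v(z) = (1-|z|^2)^\alpha$ are called standard weights.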
We will also consider more general weighted Bergman spaces and their analogue,
the weighted Hardy space $H^\infty_v$ corresponding to $p = \infty$. In general, by a weight
$v$ we mean a continuous function $\mathbb{D} \to\, ]0, \infty[$ which is radial, vanishes on the
boundary and decreases with the radius, i.e. there holds $v(z) = v(|z|)$ for all $z \in \mathbb{D}$,
$\lim_{|z|\to 1}v(z) = 0$ and $v(r) \ge v(s)$ if $1 > s > r > 0$. We denote $v\,dA = dA_v$ and,
for $1 \le p < \infty$,

\[ L^p_v(\mathbb{D}) = \Big\{g : \mathbb{D} \to \mathbb{C} \text{ measurable} : \|g\|^p_{p,v} := \int_{\mathbb{D}}|g|^p\,dA_v < \infty\Big\} \quad\text{and} \]

\[ A^p_v(\mathbb{D}) = \{g \in L^p_v(\mathbb{D}) : g \text{ holomorphic}\}, \]

and

\[ h^\infty_v(\mathbb{D}) = \{g : \mathbb{D} \to \mathbb{C} : g \text{ harmonic}, \ \|g\|_v := \sup_{z\in\mathbb{D}}|g(z)|v(|z|) < \infty\} \]

and

\[ H^\infty_v(\mathbb{D}) = \{g \in h^\infty_v : g \text{ holomorphic}\}; \]

we use the standard notation $H^\infty(\mathbb{D}) = (H^\infty(\mathbb{D}), \|\cdot\|_\infty)$ in the non-weighted case.
In all of the above cases, the subspaces of holomorphic and harmonic functions are
closed subspaces of their superspaces.
We write N = {1, 2, 3, . . .} and N0 = N ∪ {0}.
Given $\alpha$, the Bergman projection $P_\alpha$ is the orthogonal projection from the Hilbert
space $L^2_\alpha(\mathbb{D})$ onto the closed subspace $A^2_\alpha(\mathbb{D})$. Given a function $a \in L^1(\mathbb{D})$, we also
denote by $M_a$ the pointwise multiplier $M_a : f \mapsto af$, where $f : \mathbb{D} \to \mathbb{C}$ is a
measurable function (which is usually holomorphic or harmonic in the sequel). If
$1 \le p < \infty$, then a Toeplitz operator $T_a$ on $A^p_\alpha(\mathbb{D})$, with symbol $a$, is in principle
defined as the composition

\[ T_af = P_\alpha M_af, \tag{1} \]

but the assumptions made so far do not always suffice to guarantee that (1) makes
sense, since $M_a$ might map $f$ outside $L^2_\alpha(\mathbb{D})$. If $a$ is a bounded function,
there is no problem with the definition, since $P_\alpha$ can be written with the help of the
Bergman kernel as the integral operator

\[ P_\alpha f(z) = \int_{\mathbb{D}}\frac{f(w)}{(1 - z\bar w)^{2+\alpha}}\,dA_\alpha(w), \]

hence

\[ P_\alpha M_af(z) = \int_{\mathbb{D}}\frac{a(w)f(w)}{(1 - z\bar w)^{2+\alpha}}\,dA_\alpha(w), \tag{2} \]

and for every $z \in \mathbb{D}$, these integrals converge for all $f \in L^1_\alpha(\mathbb{D})$. Moreover, it is
known that $P_\alpha$ is a bounded operator in the space $L^p_\alpha(\mathbb{D})$ when $1 < p < \infty$, which
yields the boundedness of $T_a : A^p_\alpha(\mathbb{D}) \to A^p_\alpha(\mathbb{D})$ for bounded symbols.
It is not difficult to construct unbounded symbols $a$ which still induce bounded
Toeplitz operators, but the characterization of symbols $a \in L^1(\mathbb{D})$ such that
$T_a : A^p_\alpha(\mathbb{D}) \to A^p_\alpha(\mathbb{D})$ is well-defined and bounded is a well-known open problem.
Let us mention some partial results on it. The characterization of boundedness and
compactness of Toeplitz operators with nonnegative symbols in terms of Carleson
type measures first appeared in [24].
D. Luecking [15] proved that a Toeplitz operator $T_a$ with a nonnegative symbol
$a \in L^1(\mathbb{D})$ is bounded in $A^2(\mathbb{D})$ if and only if the average

\[ |B(z, r)|^{-1}\int_{B(z,r)}a(w)\,dA(w) \]

is a bounded function of z. Here B(z, r) denotes a disk in the Bergman metric, with
center z and some fixed radius r > 0. Toeplitz operators with radial symbols in

the space A2α (D) and analogues on higher dimensional domains were thoroughly
considered in [10]: in this case the operator is unitarily equivalent with a sequence
space multiplier, see also (44) below, and thus the boundedness properties can be
determined. A partial generalization to the case $p \neq 2$ was established in [21]. The
Berezin transform

\[ B(f)(z) = (1 - |z|^2)^2\int_{\mathbb{D}}\frac{f(w)}{|1 - z\bar w|^4}\,dA(w), \quad z \in \mathbb{D}, \tag{3} \]

is a useful tool for the theory of Toeplitz operators, although it will not be used in
this article. N. Zorboska proved in [39] for symbols a of bounded mean oscillation
that the Toeplitz operator Ta : A2 (D) → A2 (D) is bounded if and only if B(a) is
bounded. The results of [15] and [39] generalize to other Ap (D)-spaces, 1 < p <
∞, as well, see e.g. [30]. Here is a non-exhaustive list of other works dealing with
the boundedness and compactness of Toeplitz operators in Bergman-type spaces:
[8], [10], [9], [12], [11], [15], [16], [21], [22], [25], [27], [28], [29], [30], [34], [35],
[36], [37], [39]. The monograph [38] is a standard reference for the topic, and we
also mention the survey article [31].
In this article we will review in Sect. 2 the results of [30], [33], [12]. These consist
of sufficient, weak Carleson-type conditions for the boundedness and compactness
of Toeplitz operators in reflexive Bergman spaces with standard weights, both on the
unit disk and the unit ball. Sections 3–6 are mainly based on the recent works [4, 5],
which deal with operators on Hv∞ (D)-spaces with quite general classes of weights.
Theorem 5 of Sect. 4 states that there is a bounded harmonic symbol f for which Tf
is unbounded in Hv∞ (D) for any radial weight v satisfying our general assumptions.
The main result of Sect. 5, Theorem 7 contains a necessary and sufficient condition
for the boundedness of Tf in Hv∞ (D), as well as the corresponding result for the
compactness. These conditions are slightly abstract, and thus in Sect. 6 we derive
some more concrete, easily formulated sufficient conditions based on the results of
Sect. 5.
We conclude this section by a remark on the definition of Toeplitz operators as
an improper integral. Here, we fix α > −1 and assume the symbol a is radial.
Formula (4) will be considered in detail in Sect. 2 even for more general, non-radial
symbols. The proof of Proposition 1 is taken here from [14], although some versions
of it have probably been known to specialists for a long time.
Proposition 1 Let a be a radial symbol, i.e. $a(z) = a(|z|)$ for almost all $z \in \mathbb{D}$,
belonging to $L^1_\alpha(\mathbb{D})$, $\alpha > -1$, and let $g(z) = \sum_{n=0}^{\infty} g_n z^n$ be a holomorphic function
on $\mathbb{D}$. Then, the defining integral (2) of $T_a g$ exists in the improper sense as the limit

\[ T_a g(z) = \lim_{\rho \to 1} \int_{|w|<\rho} \frac{a(w) g(w)}{(1 - z\bar{w})^{2+\alpha}} \, dA_\alpha(w), \tag{4} \]
Boundedness of Toeplitz Operators in Bergman-Type Spaces 465

convergent for every z ∈ D. Moreover,

\[ T_a g(z) = \sum_{n=0}^{\infty} \frac{\beta_{a,\alpha}(1,n)\, g_n}{(\alpha+1) B(n+1,\alpha+1)} \, z^n \tag{5} \]

and in particular the power series on the right converges for all $z \in \mathbb{D}$.
Here and in what follows we denote by $B$ and $\Gamma$ Euler's beta and gamma functions,

\[ B(n+1,c) = \frac{n!\,\Gamma(c)}{\Gamma(n+1+c)}, \quad c > 0, \]

and for $0 < \rho \le 1$

\[ \beta_{a,\alpha}(\rho,n) = (\alpha+1) \int_0^{\rho^2} t^n (1-t)^\alpha \, a(\sqrt{t}) \, dt, \tag{6} \]

where the integral converges by the assumptions that a is radial and belongs to
$L^1_\alpha(\mathbb{D})$.
Proof of Proposition 1 We start with the remark that for all $m \in \mathbb{N}_0$, the integral

\[ \int_{\mathbb{D}} g(w)\, \bar{w}^m a(w) \, dA_\alpha(w) \]

exists in the improper sense for every holomorphic g on the disk $\mathbb{D}$. Namely, the
rotational symmetry of a and the usual orthogonality relations of trigonometric
polynomials yield for all $m \in \mathbb{N}_0$

\[ \int_{|w|<\rho} g(w)\, \bar{w}^m a(w) \, dA_\alpha(w) = 2(\alpha+1)\, g_m \int_0^{\rho} r^{2m+1} a(r) (1-r^2)^\alpha \, dr. \tag{7} \]

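Clearly, the limit exists when ρ → 1. For every 0 < ρ < 1, z ∈ D, we obtain by (7)

\[ \int_{|w|<\rho} \frac{a(w) g(w)}{(1 - z\bar{w})^{2+\alpha}} \, dA_\alpha(w)
 = \int_{|w|<\rho} g(w) \sum_{n=0}^{\infty} \frac{(z\bar{w})^n}{(\alpha+1) B(n+1,\alpha+1)} \, a(w) \, dA_\alpha(w)
 = \sum_{n=0}^{\infty} \frac{\beta_{a,\alpha}(\rho,n)\, g_n}{(\alpha+1) B(n+1,\alpha+1)} \, z^n. \tag{8} \]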
466 J. Taskinen and J. A. Virtanen

Let $L \in \mathbb{N}$ be such that $L \ge |\alpha| + 1$. Then,

\[ B(n+1,\alpha+1) \ge \frac{n!\,\Gamma(\alpha+1)}{(n+L)!} \ge C_L n^{-L} \tag{9} \]

for some constant $C_L > 0$. We also have

\[ \beta_{a,\alpha}(\rho,n) \le \beta_{a,\alpha}(1,n) = 2(\alpha+1) \int_0^1 t^{2n+1} (1-t^2)^\alpha a(t) \, dt \le C_\alpha \tag{10} \]

for another constant $C_\alpha > 0$, for all ρ and n, since $a \in L^1_\alpha(\mathbb{D})$. Moreover, since g is
a holomorphic function on $\mathbb{D}$, we have $\limsup_{n\to\infty} |g_n|^{1/n} \le 1$, hence,

\[ \limsup_{n\to\infty} \left| \frac{\beta_{a,\alpha}(1,n)\, g_n}{(\alpha+1) B(n+1,\alpha+1)} \right|^{1/n}
 \le \limsup_{n\to\infty} \left( C_L^{-1} C_\alpha n^{L} \right)^{1/n} \cdot \limsup_{n\to\infty} |g_n|^{1/n} \le 1. \]

The same estimate holds, independently of ρ, when $\beta_{a,\alpha}(1,n)$ is replaced by
$\beta_{a,\alpha}(\rho,n)$. Hence, by the elementary theory of power series, the series (5) and (8) converge
uniformly on compact subsets of the disk and define holomorphic functions.
Moreover, we have $\beta_{a,\alpha}(\rho,n) \to \beta_{a,\alpha}(1,n)$ for every n as ρ → 1; hence,
considering truncations of the series (5) and (8) shows that the limit on the right of (4) exists
for every z ∈ D and coincides with (5).
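
As a numerical sanity check of (5) and (8), the following sketch evaluates the factor that multiplies $g_n z^n$ for a given radial symbol. It reads $\beta_{a,\alpha}(\rho,n)$ as the radial moment $2(\alpha+1)\int_0^\rho r^{2n+1}(1-r^2)^\alpha a(r)\,dr$ appearing in (7), and uses a plain midpoint rule. The function names and the step count are illustrative choices that do not come from the text, and α ≥ 0 is assumed so that the integrand stays bounded.

```python
import math

def euler_beta(x, y):
    """Euler's beta function B(x, y), expressed through the gamma function."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)

def beta_moment(a, alpha, n, rho=1.0, steps=20000):
    """Midpoint-rule value of 2(alpha+1) * int_0^rho r^(2n+1) (1-r^2)^alpha a(r) dr."""
    h = rho / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        total += r ** (2 * n + 1) * (1.0 - r * r) ** alpha * a(r)
    return 2.0 * (alpha + 1.0) * h * total

def taylor_multiplier(a, alpha, n, rho=1.0):
    """Factor multiplying g_n z^n in (5) (rho = 1) or in (8) (rho < 1)."""
    return beta_moment(a, alpha, n, rho) / ((alpha + 1.0) * euler_beta(n + 1, alpha + 1))
```

For the constant symbol a ≡ 1 the multiplier equals 1 for every n, as it must since $T_1$ is the identity, and for $a(r) = r^2$ it equals $(n+1)/(n+\alpha+2)$, which follows from the beta-function identity above.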

2 Toeplitz Operators with Oscillating Symbols

If an unbounded, measurable function a is strongly oscillating, it may give rise to

a Toeplitz operator via the improper integral (4), and the operator may even be
bounded with respect to a Bergman norm. A sufficient condition for oscillating
symbols to induce a bounded Ta was presented in the paper [30]. More precisely, in
the reference it was shown that Ta is bounded under an averaging condition for the
symbol itself rather than for its modulus. The result needs a generalized definition of
Toeplitz operators, which, however, eventually coincides with the improper integral.
The result also extends to little Hankel operators.
We will next review the mentioned approach. It is based on a decomposition
of the disk into an infinite family $(D_n)_{n=1}^{\infty}$ of subdomains, which have essentially
constant area with respect to the hyperbolic geometry. The geometry of the
subdomains needs to be specified carefully, since an explicit integration-by-parts
argument is a crucial step in the proof. Here, the sets $D_n$ are rectangles in
polar coordinates, but they could also be chosen differently; see the discussion
below.

Let us consider a symbol a : D → C, which is at least locally Lebesgue integrable
on D. We also fix the parameter α > −1.
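Definition 1 Denote by $\mathcal{D}$ the family of the sets $D := D(r,\theta)$, where

\[ D = \{ \rho e^{i\phi} \mid r \le \rho \le 1 - \tfrac{1}{2}(1-r), \ \theta \le \phi \le \theta + \pi(1-r) \} \tag{11} \]

for all $0 < r < 1$, $\theta \in [0, 2\pi]$. Let $|D| := \int_D dA$ and, for $w = \rho e^{i\phi} \in D(r,\theta)$, let

\[ \hat{a}_D(w) := \frac{1}{|D|} \int_r^{\rho} \int_\theta^{\phi} a(\varrho e^{i\varphi}) \, \varrho \, d\varphi \, d\varrho. \tag{12} \]

We will study symbols a for which there exists a constant C > 0 such that

\[ |\hat{a}_D(w)| \le C \tag{13} \]

for all $D \in \mathcal{D}$ and all $w \in D$.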

The sets $D(1 - 2^{-m+1}, 2\pi(k-1)2^{-m}) \in \mathcal{D}$, where $m \in \mathbb{N}$, $k = 1, \dots, 2^{m}$,
form a decomposition of the disk $\mathbb{D}$. Let us re-index them into a family
$(D_n)_{n=1}^{\infty}$ with

\[ D_n = \{ z = re^{i\theta} \mid r_n < r \le r_n', \ \theta_n < \theta \le \theta_n' \} \tag{14} \]

where, for some m and k,

\[ r_n = 1 - 2^{-m+1}, \quad r_n' = 1 - 2^{-m}, \quad \theta_n = \pi(k-1)2^{-m+1}, \quad \theta_n' = \pi k 2^{-m+1}. \tag{15} \]

Given $f \in A^p_\alpha(\mathbb{D})$, we write for all n = n(m, k)

\[ F_n f(z) = \int_{D_n} \frac{a(w) f(w)}{(1 - z\bar{w})^{2+\alpha}} \, dA_\alpha(w), \quad z \in \mathbb{D}, \tag{16} \]

so that $F_n$ can actually be considered as a conventional, bounded Toeplitz operator
on $A^p_\alpha(\mathbb{D})$.
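
The dyadic decomposition (14)–(15) can be sketched in a few lines. The code below lists the polar rectangles up to a given level m and computes their Euclidean and hyperbolic areas in closed form. It assumes that level m contains $2^m$ angular sectors and takes $dA/(1-|z|^2)^2$ as the hyperbolic area element (up to a constant factor); the function names are hypothetical.

```python
import math

def dyadic_cells(max_level):
    """Yield (r_lo, r_hi, th_lo, th_hi) for the sets in (14)-(15), m = 1..max_level."""
    for m in range(1, max_level + 1):
        r_lo = 1.0 - 2.0 ** (-m + 1)
        r_hi = 1.0 - 2.0 ** (-m)
        width = math.pi * 2.0 ** (-m + 1)  # angular width of one cell at level m
        for k in range(1, 2 ** m + 1):
            yield r_lo, r_hi, (k - 1) * width, k * width

def euclidean_area(cell):
    """Area of a polar rectangle with respect to r dr dtheta."""
    r_lo, r_hi, th_lo, th_hi = cell
    return 0.5 * (r_hi ** 2 - r_lo ** 2) * (th_hi - th_lo)

def hyperbolic_area(cell):
    """Area with respect to r dr dtheta / (1 - r^2)^2, integrated in closed form."""
    r_lo, r_hi, th_lo, th_hi = cell
    radial = 0.5 * (1.0 / (1.0 - r_hi ** 2) - 1.0 / (1.0 - r_lo ** 2))
    return radial * (th_hi - th_lo)
```

The Euclidean areas of all cells up to level M add up to the area of the disk of radius $1 - 2^{-M}$, while the hyperbolic areas stay between two fixed positive constants, which is the sense in which the cells have essentially constant hyperbolic size.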
The following theorem, in the case α = 0, is the main result (Theorem 2.3) of
[30]. The weighted case was included in [12].
Theorem 1 Let 1 < p < ∞ and assume that the locally integrable function a
satisfies the condition (13). Given $f \in A^p_\alpha(\mathbb{D})$, the series $\sum_{n=1}^{\infty} F_n f(z)$ converges
pointwise, absolutely for almost all z ∈ D, and the generalized Toeplitz operator
$T_a : A^p_\alpha(\mathbb{D}) \to A^p_\alpha(\mathbb{D})$, defined by

\[ T_a f(z) = \sum_{n=1}^{\infty} F_n f(z) \tag{17} \]

is bounded for all 1 < p < ∞, and there is a constant $C_\alpha$, independent of a, such
that

\[ \| T_a \| \le C_\alpha \sup_{D \in \mathcal{D},\, w \in D} |\hat{a}_D(w)|. \tag{18} \]

The main step of the proof consists of writing the integral (16) in polar
coordinates and performing a double integration by parts (once with respect to
each coordinate) such that there appear integrals of a and derivatives of
$f(w)(1 - |w|^2)^\alpha / (1 - z\bar{w})^{2+\alpha}$. The former can be estimated by using the assumption (13)
and the latter by using bounds for the maximal Bergman projection and well-known
arguments and estimates related to hyperbolic geometry. One obtains a
representation for the integral (2) as a pointwise convergent sum of the integrals (16)
as in (17). We refer to [30] for the details. Improved versions of the proof appear in
[33] and [12], and they yield our next theorem, although we do not repeat the proof
here. We remark that every Toeplitz operator

\[ T_{a_\rho} f(z) = \int_{|w|<\rho} \frac{a(w) f(w)}{(1 - z\bar{w})^{2+\alpha}} \, dA_\alpha(w) \tag{19} \]

is bounded $A^p_\alpha(\mathbb{D}) \to A^p_\alpha(\mathbb{D})$, since the support of the symbol is contained in a
compact subset of $\mathbb{D}$.
Theorem 2 Let 1 < p < ∞ and 1/p + 1/q = 1, and let the symbol a be as in
Theorem 1. Then, the generalized Toeplitz operator $T_a : A^p_\alpha(\mathbb{D}) \to A^p_\alpha(\mathbb{D})$, defined
in (17), can be written as

\[ T_a f = \lim_{\rho \to 1} T_{a_\rho} f, \tag{20} \]

for all $f \in A^p_\alpha(\mathbb{D})$. The limit converges with respect to the strong operator topology.
Moreover, the transposed operator $T_a^* : A^q_\alpha(\mathbb{D}) \to A^q_\alpha(\mathbb{D})$ (with respect to the
standard complex dual pairing) satisfies

\[ T_a^* f = \lim_{\rho \to 1} T_{\bar{a}_\rho} f \tag{21} \]

for $f \in A^q_\alpha(\mathbb{D})$ and for almost all z ∈ D, and the limit also converges in the strong
operator topology.

The limits in (20), (21) cannot in general converge in the operator norm, since
the operators $T_{a_\rho}$ are compact. We mention that, when α = 0, the above results are
formulated in [33] also for little Hankel operators

\[ h_a f(z) = \int_{\mathbb{D}} \frac{a(w) f(w)}{(1 - \bar{z}w)^{2}} \, dA(w), \quad z \in \mathbb{D}. \tag{22} \]

Here, using the same decomposition of the unit disk as above, one also defines

\[ H_n f(z) = \int_{D_n} \frac{a(w) f(w)}{(1 - \bar{z}w)^{2}} \, dA(w), \quad z \in \mathbb{D}, \tag{23} \]

and defines the generalized little Hankel operator $h_a f(z)$ as $\sum_{n=1}^{\infty} H_n f(z)$. Then,
if (13) holds for the symbol a, one obtains that $h_a : A^p(\mathbb{D}) \to L^p(\mathbb{D})$ is bounded
for all 1 < p < ∞, the operator norm of $h_a$ has the same bound as in (18), and
finally, the operator $h_a$ and its transpose have representations as improper integrals
similar to those in (20), (21).
The definition (17) of a generalized Toeplitz operator depends on the geometry
of the special decomposition (14) of the unit disk, but Theorem 2 largely removes
this unsatisfactory feature, since the improper integral in (20) is quite a natural one.
We remark that in the literature there are versions of the result which use different
subdomains of the unit disk. In [36] the condition (13) is replaced by a similar one
on Carleson squares

\[ S_h^{\alpha}(e^{i\theta}) = \{ \rho e^{i\phi} : 1 - h < \rho < 1, \ |\phi - \theta| < \pi\alpha h \} \]

where 0 < h < 1, 0 ≤ θ ≤ 2π, 0 < α ≤ 1. The authors give a boundedness
result for the Toeplitz operators and they also show that their sufficient condition is
equivalent to that in Theorem 1. Finally, they also prove the important observation
that the sufficient condition (13) is not necessary for the boundedness of
$T_a : A^p_\alpha(\mathbb{D}) \to A^p_\alpha(\mathbb{D})$.
Another variant appears in [22, 23] where Toeplitz operators on Bergman spaces
of simply connected planar domains are considered. In such domains any geometric
symmetry is usually lost, and there does not exist a decomposition of the domain
which is as natural as the one for the disk, see (14). However, the author uses a
Whitney decomposition with Euclidean rectangles and obtains results which are
analogous to Theorem 1. The Whitney decomposition can of course be applied also
in the case of the disk, and it yields another sufficient condition for the boundedness
of the Toeplitz operator. We do not know, if the condition is equivalent to (13).
In [32], we generalized Theorem 1 to the setting of A1 (D), while bounded
Toeplitz operators Tμ on A1α (BN ) were characterized in terms of the reproducing
kernels in [6] under additional conditions on the measure μ. We skip a detailed
discussion on the boundedness problem in A1 -spaces and only note that the previous
approach has not been worked out in the non-locally convex cases 0 < p < 1.

Theorems 1 and 2, first proved in [30] and [33], have been generalized to the case
of Toeplitz operators on the Bergman space of the unit ball of CN in the recent work
[12], but even presenting the results leads to non-trivial technical challenges. We
do not directly need the Euclidean space R3 here, but since that dimension is still
within the capabilities of the human imagination, we ask the reader to think about a
radially symmetric decomposition of the unit ball of $\mathbb{R}^3$: that is indeed a challenge,
since decomposing the ball surface into finitely many identical squares in spherical
coordinates (corresponding to the intervals $[\theta_n, \theta_n']$ in (14)–(15)) is impossible. For
example, starting to fill the ball surface from the equator with spherical squares with
one side parallel to the meridians, one runs into difficulties at the latest when trying to
fill the north and south caps.
The results of [12] are formulated for measures with standard weights and thus
the proofs contain new information even in the case N = 1, since the earlier papers
only contained the unweighted case. The basic idea of the proof is the same as in
[30] and [33], but new non-trivial technical considerations are nevertheless needed.
Let us review the approach of [12] superficially without going into all technical
details. For α > −1, we define the weighted Lebesgue measure $dV_\alpha$ on the unit
ball $\mathbb{B}_N$, $N \in \mathbb{N}$, by $dV_\alpha(z) = c_\alpha (1 - |z|^2)^\alpha dV(z)$, where dV is the unweighted
N-dimensional (real) Lebesgue measure and $c_\alpha$ is a normalizing constant such that
$\int_{\mathbb{B}_N} dV_\alpha = 1$. For 1 ≤ p < ∞, we denote by $L^p_\alpha(\mathbb{B}_N)$ the $L^p$-space with respect to
the measure $dV_\alpha$ and by $A^p_\alpha(\mathbb{B}_N)$ the weighted Bergman space of all holomorphic
functions in $L^p_\alpha(\mathbb{B}_N)$. We also denote by $P_\alpha$ the orthogonal projection from $L^2_\alpha(\mathbb{B}_N)$
onto $A^2_\alpha(\mathbb{B}_N)$. It is known to be a bounded operator from $L^p_\alpha(\mathbb{B}_N)$ onto $A^p_\alpha(\mathbb{B}_N)$ for all
1 < p < ∞.
In the following it is useful to work with real variables by identifying CN with
Rn , n = 2N, so that BN equals Bn in real coordinates. Accordingly, any point
x ∈ Bn with modulus |x| = r can be written as

x = (r cos θ2 , r sin θ2 cos θ3 , r sin θ2 sin θ3 cos θ4 , · · · ,

r sin θ2 · · · sin θn−1 cos θn , r sin θ2 · · · sin θn−1 sin θn )

in the spherical coordinates

\[ \xi = (r, \theta_2, \dots, \theta_n) \in [0,1[ \, \times \prod_{j=2}^{n-1} [0,\pi[ \, \times [0,2\pi[ \; =: Q_n, \]

and these determine the coordinate transform $\sigma : Q_n \to \mathbb{B}_n$ by $x = \sigma(\xi)$. As
in the case of the unit disk, one needs to specify a suitable decomposition of the
unit ball $\mathbb{B}_n$, but this turns out to be unexpectedly difficult in higher dimensions. We
skip the detailed choice of the sets at this point, referring to Section 1 of [12], and
only mention that it is possible to choose for every $m \in \mathbb{N}$ finitely many subsets
$B_{m,k}$, $k = 1, \dots, K_m$, which are images under the mapping σ of certain rectangles
$Q_{m,k} \subset Q_n$ in polar coordinates, such that
– the volume of every $B_{m,k}$ is proportional to $2^{-nm}$,
– the union of all sets $B_{m,k}$, when $m \in \mathbb{N}$ and $k = 1, \dots, K_m$, covers $\mathbb{B}_n$,
– there is a constant $N \in \mathbb{N}$ such that any point $x \in \mathbb{B}_n$ is contained in at most N
of the sets $B_{m,k}$.
We enumerate the sets $Q_{m,k}$ and $B_{m,k}$ into sequences $(Q_j)_{j=1}^{\infty}$ and $(B_j)_{j=1}^{\infty}$.
Then, we impose on $Q_n$ the partial ordering

\[ x \preceq y \iff x_1 \le y_1, \ |\tfrac{\pi}{2} - x_2| \ge |\tfrac{\pi}{2} - y_2|, \ \dots, \ |\tfrac{\pi}{2} - x_{n-1}| \ge |\tfrac{\pi}{2} - y_{n-1}|, \ x_n \le y_n. \tag{24} \]

On each $Q_j$ we pick the smallest and largest points $x^{(j)} = (x_1^{(j)}, \dots, x_n^{(j)})$
and $y^{(j)} = (y_1^{(j)}, \dots, y_n^{(j)})$ with respect to the given ordering; hence, there holds
$Q_j = Q(x^{(j)}, y^{(j)})$, where we denote, for $a, b \in Q_n$ with $a \preceq b$,

\[ Q(a,b) = \{ x \in \mathbb{R}^n : a \preceq x \preceq b \}, \quad B(a,b) = \sigma(Q(a,b)). \tag{25} \]

Note that for $x, y \in [0,1) \times [0,\tfrac{\pi}{2}]^{n-2} \times [0,2\pi]$ the order relation "$\preceq$" coincides
with the usual partial order of points in $\mathbb{R}^n$, which is then mirrored to all of $Q_n$ to
account for the construction of the sets $Q_j$ and $B_j$. In particular, $x^{(j)}$ and $y^{(j)}$
are two opposite corners of $Q_j$ and we have $B_j = B(x^{(j)}, y^{(j)})$.
Let $a : \mathbb{B}_N \to \mathbb{C}$ be a locally integrable function and 1 < p < ∞. The
generalized Toeplitz operator is defined by

\[ T_a f(z) := \sum_{j=1}^{\infty} T_a(\chi_j f)(z) = \sum_{j=1}^{\infty} P_\alpha(a \chi_j f)(z), \tag{26} \]

if the series converges for almost every $z \in \mathbb{B}_N$ and all $f \in A^p_\alpha(\mathbb{B}_N)$. Here $\chi_j$
denotes the characteristic function of the set $B_j$. The boundedness of the Bergman
projection $P_\alpha$ in $L^p_\alpha(\mathbb{B}_N)$ implies that $T_a f = P_\alpha(af)$ whenever $af \in L^p_\alpha(\mathbb{B}_N)$.
In particular, if a is bounded, then $T_a$ is just the standard Toeplitz operator. As in
the one-dimensional case, a "weak" Carleson-type condition (28) implies that $T_a$
becomes a well-defined bounded linear operator and the definition coincides with
the integral definition, when it is interpreted as an improper integral. Accordingly,
given a locally integrable $a : \mathbb{B}_N \to \mathbb{C}$, we define for all $j \in \mathbb{N}$

\[ a_j^{B} := \sup_{y \in B_j} \left| \int_{B(x^{(j)},\, y)} a \, dV_\alpha \right| \tag{27} \]

and denote $|B|_\alpha = \int_B dV_\alpha$ for all measurable subsets $B \subset \mathbb{B}_N$.
Theorem 3 Let $a : \mathbb{B}_N \to \mathbb{C}$ be locally integrable, 1 < p < ∞ and the family
$(B_j)_{j \in \mathbb{N}}$ be as above. If there exists a constant $C_a > 0$ such that

\[ a_j^{B} \le C_a |B_j|_\alpha \tag{28} \]

for all $j \in \mathbb{N}$, then the series (26) converges almost everywhere and in $L^p_\alpha(\mathbb{B}_N)$ and
defines a bounded linear operator $A^p_\alpha(\mathbb{B}_N) \to A^p_\alpha(\mathbb{B}_N)$ with $\|T_a\| \le C_\alpha C_a$, for
some constant $C_\alpha > 0$ independent of a.
Given the symbol a as above and 0 < ρ < 1, we define $a_\rho(z) = a(z)$ for |z| ≤ ρ
and $a_\rho(z) = 0$ for ρ < |z| < 1; then every operator $T_{a_\rho}$ is bounded on $A^p_\alpha(\mathbb{B}_N)$,
since the supports of the symbols are compact subsets of the unit ball, or also by
the previous theorem. As in the one-dimensional case, the assumption (28) allows
the following representation of the Toeplitz operator, which does not depend on the
decomposition $(B_j)_{j \in \mathbb{N}}$.
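Theorem 4 Let 1 < p < ∞ and 1/p + 1/q = 1, and suppose that $a \in L^1_{\mathrm{loc}}$
satisfies (28). Then

\[ T_a f = \lim_{\rho \to 1} T_{a_\rho} f \]

for all $f \in A^p_\alpha(\mathbb{B}_N)$, and the transpose operator $T_a^* : A^q_\alpha(\mathbb{B}_N) \to A^q_\alpha(\mathbb{B}_N)$ can be
expressed as

\[ T_a^* f = \lim_{\rho \to 1} T_{\bar{a}_\rho} f \]

for $f \in A^q_\alpha(\mathbb{B}_N)$.
The transpose is defined with respect to the standard duality of the $A^p_\alpha(\mathbb{B}_N)$-spaces.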
It would probably be possible and technically easier to formulate and prove a
result analogous to Theorem 3 by using a rectangular Whitney decomposition of
BN instead of the one described here, but there would then be the disadvantage
that the spherical symmetry would be lost and the condition for the boundedness
would depend on the particular choice of the decomposition. In particular, it might
be difficult or impossible to prove Theorem 4 with that approach.

3 Toeplitz Operators in Hv∞ -Spaces: Introduction

From now on we will deal with Toeplitz operators in spaces on D with quite
general weights v satisfying the basic assumptions of Sect. 1. A typical, important
example of the weights considered in this section is the exponentially decreasing weight
v(r) = exp(−1/(1 − r)). Because of such examples we again need to pay attention
to the definition of Toeplitz operators in the spaces $A^p_v(\mathbb{D})$ and $H^\infty_v(\mathbb{D})$; namely,
there is the problem that the Bergman projection may not be bounded. Actually we
will show that this is always the case for p = ∞ for any weight, see Theorem 5, but
even in the reflexive case there may be problems in this respect: in [7] it was shown
that for the above-mentioned exponential weight v(z), the orthogonal projection
$L^2_v(\mathbb{D}) \to A^2_v(\mathbb{D})$ is bounded in $L^p_v$ if and only if p = 2. Moreover, in [19] W. Lusky
proved that the mere existence of a bounded projection from $L^\infty_v(\mathbb{D})$ onto $H^\infty_v(\mathbb{D})$
is equivalent to v satisfying condition (B) of Definition 2, below. For example, the
exponential weight v satisfies (B), but there also exist natural weights which do not,
like $v(z) = (1 - \log(1 - |z|))^{-1}$ (see the statement after Theorem 1.2 of [19] and
Example 2.4 of the same paper for other examples).
Yet, even in the spaces $H^\infty_v(\mathbb{D})$ and $A^p_v(\mathbb{D})$ with general weights, the definition
of the Toeplitz operator involves the orthogonal projection $P_v : L^2_v(\mathbb{D}) \to A^2_v(\mathbb{D})$. It
will be useful to consider the integral kernel of $P_v$, the so-called Bergman kernel. In
what follows we use well-known arguments, see e.g. [7]. We denote the inner product
in the Hilbert spaces $L^2_v(\mathbb{D})$ and $A^2_v(\mathbb{D})$ by $\langle f, g \rangle = \int_{\mathbb{D}} f \bar{g} \, dA_v$. Then, the functions
$e_k(z) = \Gamma_{2k}^{-1/2} z^k$, where $k \in \mathbb{N}_0$ and

\[ \Gamma_k = 2\pi \int_0^1 r^{k+1} v(r) \, dr, \tag{29} \]

form an orthonormal basis of $A^2_v(\mathbb{D})$. We remark that the numbers $\Gamma_k$ satisfy for all
$0 < \varrho < 1$ and some constant $C_{v,\varrho} > 0$ the following lower bound

\[ \Gamma_k \ge C_{v,\varrho} \, \varrho^{k} \tag{30} \]

for every $k \in \mathbb{N}_0$. This follows from (29) by considering the integral e.g. over the
interval $[\varrho, 1 - (1 - \varrho)/2]$ only.
Convergence in the space $A^p_v(\mathbb{D})$, 1 < p < ∞, with respect to the norm $\|\cdot\|_{p,v}$
implies pointwise convergence (hence $A^p_v(\mathbb{D})$ is a closed subspace of $L^p_v(\mathbb{D})$), and
thus the point evaluation functionals at any point of $\mathbb{D}$ are bounded functionals on
$A^p_v(\mathbb{D})$. Consequently, we find the Bergman kernel by using the Riesz representation
theorem, which allows us to choose the family of functions $K_z \in A^2_v(\mathbb{D})$, $z \in \mathbb{D}$,
such that

\[ g(z) = \langle g, K_z \rangle = \int_{\mathbb{D}} g(w) \overline{K_z(w)} \, dA_v(w) \tag{31} \]

for all $g \in A^2_v(\mathbb{D})$. The integral operator defined by the right-hand side can be
extended to $L^2_v(\mathbb{D})$, and it actually defines the orthogonal projection from $L^2_v(\mathbb{D})$
onto $A^2_v(\mathbb{D})$, i.e. the Bergman projection $P_v$. Using the orthonormal basis $(e_k)_{k=0}^{\infty}$
we can write for all $z \in \mathbb{D}$

\[ P_v g(z) = \sum_{k=0}^{\infty} \langle g, e_k \rangle e_k(z) = \int_{\mathbb{D}} \sum_{k=0}^{\infty} \frac{z^k \bar{w}^k}{\Gamma_{2k}} \, g(w) \, dA_v(w). \tag{32} \]

Here, the order of the summation and the integral can be changed, because (30)
leads for any fixed $z \in \mathbb{D}$ to the estimate

\[ \left| \frac{z^k \bar{w}^k}{\Gamma_{2k}} \right| \le c_{v,\varrho} \left( \frac{|z|}{\varrho^2} \right)^{k}, \tag{33} \]

and we can choose here $\varrho^2 > |z|$ so that the sum on the right-hand side of (32)
converges well enough. Moreover, the estimate (33) implies that for every $z \in \mathbb{D}$ the
Bergman kernel $K_z$ is a bounded function:

\[ |K_z(w)| \le C_z \quad \text{for all } w \in \mathbb{D}. \tag{34} \]

We obtain the following inference.

Lemma 1 Let $f \in L^1(\mathbb{D})$. The integral defining the Toeplitz operator $T_f$ with
symbol f on $H^\infty_v$,

\[ T_f g(z) = \int_{\mathbb{D}} f(w) g(w) \overline{K_z(w)} \, dA_v(w), \tag{35} \]

converges for all $z \in \mathbb{D}$ and for all $g \in H^\infty_v(\mathbb{D})$.

Indeed, if $g \in H^\infty_v(\mathbb{D})$, then, by definition, $gv \in L^\infty(\mathbb{D})$. Hence, the result
follows from (34).

We remark that the a priori assumption $f \in L^1(\mathbb{D})$ is usual also in the
considerations on Toeplitz operators in the reflexive Bergman spaces, but in that
case this assumption does not guarantee that the defining integral (35) converges
for all $g \in A^p_v(\mathbb{D})$. From this point of view, the case p = ∞ is simpler. However,
although Tf g of (35) is a well-defined holomorphic function it might not be an
element of Hv∞ (D) and Tf might not be a bounded operator Hv∞ (D) → Hv∞ (D).
Actually it is an elementary consequence of the closed graph theorem that Tf is a
bounded operator Hv∞ (D) → Hv∞ (D) if and only if Tf (Hv∞ (D)) ⊂ Hv∞ (D). We
will soon turn to questions on the boundedness of the operator Tf .
If $g \in H^\infty_v(\mathbb{D})$ is such that $fg \in L^2_v(\mathbb{D})$, we also have

\[ (T_f g)(z) = \sum_{n=0}^{\infty} \langle fg, e_n \rangle e_n(z) = \sum_{n=0}^{\infty} \frac{z^n}{\Gamma_{2n}} \int_{\mathbb{D}} f(w) g(w) \bar{w}^n v(w) \, dA(w), \tag{36} \]

where the series converges in $L^2_v(\mathbb{D})$. However, the formula also holds for all
$g \in H^\infty_v(\mathbb{D})$: since we are assuming $f \in L^1(\mathbb{D})$, the product fgv belongs to
$L^1(\mathbb{D})$, and one can interchange the summation and integration in (36), due to (33).
In the latter case, the sum (36) converges uniformly for z in compact subsets of the
disk.

4 Toeplitz Operators with Harmonic Symbols in $H^\infty_v(\mathbb{D})$-Spaces

In this section we will consider Toeplitz operators $T_f$ with harmonic symbols
$f : \mathbb{D} \to \mathbb{C}$ in weighted spaces $H^\infty_v(\mathbb{D})$. We assume that the weight v satisfies the basic
requirements introduced in Sect. 1. In addition, the following notions will be needed
here and in subsequent sections. For any function $g : \mathbb{D} \to \mathbb{C}$ and radius 0 ≤ r ≤ 1
we will denote

\[ M_\infty(g, r) = \sup_{|z| = r} |g(z)|. \tag{37} \]

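Also, a weight v is called normal if

\[ \sup_{n \in \mathbb{N}} \frac{v(1 - 2^{-n})}{v(1 - 2^{-n-1})} < \infty \quad \text{and} \quad \inf_{k \in \mathbb{N}} \limsup_{n \to \infty} \frac{v(1 - 2^{-n-k})}{v(1 - 2^{-n})} < 1. \tag{38} \]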

For example, the standard weights $v(r) = (1 - r^2)^\alpha$, α > 0, are normal, whereas the
weights of exponential type, $v(r) = \exp(-\alpha/(1-r)^\beta)$, α, β > 0, are not. The Riesz
projection P maps harmonic functions into holomorphic ones and it is defined by

\[ P\Big( \sum_{k \in \mathbb{Z}} a_k r^{|k|} e^{ik\theta} \Big) = \sum_{k=0}^{\infty} a_k r^{k} e^{ik\theta}, \quad r \in [0,1), \ \theta \in [0, 2\pi]. \tag{39} \]

For every m > 0 we denote by $r_m$ a point where the function $r \mapsto r^m v(r)$ attains
its absolute maximum on [0, 1]. Due to the general assumptions on the weights it is
easily seen that $r_n \ge r_m$ if n ≥ m and $\lim_{m \to \infty} r_m = 1$; see for example [17] for
details.
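
To make the maximum points $r_m$ concrete, here is a small sketch that locates a maximiser of $r \mapsto r^m v(r)$ by a plain grid search for the two families of weights mentioned above. The grid search, the function names and the grid size are illustrative assumptions rather than anything used in the cited works.

```python
import math

def argmax_radius(m, v, grid=20000):
    """Grid search for a maximiser of r -> r^m v(r) over the points i/grid in (0, 1)."""
    best_r, best_val = 0.0, -1.0
    for i in range(1, grid):
        r = i / grid
        val = r ** m * v(r)
        if val > best_val:
            best_r, best_val = r, val
    return best_r

def standard_weight(alpha):
    """Standard weight v(r) = (1 - r^2)^alpha."""
    return lambda r: (1.0 - r * r) ** alpha

def exponential_weight(alpha, beta):
    """Exponential-type weight v(r) = exp(-alpha / (1 - r)^beta)."""
    return lambda r: math.exp(-alpha / (1.0 - r) ** beta)
```

For the standard weight, setting the derivative of $r^m(1-r^2)^\alpha$ to zero gives $r_m = \sqrt{m/(m+2\alpha)}$, and for $v(r) = \exp(-1/(1-r))$ the maximiser solves $m(1-r)^2 = r$; both can be compared with the output of `argmax_radius`, up to the grid resolution.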
We now turn to questions on the boundedness of Toeplitz operators Tf with
harmonic symbols f . In the case f is even holomorphic, the operator Tf is just the
multiplier Mf , and it is quite plain that Tf is bounded, if and only if f ∈ H ∞ (D),
i.e., f is a bounded function. Due to the generality of the weights, the details
of this claim are exposed in [4, Section 2]. Allowing the symbol to be just a
harmonic function changes the situation dramatically. The basic reason for this is
the unboundedness of the Riesz and Bergman projections with respect to the sup-norm,
but one can develop this idea as far as the following result. We repeat that in
all of our results the weights v must satisfy the general assumptions made in Sect. 1.
Theorem 5 There is a bounded harmonic function f : D → C such that Tf is not
a bounded operator Hv∞ (D) → Hv∞ (D) for any weight v on D.
This result implies the following conclusion.
Corollary 1 For any weight v, the Bergman projection $P_v$ is not a bounded
mapping $L^\infty_v(\mathbb{D}) \to L^\infty_v(\mathbb{D})$.

Namely, the pointwise multiplication with a bounded function f is always a
bounded operator $H^\infty_v(\mathbb{D}) \to L^\infty_v(\mathbb{D})$. So, if $P_v$ were bounded, this would imply that
$T_f : H^\infty_v(\mathbb{D}) \to H^\infty_v(\mathbb{D})$ is bounded for every $f \in L^\infty(\mathbb{D})$, which would
contradict Theorem 5. We actually see that even the restriction of $P_v$ onto $h^\infty_v(\mathbb{D})$ is
unbounded.
In the sequel, the complex variable z will always be written in the polar
coordinates as z = reiθ , unless otherwise indicated.
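Proof of Theorem 5 Let us fix a weight v on $\mathbb{D}$ and define first the function
$f_0 : \partial\mathbb{D} \to \mathbb{C}$ by

\[ f_0(z) = \begin{cases} 1, & \text{if } -\pi/2 \le \theta \le \pi/2, \\ 0, & \text{if } -\pi \le \theta < -\pi/2 \text{ or } \pi/2 < \theta \le \pi. \end{cases} \]

The symbol f is defined as the harmonic extension of $f_0$ on $\mathbb{D}$ obtained from the
Poisson integral; hence, we have $f \in h^\infty(\mathbb{D})$. Calculating the Fourier coefficients
of $f_0$ we observe that

\[ f(z) = \frac{1}{2} + \frac{1}{\pi} \sum_{k=0}^{\infty} \frac{(-1)^k}{2k+1} \left( z^{2k+1} + \bar{z}^{2k+1} \right), \quad z \in \mathbb{D}. \tag{40} \]

Indeed, let $a_k$, $k \in \mathbb{Z}$, be the Fourier coefficients of $f_0$. Then we have, for k ≠ 0,

\[ a_k = \frac{1}{2\pi} \int_{-\pi/2}^{\pi/2} e^{-ikt} \, dt = \frac{e^{ik\pi/2} - e^{-ik\pi/2}}{2k\pi i} = \frac{e^{i|k|\pi/2} - e^{-i|k|\pi/2}}{2|k|\pi i}
 = \begin{cases} \dfrac{(-1)^j}{(2j+1)\pi}, & \text{if } |k| = 2j+1 \text{ for some } j \in \mathbb{N}_0, \\ 0 & \text{for other } k \in \mathbb{Z} \setminus \{0\}. \end{cases} \]

Moreover, $a_0 = 1/2$. This implies (40).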
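
The coefficient formula and the expansion (40) are easy to check numerically. The sketch below compares a midpoint-rule approximation of $a_k$ with the closed form and evaluates a partial sum of (40); the function names, the quadrature and the truncation length are illustrative choices.

```python
import cmath
import math

def fourier_coefficient(k, steps=20000):
    """Midpoint-rule value of (1/(2 pi)) * int_{-pi/2}^{pi/2} exp(-i k t) dt."""
    h = math.pi / steps
    total = 0j
    for i in range(steps):
        t = -math.pi / 2 + (i + 0.5) * h
        total += cmath.exp(-1j * k * t)
    return total * h / (2 * math.pi)

def closed_form(k):
    """1/2 for k = 0, (-1)^j / ((2j+1) pi) for |k| = 2j+1, and 0 for other k."""
    if k == 0:
        return 0.5
    if k % 2 == 0:
        return 0.0
    j = (abs(k) - 1) // 2
    return (-1) ** j / ((2 * j + 1) * math.pi)

def harmonic_extension(z, terms=200):
    """Partial sum of the series (40) at a point z of the open unit disk."""
    s = 0.5
    for k in range(terms):
        n = 2 * k + 1
        s += ((-1) ** k / (math.pi * n)) * (z ** n + z.conjugate() ** n)
    return s
```

On the real axis the series in (40) sums to $f(x) = \tfrac12 + \tfrac{2}{\pi}\arctan x$, which gives an independent value to compare `harmonic_extension` against.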

Next we define the test functions, which will be used in showing the unboundedness
of the Toeplitz operator: we set

\[ g_m(z) = \frac{r^m e^{im\theta}}{r_m^m v(r_m)}, \quad z = re^{i\theta} \in \mathbb{D}, \]

for all $m \in \mathbb{N}_0$, where the definition of the maximum point $r_m$ was given in the
beginning of the section, so that we obviously have $\|g_m\|_v = 1$. We next show that
for all $m \in \mathbb{N}_0$ there holds

\[ T_f g_m(z) = \sum_{k=0}^{m} b_{k-m} \frac{\Gamma_{2m}}{\Gamma_{2k}} \frac{z^k}{r_m^m v(r_m)} + \sum_{k=m+1}^{\infty} b_{k-m} \frac{z^k}{r_m^m v(r_m)} \tag{41} \]

where $f(z) = \sum_{k=-\infty}^{\infty} b_k r^{|k|} e^{ik\theta}$ and $\Gamma_k$ is as in (29). Indeed, this follows from

\[ f(z) g_m(z) = \sum_{j \in \mathbb{Z}} b_j \frac{r^{m+|j|} e^{i(j+m)\theta}}{r_m^m v(r_m)}
 = \sum_{k=m+1}^{\infty} b_{k-m} \frac{r^{k} e^{ik\theta}}{r_m^m v(r_m)} + \sum_{k=-\infty}^{m} b_{k-m} \frac{r^{2m-k} e^{ik\theta}}{r_m^m v(r_m)} \]

and (36).
Let us now turn to the final step of the proof, showing that $T_f$ is unbounded on $H^\infty_v(\mathbb{D})$. We
define

\[ f_1(z) = \sum_{j=0}^{\infty} \frac{(-1)^j}{2j+1} \left( z^{2j+1} + \bar{z}^{2j+1} \right) \]

and note that it suffices to show that $T_{f_1}$ is unbounded, since $T_f = T_{1/2} + \pi^{-1} T_{f_1}$
and $T_{1/2}$ (multiplication by the constant 1/2) is bounded. Fix a positive integer m, say
$m = 4m_0$ for $m_0 \in \mathbb{N}$. Then k − m has the same parity as k, and $j - 2m_0$ has the
same parity as j.

We apply formula (41) with $b_k = 0$ if k is even, and with $b_{\pm(2j+1)} = (-1)^j/(2j+1)$,
$j \in \mathbb{N}_0$, if k is odd, to obtain

\[ T_{f_1} g_m(z) = \sum_{\substack{k=0, \\ k \text{ odd}}}^{m} b_{k-m} \frac{\Gamma_{2m}}{\Gamma_{2k}} \frac{z^k}{r_m^m v(r_m)} + \sum_{\substack{k=m+1, \\ k \text{ odd}}}^{\infty} b_{k-m} \frac{z^k}{r_m^m v(r_m)}. \tag{42} \]
k odd k odd
Next, if S is the operator $Sf(z) = (f(z) - i f(iz))/2$, we have

\[ Sf(z) = \sum_{k=0}^{\infty} f_{4k+1} z^{4k+1} \quad \text{for} \quad f(z) = \sum_{k=0}^{\infty} f_{2k+1} z^{2k+1}, \tag{43} \]

since $1 - i \cdot i^{2k+1} = 1 + (-1)^k$. We obtain

\[ S T_{f_1} g_m(z) = \sum_{0 \le 4j+1 \le m} b_{4j+1-m} \frac{\Gamma_{2m}}{\Gamma_{8j+2}} \frac{z^{4j+1}}{r_m^m v(r_m)} + \sum_{m+1 \le 4j+1 < \infty} b_{4j+1-m} \frac{z^{4j+1}}{r_m^m v(r_m)}. \]

Recall that $b_{4j+1-m} = 1/|4(j - m_0) + 1|$. So if we take θ = 0 then all summands
in the preceding sum are non-negative. Hence

\[ \frac{r_m}{5} \log \frac{1}{1 - r_m^4} = \frac{r_m}{5} \sum_{j=1}^{\infty} \frac{(r_m^4)^j}{j} \le \sum_{j=0}^{\infty} \frac{r_m^{4j+1}}{4j+1}
 = \sum_{m+1 \le 4j+1 < \infty} b_{4j+1-m} \frac{r_m^{4j+1} v(r_m)}{r_m^m v(r_m)} \]
\[ \le S(T_{f_1}(g_m))(r_m) \, v(r_m) \le \| S(T_{f_1}(g_m)) \|_v \le \| T_{f_1}(g_m) \|_v, \]

since trivially by the definition of the operator S we have $\sup_{|z|=r} |(Sf)(z)| \le \sup_{|z|=r} |f(z)|$.
Since $\lim_{m \to \infty} r_m = 1$, the left-hand side of the preceding estimate
tends to infinity when m → ∞. Hence $T_{f_1}$, and thus also $T_f$, cannot be bounded.
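
Two elementary ingredients of the proof above can be checked directly: the operator S from (43), which keeps only the powers $z^{4k+1}$ of an odd power series, and the logarithmic lower bound, which blows up as r → 1. The helper names below are hypothetical, and the test polynomial is an arbitrary example.

```python
import math

def apply_S(f):
    """Return the function z -> (f(z) - i f(iz)) / 2, as in the proof of Theorem 5."""
    return lambda z: (f(z) - 1j * f(1j * z)) / 2

def odd_polynomial(coeffs):
    """Polynomial sum_k coeffs[k] * z^(2k+1)."""
    return lambda z: sum(c * z ** (2 * k + 1) for k, c in enumerate(coeffs))

def log_lower_bound(r):
    """Left-hand side (r/5) * log(1 / (1 - r^4)) of the final estimate."""
    return r / 5 * math.log(1 / (1 - r ** 4))

def series_4j_plus_1(r, terms=5000):
    """Partial sum of sum_{j>=0} r^(4j+1) / (4j+1)."""
    return sum(r ** (4 * j + 1) / (4 * j + 1) for j in range(terms))
```

For instance, applying `apply_S` to $z + 2z^3 + 3z^5 + 4z^7$ returns $z + 3z^5$, and `log_lower_bound(r)` exceeds any fixed bound once r is close enough to 1, which is how the norms $\|T_{f_1} g_m\|_v$ are forced to be unbounded.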

5 General Result on Multipliers and Toeplitz Operators in $H^\infty_v(\mathbb{D})$ with Radial Symbols

We continue by considering a fixed radial weight v on $\mathbb{D}$ and Toeplitz operators
$T_f : H^\infty_v(\mathbb{D}) \to H^\infty_v(\mathbb{D})$, where $T_f = P_v M_f$. A function with radial symmetry
on the disk can nearly never be harmonic, and the study of Toeplitz operators with
radial symbols requires techniques different from those in Sect. 4. First we note that
if $f \in L^1(\mathbb{D})$ is radial, i.e. f(z) = f(|z|) for almost every $z \in \mathbb{D}$, then $T_f$ is a
coefficient multiplier. This is easily seen by expanding the kernel as in (32) and a
calculation using the usual orthogonality relations of trigonometric polynomials,

\[ T_f g(z) = \sum_{n=0}^{\infty} \frac{z^n}{\Gamma_{2n}} \int_0^1 \int_0^{2\pi} f(r) g(re^{i\theta}) r^{n+1} e^{-in\theta} v(r) \, d\theta \, dr
 = \sum_{n=0}^{\infty} \frac{2\pi z^n}{\Gamma_{2n}} \int_0^1 f(r) r^{2n+1} v(r) g_n \, dr = \sum_{n=0}^{\infty} \gamma_n g_n z^n \tag{44} \]

where $g(z) = \sum_n g_n z^n$ and

\[ \gamma_n = \frac{2\pi}{\Gamma_{2n}} \int_0^1 r^{2n+1} v(r) f(r) \, dr. \tag{45} \]
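
The multiplier sequence in (45) can be computed numerically for concrete radial data. The sketch below treats $\gamma_n$ as the ratio of the radial moments of order 2n+1 of fv and of v, which is how (45) reads once $\Gamma_{2n}$ from (29) is inserted and the normalisation is fixed so that f ≡ 1 gives the identity operator. The midpoint rule and the function names are assumptions made for illustration.

```python
def radial_moment(h, power, steps=20000):
    """Midpoint-rule value of int_0^1 r^power h(r) dr."""
    step = 1.0 / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * step
        total += r ** power * h(r)
    return step * total

def multiplier_gamma(f, v, n):
    """gamma_n as the ratio of the (2n+1)-st radial moments of f*v and of v."""
    return radial_moment(lambda r: f(r) * v(r), 2 * n + 1) / radial_moment(v, 2 * n + 1)
```

With $v(r) = 1 - r^2$ and $f(r) = r^2$, a short beta-function computation gives $\gamma_n = (n+1)/(n+3)$, so the multipliers increase to 1; this is the kind of sequence to which the criterion of this section is later applied.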

We present here the approach based mainly on the works [17], [19] and [20]
dealing with the condition (B), below, which according to Theorem 1.1 of [19]
characterizes those radial weights such that the space $H^\infty_v(\mathbb{D})$ is isomorphic to the
Banach space $\ell^\infty$. Examples of weights satisfying (B) are all normal weights (38),
in particular the standard weights, and the weights of exponential type
$v(r) = \exp(-\gamma/(1-r)^\beta)$; see [19].
The very definition of condition (B) is somewhat technical and we cannot quite
avoid other technical considerations in this survey either. However, one can follow
our presentation without going into the depth of the arguments just by keeping in
mind that condition (B) associates to the weight an increasing sequence of indices
$(m_n)_{n=1}^{\infty} \subset (0, \infty)$ and radii $(r_{m_n})_{n=1}^{\infty} \subset (0,1)$ such that $m_n \to \infty$ and $r_{m_n} \to 1$ as
n → ∞, and moreover, gives the very useful equivalent representation in Theorem 6
for the weighted sup-norm. We recall that the numbers $r_m \in \, ]0,1[$ were defined in
the beginning of Sect. 4.
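Definition 2 The weight v satisfies the condition (B), if

\[ \forall b_1 > 1 \ \exists b_2 > 1 \ \exists c > 0 \ \forall m, n > 0: \]
\[ \Big( \frac{r_m}{r_n} \Big)^{m} \frac{v(r_m)}{v(r_n)} \le b_1 \ \text{ and } \ m, n, |m - n| \ge c \quad \Longrightarrow \quad \Big( \frac{r_n}{r_m} \Big)^{n} \frac{v(r_n)}{v(r_m)} \le b_2. \]

Note that here m and n need not be integers. We now fix a number b > 2: it
is shown in Lemma 5.1 of [19] that it is then possible to choose, by induction, an
increasing, unbounded sequence $(m_n)_{n=1}^{\infty} \subset (0, \infty)$ such that

\[ b = \min\left( \Big( \frac{r_{m_n}}{r_{m_{n+1}}} \Big)^{m_n} \frac{v(r_{m_n})}{v(r_{m_{n+1}})}, \ \Big( \frac{r_{m_{n+1}}}{r_{m_n}} \Big)^{m_{n+1}} \frac{v(r_{m_{n+1}})}{v(r_{m_n})} \right). \]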

Next, for all $n \in \mathbb{N}$, for the given $m_n$, we define

\[ w_{nk} = \begin{cases} \dfrac{|k| - [m_{n-1}]}{[m_n] - [m_{n-1}]}, & \text{if } m_{n-1} < |k| \le m_n, \text{ and} \\[2ex] \dfrac{[m_{n+1}] - |k|}{[m_{n+1}] - [m_n]}, & \text{if } m_n < |k| \le m_{n+1}, \end{cases} \tag{46} \]

where $k \in \mathbb{Z}$ and $m_0 = 0$. Here [r] is the largest integer not greater than r. With the
help of these numbers we define the coefficient multipliers of de la Vallée Poussin
type, acting on harmonic functions $f(z) = \sum_{k=-\infty}^{\infty} f_k r^{|k|} e^{ik\theta}$, by

\[ W_n : \sum_{k=-\infty}^{\infty} f_k r^{|k|} e^{ik\theta} \mapsto \sum_{k=-\infty}^{\infty} w_{nk} f_k r^{|k|} e^{ik\theta}. \]
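
The weights (46) are piecewise linear "tent" functions of |k|, and a short sketch makes their shape explicit. It assumes that $w_{nk} = 0$ outside the two ranges listed in (46), and it uses the hypothetical sequence $m_n = 2^n$ purely for illustration; harmonic polynomials are represented as dictionaries mapping k to $f_k$.

```python
import math

def w(n, k, m):
    """Weight w_{nk} from (46); m[j] stands for m_j, with m[0] = 0.

    Assumed to be zero outside the two ranges in (46); the floors of
    consecutive m_j must differ so that no denominator vanishes.
    """
    a = abs(k)
    lo, mid, hi = math.floor(m[n - 1]), math.floor(m[n]), math.floor(m[n + 1])
    if m[n - 1] < a <= m[n]:
        return (a - lo) / (mid - lo)
    if m[n] < a <= m[n + 1]:
        return (hi - a) / (hi - mid)
    return 0.0

def apply_W(n, coeffs, m):
    """Apply W_n to a harmonic polynomial given as a dict {k: f_k}."""
    out = {}
    for k, fk in coeffs.items():
        wk = w(n, k, m)
        if wk != 0.0:
            out[k] = wk * fk
    return out
```

With integer $m_n$, consecutive tents overlap so that $w_{n-1,k} + w_{nk} = 1$ for $m_{n-1} < |k| \le m_n$ and n ≥ 2, so the operators $W_n$ split the frequencies beyond $m_1$ into overlapping finite blocks that add up to the identity there.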

We will need the following uniform boundedness property of the operators $W_n$:
there exists a constant C > 0, depending on the weight only, such that

\[ M_\infty(W_n g, r) \le C M_\infty(g, r) \tag{47} \]

for all 0 ≤ r ≤ 1 and $g \in h^\infty_v(\mathbb{D})$. See (37) for the notation. The inequality (47)
follows e.g. by combining an inequality in Theorem 1 of [20] with Lemma 3.3
of [19].
The operators Wn are important, since they decompose the space Hv∞ (D) into
finite dimensional blocks with a useful representation for the norm. The result is
from Theorem 1 of [20], see also Propositions 4.1. and 5.2. of [19].
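Theorem 6 Let v satisfy (B). Then there are constants $c_1, c_2 > 0$ such that, for all
$g \in h^\infty_v(\mathbb{D})$,

\[ c_1 \sup_{n \in \mathbb{N}} M_\infty(W_n g, r_{m_n}) v(r_{m_n}) \le \|g\|_v \le c_2 \sup_{n \in \mathbb{N}} M_\infty(W_n g, r_{m_n}) v(r_{m_n}) \tag{48} \]

and

\[ c_1 M_\infty(W_n g, r_{m_n}) v(r_{m_n}) \le \|W_n g\|_v \le c_2 M_\infty(W_n g, r_{m_n}) v(r_{m_n}) \tag{49} \]

for all $n \in \mathbb{N}$.
Moreover, it follows from Theorem 6 that if the numbers $f_k \in \mathbb{C}$, $k \in \mathbb{Z}$, satisfy

\[ \sup_{n \in \mathbb{N}} \sup_{\theta \in [0, 2\pi]} \Big| \sum_{m_{n-1} < |k| \le m_{n+1}} w_{nk} f_k r_{m_n}^{|k|} e^{ik\theta} \Big| \, v(r_{m_n}) < \infty, \tag{50} \]

then the series defining the harmonic function $f(re^{i\theta}) = \sum_{k=-\infty}^{\infty} f_k r^{|k|} e^{ik\theta}$
converges uniformly on compact subsets of $\mathbb{D}$, f belongs to $h^\infty_v(\mathbb{D})$, and $\|f\|_v$ is
bounded by a constant, depending on the weight v, times the left-hand side of (50). For this statement, see Remark
1, (iii) of [20].
Examples If v is normal then one can take $m_n = 2^{kn}$ for a suitable fixed k > 0
(see [19, Example 2.4], and [17]). For $v(r) = \exp(-\alpha/(1-r)^\beta)$ one can take
$m_n = \beta^2 (\beta/\alpha)^{1/\beta} n^{2+2/\beta} - \beta^2 n^2$, see [2].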
We now formulate one of the main results of this section, the characterization
of boundedness and compactness for coefficient multipliers. The case of Toeplitz
operators with radial symbols follows easily from this. The result was already
proven for a more restricted class of weights in Theorem 4.1 of [18]. We will assume
that a sequence $(\gamma_k)_{k=0}^{\infty}$ of complex numbers is given, and consider the formal series
$f(\theta) = \sum_{k=0}^{\infty} \gamma_k e^{ik\theta}$, which may or may not converge. The formal series $W_n f$ is
then naturally defined as

\[ W_n f(\theta) = \sum_{k=0}^{\infty} w_{nk} \gamma_k e^{ik\theta} \]

where the numbers $w_{nk}$ are as in (46). We denote by $M_f$ the coefficient multiplier

\[ M_f g(z) = \sum_{k=0}^{\infty} \gamma_k g_k r^{k} e^{ik\theta}, \quad z = re^{i\theta}, \tag{51} \]

for harmonic functions $g(z) = \sum_{k=-\infty}^{\infty} g_k r^{|k|} e^{ik\theta}$. By definition, $M_f g$ is holomorphic,
if the series (51) converges.
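Theorem 7 Let the weight v satisfy condition (B). Then $M_f$ maps $h^\infty_v(\mathbb{D})$ into
$H^\infty_v(\mathbb{D})$ and is bounded, if and only if

\[ \sup_{n \in \mathbb{N}} \int_0^{2\pi} |(W_n f)(\theta)| \, d\theta < \infty. \tag{52} \]

Moreover, assume (52) holds. Then $M_f : h^\infty_v(\mathbb{D}) \to H^\infty_v(\mathbb{D})$ is compact, if and
only if

\[ \int_0^{2\pi} |(W_n f)(\theta)| \, d\theta \to 0 \quad \text{as } n \to \infty. \tag{53} \]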

We present here the proof of the boundedness statement, comment on the compact case only briefly and refer to [4] for the details. Let us first prove that (52) implies the boundedness of the operator. By (46), for every $n$ there are only finitely many non-zero $w_{nk}$, hence, we can write $M_{W_n f}$, cf. (51), as a convolution

\[ M_{W_n f}\, g(z) = \frac{1}{2\pi} \int_0^{2\pi} W_n f(\theta - \psi)\, g(re^{i\psi})\, d\psi, \quad z = re^{i\theta} \in \mathbb{D}. \]

We obtain the estimate

\[ |M_{W_n f}\, g(z)|\, v(r) \le \frac{1}{2\pi} \int_0^{2\pi} |(W_n f)(\theta)|\, d\theta\; \|g\|_v \tag{54} \]

for all $g \in h_v^\infty(\mathbb{D})$. Hence,

\[ M_\infty(M_{W_n f}\, g, r)\, v(r) \le C \|g\|_v \]

for all $n$ and $r$, where the constant $C > 0$ is the supremum on the left-hand side of (52). According to the remark concerning (50), the series on the right-hand side of (51) converges uniformly on compact subsets of $\mathbb{D}$ and defines an element of $H_v^\infty(\mathbb{D})$ whose norm is bounded by a constant multiple of $\|g\|_v$. This means that $M_f$ maps $h_v^\infty(\mathbb{D})$ continuously into $H_v^\infty(\mathbb{D})$.
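The passage from the convolution formula to (54) takes only a few lines. The sketch below assumes that the norm is the usual weighted sup-norm $\|g\|_v = \sup_{z \in \mathbb{D}} |g(z)|\, v(|z|)$ (its definition is not restated in this passage) and uses the $2\pi$-periodicity of $W_n f$.

```latex
% Assumption: \|g\|_v = \sup_{z\in\mathbb{D}} |g(z)|\,v(|z|).
% Step 0: the convolution reproduces the multiplier, because
\[
\frac{1}{2\pi}\int_0^{2\pi} e^{ik(\theta-\psi)}\, r^{|j|} e^{ij\psi}\, d\psi
  = \begin{cases} r^{|k|} e^{ik\theta}, & j = k, \\ 0, & j \neq k. \end{cases}
\]
% Step 1: the pointwise estimate for z = re^{i\theta}.
\begin{align*}
|M_{W_n f}\, g(re^{i\theta})|\, v(r)
 &\le \frac{1}{2\pi}\int_0^{2\pi} |W_n f(\theta-\psi)|\, |g(re^{i\psi})|\, v(r)\, d\psi \\
 &\le \frac{1}{2\pi}\int_0^{2\pi} |W_n f(\theta-\psi)|\, d\psi \;\cdot\; \sup_{\psi} |g(re^{i\psi})|\, v(r) \\
 &\le \frac{1}{2\pi}\int_0^{2\pi} |(W_n f)(\theta)|\, d\theta \;\cdot\; \|g\|_v .
\end{align*}
```

The last line uses that translating a $2\pi$-periodic integrand does not change its integral over a full period.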
As for the compactness of the operator $M_f$ under the assumption (53), one takes a sequence $(g_j)_{j=1}^\infty$ contained in the closed unit ball of $h_v^\infty(\mathbb{D})$ and converging to 0 uniformly on compact subsets of $\mathbb{D}$. One needs to show that $M_f$ maps such a sequence to a sequence converging to 0 with respect to the norm; see for example [26, Section 2.4]. Roughly speaking, one can improve the boundedness proof to get this, by using the assumption (53) together with the assumption on the convergence in the compact-open topology. One needs a more sophisticated use of Theorem 6.
As usual, the proof for the necessity of the condition (52) for the boundedness requires a careful enough choice of appropriate test functions. To this end we fix an arbitrary $0 < \varepsilon < 1$ as well as $n \in \mathbb{N}$ and $\varphi \in [0, 2\pi]$. Using the Fejér approximation theorem we find a trigonometric polynomial $g(z) = \sum_{k\in\mathbb{Z}} g_k r^{|k|} e^{ik\theta}$, depending on $n$, $\varphi$ and $\varepsilon$, such that

\[ \Bigl| g(r_{m_n} e^{i\theta}) - \frac{\overline{W_n f(\varphi - \theta)}}{|W_n f(\varphi - \theta)|\, v(r_{m_n})} \Bigr| < \frac{\varepsilon}{v(r_{m_n})} \tag{55} \]

for all $\theta \in [0, 2\pi]$, in particular

\[ M_\infty(g, r_{m_n})\, v(r_{m_n}) \le 2. \tag{56} \]

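As a consequence,

\begin{align*}
\frac{1}{2\pi} \int_0^{2\pi} |(W_n f)(\theta)|\, d\theta
 &= \frac{1}{2\pi} \int_0^{2\pi} |(W_n f)(\varphi - \theta)|\, d\theta \\
 &\le \Bigl| \frac{1}{2\pi} \int_0^{2\pi} (W_n f)(\varphi - \theta)\, g(r_{m_n} e^{i\theta})\, d\theta \Bigr|\, v(r_{m_n}) + \varepsilon \\
 &= \Bigl| \frac{1}{2\pi} \int_0^{2\pi} f(\varphi - \theta)\, (W_n g)(r_{m_n} e^{i\theta})\, d\theta \Bigr|\, v(r_{m_n}) + \varepsilon \\
 &= |M_f W_n g(r_{m_n} e^{i\varphi})|\, v(r_{m_n}) + \varepsilon \le \|M_f\| \cdot \|W_n g\|_v + \varepsilon. \tag{57}
\end{align*}
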
Using Theorem 6 and (47), (56) we find a constant $C > 0$ such that

\[ \|W_n g\|_v \le c_2 M_\infty(W_n g, r_{m_n})\, v(r_{m_n}) \le c_2 C\, M_\infty(g, r_{m_n})\, v(r_{m_n}) \le 2 C c_2. \]

Hence $\sup_n \int_0^{2\pi} |(W_n f)(\theta)|\, d\theta < \infty$.
The proof for the necessity of the condition (53) for the compactness of Mf
needs a number of additional technical details.
Since the Riesz projection $P$, see (39), is bounded under the assumptions of Theorem 7, it follows that the boundedness and compactness of $M_f : H_v^\infty(\mathbb{D}) \to H_v^\infty(\mathbb{D})$ are also equivalent to (52) and (53), respectively.
Let us turn back to Toeplitz operators. Let $T_a$ be a Toeplitz operator on $H_v^\infty(\mathbb{D})$ with a given radial symbol $a \in L^1(\mathbb{D})$, i.e. $a(z) = a(|z|)$ for almost every $z$. Then, defining

\[ \gamma_k = \frac{1}{\Gamma_{2k}} \int_0^1 r^{2k+1} v(r)\, a(r)\, dr, \quad k \in \mathbb{N}_0, \quad \text{and} \quad f_a(\theta) = \sum_{k=0}^\infty \gamma_k e^{ik\theta}, \tag{58} \]

it was shown in (44)–(45) that $T_a$ coincides with the Taylor multiplier with coefficients $(\gamma_k)_{k=0}^\infty$. The previous theorem thus yields the main result on the boundedness and compactness.
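Theorem 8 Let the weight satisfy (B). If $a \in L^1$ is radial then $T_a$ is bounded as an operator $H_v^\infty(\mathbb{D}) \to H_v^\infty(\mathbb{D})$ if and only if

\[ \sup_n \int_0^{2\pi} |(W_n f_a)(\theta)|\, d\theta < \infty, \tag{59} \]

and $T_a$ is a compact operator $H_v^\infty(\mathbb{D}) \to H_v^\infty(\mathbb{D})$, if and only if

\[ \lim_{n\to\infty} \int_0^{2\pi} |(W_n f_a)(\theta)|\, d\theta = 0. \tag{60} \]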

We finally recall that Theorems 1.1 and 3.3 of the article [21] contain necessary and sufficient conditions for the boundedness and compactness of $T_a : A_v^p(\mathbb{D}) \to A_v^p(\mathbb{D})$ for $1 < p < \infty$, with minimal assumptions on the radial weights $v$. However, the characterization is in terms of the boundedness of coefficient multipliers in Hardy spaces, which is another open problem.

6 Supplementary Results on Toeplitz Operators with Radial Symbols

According to Theorem 5, the boundedness of the symbol does not suffice to imply the boundedness of the Toeplitz operator $T_f : H_v^\infty(\mathbb{D}) \to H_v^\infty(\mathbb{D})$. In
this section we continue working with radial symbols and present results, where
additional regularity or decay of the symbol at the boundary of the disk D implies
the boundedness of Ta . The proofs are based on Theorem 8, although here we will
only sketch some ideas of them.
In Theorem 8, the conditions for the boundedness and compactness of the Toeplitz operator may not be easy to verify for concrete weights and symbols, but the results of this section also serve the purpose of presenting some sufficient conditions that are quite easy to formulate and control. The setting for the spaces and symbols is the same as in the previous section, but in addition to condition (B) we also assume that, for some $\varepsilon > 0$, $v$ satisfies the following technical condition

\[ \sup_{n\in\mathbb{N}} \frac{\int_0^1 r^{n - n^{\varepsilon}} v(r)\, dr}{\int_0^1 r^{n} v(r)\, dr} < \infty. \tag{61} \]

It is not difficult to see that (61) holds for example for the important classes of standard, normal and exponential weights. For normal weights, condition (61) with $\varepsilon = 1/2$ follows from Lemma 4.5 of [3]. In the case $v(r) = \exp(-1/(1-r))$ it is known that $\int_0^1 r^m v(r)\, dr$, $m > 1$, is proportional to the quantity $m^{-3/4} \exp(-B m^{1/2})$ for some constant $B > 0$ independent of $m$ (see e.g. Lemma 2.2 in [7] or Lemma 4.28 in [1]). Hence, assuming $\varepsilon < 1/2$ we obtain

\[ \int_0^1 r^{n - n^{\varepsilon}} v(r)\, dr \le C (n - n^{\varepsilon})^{-3/4} \exp\bigl(-B (n - n^{\varepsilon})^{1/2}\bigr) \le C' n^{-3/4} \exp\bigl(-B n^{1/2} + C''\bigr) \le C''' \int_0^1 r^{n} v(r)\, dr \]

for some positive constants $C$, $C'$, etc., since

\begin{align*}
(n - n^{\varepsilon})^{1/2} &= n^{1/2} \bigl(1 - n^{\varepsilon - 1}\bigr)^{1/2} = n^{1/2} \Bigl(1 - \frac{1}{2} n^{\varepsilon - 1} + O(n^{2\varepsilon - 2})\Bigr) \\
 &= n^{1/2} - \frac{1}{2} n^{\varepsilon - 1/2} + O(n^{2\varepsilon - 3/2}) \ge n^{1/2} - C
\end{align*}

for all $n$. Thus, (61) holds. The same argument works for the more general weights $v(r) = \exp(-\alpha/(1-r)^\beta)$, $\alpha, \beta > 0$.

It was proven in [19] that normal and exponential weights satisfy (B).
Theorem 9 Let $v$ satisfy (B) and (61) and assume that the symbol $a \in L^1$ is real valued and radial. The operator $T_a$ is a bounded operator $H_v^\infty(\mathbb{D}) \to H_v^\infty(\mathbb{D})$ in any of the following cases:
(i) The restriction $a|_{[\delta,1[}$ is differentiable (with respect to $r$) for some $\delta \in\, ]0,1[$ and there holds

\[ \limsup_{r\to1} a'(r) < \infty \quad \text{or} \quad \liminf_{r\to1} a'(r) > -\infty, \tag{62} \]

and, in addition,

\[ \limsup_{r\to1} |a(r) \log(1-r)| < \infty. \tag{63} \]

(ii) The restriction $a|_{[\delta,1[}$ is differentiable for some $\delta \in\, ]0,1[$, $a$ satisfies (62) and, for some constant $C > 0$, there holds the bound

\[ |a'(r)| \le \frac{C}{(1-r) \log^2(1-r)} \quad \text{for } r \in\, ]\delta, 1[. \tag{64} \]

(iii) The symbol $a$ is continuously differentiable on $[0,1]$.

Theorem 9 holds also in the case of complex valued symbols $a$, namely, the assumptions need to be satisfied by both $\operatorname{Re} a$ and $\operatorname{Im} a$.
The symbol $a(r) = 1/(1 - \log(1-r))$ satisfies the first condition in (62) and, of course, (63) so that $T_a : H_v^\infty(\mathbb{D}) \to H_v^\infty(\mathbb{D})$ is bounded. The same is true for $a(r) = (1-r)^\delta$ with any $\delta > 0$. The latter symbol even induces a compact operator, as can be seen by the next result.
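The two example symbols can be checked by hand. In the sketch below the names $a_1$ and $a_2$ are introduced only for this computation, and (62) is read as a one-sided bound on the derivative $a'$ near $r = 1$.

```latex
% a_1, a_2 are labels chosen here for the two example symbols.
\begin{align*}
a_1(r) &= \frac{1}{1 - \log(1-r)}, &
a_1'(r) &= -\frac{1}{(1-r)\,\bigl(1 - \log(1-r)\bigr)^2} < 0, \\
|a_1(r)\log(1-r)| &= \frac{|\log(1-r)|}{1 + |\log(1-r)|} < 1, &
\lim_{r\to1} |a_1(r)\log(1-r)| &= 1; \\[4pt]
a_2(r) &= (1-r)^{\delta}, &
a_2'(r) &= -\delta\,(1-r)^{\delta-1} \le 0, \\
|a_2(r)\log(1-r)| &= (1-r)^{\delta}\,|\log(1-r)|, &
\lim_{r\to1} |a_2(r)\log(1-r)| &= 0 .
\end{align*}
```

Both derivatives are negative, so $\limsup_{r\to1} a'(r) \le 0 < \infty$ in each case, and both products with $\log(1-r)$ stay bounded. Only for $a_2$ does the product tend to zero, which is the stronger decay that the compactness statement asks for.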
Theorem 10 Let $v$ satisfy (B) and (61) and assume that the symbol $a \in L^1$ is real valued and radial.
(i) If the restriction $a|_{[\delta,1[}$ is differentiable for some $\delta \in\, ]0,1[$, satisfies (62) and, in addition,

\[ \limsup_{r\to1} |a(r) \log(1-r)| = 0, \tag{65} \]

then the operator $T_a : H_v^\infty(\mathbb{D}) \to H_v^\infty(\mathbb{D})$ is compact.

(ii) Assume that the restriction $a|_{[\delta,1[}$ is differentiable for some $\delta \in\, ]0,1[$, satisfies (62), and there holds

\[ \lim_{r\to1} |a'(r)|\,(1-r) \log^2(1-r) = 0. \tag{66} \]

Then $T_a$ is compact, if and only if $\lim_{r\to1} a(r) = 0$.

Here, the case of complex valued symbols can be treated in the same way as in the previous theorem.
The item (i) in both Theorems 9 and 10 follows from Theorem 8. We do not present the proof but only refer to [5]. Recall that the coefficients of the series $f_a$ in (59), (60) are given in (58), which involves integrals $\int_0^1 r^n a(r) v(r)\, dr$: the proofs of (i) of Theorems 9 and 10 are based on quite technical estimates and calculations with these expressions.
However, it is not so difficult to see that the sufficient condition (ii) essentially implies (i) in Theorem 9. Assume $a$ is real-valued and that (64) holds. For all $r \in\, ]\delta, 1[$ we get by the change of the integration variable $\log(1-s) =: x$ and $dx/ds = -1/(1-s)$ that

\[ \int_r^1 |a'(s)|\, ds \le C \int_r^1 \frac{1}{(1-s) \log^2(1-s)}\, ds = C \int_{-\infty}^{\log(1-r)} \frac{1}{x^2}\, dx = \frac{C}{|\log(1-r)|}. \tag{67} \]

This implies that we can extend $a$ as a continuous function to $]\delta, 1]$ by setting

\[ a(1) = \int_\delta^1 a'(s)\, ds + a(\delta) = \lim_{r\to1} a(r). \]

Now, (67) yields for all $r \in\, ]\delta, 1[$

\[ |a(r) - a(1)| = \Bigl| \int_r^1 a'(s)\, ds \Bigr| \le \frac{C}{|\log(1-r)|}, \tag{68} \]

which means that the function $a - a(1)$ satisfies (63). Note that the Toeplitz operator with the constant symbol $a(1)$ is bounded as it is just a constant multiplier.
It is plain that (iii) implies (ii) in Theorem 9.
Also, as regards Theorem 10, the assumptions in (ii) imply those of (i). Namely, if (66) holds, then we can repeat the calculation (67)–(68) so that the constant $C$ is replaced by a positive function $C(r)$ with $C(r) \to 0$ as $r \to 1$. Then, we see from the analogue of (68) that the function $a - a(1)$ even satisfies (65). If in addition $a(1) = 0$ then also $a$ satisfies (65). Note that if $\lim_{r\to1} a(r) = a(1) \neq 0$, then $T_a$ is a compact perturbation of a non-zero multiple of the identity, which is not compact, and thus it cannot be a compact operator.
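The perturbation argument in the last sentence can be written out as follows. This is a sketch under two assumptions that the text uses implicitly: the symbol map $a \mapsto T_a$ is linear, and a constant symbol $c$ gives $T_c = c\,I$.

```latex
% Assumptions: a \mapsto T_a is linear and T_c = c\,I for a constant symbol c.
\begin{align*}
T_a &= T_{a - a(1)} + a(1)\, I, \\
\text{hence}\quad a(1)\, I &= T_a - T_{a - a(1)} .
\end{align*}
% If T_{a-a(1)} is compact and T_a were compact as well, then a(1) I would be
% compact. For a(1) \neq 0 this would make the identity compact on the
% infinite-dimensional space H_v^\infty(\mathbb{D}), which is impossible.
```

So once $T_{a - a(1)}$ is known to be compact, compactness of $T_a$ forces $a(1) = 0$.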
In [5] it is shown that if $v$ is a normal weight, the assumptions on $a$ in the previous theorems can be relaxed, namely the boundedness of $T_a : H_v^\infty(\mathbb{D}) \to H_v^\infty(\mathbb{D})$ follows just from (63) and the compactness from (65) without any smoothness assumptions on the symbol. Also, in the case of exponential weights $v(r) = \exp(-\alpha/(1-r)^\beta)$, $\alpha, \beta > 0$, the smoothness requirements on $a$ can be dropped, namely, if

\[ \limsup_{r\to1} |a(r)|\,(1-r)^{-1/2-\beta/4} < \infty, \tag{69} \]

then $T_a : H_v^\infty(\mathbb{D}) \to H_v^\infty(\mathbb{D})$ is bounded, and if

\[ \limsup_{r\to1} |a(r)|\,(1-r)^{-1/2-\beta/4} = 0, \tag{70} \]

then $T_a$ is compact on $H_v^\infty(\mathbb{D})$.

Let us finally consider reflexive weighted Bergman spaces $A_v^p(\mathbb{D})$. For radial symbols, the boundedness of $T_a$ as an operator from the Bergman-Hilbert space $A_v^2(\mathbb{D})$ into itself is characterized by the condition

\[ \sup_{n\in\mathbb{N}} |\gamma_n| < \infty, \tag{71} \]

where the numbers $\gamma_n$ are as in (45). The idea of trying to characterize the boundedness and compactness of $T_a : A_v^p(\mathbb{D}) \to A_v^p(\mathbb{D})$ for $2 < p < \infty$ (or $1 < p < 2$) by interpolating does not seem to work, but one can derive a sufficient condition similar to (52) for the boundedness of $T_a$ in $A_v^p(\mathbb{D})$.
To formulate and sketch the proof of the result we need some modifications of the notions that were used in the case of weighted sup-norms. We again assume that the weight $v$ satisfies condition (B). First, instead of the de la Vallée Poussin operators it is enough just to use the Dirichlet projections $Q_n g(z) = \sum_{k=0}^n g_k z^k$ for holomorphic $g(z) = \sum_{k=0}^\infty g_k z^k$. It is known that there are constants $c_p > 0$ with $M_p(Q_n g, r) \le c_p M_p(g, r)$ for all $0 < r < 1$, $1 < p < \infty$, where $c_p$ does not depend on $g$, $n$ or $r$ and we write $M_p(g, r)^p = (2\pi)^{-1} \int_0^{2\pi} |g(re^{i\theta})|^p\, d\theta$.
Analogously with the case of weighted sup-norms one picks suitable increasing numerical sequences $(\ell_n)_{n=1}^\infty$ with $\ell_1 = 0$ and $\lim_{n\to\infty} \ell_n = \infty$ and $(s_n)_{n=1}^\infty \subset (0,1)$ with $\lim_{n\to\infty} s_n = 1$ and then defines the operators

\[ Z_n = Q_{[\ell_{n+1}]} - Q_{[\ell_n]}, \quad n \in \mathbb{N}. \]

These are used to derive an equivalent form of the weighted $L^p$-norm: for some constants $c_2 \ge c_1 > 0$, for every $f \in A_v^p(\mathbb{D})$, there holds

\[ c_1 \|f\|_{p,v} \le \Bigl( \sum_{n=1}^\infty \omega_n^p\, M_p(Z_n f, s_n)^p \Bigr)^{1/p} \le c_2 \|f\|_{p,v}, \tag{72} \]

where the numbers $\omega_n$ are determined by the weight. The details of the definitions of the various parameters and proof of (72) can be found in [13] for $p = 1$ and in [20] for $1 < p < \infty$. Examples and calculations in concrete cases can be found in the paper [3]: there it is shown that one can obtain (72) for the exponential weights $v(r) = \exp(-\alpha/(1-r)^\beta)$, $\alpha, \beta > 0$, by using

\[ \ell_n = \beta^{1+1/\beta} \alpha^{-1/\beta} n^{2+2/\beta} - \beta n^2, \quad s_n = 1 - \Bigl(\frac{\alpha}{\beta}\Bigr)^{1/\beta} \frac{1}{n^{2/\beta}}. \tag{73} \]

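Proposition 2 Let the weight satisfy (B), let $a \in L^1$ be a radial function and let $f_a(\theta) = \sum_{k=0}^\infty \gamma_k e^{ik\theta}$ be as in (45). Then the Toeplitz operator $T_a : A_v^p(\mathbb{D}) \to A_v^p(\mathbb{D})$ is a well-defined, bounded operator, if

\[ \sup_{n\in\mathbb{N}} \int_0^{2\pi} |(Z_n f_a)(\theta)|\, d\theta < \infty, \tag{74} \]

and $T_a : A_v^p(\mathbb{D}) \to A_v^p(\mathbb{D})$ is compact, if

\[ \int_0^{2\pi} |(Z_n f_a)(\theta)|\, d\theta \to 0 \quad \text{as } n \to \infty. \tag{75} \]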

Proof Let $M_f$ be the convolution operator, or the sequence space multiplier, corresponding to $T_a$, see (45). For all $g \in A_v^p(\mathbb{D})$ and $z = re^{i\theta} \in \mathbb{D}$ we get

\[ (Z_n M_f g)(z) = (M_{Z_n f}\, g)(z) = \int_0^{2\pi} Z_n f(\theta - \psi)\, Z_n g(re^{i\psi})\, d\psi, \]

where we replaced $g$ by $Z_n g$ by the usual orthogonality relations of trigonometric monomials. The Young inequality $\|a * b\|_{L^p(\partial\mathbb{D})} \le \|a\|_{L^1(\partial\mathbb{D})} \|b\|_{L^p(\partial\mathbb{D})}$ yields

\[ M_p(Z_n M_f g, r) \le \int_0^{2\pi} |(Z_n f)(\theta)|\, d\theta\; M_p(Z_n g, r). \tag{76} \]

The inequality $\|M_f g\|_{p,v} \le C \|g\|_{p,v}$ thus follows by applying (74) and (72) to both $\|M_f g\|_{p,v}$ and $\|g\|_{p,v}$, and this implies the boundedness of $T_a$.
Assume next (75) holds, and let $(g_j)_{j=1}^\infty$ be a sequence which is contained in the unit ball of $A_v^p(\mathbb{D})$ and which converges to 0 uniformly on compact subsets of $\mathbb{D}$, and assume $\varepsilon > 0$ is given. We choose $N \in \mathbb{N}$ such that $\int_0^{2\pi} |(Z_n f)(\theta)|\, d\theta < \varepsilon$ for all $n > N$. The convergence of the sequence in the compact-open topology can be used to find a large enough $J \in \mathbb{N}$ such that

\[ \sup_{|z| \le r_{m_n}} |Z_n M_f g_j(z)|\, v(z) < \frac{\varepsilon}{2\pi N \omega_n} \;\Rightarrow\; M_p(Z_n M_f g_j, r_{m_n}) < \frac{\varepsilon}{N \omega_n} \]

for all $n \le N$, all $j \ge J$. This, (76) and (72) imply

\begin{align*}
\|M_f g_j\|_{p,v}^p &\le \sum_{n=1}^{N} \omega_n^p\, M_p(Z_n M_f g_j, r_{m_n})^p + \sum_{n=N+1}^{\infty} \omega_n^p\, M_p(Z_n M_f g_j, r_{m_n})^p \\
 &\le \varepsilon + \varepsilon \sum_{n=N+1}^{\infty} \omega_n^p\, M_p(Z_n g_j, r_{m_n})^p \le 2\varepsilon \|g_j\|_{p,v}^p \le 2\varepsilon.
\end{align*}

We infer that the sequence $(M_f g_j)_{j=1}^\infty$ converges to 0 in the norm of $A_v^p(\mathbb{D})$, which proves the compactness of the operator.

Acknowledgments JT was supported in part by the Väisälä Foundation of the Finnish Academy of Science and Letters. JV was supported in part by the Engineering and Physical Sciences Research Council grant EP/T008636/1.

References

1. H. Arroussi, Function and operator theory on large Bergman spaces, Ph.D. thesis, Universitat
de Barcelona (2016)
2. J. Bonet, W. Lusky, J. Taskinen, Solid hulls and cores of weighted $H^\infty$-spaces. Rev. Mat.
Compl. 31, 781–804 (2018)
3. J. Bonet, W. Lusky, J. Taskinen, Solid hulls and solid cores of weighted Bergman spaces.
Banach J. Math. Anal. 13(2), 468–485 (2019)
4. J. Bonet, W. Lusky, J. Taskinen, On boundedness and compactness of Toeplitz operators in
weighted $H^\infty$-spaces. J. Funct. Anal. 278(10), 108456 (2020)
5. J. Bonet, W. Lusky, J. Taskinen, On the boundedness of Toeplitz operators with radial symbols
over weighted sup-norm spaces of holomorphic functions. J. Math. Anal. Appl. 493(1), 124515
(2021)
6. A. Dieudonne, Bounded and compact operators on the Bergman space $L_a^1$ in the unit ball of
$\mathbb{C}^n$. J. Math. Anal. Appl. 388(1), 344–360 (2012)
7. M. Dostanic, Unboundedness of the Bergman projections on $L^p$ spaces with exponential
weights. Proc. Edinb. Math. Soc. 47, 111–117 (2004)
8. M. Engliš, Toeplitz operators and weighted Bergman kernels. J. Funct. Anal. 255(6), 1419–
1457 (2008)
9. S. Grudsky, N. Vasilevski, Bergman-Toeplitz operators: radial component influence. Integr. Eq.
Oper. Theory 40, 16–33 (2001)
10. S. Grudsky, A. Karapetyants, N. Vasilevski, Toeplitz operators on the unit ball in CN with
radial symbols. J. Oper. Theory 49, 325–346 (2003)
11. R. Hagger, J.A. Virtanen, Compact Hankel operators with bounded symbols. J. Oper. Theory
86, 317–329 (2021)
12. R. Hagger, C. Liu, J. Taskinen, J.A. Virtanen, Toeplitz operators on the unit ball with locally
integrable symbols. Integr. Equ. Oper. Theory 94, 17 (2022)
13. A. Harutyunyan, W. Lusky, On $L_1$-subspaces of holomorphic functions. Stud. Math. 198,
157–175 (2010)
14. A. Karapetyants, J. Taskinen, Toeplitz operators with radial symbols on general analytic
function spaces. Submitted
15. D.H. Luecking, Trace ideal criteria for Toeplitz operators. J. Funct. Anal. 73(2), 345–368
(1987)
16. D.H. Luecking, Finite rank Toeplitz operators on the Bergman space. Proc. Am. Math. Soc.
136(5), 1717–1723 (2008)
17. W. Lusky, On weighted spaces of harmonic and holomorphic functions. J. Lond. Math. Soc.
51, 309–320 (1995)
18. W. Lusky, Growth conditions for harmonic and holomorphic functions, Functional Analysis,
in Proceedings of the First International Workshop, ed. by S. Dierolf, S. Dineen, P. Domanski
(1996), pp. 281–291
19. W. Lusky, On the isomorphism classes of weighted spaces of harmonic and holomorphic
functions. Stud. Math. 175, 19–45 (2006)
20. W. Lusky, J. Taskinen, Bounded holomorphic projections for exponentially decreasing weights.
J. Funct. Spaces Appl. 6, 59–70 (2008)
21. W. Lusky, J. Taskinen, Toeplitz operators on Bergman spaces and Hardy multipliers. Stud.
Math. 204, 137–154 (2011)
22. P. Mannersalo, Toeplitz operators with locally integrable symbols on Bergman spaces of
bounded simply connected domains. Complex Var. Elliptic Eq. 61(6), 854–874 (2016)
23. P. Mannersalo, Toeplitz operators on Bergman spaces of polygonal domains. Proc. Edinb. Math.
Soc. 62, 1115–1136 (2019)
24. G. McDonald, C. Sundberg, Toeplitz operators on the disc. Indiana Univ. Math. J. 28(4), 595–
611 (1979)
25. A. Perälä, J. Taskinen, J.A. Virtanen, Toeplitz operators with distributional symbols on
Bergman spaces. Proc. Edinb. Math. Soc. 54(2), 505–514 (2011)
26. J.H. Shapiro, Composition Operators and Classical Function Theory (Springer, New York,
1993)
27. K. Stroethoff, Compact Toeplitz operators on Bergman spaces. Math. Proc. Camb. Philos. Soc.
124, 151–160 (1998)
28. K. Stroethoff, D. Zheng, Toeplitz and Hankel operators on Bergman spaces. Trans. AMS
329(2), 773–794 (1992)
29. D. Suárez, The essential norm of operators in the Toeplitz algebra on $A^p(\mathbb{B}_n)$. Indiana Univ.
Math. J. 56(5), 2185–2232 (2007)
30. J. Taskinen, J.A. Virtanen, Toeplitz operators on Bergman spaces with locally integrable
symbols. Rev. Math. Iberoamericana 26(2), 693–706 (2010)
31. J. Taskinen, J.A. Virtanen, New results and open problems on Toeplitz operators in Bergman
spaces. New York J. Math. 17, 147–164 (2011)
32. J. Taskinen, J.A. Virtanen, Weighted BMO and Toeplitz operators on the Bergman space $A^1$.
J. Oper. Theory 68(1), 131–140 (2012)
33. J. Taskinen, J.A. Virtanen, On generalized Toeplitz and little Hankel operators on Bergman
spaces. Arch. Math. 110(2), 155–166 (2018)
34. N.L. Vasilevski, Bergman type spaces on the unit disk and Toeplitz operators with radial
symbols, Reporte Interno 245, Departamento de Matemáticas, CINVESTAV del I.P.N., Mexico
City, 1999
35. N.L. Vasilevski, Commutative algebras of Toeplitz operators on the Bergman space, in
Operator Theory: Advances and Applications, vol. 185 (Birkhäuser, 2008)
36. F. Yan, D. Zheng, Bounded Toeplitz operators on Bergman space. Banach J. Math. Anal. 13(2),
386–406 (2019)
37. K. Zhu, Positive Toeplitz operators on weighted Bergman space. J. Oper. Theory 20, 329–357
(1988)
38. K. Zhu, Operator Theory in Function Spaces, Mathematical Surveys and Monographs, 2nd
edn., vol. 138 (American Mathematical Society, Providence, 2007)
39. N. Zorboska, Toeplitz operators with BMO symbols and the Berezin transform. Int. J. Math.
Math. Sci. 46, 2929–2945 (2003)
Part IV
Inequalities in Various Banach Spaces
Disjointness Preservers
and Banach-Stone Theorems

Denny H. Leung and Wee Kee Tang

Abstract Let $\Omega$ be a compact Hausdorff space. The space $C(\Omega)$ of continuous functions on $\Omega$ carries a number of structures. It is a Banach space (under the sup-norm), a vector lattice and a ring (under pointwise operations). The classical theorems of Banach-Stone, Kaplansky and Gelfand-Kolmogorov show that each of these structures on $C(\Omega)$ characterizes the space $\Omega$ up to homeomorphism. Within
the last 30 years or so, a rich literature has been built up concerning mappings
between function spaces that preserve the disjointness structure (biseparating maps
or ⊥-isomorphisms). These efforts have shown that in many cases, operators on
function spaces that preserve various kinds of structures are ⊥-isomorphisms. This
lends a certain unity to various “preserver” results and highlights the utility of the
concept of ⊥-isomorphisms. In this chapter, we will describe a general theory of
⊥-isomorphisms and survey a number of applications, including applications to
order (lattice) isomorphisms, ring and multiplicative isomorphisms, isometries and
nonvanishing preservers.

Keywords Banach-Stone theorems · Biseparating maps · Algebra isomorphisms · Order isomorphisms · Continuous functions · Uniformly continuous functions · Lipschitz functions · Differentiable functions

Research of the second author is supported by the Ministry of Education - Singapore, under its
Academic Research Fund Tier 1 (AcRF project no. RG24/19(S)).

D. H. Leung
Department of Mathematics, National University of Singapore, Singapore, Singapore
W. K. Tang
School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore,
Singapore

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
R. M. Aron et al. (eds.), Operator and Norm Inequalities and Related Topics,
Trends in Mathematics, https://doi.org/10.1007/978-3-031-02104-6_15

1 Introduction

There is a long and fruitful tradition of studying a mathematical object by means of looking at the space of mappings from it into a simple object of the same sort. For example, the dual group is a fundamental object in abstract harmonic analysis; likewise, the dual space of a locally convex topological vector space is part and parcel of the theory of such spaces. If $\Omega$ is a compact Hausdorff space, the space $C(\Omega)$ of continuous real valued functions on $\Omega$ is a natural “dual space” of $\Omega$. (We will generally take the scalar field to be $\mathbb{R}$, although most of what will be discussed in the paper applies equally well to complex scalars.) Moreover, $C(\Omega)$ carries with it a wealth of structures. It is a Banach space under the norm $\|f\| = \sup_{\omega\in\Omega} |f(\omega)|$, a ring (with unit) under pointwise addition and multiplication and a vector lattice under pointwise order. Each of these aspects of $C(\Omega)$ has been shown to determine
the space up to homeomorphism. These are the famous classical theorems of
Banach-Stone [9, 38], Gelfand-Kolmogorov [24] and Kaplansky [28]. The aim of
this paper is to give a survey of some developments that arise out of these classical
results, which we will refer to as theorems of Banach-Stone type. Particularly, since
the 1990s, mappings that preserve “disjointness structures”—biseparating maps
in the linear case, ⊥-isomorphisms more generally—have been studied by many
researchers. An important point that we would like to make is to promote the use of
⊥-isomorphisms as a unifying concept in the study of Banach-Stone type theorems.
A recent example of such a point of view is given in the paper [15]. For a general
survey of Banach-Stone theorems up to around the year 2000, see [22].
Let us briefly summarize the contents of the paper. In Sect. 2, we recall the
statements of the three classical theorems mentioned above. The definition of a
biseparating map is given and it is shown that if $T : C(\Omega) \to C(\Sigma)$ is either an isometry, an algebra (ring) isomorphism or a vector lattice isomorphism, then it is biseparating. A detailed proof is given of the fact that a biseparating map $T : C(\Omega) \to C(\Sigma)$ induces a homeomorphism $\varphi : \Sigma \to \Omega$, with respect to which $T$ can be represented as a weighted composition operator. (See Theorem 2.6.) Consequently, the three classical theorems can be unified under Theorem 2.6. Section 3
develops the theory of ⊥-isomorphisms. Minimal assumptions are made on the sets
of functions and the mappings involved. Even so, it is found that a ⊥-isomorphism
induces an isomorphism between the Boolean algebras of regular open sets between
the underlying domain spaces (Theorem 3.3). Under further conditions, it is shown
that the Boolean isomorphism gives rise to a homeomorphism between the domain
spaces. These can be viewed as “weak” Banach-Stone theorems. In Sect. 3.3,
“strong” Banach-Stone theorems are given, that is, results where a ⊥-isomorphism
has a functional representation. Strong Banach-Stone theorems are seen to apply
to a large class of function spaces. Finally, Sect. 4 contains applications of the
results in Sect. 3 to a variety of settings. It is shown that in many cases lattice
isomorphisms (Kaplansky’s Theorem), ring isomorphisms (Gelfand-Kolmogorov
Theorem), multiplicative isomorphisms (Milgram’s Theorem), isometries (Banach-Stone Theorem) and nonvanishing preservers are ⊥-isomorphisms. Consequently, many results are consequences of, and can be extended by, characterization of ⊥-isomorphisms.

2 Three Classical Theorems

Let $\Omega$ be a topological space. Denote by $C(\Omega)$ the vector space of all (real-valued) continuous functions on $\Omega$. The space $C(\Omega)$ carries with it many structures. Indeed, it is an algebra under pointwise addition and multiplication. It is also a vector lattice under pointwise supremum and infimum. Finally, if $\Omega$ is compact Hausdorff, the space $C(\Omega)$ is a Banach space with the norm $\|f\| = \sup\{|f(\omega)| : \omega \in \Omega\}$.
In the first half of the twentieth century, three remarkable theorems appeared that characterize the space $\Omega$ in terms of each of these structures of $C(\Omega)$.
Theorem 2.1 (Banach-Stone) Let $\Omega$, $\Sigma$ be compact Hausdorff spaces and let $T : C(\Omega) \to C(\Sigma)$ be a linear isometry. There are a homeomorphism $\varphi : \Sigma \to \Omega$ and a function $h \in C(\Sigma)$ so that $|h(\sigma)| = 1$ for all $\sigma \in \Sigma$ and that

\[ Tf(\sigma) = h(\sigma) f(\varphi(\sigma)) \quad \text{for all } f \in C(\Omega) \text{ and all } \sigma \in \Sigma. \]

Theorem 2.1 was proved by Banach [9] for the case of compact metric spaces.
The theorem was extended to compact Hausdorff spaces by Stone [38].
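Theorem 2.2 (Gelfand-Kolmogorov [24]) Let $\Omega$, $\Sigma$ be compact Hausdorff spaces and let $T : C(\Omega) \to C(\Sigma)$ be an algebra isomorphism. There is a homeomorphism $\varphi : \Sigma \to \Omega$ such that

\[ Tf(\sigma) = f(\varphi(\sigma)) \quad \text{for all } f \in C(\Omega) \text{ and all } \sigma \in \Sigma. \]

Theorem 2.3 (Kaplansky [28]) Let $\Omega$, $\Sigma$ be compact Hausdorff spaces and let $T : C(\Omega) \to C(\Sigma)$ be a vector lattice isomorphism. There are a homeomorphism $\varphi : \Sigma \to \Omega$ and a function $h \in C(\Sigma)$ so that $h(\sigma) > 0$ for all $\sigma \in \Sigma$ and that

\[ Tf(\sigma) = h(\sigma) f(\varphi(\sigma)) \quad \text{for all } f \in C(\Omega) \text{ and all } \sigma \in \Sigma. \]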

Theorem 2.3 is a special case of Kaplansky’s result. For a discussion of the result
in its full generality, see Sect. 4.1. In the intervening three quarters of a century, a
large number of extensions and generalizations of these results have been obtained.
A particularly fruitful concept that unifies the three classical theorems is that of
disjointness preserving operators. Let $\Omega$, $\Sigma$ be topological spaces. Two functions $f, g \in C(\Omega)$, respectively, $C(\Sigma)$, are said to be disjoint if the pointwise product $fg = 0$. In terms of the lattice structure, $f$ and $g$ are disjoint if and only if $|f| \wedge |g| = 0$. Suppose that $A(\Omega)$ and $A(\Sigma)$ are vector subspaces of $C(\Omega)$ and $C(\Sigma)$ respectively. A linear operator $T : A(\Omega) \to A(\Sigma)$ is disjointness preserving if $Tf$, $Tg$ are disjoint whenever $f, g$ are disjoint functions in $A(\Omega)$. A biseparating operator is a linear bijection $T : A(\Omega) \to A(\Sigma)$ so that both $T$ and $T^{-1}$ are disjointness preserving. It is evident that if $A(\Omega)$ and $A(\Sigma)$ are algebras under pointwise operations, then every algebraic isomorphism $T : A(\Omega) \to A(\Sigma)$ is biseparating. A similar statement holds for lattice isomorphisms. Now we proceed to see that for compact Hausdorff spaces $\Omega$ and $\Sigma$, any linear isometry from $C(\Omega)$ onto $C(\Sigma)$ is biseparating. To do this, we make use of extreme points in the dual ball of $C(\Omega)$ and $C(\Sigma)$. Let $C$ be a convex set in a vector space $V$. A point $x \in C$ is an extreme point of $C$ if $x = \tfrac{1}{2}(y + z)$, $y, z \in C$, implies that $x = y = z$. Denote the set of extreme points of $C$ by $\operatorname{ext} C$. It is easy to see that if $V$, $W$ are vector spaces, $T : V \to W$ is a vector space isomorphism and $x$ is an extreme point of $C \subseteq V$, then $Tx$ is an extreme point of $T(C)$. The next result is due to Arens and Kelley, who used it in their proof of the Banach-Stone Theorem.
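The claim that algebra and lattice isomorphisms are biseparating is a one-line check in each case. The sketch below assumes that an algebra isomorphism is multiplicative, $T(fg) = (Tf)(Tg)$, and that a vector lattice isomorphism commutes with the modulus and with finite infima; in the lattice case the subspaces are taken to be sublattices.

```latex
% Algebra case: T linear and multiplicative.
\[
fg = 0 \;\Longrightarrow\; (Tf)(Tg) = T(fg) = T(0) = 0 .
\]
% Lattice case: T linear with T|f| = |Tf| and T(u \wedge w) = Tu \wedge Tw.
\[
|f| \wedge |g| = 0 \;\Longrightarrow\;
|Tf| \wedge |Tg| = T|f| \wedge T|g| = T\bigl(|f| \wedge |g|\bigr) = T(0) = 0 .
\]
% The inverse of an isomorphism of either kind is an isomorphism of the same
% kind, so the same computation applies to T^{-1}.
```

In both cases $T$ and $T^{-1}$ preserve disjointness, which is exactly the definition of a biseparating operator.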
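Proposition 2.4 ([7]) Let $\Omega$ be a compact Hausdorff space and let $B_{C(\Omega)^*}$ be the closed unit ball of the dual space $C(\Omega)^*$. Then

\[ \operatorname{ext} B_{C(\Omega)^*} = \{\pm\delta_\omega : \omega \in \Omega\}, \]

where $\delta_\omega$ is the evaluation functional on $C(\Omega)$ given by $\delta_\omega(f) = f(\omega)$.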

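Proposition 2.5 Let $\Omega$ and $\Sigma$ be compact Hausdorff spaces. Every (onto) linear isometry $T : C(\Omega) \to C(\Sigma)$ is biseparating.
Proof Let $f, g$ be disjoint functions in $C(\Omega)$ and let $\sigma \in \Sigma$. By Proposition 2.4, $\delta_\sigma \in \operatorname{ext} B_{C(\Sigma)^*}$. Since $T^*$ is a vector space isomorphism and $T^*(B_{C(\Sigma)^*}) = B_{C(\Omega)^*}$, $T^*\delta_\sigma \in \operatorname{ext} B_{C(\Omega)^*}$. By Proposition 2.4, there exist $\omega \in \Omega$ and $\varepsilon = \pm 1$ so that $T^*\delta_\sigma = \varepsilon\delta_\omega$. Thus

\[ Tf(\sigma) \cdot Tg(\sigma) = (T^*\delta_\sigma)(f) \cdot (T^*\delta_\sigma)(g) = \varepsilon^2 \delta_\omega(f) \cdot \delta_\omega(g) = f(\omega) \cdot g(\omega) = 0. \]

This proves that $Tf$ and $Tg$ are disjoint. Hence $T$ is disjointness preserving. The same applies to $T^{-1}$ by symmetry.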

The classical theorems of Banach-Stone, Gelfand-Kolmogorov and Kaplansky
can now be unified and extended by the next result.
Theorem 2.6 ([27]) Let $\Omega$ and $\Sigma$ be compact Hausdorff spaces and let $T : C(\Omega) \to C(\Sigma)$ be a linear biseparating map. There are a homeomorphism $\varphi : \Sigma \to \Omega$ and a function $h \in C(\Sigma)$ so that $h(\sigma) \neq 0$ for all $\sigma \in \Sigma$ and that

\[ Tf(\sigma) = h(\sigma) f(\varphi(\sigma)) \quad \text{for all } f \in C(\Omega) \text{ and all } \sigma \in \Sigma. \]

In fact, Jarosz gave a description of general disjointness preserving linear maps $T : C(\Omega) \to C(\Sigma)$. As a result, he showed that every disjointness preserving linear bijection $T : C(\Omega) \to C(\Sigma)$ is biseparating and has the representation above. We will give a detailed proof of Theorem 2.6 that seems to us to be most amenable to generalization. For the remainder of the section, let $\Omega$, $\Sigma$ and $T$ be as in Theorem 2.6. For any function $f \in C(\Omega)$, the support of $f$, $\operatorname{supp} f$, is the closure of the set $\{\omega : f(\omega) \neq 0\}$. Similarly for functions in $C(\Sigma)$.
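Proposition 2.7 ([5, Lemma 4]) If $f, g \in C(\Omega)$ and $\operatorname{supp} f \subseteq \operatorname{supp} g$, then $\operatorname{supp} Tf \subseteq \operatorname{supp} Tg$.
Proof Otherwise, there are $f, g \in C(\Omega)$ with $\operatorname{supp} f \subseteq \operatorname{supp} g$, yet $\operatorname{supp} Tf \not\subseteq \operatorname{supp} Tg$. Hence there exists $\sigma_0 \notin \operatorname{supp} Tg$ so that $Tf(\sigma_0) \neq 0$. Choose $h \in C(\Sigma)$ so that $h(\sigma_0) \neq 0$ and that $h$ is disjoint from $Tg$. Since $T^{-1}$ is disjointness preserving, $T^{-1}h$ and $g$ are disjoint. Thus $T^{-1}h$ and $f$ are disjoint. Therefore, $h$ and $Tf$ are disjoint, which contradicts the fact that $h$ and $Tf$ are both nonzero at $\sigma_0$.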

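For any $\sigma \in \Sigma$, set

\[ \mathcal{F}_\sigma = \{\operatorname{supp} f : f \in C(\Omega),\ (Tf)(\sigma) \neq 0\}. \]

Lemma 2.8 $\mathcal{F}_\sigma$ has the finite intersection property.

Proof Let $f_1, \dots, f_m$ be functions in $C(\Omega)$ so that $Tf_i(\sigma) \neq 0$, $1 \le i \le m$. There exists a nonzero $g \in C(\Sigma)$ so that $\operatorname{supp} g \subseteq \operatorname{supp} Tf_i$, $1 \le i \le m$. Apply Proposition 2.7 to $T^{-1}$ to see that $\operatorname{supp} T^{-1}g \subseteq \operatorname{supp} f_i$, $1 \le i \le m$. Since $T$ is a bijection and $g \neq 0$, $T^{-1}g \neq 0$. Thus $\operatorname{supp} T^{-1}g$ is a nonempty set contained in $\bigcap_{i=1}^m \operatorname{supp} f_i$.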
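The existence of the function $g$ in this proof rests on Urysohn's lemma. A sketch of that step, in which the set $W$ and the variable $\tau$ are names introduced here for illustration:

```latex
% W and \tau are notation chosen for this sketch only.
\[
W := \bigcap_{i=1}^{m} \{\tau \in \Sigma : Tf_i(\tau) \neq 0\}
\quad \text{is open in } \Sigma \text{ and contains } \sigma .
\]
% \Sigma is compact Hausdorff, hence normal, so Urysohn's lemma gives
\[
g \in C(\Sigma), \qquad g(\sigma) = 1, \qquad g = 0 \text{ on } \Sigma \setminus W .
\]
% Consequently, for every i,
\[
\operatorname{supp} g = \overline{\{g \neq 0\}} \subseteq \overline{W}
\subseteq \overline{\{Tf_i \neq 0\}} = \operatorname{supp} Tf_i .
\]
```

Since $g(\sigma) = 1$, the function $g$ is nonzero, as the proof requires.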

Lemma 2.9 For any $\sigma \in \Sigma$, $\bigcap \mathcal{F}_\sigma$ contains exactly one point in $\Omega$.
Proof Obviously, $\mathcal{F}_\sigma$ consists of closed sets in the compact Hausdorff space $\Omega$. It follows from Lemma 2.8 that $\bigcap \mathcal{F}_\sigma$ is nonempty. Suppose, if possible, that $\omega_1, \omega_2$ are two distinct points in $\bigcap \mathcal{F}_\sigma$. Choose a pair of disjoint functions $h_1, h_2 \in C(\Omega)$ so that $h_i = 1$ on a neighborhood of $\omega_i$, $i = 1, 2$. Let $f \in C(\Omega)$ be chosen so that $Tf(\sigma) \neq 0$. By definition, $\omega_i \in \operatorname{supp} f$, $i = 1, 2$. Since $h_1 f$ and $h_2 f$ are disjoint and $T$ is disjointness preserving, $T(h_1 f)$ and $T(h_2 f)$ are disjoint. Without loss of generality, we may assume that $T(h_1 f)(\sigma) = 0$. Then $T((1 - h_1)f)(\sigma) = Tf(\sigma) \neq 0$. Since $\omega_1 \in \bigcap \mathcal{F}_\sigma$, this would imply that $\omega_1 \in \operatorname{supp}\, (1 - h_1)f$, which is clearly false by choice of $h_1$. This completes the proof of the lemma.

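Define $\varphi : \Sigma \to \Omega$ by setting $\{\varphi(\sigma)\} = \bigcap \mathcal{F}_\sigma$. By symmetry, we may define

\[ \mathcal{F}_\omega = \{\operatorname{supp} Tf : f \in C(\Omega),\ f(\omega) \neq 0\} \]

for any $\omega \in \Omega$. Then there is a well-defined function $\psi : \Omega \to \Sigma$ so that $\{\psi(\omega)\} = \bigcap \mathcal{F}_\omega$ for all $\omega \in \Omega$.
Lemma 2.10 $\varphi : \Sigma \to \Omega$ is a homeomorphism with inverse $\psi$.
Proof We will show that $\psi(\varphi(\sigma)) = \sigma$ for all $\sigma \in \Sigma$ and that $\varphi$ is continuous. The lemma then follows by symmetry.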
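Suppose that $\sigma \in \Sigma$ and $\omega = \varphi(\sigma)$. Assume, if possible, that $\sigma' = \psi(\omega) \neq \sigma$. Let $f \in C(\Omega)$ be such that $Tf(\sigma) \neq 0$ and that $\sigma' \notin \operatorname{supp} Tf$. There exists $h \in C(\Sigma)$ disjoint from $Tf$ so that $h = 1$ on a neighborhood of $\sigma'$. Choose $g \in C(\Omega)$ so that $g(\omega) \neq 0$. Since $h \cdot Tg$ and $Tf$ are disjoint, so are $T^{-1}(h \cdot Tg)$ and $f$. As $Tf(\sigma) \neq 0$, $\omega \in \operatorname{supp} f$. Hence $T^{-1}(h \cdot Tg)(\omega) = 0$. Therefore,

\[ 0 \neq g(\omega) = T^{-1}(h \cdot Tg)(\omega) + T^{-1}((1 - h) \cdot Tg)(\omega) = T^{-1}((1 - h) \cdot Tg)(\omega). \]

It follows that $\sigma' \in \operatorname{supp}\, (1 - h) \cdot Tg$, contrary to the choice of $h$. This completes the proof that $\psi(\varphi(\sigma)) = \sigma$.
If $\varphi$ is not continuous, then making use of compactness of $\Omega$, there is a net $(\sigma_\alpha)_\alpha$ in $\Sigma$ converging to some $\sigma_0$ so that $(\varphi(\sigma_\alpha))_\alpha$ converges to $\omega \neq \omega_0 := \varphi(\sigma_0)$. Let $f \in C(\Omega)$ be such that $Tf(\sigma_0) \neq 0$. There exists $\alpha_0$ so that $Tf(\sigma_\alpha) \neq 0$ for all $\alpha \succeq \alpha_0$. By definition of $\varphi$, $\varphi(\sigma_\alpha) \in \operatorname{supp} f$ for all $\alpha \succeq \alpha_0$. Thus $\omega \in \operatorname{supp} f$. But this shows that $\omega \in \bigcap \mathcal{F}_{\sigma_0}$ and hence $\omega = \omega_0$, contrary to the assumption.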

Proof of Theorem 2.6 We will show that if f(ϕ(σ)) = 0, then Tf(σ) = 0. Once
this is shown, define h = T1. For any f ∈ C(Ω) and any σ ∈ Σ, f − f(ϕ(σ))1
vanishes at ϕ(σ). Hence

0 = T[f − f(ϕ(σ))1](σ) = Tf(σ) − f(ϕ(σ))h(σ).

Thus Tf(σ) = h(σ)f(ϕ(σ)), as claimed. Furthermore, since T is a surjection,
h(σ) ≠ 0 for any σ ∈ Σ.
Suppose that, contrary to the claim above, there are f ∈ C(Ω) and σ ∈ Σ so
that f(ϕ(σ)) = 0 yet Tf(σ) ≠ 0. By definition of ϕ, ω := ϕ(σ) ∈ supp f. Thus
ω ∈ cl |f|^{-1}(0, r) for any r > 0, where cl denotes closure. Define

U_n = |f|^{-1}(1/(3n + 5)^2, 1/(3n + 1)^2), n ∈ N.

Then ω ∈ cl ⋃_n U_n = cl ⋃_n U_{2n−1} ∪ cl ⋃_n U_{2n}. Without loss of generality, assume that
ω ∈ cl ⋃_n U_{2n−1}. Set

V_n = |f|^{-1}(1/(6n + 3)^2, 1/(6n − 3)^2), n ∈ N.

Then (V_n) is a sequence of disjoint open sets so that cl U_{2n−1} ⊆ V_n for all n. For
each n, choose a function h_n ∈ C(Ω) so that 0 ≤ h_n ≤ 1, h_n = 1 on U_{2n−1}
and h_n = 0 outside V_n. The sequence of functions (n h_n f)_n is pairwise disjoint
and ‖n h_n f‖ ≤ n/(6n − 3)^2 → 0. Hence the sum g := ∑_n n h_n f converges in C(Ω).
For each n, g − nf = 0 on the set U_{2n−1}. By definition of ϕ, this implies that
T(g − nf)(σ′) = 0 for all σ′ ∈ ϕ^{-1}(U_{2n−1}). Choose a net (ω_α) in ⋃_n U_{2n−1} that
converges to ω. Let n_α ∈ N be such that ω_α ∈ U_{2n_α−1}. Set σ_α = ϕ^{-1}(ω_α). Since
Banach-Stone Theorems 499

f(ω) = 0, ω ∉ cl U_n for any n. Thus lim_α n_α = ∞. Note that (σ_α) converges to σ.
Therefore,

Tf(σ) = lim_α Tf(σ_α) = lim_α (1/n_α) Tg(σ_α) = 0,

contrary to the choices of f and σ.
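
For orientation, the converse direction is a short computation. The following sketch assumes a nowhere vanishing weight h ∈ C(Σ) and a homeomorphism ϕ : Σ → Ω, and checks that the resulting weighted composition operator is a linear biseparating bijection. The symbol k for a generic element of C(Σ) is introduced here only for illustration.

```latex
% Sketch: weighted composition operators are biseparating.
% Assumptions (for illustration): h \in C(\Sigma) has no zeros and
% \varphi : \Sigma \to \Omega is a homeomorphism.
\[
  (Tf)(\sigma) = h(\sigma)\, f(\varphi(\sigma)),
  \qquad f \in C(\Omega),\ \sigma \in \Sigma .
\]
% T is linear and bijective, with inverse
\[
  (T^{-1}k)(\omega) = \frac{k(\varphi^{-1}(\omega))}{h(\varphi^{-1}(\omega))},
  \qquad k \in C(\Sigma),\ \omega \in \Omega .
\]
% Disjointness is preserved in both directions:
\[
  (Tf \cdot Tg)(\sigma) = h(\sigma)^{2}\,(fg)(\varphi(\sigma)),
\]
\[
  fg = 0 \text{ on } \Omega \iff Tf \cdot Tg = 0 \text{ on } \Sigma ,
\]
% because h(\sigma)^2 \neq 0 for every \sigma and \varphi is onto.
```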

3 Isomorphism of Disjointness Structure

Results in Sect. 2 may serve to convince the reader that biseparating maps are worthy
of study in their own right. Indeed, plenty of results concerning biseparating maps
have been obtained in the past 30 years or so. Most of these are in the context of
linear or at least additive maps. Since surjective additive maps between vector spaces
are linear maps over the field of rational numbers, the results remain mainly “linear”
in character. Very recently, several papers [15, 17, 18] appeared that took the study of
isomorphisms of disjointness structure, or ⊥-isomorphisms, to very general settings.
It is shown there that even for function spaces with minimal structure, analysis of
⊥-isomorphisms can still yield fruitful results. The aim of this section is to describe this
general approach to ⊥-isomorphisms. Earlier results on biseparating maps, in spaces
of (vector-valued) continuous functions, uniformly continuous functions, Lipschitz
functions and differentiable functions, will be seen as consequences. Applications
to theorems of Banach-Stone type will be considered in the next section.

3.1 ⊥-Isomorphisms

Let Ω, X be Hausdorff topological spaces and let A(Ω, X) be a subset of C(Ω, X),
the set of continuous functions f : Ω → X. For f, g ∈ A(Ω, X), let

[f ≠ g] = {ω ∈ Ω : f(ω) ≠ g(ω)}, supp_g f = cl [f ≠ g] and σ_g(f) = int supp_g f,

where cl denotes closure. Following [15], we define the following relations for
f, g, h ∈ A(Ω, X).

1. f ⊥_h g: [f ≠ h] ∩ [g ≠ h] = ∅.
2. f ⊆_h g: σ_h(f) ⊆ σ_h(g).
The definitions of ⊥_h and ⊆_h may appear asymmetrical as one uses sets of the form
[f ≠ h] while the other uses σ_h(f). However, it is easy to see that f ⊥_h g if and
only if σ_h(f) ∩ σ_h(g) = ∅. Similarly, let A(Σ, Y) be a subset of C(Σ, Y), where
Σ, Y are Hausdorff topological spaces. Assume that T : A(Ω, X) → A(Σ, Y) is a
bijection. Given h ∈ A(Ω, X), say that T is a ⊥_h-isomorphism if

f ⊥_h g ⇐⇒ Tf ⊥_{Th} Tg for all f, g ∈ A(Ω, X).

A ⊆_h-isomorphism is defined similarly. Clearly, a biseparating map in the sense
of §2 is precisely a ⊥_0-isomorphism, provided T0 = 0. The notion of a ⊥_h-isomorphism
is a generalization of biseparating map to the nonlinear context. The set A(Ω, X) is
said to be h-weakly regular for some h ∈ A(Ω, X) if

{σ_h(f) : f ∈ A(Ω, X)} is a basis for the topology on Ω.

Weak regularity is a basic assumption to ensure that there are sufficiently many functions
in A(Ω, X) and A(Σ, Y) to yield a nontrivial theory. The following simple yet
important result is noted and used in [15]. Its ancestry can be traced back to at
least [5, Lemma 4].
Proposition 3.1 Let T : A(Ω, X) → A(Σ, Y) be a bijection, where A(Ω, X) and
A(Σ, Y) are h- and Th-weakly regular respectively. Then T is a ⊥_h-isomorphism
if and only if it is a ⊆_h-isomorphism.
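
The equivalence between f ⊥_h g and σ_h(f) ∩ σ_h(g) = ∅, described above as easy to see, can be checked as follows. This is a sketch of one possible argument; the letters A and B are shorthand introduced here, and the display assumes the amsmath package.

```latex
% Sketch: f \perp_h g  <=>  \sigma_h(f) \cap \sigma_h(g) = \emptyset.
% Shorthand for this sketch only: A = [f \neq h], B = [g \neq h].
% A and B are open, since f, g, h are continuous and X is Hausdorff.
\begin{align*}
  &(\Leftarrow)\quad A \subseteq \operatorname{int}\overline{A} = \sigma_h(f),
     \quad B \subseteq \operatorname{int}\overline{B} = \sigma_h(g),
     \quad\text{so } A \cap B \subseteq \sigma_h(f) \cap \sigma_h(g) = \emptyset. \\
  &(\Rightarrow)\quad A \cap B = \emptyset,\ A \text{ open}
     \ \Longrightarrow\ A \cap \overline{B} = \emptyset
     \ \Longrightarrow\ A \cap \operatorname{int}\overline{B} = \emptyset, \\
  &\phantom{(\Rightarrow)}\quad \operatorname{int}\overline{B}
     \text{ open and disjoint from } A
     \ \Longrightarrow\ \overline{A} \cap \operatorname{int}\overline{B} = \emptyset
     \ \Longrightarrow\ \operatorname{int}\overline{A} \cap \operatorname{int}\overline{B} = \emptyset .
\end{align*}
```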
A set U in Ω is a regular open set if U = int cl U, where cl denotes closure. All sets
of the form σ_h(f) are regular open sets. Denote the collection of all regular open sets
in Ω by RO(Ω). RO(Ω) is a Boolean algebra with 0 = ∅, 1 = Ω, lattice operations
U ∧ V = U ∩ V, U ∨ V = int cl (U ∪ V) and negation ¬U = int(Ω\U). See [38]. If Ω is
a regular topological space, then RO(Ω) is a basis for the topology on Ω.
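
As a concrete illustration of these operations, the following worked example takes Ω to be the real line with its usual topology. The sets U, V and W are chosen here for illustration only; the display assumes the amsmath package.

```latex
% Example in \Omega = \mathbb{R} with the usual topology.
\[
  W = (0,1) \cup (1,2): \qquad
  \operatorname{int}\overline{W} = \operatorname{int}[0,2] = (0,2) \neq W ,
\]
% so W is open but not regular open, while U = (0,1) and V = (1,2) are regular open.
\begin{align*}
  U \wedge V &= U \cap V = \emptyset, \\
  U \vee V &= \operatorname{int}\overline{U \cup V} = (0,2), \\
  \neg U &= \operatorname{int}(\mathbb{R} \setminus U) = (-\infty,0) \cup (1,\infty), \\
  U \vee \neg U &= \operatorname{int}\overline{(-\infty,0) \cup (0,1) \cup (1,\infty)} = \mathbb{R} .
\end{align*}
```

The join U ∨ V is strictly larger than the union U ∪ V here, which is why the lattice operations on RO(Ω) differ from the set-theoretic ones.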
Let T : A(Ω, X) → A(Σ, Y) be a ⊆_h-isomorphism, where A(Ω, X) and
A(Σ, Y) are h- and Th-weakly regular respectively. Define a map θ_h : RO(Ω) →
RO(Σ) by

θ_h(U) = int cl ⋃{σ_{Th}(Tf) : f ∈ A(Ω, X), σ_h(f) ⊆ U}. (1)

It can be shown that θ_h is a Boolean isomorphism from RO(Ω) onto RO(Σ). In
fact, its inverse is θ_{Th} : RO(Σ) → RO(Ω). Furthermore, if f ∈ A(Ω, X) and
U ∈ RO(Ω), then f = h on U if and only if Tf = Th on θ_h(U). In fact, we obtain
a fundamental characterization of ⊥_h-isomorphisms.
Theorem 3.2 Let T : A(Ω, X) → A(Σ, Y) be a bijection, where A(Ω, X) and
A(Σ, Y) are h- and Th-weakly regular respectively. Then T is a ⊥_h-isomorphism
if and only if there is a Boolean isomorphism θ_h : RO(Ω) → RO(Σ) so that for any
f ∈ A(Ω, X) and U ∈ RO(Ω), f = h on U if and only if Tf = Th on θ_h(U).
In Theorem 3.2, we say that θ_h is associated with (T, h). In general, if T is a
⊥_h-isomorphism for different h's, the associated Boolean isomorphisms θ_h may well
depend on h. Some way of "linking" different functions in A(Ω, X) and A(Σ, Y) is
needed in order to "uniformize" the θ_h's.

Call a set A(Ω, X) ⊆ C(Ω, X) weakly regular if A(Ω, X) is h-weakly regular
for all h ∈ A(Ω, X). Suppose that T : A(Ω, X) → A(Σ, Y) is a bijection
between weakly regular sets of functions that is a ⊥-isomorphism, i.e., T is a
⊥_h-isomorphism for all h ∈ A(Ω, X). Consider the following "linking" condition.
(L) If h_1, h_2 ∈ A(Ω, X), U ∈ RO(Ω) and ω ∉ cl U, then there exist f ∈ A(Ω, X)
and V ∈ RO(Ω) containing ω so that

f = h_1 on U and f = h_2 on V.

A set of functions A(Ω, X) is nowhere trivial if for any ω ∈ Ω, there are h_1, h_2 ∈
A(Ω, X) so that h_1(ω) ≠ h_2(ω). If Ω is a regular topological space and A(Ω, X) is
nowhere trivial and satisfies condition (L), then A(Ω, X) is weakly regular.
Theorem 3.3 Let Ω, Σ be regular topological spaces. Assume that A(Ω, X) and
A(Σ, Y) are nowhere trivial and satisfy condition (L). A bijection T : A(Ω, X) →
A(Σ, Y) is a ⊥-isomorphism if and only if there is a Boolean isomorphism θ :
RO(Ω) → RO(Σ) so that for all f, g ∈ A(Ω, X) and all U ∈ RO(Ω), f = g
on U if and only if Tf = Tg on θ(U).
In order to prove Theorem 3.3, we first require a lemma.
Lemma 3.4 Assume that h_1, h_2 ∈ A(Ω, X), U ∈ RO(Ω) so that h_1 = h_2 on U.
Then θ_{h_1}(U) = θ_{h_2}(U).
Proof Assume that θ_{h_2}(U) ⊄ cl θ_{h_1}(U). Since T is a bijection and A(Σ, Y) is weakly
regular, there exists f ∈ A(Ω, X) such that

∅ ≠ σ_{Th_2}(Tf) ⊆ θ_{h_2}(U) \ cl θ_{h_1}(U).

By (1) and the fact that T is a ⊆-isomorphism, θ_{h_2}(σ_{h_2}(f)) = σ_{Th_2}(Tf). Then
∅ ≠ σ_{h_2}(f) = θ_{h_2}^{-1}(σ_{Th_2}(Tf)) ⊆ U. Hence σ_{h_2}(f) is a nonempty set disjoint from
¬U. By condition (L), there exist g ∈ A(Ω, X) and a nonempty set V ∈ RO(Ω),
V ⊆ σ_{h_2}(f), so that

g = h_1 on ¬U and g = f on V.

Now
1. θ_f(V) ⊆ θ_f(σ_{h_2}(f)) = θ_f(σ_f(h_2)) = σ_{Tf}(Th_2) = σ_{Th_2}(Tf).
2. Tg = Th_1 on θ_{h_1}(¬U) = ¬θ_{h_1}(U) = int[θ_{h_1}(U)^c] ⊇ σ_{Th_2}(Tf).
3. Tg = Tf on θ_f(V).
4. Th_1 = Th_2 on θ_{h_2}(U) ⊇ σ_{Th_2}(Tf).
Here, we have applied Theorem 3.2 for items 2–4. It follows that Tf = Th_2 on
θ_f(V). However, θ_f(V) is a nonempty open subset of σ_{Th_2}(Tf) = int cl [Tf ≠ Th_2].
So we have reached a contradiction. Therefore, θ_{h_2}(U) ⊆ cl θ_{h_1}(U). Since θ_{h_1}(U) is
a regular open set, θ_{h_2}(U) ⊆ θ_{h_1}(U). The lemma follows by symmetry.

Proof of Theorem 3.3 Taking into account Theorem 3.2, it suffices to show that
θ_{h_1} = θ_{h_2} for any h_1, h_2 ∈ A(Ω, X). Suppose that there exists U ∈ RO(Ω) so that
θ_{h_2}(U) ⊄ θ_{h_1}(U), so that in fact θ_{h_2}(U) ⊄ cl θ_{h_1}(U). By condition (L) for A(Σ, Y),
there exist g ∈ A(Σ, Y) and a nonempty set V ∈ RO(Σ), V ⊆ θ_{h_2}(U) \ cl θ_{h_1}(U), so
that

g = Th_1 on θ_{h_1}(U) and g = Th_2 on V.

Apply Lemma 3.4 on A(Σ, Y). We find that

θ_g(θ_{h_1}(U)) = θ_{Th_1}(θ_{h_1}(U)) = U and θ_g(V) = θ_{Th_2}(V) = θ_{h_2}^{-1}(V).

Since θ_{h_1}(U) ∩ V = ∅ and θ_g is a Boolean isomorphism,

U ∩ θ_{h_2}^{-1}(V) = θ_g(θ_{h_1}(U)) ∩ θ_g(V) = ∅.

However, θ_{h_2}^{-1}(V) is a nonempty subset of θ_{h_2}^{-1}(θ_{h_2}(U)) = U. The contradiction
shows that θ_{h_2}(U) ⊆ θ_{h_1}(U). The reverse inclusion follows by symmetry.

In Theorem 3.3, say that θ is associated with T. We list a few examples of sets of
functions satisfying condition (L). Another example is given in Lemma 4.2 below.
Example (a) Let Ω be a completely regular Hausdorff space and let X be a convex
set in a Hausdorff topological vector space. The space C(Ω, X) consists of all
continuous functions from Ω into X.
(b) Let Ω be a metric space and let X be a convex set in a normed space. Denote by
U(Ω, X), U∗(Ω, X), Lip(Ω, X), Lip∗(Ω, X), respectively, the set of uniformly
continuous functions, the set of bounded uniformly continuous functions, the
set of Lipschitz functions and the set of bounded Lipschitz functions from Ω to
X.
(c) Let Ω be an open set in a Banach space Z and let 𝒳 be a Banach space. For p ∈
N ∪ {∞}, denote by C^p(Ω, 𝒳) the space of all p-times continuously (Fréchet)
differentiable 𝒳-valued functions on Ω. To ensure that there are "sufficiently
many" functions in C^p(Ω, 𝒳), we assume that there exists a bump function in
C^p(Z), i.e., a function ξ ∈ C^p(Z) that has nonempty bounded support in Z.
To see that all of the spaces A(Ω, X) above satisfy condition (L), first observe
that if ω_0 ∈ Ω, U ∈ RO(Ω) and ω_0 ∉ cl U, then there exist a continuous function
ξ : Ω → [0, 1] and a set V ∈ RO(Ω) containing ω_0 so that ξ = 0 on V and ξ = 1 on U.
Moreover, for the situation in (b), we can choose ξ to be Lipschitz, and for case (c),
we can choose ξ ∈ C^p(Ω). Given h_1, h_2 ∈ A(Ω, X), it is easy to verify that
f(ω) = ξ(ω)h_1(ω) + (1 − ξ(ω))h_2(ω) defines a function in A(Ω, X) that is equal to
h_1 on U and h_2 on V.
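
For the metric setting of example (b), one explicit choice of the cut-off function can be written down directly. The following is a sketch under stated assumptions rather than the construction used in the cited papers: U is taken nonempty, and the names r and B are introduced here for illustration.

```latex
% Sketch for case (b): an explicit Lipschitz choice of \xi.
% Assumptions for illustration: (\Omega, d) is a metric space, U \in RO(\Omega) is
% nonempty and \omega_0 \notin \overline{U}; the names r and B are introduced here.
\[
  r = \operatorname{dist}(\omega_0, U) > 0, \qquad
  B = \{\omega \in \Omega : d(\omega, \omega_0) < r/2\}, \qquad
  V = \operatorname{int}\overline{B},
\]
\[
  \xi(\omega) = \min\Bigl\{1,\ \tfrac{2}{r}\,\operatorname{dist}(\omega, B)\Bigr\}.
\]
% \xi is Lipschitz with constant 2/r and takes values in [0,1].
% On V \subseteq \overline{B}: dist(\omega, B) = 0, hence \xi = 0.
% On U: d(\omega, \omega_0) \ge r, hence dist(\omega, B) \ge r/2 and \xi = 1.
```

The set V is regular open and contains ω_0, so it is an admissible choice in condition (L).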
Remark 3.5 Condition (L) is a condition on A(Ω, X), respectively, A(Σ, Y), that
guarantees that the Boolean isomorphisms θ_h are independent of h. Alternatively,
we may impose conditions on T to warrant the same outcome. For example,
if X, Y are Hausdorff topological groups, then C(Ω, X) is a topological group
under pointwise group operations. Suppose that A(Ω, X), A(Σ, Y) are subgroups
of C(Ω, X), C(Σ, Y) respectively and T : A(Ω, X) → A(Σ, Y) is a group
isomorphism as well as a ⊥_h-isomorphism for some h ∈ A(Ω, X). Then routine
verification shows that T is a ⊥-isomorphism and for any k ∈ A(Ω, X), θ_h(U) =
θ_k(U) for all U ∈ RO(Ω). In particular, the situation occurs if X and Y are
Hausdorff topological vector spaces, A(Ω, X), A(Σ, Y) are respective subspaces of
C(Ω, X), C(Σ, Y), and T : A(Ω, X) → A(Σ, Y) is an additive ⊥_0-isomorphism.

3.2 Homeomorphism Associated with a ⊥-Isomorphism

Theorem 3.3 allows us to associate a Boolean isomorphism with a ⊥-isomorphism.
Unfortunately, in general, a Boolean isomorphism θ : RO(Ω) → RO(Σ) need not
induce a homeomorphism ϕ : Ω → Σ.
Example ([15]) Let Ω be a topological space and let Σ be a dense open set in Ω.
Then the map θ : RO(Ω) → RO(Σ) given by θ(U) = U ∩ Σ is a Boolean
isomorphism. In particular, let S^1 be the unit circle in the complex plane. The
sets (0, 1) and S^1\{1} are homeomorphic and open and dense in [0, 1] and S^1
respectively. Hence we have a chain of Boolean isomorphisms

RO([0, 1]) ↔ RO((0, 1)) ↔ RO(S^1\{1}) ↔ RO(S^1).

But of course [0, 1] and S^1 are not homeomorphic.
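
To make the example more tangible, the sketch below records the inverse of the map U ↦ U ∩ Σ and one concrete regular open set in [0, 1]. It assumes, as in the example, that Σ is dense and open in Ω; the subscripts and superscripts marking the ambient space are notation added here.

```latex
% Sketch: \Sigma dense and open in \Omega, \theta(U) = U \cap \Sigma.
% The inverse sends V \in RO(\Sigma) to the interior of its closure, both taken in \Omega:
\[
  \theta^{-1}(V) = \operatorname{int}_{\Omega}\overline{V}^{\,\Omega},
  \qquad
  \operatorname{int}_{\Omega}\overline{U \cap \Sigma}^{\,\Omega}
    = \operatorname{int}_{\Omega}\overline{U}^{\,\Omega} = U
  \quad (U \in RO(\Omega)).
\]
% Worked instance with \Omega = [0,1] and \Sigma = (0,1):
\[
  U = [0, \tfrac12) \in RO([0,1]), \qquad
  \theta(U) = (0, \tfrac12) \in RO((0,1)).
\]
% The endpoint 0 belongs to U but has no counterpart in \Sigma:
% \theta matches regular open sets without matching points.
```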

The next result characterizes the Boolean isomorphisms that induce homeomorphisms.
If ω ∈ Ω, where Ω is a topological space, let N_ω be the family of open
neighborhoods of ω.
Proposition 3.6 Let Ω, Σ be Hausdorff topological spaces and let θ : RO(Ω) →
RO(Σ) be a Boolean isomorphism. Assume that
1. For any ω ∈ Ω, there exists σ ∈ Σ such that for any V ∈ N_σ, there exists
U ∈ RO(Ω) containing ω such that θ(U) ⊆ V.
2. For any σ ∈ Σ, there exists ω ∈ Ω such that for any U ∈ N_ω, there exists
V ∈ RO(Σ) containing σ such that θ^{-1}(V) ⊆ U.
Then there exists a homeomorphism ψ : Ω → Σ such that ψ(U) = θ(U) for
any U ∈ RO(Ω). Conversely, if Ω, Σ are regular topological spaces and there is a
homeomorphism ψ such that ψ(U) = θ(U) for any U ∈ RO(Ω), then conditions 1
and 2 hold.
Given conditions 1 and 2, define ψ(ω) = σ when ω and σ are related by
condition 1. Similarly, define ϕ(σ) = ω when σ and ω are related by condition
2. One can check that ψ : Ω → Σ and ϕ : Σ → Ω are continuous functions
that are mutual inverses. Proposition 3.6 can be applied to obtain general versions
of Theorem 2.6. A homeomorphism ψ : Ω → Σ is associated with T if for any
U ∈ RO(Ω) and any f, g ∈ A(Ω, X), f = g on U if and only if Tf = Tg on
ψ(U).
Theorem 3.7 Suppose that A(Ω, X), A(Σ, Y) are nowhere trivial subsets of
C(Ω, X) and C(Σ, Y) respectively that satisfy condition (L), where X, Y are
Hausdorff spaces and Ω, Σ are compact Hausdorff. If T : A(Ω, X) → A(Σ, Y) is
a ⊥-isomorphism, then there is a homeomorphism ψ : Ω → Σ associated with T.
Proof By Theorem 3.3, there is a Boolean isomorphism θ : RO(Ω) → RO(Σ)
associated with T. Let us verify condition 1 in Proposition 3.6. Condition 2 follows
by symmetry. Fix ω ∈ Ω. By assumption, there are functions h_1, h_2 ∈ A(Ω, X)
so that h_1(ω) ≠ h_2(ω). The family {cl θ(U) : ω ∈ U ∈ RO(Ω)} has the finite
intersection property and hence has nonempty intersection. Suppose that there are
two distinct points σ_1, σ_2 ∈ ⋂{cl θ(U) : ω ∈ U ∈ RO(Ω)}. By condition (L), there
are V_1, V_2 ∈ RO(Σ) and f ∈ A(Σ, Y) so that σ_i ∈ V_i and f = Th_i on V_i, i = 1, 2.
Thus T^{-1}f = h_i on θ^{-1}(V_i). However, if ω ∈ U ∈ RO(Ω), then θ(U) ∩ V_i ≠ ∅ and
hence U ∩ θ^{-1}(V_i) ≠ ∅. It follows that ω ∈ cl θ^{-1}(V_1) ∩ cl θ^{-1}(V_2). By continuity of
T^{-1}f, this would mean that h_1(ω) = T^{-1}f(ω) = h_2(ω), which is a contradiction.
Therefore, the intersection ⋂{cl θ(U) : ω ∈ U ∈ RO(Ω)} contains a unique point σ.
If condition 1 of Proposition 3.6 fails, there exists V ∈ N_σ such that θ(U) ∩ V^c ≠
∅ for all U ∈ RO(Ω) ∩ N_ω. Using compactness again, there exists σ′ such that
σ′ ∈ cl θ(U) ∩ V^c for all U ∈ RO(Ω) ∩ N_ω. Clearly, σ′ ≠ σ and both belong to
the intersection of the family {cl θ(U) : ω ∈ U ∈ RO(Ω)}, contrary to the previous
paragraph.

Theorem 3.7 extends to the case of complete metric domains, provided the sets of
functions satisfy an additional linking condition. Let Ω be a complete metric space,
X be a Hausdorff topological space and let A(Ω, X) be a subset of C(Ω, X). A
sequence (ω_n) in Ω is separated if inf_{m≠n} d(ω_m, ω_n) > 0.
(Ls) Let h_1, h_2 ∈ A(Ω, X) and let (ω_n) be a separated sequence in Ω. Then there
exist f ∈ A(Ω, X) and U_1, U_2 ∈ RO(Ω) so that each U_i contains infinitely
many ω_n and that f = h_i on U_i, i = 1, 2.
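
Two simple sequences on the real line, chosen here purely for illustration, show what the separation requirement in the definition above rules in and out.

```latex
% Examples in \Omega = \mathbb{R} with d(s,t) = |s - t|.
\[
  \omega_n = n: \qquad
  \inf_{m \neq n} |\omega_m - \omega_n| = 1 > 0
  \quad\text{(separated)},
\]
\[
  \omega_n = \tfrac1n: \qquad
  |\omega_{n+1} - \omega_n| = \tfrac{1}{n(n+1)} \to 0,
  \quad\text{so } \inf_{m \neq n} |\omega_m - \omega_n| = 0
  \quad\text{(not separated)}.
\]
```

In particular, a separated sequence has no Cauchy subsequence.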
Theorem 3.8 Suppose that A(Ω, X), A(Σ, Y) are nowhere trivial subsets of
C(Ω, X) and C(Σ, Y) respectively that satisfy conditions (L) and (Ls), where X, Y
are Hausdorff spaces and Ω, Σ are complete metric spaces. If T : A(Ω, X) →
A(Σ, Y) is a ⊥-isomorphism, then there is a homeomorphism ψ : Ω → Σ
associated with T.

Sketch of Proof By Theorem 3.3, there is a Boolean isomorphism θ : RO(Ω) →
RO(Σ) associated with T. A bit of reflection shows that in order to verify condition
1 of Proposition 3.6, it suffices to prove that if ω ∈ U_n ∈ RO(Ω), diam U_n → 0
and σ_n ∈ θ(U_n) for all n, then (σ_n) converges in Σ. Fix functions h_1, h_2 ∈ A(Ω, X)
so that h_1(ω) ≠ h_2(ω). If (σ_n) fails to be convergent, then either the sequence
has no accumulation point, or at least two accumulation points. In either case, from
condition (L) or (Ls), there are f ∈ A(Σ, Y) and V_1, V_2 ∈ RO(Σ) so that each V_i
contains infinitely many σ_n and f = Th_i on V_i, i = 1, 2. As in the proof of Theorem 3.7,
ω ∈ cl θ^{-1}(V_i), i = 1, 2 and T^{-1}f = h_i on θ^{-1}(V_i), which leads to a contradiction.

3.3 Representation

In many cases, it is possible to improve Theorems 3.7 and 3.8 by giving a functional
representation of the ⊥-isomorphism T .
Proposition 3.9 Let Ω, Σ, X, Y be Hausdorff spaces. Suppose that T :
A(Ω, X) → A(Σ, Y) is a bijection, where A(Ω, X), A(Σ, Y) are subsets of
C(Ω, X) and C(Σ, Y) respectively. Assume that there is a homeomorphism
ψ : Ω → Σ that is associated with T. If f, g ∈ A(Ω, X), ω_0 ∈ Ω
and there exists h ∈ A(Ω, X) so that ω_0 ∈ cl int [h = f] ∩ cl int [h = g], then
Tf(ψ(ω_0)) = Tg(ψ(ω_0)).
Proof Let U = int [h = f] and V = int [h = g]. By assumption, Th = Tf
on ψ(U) and Th = Tg on ψ(V). Since ψ is a homeomorphism and ω_0 ∈
cl U ∩ cl V, ψ(ω_0) ∈ cl ψ(U) ∩ cl ψ(V). By continuity of Tf, Tg and Th, Tf(ψ(ω_0)) =
Th(ψ(ω_0)) = Tg(ψ(ω_0)).

From the example following Theorem 3.3, the spaces listed there all satisfy
condition (L).
Theorem 3.10 Let Ω, Σ be first countable compact Hausdorff topological spaces
and let X, Y be convex sets in Hausdorff topological vector spaces, with X, Y
containing more than one point. If T : C(Ω, X) → C(Σ, Y) is a ⊥-isomorphism,
then there are a homeomorphism ψ : Ω → Σ and a function Φ : Ω × X → Y so
that

Tf(ψ(ω)) = Φ(ω, f(ω)) for all f ∈ C(Ω, X) and all ω ∈ Ω.

Sketch of Proof As mentioned above, C(Ω, X) and C(Σ, Y) both satisfy condition
(L). Hence there is a homeomorphism ψ : Ω → Σ associated with T by
Theorem 3.7. For any x ∈ X, let g_x ∈ C(Ω, X) be the constant function with
value x. Define Φ : Ω × X → Y by Φ(ω, x) = (Tg_x)(ψ(ω)). Let ω_0 ∈ Ω and
f ∈ C(Ω, X). Set x = f(ω_0). Using the first countability of Ω, one can easily
construct h ∈ C(Ω, X) so that ω_0 ∈ cl int [h = f] ∩ cl int [h = g_x]. By Proposition 3.9,

Tf(ψ(ω_0)) = (Tg_x)(ψ(ω_0)) = Φ(ω_0, x) = Φ(ω_0, f(ω_0)).

This completes the proof of the theorem.

It is not hard to see that in this case Φ is a continuous function on Ω × X. In
[18], it is shown that if 𝒳 is a Banach space, then the spaces in the example (taking
X = 𝒳 where appropriate) satisfy condition (Ls). Hence we obtain the next result
similarly.
Theorem 3.11 Let A(Ω, 𝒳) be one of the spaces U(Ω, 𝒳), U∗(Ω, 𝒳), Lip(Ω, 𝒳)
or Lip∗(Ω, 𝒳), where Ω is a complete metric space and 𝒳 is a Banach space.
Similarly for A(Σ, 𝒴). If T : A(Ω, 𝒳) → A(Σ, 𝒴) is a ⊥-isomorphism, then there
are a homeomorphism ψ : Ω → Σ and a function Φ : Ω × 𝒳 → 𝒴 so that

Tf(ψ(ω)) = Φ(ω, f(ω)) for all f ∈ A(Ω, 𝒳) and all ω ∈ Ω.

In some instances, additional information concerning the functions ψ and Φ is
known. For example, if T : U(Ω, 𝒳) → U(Σ, 𝒴), then it can be shown that ψ is
a uniform homeomorphism and Φ can be characterized. For details on this and for
⊥-isomorphisms T : Lip(Ω, 𝒳) → Lip(Σ, 𝒴), refer to [18].
Consider the space C^p(Ω, 𝒳), where p ∈ N and Ω is an open set in a Banach space
Z on which there is a C^p-bump function. It can be shown that if f, g ∈ C^p(Ω, 𝒳)
satisfy D^k f(ω_0) = D^k g(ω_0), 0 ≤ k ≤ p, for some ω_0 ∈ Ω, then there exists
h ∈ C^p(Ω, 𝒳) so that ω_0 ∈ cl int [h = f] ∩ cl int [h = g]. Therefore, we obtain the
following counterpart of the preceding theorems for these spaces. For k ∈ N, let
S^k(Z, 𝒳) be the space of all bounded symmetric k-linear operators from Z to 𝒳.
Theorem 3.12 Let p, q ∈ N and let Ω, Σ be open sets in Banach spaces on which there
are C^p-, respectively, C^q-bump functions. Suppose that T : C^p(Ω, 𝒳) → C^q(Σ, 𝒴)
is a ⊥-isomorphism. Denote by Z the Banach space containing Ω. Then there exist
a homeomorphism ψ : Ω → Σ and a function Φ : Ω × 𝒳 × S^1(Z, 𝒳) × · · · ×
S^p(Z, 𝒳) → 𝒴 so that

Tf(ψ(ω)) = Φ(ω, f(ω), Df(ω), . . . , D^p f(ω)), f ∈ C^p(Ω, 𝒳), ω ∈ Ω.

See [3] for a complete description of additive ⊥-isomorphisms T : C^p(Ω, 𝒳) →
C^q(Σ, 𝒴).

4 Applications

We present several applications of the results in Sect. 3.


4.1 Order Isomorphism

In this subsection, let Ω, Σ be regular topological spaces and let X, Y be totally
ordered sets endowed with the order topology, unless otherwise stated. Given
subsets A(Ω, X), A(Σ, Y) of C(Ω, X) and C(Σ, Y) respectively, an order
isomorphism is a bijection T : A(Ω, X) → A(Σ, Y) that preserves the pointwise order:
for all f, g ∈ A(Ω, X),

f(ω) ≤ g(ω) for all ω ∈ Ω ⇐⇒ Tf(σ) ≤ Tg(σ) for all σ ∈ Σ.

If A(Ω, X) and A(Σ, Y) are lattices (in the pointwise order), then an order
isomorphism is a lattice isomorphism. Following [28], we say that A(Ω, X) is
X-normal if for any disjoint closed sets F_1, F_2 in Ω and any x_1, x_2 ∈ X, there
exists f ∈ A(Ω, X) so that f = x_i on F_i, i = 1, 2. The following statement is
Kaplansky's Theorem in its full generality.
Theorem 4.1 (Kaplansky [28]) Let Ω, Σ be compact Hausdorff spaces and let
X, Y be totally ordered sets with the order topology. If C(Ω, X) and C(Σ, Y) are X-
and Y-normal respectively and there exists an order isomorphism T : C(Ω, X) →
C(Σ, Y), then Ω and Σ are homeomorphic.
We will see that Kaplansky's Theorem as well as similar results on other function
spaces can be derived from considerations in Sect. 3. A function f ∈ C(Ω, X) is
bounded if there are x_1, x_2 ∈ X so that x_1 ≤ f(ω) ≤ x_2 for all ω ∈ Ω. Clearly,
if Ω is compact Hausdorff, or if X has both largest and smallest elements, then all
functions in C(Ω, X) are bounded.
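
An order isomorphism in the sense defined above need not be linear. The following illustration is not taken from the text: it assumes Σ = Ω and X = Y = R, and uses the cube map as a convenient strictly increasing bijection of the real line.

```latex
% Illustration (an assumption of this sketch, not from the text):
% a nonlinear order isomorphism of C(\Omega) = C(\Omega, \mathbb{R}) onto itself.
\[
  (Tf)(\omega) = f(\omega)^{3}, \qquad
  (T^{-1}k)(\omega) = k(\omega)^{1/3}
  \quad\text{(real cube root)}.
\]
% Since t \mapsto t^3 is a strictly increasing bijection of \mathbb{R},
\[
  f \le g \text{ pointwise} \iff Tf \le Tg \text{ pointwise},
\]
% yet T(f+g) \neq Tf + Tg in general.  T also satisfies
\[
  [Tf \neq Th] = [f \neq h] \quad\text{for all } f, h \in C(\Omega),
\]
% so f \perp_h g \iff Tf \perp_{Th} Tg for every h.
```

In the language of Sect. 3.3, this map corresponds to the identity homeomorphism together with the function (ω, x) ↦ x^3.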
Lemma 4.2 Let Ω be a regular topological space and let X be a totally ordered set
with the order topology. Suppose that A(Ω, X) is an X-normal sublattice of C(Ω, X)
that consists of bounded functions. Then A(Ω, X) satisfies condition (L).
Proof Let h_1, h_2 ∈ A(Ω, X), U ∈ RO(Ω) and ω ∉ cl U. Since Ω is regular, there
exists V ∈ RO(Ω) containing ω so that cl V ∩ cl U = ∅. There are x_1, x_2 ∈ X so that
x_1 ≤ h_1(ω′), h_2(ω′) ≤ x_2 for all ω′ ∈ Ω. By X-normality, there are k_1, k_2 ∈ A(Ω, X)
so that

k_1 = x_2 on V, k_1 = x_1 on U and k_2 = x_1 on V, k_2 = x_2 on U.

Set k = (k_2 ∨ h_1) ∧ (k_1 ∨ h_2). Then k ∈ A(Ω, X). It is easy to see that k = h_1 on
V and k = h_2 on U. This completes the verification of condition (L).
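
The final claim in the proof of Lemma 4.2 amounts to a two-line lattice computation, spelled out below as a sketch using only the bounds x_1 ≤ h_1, h_2 ≤ x_2 and the values of k_1, k_2 on V and U. The display assumes the amsmath package.

```latex
% Check of k = (k_2 \vee h_1) \wedge (k_1 \vee h_2), using x_1 \le h_1, h_2 \le x_2.
\begin{align*}
  \text{On } V:\quad & k_2 = x_1,\ k_1 = x_2
     \ \Longrightarrow\ k_2 \vee h_1 = h_1,\ \ k_1 \vee h_2 = x_2
     \ \Longrightarrow\ k = h_1 \wedge x_2 = h_1 . \\
  \text{On } U:\quad & k_2 = x_2,\ k_1 = x_1
     \ \Longrightarrow\ k_2 \vee h_1 = x_2,\ \ k_1 \vee h_2 = h_2
     \ \Longrightarrow\ k = x_2 \wedge h_2 = h_2 .
\end{align*}
```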

Proposition 4.3 Let Ω, Σ be regular topological spaces and let X, Y be totally
ordered sets with the order topology. Suppose that A(Ω, X) is a sublattice of
C(Ω, X) that satisfies condition (L). Similarly for A(Σ, Y). If T : A(Ω, X) →
A(Σ, Y) is an order isomorphism, then T is a ⊥-isomorphism.

Proof Let f, g, h ∈ A(Ω, X) and suppose that f ⊥_h g and that f, g ≥ h. Then
f ∧ g = h and hence Tf ∧ Tg = Th; whence Tf ⊥_{Th} Tg. Similarly, Tf ⊥_{Th} Tg
if f ⊥_h g and f, g ≤ h.
Claim If f, g, h ∈ A(Ω, X), f ⊥_h g and f ≥ h ≥ g, then Tf ⊥_{Th} Tg.
Otherwise, there exists σ ∈ Σ so that Tf(σ) > Th(σ) > Tg(σ). Let U =
σ_{Tg}(Th). By condition (L), there exists k ∈ A(Σ, Y) so that k(σ) = Tf(σ) and
k = Th = Tg on int U^c. Replace k by (k ∨ Th) ∧ Tf if necessary to assume
additionally that Th ≤ k ≤ Tf.
If σ_g(T^{-1}k) ⊄ cl σ_g(h), there exist a nonempty W ∈ RO(Ω) and l ∈ A(Ω, X) so
that W ⊆ σ_g(T^{-1}k), l = g on σ_g(h) and l = T^{-1}k on W. Replace l by l ∨ g
if necessary so that l ≥ g. (Note that g ≤ h ≤ T^{-1}k ≤ f.) Then l, h ≥ g and
l ⊥_g h. Hence Tl ⊥_{Tg} Th. Since k = Tg on int U^c, σ_{Tg}(k) ⊆ U. Thus

σ_{Tg}(Tl) ∩ σ_{Tg}(k) ⊆ σ_{Tg}(Tl) ∩ U = σ_{Tg}(Tl) ∩ σ_{Tg}(Th) = ∅.

So Tl ⊥_{Tg} k. Since l, T^{-1}k ≥ g as well, l ⊥_g T^{-1}k. But l = T^{-1}k on W. Hence
T^{-1}k = g on W, which is absurd since W is a nonempty subset of σ_g(T^{-1}k).
This shows that σ_g(T^{-1}k) ⊆ cl σ_g(h) and thus σ_g(T^{-1}k) ⊆ σ_g(h).
By assumption, σ_h(f) ∩ σ_g(h) = σ_h(f) ∩ σ_h(g) = ∅. Therefore, σ_g(T^{-1}k) ∩
σ_h(f) = ∅. If ω ∈ σ_g(T^{-1}k), then ω ∉ σ_h(f) and hence f(ω) = h(ω). So
T^{-1}k(ω) = h(ω) since f ≥ T^{-1}k ≥ h. On the other hand, if ω ∉ σ_g(T^{-1}k),
then g(ω) = T^{-1}k(ω) and hence T^{-1}k(ω) = h(ω) since T^{-1}k ≥ h ≥ g.
Combining the two cases, we see that T^{-1}k = h and hence k = Th. This is
impossible since they differ at σ. This completes the proof of the claim.
Finally, let f, g, h ∈ A(Ω, X) with f ⊥_h g. Then f ⋄ h ⊥_h g ⋄ h, where
each ⋄ stands for one of the symbols (not necessarily the same) ∨ or ∧. By the first
paragraph and the Claim, T(f ⋄ h) ⊥_{Th} T(g ⋄ h). Since

[Tf ≠ Th] = [(Tf ∨ Th) ≠ Th] ∪ [(Tf ∧ Th) ≠ Th]
= [T(f ∨ h) ≠ Th] ∪ [T(f ∧ h) ≠ Th] and
[Tg ≠ Th] = [T(g ∨ h) ≠ Th] ∪ [T(g ∧ h) ≠ Th],

we see that [Tf ≠ Th] ∩ [Tg ≠ Th] = ∅, i.e., Tf ⊥_{Th} Tg. By symmetry,
Tf ⊥_{Th} Tg implies f ⊥_h g. This completes the proof of the proposition.

The next result generalizes Kaplansky's Theorem and follows immediately from
Theorem 3.7, Lemma 4.2 and Proposition 4.3.
Theorem 4.4 (See also [15]) Let Ω, Σ be compact Hausdorff spaces and let X, Y
be totally ordered sets with the order topology. If A(Ω, X) is an X-normal sublattice
of C(Ω, X), A(Σ, Y) is a Y-normal sublattice of C(Σ, Y) and there is an order
isomorphism T : A(Ω, X) → A(Σ, Y), then Ω and Σ are homeomorphic.

Proposition 4.3 and Theorem 3.11 also yield the following.

Theorem 4.5 Let A(Ω) be one of the spaces of real valued functions U(Ω), U∗(Ω),
Lip(Ω) or Lip∗(Ω), where Ω is a complete metric space. Similarly for A(Σ). If
T : A(Ω) → A(Σ) is an order isomorphism, then there are a homeomorphism
ψ : Ω → Σ and a function Φ : Ω × R → R so that

Tf(ψ(ω)) = Φ(ω, f(ω)) for all f ∈ A(Ω) and all ω ∈ Ω.

Linear and nonlinear lattice and order isomorphisms have been well studied in a
variety of function spaces. Garrido and Jaramillo [20] showed that the unital vector
lattices U(Ω) and U∗(Ω) determine Ω up to uniform homeomorphism. In [21], the
same authors showed that as a unital vector lattice, Lip(Ω) determines Ω up to
Lipschitz homeomorphism. For Lipschitz spaces defined on Banach spaces, F. and J.
Cabello Sánchez showed that Lip(R) and Lip∗(R) are isomorphic as vector lattices.
However, if 𝒳 is a Banach space of dimension > 1 and 𝒴 is a Banach space, then
Lip∗(𝒳) is not isomorphic as a vector lattice to Lip(𝒴) [14].
As a lattice alone (i.e., disregarding linearity), Shirota [37] proved that if U∗(Ω)
and U∗(Σ) are lattice isomorphic, with Ω, Σ complete metric spaces, then Ω is
uniformly homeomorphic to Σ. In the same paper, the claim was also made for
lattice isomorphisms T : U(Ω) → U(Σ); but the proof contains a gap. The gap
was repaired by F. Cabello Sánchez [11] and F. and J. Cabello Sánchez [13]. The
same authors also showed that if T : C^p(Ω) → C^p(Σ) is an order isomorphism,
where p ∈ N ∪ {∞} and Ω, Σ are manifolds modeled on Banach spaces that support
C^p-bump functions, then Ω and Σ are homeomorphic [12]. A unified treatment of
order isomorphisms between function spaces can be found in [32].

4.2 Realcompact Spaces

One can also consider the situation for Theorem 4.4 away from the confines
of compact Hausdorff spaces. A completely regular Hausdorff space Ω has a
"largest" compactification, the Stone-Čech compactification βΩ, characterized by
the fact that every continuous function f from Ω into a compact Hausdorff
space X has a unique continuous extension f̂ : βΩ → X. A good source of
information concerning the Stone-Čech compactification is [39]. For the purpose
of extending the Gelfand-Kolmogorov Theorem, Hewitt [26] introduced the class of
realcompact spaces. Let R∞ be the one point compactification of R. The (Hewitt)
realcompactification υΩ consists of all ω_0 ∈ βΩ such that for any continuous
real-valued function f on Ω, its continuous extension f̂ : βΩ → R∞ satisfies
f̂(ω_0) ∈ R. Ω is realcompact if Ω = υΩ. It is known that a space is realcompact if
and only if it is homeomorphic to a closed subspace of a product of copies of R over
some index set; see, e.g., [22]. Hewitt showed that for realcompact spaces, C(Ω) as a
ring determines Ω uniquely up to homeomorphism. The result was generalized by
Araujo et al. [5], and subsequently by Araujo to vector valued functions [1, 2]. If 𝒳
and 𝒴 are vector spaces, denote the set of all linear bijections from 𝒳 onto 𝒴 by I(𝒳, 𝒴).
Theorem 4.6 ([2]) Let Ω and Σ be realcompact spaces and let 𝒳 and 𝒴 be normed
spaces. If T : C(Ω, 𝒳) → C(Σ, 𝒴) is a linear biseparating map (i.e., linear
⊥-isomorphism), then there are a homeomorphism ϕ : Σ → Ω and a function J :
Σ → I(𝒳, 𝒴) so that

Tf(σ) = (Jσ)f(ϕ(σ)) for all f ∈ C(Ω, 𝒳) and all σ ∈ Σ.

Without the assumption of linearity in Theorem 4.6, it is still possible to conclude
that Ω and Σ are homeomorphic. But the representation of T may not hold.
Theorem 4.7 ([18]) Let Ω, Σ be realcompact spaces and 𝒳, 𝒴 be Hausdorff
topological vector spaces. If T : C(Ω, 𝒳) → C(Σ, 𝒴) is a ⊥-isomorphism, then
Ω and Σ are homeomorphic.
In particular, using the arguments of Lemma 4.2 and Proposition 4.3, the result
can be applied to lattice isomorphisms. We emphasize that the lattice isomorphism
below need not be linear.
Theorem 4.8 ([32]) Let Ω and Σ be realcompact spaces. If there is a lattice
isomorphism T : C(Ω) → C(Σ), then Ω and Σ are homeomorphic.

4.3 Ring Isomorphism and Multiplicative Isomorphism

Let Ω be a Hausdorff topological space. The space C(Ω) of all real valued
continuous functions on Ω is a (unital) ring under pointwise operations. The subring
C∗(Ω) consists of the bounded functions. Clearly, the (pointwise) order on these
rings is determined by the ring structure since f ≥ 0 if and only if f is a square in
the ring. It follows immediately that results from Sect. 4.1 give rise to corresponding
results concerning ring isomorphisms. In particular, we cite Hewitt's generalization
of the theorem of Gelfand-Kolmogorov as a consequence of Theorem 4.8.
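
The passage from ring structure to order can be made explicit. The sketch below writes out why a ring isomorphism T : C(Ω) → C(Σ) preserves the pointwise order in both directions; the letter k for a square root is introduced here, and the display assumes the amsmath package.

```latex
% Sketch: a ring isomorphism T : C(\Omega) \to C(\Sigma) preserves the pointwise order.
\begin{align*}
  f \le g
  &\iff g - f \ge 0
   \iff g - f = k^{2} \text{ for some } k \in C(\Omega)
     \quad (\text{take } k = \sqrt{g - f}) \\
  &\iff Tg - Tf = (Tk)^{2} \text{ for some } k \in C(\Omega) \\
  &\iff Tg - Tf \ge 0
   \iff Tf \le Tg .
\end{align*}
```

The step from the third line to the fourth uses the surjectivity of T: every element of C(Σ) is of the form Tk, so Tg − Tf is of the form (Tk)^2 exactly when it is a square in C(Σ).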
Theorem 4.9 ([26]) Let Ω and Σ be realcompact spaces. If C(Ω) and
C(Σ) are isomorphic as rings, then Ω and Σ are homeomorphic.
If Ω is a complete metric space, then Lip∗(Ω) is a ring under pointwise
operations. Ring isomorphisms between such rings were described in [21]. Let
Ω, Σ be metric spaces. A function ψ : Ω → Σ is Lipschitz in the small if there
exist r, K > 0 so that d(ψ(ω_1), ψ(ω_2)) ≤ Kd(ω_1, ω_2) whenever ω_1, ω_2 ∈ Ω and
d(ω_1, ω_2) < r. ψ is an LS-homeomorphism if it is a homeomorphism so that both ψ
and ψ^{-1} are Lipschitz in the small.
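
Being Lipschitz in the small is strictly weaker than being Lipschitz. A simple illustration, chosen here and not taken from the text, uses the natural numbers with the usual distance.

```latex
% Illustration: \Omega = \mathbb{N} with d(m,n) = |m - n|,
% \psi : \mathbb{N} \to \mathbb{R}, \psi(n) = n^2.
\[
  d(m,n) < 1 \ \Longrightarrow\ m = n \ \Longrightarrow\
  |\psi(m) - \psi(n)| = 0 \le K\, d(m,n),
\]
% so \psi is Lipschitz in the small with r = 1 and any K > 0.  However,
\[
  \frac{|\psi(n+1) - \psi(n)|}{d(n+1,n)} = 2n + 1 \to \infty ,
\]
% so \psi is not Lipschitz.
```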

Theorem 4.10 (Garrido and Jaramillo) Let Ω, Σ be complete metric spaces. The
following are equivalent.
1. Lip∗(Ω) and Lip∗(Σ) are isomorphic as unital rings.
2. Lip∗(Ω) and Lip∗(Σ) are isomorphic as unital vector lattices.
3. Ω and Σ are LS-homeomorphic.
Garrido et al. showed that the ring of smooth functions C^∞(M) determines
the manifold M up to smooth diffeomorphism. For notions and notation regarding
global analysis on infinite dimensional manifolds, refer to [29].
Theorem 4.11 (Garrido et al. [23]) Let M and N be paracompact Banach
manifolds modeled on C^∞-smooth Banach spaces. The rings C^∞(M) and C^∞(N)
are isomorphic if and only if M and N are C^∞-diffeomorphic.
Instead of ring isomorphisms, one can disregard linearity and consider maps that
preserve multiplication alone.
Proposition 4.12 Let Ω, Σ be Hausdorff spaces and let A(Ω), A(Σ) be unital
subrings of C(Ω) and C(Σ) respectively. Assume that either
1. Ω and Σ are compact and A(Ω), A(Σ) satisfy condition (L); or
2. Ω and Σ are complete metric spaces and A(Ω), A(Σ) satisfy conditions (L) and
(Ls).
If T : A(Ω) → A(Σ) is a multiplicative isomorphism, i.e., T is a bijection so that
T(fg) = Tf · Tg for all f, g ∈ A(Ω), then there is a homeomorphism ψ : Ω → Σ
that is associated with T in the sense defined before Theorem 3.7. In particular, T
is a ⊥-isomorphism.
Proof By assumption, the constant functions belong to A(Ω) and A(Σ). Let
0, 2 denote the constant functions with values 0, 2 respectively. Then

2 · T0 = T(T⁻¹2 · 0) = T0.

Hence T0 = 0. For any f, g ∈ C(Ω),

f ⊥_0 g ⇐⇒ fg = 0 ⇐⇒ Tf · Tg = T0 = 0 ⇐⇒ Tf ⊥_{T0} Tg.

Hence T is a ⊥_0-isomorphism. Note that Ω is a regular topological space and
that A(Ω) is nowhere trivial and satisfies condition (L). Hence A(Ω) is weakly
regular. Similarly for A(Σ). By Theorem 3.2, there is a Boolean isomorphism
θ0 : RO(Ω) → RO(Σ) associated with (T, 0). The same argument from the proof
of Theorem 3.7 or Theorem 3.8 shows that Proposition 3.6 applies to θ0. Thus there
is a homeomorphism ψ : Ω → Σ so that for any f ∈ C(Ω) and any U ∈ RO(Ω),
f = 0 on U if and only if Tf = 0 on ψ(U).
In fact, ψ is associated with T. Let U ∈ RO(Ω) and let f, g ∈ A(Ω) be such
that f = g on U. For σ ∈ ψ(U), it follows from condition (L) that there exists
h ∈ C(Σ) so that h(σ) = 1 and h = 0 on int ψ(U)^c = ψ(int U^c). Hence T⁻¹h =
T⁻¹0 = 0 on int U^c. Thus T⁻¹h · f = T⁻¹h · g. Therefore, h · Tf = h · Tg. In
particular, Tf(σ) = Tg(σ). This proves that Tf = Tg on ψ(U) if f = g on U.
The reverse implication follows by symmetry.
Let θ : RO() → RO() be the Boolean isomorphism given by θ (U ) = ψ(U ).
Then θ is associated with T . Hence T is a ⊥-isomorphism by Theorem 3.3.
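The first step of the proof, T0 = 0, can be spelled out as follows. This is only an expanded sketch of the displayed identity, using nothing beyond the multiplicativity and bijectivity of T.

```latex
% Expanded version of the step  2 \cdot T0 = T(T^{-1}2 \cdot 0) = T0.
\begin{align*}
  2 \cdot T0 &= T(T^{-1}2) \cdot T0   && \text{since } T(T^{-1}2) = 2, \\
             &= T(T^{-1}2 \cdot 0)    && \text{since } T \text{ is multiplicative}, \\
             &= T0                    && \text{since } T^{-1}2 \cdot 0 = 0 .
\end{align*}
% Evaluating at a point \sigma gives 2\,T0(\sigma) = T0(\sigma), hence T0(\sigma) = 0.
```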

Proposition 4.12 and Theorem 3.11 give the following.
Theorem 4.13 Let , be complete metric spaces and let A() be one of the
spaces U∗ () or Lip∗ (). Similarly for A(). Let T : A() → A() be a
multiplicative isomorphism. Then there are a homeomorphism ψ : → and
a function : × R → R so that

Tf (ψ(ω)) = (ω, f (ω)) for all f ∈ A() and all ω ∈ .

Milgram [35] characterized all multiplicative isomorphisms T : C() → C().

A combination of Proposition 4.12 and Theorems 3.7, 3.10 gives a partial result
in this regard. See [15] for a proof of Milgram’s Theorem via ⊥-isomorphisms.
When p ∈ N and Ω is a C^p-manifold, Mrčun and Šemrl [36] showed that all
multiplicative automorphisms T on C^p(Ω) are of the form Tf = f ◦ ψ for some
C^p-diffeomorphism ψ. The result was extended to the case p = ∞ by
Artstein-Avidan et al. [8]. See [30] for a survey on the multiplication operator and other
operator functional equations.

4.4 Isometry

The study of isometries is probably the best developed part of the theory of
Banach-Stone type theorems. Here we restrict ourselves to a much abridged survey. Further
information can be found in the two-volume monograph [19].
Behrends [10] introduced the use of centralizers into Banach-Stone considerations.
Let X be a Banach space and denote the set of extreme points of the unit ball
of X∗ by ext X∗. A bounded linear operator S : X → X is a multiplier if every
x∗ ∈ ext X∗ is an eigenvector of S∗, i.e., S∗x∗ = aS(x∗)x∗ for some scalar aS(x∗).
If R, S are multipliers, say that R is an adjoint of S if aR(x∗) is the complex
conjugate of aS(x∗) for all x∗ ∈ ext X∗. The centralizer Z(X) of X consists of all
multipliers S for which an adjoint exists. Note that for real Banach spaces, the
centralizer is the same as the set of all multipliers. Multiples of the identity operator
are always present in the centralizer. Say that X has trivial centralizer if there are
no other operators in Z(X).
Many classes of Banach spaces have trivial centralizers; refer to [10, 19].
Theorem 4.14 (Behrends) Let X and Y be Banach spaces which have trivial
centralizers. Suppose further that Ω and Σ are locally compact Hausdorff spaces
and that there exists a surjective linear isometry T : C0(Ω, X) → C0(Σ, Y). Then
there is a homeomorphism ϕ : Σ → Ω and a continuous function V from Σ into the
space of isometries from X onto Y (given the strong operator topology) such that

Tf(σ) = V(σ)f(ϕ(σ)) for all f ∈ C0(Ω, X) and all σ ∈ Σ.

Araujo [4] extended this result by way of finding a connection to
⊥-isomorphisms (biseparating maps).
Theorem 4.15 (Araujo) Let X and Y be Banach spaces which have trivial
centralizers. Assume one of the following situations.
1. Ω, Σ are realcompact spaces and X, Y are infinite dimensional. A(Ω, X) =
C∗(Ω, X), the space of bounded X-valued continuous functions on Ω, with the
sup-norm. A(Σ, Y) = C∗(Σ, Y).
2. Ω, Σ are complete metric spaces, A(Ω, X) = U∗(Ω, X), the space of bounded
X-valued uniformly continuous functions on Ω. A(Σ, Y) = U∗(Σ, Y).
If T : A(Ω, X) → A(Σ, Y) is a surjective linear isometry, then it is a
⊥-isomorphism. Consequently, there is a homeomorphism ϕ : Σ → Ω and a
continuous function V from Σ into the space of isometries from X onto Y (given
the strong operator topology) such that

Tf(σ) = V(σ)f(ϕ(σ)) for all f ∈ A(Ω, X) and all σ ∈ Σ.

In case (2), ϕ is a uniform homeomorphism.

Let Ω be a complete metric space and let X be a Banach space. The space of
X-valued Lipschitz functions on Ω, Lip(Ω, X), is a Banach space under the norm

‖f‖ = max{‖f‖∞, L(f)},

where

‖f‖∞ = sup_{ω∈Ω} ‖f(ω)‖ and L(f) = sup_{ω1≠ω2} ‖f(ω1) − f(ω2)‖ / d(ω1, ω2).

Araujo and Dubarbie [6] gave a complete description of isometries between vector-
valued spaces of Lipschitz functions. We state a special case of their result here.
Define an equivalence relation on by x ∼ y if there are x = x1 , . . . , xn = y
in so that d(xi , xi+1 ) < 2, 1 ≤ i < n. The equivalence classes are called the
2-components of .
Theorem 4.16 Let Ω, Σ be complete metric spaces and let X, Y be Banach spaces.
Assume that T : Lip(Ω, X) → Lip(Σ, Y) is a surjective linear isometry so that for
all σ ∈ Σ, there is a constant function f ∈ Lip(Ω, X) so that Tf(σ) ≠ 0. Then
there is a homeomorphism ϕ : Σ → Ω and a continuous function V from Σ into the
space of isometries from X onto Y (given the strong operator topology) such that

Tf(σ) = V(σ)f(ϕ(σ)) for all f ∈ Lip(Ω, X) and all σ ∈ Σ.

Moreover, V is constant on each 2-component of Σ and dY(ϕ(ω1), ϕ(ω2)) =
dX(ω1, ω2) if either of these quantities is < 2.
It is worth mentioning that in the course of the proof of Theorem 4.16, it is
first shown that T is a ⊥-isomorphism (biseparating). Characterization of linear
isometries on certain spaces of scalar-valued Lipschitz functions was obtained
earlier by Weaver [40].

4.5 Nonvanishing Preservers

In this part, assume that Ω, Σ are regular topological spaces and X, Y are Hausdorff
spaces. Let A(Ω, X) and A(Σ, Y) be subsets of C(Ω, X) and C(Σ, Y) respectively.
Given n ∈ N and h ∈ A(Ω, X), a bijection T : A(Ω, X) → A(Σ, Y) is a
∩^n_h-isomorphism if for any f1, . . . , fn ∈ A(Ω, X),

⋂_{i=1}^n [fi = h] = ∅ ⇐⇒ ⋂_{i=1}^n [Tfi = Th] = ∅.

T is a ∩^n-isomorphism if it is a ∩^n_h-isomorphism for all h ∈ A(Ω, X). It is clear
that every ∩^n_h-isomorphism is a ∩^m_h-isomorphism if n > m. Hence every
∩^n-isomorphism is a ∩^m-isomorphism if n > m. ∩^n-isomorphisms were introduced
by Hernández and Ródenas [25]. Further results were given in [16, 31].
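The monotonicity in n stated above comes from padding a list of functions with repetitions. A short sketch, in the notation of the definition (here m < n, and f_1, …, f_m are arbitrary elements of the domain of T):

```latex
% Padding: repeat the last function so that a list of length m becomes a list of length n.
\[
  f_{m+1} = \dots = f_n := f_m
  \quad\Longrightarrow\quad
  \bigcap_{i=1}^{n} [f_i = h] = \bigcap_{i=1}^{m} [f_i = h]
  \quad\text{and}\quad
  \bigcap_{i=1}^{n} [Tf_i = Th] = \bigcap_{i=1}^{m} [Tf_i = Th].
\]
% Hence the defining equivalence for lists of n functions yields the one for lists of m functions.
```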
Proposition 4.17 Let Ω, Σ be regular topological spaces. Suppose that A(Ω, X)
and A(Σ, Y) satisfy condition (L) and that there exists k ∈ A(Ω, X) so that
[k = h] = ∅. If T : A(Ω, X) → A(Σ, Y) is a ∩^2_h-isomorphism, then it is a
⊥_h-isomorphism.
Proof First of all, since T is a ∩^2_h-isomorphism, it is a ∩^1_h-isomorphism. Thus
[k = h] = ∅ implies [Tk = Th] = ∅. Suppose that there are f, g ∈ A(Ω, X) so that
f ⊥_h g but Tf ⊥_{Th} Tg fails. There exists σ0 ∈ Σ where Tf(σ0), Tg(σ0) ≠ Th(σ0).
Since Σ is a regular topological space, there exists V ∈ RO(Σ) containing σ0 so
that the closure cl V satisfies cl V ⊆ [Tf ≠ Th] ∩ [Tg ≠ Th]. Then σ0 ∉ int V^c and
int V^c ∈ RO(Σ). As A(Σ, Y) satisfies condition (L), there exist l ∈ A(Σ, Y) and
W ∈ RO(Σ) so that σ0 ∈ W, l = Tk on int V^c and l = Th on W. Now

[Tf ≠ Th] ∪ [l ≠ Th] ⊇ cl V ∪ int V^c = Σ.

Thus [Tf = Th] ∩ [l = Th] = ∅ and hence [f = h] ∩ [T⁻¹l = h] = ∅. Similarly,
[g = h] ∩ [T⁻¹l = h] = ∅. But since f ⊥_h g, [f = h] ∪ [g = h] = Ω. Therefore,
[T⁻¹l = h] = ∅, whence [l = Th] = ∅, contradicting the fact that l = Th on
W ≠ ∅. This completes the proof of the proposition.

The next two results follow easily from Proposition 4.17, Theorem 3.7 and
Theorem 3.11.
Theorem 4.18 Let , be compact Hausdorff spaces and let X , Y be normed
spaces. If T : C(, X ) → C(, Y) is a ∩2 -isomorphism, then there is a
homeomorphism ψ : → associated with T .
Theorem 4.19 Let , be complete metric spaces and let X , Y be normed spaces.
Suppose that A(, X ) is one of the spaces U (, X ), U∗ (, X ), Lip(, X ),
Lip∗ (, X ). Similarly for A(, Y). If T : A(, X ) → A(, Y) is a ∩2 -
isomorphism, then there are a homeomorphism ψ : → and a function
: × X → Y so that

Tf (ψ(ω)) = (ω, f (ω)) for all f ∈ A(, X ) and all ω ∈ .

In general, a ∩^1-isomorphism need not be a ∩^2-isomorphism, as the following
example shows.
Example Let I = [0, 1]. Define T : C(I, I) → C(I, I) by

Tf = 1 − f if range f = [0, 1],
Tf = f otherwise.

Then T is a ∩^1-isomorphism but not a ∩^2-isomorphism, nor is T a ⊥-isomorphism.
Indeed, it is easy to check that if f, g ∈ C(I, I), then [f = g] = ∅ if and only
if [Tf = Tg] = ∅. However, let f be the constant function with value 1/4 and let
g ∈ C(I, I) be such that g(1/4) ≠ 1/4 = g(3/4) and range g ≠ [0, 1]. Let h be the
identity function h(t) = t for all t ∈ I. Then [f = h] ∩ [g = h] = ∅ but

3/4 ∈ [f = 1 − h] ∩ [g = 1 − h] = [Tf = Th] ∩ [Tg = Th].

Hence T is not a ∩^2-isomorphism.
To see that T is not a ⊥-isomorphism, consider the same h but take f = h ∨ 1/2,
g = h ∧ 1/2. It is clear that f ⊥_h g but that Tf ⊥_{Th} Tg fails.
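For a concrete check of the first part of the example, one may take g(t) = min{1 − t, 1/2}. This particular g is our own choice and does not appear in the source; it satisfies g(1/4) ≠ 1/4 = g(3/4) and has range [0, 1/2], which is what the computation needs.

```latex
% Concrete data (g is our own choice): f \equiv 1/4, h(t) = t, g(t) = \min\{1-t, 1/2\} on I = [0,1].
% Then g(1/4) = 1/2 \neq 1/4, g(3/4) = 1/4, and range g = [0, 1/2] \neq [0,1].
\begin{align*}
  [f = h] &= \{\tfrac14\}, & [g = h] &= \{\tfrac12\}, & [f = h] \cap [g = h] &= \emptyset, \\
  Tf &= f, & Tg &= g, & Th &= 1 - h, \\
  [Tf = Th] &= \{\tfrac34\}, & [Tg = Th] &= [\tfrac12, 1], & [Tf = Th] \cap [Tg = Th] &= \{\tfrac34\}.
\end{align*}
```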
However, Li and Wong [33, 34] obtained a number of results regarding linear
∩1 -isomorphisms. The theorem below gives some special cases of their results.
Theorem 4.20 (Li and Wong) Let Ω, Σ be Hausdorff completely regular topological
spaces and let X, Y be normed spaces. Assume that A(Ω, X) is the space
C(Ω, X) or, when Ω is a complete metric space, U(Ω, X) or Lip(Ω, X). Similarly for
A(Σ, Y). If T : A(Ω, X) → A(Σ, Y) is a linear ∩^1-isomorphism, then it is a
⊥-isomorphism.
We close with a positive result concerning nonlinear ∩^1-isomorphisms. Let Ω, Σ
be Hausdorff spaces. We call a bijection T : C(Ω) → C(Σ) an anti-order
isomorphism if f ≥ g ⇐⇒ Tg ≥ Tf for all f, g ∈ C(Ω). Evidently, T is
an anti-order isomorphism if and only if the operator −T is an order isomorphism, where
(−T)f := −Tf.
Theorem 4.21 Let , be connected compact Hausdorff spaces. If T : C() →
C() is a ∩1 -isomorphism, then T is an order isomorphism or an anti-order
isomorphism. In particular, T is a ⊥-isomorphism and hence and are
homeomorphic.
Proof Since is connected, given any two functions h, k ∈ C() with [h = k] =
∅, either h > k (i.e., h(ω) > k(ω) for all ω) or k > h. Similarly for C(). We break
the proof of the theorem into a series of steps.
Claim 1 If f, g ∈ C() and f ≤ g, then either Tf ≤ T g or T g ≤ Tf .
Otherwise, there are σ1 , σ2 ∈ such that Tf (σ1 ) > T g(σ1 ) and T g(σ2 ) >
Tf (σ2 ). Let ki ∈ C() be functions such that k2 > Tf > k1 and ki (σi ) =
Tg(σi), i = 1, 2. For i = 1, 2, [T⁻¹ki = f] = ∅ and [T⁻¹ki = g] ≠ ∅. By the
statement before Claim 1, T −1 ki > f . Thus there exists h ∈ C() so that h > f
and [h = T −1 ki ] = ∅, i = 1, 2. But then [T h = Tf ] = ∅ and [T h = ki ] = ∅,
i = 1, 2; hence Tf < T h and T h < Tf , contrary to the statement before Claim
1.
Claim 2 If f, g, h ∈ C() and f ≤ g, h, then either Tf ≤ T g, T h or T g, T h ≤
Tf .
If either of g, h equals f , then Claim 2 follows from Claim 1. Otherwise, we may
choose ω1 , ω2 ∈ so that g(ω1 ) > f (ω1 ) and h(ω2 ) > f (ω2 ). Let k ∈ C()
be such that k > f and k(ω1 ) = g(ω1 ), k(ω2 ) = h(ω2 ). By the first statement
of the proof, either T k > Tf or T k < Tf . Assume the former. By Claim 1,
either Tg ≥ Tf or Tg ≤ Tf. Since [Tg = Tk] ≠ ∅, we must have Tg ≥ Tf.
Similarly T h ≥ Tf . If T k < Tf , then we can show analogously that T g, T h ≤
Tf .
The following variant of Claim 2 can be established in the same way: if g, h ≤ f ,
then either T g, T h ≤ Tf or Tf ≤ T g, T h.
Claim 3 Let f ∈ C(Ω). Then either

g ≤ f ≤ h ⟹ Tg ≤ Tf ≤ Th or
g ≤ f ≤ h ⟹ Th ≤ Tf ≤ Tg.

In the first case, we say that T is order preserving with respect to f and in the
second case, T is anti-order preserving with respect to f.
Otherwise, there are g, h ≠ f, g ≤ f ≤ h so that either Tg, Th ≥ Tf or
Tg, Th ≤ Tf. Apply Claim 2 or its variant to T⁻¹ to see that either g, h ≥ f or
g, h ≤ f, contrary to the choices of g and h.
We now show that either T is an order isomorphism or an anti-order isomorphism.
Otherwise, taking symmetry into account, by Claim 3, we may assume that
there are h1, h2 so that T is order preserving with respect to h1 and anti-order
preserving with respect to h2. Since T is a bijection, h1 ≠ h2. Now h1 ∧ h2 ≤ h1, h2.
Thus Th2 ≤ T(h1 ∧ h2) ≤ Th1. On the other hand, by Claim 2, either T(h1 ∧ h2) ≤
Th1, Th2 or T(h1 ∧ h2) ≥ Th1, Th2. Assume the former case; the proof is similar
in the latter case. We have Th2 ≤ T(h1 ∧ h2) ≤ Th2. Hence h1 ∧ h2 = h2,
i.e., h2 ≤ h1. But since T is order preserving with respect to h1 and anti-order
preserving with respect to h2, Th2 ≤ Th1 and Th1 ≤ Th2. Thus h1 = h2, contrary
to their choices. This concludes the proof that T is either an order isomorphism or
an anti-order isomorphism. Applying Proposition 4.3 to either T or −T, we see that T
is a ⊥-isomorphism. By Theorem 4.4, Ω and Σ are homeomorphic.

References

1. J. Araujo, Realcompactness and spaces of vector-valued continuous functions. Fund. Math.
172, 27–40 (2002)
2. J. Araujo, Realcompactness and Banach-Stone theorems. Bull. Belg. Math. Soc. 10, 247–258
(2003)
3. J. Araujo, Linear biseparating maps between spaces of vector-valued differentiable functions
and automatic continuity. Adv. Math. 187, 488–520 (2004)
4. J. Araujo, The noncompact Banach-Stone theorem. J. Oper. Theory 55, 285–294 (2006)
5. J. Araujo, E. Beckenstein, L. Narici, Biseparating maps and homeomorphic realcompactifica-
tions. J. Math. Anal. Appl. 192, 258–265 (1995)
6. J. Araujo, L. Dubarbie, Noncompactness and noncompleteness in isometries of Lipschitz
spaces. J. Math. Anal. Appl. 377, 15–29 (2011)
7. R.F. Arens, J.L. Kelley, Characterizations of the space of continuous functions over a compact
Hausdorff space. Trans. Am. Math. Soc. 62, 499–508 (1947)
8. S. Artstein-Avidan, D. Faifman, V. Milman, On Multiplicative Maps of Continuous and Smooth
Functions, GAFA Seminar 2006–2010, Springer Lecture Notes in Math., vol. 2050 (2012), pp.
35–59
9. S. Banach, Théorie des Opérations Linéaires (Warszawa, 1932); reprinted: Chelsea Publ., New
York, 1963
10. E. Behrends, M-structure and the Banach-Stone Theorem (Springer, Berlin, 1978)
11. F. Cabello Sánchez, Homomorphisms on lattices of continuous functions. Positivity 12, 341–
362 (2008)
12. F. Cabello Sánchez, J. Cabello Sánchez, Some preserver problems on algebras of smooth
functions. Ark. Math. 48, 289–300 (2010)
13. F. Cabello Sánchez, J. Cabello Sánchez, Lattices of uniformly continuous functions. Topol.
Appl. 160, 50–55 (2013)
14. F. Cabello Sánchez, J. Cabello Sánchez, Quiz your maths – Do the uniformly continuous
functions on the line form a ring? Proc. Am. Math. Soc. 147, 4301–4313 (2019)
15. L.G. Cordeiro, A general Banach-Stone type theorem and applications. J. Pure Appl. Algebra
224 (2020). https://doi.org/10.1016/j.jpaa.2019.106275
16. L. Dubarbie, Maps preserving common zeros between subspaces of vector-valued continuous
functions. Positivity 14, 695–703 (2010)
17. X. Feng, Nonlinear biseparating operators on vector-valued function spaces, PhD thesis,
National University of Singapore, 2018
18. X. Feng, D.H. Leung, Nonlinear biseparating maps. Preprint (2020). arXiv:2009.11570
19. R.J. Fleming, J.E. Jamison, Isometries on Banach Spaces, vols. 1 and 2 (Chapman & Hall,
2007)
20. M.I. Garrido, J.A. Jaramillo, A Banach-Stone theorem for uniformly continuous functions.
Monatsh. Math. 131, 189–192 (2000)
21. M.I. Garrido, J.A. Jaramillo, Homomorphisms on function lattices. Monatsh. Math. 141, 127–
146 (2004)
22. M.I. Garrido, J.A. Jaramillo, Variations on the Banach-Stone theorem. Extracta Math. 17, 351–
383 (2002)
23. M.I. Garrido, J.A. Jaramillo, J.A. Prieto, Banach-Stone theorems for Banach manifolds. Rev.
Real Acad. Cienc. Exact. Fis. Natur. Madrid 94, 525–528 (2000)
24. I. Gelfand, A.N. Kolmogoroff, On rings of continuous functions on a topological space. C. R.
(Doklady) URSS 22, 11–15 (1939)
25. S. Hernández, A.M. Ródenas, Automatic continuity and representation of group homomor-
phisms defined between groups of continuous functions. Topology Appl. 154, 2089–2098
(2007)
26. E. Hewitt, Rings of real-valued continuous functions, I. Trans. Am. Math. Soc. 64, 54–99
(1948)
27. K. Jarosz, Automatic continuity of separating linear isomorphisms. Canad. Math. Bull. 33,
139–144 (1990)
28. I. Kaplansky, Lattices of continuous functions. Bull. Am. Math. Soc. 53, 617–623 (1947)
29. A. Kriegl, P.W. Michor, The Convenient Setting of Global Analysis. AMS Mathematical
Surveys and Monographs, vol. 53 (AMS, Providence, 1997)
30. H. König, V. Milman, Operator functional equations in analysis, in Asymptotic Geometric
Analysis, ed. by M. Ludwig et al., Fields Institute Communications, vol. 68, 189–209 (2013)
31. D.H. Leung, W.-K. Tang, Banach–Stone Theorems for maps preserving common zeros.
Positivity 14, 17–42 (2010)
32. D.H. Leung, W.-K. Tang, Nonlinear order isomorphisms on function spaces. Dissertationes
Math. 517, 1–75 (2016)
33. L. Li, N.-C. Wong, Kaplansky theorem for completely regular spaces. Proc. Am. Math. Soc.
142, 1381–1389 (2014)
34. L. Li, N.-C. Wong, Banach-Stone theorems for vector valued functions on completely regular
spaces. J. Math. Anal. Appl. 395, 265–274 (2012)
35. A.N. Milgram, Multiplicative semigroups of continuous functions. Duke Math. J. 16, 377–383
(1940)
36. J. Mrčun, P. Šemrl, Multiplicative bijections between algebras of differentiable functions. Ann.
Acad. Sci. Fenn. Math. 32, 471–480 (2007)
37. T. Shirota, A generalization of a theorem of I. Kaplansky. Osaka Math. J. 4, 121–132 (1952)
38. M.H. Stone, Applications of the theory of Boolean rings to general topology. Trans. Am. Math.
Soc. 41(3), 375–481 (1937)
39. R.C. Walker, The Stone-Čech Compactification (Springer, 1974)
40. N. Weaver, Isometries of noncompact Lipschitz spaces. Canad. Math. Bull. 38, 242–249 (1995)
The Bishop–Phelps–Bollobás Theorem:
An Overview

Sheldon Dantas, Domingo García, Manuel Maestre, and Óscar Roldán

Abstract In this survey, we provide an overview from 2008 to 2021 about the
Bishop–Phelps–Bollobás theorem.

Keywords Norm attaining operators · Bishop–Phelps theorem ·
Bishop–Phelps–Bollobás property

1 Motivation and Historical Background

Before starting, we take a brief moment to introduce the necessary notation.
Throughout the paper, we work with Banach spaces X over the
field K, which can be the set of real numbers, R, or the set of complex numbers,
C. All the results are valid in both cases unless explicitly stated otherwise. We
denote by SX and BX the unit sphere and the closed unit ball of the Banach
space X. We denote by X∗ the dual of the Banach space X. The symbol B(X, Y)
stands for the Banach space of all bounded linear operators from X into Y. When
Y = K or X = Y, we simply write X∗ for B(X, K) and B(X) for B(X, X).
More generally, we denote by B(X1, . . . , XN; Y) the Banach space of all N-linear
mappings from X1 × . . . × XN into Y endowed with the supremum norm. We
say that T attains its norm, or that it is norm-attaining, if there exists x0 ∈ SX such
that ‖T(x0)‖ = ‖T‖ = sup_{x∈SX} ‖T(x)‖. We denote by NA(X, Y) the set of all
norm-attaining operators from X into Y. The set of all norm-attaining functionals
on X will be denoted by NA(X). Let L be a locally compact Hausdorff topological

S. Dantas
Departament de Matemàtiques and Institut Universitari de Matemàtiques i Aplicacions de
Castelló (IMAC), Universitat Jaume I, Castelló, Spain
D. García · M. Maestre () · Ó. Roldán
Departamento de Análisis Matemático, Facultad de Ciencias Matemáticas, Universidad de
Valencia, Burjasot, Valencia, Spain

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
R. M. Aron et al. (eds.), Operator and Norm Inequalities and Related Topics,
Trends in Mathematics, https://doi.org/10.1007/978-3-031-02104-6_16
space. The space C0(L) is the space of all real or complex continuous functions
defined on L with limit zero at infinity. We denote by H∞(D) the algebra of all
bounded analytic functions on the open unit disc D in the complex plane C. For
a complex Banach space X, we denote by Au(BX; Y) (respectively, Ab(BX; Y))
the set of all Y-valued uniformly continuous (respectively, bounded continuous)
functions on BX that are holomorphic on the interior of BX. The symbol Aw∗u(BX∗)
stands for the unital algebra of all w∗-uniformly continuous functions from BX∗ into C
which are holomorphic on the interior of BX∗, endowed with the supremum norm ‖·‖∞.
Throughout Sect. 2.6, M will denote a pointed metric space, that is, a metric space
with a distinguished point 0 ∈ M and such that M\{0} ≠ ∅, and d will denote the
metric of M. If Y is a real Banach space, then Lip0(M, Y) stands for the space of all
Lipschitz mappings f : M → Y with f(0) = 0 equipped with the Lipschitz norm.
If Y = R, we may omit Y in the notation and simply write Lip0(M). Finally, F(M)
denotes the Lipschitz-free space associated to M (we refer the reader to the survey
[111] and references therein for a solid background in Lipschitz-free spaces). We
denote by P(^N X; Y) the Banach space of all N-homogeneous polynomials from X
into Y.
This survey is mainly motivated by two results due to three mathematicians:
the Americans Errett Albert Bishop (1928–1983) and Robert Ralph Phelps (1926–2013),
and the Hungarian Béla Bollobás (1943–). The story, however, begins
with Robert Clarke James (1918–2003), who provided one of the most famous
characterizations of reflexive spaces in Banach space theory, known nowadays
as the James theorem (see [117, 118]). In any first course in Functional
Analysis, one easily constructs a bounded linear functional which never
attains its norm, and this raises a very natural question: when does a
linear functional attain its norm? James, with outstanding techniques, proved that a
Banach space X is reflexive if and only if every bounded linear functional on X attains its
norm. It is not difficult to prove, using the Hahn-Banach theorem, that when X is
reflexive, every functional attains its norm. The deep part of the James theorem is
of course the converse.
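A standard instance of a functional that never attains its norm, included here only as an illustration (the choice of the space c_0 and of the weights 2^{-n} is ours), is the following.

```latex
% Illustration: a norm-one functional on c_0 that does not attain its norm.
\[
  x^*(x) = \sum_{n=1}^{\infty} \frac{x_n}{2^n}, \qquad x = (x_n)_n \in c_0 ,
  \qquad \|x^*\| = \sum_{n=1}^{\infty} 2^{-n} = 1 .
\]
% The norm is approached along x^{(N)} = (1,\dots,1,0,0,\dots), where x^*(x^{(N)}) = 1 - 2^{-N}.
% If \|x\|_\infty \le 1, then x_n \to 0 forces |x_m| < 1 for some m, hence
\[
  |x^*(x)| \le \sum_{n=1}^{\infty} \frac{|x_n|}{2^n} < \sum_{n=1}^{\infty} 2^{-n} = 1 .
\]
```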
Since James characterized reflexive Banach spaces as those where every functional
attains its norm, the concept of subreflexivity arose to denote the normed
spaces for which the set of norm-attaining functionals is dense in the dual. Although
there exist incomplete normed spaces which are not subreflexive (see [117]), in
1961, Bishop and Phelps showed that every Banach space is subreflexive; in other
words, they proved that in any Banach space X, given a functional x∗ ∈ X∗ and
an arbitrary positive number ε > 0, it is always possible to find a new functional
y∗ ∈ X∗ which attains its norm and satisfies ‖y∗ − x∗‖ < ε (see [41]). This means
that, for every Banach space X, the set NA(X) is norm dense in the dual space X∗.
Now, to do justice to the title of this survey, it remains to explain where Bollobás'
name comes in. Bollobás proved a strengthening of the Bishop–Phelps
result known nowadays as the Bishop–Phelps–Bollobás theorem [42].
Bollobás' original result states the following.
Theorem 1.1 ([42, Theorem 1]) Let X be a Banach space. Suppose that x ∈ SX
and x∗ ∈ SX∗ satisfy

|x∗(x) − 1| ≤ ε²/2,

where 0 < ε < 1/2. Then, there exist y ∈ SX and y∗ ∈ SX∗ such that

y∗(y) = 1, ‖y − x‖ < ε + ε², and ‖y∗ − x∗‖ ≤ ε.

First of all, let us notice that the Bishop–Phelps theorem is a particular
case of Bollobás' theorem. Indeed, Theorem 1.1 above contains much more
information than the denseness of the functionals which attain their norms: it gives
a simultaneous control of the involved points and functionals in a quantitative way,
and that is the great difference between these two theorems. At this point, it is
worth mentioning the recent paper [63] where the authors sought the best possible
constants in the Bollobás theorem (see, in particular, [63, Theorem 2.1]). We will
refer to the next result (extracted from [63, Corollary 2.4]) as the
Bishop–Phelps–Bollobás Theorem, which is the sharpest version of Theorem 1.1 (see Sect. 3
for more information).
Theorem 1.2 (Bishop–Phelps–Bollobás Theorem) Let X be a Banach space. Let
ε ∈ (0, 2) and suppose that x ∈ BX and x∗ ∈ BX∗ satisfy

Re x∗(x) > 1 − ε²/2.

Then, there exist y ∈ SX and y∗ ∈ SX∗ such that

y∗(y) = 1, ‖y − x‖ < ε, and ‖y∗ − x∗‖ < ε.
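As a sanity check, here is a sketch of how the density of norm-attaining functionals follows from Theorem 1.2. The normalisation to norm-one functionals is our own simplification.

```latex
% Sketch: Theorem 1.2 implies that NA(X) is norm dense in X^*.
% Let x^* \in S_{X^*} and 0 < \varepsilon < 2. Since
% \sup_{x \in S_X} \operatorname{Re} x^*(x) = \|x^*\| = 1, there is x \in S_X with
\[
  \operatorname{Re} x^*(x) > 1 - \frac{\varepsilon^2}{2}.
\]
% Theorem 1.2 then gives y \in S_X and y^* \in S_{X^*} with
\[
  y^*(y) = 1 = \|y^*\| \quad\text{and}\quad \|y^* - x^*\| < \varepsilon ,
\]
% so y^* attains its norm at y. A general nonzero functional is handled by scaling.
```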

Due to the generality of the Bishop–Phelps–Bollobás theorem (note that X is an
arbitrary Banach space in Theorems 1.1 and 1.2), it seems reasonable to wonder
whether an analogous result holds also for bounded linear operators. In fact, this
question was asked by Bishop and Phelps at the end of their paper [41]: is it true
that the set NA(X , Y) of all bounded linear operators which attain their norms is
norm dense in B(X , Y)?
Joram Lindenstrauss (1936–2012) was the first to give a negative answer
to this question in his seminal paper [137]. He exhibited an example of a Banach
space X such that the set NA(X, X) is not dense in B(X) (see [137, Proposition 5]).
This means that there
is no general version of the Bishop–Phelps theorem (and, consequently, no general
version of the Bishop–Phelps–Bollobás theorem) for bounded linear operators.
Nevertheless, Lindenstrauss did not stop there; he proved that, under some natural
conditions, one can still have the denseness of the operators which attain their
norms. For instance, this happens when X is reflexive (see [137, Theorem 1]) or
when Y has the property β of Lindenstrauss (see [137, Proposition 3]). In other
words, putting some extra conditions on the involved spaces, we can get versions of
the Bishop–Phelps theorem for operators.
As a careful and curious reader may imagine, after Lindenstrauss' paper, a vast
amount of research on the topic has been done during the past sixty years in several
directions. We name just a few of the authors who had a great impact on
extensions of the Bishop–Phelps theorem: J. Bourgain, R. E. Huff, J. Johnson,
W. Schachermayer, J. J. Uhl, J. Wolfe, and V. Zizler continued the study of the
set of operators which attain their norms ([43, 116, 120, 144, 148, 151]); M. D.
Acosta, R. M. Aron, F. J. Aguirre, Y. S. Choi, and R. Payá ([8, 35, 70]) considered
some problems in the same line involving bilinear mappings; the second and the
third authors of the present survey considered it for homogeneous polynomials
(see [21, 36]); and more recently several problems on norm-attainment of Lipschitz
mappings were also tackled (see for instance [54, 64, 111, 121]). We refer the
interested reader to the excellent survey [4] by María Dolores Acosta for
more about the norm-attaining theory (up to 2006).
In this survey, we will be interested in (generalizations of) the Bishop–Phelps–
Bollobás theorem. Let us notice that so far we have mentioned only possible
versions of the Bishop–Phelps theorem for operators and nothing related to the
Bollobás theorem for this class of functions was considered yet. The first time this
question was addressed and systematically studied was in 2008, when the authors
from [9] considered Theorem 1.2 for operators and provided several conditions
under which this theorem holds in this more general manner.
The next definition is the main one of the present work and it is exactly what we
will be discussing throughout the next sections.
Definition 1.3 (Bishop–Phelps–Bollobás Property) Let X and Y be Banach
spaces. We say that the pair (X, Y) has the Bishop–Phelps–Bollobás property for
operators (BPBp, for short) if given ε > 0, there exists η(ε) > 0 such that whenever
T ∈ B(X, Y) with ‖T‖ = 1 and x ∈ SX satisfy

‖T(x)‖ > 1 − η(ε), (1)

there exist S ∈ B(X, Y) with ‖S‖ = 1 and x0 ∈ SX such that

‖S(x0)‖ = 1, ‖x0 − x‖ < ε, and ‖S − T‖ < ε. (2)
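For scalar-valued operators (Y = K), Theorem 1.2 yields the property with an explicit η. The following is a sketch for 0 < ε < 2; the unimodular scalar θ and the functional z∗ are auxiliary symbols introduced here.

```latex
% Sketch: (X, \mathbb{K}) has the BPBp with \eta(\varepsilon) = \varepsilon^2/2 for 0 < \varepsilon < 2.
% Let x^* \in S_{X^*}, x \in S_X with |x^*(x)| > 1 - \varepsilon^2/2, and pick |\theta| = 1 with
\[
  \theta\, x^*(x) = |x^*(x)| , \qquad\text{so that}\qquad
  \operatorname{Re}\,(\theta x^*)(x) > 1 - \frac{\varepsilon^2}{2}.
\]
% Theorem 1.2 applied to \theta x^* and x gives y \in S_X and z^* \in S_{X^*} with
\[
  z^*(y) = 1, \qquad \|y - x\| < \varepsilon, \qquad \|z^* - \theta x^*\| < \varepsilon .
\]
% Then S := \theta^{-1} z^* and x_0 := y satisfy
\[
  |S(x_0)| = 1 = \|S\|, \qquad \|x_0 - x\| < \varepsilon, \qquad
  \|S - x^*\| = \|z^* - \theta x^*\| < \varepsilon .
\]
```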

Finally, it is important to mention that María Dolores Acosta has an excellent
survey on the recent progress on the BPBp (see [6]). Naturally, some of the
fundamental results of the present survey will overlap with some of the contents
of Acosta’s. Nevertheless, we will focus on different research lines with the hope
that the reader can get the whole picture of the theory by putting together both
surveys.
2 The Bishop–Phelps–Bollobás Property

2.1 For Operators

In this section, we will be interested in providing some results on the
Bishop–Phelps–Bollobás property for bounded linear operators. Chronologically, we should
(and we do) start with the first paper [9] on the property. For this, we invite the
reader to go back to Definition 1.3 once again and have in mind what we mean by
BPBp for operators.
Before starting, let us make an obvious but essential observation on the BPBp:
to get positive (or negative) results on this property, we strongly depend on the
geometry of the unit ball of the involved Banach spaces. The reader should
understand this literally as it is: every proof depends on the specific spaces that
we are working with.
For finite-dimensional spaces, we have a positive result. The proof of the
following theorem relies on the compactness of both unit balls BX and BY .
Theorem 2.1 ([9, Proposition 2.4]) Let X , Y be finite-dimensional Banach
spaces. Then, the pair (X , Y) has the BPBp for operators.
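A sketch of how compactness enters, arguing by contradiction. This is our own outline of a standard argument and is not claimed to be the proof given in [9].

```latex
% Sketch (not necessarily the argument of [9]). Suppose the BPBp fails for some \varepsilon > 0.
% Then for each n there are T_n \in S_{B(X,Y)} and x_n \in S_X with
\[
  \|T_n(x_n)\| > 1 - \tfrac1n ,
\]
% but no pair (S, x_0) as required by the BPBp for (T_n, x_n).
% B(X,Y) and X are finite-dimensional, so after passing to a subsequence
\[
  T_n \to T, \quad x_n \to x, \qquad \|T\| = 1, \quad \|x\| = 1, \quad
  \|T(x)\| = \lim_n \|T_n(x_n)\| = 1 .
\]
% For large n we have \|x - x_n\| < \varepsilon and \|T - T_n\| < \varepsilon,
% so (S, x_0) = (T, x) works for (T_n, x_n), a contradiction.
```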
Note that Theorem 2.1 requires both Banach spaces X and Y to be finite-dimensional.
If only Y is assumed to be finite-dimensional, it remains an open
question to this day whether even the Bishop–Phelps theorem holds. As for the
analogous result where only the first space is assumed to be finite-dimensional,
Theorem 2.1 does not hold in general. Indeed, there exists a sequence of 2-dimensional
polyhedral spaces such that, if we denote by Z their c0-sum, then
(ℓ_1^2, Z) fails to have the BPBp, where ℓ_1^2 denotes the 2-dimensional space (R², ‖·‖1).
This remarkable example can be found in [34, Example 4.1] (see also [34,
Lemma 3.2]). Let us highlight this result.
Example (From [34, Example 4.1]) There exists a Banach space Z such that (ℓ_1^2, Z)
fails the BPBp.
Note, however, that NA(ℓ_1^2, Y) = B(ℓ_1^2, Y) for every Banach space Y and the
set NA(X, Z) is dense in B(X, Z) for every Banach space X (see the proof of
[34, Theorem 4.2] which uses [21, Proposition 3]). This shows that the study of the
Bishop–Phelps–Bollobás property is not merely a trivial extension of the study of
the density of norm-attaining operators as one might think at first glance.
Let us recall some definitions which we will need in what follows.
Definition 2.2 A Banach space Y satisfies property β of Lindenstrauss if there exist
{yγ : γ ∈ } ⊆ SY , {yγ∗ : γ ∈ } ⊆ SY , and 0 ≤ ρ < 1 such that the following
hold:
(a) yγ∗ (yγ ) = 1 for every γ ∈ ,
(b) |yγ∗ (yβ )| ≤ ρ < 1 for every γ = β,
(c) y = supγ |yγ∗ (y)| for every y ∈ Y.
524 S. Dantas et al.

We notice that the Banach spaces c0 (Γ) and ℓ_∞(Γ) satisfy property β of
Lindenstrauss with ρ = 0, as can be seen by using their canonical biorthogonal systems.
W. Schachermayer introduced a dual version of the previous property that is satisfied
by spaces like ℓ_1 .
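Returning to property β, the claim that it holds with ρ = 0 can be checked numerically in the finite-dimensional model ℓ_∞^n, taking the canonical unit vectors and the coordinate functionals. The Python sketch below is only an illustration under that assumption; the function names are ours, and condition (c) is tested on a few sample vectors rather than proved.

```python
def sup_norm(y):
    """Norm of y in l_infty^n."""
    return max(abs(t) for t in y)


def unit_vector(n, k):
    """k-th canonical unit vector e_k of l_infty^n."""
    return [1.0 if i == k else 0.0 for i in range(n)]


def check_property_beta(n, samples):
    """Check (a)-(c) of property beta for y_k = e_k, y_k^* = k-th coordinate.

    Returns (a_holds, rho, c_holds), where rho is the smallest constant
    that works in (b), and (c) is only tested on the given sample vectors.
    """
    vectors = [unit_vector(n, k) for k in range(n)]
    # The k-th coordinate functional y -> y[k] has norm one on l_infty^n.
    functionals = [lambda y, k=k: y[k] for k in range(n)]

    a_holds = all(functionals[k](vectors[k]) == 1.0 for k in range(n))
    rho = max(
        (abs(functionals[k](vectors[j]))
         for k in range(n) for j in range(n) if k != j),
        default=0.0,
    )
    c_holds = all(
        sup_norm(y) == max(abs(f(y)) for f in functionals) for y in samples
    )
    return a_holds, rho, c_holds


print(check_property_beta(3, [[1.0, -2.0, 0.5], [0.0, 0.3, -0.3]]))
```

Here (c) holds by the very definition of the supremum norm, which is why the canonical system works with ρ = 0.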
Definition 2.3 ([145]) A Banach space Y satisfies property α of Schachermayer if
there exist A = {y_γ : γ ∈ Γ} ⊆ S_Y , A^* = {y_γ^* : γ ∈ Γ} ⊆ S_{Y^*} , and 0 ≤ ρ < 1
such that the following hold:
(a) y_γ^*(y_γ) = 1 for every γ ∈ Γ,
(b) |y_γ^*(y_β)| ≤ ρ < 1 for every γ ≠ β,
(c) B_Y is the closed absolutely convex hull of A.
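As a simple illustration of Definition 2.3, the finite-dimensional space ℓ_1^n satisfies property α with ρ = 0. The display below sketches the verification; the choice of A as the canonical basis and of A^* as the coordinate functionals is our illustration, not a construction taken from [145].

```latex
% Y = \ell_1^n, A = \{e_1, \dots, e_n\}, A^* = \{e_1^*, \dots, e_n^*\},
% where e_k^*(y) = y_k is the k-th coordinate functional (norm one in Y^* = \ell_\infty^n).
\[
e_k^*(e_k) = 1, \qquad e_k^*(e_j) = 0 \quad (k \ne j), \qquad \text{so } \rho = 0 .
\]
% Condition (c): every y in the unit ball is an absolutely convex combination of A,
\[
y = \sum_{k=1}^{n} y_k\, e_k , \qquad \sum_{k=1}^{n} |y_k| = \|y\|_1 \le 1 .
\]
```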
J. Lindenstrauss also introduced in [137] two properties to denote spaces for
which the density of norm-attaining operators was granted. Namely, a Banach space
X has property A of Lindenstrauss if NA(X , Y) is always dense in B(X , Y) for
all Banach spaces Y, and a Banach space Y has property B of Lindenstrauss
if NA(X , Y) is always dense in B(X , Y) for all Banach spaces X . It is worth
noting that property β of Lindenstrauss implies property B and property α of
Schachermayer implies property A (see [137, 145]). The concepts of universal
domain and range arose to represent those Banach spaces satisfying properties A
and B, respectively. This concept was extended to the BPBp as follows.
Definition 2.4 ([34, Definition 1.2]) Let X , Y be Banach spaces. We say that
(a) X is a universal BPB domain space if, for every Banach space Z, the pair
(X , Z) has the BPBp.
(b) Y is a universal BPB range space if, for every Banach space Z, the pair (Z, Y)
has the BPBp.
Now, we have the two following results.
Theorem 2.5 ([9, Theorem 2.2]) Suppose that Y has property β of Lindenstrauss.
Then, the pair (X , Y) has the BPBp for operators for every Banach space X . In
other words, Y is a universal BPB range space.
The next result is due to Sun Kwang Kim and Han Ju Lee (see also Theorem 2.39
below).
Theorem 2.6 ([125, Theorem 3.1]) Suppose that X is uniformly convex. Then,
(X , Y) has the BPBp for operators for every Banach space Y. In other words, X is
a universal BPB domain space.
Remark 2.7 It is worth mentioning that Theorems 2.1, 2.5, and 2.6 were all
generalized by considering bounded closed convex sets instead of the unit ball
BX (see [67, Theorems 3.1, 3.2 and Corollary 3.4], respectively). We strongly
recommend that the reader take a look at that interesting paper, due to Dong Hoon
Cho and Yun Sung Choi, which contains a more general approach.
Let us notice that the authors in [9] were not interested in calculating the optimal
constants in Theorem 2.5 (see the proof of [9, Theorem 2.2]). Nevertheless, by
using some ideas from [54] (and also from [61, 62]), Vladimir Kadets and Mariia
Soloviova studied in [123] estimates for these constants when the range space
satisfies property β of Lindenstrauss (see Theorem 3.4). As far as we know no
further research was done in this direction when X is taken to be uniformly convex.
Regarding sharpness of the constants in Bishop–Phelps–Bollobás like theorems, we
send the interested reader to Sect. 3 below.
We have already seen that reflexivity plays an important role when it comes to
operators which attain their norms. Indeed, by using the James theorem, it is not
difficult to construct a linear operator which never attains its norm when the domain
space is non-reflexive. On the other hand, Lindenstrauss proved that reflexive spaces
satisfy property A, that is, if X is reflexive, then NA(X , Y) is dense in B(X , Y)
for every Banach space Y. Therefore, it is natural to wonder whether the same
happens with the BPBp. However, this is not the case, as shown in the example after
Theorem 2.1. Actually, there exists a reflexive space X such that the pair (X , X )
does not have the BPBp. For this, take any Y which is reflexive and strictly convex
but not uniformly convex and consider the reflexive space X = ℓ_1^2 ⊕_1 Y. This space
does the job (we refer the interested reader to [34, Theorem 2.1,
Corollary 3.3, and Example 3.4]).
Now, going back to the non-reflexive setting, the authors in [9] gave a complete
characterization of the Banach spaces Y such that the pair (ℓ_1 , Y) has the
Bishop–Phelps–Bollobás property. They defined the approximate hyperplane series property
(AHSP, for short) and proved that (ℓ_1 , Y) has the BPBp for operators if and only if Y
has the AHSP (see [9, Theorem 4.1]). This property is very technical and we will not
treat it here. We refer the reader interested in recent progress on the AHSP (and
its variants) to the already mentioned survey [6] (and the references therein),
where M. D. Acosta presents interesting facts about it in Section 3 (the interested
reader may also check, for instance, the papers [10, 22, 60, 78, 108, 114] and their
references for more information). For the sake of completeness, in the next theorem
we exhibit some classical Banach spaces Y for which the pair (ℓ_1 , Y) satisfies the
BPBp.
Theorem 2.8 The pair (ℓ_1 , Y) has the BPBp for operators when
(a) Y is finite-dimensional.
(b) Y is uniformly convex.
(c) Y = C0 (L).
(d) Y = L1 (μ), where μ is any measure.
(e) Y = A(D).
(f) Y = H^∞ (D).
(g) Y has the property β of Lindenstrauss.
(h) Y = L1 (μ, X ), where μ is a σ -finite measure and X is as in (a)-(g).
(i) Y = C0 (L, X ), where L is a locally compact Hausdorff space and X is as in
(a)-(g).
(j) Y = Au (BX ), the algebra of all uniformly continuous and holomorphic
mappings on the open unit ball of a complex Banach space X .
Notice that Theorem 2.8.(g) is a particular case of Theorem 2.5 (see also the more
general result [80, Proposition 2.10]). For the proofs of Theorem 2.8 we send the
reader to [80, Section 2]. Item (i) from the previous theorem can be found in [114,
Corollary 2.10], and item (j) can be found in [75, Corollary 2.17].
Note that the previous result shows once more a difference between the classical
norm-attaining theory and the BPBp: we know that if X has the Radon-Nikodým
property, then NA(X , Y) is dense in B(X , Y) for all Banach spaces Y
(see [43]), but there are Banach spaces Y such that (ℓ_1 , Y) fails the BPBp, even
though ℓ_1 satisfies the Radon-Nikodým property (see also [34, Section 3]). In fact,
the authors of this survey are not aware of works studying when (X , Y) has the BPBp
if X has the Radon-Nikodým property, besides some particular cases. We do not know,
for example, for which spaces Y the pair (J, Y) satisfies the BPBp, where J
is James' space (see Question 6, and see [103, 138] for background on James'
space).
Still on the AHSP, we have the following result which gives several examples on
when the pair (L1 (μ), Y) satisfies the BPBp for operators: (L1 (μ), Y) has the BPBp
if Y has the Radon-Nikodým property and the AHSP. We again send the reader to
[6, pgs. 16-19] for a more complete discussion on the recent progress on the AHSP.
Theorem 2.9 ([77, Theorem 2.2]) Suppose that Y has the Radon-Nikodým prop-
erty. Let μ be a σ -finite measure. Then, the pair (L1 (μ), Y) has the BPBp for
operators if and only if Y has the AHSP.
On the other hand, Richard Aron, Yun Sung Choi, and the second and third
authors of this survey proved that (L1 (μ), L∞ [0, 1]) has the BPBp for operators
(see [33, Theorems 2.3 and 2.4]). This result was extended by Yun Sung Choi, Sun
Kwang Kim, Han Ju Lee, and Miguel Martín (see [79, Theorem 4.1]).
Theorem 2.10 ([79, Theorem 4.1] (see also [33, Theorems 2.3 and 2.4])) Let
μ be an arbitrary measure and let ν be a localizable measure. Then, the pair
(L1 (μ), L∞ (ν)) has the BPBp for operators. In particular, (L1 (μ), L∞ [0, 1]) has
the BPBp for operators.
There is one more result that Theorem 2.9 above does not cover. Indeed, when
the range space is an L1 -space, we have the following general result.
Theorem 2.11 ([79, Theorem 3.1]) Let μ, ν be arbitrary measures. Then, the pair
(L1 (μ), L1 (ν)) has the BPBp for operators.
Regarding Lp -spaces, we borrow [79, Corollary 1.3] to summarize all the results
on the pairs (Lp , Lq ) that satisfy the BPBp for operators including Theorems 2.10
and 2.11. Let us notice that (b) below is an immediate consequence of Theorem 2.6.
Item (c) follows from [11, Theorem 2.5] (see Theorem 2.13).
Theorem 2.12 ([79, Corollary 1.3]) Let μ, ν be any measures. Then, the pair
(Lp (μ), Lq (ν)) has the BPBp for operators when
(a) p = 1 and 1 ≤ q < ∞.
(b) 1 < p < ∞ and 1 ≤ q ≤ ∞.
(c) p = ∞ and q = ∞ (in the real case).
(d) p = 1 and q = ∞ whenever ν is a localizable measure.
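The case list of Theorem 2.12 can be encoded as a small lookup, which may help when checking whether a given pair of exponents is covered. The Python sketch below is only a bookkeeping aid: the function name and its parameters are hypothetical, and a return value of False means "not covered by Theorem 2.12", not that the BPBp fails.

```python
import math


def covered_by_theorem_2_12(p, q, real_case=False, nu_localizable=False):
    """Return True if (L_p(mu), L_q(nu)) falls under items (a)-(d) of Theorem 2.12.

    p, q are exponents in [1, inf]; use math.inf for the L_infty case.
    """
    if p < 1 or q < 1:
        raise ValueError("exponents must satisfy 1 <= p, q <= inf")
    if p == 1 and q < math.inf:          # item (a)
        return True
    if 1 < p < math.inf:                 # item (b), any 1 <= q <= inf
        return True
    if p == math.inf and q == math.inf:  # item (c), real case only
        return real_case
    if p == 1 and q == math.inf:         # item (d), nu localizable
        return nu_localizable
    return False                         # e.g. p = inf and q < inf


print(covered_by_theorem_2_12(2, math.inf))
print(covered_by_theorem_2_12(math.inf, 2))
```

The second call returns False because Theorem 2.12 says nothing about pairs with p = ∞ and q < ∞.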
Moving towards another function space, we have the following positive result
for operators from C(K) into C(S) in the real case. As far as we know the complex
case of this result is still an open question (see Question 4). In fact, the analogous
question for the density of norm-attaining operators from C(K) into C(S) is still
open. Indeed, it seems to be a difficult task to adapt a proof from the real to the
complex case when one is working with operators defined on C(K)-spaces.
Theorem 2.13 ([11, Theorem 2.5]) Let K, S be compact Hausdorff topological
spaces. Then, the pair (C(K), C(S)) has the BPBp for operators (in the real case).
In the same direction, we have the following result due to Kim and Lee.
Theorem 2.14 ([126, Corollary 3.8]) Let S be a locally compact metrizable space
and L a locally compact Hausdorff space. Then, the pair (C0 (S), C0 (L)) has the
BPBp for operators (in the real case).
A lot of progress has been made in pairs where the second space is uniformly
convex. Let us highlight some of these results. It was shown in the original paper [9,
Theorem 5.2] that when X = ℓ_∞^n , we get a positive result.
Theorem 2.15 ([9, Theorem 5.2]) (ℓ_∞^n , Y) has the BPBp for operators for all n
when Y is uniformly convex.
Also, when one considers c0 as the domain space and Y a uniformly convex
Banach space in the range space, we always get a positive result. The next result is
due to Sun Kwang Kim.
Theorem 2.16 ([124, Corollary 2.6]) Let Y be a uniformly convex Banach space.
Then, the pair (c0 , Y) has the BPBp for operators.
For a more general result, which covers Theorem 2.5 (when X = c0 ) and
Theorem 2.16, we refer the reader to [20], where the authors exhibit a new
class of Banach spaces Y such that the pair (c0 , Y) satisfies the BPBp for operators,
a class which contains all the Banach spaces with property β of Lindenstrauss and
the uniformly convex Banach spaces (see also [6, page 23] for more details).
Kim, Lee and Lin showed in [129] the following when the domain is L∞ (μ) (μ
positive) or c0 (Γ) (Γ an index set) and the range is uniformly convex or C-uniformly
convex.
Theorem 2.17 ([129]) Let X = L∞ (μ) with μ a positive measure or c0 (Γ) where
Γ is an index set. Then:
(a) If Y is uniformly convex, the pair (X , Y) has the BPBp for operators.
(b) If Y is C-uniformly convex, the pair (X , Y) has the BPBp for operators in the
complex case.
In the real case, Kim and Lee proved that the pair (C(K), Y) satisfies the BPBp
for operators for any compact Hausdorff K whenever Y is uniformly convex [127].
Theorem 2.18 ([127, Theorem 2.2]) If Y is uniformly convex, then the pair
(C(K), Y) has the BPBp for operators in the real case.
In the complex case we should highlight the following nice result due to M. D.
Acosta from 2016 (see [5]), which generalizes Theorem 2.17.(b).
Theorem 2.19 ([5, Theorem 2.4]) The pair (C0 (L), Y) has the BPBp for operators
in the complex case whenever L is a locally compact Hausdorff space and Y is any
C-uniformly convex complex space. In particular, the pairs
(a) (C0 (L), Lp (ν)),
(b) (L∞ (μ), Lp (ν))
satisfy the BPBp for operators in the complex case for any positive measure μ and
1 ≤ p < ∞ and for every measure ν.
Let us note that Theorem 2.19.(b) was not covered by Theorem 2.12 above. It
is worth mentioning that it seems to be an open problem whether or not the pair
(c0 , ℓ_1 ) satisfies the Bishop–Phelps–Bollobás property for operators in the real case
(see Question 5).
The pair (c0 , ℓ_1 ) has the BPBp for operators if and only if the pairs (ℓ_∞^n , ℓ_1 )
satisfy it uniformly in n ∈ N (see [34, Theorem 2.1]). This
fact motivated M. D. Acosta and José L. Dávila to characterize the Banach spaces Y
such that the pair (ℓ_∞^n , Y) satisfies the BPBp for operators for a fixed n ∈ N
(see [17, 18]; see also [14]). In order to do so, they considered a geometric
property: the approximate hyperplane sum property for ℓ_∞^n . A complete treatment
of this property is given in Acosta's survey [6, pages 22 and 23], and we strongly
encourage the reader to check it and the references therein. Here, we highlight the
consequences of this property and exhibit specific Banach spaces Y such that
the pair (ℓ_∞^n , Y) satisfies the BPBp for operators. We refer the reader to [6,
Proposition 4.9] and the paragraph just after it. Let us notice that some of the
items in Theorem 2.20 follow immediately from results already mentioned in this
survey; nevertheless, we include them for the sake of completeness.
Theorem 2.20 ([17, Theorem 3.3]) Let n ∈ N be fixed with n ≥ 2. The pair
(ℓ_∞^n , Y) has the BPBp for operators when
(a) Y is finite-dimensional.
(b) Y is uniformly convex.
(c) Y has property β of Lindenstrauss.
(d) Y ⊆ C(K) is a uniform algebra.
(e) Y = L1 (μ) for every positive measure μ.
(f) Y = C0 (L, Z), where Z is one of the spaces in (a)-(e).
Note that from the previous result, the pairs (ℓ_∞^n , ℓ_1 ) satisfy the BPBp for
operators for all n, as desired. However, it is not known whether they satisfy it
uniformly, since the mappings η obtained in the proof depend on n, so the question of
whether or not (c0 , ℓ_1 ) has the BPBp for operators remains open.
We have been discussing some results and questions where the domain space
was c0 or ℓ_∞^n . Let us remark that, actually, some results are known when the domain
space is Asplund. Richard M. Aron, Bernardo Cascales and Olena Kozhushkina
showed in [32] the following result.
Theorem 2.21 ([32, Corollaries 2.6 and 2.7]) (X , Y) has the BPBp for operators
in the following cases:
(a) X is Asplund and Y = C0 (L) for any locally compact Hausdorff L.
(b) X is any Banach space and Y = C0 (L), where L is a scattered locally compact
Hausdorff space.
Soon after, Bernardo Cascales, Antonio J. Guirao and Vladimir Kadets extended
Theorem 2.21.(a) to the case where the range space is a uniform algebra (see [55]).
Theorem 2.22 ([55, Theorem 3.6]) (X , Y) has the BPBp for operators if X is
Asplund and Y ⊂ C(K) is a uniform algebra.
For positive results on the BPBp for operators when the range is an operator
space, such as K(X , C(K)) or W(X , C(K)), we send the reader to [13, Theorem
3.1 and Corollary 3.2].

2.2 For Some Classes of Operators

In this section we consider the BPBp when restricted to some particular classes of
operators. We may define in a natural way when a pair of Banach spaces (X , Y)
satisfies the BPBp for some class. For instance, we say that (X , Y) has the BPBp
for compact operators when, starting with a compact operator T in Definition 1.3
satisfying (1), one ends up with another compact operator S satisfying conditions (2).
Theorem 2.23 ([79, Corollary 5.3]) Let K be a compact Hausdorff space and
μ be a finite measure. Consider the real Banach spaces L1 (μ) and C(K).
(a) The pair (L1 (μ), C(K)) has the BPBp for Bochner representable operators.
(b) The pair (L1 (μ), C(K)) has the BPBp for weakly compact operators.
In fact, we have a more complete scenario when it comes to finite-rank, compact,
weakly compact, and Radon-Nikodým operators when the domain is an L1 -space
due to María Dolores Acosta, Julio Becerra Guerrero, Domingo García, Sun Kwang
Kim, and Manuel Maestre.
Theorem 2.24 ([13, Proposition 2.2, Theorem 2.3, and Corollary 2.4]) Let μ be
a finite measure such that L1 (μ) is infinite-dimensional. The pair (L1 (μ), Y) has
the BPBp for
(1) finite-rank operators,
(2) compact operators,
(3) weakly compact operators,
(4) Radon-Nikodým operators
when any of the following holds:
(a) Y is finite-dimensional.
(b) Y is uniformly convex.
(c) Y = C(K), where K is a compact Hausdorff topological space.
(d) Y = L1 (μ), where μ is a positive measure.
(e) Y has the property β of Lindenstrauss.
(f) Y = L1 (μ, X ), where μ is a σ -finite measure and X
(f.1) is finite-dimensional.
(f.2) is uniformly convex.
(f.3) is lush and separable.
(f.4) is an almost-CL-space.
(f.5) has property β of Lindenstrauss.
Actually, item (f) is obtained in [80, Theorem 2.11 and Corollary 2.12].

As mentioned before, in 2011, Richard Aron, Bernardo Cascales,
and Olena Kozhushkina proved that the pair (X , C0 (L)) satisfies the BPBp for
Asplund operators for every Banach space X and every locally compact Hausdorff
space L (see [32, Theorem 2.4]; a sharp version of this result can be found in [59,
Theorem 5.5]). This was extended two years later by Cascales, Antonio José Guirao,
and Vladimir Kadets to uniform algebras (see [55, Theorem 3.6]; we also refer
the reader to [13, Section 3], where the authors extend this last result to some
C(K, Y)). Nevertheless, the contributions of Cascales and his collaborators did not
stop there. In 2018, Cascales, together with Guirao, Kadets, and Mariia Soloviova,
introduced a new Banach space property, called ACKρ -structure (see [56, Definition
3.1]), which gives Bishop–Phelps–Bollobás type results for a wider class of Banach
spaces and also a wider class of operators. From our point of view, that paper
contains many striking results which, as a consequence, provide a long collection of
pairs of Banach spaces (X , Y) satisfying the BPBp for Asplund operators through a
class of operators called Γ-flat operators (see [56, Definition 2.8]). We highlight
only some of them, although we strongly recommend [56]. We also refer the reader to
[55, Theorem 3.6 and Remarks R1, R2, R3 on page 380].
Theorem 2.25 ([56, Theorem 3.4, Corollary 4.6, Theorem 4.9]) Let X be an
arbitrary Banach space. The pair (X , Y) has the BPBp for
(1) Asplund operators,
(2) finite rank operators,
(3) compact operators,
(4) p-summing operators, and
(5) weakly compact operators
whenever one of the following holds
(a) Y ⊆ C(K) is a uniform algebra,
(b) Y has property β of Lindenstrauss.
A different approach to proving that (X , Y) has the BPBp for compact
operators when Y is a uniform algebra can be found in [126, Theorem 3.9], where
the authors use retractions as a tool for their proof.

Before going on, let us give some attention to an important tool for complex spaces
used in the proofs of previous results: a Urysohn type lemma for uniform algebras
proved by Cascales, Guirao, and Kadets (see [55]).
Lemma 2.26 ([55, Lemma 2.7]) Let A ⊆ C(K) be a unital uniform algebra and
Γ_0 its Choquet boundary. Then, for any open subset U of K with U ∩ Γ_0 ≠ ∅ and
for 0 < ε < 1, there exist f ∈ A and t0 ∈ U ∩ Γ_0 satisfying
(a) f (t0 ) = ‖f‖_∞ = 1.
(b) |f (t)| < ε for every t ∈ K \ U .
(c) |f (t)| + (1 − ε)|1 − f (t)| ≤ 1 for every t ∈ K.
We send the interested reader to [128] and [38] for related (and inspired by [55,
Lemma 2.7]) Urysohn type lemmas for holomorphic functions. This lemma is also
used in [75] to obtain results concerning the numerical index, the Daugavet equation,
lushness and the AHSP on uniform algebras and also in [86] to give a version
of Bishop–Phelps–Bollobás theorem for the unital uniform algebra Aw∗ u (BX ) for
some complex Banach space X .

Focusing on compact operators, the first three authors of this survey, together with
Miguel Martín, studied the BPBp for this class of operators in a systematic way [88].
Although this survey is focused on the Bollobás theorem, it is worth mentioning that
Martín gave a negative answer to an old open question on whether every compact
operator can be approximated by norm-attaining operators [139] opening the gate
for further research on this topic (see also [140] and the references therein).
Bearing in mind our Sect. 2.1, we already have a long list of pairs of Banach
spaces (X , Y) that satisfy the BPBp for compact operators. Indeed, an analysis of
the proofs of those results shows that, when one starts with a compact operator, the
new operator constructed there, which satisfies Bollobás' conditions (2), is trivially
compact. We borrow [88, Examples 1.5] and the results we have mentioned already,
and list them in the following theorem.
Theorem 2.27 ([88, Examples 1.5]) The pair (X , Y) has the BPBp for compact
operators when
(a) X is arbitrary and Y has property β of Lindenstrauss.
(b) X is uniformly convex and Y is arbitrary.
(c) X is arbitrary and Y ⊆ C(K) is a uniform algebra.
(d) X = L1 (μ) and Y = L1 (ν) for μ, ν arbitrary.
(e) X = L1 (μ) and Y = L∞ (ν) for any measure μ and any localizable measure
ν.
(f) X arbitrary and Y an isometric predual of an L1 (μ)-space.
(g) X = L1 (μ) and Y as in Theorem 2.8 for every measure μ.
Item (f) of Theorem 2.27 above was explicitly proven in [11, Theorem 4.2].
We also send the reader to check the proof of [11, Theorem 3.3] from the same
paper where the authors prove that the pair (C0 (L), Y) has the BPBp for compact
operators whenever Y is uniformly convex.
At this point, and taking a look at Theorem 2.27 above, a natural question
is how the BPBp and the BPBp for compact operators are related.
It turns out that the BPBp for compact operators does not imply the BPBp.
Indeed, (L1 [0, 1], C[0, 1]) satisfies the BPBp for compact operators (check item
(c) of Theorem 2.27 for instance) but the set NA(L1 [0, 1], C[0, 1]) is not dense in
B(L1 [0, 1], C[0, 1]) as proved by Schachermayer in 1983 (see [144]). On the other
hand, the other implication seems to be still an open question (see Question 7).
By using technical tools (based on some results due to J. Johnson and J. Wolfe)
the authors in [88] provide several results that allow passing the BPBp for compact
operators from sequence spaces to function spaces (see [88, Lemma 2.1, Proposition
2.2, Corollaries 2.3 and 2.4, and Proposition 2.5]) as well as passing the BPBp to the
BPBp for compact operators. These yield more examples of pairs of Banach spaces
(X , Y) satisfying the BPBp for compact operators.
Theorem 2.28 ([88, Corollary 3.3]) Let Y be a Banach space. If (c0 , Y) has the
BPBp, then (c0 , Y) has the BPBp for compact operators. In particular, (c0 , Y) has
the BPBp for compact operators when
(a) Y has property β of Lindenstrauss.
(b) Y is uniformly convex.
In fact, we have the following result.
Theorem 2.29 ([88, Corollary 3.5]) The pair (C0 (L), Y) has the BPBp for com-
pact operators when
(a) Y has property β of Lindenstrauss.
(b) Y is uniformly convex.
Although we have given only two specific examples for Theorems 2.28 and 2.29,
there are Banach spaces Y (even 2-dimensional) which are neither uniformly convex
nor satisfy property β of Lindenstrauss such that (C0 (L), Y) satisfies the BPBp for
compact operators (see [20] for a more general approach).
In the same direction, we have the following result.
Theorem 2.30 ([88, Corollaries 3.7 and 3.8]) Let X be a Banach space such that
its dual is isometrically isomorphic to ℓ_1 . If Y is uniformly convex, then (X , Y) has
the BPBp for compact operators.
In fact, when the domain is ℓ_1 , we provided the following characterization, which
gives an affirmative answer to Question 7 in this case.
Theorem 2.31 ([88, Corollary 3.11]) Let Y be a Banach space. The following
statements are equivalent.
(a) (ℓ_1 , Y) has the BPBp for compact operators.
(b) (ℓ_1 , Y) has the BPBp.
(c) (L1 (μ), Y) has the BPBp for compact operators for every positive measure μ.
When it comes to strongly measurable function spaces, we have the following.
Theorem 2.32 ([88, Corollary 3.13]) Let μ be a positive measure and let X , Y
be Banach spaces. The pair (L1 (μ, X ), Y) has the BPBp for compact operators
when
(a) X , Y are finite-dimensional.
(b) X has the Radon-Nikodým property and Y is Hilbert such that the pair (X , Y)
has the BPBp for compact operators.
Item (b) of Theorem 2.32 above is a consequence of the proof of [131,
Proposition 9]. When coming to range spaces, we have the following positive
results.
Theorem 2.33 ([88, Theorem 3.15]) Let X , Y be Banach spaces. If (X , Y) has
the BPBp for compact operators, then so do
(a) (X , L∞ (μ, Y)) for every σ -finite positive measure μ.
(b) (X , C(K, Y)).
Notice that we have plenty of examples of pairs (X , Y) satisfying the BPBp
for compact operators from Theorem 2.32 and Theorem 2.33 by applying Theo-
rem 2.27.

Recently, M. D. Acosta and M. Soleimani-Mourchehkhorti initiated the study
of the Bishop–Phelps–Bollobás property for positive operators between Banach
lattices (see [26, 28–30]). The reader may find the necessary definitions and
background for the concepts involved in the aforementioned papers. We
summarize next some of their results.
Theorem 2.34 ([26, 28–30]) The following pairs (X , Y) have the BPBp for posi-
tive operators.
• X = c0 or X = L∞ (μ) and Y = L1 (ν), with μ and ν positive measures ([26,
Theorems 1.6 and 1.7]).
• X = c0 or X = L∞ (μ) and Y is a uniformly monotone Banach lattice ([28,
Theorems 2.5 and 3.2]; see also [28, Corollary 4.4]).
• (C0 (L), Y) if Y is a uniformly monotone Banach function space, if Y is a
uniformly monotone Banach lattice with a weak unit, or if X = C0 (L) is
separable and Y is a uniformly monotone Banach lattice ([30, Theorem 2.8 and
Corollaries 2.11 and 2.12]; see also [30, Proposition 2.13 and Corollary 2.14]
for a partial converse).
• X and Y are finite-dimensional Banach lattices ([29, Corollary 2.12]).
Theorem 2.35 ([29]) X has the BPBp for positive functionals when:
• X is uniformly monotone for orthogonal elements (this actually characterizes
having a stronger property than the BPBp for positive functionals) ([29, Theorem
2.9]; see also [29, Remark 2.8]).
• X is a finite-dimensional Banach lattice ([29, Corollary 2.13]).
• X is strongly monotone and has the hereditary norm-attaining property ([29,
Theorem 2.16]). In particular, X = C(K) (K compact Hausdorff), X = M(K)
(K compact Hausdorff) and X = Lp (μ) (1 ≤ p < ∞, μ positive measure) have
this property ([29, Corollary 3.2]).
They also provide some examples of Banach lattices that do not satisfy the BPBp
for positive functionals (see [29, Section 3]).

In Hilbert spaces H, Bishop–Phelps–Bollobás type properties have been studied for
several classes of operators. In [104, Theorem 4.1], a BPBp result on
(H, H) was obtained for the Schatten-von Neumann class. In [71], a systematic study of
BPBp-like properties was carried out in complex Hilbert spaces: the BPBpp (see Sect. 4
for the treatment of this property) and its natural adaptation to the numerical radius,
the BPBpp-ν (see Sect. 2.7 for a numerical radius version of the BPBp). The authors
proved, in particular, that complex Hilbert spaces have these two properties for many
classes of operators (see [71, Theorems 3.1 and 4.1 and Propositions 3.2, 4.2 and 4.3]).
We highlight these results as follows.
Theorem 2.36 ([71]) Let H be a complex Hilbert space. Then H has the BPBpp
and the BPBpp-ν for the following classes of operators: operators, self-adjoint
operators, compact self-adjoint operators, anti-symmetric operators, unitary oper-
ators, normal operators, compact normal operators, compact operators, Schatten-
von Neumann operators, positive operators, positive Schatten-von Neumann opera-
tors, self-adjoint Schatten-von Neumann operators, normal Schatten-von Neumann
operators and compact positive operators.
We send the reader to [44, Theorem 4.2.12 and Corollary 4.2.13] for new results
related to the theorem above.

2.3 For Multilinear Mappings

In this section we present some important results on the
Bishop–Phelps–Bollobás property for multilinear mappings obtained in the past
few years, focusing on what has come after [4]. Throughout this section, we write
X1 , . . . , XN , Y for arbitrary Banach spaces.
To start with, we need to adapt Definition 1.3 for this new context.
Definition 2.37 We say that (X_1 , . . . , X_N ; Y) has the Bishop–Phelps–Bollobás
property for multilinear mappings (BPBp for multilinear mappings, for short) if
given ε > 0, there exists η(ε) > 0 such that whenever A ∈ B(X_1 , . . . , X_N ; Y)
(see footnote 1) with ‖A‖ = 1 and (x_1 , . . . , x_N ) ∈ S_{X_1} × . . . × S_{X_N} satisfy

\[ \|A(x_1, \ldots, x_N)\| > 1 - \eta(\varepsilon), \]

there are B ∈ B(X_1 , . . . , X_N ; Y) with ‖B‖ = 1 and
(x_1^0 , . . . , x_N^0 ) ∈ S_{X_1} × . . . × S_{X_N} such that

\[ \|B(x_1^0, \ldots, x_N^0)\| = 1, \quad \max_{1 \le j \le N} \|x_j^0 - x_j\| < \varepsilon \quad \text{and} \quad \|B - A\| < \varepsilon. \tag{3} \]
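As a sanity check on Definition 2.37, taking N = 1 should give back the operator version of the property. The display below sketches that specialization; it assumes that Definition 1.3 (not reproduced in this part of the survey) has the usual form, and the letters T, S, x, x_0 are our choice of notation for the roles played by A, B, x_1 and x_1^0.

```latex
% Definition 2.37 with N = 1: a pair (X, Y), operators in place of multilinear mappings.
\[
T \in B(X, Y),\quad \|T\| = 1,\quad x \in S_X,\quad \|Tx\| > 1 - \eta(\varepsilon)
\]
\[
\Longrightarrow\quad
\exists\, S \in B(X, Y),\ x_0 \in S_X :\qquad
\|S\| = \|S x_0\| = 1,\qquad \|x_0 - x\| < \varepsilon,\qquad \|S - T\| < \varepsilon .
\]
```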

First of all, we need to justify why the study of such a property for multilinear
mappings is relevant. In 2009, Yun Sung Choi and Hyun Gwi Song proved that
the Bishop–Phelps–Bollobás theorem does not hold for bilinear forms on ℓ_1 × ℓ_1 .
Actually, they showed that, for the bilinear form T (e_i , e_j ) := 1 − δ_ij , no
bilinear form S on ℓ_1 × ℓ_1 satisfying ‖S − T‖ < 1 attains its norm close to suitable
points at which T almost attains its norm (see [81, Theorem 2]). We highlight it below.
Theorem 2.38 ([81, Theorem 2]) The triple (ℓ_1 , ℓ_1 ; K) fails the BPBp for bilinear
forms.
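To get a feeling for the bilinear form given by T(e_i, e_j) = 1 − δ_ij, one can evaluate it on finitely supported vectors. The Python sketch below is only an illustration: the helper names T and uniform_vector are ours, and the choice of averaged unit vectors as test points is our assumption (see [81] for the actual argument). It uses exact rational arithmetic.

```python
from fractions import Fraction


def T(x, y):
    """Bilinear form with T(e_i, e_j) = 1 - delta_ij on finitely supported vectors."""
    return sum(
        xi * yj
        for i, xi in enumerate(x)
        for j, yj in enumerate(y)
        if i != j
    )


def uniform_vector(m):
    """Norm-one vector of l_1 with m coordinates equal to 1/m (hypothetical helper)."""
    return [Fraction(1, m)] * m


for m in (2, 10, 100):
    u = uniform_vector(m)
    # T(u, u) = m(m - 1)/m^2 = 1 - 1/m
    print(m, T(u, u))
```

Since |T(x, y)| ≤ ‖x‖_1 ‖y‖_1, the form has norm 1, and it attains this norm, for instance at (e_1, e_2). The pairs (u_m, u_m), where T takes the value 1 − 1/m, are among the points at which condition (3) of Definition 2.37 must hold for the BPBp to be satisfied.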
Nevertheless, Theorem 2.39 below gives a positive result. In particular, if X is
uniformly convex, then (X , Y) has the BPBp for operators for every Banach space
Y as in Theorem 2.6.
Theorem 2.39 ([15, Theorem 2.2]) Let X1 , . . . , XN be uniformly convex Banach
spaces. Then, (X1 , . . . , XN ; Y) has the BPBp for multilinear mappings for every
Banach space Y.
The authors of [15] also gave a complete characterization for the triple
(ℓ_1 , Y; K) to have the Bishop–Phelps–Bollobás property for bilinear forms by using the
AHSP for a pair (X , X ) which we will not treat here. We send the interested reader
to [6, Definition 5.7] and the references therein. We give the most relevant (and
specific) consequences of it in the next theorem.
Theorem 2.40 ([15, Theorem 3.6]) The triple (ℓ_1 , Y; K) has the BPBp for
bilinear forms when
(a) Y is uniformly smooth.
(b) Y is finite-dimensional.
(c) Y = C0 (L) and, in particular, when Y = C(K) or Y = c0 .
(d) Y = K(H), where H is a Hilbert space.

1 We send the reader back to the first paragraph of the Introduction to check our notation.
The reader can find the proofs of Theorem 2.40.(a)–(d) in [15, Propositions 4.1,
4.2, 4.4 and 4.7], respectively.
Following the line of Theorem 2.38 above, we have the following negative result.
Theorem 2.41 ([15, Proposition 4.8]) The triple (ℓ_1 , L1 (μ); K) fails the BPBp for
bilinear forms whenever L1 (μ) is infinite-dimensional.
It is worth mentioning that Yun Sung Choi showed in 1997 that the set
of all norm-attaining bilinear forms on L1 [0, 1] × L1 [0, 1] is not dense in
B(L1 [0, 1], L1 [0, 1]; K) (see [70, Theorem 3]). This means, in particular, that
this triple cannot satisfy the BPBp for bilinear forms.
Theorem 2.42 The triple (L1 [0, 1], L1 [0, 1]; K) fails the BPBp for bilinear forms.
On the other hand, the same authors of [15] together with Choi, Kim, and
Lee provided a characterization for the triple (L1 (μ), Y; K) to have the BPBp
for bilinear forms in [12]. Let us notice that we have an additional assumption in
[12, Theorem 2.6]: the Banach space Y is assumed to be Asplund. The tools and
techniques used to prove the following result are also based on the AHSP.
Theorem 2.43 ([12, Theorem 2.6 and Corollary 2.7]) Let μ be a σ -finite measure
such that L1 (μ) is infinite-dimensional. Then, (L1 (μ), Y; K) has the BPBp for
bilinear forms when
(a) Y is uniformly smooth.
(b) Y is finite-dimensional.
(c) Y = c0 .
Remark 2.44 In the vein of Theorems 2.40 and 2.43, it is worth mentioning that
there are positive results on Banach function spaces over a measure space. Indeed,
Lucía Agud, José M. Calabuig, Sebastián Lajara, and Enrique A. Sánchez Pérez
gave some applications to the BPBp (for operators and bilinear forms) from their
results on Gâteaux and Fréchet smoothness, and the uniform smoothness of Lp (m),
where m : Σ → X is a vector measure (we invite the reader to visit Section 4 of
[31] and check the necessary background in Sections 1, 2 and 3 of that paper).
There is another positive result when it comes to the BPBp for bilinear forms on
C0(L1) × C0(L2) in the complex case due to Kim, Lee, and Martín, where L1, L2
are locally compact Hausdorff topological spaces (see [132]). Indeed, we have the
following result.
Theorem 2.45 ([132, Theorem 2 and Corollary 3]) (C0(L1), C0(L2); K) has the
BPBp for bilinear forms in the complex case. In particular, so does (c0 , c0 ; K) (in
the complex case).
We send the reader to [69] for some generalizations of the previous result.
We also have the following positive result when one of the factors is c0 and the
other one is an ℓp-space.
Theorem 2.46 ([124, Corollary 2.9]) The triple (c0, ℓp; K) has the BPBp for
bilinear forms whenever 1 < p < ∞.
At this point, one might wonder what happens with Theorem 2.46 when c0
is replaced by ℓ1. The answer to this is that we have an analogous result, and it
follows immediately from Theorem 2.8.(b) and [87, Proposition 2.6]. This was also
observed explicitly in [82, Corollary 1.1]. We highlight this in the following result.
Theorem 2.47 ([82, Corollary 1.1]) The triple (ℓ1, ℓp; K) has the BPBp for
bilinear forms whenever 1 < p < ∞.
We have seen in Theorem 2.27 that when Y is an isometric L1-predual, the pair
(X, Y) has the BPBp for compact operators. It turns out that, by
using the metric approximation property on the L1 -predual, we have the following
technical result for (compact) multilinear mappings which provides some positive
results for the BPBp for this class of functions.
Theorem 2.48 ([87, Theorem 2.9]) Suppose that Y is an L1 -predual and that the
N-tuple (X1 , . . . , XN ; K) has the BPBp for multilinear forms. Then, the (N + 1)-
tuple (X1 , . . . , XN ; Y) has the BPBp for compact multilinear mappings.
In particular, we have the following corollary.
Corollary 2.49 Suppose that Y is an L1 -predual. The triple (X , Z; Y) has the
BPBp for compact bilinear mappings when
(a) X = C0 (L1 ) and Z = C0 (L2 ) in the complex case.
(b) both X , Z are uniformly convex Banach spaces.
(c) X = ℓ1 and
(c.1) Z is uniformly smooth.
(c.2) Z is finite-dimensional.
(c.3) Z = C0 (L) and, in particular, Z = C(K) or Z = c0 .
(c.4) Z = K(H), where H is a Hilbert space.
(d) X = L1 (μ), where μ is a σ -finite measure such that L1 (μ) is infinite-
dimensional and
(d.1) Z is uniformly smooth.
(d.2) Z is finite-dimensional.
(d.3) Z = c0 .
(e) X = c0 and Z = ℓp for 1 < p < ∞.
Let us observe that items (c) and (d) of Corollary 2.49 are immediate consequences
of Theorem 2.40 and Theorem 2.43, respectively. Item (b) follows from Theo-
rem 2.39 and item (a) from Theorem 2.45. Item (e) follows from Theorem 2.46.
Remark 2.50 In [131], Kim, Lee, and Martín defined a more general geometric
property than the AHSP in order to characterize the pairs (X, Y) such that
(ℓ1(X), Y) satisfies the BPBp for operators. In the same direction, in [87]
we defined the analogous property for bilinear forms with the idea to give a
characterization for the triples of the form (ℓ1(X), Y; K) to have the BPBp for
bilinear forms.
For symmetric bilinear forms (Hermitian forms), the only known positive result
is the following one:
Theorem 2.51 ([104]) Let H be a Hilbert space.
• (H, H) has the BPBp for continuous symmetric bilinear forms ([104, Theorems
3.2 and 3.4]).
• If H is complex, then (H, H) has the BPBp for continuous Hermitian forms
([104, Corollary 2.2]).
For negative results on the norm-attaining multilinear mappings (and, in particu-
lar, for the Bollobás version for this class of functions) we send the interested reader
to [8, 47, 119].

2.4 For Homogeneous Polynomials

In this section, we present and discuss the progress on the Bishop–Phelps–Bollobás
property for homogeneous polynomials. Again, the definition is easily adapted and
we highlight it as follows. We send the reader back to the first paragraph of the
Introduction to check our notation.
Definition 2.52 We say that the pair (X; Y) has the Bishop–Phelps–Bollobás
property for N-homogeneous polynomials if given ε > 0, there exists η(ε) > 0
such that whenever P ∈ P(^N X; Y) with ‖P‖ = 1 and x0 ∈ SX satisfy

\[ \|P(x_0)\| > 1 - \eta(\varepsilon), \]

there are Q ∈ P(^N X; Y) with ‖Q‖ = 1 and x1 ∈ SX such that

\[ \|Q(x_1)\| = 1, \quad \|x_1 - x_0\| < \varepsilon \quad\text{and}\quad \|Q - P\| < \varepsilon. \]

As in the case of operators (see Theorem 2.6) and multilinear mappings (see
Theorem 2.39), we have the following universal result for polynomials.
Theorem 2.53 ([12, Theorem 3.1]) Let X be a uniformly convex Banach space.
Then, (X ; Y) has the BPBp for N-homogeneous polynomials for every Banach
space Y.
The technique used to prove Theorem 2.53 is based on the original Lindenstrauss
argument. It is not known whether the analogous result for symmetric N-linear
forms holds (see Question 8).
Concerning property β of Lindenstrauss, we have the following theorem. Item (a)
of Theorem 2.54 below is an immediate consequence of [87, Proposition 2.3.(iii)].
Theorem 2.54 ([12, Proposition 3.3]) Let X be a Banach space and Y be a
Banach space with property β of Lindenstrauss. If (X ; K) has the BPBp for N-
homogeneous polynomials, so does (X ; Y). In particular, (X ; Y) has the BPBp for
N-homogeneous polynomials when
(a) X is finite-dimensional.
(b) X is uniformly convex.
There is no Bishop–Phelps theorem for polynomials in either the scalar-valued or
the vector-valued setting. For negative results on norm-attaining N-homogeneous
polynomials (and, in particular, for the Bollobás version for this class of functions)
we send the interested reader to [47, 119].

2.5 For Holomorphic Functions

In this section, we consider the Bollobás theorem for holomorphic functions. Let
us start by saying that it seems that not much has been done in this direction.
Nevertheless, from our point of view, holomorphic functions deserve special
attention and the reason is quite simple: they require a completely different approach
and interesting techniques.
To start with, we highlight a non-linear version of the Bishop–Phelps–Bollobás
theorem. This was done in [86]. It was proven that a Bishop–Phelps–Bollobás type
theorem holds on Aw∗ u (BX ) whenever X is either a uniformly convex or a locally
c-uniformly convex, order-continuous sequence space. For necessary notation, we
send the reader to the first paragraph of Sect. 1.
Theorem 2.55 ([86, Theorem 1]) Let X be a complex Banach space. Suppose that
the set of all strong peak points for Aw∗u(BX) is norm dense in SX. Then, given
ε > 0, there exists η(ε) > 0 such that whenever f ∈ Aw∗u(BX) with ‖f‖∞ = 1 and
x0∗ ∈ SX satisfy

\[ |f(x_0^*)| > 1 - \eta(\varepsilon), \]

there are g ∈ Aw∗u(BX) with ‖g‖∞ = 1 and x1∗ ∈ SX such that

\[ |g(x_1^*)| = 1, \quad \|g - f\|_\infty < \varepsilon, \quad\text{and}\quad \|x_1^* - x_0^*\| < \varepsilon. \]

Notice that in Theorem 2.55 we are requiring that the set of all strong peak points
for Aw∗u(BX) is dense in SX; for results when this happens, we refer to
[86, Propositions 5 and 6]. For necessary background we send the reader to
[86, pages 8 and 9].
In the same direction, Kim and Lee proved similar results to Theorem 2.55 for
the spaces Au (BX ) and Ab (BX ) under some additional conditions on the complex
Banach space X .
Theorem 2.56 ([128, Corollary 8]) Suppose that X is either a locally uniformly
convex space or a locally c-uniformly convex, order-continuous sequence space.
Let Y be a Banach space. If A is one of Au (BX ) or Ab (BX ), then for ε ∈ (0, 1),
whenever a norm-1 element f in A(BX ; Y) and an element x0 ∈ BX satisfy
\[ \|f(x_0)\| > 1 - \frac{\varepsilon}{6}, \]

there are x1 ∈ SX and a strongly norm-attaining function g ∈ A(BX; Y) such that

\[ \|g\| = \|g(x_1)\| = 1, \quad \|x_1 - x_0\| < \varepsilon, \quad\text{and}\quad \|f - g\|_\infty < \varepsilon. \]

Observe that if X is finite-dimensional (in particular, if X = D), then
Aw∗u(BX) = Au(BX) = A(BX), and in particular, the two previous theorems
hold for the classical A(D).
Recently it was proven that the pair (H ∞ (D), H ∞ (D)) has the BPBp for
operators (see [38]). This result is due to Neeru Bala, Kousik Dhara, Jaydeb Sarkar,
and Aryaman Sensarma.
Theorem 2.57 ([38, Theorem 3.1]) The pair (H ∞ (D), H ∞ (D)) has the BPBp for
operators.
For negative results on the norm-attaining holomorphic functions (and, in
particular, for the Bollobás version for this class of functions) we send the reader
to the very interesting paper [48] due to Daniel Carando and Martin Mazzitelli. More
specifically, there is no Bishop–Phelps theorem for Au (BX ).

2.6 For Lipschitz Mappings

In this section, we present and discuss the progress on the Bishop–Phelps–Bollobás
property for Lipschitz mappings. Throughout this section, all the metric spaces will
be considered to be complete and all Banach spaces will be considered to be real.

A quick glance at the definition of ‖·‖L suggests that the most natural way of
defining a norm-attaining Lipschitz mapping is the following. We say that f ∈
Lip0(M, Y) strongly attains its norm if there exist x, y ∈ M, x ≠ y, such that

\[ \|f\|_L = \frac{\|f(x) - f(y)\|}{d(x, y)}. \]
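To illustrate strong norm attainment in the simplest situation, the following minimal sketch computes the Lipschitz norm of a real-valued function on a finite pointed metric space and returns a pair of points at which the largest slope is reached. On a finite space the supremum defining the Lipschitz norm runs over finitely many pairs, so it is always a maximum. The names `lipschitz_norm`, `points`, `dist` and `f` are hypothetical, chosen only for this illustration, and are not taken from the cited references.

```python
from itertools import combinations


def lipschitz_norm(points, dist, f):
    """Return (norm, pair): the largest slope |f(x) - f(y)| / d(x, y)
    over distinct points, and one pair where that slope is reached."""
    best, best_pair = 0.0, None
    for x, y in combinations(points, 2):
        slope = abs(f[x] - f[y]) / dist(x, y)
        if slope > best:
            best, best_pair = slope, (x, y)
    return best, best_pair


# Illustrative data (an assumption of this sketch): three points of the real
# line with the usual distance, base point 0, and a function vanishing at 0.
points = [0.0, 1.0, 3.0]
dist = lambda x, y: abs(x - y)
f = {0.0: 0.0, 1.0: 1.0, 3.0: 2.0}

norm, pair = lipschitz_norm(points, dist, f)
print(norm, pair)
```

In this example the slopes of the three pairs are 1, 2/3 and 1/2, so the function strongly attains its norm at the pair (0, 1).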
The subset of all Lipschitz mappings in Lip0 (M, Y) which attain their norms
strongly is denoted by SNA(M, Y) (although the notations SA(M, Y) and
LipSNA(M, Y) have also been used in the literature). Since there is a natural
concept of norm attainment, it makes sense to study density problems or even
Bishop–Phelps–Bollobás type properties in this scenario. Nevertheless, in [121,
Theorem 2.3] it was shown that if X is a geodesic pointed metric space (in particular,
if X is any Banach space), then SNA(X , R) is never dense in Lip0 (X , R) and later
this was extended to metric length spaces [54].
Theorem 2.58 ([54, Theorem 2.2]) Let M be a complete length pointed metric
space. Then, the set SNA(M, R) is not dense in Lip0 (M, R).
In fact, this result combined with Proposition 2.69 shows that if M is a metric
length space and Y is any Banach space, a BPBp-like property is not possible
(actually, in this case we do not even get density, see [65, Proposition 4.2]).
However, this inconvenience has not stopped researchers from developing a rich
theory on denseness of norm-attaining Lipschitz mappings by considering domains
that are not Banach spaces or using weaker norm attainment concepts (see, for
instance, [68, Section 1] for a clean exposition of several of these concepts and the
relations between them). As a matter of fact, as we are going to show throughout this section, many
authors came up with several definitions in order to get positive results in this setting.
As far as the authors of this survey know, the study of the density of norm-
attaining Lipschitz mappings was initiated independently in [112] and [121], and
since then many authors have contributed to the topic. We refer the interested reader
to [54, 64–66, 68, 83, 106, 107, 112, 121] and also the nice survey [111, Section
5] for a solid background on the topic. Four of those works also include Bishop–
Phelps–Bollobás type properties for Lipschitz mappings. The rest of the section
will focus on those results.
We start with a work by Vladimir Kadets, Miguel Martín, and Mariia Soloviova,
who focused their study on the case where M is a Banach space, Y = R, and
dealt with some weaker forms of norm attainment (see [121]). First, for the set of
continuous seminorms on X , Sem(X ), they got a variation of a BPBp-like result.
Proposition 2.59 ([121, Proposition 3.4]) Let X be a Banach space. Then for
every ε > 0, there is δ > 0 such that for every p0 ∈ Sem(X) with ‖p0‖ = 1
and every x0 ∈ SX with p0(x0) > 1 − δ, there exist p ∈ Sem(X) with ‖p‖ = 1 and
x ∈ SX such that

\[ p(x) = 1 = \|p\|, \quad \|x - x_0\| < \varepsilon, \quad \|p - p_0\|_\infty = \sup_{y \in S_X} |p(y) - p_0(y)| < \varepsilon. \]

Note that, in particular, this implies the uniform density of the set of norm
attaining seminorms.
We now need the definition of a natural (but weaker) form of norm attainment in
Lipschitz functionals in the setting of Banach spaces.
Definition 2.60 ([121, Definition 1.3]) Let X be a real Banach space. A Lipschitz
functional g ∈ Lip0 (X ) attains its norm at the direction u ∈ SX if there is a
sequence of pairs {(xn, yn)} in X × X, with xn ≠ yn, such that

\[ \lim_{n\to\infty} \frac{x_n - y_n}{\|x_n - y_n\|} = u \quad\text{and}\quad \lim_{n\to\infty} \frac{g(x_n) - g(y_n)}{\|x_n - y_n\|} = \|g\|. \]

In this case, we say that g attains its norm directionally. The set of all those f ∈
Lip0 (X ) that attain their norm directionally is denoted by DA(X ).
This kind of norm attainment is a natural approach since it coincides with the
usual norm attainment if g is linear. Also, if X is finite-dimensional, then DA(X ) =
Lip0 (X ). In [121], the authors defined a BPBp-like property for Lipschitz mappings
involving this kind of norm attainment.
Definition 2.61 ([121, Definition 1.4]) A Banach space X has the directional
Bishop–Phelps–Bollobás property for Lipschitz functionals (X ∈ LipBPB, for
short), if for every ε > 0, there is δ > 0 such that for every f ∈ Lip0(X) with
‖f‖ = 1 and for every x, y ∈ X with x ≠ y satisfying

\[ \frac{f(x) - f(y)}{\|x - y\|} > 1 - \delta, \]

there is g ∈ Lip0(X) with ‖g‖ = 1 and there is u ∈ SX such that g attains its norm
at the direction u,

\[ \|g - f\| < \varepsilon, \quad\text{and}\quad \left\| \frac{x - y}{\|x - y\|} - u \right\| < \varepsilon. \]

We also need to define another kind of norm attainment and its associated BPBp-
like property.
Definition 2.62 ([121, Definition 1.3]) Let X be a real Banach space. A Lipschitz
functional g ∈ Lip0 (X ) attains its norm at a point v ∈ X at the direction u ∈ SX if
there is a sequence of pairs {(xn, yn)} in X × X, with xn ≠ yn and limn→∞ xn =
limn→∞ yn = v, such that

\[ \lim_{n\to\infty} \frac{x_n - y_n}{\|x_n - y_n\|} = u \quad\text{and}\quad \lim_{n\to\infty} \frac{g(x_n) - g(y_n)}{\|x_n - y_n\|} = \|g\|. \]

In this case, we say that g attains its norm locally-directionally. The set of all those
f ∈ Lip0 (X ) that attain their norm locally-directionally is denoted by LDA(X ).
Definition 2.63 ([121, Definition 1.4]) A Banach space X has the local directional
Bishop–Phelps–Bollobás property for Lipschitz functionals (X ∈ LLipBPB for
short), if for every ε > 0, there is δ > 0 such that for every f ∈ Lip0(X) with
‖f‖ = 1 and for every x, y ∈ X with x ≠ y satisfying

\[ \frac{f(x) - f(y)}{\|x - y\|} > 1 - \delta, \]

there is g ∈ Lip0(X) with ‖g‖ = 1 and there are v ∈ X and u ∈ SX such that g
attains its norm at the point v at the direction u,

\[ \|g - f\| < \varepsilon, \quad \left\| \frac{x - y}{\|x - y\|} - u \right\| < \varepsilon, \quad\text{and}\quad \operatorname{dist}(v, [x, y]) < \varepsilon. \]

With the help of some lemmas (that might be of interest in themselves) involving
the LipBPB and the LLipBPB (see [121, Lemmas 4.1 and 4.4]), the authors showed
the following.
Theorem 2.64 ([121, Theorem 5.3]) Every uniformly convex Banach space X has
the local directional Bishop–Phelps–Bollobás property for Lipschitz functionals.
Some time after, Rafael Chiclana and Miguel Martín did a systematic study of a
vector-valued BPBp property for Lipschitz mappings for the strong norm attainment
with the domain being a metric space. We begin with the following definition.
Definition 2.65 ([65, Definition 1.1]) Let M be a pointed metric space and let Y
be a Banach space. We say that the pair (M, Y) has the Lipschitz Bishop–Phelps–
Bollobás property (Lip-BPB property for short) if given ε > 0, there is η(ε) > 0
such that for every norm-one F ∈ Lip0(M, Y) and every p, q ∈ M, p ≠ q, such
that

\[ \|F(p) - F(q)\| > (1 - \eta(\varepsilon))\, d(p, q), \]

there exist G ∈ Lip0(M, Y) and r, s ∈ M, r ≠ s, such that

\[ \frac{\|G(r) - G(s)\|}{d(r, s)} = \|G\|_L = 1, \quad \|G - F\|_L < \varepsilon, \quad \frac{d(p, r) + d(q, s)}{d(p, q)} < \varepsilon. \]

If this holds for a class of linear operators from F (M) to Y, we will say that the pair
(M, Y) has the Lip-BPB property for that class.
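The three conclusions in the definition of the Lip-BPB property can be checked mechanically in a toy case. The sketch below is only an illustration under strong assumptions: M is a finite subset of the real line, Y = R, and all names (`lip_norm`, `is_lip_bpb_witness`, the sample data) are hypothetical. It verifies whether a candidate G and pair (r, s) witness the conclusion for a given norm-one F, pair (p, q) and ε; it says nothing about how η(ε) should be chosen.

```python
from itertools import combinations


def lip_norm(points, dist, f):
    """Lipschitz norm of a real-valued function on a finite metric space."""
    return max(abs(f[x] - f[y]) / dist(x, y) for x, y in combinations(points, 2))


def is_lip_bpb_witness(points, dist, F, G, p, q, r, s, eps, tol=1e-12):
    """Check the three conclusions of the definition for real-valued F and G."""
    difference = {x: G[x] - F[x] for x in points}
    # G has norm one and strongly attains it at the pair (r, s).
    attains = (abs(abs(G[r] - G[s]) / dist(r, s) - 1.0) <= tol
               and abs(lip_norm(points, dist, G) - 1.0) <= tol)
    # G is close to F in the Lipschitz norm.
    close_maps = lip_norm(points, dist, difference) < eps
    # The pair (r, s) is close to (p, q) relative to d(p, q).
    close_pairs = (dist(p, r) + dist(q, s)) / dist(p, q) < eps
    return attains and close_maps and close_pairs


points = [0.0, 1.0, 2.0]
dist = lambda x, y: abs(x - y)
F = {0.0: 0.0, 1.0: 0.95, 2.0: 1.95}   # norm one, slope 0.95 on the pair (0, 1)
G = {0.0: 0.0, 1.0: 1.0, 2.0: 2.0}     # norm one, attained on the pair (0, 1)

accepted = is_lip_bpb_witness(points, dist, F, G, 0.0, 1.0, 0.0, 1.0, eps=0.1)
rejected = is_lip_bpb_witness(points, dist, F, G, 0.0, 1.0, 0.0, 1.0, eps=0.01)
print(accepted, rejected)
```

Here F almost attains its norm at (0, 1), and G attains its norm exactly there while staying within Lipschitz distance 0.05 of F, so the candidate is accepted for ε = 0.1 and rejected for ε = 0.01.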
The authors also give a reformulation of that definition in [65, Remark 1.2.(a)]
in terms of linear operators associated to the Lipschitz mappings. The first result for
this property is the following.
Theorem 2.66 ([65, Theorem 2.1]) Let M be a finite pointed metric space and let
Y be a Banach space. If (F (M), Y) has the BPBp, then (M, Y) has the Lip-BPB
property.
This happens for example if Y is finite-dimensional (see [65, Corollary 2.3]).
The authors noted that we cannot remove the condition of (F(M), Y) having the
BPBp (see [65, Example 2.5]) or the finiteness of M (see [65, Example 2.6]).
The next results will use a series of concepts related to pointed metric spaces. We
refer the reader to [65, Section 3] and the references cited there for the definitions
and necessary background.
Theorem 2.67 ([65, Theorem 3.3]) Let M be a uniformly Gromov concave pointed
metric space. Then, (M, Y) has the Lip-BPB property for every Banach space Y.
This is the case for example when M is concave and F (M) has property α (see
Definition 2.3), if M is concave and finite, if M is a pointed ultrametric space, and
if M is a Hölder pointed metric space ([65, Corollaries 3.4, 3.5, 3.6, 3.7]).
In [65, Example 2.6] they showed that the BPBp of (F (M), Y) does not imply
the Lip-BPB property of (M, Y) in general. The other implication does not hold in
general either.
Proposition 2.68 ([65, Proposition 3.9]) Let M be a finite pointed metric space
with more than two points. Then, there exists a Banach space Y such that (F (M), Y)
fails the BPBp.
Contrary to a conjecture they had, they showed that there is a Gromov concave
pointed metric space such that F (M) has the RNP but (M, R) fails the Lip-BPB
property (see [65, Example 3.11]). They also studied relations between the scalar-
valued and the vector-valued versions of the property. We will summarize some of
the main results.
Proposition 2.69 ([65, Proposition 4.1]) Let M be a pointed metric space. Suppose
that there exists a Banach space Y ≠ {0} such that (M, Y) has the Lip-BPB
property. Then (M, R) has the Lip-BPB property.
Proposition 2.70 ([65, Proposition 4.4]) Let M be a pointed metric space such
that (M, R) has the Lip-BPB property, and let Y be a Banach space satisfying
property β of Lindenstrauss. Then (M, Y) has the Lip-BPB property.
The previous result does not hold for property quasi-β even though we have density
of SNA in that case ([65, Proposition 4.7, Example 4.9]).
Finally, they managed to adapt some modifications of results from [65, Sections
3 and 4] and [88] to compact Lipschitz operators ([65, Propositions 4.10, 4.13, 4.16,
4.17]).
In a recent work by Geunsu Choi, Yun Sung Choi, and Miguel Martín ([68]), the
authors studied a vector-valued variation of the LLipBPB (see Definition 2.63) for
a slightly modified type of norm attainment as well as a version of it for compact
operators. For a Banach space X, let us denote by X̃ the set {(x, y) ∈ X² : x ≠ y}.
Definition 2.71 ([68, Definition 1.5]) We say that f ∈ Lip0(X, Y) attains its norm
locally directionally at the point x ∈ X in the direction u ∈ SX toward z ∈ Y if
there exists a sequence {(xn, yn)} ⊆ X̃ such that

\[ \frac{f(x_n) - f(y_n)}{\|x_n - y_n\|} \to z \ \text{ with } \ \|z\| = \|f\|, \quad \frac{x_n - y_n}{\|x_n - y_n\|} \to u \quad\text{and}\quad x_n, y_n \to x. \]

We denote by LDirA(X , Y) the set of every f ∈ Lip0 (X , Y) which attains its norm
locally directionally at some point x ∈ X in some direction u ∈ SX toward some
point z ∈ Y.
Definition 2.72 ([121, Definition 1.4]) A pair of Banach spaces (X , Y) is said to
have the local directional Bishop–Phelps–Bollobás property for Lipschitz mappings
(in short, LDirA-BPBp) if for every ε > 0, there is η > 0 such that whenever
f ∈ SLip0(X,Y) and (x, y) ∈ X × X with x ≠ y satisfy

\[ \frac{\|f(x) - f(y)\|}{\|x - y\|} > 1 - \eta, \]

there exist g ∈ SLip0(X,Y), z ∈ SY, u ∈ SX and x̄ ∈ X such that g attains its norm
locally directionally at the point x̄ in the direction u toward z,

\[ \|g - f\| < \varepsilon, \quad \left\| \frac{x - y}{\|x - y\|} - u \right\| < \varepsilon, \quad\text{and}\quad \operatorname{dist}(\bar{x}, [x, y]) < \varepsilon. \]

Their main result extends [121, Theorem 5.3] in the case of compact operators.
Theorem 2.73 ([68, Theorem 4.1]) Let X and Y be Banach spaces such that X
is uniformly convex and (F (X ), Y) has the BPBp for compact operators. Then, the
pair (X , Y) has the LDirA-BPBp for Lipschitz compact mappings. In fact, we have
something more: for every ε > 0, there exists η > 0 such that for any positive
function ρ : X̃ → R and whenever f ∈ SLip0K(X,Y) and (x, y) ∈ X̃ satisfy

\[ \frac{\|f(x) - f(y)\|}{\|x - y\|} > 1 - \eta, \]

there exist g ∈ SLip0K(X,Y), z ∈ SY, u ∈ SX and x̄ ∈ X such that g attains its norm
locally directionally at the point x̄ in the direction u toward z,

\[ \|g - f\| < \varepsilon, \quad \left\| u - \frac{x - y}{\|x - y\|} \right\| < \varepsilon \quad\text{and}\quad \operatorname{dist}(\bar{x}, [x, y]) < \varepsilon \rho(x, y). \]
In particular, if X is uniformly convex and Y has property β of Lindenstrauss
or satisfies that Y = L1 (μ) for some μ, then (X , Y) has the LDirA-BPBp for
Lipschitz compact mappings (see [68, Corollary 4.4]), and the same is also true if
Y is a finite-dimensional polyhedral Banach space (see [68, Corollary 4.11]). The
condition in the previous theorem that the pair (F (X ), Y) has the BPBp for compact
operators is sufficient, but not necessary (see [68, Proposition 4.8] and the discussion
right after it).
There is also a slightly different version of Theorem 2.73 where the domain space
is a Hilbert space.
Theorem 2.74 ([68, Theorem 4.5]) Let H be a Hilbert space and let Y be a
Banach space. Suppose that (F(H), Y) has the BPBp for compact operators. Then
for every ε > 0, there exists η > 0 such that whenever f ∈ SLip0K(H,Y) and
(x, y) ∈ H̃ satisfy

\[ \frac{\|f(x) - f(y)\|}{\|x - y\|} > 1 - \eta, \]

there exist g ∈ SLip0K(H,Y), z ∈ SY and x̄ ∈ H such that g attains its norm locally
directionally at the point x̄ in the direction (x − y)/‖x − y‖ toward z,

\[ \|g - f\| < \varepsilon \quad\text{and}\quad \operatorname{dist}(\bar{x}, [x, y]) < \varepsilon \max\{\|x\|, \|y\|\}. \]
In particular, the same holds if Y has property β of Lindenstrauss or if Y is an
L1 (μ) space (see [68, Corollary 4.6]), and the same is also true if Y is a finite-
dimensional polyhedral Banach space (see [68, Corollary 4.12]).
Finally, they got some stronger versions of Theorems 2.73 and 2.74 for the case
when X = R (see [68, Propositions 4.7 and 4.8, Corollaries 4.9 and 4.10]).
To conclude this section, let us note that a very recent work by Rafael Chiclana
and Miguel Martín ([66]) studied stability properties for the Bishop–Phelps–
Bollobás property for Lipschitz mappings (see Definition 2.65). Let us start by
defining the sum of a family of pointed metric spaces.
Definition 2.75 ([66, Definition 2.1], from [149, Definition 1.13]) Given a family
of pointed metric spaces {(Mi, di)}i∈I, the (metric) sum of the family is the disjoint
union of all Mi's, identifying the base points, endowed with the following metric d:
d(x, y) = di(x, y) if both x, y ∈ Mi, and d(x, y) = di(x, 0) + dj(0, y) if x ∈ Mi,
y ∈ Mj and i ≠ j. We will write ∐i∈I Mi to denote the sum of the family of metric
spaces.
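The metric of the sum can be written out directly from the definition. The following is a minimal sketch, assuming that each summand is given by its metric, that every base point is labelled 0, and that a point of the sum is represented as a tagged pair (i, x) with x ∈ Mi. The function name `metric_sum` and this representation are hypothetical choices made for the illustration only.

```python
def metric_sum(metrics, base=0.0):
    """Build the metric of the sum of pointed metric spaces.

    `metrics` maps an index i to the metric d_i of M_i. A point of the sum is
    a tagged pair (i, x) with x in M_i; all base points are identified.
    """
    def d(p, q):
        (i, x), (j, y) = p, q
        if i == j:
            return metrics[i](x, y)
        # Points in different summands: go through the common base point.
        return metrics[i](x, base) + metrics[j](base, y)
    return d


# Illustrative data (an assumption of this sketch): two copies of subsets of
# the real line with the usual distance and base point 0.
absolute = lambda x, y: abs(x - y)
d = metric_sum({1: absolute, 2: absolute})

same = d((1, 1.0), (1, 2.0))       # both points in M_1
across = d((1, 2.0), (2, 5.0))     # one point in M_1, the other in M_2
from_base = d((1, 0.0), (2, 5.0))  # base point to a point of M_2
print(same, across, from_base)
```

Because the base points are identified, the pairs (1, 0.0) and (2, 0.0) are at distance 0 from each other under this metric, as the definition requires.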
Proposition 2.76 ([66, Proposition 2.2]) Let M = M1 ∐ M2 be the sum of two
pointed metric spaces and let Y be a Banach space. If the pair (M, Y) has the
Lip-BPB property, then so do (M1, Y) and (M2, Y).
The version of this proposition for compact Lipschitz mappings remains true (see
[66, Proposition 2.6]). However, none of the converses hold (see [66, Example 2.4]).
They also provided an extension of [65, Proposition 4.4]. We will include the
result for the sake of completeness, but we refer the reader to [56] for the necessary
definitions and background (see also [66, Section 3]).
Theorem 2.77 ([66, Theorem 3.5]) Let M be a pointed metric space such that
(M, R) has the Lip-BPB property. Let Y be a Banach space in ACKρ with associated
1-norming set Γ ⊆ BY∗, and let ε > 0. Then, there exists η(ε, ρ) > 0 such that if
we take T̂ ∈ L(F(M), Y) a Γ-flat operator with ‖T‖L = 1 and m ∈ Mol(M)
satisfying ‖T̂(m)‖ > 1 − η(ε, ρ), then there exist an operator Ŝ ∈ L(F(M), Y)
and a molecule u ∈ Mol(M) such that

\[ \|\hat{S}(u)\| = \|S\|_L = 1, \quad \|m - u\| < \varepsilon, \quad \|T - S\|_L < \varepsilon. \]

As a consequence, if M is a pointed metric space such that (M, R) has the Lip-
BPB property, then we get several pairs of spaces satisfying the Lip-BPB property
for Γ-flat operators (see [66, Corollary 3.6] for the details). The inconvenience of
having to work with Γ-flat operators disappears if we restrict the results to the
compact operator case.
Proposition 2.78 ([66, Proposition 3.12]) Let M be a pointed metric space such
that (M, R) has the Lip-BPB property and let Y be an ACKρ Banach space. Then,
the pair (M, Y) has the Lip-BPB for Lipschitz compact mappings.
And again, a series of consequences can be derived from this (see [66, Corollary
3.13] for the details).
Using some results from [66] and [65, Proposition 4.16], we get the following
result.
Corollary 2.79 ([66, Corollary 3.17]) Let M be a pointed metric space and let Y
be a Banach space such that the pair (M, Y) has the Lip-BPB property for Lipschitz
compact mappings.
(1) For every compact Hausdorff topological space K, the pair (M, C(K, Y)) has
the Lip-BPB property for Lipschitz compact mappings.
(2) For 1 ≤ p < ∞, if the pair (M, ℓp(Y)) has the Lip-BPB property for Lipschitz
compact mappings, then so does (M, Lp (μ, Y)) for every positive measure μ.
(3) For every σ -finite positive measure μ, the pair (M, L∞ (μ, Y)) has the Lip-BPB
property for Lipschitz compact mappings.
Finally, they study stability results for absolute sums of codomains.
Proposition 2.80 ([66, Proposition 4.3]) Let M be a pointed metric space, Y be
a Banach space and Y1 be an absolute summand of Y. If the pair (M, Y) has the
Lip-BPB property with a function ε ↦ η(ε), then so does (M, Y1) with the same
function.
As a consequence, we get a universal mapping η for universal Lip-BPB domain
spaces.
Corollary 2.81 ([66, Corollary 4.4]) Let M be a pointed metric space such that
(M, Y) has the Lip-BPB property for all Banach spaces Y. Then, there exists a
function ηM (ε), which only depends on M, such that the pair (M, Y) has the Lip-
BPB property witnessed by the function ηM(ε) for every Banach space Y.
A result in the same direction is the following.
Proposition 2.82 ([66, Proposition 4.6]) Let M be a pointed metric space, Y a
Banach space, and K a compact Hausdorff topological space. If (M, C(K, Y)) has
the Lip-BPB property witnessed by a function η(ε), then (M, Y) has the Lip-BPB
property witnessed by the same function.
It is worth noting that the last three results are still valid in the case of compact
operators (see [66, Proposition 4.8] and the discussion before it for the details).
Let Y = [⊕i∈I Yi]c0 or Y = [⊕i∈I Yi]ℓ∞ for some family of Banach spaces
{Yi }i∈I and let M be a pointed metric space. By Proposition 2.80, if (M, Y) has the
Lip-BPB property, then all pairs (M, Yi ) have it with the same function η. We have
the following result regarding the reverse implication:
Proposition 2.83 ([66, Proposition 4.9]) Let M be a pointed metric space, let
{Yi}i∈I be a family of Banach spaces and let Y = [⊕i∈I Yi]c0 or Y =
[⊕i∈I Yi]ℓ∞. Assume that (M, Yi) has the Lip-BPB property with the function ηi(ε)
for each i ∈ I. If inf{ηi(ε) : i ∈ I} > 0 for every ε > 0, then (M, Y) has the
Lip-BPB property.
Finally, let us notice that the compact operators version of the previous result is
also true (see [66, Proposition 4.11]).

2.7 For Numerical Radius

In this section we will discuss an adaptation of the Bishop–Phelps–Bollobás
property for the numerical radius.

Brailey Sims, in his 1972 Ph.D. dissertation (see [146]), raised a question that is,
in nature, related to the one that Lindenstrauss tackled in 1963 [137]: the norm-
denseness of the set of numerical radius attaining operators on a Banach space
X (we will define this concept shortly). Ever since, many authors have made
contributions regarding this question, such as Ira David Berg, Brailey Sims (see
[40]), Carmen Silvia Cardassi (see [50–52]), María Dolores Acosta, Francisco
José Aguirre, Rafael Payá, and Manuel Ruiz Galán (see for instance [2, 3, 7, 23–
25, 141]). It is also worth noting for the interested reader that M. D. Acosta initiated
a systematic study of this question in her nice Ph.D. dissertation [1].
Similar to what happened with the study of norm-attaining operators, it is a
natural question whether or not we can have Bishop–Phelps–Bollobás type theorems
for the numerical radius on some Banach spaces. The study of this question was first
addressed in 2013 by Antonio José Guirao and Olena Kozhushkina in [115], where,
paralleling the work [9], they introduced and studied the Bishop–Phelps–Bollobás
property for numerical radius. We need some background before proceeding.
Given a Banach space X, the set of states of X is Π(X) := {(x, x∗) ∈ SX ×
SX∗ : x∗(x) = 1}. Given an operator T ∈ B(X), its numerical radius is defined
as ν(T) = sup{|x∗(T(x))| : (x, x∗) ∈ Π(X)}. It is immediate to check that ν is a
seminorm, and that for all T ∈ B(X), we have 0 ≤ ν(T) ≤ ‖T‖. The numerical
index of a Banach space X is defined as the number n(X) = inf{ν(T) :
T ∈ B(X), ‖T‖ = 1} = max{k ≥ 0 : k‖T‖ ≤ ν(T) for all T ∈ B(X)}, which somehow
measures how “close” the norm and the numerical radius are for the given space X.
In particular, if n(X) = 1, they coincide, and if n(X) > 0, they are equivalent norms. In [134,
Subsection 1.1], the authors provide an overview of the main known results about
n(X) up to 2016, and in that work, they also introduce the second numerical index
of X as n′(X) := inf{ν(T) : T ∈ B(X), ‖T + Z(X)‖ = 1}, where Z(X) := {S ∈
B(X) : ν(S) = 0}, and ‖T + Z(X)‖ is the natural quotient norm in B(X)/Z(X).
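As a concrete illustration of the gap between the numerical radius and the operator norm, the following sketch approximates both quantities for 2 × 2 real matrices acting on the real Euclidean plane. It assumes the standard fact that on a Hilbert space the states are the pairs (x, ⟨·, x⟩) with x a unit vector, so the numerical radius is the supremum of |⟨Tx, x⟩| over unit vectors. The function names and the sampling over angles are choices of this sketch, not a method taken from the cited works.

```python
import math


def apply(T, x):
    """Apply a 2 x 2 matrix T (list of rows) to a vector x of the plane."""
    return (T[0][0] * x[0] + T[0][1] * x[1],
            T[1][0] * x[0] + T[1][1] * x[1])


def numerical_radius(T, samples=3600):
    """Approximate sup |<Tx, x>| over unit vectors x by sampling angles."""
    best = 0.0
    for k in range(samples):
        theta = 2 * math.pi * k / samples
        x = (math.cos(theta), math.sin(theta))
        Tx = apply(T, x)
        best = max(best, abs(Tx[0] * x[0] + Tx[1] * x[1]))
    return best


def operator_norm(T, samples=3600):
    """Approximate sup ||Tx|| over unit vectors x by sampling angles."""
    best = 0.0
    for k in range(samples):
        theta = 2 * math.pi * k / samples
        x = (math.cos(theta), math.sin(theta))
        best = max(best, math.hypot(*apply(T, x)))
    return best


shift = [[0.0, 1.0], [0.0, 0.0]]      # sends (a, b) to (b, 0)
rotation = [[0.0, -1.0], [1.0, 0.0]]  # rotation by a right angle

print(numerical_radius(shift), operator_norm(shift))
print(numerical_radius(rotation), operator_norm(rotation))
```

Both operators have norm 1, but the shift has numerical radius 1/2 and the rotation has numerical radius 0, since Tx is orthogonal to x for every x. In particular, the infimum defining the numerical index of the real Euclidean plane is 0.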
We are now in a position to be able to define the main property of this section.
Definition 2.84 (Combining [115, Definition 1.2] and [130, Definition 5]) A
Banach space X has the weak Bishop–Phelps–Bollobás property for the numerical
radius (weak-BPBp-nu, for short) if given ε > 0, there exists η(ε) > 0 such that,
whenever T ∈ B(X) with ν(T) = 1 and (x, x∗) ∈ Π(X) satisfy |x∗(T(x))| >
1 − η(ε), there exist S ∈ B(X) and (y, y∗) ∈ Π(X) such that

\[ \nu(S) = |y^*(S(y))|, \quad \|x - y\| < \varepsilon, \quad \|x^* - y^*\| < \varepsilon, \quad \|T - S\| < \varepsilon. \]

If, moreover, S can be chosen so that ν(S) = 1, we say that X has the Bishop–
Phelps–Bollobás property for the numerical radius (abbreviated BPBp-nu, although
some authors use the notation BPBp-ν as well).
If the conditions from the previous definition hold within a subclass of B(X ), we
say that X has the (weak)-BPBp-nu for that class of operators (see [19]).
Acosta exhibits a nice overview of the main known results about the BPBp-nu in
her survey [6, Section 6]. For the sake of completeness, we will briefly list some of
the results about it without further detail. The following results are true both in the
real and complex settings unless specified otherwise.
Theorem 2.85 Let X, Y be Banach spaces, K be a compact Hausdorff topological
space and μ be any measure.
1. If Γ is any index set, then the spaces c0(Γ) and ℓ1(Γ) have the BPBp-nu ([115]).
2. L1 (R) has the BPBp-nu ([101, Theorem 9]).
3. If X is finite-dimensional, then it has the BPBp-nu ([130, Proposition 2]).
4. L1 (μ) has the BPBp-nu ([130, Theorem 9]).
5. If X is both uniformly convex and uniformly smooth, then it has the weak-BPBp-
nu ([130, Proposition 4]).
6. If n(X) > 0, or if n′(X) > 0, then X has the BPBp-nu if and only if it has the
weak-BPBp-nu ([130, Proposition 6] and [134, Theorem 3.2]). In particular,
all the Lp (μ) spaces have the BPBp-nu if 1 < p < ∞ ([130, Example 8] and
[134, Theorem 2.3]).
7. If K admits local compensation (see [37, Definition 2.1]), then the real space
C(K) has the BPBp-nu ([37, Theorem 2.2]). In particular, if K is metric, the
real space C(K) has the BPBp-nu (see [37, Section 3]).
8. If X is strongly lush (see [133, Definition 1.2]; strongly lush spaces include
C(K) spaces, L1(μ) spaces and finite-codimensional subspaces of C[0, 1]) and
X ⊕1 Y has the weak-BPBp-nu, then (X, Y) has the BPBp ([133, Theorem
2.1]).
9. If Y is strongly lush and X ⊕∞ Y has the weak-BPBp-nu, then (X , Y) has the
BPBp ([133, Theorem 2.3]).
10. If μ is finite, if M is any class of operators between the finite-rank operators
and the Riesz-representable operators, then L1 (μ) has the BPBp-nu for the
class M ([19, Theorem 2.1]). In particular, this includes weakly compact
operators and compact operators ([19, Corollary 2.1]).
11. If X has the BPBp-nu and W is an absolute summand of X of type 1 or ∞ (see
[72, Definition 1.2]), then W has the BPBp-nu, and the result is also true for
the weak-BPBp-nu and for the compact operators case ([72, Section 4]).
Regarding item (7), it is worth noting that the complex case and the general case
remain open to this day, as far as the authors of this survey know (see Question 9).
The second, third and fourth authors of this survey and Miguel Martín recently
studied the BPBp-nu for compact operators, that is, the BPBp-nu but where all the
involved operators are compact (see [105]), paralleling the work done in [88], where
the BPBp for compact operators was introduced and studied (see also [46], where
the authors studied numerical radius attaining compact operators). We will continue
this section by summarizing the main results from [105]. First of all, similar to how
n(X ), Z(X ) and n′(X ) were defined, one may consider compact operator versions
of those concepts, nK (X ), ZK (X ) and n′K (X ), by restricting the original definitions
to the setting of compact operators. Taking this into consideration, one can adapt
the original proofs for the BPBp-nu to show that many Banach spaces have the
BPBp-nu for compact operators.
Example ([105]) The following spaces have the BPBp-nu for compact operators:
1. Finite-dimensional spaces [130, Proposition 2].
2. c0 and ℓ1 (adapting the proofs given in [115, Corollaries 3.3 and 4.2]).
3. L1 (μ) for every measure μ (using [19, Corollary 2.1] for finite measures and
adapting [130, Theorem 9] to compact operators for the general case).
4. Lp (μ) for every measure μ and all 1 < p < ∞ (adapting the proofs from [130,
Propositions 4 and 6] and [134, Theorem 2.3, Lemma 2.4 and Theorem 3.2]).
Regarding item (4), actually more is known: if a Banach space X is both
uniformly convex and uniformly smooth, then it has the weak-BPBp-nu for compact
operators, and if nK (X ) > 0 or n′K (X ) > 0, then having the weak-BPBp-nu
for compact operators is equivalent to having the BPBp-nu for compact operators,
since [130, Propositions 4 and 6] and [134, Theorem 3.2] remain true when properly
adapted to compact operators.
As we mentioned earlier, in [72, Proposition 4.3] it is shown that if a Banach
space X has the BPBp-nu for compact operators, then its absolute summands of
type 1 and ∞ also have this property, and actually this is true with the same mapping
η. This allows us to carry the property from some spaces to some projections of
those spaces. It is natural to wonder whether something can be said in the opposite
direction, perhaps by adding suitable extra conditions. In [88, Lemma 2.1], a tool
(based on [120, Lemma 3.1]) was presented that, in particular, allows one to carry the
BPBp for compact operators from some projections of a space to the space itself
(see Sect. 2.2). In order to get a somewhat similar result for the numerical radius,
one needs to control things both in the space and in its dual. The most general result
obtained in this direction is the following lemma.
Lemma 2.86 ([105, Lemma 2.1]) Let X be a Banach space satisfying that
nK (X ) > 0. Suppose that there is a mapping η : (0, 1) −→ (0, 1) such that
given δ > 0, x1∗ , . . . , xn∗ ∈ BX∗ and x1 , . . . , xℓ ∈ BX , we can find norm one
operators P1 : X −→ P1 (X ), i : P1 (X ) −→ X such that for P := i ◦ P1 : X −→ X ,
the following conditions are satisfied:
(i) ‖P∗(xj∗) − xj∗‖ < δ, for j = 1, . . . , n.
(ii) ‖P(xj) − xj‖ < δ, for j = 1, . . . , ℓ.
(iii) P1 ◦ i = IdP1(X ) .
(iv) P1(X ) satisfies the Bishop–Phelps–Bollobás property for numerical radius for
compact operators with the mapping η.
(v) Either P is an absolute projection and i is the natural inclusion, or
nK (P1(X )) = nK (X ) = 1.
Then, X satisfies the BPBp-nu for compact operators.
Throughout [105, Section 2], Lemma 2.86 is used to show that if a Banach space
X with positive compact index can be suitably projected into some net of spaces
that have the BPBp-nu for compact operators with a common mapping η, then
sometimes it is possible to show that X also has that property (see [105, Proposition
2.2]). This is used for instance to show that if nK (X ) > 0 and the spaces ℓ_∞^n (X ),
n ∈ N, all have the BPBp-nu for compact operators with the same η, then c0 (X )
also has the property [105, Corollary 2.3], and this is actually an equivalence, since
the converse implication was already known (see [88, Proposition 4.3]). Another not
so direct consequence of Lemma 2.86 is that if a Banach space X satisfies that X∗ is
isometrically isomorphic to ℓ1 , then X has the BPBp-nu for compact operators (see
[105, Corollary 2.6]). Let us highlight these two results.
Corollary 2.87 ([105, Corollary 2.3]) Let X be a Banach space with nK (X ) > 0.
Then, the following statements are equivalent:
(i) The space c0 (X ) has the BPBp-nu for compact operators.
(ii) There is a function η : (0, 1) −→ (0, 1) such that all the spaces ℓ_∞^n (X ), with
n ∈ N, have the BPBp-nu for compact operators with the function η.
Moreover, if X is finite-dimensional, these properties hold whenever c0 (X ) or
ℓ∞ (X ) have the BPBp-nu.
Corollary 2.88 ([105, Corollary 2.6]) Let X be a Banach space such that X∗ is
isometrically isomorphic to ℓ1 . Then X has the BPBp-nu for compact operators.
Finally, in [105, Section 3], a series of tools involving some topological
procedures and suitable projections was developed to study the BPBp-nu
for compact operators in C0 (L) spaces, where L is a locally compact Hausdorff
topological space. By adequately splitting L, one can always find a projection from
C0 (L) to some finite-dimensional space ℓ_∞^p (p ∈ N) that allows one to use Lemma 2.86.
Theorem 2.89 ([105, Theorem 3.4]) Let L be a locally compact space. Given
points {f1 , . . . , fℓ } ⊂ C0 (L) such that ‖fj‖ ≤ 1 for j = 1, . . . , ℓ, and given
functionals {μ1 , . . . , μn } ⊂ C0 (L)∗ with ‖μj‖ ≤ 1 for j = 1, . . . , n, for each
ε > 0 there exists a norm one projection P : C0 (L) −→ C0 (L) satisfying:
(1) ‖P∗(μj) − μj‖ < ε, for j = 1, . . . , n,
(2) ‖P(fj) − fj‖ < ε, for j = 1, . . . , ℓ,
(3) P (C0 (L)) is isometrically isomorphic to ℓ_∞^p for some p ∈ N.
By using Theorem 2.89, Lemma 2.86 and its consequences, the main result from
that paper is achieved.
Theorem 2.90 ([105, Theorem 1.6]) If L is a locally compact Hausdorff space,
then C0 (L) has the BPBp-nu for compact operators.
This fully answers the question for C0 (L) spaces in the compact operators
setting, so in particular, all the C(K) spaces (K compact Hausdorff) and all the
L∞ (μ) spaces (μ measure) have the BPBp-nu for compact operators. Note that
for the case of general operators, only some particular real C(K) spaces (with K
compact) are known to have the BPBp-nu (see [37]), and the general case, as well
as the complex case, remain open, as we mentioned earlier (see Question 9).
Let us finish this section speaking about multilinear versions of the numerical
radius (for multilinear versions on the calculation of numerical radius we refer the
reader to [74, 76, 87]). In these papers, the authors show that the equality v(A) =
‖A‖ holds for multilinear mappings A defined on c0 , ℓ1 , A(D), and L1 (μ) for any
measure μ. It is natural, then, to study a multilinear version of the BPBp-nu. In [87,
Section 5], the authors show that a bilinear version of the BPBp-nu on L1 (μ) is not
possible. We highlight this result as follows.
Theorem 2.91 ([87, Theorem 5.3]) The infinite-dimensional Banach space L1 (μ)
fails to have the BPBp-nu for bilinear mappings for any measure μ.
In the same paper, the authors show, however, that the BPBp-nu for multilinear
mappings holds for finite-dimensional Banach spaces ([87, Proposition 5.2]). As far
as we know there is no further research in this direction and we wonder whether
or not there are more spaces X such that X satisfies the BPBp-nu for multilinear
mappings.

3 Sharpness: The Bishop–Phelps–Bollobás Moduli

The vast majority of results we have seen so far were focused on finding pairs of
Banach spaces for which a Bishop–Phelps–Bollobás theorem is valid. However,
(almost) none of the proofs investigate the sharpness of the constants associated
to those theorems. This interesting question initiated by Bollobás himself (see [42,
Remark after Theorem 1]) has also been studied in the recent years. In this section
we will briefly discuss some of the results obtained in this direction.
In order to do so, let us define the moduli of Bishop–Phelps–Bollobás in its
general form.
Definition 3.1 ([123, Definition 4]) The Bishop–Phelps–Bollobás modulus of a
pair of Banach spaces (X , Y) is the function Φ(X , Y, ·) : (0, 1) → R+ whose
value at η ∈ (0, 1) is defined as the infimum of those ε > 0 such that for every
(x, T ) ∈ BX × BB(X ,Y ) with

‖T (x)‖ > 1 − η,

there is (y, S) ∈ SX × SB(X ,Y ) with

‖S(y)‖ = 1, ‖x − y‖ < ε, and ‖T − S‖ < ε.

We define the spherical Bishop–Phelps–Bollobás modulus as the analogous
function Φ^S (X , Y, ·) : (0, 1) → R+ considering (x, T ) ∈ SX × SB(X ,Y ) instead of
(x, T ) ∈ BX × BB(X ,Y ) .
Roughly speaking, given η > 0, the symbol Φ^(S) (X , Y, η) represents the best
possible ε for which the BPBp holds for the pair (X , Y). This concept somehow
generalizes the BPB moduli, Φ^(S)_X (·), introduced in [63, Definition 1.2], where Y
was considered to be the field K, as they were working with functionals (the only
difference between Φ^S (X , K, η) and Φ^S_X (η) is that in the original definition of
Φ^(S)_X (η) they asked to have y∗(y) = 1 instead of |y∗(y)| = 1 for norm-attainment).
Remark 3.2 Note that in his original paper [42], Bollobás provided nice independent
estimations for ‖x∗ − y∗‖ and ‖x − y‖. The BPB moduli for functionals
introduced in [63] serve as a common bound for those two values at the same time.
Note also that although the lower and upper bound of the modulus from Bollobás’
original theorem do not coincide, their difference is inessential when ε → 0.
The systematic study of the BPB moduli was initiated in a 2014 paper by Mario
Chica, Vladimir Kadets, Miguel Martín, Soledad Moreno-Pulido, and Fernando
Rambla-Barreno (see [63]), where they wondered what is the best Bishop–Phelps–
Bollobás theorem one can get in a given Banach space X while having a common
bound for ‖x∗ − y∗‖ and ‖x − y‖. As we mentioned implicitly in the introduction
(see Theorem 1.2), they showed that

Φ^S_X (η) ≤ Φ_X (η) ≤ √(2η)

(see [63, Theorem 2.1 and Corollary 2.4]), that is, given an η, the best ε one can get
that works for all Banach spaces at once is at most √(2η). However, the authors did
not stop there: not only did they give this upper bound, but they also showed that it is
actually sharp, that is, there are Banach spaces where it cannot be improved. We
show just some of these examples.

Example ([63, Example 2.5 and Section 4]) Let Y be a Banach space. The Banach
space X satisfies Φ^S_X (η) = Φ_X (η) = √(2η) when X is
(a) The real space X = ℓ_∞^2 .
(b) X = L1 (μ, Y), whenever L1 (μ) has dimension greater than one.
(c) X = L∞ (μ, Y), whenever L∞ (μ) has dimension greater than one.
(d) X = c0 (Γ, Y), whenever Γ is an index set with more than one point.
(e) X = C0 (L, Y), whenever L is a locally compact Hausdorff topological space
with at least two points.
They also found, however, spaces for which the bound of the modulus can be
improved. Actually, they proved the following interesting result.
Theorem 3.3 ([63, Theorems 5.8 and 5.9]) If X is an infinite-dimensional Banach
space or a real finite-dimensional Banach space and there exists some η ∈ (0, 1/2)
such that Φ_X (η) = √(2η), then X contains almost isometric copies of ℓ_∞^2 .
The converse of this result is not true in general (see [63, Section 6]). The
previous theorem implies in particular that not many 2-dimensional real Banach
spaces have the best possible BPB modulus. However, in [122], V. Kadets and M.
Soloviova wondered what would be the modulus if in the Bishop–Phelps–Bollobás
theorem the second functional was not asked to have norm 1 (the modulus happens
to be √η in this case), and they found out that, surprisingly, in this context many
2-dimensional real Banach spaces attain the maximum possible modulus (see [122,
Subsection 2]).
Further refinements and properties of the BPB moduli for functionals were
studied in [63] as well as in the more recent papers [61, 62] by Chica, Kadets,
Martín, Merí, and Soloviova. For instance, it is known that Φ^(S)_X (η) is continuous
with respect to η (see [63, Proposition 3.1]) and with respect to X when using the
Banach-Mazur distance (see [61, Section 3]).
In [123], Kadets and Soloviova studied the more general version of the modulus
for operators in the case where the range space has property β of Lindenstrauss
(note that in this case, (X , Y) is always granted to have the BPBp for operators,
see Theorem 2.5). If Y satisfies property β of Lindenstrauss for some ρ ≥ 0 (see
Definition 2.2), we will denote it by β(Y) ≤ ρ. The authors found the following
upper bound.
Theorem 3.4 ([123, Theorem 1]) If X , Y are Banach spaces and β(Y) ≤ ρ for
some ρ ≥ 0, then for every η ∈ (0, 1),

Φ^S (X , Y, η) ≤ Φ(X , Y, η) ≤ min{ √(2η) · √((1 + ρ)/(1 − ρ)), 2 }.
Note that they also found spaces for which that upper bound could be improved,
such as when X is uniformly non-square (see [123, Theorem 2]), or when X = ℓ_1^2
(see [123, Theorem 3]). Note that ℓ_1^2 attains the maximum possible modulus in
the case of functionals, but this is not the case for operators. On the other hand,
the authors do not provide any example of a pair of Banach spaces for which
the estimation from the previous theorem is sharp. They did, however, show the
existence of pairs of Banach spaces for which a lower bound of the modulus was
reasonably close to the upper bound.
Theorem 3.5 ([123, Theorem 4]) For every Banach space Y,

Φ^S (ℓ_1^2 , Y, η) ≥ min{√(2η), 1},

and the equality is attained if β(Y) = 0.

They also studied bounds for some range spaces Y depending on their associated
ρ in the case where ρ ∈ [1/2, 1) (see [123, Theorem 5]). An interesting remark is
that the modulus for operators is not continuous with respect to the range space (see
[123, Theorem 6]). Finally, the authors studied the growth of the modulus compared
to √(2η) (see [123, Section 3.4]) and the modulus of a version of the BPBp in which
the operator S is not asked to have norm 1 (see [123, Section 4]). They left an open
question that, as far as we know, remains open to this day (see Question 15).
For further research in this direction, we invite the reader to check [34, 59, 109,
110].

4 The Point and Operator Properties

In this section, we treat stronger properties than the BPBp, namely the Bishop–
Phelps–Bollobás operator property and the Bishop–Phelps–Bollobás point prop-
erty. Let us take a brief moment here to explain where the motivation to study such
properties comes from.
In [125], Sun Kwang Kim and Han Ju Lee proved the following characterization
for uniform convexity.
Theorem 4.1 ([125, Theorem 2.1]) A Banach space X is uniformly convex if and
only if for every ε > 0, there exists η(ε) > 0 such that for every x∗ ∈ SX∗ and x ∈ BX
such that

|x∗(x)| > 1 − η(ε),

there is x0 ∈ SX such that

|x∗(x0)| = 1 and ‖x − x0‖ < ε.

Let us notice that Theorem 4.1 is a Bollobás type theorem where the functional x ∗
does not change; in other words, the same functional that almost attains the norm at
a point, actually attains its norm at a nearby point. At first glance, the analogous
property for bounded linear operators looks really restrictive in the sense that not
many spaces would satisfy it. To make sure we are speaking the same language as
the reader, let us give a name to such a (possible) property.
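The statement of Theorem 4.1 can be explored numerically in the simplest setting. The following is only a sketch under stated assumptions: we work in the real Euclidean plane, where for a unit vector u and x in the unit ball with ⟨u, x⟩ > 1 − η one has ‖x − u‖² = ‖x‖² + 1 − 2⟨u, x⟩ < 2η, so that η(ε) = ε²/2 works. For contrast, the plane with the ℓ1-norm (which is not uniformly convex) is shown to fail the property for the functional (1, 1 − δ). All function names are hypothetical and chosen for illustration only.

```python
import math
import random


def nearest_attaining_point_l2(u, x):
    """For the functional <u, .> on the Euclidean plane (|u| = 1), the points of
    the unit sphere where it attains its norm in modulus are u and -u; return
    the one closer to x."""
    sign = 1.0 if u[0] * x[0] + u[1] * x[1] >= 0 else -1.0
    return (sign * u[0], sign * u[1])


def worst_distance_l2(eta, samples=20000, seed=0):
    """Largest observed |x - x0| over sampled x in the unit ball
    with |<u, x>| > 1 - eta (Euclidean plane, illustration only)."""
    rng = random.Random(seed)
    u = (1.0, 0.0)  # by rotation invariance one functional is enough
    worst = 0.0
    for _ in range(samples):
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        # keep only points of the unit ball where the functional is almost 1
        if math.hypot(*x) > 1.0 or abs(x[0]) <= 1.0 - eta:
            continue
        x0 = nearest_attaining_point_l2(u, x)
        worst = max(worst, math.hypot(x[0] - x0[0], x[1] - x0[1]))
    return worst


def l1_counterexample(delta):
    """In the plane with the l1-norm, f = (1, 1 - delta) has dual (max) norm 1
    and attains it only at +-(1, 0); yet f(x) = 1 - delta at x = (0, 1)."""
    f = (1.0, 1.0 - delta)
    x = (0.0, 1.0)
    value = f[0] * x[0] + f[1] * x[1]
    # l1-distance from x to the two norm-attaining points (1, 0) and (-1, 0)
    dist = min(abs(x[0] - s) + abs(x[1]) for s in (1.0, -1.0))
    return value, dist


eta = 0.02
print(worst_distance_l2(eta), math.sqrt(2 * eta))
print(l1_counterexample(0.001))
```

In the Euclidean case the observed distances stay below √(2η), while in the ℓ1 case the distance to the norm-attaining points stays equal to 2 no matter how small δ is, so no η(ε) can work uniformly over all functionals.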
Definition 4.2 ([84, Definition 2.8]) We say that the pair (X , Y) has the Bishop–
Phelps–Bollobás operator property (BPBop, for short) for operators if given ε > 0,
there is η(ε) > 0 such that whenever T ∈ B(X , Y) with ‖T‖ = 1 and x ∈ SX
satisfy

‖T (x)‖ > 1 − η(ε),

there exists x0 ∈ SX such that

‖T (x0)‖ = 1 and ‖x0 − x‖ < ε.

Notice now that Theorem 4.1 says simply that X is uniformly convex if and only if
(X , K) has the BPBop for linear functionals. It turns out that, for spaces X , Y with
dimension greater than or equal to 2, the BPBop for operators is never possible. Indeed,
after several negative results presented in [84], the first author together with Kadets,
Kim, Lee, and Martín proved the following result.
Theorem 4.3 ([94, Theorem 2.1]) Let X , Y be real Banach spaces of dimension
greater than or equal to 2. There exist (Tn)_{n=1}^∞ ⊆ NA(X , Y) ∩ SB(X ,Y ) and x0 ∈ SX
such that ‖Tn (x0)‖ −→ 1 as n → ∞ and

inf_{n∈N} dist (x0 , {x ∈ SX : ‖Tn (x)‖ = 1}) > 0.

In particular, (X , Y) fails the BPBop for operators.

The proof of Theorem 4.3 is quite involved and requires finding an operator
T defined on 2-dimensional spaces satisfying a very specific geometric condition
so that we can construct the sequence satisfying the desired conditions (see [94,
Propositions 2.4 and 2.5]).
Now that we have put away the operator property, we might think of a dual
property of it remembering the duality between uniformly convex and uniformly
smooth spaces. More specifically, one may consider the following property.
Definition 4.4 We say that (X , Y) has the Bishop–Phelps–Bollobás point property
(BPBpp, for short) for operators if given ε > 0, there exists η(ε) > 0 such that
whenever T ∈ B(X , Y) with ‖T‖ = 1 and x ∈ SX satisfy

‖T (x)‖ > 1 − η(ε),

there exists S ∈ B(X , Y) with ‖S‖ = 1 such that

‖S(x)‖ = 1 and ‖S − T‖ < ε.

Let us notice that the BPBpp implies immediately the BPBp. Moreover, as expected,
it turns out that, for linear functionals, the Banach space X is uniformly smooth if
and only if the pair (X , K) has the BPBpp for linear functionals (see [95, Theorem
2.1]). This implies (see [95, Proposition 2.3]) that whenever a pair (X , Y) satisfies
the BPBpp for operators, the domain space X must be uniformly smooth. For
that reason, we can see the great connection between the BPBpp and uniform
smoothness as well as why the pairs of the form (ℓ1 , Y) and (c0 , Y) always fail such
a property. In particular, the BPBp and the BPBpp are not equivalent properties.
Next we will give the first positive results about the point property for operators.
To start with, we present the following result, which says that Hilbert spaces are
universal domain spaces. This is a consequence of the fact that Hilbert spaces have
transitive norms, that is, if x, y ∈ SH satisfy ‖x − y‖ < ε, then there exists a linear
isometry R : H −→ H such that R(x) = y and ‖R − IdH‖ < ε, where IdH is the
identity operator on H.
Theorem 4.5 ([95, Theorem 2.5]) Let H be a Hilbert space. The pair (H, Y) has
the BPBpp for operators for every Banach space Y.
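The transitivity argument behind Theorem 4.5 can be made explicit. The following is a sketch for a real Hilbert space only, and it is not claimed to be the proof given in [95]; it also assumes the known fact that pairs (X , Y) with uniformly convex domain X have the BPBp.

```latex
% Sketch (real Hilbert space H; x, y \in S_H with y \neq \pm x).
Let $\theta \in (0,\pi)$ be the angle between $x$ and $y$, so that
$\langle x, y\rangle = \cos\theta$ and
\[
  \|x - y\|^2 = 2 - 2\cos\theta = 4\sin^2(\theta/2).
\]
Let $R$ be the rotation by $\theta$ in the plane $E = \operatorname{span}\{x, y\}$
with $R(x) = y$, extended by the identity on $E^{\perp}$. In an orthonormal basis of $E$,
\[
  R|_E - \mathrm{Id}_E =
  \begin{pmatrix} \cos\theta - 1 & -\sin\theta \\ \sin\theta & \cos\theta - 1 \end{pmatrix},
\]
whose columns are orthogonal and have the common length
$\sqrt{(\cos\theta - 1)^2 + \sin^2\theta} = 2\sin(\theta/2)$. Hence
\[
  \|R - \mathrm{Id}_H\| = 2\sin(\theta/2) = \|x - y\|.
\]

% From the BPBp to the BPBpp.
Given $T$ with $\|T\| = 1$ and $x \in S_H$ with $\|T(x)\| > 1 - \eta(\varepsilon)$,
the BPBp for $(H, Y)$ provides $S_0$ with $\|S_0\| = 1$ and $x_0 \in S_H$ such that
$\|S_0(x_0)\| = 1$, $\|x - x_0\| < \varepsilon$ and $\|S_0 - T\| < \varepsilon$.
Taking a surjective isometry $R$ with $R(x) = x_0$ and $\|R - \mathrm{Id}_H\| < \varepsilon$
and setting $S := S_0 \circ R$, we get $\|S\| = 1$, $\|S(x)\| = \|S_0(x_0)\| = 1$ and
\[
  \|S - T\| \le \|S_0\|\,\|R - \mathrm{Id}_H\| + \|S_0 - T\| < 2\varepsilon .
\]
```

The point x is kept fixed and only the operator changes, which is exactly what Definition 4.4 asks for (with 2ε in place of ε).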
It is worth mentioning that there exists a more general result than Theorem 4.5
due to Cabello-Sánchez et al. [45]. They use the BPBpp as a tool to deal with Banach
spaces X whose group of isometries acts micro-transitively on SX . In fact, they
introduce a weakening of micro-transitivity, namely uniformly micro-semitransitive
norms (see [45, Definition 2.2]). As a consequence of some results related to the
BPBpp, they were able to prove that every Banach space whose norm is uniformly
micro-semitransitive (in particular, if it is micro-transitive) is both uniformly convex
and uniformly smooth (see [45, Corollary 2.13]). In fact, the only spaces known
so far to have micro-transitive norms are Hilbert spaces. It is also worth mentioning that
we do not know if micro-transitivity is a different property than uniform micro-
semitransitivity.
Coming back to the point property and bearing in mind Theorem 4.5, it is natural
to wonder whether the analogous result holds for Lp (μ)-spaces. In other words, is
it true that the pair (Lp (μ), Y) satisfies the BPBpp for every Banach space Y? This
question was addressed in [93] (where the authors called the BPBpp the pointwise
BPB property). See also [95, Remark 2.6]. The answer to this question is not positive
in general, as the following theorem shows.
Theorem 4.6 ([93, Corollary 3.6]) Let 2 < p < ∞ and let μ be a positive
measure. If dim(Lp (μ)) ≥ 2, then there exists a Banach space Y such that
(Lp (μ), Y) fails to have the BPBpp for operators.
Concerning Theorem 4.6, as far as we know, it is still open what happens when
1 < p < 2 (see Question 10). Nevertheless, we have the following list of pairs
satisfying the BPBpp for operators.
Theorem 4.7 ([94, Proposition 4.2 and Corollary 4.3]) Let X be an arbitrary
uniformly smooth Banach space. The pair (X , Y) has the BPBpp when
(a) Y is a uniform algebra (in particular, when Y = C(K) and Y = C0 (L)).
(b) Y has property β of Lindenstrauss.
Now another natural question arises. Do the pairs of the form (X , Lp (μ)) or
(X , ℓp ) for 1 < p < ∞ satisfy the BPBpp for operators for every uniformly smooth
Banach space X ? The answer is again negative and can be checked in the following
result. Let us put some emphasis on item (b) in Theorem 4.8 below: the pair (Xp , ℓ_p^2 )
must have the BPBp for operators by Theorem 2.6 although this is no longer the case
for the BPBpp.
Theorem 4.8 ([94, Theorem 4.4, Corollary 4.8, and Theorem 4.9])
(a) Let 1 < p < ∞. Then, there exists a uniformly smooth Banach space X such
that (X , Lp (μ)) fails to have the BPBpp for operators. In particular, there are
uniformly smooth Banach spaces X , Z such that (X , ℓp ) and (Z, ℓ_p^n ) for n ≥ 2
fail the BPBpp.
(b) For each 2 ≤ p < ∞, there is a uniformly convex and uniformly smooth Banach
space Xp such that (Xp , ℓ_p^2 ) fails the BPBpp for operators.
It is worth mentioning that, similarly to how the BPBp has been studied for
compact operators, the authors of [94] considered the BPBpp for compact operators.
We send the reader to Section 5 of that paper. At this point, however, it is worth
noting that it seems to be unknown whether the BPBpp and the BPBpp for compact
operators are equivalent properties (see Question 11).

5 The Local Properties

Up to this point, the properties considered were uniform in nature. In this section
we are going to tackle properties in which this uniform character is somehow
lost. We will treat weakenings of the Bishop–Phelps–Bollobás (point and operator)
properties. Besides their own interest, these properties were recently used successfully
as a tool on two different (in principle not connected) occasions. Indeed,
on the one hand, they were used to determine exactly when the projective norm on
X ⊗̂π Y, the symmetric projective norm on ⊗̂π,s,N X , and the supremum norms
on P(^N X ; Y) and B(X1 , . . . , XN ; Y) are (uniformly) strongly subdifferentiable
(see [90, 96–99]). On the other hand, one of these properties was used as a
tool to study norm attainment on X ⊗̂π Y (which is naturally connected to an
important problem in norm-attaining theory (see Question 2)) and ⊗̂π,s,N X (see
[89, 92]).

5.1 Local Properties for Operators

As we have seen in Theorem 4.3, there is no version of the Bishop–Phelps–Bollobás
operator property when one considers spaces with dimension greater than or equal to 2.
Therefore, the only hope to get positive results in this direction is considering a
weakening of the mentioned property in the sense that, instead of requiring that the
η from Definition 4.2 depends just on a positive real number ε > 0, it also depends
on a previously fixed norm-one operator T . Indeed, this was done in [84] followed
by [58, 143, 147] and we highlight as follows this new property for operators.
Definition 5.1 ([84, Definition 2.2])² We say that (X , Y) has the Lo,o for operators
if given ε > 0 and T ∈ B(X , Y) with ‖T‖ = 1, there exists η(ε, T ) > 0 such
that whenever x ∈ SX satisfies

‖T (x)‖ > 1 − η(ε, T ),

there is x0 ∈ SX such that

‖T (x0)‖ = 1 and ‖x0 − x‖ < ε.

Surprisingly, as the reader will see in a moment, there are several positive results
on property Lo,o . Beforehand, let us justify why the study of property Lo,o is not
merely an attempt of forcing a property without much sense.
As we have mentioned already, Miguel Martín [139] proved that there are
compact operators which cannot be approximated by norm-attaining operators.
Perhaps the most important question at this very moment in norm-attaining theory is
to know whether every finite-rank operator can be approximated by norm-attaining
ones (see Question 2). Since the space of nuclear operators N(X , Y) from X into Y
satisfies that F(X , Y) ⊆ N(X , Y) ⊆ K(X , Y), it seems to be completely reasonable
to consider the class N(X , Y) and address the analogous problem in this setting.
This was done by the first and last author of this manuscript together with Luis
Carlos García Lirola, Mingu Jung, and Abraham Rueda Zoca in the recent papers
[89, 92]. Indeed, the authors used the Lo,o as a tool to provide positive results on
the denseness of norm-attaining nuclear operators as well as norm-attaining tensors
in the (symmetric) projective tensor product between Banach spaces.
Let us recall that, in the scalar-valued case, the BPBop and the BPBpp are dual
properties in the sense that (X , K) has the BPBpp if and only if (X∗ , K) has the
BPBop (see [96, Proposition 2.2]). It seems to be natural also to consider the “local”
version of the point property and this was done for the first time by the first author
in a joint work with Sun Kwang Kim, Han Ju Lee, and Martin Mazzitelli.

² The symbol Lo,o comes from the fact that it is a local property and the double “o” means the
operator property together with the requirement that η depends on an operator. In the paper [84],
property Lo,o was called property 1 and in [143] strong BPB or sBPBp.

Definition 5.2 ([96, Definition 2.1])³ We say that (X , Y) has the Lp,p if given
ε > 0 and x ∈ SX , there exists η(ε, x) > 0 such that whenever T ∈ B(X , Y) with
‖T‖ = 1 satisfies

‖T (x)‖ > 1 − η(ε, x),

there exists S ∈ B(X , Y) with ‖S‖ = 1 such that

‖S(x)‖ = 1 and ‖S − T‖ < ε.

Therefore, properties Lo,o and Lp,p must walk parallel with each other. It turns
out that they are closely related to the strong subdifferentiability (usually denoted by
SSD) of the norm of the Banach space as was observed by Gilles Godefroy, Vicente
Montesinos, and Václav Zizler (see [113]).
Theorem 5.3 ([96, Theorem 2.3]) Let X be a Banach space.
(a) (X , K) has the Lp,p if and only if X is SSD.
(b) (X , K) has the Lo,o if and only if X is reflexive and X∗ is SSD.
Theorem 5.3 (together with [96, Proposition 2.6] (see also [143])) yields big
differences between properties Lp,p and the BPBpp as well as property Lo,o
and the BPBop (recall that the BPBpp and the BPBop for linear functionals
give characterizations for uniformly smooth and uniformly convex Banach spaces,
respectively). We refer the interested reader to [97, page 47] or to the
discussion in [97, pages 305 and 306] to find all necessary results and references
about SSD.
Example The pair (X , K) has the
(a) Lp,p for linear functionals (but not the BPBpp) when
(a1) X = c0 .
(a2) X is the predual of the Lorentz space d∗ (w, 1).
(a3) X is the space of functions of vanishing mean oscillation (VMO), the
predual of the Hardy space H 1 .
(a4) X is ℓ_1^n or ℓ_∞^n when n ≥ 2.
(b) Lo,o for linear functionals (but not the BPBop) when
(b1) X is ℓ_1^n or ℓ_∞^n .
(b2) X is the space (⊕_{k=1}^∞ ℓ_∞^k)_{ℓ2} .
³ Analogously to the Lo,o , we can justify the notation for property Lp,p .

Before moving forward to the results about operators, let us highlight one result
on the Lo,o for linear functionals. The first author of this survey together with
Abraham Rueda Zoca showed that there is a strong connection between this property
and the compact operators (see also [143, Theorem 2.12] and [147, Theorem 2.3]).
Theorem 5.4 ([99, Theorem A]) Let X be a reflexive space. The following are
equivalent.
(a) (X , K) has the Lo,o for linear functionals.
(b) (X , Y) has the Lo,o for compact operators for every Y.
(c) X∗ is SSD.
Contrary to the BPBp (where one has to assume finite-dimensionality on both
domain and range spaces, see Theorem 2.1), property Lo,o does not require such
strong assumptions as we can see in the next result.
Theorem 5.5 ([84, Theorem 2.4]) Let X be finite-dimensional. Then, the pair
(X , Y) has the Lo,o for every Banach space Y.
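The role of finite-dimensionality in Theorem 5.5 can be seen through a standard compactness argument. The sketch below is offered only as an illustration and is not claimed to be the proof given in [84]; note that it uses the fact that T attains its norm, which is automatic when S_X is compact.

```latex
% Sketch: X finite-dimensional, T \in B(X, Y) with \|T\| = 1, \varepsilon > 0 fixed.
Let $A_T = \{ z \in S_X : \|T(z)\| = 1 \}$, which is non-empty because $S_X$ is compact.
Suppose, for contradiction, that no $\eta(\varepsilon, T) > 0$ works. Then for every
$n \in \mathbb{N}$ there is $x_n \in S_X$ with
\[
  \|T(x_n)\| > 1 - \tfrac{1}{n}
  \qquad\text{and}\qquad
  \operatorname{dist}(x_n, A_T) \ge \varepsilon .
\]
By compactness of $S_X$, some subsequence $(x_{n_k})$ converges to a point $x \in S_X$,
and by continuity of $T$ and of the norm,
\[
  \|T(x)\| = \lim_{k} \|T(x_{n_k})\| = 1 ,
\]
so $x \in A_T$. But then $\operatorname{dist}(x_{n_k}, A_T) \le \|x_{n_k} - x\| \to 0$,
which contradicts $\operatorname{dist}(x_{n_k}, A_T) \ge \varepsilon$.
```

The resulting η depends on T through the set A_T, which is consistent with the local nature of property Lo,o and with the failure of the uniform version stated in Theorem 4.3.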
We will come back to the Lo,o in a moment. Now, we sum up all the known
results for bounded linear operators when it comes to the Lp,p . We refer the
interested reader to [96, Propositions 2.8, 2.9, 2.10, Theorem 2.12] and [97,
Theorem 3.6].
Theorem 5.6 ([96, Section 2]) The pair (X , Y) has the Lp,p when
(a) X and Y are finite-dimensional.
(b) (X , K) satisfies property Lp,p and Y has property β of Lindenstrauss.
(c) X = ℓ_1^n and
(c1) Y is finite-dimensional.
(c2) Y is uniformly convex.
(c3) Y = C0 (L).
(c4) Y = L1 (μ), where μ is a positive measure.
(c5) Y = A(D).
(c6) Y = H ∞ (D).
(c7) Y has the property β of Lindenstrauss.
(c8) Y = L1 (μ, X ), where μ is a σ -finite measure and X is as (c1)-(c7).
(d) X = c0 and Y = Lp (μ) whenever μ is a positive measure and 1 ≤ p < ∞.
In the same direction as properties Lo,o and Lp,p , the authors in [96] considered
also a local Bishop–Phelps–Bollobás property, where the η that appears in Defini-
tion 1.3 depends on a norm-one point or operator. Following the same notation, we
set Lo to mean that we have a local BPBp when η depends on an operator and we
set Lp when we have a local BPBp when η depends on a point. In that paper, the
authors were interested in differentiating all of these properties and, as far as we
know, there is not much done in the direction of properties Lo and Lp (see [96,
Proposition 3.4 and Proposition 4.5]).
In Fig. 1, we sum up all the implications that hold and next justify why all
these properties are different from each other. We do not know whether property
Lp implies the denseness of the norm-attaining operators (see Question 13).

Fig. 1 Relations between the uniform and local BPBp

(1) BPBp ⇏ BPBop: this follows immediately from Theorem 4.3 and any positive
result for operators from Sect. 2.1. In the functional case the same happens:
recall that the BPBop for linear functionals characterizes the uniformly convex
Banach spaces and, on the other hand, the Bishop–Phelps–Bollobás theorem
holds for every Banach space.
(2) BPBp ⇏ BPBpp: recall that the BPBpp for functionals characterizes uniformly
smooth Banach spaces (see [95, Theorem 2.1]) and, in the operator case, if
(X , Y) has the BPBpp, then X must be uniformly smooth (see [95, Proposition
2.3]). Therefore, we can take any positive result on the BPBp from Sect. 2.1
where the domain is not uniformly smooth and we are done.
(3) Lp,p ⇏ BPBpp: in the functional case, the BPBpp characterizes uniformly
smooth Banach spaces (see [95, Theorem 2.1]) and the Lp,p characterizes the
strong subdifferentiability of the norm (see [96, Theorem 2.3]). Therefore, the
pair (c0 , K) has the Lp,p but it cannot satisfy the BPBpp.
(4) Lo,o ⇏ BPBop: By Theorem 5.5, the pair (X , Y) always satisfies property
Lo,o whenever X is finite-dimensional and Y is arbitrary. On the other hand, in
the functional case, the BPBop characterizes uniformly convex spaces ([125,
Theorem 2.1]) and in the operator case it does not make any sense (Theorem 4.3).
(5) Lp ⇏ Lp,p : Since the BPBp implies property Lp , the pair (ℓ1 , Y) satisfies
Lp in many cases (see, for instance, Theorem 2.8). Nevertheless, these pairs
cannot have Lp,p since if (X , Y) has Lp,p , then X must be SSD (see [97,
Corollary 2.4]).
(6) Lp ⇏ BPBp: It is known that the set NA((⊕_{k=2}^∞ ℓ_k^2)_{ℓ2} , Y) is dense in
B((⊕_{k=2}^∞ ℓ_k^2)_{ℓ2} , Y) [137] for every Banach space Y. By [96, Proposition 3.4],
the pair ((⊕_{k=2}^∞ ℓ_k^2)_{ℓ2} , Y) satisfies property Lp for every Banach space Y.
On the other hand, if ((⊕_{k=2}^∞ ℓ_k^2)_{ℓ2} , Y) had the BPBp for every Banach space
Y, then for every ε ∈ (0, 1), there would exist an ε-dense uniformly strongly
exposed family of the unit sphere of the space (⊕_{k=2}^∞ ℓ_k^2)_{ℓ2} (see [34, Corollary
3.6]) by using the fact that (⊕_{k=2}^∞ ℓ_k^2)_{ℓ2} is superreflexive. Nevertheless, by
[96, Lemma 5.1 and Proposition 5.2], there exists ε0 ∈ (0, 1) such that
there is no ε0-dense uniformly strongly exposed family on the unit sphere
of (⊕_{k=2}^∞ ℓ_k^2)_{ℓ2} . Therefore, there exists a Banach space Y0 such that the pair
((⊕_{k=2}^∞ ℓ_k^2)_{ℓ2} , Y0 ) fails the BPBp.
(7) Lo ⇏ BPBp: Theorem 5.5 says that (X, Y) satisfies property Lo,o whenever
X is finite-dimensional and Y is arbitrary. On the other hand, there exists a
Banach space Z such that $(\ell_1^2, Z)$ fails the BPBp (see [34, Example 4.1]).
(8) Lo ⇏ Lo,o: The pairs $(\ell_1, Y)$ have the BPBp (and therefore property Lo)
on many occasions (see, for instance, Theorem 2.8). Nevertheless, such pairs
cannot satisfy Lo,o since $\ell_1$ is not reflexive.
(9) $\overline{\mathrm{NA}} = B$ ⇏ Lp: Let Y be a strictly convex but not uniformly convex Banach
space. Then, $(\ell_1^2, Y)$ fails to have Lp by [96, Proposition 3.2.(a)]. Nevertheless,
$\mathrm{NA}(\ell_1^2, Y) = B(\ell_1^2, Y)$ for every Banach space Y by compactness.
(10) $\overline{\mathrm{NA}} = B$ ⇏ Lo: Let Y be a strictly convex but not uniformly convex
Banach space. Then, $(\ell_1, Y)$ fails property Lo by [96, Proposition 3.2.(b)].
Nevertheless, $\mathrm{NA}(\ell_1, Y)$ is dense in $B(\ell_1, Y)$ for every Banach space Y.

5.2 Local Properties for Bilinear Mappings

We can adapt Definitions 5.1 and 5.2 in a natural way to the context of bilinear
mappings. This was done for the first time in [96] (and more recently in [99]) with
the aim of classifying when the projective norm in the projective tensor product
between Banach spaces is strongly subdifferentiable. We invite the reader to go to
[96, Definition 2.1] for the formal definitions.
Concerning the Lo,o for bilinear mappings, we have the following general
characterization which yields several particular interesting cases. It deals with the
reflexivity of the projective tensor product $X \hat{\otimes}_\pi Y$ and allows us to relate property
Lo,o in different classes of functions (for linear functionals, operators, and bilinear
forms).
Theorem 5.7 ([99, Theorem B]) Let X be a strictly convex Banach space or a
Banach space satisfying the Kadec-Klee property. Let Y be an arbitrary Banach
space and assume that either X or Y satisfies the approximation property. The
following statements are equivalent.
(a) $(X \hat{\otimes}_\pi Y, \mathbb{K})$ has the Lo,o for linear functionals.
(b) $X \hat{\otimes}_\pi Y$ is reflexive and both $(X, \mathbb{K})$ and $(Y, \mathbb{K})$ have the Lo,o for linear
functionals.
(c) $(X, Y; \mathbb{K})$ has the Lo,o for bilinear forms.
As a consequence, we have the following list of examples. The symbol $q'$ stands
for the conjugate index of q.
Example ([97, Proposition 2.2.(a), Lemma 2.6, and Theorem 2.7.(b)]) The triple
$(X, Y; \mathbb{K})$ has the Lo,o for bilinear mappings whenever
(a) X, Y are finite-dimensional.
(b) X is finite-dimensional and Y is uniformly convex.
(c) $X = \ell_p$ and $Y = \ell_q$ if and only if $p > q'$.
Another consequence of Theorem 5.7 is the following. We can classify exactly
when the projective tensor norm on $\ell_p \hat{\otimes}_\pi \ell_q$ is strongly subdifferentiable. As far as
we know, such a classification was done for the first time in [97]. Let us notice that
the positive result obtained in Corollary 5.8 below will not happen so often since if
the projective tensor product $X \hat{\otimes}_\pi Y$ is SSD, then it is an Asplund space (see, for
instance, [113, Theorem 2]). The following extends [97, Corollary 2.8] and it is a
combination of [100, Exercise 16.5], [142, Corollary 4.24], and Theorem 5.7 above.
Corollary 5.8 The norm of $\ell_p \hat{\otimes}_\pi \ell_q$ is SSD if and only if $p^{-1} + q^{-1} < 1$.
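The condition in Corollary 5.8 can be matched against item (c) of the Example above. The following short check is only a sketch under the assumption that 1 < p, q < ∞, writing $q' = q/(q-1)$ for the conjugate index of q, so that $1/q + 1/q' = 1$:

```latex
\[
\frac{1}{p} + \frac{1}{q} < 1
\iff \frac{1}{p} < 1 - \frac{1}{q} = \frac{1}{q'}
\iff p > q'.
\]
```

For instance, p = q = 3 gives $q' = 3/2 < 3$, so both conditions hold, whereas p = q = 2 gives $q' = 2$ and both conditions fail.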
We conclude this section by inviting the reader to check Section 7, which
contains a brief discussion of the possibility of using the point property for N-
homogeneous polynomials as a tool to get results on the strong subdifferentiability
of the norm of the symmetric projective tensor product $\hat{\otimes}_{\pi,s,N} X$.

6 Open Questions

In this section, we provide open questions on the different topics that we have treated
in this survey.
1. Bounded linear operators
It is known that when both X , Y are finite-dimensional Banach spaces, the pair
(X , Y) satisfies the BPBp for operators (see Theorem 2.1) and the proof of such
a result is done by contradiction using the compactness of the unit balls of both
spaces. The following is due to Richard Aron.
Question 1 (Richard Aron) Provide a direct (constructive) proof of the fact
that (X, Y) has the BPBp for operators whenever X and Y are finite-dimensional
spaces.
Still in the finite-dimensional vein, we have the following question. It is worth
noting that its answer is not known even when the range space is $\mathbb{R}^2$
endowed with the Euclidean norm.
Question 2 Is it true that all finite-rank operators can be approximated by norm-
attaining ones?
A characterization is known for the Banach spaces Y such that the pairs of
the form $(\ell_1, Y)$ satisfy the Bishop–Phelps–Bollobás property (see Theorem 2.8)
through the AHSP. In the same line, we have the following question.
Question 3 ([79, page 240]) Characterize the compact Hausdorff topological
spaces S such that the pair (X, C(S)) satisfies the BPBp for operators for every
Banach space X.
Still with X = C(K)-spaces, it is not known if the Bishop–Phelps–Bollobás
theorem holds on these spaces in the complex case.
Question 4 ([11, page 326]) Let K, S be compact Hausdorff topological spaces. Is
it true that the pair (C(K), C(S)) satisfies the BPBp for operators in the complex
case? This question is unknown even for just the Bishop–Phelps property.
Surprisingly, the techniques used on C(K)-spaces do not seem to work for the
following pair, and the following appears to be an open question.
Question 5 ([5, page 318]) Is it true that the pair $(c_0, \ell_1)$ has the BPBp for
operators in the real case?
In Sect. 2.1, we commented that, as far as we know, it is not known
when (X, Y) has the BPBp assuming that X has the Radon-Nikodým property,
nor what happens in the particular case of James' space J. This question was
proposed by Abraham Rueda Zoca.
Question 6 (Abraham Rueda Zoca) Assume that X has the Radon-Nikodým
property. For which Banach spaces Y is it true that (X, Y) has the BPBp? Consider,
for instance, the case in which X = J is James' space.
In Sect. 2.2, we have shown the results on the BPBp for compact operators. Most
of the results known for the BPBp also work for the BPBp for compact operators.
The following is still unknown.
Question 7 ([88, page 57]) Is it true that the BPBp implies the BPBp for compact
operators?
2. Multilinear mappings
Symmetric multilinear mappings seem to be (almost) always a headache when it
comes to the BPBp. In view of Theorem 2.39, the following is a natural question.
Question 8 ([12, 4.6.(2)]) Let X be a uniformly convex Banach space. Is it true
that (X , . . . , X ; K) has the BPBp for symmetric N-linear forms?
The question has a positive answer for symmetric bilinear forms on Hilbert
spaces ([104, Theorems 3.2 and 3.4]). It is worth mentioning that Hilbert spaces do
not satisfy the Bishop–Phelps–Bollobás point property for symmetric multilinear
mappings [90]. We also send the reader to the very interesting paper [49], where the
authors characterize the sets of vectors (x1 , . . . , xN ) in H × . . . × H such that there
exists an N-linear symmetric form attaining its norm at (x1 , . . . , xN ).
3. Numerical Radius
Thinking about Theorem 2.85 for C(K)-spaces, we do not know the following.
Question 9 Let K be an arbitrary compact Hausdorff topological space. Is it true
that C(K) has the BPBp-nu?
4. The Bishop–Phelps–Bollobás point property
By Theorem 4.6, we know that if dim(Lp (μ)) ≥ 2, where 2 < p < ∞ and μ is a
positive measure, there exists a Banach space Y such that the pair (Lp (μ), Y) fails
the Bishop–Phelps–Bollobás point property. Since L1 and L∞ are not uniformly
smooth spaces, it does not make any sense to talk about the point property for these
spaces. Nevertheless, we do not know what happens with Lp -spaces for 1 < p < 2.
Question 10 ([93, Problem 6.3]) Is it true that the pair (Lp (μ), Y) has the BPBpp
for every Banach Y whenever 1 < p < 2?
In the same line of Sect. 2.2, we do not know the relation between the BPBpp
and the BPBpp for compact operators.
Question 11 ([93, Problem 6.5]) Is it true that the BPBpp for operators and the
BPBpp for compact operators are equivalent properties?
It is known that, for 2 < p, q < ∞, the triple $(\ell_p, \ell_q; \mathbb{K})$ has the Lp,p for bilinear
forms (see [97, Theorem 2.7.(a)]). Nevertheless, we do not know what happens for
the stronger BPBpp for bilinear forms.
Question 12 ([97, Remark 2.9.(a)]) Is it true that the triple $(\ell_p, \ell_q; \mathbb{K})$ has the
BPBpp for bilinear forms when 1 < p, q < 2 or when 2 < p, q < ∞?
5. Local properties
In view of Fig. 1, the following implication seems to be open.
Question 13 ([96, page 322]) Is it true that if the pair (X , Y) has property Lp , then
the set NA(X , Y) is dense in B(X , Y)?
None of the current techniques seem to work when one tries to tackle the
following question.
Question 14 ([96, page 322]) Is it true that the pair $(\ell_p^2, \ell_q)$ has property Lp,p
whenever 1 < p, q < ∞?
6. Modulus
In [63] it was shown that for all Banach spaces X, it holds that $\Phi^S_X(\eta) \le \sqrt{2\eta}$.
In [123], a BPB modulus for operators was considered for a pair of Banach spaces
(X, Y) which somehow generalizes the previously defined modulus for functionals.
However, in the real case, $\Phi^S(X, \mathbb{R}, \eta)$ and $\Phi^S_X(\eta)$ have a slight difference in
the definition concerning norm-attainment: in the first one, it is asked to satisfy
$|y^*(y)| = 1$, while in the second one it is asked to satisfy $y^*(y) = 1$. The following
natural question follows and was left open in [123]. The authors are thankful to
Vladimir Kadets for reminding us about it.
Question 15 ([123, Problem 1]) Is it true that $\Phi^S(X, \mathbb{R}, \eta) \le \min\{\sqrt{2\eta}, 1\}$ for all
real Banach spaces X?

7 Further Research and New (or Recent) Possible Lines

In this section, we present some further research that has been done in the past
few years. Moreover, we present possible new lines that the interested reader could
follow. We divide the present section into subsections depending on the specific
direction that we are considering.
Stability Results There are plenty of results on stability of the Bishop–Phelps–
Bollobás property when it comes to (absolute) direct sums. This provides more
examples of pairs of Banach spaces satisfying the BPBp (for operators, in particular)
as well as counterexamples. It is worth mentioning that stability results appear quite
commonly when one starts working on the BPBp; for this, we suggest, for instance,
references [27, 69, 72, 131] and also [34, Section 2].
The Bollobás Theorem for Operators on $X \hat{\otimes}_\pi Y$ After the papers [89] and [92],
it seems natural to ask when a Bollobás theorem for operators defined on tensor
products holds. In particular, the question of when it is possible to get the BPBp for
pairs of the form $(X \hat{\otimes}_\pi Y, Z \hat{\otimes}_\pi W)$ seems to have its own interest.
The BPBp-nu for Multilinear Mappings As far as we know, Theorem 2.91,
which says that L1 (μ) fails the Bishop–Phelps–Bollobás property for numerical
radius for multilinear mappings, is the only result in this line. Perhaps, more research
in this direction would provide interesting results. Other classes of mappings for
which denseness of numerical radius attaining mappings have been studied include
N-homogeneous polynomials (see for instance [16, 73]) and holomorphic mappings
(see for instance [136, 150]), but as far as we know, no BPBp-nu property has been
studied for those classes of mappings.
Minimum Norm-Attaining Operators A relatively new line of research studies
the operators that attain their minimum norms. For T ∈ B(X, Y), its minimum
norm is defined as the number $m(T) := \inf_{x \in S_X} \|T(x)\|$. Bollobás type theorems in
this line could have their own relevance. We refer the reader to [39, 53,
57, 58, 135].
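As a finite-dimensional illustration of the definition of m(T), the following sketch (not taken from the cited references) treats the case X = Y = R^2 with the Euclidean norm, where m(T) coincides with the smallest singular value of the matrix of T. The function names `minimum_norm_2x2` and `sampled_minimum` are hypothetical helpers introduced only for this example.

```python
import math


def minimum_norm_2x2(a, b, c, d):
    """m(T) = inf{ ||T x||_2 : ||x||_2 = 1 } for T = [[a, b], [c, d]].

    For the Euclidean norm this is the smallest singular value of T,
    computed here from the eigenvalues of T^t T in closed form.
    """
    frob_sq = a * a + b * b + c * c + d * d   # trace of T^t T
    det = a * d - b * c                       # det(T); det(T^t T) = det^2
    disc = max(frob_sq * frob_sq - 4.0 * det * det, 0.0)
    smallest_eig = max((frob_sq - math.sqrt(disc)) / 2.0, 0.0)
    return math.sqrt(smallest_eig)


def sampled_minimum(a, b, c, d, samples=10000):
    """Approximate m(T) from above by sampling points of the unit circle."""
    best = float("inf")
    for k in range(samples):
        t = 2.0 * math.pi * k / samples
        x, y = math.cos(t), math.sin(t)
        best = min(best, math.hypot(a * x + b * y, c * x + d * y))
    return best
```

In finite dimensions the infimum is always attained, by compactness of the unit sphere, so the question of attainment becomes interesting only for infinite-dimensional X.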
Group Invariant Version of Bollobás Theorem Two very recent papers (see
[85, 102]) consider versions of the Bishop–Phelps theorem for group invariant
functionals and operators. In [102], Javier Falcó proved that a Bollobás theorem
in this context does not hold in general (see [102, Example, page 1611]) but he
also proved a possible extension for it (see [102, Theorem 5]). These properties for
operators have their own interest.
The Set of Operators that Satisfy a BPB Theorem Very recently, the first and
fourth authors of this survey, in a joint work with Mingu Jung, studied Bollobás
type theorems from a different perspective (see [91]): instead of looking for spaces
satisfying the Bollobás theorem, they studied the set of operators for which some
Bollobás type theorem is valid ([91, Definition 1.1]). As far as we know, there is
no further research in this direction.
Strong Subdifferentiability of $\hat{\otimes}_{\pi,s,N} X$ and $\mathcal{P}(^N X)$ In an upcoming paper,
the first author, together with Mingu Jung, Martin Mazzitelli, and Jorge Tomás
Rodríguez, studies the strong subdifferentiability of the symmetric projective tensor
product [90]. To do so, they study the Bishop–Phelps–Bollobás point property for
N-homogeneous polynomials as a tool to determine when the symmetric projective
tensor product $\hat{\otimes}_{\pi,s,N} X$ and $\mathcal{P}(^N X)$ have strongly subdifferentiable norms.
Related Local Properties The reader can easily notice that one can consider
the BPBop when η depends on a norm-one point x as well as the BPBpp when
η depends on a norm-one operator T (see Definitions 5.1 and 5.2). This yields
properties Lo,p and Lp,o , respectively. There are not many results in this line and we
invite the reader to check the recent paper [98], where the main aim of the authors
is to distinguish all of them from each other. On the other hand, not much is known
about the differences between the BPBp and its local versions, Lp and Lo (see [96,
Section 3]).

8 Tables for Classical Banach Spaces: A Summary

The following tables gather and summarize known results about the Bishop–Phelps–
Bollobás property for pairs of classical Banach spaces. In the pdf version of this
document, each cell is hyperlinked to the corresponding result within this survey.
The first column will represent domain spaces and the first row will represent range
spaces.
Remark 8.1 As an important note, the table will be mostly focused on positive
results, that is, when a pair of spaces has the BPBp. Negative results will be
reserved exclusively for universal domains and ranges, and may not be exhaustive.
For instance, in any pair where NA(X , Y) is not dense in B(X , Y), the BPBp
automatically fails, but even in pairs where every operator attains its norm, we can
find counterexamples (see Example after Theorem 2.1 for a 2-dimensional space
failing to be a universal BPB domain). Actually, many classical Banach spaces fail
to be universal BPB domains, such as $c_0$, $\ell_1$, $\ell_\infty$, $L_1(\mu)$, C(K) or any $\ell_1^n$ or $\ell_\infty^n$ with
n > 1 (except for maybe particular cases), and it is known also that $\ell_1$, $\ell_p$, $L_1(\mu)$,
$L_p(\mu)$, and C(K) spaces are not universal BPB range spaces in general (except for
some particular cases). We encourage the interested reader to check the nice paper
[34], where an exposition of this problem is shown, and see also the excellent survey
[4] about norm-attaining operators.

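Table 1 Classical Banach spaces with the BPBp (I)

BPBp ∀Y $\mathbb{K}$ $\ell_1^m$ $\ell_q^m$ $\ell_\infty^m$ F.D. $c_0$ $\ell_1$ $\ell_q$ $\ell_\infty$
∀X ✗ ✗ ✗
$\mathbb{K}$
$\ell_1^2$ ✗
$\ell_1^n$ ✗
$\ell_p^n$
$\ell_\infty^n$ ✗
F.D. ✗
$c_0$ ✗ C
$\ell_1$ ✗
$\ell_p$
$\ell_\infty$ ✗ C
$L_1(\mu)$ ✗ σ
$L_p(\mu)$
$L_\infty(\mu)$ ✗ R|C C R|C
$C(K_1)$ ✗ R|C C R|C
$C_0(L_1)$ ✗ C C C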

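Table 2 Classical Banach spaces with the BPBp (II)

BPBp $L_1(\nu)$ $L_q(\nu)$ $L_\infty(\nu)$ $C(K_2)$ $C_0(L_2)$
∀X ✗ ✗ ✗ ✗
$\mathbb{K}$
$\ell_1^2$
$\ell_1^n$
$\ell_p^n$
$\ell_\infty^n$ C | R, 2+
F.D.
$c_0$ C
$\ell_1$
$\ell_p$
$\ell_\infty$ C R R
$L_1(\mu)$ 2◦
$L_p(\mu)$
$L_\infty(\mu)$ C R|C R R
$C(K_1)$ C R|C R R R,1m
$C_0(L_1)$ C C R,1m R,1m R,1m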

Tables 1 and 2 use the following notation. Unless otherwise mentioned, 1 <
p, q < ∞, m, n > 1, μ, ν are any measures, K1 , K2 are any compact Hausdorff
spaces, L1 , L2 are any locally compact Hausdorff spaces, and F.D. will denote finite-
dimensional Banach spaces. Besides, we have the following additional notation.
• Symbol ✓ means that the pair has the BPBp in general, possibly under some
extra conditions specified in the subindexes.
• Symbol ✗ means that there is at least one known counterexample.
• Subindex R means in the real case. Likewise, subindex C means in the complex
case.
• When there is a risk of confusion, a subindex beginning with 1 means that whatever
comes next applies to the domain space, and a subindex beginning with 2 means that
whatever comes next applies to the range space.
• σ stands for a σ-finite measure.
• + stands for a positive measure.
• ◦ stands for a localizable measure.
• m means that the corresponding (locally) compact Hausdorff space is metrizable.

Acknowledgments The authors of this survey would like to thank the anonymous referee for
the many useful suggestions and comments. The authors would also like to thank Ramón Aliaga,
Mingu Jung, Vladimir Kadets, Miguel Martín, Martin Mazzitelli, Jorge Tomás Rodríguez, and
Abraham Rueda Zoca for fruitful conversations on the topic of this survey.
Sheldon Dantas was supported by Spanish AEI Project
PID2019-106529GB-I00/MCIN/AEI/10.13039/501100011033. Domingo García and Manuel
Maestre were supported by project MTM2017-83262-C2-1-P/MCIN/AEI/10.13039/501100011033
(FEDER) and by PROMETEU/2021/070. Óscar Roldán was supported by the Spanish
Ministerio de Universidades, grant FPU17/02023, and by project
MTM2017-83262-C2-1-P/MCIN/AEI/10.13039/501100011033 (FEDER).

References

1. M.D. Acosta, Operadores que alcanzan su radio numérico, Ph.D. thesis, Univ. of Granada,
1990
2. M.D. Acosta, Denseness of numerical radius attaining operators: renorming and embedding
results. Indiana Univ. Math. J. 40(3), 903–914 (1991)
3. M.D. Acosta, Every real Banach space can be renormed to satisfy the denseness of numerical
radius attaining operators. Isr. J. Math. 81(3), 273–280 (1993)
4. M.D. Acosta, Denseness of norm attaining operators. Rev. R. Acad. Cienc. Exactas Fís. Nat.
Ser. A Mat. RACSAM 100(1–2), 9–30 (2006)
5. M.D. Acosta, The Bishop–Phelps–Bollobás property for operators on C(K). Banach J. Math.
Anal. 10(2), 307–319 (2016)
6. M.D. Acosta, On the Bishop–Phelps–Bollobás property, in Function Spaces XII.
Banach Center Publ., vol. 119 (Polish Acad. Sci. Inst. Math., Warsaw, 2019), pp. 13–32
7. M.D. Acosta, F.J. Aguirre, R. Payá, A space by W. Gowers and new results on norm and
numerical radius attaining operators. Acta Univ. Carolin. Math. Phys. 33(2), 5–14 (1992)
8. M.D. Acosta, F.J. Aguirre, R. Payá, There is no bilinear Bishop–Phelps theorem. Isr. J. Math.
93, 221–227 (1996)
9. M.D. Acosta, R.M. Aron, D. García, M. Maestre, The Bishop–Phelps–Bollobás theorem for
operators. J. Funct. Anal. 254(11), 2780–2799 (2008)
10. M.D. Acosta, R.M. Aron, F.J. García-Pacheco, The approximate hyperplane series property
and related properties. Banach J. Math. Anal. 11(2), 295–310 (2017)
11. M.D. Acosta, J. Becerra-Guerrero, Y.S. Choi, M. Ciesielski, S.K. Kim, H.J. Lee,
M.L. Lourenço, M. Martín, The Bishop–Phelps–Bollobás property for operators between
spaces of continuous functions. Nonlinear Anal. 95, 323–332 (2014)
12. M.D. Acosta, J. Becerra-Guerrero, Y.S. Choi, D. García, S.K. Kim, H.J. Lee, M. Maestre, The
Bishop–Phelps–Bollobás property for bilinear forms and polynomials. J. Math. Soc. Japan
66(3), 957–979 (2014)
13. M.D. Acosta, J. Becerra-Guerrero, D. García, S.K. Kim, M. Maestre, Bishop–Phelps–
Bollobás property for certain spaces of operators. J. Math. Anal. Appl. 414(2), 532–545
(2014)
14. M.D. Acosta, J. Becerra-Guerrero, D. García, S.K. Kim, M. Maestre, The Bishop–Phelps–
Bollobás property: a finite-dimensional approach. Publ. Res. Inst. Math. Sci. 51(1), 173–190
(2015)
15. M.D. Acosta, J. Becerra-Guerrero, D. García, M. Maestre, The Bishop–Phelps–Bollobás
theorem for bilinear forms. Trans. Am. Math. Soc. 365(11), 5911–5932 (2013)
16. M.D. Acosta, J. Becerra-Guerrero, M. Ruiz-Galán, Numerical-radius-attaining polynomials.
Q. J. Math. 54(1), 1–10 (2003)
17. M.D. Acosta, J.L. Dávila, A basis of $\mathbb{R}^n$ with good isometric properties and some applications
to denseness of norm attaining operators. J. Funct. Anal. 279(6), 108602, 26 pp. (2020)
18. M.D. Acosta, J.L. Dávila, M. Soleimani-Mourchehkhorti, Characterization of the Banach
spaces Y satisfying that the pair $(\ell_\infty^4, Y)$ has the Bishop–Phelps–Bollobás property for
operators. J. Math. Anal. Appl. 470(2), 690–715 (2019)
19. M.D. Acosta, M. Fakhar, M. Soleimani-Mourchehkhorti, The Bishop–Phelps–Bollobás
property for numerical radius of operators on L1 (μ). J. Math. Anal. Appl. 458(2), 925–936
(2018)
20. M.D. Acosta, D. García, S.K. Kim, M. Maestre, The Bishop–Phelps–Bollobás property for
operators from c0 into some Banach spaces. J. Math. Anal. Appl. 445(2), 1188–1199 (2017)
21. M.D. Acosta, D. García, M. Maestre, A multilinear Lindenstrauss theorem. J. Funct. Anal.
235(1), 122–136 (2006)
22. M.D. Acosta, M. Mastyło, M. Soleimani-Mourchehkhorti, The Bishop–Phelps–Bollobás and
approximate hyperplane series properties. J. Funct. Anal. 278(9), 2673–2699 (2018)
23. M.D. Acosta, R. Payá, Denseness of operators whose second adjoints attain their numerical
radii. Proc. Am. Math. Soc. 105(1), 97–101 (1989)
24. M.D. Acosta, R. Payá, Numerical radius attaining operators and the Radon-Nikodým
property. Bull. Lond. Math. Soc. 25(1), 67–73 (1993)
25. M.D. Acosta, M. Ruiz-Galán, Reflexive spaces and numerical radius attaining operators.
Extracta Math. 15, 247–255 (2000)
26. M.D. Acosta, M. Soleimani-Mourchehkhorti, Bishop–Phelps–Bollobás property for pos-
itive operators between classical Banach spaces, in The Mathematical Legacy of Victor
Lomonosov. Adv. Anal. Geom., vol. 2 (De Gruyter, Berlin, 2020), pp. 1–13
27. M.D. Acosta, M. Soleimani-Mourchehkhorti, Stability results of properties related to the
Bishop–Phelps–Bollobás property for operators. Sci. China Math. 64(5), 1011–1028 (2021)
28. M.D. Acosta, M. Soleimani-Mourchehkhorti, Bishop–Phelps–Bollobás property for positive
operators when the domain is L∞. Bull. Math. Sci. 11(2), 16 pp. (2021). Paper no. 2050023
29. M.D. Acosta, M. Soleimani-Mourchehkhorti, Bishop–Phelps–Bollobás property for positive
functionals (2021). arXiv:2106.05935
30. M.D. Acosta, M. Soleimani-Mourchehkhorti, Bishop–Phelps–Bollobás property for positive
operators when the domain is C0 (L) (2021). arXiv:2108.01638
31. L. Agud, J.M. Calabuig, S. Lajara, E.A. Sánchez-Pérez, Differentiability of Lp of a vector
measure and applications to the Bishop–Phelps–Bollobás property. Rev. R. Acad. Cienc.
Exactas Fís. Nat. Ser. A Mat. RACSAM 111(3), 735–751 (2017)
32. R.M. Aron, B. Cascales, O. Kozhushkina, The Bishop–Phelps–Bollobás theorem and Asplund
operators. Proc. Am. Math. Soc. 139(10), 3553–3560 (2011)
33. R.M. Aron, Y.S. Choi, D. García, M. Maestre, The Bishop–Phelps–Bollobás theorem for
L(L1 (μ), L∞ [0, 1]). Adv. Math. 228(1), 617–628 (2011)
34. R.M. Aron, Y.S. Choi, S.K. Kim, H.J. Lee, M. Martín, The Bishop–Phelps–Bollobás version
of Lindenstrauss properties A and B. Trans. Am. Math. Soc. 367(9), 6085–6101 (2015)
35. R.M. Aron, C. Finet, E. Werner, Some remarks on norm-attaining n-linear forms, in Function
Spaces (Edwardsville, IL, 1994). Lecture Notes in Pure and Appl. Math., vol. 172 (Dekker,
New York, 1995), pp. 19–28
36. R.M. Aron, D. García, M. Maestre, On norm attaining polynomials. Publ. Res. Inst. Math.
Sci. 39(1), 165–172 (2003)
37. A. Avilés, A.J. Guirao, J. Rodríguez, On the Bishop–Phelps–Bollobás property for numerical
radius in C(K)-spaces. J. Math. Anal. Appl. 419(1), 395–421 (2014)
38. N. Bala, K. Dhara, J. Sarkar, A. Sensarma, A Bishop–Phelps–Bollobás theorem for bounded
analytic functions (2021). arXiv:2109.10125
39. N. Bala, G. Ramesh, A Bishop–Phelps–Bollobás type property for minimum attaining
operators. Oper. Matrices 15(2), 497–513 (2021)
40. I.D. Berg, B. Sims, Denseness of operators which attain their numerical radius. J. Aust. Math.
Soc. Ser. A 36(1), 130–133 (1984)
41. E. Bishop, R.R. Phelps, A proof that every Banach space is subreflexive. Bull. Am. Math.
Soc. 67, 97–98 (1961)
42. B. Bollobás, An extension to the theorem of Bishop and Phelps. Bull. Lond. Math. Soc. 2, 181–
182 (1970)
43. J. Bourgain, On dentability and the Bishop–Phelps property. Isr. J. Math. 28(4), 265–271
(1977)
44. S.A. Buss, Versiones locales y uniformes del Teorema de Bishop–Phelps–Bollobás, Bache-
lor’s thesis, National University of Comahue, 2019
45. F. Cabello-Sánchez, S. Dantas, V. Kadets, S.K. Kim, H.J. Lee, M. Martín, On Banach spaces
whose group of isometries acts micro-transitively on the unit sphere. J. Math. Anal. Appl.
488(1), 124046, 14 pp. (2020)
46. Á. Capel, M. Martín, J. Merí, Numerical radius attaining compact linear operators. J. Math.
Anal. Appl. 445(2), 1258–1266 (2017)
47. D. Carando, S. Lassalle, M. Mazzitelli, On the polynomial Lindenstrauss theorem. J. Funct.
Anal. 263(7), 1809–1824 (2012)
48. D. Carando, M. Mazzitelli, Bounded holomorphic functions attaining their norms in the
bidual. Publ. Res. Inst. Math. Sci. 51(3), 489–512 (2015)
49. D. Carando, J.T. Rodríguez, Symmetric multilinear forms on Hilbert spaces: Where do they
attain their norm? Linear Algebra Appl. 563, 178–192 (2019)
50. C.S. Cardassi, Numerical radius attaining operators, in Banach Spaces (Columbia, Mo.,
1984). Lecture Notes in Math., vol. 1166 (Springer, Berlin, 1985), pp. 11–14
51. C.S. Cardassi, Density of numerical radius attaining operators on some reflexive spaces. Bull.
Aust. Math. Soc. 31(1), 1–3 (1985)
52. C.S. Cardassi, Numerical radius-attaining operators on C(K). Proc. Am. Math. Soc. 95(4),
537–543 (1985)
53. X. Carvajal, W. Neves, Operators that attain their minima. Bull. Braz. Math. Soc. (N.S.) 45(2),
293–312 (2014)
54. B. Cascales, R. Chiclana, L.C. García-Lirola, M. Martín, A. Rueda-Zoca, On strongly norm
attaining Lipschitz maps. J. Funct. Anal. 277(6), 1677–1717 (2019)
55. B. Cascales, A.J. Guirao, V. Kadets, A Bishop–Phelps–Bollobás type theorem for uniform
algebras. Adv. Math. 240, 370–382 (2013)
56. B. Cascales, A.J. Guirao, V. Kadets, M. Soloviova, Γ-flatness and Bishop–Phelps–Bollobás
type theorems for operators. J. Funct. Anal. 274(3), 863–888 (2018)
57. U.S. Chakraborty, Some remarks on minimum norm attaining operators. J. Math. Anal. Appl.
492(2), 124492, 14 pp. (2020)
58. U.S. Chakraborty, Some Bishop–Phelps–Bollobás type properties in Banach spaces with
respect to minimum norm of bounded linear operators. Ann. Funct. Anal. 12(3), 15 pp. (2021).
Paper no. 46
59. L.X. Cheng, Q.J. Cheng, K.K. Xu, W. Zhang, Z.M. Zheng, A Bishop–Phelps–Bollobás
theorem for Asplund operators. Acta Math. Sin. (Engl. Ser.) 36(7), 765–782 (2020)
60. L. Cheng, D. Dai, Y. Dong, A sharp operator version of the Bishop–Phelps theorem for
operators from $\ell_1$ to CL-spaces. Proc. Am. Math. Soc. 141(3), 867–872 (2013)
61. M. Chica, V. Kadets, M. Martín, J. Merí, Further properties of the Bishop–Phelps–Bollobás
moduli. Mediterr. J. Math. 13(5), 3173–3183 (2016)
62. M. Chica, V. Kadets, M. Martín, J. Merí, M. Soloviova, Two refinements of the Bishop–
Phelps–Bollobás modulus. Banach J. Math. Anal. 9(4), 296–315 (2015)
63. M. Chica, V. Kadets, M. Martín, S. Moreno-Pulido, F. Rambla-Barreno, Bishop–Phelps–
Bollobás moduli of a Banach space. J. Math. Anal. Appl. 412(2), 697–719 (2014)
64. R. Chiclana, L.C. García-Lirola, M. Martín, A. Rueda-Zoca, Examples and applications of
the density of strongly norm attaining Lipschitz maps. Rev. Mat. Iberoam. 37(5), 1917–1951
(2021)
65. R. Chiclana, M. Martín, The Bishop–Phelps–Bollobás property for Lipschitz maps. Nonlinear
Anal. 188, 158–178 (2019)
66. R. Chiclana, M. Martín, Some stability properties for the Bishop–Phelps–Bollobás property
for Lipschitz maps. Stud. Math. 264(2), 121–147 (2022)
67. D.H. Cho, Y.S. Choi, The Bishop–Phelps–Bollobás theorem on bounded closed convex sets.
J. Lond. Math. Soc. (2) 93(2), 502–518 (2016)
68. G. Choi, Y.S. Choi, M. Martín, Emerging notions of norm attainment for Lipschitz maps
between Banach spaces. J. Math. Anal. Appl. 483(1), 123600, 24 pp. (2020)
69. G. Choi, S.K. Kim, The Bishop–Phelps–Bollobás property on the space of c0-sum. Mediterr.
J. Math. 19(2), 16 pp. (2022). Paper no. 72
70. Y.S. Choi, Norm attaining bilinear forms on L1 [0, 1]. J. Math. Anal. Appl. 211(1), 295–300
(1997)
71. Y.S. Choi, S. Dantas, M. Jung, The Bishop–Phelps–Bollobás properties in complex Hilbert
spaces. Math. Nachr. 294(11), 2105–2120 (2021)
72. Y.S. Choi, S. Dantas, M. Jung, M. Martín, The Bishop–Phelps–Bollobás property and
absolute sums. Mediterr. J. Math. 16(3), 24 pp. (2019). Paper no. 73
73. Y.S. Choi, D. García, S.G. Kim, M. Maestre, Norm or numerical radius attaining polynomials
on C(K). J. Math. Anal. Appl. 295(1), 80–96 (2004)
74. Y.S. Choi, D. García, S.G. Kim, M. Maestre, The polynomial numerical index of a Banach
space. Proc. Edinb. Math. Soc. (2) 49(1), 39–52 (2006)
75. Y.S. Choi, D. García, S.K. Kim, M. Maestre, Some geometric properties of disk algebras. J.
Math. Anal. Appl. 409(1), 147–157 (2014)
76. Y.S. Choi, S.G. Kim, Norm or numerical radius attaining multilinear mappings and polyno-
mials. J. Lond. Math. Soc. (2) 54(1), 135–147 (1996)
77. Y.S. Choi, S.K. Kim, The Bishop–Phelps–Bollobás theorem for operators from L1 (μ) to
Banach spaces with the Radon–Nikodým property. J. Funct. Anal. 261(6), 1446–1456 (2011)
78. Y.S. Choi, S.K. Kim, The Bishop–Phelps–Bollobás property and lush spaces. J. Math. Anal.
Appl. 390(2), 549–555 (2012)
79. Y.S. Choi, S.K. Kim, H.J. Lee, M. Martín, The Bishop–Phelps–Bollobás theorem for
operators on L1 (μ). J. Funct. Anal. 267(1), 214–242 (2014)
80. Y.S. Choi, S.K. Kim, H.J. Lee, M. Martín, On Banach spaces with the approximate hyperplane
series property. Banach J. Math. Anal. 9(4), 243–258 (2015)
81. Y.S. Choi, H.G. Song, The Bishop–Phelps–Bollobás theorem fails for bilinear forms on
$\ell_1 \times \ell_1$. J. Math. Anal. Appl. 360(2), 752–753 (2009)
82. D. Dai, The Bishop–Phelps–Bollobás theorem for bilinear mappings. Adv. Math. (China)
44(1), 105–110 (2015)
83. A. Dalet, G. Lancien, Some properties of coarse Lipschitz maps between Banach spaces.
North-West. Eur. J. Math. 3, 41–62 (2017)
84. S. Dantas, Some kind of Bishop–Phelps–Bollobás property. Math. Nachr. 290(5–6), 774–784
(2017)
85. S. Dantas, J. Falcó, M. Jung, Group invariant operators and some applications on norm-
attaining theory (2021). arXiv:2110.02066
86. S. Dantas, D. García, S.K. Kim, U.Y. Kim, H.J. Lee, M. Maestre, A nonlinear Bishop–Phelps–
Bollobás type theorem. Q. J. Math. 70(1), 7–16 (2019)
87. S. Dantas, D. García, S.K. Kim, H.J. Lee, M. Maestre, On the Bishop–Phelps–Bollobás
theorem for multilinear mappings. Linear Algebra Appl. 532, 406–431 (2017)
88. S. Dantas, D. García, M. Maestre, M. Martín, The Bishop–Phelps–Bollobás property for
compact operators. Can. J. Math. 70(1), 53–73 (2018)
89. S. Dantas, L.C. García-Lirola, M. Jung, A. Rueda-Zoca, On norm-attainment in (symmetric)
tensor products (2021). arXiv:2104.06841
90. S. Dantas, M. Jung, M. Mazzitelli, J.T. Rodríguez, On the strong subdifferentiability of the
homogeneous polynomials and (symmetric) tensor products (in preparation)
91. S. Dantas, M. Jung, Ó. Roldán, Norm-attaining operators which satisfy a Bollobás type
theorem. Banach J. Math. Anal. 15(2), 26 pp. (2021). Paper no. 40
92. S. Dantas, M. Jung, Ó. Roldán, A. Rueda-Zoca, Norm-attaining tensors and nuclear operators.
Mediterranean J. Math. (to be formally accepted). arXiv:2006.09871
93. S. Dantas, V. Kadets, S.K. Kim, H.J. Lee, M. Martín, On the pointwise Bishop–Phelps–
Bollobás property for operators. Can. J. Math. 71(6), 1421–1443 (2019)
94. S. Dantas, V. Kadets, S.K. Kim, H.J. Lee, M. Martín, There is no operatorwise version of the
Bishop–Phelps–Bollobás property. Linear Multilinear Algebra 68(9), 1767–1778 (2020)
95. S. Dantas, S.K. Kim, H.J. Lee, The Bishop–Phelps–Bollobás point property. J. Math. Anal.
Appl. 444(2), 1739–1751 (2016)
96. S. Dantas, S.K. Kim, H.J. Lee, M. Mazzitelli, Local Bishop–Phelps–Bollobás properties. J.
Math. Anal. Appl. 468(1), 304–323 (2018)
97. S. Dantas, S.K. Kim, H.J. Lee, M. Mazzitelli, Strong subdifferentiability and local Bishop–
Phelps–Bollobás properties. Rev. R. Acad. Cienc. Exactas Fís. Nat. Ser. A Mat. RACSAM
114(2), 16 pp. (2020). Paper no. 47
98. S. Dantas, S.K. Kim, H.J. Lee, M. Mazzitelli, On some local Bishop–Phelps–Bollobás
properties, in The Mathematical Legacy of Victor Lomonosov. Adv. Anal. Geom., vol. 2, (De
Gruyter, Berlin, 2020), pp. 109–121
99. S. Dantas, A. Rueda-Zoca, A characterization of a local vector valued Bollobás theorem.
Results Math. 76(4), 14 pp. (2021). Paper no. 167
100. A. Defant, K. Floret, Tensor Norms and Operator Ideals. North-Holland Mathematics
Studies, vol. 176 (Elsevier, Amsterdam, 1993)
101. J. Falcó, The Bishop–Phelps–Bollobás property for numerical radius on L1 . J. Math. Anal.
Appl. 414(1), 125–133 (2014)
102. J. Falcó, A group invariant Bishop–Phelps theorem. Proc. Am. Math. Soc. 149(4), 1609–1612
(2021)
103. H. Fetter, B. Gamboa de Buen, The James Forest. London Mathematical Society Lecture
Notes Series, vol. 236 (Cambridge University Press, Cambridge, 1997)
104. D. García, H.J. Lee, M. Maestre, The Bishop–Phelps–Bollobás property for Hermitian forms
on Hilbert spaces. Q. J. Math. 65(1), 201–209 (2014)
105. D. García, M. Maestre, M. Martín, Ó. Roldán, On the compact operators case of the Bishop–
Phelps–Bollobás property for numerical radius. Results Math. 76(3), 23 pp. (2021). Paper no.
122
106. L.C. García-Lirola, C. Petitjean, A. Procházka, A. Rueda Zoca, Extremal structure and
Duality of Lipschitz free space. Mediterr. J. Math. 15(2), 23 pp. (2018). Paper no. 69
107. L.C. García-Lirola, A. Procházka, A. Rueda Zoca, On the structure of spaces of vector-valued
Lipschitz functions. Stud. Math. 239(3), 249–271 (2017)
108. F.J. García-Pacheco, The AHSP is inherited by E-summands. Adv. Oper. Theory 2(1), 17–20
(2017)
109. F.J. García-Pacheco, S. Moreno-Pulido, The Bishop–Phelps–Bollobás modulus for function-
als on classical Banach spaces. Adv. Oper. Theory 4(1), 1–23 (2019)
110. F.J. García-Pacheco, S. Moreno-Pulido, The Bishop–Phelps–Bollobás modulus for operators.
Acta Sci. Math. (Szeged) 85(1–2), 189–201 (2019)
111. G. Godefroy, A survey on Lipschitz-free Banach spaces. Comment. Math. 55(2), 89–118
(2015)
112. G. Godefroy, On norm attaining Lipschitz maps between Banach spaces. Pure Appl. Funct.
Anal. 1(1), 39–46 (2016)
113. G. Godefroy, V. Montesinos, V. Zizler, Strong subdifferentiability of norms and geometry of
Banach spaces. Comment. Math. Univ. Carolin. 36(3), 493–502 (1995)
114. T. Grando, M.L. Lourenço, On a function module with approximate hyperplane series
property. J. Aust. Math. Soc. 108(3), 341–348 (2020)
115. A.J. Guirao, O. Kozhushkina, The Bishop–Phelps–Bollobás property for numerical radius in
ℓ1(C). Stud. Math. 218(1), 41–54 (2013)
116. R.E. Huff, Dentability and the Radon-Nikodým property. Duke Math. J. 41, 111–114 (1974)
117. R.C. James, Reflexivity and the supremum of linear functionals. Ann. Math. 66, 159–169
(1957)
118. R.C. James, Characterizations of reflexivity. Stud. Math. 23, 205–216 (1964)
119. M. Jiménez Sevilla, R. Payá, Norm attaining multilinear forms and polynomials on preduals
of Lorentz sequence spaces. Stud. Math. 127(2), 99–112 (1998)
120. J. Johnson, J. Wolfe, Norm attaining operators. Stud. Math. 65(1), 7–19 (1979)
121. V. Kadets, M. Martín, M. Soloviova, Norm-attaining Lipschitz functionals. Banach J. Math.
Anal. 10(3), 621–637 (2016)
122. V. Kadets, M. Soloviova, A modified Bishop–Phelps–Bollobás theorem and its sharpness.
Mat. Stud. 44(1), 84–88 (2015)
123. V. Kadets, M. Soloviova, Quantitative version of the Bishop–Phelps–Bollobás theorem for
operators with values in a space with the property β. Mat. Stud. 47(1), 71–90 (2017)
124. S.K. Kim, The Bishop–Phelps–Bollobás theorem for operators from c0 to uniformly convex
spaces. Isr. J. Math. 197(1), 425–435 (2013)
125. S.K. Kim, H.J. Lee, Uniform Convexity and Bishop–Phelps–Bollobás property. Can. J. Math.
66(2), 373–386 (2014)
126. S.K. Kim, H.J. Lee, Simultaneously continuous retraction and Bishop–Phelps–Bollobás type
theorem. J. Math. Anal. Appl. 420(1), 758–771 (2014)
127. S.K. Kim, H.J. Lee, The Bishop–Phelps–Bollobás property for operators from C(K) to
uniformly convex spaces. J. Math. Anal. Appl. 421(1), 51–58 (2015)
128. S.K. Kim, H.J. Lee, A Urysohn-type theorem and the Bishop–Phelps–Bollobás theorem for
holomorphic functions. J. Math. Anal. Appl. 480(2), 123393, 8 pp. (2019)
129. S.K. Kim, H.J. Lee, P.K. Lin, The Bishop–Phelps–Bollobás property for operators from
L∞ (μ) to uniformly convex Banach spaces. J. Nonlinear Convex Anal. 17(2), 243–249
(2016)
130. S.K. Kim, H.J. Lee, M. Martín, On the Bishop–Phelps–Bollobás property for numerical
radius. Abstr. Appl. Anal. 2014, 479208, 15 pp. (2014)
131. S.K. Kim, H.J. Lee, M. Martín, The Bishop–Phelps–Bollobás theorem for operators from ℓ1
sums of Banach spaces. J. Math. Anal. Appl. 428(2), 920–929 (2015)
132. S.K. Kim, H.J. Lee, M. Martín, Bishop–Phelps–Bollobás property for bilinear forms on
spaces of continuous functions. Math. Z. 283(1–2), 157–167 (2016)
133. S.K. Kim, H.J. Lee, M. Martín, On the Bishop–Phelps–Bollobás theorem for operators and
numerical radius. Stud. Math. 233(2), 141–151 (2016)
134. S.K. Kim, H.J. Lee, M. Martín, J. Merí, On a second numerical index for Banach spaces.
Proc. R. Soc. Edinburgh Sect. A 150(2), 1003–1051 (2020)
135. S.H. Kulkarni, G. Ramesh, On the denseness of minimum attaining operators. Oper. Matrices
12(3), 699–709 (2018)
136. H.J. Lee, Denseness of numerical radius attaining holomorphic functions. J. Inequal. Appl.
981453, 5 pp. (2009)
137. J. Lindenstrauss, On operators which attain their norm. Isr. J. Math. 1, 139–148 (1963)
138. J. Lindenstrauss, L. Tzafriri, Classical Banach spaces. I (Springer, Berlin, New York, 1977)
139. M. Martín, Norm-attaining compact operators. J. Funct. Anal. 267(5), 1585–1592 (2014)
140. M. Martín, The version for compact operators of Lindenstrauss properties A and B. Rev. R.
Acad. Cienc. Exactas Fís. Nat. Ser. A Mat. RACSAM 110(1), 269–284 (2016)
141. R. Payá, A counterexample on numerical radius attaining operators. Isr. J. Math. 79(1), 83–
101 (1992)
142. R.A. Ryan, Introduction to Tensor Products of Banach Spaces. Springer Monographs in
Mathematics (Springer, London, 2002)
143. D. Sain, Smooth points in operator spaces and some Bishop–Phelps–Bollobás type theorems
in Banach spaces. Oper. Matrices 13(2), 433–445 (2019)
144. W. Schachermayer, Norm attaining operators on some classical Banach spaces. Pac. J. Math.
105(2), 427–438 (1983)
145. W. Schachermayer, Norm attaining operators and renormings of Banach spaces. Isr. J. Math.
44(3), 201–212 (1983)
146. B. Sims, On numerical range and its applications to Banach algebras, Ph.D. thesis, Univ. of
Newcastle, 1972
147. J. Talponen, Note on a kind of Bishop–Phelps–Bollobás property for operators (2017).
arXiv:1707.03251
148. J.J. Uhl, Norm attaining operators on L1 [0, 1] and the Radon-Nikodým property. Pacific J.
Math. 63(1), 293–300 (1976)
149. N. Weaver, Lipschitz Algebras (World Scientific Publishing Co., Inc., River Edge, NJ, 1999)
150. R. Zarghami, Coincidence the sets of norm and numerical radius attaining holomorphic
functions on finite-dimensional spaces. Acta Univ. Apulensis Math. Inform. 25, 229–233
(2011)
151. V. Zizler, On some extremal problems in Banach spaces. Math. Scand. 32, 214–224 (1973)
A New Proof of the Power Weighted
Birman–Hardy–Rellich Inequalities

Fritz Gesztesy, Isaac Michael, and Michael M. H. Pang

Abstract In this chapter, we introduce a new method for proving the power-
weighted Birman–Hardy–Rellich integral inequalities,
\[
\begin{gathered}
\int_0^\infty dx \, x^{\alpha} \big|f^{(m)}(x)\big|^2 \geq A(\ell,\alpha) \int_0^\infty dx \, x^{\alpha-2\ell} \big|f^{(m-\ell)}(x)\big|^2, \\
m \in \mathbb{N}, \; \ell \in \{1,\dots,m\}, \; \alpha \in \mathbb{R}, \; f \in C_0^\infty((0,\infty)),
\end{gathered}
\]

where $A(\ell,\alpha)$ is given by

\[
A(\ell,\alpha) = 4^{-\ell} \prod_{j=1}^{\ell} (2j-1-\alpha)^2.
\]

The new method of proof simultaneously establishes both the existence of such
inequalities and their optimality (i.e., sharpness of the constant $A(\ell,\alpha)$ on the space
$C_0^\infty((0,\infty))$ of infinitely differentiable functions of compact support in $(0,\infty)$).
We also note that these inequalities are strict, that is, equality holds if and only if
$f \equiv 0$.

F. Gesztesy
Department of Mathematics, Baylor University, Waco, TX, USA
https://www.baylor.edu/math/index.php?id=935340
I. Michael
Department of Mathematics, Louisiana State University, Baton Rouge, LA, USA
https://www.math.lsu.edu/~imichael
M. M. H. Pang
Department of Mathematics, University of Missouri, Columbia, MO, USA
https://www.math.missouri.edu/people/pang

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
R. M. Aron et al. (eds.), Operator and Norm Inequalities and Related Topics,
Trends in Mathematics, https://doi.org/10.1007/978-3-031-02104-6_17
Extensions to homogeneous Sobolev spaces, that is, $f \in \dot{H}_0^m\big((0,\rho); x^{\alpha} dx\big)$, as
well as the vector-valued case, where $f \in \dot{H}_0^m\big((0,\rho); x^{\alpha} dx; \mathcal{H}\big)$, $\rho \in (0,\infty) \cup \{\infty\}$,
with $\mathcal{H}$ a complex, separable Hilbert space, are also discussed.

Keywords Birman–Hardy–Rellich inequalities · Logarithmic refinements

1 Introduction

The primary aim in this paper is to provide a new proof of the optimal version of
the power-weighted sequence of Birman–Hardy–Rellich inequalities of the form,
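\[
\begin{gathered}
\int_0^\infty dx \, x^{\alpha} \big|f^{(m)}(x)\big|^2 \geq A(\ell,\alpha) \int_0^\infty dx \, x^{\alpha-2\ell} \big|f^{(m-\ell)}(x)\big|^2, \\
m \in \mathbb{N}, \; \ell \in \{1,\dots,m\}, \; \alpha \in \mathbb{R}, \; f \in C_0^\infty((0,\infty)),
\end{gathered}
\tag{1}
\]

where

\[
A(\ell,\alpha) = 4^{-\ell} \prod_{j=1}^{\ell} (2j-1-\alpha)^2, \quad \ell \in \mathbb{N}, \; \alpha \in \mathbb{R}. \tag{2}
\]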
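As a quick illustration of the constants in (2), the following minimal sketch evaluates $A(\ell,\alpha)$ in exact rational arithmetic. The function name `birman_constant` is hypothetical and only serves this example; the formula is the one stated in (2).

```python
from fractions import Fraction


def birman_constant(l, alpha):
    """Return A(l, alpha) = 4^(-l) * prod_{j=1}^{l} (2j - 1 - alpha)^2, see (2).

    `birman_constant` is an illustrative name, not notation from the text.
    """
    alpha = Fraction(alpha)
    result = Fraction(1, 4 ** l)
    for j in range(1, l + 1):
        result *= (2 * j - 1 - alpha) ** 2
    return result


print(birman_constant(1, 0))  # 1/4, the constant in Hardy's inequality
print(birman_constant(2, 0))  # 9/16, the constant in Rellich's inequality
print(birman_constant(1, 1))  # 0, the constant degenerates for alpha = 1
```

The last line shows that $A(\ell,\alpha)$ vanishes precisely when $\alpha$ equals one of the odd integers $1, 3, \dots, 2\ell-1$, in which case (1) carries no information.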

The novelty of our proof lies in the fact that both the existence and the optimality of
the constants $A(\ell,\alpha)$ in (1) are established simultaneously.
Moreover, we also prove these inequalities in the context of homogeneous
Sobolev spaces, that is, for $f \in \dot{H}_0^m\big((0,\rho); x^{\alpha} dx\big)$, as well as in the vector-valued
case for $f \in \dot{H}_0^m\big((0,\rho); x^{\alpha} dx; \mathcal{H}\big)$, $\rho \in (0,\infty) \cup \{\infty\}$, and the Sobolev space
$H_0^m((\rho,\infty); x^{\alpha} dx; \mathcal{H})$, $\rho \in (0,\infty)$, where $\mathcal{H}$ is a complex, separable Hilbert space.
We recall that the special case α = 0 appeared in work of Birman in 1961
(English translation in 1966) [3] (see also [9, pp. 83–84]). The case m = 1 in (1)
represents Hardy’s celebrated inequality [11], [12, Sect. 9.8] (see also [16, Chs. 1,
3, App.]), the case m = 2 is due to Rellich [24, Sect. II.7] (actually, in the multi-
dimensional context). The inequalities (1) are known to be strict, that is, equality
holds in (1) if and only if f = 0 on (0, ∞). Moreover, they are known to be optimal,
that is, the constants A(ℓ, α) in (1) are sharp, although this must be qualified, as
different authors frequently prove sharpness for different function spaces. In the
present one-dimensional context at hand, sharpness of (1) is often proved in an
integral form (rather than the currently presented differential form) where f (m) on
the left-hand side is replaced by F and f on the right-hand side by m repeated
integrals over F . For pertinent one-dimensional sources, we refer, for instance, to
[2, p. 3–5], [4], [6, p. 104–105], [8, 10, 11], [12, p. 240–243], [16, Ch. 3], [17,
p. 5–11], [19, 20, 23]. We also note that higher-order Hardy inequalities, including
various weight functions, are discussed in [5], [15, Sect. 5], [16, Chs. 2–5], [17,
Chs. 1–4], [18], and [22, Sect. 10] (however, with the exception of [5], Birman’s
sequence of inequalities, i.e., (1) for α = 0, is not mentioned in these sources).
There exists a wealth of multi-dimensional investigations of Hardy, Rellich, etc.,
inequalities on domains Ω ⊆ Rn , n ∈ N, n ≥ 2, which, when specialized to a ball
in Rn and spherically symmetric functions f , yields one dimensional inequalities of
the Birman–Hardy–Rellich-type with various weight functions. Since we included
a very detailed bibliography in [7], including such multi-dimensional sources, we
refrained from repeating it here and just focused on the one dimensional literature.
Briefly turning to the contents of each section, we introduce our new proof,
a variant of a combination of transformations studied by Hartman [13], [14,
p. 324–325] and Müller-Pfeiffer [21, p. 200–207], in Sect. 2. Generalizations to
homogeneous Sobolev spaces and to the vector-valued case (replacing complex-
valued f ( · ) by f ( · ) ∈ H, with H a complex, separable Hilbert space) appear in
Sect. 3. For background on the vector-valued case we refer to Appendix A.
Throughout this paper, H represents a complex, separable Hilbert space with
corresponding scalar product ( · , · )H (linear in the second factor) and associated
norm ‖ · ‖H .

2 Power-Weighted Birman–Hardy–Rellich Inequalities

In this section we present our new proof of the power-weighted Birman–Hardy–
Rellich inequalities for functions in C0∞ ((0, ∞)).
We start by establishing the sequence of power-weighted Birman–Hardy–Rellich
inequalities for the case ℓ = m in (1) using a slight modification of our proof in
[7] that allows one to prove both the inequality and the optimality of the constant
A(m, α) simultaneously.
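Lemma 2.1 Set

\[
A(m,\alpha) = 4^{-m} \prod_{j=1}^{m} (2j-1-\alpha)^2, \quad m \in \mathbb{N}, \; \alpha \in \mathbb{R}. \tag{3}
\]

Then,

\[
\begin{gathered}
\int_0^\infty dx \, x^{\alpha} \big|f^{(m)}(x)\big|^2 \geq A(m,\alpha) \int_0^\infty dx \, x^{\alpha-2m} |f(x)|^2, \\
m \in \mathbb{N}, \; \alpha \in \mathbb{R}, \; f \in C_0^\infty((0,\infty)).
\end{gathered}
\tag{4}
\]

Moreover, the constant $A(m,\alpha)$ in (4) is optimal and the inequality is strict, that is,
equality holds in (4) if and only if $f \equiv 0$.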
Proof Let $C \in (0,\infty)$ and define $Q$ as the operator in $L^2((0,\infty); dx)$ given by

\[
Q = \bigg[ (-1)^m \frac{d^m}{dx^m} x^{\alpha} \frac{d^m}{dx^m} - C x^{\alpha-2m} \bigg] \bigg|_{C_0^\infty((0,\infty))}. \tag{5}
\]
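Utilizing

\[
\begin{gathered}
\int_a^b dx \, x^{\alpha} \big|f^{(m)}(x)\big|^2 = (-1)^m \int_a^b dx \, \big( x^{\alpha} f^{(m)}(x) \big)^{(m)} \, \overline{f(x)}, \\
m \in \mathbb{N}, \; \alpha \in \mathbb{R}, \; f \in C_0^\infty((a,b)), \; 0 \leq a < b \leq \infty,
\end{gathered}
\tag{6}
\]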

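one concludes that

\[
(f, Qf)_{L^2((0,\infty);dx)} = \int_0^\infty dx \, \Big[ x^{\alpha} \big|f^{(m)}(x)\big|^2 - C x^{\alpha-2m} |f(x)|^2 \Big], \quad f \in C_0^\infty((0,\infty)). \tag{7}
\]

Thus, to establish (4) and, simultaneously, optimality of the constant $A(m,\alpha)$, we
will show that

\[
Q \geq 0 \ \text{ if and only if } \ C \leq A(m,\alpha). \tag{8}
\]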

To this end, one introduces the following elementary variable transformation, an
extension, and combination, of transformations considered by Hartman [13] (see
also [14, p. 324–325]) and Müller-Pfeiffer [21, p. 200–207]: Assume temporarily
that

\[
\alpha \in \mathbb{R} \setminus \{ j \mid 1 \leq j \leq 2m-1 \}. \tag{9}
\]

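Given $f \in C_0^\infty((0,\infty))$, the transformation

\[
x = e^{t}, \quad x \in (0,\infty), \quad dx = e^{t} dt, \quad t \in \mathbb{R}, \tag{10}
\]

\[
f(x) \equiv f(e^{t}) = e^{[(2m-1-\alpha)/2]t} w(t), \quad w \in C_0^\infty(\mathbb{R}), \tag{11}
\]

yields

\[
\big( x^{\alpha} f^{(m)}(x) \big)^{(m)} = e^{-[(2m+1-\alpha)/2]t} \sum_{\ell=0}^{2m} c_{\ell}(m,\alpha) \, w^{(\ell)}(t), \tag{12}
\]

for appropriate constants $c_{\ell}(m,\alpha)$, $\ell = 0, 1, \dots, 2m$, to be determined next.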

The solutions of the differential equation

\[
\big( x^{\alpha} f^{(m)}(x) \big)^{(m)} = 0, \tag{13}
\]

are linear combinations of the following powers of $x$:

\[
\begin{aligned}
& x^{j}, && j = 0, 1, \dots, m-1, \\
& x^{k-\alpha}, && k = m, \dots, 2m-1.
\end{aligned}
\tag{14}
\]
One notes that the solutions (14) are linearly independent due to (9).
Thus, recalling (10)–(12), it follows that the solutions of

\[
\sum_{\ell=0}^{2m} c_{\ell}(m,\alpha) \, w^{(\ell)}(t) = 0, \quad t \in \mathbb{R}, \tag{15}
\]

are the functions

\[
e^{[(1+\alpha-2m)/2]t} x^{j} = e^{[(2j+1+\alpha-2m)/2]t}, \quad j = 0, 1, \dots, m-1, \tag{16}
\]

and

\[
e^{[(1+\alpha-2m)/2]t} x^{k-\alpha} = e^{[(2k+1-\alpha-2m)/2]t}, \quad k = m, \dots, 2m-1. \tag{17}
\]

One observes that for $j = 0$ and $k = 2m-1$,

\[
e^{[(2j+1+\alpha-2m)/2]t} = e^{[(1+\alpha-2m)/2]t}, \quad e^{[(2k+1-\alpha-2m)/2]t} = e^{-[(1+\alpha-2m)/2]t}, \tag{18}
\]

and for $j = 1$ and $k = 2m-2$,

\[
e^{[(2j+1+\alpha-2m)/2]t} = e^{[(3+\alpha-2m)/2]t}, \quad e^{[(2k+1-\alpha-2m)/2]t} = e^{-[(3+\alpha-2m)/2]t}. \tag{19}
\]

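Continuing iteratively, one concludes that the linearly independent solutions of (15)
are of the form

\[
e^{\pm[(2j+1-2m+\alpha)/2]t}, \quad j = 0, 1, \dots, m-1. \tag{20}
\]

By a simple relabeling, given $\alpha \in \mathbb{R} \setminus \{ j \mid 1 \leq j \leq 2m-1 \}$, this is equivalent to the
statement that

\[
e^{\pm[(2j-1-\alpha)/2]t}, \quad j = 1, \dots, m, \; t \in \mathbb{R}, \tag{21}
\]

are linearly independent solutions of (15). The zeros of the characteristic polynomial
of (15) are thus the constant factors in the exponents of (21). Hence, the characteristic
polynomial is given by

\[
\begin{aligned}
P_{m,\alpha}(\lambda) &= \sum_{\ell=0}^{2m} c_{\ell}(m,\alpha) \lambda^{\ell} \\
&= \bigg( \lambda^2 - \frac{(1-\alpha)^2}{4} \bigg) \bigg( \lambda^2 - \frac{(3-\alpha)^2}{4} \bigg) \cdots \bigg( \lambda^2 - \frac{(2m-1-\alpha)^2}{4} \bigg) \\
&= \prod_{j=1}^{m} \bigg( \lambda^2 - \frac{(2j-1-\alpha)^2}{4} \bigg).
\end{aligned}
\tag{22}
\]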

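Thus, the coefficients $c_{\ell}(m,\alpha)$, $\ell = 0, 1, \dots, 2m$, satisfy the following properties:

\[
\begin{aligned}
&\text{(i)} && c_{2j-1}(m,\alpha) = 0, \quad j = 1, \dots, m; \\
&\text{(ii)} && c_{2j}(m,\alpha) = (-1)^{m-j} |c_{2j}(m,\alpha)|, \quad j = 0, 1, \dots, m; \\
&\text{(iii)} && |c_{0}(m,\alpha)| = A(m,\alpha); \\
&\text{(iv)} && c_{2m}(m,\alpha) = 1.
\end{aligned}
\tag{23}
\]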
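The four properties in (23) can be checked mechanically by expanding the product in (22). The following is a minimal sketch in exact rational arithmetic; the helper names `char_poly_coeffs` and `birman_constant` are hypothetical and introduced only for this illustration, and the choice $m = 3$, $\alpha = 1/2$ is an arbitrary sample.

```python
from fractions import Fraction


def char_poly_coeffs(m, alpha):
    """Coefficients [c_0, ..., c_{2m}] of prod_{j=1}^m (lam^2 - (2j-1-alpha)^2/4)."""
    alpha = Fraction(alpha)
    coeffs = [Fraction(1)]  # start from the constant polynomial 1
    for j in range(1, m + 1):
        root_sq = (2 * j - 1 - alpha) ** 2 / 4
        new = [Fraction(0)] * (len(coeffs) + 2)
        for k, c in enumerate(coeffs):
            new[k + 2] += c        # term lam^2 * (c * lam^k)
            new[k] -= root_sq * c  # term -root_sq * (c * lam^k)
        coeffs = new
    return coeffs


def birman_constant(m, alpha):
    """A(m, alpha) = 4^(-m) * prod_{j=1}^m (2j - 1 - alpha)^2, see (3)."""
    alpha = Fraction(alpha)
    result = Fraction(1, 4 ** m)
    for j in range(1, m + 1):
        result *= (2 * j - 1 - alpha) ** 2
    return result


m, alpha = 3, Fraction(1, 2)
c = char_poly_coeffs(m, alpha)

odd_vanish = all(c[2 * j - 1] == 0 for j in range(1, m + 1))              # (i)
signs_alternate = all((-1) ** (m - j) * c[2 * j] >= 0 for j in range(m + 1))  # (ii)
constant_term = abs(c[0]) == birman_constant(m, alpha)                     # (iii)
leading_one = c[2 * m] == 1                                                # (iv)
print(odd_vanish, signs_alternate, constant_term, leading_one)
```

Since each factor in (22) is a polynomial in $\lambda^2$ with a nonpositive constant term, the sign pattern in (ii) is a direct consequence of Vieta's formulas, which is what the check above reflects.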

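Applying this transformation to (7) yields,

\[
\begin{aligned}
(f, Qf)_{L^2((0,\infty);dx)} &= \int_0^\infty dx \, \overline{f(x)} \Big[ (-1)^m \big( x^{\alpha} f^{(m)}(x) \big)^{(m)} - C x^{\alpha-2m} f(x) \Big] \\
&= \int_{-\infty}^{\infty} dt \, e^{t} \bigg[ (-1)^m e^{-[(2m+1-\alpha)/2]t} \sum_{j=0}^{m} (-1)^{m-j} |c_{2j}(m,\alpha)| \, w^{(2j)}(t) \\
&\qquad\qquad - C e^{(\alpha-2m)t + [(2m-1-\alpha)/2]t} w(t) \bigg] \, e^{[(2m-1-\alpha)/2]t} \, \overline{w(t)} \\
&= \int_{-\infty}^{\infty} dt \, \overline{w(t)} \bigg[ \sum_{j=0}^{m} (-1)^{j} |c_{2j}(m,\alpha)| \, w^{(2j)}(t) - C w(t) \bigg].
\end{aligned}
\tag{24}
\]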

Hence, $Q \geq 0$ in $L^2((0,\infty); dx)$ if and only if the constant coefficient operator $S$
in $L^2(\mathbb{R}; dt)$, defined by

\[
S = \bigg[ \sum_{j=0}^{m} (-1)^{j} |c_{2j}(m,\alpha)| \frac{d^{2j}}{dt^{2j}} - C \bigg] \bigg|_{C_0^\infty(\mathbb{R})}, \tag{25}
\]

satisfies $S \geq 0$ in $L^2(\mathbb{R}; dt)$. Invoking the Fourier transform, the closure $\overline{S}$ of $S$ in
$L^2(\mathbb{R}; dt)$ is unitarily equivalent to the maximally defined operator of multiplication
$T$ in $L^2(\mathbb{R}; d\xi)$ by the polynomial $\sum_{j=1}^{m} |c_{2j}(m,\alpha)| \xi^{2j} + \big[ |c_0(m,\alpha)| - C \big]$, that is,

\[
\begin{gathered}
(Tv)(\xi) = \sum_{j=1}^{m} |c_{2j}(m,\alpha)| \, \xi^{2j} v(\xi) + \big[ |c_0(m,\alpha)| - C \big] v(\xi), \\
v \in \operatorname{dom}(T) = \bigg\{ u \in L^2(\mathbb{R}; d\xi) \, \bigg| \, \int_{-\infty}^{\infty} d\xi \, \xi^{4m} |u(\xi)|^2 < \infty \bigg\}.
\end{gathered}
\tag{26}
\]

Recalling (23), part (iii), $T$ (and hence $Q$) is nonnegative if and only if $C \leq A(m,\alpha)$.
Moreover, if $C = A(m,\alpha)$, then $T \geq 0$ with trivial nullspace, $\ker(T) = \{0\}$.
Thus, (4) is strict unless $f \equiv 0$.
The case $\alpha \in \{ j \mid 1 \leq j \leq 2m-1 \}$ then follows by taking the limits $\alpha \to k \in \{ j \mid 1 \leq j \leq 2m-1 \}$,
noting that $A(m,\alpha)$ and $c_{2j}(m,\alpha)$ are continuous as
polynomials in $\alpha \in \mathbb{R}$.
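The final step of the proof reduces to the sign of the multiplication symbol in (26), whose minimum over $\xi \in \mathbb{R}$ is attained at $\xi = 0$ and equals $|c_0(m,\alpha)| - C = A(m,\alpha) - C$. The sketch below evaluates this symbol on a small grid for the sample case $m = 2$, $\alpha = 0$; the function name `multiplier_symbol` and the grid are assumptions made purely for illustration.

```python
from fractions import Fraction


def multiplier_symbol(m, alpha, C):
    """Return (symbol, |c_0|), where
    symbol(xi) = sum_{j=1}^m |c_{2j}| xi^(2j) + |c_0| - C as in (26)
    and the c_l are the coefficients of the polynomial in (22)."""
    alpha = Fraction(alpha)
    coeffs = [Fraction(1)]
    for j in range(1, m + 1):
        root_sq = (2 * j - 1 - alpha) ** 2 / 4
        new = [Fraction(0)] * (len(coeffs) + 2)
        for k, c in enumerate(coeffs):
            new[k + 2] += c
            new[k] -= root_sq * c
        coeffs = new
    abs_even = [abs(coeffs[2 * j]) for j in range(m + 1)]

    def symbol(xi):
        xi = Fraction(xi)
        return sum(abs_even[j] * xi ** (2 * j) for j in range(1, m + 1)) + abs_even[0] - C

    return symbol, abs_even[0]


A = Fraction(9, 16)  # A(2, 0)
sharp, c0 = multiplier_symbol(2, 0, A)
too_large, _ = multiplier_symbol(2, 0, A + Fraction(1, 100))

print(c0 == A)                                            # |c_0| agrees with A(2, 0)
print(sharp(0), min(sharp(Fraction(k, 2)) for k in range(1, 7)))
print(too_large(0))                                       # negative once C exceeds A(2, 0)
```

For $C = A(m,\alpha)$ the symbol vanishes only at $\xi = 0$, a set of Lebesgue measure zero, which is the reason the multiplication operator $T$ has trivial kernel.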

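Next, we extend Lemma 2.1 to include all intermediate cases $\ell \in \{1, \dots, m\}$
in (1). For this purpose we recall the following elementary identity: Given $A(m,\alpha)$,
$m \in \mathbb{N}$, $\alpha \in \mathbb{R}$, in (3), one has,

\[
A(m,\alpha) = A(\ell,\alpha) A(m-\ell, \alpha-2\ell), \quad m \in \mathbb{N}, \; \ell \in \{1, \dots, m\}, \; \alpha \in \mathbb{R}. \tag{27}
\]

Indeed,

\[
\begin{aligned}
A(\ell,\alpha) A(m-\ell, \alpha-2\ell) &= 4^{-\ell} \prod_{j=1}^{\ell} (2j-1-\alpha)^2 \; 4^{-(m-\ell)} \prod_{k=1}^{m-\ell} [2k-1-(\alpha-2\ell)]^2 \\
&= 4^{-m} \prod_{j=1}^{\ell} (2j-1-\alpha)^2 \prod_{k=1}^{m-\ell} [2(k+\ell)-1-\alpha]^2 \\
&= 4^{-m} \prod_{j=1}^{\ell} (2j-1-\alpha)^2 \prod_{k=\ell+1}^{m} (2k-1-\alpha)^2 \\
&= 4^{-m} \prod_{j=1}^{m} (2j-1-\alpha)^2 \\
&= A(m,\alpha).
\end{aligned}
\tag{28}
\]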
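The factorization (27) is easy to confirm on sample values. The following minimal sketch does so in exact arithmetic; `birman_constant` and `check_factorization` are hypothetical helper names, and the empty product convention $A(0,\alpha) = 1$ (needed for $\ell = m$) is made explicit in the code.

```python
from fractions import Fraction


def birman_constant(m, alpha):
    """A(m, alpha) = 4^(-m) * prod_{j=1}^m (2j - 1 - alpha)^2; A(0, alpha) = 1."""
    alpha = Fraction(alpha)
    result = Fraction(1, 4 ** m)
    for j in range(1, m + 1):
        result *= (2 * j - 1 - alpha) ** 2
    return result


def check_factorization(max_m, alphas):
    """Check A(m, a) == A(l, a) * A(m - l, a - 2l) for 1 <= l <= m <= max_m."""
    for m in range(1, max_m + 1):
        for l in range(1, m + 1):
            for a in alphas:
                lhs = birman_constant(m, a)
                rhs = birman_constant(l, a) * birman_constant(m - l, a - 2 * l)
                if lhs != rhs:
                    return False
    return True


sample_alphas = [Fraction(-7, 3), Fraction(0), Fraction(1, 2), Fraction(3), Fraction(22, 5)]
print(check_factorization(5, sample_alphas))
```

The check includes $\alpha = 3$, where some of the constants vanish, so the identity is seen to hold in the degenerate cases as well.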

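Theorem 2.2 One has

\[
\begin{gathered}
\int_0^\infty dx \, x^{\alpha} \big|f^{(m)}(x)\big|^2 \geq A(\ell,\alpha) \int_0^\infty dx \, x^{\alpha-2\ell} \big|f^{(m-\ell)}(x)\big|^2, \\
m \in \mathbb{N}, \; \ell \in \{1, \dots, m\}, \; \alpha \in \mathbb{R}, \; f \in C_0^\infty((0,\infty)).
\end{gathered}
\tag{29}
\]

Moreover, the constants $A(\ell,\alpha)$, for $1 \leq \ell \leq m$, $\alpha \in \mathbb{R} \setminus \{2j-1\}_{\ell+1 \leq j \leq m}$, in (29)
are optimal and the inequality is strict, that is, equality holds in (29) if and only if
$f \equiv 0$.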
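Proof Replacing $m$ by $\ell$ and $f$ by $f^{(m-\ell)}$ in (4) yields (29).
Thus, it remains to prove the optimality of $A(\ell,\alpha)$ in (29). Arguing by contradiction,
we suppose there exists $C > A(\ell,\alpha)$ such that

\[
\int_0^\infty dx \, x^{\alpha} \big|f^{(m)}(x)\big|^2 \geq C \int_0^\infty dx \, x^{\alpha-2\ell} \big|f^{(m-\ell)}(x)\big|^2, \quad f \in C_0^\infty((0,\infty)). \tag{30}
\]

Applying (4) again (with $m$ and $\alpha$ replaced by $m-\ell$ and $\alpha-2\ell$, respectively), one
concludes with the help of (27),

\[
\begin{aligned}
\int_0^\infty dx \, x^{\alpha} \big|f^{(m)}(x)\big|^2 &\geq C A(m-\ell, \alpha-2\ell) \int_0^\infty dx \, x^{\alpha-2\ell-2(m-\ell)} |f(x)|^2 \\
&> A(\ell,\alpha) A(m-\ell, \alpha-2\ell) \int_0^\infty dx \, x^{\alpha-2m} |f(x)|^2 \\
&= A(m,\alpha) \int_0^\infty dx \, x^{\alpha-2m} |f(x)|^2, \quad f \in C_0^\infty((0,\infty)),
\end{aligned}
\tag{31}
\]

contradicting the optimality of $A(m,\alpha)$ in (4). The condition $\alpha \in \mathbb{R} \setminus \{2j-1\}_{\ell+1 \leq j \leq m}$
is used to guarantee that $A(m-\ell, \alpha-2\ell) \neq 0$.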
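As a numerical sanity check of (29) in the simplest case $m = \ell = 1$, the sketch below compares both sides for one smooth, compactly supported test function. Everything specific here is an assumption chosen for illustration: the bump $f(x) = \exp\big(-1/[(x-1)(3-x)]\big)$ on $(1,3)$, the midpoint quadrature rule, the sample values of $\alpha$, and the function names.

```python
import math


def bump(x, a=1.0, b=3.0):
    """Smooth test function supported in [a, b]: exp(-1/((x-a)(b-x)))."""
    u = (x - a) * (b - x)
    return math.exp(-1.0 / u) if u > 0 else 0.0


def bump_derivative(x, a=1.0, b=3.0):
    """Derivative of bump: exp(-1/u) * u'/u^2 with u = (x-a)(b-x), u' = a+b-2x."""
    u = (x - a) * (b - x)
    if u <= 0:
        return 0.0
    return math.exp(-1.0 / u) * (a + b - 2.0 * x) / u ** 2


def midpoint(g, a, b, n=20000):
    """Composite midpoint rule for the integral of g over (a, b)."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def hardy_sides(alpha):
    """Both sides of (29) for m = l = 1, with A(1, alpha) = (1 - alpha)^2 / 4."""
    lhs = midpoint(lambda x: x ** alpha * bump_derivative(x) ** 2, 1.0, 3.0)
    rhs = 0.25 * (1.0 - alpha) ** 2 * midpoint(
        lambda x: x ** (alpha - 2.0) * bump(x) ** 2, 1.0, 3.0)
    return lhs, rhs


for alpha in (-2.0, 0.0, 3.0):
    lhs, rhs = hardy_sides(alpha)
    print(alpha, lhs, rhs, lhs >= rhs)
```

A single test function of course says nothing about optimality of the constant; the gap between the two sides is expected to be sizable for a fixed bump, since sharpness is only approached along suitable sequences of functions.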

3 Some Generalizations

In this section we turn to generalizations regarding the use of homogeneous Sobolev
spaces; we also treat the vector-valued case where f ( · ) ∈ H, with H a complex,
separable Hilbert space. (For the necessary background in the vector-valued context
we refer to Appendix A.)
We start with the following elementary result.
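Lemma 3.1 One has

\[
\begin{gathered}
\int_0^\infty dx \, x^{\alpha} \big\|f^{(m)}(x)\big\|_{\mathcal{H}}^2 \geq A(\ell,\alpha) \int_0^\infty dx \, x^{\alpha-2\ell} \big\|f^{(m-\ell)}(x)\big\|_{\mathcal{H}}^2, \\
m \in \mathbb{N}, \; \ell \in \{1, \dots, m\}, \; \alpha \in \mathbb{R}, \; f \in C_0^\infty((0,\infty); \mathcal{H}).
\end{gathered}
\tag{32}
\]

Proof Using the notations in the proof of Lemma A.3, Theorem 2.2 implies,

\[
\begin{aligned}
\int_0^\infty dx \, x^{\alpha} \big\|f^{(m)}(x)\big\|_{\mathcal{H}}^2 &= \int_0^\infty dx \, x^{\alpha} \sum_{k=1}^{\infty} \big| \big( f^{(m)}(x), \varphi_k \big)_{\mathcal{H}} \big|^2 \\
&= \sum_{k=1}^{\infty} \int_0^\infty dx \, x^{\alpha} \big| f_k^{(m)}(x) \big|^2 \\
&\geq A(\ell,\alpha) \sum_{k=1}^{\infty} \int_0^\infty dx \, x^{\alpha-2\ell} \big| f_k^{(m-\ell)}(x) \big|^2 \\
&= A(\ell,\alpha) \sum_{k=1}^{\infty} \int_0^\infty dx \, x^{\alpha-2\ell} \big| \big( f^{(m-\ell)} \big)_k(x) \big|^2 \\
&= A(\ell,\alpha) \int_0^\infty dx \, x^{\alpha-2\ell} \sum_{k=1}^{\infty} \big| \big( f^{(m-\ell)}(x), \varphi_k \big)_{\mathcal{H}} \big|^2 \\
&= A(\ell,\alpha) \int_0^\infty dx \, x^{\alpha-2\ell} \big\|f^{(m-\ell)}(x)\big\|_{\mathcal{H}}^2.
\end{aligned}
\tag{33}
\]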

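Definition 3.2 Let $m \in \mathbb{N}$, $\alpha \in \mathbb{R}$.
(i) Define

\[
\begin{aligned}
\dot{H}^m\big((0,\infty); x^{\alpha} dx; \mathcal{H}\big) = \bigg\{ f : (0,\infty) \to \mathcal{H} \, \bigg| \, & f^{(j)} \in AC_{loc}((0,\infty); \mathcal{H}), \; j = 0, 1, \dots, m-1; \\
& \int_0^\infty dx \, x^{\alpha} \big\|f^{(m)}(x)\big\|_{\mathcal{H}}^2 < \infty \bigg\}.
\end{aligned}
\tag{34}
\]

We also introduce,

\[
|||f|||_{m,\alpha}^2 = \int_0^\infty dx \, x^{\alpha} \big\|f^{(m)}(x)\big\|_{\mathcal{H}}^2, \quad f \in \dot{H}^m\big((0,\infty); x^{\alpha} dx; \mathcal{H}\big). \tag{35}
\]

(ii) Assume $A(m,\alpha) > 0$ and define

\[
\begin{aligned}
\dot{H}_0^m\big((0,\infty); x^{\alpha} dx; \mathcal{H}\big) = \bigg\{ & f \in \dot{H}^m\big((0,\infty); x^{\alpha} dx; \mathcal{H}\big) \, \bigg| \, \text{there exists a Cauchy sequence } \{f_n\}_{n=1}^{\infty} \\
& \text{in } \big( C_0^\infty((0,\infty); \mathcal{H}), |||\cdot|||_{m,\alpha} \big) \text{ such that, for } 0 \leq \ell \leq m, \\
& \text{we have } \lim_{n \to \infty} \int_0^\infty dx \, x^{\alpha-2\ell} \big\| f_n^{(m-\ell)}(x) - f^{(m-\ell)}(x) \big\|_{\mathcal{H}}^2 = 0 \bigg\}
\end{aligned}
\tag{36}
\]

\[
\begin{aligned}
= \bigg\{ & f \in \dot{H}^m\big((0,\infty); x^{\alpha} dx; \mathcal{H}\big) \, \bigg| \, \text{there exists a Cauchy sequence } \{f_n\}_{n=1}^{\infty} \\
& \text{in } \big( C_0^\infty((0,\infty); \mathcal{H}), |||\cdot|||_{m,\alpha} \big) \text{ such that, for all } 0 < a < b < \infty, \\
& \lim_{n \to \infty} \int_a^b dx \, \| f_n(x) - f(x) \|_{\mathcal{H}}^2 = 0 \bigg\}.
\end{aligned}
\tag{37}
\]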

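Remark 3.3 To see that (36) and (37) describe the same space, we first note that
if $\{f_n\}_{n=1}^{\infty} \subseteq \big( C_0^\infty((0,\infty); \mathcal{H}), |||\cdot|||_{m,\alpha} \big)$ and $f \in \dot{H}^m\big((0,\infty); x^{\alpha} dx; \mathcal{H}\big)$
satisfy (36), then, by putting $\ell = m$ in (36), (37) is satisfied. Next, suppose
$f \in \dot{H}^m\big((0,\infty); x^{\alpha} dx; \mathcal{H}\big)$ satisfies (37). By Lemma A.4 there exists a unique
$g \in \dot{H}^m\big((0,\infty); x^{\alpha} dx; \mathcal{H}\big)$ such that for $\ell = 0, 1, \dots, m$,

\[
\lim_{n \to \infty} \int_0^\infty dx \, x^{\alpha-2\ell} \big\| f_n^{(m-\ell)}(x) - g^{(m-\ell)}(x) \big\|_{\mathcal{H}}^2 = 0. \tag{38}
\]

Putting $\ell = m$ in (38) we have,

\[
\lim_{n \to \infty} \int_a^b dx \, x^{\alpha-2m} \| f_n(x) - g(x) \|_{\mathcal{H}}^2 = 0 \tag{39}
\]

for all $0 < a < b < \infty$, hence $f = g$ since (37) is true by assumption. Thus, (38)
implies (36).

Clearly, $C_0^\infty((0,\infty); \mathcal{H}) \subseteq \dot{H}^m\big((0,\infty); x^{\alpha} dx; \mathcal{H}\big)$. Moreover, by Lemma A.4,
$\dot{H}_0^m\big((0,\infty); x^{\alpha} dx; \mathcal{H}\big)$ can be identified with the completion of $C_0^\infty((0,\infty); \mathcal{H})$
with respect to the norm $|||\cdot|||_{m,\alpha}$, that is,

\[
\dot{H}_0^m\big((0,\infty); x^{\alpha} dx; \mathcal{H}\big) = \overline{C_0^\infty((0,\infty); \mathcal{H})}^{\, |||\cdot|||_{m,\alpha}}. \tag{40}
\]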

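Theorem 3.4 Let $m \in \mathbb{N}$, $\alpha \in \mathbb{R}$, and assume $A(m,\alpha) > 0$. Then,

\[
\begin{gathered}
\int_0^\infty dx \, x^{\alpha} \big\|f^{(m)}(x)\big\|_{\mathcal{H}}^2 \geq A(\ell,\alpha) \int_0^\infty dx \, x^{\alpha-2\ell} \big\|f^{(m-\ell)}(x)\big\|_{\mathcal{H}}^2, \\
\ell \in \{1, \dots, m\}, \; f \in \dot{H}_0^m\big((0,\infty); x^{\alpha} dx; \mathcal{H}\big).
\end{gathered}
\tag{41}
\]

Proof Let $f \in \dot{H}_0^m\big((0,\infty); x^{\alpha} dx; \mathcal{H}\big)$ and let $\{f_n\}_{n=1}^{\infty}$ be a Cauchy sequence in
$\big( C_0^\infty((0,\infty); \mathcal{H}), |||\cdot|||_{m,\alpha} \big)$ satisfying (36). Then, for all $\ell \in \{1, \dots, m\}$,

\[
\begin{aligned}
\int_0^\infty dx \, x^{\alpha} \big\|f^{(m)}(x)\big\|_{\mathcal{H}}^2 &= \lim_{n \to \infty} \int_0^\infty dx \, x^{\alpha} \big\|f_n^{(m)}(x)\big\|_{\mathcal{H}}^2 \\
&\geq A(\ell,\alpha) \lim_{n \to \infty} \int_0^\infty dx \, x^{\alpha-2\ell} \big\|f_n^{(m-\ell)}(x)\big\|_{\mathcal{H}}^2 \\
&= A(\ell,\alpha) \int_0^\infty dx \, x^{\alpha-2\ell} \big\|f^{(m-\ell)}(x)\big\|_{\mathcal{H}}^2,
\end{aligned}
\tag{42}
\]

where we have used (36) and Lemma 3.1.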

Next, we turn to the case A(m, α) = 0.
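Definition 3.5 Let $m \in \mathbb{N}$, $\alpha \in \mathbb{R}$.
(i) Define

\[
\begin{aligned}
H^m((0,\infty); x^{\alpha} dx; \mathcal{H}) = \bigg\{ f : (0,\infty) \to \mathcal{H} \, \bigg| \, & f^{(j)} \in AC_{loc}((0,\infty); \mathcal{H}), \; 0 \leq j \leq m-1; \\
& \|f\|_{m,\alpha}^2 = \sum_{j=0}^{m} \int_0^\infty dx \, x^{\alpha} \big\|f^{(j)}(x)\big\|_{\mathcal{H}}^2 < \infty \bigg\}.
\end{aligned}
\tag{43}
\]

(ii) Define

\[
\begin{aligned}
H_0^m((0,\infty); x^{\alpha} dx; \mathcal{H}) = \Big\{ & f \in H^m((0,\infty); x^{\alpha} dx; \mathcal{H}) \, \Big| \, \text{there exists a Cauchy sequence } \{f_n\}_{n=1}^{\infty} \\
& \text{in } \big( C_0^\infty((0,\infty); \mathcal{H}), \|\cdot\|_{m,\alpha} \big) \text{ such that } \lim_{n \to \infty} \|f_n - f\|_{m,\alpha} = 0 \Big\}
\end{aligned}
\tag{44}
\]

\[
\begin{aligned}
= \bigg\{ & f \in H^m((0,\infty); x^{\alpha} dx; \mathcal{H}) \, \bigg| \, \text{there exists a Cauchy sequence } \{f_n\}_{n=1}^{\infty} \text{ in} \\
& \big( C_0^\infty((0,\infty); \mathcal{H}), \|\cdot\|_{m,\alpha} \big) \text{ such that for any } 0 < a < b < \infty \text{ we have} \\
& \lim_{n \to \infty} \int_a^b dx \, \|f_n(x) - f(x)\|_{\mathcal{H}}^2 = 0 \bigg\}.
\end{aligned}
\tag{45}
\]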

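Remark 3.6 To see that (44) and (45) describe the same space, we first note
that if $\{f_n\}_{n=1}^{\infty} \subseteq \big( C_0^\infty((0,\infty); \mathcal{H}), \|\cdot\|_{m,\alpha} \big)$ and $f \in H^m((0,\infty); x^{\alpha} dx; \mathcal{H})$
satisfy (44), then (45) is clearly satisfied. Next, suppose
$f \in H^m((0,\infty); x^{\alpha} dx; \mathcal{H})$ satisfies (45). By Lemma A.6 there is a unique
$g \in H^m((0,\infty); x^{\alpha} dx; \mathcal{H})$ such that

\[
\lim_{n \to \infty} \int_0^\infty dx \, x^{\alpha} \|f_n(x) - g(x)\|_{\mathcal{H}}^2 \leq \lim_{n \to \infty} \|f_n - g\|_{m,\alpha}^2 = 0, \tag{46}
\]

and hence, for all $0 < a < b < \infty$,

\[
\lim_{n \to \infty} \int_a^b dx \, \|f_n(x) - g(x)\|_{\mathcal{H}}^2 = 0. \tag{47}
\]

By the assumption that (45) is true, this gives $f = g$. Thus (46) implies (44).

Clearly $C_0^\infty((0,\infty); \mathcal{H}) \subseteq H^m((0,\infty); x^{\alpha} dx; \mathcal{H})$ and $\|\cdot\|_{m,\alpha}$ is a norm on
$H^m((0,\infty); x^{\alpha} dx; \mathcal{H})$. By Lemma A.6, $H_0^m((0,\infty); x^{\alpha} dx; \mathcal{H})$ can be identified
with the completion of $C_0^\infty((0,\infty); \mathcal{H})$ with respect to $\|\cdot\|_{m,\alpha}$, that is,

\[
H_0^m((0,\infty); x^{\alpha} dx; \mathcal{H}) = \overline{C_0^\infty((0,\infty); \mathcal{H})}^{\, \|\cdot\|_{m,\alpha}}. \tag{48}
\]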

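Theorem 3.7 Let $m \in \mathbb{N}$, $\alpha \in \mathbb{R}$, and assume that $A(m,\alpha) = 0$. Then,

\[
\begin{gathered}
\int_0^\infty dx \, x^{\alpha} \big\|f^{(m)}(x)\big\|_{\mathcal{H}}^2 \geq A(\ell,\alpha) \int_0^\infty dx \, x^{\alpha-2\ell} \big\|f^{(m-\ell)}(x)\big\|_{\mathcal{H}}^2, \\
\ell \in \{1, \dots, m\}, \; f \in H_0^m((0,\infty); x^{\alpha} dx; \mathcal{H}).
\end{gathered}
\tag{49}
\]

Proof Let $f \in H_0^m((0,\infty); x^{\alpha} dx; \mathcal{H})$ and let $\{f_n\}_{n=1}^{\infty}$ be a Cauchy sequence in
$\big( C_0^\infty((0,\infty); \mathcal{H}), \|\cdot\|_{m,\alpha} \big)$ satisfying (44). Let $\ell \in \{1, \dots, m\}$. Since

\[
\lim_{n \to \infty} \int_0^\infty dx \, x^{\alpha} \big\| f_n^{(m-\ell)}(x) - f^{(m-\ell)}(x) \big\|_{\mathcal{H}}^2 = 0, \tag{50}
\]

there exists a subsequence $\{f_{n_k}\}_{k=1}^{\infty}$ of $\{f_n\}_{n=1}^{\infty}$ such that

\[
\lim_{k \to \infty} f_{n_k}^{(m-\ell)}(x) = f^{(m-\ell)}(x) \ \text{ for a.e. } x \in (0,\infty). \tag{51}
\]

Hence, by Fatou’s lemma,

\[
\begin{aligned}
A(\ell,\alpha) \int_0^\infty dx \, x^{\alpha-2\ell} \big\|f^{(m-\ell)}(x)\big\|_{\mathcal{H}}^2 &\leq \liminf_{k \to \infty} A(\ell,\alpha) \int_0^\infty dx \, x^{\alpha-2\ell} \big\|f_{n_k}^{(m-\ell)}(x)\big\|_{\mathcal{H}}^2 \\
&\leq \liminf_{k \to \infty} \int_0^\infty dx \, x^{\alpha} \big\|f_{n_k}^{(m)}(x)\big\|_{\mathcal{H}}^2 \quad \text{(by Lemma 3.1)} \\
&= \lim_{k \to \infty} \int_0^\infty dx \, x^{\alpha} \big\|f_{n_k}^{(m)}(x)\big\|_{\mathcal{H}}^2 \\
&= \int_0^\infty dx \, x^{\alpha} \big\|f^{(m)}(x)\big\|_{\mathcal{H}}^2 \quad \text{(by (44))}.
\end{aligned}
\tag{52}
\]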

Next, we consider bounded intervals (0, ρ), 0 < ρ < ∞, and recall a simplified
version of [7, Theorem 3.1 (iii)].
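Lemma 3.8 Let $\gamma \in (0,\infty)$, $\rho \in (0,\gamma)$, and set

\[
B(m,\alpha) = 4^{-m} \sum_{k=1}^{m} \prod_{\substack{j=1 \\ j \neq k}}^{m} (2j-1-\alpha)^2 > 0, \quad m \in \mathbb{N}, \; \alpha \in \mathbb{R}. \tag{53}
\]

Then,

\[
\begin{gathered}
\int_0^\rho dx \, x^{\alpha} \big|f^{(m)}(x)\big|^2 \geq B(\ell,\alpha) \int_0^\rho dx \, x^{\alpha-2\ell} \big|f^{(m-\ell)}(x)\big|^2 [\ln(\gamma/x)]^{-2}, \\
m \in \mathbb{N}, \; \ell \in \{1, \dots, m\}, \; \alpha \in \mathbb{R}, \; f \in C_0^\infty((0,\rho)).
\end{gathered}
\tag{54}
\]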
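The point of Lemma 3.8 is that $B(m,\alpha)$ stays strictly positive even where $A(m,\alpha)$ vanishes, since at most one factor $(2j-1-\alpha)$ can be zero for a given $\alpha$. The minimal sketch below evaluates (53) and (3) side by side; the function names `log_refinement_constant` and `birman_constant` are hypothetical and used only for this illustration.

```python
from fractions import Fraction


def birman_constant(m, alpha):
    """A(m, alpha) = 4^(-m) * prod_{j=1}^m (2j - 1 - alpha)^2, see (3)."""
    alpha = Fraction(alpha)
    result = Fraction(1, 4 ** m)
    for j in range(1, m + 1):
        result *= (2 * j - 1 - alpha) ** 2
    return result


def log_refinement_constant(m, alpha):
    """B(m, alpha) = 4^(-m) * sum_{k=1}^m prod_{j != k} (2j - 1 - alpha)^2, see (53)."""
    alpha = Fraction(alpha)
    total = Fraction(0)
    for k in range(1, m + 1):
        product = Fraction(1)
        for j in range(1, m + 1):
            if j != k:
                product *= (2 * j - 1 - alpha) ** 2
        total += product
    return total / 4 ** m


for m, alpha in [(1, 1), (2, 1), (2, 3), (3, 5)]:
    print(m, alpha, birman_constant(m, alpha), log_refinement_constant(m, alpha))
```

In each of the sampled cases $\alpha$ is an odd integer in $\{1, \dots, 2m-1\}$, so the first constant is zero while the second is positive, which is why the logarithmically weighted inequality (54) remains informative there.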

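Lemma 3.9 Let $\gamma \in (0,\infty)$, $\rho \in (0,\gamma)$. Then,

\[
\begin{gathered}
\int_0^\rho dx \, x^{\alpha} \big\|f^{(m)}(x)\big\|_{\mathcal{H}}^2 \geq B(\ell,\alpha) \int_0^\rho dx \, x^{\alpha-2\ell} \big\|f^{(m-\ell)}(x)\big\|_{\mathcal{H}}^2 [\ln(\gamma/x)]^{-2}, \\
m \in \mathbb{N}, \; \ell \in \{1, \dots, m\}, \; \alpha \in \mathbb{R}, \; f \in C_0^\infty((0,\rho); \mathcal{H}).
\end{gathered}
\tag{55}
\]

Proof Let $\{\varphi_k\}_{k=1}^{\infty}$ be an orthonormal basis of $\mathcal{H}$. For $k \in \mathbb{N}$, let $f_k \in C_0^\infty((0,\rho))$
be defined as in (A.4) so that (A.5) holds. Then

\[
\begin{aligned}
\int_0^\rho dx \, x^{\alpha} \big\|f^{(m)}(x)\big\|_{\mathcal{H}}^2 &= \int_0^\rho dx \, x^{\alpha} \sum_{k=1}^{\infty} \big| \big( f^{(m)}(x), \varphi_k \big)_{\mathcal{H}} \big|^2 \\
&= \sum_{k=1}^{\infty} \int_0^\rho dx \, x^{\alpha} \big| f_k^{(m)}(x) \big|^2 \\
&\geq B(\ell,\alpha) \sum_{k=1}^{\infty} \int_0^\rho dx \, x^{\alpha-2\ell} \big| f_k^{(m-\ell)}(x) \big|^2 [\ln(\gamma/x)]^{-2} \\
&= B(\ell,\alpha) \sum_{k=1}^{\infty} \int_0^\rho dx \, x^{\alpha-2\ell} \big| \big( f^{(m-\ell)} \big)_k(x) \big|^2 [\ln(\gamma/x)]^{-2} \\
&= B(\ell,\alpha) \int_0^\rho dx \, x^{\alpha-2\ell} \sum_{k=1}^{\infty} \big| \big( f^{(m-\ell)}(x), \varphi_k \big)_{\mathcal{H}} \big|^2 [\ln(\gamma/x)]^{-2} \\
&= B(\ell,\alpha) \int_0^\rho dx \, x^{\alpha-2\ell} \big\|f^{(m-\ell)}(x)\big\|_{\mathcal{H}}^2 [\ln(\gamma/x)]^{-2}.
\end{aligned}
\tag{56}
\]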

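Definition 3.10 Let $m \in \mathbb{N}$, $\rho \in (0,\infty)$, $\alpha \in \mathbb{R}$.
(i) Define

\[
\begin{aligned}
H_0^m((0,\rho); x^{\alpha} dx; \mathcal{H}) = \bigg\{ & f : (0,\rho) \to \mathcal{H} \, \bigg| \, f^{(j)} \in AC_{loc}((0,\rho)), \; 0 \leq j \leq m-1; \\
& \text{there exists a Cauchy sequence } \{f_n\}_{n=1}^{\infty} \text{ in } \big( C_0^\infty((0,\rho); \mathcal{H}), \|\cdot\|_{m,\alpha} \big) \\
& \text{such that } \lim_{n \to \infty} \int_0^\rho dx \, x^{\alpha} \big\| f_n^{(k)}(x) - f^{(k)}(x) \big\|_{\mathcal{H}}^2 = 0, \; 0 \leq k \leq m \bigg\}
\end{aligned}
\tag{57}
\]

\[
\begin{aligned}
= \bigg\{ & f : (0,\rho) \to \mathcal{H} \, \bigg| \, f^{(j)} \in AC_{loc}((0,\rho)), \; 0 \leq j \leq m-1; \text{ there exists a} \\
& \text{Cauchy sequence } \{f_n\}_{n=1}^{\infty} \text{ in } \big( C_0^\infty((0,\rho); \mathcal{H}), \|\cdot\|_{m,\alpha} \big) \text{ such that for any} \\
& 0 < a < b < \rho, \text{ we have } \lim_{n \to \infty} \int_a^b dx \, \|f_n(x) - f(x)\|_{\mathcal{H}}^2 = 0 \bigg\}.
\end{aligned}
\tag{58}
\]

(ii) Define

\[
\begin{aligned}
\dot{H}_0^m\big((0,\rho); x^{\alpha} dx; \mathcal{H}\big) = \bigg\{ & f : (0,\rho) \to \mathcal{H} \, \bigg| \, f^{(j)} \in AC_{loc}((0,\rho)), \; 0 \leq j \leq m-1; \\
& \text{there exists a Cauchy sequence } \{f_n\}_{n=1}^{\infty} \text{ in } \big( C_0^\infty((0,\rho); \mathcal{H}), |||\cdot|||_{m,\alpha} \big) \\
& \text{such that } \lim_{n \to \infty} \int_0^\rho dx \, x^{\alpha} \big\| f_n^{(m)}(x) - f^{(m)}(x) \big\|_{\mathcal{H}}^2 = 0 \bigg\}
\end{aligned}
\tag{59}
\]

\[
\begin{aligned}
= \bigg\{ & f : (0,\rho) \to \mathcal{H} \, \bigg| \, f^{(j)} \in AC_{loc}((0,\rho)), \; 0 \leq j \leq m-1; \text{ there exists a} \\
& \text{Cauchy sequence } \{f_n\}_{n=1}^{\infty} \text{ in } \big( C_0^\infty((0,\rho); \mathcal{H}), |||\cdot|||_{m,\alpha} \big) \text{ such that for} \\
& \text{any } 0 < a < b < \rho, \text{ we have } \lim_{n \to \infty} \int_a^b dx \, \|f_n(x) - f(x)\|_{\mathcal{H}}^2 = 0 \bigg\}.
\end{aligned}
\tag{60}
\]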

Remark 3.11
(i) An argument similar to that from Remark 3.6 shows that the conditions (57)
and (58) describe the same space.
(ii) Lemma A.7 together with (i) show that the conditions (59) and (60) describe
the same space.

By Lemma A.7 one has

\[
H_0^m((0,\rho); x^{\alpha} dx; \mathcal{H}) = \dot{H}_0^m\big((0,\rho); x^{\alpha} dx; \mathcal{H}\big), \quad m \in \mathbb{N}, \; \alpha \in \mathbb{R}, \; \rho \in (0,\infty). \tag{61}
\]

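Theorem 3.12 Let $\rho \in (0,\infty)$. Then,

\[
\begin{gathered}
\int_0^\rho dx \, x^{\alpha} \big\|f^{(m)}(x)\big\|_{\mathcal{H}}^2 \geq A(\ell,\alpha) \int_0^\rho dx \, x^{\alpha-2\ell} \big\|f^{(m-\ell)}(x)\big\|_{\mathcal{H}}^2, \\
m \in \mathbb{N}, \; \ell \in \{1, \dots, m\}, \; \alpha \in \mathbb{R}, \; f \in H_0^m((0,\rho); x^{\alpha} dx; \mathcal{H}).
\end{gathered}
\tag{62}
\]

Proof Let $f \in H_0^m((0,\rho); x^{\alpha} dx; \mathcal{H})$ and let $\{f_n\}_{n=1}^{\infty}$ be a Cauchy sequence in
$\big( C_0^\infty((0,\rho); \mathcal{H}), \|\cdot\|_{m,\alpha} \big)$ satisfying (57). Let $\ell \in \{1, \dots, m\}$. Since

\[
\lim_{n \to \infty} \int_0^\rho dx \, x^{\alpha} \big\| f_n^{(m-\ell)}(x) - f^{(m-\ell)}(x) \big\|_{\mathcal{H}}^2 = 0, \tag{63}
\]

there exists a subsequence $\{f_{n_k}\}_{k=1}^{\infty}$ of $\{f_n\}_{n=1}^{\infty}$ such that

\[
\lim_{k \to \infty} f_{n_k}^{(m-\ell)}(x) = f^{(m-\ell)}(x) \ \text{ for a.e. } x \in (0,\rho). \tag{64}
\]

Hence, by Fatou’s lemma,

\[
\begin{aligned}
A(\ell,\alpha) \int_0^\rho dx \, x^{\alpha-2\ell} \big\|f^{(m-\ell)}(x)\big\|_{\mathcal{H}}^2 &\leq \liminf_{k \to \infty} A(\ell,\alpha) \int_0^\rho dx \, x^{\alpha-2\ell} \big\|f_{n_k}^{(m-\ell)}(x)\big\|_{\mathcal{H}}^2 \\
&\leq \liminf_{k \to \infty} \int_0^\rho dx \, x^{\alpha} \big\|f_{n_k}^{(m)}(x)\big\|_{\mathcal{H}}^2 \quad \text{(by Lemma 3.1)} \\
&= \lim_{k \to \infty} \int_0^\rho dx \, x^{\alpha} \big\|f_{n_k}^{(m)}(x)\big\|_{\mathcal{H}}^2 \\
&= \int_0^\rho dx \, x^{\alpha} \big\|f^{(m)}(x)\big\|_{\mathcal{H}}^2 \quad \text{(by (57))}.
\end{aligned}
\tag{65}
\]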
We now establish analogous results on the exterior domain (ρ, ∞), ρ ∈ (0, ∞).
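Lemma 3.13 For $m \in \mathbb{N}$, $\rho \in (0,\infty)$, $\alpha \in \mathbb{R}$, let

\[
\begin{aligned}
\dot{H}^m((\rho,\infty); x^{\alpha} dx; \mathcal{H}) = \Big\{ & f : (\rho,\infty) \to \mathcal{H} \, \Big| \, f^{(j)} \in AC_{loc}((\rho,\infty); \mathcal{H}) \text{ and} \\
& \lim_{x \downarrow \rho} f^{(j)}(x) = 0, \; j = 0, 1, \dots, m-1; \; f^{(m)} \in L^2((\rho,\infty); x^{\alpha} dx; \mathcal{H}) \Big\}.
\end{aligned}
\tag{66}
\]

For $f, g \in \dot{H}^m((\rho,\infty); x^{\alpha} dx; \mathcal{H})$ let

\[
\langle f, g \rangle_{m,\alpha} = \int_\rho^\infty dx \, x^{\alpha} \big( f^{(m)}(x), g^{(m)}(x) \big)_{\mathcal{H}}. \tag{67}
\]

Then $\langle \cdot, \cdot \rangle_{m,\alpha}$ is an inner product on $\dot{H}^m((\rho,\infty); x^{\alpha} dx; \mathcal{H})$. In fact,
$\big( \dot{H}^m((\rho,\infty); x^{\alpha} dx; \mathcal{H}), \langle \cdot, \cdot \rangle_{m,\alpha} \big)$ is a Hilbert space.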
Proof The proof is analogous to the argument from [7, Proposition B.1].

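Definition 3.14 For $m \in \mathbb{N}$, $\rho \in (0,\infty)$, $\alpha \in \mathbb{R}$ let $\dot{H}_0^m((\rho,\infty); x^{\alpha} dx; \mathcal{H})$ denote
the closure of $C_0^\infty((\rho,\infty); \mathcal{H})$ in the space $\big( \dot{H}^m((\rho,\infty); x^{\alpha} dx; \mathcal{H}), \langle \cdot, \cdot \rangle_{m,\alpha} \big)$.

Lemma 3.15 Let $f \in \dot{H}_0^m((\rho,\infty); x^{\alpha} dx; \mathcal{H})$. Then there is a sequence $\{f_n\}_{n=1}^{\infty} \subset C_0^\infty((\rho,\infty); \mathcal{H})$ such that

\[
\lim_{n \to \infty} \int_\rho^\infty dx \, x^{\alpha} \big\| f_n^{(m)}(x) - f^{(m)}(x) \big\|_{\mathcal{H}}^2 = 0 \tag{68}
\]

and, for $k = 0, 1, \dots, m$, one has

\[
\lim_{n \to \infty} f_n^{(k)}(x) = f^{(k)}(x) \ \text{ for a.e. } x \in (\rho,\infty). \tag{69}
\]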

Proof The proof is analogous to the argument from [7, Corollary B.2].

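Theorem 3.16 Let $\rho \in (0,\infty)$. Then,

\[
\begin{gathered}
\int_\rho^\infty dx \, x^{\alpha} \big\|f^{(m)}(x)\big\|_{\mathcal{H}}^2 \geq A(\ell,\alpha) \int_\rho^\infty dx \, x^{\alpha-2\ell} \big\|f^{(m-\ell)}(x)\big\|_{\mathcal{H}}^2, \\
m \in \mathbb{N}, \; \ell \in \{1, \dots, m\}, \; \alpha \in \mathbb{R}, \; f \in \dot{H}_0^m((\rho,\infty); x^{\alpha} dx; \mathcal{H}).
\end{gathered}
\tag{70}
\]

Proof Let $\{f_n\}_{n=1}^{\infty} \subseteq C_0^\infty((\rho,\infty); \mathcal{H})$ be the sequence which satisfies (68)
and (69). Then, by Fatou’s lemma,

\[
\begin{aligned}
A(\ell,\alpha) \int_\rho^\infty dx \, x^{\alpha-2\ell} \big\|f^{(m-\ell)}(x)\big\|_{\mathcal{H}}^2 &\leq \liminf_{n \to \infty} A(\ell,\alpha) \int_\rho^\infty dx \, x^{\alpha-2\ell} \big\|f_n^{(m-\ell)}(x)\big\|_{\mathcal{H}}^2 \\
&\leq \liminf_{n \to \infty} \int_\rho^\infty dx \, x^{\alpha} \big\|f_n^{(m)}(x)\big\|_{\mathcal{H}}^2 \quad \text{(by Lemma 3.1)} \\
&= \lim_{n \to \infty} \int_\rho^\infty dx \, x^{\alpha} \big\|f_n^{(m)}(x)\big\|_{\mathcal{H}}^2 \\
&= \int_\rho^\infty dx \, x^{\alpha} \big\|f^{(m)}(x)\big\|_{\mathcal{H}}^2.
\end{aligned}
\tag{71}
\]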

Optimality of A(ℓ, α) and strictness of the inequalities in this section follow
from Theorem 2.2.

Appendix A: Background for the Vector-Valued Case

For the remainder of this appendix, H denotes a separable complex Hilbert space.
Definition A.1
(i) Let $a, b \in \mathbb{R}$, $a < b$. A function $f : [a,b] \to \mathcal{H}$ is said to be absolutely
continuous, denoted by $f \in AC([a,b]; \mathcal{H})$, if for every $\varepsilon > 0$ there exists
$\delta > 0$ such that for every finite collection $\{(a_j, b_j)\}_{j=1}^{N}$ of disjoint subintervals
in $[a,b]$ with $\sum_{j=1}^{N} (b_j - a_j) < \delta$, one has

\[
\sum_{j=1}^{N} \| f(b_j) - f(a_j) \|_{\mathcal{H}} < \varepsilon. \tag{A.1}
\]

(ii) A function $f : (c,d) \to \mathcal{H}$, $(c,d) \subseteq \mathbb{R}$, is said to be locally absolutely
continuous on $(c,d)$, denoted by $f \in AC_{loc}((c,d); \mathcal{H})$, if it is absolutely
continuous on every compact subinterval $[a,b] \subset (c,d)$.
Lemma A.2 ([1, Propositions 1.2.2–1.2.4, Theorem 1.2.6]) A map $f : (c,d) \to \mathcal{H}$,
$(c,d) \subseteq \mathbb{R}$, is locally absolutely continuous, that is, $f \in AC_{loc}((c,d); \mathcal{H})$, if
and only if there exists a locally Bochner integrable $g : (c,d) \to \mathcal{H}$ and $x_0 \in (c,d)$
such that

\[
f(x) = f(x_0) + \int_{x_0}^{x} dt \, g(t), \quad x \in (c,d). \tag{A.2}
\]

If (A.2) is satisfied, then

\[
f'(x) = g(x) \ \text{ for a.e. } x \in (c,d). \tag{A.3}
\]

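Lemma A.3 Let $m \in \mathbb{N}$, $\alpha \in \mathbb{R}$, and assume that $A(m,\alpha) > 0$. Then $|||\cdot|||_{m,\alpha}$,
defined in (35), is a norm on $C_0^\infty((0,\infty); \mathcal{H})$.
Proof We only need to show that if $f \in C_0^\infty((0,\infty); \mathcal{H})$ and $|||f|||_{m,\alpha} = 0$, then
$f = 0$. Let $\{\varphi_k\}_{k=1}^{\infty}$ be an orthonormal basis of $\mathcal{H}$. For $f \in C_0^\infty((0,\infty); \mathcal{H})$ and
$k \in \mathbb{N}$ we write

\[
f_k(x) = (f(x), \varphi_k)_{\mathcal{H}}, \quad x \in (0,\infty). \tag{A.4}
\]

Then, for all $k, j \in \mathbb{N}$, we have

\[
\big( f^{(j)} \big)_k(x) = f_k^{(j)}(x), \quad x \in (0,\infty). \tag{A.5}
\]

Suppose $f \in C_0^\infty((0,\infty); \mathcal{H})$ and $|||f|||_{m,\alpha} = 0$. Then, for all $k \in \mathbb{N}$, $f_k \in C_0^\infty((0,\infty))$
and, applying Birman’s inequalities to $f_k$, we have

\[
\begin{aligned}
0 = |||f|||_{m,\alpha}^2 &= \int_0^\infty dx \, x^{\alpha} \big\|f^{(m)}(x)\big\|_{\mathcal{H}}^2 \geq \int_0^\infty dx \, x^{\alpha} \big| \big( f^{(m)} \big)_k(x) \big|^2 \\
&= \int_0^\infty dx \, x^{\alpha} \big| f_k^{(m)}(x) \big|^2 \geq A(m,\alpha) \int_0^\infty dx \, x^{\alpha-2m} |f_k(x)|^2.
\end{aligned}
\tag{A.6}
\]

Thus, since $A(m,\alpha) > 0$,

\[
f_k(x) = 0, \quad x \in (0,\infty), \; k \in \mathbb{N}. \tag{A.7}
\]

Hence, $f = 0$.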

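Lemma A.4 Let $m \in \mathbb{N}$ and $\alpha \in \mathbb{R}$. Suppose that $A(m,\alpha) > 0$ and let $\{f_n\}_{n=1}^\infty$
be a Cauchy sequence in $(C_0^\infty((0,\infty); H), |||\cdot|||_{m,\alpha})$. Then there exists a unique
$f \in \dot{H}^m((0,\infty); x^\alpha dx; H)$, defined in Definition 3.2, such that, for all $0 \le \ell \le m$,

$$\lim_{n\to\infty} \int_0^\infty dx\, x^{\alpha-2\ell}\,\big\|f_n^{(m-\ell)}(x) - f^{(m-\ell)}(x)\big\|_H^2 = 0. \tag{A.8}$$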

Proof Since the sequence $\{f_n\}_{n=1}^\infty$ is Cauchy in $(C_0^\infty((0,\infty); H), |||\cdot|||_{m,\alpha})$, the
sequence $\{f_n^{(m)}\}_{n=1}^\infty$ is Cauchy in $L^2((0,\infty); x^\alpha dx; H)$, hence there exists
$g_m \in L^2((0,\infty); x^\alpha dx; H)$ such that

$$\lim_{n\to\infty} \int_0^\infty dx\, x^{\alpha}\,\big\|f_n^{(m)}(x) - g_m(x)\big\|_H^2 = 0. \tag{A.9}$$

Since $A(m,\alpha) > 0$, $A(\ell,\alpha) > 0$ for all $1 \le \ell \le m-1$. Therefore, for $0 \le \ell \le m$
and $n_1, n_2 \in \mathbb{N}$, Lemma 3.1 implies

$$A(\ell,\alpha)\int_0^\infty dx\, x^{\alpha-2\ell}\,\big\|f_{n_1}^{(m-\ell)}(x) - f_{n_2}^{(m-\ell)}(x)\big\|_H^2 \le \int_0^\infty dx\, x^{\alpha}\,\big\|f_{n_1}^{(m)}(x) - f_{n_2}^{(m)}(x)\big\|_H^2, \tag{A.10}$$

hence $\{f_n^{(m-\ell)}\}_{n=1}^\infty$ is a Cauchy sequence in $L^2((0,\infty); x^{\alpha-2\ell}dx; H)$. Thus there
exists $g_{m-\ell} \in L^2((0,\infty); x^{\alpha-2\ell}dx; H)$ such that, for $0 \le \ell \le m$,

$$\lim_{n\to\infty} \int_0^\infty dx\, x^{\alpha-2\ell}\,\big\|f_n^{(m-\ell)}(x) - g_{m-\ell}(x)\big\|_H^2 = 0. \tag{A.11}$$

To complete the proof it remains to show

$$g_0^{(j)}(x) = g_j(x) \ \text{ for a.e. } x > 0, \quad 1 \le j \le m. \tag{A.12}$$

By (A.11) there exist $K \subset (0,\infty)$ and a subsequence $\{f_{n_k}\}_{k=1}^\infty$ of $\{f_n\}_{n=1}^\infty$ such that
$(0,\infty) \setminus K$ has zero Lebesgue measure and, for $0 \le j \le m$,

$$\lim_{k\to\infty} f_{n_k}^{(j)}(x) = g_j(x), \quad x \in K. \tag{A.13}$$
k→∞

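Fix $a \in K$. Then, for all $x \in K$,

$$
\begin{aligned}
g_0(x) &= \lim_{k\to\infty} f_{n_k}(x) \\
&= \lim_{k\to\infty} \bigg[ f_{n_k}(a) + \int_a^x dt\, \{f_{n_k}'(t) - g_1(t)\} + \int_a^x dt\, g_1(t) \bigg],
\end{aligned}
\tag{A.14}
$$

and, by (A.11) again,

$$
\begin{aligned}
\bigg\| \int_a^x dt\, \{f_{n_k}'(t) - g_1(t)\} \bigg\|_H
&\le \bigg| \int_a^x dt\, \|f_{n_k}'(t) - g_1(t)\|_H \bigg| \\
&\le \bigg| \int_a^x dt\, \|f_{n_k}'(t) - g_1(t)\|_H^2 \bigg|^{1/2} |x - a|^{1/2} \\
&= \bigg| \int_a^x dt\, t^{-[\alpha-2(m-1)]}\, t^{\alpha-2(m-1)}\, \|f_{n_k}'(t) - g_1(t)\|_H^2 \bigg|^{1/2} |x - a|^{1/2} \\
&\le \big[ \max\big\{ a^{-[\alpha-2(m-1)]}, x^{-[\alpha-2(m-1)]} \big\} \big]^{1/2} |x - a|^{1/2} \\
&\quad \times \bigg[ \int_0^\infty dt\, t^{\alpha-2(m-1)}\, \|f_{n_k}'(t) - g_1(t)\|_H^2 \bigg]^{1/2}
\underset{k\to\infty}{\longrightarrow} 0 .
\end{aligned}
\tag{A.15}
$$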

Thus, by (A.14) and (A.15),

$$g_0(x) = g_0(a) + \int_a^x dt\, g_1(t), \tag{A.16}$$

therefore $g_0$ is locally absolutely continuous on $(0,\infty)$ and

$$g_0'(x) = g_1(x) \ \text{ for a.e. } x \in (0,\infty). \tag{A.17}$$

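Similarly, for $0 \le j \le m-1$,

$$
\begin{aligned}
g_j(x) &= \lim_{k\to\infty} f_{n_k}^{(j)}(x) \quad (x \in K) \\
&= \lim_{k\to\infty} \bigg[ f_{n_k}^{(j)}(a) + \int_a^x dt\, \{f_{n_k}^{(j+1)}(t) - g_{j+1}(t)\} + \int_a^x dt\, g_{j+1}(t) \bigg],
\end{aligned}
\tag{A.18}
$$

and, by (A.11),

$$
\begin{aligned}
\bigg\| \int_a^x dt\, \big\{ f_{n_k}^{(j+1)}(t) - g_{j+1}(t) \big\} \bigg\|_H
&\le \bigg| \int_a^x dt\, \big\| f_{n_k}^{(j+1)}(t) - g_{j+1}(t) \big\|_H \bigg| \\
&\le \bigg| \int_a^x dt\, \big\| f_{n_k}^{(j+1)}(t) - g_{j+1}(t) \big\|_H^2 \bigg|^{1/2} |x - a|^{1/2} \\
&= \bigg| \int_a^x dt\, t^{-[\alpha-2(m-j-1)]}\, t^{\alpha-2(m-j-1)}\, \big\| f_{n_k}^{(j+1)}(t) - g_{j+1}(t) \big\|_H^2 \bigg|^{1/2} |x - a|^{1/2} \\
&\le \big[ \max\big\{ a^{-[\alpha-2(m-j-1)]}, x^{-[\alpha-2(m-j-1)]} \big\} \big]^{1/2} |x - a|^{1/2} \\
&\quad \times \bigg[ \int_0^\infty dt\, t^{\alpha-2(m-j-1)}\, \big\| f_{n_k}^{(j+1)}(t) - g_{j+1}(t) \big\|_H^2 \bigg]^{1/2}
\underset{k\to\infty}{\longrightarrow} 0 .
\end{aligned}
\tag{A.19}
$$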
Hence, by (A.18) and (A.19), for $0 \le j \le m-1$,

$$g_j(x) = g_j(a) + \int_a^x dt\, g_{j+1}(t), \quad x \in K, \tag{A.20}$$

thus $g_j$ is locally absolutely continuous on $(0,\infty)$ and

$$g_j'(x) = g_{j+1}(x) \ \text{ for a.e. } x \in (0,\infty). \tag{A.21}$$

Putting $f = g_0$, we have $f \in \dot{H}^m((0,\infty); x^\alpha dx; H)$ by (A.20) and (A.21), and (A.8)
follows from (A.11).
The uniqueness of $f$ follows from (A.8) with $\ell = m$.

Remark A.5 The condition (36) or (37) in Definition 3.2 (ii) in the context
of $\dot{H}_0^m((0,\infty); x^\alpha dx; H)$ is necessary to ensure that the representation of an
element in the completion of $(C_0^\infty((0,\infty); H), |||\cdot|||_{m,\alpha})$ by a function in
$\dot{H}^m((0,\infty); x^\alpha dx; H)$ is unique.
To illustrate this point, consider the following example with $m = 1$, $\alpha = 0$ and
$H = \mathbb{C}$: Let $g \in C_0^\infty((0,\infty))$ and put

$$
\begin{aligned}
f_n(x) &= g(x), \quad x \in (0,\infty), \ n \in \mathbb{N}, \\
F_j(x) &= j + g(x), \quad x \in (0,\infty), \ j \in \mathbb{N}.
\end{aligned}
\tag{A.22}
$$

Then $\{f_n\}_{n=1}^\infty$ is a Cauchy sequence in $(C_0^\infty((0,\infty)), |||\cdot|||_{m,\alpha})$ and, for any $j \in \mathbb{N}$,

$$|||f_n - F_j|||_{1,0}^2 = \int_0^\infty dx\, |g'(x) - g'(x)|^2 = 0, \tag{A.23}$$

but $F_j \in \dot{H}^1((0,\infty); dx; \mathbb{C})$ for all $j \in \mathbb{N}$ and $F_j \ne F_k$ for $j \ne k$.
This kind of "non-uniqueness" phenomenon is due to $|||\cdot|||_{m,\alpha}$ not being a norm
on $\dot{H}^m((0,\infty); x^\alpha dx; H)$. $\diamond$
Next, we turn to the case A(m, α) = 0.
Lemma A.6 Let $m \in \mathbb{N}$, $\alpha \in \mathbb{R}$. With the notation established in Definition 3.5 (i),
let $\{f_n\}_{n=1}^\infty$ be a Cauchy sequence in $(C_0^\infty((0,\infty); H), \|\cdot\|_{m,\alpha})$. Then there exists
a unique $f \in H^m((0,\infty); x^\alpha dx; H)$ such that

$$\lim_{n\to\infty} \|f_n - f\|_{m,\alpha} = 0. \tag{A.24}$$

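Proof Since $\{f_n\}_{n=1}^\infty$ is a Cauchy sequence in $(C_0^\infty((0,\infty); H), \|\cdot\|_{m,\alpha})$, $\{f_n^{(j)}\}_{n=1}^\infty$
is a Cauchy sequence in $L^2((0,\infty); x^\alpha dx; H)$ for $0 \le j \le m$, therefore there exists
$g_j \in L^2((0,\infty); x^\alpha dx; H)$ such that

$$\lim_{n\to\infty} \int_0^\infty dx\, x^{\alpha}\,\big\|f_n^{(j)}(x) - g_j(x)\big\|_H^2 = 0. \tag{A.25}$$

We now prove that

$$g_0^{(j)}(x) = g_j(x) \ \text{ for a.e. } x \in (0,\infty), \quad 1 \le j \le m. \tag{A.26}$$

By (A.25) there exist $K \subseteq (0,\infty)$ and a subsequence $\{f_{n_k}\}_{k=1}^\infty$ of $\{f_n\}_{n=1}^\infty$ such
that $(0,\infty) \setminus K$ has zero Lebesgue measure and that, for $0 \le j \le m$,

$$\lim_{k\to\infty} f_{n_k}^{(j)}(x) = g_j(x), \quad x \in K. \tag{A.27}$$

Fix $a \in K$. Then, for $0 \le j \le m-1$,

$$
\begin{aligned}
g_j(x) &= \lim_{k\to\infty} f_{n_k}^{(j)}(x) \quad (x \in K) \\
&= \lim_{k\to\infty} \bigg[ f_{n_k}^{(j)}(a) + \int_a^x dt\, \{f_{n_k}^{(j+1)}(t) - g_{j+1}(t)\} + \int_a^x dt\, g_{j+1}(t) \bigg],
\end{aligned}
\tag{A.28}
$$

and, by (A.25),

$$
\begin{aligned}
\bigg\| \int_a^x dt\, \big\{ f_{n_k}^{(j+1)}(t) - g_{j+1}(t) \big\} \bigg\|_H
&\le \bigg| \int_a^x dt\, \big\| f_{n_k}^{(j+1)}(t) - g_{j+1}(t) \big\|_H \bigg| \\
&\le \bigg| \int_a^x dt\, \big\| f_{n_k}^{(j+1)}(t) - g_{j+1}(t) \big\|_H^2 \bigg|^{1/2} |x - a|^{1/2} \\
&= \bigg| \int_a^x dt\, t^{-\alpha}\, t^{\alpha}\, \big\| f_{n_k}^{(j+1)}(t) - g_{j+1}(t) \big\|_H^2 \bigg|^{1/2} |x - a|^{1/2} \\
&\le \big[ \max\{ a^{-\alpha}, x^{-\alpha} \} \big]^{1/2} \bigg[ \int_0^\infty dt\, t^{\alpha}\, \big\| f_{n_k}^{(j+1)}(t) - g_{j+1}(t) \big\|_H^2 \bigg]^{1/2} |x - a|^{1/2}
\underset{k\to\infty}{\longrightarrow} 0 .
\end{aligned}
\tag{A.29}
$$

Hence, by (A.28) and (A.29), for $0 \le j \le m-1$,

$$g_j(x) = g_j(a) + \int_a^x dt\, g_{j+1}(t), \quad x \in K, \tag{A.30}$$

so $g_j$ is locally absolutely continuous on $(0,\infty)$ and

$$g_j'(x) = g_{j+1}(x) \ \text{ for a.e. } x \in (0,\infty). \tag{A.31}$$

Putting $f = g_0$, we have $f \in H^m((0,\infty); x^\alpha dx; H)$ by (A.30) and (A.31), and (A.24)
follows from (A.25).
Finally, uniqueness of $f$ is a consequence of $\|\cdot\|_{m,\alpha}$ being a norm on the space
$H^m((0,\infty); x^\alpha dx; H)$.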

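Lemma A.7 Let $m \in \mathbb{N}$, $\alpha \in \mathbb{R}$, $\rho \in (0,\infty)$. Then

$$|||f|||_{m,\alpha} = \bigg( \int_0^\rho dx\, x^{\alpha}\,\big\|f^{(m)}(x)\big\|_H^2 \bigg)^{1/2} \tag{A.32}$$

and

$$\|f\|_{m,\alpha} = \sum_{k=0}^{m} \bigg( \int_0^\rho dx\, x^{\alpha}\,\big\|f^{(k)}(x)\big\|_H^2 \bigg)^{1/2} \tag{A.33}$$

are equivalent norms on $C_0^\infty((0,\rho); H)$.

Proof It suffices to show that, for $0 \le k \le m-1$, there exists a $C_k = C_k(m,\alpha,\rho) > 0$ such that

$$\int_0^\rho dx\, x^{\alpha}\,\big\|f^{(k)}(x)\big\|_H^2 \le C_k \int_0^\rho dx\, x^{\alpha}\,\big\|f^{(m)}(x)\big\|_H^2, \quad f \in C_0^\infty((0,\rho); H). \tag{A.34}$$

Let $\gamma = \rho + 1$. Choose $\eta \in (0,\rho)$ such that $x \mapsto x^{-2(m-k)}[\ln(\gamma/x)]^{-2}$ is strictly
decreasing on $(0,\eta)$. Then, by Lemma 3.9,

$$
\begin{aligned}
\int_0^\rho dx\, x^{\alpha}\,\big\|f^{(m)}(x)\big\|_H^2
&\ge B(m-k,\alpha) \int_0^\rho dx\, x^{\alpha-2(m-k)}\,\big\|f^{(k)}(x)\big\|_H^2\, [\ln(\gamma/x)]^{-2} \\
&= B(m-k,\alpha) \bigg[ \int_0^\eta dx + \int_\eta^\rho dx \bigg]\, x^{\alpha-2(m-k)}\,\big\|f^{(k)}(x)\big\|_H^2\, [\ln(\gamma/x)]^{-2} \\
&\ge C_k(m,\alpha,\rho) \int_0^\rho dx\, x^{\alpha}\,\big\|f^{(k)}(x)\big\|_H^2 .
\end{aligned}
\tag{A.35}
$$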


References

1. W. Arendt, C.K. Batty, M. Hieber, F. Neubrander, Vector-Valued Laplace Transforms and

Cauchy Problems. Monographs in Mathematics, vol. 96 (Birkhäuser, Basel, 2001)
2. A.A. Balinsky, W.D. Evans, R.T. Lewis, The Analysis and Geometry of Hardy’s Inequality.
Universitext (Springer, Cham, 2015)
3. M.S. Birman, The spectrum of singular boundary problems. Mat. Sb. (N.S.) 55(97), 125–174
(1961) (Russian). Engl. transl. in Am. Math. Soc. Transl. Ser. 2, 53, 23–80 (1966)
4. R.S. Chisholm, W.N. Everitt, L.L. Littlejohn, An integral operator inequality with applications.
J. Inequal. Appl. 3, 245–266 (1999)
5. C.Y. Chuah, F. Gesztesy, L.L. Littlejohn, T. Mei, I. Michael, M.M.H. Pang, On weighted
Hardy-type inequalities. Math. Inequal. Appl. 23, 625–646 (2020)
6. E.B. Davies, Spectral Theory and Differential Operators. Cambridge Studies in Advanced
Mathematics, vol. 42 (Cambridge University Press, Cambridge, 1995)
7. F. Gesztesy, L.L. Littlejohn, I. Michael, M.M.H. Pang, A sequence of weighted Birman–
Hardy–Rellich inequalities with logarithmic refinements. Integr. Equ. Oper. Theory (published
online March 25, 2022), arxiv: 2003.12894
8. F. Gesztesy, L.L. Littlejohn, I. Michael, R. Wellman, On Birman’s sequence of Hardy–Rellich-
type inequalities. J. Differ. Equ. 264, 2761–2801 (2018)
9. I.M. Glazman, Direct Methods of Qualitative Spectral Analysis of Singular Differential
Operators. Israel Program for Scientific Translations, Jerusalem, 1965 (Daniel Davey & Co.,
Inc., New York, 1966)
10. G.R. Goldstein, J.A. Goldstein, R.M. Mininni, S. Romanelli, Scaling and variants of Hardy’s
inequality. Proc. Am. Math. Soc. 147, 1165–1172 (2019)
11. G.H. Hardy, Notes on some points in the integral calculus, LX. An inequality between integrals.
Messenger Math. 54, 150–156 (1925)
12. G.H. Hardy, J.E. Littlewood, G. Pólya, Inequalities. (Cambridge University Press, Cambridge,
reprinted, 1988)
13. P. Hartman, On the linear logarithmic-exponential differential equation of the second-order.
Am. J. Math. 70, 764–779 (1948)
14. P. Hartman, Ordinary Differential Equations. 2nd edn. (SIAM, Philadelphia, 2002)
15. A. Kufner, Weighted Sobolev Spaces. (A Wiley-Interscience Publication, Wiley, New York,
1985)
16. A. Kufner, L. Maligranda, L.-E. Persson, The Hardy Inequality: About Its History and Some
Related Results. (Vydavatelský Servis, Pilsen, 2007)
17. A. Kufner, L.-E. Persson, N. Samko, Weighted Inequalities of Hardy Type. 2nd edn. (World
Scientific, Singapore, 2017)
18. A. Kufner, A. Wannebo, Some remarks on the Hardy inequality for higher order derivatives.
Int. Ser. Numer. Math. 103, 33–48 (1992)
19. E. Landau, A note on a theorem concerning series of positive terms: extract from a letter of
Prof. E. Landau to Prof. I. Schur. J. Lond. Math. Soc. 1, 38–39 (1926)
20. B. Muckenhoupt, Hardy’s inequality with weights. Stud. Math. 44, 31–38 (1972)
21. E. Müller-Pfeiffer, Spectral Theory of Ordinary Differential Operators. (Ellis Horwood
Limited, West Sussex, 1981)
22. B. Opic, A. Kufner, Hardy-Type Inequalities. Pitman Research Notes in Mathematics Series,
vol. 219 (Longman Scientific & Technical, Harlow, 1990)
23. L.-E. Persson, S.G. Samko, A note on the best constants in some Hardy inequalities. J. Math.
Inequal. 9, 437–447 (2015)
24. F. Rellich, Perturbation Theory of Eigenvalue Problems. (Gordon and Breach, New York,
1969)
An Excursion to Multiplications
and Convolutions on Modulation Spaces

Nenad Teofanov and Joachim Toft

Abstract We give a self-contained introduction to (quasi-)Banach modulation

spaces of ultradistributions, and review results on boundedness for multiplications
and convolutions for elements in such spaces. Furthermore, we use these results to
study the Gabor product. As an example, we show how it appears in a phase-space
formulation of the nonlinear cubic Schrödinger equation.

Keywords Time–frequency analysis · Modulation spaces · Convolutions ·

Multiplications

1 Introduction

Modulation spaces were introduced in Feichtinger's seminal technical report [17],
and have proved to be a useful family of Banach spaces of tempered distributions
in time-frequency analysis [4, 10, 28]. The main purpose of this survey article is to
illuminate some properties of modulation spaces in a rather self-contained manner.
In contrast to the most common situation, our analysis includes both quasi-Banach
and Banach modulation spaces within the framework of ultradifferentiable functions
and ultradistributions of Gelfand–Shilov type. For that reason we collect the necessary
background material in a rather detailed preliminary section.
Motivated by recent applications of modulation spaces in nonlinear
harmonic analysis, cf. [4–6, 14, 22, 38, 39, 47, 54], we focus our
attention on boundedness for multiplications and convolutions for elements in such
spaces. The basic results in that direction go back to the original contribution [17],

N. Teofanov
Department of Mathematics and Informatics, University of Novi Sad, Novi Sad, Serbia
e-mail: [email protected]
J. Toft ()
Department of Mathematics, Linnæus University, Växjö, Sweden
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
R. M. Aron et al. (eds.), Operator and Norm Inequalities and Related Topics,
Trends in Mathematics, https://doi.org/10.1007/978-3-031-02104-6_18

and were thereafter reconsidered by many authors in different contexts. Let us give
a brief, and unavoidably incomplete, account of the related results.
In Sect. 3 we formulate, in Theorems 3.5 and 3.7, bilinear versions of the more
general multiplication and convolution results in [54, Section 3]. The contents of
Theorems 3.5 and 3.7 in the unweighted case for modulation spaces $M^{p,q}$ can be
summarized as follows.
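Proposition 1.1 Let $p_j, q_j \in (0, \infty]$, $j = 0, 1, 2$,

$$\theta_1 = \max\Big(1, \frac{1}{p_0}, \frac{1}{q_1}, \frac{1}{q_2}\Big) \quad \text{and} \quad \theta_2 = \max\Big(1, \frac{1}{p_1}, \frac{1}{p_2}\Big).$$

Then

$$M^{p_1,q_1} \cdot M^{p_2,q_2} \subseteq M^{p_0,q_0}, \qquad \frac{1}{p_1} + \frac{1}{p_2} = \frac{1}{p_0}, \quad \frac{1}{q_1} + \frac{1}{q_2} = \theta_1 + \frac{1}{q_0},$$

$$M^{p_1,q_1} * M^{p_2,q_2} \subseteq M^{p_0,q_0}, \qquad \frac{1}{p_1} + \frac{1}{p_2} = \theta_2 + \frac{1}{p_0}, \quad \frac{1}{q_1} + \frac{1}{q_2} = \frac{1}{q_0}.$$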
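The exponent bookkeeping in Proposition 1.1 can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not part of the cited results: the function names `multiplication_target` and `convolution_target` are hypothetical, exponents are passed as numbers with `float("inf")` standing for ∞, and `None` is returned when the relations admit no $p_0$ or $q_0$ in $(0, \infty]$.

```python
from fractions import Fraction

INF = float("inf")

def inv(p):
    """Return 1/p exactly, with the convention 1/inf = 0."""
    return Fraction(0) if p == INF else 1 / Fraction(p)

def exponent(r):
    """Turn a reciprocal r = 1/p >= 0 back into p (p = inf when r = 0)."""
    return INF if r == 0 else 1 / r

def multiplication_target(p1, q1, p2, q2):
    """Solve 1/p1 + 1/p2 = 1/p0 and 1/q1 + 1/q2 = theta1 + 1/q0 for (p0, q0)."""
    r_p0 = inv(p1) + inv(p2)
    theta1 = max(Fraction(1), r_p0, inv(q1), inv(q2))
    r_q0 = inv(q1) + inv(q2) - theta1
    return None if r_q0 < 0 else (exponent(r_p0), exponent(r_q0))

def convolution_target(p1, q1, p2, q2):
    """Solve 1/p1 + 1/p2 = theta2 + 1/p0 and 1/q1 + 1/q2 = 1/q0 for (p0, q0)."""
    theta2 = max(Fraction(1), inv(p1), inv(p2))
    r_p0 = inv(p1) + inv(p2) - theta2
    r_q0 = inv(q1) + inv(q2)
    return None if r_p0 < 0 else (exponent(r_p0), exponent(r_q0))
```

For instance, with $p_1 = q_1 = p_2 = q_2 = 2$ the relations give $M^{2,2} \cdot M^{2,2} \subseteq M^{1,\infty}$ and $M^{2,2} * M^{2,2} \subseteq M^{\infty,1}$.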

The general multiplication and convolution properties in Sect. 3 also overlap with
results by Bastianoni, Cordero and Nicola in [2], by Bastianoni and Teofanov in [1],
and by Guo et al. in [32].
The multiplication relation in Proposition 1.1 for $p_j, q_j \ge 1$ was obtained
already in [17] by Feichtinger. The convolution relation has evidently been
well known since then (although the first formal proof of this relation seems to be given
in [48]). In general, these convolution and multiplication properties follow the
rules

$$\ell^{p_1} * \ell^{p_2} \subseteq \ell^{p_0}, \quad \ell^{q_1} \cdot \ell^{q_2} \subseteq \ell^{q_0} \quad \Rightarrow \quad M^{p_1,q_1} * M^{p_2,q_2} \subseteq M^{p_0,q_0}$$

and

$$\ell^{p_1} \cdot \ell^{p_2} \subseteq \ell^{p_0}, \quad \ell^{q_1} * \ell^{q_2} \subseteq \ell^{q_0} \quad \Rightarrow \quad M^{p_1,q_1} \cdot M^{p_2,q_2} \subseteq M^{p_0,q_0},$$

which go back to [17] in the Banach space case and to [25] in the quasi-Banach
case. See also [19] and [42] for extensions of these relations to more general Banach
function spaces and quasi-Banach function spaces, respectively.
In Sect. 3 we basically review some results from [54]. To make this survey self-contained
we give the proof of Theorem 3.7 in the unweighted case. In contrast to [32],
we do not deduce any sharpness for our results.
To show Proposition 1.1 in the quasi-Banach setting, additional arguments are needed
beyond the usual use of Hölder's and Young's inequalities. In our situation
we discretize the problems in a similar way as in [2], using Gabor analysis for
modulation spaces, and then apply some further arguments valid in non-convex
analysis. This approach is slightly different from the one used in [32], which
follows the discretization technique introduced in [55] and which has some traces
of Gabor analysis.
We refer to [54] for a detailed discussion on the uniqueness of multiplications
and convolutions in Proposition 1.1.
In Sect. 4 we apply the results from the previous parts in the framework of the
so-called Gabor product. It was introduced in [14] in order to derive a phase-space
analogue of the usual convolution identity for the Fourier transform. The main
motivation is to use this kind of product in a phase-space formulation of certain
nonlinear equations. As noticed in [14], among other interesting characteristics
of phase-space representations, the initial value problem in phase space may be
well-posed for more general initial distributions. This means that the phase-space
formulation could contain solutions other than the standard ones. We refer to [11–13],
where the phase-space extensions are explored in different contexts. Here we
illustrate this approach by considering the nonlinear cubic Schrödinger equation,
which appears for example in Bose–Einstein condensate theory [35]. We also refer to
[4, Chapter 7] for an overview of results related to well-posedness of the nonlinear
Schrödinger equations in the framework of modulation spaces, see also [3, 38, 39].

2 Preliminaries

In this section we give an exposition of background material related to the

definition and basic properties of modulation spaces. Thus we recall some facts
on the short-time Fourier transform and related projections, the (Fourier invariant)
Gelfand-Shilov spaces, weight functions, and mixed-norm spaces of Lebesgue type.
We also recall convolution and multiplication in weighted Lebesgue sequence
spaces.

2.1 The Short-Time Fourier Transform

In what follows we let $\mathscr{F}$ be the Fourier transform which takes the form

$$(\mathscr{F} f)(\xi) = \widehat{f}(\xi) \equiv (2\pi)^{-\frac{d}{2}} \int_{\mathbb{R}^d} f(x)\, e^{-i\langle x, \xi \rangle}\, dx$$

when $f \in L^1(\mathbb{R}^d)$. Here $\langle \cdot\,, \cdot \rangle$ denotes the usual scalar product on $\mathbb{R}^d$. The same
notation is used for the usual dual form between test functions and corresponding
(ultra-)distributions. We recall that the map $\mathscr{F}$ extends uniquely to a homeomorphism
on the space of tempered distributions $\mathscr{S}'(\mathbb{R}^d)$, to a unitary operator on $L^2(\mathbb{R}^d)$ and
restricts to a homeomorphism on the Schwartz space of smooth rapidly decreasing
functions $\mathscr{S}(\mathbb{R}^d)$, cf. (29). We also observe that with our choice of the Fourier transform,
the usual convolution identity for the Fourier transform takes the forms

$$\mathscr{F}(f \cdot g) = (2\pi)^{-\frac{d}{2}}\, \widehat{f} * \widehat{g} \quad \text{and} \quad \mathscr{F}(f * g) = (2\pi)^{\frac{d}{2}}\, \widehat{f} \cdot \widehat{g} \tag{1}$$

when $f, g \in \mathscr{S}(\mathbb{R}^d)$.
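A quick numerical check of the normalization above, offered as a sketch under our own assumptions: $d = 1$, the test function $f(x) = e^{-x^2/2}$ (which is its own Fourier transform under the $(2\pi)^{-d/2}$ convention), and a plain midpoint rule on a truncated interval. The helper name `fourier_1d` and its parameters are hypothetical.

```python
import cmath
import math

def fourier_1d(f, xi, half_width=12.0, n=4000):
    """Approximate (2*pi)^(-1/2) * integral of f(x) * exp(-i*x*xi) dx
    by the midpoint rule on [-half_width, half_width]."""
    h = 2.0 * half_width / n
    total = 0j
    for k in range(n):
        x = -half_width + (k + 0.5) * h
        total += f(x) * cmath.exp(-1j * x * xi)
    return total * h / math.sqrt(2.0 * math.pi)

def gauss(x):
    """Standard Gaussian exp(-x^2/2), a fixed point of this Fourier transform."""
    return math.exp(-x * x / 2.0)

# The transform of the Gaussian at xi should agree with gauss(xi).
error = abs(fourier_1d(gauss, 1.3) - gauss(1.3))
```

With a different normalization (for example without the factor $(2\pi)^{-1/2}$) the computed value would differ from `gauss(xi)` by a constant factor, which is what makes this a convenient test of the convention.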
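In several situations it is convenient to use a localized version of the Fourier
transform, called the short-time Fourier transform, STFT for short. The short-time
Fourier transform of $f \in \mathscr{S}'(\mathbb{R}^d)$ with respect to the fixed window function
$\phi \in \mathscr{S}(\mathbb{R}^d)$ is defined by

$$(V_\phi f)(x, \xi) \equiv (2\pi)^{-\frac{d}{2}} \big(f, \phi(\,\cdot - x)\, e^{i\langle \,\cdot\,, \xi \rangle}\big)_{L^2}. \tag{2}$$

Here $(\,\cdot\,, \cdot\,)_{L^2}$ denotes the unique continuous extension of the inner product on
$L^2(\mathbb{R}^d)$ restricted to $\mathscr{S}(\mathbb{R}^d)$ into a continuous map from $\mathscr{S}'(\mathbb{R}^d) \times \mathscr{S}(\mathbb{R}^d)$ to $\mathbb{C}$.
We observe that using certain properties for tensor products of distributions,

$$(V_\phi f)(x, \xi) = \mathscr{F}\big(f \cdot \overline{\phi(\,\cdot - x)}\big)(\xi) \tag{2$'$}$$

(cf. [33, 52]). If in addition $f \in L^p(\mathbb{R}^d)$ for some $p \in [1, \infty]$, then

$$(V_\phi f)(x, \xi) = (2\pi)^{-\frac{d}{2}} \int_{\mathbb{R}^d} f(y)\, \overline{\phi(y - x)}\, e^{-i\langle y, \xi \rangle}\, dy. \tag{2$''$}$$

We observe that the domain of $V_\phi$ is $\mathscr{S}'(\mathbb{R}^d)$. The images are contained in
$C^\infty(\mathbb{R}^{2d})$, the set of smooth functions defined on the phase space $\mathbb{R}^d \times \mathbb{R}^d \simeq \mathbb{R}^{2d}$.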
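The integral form of the short-time Fourier transform can be evaluated numerically. The following sketch assumes $d = 1$ and takes both $f$ and the window $\phi$ to be the Gaussian $e^{-y^2/2}$ (our choice for illustration). For this pair, completing the square gives the closed form $V_\phi f(x, \xi) = 2^{-1/2} e^{-(x^2 + \xi^2)/4} e^{-i x \xi / 2}$, which the quadrature should reproduce. The names `stft_1d` and `closed_form` are hypothetical.

```python
import cmath
import math

def stft_1d(f, phi, x, xi, half_width=12.0, n=4000):
    """Approximate (2*pi)^(-1/2) * integral of
    f(y) * conj(phi(y - x)) * exp(-i*y*xi) dy by the midpoint rule."""
    h = 2.0 * half_width / n
    total = 0j
    for k in range(n):
        y = -half_width + (k + 0.5) * h
        total += f(y) * complex(phi(y - x)).conjugate() * cmath.exp(-1j * y * xi)
    return total * h / math.sqrt(2.0 * math.pi)

def gauss(t):
    """Gaussian exp(-t^2/2), used both as signal and as window."""
    return math.exp(-t * t / 2.0)

def closed_form(x, xi):
    """Closed form of V_phi f for f = phi = gauss, obtained by completing the square."""
    return cmath.exp(-(x * x + xi * xi) / 4.0 - 0.5j * x * xi) / math.sqrt(2.0)

# Compare quadrature and closed form at one phase-space point.
deviation = abs(stft_1d(gauss, gauss, 0.7, -1.1) - closed_form(0.7, -1.1))
```

The modulus of the closed form decays like a Gaussian in both $x$ and $\xi$, in line with the heuristic that $V_\phi f$ is localized where $f$ and $\widehat{f}$ are.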
The short-time Fourier transform appears in different contexts and under different
names. In quantum mechanics it is rather common to call it the coherent
state transform (see e.g. [37]). It is also closely related to the so-called Wigner
distribution or radar ambiguity function (see e.g. [36]). In time-frequency analysis,
it is also sometimes called the voice transform.
The main idea behind the design of the short-time Fourier transform is to capture the
Fourier content, or the frequency resolution, of localized functions and distributions.
Roughly speaking, short-time Fourier transforms give simultaneous information
on functions or distributions and on their Fourier transforms, in
the sense that the map

$$x \mapsto V_\phi f(x, \xi)$$

resembles $f(x)$, while the map

$$\xi \mapsto V_\phi f(x, \xi)$$

resembles $\widehat{f}(\xi)$.

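As for the ordinary Fourier transform, there are several mapping properties which
hold true for the short-time Fourier transform. An elegant way to approach such
properties in the framework of distributions is to follow ideas given in [24] by
Folland.
In fact, let $T$ be the semi-conjugated tensor map

$$T(f, \phi) = f \otimes \overline{\phi}, \tag{3}$$

$U$ be the linear pullback

$$(U F)(x, y) = F(y, y - x) \tag{4}$$

and $\mathscr{F}_2$ be the partial Fourier transform given by

$$(\mathscr{F}_2 F)(x, \xi) = (2\pi)^{-\frac{d}{2}} \int_{\mathbb{R}^d} F(x, y)\, e^{-i\langle y, \xi \rangle}\, dy. \tag{5}$$

Then

$$V_\phi f = (\mathscr{F}_2 \circ U \circ T)(f, \phi), \tag{6}$$

when $f, \phi \in \mathscr{S}(\mathbb{R}^d)$.
We observe that the mappings

$$T : \mathscr{S}(\mathbb{R}^d) \times \mathscr{S}(\mathbb{R}^d) \to \mathscr{S}(\mathbb{R}^{2d}), \qquad U, \mathscr{F}_2 : \mathscr{S}(\mathbb{R}^{2d}) \to \mathscr{S}(\mathbb{R}^{2d}) \tag{7}$$

are continuous and uniquely extendable to continuous mappings

$$T : \mathscr{S}'(\mathbb{R}^d) \times \mathscr{S}'(\mathbb{R}^d) \to \mathscr{S}'(\mathbb{R}^{2d}), \qquad U, \mathscr{F}_2 : \mathscr{S}'(\mathbb{R}^{2d}) \to \mathscr{S}'(\mathbb{R}^{2d}), \tag{8}$$

which in turn restrict to isometric mappings

$$T : L^2(\mathbb{R}^d) \times L^2(\mathbb{R}^d) \to L^2(\mathbb{R}^{2d}), \qquad U, \mathscr{F}_2 : L^2(\mathbb{R}^{2d}) \to L^2(\mathbb{R}^{2d}). \tag{9}$$

Here, $T$ being isometric means that

$$\|T(f, \phi)\|_{L^2(\mathbb{R}^{2d})} = \|f\|_{L^2(\mathbb{R}^d)}\, \|\phi\|_{L^2(\mathbb{R}^d)}.$$

It is now natural to define $V_\phi f$ as the right-hand side of (6) when $f, \phi \in \mathscr{S}'(\mathbb{R}^d)$,
in which case $V_\phi f$ is well-defined as an element in $\mathscr{S}'(\mathbb{R}^{2d})$.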

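Proposition 2.1 The map

$$(f, \phi) \mapsto V_\phi f : \mathscr{S}(\mathbb{R}^d) \times \mathscr{S}(\mathbb{R}^d) \to \mathscr{S}(\mathbb{R}^{2d}) \tag{10}$$

is continuous, which extends uniquely to a continuous map

$$(f, \phi) \mapsto V_\phi f : \mathscr{S}'(\mathbb{R}^d) \times \mathscr{S}'(\mathbb{R}^d) \to \mathscr{S}'(\mathbb{R}^{2d}), \tag{11}$$

which in turn restricts to an isometric map

$$(f, \phi) \mapsto V_\phi f : L^2(\mathbb{R}^d) \times L^2(\mathbb{R}^d) \to L^2(\mathbb{R}^{2d}). \tag{12}$$

If $\phi \in \mathscr{S}(\mathbb{R}^d)$ and $f \in \mathscr{S}'(\mathbb{R}^d)$, then (11) shows that $V_\phi f \in \mathscr{S}'(\mathbb{R}^{2d})$. On the
other hand, it is easy to see that the right-hand side of (2)$'$ defines a smooth function.
Consequently, besides (11) and (10), we also have the continuous map

$$(f, \phi) \mapsto V_\phi f : \mathscr{S}'(\mathbb{R}^d) \times \mathscr{S}(\mathbb{R}^d) \to \mathscr{S}'(\mathbb{R}^{2d}) \cap C^\infty(\mathbb{R}^{2d}). \tag{13}$$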

For short-time Fourier transform, the Parseval identity is replaced by the so-
called Moyal identity, also known as the orthogonality relation given by

(Vφ f, Vψ g)L2 (R2d ) = (ψ, φ)L2 (Rd ) (f, g)L2 (Rd ) , (14)

when f, g, φ, ψ ∈ S (Rd ). The identity (14) is obtained by rewriting the short-time

Fourier transforms by (2) and then applying the Parseval identity in suitable ways.
We observe that the right-hand side makes sense also when f , g, φ and ψ belong to
other spaces than S (Rd ). For example we may let

(f, g, φ, ψ) ∈ S (Rd ) × S (Rd ) × S (Rd ) × S (Rd ),

(15)

(f, g, φ, ψ) ∈ S (Rd ) × S (Rd ) × Lq (Rd ) × Lq (Rd )

or (f, g, φ, ψ) ∈ Lp (Rd ) × Lp (Rd ) × Lq (Rd ) × Lq (Rd ),

when $p, p', q, q' \in [1, \infty]$ satisfy

$$\frac{1}{p} + \frac{1}{p'} = \frac{1}{q} + \frac{1}{q'} = 1.$$
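As a sanity check of the Moyal identity (14), here is a worked example under our own assumptions: $d = 1$ and $f = g = \phi = \psi = e^{-y^2/2}$. Completing the square in the STFT integral gives the first line. The remaining two lines compare both sides of (14).

```latex
\begin{align*}
V_\phi f(x,\xi)
  &= (2\pi)^{-1/2}\int_{\mathbb{R}} e^{-y^2/2}\, e^{-(y-x)^2/2}\, e^{-iy\xi}\, dy
   = 2^{-1/2}\, e^{-(x^2+\xi^2)/4}\, e^{-ix\xi/2}, \\
(V_\phi f, V_\phi f)_{L^2(\mathbb{R}^2)}
  &= \frac{1}{2}\iint_{\mathbb{R}^2} e^{-(x^2+\xi^2)/2}\, dx\, d\xi
   = \frac{1}{2}\cdot 2\pi = \pi, \\
(\phi,\phi)_{L^2(\mathbb{R})}\,(f,f)_{L^2(\mathbb{R})}
  &= \Big(\int_{\mathbb{R}} e^{-y^2}\, dy\Big)^{2} = (\sqrt{\pi}\,)^{2} = \pi .
\end{align*}
```

Both sides equal $\pi$, as the identity predicts for this choice of functions.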

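By Moyal's identity (14) it follows that if $\phi \in \mathscr{S}(\mathbb{R}^d) \setminus \{0\}$, then the identity
operator on $\mathscr{S}'(\mathbb{R}^d)$ is given by

$$\mathrm{Id} = \|\phi\|_{L^2}^{-2} \cdot V_\phi^* \circ V_\phi, \tag{16}$$

provided suitable mapping properties of the ($L^2$-)adjoint $V_\phi^*$ of $V_\phi$ can be established.
Obviously, $V_\phi^*$ fulfills

$$(V_\phi^* F, g)_{L^2(\mathbb{R}^d)} = (F, V_\phi g)_{L^2(\mathbb{R}^{2d})} \tag{17}$$

when $F \in \mathscr{S}(\mathbb{R}^{2d})$ and $g \in \mathscr{S}(\mathbb{R}^d)$.

By expressing the scalar product and the short-time Fourier transform in terms of
integrals in (17), it follows by straightforward manipulations that the adjoint in (17)
is given by

$$(V_\phi^* F)(x) = (2\pi)^{-\frac{d}{2}} \iint_{\mathbb{R}^{2d}} F(y, \eta)\, \phi(x - y)\, e^{i\langle x, \eta \rangle}\, dy\, d\eta, \tag{18}$$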

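when $F \in \mathscr{S}(\mathbb{R}^{2d})$. We may now use mapping properties like (11)–(12) to extend
the definition of $V_\phi^* F$ when $F$ and $\phi$ belong to various classes of function and
distribution spaces. For example, by (11), (10) and (12), it follows that the map

$$(F, g) \mapsto (F, V_\phi g)_{L^2(\mathbb{R}^{2d})}$$

defines a sesquilinear form on $\mathscr{S}(\mathbb{R}^{2d}) \times \mathscr{S}'(\mathbb{R}^d)$, on $\mathscr{S}'(\mathbb{R}^{2d}) \times \mathscr{S}(\mathbb{R}^d)$ and
on $L^2(\mathbb{R}^{2d}) \times L^2(\mathbb{R}^d)$. This implies that if $\phi \in \mathscr{S}(\mathbb{R}^d)$, then $V_\phi^*$ in (17) is
continuous from $\mathscr{S}(\mathbb{R}^{2d})$ to $\mathscr{S}(\mathbb{R}^d)$, and is uniquely extendable to a continuous
map from $\mathscr{S}'(\mathbb{R}^{2d})$ to $\mathscr{S}'(\mathbb{R}^d)$, and from $L^2(\mathbb{R}^{2d})$ to $L^2(\mathbb{R}^d)$. That is, the mappings

$$V_\phi^* : \mathscr{S}(\mathbb{R}^{2d}) \to \mathscr{S}(\mathbb{R}^d), \qquad V_\phi^* : \mathscr{S}'(\mathbb{R}^{2d}) \to \mathscr{S}'(\mathbb{R}^d) \qquad \text{and} \qquad V_\phi^* : L^2(\mathbb{R}^{2d}) \to L^2(\mathbb{R}^d) \tag{19}$$

are continuous.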

2.2 STFT Projections and a Suitable Twisted Convolution

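If $\phi \in \mathscr{S}(\mathbb{R}^d)$ satisfies $\|\phi\|_{L^2} = 1$, then (16) shows that $V_\phi^* \circ V_\phi$ is the identity
operator on $\mathscr{S}'(\mathbb{R}^d)$. If we swap the order of this composition we get certain types
of projections. In fact, for any $\phi \in \mathscr{S}(\mathbb{R}^d) \setminus \{0\}$, let $P_\phi$ be the operator given by

$$P_\phi \equiv \|\phi\|_{L^2}^{-2} \cdot V_\phi \circ V_\phi^*. \tag{20}$$

We observe that $P_\phi$ is continuous on $\mathscr{S}(\mathbb{R}^{2d})$, $L^2(\mathbb{R}^{2d})$ and $\mathscr{S}'(\mathbb{R}^{2d})$ due to the
mapping properties for $V_\phi$ and $V_\phi^*$ above.
It is clear that $P_\phi^* = P_\phi$, i.e. $P_\phi$ is self-adjoint. Furthermore, $P_\phi$ is a projection:

$$P_\phi^2 = \|\phi\|_{L^2}^{-2} \cdot V_\phi \circ \underbrace{\big(\|\phi\|_{L^2}^{-2} \cdot V_\phi^* \circ V_\phi\big)}_{\text{the identity operator}} \circ V_\phi^* = \|\phi\|_{L^2}^{-2} \cdot V_\phi \circ V_\phi^* = P_\phi.$$

Hence,

$$P_\phi^* = P_\phi \quad \text{and} \quad P_\phi^2 = P_\phi, \tag{21}$$

which shows that $P_\phi$ is an orthogonal projection.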
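The algebra behind (20) and (21) has a finite-dimensional analogue, sketched below with exact rational arithmetic. The matrix `V` is a hypothetical stand-in for $V_\phi$ (it is not a discretized short-time Fourier transform): it is real, so its adjoint is its transpose, and it satisfies $V^* V = c \cdot \mathrm{Id}$ with $c = 2$ playing the role of $\|\phi\|_{L^2}^2$.

```python
from fractions import Fraction

def transpose(A):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

# Hypothetical stand-in for V_phi: a 4x2 real matrix with orthogonal
# columns of equal squared norm c = 2, so that V^T V = 2 * Id.
V = [[1, 0], [0, 1], [1, 0], [0, 1]]
Vt = transpose(V)

gram = matmul(Vt, V)          # analogue of V_phi^* o V_phi
c = Fraction(gram[0][0])      # analogue of ||phi||^2

# Analogue of P_phi = ||phi||^(-2) * V_phi o V_phi^*.
P = [[Fraction(entry) / c for entry in row] for row in matmul(V, Vt)]

is_idempotent = matmul(P, P) == P     # P^2 = P
is_self_adjoint = transpose(P) == P   # P^* = P
```

Here `P` projects onto the column space of `V`, which mirrors the fact that $P_\phi$ projects onto the image of $V_\phi$.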

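The ranges of $P_\phi$ are given by

$$P_\phi(\mathscr{S}(\mathbb{R}^{2d})) = V_\phi(\mathscr{S}(\mathbb{R}^d)), \qquad P_\phi(L^2(\mathbb{R}^{2d})) = V_\phi(L^2(\mathbb{R}^d)), \qquad \text{and} \qquad P_\phi(\mathscr{S}'(\mathbb{R}^{2d})) = V_\phi(\mathscr{S}'(\mathbb{R}^d)). \tag{22}$$

In fact, if $F \in \mathscr{S}'(\mathbb{R}^{2d})$, then

$$P_\phi F = V_\phi f,$$

where $f = \|\phi\|_{L^2}^{-2}\, V_\phi^* F \in \mathscr{S}'(\mathbb{R}^d)$. This shows that $P_\phi(\mathscr{S}'(\mathbb{R}^{2d})) \subseteq V_\phi(\mathscr{S}'(\mathbb{R}^d))$.
On the other hand, if $f \in \mathscr{S}'(\mathbb{R}^d)$ and $F = V_\phi f$, then

$$P_\phi F = V_\phi \circ \big(\|\phi\|_{L^2}^{-2} \cdot V_\phi^* \circ V_\phi\big) f = V_\phi f,$$

which shows that any element in $V_\phi(\mathscr{S}'(\mathbb{R}^d))$ equals an element in $P_\phi(\mathscr{S}'(\mathbb{R}^{2d}))$,
i.e. $P_\phi(\mathscr{S}'(\mathbb{R}^{2d})) = V_\phi(\mathscr{S}'(\mathbb{R}^d))$. This gives the last identity in (22). In the same
way, the first two identities are obtained.
Remark 2.2 Let $F \in \mathscr{S}'(\mathbb{R}^{2d})$. Then it follows from the last identity in (22) that
$F = V_\phi f$ for some $f \in \mathscr{S}'(\mathbb{R}^d)$, if and only if

$$F = P_\phi F. \tag{23}$$

Furthermore, if (23) holds, then $F = V_\phi f$ with

$$f = \|\phi\|_{L^2}^{-2} \cdot V_\phi^* F. \tag{24}$$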

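There is a twisted convolution which is linked to the projection in (20). In fact, if
$F \in \mathscr{S}(\mathbb{R}^{2d})$ and $\phi \in \mathscr{S}(\mathbb{R}^d) \setminus \{0\}$, then it follows by expanding the integrals for
$V_\phi$ and $V_\phi^*$ in (20), and performing some straightforward manipulations, that

$$P_\phi F = \|\phi\|_{L^2}^{-2} \cdot V_\phi \phi *_V F, \qquad F \in \mathscr{S}'(\mathbb{R}^{2d}), \tag{25}$$

where the twisted convolution $*_V$ is defined by

$$
\begin{aligned}
(F *_V G)(x, \xi) &= (2\pi)^{-\frac{d}{2}} \iint_{\mathbb{R}^{2d}} F(x - y, \xi - \eta)\, G(y, \eta)\, e^{-i\langle y, \xi - \eta \rangle}\, dy\, d\eta \\
&= (2\pi)^{-\frac{d}{2}} \iint_{\mathbb{R}^{2d}} F(y, \eta)\, G(x - y, \xi - \eta)\, e^{-i\langle x - y, \eta \rangle}\, dy\, d\eta,
\end{aligned}
\tag{26}
$$

when $F, G \in \mathscr{S}(\mathbb{R}^{2d})$. We observe that the definition of $*_V$ is uniquely extendable
in different ways. For example, Young's inequality for ordinary convolution also
holds for the twisted convolution. Moreover, the map $(F, G) \mapsto F *_V G$ extends
uniquely to continuous mappings from $\mathscr{S}'(\mathbb{R}^{2d}) \times \mathscr{S}(\mathbb{R}^{2d})$ or $\mathscr{S}(\mathbb{R}^{2d}) \times \mathscr{S}'(\mathbb{R}^{2d})$
to $\mathscr{S}'(\mathbb{R}^{2d})$. By straightforward computations it follows that

$$(F *_V G) *_V H = F *_V (G *_V H), \tag{27}$$

when $F, H \in \mathscr{S}(\mathbb{R}^{2d})$ and $G \in \mathscr{S}'(\mathbb{R}^{2d})$, or $F, H \in \mathscr{S}'(\mathbb{R}^{2d})$ and $G \in \mathscr{S}(\mathbb{R}^{2d})$.

Let $f \in \mathscr{S}'(\mathbb{R}^d)$ and $\phi_j \in \mathscr{S}(\mathbb{R}^d)$, $j = 1, 2, 3$. By straightforward applications
of Parseval's formula it follows that

$$\big((V_{\phi_2} \phi_3) *_V (V_{\phi_1} f)\big)(x, \xi) = (\phi_3, \phi_1)_{L^2} \cdot (V_{\phi_2} f)(x, \xi), \tag{28}$$

which is a kind of reproducing kernel identity for short-time Fourier transforms
with respect to $*_V$.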

2.3 Gelfand-Shilov Spaces

Before defining the Gelfand-Shilov spaces, we recall that the Schwartz space
$\mathscr{S}(\mathbb{R}^d)$ consists of all (complex-valued) smooth functions $f \in C^\infty(\mathbb{R}^d)$ such that

$$\sup_{x \in \mathbb{R}^d} |x^\beta \partial^\alpha f(x)| \le C_{\alpha,\beta}, \tag{29}$$

for some constants $C_{\alpha,\beta} > 0$, which only depend on the multi-indices $\alpha, \beta \in \mathbb{N}^d$.
The Schwartz space possesses several convenient properties, and is heavily used in
mathematics, science and technology. For example, the Schwartz space is invariant
under Fourier transformation. By duality the same holds true for its ($L^2$-)dual
$\mathscr{S}'(\mathbb{R}^d)$, the set of tempered distributions on $\mathbb{R}^d$.
On the other hand, we observe that there are no conditions on the growth of
the constants $C_{\alpha,\beta}$ with respect to $\alpha, \beta \in \mathbb{N}^d$. This implies that in the context of
the spaces $\mathscr{S}(\mathbb{R}^d)$ and $\mathscr{S}'(\mathbb{R}^d)$, it is almost impossible to investigate important
properties like analyticity or related regularity properties which are stronger than
pure smoothness. For investigating such stronger regularity properties, we need to
modify $\mathscr{S}(\mathbb{R}^d)$ and the estimate (29) by imposing suitable growth conditions on the
constants $C_{\alpha,\beta}$. This leads to the definition of Gelfand-Shilov spaces [26, 40].
We only discuss Fourier invariant Gelfand-Shilov spaces and their properties.
Let $0 < s \in \mathbb{R}$ be fixed. We have two different types of Gelfand-Shilov spaces. The
Gelfand-Shilov space $\mathcal{S}_s(\mathbb{R}^d)$ of Roumieu type with parameter $s > 0$ consists of all
$f \in C^\infty(\mathbb{R}^d)$ such that

$$\sup_{x \in \mathbb{R}^d} |x^\beta \partial^\alpha f(x)| \le C h^{|\alpha + \beta|} (\alpha!\, \beta!)^s, \tag{30}$$

for some constants $C, h > 0$. In the same way, the Gelfand-Shilov space $\Sigma_s(\mathbb{R}^d)$ of
Beurling type with parameter $s > 0$ consists of all $f \in C^\infty(\mathbb{R}^d)$ such that for every
$h > 0$, there is a constant $C = C_h > 0$ such that (30) holds. Hence, in comparison
with the definition of Schwartz functions, we have limited ourselves to constants $C_{\alpha,\beta}$
in (29) which are not allowed to grow faster than those of the form

$$C h^{|\alpha + \beta|} (\alpha!\, \beta!)^s$$

when dealing with Gelfand-Shilov spaces.

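It can be proved that $\mathcal{S}_s(\mathbb{R}^d)$ and $\Sigma_t(\mathbb{R}^d)$ are dense in $\mathscr{S}(\mathbb{R}^d)$ when $s \ge \frac{1}{2}$ and
$t > \frac{1}{2}$. We call such $s$ and $t$ admissible. On the other hand, for the other choices of
$s$ and $t$ we have

$$\mathcal{S}_s(\mathbb{R}^d) = \Sigma_t(\mathbb{R}^d) = \{0\}, \quad \text{when } s < \frac{1}{2}, \ t \le \frac{1}{2}.$$

One has that $\mathcal{S}_1(\mathbb{R}^d)$ consists of real analytic functions, and that $\Sigma_1(\mathbb{R}^d)$ consists
of smooth functions on $\mathbb{R}^d$ which are extendable to entire functions on $\mathbb{C}^d$. The
topologies of $\mathcal{S}_s(\mathbb{R}^d)$ and $\Sigma_s(\mathbb{R}^d)$ are defined by the semi-norms

$$\|f\|_{\mathcal{S}_{s,h}} \equiv \sup \frac{|x^\beta \partial^\alpha f(x)|}{h^{|\alpha + \beta|} (\alpha!\, \beta!)^s}. \tag{31}$$

Here the supremum should be taken over all $\alpha, \beta \in \mathbb{N}^d$ and $x \in \mathbb{R}^d$. We equip
$\mathcal{S}_s(\mathbb{R}^d)$ and $\Sigma_s(\mathbb{R}^d)$ with the canonical inductive limit topology and projective limit
topology, respectively, with respect to $h > 0$, which are induced by the semi-norms
in (31).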

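Let $\mathcal{S}_{s,h}(\mathbb{R}^d)$ be the Banach space which consists of all $f \in C^\infty(\mathbb{R}^d)$ such that
$\|f\|_{\mathcal{S}_{s,h}}$ in (31) is finite, and let $\mathcal{S}'_{s,h}(\mathbb{R}^d)$ be the ($L^2$-)dual of $\mathcal{S}_{s,h}(\mathbb{R}^d)$. If $s \ge \frac{1}{2}$,
then the Gelfand-Shilov distribution space $\mathcal{S}'_s(\mathbb{R}^d)$ of Roumieu type is the projective
limit of $\mathcal{S}'_{s,h}(\mathbb{R}^d)$ with respect to $h > 0$. If instead $s > \frac{1}{2}$, then the Gelfand-Shilov
distribution space $\Sigma'_s(\mathbb{R}^d)$ of Beurling type is the inductive limit of $\mathcal{S}'_{s,h}(\mathbb{R}^d)$ with
respect to $h > 0$. Consequently, for admissible $s$ we have

$$\mathcal{S}'_s(\mathbb{R}^d) = \bigcap_{h > 0} \mathcal{S}'_{s,h}(\mathbb{R}^d) \quad \text{and} \quad \Sigma'_s(\mathbb{R}^d) = \bigcup_{h > 0} \mathcal{S}'_{s,h}(\mathbb{R}^d).$$

It can be proved that $\mathcal{S}'_s(\mathbb{R}^d)$ and $\Sigma'_s(\mathbb{R}^d)$ are the (strong) duals to $\mathcal{S}_s(\mathbb{R}^d)$ and
$\Sigma_s(\mathbb{R}^d)$, respectively.
We have the following embeddings and density properties for Gelfand-Shilov
and Schwartz spaces:

$$
\begin{aligned}
\mathcal{S}_s(\mathbb{R}^d) &\hookrightarrow \Sigma_t(\mathbb{R}^d) \hookrightarrow \mathcal{S}_t(\mathbb{R}^d) \hookrightarrow \mathscr{S}(\mathbb{R}^d), \\
\mathscr{S}'(\mathbb{R}^d) &\hookrightarrow \mathcal{S}'_t(\mathbb{R}^d) \hookrightarrow \Sigma'_t(\mathbb{R}^d) \hookrightarrow \mathcal{S}'_s(\mathbb{R}^d), \qquad t > s \ge \frac{1}{2},
\end{aligned}
\tag{32}
$$

with dense embeddings. Here $A \hookrightarrow B$ means that the topological spaces $A$ and $B$
satisfy $A \subseteq B$ with continuous embeddings.
The Fourier transform possesses convenient mapping properties on Gelfand-Shilov
spaces and their distribution spaces. In fact, the Fourier transform extends uniquely
to homeomorphisms on $\mathcal{S}'_s(\mathbb{R}^d)$ and on $\Sigma'_s(\mathbb{R}^d)$ for admissible $s$. Furthermore, $\mathscr{F}$
restricts to homeomorphisms on $\mathcal{S}_s(\mathbb{R}^d)$ and on $\Sigma_s(\mathbb{R}^d)$.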
One of the most important characterizations of Gelfand-Shilov spaces is given
in terms of estimates of the functions and their Fourier transforms. More
precisely, in [8, 15] it is proved that if $f \in \mathscr{S}'(\mathbb{R}^d)$ and $s > 0$, then $f \in \mathcal{S}_s(\mathbb{R}^d)$
($f \in \Sigma_s(\mathbb{R}^d)$), if and only if

$$|f(x)| \lesssim e^{-r|x|^{\frac{1}{s}}} \quad \text{and} \quad |\widehat{f}(\xi)| \lesssim e^{-r|\xi|^{\frac{1}{s}}}, \tag{33}$$

for some $r > 0$ (for every $r > 0$). Here $g_1 \lesssim g_2$ means that $g_1(\theta) \le c \cdot g_2(\theta)$ holds
uniformly for all $\theta$ in the intersection of the domains of $g_1$ and $g_2$ and for some
constant $c > 0$, and we write $g_1 \asymp g_2$ when $g_1 \lesssim g_2 \lesssim g_1$.
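A short worked example of this characterization, under our own choice of test function: the Gaussian $f(x) = e^{-|x|^2/2}$, which equals its own Fourier transform under the normalization used in this section.

```latex
\begin{align*}
f(x) = e^{-|x|^2/2}, \qquad \widehat{f}(\xi) = e^{-|\xi|^2/2}
\quad &\Longrightarrow \quad
|f(x)| \le e^{-r|x|^{1/s}} \ \text{ and } \ |\widehat{f}(\xi)| \le e^{-r|\xi|^{1/s}}
\ \text{ with } s = \tfrac{1}{2},\ r = \tfrac{1}{2}, \\
&\Longrightarrow \quad f \in \mathcal{S}_{1/2}(\mathbb{R}^d).
\end{align*}
```

The same estimate fails for every $r > \frac{1}{2}$, so $f$ does not satisfy the Beurling-type condition with $s = \frac{1}{2}$. This is consistent with $\Sigma_{1/2}(\mathbb{R}^d) = \{0\}$.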
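The analysis in [8, 15] can also be applied to the Schwartz space, from which it
follows that an element $f \in \mathscr{S}'(\mathbb{R}^d)$ belongs to $\mathscr{S}(\mathbb{R}^d)$, if and only if

$$|f(x)| \lesssim \langle x \rangle^{-N} \quad \text{and} \quad |\widehat{f}(\xi)| \lesssim \langle \xi \rangle^{-N}, \tag{34}$$

for every $N \ge 0$. Here and in what follows we let

$$\langle x \rangle = (1 + |x|^2)^{\frac{1}{2}}.$$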

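Remark 2.3 Several properties in Sects. 2.1–2.3 in the context of $\mathscr{S}(\mathbb{R}^d)$ and
$\mathscr{S}'(\mathbb{R}^d)$ also hold for the Gelfand-Shilov spaces and their distribution spaces. Let
$s \ge \frac{1}{2}$. By arguments similar to those which lead to Proposition 2.1 and (13), it follows that

$$(f, \phi) \mapsto V_\phi f : \mathcal{S}_s(\mathbb{R}^d) \times \mathcal{S}_s(\mathbb{R}^d) \to \mathcal{S}_s(\mathbb{R}^{2d}) \tag{35}$$

is continuous, which extends uniquely to continuous mappings

$$(f, \phi) \mapsto V_\phi f : \mathcal{S}'_s(\mathbb{R}^d) \times \mathcal{S}_s(\mathbb{R}^d) \to \mathcal{S}'_s(\mathbb{R}^{2d}) \cap C^\infty(\mathbb{R}^{2d}) \tag{36}$$

and

$$(f, \phi) \mapsto V_\phi f : \mathcal{S}'_s(\mathbb{R}^d) \times \mathcal{S}'_s(\mathbb{R}^d) \to \mathcal{S}'_s(\mathbb{R}^{2d}). \tag{37}$$

It follows that (14) makes sense after each $\mathscr{S}$ in (15) is replaced by $\mathcal{S}_s$. Let
$\phi \in \mathcal{S}_s(\mathbb{R}^d) \setminus \{0\}$ be fixed. Then arguments similar to those which lead to (19) give that
the mappings

$$V_\phi^* : \mathcal{S}_s(\mathbb{R}^{2d}) \to \mathcal{S}_s(\mathbb{R}^d), \qquad V_\phi^* : \mathcal{S}'_s(\mathbb{R}^{2d}) \to \mathcal{S}'_s(\mathbb{R}^d) \tag{19$'$}$$

are continuous. For $P_\phi$ in (20) we have that (21) still holds true and that (22) can be
completed with

$$P_\phi(\mathcal{S}_s(\mathbb{R}^{2d})) = V_\phi(\mathcal{S}_s(\mathbb{R}^d)) \quad \text{and} \quad P_\phi(\mathcal{S}'_s(\mathbb{R}^{2d})) = V_\phi(\mathcal{S}'_s(\mathbb{R}^d)). \tag{38}$$

We also have that the twisted convolution in (26) is continuous from $\mathcal{S}_s(\mathbb{R}^{2d}) \times
\mathcal{S}_s(\mathbb{R}^{2d})$ to $\mathcal{S}_s(\mathbb{R}^{2d})$ and uniquely extendable to a continuous map from $\mathcal{S}'_s(\mathbb{R}^{2d}) \times
\mathcal{S}_s(\mathbb{R}^{2d})$ or $\mathcal{S}_s(\mathbb{R}^{2d}) \times \mathcal{S}'_s(\mathbb{R}^{2d})$ to $\mathcal{S}'_s(\mathbb{R}^{2d})$, and that the formulae (25)–(28) still
hold true after each $\mathscr{S}$ is replaced by $\mathcal{S}_s$ in the attached assumptions.
If instead $s > \frac{1}{2}$, then similar facts hold true with $\Sigma_s$ in place of $\mathcal{S}_s$ above, at
each occurrence.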
Remark 2.4 In the same way as Gelfand-Shilov spaces are characterized in terms of
Fourier estimates (see (33)), they may also be characterized by means of the
short-time Fourier transform. Moreover, the short-time Fourier transform can
in addition be used to characterize spaces of Gelfand-Shilov distributions.
In fact, let $\phi\in\mathcal S_s(\mathbf R^d)\setminus\{0\}$ ($\phi\in\Sigma_s(\mathbf R^d)\setminus\{0\}$) be fixed and let $f$ be a
Gelfand-Shilov distribution on $\mathbf R^d$. Then the following is true:
1. $f\in\mathcal S_s(\mathbf R^d)$ ($f\in\Sigma_s(\mathbf R^d)$), if and only if

$$|V_\phi f(x,\xi)| \lesssim e^{-r(|x|^{1/s}+|\xi|^{1/s})} \tag{39}$$

for some $r > 0$ (for every $r > 0$);

An Excursion to Multiplications and Convolutions on Modulation Spaces 613

2. $f\in\mathcal S_s'(\mathbf R^d)$ ($f\in\Sigma_s'(\mathbf R^d)$), if and only if

$$|V_\phi f(x,\xi)| \lesssim e^{r(|x|^{1/s}+|\xi|^{1/s})} \tag{40}$$

for every $r > 0$ (for some $r > 0$).

We refer to [31, Theorem 2.7] for the characterization 1. concerning Gelfand-
Shilov functions and to [51, Proposition 2.2] for the characterization 2. concerning
Gelfand-Shilov distributions.

2.4 Weight Functions

A weight or weight function on $\mathbf R^d$ is a positive function $\omega\in L^\infty_{loc}(\mathbf R^d)$ such that
$1/\omega\in L^\infty_{loc}(\mathbf R^d)$. The weight $\omega$ is called moderate, if there is a positive weight $v$ on
$\mathbf R^d$ and a constant $C\ge 1$ such that

ω(x + y) ≤ Cω(x)v(y), x, y ∈ Rd . (41)

If ω and v are weights on Rd such that (41) holds, then ω is also called v-moderate.
We note that (41) implies that ω fulfills the estimates

$$C^{-1}v(-x)^{-1}\le\omega(x)\le Cv(x),\qquad x\in\mathbf R^d. \tag{42}$$

We let PE (Rd ) be the set of all moderate weights on Rd .

We say that v is submultiplicative if

v(x + y) ≤ v(x)v(y) and v(−x) = v(x), x, y ∈ Rd . (43)

We observe that if v ∈ PE (Rd ) is even and satisfies

v(x + y) ≤ Cv(x)v(y), x, y ∈ Rd , (44)

for some constant $C > 0$, then for $v_0 = Cv$, one has that $v_0\in\mathscr P_E(\mathbf R^d)$ is
submultiplicative and $v\asymp v_0$ (see e.g. [17, 19, 28]).
We also recall from [29] that if $v$ is positive and locally bounded and satisfies (44),
then $v(x)\le C_0e^{r_0|x|}$ for some positive constants $C_0$ and $r_0$. In fact, if
$x\in\mathbf R^d$,

$$r=\sup_{|x|\le1}\log v(x),\qquad c=\log C,$$

and $n$ is an integer such that $n-1\le|x|\le n$, then (44) gives

$$v(x)=v(n\cdot(x/n))\le C^nv(x/n)^n\le C^ne^{rn}=e^{(r+c)n}\le e^{(r+c)(|x|+1)},$$

which gives the statement.

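Therefore, if $v$ is a submultiplicative weight, then

$$v(x)\lesssim e^{r|x|},\qquad x\in\mathbf R^d, \tag{45}$$

for some $r\ge0$. Hence, if $\omega\in\mathscr P_E(\mathbf R^d)$, then (41) and (45) imply

$$\omega(x+y)\lesssim\omega(x)e^{r|y|},\qquad x,y\in\mathbf R^d \tag{46}$$

for some $r>0$. In particular, (42) shows that for any $\omega_0\in\mathscr P_E(\mathbf R^d)$, there is a
constant $r>0$ such that

$$e^{-r|x|}\lesssim\omega_0(x)\lesssim e^{r|x|},\qquad x\in\mathbf R^d.$$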

If (41) holds, then there is a smallest positive even function $v_0$ such that (41)
holds with $C = 1$. We remark that this $v_0$ is given by

$$v_0(x)=\sup_{y\in\mathbf R^d}\left(\frac{\omega(x+y)}{\omega(y)},\ \frac{\omega(-x+y)}{\omega(y)}\right),$$

and is submultiplicative (see e.g. [19, 27, 49]). Consequently, if ω is a moderate

weight, then it is also moderated by a submultiplicative weight. In the sequel, v and
vj for j ≥ 0, always stand for submultiplicative weights if nothing else is stated.
We also remark that in the literature it is common to define submultiplicative
weights by requiring (43) without the condition $v(-x) = v(x)$, i.e. $v$ does
not have to be even (cf. e.g. [17, 19, 25, 28]). However, in the sequel it is convenient
for us to include this property in the definition.
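As a rough numerical illustration of the smallest even moderating function described above, the following sketch approximates the supremum on a finite grid for the sample weight $\omega(x) = (1+|x|)^{s}$ on the real line. The exponent `S`, the grids `ys` and `xs`, and the helper names are assumptions made only for this example; for this particular weight one expects $v_0(x) = (1+|x|)^{|s|}$.

```python
import numpy as np

S = -1.5  # hypothetical exponent of the sample weight; negative values are allowed

def omega(x):
    """Sample polynomial weight (1 + |x|)^S on the real line."""
    return (1.0 + np.abs(x)) ** S

ys = np.linspace(-50.0, 50.0, 20001)   # grid replacing the supremum over y (step 0.005)
xs = np.linspace(-5.0, 5.0, 41)        # points x where v0 is evaluated (step 0.25)

def v0(x):
    """Grid approximation of sup_y max(omega(x+y)/omega(y), omega(-x+y)/omega(y))."""
    ratios = np.maximum(omega(x + ys), omega(-x + ys)) / omega(ys)
    return float(ratios.max())

vals = np.array([v0(x) for x in xs])
expected = (1.0 + np.abs(xs)) ** abs(S)

print(np.allclose(vals, expected, rtol=1e-9))   # v0 agrees with (1 + |x|)^{|S|} on the grid
print(v0(3.0) <= v0(1.0) * v0(2.0))             # one instance of submultiplicativity
```

The grid only gives a lower bound for the true supremum, so the comparison is made up to a small relative tolerance.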
There are several subclasses of $\mathscr P_E(\mathbf R^d)$ which are interesting for different
reasons. Though our results later on are formulated for weights
in $\mathscr P_E(\mathbf R^d)$, we here mention some subclasses which especially appear in
time-frequency analysis. First we observe the class $\mathscr P_E^0(\mathbf R^d)$, which consists of all
$\omega\in\mathscr P_E(\mathbf R^d)$ such that (46) holds for every $r > 0$.
The class $\mathscr P_E^0(\mathbf R^d)$ is important when dealing with spectral invariance for
matrix or convolution operators on $\ell^2(\mathbf Z^d)$ (see e.g. [30]). If $v\in\mathscr P_E(\mathbf R^d)$ is
submultiplicative, then $v\in\mathscr P_E^0(\mathbf R^d)$, if and only if

$$\lim_{n\to\infty}v(nx)^{1/n}=1 \tag{47}$$

(see e.g. [23]). The condition (47) is equivalent to

$$\lim_{n\to\infty}\frac{\log(v(nx))}{n}=0, \tag{47$'$}$$

and is usually called the GRS condition, or Gelfand-Raikov-Shilov condition.
A more restrictive condition on $v$ compared to (47) is given by the Beurling-
Domar condition

$$\sum_{n=1}^{\infty}\frac{\log(v(nx))}{n^2}<\infty. \tag{48}$$

This condition is strongly linked to non quasi-analytic classes which contain
non-trivial compactly supported elements (see e.g. [29]). Any subexponential
submultiplicative weight satisfies the Beurling-Domar condition. That is, suppose
that $\theta\in(0,1)$ and that $v(x)=e^{r|x|^\theta}$, $x\in\mathbf R^d$; then (48) is fulfilled. We let $\mathscr P_{BD}(\mathbf R^d)$
be the set of all weights which are moderated by submultiplicative weights which
satisfy the Beurling-Domar condition.
Finally we let P(Rd ) be the set of all weights on Rd which are moderated by
polynomially bounded functions. That is, ω ∈ P(Rd ), if and only if there are
positive constants r and C such that

ω(x + y) ≤ Cω(x)(1 + |y|)r , x, y ∈ Rd .

Here we observe that v(x) = (1 + |x|)r is submultiplicative.

Among these weight classes we have

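$$\mathscr P(\mathbf R^d)\subsetneq\mathscr P_{BD}(\mathbf R^d)\subsetneq\mathscr P_E^0(\mathbf R^d)\subsetneq\mathscr P_E(\mathbf R^d). \tag{49}$$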

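In fact, it is clear that the ordering in (49) holds. On the other hand, if $r > 0$ and
$\theta\in(0,1)$, then due to

$$e^{r|x|^\theta}\in\mathscr P_{BD}(\mathbf R^d)\setminus\mathscr P(\mathbf R^d),\qquad e^{r|x|/\log(e+|x|)}\in\mathscr P_E^0(\mathbf R^d)\setminus\mathscr P_{BD}(\mathbf R^d), \tag{50}$$

$$\text{and}\qquad e^{r|x|}\in\mathscr P_E(\mathbf R^d)\setminus\mathscr P_E^0(\mathbf R^d),$$

it also follows that the inclusions in (49) are strict.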
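The three example weights above can be compared numerically. The sketch below evaluates the GRS quotient $\log(v(nx))/n$ and partial sums of the Beurling-Domar series in dimension one at $x = 1$; the parameter values `r = 1`, `theta = 0.5` and all function names are assumptions chosen for illustration, and finite partial sums can only suggest, not prove, convergence or divergence.

```python
import numpy as np

r, theta = 1.0, 0.5   # hypothetical parameter choices

def log_v_subexp(t):
    """log v for v(t) = exp(r |t|^theta)."""
    return r * np.abs(t) ** theta

def log_v_log(t):
    """log v for v(t) = exp(r |t| / log(e + |t|))."""
    return r * np.abs(t) / np.log(np.e + np.abs(t))

def log_v_exp(t):
    """log v for v(t) = exp(r |t|)."""
    return r * np.abs(t)

def grs_ratio(log_v, n, x=1.0):
    """The quotient log(v(n x)) / n from the GRS condition."""
    return float(log_v(n * x) / n)

def bd_partial_sum(log_v, N, x=1.0):
    """Partial sum of the Beurling-Domar series up to n = N."""
    n = np.arange(1, N + 1, dtype=float)
    return float(np.sum(log_v(n * x) / n ** 2))

for name, lv in [("subexp", log_v_subexp), ("log", log_v_log), ("exp", log_v_exp)]:
    print(name, grs_ratio(lv, 1e6), bd_partial_sum(lv, 10 ** 3), bd_partial_sum(lv, 10 ** 5))
```

For the subexponential weight both quantities stabilise, for the logarithmically corrected weight the GRS quotient decays while the partial sums keep growing slowly, and for the exponential weight the GRS quotient stays at `r`.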

We refer to [16, 28, 29, 49] for more facts about weights in time-frequency
analysis.

2.5 Mixed Norm Spaces of Lebesgue Type

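For every $p, q\in(0,\infty]$ and weight $\omega$ on $\mathbf R^{2d}$, we set

$$\|F\|_{L^{p,q}_{(\omega)}(\mathbf R^{2d})}\equiv\|G_{F,\omega,p}\|_{L^q(\mathbf R^d)},\quad\text{where}\quad G_{F,\omega,p}(\xi)=\|F(\,\cdot\,,\xi)\omega(\,\cdot\,,\xi)\|_{L^p(\mathbf R^d)}$$

and

$$\|F\|_{L^{p,q}_{*,(\omega)}(\mathbf R^{2d})}\equiv\|H_{F,\omega,q}\|_{L^p(\mathbf R^d)},\quad\text{where}\quad H_{F,\omega,q}(x)=\|F(x,\,\cdot\,)\omega(x,\,\cdot\,)\|_{L^q(\mathbf R^d)},$$

when $F$ is a (complex-valued) measurable function on $\mathbf R^{2d}$. Then $L^{p,q}_{(\omega)}(\mathbf R^{2d})$
($L^{p,q}_{*,(\omega)}(\mathbf R^{2d})$) consists of all measurable functions $F$ such that $\|F\|_{L^{p,q}_{(\omega)}}<\infty$
($\|F\|_{L^{p,q}_{*,(\omega)}}<\infty$).
In similar ways, let $\Lambda_1$, $\Lambda_2$ be discrete sets, $\omega$ be a positive function on
$\Lambda_1\times\Lambda_2$ and $\ell_0'(\Lambda_1\times\Lambda_2)$ be the set of all formal (complex-valued) sequences
$c=\{c(j,k)\}_{j\in\Lambda_1,k\in\Lambda_2}$. Then the discrete Lebesgue spaces, i.e. the Lebesgue
sequence spaces

$$\ell^{p,q}_{(\omega)}(\Lambda_1\times\Lambda_2)\quad\text{and}\quad\ell^{p,q}_{*,(\omega)}(\Lambda_1\times\Lambda_2)$$

of mixed (quasi-)norm types consist of all $c\in\ell_0'(\Lambda_1\times\Lambda_2)$ such that
$\|c\|_{\ell^{p,q}_{(\omega)}(\Lambda_1\times\Lambda_2)}<\infty$ respectively $\|c\|_{\ell^{p,q}_{*,(\omega)}(\Lambda_1\times\Lambda_2)}<\infty$. Here

$$\|c\|_{\ell^{p,q}_{(\omega)}(\Lambda_1\times\Lambda_2)}\equiv\|G_{c,\omega,p}\|_{\ell^q(\Lambda_2)},\quad\text{where}\quad G_{c,\omega,p}(k)=\|c(\,\cdot\,,k)\omega(\,\cdot\,,k)\|_{\ell^p(\Lambda_1)}$$

and

$$\|c\|_{\ell^{p,q}_{*,(\omega)}(\Lambda_1\times\Lambda_2)}\equiv\|H_{c,\omega,q}\|_{\ell^p(\Lambda_1)},\quad\text{where}\quad H_{c,\omega,q}(j)=\|c(j,\,\cdot\,)\omega(j,\,\cdot\,)\|_{\ell^q(\Lambda_2)},$$

when $c\in\ell_0'(\Lambda_1\times\Lambda_2)$.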
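The two discrete mixed (quasi-)norms differ only in the order in which the two indices are processed. A minimal sketch for a finite array, assuming the first array axis plays the role of the index $j$ and the second the role of $k$; the helper names `lp_norm`, `mixed_norm` and `mixed_norm_star` are hypothetical:

```python
import numpy as np

def lp_norm(a, p, axis):
    """l^p (quasi-)norm along one axis, for p in (0, inf]."""
    a = np.abs(a)
    if np.isinf(p):
        return a.max(axis=axis)
    return (a ** p).sum(axis=axis) ** (1.0 / p)

def mixed_norm(c, w, p, q):
    """First l^p in j (axis 0) of c * w, then l^q in k."""
    G = lp_norm(c * w, p, axis=0)
    return float(lp_norm(G, q, axis=0))

def mixed_norm_star(c, w, p, q):
    """First l^q in k (axis 1) of c * w, then l^p in j."""
    H = lp_norm(c * w, q, axis=1)
    return float(lp_norm(H, p, axis=0))

c = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
w = np.ones_like(c)   # trivial weight

print(mixed_norm(c, w, 2, 1), mixed_norm_star(c, w, 2, 1))
```

For $p \ne q$ the two values generally differ, which is why the starred and unstarred spaces are kept apart.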

2.6 Convolutions and Multiplications for Discrete Lebesgue Spaces

Next we discuss extended Hölder and Young relations for multiplications and
convolutions on discrete Lebesgue spaces. The Hölder and Young conditions on the
Lebesgue exponents are then

$$\frac1{q_0}\le\frac1{q_1}+\frac1{q_2}, \tag{51}$$

respectively

$$\frac1{p_0}\le\frac1{p_1}+\frac1{p_2}-\max\left(1,\frac1{p_1},\frac1{p_2}\right). \tag{52}$$

Notice that, when $p_1, p_2\in(0,1)$, then (52) becomes $p_0\ge\max\{p_1,p_2\}$, while
for $p_1, p_2\ge1$ it reduces to the common Young condition

$$1+\frac1{p_0}\le\frac1{p_1}+\frac1{p_2}.$$

The conditions on the weight functions are

$$\omega_0(j)\le\omega_1(j)\omega_2(j),\qquad j\in\Lambda, \tag{53}$$

respectively

$$\omega_0(j_1+j_2)\le\omega_1(j_1)\omega_2(j_2),\qquad j_1,j_2\in\Lambda, \tag{54}$$

where $\Lambda$ is a lattice of the form

$$\Lambda=\{\,n_1e_1+\cdots+n_de_d\,;\,(n_1,\dots,n_d)\in\mathbf Z^d\,\},$$

where $e_1,\dots,e_d$ is a basis for $\mathbf R^d$.

Proposition 2.5 Let $p_j, q_j\in(0,\infty]$, $j = 0, 1, 2$, be such that (51) and (52) hold,
let $\Lambda\subseteq\mathbf R^d$ be a lattice and let $\omega_j$ be weights on $\Lambda$, $j = 0, 1, 2$. Then the following
is true:
1. if (53) holds, then the map $(a_1,a_2)\mapsto a_1\cdot a_2$ from $\ell_0(\Lambda)\times\ell_0(\Lambda)$ to $\ell_0(\Lambda)$
extends uniquely to a continuous map from $\ell^{q_1}_{(\omega_1)}(\Lambda)\times\ell^{q_2}_{(\omega_2)}(\Lambda)$ to $\ell^{q_0}_{(\omega_0)}(\Lambda)$, and

$$\|a_1\cdot a_2\|_{\ell^{q_0}_{(\omega_0)}}\le\|a_1\|_{\ell^{q_1}_{(\omega_1)}}\|a_2\|_{\ell^{q_2}_{(\omega_2)}},\qquad a_j\in\ell^{q_j}_{(\omega_j)}(\Lambda),\ j=1,2; \tag{55}$$

2. if (54) holds, then the map $(a_1,a_2)\mapsto a_1*a_2$ from $\ell_0(\Lambda)\times\ell_0(\Lambda)$ to $\ell_0(\Lambda)$
extends uniquely to a continuous map from $\ell^{p_1}_{(\omega_1)}(\Lambda)\times\ell^{p_2}_{(\omega_2)}(\Lambda)$ to $\ell^{p_0}_{(\omega_0)}(\Lambda)$, and

$$\|a_1*a_2\|_{\ell^{p_0}_{(\omega_0)}}\le\|a_1\|_{\ell^{p_1}_{(\omega_1)}}\|a_2\|_{\ell^{p_2}_{(\omega_2)}},\qquad a_j\in\ell^{p_j}_{(\omega_j)}(\Lambda),\ j=1,2. \tag{56}$$

The assertion 1. in Proposition 2.5 is the standard Hölder's inequality for discrete
Lebesgue spaces. The assertion 2. in that proposition is the usual Young's inequality
for Lebesgue spaces on lattices in the case when $p_0, p_1, p_2\in[1,\infty]$. A proof of
Proposition 2.5 is given in Appendix A in [54].
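The exponent conditions (51) and (52) and the two inequalities of Proposition 2.5 can be spot-checked on finitely supported sequences on the lattice $\mathbf Z$. The following sketch is only an illustration: the random data, the exponential weight $e^{0.3|n|}$ used for all three weights (which satisfies both (53) and (54) on the sampled index range), the tolerance `tol` and all helper names are assumptions.

```python
import numpy as np

def inv(p):
    """1/p with the convention 1/inf = 0."""
    return 0.0 if np.isinf(p) else 1.0 / p

def holder_ok(q0, q1, q2, tol=1e-12):
    """Exponent condition (51)."""
    return inv(q0) <= inv(q1) + inv(q2) + tol

def young_ok(p0, p1, p2, tol=1e-12):
    """Exponent condition (52), valid also for exponents below 1."""
    return inv(p0) <= inv(p1) + inv(p2) - max(1.0, inv(p1), inv(p2)) + tol

def lp(a, p, w=1.0):
    """Weighted l^p (quasi-)norm of a finite sequence."""
    a = np.abs(a) * w
    return float(a.max()) if np.isinf(p) else float((a ** p).sum() ** (1.0 / p))

rng = np.random.default_rng(0)
a1, a2 = rng.standard_normal(8), rng.standard_normal(8)   # supported on 0..7
w = np.exp(0.3 * np.arange(8))          # common weight for the factors
w_conv = np.exp(0.3 * np.arange(15))    # same weight on the support 0..14 of a1 * a2

def holder_gap(q0, q1, q2):
    """Right-hand side minus left-hand side of (55)."""
    return lp(a1, q1, w) * lp(a2, q2, w) - lp(a1 * a2, q0, w)

def young_gap(p0, p1, p2):
    """Right-hand side minus left-hand side of (56)."""
    return lp(a1, p1, w) * lp(a2, p2, w) - lp(np.convolve(a1, a2), p0, w_conv)

print(young_ok(0.5, 0.5, 0.5), young_gap(0.5, 0.5, 0.5) >= 0)
print(holder_ok(0.7, 0.7, 0.7), holder_gap(0.7, 0.7, 0.7) >= 0)
```

A nonnegative gap on one random sample is of course no proof; the point is only to make the roles of the exponents and weights concrete, including the quasi-Banach range below 1.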

3 Modulation Spaces, Multiplications and Convolutions

In this section we introduce modulation spaces, and recall their basic properties,
in particular in the context of Gelfand-Shilov spaces. Notice that we permit
the Lebesgue exponents to belong to the full interval (0, ∞] instead of the
most common choice [1, ∞], and general moderate weights which may have a
(sub)exponential growth. Here we also recall some facts on Gabor expansions for
modulation spaces.
Then we deduce multiplication and convolution estimates on modulation spaces.
There are several approaches to multiplication and convolution in the case when the
involved Lebesgue exponents belong to [1, ∞] (see [9, 17, 19, 32, 43, 48]). Here we
consider the case when these exponents belong to (0, ∞) (see also [1, 2, 25, 41, 42,
50]). In addition, and in order to keep the survey style of our exposition, we focus
on the bilinear case, and refer to [54] for extension of these results to multi-linear
products.

3.1 Modulation Spaces

The (classical) modulation spaces, essentially introduced in [17] by Feichtinger, are
given in the following. (See e.g. [18] for the definition of more general modulation
spaces.)
Definition 3.1 Let $p, q\in(0,\infty]$, $\omega\in\mathscr P_E(\mathbf R^{2d})$ and $\phi\in\Sigma_1(\mathbf R^d)\setminus\{0\}$.
1. The modulation space $M^{p,q}_{(\omega)}(\mathbf R^d)$ consists of all $f\in\Sigma_1'(\mathbf R^d)$ such that

$$\|f\|_{M^{p,q}_{(\omega)}}\equiv\|V_\phi f\|_{L^{p,q}_{(\omega)}}$$

is finite. The topology of $M^{p,q}_{(\omega)}(\mathbf R^d)$ is defined by the (quasi-)norm $\|\cdot\|_{M^{p,q}_{(\omega)}}$;
2. The modulation space (of Wiener amalgam type) $W^{p,q}_{(\omega)}(\mathbf R^d)$ consists of all
$f\in\Sigma_1'(\mathbf R^d)$ such that

$$\|f\|_{W^{p,q}_{(\omega)}}\equiv\|V_\phi f\|_{L^{p,q}_{*,(\omega)}}$$

is finite. The topology of $W^{p,q}_{(\omega)}(\mathbf R^d)$ is defined by the (quasi-)norm $\|\cdot\|_{W^{p,q}_{(\omega)}}$.
For convenience we set $M^{p,q}=M^{p,q}_{(\omega)}$ and $W^{p,q}=W^{p,q}_{(\omega)}$ when the weight $\omega$ is
trivial, i.e. when $\omega(x,\xi)=1$ for every $x,\xi\in\mathbf R^d$. We also set

$$M^p_{(\omega)}\equiv M^{p,p}_{(\omega)}\ (=W^{p,p}_{(\omega)})\quad\text{and}\quad M^p\equiv M^{p,p}\ (=W^{p,p}).$$

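Remark 3.2 Modulation spaces possess several convenient properties. Let $p, q\in(0,\infty]$,
$\omega\in\mathscr P_E(\mathbf R^{2d})$ and $\phi\in\Sigma_1(\mathbf R^d)\setminus\{0\}$. Then the following is true (see
[17–20, 25, 28] and their analyses for verifications):
• the definitions of $M^{p,q}_{(\omega)}(\mathbf R^d)$ and $W^{p,q}_{(\omega)}(\mathbf R^d)$ are independent of the choices of
$\phi\in\Sigma_1(\mathbf R^d)\setminus\{0\}$, and different choices give rise to equivalent quasi-norms;
• the spaces $M^{p,q}_{(\omega)}(\mathbf R^d)$ and $W^{p,q}_{(\omega)}(\mathbf R^d)$ are quasi-Banach spaces which increase
with $p$ and $q$, and decrease with $\omega$. If in addition $p, q\ge1$, then they are Banach
spaces;
• if $p, q\ge1$, then the $L^2(\mathbf R^d)$ scalar product, $(\,\cdot\,,\,\cdot\,)_{L^2(\mathbf R^d)}$, on $\Sigma_1(\mathbf R^d)\times\Sigma_1(\mathbf R^d)$
is uniquely extendable to dualities between $M^{p,q}_{(\omega)}(\mathbf R^d)$ and $M^{p',q'}_{(1/\omega)}(\mathbf R^d)$,
and between $W^{p,q}_{(\omega)}(\mathbf R^d)$ and $W^{p',q'}_{(1/\omega)}(\mathbf R^d)$. If in addition $p, q<\infty$, then the
dual spaces of $M^{p,q}_{(\omega)}(\mathbf R^d)$ and $W^{p,q}_{(\omega)}(\mathbf R^d)$ can be identified with $M^{p',q'}_{(1/\omega)}(\mathbf R^d)$
respectively $W^{p',q'}_{(1/\omega)}(\mathbf R^d)$, through the form $(\,\cdot\,,\,\cdot\,)_{L^2(\mathbf R^d)}$;
• if $\omega_0(x,\xi)=\omega(-\xi,x)$, then $\mathscr F$ on $\Sigma_1'(\mathbf R^d)$ restricts to a homeomorphism from
$M^{p,q}_{(\omega)}(\mathbf R^d)$ to $W^{q,p}_{(\omega_0)}(\mathbf R^d)$;
• the inclusions

$$\Sigma_1(\mathbf R^d)\subseteq M^{p,q}_{(\omega)}(\mathbf R^d),\ W^{p,q}_{(\omega)}(\mathbf R^d)\subseteq\Sigma_1'(\mathbf R^d)\quad\text{when }\omega\in\mathscr P_E(\mathbf R^{2d}), \tag{57}$$

$$\mathcal S_1(\mathbf R^d)\subseteq M^{p,q}_{(\omega)}(\mathbf R^d),\ W^{p,q}_{(\omega)}(\mathbf R^d)\subseteq\mathcal S_1'(\mathbf R^d)\quad\text{when }\omega\in\mathscr P_E^0(\mathbf R^{2d}) \tag{58}$$

and

$$\mathscr S(\mathbf R^d)\subseteq M^{p,q}_{(\omega)}(\mathbf R^d),\ W^{p,q}_{(\omega)}(\mathbf R^d)\subseteq\mathscr S'(\mathbf R^d)\quad\text{when }\omega\in\mathscr P(\mathbf R^{2d}) \tag{59}$$

are continuous. If in addition $p, q<\infty$, then these inclusions are dense.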

We recall from [49] that the embeddings (57)–(59) are essentially special cases of
certain characterizations of the Schwartz space, Gelfand-Shilov spaces and their
distribution spaces in terms of suitable unions and intersections of modulation
spaces. In fact, let $p, q\in(0,\infty]$ and $s\ge1$ be fixed and set

$$v_{r,t}(x,\xi)=\begin{cases}e^{r(|x|^{1/t}+|\xi|^{1/t})}, & t\in\mathbf R_+,\\[2pt] (1+|x|+|\xi|)^r, & t=\infty,\end{cases} \tag{60}$$

where $r > 0$. Then

$$\Sigma_s(\mathbf R^d)=\bigcap_{r>0}M^{p,q}_{(v_{r,s})}(\mathbf R^d)=\bigcap_{r>0}W^{p,q}_{(v_{r,s})}(\mathbf R^d), \tag{61}$$

$$\mathcal S_s(\mathbf R^d)=\bigcup_{r>0}M^{p,q}_{(v_{r,s})}(\mathbf R^d)=\bigcup_{r>0}W^{p,q}_{(v_{r,s})}(\mathbf R^d), \tag{62}$$

$$\mathscr S(\mathbf R^d)=\bigcap_{r>0}M^{p,q}_{(v_{r,\infty})}(\mathbf R^d)=\bigcap_{r>0}W^{p,q}_{(v_{r,\infty})}(\mathbf R^d), \tag{63}$$

$$\mathscr S'(\mathbf R^d)=\bigcup_{r>0}M^{p,q}_{(1/v_{r,\infty})}(\mathbf R^d)=\bigcup_{r>0}W^{p,q}_{(1/v_{r,\infty})}(\mathbf R^d), \tag{64}$$

$$\mathcal S_s'(\mathbf R^d)=\bigcap_{r>0}M^{p,q}_{(1/v_{r,s})}(\mathbf R^d)=\bigcap_{r>0}W^{p,q}_{(1/v_{r,s})}(\mathbf R^d) \tag{65}$$

and

$$\Sigma_s'(\mathbf R^d)=\bigcup_{r>0}M^{p,q}_{(1/v_{r,s})}(\mathbf R^d)=\bigcup_{r>0}W^{p,q}_{(1/v_{r,s})}(\mathbf R^d). \tag{66}$$

The topologies of the spaces on the left-hand sides of (61)–(66) are obtained by
replacing each intersection by projective limit with respect to r > 0 and each union
with inductive limit with respect to r > 0.
The relations (61)–(66) are essentially special cases of [49, Theorem 3.9], see
also [31, 45, 46]. In order to be self-contained we here give a proof of (62).
Proof of (62) Since

$$M^\infty_{(v_{2r,s})}(\mathbf R^d)\subseteq M^{p,q}_{(v_{r,s})}(\mathbf R^d),\ W^{p,q}_{(v_{r,s})}(\mathbf R^d)\subseteq M^\infty_{(v_{r,s})}(\mathbf R^d),$$

it suffices to prove the result for $p = q = \infty$. Let $\phi\in\Sigma_1(\mathbf R^d)\setminus\{0\}$ be fixed. First
suppose that

$$f\in M^\infty_{(v_{r,s})}(\mathbf R^d)=W^\infty_{(v_{r,s})}(\mathbf R^d).$$

Then it follows from the definition of the modulation space norm that (39) holds for
some $r > 0$. By Remark 2.4 it follows that $f\in\mathcal S_s(\mathbf R^d)$, and we have proved

$$\bigcup_{r>0}M^\infty_{(v_{r,s})}(\mathbf R^d)\subseteq\mathcal S_s(\mathbf R^d). \tag{67}$$

Suppose instead that $f\in\mathcal S_s(\mathbf R^d)$. Then (39) holds for some $r > 0$, giving that
$f\in M^\infty_{(v_{r,s})}(\mathbf R^d)$. Hence (67) holds with reversed inclusion, and the result follows.

Example 3.3 Let $p = q = 1$ and $\omega = 1$. Then $M^{1,1}_{(\omega)}(\mathbf R^d)=M^1(\mathbf R^d)$ is the
Feichtinger algebra, probably the most prominent example of a modulation space.
We refer to a recent survey [34] for a detailed account on $M^1(\mathbf R^d)$, and to [14,
Lemma 11] for a list of its basic properties.
Familiar examples arise when $p = q = 2$ and $\omega = 1$. Then $M^{2,2}_{(\omega)}(\mathbf R^d)=M^2(\mathbf R^d)=L^2(\mathbf R^d)$, and

$$M^{2,2}_{(\omega_s)}(\mathbf R^d)=H^s(\mathbf R^d),\qquad s\in\mathbf R,$$

where $\omega_s(\xi)=\langle\xi\rangle^s$, and $H^s(\mathbf R^d)$ is the Sobolev space (also known as the Bessel
potential space) of distributions $f\in\mathscr S'(\mathbf R^d)$ such that

$$\|f\|^2_{H^s}:=\int_{\mathbf R^d}\langle\xi\rangle^{2s}|\widehat f(\xi)|^2\,d\xi<\infty,$$

cf. [28, Proposition 11.3.1]. Furthermore, if $v_s(x,\xi)=\langle(x,\xi)\rangle^s$, then $M^{2,2}_{(v_s)}(\mathbf R^d)=Q_s(\mathbf R^d)$,
$s\in\mathbf R$ [7, Lemma 2.3]. Here $Q_s$ denotes the Shubin-Sobolev space [44].

Finally we remark that modulation spaces can be conveniently discretized in
terms of Gabor expansions. In order to explain some basic issues on this, in
a similar way as in Subsection 1.5 in [54], we limit ourselves to the case when the
involved weights are moderated by subexponential functions. That is, we suppose
that $\omega$ in $M^{p,q}_{(\omega)}(\mathbf R^d)$ satisfies

$$\omega(x+y,\xi+\eta)\lesssim\omega(x,\xi)e^{r(|y|^{1/s}+|\eta|^{1/s})}, \tag{68}$$

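for some $s > 1$ and $r > 0$. We observe that this implies that

$$\Sigma_s(\mathbf R^d)\subseteq M^{p,q}_{(\omega)}(\mathbf R^d)\subseteq\Sigma_s'(\mathbf R^d), \tag{69}$$

in view of (42), (61) and (66). For more general approaches we refer to [19, 27, 28,
42, 50].
Since $s > 1$, it follows from Sections 1.3 and 1.4 in [33] that there are $\phi,\psi\in\Sigma_s(\mathbf R^d)$
with values in $[0, 1]$ such that

$$\operatorname{supp}\phi\subseteq\left[-\tfrac34,\tfrac34\right]^d,\qquad\phi(x)=1\ \text{when}\ x\in\left[-\tfrac14,\tfrac14\right]^d, \tag{70}$$

$$\operatorname{supp}\psi\subseteq[-1,1]^d,\qquad\psi(x)=1\ \text{when}\ x\in\left[-\tfrac34,\tfrac34\right]^d, \tag{71}$$

and

$$\sum_{j\in\mathbf Z^d}\phi(\,\cdot\,-j)=1. \tag{72}$$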

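Let $f\in\Sigma_s'(\mathbf R^d)$. Then $x\mapsto f(x)\phi(x-j)$ belongs to $\Sigma_s'(\mathbf R^d)$ and is supported in
$j+[-\frac34,\frac34]^d$. Hence, by periodization it follows from Fourier analysis that

$$f(x)\phi(x-j)=\sum_{\iota\in\pi\mathbf Z^d}c(j,\iota)e^{i\langle x,\iota\rangle},\qquad x\in j+[-1,1]^d, \tag{73}$$

where

$$c(j,\iota)=2^{-d}(f,\phi(\,\cdot\,-j)e^{i\langle\,\cdot\,,\iota\rangle})=\left(\frac\pi2\right)^{d/2}V_\phi f(j,\iota),\qquad j\in\mathbf Z^d,\ \iota\in\pi\mathbf Z^d.$$

Since $\psi = 1$ on the support of $\phi$, (73) gives

$$f(x)\phi(x-j)=\left(\frac\pi2\right)^{d/2}\sum_{\iota\in\pi\mathbf Z^d}V_\phi f(j,\iota)\psi(x-j)e^{i\langle x,\iota\rangle},\qquad x\in\mathbf R^d. \tag{73$'$}$$

By (72) it now follows that

$$f(x)=\left(\frac\pi2\right)^{d/2}\sum_{(j,\iota)\in\Lambda}V_\phi f(j,\iota)\psi(x-j)e^{i\langle x,\iota\rangle},\qquad x\in\mathbf R^d, \tag{74}$$

where

$$\Lambda=\mathbf Z^d\times(\pi\mathbf Z^d), \tag{75}$$

which is the Gabor expansion of $f$ with respect to the Gabor pair $(\phi,\psi)$ and lattice
$\Lambda$, i.e. with respect to the Gabor atom $\phi$ and the dual Gabor atom $\psi$. Here the series
converges in $\Sigma_s'(\mathbf R^d)$. By duality and the fact that compactly supported elements in
$\Sigma_s(\mathbf R^d)$ are dense in $\Sigma_s(\mathbf R^d)$ we also have

$$f(x)=\left(\frac\pi2\right)^{d/2}\sum_{(j,\iota)\in\Lambda}V_\psi f(j,\iota)\phi(x-j)e^{i\langle x,\iota\rangle},\qquad x\in\mathbf R^d, \tag{76}$$

with convergence in $\Sigma_s'(\mathbf R^d)$.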

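Let $T$ be a linear continuous operator from $\Sigma_s(\mathbf R^d)$ to $\Sigma_s'(\mathbf R^d)$ and let
$f\in\Sigma_s(\mathbf R^d)$. Then it follows from (74) that

$$(Tf)(x)=\left(\frac\pi2\right)^{d/2}\sum_{(j,\iota)\in\Lambda}V_\phi f(j,\iota)\,T(\psi(\,\cdot\,-j)e^{i\langle\,\cdot\,,\iota\rangle})(x)$$

and

$$T(\psi(\,\cdot\,-j)e^{i\langle\,\cdot\,,\iota\rangle})(x)=\left(\frac\pi2\right)^{d/2}\sum_{(k,\kappa)\in\Lambda}(V_\phi(T(\psi(\,\cdot\,-j)e^{i\langle\,\cdot\,,\iota\rangle})))(k,\kappa)\,\psi(x-k)e^{i\langle x,\kappa\rangle}.$$

A combination of these expansions shows that

$$(Tf)(x)=\left(\frac\pi2\right)^{d/2}\sum_{(j,\iota)\in\Lambda}(A\cdot V_\phi f)(j,\iota)\,\psi(x-j)e^{i\langle x,\iota\rangle}, \tag{77}$$

where $A=(a(\boldsymbol j,\boldsymbol k))_{\boldsymbol j,\boldsymbol k\in\Lambda}$ is the $\Lambda\times\Lambda$-matrix, given by

$$a(\boldsymbol j,\boldsymbol k)=\left(\frac\pi2\right)^{d/2}(T(\psi(\,\cdot\,-j)e^{i\langle\,\cdot\,,\iota\rangle}),\phi(\,\cdot\,-k)e^{i\langle\,\cdot\,,\kappa\rangle})_{L^2(\mathbf R^d)}\quad\text{when }\boldsymbol j=(j,\iota)\text{ and }\boldsymbol k=(k,\kappa). \tag{78}$$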

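By the Gabor analysis for modulation spaces we get the following restatement of
[54, Proposition 1.8]. We refer to [17, 19–21, 25, 27, 28, 50] for details.
Proposition 3.4 Let $s > 1$, $p, q\in(0,\infty]$, $\omega\in\mathscr P_E(\mathbf R^{2d})$ be such that (68) holds
for some $r > 0$, $\phi,\psi\in\Sigma_s(\mathbf R^d)$ with values in $[0, 1]$ be such that (70), (71) and (72)
hold true, and let $f\in\Sigma_s'(\mathbf R^d)$. Then the following is true:
1. $f\in M^{p,q}_{(\omega)}(\mathbf R^d)$, if and only if $\|V_\phi f\|_{\ell^{p,q}_{(\omega)}(\mathbf Z^d\times\pi\mathbf Z^d)}<\infty$;
2. $f\in M^{p,q}_{(\omega)}(\mathbf R^d)$, if and only if $\|V_\psi f\|_{\ell^{p,q}_{(\omega)}(\mathbf Z^d\times\pi\mathbf Z^d)}<\infty$;
3. the quasi-norms

$$f\mapsto\|V_\phi f\|_{\ell^{p,q}_{(\omega)}(\mathbf Z^d\times\pi\mathbf Z^d)}\quad\text{and}\quad f\mapsto\|V_\psi f\|_{\ell^{p,q}_{(\omega)}(\mathbf Z^d\times\pi\mathbf Z^d)}$$

are equivalent to $\|\cdot\|_{M^{p,q}_{(\omega)}}$.

The same holds true with $W^{p,q}_{(\omega)}$ and $\ell^{p,q}_{*,(\omega)}$ in place of $M^{p,q}_{(\omega)}$ respectively $\ell^{p,q}_{(\omega)}$ at each
occurrence.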

3.2 Multiplications and Convolutions in Modulation Spaces

As a first step for approaching multiplications and convolutions for elements in
modulation spaces, we reformulate such products in terms of short-time Fourier
transforms. Let $\phi_0,\phi_1,\phi_2\in\Sigma_1(\mathbf R^d)$ be fixed such that

$$\phi_0=(2\pi)^{-d/2}\phi_1\phi_2 \tag{79}$$

and let $f_1, f_2\in\Sigma_1(\mathbf R^d)$. Then the multiplication $f_0 = f_1f_2$ can be expressed by

$$F_0(x,\xi)=\big(F_1(x,\,\cdot\,)*F_2(x,\,\cdot\,)\big)(\xi), \tag{80}$$

where

$$F_j=V_{\phi_j}f_j,\qquad j=0,1,2. \tag{81}$$

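In fact, by Fourier's inversion formula we get

$$\begin{aligned}\big((V_{\phi_1}f_1)(x,\,\cdot\,)&*(V_{\phi_2}f_2)(x,\,\cdot\,)\big)(\xi)\\ &=(2\pi)^{-d}\iiint f_1(y_1)\overline{\phi_1(y_1-x)}f_2(y_2)\overline{\phi_2(y_2-x)}e^{-i\langle y_1,\xi-\eta\rangle}e^{-i\langle y_2,\eta\rangle}\,dy_1dy_2d\eta\\ &=\int f_1(y)\overline{\phi_1(y-x)}f_2(y)\overline{\phi_2(y-x)}e^{-i\langle y,\xi\rangle}\,dy=(2\pi)^{d/2}(V_{\phi_1\phi_2}(f_1f_2))(x,\xi).\end{aligned}$$

We also observe that we may extract $f_0 = f_1f_2$ by the formula

$$f_0=(\|\phi_0\|_{L^2})^{-1}V^*_{\phi_0}F_0, \tag{82}$$

provided $\phi_0$ is not trivially equal to 0.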

In the same way, let $\phi_0,\phi_1,\phi_2\in\Sigma_1(\mathbf R^d)$ be fixed such that

$$\phi_0=(2\pi)^{d/2}\phi_1*\phi_2, \tag{83}$$

and let $f_1, f_2, g\in\Sigma_1(\mathbf R^d)$. Then the convolution $f_0 = f_1*f_2$ can be expressed by

$$F_0(x,\xi)=\big(F_1(\,\cdot\,,\xi)*F_2(\,\cdot\,,\xi)\big)(x), \tag{84}$$

where $F_j$ are given by (81), and we may extract $f_0 = f_1*f_2$ from (82).
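The two identities (80) and (84) have exact finite analogues that can be tested numerically. The sketch below works on the cyclic group $\mathbf Z_N$ instead of $\mathbf R^d$, which is an assumption made purely for illustration: the transform `stft`, the cyclic convolution `cconv` and the random test vectors are hypothetical stand-ins, and the normalizing constants differ from the continuous ones (the factor $(2\pi)^{\pm d/2}$ is replaced by $N$ in the product case and by $1$ in the convolution case).

```python
import numpy as np

def stft(f, phi):
    """Cyclic STFT: V[x, xi] = sum_y f[y] * conj(phi[y - x]) * exp(-2j*pi*y*xi/N)."""
    N = len(f)
    return np.array([np.fft.fft(f * np.conj(np.roll(phi, x))) for x in range(N)])

def cconv(a, b, axis):
    """Cyclic convolution along the given axis, computed with the FFT."""
    return np.fft.ifft(np.fft.fft(a, axis=axis) * np.fft.fft(b, axis=axis), axis=axis)

rng = np.random.default_rng(0)
N = 16
f1, f2, phi1, phi2 = (rng.standard_normal(N) + 1j * rng.standard_normal(N)
                      for _ in range(4))
V1, V2 = stft(f1, phi1), stft(f2, phi2)

# Product f1*f2 with window phi1*phi2: convolve in the frequency variable (axis 1).
lhs_mult = cconv(V1, V2, axis=1)
rhs_mult = N * stft(f1 * f2, phi1 * phi2)

# Convolution f1 (*) f2 with window phi1 (*) phi2: convolve in the time variable (axis 0).
lhs_conv = cconv(V1, V2, axis=0)
rhs_conv = stft(cconv(f1, f2, axis=0), cconv(phi1, phi2, axis=0))

print(np.allclose(lhs_mult, rhs_mult), np.allclose(lhs_conv, rhs_conv))
```

In both cases the operation on the signals turns into a convolution of the transforms in one variable while the other variable is held fixed, which is the structural fact used in the estimates that follow.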
Next we discuss convolutions and multiplications for modulation spaces, and
start with multiplications. For multiplications of elements in modulation spaces
we need to swap the conditions for the
involved Lebesgue exponents compared to (51) and (52). That is, these conditions
become

$$\frac1{p_0}\le\frac1{p_1}+\frac1{p_2},\qquad\frac1{q_0}\le\frac1{q_1}+\frac1{q_2}-\max\left(1,\frac1{p_0},\frac1{q_1},\frac1{q_2}\right), \tag{85}$$

$$\frac1{p_0}\le\frac1{p_1}+\frac1{p_2},\qquad\frac1{q_0}\le\frac1{q_1}+\frac1{q_2}-\max\left(1,\frac1{q_1},\frac1{q_2}\right). \tag{86}$$

The conditions on the weight functions are

$$\omega_0(x,\xi_1+\xi_2)\le\omega_1(x,\xi_1)\omega_2(x,\xi_2),\qquad x,\xi_1,\xi_2\in\mathbf R^d, \tag{87}$$

respectively

$$\omega_0(x_1+x_2,\xi)\le\omega_1(x_1,\xi)\omega_2(x_2,\xi),\qquad x_1,x_2,\xi\in\mathbf R^d. \tag{88}$$

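Theorem 3.5 Let $p_j, q_j\in(0,\infty)$ and $\omega_j\in\mathscr P_E(\mathbf R^{2d})$, $j = 0, 1, 2$, be such
that (85) and (87) hold. Then $(f_1,f_2)\mapsto f_1f_2$ from $\Sigma_1(\mathbf R^d)\times\Sigma_1(\mathbf R^d)$ to $\Sigma_1(\mathbf R^d)$
is uniquely extendable to a continuous map from $M^{p_1,q_1}_{(\omega_1)}(\mathbf R^d)\times M^{p_2,q_2}_{(\omega_2)}(\mathbf R^d)$ to
$M^{p_0,q_0}_{(\omega_0)}(\mathbf R^d)$, and

$$\|f_1f_2\|_{M^{p_0,q_0}_{(\omega_0)}}\lesssim\|f_1\|_{M^{p_1,q_1}_{(\omega_1)}}\|f_2\|_{M^{p_2,q_2}_{(\omega_2)}},\qquad f_j\in M^{p_j,q_j}_{(\omega_j)}(\mathbf R^d),\ j=1,2. \tag{89}$$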

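Theorem 3.6 Let $p_j, q_j\in(0,\infty)$ and $\omega_j\in\mathscr P_E(\mathbf R^{2d})$, $j = 0, 1, 2$, be such
that (86) and (87) hold. Then $(f_1,f_2)\mapsto f_1f_2$ from $\Sigma_1(\mathbf R^d)\times\Sigma_1(\mathbf R^d)$ to $\Sigma_1(\mathbf R^d)$
is uniquely extendable to a continuous map from $W^{p_1,q_1}_{(\omega_1)}(\mathbf R^d)\times W^{p_2,q_2}_{(\omega_2)}(\mathbf R^d)$ to
$W^{p_0,q_0}_{(\omega_0)}(\mathbf R^d)$, and

$$\|f_1f_2\|_{W^{p_0,q_0}_{(\omega_0)}}\lesssim\|f_1\|_{W^{p_1,q_1}_{(\omega_1)}}\|f_2\|_{W^{p_2,q_2}_{(\omega_2)}},\qquad f_j\in W^{p_j,q_j}_{(\omega_j)}(\mathbf R^d),\ j=1,2. \tag{90}$$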

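The corresponding results for convolutions are the following. Here the conditions
on the involved Lebesgue exponents are swapped as

$$\frac1{p_0}\le\frac1{p_1}+\frac1{p_2}-\max\left(1,\frac1{q_0},\frac1{p_1},\frac1{p_2}\right),\qquad\frac1{q_0}\le\frac1{q_1}+\frac1{q_2}, \tag{91}$$

$$\frac1{p_0}\le\frac1{p_1}+\frac1{p_2}-\max\left(1,\frac1{p_1},\frac1{p_2}\right),\qquad\frac1{q_0}\le\frac1{q_1}+\frac1{q_2}. \tag{92}$$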

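Theorem 3.7 Let $p_j, q_j\in(0,\infty)$ and $\omega_j\in\mathscr P_E(\mathbf R^{2d})$, $j = 0, 1, 2$, be such
that (88) and (92) hold. Then $(f_1,f_2)\mapsto f_1*f_2$ from $\Sigma_1(\mathbf R^d)\times\Sigma_1(\mathbf R^d)$ to
$\Sigma_1(\mathbf R^d)$ is uniquely extendable to a continuous map from $M^{p_1,q_1}_{(\omega_1)}(\mathbf R^d)\times M^{p_2,q_2}_{(\omega_2)}(\mathbf R^d)$
to $M^{p_0,q_0}_{(\omega_0)}(\mathbf R^d)$, and

$$\|f_1*f_2\|_{M^{p_0,q_0}_{(\omega_0)}}\lesssim\|f_1\|_{M^{p_1,q_1}_{(\omega_1)}}\|f_2\|_{M^{p_2,q_2}_{(\omega_2)}},\qquad f_j\in M^{p_j,q_j}_{(\omega_j)}(\mathbf R^d),\ j=1,2. \tag{93}$$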

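Theorem 3.8 Let $p_j, q_j\in(0,\infty)$ and $\omega_j\in\mathscr P_E(\mathbf R^{2d})$, $j = 0, 1, 2$, be such
that (88) and (91) hold. Then $(f_1,f_2)\mapsto f_1*f_2$ from $\Sigma_1(\mathbf R^d)\times\Sigma_1(\mathbf R^d)$ to
$\Sigma_1(\mathbf R^d)$ is uniquely extendable to a continuous map from $W^{p_1,q_1}_{(\omega_1)}(\mathbf R^d)\times W^{p_2,q_2}_{(\omega_2)}(\mathbf R^d)$
to $W^{p_0,q_0}_{(\omega_0)}(\mathbf R^d)$, and

$$\|f_1*f_2\|_{W^{p_0,q_0}_{(\omega_0)}}\lesssim\|f_1\|_{W^{p_1,q_1}_{(\omega_1)}}\|f_2\|_{W^{p_2,q_2}_{(\omega_2)}},\qquad f_j\in W^{p_j,q_j}_{(\omega_j)}(\mathbf R^d),\ j=1,2. \tag{94}$$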
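Since the four exponent conditions (85), (86), (91) and (92) differ only in which reciprocals enter the maximum, it can help to tabulate them for concrete exponents. The following sketch uses exact rational arithmetic; the function names and the triple layout `p = (p0, p1, p2)`, `q = (q0, q1, q2)` are assumptions for this illustration, and the convention $1/\infty = 0$ is included only for convenience.

```python
from fractions import Fraction
from math import inf

def inv(p):
    """1/p as an exact fraction, with the convention 1/inf = 0."""
    return Fraction(0) if p == inf else 1 / Fraction(p)

def mult_M(p, q):
    """Condition (85): multiplication on the spaces M^{p,q}."""
    return (inv(p[0]) <= inv(p[1]) + inv(p[2]) and
            inv(q[0]) <= inv(q[1]) + inv(q[2]) - max(1, inv(p[0]), inv(q[1]), inv(q[2])))

def mult_W(p, q):
    """Condition (86): multiplication on the spaces W^{p,q}."""
    return (inv(p[0]) <= inv(p[1]) + inv(p[2]) and
            inv(q[0]) <= inv(q[1]) + inv(q[2]) - max(1, inv(q[1]), inv(q[2])))

def conv_W(p, q):
    """Condition (91): convolution on the spaces W^{p,q}."""
    return (inv(p[0]) <= inv(p[1]) + inv(p[2]) - max(1, inv(q[0]), inv(p[1]), inv(p[2])) and
            inv(q[0]) <= inv(q[1]) + inv(q[2]))

def conv_M(p, q):
    """Condition (92): convolution on the spaces M^{p,q}."""
    return (inv(p[0]) <= inv(p[1]) + inv(p[2]) - max(1, inv(p[1]), inv(p[2])) and
            inv(q[0]) <= inv(q[1]) + inv(q[2]))

p, q = (Fraction(1, 2), 1, 1), (1, 1, 1)
print(mult_M(p, q), mult_W(p, q))   # (85) fails here while (86) holds
```

The sample exponents show that (85) is strictly more restrictive than (86) once $p_0 < 1$, because of the extra term $1/p_0$ in the maximum.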

We observe that Theorems 3.2–3.5 in [54] are multi-linear versions of the
previous results. In particular, Theorems 3.5 and 3.6 are Fourier transformations
of Theorems 3.7 and 3.8. Hence it suffices to prove the last two theorems, cf. [54].
To give an idea of the arguments, we give a proof of Theorem 3.7 in the unweighted
case. We will use Proposition A.1 from Appendix A, which is a special
case of [54, Proposition 3.6].
Proof of Theorem 3.7 Suppose $f_j\in\mathscr S(\mathbf R^d)$, $\phi_j\in\mathscr S(\mathbf R^d)$, $j = 0, 1, 2$, are such
that

$$f_0=f_1*f_2\quad\text{and}\quad\phi_0=(2\pi)^{d/2}\phi_1*\phi_2\ne0,$$

and let $F_j$ be the same as in (81). Then

$$F_0(x,\xi)=(V_{\phi_1}f_1(\,\cdot\,,\xi)*V_{\phi_2}f_2(\,\cdot\,,\xi))(x),$$

in view of (84).
We have

$$0\le\chi_{k_1+Q}*\chi_{k_2+Q}\le\chi_{k_1+k_2+Q_{d,2}},\qquad k_1,k_2\in\mathbf Z^d,$$

where $Q_{d,r}$ is the cube

$$Q_{d,r}=[0,r]^d\quad\text{and}\quad Q=Q_{d,1}=[0,1]^d,$$

and $\chi_E$ is the characteristic function of the set $E$.

Set

$$G(x,\xi)=(|V_{\phi_1}f_1(\,\cdot\,,\xi)|*|V_{\phi_2}f_2(\,\cdot\,,\xi)|)(x),$$

$$a_j(k,\kappa)=\|V_{\phi_j}f_j\|_{L^\infty((k,\kappa)+Q_{2d,1})},\qquad j=1,2,$$

and

$$b(k,\kappa)=\|G\|_{L^\infty((k,\kappa)+Q_{2d,1})}.$$

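Then

$$\|V^*_{\phi_0}F_0\|_{M^{p_0,q_0}}\asymp\|P_{\phi_0}F_0\|_{W(\ell^{p_0,q_0})}\lesssim\|F_0\|_{W(\ell^{p_0,q_0})}\le\|G\|_{W(\ell^{p_0,q_0})}\asymp\|b\|_{\ell^{p_0,q_0}}, \tag{95}$$

and

$$\|f_j\|_{M^{p_j,q_j}}\asymp\|a_j\|_{\ell^{p_j,q_j}} \tag{96}$$

in view of (A.5) and Proposition A.1 in Appendix A (see also [25, Theorem 3.3]).
By (84) we have

$$\begin{aligned}G(x,\lambda)&\le\sum_{k_1,k_2\in\mathbf Z^d}a_1(k_1,\lambda)a_2(k_2,\lambda)(\chi_{k_1+Q}*\chi_{k_2+Q})(x)\\ &\le\sum_{k_1,k_2\in\mathbf Z^d}a_1(k_1,\lambda)a_2(k_2,\lambda)\chi_{k_1+k_2+Q_{d,2}}(x).\end{aligned} \tag{97}$$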

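We observe that

$$\chi_{k_1+k_2+Q_{d,2}}(x)=0\quad\text{when}\quad x\in l+Q_d\ \text{and}\ (k_1,k_2)\notin\Omega_l,$$

where

$$\Omega_l=\{\,(k_1,k_2)\in\mathbf Z^{2d}\,;\,l_j-2\le k_{1,j}+k_{2,j}\le l_j+1,\ j=1,\dots,d\,\},$$

and

$$k_j=(k_{j,1},\dots,k_{j,d})\in\mathbf Z^d,\ j=1,2,\quad\text{and}\quad l=(l_1,\dots,l_d)\in\mathbf Z^d.$$

Hence, by taking the supremum over $x\in l+Q_d$ in (97), we get

$$\begin{aligned}b(l,\lambda)&\le\sum_{(k_1,k_2)\in\Omega_l}a_1(k_1,\lambda)a_2(k_2,\lambda)\\ &\le\sum_{m\in I}(a_1(\,\cdot\,,\lambda)*a_2(\,\cdot\,,\lambda))(l-2e_0+m),\end{aligned} \tag{98}$$

where $e_0=(1,\dots,1)\in\mathbf Z^d$ and $I=\{0,1,2,3\}^d$.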
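The passage from the sum over index pairs to the finite sum of shifted discrete convolutions in (98) can be checked directly in dimension $d = 1$, where the pairs are exactly those with $l-2\le k_1+k_2\le l+1$. In the sketch below the sequences `a1`, `a2`, the support radius `K` and the helper names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
K = 6                                  # sequences are supported on -K..K
a1 = rng.random(2 * K + 1)             # a1[k + K] stores a1(k)
a2 = rng.random(2 * K + 1)
conv = np.convolve(a1, a2)             # conv[n + 2K] stores (a1 * a2)(n), n = -2K..2K

def conv_at(n):
    """(a1 * a2)(n), extended by zero outside the support."""
    i = n + 2 * K
    return float(conv[i]) if 0 <= i < len(conv) else 0.0

def pair_sum(l):
    """Sum of a1(k1) a2(k2) over pairs with l - 2 <= k1 + k2 <= l + 1."""
    total = 0.0
    for k1 in range(-K, K + 1):
        for k2 in range(-K, K + 1):
            if l - 2 <= k1 + k2 <= l + 1:
                total += a1[k1 + K] * a2[k2 + K]
    return total

def shifted_conv_sum(l):
    """Sum over m in {0, 1, 2, 3} of (a1 * a2)(l - 2 + m)."""
    return sum(conv_at(l - 2 + m) for m in range(4))

print(all(abs(pair_sum(l) - shifted_conv_sum(l)) < 1e-12 for l in range(-15, 16)))
```

In this one-dimensional setting the two sums coincide, so the only genuine loss in the argument comes from bounding the convolution of two characteristic functions by a single characteristic function of a larger cube.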


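If we apply the $\ell^{p_0}$ quasi-norm on (98) with respect to the $l$ variable, then
Proposition 2.5 (2) and the fact that $I$ is a finite set give

$$\begin{aligned}\|b(\,\cdot\,,\lambda)\|_{\ell^{p_0}}&\le\Big\|\sum_{m\in I}(a_1(\,\cdot\,,\lambda)*a_2(\,\cdot\,,\lambda))(\,\cdot\,-2e_0+m)\Big\|_{\ell^{p_0}}\\ &\lesssim\sum_{m\in I}\|(a_1(\,\cdot\,,\lambda)*a_2(\,\cdot\,,\lambda))(\,\cdot\,-2e_0+m)\|_{\ell^{p_0}}\\ &\asymp\|a_1(\,\cdot\,,\lambda)*a_2(\,\cdot\,,\lambda)\|_{\ell^{p_0}}\\ &\le\|a_1(\,\cdot\,,\lambda)\|_{\ell^{p_1}}\|a_2(\,\cdot\,,\lambda)\|_{\ell^{p_2}}.\end{aligned}$$

By applying the $\ell^{q_0}$ quasi-norm and using Proposition 2.5 (1) we now get

$$\|b\|_{\ell^{p_0,q_0}}\lesssim\|a_1\|_{\ell^{p_1,q_1}}\|a_2\|_{\ell^{p_2,q_2}}.$$

This is the same as

$$\|G\|_{L^{p_0,q_0}}\lesssim\|F_1\|_{L^{p_1,q_1}}\|F_2\|_{L^{p_2,q_2}}.$$

A combination of this estimate with (95) and (96) gives that $f_1*f_2$ is well-defined
and that (93) holds.
The uniqueness now follows from the fact that (93) holds for $f_1, f_2\in\mathscr S(\mathbf R^d)$, and that
$\mathscr S(\mathbf R^d)$ is dense in $M^{p,q}(\mathbf R^d)$ when $p, q<\infty$.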

4 Gabor Products and Modulation Spaces

In this section we give an illustration of how the multiplication properties for
modulation spaces can be used when treating certain nonlinear problems. We consider the
Gabor product which is connected to such multiplication properties. It is introduced
in [14] in order to derive a phase space analogue to the usual convolution identity
for the Fourier transform. We will prove a formula related to (80), and then use
results from the previous section to extend the Gabor product initially defined on
$M^1(\mathbf R^{2d})\times M^1(\mathbf R^{2d})$ to some other spaces. Finally, we show how the Gabor product
gives rise to a phase-space formulation of the cubic Schrödinger equation.
Definition 4.1 Let $\phi\in M^1(\mathbf R^d)\setminus\{0\}$, and let $F_1, F_2\in M^1(\mathbf R^{2d})$. Then the Gabor
product $\natural_\phi$ is given by

$$(F_1\natural_\phi F_2)(x,\xi)=(2\pi)^{-d}\iiint_{\mathbf R^{3d}}e^{-i\langle x,\xi\rangle}\,\overline{\widehat\phi}(\zeta-\xi)\,e^{i\langle x,\zeta\rangle}F_1(y,\eta)F_2(y,\zeta-\eta)\,dy\,d\eta\,d\zeta. \tag{99}$$

In the proof of [14, Lemma 13] it is justified that the Gabor product in (99) is
well-defined, and that

$$\natural_\phi : M^1(\mathbf R^{2d})\times M^1(\mathbf R^{2d})\to M^1(\mathbf R^{2d})$$

is a continuous map.
The Gabor product is particularly well-suited in the context of the STFT.
Theorem 4.2 Let $\phi,\phi_1,\phi_2\in M^1(\mathbf R^d)\setminus\{0\}$. Then

$$(\phi_2,\phi_1)_{L^2(\mathbf R^d)}V_\phi(f_1\cdot f_2)=(V_{\phi_1}f_1)\natural_\phi(V_{\overline{\phi_2}}f_2),\qquad f_1,f_2\in M^1(\mathbf R^d). \tag{100}$$

Moreover, $V_\phi(f_1\cdot f_2)\in M^1(\mathbf R^{2d})$.

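Proof We have

$$((V_{\phi_1}f_1)\natural_\phi(V_{\overline{\phi_2}}f_2))(x,\xi) \tag{101}$$

$$=(2\pi)^{-d}\iint_{\mathbf R^{2d}}e^{-i\langle x,\xi\rangle}\,\overline{\widehat\phi}(\zeta-\xi)\,e^{i\langle x,\zeta\rangle}G(y,\zeta)\,dy\,d\zeta, \tag{102}$$

where

$$G(y,\zeta)=\int_{\mathbf R^d}(V_{\phi_1}f_1)(y,\eta)(V_{\overline{\phi_2}}f_2)(y,\zeta-\eta)\,d\eta.$$

By Parseval's formula we get

$$\begin{aligned}G(y,\zeta)&=\int_{\mathbf R^d}(V_{\phi_1}f_1)(y,\eta)(V_{\overline{\phi_2}}f_2)(y,\zeta-\eta)\,d\eta\\ &=\int_{\mathbf R^d}\mathscr F(f_1\overline{\phi_1(\,\cdot\,-y)})(\eta)\,\mathscr F(f_2\phi_2(\,\cdot\,-y))(\zeta-\eta)\,d\eta\\ &=(\mathscr F(f_1\overline{\phi_1(\,\cdot\,-y)}),\mathscr F(\overline{f_2\phi_2(\,\cdot\,-y)}e^{i\langle\,\cdot\,,\zeta\rangle}))_{L^2(\mathbf R^d)}\\ &=(f_1\overline{\phi_1(\,\cdot\,-y)},\overline{f_2\phi_2(\,\cdot\,-y)}e^{i\langle\,\cdot\,,\zeta\rangle})_{L^2(\mathbf R^d)}\\ &=\int_{\mathbf R^d}f_1(z)\overline{\phi_1(z-y)}f_2(z)\phi_2(z-y)e^{-i\langle z,\zeta\rangle}\,dz.\end{aligned}$$

By inserting this into (102) and using Fubini's theorem we get

$$((V_{\phi_1}f_1)\natural_\phi(V_{\overline{\phi_2}}f_2))(x,\xi)=(2\pi)^{-d}\iint_{\mathbf R^{2d}}e^{-i\langle x,\xi\rangle}\,\overline{\widehat\phi}(\zeta-\xi)\,e^{-i\langle z-x,\zeta\rangle}f_1(z)f_2(z)H(z)\,dz\,d\zeta,$$

where

$$H(z)=\int_{\mathbf R^d}\phi_2(z-y)\overline{\phi_1(z-y)}\,dy=(\phi_2,\phi_1)_{L^2}.$$

Hence, by evaluating the integral with respect to $\zeta$, and using Fourier's inversion
formula, we get

$$\begin{aligned}((V_{\phi_1}f_1)\natural_\phi(V_{\overline{\phi_2}}f_2))(x,\xi)&=(2\pi)^{-d/2}e^{-i\langle x,\xi\rangle}(\phi_2,\phi_1)_{L^2}\int_{\mathbf R^d}\overline{\phi(z-x)}e^{i\langle x-z,\xi\rangle}f_1(z)f_2(z)\,dz\\ &=(\phi_2,\phi_1)_{L^2}V_\phi(f_1f_2)(x,\xi),\end{aligned}$$

which gives (100), and the result follows.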

The formula (100) is closely related to (80). In fact, the windows $\phi_j\in\Sigma_1(\mathbf R^d)$,
$j = 0, 1, 2$, in (80) should satisfy the condition (79), while (100) is valid for
arbitrary non-zero elements from $M^1(\mathbf R^d)$. For example, when $\phi=\phi_1=\phi_2$ and
$\|\phi\|_{L^2(\mathbf R^d)}=1$, then (100) reduces to

$$V_\phi(f_1\cdot f_2)=(V_\phi f_1)\natural_\phi(V_{\overline\phi}f_2),\qquad f_1,f_2\in M^1(\mathbf R^d), \tag{103}$$

while (80) does not allow such a choice of windows.

One of the main goals of [14] is the extension of the Gabor product to some
function spaces $\mathcal F_j(\mathbf R^{2d})$, $j = 0, 1, 2$, so that $\natural_\phi$ maps $\mathcal F_1\times\mathcal F_2$ into $\mathcal F_0$, with

$$\|F_1\natural_\phi F_2\|_{\mathcal F_0}\le C\|F_1\|_{\mathcal F_1}\|F_2\|_{\mathcal F_2}. \tag{104}$$

This can be considered as a phase space form of the Young convolution inequality.
Next we discuss continuity of the Gabor product on certain spaces involving
superpositions of short-time Fourier transforms. In the end we deduce properties
similar to [14, Theorem 29]. Instead of modulation spaces of the form $M^{p,q}_{(\omega)}(\mathbf R^d)$,
$p, q\in[1,\infty)$, $\omega\in\mathscr P_E(\mathbf R^{2d})$, here we consider modulation spaces of Wiener
amalgam types $W^{p,q}_{(\omega)}(\mathbf R^d)$, and allow the "quasi-Banach" choice for Lebesgue
parameters, i.e. $p$ and $q$ are allowed to be smaller than one.

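Thus, in what follows we assume that $p, q\in(0,\infty)$, $\omega\in\mathscr P_E(\mathbf R^{2d})$ is $v$-moderate,
and consider $L^{p,q}_{*,(\omega)}(\mathbf R^{2d})$ spaces rather than $L^{p,q}_{(\omega)}(\mathbf R^{2d})$ which are treated
in [14].
We need some additional notation. Let $s > 1$, $N\in\mathbf N$ be given, and let

$$\mathcal G=\{\,\phi_n=\overline{\phi_n}\,;\,n\in\mathbf N\,\}\subseteq\Sigma_s(\mathbf R^d),$$

be an orthonormal basis of $L^2(\mathbf R^d)$. Then let $V^{(N),p,q}_{\mathcal G,\omega}(\mathbf R^{2d})$ be the closure of

$$V^{(N)}_{\mathcal G}(\mathbf R^{2d})=\Big\{\sum_{n=1}^NV_{\phi_n}f_n\,;\,f_n\in\Sigma_1(\mathbf R^d)\Big\} \tag{105}$$

with respect to the $L^{p,q}_{*,(\omega)}(\mathbf R^{2d})$ norm. In particular, if $N = 1$, $\phi=\phi_1$ and $p, q\ge1$,
then this reduces to the closure

$$P_\phi(L^{p,q}_{*,(\omega)}(\mathbf R^{2d}))=V_\phi(W^{p,q}_{(\omega)}(\mathbf R^d))$$

of

$$P_\phi(\Sigma_1(\mathbf R^{2d}))=V_\phi(\Sigma_1(\mathbf R^d))$$

with respect to the $L^{p,q}_{*,(\omega)}(\mathbf R^{2d})$ norm.
By [14, Theorem 26], it follows that for every $F\in V^{(N),p,q}_{\mathcal G,\omega}(\mathbf R^{2d})$ there exist
$f_n\in W^{p,q}_{(\omega)}(\mathbf R^d)$, $n = 1, 2,\dots,N$, such that

$$F=\sum_{n=1}^NV_{\phi_n}f_n. \tag{106}$$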

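Theorem 4.3 Let $p_j, q_j\in(0,\infty)$ and $\omega_j\in\mathscr P_E(\mathbf R^{2d})$ be $v_j$-moderate, $j = 0, 1, 2$,
and such that (86) and (87) hold, and let $\phi\in\Sigma_s(\mathbf R^d)$, $s > 1$. Then the
Gabor product $\natural_\phi$ from $V^{(N)}_{\mathcal G}(\mathbf R^{2d})\times V^{(N)}_{\mathcal G}(\mathbf R^{2d})$ to $W^{1,1}_{(v)}(\mathbf R^{2d})$ extends uniquely
to a continuous map from $V^{(N),p_1,q_1}_{\mathcal G,\omega_1}(\mathbf R^{2d})\times V^{(N),p_2,q_2}_{\mathcal G,\omega_2}(\mathbf R^{2d})$ to the closure of
$P_\phi(L^{p_0,q_0}_{*,(\omega_0)}(\mathbf R^{2d}))$, and

$$\|F_1\natural_\phi F_2\|_{L^{p_0,q_0}_{*,(\omega_0)}}\lesssim\|F_1\|_{L^{p_1,q_1}_{*,(\omega_1)}}\|F_2\|_{L^{p_2,q_2}_{*,(\omega_2)}}, \tag{107}$$

for all $F_j\in V^{(N),p_j,q_j}_{\mathcal G,\omega_j}(\mathbf R^{2d})$, $j = 1, 2$.
In particular, if $F_1=V_\phi f_1$, $F_2=V_{\overline\phi}f_2$ and $\|\phi\|_{L^2}=1$, then (107) reduces to

$$\|V_\phi f_1\natural_\phi V_{\overline\phi}f_2\|_{L^{p_0,q_0}_{*,(\omega_0)}}=\|f_1f_2\|_{W^{p_0,q_0}_{(\omega_0)}}\lesssim\|f_1\|_{W^{p_1,q_1}_{(\omega_1)}}\|f_2\|_{W^{p_2,q_2}_{(\omega_2)}}. \tag{108}$$

We omit the proof, which is a slight modification of the proof of Theorem 29 in
[14].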
We end the paper by formally demonstrating how the Gabor product arises
in a phase space version of the cubic Schrödinger equation. Consider the elliptic
nonlinear Schrödinger equation (NLSE) given by

$$i\frac{\partial\psi}{\partial t}+\Delta\psi+\lambda|\psi|^2\psi=0, \tag{109}$$

subject to the initial condition

$$\psi(x,0)=\varphi(x).$$

Here $\lambda=\pm1$ stands for an attracting ($\lambda=+1$) or repulsive ($\lambda=-1$) power-law
nonlinearity, and the Laplacian is given by

$$\Delta=\sum_{j=1}^d\frac{\partial^2}{\partial x_j^2}.$$

Thus we consider $\psi=\psi(x,t)$ with $x\in\mathbf R^d$, and $t$ in an open interval $I\subseteq\mathbf R$.

Using the following intertwining relations

$$V_\phi(x_j\psi)=-D_{\xi_j}V_\phi\psi,\qquad V_\phi(D_{x_j}\psi)=(\xi_j+D_{x_j})V_\phi\psi,$$

$j = 1,\dots,d$, and assuming that $\phi$ is a real-valued window, we obtain upon
application of the STFT $V_\phi$ to (109) that

$$i\frac{\partial F}{\partial t}-\sum_{j=1}^d(\xi_j+D_{x_j})^2F+\lambda F\natural_\phi\widetilde F\natural_\phi F=0. \tag{110}$$

Here, $D_{x_j}=-i\frac{\partial}{\partial x_j}$,

$$F(x,\xi,t)=V_\phi(\psi(\,\cdot\,,t))(x,\xi)=(2\pi)^{-d/2}\int_{\mathbf R^d}\psi(y,t)\phi(y-x)e^{-i\langle y,\xi\rangle}\,dy,\qquad x,\xi\in\mathbf R^d,\ t\in\mathbf R,$$

and $\widetilde F$ is given by

$$\widetilde F(x,\xi)=\overline{F(x,-\xi)}. \tag{111}$$
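The second intertwining relation used above can be verified by a single integration by parts. The following is a sketch under the assumptions that the STFT is normalized as $V_\phi\psi(x,\xi)=(2\pi)^{-d/2}\int\psi(y)\overline{\phi(y-x)}e^{-i\langle y,\xi\rangle}\,dy$ and that $\psi$ and $\phi$ decay fast enough (e.g. are Schwartz functions) for the boundary terms to vanish:

```latex
\begin{align*}
V_\phi(D_{x_j}\psi)(x,\xi)
 &= (2\pi)^{-d/2}\int_{\mathbf R^d} (-i\,\partial_{y_j}\psi)(y)\,
    \overline{\phi(y-x)}\,e^{-i\langle y,\xi\rangle}\,dy \\
 &= (2\pi)^{-d/2}\int_{\mathbf R^d} \psi(y)\,
    i\,\partial_{y_j}\!\big(\overline{\phi(y-x)}\,e^{-i\langle y,\xi\rangle}\big)\,dy \\
 &= (2\pi)^{-d/2}\int_{\mathbf R^d} \psi(y)\,
    \big(i\,\overline{(\partial_j\phi)(y-x)} + \xi_j\,\overline{\phi(y-x)}\big)
    e^{-i\langle y,\xi\rangle}\,dy \\
 &= D_{x_j}V_\phi\psi(x,\xi) + \xi_j\,V_\phi\psi(x,\xi).
\end{align*}
```

The last step uses that $D_{x_j}=-i\partial_{x_j}$ acting on $\overline{\phi(y-x)}$ produces $i\,\overline{(\partial_j\phi)(y-x)}$. Applying the relation twice and summing over $j$ accounts for the term $-\sum_j(\xi_j+D_{x_j})^2F$, since $\Delta=-\sum_jD_{x_j}^2$.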

By considering (110), the phase-space formulation of the initial value problem
may be well-posed for more general initial distributions. This means that the
phase-space formulation "contains" the solutions of the standard NLSE, but it is richer,
as it admits other solutions. We refer to [11–13], where phase-space extensions are
explored in several different contexts.
Let us conclude by noticing that (110) contains the triple product. Thus, its
qualitative analysis calls for a multilinear extension of Theorems 3.6 and 4.3. Then
the conditions (86) and (87) become more involved, see [54]. Such analysis demands
more technical tools and arguments and goes beyond the scope of this survey
article.

Appendix A: Some Properties of Wiener Amalgam Spaces

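There are convenient characterizations of modulation spaces in the framework of
Gabor analysis.
Let $\omega_0\in\mathscr P_E(\mathbf R^d)$, $\omega\in\mathscr P_E(\mathbf R^{2d})$, $p, q, r\in(0,\infty]$, $Q_d=[0,1]^d$ be the unit
cube, and set for measurable $f$ on $\mathbf R^d$,

$$\|f\|_{W^r(\omega_0,\ell^p)}\equiv\|a_0\|_{\ell^p(\mathbf Z^d)} \tag{A.1}$$

when

$$a_0(j)\equiv\|f\cdot\omega_0\|_{L^r(j+Q_d)},\qquad j\in\mathbf Z^d,$$

and for measurable $F$ on $\mathbf R^{2d}$,

$$\|F\|_{W^r(\omega,\ell^{p,q})}\equiv\|a\|_{\ell^{p,q}(\mathbf Z^{2d})}\quad\text{and}\quad\|F\|_{W^r(\omega,\ell^{p,q}_*)}\equiv\|a\|_{\ell^{p,q}_*(\mathbf Z^{2d})} \tag{A.2}$$

when

$$a(j,\iota)\equiv\|F\cdot\omega\|_{L^r((j,\iota)+Q_{2d})},\qquad j,\iota\in\mathbf Z^d.$$

The Wiener amalgam space

$$W^r(\omega_0,\ell^p)=W^r(\omega_0,\ell^p(\mathbf Z^d))$$

consists of all measurable $f\in L^r_{loc}(\mathbf R^d)$ such that $\|f\|_{W^r(\omega_0,\ell^p)}$ is finite, and the
Wiener amalgam spaces

$$W^r(\omega,\ell^{p,q})=W^r(\omega,\ell^{p,q}(\mathbf Z^{2d}))\quad\text{and}\quad W^r(\omega,\ell^{p,q}_*)=W^r(\omega,\ell^{p,q}_*(\mathbf Z^{2d}))$$

consist of all measurable $F\in L^r_{loc}(\mathbf R^{2d})$ such that $\|F\|_{W^r(\omega,\ell^{p,q})}$ respectively
$\|F\|_{W^r(\omega,\ell^{p,q}_*)}$ are finite. We observe that $W^r(\omega_0,\ell^p)$ is often denoted by
$W(L^r,\ell^p_{(\omega)})$ in the literature (see e.g. [17, 19, 25, 41]).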

The topologies are defined through their corresponding quasi-norms in (A.1)

and (A.2). For conveniency we set

W(ω, p,q ) = W∞ (ω, p,q ) and W(ω, ∗ ) = W∞ (ω, ∗ ),

p,q p,q

and if in addition ω = 1, we set

p,q p,q
W(p,q ) = W(ω, p,q ) and W(∗ ) = W(ω, ∗ ).

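Obviously, $W^r(\omega_0,\ell^p)$ and $W^r(\omega,\ell^{p,q})$ increase with $p, q$, decrease with $r$, and

$$W(\omega,\ell^{p,q}) \hookrightarrow L^{p,q}_{(\omega)}(\mathbf{R}^{2d}) \cap \Sigma_1'(\mathbf{R}^{2d}) \hookrightarrow L^{p,q}_{(\omega)}(\mathbf{R}^{2d}) \hookrightarrow W^r(\omega,\ell^{p,q}) \tag{A.3}$$

and

$$\|\cdot\|_{W^r(\omega,\ell^{p,q})} \le \|\cdot\|_{L^{p,q}_{(\omega)}} \le \|\cdot\|_{W(\omega,\ell^{p,q})}, \qquad r \le \min(1,p,q). \tag{A.4}$$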

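On the other hand, for modulation spaces we have

$$f \in M^{p,q}_{(\omega)}(\mathbf{R}^d) \;\Leftrightarrow\; V_\phi f \in L^{p,q}_{(\omega)}(\mathbf{R}^{2d}) \;\Leftrightarrow\; V_\phi f \in W^r(\omega,\ell^{p,q}) \tag{A.5}$$

with

$$\|f\|_{M^{p,q}_{(\omega)}} = \|V_\phi f\|_{L^{p,q}_{(\omega)}} \asymp \|V_\phi f\|_{W^r(\omega,\ell^{p,q})}. \tag{A.6}$$

The same holds true with $W^{p,q}_{(\omega)}$, $L^{p,q}_{*,(\omega)}$ and $W(\omega,\ell_*^{p,q})$ in place of $M^{p,q}_{(\omega)}$, $L^{p,q}_{(\omega)}$
and $W(\omega,\ell^{p,q})$, respectively, at each occurrence. (For $r = \infty$, see [28] when
$p, q \in [1,\infty]$, [25, 50] when $p, q \in (0,\infty]$, and for $r \in (0,\infty]$, see [53].)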
We have now the following result on the projection operator Pφ in (20) when
acting on Wiener amalgam spaces.
Proposition A.1 Let p, q ∈ (0, ∞] and φ ∈ S (Rd ) \ {0}. Then Pφ from S (R2d )
to S (R2d ), and Vφ∗ from S (R2d ) to S (Rd ) restrict to continuous mappings

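$$P_\phi : W(\ell^{p,q}(\mathbf{Z}^{2d})) \to V_\phi(M^{p,q}(\mathbf{R}^d)), \tag{A.7}$$

$$P_\phi : W(\ell_*^{p,q}(\mathbf{Z}^{2d})) \to V_\phi(W^{p,q}(\mathbf{R}^d)), \tag{A.8}$$

$$V_\phi^* : W(\ell^{p,q}(\mathbf{Z}^{2d})) \to M^{p,q}(\mathbf{R}^d) \tag{A.9}$$

and

$$V_\phi^* : W(\ell_*^{p,q}(\mathbf{Z}^{2d})) \to W^{p,q}(\mathbf{R}^d). \tag{A.10}$$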
We refer to [54, Proposition 3.6] for the proof of Proposition A.1 and to [19, 21,
28, 41, 42, 54] for some facts about the operators $P_\phi$ and $V_\phi^*$.
For $p, q \ge 1$, i.e. the case when all spaces are Banach spaces, proofs of
Proposition A.1 can be found in e.g. [28] as well as in abstract forms in [19]. In the
general case when $p, q > 0$, we refer to [25, 42], since proofs of Proposition A.1
are essentially given there.

Acknowledgments The work of N. Teofanov is partially supported by TIFREFUS Project DS 15,

and MPNTR of Serbia Grant No. 451–03–68/2022–14/200125. Joachim Toft was supported by
Vetenskapsrådet (Swedish Science Council) within the project 2019–04890.

References

1. F. Bastianoni, N. Teofanov, Subexponential decay and regularity estimates for eigenfunctions
of localization operators. J. Pseudo-Differ. Oper. Appl. 12, Paper no. 19, 28 (2021)
2. F. Bastianoni, E. Cordero, F. Nicola, Decay and smoothness for eigenfunctions of localization
operators. J. Math. Anal. Appl. 492, 124480 (2020)
3. Á. Bényi, K. Okoudjou, Local well-posedness of nonlinear dispersive equations on modula-
tion spaces. Bull. Lond. Math. Soc. 41, 549–558 (2009)
4. Á. Bényi, K. Okoudjou, Modulation Spaces. With Applications to Pseudodifferential Oper-
ators and Nonlinear Schrödinger Equations. Applied and Numerical Harmonic Analysis
(Birkhäuser/Springer, New York, 2020)
5. Á. Bényi, L. Grafakos, K.H. Gröchenig, K. Okoudjou, A class of Fourier multipliers for
modulation spaces. Appl. Comput. Harmon. Anal. 19, 131–139 (2005)
6. Á. Bényi, K. H. Gröchenig, K. Okoudjou, L. Rogers, Unimodular Fourier multipliers for
modulation spaces. J. Func. Anal. 246, 366–384 (2007)
7. P. Boggiatto, E. Cordero, K. Gröchenig, Generalized anti-Wick operators with symbols in
distributional Sobolev spaces. Integr. Eq. Oper. Theory 48, 427–442 (2004)
8. J. Chung, S.-Y. Chung, D. Kim, Characterizations of the Gelfand-Shilov spaces via Fourier
transforms. Proc. Am. Math. Soc. 124, 2101–2108 (1996)
9. E. Cordero, K.H. Gröchenig, Time-frequency analysis of localization operators. J. Funct.
Anal. 205, 107–131 (2003)
10. E. Cordero, L. Rodino, Time-Frequency Analysis of Operators. Studies in Mathematics, vol.
75 (De Gruyter, Berlin, Boston, 2020)
11. N.C. Dias, M. de Gosson, F. Luef, J.N. Prata, A pseudo-differential calculus on non-standard
symplectic space; spectral and regularity results in modulation spaces. J. Math. Pur. Appl. 96,
423–445 (2011)
12. N.C. Dias, M. de Gosson, F. Luef, J.N. Prata, Quantum mechanics in phase space: the
Schrödinger and the Moyal representations. J. Pseudo-Differ. Oper. Appl. 3, 367–398 (2012)
13. N.C. Dias, M. de Gosson, J.N. Prata, Dimensional extension of pseudo-differential operators:
properties and spectral results. J. Func. Anal. 266, 3772–3796 (2014)
14. N.C. Dias, J.N. Prata, N. Teofanov, Short-time Fourier transform of the pointwise product
of two functions with application to the nonlinear Schrödinger equation (2022). Preprint
(arXiv:2108.04985)
15. S.J.L. Eijndhoven, Functional analytic characterizations of the Gelfand-Shilov spaces $S_\alpha^\beta$.
Nederl. Akad. Wetensch. Indag. Math. 49, 133–144 (1987)
16. H.G. Feichtinger, Gewichtsfunktionen auf lokalkompakten Gruppen. Sitzber. d. österr. Akad.
Wiss. 188, 451–471 (1979)

17. H.G. Feichtinger, Modulation spaces on locally compact abelian groups. Technical report,
University of Vienna, Vienna, 1983; also in ed. by M. Krishna, R. Radha, S. Thangavelu.
Wavelets and Their Applications (Allied Publishers Private Limited, New Delhi, 2003),
pp. 99–140
18. H.G. Feichtinger, Modulation spaces: looking back and ahead. Sampl. Theory Signal Image
Process. 5, 109–140 (2006)
19. H.G. Feichtinger, K.H. Gröchenig, Banach spaces related to integrable group representations
and their atomic decompositions, I. J. Funct. Anal. 86, 307–340 (1989)
20. H.G. Feichtinger, K.H. Gröchenig, Banach spaces related to integrable group representations
and their atomic decompositions, II. Monatsh. Math. 108, 129–148 (1989)
21. H.G. Feichtinger, K.H. Gröchenig, Gabor frames and time-frequency analysis of distributions.
J. Funct. Anal. 146, 464–495 (1997)
22. H.G. Feichtinger, G. Narimani, Fourier multipliers of classical modulation spaces. Appl.
Comput. Harmon. Anal. 21, 349–359 (2006)
23. C. Fernandez, A. Galbis, J. Toft, Characterizations of GRS-weights, and consequences in
time-frequency analysis. J. Pseudo-Differ. Oper. Appl. 6, 383–390 (2015)
24. G.B. Folland, Harmonic Analysis in Phase Space (Princeton University Press, Princeton,
1989)
25. Y.V. Galperin, S. Samarah, Time-frequency analysis on modulation spaces $M_m^{p,q}$, $0 < p, q \le \infty$.
Appl. Comput. Harmon. Anal. 16, 1–18 (2004)
26. I.M. Gelfand, G.E. Shilov, Generalized Functions, II–III (Academic Press, New York, 1968).
Reprinted by AMS (2016)
27. K.H. Gröchenig, Describing functions: atomic decompositions versus frames. Monatsh.
Math. 112, 1–42 (1991)
28. K. Gröchenig, Foundations of Time-Frequency Analysis (Birkhäuser, Boston, 2001)
29. K. Gröchenig, Composition and spectral invariance of pseudodifferential operators on
modulation spaces. J. Anal. Math. 98, 65–82 (2006)
30. K. Gröchenig, Weight functions in time-frequency analysis, in ed. by L. Rodino, M.W. Wong.
Pseudodifferential Operators: Partial Differential Equations and Time-Frequency Analysis.
Fields Institute Communications, vol. 52 (American Mathematical Society, Providence,
2007), pp. 343–366
31. K. Gröchenig, G. Zimmermann, Spaces of test functions via the STFT. J. Funct. Spaces Appl.
2, 25–53 (2004)
32. W. Guo, J. Chen, D. Fan, G. Zhao, Characterizations of some properties on weighted
modulation and wiener amalgam spaces. Michigan Math. J. 68, 451–482 (2019)
33. L. Hörmander, The Analysis of Linear Partial Differential Operators, vol I–III. (Springer,
Berlin, 1983, 1985)
34. M.S. Jakobsen, On a (no longer) new segal algebra: a review of the feichtinger algebra. J.
Fourier Anal. Appl. 24, 1579–1660 (2018)
35. P.G. Kevrekidis, D.J. Frantzeskakis, R. Carretero-Gonzalez (eds.), Emergent Nonlinear
Phenomena in Bose-Einstein Condensation (Springer, Berlin, 2008)
36. E.H. Lieb, Integral bounds for radar ambiguity functions and Wigner distributions. J. Math.
Phys. 31, 594–599 (1990)
37. E.H. Lieb, J.P. Solovej, Quantum coherent operators: a generalization of coherent states. Lett.
Math. Phys. 22, 145–154 (1991)
38. T. Oh, Y. Wang, Global well-posedness of the one-dimensional cubic nonlinear Schrödinger
equation in almost critical spaces. J. Diff. Eq. 269, 612–640 (2020)
39. T. Oh, Y. Wang, On global well-posedness of the modified KdV equation in modulation
spaces. Discrete Continuous Dyn. Syst. 41, 2971–2992 (2021)
40. S. Pilipović, Tempered ultradistributions. Boll. U.M.I. 7, 235–251 (1988)
41. H. Rauhut, Wiener amalgam spaces with respect to quasi-Banach spaces. Colloq. Math. 109,
345–362 (2007)
42. H. Rauhut, Coorbit space theory for quasi-Banach spaces. Stud. Math. 180, 237–253 (2007)

43. M. Ruzhansky, M. Sugimoto, J. Toft, N. Tomita, Changes of variables in modulation and

Wiener amalgam spaces. Math. Nachr. 284, 2078–2092 (2011)
44. M.A. Shubin, Pseudodifferential Operators and Spectral Theory, 2nd edn. (Springer, Berlin,
2001)
45. N. Teofanov, Ultradistributions and time-frequency analysis, in ed. by P. Boggiatto et al.
Pseudo-Differential Operators and Related Topics. Operator Theory Advances and Appli-
cations, vol. 164 (Birkhäuser Verlag, Basel, 2006), pp. 173–191
46. N. Teofanov, Modulation spaces, Gelfand-Shilov spaces and pseudodifferential operators.
Sampl. Theory Signal Image Process. 5, 225–242 (2006)
47. N. Teofanov, Bilinear localization operators on modulation spaces. J. Funct. Spaces 2018,
Art. ID 7560870, 10 (2018)
48. J. Toft, Convolutions and embeddings for weighted modulation spaces in ed. by R. Ashino,
P. Boggiatto, M.W. Wong. Advances in Pseudo-Differential Operators. Operator Theory
Advances and Applications, vol. 155 (Birkhäuser Verlag, Basel, 2004), pp. 165–186
49. J. Toft, The Bargmann transform on modulation and Gelfand-Shilov spaces, with applications
to Toeplitz and pseudo-differential operators. J. Pseudo-Differ. Oper. Appl. 3, 145–227 (2012)
50. J. Toft, Gabor analysis for a broad class of quasi-Banach modulation spaces, in ed. by S.
Pilipović, J. Toft. Pseudo-Differential Operators, Generalized Functions. Operator Theory
Advances and Applications, vol. 245 (Birkhäuser, Basel, 2015), pp. 249–278
51. J. Toft, Images of function and distribution spaces under the Bargmann transform. J. Pseudo-
Differ. Oper. Appl. 8, 83–139 (2017)
52. J. Toft, Tensor products for Gelfand-Shilov and Pilipović distribution spaces. J. Anal. 28,
591–613 (2020)
53. J. Toft, The Zak transform on Gelfand-Shilov and modulation spaces with applications to
operator theory. Complex Anal. Oper. Theory 15, Paper no. 2, 42 (2021)
54. J. Toft, Step multipliers, Fourier step multipliers and multiplications on quasi-Banach
modulation spaces. J. Func. Anal. 282, Paper no. 109343, 46 (2022)
55. B. Wang, C. Huang, Frequency-uniform decomposition method for the generalized BO, KdV
and NLS equations. J. Diff. Equ. 239, 213–250 (2007)
The Hardy-Littlewood Inequalities
in Sequence Spaces

Daniel Núñez-Alarcón, Daniel M. Pellegrino, and Anselmo B. Raposo Jr.

Abstract The investigation of the relation between the sums of coefficients of
bilinear forms on $c_0 \times c_0$ and their supremum norms was initiated in 1930 by J.E.
Littlewood. In 1934, in a joint paper with G.H. Hardy, Littlewood extended the
previous results to $\ell_p$ spaces. The main goal of these notes is to present modern
proofs of m-linear versions of the results of Hardy and Littlewood and the state
of the art of the subject. We also illustrate an application to a combinatorial game
called the Gale–Berlekamp switching game.

Keywords m-linear forms · Hardy–Littlewood inequalities

1 Introduction

G.H. Hardy and J.E. Littlewood have their names associated with dozens of inequalities,
and when we mention the Hardy–Littlewood inequalities it is natural that
researchers from different fields think of different results. In this work the Hardy–
Littlewood inequalities are the main theorems of the paper [32] and their m-linear
generalizations. In some sense the starting point of this cycle of ideas rests on the
works of Orlicz, Littlewood, Bohnenblust and Hille in the beginning of the 1930s
(see [17, 33]). These results show how the sums of the coefficients are dominated
by the norms of m-linear forms in $c_0$ spaces and, in 1934, Hardy and Littlewood
extended these inequalities to bilinear forms in $\ell_p$ spaces. We recall that an operator

D. Núñez-Alarcón
Departamento de Matemáticas, Universidad Nacional de Colombia, Bogotá, Colombia
D. M. Pellegrino
Departamento de Matemática, Universidade Federal da Paraíba, João Pessoa, PB, Brazil
A. B. Raposo Jr.
Departamento de Matemática, Universidade Federal do Maranhão, São Luís, MA, Brazil

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
R. M. Aron et al. (eds.), Operator and Norm Inequalities and Related Topics,
Trends in Mathematics, https://doi.org/10.1007/978-3-031-02104-6_19
640 D. Núñez-Alarcón et al.

$T : E_1 \times \cdots \times E_m \to F$ between Banach spaces is called m-linear when it is linear
in each coordinate. Considering the usual operations, the set of all continuous
m-linear operators $T : E_1 \times \cdots \times E_m \to F$ is a Banach space when endowed with the
norm

$$\|T\| := \sup\{\|T(x_1,\dots,x_m)\| : \|x_1\|,\dots,\|x_m\| \le 1\}.$$

The space of all continuous m-linear operators from $E_1 \times \cdots \times E_m$ to $F$ is denoted
by $\mathcal{L}(E_1,\dots,E_m;F)$. It is obvious that when $E_1,\dots,E_m$ are finite-dimensional
every m-linear operator $T : E_1 \times \cdots \times E_m \to F$ is continuous. When $F = \mathbb{R}$ or $\mathbb{C}$,
multilinear operators are simply called multilinear forms.
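The aim of these notes is to provide a reasonably self-contained exposition of the
subject up to the present date. For all $p_1, p_2 \in (1,\infty]$ satisfying $1/p_1 + 1/p_2 < 1$,
let us define

$$\begin{cases} \dfrac{1}{\lambda} := 1 - \left(\dfrac{1}{p_1} + \dfrac{1}{p_2}\right); \\[2ex] \dfrac{1}{\mu} := \dfrac{3}{4} - \dfrac{1}{2}\left(\dfrac{1}{p_1} + \dfrac{1}{p_2}\right). \end{cases}$$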

We consider, as usual, 1/∞ = 0 and thus λ = 1 and μ = 4/3 when p1 = p2 = ∞.
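The two exponents can be computed exactly with rational arithmetic. The sketch below is only an illustration; the helper names `reciprocal` and `hl_exponents` are introduced here and do not come from the text, and the convention $1/\infty = 0$ is the one stated above.

```python
from fractions import Fraction
import math

def reciprocal(p):
    """Return 1/p as an exact fraction, with the convention 1/inf = 0."""
    return Fraction(0) if p == math.inf else 1 / Fraction(p)

def hl_exponents(p1, p2):
    """Exponents lambda and mu attached to (p1, p2) with 1/p1 + 1/p2 < 1."""
    s = reciprocal(p1) + reciprocal(p2)
    if not s < 1:
        raise ValueError("need 1/p1 + 1/p2 < 1")
    lam = 1 / (1 - s)                    # 1/lambda = 1 - (1/p1 + 1/p2)
    mu = 1 / (Fraction(3, 4) - s / 2)    # 1/mu = 3/4 - (1/p1 + 1/p2)/2
    return lam, mu
```

With $p_1 = p_2 = \infty$ this returns $(1, 4/3)$, and with $p_1 = p_2 = 4$, the boundary case $1/p_1 + 1/p_2 = 1/2$, both exponents equal 2.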

Let us also settle other notations that shall be used throughout these notes. By $\mathbb{N}$ we
represent the set of all positive integers, the symbol $p^*$ denotes the conjugate of $p$,
i.e., $1/p + 1/p^* = 1$, and $(e_k)_{k=1}^\infty$ denotes the sequence of canonical vectors in the
sequence spaces. The vector spaces will be considered over the scalar field $\mathbb{K} = \mathbb{R}$
or $\mathbb{C}$ and, finally, $\mathbb{K}^n$, endowed with the $\ell_p$ norm, is denoted by $\ell_p^n$.
In their seminal paper [32], Hardy and Littlewood prove five theorems, as
follows:
Theorem 1.1 ([32, Theorems 1 and 4]) Let $p_1, p_2 \in [2,\infty]$, with $1/p_1 + 1/p_2 \le 1/2$.
There is a constant $C_{p_1,p_2}$ such that

$$\left(\sum_{j_1=1}^n\left(\sum_{j_2=1}^n |T(e_{j_1},e_{j_2})|^2\right)^{\frac{\lambda}{2}}\right)^{\frac{1}{\lambda}} \le C_{p_1,p_2}\|T\|, \tag{1}$$

$$\left(\sum_{j_2=1}^n\left(\sum_{j_1=1}^n |T(e_{j_1},e_{j_2})|^2\right)^{\frac{\lambda}{2}}\right)^{\frac{1}{\lambda}} \le C_{p_1,p_2}\|T\|, \tag{2}$$

and

$$\left(\sum_{j_1=1}^n\sum_{j_2=1}^n |T(e_{j_1},e_{j_2})|^{\mu}\right)^{\frac{1}{\mu}} \le C_{p_1,p_2}\|T\|, \tag{3}$$

for all bilinear forms $T : \ell_{p_1}^n \times \ell_{p_2}^n \to \mathbb{K}$ and all positive integers $n$. Moreover, the
exponents $\lambda$ and $\mu$ are optimal.
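As a small numerical illustration of (3) in the case $p_1 = p_2 = \infty$ (so $\mu = 4/3$) over the real scalars, one can compute both sides for the $2 \times 2$ form with coefficient matrix $\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$. The sketch below is an assumption-laden illustration and not part of the original argument: the function names are hypothetical, and the norm is evaluated by brute force over sign vectors, which is legitimate only for real scalars, where the supremum of a bilinear form over the unit balls of $\ell_\infty^n$ is attained at extreme points.

```python
from itertools import product

def sup_norm_bilinear(matrix):
    """Norm of T(x, y) = sum_ij matrix[i][j] x_i y_j on real l_inf^n x l_inf^n.

    For a fixed sign vector x the best y yields the l_1 norm of the vector
    (sum_i x_i matrix[i][j])_j, so it suffices to enumerate sign vectors x.
    """
    n = len(matrix)
    best = 0.0
    for signs in product((1, -1), repeat=n):
        col_sums = [sum(signs[i] * matrix[i][j] for i in range(n)) for j in range(n)]
        best = max(best, sum(abs(c) for c in col_sums))
    return best

def coefficient_sum(matrix, exponent):
    """(sum_ij |matrix[i][j]|^exponent)^(1/exponent), the left-hand side of (3)."""
    total = sum(abs(a) ** exponent for row in matrix for a in row)
    return total ** (1 / exponent)

walsh = [[1, 1], [1, -1]]
ratio = coefficient_sum(walsh, 4 / 3) / sup_norm_bilinear(walsh)
```

Here the norm equals 2 and the coefficient sum equals $4^{3/4}$, so the ratio is $\sqrt{2}$; hence any constant admissible in (3) for real scalars and $p_1 = p_2 = \infty$ must be at least $\sqrt{2}$.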
Theorem 1.2 ([32, Theorems 2 and 4]) Let $p_1, p_2 \in [2,\infty]$, with $1/2 < 1/p_1 +
1/p_2 < 1$. There is a constant $C_{p_1,p_2}$ such that the inequalities (1) and (2) are still
true, and

$$\left(\sum_{j_1=1}^n\sum_{j_2=1}^n |T(e_{j_1},e_{j_2})|^{\lambda}\right)^{\frac{1}{\lambda}} \le C_{p_1,p_2}\|T\|, \tag{4}$$

for all bilinear forms $T : \ell_{p_1}^n \times \ell_{p_2}^n \to \mathbb{K}$ and all positive integers $n$. Moreover, $\lambda$ is
optimal.
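Theorem 1.3 ([32, Theorem 5]) Let $p_1, p_2 \in (1,\infty]$ be such that $1/p_1 + 1/p_2 < 1$.
Then

$$\left(\sum_{j_1=1}^n\left(\sum_{j_2=1}^n T(e_{j_1},e_{j_2})^{p_2^*}\right)^{\frac{\lambda}{p_2^*}}\right)^{\frac{1}{\lambda}} \le \|T\|$$

for all non-negative (i.e., $T(e_{j_1},e_{j_2}) \ge 0$ for all $(j_1,j_2) \in \mathbb{N}\times\mathbb{N}$) bilinear forms
$T : \ell_{p_1}^n \times \ell_{p_2}^n \to \mathbb{K}$ and all positive integers $n$.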
Theorem 1.4 ([32, Theorems 3 and 4]) Let $p_1, p_2 \in (1,\infty]$ be such that $1/p_1 +
1/p_2 < 1$. If $p_{k_1} \in (1,2]$ and $p_{k_2} \in (2,\infty]$, for $\{k_1,k_2\} = \{1,2\}$, then there is a
constant $C_{p_1,p_2}$ such that

$$\left(\sum_{j_1=1}^n\sum_{j_2=1}^n |T(e_{j_1},e_{j_2})|^{\lambda}\right)^{\frac{1}{\lambda}} \le C_{p_1,p_2}\|T\|, \tag{5}$$

for all bilinear forms $T : \ell_{p_1}^n \times \ell_{p_2}^n \to \mathbb{K}$ and all positive integers $n$. Moreover, $\lambda$ is
optimal.
The constant and, in some sense, the exponents in Theorem 1.4 were improved
by Osikiewicz and Tonge in [39]:
Theorem 1.5 ([39, Theorem 5]) Let $p_1, p_2 \in (1,\infty]$ be such that $1/p_1 + 1/p_2 < 1$.
If $p_{k_1} \in (1,2]$ and $p_{k_2} \in (2,\infty]$, for $\{k_1,k_2\} = \{1,2\}$, then

$$\left(\sum_{j_{k_1}=1}^n\left(\sum_{j_{k_2}=1}^n |T(e_{j_1},e_{j_2})|^{p_{k_2}^*}\right)^{\frac{\lambda}{p_{k_2}^*}}\right)^{\frac{1}{\lambda}} \le \|T\|,$$

for all positive integers $n$ and all bilinear forms $T : \ell_{p_1}^n \times \ell_{p_2}^n \to \mathbb{K}$.
Several natural questions related to the previous results arise:
• How do these inequalities behave for multilinear forms?
• What are the best constants Cp1 ,p2 ?
• What happens when 1/p1 + 1/p2 ≥ 1?
In the next sections we present modern proofs of the Hardy–Littlewood inequalities
and discuss the three issues mentioned above. More precisely, in Sect. 2 we
present some auxiliary results that will be used throughout the text. Sections 3–6
are devoted to proofs of m-linear versions of Theorems 1.1–1.5. In Sects. 7
and 8 we present the state of the art of the investigation related to the optimal
constants of the Hardy–Littlewood inequalities and the case $1/p_1 + \cdots + 1/p_m \ge 1$.
In the final section we present connections between the Hardy–Littlewood
inequalities and the Gale–Berlekamp switching game.

2 Preliminary Results

From now on, $E^*$ denotes the topological dual of a Banach space $E$ and $B_E$ denotes
the closed unit ball of $E$. Moreover, $\sum_{\widehat{i_k}=1}^n$ denotes the multiple summation

$$\sum_{i_1=1}^n \cdots \sum_{i_{k-1}=1}^n\ \sum_{i_{k+1}=1}^n \cdots \sum_{i_m=1}^n$$

and, for all $\mathbf{p}_m = (p_1,\dots,p_m) \in [1,\infty]^m$, and each $k \in \{1,\dots,m\}$, let us define

$$|1/\mathbf{p}_m|_{\ge k} := \frac{1}{p_k} + \cdots + \frac{1}{p_m} \quad\text{and}\quad |1/\mathbf{p}_m| := |1/\mathbf{p}_m|_{\ge 1} = \frac{1}{p_1} + \cdots + \frac{1}{p_m}.$$

When $|1/\mathbf{p}_m| < 1$ we define $1/\lambda_{\mathbf{p}_m} := 1 - |1/\mathbf{p}_m|$. We begin by proving two
multilinear results that will be used several times.
The first one is due to Praciano-Pereira [45, Theorem A] and appears, in a
slightly extended version, in [4, Proposition 4.1]; it is a multilinear version of the
mixed inequalities (1) and (2). The proof presented here follows the lines of [4,
Proposition 4.1].
Proposition 2.1 ([45, Theorem A]) Let $m, n$ be positive integers, with $m \ge 2$,
$\mathbf{p}_m = (p_1,\dots,p_m) \in [2,\infty]^m$ and let $A : \ell_{p_1}^n \times \cdots \times \ell_{p_m}^n \to \mathbb{K}$ be an m-linear
form. Let us assume that
(i) $|1/\mathbf{p}_m| < 1$;
(ii) $|1/\mathbf{p}_m| \le 1/2 + 1/p_k$, for any $k \ge 1$.
Then, for any $k \in \{1,\dots,m\}$,

$$\left(\sum_{i_k=1}^n\left(\sum_{\widehat{i_k}=1}^n |A(e_{i_1},\dots,e_{i_m})|^2\right)^{\frac{\lambda_{\mathbf{p}_m}}{2}}\right)^{\frac{1}{\lambda_{\mathbf{p}_m}}} \le (\sqrt{2})^{m-1}\|A\|.$$

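Proof For the sake of convenience, we shall denote $M = (\sqrt{2})^{m-1}$ and $a_i =
|A(e_{i_1},\dots,e_{i_m})|$. Let

$$l := \operatorname{card}\{j : p_j \ne \infty\}.$$

When $l < m$ note that the conditions (i) and (ii) reduce to precisely $|1/\mathbf{p}_m| \le 1/2$.
We begin by proving by induction on $l \in \{0,\dots,m-1\}$ that, for any
$\mathbf{p}_m = (p_1,\dots,p_m) \in [2,\infty]^m$ with $|1/\mathbf{p}_m| \le 1/2$, we have

$$\left(\sum_{i_k=1}^n\left(\sum_{\widehat{i_k}=1}^n a_i^2\right)^{\frac{1}{2}\times\lambda_{\mathbf{p}_m}}\right)^{\frac{1}{\lambda_{\mathbf{p}_m}}} \le M\|A\| \tag{6}$$

for all m-linear forms $A : \ell_{p_1}^n \times \cdots \times \ell_{p_m}^n \to \mathbb{K}$, and all $k \in \{1,\dots,m\}$.
For $l = 0$, let $n \in \mathbb{N}$ and let $A : \ell_\infty^n \times \cdots \times \ell_\infty^n \to \mathbb{K}$ be an m-linear form. Then,
$\lambda_{\mathbf{p}_m} = 1$ and, by the Khinchin inequality for multiple sums (see [24, p. 455]) we
have

$$\left(\sum_{i_k=1}^n\left(\sum_{\widehat{i_k}=1}^n a_i^2\right)^{\frac{1}{2}\times\lambda_{\mathbf{p}_m}}\right)^{\frac{1}{\lambda_{\mathbf{p}_m}}} = \sum_{i_k=1}^n\left(\sum_{\widehat{i_k}=1}^n a_i^2\right)^{\frac{1}{2}} \le M\|A\|$$

and, hence, the case $l = 0$ holds.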

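Let us assume that the result is valid for $\operatorname{card}\{j : p_j \ne \infty\} = l - 1$ and let us
prove that it is also valid when $\operatorname{card}\{j : p_j \ne \infty\} = l$. If $k$ is an index such that
$p_k \ne \infty$, we fix $x \in \ell_{p_k}^n$ and consider

$$A_k : \ell_{p_1}^n \times \cdots \times \ell_{p_{k-1}}^n \times \ell_\infty^n \times \ell_{p_{k+1}}^n \times \cdots \times \ell_{p_m}^n \to \mathbb{K}$$

defined by

$$A_k(z^{(1)},\dots,z^{(m)}) = A(z^{(1)},\dots,z^{(k-1)}, xz^{(k)}, z^{(k+1)},\dots,z^{(m)}),$$

where $xz^{(k)} = (x_j z_j^{(k)})_{j=1}^n$. By applying the induction hypothesis to $A_k$, we know
that

$$\begin{aligned}
\left(\sum_{i_k=1}^n |x_{i_k}|^{\lambda_k}\left(\sum_{\widehat{i_k}=1}^n a_i^2\right)^{\frac{1}{2}\times\lambda_k}\right)^{\frac{1}{\lambda_k}}
&= \left(\sum_{i_k=1}^n\left(\sum_{\widehat{i_k}=1}^n |A(e_{i_1},\dots,e_{i_{k-1}}, xe_{i_k}, e_{i_{k+1}},\dots,e_{i_m})|^2\right)^{\frac{\lambda_k}{2}}\right)^{\frac{1}{\lambda_k}} \\
&= \left(\sum_{i_k=1}^n\left(\sum_{\widehat{i_k}=1}^n |A_k(e_{i_1},\dots,e_{i_m})|^2\right)^{\frac{\lambda_k}{2}}\right)^{\frac{1}{\lambda_k}} \\
&\le M\|A_k\| \le M\|A\|\,\|x\|_{\ell_{p_k}^n},
\end{aligned} \tag{7}$$

where we have set $1/\lambda_k := 1/\lambda_{\mathbf{p}_m} + 1/p_k$. Since $(p_k/\lambda_k)^* = \lambda_{\mathbf{p}_m}/\lambda_k$, we get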

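$$\begin{aligned}
\left(\sum_{i_k=1}^n\left(\sum_{\widehat{i_k}=1}^n a_i^2\right)^{\frac{1}{2}\times\lambda_{\mathbf{p}_m}}\right)^{\frac{1}{\lambda_{\mathbf{p}_m}}}
&= \left(\sum_{i_k=1}^n\left(\sum_{\widehat{i_k}=1}^n a_i^2\right)^{\frac{1}{2}\times\lambda_k\times\left(\frac{p_k}{\lambda_k}\right)^*}\right)^{\frac{1}{\lambda_k}\times\frac{1}{(p_k/\lambda_k)^*}} \\
&= \left\|\left(\left(\sum_{\widehat{i_k}=1}^n a_i^2\right)^{\frac{1}{2}\times\lambda_k}\right)_{i_k=1}^n\right\|_{(p_k/\lambda_k)^*}^{\frac{1}{\lambda_k}} \\
&= \left(\sup_{y\in B_{\ell^n_{p_k/\lambda_k}}}\sum_{i_k=1}^n |y_{i_k}|\left(\sum_{\widehat{i_k}=1}^n a_i^2\right)^{\frac{1}{2}\times\lambda_k}\right)^{\frac{1}{\lambda_k}} \\
&= \left(\sup_{x\in B_{\ell^n_{p_k}}}\sum_{i_k=1}^n |x_{i_k}|^{\lambda_k}\left(\sum_{\widehat{i_k}=1}^n a_i^2\right)^{\frac{1}{2}\times\lambda_k}\right)^{\frac{1}{\lambda_k}} \\
&\le M\|A\|,
\end{aligned}$$

where the last inequality holds by (7). This shows that (6) holds when $k$ is an index
such that $p_k \ne \infty$. Let us show that it also holds when $k$ is an index such that
$p_k = \infty$. Without loss of generality, let us suppose that $p_1 \ne \infty$. The induction
hypothesis applied to $A_1$ and

$$\frac{1}{\lambda_1} = \frac{1}{\lambda_{\mathbf{p}_m}} + \frac{1}{p_1}$$

now give

$$\forall x \in B_{\ell_{p_1}^n}, \qquad \sum_{i_k=1}^n\left(\sum_{\widehat{i_k}=1}^n a_i^2 |x_{i_1}|^2\right)^{\frac{1}{2}\times\lambda_1} \le M^{\lambda_1}\|A\|^{\lambda_1}. \tag{8}$$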

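Observe that $\lambda_1 < \lambda_{\mathbf{p}_m} \le 2$. If $\lambda_{\mathbf{p}_m} = 2$, we have

$$\left(\sum_{i_k=1}^n\left(\sum_{\widehat{i_k}=1}^n a_i^2\right)^{\frac{1}{2}\times\lambda_{\mathbf{p}_m}}\right)^{\frac{1}{\lambda_{\mathbf{p}_m}}} = \left(\sum_{i_1=1}^n\left(\sum_{\widehat{i_1}=1}^n a_i^2\right)^{\frac{1}{2}\times\lambda_{\mathbf{p}_m}}\right)^{\frac{1}{\lambda_{\mathbf{p}_m}}} \le M\|A\|.$$

If $\lambda_{\mathbf{p}_m} < 2$, let us denote, for $i_k \in \{1,\dots,n\}$,

$$S_{i_k} = \left(\sum_{\widehat{i_k}=1}^n a_i^2\right)^{\frac{1}{2}}.$$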
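We shall show that $\sum_{i_k=1}^n S_{i_k}^{\lambda_{\mathbf{p}_m}} \le M^{\lambda_{\mathbf{p}_m}}\|A\|^{\lambda_{\mathbf{p}_m}}$. We write

$$\sum_{i_k=1}^n S_{i_k}^{\lambda_{\mathbf{p}_m}} = \sum_{i_k=1}^n S_{i_k}^{\lambda_{\mathbf{p}_m}-2}\sum_{\widehat{i_k}=1}^n a_i^2 = \sum_{i_k=1}^n\sum_{\widehat{i_k}=1}^n \frac{a_i^2}{S_{i_k}^{2-\lambda_{\mathbf{p}_m}}} = \sum_{i_1=1}^n\sum_{\widehat{i_1}=1}^n \frac{a_i^{2/s}}{S_{i_k}^{2-\lambda_{\mathbf{p}_m}}}\, a_i^{2/s^*},$$

where $(s, s^*)$ is a couple of conjugate exponents. We then apply the Hölder
inequality twice to get

$$\begin{aligned}
\sum_{i_k=1}^n S_{i_k}^{\lambda_{\mathbf{p}_m}}
&\le \sum_{i_1=1}^n\left(\sum_{\widehat{i_1}=1}^n \frac{a_i^2}{S_{i_k}^{(2-\lambda_{\mathbf{p}_m})s}}\right)^{\frac{1}{s}}\left(\sum_{\widehat{i_1}=1}^n a_i^2\right)^{\frac{1}{s^*}} \\
&\le \left(\sum_{i_1=1}^n\left(\sum_{\widehat{i_1}=1}^n \frac{a_i^2}{S_{i_k}^{(2-\lambda_{\mathbf{p}_m})s}}\right)^{\frac{t}{s}}\right)^{\frac{1}{t}}\left(\sum_{i_1=1}^n\left(\sum_{\widehat{i_1}=1}^n a_i^2\right)^{\frac{t^*}{s^*}}\right)^{\frac{1}{t^*}}.
\end{aligned}$$

Choosing

$$s = \frac{2-\lambda_1}{2-\lambda_{\mathbf{p}_m}} \quad\text{and}\quad \frac{t^*}{s^*} = \frac{\lambda_{\mathbf{p}_m}}{2},$$

so that $s/t = \lambda_1/\lambda_{\mathbf{p}_m}$, the inequality becomes

$$\begin{aligned}
\sum_{i_k=1}^n S_{i_k}^{\lambda_{\mathbf{p}_m}}
&\le \left(\sum_{i_1=1}^n\left(\sum_{\widehat{i_1}=1}^n \frac{a_i^2}{S_{i_k}^{2-\lambda_1}}\right)^{\frac{\lambda_{\mathbf{p}_m}}{\lambda_1}}\right)^{\frac{1}{t}}\left(\sum_{i_1=1}^n\left(\sum_{\widehat{i_1}=1}^n a_i^2\right)^{\frac{\lambda_{\mathbf{p}_m}}{2}}\right)^{\frac{1}{t^*}} \\
&\le (M\|A\|)^{\frac{\lambda_{\mathbf{p}_m}}{t^*}}\left(\sum_{i_1=1}^n\left(\sum_{\widehat{i_1}=1}^n \frac{a_i^2}{S_{i_k}^{2-\lambda_1}}\right)^{\frac{\lambda_{\mathbf{p}_m}}{\lambda_1}}\right)^{\frac{1}{t}}.
\end{aligned}$$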
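It remains to control the last sum appearing on the right-hand side of the previous
inequality. By (the converse of) the Hölder inequality, it is sufficient to show that

$$\sum_{i_1=1}^n\left(\sum_{\widehat{i_1}=1}^n \frac{a_i^2}{S_{i_k}^{2-\lambda_1}}\right)|y_{i_1}| \le (M\|A\|)^{\lambda_1}$$

for any $y \in B_{\ell^n_{(\lambda_{\mathbf{p}_m}/\lambda_1)^*}}$. Since $(\lambda_{\mathbf{p}_m}/\lambda_1)^* = p_1/\lambda_1$, this means that we shall prove
that

$$\sum_{i_1=1}^n\sum_{\widehat{i_1}=1}^n \frac{a_i^2}{S_{i_k}^{2-\lambda_1}}\,|x_{i_1}|^{\lambda_1} \le (M\|A\|)^{\lambda_1}$$

for any $x \in B_{\ell_{p_1}^n}$. Now, invoking the Hölder inequality with $(2-\lambda_1)/2 + \lambda_1/2 = 1$
we obtain

$$\begin{aligned}
\sum_{i_1=1}^n\sum_{\widehat{i_1}=1}^n \frac{a_i^2}{S_{i_k}^{2-\lambda_1}}\,|x_{i_1}|^{\lambda_1}
&= \sum_{i_k=1}^n\sum_{\widehat{i_k}=1}^n \frac{a_i^2}{S_{i_k}^{2-\lambda_1}}\,|x_{i_1}|^{\lambda_1} \\
&\le \sum_{i_k=1}^n\left(\sum_{\widehat{i_k}=1}^n \frac{a_i^2}{S_{i_k}^2}\right)^{\frac{2-\lambda_1}{2}}\left(\sum_{\widehat{i_k}=1}^n a_i^2 |x_{i_1}|^2\right)^{\frac{\lambda_1}{2}}.
\end{aligned}$$

To conclude, it remains to observe that

$$\sum_{\widehat{i_k}=1}^n \frac{a_i^2}{S_{i_k}^2} = \frac{\sum_{\widehat{i_k}=1}^n a_i^2}{S_{i_k}^2} = 1$$

and to use (8).
Finally, we can prove that the result is true when $\operatorname{card}\{j : p_j \ne \infty\} = m$ from
the case $\operatorname{card}\{j : p_j \ne \infty\} = m - 1$. The argument is exactly the same as the one by which we
deduced the case $\operatorname{card}\{j : p_j \ne \infty\} = l$ from the case $\operatorname{card}\{j : p_j \ne \infty\} = l - 1$,
except that we do not need the second (and more difficult) part, since there is no
index $k$ such that $p_k = \infty$. This explains why we just need, for each $k \in \{1,\dots,m\}$,
$|1/\mathbf{p}_m| \le 1/2 + 1/p_k$ and not $|1/\mathbf{p}_m| \le 1/2$.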

The second result is a technical lemma that appears in [11, Lemma 3.1]:
Lemma 2.2 ([11, Lemma 3.1]) Let $m \ge 1$, $\mathbf{q}_m = (q_1,\dots,q_m) \in (0,\infty)^m$, and
$\mathbf{p}_m = (p_1,\dots,p_m) \in (1,\infty]^m$, with $|1/\mathbf{p}_m| < 1$. If there is a constant $C_{\mathbf{p}_m,\mathbf{q}_m}$ such
that

$$\left(\sum_{j_1=1}^n\left(\cdots\left(\sum_{j_m=1}^n T(e_{j_1},\dots,e_{j_m})^{q_m}\right)^{\frac{q_{m-1}}{q_m}}\cdots\right)^{\frac{q_1}{q_2}}\right)^{\frac{1}{q_1}} \le C_{\mathbf{p}_m,\mathbf{q}_m}\|T\|$$

for all $n \in \mathbb{N}$ and all non-negative m-linear forms $T : \ell_{p_1}^n \times \cdots \times \ell_{p_m}^n \to \mathbb{K}$, then

$$1/q_k \le 1 - |1/\mathbf{p}_m|_{\ge k},$$

for all $k \in \{1,\dots,m\}$.

Proof Let $p > 1$, $q > 0$ and suppose that there is a constant $C_{p,q}$ such that

$$\left(\sum_{j_1=1}^n T(e_{j_1})^q\right)^{\frac{1}{q}} \le C_{p,q}\|T\|$$

for all non-negative linear forms $T : \ell_p^n \to \mathbb{K}$. For each $n$, consider the non-negative
linear form $T_n(x) = \sum_{j=1}^n x_j$. By the Hölder inequality, we have $\|T_n\| \le n^{1/p^*}$. On
the other hand

$$\left(\sum_{j=1}^n T_n(e_j)^q\right)^{\frac{1}{q}} = n^{\frac{1}{q}}$$

and, since $n$ is arbitrary, we conclude that $1/q \le 1/p^*$ and the case $m = 1$ is done.
Now, let us proceed by induction. Suppose that the result is valid for $m - 1$ and let
$|1/\mathbf{p}_m| < 1$. Thus, $|1/\mathbf{p}_m|_{\ge 2} < 1$ and the induction hypothesis combined with a
simple argument of summability tells us that, if there is a constant $C_{\mathbf{p}_m,\mathbf{q}_m}$ such that

$$\left(\sum_{j_1=1}^n\left(\cdots\left(\sum_{j_m=1}^n T(e_{j_1},\dots,e_{j_m})^{q_m}\right)^{\frac{q_{m-1}}{q_m}}\cdots\right)^{\frac{q_1}{q_2}}\right)^{\frac{1}{q_1}} \le C_{\mathbf{p}_m,\mathbf{q}_m}\|T\|$$

for all $n \in \mathbb{N}$ and all non-negative m-linear forms $T : \ell_{p_1}^n \times \cdots \times \ell_{p_m}^n \to \mathbb{K}$, then

$$1/q_k \le 1 - |1/\mathbf{p}_m|_{\ge k}$$

for all $k \in \{2,\dots,m\}$. So, we must only show that

$$1/q_1 \le 1 - |1/\mathbf{p}_m| = 1/\lambda_{\mathbf{p}_m}.$$

For each $n$ consider the non-negative m-linear form $B_n : \ell_{p_1}^n \times \cdots \times \ell_{p_m}^n \to \mathbb{K}$ given
by

$$B_n(x^{(1)},\dots,x^{(m)}) = \sum_{j=1}^n x_j^{(1)} x_j^{(2)} \cdots x_j^{(m)}.$$

By the Hölder inequality we obtain

$$\begin{aligned}
\|B_n\| &= \sup_{\|x^{(1)}\|,\dots,\|x^{(m)}\|\le 1}\left|\sum_{j=1}^n x_j^{(1)} x_j^{(2)}\cdots x_j^{(m)}\right| \\
&\le \sup_{\|x^{(1)}\|,\dots,\|x^{(m)}\|\le 1}\left(\prod_{k=1}^m \|x^{(k)}\|_{\ell_{p_k}^n}\right)\left(\sum_{j=1}^n |1|^{\lambda_{\mathbf{p}_m}}\right)^{\frac{1}{\lambda_{\mathbf{p}_m}}} \\
&\le n^{1/\lambda_{\mathbf{p}_m}}.
\end{aligned}$$

On the other hand,

$$\left(\sum_{j_1=1}^n\left(\cdots\left(\sum_{j_m=1}^n B_n(e_{j_1},\dots,e_{j_m})^{q_m}\right)^{\frac{q_{m-1}}{q_m}}\cdots\right)^{\frac{q_1}{q_2}}\right)^{\frac{1}{q_1}} = n^{\frac{1}{q_1}},$$

and, since $n$ is arbitrary, we conclude that $1/q_1 \le 1/\lambda_{\mathbf{p}_m}$ and the proof is completed.
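The scaling argument in the case $m = 1$ can be checked numerically. The sketch below is only an illustration with a hypothetical helper name `lemma_ratio`; it evaluates $T_n(x) = x_1 + \cdots + x_n$ at the unit vector $x = n^{-1/p}(1,\dots,1)$ of $\ell_p^n$, where the Hölder inequality becomes an equality, and compares the result with the coefficient sum $n^{1/q}$.

```python
def lemma_ratio(n, p, q):
    """Ratio (sum_j T_n(e_j)^q)^(1/q) / T_n(x) for T_n(x) = x_1 + ... + x_n.

    The vector x = n^(-1/p) (1, ..., 1) has l_p norm 1 and attains the norm of
    T_n, so T_n(x) = n^(1 - 1/p) = n^(1/p*).
    """
    coefficient_side = float(n) ** (1.0 / q)   # every T_n(e_j) equals 1
    norm_side = n * float(n) ** (-1.0 / p)     # T_n evaluated at the unit vector x
    return coefficient_side / norm_side
```

With $p = 2$ the ratio equals $n^{1/q - 1/2}$: it stays bounded in $n$ exactly when $1/q \le 1/p^* = 1/2$, and it grows like $\sqrt{n}$ for $q = 1$, in which case no constant $C_{p,q}$ independent of $n$ can exist.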

3 Theorem 1.1 for m-Linear Forms

In this section we prove the following extended version of Theorem 1.1 for m-linear
forms.
Theorem 3.1 ([5, Theorem 1.3]) Let $m \ge 2$, let $\mathbf{p}_m = (p_1,\dots,p_m) \in [2,\infty]^m$ be
such that $0 \le |1/\mathbf{p}_m| \le 1/2$, and $\mathbf{q}_m = (q_1,\dots,q_m) \in [\lambda_{\mathbf{p}_m}, 2]^m$. The following
statements are equivalent:
(i) There is a constant $C_{\mathbf{p}_m,\mathbf{q}_m}$ such that, for each $n \in \mathbb{N}$ and each m-linear form
$T : \ell_{p_1}^n \times \cdots \times \ell_{p_m}^n \to \mathbb{K}$ we have

$$\left(\sum_{i_1=1}^n\left(\cdots\left(\sum_{i_m=1}^n |T(e_{i_1},\dots,e_{i_m})|^{q_m}\right)^{\frac{q_{m-1}}{q_m}}\cdots\right)^{\frac{q_1}{q_2}}\right)^{\frac{1}{q_1}} \le C_{\mathbf{p}_m,\mathbf{q}_m}\|T\|. \tag{9}$$

(ii)

$$|1/\mathbf{q}_m| \le \frac{m+1}{2} - |1/\mathbf{p}_m|. \tag{10}$$

Proof (i) ⇒ (ii). The Kahane–Salem–Zygmund inequality (see, for instance, [3])
assures the existence of an m-linear form $T_0 : \ell_{p_1}^n \times \cdots \times \ell_{p_m}^n \to \mathbb{K}$ with coefficients
$\pm 1$ satisfying

$$\|T_0\| \le K_m\, n^{\frac{m+1}{2} - |1/\mathbf{p}_m|}$$

for a certain constant $K_m$. Plugging this m-linear form into (9) we arrive at (ii).
(ii) ⇒ (i). Let $n \in \mathbb{N}$ and let $T : \ell_{p_1}^n \times \cdots \times \ell_{p_m}^n \to \mathbb{K}$ be an m-linear form. By
Proposition 2.1, we have

$$\left(\sum_{i_k=1}^n\left(\sum_{\widehat{i_k}=1}^n |T(e_{i_1},\dots,e_{i_m})|^2\right)^{\frac{\lambda_{\mathbf{p}_m}}{2}}\right)^{\frac{1}{\lambda_{\mathbf{p}_m}}} \le (\sqrt{2})^{m-1}\|T\|$$

for all $n$ and all $k \in \{1,\dots,m\}$. Using a well-known result of Minkowski (see [31,
Corollary 5.4.2]), since $\lambda_{\mathbf{p}_m} \le 2$, we can interchange the position of the exponent 2
and obtain $m$ inequalities with the same constant $(\sqrt{2})^{m-1}$ and exponents

$$\begin{cases}
(q_{1,1}, q_{2,1}, q_{3,1}, \dots, q_{m,1}) = (\lambda_{\mathbf{p}_m}, 2, 2, \dots, 2), \\
(q_{1,2}, q_{2,2}, q_{3,2}, \dots, q_{m,2}) = (2, \lambda_{\mathbf{p}_m}, 2, \dots, 2), \\
\qquad\vdots \\
(q_{1,m}, q_{2,m}, \dots, q_{m-1,m}, q_{m,m}) = (2, 2, \dots, 2, \lambda_{\mathbf{p}_m}).
\end{cases}$$

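Thanks to the monotonicity of the $\ell_p$-norms, the important case to be considered
in (10) is the equality. If $\lambda_{\mathbf{p}_m} = 2$ we have that $q_1 = \cdots = q_m = 2$ and by
Proposition 2.1 we arrive at (i). If $\lambda_{\mathbf{p}_m} < 2$, choosing

$$\theta_j = \frac{2\lambda_{\mathbf{p}_m} - \lambda_{\mathbf{p}_m} q_j}{2q_j - \lambda_{\mathbf{p}_m} q_j},$$

it is easy to see that $\theta_1 + \cdots + \theta_m = 1$ (by the equality in (10)) and that

$$\frac{1}{q_j} = \sum_{k=1}^m \frac{\theta_k}{q_{j,k}}$$

and, by the Hölder inequality for mixed sums (see [1, Theorem 2.49]), we conclude
that

$$\left(\sum_{i_1=1}^n\left(\cdots\left(\sum_{i_m=1}^n |T(e_{i_1},\dots,e_{i_m})|^{q_m}\right)^{\frac{q_{m-1}}{q_m}}\cdots\right)^{\frac{q_1}{q_2}}\right)^{\frac{1}{q_1}} \le (\sqrt{2})^{m-1}\|T\|.$$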
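The interpolation weights used in this step can be verified with exact rational arithmetic. The following is a sketch under stated assumptions: the function name `interpolation_weights` is hypothetical, the input is the number $|1/\mathbf{p}_m|$ together with exponents $q_j \in [\lambda_{\mathbf{p}_m}, 2]$ satisfying equality in (10), and $q_{j,k}$ equals $\lambda_{\mathbf{p}_m}$ when $j = k$ and 2 otherwise, as in the list of exponents obtained from Proposition 2.1.

```python
from fractions import Fraction

def interpolation_weights(recip_p_sum, qs):
    """Return (lambda, thetas) for exponents qs with equality in (10).

    recip_p_sum is |1/p| = 1/p_1 + ... + 1/p_m, assumed to be < 1/2 so that
    lambda < 2 and the weights are well defined.
    """
    lam = 1 / (1 - recip_p_sum)
    if lam == 2:
        raise ValueError("lambda = 2: all q_j equal 2 and no interpolation is needed")
    thetas = [lam * (2 - q) / (q * (2 - lam)) for q in qs]
    return lam, thetas

# Example: m = 2, |1/p| = 1/4, q_1 = q_2 = 8/5, so 1/q_1 + 1/q_2 = 5/4 = 3/2 - 1/4.
lam, thetas = interpolation_weights(Fraction(1, 4), [Fraction(8, 5), Fraction(8, 5)])

# 1/q_j should equal theta_j / lambda + (sum of the other thetas) / 2.
recovered = [thetas[j] / lam + (sum(thetas) - thetas[j]) / 2 for j in range(2)]
```

In this example $\lambda_{\mathbf{p}_m} = 4/3$, both weights equal $1/2$, and the recovered reciprocals are $5/8 = 1/q_j$, as the identity requires.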

Observe that, if we take

$$q_1 = \cdots = q_m = \frac{2m}{m+1-2|1/\mathbf{p}_m|} \in [\lambda_{\mathbf{p}_m}, 2],$$

we obtain

$$|1/\mathbf{q}_m| = \frac{m+1}{2} - |1/\mathbf{p}_m|,$$

providing the following extension of (3) to the multilinear setting:
Theorem 3.2 ([45, Theorem B]) Let $m \ge 2$, and let $\mathbf{p}_m = (p_1,\dots,p_m) \in [2,\infty]^m$
be such that $0 \le |1/\mathbf{p}_m| \le 1/2$. There is a constant $C_{\mathbf{p}_m} \ge 1$ such that, for every
positive integer $n$ and each m-linear form $T : \ell_{p_1}^n \times \cdots \times \ell_{p_m}^n \to \mathbb{K}$, we have

$$\left(\sum_{i_1,\dots,i_m=1}^n |T(e_{i_1},\dots,e_{i_m})|^{\frac{2m}{m+1-2|1/\mathbf{p}_m|}}\right)^{\frac{m+1-2|1/\mathbf{p}_m|}{2m}} \le C_{\mathbf{p}_m}\|T\|.$$

Moreover, the exponent $2m/(m+1-2|1/\mathbf{p}_m|)$ is optimal.
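For orientation, the exponent of Theorem 3.2 is easy to tabulate. The helper below is a hypothetical illustration (the name `hl_multilinear_exponent` and the choice of passing the reciprocals $1/p_k$ as exact fractions are assumptions of this sketch).

```python
from fractions import Fraction

def hl_multilinear_exponent(recip_ps):
    """Exponent 2m / (m + 1 - 2|1/p|), given the list of reciprocals 1/p_k."""
    m = len(recip_ps)
    s = sum(recip_ps, Fraction(0))
    if not 0 <= s <= Fraction(1, 2):
        raise ValueError("need 0 <= |1/p| <= 1/2")
    return Fraction(2 * m) / (m + 1 - 2 * s)
```

For $m = 2$ and $p_1 = p_2 = \infty$ the exponent is $4/3$, which agrees with $\mu$ from the Introduction; for $m = 3$ and all $p_k = \infty$ it is $3/2$; and for $m = 2$ with $|1/\mathbf{p}_m| = 1/2$ it is 2.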
4 Theorem 1.2 for m-Linear Forms

An m-linear operator between Banach spaces $T : E_1 \times \cdots \times E_m \to F$ is multiple
$(r;\mathbf{p}_m)$-summing (see e.g. [18, 34]), where $\mathbf{p}_m = (p_1,\dots,p_m)$, if there is $C > 0$
such that

$$\left(\sum_{j_1,\dots,j_m=1}^n \left\|T\left(x_{j_1}^{(1)},\dots,x_{j_m}^{(m)}\right)\right\|^r\right)^{\frac{1}{r}} \le C\prod_{k=1}^m \sup_{\varphi_k\in B_{E_k^*}}\left(\sum_{j=1}^n \left|\varphi_k\left(x_j^{(k)}\right)\right|^{p_k}\right)^{\frac{1}{p_k}}$$

for every $\left(x_{i_j}^{(j)}\right)_{i_j=1}^n \subseteq E_j$, with $j = 1,\dots,m$.
We denote by $\Pi^m_{(r;\mathbf{p}_m)}(E_1,\dots,E_m;F)$ the space composed of all multiple
$(r;\mathbf{p}_m)$-summing m-linear operators $T : E_1 \times \cdots \times E_m \to F$. In the case of linear
operators, this notion reduces to the well-known concept of absolutely summing
operators (see [28]).
The following extension of Theorem 1.2 is due to Dimant and Sevilla-Peris ([29,
Proposition 4.1]); the proof presented here follows the lines of [9, Theorem 2.2].
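Theorem 4.1 ([29, Proposition 4.1]) Let $m \ge 2$, and let $\mathbf{p}_m = (p_1,\dots,p_m) \in
[2,\infty]^m$ be such that $1/2 \le |1/\mathbf{p}_m| < 1$. There is a constant $C_{\mathbf{p}_m}$ such that

$$\left(\sum_{i_1,\dots,i_m=1}^n |T(e_{i_1},\dots,e_{i_m})|^{\lambda_{\mathbf{p}_m}}\right)^{\frac{1}{\lambda_{\mathbf{p}_m}}} \le C_{\mathbf{p}_m}\|T\|,$$

for all m-linear forms $T : \ell_{p_1}^n \times \cdots \times \ell_{p_m}^n \to \mathbb{K}$ and all positive integers $n$. Moreover,
the exponent $\lambda_{\mathbf{p}_m}$ is optimal.
Proof The optimality of the exponent $\lambda_{\mathbf{p}_m}$ is a straightforward consequence of
Lemma 2.2. Let

$$s = \min\left\{r : \text{there exist } r \text{ indexes } k_1,\dots,k_r \text{ such that } \frac{1}{2} \le \sum_{i=1}^r \frac{1}{p_{k_i}} < 1\right\}.$$

For the sake of simplicity let us suppose that $p_{k_1} = p_1, \dots, p_{k_s} = p_s$. Observe that

$$\frac{1}{p_1} + \cdots + \frac{1}{p_{s-1}} < \frac{1}{2}.$$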
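The choice of $s$ can be made concrete by a brute-force search over index sets. The sketch below is illustrative only: `minimal_block` is a hypothetical name, the reciprocals $1/p_k$ are passed as exact fractions, and no attempt is made at efficiency.

```python
from fractions import Fraction
from itertools import combinations

def minimal_block(recip_ps):
    """Smallest r, with one index set of size r, such that 1/2 <= sum 1/p_k < 1."""
    half = Fraction(1, 2)
    for r in range(1, len(recip_ps) + 1):
        for idx in combinations(range(len(recip_ps)), r):
            total = sum((recip_ps[i] for i in idx), Fraction(0))
            if half <= total < 1:
                return r, idx
    return None  # no admissible index set, e.g. when |1/p| < 1/2

# Example: p = (4, 4, 4), so |1/p| = 3/4 and two indices already reach 1/2.
s, block = minimal_block([Fraction(1, 4)] * 3)
```

In this example $s = 2$, and dropping one index from the block leaves the sum $1/4 < 1/2$, in line with the observation above.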
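Let $n \in \mathbb{N}$ and let $A : \ell_{p_1}^n \times \cdots \times \ell_{p_s}^n \to \mathbb{K}$ be s-linear. Let us fix $x \in \ell_{p_s}^n$ and
consider

$$A_s : \ell_{p_1}^n \times \cdots \times \ell_{p_{s-1}}^n \times \ell_\infty^n \to \mathbb{K}, \qquad (z^{(1)},\dots,z^{(s)}) \mapsto A(z^{(1)},\dots,z^{(s-1)}, xz^{(s)}).$$

By Proposition 2.1, for $A_s$, we have

$$\left(\sum_{i_s=1}^n |x_{i_s}|^{\lambda_{s-1}}\left(\sum_{i_1,\dots,i_{s-1}=1}^n |A(e_{i_1},\dots,e_{i_s})|^2\right)^{\frac{\lambda_{s-1}}{2}}\right)^{\frac{1}{\lambda_{s-1}}} \le (\sqrt{2})^{s-1}\|A\|\,\|x\|_{\ell_{p_s}^n}$$

with $\lambda_{s-1} = [1 - (1/p_1 + \cdots + 1/p_{s-1})]^{-1}$. Since

$$\frac{1}{(p_s/\lambda_{s-1})^*} = 1 - \frac{\lambda_{s-1}}{p_s} = \frac{\lambda_{s-1}}{\lambda_s},$$

with $\lambda_s = [1 - (1/p_1 + \cdots + 1/p_s)]^{-1}$, we get

$$\begin{aligned}
\left(\sum_{i_s=1}^n\left(\sum_{i_1,\dots,i_{s-1}=1}^n |A(e_{i_1},\dots,e_{i_s})|^2\right)^{\frac{\lambda_s}{2}}\right)^{\frac{1}{\lambda_s}}
&= \left(\sum_{i_s=1}^n\left(\sum_{i_1,\dots,i_{s-1}=1}^n |A(e_{i_1},\dots,e_{i_s})|^2\right)^{\frac{\lambda_{s-1}}{2}\times(p_s/\lambda_{s-1})^*}\right)^{\frac{1}{\lambda_{s-1}}\times\frac{1}{(p_s/\lambda_{s-1})^*}} \\
&= \left\|\left(\left(\sum_{i_1,\dots,i_{s-1}=1}^n |A(e_{i_1},\dots,e_{i_s})|^2\right)^{\frac{\lambda_{s-1}}{2}}\right)_{i_s=1}^n\right\|_{(p_s/\lambda_{s-1})^*}^{\frac{1}{\lambda_{s-1}}} \\
&= \left(\sup_{y\in B_{\ell^n_{p_s/\lambda_{s-1}}}}\sum_{i_s=1}^n |y_{i_s}|\left(\sum_{i_1,\dots,i_{s-1}=1}^n |A(e_{i_1},\dots,e_{i_s})|^2\right)^{\frac{\lambda_{s-1}}{2}}\right)^{\frac{1}{\lambda_{s-1}}} \\
&= \left(\sup_{x\in B_{\ell^n_{p_s}}}\sum_{i_s=1}^n |x_{i_s}|^{\lambda_{s-1}}\left(\sum_{i_1,\dots,i_{s-1}=1}^n |A(e_{i_1},\dots,e_{i_s})|^2\right)^{\frac{\lambda_{s-1}}{2}}\right)^{\frac{1}{\lambda_{s-1}}} \\
&\le (\sqrt{2})^{s-1}\|A\|.
\end{aligned}$$

Thanks to the monotonicity of the $\ell_p$-norms, since $\lambda_s \ge 2$, the above estimate
implies

$$\left(\sum_{i_1,\dots,i_s=1}^n |A(e_{i_1},\dots,e_{i_s})|^{\lambda_s}\right)^{1/\lambda_s} \le (\sqrt{2})^{s-1}\|A\|. \tag{11}$$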

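By the Khinchin inequality (see [28, Theorem 1.10]), with constant 1, because $\lambda_s \ge 2$,
we have, for every $n$ and all $(s+1)$-linear forms $T_{s+1} : \ell_{p_1}^n \times \cdots \times \ell_{p_s}^n \times \ell_\infty^n \to \mathbb{K}$,

$$\begin{aligned}
\left(\sum_{i_1,\dots,i_s=1}^n\left(\sum_{i_{s+1}=1}^n |T_{s+1}(e_{i_1},\dots,e_{i_{s+1}})|^2\right)^{\frac{\lambda_s}{2}}\right)^{\frac{1}{\lambda_s}}
&\le \left(\sum_{i_1,\dots,i_s=1}^n \int_0^1\left|\sum_{i_{s+1}=1}^n T_{s+1}(e_{i_1},\dots,e_{i_{s+1}})\, r_{i_{s+1}}(t)\right|^{\lambda_s} dt\right)^{\frac{1}{\lambda_s}} \\
&= \left(\int_0^1 \sum_{i_1,\dots,i_s=1}^n\left|T_{s+1}\left(e_{i_1},\dots,e_{i_s},\sum_{i_{s+1}=1}^n e_{i_{s+1}} r_{i_{s+1}}(t)\right)\right|^{\lambda_s} dt\right)^{\frac{1}{\lambda_s}} \\
&\le \left(\sup_{t\in[0,1]}\sum_{i_1,\dots,i_s=1}^n\left|T_{s+1}\left(e_{i_1},\dots,e_{i_s},\sum_{i_{s+1}=1}^n e_{i_{s+1}} r_{i_{s+1}}(t)\right)\right|^{\lambda_s}\right)^{\frac{1}{\lambda_s}} \\
&= \sup_{t\in[0,1]}\left(\sum_{i_1,\dots,i_s=1}^n\left|T_{s+1}\left(e_{i_1},\dots,e_{i_s},\sum_{i_{s+1}=1}^n e_{i_{s+1}} r_{i_{s+1}}(t)\right)\right|^{\lambda_s}\right)^{\frac{1}{\lambda_s}} \\
&\le (\sqrt{2})^{s-1}\sup_{t\in[0,1]}\left\|T_{s+1}\left(\cdot,\dots,\cdot,\sum_{i_{s+1}=1}^n e_{i_{s+1}} r_{i_{s+1}}(t)\right)\right\| \\
&\le (\sqrt{2})^{s-1}\|T_{s+1}\|.
\end{aligned}$$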
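The first inequality of this chain uses that, for an exponent $\lambda \ge 2$, the $\ell_2$ norm of the coefficients is dominated by the $L_\lambda$ norm of the corresponding Rademacher sum, with constant 1. A minimal numerical sketch follows, with hypothetical function names; it relies on the fact that the first $n$ Rademacher functions realize every sign pattern on a subset of $[0,1]$ of measure $2^{-n}$, so the integral is an average over sign patterns.

```python
from itertools import product

def rademacher_moment(coeffs, lam):
    """(integral over [0,1] of |sum_i a_i r_i(t)|^lam dt)^(1/lam), computed exactly
    as an average over all sign patterns of the Rademacher functions."""
    n = len(coeffs)
    total = 0.0
    for signs in product((1, -1), repeat=n):
        total += abs(sum(s * a for s, a in zip(signs, coeffs))) ** lam
    return (total / 2 ** n) ** (1.0 / lam)

def l2_norm(coeffs):
    """Euclidean norm of the coefficient vector."""
    return sum(a * a for a in coeffs) ** 0.5

coeffs = [3.0, -1.0, 2.0]
```

For these coefficients the second moment reproduces the $\ell_2$ norm $\sqrt{14}$ exactly, while the third moment is $72^{1/3} \approx 4.16$, which is larger, as expected for exponents above 2.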

Here and henceforth, rj (t) denotes the j -th Rademacher function. Thus, from the
previous inequality together with canonical inclusion of p spaces,

⎛ ⎞1

n λs

⎝ Ts+1 ei1 , . . . , eis+1
λs ⎠

i1 ,...,is+1 =1

⎛ ⎛ ⎞ λs ⎞ λs
1
λs
⎜
n n
λs ⎟
=⎝ ⎝ Ts+1 ei1 , . . . , eis+1 ⎠ ⎠
i1 ,...,is =1 is+1 =1

⎛ ⎛ ⎞ λs ⎞ λs
1

n n 2
⎜ 2 ⎟
≤⎝ ⎝ Ts+1 ei1 , . . . , eis+1 ⎠ ⎠
i1 ,...,is =1 is+1 =1

√ s−1
≤ 2 Ts+1 ,

for every $n$ and all $(s+1)$-linear forms $T_{s+1}\colon \ell_{p_1}^n\times\cdots\times\ell_{p_s}^n\times\ell_\infty^n\to\mathbb{K}$. Using the canonical isometric isomorphisms for the spaces of weakly summable sequences (see [28, Proposition 2.2]), we know that this is equivalent to asserting that (see [29, p. 308])
\[
\Pi^{s+1}_{(\lambda_s;\,p_1^*,\dots,p_s^*,1)}(E_1,\dots,E_{s+1};\mathbb{K}) = \mathcal{L}(E_1,\dots,E_{s+1};\mathbb{K})
\]
for all Banach spaces $E_1,\dots,E_{s+1}$. Since
\[
\frac{1}{\lambda_s}-\Biggl(1+\sum_{j=1}^{s}\frac{1}{p_j^*}\Biggr)+\Biggl(\sum_{j=1}^{s+1}\frac{1}{p_j^*}\Biggr)
=\frac{1}{\lambda_s}-\frac{1}{p_{s+1}}
=1-\sum_{j=1}^{s+1}\frac{1}{p_j}>0,
\]
by the inclusion theorem for multiple summing forms proved in [2, 13], we have
\[
\Pi^{s+1}_{(\lambda_{s+1};\,p_1^*,\dots,p_{s+1}^*)}(E_1,\dots,E_{s+1};\mathbb{K}) = \mathcal{L}(E_1,\dots,E_{s+1};\mathbb{K})
\]
for all Banach spaces $E_1,\dots,E_{s+1}$, with $1/\lambda_{s+1} = 1-(1/p_1+\cdots+1/p_{s+1})$.

Again (see [29, p. 308]), this is equivalent to saying that there is a constant $C_{(p_1,\dots,p_{s+1})}$ such that
\[
\Biggl(\sum_{i_1,\dots,i_{s+1}=1}^{n}\bigl|T(e_{i_1},\dots,e_{i_{s+1}})\bigr|^{\lambda_{s+1}}\Biggr)^{\frac{1}{\lambda_{s+1}}}\le C_{(p_1,\dots,p_{s+1})}\,\|T\|,
\]
for all $(s+1)$-linear forms $T\colon \ell_{p_1}^n\times\cdots\times\ell_{p_s}^n\times\ell_{p_{s+1}}^n\to\mathbb{K}$ and all positive integers $n$. The proof is completed by a standard induction argument.

5 Theorem 1.3 for Non-negative m-Linear Forms

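From now on, for all $\mathbf{p}_m = (p_1,\dots,p_m)\in[1,\infty]^m$ with $|1/\mathbf{p}_m| < 1$, and each $k\in\{1,\dots,m\}$, let us define
\[
\frac{1}{\lambda_{\mathbf{p}_m,k}} = 1-\Bigl|\frac{1}{\mathbf{p}_m}\Bigr|_{\ge k},
\]
that is, $1/\lambda_{\mathbf{p}_m,k} = 1-(1/p_k+\cdots+1/p_m)$. Observe that $\lambda_{\mathbf{p}_m} = \lambda_{\mathbf{p}_m,1}$. Moreover, if $\sigma\colon\{1,\dots,m\}\to\{1,\dots,m\}$ is a bijection, then we will denote $\sigma_i = \sigma(i)$ for all $i = 1,\dots,m$.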
Theorem 5.1 ([37, Theorem 1.3]) Let $m\in\mathbb{N}$, $\mathbf{p}_m = (p_1,\dots,p_m)\in(1,\infty]^m$ with $|1/\mathbf{p}_m| < 1$, and $\mathbf{q}_m = (q_1,\dots,q_m)\in(0,\infty)^m$. For any bijection $\sigma\colon\{1,\dots,m\}\to\{1,\dots,m\}$ the following assertions are equivalent:
(i)
\[
\Biggl(\sum_{j_{\sigma_1}=1}^{n}\Biggl(\sum_{j_{\sigma_2}=1}^{n}\cdots\Biggl(\sum_{j_{\sigma_m}=1}^{n}\bigl|T(e_{j_1},\dots,e_{j_m})\bigr|^{q_m}\Biggr)^{\frac{q_{m-1}}{q_m}}\cdots\Biggr)^{\frac{q_1}{q_2}}\Biggr)^{\frac{1}{q_1}}\le\|T\|
\]
for all non-negative $m$-linear forms $T\colon \ell_{p_1}^n\times\cdots\times\ell_{p_m}^n\to\mathbb{K}$ and all positive integers $n$.
(ii) There is a constant $C_{\mathbf{p}_m,\mathbf{q}_m}$ such that
\[
\Biggl(\sum_{j_{\sigma_1}=1}^{n}\Biggl(\sum_{j_{\sigma_2}=1}^{n}\cdots\Biggl(\sum_{j_{\sigma_m}=1}^{n}\bigl|T(e_{j_1},\dots,e_{j_m})\bigr|^{q_m}\Biggr)^{\frac{q_{m-1}}{q_m}}\cdots\Biggr)^{\frac{q_1}{q_2}}\Biggr)^{\frac{1}{q_1}}\le C_{\mathbf{p}_m,\mathbf{q}_m}\,\|T\|
\]
for all non-negative $m$-linear forms $T\colon \ell_{p_1}^n\times\cdots\times\ell_{p_m}^n\to\mathbb{K}$ and all positive integers $n$.
(iii) The exponents $q_1,\dots,q_m$ satisfy
\[
q_k\ge\lambda_{\mathbf{t}_m,k}
\]
for all $k\in\{1,\dots,m\}$, with $\mathbf{t}_m = (p_{\sigma_1},\dots,p_{\sigma_m})$.
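Proof To simplify the notation we will consider $\sigma_j = j$ for all $j$; the other cases are similar. We observe that, in this case, $\lambda_{\mathbf{t}_m,k} = \lambda_{\mathbf{p}_m,k}$ for all $k\in\{1,\dots,m\}$. (i) ⇒ (ii) is obvious and (ii) ⇒ (iii) follows from Lemma 2.2. So, it remains to prove (iii) ⇒ (i). In the case $m = 1$ the result is immediate; it holds with constant 1 and does not need the non-negativity assumption. Let us show the general case $m$, supposing that the result holds for $m-1$; so we suppose that if $\mathbf{p}_{m-1} = (p_1,\dots,p_{m-1})\in(1,\infty]^{m-1}$ is such that $|1/\mathbf{p}_{m-1}| < 1$, then
\[
\Biggl(\sum_{j_1=1}^{n}\Biggl(\cdots\Biggl(\sum_{j_{m-1}=1}^{n}\bigl|T(e_{j_1},\dots,e_{j_{m-1}})\bigr|^{\mu_{m-1}}\Biggr)^{\frac{\mu_{m-2}}{\mu_{m-1}}}\cdots\Biggr)^{\frac{\mu_1}{\mu_2}}\Biggr)^{\frac{1}{\mu_1}}\le\|T\|,
\]
where
\[
(\mu_1,\dots,\mu_{m-1}) = \bigl(\lambda_{\mathbf{p}_{m-1},1},\dots,\lambda_{\mathbf{p}_{m-1},m-1}\bigr),
\]
for all positive integers $n$ and all non-negative $(m-1)$-linear forms $T\colon \ell_{p_1}^n\times\cdots\times\ell_{p_{m-1}}^n\to\mathbb{K}$. We recall that for an $m$-linear form $D\colon \ell_{p_1}^n\times\cdots\times\ell_{p_m}^n\to\mathbb{K}$, we have
\begin{align*}
\|D\| &= \sup_{\|x^{(i)}\|_{\ell_{p_i}^n}\le 1;\ 1\le i\le m}\bigl|D\bigl(x^{(1)},\dots,x^{(m)}\bigr)\bigr|\\
&= \sup_{\|x^{(i)}\|_{\ell_{p_i}^n}\le 1;\ 1\le i\le m}\Biggl|\sum_{j_1,\dots,j_m=1}^{n}D(e_{j_1},\dots,e_{j_m})\,x^{(1)}_{j_1}\cdots x^{(m)}_{j_m}\Biggr|\\
&= \sup_{\|x^{(i)}\|_{\ell_{p_i}^n}\le 1;\ 1\le i\le m-1}\Biggl(\sum_{j_m=1}^{n}\Biggl|\sum_{j_1,\dots,j_{m-1}=1}^{n}D(e_{j_1},\dots,e_{j_m})\,x^{(1)}_{j_1}\cdots x^{(m-1)}_{j_{m-1}}\Biggr|^{p_m^*}\Biggr)^{\frac{1}{p_m^*}}. \tag{12}
\end{align*}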
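Suppose that $\mathbf{p}_m = (p_1,\dots,p_m)\in(1,\infty]^m$ satisfies $|1/\mathbf{p}_m| < 1$. In this case
\[
\frac{1}{p_1}+\cdots+\frac{1}{p_{m-1}} < 1-\frac{1}{p_m} = \frac{1}{\lambda_{\mathbf{p}_m,m}} = \frac{1}{p_m^*}
\]
and thus we have $p_i > p_m^*$ for all $i\in\{1,\dots,m-1\}$.

Let $D\colon \ell_{p_1}^n\times\cdots\times\ell_{p_m}^n\to\mathbb{K}$ be a non-negative $m$-linear form. We define the non-negative $(m-1)$-linear form $T\colon \ell_{r_1}^n\times\cdots\times\ell_{r_{m-1}}^n\to\mathbb{K}$ by
\[
T(e_{j_1},\dots,e_{j_{m-1}}) = \sum_{j_m=1}^{n}D(e_{j_1},\dots,e_{j_m})^{p_m^*}, \tag{13}
\]
with $r_i = p_i/p_m^*$ for each $i = 1,\dots,m-1$. Note that $\mathbf{r}_{m-1} = (r_1,\dots,r_{m-1})\in(1,\infty]^{m-1}$ and, if we denote by ${}^{+}\ell_r^n$ the set of sequences $(x_j)\in\ell_r^n$ such that $x_j\ge 0$ for all $j$, we have
\begin{align*}
&\sup_{\|x^{(i)}\|_{\ell_{r_i}^n}\le 1;\ 1\le i\le m-1}\Biggl|\sum_{j_1,\dots,j_{m-1}=1}^{n}T(e_{j_1},\dots,e_{j_{m-1}})\,x^{(1)}_{j_1}\cdots x^{(m-1)}_{j_{m-1}}\Biggr|\\
&\qquad=\sup_{\|x^{(i)}\|_{{}^{+}\ell_{r_i}^n}\le 1;\ 1\le i\le m-1}\ \sum_{j_1,\dots,j_{m-1}=1}^{n}T(e_{j_1},\dots,e_{j_{m-1}})\,x^{(1)}_{j_1}\cdots x^{(m-1)}_{j_{m-1}}\\
&\qquad=\sup_{\|x^{(i)}\|_{{}^{+}\ell_{p_i}^n}\le 1;\ 1\le i\le m-1}\ \sum_{j_1,\dots,j_{m-1}=1}^{n}T(e_{j_1},\dots,e_{j_{m-1}})\bigl(x^{(1)}_{j_1}\cdots x^{(m-1)}_{j_{m-1}}\bigr)^{p_m^*}\\
&\qquad=\sup_{\|x^{(i)}\|_{{}^{+}\ell_{p_i}^n}\le 1;\ 1\le i\le m-1}\ \sum_{j_m=1}^{n}\sum_{j_1,\dots,j_{m-1}=1}^{n}D(e_{j_1},\dots,e_{j_m})^{p_m^*}\bigl(x^{(1)}_{j_1}\cdots x^{(m-1)}_{j_{m-1}}\bigr)^{p_m^*}\\
&\qquad=\sup_{\|x^{(i)}\|_{{}^{+}\ell_{p_i}^n}\le 1;\ 1\le i\le m-1}\ \sum_{j_m=1}^{n}\sum_{j_1,\dots,j_{m-1}=1}^{n}\bigl(D(e_{j_1},\dots,e_{j_m})\,x^{(1)}_{j_1}\cdots x^{(m-1)}_{j_{m-1}}\bigr)^{p_m^*}\\
&\qquad\le\sup_{\|x^{(i)}\|_{\ell_{p_i}^n}\le 1;\ 1\le i\le m-1}\ \sum_{j_m=1}^{n}\Biggl|\sum_{j_1,\dots,j_{m-1}=1}^{n}D(e_{j_1},\dots,e_{j_m})\,x^{(1)}_{j_1}\cdots x^{(m-1)}_{j_{m-1}}\Biggr|^{p_m^*}\\
&\qquad=\|D\|^{p_m^*}, \tag{14}
\end{align*}
for all $x^{(k)}\in B_{\ell_{r_k}^n}$, with $k = 1,\dots,m-1$, where the last equality holds by (12).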

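Note that
\[
\sum_{k=i}^{m-1}\frac{1}{r_k} = p_m^*\sum_{k=i}^{m-1}\frac{1}{p_k} = \Bigl(1-\frac{1}{p_m}\Bigr)^{-1}\sum_{k=i}^{m-1}\frac{1}{p_k} < 1.
\]
Hence, for each $i\in\{1,\dots,m-1\}$, a simple calculation shows that
\[
p_m^*\,\lambda_{\mathbf{r}_{m-1},i} = \lambda_{\mathbf{p}_m,i}.
\]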
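For the reader's convenience, the calculation can be spelled out as follows. This is only a sketch; it uses nothing beyond $r_k = p_k/p_m^*$, $1/p_m^* = 1-1/p_m$ and the definition $1/\lambda_{\mathbf{p}_m,i} = 1-(1/p_i+\cdots+1/p_m)$ from the beginning of this section:
\begin{align*}
\frac{1}{\lambda_{\mathbf{p}_m,i}}
&= \Bigl(1-\frac{1}{p_m}\Bigr)-\sum_{k=i}^{m-1}\frac{1}{p_k}
 = \frac{1}{p_m^*}-\sum_{k=i}^{m-1}\frac{1}{p_k}\\
&= \frac{1}{p_m^*}\Biggl(1-p_m^*\sum_{k=i}^{m-1}\frac{1}{p_k}\Biggr)
 = \frac{1}{p_m^*}\Biggl(1-\sum_{k=i}^{m-1}\frac{1}{r_k}\Biggr)
 = \frac{1}{p_m^*\,\lambda_{\mathbf{r}_{m-1},i}}.
\end{align*}
Taking reciprocals gives the stated identity.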

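Therefore, by (13), making
\[
(\alpha_1,\dots,\alpha_m) = \bigl(\lambda_{\mathbf{p}_m,1},\dots,\lambda_{\mathbf{p}_m,m}\bigr)
\]
and
\[
(\beta_1,\dots,\beta_{m-1}) = \bigl(\lambda_{\mathbf{r}_{m-1},1},\dots,\lambda_{\mathbf{r}_{m-1},m-1}\bigr)
\]
we have
\begin{align*}
&\Biggl(\sum_{j_1=1}^{n}\Biggl(\cdots\Biggl(\sum_{j_m=1}^{n}D(e_{j_1},\dots,e_{j_m})^{p_m^*}\Biggr)^{\frac{\alpha_{m-1}}{\alpha_m}}\cdots\Biggr)^{\frac{\alpha_1}{\alpha_2}}\Biggr)^{\frac{1}{\alpha_1}\times p_m^*}\\
&\qquad=\Biggl(\sum_{j_1=1}^{n}\Biggl(\cdots\Biggl(\sum_{j_{m-1}=1}^{n}T(e_{j_1},\dots,e_{j_{m-1}})^{\frac{\alpha_{m-1}}{\alpha_m}}\Biggr)^{\frac{\alpha_{m-2}}{\alpha_{m-1}}}\cdots\Biggr)^{\frac{\alpha_1}{\alpha_2}}\Biggr)^{\frac{1}{\alpha_1}\times p_m^*}\\
&\qquad=\Biggl(\sum_{j_1=1}^{n}\Biggl(\cdots\Biggl(\sum_{j_{m-1}=1}^{n}T(e_{j_1},\dots,e_{j_{m-1}})^{\beta_{m-1}}\Biggr)^{\frac{\beta_{m-2}}{\beta_{m-1}}}\cdots\Biggr)^{\frac{\beta_1}{\beta_2}}\Biggr)^{\frac{1}{\beta_1}}.
\end{align*}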

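By the last equality and the induction hypothesis we conclude that
\begin{align*}
&\Biggl(\sum_{j_1=1}^{n}\Biggl(\cdots\Biggl(\sum_{j_m=1}^{n}D(e_{j_1},\dots,e_{j_m})^{p_m^*}\Biggr)^{\frac{\lambda_{\mathbf{p}_m,m-1}}{\lambda_{\mathbf{p}_m,m}}}\cdots\Biggr)^{\frac{\lambda_{\mathbf{p}_m,1}}{\lambda_{\mathbf{p}_m,2}}}\Biggr)^{\frac{1}{\lambda_{\mathbf{p}_m,1}}\times p_m^*}\\
&\qquad\le\sup_{\|x^{(i)}\|_{\ell_{r_i}^n}\le 1;\ 1\le i\le m-1}\Biggl|\sum_{j_1,\dots,j_{m-1}=1}^{n}T(e_{j_1},\dots,e_{j_{m-1}})\,x^{(1)}_{j_1}\cdots x^{(m-1)}_{j_{m-1}}\Biggr|\\
&\qquad\le\|D\|^{p_m^*},
\end{align*}
where in the last inequality we have used (14).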

6 Theorem 1.4 for m-Linear Forms

In this section we prove a multilinear version of Theorem 1.4. We start by proving a multilinear version of Theorem 1.5. The proof presented here is inspired by ideas of Ozikiewicz and Tonge [39].
Theorem 6.1 ([11, Theorem 3.2]) Let $m\ge 2$, $\sigma\colon\{1,\dots,m\}\to\{1,\dots,m\}$ be a bijection, $\mathbf{q}_m = (q_1,\dots,q_m)\in(0,\infty)^m$, and $\mathbf{p}_m = (p_1,\dots,p_m)\in(1,\infty]^m$ be such that $p_{\sigma_m}\in(1,2]$ and $|1/\mathbf{p}_m| < 1$. The following assertions are equivalent:
(i)
\[
\Biggl(\sum_{j_{\sigma_1}=1}^{n}\Biggl(\sum_{j_{\sigma_2}=1}^{n}\cdots\Biggl(\sum_{j_{\sigma_m}=1}^{n}\bigl|T(e_{j_1},\dots,e_{j_m})\bigr|^{q_m}\Biggr)^{\frac{q_{m-1}}{q_m}}\cdots\Biggr)^{\frac{q_1}{q_2}}\Biggr)^{\frac{1}{q_1}}\le\|T\|
\]
for all $m$-linear forms $T\colon \ell_{p_1}^n\times\cdots\times\ell_{p_m}^n\to\mathbb{K}$ and all positive integers $n$.
(ii) There is a constant $C_{\mathbf{p}_m,\mathbf{q}_m}$ such that
\[
\Biggl(\sum_{j_{\sigma_1}=1}^{n}\Biggl(\sum_{j_{\sigma_2}=1}^{n}\cdots\Biggl(\sum_{j_{\sigma_m}=1}^{n}\bigl|T(e_{j_1},\dots,e_{j_m})\bigr|^{q_m}\Biggr)^{\frac{q_{m-1}}{q_m}}\cdots\Biggr)^{\frac{q_1}{q_2}}\Biggr)^{\frac{1}{q_1}}\le C_{\mathbf{p}_m,\mathbf{q}_m}\,\|T\|
\]
for all $m$-linear forms $T\colon \ell_{p_1}^n\times\cdots\times\ell_{p_m}^n\to\mathbb{K}$ and all positive integers $n$.
(iii) The exponents $q_1,\dots,q_m$ satisfy
\[
q_k\ge\lambda_{\mathbf{t}_m,k}
\]
for all $k\in\{1,\dots,m\}$, with $\mathbf{t}_m = (p_{\sigma_1},\dots,p_{\sigma_m})$.
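Proof To simplify the notation we will consider $\sigma_j = j$ for all $j$; the other cases are similar. (i) ⇒ (ii) is immediate and (ii) ⇒ (iii) follows from Lemma 2.2. Let us prove (iii) ⇒ (i). Let $T\colon \ell_{p_1}^n\times\cdots\times\ell_{p_m}^n\to\mathbb{K}$ be an $m$-linear form. We define the non-negative $m$-linear form $A\colon \ell_{s_1}^n\times\cdots\times\ell_{s_m}^n\to\mathbb{K}$ by
\[
A(e_{j_1},\dots,e_{j_m}) = \bigl|T(e_{j_1},\dots,e_{j_m})\bigr|^{2},
\]
with $s_m = p_m/(2-p_m)$ and $s_i = p_i/2$ for each $i\in\{1,\dots,m-1\}$.

Note that $s_m^* = (p_m/(2-p_m))^* = p_m^*/2$, and then for all positive integers $n$, we have
\begin{align*}
&\sup\Biggl\{\Biggl|\sum_{j_1,\dots,j_m=1}^{n}A(e_{j_1},\dots,e_{j_m})\,x^{(1)}_{j_1}\cdots x^{(m)}_{j_m}\Biggr| : \|x^{(i)}\|_{\ell_{s_i}^n}\le 1;\ 1\le i\le m\Biggr\}\\
&\qquad=\sup\Biggl\{\Biggl|\sum_{j_1,\dots,j_m=1}^{n}\bigl|T(e_{j_1},\dots,e_{j_m})\bigr|^{2}\,x^{(1)}_{j_1}\cdots x^{(m)}_{j_m}\Biggr| : \|x^{(i)}\|_{\ell_{s_i}^n}\le 1;\ 1\le i\le m\Biggr\}\\
&\qquad=\sup_{\substack{x^{(i)}\in B_{\ell_{s_i}^n}\\ 1\le i\le m-1}}\Biggl(\sum_{j_m=1}^{n}\Biggl|\sum_{j_1,\dots,j_{m-1}=1}^{n}\bigl|T(e_{j_1},\dots,e_{j_m})\bigr|^{2}\,x^{(1)}_{j_1}\cdots x^{(m-1)}_{j_{m-1}}\Biggr|^{\frac{p_m^*}{2}}\Biggr)^{\frac{2}{p_m^*}}\\
&\qquad=\sup_{\substack{x^{(i)}\in B_{\ell_{p_i}^n}\\ 1\le i\le m-1}}\Biggl(\sum_{j_m=1}^{n}\Biggl|\sum_{j_1,\dots,j_{m-1}=1}^{n}\bigl|T(e_{j_1},\dots,e_{j_m})\bigr|^{2}\,\bigl|x^{(1)}_{j_1}\cdots x^{(m-1)}_{j_{m-1}}\bigr|^{2}\Biggr|^{\frac{p_m^*}{2}}\Biggr)^{\frac{2}{p_m^*}}\\
&\qquad\le\sup_{\substack{x^{(i)}\in B_{\ell_{p_i}^n}\\ 1\le i\le m-1}}\Biggl(\sum_{j_m=1}^{n}\Biggl(\sum_{j_1,\dots,j_{m-1}=1}^{n}\bigl|T(e_{j_1},\dots,e_{j_m})\,x^{(1)}_{j_1}\cdots x^{(m-1)}_{j_{m-1}}\bigr|^{2}\Biggr)^{\frac{p_m^*}{2}}\Biggr)^{\frac{2}{p_m^*}}.
\end{align*}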

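We use the Khinchin inequality for multiple sums (see [24, p. 455]), with constant 1 because $p_m^*\ge 2$, to obtain
\begin{align*}
&\sup_{\substack{x^{(i)}\in B_{\ell_{p_i}^n}\\ 1\le i\le m-1}}\Biggl(\sum_{j_m=1}^{n}\Biggl(\sum_{j_1,\dots,j_{m-1}=1}^{n}\bigl|T(e_{j_1},\dots,e_{j_m})\,x^{(1)}_{j_1}\cdots x^{(m-1)}_{j_{m-1}}\bigr|^{2}\Biggr)^{\frac{p_m^*}{2}}\Biggr)^{\frac{2}{p_m^*}}\\
&\qquad\le\sup_{\substack{x^{(i)}\in B_{\ell_{p_i}^n}\\ 1\le i\le m-1}}\Biggl(\sum_{j_m=1}^{n}\Biggl(\int_{I^{m-1}}\Biggl|\sum_{j_1,\dots,j_{m-1}=1}^{n}T(e_{j_1},\dots,e_{j_m})\prod_{k=1}^{m-1}r_{j_k}(t_k)\,x^{(k)}_{j_k}\Biggr|^{p_m^*}dt\Biggr)^{\frac{1}{p_m^*}\cdot p_m^*}\Biggr)^{\frac{2}{p_m^*}}\\
&\qquad\le\sup_{\substack{x^{(i)}\in B_{\ell_{p_i}^n}\\ 1\le i\le m-1}}\Biggl(\int_{I^{m-1}}\sum_{j_m=1}^{n}\Biggl|\sum_{j_1,\dots,j_{m-1}=1}^{n}T(e_{j_1},\dots,e_{j_m})\prod_{k=1}^{m-1}r_{j_k}(t_k)\,x^{(k)}_{j_k}\Biggr|^{p_m^*}dt\Biggr)^{\frac{2}{p_m^*}}\\
&\qquad\le\sup\Biggl\{\Biggl(\int_{I^{m-1}}\|T\|^{p_m^*}\,dt\Biggr)^{\frac{2}{p_m^*}} : \|x^{(i)}\|_{\ell_{p_i}^n}\le 1;\ 1\le i\le m-1\Biggr\}\\
&\qquad=\|T\|^{2},
\end{align*}
where $I = [0,1]$ and $dt := dt_1\cdots dt_{m-1}$. Thus
\[
\|A\|\le\|T\|^{2}.
\]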

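On the other hand, observe that $\mathbf{s}_m = (s_1,\dots,s_m)\in(1,\infty]^m$ and
\[
\sum_{j=1}^{m}\frac{1}{s_j} = \frac{2}{p_1}+\cdots+\frac{2}{p_{m-1}}+\frac{2-p_m}{p_m} = 2\Biggl(\sum_{j=1}^{m}\frac{1}{p_j}\Biggr)-1 < 1,
\]
and
\[
2\,\frac{1}{\lambda_{\mathbf{p}_m,i}} = 2\Biggl(1-\sum_{j=i}^{m}\frac{1}{p_j}\Biggr) = 1-\Biggl(\frac{2}{p_i}+\cdots+\frac{2}{p_{m-1}}+\frac{2-p_m}{p_m}\Biggr) = \frac{1}{\lambda_{\mathbf{s}_m,i}}
\]
for each $i\in\{1,\dots,m\}$. In particular $\lambda_{\mathbf{s}_m,i} = \lambda_{\mathbf{p}_m,i}/2$ and $\lambda_{\mathbf{s}_m,m} = p_m^*/2$. Combining these facts with Theorem 5.1, we obtain
\begin{align*}
&\Biggl(\sum_{j_1=1}^{n}\Biggl(\cdots\Biggl(\sum_{j_m=1}^{n}\bigl|T(e_{j_1},\dots,e_{j_m})\bigr|^{p_m^*}\Biggr)^{\frac{\lambda_{\mathbf{p}_m,m-1}}{p_m^*}}\cdots\Biggr)^{\frac{\lambda_{\mathbf{p}_m,1}}{\lambda_{\mathbf{p}_m,2}}}\Biggr)^{\frac{2}{\lambda_{\mathbf{p}_m,1}}}\\
&\qquad=\Biggl(\sum_{j_1=1}^{n}\Biggl(\cdots\Biggl(\sum_{j_m=1}^{n}\bigl|T(e_{j_1},\dots,e_{j_m})\bigr|^{2\times\frac{p_m^*}{2}}\Biggr)^{\frac{\lambda_{\mathbf{p}_m,m-1}/2}{p_m^*/2}}\cdots\Biggr)^{\frac{\lambda_{\mathbf{p}_m,1}/2}{\lambda_{\mathbf{p}_m,2}/2}}\Biggr)^{\frac{2}{\lambda_{\mathbf{p}_m,1}}}\\
&\qquad=\Biggl(\sum_{j_1=1}^{n}\Biggl(\cdots\Biggl(\sum_{j_m=1}^{n}A(e_{j_1},\dots,e_{j_m})^{\lambda_{\mathbf{s}_m,m}}\Biggr)^{\frac{\lambda_{\mathbf{s}_m,m-1}}{\lambda_{\mathbf{s}_m,m}}}\cdots\Biggr)^{\frac{\lambda_{\mathbf{s}_m,1}}{\lambda_{\mathbf{s}_m,2}}}\Biggr)^{\frac{1}{\lambda_{\mathbf{s}_m,1}}}\\
&\qquad\le\|A\|\le\|T\|^{2}.
\end{align*}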

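Since
\[
\lambda_{\mathbf{t}_m,1}\ge\lambda_{\mathbf{t}_m,2}\ge\cdots\ge\lambda_{\mathbf{t}_m,m-1}\ge\lambda_{\mathbf{t}_m,m}
\]
and
\[
\lambda_{\mathbf{t}_m,1} = \lambda_{\mathbf{p}_m},
\]
the extension of Theorem 1.4, with optimal exponent and optimal constant, follows immediately from the previous theorem:
Theorem 6.2 ([11, Theorem 3.2] and [29, Proposition 4.1]) Let $m\ge 2$ and let $\mathbf{p}_m = (p_1,\dots,p_m)\in(1,\infty]^m$ be such that $p_i\in(1,2]$ for some $i$ and $|1/\mathbf{p}_m| < 1$. Then
\[
\Biggl(\sum_{j_1,\dots,j_m=1}^{n}\bigl|T(e_{j_1},\dots,e_{j_m})\bigr|^{\lambda_{\mathbf{p}_m}}\Biggr)^{\frac{1}{\lambda_{\mathbf{p}_m}}}\le\|T\| \tag{15}
\]
for all $m$-linear forms $T\colon \ell_{p_1}^n\times\cdots\times\ell_{p_m}^n\to\mathbb{K}$ and all positive integers $n$. Moreover, the exponent $\lambda_{\mathbf{p}_m}$ is optimal.
The optimal exponent $\lambda_{\mathbf{p}_m}$ in the above result was obtained in [29] and the optimal constant 1 was obtained in [11].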

7 The Critical and Supercritical Cases

The paper of Hardy and Littlewood does not investigate summability of the coefficients of bilinear forms $T\colon \ell_{p_1}^n\times\ell_{p_2}^n\to\mathbb{K}$ when $1/p_1+1/p_2\ge 1$. In fact, in this case it is simple to verify that there is no finite exponent $q$ for which there is a constant $C$ satisfying
\[
\Biggl(\sum_{j_1=1}^{n}\sum_{j_2=1}^{n}\bigl|T(e_{j_1},e_{j_2})\bigr|^{q}\Biggr)^{\frac{1}{q}}\le C\,\|T\|
\]
for all bilinear forms $T\colon \ell_{p_1}^n\times\ell_{p_2}^n\to\mathbb{K}$ and all positive integers $n$. This observation seems to trivialize the problem since it is obvious that
\[
\sup_{j_1,j_2\in\mathbb{N}}\bigl|T(e_{j_1},e_{j_2})\bigr|\le\|T\|.
\]
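The non-existence of a finite exponent $q$ can be checked on a diagonal form. The following is only a sketch; the form $T_n$ is an illustrative choice and is not taken from the text. If $1/p_1+1/p_2\ge 1$, then $p_2\le p_1^*$, so $\|y\|_{p_1^*}\le\|y\|_{p_2}$, and Hölder's inequality gives, for $T_n(x,y) = \sum_{j=1}^{n}x_jy_j$,
\[
|T_n(x,y)|\le\|x\|_{p_1}\|y\|_{p_1^*}\le\|x\|_{p_1}\|y\|_{p_2},
\]
so that $\|T_n\|\le 1$. On the other hand, $T_n(e_{j_1},e_{j_2}) = \delta_{j_1j_2}$, hence
\[
\Biggl(\sum_{j_1=1}^{n}\sum_{j_2=1}^{n}\bigl|T_n(e_{j_1},e_{j_2})\bigr|^{q}\Biggr)^{\frac{1}{q}} = n^{1/q},
\]
which is unbounded in $n$ for every finite $q$, so no constant $C$ independent of $n$ can exist.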

Until very recently the investigation of the $m$-linear case has followed this vein and ignored the case
\[
\frac{1}{p_1}+\cdots+\frac{1}{p_m}\ge 1. \tag{16}
\]
However, if we consider the problem from a broader perspective, we observe that the case (16) hides subtleties. In fact, under an anisotropic viewpoint (allowing different exponents for different indexes) the problem is no longer trivial, since a Hardy–Littlewood type inequality
\[
\Biggl(\sum_{i_1=1}^{n}\Biggl(\cdots\Biggl(\sum_{i_m=1}^{n}\bigl|T(e_{i_1},\dots,e_{i_m})\bigr|^{q_m}\Biggr)^{\frac{q_{m-1}}{q_m}}\cdots\Biggr)^{\frac{q_1}{q_2}}\Biggr)^{\frac{1}{q_1}}\le C\,\|T\|
\]
for $m$-linear forms $T\colon \ell_{p_1}^n\times\cdots\times\ell_{p_m}^n\to\mathbb{K}$ and $1/p_1+\cdots+1/p_m\ge 1$ implies $q_1 = \infty$, but $q_2,\dots,q_{m-1}$ may be finite. As far as we know the first work in this framework was [40], where the following result was proved for $m$-linear forms with $p_1 = \cdots = p_m = m$ (this case was called the critical case):
Theorem 7.1 ([40, Theorem 1]) Let $m\ge 2$. There is a constant $C_m$ such that
\[
\sup_{j_1}\Biggl(\sum_{j_2=1}^{n}\Biggl(\cdots\Biggl(\sum_{j_m=1}^{n}\bigl|T(e_{j_1},\dots,e_{j_m})\bigr|^{s_m}\Biggr)^{\frac{1}{s_m}s_{m-1}}\cdots\Biggr)^{\frac{1}{s_3}s_2}\Biggr)^{\frac{1}{s_2}}\le C_m\,\|T\| \tag{17}
\]
for all $m$-linear forms $T\colon \ell_m^n\times\cdots\times\ell_m^n\to\mathbb{K}$ and all positive integers $n$, with
\[
s_k = \frac{2m(m-1)}{mk-2k+2}
\]
for all $k = 2,\dots,m$. Moreover, $s_1 = \infty$ and $s_2 = m$ are sharp and, for $m > 2$, the optimal exponents $s_k$ satisfying (17) fulfill
\[
s_k\ge\frac{m}{k-1}.
\]
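The exponents of Theorem 7.1 are easy to tabulate. The following is a minimal sketch in exact rational arithmetic; the function names `critical_exponents` and `respects_lower_bound` are hypothetical helpers introduced here for illustration. It evaluates $s_k = 2m(m-1)/(mk-2k+2)$ and compares each value with the lower bound $m/(k-1)$ that the optimal exponents must satisfy.

```python
from fractions import Fraction


def critical_exponents(m):
    """Return {k: s_k} for k = 2..m, with s_k = 2m(m-1)/(mk - 2k + 2)."""
    if m < 2:
        raise ValueError("Theorem 7.1 requires m >= 2")
    return {
        k: Fraction(2 * m * (m - 1), m * k - 2 * k + 2)
        for k in range(2, m + 1)
    }


def respects_lower_bound(m):
    """Check that s_k >= m/(k-1) for every k = 2..m."""
    return all(
        s >= Fraction(m, k - 1) for k, s in critical_exponents(m).items()
    )


if __name__ == "__main__":
    for m in (2, 3, 4):
        print(m, critical_exponents(m), respects_lower_bound(m))
```

For instance, for $m = 3$ the sketch gives $s_2 = 3$ and $s_3 = 12/5$, while the lower bounds are $3$ and $3/2$; the two agree only at $k = 2$, which is consistent with the statement that $s_2 = m$ is sharp.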
Very recently the configuration $1/p_1+\cdots+1/p_m > 1$ was investigated in [36–38], but there are still several open questions related to the subject.

8 On the Constants and Some Final Remarks

There are still several open problems related to the Hardy–Littlewood inequalities for $m$-linear forms. The first class of open problems is related to the optimal exponents. As we can easily observe, there are some cases not covered by the previous theorems and the optimal exponents of these cases are not known, in general. At least for bilinear forms and $p_1,p_2\in[2,\infty]$, the optimal exponents are known for the non-critical cases:
Theorem 8.1 ([44, Theorem 5.1]) Let $p_1,p_2\in[2,\infty]$ with $1/p_1+1/p_2 < 1$, and $q_1,q_2 > 0$. For $\{k_1,k_2\} = \{1,2\}$ the following assertions are equivalent:
(i) There is a constant $C$ such that
\[
\Biggl(\sum_{j_{k_1}=1}^{n}\Biggl(\sum_{j_{k_2}=1}^{n}\bigl|T(e_{j_1},e_{j_2})\bigr|^{q_2}\Biggr)^{\frac{q_1}{q_2}}\Biggr)^{\frac{1}{q_1}}\le C\,\|T\|,
\]
for all bilinear forms $T\colon \ell_{p_1}^n\times\ell_{p_2}^n\to\mathbb{K}$ and all positive integers $n$.
(ii) The exponents $q_1,q_2$ satisfy $(q_1,q_2)\in[\lambda,\infty)\times[p_{k_2}^*,\infty)$ and
\[
\frac{1}{q_1}+\frac{1}{q_2}\le\frac{3}{2}-\Bigl(\frac{1}{p_1}+\frac{1}{p_2}\Bigr).
\]

The optimal constants satisfying the Hardy–Littlewood inequalities are, in general, unknown. In this section we try to summarize what is known so far about the constants and the challenges ahead. The first important fact to note is that the optimal constants depend on the scalar field. It seems that the case $p_1 = \cdots = p_m = \infty$ with $\mathbb{K} = \mathbb{R}$ is the least difficult to deal with. For instance, for real scalars and $(m,p_1,p_2) = (2,\infty,\infty)$, the inequality (3) with $C_{p_1,p_2} = \sqrt{2}$ is a somewhat straightforward consequence of the Khinchin inequality. In this case, observing that the bilinear form $T\colon \ell_\infty^2\times\ell_\infty^2\to\mathbb{R}$ defined by
\[
T(x,y) = x_1y_1+x_1y_2+x_2y_1-x_2y_2
\]
has norm 2, we conclude that the constant $\sqrt{2}$ is sharp (see [30]). A similar argument shows that (for real scalars) $\sqrt{2}$ is also the optimal constant for (9) with $(m,p_1,p_2) = (2,\infty,\infty)$, regardless of the $q_1,q_2\in[1,2]$. The case $m > 2$, with $p_1 = \cdots = p_m = \infty$ and real scalars, with exponents $(q_1,\dots,q_m)\in\{(1,2,\dots,2),\dots,(2,2,\dots,2,1)\}$ was solved in [41, 43] and the optimal constants for a more general family of exponents were obtained in [22]. The best known estimates for the case $m > 2$, with $\mathbf{p}_m = (\infty,\dots,\infty)$ and $\mathbf{q}_m = \bigl(\tfrac{2m}{m+1},\dots,\tfrac{2m}{m+1}\bigr)$, can be found in [15] (upper bounds) and [30] (lower bounds):
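\[
\begin{cases}
2^{1-\frac{1}{m}}\le C_{\mathbf{p}_m,\mathbf{q}_m}\le 2^{\frac{446381}{55440}-\frac{m}{2}}\displaystyle\prod_{j=14}^{m}\Biggl(\frac{\Gamma\bigl(\frac{3}{2}-\frac{1}{j}\bigr)}{\sqrt{\pi}}\Biggr)^{\frac{j}{2-2j}} & \text{for } m\ge 14 \text{ and real scalars},\\[2ex]
2^{1-\frac{1}{m}}\le C_{\mathbf{p}_m,\mathbf{q}_m}\le\displaystyle\prod_{j=2}^{m}2^{\frac{1}{2j-2}} & \text{for } 2\le m < 14 \text{ and real scalars},\\[2ex]
1\le C_{\mathbf{p}_m,\mathbf{q}_m}\le\displaystyle\prod_{j=2}^{m}\Gamma\Bigl(2-\frac{1}{j}\Bigr)^{\frac{j}{2-2j}} & \text{for } m\ge 2 \text{ and complex scalars}.
\end{cases}
\]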
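The sharpness example for $(m,p_1,p_2) = (2,\infty,\infty)$ mentioned above can be checked numerically. The following is a minimal sketch under two assumptions that are not spelled out in this part of the text: that inequality (3) compares the $\ell_{4/3}$-norm of the coefficients with $\|T\|$ (Littlewood's 4/3 inequality), and that the norm of a real bilinear form on $\ell_\infty^n\times\ell_\infty^n$ is attained at sign vectors. The helper name `form_norm_linf` is hypothetical.

```python
from itertools import product


def form_norm_linf(coeffs):
    """Norm of the real bilinear form sum a[i][j] x_i y_j on l_inf^n x l_inf^n.

    Assumption: the supremum is attained at sign vectors (the extreme
    points of the unit ball), so a finite search over {-1, 1}^n suffices.
    """
    n = len(coeffs)
    return max(
        abs(sum(coeffs[i][j] * x[i] * y[j] for i in range(n) for j in range(n)))
        for x in product((-1, 1), repeat=n)
        for y in product((-1, 1), repeat=n)
    )


# Coefficients of T(x, y) = x1*y1 + x1*y2 + x2*y1 - x2*y2.
coeffs = [[1, 1], [1, -1]]

norm_T = form_norm_linf(coeffs)
# l_{4/3}-norm of the coefficients (assumed left-hand side of (3)).
l43_norm = sum(abs(c) ** (4 / 3) for row in coeffs for c in row) ** (3 / 4)
ratio = l43_norm / norm_T

if __name__ == "__main__":
    print(norm_T, l43_norm, ratio)
```

Under these assumptions the ratio between the coefficient norm and $\|T\|$ equals $\sqrt{2}$ up to rounding, so no constant smaller than $\sqrt{2}$ can work for this form.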

The precise asymptotic growth of the upper bounds provided above is also calculated in [15]:
\begin{align*}
C_{\mathbf{p}_m,\mathbf{q}_m}&\le\kappa\,m^{\frac{1-\gamma}{2}}\le\kappa\,m^{0.212} &&\text{for complex scalars},\\
C_{\mathbf{p}_m,\mathbf{q}_m}&\le\kappa\,m^{\frac{2-\log 2-\gamma}{2}}\le\kappa\,m^{0.365} &&\text{for real scalars},
\end{align*}
where $\kappa$ is a positive constant and $\gamma$ is the Euler–Mascheroni constant. It is conjectured in [43] (universality conjecture) that the sharp constants $C_{\mathbf{p}_m,\mathbf{q}_m}$ for real scalars, $\mathbf{p}_m = (\infty,\dots,\infty)$ and $\mathbf{q}_m = \bigl(\tfrac{2m}{m+1},\dots,\tfrac{2m}{m+1}\bigr)$ are $2^{1-\frac{1}{m}}$. Having good estimates for these constants plays a crucial role in applications, as can be seen in the papers [12, 35]. In [6] it is shown that, replacing the sum
\[
\Biggl(\sum_{i_1,\dots,i_m=1}^{n}\bigl|T(e_{i_1},\dots,e_{i_m})\bigr|^{\frac{2m}{m+1}}\Biggr)^{\frac{m+1}{2m}}
\]
by
\[
\Biggl(\sum_{i_1,\dots,i_k=1}^{n}\bigl|T\bigl(e_{i_1}^{n_1},\dots,e_{i_k}^{n_k}\bigr)\bigr|^{\frac{2m}{m+1}}\Biggr)^{\frac{m+1}{2m}}
\]
in (9), with $\mathbf{p}_m = (\infty,\dots,\infty)$, where $e_i^{n} := (e_i,\overset{n\ \text{times}}{\dots},e_i)$, the new constant is contractive (i.e., tends to 1 as $m\to\infty$) when
\[
\lim_{m\to\infty}\frac{k\log k}{m} = 0.
\]
This happens, for instance, if
\[
k = \Biggl\lfloor\frac{m}{(\log m)^{1+\frac{1}{\log\log\log m}}}\Biggr\rfloor\quad\text{or}\quad k = \Bigl\lfloor m^{1-\frac{1}{\log\log m}}\Bigr\rfloor.
\]

It is important to mention that the case $m > 2$, with $\mathbf{p}_m = (\infty,\dots,\infty)$ for real scalars, was formally solved in [23] by means of an optimization technique that provides an algorithm furnishing all extreme points of the closed unit ball of the space of $m$-linear forms defined on $\ell_\infty^n\times\cdots\times\ell_\infty^n$. Although the result of [23] formally gives the optimal constants by means of a finite number of elementary operations, the effective calculation of these constants can only be done under strong computational assistance. An implementation of the algorithm can be found in [46]; the results obtained in [46] seem to bring evidence reinforcing the universality conjecture. An effective calculation of the constants provided by the algorithm of [23] still depends on better computational machinery.
When at least one $p_j$ is finite, most of the constants of the Hardy–Littlewood inequalities are unknown. The most relevant exceptions are the constants of Theorems 5.1 and 6.2; in these cases the proof shows that the optimal constants are 1. The best known estimates for the constants of Theorem 3.2 can be found in [10] and the best known constants for Theorem 4.1 can be found in [9].
The Hardy–Littlewood inequalities for $m$-linear forms have a natural analogue for homogeneous polynomials, and the constants for the polynomial case have also been a subject of investigation in the last decade, especially for $\ell_\infty$ spaces (in this case the Hardy–Littlewood inequalities are called Bohnenblust–Hille inequalities), with important applications in Complex Analysis (see [25, 26]). In [25] it was proved that (for $m$-homogeneous polynomials in $\ell_\infty$ and complex scalars) the constants $D_m$ of the Bohnenblust–Hille inequality are dominated by $C^m$ for a certain constant $C$, while the estimates of the original result of Bohnenblust and Hille just provided
\[
D_m\le(\sqrt{2})^{m-1}\,\frac{m^{m/2}\,(m+1)^{(m+1)/2}}{2^{m}\,(m!)^{(m+1)/(2m)}}. \tag{18}
\]
The estimate presented in [25] has important applications in Analytic Number Theory and Complex Analysis. In [15] it was shown that the constant $D_m$ has sub-exponential growth and, as a straightforward consequence, the authors succeeded in obtaining the exact asymptotic behavior of the $n$-dimensional Bohr radius $K_n$:
\[
\lim_{n\to\infty}\frac{K_n}{\sqrt{(\log n)/n}} = 1.
\]
The constants of the polynomial Hardy–Littlewood inequalities for real scalars were investigated in [20]. In particular, in [20] it was shown that for the case of real scalars the constants are no longer sub-exponential, and behave differently than in the complex case; applications can be found in [27].
Recently, a very interesting approach, related to the notion of fractional dimensions, was introduced by F. Bayart [14], investigating Hardy–Littlewood inequalities when the sums from $i_1,\dots,i_n = 1$ to $n$ (i.e., sums in $\mathbb{N}^n$) are replaced by sums in arbitrary indexes, i.e., $(i_1,\dots,i_n)\in\Lambda$ for an arbitrary set $\Lambda\subset\mathbb{N}^n$ (see also [8]).

9 The Gale-Berlekamp Switching Game

Designed independently by Elwyn Berlekamp and David Gale in the 1960’s, the
Gale–Berlekamp switching game consists of an n × n square matrix of light bulbs
set-up at an initial light configuration. The goal is to turn off as many lights as
possible using n row and n column switches, which invert the state of each bulb in
the corresponding row or column.
For an initial pattern of lights $\Theta$, let $i(\Theta)$ denote the smallest final number of on-lights achievable by row and column switches starting from $\Theta$. We define
\[
R_n := \max\{i(\Theta) : \Theta\text{ is an } n\times n \text{ light pattern}\},
\]
which represents the smallest possible number of remaining on-lights, starting from “the worst” initial pattern. Sometimes the problem is posed as finding the maximum of the difference between the number of lights that are on and the number that are off, often denoted by $G_n$. Since
\[
R_n = \frac{1}{2}\bigl(n^2-G_n\bigr),
\]
both formulations are equivalent.
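For very small boards both quantities can be computed by exhaustive search. The following is a minimal brute-force sketch; the function names are hypothetical, on-lights are encoded as $+1$ and off-lights as $-1$, and $G_n$ is taken to be the minimum over all patterns of the best achievable value of $\sum a_{ij}x_iy_j$ over sign vectors $x, y$. The search visits all $2^{n^2}$ patterns, so it is only practical for $n\le 4$ or so.

```python
from itertools import product


def best_switching_value(a):
    """Max over x, y in {-1, 1}^n of sum_{i,j} a[i][j] * x[i] * y[j].

    For fixed row switches x, the best column switch y_j is the sign of the
    j-th column sum, so the inner maximum is a sum of absolute values.
    """
    n = len(a)
    return max(
        sum(abs(sum(a[i][j] * x[i] for i in range(n))) for j in range(n))
        for x in product((-1, 1), repeat=n)
    )


def gale_berlekamp(n):
    """Return (G_n, R_n) by brute force over all n x n sign patterns."""
    g = min(
        best_switching_value([entries[i * n:(i + 1) * n] for i in range(n)])
        for entries in product((-1, 1), repeat=n * n)
    )
    # R_n = (n^2 - G_n) / 2; n^2 and G_n always have the same parity.
    return g, (n * n - g) // 2


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, gale_berlekamp(n))
```

The design choice of optimizing the column switches in closed form reduces the inner search from $4^n$ to $2^n$ sign vectors per pattern.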
Originally, the problem introduced by Berlekamp asks for the exact value of $R_{10}$. In 2004, Carlson and Stolarski proved that $R_{10} = 35$ (see [21]). Up to now, the exact value of $R_n$ is known only for small values of $n$, due to the involved combinatorial arguments. The exact values of $R_n$ for $n$ up to 12 were obtained by Carlson and Stolarski [21] (see also [19]).
In this section we show how the Hardy–Littlewood cycle of ideas is associated to the Gale–Berlekamp switching game and how it is useful to provide asymptotic bounds for $G_n$. We initially notice that, by associating $+1$ to the on-lights and $-1$ to the off-lights from the array of lights $(a_{ij})_{i,j=1}^{n}$, we have
\[
G_n = \min\Biggl\{\max_{x_i,y_j\in\{-1,1\}}\Biggl|\sum_{i,j=1}^{n}a_{ij}x_iy_j\Biggr| : a_{ij} = -1 \text{ or } +1\Biggr\}.
\]
Now, we shall observe that, for each fixed array $(a_{ij})$, the inner maximum above is precisely the norm of the bilinear form $A\colon \ell_\infty^n\times\ell_\infty^n\to\mathbb{R}$ defined by
\[
A(x,y) = \sum_{i,j=1}^{n}a_{ij}x_iy_j. \tag{19}
\]
In fact, it is obvious that this maximum is at most $\|A\|$. In order to prove the converse inequality, we first recall that if $E$ is a vector space and $A\subset E$, a vector $x\in A$ is an extreme point of $A$ if
\[
y,z\in A \text{ with } x = \frac{1}{2}(y+z) \ \Rightarrow\ x = y = z.
\]
It is well known that the extreme points of the closed unit ball of $\ell_\infty^n$ are precisely the vectors $(x_j)_{j=1}^{n}$ such that $|x_j| = 1$ for all $j = 1,\dots,n$. Finally, we recall that for all finite-dimensional Banach spaces $E$ we have
\[
\|T\| = \max\{|T(x,y)| : x,y\in\operatorname{ext}(B_E)\}
\]
for all bilinear forms $T\colon E\times E\to\mathbb{R}$, where $\operatorname{ext}(B_E)$ represents the extreme points of the closed unit ball $B_E$ of $E$. In fact, since $B_E$ is compact, there exist $x,y\in B_E$ such that $|T(x,y)| = \|T\|$. By the Minkowski Theorem, we know that $B_E$ is the convex hull of $\operatorname{ext}(B_E)$. Hence, there exist $\lambda_i^{(1)},\lambda_j^{(2)}\in[0,1]$ with $\sum_{i=1}^{k_1}\lambda_i^{(1)} = \sum_{j=1}^{k_2}\lambda_j^{(2)} = 1$ and $z_i^{(1)},z_j^{(2)}\in\operatorname{ext}(B_E)$ such that
\[
x = \sum_{i=1}^{k_1}\lambda_i^{(1)}z_i^{(1)}\quad\text{and}\quad y = \sum_{j=1}^{k_2}\lambda_j^{(2)}z_j^{(2)}.
\]
Therefore
\begin{align*}
\|T\| = |T(x,y)| &= \Biggl|T\Biggl(\sum_{i=1}^{k_1}\lambda_i^{(1)}z_i^{(1)},\sum_{j=1}^{k_2}\lambda_j^{(2)}z_j^{(2)}\Biggr)\Biggr|\\
&\le\sum_{i=1}^{k_1}\sum_{j=1}^{k_2}\lambda_i^{(1)}\lambda_j^{(2)}\bigl|T\bigl(z_i^{(1)},z_j^{(2)}\bigr)\bigr|
\end{align*}
and we conclude that there exist $z_{i_0}^{(1)},z_{j_0}^{(2)}\in\operatorname{ext}(B_E)$ such that
\[
\bigl|T\bigl(z_{i_0}^{(1)},z_{j_0}^{(2)}\bigr)\bigr| = \|T\|.
\]
The above results yield that $G_n = \min\|A\|$, where the minimum is taken over all forms $A$ as in (19) with $a_{ij}\in\{-1,1\}$. Now, as a straightforward consequence of the Hardy–Littlewood inequalities for $m = 2$ and $p_1 = p_2 = \infty$, we have
\[
G_n\ge 2^{-1/2}\,n^{3/2}.
\]

On the other hand, the Kahane–Salem–Zygmund inequality (see [16]) tells us that $G_n\le 2\sqrt{5\log 9}\,n^{3/2}$ and we conclude that
\[
\frac{1}{\sqrt{2}}\le\frac{G_n}{n^{3/2}}\le 2\sqrt{5\log 9}.
\]

Asymptotically, probabilistic techniques and an approximation scheme using Hadamard matrices provide better lower and upper bounds, respectively (see [7, 42]):
\[
0.797+o(1)\le\frac{G_n}{n^{3/2}}\le 1+o(1).
\]

Acknowledgments D. Pellegrino is supported by CNPq, Grant 207327/2017-5

References

1. R.A. Adams, J.J.F. Fournier, Sobolev Spaces. Pure and Applied Mathematics Series, 2nd edn.
(Elsevier, Amsterdam, 2003)
2. N. Albuquerque, L. Rezende, Anisotropic regularity principle in sequence spaces and applications. Commun. Contemp. Math. 20(7), 1750087, 14 (2018)
3. N. Albuquerque, L. Rezende, Asymptotic estimates for unimodular multilinear forms with
small norms on sequence spaces. Bull. Braz. Math. Soc. 52(1), 23–39 (2021)
4. N. Albuquerque, F. Bayart, D. Pellegrino, J.B. Seoane-Sepúlveda, Sharp generalizations of the
multilinear Bohnenblust-Hille inequality. J. Funct. Anal. 266(6), 3726–3740 (2014)
5. N. Albuquerque, F. Bayart, D. Pellegrino, J.B. Seoane-Sepúlveda, Optimal Hardy-Littlewood
type inequalities for polynomials and multilinear operators. Isr. J. Math. 211(1), 197–220
(2016)
6. N. Albuquerque, G. Araújo, W. Cavalcante, T. Nogueira, D. Núñez, D. Pellegrino, P. Rueda, On summability of multilinear operators and applications. Ann. Funct. Anal. 9(4), 574–590 (2018)
7. N. Alon, J.H. Spencer, The Probabilistic Method. Wiley Series in Discrete Mathematics and
Optimization, 4th edn. (Wiley, Hoboken, 2016). xiv+375 pp.
8. F.C. Alves, D. Serrano-Rodríguez, Complex Bohnenblust-Hille inequality whose monomials
have indices in an arbitrary set. São Paulo J. Math. Sci. 14(1), 242–248 (2020)
9. G. Araújo, K. Câmara, Universal bounds for the Hardy-Littlewood inequalities on multilinear
forms. Results Math. 73(3), paper 124, 10 (2018)
10. G. Araújo, D. Pellegrino, On the constants of the Bohnenblust-Hille and Hardy-Littlewood
inequalities. Bull. Braz. Math. Soc. 48(1), 141–169 (2017)
11. R.M. Aron, D. Núñez-Alarcón, D. Pellegrino, D. Serrano-Rodríguez, Optimal exponents for
Hardy-Littlewood inequalities for m-linear operators. Linear Algebra Appl. 531, 399–422
(2017)
12. S. Arunachalam, S. Chakraborty, M. Koucký, N. Saurabh, R.M. de Wolf, Improved bounds on Fourier entropy and min-entropy, in Proceedings of the 37th International Symposium on Theoretical Aspects of Computer Science, STACS (2020)
13. F. Bayart, Multiple summing maps: coordinatewise summability, inclusion theorems and p-
Sidon sets. J. Funct. Anal. 274(4), 1129–1154 (2018)
14. F. Bayart, Summability of the coefficients of a multilinear form. J. Eur. Math. Soc. 24(4), 1161–1188 (2022)
15. F. Bayart, D. Pellegrino, J.B. Seoane-Sepúlveda, The Bohr radius of the n-dimensional polydisk is equivalent to $\sqrt{(\log n)/n}$. Adv. Math. 264, 726–746 (2014)
16. H.P. Boas, Majorant series. Several complex variables (Seoul, 1998). J. Korean Math. Soc.
37(2), 321–337 (2000)
17. H.F. Bohnenblust, E. Hille, On the absolute convergence of Dirichlet series. Ann. Math. 32,
600–622 (1931)
18. F. Bombal, D. Pérez-García, I. Villanueva, Multilinear extensions of Grothendieck’s theorem.
Quart. J. Math. 55(4), 441–450 (2004)
19. T.A. Brown, J.H. Spencer, Minimization of ±1 matrices under line shifts. Colloq. Math. 23,
165–171 (1971)
20. J. Campos, P. Jimenez-Rodríguez, G. Muñoz-Fernández, D. Pellegrino, J. Seoane-Sepúlveda,
On the real polynomial Bohnenblust-Hille inequality. Linear Algebra Appl. 465, 391–400
(2015)
21. J. Carlson, D. Stolarski, The correct solution to Berlekamp’s switching game. Discrete Math.
287, 145–150 (2004)
22. N. Caro, D. Núñez-Alarcón, D. Serrano-Rodríguez, On the generalized Bohnenblust-Hille inequality for real scalars. Positivity 21(4), 1439–1455 (2017)
23. W. Cavalcante, D. Pellegrino, E. Teixeira, Geometry of multilinear forms. Commun. Contemp.
Math. 22(2), 1950011, 26 (2020)
24. A. Defant, K. Floret, Tensor Norms and Operator Ideals. Mathematics Studies, vol. 176
(North-Holland, Amsterdam, 1993)
25. A. Defant, L. Frerick, J. Ortega-Cerdà, M. Ounaïes, K. Seip, The Bohnenblust-Hille inequality
for homogeneous polynomials is hypercontractive. Ann. of Math. 174(1), 485–497 (2011)
26. A. Defant, D. García, M. Maestre, P. Sevilla-Peris, Dirichlet Series and Holomorphic Functions in High Dimensions. New Mathematical Monographs (Cambridge University Press, Cambridge, 2019)
27. A. Defant, M. Mastylo, A. Pérez, On the Fourier spectrum of functions on Boolean cubes.
Math. Ann. 374(1–2), 653–680 (2019)
28. J. Diestel, H. Jarchow, A. Tonge, Absolutely Summing Operators (Cambridge University Press,
Cambridge, 1995)
29. V. Dimant, P. Sevilla-Peris, Summation of coefficients of polynomials on $\ell_p$ spaces. Publ. Mat. 60, 289–310 (2016)
30. D. Diniz, G. Muñoz-Fernández, D. Pellegrino, J.B. Seoane-Sepúlveda, Lower bounds for the
constants in the Bohnenblust-Hille inequality: the case of real scalars. Proc. Am. Math. Soc.
142(2), 575–580 (2014)
31. D.J.H. Garling, Inequalities: A Journey into Linear Analysis (Cambridge University Press, Cambridge, 2007)
32. G.H. Hardy, J.E. Littlewood, Bilinear forms bounded in space [p, q]. Quart. J. Math. 5, 241–
254 (1934)
33. J.E. Littlewood, On bounded bilinear forms in an infinite number of variables. Q. J. Math.
Oxford 1, 164–174 (1930)
34. M.C. Matos, Fully absolutely summing and Hilbert-Schmidt multilinear mappings. Collect.
Math. 54(2), 111–136 (2003)
35. A. Montanaro, Some applications of hypercontractive inequalities in quantum information
theory. J. Math. Phys. 53(12), 122206, 15 (2012)
36. D. Núñez-Alarcón, D. Paulino, D. Pellegrino, Super-critical Hardy-Littlewood inequalities for
multilinear forms, to appear in Anais Acad. Brasil Cienc.
37. D. Núñez-Alarcón, D. Pellegrino, D. Serrano-Rodríguez, Sharp anisotropic Hardy-Littlewood inequality for positive multilinear forms. Results Math. 74(4), Paper No. 193, 10 (2019)
38. D. Núñez-Alarcón, D. Pellegrino, D. Serrano-Rodríguez, The Orlicz inequality for multilinear forms. J. Math. Anal. Appl. 505, 125520 (2022)
39. B. Ozikiewicz, A. Tonge, An interpolation approach to Hardy–Littlewood inequalities for
norms of operators on sequence spaces. Linear Algebra Appl. 331, 1–9 (2001)
40. D. Paulino, Critical Hardy-Littlewood inequality for multilinear forms. Rend. Circ. Mat.
Palermo, II. Ser 69, 369–380 (2020)
41. D. Pellegrino, The optimal constants of the mixed $(\ell_1,\ell_2)$-Littlewood inequality. J. Number Theory 160, 11–18 (2016)
42. D. Pellegrino, A. Raposo Jr, Constants of the Kahane–Salem–Zygmund inequality asymptotically bounded by 1. J. Funct. Anal. 282, 109293, 21 (2022)
43. D. Pellegrino, E. Teixeira, Towards sharp Bohnenblust-Hille constants. Commun. Contemp.
Math. 20(3), 1750029, 33 (2018)
44. D. Pellegrino, J. Santos, D. Serrano-Rodríguez, E. Teixeira, A regularity principle in sequence
spaces and applications. Bull. Sci. Math. 141, 802–837 (2017)
45. T. Praciano-Pereira, On bounded multilinear forms on a class of $\ell_p$ spaces. J. Math. Anal. Appl. 81, 561–568 (1981)
46. F. Vieira Costa Júnior, The optimal multilinear Bohnenblust-Hille constants: a computational
solution for the real case. Numer. Funct. Anal. Optim. 39(15), 1656–1668 (2018)
Symmetries of C ∗ -algebras and Jordan
Morphisms

Jan Hamhalter and Ekaterina Turilova

Abstract There are many faces of $C^*$-algebras whose symmetries encode important aspects of their structures. We show that in surprisingly different situations these symmetries are implemented by Jordan *-isomorphisms and lead to full Jordan invariants. In this respect we study the following structures: 1. One-dimensional projections in a Hilbert space with transition probability and orthogonality relation (Wigner type theorems). 2. Projection lattices of von Neumann algebras and $AW^*$-algebras (Dye type theorems). 3. Abelian $C^*$-subalgebras with set theoretic inclusion (Bohrification program in quantum theory). 4. Measures on state spaces endowed with the Choquet order.

Keywords C∗ -algebras · Jordan ∗ -morphisms

1 Introduction

The aim of this chapter is to review selected results illustrating interrelations between symmetries of various structures attached to $C^*$-algebras and von Neumann algebras and Jordan *-isomorphisms. The full $C^*$-structure is captured by *-isomorphisms. If we have two $C^*$-algebras $A$ and $B$, then a *-isomorphism $\psi\colon A\to B$ is a linear bijective map satisfying the following conditions for all $a,b\in A$:

$\psi(ab) = \psi(a)\psi(b)$

$\psi(a^*) = \psi(a)^*$.

J. Hamhalter ()
Department of Mathematics, Faculty of Electrical Engineering, Czech Technical University in
Prague, Prague, Czech Republic
E. Turilova
Institute of Mathematics and Mechanics, Kazan Federal University, Kazan, Russia

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 673
R. M. Aron et al. (eds.), Operator and Norm Inequalities and Related Topics,
Trends in Mathematics, https://doi.org/10.1007/978-3-031-02104-6_20
674 J. Hamhalter and E. Turilova

It turns out that many features of operator algebras relevant to their inner structure and applications in physics are preserved by Jordan *-isomorphisms rather than *-isomorphisms. A Jordan *-isomorphism is a bijective linear map $\varphi\colon A\to B$ between $C^*$-algebras satisfying the following conditions for all $a\in A$:

$\varphi(a^2) = \varphi(a)^2$

$\varphi(a^*) = \varphi(a)^*$.
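A standard example separating the two notions is the transpose map on a matrix algebra, which preserves squares and adjoints but reverses products. The following is a minimal sketch on $2\times 2$ complex matrices; the matrices `a`, `b` and the helper functions are illustrative choices, and the script only tests the identities on these particular matrices rather than proving them.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def transpose(a):
    """Transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]


def adjoint(a):
    """Conjugate transpose of a matrix given as a list of rows."""
    return [[entry.conjugate() for entry in row] for row in transpose(a)]


# Candidate Jordan *-isomorphism of the 2x2 complex matrices: the transpose.
phi = transpose

a = [[1, 2j], [3, 4]]
b = [[0, 1], [1j, 2]]

jordan_square = phi(matmul(a, a)) == matmul(phi(a), phi(a))       # phi(a^2) = phi(a)^2
star_preserving = phi(adjoint(a)) == adjoint(phi(a))              # phi(a*) = phi(a)*
anti_multiplicative = phi(matmul(a, b)) == matmul(phi(b), phi(a))  # phi(ab) = phi(b)phi(a)
multiplicative = phi(matmul(a, b)) == matmul(phi(a), phi(b))       # phi(ab) = phi(a)phi(b)?

if __name__ == "__main__":
    print(jordan_square, star_preserving, anti_multiplicative, multiplicative)
```

On these matrices the first three checks succeed and the last one fails, which illustrates that the transpose satisfies the Jordan conditions without being multiplicative.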

As we can see, Jordan morphisms are well behaved only with respect to individual
elements, and they include *-morphisms as a special case. They are ubiquitous in
the preserver theory of operator algebras. Even the very Banach space structure
of a C*-algebra is in fact a complete Jordan invariant. Indeed, the famous Kadison
isometry theorem and its ramifications say that if two C*-algebras are isometric
as Banach spaces, then there is a Jordan *-isomorphism between them. Moreover,
unital surjective linear isometries between C*-algebras are precisely Jordan
*-isomorphisms. We would like to review some aspects of this surprising universality
of Jordan morphisms by surveying some recent results that the authors have witnessed
and had the pleasure to deal with during the last decade. We start at an elementary
level in the first section and then move to more advanced aspects of symmetries in
operator algebras. The second section is devoted to the famous Wigner theorem [18].
Based on our results in [1, 16], we provide an elementary proof of the description of
symmetries on Hilbert spaces that preserve the transition probability between one
dimensional projections (rays). These symmetries are given by Jordan *-isomorphisms
that are in this case implemented by a unitary or an antiunitary map. The core of
our approach consists in elementary matrix algebra arguments proving Kadison's result
to the effect that a Jordan *-isomorphism on a matrix algebra is either a
*-isomorphism or a *-antiisomorphism. We also show that the quantum logic version of
the Wigner theorem due to Uhlhorn can be quickly derived from the Gleason theorem.
In the third section we analyze the Dye theorem, which generalizes the Wigner theorem
to the much more general context of von Neumann algebras. In its Jordan formulation
the Dye theorem says that any bijection between projection lattices of (almost all)
von Neumann algebras preserving orthogonality in both directions extends to a Jordan
*-isomorphism between the algebras themselves. In other words, the projection lattice
is a full Jordan invariant. We show that this deep result can be quickly derived from
the Generalized Gleason theorem due to Bunce and Wright. In the second part of Sect. 3
we outline the proof of Dye's theorem for AW*-algebras [7]. These algebras are
algebraic generalizations of von Neumann algebras due to Kaplansky. They are more
natural from the point of view of quantum theory. They also seem to be useful from a
purely mathematical point of view. For example, there is a one-to-one correspondence
between complete Boolean algebras and projection lattices of abelian AW*-algebras.
The technique for proving the Dye theorem for AW*-algebras must be different: it
combines insights of von Neumann, Dye, and Heunen and Reyes [12] on one side with the
Gleason theorem for finite Type I AW*-algebras due to J. Hamhalter [7] on the other.
In Sect. 4 we consider yet another order structure associated with C*-algebras. It is
Symmetries of C ∗ -algebras and Jordan Morphisms 675

the poset of abelian subalgebras ordered by set-theoretic inclusion. This structure
has drawn the attention of many mathematical physicists in connection with the
Bohrification program in quantum theory [14]. In this approach a quantum system can
only be seen through its classical subsystems. In the operator algebraic model of
quantum theory a quantum system is given by a C*-algebra A and its classical
subsystems are given by abelian C*-subalgebras of A. Therefore the poset C(A) of
abelian C*-subalgebras of A embodies Bohr's doctrine. We show that C(A) is a complete
Jordan invariant for many algebras and synthesize various results along this line.
Finally, in the concluding Sect. 5 we review recent results revealing a new complete
Jordan invariant: the Choquet order structure on orthogonal measures on state spaces
of C*-algebras. We show the intimate connection of this structure with C(A) and
describe preservers of the Choquet order in terms of Jordan *-isomorphisms.
We shall now recall basic notions and introduce the notation. For all details on
operator algebras and their applications to physics the reader is advised to consult
the monographs [2, 13, 14, 17]. Let $A$ be a C*-algebra. If it has a unit, it is
called unital. We shall denote by $A_s$ and $A_+$ the sets of all self-adjoint and
positive elements, respectively. By the symbol $Z(A)$ we shall represent the center
of $A$. It is the set of all elements commuting with every element of $A$. By the
symbol $P(A)$ we shall denote the poset of projections in $A$, that is, the set of
self-adjoint idempotents ordered by the relation: $p \le q$ if $p = pq = qp$. If the
algebra $A$ is unital with unit 1, then the orthocomplement of a projection $p$ is the
projection $1 - p$. Two projections $p, q$ are called orthogonal, in symbols
$p \perp q$, if $pq = 0$. A bijection between projection posets is called an
orthoisomorphism if it preserves orthogonality in both directions. The central cover
$c(p)$ of a projection $p$ is the smallest projection that lies in the center and
majorizes $p$. A projection is called faithful if its central cover is 1. Finally, a
projection $p$ is called abelian if $pAp$ is an abelian C*-algebra. Two projections
$p$ and $q$ are said to be equivalent if $p = v^*v$ and $q = vv^*$ for some element
$v \in A$. In a unital C*-algebra, a unitary element is an element $u$ whose inverse
is $u^*$. By the symbol $B(H)$ we shall denote the C*-algebra of all bounded operators
acting on a Hilbert space $H$. If $S \subset B(H)$, then $S'$ will represent the
commutant of $S$, that is, the set of all operators commuting with all elements of
$S$. A vector $\xi \in H$ is called separating for $S \subset B(H)$ if $a\xi = 0$
implies $a = 0$ for all $a \in S$. Moreover, $\xi$ is called biseparating for $S$ if
it is separating both for $S$ and $S'$. A von Neumann algebra is a C*-algebra that has
a predual. An AW*-algebra $A$ is a C*-algebra that is a Baer*-ring, which means that
for any nonempty set $S \subset A$ there is a unique projection $p \in A$ such that
$S^{0} = \{x \in A : sx = 0 \text{ for all } s \in S\} = pA$.
Let us have two C ∗ -algebras A and B. A linear map ϕ : A → B that preserves
the product and *-operation is called *-homomorphism. If it is a bijection we
call it *-isomorphism. A linear map ϕ : A → B that reverses the product (that
is ϕ(ab) = ϕ(b)ϕ(a) for all a, b ∈ A) and preserves *-operation is called *-
antihomomorphism. If it is a bijection we call it *-antiisomorphism. A linear map
ϕ : A → B that preserves the squares ($\varphi(a^2) = \varphi(a)^2$ for all a ∈ A) and
*-operation is called a Jordan *-homomorphism. If it is a bijection we call it a
Jordan *-isomorphism. We shall now collect a few folklore results about Jordan
*-morphisms that will be used frequently in this work.
Proposition 1.1 Let us have C ∗ -algebras A and B. Let ϕ : A → B be a Jordan
*-homomorphism. Then the following holds:
(i) ϕ preserves the triple products:

ϕ(abc + cba) = ϕ(a)ϕ(b)ϕ(c) + ϕ(c)ϕ(b)ϕ(a) for all a, b, c ∈ A .

(ii) ϕ preserves commutativity.

(iii) If ϕ is a *-isomorphism between unital algebras then it preserves the unit.
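As a small numerical illustration of Proposition 1.1(i), the following sketch takes the transpose map on M2(C), which is a *-antiisomorphism and hence a Jordan *-isomorphism, and checks that it preserves squares, adjoints and the symmetrized triple product although it is not multiplicative. The helper names (`mul`, `add`, `adjoint`, `transpose`, `close`) and the test matrices are assumptions made for this example only.

```python
# Pure-Python 2x2 complex matrix helpers (hypothetical names for this sketch).
def mul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def add(x, y):
    return [[x[i][j] + y[i][j] for j in range(2)] for i in range(2)]

def adjoint(x):
    return [[complex(x[j][i]).conjugate() for j in range(2)] for i in range(2)]

def transpose(x):
    return [[x[j][i] for j in range(2)] for i in range(2)]

def close(x, y, tol=1e-12):
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(2) for j in range(2))

phi = transpose  # reverses products, so it is a Jordan *-isomorphism of M_2(C)

# Arbitrary non-commuting test matrices.
a = [[1 + 2j, 3], [0.5j, -1]]
b = [[0, 1j], [2, 1 - 1j]]
c = [[2, -1], [1j, 4]]

# phi(a^2) = phi(a)^2 and phi(a*) = phi(a)*
squares_ok = close(phi(mul(a, a)), mul(phi(a), phi(a)))
star_ok = close(phi(adjoint(a)), adjoint(phi(a)))

# phi(abc + cba) = phi(a)phi(b)phi(c) + phi(c)phi(b)phi(a)
triple_ok = close(
    phi(add(mul(mul(a, b), c), mul(mul(c, b), a))),
    add(mul(mul(phi(a), phi(b)), phi(c)), mul(mul(phi(c), phi(b)), phi(a))),
)

# phi(ab) = phi(a)phi(b) fails here because a and b do not commute.
multiplicative = close(phi(mul(a, b)), mul(phi(a), phi(b)))
```

The check is only a sanity test on three sample matrices, not a proof; it shows how a map can respect the Jordan structure while reversing ordinary products.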
Moreover, the following well known fact will facilitate further discussion.
Proposition 1.2 Let A be a C ∗ -algebra that is the closed linear span of its
projections. Let B be another C ∗ -algebra. Then a bounded linear map ψ : A → B
is a Jordan *-homomorphism if and only if it preserves projections.
By a positive functional $\varphi$ on a C*-algebra $A$ we mean a linear form that takes
nonnegative values on $A_+$. It is always a continuous map. It is called faithful if
$\varphi(a^*a) = 0$ implies $a = 0$. We write $\varphi \le \psi$ for positive
functionals $\varphi$ and $\psi$ if $\psi - \varphi$ is positive. A state is a norm one
positive functional. By the symbol $S(A)$ we shall denote the set of all states of $A$.
Extreme points of this set are called pure states. Two positive functionals $\psi$ and
$\varphi$ are called orthogonal if $\|\varphi - \psi\| = \|\varphi\| + \|\psi\|$. For
each state $\varphi$ there is a Hilbert space $H_\varphi$, a unit vector
$\xi_\varphi \in H_\varphi$ and a *-homomorphism $\pi_\varphi : A \to B(H_\varphi)$
such that the set $\pi_\varphi(A)\xi_\varphi$ is dense in $H_\varphi$ and
$\varphi(a) = \langle \pi_\varphi(a)\xi_\varphi, \xi_\varphi \rangle$ for all
$a \in A$. This is called the Gelfand-Naimark-Segal representation (GNS for short).
Having two posets (P , ≤) and (Q, ≤), we shall call a bijection ϕ : P → Q
an order isomorphism if it preserves order in both directions: a ≤ b if and only if
ϕ(a) ≤ ϕ(b) .
After reviewing basic facts we recall the Generalized Gleason theorem, which plays
an important role throughout the present treatment. Let A be a C*-algebra and
X a Banach space. By a finitely additive X-valued measure on P (A) we mean
a map μ : P (A) → X that satisfies μ(p + q) = μ(p) + μ(q) whenever p
and q are orthogonal projections. If X is C we simply speak of a measure on
P (A). Gleason type theorems form a series of deep results that originated with the
Gleason theorem in 1957 and culminated in the remarkable achievements of Bunce and
Wright in the early 1990s. For a self-contained proof and history we refer the
interested reader to [5] and references therein. Originally the following theorem was
proved by Gleason for completely additive probability measures.
Theorem 1.3 (Gleason) Let μ be a finitely additive bounded measure on
P (B(H )), where H is a Hilbert space of dimension not equal two. Then there
is a bounded linear functional f : B(H ) → C extending μ, i.e.

μ(P ) = f (P ) for all P ∈ P (B(H )) .


The previous theorem has nontrivial extension to nearly all von Neumann
algebras. Let us recall that von Neumann algebra has Type I2 direct summand if
it has a direct summand *-isomorphic to the algebra C(X, M2 (C)) of all continuous
functions on a hyperstonean space X with values in the algebra of two by two
matrices.
Theorem 1.4 (Gleason-Bunce-Wright) Let M be a von Neumann algebra not
having Type I2 direct summand and X be a Banach space. Then any bounded
finitely additive measure μ : P (M) → X extends to a bounded linear functional
T : M → X.
We shall need the following generalization of Jordan *-homomorphisms.
Definition 1.5 Let A and B be unital C*-algebras. A map ψ : A → B is called
a quasi Jordan *-homomorphism if the following conditions are satisfied for all
a, b ∈ A.
(i) ψ is homogeneous.
(ii) $\psi(a + b) = \psi(a) + \psi(b)$ whenever a and b commute.
(iii) $\psi(a^*) = \psi(a)^*$.
(iv) $\psi(a^2) = \psi(a)^2$.
When ψ is a bijection such that both $\psi$ and $\psi^{-1}$ are quasi Jordan
*-homomorphisms, we call ψ a quasi Jordan *-isomorphism.
It is a consequence of Theorem 1.4 that any quasi Jordan *-homomorphism
defined on a von Neumann algebra without Type $I_2$ direct summand is linear. This
does not hold for Type $I_2$ von Neumann algebras (see [5] for a more detailed
discussion).

2 Wigner Theorem

2.1 Probability Version

Let $H$ be a Hilbert space. By $P(H)$ we shall represent the set of all projections
acting on $H$. Let $P_1(H)$ mean the set of all rank one projections in $P(H)$. Each
projection in $P_1(H)$ is of the form $P_\xi$, where $P_\xi$ is the orthogonal
projection onto the linear span of a nonzero vector $\xi \in H$. By $F(H)$ we shall
denote the set of finite rank operators acting on $H$. By the trace we understand the
standard trace on $F(H)$, that is

\[ \operatorname{tr}(A) = \sum_{\alpha} \langle A e_\alpha, e_\alpha \rangle , \]

where $(e_\alpha)$ is an orthonormal basis of $H$. In the probability structure of
quantum mechanics the projection $P_\xi$, $\|\xi\| = 1$, corresponds to a state of the
system. The transition probability between two states $P_\xi$, $P_\mu$ is then given by

\[ T(P_\xi, P_\mu) = \operatorname{tr}(P_\xi P_\mu) = |\langle \xi, \mu \rangle|^2 . \]

The Wigner theorem in its original form is concerned with symmetries preserving
transition probability. It is not difficult to show that any map preserving transition
probability is a restriction of a Jordan *-homomorphism.
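The identity tr(PξPμ) = |⟨ξ, μ⟩|² for unit vectors can be checked numerically. The following is a minimal sketch in pure Python; the function names and the two sample vectors in C³ are assumptions chosen for illustration.

```python
import math

def inner(x, y):
    # <x, y>, linear in the first argument and conjugate-linear in the second
    return sum(xi * complex(yi).conjugate() for xi, yi in zip(x, y))

def normalize(x):
    n = math.sqrt(sum(abs(xi) ** 2 for xi in x))
    return [xi / n for xi in x]

def rank_one_projection(xi):
    # Matrix of P_xi for a unit vector xi: entries xi_i * conj(xi_j)
    return [[a * complex(b).conjugate() for b in xi] for a in xi]

def trace_of_product(p, q):
    n = len(p)
    return sum(p[i][k] * q[k][i] for i in range(n) for k in range(n))

xi = normalize([1, 1j, 0])
mu = normalize([1, 1, 1])

lhs = trace_of_product(rank_one_projection(xi), rank_one_projection(mu))  # tr(P_xi P_mu)
rhs = abs(inner(xi, mu)) ** 2                                             # |<xi, mu>|^2
```

For these two vectors both sides equal 1/3 up to rounding, since ⟨ξ, μ⟩ = (1 + i)/√6.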
Proposition 2.1 Let H be a separable Hilbert space. Let ϕ : P1 (H ) → P1 (H ) be
a map preserving transition probabilities, that is, ϕ satisfies

tr(P Q) = tr(ϕ(P )ϕ(Q)) for all P , Q ∈ P1 (H ) .

Then ϕ extends to a Jordan ∗-homomorphism ϕ̂ : F (H ) → F (H ).

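Proof We shall first extend $\varphi$ to a linear map on the real space $F_s(H)$ of
self-adjoint finite rank operators. Given $A$ in $F_s(H)$ we can suppose that

\[ A = \sum_{i=1}^{n} \lambda_i P_i , \qquad \lambda_i \in \mathbb{R},\ P_i \in P_1(H) . \tag{1} \]

Put

\[ \hat\varphi(A) = \sum_{i=1}^{n} \lambda_i \varphi(P_i) . \]

The key argument of the proof is to show that this definition is correct, that is,
that it does not depend on the expression (1). To this end let us take another
decomposition

\[ A = \sum_{j=1}^{m} \mu_j Q_j , \qquad \mu_j \in \mathbb{R},\ Q_j \in P_1(H) . \]

Taking arbitrary $R \in P_1(H)$, we can compute

\begin{align}
\operatorname{tr}\Big( \sum_{i=1}^{n} \lambda_i \varphi(P_i)\varphi(R) \Big)
  &= \sum_{i=1}^{n} \lambda_i \operatorname{tr}(\varphi(P_i)\varphi(R)) \tag{2} \\
  &= \sum_{i=1}^{n} \lambda_i \operatorname{tr}(P_i R) = \operatorname{tr}(AR) . \tag{3}
\end{align}

We shall use the following notation

\[ T = \sum_{i=1}^{n} \lambda_i \varphi(P_i) , \qquad S = \sum_{j=1}^{m} \mu_j \varphi(Q_j) . \]

Applying (2) and (3) to both decompositions we have, for each one dimensional
projection $R$,

\[ \operatorname{tr}\big( (S - T)\varphi(R) \big) = 0 . \]

By linearity $\operatorname{tr}\big( (S - T)B \big) = 0$ for all $B$ in the linear span
of $\varphi(P_1(H))$. In particular, $\operatorname{tr}\big( (S - T)^2 \big) = 0$, giving
$(S - T)^2 = 0$. Finally, by self-adjointness, $S - T = 0$.
The map $\hat\varphi$ is obviously linear. Let us check that it preserves the squares.
For this, each $A \in F_s(H)$ can be written in its spectral form as

\[ A = \sum_{i=1}^{n} \lambda_i P_i , \]

where $\lambda_i \in \mathbb{R}$ and $P_i$ are pairwise orthogonal one dimensional
projections. Since $\operatorname{tr}(\varphi(P_i)\varphi(P_j)) = \operatorname{tr}(P_i P_j) = 0$
for $i \ne j$, the projections $\varphi(P_i)$ are pairwise orthogonal as well. An easy
computation gives

\[ \hat\varphi(A^2) = \hat\varphi\Big( \sum_{i=1}^{n} \lambda_i^2 P_i \Big)
   = \sum_{i=1}^{n} \lambda_i^2 \varphi(P_i) = \hat\varphi(A)^2 . \]

Finally, $\hat\varphi$ can be canonically extended to the space $F(H)$, which completes
the proof.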

Now we shall demonstrate how one can express Jordan *-isomorphisms in terms
of unitary and antiunitary maps. We start with illustrative situation of two by two
matrices.
Proposition 2.2 Let ϕ : M2 (C) → M2 (C) be a nonzero Jordan ∗-homomorphism.
Then there is either a unitary operator U on C2 such that ϕ is of the form

ϕ(A) = U AU ∗ , for all A ∈ M2 (C) ,

or there is an antiunitary operator U on C2 such that ϕ is of the form

ϕ(A) = U A∗ U ∗ , for all A ∈ M2 (C) .

(Let us note that ϕ is in fact a Jordan ∗-isomorphism)

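Proof As $\varphi$ is nonzero there must exist a projection $P_\xi$ such that
$\varphi(P_\xi)$ is nonzero. As all one dimensional projections are unitarily
equivalent, we obtain that $\varphi$ is nonzero at every one dimensional projection.
This also implies that $\varphi$ preserves one dimensional projections. Let $e_1, e_2$
be the standard orthonormal basis of $\mathbb{C}^2$. Then

\[ P_{e_1} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} , \qquad
   P_{e_2} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} . \]

As we know, $\varphi(P_{e_1})$ and $\varphi(P_{e_2})$ are orthogonal rank one
projections projecting onto the linear spans of orthogonal unit vectors $f_1$ and
$f_2$, respectively. Let us take a unitary operator $V$ satisfying $V f_1 = e_1$ and
$V f_2 = e_2$. Then, for $i = 1, 2$,

\[ V \varphi(P_{e_i}) V^* = V P_{f_i} V^* = P_{e_i} . \]

Therefore by passing to $V \varphi(\cdot) V^*$ we can suppose, without loss of
generality, that $\varphi$ fixes the projections $P_{e_1}$ and $P_{e_2}$. Let us write
$B = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ in the form

\[ B = a P_{e_1} + P_{e_1} B P_{e_2} + P_{e_2} B P_{e_1} + d P_{e_2} . \]

By Proposition 1.1 we can see that

\[ \varphi(B) = a \varphi(P_{e_1}) + \varphi(P_{e_1})\varphi(B)\varphi(P_{e_2})
   + \varphi(P_{e_2})\varphi(B)\varphi(P_{e_1}) + d \varphi(P_{e_2})
   = a P_{e_1} + P_{e_1}\varphi(B)P_{e_2} + P_{e_2}\varphi(B)P_{e_1} + d P_{e_2} . \]

Consequently, $\varphi$ preserves the diagonal in the sense that

\[ \varphi \begin{pmatrix} a & b \\ c & d \end{pmatrix}
   = \begin{pmatrix} a & x \\ y & d \end{pmatrix} \]

for some $x, y \in \mathbb{C}$. Let us now consider the matrix

\[ A = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} , \]

having the property that $A^2 = 0$. Thanks to the properties of $\varphi$ we can write

\[ \varphi(A) = \begin{pmatrix} 0 & y \\ x & 0 \end{pmatrix} , \]

for some unknown $x, y \in \mathbb{C}$. Then the condition
$\varphi(A)^2 = \varphi(A^2) = 0$ gives

\[ 0 = \begin{pmatrix} xy & 0 \\ 0 & xy \end{pmatrix} \]

and so $x$ or $y$ must be zero. Let us examine at first the possibility $y = 0$. Then

\[ \varphi \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}
   = \begin{pmatrix} 0 & 0 \\ \alpha & 0 \end{pmatrix} \]

for some $\alpha \in \mathbb{C}$. Consequently, since $\varphi$ preserves the
∗-operation, it transforms the matrices in the following way:

\[ \varphi \begin{pmatrix} a & b \\ c & d \end{pmatrix}
   = \begin{pmatrix} a & \bar\alpha b \\ \alpha c & d \end{pmatrix} . \tag{4} \]

Since $\varphi$ preserves spectra and sends the matrix
$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ to
$\begin{pmatrix} 0 & \bar\alpha \\ \alpha & 0 \end{pmatrix}$, we conclude that
$|\alpha| = 1$. Let us now take the unitary map

\[ U : (z_1, z_2) \in \mathbb{C}^2 \mapsto (\bar\alpha z_1, z_2) , \]

that is

\[ U = \begin{pmatrix} \bar\alpha & 0 \\ 0 & 1 \end{pmatrix} . \]

One can compute easily that

\[ U \begin{pmatrix} a & b \\ c & d \end{pmatrix} U^*
   = \begin{pmatrix} a & \bar\alpha b \\ \alpha c & d \end{pmatrix}
   = \varphi \begin{pmatrix} a & b \\ c & d \end{pmatrix} . \]

Therefore, $\varphi$ is implemented by a unitary map.

Let us now explore the second possibility when

\[ \varphi \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}
   = \begin{pmatrix} 0 & \alpha \\ 0 & 0 \end{pmatrix} \]

for some $\alpha \in \mathbb{C}$. In the same way as before we can establish that
$\alpha$ is a complex unit and

\[ \varphi \begin{pmatrix} a & b \\ c & d \end{pmatrix}
   = \begin{pmatrix} a & \alpha c \\ \bar\alpha b & d \end{pmatrix} . \tag{5} \]

Let us take the antiunitary map

\[ U : (z_1, z_2) \in \mathbb{C}^2 \mapsto (\alpha \bar z_1, \bar z_2) \in \mathbb{C}^2 . \]

It is easy to verify that, given a general matrix

\[ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} , \]

the action of $U A^* U^*$ on $(z_1, z_2)$ gives
$(a z_1 + \alpha c z_2, \bar\alpha b z_1 + d z_2)$, which is precisely the action of
$\varphi(A)$ on $(z_1, z_2)$. This completes the proof.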
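The two normal forms (4) and (5) can be tested numerically. The sketch below assumes the conjugation pattern in which the entry c is multiplied by α and the entry b by its conjugate in (4), with the roles exchanged in (5); the unitary diag(ᾱ, 1) and the antiunitary (z1, z2) ↦ (α z̄1, z̄2) are the implementing maps. All function names and the sample value of α are assumptions of this illustration.

```python
import cmath

def mul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def adjoint(x):
    return [[complex(x[j][i]).conjugate() for j in range(2)] for i in range(2)]

def apply(m, z):
    return [m[0][0] * z[0] + m[0][1] * z[1], m[1][0] * z[0] + m[1][1] * z[1]]

def close(x, y, tol=1e-12):
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(2) for j in range(2))

def close_vec(x, y, tol=1e-12):
    return all(abs(s - t) < tol for s, t in zip(x, y))

alpha = cmath.exp(0.7j)  # an arbitrary complex number of modulus one

def phi_unitary(m):
    # Formula (4): off-diagonal entries rescaled, positions kept
    (a, b), (c, d) = m
    return [[a, alpha.conjugate() * b], [alpha * c, d]]

def phi_anti(m):
    # Formula (5): off-diagonal entries rescaled and swapped
    (a, b), (c, d) = m
    return [[a, alpha * c], [alpha.conjugate() * b, d]]

U = [[alpha.conjugate(), 0], [0, 1]]  # unitary implementing (4)

def anti_u(z):
    # Antiunitary map (z1, z2) -> (alpha*conj(z1), conj(z2)); it equals its own inverse
    return [alpha * complex(z[0]).conjugate(), complex(z[1]).conjugate()]

A = [[1 + 1j, 2 - 3j], [0.5j, -4]]
z = [0.3 - 1j, 2 + 0.5j]

unitary_case = close(mul(mul(U, A), adjoint(U)), phi_unitary(A))
anti_case = close_vec(anti_u(apply(adjoint(A), anti_u(z))), apply(phi_anti(A), z))
anti_preserves_squares = close(phi_anti(mul(A, A)), mul(phi_anti(A), phi_anti(A)))
anti_is_multiplicative = close(phi_anti(mul(A, A[::-1])), mul(phi_anti(A), phi_anti(A[::-1])))
```

Here `A[::-1]` (the rows of A swapped) merely serves as a second matrix that does not commute with A, so that the last flag shows the map (5) preserves squares without being multiplicative.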


The algebra M2 (C) describes a two-level quantum system. Its state space has a
natural geometric interpretation. Let us identify each state with a positive matrix of
trace one (so called density matrix). Every density matrix T can be represented by
a vector r = (r1 , r2 , r3 ) in the three dimensional unit ball in the following way:

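\[ T = \frac{1}{2} (I + r_1 \sigma_1 + r_2 \sigma_2 + r_3 \sigma_3) , \]

where $\sigma_1, \sigma_2, \sigma_3$ are the Pauli spin matrices:

\[ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \quad
   \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} , \quad
   \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} . \]

The antiunitary case in the Wigner theorem given by (5) with the specific value
$\alpha = 1$ corresponds to the reflection $(r_1, r_2, r_3) \mapsto (r_1, -r_2, r_3)$.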
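The reflection of the Bloch vector can be seen directly: for α = 1 the map (5) is the transposition, and transposing a density matrix changes the sign of the σ2 component only. A minimal sketch, with hypothetical helper names and an arbitrary sample vector inside the unit ball:

```python
def bloch_to_density(r):
    # T = (I + r1*sigma1 + r2*sigma2 + r3*sigma3) / 2 written out entrywise
    r1, r2, r3 = r
    return [[(1 + r3) / 2, (r1 - 1j * r2) / 2],
            [(r1 + 1j * r2) / 2, (1 - r3) / 2]]

def density_to_bloch(t):
    # r_k = tr(T sigma_k) for the three Pauli matrices
    r1 = complex(t[0][1] + t[1][0]).real
    r2 = complex(1j * (t[0][1] - t[1][0])).real
    r3 = complex(t[0][0] - t[1][1]).real
    return (r1, r2, r3)

def transpose(t):
    return [[t[j][i] for j in range(2)] for i in range(2)]

r = (0.3, 0.4, -0.5)
reflected = density_to_bloch(transpose(bloch_to_density(r)))
```

Up to rounding, `reflected` is (0.3, -0.4, -0.5), that is, the second coordinate changes sign while the other two are unchanged.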
Having a unitary or antiunitary map V on a Hilbert space H we can define a map
Ad V on B(H ) by

Ad V (T ) = V T V ∗ , T ∈ B(H ) .

We say that the map Ad V is implemented by V .

Proposition 2.3 Let $\varphi : M_n(\mathbb{C}) \to M_n(\mathbb{C})$ be a linear map.
Then the following holds:
(i) If $\varphi$ is a ∗-isomorphism, then it is implemented by a unitary $V$ acting on
$\mathbb{C}^n$.
(ii) If $\varphi$ is a ∗-antiisomorphism, then it is implemented by an antiunitary map
on $\mathbb{C}^n$.
Proof We shall show (i). Fix a unit vector $\xi \in \mathbb{C}^n$. Then
$\varphi(P_\xi)$ is a rank-one projection. By passing to
$\operatorname{Ad} U \circ \varphi$, where $U$ is a suitable unitary operator, we can
suppose that $\varphi(P_\xi) = P_\xi$. Let us first observe that for any
$A \in M_n(\mathbb{C})$ we have that $\|A\xi\| = \|\varphi(A)\xi\|$. Indeed,

\[ \|\varphi(A)\xi\|^2 = \langle \varphi(A)\xi, \varphi(A)\xi \rangle
   = \langle \varphi(A^*A)\xi, \xi \rangle
   = \langle \varphi(A^*A) P_\xi \xi, P_\xi \xi \rangle
   = \langle \varphi(P_\xi A^* A P_\xi)\xi, \xi \rangle . \]

One can directly check that $P_\xi A^* A P_\xi = \|A\xi\|^2 P_\xi$. By this, we can
continue the previous computation and obtain

\[ \|\varphi(A)\xi\|^2 = \|A\xi\|^2 \langle \xi, \xi \rangle = \|A\xi\|^2 . \]

The foregoing identity allows us to define correctly the unitary map
$V : \mathbb{C}^n \to \mathbb{C}^n$ by

\[ V A \xi = \varphi(A)\xi , \qquad A \in M_n(\mathbb{C}) . \]

We shall prove that $V$ implements $\varphi$. Any vector $z \in \mathbb{C}^n$ can be
written in the form $z = \varphi(B)\xi$, where $B \in M_n(\mathbb{C})$. Then we have

\[ V A V^* z = V A V^* \varphi(B)\xi = V A B \xi = \varphi(AB)\xi
   = \varphi(A)\varphi(B)\xi = \varphi(A) z . \]

Hence, the operators $\varphi(A)$ and $V A V^*$ coincide. Therefore, $\varphi$ is
implemented by $V$.

Case (ii) has the same proof.

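Let us now have complex numbers $\alpha, \beta, \gamma$ of modulus one. We denote by
$\Phi_{\alpha,\beta,\gamma}$ the map acting on $M_3(\mathbb{C})$ by

\[ \Phi_{\alpha,\beta,\gamma}
   \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}
   = \begin{pmatrix} a_{11} & \alpha a_{12} & \beta a_{13} \\
                     \bar\alpha a_{21} & a_{22} & \gamma a_{23} \\
                     \bar\beta a_{31} & \bar\gamma a_{32} & a_{33} \end{pmatrix} . \]

Lemma 2.4 The following conditions are equivalent:
(i) $\Phi_{\alpha,\beta,\gamma}$ is a Jordan ∗-isomorphism.
(ii) $\alpha\gamma = \beta$.
(iii) $\Phi_{\alpha,\beta,\gamma}$ is implemented by a unitary operator.
(iv) $\Phi_{\alpha,\beta,\gamma}$ is a ∗-isomorphism.
Proof (i) ⇒ (ii). Take the matrix

\[ A = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 1 & 1 & 0 \end{pmatrix} . \quad
   \text{Then} \quad
   \Phi_{\alpha,\beta,\gamma}(A)
   = \begin{pmatrix} 0 & 0 & 0 \\ \bar\alpha & 0 & 0 \\ \bar\beta & \bar\gamma & 0 \end{pmatrix} . \]

As

\[ A^2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix} , \quad
   \text{we have that} \quad
   \Phi_{\alpha,\beta,\gamma}(A^2)
   = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ \bar\beta & 0 & 0 \end{pmatrix} . \]

On the other hand,

\[ [\Phi_{\alpha,\beta,\gamma}(A)]^2
   = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ \bar\alpha\bar\gamma & 0 & 0 \end{pmatrix} , \]

giving immediately $\beta = \alpha\gamma$.
(ii) ⇒ (iii). Suppose that $\alpha\gamma = \beta$. Put

\[ U = \begin{pmatrix} \alpha & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \bar\gamma \end{pmatrix} . \]

Then $U$ is a unitary matrix and it can be verified by a direct calculation that

\[ \Phi_{\alpha,\beta,\gamma} = \operatorname{Ad} U . \]

The remaining implications are obvious.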
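Lemma 2.4 lends itself to a quick numerical check. The sketch below assumes the entrywise form of the map in which the entries above the diagonal are multiplied by α, β, γ and those below the diagonal by their conjugates, and it uses the unitary diag(α, 1, γ̄). The name `phi` and the sample phases are assumptions made only for this illustration.

```python
import cmath

def mul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def adjoint(x):
    n = len(x)
    return [[complex(x[j][i]).conjugate() for j in range(n)] for i in range(n)]

def close(x, y, tol=1e-12):
    n = len(x)
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(n) for j in range(n))

def phi(m, al, be, ga):
    # Entrywise rescaling by the unimodular weights of the map in Lemma 2.4
    w = [[1, al, be],
         [al.conjugate(), 1, ga],
         [be.conjugate(), ga.conjugate(), 1]]
    return [[w[i][j] * m[i][j] for j in range(3)] for i in range(3)]

def preserves_square(m, al, be, ga):
    image = phi(m, al, be, ga)
    return close(phi(mul(m, m), al, be, ga), mul(image, image))

al, ga = cmath.exp(0.4j), cmath.exp(-1.1j)
good = al * ga           # beta = alpha * gamma
bad = cmath.exp(2.0j)    # some beta different from alpha * gamma

A = [[0, 0, 0], [1, 0, 0], [1, 1, 0]]              # test matrix from the proof
M = [[1, 2j, 3], [0.5, -1j, 2], [1 + 1j, 0, 4]]    # an arbitrary matrix

U = [[al, 0, 0], [0, 1, 0], [0, 0, ga.conjugate()]]
implemented = close(mul(mul(U, M), adjoint(U)), phi(M, al, good, ga))
square_good = preserves_square(A, al, good, ga) and preserves_square(M, al, good, ga)
square_bad = preserves_square(A, al, bad, ga)
```

With β = αγ the map agrees with Ad U and preserves squares; with the other choice of β the square of the test matrix A is no longer preserved, in line with the implication (i) ⇒ (ii).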

In order to handle matrices of higher rank, we shall introduce some notation. Let
$A$ be a matrix in $M_n(\mathbb{C})$, $n > 1$, and $i, j \in \{1, \ldots, n\}$,
$i \ne j$. By $A^{ij}$ we shall denote the matrix in $M_n(\mathbb{C})$ having the same
entries as $A$ in positions $(k, l)$, where $k, l \in \{i, j\}$, and zeros elsewhere.
It is easy to verify that

\[ M_n^{ij}(\mathbb{C}) := \{ A^{ij} : A \in M_n(\mathbb{C}) \} \]

is a ∗-subalgebra of $M_n(\mathbb{C})$ isomorphic to $M_2(\mathbb{C})$. Further we
shall denote by $D(A)$ the diagonal matrix having the same diagonal as $A$.
Theorem 2.5 Let $\varphi : M_n(\mathbb{C}) \to M_n(\mathbb{C})$ be a nonzero Jordan
∗-homomorphism. Then $\varphi$ is implemented by either a unitary or an antiunitary
operator.
Proof The theorem is true for $n = 1$ (for trivial reasons) and we have established it
for $n = 2$ in Proposition 2.2. First suppose that $n = 3$. Without loss of generality
we can assume that $\varphi$ fixes the projections $P_{e_i}$, $i = 1, 2, 3$. Then it
fixes all subalgebras $M_3^{ij}(\mathbb{C})$ and diagonal matrices. We say that the
algebra $M_3^{ij}(\mathbb{C})$ has positive (resp. negative) orientation if the
restriction of $\varphi$ to $M_3^{ij}(\mathbb{C})$ is implemented by a unitary (resp.
antiunitary) operator. We prove that all subalgebras have the same orientation.
First consider the pair $M_3^{12}(\mathbb{C})$, $M_3^{13}(\mathbb{C})$. Suppose that
the first algebra has positive orientation while the second algebra has negative one.
Consider the matrix

\[ A = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix} . \]

We have that $A^2 = 0$. By Proposition 2.2 (and its proof) there are complex numbers
$\alpha, \beta$ of modulus one such that

\[ \varphi(A) = \begin{pmatrix} 0 & 0 & \beta \\ \alpha & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} . \]

Then

\[ 0 = \varphi(A^2) = \varphi(A)^2
   = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & \alpha\beta \\ 0 & 0 & 0 \end{pmatrix} , \]

which is a contradiction. Suppose that $M_3^{12}(\mathbb{C})$ has negative orientation
and $M_3^{13}(\mathbb{C})$ has positive orientation. Then, similarly, there are complex
numbers $\alpha$ and $\beta$ of modulus one such that

\[ \varphi(A) = \begin{pmatrix} 0 & \alpha & 0 \\ 0 & 0 & 0 \\ \beta & 0 & 0 \end{pmatrix} . \]

Then

\[ 0 = \varphi(A^2) = \varphi(A)^2
   = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & \beta\alpha & 0 \end{pmatrix} , \]

which is a contradiction. Now we shall prove that $M_3^{12}(\mathbb{C})$ and
$M_3^{23}(\mathbb{C})$ have the same orientation. For this we consider the action of
$\varphi$ on the matrix

\[ A = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}
   \quad \text{and its square} \quad
   A^2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix} . \]

Suppose that $M_3^{12}(\mathbb{C})$ has positive and $M_3^{23}(\mathbb{C})$ has negative
orientation. Then there are complex numbers $\alpha, \gamma$ of modulus one such that

\[ \varphi(A) = \begin{pmatrix} 0 & 0 & 0 \\ \alpha & 0 & \gamma \\ 0 & 0 & 0 \end{pmatrix} . \]

Then $\varphi(A^2) \ne 0$, however $\varphi(A)^2 = 0$, which is not possible. If the
orientation of $M_3^{12}(\mathbb{C})$ is negative and the orientation of
$M_3^{23}(\mathbb{C})$ is positive, then one can reach a contradiction in the same way
by computing that $\varphi(A)^2 = 0$ again, while $\varphi(A^2) \ne 0$. If all
orientations on the corresponding 2 by 2 matrix subalgebras are positive, then
$\varphi$ must be a map $\Phi_{\alpha,\alpha\gamma,\gamma}$ (see Lemma 2.4), which is a
∗-isomorphism. On the other hand, if all orientations are negative, then by composing
$\varphi$ with the transpose map we obtain another Jordan ∗-isomorphism that is of the
form above. In that case $\varphi$ is a ∗-antihomomorphism. Now it suffices to apply
Proposition 2.3.

This way the result is established for $n \le 3$. Let us now tackle the general case
$n \ge 3$. We shall show that one of the following statements is true:

\[ \varphi(PQ) = \varphi(P)\varphi(Q) \quad \text{for all rank-one projections } P \text{ and } Q , \tag{6} \]

\[ \varphi(PQ) = \varphi(Q)\varphi(P) \quad \text{for all rank-one projections } P \text{ and } Q . \tag{7} \]

To this end let us fix a rank-one projection $P$. There exists a rank-one projection
$Q$ that does not commute with $P$. The projection $E = P \vee Q$ is a rank-two
projection and the same holds for $\varphi(E)$. As $\varphi$ acts on
$E M_n(\mathbb{C}) E$ as a Jordan ∗-isomorphism (that must preserve commutativity in
both directions), we have that $\varphi(P)\varphi(Q) \ne \varphi(Q)\varphi(P)$.
Besides, we know from the case $n = 2$ that there are two (mutually exclusive)
possibilities: $\varphi(PQ) = \varphi(P)\varphi(Q)$ or
$\varphi(PQ) = \varphi(Q)\varphi(P)$. Suppose the first one holds. Take any rank-one
projection $R$ that is not orthogonal to $P$. Then the projection
$F = P \vee Q \vee R$ has rank at most three, and the same holds for its image under
$\varphi$. Based on our result for $n \le 3$, $\varphi$ must act as a ∗-isomorphism on
$F M_n(\mathbb{C}) F$. Therefore $\varphi(PR) = \varphi(P)\varphi(R)$. As the same
holds for rank-one projections $R$ orthogonal to $P$, we conclude that
$\varphi(PR) = \varphi(P)\varphi(R)$ for all rank-one projections $R$. If
$\varphi(PQ) = \varphi(Q)\varphi(P)$, then, similarly, we can show that
$\varphi(PR) = \varphi(R)\varphi(P)$ for all rank-one projections $R$.
In summary, we have proved that for any rank-one projection $P$ one of the following
two statements is true:

\[ \varphi(PQ) = \varphi(P)\varphi(Q) \quad \text{for all rank-one projections } Q , \tag{8} \]

\[ \varphi(PQ) = \varphi(Q)\varphi(P) \quad \text{for all rank-one projections } Q . \tag{9} \]

We say that $P$ has positive orientation if (8) holds and negative orientation if (9)
holds. We shall show that (6) holds or (7) holds by demonstrating that all rank-one
projections have the same orientation. Suppose, for a contradiction, that there is a
rank-one projection $P$ with positive orientation and a rank-one projection $Q$ with
negative orientation. As $P \ne Q$, we can find a rank-one projection $R$ commuting
with neither $P$ nor $Q$. Then
$\varphi(RQ) = \varphi(QR)^* = (\varphi(R)\varphi(Q))^* = \varphi(Q)\varphi(R) \ne \varphi(R)\varphi(Q)$.
Therefore $R$ has negative orientation. Considering now the pair $P$, $R$, we have for
the same reason that $P$ has negative orientation, a contradiction.
Finally, as rank-one projections span the whole algebra, we can see that $\varphi$ is
either a ∗-isomorphism (case (6)) or a ∗-antiisomorphism (case (7)). Now
Proposition 2.3 concludes the proof.
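The orientation argument for n = 3 in the proof above can be illustrated with a toy map on M3(C) that behaves like the identity on the (1,2) block but like the transpose on the (1,3) entries. The map `mixed_orientation_map` below is a hypothetical example constructed only for this purpose; it fixes the diagonal and is linear, yet it fails to preserve squares, exactly as the proof predicts for mixed orientations.

```python
def mul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mixed_orientation_map(m):
    # Hypothetical map: keeps every entry in place except that the (1,3) and (3,1)
    # entries are swapped, i.e. positive orientation on the (1,2) block and
    # negative orientation on the (1,3) block (with alpha = beta = 1).
    out = [row[:] for row in m]
    out[0][2], out[2][0] = m[2][0], m[0][2]
    return out

A = [[0, 0, 0],
     [1, 0, 0],
     [1, 0, 0]]  # the matrix from the proof; its square is zero

image_of_square = mixed_orientation_map(mul(A, A))
square_of_image = mul(mixed_orientation_map(A), mixed_orientation_map(A))
```

The image of A² is the zero matrix, whereas the square of the image of A has the entry 1 in position (2,3), which is the matrix with entry αβ appearing in the proof.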

The foregoing results on the structure of Jordan *-isomorphisms allow us to establish
the Wigner theorem both for finite and infinite dimensional Hilbert spaces.
Theorem 2.6 (Wigner Theorem for Finite Quantum Systems) Let
$\varphi : P_1(H) \to P_1(H)$, where $H$ is a finite dimensional Hilbert space, be a
map preserving transition probabilities, that is, $\varphi$ satisfies

\[ \operatorname{tr}(PQ) = \operatorname{tr}(\varphi(P)\varphi(Q)) \quad \text{for all } P, Q \in P_1(H) . \]

Then there is either a unitary or an antiunitary map $U$ acting on $H$ such that

\[ \varphi(P) = U P U^* \quad \text{for all } P \in P_1(H) . \]

Proof Let us identify $H$ with $\mathbb{C}^n$. By Proposition 2.1, $\varphi$ extends
to a nonzero Jordan ∗-homomorphism acting on $M_n(\mathbb{C})$. By virtue of
Theorem 2.5, this extension is implemented by either a unitary or an antiunitary map
$U$, so that

\[ \varphi(P) = U P U^* \quad \text{for all } P \in P(H) , \]

and in particular for all $P \in P_1(H)$.

The infinite-dimensional variant of the previous theorem can be found in [1].

2.2 Logical Version

A later version of the Wigner theorem is about preserving the logical structure of
projections. Projections correspond to mutually exclusive quantum operations if they
are orthogonal. It turns out that symmetries of this orthogonality structure are the
same as for transition probabilities, except for the algebra $M_2(\mathbb{C})$.
Theorem 2.7 Let H be a Hilbert space with dimH ≥ 3. Let ϕ : P (H ) → P (H )
be an orthoisomorphism. Then there is either a unitary or an antiunitary operator
U acting on H such that

ϕ(P ) = U P U ∗ for all P ∈ P (H ) .

Proof The crucial idea is to find a bounded linear extension of $\varphi$. For this
consider two identical linear combinations of projections in $B(H)$, i.e.

\[ \sum_{i=1}^{n} \lambda_i P_i = \sum_{j=1}^{m} \mu_j Q_j , \]

where $P_1, \ldots, P_n, Q_1, \ldots, Q_m$ are projections and
$\lambda_1, \ldots, \lambda_n, \mu_1, \ldots, \mu_m$ are complex numbers. Let us take a
bounded linear functional $f$ on $B(H)$. The composition $f \circ \varphi$ is a
finitely additive bounded measure on $P(H)$ that has a bounded linear extension to
$B(H)$ by Theorem 1.3. By applying this extension to the equality above we obtain

\[ f\Big( \sum_{i=1}^{n} \lambda_i \varphi(P_i) \Big) = f\Big( \sum_{j=1}^{m} \mu_j \varphi(Q_j) \Big) . \]

Employing now the Hahn-Banach theorem, we can conclude that

\[ \sum_{i=1}^{n} \lambda_i \varphi(P_i) = \sum_{j=1}^{m} \mu_j \varphi(Q_j) . \]

This enables us to define a linear operator $T$ acting on the linear span of
projections by

\[ T\Big( \sum_{i=1}^{n} \lambda_i P_i \Big) = \sum_{i=1}^{n} \lambda_i \varphi(P_i) . \]

It can be verified easily that $T$ is bounded and so can be extended to a bounded
linear operator (denoted by the same letter) $T : B(H) \to B(H)$. As this map preserves
projections, it has to be a Jordan *-homomorphism by Proposition 1.2. Arguing in the
same way for the inverse map $\varphi^{-1}$ gives us that $T$ is a Jordan
*-isomorphism. By the previous discussion (see e.g. Theorem 2.5 for the finite
dimensional case) this map is implemented by a unitary or an antiunitary operator.

Theorem 2.8 (Uhlhorn) Let $H$ be a Hilbert space with $\dim H \ge 3$. Let
$\varphi : P_1(H) \to P_1(H)$ be a bijection preserving orthogonality in both
directions. Then there is either a unitary or an antiunitary operator $U$ acting on
$H$ implementing $\varphi$.
Proof It is enough to show that $\varphi$ extends to an orthoisomorphism of $P(H)$.
Let us take any projection $P$ in $B(H)$. Suppose that

\[ P = \sup_\alpha Q_\alpha = \sup_\beta R_\beta , \]

where $(Q_\alpha)$ and $(R_\beta)$ are one-dimensional projections. Then

\[ \sup_\alpha \varphi(Q_\alpha) = \sup_\beta \varphi(R_\beta) \]

because both these suprema have the same orthocomplement. This allows us to extend
$\varphi$ to an orthoisomorphism on $P(H)$ by putting

\[ \varphi(P) = \sup \{ \varphi(R) : R \le P ,\ \dim R = 1 \} . \]

By invoking Theorem 2.7 we can conclude the proof.


3 Dye Theorem

3.1 Dye Theorem for von Neumann Algebras

The following celebrated theorem has been proved in [4].

Theorem 3.1 (Dye) Let $M$ and $N$ be von Neumann algebras, where $M$ has no Type $I_2$
direct summand. Then any orthoisomorphism $\varphi : P(M) \to P(N)$ extends to a Jordan
*-isomorphism $\Phi : M \to N$.
Proof Although the situation is much more general, the proof is the same as for
Theorem 2.7. Using Theorem 1.4, we can extend $\varphi$ to a bounded linear map
$\Phi : M \to N$ preserving projections and so being a Jordan *-homomorphism. The
argument involving the inverse $\varphi^{-1}$ shows that $\Phi$ is in fact a Jordan
*-isomorphism.

3.2 Dye Theorem for AW∗ -algebras

Main result of this part is taken from [7].

Theorem 3.2 (Hamhalter) Let A be an AW∗ -algebra without Type I2 direct
summand and B be an AW∗ -algebra. Let ϕ : P (A) → P (B) be a map preserving
all suprema and orthocomplements, i.e.

\[ \varphi\Big( \sup_\alpha p_\alpha \Big) = \sup_\alpha \varphi(p_\alpha) \]

and

\[ \varphi(1 - p) = 1 - \varphi(p) . \]

Then $\varphi$ extends to a Jordan ∗-homomorphism $\Phi : A \to B$.

One of the major problems in the theory of AW*-algebras is whether the Generalized
Gleason theorem holds also for AW*-algebras without Type $I_2$ direct summand. Since a
positive answer would resolve many other difficult problems in the theory of operator
algebras (see [7] for details), this problem is expected to be extremely difficult.
Therefore the idea of proving the Dye theorem for AW*-algebras must be different
from the case of von Neumann algebras. Fortunately, we can combine a local Gleason
theorem with the matrix approach initiated by J. von Neumann and Dye [4] and
developed further by Heunen and Reyes [12].
We shall need the following notation.
Let $M_n(A)$ be the C*-algebra of all n by n matrices over the C*-algebra $A$. Let
us take distinct integers $1 \le i, j \le n$ and $a \in A$. We shall consider the
matrix $p_{ij}(a)$ in $M_n(A)$ such that all entries are zero except for the positions
$(i, i), (i, j), (j, i), (j, j)$, which give the submatrix

\[ \begin{pmatrix} (1 + aa^*)^{-1} & (1 + aa^*)^{-1} a \\
                   a^* (1 + aa^*)^{-1} & a^* (1 + aa^*)^{-1} a \end{pmatrix} . \]

Further by $e_{ii}$ we shall denote the matrix having the unit at position $(i, i)$ and
zeros elsewhere. It turns out that each $p_{ij}(a)$ is a projection and even that
projections of this form generate the projection lattice of $M_n(A)$ as a complete
orthomodular lattice (see e.g. [12, Lemma 4.1]).
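In the simplest case A = C and n = 2 the claim that the block above is a projection can be verified directly. The following sketch is restricted to scalar a; the function name `p_block` and the sample value are assumptions for illustration. It checks idempotence, self-adjointness and that the trace equals one.

```python
def p_block(a):
    # 2x2 block of p_ij(a) for a complex scalar a, where a a* = |a|^2
    a = complex(a)
    inv = 1 / (1 + abs(a) ** 2)  # (1 + a a*)^{-1}
    return [[inv, inv * a],
            [a.conjugate() * inv, a.conjugate() * inv * a]]

def mul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def adjoint(x):
    return [[complex(x[j][i]).conjugate() for j in range(2)] for i in range(2)]

def close(x, y, tol=1e-12):
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(2) for j in range(2))

p = p_block(2 - 1j)
idempotent = close(mul(p, p), p)
self_adjoint = close(adjoint(p), p)
trace = p[0][0] + p[1][1]
```

For scalar a the block is the rank-one projection onto the line spanned by (1, ā), which is why its trace is 1.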
There is an elegant bridge between linear structure of A and lattice operations in
Mn (A), which is a great discovery going back to J. von Neumann (see [4, Lemma 4,
Lemma 3(i)]). Let us recall that by a lattice polynomial in variables p1 , . . . , pk we
mean a formal expressions that is a result of finitely many lattice operations (∨, ∧)
performed on elements in {p1 , . . . , pk }.
Lemma 3.3 There exist lattice polynomials P , Q and R such that for any elements
a, b, c of a C ∗ -algebra A, where c is invertible, any integer n ≥ 3, and any distinct
indices 1 ≤ i, j, k ≤ n, the following holds.
(i) pij (a + b) = P (pij (a), pij (b), pik (c), eii , ejj , ekk ).
(ii) pij (−ab) = Q(pik (a), pkj (b), eii , ejj )
(iii) pij (−a ∗ ) = R(pj i (a), eii , ejj ).
As a consequence of Lemma 3.3, any lattice morphism of $M_n(A)$ that preserves
projections of type $p_{ij}(a)$ and $e_{ii}$ induces a *-ring morphism on the
underlying algebra $A$. This is an important ingredient in proving the Dye theorem as
well as in establishing the following deep result of Heunen and Reyes
[12, Theorem 4.6].
Theorem 3.4 (Heunen and Reyes) Let $A$ and $B$ be AW*-algebras. Let

\[ f : P(A) \to P(B) \]

be a map preserving arbitrary suprema and orthocomplements. Then $f$ extends to a
Jordan *-homomorphism if and only if the following condition holds:

\[ f((1 - 2p)q(1 - 2p)) = (1 - 2f(p)) f(q) (1 - 2f(p)) , \tag{10} \]

for all projections $p, q \in A$.
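Condition (10) can be made concrete in M2(C). Since 1 − 2p is a self-adjoint unitary, the element (1 − 2p)q(1 − 2p) is again a projection, and any Jordan *-homomorphism respects the identity because it preserves symmetric triple products. The sketch below checks this for the transpose map, which is used here only as a convenient example of a Jordan map; the helper names and the two sample projections are assumptions.

```python
import math

def mul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def transpose(x):
    return [[x[j][i] for j in range(2)] for i in range(2)]

def adjoint(x):
    return [[complex(x[j][i]).conjugate() for j in range(2)] for i in range(2)]

def close(x, y, tol=1e-12):
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(2) for j in range(2))

def rank_one(v):
    # Projection onto the line spanned by the nonzero vector v
    n = math.sqrt(sum(abs(c) ** 2 for c in v))
    u = [complex(c) / n for c in v]
    return [[a * b.conjugate() for b in u] for a in u]

def reflect(p, q):
    # (1 - 2p) q (1 - 2p)
    s = [[(1 if i == j else 0) - 2 * p[i][j] for j in range(2)] for i in range(2)]
    return mul(mul(s, q), s)

p = rank_one([1, 1j])
q = rank_one([1, 0])
r = reflect(p, q)

is_projection = close(mul(r, r), r) and close(adjoint(r), r)

f = transpose  # example of a Jordan *-isomorphism that is not multiplicative
condition_10 = close(f(r), reflect(f(p), f(q)))
```

The flag `condition_10` records that both sides of (10) agree for this choice of f, p and q; it is a single sample check and does not replace the theorem.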

The previous theorem says that when proving the Dye theorem one has to check (10),
which is an identity involving only the projections 1, e, f (here e and f play the
roles of p and q). Therefore, one can restrict oneself to the unital AW*-algebra
generated by e and f. A thorough analysis of the position of two projections in
AW*-algebras has led to the following structural result, proved in [7], which is of
independent interest.
Proposition 3.5 Let e and f be projections in an AW*-algebra A. Then the smallest
AW*-subalgebra AW*(e, f) of A that contains e and f is *-isomorphic to the direct sum

\[ C \oplus M_2(D) , \]

where C and D are abelian C*-algebras.

The algebra M2 (D) in the previous Proposition is a Type I2 AW∗ -algebra. For
this algebra the Gleason theorem does not hold. However we have succeeded in
proving that for any matrix algebra of higher rank the generalized Gleason theorem
does hold [7, Theorem 3.8].
Theorem 3.6 (Hamhalter) Let A be an AW∗ -subalgebra of type $I_n$, where $n \ne 2$,
$n < \infty$, and X a Banach space. Then any bounded finitely additive measure

μ : P (A) → X ,

extends to a bounded linear functional

T : A → X.

Unfortunately the studied algebra AW ∗ (1, e, f ) generated by projections e and

f is not covered by the previous theorem. However, the fact that this algebra
is sitting inside the algebra with no summand of Type I2 allows one, after
nontrivial arguments using geometry of projections and their angles, to show that
the Generalized Gleason theorem does hold on subalgebras generated by two
projections.
Theorem 3.7 (Hamhalter) Let e, f be projections in an AW∗-algebra A without
Type I2 direct summand. Let X be a Banach space. Let B be the AW∗-subalgebra
generated by the projections e, f, 1. Then any bounded finitely additive measure

μ : P (B) → X ,

extends to a bounded linear operator

T : B → X.

Proof of Theorem 3.2 We now have all the ingredients to prove the (nonbijective) Dye
theorem for AW∗-algebras. Let A be an AW∗-algebra with no Type I2
direct summand, and let B be another AW∗-algebra. Consider a map ϕ : P(A) → P(B)
that preserves arbitrary suprema of projections and orthocomplements. Fix two
projections e, f ∈ A. Theorem 3.7 assures us that there is a Jordan map J from
the algebra C = AW∗(1, e, f) to B that coincides with ϕ on P(C). By the algebraic
properties of Jordan morphisms, we conclude that (10) is satisfied. Now we
can use Theorem 3.4 to prove Theorem 3.2.

4 Structure of Abelian Subalgebras

Let A be a unital C∗-algebra. By the symbol C(A) we shall denote the structure of
all abelian C∗-subalgebras of A containing the unit of A. When endowed with set
theoretic inclusion, C(A) becomes a poset with least element span{1}. Infima
in this poset are given by set theoretic intersections of subalgebras. On the other
hand, the supremum of two elements E and F exists in C(A) if and only if E and F
mutually commute. Similarly, we shall denote by C0(A) the poset of all abelian
C∗-subalgebras of A (not necessarily containing the unit of A). Let P be a subposet of
C0(A) and Q be a subposet of C0(B). The map ϕ : P → Q is said to be implemented
by a map ψ : A → B if

ϕ(C) = ψ[C] = {ψ(x) : x ∈ C} for all C ∈ P.

Certainly, the poset C(A) is an invariant in the category of unital C∗-algebras. It
is not a complete invariant in general, as the opposite algebra Ao has precisely the
same poset C(Ao) = C(A), while A and Ao need not be isomorphic as C∗-algebras,
as a celebrated result of Connes on Type III factors shows. (The opposite algebra
Ao is the same Banach space A with the same *-operation and the reversed product
(a, b) ↦ ba.)

4.1 Abelian C ∗ -subalgebras

Although the poset C(A) is not a complete invariant in the category of C∗-algebras, it
has been shown by Mendivil that C(A) is a complete invariant in the category of
abelian unital C∗-algebras. We have shown that more is true by establishing a
one-to-one correspondence between *-isomorphisms of abelian C∗-algebras and order
isomorphisms of posets of unital abelian C∗-subalgebras. This is the content of the
following theorem proved in [6].
Theorem 4.1 (Hamhalter) Let A and B be abelian unital C ∗ -algebras. Let

ϕ : C(A) → C(B)

be an order isomorphism. Then there is a *-isomorphism

ψ :A→B

such that

ϕ(C) = ψ[C] for all C ∈ C(A) .

Moreover, if dim A > 2, then ψ is uniquely determined by ϕ.


This result has a topological background. In fact, any unital abelian C∗-algebra
is *-isomorphic to C(X), where X is a compact Hausdorff space. Moreover, by the
Banach-Stone theorem, any *-isomorphism ψ : C(X) → C(Y) is given by a
homeomorphism τ : Y → X in the following way:

ψ(f ) = f ◦ τ for all f ∈ C(X) .

There are interrelations between the algebraic structure of the abelian C∗-algebra C(X)
and the topology of X. For example, for any closed ideal I of C(X) there is a unique
closed subset F of X such that

I = {f ∈ C(X) : f is zero on F} .

The key role in studying the poset C(C(X)) is played by so-called ideal subalgebras,
that is, by C∗-subalgebras of C(X) generated by a closed proper ideal I of C(X) and
the unit 1. More precisely, for each proper ideal subalgebra C of C(X) there is a closed
subset F of X with at least two points such that

C = {f ∈ C(X) : f is constant on F } .
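
For a finite discrete space the correspondence between proper ideal subalgebras and closed subsets with at least two points can be enumerated directly. The sketch below assumes X = {0, 1, 2, 3} (our choice for illustration), identifies each unital subalgebra of C(X) with the partition of X on whose blocks its functions are constant, and treats a subalgebra as a proper ideal subalgebra when exactly one block has more than one point. All function names are hypothetical.

```python
from itertools import combinations


def set_partitions(points):
    """Yield every partition of the list `points` as a list of blocks."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for partition in set_partitions(rest):
        # Put `first` into one of the existing blocks ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or let it form a new singleton block.
        yield [[first]] + partition


def big_blocks(partition):
    """Blocks with at least two points (where functions are forced to be constant)."""
    return [block for block in partition if len(block) >= 2]


X = [0, 1, 2, 3]  # assumed finite discrete space

# Unital subalgebras of C(X), one per partition of X.
subalgebras = [frozenset(frozenset(b) for b in p) for p in set_partitions(X)]

# Proper ideal subalgebras: functions constant on one set F with |F| >= 2.
ideal = [p for p in subalgebras if len(big_blocks(p)) == 1]

# Closed subsets of X with at least two points.
closed_sets = [frozenset(c) for r in range(2, len(X) + 1) for c in combinations(X, r)]

# The set F attached to each proper ideal subalgebra.
attached_sets = {big_blocks(p)[0] for p in ideal}
```

In this toy case the subalgebras that are not ideal subalgebras are exactly those whose partition has two blocks of size two, together with the whole algebra C(X) itself.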

It can be shown that any order isomorphism of subalgebra structures preserves ideal
subalgebras. Therefore, it induces an order isomorphism between the posets of closed
subsets of the underlying spaces. An important part of proving Theorem 4.1 is therefore
establishing the form of isomorphisms of the poset of closed subsets. This is achieved in
the following theorem (see [6, Theorem 2.3]). Let us denote by the symbol F(X) the poset
of all closed subsets of X with at least two points, ordered by set theoretic inclusion.
Theorem 4.2 Let X and Y be compact Hausdorff spaces. Suppose that X is not a
singleton. Let

ψ : F (X) → F (Y )

be an order isomorphism. Then there is a homeomorphism

τ :X→Y

such that

ψ(F ) = τ [F ] for all F ∈ F (X) .

The homeomorphism τ in the previous theorem is then used in the proof of
Theorem 4.1.
In the case of an abelian algebra A, the poset C(A) is a complete lattice. This is not
true in the noncommutative case, where suprema of elements do not exist in general.
One cannot hope for Theorem 4.1 to be valid in this wider context. Indeed, based
on the fact that Jordan *-isomorphisms preserve commutativity, it can be shown
that a Jordan *-isomorphism ψ between unital C∗-algebras A and B implements
an order isomorphism ϕ. Since there are many Jordan *-isomorphisms that are
not *-isomorphisms, we can see that, in the light of the previous observation,
not all order isomorphisms of the posets of abelian subalgebras are induced by
*-isomorphisms. Even more is true: every quasi Jordan *-isomorphism
implements an order isomorphism. This is the content of the following
proposition.
Proposition 4.3 Let ψ : A → B be a unital quasi Jordan *-isomorphism. Then the
map ϕ : C(A) → C(B) given by

ϕ(C) = ψ[C] for all C ∈ C(A)

is an order isomorphism.
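
A standard instance of a Jordan *-isomorphism that is not a *-isomorphism is the transpose map on M2(C). The sketch below (pure Python, with sample matrices chosen by us) checks on concrete matrices that the transpose preserves the Jordan product and the adjoint, reverses ordinary products, and sends a commuting pair to a commuting pair. It is a numerical illustration on one example, not a proof.

```python
def mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


def adjoint(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def jordan(a, b):
    """Jordan product (ab + ba) / 2."""
    ab, ba = mul(a, b), mul(b, a)
    return [[(ab[i][j] + ba[i][j]) / 2 for j in range(2)] for i in range(2)]


# Sample matrices (arbitrary choices with small integer entries, so all
# arithmetic below is exact).
a = [[1 + 1j, 2], [0, 1]]
b = [[1, 0], [3, 1 - 1j]]
c = mul(a, a)  # c commutes with a

ta, tb, tc = transpose(a), transpose(b), transpose(c)

preserves_jordan = transpose(jordan(a, b)) == jordan(ta, tb)
preserves_adjoint = transpose(adjoint(a)) == adjoint(ta)
reverses_products = transpose(mul(a, b)) == mul(tb, ta)
is_multiplicative_here = transpose(mul(a, b)) == mul(ta, tb)
keeps_commutation = mul(a, c) == mul(c, a) and mul(ta, tc) == mul(tc, ta)
```

Since the image of a commutative subalgebra under such a map is again a commutative subalgebra, the induced map on C(A) is well defined, which is the mechanism behind the proposition.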
There is a natural question whether the converse holds as well. The answer is
positive: it was proved by the first author in [6, Theorem 3.4]. For further
treatment and alternative proofs we recommend [14, 15].
Theorem 4.4 (Hamhalter) Let A and B be unital C ∗ -algebras. Then for any order
isomorphism

ϕ : C(A ) → C(B)

there is a unital quasi Jordan *-isomorphism

ψ :A→B

such that

ϕ(C) = ψ[C] for all C ∈ C(A) .

Moreover, if A is isomorphic neither to C2 nor to M2(C), then ψ is uniquely
determined by ϕ.
This theorem is a generalization of Theorem 4.1. Let us note that one cannot
have uniqueness of the induced Jordan map in the case of C2 or M2(C). To explain this,
let us consider the latter case A = M2(C). In that situation the poset C(A) has the
least element span{1} and uncountably many 2-dimensional abelian C∗-subalgebras
that are atoms and maximal elements simultaneously. On each atom let us choose
separately a unital Jordan *-automorphism. The union of such automorphisms
(and its canonical extension to the whole algebra) now gives a quasi Jordan
*-isomorphism implementing the identity order automorphism of C(A). In this way we
get uncountably many quasi Jordan *-isomorphisms implementing the same
(identity) automorphism of C(A).

Besides the poset C(A), it is also natural to consider the poset C0(A) of
all abelian C∗-subalgebras (unital or not) of A. The poset C0(A) has different
properties. For example, let us consider A = C2. The poset C(A) consists of two
elements, span{(1, 1)} and A, related by span{(1, 1)} ≤ A. In the case of C0(A) we have
the largest element A, the smallest element {0} and three pairwise incomparable elements
span{(1, 1)}, span{(1, 0)}, and span{(0, 1)}. The group of order automorphisms is
just the group of permutations of this three-point set. As Jordan *-isomorphisms
are unital, they implement only those order automorphisms of C0(A) that leave the
element span{(1, 1)} fixed. Therefore, one cannot hope that order isomorphisms are
implemented by (quasi) Jordan automorphisms in this case. However, careful analysis
shows that implementation is possible if we assume that order isomorphisms
preserve the subalgebra generated by the unit. Given a unital C∗-algebra A, we shall
denote by O the one-dimensional subalgebra span{1}.
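
The count in the C2 example can be reproduced by brute force. In the sketch below each of the five abelian C∗-subalgebras of C2 is recorded by the set of projections it contains (an encoding we choose for illustration; it is faithful here because every such subalgebra is spanned by its projections), and all bijections of the poset that preserve the order in both directions are enumerated.

```python
from itertools import permutations

# Each subalgebra of C^2 is encoded by the projections it contains.
subalgebras = {
    "{0}": frozenset({(0, 0)}),
    "span{(1,1)}": frozenset({(0, 0), (1, 1)}),
    "span{(1,0)}": frozenset({(0, 0), (1, 0)}),
    "span{(0,1)}": frozenset({(0, 0), (0, 1)}),
    "C^2": frozenset({(0, 0), (1, 0), (0, 1), (1, 1)}),
}
names = list(subalgebras)


def leq(a, b):
    """Inclusion of subalgebras, read off from their projections."""
    return subalgebras[a] <= subalgebras[b]


def order_automorphisms():
    """All bijections of the poset preserving the order in both directions."""
    result = []
    for image in permutations(names):
        sigma = dict(zip(names, image))
        if all(leq(a, b) == leq(sigma[a], sigma[b]) for a in names for b in names):
            result.append(sigma)
    return result


autos = order_automorphisms()

# Automorphisms that fix the subalgebra generated by the unit.
unit_fixing = [s for s in autos if s["span{(1,1)}"] == "span{(1,1)}"]
```

Only the automorphisms in `unit_fixing` (the identity and the swap of the two remaining atoms) can come from unital maps, which matches the restriction ϕ(O) = O imposed in the theorem that follows.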
The authors have obtained the following result [8]:
Theorem 4.5 (Hamhalter and Turilova) Let A and B be unital C ∗ -algebras
isomorphic neither to C ⊕ C nor to M2 (C). Let

ϕ : C0 (A) → C0 (B)

be an order isomorphism such that

ϕ(O) = O .

Then there is a unique unital quasi Jordan *-isomorphism

ψ :A→B

such that

ϕ(C) = ψ[C] for all C ∈ C0 (A) .

Example In general, one cannot replace quasi Jordan *-morphisms by Jordan
*-morphisms in the previous results. Consider the algebra A = M2(C). Then each
element of C(A) other than the least element is a two-dimensional
algebra of the form

Vξ = span{Pξ, 1 − Pξ} ,

where ξ is a unit vector in C2 and Pξ is the orthogonal projection onto its linear
span. Denote the set of such subalgebras by S. Let us have an order automorphism
ϕ of C(A) implemented by a Jordan *-automorphism ψ. Let PVξ be the projection
of the Banach space A onto its closed subspace Vξ. As ψ is a bounded linear map
on A, the assignment ξ → Pψ[Vξ] is a continuous map from the unit sphere of C2 into
the space of bounded operators acting on A. Let us now take a sequence (ξn) of
unit vectors in C2 converging to a vector ξ ∉ (ξn) in C2. Denote by Φ a bijection
of the set S such that Φ(Vξn) = Vξn for all n and Φ(Vξ) = Vν ≠ Vξ. Let us now
consider the order automorphism ϕ of C(A) that coincides with Φ on S. Then the
map ξ → Pϕ(Vξ) is not continuous, and so ϕ cannot be implemented by any Jordan
*-isomorphism.
Nevertheless, we can see that the previous counterexample is quite special as
the poset C(M2 (C)) has all nontrivial elements as atoms. This cannot happen in
M3 (C) and higher dimensions. In fact, when applying the generalized Gleason
theorem (Theorem 1.4), we can see that any quasi Jordan ∗-homomorphism on a von
Neumann algebra without Type I2 direct summand is a Jordan *-homomorphism.
We then have the following description of symmetries of C(A) for nearly all von
Neumann algebras. The following result was proved in [6].
Theorem 4.6 (Hamhalter) Let M be a von Neumann algebra without Type I2
direct summand. Let N be another von Neumann algebra. Let

ϕ : C(M) → C(N )

be an order isomorphism. Then there is a Jordan *-isomorphism

ψ :M→N

such that

ϕ(C) = ψ[C] for all C ∈ C(M) .

Based on Theorem 4.5 we can now obtain the nonunital version of Theorem 4.6
proved in [8].
Theorem 4.7 (Hamhalter and Turilova) Let M be a von Neumann algebra
without Type I2 direct summand. Let N be another von Neumann algebra. Let

ϕ : C0 (M) → C0 (N )

be an order isomorphism such that

ϕ(O) = O .

Then there is a Jordan *-isomorphism

ψ :M→N

such that

ϕ(C) = ψ[C] for all C ∈ C0(M) .


4.2 Abelian von Neumann Subalgebras

In the case of von Neumann algebras, another way to embody Bohr’s
doctrine is to consider the poset of abelian von Neumann subalgebras instead of all
abelian C∗-subalgebras. Let M be a von Neumann algebra. By the symbol V(M)
we shall denote the poset of all abelian von Neumann subalgebras of M containing
the unit of M, ordered by set theoretic inclusion. Of course, V(M) is a subposet
of C(M). The first result in this context was proved by Döring and Harding in
[3].
Theorem 4.8 (Döring and Harding) Let M be a von Neumann algebra without
Type I2 direct summand and N be another von Neumann algebra. Then for any
order isomorphism

ϕ : V(M) → V(N)

there is a unique Jordan *-isomorphism

ψ :M→N

implementing ϕ:

ϕ(C) = ψ[C] for all C ∈ V(M) .

The original proof of this result is based on the Dye theorem. It can also be proved
quickly by using the generalized Gleason theorem as in the previous subsection. We
have generalized this result to AW∗-algebras. Since we do not have a Gleason type
theorem at our disposal in this case, we have to rely fully on the Dye theorem for
AW∗-algebras discussed in Sect. 2. The following result may be found in [7, Theorem 4.6]
for C(M) or in [15, Theorem 9.2.8] for V(M). Given an AW∗-algebra M, we shall
denote by V(M) the poset of abelian AW∗-subalgebras of M containing the unit of
M.
Theorem 4.9 (Hamhalter, Lindenhovius) Let M be an AW∗-algebra without
Type I2 direct summand and N be another AW∗-algebra.
(i) Let

ϕ : V(M) → V(N )

be an order isomorphism. Then there exists a Jordan *-isomorphism

ψ :M→N

implementing ϕ:

ϕ(C) = ψ[C] for all C ∈ V(M) .

(ii) Let

ϕ : C(M) → C(N )

be an order isomorphism. Then there exists a Jordan *-isomorphism

ψ :M→N

implementing ϕ:

ϕ(C) = ψ[C] for all C ∈ C(M) .

Besides the order structure of all abelian C∗-subalgebras and abelian von Neumann
subalgebras, we can also consider the simplest structure of finite dimensional
abelian subalgebras. Each algebra of this type is isomorphic to Cn for some n and
corresponds to a decomposition of the unit into a sum of orthogonal projections. Let
Cfin(A) be the set of all finite dimensional abelian C∗-subalgebras of a unital
C∗-algebra A containing the unit, ordered by set theoretic inclusion. Using the
Dye theorem one can show once more that any order isomorphism of this structure is
implemented by a Jordan *-isomorphism (see [9, Proposition 3.5]).
Theorem 4.10 (Hamhalter and Turilova) Let M be a von Neumann algebra
without Type I2 direct summand. Let N be another von Neumann algebra. Let

ϕ : Cfin(M) → Cfin(N)

be an order isomorphism. Then there is a unique Jordan *-isomorphism ψ : M → N
such that

ϕ(C) = ψ[C] for all C ∈ Cfin(M) .

4.3 Abelian Subalgebras as Invariants

In the previous subsection we saw that, for algebras with no Type I2 part, the
structure of abelian subalgebras determines the algebra up to Jordan isomorphism.
For algebras of Type I2 we know that not all order isomorphisms between
the structures of abelian subalgebras are implemented by Jordan maps. However,
it is surprising that such algebras are even *-isomorphic if they have isomorphic
posets of abelian subalgebras. In fact, we have shown that C(A) is a complete Jordan
invariant for all von Neumann algebras (see Theorem 2.3 in [11]).

Theorem 4.11 (Hamhalter and Turilova) Let M and N be von Neumann algebras.
The following assertions are equivalent:
(i) V(M) and V(N) are isomorphic.
(ii) Cfin(M) and Cfin(N) are isomorphic.
(iii) P(M) and P(N) are orthoisomorphic.
(iv) M and N are isomorphic as Jordan algebras.
Proof First observe that (i) implies (ii). This is due to the fact that finite dimensional
abelian subalgebras can be characterized as those elements of V(M) that have
only finitely many elements beneath them. This property is of course preserved by any
order isomorphism. Therefore, any order isomorphism between V(M) and V(N) restricts
to an order isomorphism between Cfin(M) and Cfin(N).
Now we focus on proving that (ii) implies (iii). It has been proved in [3,
Lemma 3.1] that Cfin(M) is isomorphic to the poset Bfin(P(M)) of finite Boolean
subalgebras of P(M) by the map

X ∈ Cfin(M) ↦ P(X) ∈ Bfin(P(M)) .

Therefore, condition (ii) implies an order isomorphism between Bfin(P(M)) and
Bfin(P(N)). By Döring and Harding [3, Lemma 3.3], this isomorphism extends
naturally to an order isomorphism between the structures of all Boolean subalgebras
B(P(M)) and B(P(N)). However, according to a nice result of Harding and
Navara (see e.g. [14]), any isomorphism between structures of Boolean subalgebras
of orthomodular lattices induces an orthoisomorphism (possibly not in a unique way)
between the orthomodular lattices themselves. Therefore, P(M) is orthoisomorphic to
P(N).
Let us now show that (iii) implies (iv). Suppose that ϕ : P(M) → P(N) is an
orthoisomorphism. Let z be the central projection in M such that zM is either zero
or of Type I2 and (1 − z)M is either zero or does not contain any direct summand
of Type I2. If z = 0, then the proof follows from Theorem 3.1. Therefore, let us
suppose that z is non-zero. By the properties of orthoisomorphisms we know that
w = ϕ(z) is a central projection in N. Hence,

N = wN ⊕ (1 − w)N .

Moreover, the restriction of ϕ gives an orthoisomorphism between P(zM) and
P(wN) and also between P((1 − z)M) and P((1 − w)N). By Theorem 3.1 there is
a Jordan *-isomorphism between (1 − z)M and (1 − w)N. It remains to show that
zM and wN are Jordan ∗-isomorphic. We shall prove the even stronger statement that
these algebras are isomorphic as C∗-algebras. First we verify that wN is of Type I2.
Let e be a faithful abelian projection in zM such that z − e is also a faithful abelian
projection in zM. (A projection is faithful if its central cover is the unit.) As ϕ
preserves commutativity in both directions, and consequently preserves central
covers, we infer that ϕ(e) and w − ϕ(e) are orthogonal faithful abelian projections
in wN. Therefore, wN is of Type I2. Further, ϕ gives an orthoisomorphism between
P(Z(zM)) and P(Z(wN)). Therefore, Z(zM) and Z(wN) are isomorphic as
C∗-algebras (this follows from the fact that any orthoisomorphism between projection
lattices of abelian von Neumann algebras extends to a *-isomorphism). According
to the structure theory of finite homogeneous von Neumann algebras of Type In,
such algebras are ∗-isomorphic if they have ∗-isomorphic centers. This concludes
the proof of the given implication.
Finally, the implication (iv) ⇒ (i) is easy.

5 Choquet Order Structure

In this section we shall present our main results on complete Jordan invariants based
on the Choquet order structure of decompositions of states. Let us first recall basic
definitions. For details on Choquet theory on state spaces we refer the reader to the
monograph [17]. Let X be a compact Hausdorff space. By a Radon measure
μ on X we mean an element of the dual space C(X)∗. By the celebrated Riesz
representation theorem there is a bijection between Radon measures and regular
Borel measures on X. In this correspondence μ ∈ C(X)∗ can be canonically
identified with a regular Borel measure μ on X in the sense of the formula

$\mu(f) = \int_X f(\omega)\, d\mu(\omega), \qquad f \in C(X).$

The set of all positive Radon measures on X will be denoted by M+(X). A
probability Radon measure ν is a positive measure for which ν(X) = 1. The symbol
P(X) will be reserved for the set of all probability Radon measures on X.
Let now K be a non-empty compact convex set in a locally convex Hausdorff
topological vector space E. Let A(K) and CC(K) denote the set of all continuous
affine functions on K and the set of all continuous convex real functions on K,
respectively. Take μ ∈ P(K). A point b(μ) ∈ K is called the barycenter of μ if, for each
a ∈ A(K),

$a(b(\mu)) = \mu(a) = \int_K a(\omega)\, d\mu(\omega).$

Every probability Radon measure admits a (unique) barycenter. As an example,
let us take a finite convex combination of Dirac measures
$\mu = \sum_{i=1}^{n} \lambda_i \delta_{x_i}$, $x_i \in K$.
Then $b(\mu) = \sum_{i=1}^{n} \lambda_i x_i$. A measure μ ∈ P(K) is called representing
for a given point x ∈ K if x is the barycenter of μ. The set of all representing measures
of x will be denoted by Mx(K). Note that the Dirac measure δx is one of the representing
measures for x. Let us recall that convex combinations of Dirac measures are just
probability measures with finite support.
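
The barycenter formula for finitely supported measures is easy to verify numerically. The sketch below assumes K is a triangle in the plane and uses exact rational arithmetic; the weights, the support points and the affine function are arbitrary choices made for illustration.

```python
from fractions import Fraction as Fr


def barycenter(weights, points):
    """Barycenter sum_i lambda_i x_i of a finitely supported probability measure."""
    dim = len(points[0])
    return tuple(sum(w * p[k] for w, p in zip(weights, points)) for k in range(dim))


def integrate(weights, points, func):
    """Integral of func against the measure sum_i lambda_i delta_{x_i}."""
    return sum(w * func(p) for w, p in zip(weights, points))


# Assumed data: three Dirac masses at the vertices of a triangle.
weights = [Fr(1, 2), Fr(1, 3), Fr(1, 6)]
points = [(Fr(0), Fr(0)), (Fr(3), Fr(0)), (Fr(0), Fr(6))]


def affine(p):
    """A sample continuous affine function on the plane."""
    return 2 * p[0] - 5 * p[1] + 7


b = barycenter(weights, points)
value_at_barycenter = affine(b)
integral_of_affine = integrate(weights, points, affine)
```

The two numbers `value_at_barycenter` and `integral_of_affine` coincide, which is exactly the defining identity a(b(μ)) = μ(a) for this particular affine function.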

Let μ and ν be positive Radon measures on a compact convex set K. We define
the relation μ ≺C ν as follows:

μ ≺C ν if μ(f ) ≤ ν(f ) for all f ∈ CC(K) .

It is known that the relation ≺C is a partial order on the set of positive Radon
measures (see e.g. [2, Proposition 4.1.3, p.325], [17, Definition 6.5,p. 233]). The
order ≺C is called the Choquet order.
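
Intuitively, a measure is larger in the Choquet order when its mass is spread further towards the boundary of K. The sketch below assumes K = [−1, 1] and compares three finitely supported measures with barycenter 0 on a small family of convex test functions. The measures, the test functions and the helper names are our illustrative choices; checking finitely many functions can only refute the relation ≺C, never prove it.

```python
import math


def integrate(measure, func):
    """Integrate func against a measure given as (weight, point) pairs."""
    return sum(w * func(x) for w, x in measure)


# Three probability measures on K = [-1, 1], all with barycenter 0.
delta_0 = [(1.0, 0.0)]
mu_inner = [(0.5, -0.5), (0.5, 0.5)]
mu_outer = [(0.5, -1.0), (0.5, 1.0)]

# A finite family of continuous convex functions on K.
convex_tests = {
    "abs": abs,
    "square": lambda x: x * x,
    "exp": math.exp,
    "positive part": lambda x: max(x, 0.0),
    "affine": lambda x: 3 * x - 1,
}


def not_refuted(mu, nu, tol=1e-12):
    """True if mu(f) <= nu(f) holds for every test function f."""
    return all(integrate(mu, f) <= integrate(nu, f) + tol
               for f in convex_tests.values())


chain_holds = not_refuted(delta_0, mu_inner) and not_refuted(mu_inner, mu_outer)
reverse_fails = not not_refuted(mu_outer, delta_0)
```

For these particular measures the inequalities hold for every convex function, not only the tested ones, by Jensen's inequality; the code merely illustrates the definition.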
Now we turn to the situation when K is the state space of a C∗-algebra and establish
some new results for the Choquet order in this situation. Let ϕ be a state on a
C∗-algebra A. The triple (πϕ, ξϕ, Hϕ) will represent the GNS data of ϕ. By Mϕ we shall
denote the von Neumann algebra generated by πϕ(A). Then Mϕ = πϕ(A)′′. Let Cϕ be the
(real) space of all functionals in A∗ spanned by positive functionals dominated by
ϕ. In other words,

Cϕ = span {ψ : 0 ≤ ψ ≤ ϕ} .

It is well known that there is a bijective positive map between Cϕ and πϕ(A)′,
sending each element ψ ∈ Cϕ to an operator aψ ∈ M′ϕ such that, for each a ∈ A,

$\psi(a) = \langle a_\psi \pi_\varphi(a) \xi_\varphi, \xi_\varphi \rangle$

(see e.g. [17, Proposition IV 3.10, p. 201]). Let μ ∈ Mϕ+ (S(A)). Take f ∈
L∞(S(A), μ). Then, according to the previous discussion, there is a unique element
θμ(f) ∈ M′ϕ such that, for each a ∈ A,

$\langle \theta_\mu(f) \pi_\varphi(a) \xi_\varphi, \xi_\varphi \rangle = \int_{S(A)} f(\omega)\, a(\omega)\, d\mu(\omega),$

where a(ω) = ω(a) denotes the evaluation of the state ω at a. The map θμ is a unital
weak∗ to weak∗ continuous map from L∞(S(A), μ) into the
von Neumann algebra M′ϕ (see e.g. [17, Proposition 6.18, p. 238]). The measure
μ ∈ Mϕ+ (S(A)) is called orthogonal if, for each Borel set E ⊂ S(A), the positive
functionals ϕE and ϕS(A)\E on A given by

$\varphi_E(a) = \int_E a(\omega)\, d\mu(\omega), \qquad \varphi_{S(A)\setminus E}(a) = \int_{S(A)\setminus E} a(\omega)\, d\mu(\omega)$

are orthogonal. It is known that μ is an orthogonal measure if and only if θμ is a
∗-isomorphism that maps L∞(S(A), μ) onto the abelian von Neumann subalgebra

Cμ = θμ(L∞(S(A), μ))

of M′ϕ (see [17, Theorem 6.19, p. 239]).


Let us denote by Oϕ(A) the set of all orthogonal measures having barycenter ϕ.
Of course, δϕ ∈ Oϕ(A). Let us denote by $O^{\mathrm{fin}}_\varphi(A)$ the set of all
finitely supported orthogonal measures in Oϕ(A).
As an example, let us look at Oϕ(M2(C)). The state space of M2(C) can be
identified with all matrices of the form

$\frac{1}{2} \begin{pmatrix} 1+\beta_1 & \beta_2 + i\beta_3 \\ \beta_2 - i\beta_3 & 1-\beta_1 \end{pmatrix},$

where (β1, β2, β3) is a point in the three dimensional unit ball. (See also example
after Proposition 2.2.) In this way the state space is affinely isomorphic to the
three dimensional unit ball. For simplicity, let ϕ be the normalized trace. It corresponds
to the origin. Any orthogonal measure on the state space that has barycenter
ϕ, other than the Dirac measure δϕ, is a convex combination of two Dirac measures
concentrated at vector states (pure states) corresponding to orthogonal unit vectors
in C2. When using the ball representation of the state space, we can see that these
orthogonal states correspond to antipodal points on the unit sphere. Apart from δϕ,
the set Oϕ(M2(C)) can then be viewed as a set of measures on the unit ball that are
concentrated at two antipodal points on the unit sphere and that assign mass 1/2 to
each of them.
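
The ball picture can be checked on a concrete pair of antipodal points. The sketch below (pure Python; the unit vector β is an arbitrary choice of ours) builds the matrices attached to β and −β by the formula above, and verifies that they are mutually orthogonal rank-one projections whose equal-weight mixture is the matrix I/2 of the normalized trace.

```python
def state_matrix(b1, b2, b3):
    """Matrix (1/2)[[1+b1, b2+i*b3], [b2-i*b3, 1-b1]] attached to a point of the ball."""
    return [[(1 + b1) / 2, (b2 + 1j * b3) / 2],
            [(b2 - 1j * b3) / 2, (1 - b1) / 2]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def close(a, b, tol=1e-12):
    """Entrywise comparison up to a small tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


beta = (2 / 3, 1 / 3, 2 / 3)          # a point on the unit sphere (4/9 + 1/9 + 4/9 = 1)
antipode = tuple(-t for t in beta)

p = state_matrix(*beta)
q = state_matrix(*antipode)

zero = [[0, 0], [0, 0]]
half_identity = [[0.5, 0], [0, 0.5]]
mixture = [[(p[i][j] + q[i][j]) / 2 for j in range(2)] for i in range(2)]

p_is_projection = close(matmul(p, p), p)
q_is_projection = close(matmul(q, q), q)
orthogonal = close(matmul(p, q), zero)
mixture_is_trace = close(mixture, half_identity)
```

The same computation works for any unit vector β, since the product of the two matrices is a multiple of 1 − |β|².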
Going back to the general situation, we shall denote by Λϕ the map

Λϕ : Oϕ(A) → V(M′ϕ) : μ ↦ Cμ .

One of the basic theorems we shall use in this note is the following Tomita
theorem (see e.g. [17, Prop. 6.23, p. 241, Theorem 6.25, p. 244]). It establishes a
one-to-one correspondence between orthogonal measures and abelian subalgebras
that preserves the Choquet order.
Theorem 5.1 (Tomita Theorem) The map Λϕ : μ ↦ Cμ is a bijection of
Oϕ(A) onto V(M′ϕ). Moreover, the following conditions are equivalent for
μ, ν ∈ Oϕ(A):
(i) μ ≺ ν
(ii) Cμ ⊂ Cν .
In particular, the posets (Oϕ(A), ≺) and (V(M′ϕ), ⊂) are order isomorphic.
In [9] we showed that a discrete version of Tomita’s theorem holds for finitely
supported measures as well.
Theorem 5.2 (Hamhalter and Turilova) The map Λϕ : μ ↦ Cμ is an order
isomorphism of $O^{\mathrm{fin}}_\varphi(A)$ onto Cfin(M′ϕ).

The following theorem has been proved in [10, Theorem 6].


Theorem 5.3 (Hamhalter and Turilova) Let ϕ and ψ be states on C∗-algebras A
and B, respectively. Let one of the following statements be true:
1. M′ϕ is a von Neumann algebra without Type I2 direct summand.
2. Mϕ has no nonzero Type I direct summand.
Then the following two statements hold:
(i) For each order isomorphism F : Oϕ(A) → Oψ(B) there is a unique Jordan
*-isomorphism J : M′ϕ → M′ψ such that

$F(\mu) = \Lambda_\psi^{-1}\bigl(J[\Lambda_\varphi(\mu)]\bigr)$

for each μ ∈ Oϕ(A).
(ii) For each order isomorphism
$F : O^{\mathrm{fin}}_\varphi(A) \to O^{\mathrm{fin}}_\psi(B)$ there is a unique
Jordan ∗-isomorphism J : M′ϕ → M′ψ such that

$F(\mu) = \Lambda_\psi^{-1}\bigl(J[\Lambda_\varphi(\mu)]\bigr)$

for each $\mu \in O^{\mathrm{fin}}_\varphi(A)$.
Proof Let us prove (i); the statement (ii) can be proved analogously. By
Theorem 5.1 we know that the order isomorphism F induces an order isomorphism between
V(M′ϕ) and V(M′ψ). Suppose that M′ϕ has no Type I2 direct summand. By
employing Theorem 4.8, we see that this isomorphism is implemented by a Jordan
*-isomorphism J : M′ϕ → M′ψ. This shows the form of F. By Theorem 9.1.3 in
[13], if a von Neumann algebra is of Type I, then the same holds for its commutant.
Therefore, if Mϕ has no nonzero Type I direct summand, then M′ϕ has no Type I2
direct summand and we can apply the previous reasoning.

The assumption on Type I2 direct summand in the previous theorem is essential
as the following example demonstrates.
Example Let A = M2(C) and let ϕ be a faithful state on A. Then there is a Choquet
order isomorphism on Oϕ(A) that is not induced by any Jordan ∗-isomorphism on
M′ϕ.
Proof We can write ϕ = λ1 ωe1 + λ2 ωe2, where e1, e2 is a standard orthonormal
basis of C2 and λ1 and λ2 are positive numbers with sum one. It can be
verified easily that the GNS data read as follows:

$H_\varphi = \mathbb{C}^2 \oplus \mathbb{C}^2, \qquad \xi_\varphi = \lambda_1^{1/2} e_1 \oplus \lambda_2^{1/2} e_2,$

and πϕ sends a 2 by 2 matrix a to the block 4 by 4 matrix

$\pi_\varphi(a) = \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix}.$

It is clear that πϕ(A)′ consists of all matrices of the form

$\begin{pmatrix} \alpha I & \beta I \\ \gamma I & \delta I \end{pmatrix},$

where α, β, γ, δ ∈ C and I is the identity 2 by 2 matrix. It is apparent that this
algebra is isomorphic to M2(C). Using now Example 4.1 we can find an order
isomorphism of V(πϕ(A)′) that is not implemented by any Jordan ∗-isomorphism.
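
The GNS data used in this proof can be verified on sample values. The sketch below assumes λ1 = 1/4 and λ2 = 3/4 and a sample matrix a (all three are arbitrary choices of ours), represents vectors of C2 ⊕ C2 as lists of length four, and checks both the identity ϕ(a) = ⟨πϕ(a)ξϕ, ξϕ⟩ and the fact that a block matrix of the displayed form commutes with πϕ(a).

```python
import math


def pi(a):
    """GNS representation a -> a (+) a as a 4x4 block-diagonal matrix."""
    z = [0, 0]
    return [a[0] + z, a[1] + z, z + a[0], z + a[1]]


def commutant_element(al, be, ga, de):
    """Block matrix [[al*I, be*I], [ga*I, de*I]] with I the 2x2 identity."""
    return [[al, 0, be, 0], [0, al, 0, be], [ga, 0, de, 0], [0, ga, 0, de]]


def matmul4(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def matvec(x, v):
    return [sum(x[i][k] * v[k] for k in range(4)) for i in range(4)]


def inner(x, y):
    """Inner product, linear in the first argument."""
    return sum(complex(s) * complex(t).conjugate() for s, t in zip(x, y))


lam1, lam2 = 0.25, 0.75                         # assumed weights, summing to one
xi = [math.sqrt(lam1), 0, 0, math.sqrt(lam2)]   # lam1^(1/2) e1 (+) lam2^(1/2) e2

a = [[1 + 2j, 3], [4, 5 - 1j]]                  # sample element of M2(C)
phi_a = lam1 * a[0][0] + lam2 * a[1][1]         # phi = lam1*omega_e1 + lam2*omega_e2

gns_value = inner(matvec(pi(a), xi), xi)

c = commutant_element(2, 1j, -3, 1 + 1j)        # sample element of the commutant
commutes = matmul4(c, pi(a)) == matmul4(pi(a), c)
```

The vector `xi` is cyclic because both weights are nonzero, so that the vectors πϕ(a)ξϕ fill all of C2 ⊕ C2 as a ranges over M2(C); this part is not checked by the code.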

So far the structure of measures with the Choquet order has been identified
with the structure of abelian subalgebras of the commutant arising in the GNS
representation. However, in some important cases we can identify the Choquet order
structure directly with the poset of abelian subalgebras of a given algebra. Let
us consider the following situation. Let ϕ be a faithful normal state on a von
Neumann algebra M. It is known that the GNS representation πϕ is a normal
faithful representation. Therefore, we can identify M with Mϕ and suppose that M,
acting on a Hilbert space H, has a biseparating vector ξ. According to the deep
Tomita-Takesaki modular theory of von Neumann algebras, there is a *-antiisomorphism
between M and M′. This is the content of the celebrated Tomita-Takesaki theorem.
Theorem 5.4 (Tomita-Takesaki Theorem) Let M be a von Neumann algebra
acting on a Hilbert space H and having a biseparating vector ξ ∈ H. Then there is
a conjugate linear isometry J acting on H such that the map

j(x) = Jx∗J, x ∈ M,

is a *-antiisomorphism between M and M′.

As a consequence of the previous theorem combined with the previous discussion,
we obtain that the posets Oϕ(M) and V(M) are isomorphic whenever ϕ is a
faithful normal state on a von Neumann algebra M. Based on this, we can show
that the Choquet order on representing measures whose barycenter is a faithful
normal state is a complete Jordan invariant for σ-finite algebras (see [8]).
Theorem 5.5 Let M and N be (σ-finite) von Neumann algebras with faithful
normal states ϕ and ψ, respectively. The following statements are equivalent.
(i) Oϕ(M) and Oψ(N) are isomorphic.
(ii) $O^{\mathrm{fin}}_\varphi(M)$ and $O^{\mathrm{fin}}_\psi(N)$ are isomorphic.
(iii) M and N are isomorphic as Jordan algebras.
Proof As ϕ is faithful, the representation πϕ is a ∗-isomorphism. Therefore, M is
∗-isomorphic to πϕ(M). Since the algebra πϕ(M) has a separating and generating
vector, by Theorem 5.4 we have that πϕ(M) and πϕ(M)′ are ∗-antiisomorphic
and thereby Jordan ∗-isomorphic. Now by Theorem 4.11 and Theorem 5.1 we can
see that conditions (i) and (ii) are equivalent to the fact that πϕ(M)′ and πψ(N)′ are
Jordan *-isomorphic. However, the previous reasoning tells us that this is equivalent
to (iii).

Acknowledgment This work was supported by the project OPVVV CAAS CZ.02.1.01/0.0/0.0/
16_019/0000778

References

1. J. Barvínek, J. Hamhalter, Linear algebraic proof of Wigner Theorem and its consequences.
Math. Slovaca (2), 67 (2017)
2. O. Bratteli, D.W. Robinson, Operator Algebras and Quantum Statistical Mechanics, vol. 1
(Springer, Berlin, 1997)
3. A. Döring, J. Harding, Abelian subalgebras and the Jordan structure of von Neumann algebras.
Houston J. Math. (2) 42, 559–568 (2016)
4. H.A. Dye, On the geometry of projections in certain operator algebras. Ann. Math. (1) 61,
73–89 (1955)
5. J. Hamhalter, Quantum Measure Theory (Kluwer Academic, Boston, 2003)
6. J. Hamhalter, Isomorphisms of ordered structures of abelian C ∗ -subalgebras of C ∗ -algebras. J.
Math. Anal. Appl. 383, 391–399 (2011)
7. J. Hamhalter, Dye’s Theorem and Gleason’s Theorem for AW*-algebras. J. Math. Anal. Appl.
422, 1103–1115 (2015)
8. J. Hamhalter, E. Turilova, Automorphisms of ordered structures of abelian parts of operator
algebras and their role in quantum theory. Int. J. Theor. Phys. (10) 53, 3333–3345 (2014)
9. J. Hamhalter, E. Turilova, Orthogonal measures on state spaces and context structures of
quantum theory. Int. J. Theor. Phys. 55, 3353–3365 (2016)
10. J. Hamhalter, E. Turilova, Choquet order and Jordan maps. Lobachevskii J. Math. (3) 39,
340–347 (2018)
11. J. Hamhalter, E. Turilova, Jordan invariants of von Neumann algebras given by abelian
subalgebras and Choquet order on state spaces. Int. J. Theor. Phys. (2) 60, 1–11 (2021)
12. C. Heunen, M. Reyes, Active lattices determine AW*-algebras. J. Math. Anal. Appl. 416, 289–
313 (2014)
13. R.V. Kadison, J.R. Ringrose, Theory of Operator Algebras I, II (Academic, New York, 1986)
14. K. Landsman, Foundations of Quantum Theory, From Classical Concepts to Operator
Algebras. Springer Open, Fundamental Theories of Physics, vol. 188 (Springer, Cham, 2017)
15. B. Lindenhovius, C(A). Ph.D. thesis, Radboud University, Nijmegen, 2016
16. R. Simon, N. Mukunda, S. Chaturvedi, V. Srinivasan, J. Hamhalter, Comment on: Two
elementary proofs of the Wigner Theorem on symmetry in quantum mechanics [Phys. Letter
A 327 (2008) 6847]. Phys. Lett. A 378, 2332–2335 (2014)
17. M. Takesaki, Theory of Operator Algebras I, II, III (Springer, Berlin, 2001)
18. E.P. Wigner, Group Theory and Its Application to the Quantum Mechanics of Atomic Spectra
(Academic, New York, 1959)
Part V
Inequalities in Commutative and
Noncommutative Probability Spaces
Mixed Norm Martingale Hardy Spaces
and Applications in Fourier Analysis

Ferenc Weisz

Abstract We consider martingale Hardy spaces defined with the help of the mixed
Lp-norm. Five mixed-norm martingale Hardy spaces will be investigated:
$H_p^s$, $H_p^S$, $H_p^M$, $P_p$ and $Q_p$. We give two different generalizations of
Doob’s maximal inequality for mixed-norm Lp spaces. We also prove two versions of atomic
decompositions. Several martingale inequalities and the generalization of the
well-known Burkholder-Davis-Gundy inequality are also presented. The dual spaces of
the mixed-norm martingale Hardy spaces are given as the mixed-norm BMOr(α)
spaces. This implies the John-Nirenberg inequality BMO1(α) ∼ BMOr(α) for
1 < r < ∞. As an application in Fourier analysis, we verify the boundedness of the
Fejér maximal operator from Hp to Lp, whenever 1/2 < p < ∞. As a consequence
of the boundedness, we get some almost everywhere and norm convergence results.

Keywords Mixed Lebesgue spaces · Mixed normed martingale Hardy spaces ·

Atomic decomposition · Doob’s inequality · Martingale inequalities ·
Burkholder-Davis-Gundy inequality · BMO spaces · John-Nirenberg inequality ·
Walsh system · Fejér means · Fejér maximal operator · Boundedness

1 Introduction

Since 1970, the theory of Hardy spaces has been developed very quickly (see
e.g. Fefferman and Stein [19], Stein [81], Grafakos [34]). Fefferman [18] proved
that the dual space of the Hardy space is equivalent to the space of functions of
bounded mean oscillation (BMO). John and Nirenberg [55] obtained their famous
inequality, i.e., that the BMOp spaces are equivalent. One year later, Fefferman
and Stein [19] characterized the dual space of Hp (0 < p < 1) as a Lipschitz
space. The most powerful technique in the theory of Hardy spaces, the so-called

F. Weisz
Department of Numerical Analysis, Eötvös L. University, Budapest, Hungary

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
R. M. Aron et al. (eds.), Operator and Norm Inequalities and Related Topics,
Trends in Mathematics, https://doi.org/10.1007/978-3-031-02104-6_21

atomic decomposition was given in Coifman and Weiss [14, 15]. Recently several
papers were published about the generalization of Hardy spaces. For example,
Hardy spaces with variable exponents were considered in Nakai and Sawano [68],
Yan, et al. [99], Jiao et al. [54], Liu et al. [63] and [64]. Moreover Musielak-Orlicz-
Hardy spaces were studied in Yang et al. [100]. The mixed norm classical Hardy
spaces have been developed in Cleanthous et al. [11] and intensively studied by
Huang et al. in [41–44, 46].
In parallel, a similar theory evolved for different types of martingale Hardy
spaces $H_p^s$, $H_p^S$, $H_p^M$, $P_p$ and $Q_p$ (see e.g. Garsia [23], Long [65] and Weisz [87]).
In the celebrated work of Burkholder and Gundy [6], it was proved that the $L_p$
norms of the maximal function and the quadratic variation are equivalent, that is, the
spaces $H_p^M$ and $H_p^S$ are equivalent for $1 < p < \infty$. In the same year, Davis [16]
extended this result to $p = 1$. For martingale Hardy spaces, Weisz [87] worked out the theory
of atomic decomposition. Some boundedness results, duality theorems, martingale
inequalities and interpolation results can be proved with the help of the atomic
decomposition. A martingale analogue of H1 -BMO duality can be found in the
books Garsia [23], Long [65] and Weisz [87]. For dyadic martingales, Herz [37]
obtained the dual space of Hp (0 < p < 1). In 1990, Weisz [86] characterized the
dual space of Hp (0 < p < 1) for general martingales via atomic decomposition.
For a regular stochastic basis, the BMOp spaces are equivalent in the martingale
case, too. Recently, these results were extended to more general cases. Jiao et al.
investigated martingale Hardy-Lorentz spaces in [51, 52] and variable martingale
Hardy spaces in [49, 50, 53]. Martingale Musielak–Orlicz Hardy spaces were
investigated in Xie et al. [96–98]. The theory of martingale Hardy spaces can be well
applied in Fourier analysis (see Gát [25, 26], Goginava [30, 31] or Weisz [87, 90]).
The mixed Lebesgue spaces were introduced in 1961 by Benedek and Panzone
[3] (see also Hörmander [40]). They considered the Cartesian product
$(\Omega, \mathcal F, \mathbb P)$ of the probability spaces
$(\Omega^i, \mathcal F^i, \mathbb P^i)$, where $\Omega = \prod_{i=1}^d \Omega^i$,
$\mathcal F$ is generated by $\prod_{i=1}^d \mathcal F^i$ and $\mathbb P$ is generated by
$\prod_{i=1}^d \mathbb P^i$. The mixed $L_{\vec p}$-norm of the measurable
function $f$ is defined as the number obtained after taking successively the $L_{p_1}$-norm
of $f$ in the variable $x_1$, the $L_{p_2}$-norm in the variable $x_2$, ..., the $L_{p_d}$-norm in the
variable $x_d$. Some basic properties of the spaces $L_{\vec p}$ were proved in [3], such as the
well known Hölder’s inequality and the duality theorem. Mixed-norm Lebesgue and
Hardy spaces were investigated in a great number of papers (e.g. in [1, 9–13, 27, 28,
35, 38, 39, 41–44, 46, 48, 56–60, 80]).
In this paper we will introduce five mixed normed martingale Hardy spaces:
$H_{\vec p}^s$, $H_{\vec p}^S$, $H_{\vec p}^M$, $P_{\vec p}$ and $Q_{\vec p}$. In Sect. 3,
Doob’s inequality will be proved, that is, we will show that

\[ \Big\| \sup_{n\in\mathbb N} |E_n f| \Big\|_{\vec p} \le C \|f\|_{\vec p} \]

for all $f \in L_{\vec p}$, where $1 < \vec p < \infty$. We present also another version of Doob’s
inequality. In Sect. 5, we give the atomic decomposition for the five mixed normed

martingale Hardy spaces. Using the atomic decomposition and Doob’s inequality,
several martingale inequalities will be proved in Sect. 6. We will show that, if
the stochastic basis $(\mathcal F_n)$ is regular, then the five martingale Hardy spaces are
equivalent. As a consequence of Doob’s inequality, the generalization of the well-known
Burkholder-Davis-Gundy inequality can be shown. In Sect. 7, we
prove that the dual of $H_{\vec p}^s$ is $BMO_2(\vec\alpha)$ and, if the stochastic basis is regular, then
the dual of $H_{\vec p}^M$ is $BMO_r(\vec\alpha)$, where $0 < \vec p \le 1$, $\vec\alpha = 1/\vec p - 1$ and $1 < r < \infty$.
Consequently, we obtain the generalization of the John and Nirenberg theorem
for mixed normed martingale spaces: if $0 \le \vec\alpha < \infty$ and $1 < r < \infty$, then
$BMO_1(\vec\alpha) = BMO_r(\vec\alpha)$ with equivalent norms.
In the one-dimensional case, Paley [71] (see also Schipp, Wade, and Simon [75]
and Weisz [90]) proved the Lp -norm convergence of the partial sums of the Walsh-
Fourier series of f in case of 1 < p < ∞. There is no convergence result for
p ≤ 1 (see [33, 75]). Using summability methods, such as Fejér means, the L1-norm
convergence can be reached for functions in L1, too. It was proved by Fine
[21] that the Fejér means of the one-dimensional Walsh-Fourier series converge
almost everywhere to the function if f ∈ L1. Schipp [72] obtained the same result
by proving the weak type inequality of the maximal operator σ∗ of the Fejér means.
By interpolation, this implies that σ∗ is bounded on Lp (1 < p < ∞). Next Fujii
[22] extended this and showed that σ∗ is bounded from the dyadic Hardy space H1
to L1 (see also Schipp and Simon [74]). Later the author (see [88]) generalized this
further and proved that σ∗ is bounded from Hp to Lp for 1/2 < p < ∞. The
boundedness does not hold for 0 < p ≤ 1/2 (see [78]).
In the two-dimensional case, Weisz considered the Fejér maximal operator over
a cone and he proved in [89] that σ∗ is bounded from Hp to Lp for 1/2 < p <
∞. Gát [24] and Weisz [89] proved that the Fejér means of the two-dimensional
Walsh-Fourier series converge to the function almost everywhere if we consider
the convergence over the diagonal, or more generally, over a cone. This result was
proved for trigonometric Fourier series by Marcinkiewicz and Zygmund [67] and
Weisz [92]. Similar results were obtained in numerous other papers (see, e.g., Gát
[25, 26] and Goginava [29–31]).
In this paper, we generalize the previous results for mixed normed martingale
Hardy spaces. We will prove that the Fejér maximal operator defined over a cone
is bounded from Hp to Lp (1/2 < p < ∞). As a consequence, we get some
convergence results, such as almost everywhere and norm convergence of the
multi-dimensional Fejér means defined over a cone. This result generalizes the well-
known theorem of Gát [24] and Weisz [89]. Some summability results for classical
mixed norm Hardy spaces Hp (Rd ) and for Fourier transforms can be found in
[45, 93].
We denote by C a positive constant, which can vary from line to line, and denote
by Cp a constant depending only on p. The symbol A ∼ B means that there exist
constants α, β > 0 such that αA ≤ B ≤ βA, and A ≲ B means that there exists
C > 0 such that A ≤ CB.

2 Mixed Lebesgue Spaces

For $1 \le d \in \mathbb N$ and $i = 1, \dots, d$, let $(\Omega^i, \mathcal F^i, \mathbb P^i)$ be
probability spaces and $\vec p := (p_1, \dots, p_d)$ with $0 < p_i \le \infty$. Consider the product
space $(\Omega, \mathcal F, \mathbb P)$, where $\Omega = \prod_{i=1}^d \Omega^i$, the σ-algebra
$\mathcal F$ is generated by $\prod_{i=1}^d \mathcal F^i$ and the probability measure
$\mathbb P$ is generated by $\prod_{i=1}^d \mathbb P^i$. For a constant $p$, the $L_p$ space is equipped with the
quasi-norm

\[ \|f\|_p := \Big( \int_\Omega |f(x)|^p \, d\mathbb P(x) \Big)^{1/p} \qquad (0 < p < \infty), \]

with the usual modification for $p = \infty$, where $x = (x_1, \dots, x_d)$. We generalize this
space as follows. A measurable function $f : \Omega \to \mathbb R$ belongs to the mixed $L_{\vec p}$ space
if

\[ \|f\|_{\vec p} := \Big\| \dots \big\| f \big\|_{L_{p_1}(dx_1)} \dots \Big\|_{L_{p_d}(dx_d)}
   = \Big( \int_{\Omega^d} \dots \Big( \int_{\Omega^1} |f(x_1, \dots, x_d)|^{p_1} \, d\mathbb P^1(x_1) \Big)^{p_2/p_1} \dots \, d\mathbb P^d(x_d) \Big)^{1/p_d} \]

is finite, with the usual modification if $p_j = \infty$ for some $j \in \{1, \dots, d\}$. If for
some $0 < p \le \infty$, $\vec p = (p, \dots, p)$, then we get back the classical Lebesgue space
$L_p$. By $r < \vec p \le q$ we mean that $r < p_i \le q$ for all $i = 1, \dots, d$, where
$0 \le r < q \le \infty$. For a vector $\vec p$, we will use the notation

\[ p_- := \min \{p_1, \dots, p_d\}. \]

The conjugate exponent vector of $\vec p$ will be denoted by $(\vec p)'$, that is,
$(\vec p)' = (p_1', \dots, p_d')$, where $1/p_i + 1/p_i' = 1$ ($i = 1, \dots, d$). For $\alpha > 0$,
$\vec p/\alpha := (p_1/\alpha, \dots, p_d/\alpha)$. Benedek and Panzone [3] proved the next two basic results for
the mixed Lebesgue space.
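The iterated definition of the mixed norm can be tried out numerically on a finite product probability space. The following is a minimal sketch, not taken from the source: the function name `mixed_norm` and the representation of the measures as weight vectors are assumptions made for illustration only.

```python
import numpy as np

def mixed_norm(f, weights, p):
    """Mixed L_p (quasi-)norm of an array f on a finite product probability space.

    f       : d-dimensional array, f[i1, ..., id] = f(x_1, ..., x_d)
    weights : list of d probability vectors (stand-ins for P^1, ..., P^d)
    p       : list of d exponents, each in (0, inf]
    """
    g = np.abs(np.asarray(f, dtype=float))
    # Integrate out x_1 first, then x_2, ...; axis 0 is always the current variable.
    for w, pi in zip(weights, p):
        w = np.asarray(w, dtype=float)
        if np.isinf(pi):
            # supremum over the points of positive mass ("usual modification")
            g = np.max(g[w > 0], axis=0)
        else:
            g = np.tensordot(w, g ** pi, axes=(0, 0)) ** (1.0 / pi)
    return float(g)
```

For a constant exponent vector $(p, \dots, p)$ the function returns the classical $L_p$ norm, and for $\vec p \ge 1$ it can be used to test Hölder’s inequality of Theorem 2.1 on small examples.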
Theorem 2.1 If $1 \le \vec p \le \infty$, then for all $f \in L_{\vec p}$ and $g \in L_{(\vec p)'}$,

\[ \int_\Omega |fg| \, d\mathbb P \le \|f\|_{\vec p} \, \|g\|_{(\vec p)'}. \]

Moreover,

\[ \|f\|_{\vec p} = \sup_{\|g\|_{(\vec p)'} \le 1} \Big| \int_\Omega fg \, d\mathbb P \Big|. \]

Similarly to the Lebesgue spaces, the following result holds for the dual of $L_{\vec p}$.

Theorem 2.2 If $1 < \vec p < \infty$, then

\[ (L_{\vec p})^* = L_{(\vec p)'} \]

with equivalent norms.

3 Doob’s Inequality

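Suppose that $\mathcal F_n^i \subset \mathcal F^i$ ($n \in \mathbb N$, $i = 1, \dots, d$) are σ-algebras,
$(\mathcal F_n^i)_{n\in\mathbb N}$ is increasing
and $\mathcal F^i = \sigma(\cup_{n\in\mathbb N} \mathcal F_n^i)$. Let
$\mathcal F_n = \sigma(\prod_{i=1}^d \mathcal F_n^i)$. The expectation and conditional
expectation operators relative to $\Omega$, $\Omega^i$, $\mathcal F_n$ and $\mathcal F_n^i$ are denoted by
$E$, $E^i$, $E_n$ and $E_n^i$ ($i = 1, \dots, d$, $n \in \mathbb N$), respectively. Obviously,
$E_n f = E_n^1 \circ \dots \circ E_n^d f$. An
integrable sequence $f = (f_n)_{n\in\mathbb N}$ is called a martingale if
(i) $(f_n)_{n\in\mathbb N}$ is adapted, that is, for all $n \in \mathbb N$, $f_n$ is $\mathcal F_n$-measurable;
(ii) $E_n f_m = f_n$ in case $n \le m$.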
Definition 3.1 The stochastic basis (Fn ) is said to be regular, if there exists R > 0
such that for all nonnegative martingales (fn ),

fn ≤ Rfn−1 .

If for all $n \in \mathbb N$, $f_n \in L_{\vec p}$, then $f$ is called an $L_{\vec p}$-martingale. Moreover, if

\[ \|f\|_{\vec p} := \sup_{n\in\mathbb N} \|f_n\|_{\vec p} < \infty, \]

then $f$ is an $L_{\vec p}$-bounded martingale, briefly $f \in L_{\vec p}$. We define Doob’s maximal
function by

\[ M(f) := \sup_{n\in\mathbb N} |f_n|. \]

Of course,

\[ M(f) \le M_d \circ M_{d-1} \circ \dots \circ M_1(f), \]

where, for any $f \in L_1$ and $i = 1, \dots, d$,

\[ M_i(f) := \sup_{n\in\mathbb N} |E_n^i f|. \]

Doob’s inequality is well known:

Theorem 3.2 If $1 < p < \infty$ and $f \in L_p$, then

\[ \|M(f)\|_p \le C_p \|f\|_p. \]
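Theorem 3.2 can be illustrated numerically in the simplest setting. The sketch below is an assumption-laden illustration rather than part of the source: it uses the one-dimensional dyadic filtration on $[0, 1)$ with a function that is constant on the dyadic intervals of rank $N$, and it uses the classical constant $p/(p-1)$ for $C_p$. All function names are hypothetical.

```python
import numpy as np

def dyadic_conditional_expectations(values):
    """Return [E_0 f, E_1 f, ..., E_N f] for f given by 2**N values on the
    dyadic intervals of rank N in [0, 1) (all of equal Lebesgue measure)."""
    values = np.asarray(values, dtype=float)
    N = int(np.log2(values.size))
    assert values.size == 2 ** N, "length must be a power of two"
    out = []
    for n in range(N + 1):
        block = 2 ** (N - n)                      # points per dyadic interval of rank n
        means = values.reshape(-1, block).mean(axis=1)
        out.append(np.repeat(means, block))       # E_n f, constant on rank-n intervals
    return out

def doob_maximal(values):
    """M(f) = sup_n |E_n f| for the finite dyadic filtration."""
    return np.max(np.abs(dyadic_conditional_expectations(values)), axis=0)

def lp_norm(values, p):
    """L_p norm with respect to the uniform measure on the sample points."""
    return float(np.mean(np.abs(values) ** p) ** (1.0 / p))
```

Comparing `lp_norm(doob_maximal(f), p)` with `p / (p - 1) * lp_norm(f, p)` for a few random vectors `f` gives a quick sanity check of the inequality.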

There is also a weak type inequality for p = 1; since we do not use it, we omit it.

Theorem 3.2 can be found in Doob [17] and Burkholder and Gundy
[5, 6] (see also Garsia [23], Long [65] or Weisz [87]) and for the classical Hardy-Littlewood
maximal operator in Stein [81]. Now we generalize this inequality. To
this end, we have to use the following result (proved in [84]), which is interesting in
itself and is a crucial point in the proof of Theorem 3.4.
Theorem 3.3 Let $\varphi$ be a positive function. Then for all $1 < r < \infty$, we have

\[ \int_\Omega |M(f)|^r \varphi \, d\mathbb P \le C_r \int_\Omega |f|^r M(\varphi) \, d\mathbb P. \]

The first generalization of Doob’s inequality is

Theorem 3.4 ([84]) Suppose that $1 < \vec p < \infty$ or

\[ \vec p = (\infty, \infty, \dots, \infty, p_{k+1}, \dots, p_d), \qquad 1 < p_{k+1}, \dots, p_d < \infty \tag{1} \]

for some $k \in \{1, \dots, d\}$. Then, for all $f \in L_{\vec p}$,

\[ \|M_d(f)\|_{\vec p} \le C \|f\|_{\vec p}. \]

For $1 < \vec p < \infty$ and for the classical Hardy-Littlewood inequality, Theorem 3.4
was shown in Bagby [2]. This theorem easily implies the next generalization of
Doob’s inequality, that is to say, the maximal operator $M$ is bounded on $L_{\vec p}$ in case
$1 < \vec p < \infty$ (see [84]).
Theorem 3.5 Under the same conditions as in Theorem 3.4, for all $f \in L_{\vec p}$,

\[ \|M(f)\|_{\vec p} \le C \|f\|_{\vec p}. \]

Proof Since $Mf \le M_d \circ \dots \circ M_1 f$, it follows from Theorem 3.4 that

\[ \|M(f)\|_{\vec p} \le \|M_d \circ M_{d-1} \circ \dots \circ M_1 f\|_{\vec p}
   \le C \|M_{d-1} \circ \dots \circ M_1 f\|_{\vec p}
   \le C \|M_{d-2} \circ \dots \circ M_1 f\|_{\vec p} \le \dots \le C \|f\|_{\vec p} \]

and the proof is complete.

If $\vec p = (p, \dots, p)$ is constant, then we get back Theorem 3.2. Note that this theorem is not
true for all $1 < \vec p \le \infty$ (see [84]). The counterexample in [84] proves also that
$M_2$ is not bounded on $L_{(p_1,\infty)}$ ($1 < p_1 < \infty$). Moreover, the classical Hardy-Littlewood
maximal operator considered in Huang et al. [41] is not bounded on
$L_{(p_1,\infty)}$ (cf. Lemma 3.5 in [41] and Lemma 4.8 in [69]). A weighted version of
Doob’s inequality can be found in Chen et al. [9].
We can easily modify the definition of the maximal operator. For a constant $q$
and $f \in L_q$, let

\[ M_q(f) := \sup_{n\in\mathbb N} \big( E_n(|f|^q) \big)^{1/q}. \]

The next result immediately follows from Theorem 3.2.

Theorem 3.6 If $0 < q < p < \infty$ and $f \in L_p$, then

\[ \|M_q(f)\|_p \le C_p \|f\|_p. \]
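One way to see why Theorem 3.6 follows from Theorem 3.2 is the following short computation (a sketch added for illustration, not taken from the source). Since $M_q(f) = M(|f|^q)^{1/q}$ and $p/q > 1$, Theorem 3.2 can be applied to $|f|^q \in L_{p/q}$:

```latex
\begin{align*}
\|M_q(f)\|_p
  &= \big\| M(|f|^q) \big\|_{p/q}^{1/q}
   \le \Big( C_{p/q} \, \big\| |f|^q \big\|_{p/q} \Big)^{1/q}
   = C_{p/q}^{1/q} \, \|f\|_p .
\end{align*}
```

Here the constant $C_{p/q}^{1/q}$ plays the role of $C_p$ in the statement; it depends on $q$ as well as on $p$.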

The generalization to mixed norm spaces is much more complicated. Let us
introduce the new maximal function

\[ M_{\vec q}(f) := \sup_{n\in\mathbb N} \Big( E_n^d \Big( E_n^{d-1} \cdots \Big( E_n^2 \big( E_n^1 |f|^{q_1} \big)^{q_2/q_1} \Big)^{q_3/q_2} \cdots \Big)^{q_d/q_{d-1}} \Big)^{1/q_d}, \]

where $0 < \vec q < \infty$. Now, we can show that under some conditions, this operator is
bounded on $L_{\vec p}$, too.
Theorem 3.7 ([94]) Let $0 < \vec q < \infty$ and $0 < \vec p < \infty$ or

\[ \vec p = (\infty, \infty, \dots, \infty, p_{k+1}, \dots, p_d), \qquad 0 < p_{k+1}, \dots, p_d < \infty \]

for some $k \in \{1, \dots, d\}$. Suppose that

\[ \begin{cases}
p_1 > q_1, q_2, \dots, q_d, \\
p_2 > q_2, \dots, q_d, \\
\quad \cdots \\
p_d > q_d.
\end{cases} \]

Then, for all $f \in L_{\vec p}$,

\[ \|M_{\vec q}(f)\|_{\vec p} \le C \|f\|_{\vec p}. \]

This theorem was proved in [45] for the classical maximal function (for a part of
this theorem see also [70]).

4 Mixed Martingale Hardy Spaces

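For $n \in \mathbb N$ and a martingale $f = (f_n)_{n\in\mathbb N}$, the martingale differences are defined by

\[ d_n f := f_n - f_{n-1}, \qquad f_0 := f_{-1} := 0. \]

The map $\nu : \Omega \to \mathbb N \cup \{\infty\}$ is called a stopping time relative to $(\mathcal F_n)$ if for all
$n \in \mathbb N$, $\{\nu = n\} \in \mathcal F_n$. For a martingale $f = (f_n)$ and a stopping time $\nu$, the stopped
martingale is defined by

\[ f_n^\nu = \sum_{m=0}^{n} d_m f \, \chi_{\{\nu \ge m\}}. \]

Let us define the quadratic variation and the conditional quadratic variation of a
martingale $f$ relative to $(\Omega, \mathcal F, \mathbb P, (\mathcal F_n)_{n\in\mathbb N})$ by

\[ S_m(f) := \Big( \sum_{n=0}^{m} |d_n f|^2 \Big)^{1/2}, \qquad S(f) := \Big( \sum_{n=0}^{\infty} |d_n f|^2 \Big)^{1/2}, \]

\[ s_m(f) := \Big( \sum_{n=0}^{m} E_{n-1} |d_n f|^2 \Big)^{1/2}, \qquad s(f) := \Big( \sum_{n=0}^{\infty} E_{n-1} |d_n f|^2 \Big)^{1/2}. \]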
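The two square functions can be computed explicitly for a finite dyadic martingale. The sketch below is an illustration under stated assumptions, not part of the source: it works with the one-dimensional dyadic filtration on $[0, 1)$, takes $f_n = E_n f$ for a mean-zero $f$ that is constant on dyadic intervals of rank $N$ (so that $f_0 = 0$ as in the convention above), and uses hypothetical helper names.

```python
import numpy as np

def cond_exp(values, n):
    """E_n for the dyadic filtration on [0, 1); values has length 2**N, n <= N."""
    values = np.asarray(values, dtype=float)
    block = values.size // 2 ** n
    return np.repeat(values.reshape(-1, block).mean(axis=1), block)

def variations(f):
    """Return (S(f), s(f)) for the dyadic martingale f_n = E_n f, n = 0, ..., N.

    Assumes E_0 f = 0 (mean zero), matching the convention f_0 := 0, so the
    sums effectively start at n = 1.
    """
    f = np.asarray(f, dtype=float)
    N = int(np.log2(f.size))
    S2 = np.zeros_like(f)
    s2 = np.zeros_like(f)
    for n in range(1, N + 1):
        d = cond_exp(f, n) - cond_exp(f, n - 1)    # martingale difference d_n f
        S2 += d ** 2                               # |d_n f|^2
        s2 += cond_exp(d ** 2, n - 1)              # E_{n-1} |d_n f|^2
    return np.sqrt(S2), np.sqrt(s2)
```

In this dyadic setting $|d_n f|$ is constant on each interval of rank $n - 1$, so the two outputs coincide; for general filtrations $S(f)$ and $s(f)$ differ.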

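The set of the sequences $(\lambda_n)_{n\in\mathbb N}$ of non-decreasing, non-negative and adapted
functions with $\lambda_\infty := \lim_{n\to\infty} \lambda_n$ is denoted by $\Lambda$. With the help of the previous
operators, we introduce five mixed normed martingale Hardy spaces as follows:

\begin{align*}
H_{\vec p}^M &:= \big\{ f = (f_n)_{n\in\mathbb N} : \|f\|_{H_{\vec p}^M} := \|M(f)\|_{\vec p} < \infty \big\}; \\
H_{\vec p}^S &:= \big\{ f = (f_n)_{n\in\mathbb N} : \|f\|_{H_{\vec p}^S} := \|S(f)\|_{\vec p} < \infty \big\}; \\
H_{\vec p}^s &:= \big\{ f = (f_n)_{n\in\mathbb N} : \|f\|_{H_{\vec p}^s} := \|s(f)\|_{\vec p} < \infty \big\}; \\
Q_{\vec p} &:= \big\{ f = (f_n)_{n\in\mathbb N} : \exists\, (\lambda_n)_{n\in\mathbb N} \in \Lambda \text{ such that } S_n(f) \le \lambda_{n-1},\ \lambda_\infty \in L_{\vec p} \big\}, \\
P_{\vec p} &:= \big\{ f = (f_n)_{n\in\mathbb N} : \exists\, (\lambda_n)_{n\in\mathbb N} \in \Lambda \text{ such that } |f_n| \le \lambda_{n-1},\ \lambda_\infty \in L_{\vec p} \big\}.
\end{align*}

Define

\[ \|f\|_{Q_{\vec p}} := \inf_{(\lambda_n)\in\Lambda} \|\lambda_\infty\|_{\vec p}, \qquad
   \|f\|_{P_{\vec p}} := \inf_{(\lambda_n)\in\Lambda} \|\lambda_\infty\|_{\vec p}. \]

For a constant $p$, we get back the well known martingale Hardy spaces $H_p^M$, $H_p^S$,
$H_p^s$, $Q_p$ and $P_p$ investigated exhaustively in [87].
The following corollary comes from Theorem 3.5. It is well-known for martingale
Hardy spaces with $\vec p = (p, \dots, p)$ (see e.g. [87]).
Corollary 4.1 If $1 < \vec p < \infty$ or $\vec p$ satisfies (1), then $H_{\vec p}^M$ is equivalent to $L_{\vec p}$.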

5 Atomic Decomposition

In this section, we consider two atomic characterizations of mixed Hardy spaces.

The atomic decomposition is a useful characterization of the Hardy spaces, with the
help of which some inequalities, duality theorems and boundedness results can be
proved. The next atomic decomposition of martingale Hardy spaces with constant $p$
was proved by Herz [36] and the author [87]. For classical Hardy spaces see Latter
[61], Lu [66], Stein [81] and Weisz [92].
Definition 5.1 A measurable function $a$ is called an $(s, \vec p, \infty)$-atom if there exists
a stopping time $\tau$ such that
(i) $E_n a = 0$ for all $n \le \tau$,
(ii) $\big\| s(a) \chi_{\{\tau<\infty\}} \big\|_\infty \le \dfrac{1}{\|\chi_{\{\tau<\infty\}}\|_{\vec p}}$.

If $s(a)$ in (ii) is replaced by $S(a)$ (resp. $M(a)$), then the function $a$ is called an
$(S, \vec p, \infty)$-atom (resp. $(M, \vec p, \infty)$-atom).
If $\vec p = (p, \dots, p)$ is constant, then (ii) reads as follows:

\[ \big\| s(a) \chi_{\{\tau<\infty\}} \big\|_\infty \le \mathbb P(\tau < \infty)^{-1/p}. \]

Every function from the Hardy space $H_p^s$ ($0 < p \le 1$) can be decomposed into the
sum of atoms.
Theorem 5.2 ([87]) Let $p$ be a constant with $0 < p \le 1$. A martingale
$f = (f_n)_{n\in\mathbb N} \in H_p^s$ if and only if there exist a sequence $(a^k)_{k\in\mathbb Z}$ of $(s, p, \infty)$-atoms and
a sequence $(\mu_k)_{k\in\mathbb Z}$ of real numbers such that

\[ f_n = \sum_{k\in\mathbb Z} \mu_k E_n a^k \quad \text{a.e.} \quad (n \in \mathbb N) \]

and

\[ \|f\|_{H_p^s} \sim \inf \Big( \sum_{k\in\mathbb Z} \mu_k^p \Big)^{1/p}, \tag{2} \]

where the infimum is taken over all decompositions of $f$ as above.

In the present form the theorem does not hold for $1 < p < \infty$ and it cannot be
extended to mixed norm Hardy spaces. It is easy to see that for $0 < p \le 1$, (2) can
be written as

\[ \|f\|_{H_p^s} \sim \inf \Big( \sum_{k\in\mathbb Z} \mu_k^p \Big)^{1/p}
   = \inf \bigg\| \bigg( \sum_{k\in\mathbb Z} \Big( \frac{\mu_k \chi_{\{\tau_k<\infty\}}}{\|\chi_{\{\tau_k<\infty\}}\|_p} \Big)^p \bigg)^{1/p} \bigg\|_p. \]

Writing the $\vec p$-norm instead of the $p$-norm, we can generalize this form of the atomic
decomposition to mixed norm Hardy spaces (see [84]).
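The identity behind this rewriting can be checked directly. The following short verification is added for illustration and is not taken from the source; it assumes $\mu_k \ge 0$, considers only those $k$ with $\mathbb P(\tau_k < \infty) > 0$, and uses the shorthand $\chi_k := \chi_{\{\tau_k<\infty\}}$. Since $\|\chi_k\|_p^p = \mathbb P(\tau_k < \infty)$,

```latex
\begin{align*}
\bigg\| \bigg( \sum_{k} \Big( \frac{\mu_k \chi_k}{\|\chi_k\|_p} \Big)^p \bigg)^{1/p} \bigg\|_p^p
  &= \int_\Omega \sum_{k} \frac{\mu_k^p \, \chi_k}{\mathbb P(\tau_k < \infty)} \, d\mathbb P
   = \sum_{k} \mu_k^p .
\end{align*}
```

Taking $p$-th roots gives the stated equality for each fixed decomposition, hence also for the infima.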
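Theorem 5.3 Let $0 < \vec p < \infty$. A martingale $f = (f_n)_{n\in\mathbb N} \in H_{\vec p}^s$ if and only if
there exist a sequence $(a^k)_{k\in\mathbb Z}$ of $(s, \vec p, \infty)$-atoms and a sequence $(\mu_k)_{k\in\mathbb Z}$ of real
numbers such that

\[ f_n = \sum_{k\in\mathbb Z} \mu_k E_n a^k \quad \text{a.e.} \quad (n \in \mathbb N) \tag{3} \]

and

\[ \|f\|_{H_{\vec p}^s} \sim \inf \bigg\| \bigg( \sum_{k\in\mathbb Z} \Big( \frac{\mu_k \chi_{\{\tau_k<\infty\}}}{\|\chi_{\{\tau_k<\infty\}}\|_{\vec p}} \Big)^t \bigg)^{1/t} \bigg\|_{\vec p}, \]

where $0 < t \le \min\{p_-, 1\}$ and the infimum is taken over all decompositions of the
form (3).
If we replace the space $H_{\vec p}^s$ by $P_{\vec p}$ (resp. by $Q_{\vec p}$) and the $(s, \vec p, \infty)$-atoms by
$(M, \vec p, \infty)$-atoms (resp. by $(S, \vec p, \infty)$-atoms), then the theorem holds, too.
If the stochastic basis $(\mathcal F_n)$ is regular and $0 < t < \min\{p_-, 1\}$, then the same
holds for the space $H_{\vec p}^M$ as for $P_{\vec p}$ and the same for $H_{\vec p}^S$ as for $Q_{\vec p}$.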

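Proof We will sketch the proof for $H_{\vec p}^s$ only. Assume that $f \in H_{\vec p}^s$ and let us define
the following stopping times:

\[ \tau_k := \inf \big\{ n \in \mathbb N : s_{n+1}(f) > 2^k \big\}. \]

Obviously $f_n$ can be written in the form

\[ f_n = \sum_{k\in\mathbb Z} \big( f_n^{\tau_{k+1}} - f_n^{\tau_k} \big) = \sum_{k\in\mathbb Z} \mu_k a_n^k, \]

where

\[ \mu_k := 3 \cdot 2^k \, \|\chi_{\{\tau_k<\infty\}}\|_{\vec p} \qquad \text{and} \qquad
   a_n^k := \frac{f_n^{\tau_{k+1}} - f_n^{\tau_k}}{\mu_k}. \]

Moreover, there exists $a^k \in L_2$ such that $E_n a^k = a_n^k$. Because of
$s(f^{\tau_k}) = s_{\tau_k}(f) \le 2^k$, we have that

\[ s(a^k) \le \frac{s(f^{\tau_{k+1}}) + s(f^{\tau_k})}{\mu_k} \le \|\chi_{\{\tau_k<\infty\}}\|_{\vec p}^{-1}, \]

thus $a^k$ is an $(s, \vec p, \infty)$-atom.
Since

\[ \lim_{k\to\infty} s(f - f^{\tau_k}) = \lim_{k\to-\infty} s(f^{\tau_k}) = 0 \]

almost everywhere, by the dominated convergence theorem (see e.g. [3]) we get that

\[ \Big\| f - \sum_{k=-l}^{m} \mu_k a^k \Big\|_{H_{\vec p}^s}
   \le \big\| f - f^{\tau_{m+1}} \big\|_{H_{\vec p}^s} + \big\| f^{\tau_{-l}} \big\|_{H_{\vec p}^s} \to 0 \]

as $l, m \to \infty$. From this it follows that

\[ f = \sum_{k\in\mathbb Z} \mu_k a^k \quad \text{in the } H_{\vec p}^s\text{-norm.} \]

Denote by

\[ O_k := \{\tau_k < \infty\} = \big\{ s(f) > 2^k \big\}. \]

Then for all $k \in \mathbb Z$, $O_{k+1} \subset O_k$. Moreover, for all $x \in \Omega$ and for all $0 < t \le 1$,

\[ \sum_{k\in\mathbb Z} \big( 3 \cdot 2^k \chi_{O_k}(x) \big)^t \le C \Big( \sum_{k\in\mathbb Z} 3 \cdot 2^k \chi_{O_k \setminus O_{k+1}}(x) \Big)^t. \]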
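A short justification of this pointwise inequality runs as follows (a sketch added for illustration; the explicit constant is our own bookkeeping and does not appear in the source). If $x$ lies in no $O_k$, both sides vanish, and the points with $s(f)(x) = \infty$ form a null set for $f \in H_{\vec p}^s$. Otherwise $x \in O_j \setminus O_{j+1}$ for exactly one $j \in \mathbb Z$, so that $\chi_{O_k}(x) = 1$ precisely for $k \le j$, and a geometric sum gives

```latex
\begin{align*}
\sum_{k\in\mathbb Z} \big( 3 \cdot 2^k \chi_{O_k}(x) \big)^t
  &= \sum_{k \le j} 3^t \, 2^{kt}
   = \frac{3^t \, 2^{jt}}{1 - 2^{-t}}
   = \frac{1}{1 - 2^{-t}} \Big( \sum_{k\in\mathbb Z} 3 \cdot 2^k \chi_{O_k \setminus O_{k+1}}(x) \Big)^t .
\end{align*}
```

Thus one admissible choice is $C = (1 - 2^{-t})^{-1}$, which depends only on $t$.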

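Since the sets $O_k \setminus O_{k+1}$ are disjoint, we have

\begin{align*}
\bigg\| \bigg( \sum_{k\in\mathbb Z} \Big( \frac{\mu_k \chi_{\{\tau_k<\infty\}}}{\|\chi_{\{\tau_k<\infty\}}\|_{\vec p}} \Big)^t \bigg)^{1/t} \bigg\|_{\vec p}
  &= \bigg\| \bigg( \sum_{k\in\mathbb Z} \big( 3 \cdot 2^k \chi_{\{\tau_k<\infty\}} \big)^t \bigg)^{1/t} \bigg\|_{\vec p} \\
  &\le C \Big\| \sum_{k\in\mathbb Z} 3 \cdot 2^k \chi_{O_k \setminus O_{k+1}} \Big\|_{\vec p} \\
  &\le C \Big\| \sum_{k\in\mathbb Z} s(f) \chi_{O_k \setminus O_{k+1}} \Big\|_{\vec p} \\
  &= C \, \|s(f)\|_{\vec p}.
\end{align*}

Conversely, if $f$ has a decomposition of the form (3), then

\[ s(f) \le \sum_{k\in\mathbb Z} \mu_k s(a^k) \le \sum_{k\in\mathbb Z} \mu_k \frac{\chi_{\{\tau_k<\infty\}}}{\|\chi_{\{\tau_k<\infty\}}\|_{\vec p}}, \]

and so for all $0 < t \le 1$,

\[ \|f\|_{H_{\vec p}^s} \le \bigg\| \sum_{k\in\mathbb Z} \mu_k \frac{\chi_{\{\tau_k<\infty\}}}{\|\chi_{\{\tau_k<\infty\}}\|_{\vec p}} \bigg\|_{\vec p}
   \le \bigg\| \bigg( \sum_{k\in\mathbb Z} \Big( \frac{\mu_k \chi_{\{\tau_k<\infty\}}}{\|\chi_{\{\tau_k<\infty\}}\|_{\vec p}} \Big)^t \bigg)^{1/t} \bigg\|_{\vec p}, \]

which proves the theorem.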

It follows from the proof of this theorem that

\[ \|f\|_{H_{\vec p}^s} \sim \inf \bigg\| \bigg( \sum_{k\in\mathbb Z} \big( 3 \cdot 2^k \chi_{\{\tau_k<\infty\}} \big)^t \bigg)^{1/t} \bigg\|_{\vec p}, \tag{4} \]

where the infimum is taken over all atomic decompositions of the form (3). There
are also corresponding equivalences for the other Hardy spaces. From Theorem 5.3,
we get immediately the next corollary.
Corollary 5.4 If the stochastic basis $(\mathcal F_n)$ is regular, then

\[ H_{\vec p}^S = Q_{\vec p} \quad \text{and} \quad H_{\vec p}^M = P_{\vec p} \qquad (0 < \vec p < \infty) \]

with equivalent quasi-norms.

For the duality results in Sect. 7, we need a finer atomic decomposition. For this,
we assume that every σ-algebra $\mathcal F_n$ is generated by countably many atoms. We
denote by $A(\mathcal F_n)$ the set of all atoms in $\mathcal F_n$. We introduce the concept of simple
atoms.
Definition 5.5 Suppose that every σ-algebra $\mathcal F_n$ is generated by countably many
atoms and $1 < \vec r \le \infty$. A measurable function $a$ is called a simple $(s, \vec p, \vec r)$-atom if
there exist $j \in \mathbb N$, $I \in A(\mathcal F_j)$ such that
(i) the support of $a$ is contained in $I$,
(ii) $\|s(a)\|_{\vec r} \le \dfrac{\|\chi_I\|_{\vec r}}{\|\chi_I\|_{\vec p}}$,
(iii) $E_j(a) = 0$.

If $s(a)$ in (ii) is replaced by $S(a)$ (resp. $M(a)$), then the function $a$ is called a simple
$(S, \vec p, \vec r)$-atom (resp. simple $(M, \vec p, \vec r)$-atom).
The atomic decomposition via simple $(s, \vec p, \vec r)$-atoms is more complicated than
the atomic decomposition via $(s, \vec p, \infty)$-atoms. For this, we need the condition that
every σ-algebra is generated by countably many atoms.
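Theorem 5.6 ([94]) Suppose that every σ-algebra $\mathcal F_n$ is generated by countably
many atoms. Let $1 < \vec r \le \infty$ and

\[ \begin{cases}
p_1 < r_1, r_2, \dots, r_d, \\
p_2 < r_2, \dots, r_d, \\
\quad \cdots \\
p_d < r_d.
\end{cases} \]

A martingale $f = (f_n)_{n\in\mathbb N} \in H_{\vec p}^s$ if and only if there exist a sequence $(a^{k,j,i})_{k,j,i}$ of
simple $(s, \vec p, \vec r)$-atoms associated with $(I_{k,j,i})_{k,j,i} \subset A(\mathcal F_j)$, which are disjoint for
fixed $k$, and a sequence $(\mu_{k,j,i})_{k\in\mathbb Z, j\in\mathbb N, i}$ of positive real numbers such that

\[ f_n = \sum_{k\in\mathbb Z} \sum_{j=0}^{\infty} \sum_{i} \mu_{k,j,i} E_n a^{k,j,i} \quad \text{a.e.} \quad (n \in \mathbb N) \tag{5} \]

and

\[ \|f\|_{H_{\vec p}^s} \sim \inf \bigg\| \bigg( \sum_{k\in\mathbb Z} \Big( \sum_{j=0}^{\infty} \sum_{i} \frac{\mu_{k,j,i} \chi_{I_{k,j,i}}}{\|\chi_{I_{k,j,i}}\|_{\vec p}} \Big)^t \bigg)^{1/t} \bigg\|_{\vec p}, \]

where $0 < t < \min\{p_-, 1\}$ and the infimum is taken over all decompositions of the
form (5).
If we replace the space $H_{\vec p}^s$ by $P_{\vec p}$ (resp. by $Q_{\vec p}$) and the simple $(s, \vec p, \vec r)$-atoms
by simple $(M, \vec p, \vec r)$-atoms (resp. by simple $(S, \vec p, \vec r)$-atoms), then the theorem holds,
too.
If the stochastic basis $(\mathcal F_n)$ is regular, then the same holds for the space $H_{\vec p}^M$ as
for $P_{\vec p}$ and the same for $H_{\vec p}^S$ as for $Q_{\vec p}$.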
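Proof We only sketch the proof. Besides Theorem 3.7, the basic idea of the proof
is to decompose the sets $\{\tau_k = j\}$ into the union of atoms $(I_{k,j,i})_i \subset \mathcal F_j$ such that

\[ \bigcup_{i} I_{k,j,i} = \{\tau_k = j\} \in \mathcal F_j, \]

where the stopping times $\tau_k$ were defined in the proof of Theorem 5.3. Note that for
fixed $k, j$, the atoms $(I_{k,j,i})_i \subset \mathcal F_j$ are disjoint. We can show that

\[ f_n = \sum_{k\in\mathbb Z} \big( f_n^{\tau_{k+1}} - f_n^{\tau_k} \big)
   = \sum_{k\in\mathbb Z} \sum_{j=0}^{n-1} \sum_{i} \chi_{I_{k,j,i}} \big( f_n^{\tau_{k+1}} - f_n^{\tau_k} \big)
   = \sum_{k\in\mathbb Z} \sum_{j=0}^{n-1} \sum_{i} \mu_{k,j,i} a_n^{k,j,i}, \]

where

\[ \mu_{k,j,i} = 3 \cdot 2^k \, \|\chi_{I_{k,j,i}}\|_{\vec p} \qquad \text{and} \qquad
   a_n^{k,j,i} = \chi_{I_{k,j,i}} \frac{f_n^{\tau_{k+1}} - f_n^{\tau_k}}{\mu_{k,j,i}}. \]

Similarly to (4), we obtain that

\[ \|f\|_{H_{\vec p}^s} \sim \inf \bigg\| \bigg( \sum_{k\in\mathbb Z} \sum_{j=0}^{\infty} \sum_{i} \big( 3 \cdot 2^k \chi_{I_{k,j,i}} \big)^t \bigg)^{1/t} \bigg\|_{\vec p}, \tag{6} \]

where the infimum is taken over all atomic decompositions of the form (5). The
corresponding equivalences for the other Hardy spaces hold, too.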

6 Martingale Inequalities

In this section, we present the generalization of some classical martingale inequalities
(see e.g., Weisz [87]) for the five mixed normed martingale Hardy spaces. To
this end, we need the following definition and boundedness results.
Let $X$ be a martingale space, $Y$ be a measurable function space. Then the operator
$U : X \to Y$ is called a σ-sublinear operator if for any $\alpha \in \mathbb C$,

\[ \Big| U \Big( \sum_{k=1}^{\infty} f_k \Big) \Big| \le \sum_{k=1}^{\infty} |U(f_k)| \qquad \text{and} \qquad |U(\alpha f)| = |\alpha| \, |U(f)|. \]

The σ-algebra generated by the stopping time $\tau$ is denoted by

\[ \mathcal F_\tau = \{ F \in \mathcal F : F \cap \{\tau \le n\} \in \mathcal F_n,\ n \ge 1 \}. \]

Of course, $\mathcal F_\tau$ is a sub-σ-algebra of $\mathcal F$. The conditional expectation with respect to
$\mathcal F_\tau$ is denoted by $E_\tau$.
Theorem 6.1 ([84]) Let $0 < \vec p < \infty$ and suppose that the σ-sublinear operator
$T : H_r^s \to L_r$ is bounded, where $\vec p = (p_1, \dots, p_d)$ and $r > p_i$ ($i = 1, \dots, d$). If
for all $(s, \vec p, \infty)$-atoms $a$

\[ (Ta) \chi_A = T(a \chi_A) \qquad (A \in \mathcal F_\tau), \tag{7} \]

where $\tau$ is the stopping time associated with the $(s, \vec p, \infty)$-atom $a$, then for all
$f \in H_{\vec p}^s$,

\[ \|Tf\|_{\vec p} \le C \|f\|_{H_{\vec p}^s}. \]

If we replace the spaces $H_r^s$ and $H_{\vec p}^s$ by $H_r^M$ and $P_{\vec p}$ (resp. by $H_r^S$ and $Q_{\vec p}$) and the
$(s, \vec p, \infty)$-atoms by $(M, \vec p, \infty)$-atoms (resp. by $(S, \vec p, \infty)$-atoms), then the theorem
holds, too.
It is easy to see that for all $(s, \vec p, \infty)$-atoms $a$, $(S, \vec p, \infty)$-atoms $a$ or $(M, \vec p, \infty)$-atoms
$a$ and $A \in \mathcal F_\tau$, $s(a \chi_A) = s(a) \chi_A$, $S(a \chi_A) = S(a) \chi_A$ and $M(a \chi_A) =
M(a) \chi_A$. This means that the operators $s$, $S$ and $M$ satisfy condition (7). Applying
the preceding theorem to these operators, we obtain the following result (see [84]).
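Theorem 6.2 We have the following martingale inequalities:
(i) $\|f\|_{H_{\vec p}^M} \le C \|f\|_{H_{\vec p}^s}$, $\|f\|_{H_{\vec p}^S} \le C \|f\|_{H_{\vec p}^s}$ $(0 < \vec p < 2)$.
(ii) $\|f\|_{H_{\vec p}^M} \le \|f\|_{P_{\vec p}}$, $\|f\|_{H_{\vec p}^S} \le \|f\|_{Q_{\vec p}}$ $(0 < \vec p < \infty)$.
(iii) $\|f\|_{H_{\vec p}^S} \le C \|f\|_{P_{\vec p}}$, $\|f\|_{H_{\vec p}^M} \le C \|f\|_{Q_{\vec p}}$ $(0 < \vec p < \infty)$.
(iv) $\|f\|_{P_{\vec p}} \le C \|f\|_{Q_{\vec p}}$, $\|f\|_{Q_{\vec p}} \le C \|f\|_{P_{\vec p}}$ $(0 < \vec p < \infty)$.
(v) $\|f\|_{H_{\vec p}^s} \le C \|f\|_{P_{\vec p}}$ and $\|f\|_{H_{\vec p}^s} \le C \|f\|_{Q_{\vec p}}$ $(0 < \vec p < \infty)$.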

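Proof Let $f \in H_{\vec p}^s$. The σ-sublinear operator $M$ is bounded from $H_2^s$ to $L_2$ (see
e.g. Weisz [87]), that is, $\|Mf\|_2 \le C \|f\|_{H_2^s}$. So we can apply Theorem 6.1 with the
choice $r = 2$ and $\vec p := (p_1, \dots, p_d)$, where $p_i < 2$, and we get that

\[ \|f\|_{H_{\vec p}^M} = \|M(f)\|_{\vec p} \le C \|f\|_{H_{\vec p}^s} \qquad (0 < \vec p < 2). \]

The operator $S$ is also bounded from $H_2^s$ to $L_2$ (see [87]), hence using
Theorem 6.1 we obtain

\[ \|f\|_{H_{\vec p}^S} \le C \|f\|_{H_{\vec p}^s} \qquad (0 < \vec p < 2). \]

From the definition of the Hardy spaces it follows immediately that

\[ \|f\|_{H_{\vec p}^M} \le \|f\|_{P_{\vec p}}, \qquad \|f\|_{H_{\vec p}^S} \le \|f\|_{Q_{\vec p}} \qquad (0 < \vec p < \infty). \]

By Burkholder-Gundy and Doob’s inequality, for all $1 < r < \infty$, $\|S(f)\|_r \approx
\|M(f)\|_r \approx \|f\|_r$ (see Weisz [87]). Using this, the previous inequality and
Theorem 6.1, we have

\[ \|f\|_{H_{\vec p}^S} \le C \|f\|_{P_{\vec p}} \qquad \text{and} \qquad \|f\|_{H_{\vec p}^M} \le C \|f\|_{Q_{\vec p}} \qquad (0 < \vec p < \infty). \]

For $f = (f_n)_{n\in\mathbb N} \in Q_{\vec p}$ there exists a sequence $(\lambda_n)_{n\in\mathbb N}$ for which $S_n(f) \le \lambda_{n-1}$
and $\lambda_\infty \in L_{\vec p}$. Using the inequality $|f_n| \le M_{n-1}(f) + \lambda_{n-1}$ and the preceding
inequality, we get that

\[ \|f\|_{P_{\vec p}} \le \|M(f)\|_{\vec p} + \|\lambda_\infty\|_{\vec p} \le \|f\|_{H_{\vec p}^M} + C \|f\|_{Q_{\vec p}} \le C \|f\|_{Q_{\vec p}}. \]

Similarly, if $f = (f_n)_{n\in\mathbb N} \in P_{\vec p}$, then $|f_n| \le \lambda_{n-1}$ with a suitable sequence
$(\lambda_n)_{n\in\mathbb N}$ for which $\lambda_\infty \in L_{\vec p}$. Since

\[ S_n(f) = \Big( \sum_{k=0}^{n} |d_k f|^2 \Big)^{1/2} \le S_{n-1}(f) + |d_n f| \le S_{n-1}(f) + 2 \lambda_{n-1}, \]

we have that

\[ \|f\|_{Q_{\vec p}} \le \|S(f)\|_{\vec p} + 2 \|\lambda_\infty\|_{\vec p} = \|f\|_{H_{\vec p}^S} + 2 \|f\|_{P_{\vec p}} \le C \|f\|_{P_{\vec p}} \]

for all $0 < \vec p < \infty$.

From [87, Proposition 2.11 (ii)], we get that the operator $s$ is bounded from $H_r^M$
to $L_r$ and from $H_r^S$ to $L_r$ if $2 \le r < \infty$. Again, using Theorem 6.1, we obtain

\[ \|f\|_{H_{\vec p}^s} \le C \|f\|_{P_{\vec p}} \qquad \text{and} \qquad \|f\|_{H_{\vec p}^s} \le C \|f\|_{Q_{\vec p}} \qquad (0 < \vec p < \infty). \]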

We know (see e.g. Weisz [87]) that $S_n(f) \le R^{1/2} s_n(f)$ if the stochastic basis is
regular. Using the definition of $Q_{\vec p}$ and the fact that $s_n(f)$ is $\mathcal F_{n-1}$-measurable, we get

\[ \|f\|_{Q_{\vec p}} \le C \|s(f)\|_{\vec p} = C \|f\|_{H_{\vec p}^s}. \]

By the last inequality of Theorem 6.2, we obtain that $Q_{\vec p} = H_{\vec p}^s$. The next corollary
follows from Theorem 6.2 and Corollary 5.4 (see [84]).
Corollary 6.3 If the stochastic basis $(\mathcal F_n)$ is regular, then the five Hardy spaces are
equivalent, that is

\[ H_{\vec p}^S = Q_{\vec p} = P_{\vec p} = H_{\vec p}^M = H_{\vec p}^s \qquad (0 < \vec p < \infty) \]

with equivalent quasi-norms.

Using Theorems 2.1 and 3.5 and a duality argument, we can prove
Theorem 6.4 ([84]) Suppose that $1 < \vec p < \infty$ or

\[ \vec p = (1, \dots, 1, p_{k+1}, \dots, p_d), \qquad 1 < p_{k+1}, \dots, p_d < \infty \tag{8} \]

for some $k \in \{1, \dots, d\}$. Then for all sequences $(f_n)_{n\in\mathbb N}$ of non-negative, measurable functions,

\[ \Big\| \sum_{n\in\mathbb N} E_n(f_n) \Big\|_{\vec p} \le C \Big\| \sum_{n\in\mathbb N} f_n \Big\|_{\vec p}. \]

As an application of the previous theorem with $f_n := |d_{n+1} f|^2$, we get the
following martingale inequality.
Corollary 6.5 If $2 < \vec p < \infty$ or $\vec p/2$ satisfies (8), then

\[ \|f\|_{H_{\vec p}^s} \le C \|f\|_{H_{\vec p}^S}. \]

To generalize the well known Burkholder-Davis-Gundy inequality, we introduce
a new space with the norm

\[ \|f\|_{G_{\vec p}} := \Big\| \sum_{n\in\mathbb N} |d_n f| \Big\|_{\vec p}. \]

The so called Davis decomposition (Lemma 6.6) holds also for mixed norm spaces.
The main idea of the proof of this decomposition is Theorem 6.4 (see [84]).

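Lemma 6.6 Suppose that $1 < \vec p < \infty$ or $\vec p$ satisfies (8). If $f \in H_{\vec p}^S$, then there
exist $h \in G_{\vec p}$ and $g \in Q_{\vec p}$ such that $f = h + g$ and

\[ \|h\|_{G_{\vec p}} \le C \|f\|_{H_{\vec p}^S} \qquad \text{and} \qquad \|g\|_{Q_{\vec p}} \le C \|f\|_{H_{\vec p}^S}. \]

If $f \in H_{\vec p}^M$, then there exist $h \in G_{\vec p}$ and $g \in P_{\vec p}$ such that $f = h + g$ and

\[ \|h\|_{G_{\vec p}} \le C \|f\|_{H_{\vec p}^M} \qquad \text{and} \qquad \|g\|_{P_{\vec p}} \le C \|f\|_{H_{\vec p}^M}. \]

Now the generalization of the Burkholder-Davis-Gundy inequality can be proved
for mixed norm spaces.
Theorem 6.7 ([84]) If $1 < \vec p < \infty$ or $\vec p$ satisfies (8), then the spaces $H_{\vec p}^S$ and $H_{\vec p}^M$
are equivalent, that is

\[ H_{\vec p}^S = H_{\vec p}^M \]

with equivalent norms.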

7 Dual Spaces of Mixed Hardy Spaces

In this section, we study the dual spaces of mixed normed martingale Hardy
spaces. To this end, we have to suppose that every σ-algebra $\mathcal F_n$ is generated
by countably many atoms.
Definition 7.1 Suppose that every σ-algebra $\mathcal F_n$ is generated by countably many
atoms. Let $0 \le \vec\alpha < \infty$ and $1 \le \vec q < \infty$. Define $BMO_{\vec q}(\vec\alpha)$ as the space of functions
$f \in L_{\vec q}$ for which

\[ \|f\|_{BMO_{\vec q}(\vec\alpha)} = \sup_{n\ge0} \sup_{I\in A(\mathcal F_n)} \|\chi_I\|_{\frac{1}{\vec\alpha+1}}^{-1} \, \|\chi_I\|_{(\vec q)'} \, \|(f - f_n) \chi_I\|_{\vec q} < \infty. \]

If $\vec q$ is a constant and $\vec\alpha = 0$, then this definition goes back to the classical
martingale $BMO_q$ space. If both $\vec q$ and $\vec\alpha$ are non-zero constants, then this definition
becomes the classical martingale Lipschitz space investigated in Weisz [86, 87].
Now we are ready to characterize the dual of $H_{\vec p}^s$ (see [94]).
Theorem 7.2 Suppose that every σ-algebra $\mathcal F_n$ is generated by countably many
atoms. If $0 < \vec p \le 1$ and $\vec\alpha = 1/\vec p - 1$, then

\[ \big( H_{\vec p}^s \big)^* = BMO_2(\vec\alpha) \]

with equivalent norms.

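Proof For $\varphi \in BMO_2(\vec\alpha) \subset L_2$, define a linear functional by

\[ l_\varphi(f) = E(f \varphi) \qquad (f \in L_2). \]

$L_2$ can be embedded continuously in $H_{\vec p}^s$, since by Hölder’s inequality,

\[ \|f\|_{H_{\vec p}^s} = \|s(f)\|_{\vec p} \le \|s(f)\|_2 = \|f\|_2 \qquad (f \in L_2). \]

Theorem 5.6 implies that for each $f \in L_2$,

\[ f = \sum_{k\in\mathbb Z} \sum_{j=0}^{\infty} \sum_{i} \mu_{k,j,i} a^{k,j,i} \]

and the convergence holds also in the $L_2$-norm, where $a^{k,j,i}$ is a simple $(s, \vec p, 2)$-atom
and $\mu_{k,j,i} = 3 \cdot 2^k \|\chi_{I_{k,j,i}}\|_{\vec p}$. Hence

\[ l_\varphi(f) = E(f \varphi) = \sum_{k\in\mathbb Z} \sum_{j=0}^{\infty} \sum_{i} \mu_{k,j,i} E(a^{k,j,i} \varphi). \]

Observe that

\[ E(a^{k,j,i} \varphi) = E(a^{k,j,i} (\varphi - \varphi_j)). \]

Using this, we conclude that

\begin{align*}
|l_\varphi(f)| &\le \sum_{k\in\mathbb Z} \sum_{j=0}^{\infty} \sum_{i} \mu_{k,j,i} \Big| \int_\Omega a^{k,j,i} (\varphi - \varphi_j) \, d\mathbb P \Big| \\
  &\le \sum_{k\in\mathbb Z} \sum_{j=0}^{\infty} \sum_{i} \mu_{k,j,i} \|a^{k,j,i}\|_2 \, \|(\varphi - \varphi_j) \chi_{I_{k,j,i}}\|_2 \\
  &\le \sum_{k\in\mathbb Z} \sum_{j=0}^{\infty} \sum_{i} \mu_{k,j,i} \frac{\mathbb P(I_{k,j,i})^{1/2}}{\|\chi_{I_{k,j,i}}\|_{\vec p}} \|(\varphi - \varphi_j) \chi_{I_{k,j,i}}\|_2 \\
  &\lesssim \sum_{k\in\mathbb Z} \sum_{j=0}^{\infty} \sum_{i} \mu_{k,j,i} \|\varphi\|_{BMO_2(\vec\alpha)}. \tag{9}
\end{align*}

If $g, h \in L_{\vec p}$ are two positive functions, then

\[ \|g\|_{\vec p} + \|h\|_{\vec p} \le \|g + h\|_{\vec p}. \]

Indeed, since $0 < \vec p \le 1$, the inequality holds for all $L_{p_i}$ spaces ($i = 1, \dots, d$),
hence it holds also for $L_{\vec p}$ spaces. Taking into account this inequality, (6) and
Theorem 5.6, we conclude that

\[ \sum_{k\in\mathbb Z} \sum_{j=0}^{\infty} \sum_{i} \mu_{k,j,i}
   \lesssim \sum_{k\in\mathbb Z} \sum_{j=0}^{\infty} \sum_{i} 2^k \|\chi_{I_{k,j,i}}\|_{\vec p}
   \le \Big\| \sum_{k\in\mathbb Z} \sum_{j=0}^{\infty} \sum_{i} 2^k \chi_{I_{k,j,i}} \Big\|_{\vec p}
   \lesssim \|f\|_{H_{\vec p}^s}. \]

Then (9) implies

\[ |l_\varphi(f)| \lesssim \|f\|_{H_{\vec p}^s} \, \|\varphi\|_{BMO_2(\vec\alpha)}. \]

Since by Theorem 5.6, $L_2$ is dense in $H_{\vec p}^s$, the functional $l_\varphi$ can be uniquely extended to a
linear functional on $H_{\vec p}^s$.
Conversely, let $l$ be an arbitrary bounded linear functional on $H_{\vec p}^s$. Since $L_2$ can
be embedded continuously in $H_{\vec p}^s$, there exists $\varphi \in L_2$ such that

\[ l(f) = l_\varphi(f) = E(f \varphi) \qquad (f \in L_2). \]

For $I \in A(\mathcal F_j)$, set

\[ a = \frac{(\varphi - \varphi_j) \chi_I}{\|(\varphi - \varphi_j) \chi_I\|_2 \, \|\chi_I\|_{\frac{1}{\vec\alpha+1}} \, \|\chi_I\|_2^{-1}}. \]

Then the function $a$ is a simple $(s, \vec p, 2)$-atom and so $a \in H_{\vec p}^s$ with $\|a\|_{H_{\vec p}^s} \lesssim 1$.
Finally,

\[ \|l\| \gtrsim l(a) = E\big( a (\varphi - \varphi_j) \big) = \|\chi_I\|_{\frac{1}{\vec\alpha+1}}^{-1} \, \|\chi_I\|_2 \, \|(\varphi - \varphi_j) \chi_I\|_2. \]

This means that

\[ \|\varphi\|_{BMO_2(\vec\alpha)} \lesssim \|l\| \]

and the theorem is shown.

Let us denote by $(P_{\vec p})_1^*$ those elements $l$ from $(P_{\vec p})^*$ for which there exists $\varphi \in L_1$
such that

\[ l(f) = E(f \varphi) \qquad (f \in L_\infty). \]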

We can verify the following result similarly to Theorem 7.2 (see [94]).
Theorem 7.3 Suppose that every σ-algebra $\mathcal F_n$ is generated by countably many
atoms. If $0 < \vec p \le 1$ and $\vec\alpha = 1/\vec p - 1$, then

\[ \big( P_{\vec p} \big)_1^* = BMO_1(\vec\alpha) \]

with equivalent norms.

If $(\mathcal F_n)$ is regular, then $P_{\vec p}$ is equivalent to $H_{\vec p}^s$ (see Corollary 6.3). We know that
$L_2$ can be embedded continuously in $H_{\vec p}^s$. Thus, for $l \in (P_{\vec p})^*$, there exists
$\varphi \in L_2 \subset L_1$ such that $l(f) = E(f \varphi)$ for any $f \in L_2 \supset L_\infty$. Hence
$(P_{\vec p})_1^* = (P_{\vec p})^*$ and Theorem 7.3 imply the next corollary.
Corollary 7.4 Suppose that every σ-algebra $\mathcal F_n$ is generated by countably many
atoms. If $0 < \vec p \le 1$, $\vec\alpha = 1/\vec p - 1$ and $(\mathcal F_n)$ is regular, then

\[ \big( P_{\vec p} \big)^* = BMO_1(\vec\alpha) \]

with equivalent norms.

For a regular stochastic basis $(\mathcal F_n)$, we can prove sharper results.
Theorem 7.5 Suppose that every σ-algebra $\mathcal F_n$ is generated by countably many
atoms. If $0 < \vec p \le 1$, $\vec\alpha = 1/\vec p - 1$, $1 < r < \infty$ and $(\mathcal F_n)$ is regular, then

\[ \big( H_{\vec p}^M \big)^* = BMO_r(\vec\alpha) \]

with equivalent norms.

From Theorem 7.5, we get a generalization of the well known John-Nirenberg
inequality.
Corollary 7.6 Suppose that every σ-algebra $\mathcal F_n$ is generated by countably many
atoms. If $0 \le \vec\alpha < \infty$, $1 < r < \infty$ and $(\mathcal F_n)$ is regular, then

\[ BMO_1(\vec\alpha) = BMO_r(\vec\alpha) \]

with equivalent norms.

Note that these results for a constant $p$ and $\alpha$ are also due to the author [87].

8 One-dimensional Walsh-Fourier Series

Now we turn to some applications in Walsh-Fourier analysis and present some
summability results of one-dimensional Walsh-Fourier series. Let $\Omega := [0, 1)$ and
consider the dyadic intervals $I_{k,n} := [k 2^{-n}, (k+1) 2^{-n})$ ($n \in \mathbb N$, $k = 0, \dots, 2^n - 1$).
The dyadic σ-algebras $\mathcal F_n$ are generated by $I_{k,n}$, $k = 0, \dots, 2^n - 1$ ($n \in \mathbb N$). $\mathcal F$ (resp.
$\mathbb P$) denotes the one-dimensional Lebesgue σ-algebra (resp. Lebesgue measure).
To introduce the Walsh orthonormal system, let us define first the Rademacher
functions by

\[ r_n(x) := r(2^n x) \qquad (x \in [0, 1),\ n \in \mathbb N), \]

where $r$ is a 1-periodic function and

\[ r(x) := \begin{cases} 1, & \text{if } x \in [0, \tfrac12); \\ -1, & \text{if } x \in [\tfrac12, 1). \end{cases} \]

It is clear that, for any $n \in \mathbb N$, $r_n$ is $\mathcal F_{n+1}$ measurable. The product system
generated by the Rademacher functions is the Walsh system:

\[ w_n := \prod_{k=0}^{\infty} r_k^{n_k} \qquad (n \in \mathbb N), \]

where

\[ n = \sum_{k=0}^{\infty} n_k 2^k \qquad (0 \le n_k < 2). \]

For a one-dimensional function f ∈ L1 and for any n ∈ N, the number

fB(n) := E(f wn ) (n ∈ N)

is said to be the nth Walsh-Fourier coefficient of f . We can extend this definition to

martingales as follows. If f = (fk )k≥0 is a martingale, then let

fB(n) := lim E(fk wn ) (n ∈ N).

k→∞

Since wn is Fk measurable for n < 2k , it can immediately be seen that this limit
does exist. We remember that if f ∈ L1 , then Ek f → f in the L1 -norm as k → ∞,
hence

fB(n) = lim E((Ek f )wn ) (n ∈ N).

k→∞

Thus the Walsh-Fourier coefficients of f ∈ L1 are the same as the ones of the
martingale (Ek f )k≥0 obtained from f .
Denote by $s_n f$ the nth partial sum of the Walsh–Fourier series of a martingale f, namely,

$$s_n f := \sum_{k=0}^{n-1} \widehat{f}(k) w_k \qquad (n \in \mathbb{N}).$$

It is a basic question whether the function f can be reconstructed from the partial sums of its Fourier series. It is easy to see that, for any martingale $f = (f_n)$,

$$s_{2^n} f = f_n \qquad (n \in \mathbb{N}).$$

Then the martingale convergence theorem implies that

$$\lim_{n \to \infty} s_{2^n} f = f \quad \text{in the } L_p\text{-norm},$$

where $1 \le p < \infty$ and $f \in L_p$. This result was generalized by Paley [71] and
Schipp et al. [75, Theorem 4.1].
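The identity between the partial sums of order 2^n and the dyadic martingale can be illustrated on step functions. This is a sketch under assumptions of ours: f is taken constant on the dyadic intervals of length 2^{-m}, so it is represented by a vector of 2^m values, and the helper names `walsh_paley_matrix`, `partial_sum` and `dyadic_conditional_expectation` are hypothetical.

```python
import numpy as np

def walsh_paley_matrix(m: int) -> np.ndarray:
    """Row n holds the values of w_n on the 2^m dyadic intervals of length 2^-m.

    Uses w_n(x) = (-1)^(sum_k n_k x_k), where n_k are the binary digits of n and
    x_k is the (k+1)-th binary digit of x after the binary point.
    """
    size = 2 ** m
    index = np.arange(size)
    n_bits = (index[:, None] >> np.arange(m)) & 1            # digits n_k
    x_bits = (index[:, None] >> (m - 1 - np.arange(m))) & 1  # digits x_k
    return (-1.0) ** (n_bits @ x_bits.T)

def partial_sum(f: np.ndarray, n: int, W: np.ndarray) -> np.ndarray:
    """s_n f = sum_{k < n} fhat(k) w_k with fhat(k) = E(f w_k)."""
    coefficients = W @ f / f.size
    return coefficients[:n] @ W[:n]

def dyadic_conditional_expectation(f: np.ndarray, level: int) -> np.ndarray:
    """E_level f: average of f over each dyadic interval of length 2^-level."""
    block_means = f.reshape(2 ** level, -1).mean(axis=1)
    return np.repeat(block_means, f.size // 2 ** level)

m = 5
W = walsh_paley_matrix(m)
f = np.random.default_rng(0).standard_normal(2 ** m)
for level in range(m + 1):
    same = np.allclose(partial_sum(f, 2 ** level, W),
                       dyadic_conditional_expectation(f, level))
    print(level, same)
```

For each level the two vectors agree, which is the discrete counterpart of the statement that the 2^n-th partial sum is the n-th term of the martingale.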
Theorem 8.1 If $f \in L_p$ for some $1 < p < \infty$, then

$$\sup_{n \in \mathbb{N}} \|s_n f\|_{L_p} \le C_p \|f\|_{L_p}$$

and

$$\lim_{n \to \infty} s_n f = f \quad \text{in the } L_p\text{-norm}.$$

One of the deepest results in harmonic analysis is Carleson’s result (see Carleson
[8], Hunt [47], Billard [4], Sjölin [79]). Using tree martingales, Schipp [73] gave a
nice proof for the theorem (see also [76, 87]).
Theorem 8.2 If $f \in L_p$ for some $1 < p < \infty$, then

$$\Big\| \sup_{n \in \mathbb{N}} |s_n f| \Big\|_{L_p} \le C_p \|f\|_{L_p}$$

and

$$\lim_{n \to \infty} s_n f = f \quad \text{a.e.}$$

Though Theorems 8.1 and 8.2 are not true for p = 1, with the help of some
summability methods they can be generalized to this endpoint case. Obviously,
summability means have better convergence properties than the original Fourier
series. Summability is intensively studied in the literature. We refer at this time
only to the books Stein and Weiss [82], Butzer and Nessel [7], Trigub and Belinsky
[85], Grafakos [34] and Weisz [90–92, 95] and the references therein.
The best known summability method is the Fejér method. In 1904 Fejér [20]
investigated the arithmetic means of the partial sums of the trigonometric Fourier
series, the so called Fejér means and proved that if the left and right limits f (x − 0)
and f (x + 0) exist at a point x, then the Fejér means converge to (f (x − 0) + f (x +
0))/2. One year later Lebesgue [62] extended this theorem and obtained that every
integrable function is Fejér summable almost everywhere.
We define the Fejér summability means by the arithmetic means of the partial
sums:

$$\sigma_n f := \frac{1}{n} \sum_{k=1}^{n} s_k f = \sum_{j=0}^{n} \left(1 - \frac{j}{n}\right) \widehat{f}(j) w_j.$$
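The two expressions for the Fejér means can be compared numerically. The sketch below assumes the convention that the n-th mean averages s_1 f, ..., s_n f, which is the convention under which the weights 1 - j/n appear. Functions are again represented as step functions on 2^m dyadic intervals, and all function names are hypothetical.

```python
import numpy as np

def walsh_paley_matrix(m: int) -> np.ndarray:
    """Row n holds the values of w_n on the 2^m dyadic intervals of length 2^-m."""
    size = 2 ** m
    index = np.arange(size)
    n_bits = (index[:, None] >> np.arange(m)) & 1
    x_bits = (index[:, None] >> (m - 1 - np.arange(m))) & 1
    return (-1.0) ** (n_bits @ x_bits.T)

def fejer_mean_from_partial_sums(f: np.ndarray, n: int, W: np.ndarray) -> np.ndarray:
    """Arithmetic mean of the partial sums s_1 f, ..., s_n f."""
    coefficients = W @ f / f.size
    partial_sums = [coefficients[:k] @ W[:k] for k in range(1, n + 1)]
    return np.mean(partial_sums, axis=0)

def fejer_mean_from_weights(f: np.ndarray, n: int, W: np.ndarray) -> np.ndarray:
    """Weighted Walsh sum with weights 1 - j/n for j = 0, ..., n - 1."""
    coefficients = W @ f / f.size
    weights = 1.0 - np.arange(n) / n
    return (weights * coefficients[:n]) @ W[:n]

m = 4
W = walsh_paley_matrix(m)
f = np.random.default_rng(0).standard_normal(2 ** m)
agree = all(
    np.allclose(fejer_mean_from_partial_sums(f, n, W), fejer_mean_from_weights(f, n, W))
    for n in range(1, 2 ** m + 1)
)
print(agree)
```

The term with j = n carries weight zero, so it does not matter whether the weighted sum stops at n - 1 or at n.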

The following theorem improves Theorem 8.1 (see Paley [71]).

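Theorem 8.3 If $f \in L_p$ for some $1 \le p < \infty$, then

$$\sup_{n \in \mathbb{N}} \|\sigma_n f\|_{L_p} \le C_p \|f\|_{L_p}$$

and

$$\lim_{n \to \infty} \sigma_n f = f \quad \text{in the } L_p\text{-norm}.$$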

To obtain almost everywhere convergence for the Fejér means, we introduce the
maximal operator of the Fejér means:

$$\sigma_* f := \sup_{n \in \mathbb{N}} |\sigma_n f|.$$

Fujii [22] proved that σ∗ is bounded from H1 to L1 (see also Schipp and
Simon [74]). Later, using the atomic decomposition, the author [88] (see also [90])
generalized this result to all 1/2 < p < ∞:
Theorem 8.4 If $1/2 < p \le \infty$ and $f \in H_p$, then

$$\|\sigma_* f\|_{L_p} \le C_p \|f\|_{H_p}.$$

For p ≤ 1/2, the theorem does not hold (see Simon and Weisz [78], Simon
[77]). We get the next weak type (1, 1) inequality from Theorem 8.4 by interpolation
(Weisz [88, 90]). It was originally proved by Schipp [72].
Corollary 8.5 If $f \in L_1$, then

$$\sup_{\rho > 0} \rho\, \lambda(\sigma_* f > \rho) \le C \|f\|_1.$$


This weak type (1, 1) inequality and the density argument of Marcinkiewicz and
Zygmund [67] imply the next corollary, which was proved by Fine [21] and later
Schipp [72].
Corollary 8.6 If $f \in L_1$, then

$$\lim_{n \to \infty} \sigma_n f = f \quad \text{a.e.}$$

9 Higher Dimensional Walsh-Fourier Series

In this section, we generalize the summability results to higher dimensional Walsh-Fourier series and to mixed norm spaces. For all $i = 1, \ldots, d$, let $\Omega_i := [0, 1)$ and
the dyadic σ-algebras $\mathcal{F}_n^i$ be generated by $I_{k,n}$, $k = 0, \ldots, 2^n - 1$ ($n \in \mathbb{N}$). $\mathcal{F}^i$
and $\mathbb{P}^i$ denote again the one-dimensional Lebesgue σ-algebra and the Lebesgue
measure. Note that these definitions are independent of i. Consider the product space
$(\Omega, \mathcal{F}, \mathbb{P})$ as defined in Sect. 2.
The d-dimensional Walsh system is defined by

$$w_n(x) := \prod_{k=1}^{d} w_{n_k}(x_k),$$

where $x = (x_1, \ldots, x_d) \in [0, 1)^d$ and $n = (n_1, \ldots, n_d) \in \mathbb{N}^d$. If $f \in L_1$, then the
n-th Walsh-Fourier coefficient of f is defined by

$$\widehat{f}(n) := E(f w_n) \qquad (n = (n_1, \ldots, n_d) \in \mathbb{N}^d).$$

This definition can be extended to martingales as before in the one-dimensional case.
The n-th partial sum and the n-th Fejér mean of the Walsh-Fourier series of a
martingale f are defined by

$$s_n f := \sum_{k_1=0}^{n_1-1} \cdots \sum_{k_d=0}^{n_d-1} \widehat{f}(k) w_k \qquad (n \in \mathbb{N}^d)$$

and

$$\sigma_n f := \frac{1}{\prod_{k=1}^{d} n_k} \sum_{k_1=1}^{n_1} \cdots \sum_{k_d=1}^{n_d} s_k f \qquad (n \in \mathbb{N}^d),$$

respectively. We will investigate the convergence of the Fejér means over the
diagonal, or more generally, over cones. For $\alpha \ge 0$ let us define the cone as the set

$$\left\{ n = (n_1, \ldots, n_d) \in \mathbb{N}^d : 2^{-\alpha} \le \frac{n_i}{n_j} \le 2^{\alpha} \ (i, j = 1, \ldots, d) \right\}.$$
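The cone condition is easy to make concrete. The following small sketch enumerates the multi-indices of a bounded box that satisfy it; the names `in_cone` and `cone_indices` are hypothetical, and indices are assumed to be positive so that the ratios are defined.

```python
from itertools import product

def in_cone(n: tuple, alpha: float) -> bool:
    """True if 2^-alpha <= n_i / n_j <= 2^alpha for all pairs of coordinates."""
    lower, upper = 2.0 ** (-alpha), 2.0 ** alpha
    return all(lower <= n_i / n_j <= upper for n_i in n for n_j in n)

def cone_indices(d: int, alpha: float, max_index: int) -> list:
    """All n in {1, ..., max_index}^d that lie in the cone with parameter alpha."""
    box = product(range(1, max_index + 1), repeat=d)
    return [n for n in box if in_cone(n, alpha)]

# alpha = 0 gives exactly the diagonal; a larger alpha widens the cone.
print(cone_indices(2, 0, 4))
print(len(cone_indices(2, 1, 4)))
```

With alpha = 0 only the diagonal indices survive, which matches the remark that convergence over cones generalizes convergence over the diagonal.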

For the almost everywhere convergence, we have to investigate the Fejér maximal
operator taken over this cone,

$$\sigma_* f := \sup_{n \text{ in the cone}} |\sigma_n f|.$$

Using Theorem 5.3, we can verify that $\sigma_*$ is bounded from $H_p$ to $L_p$ for $1/2 < p < \infty$.
Theorem 9.1 ([83]) If $\alpha \ge 0$ and $1/2 < p < \infty$, then for all $f \in H_p$,

$$\|\sigma_* f\|_p \le C \|f\|_{H_p}.$$

For a constant p, this theorem is due to the author [89, 90]. There are counterexamples
for the boundedness of $\sigma_*$ if $p \le 1/2$ (Goginava and Nagy [32]). Since the
Walsh polynomials are dense in $H_p$, the following consequences of Theorem 9.1
can be proved by a density argument in the usual way (see [83]).
Corollary 9.2 If $\alpha \ge 0$ and $1/2 < p < \infty$, then for all $f \in H_p$, $\sigma_n f(x)$ converges
for almost every $x \in [0, 1)^d$ and in the $L_p$-norm as $n \to \infty$ with n in the cone.
If $I \in \mathcal{F}_k$ is a dyadic cube with length $2^{-dk}$, then the restriction of the martingale
f to I is defined by

$$f \chi_I := (f_n \chi_I : n \ge k).$$

Corollary 9.3 Let $\alpha \ge 0$, $1/2 < p < \infty$ and $f \in H_p$. If there exists a dyadic cube
I such that the restricted martingale $f \chi_I \in L_1(I)$, then

$$\lim_{n \to \infty,\ n \text{ in the cone}} \sigma_n f(x) = f(x) \quad \text{for a.e. } x \in I \text{ and in the } L_p(I)\text{-norm}.$$

If 1 ≤ p < ∞ and f ∈ Hp , then f ∈ L1 . So we have

Corollary 9.4 If $\alpha \ge 0$ and $1 \le p < \infty$, then for all $f \in H_p$,

$$\lim_{n \to \infty,\ n \text{ in the cone}} \sigma_n f(x) = f(x) \quad \text{for a.e. } x \in [0, 1)^d \text{ and in the } L_p\text{-norm}.$$

Recall that H1 ⊂ L1 and Hp ∼ Lp for 1 < p < ∞. Next we generalize the
preceding corollary. Theorem 9.1 and interpolation imply (Weisz [89, 90])

Corollary 9.5 If $\alpha \ge 0$ and $f \in L_1$, then

$$\sup_{\rho > 0} \rho\, \lambda(\sigma_* f > \rho) \le C \|f\|_1.$$

Corollary 9.6 If $\alpha \ge 0$ and $f \in L_1$, then

$$\lim_{n \to \infty,\ n \text{ in the cone}} \sigma_n f = f \quad \text{a.e.}$$

This corollary was proved independently by Gát [24] and Weisz [89].

Acknowledgment This research was supported by the Hungarian National Research, Development
and Innovation Office - NKFIH, KH130426.

References

1. N. Antonić, I. Ivec, On the Hörmander-Mihlin theorem for mixed-norm Lebesgue spaces. J.
Math. Anal. Appl. 433(1), 176–199 (2016)
2. R.J. Bagby, An extended inequality for the maximal function. Proc. Am. Math. Soc. 48,
419–422 (1975)
3. A. Benedek, R. Panzone, The spaces Lp , with mixed norm. Duke Math. J. 28, 301–324
(1961)
4. P. Billard, Sur la convergence presque partout des séries de Fourier-Walsh des fonctions de
l’espace L2 [0, 1]. Stud. Math. 28, 363–388 (1967)
5. D. Burkholder, Distribution function inequalities for martingales. Ann. Prob. 1, 19–42 (1973)
6. D. Burkholder, R.F. Gundy, Extrapolation and interpolation of quasi-linear operators on
martingales. Acta Math. 124, 249–304 (1970)
7. P.L. Butzer, R.J. Nessel, Fourier Analysis and Approximation (Birkhäuser Verlag, Basel,
1971)
8. L. Carleson, On convergence and growth of partial sums of Fourier series. Acta Math. 116,
135–157 (1966)
9. W. Chen, K.-P. Ho, Y. Jiao, D. Zhou, Weighted mixed-norm inequality on Doob’s maximal
operator and John-Nirenberg inequalities in Banach function spaces. Acta Math. Hung.
157(2), 408–433 (2019)
10. G. Cleanthous, A.G. Georgiadis, Mixed-norm α-modulation spaces. Trans. Am. Math. Soc.
373(5), 3323–3356 (2020)
11. G. Cleanthous, A.G. Georgiadis, M. Nielsen, Anisotropic mixed-norm Hardy spaces. J.
Geom. Anal. 27(4), 2758–2787 (2017)
12. G. Cleanthous, A.G. Georgiadis, M. Nielsen, Discrete decomposition of homogeneous mixed-
norm Besov spaces, in Functional Analysis, Harmonic Analysis, and Image Processing: A
Collection of Papers in Honor of Björn Jawerth (American Mathematical Society (AMS),
Providence, RI, 2017), pp. 167–184
13. G. Cleanthous, A.G. Georgiadis, M. Nielsen, Molecular decomposition of anisotropic
homogeneous mixed-norm spaces with applications to the boundedness of operators. Appl.
Comput. Harmon. Anal. 47(2), 447–480 (2019)
14. R.R. Coifman, A real variable characterization of H p . Stud. Math. 51, 269–274 (1974)
15. R.R. Coifman, G. Weiss, Extensions of Hardy spaces and their use in analysis. Bull. Am.
Math. Soc. 83, 569–645 (1977)

16. B.J. Davis, On the integrability of the martingale square function. Isr. J. Math. 8, 187–190
(1970)
17. J.L. Doob, Semimartingales and subharmonic functions. Trans. Am. Math. Soc. 77, 86–121
(1954)
18. C. Fefferman, Characterizations of bounded mean oscillation. Bull. Am. Math. Soc. 77,
587–588 (1971)
19. C. Fefferman, E.M. Stein, H p spaces of several variables. Acta Math. 129, 137–194 (1972)
20. L. Fejér, Untersuchungen über Fouriersche Reihen. Math. Ann. 58, 51–69 (1904)
21. N.J. Fine, Cesàro summability of Walsh-Fourier series. Proc. Nat. Acad. Sci. USA 41, 558–
591 (1955)
22. N. Fujii, A maximal inequality for H 1 -functions on a generalized Walsh-Paley group. Proc.
Am. Math. Soc. 77, 111–116 (1979)
23. A.M. Garsia, Martingale Inequalities. Seminar Notes on Recent Progress. Math. Lecture
Note. (Benjamin, New York, 1973)
24. G. Gát, Pointwise convergence of the Cesàro means of double Walsh series. Ann. Univ. Sci.
Budapest Sect. Comput. 16, 173–184 (1996)
25. G. Gát, Almost everywhere convergence of Fejér means of two-dimensional triangular Walsh-
Fourier series. J. Fourier Anal. Appl. 24(5), 1249–1275 (2018)
26. G. Gát, Cesàro means of subsequences of partial sums of trigonometric Fourier series. Constr.
Approx. 49(1), 59–101 (2019)
27. A.G. Georgiadis, J. Johnsen, M. Nielsen, Wavelet transforms for homogeneous mixed-norm
Triebel-Lizorkin spaces. Monatsh. Math. 183(4), 587–624 (2017)
28. A.G. Georgiadis, M. Nielsen, Pseudodifferential operators on mixed-norm Besov and Triebel-
Lizorkin spaces. Math. Nachr. 289(16), 2019–2036 (2016)
29. U. Goginava, On some (Hp,q , Lp,q )-type maximal inequalities with respect to the Walsh-
Paley system. Georgian Math. J. 7, 475–488 (2000)
30. U. Goginava, Almost everywhere summability of multiple Walsh-Fourier series. J. Math.
Anal. Appl. 287, 90–100 (2003)
31. U. Goginava, Maximal operators of (C, α)-means of cubic partial sums of d-dimensional
Walsh-Fourier series. Anal. Math. 33, 263–286 (2007)
32. U. Goginava, Maximal operators of Fejér means of double Walsh-Fourier series. Acta Math.
Hungar. 115, 333–340 (2007)
33. B. Golubov, A. Efimov, V. Skvortsov, Walsh Series and Transforms (Kluwer Academic,
Dordrecht, 1991)
34. L. Grafakos, Classical and Modern Fourier Analysis (Pearson Education, Upper Saddle
River, NJ, 2004)
35. J. Hart, R.H. Torres, X. Wu, Smoothing properties of bilinear operators and Leibniz-type
rules in Lebesgue and mixed Lebesgue spaces. Trans. Am. Math. Soc. 370(12), 8581–8612
(2018)
36. C. Herz, Bounded mean oscillation and regulated martingales. Trans. Am. Math. Soc. 193,
199–215 (1974)
37. C. Herz, Hp -spaces of martingales, 0 < p ≤ 1. Z. Wahrscheinlichkeitstheorie Verw. Geb.
28, 189–205 (1974)
38. K.-P. Ho, Strong maximal operator on mixed-norm spaces. Ann. Univ. Ferrara Sez. VII, Sci.
Mat. 62(2), 275–291 (2016)
39. K.-P. Ho, Mixed norm Lebesgue spaces with variable exponents and applications. Riv. Mat.
Univ. Parma (N.S.) 9(1), 21–44 (2018)
40. L. Hörmander, Estimates for translation invariant operators in Lp spaces. Acta Math. 104,
93–140 (1960)
41. L. Huang, J. Liu, D. Yang, W. Yuan, Atomic and Littlewood-Paley characterizations of
anisotropic mixed-norm Hardy spaces and their applications. J. Geom. Anal. 29, 1991–2067
(2019)
42. L. Huang, J. Liu, D. Yang, W. Yuan, Dual spaces of anisotropic mixed-norm Hardy spaces.
Proc. Am. Math. Soc. 147(3), 1201–1215 (2019)

43. L. Huang, J. Liu, D. Yang, W. Yuan, Identification of anisotropic mixed-norm Hardy spaces
and certain homogeneous Triebel-Lizorkin spaces. J. Approx. Theory 258, 105459 (2020)
44. L. Huang, J. Liu, D. Yang, W. Yuan, Real-variable characterizations of new anisotropic
mixed-norm Hardy spaces. Commun. Pure Appl. Anal. 19(6), 3033–3082 (2020)
45. L. Huang, F. Weisz, D. Yang, W. Yuan, Summability of Fourier transforms on mixed-norm
Lebesgue spaces via associated Herz spaces. Anal. Appl. (2021). https://doi.org/10.1142/S0219530521500135
46. L. Huang, D. Yang, On function spaces with mixed norms - a survey. J. Math. Study 54, 1–75
(2021)
47. R.A. Hunt, On the convergence of Fourier series, in Orthogonal Expansions and their
Continuous Analogues, Proc. Conf. Edwardsville, IL, 1967 (Illinois Univ. Press, Carbondale,
1968), pp. 235–255
48. Y. Jiang, W. Sun, Adaptive sampling of time-space signals in a reproducing kernel subspace
of mixed Lebesgue space. Banach J. Math. Anal. 14(3), 821–841 (2020)
49. Y. Jiao, F. Weisz, L. Wu, D. Zhou, Variable martingale Hardy spaces and their applications in
Fourier analysis. Diss. Math. 550, 1–67 (2020)
50. Y. Jiao, F. Weisz, L. Wu, D. Zhou, Dual spaces for variable martingale Lorentz-Hardy spaces.
Banach J. Math. Anal. 15, 53 (2021)
51. Y. Jiao, L. Wu, A. Yang, R. Yi, The predual and John-Nirenberg inequalities on generalized
BMO martingale space. Trans. Am. Math. Soc. 369, 537–553 (2017)
52. Y. Jiao, G. Xie, D. Zhou, Dual spaces and John-Nirenberg inequalities of martingale Hardy-
Lorentz-Karamata spaces. Q. J. Math. 66, 605–623 (2015)
53. Y. Jiao, D. Zhou, Z. Hao, W. Chen, Martingale Hardy spaces with variable exponents. Banach
J. Math 10, 750–770 (2016)
54. Y. Jiao, Y. Zuo, D. Zhou, L. Wu, Variable Hardy-Lorentz spaces H p(·),q (Rn ). Math. Nachr.
292, 309–349 (2019)
55. F. John, L. Nirenberg, On functions of bounded mean oscillation. Commun. Pure Appl. Math.
14, 415–426 (1961)
56. J. Johnsen, S. Munch Hansen, W. Sickel, Characterisation by local means of anisotropic
Lizorkin-Triebel spaces with mixed norms. Z. Anal. Anwend. 32(3), 257–277 (2013)
57. J. Johnsen, S. Munch Hansen, W. Sickel, Anisotropic, mixed-norm Lizorkin-Triebel spaces
and diffeomorphic maps. J. Funct. Spaces 2014, 15 (2014). Id/No 964794
58. J. Johnsen, S. Munch Hansen, W. Sickel, Anisotropic Lizorkin-Triebel spaces with mixed
norms – traces on smooth boundaries. Math. Nachr. 288(11–12), 1327–1359 (2015)
59. J. Johnsen, W. Sickel, A direct proof of Sobolev embeddings for quasi-homogeneous
Lizorkin-Triebel spaces with mixed norms. J. Funct. Spaces Appl. 5(2), 183–198 (2007)
60. J. Johnsen, W. Sickel, On the trace problem for Lizorkin–Triebel spaces with mixed norms.
Math. Nachr. 281(5), 669–696 (2008)
61. R.H. Latter, A characterization of H p (Rn ) in terms of atoms. Stud. Math. 62, 92–101 (1978)
62. H. Lebesgue, Recherches sur la convergence des séries de Fourier. Math. Ann. 61, 251–280
(1905)
63. J. Liu, F. Weisz, D. Yang, W. Yuan, Variable anisotropic Hardy spaces and their applications.
Taiwan. J. Math. 22, 1173–1216 (2018)
64. J. Liu, F. Weisz, D. Yang, W. Yuan, Littlewood-Paley and finite atomic characterizations of
anisotropic variable Hardy-Lorentz spaces and their applications. J. Fourier Anal. Appl. 25,
874–922 (2019)
65. R. Long, Martingale Spaces and Inequalities (Peking University Press and Vieweg
Publishing, Beijing, Braunschweig, 1993)
66. S. Lu, Four Lectures on Real H p Spaces (World Scientific, Singapore, 1995)
67. J. Marcinkiewicz, A. Zygmund, On the summability of double Fourier series. Fund. Math.
32, 122–132 (1939)
68. E. Nakai, Y. Sawano, Hardy spaces with variable exponents and generalized Campanato
spaces. J. Funct. Anal. 262(9), 3665–3748 (2012)
69. T. Nogayama, Mixed Morrey spaces. Positivity 23(4), 961–1000 (2019)

70. T. Nogayama, T. Ono, D. Salim, Y. Sawano, Atomic decomposition for mixed Morrey spaces.
J. Geom. Anal. 31(9), 9338–9365 (2021)
71. R.E.A.C. Paley, A remarkable system of orthogonal functions. Proc. Lond. Math. Soc. 34,
241–279 (1932)
72. F. Schipp, Über gewissen Maximaloperatoren. Ann. Univ. Sci. Budapest Sect. Math. 18,
189–195 (1975)
73. F. Schipp, Universal contractive projections and a.e. convergence, in Probability Theory
and Applications, Essays to the Memory of József Mogyoródi, ed. by J. Galambos, I. Kátai
(Kluwer Academic, Dordrecht, 1992), pp. 47–75
74. F. Schipp, P. Simon, On some (H, L1 )-type maximal inequalities with respect to the Walsh-
Paley system, in Functions, Series, Operators, Proc. Conf. in Budapest, 1980. Coll. Math.
Soc. J. Bolyai, vol. 35 (North Holland, Amsterdam, 1981), pp. 1039–1045
75. F. Schipp, W.R. Wade, P. Simon, J. Pál, Walsh Series: An Introduction to Dyadic Harmonic
Analysis (Adam Hilger, Bristol, New York, 1990)
76. F. Schipp, F. Weisz, Tree martingales and a.e. convergence of Vilenkin-Fourier series. Math.
Pannon. 8, 17–36 (1997)
77. P. Simon, Cesàro summability with respect to two-parameter Walsh systems. Monatsh. Math.
131, 321–334 (2000)
78. P. Simon, F. Weisz, Weak inequalities for Cesàro and Riesz summability of Walsh-Fourier
series. J. Approx. Theory 151, 1–19 (2008)
79. P. Sjölin, An inequality of Paley and convergence a.e. of Walsh-Fourier series. Ark. Mat.
7(6), 551–570 (1969)
80. A. Stefanov, R.H. Torres, Calderón-Zygmund operators on mixed Lebesgue spaces and
applications to null forms. J. Lond. Math. Soc. II. Ser. 70(2), 447–462 (2004)
81. E.M. Stein, Harmonic Analysis: Real-variable Methods, Orthogonality and Oscillatory
Integrals (Princeton Univ. Press, Princeton, NJ, 1993)
82. E.M. Stein, G. Weiss, Introduction to Fourier Analysis on Euclidean Spaces (Princeton Univ.
Press, Princeton, N.J., 1971)
83. K. Szarvas, F. Weisz, Applications of mixed martingale Hardy spaces in Fourier analysis. J.
Math. Anal. Appl. 492, 124403 (2020)
84. K. Szarvas, F. Weisz, Mixed martingale Hardy spaces. J. Geom. Anal. 31, 3863–3888 (2021)
85. R.M. Trigub, E.S. Belinsky, Fourier Analysis and Approximation of Functions (Kluwer
Academic, Dordrecht, 2004)
86. F. Weisz, Martingale Hardy spaces for 0 < p ≤ 1. Probab. Theory Relat. Fields 84, 361–376
(1990)
87. F. Weisz, Martingale Hardy Spaces and their Applications in Fourier Analysis. Lecture Notes
in Math., vol. 1568 (Springer, Berlin, 1994)
88. F. Weisz, Cesàro summability of one- and two-dimensional Walsh-Fourier series. Anal. Math.
22, 229–242 (1996)
89. F. Weisz, Cesàro summability of two-dimensional Walsh-Fourier series. Trans. Am. Math.
Soc. 348, 2169–2181 (1996)
90. F. Weisz, Summability of Multi-dimensional Fourier Series and Hardy Spaces. Mathematics
and Its Applications (Kluwer Academic, Dordrecht, 2002)
91. F. Weisz, Summability of multi-dimensional trigonometric Fourier series. Surv. Approx.
Theory 7, 1–179 (2012)
92. F. Weisz, Convergence and Summability of Fourier Transforms and Hardy Spaces. Applied
and Numerical Harmonic Analysis (Springer, Birkhäuser, Basel, 2017)
93. F. Weisz, Summability in mixed-norm Hardy spaces. Ann. Univ. Sci. Budapest Sect. Comput.
48, 233–246 (2018)
94. F. Weisz, Dual spaces of mixed-norm martingale Hardy spaces. Commun. Pure Appl. Anal.
20, 681–695 (2021)
95. F. Weisz, Lebesgue Points and Summability of Higher Dimensional Fourier Series. Applied
and Numerical Harmonic Analysis (Springer, Birkhäuser, Basel, 2021)

96. G. Xie, Y. Jiao, D. Yang, Martingale Musielak-Orlicz Hardy spaces. Sci. China Math. 62(8),
1567–1584 (2019)
97. G. Xie, F. Weisz, D. Yang, Y. Jiao, New martingale inequalities and applications to Fourier
analysis. Nonlinear Anal. 182, 143–192 (2019)
98. G. Xie, D. Yang, Atomic characterizations of weak martingale Musielak-Orlicz Hardy spaces
and their applications. Banach J. Math. Anal. 13(4), 884–917 (2019)
99. X. Yan, D. Yang, W. Yuan, C. Zhuo, Variable weak Hardy spaces and their applications. J.
Funct. Anal. 271, 2822–2887 (2016)
100. D. Yang, Y. Liang, L.D. Ky, Real-Variable Theory of Musielak-Orlicz Hardy Spaces. Number
2182 in Lecture Notes in Mathematics (Springer, Berlin, 2017)
The First Eigenvalue for Nonlocal Operators

Julio D. Rossi

To the memory of Ireneo Peral, a great mathematician and friend

Abstract In this chapter we present some results concerning the first eigenvalue
for a nonlocal operator in convolution form with a smooth kernel. Given a bounded
domain $\Omega \subset \mathbb{R}^N$ and a smooth kernel J, we deal with the eigenvalue problem

$$\int_A J(x - y)(u(y) - u(x))\, dy = -\lambda_1 u(x), \qquad x \in \Omega,$$

both with Dirichlet boundary conditions (take $A = \mathbb{R}^N$ and prescribe that u = 0 in
$\mathbb{R}^N \setminus \Omega$) and Neumann boundary conditions (now $A = \Omega$; in this case we study the
first nontrivial eigenvalue).

Keywords Eigenvalues · Nonlocal equations · Smooth kernels

1 Nonlocal Diffusion Problems

First, let us briefly introduce the prototype of nonlocal problem that will be
considered throughout this chapter. Let $J : \mathbb{R}^N \to \mathbb{R}$ be a nonnegative, radial, continuous
function with

$$\int_{\mathbb{R}^N} J(z)\, dz = 1.$$

J. D. Rossi
Departamento de Matemática, FCEyN, Universidad de Buenos Aires, Buenos Aires, Argentina

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
R. M. Aron et al. (eds.), Operator and Norm Inequalities and Related Topics,
Trends in Mathematics, https://doi.org/10.1007/978-3-031-02104-6_22

Nonlocal evolution equations of the form

$$\frac{\partial u}{\partial t}(x, t) = (J * u - u)(x, t) = \int_{\mathbb{R}^N} J(x - y) u(y, t)\, dy - u(x, t), \tag{1}$$

and variations of it, have been recently widely used to model diffusion processes.
More precisely, as stated in [2, 20], if u(x, t) is thought of as a density at the point x
at time t and J(x − y) is thought of as the probability distribution of jumping from
location y to location x, then $\int_{\mathbb{R}^N} J(y - x) u(y, t)\, dy = (J * u)(x, t)$ is the rate at
which individuals are arriving at position x from all other places and
$-u(x, t) = -\int_{\mathbb{R}^N} J(y - x) u(x, t)\, dy$ is the rate at which they are leaving location x to travel
to all other sites. This consideration, in the absence of external or internal sources,
leads immediately to the fact that the density u satisfies Eq. (1).
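The balance between arrivals and departures can be observed in a simple simulation. The sketch below is only illustrative and rests on assumptions that are not in the text: the problem is posed on a periodic one-dimensional grid instead of the whole space, the kernel is the triangular function max(0, 1 - |z|) renormalised on the grid, and time is advanced with explicit Euler steps. All function names are hypothetical.

```python
import numpy as np

def periodic_kernel_weights(num_points: int, spacing: float) -> np.ndarray:
    """Weights J(z_j) * h for J(z) = max(0, 1 - |z|), indexed by circular lag
    and renormalised so that they sum to one."""
    lag = np.arange(num_points)
    distance = np.minimum(lag, num_points - lag) * spacing
    weights = np.maximum(0.0, 1.0 - distance)
    return weights / weights.sum()

def euler_step(u: np.ndarray, weights: np.ndarray, dt: float) -> np.ndarray:
    """One explicit Euler step of u_t = J * u - u with circular convolution."""
    convolution = np.real(np.fft.ifft(np.fft.fft(u) * np.fft.fft(weights)))
    return u + dt * (convolution - u)

def evolve(u0: np.ndarray, weights: np.ndarray, dt: float, steps: int) -> np.ndarray:
    u = u0.copy()
    for _ in range(steps):
        u = euler_step(u, weights, dt)
    return u

num_points, spacing = 256, 0.05
weights = periodic_kernel_weights(num_points, spacing)
u0 = np.zeros(num_points)
u0[100:120] = 1.0                      # initial density concentrated on a small block
u = evolve(u0, weights, dt=0.1, steps=200)

print(abs(u.sum() - u0.sum()) < 1e-9)  # total mass is conserved
print(u.max() <= u0.max() + 1e-9)      # the maximum does not increase
```

Mass conservation reflects that every individual leaving one site arrives at another, and the bound on the maximum is a discrete form of the maximum principle mentioned for this equation.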
Equation (1) is called a nonlocal diffusion equation since, to evaluate the right
hand side at a point x and time t, one needs to know u in a neighborhood of x in order to
compute the convolution term J ∗ u. This equation shares many properties with the
classical heat equation, $u_t = \Delta u$, such as: bounded stationary solutions are constant,
a maximum principle holds for both of them and, even if J is compactly supported,
perturbations propagate with infinite speed [20]. However, there is no regularizing
effect in general.
Let us fix a bounded domain $\Omega$ in $\mathbb{R}^N$. For local PDEs the two most
common boundary conditions are Dirichlet and Neumann. When
looking at boundary conditions for nonlocal problems, one has to modify the usual
formulations for local problems.
Concerning the homogeneous Dirichlet boundary conditions for nonlocal problems we consider

$$\begin{cases} u_t(x, t) = \displaystyle\int_{\mathbb{R}^N} J(x - y) u(y, t)\, dy - u(x, t), & x \in \Omega,\ t > 0, \\ u(x, t) = 0, & x \notin \Omega,\ t > 0, \\ u(x, 0) = u_0(x), & x \in \Omega. \end{cases} \tag{2}$$

In this model we have that diffusion takes place in the whole $\mathbb{R}^N$ but we impose that
u vanishes outside $\Omega$. In the biological interpretation, we have a hostile environment
outside $\Omega$: any individual that jumps outside dies instantaneously. This is the
analogue of what is called homogeneous Dirichlet boundary conditions for the
heat equation. However, the boundary datum is not understood in the usual sense,
since we are not imposing that u is continuous up to $\partial\Omega$.
As an analogue of Neumann boundary conditions for nonlocal problems we propose

$$\begin{cases} u_t(x, t) = \displaystyle\int_{\Omega} J(x - y)(u(y, t) - u(x, t))\, dy, & x \in \Omega,\ t > 0, \\ u(x, 0) = u_0(x), & x \in \Omega. \end{cases} \tag{3}$$

In this model the integral term takes into account the diffusion inside
$\Omega$. In fact, as we have explained, the integral $\int_{\Omega} J(x - y)(u(y, t) - u(x, t))\, dy$ takes
into account the individuals arriving at or leaving position x from other places. Since
we are integrating in $\Omega$, we are imposing that diffusion takes place only in $\Omega$. The
individuals may not enter nor leave the domain. This is the analogue of what is
called homogeneous Neumann boundary conditions in the literature.
These nonlocal problems have been used to model very different applied situations,
for example in biology [12, 23, 29], image processing [22, 27], particle
systems [9], coagulation models [21], nonlocal anisotropic models for phase transition
[1], mathematical finance [26], etc. Besides the interest for the applications
there is also a great amount of work dealing with purely mathematical issues. For
example, see [3, 4, 6, 7, 10–13, 15–20, 24, 25], and references therein.
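Associated with these evolution problems there are two eigenvalue problems that
play a central role in determining the asymptotic behaviour of the solutions as $t \to +\infty$.
They are given by

$$\begin{cases} \phi(x) - \displaystyle\int_{\mathbb{R}^N} J(x - y) \phi(y)\, dy = \lambda_1 \phi(x), & x \in \Omega; \\ \phi(x) = 0, & x \notin \Omega, \end{cases}$$

and

$$-\int_{\Omega} J(x - y)(\varphi(y) - \varphi(x))\, dy = \beta_1 \varphi(x), \qquad x \in \Omega.$$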

The analysis of these eigenvalue problems is our main concern in this chapter. The
proofs of the main results are taken from the references [2, 14, 28], but we include
here some details to make this chapter self-contained.
We have to mention the close relation between this kind of evolution problems
and probability theory. In fact, when one looks at a Lévy process [8] the nonlocal
operator that appears naturally is a fractional power of the Laplacian. This is out of
the scope of this chapter and we refer to [5] for a reference concerning the interplay
between nonlocal partial differential equations and probability. Although we are not
dealing with probability issues, let us explain briefly why the concrete problem (1)
has a clear probabilistic interpretation. Let $(E, \mathcal{E})$ be a measurable space and
$P : E \times \mathcal{E} \to [0, 1]$ be a probability transition on E. Then, for any $x \in E$, $A \in \mathcal{E}$, we define
a transition function by

$$P_t(x, A) = e^{-t} \sum_{n=0}^{+\infty} \frac{t^n}{n!} P^{(n)}(x, A), \qquad t \in \mathbb{R}_+,$$

where $P^{(n)}$ denotes the n-th iterate of P. The associated family of Markovian
operators, $P_t f(x) = \int f(y) P_t(x, dy)$, satisfies

$$\frac{\partial}{\partial t} P_t f(x) = \int P_t f(y) P(x, dy) - P_t f(x).$$
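On a finite state space the transition function and the evolution equation it satisfies can be verified directly. The sketch below assumes that E has five states, so that P is a row-stochastic matrix, the iterates of P are matrix powers and the integrals become matrix-vector products. The series is truncated after a fixed number of terms, and the name `transition_function` is hypothetical.

```python
import numpy as np

def transition_function(P: np.ndarray, t: float, terms: int = 80) -> np.ndarray:
    """P_t = exp(-t) * sum_n t^n / n! * P^n, truncated after `terms` terms."""
    result = np.zeros_like(P)
    term = np.eye(P.shape[0])            # holds t^n / n! * P^n, starting at n = 0
    for n in range(terms):
        result += term
        term = (term @ P) * (t / (n + 1))
    return np.exp(-t) * result

rng = np.random.default_rng(0)
P = rng.random((5, 5))
P /= P.sum(axis=1, keepdims=True)        # make every row a probability vector
f = rng.random(5)

t, h = 0.7, 1e-5
P_t = transition_function(P, t)

# Each P_t(x, .) is again a probability distribution.
print(np.allclose(P_t.sum(axis=1), 1.0))

# Central difference in t against the right-hand side P (P_t f) - P_t f.
time_derivative = (transition_function(P, t + h) @ f
                   - transition_function(P, t - h) @ f) / (2 * h)
right_hand_side = P @ (P_t @ f) - P_t @ f
print(np.allclose(time_derivative, right_hand_side, atol=1e-7))
```

In matrix form the transition function is the exponential of t(P - I), which is why its time derivative equals (P - I) applied to P_t f.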
∂t

Now, consider the Markov process $(Z_t)_{t \ge 0}$ associated to the transition function
$(P_t)_{t \ge 0}$, and denote by $\mu_t$ its distribution. Then the family $(\mu_t)_{t \ge 0}$ satisfies a linear
equation of the form

$$\frac{\partial}{\partial t} \mu_t = \int P(y, \cdot)\, \mu_t(dy) - \mu_t.$$

In particular, for $E = \mathbb{R}^N$, if the probability transition P(x, dy) has a density $y \mapsto J(x, y)$,
and $\mu_t$ has a density $y \mapsto u(y, t)$, then we get the following equation

$$\frac{\partial}{\partial t} u(x, t) = \int J(x, y) u(y, t)\, d\lambda(y) - u(x, t). \tag{4}$$

With different particular choices of P we recover the equation studied in the
Dirichlet and the Neumann cases. For example, if P(x, dy) = J(y − x) dy is
the transition probability of a random walk, Eq. (4) is just Eq. (1). The results
described here give interesting information on the asymptotic behaviour of some
natural Markov processes.

2 The First Eigenvalue with Dirichlet or Neumann Boundary Conditions

Now, let us introduce with some detail the two eigenvalue problems that we discuss
in this chapter (Dirichlet or Neumann boundary conditions). We will deal mainly
with the $L^2$ formulation that gives linear eigenvalue problems, and at the end of
the chapter we will comment briefly on the results and difficulties for nonlinear
eigenvalues.

2.1 Dirichlet Boundary Conditions

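Let $\lambda_1 = \lambda_1(\Omega)$ be given by

$$\lambda_1 = \inf_{u \in L^2(\Omega)} \frac{\dfrac{1}{2} \displaystyle\int_{\mathbb{R}^N} \int_{\mathbb{R}^N} J(x - y)(\tilde{u}(x) - \tilde{u}(y))^2\, dx\, dy}{\displaystyle\int_{\Omega} (u(x))^2\, dx}. \tag{5}$$

Here and in what follows we denote by $\tilde{u}$ the extension by zero of u to the whole
$\mathbb{R}^N$, that is,

$$\tilde{u}(x) = \begin{cases} u(x), & x \in \Omega, \\ 0, & x \in \mathbb{R}^N \setminus \Omega. \end{cases}$$

If the minimum is attained at a function $\phi_1$ we get, just by differentiation, that it is
a solution of

$$\phi_1(x) - \int_{\mathbb{R}^N} J(x - y) \tilde{\phi}_1(y)\, dy = \lambda_1 \phi_1(x), \qquad x \in \Omega; \tag{6}$$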

conversely, it is easy to check that if $\phi_1 > 0$ is a solution to (6) (with $\lambda_1$ the smallest
eigenvalue) then it is a minimizer of (5). Hence, we look for the first eigenvalue
of (6), which is equivalent to

$$(1 - \lambda_1) \phi_1(x) = \int_{\mathbb{R}^N} J(x - y) \tilde{\phi}_1(y)\, dy, \qquad x \in \Omega. \tag{7}$$
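The differentiation step that leads from the variational problem (5) to the eigenvalue equation (6) can be spelled out as follows. This is a sketch under the assumption that a minimizer $\phi_1$ exists. The symbols N and D (numerator and denominator of (5)) and the test function $v \in L^2(\Omega)$ are shorthand introduced here, and a tilde denotes extension by zero. The computation uses only the symmetry of J and the normalization $\int_{\mathbb{R}^N} J = 1$.

```latex
\begin{align*}
N(u) &= \frac{1}{2} \int_{\mathbb{R}^N}\!\int_{\mathbb{R}^N} J(x-y)\,(\tilde u(x) - \tilde u(y))^2 \, dx\, dy,
\qquad D(u) = \int_{\Omega} u(x)^2 \, dx, \\[4pt]
\frac{d}{d\varepsilon} N(\phi_1 + \varepsilon v)\Big|_{\varepsilon = 0}
 &= \int_{\mathbb{R}^N}\!\int_{\mathbb{R}^N} J(x-y)\,(\tilde\phi_1(x) - \tilde\phi_1(y))\,(\tilde v(x) - \tilde v(y)) \, dx\, dy \\
 &= 2 \int_{\mathbb{R}^N}\!\int_{\mathbb{R}^N} J(x-y)\,(\tilde\phi_1(x) - \tilde\phi_1(y))\,\tilde v(x) \, dy\, dx
 \qquad \text{(swap } x \leftrightarrow y,\ J \text{ symmetric)} \\
 &= 2 \int_{\Omega} v(x) \Big( \phi_1(x) - \int_{\mathbb{R}^N} J(x-y)\,\tilde\phi_1(y)\, dy \Big) dx
 \qquad \Big(\text{since } \int_{\mathbb{R}^N} J = 1\Big), \\[4pt]
\frac{d}{d\varepsilon} D(\phi_1 + \varepsilon v)\Big|_{\varepsilon = 0}
 &= 2 \int_{\Omega} \phi_1(x)\, v(x)\, dx .
\end{align*}
% Minimality of N/D at \phi_1, with N(\phi_1) = \lambda_1 D(\phi_1), gives N' = \lambda_1 D', that is,
\begin{equation*}
\int_{\Omega} v(x) \Big( \phi_1(x) - \int_{\mathbb{R}^N} J(x-y)\,\tilde\phi_1(y)\, dy - \lambda_1 \phi_1(x) \Big) dx = 0
\qquad \text{for all } v \in L^2(\Omega),
\end{equation*}
% which is the eigenvalue equation for \phi_1 in \Omega.
```

Since v is arbitrary, the bracket must vanish almost everywhere in $\Omega$, which is the stated equation for $\phi_1$.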

Let $S : L^2(\Omega) \to L^2(\Omega)$ be the operator given by

$$S(u)(x) := \int_{\mathbb{R}^N} J(x - y) \tilde{u}(y)\, dy = \int_{\Omega} J(x - y) u(y)\, dy, \qquad x \in \Omega.$$

Hence we are looking for the largest eigenvalue, $\mu = 1 - \lambda_1$, of S. Since the kernel
is smooth, S is compact and then this eigenvalue is attained at some function $\phi_1(x)$
that turns out to be an eigenfunction for our original problem (6). By taking $|\phi_1|$
instead of $\phi_1$ in (5) we may assume that $\phi_1 \ge 0$ in $\Omega$. Indeed, one simply has to use
the fact that $(a - b)^2 \ge (|a| - |b|)^2$.
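A discretized version of the operator S gives a quick numerical picture of the first eigenvalue. The following sketch relies on assumptions of ours: the domain is the interval (0, 4), the kernel is the triangular function max(0, 1 - |z|), which is nonnegative, even, continuous and has unit integral, and the integral is replaced by a midpoint rule. The function names are hypothetical.

```python
import numpy as np

def triangular_kernel(z: np.ndarray) -> np.ndarray:
    """J(z) = max(0, 1 - |z|): nonnegative, even, continuous, with unit integral."""
    return np.maximum(0.0, 1.0 - np.abs(z))

def dirichlet_first_eigenpair(length: float = 4.0, num_points: int = 400):
    """Midpoint-rule discretisation of S(u)(x) = int_Omega J(x - y) u(y) dy on (0, length).

    Returns (lambda_1, phi_1, S), where mu = 1 - lambda_1 is the largest eigenvalue of S.
    """
    h = length / num_points
    x = (np.arange(num_points) + 0.5) * h
    S = triangular_kernel(x[:, None] - x[None, :]) * h   # symmetric matrix
    eigenvalues, eigenvectors = np.linalg.eigh(S)         # ascending eigenvalues
    mu = eigenvalues[-1]
    phi = eigenvectors[:, -1]
    if phi.sum() < 0:                                     # fix the sign of the eigenvector
        phi = -phi
    return 1.0 - mu, phi, S

lambda_1, phi_1, S = dirichlet_first_eigenpair()
print(0.0 < lambda_1 < 1.0)                               # first eigenvalue lies in (0, 1)
print(phi_1.min() > 0.0)                                  # eigenvector is strictly positive
print(np.allclose(phi_1 - S @ phi_1, lambda_1 * phi_1))   # discrete form of the eigenvalue equation
```

In this discretisation the computed eigenvector stays bounded away from zero up to the ends of the interval, in line with the observation that the eigenfunction does not take the zero boundary value continuously.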
Let us present some properties of the eigenvalue problem (6).
Proposition 2.1 ([2, 14]) Let $\lambda_1$ be the first eigenvalue of (6) and denote by $\phi_1(x)$ a
corresponding non-negative eigenfunction. Then $\phi_1(x)$ is strictly positive in $\Omega$ and
$\lambda_1$ is a positive simple eigenvalue with $\lambda_1 < 1$.
Proof Since J(0) > 0 and J is continuous we have that $B(0, d) \subset \mathrm{supp}(J)$ for
some d > 0. Let us assume, for simplicity, that supp(J) = B(0, 1). First, observe
that $\lambda_1 = 1$ cannot be an eigenvalue since then

$$\int_{\mathbb{R}^N} J(x - y) \tilde{\phi}_1(y)\, dy = 0, \qquad \phi_1(x) \ge 0,$$

which implies $\phi_1 = 0$. Consequently, we have that

$$(1 - \lambda_1) \phi_1(x) = \int_{\mathbb{R}^N} J(x - y) \tilde{\phi}_1(y)\, dy, \qquad x \in \Omega, \quad \lambda_1 \ne 1,$$

which implies that $\phi_1$ is uniformly continuous in $\Omega$. In what follows, we consider
bounded continuous functions in $\Omega$ extended in the natural way to $\overline{\Omega}$. We begin
with the positivity of the eigenfunction $\phi_1$. Assume for contradiction that the set
$B = \{x \in \overline{\Omega} : \phi_1(x) = 0\} \ne \emptyset$. Then, from the continuity of $\phi_1$ in $\overline{\Omega}$, we have that B
is closed. We next prove that B is also open, and hence, since $\Omega$ is connected, by
standard topological arguments we conclude that $\overline{\Omega} \equiv B$, a contradiction. Consider
$x_0 \in B$. Since $\phi_1 \ge 0$, we obtain from (7) that $\overline{\Omega} \cap B(x_0, 1) \subset B$ (we use here that
supp(J) = B(0, 1)). Hence B is open and the result follows.
Next, let us prove that $\lambda_1 > 0$. Assume by contradiction that $\lambda_1 \le 0$ and denote
by $M^*$ the maximum of $\phi_1$ in $\overline{\Omega}$ and by $x^*$ a point where such maximum is attained.
Assume for the moment that $x^* \in \Omega$. One can choose $x^*$ in such a way that $\phi_1 \not\equiv M^*$
in $\Omega \cap B(x^*, 1)$. By using (7) we get that

$$M^* \le (1 - \lambda_1) \phi_1(x^*) = \int_{\mathbb{R}^N} J(x^* - y) \tilde{\phi}_1(y)\, dy < M^*$$

and a contradiction follows. If $x^* \in \partial\Omega$, we obtain a similar contradiction after
substituting and passing to the limit in (7) on a sequence $\{x_n\} \subset \Omega$, $x_n \to x^*$ as
$n \to \infty$. To obtain the upper bound, assume that $\lambda_1 \ge 1$. Then, from (7) we have
for every $x \in \Omega$ that

$$0 \ge (1 - \lambda_1) \phi_1(x) = \int_{\mathbb{R}^N} J(x - y) \tilde{\phi}_1(y)\, dy,$$

a contradiction with the positivity of $\phi_1$.

Finally, to prove that $\lambda_1$ is a simple eigenvalue, let $\phi_1 \ne \phi_2$ be two different
eigenfunctions associated to $\lambda_1$ and define

$$C^* = \inf\{C > 0 : \phi_2(x) \le C \phi_1(x),\ x \in \overline{\Omega}\}.$$

The regularity of the eigenfunctions and the previous analysis show that $C^*$ is
nontrivial and bounded. Moreover, from its definition, there must exist $x^* \in \overline{\Omega}$
such that $\phi_2(x^*) = C^* \phi_1(x^*)$. Define $\phi(x) = C^* \phi_1(x) - \phi_2(x)$. From the linearity
of (6), $\phi$ is a non-negative eigenfunction associated to $\lambda_1$ with $\phi(x^*) = 0$. From the
positivity of the eigenfunctions stated above, it must be $\phi \equiv 0$. Therefore, $\phi_2(x) = C^* \phi_1(x)$
and the result follows. This completes the proof.

Observe that the first eigenfunction $\phi_1$ is strictly positive in $\Omega$ (with a positive
continuous extension to $\overline{\Omega}$) and vanishes outside $\Omega$. Therefore a discontinuity occurs
on $\partial\Omega$ and the boundary value is not taken in the classical sense.
Now, let us prove that the first eigenvalue $\lambda_1$ gives an exponential decay for the
solutions to the evolution problem (2).
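Theorem 2.2 ([2, 14]) If $u_0 \in L^2(\Omega)$, then the solution u of (2) decays to zero as
$t \to \infty$ with an exponential rate,

$$\|u(\cdot, t)\|_{L^2(\Omega)} \le \|u_0\|_{L^2(\Omega)}\, e^{-\lambda_1 t}. \tag{8}$$

If $u_0$ is continuous, positive and bounded, then there exist positive constants C and
$C^*$ such that

$$\|u(\cdot, t)\|_{L^\infty(\Omega)} \le C e^{-\lambda_1 t} \tag{9}$$

and

$$\lim_{t \to \infty} \|e^{\lambda_1 t} u(\cdot, t) - C^* \phi_1(\cdot)\|_{L^\infty(\Omega)} = 0. \tag{10}$$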

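Proof Using the symmetry of J, we have
\[
\frac{\partial}{\partial t}\,\frac12\int_\Omega u^2(x,t)\,dx
= \int_{\mathbb{R}^N}\int_{\mathbb{R}^N} J(x-y)\,(u(y,t)-u(x,t))\,u(x,t)\,dy\,dx
= -\frac12\int_{\mathbb{R}^N}\int_{\mathbb{R}^N} J(x-y)\,(u(y,t)-u(x,t))^2\,dy\,dx.
\]
From the definition of λ1, (5), we get
\[
\frac{\partial}{\partial t}\int_\Omega u^2(x,t)\,dx \le -2\lambda_1\int_\Omega u^2(x,t)\,dx.
\]
Therefore
\[
\int_\Omega u^2(x,t)\,dx \le e^{-2\lambda_1 t}\int_\Omega u_0^2(x)\,dx
\]
and (8) is obtained.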
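The L² decay just proved can be observed on a simple discretization. The following is a minimal numerical sketch, not part of the original argument: the one-dimensional domain Ω = (−1, 1), the kernel J(z) = max(1 − |z|, 0), the midpoint grid and the initial datum are all illustrative assumptions. The semi-discrete problem u' = (K − I)u is solved exactly through the eigendecomposition of the symmetric matrix K.

```python
import numpy as np

def kernel(z):
    # Illustrative kernel (an assumption): J(z) = max(1 - |z|, 0), even, unit integral.
    return np.maximum(1.0 - np.abs(z), 0.0)

# Midpoint grid on Omega = (-1, 1) (assumed domain).
n = 200
h = 2.0 / n
x = -1.0 + h * (np.arange(n) + 0.5)

# Discrete version of u -> int_Omega J(x - y) u(y) dy (u vanishes outside Omega).
K = kernel(x[:, None] - x[None, :]) * h

mu, V = np.linalg.eigh(K)          # K is symmetric
lambda1 = 1.0 - mu[-1]             # discrete first eigenvalue

u0 = np.sin(3.0 * np.pi * x) + 0.5 * x**2 + 0.1   # arbitrary initial datum
coeffs = V.T @ u0

def solve(t):
    # Exact solution of the semi-discrete problem u' = (K - I) u.
    return V @ (np.exp((mu - 1.0) * t) * coeffs)

def l2_norm(u):
    return np.sqrt(h * np.sum(u**2))

times = [0.5, 1.0, 2.0, 5.0]
# Ratio of the two sides of the decay estimate; it should not exceed 1.
ratios = [l2_norm(solve(t)) / (np.exp(-lambda1 * t) * l2_norm(u0)) for t in times]
print(lambda1, ratios)
```

For the discrete problem the bound holds exactly, since every eigenvalue of K − I is at most −λ1.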

We now establish the decay rate and the convergence stated in (9) and (10),
respectively. Consider a nontrivial and non-negative continuous initial datum u0(x)
and let u(x, t) be the corresponding solution to (2). We first note that u(x, t) is a
continuous function satisfying u(x, t) > 0 for every x ∈ Ω and t > 0, and the
same holds in Ω̄. This instantaneous positivity can be obtained by using
topological arguments analogous to those used in Proposition 2.1.
In order to deal with the asymptotic analysis, it is more convenient to introduce
the rescaled function v(x, t) = e^{λ1 t} u(x, t). The function v(x, t) satisfies
\[
v_t(x,t) = \int_{\mathbb{R}^N} J(x-y)\,v(y,t)\,dy - (1-\lambda_1)\,v(x,t), \qquad x\in\Omega. \tag{11}
\]
On the other hand, Cφ1(x) is a solution of (11) for every C ∈ R and,
moreover, it follows from the above eigenfunction analysis that the set of stationary
solutions of (11) is given by S∗ = {Cφ1, C ∈ R}.
Define now, for every t > 0, the function
\[
C^*(t) = \inf\{C > 0 : v(x,t) \le C\varphi_1(x),\ x\in\Omega\}.
\]
By definition and by using the linearity of Eq. (11), C^*(t) is a non-increasing
function. In fact, this is a consequence of the comparison principle applied to the
solutions C^*(t1)φ1(x) and v(x, t) for t larger than any fixed t1 > 0. It implies that
C^*(t1)φ1(x) ≥ v(x, t) for every t ≥ t1, and therefore C^*(t1) ≥ C^*(t) for every
t ≥ t1. In an analogous way, one can see that the function
\[
C_*(t) = \sup\{C > 0 : v(x,t) \ge C\varphi_1(x),\ x\in\Omega\}
\]
is non-decreasing. These properties imply that the following two limits exist,
\[
\lim_{t\to\infty} C^*(t) = K^* \qquad\text{and}\qquad \lim_{t\to\infty} C_*(t) = K_*,
\]

and also provide the compactness of the orbits which is necessary to pass to the limit
(after extracting subsequences if needed) in order to obtain that v(·, t + tn) → w(·, t)
as tn → ∞ uniformly on compact subsets of Ω̄ × R+, where w(x, t) is a continuous
function which satisfies (11). Let us recall the concept of ω-limit set of the trajectory
u(t) that begins with u(0) = u0,
\[
\omega(u_0) := \big\{ g\in L^2(\Omega) : \exists\, t_n\to\infty \text{ with } u(t_n)\to g \text{ in } L^2(\Omega) \big\}.
\]
For every g ∈ ω(u0) we also have
\[
K_*\,\varphi_1(x) \le g(x) \le K^*\varphi_1(x).
\]

Moreover, C^*(t) plays the role of a Lyapunov function, and this fact allows us to
conclude that ω(u0) ⊂ S∗, the set of stationary solutions of (11), and the uniqueness
of the profile. In more detail, assume that g ∈ ω(u0) does not belong to S∗,
consider the solution w(x, t) of (11) with initial datum g(x), and define
\[
C^*(w)(t) = \inf\{C > 0 : w(x,t) \le C\varphi_1(x),\ x\in\Omega\}.
\]
It is clear that W(x, t) = K^*φ1(x) − w(x, t) is a non-negative continuous solution
of (11) and it becomes strictly positive for every t > 0. This implies that there
exists t∗ > 0 such that C^*(w)(t∗) < K^* and, by the convergence, the same holds
before passing to the limit. Hence, C^*(t∗ + tj) < K^* if j is large enough, which
contradicts the properties of C^*(t). The same arguments allow us to establish
the uniqueness of the profile.

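We summarize in the next result different versions of the variational
characterization of the principal eigenvalue.
Theorem 2.3 ([28]) The first eigenvalue, λ1(Ω), can be variationally characterized
as
\[
\lambda_1(\Omega) = 1 - \left( \sup_{\substack{u\in L^2(\Omega)\\ u\neq 0}}
\frac{\displaystyle\int_\Omega\Big(\int_\Omega J(x-y)u(y)\,dy\Big)^2 dx}{\displaystyle\int_\Omega u^2(x)\,dx} \right)^{1/2} \tag{12}
\]
or
\[
\lambda_1(\Omega) = 1 - \sup_{\substack{u\in L^2(\Omega)\\ u\neq 0}}
\frac{\displaystyle\int_\Omega\int_\Omega J(x-y)u(x)u(y)\,dy\,dx}{\displaystyle\int_\Omega u^2(x)\,dx} \tag{13}
\]
or
\[
\lambda_1(\Omega) = \frac12 \inf_{\substack{u\in L^2(\Omega)\\ u\neq 0}}
\frac{\displaystyle\int_{\mathbb{R}^N}\int_{\mathbb{R}^N} J(x-y)(u(x)-u(y))^2\,dy\,dx}{\displaystyle\int_\Omega u^2(x)\,dx}. \tag{14}
\]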

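Proof Recall that we introduced the compact operator S : L²(Ω) → L²(Ω) given
by
\[
S(u)(x) := \int_{\mathbb{R}^N} J(x-y)u(y)\,dy = \int_\Omega J(x-y)u(y)\,dy, \qquad x\in\Omega,
\]
and that we have
\[
\lambda_1 = 1 - \|S\|.
\]
Then, the variational characterizations (12) and (13) can be obtained at once
since
\[
\|S\|^2 = \sup_{\substack{u\in L^2(\Omega)\\ u\neq 0}} \frac{|Su|^2_{L^2(\Omega)}}{|u|^2_{L^2(\Omega)}}
= \sup_{\substack{u\in L^2(\Omega)\\ u\neq 0}}
\frac{\displaystyle\int_\Omega\Big(\int_\Omega J(x-y)u(y)\,dy\Big)^2 dx}{\displaystyle\int_\Omega u^2(x)\,dx},
\]
and
\[
\|S\| = \sup_{\substack{u\in L^2(\Omega)\\ u\neq 0}} \frac{|\langle Su,u\rangle|}{|u|^2_{L^2(\Omega)}}
= \sup_{\substack{u\in L^2(\Omega)\\ u\neq 0}}
\frac{\displaystyle\int_\Omega\int_\Omega J(x-y)u(x)u(y)\,dy\,dx}{\displaystyle\int_\Omega u^2(x)\,dx},
\]
since S is self-adjoint. Finally, by expanding the square in the numerator and
applying Fubini's theorem, it is easily seen that
\[
\frac{\displaystyle\frac12\int_{\mathbb{R}^N}\int_{\mathbb{R}^N} J(x-y)(u(x)-u(y))^2\,dy\,dx}{\displaystyle\int_\Omega u^2(x)\,dx}
= 1 - \frac{\displaystyle\int_\Omega\int_\Omega J(x-y)u(x)u(y)\,dy\,dx}{\displaystyle\int_\Omega u^2(x)\,dx},
\]
since J is even with unit integral and u = 0 in R^N \ Ω. Thus (14) follows.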
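The agreement of the three characterizations can be checked on a discretization. The sketch below is only an illustration under assumptions that are not in the text: N = 1, Ω = (−1, 1), the kernel J(z) = max(1 − |z|, 0), and a midpoint grid on (−2, 2), which is large enough to contain Ω + supp J so that the double integral over R^N in (14) is captured.

```python
import numpy as np

def kernel(z):
    # Illustrative kernel (an assumption): even, supported in [-1, 1], unit integral.
    return np.maximum(1.0 - np.abs(z), 0.0)

h = 0.02
xg = -2.0 + h * (np.arange(200) + 0.5)       # grid on (-2, 2), contains Omega + supp J
inside = np.abs(xg) < 1.0                    # Omega = (-1, 1)
W = kernel(xg[:, None] - xg[None, :]) * h    # kernel weights on the large grid
K = W[np.ix_(inside, inside)]                # discrete operator S on Omega

eigvals, V = np.linalg.eigh(K)
lam_12 = 1.0 - np.linalg.norm(K, 2)          # (12): 1 - sup |Su| / |u|
lam_13 = 1.0 - eigvals[-1]                   # (13): 1 - sup <Su, u> / |u|^2

def energy_quotient(u):
    # Quotient in (14) for u extended by zero outside Omega.
    ug = np.zeros_like(xg)
    ug[inside] = u
    diff = ug[:, None] - ug[None, :]
    return 0.5 * np.sum(W * diff**2) / np.sum(u**2)

lam_14 = energy_quotient(V[:, -1])           # attained at the first eigenvector
rng = np.random.default_rng(0)
trial = [energy_quotient(rng.standard_normal(K.shape[0])) for _ in range(20)]
print(lam_12, lam_13, lam_14, min(trial))
```

The three values coincide up to rounding, and every random trial function gives a quotient that is not smaller, as expected from an infimum.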

As an immediate consequence of the variational characterizations (12) and (13),
we have an estimate for λ1(Ω), which will be useful when dealing with the
asymptotic behavior of λ1(Ω) in large and small domains.
Corollary 2.4 For the principal eigenvalue λ1(Ω) we have the estimates:
\[
\left(\frac{1}{|\Omega|}\int_\Omega A^2(x)\,dx\right)^{1/2} \le 1-\lambda_1(\Omega)
\le \left(\sup_{y\in\Omega}\int_\Omega A(x)J(x-y)\,dx\right)^{1/2}, \tag{15}
\]
where
\[
A(x) = \int_\Omega J(x-y)\,dy.
\]
Proof Taking u ≡ 1 as test function in (12), we obtain
\[
1-\lambda_1(\Omega) \ge \left(\frac{1}{|\Omega|}\int_\Omega A^2(x)\,dx\right)^{1/2}. \tag{16}
\]
On the other hand, thanks to the Cauchy–Schwarz inequality, we have:
\[
\int_\Omega\Big(\int_\Omega J(x-y)u(y)\,dy\Big)^2 dx \le \int_\Omega A(x)\int_\Omega J(x-y)u^2(y)\,dy\,dx.
\]
Using Fubini's theorem, we get
\[
\int_\Omega\Big(\int_\Omega J(x-y)u(y)\,dy\Big)^2 dx
\le \int_\Omega u^2(y)\int_\Omega A(x)J(x-y)\,dx\,dy
\le \sup_{y\in\Omega}\Big(\int_\Omega A(x)J(x-y)\,dx\Big)\int_\Omega u^2(y)\,dy,
\]
and, since by (12) the supremum of the left-hand side over normalized u equals
(1 − λ1(Ω))², this implies that
\[
1-\lambda_1(\Omega) \le \left(\sup_{y\in\Omega}\int_\Omega A(x)J(x-y)\,dx\right)^{1/2}. \tag{17}
\]
Finally, (15) follows from (16) and (17). This concludes the proof of the corollary.
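As a sanity check, the two bounds produced by this proof can be evaluated on a grid: the lower bound comes from the test function u ≡ 1, and the upper bound from the Cauchy–Schwarz and Fubini argument, which controls (1 − λ1(Ω))² by the supremum over y of ∫ A(x)J(x − y) dx. The setting below (N = 1, Ω = (−1, 1), kernel J(z) = max(1 − |z|, 0), midpoint rule) is an assumption made only for illustration.

```python
import numpy as np

def kernel(z):
    # Illustrative kernel (an assumption): J(z) = max(1 - |z|, 0).
    return np.maximum(1.0 - np.abs(z), 0.0)

n = 200
h = 2.0 / n
x = -1.0 + h * (np.arange(n) + 0.5)          # midpoint grid on Omega = (-1, 1)
K = kernel(x[:, None] - x[None, :]) * h

A = K.sum(axis=1)                            # A(x) = int_Omega J(x - y) dy
one_minus_lambda1 = np.linalg.eigvalsh(K)[-1]

measure = n * h                              # |Omega| = 2
lower = np.sqrt(h * np.sum(A**2) / measure)  # bound from the test function u = 1
# (K.T @ A)[j] approximates int_Omega A(x) J(x - y_j) dx; this bounds (1 - lambda1)^2.
upper = np.sqrt(np.max(K.T @ A))
print(lower, one_minus_lambda1, upper)
```

Both inequalities hold exactly for the discrete operator, because the same two arguments apply verbatim to a symmetric matrix with non-negative entries.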

The estimates (15) obtained in Corollary 2.4 are not sharp: if the domain Ω
contains a ball of radius 2, say, then the right-hand side in (15) equals one, so the
estimate is useless.
Now, we turn our attention to the dependence of the first eigenvalue on the
domain Ω. A first consequence of the variational characterization, (5), is the strict
monotonicity of λ1(Ω).
Theorem 2.5 ([28]) The principal eigenvalue, λ1(Ω), is decreasing with respect to
the domain, that is, if Ω1 ⊊ Ω2, then λ1(Ω1) > λ1(Ω2).
Proof of Theorem 2.5 We notice that L²(Ω1) ⊂ L²(Ω2), provided we extend all
functions of the first space by zero outside Ω1. Hence we have, thanks to the
characterization (12), that λ1(Ω1) ≥ λ1(Ω2). To show that the inequality is strict,
we notice that if λ1(Ω1) = λ1(Ω2), then we obtain an associated eigenfunction
which is positive in Ω1, but zero in Ω2 \ Ω1, which contradicts the strong maximum
principle.
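The strict monotonicity is easy to observe numerically. The following sketch computes a discrete first eigenvalue on nested intervals; the choice N = 1, the intervals (−a, a), the kernel J(z) = max(1 − |z|, 0) and the helper name `lambda1` are illustrative assumptions, not part of the original text.

```python
import numpy as np

def kernel(z):
    # Illustrative kernel (an assumption): J(z) = max(1 - |z|, 0).
    return np.maximum(1.0 - np.abs(z), 0.0)

def lambda1(half_width, h=0.01):
    # Discrete first Dirichlet eigenvalue on Omega = (-half_width, half_width).
    n = int(round(2.0 * half_width / h))
    x = -half_width + h * (np.arange(n) + 0.5)
    K = kernel(x[:, None] - x[None, :]) * h
    return 1.0 - np.linalg.eigvalsh(K)[-1]

# Nested domains (-0.5, 0.5), (-1, 1), (-2, 2), (-4, 4).
values = [lambda1(a) for a in (0.5, 1.0, 2.0, 4.0)]
print(values)
```

With a common mesh size the grids are nested, so each matrix is a principal submatrix of the next one and the computed eigenvalues decrease strictly as the interval grows.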

Next, we analyze perturbations Ωδ of a fixed domain Ω, where δ is a small
parameter, and consider the issues of continuity and differentiability of λ1(Ωδ) with
respect to δ. We assume that the perturbed domain verifies Ωδ = Φ(δ, Ω), where
Φ : (−ε, ε) × Ω̄ → R^N takes the form
\[
\Phi(\delta,x) = x + \Psi(\delta,x), \tag{18}
\]
with Ψ(0, ·) = 0. The continuity of λ1(Ωδ) is a more or less simple consequence of
the continuity of Ψ with respect to δ. We denote by DΨ the differential of Ψ with
respect to x.
Theorem 2.6 ([28]) Let λ1(Ωδ) be the principal eigenvalue in Ωδ, and assume
Ωδ = Φ(δ, Ω), where Φ has the form (18) with Ψ, DΨ ∈ C((−ε, ε) × Ω̄) for
some ε > 0 and Ψ(0, ·) = 0. Then, λ1(Ωδ) → λ1(Ω) as δ → 0.
Proof of Theorem 2.6 We first notice that for small δ we can always assume Ω1 ⊂
Ωδ ⊂ Ω2 for some smooth domains Ω1 and Ω2 not depending on δ. Thanks to
Theorem 2.5 this implies
\[
0 < \lambda_1(\Omega_2) < \lambda_1(\Omega_\delta) < \lambda_1(\Omega_1) < 1. \tag{19}
\]
Now let uδ be a positive eigenfunction associated to λ1(Ωδ):
\[
\int_{\Omega_\delta} J(x-y)\,u_\delta(y)\,dy = (1-\lambda_1(\Omega_\delta))\,u_\delta(x), \qquad x\in\Omega_\delta.
\]
We make the change of variables x = z + Ψ(δ, z), y = w + Ψ(δ, w) with z, w ∈ Ω
to obtain
\[
\int_\Omega J\big(z-w+\Psi(\delta,z)-\Psi(\delta,w)\big)\,v_\delta(w)\,\Theta(\delta,w)\,dw = (1-\lambda_1(\Omega_\delta))\,v_\delta(z), \tag{20}
\]
for z ∈ Ω, where vδ(w) = uδ(w + Ψ(δ, w)) and Θ(δ, w) = det(I + DΨ(δ, w)). We
select vδ with the normalization |vδ|_{L²(Ω)} = 1. Then, for every sequence δn → 0,
we have a subsequence (still denoted by δn) such that vδn ⇀ v0 weakly in L²(Ω).
Since
\[
J\big(z-w+\Psi(\delta_n,z)-\Psi(\delta_n,w)\big)\,\Theta(\delta_n,w) \to J(z-w)
\]
uniformly in z, w ∈ Ω, we obtain thanks to weak convergence
\[
\int_\Omega J\big(z-w+\Psi(\delta_n,z)-\Psi(\delta_n,w)\big)\,v_{\delta_n}(w)\,\Theta(\delta_n,w)\,dw \to \int_\Omega J(z-w)\,v_0(w)\,dw
\]
for almost every z ∈ Ω. Using the dominated convergence theorem, we also have
the convergence in L²(Ω).
On the other hand, since λ1(Ωδn) is bounded, we may pass to a further
subsequence to have λ1(Ωδn) → μ, where 0 < μ < 1, thanks to (19). Then,
setting δ = δn in (20) and passing to the limit we have that the convergence of vδn
to v0 is strong in L²(Ω). Therefore, |v0|_{L²(Ω)} = 1. By (20) we finally have
\[
\int_\Omega J(x-y)\,v_0(y)\,dy = (1-\mu)\,v_0(x), \qquad x\in\Omega,
\]
with v0 ≥ 0, v0 ≢ 0. According to Theorem 2.3, we obtain that μ = λ1(Ω), that
is, λ1(Ωδn) → λ1(Ω). Since δn was arbitrary, this shows that λ1(Ωδ) → λ1(Ω) as
δ → 0, as we wanted to prove.

We now consider the question of differentiability of λ1(Ωδ). We assume the
function Ψ in (18) is differentiable and prove that λ1(Ωδ) is differentiable at δ = 0,
providing in addition an explicit formula for the derivative.
Theorem 2.7 Let λ(δ) = λ1(Ωδ) be the principal eigenvalue in Ωδ, and assume
Ωδ = Φ(δ, Ω), where Φ is of the form (18) with Ψ ∈ C¹((−ε, ε) × Ω̄) for some
ε > 0 and Ψ(0, ·) = 0. Then λ(δ) is differentiable with respect to δ at δ = 0, and
\[
\lambda'(0) = -(1-\lambda_1(\Omega))\int_{\partial\Omega} u_0^2(x)\,\Big\langle \frac{\partial\Psi}{\partial\delta}(0,x),\,\nu(x)\Big\rangle\,dS(x), \tag{21}
\]
where u0 is the positive eigenfunction associated to λ1(Ω) normalized as
|u0|_{L²(Ω)} = 1 and ν(x) is the outward unit normal to ∂Ω.
Note that the eigenfunction u0 is strictly positive on ∂Ω, see [14]. Thus, the
integral in (21) is not necessarily zero.
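Proof of Theorem 2.7 We use the variational characterization (13) to estimate the
incremental quotients of λ1(Ωδ). For simplicity, let us write μ(δ) = 1 − λ1(Ωδ). If
we denote
\[
H_\delta(u) = \frac{\displaystyle\int_{\Omega_\delta}\int_{\Omega_\delta} J(x-y)u(x)u(y)\,dx\,dy}{\displaystyle\int_{\Omega_\delta} u^2(x)\,dx},
\]
we have, thanks to (13), that
\[
\frac{\mu(\delta)-\mu(0)}{\delta} \ge \frac{H_\delta(u_0)-\mu(0)}{\delta} \tag{22}
\]
for δ > 0 (recall that u0 = 0 outside Ω). Now, we perform the change of variables
x = z + Ψ(δ, z), y = w + Ψ(δ, w) in the integrals in Hδ and we obtain
\[
H_\delta(u_0) = \frac{\displaystyle\int_{\Omega_\delta}\int_{\Omega_\delta} J(x-y)u_0(x)u_0(y)\,dx\,dy}{\displaystyle\int_{\Omega_\delta} u_0^2(x)\,dx}
= \frac{\displaystyle\int_\Omega\int_\Omega A(z,w)\,\Theta(z)\Theta(w)\,dz\,dw}{\displaystyle\int_\Omega u_0^2(z+\Psi(\delta,z))\,\Theta(z)\,dz}, \tag{23}
\]
where
\[
A(z,w) = J\big(z-w+\Psi(\delta,z)-\Psi(\delta,w)\big)\,u_0(z+\Psi(\delta,z))\,u_0(w+\Psi(\delta,w))
\]
and Θ(z) = det(I + DΨ(δ, z)), where D stands for differentiation with respect to the
second variable. By our regularity assumptions we have that
\[
A(z,w)\,\Theta(z)\Theta(w) = J(z-w)u_0(z)u_0(w) + K(z,w)\,\delta + o(\delta), \tag{24}
\]
where
\[
\begin{aligned}
K(z,w) ={}& \big\langle \nabla J(z-w),\, \Psi'(0,z)-\Psi'(0,w)\big\rangle\, u_0(z)u_0(w) \\
&+ J(z-w)\,u_0(w)\,\big\langle \nabla u_0(z),\, \Psi'(0,z)\big\rangle \\
&+ J(z-w)\,u_0(z)\,\big\langle \nabla u_0(w),\, \Psi'(0,w)\big\rangle \\
&+ J(z-w)\,u_0(z)u_0(w)\,\operatorname{div}(\Psi'(0,z)) \\
&+ J(z-w)\,u_0(z)u_0(w)\,\operatorname{div}(\Psi'(0,w)),
\end{aligned}
\]
and ′ stands for differentiation with respect to δ. Integrating (24) with respect to z
and w in Ω, we get
\[
\begin{aligned}
\int_\Omega\int_\Omega A(z,w)\,\Theta(z)\Theta(w)\,dz\,dw
&= \int_\Omega\int_\Omega J(z-w)u_0(z)u_0(w)\,dz\,dw + \delta\int_\Omega\int_\Omega K(z,w)\,dz\,dw + o(\delta) \\
&= \mu(0) + \delta\int_\Omega\int_\Omega K(z,w)\,dz\,dw + o(\delta).
\end{aligned} \tag{25}
\]
Taking into account that J is even (and hence ∇J is odd) and using Fubini's
theorem we have that
\[
\begin{aligned}
\int_\Omega\int_\Omega K(z,w)\,dz\,dw
={}& 2\int_\Omega\int_\Omega \big\langle \nabla J(z-w),\, \Psi'(0,z)\big\rangle\, u_0(z)u_0(w)\,dz\,dw \\
&+ 2\int_\Omega\int_\Omega J(z-w)\,u_0(w)\,\big\langle \nabla u_0(z),\, \Psi'(0,z)\big\rangle\,dz\,dw \\
&+ 2\int_\Omega\int_\Omega J(z-w)\,u_0(z)u_0(w)\,\operatorname{div}(\Psi'(0,z))\,dz\,dw.
\end{aligned}
\]
Integrating by parts in the last integral, we arrive at
\[
\int_\Omega\int_\Omega K(z,w)\,dz\,dw = 2\int_\Omega\int_{\partial\Omega} J(z-w)\,u_0(z)u_0(w)\,\big\langle \Psi'(0,z),\,\nu(z)\big\rangle\,dS(z)\,dw.
\]
Noticing that u0 is an eigenfunction, this expression can be further transformed into:
\[
\int_\Omega\int_\Omega K(z,w)\,dz\,dw = 2\mu(0)\int_{\partial\Omega} u_0^2(z)\,\big\langle \Psi'(0,z),\,\nu(z)\big\rangle\,dS(z).
\]
Hence, from (25) we obtain
\[
\int_\Omega\int_\Omega A(z,w)\,\Theta(z)\Theta(w)\,dz\,dw
= \mu(0) + 2\mu(0)\,\delta\int_{\partial\Omega} u_0^2(z)\,\big\langle \Psi'(0,z),\,\nu(z)\big\rangle\,dS(z) + o(\delta). \tag{26}
\]
On the other hand, with a similar procedure, we obtain:
\[
\int_\Omega u_0^2(z+\Psi(\delta,z))\,\Theta(z)\,dz = 1 + \delta\int_{\partial\Omega} u_0^2(z)\,\big\langle \Psi'(0,z),\,\nu(z)\big\rangle\,dS(z) + o(\delta). \tag{27}
\]
Taking into account (26) and (27), we obtain from (23):
\[
H_\delta(u_0) = \mu(0) + \mu(0)\,\delta\int_{\partial\Omega} u_0^2(z)\,\big\langle \Psi'(0,z),\,\nu(z)\big\rangle\,dS(z) + o(\delta).
\]
Hence (22) gives:
\[
\frac{\mu(\delta)-\mu(0)}{\delta} \ge \mu(0)\int_{\partial\Omega} u_0^2(z)\,\big\langle \Psi'(0,z),\,\nu(z)\big\rangle\,dS(z) + o(1),
\]
and thus
\[
\liminf_{\delta\to 0+}\frac{\mu(\delta)-\mu(0)}{\delta} \ge \mu(0)\int_{\partial\Omega} u_0^2(z)\,\big\langle \Psi'(0,z),\,\nu(z)\big\rangle\,dS(z).
\]
The remaining limits, lim sup as δ → 0+, lim inf as δ → 0− and lim sup as δ → 0−,
of the incremental quotients (μ(δ) − μ(0))/δ can be handled with similar calculations
(we only remark that for the upper estimate the continuity of uδ is needed), and
therefore we finally conclude that
\[
\lim_{\delta\to 0}\frac{\mu(\delta)-\mu(0)}{\delta} = \mu(0)\int_{\partial\Omega} u_0^2(z)\,\big\langle \Psi'(0,z),\,\nu(z)\big\rangle\,dS(z).
\]
Since λ(δ) = 1 − μ(δ), this proves (21), and concludes the proof of the theorem.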

An important example of perturbation of a domain is provided when Ω is
enlarged in the direction of the unit normal by an amount δ. To make this precise,
assume ∂Ω splits into m connected components, and select k of these components
Γ1, . . . , Γk. Set
\[
\Omega_\delta = \Omega \cup \bigcup_{i=1}^{k} \{x\in\mathbb{R}^N : \operatorname{dist}(x,\Gamma_i) < \delta\}. \tag{28}
\]
We have Ωδ = Φ(δ, Ω), where Φ(δ, x) = x + δΨ̃(x). Moreover, the derivative with
respect to δ, Ψ̃ = ∂Φ/∂δ (0, ·), verifies Ψ̃ = ν on the components Γi while Ψ̃ = 0 on
the remaining components of the boundary. Hence, we obtain that λ1(Ωδ) decreases
linearly in δ, to first order, near δ = 0.
Corollary 2.8 Let Ω be a bounded C¹ domain of R^N, and assume Ωδ is the
perturbation of Ω given by (28). Then λ(δ) = λ1(Ωδ) is differentiable with respect
to δ at δ = 0, and
\[
\lambda'(0) = -(1-\lambda_1(\Omega))\sum_{i=1}^{k}\int_{\Gamma_i} u_0^2(x)\,dS(x) < 0,
\]
where u0 is the positive eigenfunction of λ1(Ω) normalized with |u0|_{L²(Ω)} = 1.
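In one space dimension the boundary integral in the derivative formula reduces to point values, which makes the formula easy to test. The sketch below is an illustration under assumptions not made in the text: Ω = (−1, 1) enlarged to (−1 − δ, 1 + δ), the kernel J(z) = max(1 − |z|, 0), and a midpoint discretization; the helper names `first_eigenpair` and `u0_at` are hypothetical. The boundary values of u0 are obtained from the eigenvalue equation, and the predicted slope −(1 − λ1)(u0(−1)² + u0(1)²) is compared with a centered difference quotient.

```python
import numpy as np

def kernel(z):
    # Illustrative kernel (an assumption): J(z) = max(1 - |z|, 0).
    return np.maximum(1.0 - np.abs(z), 0.0)

def first_eigenpair(half_width, h):
    # Discrete first eigenvalue and positive eigenfunction on (-half_width, half_width).
    n = int(round(2.0 * half_width / h))
    x = -half_width + h * (np.arange(n) + 0.5)
    K = kernel(x[:, None] - x[None, :]) * h
    mu, V = np.linalg.eigh(K)
    u = np.abs(V[:, -1]) / np.sqrt(h)        # positive, normalized so that h * sum(u^2) = 1
    return 1.0 - mu[-1], x, u

h = 0.005
lam0, x, u0 = first_eigenpair(1.0, h)
mu0 = 1.0 - lam0

def u0_at(point):
    # Value of u0 given by the eigenvalue equation (1 - lambda1) u0 = J * u0.
    return h * np.sum(kernel(point - x) * u0) / mu0

# Slope predicted by the derivative formula when both endpoints move outward.
predicted = -mu0 * (u0_at(-1.0) ** 2 + u0_at(1.0) ** 2)

delta = 4 * h
lam_plus, _, _ = first_eigenpair(1.0 + delta, h)
lam_minus, _, _ = first_eigenpair(1.0 - delta, h)
finite_diff = (lam_plus - lam_minus) / (2.0 * delta)
print(predicted, finite_diff)
```

The two numbers should agree up to discretization error, and both are negative, in line with the monotonicity of the eigenvalue with respect to the domain.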
Having established the smoothness and monotonicity of λ1(Ω), we proceed with
the analysis of its asymptotic behavior both for small and large domains Ω. In this
context Ωn → R^N means that the sequence of sets Ωn contains balls B_{Rn} (centered
at a fixed point) with radii Rn → +∞.
Theorem 2.9 ([28]) For the principal eigenvalue λ1(Ω) we have λ1(Ω) → 1 when
|Ω| → 0 and λ1(Ωn) → 0 when Ωn → R^N.
Proof of Theorem 2.9 We make use of Corollary 2.4. First, notice that if |Ω| → 0,
the integral in the second inequality in (15) goes to zero, and thus λ1(Ω) → 1.
To prove that λ1(Ωn) → 0 when Ωn → R^N, we first show that λ1(B_R) → 0
when R → ∞. According to (15) we have
\[
\lambda_1(B_R) \le 1 - \left(\frac{1}{|B_R|}\int_{B_R}\Big(\int_{B_R} J(x-y)\,dy\Big)^2 dx\right)^{1/2},
\]
hence we need to prove
\[
\frac{1}{|B_R|}\int_{B_R}\Big(\int_{B_R} J(x-y)\,dy\Big)^2 dx \to 1 \tag{29}
\]
as R → ∞. We set y = x − z in the inner integral, and then x = Rw, and arrive at
\[
\frac{1}{|B_R|}\int_{B_R}\Big(\int_{B_R} J(x-y)\,dy\Big)^2 dx
= \frac{1}{|B_1|}\int_{B_1}\Big(\int_{|z-Rw|<R} J(z)\,dz\Big)^2 dw.
\]
Now observe that for fixed w with |w| < 1 it holds
\[
\int_{|z-Rw|<R} J(z)\,dz \to \int_{\mathbb{R}^N} J(z)\,dz = 1
\]
as R → ∞, and (29) follows thanks to the dominated convergence theorem.
Finally, let us show that λ1(Ωn) → 0 as Ωn → R^N. We can assume 0 ∈ Ωn
and that there exist balls B_{Rn} such that B_{Rn} ⊂ Ωn with Rn → ∞, and hence
λ1(Ωn) ≤ λ1(B_{Rn}). It follows that
\[
\limsup_{n\to\infty}\lambda_1(\Omega_n) \le \lim_{n\to\infty}\lambda_1(B_{R_n}) = 0,
\]
which concludes the proof.

To make more precise the information given by Theorem 2.9, we fix a C¹
bounded domain Ω and consider dilatations of it, Ωγ = γΩ, where γ > 0 is
the dilatation parameter. As a consequence of the previous theorems, we have that
λ1(Ωγ) is a decreasing function of γ and λ1(Ωγ) → 1 when γ → 0, λ1(Ωγ) → 0
as γ → +∞. Our last theorem describes precisely the asymptotic behavior of
λ1(Ωγ) both when γ → 0 and when γ → ∞.
Theorem 2.10 Let Ω be a smooth bounded domain of R^N, and for γ > 0 denote
Ωγ = γΩ. Then
\[
\lambda_1(\Omega_\gamma) \sim 1 - J(0)\,|\Omega|\,\gamma^N \quad\text{as } \gamma\to 0+. \tag{30}
\]
If in addition J is radially symmetric and radially decreasing, then
\[
\lambda_1(\Omega_\gamma) \sim A(J)\,\sigma_1(\Omega)\,\gamma^{-2} \quad\text{as } \gamma\to+\infty, \tag{31}
\]
where σ1(Ω) is the principal eigenvalue of the Laplacian in Ω with Dirichlet
boundary conditions,
\[
\begin{cases}
-\Delta v(x) = \sigma_1(\Omega)\,v(x), & x\in\Omega,\\
v(x) = 0, & x\in\partial\Omega,
\end{cases} \tag{32}
\]
and the constant A(J) is given by
\[
A(J) = \frac{1}{2N}\int_{\mathbb{R}^N} J(z)\,|z|^2\,dz.
\]
Roughly speaking, when conveniently scaled to a large domain, our nonlocal
problem resembles a local one. Indeed, for the first eigenvalue of the Laplacian
it is well known that σ1(Ωγ) = σ1(Ω)γ^{−2}, therefore the asymptotic behavior as
γ → ∞ for both problems coincides (up to the factor A(J), which depends on J).
Notice that the vanishing rates of 1 − λ1(Ωγ) at γ = 0 and of λ1(Ωγ) at γ =
+∞ are different, which is in contrast with the already mentioned scaling invariance
of the Laplacian. This phenomenon is caused by the lack of homogeneity of the
convolution term J ∗ u. Hence, there is a strong difference between the behaviour of
the first eigenvalue for local diffusion and for nonlocal diffusion when the domain
is small (case γ ∼ 0) but there is no big difference for large domains (case γ ∼ ∞).
Proof of Theorem 2.10 First, we prove (30). Let uγ be an arbitrary positive
eigenfunction associated to λ1(Ωγ). Choose an arbitrary ε > 0. Now, for γ small
enough we have
\[
J(x-y) \le J(0) + \varepsilon
\]
if x, y ∈ Ωγ. Then
\[
\begin{aligned}
(1-\lambda_1(\Omega_\gamma))\int_{\Omega_\gamma} u_\gamma(x)\,dx
&= \int_{\Omega_\gamma}\int_{\Omega_\gamma} J(x-y)\,u_\gamma(y)\,dy\,dx \\
&\le (J(0)+\varepsilon)\int_{\Omega_\gamma}\int_{\Omega_\gamma} u_\gamma(y)\,dy\,dx
= (J(0)+\varepsilon)\,|\Omega|\,\gamma^N\int_{\Omega_\gamma} u_\gamma(y)\,dy.
\end{aligned}
\]
It follows that
\[
\limsup_{\gamma\to 0+}\frac{1-\lambda_1(\Omega_\gamma)}{\gamma^N} \le J(0)\,|\Omega|.
\]
The reverse inequality for the liminf can be proved in an analogous way. This
completes the proof of (30).
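Let us now prove (31), which is much more involved. The first step is to show
that λ1(Ωγ) ≤ Cγ^{−2} for a certain positive constant C. Indeed, we will show the more
precise estimate
\[
\limsup_{\gamma\to+\infty}\gamma^2\lambda_1(\Omega_\gamma) \le \sigma_1(\Omega)\,A(J). \tag{33}
\]
Let φ be the positive eigenfunction of the Laplacian in Ω, normalized by
∫_Ω φ²(x) dx = 1 and extended by zero outside Ω. Taking as a test function
φγ(x) = φ(x/γ) in the variational characterization (14), we obtain
\[
\lambda_1(\Omega_\gamma) \le
\frac{\displaystyle\frac12\int_{\mathbb{R}^N}\int_{\mathbb{R}^N} J(x-y)\Big(\varphi\Big(\frac{x}{\gamma}\Big)-\varphi\Big(\frac{y}{\gamma}\Big)\Big)^2 dy\,dx}
{\displaystyle\int_{\Omega_\gamma}\varphi^2\Big(\frac{x}{\gamma}\Big)\,dx}.
\]
Setting x = y + z and y = γw in the integrals of the numerator, and x = γθ in the
integral of the denominator, we obtain
\[
\begin{aligned}
\lambda_1(\Omega_\gamma)
&\le \frac12\int_{\mathbb{R}^N}\int_{\mathbb{R}^N} J(z)\Big(\varphi\Big(w+\frac{z}{\gamma}\Big)-\varphi(w)\Big)^2 dw\,dz \\
&= \frac12\int_{B_1}\int_{\mathbb{R}^N} J(z)\Big(\varphi\Big(w+\frac{z}{\gamma}\Big)-\varphi(w)\Big)^2 dw\,dz.
\end{aligned}
\]
Taking into account that the function φ belongs to W^{1,∞}(R^N), we have
\[
\varphi\Big(w+\frac{z}{\gamma}\Big)-\varphi(w) = \frac{1}{\gamma}\int_0^1\Big\langle \nabla\varphi\Big(w+s\frac{z}{\gamma}\Big),\,z\Big\rangle\,ds
\]
for every w ∈ R^N, z ∈ B1. Hence,
\[
\gamma^2\lambda_1(\Omega_\gamma) \le \frac12\int_{B_1}\int_{\mathbb{R}^N} J(z)\Big(\int_0^1\Big\langle \nabla\varphi\Big(w+s\frac{z}{\gamma}\Big),\,z\Big\rangle\,ds\Big)^2 dw\,dz. \tag{34}
\]
Thanks to the dominated convergence theorem, we can pass to the limit in (34) as
γ → +∞ to obtain
\[
\limsup_{\gamma\to+\infty}\gamma^2\lambda_1(\Omega_\gamma) \le \frac12\int_{B_1}\int_{\mathbb{R}^N} J(z)\,\langle \nabla\varphi(w),\,z\rangle^2\,dw\,dz. \tag{35}
\]
In the last integral, we apply Fubini's theorem to obtain
\[
\begin{aligned}
\int_{B_1}\int_{\mathbb{R}^N} J(z)\,\langle \nabla\varphi(w),\,z\rangle^2\,dw\,dz
&= \int_{\mathbb{R}^N}\int_{B_1} J(z)\,\langle \nabla\varphi(w),\,z\rangle^2\,dz\,dw \\
&= \sum_{i,j=1}^{N}\int_{\mathbb{R}^N}\frac{\partial\varphi}{\partial x_i}(w)\,\frac{\partial\varphi}{\partial x_j}(w)\int_{B_1} J(z)\,z_i z_j\,dz\,dw.
\end{aligned}
\]
We notice that the integrals ∫_{B1} J(z) z_i z_j dz vanish by symmetry when i ≠ j, while
they are all equal to 2A(J) when i = j. Thus (35) implies (33).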
Now let ϕγ be a positive eigenfunction associated to λ1(Ωγ), and set ψγ(x) =
ϕγ(γx), x ∈ Ω. We normalize ψγ by ∫_Ω ψγ²(x) dx = 1. According to the variational
characterization (14), we have
\[
2\lambda_1(\Omega_\gamma) = \int_{\widetilde\Omega}\int_{\widetilde\Omega} J_\gamma(x-y)\,(\psi_\gamma(x)-\psi_\gamma(y))^2\,dx\,dy,
\]
where Jγ(x) = γ^N J(γx), and Ω̃ is a smooth bounded domain such that Ω ⊂⊂ Ω̃.
Now let γn → +∞ be an arbitrary sequence. By passing to a subsequence,
we may assume ψn := ψγn converges weakly in L²(Ω̃) to a function ψ. Since
J is radially decreasing and λ1(Ωγn) ≤ Cγn^{−2}, thanks to (33), we may apply
Proposition 3.2 of [4], which implies that ψn → ψ strongly in L²(Ω̃) with
ψ ∈ H¹(Ω̃). Since ψ = 0 in Ω̃ \ Ω, we obtain
\[
\psi\in H_0^1(\Omega) \qquad\text{and}\qquad \int_\Omega \psi^2(x)\,dx = 1. \tag{36}
\]
We claim that ψ is the principal eigenfunction of a multiple of the Laplacian in Ω
with Dirichlet boundary conditions, and this implies
\[
\lim_{n\to\infty}\gamma_n^2\,\lambda_1(\Omega_{\gamma_n}) = A(J)\,\sigma_1(\Omega).
\]
Indeed, thanks to (33), we may assume that γn² λ1(Ωγn) → λ0 ≥ 0. We notice that
ψn satisfies
\[
J_{\gamma_n} * \psi_n - \psi_n = -\lambda_1(\Omega_{\gamma_n})\,\psi_n. \tag{37}
\]
Choose an arbitrary function v ∈ C0^∞(Ω). Multiply (37) by v and integrate in Ω to
obtain
\[
\gamma_n^N\int_{\mathbb{R}^N}\int_{\mathbb{R}^N} J(\gamma_n(x-y))\,\psi_n(y)\,v(x)\,dy\,dx - \int_{\mathbb{R}^N}\psi_n(x)v(x)\,dx
= -\lambda_1(\Omega_{\gamma_n})\int_{\mathbb{R}^N}\psi_n(x)v(x)\,dx. \tag{38}
\]
Note that all the integrals in what follows may be considered in R^N, since v and
ψn vanish outside Ω. Thanks to Fubini's theorem, the integrals in the left-hand side
of (38) can be rewritten to obtain
\[
\gamma_n^N\int_{\mathbb{R}^N}\int_{\mathbb{R}^N} J(\gamma_n(x-y))\,(v(y)-v(x))\,\psi_n(x)\,dx\,dy
= -\lambda_1(\Omega_{\gamma_n})\int_{\mathbb{R}^N}\psi_n(x)v(x)\,dx, \tag{39}
\]
since J has unit integral. Letting z = −γn(x − y) in the first integral of (39), we get
\[
\int_{\mathbb{R}^N}\int_{\mathbb{R}^N} J(z)\Big(v\Big(x+\frac{z}{\gamma_n}\Big)-v(x)\Big)\psi_n(x)\,dx\,dz
= -\lambda_1(\Omega_{\gamma_n})\int_{\mathbb{R}^N}\psi_n(x)v(x)\,dx. \tag{40}
\]
We now use Taylor expansion up to the second order in v:
\[
v\Big(x+\frac{z}{\gamma_n}\Big)-v(x)
= \frac{1}{\gamma_n}\sum_{i=1}^{N}\frac{\partial v}{\partial x_i}(x)\,z_i
+ \frac{1}{\gamma_n^2}\sum_{i,j=1}^{N}\int_0^1 (1-s)\,\frac{\partial^2 v}{\partial x_i\partial x_j}\Big(x+\frac{sz}{\gamma_n}\Big)z_i z_j\,ds,
\]
which, when plugged into (40), gives
\[
\int_{\mathbb{R}^N}\int_{\mathbb{R}^N} J(z)\Big(\gamma_n\sum_{i=1}^{N}\frac{\partial v}{\partial x_i}(x)\,z_i
+ \sum_{i,j=1}^{N}\int_0^1 (1-s)\,\frac{\partial^2 v}{\partial x_i\partial x_j}\Big(x+\frac{sz}{\gamma_n}\Big)z_i z_j\,ds\Big)\psi_n(x)\,dx\,dz
= -\gamma_n^2\lambda_1(\Omega_{\gamma_n})\int_{\mathbb{R}^N}\psi_n(x)v(x)\,dx.
\]
Next we analyze the integrals involving the first derivatives of v. Notice that
\[
\int_{\mathbb{R}^N}\int_{\mathbb{R}^N} J(z)\,\frac{\partial v}{\partial x_i}(x)\,z_i\,\psi_n(x)\,dx\,dz
= \int_{\mathbb{R}^N}\frac{\partial v}{\partial x_i}(x)\,\psi_n(x)\int_{\mathbb{R}^N} J(z)\,z_i\,dz\,dx = 0
\]
by the symmetry of J. Hence,
\[
\int_{\mathbb{R}^N}\int_{\mathbb{R}^N} J(z)\Big(\sum_{i,j=1}^{N}\int_0^1 (1-s)\,\frac{\partial^2 v}{\partial x_i\partial x_j}\Big(x+\frac{sz}{\gamma_n}\Big)z_i z_j\,ds\Big)\psi_n(x)\,dx\,dz
= -\gamma_n^2\lambda_1(\Omega_{\gamma_n})\int_{\mathbb{R}^N}\psi_n(x)v(x)\,dx. \tag{41}
\]
Now we pass to the limit as n → ∞ in (41). Notice that
\[
\frac{\partial^2 v}{\partial x_i\partial x_j}\Big(x+\frac{sz}{\gamma_n}\Big) \to \frac{\partial^2 v}{\partial x_i\partial x_j}(x)
\]
uniformly for x ∈ Ω, z ∈ B1, and hence the first term in (41) converges to
\[
\frac12\int_{\mathbb{R}^N}\sum_{i,j=1}^{N}\frac{\partial^2 v}{\partial x_i\partial x_j}(x)\,\psi(x)\int_{\mathbb{R}^N} J(z)\,z_i z_j\,dz\,dx
= A(J)\int_{\mathbb{R}^N}\Delta v(x)\,\psi(x)\,dx.
\]
Thus
\[
A(J)\int_{\mathbb{R}^N}\Delta v(x)\,\psi(x)\,dx = -\lambda_0\int_{\mathbb{R}^N}\psi(x)v(x)\,dx. \tag{42}
\]
According to (36), we may integrate by parts in the integral of the left-hand side
in (42) to obtain
\[
A(J)\int_{\mathbb{R}^N}\nabla v(x)\cdot\nabla\psi(x)\,dx = \lambda_0\int_{\mathbb{R}^N}\psi(x)v(x)\,dx.
\]
Since v ∈ C0^∞(Ω) is arbitrary, and ψ ∈ H0¹(Ω) with ψ ≢ 0, we have that ψ is a
positive eigenfunction associated to −Δ in Ω. Thus λ0 = A(J)σ1(Ω), and since the
sequence γn was arbitrary, the theorem is proved.
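Both asymptotic regimes can be observed numerically in a simple setting. The sketch below is illustrative only and rests on assumptions not made in the text: N = 1, Ω = (−1, 1) (so |Ω| = 2 and σ1(Ω) = (π/2)²), and the kernel J(z) = max(1 − |z|, 0), for which J(0) = 1 and A(J) = (1/2)∫J(z)z² dz = 1/12. The helper `lambda1` is a hypothetical name for a midpoint-rule approximation of the first eigenvalue on γΩ.

```python
import numpy as np

def kernel(z):
    # Illustrative kernel (an assumption): J(z) = max(1 - |z|, 0).
    return np.maximum(1.0 - np.abs(z), 0.0)

def lambda1(gamma, n):
    # Discrete first eigenvalue on Omega_gamma = (-gamma, gamma), n midpoint cells.
    h = 2.0 * gamma / n
    x = -gamma + h * (np.arange(n) + 0.5)
    K = kernel(x[:, None] - x[None, :]) * h
    return 1.0 - np.linalg.eigvalsh(K)[-1]

# Data for Omega = (-1, 1), N = 1, and the triangular kernel above.
J0, measure = 1.0, 2.0
A_J = 1.0 / 12.0                      # (1 / (2N)) * int J(z) |z|^2 dz
sigma1 = (np.pi / 2.0) ** 2           # first Dirichlet eigenvalue of -d^2/dx^2 on (-1, 1)

# Small domains: (1 - lambda1) / (J(0) |Omega| gamma^N) should approach 1.
small = [(g, (1.0 - lambda1(g, 200)) / (J0 * measure * g)) for g in (0.2, 0.1, 0.05)]

# Large domains: lambda1 * gamma^2 / (A(J) sigma1) should approach 1 (mesh size 0.05).
large = [(g, lambda1(g, int(40 * g)) * g**2 / (A_J * sigma1)) for g in (5.0, 10.0, 20.0)]

print(small)
print(large)
```

The convergence for large γ is slower than for small γ, so the second list is only expected to be close to 1, not equal to it, at the values of γ used here.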

2.2 Neumann Boundary Conditions

For the Neumann problem (3) the first eigenvalue is zero (with an eigenfunction
given by a constant). Hence we look for the first nontrivial eigenvalue, given by
\[
\beta_1(J,\Omega) = \inf_{u\in L^2(\Omega),\ \int_\Omega u = 0}
\frac{\displaystyle\frac12\int_\Omega\int_\Omega J(x-y)\,(u(y)-u(x))^2\,dy\,dx}{\displaystyle\int_\Omega (u(x))^2\,dx}. \tag{43}
\]
In this case the associated equation reads as
\[
\int_\Omega J(x-y)\,(\phi(x)-\phi(y))\,dy = \beta_1\,\phi(x), \qquad x\in\Omega. \tag{44}
\]
Notice that a minimizer of (43) is a solution to (44).

Our first result shows that β1(J, Ω) is indeed nontrivial.
Proposition 2.11 The quantity β1(J, Ω) defined by (43) is strictly positive.
Proof It is clear that β1 ≥ 0. Let us prove that β1 is in fact strictly positive. To this
end, consider the subspace H of L²(Ω) given by the orthogonal to the constants,
and the symmetric (self-adjoint) operator T : H → H given by
\[
T(u)(x) = \int_\Omega J(x-y)\,(u(x)-u(y))\,dy = -\int_\Omega J(x-y)\,u(y)\,dy + A(x)\,u(x),
\]
where A(x) = ∫_Ω J(x − y) dy. Note that T is the sum of an invertible operator and
a compact operator. Since T is symmetric, its spectrum verifies σ(T) ⊂ [m, M],
where
\[
m = \inf_{u\in H,\ \|u\|_{L^2(\Omega)}=1}\langle Tu,u\rangle \qquad\text{and}\qquad M = \sup_{u\in H,\ \|u\|_{L^2(\Omega)}=1}\langle Tu,u\rangle.
\]
Remark that
\[
m = \inf_{u\in H,\ \|u\|_{L^2(\Omega)}=1}\int_\Omega\int_\Omega J(x-y)\,(u(x)-u(y))\,dy\,u(x)\,dx = \beta_1.
\]
Then m ≥ 0. Let us show now that
\[
m > 0.
\]
If not, since m ∈ σ(T), T : H → H is not invertible. Using Fredholm's alternative,
this implies that there exists a nontrivial u ∈ H such that T(u) = 0, but then a
simple computation shows that u must be constant in Ω, which is a contradiction.

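To study the asymptotic behaviour of the solutions, an upper estimate on β1 is
needed. Here and in what follows, χ_D denotes the characteristic function of the set
D.
Lemma 2.12 ([2]) Let β1 be given by (43). Then
\[
\beta_1 \le \min_{x\in\overline\Omega}\int_\Omega J(x-y)\,dy. \tag{45}
\]
Proof Let
\[
A(x) = \int_\Omega J(x-y)\,dy.
\]
Since Ω̄ is compact and A is continuous there exists a point x0 ∈ Ω̄ such that
\[
A(x_0) = \min_{x\in\overline\Omega} A(x).
\]
For every small ε let us choose two disjoint balls of radius ε contained in Ω,
B(x_{1,ε}, ε) and B(x_{2,ε}, ε), in such a way that x_{i,ε} → x0 as ε → 0. By using
\[
u_\varepsilon(x) = \chi_{B(x_{1,\varepsilon},\varepsilon)}(x) - \chi_{B(x_{2,\varepsilon},\varepsilon)}(x)
\]
as a test function in the definition of β1 for ε small, it holds
\[
\begin{aligned}
\beta_1 &\le \frac{\displaystyle\frac12\int_\Omega\int_\Omega J(x-y)\,(u_\varepsilon(y)-u_\varepsilon(x))^2\,dy\,dx}{\displaystyle\int_\Omega (u_\varepsilon(x))^2\,dx}
= \frac{\displaystyle\int_\Omega A(x)\,u_\varepsilon^2(x)\,dx - \int_\Omega\int_\Omega J(x-y)\,u_\varepsilon(y)\,u_\varepsilon(x)\,dy\,dx}{\displaystyle\int_\Omega (u_\varepsilon(x))^2\,dx} \\
&= \frac{\displaystyle\int_\Omega A(x)\,u_\varepsilon^2(x)\,dx - \int_\Omega\int_\Omega J(x-y)\,u_\varepsilon(y)\,u_\varepsilon(x)\,dy\,dx}{2\,|B(0,\varepsilon)|}.
\end{aligned}
\]
Using the continuity of A and the explicit form of uε we obtain
\[
\lim_{\varepsilon\to 0}\frac{\displaystyle\int_\Omega A(x)\,u_\varepsilon^2(x)\,dx}{2\,|B(0,\varepsilon)|} = A(x_0)
\]
and
\[
\lim_{\varepsilon\to 0}\frac{\displaystyle\int_\Omega\int_\Omega J(x-y)\,u_\varepsilon(y)\,u_\varepsilon(x)\,dy\,dx}{2\,|B(0,\varepsilon)|} = 0.
\]
Therefore, (45) follows.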
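The two facts established so far for the Neumann problem, positivity of β1 and the upper bound by the minimum of A, can be observed on a discretization. The following sketch is an illustration under assumptions not in the text: N = 1, Ω = (−1, 1), the kernel J(z) = max(1 − |z|, 0), and a midpoint grid. The matrix T is the discrete counterpart of the operator T from the proof of Proposition 2.11; its smallest eigenvalue is zero (constants) and the next one approximates β1.

```python
import numpy as np

def kernel(z):
    # Illustrative kernel (an assumption): J(z) = max(1 - |z|, 0).
    return np.maximum(1.0 - np.abs(z), 0.0)

n = 200
h = 2.0 / n
x = -1.0 + h * (np.arange(n) + 0.5)          # midpoint grid on Omega = (-1, 1)
K = kernel(x[:, None] - x[None, :]) * h

A = K.sum(axis=1)                            # A(x) = int_Omega J(x - y) dy
T = np.diag(A) - K                           # T u(x) = int_Omega J(x - y)(u(x) - u(y)) dy

eigs = np.linalg.eigvalsh(T)                 # ascending order
zero_mode = eigs[0]                          # eigenvalue of the constants, numerically 0
beta1 = eigs[1]                              # first nontrivial eigenvalue
print(zero_mode, beta1, A.min())
```

For this symmetric configuration the minimum of A is attained at the two endpoints of the interval, where only half of the kernel mass lies inside Ω.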

In the Neumann case we find an asymptotic behaviour analogous to the one that
holds for the heat equation. The solution u(x, t) of (3) converges exponentially to the
mean value of the initial datum, and the decay is determined by the eigenvalue β1.
First we show that the solution u of (3) preserves the total mass.
Proposition 2.13 For every u0 ∈ L¹(Ω) the unique solution u of (3) preserves the
total mass in Ω, that is,
\[
\int_\Omega u(y,t)\,dy = \int_\Omega u_0(y)\,dy.
\]
Proof Since
\[
u(x,t) - u_0(x) = \int_0^t\int_\Omega J(x-y)\,(u(y,s)-u(x,s))\,dy\,ds,
\]
integrating in x and applying Fubini's theorem, it follows that
\[
\int_\Omega u(x,t)\,dx - \int_\Omega u_0(x)\,dx = 0.
\]
The stationary problem corresponding to (3) is described by the equation
\[
0 = \int_\Omega J(x-y)\,(\phi(y)-\phi(x))\,dy. \tag{46}
\]
The only solutions of this equation are the constants.

Proposition 2.14 Every stationary solution of (3) is constant in Ω, and, since the
total mass is preserved, the unique stationary solution with the same mass as u0 is
\[
\phi = \frac{1}{|\Omega|}\int_\Omega u_0(x)\,dx.
\]
Proof Observe that (46) implies that ϕ is a continuous function. Set
\[
K = \max_{x\in\overline\Omega}\phi(x)
\]
and consider the set A = {x ∈ Ω̄ : ϕ(x) = K}. The set A is clearly closed and
non-empty. We claim that it is also open in Ω̄. Let x0 ∈ A; then
\[
0 = \int_\Omega J(x_0-y)\,(\phi(y)-\phi(x_0))\,dy,
\]
therefore, since ϕ(y) ≤ ϕ(x0), this implies ϕ(y) = ϕ(x0) for all y ∈ Ω ∩ B(x0, d),
for any B(0, d) ⊂ supp(J). Hence A is open as claimed. Consequently, as Ω is
connected, A = Ω̄ and ϕ is constant.

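Theorem 2.15 ([2]) For every u0 ∈ L²(Ω) the solution u(x, t) of (3) satisfies
\[
\Big\| u(\cdot,t) - \frac{1}{|\Omega|}\int_\Omega u_0 \Big\|_{L^2(\Omega)}
\le e^{-\beta_1 t}\,\Big\| u_0 - \frac{1}{|\Omega|}\int_\Omega u_0 \Big\|_{L^2(\Omega)}, \tag{47}
\]
where β1 is given by (43). Moreover, if u0 is continuous and bounded, there exists a
positive constant C > 0 such that
\[
\Big\| u(\cdot,t) - \frac{1}{|\Omega|}\int_\Omega u_0 \Big\|_{L^\infty(\Omega)} \le C\,e^{-\beta_1 t}. \tag{48}
\]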

Proof Let
\[
H(t) = \frac12\int_\Omega\Big(u(x,t) - \frac{1}{|\Omega|}\int_\Omega u_0\Big)^2 dx.
\]
Differentiating, using (43) and the conservation of the total mass, we obtain
\[
H'(t) = -\frac12\int_\Omega\int_\Omega J(x-y)\,(u(y,t)-u(x,t))^2\,dy\,dx
\le -\beta_1\int_\Omega\Big(u(x,t) - \frac{1}{|\Omega|}\int_\Omega u_0\Big)^2 dx.
\]
Hence
\[
H'(t) \le -2\beta_1 H(t).
\]
Therefore, integrating,
\[
H(t) \le e^{-2\beta_1 t}H(0),
\]
and (47) follows.

In order to prove (48) let w(x, t) denote the difference
\[
w(x,t) = u(x,t) - \frac{1}{|\Omega|}\int_\Omega u_0.
\]
We seek an exponential estimate in L^∞ of the decay of w(x, t). The linearity of
the equation implies that w(x, t) is a solution of (3) and satisfies
\[
w(x,t) = e^{-A(x)t}\,w_0(x) + e^{-A(x)t}\int_0^t e^{A(x)s}\int_\Omega J(x-y)\,w(y,s)\,dy\,ds,
\]
where A(x) = ∫_Ω J(x − y) dy. By using (47) and Hölder's inequality it follows that
\[
|w(x,t)| \le e^{-A(x)t}\,|w_0(x)| + C\,e^{-A(x)t}\int_0^t e^{A(x)s-\beta_1 s}\,ds.
\]
Therefore, w(x, t) decays to zero exponentially fast and, moreover, (48) holds
thanks to Lemma 2.12.
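Conservation of mass and the L² convergence to the mean value can both be checked on a discretization. The sketch below makes the same illustrative assumptions as before, none of which come from the text: N = 1, Ω = (−1, 1), the kernel J(z) = max(1 − |z|, 0), a midpoint grid, and an arbitrary initial datum. The semi-discrete Neumann problem u' = −Tu is solved exactly through the eigendecomposition of the symmetric matrix T.

```python
import numpy as np

def kernel(z):
    # Illustrative kernel (an assumption): J(z) = max(1 - |z|, 0).
    return np.maximum(1.0 - np.abs(z), 0.0)

n = 200
h = 2.0 / n
x = -1.0 + h * (np.arange(n) + 0.5)          # midpoint grid on Omega = (-1, 1)
K = kernel(x[:, None] - x[None, :]) * h
T = np.diag(K.sum(axis=1)) - K               # u' = -T u is the semi-discrete Neumann problem

lam, V = np.linalg.eigh(T)
beta1 = lam[1]                               # first nontrivial eigenvalue

u0 = np.where(x < 0.0, 1.0, 0.0) + 0.3 * np.cos(5.0 * x)   # arbitrary initial datum
mean0 = np.sum(u0) / n                       # (1 / |Omega|) * int_Omega u0
coeffs = V.T @ u0

def solve(t):
    # Exact solution of u' = -T u with u(0) = u0.
    return V @ (np.exp(-lam * t) * coeffs)

def l2_norm(u):
    return np.sqrt(h * np.sum(u**2))

times = [0.5, 1.0, 2.0, 5.0]
mass_drift = [abs(h * np.sum(solve(t)) - h * np.sum(u0)) for t in times]
decay_ratios = [l2_norm(solve(t) - mean0) / (np.exp(-beta1 * t) * l2_norm(u0 - mean0))
                for t in times]
print(mass_drift, decay_ratios)
```

The mass drift stays at rounding level and every decay ratio is at most 1, which is the discrete counterpart of the L² estimate with rate β1.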

An analysis of the dependence of the first eigenvalue β1 with respect to the
domain Ω is left open.

2.3 Optimal Constants in Lq

Here we briefly comment on the L^q case. We refer to [2] for details. The main
difficulty is to show that there exists an eigenfunction associated with the usual
minimization of the corresponding Rayleigh quotients. Below, we just show that the
associated optimal constants are positive.

2.3.1 The Dirichlet Case

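Proposition 2.16 ([2]) Given q ≥ 1 and a bounded domain Ω in R^N, there exists
λ = λ(J, Ω, q) > 0 such that
\[
\lambda\int_\Omega |u(x)|^q\,dx \le \int_{\mathbb{R}^N}\int_{\mathbb{R}^N} J(x-y)\,|u(y)-u(x)|^q\,dy\,dx \tag{49}
\]
for all u ∈ L^q(Ω).

Proof Let r, α > 0 be such that J(x) ≥ α in B(0, r). Let
\[
\begin{aligned}
B_0 &= \{x\in\Omega_J\setminus\Omega : d(x,\Omega)\le r/2\},\\
B_1 &= \{x\in\Omega : d(x,B_0)\le r/2\},\\
B_j &= \Big\{x\in\Omega\setminus\bigcup_{k=1}^{j-1}B_k : d(x,B_{j-1})\le r/2\Big\}, \qquad j = 2, 3, \ldots
\end{aligned}
\]
Observe that we can cover Ω by a finite number of non-null sets {B_j}, j = 1, . . . , l_r. Now
\[
\begin{aligned}
\int_{\mathbb{R}^N}\int_{\mathbb{R}^N} J(x-y)\,|u(y)-u(x)|^q\,dy\,dx
&\ge \int_{B_j}\int_{B_{j-1}} J(x-y)\,|u(y)-u(x)|^q\,dy\,dx \\
&\ge \alpha_j\int_{B_j}|u(x)|^q\,dx - \beta\int_{B_{j-1}}|u(y)|^q\,dy,
\end{aligned}
\]
where
\[
\alpha_j = \frac{1}{2^q}\min_{x\in\overline{B_j}}\int_{B_{j-1}} J(x-y)\,dy > 0
\]
(since J(x) ≥ α in B(0, r)) and
\[
\beta = \int_{\mathbb{R}^N} J(x)\,dx.
\]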

Therefore, since u(y) = 0 if y ∈ B0 , u(y) = u(y) if y ∈ Bj , j = 1, . . . , lr ,

This ends the proof.

2.3.2 The Neumann Case

Theorem 2.17 ([2]) Given q ≥ 1, J as above and a bounded domain Ω in R^N, the
quantity
\[
\beta_q(J,\Omega,q) = \inf_{u\in L^q(\Omega),\ \int_\Omega u = 0}
\frac{\displaystyle\frac12\int_\Omega\int_\Omega J(x-y)\,|u(y)-u(x)|^q\,dy\,dx}{\displaystyle\int_\Omega |u(x)|^q\,dx}
\]
is strictly positive. Consequently
\[
\beta_q\int_\Omega\Big|u - \frac{1}{|\Omega|}\int_\Omega u\Big|^q \le \frac12\int_\Omega\int_\Omega J(x-y)\,|u(y)-u(x)|^q\,dy\,dx \tag{50}
\]
for every u ∈ L^q(Ω).

Proof It is enough to prove that there exists a constant c such that
\[
\|u\|_q \le c\left(\Big(\int_\Omega\int_\Omega J(x-y)\,|u(y)-u(x)|^q\,dy\,dx\Big)^{1/q} + \Big|\int_\Omega u\Big|\right) \tag{51}
\]
for every u ∈ L^q(Ω).

Let r > 0 be such that J(z) ≥ α > 0 in B(0, r). Since Ω̄ ⊂ ∪_{x∈Ω} B(x, r/2), there
exists {x_i}, i = 1, . . . , m, in Ω such that Ω ⊂ ∪_{i=1}^m B(x_i, r/2). Let 0 < δ < r/2 be such that
B(x_i, δ) ⊂ Ω for all i = 1, . . . , m. Then, for any x̂_i ∈ B(x_i, δ), i = 1, . . . , m,
\[
\Omega = \bigcup_{i=1}^{m}\big(B(\hat x_i,r)\cap\Omega\big). \tag{52}
\]

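Let us argue by contradiction. Suppose that (51) is false. Then, there exists un ∈
L^q(Ω), with ‖un‖_{L^q(Ω)} = 1, satisfying
\[
1 \ge n\left(\Big(\int_\Omega\int_\Omega J(x-y)\,|u_n(y)-u_n(x)|^q\,dy\,dx\Big)^{1/q} + \Big|\int_\Omega u_n\Big|\right) \qquad \forall n\in\mathbb{N}.
\]
Consequently,
\[
\lim_n \int_\Omega\int_\Omega J(x-y)\,|u_n(y)-u_n(x)|^q\,dy\,dx = 0 \tag{53}
\]
and
\[
\lim_n \int_\Omega u_n = 0. \tag{54}
\]
Let
\[
F_n(x,y) = J(x-y)^{1/q}\,|u_n(y)-u_n(x)|
\]
and
\[
f_n(x) = \int_\Omega J(x-y)\,|u_n(y)-u_n(x)|^q\,dy.
\]
From (53), it follows that
\[
f_n \to 0 \quad\text{in } L^1(\Omega).
\]
Passing to a subsequence if necessary, we can assume that
\[
f_n(x) \to 0 \quad \forall x\in\Omega\setminus B_1, \quad B_1 \text{ null}. \tag{55}
\]
On the other hand, by (53), we also have that
\[
F_n \to 0 \quad\text{in } L^q(\Omega\times\Omega).
\]
So we can suppose, up to a subsequence,
\[
F_n(x,y) \to 0 \quad \forall (x,y)\in\Omega\times\Omega\setminus C, \quad C \text{ null}. \tag{56}
\]
Let B2 ⊂ Ω be a null set such that,
\[
\text{for all } x\in\Omega\setminus B_2, \text{ the section } C_x \text{ of } C \text{ is null}. \tag{57}
\]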

Let x̂1 ∈ B(x1, δ) \ (B1 ∪ B2); then there exists a subsequence, denoted in the
same way, such that
\[
u_n(\hat x_1) \to \lambda_1 \in [-\infty,+\infty].
\]
Consider now x̂2 ∈ B(x2, δ) \ (B1 ∪ B2); then, up to a subsequence, we can assume
\[
u_n(\hat x_2) \to \lambda_2 \in [-\infty,+\infty].
\]
So, successively, for x̂m ∈ B(xm, δ) \ (B1 ∪ B2), there exists a subsequence, again
denoted in the same way, such that
\[
u_n(\hat x_m) \to \lambda_m \in [-\infty,+\infty].
\]
By (56) and (57),
\[
u_n(y) \to \lambda_i \quad \forall y\in (B(\hat x_i,r)\cap\Omega)\setminus C_{\hat x_i}.
\]
Now, by (52),
\[
\Omega = (B(\hat x_1,r)\cap\Omega)\cup\Big(\bigcup_{i=2}^{m}(B(\hat x_i,r)\cap\Omega)\Big).
\]
Hence, since Ω is a bounded domain (in particular, connected), there exists
i2 ∈ {2, . . . , m} such that
\[
(B(\hat x_1,r)\cap\Omega)\cap(B(\hat x_{i_2},r)\cap\Omega) \neq \emptyset.
\]
Therefore, λ1 = λ_{i2}. Let us call i1 := 1. Again, since
\[
\Omega = \big((B(\hat x_{i_1},r)\cap\Omega)\cup(B(\hat x_{i_2},r)\cap\Omega)\big)\cup\Big(\bigcup_{i\in\{1,\ldots,m\}\setminus\{i_1,i_2\}}(B(\hat x_i,r)\cap\Omega)\Big),
\]
there exists i3 ∈ {1, . . . , m} \ {i1, i2} such that
\[
\big((B(\hat x_{i_1},r)\cap\Omega)\cup(B(\hat x_{i_2},r)\cap\Omega)\big)\cap(B(\hat x_{i_3},r)\cap\Omega) \neq \emptyset.
\]
Consequently
\[
\lambda_{i_1} = \lambda_{i_2} = \lambda_{i_3}.
\]
Using the same argument we get
\[
\lambda_1 = \lambda_2 = \ldots = \lambda_m = \lambda.
\]

If |λ| = +∞, we have shown that
\[
|u_n(y)|^q \to +\infty \quad\text{for almost every } y\in\Omega,
\]
which contradicts ‖un‖_{L^q(Ω)} = 1 for all n ∈ N. Hence λ is finite.

On the other hand, by (55), fn(x̂_i) → 0, i = 1, . . . , m. Hence,
\[
F_n(\hat x_i,\cdot) \to 0 \quad\text{in } L^q(\Omega).
\]
Since un(x̂_i) → λ, from the above we conclude that
\[
u_n \to \lambda \quad\text{in } L^q(B(\hat x_i,r)\cap\Omega).
\]
Using again a compactness argument we get
\[
u_n \to \lambda \quad\text{in } L^q(\Omega).
\]
By (54), λ = 0, so
\[
u_n \to 0 \quad\text{in } L^q(\Omega),
\]
which contradicts ‖un‖_{L^q(Ω)} = 1.

Acknowledgments This work was partially supported by CONICET grant PIP GI No
11220150100036CO (Argentina), PICT-2018-03183 (Argentina) and UBACyT grant
20020160100155BA (Argentina).

References

1. G. Alberti, G. Bellettini, A nonlocal anisotropic model for phase transitions: asymptotic
behaviour of rescaled energies. Eur. J. Appl. Math. 9, 261–284 (1998)
2. F. Andreu-Vaillo, J. Toledo-Melero, J.M. Mazon, J.D. Rossi, Nonlocal Diffusion Problems, vol.
165 (American Mathematical Society, Providence, 2010)
3. F. Andreu, J.M. Mazón, J.D. Rossi, J. Toledo, The Neumann problem for nonlocal nonlinear
diffusion equations. J. Evol. Equ. 8(1), 189–215 (2008)
4. F. Andreu, J.M. Mazón, J.D. Rossi, J. Toledo. A nonlocal p−Laplacian evolution equation
with Neumann boundary conditions. J. Math. Pures Appl. 90(2), 201–227 (2008)
5. D. Applebaum, Lévy Processes and Stochastic Calculus. Cambridge Studies in Advanced
Mathematics, vol. 93 (Cambridge University Press, Cambridge, 2004)
6. G. Barles, E. Chasseigne, C. Imbert, On the Dirichlet problem for second-order elliptic integro-
differential equations. Ind. Univ. Math. J. 57, 213–246 (2008)
7. G. Barles, C. Imbert. Second-order elliptic integro-differential equations: viscosity solutions
theory revisited. Ann. Inst. H. Poincaré Anal. Non Linéaire 25, 567–585 (2008)
8. J. Bertoin, Lévy Processes. Cambridge Tracts in Mathematics, vol. 121 (Cambridge University
Press, Cambridge, 1996)
9. M. Bodnar, J.J.L. Velázquez, An integro-differential equation arising as a limit of individual
cell-based models. J. Differ. Equ. 222, 341–380 (2006)
10. L. Caffarelli, L. Silvestre, An extension problem related to the fractional Laplacian. Commun.
Partial Differ. Equ. 32, 1245–1260 (2007)
11. L. Caffarelli, S. Salsa, L. Silvestre, Regularity estimates for the solution and the free boundary
of the obstacle problem for the fractional Laplacian. Invent. Math. 171, 425–461 (2008)
12. C. Carrillo, P. Fife, Spatial effects in discrete generation population models. J. Math. Biol.
50(2), 161–188 (2005)
13. E. Chasseigne, The Dirichlet problem for some nonlocal diffusion equations. Differ. Integral
Equ. 20, 1389–1404 (2007)
14. E. Chasseigne, M. Chaves, J.D. Rossi, Asymptotic behavior for nonlocal diffusion equations.
J. Math. Pures Appl. 86, 271–291 (2006)
15. C. Cortázar, M. Elgueta, J.D. Rossi, A nonlocal diffusion equation whose solutions develop a
free boundary. Ann. Henri Poincaré 6, 269–281 (2005)
16. C. Cortázar, J. Coville, M. Elgueta, S. Martínez, A non local inhomogeneous dispersal process.
J. Differ. Equ. 241, 332–358 (2007)
17. C. Cortázar, M. Elgueta, J.D. Rossi, N. Wolanski, Boundary fluxes for non-local diffusion. J.
Differ. Equ. 234, 360–390 (2007)
18. C. Cortázar, M. Elgueta, J.D. Rossi, N. Wolanski, How to approximate the heat equation with
Neumann boundary conditions by nonlocal diffusion problems. Arch. Ration. Mech. Anal.
187(1), 137–156 (2008)
19. J. Coville, J. Dávila, S. Martínez, Existence and uniqueness of solutions to a nonlocal equation
with monostable nonlinearity. SIAM J. Math. Anal. 39, 1693–1709 (2008)
20. P. Fife, Some nonclassical trends in parabolic and parabolic-like evolutions, in Trends in
Nonlinear Analysis (Springer, Berlin, 2003), pp. 153–191
21. N. Fournier, P. Laurencot, Well-posedness of Smoluchowski’s coagulation equation for a class
of homogeneous kernels. J. Funct. Anal. 233, 351–379 (2006)
22. G. Gilboa, S. Osher, Nonlocal linear image regularization and supervised segmentation. UCLA
CAM Report 06-47 (2006)
23. V. Hutson, S. Martínez, K. Mischaikow, G.T. Vickers. The evolution of dispersal. J. Math. Biol.
47, 483–517 (2003)
24. C. Imbert. A non-local regularization of first order Hamilton-Jacobi equations. J. Differ. Equ.
211, 218–246 (2005)
25. L.I. Ignat, J.D. Rossi, A nonlocal convection-diffusion equation. J. Funct. Anal. 251, 399–437
(2007)
26. E.R. Jakobsen, K.H. Karlsen, Continuous dependence estimates for viscosity solutions of
integro-PDEs. J. Differ. Equ. 212, 278–318 (2005)
27. S. Kindermann, S. Osher, P.W. Jones, Deblurring and denoising of images by nonlocal
functionals. Multiscale Model. Simul. 4, 1091–1115 (2005)
28. J.G. Melián, J.D. Rossi, On the principal eigenvalue of some nonlocal diffusion problems. J.
Differ. Equ. 246(1), 21–38 (2009)
29. A. Mogilner, L. Edelstein-Keshet, A non-local model for a swarm. J. Math. Biol. 38, 534–570
(1999)
Comparing Banach Spaces for Systems
of Free Random Variables Followed by
the Semicircular Law

Ilwoo Cho and Palle Jorgensen

Abstract We study certain Banach-space operators from noncommutative free
probability, acting on systems of free random variables whose free distributions are
followed by the semicircular law. In particular, we consider (i) non-self-adjoint free
random variables $T$ of a $C^*$-probability space followed by the semicircular law in
the sense that all $n$-th joint free moments of $\{T, T^*\}$ are identical to the $\frac{n}{2}$-th Catalan
numbers $c_{n/2}$, for all $n \in \mathbb{N}$, with the convention $c_{n/2} = 0$ whenever $\frac{n}{2} \notin \mathbb{N}$, (ii) some
structure theorems of $C^*$-probability spaces generated by countably infinitely many
free random variables followed by the semicircular law, and (iii) certain Banach-space
operators acting on the free random variables of (i) and (ii).

Keywords Free probability · Semicircular elements · Free random variables
followed by the semicircular law · Banach-space operators

1 Introduction

As the counterpart of measure spaces in commutative function theory, let (B, ψ)

be a topological (noncommutative free) ∗-probability space (e.g., a C ∗ -probability
space, or, a W ∗ -probability space, or, a Banach ∗-probability space, etc.) of a
topological ∗-algebra B (a C ∗ -algebra, respectively, a von Neumann algebra,
respectively, a Banach ∗-algebra, etc.), and a bounded linear functional ψ. More-
over, as the noncommutative version of probability spaces in commutative theory,
we assume that (B, ψ) is unital in the sense that it contains the unity (or, the

I. Cho
Dept. of Math. & Stat., St. Ambrose Univ., Davenport, IA, USA
e-mail: [email protected]
P. Jorgensen ()
Dept. of Math., Univ. of Iowa, Iowa City, IA, USA
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
R. M. Aron et al. (eds.), Operator and Norm Inequalities and Related Topics,
Trends in Mathematics, https://doi.org/10.1007/978-3-031-02104-6_23

multiplication-identity) $1_B \in B$, satisfying

\[ 1_B \cdot T = T = T \cdot 1_B, \quad \forall T \in B, \]

and

\[ \psi(1_B) = 1 = \psi(1_B^n), \quad \forall n \in \mathbb{N}. \]

Note that our main results still hold when a given topological ∗-probability space $(B, \psi)$
is not unital. Throughout this paper, all given (noncommutative) free probability spaces
are assumed to be unital only for convenience.
An element T ∈ B is said to be a free random variable if we understand it as an
element of (B, ψ). For example, a self-adjoint free random variable S ∈ (B, ψ) is a
self-adjoint element S ∈ B satisfying S = S ∗ in B, where S ∗ is the adjoint of S in
B.
For arbitrary free random variables $T_1, \ldots, T_s \in (B, \psi)$, for $s \in \mathbb{N}$, the free
distribution of $T_1, \ldots, T_s$ is characterized by the joint free moments,

\[ \psi\Big( \prod_{l=1}^{n} T_{i_l}^{r_l} \Big) = \psi\big( T_{i_1}^{r_1} T_{i_2}^{r_2} \cdots T_{i_n}^{r_n} \big), \]

or, the joint free cumulants,

\[ k_n^{\psi}\big( T_{i_1}^{r_1}, \ldots, T_{i_n}^{r_n} \big) = \sum_{\pi \in NC(n)} \Big( \prod_{V \in \pi} \psi\Big( \prod_{l \in V} T_{i_l}^{r_l} \Big) \Big) \mu(\pi, 1_n), \tag{1} \]

of $\{T_j, T_j^*\}_{j=1}^{s}$, for all $(i_1, \ldots, i_n) \in \{1, \ldots, s\}^n$ and $(r_1, \ldots, r_n) \in \{1, *\}^n$, for all
$n \in \mathbb{N}$, where $k_\bullet^{\psi}(\cdot)$ is the free cumulant on $B$ in terms of the linear functional $\psi$,
by the Möbius inversion (e.g., [17, 22–25]). In (1), the set $NC(n)$ is the lattice of all
“noncrossing” partitions over a discrete set $\{1, \ldots, n\}$, with its maximal partition,

\[ 1_n = \{(1, \ldots, n)\}, \]

the single-block partition with its block $(1, \ldots, n)$, for all $n \in \mathbb{N}$; and $\mu$ is the
Möbius functional of [23] satisfying

\[ \mu(0_n, 1_n) = (-1)^{n+1} c_n, \quad \forall n \in \mathbb{N}, \]

and

\[ \sum_{\theta \in NC(n)} \mu(\theta, 1_n) = 0, \]

where

\[ c_n = \frac{(2n)!}{n!(n+1)!}, \quad \forall n \in \mathbb{N}, \]

and $0_n = \{(1), (2), \ldots, (n)\}$ is the minimal partition of $NC(n)$, having its $n$-many
blocks $(1), (2), \ldots, (n)$, and $[\theta_1, \theta_2]$ is the interval in $NC(n)$ under the partial
ordering,

\[ \theta_1 \leq \theta_2 \iff \forall U \in \theta_1, \ \exists V \in \theta_2 \ \text{s.t.} \ U \subseteq V, \]

where “$U \in \theta_1$” means “$U$ is a block of $\theta_1$.”

By (1), the free distribution of a single free random variable $T \in (B, \psi)$ is
characterized by the joint free moments,

\[ \psi\Big( \prod_{l=1}^{n} T^{r_l} \Big) = \psi\big( T^{r_1} T^{r_2} \cdots T^{r_n} \big), \]

or, the joint free cumulants,

\[ k_n^{\psi}\big( T^{r_1}, T^{r_2}, \ldots, T^{r_n} \big), \]

of $\{T, T^*\}$, for all $(r_1, \ldots, r_n) \in \{1, *\}^n$, for all $n \in \mathbb{N}$. As a special case, if a
free random variable $S \in (B, \psi)$ is self-adjoint, then the free distribution of $S$ is
characterized by the free moment sequence,

\[ \big( \psi(S^n) \big)_{n=1}^{\infty}, \]

equivalently, by the free cumulant sequence,

\[ \Big( k_n^{\psi}\big( \underbrace{S, S, \ldots, S}_{n\text{-times}} \big) \Big)_{n=1}^{\infty}, \tag{2} \]

since $S = S^*$ in $(B, \psi)$.

The main purposes of this paper are (i) to show the existence of “non-self-adjoint”
free random variables $T$ in a certain unital topological ∗-probability space
$(B, \psi)$, whose free distributions are followed by the semicircular law in the sense
that

\[ \psi\Big( \prod_{l=1}^{n} T^{r_l} \Big) = \omega_n c_{n/2}, \]

for all $(r_1, \ldots, r_n) \in \{1, *\}^n$, where

\[ \omega_n = \begin{cases} 1 & \text{if } n \text{ is even} \\ 0 & \text{if } n \text{ is odd,} \end{cases} \tag{3} \]

for all $n \in \mathbb{N}$, and

\[ c_k = \frac{1}{k+1} \binom{2k}{k} = \frac{1}{k+1} \cdot \frac{(2k)!}{k!(2k-k)!} = \frac{(2k)!}{k!(k+1)!}, \tag{4} \]

are the k-th Catalan numbers for all k ∈ N0 = N ∪ {0}; (ii) to study free-
distributional data in (B, ψ) induced by the free random variables of (i); (iii)
to investigate structure theorems of a unital C ∗ -probability space generated by
countable-infinitely many free random variables of (i), (iv) to consider certain
actions of Banach-space operator on the free random variables of (iii); and (v) to
characterize how the actions of (iv) deform the free distributions of (ii).

1.1 Background

In both classical and free probability theory, the semicircular law is an important
topic (e.g., [1, 2, 5–8, 10–12, 20, 21, 28, 30]). The (classical, or free) distributions of
semicircular elements, called the semicircular law, are well-known in statistical lan-
guage. In particular, operators satisfying the semicircular law (under a fixed linear
functional) have been studied in free probability, and they are well-characterized in
analytic, or combinatorial free-probabilistic senses (e.g., [1, 17, 18, 21, 28–30]).
Semicircular elements play a key role in operator-algebraic free theory by the
(free) central limit theorem(s) (e.g., see [2, 17, 19, 28–30]). Roughly speaking, the
semicircular law is the noncommutative counterpart of the classical Gaussian (or,
the normal) distribution in commutative theory. From combinatorial approaches
(e.g., [17, 22, 23, 25]), the free distributions of semicircular elements are universally
characterized by the Catalan numbers $\{c_k\}_{k=1}^{\infty}$ of (4). More precisely, the semicircular
law is characterized by the free-moment sequence,

\[ \big( \omega_n c_{n/2} \big)_{n=1}^{\infty} = (0, c_1, 0, c_2, 0, c_3, \ldots). \]

1.2 Motivation

Recently, from the analysis on p-adic number fields Qp , semicircular elements are
constructed (e.g., [5, 10]), for primes p. It shows connections among number theory,

operator algebra and quantum statistical physics (e.g., [26, 27]) via free probability.
Motivated by the constructions of [5, 10], semicircular elements are generated, as
Banach-space operators (e.g., [13, 14]), by |Z|-many orthogonal projections in a
C ∗ -algebra (e.g., [6–8, 11, 12]), different from classical approaches. Independently,
the joint free distributions of mutually free, multi semicircular elements are re-
characterized both combinatorially and analytically in [9]. Such re-characterizations
and the combinatorial techniques of [9] are applied in this paper (See Sect. 3 below).

1.3 Overview

Section 2 introduces the basic concepts of this paper. In Sects. 3 through 6,
we construct and study the framework for our work. We show that (i) there are suitably
many non-self-adjoint free random variables whose free distributions are followed
by the semicircular law (See Sects. 7.1–7.3), (ii) a C ∗ -probability space generated
by countable-infinitely many free random variables followed by the semicircular
law is well-defined under tensor product, and the structure theorems of this free-
probabilistic structure are characterized (See Sect. 7.4), and (iii) there are certain
Banach-space operators acting on the free random variables of (i) and (ii), deforming
the original free-distributional data (followed by the semicircular law), and the
deformations are characterized (See Sect. 8).

2 Preliminaries

For fundamental free probability theory, see e.g., [3, 15–17, 28–30]. Free proba-
bility is a noncommutative operator-algebraic version of classical measure theory
and statistical analysis. Roughly, a topological ∗-probability space (B, ψ) is a
noncommutative free-probabilistic analogue of a classical measure space (X, ρ)
of a measurable space X and its (bounded) measure ρ. In particular, if (B, ψ) is
unital, satisfying $\psi(1_B) = 1$, then it is a noncommutative counterpart of a classical
probability space (X, ρ), equipped with the probability measure ρ, satisfying
ρ (X) = 1.
Free probability is an important branch of operator algebra theory (e.g., [9,
17, 20–22, 28, 29]), and it provides interesting applications in both mathematical
and scientific fields (e.g., [4–8, 10–12, 24, 25]). Here, we use combinatorial free
probability of e.g., [17, 22, 23, 25].
Let $(A, \varphi)$ be a unital topological ∗-probability space. As we discussed at
the beginning, the free distribution of a self-adjoint free random variable $a$ is
characterized by

\[ \text{the free-moment sequence } \big( \varphi(a^n) \big)_{n=1}^{\infty}, \]

or,

\[ \text{the free-cumulant sequence } \big( k_n(a, \ldots, a) \big)_{n=1}^{\infty}, \tag{5} \]

by (1) and (2) (e.g., [17, 22, 23, 25]), where $k_\bullet(\cdot)$ is the free cumulant on $A$ in terms
of $\varphi$ under the Möbius inversion of [22].
Definition 1 A self-adjoint free random variable $x \in (A, \varphi)$ is said to be
semicircular, if

\[ \varphi(x^n) = \omega_n c_{n/2}, \quad \forall n \in \mathbb{N}, \tag{6} \]

where $\omega_n$ are in the sense of (3), and $c_k$ are the $k$-th Catalan numbers (4) for all
$k \in \mathbb{N}_0$.
By the Möbius inversion, a free random variable $x$ is semicircular in $(A, \varphi)$, if
and only if

\[ k_n(x, \ldots, x) = \delta_{n,2} \tag{7} \]

for all $n \in \mathbb{N}$, where $\delta$ is the Kronecker delta. So, one can use the definition (6)
and the characterization (7) interchangeably; i.e., the semicircular law, which is the
free distribution of semicircular elements, is characterized by the free-moment
sequence,

\[ (0, c_1, 0, c_2, 0, c_3, 0, c_4, \ldots), \tag{8} \]

equivalently, by the free-cumulant sequence,

\[ (0, 1, 0, 0, 0, 0, \ldots), \tag{9} \]

by (6) and (7), respectively.

By (8) and (9), all semicircular elements are “identically” free-distributed from
each other, and hence, the free distributions of all semicircular elements are said to
be “the” semicircular law.
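
As a quick numerical sanity check of the equivalence between (6) and (7), the following Python sketch builds the moment sequence $\omega_n c_{n/2}$ and recovers the free cumulants from it. It assumes the standard recursive form of the free moment-cumulant formula, $m_n = \sum_{s=1}^{n} k_s \sum_{i_1 + \cdots + i_s = n - s} m_{i_1} \cdots m_{i_s}$ with $m_0 = 1$; the function names (`catalan`, `semicircular_moment`, `free_cumulants`) are illustrative and do not come from the text.

```python
from math import comb


def catalan(k):
    """k-th Catalan number c_k = (2k)! / (k! (k+1)!), as in (4)."""
    return comb(2 * k, k) // (k + 1)


def semicircular_moment(n):
    """n-th free moment omega_n * c_{n/2} of a semicircular element, as in (6)."""
    return catalan(n // 2) if n % 2 == 0 else 0


def _poly_mul(a, b, order):
    """Multiply two coefficient lists, truncating at degree `order`."""
    out = [0] * (order + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j > order:
                break
            out[i + j] += ai * bj
    return out


def free_cumulants(moments):
    """Free cumulants k_1..k_N from moments m_1..m_N (assumed recursion, see lead-in)."""
    n_max = len(moments)
    M = [1] + list(moments)          # M[i] = m_i, with m_0 = 1
    cumulants = []
    for n in range(1, n_max + 1):
        total = M[n]
        power = [1] + [0] * n_max    # coefficients of M(z)^0
        for s in range(1, n):
            power = _poly_mul(power, M, n_max)   # now M(z)^s
            total -= cumulants[s - 1] * power[n - s]
        cumulants.append(total)      # the s = n term contributes exactly k_n
    return cumulants


moments = [semicircular_moment(n) for n in range(1, 9)]
cumulants = free_cumulants(moments)
```

Under these assumptions, `moments` starts as in (8) and `cumulants` reproduces the pattern in (9): only the second free cumulant is nonzero.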

3 Free Distributions of Multi Semicircular Elements

In this section, we study joint free distributions of mutually free, multi semicircular
elements in a unital $C^*$-probability space $(A, \varphi)$. Suppose there are $N$-many
semicircular elements $x_1, \ldots, x_N$ in $(A, \varphi)$, for $N \in \mathbb{N}$, and assume that they are
mutually free in $(A, \varphi)$. By the self-adjointness of $x_1, \ldots, x_N$ in $A$, the (joint) free
distribution, say

\[ \rho \overset{\text{denote}}{=} \rho_{x_1, \ldots, x_N}, \tag{10} \]

of them is characterized by the joint free moments,

\[ \bigcup_{n=1}^{\infty} \ \bigcup_{(i_1, \ldots, i_n) \in \{1, \ldots, N\}^n} \big\{ \varphi\big( x_{i_1} x_{i_2} \cdots x_{i_n} \big) \big\} \tag{11} \]

by (1) (e.g., [17, 22, 23]).

Throughout this section, we fix $s \in \mathbb{N}$, and an $s$-tuple $I_s$,

\[ I_s \overset{\text{denote}}{=} (i_1, \ldots, i_s) \in \{1, \ldots, N\}^s, \tag{12} \]

in $\{1, \ldots, N\}$. From the sequence $I_s$ of (12), define a set,

\[ [I_s] = \{i_1, i_2, \ldots, i_s\}, \tag{13} \]

with its cardinality $s$, without considering repetition; i.e., even though $i_{j_1} = i_{j_2}$ as
entries of the sequence $I_s$ of (12) for $j_1 \neq j_2 \in \{1, \ldots, s\}$, regard them as distinct
elements of the set $[I_s]$ of (13).
Then, from the set $[I_s]$ of (13), define a “noncrossing” partition $\pi(i_1)$ in the
noncrossing-partition lattice $NC([I_s])$ (e.g., [17, 22, 23, 25]) as follows: (i) starting
from the very first entry $i_1$ of $I_s$, construct the maximal block $U_1$ satisfying

\[ U_1 = \big( i_1 = i_{j_1}, i_{j_2}, \ldots, i_{j_{|U_1|}} \big) \in \pi(i_1), \]

with the rule:

\[ i_1 = i_{j_1} = i_{j_2} = \ldots = i_{j_{|U_1|}}, \tag{14} \]

in $I_s$; (ii) then, by fixing the very next entry of

\[ [I_s] \setminus U_1, \]

construct the second maximal block $U_2$ of $\pi(i_1)$ containing that entry, as in (14), and
repeat this process until the entries are exhausted to obtain the noncrossing partition $\pi(i_1)$; and (iii) the
resulting partition $\pi(i_1)$ must be “maximal” in $NC([I_s])$, under the partial ordering
on $NC([I_s])$ (e.g., see [22, 23, 25]), among those satisfying both (i) and (ii). For example, if

\[ I_{10} = (1, 1, 2, 2, 1, 1, 1, 2, 1, 2) \]

and

\[ [I_{10}] = \{i_1, i_2, \ldots, i_{10}\}, \]

with

\[ i_1 = i_2 = i_5 = i_6 = i_7 = i_9 = 1 \quad \text{and} \quad i_3 = i_4 = i_8 = i_{10} = 2, \]

then there exists a noncrossing partition,

\[ \pi(i_1) = \{(i_1, i_2, i_5, i_6, i_7, i_9), (i_3, i_4), (i_8), (i_{10})\} = \{(1, 1, 1, 1, 1, 1), (2, 2), (2), (2)\}, \]

in $NC([I_{10}])$, satisfying the conditions (i), (ii) and (iii). Remark here that, even
though $i_3 = i_4 = i_8 = i_{10} = 2$, one cannot construct the block $(i_3, i_4, i_8, i_{10})$ in
$\pi(i_1)$, because this block has two crossings with the first block $(i_1, i_2, i_5, i_6, i_7, i_9)$;
so, to avoid the crossings, we need the separated blocks $(i_3, i_4)$, $(i_8)$ and $(i_{10})$.
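
The construction (i)-(iii) can be sketched in Python for the simplest situation, a tuple with at most two distinct indices, where the steps are unambiguous: the block of all entries equal to $i_l$ is taken first, and the remaining positions are cut into the regions that this block leaves free of crossings. This is only one reading of the construction, restricted to that case; the function name `induced_partition` and the representation of blocks by 1-based positions are assumptions made for illustration.

```python
def induced_partition(I, l):
    """Blocks (as tuples of 1-based positions) of pi(i_l) for a tuple I.

    Sketch only: assumes I has at most two distinct values, so every region
    cut out by the main block consists of entries with one common value.
    `l` is the 0-based position of the entry i_l that seeds the main block.
    """
    if len(set(I)) > 2:
        raise ValueError("sketch only covers tuples with at most two distinct indices")
    s = len(I)
    # (i) maximal block of all positions carrying the same value as I[l]
    main = [p for p in range(s) if I[p] == I[l]]
    blocks = [tuple(p + 1 for p in main)]
    # (ii) positions strictly between two consecutive main positions form a block
    for a, b in zip(main, main[1:]):
        gap = tuple(p + 1 for p in range(a + 1, b))
        if gap:
            blocks.append(gap)
    # positions before the first and after the last main position can be joined
    # without crossing the main block, which keeps the partition maximal (iii)
    outer = tuple(p + 1 for p in range(s) if p < main[0] or p > main[-1])
    if outer:
        blocks.append(outer)
    return sorted(blocks)


example = induced_partition((1, 1, 2, 2, 1, 1, 1, 2, 1, 2), 0)
```

For the tuple $I_{10}$ of the example, `example` lists the position blocks $(1, 2, 5, 6, 7, 9)$, $(3, 4)$, $(8)$ and $(10)$, in agreement with $\pi(i_1)$ above.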
Now, similar to the noncrossing partition $\pi(i_1)$ for the first entry $i_1$ of $I_s$,
construct noncrossing partitions,

\[ \pi(i_2), \ldots, \pi(i_s) \quad \text{in } NC([I_s]), \]

similarly satisfying the above conditions (i), (ii) and (iii) by replacing $i_1$ with $i_l$, for all
$l = 2, \ldots, s$; i.e., $\pi(i_l)$ are the maximal partitions containing the block of
all entries of $I_s$ identical to $i_l$, for all $l = 1, \ldots, s$. It is not hard to check that if
$i_{l_1} = i_{l_2}$ in $\{1, \ldots, N\}$, as entries of $I_s$, then $\pi(i_{l_1}) = \pi(i_{l_2})$ in $NC([I_s])$. Thus, if
$i_{k_1}, \ldots, i_{k_n}$ are mutually distinct in $\{1, \ldots, N\}$ as entries of $I_s$, for $n \leq s$, then the
corresponding partitions,

\[ \pi(i_{k_1}), \ldots, \pi(i_{k_n}) \]

“can” be distinct. Remark that, sometimes, some of them can be identical in
$NC([I_s])$; for instance, if

\[ J \overset{\text{let}}{=} (1, 1, 1, 1, 2, 2) = (i_1, \ldots, i_6), \]

then

\[ \pi(1) = \{(i_1, i_2, i_3, i_4), (i_5, i_6)\} = \pi(2), \]

in $NC(\{i_1, \ldots, i_6\})$.
Now, suppose $\pi(i_l) \in NC([I_s])$, for $l = 1, \ldots, s$, is the noncrossing partition
induced by the $s$-tuple $I_s$ of (12),

\[ \pi(i_l) = \{U_1, \ldots, U_t\}, \]

where $t \leq s$ and $U_k \in \pi(i_l)$ are the blocks of (i) and (ii), satisfying (iii), for $k = 1,
\ldots, t$. Then the partition $\pi(i_l)$ is regarded as the join partition (e.g., [17, 22, 23]),

\[ \pi(i_l) = 1_{|U_1|} \vee 1_{|U_2|} \vee \ldots \vee 1_{|U_t|}, \]

where $1_{|U_k|}$ are the maximal elements of $NC(U_k)$, for all $k = 1, \ldots, t$, by regarding
each block $U_k$ as an independent discrete finite set.
By collecting all such partitions, define the subset $\Pi([I_s])$ of $NC([I_s])$ by

\[ \Pi([I_s]) = \{\pi(i_l) : l = 1, \ldots, s\}. \]

Also, define a subset $\Pi_e([I_s])$ of $\Pi([I_s])$ by

\[ \Pi_e([I_s]) = \{\theta \in \Pi([I_s]) : |V| \in 2\mathbb{N}, \ \forall V \in \theta\}, \tag{15} \]

i.e., all partitions of $\Pi_e([I_s])$ have their blocks with even cardinalities, where $2\mathbb{N} =
\{2n : n \in \mathbb{N}\}$.
Let $I_s$ be in the sense of (12), and let $x_{i_1}, \ldots, x_{i_s}$ be the corresponding
semicircular elements of $(A, \varphi)$ induced by $I_s$, without considering repetition in
the free semicircular family $\{x_1, \ldots, x_N\}$. Define a free random variable $X[I_s]$ by

\[ X[I_s] \overset{\text{def}}{=} \prod_{l=1}^{s} x_{i_l} \in (A, \varphi). \tag{16} \]

Theorem 2 Let $I_s$ be an $s$-tuple (12), and let $X[I_s] = \prod_{l=1}^{s} x_{i_l}$ be the corresponding
free random variable (16) of $(A, \varphi)$. Then

\[ \varphi(X[I_s]) = \sum_{\theta \in \Pi_e([I_s])} \varphi_\theta\big( x_{i_1}, \ldots, x_{i_s} \big), \]

with

\[ \varphi_\theta\big( x_{i_1}, \ldots, x_{i_s} \big) = \prod_{V \in \theta} c_{|V|/2}, \tag{17} \]

where $c_k$ are the $k$-th Catalan numbers (4). Clearly,

\[ \varphi(X[I_s]) = 0 \iff \Pi_e([I_s]) = \emptyset, \]

where $\emptyset$ is the empty set.

Proof The formula (17) is proven in [9] by the Möbius inversion of [23].

4 A $C^*$-Probability Space Generated by $|\mathbb{N}|$-Many Semicircular Elements

In this section, we study a $C^*$-algebra $\mathbb{X}$ generated by mutually free, $|\mathbb{N}|$-many
semicircular elements. Let $(A, \varphi)$ be a unital $C^*$-probability space, and assume
that it contains a family $\mathcal{X} = \{s_n\}_{n=1}^{\infty}$ of mutually free semicircular elements.
By [17, 22, 23], all mixed free cumulants of $\{s_n\}_{n=1}^{\infty}$ vanish with respect to $\varphi$ by
the freeness on $\mathcal{X}$. For notational convenience, we re-index the free semicircular
family $\mathcal{X}$,

\[ \{s_n\}_{n=1}^{\infty} = \{s_1, s_2, s_3, \ldots\}, \]

as

\[ \{x_n\}_{n=0}^{\infty} = \{x_0, x_1, x_2, \ldots\}, \]

by identifying $s_n$ with $x_{n-1}$ in $A$, for all $n \in \mathbb{N}$; i.e., from below, we will let

\[ \mathcal{X} = \{x_n\}_{n=0}^{\infty} = \{x_n\}_{n \in \mathbb{N}_0}, \]

without loss of generality.

4.1 Free-Isomorphic Relations

A unital $C^*$-probability space $(A_1, \varphi_1)$ is free-homomorphic to a unital $C^*$-probability
space $(A_2, \varphi_2)$, if there is a ∗-homomorphism $\Phi : A_1 \to A_2$, such
that

\[ \varphi_2(\Phi(a)) = \varphi_1(a), \quad \text{for all } a \in (A_1, \varphi_1). \]

In such a case, the ∗-homomorphism $\Phi$ is called a free-homomorphism from
$(A_1, \varphi_1)$ to $(A_2, \varphi_2)$. We write this free-homomorphic relation by

\[ (A_1, \varphi_1) \xrightarrow{\text{free-homo}} (A_2, \varphi_2). \tag{18} \]

Definition 3 Suppose $(A_1, \varphi_1) \xrightarrow{\text{free-homo}} (A_2, \varphi_2)$ in the sense of (18), by a
free-homomorphism $\Phi : A_1 \to A_2$. If $\Phi$ is a ∗-isomorphism, then it is called a
free-isomorphism, and $(A_1, \varphi_1)$ is said to be free-isomorphic to $(A_2, \varphi_2)$. We denote this
relation by

\[ (A_1, \varphi_1) \overset{\text{free-iso}}{=} (A_2, \varphi_2). \tag{19} \]

By the definitions (18) and (19), if two $C^*$-probability spaces are free-isomorphic,
then they are understood as the same unital $C^*$-probability space.

4.2 A C ∗ -Probability Space Xϕ

Let $(A, \varphi)$ be a unital $C^*$-probability space containing a free semicircular family
$\mathcal{X} = \{x_n\}_{n=0}^{\infty}$. Construct the $C^*$-subalgebra $\mathbb{X} = C^*(\mathcal{X})$ of $A$, generated by the
family $\mathcal{X}$, where $C^*(Y)$ are the $C^*$-subalgebras of $A$ generated by

\[ Y \cup Y^* \text{ of } A, \quad \text{with } Y^* = \{y^* \in A : y \in Y\}. \]

Then one can obtain a $C^*$-probabilistic sub-structure,

\[ \mathbb{X}_\varphi \overset{\text{denote}}{=} \big( \mathbb{X}, \ \varphi = \varphi|_{\mathbb{X}} \big) \tag{20} \]

in $(A, \varphi)$.
Now, let $(B, \psi)$ be a unital $C^*$-probability space, containing a family $\mathcal{S} =
\{y_n\}_{n \in \mathbb{Z}}$ of mutually free, $|\mathbb{Z}|$-many semicircular elements, and let

\[ \mathbb{S}_\psi \overset{\text{denote}}{=} \big( \mathbb{S}, \ \psi = \psi|_{\mathbb{S}} \big) \tag{21} \]

be the corresponding $C^*$-probabilistic sub-structure of $(B, \psi)$, as in (20), where
$\mathbb{S} = C^*(\mathcal{S})$ is the $C^*$-subalgebra of $B$ generated by the family $\mathcal{S}$. Such a $C^*$-probability
space (21) does exist naturally (e.g., [5, 12, 20, 21]), or artificially-but-canonically
(e.g., [6–8]).
Proposition 4 Let $\mathbb{X} = C^*(\mathcal{X})$ be the $C^*$-subalgebra (20) of $A$. Then

\[ \mathbb{X} \overset{*\text{-iso}}{=} \mathop{\star}\limits_{n=0}^{\infty} C^*(\{x_n\}) \overset{*\text{-iso}}{=} C^*\Big( \mathop{\star}\limits_{n=0}^{\infty} \{x_n\} \Big), \tag{22} \]

in $(A, \varphi)$, where $(\star)$ in the first ∗-isomorphic relation of (22) is the free-probabilistic
free product of [17, 22, 29, 30], and the $(\star)$ in the second ∗-isomorphic relation of
(22) is the pure-algebraic free product inducing the noncommutative free words in

\[ \bigcup_{n=0}^{\infty} \{x_n\} = \mathcal{X}. \]

Proof Let $\mathbb{X} = C^*(\mathcal{X})$ be a fixed $C^*$-subalgebra of $A$. Since $\mathcal{X}$ is a free family
consisting of mutually free semicircular elements $\{x_n\}_{n \in \mathbb{N}_0}$ in $(A, \varphi)$, one has that

\[ \mathbb{X} \overset{\text{def}}{=} C^*(\mathcal{X}) = C^*\big( \{x_n\}_{n \in \mathbb{N}_0} \big) \overset{*\text{-iso}}{=} \mathop{\star}\limits_{n \in \mathbb{N}_0} C^*(\{x_n\}). \tag{23} \]

Therefore, the first ∗-isomorphic relation of (22) holds by (23); i.e., all elements
of $\mathbb{X}$ are the limits of linear combinations of free reduced words (under operator
multiplication on $A$) in $\mathcal{X}$ by [17, 22, 23, 29, 30].
So, if we consider all noncommutative free words in the family $\mathcal{X} = \{x_n\}_{n \in \mathbb{N}_0}$,
then they have their unique operator forms in $\mathbb{X}$, which are the free reduced words
by (23). It shows that the second ∗-isomorphic relation of (22) holds, too.

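By (22), one can understand the $C^*$-probability space $\mathbb{X}_\varphi$ of (20) as an independent
free-probabilistic structure,

\[ \mathbb{X}_\varphi = \Big( \mathop{\star}\limits_{n \in \mathbb{N}_0} C^*(\{x_n\}), \ \mathop{\star}\limits_{n \in \mathbb{N}_0} \varphi|_{C^*(\{x_n\})} \Big). \tag{24} \]

Proposition 5 Let $\mathbb{S}_\psi$ be the $C^*$-probability space (21) in $(B, \psi)$. Then

\[ \mathbb{S}_\psi = \Big( \mathop{\star}\limits_{j \in \mathbb{Z}} C^*(\{y_j\}), \ \mathop{\star}\limits_{j \in \mathbb{Z}} \psi|_{C^*(\{y_j\})} \Big), \tag{25} \]

where $C^*(Z)$, here, mean the $C^*$-subalgebras of $B$ generated by the subsets $Z$ of
$B$.
Proof The proof of (25) is similar to that of (22).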

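Now, let us partition $\mathbb{N}_0$ and $\mathbb{Z}$ as follows:

\[ \mathbb{N}_0 = \{0\} \sqcup (2\mathbb{N}) \sqcup (2\mathbb{N} - 1), \]

and

\[ \mathbb{Z} = (-\mathbb{N}) \sqcup \{0\} \sqcup \mathbb{N}, \tag{26} \]

where

\[ 2\mathbb{N} = \{2n : n \in \mathbb{N}\}, \quad 2\mathbb{N} - 1 = \{2n - 1 : n \in \mathbb{N}\}, \]

and

\[ -\mathbb{N} = \{-n : n \in \mathbb{N}\}. \]

Motivated by (26), define a bijection $g : \mathbb{N}_0 \to \mathbb{Z}$ by

\[ g(n) = \begin{cases} 0 & \text{if } n = 0 \\ \frac{n+1}{2} & \text{if } n \in 2\mathbb{N} - 1 \\ -\frac{n}{2} & \text{if } n \in 2\mathbb{N}, \end{cases} \tag{27} \]

in $\mathbb{Z}$, for all $n \in \mathbb{N}_0$. From this bijection $g$, one can define a bijection,

\[ G : \mathcal{X} \to \mathcal{S}, \quad G(x_n) = y_{g(n)}, \quad \text{for all } n \in \mathbb{N}_0, \tag{28} \]

where $\mathcal{X}$ is the generating family (20) of $\mathbb{X}$, $\mathcal{S}$ is the generating family (21) of $\mathbb{S}$,
and $g$ is the bijection (27).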
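
The bijection (27) is easy to tabulate. The short Python sketch below implements $g$ together with a candidate inverse and checks bijectivity on a finite window; the name `g_inv` and the explicit inverse formula are derived here for illustration and are not stated in the text.

```python
def g(n):
    """The map g : N_0 -> Z of (27): 0 -> 0, odd n -> (n + 1)/2, even n -> -n/2."""
    if n < 0:
        raise ValueError("g is only defined on N_0")
    if n == 0:
        return 0
    return (n + 1) // 2 if n % 2 == 1 else -(n // 2)


def g_inv(j):
    """Inverse of g (derived for this sketch): positive j -> 2j - 1, negative j -> -2j."""
    if j == 0:
        return 0
    return 2 * j - 1 if j > 0 else -2 * j


first_values = [g(n) for n in range(7)]
```

Here `first_values` enumerates the integers in the order $0, 1, -1, 2, -2, 3, -3$, which is how the generators $x_n$ are matched with the generators $y_{g(n)}$ in (28).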
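By (28), one can define the corresponding “multiplicative” linear transformation,

\[ \Phi : \mathbb{X} \to \mathbb{S} \]

satisfying

\[ \Phi(x_n) = G(x_n) = y_{g(n)} \in \mathcal{S}, \quad \forall x_n \in \mathcal{X}, \tag{29} \]

in $\mathbb{S}$, where $G$ is in the sense of (28); i.e., for an alternating $N$-tuple $(n_1, \ldots, n_N) \in
\mathbb{N}_0^N$, satisfying

\[ n_1 \neq n_2, \ n_2 \neq n_3, \ \ldots, \ n_{N-1} \neq n_N \quad \text{in } \mathbb{N}_0, \]

if one has a free reduced word $T = \prod_{l=1}^{N} x_{n_l}^{k_l} \in \mathbb{X}_\varphi$, where $x_{n_1}, \ldots, x_{n_N} \in \mathcal{X}$, for
$k_1, \ldots, k_N \in \mathbb{N}$, for $N \in \mathbb{N}$, then

\[ \Phi(T) = \Phi\Big( \prod_{l=1}^{N} x_{n_l}^{k_l} \Big) = \prod_{l=1}^{N} \Phi\big( x_{n_l}^{k_l} \big) \]

by the multiplicativity of $\Phi$

\[ = \prod_{l=1}^{N} \big( \Phi(x_{n_l}) \big)^{k_l} \]

by the multiplicativity of $\Phi$

\[ = \prod_{l=1}^{N} y_{g(n_l)}^{k_l}, \tag{30} \]

in $\mathbb{S}_\psi$, by (29).
Since $(n_1, \ldots, n_N) \in \mathbb{N}_0^N$ is an alternating $N$-tuple in $\mathbb{N}_0$, the $N$-tuple,

\[ (g(n_1), \ldots, g(n_N)) \in \mathbb{Z}^N, \]

is an alternating $N$-tuple in $\mathbb{Z}$, too, by (27) and (28); i.e., the formula (30) shows that
the images $\Phi(T)$ of all free reduced words $T \in \mathbb{X}_\varphi$ of length $N$ become
free reduced words of $\mathbb{S}_\psi$ of the same length $N$.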
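Lemma 6 The multiplicative linear transformation $\Phi : \mathbb{X} \to \mathbb{S}$ of (29) is a
∗-isomorphism, i.e.,

\[ \mathbb{X} \overset{*\text{-iso}}{=} \mathbb{S}, \tag{31} \]

where “$\overset{*\text{-iso}}{=}$” means “being ∗-isomorphic to.”
Proof Remark that all elements of $\mathbb{X}$ (or, of $\mathbb{S}$) are the limits of linear combinations
of free reduced words in $\mathcal{X}$ (resp., in $\mathcal{S}$) by (22) (resp., by (25)). So, the multiplicative
linear transformation $\Phi$ of (29) is bijective and bounded. Observe that, for any $x_n \in
\mathcal{X} \subset \mathbb{X}$, and $t \in \mathbb{C}$,

\[ \Phi\big( (t x_n)^* \big) = \Phi\big( \bar{t} x_n \big) \]

since $x_n^* = x_n$, under the semicircularity

\[ = \bar{t} \, \Phi(x_n) = \bar{t} \, y_{g(n)} = \bar{t} \, y_{g(n)}^* \]

since $y_{g(n)}^* = y_{g(n)}$, by the semicircularity

\[ = \big( t \, y_{g(n)} \big)^* = \big( \Phi(t x_n) \big)^*, \]

in $\mathbb{S}$. So, by (22), (25), and the linearity of $\Phi$,

\[ \Phi(T^*) = \big( \Phi(T) \big)^* \quad \text{in } \mathbb{S}, \tag{32} \]

for all $T \in \mathbb{X}$. Therefore, $\Phi$ is a ∗-isomorphism by (32).

By (31), we obtain the following free-isomorphic relation.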
Theorem 7 The $C^*$-probability spaces $\mathbb{X}_\varphi$ and $\mathbb{S}_\psi$ are free-isomorphic, i.e.,

\[ \mathbb{X}_\varphi \overset{\text{free-iso}}{=} \mathbb{S}_\psi. \tag{33} \]

Proof By (31), there exists a ∗-isomorphism $\Phi$ of (29) from $\mathbb{X}$ onto $\mathbb{S}$. By (22) and
(25), it suffices to show that the ∗-isomorphism $\Phi$ preserves the free distributions of
generators of $\mathbb{X}_\varphi$ to those of generators of $\mathbb{S}_\psi$.
Let $x_n \in \mathcal{X} \subset \mathbb{X}_\varphi$. Then

\[ \psi\big( (\Phi(x_n))^k \big) = \psi\big( y_{g(n)}^k \big) = \omega_k c_{k/2} = \varphi\big( x_n^k \big), \]

for all $k \in \mathbb{N}$, by the semicircularity (6) of $\mathcal{X} \cup \mathcal{S}$.
It shows that $\Phi$ preserves the free probability on $\mathbb{X}_\varphi$ to that on $\mathbb{S}_\psi$ by (17), and
hence, it is a free-isomorphism. Therefore, the two $C^*$-probability spaces $\mathbb{X}_\varphi$ and $\mathbb{S}_\psi$
are free-isomorphic.

Assumption and Notation From below, we will identify $\mathbb{X}_\varphi$ and $\mathbb{S}_\psi$ as the same
$C^*$-probability space, and denote it by $\mathbb{X}_\varphi$, by (33).

5 Free-Distributional Data on Xϕ

Let $\mathbb{X}_\varphi = (\mathbb{X}, \varphi)$ be the $C^*$-probability space (25), “identified with (24) by (33),”
generated by the free semicircular family $\mathcal{X} = \{x_j\}_{j \in \mathbb{Z}}$.
Theorem 8 Let $I_s = (i_1, \ldots, i_s)$ be an arbitrary $s$-tuple in $\mathbb{Z}^s$, for $s \in \mathbb{N}$, as in
(12), and let $\Pi_e([I_s])$ be the set (15) of noncrossing partitions induced by
$I_s$. If $X[I_s]$ is a free random variable (16) of $\mathbb{X}_\varphi$, then the free-distributional data
$\varphi(X[I_s])$ is characterized by the formula (17).

Proof Under the hypothesis, the free-distributional data $\varphi(X[I_s])$ on $\mathbb{X}_\varphi$ are obtained
by (17), since all elements of the generator set $\mathcal{X} = \{x_j\}_{j \in \mathbb{Z}}$ of $\mathbb{X}_\varphi$ are mutually
free, semicircular elements.

6 Certain Free-Isomorphisms on Xϕ

By (24), (25) and (33), our C ∗ -probability space Xϕ is a representative of all unital
C ∗ -probability spaces generated by mutually free, |N|-many semicircular elements.
As we assumed in Sects. 4 and 5, we let Xϕ be the C ∗ -probability space (25)
generated by a free semicircular family X = {xj }j ∈Z of mutually free, |Z|-many
semicircular elements.

6.1 Shifts on Z

Define a bijection $h$ on the set $\mathbb{Z}$ of all integers by

\[ h(j) = j + 1, \tag{34} \]

for all $j \in \mathbb{Z}$.
Remark that, by (34), one can define a function $h'$ on $\mathbb{N}_0$ by

\[ h' = g^{-1} \circ h \circ g \quad \text{on } \mathbb{N}_0, \tag{34'} \]

where $g$ is the bijection (27) from $\mathbb{N}_0$ onto $\mathbb{Z}$, and $g^{-1} : \mathbb{Z} \to \mathbb{N}_0$ is the inverse
function of $g$. Thus, the well-defined bijection $h$ of (34) on $\mathbb{Z}$ implies the existence
of the bijection $h'$ of (34') on $\mathbb{N}_0$. We now concentrate on $h$ of (34).
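Define the bijections $h^{(n)}$ on $\mathbb{Z}$, by

\[ h^{(n)} \overset{\text{def}}{=} \begin{cases} \mathrm{id}_{\mathbb{Z}}, \text{ the identity function on } \mathbb{Z} & \text{if } n = 0 \\ \underbrace{h \circ h \circ \cdots \circ h}_{n\text{-times}} & \text{if } n > 0 \\ \underbrace{h^{-1} \circ h^{-1} \circ \cdots \circ h^{-1}}_{|n|\text{-times}} & \text{if } n < 0, \end{cases} \tag{35} \]

for all $n \in \mathbb{Z}$, where $(\circ)$ is the usual functional composition, and $h^{-1}$ is the inverse
of $h$,

\[ h^{-1}(j) = j - 1, \quad \forall j \in \mathbb{Z}. \]

By (34) and (35),

\[ h^{(n)}(j) = j + n, \]

for all $j, n \in \mathbb{Z}$. And $h^{(n)}$ are invertible with their inverses $h^{(-n)}$, for all $n \in \mathbb{Z}$.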
Definition 9 We call the bijections h(n) of (35), the n-th shifts on Z, for n ∈ Z.
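
To make the shifts (35) and the conjugated map $g^{-1} \circ h \circ g$ of (34') concrete, the following Python sketch evaluates both on a few points. It assumes the explicit formulas for $g$ from (27) and an inverse `g_inv` derived for illustration; the helper names `shift` and `conjugated_shift` are not from the text.

```python
def g(n):
    """The bijection g : N_0 -> Z of (27)."""
    if n == 0:
        return 0
    return (n + 1) // 2 if n % 2 == 1 else -(n // 2)


def g_inv(j):
    """Inverse of g (derived for this sketch, not stated in the text)."""
    if j == 0:
        return 0
    return 2 * j - 1 if j > 0 else -2 * j


def shift(n):
    """The n-th shift h^(n) on Z, i.e. j -> j + n, as in (35)."""
    return lambda j: j + n


def conjugated_shift(n):
    """The map g^{-1} o h^(n) o g on N_0, generalising (34') from n = 1."""
    return lambda m: g_inv(shift(n)(g(m)))


h_prime = conjugated_shift(1)
images = [h_prime(m) for m in range(5)]
```

In this sketch `images` shows how the shift by one on $\mathbb{Z}$ looks after transport to $\mathbb{N}_0$: the points $0, 1, 2, 3, 4$ are sent to $1, 3, 0, 5, 2$, and `conjugated_shift(-1)` undoes the map.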

6.2 Integer Shifts on Xϕ

Let $h^{(n)}$ be the $n$-th shifts (35) on $\mathbb{Z}$, for all $n \in \mathbb{Z}$. Let $k \in \mathbb{Z}$, and define a
“multiplicative” linear transformation $\lambda^k$ acting on $\mathbb{X}_\varphi$ by the morphism satisfying

\[ \lambda^k(x_j) = x_{h^{(k)}(j)} = x_{j+k}, \quad \forall x_j \in \mathcal{X} \subset \mathbb{X}_\varphi. \tag{36} \]

By the multiplicativity of the morphism $\lambda^k$ of (36), if $T = \prod_{l=1}^{N} x_{j_l}^{n_l}$ is a free reduced
word of $\mathbb{X}_\varphi$ of length $N$ in $\mathcal{X} = \{x_j\}_{j \in \mathbb{Z}}$, then

\[ \lambda^k(T) = \prod_{l=1}^{N} \lambda^k\big( x_{j_l}^{n_l} \big) = \prod_{l=1}^{N} x_{j_l + k}^{n_l}, \tag{37} \]

in $\mathbb{X}_\varphi$, where $(j_1, \ldots, j_N) \in \mathbb{Z}^N$ is alternating, and $n_1, \ldots, n_N \in \mathbb{N}$.

Note that, if $(j_1, \ldots, j_N) \in \mathbb{Z}^N$ is alternating, then

\[ (j_1 + k, \ldots, j_N + k) \in \mathbb{Z}^N \]

is alternating in $\mathbb{Z}$, too. So, the computation (37) says that the morphism $\lambda^k$ of (36)
assigns free reduced words to free reduced words, preserving lengths in $\mathbb{X}_\varphi$.
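Also, by (36) and (37), one has

\[ \lambda^k = \begin{cases} 1_{\mathbb{X}_\varphi}, \text{ the identity map on } \mathbb{X}_\varphi & \text{if } k = 0 \\ \underbrace{\lambda \cdot \lambda \cdots \lambda}_{k\text{-times}} & \text{if } k > 0 \\ \underbrace{\lambda^{-1} \cdot \lambda^{-1} \cdots \lambda^{-1}}_{|k|\text{-times}} & \text{if } k < 0, \end{cases} \]

for all $k \in \mathbb{Z}$, by (35) and (36), where $\lambda = \lambda^1$ and $(\cdot)$ is the multiplication (or, composition) of
linear transformations.
Observe that, for $t \in \mathbb{C}$, and $x_j \in \mathcal{X} \subset \mathbb{X}$,

\[ \lambda^k\big( (t x_j)^* \big) = \bar{t} \, x_{j+k} = \bar{t} \, x_{j+k}^* = \big( \lambda^k(t x_j) \big)^*, \]

implying that

\[ \lambda^k(T^*) = \big( \lambda^k(T) \big)^*, \quad \text{for all } T \in \mathbb{X}_\varphi, \tag{38} \]

in $\mathbb{X}_\varphi$, by (25) and (37).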

Theorem 10 A multiplicative linear transformation λk of (36) is a free-
isomorphism on Xϕ , for all k ∈ Z.
Proof By (38), the morphism λk of (36) is a well-defined ∗-homomorphism on Xϕ .
And, by the bijectivity of the k-th shift h(k) on Z, the restriction λk |X is a bijection
on the free-generator set X of Xϕ . Thus, by (25) and (37), it is a ∗-isomorphism on
Xϕ .
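Observe now that

\[ \varphi\big( (\lambda^k(x_j))^n \big) = \varphi\big( x_{j+k}^n \big) = \omega_n c_{n/2} = \varphi\big( x_j^n \big), \tag{39} \]

for all $n \in \mathbb{N}$, for all $x_j \in \mathcal{X}$.

Therefore, for all $s$-tuples $I_s \in \mathbb{Z}^s$,

\[ \varphi(X[I_s]) = \varphi\big( \lambda^k(X[I_s]) \big) \quad \text{in } \mathbb{X}_\varphi, \]

by Theorem 8 (or, (17)) and (39), where $X[I_s]$ are in the sense of (16). Thus,

\[ \varphi(T) = \varphi\big( \lambda^k(T) \big), \quad \text{for all } T \in \mathbb{X}_\varphi, \]

in $\mathbb{X}_\varphi$, by (25). Therefore, $\lambda^k$ is a free-isomorphism on $\mathbb{X}_\varphi$.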

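Let $\mathrm{Aut}(\mathbb{X}_\varphi)$ be the automorphism group of $\mathbb{X}_\varphi$,

\[ \mathrm{Aut}(\mathbb{X}_\varphi) \overset{\text{def}}{=} \big( \{ \alpha : \alpha \text{ is a ∗-isomorphism on } \mathbb{X}_\varphi \}, \ \cdot \big), \]

where $(\cdot)$ is the product (or composition) on ∗-isomorphisms. Define a subset $\lambda$ of
$\mathrm{Aut}(\mathbb{X}_\varphi)$ by

\[ \lambda = \{\lambda^k : k \in \mathbb{Z}\}, \tag{40} \]

where $\lambda^k$ are the free-isomorphisms (36) on $\mathbb{X}_\varphi$.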

Theorem 11 The subset λ of (40) is an abelian subgroup of Aut (Xϕ ), satisfying

Group
λ = (Z, +) , the infinite abelian cyclic group, (41)

Group
where “ = ” means “being group-isomorphic.”
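Proof Let $\lambda^{k_1}, \lambda^{k_2} \in \lambda$. Then, by (36) and (37),

\[ \lambda^{k_1}\lambda^{k_2} = \lambda^{k_1+k_2} \quad \text{in } \lambda. \]

So, the algebraic structure $(\lambda, \cdot)$ is well-determined in $\mathrm{Aut}(X_\varphi)$. And hence,

\[ (\lambda^{k_1}\lambda^{k_2})\lambda^{k_3} = \lambda^{k_1+k_2+k_3} = \lambda^{k_1}(\lambda^{k_2}\lambda^{k_3}), \]

in λ, for all $k_1, k_2, k_3 \in \mathbb{Z}$.
Observe that the set λ contains $\lambda^{0} = 1_{X_\varphi}$, the identity map on $X_\varphi$, by (34) and (36), satisfying

\[ 1_{X_\varphi} \cdot \lambda^{k} = \lambda^{0+k} = \lambda^{k} = \lambda^{k} \cdot 1_{X_\varphi} \quad \text{on } X_\varphi, \]

for all $k \in \mathbb{Z}$. And, for any $k \in \mathbb{Z}$,

\[ \lambda^{k}\lambda^{-k} = \lambda^{k+(-k)} = \lambda^{0} = 1_{X_\varphi} = \lambda^{-k}\lambda^{k} \quad \text{in } \lambda, \]

showing that every element $\lambda^{k} \in \lambda$ has its unique $(\cdot)$-inverse $\lambda^{-k}$. Thus, the set λ of (40) forms a subgroup of $\mathrm{Aut}(X_\varphi)$. Clearly,

\[ \lambda^{k_1}\lambda^{k_2} = \lambda^{k_1+k_2} = \lambda^{k_2+k_1} = \lambda^{k_2}\lambda^{k_1}, \]

in λ. Therefore, the subgroup $(\lambda, \cdot)$ is commutative in $\mathrm{Aut}(X_\varphi)$.
Define now a map $\Phi : \mathbb{Z} \to \lambda$ by

\[ \Phi(j) = \lambda^{j}, \quad \text{for all } j \in \mathbb{Z}. \]

Then it is a bijection by (40), and hence a group-isomorphism, satisfying

\[ \Phi(j_1 + j_2) = \lambda^{j_1+j_2} = \lambda^{j_1}\lambda^{j_2} = \Phi(j_1)\Phi(j_2), \]

in λ, for all $j_1, j_2 \in \mathbb{Z}$. Therefore, the group-isomorphic relation (41) holds.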
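
The group law established in the proof can be checked on a toy model. The following is a minimal sketch, assuming that the shift $\lambda^{k}$ is represented only by its effect j ↦ j + k on the indices of the generators $x_j$; the helper names `shift`, `compose` and `same_on` are hypothetical and do not come from the text.

```python
# Toy model (an assumption for illustration): lambda^k is represented by the
# index map j -> j + k that it induces on the generators x_j.

def shift(k):
    """Return the index map j -> j + k induced by lambda^k."""
    return lambda j: j + k

def compose(f, g):
    """Composition (f . g)(j) = f(g(j))."""
    return lambda j: f(g(j))

def same_on(f, g, indices):
    """Compare two index maps on a finite window of indices."""
    return all(f(j) == g(j) for j in indices)

window = range(-5, 6)
k1, k2 = 3, -7

# lambda^{k1} lambda^{k2} = lambda^{k1 + k2}
product_rule = same_on(compose(shift(k1), shift(k2)), shift(k1 + k2), window)
# lambda^{k} lambda^{-k} = lambda^{0}, the identity map
inverse_rule = same_on(compose(shift(k1), shift(-k1)), shift(0), window)
# lambda^{k1} lambda^{k2} = lambda^{k2} lambda^{k1}
commutes = same_on(compose(shift(k1), shift(k2)),
                   compose(shift(k2), shift(k1)), window)

print(product_rule, inverse_rule, commutes)
```

The check runs on a finite window of indices only, so it illustrates the relations in the proof rather than proving them.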

The above theorem shows that the subgroup λ of (40) is an infinite abelian cyclic group $\langle \lambda^{1} \rangle$ embedded in $\mathrm{Aut}(X_\varphi)$, where $\langle g \rangle$ means the cyclic (sub)group generated by $\{g, g^{-1}\}$ (in a group).
By (37) and (41), there is a natural group-action θ of λ acting on our C ∗ -
probability space Xϕ , satisfying

\[ \theta(\lambda^{k})(T) = \lambda^{k}(T), \quad \text{for all } T \in X_\varphi, \tag{42} \]

for all k ∈ Z.
Definition 12 The group λ of (40), acting on Xϕ via the group-action θ of (42), is
called the integer-shift group on Xϕ .

6.3 Free-Isomorphic Relations on Xϕ

We here study how the integer-shift group λ of (40) affects the free probability on
Xϕ , under the group-action θ of (42).
Theorem 13 Let λ be the integer-shift group, and let θ be the group-action (42) of
λ acting on Xϕ . Then the free probability on Xϕ is preserved by θ , in the sense that:

\[ \varphi\big(\theta(\lambda^{k})(T)\big) = \varphi(T), \quad \text{for all } T \in X_\varphi, \tag{43} \]

for all $\lambda^{k} \in \lambda$.


Proof By (42), for any T ∈ Xϕ ,

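\[ \theta(\lambda^{k})(T) = \lambda^{k}(T) \quad \text{in } X_\varphi, \]

and hence,

\[ \varphi\big(\theta(\lambda^{k})(T)\big) = \varphi\big(\lambda^{k}(T)\big) = \varphi(T), \]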

for all k ∈ Z, by Theorem 10. So, the action θ preserves the free probability on Xϕ .

Notation From now on, we denote the images $\theta(\lambda^{k})(T) \in X_\varphi$ of $T \in X_\varphi$ simply by $\lambda^{k}(T)$, for all $\lambda^{k} \in \lambda$.

7 Free Random Variables followed by the Semicircular Law

Let λ ⊂ Aut (Xϕ ) be the integer-shift group (40) acting on the C ∗ -probability
space Xϕ (via the canonical action θ of (42)) generated by the free family {xj }j ∈Z
of semicircular elements xj ’s. In this section, we construct some free random
variables in a certain C ∗ -probability space, containing Xϕ , whose free distributions
are followed by the semicircular law.
Definition 14 Let (B, ψ) be an arbitrary topological ∗-probability space. A free random variable y ∈ (B, ψ) is followed by the semicircular law, if

\[ \psi\Big( \prod_{l=1}^{n} y^{r_l} \Big) = \omega_n c_{n/2}, \tag{44} \]

for all $(r_1, \ldots, r_n) \in \{1, *\}^{n}$, for all $n \in \mathbb{N}$, where $\omega_n$ are in the sense of (3) for all $n \in \mathbb{N}$, and $c_k$ are the k-th Catalan numbers (4) for all $k \in \mathbb{N}_0$.
By the definition (44), if a self-adjoint free random variable y is followed by the semicircular law in (B, ψ), then it is nothing but a semicircular element. That is, all semicircular elements are followed by the semicircular law in the sense of (44), but not all free random variables followed by the semicircular law are semicircular. In this text, we focus on studying “non-self-adjoint” free random variables followed by the semicircular law.
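
For orientation, the right-hand side of (44) is the moment sequence of the standard semicircle density on [−2, 2]. The following is a minimal numerical sketch, assuming that $\omega_n$ of (3) is 1 for even n and 0 for odd n, as its use throughout this section suggests; the function names are hypothetical, and the integral is approximated by a plain midpoint rule.

```python
from math import comb, pi, sqrt

def catalan(k):
    """k-th Catalan number c_k."""
    return comb(2 * k, k) // (k + 1)

def omega(n):
    """omega_n: 1 for even n, 0 for odd n (assumed reading of (3))."""
    return 1 if n % 2 == 0 else 0

def semicircular_moment(n):
    """The value omega_n * c_{n/2} on the right-hand side of (44)."""
    return omega(n) * catalan(n // 2)

def semicircle_integral(n, steps=20000):
    """Midpoint-rule approximation of the n-th moment of the density
    sqrt(4 - x^2) / (2 * pi) on the interval [-2, 2]."""
    h = 4.0 / steps
    total = 0.0
    for i in range(steps):
        x = -2.0 + (i + 0.5) * h
        total += x ** n * sqrt(4.0 - x * x)
    return total * h / (2.0 * pi)

for n in range(1, 7):
    print(n, semicircular_moment(n), round(semicircle_integral(n), 4))
```

The even moments reproduce the Catalan numbers 1, 2, 5 up to discretization error, and the odd moments vanish by symmetry.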

7.1 Group C ∗ -Algebra of λ

Let Γ be an arbitrary discrete group, and let H be the group Hilbert space,

\[ H = l^{2}(\Gamma), \quad \text{the } l^{2}\text{-space}, \]

with its orthonormal basis,

\[ \mathcal{B} = \{ \xi_g : g \in \Gamma \setminus \{e\} \}, \tag{45} \]

where e ∈ Γ is the group-identity, satisfying $\xi_e = 1_H$, the identity vector, and

\[ \langle \xi_{g_1}, \xi_{g_2} \rangle_2 = \delta_{g_1, g_2}, \]

and

\[ \|\xi_g\|_2^{2} = \langle \xi_g, \xi_g \rangle_2 = 1, \tag{46} \]

for all $g_1, g_2, g \in \Gamma$, where δ is the Kronecker delta, and $\langle \cdot, \cdot \rangle_2$ is the canonical $l^{2}$-inner product inducing the $l^{2}$-norm $\|\cdot\|_2$ on H.
Every Hilbert-space vector ξ ∈ H is expressed by

\[ \xi = \sum_{g \in \Gamma} t_g \xi_g, \quad \text{for } t_g \in \mathbb{C}, \]

where $\sum$ is the infinite sum under the $l^{2}$-topology induced by (46).
Note that, for any Hilbert-space vectors $\{\xi_g\}_{g \in \Gamma} = \mathcal{B} \cup \{\xi_e\}$, the following multiplication-rule holds:

\[ \xi_{g_1}\xi_{g_2} = \xi_{g_1 g_2} \quad \text{in } H, \ \forall g_1, g_2 \in \Gamma, \tag{47} \]

where $\mathcal{B}$ is the orthonormal basis of (45).

In the operator algebra B(H) of all (bounded linear) operators on the group-Hilbert space H of (45), every group-element g ∈ Γ forms a (left) multiplication operator $m_g$ with its symbol $\xi_g$,

\[ m_g\Big( \sum_{u \in \Gamma} t_u \xi_u \Big) = \sum_{u \in \Gamma} t_u \xi_g \xi_u = \sum_{u \in \Gamma} t_u \xi_{gu}, \tag{48} \]

in H by (47).
The relation (48) shows that the group Γ is acting on the operator algebra B(H) via a group-action m,

\[ m(g) = m_g \in B(H), \quad \forall g \in \Gamma, \tag{49} \]

where $m_g$ are the multiplication operators (48).
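
The relations (47) through (49) can be illustrated for Γ = (ℤ, +), the case that matters below since λ is isomorphic to (ℤ, +) by (41). This is a minimal sketch under that assumption, with finitely supported vectors stored as dictionaries {u: t_u}; the names `m` and `norm` are hypothetical helpers.

```python
from math import sqrt

def m(g):
    """Left multiplication operator m_g on finitely supported vectors:
    sum_u t_u xi_u  ->  sum_u t_u xi_{g+u}  (the group Z written additively)."""
    def apply(vec):
        return {g + u: t for u, t in vec.items()}
    return apply

def norm(vec):
    """l^2-norm of a finitely supported vector."""
    return sqrt(sum(abs(t) ** 2 for t in vec.values()))

xi = {0: 1.0, 2: -0.5, 5: 2.0}   # a finitely supported vector in l^2(Z)

lhs = m(3)(m(-4)(xi))            # m_3 m_{-4} applied to xi
rhs = m(3 + (-4))(xi)            # m_{3 + (-4)} applied to xi

print(lhs == rhs, norm(m(7)(xi)) == norm(xi))
```

In this model each $m_g$ only relabels the basis vectors, which is why it preserves the $l^{2}$-norm and why g ↦ $m_g$ respects the group operation.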

Define a set,

\[ \mathcal{M} \overset{\text{def}}{=} \{ m_g \in B(H) : g \in \Gamma \}, \]

of all multiplication operators (49), and construct the C∗-subalgebra M,

\[ M \overset{\text{def}}{=} C^{*}(\mathcal{M}) \quad \text{of } B(H), \tag{50} \]

under the operator-norm on B(H) (e.g., see [14]).

Definition 15 We call the C ∗ -algebra M of (50), the group (C ∗ -)algebra of Γ .
Let λ be the integer-shift group (40) acting on the C ∗ -probability space Xϕ of
(25). Then, by (50), one can have the corresponding group algebra,

\[ \Lambda \overset{\text{def}}{=} C^{*}(\lambda) \quad \text{in } B(H_\lambda), \]

where

\[ H_\lambda = l^{2}(\lambda) \tag{51} \]

is the group-Hilbert space (45).

Definition 16 The group algebra Λ of λ is called the integer-shift(-group) algebra.
All elements of Λ are said to be (integer-)shift operators.
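By (51), all shift operators T of Λ are expressed by

\[ T = \sum_{\lambda^{k} \in \lambda} t_{\lambda^{k}} m_{\lambda^{k}} = \sum_{k \in \mathbb{Z}} t_k m_{\lambda^{k}}, \quad \text{with } t_{\lambda^{k}} = t_k, \]

because λ is isomorphic to (ℤ, +) by (41), where $\sum$ is an infinite sum under the C∗-topology for Λ.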
Let Λ be the integer-shift algebra (51) of λ. Then this C∗-algebra Λ is acting on the C∗-probability space $X_\varphi$, via a ∗-algebra-action,

\[ \pi : \Lambda \to B(X_\varphi), \]

defined by

\[ \pi\Big( \sum_{k \in \mathbb{Z}} t_k m_{\lambda^{k}} \Big)(S) = \sum_{k \in \mathbb{Z}} t_k \lambda^{k}(S), \tag{52} \]

for all $S \in X_\varphi$, where $B(X_\varphi)$ is the operator space (in the sense of [13]), consisting of all bounded linear transformations on $X_\varphi$, by regarding $X_\varphi$ as a Banach space with its C∗-norm. Indeed, π is a well-defined algebra-action of Λ acting on (the Banach space) $X_\varphi$, since

\[ \pi(S_1 S_2) = \pi(S_1)\pi(S_2), \]

and

\[ \pi(S^{*}) = (\pi(S))^{*}, \tag{53} \]

by (42), (48) and (52), for all $S_1, S_2, S \in \Lambda$. That is, by (52) and (53), all operators of Λ are understood to be Banach-space operators acting on the Banach space, our C∗-probability space $X_\varphi$.

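Proposition 17 If $T = \sum_{k \in \mathbb{Z}} t_k m_{\lambda^{k}} \in \Lambda$ is a shift operator, and $x_j \in X$ is a generating semicircular element of $X_\varphi$, then

\[ \varphi\big( T(x_j^{n}) \big) = \omega_n c_{n/2} \Big( \sum_{k \in \mathbb{Z}} t_k \Big), \tag{54} \]

for all $n \in \mathbb{N}$, where $T(x_j^{n})$ is the image of $x_j^{n}$ under the action (52) of T.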
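Proof For any $n \in \mathbb{N}$, and $x_j \in X \subset X_\varphi$, one has

\[ \varphi\Big( \Big( \sum_{k \in \mathbb{Z}} t_k m_{\lambda^{k}} \Big)(x_j^{n}) \Big) = \varphi\Big( \sum_{k \in \mathbb{Z}} t_k \lambda^{k}(x_j^{n}) \Big) \]

by (52)

\[ = \sum_{k \in \mathbb{Z}} t_k \varphi\big(x_{j+k}^{n}\big) = \sum_{k \in \mathbb{Z}} t_k \varphi\big(x_j^{n}\big) \]

by (43)

\[ = \omega_n c_{n/2} \Big( \sum_{k \in \mathbb{Z}} t_k \Big). \]

Therefore, the free-distributional data (54) holds.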

The above proposition shows how the algebra-action (52) of the integer-shift
algebra Λ affects the original free probability on Xϕ by (54).
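
As a small numerical illustration of (54), the sketch below evaluates the formula for a shift operator with finitely many nonzero coefficients $t_k$. The finite support and the function names are assumptions made for the example; the general case involves an infinite sum whose convergence is not addressed here.

```python
from math import comb

def catalan(k):
    """k-th Catalan number c_k."""
    return comb(2 * k, k) // (k + 1)

def semicircular_moment(n):
    """omega_n * c_{n/2}: the n-th free moment of a semicircular element."""
    return catalan(n // 2) if n % 2 == 0 else 0

def phi_of_image(coeffs, n):
    """phi(T(x_j^n)) for T = sum_k t_k m_{lambda^k}, coeffs = {k: t_k}.
    Each term contributes t_k * phi(x_{j+k}^n), and by (43) that value
    does not depend on the shift k."""
    return sum(t_k * semicircular_moment(n) for t_k in coeffs.values())

coeffs = {-1: 2, 0: 0.5, 3: 1j}     # hypothetical coefficients t_{-1}, t_0, t_3
print(phi_of_image(coeffs, 4))      # c_2 * (t_{-1} + t_0 + t_3)
print(phi_of_image(coeffs, 3))      # odd moments vanish
```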

7.2 The Tensor Product C∗-Algebra Λ ⊗ X

In this section, we construct the tensor product C∗-algebra,

\[ \mathfrak{X} \overset{\text{def}}{=} \Lambda \otimes X \tag{55} \]

of the integer-shift algebra Λ of (51) generated by the integer-shift group λ, and the C∗-algebra X of $X_\varphi = (X, \varphi)$ generated by the free semicircular family $\{x_j\}_{j \in \mathbb{Z}}$, where ⊗ is the tensor product of C∗-algebras.
Define a linear functional τ on this C∗-algebra $\mathfrak{X}$ of (55) by the morphism satisfying

\[ \tau(S \otimes T) \overset{\text{def}}{=} \varphi\big(S(T)\big), \tag{56} \]

for all $S \otimes T \in \mathfrak{X}$, with S ∈ Λ and T ∈ X, where S(T) is the image of T under the action (52) of S.

Definition 18 Let $\mathfrak{X}$ be the tensor product C∗-algebra (55), and τ, the linear functional (56) on $\mathfrak{X}$. Then the C∗-probability space,

\[ X_\tau \overset{\text{denote}}{=} (\mathfrak{X}, \tau), \]

is called the (integer-)shift-semicircular C∗-probability space.

By definition, the shift-semicircular C∗-probability space $X_\tau$ is unital, equipped with its unity $I = m_{\lambda^{0}} \otimes 1_X$, satisfying

\[ \tau(I) = \varphi\big(\lambda^{0}(1_X)\big) = \varphi(1_X) = 1. \]

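And the operators,

\[ u_{k,j} \overset{\text{denote}}{=} \lambda^{k} \otimes x_j \in X_\tau, \quad \text{for } k, j \in \mathbb{Z}, \tag{57} \]

generate $X_\tau$. Observe that

\[ \tau\big(u_{k,j}^{n}\big) = \tau\big((\lambda^{k} \otimes x_j)^{n}\big) = \tau\big(\lambda^{kn} \otimes x_j^{n}\big) = \varphi\big(\lambda^{kn}(x_j^{n})\big) = \varphi\big(x_{j+kn}^{n}\big) = \varphi\big(x_j^{n}\big) = \omega_n c_{n/2}, \tag{58} \]

on $X_\varphi$, by (43), (54) and (56), since all generating shift operators $\lambda^{k} \in \Lambda$ are free-isomorphisms on $X_\varphi$.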
Lemma 19 Let uk,j = λk ⊗ xj be a generating operator (57) of the shift-
semicircular C ∗ -probability space Xτ , for k, j ∈ Z. Then
\[ \tau\big(u_{k,j}^{n}\big) = \omega_n c_{n/2} = \varphi\big(x_j^{n}\big), \tag{59} \]

for all n ∈ N.
Proof The free-distributional data (59) is obtained by (58).


By (59), one can verify that if k = 0 in Z, then the generating operator

uk,j = u0,j is semicircular in Xτ .
Lemma 20 A generating operator u0,j = λ0 ⊗ xj of (57) is semicircular in the
shift-semicircular C ∗ -probability space Xτ , for all j ∈ Z. And hence, they are
followed by the semicircular law in the sense of (44).
Proof For any fixed j ∈ Z, the corresponding generating operator u0,j of Xτ
satisfies that
\[ u_{0,j}^{*} = (\lambda^{0})^{*} \otimes x_j^{*} = \mathrm{id}_\Lambda \otimes x_j = \lambda^{0} \otimes x_j = u_{0,j}, \]

in X . So, the generating operators u0,j are self-adjoint in Xτ , for all j ∈ Z.

Such a self-adjoint free random variable u0,j ∈ Xτ satisfies
\[ \tau\big(u_{0,j}^{n}\big) = \omega_n c_{n/2} = \varphi\big(x_j^{n}\big), \]

for all n ∈ N, by (59). So, it is semicircular in Xτ , for all j ∈ Z. Since it is

semicircular in Xτ , it is followed by the semicircular law in the sense of (44).

The above lemma shows that all generating operators of the shift-semicircular C∗-probability space $X_\tau$, formed by

\[ u_{0,j} = \lambda^{0} \otimes x_j \quad \text{in } X_\tau, \]

are followed by the semicircular law in the sense of (44), because they are semicircular in $X_\tau$. So, we are now interested in the cases where

\[ k \neq 0 \quad \text{in } \mathbb{Z}. \]

Let $u_{k,j}$ be a generating free random variable of $X_\tau$ for $k \neq 0$ in ℤ. Then

\[ u_{k,j}^{*} = (\lambda^{k})^{*} \otimes x_j^{*} = \lambda^{-k} \otimes x_j = u_{-k,j} \quad \text{in } X_\tau. \tag{60} \]

That is, if $k \neq 0$, then the generating operators $u_{k,j}$ are not self-adjoint in $X_\tau$, and hence, they cannot be semicircular in $X_\tau$, for all $j \in \mathbb{Z}$.
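Lemma 21 Let $u_{k,j} \in X_\tau$ be a generating free random variable for $k \in \mathbb{Z}^{\times} = \mathbb{Z} \setminus \{0\}$, and $j \in \mathbb{Z}$. Then

\[ \tau\big((u_{k,j}^{*})^{n}\big) = \omega_n c_{n/2} = \varphi\big(x_j^{n}\big), \tag{61} \]

for all $n \in \mathbb{N}$.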

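Proof If $k \in \mathbb{Z}^{\times}$ and $j \in \mathbb{Z}$, and hence, if $u_{k,j}^{*} = u_{-k,j}$ in $X_\tau$, then

\[ \tau\big((u_{k,j}^{*})^{n}\big) = \tau\big(u_{-k,j}^{n}\big) = \varphi\big(x_{j-kn}^{n}\big) = \omega_n c_{n/2} = \varphi\big(x_j^{n}\big), \]

for all $n \in \mathbb{N}$, by (60). So, the free-distributional data (61) holds.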

Now, let $k \in \mathbb{Z}^{\times}$, and $j \in \mathbb{Z}$, and $u_{k,j}$, the corresponding generating free random variable (57) of the shift-semicircular C∗-probability space $X_\tau$. Consider the joint free moments of

\[ \{ u_{k,j},\ u_{k,j}^{*} = u_{-k,j} \}, \]

in $X_\tau$. First, observe that, if $(r_1, \ldots, r_n) \in \{1, *\}^{n}$ is a “mixed” n-tuple of {1, ∗}, for $n \in \mathbb{N}_{>1} = \mathbb{N} \setminus \{1\}$, in the sense that: there exists at least one $i_0 \in \{r_1, \ldots, r_n\}$, such that $i_0 \neq r_m$ in {1, ∗}, for some m ∈ {1, . . . , n}, then

\[ \prod_{l=1}^{n} u_{k,j}^{r_l} = \lambda^{\#(1)k - \#(*)k} \otimes x_j^{n}, \]

where

\[ \#(1) = \text{the number of } 1\text{'s in } (r_1, \ldots, r_n), \tag{62} \]

and

\[ \#(*) = \text{the number of } {*}\text{'s in } (r_1, \ldots, r_n), \]

in $X_\tau$. Thus, by (56) and (62),

\[ \tau\big( \lambda^{(\#(1) - \#(*))k} \otimes x_j^{n} \big) = \varphi\big( \lambda^{(\#(1) - \#(*))k}(x_j^{n}) \big). \tag{63} \]

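Lemma 22 Let $u_{k,j} \in X_\tau$ be a generating free random variable for $k \in \mathbb{Z}^{\times}$, and $j \in \mathbb{Z}$, and let

\[ (r_1, \ldots, r_n) \in \{1, *\}^{n}, \quad \text{for } n \in \mathbb{N}_{>1}, \]

be a “mixed” n-tuple of {1, ∗}. Then

\[ \tau\Big( \prod_{l=1}^{n} u_{k,j}^{r_l} \Big) = \omega_n c_{n/2} = \varphi\big(x_j^{n}\big), \quad \forall n \in \mathbb{N}. \tag{64} \]

Proof Under hypothesis,

\[ \prod_{l=1}^{n} u_{k,j}^{r_l} = \lambda^{(\#(1) - \#(*))k} \otimes x_j^{n}, \]

in $X_\tau$ by (62), implying that

\[ \tau\Big( \prod_{l=1}^{n} u_{k,j}^{r_l} \Big) = \varphi\big( \lambda^{(\#(1) - \#(*))k}(x_j^{n}) \big) \]

by (63)

\[ = \varphi\big( x_{j + (\#(1) - \#(*))k}^{n} \big) = \varphi\big(x_j^{n}\big) = \omega_n c_{n/2}, \]

for all $n \in \mathbb{N}$, by (59) and (61). Therefore, the free-distributional data (64) holds.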

By (59), (61) and (64), we obtain the following result.
Theorem 23 Every generating free random variable $u_{k,j}$ of (57) is followed by the semicircular law in the sense of (44) in the shift-semicircular C∗-probability space $X_\tau$, for all $k, j \in \mathbb{Z}$, i.e.,

\[ \tau\Big( \prod_{l=1}^{n} u_{k,j}^{r_l} \Big) = \omega_n c_{n/2} = \varphi\big(x_j^{n}\big), \tag{65} \]

for all $(r_1, \ldots, r_n) \in \{1, *\}^{n}$, for all $n \in \mathbb{N}$.

Proof Let $(r_1, \ldots, r_n) \in \{1, *\}^{n}$ be non-mixed for $n \in \mathbb{N}$, i.e., either

\[ (1, 1, \ldots, 1), \quad \text{or} \quad (*, *, \ldots, *). \]

Then, by (59) and (61), the free-distributional data (65) holds. Meanwhile, if $(r_1, \ldots, r_n) \in \{1, *\}^{n}$ is mixed for $n \in \mathbb{N}_{>1}$, then the formula (65) holds too, by (64).
Therefore, the free-distributional data (65) holds as the joint free moments of $\{ u_{k,j},\ u_{k,j}^{*} = u_{-k,j} \}$ on $X_\tau$. Equivalently, the generating free random variable $u_{k,j}$ is followed by the semicircular law in $X_\tau$, for all $k, j \in \mathbb{Z}$.

The above theorem shows that there do exist free random variables in a C ∗ -
probability space followed by the semicircular law in the sense of (44).
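
The bookkeeping behind (62) through (65) can be traced mechanically. The sketch below is a toy model, not the C∗-algebraic construction: a word in $u_{k,j}$ and $u_{k,j}^{*}$ is reduced to a pair (a, n) standing for $\lambda^{a} \otimes x_j^{n}$, and τ is evaluated through (56) and (43). The names `star_word`, `tau` and `joint_moments` are hypothetical.

```python
from itertools import product
from math import comb

def catalan(k):
    """k-th Catalan number c_k."""
    return comb(2 * k, k) // (k + 1)

def semicircular_moment(n):
    """omega_n * c_{n/2}."""
    return catalan(n // 2) if n % 2 == 0 else 0

def star_word(k, pattern):
    """Reduce prod_l u_{k,j}^{r_l} to (a, n), meaning lambda^a (x) x_j^n,
    using u^* = lambda^{-k} (x) x_j from (60)."""
    a = sum(k if r == "1" else -k for r in pattern)
    return a, len(pattern)

def tau(a, n):
    """tau(lambda^a (x) x_j^n) = phi(x_{j+a}^n); by (43) the value is the
    n-th semicircular moment, whatever the shift exponent a is."""
    return semicircular_moment(n)

def joint_moments(k, n):
    """All joint *-moments of order n, indexed by patterns in {1, *}^n."""
    return {p: tau(*star_word(k, p)) for p in product("1*", repeat=n)}

print(star_word(5, ("1", "*", "1")))   # shift exponent and degree of the word
print(joint_moments(5, 2))             # all four patterns give the same value
```

The shift exponent changes with the pattern, but the value of τ does not, which is the content of (65).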
Theorem 24 Let (B, ψ) be a unital C∗-probability space containing mutually free, semicircular elements $\{y_1, \ldots, y_N\}$, for $N \in \mathbb{N}_\infty = \mathbb{N} \cup \{\infty\}$. Then there exists a C∗-probability space $(\mathfrak{B}, \tau)$ and a free random variable $y \in (\mathfrak{B}, \tau)$, such that y is followed by the semicircular law.
Proof Suppose a unital C∗-probability space (B, ψ) contains mutually free $|\mathbb{N}|$-many semicircular elements, $Y = \{y_1, y_2, y_3, \ldots\}$. Then the C∗-subalgebra $C^{*}(Y)$ of B induces a C∗-probability space $\big(C^{*}(Y),\ \psi = \psi|_{C^{*}(Y)}\big)$, which is free-isomorphic to our C∗-probability space $X_\varphi = (X, \varphi)$ generated by the free semicircular family $\{x_j\}_{j \in \mathbb{Z}}$, by (33). And hence, there exists the corresponding shift-semicircular C∗-probability space $X_\tau$, containing infinitely many generating free random variables $u_{k,j}$ of (57) followed by the semicircular law by (65). That is, if

\[ N = |\mathbb{N}| = \infty \quad \text{in } \mathbb{N}_\infty, \]

then this theorem holds true.

Assume now that a unital C∗-probability space (B, ψ) contains mutually free, N-many semicircular elements,

\[ Y_N = \{y_1, \ldots, y_N\}, \quad \text{for } N < \infty. \]

Then the C∗-subalgebra $B_N = C^{*}(Y_N \cup \{1_B\})$ of B induces the C∗-probability space $(B_N, \psi)$. From $(B_N, \psi)$, one can construct a C∗-probability space $(\mathfrak{B}, \tau)$, with

\[ \mathfrak{B} = \underset{i=1}{\overset{\infty}{\star}} B[i], \quad \text{with } B[i] = B_N, \ \forall i \in \mathbb{N}, \]

and

\[ \tau = \psi^{\star \infty} \quad \text{on } \mathfrak{B}, \]

where $(\star)$ is the free product of C∗-algebras (e.g., [17, 30]). Remark that all free factors $\{B[i]\}_{i=1}^{\infty}$, identified with $B_N$, are free from each other (e.g., [30]) in $(\mathfrak{B}, \tau)$, and hence, the C∗-probability space $(\mathfrak{B}, \tau)$ contains its free semicircular family,

\[ Y = \bigsqcup_{i=1}^{\infty} \{y_{i1}, \ldots, y_{iN}\}, \]

where $\sqcup$ is the disjoint union, and

\[ \{y_{i1}, \ldots, y_{iN}\}, \quad \text{with } y_{i1} = y_1, \ldots, y_{iN} = y_N, \]

in a free factor $B[i] = B_N$, for all $i \in \mathbb{N}$. Under the possible rearrangement, one can let

\[ Y = \{y_j\}_{j \in \mathbb{N}_0}. \]

Therefore, by (33) and (65), this theorem also holds even if $N < \infty$ in $\mathbb{N}_\infty$.
In conclusion, if a unital C∗-probability space (B, ψ) contains mutually free N-many semicircular elements for $N \in \mathbb{N}_\infty$, then one can have free random variables in a certain C∗-probability space $(\mathfrak{B}, \tau)$, followed by the semicircular law.

The above theorem shows not only that there do exist free random variables followed by the semicircular law, but also how to construct them. It also shows that there are sufficiently many such free random variables. Motivated by Theorem 24, one can verify that, whenever a semicircular element x, and a free-isomorphism β exist for a unital C∗-probability space containing x, one can construct the free random variables,

\[ \{ \beta^{k} \otimes x \}_{k \in \mathbb{Z}} \]

followed by the semicircular law, in a certain tensor-product C∗-probability space.

7.3 Free-Distributional Information on Xτ

In this section, we study free-distributional data on the shift-semicircular C∗-probability space,

\[ X_\tau = (\Lambda \otimes X, \tau), \]

generated by the free random variables followed by the semicircular law. Throughout this section, we let

\[ u_l \overset{\text{denote}}{=} u_{k_l, j_l} = \lambda^{k_l} \otimes x_{j_l} \in X_\tau \tag{66} \]

be the generating free random variables of $X_\tau$, for l = 1, . . . , N, for $N \in \mathbb{N}_{>1}$.
Recall that

\[ u_l^{*} = \lambda^{-k_l} \otimes x_{j_l} = u_{-k_l, j_l}, \quad \forall l = 1, \ldots, N, \tag{67} \]

in $X_\tau$.
Theorem 25 Let $x_{j_1}, \ldots, x_{j_N} \in X$ be generating semicircular elements of the C∗-probability space $X_\varphi$ (where $j_1, \ldots, j_N$ are not necessarily distinct in ℤ), and let $\lambda^{k_1}, \ldots, \lambda^{k_N} \in \lambda$ be integer-shifts generating the shift algebra Λ (where $k_1, \ldots, k_N$ are not necessarily distinct in ℤ), inducing the generating free random variables $u_1, \ldots, u_N$ of (66) in the shift-semicircular C∗-probability space $X_\tau$. If $w = \prod_{l=1}^{N} x_{j_l}$ is a free random variable of $X_\varphi$, and if $W_{(r_1, \ldots, r_N)} = \prod_{l=1}^{N} u_l^{r_l}$ is a free random variable of $X_\tau$, where $u_l^{r_l} \in X_\tau$ are in the sense of (66), or (67), for l = 1, . . . , N, then

\[ \tau\big( W_{(r_1, \ldots, r_N)} \big) = \varphi(w), \tag{68} \]

for all $(r_1, \ldots, r_N) \in \{1, *\}^{N}$.

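Proof Assume that $w = \prod_{l=1}^{N} x_{j_l} \in X_\varphi$ is a free random variable whose free moment $\varphi(w) \in \mathbb{C}$ is determined by Theorem 8. Suppose

\[ W_{(r_1, \ldots, r_N)} = \prod_{l=1}^{N} u_l^{r_l} \in X_\tau, \]

for an arbitrarily fixed $(r_1, \ldots, r_N) \in \{1, *\}^{N}$. Then

\[ W_{(r_1, \ldots, r_N)} = \prod_{l=1}^{N} \big( \lambda^{i_l} \otimes x_{j_l} \big) = \Big( \prod_{l=1}^{N} \lambda^{i_l} \Big) \otimes \Big( \prod_{l=1}^{N} x_{j_l} \Big) = \Big( \prod_{l=1}^{N} \lambda^{i_l} \Big) \otimes w, \tag{69} \]

in $X_\tau$, by the self-adjointness of $x_{j_1}, \ldots, x_{j_N} \in X$ in $X_\varphi$, where

\[ i_l = \begin{cases} k_l & \text{if } r_l = 1 \\ -k_l & \text{if } r_l = *, \end{cases} \]

in ℤ, for all l = 1, . . . , N.
Then, there exists $k_{(r_1, \ldots, r_N)} = \sum_{l=1}^{N} i_l \in \mathbb{Z}$, such that

\[ \prod_{l=1}^{N} \lambda^{i_l} = \lambda^{k_{(r_1, \ldots, r_N)}} \in \lambda, \]

in Λ, by (41) and (51), i.e., one has

\[ W_{(r_1, \ldots, r_N)} = \lambda^{k_{(r_1, \ldots, r_N)}} \otimes w \quad \text{in } X_\tau, \]

by (69), implying that

\[ \tau\big( W_{(r_1, \ldots, r_N)} \big) = \varphi\big( \lambda^{k_{(r_1, \ldots, r_N)}}(w) \big) = \varphi(w), \]

by (43). Therefore, the formula (68) holds.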
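
To make the right-hand side of (68) concrete, the sketch below computes mixed moments of a free semicircular family by counting non-crossing pairings that only pair equal indices. This is the standard formula for free semicircular elements of variance one; whether Theorem 8 is stated in exactly this form is an assumption, since it is not restated in this section, and `mixed_moment` is a hypothetical helper name.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def mixed_moment(indices):
    """phi(x_{j_1} ... x_{j_N}) for free semicircular x_j of variance one,
    computed as the number of non-crossing pairings of the positions
    1..N in which every pair carries equal indices.
    `indices` is a tuple (j_1, ..., j_N)."""
    if len(indices) == 0:
        return 1
    if len(indices) % 2 == 1:
        return 0
    total = 0
    first = indices[0]
    # Pair position 0 with position m; the non-crossing condition splits
    # the remaining word into an inner part and an outer part.
    for m in range(1, len(indices), 2):
        if indices[m] == first:
            total += mixed_moment(indices[1:m]) * mixed_moment(indices[m + 1:])
    return total

print(mixed_moment((1, 1, 2, 2)))   # one admissible pairing
print(mixed_moment((1, 2, 1, 2)))   # the only index-matching pairing crosses
print(mixed_moment((1, 1, 1, 1)))   # second Catalan number
```

By (68), the same numbers give $\tau(W_{(r_1, \ldots, r_N)})$ for every choice of $(r_1, \ldots, r_N)$ and of the shifts $k_1, \ldots, k_N$.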

By (68), one can obtain the following generalized results.
Theorem 26 Let $y_1, \ldots, y_N$ be mutually free semicircular elements in a unital C∗-probability space (B, ψ), and let $y_{i_l} \in \{y_1, \ldots, y_N\}$, for l = 1, . . . , n, for $n \in \mathbb{N}$, where $i_1, \ldots, i_n$ are not necessarily distinct from each other in {1, . . . , N}, and let β ∈ Aut(B) be a free-isomorphism on (B, ψ). Let

\[ u_{k_l, i_l} \overset{\text{denote}}{=} \beta^{k_l} \otimes y_{i_l} \in (\Lambda_B \otimes B, \tau) \]

be a free random variable of a C∗-probability space $(\Lambda_B \otimes B, \tau)$, for $k_l \in \mathbb{Z}$, for l = 1, . . . , n, where $\Lambda_B$ is the group algebra of the cyclic group $\langle \beta \rangle$, and τ is the linear functional on $\Lambda_B \otimes B$, satisfying

\[ \tau(S \otimes T) = \psi\big(S(T)\big), \]

for all $S \otimes T \in \Lambda_B \otimes B$, where $\beta^{0}$ is the identity map on B, and $\beta^{-1}$ is the inverse of β on B. Then

\[ \tau\Big( \prod_{l=1}^{n} u_{k_l, i_l}^{r_l} \Big) = \psi\Big( \prod_{l=1}^{n} y_{i_l} \Big), \tag{70} \]

for all $(r_1, \ldots, r_n) \in \{1, *\}^{n}$, for all $n \in \mathbb{N}$, where the right-hand side of (70) is determined by (17).
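Proof The proof of the free-distributional data (70) is similar to that of (68) by Theorem 24. Indeed, there exists $k_0 \in \mathbb{Z}$, such that

\[ \prod_{l=1}^{n} u_{k_l, i_l}^{r_l} = \Big( \prod_{l=1}^{n} \beta^{\varepsilon_l k_l} \Big) \otimes \Big( \prod_{l=1}^{n} y_{i_l} \Big) = \beta^{k_0} \otimes \Big( \prod_{l=1}^{n} y_{i_l} \Big), \]

in $(\Lambda_B \otimes B, \tau)$, where

\[ \varepsilon_l = \begin{cases} 1 & \text{if } r_l = 1 \\ -1 & \text{if } r_l = *, \end{cases} \]

for all l = 1, . . . , n, implying that

\[ \tau\Big( \prod_{l=1}^{n} u_{k_l, i_l}^{r_l} \Big) = \psi\Big( \beta^{k_0}\Big( \prod_{l=1}^{n} y_{i_l} \Big) \Big) = \psi\Big( \prod_{l=1}^{n} y_{i_l} \Big), \]

since $\beta^{k_0}$ is a free-isomorphism on (B, ψ), by assumption.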

7.4 A Structure Theorem of Xτ

In this section, we consider some structure theorems of our shift-semicircular C∗-probability space,

\[ X_\tau = (\Lambda \otimes X, \tau), \]

generated by the generating free random variables,

\[ \mathcal{X} = \{ u_{k,j} = \lambda^{k} \otimes x_j : k, j \in \mathbb{Z} \}, \]

followed by the semicircular law by (65), where $\lambda^{k} \in \lambda \subset \Lambda$, and $x_j \in X \subset X_\varphi$.
Suppose $u_l \overset{\text{denote}}{=} u_{k_l, j_l} = \lambda^{k_l} \otimes x_{j_l} \in \mathcal{X}$ are arbitrary generating free random variables of our shift-semicircular C∗-probability space $X_\tau$, for $k_l, j_l \in \mathbb{Z}$, for l = 1, . . . , s, for $s \in \mathbb{N}$. By (68), such operators are followed by the semicircular law in $X_\tau$. Observe that

\[ k_n^{\tau}\big( u_{i_1}^{r_1}, \ldots, u_{i_n}^{r_n} \big) = \sum_{\pi \in NC(n)} \Big( \prod_{V \in \pi} \tau\Big( \prod_{t \in V} u_{i_t}^{r_t} \Big) \Big) \mu(\pi, 1_n) \]

by the Möbius inversion

\[ = \sum_{\pi \in NC(n)} \Big( \prod_{V \in \pi} \varphi\Big( \prod_{t \in V} x_{j_{i_t}} \Big) \Big) \mu(\pi, 1_n) \]

by (68) (or, (70))

\[ = k_n^{\varphi}\big( \underbrace{x_{j_{i_1}}, x_{j_{i_2}}, \ldots, x_{j_{i_n}}}_{n\text{-times}} \big), \tag{71} \]

for all $(r_1, \ldots, r_n) \in \{1, *\}^{n}$ and $(i_1, \ldots, i_n) \in \{1, \ldots, s\}^{n}$, for all $n \in \mathbb{N}$, where $k_\bullet^{\tau}(\cdot)$ (respectively, $k_\bullet^{\varphi}(\cdot)$) is the free cumulant on $X_\tau$ (respectively, on $X_\varphi$) in terms of the linear functional τ on $X_\tau$ (respectively, ϕ on $X_\varphi$), by the semicircularity (7) of the generating operators $x_j \in X$ of $X_\varphi$.
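Proposition 27 Let $u_{k,j} \in \mathcal{X}$ be a generating free random variable of $X_\tau$, followed by the semicircular law, for $k, j \in \mathbb{Z}$. Then

\[ k_n^{\tau}\big( u_{k,j}^{r_1}, \ldots, u_{k,j}^{r_n} \big) = \delta_{n,2} = k_n^{\varphi}\big( x_j, \ldots, x_j \big), \tag{72} \]

for all $(r_1, \ldots, r_n) \in \{1, *\}^{n}$, for all $n \in \mathbb{N}$.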

Proof The free-distributional data (72) is obtained by the general formula (71).

The above free-distributional data (72) is equivalent to (65). That is, Proposition 27
re-characterizes the free distributions followed by the semicircular law: the
formula (72) characterizes (44).
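
The passage between moments and free cumulants can be checked numerically in the single-variable case. The sketch below assumes the standard moment-cumulant relation $m_n = \sum_{s=1}^{n} k_s \, [z^{n-s}] M(z)^{s}$, where $M(z) = \sum_{n \geq 0} m_n z^{n}$; this is equivalent to the Möbius inversion over NC(n) used in (71), and the function names are hypothetical.

```python
from math import comb

def catalan(k):
    """k-th Catalan number c_k."""
    return comb(2 * k, k) // (k + 1)

def semicircular_moments(N):
    """[m_0, ..., m_N] with m_n = omega_n * c_{n/2} and m_0 = 1."""
    return [catalan(n // 2) if n % 2 == 0 else 0 for n in range(N + 1)]

def series_mul(a, b, N):
    """Product of two power series (coefficient lists), truncated at z^N."""
    out = [0] * (N + 1)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            if i + j > N:
                break
            out[i + j] += ai * bj
    return out

def free_cumulants(moments):
    """Return [0, k_1, ..., k_N] from [m_0 = 1, m_1, ..., m_N]."""
    N = len(moments) - 1
    powers = [[1] + [0] * N]                 # M(z)^0
    for _ in range(N):
        powers.append(series_mul(powers[-1], moments, N))
    k = [0] * (N + 1)
    for n in range(1, N + 1):
        # The s = n term equals k_n * [z^0] M(z)^n = k_n, so solve for k_n.
        k[n] = moments[n] - sum(k[s] * powers[s][n - s] for s in range(1, n))
    return k

print(free_cumulants(semicircular_moments(8)))
```

Only the second cumulant is nonzero, matching $\delta_{n,2}$ in (72). Since every joint moment in (65) depends only on n, the same computation covers all choices of $(r_1, \ldots, r_n)$.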
Theorem 28 Let $u_l = u_{k_l, j_l} \in \mathcal{X}$ be generating free random variables of $X_\tau$, for l = 1, 2. Then $j_1 \neq j_2$ in ℤ, if and only if $u_1$ and $u_2$ are free in $X_\tau$.
Proof (⇒) Suppose $j_1 \neq j_2$ in ℤ, and hence, the generating operators $u_1$ and $u_2$ are distinct in $\mathcal{X} \subset X_\tau$. Observe that, for any “mixed” n-tuple $(l_1, \ldots, l_n) \in \{1, 2\}^{n}$, for $n \in \mathbb{N}_{>1}$, we have that

\[ k_n^{\tau}\big( u_{l_1}^{r_1}, \ldots, u_{l_n}^{r_n} \big) = k_n^{\varphi}\big( x_{j_{l_1}}, \ldots, x_{j_{l_n}} \big) = 0, \]

by (71) and (68), for all $(r_1, \ldots, r_n) \in \{1, *\}^{n}$. In particular, the second equality holds by the freeness of $x_{j_1}$ and $x_{j_2}$ in $X_\varphi$. Indeed, by assumption, the semicircular elements $x_{j_1}$ and $x_{j_2}$ are distinct in the generating free semicircular family $X = \{x_j\}_{j \in \mathbb{Z}}$ in $X_\varphi$, implying the freeness of them (e.g., [22, 23]). Therefore,

\[ k_n^{\tau}\big( u_{l_1}^{r_1}, \ldots, u_{l_n}^{r_n} \big) = 0, \]

for all mixed n-tuples $(l_1, \ldots, l_n) \in \{1, 2\}^{n}$, for all $(r_1, \ldots, r_n) \in \{1, *\}^{n}$, for all $n \in \mathbb{N}_{>1}$. Equivalently, two subsets,

\[ \{u_1, u_1^{*}\} \quad \text{and} \quad \{u_2, u_2^{*}\} \]

are free in $X_\tau$, i.e., if $j_1 \neq j_2$ in ℤ, then two free random variables $u_1$ and $u_2$ are free in $X_\tau$ (e.g., [30]).
(⇐) We prove the contrapositive. Assume now that $j_1 = j = j_2$ in ℤ, and hence, $u_l = u_{k_l, j} = \lambda^{k_l} \otimes x_j \in \mathcal{X}$ in $X_\tau$. Then, for any (mixed, or non-mixed) $(l_1, \ldots, l_n) \in \{1, 2\}^{n}$, for $n \in \mathbb{N}$, we have

\[ k_n^{\tau}\big( u_{l_1}^{r_1}, \ldots, u_{l_n}^{r_n} \big) = k_n^{\varphi}\big( \underbrace{x_j, x_j, \ldots, x_j}_{n\text{-times}} \big) = \delta_{n,2}, \]

for all $(r_1, \ldots, r_n) \in \{1, *\}^{n}$ by (72), implying that, if n = 2 in ℕ, then

\[ k_2^{\tau}\big( u_{l_1}^{r_1}, u_{l_2}^{r_2} \big) = k_2^{\varphi}\big( x_j, x_j \big) = 1, \]

even though $l_1 \neq l_2$ in {1, 2}. It shows that such two free random variables $u_1$ and $u_2$ are not free in $X_\tau$. That is, if $j_1 = j_2$ in ℤ, then $u_1$ and $u_2$ are not free in $X_\tau$.

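The above theorem shows that the generator set $\mathcal{X}$ of $X_\tau$ is decomposed to be

\[ \mathcal{X} = \bigsqcup_{j \in \mathbb{Z}} \mathcal{X}_j \quad \text{in } X_\tau, \]

with

\[ \mathcal{X}_j = \{ u_{k,j} = \lambda^{k} \otimes x_j \in \mathcal{X} : k \in \mathbb{Z} \}, \quad \forall j \in \mathbb{Z}, \tag{73} \]

where $\mathcal{X}_j$ are mutually free from each other in $X_\tau$, for all $j \in \mathbb{Z}$.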

Corollary 29 The generator set X of our shift-semicircular C ∗ -probability space
Xτ is decomposed by (73), and the blocks Xj of X are mutually free from each
other in Xτ .
Proof It is proven by Theorem 28 and (73).

By Corollary 29, we obtain the following structure theorem of Xτ .
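Theorem 30 The shift-semicircular C∗-probability space $X_\tau$ satisfies

\[ X_\tau \overset{*\text{-iso}}{=} \underset{j \in \mathbb{Z}}{\star} C^{*}(\mathcal{X}_j), \]

where

\[ \mathcal{X}_j = \{ u_{k,j} = \lambda^{k} \otimes x_j \in \mathcal{X} : k \in \mathbb{Z} \}, \quad \forall j \in \mathbb{Z}, \tag{74} \]

and $C^{*}(Y)$ are the C∗-subalgebras of $X_\tau$ generated by subsets $Y \cup Y^{*}$ of $X_\tau$.

Proof Since $\mathcal{X}$ is the generator set of $X_\tau$, we have that

\[ X_\tau = C^{*}(\mathcal{X}) = C^{*}\Big( \bigsqcup_{j \in \mathbb{Z}} \mathcal{X}_j \Big) = \underset{j \in \mathbb{Z}}{\star} C^{*}(\mathcal{X}_j), \]

by (73) and Corollary 29.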


By the structure theorem (74), one obtains the following result, too.
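Theorem 31 The shift-semicircular C∗-probability space $X_\tau$ satisfies that

\[ X_\tau \overset{*\text{-iso}}{=} \underset{j \in \mathbb{Z}}{\star} \big( \Lambda \otimes C_X^{*}(\{x_j\}) \big), \]

and hence,

\[ X_\tau \overset{*\text{-iso}}{=} \underset{j \in \mathbb{Z}}{\star_{\Lambda}} \big( \Lambda \otimes C_X^{*}(\{x_j\}) \big), \tag{75} \]

where $C_X^{*}(Z)$ are the C∗-subalgebras of $X_\varphi$ generated by the subsets $Z \cup Z^{*}$ of $X_\varphi$, and where “$\star_{\Lambda}$” is the amalgamated free product with its amalgamation over the group C∗-algebra Λ (in the sense of [22]).
Proof By (74), we have $X_\tau \overset{*\text{-iso}}{=} \underset{j \in \mathbb{Z}}{\star} C^{*}(\mathcal{X}_j)$, where $\mathcal{X}_j$ are in the sense of (73) for all $j \in \mathbb{Z}$. So, by definition,

\[ C^{*}(\mathcal{X}_j) \overset{*\text{-iso}}{=} C_{B(H)}^{*}(\lambda) \otimes C_X^{*}(\{x_j\}) = \Lambda \otimes C_X^{*}(\{x_j\}), \]

where $\lambda = \{\lambda^{k}\}_{k \in \mathbb{Z}}$ is the integer-shift group, generating the group algebra Λ in the operator algebra B(H), where H is the group Hilbert space (45) of λ. Therefore, the first ∗-isomorphic relation of (75) holds.
By the definition of amalgamated free products of [22], the second ∗-isomorphic relation of (75) holds, too, since

\[ X_\tau \overset{*\text{-iso}}{=} \underset{j \in \mathbb{Z}}{\star} \big( \Lambda \otimes C_X^{*}(\{x_j\}) \big) \overset{*\text{-iso}}{=} \underset{j \in \mathbb{Z}}{\star_{\Lambda}} \big( \Lambda \otimes C_X^{*}(\{x_j\}) \big), \]

by understanding Λ as the common C∗-subalgebra of $\{ \Lambda \otimes C_X^{*}(\{x_j\}) \}_{j \in \mathbb{Z}}$ in $X_\tau$.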

8 Certain Banach-Space Operators Acting on Xτ

Let $X_\tau = (\mathfrak{X}, \tau)$ be our shift-semicircular C∗-probability space of the unital C∗-algebra,

\[ \mathfrak{X} = \Lambda \otimes X, \]

where Λ is the shift-operator algebra of the integer-shift group λ, acting on the unital C∗-probability space $X_\varphi = (X, \varphi)$ generated by the free semicircular family $\{x_j\}_{j \in \mathbb{Z}}$, equipped with the linear functional τ, satisfying

\[ \tau(T \otimes S) = \varphi\big(T(S)\big), \]

for all $T \otimes S \in \mathfrak{X}$. Recall that $X_\tau$ is generated by the free random variables,

\[ u_{k,j} \overset{\text{denote}}{=} \lambda^{k} \otimes x_j \in X_\tau, \quad \text{for all } k, j \in \mathbb{Z}, \tag{76} \]

followed by the semicircular law by (65) and (68).

In this section, we consider certain Banach-space operators, bounded linear
transformations, acting on Xτ , by regarding X as a Banach space equipped with
its tensor-product C ∗ -norm (e.g., [13]). i.e., we are interested in some elements of
the operator space B (Xτ ) (e.g., [13, 14]).

8.1 Banach-Space Operators $T_{s,l}^{t} \in B(X_\tau)$

Let $B(X_\tau)$ be the operator space consisting of all Banach-space operators acting on our shift-semicircular C∗-probability space $X_\tau$, by regarding Λ ⊗ X as a Banach space. Define an element $T_{s,l}^{t} \in B(X_\tau)$ by

\[ T_{s,l}^{t} \overset{\text{def}}{=} M_{t\lambda^{s}} \otimes \lambda^{l} \quad \text{on } X_\tau, \]

satisfying

\[ T_{s,l}^{t}(u_{k,j}) = \big( M_{t\lambda^{s}} \otimes \lambda^{l} \big)\big( \lambda^{k} \otimes x_j \big) = t\lambda^{s}\lambda^{k} \otimes \lambda^{l}(x_j) = t\lambda^{s+k} \otimes x_{j+l}, \tag{77} \]

i.e.,

\[ T_{s,l}^{t}(u_{k,j}) = t\,\lambda^{k+s} \otimes x_{j+l} = t\,u_{k+s, j+l}, \tag{78} \]

in $X_\tau$, for all $u_{k,j} \in \mathcal{X}$, and for all $t \in \mathbb{C}$, and $s, l \in \mathbb{Z}$, where

\[ \mathcal{X} = \{ u_{k,j} : k, j \in \mathbb{Z} \} \]

is the generator set of all generating free random variables (76) of $X_\tau$. In (77), the tensor factor $M_{t\lambda^{s}} \in B(\Lambda)$ of $T_{s,l}^{t}$ is a multiplication operator acting on the shift-operator algebra Λ (by regarding it as a Banach space), defined by

\[ M_{t\lambda^{s}}(S) = t\lambda^{s}S \quad \text{in } \Lambda, \ \forall S \in \Lambda, \]

and the other tensor factor $\lambda^{l} \in \lambda$ of $T_{s,l}^{t}$ is our generating shift operator of Λ acting on $X_\varphi$. So, the Banach-space operator $T_{s,l}^{t}$ of (77) is well-defined in $B(X_\tau)$.
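Theorem 32 For $t \in \mathbb{C}^{\times}$, and $s, l \in \mathbb{Z}$, let $T_{s,l}^{t} \in B(X_\tau)$ be the Banach-space operator (77), and let $u_{k,j} \in \mathcal{X}$ be a generating operator (76) of $X_\tau$. Then the image $u \overset{\text{denote}}{=} T_{s,l}^{t}(u_{k,j})$ satisfies the free-distributional data,

\[ \tau\Big( \prod_{i=1}^{n} u^{r_i} \Big) = \omega_n\, t^{\#(1)}\, \overline{t}^{\,\#(*)}\, c_{n/2} \quad \text{on } X_\tau, \tag{79} \]

where $\overline{t}$ is the conjugate of t, and

\[ \#(1) = \text{the number of } 1\text{'s in } (r_1, \ldots, r_n), \]

and

\[ \#(*) = \text{the number of } {*}\text{'s in } (r_1, \ldots, r_n), \]

for all $(r_1, \ldots, r_n) \in \{1, *\}^{n}$, for all $n \in \mathbb{N}$.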

Proof Under hypothesis, one has that

\[ u = T_{s,l}^{t}(u_{k,j}) = t\lambda^{k+s} \otimes x_{j+l} = t\,u_{k+s, j+l}, \]

by (77) and (78). Note that the operator $u_{k+s, j+l} \in \mathcal{X}$ on the far right-hand side is a generating free random variable of $X_\tau$, followed by the semicircular law, by (65). So, the element $u \in X_\tau$ is a scalar multiple of $u_{k+s, j+l}$, and hence, the free distribution of u is determined by that of $u_{k+s, j+l}$. Indeed, observe that, for any

\[ (r_1, \ldots, r_n) \in \{1, *\}^{n}, \quad \text{for } n \in \mathbb{N}, \]

one has

\[ \tau\Big( \prod_{i=1}^{n} u^{r_i} \Big) = \Big( \prod_{i=1}^{n} t^{r_i} \Big)\, \tau\Big( \prod_{i=1}^{n} u_{k+s, j+l}^{r_i} \Big), \]

where

\[ t^{r_i} = \begin{cases} t & \text{if } r_i = 1 \\ \overline{t} & \text{if } r_i = *, \end{cases} \]

in ℂ, where $\overline{t}$ is the conjugate of t, and hence, it goes to

\[ = \Big( \prod_{i=1}^{n} t^{r_i} \Big)\, \omega_n c_{n/2} \]

since the generating free random variable $u_{k+s, j+l} \in \mathcal{X}$ is followed by the semicircular law in $X_\tau$

\[ = t^{\#(1)}\, \overline{t}^{\,\#(*)}\, \omega_n c_{n/2}, \]

where

\[ \#(1) = \text{the number of } 1\text{'s in } (r_1, \ldots, r_n), \]

and

\[ \#(*) = \text{the number of } {*}\text{'s in } (r_1, \ldots, r_n). \]

Therefore, the free-distributional data (79) holds.

The above theorem shows how our Banach-space operator $T_{s,l}^{t} \in B(X_\tau)$ of (77) affects the original free-distributional data on $X_\tau$ by (79). It distorts the free distributions followed by the semicircular law to the free random variables satisfying (79), deformed by $\{t, \overline{t}\} \subset \mathbb{C}$.
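
The deformation (79) is easy to evaluate for concrete weights. The following is a minimal sketch, assuming a pattern $(r_1, \ldots, r_n)$ is written as a string over the characters "1" and "*"; `deformed_moment` is a hypothetical helper name.

```python
from math import comb

def catalan(k):
    """k-th Catalan number c_k."""
    return comb(2 * k, k) // (k + 1)

def semicircular_moment(n):
    """omega_n * c_{n/2}."""
    return catalan(n // 2) if n % 2 == 0 else 0

def deformed_moment(t, pattern):
    """Right-hand side of (79): omega_n * t^{#(1)} * conj(t)^{#(*)} * c_{n/2},
    where pattern is a string over '1' and '*'."""
    t = complex(t)
    weight = 1 + 0j
    for r in pattern:
        weight *= t if r == "1" else t.conjugate()
    return weight * semicircular_moment(len(pattern))

t = 1 + 2j
print(deformed_moment(t, "1*"))   # |t|^2 * c_1
print(deformed_moment(t, "11"))   # t^2 * c_1
print(deformed_moment(t, "1"))    # odd order, so the moment vanishes
```

For non-real t the value depends on the pattern and not only on its length, which is why the next definitions restrict attention to real weights.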
Now, recall the following concept, introduced in [5–8, 12].
Definition 33 Let (B, ψ) be an arbitrary topological ∗-probability space. A “self-adjoint” free random variable y ∈ (B, ψ) is said to be weighted-semicircular with its weight $t_0 \in \mathbb{C}^{\times}$ (or, in short, $t_0$-semicircular), if

\[ \psi\big(y^{n}\big) = \omega_n\, t_0^{n/2}\, c_{n/2}, \quad \text{for all } n \in \mathbb{N}. \tag{80} \]

The free distributions of weighted-semicircular elements are called weighted-semicircular laws.
By definition, all semicircular elements are 1-semicircular in the sense of (80),
and hence, the semicircular law is a 1-semicircular law in terms of Definition 33.
Also, by (80), even though the semicircular law is universal by (8) and (9), weighted
semicircular laws are not universally determined because they are dictated by
their weights, i.e., they are depending on choices of weights. Such free random
variables whose free distributions are weighted-semicircular laws do exist and have
interesting properties up to weight (e.g., see [5] through [6, 10–12]).
Definition 34 Let (B, ψ) be a topological ∗-probability space. A free random variable y ∈ (B, ψ) is said to be followed by a $t_0$-semicircular law, if the joint free moments of $\{y, y^{*}\}$ satisfy

\[ \psi\Big( \prod_{i=1}^{n} y^{r_i} \Big) = \omega_n\, t_0^{n/2}\, c_{n/2}, \tag{81} \]

for all $(r_1, \ldots, r_n) \in \{1, *\}^{n}$, for all $n \in \mathbb{N}$.

By (80), all weighted-semicircular elements are followed by the corresponding
weighted-semicircular laws in the sense of (81). Clearly, not all free random
variables followed by weighted-semicircular laws are weighted-semicircular. Then,
similar to Theorems 23 and 24, are there sufficiently many free random variables
followed by weighted-semicircular laws? The answer is positive by (79).
Corollary 35 Let $T_{s,l}^{t} \in B(X_\tau)$ be a Banach-space operator (77) for $t \in \mathbb{C}^{\times}$, and $s, l \in \mathbb{Z}$, and let $u_{k,j} \in X_\tau$ be a generating free random variable for $k, j \in \mathbb{Z}$. If “$t \in \mathbb{R}$” in $\mathbb{C}^{\times}$, then the image $u = T_{s,l}^{t}(u_{k,j}) \in X_\tau$ is followed by the $t^{2}$-semicircular law in the sense of (81).

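Proof Let $u = T_{s,l}^{t}(u_{k,j}) = t\,u_{k+s, j+l} \in X_\tau$ be the corresponding free random variable, where $t \in \mathbb{R}^{\times} = \mathbb{R} \setminus \{0\}$. Then

\[ \tau\Big( \prod_{i=1}^{n} u^{r_i} \Big) = \omega_n\, t^{\#(1)}\, \overline{t}^{\,\#(*)}\, c_{n/2}, \]

for all $(r_1, \ldots, r_n) \in \{1, *\}^{n}$, for all $n \in \mathbb{N}$, by (79). Moreover, since $t \in \mathbb{R}^{\times}$,

\[ \tau\Big( \prod_{i=1}^{n} u^{r_i} \Big) = \omega_n\, t^{\#(1) + \#(*)}\, c_{n/2} = \omega_n\, t^{n}\, c_{n/2}, \]

implying that

\[ \tau\Big( \prod_{i=1}^{n} u^{r_i} \Big) = \omega_n \big(t^{2}\big)^{n/2} c_{n/2}, \tag{82} \]

for all $(r_1, \ldots, r_n) \in \{1, *\}^{n}$, for all $n \in \mathbb{N}$.

Therefore, if $t \in \mathbb{R}^{\times}$, then the free random variable $T_{s,l}^{t}(u_{k,j}) \in X_\tau$ is followed by the $t^{2}$-semicircular law by (82).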

The above corollary shows not only that there do exist free random variables followed by weighted-semicircular laws, but also that there are sufficiently many such free random variables. Also, it shows how our Banach-space operator $T_{s,l}^{t} \in B(X_\tau)$ deforms the free distributions followed by the semicircular law to free distributions followed by $t^{2}$-semicircular laws, for all $t \in \mathbb{R}^{\times}$.
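
The role of the assumption $t \in \mathbb{R}^{\times}$ can be seen by testing (81) with $t_0 = t^{2}$ against (79) over all patterns of small order. This is a minimal sketch with hypothetical helper names; a finite check illustrates the statement but does not replace the proof.

```python
from itertools import product
from math import comb

def catalan(k):
    """k-th Catalan number c_k."""
    return comb(2 * k, k) // (k + 1)

def weighted_law_moment(t0, n):
    """omega_n * t0^{n/2} * c_{n/2}, the right-hand side of (81)."""
    return t0 ** (n // 2) * catalan(n // 2) if n % 2 == 0 else 0

def image_moment(t, pattern):
    """omega_n * t^{#(1)} * conj(t)^{#(*)} * c_{n/2}, as in (79)."""
    n = len(pattern)
    if n % 2 == 1:
        return 0
    weight = 1
    for r in pattern:
        weight *= t if r == "1" else t.conjugate()
    return weight * catalan(n // 2)

def follows_weighted_law(t, t0, max_n=6):
    """Check (81) with weight t0 for all patterns of length at most max_n."""
    return all(
        image_moment(t, p) == weighted_law_moment(t0, n)
        for n in range(1, max_n + 1)
        for p in product("1*", repeat=n)
    )

print(follows_weighted_law(-3, 9))          # real t: the t^2-semicircular law
print(follows_weighted_law(1j, (1j) ** 2))  # non-real t: pattern "1*" fails
```

For t = i the patterns (1, 1) and (1, ∗) give −1 and 1 respectively, so no single weight $t_0$ can satisfy (81).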

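Let $T_{s_i, l_i}^{t_i} \in B(X_\tau)$ be Banach-space operators (77), where $t_i \in \mathbb{C}^{\times}$, and $s_i, l_i \in \mathbb{Z}$, for i = 1, . . . , N, for $N \in \mathbb{N}$. Observe that

\[ \prod_{i=1}^{N} T_{s_i, l_i}^{t_i} = \prod_{i=1}^{N} \big( t_i M_{\lambda^{s_i}} \otimes \lambda^{l_i} \big) \]

since $M_{t\lambda^{s}} = tM_{\lambda^{s}} \in B(\Lambda)$, for all $t \in \mathbb{C}^{\times}$, and $s \in \mathbb{Z}$

\[ = \Big( \prod_{i=1}^{N} t_i \Big) \Big( M_{\prod_{i=1}^{N} \lambda^{s_i}} \otimes \prod_{i=1}^{N} \lambda^{l_i} \Big) \]

since $M_{\lambda^{s_1}} M_{\lambda^{s_2}} = M_{\lambda^{s_1}\lambda^{s_2}}$ on Λ by the very definition

\[ = \Big( \prod_{i=1}^{N} t_i \Big) \big( M_{\lambda^{s_o}} \otimes \lambda^{l_o} \big) \]

where

\[ s_o = \sum_{i=1}^{N} s_i, \quad \text{and} \quad l_o = \sum_{i=1}^{N} l_i, \quad \text{in } \mathbb{Z}, \]

and hence, it goes to

\[ = t_o M_{\lambda^{s_o}} \otimes \lambda^{l_o} = M_{t_o \lambda^{s_o}} \otimes \lambda^{l_o} = T_{s_o, l_o}^{t_o}, \]

in $B(X_\tau)$. That is,

\[ \prod_{i=1}^{N} T_{s_i, l_i}^{t_i} = T_{s_o, l_o}^{t_o} \quad \text{in } B(X_\tau), \]

with

\[ t_o = \prod_{i=1}^{N} t_i, \quad s_o = \sum_{i=1}^{N} s_i, \quad \text{and} \quad l_o = \sum_{i=1}^{N} l_i. \tag{83} \]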

By (79) and (83), we obtain the following result.


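Theorem 36 Let $T_{s_i, l_i}^{t_i} \in B(X_\tau)$ be the Banach-space operators (77), for i = 1, . . . , N, for $N \in \mathbb{N}$, and let $T_N \overset{\text{denote}}{=} \prod_{i=1}^{N} T_{s_i, l_i}^{t_i} \in B(X_\tau)$. Then there exists

\[ t_o = \prod_{i=1}^{N} t_i \in \mathbb{C}^{\times}, \]

such that

\[ \tau\Big( \prod_{i=1}^{n} \big( T_N(u_{k,j}) \big)^{r_i} \Big) = \omega_n\, t_o^{\#(1)}\, \overline{t_o}^{\,\#(*)}\, c_{n/2}, \tag{84} \]

for all $(r_1, \ldots, r_n) \in \{1, *\}^{n}$, for all $n \in \mathbb{N}$, for any fixed generating free random variables $u_{k,j} \in \mathcal{X}$ of $X_\tau$, for all $k, j \in \mathbb{Z}$, where #(1) and #(∗) are in the sense of (79).
Proof Under hypothesis, we have

\[ T_N = T_{s_o, l_o}^{t_o} \in B(X_\tau), \]

in the sense of (77), where $t_o \in \mathbb{C}^{\times}$, and $s_o, l_o \in \mathbb{Z}$ are in the sense of (83).

For any generating operator $u_{k,j} \in \mathcal{X}$ of $X_\tau$, we have

\[ T_N(u_{k,j}) = T_{s_o, l_o}^{t_o}(u_{k,j}) = t_o\, u_{k+s_o, j+l_o} \quad \text{in } X_\tau. \]

Therefore, the free-distributional data (84) holds by (79).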

The above theorem generalizes (79) by (84). It is interesting that finite products
of Banach-space operators (77) become Banach-space operators again in the sense
of (77) in B (Xτ ) by (83).
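
The closure property (83) can be traced on generators. In the sketch below, a scaled generator $c \cdot u_{k,j}$ is stored as a triple (c, k, j) and $T_{s,l}^{t}$ acts on it as in (78); this models the operators on scalar multiples of generators only, and the names `T` and `compose_all` are hypothetical.

```python
from math import prod

def T(t, s, l):
    """Model of T^t_{s,l} on a scaled generator (coef, k, j), which stands
    for coef * u_{k,j}; by (78) it maps to t * coef * u_{k+s, j+l}."""
    def apply(elem):
        coef, k, j = elem
        return (t * coef, k + s, j + l)
    return apply

def compose_all(ops):
    """Return the composition ops[0] . ops[1] . ... . ops[-1]."""
    def apply(elem):
        for op in reversed(ops):
            elem = op(elem)
        return elem
    return apply

params = [(2, 1, -3), (-1, 4, 0), (3, -2, 5)]   # hypothetical (t_i, s_i, l_i)
t_o = prod(t for t, _, _ in params)
s_o = sum(s for _, s, _ in params)
l_o = sum(l for _, _, l in params)

u = (1, 0, 0)                                   # the generator u_{0,0}
composite = compose_all([T(*p) for p in params])
print(composite(u), T(t_o, s_o, l_o)(u))
```

The composite acts exactly as the single operator with parameters $(t_o, s_o, l_o)$, in line with (83).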

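Corollary 37 Under the same conditions of Theorem 36, if $t_o = \prod_{i=1}^{N} t_i \in \mathbb{R}^\times$, then the free random variable $T_N(u_{k,j}) \in X_\tau$ is followed by the $t_o^2$-semicircular law, for all generating operators $u_{k,j} \in X$ of $X_\tau$.
Proof Since $t_o$ is real and $\#(1) + \#(*) = n$, the right-hand side of (84) reduces to $\omega_n t_o^{n} c_{\frac{n}{2}}$, which is the free-distributional data of the $t_o^2$-semicircular law.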
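
As a numerical sanity check of the moment sequence ω_n t^n c_{n/2}, the following sketch compares it with the moments of the semicircle density supported on [-2|t|, 2|t|]. The explicit density and the midpoint-rule integration are assumptions of this illustration (the standard description of a semicircular law with variance t^2), and the function names are hypothetical.

```python
import math


def catalan(m):
    """m-th Catalan number c_m = binom(2m, m) / (m + 1)."""
    return math.comb(2 * m, m) // (m + 1)


def semicircular_moment(n, t=1.0):
    """Closed form omega_n * t^n * c_{n/2}; it vanishes for odd n."""
    if n % 2 == 1:
        return 0.0
    return (t ** n) * catalan(n // 2)


def integrated_moment(n, t=1.0, steps=20000):
    """n-th moment of the semicircle density on [-2|t|, 2|t|], by the midpoint rule.

    With x = 2 t cos(theta), the measure (1 / (2 pi t^2)) sqrt(4 t^2 - x^2) dx
    becomes (2 / pi) sin(theta)^2 d(theta) on [0, pi].
    """
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        total += (2 * t * math.cos(theta)) ** n * math.sin(theta) ** 2
    return total * h * 2 / math.pi


for n in range(7):
    print(n, semicircular_moment(n, 1.5), round(integrated_moment(n, 1.5), 6))
```

The two columns should agree up to integration error, with zeros in the odd rows.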

8.2 Multiplication Operators $M_{u_{k_0,j_0}}$ of $B(X_\tau)$

In this section, we consider a different type of Banach-space operators acting on the shift-semicircular $C^*$-probability space $X_\tau$, and consider how such operators deform the original free-distributional data on $X_\tau$. Recall again that all generating free random variables

$$X = \left\{ u_{k,j} = \lambda^{k} \otimes x_j : k, j \in \mathbb{Z} \right\}$$

of $X_\tau$ are followed by the semicircular law, for all $\lambda^{k} \in \lambda \subset \Lambda$, and $x_j \in X \subset X_\varphi$.

Fix an arbitrary generating element

$$u_0 \overset{\text{denote}}{=} u_{k_0,j_0} \in X \tag{85}$$

of $X_\tau$, and define a Banach-space operator $M_{u_0} \in B(X_\tau)$ by

$$M_{u_0}(T) = u_0 T \quad\text{in } X_\tau, \ \forall T \in X_\tau. \tag{86}$$

Indeed, this morphism $M_{u_0}$ of (86) is a well-defined bounded linear transformation acting on the Banach space $X_\tau$, understood to be a multiplication Banach-space operator on $X_\tau$ with its symbol $u_0$ of (85).
Definition 38 We call the Banach-space operator $M_{u_0} \in B(X_\tau)$ of (86) the multiplication operator with its symbol $u_0 \in X_\tau$. More generally, if $w \in X_\tau$, and if $M_w$ is a Banach-space operator of $B(X_\tau)$,

$$M_w(y) = wy, \quad \forall y \in X_\tau,$$

then it is called the multiplication operator with its symbol $w$.

Observe that, for any generators $u_{k,j} \in X$ of $X_\tau$, if $M_{u_0} \in B(X_\tau)$ is a multiplication operator (86), then

$$M_{u_0}(u_{k,j}) = u_{k_0,j_0} u_{k,j} = \lambda^{k_0}\lambda^{k} \otimes x_{j_0} x_j = \lambda^{k_0+k} \otimes x_{j_0} x_j, \tag{87}$$

in $X_\tau$, by (85).
Let us consider the tensor-factor $x_{j_0} x_j \in X_\varphi$ of the image (87) of $M_{u_0}(u_{k,j}) \in X_\tau$. Suppose first that $j = j_0$ in $\mathbb{Z}$. Then

$$x_{j_0} x_j = x_{j_0}^{2} \in X_\varphi,$$

as a self-adjoint free random variable. By the structure theorem (25) of $X_\varphi$ (which is free-isomorphic to (22) by (33)), this element $x_{j_0}^{2}$ is a free reduced word with its length-1 in $X_\varphi$, satisfying its free-distributional data,

$$\varphi\left(\left(x_{j_0}^{2}\right)^{n}\right) = \varphi\left(x_{j_0}^{2n}\right) = \omega_{2n} \, c_{\frac{2n}{2}} = c_n, \tag{88}$$

the n-th Catalan number, for all $n \in \mathbb{N}$.


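So, if $u_{j_0} = M_{u_0}(u_{k,j_0}) = \lambda^{k_0+k} \otimes x_{j_0}^{2} \in X_\tau$, then

$$\tau\left(\prod_{i=1}^{n} u_{j_0}^{r_i}\right) = \tau\left(\lambda^{\#(1)-\#(*)} \otimes x_{j_0}^{2n}\right) = \varphi\left(\lambda^{\#(1)-\#(*)}\left(x_{j_0}^{2n}\right)\right) = \varphi\left(x_{j_0}^{2n}\right) = c_n, \tag{89}$$

by (88), for all $(r_1, \ldots, r_n) \in \{1, *\}^n$, for all $n \in \mathbb{N}$, since $\lambda^{l} \in \lambda \subset \Lambda$ are free-isomorphisms on $X_\varphi$, for all $l \in \mathbb{Z}$.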
Lemma 39 Let $M_{u_0} \in B(X_\tau)$ be the multiplication operator (86) with its symbol $u_0 \in X_\tau$ of (85). Then, for any generating free random variables $u_{k,j_0} \in X \subset X_\tau$, for all $k \in \mathbb{Z}$, and fixed $j_0 \in \mathbb{Z}$, we have

$$\tau\left(\prod_{i=1}^{n} \left(M_{u_0}(u_{k,j_0})\right)^{r_i}\right) = c_n, \tag{90}$$

for all $(r_1, \ldots, r_n) \in \{1, *\}^n$, for all $n \in \mathbb{N}$.

Proof The free-distributional data (90) is obtained by (89).
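
The Catalan numbers in (88) and (90) have a well-known combinatorial reading: the 2n-th moment of a standard semicircular element counts the noncrossing pairings of 2n points. The sketch below checks that count by brute force for small n; it is an illustration of this standard fact, not a reproduction of the argument in the text, and all function names are hypothetical.

```python
import math


def pairings(points):
    """Yield every pairing of an even-length list of points as a list of pairs."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def is_noncrossing(pairing):
    """A pairing crosses if a < c < b < d for two of its blocks (a, b), (c, d)."""
    for a, b in pairing:
        for c, d in pairing:
            if a < c < b < d:
                return False
    return True


def count_noncrossing_pairings(n):
    """Number of noncrossing pairings of {1, ..., 2n}."""
    points = list(range(1, 2 * n + 1))
    return sum(1 for p in pairings(points) if is_noncrossing(p))


def catalan(n):
    """n-th Catalan number c_n = binom(2n, n) / (n + 1)."""
    return math.comb(2 * n, n) // (n + 1)


for n in range(1, 6):
    print(n, count_noncrossing_pairings(n), catalan(n))
```

Enumerating all pairings grows like (2n - 1)!!, so this check is only practical for small n.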

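Now, suppose $j \neq j_0$ in $\mathbb{Z}$ in (87). Then the tensor-factor $x_{j_0} x_j \in X_\varphi$ of $u_0 \overset{\text{denote}}{=} M_{u_0}(u_{k,j}) \in X_\tau$ is a free reduced word with its length-2, since two generating semicircular elements $x_{j_0}$ and $x_j$ (are mutually distinct in the generating free semicircular family $X = \{x_k\}_{k \in \mathbb{Z}}$, and hence, they) are free in $X_\varphi$.
For $(r_1, \ldots, r_n) \in \{1, *\}^n$, for $n \in \mathbb{N}$, one has that

$$\tau\left(\prod_{i=1}^{n} u_0^{r_i}\right) = \tau\left(\prod_{i=1}^{n} \left(\lambda^{k_0+k} \otimes x_{j_0} x_j\right)^{r_i}\right)$$

by (87)

$$= \tau\left(\prod_{i=1}^{n} \left(\lambda^{r_i(k_0+k)} \otimes \left(x_{j_0} x_j\right)^{r_i}\right)\right)$$

where

$$r_i(k_0+k) = \begin{cases} k_0 + k & \text{if } r_i = 1 \\ -(k_0+k) & \text{if } r_i = *, \end{cases}$$

and

$$\left(x_{j_0} x_j\right)^{r_i} = \begin{cases} x_{j_0} x_j & \text{if } r_i = 1 \\ x_j x_{j_0} & \text{if } r_i = *, \end{cases}$$

for all $i = 1, \ldots, n$, so, it goes to

$$= \varphi\left(\lambda^{N_{(r_1,\ldots,r_n)}}\left(\prod_{i=1}^{n} \left(x_{j_0} x_j\right)^{r_i}\right)\right)$$

where

$$N_{(r_1,\ldots,r_n)} = (k_0+k)\left(\#(1) - \#(*)\right) \in \mathbb{Z},$$

and hence, it goes to

$$= \varphi\left(\prod_{i=1}^{n} \left(x_{j_0} x_j\right)^{r_i}\right), \tag{91}$$

since $\lambda^{N_{(r_1,\ldots,r_n)}}$ is a free-isomorphism on $X_\varphi$. Note here that the $\mathbb{C}$-quantity (91) is characterized by Theorem 8, or (17).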

Lemma 40 Let $u_0 = M_{u_0}(u_{k,j}) \in X_\tau$, where $M_{u_0} \in B(X_\tau)$ is the multiplication operator (86) with its symbol $u_0 = u_{k_0,j_0} \in X \subset X_\tau$ of (85), and suppose $j \neq j_0$ in $\mathbb{Z}$. Then

$$\tau\left(\prod_{i=1}^{n} u_0^{r_i}\right) = \varphi\left(\prod_{i=1}^{n} \left(x_{j_0} x_j\right)^{r_i}\right), \tag{92}$$

and the right-hand side of (92) is characterized by Theorem 8.

Proof The free-distributional data (92) is obtained by (91).

By the previous two lemmas, we obtain the following result showing how our multiplication operators of $B(X_\tau)$ affect the free-distributional data on $X_\tau$.
Theorem 41 Let $M_{u_0} \in B(X_\tau)$ be a multiplication operator (86), and $u_{k,j} \in X$ a generating free random variable of $X_\tau$ followed by the semicircular law. Let

$$u_{j_0} = M_{u_0}(u_{k,j_0}) \quad\text{and}\quad u_0 = M_{u_0}(u_{k,j}), \quad\text{in } X_\tau,$$

where $j \neq j_0$ in $\mathbb{Z}$. Then the images $u_{j_0}$ and $u_0$ are no longer followed by the semicircular law. In particular,

$$\tau\left(\prod_{i=1}^{n} u_{j_0}^{r_i}\right) = c_n, \ \text{the n-th Catalan number}, \tag{93}$$

and

$$\tau\left(\prod_{i=1}^{n} u_0^{r_i}\right) = \varphi\left(\prod_{i=1}^{n} \left(x_{j_0} x_j\right)^{r_i}\right),$$

characterized by Theorem 8 (or (17)), for all $(r_1, \ldots, r_n) \in \{1, *\}^n$, for all $n \in \mathbb{N}$.
Proof The free-distributional data (93) are obtained by (90) and (92). Therefore, the free random variables $u_{j_0}$ and $u_0$ are not followed by the semicircular law in $X_\tau$.

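The above theorem illustrates how our multiplication operators deform the free probability on the shift-semicircular $C^*$-probability space $X_\tau$. In particular, the generating free random variables of $X_\tau$, followed by the semicircular law, are no longer followed by the semicircular law, under the action of our multiplication operators by (93). For example, if

$$u_0 = M_{u_0}(u_{k,j}) = \lambda^{k_0+k} \otimes x_{j_0} x_j \in X_\tau$$

(with $j \neq j_0$) is as above in Theorem 41, then

$$\begin{aligned}
\tau\left(u_0^{n}\right) &= \tau\left(\left(\lambda^{k_0+k} \otimes x_{j_0} x_j\right)^{n}\right) = \tau\left(\lambda^{n(k_0+k)} \otimes \left(x_{j_0} x_j\right)^{n}\right) \\
&= \varphi\left(\lambda^{n(k_0+k)}\left(\left(x_{j_0} x_j\right)^{n}\right)\right) = \varphi\left(\left(x_{j_0} x_j\right)^{n}\right) \\
&= \varphi\left(x_{j_0} x_j x_{j_0} x_j \cdots x_{j_0} x_j\right) \\
&= \varphi\left(x_{j_0}^{n}\right) \left(\varphi\left(x_j\right)\right)^{n} = 0,
\end{aligned} \tag{94}$$

since $\varphi(x_j) = \omega_1 c_{\frac{1}{2}} = 0$, for all $n \in \mathbb{N}$, by (93). Note that the formula (94) is obtained by Theorem 8. Indeed, the free reduced word,

$$W = \left(x_{j_0} x_j\right)^{n} = x_{j_0} x_j x_{j_0} x_j \cdots x_{j_0} x_j \in X_\varphi$$

with its length-(2n) induces the corresponding noncrossing partition,

$$\pi_W = \left\{ (i_1, i_3, \ldots, i_{2n-1}), (i_2), (i_4), \ldots, (i_{2n}) \right\},$$

in the lattice $NC(\{i_1, \ldots, i_{2n}\})$ of all noncrossing partitions over $\{i_1, \ldots, i_{2n}\}$.
Similar to (94), under the same hypothesis, we have

$$\tau\left(\left(u_0^{*}\right)^{n}\right) = 0, \quad\text{for all } n \in \mathbb{N},$$

since

$$\left(u_0^{*}\right)^{n} = \left(x_j x_{j_0}\right)^{n} = x_j x_{j_0} x_j x_{j_0} \cdots x_j x_{j_0} \in X_\varphi$$

is a free reduced word with its length-(2n).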

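Also, one can have, for example, that

$$\tau\left(u_0^{*} u_0\right) = \tau\left(\left(\lambda^{-(k_0+k)} \otimes x_j x_{j_0}\right)\left(\lambda^{k_0+k} \otimes x_{j_0} x_j\right)\right) = \tau\left(\lambda^{0} \otimes x_j x_{j_0}^{2} x_j\right) = \varphi\left(x_j x_{j_0}^{2} x_j\right)$$

where $x_j x_{j_0}^{2} x_j = x_j x_{j_0} x_{j_0} x_j \in X_\varphi$ is a free reduced word with its length-3

$$= \varphi\left(x_j^{2}\right) \varphi\left(x_{j_0}^{2}\right) = \left(\omega_2 c_{\frac{2}{2}}\right)\left(\omega_2 c_{\frac{2}{2}}\right) = c_1^{2} = 1,$$

by (93), because the free random variable $w = x_j x_{j_0}^{2} x_j \in X_\varphi$ induces its corresponding noncrossing partition,

$$\pi_w = \left\{ (i_1, i_4), (i_2, i_3) \right\},$$

in $NC(\{i_1, i_2, i_3, i_4\})$, by (17).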
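
The values in these examples can be cross-checked with the standard combinatorial rule for a free family of standard semicircular elements: a mixed moment of a word in the generators equals the number of noncrossing pairings of its positions that only pair equal generators. The sketch below implements that rule recursively. It is an assumption-labeled illustration of the general rule and is not claimed to be the formulation of Theorem 8 or (17) in the text; `mixed_moment` and the label strings are hypothetical names.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def mixed_moment(word):
    """Count noncrossing pairings of the positions of `word` that pair equal labels.

    `word` is a tuple of labels; ("j0", "j", "j0", "j") stands for the
    product x_{j0} x_j x_{j0} x_j of free standard semicircular elements.
    """
    if not word:
        return 1
    total = 0
    # The first letter pairs with a later letter; an even number of letters
    # must sit strictly between them, otherwise the inside cannot be paired.
    for p in range(1, len(word), 2):
        if word[p] == word[0]:
            total += mixed_moment(word[1:p]) * mixed_moment(word[p + 1:])
    return total


J0, J = "j0", "j"
print(mixed_moment((J0, J) * 3))      # alternating word, as in (94)
print(mixed_moment((J, J0, J0, J)))   # the word x_j x_{j0}^2 x_j
print(mixed_moment((J0,) * 6))        # sixth moment of a single generator
```

For an alternating word the first letter never meets an equal letter at an odd distance, so the count is zero, in agreement with (94).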

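Let $M_{u_i} = M_{u_{k_i,j_i}} \in B(X_\tau)$ be the multiplication operators (86) with their symbols $u_{k_i,j_i} = \lambda^{k_i} \otimes x_{j_i} \in X \subset X_\tau$, for $i = 1, 2$. Then it is not difficult to check that

$$M_{u_1} M_{u_2} = M_{u_1 u_2} \quad\text{on } X_\tau,$$

since

$$\begin{aligned}
M_{u_1} M_{u_2}(u_{k,j}) &= M_{u_1}\left(\lambda^{k_2}\lambda^{k} \otimes x_{j_2} x_j\right) \\
&= \lambda^{k_1}\lambda^{k_2}\lambda^{k} \otimes x_{j_1} x_{j_2} x_j = \left(\lambda^{k_1}\lambda^{k_2}\right)\lambda^{k} \otimes \left(x_{j_1} x_{j_2}\right) x_j \\
&= \left(\lambda^{k_1}\lambda^{k_2} \otimes x_{j_1} x_{j_2}\right)\left(\lambda^{k} \otimes x_j\right) = (u_1 u_2)\, u_{k,j} \\
&= M_{u_1 u_2}(u_{k,j}),
\end{aligned}$$

for all generating operators $u_{k,j} \in X$ of $X_\tau$, implying that

$$M_{u_1} M_{u_2} = M_{u_1 u_2} \quad\text{in } B(X_\tau). \tag{95}$$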


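Let $M_{u_i} = M_{u_{k_i,j_i}} \in B(X_\tau)$ be the multiplication operators with their symbols $u_i = u_{k_i,j_i} \in X$, for $i = 1, \ldots, N$, for $N \in \mathbb{N}$. Then, by (95),

$$\prod_{i=1}^{N} M_{u_i} = M_{\prod_{i=1}^{N} u_i} = M_{\left(\lambda^{\sum_{i=1}^{N} k_i}\right) \otimes \left(\prod_{i=1}^{N} x_{j_i}\right)}. \tag{96}$$

If $u_{k,j} \in X$ is a generating operator of $X_\tau$, then one has

$$\left(\prod_{i=1}^{N} M_{u_i}\right)(u_{k,j}) = \left(M_{\left(\lambda^{\sum_{i=1}^{N} k_i}\right) \otimes \left(\prod_{i=1}^{N} x_{j_i}\right)}\right)\left(\lambda^{k} \otimes x_j\right)$$

by (96)

$$= \left(\lambda^{k + \sum_{i=1}^{N} k_i}\right) \otimes \left(\left(\prod_{i=1}^{N} x_{j_i}\right) x_j\right), \tag{97}$$

in $X_\tau$.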
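
The index bookkeeping behind (95), (96) and (97) can be sketched as follows. The encoding of an element λ^k ⊗ x_{j_1} ... x_{j_m} as a pair (k, (j_1, ..., j_m)) is a hypothetical simplification: it records only the shift exponent and the word of generator indices, and ignores the linear and norm structure of X_τ.

```python
from functools import reduce


def tensor(k, *indices):
    """Hypothetical encoding of lambda^k (x) x_{j_1} ... x_{j_m} as (k, (j_1, ..., j_m))."""
    return (k, tuple(indices))


def multiply(a, b):
    """(lambda^k (x) w)(lambda^k' (x) w') = lambda^{k + k'} (x) w w'."""
    return (a[0] + b[0], a[1] + b[1])


def mult_op(symbol):
    """Multiplication operator with the given symbol: y -> symbol * y."""
    return lambda y: multiply(symbol, y)


u1, u2, u3 = tensor(2, 5), tensor(-1, 7), tensor(4, 5)
u_kj = tensor(3, 9)

# (95): applying M_{u2} and then M_{u1} agrees with applying M_{u1 u2}.
lhs = mult_op(u1)(mult_op(u2)(u_kj))
rhs = mult_op(multiply(u1, u2))(u_kj)

# (96)-(97): the product M_{u1} M_{u2} M_{u3} has symbol u1 u2 u3.
symbol = reduce(multiply, [u1, u2, u3])
image = mult_op(symbol)(u_kj)
print(lhs == rhs, symbol, image)
```

In this encoding, the exponent of the image is k plus the sum of the k_i, and its word is x_{j_1} ... x_{j_N} x_j, which is the shape of (97).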
Theorem 42 Let $M_{u_i} = M_{u_{k_i,j_i}} \in B(X_\tau)$ be the multiplication operators with their symbols $u_{k_i,j_i} \in X$, for $i = 1, \ldots, N$, for $N \in \mathbb{N}$, and let $u_{k,j} \in X$ be an arbitrary generating free random variable of $X_\tau$. If $M = \prod_{i=1}^{N} M_{u_i} \in B(X_\tau)$, and if $u = M(u_{k,j}) \in X_\tau$, then

$$\tau(u) = \varphi\left(\left(\prod_{i=1}^{N} x_{j_i}\right) x_j\right) = \varphi\left(x_{j_1} x_{j_2} \cdots x_{j_N} x_j\right), \tag{98}$$

where the far-right-hand side of (98) is characterized by Theorem 8 (or (17)).

Proof Under hypothesis, one has

$$u = M(u_{k,j}) = \lambda^{k_o} \otimes w_o \in X_\tau,$$

where

$$k_o = k + \sum_{i=1}^{N} k_i \in \mathbb{Z},$$

and

$$w_o = x_{j_1} x_{j_2} \cdots x_{j_N} x_j \in X_\varphi,$$

by (97). Therefore, the free-distributional data (98) holds by (93).

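As a special case of (98), we obtain the following corollary.
Corollary 43 Let $M_i = M_{u_{k_i,j}} \in B(X_\tau)$ be the multiplication operators with their symbols $u_{k_i,j} \in X$ in $X_\tau$, for a fixed $j \in \mathbb{Z}$, and $k_i \in \mathbb{Z}$, for $i = 1, \ldots, N$, for $N \in \mathbb{N}$. Let $u_{k,j} \in X$ be a generating free random variable of $X_\tau$, for $k \in \mathbb{Z}$ and the same fixed $j \in \mathbb{Z}$. If $u = \left(\prod_{i=1}^{N} M_i\right)(u_{k,j}) \in X_\tau$, then

$$\tau\left(\prod_{i=1}^{n} u^{r_i}\right) = \omega_{n(N+1)} \, c_{\frac{n(N+1)}{2}} = \varphi\left(x_j^{n(N+1)}\right), \tag{99}$$

for all $(r_1, \ldots, r_n) \in \{1, *\}^n$, for all $n \in \mathbb{N}$.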

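Proof Under hypothesis, let $M = \prod_{i=1}^{N} M_i \in B(X_\tau)$. Then, by (96),

$$M = M_{\left(\lambda^{\sum_{i=1}^{N} k_i}\right) \otimes x_j^{N}} \overset{\text{denote}}{=} M_{\lambda^{k_o} \otimes x_j^{N}}, \quad\text{on } X_\tau,$$

with

$$\lambda^{k_o} \in \Lambda, \quad\text{with}\quad k_o = \sum_{i=1}^{N} k_i \in \mathbb{Z}, \tag{100}$$

and $x_j^{N} \in X_\varphi$ is the free reduced word with its length-1 for a fixed generating semicircular element $x_j$ of $X_\varphi$.
So, one has

$$u = M(u_{k,j}) = \lambda^{k_o}\lambda^{k} \otimes x_j^{N} x_j = \lambda^{k_o+k} \otimes x_j^{N+1},$$

in $X_\tau$, by (100), for $u_{k,j} \in X \subset X_\tau$, for $k \in \mathbb{Z}$. Thus,

$$\tau\left(\prod_{i=1}^{n} u^{r_i}\right) = \varphi\left(\lambda^{K}\left(x_j^{n(N+1)}\right)\right),$$

for $K = \sum_{i=1}^{n} \varepsilon_i (k_o + k) \in \mathbb{Z}$, with

$$\varepsilon_i = \begin{cases} 1 & \text{if } r_i = 1 \\ -1 & \text{if } r_i = *, \end{cases}$$

for all $i = 1, \ldots, n$, implying that

$$\tau\left(\prod_{i=1}^{n} u^{r_i}\right) = \varphi\left(x_j^{n(N+1)}\right) = \omega_{n(N+1)} \, c_{\frac{n(N+1)}{2}},$$

for all $(r_1, \ldots, r_n) \in \{1, *\}^n$, for all $n \in \mathbb{N}$.

Therefore, the free-distributional data (99) holds.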
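
The role of the exponent K and of the pattern (r_1, ..., r_n) in this proof can be traced with a short sketch. It assumes the same hypothetical pair encoding (exponent, word of indices) for λ^k ⊗ x_{j_1} ... x_{j_m}, takes the adjoint to negate the exponent and reverse the word, and evaluates the moment by the closed form ω_m c_{m/2}; the numerical values chosen for N, k_o and k are arbitrary.

```python
import math


def adjoint(elem):
    """(lambda^k (x) w)^* = lambda^{-k} (x) (w reversed); each x_j is self-adjoint."""
    k, word = elem
    return (-k, word[::-1])


def multiply(a, b):
    """(lambda^k (x) w)(lambda^k' (x) w') = lambda^{k + k'} (x) w w'."""
    return (a[0] + b[0], a[1] + b[1])


def semicircular_moment(m):
    """omega_m * c_{m/2}: the m-th moment of a standard semicircular element."""
    if m % 2 == 1:
        return 0
    half = m // 2
    return math.comb(2 * half, half) // (half + 1)


def joint_moment(u, pattern):
    """Return (K, moment) for the product of u^{r_i}; `pattern` is a string over '1' and '*'.

    Assumes the word part of u is a power of a single generator x_j, so the
    shift lambda^K does not change the value and only the length matters.
    """
    acc = (0, ())
    for r in pattern:
        acc = multiply(acc, u if r == "1" else adjoint(u))
    K, word = acc
    if len(set(word)) > 1:
        raise ValueError("word must be a power of a single generator")
    return K, semicircular_moment(len(word))


N, j = 3, 0
k_o, k = 6, 1                    # k_o stands for k_1 + ... + k_N, chosen arbitrarily
u = (k_o + k, (j,) * (N + 1))    # u = lambda^{k_o + k} (x) x_j^{N + 1}
print(joint_moment(u, "1*1"), joint_moment(u, "11*"), joint_moment(u, "1*"))
```

The exponent K depends on the pattern, while the moment depends only on n(N+1), which is the content of (99).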

References

1. M. Ahsanullah, Some inferences on semicircular distribution. J. Stat. Theory Appl. 15(3), 207–
213 (2016)
2. H. Bercovici, D. Voiculescu, Superconvergence to the central limit and failure of the Cramer
theorem for free random variables. Probab. Theory Related Fields 103(2), 215–222 (1995)
3. M. Bozejko, W. Ejsmont, T. Hasebe, Noncommutative probability of type D. Int. J. Math.
28(2), 1750010, 30 (2017)
4. M. Bozheuiko, E.V. Litvinov, I.V. Rodionova, An extended anyon Fock space and non-
commutative Meixner-type orthogonal polynomials in the infinite-dimensional case. Usp.
Math. Nauk. 70(5), 75–120 (2015)
5. I. Cho, Semicircular families in free product banach ∗-algebras induced by p-adic number
fields over primes p. Complex Anal. Oper. Theory 11(3), 507–565 (2017)
6. I. Cho, Acting semicircular elements induced by orthogonal projections on von Neumann
algebras. Mathematics 5, 74 (2017). https://doi.org/10.3390/math5040074
7. I. Cho, Semicircular-like laws and the semicircular law induced by orthogonal projections.
Complex Anal. Oper. Theory 12, 1657–1695 (2018)
8. I. Cho, Free stochastic integrals for weighted-semicircular motion induced by orthogonal
projections, in Advanced Topics in Mathematical Analysis. Monograph Ser., Appl. Math. Anal:
Theo., Methods & Appl. (Taylor & Fransis, Abington, 2019)
9. I. Cho, J. Dong, Catalan numbers and free distributions of mutually free multi semicircular
elements. Complex Anal. Oper. Theory. Preprint (2021). Submitted
10. I. Cho, P.E.T. Jorgensen, Semicircular elements induced by p-adic number fields. Opusc. Math.
35(5), 665–703 (2017)
11. I. Cho, P.E.T. Jorgensen, Banach ∗-algebras generated by semicircular elements induced by
certain orthogonal projections. Opusc. Math. 38(4), 501–535 (2018)
12. I. Cho, P.E.T. Jorgensen, Semicircular elements induced by projections on separable Hilbert
spaces, in Operator Theory: Advances & Applications (OT, vol. 275). Linear Systems, Signal
Processing & Hypercomplex Analysis (2019), pp. 167–209
13. A. Connes, Noncommutative Geometry (Academic Press, San Diego, CA, 1994). ISBN: 0-12-
185860-X
14. P.R. Halmos, Hilbert Space Problem Books. Grad. Texts in Math., vol. 19 (Springer, Berlin,
1982). ISBN: 978-0387906850
822 I. Cho and P. Jorgensen

15. I. Kaygorodov, I. Shestakov, Free generic Poisson fields and algebras. Commun. Algebra 46(4)
(2018). https://doi.org/10.1080/00927872.2017.1358269
16. L. Makar-Limanov, I. Shestakov, Polynomials and poisson dependence in free Poisson algebras
and free poisson fields. J. Algebra 349(1), 372–379 (2012)
17. A. Nica, R. Speicher, Lectures on the Combinatorics of Free Probability, London Math. Soc.
Lecture Note Ser., vol. 335, 1st edn. (Cambridge Univ. Press., Cambridge, 2006). ISBN-
13:978-0521858526
18. I. Nourdin, G. Peccati, R. Speicher, Multi-Dimensional Semicircular Limits on the Free Wigner
Chaos. Progr. Probab., vol. 67 (2013), pp. 211–221
19. V. Pata, The central limit theorem for free additive convolution. J. Funct. Anal. 140(2), 359–380
(1996)
20. F. Radulescu, Random matrices, amalgamated free products and subfactors of the C ∗ -algebra
of a free group of nonsingular index. Invent. Math. 115, 347–389 (1994)
21. F. Radulescu, Free Group factors and Hecke operators, notes taken by N. Ozawa, in Proceed.
24-th Conference in Oper. Theo., Theta Advanced Series in Math. (Theta Foundation, 2014)
22. R. Speicher, Combinatorial theory of the free product with amalgamation and operator-valued
free probability theory. Am. Math. Soc. Mem. 132(627) (1998)
23. R. Speicher, A conceptual proof of a basic result in the combinatorial approach to freeness.
Infin. Dimens. Anal. Quantum Probab. Relat. Top. 3, 213–222 (2000)
24. R. Speicher, U. Haagerup, Brown’s spectrial distribution measure for R-diagonal elements in
finite Von Neumann algebras. J. Funct. Anal. 176(2), 331–367 (2000)
25. R. Speicher, T. Kemp, Strong Haagerup inequalities for free R-diagonal elements. J. Funct.
Anal. 251(1), 141–173 (2007)
26. V.S. Vladimirov, p-Adic quantum mechanics. Commun. Math. Phys. 123(4), 659–676 (1989)
27. V.S. Vladimirov, I.V. Volovich, E.I. Zelenov, p-Adic Analysis and Mathematical Physics. Ser.
Soviet & East European Math., vol. 1 (World Scientific, Singapore, 1994). ISBN: 978-981-02-
0880-6
28. D. Voiculescu, Free probability and the Von Neumann algebras of free groups. Rep. Math.
Phys. 55(1), 127–133 (2005)
29. D. Voiculescu, Aspects of free analysis. Jpn. J. Math. 3(2), 163–183 (2008)
30. D. Voiculescu, K. Dykema, A. Nica, Free Random Variables. CRM Monograph Series, vol. 1.
(Am. Math. Soc., Providence, 1992). ISBN-13: 978-0821811405
