Quantum Information and Foundations
Edited by
Giacomo Mauro D’Ariano and Paolo Perinotti
Printed Edition of the Special Issue Published in Entropy
www.mdpi.com/journal/entropy
MDPI • Basel • Beijing • Wuhan • Barcelona • Belgrade • Manchester • Tokyo • Cluj • Tianjin
Special Issue Editors
Giacomo Mauro D’Ariano
QUit Group, Department of Physics, University of Pavia
Italy
Paolo Perinotti
QUit Group, Department of Physics, University of Pavia
Italy
Editorial Office
MDPI
St. Alban-Anlage 66
4052 Basel, Switzerland
This is a reprint of articles from the Special Issue published online in the open access journal Entropy
(ISSN 1099-4300) (available at: https://www.mdpi.com/journal/entropy/special_issues/quantum_information_foundations).
For citation purposes, cite each article independently as indicated on the article page online and as
indicated below:
LastName, A.A.; LastName, B.B.; LastName, C.C. Article Title. Journal Name Year, Article Number,
Page Range.
© 2020 by the authors. Articles in this book are Open Access and distributed under the Creative
Commons Attribution (CC BY) license, which allows users to download, copy and build upon
published articles, as long as the author and publisher are properly credited, which ensures maximum
dissemination and a wider impact of our publications.
The book as a whole is distributed by MDPI under the terms and conditions of the Creative Commons
license CC BY-NC-ND.
Contents
Alessandro Bisio, Giacomo Mauro D’Ariano, Nicola Mosco, Paolo Perinotti and Alessandro Tosini
Solutions of a Two-Particle Interacting Quantum Walk
Reprinted from: Entropy 2018, 20, 435, doi:10.3390/e20060435
Louis H. Kauffman
Iterant Algebra
Reprinted from: Entropy 2017, 19, 347, doi:10.3390/e19070347
Časlav Brukner
A No-Go Theorem for Observer-Independent Facts
Reprinted from: Entropy 2018, 20, 350, doi:10.3390/e20050350
Alexander Wilce
A Royal Road to Quantum Theory (or Thereabouts)
Reprinted from: Entropy 2018, 20, 227, doi:10.3390/e20040227
Giulio Chiribella
Agents, Subsystems, and the Conservation of Information
Reprinted from: Entropy 2018, 20, 358, doi:10.3390/e20050358
Howard Barnum, Ciarán M. Lee, Carlo Maria Scandolo and John Selby
Ruling out Higher-Order Interference from Purity Principles
Reprinted from: Entropy 2017, 19, 253, doi:10.3390/e19060253
Giovanni Amelino-Camelia
Planck-Scale Soccer-Ball Problem: A Case of Mistaken Identity
Reprinted from: Entropy 2017, 19, 400, doi:10.3390/e19080400
Ämin Baumeler and Stefan Wolf
Non-Causal Computation
Reprinted from: Entropy 2017, 19, 326, doi:10.3390/e19070326
Chris Heunen
The Many Classical Faces of Quantum Structures
Reprinted from: Entropy 2017, 19, 144, doi:10.3390/e19040144
Catalina Curceanu, Hexi Shi, Sergio Bartalucci, Sergio Bertolucci, Massimiliano Bazzi,
Carolina Berucci, Mario Bragadireanu, Michael Cargnelli, Alberto Clozza, Luca De Paolis,
Sergio Di Matteo, Jean-Pierre Egger, Carlo Guaraldo, Mihail Iliescu, Johann Marton,
Matthias Laubenstein, Edoardo Milotti, Marco Miliucci, Andreas Pichler, Dorel Pietreanu,
Kristian Piscicchia, Alessandro Scordo, Diana Laura Sirghi, Florin Sirghi, Laura Sperandio,
Oton Vazquez Doce, Eberhard Widmann and Johann Zmeskal
Test of the Pauli Exclusion Principle in the VIP-2 Underground Experiment
Reprinted from: Entropy 2017, 19, 300, doi:10.3390/e19070300
Kristian Piscicchia, Angelo Bassi, Catalina Curceanu, Raffaele Del Grande, Sandro Donadi,
Beatrix C. Hiesmayr and Andreas Pichler
CSL Collapse Model Mapped with the Spontaneous Radiation
Reprinted from: Entropy 2017, 19, 319, doi:10.3390/e19070319
Robert B. Griffiths
Quantum Information: What Is It All About?
Reprinted from: Entropy 2017, 19, 645, doi:10.3390/e19120645
Benjamin F. Dribus
Entropic Phase Maps in Discrete Quantum Gravity
Reprinted from: Entropy 2017, 19, 322, doi:10.3390/e19070322
Kevin Vanslette
Entropic Updating of Probabilities and Density Matrices
Reprinted from: Entropy 2017, 19, 664, doi:10.3390/e19120664
Alain Deville and Yannick Deville
Concepts and Criteria for Blind Quantum Source Separation and Blind Quantum Process Tomography
Reprinted from: Entropy 2017, 19, 311, doi:10.3390/e19070311
About the Special Issue Editors
Giacomo Mauro D’Ariano is full professor at Pavia University, where he teaches Quantum
Mechanics and Foundations of Quantum Theory, and leads the group QUit. He is a fellow of
the American Physical Society and of the Optical Society of America, a member of the Academy
Istituto Lombardo of Scienze e Lettere, of the Center for Photonic Communication and Computing
at Northwestern IL, and of FQXi. He is the (co)author of more than 350 articles in peer-reviewed
physics journals. He started Quantum Information in Italy, where he created a school that spread
scholars worldwide.
entropy
Editorial
Quantum Information and Foundations
Giacomo Mauro D’Ariano * and Paolo Perinotti *
QUIT Group, Dipartimento di Fisica dell’Università di Pavia, Istituto Nazionale di Fisica Nucleare, via Bassi 6,
27100 Pavia, Italy
* Correspondence: [email protected] (G.M.D.); [email protected] (P.P.)
Received: 3 December 2019; Accepted: 10 December 2019; Published: 23 December 2019
The new era of quantum foundations, fed by the quantum information theory experience and
opened in the early 2000s by a series of memorable papers [1–3], led in a few years to a wealth of
results that can all be roughly traced back to the idea of testing quantum theory against new rivals,
instead of struggling in the worn-out attempt to re-understand it within a classical imaginative
world. The first remarkable construction of a toy theory for foundational purposes, to our knowledge,
is represented by Ref. [4].
The study of foil theories along with their informational power led to important progress,
paralleled by an increasing understanding of the new foundational scenario [5]. Most importantly,
this stream of thought is the origin of the new paradigm of the so-called reconstructions, which aim
at singling out quantum theory in a wider scenario of possible theories of elementary physical
systems [6,7]. Grant the authors an unwarranted bit of pride in stating that a clear picture of such a
playground is now available thanks to the formulation of the concept of Operational Probabilistic Theory
(OPT) [8,9]. As a result of the growing interest, we now understand quantum theory as a special kind
of information theory, with postulates that regard the possibility or impossibility of carrying out specific
information processing tasks, instead of directly describing the mathematical structures of Hilbert
spaces, operator algebras, and the like.
One of the future challenges for the informational approach to quantum foundations is then to
embrace the mechanical part of the theory, besides the merely information-theoretic one, or, better,
to remain on top of it.
The time demarcation represented by the year 2000 is of course artificial, just like every symbolic
date, as quantum information has been closely connected to foundations since its very birth. One could
not express this fact in better words than Chris Fuchs’ own: “The title of the NATO Advanced Research
Workshop that gave birth to this volume was ‘Decoherence and its Implications in Quantum Computation
and Information Transfer’ . . . The life of the party was all the talks and conversations on ‘Decoherence and its
Implications in Quantum Foundations’.” [10]. The new approach, moreover, has some deep connections
with the previous experience that can be broadly collected under the name quantum logic. Having said
that, the turn of the century undoubtedly brought the foundations new vigour.
This special issue is meant to witness recent progress in the balanced and fertile interchange
between the developments in application-oriented quantum information theory and those in
foundations. The response of the authors was great, and produced a perfect blend of
flavours. The subjects of the contributions can be broadly classified into three groups.
The first one can be deemed resources. One of the main topics in a well-organised information
theory is quantification and classification of resources. It is nowadays common wisdom that the
resource for quantum computation and information is entanglement, which is incidentally one of
the main resources also for foundations. In a broader view, entanglement is one of the nonclassical
resources allowed by quantum theory.
In the contribution, Nonclassicality by Local Gaussian Unitary Operations for Gaussian States [11],
the authors introduce a measure of nonclassicality for Gaussian states of continuous variable systems
and compare it with other measures of nonclassical correlations. The resource in this case is
nonclassicality, namely, the ability to produce phenomena that are not reproducible by classical
means. The proposed measure of nonclassicality is explicitly computed for a system of two bosonic
modes, and estimated in the general case.
In another respect, one of the primary resources in quantum information is the ability to prepare
states on demand. Methods for predicting the statistical efficiency of sources, or for sharpening our
description of preparations through density matrices in the presence of partial information are then of
the utmost importance.
In the paper, Entropic Updating of Probabilities and Density Matrices [12], the author analyses the task
of reconstructing the theoretical description of a quantum state from partial experimental information.
The standard relative entropy and the Umegaki entropy are derived in parallel from the same set of
design criteria.
Finally, in the contribution, Structure of Multipartite Entanglement in Random Cluster-Like Photonic
Systems [13], the authors analyse the size of multipartite entanglement in randomly generated cluster
states, relating it to the density of nodes in the cluster.
A second collection of contributions regards algorithms and protocols. This selection witnesses
progress in the ongoing challenge towards new algorithms and new tasks. In the contribution, Finding
a Hadamard Matrix by Simulated Quantum Annealing [14], the author analyses quantum algorithms for
finding a Hadamard matrix, which is itself a hard problem. The problem is reformulated in terms of
energy minimisation of spin vectors connected by a complete graph, and approached via path-integral
Monte-Carlo techniques. The scaling properties of the method show that the quantum algorithm
outperforms its classical counterpart in solving this hard problem, providing yet another hint of
quantum supremacy.
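To make the energy-minimisation reformulation concrete, here is a minimal sketch of a cost function on ±1 matrices that vanishes exactly when the matrix is Hadamard, that is, when all pairs of distinct rows are orthogonal. It is an illustration only and is not the energy function used in Ref. [14]; the names `sylvester` and `hadamard_energy` are hypothetical and introduced here for the example.

```python
def sylvester(k):
    """Return the 2^k x 2^k Sylvester matrix, a standard Hadamard matrix."""
    h = [[1]]
    for _ in range(k):
        # Doubling step: [[H, H], [H, -H]]
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def hadamard_energy(m):
    """Illustrative energy: sum of squared inner products of distinct rows.

    For a square +/-1 matrix this is zero if and only if the rows are
    mutually orthogonal, i.e. the matrix is Hadamard.
    """
    n = len(m)
    return sum(
        sum(a * b for a, b in zip(m[i], m[j])) ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )


print(hadamard_energy(sylvester(2)))      # 0: the Sylvester matrix is Hadamard
print(hadamard_energy([[1, 1], [1, 1]]))  # 4: identical rows are penalised
```

An annealing procedure, classical or quantum, would then search the space of sign configurations for a zero of such an energy.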
In the contribution, Quantum Genetic Learning Control of Quantum Ensembles with Hamiltonian
Uncertainties [15], the authors propose a new method for controlling a quantum ensemble of two-level
systems with uncertainties in the parameters of the Hamiltonian system. The method is based on the
combination of a sample-learning control and a quantum genetic algorithm, witnessing the continuous
cross-fertilisation between quantum theory and computer science.
The authors of the contribution, Discrete Wigner Function Derivation of the Aaronson-Gottesman
Tableau Algorithm [16], present a discrete Wigner-function-based simulation algorithm for odd-d
qudits that has the same time and space complexity as the Aaronson–Gottesman algorithm for qubits.
The authors also discuss the differences between the Wigner function algorithm for odd-d and the
Aaronson–Gottesman algorithm for qubits, conjecturing that they are due to the fact that qubits exhibit
state-independent contextuality. This may provide a guide for extending the discrete Wigner function
approach to qubits. Considering this result, one can easily realise how tightly quantum computation
and quantum foundations are bound.
Concepts and Criteria for Blind Quantum Source Separation and Blind Quantum Process Tomography [17]
discusses communication protocols for demixing a signal from the output of a communication line
and establishes properties that were already used without justification in that context. The scenario
considered here involves a pair of electron spins initially prepared in a pure state and then submitted
to an undesired exchange coupling. The authors introduce a criterion for checking that the coupling
does not produce entanglement.
In recent years, after studies that provided a fully algebraic method for analysing quantum
circuits [18], it was realised that there are simple protocols that challenge the circuit model but are still
amenable to a fully algebraic account [19]. Some of these protocols can be interpreted as computations
that call events in a causally indefinite order, thus hinting at interesting foundational questions. In the
article, Non-Causal Computation [20], the authors review recent results on indefinite orders and their
Entropy 2020, 22, 22
potential in computation, replacing the requirement of a global ordering between gates in the
computation with that of mere logical consistency.
The third collection regards foundations. This is the subject that encompasses all the remaining
contributions, fifteen in number, with a very diverse span of subjects, approaches, and techniques.
One of the lessons of the quantum information theoretical approach to foundations is that very
often physical concepts are easily grasped by referring to the operations and processes they can undergo.
In this spirit, the author of the contribution, Agents, Subsystems, and the Conservation of Information [21]
proposes a mathematical modelling for subsystems of physical systems in the general scenario of OPTs,
where subsystems are identified through a subalgebra of the full algebra of operations on the composite
system they are part of. Various cases are then discussed, with a particular focus on quantum systems.
The relevance of appropriately treating subsystems of composite systems might appear somewhat
technical at first sight, but after giving the subject some more thought, one realises that
the notion of subsystem underlies many fundamental questions, e.g., Wigner’s thought experiment
popularly known as the Wigner’s friend paradox. This is the subject of the contribution, A No-Go
Theorem for Observer-Independent Facts [22], which proposes a perspective on the argument of
Frauchiger and Renner [23] proving that “single-world interpretations of quantum theory cannot be
self-consistent”. The author derives a no-go theorem for observer-independent facts, which would be
common to both Wigner and the friend. This result is claimed to undermine one of the assumptions
behind the concept of “self-consistency” by the authors of Ref. [23].
The analysis of conceptual foundational questions is possible thanks to the availability of a
suitable mathematical language. A continuous process of reformulation and reconsideration of the
latter is an important chapter in quantum foundations, as witnessed by the contribution, A Royal
Road to Quantum Theory (or Thereabouts) [24]. Here, the author proposes an alternate perspective for
approaching the problem of reformulating the mathematical language of quantum theory from simple
postulates, based on the theory of Euclidean Jordan algebras. While the paper, as declared by the
author, “fails to derive quantum mechanics”, it derives a more general framework that embraces the
quantum along with alternate, not wildly different possible theories.
In addition, the article Quantum Theory from Rules on Information Acquisition [25] reviews a
reconstruction of the mathematical framework of quantum theory. The starting point here is a set of
rules constraining an observer’s acquisition of information about physical systems. The reconstruction
offers an informational explanation for entanglement, monogamy, and nonlocality, from limited
accessible information and complementarity. The analysis leads to a notion of “conserved informational
charges” that stems from complementarity relations that characterise the unitary group and the set of
pure states.
The review The Many Classical Faces of Quantum Structures [26] addresses a mathematical
reformulation of quantum mechanics in terms of classical mechanics. The standpoint for this approach
is that interpretational problems with quantum mechanics can be phrased precisely by only talking
about empirically accessible information. This review spells out the main points of the abovementioned
approach in terms of the algebraic structures lying behind quantum theory.
After the reconstruction of the mathematical language of quantum theory from information
theoretical postulates was completed, one of the possible developments was the attempt at a
reformulation of quantum mechanics from information processing. In this respect, much progress
was achieved, essentially showing that one can have a fully information-theoretic account of the
basic equations at the core of relativistic quantum field theory, such as Weyl’s and Dirac’s [27,28],
and Maxwell’s [29]. The next difficult step in this direction is introducing interactions. A recent
result in this direction is the study of all possible interacting cellular automata in one dimension
along with a full diagonalization of their two-particle sector [30]. In the contribution, Solutions of a
Two-Particle Interacting Quantum Walk [31], the authors provide an alternative solution of the dynamics
of the abovementioned class of cellular automata based on a path-sum approach.
Once again, on the exploration of the language of quantum foundations, one can read Ruling
out Higher-Order Interference from Purity Principles [32], where the authors analyse the principles of
Causality, Purity Preservation, Pure Sharpness, and Purification in the operational framework of
generalised probabilistic theories, proving that these principles limit interference to second-order,
namely, the interference pattern formed in a multislit experiment is a function of the interference
patterns formed between pairs of slits. This behaviour is typical of quantum theory, where there are no
genuinely new features resulting from considering three slits instead of two. Systems in such theories
correspond to Euclidean Jordan algebras.
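The absence of third-order interference in quantum theory can be checked numerically. The sketch below assumes the simplest model of a three-slit experiment, in which each slit contributes one complex amplitude at a fixed detector position and probabilities follow the Born rule; it evaluates the third-order interference term usually attributed to Sorkin, I3 = P123 − P12 − P13 − P23 + P1 + P2 + P3. The function names and the sample amplitudes are illustrative assumptions, not taken from Ref. [32].

```python
from itertools import combinations


def prob(amplitudes, open_slits):
    """Unnormalised Born-rule detection probability with the given slits open."""
    return abs(sum(amplitudes[s] for s in open_slits)) ** 2


def third_order_interference(amplitudes):
    """Third-order interference term I3 for a three-slit experiment."""
    slits = (0, 1, 2)
    pairs = sum(prob(amplitudes, p) for p in combinations(slits, 2))
    singles = sum(prob(amplitudes, (s,)) for s in slits)
    return prob(amplitudes, slits) - pairs + singles


amps = [0.3 + 0.4j, -0.5 + 0.1j, 0.2 - 0.6j]  # arbitrary illustrative amplitudes
i3 = third_order_interference(amps)
print(abs(i3) < 1e-12)  # I3 vanishes up to floating-point rounding
```

The cancellation holds for any choice of amplitudes, since the squared modulus of a sum contains only pairwise cross terms.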
Another contribution that is focused on the mathematical language and its framework is Leaks:
Quantum, Classical, Intermediate and More [33], where the authors introduce the notion of a leak for
general process theories and identify quantum theory as a theory with minimal leakage, as opposed
to classical theory that has maximal leakage. Leaks are processes that provide leakage of classical
information, and can be introduced in most theories. These processes allow for a category theoretical
account of decoherence as a mechanism for the emergence of classical theory in a quantum scenario.
The authors also discuss the relation of leaks with purity of processes.
One of the main themes in the context of reconstructions and reformulations of quantum theory is
to open the route to possible new post-quantum theories. The article, Iterant Algebra [34] moves a step
beyond quantum theory, starting from a generalisation of the structure of matrix algebra, motivated
by the structure of measurement for discrete processes. Iterant algebra is shown to embrace matrix
and Clifford algebras, and the framework is then applied to discuss various aspects of quantum
mechanics, such as the Schrödinger and Dirac equations, Majorana Fermions, and representations of
the braid group.
We now move to a different chapter in foundations, where one can use the standard mathematical
formalism to face questions and concepts that have interpretational issues. An example is given
by Robust Macroscopic Quantum Measurements in the Presence of Limited Control and Knowledge [35].
The authors tackle the problem of compatibility of quantum behaviour and macroscopic measurements,
focusing on the estimation of the polarization direction for a large system of spin 1/2 particles.
The analysis starts from a model of von Neumann pointer measurement and shows traits of a classical
measurement for an intermediate coupling strength. A relevant part of the contribution is devoted to
the analysis of response of the model against relaxations of the initial assumptions, showing that the
model is robust.
One of the fundamental subjects that attracted interest from the very birth of quantum mechanics
is uncertainty. The study of uncertainty is still lively, and the present special issue includes one
contribution that is devoted to this subject: Measurement Uncertainty Relations for Position and Momentum:
Relative Entropy Formulation [36]. The authors analyse uncertainty as related to incompatibility of
different observables, where the latter is quantified by the amount of unavoidable approximation in a
joint measurement. As a quantifier of information loss, the authors consider relative entropy of a “true”
probability distribution and an approximating one. Such an analysis is applied to obtain lower bound
for the amount of information that is lost by replacing the distributions of the sharp position and
momentum observables, as they could be obtained with two separate experiments, by the marginals of
any smeared joint measurement.
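As a reminder of the quantifier involved, the following minimal sketch computes the relative entropy D(p‖q) = Σ p_i log(p_i/q_i) between a "true" distribution and an approximating one. It assumes finite discrete distributions for simplicity, whereas Ref. [36] deals with position and momentum, which are continuous; the function name and the sample distributions are illustrative assumptions.

```python
import math


def relative_entropy(p, q):
    """Relative entropy D(p||q) in nats for finite discrete distributions.

    Terms with p_i = 0 contribute nothing; if q_i = 0 while p_i > 0,
    the relative entropy is infinite.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue
        if qi == 0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total


true_dist = [0.5, 0.3, 0.2]  # hypothetical "true" distribution
approx = [0.4, 0.4, 0.2]     # hypothetical approximating distribution
print(relative_entropy(true_dist, approx))     # strictly positive
print(relative_entropy(true_dist, true_dist))  # 0.0
```

The quantity is zero only when the two distributions coincide, which is what makes it a natural measure of the information lost in the approximation.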
The renewed interest in fundamental problems produced new approaches to the unification of
quantum mechanics and the theory of gravity. Recent trends in quantum gravity are thus of high
interest for the community working in foundations and, for this reason, we appreciate the value of
a contribution such as Planck-Scale Soccer-Ball Problem: A Case of Mistaken Identity [37], which reports
reflections on the rule of composition for momenta. Over the last decade, nonlinear laws of
composition of momenta were predicted by many approaches to quantum gravity. In order to dispel
concerns about such nonlinearity, the author discusses the subtle difference between the two roles that
a law of momentum composition plays: the first one is related to the description of space-time locality,
and the second one is related to translational invariance. The contribution exhibits an example of
space-time where the local structure provides a nonlinear composition of momenta and yet translational
invariance is expressed by a linear law for the addition of momenta of many-particle systems.
Another contribution focused on a model aiming at a formulation of quantum gravity is Entropic
Phase Maps in Discrete Quantum Gravity [38], where the author makes an attempt based on path
summation over a space of evolutionary pathways in a history configuration space. This approach
enables derivation of discrete Schrödinger-type equations, and mathematical constructions thereof are
used to introduce entropic functions that obey an abstract version of the second law of thermodynamics.
One of the most remarkable consequences of the widespread interest in foundations is a
flourishing of experiments aimed at testing fundamental questions, or challenging established pillars
of quantum theory. A remarkable example is the Pauli exclusion principle for Fermions, which has been
tested in a series of recent experiments, in an ongoing effort that is witnessed also by a contribution
in the present issue, Test of the Pauli Exclusion Principle in the VIP-2 Underground Experiment [39].
Here, the authors report progress of the VIP-2 experiments at the Laboratori Nazionali del Gran Sasso,
seeking a prohibited transition in copper atoms of a 2p orbit electron to the fully populated ground
state, via X-ray analysis. The present limit on the probability for Pauli exclusion principle violation for
electrons set by the VIP experiment is 4.7 × 10⁻²⁹. A first result from the VIP-2 experiment improves
on the VIP limit, while the goal is a gain of two orders of magnitude in the long run.
A second example is the test of spontaneous collapse models, which aim at an objective solution of
the measurement problem that keeps the quantum formalism untouched while tweaking its dynamical
equations. In the contribution, CSL Collapse Model Mapped with the Spontaneous Radiation [40], new upper
limits on the parameters of the Continuous Spontaneous Localization collapse models are extracted.
The main idea behind the experiment is to analyse IGEX data about X-ray emission and compare
them with the spectrum of the spontaneous photon emission process predicted by collapse models.
This study allows for the exclusion of a broad range of the parameter space for CSL models.
Finally, we include a contribution that stands somewhat apart, being more focused on interpretational
issues than on technical ones: Quantum Information: What Is It All About? [41]. In this contribution, the author
answers the provocative question originally posed by John Bell, claiming that, in the consistent
histories approach to quantum theory, information is meant about projectors on subspaces of the
Hilbert space of a system, representing its quantum properties. The main focus is the discussion of
how the single-framework rule—i.e., the rule for assigning probabilities to a projective decomposition
of the identity—for consistent histories avoids contradictions and recovers both classical information
theory and macroscopic physics. Room for issues is left only in the regimes without classical analogue,
where a single framework is not sufficient.
As a concluding remark, we would like to thank all the authors for their contributions and declare
our satisfaction in verifying the ongoing interest in fundamental problems—the only possible fuel for
the science and technology of tomorrow.
Acknowledgments: We express our thanks to the authors of the above contributions, and to the journal Entropy
and MDPI for their support during this work.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Hardy, L. Quantum theory from five reasonable axioms. arXiv 2001, arXiv:quant-ph/0101012.
2. Fuchs, C.A. Quantum Mechanics as Quantum Information (and only a little more). arXiv 2002,
arXiv:quant-ph/0205039.
3. Brassard, G. Is information the key? Nat. Phys. 2005, 1, 2–4. [CrossRef]
4. Hardy, L. Disentangling nonlocality and teleportation. arXiv 1999, arXiv:quant-ph/9906123.
5. Spekkens, R.W. Evidence for the epistemic view of quantum states: A toy theory. Phys. Rev. A 2007, 75, 032110.
[CrossRef]
6. Dakic, B.; Brukner, C. Quantum theory and beyond: Is entanglement special? In Deep Beauty: Understanding
the Quantum World through Mathematical Innovation; Halvorson, H., Ed.; Cambridge University Press:
Cambridge, UK, 2011; pp. 365–392.
7. Masanes, L.; Müller, M.P. A derivation of quantum theory from physical requirements. New J. Phys.
2011, 13, 063001. [CrossRef]
8. Chiribella, G.; D’Ariano, G.M.; Perinotti, P. Probabilistic theories with purification. Phys. Rev. A
2010, 81, 062348. [CrossRef]
9. D’Ariano, G.M.; Chiribella, G.; Perinotti, P. Quantum Theory from First Principles: An Informational Approach;
Cambridge University Press: Cambridge, UK, 2017.
10. Gonis, T.; Gonis, A.; Turchi, P.E. Decoherence and Its Implications in Quantum Computation and
Information Transfer; NATO Science Series: Computer and Systems Sciences; IOS Press: Amsterdam,
The Netherlands, 2001.
11. Wang, Y.; Qi, X.; Hou, J. Nonclassicality by Local Gaussian Unitary Operations for Gaussian States.
Entropy 2018, 20, 266. [CrossRef]
12. Vanslette, K. Entropic Updating of Probabilities and Density Matrices. Entropy 2017, 19, 664. [CrossRef]
13. Ciampini, M.A.; Mataloni, P.; Paternostro, M. Structure of Multipartite Entanglement in Random Cluster-Like
Photonic Systems. Entropy 2017, 19, 473. [CrossRef]
14. Suksmono, A.B. Finding a Hadamard Matrix by Simulated Quantum Annealing. Entropy 2018, 20, 141.
[CrossRef]
15. Arjmandzadeh, A.; Yarahmadi, M. Quantum Genetic Learning Control of Quantum Ensembles with
Hamiltonian Uncertainties. Entropy 2017, 19, 376. [CrossRef]
16. Kocia, L.; Huang, Y.; Love, P. Discrete Wigner Function Derivation of the Aaronson–Gottesman Tableau
Algorithm. Entropy 2017, 19, 353. [CrossRef]
17. Deville, A.; Deville, Y. Concepts and Criteria for Blind Quantum Source Separation and Blind Quantum
Process Tomography. Entropy 2017, 19, 311. [CrossRef]
18. Chiribella, G.; D’Ariano, G.M.; Perinotti, P. Theoretical framework for quantum networks. Phys. Rev. A
2009, 80, 022339. [CrossRef]
19. Chiribella, G.; D’Ariano, G.M.; Perinotti, P.; Valiron, B. Quantum computations without definite causal
structure. Phys. Rev. A 2013, 88, 022318. [CrossRef]
20. Baumeler, Ä.; Wolf, S. Non-Causal Computation. Entropy 2017, 19, 326. [CrossRef]
21. Chiribella, G. Agents, Subsystems, and the Conservation of Information. Entropy 2018, 20, 358. [CrossRef]
22. Brukner, Č. A No-Go Theorem for Observer-Independent Facts. Entropy 2018, 20, 350. [CrossRef]
23. Frauchiger, D.; Renner, R. Quantum theory cannot consistently describe the use of itself. Nat. Commun.
2018, 9, 3711. [CrossRef]
24. Wilce, A. A Royal Road to Quantum Theory (or Thereabouts). Entropy 2018, 20, 227. [CrossRef]
25. Höhn, P.A. Quantum Theory from Rules on Information Acquisition. Entropy 2017, 19, 98. [CrossRef]
26. Heunen, C. The Many Classical Faces of Quantum Structures. Entropy 2017, 19, 144. [CrossRef]
27. Bisio, A.; D’Ariano, G.M.; Tosini, A. Dirac quantum cellular automaton in one dimension: Zitterbewegung
and scattering from potential. Phys. Rev. A 2013, 88, 032301. [CrossRef]
28. D’Ariano, G.M.; Perinotti, P. Derivation of the Dirac equation from principles of information processing.
Phys. Rev. A 2014, 90, 062106. [CrossRef]
29. Bisio, A.; D’Ariano, G.M.; Perinotti, P. Quantum Cellular Automaton Theory of Light. arXiv 2014,
arXiv:1407.6928.
30. Bisio, A.; D’Ariano, G.M.; Perinotti, P.; Tosini, A. Thirring quantum cellular automaton. Phys. Rev. A
2018, 97, 032132. [CrossRef]
31. Bisio, A.; D’Ariano, G.M.; Mosco, N.; Perinotti, P.; Tosini, A. Solutions of a Two-Particle Interacting Quantum
Walk. Entropy 2018, 20, 435. [CrossRef]
32. Barnum, H.; Lee, C.M.; Scandolo, C.M.; Selby, J.H. Ruling out Higher-Order Interference from Purity
Principles. Entropy 2017, 19, 253. [CrossRef]
33. Selby, J.; Coecke, B. Leaks: Quantum, Classical, Intermediate and More. Entropy 2017, 19, 174. [CrossRef]
34. Kauffman, L.H. Iterant Algebra. Entropy 2017, 19, 347. [CrossRef]
35. Renou, M.O.; Gisin, N.; Fröwis, F. Robust Macroscopic Quantum Measurements in the Presence of Limited
Control and Knowledge. Entropy 2018, 20, 39. [CrossRef]
36. Barchielli, A.; Gregoratti, M.; Toigo, A. Measurement Uncertainty Relations for Position and Momentum:
Relative Entropy Formulation. Entropy 2017, 19, 301. [CrossRef]
37. Amelino-Camelia, G. Planck-Scale Soccer-Ball Problem: A Case of Mistaken Identity. Entropy 2017, 19, 400.
[CrossRef]
38. Dribus, B.F. Entropic Phase Maps in Discrete Quantum Gravity. Entropy 2017, 19, 322. [CrossRef]
39. Curceanu, C.; Shi, H.; Bartalucci, S.; Bertolucci, S.; Bazzi, M.; Berucci, C.; Bragadireanu, M.; Cargnelli, M.;
Clozza, A.; De Paolis, L.; et al. Test of the Pauli Exclusion Principle in the VIP-2 Underground Experiment.
Entropy 2017, 19, 300. [CrossRef]
40. Piscicchia, K.; Bassi, A.; Curceanu, C.; Del Grande, R.; Donadi, S.; Hiesmayr, B.C.; Pichler, A. CSL Collapse
Model Mapped with the Spontaneous Radiation. Entropy 2017, 19, 319. [CrossRef]
41. Griffiths, R.B. Quantum Information: What Is It All About? Entropy 2017, 19, 645. [CrossRef]
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Article
Solutions of a Two-Particle Interacting
Quantum Walk
Alessandro Bisio, Giacomo Mauro D’Ariano, Nicola Mosco *, Paolo Perinotti * and
Alessandro Tosini
Dipartimento di Fisica dell’Università di Pavia, Istituto Nazionale di Fisica Nucleare, Pavia 27100, Italy;
[email protected] (A.B.); [email protected] (G.M.D.); [email protected] (A.T.)
* Correspondence: [email protected] (N.M.); [email protected] (P.P.);
Tel.: +39-0382-987675 (N.M. & P.P.)
Received: 22 April 2018; Accepted: 31 May 2018; Published: 5 June 2018
Abstract: We study the solutions of an interacting Fermionic cellular automaton which is the analogue
of the Thirring model with both space and time discrete. We present a derivation of the two-particle
solutions of the automaton that recently appeared in the literature; the derivation exploits the symmetries of the evolution
operator. In the two-particle sector, the evolution operator is given by the sequence of two steps, the
first one corresponding to a unitary interaction activated by two-particle excitation at the same site,
and the second one to two independent one-dimensional Dirac quantum walks. The interaction step
can be regarded as the discrete-time version of the interacting term of some Hamiltonian integrable
system, such as the Hubbard or the Thirring model. The present automaton exhibits scattering
solutions with nontrivial momentum transfer, jumping between different regions of the Brillouin
zone that can be interpreted as Fermion-doubled particles, in stark contrast with the customary
momentum-exchange of the one-dimensional Hamiltonian systems. A further difference compared
to the Hamiltonian model is that there exist bound states for every value of the total momentum and
of the coupling constant. Even in the special case of vanishing coupling, the walk manifests bound
states, for finitely many isolated values of the total momentum. As a complement to the analytical
derivations we show numerical simulations of the interacting evolution.
1. Introduction
Quantum walks (QWs) describe the evolution of one-particle quantum states on a lattice, or, more
generally, on a graph. The quantum walk evolution is linear in the quantum state and the quantum
aspect of the evolution occurs in the interference between the different paths available to the walker.
There are two kinds of quantum walks: continuous-time QWs, where the evolution operator of
the system, given in terms of a Hamiltonian, can be applied at any time (see Farhi et al. [1]),
and discrete-time QWs, where the evolution operator is applied in discrete unitary time-steps.
The discrete-time model, which appeared already in the Feynman discretization of the Dirac
equation [2], was later rediscovered in quantum information [3–7], and proved to be a versatile
platform for various purposes. For example, QWs have been used for empowering quantum algorithms,
such as database search [8,9], or graph isomorphism [10,11]. Moreover, quantum walks have been
studied as a simulation tool for relativistic quantum fields [12–28], and they have been used as discrete
models of spacetime [29–32].
QWs are among the most promising quantum simulators with possible realizations in a variety of
physical systems, such as nuclear magnetic resonance [33,34], trapped ions [35], integrated photonics,
and bulk optics [36–39].
New research perspectives are unfolding in the scenario of multi-particle interacting quantum
walks where two or more walking particles are coupled via nonlinear (in the field) unitary operators.
The properties of these systems are still largely unexplored. Both continuous-time [40] and
discrete-time [41] quantum walks on sparse unweighted graphs are equivalent in power to the quantum
circuit model. However, it is highly non-trivial to design a suitable architecture for universal quantum
computation based on quantum walks. Within this perspective, a possible route has been suggested
in [42] based on interacting multi-particle quantum walks with indistinguishable particles (Bosons
or Fermions), proving that “almost any interaction” is universal. Among the universal interacting
many-body systems are the models with coupling term of the form χδx1 ,x2 n̂( x1 )n̂( x2 ), with n̂( x )
the number operator at site x. The latter two-body interaction lies at the basis of notable integrable
quantum systems in one space dimension such as the Hubbard and the Thirring Hamiltonian models.
The first attempts at the analysis of interacting quantum walks were carried out in [43,44].
More recently, in [45], the authors proposed a discrete-time analogue of the Thirring model, which is
indeed a Fermionic quantum cellular automaton, whose dynamics in the two-particle sector reduces to
an interacting two-particle quantum walk. As for its Hamiltonian counterpart, the discrete-time
interacting walk has been solved analytically in the case of two Fermions. Analogously to any
Hamiltonian integrable system, also in the discrete-time case the solution is based on the Bethe
Ansatz technique. However, discreteness of the evolution prevents the application of the usual Ansatz,
and a new Ansatz has been introduced successfully [45].
In this paper, we present an original simplified derivation of the solution of [45], which exploits
the symmetries of the interacting walk. We present the diagonalization of the evolution operator and
the characterization of its spectrum. We explicitly write the two particle states corresponding to the
scattering solutions of the system, having eigenvalues in the continuous spectrum of the evolution
operator. We then show how the present model predicts the formation of bound states, which are
eigenstates of the interacting walk corresponding to the discrete spectrum. We provide also in this
case the analytic expression of such molecular states.
We comment on the phenomenological differences between the Hamiltonian model and the
discrete-time one. First, we see that the set of possible scattering solutions is larger in the discrete-time
case: for a fixed value of the total momentum, a nontrivial transfer of relative momentum can occur besides
the simple exchange of momentum between the two particles, unlike in the Hamiltonian case.
In addition, the family of bound states appearing in the discrete-time scenario is larger than the
corresponding Hamiltonian one. Indeed, for any fixed value of the coupling constant, a bound state
exists with any possible value of the total momentum, while, for Hamiltonian systems, bound states
cannot have arbitrary total momentum.
Finally, we show that, in the set of solutions for the interacting walk, there are perfectly localized
states (namely, states that lie on a finite number of lattice sites). Moreover, unlike in
Hamiltonian systems, bound states exist also for a null coupling constant; however, this is true only for
finitely many isolated values of the total momentum. In addition to the exact analytical solution of the
dynamics, we show the simulation of some significant initial states.
Entropy 2018, 20, 435
where W is a unitary operator. In the single particle sector, the automaton can be regarded as a
quantum walk on the single-particle Hilbert space H whose evolution unitary operator W is given by
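$$W = \begin{pmatrix} \nu T_x & -i\mu \\ -i\mu & \nu T_x^\dagger \end{pmatrix}, \qquad \nu, \mu > 0, \quad \nu^2 + \mu^2 = 1, \tag{3}$$
where $|\nu|^2 + |\mu|^2 = 1$. The spectrum of the walk is given by $\{e^{-i\omega(p)}, e^{i\omega(p)}\}$, where the dispersion
relation $\omega(p)$ is given by
$$\omega(p) := \operatorname{Arccos}(\nu \cos p), \tag{5}$$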
where Arccos denotes the principal value of the arccosine function. The single-particle eigenstates,
solving the eigenvalue problem
where $n_a(x)$, $a \in \{\uparrow, \downarrow\}$, represents the particle number at site $x$, namely $n_a(x) = \psi_a^\dagger(x)\psi_a(x)$, and $\chi$
is a real coupling constant. Since the interaction term preserves the total number operator, we can
study the automaton for a fixed number of particles. For $N$ interacting particles, we can describe the
evolution in terms of an interacting quantum walk over $\mathcal{H}_N = \mathcal{H}^{\otimes N}$ with the free evolution given by
$W_N := W^{\otimes N}$.
In this work, we focus on the two-particle sector, whose solutions have been derived in [45]. As we
will see, the Thirring walk features molecule states besides scattering solutions. This feature is also shared
by the Hadamard walk with the same on-site interaction [44].
$W_N := W^{\otimes N}$, acting on the Hilbert space $\mathcal{H}_N = \mathcal{H}^{\otimes N}$ and describing the free evolution of the
particles. In order to introduce an interaction, we modify the update rule of the walk with an extra
step $V_{\mathrm{int}}$: $U_N := W_N V_{\mathrm{int}}$. In the present case, the term $V_{\mathrm{int}}$ has the form
Since we focus on the solutions involving the interaction of two particles, it is convenient to write
the walk in the centre of mass basis $|a_1, a_2\rangle |y\rangle |w\rangle$, with $a_1, a_2 \in \{\uparrow, \downarrow\}$, $y = x_1 - x_2$ and $w = x_1 + x_2$.
Therefore, in this basis, the generic Fermionic state is $|\psi\rangle = \sum_{a_1,a_2,y,w} c(a_1, a_2, y, w)\, |a_1, a_2\rangle |y\rangle |w\rangle$ with
$c(a_2, a_1, y, w) = -c(a_1, a_2, -y, w)$. Notice that only the pairs $(y, w)$ with $y$ and $w$ both even or both odd
correspond to physical points in the original basis $x_1, x_2$.
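The parity constraint on the pairs $(y, w)$ can be illustrated with a minimal sketch. The helper names `to_relative`, `is_physical` and `to_positions` are hypothetical and introduced only for this illustration; they are not part of the original derivation.

```python
def to_relative(x1, x2):
    """Map particle positions (x1, x2) to (y, w) = (x1 - x2, x1 + x2)."""
    return x1 - x2, x1 + x2


def is_physical(y, w):
    """A pair (y, w) comes from integer positions iff y and w have the same parity."""
    return (y - w) % 2 == 0


def to_positions(y, w):
    """Inverse map (y, w) -> (x1, x2); defined only for physical pairs."""
    if not is_physical(y, w):
        raise ValueError("y and w must be both even or both odd")
    return (w + y) // 2, (w - y) // 2
```

Every pair produced by `to_relative` passes `is_physical`, whereas a pair such as $(y, w) = (1, 4)$ does not correspond to any integer positions.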
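We define the two-particle walk with both $y$ and $w$ in $\mathbb{Z}$, so that the linear part of the walk can be
written as
$$W_2 = \mu\nu \begin{pmatrix}
\frac{\nu}{\mu} T_w^2 & -iT_y \otimes T_w & -iT_y^\dagger \otimes T_w & -\frac{\mu}{\nu} \\
-iT_y \otimes T_w & \frac{\nu}{\mu} T_y^2 & -\frac{\mu}{\nu} & -iT_y \otimes T_w^\dagger \\
-iT_y^\dagger \otimes T_w & -\frac{\mu}{\nu} & \frac{\nu}{\mu} T_y^{\dagger 2} & -iT_y^\dagger \otimes T_w^\dagger \\
-\frac{\mu}{\nu} & -iT_y \otimes T_w^\dagger & -iT_y^\dagger \otimes T_w^\dagger & \frac{\nu}{\mu} T_w^{\dagger 2}
\end{pmatrix}, \tag{10}$$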
where Ty represents the translation operator in the relative coordinate y, and Tw the translation operator
in the centre of mass coordinate w, whereas the interacting term reads
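$$V_2(\chi) = \begin{pmatrix}
I_y \otimes I_w & 0 & 0 & 0 \\
0 & e^{i\chi\delta_{y,0}} \otimes I_w & 0 & 0 \\
0 & 0 & e^{i\chi\delta_{y,0}} \otimes I_w & 0 \\
0 & 0 & 0 & I_y \otimes I_w
\end{pmatrix}. \tag{11}$$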
This definition gives a walk U2 = W2 V2 (χ) that can be decomposed into two identical copies
of the original walk. Indeed, defining as C the projector on the physical center of mass coordinates,
one has U2 = CU2 C + ( I − C )U2 ( I − C ), where CU2 C and ( I − C )U2 ( I − C ) are unitarily equivalent.
We will then diagonalize the operator U2 , reminding readers that the physical solutions will be given
by projecting the eigenvectors with C.
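The two-step update $U_2 = W_2 V_2(\chi)$ can be sketched directly in the position basis $x_1, x_2$. The following is a minimal illustration, not the code used for the figures of the paper: it assumes the convention $T_x |x\rangle = |x+1\rangle$ (the direction of the shift is not fixed by the text above), takes $\mu = 0.6$ and $\chi = \pi/2$ as example values, and uses hypothetical function names.

```python
import cmath
import math
from collections import defaultdict

UP, DOWN = 0, 1


def single_step(a, x, nu, mu):
    """Image of the basis state |a>|x> under one step of W, Equation (3).
    Assumption: T_x |x> = |x + 1>."""
    if a == UP:
        return [((UP, x + 1), nu), ((DOWN, x), -1j * mu)]
    return [((UP, x), -1j * mu), ((DOWN, x - 1), nu)]


def step(state, nu, mu, chi):
    """Apply U_2 = W_2 V_2(chi): on-site phase first, then W (x) W."""
    new = defaultdict(complex)
    for (a1, x1, a2, x2), amp in state.items():
        # V_2(chi): phase only for opposite internal states on the same site (y = 0).
        if x1 == x2 and a1 != a2:
            amp *= cmath.exp(1j * chi)
        for (b1, y1), c1 in single_step(a1, x1, nu, mu):
            for (b2, y2), c2 in single_step(a2, x2, nu, mu):
                new[(b1, y1, b2, y2)] += amp * c1 * c2
    return dict(new)


def norm2(state):
    """Squared norm of a two-particle state stored as {(a1, x1, a2, x2): amplitude}."""
    return sum(abs(v) ** 2 for v in state.values())


mu = 0.6
nu = math.sqrt(1 - mu ** 2)
# Singlet state localized at the origin, antisymmetric under particle exchange.
state = {(UP, 0, DOWN, 0): 1 / math.sqrt(2), (DOWN, 0, UP, 0): -1 / math.sqrt(2)}
for _ in range(8):
    state = step(state, nu, mu, chi=math.pi / 2)
```

Since both steps are unitary and symmetric under particle exchange, the evolved state keeps unit norm and remains antisymmetric, which provides a basic sanity check of the sketch.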
Introducing the (half) relative momentum $k = \frac{1}{2}(p_1 - p_2)$ and the (half) total momentum
$p = \frac{1}{2}(p_1 + p_2)$, the free evolution of the two particles is written in the momentum representation as
$$W_2 = \int dk\, dp\; W_2(p, k) \otimes |k\rangle\langle k| \otimes |p\rangle\langle p|, \tag{12}$$
$$W_2(p, k) = W(p_1) \otimes W(p_2). \tag{13}$$
$$W_2(p, k)\, v_k^{sr} = e^{-i\omega_{sr}(p,k)}\, v_k^{sr}, \tag{14}$$
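The eigenvalue relation of Equation (14) can be checked numerically with a short sketch. It relies on two assumptions that are not spelled out in the text above: the translation $T_x$ is taken to act as $e^{ip}$ in the momentum representation, and the two-particle dispersion relation is taken to be $\omega_{sr}(p,k) = s\,\omega(p+k) + r\,\omega(p-k)$, as implied by the tensor-product structure of Equation (13) together with the single-particle eigenvectors of Equation (A1). All function names are hypothetical.

```python
import cmath
import math


def omega(p, nu):
    """Single-particle dispersion relation, Equation (5)."""
    return math.acos(nu * math.cos(p))


def W(p, nu, mu):
    """Assumed momentum representation of Equation (3), with T_x -> exp(ip)."""
    return [[nu * cmath.exp(1j * p), -1j * mu],
            [-1j * mu, nu * cmath.exp(-1j * p)]]


def v(s, p, nu, mu):
    """Unnormalized single-particle eigenvector (-i mu, g_s(p)) of Equation (A1)."""
    g = -1j * (s * math.sin(omega(p, nu)) + nu * math.sin(p))
    return [-1j * mu, g]


def kron_mat(A, B):
    """Kronecker product of two 2x2 matrices."""
    return [[A[i][j] * B[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]


def kron_vec(a, b):
    """Kronecker product of two vectors."""
    return [x * y for x in a for y in b]


def residual(s, r, p, k, nu, mu):
    """Largest component of W_2(p,k) v - exp(-i omega_sr) v."""
    W2 = kron_mat(W(p + k, nu, mu), W(p - k, nu, mu))
    vec = kron_vec(v(s, p + k, nu, mu), v(r, p - k, nu, mu))
    w_sr = s * omega(p + k, nu) + r * omega(p - k, nu)
    lhs = [sum(W2[i][j] * vec[j] for j in range(4)) for i in range(4)]
    return max(abs(l - cmath.exp(-1j * w_sr) * x) for l, x in zip(lhs, vec))
```

For instance, `residual(+1, -1, 0.3, 0.7, 0.8, 0.6)` vanishes up to floating-point rounding, and the same holds for the other three sign choices.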
with $|\psi(y)\rangle \in \mathbb{C}^4$. In the centre of mass basis, the antisymmetry condition reads
Although the range of the variable p is the interval (−π, π ], it is possible to show that one can
restrict the study of the walk to the interval [0, π/2]. On the one hand, the two-particle walk transforms
unitarily under a parity transformation in the momentum space. Starting from the single particle walk,
W ( p) transforms under a parity transformation as
$$W_2(-p, y) = \sigma_x \otimes \sigma_x\, E\, W_2(p, y)\, E\, \sigma_x \otimes \sigma_x. \tag{21}$$
$$W_2(p + \pi, y) = \sigma_z \otimes \sigma_z\, W_2(p, y)\, \sigma_z \otimes \sigma_z, \tag{22}$$
The Thirring walk features also another symmetry that can be exploited to simplify the derivation
of the solutions. It is easy to check that the walk operator U2 ( p, χ) = W2 ( p)V (χ) commutes with the
projector defined by
$$P := \begin{pmatrix}
P_o & 0 & 0 & 0 \\
0 & P_e & 0 & 0 \\
0 & 0 & P_e & 0 \\
0 & 0 & 0 & P_o
\end{pmatrix}, \tag{23}$$
where Pe and Po are the projectors on the even and the odd subspaces, respectively:
The projector P induces a splitting of the total Hilbert space H in two subspaces PH and
( I − P)H , with the interaction term acting non-trivially only in the subspace PH . In the
complementary subspace ( I − P)H , the evolution is free for Fermionic particles. This means that
solutions of the free theory are also solutions of the interacting one, as opposed to the Bosonic case for
which the interaction is non-trivial also in ( I − P)H .
The restriction of the walk to the subspace PH entails that the eigenvalue problem is equivalent
to the following system of equations:
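$$\begin{cases}
e^{-i\omega}\psi_1(z) = \nu^2 e^{i2p}\psi_1(z) - i\mu\nu e^{ip} e^{i\chi\delta_{z,0}}\psi_2(z) - i\mu\nu e^{ip} e^{i\chi\delta_{z,-1}}\psi_3(z+1) - \mu^2\psi_4(z), \\
e^{-i\omega}\psi_2(z) = -i\mu\nu e^{ip}\psi_1(z-1) + \nu^2 e^{i\chi\delta_{z,1}}\psi_2(z-1) - \mu^2 e^{i\chi\delta_{z,0}}\psi_3(z) - i\mu\nu e^{-ip}\psi_4(z-1), \\
e^{-i\omega}\psi_3(z) = -i\mu\nu e^{ip}\psi_1(z) - \mu^2 e^{i\chi\delta_{z,0}}\psi_2(z) + \nu^2 e^{i\chi\delta_{z,-1}}\psi_3(z+1) - i\mu\nu e^{-ip}\psi_4(z), \\
e^{-i\omega}\psi_4(z) = -\mu^2\psi_1(z) - i\mu\nu e^{-ip} e^{i\chi\delta_{z,0}}\psi_2(z) - i\mu\nu e^{-ip} e^{i\chi\delta_{z,-1}}\psi_3(z+1) + \nu^2 e^{-i2p}\psi_4(z).
\end{cases} \tag{28}$$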
and
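$$\psi(z) = \begin{cases}
\sum_{s,r=\pm} \int_S dk\, g_\omega^{sr}(k)\, w_k^{sr}(z), & z > 0, \\
\text{antisymmetrized}, & z < 0,
\end{cases}
\qquad
w_k^{sr}(z) := \begin{pmatrix}
v_k^{sr,1}\, e^{-i(2z+1)k} \\
v_k^{sr,2}\, e^{-i(2z)k} \\
v_k^{sr,3}\, e^{-i(2z)k} \\
v_k^{sr,4}\, e^{-i(2z+1)k}
\end{pmatrix},$$
$$\psi(0) = \begin{pmatrix}
\sum_{s,r=\pm} \int_S dk\, g_\omega^{sr}(k)\, v_k^{sr,1} \\
\xi \\
-\xi \\
\sum_{s,r=\pm} \int_S dk\, g_\omega^{sr}(k)\, v_k^{sr,4}
\end{pmatrix}, \tag{30}$$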
$$e^{-i\omega} \neq e^{-i\omega_{sr}(p,k)} \implies g_\omega^{sr}(k) = 0. \tag{31}$$
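Since $e^{-i\omega_{sr}(p,k)}$ has to be an eigenvalue of $U_2(\chi, p)$, $\omega_{sr}(p, k)$ must be real and thus $k \in \Gamma_f$ or
$k \in \Gamma_l$ with $l = 0, \pm 1, 2$, so we conveniently define the sets:
$$\Omega_f^{sr} := \left\{ e^{-i\omega_{sr}(p,k)} \,\middle|\, k \in \Gamma_f \right\}, \qquad \Omega_l^{sr} := \left\{ e^{-i\omega_{sr}(p,k)} \,\middle|\, k \in \Gamma_l \right\}, \tag{33}$$
$$\Gamma_f := \{ k \in S \mid k_I = 0 \}, \qquad \Gamma_l := \left\{ k \in S \,\middle|\, k_R = l\,\frac{\pi}{2} \right\}, \qquad l = 0, \pm 1, 2. \tag{34}$$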
It is easy to see that $\Omega_f^{sr} \cap \Omega_l^{sr} = \emptyset$ for all $s$, $r$ and $l$, and the range of the function $e^{-i\omega_{sr}(p,k)}$
covers the entire unit circle except for the points $e^{\pm i2p}$. Therefore, we can discuss separately the case
$e^{-i\omega} \in \Omega_f^{sr}$ and the case $e^{-i\omega} \in \Omega_l^{sr}$. A solution with $e^{-i\omega} = e^{\pm i2p}$ actually exists, corresponding to the
function of Equation (29), and it will be discussed in Section 5.3.
Let us start with the case $e^{-i\omega} \in \Omega_f^{sr}$, which will lead to the characterization of the continuous
spectrum of the Thirring walk $U_2(\chi, p)$ and of the scattering solutions.
indeed, as one can notice from Figure 1, the lines $\omega = \pm 2p$ lie entirely in the gaps between the curves
$\omega = \pm 2\omega(p)$ and $\omega = \pm(\pi - 2\operatorname{Arccos}(\nu \sin p))$. The solution is thus the one given in Equation (30).
One can prove that $\Omega_f^{++} = \Omega_f^{--}$ and $\Omega_f^{+-} = \Omega_f^{-+}$. Furthermore, as one can notice from Figure 2,
there are four values of the triple $(s, r, k)$ such that $e^{-i\omega_{sr}(p,k)} = e^{-i\omega}$ for a given value of $e^{-i\omega}$: if the
triple $(+, +, k)$ is a solution, so are $(+, +, -k)$, $(-, -, \pi - k)$ and $(-, -, k - \pi)$; and if $(+, -, k)$ is
a solution, then also $(+, -, \pi - k)$, $(-, +, -k)$ and $(-, +, k - \pi)$ are solutions. This result greatly
simplifies Equation (30). Indeed, the sum over $s$, $r$ and the integral over $k$ reduce to the sum of
four terms:
As we will see, the original problem can be simplified in this way to an algebraic problem
with a finite set of equations. We note that the fact that the equation $e^{-i\omega_{sr}(p,k)} = e^{-i\omega}$ has a finite
number of solutions is a consequence of the fact that we are considering a model in one spatial
dimension. However, in analogous one-dimensional Hamiltonian models (e.g., the Hubbard model),
the degeneracy of the eigenvalues is two.
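The fourfold degeneracy can be verified numerically. The sketch below assumes that the two-particle dispersion relation is $\omega_{sr}(p,k) = s\,\omega(p+k) + r\,\omega(p-k)$ (an assumption consistent with Equations (13) and (A1), but not stated explicitly in the text above) and uses the momentum labels of the vectors listed in Equations (A6) and (A7). The values of $\nu$, $p$ and $k$ are arbitrary examples.

```python
import cmath
import math


def omega(p, nu):
    """Single-particle dispersion relation, Equation (5)."""
    return math.acos(nu * math.cos(p))


def eigenvalue(s, r, k, p, nu):
    """exp(-i omega_sr(p,k)), assuming omega_sr = s*omega(p+k) + r*omega(p-k)."""
    return cmath.exp(-1j * (s * omega(p + k, nu) + r * omega(p - k, nu)))


nu, p, k = 0.8, math.pi / 6, 0.4

# Triples (s, r, k) labelling the vectors of Equation (A6) ...
same_sign = [(+1, +1, k), (+1, +1, -k), (-1, -1, math.pi - k), (-1, -1, k - math.pi)]
# ... and of Equation (A7).
opposite_sign = [(+1, -1, k), (-1, +1, -k), (+1, -1, math.pi - k), (-1, +1, k - math.pi)]

values_same = [eigenvalue(s, r, q, p, nu) for s, r, q in same_sign]
values_opposite = [eigenvalue(s, r, q, p, nu) for s, r, q in opposite_sign]
```

Within each list the four eigenvalues coincide up to rounding, so each value of $e^{-i\omega}$ in the continuous spectrum is reached by four different relative momenta.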
[Plot: ω versus p.]
Figure 1. Continuous spectrum of the two-particle walk as a function of the total momentum
$p \in [0, \pi/2]$ with mass parameter m = 0.7. The continuous spectrum is the same as in the free
case. The solid blue curves are described by the functions $\omega = \pm 2\omega(p)$, and the red ones by
$\omega = \pm(\pi - 2\operatorname{Arccos}(\nu \sin p))$. As one can notice, the light-red lines $\omega = \pm 2p$ lie entirely in the
gaps between the solid curves, highlighting the fact that $e^{\pm i2p}$ is not in the range of $e^{-i\omega_{sr}(p,k)}$ for
$p \neq 0, \pi/2$ (see text).
[Plot: ω versus k; curves labelled ++, −−, +−, −+.]
Figure 2. Spectrum of the walk for m = 0.6 and $p = \pi/6$ as a function of $k$. The colours highlight
the different ranges of eigenvalues corresponding to the dispersion relation $\omega_{sr}(p, k)$. The range of
$\omega_{sr}(p, k)$ is understood to be computed mod $2\pi$. One can notice that there are four values of the
relative momentum $k$ having the same value of the dispersion relation ($\omega = 2$ in the figure). This is in
contrast to the Hamiltonian model for which there are only two solutions.
Let us consider for the sake of simplicity the solution of the kind $\psi_k^{+,j}(z)$, since the other one can
be analysed in a similar way. Using the notation of Appendix A, Equation (35) reduces to the expressions
(dropping the + superscript)
We notice that now the number of unknown parameters is further reduced to three, namely
λ, ρ, and ξ. Clearly, one of the parameters can be fixed by choosing arbitrarily the normalization.
From now on, we fix λ = 1 and define T+ := ρ. Equation (36) has to satisfy the recurrence relations
of Equation (28) for z = 0 and z = 1, while, for z > 1, it is automatically satisfied. For z = 0,
Equation (28) becomes
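$$e^{-i\omega}\psi_k^1(0) = \nu^2 e^{i2p}\psi_k^1(0) - i\mu\nu e^{ip} e^{i\chi}\xi - i\mu\nu e^{ip}\psi_k^3(1) - \mu^2\psi_k^4(0), \tag{37}$$
$$e^{-i\omega}\xi = i\mu\nu e^{ip}\psi_k^1(0) - \nu^2\psi_k^3(1) + \mu^2 e^{i\chi}\xi + i\mu\nu e^{-ip}\psi_k^4(0), \tag{38}$$
$$-e^{-i\omega}\xi = -i\mu\nu e^{ip}\psi_k^1(0) - \mu^2 e^{i\chi}\xi + \nu^2\psi_k^3(1) - i\mu\nu e^{-ip}\psi_k^4(0), \tag{39}$$
$$e^{-i\omega}\psi_k^4(0) = -\mu^2\psi_k^1(0) - i\mu\nu e^{-ip} e^{i\chi}\xi - i\mu\nu e^{-ip}\psi_k^3(1) + \nu^2 e^{-i2p}\psi_k^4(0). \tag{40}$$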
Starting from Equation (37), we can notice that $\nu^2 e^{i2p} a - i\mu\nu e^{ip} e^{ik} b - i\mu\nu e^{ip} e^{-ik} c - \mu^2 d = e^{-i\omega} a$,
where we employed the notation of Appendix A, so that we obtain $\xi = e^{-i\chi}(b - T_+ c)$. We can
then substitute this expression in Equation (39) and use the relations
$$e^{-i\chi}(b - T_+ c) = T_+ b - c, \tag{43}$$
and thus
$$T_+ = \frac{c + e^{-i\chi} b}{b + e^{-i\chi} c} = \frac{g_+(p+k) + e^{-i\chi} g_+(p-k)}{g_+(p-k) + e^{-i\chi} g_+(p+k)}. \tag{44}$$
For these values of $\xi$ and $T_+$ one can verify that Equation (28) is satisfied also for $z = 1$,
thus concluding the derivation. For the solution of the kind $\psi_k^{-,j}(z)$, we can follow a similar reasoning,
obtaining the analogous quantity $T_-$:
$$T_- := \frac{g_+(p+k) + e^{-i\chi} g_-(p-k)}{g_-(p-k) + e^{-i\chi} g_+(p+k)}. \tag{45}$$
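As a quick numerical illustration of Equations (44) and (45), the sketch below evaluates $T_\pm$ for real momenta, using $g_s(p)$ from Equation (A1). Because $g_s(p)$ is purely imaginary for real $p$, numerator and denominator have the same modulus, so $T_\pm$ is a pure phase. The function names and the sample values are hypothetical choices made for this example.

```python
import cmath
import math


def omega(p, nu):
    """Single-particle dispersion relation, Equation (5)."""
    return math.acos(nu * math.cos(p))


def g(s, p, nu):
    """g_s(p) = -i (s sin omega(p) + nu sin p), Equation (A1)."""
    return -1j * (s * math.sin(omega(p, nu)) + nu * math.sin(p))


def T_plus(p, k, chi, nu):
    """Coefficient T_+ of Equation (44)."""
    a, b = g(+1, p + k, nu), g(+1, p - k, nu)
    e = cmath.exp(-1j * chi)
    return (a + e * b) / (b + e * a)


def T_minus(p, k, chi, nu):
    """Coefficient T_- of Equation (45)."""
    a, b = g(+1, p + k, nu), g(-1, p - k, nu)
    e = cmath.exp(-1j * chi)
    return (a + e * b) / (b + e * a)
```

For example, `abs(T_plus(0.3, 0.7, 1.1, 0.8))` equals 1 up to rounding, and `T_plus` reduces to 1 when $\chi = 0$, i.e., when the interaction is switched off.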
We can interpret such a solution as a scattering of plane waves for which the coefficient $T_\pm$ plays
the role of the transmission coefficient. Since the total momentum is a conserved quantity, the two
particles can only exchange their momenta, as expected from a theory in one dimension. Furthermore,
for each value $k$ of the relative momentum, the two particles can also acquire an additional phase of $\pi$.
As the interaction is a compact perturbation of the free evolution, the continuous spectrum is the same
as that of the free walk. Equation (46) provides the generalized eigenvectors of $U_2(\chi, p)$ corresponding
to the continuous spectrum $\sigma_c = \Omega_f^{++} \cup \Omega_f^{+-}$.
In Figure 3, the discrete spectrum of the interacting walk together with the continuous
spectrum as a function of the total momentum p is depicted. The solid curves in the gaps between
the continuous bands denote the discrete spectrum for different values of the coupling constant
χ = 2π/3, 3π/7, −3π/7, −2π/3. Molecule states appear also in the Hadamard walk with the same
on-site interaction [44].
Referring to Figure 4, we show the evolution of two particles initially prepared in a singlet
state localized at the origin. From the figure, one can appreciate the appearance of the bound
state component that has non-vanishing overlap with the initial state. The bound state,
being exponentially decaying in the relative coordinate $y$, is localized on the diagonal of the plot, that
is, where the two particles lie at the same point.
In Figure 5, the probability distribution of the bound state corresponding to a choice of parameters
χ = 0.2π and p = 0.035π is depicted. The plot highlights the exponential decay of the tails, which is
the characterizing feature of the bound state.
[Plot: ω versus p; curves labelled χ = 2π/3, 3π/7, −3π/7, −2π/3.]
Figure 3. Complete spectrum of the two-particle Thirring walk as a function of the total momentum
$p$ with mass parameter m = 0.7. The continuous spectrum is as in Figure 1. The solid lines in the
gaps show the point spectrum for different values of the coupling constant: from top to bottom,
$\chi = 2\pi/3, 3\pi/7, -3\pi/7, -2\pi/3$. It is worth noticing that, for each pair $(\chi, p)$, there is only one value
in the discrete spectrum. The light-red lines $\omega = \pm 2p$ lie entirely in the gap between the continuous
bands, highlighting the fact that $e^{\pm i2p}$ is not in the range of $e^{-i\omega_{sr}(p,k)}$ for $p \neq 0, \pi/2$; for a given
coupling constant $\chi$, $e^{\pm i2p}$ is an eigenvalue for $p = \chi/2$.
Figure 4. We show for comparison the free evolution (a) and the interacting one (b), highlighting the
appearance of bound-state components along the diagonal, namely when the two particles are at the
same site (i.e., $x_1 = x_2$), where $x_1$ and $x_2$ denote the positions of the two particles. The plots show
the probability distribution $p(x_1, x_2)$ in position space after t = 32 time-steps. The chosen value of
the mass parameter is m = 0.6 and the coupling constant is $\chi = \pi/2$. The two particles are initially
prepared in a singlet state located at the origin.
Figure 5. We show the evolution of a bound state of the two particles peaked around the value of the total
momentum $p = 0.035\pi$. The mass parameter is m = 0.6 and the coupling constant is $\chi = 0.2\pi$. Panel (a)
shows the probability distribution of the initial state. Panel (b) shows the probability distribution
of the evolved state after t = 128 time-steps. One can notice that, in the relative coordinate $x_1 - x_2$,
the probability distribution remains concentrated on the diagonal, highlighting the fact that the two
particles are in a bound state. The diffusion of the state happens only in the centre of mass coordinate.
Subtracting the first and the last equations of (28) using (50), we obtain the following equation:
If both ζ and ζ are non-zero, one can prove that a solution does not exist and thus we have to
consider the two cases ζ = 0 and ζ = 0 separately. Starting from ζ = 0, Equation (51) imposes that
e−iω = ei2p , meaning that, if a solution exists in this case, it is an eigenvector corresponding to the
eigenvalue ei2p . From the second equation of (28), we obtain the relation
and, using the first equation of (28), it turns out that a solution exists only if eiχ = ei2p , as expected,
since, otherwise, the case of Section 5.2 would have held. The other case, namely e−iω = e−i2p , can be
studied analogously. Let us, then, denote as $|\psi_{\pm\infty}\rangle$ such proper eigenvectors with eigenvalue $e^{\pm i2p}$
for $e^{i\chi} = e^{\pm i2p}$ and, choosing $\eta = \frac{\mu}{\nu}$ as the value for the free parameter $\eta$, we obtain the following
expression for $|\psi_{\pm\infty}\rangle$:
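$$|\psi_{\pm\infty}\rangle = i e^{\pm ip} \begin{pmatrix} \frac{1 \pm 1}{2} \\ 0 \\ 0 \\ \frac{1 \mp 1}{2} \end{pmatrix} \otimes |-1\rangle
+ \begin{pmatrix} 0 \\ \frac{\mu}{\nu} \\ -\frac{\mu}{\nu} \\ 0 \end{pmatrix} \otimes |0\rangle
+ i e^{\pm ip} \begin{pmatrix} -\frac{1 \pm 1}{2} \\ 0 \\ 0 \\ -\frac{1 \mp 1}{2} \end{pmatrix} \otimes |1\rangle. \tag{53}$$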
Such solutions provide a special case of molecule states (namely, proper eigenvectors of U2 (χ, p)),
being localized on few sites, and differ from the previous solutions showing an exponential decay in
the relative coordinate.
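The perfectly localized eigenvectors can be checked against the recurrence relations of Equation (28). The following sketch is an illustration under stated assumptions: it encodes the state as a map $z \mapsto (\psi_1, \psi_2, \psi_3, \psi_4)$, with $\psi_1, \psi_4$ living at $y = 2z + 1$ and $\psi_2, \psi_3$ at $y = 2z$ as in Equation (30); it takes the amplitudes at $y = -1, 0, 1$ as read from Equation (53), with the components at $y = \pm 1$ fixed by antisymmetry; and it sets $\chi = \pm 2p$. Function names and parameter values are hypothetical.

```python
import cmath
import math


def apply_U(psi, p, chi, nu, mu, zs):
    """Right-hand side of Equation (28); psi maps z -> [psi1, psi2, psi3, psi4]."""
    zero = [0j] * 4

    def get(z):
        return psi.get(z, zero)

    def ph(cond):
        # Interaction phase exp(i chi), active only where the Kronecker delta is 1.
        return cmath.exp(1j * chi) if cond else 1.0

    e1, e2 = cmath.exp(1j * p), cmath.exp(2j * p)
    out = {}
    for z in zs:
        c, m, n = get(z), get(z - 1), get(z + 1)
        out[z] = [
            nu**2 * e2 * c[0] - 1j * mu * nu * e1 * ph(z == 0) * c[1]
            - 1j * mu * nu * e1 * ph(z == -1) * n[2] - mu**2 * c[3],
            -1j * mu * nu * e1 * m[0] + nu**2 * ph(z == 1) * m[1]
            - mu**2 * ph(z == 0) * c[2] - 1j * mu * nu / e1 * m[3],
            -1j * mu * nu * e1 * c[0] - mu**2 * ph(z == 0) * c[1]
            + nu**2 * ph(z == -1) * n[2] - 1j * mu * nu / e1 * c[3],
            -mu**2 * c[0] - 1j * mu * nu / e1 * ph(z == 0) * c[1]
            - 1j * mu * nu / e1 * ph(z == -1) * n[2] + nu**2 / e2 * c[3],
        ]
    return out


def localized_residual(sign, p, mu):
    """Deviation of the localized state from an eigenvector with eigenvalue exp(sign*2ip)."""
    nu = math.sqrt(1 - mu**2)
    up, dn = (1 + sign) / 2, (1 - sign) / 2
    pref = 1j * cmath.exp(1j * sign * p)
    psi = {
        -1: [pref * up, 0j, 0j, pref * dn],              # y = -1
        0: [-pref * up, mu / nu, -mu / nu, -pref * dn],  # y = 0 and y = +1
    }
    zs = range(-3, 4)
    out = apply_U(psi, p, sign * 2 * p, nu, mu, zs)
    lam = cmath.exp(1j * sign * 2 * p)
    zero = [0j] * 4
    return max(abs(o - lam * x) for z in zs for o, x in zip(out[z], psi.get(z, zero)))
```

With, e.g., $\mu = 0.6$ and $p = 0.3$, `localized_residual(+1, 0.3, 0.6)` and `localized_residual(-1, 0.3, 0.6)` both vanish up to rounding, confirming that the state is supported on $y \in \{-1, 0, 1\}$ only.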
[Plots (a) and (b): p(y) versus y.]
Figure 6. We show the case of two proper eigenstates for $p = 0$. In both cases the mass parameter
is m = 0.6. (a): probability distribution in the relative coordinate $y$ of $\int dk\, (v_k^{+-} - v_k^{-+})\, e^{-iyk}$.
(b): probability distribution in the $y$-coordinate of $\int dk\, (v_k^{+-} + v_k^{-+})\, e^{-iyk}$.
6. Conclusions
In this work, we reviewed the Thirring quantum walk [45], providing a simplified derivation of
its solutions for Fermionic particles. The simplified derivation relies on the symmetry properties of the
walk evolution operator, which allow one to separate the subspace of solutions affected by the interaction
from the subspace where the interaction step acts trivially. The interaction term is the most general
number-preserving interaction in one dimension, whereas the free evolution is provided by the Dirac
QW [17].
We showed the explicit derivation of the scattering solutions (solutions for the continuous
spectrum) as well as of the bound-state solutions. The Thirring walk also features localized bound
states (namely, states whose support is finite on the lattice) when $e^{-i\omega} = e^{\pm i2p}$. Such solutions exist
only when the coupling constant is $\chi = 2p$. Figure 4 depicts the evolution of a perfectly localized
state, showing its overlap with bound-state components. In Figure 5, we reported the evolution
of a bound state of the two particles peaked around a certain value of the total momentum: one can
appreciate that the probability distribution remains localized on the main diagonal during the evolution.
Finally, we showed that bound states exist also for a vanishing coupling constant—even though
this is true only for a finite set of values of the total momentum p—which is a striking difference
between the discrete model of the present work and corresponding Hamiltonian systems.
Author Contributions: G.M.D. and P.P. conceived and designed the model; A.B. and A.T. performed the calculations;
N.M. reviewed the derivation exploiting the symmetries of the walk, and performed the numerical analysis.
Funding: This publication was made possible through the support of a grant from the John Templeton Foundation
under the project ID# 60609 Causal Quantum Structures. The opinions expressed in this publication are those of the
authors and do not necessarily reflect the views of the John Templeton Foundation.
Conflicts of Interest: The authors declare no conflict of interest. The founding sponsors had no role in the design
of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, and in the
decision to publish the results.
Appendix A. Notation
For the single particle walk of Equation (3), the eigenstates can be written as
$$v_p^s = \frac{1}{|N_s(p)|} \begin{pmatrix} -i\mu \\ g_s(p) \end{pmatrix}, \qquad g_s(p) := -i\,(s \sin\omega(p) + \nu \sin p), \tag{A1}$$
with $|N_s(p)|^2 = \mu^2 + |g_s(p)|^2$. For the two-particle walk, we define $v_k^{sr} := v_{p+k}^s \otimes v_{p-k}^r$. If $s = r$,
$$v_{-k}^{++} \propto \begin{pmatrix} -\mu^2 \\ -i\mu\, g_+(p+k) \\ -i\mu\, g_+(p-k) \\ g_+(p+k)\, g_+(p-k) \end{pmatrix}, \qquad
v_{k-\pi}^{--} \propto \begin{pmatrix} -\mu^2 \\ i\mu\, g_+(p-k) \\ i\mu\, g_+(p+k) \\ g_+(p+k)\, g_+(p-k) \end{pmatrix}. \tag{A3}$$
$$v_k^{+-} \propto \begin{pmatrix} -\mu^2 \\ -i\mu\, g_-(p-k) \\ -i\mu\, g_+(p+k) \\ g_+(p+k)\, g_-(p-k) \end{pmatrix}, \qquad
v_{\pi-k}^{+-} \propto \begin{pmatrix} -\mu^2 \\ i\mu\, g_+(p+k) \\ i\mu\, g_-(p-k) \\ g_+(p+k)\, g_-(p-k) \end{pmatrix}, \tag{A4}$$
$$v_{-k}^{-+} \propto \begin{pmatrix} -\mu^2 \\ -i\mu\, g_+(p+k) \\ -i\mu\, g_-(p-k) \\ g_+(p+k)\, g_-(p-k) \end{pmatrix}, \qquad
v_{k-\pi}^{-+} \propto \begin{pmatrix} -\mu^2 \\ i\mu\, g_-(p-k) \\ i\mu\, g_+(p+k) \\ g_+(p+k)\, g_-(p-k) \end{pmatrix}. \tag{A5}$$
In order to simplify the derivation of the solution, we adopt the following notation:
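$$v_k^{++} =: \begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix}, \quad
v_{-k}^{++} = \begin{pmatrix} a \\ c \\ b \\ d \end{pmatrix}, \quad
v_{\pi-k}^{--} = \begin{pmatrix} a \\ -c \\ -b \\ d \end{pmatrix}, \quad
v_{k-\pi}^{--} = \begin{pmatrix} a \\ -b \\ -c \\ d \end{pmatrix}, \tag{A6}$$
$$v_k^{+-} =: \begin{pmatrix} a' \\ b' \\ c' \\ d' \end{pmatrix}, \quad
v_{-k}^{-+} = \begin{pmatrix} a' \\ c' \\ b' \\ d' \end{pmatrix}, \quad
v_{\pi-k}^{+-} = \begin{pmatrix} a' \\ -c' \\ -b' \\ d' \end{pmatrix}, \quad
v_{k-\pi}^{-+} = \begin{pmatrix} a' \\ -b' \\ -c' \\ d' \end{pmatrix}. \tag{A7}$$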
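The relations of Equations (A6) and (A7) can be checked numerically from the definition $v_k^{sr} = v_{p+k}^s \otimes v_{p-k}^r$. The sketch below works with the unnormalized vectors $(-i\mu, g_s(p))$ of Equation (A1); the function names and the sample values of $p$, $k$, $\nu$, $\mu$ are hypothetical choices for this example.

```python
import math


def omega(q, nu):
    """Single-particle dispersion relation, Equation (5)."""
    return math.acos(nu * math.cos(q))


def v(s, q, nu, mu):
    """Unnormalized single-particle eigenvector (-i mu, g_s(q)), Equation (A1)."""
    return [-1j * mu, -1j * (s * math.sin(omega(q, nu)) + nu * math.sin(q))]


def pair(s, r, k, p, nu, mu):
    """Unnormalized two-particle vector v^{sr}_k = v^s_{p+k} (x) v^r_{p-k}."""
    left, right = v(s, p + k, nu, mu), v(r, p - k, nu, mu)
    return [x * y for x in left for y in right]


def close(u, w, tol=1e-12):
    """Component-wise comparison of two vectors."""
    return all(abs(x - y) < tol for x, y in zip(u, w))


p, k, nu, mu = 0.3, 0.7, 0.8, 0.6
a, b, c, d = pair(+1, +1, k, p, nu, mu)        # v^{++}_k
a2, b2, c2, d2 = pair(+1, -1, k, p, nu, mu)    # v^{+-}_k

checks_A6 = [
    close(pair(+1, +1, -k, p, nu, mu), [a, c, b, d]),
    close(pair(-1, -1, math.pi - k, p, nu, mu), [a, -c, -b, d]),
    close(pair(-1, -1, k - math.pi, p, nu, mu), [a, -b, -c, d]),
]
checks_A7 = [
    close(pair(-1, +1, -k, p, nu, mu), [a2, c2, b2, d2]),
    close(pair(+1, -1, math.pi - k, p, nu, mu), [a2, -c2, -b2, d2]),
    close(pair(-1, +1, k - math.pi, p, nu, mu), [a2, -b2, -c2, d2]),
]
```

All entries of `checks_A6` and `checks_A7` evaluate to `True`, which reflects the identities $g_-(q \pm \pi) = -g_+(q)$ and $g_+(q \pm \pi) = -g_-(q)$ following from $\omega(q \pm \pi) = \pi - \omega(q)$.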
References
1. Farhi, E.; Goldstone, J.; Gutmann, S. A Quantum Algorithm for the Hamiltonian NAND Tree. Theory Comput.
2008, 4, 169–190. doi:10.4086/toc.2008.v004a008. [CrossRef]
2. Feynman, R.P.; Hibbs, A.R.; Styer, D.F. Quantum Mechanics and Path Integrals; Volume 2, International Series
in Pure and Applied Physics; McGraw-Hill: New York, NY, USA, 1965.
3. Grossing, G.; Zeilinger, A. Quantum cellular automata. Complex Syst. 1988, 2, 197–208.
4. Ambainis, A.; Bach, E.; Nayak, A.; Vishwanath, A.; Watrous, J. One-dimensional Quantum Walks.
In Proceedings of the STOC ’01 Thirty-Third Annual ACM Symposium on Theory of Computing, Hersonissos,
Greece, 6–8 July 2001; ACM: New York, NY, USA, 2001; pp. 37–49. doi:10.1145/380752.380757. [CrossRef]
5. Reitzner, D.; Nagaj, D.; Bužek, V. Quantum Walks. Acta Phys. Slov. Rev. Tutor. 2011, 61, 603–725. [CrossRef]
6. Gross, D.; Nesme, V.; Vogts, H.; Werner, R. Index theory of one dimensional quantum walks and cellular
automata. Commun. Math. Phys. 2012, 310, 419–454. [CrossRef]
7. Shikano, Y. From Discrete Time Quantum Walk to Continuous Time Quantum Walk in Limit Distribution.
J. Comput. Theor. Nanosci. 2013, 10, 1558–1570. doi:10.1166/jctn.2013.3097. [CrossRef]
8. Childs, A.M.; Goldstone, J. Spatial search by quantum walk. Phys. Rev. A 2004, 70, 022314.
doi:10.1103/PhysRevA.70.022314. [CrossRef]
9. Portugal, R. Quantum Walks and Search Algorithms; Springer Science & Business Media: Berlin, Germany, 2013.
10. Douglas, B.L.; Wang, J.B. A classical approach to the graph isomorphism problem using quantum walks.
J. Phys. A Math. Theor. 2008, 41, 075303. [CrossRef]
11. Gamble, J.K.; Friesen, M.; Zhou, D.; Joynt, R.; Coppersmith, S.N. Two-particle quantum walks applied to the
graph isomorphism problem. Phys. Rev. A 2010, 81, 052313. doi:10.1103/PhysRevA.81.052313. [CrossRef]
12. Bialynicki-Birula, I. Weyl, Dirac, and Maxwell equations on a lattice as unitary cellular automata. Phys. Rev. D
1994, 49, 6920. [CrossRef]
13. Meyer, D. From quantum cellular automata to quantum lattice gases. J. Stat. Phys. 1996, 85, 551–574.
[CrossRef]
14. Yepez, J. Relativistic Path Integral as a Lattice-based Quantum Algorithm. Quantum Inf. Process. 2006,
4, 471–509. [CrossRef]
15. Arrighi, P.; Facchini, S. Decoupled quantum walks, models of the Klein-Gordon and wave equations. EPL
2013, 104, 60004. [CrossRef]
16. Bisio, A.; D’Ariano, G.M.; Tosini, A. Quantum field as a quantum cellular automaton: The Dirac free
evolution in one dimension. Ann. Phys. 2015, 354, 244–264. doi:10.1016/j.aop.2014.12.016. [CrossRef]
17. D’Ariano, G.M.; Perinotti, P. Derivation of the Dirac equation from principles of information processing.
Phys. Rev. A 2014, 90, 062106. doi:10.1103/PhysRevA.90.062106. [CrossRef]
18. D’Ariano, G.M.; Mosco, N.; Perinotti, P.; Tosini, A. Path-integral solution of the one-dimensional Dirac
quantum cellular automaton. Phys. Lett. A 2014, 378, 3165–3168. doi:10.1016/j.physleta.2014.09.020.
[CrossRef]
19. D’Ariano, G.M.; Mosco, N.; Perinotti, P.; Tosini, A. Discrete Feynman propagator for the Weyl quantum walk
in 2 + 1 dimensions. EPL 2015, 109, 40012. doi:10.1209/0295-5075/109/40012. [CrossRef]
20. Arrighi, P.; Facchini, S.; Forets, M. Quantum walking in curved spacetime. Quantum Inf. Process. 2016,
15, 3467–3486. doi:10.1007/s11128-016-1335-7. [CrossRef]
21. Bisio, A.; D’Ariano, G.M.; Perinotti, P. Quantum cellular automaton theory of light. Ann. Phys. 2016,
368, 177–190. doi:10.1016/j.aop.2016.02.009. [CrossRef]
22. Arnault, P.; Debbasch, F. Quantum walks and discrete gauge theories. Phys. Rev. A 2016, 93, 052301.
doi:10.1103/PhysRevA.93.052301. [CrossRef]
23. Bisio, A.; D’Ariano, G.M.; Erba, M.; Perinotti, P.; Tosini, A. Quantum walks with a one-dimensional coin.
Phys. Rev. A 2016, 93, 062334. doi:10.1103/PhysRevA.93.062334. [CrossRef]
24. Mallick, A.; Mandal, S.; Chandrashekar, C.M. Neutrino oscillations in discrete-time quantum walk
framework. Eur. Phys. J. C 2017, 77, 85. doi:10.1140/epjc/s10052-017-4636-9. [CrossRef]
25. Molfetta, G.D.; Pérez, A. Quantum walks as simulators of neutrino oscillations in a vacuum and matter.
New J. Phys. 2016, 18, 103038. [CrossRef]
26. Brun, T.A.; Mlodinow, L. Discrete spacetime, quantum walks and relativistic wave equations. Phys. Rev. A
2018, 97, 042131. doi:10.1103/PhysRevA.97.042131. [CrossRef]
27. Brun, T.A.; Mlodinow, L. Detection of discrete spacetime by matter interferometry. arXiv 2018,
arXiv:1802.03911. [CrossRef]
28. Raynal, P. Simple derivation of the Weyl and Dirac quantum cellular automata. Phys. Rev. A 2017, 95, 062344.
doi:10.1103/PhysRevA.95.062344. [CrossRef]
29. Bibeau-Delisle, A.; Bisio, A.; D’Ariano, G.M.; Perinotti, P.; Tosini, A. Doubly special relativity from quantum
cellular automata. EPL 2015, 109, 50003. [CrossRef]
30. Bisio, A.; D’Ariano, G.M.; Perinotti, P. Special relativity in a discrete quantum universe. Phys. Rev. A 2016,
94, 042120. doi:10.1103/PhysRevA.94.042120. [CrossRef]
31. Bisio, A.; D’Ariano, G.M.; Perinotti, P. Quantum walks, deformed relativity and Hopf algebra symmetries.
Philos. Trans. R. Soc. Lond. A Math. Phys. Eng. Sci. 2016, 374, doi:10.1098/rsta.2015.0232. [CrossRef]
[PubMed]
32. Arrighi, P.; Facchini, S.; Forets, M. Discrete Lorentz covariance for quantum walks and quantum cellular
automata. New J. Phys. 2014, 16, 093007. [CrossRef]
33. Du, J.; Li, H.; Xu, X.; Shi, M.; Wu, J.; Zhou, X.; Han, R. Experimental implementation of the quantum
random-walk algorithm. Phys. Rev. A 2003, 67, 042316. doi:10.1103/PhysRevA.67.042316. [CrossRef]
34. Ryan, C.A.; Laforest, M.; Boileau, J.C.; Laflamme, R. Experimental implementation of a discrete-time
quantum random walk on an NMR quantum-information processor. Phys. Rev. A 2005, 72, 062317.
doi:10.1103/PhysRevA.72.062317. [CrossRef]
35. Xue, P.; Sanders, B.C.; Leibfried, D. Quantum Walk on a Line for a Trapped Ion. Phys. Rev. Lett. 2009,
103, 183602. doi:10.1103/PhysRevLett.103.183602. [CrossRef] [PubMed]
36. Do, B.; Stohler, M.L.; Balasubramanian, S.; Elliott, D.S.; Eash, C.; Fischbach, E.; Fischbach, M.A.; Mills, A.;
Zwickl, B. Experimental realization of a quantum quincunx by use of linear optical elements. J. Opt. Soc.
Am. B 2005, 22, 499–504. doi:10.1364/JOSAB.22.000499. [CrossRef]
37. Sansoni, L.; Sciarrino, F.; Vallone, G.; Mataloni, P.; Crespi, A.; Ramponi, R.; Osellame, R. Two-Particle
Bosonic-Fermionic Quantum Walk via Integrated Photonics. Phys. Rev. Lett. 2012, 108, 010502.
doi:10.1103/PhysRevLett.108.010502. [CrossRef] [PubMed]
38. Crespi, A.; Osellame, R.; Ramponi, R.; Giovannetti, V.; Fazio, R.; Sansoni, L.; De Nicola, F.; Sciarrino, F.;
Mataloni, P. Anderson localization of entangled photons in an integrated quantum walk. Nat. Photonics
2013, 7, 322–328. [CrossRef]
Entropy 2018, 20, 435
39. Flamini, F.; Spagnolo, N.; Sciarrino, F. Photonic quantum information processing: A review. arXiv 2018,
arXiv:1803.02790. [CrossRef]
40. Childs, A.M. Universal Computation by Quantum Walk. Phys. Rev. Lett. 2009, 102, 180501.
doi:10.1103/PhysRevLett.102.180501. [CrossRef] [PubMed]
41. Lovett, N.B.; Cooper, S.; Everitt, M.; Trevers, M.; Kendon, V. Universal quantum computation using the
discrete-time quantum walk. Phys. Rev. A 2010, 81, 042330. doi:10.1103/PhysRevA.81.042330. [CrossRef]
42. Childs, A.M.; Gosset, D.; Webb, Z. Universal computation by multiparticle quantum walk. Science 2013,
339, 791–794. [CrossRef] [PubMed]
43. Meyer, D.A. Quantum lattice gases and their invariants. Int. J. Mod. Phys. C 1997, 8, 717–735. [CrossRef]
44. Ahlbrecht, A.; Alberti, A.; Meschede, D.; Scholz, V.B.; Werner, A.H.; Werner, R.F. Molecular binding in
interacting quantum walks. New J. Phys. 2012, 14, 073050. [CrossRef]
45. Bisio, A.; D’Ariano, G.M.; Perinotti, P.; Tosini, A. Thirring quantum cellular automaton. Phys. Rev. A 2018,
97, 032132. doi:10.1103/PhysRevA.97.032132. [CrossRef]
46. Östlund, S.; Mele, E. Local canonical transformations of fermions. Phys. Rev. B 1991, 44, 12413–12416.
doi:10.1103/PhysRevB.44.12413. [CrossRef]
47. Thirring, W.E. A soluble relativistic field theory. Ann. Phys. 1958, 3, 91–112. doi:10.1016/0003-4916(58)90015-0.
[CrossRef]
48. Hubbard, J. Electron correlations in narrow energy bands. Proc. R. Soc. Lond. A Math. Phys. Eng. Sci. 1963,
276, 238–257. doi:10.1098/rspa.1963.0204. [CrossRef]
© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
entropy
Article
Robust Macroscopic Quantum Measurements in the
Presence of Limited Control and Knowledge
Marc-Olivier Renou *, Nicolas Gisin and Florian Fröwis
Department of Applied Physics, University of Geneva, 1211 Geneva 4, Switzerland;
[email protected] (N.G.); [email protected] (F.F.)
* Correspondence: [email protected]
Abstract: Quantum measurements have intrinsic properties that seem incompatible with our
everyday-life macroscopic measurements. Macroscopic Quantum Measurement (MQM) is a concept
that aims at bridging the gap between well-understood microscopic quantum measurements and
macroscopic classical measurements. In this paper, we focus on the task of the polarization direction
estimation of a system of N spin-1/2 particles and investigate the model some of us proposed
in Barnea et al., 2017. This model is based on a von Neumann pointer measurement, where each spin
component of the system is coupled to one of the three spatial component directions of a pointer.
It shows traits of a classical measurement for an intermediate coupling strength. We investigate
relaxations of the assumptions on the initial knowledge about the state and on the control over the
MQM. We show that the model is robust with regard to these relaxations. It performs well for thermal
states and a lack of knowledge about the size of the system. Furthermore, a lack of control on the
MQM can be compensated by repeated “ultra-weak” measurements.
1. Introduction
In our macroscopic world, we constantly measure our environment. For instance, to find north
with a compass, we perform a direction measurement by looking at the pointer. Yet, finding a quantum
model for this kind of macroscopic measurement faces several problems. Many characteristics
of quantum measurements seem to be incompatible with our intuitive notion of macroscopic
measurements. For example, perfectly measuring two non-commuting observables is impossible
in quantum mechanics, and any informative measurement has a nonvanishing invasiveness. Thus, if it
exists, such a model cannot be of the standard projective kind. Although we have a good intuition of
what such a measurement is, the natural characteristics it should satisfy are not obvious. Even if these
characteristics can be rigorously formulated, it is not clear whether there exists a quantum model that
satisfies them all.
For concreteness, quantum models for macroscopic measurements can be considered as a
parameter estimation task. In this paper, we focus on the estimation of the direction of polarization
of N qubits, oriented in a direction that is uniformly chosen at random. The question of the optimal
way to estimate N qubit polarization is already well studied [1,2] and can be seen as part of a larger
class of covariant estimation problems [3]. It is linked to covariant cloning [4] and purification of
state [5]. In the limit of macroscopic systems, those optimal measurements are arbitrarily precise
and potentially with low disturbance of the system [6,7]. A tradeoff between the quality of the guess
and the disturbance of the state has been demonstrated [8], as well as an improvement of the guess
when abstention is allowed [9]. However, these optimal measurements may not be satisfying models
of our everyday-life macroscopic measurements as it is not clear how these optimal measurements
could be physically implemented in a natural way. A first attempt to solve this problem is to reduce
the optimal Positive Operator Valued Measure (POVM), which is continuous, into a POVM with a finite
(and small) number of elements [10,11]. However, even if this reduction exists, the resulting POVM is
difficult to interpret physically, and to the best of our knowledge, no family of reduced POVMs valid for every N is known.
In [12], we argue that a good model of a macroscopic measurement should be highly non-invasive,
collect a large amount of information in a single shot and be described by a “fairly simple” coupling
between system and observer. Measurements that fulfill these requirements are called “Macroscopic
Quantum Measurements” (MQM). Non-invasiveness seems to be difficult to satisfy with a quantum
model. Indeed, the disturbance induced on the state by a measurement is generic in quantum
mechanics. This has no counterpart in classical physics, where any measurement can ideally be done
without disturbance of the system. However, it is now well known that this issue can be solved by
accepting quantum measurements of finite accuracy. In [13], Poulin shows the existence of a trade-off
between state disturbance and measurement resolution as a function of the size of the ensemble.
One macroscopic observable can behave “classically”, provided we measure it with sufficiently low
resolution. Yet, the question is still open for several non-commuting observables. Quantum physics
allows precise measurements of only one observable among two non-commuting ones.
In this paper, we study the behavior of an MQM model for the measurement of the polarization of a
large ensemble of N parallel spin 1/2 particles, which implies the measurement of the non-commuting
spin operators. In this model, the measured system is first coupled to a measurement apparatus through
an intuitive Hamiltonian already introduced in [14]. Then, the apparatus is measured. We extend our
previous study to more general cases. In [12], it was shown that this model allows good direction
estimation and low disturbance for systems of N parallel spin 1/2 particles. This system can be
interpreted as the ground state of a product Hamiltonian. Here, we generalize the scenario to thermal
states. We also study a different measurement procedure based on repeated weak measurements.
The paper is structured as follows: We first present a simplified technical framework that describes
the measurement of a random direction for a given quantum state and observable. Considering an
input state and an observable independent of the particle number and with no preferred direction,
we show that the problem reduces to many sub-problems, which correspond to systems of fixed total
spin j. Then, we quantitatively treat the case of the thermal state, which generalizes the N parallel spin
1/2 particle for non-zero temperature, showing that the discussed MQM is still close to the optimal
measurement. In the proposed MQM, the precision of the estimated direction highly depends on the
optimized coupling strength of the model. In Section 4, we follow the ideas of [13], and we show that
one may relax this requirement by doing repeated “ultra-weak” measurements and a naive guess.
We conclude and summarize in the last section.
2. Estimation of a Direction
In this paper, we aim to study the behavior of a specific MQM model for a direction estimation
task, e.g., the estimation of the direction of a magnet or a collection of spins. Hence, we first introduce
an explicit (and specific) direction estimation problem, which is presented as a game. It concerns the
direction estimation of a qubit ensemble. In the following, Su = S · u represents the spin operator
projected in direction u, i.e., the elementary generator of rotations around u. For a given state ρu of
N = 2J qubits, we say that ρu points in the direction u if it is positively polarized in the u direction, i.e.,
if [ρu , Su ] = 0 and Tr(ρu Su ) > 0. We consider the problem of polarization direction estimation from
states that are all the same, but point in a direction that is chosen uniformly at random. This problem
has already been widely studied [1–3,6,15]. We give here a unified framework adapted to our task.
Entropy 2018, 20, 39
ρu to Bob, who measures it with some given measurement device characterized by a Positive Operator
Valued Measure (POVM) Ωr. He obtains a result r with probability p(r|u) = Tr(Ωr ρu), from which
he deduces vr, his guess for u. Bob’s score is computed according to some predefined score function
g(u, vr) = u · vr. Given his measurement result, Bob’s goal is to find the optimal estimate, i.e., the one
that optimizes his mean score [16]:
$$G = \int dr \int du \; p(r|u)\, g(u, v_r). \qquad (1)$$
For simplicity, we consider an equivalent, but simplified POVM. In our description, Bob measures
the system, obtains results r and then post-processes this information to find his guess vr . We now
regroup all POVM elements corresponding to the same guess and label by the guessed direction.
Formally, we go from $\Omega_r$ to $O_v = \int dr\, \Omega_r\, \delta(v_r - v)$.
Some assumptions are made about ρz and Ov . We suppose that ρz points in the z direction.
Moreover, we assume that ρz is symmetric under the exchange of particles, which implies $[\rho_z, S^2] = 0$.
Let $|\alpha, j, m\rangle$ be the basis in which $S_z$ and $S^2$ are diagonal (where j ∈ {J, J − 1, ...} is the total spin,
α the multiplicity due to particle exchange and m the spin along z). Then, ρz is diagonal in this basis,
with coefficients independent of α, denoted as $c^j_m = \langle \alpha, j, m|\rho_z|\alpha, j, m\rangle$.
We also suppose that the measurement device does not favor any direction and treats each particle
equally. Mathematically, this means that Ov is covariant with respect to particle exchange and rotations.
Then, any POVM element is generated from one kernel Oz and the rotations Rv : Ov = R†v Oz Rv
(for more technical details, see [15]). With this, Equation (1) simplifies to:
$$G = \int dv \int du \; p(v|u)\, g(u, v), \qquad (2)$$
where $A_j = \binom{2J}{J-j} - \binom{2J}{J-j-1}$ is the degeneracy of the multiplicity α in a subspace of given (j, m), $O_z^j$ is the
projection of $O_z$ over all subspaces of fixed (α, j), $\rho_z^j$ is the projection of $\rho_z$ over all subspaces of fixed (α, j) and
$\tilde\rho_z^j = \rho_z^j / \mathrm{Tr}(\rho_z^j)$.
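The multiplicities $A_j$ can be checked numerically. The following is a minimal sketch (the function names are hypothetical and spins are passed as doubled integers 2J and 2j to avoid half-integers); it verifies that the subspaces of fixed (α, j) together fill the full Hilbert space of N = 2J qubits, i.e., that the sum over j of $A_j (2j + 1)$ equals $2^N$.

```python
from math import comb

def multiplicity(two_J: int, two_j: int) -> int:
    """A_j = C(2J, J-j) - C(2J, J-j-1), with spins given as doubled integers."""
    k = (two_J - two_j) // 2  # k = J - j, a non-negative integer
    return comb(two_J, k) - (comb(two_J, k - 1) if k >= 1 else 0)

def spin_values(two_J: int):
    """Doubled total spins 2j = 2J, 2J-2, ..., down to 0 or 1."""
    return range(two_J, -1, -2)

N = 6  # number of qubits, N = 2J
dimension = sum(multiplicity(N, tj) * (tj + 1) for tj in spin_values(N))
print(dimension, 2 ** N)  # both numbers should agree
```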
Lemma 1 says that Bob cannot use any coherence between subspaces associated with different
(α, j) to increase his score. In other words, the score Bob achieves is the weighted sum (where the
weights are $\mathrm{Tr}(\rho_z^{\alpha,j})$) of the scores $G_j$ Bob would achieve by playing with the states $\tilde\rho_z^j$. This property
is a consequence of the assumption that no direction or particle is preferred by Bob’s measurement or
in the set of initial states. For self-consistency, we prove this lemma.
G = Tr(Oz Γz ). (5)
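Let $P_{\alpha,j} = \sum_m |\alpha, j, m\rangle\langle\alpha, j, m|$ be projectors, $\Gamma_z^{\alpha,j} = P_{\alpha,j}\Gamma_z P_{\alpha,j}$ and $O_z^{\alpha,j} = P_{\alpha,j} O_z P_{\alpha,j}$. Here, as ρ and
$O_z$ do not depend on the particle number, α is only a degeneracy.
As $\Gamma_z$ is invariant under rotation around z and commutes with $S^2$, we have $\Gamma_z = \sum_{\alpha,j} \Gamma_z^{\alpha,j}$.
Then, $G = \sum_{\alpha,j} \mathrm{Tr}(O_z^{\alpha,j}\Gamma_z^{\alpha,j}) = \sum_j A_j \mathrm{Tr}(O_z^j \Gamma_z^j)$, where $O_z^j$, $\Gamma_z^j$ are respectively the projections of $O_z$, $\Gamma_z$
over any spin coherent subspace of fixed α, j. Let $G_j = \mathrm{Tr}(O_z^j \Gamma_z^j)$.
$\Gamma_z^j = \sum_m c_m^j \int du\, u_z R_u^\dagger |\alpha, j, m\rangle\langle\alpha, j, m| R_u$ is symmetric under rotations around z. Then, it is
diagonal in the basis $|\alpha, j, m\rangle$ with fixed j, α. As
$$\langle\alpha, j, \mu| \int du\, u_z R_u^\dagger |\alpha, j, m\rangle\langle\alpha, j, m| R_u |\alpha, j, \mu\rangle = \frac{m\mu}{j(j+1)(2j+1)} = \frac{m}{j(j+1)(2j+1)} \langle\alpha, j, \mu| S_z^{\alpha,j} |\alpha, j, \mu\rangle,$$
we have:
$$\Gamma_z^j = \sum_m c_m^j \frac{m}{j(j+1)(2j+1)}\, S_z^{\alpha,j} \qquad (6)$$
and:
$$G_j = \frac{1}{j(j+1)(2j+1)}\, \mathrm{Tr}(S_z \rho_z^j)\, \mathrm{Tr}(S_z O_z^j). \qquad (7)$$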
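The coefficient m/(j(j+1)(2j+1)) used in the proof can be recovered in a few lines. The sketch below rests on two assumptions that are not restated in this section: du is the normalized uniform measure on the sphere, and $R_u^\dagger |\alpha, j, m\rangle$ is the eigenstate of S · u with eigenvalue m, so that its mean value of $S_z$ is $m u_z$. Within a block of fixed (α, j), the integrated operator is invariant under rotations around z and transforms as the z-component of a vector, so it must be proportional to $S_z$; the proportionality constant follows from a trace.

```latex
M_m \equiv \int du\, u_z\, R_u^\dagger |j,m\rangle\langle j,m| R_u = c_m\, S_z ,
\qquad
c_m = \frac{\mathrm{Tr}(S_z M_m)}{\mathrm{Tr}(S_z^2)} ,
\\
\mathrm{Tr}(S_z M_m) = \int du\, u_z\, \langle j,m| R_u S_z R_u^\dagger |j,m\rangle
                     = m \int du\, u_z^2 = \frac{m}{3} ,
\\
\mathrm{Tr}(S_z^2) = \sum_{\mu=-j}^{j} \mu^2 = \frac{j(j+1)(2j+1)}{3}
\quad\Longrightarrow\quad
c_m = \frac{m}{j(j+1)(2j+1)} .
```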
2.3. State Independent Optimal Measurement, Optimal State for Direction Estimation
Given the state $\rho_z^{\alpha,j}$, the measurement that optimizes Bob’s score is the set of $\Theta_v$ such that
$\mathrm{Tr}(S_z \Theta_z^{\alpha,j})$ is maximal. The maximum is obtained when $\Theta_z^{\alpha,j}$ is proportional to a projector on the
eigenspace of $S_z$ with the maximal eigenvalue, that is for $\Theta_z^{\alpha,j} = (2j+1)|\alpha, j, \pm j\rangle\langle\alpha, j, \pm j|$. Here, the
sign depends on the sign of $\mathrm{Tr}(S_z \rho_z^j)$. In the following, we restrict ourselves to the case where the
$\mathrm{Tr}(S_z \rho_z^j)$ are all positive (this is the case for the thermal state, considered below). Then:
$$G_{opt} = \sum_j \frac{j A_j \mathrm{Tr}(\rho_z^j)}{j+1}\, \mathrm{Tr}\left(\frac{S_z}{j}\tilde\rho_z^j\right). \qquad (8)$$
For $\rho_z = |J, J\rangle\langle J, J|$, the thermal state of temperature T = 0, we find $G_{opt,T=0} = \frac{J}{J+1}$. Equivalently,
we recover the optimal fidelity $F_{opt,T=0} = \frac{1}{2}(1 + G_{opt,T=0}) = \frac{N+1}{N+2}$, already found in [1]. Asymptotically,
we have $G_{opt,T=0} = 1 - 1/J + O(1/J^2)$. This induces a natural characterization of the optimality of an
estimation procedure. Writing $G_{T=0}$ as $G_{T=0} = 1 - \epsilon_J/J$, where $\epsilon_J = J(1 - G_{T=0}) \geq J/(J+1)$, we say that the
procedure is asymptotically optimal if $\epsilon_J = 1 + O(1/J)$ and almost optimal if $\epsilon_J - 1$ is asymptotically
not far from zero.
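As a quick arithmetic check of the zero-temperature values quoted above, the following sketch (with hypothetical function names) evaluates the optimal score J/(J + 1), the corresponding fidelity and the scaling factor J(1 − G) with exact fractions. For the optimal measurement the scaling factor equals J/(J + 1), which approaches one from below as J grows.

```python
from fractions import Fraction

def g_opt_zero_temperature(N: int) -> Fraction:
    """Optimal mean score J/(J+1) for N = 2J aligned spin-1/2 particles."""
    J = Fraction(N, 2)
    return J / (J + 1)

def fidelity(G: Fraction) -> Fraction:
    """Fidelity F = (1 + G) / 2 associated with a mean score G."""
    return (1 + G) / 2

def scaling_factor(N: int, G: Fraction) -> Fraction:
    """Scaling factor J (1 - G) used to compare a procedure with the optimum."""
    return Fraction(N, 2) * (1 - G)

for N in (1, 2, 10, 100):
    G = g_opt_zero_temperature(N)
    print(N, G, fidelity(G), scaling_factor(N, G))
```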
For every j, the three terms of the product are positive. Then, qualitatively, the measurement is
nearly optimal if for each j, the product of the three is small. We give here the interpretation of each of
these terms:
• $A_j$ is the degeneracy under permutation of particles (labeled by α) and $\mathrm{Tr}(\rho_z^j)$ the weight of $\rho_z$
over a subspace j, α. Hence, the first term, bounded by j/(j + 1), only contains the total weight of
$\rho_z$ over a fixed total spin j. Hence, it is small whenever ρ has little weight in the subspace j.
• $\mathrm{Tr}\left(\frac{S_z}{j}\tilde\rho_z^j\right)$ is small whenever the component of $\rho_z$ on the subspace of total spin j, $\rho_z^j = P_{\alpha,j}\rho_z P_{\alpha,j}$, is
small or not well polarized. It is bounded by one. When $\rho_z^j$ is not well polarized, the optimality of
the measurement in that subspace makes little difference. Then, this second term characterizes the
quality of the component $\rho_z^j$ for the guess of the direction.
• The last term is small when $O_z^j$ is nearly optimal and is also bounded by one. More exactly,
as $O_z^j$ is a covariant POVM, we have $\mathrm{Tr}(O_z^j) = 2j + 1$, and all diagonal coefficients are positive.
Because of $S_z/j$, $O_z^j$ is (nearly) optimal when it projects (mainly) onto the subspace of $S_z$ with the
highest eigenvalue. POVMs containing other projections are sub-optimal. This effect is amplified
by the operator $S_z$: the further away these extra projections $\propto |j, m\rangle\langle j, m|$ are from the optimal
projector $\propto |j, j\rangle\langle j, j|$ (in the sense of j − m), the stronger the sub-optimality is. Then, the last term
corresponds to the optimality of the measurement component $O_z^j$ for the guess of the direction.
Interestingly, we see here that the state and measurement “decouple”: the optimal measurement is
independent of the considered state. However, if the measurement is not optimal only for subspaces
where ρz has low weight or is not strongly polarized, it will still result in a good mean score.
where $Z = (2\cosh(\beta/2))^N$ is the partition sum. $\rho_z$ is clearly invariant under rotations around z and
symmetric under particle exchange. For later purposes, we define
$$f_j(\beta) = Z\, \mathrm{Tr}\left(\frac{S_z}{j}\rho_z^{\alpha,j}\right) = \frac{j \sinh((1+j)\beta) - (1+j)\sinh(j\beta)}{2j \sinh^2(\beta/2)}.$$
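The closed form of $f_j(\beta)$ can be compared with a direct sum over m. The sketch below assumes that, inside each block of fixed (α, j), the unnormalized thermal weights are $e^{\beta m}$ (the sign convention for which the polarization along +z is positive for β > 0, in line with the value J tanh(β/2) quoted later for the mean spin). Under that assumption, $f_j(\beta) = \sum_m (m/j) e^{\beta m}$, which equals $[j \sinh((1+j)\beta) - (1+j)\sinh(j\beta)] / (2j \sinh^2(\beta/2))$. The function names are hypothetical.

```python
import math

def f_direct(two_j: int, beta: float) -> float:
    """f_j(beta) as the explicit sum over m = -j, ..., j of (m/j) exp(beta m)."""
    j = two_j / 2
    return sum((-j + k) / j * math.exp(beta * (-j + k)) for k in range(two_j + 1))

def f_closed(two_j: int, beta: float) -> float:
    """Closed form of the same sum (assumes beta != 0 and j > 0)."""
    j = two_j / 2
    numerator = j * math.sinh((1 + j) * beta) - (1 + j) * math.sinh(j * beta)
    return numerator / (2 * j * math.sinh(beta / 2) ** 2)

for two_j in (1, 2, 5):
    print(two_j / 2, f_direct(two_j, 1.0), f_closed(two_j, 1.0))
```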
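Equation (3) now reads:
$$G_{T=0} = \frac{J}{J+1}\, \mathrm{Tr}\left(\frac{S_z}{J}\frac{O_z}{2J+1}\right), \qquad (11)$$
and for any inverse temperature β:
$$G = \frac{1}{Z}\sum_j A_j\, f_j(\beta)\, G^j_{T=0}, \qquad (12)$$
which, with the optimal measurement, becomes $G_{opt,T} = \frac{1}{Z}\sum_j \frac{j A_j}{j+1} f_j(\beta)$. Note that for low temperatures, this expression
can be approximated with $\langle J\rangle_\beta/(\langle J\rangle_\beta + 1)$, where $\langle J\rangle_\beta$ is the mean value of the total spin operator for
a thermal state.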
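The optimal thermal score can be evaluated directly from this sum. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the thermal weights inside each block of fixed (α, j) are $e^{\beta m}$, so that $f_j(\beta) = \sum_m (m/j) e^{\beta m}$. For N = 2 the sum reduces to tanh(β/2)/2, and for large β it approaches the zero-temperature value J/(J + 1).

```python
import math
from math import comb

def multiplicity(two_J: int, two_j: int) -> int:
    """A_j = C(2J, J-j) - C(2J, J-j-1), with spins given as doubled integers."""
    k = (two_J - two_j) // 2
    return comb(two_J, k) - (comb(two_J, k - 1) if k >= 1 else 0)

def f_j(two_j: int, beta: float) -> float:
    """f_j(beta) = sum over m of (m/j) exp(beta m), assumed thermal weights."""
    j = two_j / 2
    return sum((-j + k) / j * math.exp(beta * (-j + k)) for k in range(two_j + 1))

def g_opt_thermal(N: int, beta: float) -> float:
    """G_opt,T = (1/Z) sum_j  j A_j / (j+1)  f_j(beta), with Z = (2 cosh(beta/2))^N."""
    Z = (2 * math.cosh(beta / 2)) ** N
    total = 0.0
    for two_j in range(N, 0, -2):  # j = 0 carries no polarization and is skipped
        j = two_j / 2
        total += j * multiplicity(N, two_j) / (j + 1) * f_j(two_j, beta)
    return total / Z

for beta in (0.5, 2.0, 10.0):
    print(beta, g_opt_thermal(10, beta))
```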
measurements on the system, from projective measurements, which are partially informative, but
destroy the state, to weak measurements, which acquire little information, but do not perturb much.
More specifically, to measure the direction of ρu, we use a pointer with three spatial degrees
of freedom:
$$|\phi\rangle = \frac{1}{(2\pi\Delta^2)^{3/4}} \int dx\, dy\, dz\; e^{-\frac{x^2+y^2+z^2}{4\Delta^2}}\, |x\rangle|y\rangle|z\rangle, \qquad (13)$$
where x, y, z are the coordinates of the pointer. The parameter Δ in $|\phi\rangle$ represents the width of the
pointer: a small Δ corresponds to a narrow pointer and implies a strong measurement, while a large Δ
gives a wide pointer and a weak measurement. The interaction Hamiltonian reads:
$$H_{int} = \mathbf{S}\cdot\mathbf{p} \equiv p_x \otimes S_x + p_y \otimes S_y + p_z \otimes S_z, \qquad (14)$$
where $p_x, p_y, p_z$ are the conjugate variables of x, y, z. A longer interaction time or stronger coupling
can always be renormalized by adjusting Δ. Hence, we take the two equal to one. Finally, a position
measurement with outcome r is performed on the pointer. The POVM elements associated with this
measurement are $O_r = E_r E_r^\dagger$, where the Kraus operator $E_r$ reads:
$$E_r \propto \int dp\; e^{i r\cdot p}\, e^{-\Delta^2 p^2}\, e^{-i p\cdot S}. \qquad (15)$$
The POVM associated with this model is already covariant. Indeed, the index of each POVM
element is the direction of guess (to exactly obtain the form given in Section 2.1, one has to define
$O_v = \int_0^\infty r^2 O_r\, dr$, which is equivalent to identifying each vector with its direction). Any $O_r$ is a rotation
of $O_z$: $O_r = R_r^\dagger O_z R_r$.
Figure 1. (a) Mean score G as a function of the pointer width Δ for various N = 2J. The dashed lines
correspond to the optimal value $G_{opt}$. (b) Scaling factor $\epsilon_J = J(1 - G_J)$ from the approximate lower
bound on the score G (upper, blue curve) compared to the optimal scaling factor $J(1 - G_{opt})$ (lower,
red curve). For large J, $\epsilon_J$ seems to go to 19/18 (dashed line). See Section 3.2 for further details.
From Equation (3) and the discussion about Equation (9), we see that, to achieve optimality,
the first diagonal coefficient $o_J$ must be maximal [24], that is equal to 2J + 1. When this is not the case,
as $\mathrm{Tr}(O_z) = 2J + 1$, the difference $(2J + 1) - o_J = \mathrm{Tr}(O_z) - o_J = \sum_{m \neq J} o_m$ is distributed between the
other diagonal coefficients $o_m = \langle J, m|O_z|J, m\rangle$, for m ≠ J. The score achieved by the measurement is
given by Equation (11):
$$G_{T=0} = \frac{J}{J+1}\, \mathrm{Tr}\left(\frac{S_z}{J}\frac{O_z}{2J+1}\right) = \frac{J}{J+1} \sum_m \frac{m}{J}\frac{o_m}{2J+1}. \qquad (16)$$
Our bound only considers the coefficient $o_J$. However, a simple calculation shows that this is
enough to deduce the strict suboptimality of the measurement. Indeed, one can derive:
$$\begin{aligned}
\epsilon_J &= J\left[1 - \frac{J}{J+1}\left(\frac{o_J}{2J+1} + \sum_{m \neq J} \frac{m}{J}\frac{o_m}{2J+1}\right)\right] \\
&\geq J\left[1 - \frac{J}{J+1}\left(\frac{o_J}{2J+1} + \frac{J-1}{J}\left(1 - \frac{o_J}{2J+1}\right)\right)\right] \\
&\geq 2 - \frac{o_J}{2J+1} + o(1),
\end{aligned}$$
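The bound can be checked numerically for any set of diagonal coefficients. In the sketch below (hypothetical function names), the coefficients $o_m$ are listed for m = J, J − 1, ..., −J, are non-negative and sum to 2J + 1. The middle line of the derivation simplifies to J(2 − a)/(J + 1) with a = $o_J$/(2J + 1), which is the finite-J lower bound used here; it tends to 2 − a for large J.

```python
def score_zero_temperature(two_J: int, o: list) -> float:
    """G_{T=0} = J/(J+1) * sum_m (m/J) o_m / (2J+1), with o ordered m = J, ..., -J."""
    J = two_J / 2
    ms = [J - k for k in range(two_J + 1)]
    return J / (J + 1) * sum(m / J * om / (two_J + 1) for m, om in zip(ms, o))

def epsilon_J(two_J: int, o: list) -> float:
    """Scaling factor J (1 - G_{T=0})."""
    return two_J / 2 * (1 - score_zero_temperature(two_J, o))

def lower_bound(two_J: int, o: list) -> float:
    """Finite-J form of the bound: J (2 - a) / (J + 1), a = o_J / (2J + 1)."""
    J = two_J / 2
    a = o[0] / (two_J + 1)
    return J * (2 - a) / (J + 1)

two_J = 6
examples = {
    "optimal": [7.0, 0, 0, 0, 0, 0, 0],
    "uniform": [1.0] * 7,
    "mixed": [3.0, 2.0, 1.0, 0.5, 0.25, 0.15, 0.10],
}
for name, o in examples.items():
    print(name, epsilon_J(two_J, o), lower_bound(two_J, o))
```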
where the equivalence ≡ is interpreted as $|\alpha, j, m\rangle^{(N)} \equiv |m\rangle^{(n)}$ (there is no multiplicity for n and
j = n/2).
For non-zero temperature, we adapt the numerical estimation model of [12]. Due to Lemma 1
and Equation (17), we can directly exploit the same model and combine the results for the different
subspaces for given j. However, in this case, we are limited by the choice of the coupling strength Δ
of the pointer with the system. At zero temperature, only the total spin subspace that corresponds
to j = J is involved. The optimal coupling strength is then Δ = √J/4. For a non-zero temperature,
all possible j appear, and the value of Δ cannot be optimized for each one. Our strategy is to choose
the optimal coupling value for the equivalent total spin $J_{eq}$ satisfying $\langle S^2\rangle = J_{eq}(J_{eq} + 1)$, which can
be deduced from $\langle S^2\rangle = \frac{1}{2}\left(3J + J(2J - 1)\tanh^2(\beta/2)\right)$ (for a thermal state). Depending on the sensitivity
of the MQM guessing scheme with respect to a change in the value of Δ, this method may work or not.
Numeric simulations show that a change of order O(√J) perturbs the score. However, one can hope
that for smaller variations, the perturbation is insignificant.
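The equivalent total spin follows from a quadratic equation. The sketch below (hypothetical function names) uses the thermal mean value $\langle S^2\rangle = (3J + J(2J - 1)\tanh^2(\beta/2))/2$, which is what one obtains for N = 2J independent spin-1/2 particles each polarized by tanh(β/2), and solves $J_{eq}(J_{eq} + 1) = \langle S^2\rangle$ for the positive root.

```python
import math

def mean_S2(J: float, beta: float) -> float:
    """Thermal mean of S^2 for N = 2J independent spin-1/2 particles."""
    return 0.5 * (3 * J + J * (2 * J - 1) * math.tanh(beta / 2) ** 2)

def equivalent_spin(J: float, beta: float) -> float:
    """Positive root of J_eq (J_eq + 1) = <S^2>."""
    return (-1 + math.sqrt(1 + 4 * mean_S2(J, beta))) / 2

for beta in (0.0, 0.5, 2.0, 50.0):
    print(beta, equivalent_spin(10, beta))
```

At very low temperature the equivalent spin tends to J, and at infinite temperature it drops to the positive root of $J_{eq}(J_{eq} + 1) = 3J/2$.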
We tested the method for different values of temperature T = 1/β corresponding to spin
polarization $\langle S_z\rangle = J\tanh(\beta/2)$. We find again that the asymptotic difference between $G_{opt}$ and
$G_{MQM}$ is small. More precisely, Figure 2 shows $J\Delta G_\beta$ as a function of J, for different temperatures
corresponding to $\langle S_z\rangle = cJ$, for various c. For each Δ, the error $J\Delta G_\beta$ seems to be bounded for large J.
Figure 2. JΔG (Equation (9)) as a function of J, for various β chosen such that $\langle S_z\rangle = J\tanh(\beta/2)$.
The Macroscopic Quantum Measurement (MQM) is close to optimal even for finite temperature.
See Section 3.3 for further details.
where:
$$F_m(r) = \frac{1}{(\Delta\sqrt{2\pi})^{t}}\; e^{-\frac{\|r - m\mathbb{1}\|^2}{2\Delta^2}}, \qquad (21)$$
with $\mathbb{1} = \{1, ..., 1\}$. As all measurements for each step commute, this case can be solved analytically.
Note first that the ordering of the measurement results is irrelevant. From Equation (1), we find:
$$\begin{aligned}
G &= \frac{1}{(J+1)(2J+1)}\, \mathrm{Tr}(S_z O_z) \\
&= \frac{2}{(J+1)(2J+1)\,(\Delta\sqrt{2\pi})^{t}} \int dr\, \delta(v_r - z)\; e^{-\frac{\|r\|^2}{2\Delta^2}} \sum_{m>0} m\; e^{-\frac{m^2 t}{2\Delta^2}} \sinh\left(\frac{m\, r\cdot\mathbb{1}}{\Delta^2}\right),
\end{aligned}$$
where $v_r$ is the optimal guess. For r such that $r\cdot\mathbb{1} \geq 0$, the optimal guess is clearly $v_r = z$. By symmetry,
$v_{-r} = -v_r$, and the optimal guess is $v_r = \mathrm{sign}(r\cdot\mathbb{1})\, z$. Then:
$$G = \frac{2}{(J+1)(2J+1)} \sum_{m>0} m\; \mathrm{erf}\left(\frac{m\sqrt{t}}{\Delta\sqrt{2}}\right) \qquad (22)$$
is easily computed by integration over r and by decomposition into its parallel and orthogonal
components to $\mathbb{1}$. We see here that the score only depends on the ratio $\sqrt{t}/\Delta$ and reaches the 1D strong
measurement limit for $\sqrt{t}/\Delta \gg 1$ (see Figure 3). Here, erf is the error function. We see that G → 1/2 for
J → ∞, which is the optimal value for optimal measurements lying on one direction.
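Equation (22) is straightforward to evaluate. The following sketch (the function name is hypothetical) sums over the positive values m = J, J − 1, ... and illustrates the two properties stated above: the score depends on t and Δ only through √t/Δ, and for integer J it saturates at J/(2J + 1) in the strong-measurement limit, which tends to 1/2 for large J.

```python
import math

def score_1d(two_J: int, t: float, delta: float) -> float:
    """Equation (22): G = 2/((J+1)(2J+1)) * sum_{m>0} m erf(m sqrt(t) / (delta sqrt(2)))."""
    J = two_J / 2
    total = 0.0
    m = J
    while m > 0:  # m = J, J-1, ..., down to 1 or 1/2
        total += m * math.erf(m * math.sqrt(t) / (delta * math.sqrt(2)))
        m -= 1
    return 2 * total / ((J + 1) * (two_J + 1))

# Same parameters as Figure 3: delta = 10 and J = 2, 4, 8, 16
for J in (2, 4, 8, 16):
    print(J, [round(score_1d(2 * J, t, 10.0), 3) for t in (20, 60, 100)])
```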
Figure 3. Score G as a function of the number of measurements t for repeated weak measurement in a
single fixed direction with Δ = 10 and J = 2, 4, 8, 16. See Section 4.2 for further details.
simulation of the model. We fix the number of qubits N = 2J and pointer width Δ. The vector u
is drawn at random on the Bloch sphere. Then, we simulate τ successive weak measurements in
directions x, y, z of the system $|u\rangle^{\otimes N}$. For each t ≤ τ, we guess u from the mean of the results for x, y, z
for measurements up to t.
For large Δ, our procedure can be seen as successive weak measurements of the system.
Each measurement acquires a small amount of information and weakly disturbs the state. We attribute
the same weight to each measurement result to find the estimated polarization. As each measurement
disturbs the state, this strategy is not optimal. However, keeping the heuristic of “intuitive measurement”,
we consider this guessing method as being natural.
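The procedure described above can be sketched in a few lines of NumPy. This is not the code used for the results reported here; it is a minimal illustration under stated assumptions: the state is restricted to the symmetric subspace j = J, each round consists of sequential Gaussian-pointer measurements of $S_x$, $S_y$ and $S_z$ with the same width Δ, an outcome r for a component with eigenvalues m is drawn from the mixture of Gaussians centered at m with variance Δ², and the state is updated with weights $e^{-(r-m)^2/(4\Delta^2)}$. All function names are hypothetical.

```python
import numpy as np

def spin_operators(two_J):
    """Spin-J matrices in the basis m = J, J-1, ..., -J (J = two_J / 2)."""
    J = two_J / 2
    m = J - np.arange(two_J + 1)
    # <m+1|S+|m> = sqrt(J(J+1) - m(m+1)); raising m moves one index up.
    Sp = np.diag(np.sqrt(J * (J + 1) - m[1:] * (m[1:] + 1)), k=1)
    Sx = (Sp + Sp.T) / 2
    Sy = (Sp - Sp.T) / 2j
    return Sx, Sy, np.diag(m)

def coherent_state(ops, u):
    """State pointing along u: top eigenvector of u . S (eigenvalue J)."""
    Su = sum(ui * S for ui, S in zip(u, ops))
    _, vecs = np.linalg.eigh(Su)
    return vecs[:, -1]

def weak_measure(psi, vals, vecs, delta, rng):
    """One Gaussian-pointer measurement of the component with eigenpairs (vals, vecs)."""
    amps = vecs.conj().T @ psi
    probs = np.abs(amps) ** 2
    probs /= probs.sum()
    r = rng.choice(vals, p=probs) + delta * rng.normal()  # sample the Gaussian mixture
    w = (r - vals) ** 2 / (4 * delta ** 2)
    amps = amps * np.exp(-(w - w.min()))  # shift the exponent to avoid underflow
    psi = vecs @ amps
    return r, psi / np.linalg.norm(psi)

def simulate(two_J, delta, tau, seed=0):
    """Scores u . v_t for t = 1..tau, guessing u from the accumulated x, y, z results."""
    rng = np.random.default_rng(seed)
    ops = spin_operators(two_J)
    eig = [np.linalg.eigh(S) for S in ops]
    u = rng.normal(size=3)
    u /= np.linalg.norm(u)  # uniformly random direction
    psi = coherent_state(ops, u)
    sums = np.zeros(3)
    scores = np.empty(tau)
    for t in range(tau):
        for axis, (vals, vecs) in enumerate(eig):
            r, psi = weak_measure(psi, vals, vecs, delta, rng)
            sums[axis] += r
        scores[t] = u @ (sums / np.linalg.norm(sums))
    return scores

print(simulate(two_J=10, delta=8.0, tau=200)[[0, 49, 199]])
```

A single run gives one trajectory of the score; estimating the mean score would require averaging over many seeds (and therefore over many random directions u).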
The results from the numerical simulation suggest that for a fixed number of particles N = 2J and
fixed pointer width Δ, the score as a function of t increases and then decreases (see Figure 4a), which
is intuitive. Indeed, for few measurements, the state is weakly disturbed, and each measurement
acquires only a small amount of information about the original state. Then, after a significant number
of measurements, the state is strongly disturbed, and each measurement is done over a noisy state
and gives no information about the initial state. Hence, there is an optimal number of measurements
tmax ( N, Δ) that gives a maximal score Gmax ( N, Δ). Moreover, for a fixed N = 2J, Gmax ( N, Δ) increases
smoothly as the measurements are weaker, i.e., as Δ increases. It reaches a limit $G_{max}(N)$ (see Figure 4a).
This suggests that for weak enough measurements, we observe the same behavior as in the 1D case.
More measurements compensate a weaker interaction strength, without loss of precision. Hence,
the precision of a single measurement is not important, as long as the measurement is weak enough.
Moreover, in that case, we observe a plateau, which suggests that the exact value of t is not important.
For N ≫ 1, even with t far from $t_{max}$, the mean score is close to $G_{max}$. Interestingly, the trade-off
between $t_{max}$ and Δ found for the 1D case seems to repeat here. We numerically find that $\sqrt{t_{max}}/\Delta$ is
constant for a given N (see Figure 4b).
Figure 4. (a) Score as a function of the number of measurements for N = 2J = 20. For each Δ, there is
an optimal repetition rate $t_{max}$. The optimal score $G_{max}$ saturates for Δ big enough. (b) Ratio $\sqrt{t_{max}}/\Delta$ for
N = 5, 10, 15, 20. As in the 1D case, $\sqrt{t_{max}}/\Delta$ is constant and only depends on N. (c) Score G as a function
of the number of measurements t, for N = 2J = 1.50 and $\Delta = 8\sqrt{N}$. See Section 4.3 for further details.
Most importantly, for weak enough measurements, the obtained score is close to the optimal one,
as shown in Figure 5. Numerical fluctuations prevent any precise statements about an estimation of the
error, but the error is close to what was obtained with the initial measurement procedure; see Figure 5.
Figure 5. Mean score $G_{max}$ as a function of N maximized over t. A too strong measurement (Δ = 1) fails
to achieve an optimum. A weak enough measurement ($\Delta = 8\sqrt{N}$) achieves a good score. The inset
shows $\epsilon_N = \frac{N}{2}(1 - G_N)$. See Section 4.3 for further details.
5. Conclusions
In this paper, we asked the question of how to model everyday measurements of a macroscopic
system within quantum mechanics. We introduced the notion of Macroscopic Quantum Measurement
and argued that such a measurement should be highly non-invasive, collect a large amount of
information in a single shot and be described by a “fairly simple” coupling between system and
observer. We proposed a concrete model based on a pointer von Neumann measurement inspired
by the Arthur–Kelly model, where a pointer is coupled to the macroscopic quantum system through
a Hamiltonian and then measured. This approach applies to many situations, as long as a natural
Hamiltonian for the measured system can be found.
Here, we focused on the problem of a direction estimation. The Hamiltonian naturally couples
the spin of the macroscopic quantum state to the position of a pointer in three dimensions, which
is then measured. This reveals information about the initial direction of the state. We extended our
previous study to consider a collection of aligned spins, which exploits the non-monotonic behavior of
the mean score as a function of the coupling strength. We presented more precise results. We relaxed
the assumptions about the measured system, by considering a thermal state of finite temperature
and showed that our initial conclusions are still valid. We also relaxed the assumptions over the
measurement scheme, looking at its approximation by a repetition of ultra weak measurements
in several orthogonal directions. Here again, we obtained numerical results supporting the initial
conclusion. In summary, this MQM proposal tolerates several relaxations regarding lack of control
or knowledge.
It is likely that these two relaxations can be unified: polarization measurement of systems with
an unknown number of particles or temperature should be accessible via the repeated 1D ultra-weak
measurement method. However, this claim has to be justified numerically. Further open questions
include the behavior of Arthur–Kelly models in other situations where two or more non-commuting
quantities have to be estimated, e.g., for position and velocity estimation.
Acknowledgments: We would like to thank Tomer Barnea for fruitful discussions. Partial financial support by
ERC-AG MEC and Swiss NSF is gratefully acknowledged.
Author Contributions: Nicolas Gisin suggested the study. Marc-Olivier Renou and Florian Fröwis performed
the simulations and worked out the theory. Marc-Olivier Renou wrote the paper. All authors discussed the
results and implications and commented on the manuscript at all stages. All authors have read and approved the
final manuscript.
Conflicts of Interest: The authors declare no conflict of interest.
24. We can interpret this physically. We see from Section 2.3 that the best covariant measurement is obtained
from $O_z \propto |J, J\rangle\langle J, J|$. Other covariant measurements can be obtained with $O_z \propto |m, J\rangle\langle m, J|$ for 0 ≤ m < J.
The coefficients $o_m$ can be interpreted as how much each of these measurements is done. The term $|m, J\rangle\langle m, J|$
can also be thought of as the physical system used to measure. When it is highly polarized (m = J), the
measurement is efficient. However, when the polarization is low, the information gain is weak, e.g., for m = 0,
we clearly see that all POVM elements are $\propto \mathbb{1}$.
25. Hacohen-Gourgy, S.; Martin, L.S.; Flurin, E.; Ramasesh, V.V.; Whaley, K.B.; Siddiqi, I. Quantum dynamics of
simultaneously measured non-commuting observables. Nature 2016, 538, 491–494.
© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
entropy
Article
Iterant Algebra
Louis H. Kauffman
Department of Mathematics, Statistics and Computer Science, University of Illinois at Chicago,
851 South Morgan Street, Chicago, IL 60607-7045, USA; [email protected]; Tel.: +1-773-363-5115
Abstract: We give an exposition of iterant algebra, a generalization of matrix algebra that is motivated
by the structure of measurement for discrete processes. We show how Clifford algebras and matrix
algebras arise naturally from iterants, and we then use this point of view to discuss the Schrödinger
and Dirac equations, Majorana Fermions, representations of the braid group and the framed braids
in relation to the structure of the Standard Model for physics.
Keywords: iterant; Clifford algebra; matrix algebra; braid group; Fermion; Dirac equation
1. Introduction
This is a paper about an approach to algebra that we call iterants. The idea behind the definition
of iterant (see Section 2) is that one is studying a periodic discrete process with an associated action
of a group of permutations on the sequences of the process. The simplest such discrete system is
an alternation between +1 and −1. We will show that this system gives rise in a natural way to
the square root of minus one. This way of thinking about the square root of minus one as an iterant
is explained below. More generally, by starting with a discrete time series of positions, one has a
non-commutativity of observations due to time-delays (the clock must tick to measure a velocity) and
this non-commutativity can be encapsulated in a generalized iterant algebra as defined in Section 3 of
the present paper. Iterant algebra generalizes matrix algebra and we shall see how it can be used to
formulate the algebra of the framed Artin Braid Group, the Lie algebra su(3) for the Standard Model
for particle physics, the framed braid representations for Fermions of Sundance Bilson-Thompson,
the Clifford algebra for Majorana Fermions and the structure of the Schrödinger and Dirac equations.
This paper is a sequel to [1] and it uses material from that paper and extends it into the more general
context of the present paper. See also [1–4] for previous work by the author about iterants. This
paper also incorporates results of the author that appear in the joint paper of the author and Rukhsan
Ul-Haq [5]. Our intent is to give a picture of the range of application of the basic mathematical idea of
iterants and to include a description of the basic results that make them work.
This paper is organized as follows. Sections 2–4 are devoted to the mathematics of iterants. Each
remaining section of the paper applies the iterant structure to a topic in mathematical physics that is of
interest to the author. We hope that the reader finds the first few sections to be a readable introduction
to iterants. An interested reader can then turn to the remaining sections to see how iterants can be
used in specific cases. The reader should note that applying iterants often means reformulating a
topic usually written in matrix algebra in terms of iterant algebra, and that the specific interest in such a
formulation may be, at this time, of a formal nature. Nevertheless, the reformulation often raises many
interesting questions, and these will be the subject of subsequent work.
Sections 2 and 3 are an introduction to the process algebra of iterants and how the square root of
minus one arises from an alternating process. Section 4 shows how iterants give an alternative way to
do matrix algebra. The section ends with the construction of the split quaternions. Section 4 considers
iterants of arbitrary period (not just two) and shows, with the example of the cyclic group, how the
ring of all n × n matrices can be seen as a faithful representation of an iterant algebra based on the
cyclic group of order n. We then generalize this construction (Theorem 1) to arbitrary non-commutative
finite groups G. Such a group has a multiplication table (n × n where n is the order of the group G).
We show that, by rearranging the multiplication table so that the identity elements appear on the
diagonal, we get a set of permutation matrices that represent the group faithfully as n × n matrices.
This gives a faithful representation of the iterant algebra associated with the group G onto the ring of
n × n matrices. As a result, we see that iterant algebra is fundamental to matrix algebra. Section 4 ends
with a number of classical examples including iterant representations for quaternion algebra.
Section 5 is a discussion of the Schrödinger equation. We formulate a discrete model related to
the diffusion equation by following a heuristic that would identify the square root of minus one as a
controlled oscillation between plus one and minus one. The resulting discrete model has the equation
(compare with [1])
\[ \psi(x, t+\tau) = \frac{(-1)^{n(t)}}{2}\,\psi(x-\Delta, t) + \left(1 - (-1)^{n(t)}\right)\psi(x, t) + \frac{(-1)^{n(t)}}{2}\,\psi(x+\Delta, t) \]
and satisfies a discrete version of the diffusion equation with an extra coefficient of (−1)^{n(t)}, where
n(t) denotes the number of time steps τ needed to reach time t. By dividing this discrete system into its
even and odd parts (the parity of (−1)^{n(t)}), we retrieve the Schrödinger equation, and the formalism of
the complex numbers handles the parity. In the discrete model, the iterant structure appears directly.
Section 6 discusses the iterant structure of the framed Artin braid group via framed braids and
discusses the basics of the Sundance Bilson-Thompson model for elementary particles. In Section 7,
we apply this to a formulation of the particle model of Sundance Bilson-Thompson [6], using
framed braids.
In Section 7, we give an iterant interpretation of the su(3) Lie algebra for the Standard Model
using [7]. The resulting formulation of the su(3) Lie algebra is particularly elegant from our point
of view, and we expect it to give further insight into the standard model. This iterant formulation
of the su(3) Lie Algebra is so concise that we show it here in the Introduction. We use the specific
iterant formulas
T₊ = [1, 0, 0]A, T₋ = [0, 1, 0]B,
U₊ = [0, 1, 0]A, U₋ = [0, 0, 1]B,
V₊ = [0, 0, 1]A, V₋ = [1, 0, 0]B,
T₃ = [1/2, −1/2, 0], Y = (1/√3)[1, 1, −2].
We have the permutation relations A[x, y, z] = [y, z, x]A and B = A² = A⁻¹ so that B[x, y, z] =
[z, x, y]B. This reduces the basic su(3) Lie algebra to a very elementary patterning of order three cyclic
operations. The details of this formulation are given in Section 7.
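The iterant formulas above can be checked mechanically. The following is a minimal sketch, assuming an iterant is stored as a dictionary that maps a power of A to a coefficient vector; the helper names `rot`, `mul` and `comm` are hypothetical and not taken from the paper. It multiplies iterants using only the rule A[x, y, z] = [y, z, x]A and evaluates a few commutators of the generators.

```python
import math

N = 3  # period: A**3 = 1 and B = A**2

def rot(v, k):
    """Vector that appears when A**k moves past v: A**k v = rot(v, k) A**k."""
    return tuple(v[(i + k) % N] for i in range(N))

def mul(p, q):
    """Product of iterants stored as {power of A: coefficient vector}."""
    out = {}
    for j, a in p.items():
        for k, b in q.items():
            term = tuple(x * y for x, y in zip(a, rot(b, j)))
            old = out.get((j + k) % N, (0,) * N)
            out[(j + k) % N] = tuple(x + y for x, y in zip(old, term))
    return out

def comm(p, q):
    """Commutator pq - qp, dropping components that vanish identically."""
    pq, qp = mul(p, q), mul(q, p)
    zero = (0,) * N
    out = {}
    for k in set(pq) | set(qp):
        v = tuple(x - y for x, y in zip(pq.get(k, zero), qp.get(k, zero)))
        if any(v):
            out[k] = v
    return out

# Generators as given in the text (key 1 is A, key 2 is B = A**2, key 0 is 1).
T_plus, T_minus = {1: (1, 0, 0)}, {2: (0, 1, 0)}
U_plus, U_minus = {1: (0, 1, 0)}, {2: (0, 0, 1)}
V_plus, V_minus = {1: (0, 0, 1)}, {2: (1, 0, 0)}
T3 = {0: (0.5, -0.5, 0)}
Y = {0: tuple(c / math.sqrt(3) for c in (1, 1, -2))}

print(comm(T_plus, T_minus))   # compare with twice T3
print(comm(T3, T_plus))        # compare with T_plus
print(comm(T_plus, U_plus))    # compare with V_minus
print(comm(Y, T_plus))         # empty when Y commutes with T_plus
```

In this encoding the commutators close on the same set of generators, which is the "elementary patterning of order three cyclic operations" referred to above; the dictionary representation is only one convenient choice.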
In Section 8 we apply this point of view on the Standard Model to obtain an embedding of the
framed braid algebra for the Sundance Bilson-Thompson model into the iterant version of su(3). These
three sections are an account of research of the author and Rukhsan Ul-Haq in [5].
Section 9 discusses how Clifford algebras are related to the structure of Fermions. We show how
the algebra of the split quaternions, the very first iterant algebra that appears in relation to the square
root of minus one, is behind the structure of the operator algebra of the electron. The Clifford structure
on two generators describes a pair of Majorana Fermion operators. Majorana Fermions are particles
that are their own antiparticles. These Majorana Fermion operators correspond to Clifford algebra
generators a and b such that a2 = b2 = 1 and ab = −ba. Using our iterant formulation, we can take a as
the iterant corresponding to a period two oscillation, and b as the time shifting operator. The product ab
is a square root of minus one in the non-commutative context of this Clifford algebra. The annihilation
operator for an electron can be symbolized by φ = ( a + ib)/2 and the creation operator for an electron
by φ† = (a − ib)/2. These form the operator algebra for an electron. Note that
φ² = (a + ib)(a + ib)/4 = (a² − b² + i(ab + ba))/4 = 0, and similarly (φ†)² = 0,
Entropy 2017, 19, 347
and therefore
φφ† + φ† φ = (φ + φ† )2 = a2 = 1.
The electron is seen in terms of its underlying Clifford structure in the form of a pair of Majorana
Fermions. Section 9 shows how braiding is related to the Majorana Fermions.
Section 10 discusses the structure of the Dirac equation and how the nilpotent and the Majorana
operators arise naturally in this context. This section provides a link between our work and the
work on nilpotent structures and the Dirac equation of Peter Rowlands [8]. We end this section
with an expression in split quaternions for the Majorana–Dirac equation in (3 + 1) spacetime.
The Majorana–Dirac equation can be written as follows:
\[ \left(\frac{\partial}{\partial t} + \hat{\eta}\eta\,\frac{\partial}{\partial x} + \epsilon\,\frac{\partial}{\partial y} + \hat{\epsilon}\eta\,\frac{\partial}{\partial z} - \hat{\epsilon}\hat{\eta}\eta\, m\right)\psi = 0, \]
where η and ε are the generators of our simplest iterant algebra with η² = ε² = 1 and ηε + εη = 0.
The elements ε̂, η̂ form a commuting copy of this algebra. This use of a combination of the simplest
Clifford algebra with itself is the underlying structure of the Majorana–Dirac equation.
We give a specific real solution to the Majorana–Dirac equation in our iterant/Clifford algebra
formalism. Here, ρ(x, t) = e^{(p•x − Et)}, where p = (p_x, p_y, p_z) is a constant vector momentum, and x
denotes the vector ( x, y, z). The solution to the Majorana–Dirac equation is Γ̂ρ( x, t) as shown below:
This solution is real in the sense that its coordinates are all real valued functions once the iterant
or matrix forms for the operators are made explicit. The combination of iterant and Clifford algebra
language that we develop here makes the analysis of certain aspects of the Dirac equation and the
Majorana–Dirac equation very clear. More work needs to be done on all these fronts.
This paper is a snapshot of a larger story. Iterant algebra is a basically simple reformulation
of aspects of patterned algebra that can often illuminate correspondingly elementary topics in
mathematics and physics. The present work is a beginning in the larger enterprise of understanding
relationships in discrete physics and relationships between algebra and physics.
2. Iterants
An iterant is a sum of elements of the form
aσ = [ a1 , a2 , ..., an ]σ,
where a = [ a1 , a2 , ..., an ] is a vector of elements that are scalars (usually real or complex numbers) and
σ is a permutation on n letters. Vectors are added and multiplied coordinatewise (see below), and we
take the following rule for multiplication of vector/permutation combinations:
(aσ)(bτ) = (a b^σ)(στ),
where b^σ denotes the vector b with its elements permuted by the action of σ.
If a and b are vectors, then ab denotes the vector, where ( ab)i = ai bi , and a + b denotes the vector
where ( a + b)i = ai + bi . Then,
(ka)σ = k( aσ)
for a scalar k, and
( a + b)σ = aσ + bσ,
where vectors are multiplied as above and we take the usual product of the permutations. All of matrix
algebra is naturally represented in the iterant framework, as we shall see in the next sections.
For example, if η is the order two permutation of two elements, then [a, b]^η = [b, a]. Thus,
[a, b]η [c, d]η = [a, b][c, d]^η η² = [a, b][d, c] = [ad, bc].
We define
i = [1, −1]η
and then
i² = [1, −1]η [1, −1]η = [1, −1][−1, 1]η² = [1, −1][−1, 1] = [−1, −1] = −1.
In this way, the complex numbers arise naturally from iterants. One can interpret [1, −1] as an
oscillation between +1 and −1 and η as a temporal shift operator. Then, i = [1, −1]η is a time sensitive
element and its self-interaction has square minus one. In this way, iterants can be interpreted as a
formalization of elementary discrete processes.
Note that if we let e = [1, −1], then e² = 1, η² = 1 and eη = −ηe. Thus, e and η generate a small
Clifford algebra.
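These period-two computations are easy to reproduce. Below is a minimal sketch, assuming an iterant [a, b] + [c, d]η is stored as the dictionary {0: (a, b), 1: (c, d)}; this storage scheme and the function name `mul` are illustrative choices, not the paper's notation. The only rule used is that η swaps the two entries of a vector when it moves past it.

```python
def mul(p, q):
    """Multiply period-two iterants stored as {0: (a, b), 1: (c, d)}, meaning [a, b] + [c, d]eta."""
    out = {}
    for j, a in p.items():
        for k, b in q.items():
            b_moved = b if j == 0 else (b[1], b[0])  # eta moving past b swaps its entries
            term = (a[0] * b_moved[0], a[1] * b_moved[1])
            old = out.get((j + k) % 2, (0, 0))
            out[(j + k) % 2] = (old[0] + term[0], old[1] + term[1])
    return out

e = {0: (1, -1)}      # the oscillation [1, -1]
eta = {1: (1, 1)}     # the shift operator eta
i = mul(e, eta)       # i = [1, -1]eta

print(mul(i, i))                 # {0: (-1, -1)}, i.e. the scalar -1
print(mul(e, eta), mul(eta, e))  # {1: (1, -1)} {1: (-1, 1)}: e and eta anticommute
```

The two printed products differ by an overall sign, which is the anticommutation eη = −ηe of the small Clifford algebra.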
· · · abababababab · · · .
We illustrate with period two. The elements of the waveform can be any mathematically or
empirically well-defined objects. We can regard the ordered pairs [ a, b] and [b, a] as abbreviations for
the waveform or as two points of view about the waveform (a first or b first). We have called [ a, b]
an iterant. Thinking of an iterant as a discrete process, we define a time shift operator η such that
[a, b]η = η[b, a] and η² = 1.
Discrete Calculus and the Temporal Shift Operator. If we have a discrete time series X, X′, X″, ···,
then it is convenient to define an operator J so that X^t J = J X^{t+1}, and it is this temporal shift operator
that can be used to correlate discrete calculus for the time-series. For example, we can define a discrete
derivative D by the equation
DX^t = J(X^{t+1} − X^t)/Δt,
(with time increment equal to Δt). Note then that the derivative is expressed as a commutator:
DX^t = [X^t, J]/Δt,
where here [R, S] = RS − SR is the commutator. This means that this discrete derivative satisfies the
Leibniz rule for products, and it can be used for formulations of discrete physics. This use of the
temporal shift operator dovetails with its use for keeping track of observation in a discrete model,
where successive observations require temporal shifts. In particular, let P = mDX^t and Q = X^t denote
momentum and position, respectively (m is mass and commutes with J, as does Δt). Then, P and Q
do not commute and the temporal shift operator J keeps track of the fact that measuring momentum
requires a tick of the clock. We can interpret PQ as first measuring Q and then measuring P, while QP
represents first measuring P and then measuring Q:
PQ = mJ(X^{t+1} − X^t)X^t/Δt,
QP = X^t mJ(X^{t+1} − X^t)/Δt = mJ X^{t+1}(X^{t+1} − X^t)/Δt.
Thus,
[Q, P] = QP − PQ = mJ(X^{t+1} − X^t)²/Δt = mJ(ΔX)²/Δt.
In this form of discrete physics, the commutator equation
[ Q, P] = k,
where k is a constant, is satisfied by a Brownian walk with diffusion constant (ΔX)²/Δt. In this way,
our interpretation of the square root of negative one in terms of the temporal shift operator fits into a
larger context of the physics of discrete observations. In this paper, we work with periodic series and
use cyclic operators such as η to keep track of the periodicity. For related discussion, see [2,3,5,9–16].
See also [17] for other uses of iterants in the context of Clifford algebras. For papers of the author about
discrete physics and quantum computing see [18–28].
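The commutator identities of this discussion can be tested in a small matrix model. The sketch below assumes a periodic time series of period five, represents X^t and X^{t+1} as diagonal matrices and J as a cyclic shift matrix, and sets Δt = 1 and m = 1 so that integer arithmetic stays exact. The matrix model and all names in it are assumptions made for illustration; the text itself treats J and X^t abstractly.

```python
N = 5
x = [3, 1, 4, 1, 5]   # one period of a sample time series (arbitrary integers)

def diag(v):
    return [[v[i] if i == j else 0 for j in range(N)] for i in range(N)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)] for i in range(N)]

def sub(A, B):
    return [[A[i][j] - B[i][j] for j in range(N)] for i in range(N)]

X = diag(x)                                        # X^t
X_next = diag([x[(j + 1) % N] for j in range(N)])  # X^{t+1}
J = [[1 if i == (j + 1) % N else 0 for j in range(N)] for i in range(N)]  # cyclic shift

shift_rule = matmul(X, J) == matmul(J, X_next)     # X^t J = J X^{t+1}
DX = matmul(J, sub(X_next, X))                     # D X^t = J (X^{t+1} - X^t), with dt = 1
commutator = sub(matmul(X, J), matmul(J, X))       # [X^t, J], with dt = 1

P, Q = DX, X                                       # P = m D X^t with m = 1, Q = X^t
QP_minus_PQ = sub(matmul(Q, P), matmul(P, Q))
step = sub(X_next, X)
expected = matmul(J, matmul(step, step))           # J (X^{t+1} - X^t)^2

print(shift_rule, DX == commutator, QP_minus_PQ == expected)
```

All three comparisons hold in this model because diagonal matrices commute with one another, which plays the role of the commuting values of the time series.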
We have defined products and sums of iterants as follows:
[a, b][c, d] = [ac, bd]
and
[a, b] + [c, d] = [a + c, b + d].
The operation of juxtaposition of waveforms is multiplication while + denotes ordinary addition of
ordered pairs. These operations are natural with respect to the structural juxtaposition of iterants:
...abababababab...
...cdcdcdcdcdcd...
Structures combine at the points where they correspond. Waveforms combine at the times where
they correspond. Iterants combine in juxtaposition. This theme of including the result of time in
observations of a discrete system occurs at the foundation of our construction.
In the next section, we show how all matrix algebra can be formulated in terms of iterants.
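A 2 × 2 matrix can be written in iterant form as
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} = [a, d]1 + [b, c]\eta, \]
where
\[ [x, y] = \begin{pmatrix} x & 0 \\ 0 & y \end{pmatrix} \quad\text{and}\quad \eta = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \]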
The reader will have no difficulty verifying that the usual definition of matrix multiplication
corresponds exactly to the iterant multiplication that we have already described. In particular,
[x, y][z, w] = [xz, yw],
[x, y] + [z, w] = [x + z, y + w]
and
[x, y]η = η[y, x].
Thus, matrix multiplication and addition is identical with iterant multiplication. There are many ways
to motivate the rules for matrix algebra. Iterants are a natural entry into matrix structure.
The fact that the iterant expression [ a, d]1 + [b, c]η captures the whole of 2 × 2 matrix algebra
corresponds to the fact that a two by two matrix is combinatorially the union of the identity pattern
(the diagonal) and the interchange pattern (the antidiagonal) that correspond to the operators 1 and η :
\[ \begin{pmatrix} * & @ \\ @ & * \end{pmatrix}. \]
In the formal diagram for a matrix shown above, we indicate the diagonal by ∗ and the anti-diagonal by @.
In the case of complex numbers, we represent
\[ \begin{pmatrix} a & -b \\ b & a \end{pmatrix} = [a, a] + [-b, b]\eta = a1 + b[-1, 1]\eta = a + bi. \]
In this way, we see that 2 × 2 matrix algebra can be seen as a hypercomplex number system based on
the symmetric group S₂. In the next section, we generalize this point of view to arbitrary finite groups
by generalizing Cayley’s Theorem that shows that every finite group has a faithful representation as a
permutation group.
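The representation of a complex number a + bi as the iterant [a, a] + [−b, b]η can be compared directly with ordinary complex arithmetic. The following is a minimal sketch; the function names `to_iterant`, `mul` and `to_complex` are hypothetical, and an iterant [a, d] + [b, c]η is stored as the pair ((a, d), (b, c)).

```python
def to_iterant(z):
    """Encode a + bi as the pair ((a, a), (-b, b)), meaning [a, a] + [-b, b]eta."""
    return ((z.real, z.real), (-z.imag, z.imag))

def mul(p, q):
    """Product of [a, d] + [b, c]eta and [e, h] + [f, g]eta, using [x, y]eta = eta[y, x]."""
    (a, d), (b, c) = p
    (e, h), (f, g) = q
    return ((a * e + b * g, d * h + c * f), (a * f + b * h, d * g + c * e))

def to_complex(p):
    """Read a + bi back from [a, a] + [-b, b]eta."""
    return complex(p[0][0], p[1][1])

z, w = complex(1, 2), complex(3, -1)
print(to_complex(mul(to_iterant(z), to_iterant(w))), z * w)
```

The four entries returned by `mul` are exactly the entries of the product of the corresponding 2 × 2 matrices, so the same function also multiplies general 2 × 2 matrices written in iterant form.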
The factorization of i into a product εη of non-commuting iterant operators (with ε = [1, −1]) shows, in the iterant
viewpoint, the temporal nature of i and its algebraic roots.
Note that the quaternions arise from the split quaternions: The split quaternions are the system
{±1, ±ε, ±η, ±i}.
Here, ε = [1, −1] satisfies εε = 1 = ηη while i = εη so that ii = −1. The quaternions come about once we
construct an extra square root of minus one that commutes with them. Call this extra root of minus one √−1. Then,
the quaternions are generated by
I = √−1 ε, J = εη, K = √−1 η
with
I² = J² = K² = IJK = −1.
Remark 1. The rest of this section is an exposition of the higher period iterants and the general Theorem 1 about
finite groups and iterant matrix representations. The exposition follows the corresponding exposition in our
paper [1].
· · · abcabcabcabcabcabc · · ·
Here, we see three natural iterant views (depending upon whether one starts at a, b or c).
The appropriate shift operator S is given by the formula
[x, y, z]S = S[z, x, y].
With T = S², we have
[x, y, z]T = T[y, z, x]
and S³ = 1. We obtain a closed algebra of iterants whose general element is of the form
[a, b, c] + [d, e, f]S + [g, h, k]S²,
where a, b, c, d, e, f, g, h, k are real or complex numbers. Call this algebra Vect₃(F) when the scalars are
in a commutative ring with unit F. Let M₃(F) denote the 3 × 3 matrix algebra over F. We have the:
Lemma 1. The iterant algebra Vect₃(F) is isomorphic to the full 3 × 3 matrix algebra M₃(F).
Proof. The element
[a, b, c] + [d, e, f]S + [g, h, k]S²
maps to the matrix
\[ \begin{pmatrix} a & d & g \\ h & b & e \\ f & k & c \end{pmatrix}, \]
preserving the algebra structure. Since any 3 × 3 matrix can be written uniquely in this form, it follows
that Vect₃(F) is isomorphic to the full 3 × 3 matrix algebra M₃(F).
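The correspondence in the proof can be checked numerically. The sketch below assumes a period-n iterant is stored as a dictionary {k: vector} standing for the sum of the terms vector·S^k, and it uses the commutation rule S[x, y, z] = [y, z, x]S, which is equivalent to [x, y, z]S = S[z, x, y]. The names `rot`, `mul`, `to_matrix` and `matmul` are hypothetical helpers, not notation from the paper.

```python
def rot(v, k):
    """S**k moving past v: S**k v = rot(v, k) S**k (cyclic shift of the entries)."""
    n = len(v)
    return tuple(v[(i + k) % n] for i in range(n))

def mul(p, q, n):
    """Product of period-n iterants stored as {k: vector}, meaning the sum of vector * S**k."""
    out = {}
    for j, a in p.items():
        for k, b in q.items():
            term = tuple(x * y for x, y in zip(a, rot(b, j)))
            old = out.get((j + k) % n, (0,) * n)
            out[(j + k) % n] = tuple(x + y for x, y in zip(old, term))
    return out

def to_matrix(p, n):
    """Place entry i of the S**k component in row i, column (i + k) mod n."""
    M = [[0] * n for _ in range(n)]
    for k, v in p.items():
        for i in range(n):
            M[i][(i + k) % n] += v[i]
    return M

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

p = {0: (1, 2, 3), 1: (4, 5, 6), 2: (7, 8, 9)}   # [a, b, c] + [d, e, f]S + [g, h, k]S^2
q = {0: (2, 0, 1), 1: (1, 1, 0), 2: (0, 3, 5)}
print(to_matrix(p, 3))
print(to_matrix(mul(p, q, 3), 3) == matmul(to_matrix(p, 3), to_matrix(q, 3)))
```

With a = 1, ..., k = 9 the first printed matrix reproduces the layout used in the proof, and the second line compares the iterant product with the ordinary matrix product.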
We can summarize the pattern behind this expression of 3 × 3 matrices by the following
symbolic matrix:
\[ \begin{pmatrix} 1 & S & T \\ T & 1 & S \\ S & T & 1 \end{pmatrix}. \]
Here, the letter S occupies the positions in the matrix that correspond to the permutation matrix
that represents it, and the letter T = S² occupies the positions corresponding to its permutation
matrix. The 1s occupy the diagonal for the corresponding identity matrix. The iterant representation
corresponds to writing the 3 × 3 matrix as a disjoint sum of these permutation matrices such that the
matrices themselves are closed under multiplication. In this case, the matrices form a permutation
representation of the cyclic group of order 3, C₃ = {1, S, S²}.
Remark 2. Note that a permutation matrix is a matrix of zeroes and ones such that some permutation of the
rows of the matrix transforms it to the identity matrix. Given an n × n permutation matrix P, we associate to it
a permutation
σ(P) : {1, 2, ···, n} −→ {1, 2, ···, n}
via the formula iσ(P) = j, where j denotes the column in P where the i-th row has a 1. Note that an element of
the domain of a permutation is indicated to the left of the symbol for the permutation. It is then easy to check
that for permutation matrices P and Q,
σ(P)σ(Q) = σ(PQ),
given that we compose the permutations from left to right according to this convention.
This construction generalizes directly for iterants of any period and hence for a set of operators
forming a cyclic group of any order. We shall generalize further to any finite group G. We now define
Vectn ( G, F) for any finite group G.
Definition 1. Let G be a finite group, written multiplicatively. Let F denote a given commutative ring with
unit. Assume that G acts as a group of permutations on the set {1, 2, 3, ···, n} so that given an element g ∈ G
we have (by abuse of notation)
g : {1, 2, 3, ···, n} −→ {1, 2, 3, ···, n}.
We shall write
ig
for the image of i ∈ {1, 2, 3, ···, n} under the permutation represented by g. The notation denotes functionality
from the left. We have (ig)h = i(gh) for all elements g, h ∈ G and i1 = i for all i, in order to have a
representation of G as permutations. We shall call an n-tuple of elements of F a vector and denote it by
a = (a_1, a_2, ···, a_n). We then define an action of G on vectors over F by the formula
a^g = (a_{1g}, a_{2g}, ···, a_{ng}),
and note that (a^h)^g = a^{gh} for all g, h ∈ G. Define an algebra Vectₙ(G, F), the iterant algebra for G, to be the
set of finite sums of formal products of vectors and group elements in the form ag with multiplication rule
(ag)(bh) = (a b^g)(gh),
and the understanding that (a + b)g = ag + bg for all vectors a, b and group elements g. It is understood
that vectors are added coordinatewise and multiplied coordinatewise. Thus, (a + b)_i = a_i + b_i and (ab)_i = a_i b_i.
Theorem 1. Let G be a finite group of order n [1]. Let ρ : G −→ Sn denote the right regular representation of G
as permutations of n objects. List the elements of G as G = { g1 , · · · , gn }, and let G act on its own underlying
set via the definition gi ρ( g) = gi g. Here, we describe ρ( g) acting on the set of elements gk of G. We also regard
ρ( g) as a mapping of the set {1, 2, · · · n}, replacing gk by k and iρ( g) = k where gi g = gk .
Then, Vectₙ(G, F) is isomorphic to the matrix algebra Mₙ(F). In particular, Vect_{n!}(Sₙ, F) is isomorphic
with the matrices of size n! × n!, M_{n!}(F).
Proof. Take the multiplication table for G to be the n × n matrix with columns and rows listed in
the order [g_1, ···, g_n]. Permute the rows of this matrix so that the diagonal consists of all 1s. Let the
resulting matrix be called the G-Table. The G-Table is labeled by elements of the group. For any vector
a, let D ( a) denote the n × n diagonal matrix whose entries in order down the diagonal are the entries
of a in the order specified by a. For each group element g, let Pg denote the permutation matrix with 1
in every spot on the G-Table that is labeled by g and 0 in all other spots. It is now a direct verification
that the mapping
F(Σ_{i=1}^{n} a_i g_i) = Σ_{i=1}^{n} D(a_i) P_{g_i}
defines an isomorphism from Vectₙ(G, F) to the matrix algebra Mₙ(F). The main point to check is
that σ(P_g) = ρ(g). We now prove this fact.
In the G-Table, the rows correspond to {g_1^{-1}, g_2^{-1}, ···, g_n^{-1}} and the columns correspond to
{g_1, g_2, ···, g_n} so that the i-i entry of the table is g_i^{-1} g_i = 1. With this, we have that, in the table,
a group element g occurs in the i-th row at column j where g_i^{-1} g_j = g. This is equivalent to the
equation g_i g = g_j which, in turn, is equivalent to the statement iρ(g) = j. This is exactly our functional
interpretation of the action of the permutation corresponding to the matrix P_g. Thus, ρ(g) = σ(P_g).
The rest of the proof is straightforward and left to the reader.
Example 1. Consider the cyclic group of order three,
C₃ = {1, S, S²},
with S³ = 1. Rearranging its multiplication table as in the proof of Theorem 1 gives the G-Table
\[ \begin{pmatrix} 1 & S & S^2 \\ S^2 & 1 & S \\ S & S^2 & 1 \end{pmatrix}, \]
and this is the G-Table that we used for Vect₃(C₃, F) prior to proving the Main Theorem. The same
pattern works for arbitrary cyclic groups.
Consider next the symmetric group of order six,
S₃ = {1, R, R², F, RF, R²F}.
The G-Table for S₃ encodes the isomorphism of Vect₆(S₃, F) with the full algebra of six by six matrices. Similarly,
Vect_{n!}(Sₙ, F) is isomorphic with the full algebra of n! × n! matrices. The permutation matrices are
obtained from the G-Table by choosing a given group element and then replacing it by 1 for each appearance
in the table, and replacing the other elements of the table by 0. For example, we have the permutation
matrix for R given by the formula below:
\[ R = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \end{pmatrix}. \]
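The G-Table construction and the permutation matrix for R can be reproduced in a few lines. This is a sketch under stated assumptions: R and F are realized here as a concrete 3-cycle and a transposition of three points (written 0-based), products are composed from left to right as in Remark 2, and the helper names are hypothetical. The paper does not fix a concrete realization of R and F, so the particular tuples below are only one choice that is consistent with the matrix shown above.

```python
def compose(g, h):
    """Left-to-right product gh of permutations of {0, 1, 2}: apply g first, then h."""
    return tuple(h[g[i]] for i in range(3))

def inverse(g):
    inv = [0] * 3
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)

E = (0, 1, 2)
R = (1, 2, 0)          # a 3-cycle (assumed concrete realization)
F = (1, 0, 2)          # a transposition (assumed concrete realization)
R2 = compose(R, R)
elements = [E, R, R2, F, compose(R, F), compose(R2, F)]   # 1, R, R^2, F, RF, R^2F

# G-Table: the entry in row i, column j is g_i^{-1} g_j, so the diagonal is all 1.
table = [[compose(inverse(gi), gj) for gj in elements] for gi in elements]

def perm_matrix(g):
    """P_g has a 1 wherever g appears in the G-Table and 0 elsewhere."""
    return [[1 if entry == g else 0 for entry in row] for row in table]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

for row in perm_matrix(R):
    print(row)
```

The same `perm_matrix` function yields a faithful matrix representation of the whole group, since the product of the matrices for g and h is the matrix for gh.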
Thus, we have the corresponding permutation matrices that I shall call E, A, B, C. The reader can verify
that A² = B² = C² = 1, AB = BA = C. Let
α = [1, −1, −1, 1], β = [1, 1, −1, −1], γ = [1, −1, 1, −1].
In addition, let
I = αA, J = βB, K = γC.
Then
I² = J² = K² = IJK = −1, IJ = K, JI = −K.
Thus, we have constructed the quaternions as iterants in relation to the Klein 4-Group. In Figure 1, we
illustrate these quaternion generators with string diagrams for the permutations. The reader can check
that the permutations correspond to the permutation matrices constructed for the Klein 4-Group.
[Figure 1: string diagrams for the quaternion generators 1, I, J, K as elements of the Klein Four-Group, with sign patterns (+ + + +), (+ − − +), (+ + − −), (+ − + −). Basic products: II = JJ = KK = IJK = −1 and IJ = K.]
4. Since complex numbers commute with one another, we could consider iterants whose values are in the
complex numbers. This is just like considering matrices whose entries are complex numbers. Thus, we
shall allow a version of i that commutes with the iterant shift operator η. Let this commuting i be denoted
by ι. Then, we are assuming that
ι² = −1,
ηι = ιη,
η² = +1.
We then consider iterants of the form [a + bι, c + dι] and [a + bι, c + dι]η = η[c + dι, a + bι]. In particular,
we have ε = [1, −1], and i = εη is quite distinct from ι. Note, as before, that εη = −ηε and that ε² = 1.
Now, let
I = ιε,
J = εη,
K = ιη.
Thus,
I² = J² = K² = IJK = −1.
This construction shows how the structure of the quaternions comes directly from the non-commutative
structure of period two iterants. The group SU (2) of 2 × 2 unitary matrices of determinant one is
isomorphic to the quaternions of length one.
5. Similarly,
\[ H = [a, b] + [c + d\iota, c - d\iota]\eta = \begin{pmatrix} a & c + d\iota \\ c - d\iota & b \end{pmatrix} \]
represents a Hermitian 2 × 2 matrix and hence an observable for quantum processes mediated by SU(2).
Hermitian matrices have real eigenvalues.
With a = T + X, b = T − X, c = Y and d = Z, it is not hard to see that the eigenvalues of H are
T ± √(X² + Y² + Z²). Thus, viewed as an observable, H
can observe the time and the invariant spatial distance from the origin of the event (T, X, Y, Z). At least
at this very elementary juncture, quantum mechanics and special relativity are reconciled.
6. Hamilton’s Quaternions are generated by iterants, as discussed above, and we can express them purely
algebraically by writing the corresponding permutations as shown below:
I = [+1, −1, −1, +1]s,
J = [+1, +1, −1, −1]l,
K = [+1, −1, +1, −1]t,
where
s = (12)(34),
l = (13)(24),
t = (14)(23).
Here, we represent the permutations as products of transpositions (ij). The transposition (ij) interchanges
i and j, leaving all other elements of {1, 2, ..., n} fixed.
One can verify that
I² = J² = K² = IJK = −1.
Note that making an iterant interpretation of an entity like I = [+1, −1, −1, +1]s is a conceptual
departure from our original period two iterant (or cyclic period n) notion. Now, we consider iterants such
as [+1, −1, −1, +1] where the permutation group acts to produce other orderings of a given sequence.
The iterant itself can represent a form that can be seen in any of its possible orders. These orders are
subject to permutations that produce the possible views of the iterant. Algebraic structures such as the
quaternions appear in the explication of such forms.
7. In all these examples, we can interpret the iterants as shorthand for matrix algebra based on permutation
matrices, or as indicators of discrete processes. The discrete processes become more complex in proportion
to the complexity of the groups used in the construction. We began with processes of order two, then
considered cyclic groups of arbitrary order, then the symmetric group S3 in relation to 6 × 6 matrices,
and the Klein 4-Group in relation to the quaternions. In the case of the quaternions, we know that
this structure is intimately related to rotations of three- and four-dimensional space and many other
geometric themes.
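Two of the claims in the examples above lend themselves to a quick numerical check: the quaternion relations for iterants over the Klein 4-Group, and the eigenvalues of the Hermitian iterant H. The sketch below assumes the multiplication rule (ag)(bh) = (a b^g)(gh) with (b^g)_i = b_{ig}, writes the permutations (12)(34), (13)(24), (14)(23) as 0-based tuples, and assumes the parametrization a = T + X, b = T − X, c = Y, d = Z for H. The function names and the sample values of T, X, Y, Z are illustrative only.

```python
import math

def mul(p, q):
    """(a g)(b h) = (a b^g)(gh): b^g has entries b[g[i]]; gh applies g first, then h."""
    (a, g), (b, h) = p, q
    vec = tuple(a[i] * b[g[i]] for i in range(len(a)))
    perm = tuple(h[g[i]] for i in range(len(g)))
    return (vec, perm)

# Klein 4-group permutations of {1, 2, 3, 4}, written 0-based as tuples.
s = (1, 0, 3, 2)   # (12)(34)
l = (2, 3, 0, 1)   # (13)(24)
t = (3, 2, 1, 0)   # (14)(23)
identity = (0, 1, 2, 3)

I = ((1, -1, -1, 1), s)
J = ((1, 1, -1, -1), l)
K = ((1, -1, 1, -1), t)
minus_one = ((-1, -1, -1, -1), identity)

print(mul(I, I) == mul(J, J) == mul(K, K) == mul(mul(I, J), K) == minus_one)
print(mul(I, J) == K)

def H_matrix(T, X, Y, Z):
    """H = [T+X, T-X] + [Y+Zi, Y-Zi]eta as a 2 x 2 complex matrix."""
    return [[T + X, complex(Y, Z)], [complex(Y, -Z), T - X]]

def char_poly(H, lam):
    """det(H - lam*1), which vanishes exactly at the eigenvalues of H."""
    return (H[0][0] - lam) * (H[1][1] - lam) - H[0][1] * H[1][0]

T, X, Y, Z = 2.0, 1.0, -3.0, 0.5
r = math.sqrt(X * X + Y * Y + Z * Z)
H = H_matrix(T, X, Y, Z)
print(abs(char_poly(H, T + r)), abs(char_poly(H, T - r)))  # both close to zero
```

The characteristic polynomial is used instead of a numerical eigenvalue solver so that the sketch needs only the standard library.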
5. Schrödinger’s Equation
In this section, we go more deeply into a treatment of Schrödinger’s equation that was begun
in the introduction to [1]. In that paper, we used this example for Schrödinger’s equation to
motivate the introduction of iterants. Here, we already have iterants, but we find that a discrete
model for Schrödinger’s equation instantiates an alternating pattern that is essentially of the form
· · · + − + − + − + · · · , and the problem of taking the continuum limit of this discrete model leads to
the complex numbers by a parity consideration. The parity consideration corresponds to our iterant
construction of the square root of minus one, and so we see in this model how the iterant square root
of minus one can correspond to an alternation in a discrete process while the usual square root of
minus one describes the behaviour of the limit of the process.
Consider a Brownian walk on the line in which a particle moves according to
x(t + τ) = x(t) ± Δ,
so that the time step is τ and the space step is of absolute value Δ. We regard the probability of left or
right steps as equal, so that if P( x, t) denotes the probability that the Brownian particle is at point x at
time t, then
P( x, t + τ ) = P( x − Δ, t)/2 + P( x + Δ, t)/2.
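The probability recursion can be iterated exactly. The following minimal sketch stores P as a dictionary from lattice positions (in units of Δ) to exact fractions and applies the recursion ten times, starting from a walker located at the origin; the function names and the choice of ten steps are arbitrary illustrative assumptions.

```python
from fractions import Fraction

def step(P):
    """One time step: P(x, t + tau) = P(x - Delta, t)/2 + P(x + Delta, t)/2.

    Positions are stored as integer multiples of Delta.
    """
    new = {}
    for x, p in P.items():
        new[x - 1] = new.get(x - 1, 0) + p / 2   # half of the weight steps left
        new[x + 1] = new.get(x + 1, 0) + p / 2   # half of the weight steps right
    return new

def variance(P):
    mean = sum(x * p for x, p in P.items())
    return sum((x - mean) ** 2 * p for x, p in P.items())

P = {0: Fraction(1)}       # the walker starts at the origin with certainty
for _ in range(10):
    P = step(P)
print(sum(P.values()), variance(P))   # total probability and variance after 10 steps
```

The total probability stays equal to one, and the variance grows by one unit of Δ² per time step, which is the linear spreading characteristic of diffusion.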
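From this equation for the probability, we can write a difference equation for the partial derivative of
the probability with respect to time:
\[ \frac{P(x, t+\tau) - P(x, t)}{\tau} = \frac{\Delta^2}{2\tau}\left[\frac{P(x-\Delta, t) - 2P(x, t) + P(x+\Delta, t)}{\Delta^2}\right]. \]
The expression in brackets on the right-hand side is a discrete approximation to the second partial of
P(x, t) with respect to x. Thus, if the ratio C = Δ²/2τ remains constant as the space and time intervals
approach zero, then this equation goes in the limit to the diffusion equation
\[ \frac{\partial P}{\partial t} = C\,\frac{\partial^2 P}{\partial x^2}. \]
The Schrödinger equation is
iħ∂ψ/∂t = Hψ,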
where the Hamiltonian H is given by the equation H = p²/2m + V, where V(x, t) is the potential
energy and p = (ħ/i)∂/∂x is the momentum operator. With this, we have p²/2m = (−ħ²/2m)∂²/∂x².
Thus, with V(x, t) = 0, the equation becomes iħ∂ψ/∂t = (−ħ²/2m)∂²ψ/∂x², which simplifies to
\[ \frac{\partial\psi}{\partial t} = \frac{i\hbar}{2m}\,\frac{\partial^2\psi}{\partial x^2}. \]
Thus, we have arrived at the form of the diffusion equation with an imaginary constant, and it is
possible to make the identification with the diffusion equation by setting
ħ/m = Δ²/τ,
where Δ denotes a space interval, and τ denotes a time interval as explained in the last section
about the Brownian walk. With this, we can ask what space interval and time interval will satisfy
this relationship? One answer is that this equation is satisfied when m is the Planck mass, Δ is
the Planck length and τ is the Planck time. Note that L²/T = (ħ/Mc)²/(ħ/Mc²) = ħ/M. Here,
ħ is Planck’s constant divided by 2π, c is the speed of light and G is Newton’s gravitational constant, with
M = √(ħc/G), L = ħ/Mc, T = ħ/Mc².
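The identity L²/T = ħ/M can be confirmed numerically from the definitions of the Planck units. The sketch below uses the CODATA 2018 values of ħ, c and G in SI units; the variable names are illustrative.

```python
import math

hbar = 1.054571817e-34   # reduced Planck constant, J s
c = 299792458.0          # speed of light, m/s
G = 6.67430e-11          # Newton's gravitational constant, m^3 kg^-1 s^-2

M = math.sqrt(hbar * c / G)   # Planck mass, kg
L = hbar / (M * c)            # Planck length, m
T = hbar / (M * c ** 2)       # Planck time, s

print(M, L, T)
print(L ** 2 / T, hbar / M)   # the two sides of L^2/T = hbar/M
```

The two numbers on the last line agree up to floating-point rounding, so the choice m = M, Δ = L, τ = T does satisfy ħ/m = Δ²/τ.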
What does all this say about the nature of the Schrödinger equation itself? Consider a discrete
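function ψ(x, t) defined (recursively) by the following equation:
\[ \psi(x, t+\tau) = \frac{i}{2}\,\psi(x-\Delta, t) + (1 - i)\,\psi(x, t) + \frac{i}{2}\,\psi(x+\Delta, t). \]
In other words, we are thinking here of a random “quantum walk” where the amplitude for stepping
right or stepping left is proportional to i while the amplitude for not moving at all is proportional to
(1 − i). It is then easy to see that ψ is a discretization of
\[ \frac{\partial\psi}{\partial t} = \frac{i\Delta^2}{2\tau}\,\frac{\partial^2\psi}{\partial x^2}. \]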
This gives a direct interpretation of the solution to the Schrödinger equation as a limit of a sum over
generalized Brownian paths with complex amplitudes.
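Replacing i by an Iterant. Now, however, suppose that we replace i by (−1)^{n(t)} at time step t = n(t)τ,
where n(t) is a non-negative integer. Instead of writing
\[ \psi(x, t+\tau) = \frac{i}{2}\,\psi(x-\Delta, t) + (1 - i)\,\psi(x, t) + \frac{i}{2}\,\psi(x+\Delta, t), \]
we will write
\[ \psi(x, t+\tau) = \frac{(-1)^{n(t)}}{2}\,\psi(x-\Delta, t) + \left(1 - (-1)^{n(t)}\right)\psi(x, t) + \frac{(-1)^{n(t)}}{2}\,\psi(x+\Delta, t), \]
so that the diffusion equation seems to have been replaced with an equation of the form
\[ \frac{\partial\psi}{\partial t} = (-1)^{n(t)}\,\kappa\,\frac{\partial^2\psi}{\partial x^2} \]
for a constant κ, although there is no direct meaning for the coefficient (−1)^{n(t)}
in the realm of continuous time. In the discrete world, the wave function ψ divides into ψe and ψo
where the (discrete) time, n(t), is either even or odd. We write
∂tψe = κ∂²xψo,
∂tψo = −κ∂²xψe,
and take
ψ = ψe + iψo,
so that
i∂tψ = i∂t(ψe + iψo) = i∂tψe − ∂tψo = iκ∂²xψo + κ∂²xψe = κ∂²x(ψe + iψo) = κ∂²xψ.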
This is the Schrödinger equation. Instead of the simple diffusion equation, we have a mutual
dependency where the temporal variation of ψe is mediated by the spatial variation of ψo
and vice-versa:
ψ = ψe + iψo ,
∂t ψe = κ∂2x ψo ,
∂t ψo = −κ∂2x ψe ,
Note that in terms of the iterant interpretation, the pair [ψe, ψo] is an abbreviation of the temporal
series ···ψ_t, ψ_{t+τ}, ψ_{t+2τ}, ··· that represents the discrete process
ψ_{t+τ}(x) = ((−1)^{n(t)}/2)ψ_t(x − Δ) + (1 − (−1)^{n(t)})ψ_t(x) + ((−1)^{n(t)}/2)ψ_t(x + Δ).
Here, the process itself is not periodic, but the underlying
alternation of the parity of (−1)^{n(t)} gives the iterant structure that allows the use of i as a combination
of shift and permutation.
Remark 3. The discrete recursion at the beginning of this section can be implemented to approximate solutions
to the Schrödinger equation. This will be the subject of another paper. The main point of this section is that a
discrete version of the Schrödinger equation actually uses the temporal iterant interpretation of the square root
of minus one, so that one can think of this oscillation as part of a discrete process behind the Schrödinger
evolution. This reformulation of basic quantum mechanics deserves further study.
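As a rough illustration of how the recursion of this section might be iterated, the sketch below applies one and the same update rule with either the complex amplitude i or the alternating sign (−1)^n in place of the coefficient. The periodic lattice, the six-site initial state and the function names are assumptions made only for this example; the sketch is not the implementation announced in the remark, and it makes no claim about numerical stability over many steps.

```python
def step(psi, coeff):
    """One step of the recursion on a periodic lattice (the periodicity is an assumption).

    psi(x, t + tau) = (coeff/2) psi(x - Delta, t) + (1 - coeff) psi(x, t)
                      + (coeff/2) psi(x + Delta, t)
    """
    n = len(psi)
    return [coeff / 2 * psi[(x - 1) % n] + (1 - coeff) * psi[x] + coeff / 2 * psi[(x + 1) % n]
            for x in range(n)]

def evolve_with_i(psi, steps):
    """Use the complex amplitude i at every time step."""
    for _ in range(steps):
        psi = step(psi, 1j)
    return psi

def evolve_with_iterant(psi, steps):
    """Replace i by (-1)**n at the n-th time step, so psi stays real."""
    for n in range(steps):
        psi = step(psi, (-1) ** n)
    return psi

start = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
print(evolve_with_i(start, 4))
print(evolve_with_iterant(start, 4))
```

In both variants the three coefficients of the update sum to one, so the sum of ψ over the lattice is unchanged from step to step; in the iterant variant the values remain real, and the complex structure is carried by the parity of the time step.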
6. The Framed Braid Group and the Sundance Bilson-Thompson Model for Elementary Particles
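The reader should recall that the symmetric group Sₙ has presentation
Sₙ = (T_1, ···, T_{n−1} | T_i² = 1, T_i T_{i+1} T_i = T_{i+1} T_i T_{i+1}, T_i T_j = T_j T_i for |i − j| > 1).
The Artin Braid Group Bₙ is a relative of the symmetric group that is obtained by removing the
condition that each generator has a square equal to the identity:
Bₙ = (σ_1, ···, σ_{n−1} | σ_i σ_{i+1} σ_i = σ_{i+1} σ_i σ_{i+1}, σ_i σ_j = σ_j σ_i for |i − j| > 1).
In Figure 2, we illustrate the generators σ1, σ2, σ3 of the 4-strand braid group and we show the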
topological nature of the relation σ1 σ2 σ1 = σ2 σ1 σ2 and the commuting relation σ1 σ3 = σ3 σ1 . Topological
braids are represented as collections of always descending strings, starting from a row of points and
ending at another row of points. The strings are embedded in three-dimensional space and can wind
around one another. The elementary braid generators σi correspond to the i-th strand interchanging
with the (i + 1)-th strand. Two braids are multiplied by attaching the bottom endpoints of one braid
to the top endpoints of the other braid to form a new braid.
There is a fundamental homomorphism
π : Bn −→ Sn
defined on generators by
π (σi ) = Ti
in the language of the presentations above. In terms of the diagrams in Figure 2, a braid diagram is a
permutation diagram if one forgets about its weaving structure of over and under strands at a crossing.
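The homomorphism π can be sketched in a few lines of Python. The encoding is an assumption made only for this illustration and is not notation from the paper: a braid word is a list of nonzero integers, with k standing for σk and −k for its inverse, and the function name `braid_permutation` is hypothetical.

```python
def braid_permutation(word, n):
    """Image of a braid word under pi: B_n -> S_n.

    `word` is a list of nonzero integers: k stands for sigma_k and
    -k for its inverse (an illustrative encoding, assumed here).
    Returns a tuple `arr` where arr[j] is the top position of the
    strand that ends at bottom position j (0-based).
    """
    arr = list(range(n))
    for k in word:
        i = abs(k) - 1          # sigma_k exchanges positions i and i+1
        if not 0 <= i < n - 1:
            raise ValueError("generator index out of range")
        # pi forgets over/under information, so sigma_k and its
        # inverse both map to the same transposition T_k.
        arr[i], arr[i + 1] = arr[i + 1], arr[i]
    return tuple(arr)


# The braid relation and the commuting relation survive in S_4.
same_braid_relation = braid_permutation([1, 2, 1], 4) == braid_permutation([2, 1, 2], 4)
same_commutation = braid_permutation([1, 3], 4) == braid_permutation([3, 1], 4)
# The image of sigma_1 squared is T_1^2, which is the identity permutation.
square_is_identity = braid_permutation([1, 1], 4) == (0, 1, 2, 3)
```

Since the sign of each letter is ignored, the sketch makes visible exactly what π discards: the weaving of over and under strands.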
Figure 2. Braid generators σ1, σ2, σ3 and the inverse of σ1, with the relations σ1^{-1}σ1 = 1, σ1σ2σ1 = σ2σ1σ2 and σ1σ3 = σ3σ1.
We now turn to a generalization of the braid group, the framed braid group. In this generalization,
we associate elements of the form $t^a$ to the top of each braid strand. For these purposes, it is useful
to take $t$ as an algebraic variable and $a$ as an integer. To interpret this framing geometrically, replace
each braid strand by a ribbon and interpret $t^a$ as a $2\pi a$ twist in the ribbon. In Figure 3, we illustrate
how to multiply two framed braids. In our formalism, the braids $A$ and $B$ in this figure are given by
the formulas
$$A = [t^a, t^b, t^c]\,\sigma_1 \sigma_2 \sigma_3,$$
$$B = [t^d, t^e, t^f]\,\sigma_2 \sigma_3,$$
in the framed braid group on three strands, denoted $FB_3$. As Figure 3 illustrates, we have the
basic formula
$$v\sigma = \sigma v^{\pi(\sigma)},$$
where $v$ is a vector of the form $v = [t^a, t^b, t^c]$ (for $n = 3$) and $v^{\pi(\sigma)}$ denotes the action of the permutation
associated with the braid $\sigma$ on the vector $v$. In the figure, the permutation is accomplished by sliding
the algebra along the strings of the braid.
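The sliding rule can be sketched as follows, under stated assumptions. A framed braid is stored in the normal form "framing vector times braid word": a pair (list of integer exponents of t, list of signed generator indices). This data layout and the names `strand_arrangement` and `multiply_framed` are hypothetical. The word σ1σ2σ1 used for A below is chosen only because its permutation reverses the three strands, which matches the framings a + f, b + e, c + d drawn for the product in Figure 3.

```python
def strand_arrangement(word, n):
    """arr[j] = top position of the strand ending at bottom position j."""
    arr = list(range(n))
    for k in word:
        i = abs(k) - 1
        arr[i], arr[i + 1] = arr[i + 1], arr[i]
    return arr


def multiply_framed(a, b):
    """Product AB of framed braids in the normal form (framing, word).

    A framed braid is a pair (v, word): v lists the integer exponents
    of t at the top of each strand, and word is a list of signed
    generator indices (k for sigma_k, -k for its inverse).
    """
    v, g = a
    w, h = b
    n = len(v)
    arr = strand_arrangement(g, n)
    product_framing = list(v)
    for j in range(n):
        # The framing w[j] sits at the top of B, that is, at bottom
        # position j of A; slide it up the strand of A that ends there.
        product_framing[arr[j]] += w[j]
    return product_framing, list(g) + list(h)


# Hypothetical exponents (a, b, c) = (1, 2, 3) and (d, e, f) = (10, 20, 30).
A = ([1, 2, 3], [1, 2, 1])
B = ([10, 20, 30], [2, 1])
C = ([5, 6, 7], [1])
framing, word = multiply_framed(A, B)
```

With these values the product framing is [a + f, b + e, c + d] = [31, 22, 13], and the rule is associative because sliding along a composite braid is the same as sliding along its factors one after the other.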
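Figure 3. Multiplication of framed braids: A carries the framings $t^a, t^b, t^c$ and B carries $t^d, t^e, t^f$ at the tops of its strands; sliding the framings of B up along the strands of A gives the product AB with framings $t^{a+f}, t^{b+e}, t^{c+d}$.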
We can form an algebra $Alg[FB_n]$ by taking formal sums of framed braids of the form $\sum_k c_k v_k G_k$,
where $c_k$ is a scalar, $v_k$ is a framing vector and $G_k$ is an element of the Artin Braid group $B_n$. Since
braids act on framing vectors by permutations, this algebra is a generalization of the iterant algebras
we have defined so far. The algebra of framed braids uses an action of the braid group based on its
representation to the symmetric group. Furthermore, the representation $\pi : B_n \to S_n$ induces a map
of algebras
$$\hat{\pi} : Alg[FB_n] \to Alg[FS_n],$$
and
$$e^- = \sigma_2 \sigma_1^{-1}\,[t^{-1}, t^{-1}, t^{-1}].$$
Here, we use $[t^a, t^b, t^c]$ for the framing numbers $(a, b, c)$. Products of framed braids correspond to
particle interactions. Note that $e^+ e^- = [1, 1, 1] = \gamma$, so that the electron and the positron are inverses
in this algebra. Figure 5 illustrates the representations of bosons, including $\gamma$, which is both a photon and
the identity element in this algebra. Other relations in the algebra correspond to particle interactions.
For example, Figure 6 illustrates the muon decay:
$$\mu \to \nu_\mu + W^- \to \nu_\mu + \bar{\nu}_e + e^-.$$
The reader can see the definitions of the different parts of this decay sequence from the three
figures we have just mentioned. Note that, strictly speaking, the muon decay is a multiplicative identity
in the braid algebra:
$$\mu = \nu_\mu W^- = \nu_\mu \bar{\nu}_e e^-.$$
Particle interactions in this model are mediated by factorizations in the non-commutative algebra of the
framed braids.
Figure 4. Sundance Bilson-Thompson Framed Braid Fermions (“(3)” under the labels for the up and
down quarks and antiquarks represent the fact that there are three permutations of charge placement
giving the three colours).
Figure 5. Bosons.
By using the representation $\hat{\pi} : Alg[FB_3] \to Alg[FS_3]$, we can form the image of the structure of
Bilson-Thompson’s framed braids in the iterant algebra corresponding to the symmetric group.
However, we propose to change this map so that we have a non-trivial representation of the Artin
braid group. This can be accomplished by defining
where
$$\rho(\sigma_k) = [t, t, t]\,T_k$$
and
$$\rho(\sigma_k^{-1}) = [t^{-1}, t^{-1}, t^{-1}]\,T_k$$
for k = 1, 2. The reader will find that we have now represented the braid group in the iterant algebra
Alg[ FS3 ] and extended the representation to the framed braid group algebra. Thus, the Sundance
Bilson-Thompson representation of elementary particles as framed braids is represented inside the iterant algebra
for the symmetric group on three letters. In Section 10, we carry this further and place the representation
inside the Lie Algebra su(3).
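The claim that ρ respects the braid relations can be checked directly. The sketch below assumes that T1 and T2 are realized as the 3 × 3 permutation matrices of the adjacent transpositions and that [t, t, t] acts as the scalar matrix t times the identity; the value t = 2 is an arbitrary exact choice made for the test, not a value taken from the paper.

```python
from fractions import Fraction


def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scale(c, x):
    return [[c * entry for entry in row] for row in x]


IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
T1 = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]   # transposition of letters 1 and 2
T2 = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]   # transposition of letters 2 and 3

t = Fraction(2)   # any invertible value works; 2 keeps the arithmetic exact

# [t, t, t] is a scalar matrix, so [t, t, t] T_k is simply t * T_k.
rho = {1: scale(t, T1), 2: scale(t, T2)}
rho_inv = {1: scale(1 / t, T1), 2: scale(1 / t, T2)}

braid_lhs = matmul(matmul(rho[1], rho[2]), rho[1])
braid_rhs = matmul(matmul(rho[2], rho[1]), rho[2])
inverse_ok = all(matmul(rho[k], rho_inv[k]) == IDENTITY for k in (1, 2))
sigma1_squared = matmul(rho[1], rho[1])   # equals t^2 times the identity
```

The last line shows in what sense the representation is non-trivial: the image of σ1 squared is t² times the identity, which differs from the identity unless t² = 1, whereas under π̂ it would collapse to the identity.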
The group $SU(3)$ consists of the matrices $U(\epsilon_1, \cdots, \epsilon_8) = e^{i \sum_a \epsilon_a \lambda_a}$, where $\epsilon_1, \cdots, \epsilon_8$ are real
numbers and $a$ ranges from 1 to 8. The Gell-Mann matrices satisfy the following relations:
$$\mathrm{tr}(\lambda_a \lambda_b) = 2\delta_{ab},$$
We now give an iterant representation for these matrices that is based on the pattern
$$\begin{pmatrix} 1 & A & B \\ B & 1 & A \\ A & B & 1 \end{pmatrix}$$
as described in the previous section. That is, we use the cyclic group of order three to represent all
$3 \times 3$ matrices as iterants based on the permutation matrices
$$A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}.$$
The reader will have no difficulty verifying the following formulas for the Gell-Mann matrices in the
iterant format:
λ1 = [1, 0, 0] A + [0, 1, 0] B,
λ2 = [−i, 0, 0] A + [0, i, 0] B,
λ4 = [1, 0, 0] B + [0, 0, 1] A,
λ5 = [i, 0, 0] B + [0, 0, −i ] A,
λ6 = [0, 1, 0] A + [0, 0, 1] B,
T± = F1 ± iF2 ,
U± = F6 ± iF7 ,
V± = F4 ± iF5 ,
T3 = F3 ,
$$Y = \frac{2}{\sqrt{3}} F_8.$$
Iterant Formulation of the su(3) Lie Algebra. We now have the specific iterant formulas
T+ = [1, 0, 0] A,
T− = [0, 1, 0] B,
U+ = [0, 1, 0] A,
U− = [0, 0, 1] B,
V+ = [0, 0, 1] A,
V− = [1, 0, 0] B,
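These formulas can be checked mechanically. The sketch below reads an iterant vector [a, b, c] as a diagonal matrix multiplying A or B from the left, and assumes the usual normalization F_a = λ_a/2, which is not restated in this part of the text. Only the Gell-Mann matrices whose iterant forms are quoted above (λ1, λ2, λ4, λ5) are used.

```python
def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(x, y):
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


def scale(c, x):
    return [[c * entry for entry in row] for row in x]


def diag(a, b, c):
    """The iterant vector [a, b, c] viewed as a diagonal matrix."""
    return [[a, 0, 0], [0, b, 0], [0, 0, c]]


def trace(x):
    return sum(x[i][i] for i in range(len(x)))


A = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
B = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]


def iterant(va, vb):
    """The matrix [va] A + [vb] B."""
    return add(matmul(diag(*va), A), matmul(diag(*vb), B))


# Iterant formulas quoted in the text.
lam1 = iterant((1, 0, 0), (0, 1, 0))
lam2 = iterant((-1j, 0, 0), (0, 1j, 0))
lam4 = iterant((0, 0, 1), (1, 0, 0))
lam5 = iterant((0, 0, -1j), (1j, 0, 0))

# Assumption: F_a = lambda_a / 2, so T+- = (lambda_1 +- i lambda_2) / 2, etc.
T_plus = scale(0.5, add(lam1, scale(1j, lam2)))
T_minus = scale(0.5, add(lam1, scale(-1j, lam2)))
V_plus = scale(0.5, add(lam4, scale(1j, lam5)))
V_minus = scale(0.5, add(lam4, scale(-1j, lam5)))

# Trace normalization tr(lambda_a lambda_b) = 2 delta_ab on the four matrices.
lams = [lam1, lam2, lam4, lam5]
gram = [[trace(matmul(x, y)) for y in lams] for x in lams]
```

Each ladder operator comes out as a single matrix unit, which is why its iterant form has only one non-zero entry.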
We then have A = QP, B = PQ, R = PQP = QPQ. The two transpositions P and Q generate the entire
group of permutations S3. It is usual to think of the order-three transformations A and B as expressed
in terms of these transpositions, but we can also use the iterant structure of the 3 × 3 matrices to express
P, Q and R in terms of A and B. The result is as follows:
Recall from the previous section that we have the iterant generators for the su(3) Lie algebra:
T+ = [1, 0, 0] A,
T− = [0, 1, 0] B,
U+ = [0, 1, 0] A,
U− = [0, 0, 1] B,
V+ = [0, 0, 1] A,
V− = [1, 0, 0] B.
Thus, we can express these transpositions P and Q in the iterant form of the Lie algebra as
P = [0, 0, 1] + T+ + T− ,
Q = [1, 0, 0] + U+ + U− ,
R = [0, 1, 0] + V+ + V− .
The basic permutations receive elegant expressions in the iterant Lie algebra.
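A short check of these expressions, again reading an iterant vector as a diagonal matrix acting from the left (the helper names are illustrative only):

```python
def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(*matrices):
    """Entrywise sum of any number of 3 x 3 matrices."""
    return [[sum(m[i][j] for m in matrices) for j in range(3)]
            for i in range(3)]


def diag(a, b, c):
    """The iterant vector [a, b, c] viewed as a diagonal matrix."""
    return [[a, 0, 0], [0, b, 0], [0, 0, c]]


A = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
B = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]

# Iterant generators of su(3) as listed in the text.
T_plus, T_minus = matmul(diag(1, 0, 0), A), matmul(diag(0, 1, 0), B)
U_plus, U_minus = matmul(diag(0, 1, 0), A), matmul(diag(0, 0, 1), B)
V_plus, V_minus = matmul(diag(0, 0, 1), A), matmul(diag(1, 0, 0), B)

P = add(diag(0, 0, 1), T_plus, T_minus)
Q = add(diag(1, 0, 0), U_plus, U_minus)
R = add(diag(0, 1, 0), V_plus, V_minus)
```

With these matrices P exchanges the first two coordinates, Q the last two and R the outer two, and the relations A = QP, B = PQ and R = PQP = QPQ hold.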
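Now that we have basic permutations in the Lie algebra, we can take the map from Section 7
with
$$\rho(\sigma_k) = [t, t, t]\,T_k$$
and
$$\rho(\sigma_k^{-1}) = [t^{-1}, t^{-1}, t^{-1}]\,T_k.$$
Taking $T_1 = P$ and $T_2 = Q$, this reads
$$\rho(\sigma_1) = [t, t, t]\,P$$
and
$$\rho(\sigma_1^{-1}) = [t^{-1}, t^{-1}, t^{-1}]\,P$$
and
$$\rho(\sigma_2) = [t, t, t]\,Q$$
and
$$\rho(\sigma_2^{-1}) = [t^{-1}, t^{-1}, t^{-1}]\,Q.$$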
By choosing $t \neq 1$ on the unit circle in the complex plane, we obtain representations of the
Sundance Bilson-Thompson constructions of Fermions via framed braids inside the su(3) Lie algebra.
This brings the Bilson-Thompson formalism in direct contact with the Standard Model via our iterant
representations. We shall return to these relationships in a sequel to the present paper.
ψψ† + ψ† ψ = 1.
If you have more than one of them, say ψ and φ, then they anti-commute:
ψφ = −φψ.
Majorana Fermion operators c satisfy c† = c so that the corresponding particles are their own
anti-particles. A group of researchers [32] claims, at this writing, to have found Majorana Fermions in
edge effects in nano-wires.
Majorana operators are related to standard Fermions as follows: the algebra for Majoranas is
$c = c^\dagger$ and $cc' = -c'c$ if $c$ and $c'$ are distinct Majorana Fermions with $c^2 = 1$ and $c'^2 = 1$. One can make
a standard Fermion operator from two Majorana operators via
$$\psi = (c + ic')/2,$$
$$\psi^\dagger = (c - ic')/2.$$
Similarly, one can mathematically make two Majoranas from any single Fermion. If one takes a set
of Majoranas
$$\{c_1, c_2, c_3, \cdots, c_n\},$$
then there are natural braiding operators that act on the vector space with these $c_k$ as the basis.
The operators are mediated by algebra elements that themselves satisfy braiding relations
$$\tau_k = (1 + c_{k+1} c_k)/\sqrt{2},$$
$$\tau_k^{-1} = (1 - c_{k+1} c_k)/\sqrt{2}.$$
The braiding operators are
$$T_k : \mathrm{Span}\{c_1, c_2, \cdots, c_n\} \to \mathrm{Span}\{c_1, c_2, \cdots, c_n\}$$
via
$$T_k(x) = \tau_k x \tau_k^{-1}.$$
A direct computation gives
$$T_k(c_k) = c_{k+1},$$
$$T_k(c_{k+1}) = -c_k,$$
and $T_k$ is the identity otherwise. We then have a unitary representation of the Artin braid group.
See Figure 7 for a depiction of the braiding of Majorana Fermions in relation to the topology of a
belt that connects them. In quantum mechanics, we must represent rotations of three-dimensional
space as unitary transformations. This relationship between rotations and unitary transformations
is encoded in the topology of the belt. See [34] for more about this topological view of the physics
of Fermions. In the figure, we see that the strictly topological belt does not know which of the two
Fermions will individually acquire a phase change, but the Ivanov algebra above makes this decision.
More understanding is needed in this area of subtle topological structure of Fermions.
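The Ivanov braiding can be checked on a concrete matrix model. The sketch below is one possible realization and is an assumption of this example, not the paper's construction: four Majorana operators are built from Pauli matrices by a Jordan-Wigner style tensor product, and the helper names (`kron`, `tau`, `braid`) are hypothetical.

```python
import math


def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(x, y):
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


def scale(c, x):
    return [[c * entry for entry in row] for row in x]


def kron(x, y):
    """Kronecker (tensor) product of two matrices."""
    return [[a * b for a in rx for b in ry] for rx in x for ry in y]


def close(x, y, tol=1e-12):
    return all(abs(p - q) < tol for rx, ry in zip(x, y) for p, q in zip(rx, ry))


I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
ID4 = kron(I2, I2)
ZERO4 = [[0] * 4 for _ in range(4)]

# c[0], ..., c[3] stand for c_1, ..., c_4: each squares to 1 and
# distinct ones anti-commute.
c = [kron(X, I2), kron(Y, I2), kron(Z, X), kron(Z, Y)]


def tau(k, sign=1):
    """tau_k for sign=+1 and its inverse for sign=-1 (k = 1, 2, 3)."""
    product = matmul(c[k], c[k - 1])              # c_{k+1} c_k
    return scale(1 / math.sqrt(2), add(ID4, scale(sign, product)))


def braid(k, x):
    """T_k(x) = tau_k x tau_k^{-1}."""
    return matmul(matmul(tau(k), x), tau(k, -1))


# A standard Fermion operator built from the Majorana pair c_1, c_2.
psi = scale(0.5, add(c[0], scale(1j, c[1])))
psi_dag = scale(0.5, add(c[0], scale(-1j, c[1])))
```

In this model T_1 sends c_1 to c_2 and c_2 to −c_1 while fixing c_3 and c_4, and the elements τ_k satisfy τ1τ2τ1 = τ2τ1τ2 and τ1τ3 = τ3τ1 up to floating-point rounding.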
Figure 7. Topological exchange and the Ivanov braiding transformation of Majorana Fermion operators: T(x) = y, T(y) = −x. (Note that x goes to the y-position and y goes to the x-position with a twist.)
Recall that, in discussing the inception of iterants, we introduced a temporal shift operator η such that
[ a, b]η = η [b, a]
and
ηη = 1
for any iterant [ a, b]. In this way, we have a Clifford algebra generated by e = [1, −1] and η. We can
take e and η as Majorana Fermion operators and construct Fermion operators
ψ = (e + iη )/2,
ψ† = (e − iη )/2.
Here, i is an extra square root of minus one that commutes with the operators e and η. We arrive at
fermions in a few short steps from the origin of the iterants. Algebraically, we have controlled the
period two oscillation e so that it satisfies the fermion algebra. From the point of view taken in this
paper, it is worth examining if this discrete process view of fermion algebra and Majorana operator
algebra can shed light on the many properties in this domain. In particular, I would like to see if there
is insight into the braiding of Majorana Fermion operators to be gained from the iterant viewpoint.
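The few short steps from iterants to the fermion algebra can be verified in the 2 × 2 matrix model of iterants, where [a, b] is a diagonal matrix and η is the swap matrix. The matrix model is the standard one for period-two iterants; the function names are illustrative.

```python
def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(x, y):
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


def scale(c, x):
    return [[c * entry for entry in row] for row in x]


def iterant(a, b):
    """The period-two iterant [a, b] as a diagonal matrix."""
    return [[a, 0], [0, b]]


ID2 = [[1, 0], [0, 1]]
ZERO2 = [[0, 0], [0, 0]]
ETA = [[0, 1], [1, 0]]        # temporal shift operator
e = iterant(1, -1)            # the period-two oscillation

# Fermion operators built from e and eta; 1j plays the role of the
# extra square root of minus one that commutes with e and eta.
psi = scale(0.5, add(e, scale(1j, ETA)))
psi_dag = scale(0.5, add(e, scale(-1j, ETA)))

shift_rule_ok = matmul(iterant(3, 7), ETA) == matmul(ETA, iterant(7, 3))
clifford_ok = (matmul(e, e) == ID2 and matmul(ETA, ETA) == ID2
               and add(matmul(e, ETA), matmul(ETA, e)) == ZERO2)
nilpotent_ok = matmul(psi, psi) == ZERO2
anticommutator = add(matmul(psi, psi_dag), matmul(psi_dag, psi))
```

The checks confirm that e and η behave as a pair of Majorana operators and that ψ is nilpotent with ψψ† + ψ†ψ = 1.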
If the speed of light is equal to 1 (by convention), then energy $E$, momentum $p$ and mass $m$ are
related by the (Einstein) equation
$$E^2 = p^2 + m^2.$$
Dirac constructed his equation by finding an algebraic square root of $p^2 + m^2$. A corresponding linear
operator for $E$ can then take the role of the Hamiltonian in the Schrödinger equation. We first assume
that $p$ is a scalar (using one dimension of space and one dimension of time). Let $E = \alpha p + \beta m$, where $\alpha$
and $\beta$ are elements of a non-commutative, associative algebra. Then,
$$E^2 = \alpha^2 p^2 + \beta^2 m^2 + pm(\alpha\beta + \beta\alpha).$$
Hence $E^2 = p^2 + m^2$ provided that $\alpha^2 = \beta^2 = 1$ and $\alpha\beta + \beta\alpha = 0$.
Let
O = i∂/∂t + iα∂/∂x − βm
so that the Dirac equation takes the form
O ψ( x, t) = 0.
Then,
$$U^2 = -E^2 + p^2 + m^2 = 0.$$
The nilpotent element U leads to the same plane wave solution to the Dirac equation as follows.
We have shown that
O ψ = Δψ
for $\psi = e^{i(px - Et)}$. It then follows that
$$O(\beta\alpha\Delta\beta\alpha\psi) = \Delta\beta\alpha\Delta\beta\alpha\psi = U^2\psi = 0,$$
then
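$$D\tilde{\psi} = (-\beta\alpha E + \beta p - \alpha m)\psi = U^\dagger \tilde{\psi},$$
giving a definition of $U^\dagger$ for the anti-particle for $U\psi$.
$$U = \beta\alpha E + \beta p - \alpha m$$
and
$$U^\dagger = -\beta\alpha E + \beta p - \alpha m.$$
and
$$(U - U^\dagger)^2 = (2\beta\alpha E)^2 = -4E^2.$$
$$U^2 = (U^\dagger)^2 = 0,$$
and
$$UU^\dagger + U^\dagger U = 4E^2.$$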
The Fermion operator algebra emerges from these plane wave solutions to the Dirac equation.
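These identities can be confirmed in any matrix model with α² = β² = 1 and αβ = −βα. The sketch below assumes the 2 × 2 real model α = [[0, 1], [1, 0]], β = [[1, 0], [0, −1]] and the on-shell values E = 5, p = 3, m = 4 (so that E² = p² + m²); both choices are made only for this illustration.

```python
def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(*matrices):
    """Entrywise sum of any number of 2 x 2 matrices."""
    return [[sum(m[i][j] for m in matrices) for j in range(2)]
            for i in range(2)]


def scale(c, x):
    return [[c * entry for entry in row] for row in x]


ID2 = [[1, 0], [0, 1]]
ZERO2 = [[0, 0], [0, 0]]
ALPHA = [[0, 1], [1, 0]]
BETA = [[1, 0], [0, -1]]
BETA_ALPHA = matmul(BETA, ALPHA)

E, p, m = 5, 3, 4            # on shell: 25 = 9 + 16

U = add(scale(E, BETA_ALPHA), scale(p, BETA), scale(-m, ALPHA))
U_dag = add(scale(-E, BETA_ALPHA), scale(p, BETA), scale(-m, ALPHA))

difference = add(U, scale(-1, U_dag))              # equals 2 * beta*alpha * E
anticommutator = add(matmul(U, U_dag), matmul(U_dag, U))
```

Off shell the square of U is (−E² + p² + m²) times the identity, so nilpotency of U is exactly the energy-momentum relation.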
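The decomposition of $U$ and $U^\dagger$ into the corresponding Majorana Fermion operators corresponds
to the decomposition of the energy into momentum and mass: $E^2 = p^2 + m^2$. Normalizing by dividing
by $2E$, that is, taking $A = (U + U^\dagger)/2E$ and $B = (U - U^\dagger)/2iE$, we have
$$A = (\beta p - \alpha m)/E$$
and
$$B = -i\beta\alpha,$$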
so that
$$A^2 = B^2 = 1$$
and
AB + BA = 0.
Then,
U = ( A + Bi ) E
and
U † = ( A − Bi ) E,
showing how the Fermion operators are expressed in terms of the simpler Clifford algebra of Majorana
operators (split quaternions once again). We can take A = e and B = η and regard these Fermion
annihilation and creation operators in the simplest iterant framework.
$$O = \hat{E} - \alpha\hat{p} - \beta m,$$
where $\hat{E}$ is the energy operator and $\hat{p}$ is the momentum operator. Then, a solution
$$\phi = A + \alpha B + \beta C + \alpha\beta D$$
satisfies
$$O\phi = 0.$$
[ A, B, C, D ]α = [ B, A, D, C ]
and
[ A, B, C, D ] β = [C, − D, A, − B].
Thus, the structure corresponds to the action of the split quaternions as a signed Klein 4-group.
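The two signed permutations can be checked against an explicit matrix model. The sketch assumes that the coefficients A, B, C, D commute with α and β, and reads the iterant rules as multiplication of φ = A + αB + βC + αβD by α or β from the left; this reading, the 2 × 2 model and the sample coefficients are assumptions of the example.

```python
def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


ID2 = [[1, 0], [0, 1]]
ALPHA = [[0, 1], [1, 0]]
BETA = [[1, 0], [0, -1]]
ALPHA_BETA = matmul(ALPHA, BETA)


def phi(a, b, c, d):
    """The element a + alpha*b + beta*c + alpha*beta*d as a matrix."""
    basis = (ID2, ALPHA, BETA, ALPHA_BETA)
    coeffs = (a, b, c, d)
    return [[sum(k * m[i][j] for k, m in zip(coeffs, basis)) for j in range(2)]
            for i in range(2)]


# Arbitrary sample coefficients standing for A, B, C, D.
a, b, c, d = 2, 3, 5, 7

alpha_action_ok = matmul(ALPHA, phi(a, b, c, d)) == phi(b, a, d, c)
beta_action_ok = matmul(BETA, phi(a, b, c, d)) == phi(c, -d, a, -b)
```

Ignoring signs, α and β induce the double transpositions (AB)(CD) and (AC)(BD), which together with their product generate a Klein 4-group.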
The equation O φ = 0 becomes four operator equations involving these signed permutations:
O φ = ( Ê − α p̂ − βm)( A + αB + βC + αβD ) =
The plane wave solution φ = ( E + αp + βm)ei( px− Et) k corresponds, in this iterant formalism, to φ =
[ E, p, m, 0]ei( px−Et) .
In this way, we can think of a solution to the Dirac equation as an iterant composed of four
complex valued functions taken in order with the given action of the split quaternions as described
above. This can then be reformulated as a single recursive system, as we did for the Schrödinger
equation in the introduction. The analogs for the way the recursion acts on the time steps of the
recursion are given by the action of the split quaternions rather than the action of the complex numbers
([ a, b]i = [−b, a]). The idea remains the same, and the matrix representations for the Dirac algebra arise
naturally from the algebra itself.
O ψ( x, t) = 0.
Let
$$\psi(x, t) = e^{i(p \bullet x - Et)}$$
and construct solutions by first applying the Dirac operator to this $\psi$. The modified Dirac operator is
$$D = i\beta\alpha\,\partial/\partial t + \beta\,\nabla \bullet \sigma - \alpha m.$$
We have that
D ψ = Uψ,
where $U = \beta\alpha E + \beta\, p \bullet \sigma - \alpha m$. Here, $U^2 = 0$ and $U\psi$ is a solution to the modified Dirac Equation.
We can use the Fermion operators as creation and annihilation operators, and locate the corresponding
Majorana Fermion operators. We leave these details to the reader.
Let $\hat{\epsilon}$ and $\hat{\eta}$ generate another, independent algebra of split quaternions, commuting with the first
algebra generated by $\epsilon$ and $\eta$. Then, a totally real Majorana Dirac equation can be written as follows:
(Here, the “hats” denote the quantum differential operators corresponding to the energy and
momentum.) will satisfy
$$\hat{E}^2 = \hat{p}_x^2 + \hat{p}_y^2 + \hat{p}_z^2 + m^2$$
if the algebra generated by $\alpha_x, \alpha_y, \alpha_z, \beta$ has each generator of square one and each distinct pair of
generators anti-commuting. From there, we obtain the general Dirac equation by replacing $\hat{E}$ by $i\partial/\partial t$,
and $\hat{p}_x$ with $-i\partial/\partial x$ (and the same for $y$, $z$):
This is equivalent to
(∂/∂t + α x ∂/∂x + αy ∂/∂y + αz ∂/∂z + iβm)ψ = 0.
Thus, here we take
$$\alpha_x = \hat{\eta}\eta, \quad \alpha_y = \epsilon, \quad \alpha_z = \hat{\epsilon}\eta, \quad \beta = i\hat{\epsilon}\hat{\eta}\eta,$$
and observe that these elements satisfy the requirements for the Dirac algebra. Since the algebra
appearing in the Majorana–Dirac operator is constructed entirely from two commuting copies of the
split quaternions, there is no appearance of the complex numbers, and when written out in 2 × 2
matrices, we obtain coupled real differential equations to be solved.
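The requirements for the Dirac algebra can be checked in a matrix model. The sketch below writes ε, η for the generators of the first copy of the split quaternions and ε̂, η̂ for the second, and realizes the two commuting copies on the two factors of a tensor product of 2 × 2 real matrices. This realization and all helper names are assumptions of the example.

```python
def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(x, y):
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


def scale(c, x):
    return [[c * entry for entry in row] for row in x]


def kron(x, y):
    """Kronecker (tensor) product of two matrices."""
    return [[a * b for a in rx for b in ry] for rx in x for ry in y]


I2 = [[1, 0], [0, 1]]
EPS2 = [[1, 0], [0, -1]]      # the iterant [1, -1]
ETA2 = [[0, 1], [1, 0]]       # the shift
ID4 = kron(I2, I2)
ZERO4 = [[0] * 4 for _ in range(4)]

# First copy acts on the left tensor factor, second copy on the right one.
eps, eta = kron(EPS2, I2), kron(ETA2, I2)
eps_hat, eta_hat = kron(I2, EPS2), kron(I2, ETA2)

alpha_x = matmul(eta_hat, eta)
alpha_y = eps
alpha_z = matmul(eps_hat, eta)
beta = scale(1j, matmul(matmul(eps_hat, eta_hat), eta))

gens = [alpha_x, alpha_y, alpha_z, beta]
squares_ok = all(matmul(g, g) == ID4 for g in gens)
anticommute_ok = all(
    add(matmul(gens[i], gens[j]), matmul(gens[j], gens[i])) == ZERO4
    for i in range(4) for j in range(i + 1, 4))

# The operator contains alpha_x, alpha_y, alpha_z and i*beta; all are real.
real_ok = all(complex(v).imag == 0
              for g in (alpha_x, alpha_y, alpha_z, scale(1j, beta))
              for row in g for v in row)
```

Only the combination iβ enters the equation, and iβ = −ε̂η̂η has real entries in this model, which is the sense in which no complex numbers appear.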
$$MO\,\rho(x, t) = (-E + \hat{\eta}\eta\, p_x + \epsilon\, p_y + \hat{\epsilon}\eta\, p_z - \hat{\epsilon}\hat{\eta}\eta\, m)\,\rho(x, t).$$
Let
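$$\Gamma = -E + \hat{\eta}\eta\, p_x + \epsilon\, p_y + \hat{\epsilon}\eta\, p_z - \hat{\epsilon}\hat{\eta}\eta\, m,$$
and
$$\hat{\Gamma} = -E - \hat{\eta}\eta\, p_x - \epsilon\, p_y - \hat{\epsilon}\eta\, p_z + \hat{\epsilon}\hat{\eta}\eta\, m.$$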
Then, we have
Γ̂Γ = 0,
since all algebraic coefficients square to minus one, and anti-commute. Therefore,
Thus,
$$\hat{\Gamma}\rho(x, t) = (-E - \hat{\eta}\eta\, p_x - \epsilon\, p_y - \hat{\epsilon}\eta\, p_z + \hat{\epsilon}\hat{\eta}\eta\, m)\,\rho(x, t)$$
is a solution to the Majorana–Dirac equation. When this solution is written out into its components, it
is an entirely real valued solution since the components of the matrices representing the algebra are all
real numbers. Recall from the earlier part of this section that we were able to reformulate solutions of
this kind for the usual Dirac equation in terms of the nilpotent formalism with the algebraic element
$U$ with $U^2 = 0$. Here, we can produce real solutions to the Majorana–Dirac equation, but it does not
seem possible to put them in the nilpotent formalism. This is surely a reflection of the fact that these
solutions are not Fermions in the usual sense. On the other hand, one can regard the solution Γ̂ρ( x, t)
in relation to the algebra element $\hat{\Gamma}$, and this algebra element is a combination of Majorana Fermion
operators $\{\hat{\eta}\eta, \epsilon, \hat{\epsilon}\eta, \hat{\epsilon}\hat{\eta}\eta\}$ in the sense of Clifford algebra or iterant operators that we have used earlier
in this paper. Thus, we see that there is at least the beginning of a relationship between the modern use
of the Majorana Fermion operators and the original intents of Ettore Majorana to find real solutions to
the Dirac equation.
We would like to know if there are other ways to produce such real Dirac equations, and particularly
if there are ways to accomplish this aim that do not algebraically entangle the two copies of the split
quaternions as our construction (and Majorana’s original construction) seems to require.
Acknowledgments: It gives the author great pleasure to thank G. Spencer-Brown, James Flagg, Alex Comfort,
David Finkelstein, Pierre Noyes, Peter Rowlands, Sam Lomonaco, Bernd Schmeikal and Rukhsan Ul-Haq for
conversations related to the considerations in this paper.
Conflicts of Interest: The author declares no conflict of interest.
References
1. Kauffman, L.H. Iterants, Fermions and Majorana Operators. In Unified Field Mechanics—Natural Science Beyond
the Veil of Spacetime; Amoroso, R., Kauffman, L.H., Rowlands, P., Eds.; World Scientific Pub. Co.: Singapore,
2015; pp. 1–32.
2. Kauffman, L.H. Knot Logic. In Knots and Applications; Kauffman, L., Ed.; World Scientific Pub. Co.: Singapore,
1994; pp. 1–110.
3. Kauffman, L.H. Knot logic and topological quantum computing with Majorana fermions. In Logic and
Algebraic Structures in Quantum Computing and Information; Lecture Notes in Logic; Chubb, J., Eskandarian, A.,
Harizanov, V., Eds.; Cambridge University Press: Cambridge, UK, 2016; 124p.
4. Kauffman, L.H.; Lomonaco, S.J. Braiding, Majorana Fermions and Topological Quantum Computing, (to appear
in Special Issue of QIP on Topological Quantum Computing). In Proceedings of the 2nd International Conference
and Exhibition on Mesoscopic and Condensed Matter Physics, Chicago, IL, USA, 26–28 October 2016.
5. Ul Haq, R.; Kauffman, L.H. Iterants, Idempotents and Clifford algebra in Quantum Theory. arXiv 2017,
arXiv:1705.06600.
6. Bilson-Thompson, S.O. A topological model of composite fermions. arXiv 2006, arXiv:hep-ph/0503213.
7. Gasiorowicz, S. Elementary Particle Physics; Wiley: New York, NY, USA, 1966.
8. Rowlands, P. Zero to Infinity: The Foundations of Physics; Series on Knots and Everything, Volume 41; World
Scientific Publishing Co.: Singapore, 2007.
9. Spencer-Brown, G. Laws of Form; George Allen and Unwin Ltd.: London, UK, 1969.
10. Kauffman, L. Sign and Space. In Religious Experience and Scientific Paradigms, Proceedings of the 1982 IASWR
Conference; Institute of Advanced Study of World Religions: Stony Brook, NY, USA, 1985; pp. 118–164.
11. Kauffman, L. Self-reference and recursive forms. J. Soc. Biol. Struct. 1987, 10, 53–72.
12. Kauffman, L. Special relativity and a calculus of distinctions. In Proceedings of the 9th Annual International
Meeting of ANPA, Cambridge, UK, 23–28 September 1987; pp. 290–311.
13. Kauffman, L. Imaginary values in mathematical logic. In Proceedings of the Seventeenth International
Conference on Multiple Valued Logic, Boston, MA, USA, 26–28 May 1987; pp. 282–289.
14. Kauffman, L.H. Biologic. AMS Contemp. Math. Ser. 2002, 304, 313–340.
15. Kauffman, L.H. Temperley-Lieb Recoupling Theory and Invariants of Three-Manifolds (Annals Studies-114);
Princeton University Press: Princeton, NJ, USA, 1994.
16. Kauffman, L.H. Time imaginary value, paradox sign and space. In Computing Anticipatory Systems, Proceedings
of the AIP Conference CASYS—Fifth International Conference, Liege, Belgium, 13–18 August 2001; Dubois, D., Ed.;
AIP Conference Publishing: Melville, NY, USA, 2002; Volume 627.
17. Schmeikal, B. Decay of Motion: The Anti-Physics of SpaceTime; Nova Publishers, Inc.: Hauppauge, NY, USA, 2014.
18. Kauffman, L.H.; Noyes, H.P. Discrete Physics and the Derivation of Electromagnetism from the formalism of
Quantum Mechanics. Proc. R. Soc. Lond. A 1996, 452, 81–95.
19. Kauffman, L.H.; Noyes, H.P. Discrete Physics and the Dirac Equation. Phys. Lett. A 1996, 218, 139–146.
20. Kauffman, L.H. Noncommutativity and discrete physics. Phys. D Nonlinear Phenom. 1998, 120, 125–138.
21. Kauffman, L.H. Space and time in discrete physics. Int. J. Gen. Syst. 1998, 27, 241–273.
22. Kauffman, L.H. A non-commutative approach to discrete physics. In Aspects II: Proceedings of ANPA 20; ANPA:
Stanford, CA, USA, 1999; pp. 215–238.
23. Kauffman, L.H. Non-commutative calculus and discrete physics. In Boundaries: Scientific Aspects of ANPA 24;
ANPA: Stanford, CA, USA, 2003; pp. 73–128.
24. Kauffman, L.H. Non-commutative worlds. New J. Phys. 2004, 6, 73.
25. Kauffman, L.H. Non-commutative worlds and classical constraints. In Scientific Essays in Honor of Pierre Noyes
on the Occasion of His 90-th Birthday; Amson, J., Kauffman, L.H., Eds.; World Scientific Pub. Co.: Singapore,
2013; pp. 169–210.
26. Kauffman, L.H. Differential geometry in non-commutative worlds. In Quantum Gravity: Mathematical Models
and Experimental Bounds; Fauser, B., Tolksdorf, J., Zeidler, E., Eds.; Birkhauser: Basel, Switzerland, 2007;
pp. 61–75.
27. Kauffman, L.H. Knot Logic and Topological Quantum Computing with Majorana Fermions. arXiv 2013,
arXiv:1301.6214.
28. Kauffman, L.H.; Lomonaco, S.J., Jr. q-deformed spin networks, knot polynomials and anyonic topological
quantum computation. J. Knot Theory Ramif. 2007, 16, 267–332.
29. Cheng, T.P.; Li, L.F. Gauge Theory of Elementary Particle Physics; Clarendon Press: Oxford, UK, 1988.
30. Majorana, E. A symmetric theory of electrons and positrons. Il Nuovo Cimento 1937, 14, 171–184.
31. Moore, G.; Read, N. Nonabelions in the fractional quantum Hall effect. Nucl. Phys. B 1991, 360, 362–396.
32. Mourik, V.; Zuo, K.; Frolov, S.M.; Plissard, S.R.; Bakkers, E.P.A.M.; Kouwenhoven, L.P. Signatures of Majorana
fermions in hybrid superconductor-semiconductor nanowire devices. Science 2012, 336, 1003–1007.
33. Ivanov, D.A. Non-abelian statistics of half-quantum vortices in p-wave superconductors. Phys. Rev. Lett. 2001,
86, 268, doi:10.1103/PhysRevLett.86.268.
34. Kauffman, L.H. Knots and Physics; World Scientific Pub., Co.: Singapore, 2012.
© 2017 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
entropy
Article
A No-Go Theorem for Observer-Independent Facts
Časlav Brukner 1,2
1 Vienna Center for Quantum Science and Technology (VCQ), Faculty of Physics, University of Vienna,
Boltzmanngasse 5, A-1090 Vienna, Austria; [email protected]
2 Institute of Quantum Optics and Quantum Information (IQOQI), Austrian Academy of Sciences,
Boltzmanngasse 3, A-1090 Vienna, Austria
Received: 5 April 2018; Accepted: 2 May 2018; Published: 8 May 2018
Abstract: In his famous thought experiment, Wigner assigns an entangled state to the composite
quantum system made up of Wigner’s friend and her observed system. While the two of them have
different accounts of the process, Wigner and his friend can each in principle verify his/her respective
state assignment by performing an appropriate measurement. As manifested through a click in a
detector or a specific position of the pointer, the outcomes of these measurements can be regarded as
reflecting directly observable “facts”. Reviewing arXiv:1507.05255, I will derive a no-go theorem for
observer-independent facts, which would be common both for Wigner and the friend. I will then
analyze this result in the context of a newly-derived theorem arXiv:1604.07422, where Frauchiger
and Renner prove that “single-world interpretations of quantum theory cannot be self-consistent”.
It is argued that “self-consistency” has the same implications as the assumption that observational
statements of different observers can be compared in a single (and hence an observer-independent)
theoretical framework. The latter, however, may not be possible, if the statements are to be understood
as relational in the sense that their determinacy is relative to an observer.
1. Introduction
One of the most debated situations concerning the quantum measurement problem is described
in the thought experiment of the so-called “Wigner’s friend”. The experiment involves a quantum
system and an observer (Wigner’s friend) who performs measurements on this system in a sealed
laboratory. A “super-observer” (Wigner) is placed outside the laboratory. While for the friend,
the measurement outcome is reflected in a property of the device recording it (e.g., in the form of a
click in a photo-detector or a certain position of a pointer device), Wigner can describe the process
unitarily on the basis of the information that is in principle available to him. At the end of the process,
the friend projects the state of the system corresponding to the observed outcome, whereas Wigner
assigns a specific entangled state to the system and the friend, which he can verify performing a further
experiment. When Wigner’s friend observes an outcome, does the state collapse for Wigner as well?
If not, how can we reconcile their different accounts of the process?
The thought experiment of Wigner’s friend has great conceptual value, as it challenges different
approaches to understanding quantum theory. In his original work [1], Wigner designed the experiment
to support his view that consciousness is necessary to complete the quantum measurement process.
According to the many-worlds interpretation [2], there are many copies of Wigner’s friend in different
“worlds”. Each copy observes one outcome, a different one in each world. According to the Copenhagen,
relational [3] and quantum Bayesian [4] interpretations, the state is defined only relative to the observer;
relative to the friend, the state is projected, while relative to Wigner, it is in a superposition. Either
way, supporters of any of these interpretations will arrive at the same predictions in Wigner’s verifying
experiment. In contrast, objective collapse theories [5–7] predict that the quantum state collapses when
a superposed system reaches a certain threshold of mass, size, complexity, etc., such that it becomes
impossible to even prepare the entangled state of Wigner’s friend and the system. Consequently,
Wigner’s state assignment can statistically be disproved repeating the verifying experiment.
The descriptions of “what is happening inside the lab” as given by Wigner and Wigner’s friend
respectively will differ. This difference need not pose a consistency problem for quantum theory,
for example if one takes the view that the theory gives the physical description relative to the observer
and her/his measuring apparatus in agreement with [3]. As long as the two observers do not exchange
the information about their outcomes, they will remain separated from each other, each holding a
different description of the systems with respect to their individual experimental arrangements. If they
do compare their predictions, they will agree. For example, should the friend communicate her result
to Wigner, this would collapse the state he assigns to the friend and the system. This suggests that
there should be no tension in accepting that, relative to their experimental arrangements, Wigner’s
friend in her measurement, as well as Wigner in his verifying measurement, each obtains a respective
measurement outcome. Since these outcomes are usually manifested as clicks in detectors or definite
positions of a pointer, they can be considered as directly accessible “facts”. Quite naturally, the question
arises: Can the facts as observed by Wigner and by Wigner’s friend be jointly considered as objective
properties of the world, in which case we might call them “facts of the world”? What we mean with
this question is whether there exists any theory, potentially different from quantum theory, where a
joint probability may be assigned for Wigner’s outcome and for that of his friend.
Reviewing the results of [8], I will derive a Bell-type no-go theorem for observer-independent
facts, showing that there can be no theory in which Wigner’s and Wigner’s friend’s facts can jointly
be considered as (local) objective properties. More precisely, I will show that the assumptions of
“locality”, “freedom of choice” and “universality of quantum theory” (the latter in the sense that
there are no constraints of the system to which the theory can be applied) are incompatible with
the assumption of observer-independent facts, i.e., under the assumptions one cannot define joint
probabilities for Wigner’s outcome and for that of Wigner’s friend. This might indicate that in quantum
theory, we can only define facts relative to an observation and an observer. I will then analyze the
relation of these results to the theorem developed by Frauchiger and Renner [9], which proves that
“single-world interpretations of quantum theory cannot be self-consistent”. In particular, I will argue
that the implications of their “self-consistency” requirement are equivalent to those of a theoretical
framework in which the truth values of the observational statements by Wigner and Wigner’s friend
can be jointly assigned and then whether they are consistent or not verified. However, in the view of
the no-go theorem, this in general need not be possible in a physical theory; the theory may operate
only with facts relative to the observer.
It should be emphasized that the no-go theorem applies to “facts” understood as “immediate
experiences of observers”; it may refer to what various interpretations of quantum mechanics assume to
be “real” (e.g., the wave function of the Universe, Bohmian trajectories, etc.) only to the extent to which
these “realities” give rise to directly observable facts in terms of detector clicks or pointer positions.
Entropy 2018, 20, 350
$$|\Phi\rangle_{SF} = \frac{1}{\sqrt{2}}\left(|z+\rangle_S |F_{z+}\rangle_F + |z-\rangle_S |F_{z-}\rangle_F\right), \qquad (1)$$
measurement interaction in control of Wigner (note that if Wigner did not know this phase due to
the lack of control of it, he would describe the “spin + friend’s laboratory” in an incoherent mixture
of the two possibilities). Wigner can verify his state assignment (1), for example by performing
a Bell state measurement in the basis: |Φ± SF = √1 (|z+S | Fz+ F ± |z−S | Fz− F ) and |Ψ± SF =
2
√1
2
(|z+S | Fz− F ± |z−S | Fz+ F ).
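How the Bell state measurement verifies the state assignment can be illustrated numerically. The sketch below treats the friend's records F_z+ and F_z− as two orthonormal states, so that the joint system is described in a four-dimensional space; this two-level simplification and the names used are assumptions of the example.

```python
import math

# Basis order: |z+>|F_z+>, |z+>|F_z->, |z->|F_z+>, |z->|F_z->.
s = 1 / math.sqrt(2)
bell = {
    "Phi+": [s, 0, 0, s],
    "Phi-": [s, 0, 0, -s],
    "Psi+": [0, s, s, 0],
    "Psi-": [0, s, -s, 0],
}


def overlap_sq(a, b):
    """Squared modulus of the inner product <a|b>."""
    inner = sum(complex(x).conjugate() * y for x, y in zip(a, b))
    return abs(inner) ** 2


# Wigner's assignment: the entangled state of Equation (1).
wigner_state = bell["Phi+"]
probs_coherent = {name: overlap_sq(vec, wigner_state) for name, vec in bell.items()}

# Comparison case: an equal incoherent mixture of the two branches,
# |z+>|F_z+> and |z->|F_z->, with no definite relative phase.
up_branch = [1, 0, 0, 0]
down_branch = [0, 0, 0, 1]
probs_mixture = {
    name: 0.5 * overlap_sq(vec, up_branch) + 0.5 * overlap_sq(vec, down_branch)
    for name, vec in bell.items()
}
```

For the state of Equation (1) the outcome Φ+ occurs with certainty, whereas the incoherent mixture gives Φ+ and Φ− with probability 1/2 each, so repeated runs of the measurement distinguish the two descriptions statistically.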
The fact that the friend and Wigner have different accounts of the friend’s measurement process is
at the heart of the discussion surrounding the Wigner-friend thought experiment. Still, the difference
need not give rise to any inconsistency in practicing quantum theory, since the two descriptions
belong to two different observers, who remain separated in making predictions for their respective
systems. The novelty of Deutsch’s proposal [10] lies in the possibility for Wigner to acquire direct
knowledge on whether the friend has observed a definite outcome upon her measurement or not
without revealing what outcome she has observed. The friend could open the laboratory in a manner
that allowed communication (e.g., a specific message written on a piece of paper) to be passed outside
to Wigner, keeping all other degrees of freedom fully isolated, as illustrated in Figure 1. Obviously,
it is of central importance that the message does not contain any information concerning the specific
observed outcome (which would destroy the coherence of state (1)), but merely an indication of the
kind: “I have observed a definite outcome” or “I have not observed a definite outcome”. If the message
is encoded in the state of system M, the overall state is:
$$|\Phi\rangle_{SFM} = \frac{1}{\sqrt{2}}\left(|z+\rangle_S |F_{z+}\rangle_F + |z-\rangle_S |F_{z-}\rangle_F\right) |\text{“I have observed a definite outcome”}\rangle_M, \qquad (2)$$
since the state of the message is factorized out from the total state (I leave the option for the message
“I have not observed a definite outcome” out, as it conflicts with our experience of the situation
that we refer to as measurement and it also can be used to violate the bound on quantum state
discrimination [8]).
If we assume the universality of quantum theory in the sense that it can be applied at any scale,
including the apparatus, the entire laboratory and even the observer’s memory, we conclude that
the message will indicate that the friend perceives a definite outcome and yet Wigner will confirm
his state assignment (1). This should be contrasted to the “collapse models” by Ghirardi, Rimini and
Weber [5] or by Diosi [6] and Penrose [7], which predict a breakdown of the quantum-mechanical laws
at some scale. In the presence of such a collapse, the prediction based on Wigner’s state assignment
will statistically deviate from the result obtained in the verification test.
Figure 1. Deutsch’s version of the Wigner-friend thought experiment. An observer (Wigner’s friend)
performs a Stern–Gerlach experiment on a spin 1/2 particle in a sealed laboratory. The outcome,
either “spin up” or “spin down”, is recorded in the friend’s laboratory, including her memory.
A super-observer (Wigner) describes the entire experiment as a unitary transformation resulting
in an encompassing entangled state between the system and the friend’s laboratory. The friend is
allowed to communicate a message, which only reports whether she sees a definite outcome or not,
without in any way revealing the actual outcome she observes.
Postulate 1. (“Observer-independent facts”) The truth values of the propositions Ai of all observers form a
Boolean algebra A. Moreover, the algebra is equipped with a (countably additive) positive measure p( A) ≥ 0 for
all statements A ∈ A, which is the probability for the statement to be true.
In the proof, we will only use the conjunction of propositions of different observers, which is a
weaker requirement. Furthermore, we use a countably additive measure since we are dealing with only
a countable (in fact only a finite) set of elements. In Boolean algebra, one can build the conjunction,
the disjunction and the negation of the statements. A typical example of a Boolean algebra is set theory.
The operations are identified with the set theoretic intersection, union and complement, respectively.
This is significant in the context of classical physics, where the propositions can be represented by
subsets of a phase space. In the present context, one can jointly assign truth values “true” or “false” to
statements A1 and A2 about observations made by Wigner’s friend and Wigner, respectively. Moreover,
one can build the conjunction A1 ∩ A2 and assign joint probability p( A1 = ±1, A2 = ±1), where A1
is observed by the friend and A2 by Wigner (and where truth value “true” corresponds to a value
of one and “false” to −1). Note that since observables corresponding to A1 and A2 do not commute
with each other, this amounts to introducing “hidden variables”, for which we now formulate a Bell-type
theorem [11].
Theorem 1. (No-go theorem for “observer-independent facts”) The following statements are incompatible
(i.e., lead to a contradiction)
1. “Universal validity of quantum theory”. Quantum predictions hold at any scale, even if the measured
system contains objects as large as an “observer” (including her laboratory, memory, etc.).
2. “Locality”. The choice of the measurement settings of one observer has no influence on the outcomes of the
other distant observer(s).
3. “Freedom of choice”. The choice of measurement settings is statistically independent from the rest of
the experiment.
4. “Observer-independent facts”. One can jointly assign truth values to the propositions about observed
outcomes (“facts”) of different observers (as specified in the postulate above).
Before going to the proof, I make two comments. Firstly, we use the word “universal” in assumption 1
in the sense of Peres [12]: “There is nothing in quantum theory making it applicable to three atoms
and inapplicable to $10^{23}$ ... Even if quantum theory is universal, it is not closed. A distinction must be
made between endophysical systems—those which are described by the theory—and exophysical ones,
which lie outside the domain of the theory (for example, the telescopes and photographic plates used
by astronomers for verifying the laws of celestial mechanics). While quantum theory can in principle
describe anything, a quantum description cannot include everything. In every physical situation
something must remain unanalyzed. This is not a flaw of quantum theory, but a logical necessity ...”.
Secondly, the theorem can be derived by replacing assumptions 2, 3 and 4 with a single assumption
of Bell’s “local causality”. The latter already implies the existence of (local) probabilities for “joint facts”
for Wigner and Wigner’s friend [13], which is the subject of the present no-go theorem. The reason for
working with the present choice of assumptions is that the relevance of the theorem for the propositions
different observers make about their respective outcome becomes apparent.
Proof. With reference to Figure 2, consider a pair of super-observers (Alice and Bob) who can carry out
experiments on two systems that include a laboratory for each system, in each of which an observer
(Charlie and Debbie, respectively) performs a measurement on a spin-1/2 particle. We consider
a Bell inequality test and assume that Alice chooses between two measurement settings A1 and
A2 , and similarly, Bob chooses between B1 and B2 . The settings A1 and A2 correspond to the
observational statements that Charlie and Alice, respectively, can make about their outcomes.
Similarly, the settings B1 and B2 correspond to observational statements of Debbie and Bob, respectively.
Assumptions (2), (3) and (4) together account for the existence of local hidden variables that predefine
the values for A1 , A2 , B1 and B2 to be +1 or −1. Moreover, the assumptions imply the existence
of the joint probability $p(A_1, A_2, B_1, B_2)$ whose marginals satisfy the Clauser–Horne–Shimony–Holt
inequality (CHSH): $S = \langle A_1 B_1\rangle + \langle A_1 B_2\rangle + \langle A_2 B_1\rangle - \langle A_2 B_2\rangle \le 2$. Here, for example, $\langle A_1 B_1\rangle = \sum_{A_1, B_1 = -1, 1} A_1 B_1\, p(A_1, B_1)$ and $p(A_1, B_1) = \sum_{A_2, B_2 = -1, 1} p(A_1, A_2, B_1, B_2)$ and similarly for other cases.
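The bound of 2 can be checked by brute force. The following is a minimal sketch, not part of the original proof: it enumerates every deterministic assignment of ±1 to A1, A2, B1 and B2 and evaluates the CHSH combination. Since any joint probability p(A1, A2, B1, B2) is a convex mixture of such assignments, the average S cannot exceed the largest value found. The function name `chsh` is an illustrative choice.

```python
from itertools import product

def chsh(a1, a2, b1, b2):
    """CHSH combination for one deterministic assignment of +/-1 values."""
    return a1 * b1 + a1 * b2 + a2 * b1 - a2 * b2

# All 16 deterministic assignments of predefined values (the "hidden variables").
values = [chsh(*v) for v in product((-1, 1), repeat=4)]

classical_max = max(values)
classical_min = min(values)
print(classical_max, classical_min)
```

Each assignment gives a1(b1 + b2) + a2(b1 − b2), and exactly one of the two brackets vanishes, which is why only the values ±2 occur.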
Suppose that Charlie and Debbie initially share an entangled state of two respective spin-1/2
particles S1 and S2 in a state:
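$|\psi\rangle_{S_1 S_2} = -\sin\frac{\theta}{2}\, |\phi^{+}\rangle_{S_1 S_2} + \cos\frac{\theta}{2}\, |\psi^{-}\rangle_{S_1 S_2}$, (3)
where $|\phi^{+}\rangle_{S_1 S_2} = \frac{1}{\sqrt{2}}\left(|z+\rangle_{S_1} |z+\rangle_{S_2} + |z-\rangle_{S_1} |z-\rangle_{S_2}\right)$ and $|\psi^{-}\rangle_{S_1 S_2} = \frac{1}{\sqrt{2}}\left(|z+\rangle_{S_1} |z-\rangle_{S_2} - |z-\rangle_{S_1} |z+\rangle_{S_2}\right)$, and the first spin is in possession of Charlie and the second of Debbie. The state can be
obtained by applying the rotation $\left(\mathbb{1} \otimes e^{-\frac{i}{2}\theta\sigma_y}\right)|\psi^{-}\rangle_{S_1 S_2}$ to the singlet state $|\psi^{-}\rangle_{S_1 S_2} = \frac{1}{\sqrt{2}}\left(|z+\rangle_{S_1} |z-\rangle_{S_2} - |z-\rangle_{S_1} |z+\rangle_{S_2}\right)$, where θ is the angle of rotation of Debbie’s spin around the y-axis and $\sigma_y$ is a Pauli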
matrix. This particular choice of the state enables all measured observables to be either of the Wigner’s
friend type, or of the Wigner type.
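The relation between the singlet state and state (3) can be verified numerically. The sketch below is illustrative only: it assumes the standard matrix representation of σy in the z basis and an arbitrary test angle, and the variable names are my own.

```python
import numpy as np

theta = 0.7  # arbitrary test angle (an assumption; any real value works)

zp = np.array([1.0, 0.0])  # |z+>
zm = np.array([0.0, 1.0])  # |z->

singlet = (np.kron(zp, zm) - np.kron(zm, zp)) / np.sqrt(2)   # |psi->
phi_plus = (np.kron(zp, zp) + np.kron(zm, zm)) / np.sqrt(2)  # |phi+>

sigma_y = np.array([[0, -1j], [1j, 0]])
# exp(-i*theta*sigma_y/2) = cos(theta/2)*1 - i*sin(theta/2)*sigma_y
rot = np.cos(theta / 2) * np.eye(2) - 1j * np.sin(theta / 2) * sigma_y

# Rotate only the second spin (Debbie's) and compare with Equation (3).
rotated = np.kron(np.eye(2), rot) @ singlet
target = -np.sin(theta / 2) * phi_plus + np.cos(theta / 2) * singlet
print(np.allclose(rotated, target))
```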
For Alice and Bob, the overall state of the spins together with Charlie’s and Debbie’s laboratories
is initially:
$|\Psi_0\rangle = |\psi\rangle_{S_1 S_2} |0\rangle_C |0\rangle_D$, (4)
in agreement with Assumption 1. The state $|0\rangle_C |0\rangle_D$ of the two observers does not require further
characterization, except for the description of observers capable of completing a measurement.
Now, Charlie and Debbie each perform a measurement of the respective spin along the z direction.
This measurement procedure is described as a unitary transformation from the point of view of
Alice and Bob. We assume that after Charlie and Debbie complete their measurement, the overall
state becomes:
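$|\tilde{\Psi}\rangle = -\sin\frac{\theta}{2}\, |\Phi^{+}\rangle + \cos\frac{\theta}{2}\, |\Psi^{-}\rangle$, (5)
where:
$|\Phi^{+}\rangle = \frac{1}{\sqrt{2}}\left(|A_{\mathrm{up}}\rangle |B_{\mathrm{up}}\rangle + |A_{\mathrm{down}}\rangle |B_{\mathrm{down}}\rangle\right)$, (6)
$|\Psi^{-}\rangle = \frac{1}{\sqrt{2}}\left(|A_{\mathrm{up}}\rangle |B_{\mathrm{down}}\rangle - |A_{\mathrm{down}}\rangle |B_{\mathrm{up}}\rangle\right)$ (7)
and $|A_{\mathrm{up}}\rangle$, $|A_{\mathrm{down}}\rangle$ ($|B_{\mathrm{up}}\rangle$, $|B_{\mathrm{down}}\rangle$) denote the states of Charlie’s (Debbie’s) spin together with the laboratory after the outcome “up” or “down” has been recorded.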
We now take θ = π/4 and define two sets of (binary) observables, which play the same role as
spin (Pauli) operators along the z and x axis, respectively: $A_z = |A_{\mathrm{up}}\rangle\langle A_{\mathrm{up}}| - |A_{\mathrm{down}}\rangle\langle A_{\mathrm{down}}|$ and
$A_x = |A_{\mathrm{up}}\rangle\langle A_{\mathrm{down}}| + |A_{\mathrm{down}}\rangle\langle A_{\mathrm{up}}|$ for Alice and similarly $B_z$ and $B_x$ for Bob. In the Bell experiment,
Alice chooses between $A_1 = A_z$ and $A_2 = A_x$, whereas Bob chooses between $B_1 = B_z$ and $B_2 = B_x$.
Note that Alice and Bob each choose between the friend’s ($A_1$ and $B_1$) and Wigner’s ($A_2$ and $B_2$) type
of measurement. The Bell test with these measurement settings and state (5) results in $S_Q = 2\sqrt{2}$.
The violation of the inequality implies that the conjunction of the assumptions (1–4) used to derive it
is untenable.
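The quantum value can be reproduced with a few lines of linear algebra. This is a hedged sketch rather than the author's calculation: the laboratory states |A_up>, |A_down> (and likewise for B) are modelled as an orthonormal pair of two-dimensional vectors, which is all the observables above depend on. The CHSH inequality holds for each of the four placements of the minus sign, and which placement is maximally violated depends on sign and labelling conventions, so the sketch scans all four and reports the largest absolute value.

```python
import numpy as np

theta = np.pi / 4
up = np.array([1.0, 0.0])    # stands for |A_up> (and |B_up>)
down = np.array([0.0, 1.0])  # stands for |A_down> (and |B_down>)

Phi_plus = (np.kron(up, up) + np.kron(down, down)) / np.sqrt(2)
Psi_minus = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)
state = -np.sin(theta / 2) * Phi_plus + np.cos(theta / 2) * Psi_minus  # state (5)

Z = np.outer(up, up) - np.outer(down, down)  # friend-type observable
X = np.outer(up, down) + np.outer(down, up)  # Wigner-type observable

def corr(a, b):
    """Expectation value <state| a (x) b |state> for real observables."""
    return float(state @ np.kron(a, b) @ state)

settings = ((1, Z), (2, X))
E = {(i, j): corr(a, b) for i, a in settings for j, b in settings}

# One CHSH form per placement of the single minus sign.
forms = [sum(-v if k == minus else v for k, v in E.items()) for minus in E]
S_Q = max(abs(s) for s in forms)
print(S_Q)
```

All four correlators have magnitude 1/√2 at θ = π/4, which is what allows one of the sign placements to reach the Tsirelson value.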
Figure 2. A Bell experiment on two entangled observers in a Wigner-friend scenario. The super-observers
Alice and Bob perform their respective measurements on laboratories containing the observers Charlie
and Debbie, who both perform a Stern–Gerlach measurement on their respective spin-1/2 particles.
We conclude that Wigner, even as he has clear evidence for the occurrence of a definite outcome
in the friend’s laboratory, cannot assume any specific value for the outcome to coexist together
with the directly observed value of his outcome, given that all other assumptions are respected.
Moreover, there is no theoretical framework where one can assign jointly the truth values to
observational propositions of different observers (they cannot build a single Boolean algebra) under
these assumptions. A possible consequence of the result is that there cannot be facts of the world
per se, but only relative to an observer, in agreement with Rovelli’s relative-state interpretation [3],
quantum Bayesianism (already in 1996, in the “Replies to Referee 4” of [14], Fuchs drew a distinction
between “facts for the agent” and “facts for everybody”) [4], as well as the (neo)-Copenhagen
interpretation [8]. It is interesting to note that a similar view was expressed by Jammer as early
as in 1974 [15], when he wrote that “the description of the state of a system, rather than being restricted
to the particle (or systems of particles) under observation, expresses a relation between the particle
and all the measurement devices involved.” Other possible interpretations of the violation of Bell’s
inequalities include violations of Assumption 1 in collapse models [5–7], of Assumption 2 in non-local
hidden variable models such as de Broglie–Bohm theory [16] or of Assumption 3 in superdeterministic
theories [17]. The proper account of the result in the many-worlds interpretation should be found in the
interpretation’s account of Bell’s inequality violation [18,19] and points again to observer-dependent
facts as they depend on the branch of the many worlds.
Sa Observer W assigns the truth value “true” to the statement: “A sees x = ok”;
Sb Observer A assigns the truth value “true” to the statement: “If x = ok, then F2 sees z = +”;
Sc Observer W assigns the truth value “true” to the statement: “A concludes that F2 sees
z = +”.
By repeating reasoning (10) in an iterative way, starting from statement S4 –S1 , one arrives at a
new statement:
It is important to note that this statement refers to W’s conclusion about what other observers
conclude when they apply T conditional on the outcomes they observe. It is not a statement about his
directly observed outcome.
In the second step, the self-consistency property (SC) is used to arrive at an implication of the
following type:
T =⇒ S. (11)
where the implied statement is:
which stands in logical contradiction with W’s directly observed outcome w = ok.
The second step is non-trivial. It enables promoting others’ knowledge based on their observations
to one’s own knowledge and then to put this “promoted knowledge” in logical comparison with one’s
own knowledge gained through direct observation. Through implication (11), the self-consistency
property (SC) enables observational statements of other observers (A, F2 and F1 ) to be logically
compared with one’s (W’s) own. This has the same predictive power as a theoretical framework in
which the truth values of statements of different observers can jointly be assigned and compared.
To see this, denote statements Si , i = 1, 2, 3 as implications S1 : (P =⇒ Q), S2 : (Q =⇒ R) and
S3 : (R =⇒ S), where P: “A sees x = ok”, Q: “F2 sees z = +”, R: “F1 sees r = t” and S: “W sees
w ≠ ok”. Then, “collapsing” others’ knowledge into W’s knowledge via Equation (11) is equivalent in
its implications to considering all the statements as belonging to a single Boolean algebra (i.e., they are
now all propositions of observer W, who can apply logical operations on them) for which one can
use the transitivity of implication to arrive at [P ∩ (P =⇒ Q) ∩ (Q =⇒ R) ∩ (R =⇒ S)] =⇒ S.
Statement S is again in logical contradiction to W’s directly observed outcome w = ok.
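That the chained implication is valid once all propositions live in one Boolean algebra can be confirmed by a truth table. The sketch below is purely illustrative; it treats P, Q, R and S as ordinary Boolean variables, which is exactly the step that the single-algebra assumption licenses.

```python
from itertools import product

def implies(p, q):
    """Material implication p => q."""
    return (not p) or q

# Check [P and (P=>Q) and (Q=>R) and (R=>S)] => S for all 16 truth assignments.
tautology = all(
    implies(p and implies(p, q) and implies(q, r) and implies(r, s), s)
    for p, q, r, s in product((False, True), repeat=4)
)
print(tautology)
```

The check says nothing about whether the four statements may be combined in this way; it only shows that, if they may, S follows.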
We have seen that the existence of a single Boolean algebra for truth values for observational
statements of different observers is incompatible with the assumptions of “locality”, “freedom of
choice” and the predictions of quantum theory, which does not impose any constraints on the objects
to which it is applied. This might be interpreted as an indication that the strong conclusions implied
by the theorem of [9] rely on a too restrictive requirement of property (SC) on a physical theory.
The requirement can fail not only in quantum theory, but in other physical theories as well.
An example was provided by Sudbery [23]: In the special theory of relativity, due to time dilation, every
inertial observer can claim that the clock of a moving partner ticks slower than her/his own. This apparent
contradiction in predictions of different observers is resolved when one realizes that the statements only
have meaning with respect to the specific, observer-dependent measurement procedures that define
“simultaneity”. Similarly, the states referring to outcomes of different observers in a Wigner-friend
type of experiment cannot be defined without referring to the specific experimental arrangements
of the observers, in agreement with Bohr’s idea of contextuality as formulated by him in 1963 [26]:
“the unambiguous account of proper quantum phenomena must, in principle, include a description of
all relevant features of experimental arrangement.”
I conclude with a remark that the theorem by Frauchiger and Renner has deep conceptual value,
as it points to the necessity to differentiate between one’s knowledge about direct observations and
one’s knowledge about others’ knowledge that is compatible with physical theories. It is likely that
understanding this difference will be an important ingredient in further development of the method of
Bayesian inference in situations as in the Wigner-friend experiment.
Funding: I acknowledge the support of the Austrian Science Fund (FWF) through the project I-2526-N27.
This research was funded by the John Templeton Foundation, grant number 60609. The opinions expressed in this
publication are those of the author and do not necessarily reflect the views of the John Templeton Foundation.
Acknowledgments: I acknowledge helpful discussions with Mateus Araújo, Veronika Baumann, Adán Cabello,
Giulio Chiribella, Christopher Fuchs, Borivoje Dakić, Philipp Höhn, Nikola Paunković, Lídia del Rio, Rüdiger
Schack and Stefan Wolf. I would like to especially acknowledge the fruitful discussions with Renato Renner and
thank him for providing notes summarizing that discussion.
Conflicts of Interest: The author declares no conflict of interest.
Appendix A
The Bell theorem from the main text can be extended to a Greenberger–Horne–Zeilinger (GHZ)
version [27] with three friends and three Wigners. Since the incompatibility of Assumptions 1–4 is not of a
probabilistic, but rather of a deterministic nature, this version of the theorem completely bypasses any use
of the notion of probability, similarly to the version by Frauchiger and Renner [9]. The experiment was
independently introduced in [28], where it was argued that it suggests a violation of Lorentz symmetry.
Consider three spatially-separated observers (Wigners), Alice, Bob and Cleve. They each perform a
measurement on a subsystem of a tripartite system. Each of the subsystems includes a further observer,
Debbie, Eric and Fiona (Wigner’s friends), who perform a Stern–Gerlach measurement of spin along x of
their respective spin-1/2 particles. Alice measures Debbie and her spin particle; Bob measures Eric and his
spin particle; and finally, Cleve measures Fiona and her spin particle. We consider a GHZ test where Alice
chooses between two measurement settings: A1 and A2 , Bob between B1 and B2 and Cleve between C1
and C2 . Assumptions 2, 3 and 4 imply that A1 , A2 , B1 , B2 , C1 and C2 have predefined values of +1 or −1.
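Define $\hat{A}_x = |A_{\mathrm{up}}\rangle\langle A_{\mathrm{up}}| - |A_{\mathrm{down}}\rangle\langle A_{\mathrm{down}}|$ and $\hat{A}_y = i\left(|A_{\mathrm{up}}\rangle\langle A_{\mathrm{down}}| - |A_{\mathrm{down}}\rangle\langle A_{\mathrm{up}}|\right)$ for Alice
and similarly $\hat{B}_x$ and $\hat{B}_y$ for Bob and $\hat{C}_x$ and $\hat{C}_y$ for Cleve, where $|A_{\mathrm{up}}\rangle$ and $|A_{\mathrm{down}}\rangle$ denote the states of Debbie together with her spin particle after she has recorded the outcome “up” or “down”, respectively, and similarly for Bob’s and Cleve’s subsystems.
In the GHZ test, we choose $\hat{A}_1 = \hat{A}_x$, $\hat{A}_2 = \hat{A}_y$ for Alice and similarly for Bob and Cleve.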
Assume that Alice, Bob and Cleve perform these measurements on a shared GHZ state:
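$|\Psi_{GHZ}\rangle_{ABC} = \frac{1}{\sqrt{2}}\left(|A+\rangle |B+\rangle |C+\rangle - |A-\rangle |B-\rangle |C-\rangle\right)$, (A3)
where due to Assumption 1, we presume that such a state can be prepared and $|A\pm\rangle = \frac{1}{\sqrt{2}}\left(|A_{\mathrm{up}}\rangle \pm |A_{\mathrm{down}}\rangle\right)$, $|B\pm\rangle = \frac{1}{\sqrt{2}}\left(|B_{\mathrm{up}}\rangle \pm |B_{\mathrm{down}}\rangle\right)$ and $|C\pm\rangle = \frac{1}{\sqrt{2}}\left(|C_{\mathrm{up}}\rangle \pm |C_{\mathrm{down}}\rangle\right)$.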
In order to reproduce perfect correlations in the GHZ state, the predefined values need to satisfy
$A_x B_y C_y = A_y B_x C_y = A_y B_y C_x = 1$. Multiplying these three equations and using $A_y^2 = B_y^2 = C_y^2 = 1$ then implies that $A_x B_x C_x = 1$; however, one finds
the opposite result in quantum mechanics: $\hat{A}_x \hat{B}_x \hat{C}_x |\Psi_{GHZ}\rangle_{ABC} = -|\Psi_{GHZ}\rangle_{ABC}$.
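The four eigenvalue relations behind this argument can be checked directly. The following sketch is an illustration under the assumption that each laboratory is represented by the two-dimensional span of its “up” and “down” states; the helper names `kron3` and `eigenvalue` are my own.

```python
import numpy as np

up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)
plus, minus = (up + down) / np.sqrt(2), (up - down) / np.sqrt(2)

def kron3(a, b, c):
    """Tensor product of three factors (Alice, Bob, Cleve)."""
    return np.kron(np.kron(a, b), c)

# GHZ state (A3).
ghz = (kron3(plus, plus, plus) - kron3(minus, minus, minus)) / np.sqrt(2)

# Single-party observables as defined in the text.
X = np.outer(up, up.conj()) - np.outer(down, down.conj())
Y = 1j * (np.outer(up, down.conj()) - np.outer(down, up.conj()))

def eigenvalue(op):
    """Return lam with op|ghz> = lam|ghz>, checking that ghz is an eigenvector."""
    out = op @ ghz
    lam = np.vdot(ghz, out)
    assert np.allclose(out, lam * ghz)
    return lam.real

xyy = eigenvalue(kron3(X, Y, Y))
yxy = eigenvalue(kron3(Y, X, Y))
yyx = eigenvalue(kron3(Y, Y, X))
xxx = eigenvalue(kron3(X, X, X))
print(xyy, yxy, yyx, xxx)
```

Predefined values would force the product of the first three results to equal the fourth, which the last number contradicts.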
References
1. Wigner, E.P. Remarks on the mind-body question. In The Scientist Speculates; Good, I.J., Ed.; Heinemann:
London, UK, 1961.
2. Everett, H. “Relative State” Formulation of Quantum Mechanics. Rev. Mod. Phys. 1957, 29, 454–462. [CrossRef]
3. Rovelli, C. Relational quantum mechanics. Int. J. Theor. Phys. 1996, 35, 1637–1678. [CrossRef]
4. Fuchs, C.A. Notwithstanding Bohr, the Reasons for QBism. Mind Matter 2017, 15, 245–300.
5. Ghirardi, G.C.; Rimini, A.; Weber, T. Unified dynamics for microscopic and macroscopic systems.
Phys. Rev. D 1986, 34, 470. [CrossRef]
6. Diosi, L. Models for universal reduction of macroscopic quantum fluctuations. Phys. Rev. A 1989, 40, 1165.
[CrossRef]
7. Penrose, R. On gravity’s role in quantum state reduction. Gen. Relat. Gravit. 1996, 28, 581–600. [CrossRef]
8. Brukner, Č. On the quantum measurement problem. In Quantum [Un]speakables II; Bertlmann, R.,
Zeilinger, A., Eds.; The Frontiers Collection; Springer: New York, NY, USA, 2016. [CrossRef]
9. Frauchiger, D.; Renner, R. Single-world interpretations of quantum theory cannot be self-consistent. arXiv
2016, arXiv:1604.07422. [CrossRef]
10. Deutsch, D. Quantum theory as a universal physical theory. Int. J. Theor. Phys. 1985, 24, 1–41. [CrossRef]
11. Bell, J.S. Speakable and Unspeakable in Quantum Mechanics; Collected Papers on Quantum Philosophy;
Cambridge University Press: Cambridge, MA, USA, 2004. [CrossRef]
12. Peres, A. Quantum Theory: Concepts and Methods; Springer: New York, NY, USA, 1995; p. 173. [CrossRef]
13. Zukowski, M.; Brukner, Č. Quantum non-locality—It ain’t necessarily so ... J. Phys. A Math. Theor. 2014,
47, 424009. [CrossRef]
14. Fuchs, C.A.; Schlosshauer, M.; Stacey, B.C. My Struggles with the Block Universe. arXiv 2015, arXiv:1405.2390.
[CrossRef]
15. Jammer, M. The Philosophy of Quantum Mechanics: The Interpretations of QM in Historical Perspective; John Wiley
and Sons: Hoboken, NJ, USA, 1974; pp. 197–198.
16. Bohm, D. A Suggested Interpretation of the Quantum Theory in Terms of “Hidden” Variables, I and II.
Phys. Rev. 1952, 85, 166–193. [CrossRef]
17. Hooft, G ’t. Free Will in the Theory of Everything. arXiv 2017, arXiv:1709.02874. [CrossRef]
18. Brown, H.R.; Timpson, C.G. Bell on Bell’s theorem: The changing face of nonlocality. In Quantum Nonlocality
and Reality: 50 Years of Bell’s Theorem; Bell, M., Gao, S., Eds.; Cambridge University Press: Cambridge,
MA, USA, 2016.
19. Araújo, M. Understanding Bell’s Theorem Part 3: The Many-Worlds Version. Blog: More Quantum.
Available online: http://mateusaraujo.info/2016/08/02/understanding-bells-theorem-part-3-the-many-worlds-version/
(accessed on 2 August 2016).
20. Hardy, L. Quantum mechanics, local realistic theories, and Lorentz-invariant realistic theories. Phys. Rev. Lett.
1992, 68, 2981. [CrossRef] [PubMed]
21. Hardy, L. Nonlocality for two particles without inequalities for almost all entangled states. Phys. Rev. Lett.
1993, 71, 1665. [CrossRef] [PubMed]
22. Baumann, V.; Hansen, A.; Wolf, S. The measurement problem is the measurement problem is the
measurement problem. arXiv 2016, arXiv:1611.01111. [CrossRef]
23. Sudbery, A. Single-World Theory of the Extended Wigner’s Friend Experiment. Found. Phys. 2017, 47,
658–669. [CrossRef]
24. Bub, J. Why Bohr was (Mostly) Right. arXiv 2017, arXiv:1711.01604. [CrossRef]
25. Brukner, Č. (University of Vienna, Austria; Austrian Academy of Sciences, Austria); Renner, R. (Institute for
Theoretical Physics, ETH Zürich, Switzerland). Personal communication, 2017.
26. Bohr, N. Quantum Physics and Philosophy: Causality and Complementarity. In Philosophy in Mid-Century:
A Survey; Klibansky, R., Ed.; La Nuova Italia Editrice: Florence, Italy, 1963.
27. Greenberger, D.M.; Horne, M.A.; Shimony, A.; Zeilinger, A. Going beyond Bell’s Theorem. Am. J. Phys. 1990,
58, 1131–1143. [CrossRef]
28. Leegwater, G. When GHZ Meet Wigner’s Friend. Erasmus University Rotterdam, Rotterdam, The Netherlands.
Unpublished manuscript, 2017.
© 2018 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Article
A Royal Road to Quantum Theory (or Thereabouts)
Alexander Wilce
Department of Mathematics, Susquehanna University, Selinsgrove, PA 17870, USA; [email protected]
Received: 15 January 2018; Accepted: 19 March 2018; Published: 26 March 2018
Abstract: This paper fails to derive quantum mechanics from a few simple postulates. However,
it gets very close, and does so without much exertion. More precisely, I obtain a representation
of finite-dimensional probabilistic systems in terms of Euclidean Jordan algebras, in a strikingly
easy way, from simple assumptions. This provides a framework within which real, complex and
quaternionic QM can play happily together and allows some (but not too much) room for more
exotic alternatives. (This is a leisurely summary, based on recent lectures, of material from the papers
arXiv:1206.2897 and arXiv:1507.06278, the latter joint work with Howard Barnum and Matthew
Graydon. Some further ideas are also explored, developing the connection between conjugate
systems and the possibility of forming stable measurement records and making connections between
this approach and the categorical approach to quantum theory.)
Problems with existing approaches. These recent reconstructive efforts suffer from two related
problems. First, they make use of assumptions that seem too strong. Secondly, in trying to derive
exactly complex, finite-dimensional quantum theory, they derive too much.
• All of the cited papers assume local tomography. This is the doctrine that the state of a bipartite
composite system is entirely determined by the joint probabilities it assigns to outcomes of
measurements on the two subsystems. This rules out both real and quaternionic QM, both of
which are legitimate quantum theories [11].
• These papers also all make some version of a uniformity assumption: that all systems having the
same information-carrying capacity are isomorphic, or that all systems are composed, in a uniform
way, from “bits” of a uniform type. Here, “information carrying capacity” means essentially
the maximum number of states that can be distinguished from one another with probability
one by a single measurement. A bit is a system for which this number is two. This rules out
systems involving superselection rules, i.e., those that admit both real and classical degrees of
freedom (for example, the quantum system corresponding to M2 (C) ⊕ M2 (C), corresponding to a
classical choice between one of two qubits, has the same information-carrying capacity as a single,
four-level quantum system). More seriously, it rules out any theory that includes, e.g., real and
complex, or real and quaternionic systems, as the state spaces of the bits of these theories have
different dimensions. As I will discuss below, one can indeed construct mathematically-reasonable
theories that embrace finite-dimensional quantum systems of all three types.
• Another shortcoming, not related to the exclusion of real and quaternionic QM, is the technical
assumption (explicit in [10] for bits) that all positive affine functionals on the state space
taking values between zero and one correspond to physically-accessible “effects”, i.e., possible
measurement results. From an operational point of view, this principle (called the “no-restriction
hypothesis” in [12]) seems to call for further motivation.
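The first two points can be made concrete by counting dimensions. The sketch below relies on two standard facts that are not spelled out in the text and should be read as assumptions of the illustration: the self-adjoint n × n matrices over a division algebra of real dimension d form a real vector space of dimension n + d·n(n−1)/2, and local tomography requires the dimension of a composite to equal the product of the dimensions of its parts. The function name `dim_sa` is hypothetical.

```python
def dim_sa(n, d):
    """Real dimension of the self-adjoint n x n matrices over a division
    algebra of real dimension d (d = 1 real, 2 complex, 4 quaternionic)."""
    return n + d * n * (n - 1) // 2

# Two bits (n = 2) composed into a four-level system (n = 4).
for name, d in (("real", 1), ("complex", 2)):
    single, pair = dim_sa(2, d), dim_sa(4, d)
    print(name, single, pair, "locally tomographic:", single * single == pair)

# State spaces of bits: one dimension less than the ambient space (normalization).
bit_state_dims = {name: dim_sa(2, d) - 1
                  for name, d in (("real", 1), ("complex", 2), ("quaternionic", 4))}
print(bit_state_dims)
```

Two real bits span only 9 of the 10 dimensions of the composite, so joint probabilities of local measurements cannot fix the state, and the bits of the three theories are balls of dimension 2, 3 and 5.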
Another approach. In these notes, I am going to describe an alternative approach that avoids these
difficulties. This begins by associating with every physical system a convex set of states and a
distinguished set of basic measurements (or experiments) that can be made on the system. We then
isolate two striking features shared by classical and quantum probabilistic systems. The first is the
possibility of finding a joint state that perfectly correlates a system $A$ with an isomorphic system $\overline{A}$
(call it a conjugate system) in the sense that every basic measurement on $A$ is perfectly correlated
with the corresponding measurement on $\overline{A}$. In finite-dimensional QM, where $A$ is represented by
a finite-dimensional Hilbert space $\mathcal{H}$, $\overline{A}$ corresponds to the conjugate Hilbert space $\overline{\mathcal{H}}$, and the
perfectly-correlating state is the maximally-entangled “EPR” state on $\mathcal{H} \otimes \overline{\mathcal{H}}$.
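In the quantum case, the perfect correlation can be seen numerically. The sketch below is an illustration, not taken from the paper: it assumes that the conjugate space is modelled by componentwise complex conjugation in a fixed basis, and it uses a randomly generated orthonormal basis as the “basic measurement”.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3

# Random orthonormal basis {x_i}: the columns of a unitary from a QR decomposition.
m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
basis, _ = np.linalg.qr(m)

# Maximally entangled vector (1/sqrt(n)) * sum_k e_k (x) e_k.
epr = np.eye(n).reshape(n * n) / np.sqrt(n)

# Joint probability of outcome x_i on A and the conjugate outcome conj(x_j)
# on the conjugate system.
probs = np.array([
    [abs(np.vdot(np.kron(basis[:, i], basis[:, j].conj()), epr)) ** 2
     for j in range(n)]
    for i in range(n)
])
print(np.round(probs, 6))
```

The table of joint probabilities is diagonal with uniform entries 1/n, whichever basis is drawn: matching outcomes always occur together.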
The second feature is the existence of what I call filters associated with each basic measurement.
These are processes that independently attenuate the “response” of each outcome of the measurement
by some specified factor. Such a process will generally not preserve the normalization of states, but up
to a constant factor, in both classical and quantum theory, one can prepare any desired state by applying
a suitable filter to the maximally-mixed state. Moreover, when the target state is not singular (that is,
when it does not assign probability zero to any nonzero measurement outcome), one can reverse the
filtering process, in the sense that it can be undone by another process with positive probability.
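A quantum filter and its probabilistic reversal can be sketched as follows. This is a minimal illustration under the assumption that a filter for the basis {x_i} with attenuation factors t_i acts as ρ ↦ gρg† with g = Σ √t_i |x_i><x_i|; the names `filter_op`, `t` and `g_inv` are mine.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
basis, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))

def filter_op(t):
    """g = sum_i sqrt(t_i) |x_i><x_i|; the filter acts as rho -> g rho g^dagger."""
    return (basis * np.sqrt(t)) @ basis.conj().T

t = np.array([0.5, 0.3, 0.2])            # attenuation factors = target spectrum
g = filter_op(t)

mixed = np.eye(n) / n                    # maximally mixed state
out = g @ mixed @ g.conj().T             # unnormalized output of the filter
prepared = out / np.trace(out).real      # renormalize
target = (basis * t) @ basis.conj().T    # state with spectrum t in the basis x_i

# Reversal: a second filter with factors min(t)/t_i, all in (0, 1] when t_i > 0.
g_inv = filter_op(t.min() / t)
undone = g_inv @ out @ g_inv.conj().T
print(np.allclose(prepared, target), np.allclose(undone, t.min() * mixed))
```

The reversal returns the maximally mixed state scaled by min(t), so it succeeds with positive probability only if no t_i vanishes, matching the non-singularity condition above.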
The upshot is that all probabilistic systems having conjugates and a sufficiently lavish supply
of (probabilistically) reversible filters can be represented by formally real Jordan algebras, a class
of structures that includes real, complex and quaternionic quantum systems, and just two further
well-studied additional possibilities, which I will review below.
In addition to leaving room for real and quaternionic quantum mechanics (which I take to be
a virtue), this approach has another advantage: it is much easier! The assumptions involved are
few and easily stated, and the proof of the main technical result (Lemma 1 in Section 4) is short
and straightforward. By contrast, the mathematical developments in the papers listed above are
significantly more difficult and ultimately lean on the (even more difficult) classification of compact
groups acting on spheres. My approach, too, leans on a received result, but one that is relatively
accessible. This is the Koecher–Vinberg theorem, which characterizes formally real, or Euclidean,
Jordan algebras in terms of ordered real vector spaces with homogeneous, self-dual cones. A short and
non-taxing proof of this classical result can be found in [13].
These ideas were developed in [14–16] and especially [17], of which this paper is, to an extent,
a summary. However, the presentation here is slightly different, and some additional ideas are
also explored. In particular, I have spelled out in more detail the connection between conjugate
systems and measurement records, only alluded to in the earlier paper. I also link this approach to the
categorical approach to quantum theory due to Abramsky, Coecke and others [18], along the way
briefly discussing recent work with Howard Barnum and Matthew Graydon [19] on the construction
of probabilistic theories in which real, complex and quaternionic quantum systems coexist. Finally,
Appendix B presents a uniqueness result for spectral decompositions of states, which may find
further application.
A bit of background. At this point, I had better pause to explain some terms. A Jordan algebra is a real
commutative algebra (a real vector space $E$ with a commutative bilinear multiplication $a, b \mapsto a \cdot b$)
having a multiplicative unit $u$ and satisfying the Jordan identity: $a^2 \cdot (a \cdot b) = a \cdot (a^2 \cdot b)$, for all
$a, b \in E$, where $a^2 = a \cdot a$. A Jordan algebra is formally real if sums of squares of nonzero elements are
always nonzero. The basic, and motivating, example is the space $L_{sa}(\mathcal{H})$ of self-adjoint operators on a
complex Hilbert space, with the Jordan product given by $a \cdot b = \frac{1}{2}(ab + ba)$. Note that here, $a \cdot a = aa$,
so the notation $a^2$ is unambiguous. To see that $L_{sa}(\mathcal{H})$ is formally real, just note that $a^2$ is always a
positive operator.
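These properties are easy to test on random Hermitian matrices. The sketch below is only a numerical spot check, with hypothetical helper names `random_hermitian` and `jordan`; it also shows that the Jordan product is generally not associative, which is why the Jordan identity is a nontrivial requirement.

```python
import numpy as np

rng = np.random.default_rng(2)

def random_hermitian(n):
    """A random self-adjoint n x n complex matrix."""
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (m + m.conj().T) / 2

def jordan(a, b):
    """Jordan product a . b = (ab + ba)/2."""
    return (a @ b + b @ a) / 2

a, b, c = (random_hermitian(3) for _ in range(3))
a2 = jordan(a, a)

commutative = np.allclose(jordan(a, b), jordan(b, a))
self_adjoint = np.allclose(jordan(a, b), jordan(a, b).conj().T)
jordan_identity = np.allclose(jordan(a2, jordan(a, b)), jordan(a, jordan(a2, b)))
associative = np.allclose(jordan(jordan(a, b), c), jordan(a, jordan(b, c)))
print(commutative, self_adjoint, jordan_identity, associative)
```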
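If $\mathcal{H}$ is finite dimensional, $L_{sa}(\mathcal{H})$ carries a natural inner product, namely $\langle a, b\rangle = \mathrm{Tr}(ab)$.
This plays well with the Jordan product: $\langle a \cdot b, c\rangle = \langle b, a \cdot c\rangle$ for all $a, b, c \in L_{sa}(\mathcal{H})$. More generally,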
a finite-dimensional Jordan algebra equipped with an inner product having this property is said to
be Euclidean. For finite-dimensional Jordan algebras, being formally real and being Euclidean are
equivalent [13]. In what follows, I will abbreviate “Euclidean Jordan algebra” to EJA.
Jordan algebras were originally proposed, with what now looks like slightly thin motivation,
by P. Jordan [20]: if a and b are quantum-mechanical observables, represented by a, b ∈ Lsa (H),
then while $a + b$ is again self-adjoint, $ab$ and $ba$ are not, unless $a$ and $b$ commute; however, their
average, $a \cdot b$, is self-adjoint and, thus, represents another observable. Almost immediately, Jordan,
von Neumann and Wigner showed [21] that all formally real Jordan algebras are direct sums of simple
such algebras, with the latter falling into just five classes, parametrized by positive integers n: the
self-adjoint parts, Mn (F)sa , of matrix algebras Mn (F), where F = R, C or H (the quaternions) or, for
n = 3, over O (the octonions); and also what are called spin factors Vn (closely related to Clifford
algebras). There is some overlap: $V_2 \simeq M_2(\mathbb{R})_{sa}$, $V_3 \simeq M_2(\mathbb{C})_{sa}$ and $V_5 \simeq M_2(\mathbb{H})_{sa}$. In all but one case, one
can show that a simple Jordan algebra is a Jordan subalgebra of Mn (C) for suitable n. The exceptional
Jordan algebra, M3 (O)sa , admits no such representation.
Besides this classification theorem, there is only one other important fact about Euclidean Jordan
algebras that is needed for what follows. This is the Koecher–Vinberg (KV) theorem alluded to above.
Recall that an ordered vector space is a real vector space, call it E, spanned by a distinguished convex
cone E+ having its vertex at the origin. Such a cone induces a translation-invariant partial order on
E, namely a ≤ b iff b − a ∈ E+ . As an example, the space Lsa (H) is ordered by the cone of positive
operators. More generally, any EJA $E$ is an ordered vector space, with positive cone $E_+ := \{a^2 \mid a \in E\}$.
This cone has two special features: first, it is homogeneous, i.e., for any points a, b in the interior of E+ ,
there exists an automorphism of the cone (a linear isomorphism E → E, taking E+ onto itself) that
maps a to b. In other words, the group of automorphisms of the cone acts transitively on the cone’s
interior. The other special property is that E+ is self-dual. This means that E carries an inner product
(in fact, the given one making $E$ Euclidean) such that $a \in E_+$ iff $\langle a, b\rangle \ge 0$ for all $b \in E_+$.
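For the cone of positive operators, both properties can be illustrated numerically. The sketch below is a spot check under stated assumptions rather than a proof: homogeneity is exhibited through the map x ↦ g x g† with g = b^{1/2} a^{-1/2}, and self-duality is probed with the trace inner product. The helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 3

def random_pd(n):
    """A random positive-definite matrix (an interior point of the cone)."""
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return m @ m.conj().T + 0.1 * np.eye(n)

def sqrtm_psd(p):
    """Square root of a positive semidefinite matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(p)
    return (v * np.sqrt(w)) @ v.conj().T

a, b = random_pd(n), random_pd(n)

# Homogeneity: x -> g x g^dagger is linear, invertible, preserves positivity,
# and carries the interior point a to the interior point b.
g = sqrtm_psd(b) @ np.linalg.inv(sqrtm_psd(a))
maps_a_to_b = np.allclose(g @ a @ g.conj().T, b)

# Self-duality, one direction: <p, q> = Tr(pq) >= 0 for positive p and q.
inner_nonneg = np.trace(a @ b).real >= 0

# Other direction: a non-positive h is detected by a positive element,
# here the projector onto an eigenvector with negative eigenvalue.
h = np.diag([1.0, -0.5, 2.0]).astype(complex)
w, v = np.linalg.eigh(h)
x = v[:, 0]                          # eigenvector of the smallest eigenvalue
witness = np.vdot(x, h @ x).real     # equals Tr(h |x><x|)
print(maps_a_to_b, inner_nonneg, witness)
```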
An order unit in an ordered vector space $E$ is an element $u \in E_+$ such that, for all $a \in E$, there exists
some $n \in \mathbb{N}$ with $a \le nu$. In finite dimensions, this is equivalent to $u$’s belonging to the interior of
the cone $E_+$ [22]. In the following, by a Euclidean order unit space, I mean an ordered vector space $E$
equipped with an inner product $\langle \cdot, \cdot \rangle$ with $\langle a, b \rangle \ge 0$ for all $a, b \in E_+$, and a distinguished order unit
$u$. I will say that such a space $E$ is HSD iff $E_+$ is homogeneous, and also self-dual with respect to the
given inner product.
Theorem 1 (Koecher 1958; Vinberg 1961). Let $E$ be a finite-dimensional Euclidean order-unit space. If $E$ is
HSD, then there exists a unique product $\cdot$ with respect to which $E$ (with its given inner product) is a Euclidean
Jordan algebra, $u$ is the Jordan unit, and $E_+$ is the cone of squares.
Some notational conventions. My notation is mostly consistent with the following conventions
(more standard in the mathematics than the physics literature, but in places slightly eccentric relative
to either). Capital Roman letters $A, B, C$ serve as labels for systems. $M_n(\mathbb{F})$ stands for the set of $n \times n$
matrices over $\mathbb{F} = \mathbb{R}$, $\mathbb{C}$ or $\mathbb{H}$; $M_n(\mathbb{F})_{sa}$ is the set of self-adjoint such matrices. Vectors in a Hilbert space $\mathcal{H}$
are denoted by little Roman letters $x, y, z$ from the end of the alphabet. Operators on $\mathcal{H}$ will usually be
denoted by little Roman letters $a, b, c, \ldots$ from the beginning of the alphabet. Roman letters $t, s$ typically
stand for real numbers. The space of all linear operators on $\mathcal{H}$ is denoted $L(\mathcal{H})$; as already indicated
above, $L_{sa}(\mathcal{H})$ is the (real) vector space of self-adjoint operators on $\mathcal{H}$.
As above, the conjugate Hilbert space is denoted $\bar{\mathcal{H}}$. I will write $\bar{x}$ for the vector in $\bar{\mathcal{H}}$
corresponding to $x \in \mathcal{H}$. From a certain point of view, this is the same vector; the bar serves to
remind us that $\overline{cx} = \bar{c}\,\bar{x}$ for scalars $c \in \mathbb{C}$. Alternatively, one can regard $\bar{\mathcal{H}}$ as the space of “bra” vectors
$\langle x|$ corresponding to the “kets” $|x\rangle$ in $\mathcal{H}$, i.e., as the dual space of $\mathcal{H}$.
The inner product of $x, y \in \mathcal{H}$ is written as $\langle x, y \rangle$ and is linear in the first argument (if you
like: $\langle x, y \rangle = \langle y | x \rangle$ in Dirac notation). The inner product on $\bar{\mathcal{H}}$ is then $\langle \bar{x}, \bar{y} \rangle = \langle y, x \rangle$. The rank-one
projection operator associated with a unit vector $x \in \mathcal{H}$ is $p_x$. Thus, $p_x(y) = \langle y, x \rangle x$. I denote
functionals on $L_{sa}(\mathcal{H})$ by little Greek letters, e.g., $\alpha, \beta, \ldots$, and operators on $L_{sa}(\mathcal{H})$ by capital Greek
letters, e.g., $\Phi$. Two exceptions to this scheme: a generic density operator on $\mathcal{H}$ is denoted by the
capital Roman letter $W$, and a certain special unit vector in $\mathcal{H} \otimes \bar{\mathcal{H}}$ is denoted by the capital Greek
letter $\Psi$. With luck, context will help keep things straight.
observing effect $a$ in state $W$ as $\mathrm{Tr}(Wa)$. If $W$ is a pure state, i.e., $W = p_v$ where $v$ is a unit vector in $\mathcal{H}$,
then $\mathrm{Tr}(Wa) = \langle av, v \rangle$; by the same token, if $a = p_x$, then $\mathrm{Tr}(Wa) = \langle Wx, x \rangle$.
For $a, b \in L_{sa}(\mathcal{H})$, let $\langle a, b \rangle := \mathrm{Tr}(ab)$. This is an inner product. By the spectral theorem,
$\mathrm{Tr}(ab) \ge 0$ for all $b \in L_h(\mathcal{H})_+$ iff $\mathrm{Tr}(a p_x) \ge 0$ for all unit vectors $x$. However, $\mathrm{Tr}(a p_x) = \langle ax, x \rangle$.
So $\mathrm{Tr}(ab) \ge 0$ for all $b \in L_h(\mathcal{H})_+$ iff $a \in L_h(\mathcal{H})_+$, i.e., the trace inner product is self-dualizing.
However, this now leaves us with the following:
Question: What does the trace inner product represent, operationally or probabilistically?
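Let $\bar{\mathcal{H}}$ be the conjugate Hilbert space to $\mathcal{H}$. Suppose $\mathcal{H}$ has dimension $n$. Any unit vector $\Psi$ in
$\mathcal{H} \otimes \bar{\mathcal{H}}$ gives rise to a joint probability assignment to effects $a$ on $\mathcal{H}$ and $\bar{b}$ on $\bar{\mathcal{H}}$, namely $\langle (a \otimes \bar{b})\Psi, \Psi \rangle$.
Consider the EPR state for $\mathcal{H} \otimes \bar{\mathcal{H}}$ defined by the unit vector:
$$\Psi = \frac{1}{\sqrt{n}} \sum_{x \in E} x \otimes \bar{x} \in \mathcal{H} \otimes \bar{\mathcal{H}},$$
where $E$ is any orthonormal basis for $\mathcal{H}$. A straightforward computation shows that the joint probability
of observing $a$ and $\bar{b}$ in the state $\Psi$ is:
$$\langle (a \otimes \bar{b})\Psi, \Psi \rangle = \frac{1}{n} \mathrm{Tr}(ab).$$
In other words, the normalized trace inner product just is the joint probability function determined
by the pure state vector $\Psi$!
As a consequence, the state represented by $\Psi$ has a very strong correlational property: if $x, y$ are
two orthogonal unit vectors with corresponding rank-one projections $p_x$ and $p_y$, we have $p_x p_y = 0$,
so $\langle (p_x \otimes \bar{p}_y)\Psi, \Psi \rangle = 0$. On the other hand, $\langle (p_x \otimes \bar{p}_x)\Psi, \Psi \rangle = \frac{1}{n}\mathrm{Tr}(p_x) = \frac{1}{n}$. Hence, $\Psi$ perfectly,
and uniformly, correlates every basic measurement (orthonormal basis) of $\mathcal{H}$ with its counterpart in $\bar{\mathcal{H}}$.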
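The correlation property of the EPR vector can be checked numerically. The sketch below assumes NumPy and models the conjugate space concretely: in the standard basis, the conjugate of an operator is taken to be its entrywise complex conjugate. The helper `random_effect` and all variable names are illustrative assumptions, not notation from the text.

```python
import numpy as np

n = 3
rng = np.random.default_rng(1)

def random_effect(n, rng):
    """Random positive operator with eigenvalues in [0, 1] (illustrative)."""
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    p = m @ m.conj().T
    return p / np.linalg.eigvalsh(p).max()

a, b = random_effect(n, rng), random_effect(n, rng)

# EPR vector (1/sqrt(n)) * sum_i e_i (x) conj(e_i) in the standard basis.
psi = np.eye(n).reshape(n * n) / np.sqrt(n)

def joint_probability(a, b):
    """<(a (x) conj(b)) psi, psi>, with conj(b) the entrywise conjugate."""
    return np.vdot(psi, np.kron(a, b.conj()) @ psi).real

joint = joint_probability(a, b)
normalized_trace = np.trace(a @ b).real / n

# Two orthogonal unit vectors x, y and their rank-one projections.
v = rng.normal(size=n) + 1j * rng.normal(size=n)
x = v / np.linalg.norm(v)
w = rng.normal(size=n) + 1j * rng.normal(size=n)
w = w - np.vdot(x, w) * x            # remove the component along x
y = w / np.linalg.norm(w)
p_x, p_y = np.outer(x, x.conj()), np.outer(y, y.conj())

same = joint_probability(p_x, p_x)        # expected to equal 1/n
different = joint_probability(p_x, p_y)   # expected to vanish
```

Up to rounding, `joint` agrees with `normalized_trace`, `same` equals $1/n$ and `different` is zero, which is the perfect, uniform correlation described above.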
Filters and homogeneity. Next, let us see why the cone Lh (H)+ is homogeneous. Recall that this
means that any state in the interior of the cone (here, any non-singular density operator) can be
obtained from any other by an automorphism of the cone. However, in fact, something better is true:
this order-automorphism can be chosen to represent a probabilistically-reversible physical process,
i.e., an invertible CP mapping with a CP inverse.
To see how this works, suppose $W$ is a positive operator on $\mathcal{H}$. Consider the pure CP mapping
$\Phi_W : L_{sa}(\mathcal{H}) \to L_{sa}(\mathcal{H})$ given by:
$$\Phi_W(a) = W^{1/2} a W^{1/2}.$$
Then, $\Phi_W(1) = W$. If $W$ is nonsingular, so is $W^{1/2}$, so $\Phi_W$ is invertible, with inverse $\Phi_W^{-1} = \Phi_{W^{-1}}$,
again a pure CP mapping. Now, given another nonsingular density operator $M$, we can get from $W$ to
$M$ by applying $\Phi_M \circ \Phi_W^{-1}$.
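The homogeneity argument is easy to test numerically. The following sketch assumes NumPy; `psd_power`, `filter_map` and `random_density` are hypothetical helper names introduced only for this illustration. It builds the map $a \mapsto W^{1/2} a W^{1/2}$ from an eigendecomposition and checks that composing the filter for $M$ with the inverse of the filter for $W$ carries $W$ to $M$.

```python
import numpy as np

def psd_power(w, p):
    """w**p for a positive-definite Hermitian matrix, via eigendecomposition."""
    vals, vecs = np.linalg.eigh(w)
    return (vecs * vals**p) @ vecs.conj().T

def filter_map(w):
    """Return the map a -> w^{1/2} a w^{1/2}."""
    root = psd_power(w, 0.5)
    return lambda a: root @ a @ root

def random_density(n, rng):
    """Random full-rank density matrix (illustrative helper)."""
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    rho = m @ m.conj().T
    return rho / np.trace(rho).real

rng = np.random.default_rng(2)
W, M = random_density(3, rng), random_density(3, rng)

phi_W = filter_map(W)
phi_W_inv = filter_map(np.linalg.inv(W))   # the filter for W^{-1}
phi_M = filter_map(M)

unit_to_W = phi_W(np.eye(3))         # the filter sends the unit to W
round_trip = phi_W_inv(phi_W(M))     # the W^{-1} filter undoes the W filter
W_to_M = phi_M(phi_W_inv(W))         # the composite carries W to M
```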
All well and good, but we are still left with the following:
where $p_x$ is the projection operator associated with $x$. We can understand this to mean that $\Phi_W$ acts as
a filter on the test $E$: the response of each outcome $x \in E$ is attenuated by a factor $0 \le t_x \le 1$ (my usage
here is slightly non-standard, in that I allow filters that “pass” the system with a probability strictly
between zero and one). Thus, if $M$ is another density operator on $\mathcal{H}$, representing some state of the
corresponding system, then the probability of obtaining outcome $x$ after preparing the system in state
$M$ and applying the process $\Phi_W$ is $t_x$ times the probability of $x$ in state $M$. In detail: suppose $p_x$ is the
rank-one projection operator associated with $x$, and note that $W^{1/2} p_x = p_x W^{1/2} = t_x^{1/2} p_x$. Thus,
$$\mathrm{Tr}(\Phi_W(M) p_x) = \mathrm{Tr}(W^{1/2} M W^{1/2} p_x) = \mathrm{Tr}(M W^{1/2} p_x W^{1/2}) = t_x \mathrm{Tr}(M p_x).$$
If we think of the basis E as representing a set of alternative channels plus detectors, as in the
figure below, we can add a classical filter attenuating the response of one of the detectors (say, x) by a
fraction t x . What the computation above tells us is that we can achieve the same result by applying a
suitable CP map to the system’s state. Moreover, this can be done independently for each outcome
of E. In Figure 1, this is illustrated for a three-level quantum system: E = { x, y, z} is an orthonormal
basis, representing three possible outcomes of a Stern–Gerlach-like experiment; the filter Φ acts on the
system’s state in such a way that the probability of outcome x is attenuated by a factor of t x = 1/2,
while outcomes $y$ and $z$ are unaffected. Returning to the general situation, if we apply a filter $\Phi_W$ to the
maximally-mixed state $\frac{1}{n} 1$, we obtain $\frac{1}{n} W$. Thus, we can prepare $W$, up to normalization, by applying
the filter $\Phi_W$ to the maximally mixed state.
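[Figure 1. A state $\alpha$ of a three-level system passes through the filter $\Phi$: outcome $x$ then occurs with
probability $\frac{1}{2}\alpha(x)$, while outcomes $y$ and $z$ occur with probabilities $\alpha(y)$ and $\alpha(z)$.]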
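The three-level example can be reproduced in a few lines. This is a sketch assuming NumPy, with the test $E$ taken to be the standard basis and the attenuation factors chosen as $t_x = 1/2$, $t_y = t_z = 1$; the variable names are illustrative.

```python
import numpy as np

t = np.array([0.5, 1.0, 1.0])     # attenuation factors for x, y, z
W = np.diag(t)                    # W = sum of t_x p_x over the basis E
root = np.diag(np.sqrt(t))        # W^{1/2}

rng = np.random.default_rng(3)
m = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
M = m @ m.conj().T
M /= np.trace(M).real             # a generic density operator

filtered = root @ M @ root        # the filter applied to M (sub-normalized)

before = np.diag(M).real          # Tr(M p_x) for each x in E
after = np.diag(filtered).real    # Tr(filtered p_x) for each x in E

# Applying the filter to the maximally mixed state prepares W / n.
prepared = root @ (np.eye(3) / 3) @ root
```

Here `after` equals `t * before` entry by entry: the probability of $x$ is halved and those of $y$ and $z$ are unchanged, while `prepared` equals `W / 3`.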
Filters are symmetric. Here is a final observation, linking these last two: the filter $\Phi_W$ is symmetric
with respect to the uniformly-correlating “EPR” state $\Psi$, in the sense that:
$$\langle (\Phi_W(a) \otimes \bar{b})\Psi, \Psi \rangle = \langle (a \otimes \overline{\Phi_W(b)})\Psi, \Psi \rangle$$
for all effects $a, b \in L_{sa}(\mathcal{H})_+$. Remarkably, this is all that is needed to recover the Jordan structure of
finite-dimensional quantum theory: the existence of a conjugate system, with a uniformly-correlating
joint state, plus the possibility of preparing non-singular states by means of filters that are symmetric
with respect to this state, and doing so reversibly when the state is nonsingular.
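The symmetry of the filter can be verified in one line. The following is a sketch, assuming the identity $\langle (a \otimes \bar{b})\Psi, \Psi \rangle = \frac{1}{n}\mathrm{Tr}(ab)$ for the EPR vector $\Psi$ and using only the cyclicity of the trace:

$$\langle (\Phi_W(a) \otimes \bar{b})\Psi, \Psi \rangle = \tfrac{1}{n}\mathrm{Tr}(W^{1/2} a W^{1/2} b) = \tfrac{1}{n}\mathrm{Tr}(a\, W^{1/2} b W^{1/2}) = \langle (a \otimes \overline{\Phi_W(b)})\Psi, \Psi \rangle.$$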
In a very rough outline, the argument is that states preparable (up to normalization) by
symmetric filters have spectral decompositions, and the existence of spectral decompositions makes
the uniformly-correlating joint state a self-dualizing inner product. However, to spell this out in a
precise way, I need a general mathematical framework for discussing states, effects and processes in
abstraction from quantum theory. The next section reviews the necessary apparatus.
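Definition 1. A test space is a collection $\mathcal{M}$ of non-empty sets $E, F, \ldots$, each representing the outcome-set of
some measurement, experiment, or test. At the outset, one makes no special assumptions about the combinatorial
structure of $\mathcal{M}$. In particular, distinct tests are permitted to overlap. Let $X := \bigcup \mathcal{M}$ denote the set of all
outcomes of all tests in $\mathcal{M}$. A probability weight on $\mathcal{M}$ is a function $\alpha : X \to [0, 1]$ such that
$\sum_{x \in E} \alpha(x) = 1$ for every test $E \in \mathcal{M}$.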
Test spaces were introduced and studied by D. J. Foulis and C. H. Randall in a long series of papers
beginning around 1970. The original term for a test was an operation, which has the advantage of
signaling that the concept has wider applicability than simply reading a number off a meter: anything
an agent can do that leads to a well-defined, exhaustive set of mutually-exclusive outcomes defines an
operation. Accordingly, test spaces were originally called “manuals of operations”.
It can happen that a test space admits no probability weights at all. However, to serve as a model
of a real family of experiments associated with an actual physical system, a test space should obviously
carry a lavish supply of such weights. One might want to single out some of these as describing
physically (or otherwise) possible states of the system. This suggests the following:
Definition 2. A probabilistic model is a pair A = (M, Ω), where M is a test space and Ω is some designated
convex set of probability weights, called the states of the model.
The definition is deliberately spare. Nothing prohibits us from adding further structure (a group
of symmetries, say, or a topology on the space of outcomes). However, no such additional structure
is needed for the results I will discuss below. I will write M( A), X ( A) and Ω( A) for the test space,
associated outcome space and state space of a model A. The convexity assumption on Ω( A) is
intended to capture the possibility of forming mixtures of states. To allow the modest idealization of
taking outcome-wise limits of states to be states, I will also assume that $\Omega(A)$ is closed as a subset of
$[0, 1]^{X(A)}$ (in its product topology). This makes $\Omega(A)$ compact and, so, guarantees the existence of
pure states, that is, extreme points of Ω( A). If Ω( A) is the set of all probability weights on M( A), I
will say that A has a full state space.
Two bits. Here is a simple, but instructive illustration of these notions. Consider a test space $\mathcal{M} =
\{\{x, x'\}, \{y, y'\}\}$. Here, we have two tests, each with two outcomes. We are permitted to perform either
test, but not both at once. A probability weight is determined by the values it assigns to $x$ and to $y$, and
since the sets $\{x, x'\}$ and $\{y, y'\}$ are disjoint, these values are independent. Thus, geometrically, the
space of all probability weights is the unit square in $\mathbb{R}^2$ (Figure 2a, below). To construct a probabilistic
model, we can choose any closed, convex subset of the square for $\Omega$. For instance, we might let $\Omega$ be
the convex hull of the four probability weights $\delta_x$, $\delta_{x'}$, $\delta_y$ and $\delta_{y'}$ corresponding to the midpoints of the
four sides of the square, as in Figure 2b, that is,
$$\delta_x(x) = 1,\ \delta_x(y) = \tfrac{1}{2}; \quad \delta_{x'}(x) = 0,\ \delta_{x'}(y) = \tfrac{1}{2}; \quad \delta_y(x) = \tfrac{1}{2},\ \delta_y(y) = 1; \quad \delta_{y'}(x) = \tfrac{1}{2},\ \delta_{y'}(y) = 0.$$
Figure 2. The state spaces of two bits. (a) The square bit; (b) The diamond bit.
The model of Figure 2a, in which we take $\Omega$ to be the entire set of probability weights on
$\mathcal{M} = \{\{x, x'\}, \{y, y'\}\}$, is sometimes called the square bit. I will call the model of Figure 2b the
diamond bit.
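The two models are small enough to encode directly. The sketch below uses only the Python standard library; it assumes that a weight is identified by the pair $(\alpha(x), \alpha(y))$ and that the diamond is the convex hull of the four edge midpoints of the unit square. The function names are hypothetical.

```python
from fractions import Fraction as F

TESTS = [("x", "x'"), ("y", "y'")]

def weight(px, py):
    """Probability weight determined by its values on x and y."""
    return {"x": px, "x'": 1 - px, "y": py, "y'": 1 - py}

def is_probability_weight(alpha):
    """Values lie in [0, 1] and sum to one over each test."""
    return (all(0 <= v <= 1 for v in alpha.values())
            and all(sum(alpha[o] for o in test) == 1 for test in TESTS))

def in_diamond(alpha):
    """Membership in the convex hull of the four edge midpoints."""
    half = F(1, 2)
    return abs(alpha["x"] - half) + abs(alpha["y"] - half) <= half

delta_x = weight(F(1), F(1, 2))     # midpoint of the right-hand side
corner = weight(F(1), F(1))         # a vertex of the square
centre = weight(F(1, 2), F(1, 2))   # centre of the square
```

All three are probability weights, so all three are states of the square bit; only `delta_x` and `centre` are states of the diamond bit.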
Classical, quantum and Jordan models. If $E$ is a finite set, the corresponding classical model is $A(E) =
(\{E\}, \Delta(E))$ where $\Delta(E)$ is the simplex of probability weights on $E$. If $\mathcal{H}$ is a finite-dimensional
complex Hilbert space, let $\mathcal{M}(\mathcal{H})$ denote the set of orthonormal bases of $\mathcal{H}$: then $X = \bigcup \mathcal{M}(\mathcal{H})$
is the unit sphere of $\mathcal{H}$, and any density operator $W$ on $\mathcal{H}$ defines a probability weight $\alpha_W$, given
by $\alpha_W(x) = \langle Wx, x \rangle$ for all $x \in X$. Letting $\Omega(\mathcal{H})$ denote the set of states of this form, we obtain the
quantum model, $A(\mathcal{H}) = (\mathcal{M}(\mathcal{H}), \Omega(\mathcal{H}))$, associated with $\mathcal{H}$ (Gleason’s theorem tells us that $A(\mathcal{H})$
has a full state space for $\dim(\mathcal{H}) > 2$, but we will not need this fact).
More generally, every Euclidean Jordan algebra $E$ gives rise to a probabilistic model as follows.
A minimal or primitive idempotent of $E$ is an element $p \in E$ with $p^2 = p$ and, for $q = q^2 < p$, $q = 0$.
A Jordan frame is a maximal pairwise orthogonal set of primitive idempotents. Let $X(E)$ be the set of
primitive idempotents; let $\mathcal{M}(E)$ be the set of Jordan frames; and let $\Omega(E)$ be the set of probability
weights of the form $\alpha(p) = \langle a, p \rangle$ where $a \in E_+$ with $\langle a, u \rangle = 1$. These data define the Jordan model
$A(E)$ associated with $E$. In the case where $E = L_h(\mathcal{H})$ for a finite-dimensional Hilbert space $\mathcal{H}$,
this almost gives us back the quantum model $A(\mathcal{H})$: the difference is that we replace unit vectors by
their associated projection operators, thus conflating outcomes that differ only by a phase.
Sharp models. Jordan models enjoy many special features that the generic probabilistic model lacks.
I want to take a moment to discuss one such feature, which will be important below.
Definition 3. A model A is unital iff, for every outcome x ∈ X ( A), there exists a state α ∈ Ω( A) with
α( x ) = 1, and sharp if this state is unique (from which it follows easily that it must be pure). If A is sharp, I
will write δx for the unique state making x ∈ X ( A) certain.
The spaces V( A), V∗ ( A). Any probabilistic model gives rise to a pair of ordered vector spaces in a
canonical way. These will be essential in the development below, so I am going to go into a bit of
detail here.
Definition 4. Let $A$ be any probabilistic model. Let $V(A)$ be the span of the state space $\Omega(A)$ in $\mathbb{R}^{X(A)}$,
ordered by the cone $V(A)_+$ consisting of non-negative multiples of states, i.e.,
$$V(A)_+ = \{t\alpha \mid \alpha \in \Omega(A),\ t \ge 0\}.$$
Call the model $A$ finite-dimensional iff $V(A)$ is finite-dimensional. From now on, I assume that
all models are finite-dimensional.
Let $V^*(A)$ denote the dual space of $V(A)$, ordered by the dual cone of positive linear functionals,
i.e., functionals $f$ with $f(\alpha) \ge 0$ for all $\alpha \in V(A)_+$. Any measurement-outcome $x \in X(A)$ yields an
evaluation functional $\hat{x} \in V^*(A)$, given by $\hat{x}(\alpha) = \alpha(x)$ for all $\alpha \in V(A)$. More generally, an effect is a
positive linear functional $f \in V^*(A)$ with $0 \le f(\alpha) \le 1$ for every state $\alpha \in \Omega(A)$. The functionals $\hat{x}$ are
effects. One can understand an arbitrary effect $a$ to represent a mathematically possible measurement
outcome, having probability $a(\alpha)$ in state $\alpha$. I stress the adjective mathematically because, a priori,
there is no guarantee that every effect will correspond to a physically-realizable measurement outcome.
In fact, at this stage, I make no assumption at all about what, apart from the tests $E \in \mathcal{M}(A)$, is or
is not physically realizable. (Later, it will follow from further assumptions that every element of
$V^*(A)$ represents a random variable associated with some $E \in \mathcal{M}(A)$ and is, therefore, operationally
meaningful. However, this will be a theorem, not an assumption.)
The unit effect is the functional $u_A := \sum_{x \in E} \hat{x}$, where $E$ is any element of $\mathcal{M}(A)$. This takes the
constant value of one on $\Omega(A)$, and, thus, represents a trivial measurement outcome that occurs with
probability one in every state. This is an order unit for $V^*(A)$ (to see this, let $a \in V^*(A)$, and let $N$ be
the maximum value of $|a(\alpha)|$ for $\alpha \in \Omega(A)$, remembering that the latter is compact: then $a \le N u_A$).
For both classical and quantum models, the ordered vector spaces $V^*(A)$ and $V(A)$ are naturally
isomorphic. If $A(E)$ is the classical model associated with a finite set $E$, both are isomorphic to the
space $\mathbb{R}^E$ of all real-valued functions on $E$, ordered pointwise. If $A = A(\mathcal{H})$ is the quantum model
associated with a finite-dimensional Hilbert space $\mathcal{H}$, $V(A)$ and $V^*(A)$ are both naturally isomorphic
to the space $L_h(\mathcal{H})$ of Hermitian operators on $\mathcal{H}$, ordered by its usual cone of positive semi-definite
operators. More generally, if $E$ is a Euclidean Jordan algebra and $A = A(E)$ is the corresponding
Jordan model, then $V(A) \cong E \cong V^*(A)$, with $E$ ordered as usual, i.e., by its cone of squares. The first
of these isomorphisms is due to the definition of the model $A(E)$ and the second to $E$’s self-duality.
The space $E(A)$. It is going to be technically useful to introduce a third ordered vector space, which I
will denote by $E(A)$. This is the span of the evaluation-effects $\hat{x}$, associated with measurement
outcomes $x \in X(A)$, in $V^*(A)$, ordered by the cone:
$$E(A)_+ := \Big\{ \sum_i t_i \hat{x}_i \ \Big|\ t_i \ge 0 \Big\}.$$
That is, $E(A)_+$ is the set of linear combinations of effects $\hat{x}$ having non-negative coefficients. It is
important to note that this is, in general, a proper sub-cone of $V^*(A)_+$. To see this, we can revisit the
example of the “diamond bit” of Figure 2b. Letting $x$ and $y$ be the outcomes corresponding to the
right face and the top face of the larger (full) state space pictured below in Figure 3a, consider the
functional $f := \hat{x} + \hat{y} - \frac{1}{2} u$. This takes non-negative values on the smaller state space of the diamond bit,
but is negative on, for example, the state $\gamma$ corresponding to the lower-left corner of the full state space
(see Figure 3b). Thus, $f \in V^*(A)_+$, but $f \notin E(A)_+$.
[Figure 3: panel (a) marks the faces $\hat{x} = 1$ and $\hat{y} = 1$ of the square; panel (b) marks the lines $f = 0$ and $f = 1$.]
Figure 3. (a) Two outcome-effects for the square bit; (b) An effect for the diamond bit not positive on
the square bit.
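The behaviour of the functional $f = \hat{x} + \hat{y} - \frac{1}{2}u$ can be checked by evaluating it at the extreme points. This is a standard-library sketch under the assumption that states are identified with pairs $(\alpha(x), \alpha(y))$, so that the diamond has vertices at the edge midpoints of the unit square and $\gamma$ is the corner $(0, 0)$.

```python
from fractions import Fraction as F

half = F(1, 2)

def f(px, py):
    """f = x^ + y^ - u/2 at the weight with alpha(x) = px, alpha(y) = py."""
    return px + py - half

# Extreme points of the diamond bit: midpoints of the four sides.
diamond_vertices = [(F(1), half), (F(0), half), (half, F(1)), (half, F(0))]
gamma = (F(0), F(0))   # lower-left corner of the full (square) state space

values_on_diamond = [f(*v) for v in diamond_vertices]
value_at_gamma = f(*gamma)
```

Since $f$ is affine, non-negativity at the four vertices implies non-negativity on the whole diamond, while the value at $\gamma$ is $-\frac{1}{2}$. Every non-negative combination of outcome-effects is non-negative at $\gamma$, so $f$ cannot be one.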
Since we are working in finite dimensions, the outcome-effects $\hat{x}$ span $V^*(A)$. Thus, as vector
spaces, $E(A)$ and $V^*(A)$ are the same. However, as the diamond bit illustrates, they can have quite
different positive cones and, thus, need not be isomorphic as ordered vector spaces.
Processes and subnormalized states. A subnormalized state of a model A is an element α of V( A)+ with
u(α) < 1. These can be understood as states that allow a nonzero probability 1 − u(α) of some generic
“failure” event, (e.g., the destruction of the system), represented by the zero functional in V∗ ( A).
More generally, we may wish to regard two systems, represented by models A and B, as the input
to and output from some process, whether dynamical or purely information-theoretic, that has some
probability to destroy the system or otherwise “fail”. Since such a process should preserve probabilistic
mixtures, it should be represented mathematically by an affine mapping $T : \Omega(A) \to V(B)_+$, taking
each normalized state $\alpha$ of $A$ to a possibly sub-normalized state $T(\alpha)$ of $B$. One can show that such a
mapping extends uniquely to a positive linear mapping:
$$T : V(A) \to V(B),$$
Definition 5. A process T : A → B is probabilistically reversible iff there exists a process S such that, for all
α ∈ Ω( A), (S ◦ T )(α) = pα, where p ∈ (0, 1].
This means that there is a probability 1 − p of the composite process S ◦ T failing, but a
probability p that it will leave the system in its initial state (note that, since S ◦ T is linear, p must
be constant); where T preserves normalization, so that T (Ω( A)) ⊆ Ω( B), S can also be taken to be
normalization-preserving and will undo the result of T with probability one. This is the more usual
meaning of “reversible” in the literature.
Given a process $T : V(A) \to V(B)$, there is a dual mapping $T^* : V^*(B) \to V^*(A)$, also positive,
given by $T^*(b)(\alpha) = b(T(\alpha))$ for all $b \in V^*(B)$ and $\alpha \in V(A)$. The assumption that $T$ takes normalized
states to subnormalized states is equivalent to the requirement that $T^*(u_B) \le u_A$, that is, that $T^*$ maps
effects to effects.
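In the classical case the dual mapping is just a matrix transpose, which makes the condition on the unit effect easy to see. The sketch below assumes NumPy and a purely classical process: a hypothetical sub-stochastic matrix `T` from a two-outcome model to a three-outcome model, with states and effects both written as real vectors. The numbers are made up for illustration.

```python
import numpy as np

# Column j holds the sub-normalized output weight for input outcome j.
T = np.array([[0.5, 0.2],
              [0.3, 0.6],
              [0.0, 0.1]])
u_A, u_B = np.ones(2), np.ones(3)   # unit effects of the two models

T_dual = T.T                        # acts on effects of the output model

alpha = np.array([0.25, 0.75])      # a normalized state of the input model
b = np.array([1.0, 0.0, 0.5])       # an effect on the output model

lhs = (T_dual @ b) @ alpha          # T*(b) evaluated at alpha
rhs = b @ (T @ alpha)               # b evaluated at T(alpha)

dual_unit = T_dual @ u_B            # T*(u_B): the column sums of T
survival = u_B @ (T @ alpha)        # total probability that T succeeds
```

The two sides of the defining identity agree, and the column sums of `T` are at most one, which is exactly the statement that normalized states go to sub-normalized states.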
Remark 1. Since we are attaching no special physical interpretation to the cone E+ ( A), we do not require a
physical process T : V( A) → V( B) to have a dual process T ∗ that maps E+ ( B) to E+ ( A). That is, we do not
require T ∗ to be positive as a mapping E( B) → E( A).
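Joint probabilities and joint states. If $\mathcal{M}_1$ and $\mathcal{M}_2$ are two test spaces, with outcome-spaces $X_1$ and
$X_2$, we can construct a space of product tests (note here the savage abuse of notation: $\mathcal{M}_1 \times \mathcal{M}_2$ is
not the Cartesian product of $\mathcal{M}_1$ and $\mathcal{M}_2$):
$$\mathcal{M}_1 \times \mathcal{M}_2 = \{E \times F \mid E \in \mathcal{M}_1,\ F \in \mathcal{M}_2\}.$$
This models a situation in which tests from $\mathcal{M}_1$ and from $\mathcal{M}_2$ can be performed separately,
and the results collated. Note that the outcome-space for $\mathcal{M}_1 \times \mathcal{M}_2$ is $X_1 \times X_2$. A joint probability
weight on $\mathcal{M}_1$ and $\mathcal{M}_2$ is just a probability weight on $\mathcal{M}_1 \times \mathcal{M}_2$, that is, a function $\omega : X_1 \times X_2 \to
[0, 1]$ such that $\sum_{(x,y) \in E \times F} \omega(x, y) = 1$ for all tests $E \in \mathcal{M}_1$ and $F \in \mathcal{M}_2$. One says that $\omega$ is
non-signaling iff the marginal (or reduced) probability weights $\omega_1$ and $\omega_2$, given by:
$$\omega_1(x) = \sum_{y \in F} \omega(x, y) \quad \text{and} \quad \omega_2(y) = \sum_{x \in E} \omega(x, y),$$
are well-defined, i.e., independent of the choice of the tests $F \in \mathcal{M}_2$ and $E \in \mathcal{M}_1$, respectively. One can understand
this to mean that the choice of which test to measure on $\mathcal{M}_1$ has no observable, i.e., no statistical,
influence on the outcome of tests made of $\mathcal{M}_2$, and vice versa. In this case, one also has well-defined
conditional probability weights:
$$\omega_{2|x}(y) := \frac{\omega(x, y)}{\omega_1(x)} \quad \text{and} \quad \omega_{1|y}(x) := \frac{\omega(x, y)}{\omega_2(y)}$$
(with, say, $\omega_{2|x} = 0$ if $\omega_1(x) = 0$, and similarly for $\omega_{1|y}$). This gives us the following bipartite version
of the law of total probability [23]: for any choice of $E \in \mathcal{M}_1$ or $F \in \mathcal{M}_2$,
$$\omega_2 = \sum_{x \in E} \omega_1(x)\, \omega_{2|x} \quad \text{and} \quad \omega_1 = \sum_{y \in F} \omega_2(y)\, \omega_{1|y}. \qquad (1)$$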
Definition 6. A joint state on a pair of probabilistic models $A$ and $B$ is a non-signaling joint probability weight
$\omega$ on $\mathcal{M}(A) \times \mathcal{M}(B)$ such that, for every $x \in X(A)$ and every $y \in X(B)$, the conditional probability weights
$\omega_{2|x}$ and $\omega_{1|y}$ belong to $\Omega(B)$ and $\Omega(A)$, respectively. It follows from (1) that the marginal weights $\omega_1$ and $\omega_2$
are also states of $A$ and $B$, respectively.
This naturally suggests that one should define, for models A and B, a composite model AB,
the states of which would be precisely the joint states on A and B. If one takes M( AB) = M( A) ×
M( B), this is essentially the “maximal tensor product” of A and B [24]. However, this does not
coincide with the usual composite of quantum-mechanical systems. In Section 6, I will discuss
composite systems in more detail. Meanwhile, for the main results of this paper, the idea of a joint
state is sufficient.
For a simple example of a joint state that is neither classical, nor quantum, let $B$ denote the “square
bit” model discussed above. That is, $B = (\mathcal{B}, \Omega)$ where $\mathcal{B} = \{\{x, x'\}, \{y, y'\}\}$ is a test space with two
non-overlapping, two-outcome tests, and $\Omega$ is the set of all probability weights thereon, amounting to
the unit square in $\mathbb{R}^2$. The joint state on $B \times B$ given by Table 1 (a variant of the “non-signaling box” of
Popescu and Rohrlich [25]) is clearly non-signaling. Notice that it also establishes a perfect, uniform
correlation between the outcomes of any test on the first system and its counterpart on the second.
        x      x’     y      y’
x      1/2     0     1/2     0
x’      0     1/2     0     1/2
y       0     1/2    1/2     0
y’     1/2     0      0     1/2
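The claims made about this table can be verified mechanically. The following standard-library sketch transcribes the table (rows for the first system, columns for the second, which is an assumption about its layout) and checks normalization on every product test, the non-signaling condition, and the perfect correlation between each test and its counterpart.

```python
from fractions import Fraction

OUT = ["x", "x'", "y", "y'"]
TESTS = [("x", "x'"), ("y", "y'")]
h = Fraction(1, 2)

ROWS = [[h, 0, h, 0],
        [0, h, 0, h],
        [0, h, h, 0],
        [h, 0, 0, h]]
omega = {(r, c): Fraction(ROWS[i][j])
         for i, r in enumerate(OUT) for j, c in enumerate(OUT)}

# Each product test E x F carries total probability one.
normalized = all(sum(omega[a, b] for a in E for b in G) == 1
                 for E in TESTS for G in TESTS)

def marginal_1(a, test):
    """Probability of a on the first system, summing over a second-system test."""
    return sum(omega[a, b] for b in test)

def marginal_2(b, test):
    """Probability of b on the second system, summing over a first-system test."""
    return sum(omega[a, b] for a in test)

# Non-signaling: the marginals do not depend on the test summed over.
non_signaling = all(
    marginal_1(o, TESTS[0]) == marginal_1(o, TESTS[1])
    and marginal_2(o, TESTS[0]) == marginal_2(o, TESTS[1])
    for o in OUT)

# Perfect, uniform correlation of each test with its counterpart.
perfectly_correlated = all(omega[a, b] == (h if a == b else 0)
                           for E in TESTS for a in E for b in E)
```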
Conditioning maps. If $\omega$ is a joint state on $A$ and $B$, define the associated conditioning maps $\hat{\omega} :
X(A) \to V(B)$ and $\hat{\omega}^* : X(B) \to V(A)$ by:
$$\hat{\omega}(x)(y) = \omega(x, y) = \hat{\omega}^*(y)(x)$$
for all $x \in X(A)$ and $y \in X(B)$. Note that $\hat{\omega}(x) = \omega_1(x)\,\omega_{2|x}$ for every $x \in X(A)$, i.e., $\hat{\omega}(x)$ can be
understood as the un-normalized conditional state of $B$ given the outcome $x$ on $A$. Similarly, $\hat{\omega}^*(y)$ is
the un-normalized conditional state of $A$ given outcome $y$ on $B$.
The conditioning map $\hat{\omega}$ extends uniquely to a positive linear mapping $E(A) \to V(B)$, which
I also denote by $\hat{\omega}$, such that $\hat{\omega}(\hat{x}) = \hat{\omega}(x)$ for all outcomes $x \in X(A)$. To see this, consider the
linear mapping $T : V^*(A) \to \mathbb{R}^{X(B)}$ defined, for $f \in V^*(A)$, by $T(f)(y) = f(\hat{\omega}^*(y))$ for all $y \in X(B)$.
If $f = \hat{x}$, we have $T(\hat{x}) = \omega_1(x)\,\omega_{2|x} \in V(B)_+$, whence, for all $y \in X(B)$, $T(\hat{x})(y) = \omega(x, y) = \hat{\omega}(x)(y)$.
Since the evaluation functionals $\hat{x}$ span $E(A)$, the range of $T$ lies in $V(B)$, and moreover, $T$ is positive on
the cone $E(A)_+$. Hence, as advertised, $T$ defines a positive linear mapping $E(A) \to V(B)$, extending
$\hat{\omega}$. In the same way, $\hat{\omega}^*$ defines a positive linear mapping $\hat{\omega}^* : E(B) \to V(A)$.
An immediate and important corollary is that any joint state $\omega$ on $A$ and $B$ defines a bilinear
form, which by abuse of notation I also call $\omega$, on $E(A) \times E(B)$, given by $\omega(a, b) := \hat{\omega}(a)(b)$ for all
$a \in E(A)$ and $b \in E(B)$. Note that $\omega(\hat{x}, \hat{y}) = \omega(x, y)$ for all $x \in X(A)$, $y \in X(B)$ and also that the bilinear form $\omega$
is positive, in the sense that $\omega(a, b) \ge 0$ for all $a \in E(A)_+$ and all $b \in E(B)_+$.
Definition 7. Let $A$ be a uniform probabilistic model with tests of size $n$. A conjugate for $A$ is a model $\bar{A}$, plus a
chosen isomorphism $\gamma_A : A \cong \bar{A}$ and a joint state $\eta_A$ on $A$ and $\bar{A}$ such that for all $x, y \in X(A)$,
(a) $\eta_A(x, \bar{x}) = 1/n$
(b) $\eta_A(x, \bar{y}) = \eta_A(y, \bar{x})$
where $\bar{x} := \gamma_A(x)$.
This corresponds to what is called a “weak conjugate” in [17]. Note that if $E \in \mathcal{M}(A)$, we have
$\sum_{(x,y) \in E \times E} \eta_A(x, \bar{y}) = 1$ and $|E| = n$. Hence, $\eta_A(x, \bar{y}) = 0$ for $x, y \in E$ with $x \ne y$. Thus, $\eta_A$ establishes
a perfect, uniform correlation between any test $E \in \mathcal{M}(A)$ and its counterpart, $\bar{E} := \{\bar{x} \mid x \in E\}$,
in $\mathcal{M}(\bar{A})$.
The symmetry condition (b) is pretty harmless. If $\eta$ is a joint state on $A$ and $\bar{A}$ satisfying (a), then
so is $\eta^t(x, \bar{y}) := \eta(y, \bar{x})$; thus, $\frac{1}{2}(\eta + \eta^t)$ satisfies both (a) and (b). In fact, if $A$ is sharp, (b) is automatic:
if $\eta$ satisfies (a), then the conditional state $\eta_{1|\bar{x}}$ assigns probability one to the outcome $x$. If $A$ is
sharp, this implies that $\eta_{1|\bar{x}} = \delta_x$ is uniquely defined, whence $\eta(x, \bar{y}) = \frac{1}{n}\delta_y(x)$ is also uniquely defined.
In other words, for a sharp model $A$ and a given isomorphism $\gamma : A \cong \bar{A}$, there exists at most one joint
state $\eta$ satisfying (a); whence, in particular, $\eta = \eta^t$.
Equivalently, $\Phi$ is a filter iff the dual process $\Phi^* : V^*(A) \to V^*(A)$ satisfies $\Phi^*(\hat{x}) = t_x \hat{x}$ for each
$x \in E$. Just as in the quantum-mechanical case, a filter independently attenuates the “sensitivity” of
the outcomes $x \in E$. (The extreme case is one in which the coefficient $t_x$ corresponding to a particular
outcome is one, and the other coefficients are all zero. In that case, all outcomes other than $x$ are, so to
say, blocked by the filter. Conversely, given such an “all or nothing” filter $\Phi_x$ for each $x \in E$, we can
construct an arbitrary filter with coefficients $t_x$ by setting $\Phi = \sum_{x \in E} t_x \Phi_x$.)
Call a filter $\Phi$ reversible iff $\Phi$ is an order-automorphism of $V(A)$; that is, iff it is probabilistically
reversible as a process. Evidently, this requires that all the coefficients $t_x$ be nonzero. We will eventually
see that the existence of a conjugate, plus the preparability of arbitrary nonsingular states by symmetric
reversible filters, will be enough to force $A$ to be a Jordan model. Most of the work is done by the easy
Lemma 1, below. First, some terminology.
Definition 9. Suppose $\Delta = \{\delta_x \mid x \in X(A)\}$ is a family of states indexed by outcomes $x \in X(A)$ and such
that $\delta_x(x) = 1$. Say that a state $\alpha$ is spectral with respect to $\Delta$ iff there exists a test $E \in \mathcal{M}(A)$ such that
$\alpha = \sum_{x \in E} \alpha(x)\delta_x$. Say that the model $A$ itself is spectral with respect to $\Delta$ if every state of $A$ is spectral with
respect to $\Delta$.
Lemma 1. Let $A$ have a conjugate $(\bar{A}, \eta_A)$. Suppose $A$ is spectral with respect to the states $\delta_x := \eta_{1|\bar{x}}$,
$x \in X(A)$. Then:
$$\langle a, b \rangle := n\,\eta_A(a, \bar{b}),$$
where $n$ is the rank of $A$, defines a self-dualizing inner product on $E(A)$, with respect to which $V(A)_+ \cong E(A)_+$.
Moreover, $A$ is sharp, and $E(A)_+ = V^*(A)_+$.
Proof. That $\langle \cdot, \cdot \rangle$ is symmetric and bilinear follows from $\eta_A$’s being symmetric and non-signaling.
Note that $\langle \hat{x}, \hat{x} \rangle = 1$ for every $x \in X(A)$ and $\langle \hat{x}, \hat{y} \rangle = 0$ for any distinct $x, y \in X(A)$ lying in a common
test. We need to show that $\langle \cdot, \cdot \rangle$ is positive-definite. Since $\bar{A} \cong A$ and the latter is spectral, so is the
former. It follows that $\hat{\eta}$ takes $E(A)_+$ onto $V(\bar{A})_+$ and, hence, is an order-isomorphism. From this,
it follows that every $a \in E(A)_+$ has a “spectral” decomposition of the form $\sum_{x \in E} t_x \hat{x}$ for some
coefficients $t_x \ge 0$ and some test $E \in \mathcal{M}(A)$. In fact, any $a \in E(A)$, positive or otherwise, has such a
decomposition (albeit with possibly negative coefficients). If $a \in E(A)$ is arbitrary, with $a = a_1 - a_2$ for
some $a_1, a_2 \in E(A)_+$, we can find $N \ge 0$ with $a_2 \le Nu$. Thus, $b := a + Nu = a_1 + (Nu - a_2) \ge 0$, and
so, $b = \sum_{x \in E} t_x \hat{x}$ for some $E \in \mathcal{M}(A)$, and hence, $a = b - Nu = \sum_{x \in E} t_x \hat{x} - N(\sum_{x \in E} \hat{x}) = \sum_{x \in E} (t_x - N)\hat{x}$.
Now, let $a \in E(A)$. Decomposing $a = \sum_{x \in E} t_x \hat{x}$ for some test $E$ and some coefficients $t_x$, we have:
$$\langle a, a \rangle = \sum_{(x,y) \in E \times E} t_x t_y \langle \hat{x}, \hat{y} \rangle = \sum_{x \in E} t_x^2 \ge 0.$$
This is zero only where all coefficients $t_x$ are zero, i.e., only for $a = 0$. Therefore, $\langle \cdot, \cdot \rangle$ is an inner
product, as claimed.
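We need to show that $\langle \cdot, \cdot \rangle$ is self-dualizing. Clearly $\langle a, b \rangle = n\,\eta_A(a, \bar{b}) \ge 0$ for all $a, b \in E(A)_+$.
Suppose $a \in E(A)$ is such that $\langle a, b \rangle \ge 0$ for all $b \in E(A)_+$. Then, $\langle a, \hat{y} \rangle \ge 0$ for all $y \in X(A)$. Now,
$a = \sum_{x \in E} t_x \hat{x}$ for some test $E$; thus, for all $y \in E$, we have $\langle a, \hat{y} \rangle = t_y \ge 0$, whence, $a \in E(A)_+$.
Next, we want to show that $E(A)_+ = V^*(A)_+$. Since $\hat{\eta} : E(A) \to V(\bar{A})$ is an order-isomorphism,
for every $\alpha \in V(A)$, there exists a unique $a \in E(A)$ with $\hat{\eta}(a) = \frac{1}{n}\bar{\alpha}$. In particular,
$$\langle a, \hat{x} \rangle = n\,\eta_A(a, \bar{x}) = \bar{\alpha}(\bar{x}) = \alpha(x).$$
More generally, for every $b \in E(A)$,
$$b(\alpha) = \bar{b}(\bar{\alpha}) = \bar{b}(n\,\hat{\eta}_A(a)) = n\,\eta(a, \bar{b}) = \langle a, b \rangle.$$
Since every $a \in E(A)_+$ has the form $a = \hat{\eta}^{-1}(\frac{1}{n}\bar{\alpha})$ for some $\alpha \in V(A)_+$, if $b \in V^*(A)_+$, we have
$\langle a, b \rangle \ge 0$ for all $a \in E(A)_+$, whence, by the self-duality of the latter cone, $b \in E(A)_+$. Thus,
$V^*(A)_+ = E(A)_+$.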
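Finally, let us see that $A$ is sharp. If $\alpha \in \Omega(A)$, let $a$ be the unique element of $E(A)_+$ with
$\langle a, \hat{x} \rangle = \alpha(x)$. In particular, $\langle a, u \rangle = 1$. If $a$ has spectral decomposition $a = \sum_{x \in E} t_x \hat{x}$, where $E \in \mathcal{M}(A)$,
then for all $x \in E$, $\langle a, \hat{x} \rangle = t_x$; hence, $\sum_{x \in E} t_x = \sum_{x \in E} \langle a, \hat{x} \rangle = \langle a, u \rangle = 1$. Thus, $\|a\|^2 = \sum_{x \in E} t_x^2 \le 1$,
whence, $\|a\| \le 1$. Now, suppose $\alpha(x) = 1$ for some $x \in X(A)$: then, $1 = \langle a, \hat{x} \rangle \le \|a\| \|\hat{x}\|$; as $\|\hat{x}\| = 1$,
we have $\|a\| = 1$. However, now $\langle a, \hat{x} \rangle = \|a\| \|\hat{x}\|$, which is the case of equality in the Cauchy–Schwarz
inequality, whence, $a = \hat{x}$. Hence, there is only one weight $\alpha$
with $\alpha(x) = 1$, namely, $\alpha = \langle \hat{x}, \cdot \rangle$, so $A$ is sharp.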
If $A$ is sharp, then we say that $A$ is spectral iff it is spectral with respect to the pure states $\delta_x$ defined
by $\delta_x(x) = 1$. If $A$ is sharp and has a conjugate $\bar{A}$, then, as noted earlier, the state $\eta_{1|\bar{x}}$ is exactly $\delta_x$,
so the spectrality assumption in Lemma 1 is fulfilled if we simply say that $A$ is spectral. Hence, a sharp,
spectral model with a conjugate is self-dual.
For the simplest systems, this is already enough to secure the desired representation in terms of a
Euclidean Jordan algebra.
Definition 10. Call A a bit iff it has rank two (that is, all tests have two outcomes) and if every state α ∈ Ω( A)
can be expressed as a mixture of two sharply distinguishable states; that is, α = tδx + (1 − t)δy for some
t ∈ [0, 1] and states δx and δy with δx ( x ) = 1 and δy (y) = 1 for some test { x, y}.
Theorem 2. Let $A$ be spectral with respect to a conjugate system $\bar{A}$. If $V(A)$ is homogeneous, then there exists
a canonical Jordan product on $E(A)$ with respect to which $u_A$ is the Jordan unit. Moreover, with respect to this
product, $X(A)$ is exactly the set of primitive idempotents, and $\mathcal{M}(A)$ is exactly the set of Jordan frames.
The first part is almost immediate from the Koecher–Vinberg theorem, together with Lemma 1.
The KV theorem gives us an isomorphism between the ordered vector spaces V( A) and E( A), so if one
is homogeneous, so is the other. Since E( A) is also self-dual by Lemma 1, the KV theorem yields the
requisite unique Euclidean Jordan structure having u as the Jordan unit. One can then show without
much trouble that every outcome x ∈ X ( A) is a primitive idempotent of E( A) with respect to this
Jordan structure and that every test is a Jordan frame. The remaining claims (that every minimal
idempotent belongs to X ( A) and every Jordan frame, to M( A)) take a little bit more work. I will
not reproduce the proof here; the details (which are not especially difficult, but depend on some facts
concerning Euclidean Jordan algebras) can be found in [17].
The homogeneity of V( A) can be understood as a preparability assumption: it is equivalent
to saying that every state in the interior of Ω( A) can be obtained, up to normalization, from the
maximally-mixed state by a reversible process. That is, if α ∈ Ω( A), there is some such process φ such
that φ(ρ) = pα where 0 < p ≤ 1. One can think of the coefficient p as the probability that the process
φ will yield a nonzero result (more dramatically: will not destroy the system). Thus, if we prepare an
ensemble of identical copies of the system in the maximally-mixed state ρ and subject them all to the
process φ, the fraction that survives will be about p, and these will all be in state α.
In fact, if the hypotheses of Lemma 1 hold, the homogeneity of E( A) follows directly from the
mere existence of reversible filters with arbitrary non-zero coefficients. To see this, suppose a ∈ E( A)+
has a spectral decomposition $\sum_{x \in E} t_x \hat{x}$ for some E ∈ M(A), with $t_x > 0$ for all x when a belongs
to the interior of E(A)+. Now, if we can find a reversible filter for E with $\Phi(x) = t_x \hat{x}$ for all x ∈ E,
then applying this to the order-unit $u = \sum_{x \in E} \hat{x}$ yields a. Thus, V∗(A) is homogeneous.
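As a toy illustration of the filter argument above, the following sketch works in the simplest (classical) setting, with a single test E and effects stored as coefficient maps over E. This is an assumption made for illustration only; the helper names `unit`, `apply_filter` and `invert_filter` are hypothetical and do not come from the paper.

```python
from fractions import Fraction

def unit(outcomes):
    """Order unit u: the sum of the outcome effects, as a coefficient map."""
    return {x: Fraction(1) for x in outcomes}

def apply_filter(t, effect):
    """Filter that rescales each outcome x by t[x], extended linearly."""
    return {x: t[x] * c for x, c in effect.items()}

def invert_filter(t):
    """A filter is reversible only when every coefficient is non-zero."""
    if any(tx == 0 for tx in t.values()):
        raise ValueError("a filter with a zero coefficient is not reversible")
    return {x: 1 / tx for x, tx in t.items()}

E = ["x", "y", "z"]
# Spectral coefficients of an interior element a (all strictly positive).
t = {"x": Fraction(1, 2), "y": Fraction(1, 3), "z": Fraction(1, 6)}

a = apply_filter(t, unit(E))                # the filter sends u to a
back = apply_filter(invert_filter(t), a)    # the inverse filter sends a to u
```

In this toy model, every element with strictly positive coefficients is reached from u by an invertible filter, which is the homogeneity property in miniature.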
Two paths to spectrality. Some axiomatic treatments of quantum theory have taken one or another
form of spectrality as an axiom [6,26]. If one is content to do this, then Lemma 1 above provides a
very direct route to the Jordan structure of quantum theory. However, spectrality can actually be
derived from assumptions that, on their face, seem a good deal weaker, or anyway more transparent (a
different path to spectrality is charted in a recent paper [27] by G. Chiribella and C. M. Scandolo).
I will call a joint state on models A and B correlating iff it sets up a perfect correlation between
some pair of tests E ∈ M( A) and F ∈ M( B). More exactly:
Definition 11. A joint state ω on probabilistic models A and B correlates a test E ∈ M( A) with a test
F ∈ M( B) iff there exist subsets E0 ⊆ E and F0 ⊆ F, and a bijection f : E0 → F0 such that ω ( x, y) = 0 for
( x, y) ∈ E × F unless y = f ( x ). In this case, say that ω correlates E with F along f . A joint state on A and B
is correlating iff it correlates some pair of tests E ∈ M( A), F ∈ M( B).
Lemma 2. Suppose A is sharp and that every state α of A arises as the marginal of a correlating joint state
between A and some model B. Then, A is spectral.
Proof. Suppose $\alpha = \omega_1$, where ω is a joint state correlating a test E ∈ M(A) with a test F ∈ M(B),
say along a bijection f : E0 → F0, where E0 ⊆ E and F0 ⊆ F. Then, for any x ∈ E with α(x) ≠ 0,
$\omega_{1|f(x)}(x) = 1$, whence, as A is sharp, $\omega_{1|f(x)} = \delta_x$, the unique state making x certain. It follows from
the law of total probability that $\alpha = \sum_{x \in E} \alpha(x)\,\delta_x$.
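The bookkeeping in this proof can be sketched on a small joint probability table. The example is illustrative only: it looks at a single pair of tests E and F, stores the joint weight as a dictionary, and uses hypothetical helper names that are not taken from the paper.

```python
from fractions import Fraction

def marginal_on_A(omega, E, F):
    """alpha(x) = sum over y in F of omega(x, y)."""
    return {x: sum(omega.get((x, y), 0) for y in F) for x in E}

def correlates_along(omega, E, F, f):
    """True iff f is injective and omega(x, y) = 0 unless y = f(x)."""
    injective = len(set(f.values())) == len(f)
    supported = all(
        omega.get((x, y), 0) == 0 or f.get(x) == y for x in E for y in F
    )
    return injective and supported

def conditional_on_B(omega, E, y):
    """Weights on E for system A, conditioned on outcome y of system B."""
    total = sum(omega.get((x, y), 0) for x in E)
    return {x: omega.get((x, y), 0) / total for x in E}

E, F = ["x1", "x2", "x3"], ["y1", "y2", "y3"]
f = {"x1": "y2", "x2": "y1"}   # bijection from E0 = {x1, x2} onto F0 = {y2, y1}
omega = {("x1", "y2"): Fraction(3, 4), ("x2", "y1"): Fraction(1, 4)}

alpha = marginal_on_A(omega, E, F)
given_y2 = conditional_on_B(omega, E, "y2")
```

Conditioning on f(x) makes x certain, so in a sharp model the conditional state is δx, and the marginal is the mixture of these conditional states with weights α(x).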
In principle, the model B can vary with the state α. Lemma 2 suggests the following language:
Definition 12. A model A satisfies the correlation condition iff every state α ∈ Ω( A) is the marginal of some
correlating joint state of A and some model B.
This has something of the same flavor as the purification postulate of [8], which requires that all
states of a given system arise as marginals of a pure state on a larger, composite system, unique up to
symmetries on the purifying system. However, note that we do not require the correlating joint state
to be either pure (which, in classical probability theory, it will not be) or unique.
If A is sharp and satisfies the correlation condition, then every state of A is spectral. If, in addition,
A has a conjugate, then for every x ∈ X ( A), we have η1| x = δx . In this case, A is spectral with respect
to the family of states η1| x , and the hypotheses of Lemma 1 are satisfied.
Here is another, superficially quite different, way of arriving at spectrality. Suppose A has a
conjugate, $\bar{A}$. Call a transformation Φ symmetric with respect to $\eta_A$ iff, for all x, y ∈ X(A),
$$\eta_A(\Phi^* x, \bar{y}) = \eta_A(x, \bar{\Phi}^* \bar{y}).$$
Say that a state α is preparable by a filter Φ iff α = Φ(ρ), where ρ is the maximally-mixed state.
Lemma 3. Let A have a conjugate, $\bar{A}$, and suppose every state of A is preparable by a symmetric filter. Then, A
is spectral.
Proof. Let α = Φ(ρ) where Φ is a filter on a test E ∈ M(A), say $\Phi(x) = t_x x$ for all x ∈ E. Then:
$$\alpha = \Phi(\hat{\eta}^*(\bar{u})) = \eta(\Phi^*(\cdot), \bar{u}) = \eta(\,\cdot\,, \bar{\Phi}^*(\bar{u})) = \sum_{x \in E} \eta(\,\cdot\,, t_x \bar{x}) = \sum_{x \in E} t_x \tfrac{1}{n}\, \delta_x.$$
Thus, the hypotheses of either Corollary 2 or Lemma 3 will supply the needed spectral assumption
that makes Lemma 1 work (in fact, it is not hard to see that these hypotheses are actually equivalent,
an exercise I leave for the reader).
To obtain a Jordan model, we still need homogeneity. This is obviously implied by the preparability
condition in Lemma 3, provided the preparing filters Φ can be taken to be reversible whenever the
state to be prepared is non-singular. On the other hand, as noted above, in the presence of spectrality, it
is enough to have arbitrary reversible filters, as these allow one to prepare the spectral decompositions
of arbitrary non-singular states. Thus, conditions (a) and (b) below both imply that A is a Jordan
model. Conversely, one can show that any Jordan model satisfies both (a) and (b), closing the loop [17]:
(a) A has a conjugate, and every non-singular state can be prepared by a reversible symmetric filter;
(b) A is sharp, has a conjugate, satisfies the correlation condition and has arbitrary reversible filters;
(c) A is a Jordan model.
(a) The states β x are distinguishable, or readable, by some test F ∈ M( B). This means that for each
x ∈ E, there is a unique y ∈ F such that β x (y) = 1. Note that this sets up an injection f : E → F.
(b) The record states must be accurate, in the sense that if we were to measure E on A, and secure
x ∈ E, the record state β x should coincide with the conditional state ω2| x (if this is not the case,
then a measurement of A cannot correctly calibrate the system B as a measuring device for E).
In other words, ω must correlate E with F, along the bijection f : E → F0 ⊆ F. If the measurement
process leaves α undisturbed, in the sense that ω1 = α, then α dilates to a correlating state. This suggests
the following non-disturbance principle: every state can be measured, by some test E ∈ M( A),
without disturbance. Lemma 2 then tells us that if A is sharp and satisfies the non-disturbance
principle, every state of A is spectral.
Here is a slightly different, but possibly more compelling, version of this story. Suppose we can
perform a test E on A directly (setting aside, that is, any issue of whether or not this can be achieved
through some dynamical process): this will result in an outcome x occurring. To do anything with
this, we need to record its having occurred. This means we need a storage medium, B and a family
of states β x , one for each x ∈ E, such that if, on performing the test E, we obtain x, then B will be
in state β x . Moreover, these record states need to be readable at a later time, i.e., distinguishable by
a later measurement on B. To arrange this, we need A and B to be in a joint state, associated with
a joint probability weight ω, such that ω1 = α (because we want to have prepared A in the state α)
and β x = ω2| x for every x ∈ E. We then measure E on A; upon our obtaining outcome x ∈ E, B is in
the state β x . Since the ensemble of states β x is readable by some F ∈ M( B) with | F | ≥ | E|, we have
correlation, and α must also be spectral.
Of course, these desiderata cannot always be satisfied. What is true, in QM, is that for every
choice of state α, there will exist some test that is recordable in that state, in the foregoing sense. If we
promote this to the general principle, we again see that every state is the marginal of a correlating state,
and hence spectral, if A is sharp.
Definition 13. A non-signaling composite of models A and B is a model AB, together with a mapping
π : X ( A) × X ( B) → V∗ ( AB)+ such that:
$$\sum_{x \in E,\; y \in F} \pi(x, y) = u_{AB}$$
The idea here, expressed in Alice-and-Bob language (Alice controlling system A, Bob controlling
system B), is that π ( x, y) is an effect of the composite system AB, corresponding to x being observed
by Alice and y, by Bob. In many cases, π ( x, y) will actually be an outcome in X ( AB). Indeed,
we usually have π : X ( A) × X ( B) → X ( AB) injective, and for E ∈ M( A), F ∈ M( B), π ( E × F ) =
{π ( x, y)| x ∈ E, y ∈ F } a test in M( AB). The rank of AB will then be the product of the ranks of A
and B. Accordingly, let us call a non-signaling composite with these properties multiplicative.
Composites in real and complex quantum mechanics are multiplicative; in quaternionic quantum
mechanics, with the most plausible definition of tensor product, they are not [28].
Therefore, the question becomes: can one construct, for Jordan models A and B, a non-signaling
composite AB that is also a Jordan model? At present, and in this generality, this question seems to be
open, but some progress is made in [28]: if neither A nor B contains the exceptional Jordan algebra as a
summand, such a composite can indeed be constructed, and in multiple ways. Moreover, under a
considerably more restrictive definition of “Jordan composite”, no Jordan composite AB can exist if
either factor has an exceptional summand.
$$(f \otimes g) \circ (f' \otimes g') = (f \circ f') \otimes (g \circ g').$$
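For linear maps, this interchange law is the mixed-product property of the Kronecker product. The sketch below checks it on small integer matrices. It assumes the tensor product is represented by the Kronecker product; the names `f2` and `g2` stand for the second pair of maps and are chosen here for illustration.

```python
def matmul(a, b):
    """Matrix product of an (m x k) and a (k x n) matrix."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Kronecker product: rows indexed by (i, k), columns by (j, l)."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

f = [[1, 2], [3, 4]]             # 2 x 2
f2 = [[0, 1], [1, 0]]            # 2 x 2
g = [[2, 0, 1], [1, 1, 0]]       # 2 x 3
g2 = [[1, 0], [0, 2], [3, 1]]    # 3 x 2

lhs = matmul(kron(f, g), kron(f2, g2))            # (f (x) g) composed with (f2 (x) g2)
rhs = kron(matmul(f, f2), matmul(g, g2))          # (f f2) (x) (g g2)
```

Both sides are 4 x 4 matrices and agree entry by entry, which is what lets circuit diagrams be read without worrying about the order in which parallel boxes are composed.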
By a probabilistic theory, I mean a category of probabilistic models and processes; that is, objects
of C are models, and a morphism A → B, where A, B ∈ C , is a process V( A) → V( B). A monoidal
probabilistic theory is such a category, C, carrying a symmetric monoidal structure A, B ↦ AB, where
AB is a non-signaling composite in the sense of the definition above. I also assume that the monoidal
unit, I, is the trivial model 1 with V(1) = R, and that, for all A ∈ C,
Call C locally tomographic iff AB is a locally tomographic composite for all A, B ∈ C . Much of the
qualitative content of (finite-dimensional) quantum information theory can be formulated in purely
categorical terms [11,18,30]. In particular, in the work of Abramsky and Coecke [18], it is shown that a
range of quantum phenomena, notably gate teleportation, is available in any dagger-compact category.
For a review of this notion, as well as a proof of the following result, see Appendix D:
Theorem 4. Let C be a locally-tomographic monoidal probabilistic theory, in which every object A ∈ C is sharp,
spectral and has a conjugate $\bar{A}$ ∈ C, with $\eta_A \in \Omega(A\bar{A})$. Assume also that, for all A, B ∈ C,
Then, C has a canonical dagger-compact structure, in which $\bar{A}$ is the dual of A with $\eta_A : \mathbb{R} \to V(A\bar{A})$ as
the co-unit.
7. Conclusions
As promised, we have here an easy derivation of something close to orthodox, finite-dimensional
QM, from operationally or probabilistically transparent assumptions. As discussed earlier,
this approach offers, in addition to its relative simplicity, greater latitude than the locally-tomographic
axiomatic reconstructions of [7–10], putting us in the slightly less constrained realm of formally real
Jordan algebras. This allows for real and quaternionic quantum systems, superselection rules and even
theories, such as the ones discussed in Section 6, in which real, complex and quaternionic quantum
systems coexist and interact.
There remains some mystery as to the proper interpretation of the conjugate system $\bar{A}$.
Operationally, the situation is clear enough: if we understand A as controlled by Alice and $\bar{A}$ by
Bob, and if Alice and Bob share the state $\eta_A$, then they will always obtain the same result, as long as
they perform the same test. However, what does it mean physically that this should be possible (in
a situation in which Alice and Bob are still able to choose their tests independently)? In fact, there
is little consensus (that I can find, anyway) among physicists as to the proper interpretation of the
conjugate of the Hilbert space representing a given quantum-mechanical system. One popular idea
is that the conjugate is a time-reversed version of the given system; but why, then, should we expect
to find a state that perfectly correlates the two? At any rate, finding a clear physical interpretation of
conjugate systems, even (or especially!) in orthodox quantum mechanics, seems to me an urgently
important problem.
I would like to close with another problem, this one of mainly mathematical interest. The
hypotheses of Theorem 2 yield a good deal more structure than just a homogeneous, self-dual cone.
In particular, we have a distinguished set M( A) of orthonormal observables in V∗ ( A), with respect
to which every effect has a spectral decomposition. Moreover, with a bit of work, one can show that
this decomposition is essentially unique. More exactly, if a = ∑i ti pi where the coefficients ti are all
distinct and the effects p1 , ..., pk are associated with a coarse-graining of a test E ∈ M( A), then both
the coefficients and the effects are uniquely determined. The details are in Appendix B. Using this,
we have a functional calculus on V∗ ( A), i.e., for any real-valued function f of a real variable and any
effect a with spectral decomposition ∑i ti pi as above, we can define f ( a) = ∑i f (ti ) pi . This gives us a
unique candidate for the Jordan product of effects a and b, namely,
$$a \cdot b = \tfrac{1}{2}\left((a + b)^2 - a^2 - b^2\right).$$
We know from Theorem 2 (and thus, ultimately, from the KV theorem) that this is bilinear.
The challenge is to show this without appealing to the KV theorem (the fact that the state spaces of
“bits” are always balls, as shown in Appendix C, is perhaps relevant here).
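To see what the squaring formula for the Jordan product produces in a familiar case, the sketch below evaluates it on real symmetric 2 x 2 matrices. It assumes the square is the ordinary matrix square (which agrees with the functional calculus for symmetric matrices) and compares the result with the symmetrized product (ab + ba)/2. The helper names are illustrative, and the check relies on matrix algebra, so it does not address the challenge of proving bilinearity without the KV theorem.

```python
from fractions import Fraction

def matmul(a, b):
    """Ordinary matrix product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * p for p in row] for row in a]

def square(a):
    return matmul(a, a)

def jordan(a, b):
    """a . b = 1/2 ((a + b)^2 - a^2 - b^2), built from squaring alone."""
    diff = add(square(add(a, b)), scale(-1, add(square(a), square(b))))
    return scale(Fraction(1, 2), diff)

a = [[1, 2], [2, 0]]
b = [[0, 1], [1, 3]]
c = [[2, -1], [-1, 1]]

symmetrized = scale(Fraction(1, 2), add(matmul(a, b), matmul(b, a)))
product = jordan(a, b)
```

Here `product` coincides with `symmetrized`, and the product is commutative and additive in each argument, as expected of a Jordan product.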
Acknowledgments: This paper is partly based on talks given in workshops and seminars in Amsterdam and Oxford
in 2014 and 2015, and was largely written while the author was a guest of the Quantum Group at the Oxford
Computing Laboratory, supported by a grant (FQXi-RFP3-1348) from the FQXi foundation. I would like to thank
Sonja Smets (in Amsterdam) and Bob Coecke (in Oxford) for their hospitality on these occasions. I also wish to
thank Carlo Maria Scandolo for his careful reading of, and useful comments on, two earlier drafts of this paper.
Conflicts of Interest: The author declares no conflict of interest.
Definition A1. Let G be a group. A G-test space is a test space ( X, M) where X is a G-space, that is, where X
comes equipped with a preferred G-action G × X → X, (g, x) ↦ gx, such that gE ∈ M for all E ∈ M. A
G-model is a probabilistic model A such that (i) M(A) is a G-test space and (ii) Ω(A) is invariant under the
action of G on probability weights given by $\alpha \mapsto g\alpha := \alpha \circ g^{-1}$ for g ∈ G.
Lemma A1. Let A be a finite-dimensional G-model, and suppose G acts transitively on the outcome space
X ( A). Suppose also that A is unital, i.e., for every x ∈ X ( A), there exists at least one state α with α( x ) = 1.
Then, there exists a G-invariant convex subset Δ ⊆ Ω(A) such that A′ = (M(A), Δ) is a sharp G-model.
Proof. For each x ∈ X ( A), let Fx denote the face of Ω( A) consisting of states α with α( x ) = 1. Let β x
be the barycenter of Fx . It is easy to check that Fgx = gFx for every g ∈ G. Thus, gβ x = β gx , i.e., the set
of barycenters β x is an orbit. Let Δ be the convex hull of these barycenters. Then, Δ is invariant under
G. If α ∈ Δ with α( x ) = 1, then α ∈ Fx ∩ Δ = { β x }, so (M( A), Δ) is sharp.
$$\hat{D} := \sum_{x \in D} \hat{x}.$$
A test is a maximal event, and for any test E ∈ M(A), $\hat{E} = u$.
Definition A2. An effect p ∈ V∗(A) is sharp iff it has the form $p = \hat{D}$ for some event D. A set of sharp effects
p1, ..., pn ∈ V∗(A) is jointly orthogonal with respect to M(A) iff there exists a test E ∈ M(A) and pairwise
disjoint events D1, ..., Dn ⊆ E with $p_i = \hat{D}_i$ for i = 1, ..., n.
Proof. Normalize the inner product on E(A) so that $\|\hat{x}\| = 1$ for all outcomes x. Then, for any sharp
effect $p = \hat{D}$, D an event, we have $\|\hat{D}\|^2 = |D|$, the cardinality of D. Choosing any outcome $x_0 \in E_0$,
set $\alpha = \langle \hat{x}_0 |$, i.e., $\alpha(\hat{x}) = \langle \hat{x}, \hat{x}_0 \rangle$ for all x ∈ X(A). Then, α ∈ Ω(A), $\alpha(p_0) = 1$ and $\alpha(p_i) = 0$ for
i > 0. Thus,
$$t_0 = \alpha(a) = \sum_j s_j \alpha(q_j).$$
Since the coefficients $\alpha(q_j)$ are sub-convex, the right-hand side is no larger than the largest of the
values $s_j$, namely, $s_0$. Thus, $t_0 \le s_0$. The same argument, with the roles of the two decompositions
reversed, shows that $s_0 \le t_0$. Thus, $s_0 = t_0$.
Now again, let x ∈ E0: then,
Since $\sum_{j=0}^{l} \langle \hat{x}, q_j \rangle \le \langle \hat{x}, u \rangle = 1$, the sum in the last expression above is a sub-convex combination
of the distinct values $s_0 > \dots > s_l$. This can equal $t_0 = s_0$, the maximum of these values, only if
$\langle \hat{x}, q_0 \rangle = 1$ and $\langle \hat{x}, q_j \rangle = 0$ for the remaining $q_j$. It follows that $\langle p_0, q_0 \rangle = \sum_{x \in E_0} \langle \hat{x}, q_0 \rangle = |E_0| = \|p_0\|^2$.
The same argument, with p's and q's interchanged, shows that $\langle p_0, q_0 \rangle = \|q_0\|^2$. Hence, $\|p_0\| = \|q_0\|$,
and $\langle p_0, q_0 \rangle = \|p_0\|^2 = \|p_0\| \|q_0\|$, whence, $p_0 = q_0$.
Proposition A1. Every a ∈ V∗(A) has a unique expansion of the form $a = \sum_{i=0}^{k} t_i p_i$ where $t_0 > t_1 > \dots > t_k$
are non-zero coefficients and $p_0, \dots, p_k$ are jointly orthogonal sharp effects.
Proof. Suppose $a = \sum_{i=0}^{k} t_i p_i$, as above, and also $a = \sum_{j=0}^{l} s_j q_j$, with $s_0 > \dots > s_l > 0$ and $q_j$ pairwise
orthogonal sharp effects. We shall show that k = l, and that $t_i = s_i$ and $p_i = q_i$ for each i = 0, ..., k.
Lemma A2 tells us that $t_0 = s_0$ and $p_0 = q_0$. Hence,
$$\sum_{i=1}^{k} t_i p_i = a - t_0 p_0 = a - s_0 q_0 = \sum_{j=1}^{l} s_j q_j.$$
whence, $\sum_{j=k+1}^{l} s_j q_j = 0$, which is impossible since all $q_j$ are sharp and the coefficients $s_j$ are strictly
positive. Hence, l = k, and the proof is complete.
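The expansion in Proposition A1 can be pictured as a coarse-graining: starting from coefficients on the outcomes of one test, outcomes with equal coefficients are merged into events, and the events are ordered by decreasing coefficient. The sketch below is only an illustration of that bookkeeping on a single test (it does not reproduce the uniqueness proof across different tests), and the function names are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

def coarse_grain(coeffs):
    """Group outcomes by coefficient; return [(t_i, D_i)] with t_0 > t_1 > ..."""
    events = defaultdict(set)
    for x, t in coeffs.items():
        if t != 0:                       # zero coefficients contribute nothing
            events[t].add(x)
    grouped = [(t, frozenset(D)) for t, D in events.items()]
    return sorted(grouped, key=lambda pair: pair[0], reverse=True)

def expand(decomposition):
    """Recover the coefficient of each outcome from the coarse-grained form."""
    return {x: t for t, event in decomposition for x in event}

coeffs = {"x1": Fraction(1, 2), "x2": Fraction(1, 4),
          "x3": Fraction(1, 2), "x4": Fraction(0)}
decomposition = coarse_grain(coeffs)
```

Listing the outcomes in a different order gives the same `decomposition`, since the distinct coefficients and their events do not depend on how the fine-grained sum was written.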
Lemma A3. Let A be a bit with conjugate $\bar{A}$. Then, Ω(A) is a Euclidean ball, the extreme points of which are
the states δx, x ∈ X(A).
Proof. By Lemma 1, E(A) carries a self-dualizing inner product such that $\langle \hat{x}, \hat{y} \rangle = 0$ for {x, y} ∈
M(A), and which we can normalize so that $\|\hat{x}\| = 1$ for each outcome x ∈ X(A), so that $\langle u, \hat{x} \rangle = \langle \hat{x}, \hat{x} \rangle = 1$
and $\|u\|^2 = 2$. Every state α ∈ Ω(A) corresponds to a unique vector a ∈ E(A)+ with
$\langle a, u \rangle = 1$, where $\alpha(x) = \langle a, \hat{x} \rangle$ for all x ∈ X(A); conversely, every vector a ∈ E(A)+ with $\langle a, u \rangle = 1$
corresponds in this way to a state. In particular, the state δx corresponds to the unit vector $\hat{x}$, and the
maximally-mixed state corresponds to the vector $\tfrac{1}{n} u$. To simplify the notation, let us agree for the
moment to write ρ for this vector. Thus, $\langle \rho, \hat{x} \rangle = \tfrac{1}{2}$, $\|\rho\|^2 = \tfrac{1}{4} \langle u, u \rangle = \tfrac{1}{2}$, and hence,
It follows that, for rank-two models, we do not even need to invoke homogeneity: they all
correspond to spin factors. Letting d denote the dimension of the state space (that is, d = dim(E) − 1),
we see that if d = 1, we have the classical bit; d = 2 gives the real quantum-mechanical bit, d = 3
gives the familiar Bloch sphere, i.e., the usual qubit of complex QM; while d = 5 corresponds to the
quaternionic unit sphere, giving us the quaternionic bit. The generalized bits with d = 4 and d ≥ 6 are
more exotic “post-quantum” possibilities.
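The d = 2 case can be checked numerically. The sketch below assumes the standard representation of the real quantum bit by real symmetric 2 x 2 matrices with the trace inner product, which is an assumption made for illustration rather than something derived here. It confirms the normalizations used in the proof of Lemma A3 and shows that positivity cuts out a disc around the maximally-mixed state.

```python
import math

def inner(a, b):
    """Trace inner product <a, b> = Tr(ab) on real symmetric 2x2 matrices."""
    return sum(a[i][k] * b[k][i] for i in range(2) for k in range(2))

def rebit_state(x, z):
    """Vector a = (1/2)(I + x*sigma_x + z*sigma_z), which satisfies <a, u> = 1."""
    return [[(1 + z) / 2, x / 2], [x / 2, (1 - z) / 2]]

def is_positive(a, tol=1e-12):
    """A symmetric 2x2 matrix is positive iff its trace and determinant are >= 0."""
    tr = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return tr >= -tol and det >= -tol

u = [[1.0, 0.0], [0.0, 1.0]]      # order unit
rho = [[0.5, 0.0], [0.0, 0.5]]    # maximally-mixed state, u / 2

def dist_sq_from_rho(a):
    """Squared distance from rho in the trace inner product."""
    d = [[a[i][j] - rho[i][j] for j in range(2)] for i in range(2)]
    return inner(d, d)

# A pure state: Bloch vector (x, z) of unit length.
xhat = rebit_state(math.sin(0.7), math.cos(0.7))
```

In this representation the squared distance from ρ is (x² + z²)/2, so the pure states sit on a circle of squared radius 1/2 and a vector with ⟨a, u⟩ = 1 is a state exactly when x² + z² ≤ 1.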
up to the natural associator and unit isomorphisms. If C is †-monoidal and $\epsilon = \sigma_{A,A'} \circ \eta_A^{\dagger}$, then $(A', \eta, \epsilon)$
is a dagger-dual. A category in which every object A has a specified dual $(A', \eta_A, \epsilon_A)$ is compact closed,
and a dagger-monoidal category in which every object has a given dagger-dual is dagger-compact.
See [18,30] for details.
An important example of all this is the category FdHilbR of finite-dimensional real Hilbert
spaces and linear mappings. If H and K are two such spaces and φ : H → K, let φ† be the usual
adjoint of φ with respect to the given inner products. Letting H ⊗ K be the usual tensor product of
H and K (in particular, with $\langle x \otimes y, u \otimes v \rangle = \langle x, u \rangle \langle y, v \rangle$ for x, u ∈ H and y, v ∈ K), FdHilbR is a
dagger-monoidal category with R as the monoidal unit.
Since any H ∈ FdHilbR is canonically isomorphic to its dual space, we have also a canonical
isomorphism $H \otimes H \simeq H^* \otimes H = L(H)$ and a canonical trace functional $\mathrm{Tr}_H : H \otimes H \to \mathbb{R}$,
uniquely defined by $\mathrm{Tr}_H(x \otimes y) = \langle x, y \rangle$ for all x, y ∈ H. Taking H′ = H, let $\eta_H \in H \otimes H$ be given
by $\eta_H = \sum_i x_i \otimes x_i$, where the sum is taken over any orthonormal basis $\{x_i\}$ for H; then, for any
a ∈ H ⊗ H, $\langle \eta_H, a \rangle = \mathrm{Tr}(a)$. It is routine to show that $\mathrm{Tr}_H = \sigma_{H,H} \circ \eta_H^{\dagger}$, so that $\eta_H$ and $\mathrm{Tr}_H$ make H
its own dagger-dual.
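These facts about FdHilbR are easy to test in coordinates. The sketch below is a minimal check, assuming vectors in R² are stored as Python lists and tensor products as flattened Kronecker products; the helper names are illustrative. It verifies that the inner product factors on pure tensors, that pairing with the vector η_H reproduces the trace functional, and that η_H does not depend on the orthonormal basis used to build it.

```python
import math

def kron_vec(x, y):
    """Flattened tensor product x (x) y of two coordinate vectors."""
    return [xi * yj for xi in x for yj in y]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def eta(basis):
    """eta_H = sum_i x_i (x) x_i over an orthonormal basis of R^n."""
    n = len(basis)
    out = [0.0] * (n * n)
    for x in basis:
        out = [o + t for o, t in zip(out, kron_vec(x, x))]
    return out

standard = [[1.0, 0.0], [0.0, 1.0]]
c, s = math.cos(0.3), math.sin(0.3)
rotated = [[c, s], [-s, c]]            # another orthonormal basis of R^2

x, y = [1.0, 2.0], [3.0, -1.0]
u, v = [0.5, 4.0], [2.0, 2.0]

factored = dot(x, u) * dot(y, v)                    # <x, u><y, v>
paired = dot(kron_vec(x, y), kron_vec(u, v))        # <x (x) y, u (x) v>
trace_of_xy = dot(eta(standard), kron_vec(x, y))    # <eta_H, x (x) y>
```

With these inputs, `paired` equals `factored`, and `trace_of_xy` equals the inner product of x and y, matching the defining property of the trace functional on pure tensors.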
In any compact closed symmetric monoidal category C, every morphism φ : A → B yields a dual
morphism φ′ : B′ → A′ defined by:
(again, suppressing associators and left and right units). For φ : H → K in FdHilbR, one has, for any
v ∈ K,
$$\phi'(v) = \sum_{x \in M} \langle v, \phi(x) \rangle\, x = \sum_{x \in M} \langle \phi^{\dagger}(v), x \rangle\, x = \phi^{\dagger}(v),$$
i.e., φ′ = φ†.
Now, let C be a monoidal probabilistic theory; that is, a category of probabilistic models and
processes, with a symmetric monoidal structure A, B ↦ AB, where AB is a (non-signaling) composite
in the sense discussed in Section 6. Assume that C is multiplicative, so that for A, B ∈ C, we have $\pi_{AB}$ :
X(A) × X(B) → X(AB). Henceforward, I will write x ⊗ y for π(x, y) where x ∈ X(A) and y ∈ X(B).
I will further assume that C's tensor unit is I = R, and that:
(a) Every A ∈ C has a conjugate, A ∈ C , with A = A;
(b) For all A, B ∈ C and φ ∈ C( A, B), φ ∈ C( A, B);
(c) A = A, with η A ( a, b) := η A ( a, b).
Remark A1. (1) The chosen conjugate $\bar{A}$ for A ∈ C required by Condition (a) is equipped with a canonical
isomorphism $\gamma_A : A \simeq \bar{A}$, with $\bar{x} = \gamma(x)$ for every x ∈ X(A). As discussed in Section 4, this extends to an
order-isomorphism $E(A) \simeq E(\bar{A})$, which we again write as $\gamma_A(a) = \bar{a}$ for a ∈ E(A). Notice, however, that $\gamma_A$
is not assumed to be a morphism in C.
(2) In spite of this, Condition (b) requires that $\bar{\phi} = \gamma_B \circ \phi \circ \gamma_A^{-1}$ does belong to $C(\bar{A}, \bar{B})$ for φ ∈ C(A, B).
x, y = η A ( x, y) = η A ( x, y) = x", y"
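Lemma A4. For all models A, B ∈ C, the inner product on E(AB) factors, in the sense that if u, x ∈ E(A)
and v, y ∈ E(B), then $\langle u \otimes v, x \otimes y \rangle = \langle u, x \rangle \langle v, y \rangle$.
Proof. This follows from the sharpness of A, B and AB. For u ∈ X(A), v ∈ X(B), let $\delta_u$, $\delta_v$ and $\delta_{u \otimes v}$
denote the unique states of A, B and AB such that $\delta_u(u) = \delta_v(v) = \delta_{u \otimes v}(u \otimes v) = 1$. Since
$(\delta_u \otimes \delta_v)(u \otimes v)$ is also one, we conclude that $\delta_{u \otimes v} = \delta_u \otimes \delta_v$. However, we also have $\delta_u(x) = n \langle \hat{u}, \hat{x} \rangle$,
$\delta_v(y) = m \langle \hat{v}, \hat{y} \rangle$ and $\delta_{u \otimes v}(x \otimes y) = nm \langle \hat{u} \otimes \hat{v}, \hat{x} \otimes \hat{y} \rangle$, where n, m and nm are the ranks, respectively, of
A, B and A ⊗ B. This establishes the claim.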
It follows that C is a monoidal subcategory of FdHilbR. In effect, we are going to show that
C inherits a dagger-compact structure from FdHilbR, with the minor twist that we will take $\bar{A}$,
rather than A, as the dual for A ∈ C. We define the dagger of φ ∈ C(A, B) to be the Hermitian adjoint
of φ : E(A) → E(B) with respect to the canonical inner products on E(A) and E(B). At this point, it is
not obvious that φ† belongs to C. In order to show that it does, we first need to show that C is compact
closed. To define the unit, let $e_A \in E(A) \otimes E(\bar{A}) = E(A\bar{A})$ (note the use of local tomography here)
be the vector with $\langle e_A, \cdot \rangle = \eta_A$, i.e., for all a, b ∈ E(A),
$$\langle e_A, a \otimes \bar{b} \rangle = \eta_A(a \otimes \bar{b}) = \langle a, b \rangle.$$
Lemma A5. With $\eta_A$ and $e_A$ defined as above, $\bar{A}$ is a dual for A for every A ∈ C. In particular, C is
compact closed.
Proof. Choose an orthonormal basis M ⊆ E(A). Local tomography and Lemma A4 tell us that
$M \otimes \bar{M} = \{a \otimes \bar{b} \mid a, b \in M\}$ is then an orthonormal basis for $E(A\bar{A})$ (note here that a, b ∈ M are not
necessarily even positive, let alone in X(A)). If we expand $e_A$ with respect to this basis, we have:
$$e_A = \sum_{a, b \in M} \langle e_A, a \otimes \bar{b} \rangle\, a \otimes \bar{b}.$$
$$\langle e_A, a \otimes \bar{a} \rangle = \langle a, a \rangle = \|a\|^2 = 1$$
and for a ≠ b, both in M,
$$\langle e_A, a \otimes \bar{b} \rangle = \langle a, b \rangle = 0$$
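Similarly, for v ∈ A,
$$(\mathrm{id}_A \otimes \eta_{\bar{A}}) \circ (e_A \otimes \mathrm{id}_A)(v) = (\mathrm{id}_A \otimes \eta_{\bar{A}}) \sum_{a \in M} a \otimes \bar{a} \otimes v = \sum_{a \in M} a\, \eta_{\bar{A}}(\bar{a}, v) = \sum_{a \in M} \langle a, v \rangle\, a = \sum_{a \in M} \langle v, a \rangle\, a = v.$$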
Proof. Using the compact structure on C defined above, if φ : A → B, we construct the dual of φ,
φ := (ηB ⊗ id A ) ◦ (idB ⊗ φ ⊗ id A ) ◦ (idB ⊗ e A ) : E( B) → E( A).
Thus, φ† = φ , which is evidently a morphism in C .
Thus, C is a dagger-, as well as a monoidal, sub-category of FdHilbR . Hence, the associator, swap
and left- and right-unit morphisms associated with an object A ∈ C are all unitary (since they are
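unitary in FdHilbR), whence C is dagger-monoidal. To complete the proof of Theorem 4, we need to
check that $\eta_A = e_A^{\dagger} \circ \sigma_{A,\bar{A}} : E(A\bar{A}) \to \mathbb{R}$. In view of our local tomography assumption, it is enough to
check this on pure tensors, where a routine computation gives us $e_A^{\dagger}(\sigma_{A,\bar{A}}(a \otimes \bar{b})) = \langle e_A^{\dagger}(\bar{b} \otimes a), 1 \rangle =
\langle \bar{b} \otimes a, e_A \rangle_{\bar{A}A} = \langle a, b \rangle = \eta_A(a \otimes \bar{b})$.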
Remark A2. Given that C is compact closed, with $\bar{A}$ the dual of A, the functoriality of $\phi \mapsto \bar{\phi}$ makes C strongly
compact closed, in the sense of [18]. This is equivalent to dagger-compactness.
References
1. Von Neumann, J. Mathematical Foundations of Quantum Mechanics; Princeton University Press: Princeton, NJ,
USA, 1955.
2. Schwinger, J. The algebra of microscopic measurement. Proc. Natl. Acad. Sci. USA 1959, 45, 1542–1553.
3. Mackey, G.W. Mathematical Foundations of Quantum Mechanics; Dover Publications, Inc.: Mineola, NY,
USA, 2004.
4. Ludwig, G. Foundations of Quantum Mechanics I; Springer: New York, NY, USA, 1983.
5. Piron, C. Mathematical Foundations of Quantum Mechanics; Academic Press: Cambridge, MA, USA, 1978.
6. Barnum, H.; Müller, M.; Ududec, C. Higher-order interference and single-system postulates characterizing
quantum theory. New J. Phys. 2014, 16, 123029.
7. Hardy, L. Quantum theory from five reasonable axioms. arXiv 2001, arXiv:quant-ph/0101012.
8. Chiribella, G.; D’Ariano, M.; Perinotti, P. Informational derivation of quantum theory. Phys. Rev. A 2011, 84,
012311.
9. Dakic, B.; Brukner, C. Quantum theory and beyond: Is entanglement special? arXiv 2009, arXiv:0911.0695.
10. Masanes, L.; Müller, M. A derivation of quantum theory from physical requirements. New J. Phys. 2011,
13, 063001.
11. Baez, J. Division algebras and quantum theory. Found. Phys. 2012, 42, 819–855.
12. Janotta, P.; Lal, R. Generalized probabilistic theories without the no-restriction hypothesis. Phys. Rev. A 2013,
87, 052131.
13. Faraut, J.; Koranyi, A. Analysis on Symmetric Cones; Oxford University Press: London, UK, 1994.
14. Wilce, A. 4.5 axioms for finite-dimensional quantum probability. In Probability in Physics; Ben-Menahem, Y.,
Hemmo, M., Eds.; Springer: New York, NY, USA, 2012.
15. Wilce, A. Symmetry and composition in probabilistic theories. Electron. Notes Theor. Comput. Sci. 2011, 270,
191–207.
16. Wilce, A. Symmetry, self-duality and the Jordan structure of finite-dimensional quantum mechanics. arXiv
2011, arXiv:1110.6607.
17. Wilce, A. Conjugates, Filters and Quantum Mechanics. arXiv 2012, arXiv:1206.2897.
18. Abramsky, S.; Coecke, B. Abstract Physical Traces. Theor. Appl. Categories 2005, 14, 111–124.
19. Barnum, H.; Graydon, M.A.; Wilce, A. Some nearly quantum theories. arXiv 2015, arXiv:1507.06278.
20. Jordan, P. Über eine Klasse nichtassoziativer hyperkomplexer Algebren. Nachr. Akad. Wiss. Göttingen Math.
Phys. Kl. I. 1933, 33, 569–575. (In German)
21. Von Neumann, J. On an algebraic generalization of the quantum mechanical formalism (Part I). Ann. Math.
1936, 1, 415–484.
22. Aliprantis, C.D.; Tourky, R. Cones and Duality; American Mathematical Society: Providence, RI, USA, 2007.
23. Foulis, D.J.; Randall, C.H. Empirical logic and tensor products. In Interpretations and Foundations of Quantum
Theory; Neumann, H., Ed.; Bibliographisches Inst.: Mannheim, Germany, 1981.
24. Barnum, H.; Wilce, A. Post-classical probability theory. In Quantum Theory: Informational Foundations and
Foils; Chiribella, G., Spekkens, R., Eds.; Springer: Dordrecht, The Netherlands, 2016.
25. Popescu, S.; Rohrlich, D. Nonlocality as an axiom. Found. Phys. 1994, 24, 379–385.
26. Gunson, J. On the algebraic structure of quantum mechanics. Commun. Math. Phys. 1967, 6, 262–285.
27. Chiribella, G.; Scandolo, C.M. Operational axioms for state diagonalization. arXiv 2015, arXiv:1506.00380.
28. Barnum, H.; Graydon, M.; Wilce, A. Composites and categories of Euclidean Jordan algebras. arXiv 2016,
arXiv:1606.09331.
29. Mac Lane, S. Categories for the Working Mathematician; Springer: New York, NY, USA, 1978.
30. Selinger, P. Dagger compact closed categories and completely positive maps. Electron. Notes Theor. Comput. Sci.
2007, 170, 139–163.
© 2018 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
entropy
Article
Agents, Subsystems, and the Conservation
of Information
Giulio Chiribella 1,2,3,4
1 Department of Computer Science, University of Oxford, Parks Road, Oxford OX1 3QD, UK
2 Canadian Institute for Advanced Research, CIFAR Program in Quantum Information Science,
661 University Ave, Toronto, ON M5G 1M1, Canada
3 Department of Computer Science, The University of Hong Kong, Pokfulam Road, Hong Kong, China
4 HKU Shenzhen Institute of Research and Innovation, Yuexing 2nd Rd Nanshan, Shenzhen 518057, China
Received: 12 March 2018; Accepted: 5 May 2018; Published: 10 May 2018
Abstract: Dividing the world into subsystems is an important component of the scientific method.
The choice of subsystems, however, is not defined a priori. Typically, it is dictated by experimental
capabilities, which may be different for different agents. Here, we propose a way to define
subsystems in general physical theories, including theories beyond quantum and classical mechanics.
Our construction associates every agent A with a subsystem S A , equipped with its set of states and its
set of transformations. In quantum theory, this construction accommodates the notion of subsystems
as factors of a tensor product, as well as the notion of subsystems associated with a subalgebra of
operators. Classical systems can be interpreted as subsystems of quantum systems in different ways,
by applying our construction to agents who have access to different sets of operations, including
multiphase covariant channels and certain sets of free operations arising in the resource theory of
quantum coherence. After illustrating the basic definitions, we restrict our attention to closed systems,
that is, systems where all physical transformations act invertibly and where all states can be generated
from a fixed initial state. For closed systems, we show that all the states of all subsystems admit a
canonical purification. This result extends the purification principle to a broader setting, in which
coherent superpositions can be interpreted as purifications of incoherent mixtures.
1. Introduction
The composition of systems and operations is a fundamental primitive in our modelling of
the world. It has been investigated in depth in quantum information theory [1,2], and in the
foundations of quantum mechanics, where composition has played a key role from the early days
of Einstein–Podolsky–Rosen [3] and Schrödinger [4]. At the level of frameworks, the most recent
developments are the compositional frameworks of general probabilistic theories [5–15] and categorical
quantum mechanics [16–20].
The mathematical structure underpinning most compositional approaches is that of a
monoidal category [18,21]. Informally, a monoidal category describes circuits, in which wires represent
systems and boxes represent operations, as in the following diagram:
[Circuit diagram: wires labelled A, B and C, with boxes U and V] (1)
1. quantum subsystems associated with the tensor product of two Hilbert spaces,
2. subsystems associated with a subalgebra of self-adjoint operators on a given Hilbert space,
3. classical systems of quantum systems,
4. subsystems associated with the action of a group representation on a given Hilbert space.
The example of the classical systems has interesting implications for the resource theory
of coherence [34–41]. Our construction implies that different types of agents, corresponding to
different choices of free operations, are associated with the same subsystem, namely the largest
classical subsystem of a given quantum system. Specifically, classical systems arise from strictly
incoherent operations [41], physically incoherent operations [38,39], phase covariant operations [38–40],
and multiphase covariant operations (to the best of our knowledge, multiphase covariant operations
have not been considered so far in the resource theory of coherence). Notably, we do not obtain classical
subsystems from the maximally incoherent operations [34] and from the incoherent operations [35,36],
Entropy 2018, 20, 358
which are the first two sets of free operations proposed in the resource theory of coherence. For these
two types of operations, we find that the associated subsystem is the whole quantum system.
After examining the above examples, we explore the general features of our construction.
An interesting feature is that certain properties, such as the impossibility of instantaneous signalling
between two distinct subsystems, arise by fiat, rather than being postulated as physical requirements.
This fact is potentially useful for the project of finding new axiomatizations of quantum theory [42–48]
because it suggests that some of the axioms assumed in the usual (compositional) framework may turn
out to be consequences of the very definition of subsystem. Leveraging this fact, one could hope to
find axiomatizations with a smaller number of axioms that pinpoint exactly the distinctive features of
quantum theory. In addition, our construction suggests a desideratum that every truly fundamental
axiom should arguably satisfy: an axiom for quantum theory should hold for all possible subsystems of
quantum systems. We call this requirement Consistency Across Subsystems. If one accepts our broad
definition of subsystems, then Consistency Across Subsystems is a very non-trivial requirement, which
is not easily satisfied. For example, the Subspace Axiom [5], stating that all systems with the same
number of distinguishable states are equivalent, does not satisfy Consistency Across Subsystems
because classical subsystems are not equivalent to the corresponding quantum systems, even if they
have the same number of distinguishable states.
In general, proving that Consistency Across Subsystems is satisfied may require great effort.
Rather than inspecting the existing axioms and checking whether or not they are consistent across
subsystems, one can try to formulate the axioms in a way that guarantees the validity of this property.
We illustrate this idea in the case of the Purification Principle [8,12,13,15,49–51], which is the key
ingredient in the quantum axiomatization of Refs. [13,15,42] and plays a central role in the axiomatic
foundation of quantum thermodynamics [52–54] and quantum information protocols [8,15,55–57].
Specifically, we show that the Purification Principle holds for closed systems, defined as systems where
all transformations are invertible, and where every state can be generated from a fixed initial state by
the action of a suitable transformation. Closed systems satisfy the Conservation of Information [58],
i.e., the requirement that physical dynamics should send distinct states to distinct states. Moreover,
the states of the closed systems can be interpreted as “pure”. In this setting, the general notion of
subsystem captures the idea of purification, and extends it to a broader setting, allowing us to regard
coherent superpositions as the “purifications” of classical probability distributions.
The paper is structured as follows. In Section 2, we outline related works. In Section 3, we present
the main framework and the construction of subsystems. The framework is illustrated with five
concrete examples in Section 4. In Section 5, we discuss the key structures arising from our construction,
such as the notion of partial trace and the validity of the no-signalling property. In Section 6, we identify
two requirements, concerning the existence of agents with non-overlapping sets of operations, and
the ability to generate all states from a given initial state. We also highlight the relation between the
second requirement and the notion of causality. We then move to systems satisfying the Conservation
of Information (Section 7) and we formalize an abstract notion of closed systems (Section 8). For such
systems, we provide a dynamical notion of pure states, and we prove that every subsystem satisfies the
Purification Principle (Section 9). A macro-example, dealing with group representations in quantum
theory, is provided in Section 10. Finally, the conclusions are drawn in Section 11.
2. Related Works
In quantum theory, the canonical route to the definition of subsystems is to consider commuting
algebras of observables, associated with independent subsystems. The idea of defining independence
in terms of commutation has a long tradition in quantum field theory and, more recently, quantum
information theory. In algebraic quantum field theory [22], the local subsystems associated with
causally disconnected regions of spacetime are described by commuting C*-algebras. A closely related
approach is to associate quantum systems to von Neumann algebras, which can be characterized as
double commutants [59]. In quantum error correction, decoherence free subsystems are associated
with the commutant of the noise operators [28,29,31]. In this context, Viola, Knill, and Laflamme [23]
and Zanardi, Lidar, and Lloyd [24] made the point that subsystems should be defined operationally,
in terms of the experimentally accessible operations. The canonical approach of associating subsystems
to subalgebras was further generalized by Barnum, Knill, Ortiz, and Viola [60,61], who proposed the
notion of generalized entanglement, i.e., entanglement relative to a subspace of operators. Later, Barnum,
Ortiz, Somma, and Viola explored this notion in the context of general probabilistic theories [62].
The above works provided a concrete model of subsystems that inspired the present work.
An important difference, however, is that here we will not use the notions of observable and expectation
value. In fact, we will not use any probabilistic notion, making our construction usable also in
frameworks where no notion of measurement is present. This makes the construction appealingly
simple, although the flip side is that more work will have to be done in order to recover the probabilistic
features that are built into other frameworks.
More recently, del Rio, Krämer, and Renner [63] proposed a general framework for representing
the knowledge of agents in general theories (see also the Ph.D. theses of del Rio [64] and Krämer [65]).
Krämer and del Rio further developed the framework to address a number of questions related to
locality, associating agents to monoids of operations, and introducing a relation, called convergence
through a monoid, among states of a global system [33]. Here, we will extend this relation to
transformations, and we will propose a general definition of subsystem, equipped with its set of
states and its set of transformations.
Another related work is the work of Brassard and Raymond-Robichaud on no-signalling and
local realism [66]. There, the authors adopt an equivalence relation on transformations, stating that
two transformations are equivalent iff they can be transformed into one another through composition
with a local reversible transformation. Such a relation is related to the equivalence relation on
transformations considered in this paper, in the case of systems satisfying the Conservation of
Information. It is interesting to observe that, notwithstanding the different scopes of Ref. [66] and
this paper, the Conservation of Information plays an important role in both. Ref. [66], along with
discussions with Gilles Brassard during QIP 2017 in Seattle, provided inspiration for the present paper.
3. Constructing Subsystems
Here, we outline the basic definitions and the construction of subsystems.
Example 1 (Closed quantum systems). Let us illustrate the basic framework with a textbook example,
involving a closed quantum system evolving under unitary dynamics. Here, S is a quantum system of dimension
d, and the state space St(S) is the set of pure quantum states, represented as rays on the complex vector space $\mathbb{C}^d$,
or equivalently, as rank-one projectors. With this choice, we have
$$\mathrm{St}(S) = \big\{ |\psi\rangle\langle\psi| \,:\, |\psi\rangle \in \mathbb{C}^d ,\ \langle\psi|\psi\rangle = 1 \big\} . \tag{2}$$
The physical transformations are represented by unitary channels, i.e., by maps of the form $|\psi\rangle\langle\psi| \mapsto U|\psi\rangle\langle\psi|U^\dagger$,
where $U \in M_d(\mathbb{C})$ is a unitary d-by-d matrix over the complex field. In short, we have
$$\mathrm{Transf}(S) = \big\{ U \cdot U^\dagger \,:\, U \in M_d(\mathbb{C}) ,\ U^\dagger U = U U^\dagger = I \big\} , \tag{3}$$
where I is the d-by-d identity matrix. The physical transformations form a monoid, with the composition
operation induced by the matrix multiplication $(U \cdot U^\dagger) \circ (V \cdot V^\dagger) := (UV) \cdot (UV)^\dagger$.
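The composition rule for unitary channels can be checked numerically. The following is a minimal sketch in plain Python, assuming a qubit (d = 2); the helper names (`matmul`, `dagger`, `apply_unitary`, `close`) and the two matrices (a Hadamard gate and a diagonal phase gate) are illustrative choices that do not appear in the text.

```python
import cmath
import math

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose of a matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def apply_unitary(u, rho):
    """Unitary channel rho -> U rho U^dagger."""
    return matmul(matmul(u, rho), dagger(u))

def close(a, b, tol=1e-12):
    """Entrywise comparison up to a numerical tolerance."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

s = 1 / math.sqrt(2)
U = [[s, s], [s, -s]]                  # Hadamard gate (illustrative choice)
V = [[1, 0], [0, cmath.exp(0.3j)]]     # diagonal phase gate (illustrative choice)
rho = [[1, 0], [0, 0]]                 # pure state |0><0|

# Apply V first and then U, versus the single channel built from the product UV.
sequential = apply_unitary(U, apply_unitary(V, rho))
composed = apply_unitary(matmul(U, V), rho)

composition_ok = close(sequential, composed)
trace_out = sum(composed[k][k] for k in range(2))
```

Here `composition_ok` compares the two ways of composing the channels on one input state, and `trace_out` confirms that the output is still normalized.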
Example 2 (Open quantum systems). Generally, a quantum system can be in a mixed state and can undergo
an irreversible evolution. To account for this scenario, we must take the state space St(S) to be the set of all
density matrices. For a system of dimension d, this means that the state space is
$$\mathrm{St}(S) = \big\{ \rho \in M_d(\mathbb{C}) \,:\, \rho \ge 0 ,\ \mathrm{Tr}[\rho] = 1 \big\} , \tag{4}$$
where $\mathrm{Tr}[\rho] = \sum_{n=1}^{d} \langle n|\rho|n\rangle$ denotes the matrix trace, and $\rho \ge 0$ means that the matrix ρ is positive semidefinite.
Transf(S) is the set of all quantum channels [67], i.e., the set of all linear, completely positive, and trace-preserving
maps from $M_d(\mathbb{C})$ to itself. The action of the quantum channel $\mathcal{T}$ on a generic state ρ can be specified through
the Kraus representation [68]
$$\mathcal{T}(\rho) = \sum_{i=1}^{r} T_i \rho T_i^\dagger , \tag{5}$$
where $\{T_i\}_{i=1}^{r} \subseteq M_d(\mathbb{C})$ is a set of matrices satisfying the condition $\sum_{i=1}^{r} T_i^\dagger T_i = I$. The composition of two
transformations $\mathcal{T}$ and $\mathcal{S}$ is given by the composition of the corresponding linear maps.
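To make the Kraus representation concrete, the sketch below applies a two-operator channel to a qubit state and checks the normalization condition on the Kraus operators. The amplitude-damping channel and the value `gamma = 0.25` are assumptions chosen for illustration; they are not taken from the text, and the helper names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose of a matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def apply_channel(kraus_ops, rho):
    """Kraus form: T(rho) = sum_i T_i rho T_i^dagger."""
    out = [[0j] * len(rho) for _ in rho]
    for t in kraus_ops:
        out = add(out, matmul(matmul(t, rho), dagger(t)))
    return out

gamma = 0.25  # damping probability (illustrative value)
kraus = [
    [[1, 0], [0, math.sqrt(1 - gamma)]],
    [[0, math.sqrt(gamma)], [0, 0]],
]

# Normalization condition: sum_i T_i^dagger T_i should equal the identity.
completeness = [[0j, 0j], [0j, 0j]]
for t in kraus:
    completeness = add(completeness, matmul(dagger(t), t))

rho = [[0, 0], [0, 1]]          # excited state |1><1|
out = apply_channel(kraus, rho)  # expected to be diag(gamma, 1 - gamma)
```

For this particular channel, the state |1⟩⟨1| is mapped to a diagonal matrix with populations γ and 1 − γ, so the output is a mixed state even though the input is pure.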
Note that, at this stage, there is no notion of measurement in the framework. The sets St(S) and
Transf (S) are meant as a model of system S irrespective of anybody’s ability to measure it, or even to
operate on it. For this reason, we call this layer of the framework pre-operational. One can think of the
pre-operational framework as the arena in which agents will act. Of course, the physical description of
such an arena might have been suggested by experiments done earlier on by other agents, but this fact
is inessential for the scope of our paper.
3.2. Agents
Let us introduce agents into the picture. In our framework, an agent A is identified with a set of
transformations, denoted as Act( A; S) and interpreted as the possible actions of A on S. Since the
actions must be allowed physical processes, the inclusion Act( A; S) ⊆ Transf (S) must hold. It is
natural, but not strictly necessary, to assume that the concatenation of two actions is a valid action,
and that the identity transformation is a valid action. When these assumptions are made, Act( A; S) is
a monoid. Still, the construction presented in the following will hold not only for monoids, but also for
generic sets Act( A; S). Hence, we adopt the following minimal definition:
Note that this definition captures only one aspect of agency. Other aspects—such as the ability
to gather information, make decisions, and interact with other agents—are important too, but not
necessary for the scope of this paper.
We also stress that the interpretation of the subset Act( A; S) ⊆ Transf (S) as the set of actions of an
agent is not strictly necessary for the validity of our results. Nevertheless, the notion of “agent” here is
useful because it helps explain the rationale of our construction. The role of the agent is somewhat
similar to the role of a “probe charge” in classical electromagnetism. The probe charge need not exist in
reality, but helps—as a conceptual tool—to give operational meaning to the magnitude and direction
of the electric field.
In general, the set of actions available to agent A may be smaller than the set of all physical
transformations on S. In addition, there may be other agents that act on system S independently of
agent A. We define the independence of actions in the following way:
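Definition 2. Agents A and B act independently if the order in which they act is irrelevant, namely
$$\mathcal{A} \circ \mathcal{B} = \mathcal{B} \circ \mathcal{A} \qquad \forall \mathcal{A} \in \mathrm{Act}(A;S) ,\ \forall \mathcal{B} \in \mathrm{Act}(B;S) . \tag{6}$$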
In a very primitive sense, the above relation expresses the fact that A and B act on “different
degrees of freedom” of the system.
$$CD = DC \qquad \forall C \in A ,\ \forall D \in B . \tag{7}$$
In contrast, Equation (6) is a condition on the transformations, and not on the observables, which are not
even described by our framework. In quantum theory, Equation (6) is a condition on the completely positive
maps, and not on the elements of the algebras A and B. In Section 4, we will bridge the gap between our
framework and the usual algebraic framework, focussing on the scenario where A and B are finite dimensional
von Neumann algebras.
Definition 3 (Adversary). Let A be an agent and let Act( A; S) be her set of operations. An adversary of A
is an agent B that acts independently of A, i.e., an agent B whose set of actions satisfies
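$$\mathrm{Act}(B;S) \subseteq \mathrm{Act}(A;S)' := \big\{ \mathcal{B} \in \mathrm{Transf}(S) \,:\, \mathcal{B} \circ \mathcal{A} = \mathcal{A} \circ \mathcal{B} ,\ \forall \mathcal{A} \in \mathrm{Act}(A;S) \big\} . \tag{8}$$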
Like the agent, the adversary is a conceptual tool, which will be used to illustrate our notion of
subsystem. The adversary need not be a real physical entity, localized outside the agent’s laboratory,
and trying to counteract the agent’s actions. Mathematically, the adversary is just a subset of the
commutant of Act( A; S). The interpretation of B as an “adversary” is a way to “give life to the
mathematics”, and to illustrate the rationale of our construction.
The more operations B can perform, the more powerful B will be as an adversary. The most
powerful adversary compatible with the independence condition (6) is the adversary that can
implement all transformations in the commutant of Act( A; S):
Definition 4. The maximal adversary of agent A is the agent A′ that can perform the actions $\mathrm{Act}(A';S) := \mathrm{Act}(A;S)'$.
Note that the actions of the maximal adversary are automatically a monoid, even if the set
Act( A; S) is not. Indeed,
• the identity map IS commutes with all operations in Act( A; S), and
• if $\mathcal{B}$ and $\mathcal{B}'$ commute with every operation in Act( A; S), then also their composition $\mathcal{B} \circ \mathcal{B}'$ will
commute with all the operations in Act( A; S).
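The commutant construction can be explored by brute force in a small classical toy model. The sketch below is an illustration under stated assumptions, not part of the paper: the system has four states encoded as two bits, Transf(S) is taken to be all functions on those states, and the agent acts only on the first bit. All names (`ALL`, `compose`, `commutant`, `local_on_first_bit`) are hypothetical.

```python
from itertools import product

N = 4  # states are x = 2*a + b, with bits a (first) and b (second)

# Toy assumption: every function on the four states is a physical transformation.
ALL = list(product(range(N), repeat=N))  # f is stored as the tuple (f(0), ..., f(3))

def compose(f, g):
    """Composition f o g: apply g first, then f."""
    return tuple(f[g[x]] for x in range(N))

def commutant(actions):
    """All transformations that commute with every element of `actions`."""
    return {g for g in ALL if all(compose(g, a) == compose(a, g) for a in actions)}

def local_on_first_bit(f):
    """Lift a function f on one bit to the map (a, b) -> (f(a), b)."""
    return tuple(2 * f[x >> 1] + (x & 1) for x in range(N))

# Agent A can apply any function to the first bit and nothing else.
act_A = {local_on_first_bit(f) for f in product(range(2), repeat=2)}

adversary = commutant(act_A)  # actions of the maximal adversary
identity = tuple(range(N))
is_monoid = identity in adversary and all(
    compose(g, h) in adversary for g in adversary for h in adversary
)
double_commutant = commutant(adversary)
```

In this toy model the maximal adversary turns out to be the agent acting on the second bit only, its actions form a monoid as argued above, and taking the commutant twice returns the original set of actions.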
In the following, we will use the maximal adversary to define the subsystem associated
with agent A.
Rule 1. If the state ψ is obtained from the state φ through degradation, i.e., if ψ ∈ Deg A (φ), then ψ and φ
must correspond to the same state of subsystem S A , i.e., one must have Λψ = Λφ .
Rule 1 imposes that all states in the set Deg A (ψ) must be contained in the set Λψ . Furthermore,
we have the following fact:
Proposition 1. If the sets Deg A (φ) and Deg A (ψ) have non-trivial intersection, then Λφ = Λψ .
Proof. By Rule 1, every element of Deg A (φ) is contained in Λφ . Similarly, every element of Deg A (ψ)
is contained in Λψ . Hence, if Deg A (φ) and Deg A (ψ) have non-trivial intersection, then also Λφ and
Λψ have non-trivial intersection. Since the sets Λφ and Λψ belong to a disjoint partition, we conclude
that Λφ = Λψ .
Generalizing the above argument, it is clear that two states φ and ψ must be in the same subset
Λφ = Λψ if there exists a finite sequence (ψ1 , ψ2 , . . . , ψn ) ⊆ St(S) such that
When this is the case, we write $\phi \simeq_A \psi$. Note that the relation $\simeq_A$ is an equivalence relation.
When the relation $\phi \simeq_A \psi$ holds, we say that φ and ψ are equivalent for agent A. We denote the
equivalence class of the state ψ by $[\psi]_A$.
By Rule 1, the whole equivalence class [ψ] A must be contained in the set Λψ , meaning that all
states in the equivalence class must correspond to the same state of subsystem S A . Since we are not
constrained by any other condition, we make the minimal choice
Λψ := [ψ] A . (14)
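The passage from degradation sets to equivalence classes can be sketched computationally: two states are merged whenever their degradation sets overlap, and the merging is repeated along chains. The example below uses a classical toy model as an assumption (four states encoded as two bits, with an adversary that can apply any function to the second bit); the function names are hypothetical.

```python
from itertools import product

N = 4  # states are x = 2*a + b, with bits a (first) and b (second)

# Toy assumption: the adversary applies an arbitrary function g to the second bit.
adversary = [tuple(2 * (x >> 1) + g[x & 1] for x in range(N))
             for g in product(range(2), repeat=2)]

def deg(x):
    """Degradation set of x: everything the adversary can turn x into."""
    return frozenset(g[x] for g in adversary)

def equivalence_classes(states):
    """Merge states whose degradation sets overlap, following chains of overlaps."""
    classes = []  # list of disjoint sets of states
    for x in states:
        overlapping = [c for c in classes if any(deg(x) & deg(y) for y in c)]
        merged = {x}.union(*overlapping)
        classes = [c for c in classes if c not in overlapping] + [merged]
    return classes

classes = equivalence_classes(range(N))
```

In this toy model each class collects the states that share the same first bit, so the states of the subsystem are labelled by the bit that the adversary cannot touch.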
In general, $\mathrm{Act}(A;S)''$ could be larger than Act( A; S), in agreement with the fact that the set of physical
transformations of system S A could be larger than the set of operations that agent A can perform.
For example, agent A could have access only to noisy operations, while another, more technologically
advanced agent could perform more accurate operations on the same subsystem.
For two transformations S and T in Act( A; S) , the degradation relation ! A takes the simple form
As we did for the set of states, we now partition the set $\mathrm{Act}(A;S)''$ into disjoint subsets, with the
interpretation that two transformations act in the same way on the subsystem $S_A$ if and only if they
belong to the same subset.
Let us denote by $\Theta_{\mathcal{A}}$ the subset containing the transformation $\mathcal{A}$. To find the appropriate partition
of $\mathrm{Act}(A;S)''$ into disjoint subsets, we adopt the following rule:
Rule 2. If the transformation $\mathcal{T} \in \mathrm{Act}(A;S)''$ is obtained from the transformation $\mathcal{S} \in \mathrm{Act}(A;S)''$ through
degradation, i.e., if $\mathcal{T} \in \mathrm{Deg}_A(\mathcal{S})$, then $\mathcal{T}$ and $\mathcal{S}$ must act in the same way on the subsystem $S_A$, i.e., they must
satisfy $\Theta_{\mathcal{T}} = \Theta_{\mathcal{S}}$.
Intuitively, the motivation for the above rule is that system S A is defined as the system that is not
affected by the action of the adversary.
Rule 2 implies that all transformations in Deg A (T ) must be contained in ΘT . Moreover, we have
the following:
Proposition 2. If the sets Deg A (S) and Deg A (T ) have non-trivial intersection, then ΘS = ΘT .
Proof. By Rule 2, every element of Deg A (S) is contained in ΘS . Similarly, every element of Deg A (T )
is contained in ΘT . Hence, if Deg A (S) and Deg A (T ) have non-trivial intersection, then also ΘS and
ΘT have non-trivial intersection. Since the sets ΘS and ΘT belong to a disjoint partition, we conclude
that ΘS = ΘT .
Using the above proposition, we obtain that the equality $\Theta_{\mathcal{T}} = \Theta_{\mathcal{S}}$ holds whenever there exists a
finite sequence $(\mathcal{A}_1, \mathcal{A}_2, \dots, \mathcal{A}_n) \subseteq \mathrm{Act}(A;S)''$ such that
When the above relation is satisfied, we write $\mathcal{S} \simeq_A \mathcal{T}$ and we say that $\mathcal{S}$ and $\mathcal{T}$ are equivalent for
agent A. It is immediate to check that $\simeq_A$ is an equivalence relation. We denote the equivalence class
of the transformation $\mathcal{T} \in \mathrm{Act}(A;S)''$ as $[\mathcal{T}]_A$.
By Rule 2, all the elements of [T ] A must be contained in the set ΘT , i.e., they should correspond
to the same transformation on S A . Again, we make the minimal choice: we stipulate that the set ΘT
coincides exactly with the equivalence class [T ] A . Hence, the transformations of subsystem S A are
$$\mathrm{Transf}(S_A) := \big\{ [\mathcal{T}]_A \,:\, \mathcal{T} \in \mathrm{Act}(A;S)'' \big\} . \tag{19}$$
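The composition of two transformations $[\mathcal{T}_1]_A$ and $[\mathcal{T}_2]_A$ is defined in the obvious way, namely
$$[\mathcal{T}_1]_A \circ [\mathcal{T}_2]_A := [\mathcal{T}_1 \circ \mathcal{T}_2]_A . \tag{20}$$
Similarly, the action of a transformation on a state of the subsystem is defined as
$$[\mathcal{T}]_A \, [\psi]_A := [\mathcal{T}\psi]_A . \tag{21}$$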
In Appendix A, we show that definitions (20) and (21) are well-posed, in the sense that their
right-hand sides are independent of the choice of representatives within the equivalence classes.
Remark 1. It is important not to confuse the transformation $\mathcal{T} \in \mathrm{Act}(A;S)''$ with the equivalence class
$[\mathcal{T}]_A$: the former is a transformation on the whole system S, while the latter is a transformation only on
subsystem $S_A$. To keep track of the distinction, we define the restriction of the transformation $\mathcal{T} \in \mathrm{Act}(A;S)''$
to the subsystem $S_A$ via the map
$$\pi_A(\mathcal{T}) := [\mathcal{T}]_A . \tag{22}$$
Proposition 3. The restriction map $\pi_A : \mathrm{Act}(A;S)'' \to \mathrm{Transf}(S_A)$ is a monoid homomorphism, namely
$\pi_A(\mathcal{I}_S) = \mathcal{I}_{S_A}$ and $\pi_A(\mathcal{S} \circ \mathcal{T}) = \pi_A(\mathcal{S}) \circ \pi_A(\mathcal{T})$ for every pair of transformations $\mathcal{S}, \mathcal{T} \in \mathrm{Act}(A;S)''$.
$$\mathrm{St}(S) = \big\{ \rho \in \mathrm{Lin}(\mathcal{H}_S) \,:\, \rho \ge 0 ,\ \mathrm{Tr}[\rho] = 1 \big\} . \tag{23}$$
The transformations are all the quantum channels (linear, completely positive, and trace-preserving
maps) from Lin(HS ) to itself. We will denote the set of all channels on system S as Chan(S).
Similarly, we will use the notation Lin(H A ) [Lin(H B )] for the spaces of linear operators from H A
[H B ] to itself, and the notation Chan( A) [Chan( B)] for the quantum channels from Lin(H A ) [Lin(H B )]
to itself.
We can now define an agent A whose actions are all quantum channels acting locally on
system A, namely
$$\mathrm{Act}(A;S) := \big\{ \mathcal{A} \otimes \mathcal{I}_B \,:\, \mathcal{A} \in \mathrm{Chan}(A) \big\} , \tag{24}$$
where $\mathcal{I}_B$ denotes the identity map on Lin(H B ). It is relatively easy to see that the commutant of
Act( A; S) is
$$\mathrm{Act}(A;S)' = \big\{ \mathcal{I}_A \otimes \mathcal{B} \,:\, \mathcal{B} \in \mathrm{Chan}(B) \big\} \tag{25}$$
(see Appendix B for the proof). Hence, the maximal adversary of agent A is the adversary A′ = B that
has full control on the Hilbert space H B . Note also that one has $\mathrm{Act}(A;S)'' = \mathrm{Act}(A;S)$.
Now, the following fact holds:
Proposition 4. Two states ρ, σ ∈ St(S) are equivalent for agent A if and only if TrB [ρ] = TrB [σ], where TrB
denotes the partial trace over the Hilbert space H B .
Proof. Suppose that the equivalence $\rho \simeq_A \sigma$ holds. By definition, this means that there exists a finite
sequence (ρ1 , ρ2 , . . . , ρn ) such that
In turn, the condition of non-trivial intersection implies that, for every i ∈ {1, 2, . . . , n − 1}, one has
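where $\mathcal{B}_i$ and $\widetilde{\mathcal{B}}_i$ are two quantum channels in Chan(B). Since $\mathcal{B}_i$ and $\widetilde{\mathcal{B}}_i$ are trace-preserving, Equation (27)
implies $\mathrm{Tr}_B[\rho_i] = \mathrm{Tr}_B[\rho_{i+1}]$, as one can see by taking the partial trace on $\mathcal{H}_B$ on both sides. In conclusion,
we obtained the equality $\mathrm{Tr}_B[\rho] \equiv \mathrm{Tr}_B[\rho_1] = \mathrm{Tr}_B[\rho_2] = \cdots = \mathrm{Tr}_B[\rho_n] \equiv \mathrm{Tr}_B[\sigma]$.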
Conversely, suppose that the condition TrB [ρ] = TrB [σ] holds. Then, one has
where B0 ∈ Chan( B) is the erasure channel defined as B0 (·) = β 0 TrB [·], β 0 being a fixed (but otherwise
arbitrary) density matrix in Lin(H B ). Since I A ⊗ B0 is an element of Act( B; S), Equation (28) shows
that the intersection between Deg B (ρ) and Deg B (σ) is non-empty. Hence, ρ and σ correspond to the
same state of system S A .
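The converse direction of the proof can be illustrated numerically: two different two-qubit states with the same marginal on A are mapped to the same state by an erasure channel acting on B. The sketch below assumes two qubits; the specific states (a Bell state and a product state), the reset state `beta0`, and the helper names are illustrative choices not taken from the text.

```python
def kron(a, b):
    """Kronecker (tensor) product of two matrices stored as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def partial_trace_B(rho, dim_a, dim_b):
    """Partial trace over the second tensor factor."""
    return [[sum(rho[i * dim_b + k][j * dim_b + k] for k in range(dim_b))
             for j in range(dim_a)] for i in range(dim_a)]

def erase_B(rho, beta0, dim_a, dim_b):
    """Erasure channel on B: rho -> Tr_B[rho] (x) beta0."""
    return kron(partial_trace_B(rho, dim_a, dim_b), beta0)

# Bell state (|00> + |11>)/sqrt(2) as a density matrix (illustrative choice).
bell = [[0.5, 0, 0, 0.5],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
        [0.5, 0, 0, 0.5]]
# Product state (I/2) (x) |0><0| (illustrative choice).
product_state = kron([[0.5, 0], [0, 0.5]], [[1, 0], [0, 0]])
beta0 = [[0, 0], [0, 1]]  # fixed state prepared by the erasure channel

same_marginal = partial_trace_B(bell, 2, 2) == partial_trace_B(product_state, 2, 2)
common_degradation = erase_B(bell, beta0, 2, 2) == erase_B(product_state, beta0, 2, 2)
globally_different = bell != product_state
```

The two global states differ, yet the adversary can degrade both into one and the same state, which is what places them in the same equivalence class.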
We have seen that two global states ρ, σ ∈ St(S) are equivalent for agent A if and only if they
have the same partial trace over B. Hence, the state space of the subsystem S A is
$$\mathrm{St}(S_A) = \big\{ \mathrm{Tr}_B[\rho] \,:\, \rho \in \mathrm{St}(S) \big\} , \tag{29}$$
Now, let us consider the transformations. It is not hard to show that two transformations
T , S ∈ Act( A; S) are equivalent if and only if TrB ◦T = TrB ◦S (see Appendix B for the details).
Recalling that the transformations in Act( A; S) are of the form A ⊗ I B , for some A ∈ Chan( A),
we obtain that the set of transformations of S A is
In summary, our construction correctly identifies the quantum subsystem associated with the
Hilbert space H A , with the right set of states and the right set of physical transformations.
for appropriate Hilbert spaces H Ak and H Bk . Relative to this decomposition, the elements of the
algebra A are characterized as
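$$C \in A \iff C = \bigoplus_k \big( C_k \otimes I_{B_k} \big) , \tag{32}$$
where $C_k$ is an operator in $\mathrm{Lin}(\mathcal{H}_{A_k})$, and $I_{B_k}$ is the identity on $\mathcal{H}_{B_k}$. The elements of the commutant
algebra A′ are characterized as
$$D \in A' \iff D = \bigoplus_k \big( I_{A_k} \otimes D_k \big) , \tag{33}$$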
The maximal adversary of agent A is the agent B who can implement all the quantum channels
that commute with the channels in Chan(A), namely
In Appendix C, we prove that $\mathrm{Chan}(A)'$ coincides with the set of quantum channels with Kraus
operators in the commutant of the algebra A: in formula,
As in the previous example, the states of subsystem S A can be characterized as “partial traces”
of the states in S, provided that one adopts the right definition of “partial trace”. Denoting the
commutant of the algebra A by B := A′, one can define the “partial trace over the algebra B” as the
channel $\mathrm{Tr}_B : \mathrm{Lin}(\mathcal{H}_S) \to \bigoplus_k \mathrm{Lin}(\mathcal{H}_{A_k})$ specified by the relation
$$\mathrm{Tr}_B(\rho) := \bigoplus_k \mathrm{Tr}_{B_k}\big[ \Pi_k \rho \Pi_k \big] , \tag{37}$$
where $\Pi_k$ is the projector on the subspace $\mathcal{H}_{A_k} \otimes \mathcal{H}_{B_k} \subseteq \mathcal{H}_S$, and $\mathrm{Tr}_{B_k}$ denotes the partial trace over
the space $\mathcal{H}_{B_k}$. With definition (37), it is not hard to see that two states are equivalent for A if and only if
they have the same partial trace over B:
Proposition 5. Two states ρ, σ ∈ St(S) are equivalent for A if and only if TrB [ρ] = TrB [σ].
The proof is provided in Appendix C. In summary, the states of system $S_A$ are obtained from
the states of S via partial trace over B, namely
$$\mathrm{St}(S_A) = \big\{ \mathrm{Tr}_B(\rho) \,:\, \rho \in \mathrm{St}(S) \big\} . \tag{38}$$
Our construction is consistent with the standard algebraic construction, where the states of system
S A are defined as restrictions of the global states to the subalgebra A: indeed, for every element C ∈ A,
we have the relation
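$$\begin{aligned}
\mathrm{Tr}[C\rho] &= \mathrm{Tr}\Big[ \Big( \bigoplus_k C_k \otimes I_{B_k} \Big) \rho \Big] \\
&= \sum_k \mathrm{Tr}\big[ (C_k \otimes I_{B_k}) \, \Pi_k \rho \Pi_k \big] \\
&= \sum_k \mathrm{Tr}\big[ C_k \, \mathrm{Tr}_{B_k}[\Pi_k \rho \Pi_k] \big] \\
&= \mathrm{Tr}\big[ \check{C} \, \mathrm{Tr}_B[\rho] \big] , \qquad \check{C} := \bigoplus_k C_k ,
\end{aligned} \tag{39}$$
meaning that the restriction of the state ρ to the subalgebra A is in one-to-one correspondence with the
state $\mathrm{Tr}_B[\rho]$.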
Alternatively, the states of subsystem S A can be characterized as density matrices of the block
diagonal form
$$\sigma = \bigoplus_k p_k \sigma_k , \tag{40}$$
As the set of transformations, we consider the set of all unitary channels: in formula,
$$\mathrm{Transf}(S) = \big\{ U \cdot U^\dagger \,:\, U \in M_d(\mathbb{C}) ,\ U^\dagger U = U U^\dagger = I \big\} . \tag{43}$$
To agent A, we grant the ability to implement all unitary channels corresponding to diagonal
unitary matrices, i.e., matrices of the form
$$U_\theta = \sum_k e^{i\theta_k} |k\rangle\langle k| , \tag{44}$$
where each phase θk can vary independently of the other phases. In formula, the set of actions of
agent A is
$$\mathrm{Act}(A;S) = \big\{ U_\theta \cdot U_\theta^\dagger \,:\, U_\theta \in \mathrm{Lin}(\mathcal{H}_S) ,\ U_\theta \text{ as in Equation (44)} \big\} . \tag{45}$$
The peculiarity of this example is that the actions of the maximal adversary A′ are exactly the
same as the actions of A. It is immediate to see that Act( A; S) is included in Act( A′; S) because all
operations of agent A commute. With a bit of extra work, one can see that, in fact, Act( A; S) and
Act( A′; S) coincide.
Let us look at the subsystem associated with agent A. The equivalence relation among states
takes a simple form:
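Proposition 6. Two pure states with unit vectors $|\phi\rangle, |\psi\rangle \in \mathcal{H}_S$ are equivalent for A if and only if $|\psi\rangle = U|\phi\rangle$
for some diagonal unitary matrix U.
Proof. Suppose that there exists a finite sequence $(|\psi_1\rangle, |\psi_2\rangle, \dots, |\psi_n\rangle)$ such that
$$|\psi_1\rangle = |\phi\rangle , \quad |\psi_n\rangle = |\psi\rangle , \quad \text{and} \quad \mathrm{Deg}_A(|\psi_i\rangle\langle\psi_i|) \cap \mathrm{Deg}_A(|\psi_{i+1}\rangle\langle\psi_{i+1}|) \neq \emptyset \quad \forall i \in \{1, 2, \dots, n-1\} .$$
This means that, for every $i \in \{1, \dots, n-1\}$, there exist two diagonal unitary matrices $U_i$ and $\widetilde{U}_i$
such that $U_i|\psi_i\rangle = \widetilde{U}_i|\psi_{i+1}\rangle$, or equivalently,
$$|\psi_{i+1}\rangle = \widetilde{U}_i^\dagger U_i |\psi_i\rangle . \tag{46}$$
Using the above relation for all values of i, we obtain $|\psi\rangle = U|\phi\rangle$ with
$U := \widetilde{U}_{n-1}^\dagger U_{n-1} \cdots \widetilde{U}_2^\dagger U_2 \, \widetilde{U}_1^\dagger U_1$.
Conversely, suppose that the condition $|\psi\rangle = U|\phi\rangle$ holds for some diagonal unitary matrix U.
Then, the intersection $\mathrm{Deg}_A(|\phi\rangle\langle\phi|) \cap \mathrm{Deg}_A(|\psi\rangle\langle\psi|)$ is non-empty, which implies that $|\phi\rangle\langle\phi|$ and
$|\psi\rangle\langle\psi|$ are in the same equivalence class.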
Using Proposition 6, it is immediate to see that the equivalence class $[|\psi\rangle\langle\psi|]_A$ is uniquely
identified by the diagonal density matrix $\rho = \sum_k |\psi_k|^2 |k\rangle\langle k|$. Hence, the state space of system $S_A$ is
the set of diagonal density matrices
$$\mathrm{St}(S_A) = \Big\{ \rho = \sum_k p_k |k\rangle\langle k| \,:\, p_k \ge 0 \ \forall k ,\ \sum_k p_k = 1 \Big\} . \tag{47}$$
The set of transformations of system $S_A$ is trivial because the actions of A coincide with the actions
of the adversary A′, and therefore they are all in the equivalence class of the identity transformation.
In formula, one has
$$\mathrm{Transf}(S_A) = \big\{ \mathcal{I}_{S_A} \big\} . \tag{48}$$
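The fact that a diagonal unitary changes the vector but not the populations can be checked directly. The following sketch assumes a three-level system; the amplitudes, the phases, and the helper names are arbitrary illustrative choices.

```python
import cmath
import math

def apply_diagonal_unitary(thetas, vec):
    """Multiply the k-th amplitude by the phase exp(i * theta_k)."""
    return [cmath.exp(1j * t) * a for t, a in zip(thetas, vec)]

def populations(vec):
    """Diagonal of |v><v| in the fixed basis: the squared moduli of the amplitudes."""
    return [abs(a) ** 2 for a in vec]

# Unit vector with populations 0.2, 0.3, 0.5 (illustrative choice).
phi = [math.sqrt(0.2), 1j * math.sqrt(0.3), -math.sqrt(0.5)]
psi = apply_diagonal_unitary([0.4, 1.7, -2.1], phi)

same_class_label = all(
    abs(p - q) < 1e-12 for p, q in zip(populations(phi), populations(psi))
)
distinct_vectors = any(abs(a - b) > 1e-6 for a, b in zip(phi, psi))
normalized = abs(sum(populations(psi)) - 1) < 1e-12
```

The two vectors differ, but they carry the same probability distribution over the basis, which is the label of their common equivalence class.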
where $\mathcal{U}_\theta = U_\theta \cdot U_\theta^\dagger$ is the unitary channel corresponding to the diagonal unitary $U_\theta = \sum_k e^{i\theta_k} |k\rangle\langle k|$.
Physically, we can interpret the restriction to multiphase covariant channels as the lack of a reference
for the definition of the phases in the basis $\{|k\rangle , k = 1, \dots, d\}$.
It turns out that the maximal adversary of agent A is the agent A′ that can perform every
basis-preserving channel $\mathcal{B}$, that is, every channel satisfying the condition
Theorem 1. The monoid of multiphase covariant channels and the monoid of basis-preserving channels are the
commutant of one another.
The proof, presented in Appendix D.1, is based on the characterization of the basis-preserving
channels provided in [71,72].
We now show that states of system S A can be characterized as classical probability distributions.
Proposition 7. For every pair of states ρ, σ ∈ St(S), the following are equivalent:
Proof. Suppose that Condition 1 holds, meaning that there exists a sequence (ρ1 , ρ2 , . . . , ρn ) such that
where $\mathcal{B}_i$ and $\widetilde{\mathcal{B}}_i$ are basis-preserving channels. The above equation implies
Now, the relation $\langle k|\mathcal{B}(\rho)|k\rangle = \langle k|\rho|k\rangle$ is valid for every basis-preserving channel $\mathcal{B}$ and for every
state ρ [71]. Applying this relation on both sides of Equation (52), we obtain the condition
$$\langle k|\rho_i|k\rangle = \langle k|\rho_{i+1}|k\rangle , \tag{53}$$
valid for every $k \in \{1, \dots, d\}$. Hence, all the density matrices $(\rho_1, \rho_2, \dots, \rho_n)$ must have the same
diagonal entries, and, in particular, Condition 2 must hold.
Conversely, suppose that Condition 2 holds. Since the dephasing channel $\mathcal{D}$ is obviously
basis-preserving, we obtain the condition $\mathrm{Deg}_A(\rho) \cap \mathrm{Deg}_A(\sigma) \neq \emptyset$, which implies that ρ and
σ are equivalent for agent A. In conclusion, Condition 1 holds.
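The role of the dephasing channel in the converse direction can be illustrated with a small sketch: two density matrices with the same diagonal are sent to the same state, while the diagonal entries themselves are left untouched. The two qubit states and the function names below are illustrative assumptions.

```python
def dephase(rho):
    """Completely dephasing channel: keep the diagonal, erase off-diagonal entries."""
    n = len(rho)
    return [[rho[i][j] if i == j else 0 for j in range(n)] for i in range(n)]

def diagonal(rho):
    """Diagonal entries <k|rho|k> in the fixed basis."""
    return [rho[k][k] for k in range(len(rho))]

rho = [[0.5, 0.5], [0.5, 0.5]]          # |+><+| (illustrative choice)
sigma = [[0.5, 0.2j], [-0.2j, 0.5]]     # a different state with the same diagonal

common_degradation = dephase(rho) == dephase(sigma)
diagonal_preserved = diagonal(dephase(rho)) == diagonal(rho)
basis_state = [[0, 0], [0, 1]]          # |1><1|
fixes_basis_state = dephase(basis_state) == basis_state
```

Since the dephasing channel leaves every diagonal entry and every basis state unchanged, it is one concrete transformation through which both ρ and σ degrade to the same diagonal matrix.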
4.5. Classical Systems From Free Operations in the Resource Theory of Coherence
In the previous example, we have seen that classical systems arise from agents who have access
to the monoid of multiphase covariant channels. In fact, classical systems can arise in many other
ways, corresponding to agents who have access to different monoids of operations. In particular, we
find that several types of free operations in the resource theory of coherence [34–41] identify classical
systems. Specifically, consider the monoids of
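1. Strictly incoherent operations [41], i.e., quantum channels $\mathcal{T}$ with the property that, for every
Kraus operator $T_i$, the map $\mathcal{T}_i(\cdot) = T_i \cdot T_i^\dagger$ satisfies the condition $\mathcal{D} \circ \mathcal{T}_i = \mathcal{T}_i \circ \mathcal{D}$, where $\mathcal{D}$ is the
completely dephasing channel.
2. Dephasing covariant operations [38–40], i.e., quantum channels $\mathcal{T}$ satisfying the condition
$\mathcal{D} \circ \mathcal{T} = \mathcal{T} \circ \mathcal{D}$.
3. Phase covariant channels [40], i.e., quantum channels $\mathcal{T}$ satisfying the condition
$\mathcal{T} \circ \mathcal{U}_\varphi = \mathcal{U}_\varphi \circ \mathcal{T}$, $\forall \varphi \in [0, 2\pi)$, where $\mathcal{U}_\varphi$ is the unitary channel associated with the unitary matrix
$U_\varphi = \sum_k e^{ik\varphi} |k\rangle\langle k|$.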
4. Physically incoherent operations [38,39], i.e., quantum channels that are convex combinations of
channels T admitting a Kraus representation where each Kraus operator Ti is of the form
where $U_{\pi_i}$ is a unitary that permutes the elements of the computational basis, $U_{\theta_i}$ is a diagonal
unitary, and $P_i$ is a projector on a subspace spanned by a subset of vectors in the computational basis.
For each of the monoids 1–4, our construction yields the classical subsystem consisting of diagonal
density matrices. The transformations of the subsystem are just the classical channels. The proof is
presented in Appendix E.1.
Notably, other choices of free operations, such as the maximally incoherent operations [34] and the
incoherent operations [35], do not identify classical subsystems. The maximally incoherent operations
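are the quantum channels $\mathcal{T}$ that map diagonal density matrices to diagonal density matrices, namely
$\mathcal{T} \circ \mathcal{D} = \mathcal{D} \circ \mathcal{T} \circ \mathcal{D}$, where $\mathcal{D}$ is the completely dephasing channel. The incoherent operations are the
quantum channels $\mathcal{T}$ with the property that, for every Kraus operator $T_i$, the map $\mathcal{T}_i(\cdot) = T_i \cdot T_i^\dagger$ sends
diagonal matrices to diagonal matrices, namely $\mathcal{T}_i \circ \mathcal{D} = \mathcal{D} \circ \mathcal{T}_i \circ \mathcal{D}$.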
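A small numerical sketch may help separate the condition of mapping diagonal matrices to diagonal matrices from the stronger condition of commuting with the dephasing channel. The channel used here, with Kraus operators |0⟩⟨+| and |1⟩⟨−|, is an illustrative assumption and is not taken from the text; all function names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose of a matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def close(a, b, tol=1e-12):
    """Entrywise comparison up to a numerical tolerance."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def dephase(rho):
    """Completely dephasing channel D."""
    return [[rho[i][j] if i == j else 0 for j in range(2)] for i in range(2)]

def x_measurement_channel(rho):
    """Channel T with Kraus operators |0><+| and |1><-| (illustrative choice)."""
    s = 1 / math.sqrt(2)
    kraus = [[[s, s], [0, 0]], [[0, 0], [s, -s]]]
    out = [[0j, 0j], [0j, 0j]]
    for t in kraus:
        out = add(out, matmul(matmul(t, rho), dagger(t)))
    return out

rho = [[0.5, 0.5], [0.5, 0.5]]  # |+><+|

# T always outputs a diagonal matrix, so T o D = D o T o D holds on this input.
preserves_diagonal_states = close(
    x_measurement_channel(dephase(rho)),
    dephase(x_measurement_channel(dephase(rho))),
)
# Yet D o T and T o D disagree on the same input.
commutes_with_dephasing = close(
    dephase(x_measurement_channel(rho)),
    x_measurement_channel(dephase(rho)),
)
```

On the input |+⟩⟨+|, applying the channel before dephasing yields |0⟩⟨0|, while dephasing first yields the maximally mixed state, so the channel does not commute with dephasing even though it never creates off-diagonal terms.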
In Appendix E.2, we show that incoherent and maximally incoherent operations do not identify
classical subsystems: the subsystem associated with these operations is the whole quantum system.
This result can be understood from the analogy between these operations and non-entangling
operations in the resource theory of entanglement [38,39]. Non-entangling operations do not generate
entanglement, but nevertheless they cannot (in general) be implemented with local operations
and classical communication. Similarly, incoherent and maximally incoherent operations do not
generate coherence, but they cannot (in general) be implemented with incoherent states and coherence
non-generating unitary gates. An agent that performs these operations must have access to more
degrees of freedom than just a classical subsystem.
At the mathematical level, the problem is that the incoherent and maximally incoherent operations
do not necessarily commute with the dephasing channel D . In our construction, commutation
with the dephasing channel is essential for retrieving classical subsystems. In general, we have
the following theorem:
Definition 5. The partial trace over A is the function Tr A : St(S) → St(S A ), defined by Tr A (ψ) = [ψ] A
for a generic ψ ∈ St(S).
The reason for the notation Tr A is that in quantum theory the operation Tr A coincides with the
partial trace of matrices, as shown in the example of Section 4.1. For subsystems associated with
von Neumann algebras, the partial trace is the “partial trace over the algebra” defined in Section 4.2.
For subsystems associated with multiphase covariant channels or dephasing covariant operations,
the partial trace is the completely dephasing channel, which “traces out” the off-diagonal elements of
the density matrix.
Entropy 2018, 20, 358
With the partial trace notation, the states of system S A can be succinctly written as
St(S A ) = { ρ = Tr A (ψ) : ψ ∈ St(S) } . (57)
Equation (58) can be regarded as the no signalling property: the actions of agent B cannot lead to
any change on the system of agent A. Of course, here the no signalling property holds by fiat, precisely
because of the way the subsystems are defined!
The construction of subsystems has the merit of clarifying the status of the no-signalling principle.
No-signalling is often associated with space-like separation, and is heuristically justified through the
idea that physical influences should propagate within the light cones. However, locality is only a
sufficient condition for the no signalling property. Spatial separation implies no signalling, but the
converse is not necessarily true: every pair of distinct quantum systems satisfies the no-signalling
condition, even if the two systems are spatially contiguous. In fact, the no-signalling condition holds
even for virtual subsystems of a single, spatially localized system. Think for example of a quantum
particle localized in the xy plane. The particle can be regarded as a composite system, made of two
virtual subsystems: a particle localized on the x-axis, and another particle localized on the y-axis.
The no-signalling property holds for these two subsystems, even if they are not separated in space.
As Equation (58) suggests, the validity of the no-signalling property has more to do with the way
subsystems are constructed, rather than the way the subsystems are distributed in space.
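The point that no-signalling follows from the construction, rather than from spatial separation, can be illustrated numerically in the familiar quantum case. The sketch below is only an illustration under stated assumptions: the two virtual subsystems are modelled as finite-dimensional tensor factors (dimensions 2 and 3 are arbitrary choices), and the helper names are hypothetical.

```python
import numpy as np

def partial_trace_B(rho, dA, dB):
    """Trace out the second factor of a density matrix on C^dA ⊗ C^dB."""
    return np.trace(rho.reshape(dA, dB, dA, dB), axis1=1, axis2=3)

def random_unitary(d, rng):
    """Random d-by-d unitary from the QR decomposition of a Gaussian matrix."""
    z = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    q, _ = np.linalg.qr(z)
    return q

rng = np.random.default_rng(0)
dA, dB = 2, 3  # hypothetical dimensions of the two virtual subsystems

# Random pure state of the composite system
psi = rng.normal(size=dA * dB) + 1j * rng.normal(size=dA * dB)
psi /= np.linalg.norm(psi)
rho = np.outer(psi, psi.conj())

# Agent B acts with a unitary on its own factor only
U_B = np.kron(np.eye(dA), random_unitary(dB, rng))
rho_after = U_B @ rho @ U_B.conj().T

state_A_before = partial_trace_B(rho, dA, dB)
state_A_after = partial_trace_B(rho_after, dA, dB)
print(np.allclose(state_A_before, state_A_after))  # True
```

No assumption about where the two factors sit in space enters the computation: the state of agent A is unchanged simply because it is obtained by discarding everything agent B can act upon.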
and
Transf (S → SB ) = { Tr A ◦T : T ∈ Transf (S) } , (60)
respectively.
Morphisms from S A to S, from SB to S, from S A to SB , or from SB to S A , are not naturally defined.
In Appendix F, we provide a mathematical construction that enlarges the sets of transformations,
making all sets non-empty. Such a construction allows us to reproduce a categorical structure known
as a splitting of idempotents [73,74].
Obviously, the set of actions allowed to agent A′′ includes the set of actions allowed to agent A.
At this point, one could continue the construction and consider the maximal adversary of agent A′′ .
However, no new agent would appear at this point: the maximal adversary of agent A′′ is agent A′
again. When two agents have this property, we call them a dual pair:
Definition 6. Two agents A and B form a dual pair iff Act( A; S) = Act( B; S)′ and Act( B; S) = Act( A; S)′ .
Definition 7. Two agents A and B are non-overlapping iff Act( A; S) ∩ Act( B; S) ⊆ {IS }.
Dual pairs of non-overlapping agents are characterized by the fact that the sets of actions have
trivial center:
Proposition 8. Let A and B be a dual pair of agents. Then, the following are equivalent:
Proof. Since agents A and B are dual to each other, we have Act( B; S) = Act( A; S)′ and Act( A; S) =
Act( B; S)′ . Hence, the intersection Act( A; S) ∩ Act( B; S) coincides with the center of Act( A; S), and with
the center of Act( B; S). The non-overlap condition holds if and only if the center is trivial.
Note that the existence of non-overlapping dual pairs is a condition on the transformations of the
whole system S:
Proof. Assume that Condition 1 holds for a pair of agents A and B. Let C(S) be the center of Transf (S).
By definition, C(S) is contained in Act( B; S) because Act( B; S) contains all the transformations that
commute with those in Act( A; S). Moreover, the elements of C(S) commute with all elements of
Act( B; S), and therefore they are in the center of Act( B; S). Since A and B are a non-overlapping
dual pair, the center of Act( B; S) must be trivial (Proposition 8), and therefore C(S) must be trivial.
Hence, Condition 2 holds.
Conversely, suppose that Condition 2 holds. In that case, it is enough to take A to be the maximal
agent, i.e., the agent Amax with Act ( Amax ; S) = Transf (S). Then, the maximal adversary of Amax is the
agent B = A′max with Act( B; S) = Act ( Amax ; S)′ = C(S) = {IS }. By definition, the two agents form a
non-overlapping dual pair. Hence, Condition 1 holds.
The existence of dual pairs of non-overlapping agents is a desirable property, which may be used
to characterize “good systems”:
Definition 8 (Non-Overlapping Agents). We say that system S satisfies the Non-Overlapping Agents
Requirement if there exists at least one dual pair of non-overlapping agents acting on S.
The Non-Overlapping Agents Requirement guarantees that the total system S can be regarded as
a subsystem: if Amax is the maximal agent (i.e., the agent who has access to all transformations on S),
then the subsystem S Amax is the whole system S. A more formal statement of this fact is provided in
Appendix G.
6.3. Causality
The Non-Overlapping Agents Requirement guarantees that the subsystem associated with a maximal
agent (i.e., an agent who has access to all possible transformations) is the whole system S. On the
other hand, it is natural to expect that a minimal agent, who has no access to any transformation, should
be associated with the trivial system, i.e., the system with a single state and a single transformation.
The fact that the minimal agent is associated with the trivial system is important because it is equivalent
to a property of causality [8,13,75,76]: indeed, we have the following:
Proposition 10. Let Amin be the minimal agent and let Amax be its maximal adversary, coinciding with the
maximal agent. Then, the following conditions are equivalent
Proof. 1 ⇒ 2: By definition, the state space of S Amin consists of states of the form Tr Amax [ρ], ρ ∈ St(S).
Hence, the state space contains only one state if and only if Condition 2 holds. 2 ⇒ 1: Condition 2
implies that every two states of system S are equivalent for agent Amax . The fact that S Amin has only
one transformation is true by definition: since the adversary of Amin is the maximal agent, one has
T ∈ Deg Amax (IS ) for every transformation T ∈ Transf (S). Hence, every transformation is in the
equivalence class of the identity.
With a little abuse of notation, we may denote the trace over Amax as TrS because Amax has access
to all transformations on system S. With this notation, the causality condition reads
It is interesting to note that, unlike no signalling, causality does not necessarily hold in the framework
of this paper. This is because the trace TrS is defined as the quotient with respect to all possible
transformations, and having a single equivalence class is a non-trivial property. One possibility is
to demand the validity of this property, and to call a system proper, only if it satisfies the causality
condition (62). In the following subsection, we will see a requirement that guarantees the validity of
the causality condition.
Definition 9. A system S satisfies the Initialization Requirement if there exists a state ψ0 ∈ St(S) from which
any other state can be generated, meaning that, for every other state ψ ∈ St(S), there exists a transformation
T ∈ Transf (S) such that ψ = T ψ0 . When this is the case, the state ψ0 is called cyclic.
The Initialization Requirement is satisfied in quantum theory, both at the pure state level and at the
mixed state level. At the pure state level, every unit vector |ψ⟩ ∈ HS can be generated from a fixed unit
vector |ψ0⟩ ∈ HS via a unitary transformation U. At the mixed state level, every density matrix ρ can be
generated from a fixed density matrix ρ0 via the erasure channel Cρ (·) = ρ Tr[·]. By the same argument,
the initialization requirement is also satisfied when S is a system in an operational-probabilistic
theory [8,10–13] and when S is a system in a causal process theory [75,76].
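The two quantum instances of the Initialization Requirement can be made concrete with a short numerical sketch. It assumes NumPy and finite dimension; the function names `unitary_mapping` and `erasure_channel` are hypothetical, and the Householder reflection is just one convenient way (chosen here, not taken from the text) to build a unitary sending |ψ0⟩ to |ψ⟩.

```python
import numpy as np

def unitary_mapping(psi0, psi, tol=1e-12):
    """Return a unitary U with U @ psi0 = psi, for unit vectors psi0 and psi.

    A Householder reflection maps psi0 to a rephased copy of psi; the
    phase is then restored by a global factor.
    """
    inner = np.vdot(psi0, psi)                      # <psi0|psi>
    phase = inner / abs(inner) if abs(inner) > tol else 1.0
    target = psi * np.conj(phase)                   # <psi0|target> is real
    w = psi0 - target
    norm_sq = np.vdot(w, w).real
    d = len(psi0)
    if norm_sq < tol:
        H = np.eye(d, dtype=complex)
    else:
        H = np.eye(d, dtype=complex) - 2.0 * np.outer(w, w.conj()) / norm_sq
    return phase * H

def erasure_channel(rho):
    """Erasure channel sigma -> rho * Tr[sigma]."""
    return lambda sigma: rho * np.trace(sigma)

rng = np.random.default_rng(1)
psi0 = np.array([1, 0, 0], dtype=complex)           # fixed cyclic vector
psi = rng.normal(size=3) + 1j * rng.normal(size=3)  # arbitrary target
psi /= np.linalg.norm(psi)

U = unitary_mapping(psi0, psi)
print(np.allclose(U @ psi0, psi))                   # True
print(np.allclose(U.conj().T @ U, np.eye(3)))       # True

rho0 = np.diag([1.0, 0.0, 0.0])                     # fixed cyclic density matrix
rho = np.diag([0.5, 0.3, 0.2])                      # arbitrary target
print(np.allclose(erasure_channel(rho)(rho0), rho)) # True
```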
The Initialization Requirement guarantees that minimal agents are associated with trivial systems:
Proposition 11. Let S be a system satisfying the Initialization Requirement, and let Amin be the minimal
agent, i.e., the agent that can only perform the identity transformation. Then, the subsystem S Amin is trivial:
St(S Amin ) contains only one state and Transf (S Amin ) contains only one transformation.
Proof. By definition, the maximal adversary of Amin is the maximal agent Amax , who has access to
all physical transformations. Then, every transformation is in the equivalence class of the identity
transformation, meaning that system S Amin has a single transformation. Now, let ψ0 be the cyclic state.
By the Initialization Requirement, the set Deg Amax (ψ0 ) is the whole state space St(S). Hence, every
state is equivalent to the state ψ0 . In other words, St(S Amin ) contains only one state.
The Initialization Requirement guarantees the validity of causality, thanks to Proposition 10.
In addition, the Initialization Requirement is important independently of the causality property.
For example, we will use it to formulate an abstract notion of closed system.
is injective.
Logically invertible transformations can be interpreted as evolutions of the system that preserve
the distinctness of states. At the fundamental level, one may require that all physical evolutions be
logically invertible, a requirement that is sometimes called the Conservation of Information [58]. In the
following, we will explore the consequences of such a requirement:
The requirement is well-posed because the logically invertible transformations form a monoid. Indeed,
the identity transformation is logically invertible, and the composition of two logically invertible
transformations is logically invertible.
A special case of logical invertibility is physical invertibility, defined as follows:
Definition 12. A transformation T ∈ Transf (S) is physically invertible iff there exists another
transformation T′ ∈ Transf (S) such that T′ ◦ T = IS .
Physical invertibility is more than injectivity: not only should the map T be injective on the state
space, but also its inverse should be a physical transformation. In light of this observation, we state a
stronger version of the Conservation of Information, requiring physical invertibility:
The difference between Logical and Physical Conservation of Information is highlighted by the
following example:
This choice of transformations satisfies the Logical Conservation of Information, but violates the Physical
Conservation of Information because in general the map V † · V fails to be trace-preserving, and therefore fails to
be an isometric channel. For example, consider the shift operator
$$V = \sum_{n=0}^{\infty} |n+1\rangle\langle n| . \qquad (66)$$
The operator V is an isometry but its left-inverse V † is not an isometry. As a result, the channel V † · V is
not an allowed physical transformation according to Equation (65).
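The behaviour of the shift operator can be illustrated with a finite truncation. The sketch below is an assumption-laden model, not the infinite-dimensional operator itself: V is represented as a rectangular matrix from an N-dimensional space into an (N+1)-dimensional one, with the cutoff N = 4 chosen arbitrarily.

```python
import numpy as np

N = 4  # hypothetical cutoff on the number basis

# Truncated shift: V|n> = |n+1> for n = 0, ..., N-1
V = np.zeros((N + 1, N))
for n in range(N):
    V[n + 1, n] = 1.0

# V is an isometry (V^dagger V = I) ...
print(np.allclose(V.T @ V, np.eye(N)))          # True
# ... but V^dagger is not (V V^dagger is not the identity)
print(np.allclose(V @ V.T, np.eye(N + 1)))      # False

# The map rho -> V^dagger rho V annihilates |0><0|, so it cannot be
# trace-preserving.
rho = np.zeros((N + 1, N + 1))
rho[0, 0] = 1.0
print(np.trace(V.T @ rho @ V))                  # 0.0
```

The state |0⟩⟨0| lies outside the range of V, which is why the left-inverse loses its norm entirely.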
An alternative choice of physical transformations is the set of unitary channels
Transf (S) = { V · V † : V ∈ Lin(S) , V † V = VV † = I } . (67)
With this choice, the Physical Conservation of Information is satisfied: every physical transformation is
invertible and the inverse is a physical transformation.
Definition 14. A transformation T ∈ Transf (S) is physically reversible iff there exists another
transformation T′ ∈ Transf (S) such that T′ ◦ T = T ◦ T′ = IS .
Proposition 12. If system S satisfies the Physical Conservation of Information, then every physical
transformation is physically reversible. The monoid Transf (S) is a group, hereafter denoted as G(S).
Proof. Since T is physically invertible, there exists a transformation T′ such that T′ ◦ T = IS . Since the
Physical Conservation of Information holds, T′ must be physically invertible, meaning that there
exists a transformation T′′ such that T′′ ◦ T′ = IS . Hence, we have
T′′ = T′′ ◦ (T′ ◦ T ) = (T′′ ◦ T′ ) ◦ T = T . (68)
It is immediate to see that the set GB is a group. We call it the adversarial group.
The equivalence relations used to define subsystems can be greatly simplified. Indeed, it is easy
to see that two states ψ, ψ′ ∈ St(S) are equivalent for A if and only if there exists a transformation
U B ∈ GB such that
ψ′ = U B ψ . (70)
Hence, the states of the subsystem S A are orbits of the group GB : for every ψ ∈ St(S), we have
TrB [ψ] := { U B ψ : U B ∈ GB } . (71)
It is easy to show that the transformations of the subsystem S A are the orbits of the group GB :
Transf (S A ) = { π A (U ) : U ∈ GA } , π A (U ) := { U B ◦ U : U B ∈ GB } . (73)
8. Closed Systems
Here, we define an abstract notion of “closed systems”, which captures the essential features of
what is traditionally called a closed system in quantum theory. Intuitively, the idea is that all the states
of the closed system are “pure” and all the evolutions are reversible.
An obvious problem in defining closed systems is that our framework does not include a notion of
“pure state”. To circumvent the problem, we define the closed systems in the following way:
Definition 15. System S is closed iff it satisfies the Logical Conservation of Information and the Initialization
Requirement, that is, iff
1. every transformation is logically invertible,
2. there exists a state ψ0 ∈ St(S) such that, for every other state ψ ∈ St(S), one has ψ = V ψ0 for some
suitable transformation V ∈ Transf (S).
For a closed system, we nominally say that all the states in St(S) are “pure”, or, more precisely,
“dynamically pure”. This definition is generally different from the usual definition of pure states as
extreme points of convex sets, or from the compositional definition of pure states as states with only
product extensions [77]. First of all, dynamically pure states are not a subset of the state space: provided
that the right conditions are met, they are all the states. Other differences between the usual notion of
pure states and the notion of dynamically pure states are highlighted by the following example:
Example 4. Let S be a system in which all states are of the form Uρ0 U † , where U is a generic 2-by-2 unitary
matrix, and ρ0 ∈ M2 (C) is a fixed 2-by-2 density matrix. For the transformations, we allow all unitary
channels U · U † . By construction, system S satisfies the Initialization Requirement, as one can generate every
state from the initial state ρ0 . Moreover, all the transformations of system S are unitary and therefore the
Conservation of Information is satisfied, both at the physical and the logical level. Therefore, the states of system
S are dynamically pure. Of course, the states Uρ0 U † need not be extreme points of the convex set of all density
matrices, i.e., they need not be rank-one projectors. They are so only when the cyclic state ρ0 is rank-one.
On the other hand, consider a similar example, where
• system S is a qubit,
• the states are pure states, of the form |ψ⟩⟨ψ| for a generic unit vector |ψ⟩ ∈ C2 ,
• the transformations are unitary channels V · V † , where the unitary matrix V has real entries.
Using the Bloch sphere picture, the physical transformations are rotations around the y axis. Clearly,
the Initialization Requirement is not satisfied because there is no way to generate arbitrary points on the sphere
using only rotations around the y-axis. In this case, the states of S are pure in the convex set sense, but not
dynamically pure.
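The failure of the Initialization Requirement in the second example can be checked numerically. The sketch below restricts attention to real rotation matrices (determinant +1), which is an assumption made for simplicity; the helper names are hypothetical. It shows that the y component of the Bloch vector is invariant, so a state on the equator of the xz plane can never be mapped to an eigenstate of σy.

```python
import numpy as np

SIGMA_Y = np.array([[0, -1j], [1j, 0]])

def bloch_y(psi):
    """y component of the Bloch vector, <psi|sigma_y|psi>."""
    return np.vdot(psi, SIGMA_Y @ psi).real

def real_rotation(theta):
    """Real 2-by-2 rotation matrix, a unitary with real entries."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]], dtype=complex)

ket0 = np.array([1, 0], dtype=complex)
y_values = [bloch_y(real_rotation(t) @ ket0)
            for t in np.linspace(0.0, 2.0 * np.pi, 50)]

# Every state reachable from |0> stays at y = 0 ...
print(np.allclose(y_values, 0.0))               # True

# ... while the +1 eigenstate of sigma_y has y = 1 and is never reached.
plus_i = np.array([1, 1j]) / np.sqrt(2)
print(np.isclose(bloch_y(plus_i), 1.0))         # True
```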
For closed systems satisfying the Physical Conservation of Information, every pair of pure states
are interconvertible:
Proposition 13 (Transitive action on the pure states). If system S is closed and satisfies the Physical
Conservation of Information, then, for every pair of states ψ, ψ′ ∈ St(S), there exists a reversible transformation
U ∈ G(S) such that ψ′ = U ψ.
The requirement that all pure states be connected by reversible transformations has featured in
many axiomatizations of quantum theory, either directly [5,44–46], or indirectly as a special case of
other axioms [42,48]. Comparing our framework with the framework of general probabilistic theories,
we can see that the dynamical definition of pure states refers to a rather specific situation, in which all
pure states are connected, either to each other (in the case of physical reversibility) or to a fixed
cyclic state (in the case of logical reversibility).
9. Purification
Here, we show that closed systems satisfying the Physical Conservation of Information also
satisfy the purification property [8,12,13,15,49–51], namely the property that every mixed state can be
modelled as a pure state of a larger system in a canonical way. Under a certain regularity assumption,
the same holds for closed systems satisfying only the Logical Conservation of Information.
of ρ is essentially unique: if ψ′ ∈ St(S) is another pure state with TrB [ψ′ ] = ρ, then there exists a reversible
transformation U B ∈ GB such that ψ′ = U B ψ.
Proof. By construction, the states of system S A are orbits of states of system S under the adversarial
group GB . By Equation (71), every two states ψ, ψ′ ∈ St(S) in the same orbit are connected by an
element of GB .
Note that the notion of purification used here is more general than the usual notion of purification
in quantum information and quantum foundations. The most important difference is that system
S A need not be a factor in a tensor product. Consider the example of the coherent superpositions vs.
classical mixtures (Section 4.3). There, systems S A and SB coincide, their states are classical probability
distributions, and the purifications are coherent superpositions. Two purifications of the same classical
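state p = ( p1 , p2 , . . . , pd ) are two rank-one projectors |ψ⟩⟨ψ| and |ψ′⟩⟨ψ′| corresponding to unit vectors
of the form
$$|\psi\rangle = \sum_n \sqrt{p_n}\, e^{i\theta_n} |n\rangle \quad\text{and}\quad |\psi'\rangle = \sum_n \sqrt{p_n}\, e^{i\theta'_n} |n\rangle . \qquad (74)$$
One purification can be obtained from the other by applying a diagonal unitary matrix. Specifically,
one has
$$|\psi'\rangle = U_B |\psi\rangle \quad\text{with}\quad U_B = \sum_n e^{i(\theta'_n - \theta_n)} |n\rangle\langle n| . \qquad (75)$$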
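The relation between two coherent superpositions that purify the same classical distribution can be verified with a few lines of NumPy. The probabilities and phases below are arbitrary illustrative values, and the dephasing helper plays the role of the partial trace in this example.

```python
import numpy as np

def dephase(rho):
    """Completely dephasing channel: keep only the diagonal of rho."""
    return np.diag(np.diag(rho))

p = np.array([0.5, 0.3, 0.2])                 # hypothetical classical state
theta = np.array([0.0, 1.0, 2.0])             # phases of the first purification
theta_prime = np.array([0.3, -0.7, 1.1])      # phases of the second purification

psi = np.sqrt(p) * np.exp(1j * theta)
psi_prime = np.sqrt(p) * np.exp(1j * theta_prime)

# Diagonal unitary connecting the two purifications
U_B = np.diag(np.exp(1j * (theta_prime - theta)))
print(np.allclose(U_B @ psi, psi_prime))      # True

# Both projectors dephase to the same classical distribution p
rho = np.outer(psi, psi.conj())
rho_prime = np.outer(psi_prime, psi_prime.conj())
print(np.allclose(dephase(rho), np.diag(p)))        # True
print(np.allclose(dephase(rho_prime), np.diag(p)))  # True
```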
For finite dimensional quantum systems, the notion of purification proposed here encompasses
both the notion of entanglement and the notion of coherent superposition. The case of infinite
dimensional systems will be discussed in the next subsection.
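$$|\psi\rangle_{AB} = \sqrt{1-x^2} \sum_{n=0}^{\infty} x^n\, |n\rangle_A \otimes |n\rangle_B \quad\text{and}\quad |\psi'\rangle_{AB} = \sqrt{1-x^2} \sum_{n=0}^{\infty} x^n\, |n\rangle_A \otimes |n+1\rangle_B , \qquad (76)$$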
Given two purifications of the same state, say |ψ⟩ and |ψ′⟩, it is possible to show that at least one
of the following possibilities holds:
1. |ψ′⟩ = ( I A ⊗ VB ) |ψ⟩ for some isometry VB acting on system SB ,
2. |ψ⟩ = ( I A ⊗ VB ) |ψ′⟩ for some isometry VB acting on system SB .
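A truncated version of the two states in Equation (76) illustrates how two purifications of the same marginal can be related by an isometry that is not unitary. This is only a sketch: the infinite sums are cut off at N terms and renormalised, the shift on system B is modelled as a rectangular matrix, and the values x = 0.6 and N = 30 are arbitrary. A bipartite vector Σ M[a, b] |a⟩ ⊗ |b⟩ is stored as its coefficient matrix M, so that the marginal on A is M M† and the action of I ⊗ V corresponds to M Vᵀ.

```python
import numpy as np

x, N = 0.6, 30                        # hypothetical parameter and cutoff
amps = x ** np.arange(N)
amps /= np.linalg.norm(amps)          # renormalise the truncated series

# Coefficient matrices of the two (truncated) purifications
M = np.diag(amps)                                   # sum_n a_n |n>|n>
M_prime = np.zeros((N, N + 1))
M_prime[np.arange(N), np.arange(N) + 1] = amps      # sum_n a_n |n>|n+1>

# Truncated shift on system B: V_B |n> = |n+1>
V_B = np.zeros((N + 1, N))
V_B[np.arange(N) + 1, np.arange(N)] = 1.0

same_marginal = np.allclose(M @ M.T, M_prime @ M_prime.T)
connected = np.allclose(M @ V_B.T, M_prime)         # |psi'> = (I ⊗ V_B)|psi>
is_isometry = np.allclose(V_B.T @ V_B, np.eye(N))
is_unitary = np.allclose(V_B @ V_B.T, np.eye(N + 1))

print(same_marginal, connected, is_isometry, is_unitary)
# True True True False
```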
Unfortunately, this uniqueness property is not automatically valid in every system satisfying the
Logical Conservation of Information. Still, we will now show a regularity condition, under which the
uniqueness property is satisfied:
Definition 16. Let S be a system satisfying the Logical Conservation of Information, let M ⊆ Transf (S) be a
monoid, and let DegM (ψ) be the set defined by
DegM (ψ) = { V ψ : V ∈ M } . (78)
The monoid M is called regular iff
1. for every pair of states ψ, ψ′ ∈ St(S), the condition DegM (ψ) ∩ DegM (ψ′ ) ≠ ∅ implies that there exists
a transformation U ∈ M such that ψ′ = U ψ or ψ = U ψ′ ,
2. for every pair of transformations V , V′ ∈ M, there exists a transformation W ∈ M such that V′ = W ◦ V
or V = W ◦ V′ .
The regularity conditions are satisfied in quantum theory by the monoid of isometries.
Example 5 (Isometric channels in quantum theory). Let S be a quantum system with separable Hilbert
space H, of dimension d ≤ ∞. Let St(S) be the set of all pure quantum states, and let Transf (S) be the monoid of
all isometric channels.
We now show that the monoid M = Transf (S) is regular. The first regularity condition is immediate
because for every pair of unit vectors |ψ⟩ and |ψ′⟩ there exists an isometry (in fact, a unitary) U such that
|ψ′⟩ = U |ψ⟩. Trivially, this implies the relation |ψ′⟩⟨ψ′| = U |ψ⟩⟨ψ|U † at the level of quantum states and
isometric channels.
Let us see that the second regularity condition holds. Let V, V′ ∈ Lin(H) be two isometries on H, and let
{|i⟩}, i = 1, . . . , d, be the standard basis for H. Then, the isometries V and V′ can be written as
$$V = \sum_{i=1}^{d} |\phi_i\rangle\langle i| \quad\text{and}\quad V' = \sum_{i=1}^{d} |\phi'_i\rangle\langle i| , \qquad (79)$$
where {|φi⟩} and {|φ′i⟩}, i = 1, . . . , d, are orthonormal vectors (not necessarily forming bases for the whole Hilbert
space H). Define the subspaces S = Span{|φi⟩} and S′ = Span{|φ′i⟩}, and let {|ψj⟩}, j = 1, . . . , r, and {|ψ′j⟩}, j = 1, . . . , r′,
be orthonormal bases for the orthogonal complements S⊥ and S′⊥ , respectively. If r ≤ r′ , we define the isometry
$$W = \sum_{i=1}^{d} |\phi'_i\rangle\langle \phi_i| + \sum_{j=1}^{r} |\psi'_j\rangle\langle \psi_j| , \qquad (80)$$
and we obtain the condition V′ = WV. Alternatively, if r′ ≤ r, we can define the isometry
$$W = \sum_{i=1}^{d} |\phi_i\rangle\langle \phi'_i| + \sum_{j=1}^{r'} |\psi_j\rangle\langle \psi'_j| , \qquad (81)$$
and we obtain the condition V = WV′ . At the level of isometric channels, we obtained the condition V′ = W ◦ V
or the condition V = W ◦ V′ , with V (·) = V · V † , V′ (·) = V′ · V′† , and W (·) = W · W † .
The fact that the monoid of all isometric channels is regular implies that other monoids of isometric channels
are also regular. For example, if the Hilbert space H has the tensor product structure H = H A ⊗ H B , then the
monoid of local isometric channels, defined by isometries of the form I A ⊗ VB , is regular. More generally, if the
Hilbert space is decomposed as
$$\mathcal{H} = \bigoplus_{k} \left( \mathcal{H}_{A,k} \otimes \mathcal{H}_{B,k} \right) , \qquad (82)$$
is regular.
We are now in position to derive the purification property for general closed systems:
Proposition 15. Let S be a closed system. Let A be an agent and let B = A′ be its maximal adversary.
If Act( B; S) is a regular monoid, the condition TrB [ψ] = TrB [ψ′ ] implies that there exists some invertible
transformation V B ∈ Transf ( B; S) such that the relation ψ′ = V B ψ or the relation ψ = V B ψ′ holds.
Corollary 1 (Purification). Let S be a closed system, let A be an agent in S, and let B = A′ be its maximal
adversary. If the monoid Act( B; S) is regular, then every state ρ ∈ St(S A ) has a purification ψ ∈ St(S), i.e.,
a state such that ρ = TrB [ψ]. Moreover, the purification is essentially unique: if ψ′ ∈ St(S) is another state
with TrB [ψ′ ] = ρ, then there exists a reversible transformation V B ∈ Act( B; S) such that the relation ψ′ = V B ψ
or the relation ψ = V B ψ′ holds.
The maximal adversary of A is the agent B = A′ who is able to perform all unitary channels V
that commute with those in G A , namely, the unitary channels in the group
G B := { V ∈ G(S) : V ◦ Ug = Ug ◦ V ∀g ∈ G } . (85)
where, for every fixed V, the function ω(V, ·) : G → C is a multiplicative character, i.e., a one-dimensional
representation of the group G.
Note that, if two unitaries V and W satisfy Equation (86) with multiplicative characters ω (V, ·)
and ω (W, ·), respectively, then their product VW satisfies Equation (86) with multiplicative character
ω (VW, ·) = ω (V, ·) ω (W, ·). This means that the function ω : GB × G → C is a multiplicative
bicharacter: ω (V, ·) is a multiplicative character for G for every fixed V ∈ GB , and, at the same time,
ω (·, g) is a multiplicative character for GB for every fixed g ∈ G.
VUg = Ug V ∀g ∈ G . (87)
The unitaries in the commutant satisfy Equation (86) with the trivial multiplicative character
ω (V, g) = 1, ∀ g ∈ G. In general, the adversarial group may contain other unitary operators,
corresponding to non-trivial multiplicative characters. The full characterization of the adversarial
group is provided by the following theorem:
Theorem 3. Let G be a compact group, let U : G → Lin(H) be a projective representation of G, and let G A be
the group of channels G A := {Ug · Ug† : g ∈ G}. Then, the adversarial group GB is isomorphic to the semidirect
product A ⋉ U′ , where U′ is the commutant of the set {Ug : g ∈ G}, and A is an Abelian subgroup of the
group of permutations of Irr(U ), the set of irreducible representations contained in the decomposition of the
representation Ug .
Theorem 4. If G is a compact connected Lie group, then the Abelian subgroup A of Theorem 3 is trivial, and all
the solutions of Equation (86) have ω (V, g) = 1 ∀ g ∈ G.
where Irr(U ) is the set of irreducible representations (irreps) of G contained in the decomposition of U,
U^{(j)} : g ↦ Ug^{(j)} is the irreducible representation of G acting on the representation space R j , and IM j is
the identity acting on the multiplicity space M j . From this expression, it is clear that the adversarial
group GB consists of unitary gates V of the form
$$V = \bigoplus_{j \in \mathrm{Irr}(U)} \left( I_{R_j} \otimes V_j \right) , \qquad (89)$$
where IR j is the identity operator on the representation space R j , and Vj is a generic unitary operator
on the multiplicity space M j .
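A small worked instance may help to see the block structure of Equation (89). The sketch below takes, as an assumption made purely for illustration, the representation U ↦ U ⊗ U of the 2-by-2 unitaries on two qubits. It decomposes into the symmetric subspace (dimension 3) and the antisymmetric subspace (dimension 1), each appearing once, so every multiplicity space is one-dimensional and each Vj reduces to a phase. The helper names are hypothetical.

```python
import numpy as np

I4 = np.eye(4, dtype=complex)
SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]], dtype=complex)
P_SYM = (I4 + SWAP) / 2     # projector on the symmetric subspace
P_ANTI = (I4 - SWAP) / 2    # projector on the antisymmetric subspace

def adversarial_unitary(alpha, beta):
    """Unitary acting as a phase on each irreducible block."""
    return np.exp(1j * alpha) * P_SYM + np.exp(1j * beta) * P_ANTI

def random_unitary(d, rng):
    """Random d-by-d unitary from the QR decomposition of a Gaussian matrix."""
    z = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    q, _ = np.linalg.qr(z)
    return q

rng = np.random.default_rng(2)
V = adversarial_unitary(0.4, 1.3)

commutes = all(
    np.allclose(V @ np.kron(U, U), np.kron(U, U) @ V)
    for U in (random_unitary(2, rng) for _ in range(20))
)
print(commutes)                               # True
print(np.allclose(V.conj().T @ V, I4))        # True
```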
In general, the agents A and B = A′ do not form a dual pair. Indeed, it is not hard to see that the
maximal adversary of B is the agent C = A′′ that can perform every unitary channel U (·) = U · U † ,
where U is a unitary operator of the form
$$U = \bigoplus_{j \in \mathrm{Irr}(U)} \left( U_j \otimes I_{M_j} \right) , \qquad (90)$$
Uj being a generic unitary operator on the representation space R j . When A and B form a dual pair,
the groups G A and GB are sometimes called gauge groups [79].
It is now easy to characterize the subsystem S A . Its states are equivalence classes of pure states
under the relation |ψ⟩⟨ψ| ≃ A |ψ′⟩⟨ψ′| iff
It is easy to see that two states in the same equivalence class must satisfy the condition
Proposition 16. Let |ψ⟩, |ψ′⟩ ∈ HS be two unit vectors such that TrB (|ψ⟩⟨ψ|) = TrB (|ψ′⟩⟨ψ′|). Then, there
exists a unitary operator UB ∈ GB such that |ψ′⟩ = UB |ψ⟩.
where { p j } is a generic probability distribution. The state space of system S A is not convex, unless
the condition
dM j ≥ dR j ∀ j ∈ Irr(U ) (95)
is satisfied. Basically, in order to obtain a convex set of density matrices, we need the total system S to
be “sufficiently large” compared to its subsystem S A . This observation is a clue suggesting that the
standard convex framework could be considered as the effective description of subsystems of “large”
closed systems.
Finally, note that, in agreement with the general construction, the pure states of system S are
“purifications” of the states of the system S A . Every state of system S A can be obtained from a pure
state of system S by “tracing out” system SB . Moreover, every two purifications of the same state are
connected by a unitary transformation in GB .
11. Conclusions
In this paper, we adopted a rather minimalistic framework, in which a single physical system was
described solely in terms of states and transformations, without introducing measurements. Or at least,
without introducing measurements in an explicit way: of course, one could always interpret certain
transformations as “measurement processes”, but this interpretation is not necessary for any of the
conclusions drawn in this paper.
Our framework can be interpreted in two ways. One way is to think of it as a fragment of
the larger framework of operational-probabilistic theories [8,11–13], in which systems can be freely
composed and measurements are explicitly described. The other way is to regard our framework as
a dynamicist framework, meant to describe physical systems per se, independently of any observer.
Both approaches are potentially fruitful.
On the operational-probabilistic side, it is interesting to see how the definition of subsystem
adopted in this paper interacts with probabilities. For example, we have seen in a few examples that
the state space of a subsystem is not always convex: convex combinations of allowed states are not
necessarily allowed states. It is then natural to ask: under which condition is convexity retrieved?
In a different context, the non-trivial relation between convexity and the dynamical notion of system
has emerged in a work of Galley and Masanes [80]. There, the authors studied alternatives to
quantum theory where the closed systems have the same states and the same dynamics as closed
quantum systems, while the measurements are different from the quantum measurements. Among
these theories, they found that quantum theory is the only theory where subsystems have a convex
state space. These and similar clues are an indication that the interplay between dynamical notions
and probabilistic notions plays an important role in determining the structure of physical theories.
Studying this interplay is a promising avenue of future research.
On the opposite end of the spectrum, it is interesting to explore how far the measurement-free
approach can reach. An interesting research project is to analyze the notions of subsystem, pure
state, and purification, in the context of algebraic quantum field theory [22] and quantum statistical
mechanics [32]. This is important because the notion of pure state as an extreme point of the convex
set breaks down for type III von Neumann algebras [81], whereas the notions used in this paper
(commutativity of operations, cyclicity of states) would still hold. Another promising clue is the
existence of dual pairs of non-overlapping agents, which amounts to the requirement that the set
of operations of each agent has trivial center and coincides with its double commutant. A similar
condition plays an important role in the algebraic framework, where the operator algebras with trivial
center are known as factors, and are at the basis of the theory of von Neumann algebras [82,83].
Finally, another interesting direction is to enrich the structure of system with additional features,
such as a metric, quantifying the proximity of states. In particular, one may consider a strengthened
formulation of the Conservation of Information, in which the physical transformations are required
not only to be invertible, but also to preserve the distances. It is then interesting to consider how the
metric on the pure states of the whole system induces a metric on the subsystems, and to search for
relations between global metric and local metric. Also in this case, there is a promising precedent,
namely the work of Uhlmann [84], which led to the notion of fidelity [85]. All these potential avenues
of future research suggest that the notions investigated in this work may find application in a variety
of different contexts, and for a variety of interpretational standpoints.
Acknowledgments: It is a pleasure to thank Gilles Brassard and Paul Raymond-Robichaud for stimulating
discussions on their recent work [66], Adán Cabello, Markus Müller, and Matthias Kleinmann for providing
motivation to the problem of deriving subsystems, Mauro D’Ariano and Paolo Perinotti for the invitation to
contribute to this Special Issue, and Christopher Timpson and Adam Coulton for an invitation to present at the
Oxford Philosophy of Physics Seminar Series, whose engaging atmosphere stimulated me to think about extensions
of the Purification Principle. I am also grateful to the three referees of this paper for useful suggestions, and to
Robert Spekkens, Doreen Fraser, Lídia del Rio, Thomas Galley, John Selby, Ryszard Kostecki, and David Schmidt
for interesting discussions during the revision of the original manuscript. This work is supported by the
Foundational Questions Institute through grant FQXi-RFP3-1325, the National Natural Science Foundation
of China through grant 11675136, the Croucher Foundation, the Canadian Institute for Advanced Research
(CIFAR), and the Hong Kong Research Grant Council through grant 17326616. This publication was made possible
through the support of a grant from the John Templeton Foundation. The opinions expressed in this publication
are those of the authors and do not necessarily reflect the views of the John Templeton Foundation. The authors
also acknowledge the hospitality of Perimeter Institute for Theoretical Physics. Research at Perimeter Institute
is supported by the Government of Canada through the Department of Innovation, Science and Economic
Development Canada and by the Province of Ontario through the Ministry of Research, Innovation and Science.
Conflicts of Interest: The author declares no conflict of interest.
Proposition A1. If the transformations S , S̃, T , T̃ ∈ Act( A; S) are such that [S] A = [S̃] A and [T ] A =
[T̃ ] A , then [S ◦ T ] A = [S̃ ◦ T̃ ] A .
Proof. Let (S1 , S2 , . . . , Sm ) ⊂ Act( A; S) and (T1 , T2 , . . . , Tn ) ⊂ Act( A; S) be two finite sequences
such that
Without loss of generality, we assume that the two finite sequences have the same length m = n.
When this is not the case, one can always add dummy entries and ensure that the two sequences have
the same length: for example, if m < n, one can always define Si := Sm for all i ∈ {m + 1, . . . , n}.
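Equation (A1) means that for every $i$ and $j$ there exist transformations $\mathcal{B}_i, \widetilde{\mathcal{B}}_i, \mathcal{C}_j, \widetilde{\mathcal{C}}_j \in \mathrm{Act}(A;S)'$ such that
$$\mathcal{B}_i \circ \mathcal{S}_i = \widetilde{\mathcal{B}}_i \circ \mathcal{S}_{i+1} , \qquad \mathcal{C}_j \circ \mathcal{T}_j = \widetilde{\mathcal{C}}_j \circ \mathcal{T}_{j+1} . \tag{A2}$$
Using the above equalities for $i = j$, and using the fact that transformations in $\mathrm{Act}(A;S)$ commute with transformations in $\mathrm{Act}(A;S)'$, we obtain
$$\begin{aligned}
(\mathcal{B}_i \circ \mathcal{C}_i) \circ (\mathcal{S}_i \circ \mathcal{T}_i) &= (\mathcal{B}_i \circ \mathcal{S}_i) \circ (\mathcal{C}_i \circ \mathcal{T}_i) \\
&= (\widetilde{\mathcal{B}}_i \circ \mathcal{S}_{i+1}) \circ (\widetilde{\mathcal{C}}_i \circ \mathcal{T}_{i+1}) \\
&= (\widetilde{\mathcal{B}}_i \circ \widetilde{\mathcal{C}}_i) \circ (\mathcal{S}_{i+1} \circ \mathcal{T}_{i+1}) .
\end{aligned} \tag{A3}$$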
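$$\mathrm{Tr}_B\big[\mathcal{C}(|\alpha\rangle\langle\alpha| \otimes |\beta\rangle\langle\beta|)\big] = |\alpha\rangle\langle\alpha| . \tag{A7}$$
The above relation implies that the state $\mathcal{C}(|\alpha\rangle\langle\alpha| \otimes |\beta\rangle\langle\beta|)$ is of the form
$$\mathcal{C}(|\alpha\rangle\langle\alpha| \otimes |\beta\rangle\langle\beta|) = |\alpha\rangle\langle\alpha| \otimes \mathcal{B}(|\beta\rangle\langle\beta|) , \tag{A8}$$
for some suitable channel $\mathcal{B} \in \mathrm{Chan}(B)$. Since $|\alpha\rangle$ and $|\beta\rangle$ are arbitrary, we obtain $\mathcal{C} = \mathcal{I}_A \otimes \mathcal{B}$.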
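Theorem A1. Let $\mathsf{A}$ be a von Neumann subalgebra of $M_d(\mathbb{C})$, $d < \infty$, and let $\mathrm{Chan}(\mathsf{A})$ be the set of quantum channels with Kraus operators in $\mathsf{A}$. Then, the commutant of $\mathrm{Chan}(\mathsf{A})$ is the set of channels with Kraus operators in the algebra $\mathsf{A}'$. In formula,
$$\mathrm{Chan}(\mathsf{A})' = \mathrm{Chan}(\mathsf{A}') . \tag{A9}$$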
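Lemma A1. Every channel $\mathcal{D} \in \mathrm{Chan}(\mathsf{A})'$ must satisfy the condition
$$\mathcal{P}_l \circ \mathcal{D} \circ \mathcal{P}_k = 0 \qquad \forall l \neq k , \tag{A10}$$
where $\mathcal{P}_k$ is the CP map $\mathcal{P}_k(\cdot) := \Pi_k \cdot \Pi_k$, and $\Pi_k$ is the projector on the subspace $\mathcal{H}_{A_k} \otimes \mathcal{H}_{B_k}$ in Equation (31).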
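where each $|\alpha_k\rangle$ is a generic (but otherwise fixed) unit vector in $\mathcal{H}_{A_k}$ and $\mathcal{I}_{B_k}$ is the identity map on $\mathrm{Lin}(\mathcal{H}_{B_k})$. By definition, every channel $\mathcal{D} \in \mathrm{Chan}(\mathsf{A})'$ must satisfy the condition $\mathcal{C} \circ \mathcal{D} = \mathcal{D} \circ \mathcal{C}$. In particular, we must have
$$\begin{aligned}
\mathcal{D}(|\alpha_k\rangle\langle\alpha_k| \otimes |\beta_k\rangle\langle\beta_k|) &= (\mathcal{D} \circ \mathcal{C})(|\alpha_k\rangle\langle\alpha_k| \otimes |\beta_k\rangle\langle\beta_k|) \\
&= (\mathcal{C} \circ \mathcal{D})(|\alpha_k\rangle\langle\alpha_k| \otimes |\beta_k\rangle\langle\beta_k|) \\
&= \bigoplus_l |\alpha_l\rangle\langle\alpha_l| \otimes \mathrm{Tr}_{A_l}\big[(\mathcal{P}_l \circ \mathcal{D})(|\alpha_k\rangle\langle\alpha_k| \otimes |\beta_k\rangle\langle\beta_k|)\big] .
\end{aligned} \tag{A12}$$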
Applying the CP map Pl on both sides of the above equality, we obtain the relation
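Lemma A2. Every channel $\mathcal{D} \in \mathrm{Chan}(\mathsf{A})'$ must satisfy the conditions
$$\mathcal{D} \circ \mathcal{P}_k = \mathcal{P}_k \circ \mathcal{D} \circ \mathcal{P}_k \qquad \forall k \tag{A15}$$
and
$$\mathcal{P}_k \circ \mathcal{D} = \mathcal{P}_k \circ \mathcal{D} \circ \mathcal{P}_k \qquad \forall k . \tag{A16}$$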
Thanks to Lemma A1, we know the right-hand side is 0 unless $i = j = k$. Since the vector $|\phi\rangle$ is arbitrary, the condition $|\langle\phi|\Pi_i \mathcal{D}_k(\rho) \Pi_j|\phi\rangle| = 0$ implies the relation $\Pi_i \mathcal{D}_k(\rho) \Pi_j = 0$. Using this fact, we obtain the relation
$$\begin{aligned}
(\mathcal{D} \circ \mathcal{P}_k)(\rho) &= \mathcal{D}_k(\rho) \\
&= \sum_{i,j} \Pi_i \mathcal{D}_k(\rho) \Pi_j \\
&= \Pi_k \mathcal{D}_k(\rho) \Pi_k \\
&= (\mathcal{P}_k \circ \mathcal{D} \circ \mathcal{P}_k)(\rho) ,
\end{aligned} \tag{A18}$$
valid for arbitrary density matrices $\rho$, and therefore for arbitrary matrices in $M_d(\mathbb{C})$. In conclusion, Equation (A15) holds.
The proof of Equation (A16) is analogous to that of Equation (A15), with the only difference that it uses the adjoint map, which for a generic linear map $\mathcal{L} : \mathrm{Lin}(\mathcal{H}_S) \to \mathrm{Lin}(\mathcal{H}_S)$ is defined by the relation
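where the right-hand side is 0 unless $i = j = k$ (cf. Lemma A2). Since the condition $|\langle\phi| \widetilde{\mathcal{D}}_k(\Pi_i \rho \Pi_j) |\phi\rangle| = 0$, $\forall |\phi\rangle \in \mathcal{H}_S$, implies the condition $\widetilde{\mathcal{D}}_k(\Pi_i \rho \Pi_j) = 0$, we obtain the relation
$$\widetilde{\mathcal{D}}_k(\Pi_i \rho \Pi_j) = 0 \qquad \text{unless } i = j = k . \tag{A21}$$
$$\begin{aligned}
(\mathcal{P}_k \circ \mathcal{D})(\rho) &= \widetilde{\mathcal{D}}_k(\rho) \\
&= \sum_{i,j} \widetilde{\mathcal{D}}_k(\Pi_i \rho \Pi_j) \\
&= (\widetilde{\mathcal{D}}_k \circ \mathcal{P}_k)(\rho) \\
&= (\mathcal{P}_k \circ \mathcal{D} \circ \mathcal{P}_k)(\rho) .
\end{aligned} \tag{A22}$$
Since the equality holds for every $\rho$, this proves Equation (A16).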
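Lemma A2 guarantees that the linear map $\mathcal{D} \circ \mathcal{P}_k$ sends $\mathrm{Lin}(\mathcal{R}_k \otimes \mathcal{M}_k)$ into itself. It is also easy to see that the map $\mathcal{D} \circ \mathcal{P}_k$ has a simple form:
$$\mathcal{D} \circ \mathcal{P}_k = (\mathcal{I}_{A_k} \otimes \mathcal{B}_k) \circ \mathcal{P}_k \qquad \forall k , \tag{A23}$$
where $\mathcal{I}_{A_k}$ is the identity map from $\mathrm{Lin}(\mathcal{H}_{A_k})$ to itself, and $\mathcal{B}_k$ is a quantum channel from $\mathrm{Lin}(\mathcal{H}_{B_k})$ to itself.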
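Lemma A4. For every channel $\mathcal{D} \in \mathrm{Chan}(\mathsf{A})'$, the adjoint $\mathcal{D}^\dagger$ preserves the elements of the algebra $\mathsf{A}$, namely $\mathcal{D}^\dagger(C) = C$ for all $C \in \mathsf{A}$.
Proof. Let $C$ be a generic element of $\mathsf{A}$. By Equation (31), one has the equality
$$C = \bigoplus_k (C_k \otimes I_{B_k}) = \bigoplus_k \mathcal{P}_k(C) . \tag{A24}$$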
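having used Lemma A3 in the last equality. Then, we use the fact that the channel $\mathcal{B}_k$ is trace-preserving, and therefore its adjoint $\mathcal{B}_k^\dagger$ preserves the identity. Using this fact, we can continue the chain of equalities as
$$\begin{aligned}
\mathrm{Tr}[\mathcal{D}^\dagger(C)\, \rho] &= \sum_k \mathrm{Tr}\big\{ [C_k \otimes \mathcal{B}_k^\dagger(I_{B_k})]\, \mathcal{P}_k(\rho) \big\} \\
&= \sum_k \mathrm{Tr}\big[ (C_k \otimes I_{B_k})\, \mathcal{P}_k(\rho) \big] \\
&= \sum_k \mathrm{Tr}\big[ \mathcal{P}_k(C_k \otimes I_{B_k})\, \rho \big] \\
&= \mathrm{Tr}\Big[ \Big( \bigoplus_k C_k \otimes I_{B_k} \Big) \rho \Big] \\
&= \mathrm{Tr}[C \rho] ,
\end{aligned} \tag{A26}$$
having used Equation (A24) in the last equality. Since the equality holds for every density matrix $\rho$, we have proved the equality $\mathcal{D}^\dagger(C) = C$.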
Proof of Theorem A1. Let $\mathcal{D}$ be a quantum channel in $\mathrm{Chan}(\mathsf{A})'$. Then, Lemma A4 guarantees that the adjoint $\mathcal{D}^\dagger$ preserves all operators in the algebra $\mathsf{A}$. Then, a result due to Lindblad [86] guarantees that all the Kraus operators of $\mathcal{D}$ belong to the algebra $\mathsf{A}'$. This proves the inclusion $\mathrm{Chan}(\mathsf{A})' \subseteq \mathrm{Chan}(\mathsf{A}')$.
The converse inclusion is immediate: if a channel $\mathcal{D}$ belongs to $\mathrm{Chan}(\mathsf{A}')$, it commutes with all channels in $\mathrm{Chan}(\mathsf{A})$ thanks to the block diagonal form of the Kraus operators (cf. Equations (32) and (33)).
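As an illustration of Lemma A4 and of the easy inclusion in Theorem A1, the following minimal sketch checks both statements numerically in the simplest single-block case, assuming two qubits with the algebra taken to be all operators of the form $X \otimes I$. The specific channels (a bit flip on the first factor, amplitude damping on the second), the parameters `p` and `g`, and the test state are illustrative assumptions, not taken from the text.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def kron(a, b):
    """Kronecker product: first factor is system A, second is system B."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def scale(c, a):
    return [[c * v for v in row] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def channel(kraus, rho):
    """Schroedinger picture: rho -> sum_i K_i rho K_i^dagger."""
    out = [[0j] * len(rho) for _ in rho]
    for K in kraus:
        out = add(out, matmul(matmul(K, rho), dagger(K)))
    return out

def adjoint(kraus, x):
    """Heisenberg picture: X -> sum_i K_i^dagger X K_i."""
    out = [[0j] * len(x) for _ in x]
    for K in kraus:
        out = add(out, matmul(matmul(dagger(K), x), K))
    return out

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
p, g = 0.25, 0.3  # illustrative noise strengths (assumed values)

# Channel with Kraus operators in the algebra: bit flip on the first factor.
C_kraus = [kron(scale(math.sqrt(1 - p), I2), I2), kron(scale(math.sqrt(p), X), I2)]
# Channel with Kraus operators in the commutant: amplitude damping on the second factor.
D_kraus = [kron(I2, [[1, 0], [0, math.sqrt(1 - g)]]),
           kron(I2, [[0, math.sqrt(g)], [0, 0]])]

# A non-product pure test state on two qubits.
psi = [[1], [2j], [0], [1]]
norm = sum(abs(v[0]) ** 2 for v in psi)
rho = scale(1 / norm, matmul(psi, dagger(psi)))

# Easy inclusion of Theorem A1: the two channels commute.
channels_commute = close(channel(C_kraus, channel(D_kraus, rho)),
                         channel(D_kraus, channel(C_kraus, rho)))

# Lemma A4: the adjoint of D fixes every operator of the form C1 (x) I.
C_op = kron([[1, 2j], [-2j, 3]], I2)
adjoint_preserves_algebra = close(adjoint(D_kraus, C_op), C_op)
```

Both flags evaluate to true here because every Kraus operator $A_i \otimes I$ commutes with every Kraus operator $I \otimes K_j$; the general block-diagonal case repeats the same argument block by block.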
Appendix C.2. States of Subsystems Associated to Finite Dimensional von Neumann Algebras
Here, we provide the proof of Proposition 5, adopting the notation $\mathsf{B} := \mathsf{A}'$.
The proof uses the following lemma:
Lemma A5 (No signalling condition). For every channel $\mathcal{D} \in \mathrm{Chan}(\mathsf{B})$, one has $\mathrm{Tr}_B \circ \mathcal{D} = \mathrm{Tr}_B$.
where the second equality follows from Lemma A3, and the third equality follows from the fact that
Bk is trace-preserving.
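The no-signalling condition of Lemma A5 can be checked numerically in a small case. The sketch below assumes a single block with two qubits, so that the partial trace over B is the ordinary partial trace over the second qubit; the amplitude-damping channel and the test state are illustrative choices, not taken from the text.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def kron(a, b):
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def channel(kraus, rho):
    """rho -> sum_i K_i rho K_i^dagger."""
    n = len(rho)
    out = [[0j] * n for _ in range(n)]
    for K in kraus:
        term = matmul(matmul(K, rho), dagger(K))
        out = [[out[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return out

def ptrace_B(rho, dA, dB):
    """Partial trace over the second tensor factor."""
    return [[sum(rho[i * dB + k][j * dB + k] for k in range(dB))
             for j in range(dA)] for i in range(dA)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

I2 = [[1, 0], [0, 1]]
g = 0.3  # damping strength (assumed value)

# Channel acting only on B: Kraus operators I (x) K_j.
D_kraus = [kron(I2, [[1, 0], [0, math.sqrt(1 - g)]]),
           kron(I2, [[0, math.sqrt(g)], [0, 0]])]

psi = [[1], [2j], [0], [1]]
norm = sum(abs(v[0]) ** 2 for v in psi)
rho = [[v / norm for v in row] for row in matmul(psi, dagger(psi))]

rho_after = channel(D_kraus, rho)
marginal_unchanged = close(ptrace_B(rho_after, 2, 2), ptrace_B(rho, 2, 2))
global_state_changed = not close(rho_after, rho)
```

The global state changes under the channel, while the marginal on A does not, which is the content of the lemma in this special case.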
Proof of Proposition 5. Suppose that ρ and σ are equivalent for A. By definition, this means that there
exists a finite sequence (ρ1 , ρ2 , . . . , ρn ) such that
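The condition of non-trivial intersection implies that, for every $i \in \{1, 2, \dots, n-1\}$, one has
$$\mathcal{D}_i(\rho_i) = \widetilde{\mathcal{D}}_i(\rho_{i+1}) , \tag{A30}$$
where $\mathcal{D}_i$ and $\widetilde{\mathcal{D}}_i$ are two quantum channels in $\mathrm{Chan}(\mathsf{B})$. Tracing over $B$ on both sides we obtain the relation
$$(\mathrm{Tr}_B \circ \mathcal{D}_i)(\rho_i) = (\mathrm{Tr}_B \circ \widetilde{\mathcal{D}}_i)(\rho_{i+1}) , \tag{A31}$$
and, thanks to Lemma A5, $\mathrm{Tr}_B[\rho_i] = \mathrm{Tr}_B[\rho_{i+1}]$. Since the equality holds for every $i \in \{1, \dots, n-1\}$, we obtain the condition $\mathrm{Tr}_B[\rho] = \mathrm{Tr}_B[\sigma]$. In summary, if two states $\rho$ and $\sigma$ are equivalent for $\mathsf{A}$, then $\mathrm{Tr}_B[\rho] = \mathrm{Tr}_B[\sigma]$.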
To prove the converse, it is enough to define the channel $\mathcal{D}_0 \in \mathrm{Chan}(\mathsf{B})$ as
$$\mathcal{D}_0(\rho) := \bigoplus_k \mathrm{Tr}_{B_k}[\mathcal{P}_k(\rho)] \otimes \beta_k , \tag{A32}$$
where each β k is a fixed (but otherwise generic) density matrix in Lin(H Bk ). Now, if the equality
TrB [ρ] = TrB [σ ] holds, then also the equality D0 (ρ) = D0 (σ) holds. This proves that the intersection
between Deg B (ρ) and Deg B (σ) is non-empty, and therefore ρ and σ are equivalent for A.
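The role of the channel in Equation (A32) can be illustrated in the single-block case, where it reduces to tracing out B and re-preparing a fixed state $\beta$. The sketch below assumes two qubits; the Bell state, the product state and the matrix `beta` are illustrative choices. Two different global states with the same marginal on A are mapped to the same output, so their degradation sets intersect.

```python
def kron(a, b):
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def ptrace_B(rho, dA, dB):
    """Partial trace over the second tensor factor."""
    return [[sum(rho[i * dB + k][j * dB + k] for k in range(dB))
             for j in range(dA)] for i in range(dA)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

beta = [[0.7, 0.0], [0.0, 0.3]]  # fixed density matrix on B (assumed value)

def D0(rho):
    """Single-block version of (A32): discard B, then prepare beta."""
    return kron(ptrace_B(rho, 2, 2), beta)

# Maximally entangled state (|00> + |11>)/sqrt(2).
bell = [[0.5, 0, 0, 0.5],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
        [0.5, 0, 0, 0.5]]
# Product state (I/2) (x) |0><0|.
product = kron([[0.5, 0], [0, 0.5]], [[1, 0], [0, 0]])

same_marginal = close(ptrace_B(bell, 2, 2), ptrace_B(product, 2, 2))
same_output = close(D0(bell), D0(product))
different_states = not close(bell, product)
```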
Appendix C.3. Transformations of Subsystems Associated to Finite Dimensional von Neumann Algebras
Here, we prove that all transformations of system $S_A$ are of the form $\mathcal{A} = \bigoplus_k \mathcal{A}_k$, where each $\mathcal{A}_k$ is a quantum channel from $\mathrm{Lin}(\mathcal{H}_{A_k})$ to itself. The proof is based on the following lemmas:
Lemma A6. For every channel $\mathcal{C} \in \mathrm{Chan}(\mathsf{A})$, one has the relation
$$\mathcal{P}_k \circ \mathcal{C} = (\mathcal{A}_k \otimes \mathcal{I}_{B_k}) \circ \mathcal{P}_k , \tag{A33}$$
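Proof. Let
$$\mathcal{C}(\rho) = \sum_i C_i \rho C_i^\dagger , \qquad C_i = \bigoplus_k (C_{ik} \otimes I_{B_k}) \tag{A34}$$
be a Kraus representation of channel $\mathcal{C}$. The preservation of the trace amounts to the condition
$$I = \sum_i C_i^\dagger C_i = \bigoplus_k \Big( \sum_i C_{ik}^\dagger C_{ik} \otimes I_{B_k} \Big) , \tag{A35}$$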
which implies
Now, we have
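Since the density matrix $\rho$ in Equation (A37) is arbitrary, we have proved the relation $\mathcal{P}_k \circ \mathcal{C} = (\mathcal{A}_k \otimes \mathcal{I}_{B_k}) \circ \mathcal{P}_k$.
Lemma A7. For two channels $\mathcal{C}, \mathcal{C}' \in \mathrm{Chan}(\mathsf{A})$, let $\mathcal{A}_k$ and $\mathcal{A}'_k$ be the quantum channels defined in Lemma A6. Then, the following are equivalent:
1. $\mathrm{Tr}_B \circ \mathcal{C} = \mathrm{Tr}_B \circ \mathcal{C}'$,
2. $\mathcal{A}_k = \mathcal{A}'_k$ for every $k$.
Clearly, if $\mathcal{A}_k$ and $\mathcal{A}'_k$ are equal for every $k$, then the partial traces $\mathrm{Tr}_B \circ \mathcal{C}$ and $\mathrm{Tr}_B \circ \mathcal{C}'$ are equal.
1 ⟹ 2. Suppose that the partial traces $\mathrm{Tr}_B \circ \mathcal{C}$ and $\mathrm{Tr}_B \circ \mathcal{C}'$ are equal. Then, Equations (A39) and (A40) imply the equality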
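In turn, the above equality implies $\mathcal{A}_k = \mathcal{A}'_k$, $\forall k$, as one can easily verify by applying both sides of Equation (A41) to a generic product operator $X_k \otimes Y_k$, with $X_k \in \mathrm{Lin}(\mathcal{H}_{A_k})$ and $Y_k \in \mathrm{Lin}(\mathcal{H}_{B_k})$.
Lemma A8. Two channels $\mathcal{C}, \mathcal{C}' \in \mathrm{Chan}(\mathsf{A})$ are equivalent for $\mathsf{A}$ if and only if $\mathrm{Tr}_B \circ \mathcal{C} = \mathrm{Tr}_B \circ \mathcal{C}'$.
Proof. Suppose that $\mathcal{C}$ and $\mathcal{C}'$ are equivalent for $\mathsf{A}$. By definition, this means that there exists a finite sequence $(\mathcal{C}_1, \mathcal{C}_2, \dots, \mathcal{C}_n) \subset \mathrm{Chan}(\mathsf{A})$ such that
$$\mathcal{D}_i \circ \mathcal{C}_i = \widetilde{\mathcal{D}}_i \circ \mathcal{C}_{i+1} . \tag{A43}$$
$$\mathrm{Tr}_B \circ \mathcal{D}_i \circ \mathcal{C}_i = \mathrm{Tr}_B \circ \widetilde{\mathcal{D}}_i \circ \mathcal{C}_{i+1} , \tag{A44}$$
Since the above relation holds for every $i$, we obtain the equality $\mathrm{Tr}_B \circ \mathcal{C} = \mathrm{Tr}_B \circ \mathcal{C}'$.
Conversely, suppose that $\mathrm{Tr}_B \circ \mathcal{C} = \mathrm{Tr}_B \circ \mathcal{C}'$. Then, Lemma A7 implies the equality
$$\mathcal{A}_k = \mathcal{A}'_k \qquad \forall k , \tag{A46}$$
where $\mathcal{A}_k$ and $\mathcal{A}'_k$ are the quantum channels defined in Lemma A6.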
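Now, let $\mathcal{D}_0$ be the channel in $\mathrm{Chan}(\mathsf{B})$ defined in Equation (A32). By definition, we have
$$\begin{aligned}
\mathcal{D}_0 \circ \mathcal{C} &= \bigoplus_k (\mathcal{I}_{A_k} \otimes \beta_k \mathrm{Tr}_{B_k}) \circ \mathcal{P}_k \circ \mathcal{C} \\
&= \bigoplus_k (\mathcal{I}_{A_k} \otimes \beta_k \mathrm{Tr}_{B_k}) \circ (\mathcal{A}_k \otimes \mathcal{I}_{B_k}) \circ \mathcal{P}_k \\
&= \bigoplus_k (\mathcal{A}_k \otimes \beta_k \mathrm{Tr}_{B_k}) \circ \mathcal{P}_k .
\end{aligned} \tag{A47}$$
Similarly, we have
$$\mathcal{D}_0 \circ \mathcal{C}' = \bigoplus_k (\mathcal{A}'_k \otimes \beta_k \mathrm{Tr}_{B_k}) \circ \mathcal{P}_k . \tag{A48}$$
Since $\mathcal{A}_k$ and $\mathcal{A}'_k$ are equal for every $k$, we conclude that $\mathcal{D}_0 \circ \mathcal{C}$ is equal to $\mathcal{D}_0 \circ \mathcal{C}'$. This means that the intersection between $\mathrm{Deg}(\mathcal{C})$ and $\mathrm{Deg}(\mathcal{C}')$ is non-empty, and, therefore, $\mathcal{C}$ is equivalent to $\mathcal{C}'$ modulo $\mathsf{B}$.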
Corollary A1. For two channels $\mathcal{C}, \mathcal{C}' \in \mathrm{Chan}(\mathsf{A})$, let $\mathcal{A}_k$ and $\mathcal{A}'_k$ be the quantum channels defined in Lemma A6. Then, the following are equivalent:
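Proof. By Lemma A8, $\mathcal{C}$ and $\mathcal{C}'$ are equivalent for $\mathsf{A}$ if and only if the condition $\mathrm{Tr}_B \circ \mathcal{C} = \mathrm{Tr}_B \circ \mathcal{C}'$ holds. By Lemma A7, the condition $\mathrm{Tr}_B \circ \mathcal{C} = \mathrm{Tr}_B \circ \mathcal{C}'$ holds if and only if one has $\mathcal{A}_k = \mathcal{A}'_k$ for every $k$. In turn, the latter condition holds if and only if the equality $\bigoplus_k \mathcal{A}_k = \bigoplus_k \mathcal{A}'_k$ holds.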
where Chan( Ak ) is the set of all quantum channels from Lin(H Ak ) to itself.
To conclude, we observe that the transformations of S A act in the expected way. To this purpose,
we consider the restriction map
$$\pi_A : \mathrm{Chan}(\mathsf{A}) \to \bigoplus_k \mathrm{Chan}(A_k) , \qquad \mathcal{C} \mapsto \bigoplus_k \mathcal{A}_k , \tag{A50}$$
In words, evolving system S with C and then computing the local state of system S A is the same as
computing the local state of system S A and then evolving it with πA (C).
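The statement that evolving and then restricting equals restricting and then evolving can be checked numerically in the single-block case. The sketch below assumes two qubits and a channel whose Kraus operators have the form $A_i \otimes I$, so that its restriction is simply the channel with Kraus operators $A_i$; the amplitude-damping operators and the test state are illustrative assumptions.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def kron(a, b):
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def channel(kraus, rho):
    """rho -> sum_i K_i rho K_i^dagger."""
    n = len(rho)
    out = [[0j] * n for _ in range(n)]
    for K in kraus:
        term = matmul(matmul(K, rho), dagger(K))
        out = [[out[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return out

def ptrace_B(rho, dA, dB):
    return [[sum(rho[i * dB + k][j * dB + k] for k in range(dB))
             for j in range(dA)] for i in range(dA)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

I2 = [[1, 0], [0, 1]]
g = 0.4  # damping strength (assumed value)

# Local Kraus operators A_i; these play the role of the restricted channel.
A_kraus = [[[1, 0], [0, math.sqrt(1 - g)]], [[0, math.sqrt(g)], [0, 0]]]
# Global channel with Kraus operators A_i (x) I.
C_kraus = [kron(A, I2) for A in A_kraus]

psi = [[1], [2j], [0], [1]]
norm = sum(abs(v[0]) ** 2 for v in psi)
rho = [[v / norm for v in row] for row in matmul(psi, dagger(psi))]

evolve_then_restrict = ptrace_B(channel(C_kraus, rho), 2, 2)
restrict_then_evolve = channel(A_kraus, ptrace_B(rho, 2, 2))
diagram_commutes = close(evolve_then_restrict, restrict_then_evolve)
```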
Proposition A3. For every pair of channels C1 , C2 ∈ Chan(A), we have the homomorphism relation
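$$\begin{aligned}
(\mathcal{A}_{12,k} \otimes \mathcal{I}_{B_k}) \circ \mathcal{P}_k &= \mathcal{P}_k \circ \mathcal{C}_1 \circ \mathcal{C}_2 \\
&= (\mathcal{A}_{1k} \otimes \mathcal{I}_{B_k}) \circ \mathcal{P}_k \circ \mathcal{C}_2 \\
&= (\mathcal{A}_{1k} \otimes \mathcal{I}_{B_k}) \circ (\mathcal{A}_{2k} \otimes \mathcal{I}_{B_k}) \circ \mathcal{P}_k \\
&= \big[ (\mathcal{A}_{1k} \circ \mathcal{A}_{2k}) \otimes \mathcal{I}_{B_k} \big] \circ \mathcal{P}_k \qquad \forall k .
\end{aligned} \tag{A55}$$
From the above equation, we obtain the equality $\mathcal{A}_{12,k} = \mathcal{A}_{1k} \circ \mathcal{A}_{2k}$ for all $k$. In turn, this equality implies the desired result:
$$\begin{aligned}
\pi_A(\mathcal{C}_1) \circ \pi_A(\mathcal{C}_2) &= \Big( \bigoplus_k \mathcal{A}_{1k} \Big) \circ \Big( \bigoplus_l \mathcal{A}_{2l} \Big) \\
&= \bigoplus_k \mathcal{A}_{1k} \circ \mathcal{A}_{2k} \\
&= \bigoplus_k \mathcal{A}_{12,k} \\
&= \pi_A(\mathcal{C}_1 \circ \mathcal{C}_2) .
\end{aligned} \tag{A56}$$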
Proof. Every unitary channel of the form $\mathcal{U}_\theta = U_\theta \cdot U_\theta^\dagger$ is basis-preserving, and therefore every
channel C in the commutant of BPres(S) must commute with it. By definition, this means that C is
multiphase covariant.
$$\mathcal{M}(\rho) = \sum_{i=1}^{r} M_i \rho M_i^\dagger + \sum_{k=1}^{d} \sum_{j \neq k} p(j|k)\, |j\rangle\langle k| \rho |k\rangle\langle j| , \tag{A57}$$
where each operator $M_i$ is diagonal in the computational basis, and each $p(j|k)$ is non-negative.
Proof. Let M ∈ Lin(HS ⊗ HS ) be the Choi operator of channel M. For a multiphase covariant channel,
the Choi operator must satisfy the commutation relation [87,88]
where the $d \times d$ matrix $[\Gamma_{s,t}] := [M_{ss,tt}]_{s,t \in \{1,\dots,d\}}$ is positive semidefinite and each coefficient $M_{st,st}$ is non-negative. Then, Equation (A57) follows from diagonalizing the matrix $\Gamma$ and using the relation $\mathcal{M}(\rho) = \mathrm{Tr}[M (I \otimes \rho^T)]$, where $\rho^T$ is the transpose of $\rho$ in the computational basis.
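A channel of the form in Equation (A57) can be written down explicitly for a qubit and tested for multiphase covariance. The following is a minimal sketch under stated assumptions: one diagonal Kraus operator, jump probabilities `a` and `b` chosen arbitrarily, basis labels starting from 0 rather than 1, and a single pair of phases used for the check.

```python
import cmath
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def channel(kraus, rho):
    """rho -> sum_i K_i rho K_i^dagger."""
    n = len(rho)
    out = [[0j] * n for _ in range(n)]
    for K in kraus:
        term = matmul(matmul(K, rho), dagger(K))
        out = [[out[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return out

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

a, b = 0.2, 0.5  # p(1|0) and p(0|1), assumed values

kraus = [
    [[math.sqrt(1 - a), 0], [0, math.sqrt(1 - b)]],  # diagonal Kraus operator
    [[0, 0], [math.sqrt(a), 0]],                     # sqrt(p(1|0)) |1><0|
    [[0, math.sqrt(b)], [0, 0]],                     # sqrt(p(0|1)) |0><1|
]

def phase_unitary(thetas):
    """Diagonal unitary with phases exp(i theta_k) in the computational basis."""
    d = len(thetas)
    return [[cmath.exp(1j * thetas[i]) if i == j else 0 for j in range(d)]
            for i in range(d)]

rho = [[0.6, 0.2 + 0.1j], [0.2 - 0.1j, 0.4]]
U = phase_unitary([0.7, -1.3])

out = channel(kraus, rho)
rotate_then_apply = channel(kraus, matmul(matmul(U, rho), dagger(U)))
apply_then_rotate = matmul(matmul(U, out), dagger(U))

is_covariant = close(rotate_then_apply, apply_then_rotate)
trace_out = sum(out[i][i] for i in range(2))
```

The diagonal Kraus operator commutes with the phase unitary, and the jump terms only see the diagonal entries of the input, which is why the two orders agree.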
From Equation (A57), one can show every multiphase covariant channel commutes with every
basis-preserving channel:
Proof. Let B ∈ BPres(S) be a generic basis-preserving channel, and let M ∈ MultiPCov(S) be a generic
multiphase covariant channel. Using the characterization of Equation (A57), we obtain
The second equality used the fact that the Kraus operators of $\mathcal{B}$ are diagonal in the computational basis [71,72] and therefore commute with each operator $M_i$. The third equality uses the relation $\langle k|\mathcal{B}(\rho)|k\rangle = \langle k|\rho|k\rangle$, following from the fact that $\mathcal{B}$ preserves the computational basis [71,72].
Summarizing, we have shown that the multiphase covariant channels are the commutant of the
basis-preserving channels:
Proof. A special case of multiphase covariant channel is the erasure channel $\mathcal{M}_k$ defined by $\mathcal{M}_k(\rho) = |k\rangle\langle k|$ for every $\rho \in \mathrm{Lin}(S)$. For a generic channel $\mathcal{C} \in \mathrm{MultiPCov}(S)'$, one must have
Since the above condition must hold for every $k$, the channel $\mathcal{C}$ must be basis-preserving.
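$$\langle k| \mathcal{M}(|j\rangle\langle j|) |k\rangle = \langle k| \widetilde{\mathcal{M}}(|j\rangle\langle j|) |k\rangle , \tag{A63}$$
then $[\mathcal{M}]_A = [\widetilde{\mathcal{M}}]_A$.
Likewise, we have
$$\mathcal{D} \circ \widetilde{\mathcal{M}}(\rho) = \sum_{j,k} \langle j|\rho|j\rangle\, \langle k| \widetilde{\mathcal{M}}(|j\rangle\langle j|) |k\rangle\, |k\rangle\langle k| . \tag{A65}$$
Proof. If $\mathcal{M}$ and $\widetilde{\mathcal{M}}$ are in the same equivalence class, then there exists a finite sequence $(\mathcal{M}_1, \mathcal{M}_2, \dots, \mathcal{M}_n)$ such that
$$\mathcal{M}_1 = \mathcal{M} , \quad \mathcal{M}_n = \widetilde{\mathcal{M}} , \quad \forall i \in \{1, \dots, n-1\}\ \exists\, \mathcal{B}_i, \widetilde{\mathcal{B}}_i \in \mathrm{BPres}(S) : \ \mathcal{B}_i \circ \mathcal{M}_i = \widetilde{\mathcal{B}}_i \circ \mathcal{M}_{i+1} .$$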
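$$\langle k| \mathcal{M}_i(\rho) |k\rangle = \mathrm{Tr}[\mathcal{M}_i(\rho)\, |k\rangle\langle k|] = \langle k| \mathcal{B}_i \circ \mathcal{M}_i(\rho) |k\rangle = \langle k| \widetilde{\mathcal{B}}_i \circ \mathcal{M}_{i+1}(\rho) |k\rangle = \langle k| \mathcal{M}_{i+1}(\rho) |k\rangle , \tag{A66}$$
for all $i \in \{1, \dots, n-1\}$ and for all $\rho \in \mathrm{Lin}(S)$. In particular, choosing $\rho = |j\rangle\langle j|$ we obtain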
1. Strictly incoherent operations [41], i.e., quantum channels $\mathcal{T}$ with the property that, for every Kraus operator $T_i$, the map $\mathcal{T}_i(\cdot) = T_i \cdot T_i^\dagger$ satisfies the condition $\mathcal{D} \circ \mathcal{T}_i = \mathcal{T}_i \circ \mathcal{D}$, where $\mathcal{D}$ is the completely dephasing channel.
2. Dephasing covariant operations [38–40], i.e., quantum channels $\mathcal{T}$ satisfying the condition $\mathcal{D} \circ \mathcal{T} = \mathcal{T} \circ \mathcal{D}$.
3. Phase covariant channels [40], i.e., quantum channels $\mathcal{T}$ satisfying the condition $\mathcal{T} \circ \mathcal{U}_\varphi = \mathcal{U}_\varphi \circ \mathcal{T}$, $\forall \varphi \in [0, 2\pi)$, where $\mathcal{U}_\varphi$ is the unitary channel associated with the unitary matrix $U_\varphi = \sum_k e^{ik\varphi} |k\rangle\langle k|$.
4. Physically incoherent operations [38,39], i.e., quantum channels that are convex combinations of channels $\mathcal{T}$ admitting a Kraus representation where each Kraus operator $T_i$ is of the form
where $U_{\pi_i}$ is a unitary that permutes the elements of the computational basis, $U_{\theta_i}$ is a diagonal unitary, and $P_i$ is a projector on a subspace spanned by a subset of vectors in the computational basis.
5. Classical channels, i.e., channels satisfying $\mathcal{T} = \mathcal{D} \circ \mathcal{T} \circ \mathcal{D}$.
We now show that all the above operations define classical subsystems according to our construction.
The first ingredient in the proof is the observation that each of the monoids 1–5 contains the
monoid of classical channels. Then, we can apply the following lemma:
Lemma A15. Let $\mathsf{M} \subseteq \mathrm{Chan}(S)$ be a monoid of quantum channels, and let $\mathsf{M}'$ be its commutant. If $\mathsf{M}$ contains the monoid of classical channels, then $\mathsf{M}'$ is contained in the set of basis-preserving channels.
Proof. Consider the erasure channel $\mathcal{C}_k$ defined by $\mathcal{C}_k(\rho) := |k\rangle\langle k|\, \mathrm{Tr}[\rho]$, $\forall \rho \in \mathrm{Lin}(\mathcal{H}_S)$. Clearly, the erasure channel is a classical channel. Then, every channel $\mathcal{B} \in \mathsf{M}'$ must satisfy the condition
Lemma A16. Let Act( A; S) ⊆ Chan(S) be a set of quantum channels that contains the monoid of classical
channels. If two quantum states ρ, σ ∈ St(S) are equivalent for A, then they must have the same diagonal
entries. Equivalently, they must satisfy D(ρ) = D(σ).
Proof. Same as the first part of the proof of Proposition 7. Suppose that Condition 1 holds, meaning that there exists a sequence $(\rho_1, \rho_2, \dots, \rho_n)$ such that
where $\mathcal{B}_i$ and $\widetilde{\mathcal{B}}_i$ are channels in the commutant $\mathrm{Act}(A;S)'$. The above equation implies
Now, we know that the commutant $\mathrm{Act}(A;S)'$ consists of basis-preserving channels (Lemma A15). Since every basis-preserving channel satisfies the relation $\langle k|\mathcal{B}(\rho)|k\rangle = \langle k|\rho|k\rangle$ [71,72], we obtain that all the density matrices $(\rho_1, \rho_2, \dots, \rho_n)$ must have the same diagonal entries, namely $\mathcal{D}(\rho_1) = \mathcal{D}(\rho_2) = \cdots = \mathcal{D}(\rho_n)$.
Now, we observe that the completely dephasing channel D is contained in the commutant of
all the monoids 1–5. This fact is evident for the monoids 1, 2 and 5, where the commutation with D
holds by definition. For the monoid 3, the commutation with D has been proven in [38,39], and for the
monoid 4 it has been proven in [40].
Since D is contained in the commutant of all the monoids 1–5, we can use the following
obvious fact:
Lemma A17. Let $\mathrm{Act}(A;S) \subseteq \mathrm{Chan}(S)$ be a monoid of quantum channels and suppose that its commutant $\mathrm{Act}(A;S)'$ contains the dephasing channel $\mathcal{D}$. If two quantum states $\rho, \sigma \in \mathrm{St}(S)$ satisfy $\mathcal{D}(\rho) = \mathcal{D}(\sigma)$, then they are equivalent for $A$.
Proposition A4. Let $\mathrm{Act}(A;S) \subseteq \mathrm{Chan}(S)$ be a monoid of quantum channels on system $S$. If $\mathrm{Act}(A;S)$ contains the monoid of classical channels, and if the commutant $\mathrm{Act}(A;S)'$ contains the completely dephasing channel $\mathcal{D}$, then two states $\rho, \sigma \in \mathrm{St}(S)$ are equivalent for $A$ if and only if $\mathcal{D}(\rho) = \mathcal{D}(\sigma)$.
Proposition A4 implies that the states of the subsystem S A are in one-to-one correspondence with
diagonal density matrices. Since the conditions of the proposition are satisfied by all the monoids 1–5,
each of these monoids defines the same state space.
The same result holds for the transformations:
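Proposition A5. Let $\mathrm{Act}(A;S) \subseteq \mathrm{Chan}(S)$ be a monoid of quantum channels. If $\mathrm{Act}(A;S)$ contains the monoid of classical channels, and if the commutant $\mathrm{Act}(A;S)'$ contains the completely dephasing channel $\mathcal{D}$, then two transformations $\mathcal{S}, \mathcal{T} \in \mathrm{Transf}(S)$ are equivalent for $A$ if and only if $\mathcal{D} \circ \mathcal{S} \circ \mathcal{D} = \mathcal{D} \circ \mathcal{T} \circ \mathcal{D}$.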
Proposition A5 implies that the transformations of subsystem S A can be identified with classical
channels. Hence, system S A is exactly the d-dimensional classical subsystem of the quantum system S.
In summary, each of the monoids 1–5 defines the same d-dimensional classical subsystem.
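The two criteria above (equal dephased states, equal dephased channels) are easy to evaluate in a small example. The sketch below assumes a qubit; the Hadamard channel and the two test states are illustrative choices. It computes the dephased state and the stochastic matrix $p(k|j) = \langle k|\mathcal{T}(|j\rangle\langle j|)|k\rangle$ that represents $\mathcal{D} \circ \mathcal{T} \circ \mathcal{D}$.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def dephase(rho):
    """Completely dephasing channel: keep only the diagonal entries."""
    n = len(rho)
    return [[rho[i][j] if i == j else 0 for j in range(n)] for i in range(n)]

def classical_matrix(kraus, d):
    """Stochastic matrix P[k][j] = <k| T(|j><j|) |k> of the channel with the given Kraus operators."""
    P = [[0.0] * d for _ in range(d)]
    for j in range(d):
        proj = [[1 if (r == j and c == j) else 0 for c in range(d)] for r in range(d)]
        for K in kraus:
            out = matmul(matmul(K, proj), dagger(K))
            for k in range(d):
                P[k][j] += out[k][k].real
    return P

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Two different quantum states with the same diagonal: |+><+| and I/2.
plus = [[0.5, 0.5], [0.5, 0.5]]
mixed = [[0.5, 0.0], [0.0, 0.5]]
same_classical_state = close(dephase(plus), dephase(mixed))

# Classical restriction of the Hadamard unitary channel: a fair coin flip.
s = 1 / math.sqrt(2)
hadamard = [[s, s], [s, -s]]
P = classical_matrix([hadamard], 2)
columns_sum_to_one = all(abs(P[0][j] + P[1][j] - 1) < 1e-12 for j in range(2))
```

In this example the coherent state $|+\rangle\langle +|$ and the maximally mixed state represent the same state of the classical subsystem, and the Hadamard gate restricts to the stochastic matrix with all entries equal to 1/2.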
1. The maximally incoherent operations are the quantum channels $\mathcal{T}$ that map diagonal density matrices to diagonal density matrices, namely $\mathcal{T} \circ \mathcal{D} = \mathcal{D} \circ \mathcal{T} \circ \mathcal{D}$, where $\mathcal{D}$ is the completely dephasing channel.
2. The incoherent operations are the quantum channels $\mathcal{T}$ with the property that, for every Kraus operator $T_i$, the map $\mathcal{T}_i(\cdot) = T_i \cdot T_i^\dagger$ sends diagonal matrices to diagonal matrices, namely $\mathcal{T}_i \circ \mathcal{D} = \mathcal{D} \circ \mathcal{T}_i \circ \mathcal{D}$.
Note that each set of operations contains the set of classical channels. Hence, the commutant of each set of operations consists of (some subset of) basis-preserving channels (by Lemma A15).
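Moreover, both sets of operations 1 and 2 contain the set of quantum channels $\mathcal{C}_\psi$ defined by the relation
$$\mathcal{C}_\psi(\rho) = |1\rangle\langle 1|\, \langle\psi|\rho|\psi\rangle + \frac{I - |1\rangle\langle 1|}{d-1}\, \mathrm{Tr}[(I - |\psi\rangle\langle\psi|)\, \rho] \qquad \forall \rho \in \mathrm{Lin}(\mathcal{H}_S) , \tag{A72}$$
where $|\psi\rangle \in \mathcal{H}_S$ is a fixed (but otherwise arbitrary) unit vector. The fact that both monoids contain the channels $\mathcal{C}_\psi$ implies a strong constraint on their commutants: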
Lemma A18. The only basis-preserving quantum channel $\mathcal{B} \in \mathrm{BPres}(S)$ satisfying the property $\mathcal{B} \circ \mathcal{C}_\psi = \mathcal{C}_\psi \circ \mathcal{B}$ for every $|\psi\rangle \in \mathcal{H}_S$ is the identity channel.
where we used the fact that $\mathcal{B}$ is basis-preserving. Tracing both sides of the equality with the projector $|1\rangle\langle 1|$, we obtain the relation
the second equality following from the definition of channel $\mathcal{C}_\psi$. In turn, Equation (A74) implies the relation $\mathcal{B}(|\psi\rangle\langle\psi|) = |\psi\rangle\langle\psi|$. Since $|\psi\rangle$ is arbitrary, this means that $\mathcal{B}$ must be the identity channel.
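To see Lemma A18 at work, one can implement the channel of Equation (A72) directly and test it against a basis-preserving channel that is not the identity. The sketch below assumes a qubit ($d = 2$), uses the completely dephasing channel as the basis-preserving channel, and takes $|\psi\rangle = |+\rangle$; the basis vector written $|1\rangle$ in the text is index 0 in the code. These choices are illustrative assumptions.

```python
import math

def c_psi(rho, psi):
    """Channel of Equation (A72): output is diagonal, with weight <psi|rho|psi> on the first basis vector."""
    d = len(psi)
    overlap = sum(psi[i].conjugate() * rho[i][j] * psi[j]
                  for i in range(d) for j in range(d)).real
    trace = sum(rho[i][i] for i in range(d)).real
    rest = (trace - overlap) / (d - 1)  # weight spread evenly over the other basis vectors
    return [[(overlap if i == 0 else rest) if i == j else 0.0 for j in range(d)]
            for i in range(d)]

def dephase(rho):
    """Completely dephasing channel, a basis-preserving channel different from the identity."""
    n = len(rho)
    return [[rho[i][j] if i == j else 0.0 for j in range(n)] for i in range(n)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

s = 1 / math.sqrt(2)
psi = [s, s]                      # |+>
rho = [[0.5, 0.5], [0.5, 0.5]]    # |+><+|

out = c_psi(rho, psi)
trace_out = out[0][0] + out[1][1]

dephase_then_c = c_psi(dephase(rho), psi)   # weights (1/2, 1/2)
c_then_dephase = dephase(c_psi(rho, psi))   # weights (1, 0)
dephasing_commutes = close(dephase_then_c, c_then_dephase)
```

The two orders give different outputs, so the dephasing channel is excluded from the commutant, in line with the lemma.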
In summary, the commutant of the set of incoherent channels consists only of the identity channel, and so does the commutant of the set of maximally incoherent channels. Since the commutant is trivial, the equivalence classes are trivial, meaning that the subsystem $S_A$ has exactly the same states and the same transformations as the original system $S$. In short, the subsystem associated with the incoherent (or maximally incoherent) channels is the full quantum system.
As we have seen in the main text, our basic construction does not provide transformations
from the subsystem S A to the global system S. One could introduce such transformations by hand,
by defining an embedding [63]:
Definition A1. An embedding of $S_A$ into $S$ is a map $\mathcal{E}_A : \mathrm{St}(S_A) \to \mathrm{St}(S)$ satisfying the property
$$\mathrm{Tr}_B \circ \mathcal{E}_A = \mathcal{I}_{S_A} . \tag{A75}$$
A priori, embeddings need not be physical processes. Consider the example of a classical system, viewed as a subsystem of a closed quantum system as in Section 4.3. An embedding would map each classical probability distribution $(p_1, p_2, \dots, p_d)$ into a pure quantum state $|\psi\rangle = \sum_k c_k |k\rangle$ satisfying the condition $|c_k|^2 = p_k$ for all $k \in \{1, \dots, d\}$. If the embedding were a physical transformation, there would be a way to physically transform every classical probability distribution into a corresponding pure quantum state, which is impossible in standard quantum theory.
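The classical example can be made concrete with a short sketch. It assumes the particular choice $c_k = \sqrt{p_k}$ (any phases would do equally well), and it assumes that the state of the classical subsystem is read off from the diagonal of the density matrix; both are illustrative choices. The code checks the defining property (A75) and also shows one obstruction to physical realizability: this map does not send mixtures of distributions to the corresponding mixtures of states.

```python
import math

def embed(p):
    """Map a probability vector to the pure state with amplitudes sqrt(p_k) (assumed phase choice)."""
    amps = [math.sqrt(pk) for pk in p]
    return [[a * b for b in amps] for a in amps]

def local_state(rho):
    """State of the classical subsystem: the diagonal of the density matrix."""
    return [rho[k][k] for k in range(len(rho))]

def mix(a, b, w=0.5):
    """Convex mixture w*a + (1-w)*b of two matrices."""
    return [[w * x + (1 - w) * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

p = [0.2, 0.5, 0.3]
recovered = local_state(embed(p))
is_embedding = all(abs(r - pk) < 1e-12 for r, pk in zip(recovered, p))

# Embedding a mixture versus mixing the embedded states.
p0, p1 = [1.0, 0.0], [0.0, 1.0]
embed_of_mixture = embed([0.5, 0.5])             # the pure state |+><+|
mixture_of_embeds = mix(embed(p0), embed(p1))    # the maximally mixed state
preserves_mixtures = all(abs(x - y) < 1e-12
                         for ra, rb in zip(embed_of_mixture, mixture_of_embeds)
                         for x, y in zip(ra, rb))
```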
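When building a new physical theory, one could postulate that there exists an embedding $\mathcal{E}_A$ that is physically realizable. In that case, the transformations from $S_A$ to $S$ would be those in the set
$$\mathrm{Transf}(S_A \to S) = \big\{ \mathcal{T} \circ \mathcal{E}_A : \mathcal{T} \in \mathrm{Transf}(S) \big\} , \tag{A76}$$
and similarly for the transformations from $S_B$ to $S$. The transformations from $S_A$ to $S_B$ would be those in the set
$$\mathrm{Transf}(S_A \to S_B) = \big\{ \mathrm{Tr}_A \circ \mathcal{T} \circ \mathcal{E}_A : \mathcal{T} \in \mathrm{Transf}(S) \big\} , \tag{A77}$$
and similarly for the transformations from $S_B$ to $S_A$. In that new theory, the old set of transformations from $S_A$ to $S_A$ should be replaced by the new set
$$\widetilde{\mathrm{Transf}}(S_A) = \big\{ \mathrm{Tr}_B \circ \mathcal{T} \circ \mathcal{E}_A : \mathcal{T} \in \mathrm{Transf}(S) \big\} , \tag{A78}$$
so that the structure of category is preserved. Similarly, the old set of transformations from $S_B$ to $S_B$ should be replaced by the new set
$$\widetilde{\mathrm{Transf}}(S_B) = \big\{ \mathrm{Tr}_A \circ \mathcal{T} \circ \mathcal{E}_B : \mathcal{T} \in \mathrm{Transf}(S) \big\} . \tag{A79}$$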
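When this is done, the embeddings define two idempotent morphisms $\mathcal{P}_A := \mathcal{E}_A \circ \mathrm{Tr}_B$ and $\mathcal{P}_B := \mathcal{E}_B \circ \mathrm{Tr}_A$, i.e., two morphisms satisfying the conditions
$$\mathcal{P}_A \circ \mathcal{P}_A = \mathcal{P}_A \qquad \text{and} \qquad \mathcal{P}_B \circ \mathcal{P}_B = \mathcal{P}_B . \tag{A80}$$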
The partial trace and the embedding define a splitting of idempotents, in the sense of Refs. [73,74].
The splitting of idempotents was considered in the categorical framework as a way to define general
decoherence maps, and, more specifically, decoherence maps to classical subsystems [74,89].
Proposition A6. Let $S$ be a system satisfying the Non-Overlapping Agents Requirement, let $A_{\max}$ be the maximal agent, and $S_{A_{\max}}$ be the associated subsystem. Then, one has $S_{A_{\max}} \simeq S$, meaning that there exist two isomorphisms $\gamma : \mathrm{St}(S) \to \mathrm{St}(S_{A_{\max}})$ and $\delta : \mathrm{Transf}(S) \to \mathrm{Transf}(S_{A_{\max}})$ satisfying the condition
Proof. The Non-Overlapping Agents Requirement guarantees that the commutant $\mathrm{Act}(A_{\max}; S)'$ contains only the identity transformation. Hence, the equivalence class $[\psi]_{A_{\max}}$ contains only the state $\psi$. Hence, the partial trace $\mathrm{Tr}_{A_{\max}} : \psi \mapsto [\psi]_{A_{\max}}$ is a bijection from $\mathrm{St}(S)$ to $\mathrm{St}(S_{A_{\max}})$. Similarly, the equivalence class $[\mathcal{T}]_{A_{\max}}$ contains only the transformation $\mathcal{T}$. Hence, the restriction $\pi_{A_{\max}} : \mathcal{T} \mapsto [\mathcal{T}]_{A_{\max}}$ is a bijective function between $\mathrm{Transf}(S)$ and $\mathrm{Transf}(S_{A_{\max}})$. Such a function is a homomorphism of monoids, by Equation (20). Setting $\delta := \pi_{A_{\max}}$ and $\gamma := \mathrm{Tr}_{A_{\max}}$, the condition (A81) is guaranteed by Equation (21).
$$\psi_1 = \psi , \quad \psi_n = \psi' , \quad \forall i \in \{1, \dots, n-1\}\ \exists\, \mathcal{V}_i, \widetilde{\mathcal{V}}_i \in \mathrm{Act}(B;S) : \ \mathcal{V}_i \psi_i = \widetilde{\mathcal{V}}_i \psi_{i+1} . \tag{A82}$$
Our goal is to prove that there exists an adversarial action $\mathcal{V}_B \in \mathrm{Act}(B;S)$ such that the relation $\psi' = \mathcal{V}_B \psi$ or $\psi = \mathcal{V}_B \psi'$ holds.
We will proceed by induction on $n$, starting from the base case $n = 2$. In this case, we have $\mathrm{Deg}_B(\psi) \cap \mathrm{Deg}_B(\psi') \neq \emptyset$. Then, the first regularity condition implies that there exists a transformation $\mathcal{V}_B \in \mathrm{Act}(B;S)$ such that at least one of the relations $\mathcal{V}_B \psi = \psi'$ and $\psi = \mathcal{V}_B \psi'$ holds. This proves the validity of the base case.
Now, suppose that the induction hypothesis holds for all sequences of length $n$, and suppose that $\psi$ and $\psi'$ are equivalent through a sequence of length $n + 1$, say $(\psi_1, \psi_2, \dots, \psi_n, \psi_{n+1})$. Applying the induction hypothesis to the sequence $(\psi_1, \psi_2, \dots, \psi_n)$, we obtain that there exists a transformation $\mathcal{V} \in \mathrm{Act}(B;S)$ such that at least one of the relations $\psi_n = \mathcal{V} \psi$ and $\psi = \mathcal{V} \psi_n$ holds. Moreover, applying the induction hypothesis to the pair $(\psi_n, \psi_{n+1})$ we obtain that there exists a transformation $\mathcal{V}' \in \mathrm{Act}(B;S)$ such that $\psi_{n+1} = \mathcal{V}' \psi_n$, or $\psi_n = \mathcal{V}' \psi_{n+1}$. Hence, there are four possible cases:
1. $\psi_n = \mathcal{V} \psi$ and $\psi_{n+1} = \mathcal{V}' \psi_n$. In this case, we have $\psi_{n+1} = (\mathcal{V}' \circ \mathcal{V}) \psi$, which proves the desired statement.
2. $\psi_n = \mathcal{V} \psi$ and $\psi_n = \mathcal{V}' \psi_{n+1}$. In this case, we have $\mathcal{V} \psi = \mathcal{V}' \psi_{n+1}$, or equivalently $\mathrm{Deg}_B(\psi) \cap \mathrm{Deg}_B(\psi_{n+1}) \neq \emptyset$. Applying the induction hypothesis to the sequence $(\psi, \psi_{n+1})$, we obtain the desired statement.
3. $\psi = \mathcal{V} \psi_n$ and $\psi_{n+1} = \mathcal{V}' \psi_n$. Using the second regularity condition, we obtain that there exists a transformation $\mathcal{W} \in \mathrm{Act}(B;S)$ such that at least one of the relations $\mathcal{V} = \mathcal{W} \circ \mathcal{V}'$ and $\mathcal{V}' = \mathcal{W} \circ \mathcal{V}$ holds. Suppose that $\mathcal{V} = \mathcal{W} \circ \mathcal{V}'$. In this case, we have
Lemma A19 (Canonical form of the elements of the adversarial group). Let $U : g \mapsto U_g$ be a projective representation of the group $G$, let $\mathrm{Irr}(U)$ be the set of irreducible representations contained in the isotypic decomposition of $U$, and let $\omega : G \to \mathbb{C}$ be a multiplicative character of $G$. Then, the commutation relation
$$V U_g = \omega(g)\, U_g V \qquad \forall g \in G \tag{A85}$$
holds iff
1. The map $U^{(j)} \mapsto \omega U^{(j)}$ is a permutation of the set $\mathrm{Irr}(U)$, denoted as $\pi : \mathrm{Irr}(U) \to \mathrm{Irr}(U)$. In other words, for every irrep $U^{(j)}$ with $j \in \mathrm{Irr}(U)$, the irrep $\omega U^{(j)}$ is equivalent to an irrep $k \in \mathrm{Irr}(U)$, and the correspondence between $j$ and $k$ is bijective.
2. The multiplicity spaces $\mathcal{M}_j$ and $\mathcal{M}_{\pi(j)}$ have the same dimension.
3. The unitary operator $V$ has the canonical form $V = U_\pi V_0$, where $V_0$ is a unitary operator in the commutant $U'$ and $U_\pi$ is a permutation operator satisfying
$$U_\pi \big( \mathcal{R}_j \otimes \mathcal{M}_j \big) = \mathcal{R}_{\pi(j)} \otimes \mathcal{M}_{\pi(j)} \qquad \forall j \in \mathrm{Irr}(U) . \tag{A86}$$
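$$V_{j,k} := \Pi_j V \Pi_k , \tag{A87}$$
where $\Pi_j$ ($\Pi_k$) is the projector onto $\mathcal{R}_j \otimes \mathcal{M}_j$ ($\mathcal{R}_k \otimes \mathcal{M}_k$). Then, Equation (A85) is equivalent to the condition
$$V_{j,k} \big( U_g^{(k)} \otimes I_{\mathcal{M}_k} \big) = \omega(g) \big( U_g^{(j)} \otimes I_{\mathcal{M}_j} \big) V_{j,k} , \qquad \forall g \in G , \ \forall j, k , \tag{A88}$$
or, equivalently,
$$\langle\alpha| V_{j,k} |\beta\rangle\, U_g^{(k)} = \omega(g)\, U_g^{(j)}\, \langle\alpha| V_{j,k} |\beta\rangle , \qquad \forall g \in G , \ \forall j, k , \ \forall |\alpha\rangle \in \mathcal{M}_j , \ \forall |\beta\rangle \in \mathcal{M}_k , \tag{A89}$$
where $\langle\alpha| V_{j,k} |\beta\rangle$ is a shorthand for the partial matrix element $(I_{\mathcal{R}_j} \otimes \langle\alpha|)\, V_{j,k}\, (I_{\mathcal{R}_k} \otimes |\beta\rangle)$.
Equation (A89) means that each operator $\langle\alpha| V_{j,k} |\beta\rangle$ intertwines the two representations $U^{(k)}$ and $\omega U^{(j)}$.
Recall that each representation is irreducible. Hence, the second Schur’s lemma [78] implies that $\langle\alpha| V_{j,k} |\beta\rangle$ is zero if the two representations are not equivalent. Note that there can be at most one value of $j$ such that $U^{(k)}$ is equivalent to $\omega U^{(j)}$. If such a value exists, we denote it as $j = \pi(k)$. By construction, the function $\pi : \mathrm{Irr}(U) \to \mathrm{Irr}(U)$ must be injective.
When $j = \pi(k)$, the first Schur’s lemma [78] guarantees that the operator $\langle\alpha| V_{\pi(k),k} |\beta\rangle$ is proportional to the partial isometry $T_{\pi(k),k}$ that implements the equivalence of the two representations. Let us write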
for some $M^{(k)}_{\alpha,\beta} \in \mathbb{C}$. Note also that, since the left-hand side is sesquilinear in $|\alpha\rangle$ and $|\beta\rangle$, the right-hand side should also be sesquilinear. Hence, we can find an operator $M_{\pi(k),k} : \mathcal{M}_k \to \mathcal{M}_{\pi(k)}$ such that $M^{(k)}_{\alpha,\beta} = \langle\alpha| M_{\pi(k),k} |\beta\rangle$. Putting everything together, the operator $V$ can be written as
$$V = \bigoplus_{k \in \mathrm{Irr}(U)} \big( T_{\pi(k),k} \otimes M_{\pi(k),k} \big) . \tag{A91}$$
Now, the operator $V$ must be unitary, and, in particular, it should satisfy the condition $V V^\dagger = I$, which reads
$$\bigoplus_{k \in \mathrm{Irr}(U)} \big( I_{\mathcal{R}_{\pi(k)}} \otimes M_{\pi(k),k} M_{\pi(k),k}^\dagger \big) = I . \tag{A92}$$
The above condition implies that: (i) the function $\pi$ must be surjective, and (ii) the operator $M_{\pi(k),k}$ must be a co-isometry. From the relation $V^\dagger V = I$, we also obtain that $M_{\pi(k),k}$ must be an isometry. Hence, $M_{\pi(k),k}$ is unitary.
Summarizing, the condition (A85) can be satisfied only if there exists a permutation $\pi : \mathrm{Irr}(U) \to \mathrm{Irr}(U)$ such that, for every $k$,
1. the irreps $\omega U^{(k)}$ and $U^{(\pi(k))}$ are equivalent,
2. the multiplicity spaces $\mathcal{M}_k$ and $\mathcal{M}_{\pi(k)}$ are unitarily isomorphic.
Fixing a unitary isomorphism $S_{\pi(k),k} : \mathcal{M}_k \to \mathcal{M}_{\pi(k)}$, we can write every element of the adversarial group in the canonical form $V = U_\pi V_0$, where $U_\pi$ is the permutation operator
$$U_\pi = \bigoplus_{k \in \mathrm{Irr}(U)} \big( T_{\pi(k),k} \otimes S_{\pi(k),k} \big) , \tag{A93}$$
and $V_0$ is an element of the commutant $U'$, i.e., a generic unitary operator of the form
$$V_0 = \bigoplus_{k \in \mathrm{Irr}(U)} \big( I_{\mathcal{R}_k} \otimes V_{0,k} \big) . \tag{A94}$$
Conversely, if a permutation $\pi$ exists with the properties that for every $k \in \mathrm{Irr}(U)$
1. $\omega U^{(k)}$ and $U^{(\pi(k))}$ are equivalent irreps,
2. $\mathcal{M}_k$ and $\mathcal{M}_{\pi(k)}$ are unitarily equivalent,
and if the operator $V$ has the form $V = U_\pi V_0$, with $U_\pi$ and $V_0$ as in Equations (A93) and (A94), then $V$ satisfies the commutation relation (A85).
We have seen that every element of the adversarial group can be decomposed into the product of
a permutation operator, which permutes the irreps, and an operator in the commutant of the original
group representation U : G → Lin(H). We now observe that the allowed permutations have an
additional structure: they must form an Abelian group, denoted as A.
Lemma A20. The permutations π arising from Equation (A85) with a generic multiplicative character ω (V, ·)
form an Abelian subgroup A of the group of all permutations of Irr(U ).
Proof. Let $V$ and $W$ be two elements of the adversarial group $G_B$, let $\omega(V, \cdot)$ and $\omega(W, \cdot)$ be the corresponding characters, and let $\pi_V$ and $\pi_W$ be the permutations associated with $\omega(V, \cdot)$ and $\omega(W, \cdot)$ as in Lemma A19, i.e., through the relation
Now, the element VW is associated with the permutation πV ◦ πW , while the element WV is
associated with the permutation πW ◦ πV . On the other hand, the characters obey the equality
Hence, we conclude that πV ◦ πW and πW ◦ πV are, in fact, the same permutation. Hence,
the elements of the adversarial group must correspond to an Abelian subgroup of the permutations
of Irr(U ).
Proof of Theorem 3. For different permutations in $\mathsf{A}$, we can choose the isomorphisms $S_{\pi(k),k} : \mathcal{M}_k \to \mathcal{M}_{\pi(k)}$ such that the following property holds:
$$S_{\pi_2 \circ \pi_1(k),\, k} = S_{\pi_2(\pi_1(k)),\, \pi_1(k)}\, S_{\pi_1(k),\, k} , \qquad \forall \pi_1, \pi_2 \in \mathsf{A} . \tag{A97}$$
When this is done, the unitary operators $U_\pi$ defined in Equation (A93) form a faithful representation of the Abelian group $\mathsf{A}$. Using the canonical decomposition of Lemma A19, every element $V \in G_B$ is decomposed uniquely as $V = U_\pi V_0$, where $V_0$ is an element of the commutant $U'$. Note also that the commutant $U'$ is a normal subgroup of the adversarial group: indeed, for every element $V \in G_B$ we have $V U' V^\dagger = U'$. Since $U'$ is a normal subgroup and the decomposition $V = U_\pi V_0$ is unique for every $V \in G_B$, it follows that the adversarial group $G_B$ is the semidirect product $\mathsf{A} \ltimes U'$.
$$U : \mathbb{Z}_2 \to \mathrm{Lin}(S) , \qquad k \mapsto U_k = Z^k . \tag{A99}$$
The representation can be decomposed into two irreps, corresponding to the one-dimensional subspaces $\mathcal{H}_0 = \mathrm{Span}\{|0\rangle\}$ and $\mathcal{H}_1 = \mathrm{Span}\{|1\rangle\}$. The corresponding irreps, denoted by
$$\omega_0 : \mathbb{Z}_2 \to \mathbb{C} , \quad \omega_0(k) = 1 , \qquad \omega_1 : \mathbb{Z}_2 \to \mathbb{C} , \quad \omega_1(k) = (-1)^k , \tag{A100}$$
are the only two irreps of the group and are multiplicative characters.
The condition $V U_k = U_k V$ yields the solutions
corresponding to the commutant $U'$. The condition $V U_k = (-1)^k U_k V$ yields the solutions
Let us consider now the subsystem S A . The states of S A are equivalence classes under the relation
It is not hard to see that the equivalence class of the state $|\psi\rangle$ is uniquely determined by the unordered pair $\{ |\langle 0|\psi\rangle| , |\langle 1|\psi\rangle| \}$. In other words, the state space of system $S_A$ is
$$\mathrm{St}(S_A) = \big\{ \{p, 1-p\} : p \in [0, 1] \big\} . \tag{A104}$$
Note that, in this case, the state space is not a convex set of density matrices. Instead, it is the
quotient of the set of diagonal density matrices, under the equivalence relation that two matrices with
the same spectrum are equivalent.
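The qubit example can be checked numerically. The sketch below assumes the representation $k \mapsto Z^k$ of Equation (A99) and picks one diagonal unitary and one anti-diagonal unitary as representatives of the two families of solutions; the specific phases and the test vector are illustrative assumptions. It verifies the two commutation relations and that the unordered pair of amplitude moduli is unchanged by both unitaries.

```python
import cmath

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def apply(V, psi):
    """Apply a matrix to a state vector."""
    return [sum(V[i][j] * psi[j] for j in range(len(psi))) for i in range(len(V))]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

Z = [[1, 0], [0, -1]]
X = [[0, 1], [1, 0]]

# A diagonal unitary (commutes with Z) and a flipped one (anticommutes with Z).
V_diag = [[cmath.exp(0.4j), 0], [0, cmath.exp(-1.1j)]]
V_flip = matmul(X, V_diag)

commutes = close(matmul(V_diag, Z), matmul(Z, V_diag))
anticommutes = close(matmul(V_flip, Z),
                     [[-x for x in row] for row in matmul(Z, V_flip)])

def invariant(psi):
    """Unordered pair of moduli {|<0|psi>|, |<1|psi>|}, stored as a sorted list."""
    return sorted(round(abs(c), 9) for c in psi)

psi = [0.6, 0.8j]
same_class = (invariant(apply(V_diag, psi)) == invariant(psi)
              and invariant(apply(V_flip, psi)) == invariant(psi))
```

The diagonal unitary changes only phases, and the flipped one additionally swaps the two amplitudes, so only the unordered pair of moduli survives as a label of the equivalence class.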
Finally, note that the transformations of system $S_A$ are trivial: since the adversarial group $G_B$ contains the group $G_A$, the group $G(S_A) = \pi_A(G_A)$ is trivial, namely
$$G(S_A) = \{ \mathcal{I}_{S_A} \} . \tag{A105}$$
or equivalently,
Since the operators exp[iλK ] and exp[iλ(K + μ IS )] are unitarily equivalent, they must have
the same spectrum. This is only possible if the operators K and K + μ IS have the same spectrum,
which happens only if μ = 0.
Now, recall that the one-parameter Abelian subgroup H is generic. Since every element of G is
contained in some one-parameter Abelian subgroup H, we showed that ω ( g) = 1 for every g ∈ G.
To conclude the proof, observe that the map $U^{(j)} \mapsto \omega U^{(j)}$ is the identity, and therefore induces the trivial permutation on the set of irreps $\mathrm{Irr}(U)$. Hence, the group of permutations $\mathsf{A}$ induced by multiplication by $\omega$ contains only the identity element.
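where $|\psi_j\rangle$ and $|\psi'_j\rangle$ are unit vectors in $\mathcal{R}_j \otimes \mathcal{M}_j$. Using this decomposition, we obtain
$$\mathcal{T}_B(|\psi\rangle\langle\psi|) = \bigoplus_{j \in \mathrm{Irr}(U)} p_j\, \rho_j \qquad \text{and} \qquad \mathcal{T}_B(|\psi'\rangle\langle\psi'|) = \bigoplus_{j \in \mathrm{Irr}(U)} p'_j\, \rho'_j , \tag{A109}$$
where $\rho_j$ ($\rho'_j$) is the marginal of $|\psi_j\rangle$ ($|\psi'_j\rangle$) on system $\mathcal{R}_j$. It is then clear that the equality $\mathcal{T}_B(|\psi\rangle\langle\psi|) = \mathcal{T}_B(|\psi'\rangle\langle\psi'|)$ implies $p_j = p'_j$ and $\rho_j = \rho'_j$ for every $j$. Since the states $|\psi_j\rangle$ and $|\psi'_j\rangle$ have the same marginal on system $\mathcal{R}_j$, there must exist a unitary operator $U_j : \mathcal{M}_j \to \mathcal{M}_j$ such that
which satisfies the property $U_B |\psi\rangle = |\psi'\rangle$. By the characterization of Equation (89), $U_B$ is an element of $G_B$.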
References
1. Nielsen, M.; Chuang, I. Quantum information and computation. Nature 2000, 404, 247.
2. Kitaev, A.Y.; Shen, A.; Vyalyi, M.N. Classical and Quantum Computation; Number 47; American Mathematical
Society: Providence, RI, USA, 2002.
3. Einstein, A.; Podolsky, B.; Rosen, N. Can quantum-mechanical description of physical reality be considered
complete? Phys. Rev. 1935, 47, 777. [CrossRef]
4. Schrödinger, E. Discussion of probability relations between separated systems. In Mathematical Proceedings
of the Cambridge Philosophical Society; Cambridge University Press: Cambridge, UK, 1935; Volume 31,
pp. 555–563.
5. Hardy, L. Quantum theory from five reasonable axioms. arXiv 2001, arXiv:quant-ph/0101012.
6. Barnum, H.; Barrett, J.; Leifer, M.; Wilce, A. Generalized no-broadcasting theorem. Phys. Rev. Lett. 2007,
99, 240501. [CrossRef] [PubMed]
7. Barrett, J. Information processing in generalized probabilistic theories. Phys. Rev. A 2007, 75, 032304.
[CrossRef]
8. Chiribella, G.; D’Ariano, G.; Perinotti, P. Probabilistic theories with purification. Phys. Rev. A 2010, 81, 062348.
[CrossRef]
9. Barnum, H.; Wilce, A. Information processing in convex operational theories. Electron. Notes Theor. Comput. Sci.
2011, 270, 3–15. [CrossRef]
10. Hardy, L. Foliable operational structures for general probabilistic theories. In Deep Beauty: Understanding the
Quantum World through Mathematical Innovation; Halvorson, H., Ed.; Cambridge University Press: Cambridge,
UK, 2011; p. 409.
11. Hardy, L. A formalism-local framework for general probabilistic theories, including quantum theory.
Math. Struct. Comput. Sci. 2013, 23, 399–440. [CrossRef]
12. Chiribella, G. Dilation of states and processes in operational-probabilistic theories. In Proceedings of
the 11th workshop on Quantum Physics and Logic, Kyoto, Japan, 4–6 June 2014; Coecke, B., Hasuo, I.,
Panangaden, P., Eds.; Electronic Proceedings in Theoretical Computer Science; Volume 172, pp. 1–14.
13. Chiribella, G.; D’Ariano, G.M.; Perinotti, P. Quantum from principles. In Quantum Theory: Informational
Foundations and Foils; Springer: Dordrecht, The Netherlands, 2016; pp. 171–221.
14. Hardy, L. Reconstructing quantum theory. In Quantum Theory: Informational Foundations and Foils; Springer:
Dordrecht, The Netherlands, 2016; pp. 223–248.
15. Mauro D’Ariano, G.; Chiribella, G.; Perinotti, P. Quantum Theory from First Principles. In Quantum Theory
from First Principles; D’Ariano, G.M., Chiribella, G., Perinotti, P., Eds.; Cambridge University Press: Cambridge,
UK, 2017.
16. Abramsky, S.; Coecke, B. A categorical semantics of quantum protocols. In Proceedings of the 19th Annual
IEEE Symposium on Logic in Computer Science, Turku, Finland, 17 July 2004; pp. 415–425.
17. Coecke, B. Kindergarten quantum mechanics: Lecture notes. In Proceedings of the AIP Conference
Quantum Theory: Reconsideration of Foundations-3, Växjö, Sweden, 6–11 June 2005; American Institute of
Physics: Melville, NY, USA, 2006; Volume 810, pp. 81–98.
18. Coecke, B. Quantum picturalism. Contemp. Phys. 2010, 51, 59–83. [CrossRef]
19. Abramsky, S.; Coecke, B. Categorical quantum mechanics. In Handbook of Quantum Logic and Quantum
Structures: Quantum Logic; Elsevier Science: New York, NY, USA, 2008; pp. 261–324.
20. Coecke, B.; Kissinger, A. Picturing Quantum Processes; Cambridge University Press: Cambridge, UK, 2017.
21. Selinger, P. A survey of graphical languages for monoidal categories. In New Structures for Physics; Springer:
Berlin/Heidelberg, Germany, 2010; pp. 289–355.
22. Haag, R. Local Quantum Physics: Fields, Particles, Algebras; Springer: Berlin/Heidelberg, Germany, 2012.
23. Viola, L.; Knill, E.; Laflamme, R. Constructing qubits in physical systems. J. Phys. A Math. Gen. 2001, 34, 7067.
[CrossRef]
24. Zanardi, P.; Lidar, D.A.; Lloyd, S. Quantum tensor product structures are observable induced. Phys. Rev. Lett.
2004, 92, 060402. [CrossRef] [PubMed]
25. Palma, G.M.; Suominen, K.A.; Ekert, A.K. Quantum computers and dissipation. Proc. R. Soc. Lond. A 1996,
452, 567–584. [CrossRef]
26. Zanardi, P.; Rasetti, M. Noiseless quantum codes. Phys. Rev. Lett. 1997, 79, 3306. [CrossRef]
27. Lidar, D.A.; Chuang, I.L.; Whaley, K.B. Decoherence-free subspaces for quantum computation. Phys. Rev. Lett.
1998, 81, 2594. [CrossRef]
28. Knill, E.; Laflamme, R.; Viola, L. Theory of quantum error correction for general noise. Phys. Rev. Lett. 2000,
84, 2525. [CrossRef] [PubMed]
29. Zanardi, P. Stabilizing quantum information. Phys. Rev. A 2000, 63, 012301. [CrossRef]
30. Kempe, J.; Bacon, D.; Lidar, D.A.; Whaley, K.B. Theory of decoherence-free fault-tolerant universal quantum
computation. Phys. Rev. A 2001, 63, 042307. [CrossRef]
31. Zanardi, P. Virtual quantum subsystems. Phys. Rev. Lett. 2001, 87, 077901. [CrossRef] [PubMed]
32. Bratteli, O.; Robinson, D.W. Operator Algebras and Quantum Statistical Mechanics 1; Springer:
Berlin/Heidelberg, Germany, 1987.
33. Kraemer, L.; Del Rio, L. Operational locality in global theories. arXiv 2017, arXiv:1701.03280.
34. Åberg, J. Quantifying superposition. arXiv 2006, arXiv:quant-ph/0612146.
35. Baumgratz, T.; Cramer, M.; Plenio, M. Quantifying coherence. Phys. Rev. Lett. 2014, 113, 140401. [CrossRef]
[PubMed]
36. Levi, F.; Mintert, F. A quantitative theory of coherent delocalization. New J. Phys. 2014, 16, 033007. [CrossRef]
37. Winter, A.; Yang, D. Operational resource theory of coherence. Phys. Rev. Lett. 2016, 116, 120404. [CrossRef]
[PubMed]
38. Chitambar, E.; Gour, G. Critical examination of incoherent operations and a physically consistent resource
theory of quantum coherence. Phys. Rev. Lett. 2016, 117, 030401. [CrossRef] [PubMed]
39. Chitambar, E.; Gour, G. Comparison of incoherent operations and measures of coherence. Phys. Rev. A 2016,
94, 052336. [CrossRef]
40. Marvian, I.; Spekkens, R.W. How to quantify coherence: Distinguishing speakable and unspeakable notions.
Phys. Rev. A 2016, 94, 052324. [CrossRef]
41. Yadin, B.; Ma, J.; Girolami, D.; Gu, M.; Vedral, V. Quantum processes which do not use coherence. Phys. Rev. X
2016, 6, 041028. [CrossRef]
42. Chiribella, G.; D’Ariano, G.; Perinotti, P. Informational derivation of quantum theory. Phys. Rev. A 2011,
84, 012311. [CrossRef]
43. Hardy, L. Reformulating and reconstructing quantum theory. arXiv 2011, arXiv:1104.2066.
44. Masanes, L.; Müller, M.P. A derivation of quantum theory from physical requirements. New J. Phys. 2011,
13, 063001. [CrossRef]
45. Dakic, B.; Brukner, C. Quantum Theory and Beyond: Is Entanglement Special? In Deep Beauty: Understanding
the Quantum World through Mathematical Innovation; Halvorson, H., Ed.; Cambridge University Press:
Cambridge, UK, 2011; pp. 365–392.
46. Masanes, L.; Müller, M.P.; Augusiak, R.; Perez-Garcia, D. Existence of an information unit as a postulate of
quantum theory. Proc. Natl. Acad. Sci. USA 2013, 110, 16373–16377. [CrossRef] [PubMed]
47. Wilce, A. Conjugates, Filters and Quantum Mechanics. arXiv 2012, arXiv:1206.2897.
48. Barnum, H.; Müller, M.P.; Ududec, C. Higher-order interference and single-system postulates characterizing
quantum theory. New J. Phys. 2014, 16, 123029. [CrossRef]
49. Chiribella, G.; D’Ariano, G.; Perinotti, P. Quantum Theory, namely the pure and reversible theory of
information. Entropy 2012, 14, 1877–1893. [CrossRef]
50. Chiribella, G.; Yuan, X. Quantum theory from quantum information: The purification route. Can. J. Phys.
2013, 91, 475–478. [CrossRef]
51. Chiribella, G.; Scandolo, C.M. Conservation of information and the foundations of quantum mechanics.
In EPJ Web of Conferences; EDP Sciences: Les Ulis, France, 2015; Volume 95, p. 03003.
52. Chiribella, G.; Scandolo, C.M. Entanglement and thermodynamics in general probabilistic theories.
New J. Phys. 2015, 17, 103027. [CrossRef]
53. Chiribella, G.; Scandolo, C.M. Microcanonical thermodynamics in general physical theories. New J. Phys.
2017, 19, 123043. [CrossRef]
54. Chiribella, G.; Scandolo, C.M. Entanglement as an axiomatic foundation for statistical mechanics. arXiv
2016, arXiv:1608.04459.
55. Lee, C.M.; Selby, J.H. Generalised phase kick-back: The structure of computational algorithms from physical
principles. New J. Phys. 2016, 18, 033023. [CrossRef]
56. Lee, C.M.; Selby, J.H. Deriving Grover’s lower bound from simple physical principles. New J. Phys. 2016,
18, 093047. [CrossRef]
57. Lee, C.M.; Selby, J.H.; Barnum, H. Oracles and query lower bounds in generalised probabilistic theories.
arXiv 2017, arXiv:1704.05043.
58. Susskind, L. The Black Hole War: My Battle with Stephen Hawking to Make the World Safe for Quantum Mechanics;
Hachette UK: London, UK, 2008.
59. Takesaki, M. Theory of Operator Algebras I; Springer: New York, NY, USA, 1979.
60. Barnum, H.; Knill, E.; Ortiz, G.; Somma, R.; Viola, L. A subsystem-independent generalization of
entanglement. Phys. Rev. Lett. 2004, 92, 107902. [CrossRef] [PubMed]
61. Barnum, H.; Knill, E.; Ortiz, G.; Viola, L. Generalizations of entanglement based on coherent states and
convex sets. Phys. Rev. A 2003, 68, 032308. [CrossRef]
62. Barnum, H.; Ortiz, G.; Somma, R.; Viola, L. A generalization of entanglement to convex operational theories:
entanglement relative to a subspace of observables. Int. J. Theor. Phys. 2005, 44, 2127–2145. [CrossRef]
63. Del Rio, L.; Kraemer, L.; Renner, R. Resource theories of knowledge. arXiv 2015, arXiv:1511.08818.
64. Del Rio, L. Resource Theories of Knowledge. Ph.D. Thesis, ETH Zürich, Zürich, Switzerland, 2015.
[CrossRef]
65. Kraemer Gabriel, L. Restricted Agents in Thermodynamics and Quantum Information Theory. Ph.D. Thesis,
ETH Zürich, Zürich, Switzerland, 2016. [CrossRef]
66. Brassard, G.; Raymond-Robichaud, P. The equivalence of local-realistic and no-signalling theories. arXiv
2017, arXiv:1710.01380.
67. Holevo, A.S. Statistical Structure of Quantum Theory; Springer: Berlin/Heidelberg, Germany, 2003; Volume 67.
68. Kraus, K. States, Effects and Operations: Fundamental Notions of Quantum Theory; Springer: Berlin/Heidelberg,
Germany, 1983.
69. Haag, R.; Schroer, B. Postulates of quantum field theory. J. Math. Phys. 1962, 3, 248–256. [CrossRef]
70. Haag, R.; Kastler, D. An algebraic approach to quantum field theory. J. Math. Phys. 1964, 5, 848–861.
[CrossRef]
71. Buscemi, F.; Chiribella, G.; D’Ariano, G.M. Inverting quantum decoherence by classical feedback from the
environment. Phys. Rev. Lett. 2005, 95, 090501. [CrossRef] [PubMed]
72. Buscemi, F.; Chiribella, G.; D’Ariano, G.M. Quantum erasure of decoherence. Open Syst. Inf. Dyn. 2007,
14, 53–61. [CrossRef]
73. Selinger, P. Idempotents in dagger categories. Electron. Notes Theor. Comput. Sci. 2008, 210, 107–122.
[CrossRef]
74. Coecke, B.; Selby, J.; Tull, S. Two Roads to Classicality. Electron. Proc. Theor. Comput. Sci. 2018, 266, 104–118.
[CrossRef]
75. Coecke, B.; Lal, R. Causal categories: relativistically interacting processes. Found. Phys. 2013, 43, 458–501.
[CrossRef]
76. Coecke, B. Terminality implies no-signalling... and much more than that. New Gener. Comput. 2016, 34, 69–85.
[CrossRef]
77. Chiribella, G. Distinguishability and copiability of programs in general process theories. Int. J. Softw. Inform.
2014, 8, 209–223.
78. Fulton, W.; Harris, J. Representation Theory: A First Course; Springer: Berlin/Heidelberg, Germany, 2013;
Volume 129.
79. Marvian, I.; Spekkens, R.W. A generalization of Schur–Weyl duality with applications in quantum estimation.
Commun. Math. Phys. 2014, 331, 431–475. [CrossRef]
80. Galley, T.D.; Masanes, L. Impossibility of mixed-state purification in any alternative to the Born Rule. arXiv
2018, arXiv:1801.06414.
81. Yngvason, J. Localization and entanglement in relativistic quantum physics. In The Message of Quantum
Science; Springer: Berlin/Heidelberg, Germany, 2015; pp. 325–348.
82. Murray, F.J.; von Neumann, J. On rings of operators. Ann. Math. 1936, 37, 116–229. [CrossRef]
83. Murray, F.J.; von Neumann, J. On rings of operators. II. Trans. Am. Math. Soc. 1937, 41, 208–248. [CrossRef]
84. Uhlmann, A. The transition probability in the state space of a *-algebra. Rep. Math. Phys. 1976, 9, 273–279.
[CrossRef]
85. Jozsa, R. Fidelity for mixed quantum states. J. Mod. Opt. 1994, 41, 2315–2323. [CrossRef]
86. Lindblad, G. A general no-cloning theorem. Lett. Math. Phys. 1999, 47, 189–196. [CrossRef]
87. D’Ariano, G.M.; Presti, P.L. Optimal nonuniversally covariant cloning. Phys. Rev. A 2001, 64, 042308.
[CrossRef]
88. Chiribella, G.; D’Ariano, G.; Perinotti, P.; Cerf, N. Extremal quantum cloning machines. Phys. Rev. A 2005,
72, 042336. [CrossRef]
89. Coecke, B.; Selby, J.; Tull, S. Categorical Probabilistic Theories. Electron. Proc. Theor. Comput. Sci. 2018,
266, 367–385.
© 2018 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
entropy
Article
Ruling out Higher-Order Interference from
Purity Principles
Howard Barnum 1,2, *, Ciarán M. Lee 3, *, Carlo Maria Scandolo 4, * and John H. Selby 4,5, *
1 Centre for the Mathematics of Quantum Theory (QMATH), Department of Mathematical Sciences,
University of Copenhagen, DK-2100 Copenhagen, Denmark
2 Department of Physics and Astronomy, University of New Mexico, Albuquerque, NM 87131, USA
3 Department of Physics, University College London, London WC1E 6BT, UK
4 Department of Computer Science, University of Oxford, Oxford OX1 3QD, UK
5 Department of Physics, Imperial College London, London SW7 2AZ, UK
* Correspondence: [email protected] (H.B.); [email protected] (C.M.L.);
[email protected] (C.M.S.); [email protected] (J.H.S.)
Abstract: As first noted by Rafael Sorkin, there is a limit to quantum interference. The interference
pattern formed in a multi-slit experiment is a function of the interference patterns formed between
pairs of slits; there are no genuinely new features resulting from considering three slits instead of two.
Sorkin has introduced a hierarchy of mathematically conceivable higher-order interference behaviours,
where classical theory lies at the first level of this hierarchy and quantum theory at the second.
Informally, the order in this hierarchy corresponds to the number of slits on which the interference
pattern has an irreducible dependence. Many authors have wondered why quantum interference
is limited to the second level of this hierarchy. Does the existence of higher-order interference
violate some natural physical principle that we believe should be fundamental? In the current work
we show that such principles can be found which limit interference behaviour to second-order,
or “quantum-like”, interference, but that do not restrict us to the entire quantum formalism. We work
within the operational framework of generalised probabilistic theories, and prove that any theory
satisfying Causality, Purity Preservation, Pure Sharpness, and Purification—four principles that
formalise the fundamental character of purity in nature—exhibits at most second-order interference.
Hence these theories are, at least conceptually, very “close” to quantum theory. Along the way we
show that systems in such theories correspond to Euclidean Jordan algebras. Hence, they are self-dual
and, moreover, multi-slit experiments in such theories are described by pure projectors.
1. Introduction
Described by Feynman as “impossible, absolutely impossible, to explain in any classical way” [1]
(volume 1, chapter 37), quantum interference is a distinctive signature of non-classicality. However, as
first noted by Rafael Sorkin [2,3], there is a limit to this interference; in contrast to the case of two slits,
the interference pattern formed in a three-slit experiment can be written as a linear combination of two-slit
and one-slit patterns. Sorkin has introduced a hierarchy of mathematically conceivable higher-order
interference behaviours, where classical theory lies at the first level of this hierarchy and quantum
theory at the second. Informally, the order in this hierarchy corresponds to the number of slits
on which the interference pattern has an irreducible dependence.
Many authors have wondered why quantum interference is limited to the second level of this
hierarchy [2,4–13]. Does the existence of higher-order interference violate some natural physical
principle that we believe should be fundamental [14]? In the current work we show that such
natural principles can be found which limit interference behaviour to second-order, or “quantum-like”,
interference, but that do not restrict us to the entire quantum formalism.
We work in the framework of general probabilistic theories [15–28]. This framework is general
enough to accommodate essentially arbitrary operational theories, where an operational theory specifies
a set of laboratory devices which can be connected together in different ways, and assigns probabilities to
different experimental outcomes. Investigating how the structural and information-theoretic features of a
given theory in this framework depend on different physical principles deepens our physical and intuitive
understanding of such features. Indeed, many authors [20,22,23,28,29] have derived the entire structure
of finite-dimensional quantum theory from simple information-theoretic axioms—reminiscent of
Einstein’s derivation of special relativity from two simple physical principles. So far, ruling out
higher-order interference has required thermodynamic arguments. Indeed, by combining the results
and axioms of Refs. [30,31], higher-order interference could be ruled out in theories satisfying the
combined axioms. In this paper we show that we can prove this in a more direct way from first
principles, using only the axioms of Ref. [30].
Many experimental investigations have searched for divergences from quantum theory by looking
for higher-order interference [32–36]. These experiments involved passing a particle through a physical
barrier with multiple slits and comparing the interference patterns formed on a screen behind the
barrier when different subsets of slits are closed. Given this set-up, one would expect that the physical
theory being tested should possess transformations that correspond to the action of blocking certain
subsets of slits. Moreover, blocking all but two subsets of slits should not affect states which can pass
through either slit. This intuition suggests that these transformations should correspond to projectors.
Many operational probabilistic theories do not possess such a natural mathematical interpretation
of multi-slit experiments; indeed many theories do not admit well-defined projectors [9]. Here, we
show that there exist natural information-theoretic principles that both imply the existence of the
projector structure, and rule out third-, and higher-, order interference. The principles that ensure
this structure are Causality, Purity Preservation, Pure Sharpness, and Purification. These formalise
intuitive ideas about the fundamental role of purity in nature. More formally, we show that such
theories possess a self-dualising inner product, and that there exist pure projectors which represent the
opening and closing of slits in a multi-slit experiment. Barnum, Müller and Ududec have shown that
in any self-dual theory in which such projectors exist for every face, if projectors map pure states to
pure states, then there can be at most second-order interference [4] (Proposition 29). The conjunction
of our new results and the principle of Purity Preservation implies the conditions of Barnum et al.’s
proposition. Hence sharp theories with purification do not exhibit higher-order interference. In fact
we prove a stronger result, that the systems in such theories are Euclidean Jordan algebras which have
been studied in quantum foundations [4,13,37].
This paper is organised as follows. In Section 2 we review the basics of the operational probabilistic
theory framework. In Section 3 we formally define higher-order interference. In Section 4 we define
sharp theories with purification and review relevant known results. In Section 5 we present and prove
our new results. Finally, in Section 6, we offer some suggestions on how new experiments might be
devised to observe higher-order interference.
2. Framework
We will describe theories in the framework of operational-probabilistic theories (OPTs) [19,20,24,29,38–40],
arising from the marriage of category theory [41–46] with probabilities. The foundation of this
framework is the idea that any successful physical theory must provide an account of experimental
data. Hence, such theories should have an operational description in terms of such experiments.
The OPT framework is based on the graphical language of circuits, describing experiments that
can be performed in a laboratory with physical systems connecting together physical processes, which
are denoted as wires and boxes respectively. The systems/wires are labelled with a type denoted A,
Entropy 2017, 19, 253
B, C, . . . . For example, the type given to a quantum system is the dimension of the Hilbert space
describing the system. The processes/boxes are then viewed as transformations with some input and
output systems/wires. For instance, in quantum theory these correspond to quantum instruments.
We now give a brief introduction to the important concepts in this formalism.
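[Circuit diagram: a bipartite state ρ of systems A and B. On the upper wire the transformations $\mathcal{A}$ and $\mathcal{A}'$ are applied in sequence, followed by the effect a; on the lower wire the transformation $\mathcal{B}$ is applied, followed by the effect b.] (1)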
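Processes with no inputs (such as ρ in the above diagram) are called states, those with no outputs
(such as a and b) are called effects, and those with both inputs and outputs (such as $\mathcal{A}$, $\mathcal{A}'$, $\mathcal{B}$) are called
transformations. We denote by St(A) the set of states of system A, by Eff(A) the set of effects on A,
and by Transf(A, B) the set of transformations from A to B.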
OPTs include a particular system, the trivial system I, representing the lack of input or output for
a particular device.
Hence, states (resp. effects) are transformations with the trivial system as input (resp. output).
Circuits with no external wires, like the circuit in Equation (1), are called scalars and are associated
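with probabilities. We will often use the notation $(a|\rho)$ to denote the circuit in which the effect a is
applied to the state ρ of system A, and the notation $(a|\mathcal{C}|\rho)$ to denote the circuit in which the
transformation $\mathcal{C}$ from A to B is applied to ρ before the effect a on B:
$(a|\rho) := a \circ \rho$, \qquad $(a|\mathcal{C}|\rho) := a \circ \mathcal{C} \circ \rho$.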
The fact that scalars are probabilities and so are real numbers induces a notion of a sum of
transformations, so that the sets St (A), Transf (A, B), and Eff (A) become spanning sets of real vector
spaces, denoted by StR (A), Transf R (A, B), and Eff R (A). In this work we will restrict our attention to
finite systems, i.e., systems for which the vector space spanned by states is finite-dimensional for all
systems. Operationally this assumption means that one need not perform an infinite number of distinct
experiments to fully characterise a state. Restricting ourselves to non-negative real numbers, we have
the convex cone of states and of effects, denoted by St+ (A) and Eff + (A) respectively. We moreover
make the assumption that the set of states is closed. Operationally this is justified by the fact that up to
any experimental error a state space is indistinguishable from its closure.
The composition of states and effects leads naturally to a norm. This is defined, for states ρ, as
$\|\rho\| := \sup_{a \in \mathrm{Eff}(A)} (a|\rho)$, and similarly for effects a as $\|a\| := \sup_{\rho \in \mathrm{St}(A)} (a|\rho)$. The set of normalised
states (resp. effects) of system A is denoted by $\mathrm{St}_1(A)$ (resp. $\mathrm{Eff}_1(A)$).
Transformations are characterised by their action on states of composite systems: if $\mathcal{A}, \mathcal{A}' \in \mathrm{Transf}(A, B)$,
we have that $\mathcal{A} = \mathcal{A}'$ if and only if
$(\mathcal{A} \otimes \mathcal{I}_S)\,\rho = (\mathcal{A}' \otimes \mathcal{I}_S)\,\rho$, (2)
for every system S and every state $\rho \in \mathrm{St}(A \otimes S)$, where $\mathcal{I}_S$ denotes the identity transformation on S.
However, it follows [19] that effects (resp. states) are
completely defined by their action on states (resp. effects) of a single system.
Equality on states of the single system A is, in general, not enough to discriminate between $\mathcal{A}$
and $\mathcal{A}'$, as is the case for quantum theory over real Hilbert spaces [47]. However, for the scope of the
present article, which focuses on single-system properties, we often concern ourselves with equality
on single system.
Definition 1. Two transformations $\mathcal{A}, \mathcal{A}' \in \mathrm{Transf}(A, B)$ are equal on single system, denoted by $\mathcal{A} \doteq \mathcal{A}'$,
if $\mathcal{A}\rho = \mathcal{A}'\rho$ for all states $\rho \in \mathrm{St}(A)$.
Definition 2. The states $\{\rho_i\}_{i \in X}$ are called perfectly distinguishable if there exists an observation-test
$\{a_i\}_{i \in X}$ such that $(a_i|\rho_j) = \delta_{ij}$ for all $i, j \in X$.
Moreover, if there is no other state $\rho_0$ such that the states $\{\rho_i\}_{i \in X} \cup \{\rho_0\}$ are perfectly distinguishable,
the set $\{\rho_i\}_{i \in X}$ is said to be maximal.
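As a hedged illustration of Definition 2, the following sketch specialises to finite-dimensional quantum theory, which is an assumption made only for this example: states are density matrices, effects are positive operators, and the pairing $(a|\rho)$ is $\mathrm{Tr}(a\rho)$. The helper names (`pairing`, `basis_projector`, `distinguishes`) are hypothetical. The function checks whether one given test distinguishes a set of states; it does not search over all tests.

```python
# Quantum illustration (an assumption, not part of the general OPT framework):
# states are density matrices, effects are matrices, and (a|rho) = Tr(a rho).

def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def pairing(effect, state):
    """Probability (a|rho) = Tr(a rho)."""
    prod = matmul(effect, state)
    return sum(prod[i][i] for i in range(len(prod)))

def basis_projector(i, dim):
    """Projector |i><i| onto the i-th computational basis vector."""
    return [[1.0 if (r == i and c == i) else 0.0 for c in range(dim)]
            for r in range(dim)]

def distinguishes(effects, states, tol=1e-12):
    """Check whether this particular test satisfies (a_i|rho_j) = delta_ij."""
    return all(abs(pairing(a, rho) - (1.0 if i == j else 0.0)) <= tol
               for i, a in enumerate(effects)
               for j, rho in enumerate(states))

dim = 3
basis_states = [basis_projector(i, dim) for i in range(dim)]
basis_test = [basis_projector(i, dim) for i in range(dim)]

# |+><+| on the first two levels; it overlaps with |0><0|.
plus = [[0.5, 0.5, 0.0], [0.5, 0.5, 0.0], [0.0, 0.0, 0.0]]

ok_basis = distinguishes(basis_test, basis_states)
ok_overlap = distinguishes(basis_test[:2], [basis_states[0], plus])
```

Here `ok_basis` is true because the basis test separates the three basis states, while `ok_overlap` is false because the second effect does not fire with certainty on `plus`.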
$\mathcal{C}_i = \sum_{j \in Y_i} \mathcal{D}_j$.
In this case, we say that the test $\{\mathcal{D}_j\}_{j \in Y}$ is a refinement of the test $\{\mathcal{C}_i\}_{i \in X}$, and that the
transformations $\{\mathcal{D}_j\}_{j \in Y_i}$ are a refinement of the transformation $\mathcal{C}_i$. A transformation $\mathcal{C} \in \mathrm{Transf}(A, B)$
is pure if it has only trivial refinements, namely refinements $\{\mathcal{D}_j\}$ of the form $\{\mathcal{D}_j = p_j \mathcal{C}\}$, where $\{p_j\}$ is
a probability distribution. We denote the sets of pure transformations, pure states, and pure effects as
PurTransf (A, B), PurSt (A), and PurEff (A) respectively. Similarly, PurSt1 (A), and PurEff 1 (A) denote
normalised pure states and effects respectively. Non-pure states are called mixed.
Clearly, no states are contained in a pure state. On the other edge of the spectrum we have
complete states.
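Definition 5. We say that two transformations $\mathcal{A}, \mathcal{A}' \in \mathrm{Transf}(A, B)$ are equal upon input of the state
$\rho \in \mathrm{St}_1(A)$ if $\mathcal{A}\sigma = \mathcal{A}'\sigma$ for every state σ contained in ρ. In this case we will write $\mathcal{A} =_\rho \mathcal{A}'$.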
2.4. Causality
A natural requirement of a physical theory is that it is causal, that is, no signals can be sent from
the future to the past. In the OPT framework this is formalised as follows:
Axiom 1 (Causality [19,39]). The probability that a transformation occurs is independent of the choice of tests
performed on its output.
Causality is equivalent to the requirement that, for every system A, there exists a unique
deterministic effect uA on A (or simply u, when no ambiguity can arise) [19]. Owing to the uniqueness
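of the deterministic effect, the marginals of a bipartite state can be uniquely defined as:
$\rho_A := (\mathcal{I}_A \otimes u_B)\,\rho_{AB}$,
that is, by applying the deterministic effect u to system B of the bipartite state $\rho_{AB}$.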
Moreover, this uniqueness forbids the ability to signal [19,53]. We will denote by TrB ρAB the
marginal on system A, in analogy with the notation used in the quantum case. We will stick to the
notation Tr in formulas where the deterministic effect is applied directly to a state, e.g., Tr ρ := (u|ρ).
In a causal theory it is easy to see that the norm of a state takes the form $\|\rho\| = \mathrm{Tr}\,\rho$, and that a
state can be prepared deterministically if and only if it is normalised.
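To make the marginal and the norm concrete, here is a minimal sketch in the quantum case, where (as an assumption for this example) the deterministic effect is the trace and the marginal is the usual partial trace. The function names and the index convention (row index `a * dim_b + b`) are illustrative choices, not notation from the text.

```python
# Quantum illustration (assumed): u is the trace, so the marginal on A is the
# partial trace over B and the norm of a state is Tr(rho).

def partial_trace_b(rho_ab, dim_a, dim_b):
    """Marginal on A, obtained by applying the deterministic effect to B."""
    return [[sum(rho_ab[a * dim_b + b][a2 * dim_b + b] for b in range(dim_b))
             for a2 in range(dim_a)]
            for a in range(dim_a)]

def trace(rho):
    """Tr(rho), i.e. (u|rho) in the quantum case."""
    return sum(rho[i][i] for i in range(len(rho)))

# Bell state (|00> + |11>)/sqrt(2), written as a 4x4 density matrix.
amplitudes = [2 ** -0.5, 0.0, 0.0, 2 ** -0.5]
rho_ab = [[x * y for y in amplitudes] for x in amplitudes]

rho_a = partial_trace_b(rho_ab, 2, 2)
norm_a = trace(rho_a)
```

Up to floating-point rounding, `rho_a` is the maximally mixed qubit state and `norm_a` equals 1, consistent with the statement that a state can be prepared deterministically if and only if it is normalised.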
3. Higher-Order Interference
The definition of higher-order interference we shall present in this section takes its motivation
from the set-up of multi-slit interference experiments. In such experiments a particle passes through
slits in a physical barrier and is detected at a screen. By repeating the experiment many times, one
builds up a pattern on the screen. To determine if this experiment exhibits interference one compares
this pattern to those produced when certain subsets of the slits are blocked. In quantum theory,
for example, the two-slit experiment exhibits interference as the pattern formed with both slits open is
not equal to the sum of the one-slit patterns.
Consider the state of the particle just before it passes through the slits. For every slit, there should
exist states such that the particle is definitely found at that slit, if measured. Mathematically, this means
that there is a face [4] of the state space, such that all states in this face give unit probability for the
“yes” outcome of the two-outcome measurement “is the particle at this slit?”. Recall that a face is a
convex set with the property that if px + (1 − p) y, for 0 < p < 1, is an element then x and y are also
elements. These faces will be labelled Fi , one for each of the n slits i ∈ {1, . . . , n}. As the slits should
be perfectly distinguishable, the faces associated with each slit should be perfectly distinguishable,
or orthogonal. One can additionally ask coarse-grained questions of the form “Is the particle found
among a certain subset of slits, rather than somewhere else?”. The set of states that give outcome “yes”
with probability one must contain all the faces associated with each slit in the subset. Hence the face
associated with the subset of slits I ⊆ {1, . . . , n} is the smallest face containing each face in this subset,
$F_I := \bigvee_{i \in I} F_i$, where the operation $\bigvee$ is the least upper bound of the lattice of faces where the ordering
is provided by subset inclusion of one face within another. The face $F_I$ contains all those states which
can be found among the slits contained in I. The experiment is “complete” if all states in the state space
(of a given system A) can be found among some subset of slits. That is, if F12···n = St (A).
An n-slit experiment requires a system that has n orthogonal faces Fi , with i ∈ {1, . . . , n}.
Consider an effect E associated with finding a particle at a particular point on the screen. We now
formally define an n-slit experiment.
Definition 6. An n-slit experiment is a collection of effects $e_I$, where I ⊆ {1, . . . , n}, such that
$(e_I|\rho) = (E|\rho)$ for all $\rho \in F_I$, and $(e_I|\rho') = 0$ for all $\rho' \perp F_I$.
The effects introduced in the above definition arise from the conjunction of blocking off the slits
{1, . . . , n} \ I and applying the effect E. If the particle was prepared in a state such that it would be
unaffected by the blocking of the slits (i.e., ρ ∈ FI ) then we should have (eI |ρ) = ( E|ρ). If instead the
particle is prepared in a state which is guaranteed to be blocked (i.e., ρ ⊥ FI ) then the particle should
have no probability of being detected at the screen, i.e., (eI |ρ ) = 0.
The relevant quantities for the existence of various orders of interference are [2,9,13,15]:
$I_1 := (E|\rho)$, (3)
$I_2 := (E|\rho) - (e_1|\rho) - (e_2|\rho)$, (4)
$I_3 := (E|\rho) - (e_{12}|\rho) - (e_{23}|\rho) - (e_{31}|\rho) + (e_1|\rho) + (e_2|\rho) + (e_3|\rho)$, (5)
$I_n := \sum_{\emptyset \neq I \subseteq \{1,\ldots,n\}} (-1)^{n-|I|}\, (e_I|\rho)$, (6)
Definition 7. A theory has n-th order interference if there exists a state ρ and an effect E such that $I_n \neq 0$.
In a slightly different formal setting, it was shown in [2] that $I_n = 0 \implies I_{n+1} = 0$, so if there is no
n-th order interference, there will be no (n + 1)-th order interference; the argument of [2] applies here.
It should be noted that there appears to be a lot of freedom in choosing a set of effects {eI } to test
for the existence of higher-order interference. Indeed, in arbitrary generalised theories this appears to
be the case [9]. However, it is natural to ask whether there exist physical transformations $T_I$ in the
theory which correspond to leaving the subset of slits I open and blocking the rest. Hence a unique $e_I$
is assigned to each fixed E, defined as $e_I = E\,T_I$. Ruling out the existence of higher-order interference
then reduces to proving certain properties of the TI . This will turn out to be the case in sharp theories
with purification.
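The quantities in Equations (3)–(6) can be evaluated numerically in the quantum case. The sketch below assumes (for illustration only) that the slits are the computational basis vectors, that $T_I$ is the projection onto the span of the open slits, that ρ is a pure state with amplitudes `psi`, and that E is the rank-one effect with amplitudes `phi`, so that $(e_I|\rho) = |\langle\phi|P_I|\psi\rangle|^2$. The function names are hypothetical. For a subset of slits, the term with all of those slits open plays the role of $(E|\rho)$.

```python
from itertools import combinations

def slit_probability(phi, psi, open_slits):
    """(e_I|rho) = |<phi| P_I |psi>|^2 for pure rho and rank-one E (assumed)."""
    amplitude = sum(phi[i].conjugate() * psi[i] for i in open_slits)
    return abs(amplitude) ** 2

def interference(phi, psi, slits):
    """I_n as in Equation (6): signed sum over non-empty subsets of the slits."""
    n = len(slits)
    total = 0.0
    for size in range(1, n + 1):
        for subset in combinations(slits, size):
            total += (-1) ** (n - size) * slit_probability(phi, psi, subset)
    return total

amp = complex(3 ** -0.5)
psi = [amp, amp, amp]   # uniform superposition over three slits
phi = [amp, amp, amp]   # detection effect at one point on the screen

i2 = interference(phi, psi, (0, 1))      # second-order term, non-zero
i3 = interference(phi, psi, (0, 1, 2))   # third-order term, vanishes
```

With these choices `i2` is 2/9 up to rounding, while `i3` is zero up to rounding: the cross terms between pairs of amplitudes cancel exactly in the three-slit combination, which is the quantum instance of the absence of third-order interference.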
Axiom 2 (Purity Preservation [55]). Sequential and parallel compositions of pure transformations yield
pure transformations.
The second axiom—Pure Sharpness—guarantees that every system possesses at least one
elementary property.
Axiom 3 (Pure Sharpness [54]). For every system there exists at least one pure effect occurring with unit
probability on some state.
These axioms are satisfied by both classical and quantum theory. Our third axiom—Purification—
signals the departure from classicality, and characterises when a physical theory admits a level of
description where all deterministic processes are pure and reversible.
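Given a normalised state $\rho_A \in \mathrm{St}_1(A)$, a normalised pure state $\Psi \in \mathrm{PurSt}_1(A \otimes B)$ is a purification
of $\rho_A$ if
$\mathrm{Tr}_B\, \Psi = \rho_A$;
in this case B is called the purifying system. We say that a pure state $\Psi \in \mathrm{PurSt}(A \otimes B)$ is an essentially
unique purification of its marginal $\rho_A$ [39] if every other pure state $\Psi' \in \mathrm{PurSt}(A \otimes B)$ satisfying the
purification condition must be of the form
$\Psi' = (\mathcal{I}_A \otimes \mathcal{U})\,\Psi$,
where $\mathcal{U}$ is a reversible transformation on B.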
Axiom 4 (Purification [19,39]). Every state has a purification. Purifications are essentially unique.
Quantum theory, both on complex and real Hilbert spaces, satisfies Purification, as does Spekkens’
toy model [56]. Examples of sharp theories with purification besides quantum theory include fermionic
quantum theory [57,58], a superselected version of quantum theory known as doubled quantum
theory [49], and a recent extension of classical theory with the theory of codits [30].
Proposition 1. For every system A there is a positive integer dA , called the dimension of A, such that all
maximal sets of pure states have dA elements.
Note that we will omit the subscript A when the context is clear.
In sharp theories with purification every state can be diagonalised, i.e., written as a convex
combination of perfectly distinguishable pure states (cf. Refs. [30,54]).
Theorem 5. Every normalised state ρ ∈ St1 (A) of a non-trivial system can be decomposed as
$$\rho = \sum_{i=1}^{d} p_i \alpha_i,$$
where $\{p_i\}_{i=1}^{d}$ is a probability distribution, and $\{\alpha_i\}_{i=1}^{d}$ is a pure maximal set. Moreover, given ρ, $\{p_i\}_{i=1}^{d}$ is
unique up to rearrangements.
Such a decomposition is called a diagonalisation of ρ, the $p_i$’s are the eigenvalues of ρ, and the $\alpha_i$’s
are the eigenstates. Theorem 5 implies that the eigenvalues of a state are unique, and independent
of its diagonalisation. Sharp theories with purification have a unique invariant state χ [19], which
can be diagonalised as $\chi = \frac{1}{d} \sum_{i=1}^{d} \alpha_i$, where $\{\alpha_i\}_{i=1}^{d}$ is any pure maximal set [30]. Furthermore, the
diagonalisation result of Theorem 5 can be extended to every vector in $\mathsf{St}_{\mathbb{R}}(A)$, but here the eigenvalues
will be generally real numbers [30].
One of the most important consequences for this paper of the axioms defining sharp theories with
purification is a duality between normalised pure states and normalised pure effects.
Theorem 6 (States-effects duality [30,54]). For every system A, there is a bijective correspondence †:
$\mathsf{PurSt}_1(A) \to \mathsf{PurEff}_1(A)$ such that if $\alpha \in \mathsf{PurSt}_1(A)$, $\alpha^\dagger$ is the unique normalised pure effect such
that $(\alpha^\dagger|\alpha) = 1$. Furthermore this bijection can be extended by linearity to an isomorphism between the vector
spaces $\mathsf{St}_{\mathbb{R}}(A)$ and $\mathsf{Eff}_{\mathbb{R}}(A)$.
With a little abuse of notation we will use † also to denote the inverse map $\mathsf{PurEff}_1(A) \to \mathsf{PurSt}_1(A)$,
by which, if $a \in \mathsf{PurEff}_1(A)$, $a^\dagger$ is the unique pure state such that $(a|a^\dagger) = 1$. Pure maximal
sets $\{\alpha_i\}_{i=1}^{d}$ have the property that $\sum_{i=1}^{d} \alpha_i^\dagger = u$ [30].
A diagonalisation result holds for vectors of $\mathsf{Eff}_{\mathbb{R}}(A)$ as well [30]: they can be written as
$X = \sum_{i=1}^{d} \lambda_i \alpha_i^\dagger$, where $\{\alpha_i\}_{i=1}^{d}$ is a pure maximal set. Again, the $\lambda_i$’s are uniquely defined given X.
Another result that we will use in the following sections was shown to hold in Ref. [30]; it
expresses the possibility of constructing non-disturbing measurements [20,59,60].
Proposition 2. Given a system A, let a ∈ Eff (A) be an effect such that $(a|\rho) = 1$, for some ρ ∈ St1 (A).
Then there exists a pure transformation $\mathcal{T} \in \mathsf{PurTransf}(A)$ such that $\mathcal{T} =_\rho \mathcal{I}$, with $(u|\mathcal{T}|\sigma) \le (a|\sigma)$, for
every state σ ∈ St1 (A).
Note that the pure transformation T is non-disturbing on ρ because it acts as the identity on ρ and
on all states contained in it. In other words, whenever we have an effect occurring with unit probability
on some state ρ, we can always find a transformation that does not disturb ρ (i.e., a non-disturbing,
non-demolition measurement) [30].
Finally, a property that we will use often is a sort of no-restriction hypothesis for tests, derived
in [20] (Corollary 4).
Proposition 3. A collection of transformations {Ai }i∈X is a valid test if and only if ∑i∈X uAi = u.
A collection of effects { ai }i∈X is a valid observation-test if and only if ∑i∈X ai = u.
5.1. Self-Duality
Now we will prove that sharp theories with purification are self-dual. Recall that a theory is
self-dual if for every system A there is an inner product $\langle \bullet, \bullet \rangle$ on $\mathsf{St}_{\mathbb{R}}(A)$ such that $\xi \in \mathsf{St}_+(A)$ if and
only if $\langle \xi, \eta \rangle \ge 0$ for every $\eta \in \mathsf{St}_+(A)$. To show that, we need to find a self-dualising inner product
on $\mathsf{St}_{\mathbb{R}}(A)$ for every system A. The dagger will provide us with a good candidate. First we need the
following lemma.
Lemma 1. Let $a \in \mathsf{Eff}_1(A)$ be a normalised effect. Then a is of the form $a = \sum_{i=1}^{r} \alpha_i^\dagger$, with r ≤ d, and the
pure states $\{\alpha_i\}_{i=1}^{r}$ are perfectly distinguishable.
Proof. We know that every effect a can be written as $a = \sum_{i=1}^{r} \lambda_i \alpha_i^\dagger$, where r ≤ d, the pure states
$\{\alpha_i\}_{i=1}^{r}$ are perfectly distinguishable, and for every i ∈ {1, . . . , r}, $\lambda_i \in (0, 1]$. Since the state space is
closed, and a is normalised, then there exists a (normalised) state ρ such that $(a|\rho) = 1$. One has
$$1 = (a|\rho) = \sum_{i=1}^{r} \lambda_i (\alpha_i^\dagger|\rho).$$
Now, $(\alpha_i^\dagger|\rho) \ge 0$, and $\sum_{i=1}^{r} (\alpha_i^\dagger|\rho) \le 1$ because
$$\sum_{i=1}^{r} (\alpha_i^\dagger|\rho) \le \sum_{i=1}^{d} (\alpha_i^\dagger|\rho) = \operatorname{Tr} \rho = 1,$$
where we have used the fact that $\sum_{i=1}^{d} \alpha_i^\dagger = u$. Then $\sum_{i=1}^{r} \lambda_i (\alpha_i^\dagger|\rho) \le \lambda_{\max}$, where $\lambda_{\max}$ is the
maximum of the $\lambda_i$’s. Therefore, $\lambda_{\max} \ge 1$, which implies $\lambda_{\max} = 1$. Now, the condition
$$\sum_{i=1}^{r} \lambda_i (\alpha_i^\dagger|\rho) = \lambda_{\max}$$
In the above, we call r the rank of the normalised effect. We can use this result to prove
the following.
Proof. The map $\langle \bullet, \bullet \rangle$ is clearly bilinear by construction, because the dagger is also linear. Let us show
that it is positive-definite. Take a non-null vector $\xi \in \mathsf{St}_{\mathbb{R}}(A)$, and diagonalise it as $\xi = \sum_{i=1}^{d} x_i \alpha_i$. Then
$$\langle \xi, \xi \rangle = (\xi^\dagger|\xi) = \sum_{i,j=1}^{d} x_i x_j (\alpha_i^\dagger|\alpha_j) = \sum_{i=1}^{d} x_i^2 > 0,$$
where we have used the fact that for perfectly distinguishable pure states $(\alpha_i^\dagger|\alpha_j) = \delta_{ij}$ [30].
The hard part is to prove that this bilinear map is symmetric, namely $\langle \xi, \eta \rangle = \langle \eta, \xi \rangle$, for every
$\xi, \eta \in \mathsf{St}_{\mathbb{R}}(A)$. Let us define a new (double) dagger ‡. The double dagger of a normalised state ρ is
an effect $\rho^\ddagger$ whose action on normalised states σ is defined as
$$(\rho^\ddagger|\sigma) := (\sigma^\dagger|\rho), \tag{7}$$
where † is the dagger of Theorem 6. Note that Equation (7) is enough to characterise $\rho^\ddagger$ completely,
and it guarantees that $\rho^\ddagger$ is a mathematically well-defined effect, because it is linear and $(\sigma^\dagger|\rho) \in [0, 1]$.
Consider now ρ and σ to be a normalised pure state ψ. Then $(\psi^\ddagger|\psi) = (\psi^\dagger|\psi) = 1$, which means that $\psi^\ddagger$
is normalised. If we manage to show that $\psi^\ddagger$ is pure, then by Theorem 6 we can conclude that $\psi^\ddagger = \psi^\dagger$.
By Lemma 1, $\psi^\ddagger$ is of the form $\psi^\ddagger = \sum_{i=1}^{r} \alpha_i^\dagger$, with r ≤ d, and the pure states $\{\alpha_i\}_{i=1}^{r}$ are perfectly
distinguishable. Clearly $\psi^\ddagger$ is pure if and only if r = 1. To prove it, first let us evaluate $\psi^\ddagger$ on χ:
$$(\psi^\ddagger|\chi) = (\chi^\dagger|\psi) = \frac{1}{d} \operatorname{Tr} \psi = \frac{1}{d}, \tag{8}$$
$$(\psi^\ddagger|\chi) = \sum_{i=1}^{r} (\alpha_i^\dagger|\chi) = \frac{r}{d}, \tag{9}$$
because $(\alpha_i^\dagger|\chi) = \frac{1}{d}$ for every i [30]. A comparison between Equations (8) and (9) shows that r = 1.
This means that $\psi^\ddagger$ is pure, whence $\psi^\ddagger = \psi^\dagger$. Now we can show that the double dagger ‡ actually
coincides with the dagger of Theorem 6. Indeed, given a state ρ, diagonalise it as $\rho = \sum_{i=1}^{d} p_i \alpha_i$.
One can easily show that the double dagger of Equation (7) is linear, so we have $\rho^\ddagger = \sum_{i=1}^{d} p_i \alpha_i^\ddagger$, but
we have just proved that $\alpha_i^\ddagger = \alpha_i^\dagger$ for pure states, so $\rho^\ddagger = \sum_{i=1}^{d} p_i \alpha_i^\dagger = \rho^\dagger$. This means that ‡ = †, and
that Equation (7) is nothing but a redefinition of the usual dagger. Hence for all normalised
states ρ and σ we have
$$(\rho^\dagger|\sigma) = (\sigma^\dagger|\rho), \tag{10}$$
and this extends linearly to all vectors $\xi, \eta \in \mathsf{St}_{\mathbb{R}}(A)$. We have proved that $\langle \bullet, \bullet \rangle$ is symmetric, and
this concludes the proof.
Note that the above result immediately yields the “symmetry of transition probabilities” as
defined in Ref. [61,62].
Now we prove that this inner product is invariant under reversible transformations.
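Proposition 4. For every $\xi, \eta \in \mathsf{St}_{\mathbb{R}}(A)$ and every reversible channel $\mathcal{U}$ one has
$$\langle \mathcal{U}\xi, \mathcal{U}\eta \rangle = \langle \xi, \eta \rangle.$$
Proof. To prove the statement, let us first prove that for a normalised pure state α one has $(\mathcal{U}\alpha)^\dagger = \alpha^\dagger \mathcal{U}^{-1}$,
for every reversible channel $\mathcal{U}$. $\alpha^\dagger \mathcal{U}^{-1}$ is a pure effect and one has $(\alpha^\dagger \mathcal{U}^{-1}|\mathcal{U}\alpha) = (\alpha^\dagger|\alpha) = 1$.
By the uniqueness of the dagger for normalised pure states, $\alpha^\dagger \mathcal{U}^{-1} = (\mathcal{U}\alpha)^\dagger$. This can be extended
by linearity to all vectors ξ in $\mathsf{St}_{\mathbb{R}}(A)$, so $(\mathcal{U}\xi)^\dagger = \xi^\dagger \mathcal{U}^{-1}$. Therefore, when we compute $\langle \mathcal{U}\xi, \mathcal{U}\eta \rangle$,
we have
$$\langle \mathcal{U}\xi, \mathcal{U}\eta \rangle = (\xi^\dagger \mathcal{U}^{-1}|\mathcal{U}\eta) = (\xi^\dagger|\eta) = \langle \xi, \eta \rangle.$$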
The fact that $\langle \bullet, \bullet \rangle$ is an inner product allows us to define an additional norm in sharp theories
with purification: if $\xi \in \mathsf{St}_{\mathbb{R}}(A)$, define the dagger norm as
$$\|\xi\|_\dagger := \sqrt{\langle \xi, \xi \rangle}.$$
See Appendix A.1 for an extended discussion on the properties of this norm.
Now we are ready to state the core of this subsection.
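Proof. Given a system A, we need to prove that $\xi \in \mathsf{St}_{\mathbb{R}}(A)$ is in $\mathsf{St}_+(A)$ if and only if $\langle \xi, \eta \rangle \ge 0$ for
all $\eta \in \mathsf{St}_+(A)$. Note that $\xi \in \mathsf{St}_+(A)$ if and only if it can be diagonalised as $\xi = \sum_{i=1}^{d} x_i \alpha_i$, where the
$x_i$’s are all non-negative.
Necessity. Suppose $\xi \in \mathsf{St}_+(A)$, and take any $\eta \in \mathsf{St}_+(A)$, diagonalised as $\eta = \sum_{i=1}^{d} y_i \beta_i$.
Then we have
$$\langle \xi, \eta \rangle = \sum_{i,j=1}^{d} x_i y_j (\alpha_i^\dagger|\beta_j) \ge 0$$
because all the terms $x_i$, $y_j$, and $(\alpha_i^\dagger|\beta_j)$ are non-negative.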
Sufficiency. Take $\xi \in \mathsf{St}_{\mathbb{R}}(A)$, and assume that $\langle \xi, \eta \rangle \ge 0$ for all $\eta \in \mathsf{St}_+(A)$. Assume ξ is
diagonalised as $\xi = \sum_{i=1}^{d} x_i \alpha_i$, where the $x_i$’s are generic real numbers. We wish to prove that all the
$x_i$’s are non-negative. Then
$$\langle \xi, \eta \rangle = \sum_{i=1}^{d} x_i (\alpha_i^\dagger|\eta) \ge 0.$$
Recalling that for perfectly distinguishable pure states one has $(\alpha_i^\dagger|\alpha_j) = \delta_{ij}$ [30], it is enough to
take η to be one of the states $\{\alpha_i\}_{i=1}^{d}$ to conclude that $x_i \ge 0$ for every i ∈ {1, . . . , d}, meaning that
$\xi \in \mathsf{St}_+(A)$.
The self-dualising inner product, besides being a nice mathematical tool, has some operational
meaning, because it provides a measure of the distinguishability of states, as explained in Appendix A.2.
Moreover, it is the starting point for extending the dagger to all transformations. This is done in
Appendix B.
in analogy with those of Definition 6. Clearly the effect $a_I^\perp := \sum_{i \notin I} \alpha_i^\dagger$ defines the orthogonal face $F_I^\perp$,
as it occurs with probability one on the states of $F_I^\perp$. Note that each of the effects $\{\alpha_i^\dagger\}_{i \notin I}$ occurs with
zero probability on the states of $F_I$.
Definition 8. An orthogonal projector (in the sense of [20]) on the face FI is a transformation PI ∈ Transf (A)
such that
• if ρ ∈ FI , then PI ρ = ρ;
• if ρ ∈ FI⊥ , then PI ρ = 0.
We can prove the existence of a projector at least in one case, when I = {1, . . . , d}. In this case
$a_I = u$, so $F_I = \mathsf{St}_1(A)$, and $F_I^\perp = \emptyset$. Then it is enough to take $P_I \doteq \mathcal{I}$. However, sharp theories with
purification admit projectors on every face.
Proposition 6. Sharp theories with purification have pure projectors on every face FI . Furthermore one has
uPI = aI .
Proof. Suppose ρ is any state in FI , then ( aI |ρ) = 1. By Proposition 2 we know that there is a pure
transformation PI such that PI ρ = ρ for every ρ ∈ FI . We also have (u| PI |σ) ≤ ( aI |σ), so if σ ∈ FI⊥ ,
we have (u| PI |σ ) = 0, whence PI σ = 0.
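To prove that $uP_I = a_I$, first note that $\psi^\dagger P_I = \psi^\dagger$ for every pure state $\psi \in F_I$. Indeed $\psi^\dagger P_I$ is
pure by Purity Preservation, and we have $(\psi^\dagger|P_I|\psi) = (\psi^\dagger|\psi) = 1$ because $P_I \psi = \psi$ by definition.
By Theorem 6, we have $\psi^\dagger P_I = \psi^\dagger$. Furthermore, $\varphi^\dagger P_I = 0$ for a pure state $\varphi \in F_I^\perp$. Indeed, consider
$$(\varphi^\dagger|P_I|\chi) = \frac{1}{d} \sum_{i \in I} (\varphi^\dagger|P_I|\alpha_i) + \frac{1}{d} \sum_{i \notin I} (\varphi^\dagger|P_I|\alpha_i).$$
The second term vanishes because $\alpha_i \in F_I^\perp$ for $i \notin I$. The first term vanishes because $P_I \alpha_i = \alpha_i$ for
$i \in I$, and φ is perfectly distinguishable from any of the $\alpha_i$’s for $i \in I$ by means of the observation-test
$\{u - a_I, a_I\}$, implying $(\varphi^\dagger|\alpha_i) = 0$ [30]. This means that $\varphi^\dagger P_I$ occurs with zero probability on all states
contained in χ, and since χ is complete [19], $\varphi^\dagger P_I = 0$. Now, when we calculate $uP_I$, we separate the
contribution arising from states in orthogonal faces:
$$uP_I = \sum_{i \in I} \alpha_i^\dagger P_I + \sum_{i \notin I} \alpha_i^\dagger P_I = \sum_{i \in I} \alpha_i^\dagger = a_I.$$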
In other words, PI occurs with the same probability as aI , thus satisfying one of the desiderata
of Section 3. Moreover, extending some of the results in the proof of Proposition 6 by linearity, we obtain the dual
statements of Definition 8, namely
• ρ† PI = ρ† if ρ ∈ FI
• ρ† PI = 0 if ρ ∈ FI⊥
Another consequence of Proposition 6 is that projectors actually project on their associated face, viz.
for every normalised state ρ, PI ρ = λσ, where σ is in FI , and λ = ( aI |ρ). Indeed, λ = (u| PI |ρ) = ( aI |ρ).
If $\lambda \neq 0$, which means $\rho \notin F_I^\perp$, then $(a_I|\sigma) = \frac{1}{\lambda}(a_I|P_I|\rho)$. However, we know that $a_I P_I = a_I$, so
$(a_I|\sigma) = 1$, showing that $\sigma \in F_I$.
Furthermore, we can show that every projector $P_I$ has a complement $P_I^\perp$, which is the projector
associated with the effect $a_I^\perp = \sum_{i \notin I} \alpha_i^\dagger$, which defines the orthogonal face $F_I^\perp$. Clearly $P_I^\perp \rho = (a_I^\perp|\rho)\,\sigma$,
with $\sigma \in F_I^\perp$. In particular, $P_I^\perp \rho$ vanishes if and only if $\rho \in F_I$.
These properties are the starting point for proving the idempotence of projectors.
Proposition 7. Given a fixed pure maximal set $\{\alpha_i\}_{i=1}^{d}$ and I ⊆ {1, . . . , d}, one has $P_I^2 \doteq P_I$. Moreover, if J is
another subset of {1, . . . , d} disjoint from I, then $P_I P_J \doteq 0$.
Proof. Recall that for every state ρ, PI ρ = λσ, where σ is in FI . Now, PI leaves σ invariant by definition, so
which implies $(a_J|\rho) = 0$ because $(a_I|\rho) = 1$. Hence $\rho \in F_J^\perp$. Now, given any normalised state ρ,
$P_I P_J \rho = 0$ because $P_J \rho$ is proportional to a state in $F_I^\perp$. This proves that $P_I P_J \doteq 0$.
This result shows that, once a pure maximal set $\{\alpha_i\}_{i=1}^{d}$ is fixed, whenever we have a partition
$\{I_j\}$ of {1, . . . , d}, the test $\{P_{I_j}\}$ is a von Neumann measurement. The only thing left to check is that
$\sum_j uP_{I_j} = u$, which is a sufficient condition for a set of transformations to be a test in sharp theories
with purification. This is satisfied because, recalling Proposition 6,
$$\sum_j uP_{I_j} = \sum_j a_{I_j} = \sum_{i=1}^{d} \alpha_i^\dagger = u.$$
Because of the properties proved above, von Neumann measurements are repeatable and
minimally disturbing measurements in the sense of Refs. [59,63]. Indeed, aIj PIj = aIj , and
because for $k \neq j$ the $P_{I_k}$’s project on faces orthogonal to $F_{I_j}$.
The next proposition concerns the interplay between orthogonal projectors and the dagger.
Proposition 8. For every normalised state ρ, and for every projector PI on a face FI , one has ( PI ρ)† = ρ† PI .
Proof. First of all, note that $0 \le \|P_I \rho\| \le 1$, and it vanishes if and only if $\rho \in F_I^\perp$. If $\rho \in F_I^\perp$, then
$\rho^\dagger P_I = 0$, so the statement is trivially true. Now suppose $\|P_I \rho\| > 0$. We will first prove the statement
for normalised pure states ψ, then it is sufficient to extend it by linearity to all states. We will make use
of the uniqueness of the dagger for normalised pure states. Then the statement is equivalent to proving
$$\left( \frac{P_I \psi}{\|P_I \psi\|} \right)^\dagger = \frac{\psi^\dagger P_I}{\|P_I \psi\|}.$$
Noting that the term in brackets is a normalised pure state (by Purity Preservation), and that the RHS
is a pure effect (again by Purity Preservation), by the uniqueness of the dagger for normalised pure
states (cf. Theorem 6), it is enough to prove that
$$\frac{(\psi^\dagger|P_I P_I|\psi)}{\|P_I \psi\|^2} = 1;$$
in other words that $(\psi^\dagger|P_I P_I|\psi) = \|P_I \psi\|^2$. Recall that $P_I^2 \doteq P_I$ (Proposition 7), so $(\psi^\dagger|P_I P_I|\psi) = (\psi^\dagger|P_I|\psi)$.
Now, $P_I \psi = \|P_I \psi\|\, \psi'$, where $\psi'$ is a pure state in $F_I$. We have
$$(\psi^\dagger|P_I|\psi) = \|P_I \psi\|\, (\psi^\dagger|\psi').$$
We only need to prove that $(\psi^\dagger|\psi') = \|P_I \psi\|$. Recall that $(\psi^\dagger|\psi') = (\psi'^\dagger|\psi)$ by Lemma 2, and that
$\psi'^\dagger P_I = \psi'^\dagger$ as $\psi' \in F_I$, thus
$$(\psi^\dagger|\psi') = (\psi'^\dagger|P_I|\psi) = \|P_I \psi\|\, (\psi'^\dagger|\psi') = \|P_I \psi\|.$$
By the uniqueness of the dagger for normalised pure states we conclude that $\left( \frac{P_I \psi}{\|P_I \psi\|} \right)^\dagger = \frac{\psi^\dagger P_I}{\|P_I \psi\|}$, namely
$(P_I \psi)^\dagger = \psi^\dagger P_I$.
A consequence of this proposition is that orthogonal projectors play nicely with the inner product
of Lemma 2, namely for every $\xi, \eta \in \mathsf{St}_{\mathbb{R}}(A)$ one has
$$\langle P_I \xi, \eta \rangle = \langle \xi, P_I \eta \rangle. \tag{11}$$
In other words, projections are symmetric with respect to the inner product.
The last property we need is a generalisation of the results of Proposition 7.
Proposition 9. Fixing a pure maximal set $\{\alpha_i\}_{i=1}^{d}$, and considering I, J ⊆ {1, . . . , d}, we have $P_I P_J \doteq P_{I \cap J}$.
for every normalised state ρ, where $\rho' \in F_{I \cap J}$. Let us show that $\|P_I P_J \rho\| = (a_{I \cap J}|\rho)$. By Proposition 6,
$(u|P_I P_J|\rho) = (a_I|P_J|\rho)$. Now, recalling that $a_I = \sum_{i \in I} \alpha_i^\dagger$,
$$(a_I|P_J|\rho) = \sum_{i \in I \cap J} (\alpha_i^\dagger|P_J|\rho) + \sum_{i \in I \setminus J} (\alpha_i^\dagger|P_J|\rho) = \sum_{i \in I \cap J} (\alpha_i^\dagger|\rho) = (a_{I \cap J}|\rho),$$
where we have used the fact that $\alpha_i^\dagger P_J = \alpha_i^\dagger$ if $i \in J$, and $\alpha_i^\dagger P_J = 0$ if $i \notin J$. If $\rho \in F_{I \cap J}^\perp$, both the LHS and
the RHS of Equation (12) vanish, and the statement is trivially satisfied. Now, let us assume $\rho \notin F_{I \cap J}^\perp$;
in this case $(a_{I \cap J}|\rho) > 0$. We wish to prove that $(a_{I \cap J}|P_I P_J|\rho) = (a_{I \cap J}|\rho)$. Recalling the expression of
$a_{I \cap J}$, we have
$$\sum_{i \in I \cap J} (\alpha_i^\dagger|P_I P_J|\rho) = \sum_{i \in I \cap J} (\alpha_i^\dagger|P_J|\rho) = \sum_{i \in I \cap J} (\alpha_i^\dagger|\rho) = (a_{I \cap J}|\rho),$$
again by the properties of $P_I$ and $P_J$. This means that $P_I P_J$ maps every normalised state to a state of
$F_{I \cap J}$, up to normalisation.
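Now let us prove that $(P_I P_J)^2 \doteq P_I P_J$. First note that $F_{I \cap J} \subseteq F_I$. Indeed, suppose $\rho \in F_{I \cap J}$, then
$$(a_I|\rho) = \sum_{i \in I \cap J} (\alpha_i^\dagger|\rho) + \sum_{i \in I \setminus J} (\alpha_i^\dagger|\rho) = (a_{I \cap J}|\rho) = 1,$$
where we have used the fact that $(\alpha_i^\dagger|\rho) = 0$ if $i \notin I \cap J$. By a similar argument, $F_{I \cap J} \subseteq F_J$. Now,
$P_I P_J \rho = \|P_I P_J \rho\|\, \rho'$, with $\rho' \in F_{I \cap J}$. Then $(P_I P_J)^2 \rho = \|P_I P_J \rho\|\, P_I P_J \rho'$. However, $\rho' \in F_J$, so $P_J \rho' = \rho'$,
and, similarly, $\rho' \in F_I$, so $P_I \rho' = \rho'$. Consequently,
$$(P_I P_J)^2 \rho = \|P_I P_J \rho\|\, \rho' = P_I P_J \rho,$$
proving that $(P_I P_J)^2 \doteq P_I P_J$.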
Now let us prove that for every $\xi \in \mathsf{St}_{\mathbb{R}}(A)$, we have $(P_I P_J \xi)^\dagger = \xi^\dagger P_I P_J$. Following the lines of
proof of Proposition 8, let us show that this is true when ξ is a normalised pure state ψ. This boils
down to showing that
$$(\psi^\dagger|P_I P_J P_I P_J|\psi) = \|P_I P_J \psi\|^2.$$
The proof goes on as for Proposition 8, noting that if $\psi \in F_{I \cap J}$, then $\psi^\dagger P_I P_J = \psi^\dagger$ because
$\psi^\dagger P_I = \psi^\dagger$ as $\psi \in F_I$, and, similarly, $\psi^\dagger P_J = \psi^\dagger$ as $\psi \in F_J$. Eventually we find that for pure states
$(P_I P_J \psi)^\dagger = \psi^\dagger P_I P_J$, and by linearity this means that $(P_I P_J \xi)^\dagger = \xi^\dagger P_I P_J$.
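A consequence of this property is that $\langle P_I P_J \xi, \eta \rangle = \langle \xi, P_I P_J \eta \rangle$, for all $\xi, \eta \in \mathsf{St}_{\mathbb{R}}(A)$. These linear
maps on $\mathsf{St}_{\mathbb{R}}(A)$ are such that $\mathsf{St}_{\mathbb{R}}(A) = \operatorname{im} P_I P_J \oplus \ker P_I P_J$, and $\ker P_I P_J$ is the orthogonal subspace
to $\operatorname{im} P_I P_J$, hence it is uniquely defined once $\operatorname{im} P_I P_J$ is fixed. Note that for any projector $P_I$ we have
$\operatorname{im} P_I = \operatorname{span} F_I$, and we have just proved that $\operatorname{im} P_I P_J = \operatorname{span} F_{I \cap J} = \operatorname{im} P_{I \cap J}$. Having the same image,
and consequently the same kernel, $P_I P_J$ and $P_{I \cap J}$ agree on a basis of $\mathsf{St}_{\mathbb{R}}(A)$, therefore they agree also
on all states of A, meaning that $P_I P_J \doteq P_{I \cap J}$.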
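The projector algebra of Propositions 6, 7 and 9 can be checked numerically in a quantum-theory stand-in. The sketch below assumes, beyond what the text states, that the pure maximal set is the computational basis of a three-level system, that $P_I$ acts on a density matrix by keeping only the block supported on the indices in I, and that $(u|\rho)$ is the trace. The helper names and the example state are hypothetical.

```python
def project(rho, subset):
    """Stand-in for P_I: keep the entries rho[i][j] with both i and j in the subset I."""
    d = len(rho)
    return [[rho[i][j] if (i in subset and j in subset) else 0.0 for j in range(d)]
            for i in range(d)]


def trace(rho):
    """Stand-in for the deterministic effect u applied to rho."""
    return sum(rho[i][i] for i in range(len(rho)))


# Assumed example: a normalised, positive semidefinite real density matrix.
rho = [[0.5, 0.2, 0.1],
       [0.2, 0.3, 0.0],
       [0.1, 0.0, 0.2]]

I, J = {0, 1}, {1, 2}

idempotent = project(project(rho, I), I) == project(rho, I)          # Proposition 7
intersection = project(project(rho, J), I) == project(rho, I & J)    # Proposition 9
disjoint = project(project(rho, {1, 2}), {0})                        # P_I P_J for disjoint I, J
a_I_on_rho = sum(rho[i][i] for i in I)                               # (a_I|rho)

print(idempotent, intersection)
print(all(x == 0.0 for row in disjoint for x in row))
print(abs(trace(project(rho, I)) - a_I_on_rho) < 1e-12)              # u P_I = a_I
```

Because this stand-in only copies or zeroes matrix entries, the equalities hold exactly rather than up to rounding.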
sense of Definition 8 above, and the fact that these are symmetric with respect to the self-dualising inner
product (i.e., orthogonal projectors), and satisfy Proposition 9 above. We have established these weaker
premises for sharp theories with purification, and moreover, we have established in Proposition 6 that
their projectors preserve purity, so we have proved:
Theorem 7. In any sharp theory with purification there can be no nth order interference for n ≥ 3.
Theorem 8. In a sharp theory with purification, every system A has both $\mathsf{St}_+(A)$ and $\mathsf{Eff}_+(A)$ isomorphic to
the cone of squares in a Euclidean Jordan algebra (EJA) via isomorphisms S and T such that $(a|\rho) = \langle Ta, S\rho \rangle$,
where $\langle \bullet, \bullet \rangle$ is the canonical inner product on the EJA, and T takes the deterministic effect to the Jordan unit.
Proof. The proof uses results of Alfsen and Shultz [64], for which we refer to [61]. Theorem 9.33
in [61] implies that finite-dimensional systems with symmetry of transition probabilities (STP), a type
of projection operator they call “compression” associated with every face, and whose compressions
preserve purity, have state spaces affinely isomorphic to the state spaces of Euclidean Jordan algebras.
Sharp theories with purification satisfy STP, as noted following Lemma 2 above. Our projectors are
easily shown to be examples of compressions by the same argument as in Theorem 17 of [4]; this
argument uses only properties satisfied by our projectors (the same ones needed in the proof of
Theorem 7, except for Purity Preservation) and does not need Strong Symmetry. As shown above, our
projectors also preserve purity.
Since faces of Jordan-algebraic systems are also Jordan-algebraic (to see this, combine a result
of Iochum [65] (Theorem 5.32 in [61]), whose finite-dimensional case is that all faces of EJAs are the
positive part of the images of compressions, with the facts (cf. pp. 22–26 of [61]) that every face of the
cone of squares is the image of such a compression P ([61], Lemma 1.39), and also a Jordan subalgebra
whose unit is the image of the order unit under P ([61], Proposition 1.43)), so are the faces of state
spaces in sharp theories with purification. However, it is not the case that in sharp theories with
purification each face of a system is necessarily isomorphic to a stand-alone system of the theory
(an object of the category, in the categorical formulation), but it is always possible to extend the theory
so that they are. Every category has a Cauchy completion: this is a minimal extension of the category
such that every idempotent morphism π : A → A can be written as a retraction-section pair, i.e., as the
composition π = σ ◦ ρ, with ρ : A → B and σ : B → A, such that the reverse composition ρ ◦ σ is the
identity morphism on B. When the idempotents are projectors P like the ones we consider here, B will
be a system isomorphic to the face im+ ( P). Of course, since there may be idempotents beyond the
projectors onto faces (for example, decoherence of a set of orthogonal subspaces, or damping to a fixed
state, in quantum theory), Cauchy completion of an operational theory T may add many objects in
addition to ones isomorphic to faces of systems of T; indeed, for many operational theories (e.g., ones
possessing idempotent decoherence maps) this will add some classical systems. This is indeed the
case for quantum theory where the Cauchy completion leads to the category of finite-dimensional
C*-algebras and completely positive maps [66]. The Cauchy completion can be thought of as adding
in all operationally accessible systems that can be simulated on the physical system via a consistent
restriction on the allowed states, effects and transformations. The Cauchy completion of a sharp theory
with purification will likely satisfy the Ideal Compression postulate by virtue of containing the faces
that are images of orthogonal projectors; but there are also non-Cauchy complete theories that satisfy
it, e.g., the category CPM of finite-dimensional quantum systems and CP maps, in which all systems,
and also all images of orthogonal projectors as defined above, are fully coherent quantum systems, but
there are no classical systems.
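The splitting of an idempotent into a retraction-section pair can be made concrete with a toy linear-algebra example. This is only a sketch under assumptions of our own: A is taken to be $\mathbb{R}^3$, the idempotent π kills the third coordinate, and B is $\mathbb{R}^2$; none of these choices, nor the function names, come from the text, and the example says nothing about the operational structure of a theory.

```python
def idempotent(v):
    """pi : A -> A, here the map on R^3 that sets the third coordinate to zero."""
    return (v[0], v[1], 0.0)


def retraction(v):
    """rho : A -> B, with B = R^2 playing the role of the new object."""
    return (v[0], v[1])


def section(w):
    """sigma : B -> A, embedding B back into A."""
    return (w[0], w[1], 0.0)


v = (0.3, 0.5, 0.2)
w = (0.4, 0.6)

print(section(retraction(v)) == idempotent(v))     # pi = sigma after rho
print(retraction(section(w)) == w)                 # rho after sigma is the identity on B
print(idempotent(idempotent(v)) == idempotent(v))  # pi is idempotent
```

Here B is a new object on which the image of π lives as a stand-alone space, which is the role the faces play when the idempotents are projectors.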
In [37], some categories, including dagger-compact-closed categories, of Jordan algebraic systems
were constructed; these categories are equivalent to operational theories as we use the term here.
Although sharp theories with purification also have Jordan algebraic state and effect spaces, it is
interesting to note that some of the explicit examples in [30,49] involve composites different from those
that would be obtained in the categories considered in [37] for systems with the same state spaces.
On the other hand, the category combining real and quaternionic systems in [37] does not satisfy
Purity Preservation by parallel composition and hence falls outside the class of sharp theories with
purification, although its filters do preserve purity. Of course, the failure of Purity Preservation by
parallel composition seems likely to allow phenomena like the nonextensiveness of entropy when
products of states are taken, which could warrant focusing on sharp theories with purification in
thermodynamically motivated work such as [30].
That Jordan-algebraic systems lack higher-order interference was shown by Barnum and
Ududec ([12]; announced in [67]) and by Niestegge [68]; combining this with Theorem 8 gives another
way to see that our results on sharp theories with purification imply the absence of higher-order
interference. Moreover, as not all EJAs satisfy our postulates, it is clear that our postulates are sufficient
but not necessary conditions for ruling out higher-order interference.
hence, to have any chance of observing higher-order interference, experiments must be designed in
order to try to violate these conditions.
1. The transformations corresponding to blocking slits satisfy: TI TJ = TI∩J . By this we mean that
they share several properties with the projectors PI of Section 5: if we define the effects aI = uTI
and the faces FI and FI⊥ as in Section 5.2, i.e., as the 1-set and 0-set of aI , then the TI are assumed
to be orthogonal projectors in the sense of Definition 8, and to be both idempotent and “orthogonal”
(TI TJ = 0) if I and J are disjoint (as in Proposition 7).
2. The TI ’s map pure states to pure states.
3. The TI ’s are self-adjoint.
The first of these is generally expected as only those slits belonging to both I and J will not be
blocked by either TI or TJ , and so should hold in this experimental set-up for any theory that can
describe it.
The second assumption, which is also natural given the multi-slit set-up, is that, in an idealised
scenario, the slits should not introduce fundamental noise. That is, if an input state ρ is pure, i.e., has
no classical noise associated with it, then TI ρ should also be pure. Hence it appears natural to assume
that TI maps pure states to pure states. Violating this principle by just adding noise to the experiment
does not seem likely to demonstrate higher-order interference. A more plausible way to violate this,
however, would be if the particle passing through the slits were to become entangled with some degree
of freedom associated with them; if we do not have access to this degree of freedom, this would
send a pure input to a mixed state.
The final assumption is far less general than the others, as it places a constraint on the theory.
That is, to even discuss whether a transformation is self-adjoint (cf. also Appendix B), one requires
that the theory itself be self-dual. To fully understand what this assumption entails, one needs an
operational or physical interpretation of the self-dualising inner product (see [69] for an example
of such an interpretation). However, intuitively this notion reflects the inherent symmetry of the
experimental set-up. Here one could consider propagation from the source to the effect or from the
effect to the source as being “dual” to one another and, moreover, that the physical blocking of slits
has an equivalent effect in either situation. That is, the assumption of self-adjointness corresponds to
the statement that the projector has an equivalent action on the effects associated with a particular slit
as it does on the states which can pass through them.
If an experiment satisfies these assumptions then for any self-dual theory it was shown in [4]
(Proposition 29) that we will not see higher-order interference in this experiment. Hence any set
of physical principles which ensure these assumptions hold will rule out higher-order interference.
Because the mathematical assumptions involved in formalising a multi-slit experiment are so natural
when interpreted operationally, perhaps one should search for higher-order interference in set-ups that
don’t seem to preclude it from the outset. This could involve “asymmetric” multi-slit set-ups that are
not obviously time-symmetric in an arbitrary generalised probabilistic theory. One could also consider
experiments that search for higher-order phases [8], a reformulation of higher-order interference that
makes no reference to projectors and hence does not preclude certain generalised theories from the
outset. The assumption that nature is self-dual could also be rejected; this poses the question as to
whether it is possible to find a direct experimental test of this principle.
Acknowledgments: The authors thank J. Barrett for useful discussions and J. J. Barry for encouragement while
writing the current paper. This work was supported by EPSRC grants through the Controlled Quantum Dynamics
Centre for Doctoral Training, the UCL Doctoral Prize Fellowship (project number 534936), and an Oxford doctoral
training scholarship, and also by Oxford-Google DeepMind Graduate Scholarship. We also acknowledge financial
support from the European Research Council (ERC Grant Agreement No. 337603), the Danish Council for
Independent Research (Sapere Aude) and VILLUM FONDEN via the QMATH Centre of Excellence (Grant
No. 10059). This work began while the authors were attending the “Formulating and Finding Higher-order
Interference” workshop at the Perimeter Institute. Research at Perimeter Institute is supported by the Government
of Canada through the Department of Innovation, Science and Economic Development Canada and by the
Province of Ontario through the Ministry of Research, Innovation and Science.
As pointed out in [19], in quantum theory the operational norm coincides with the trace norm.
The analogy is apparent also in sharp theories with purification.
Proof. Let us separate the terms with non-negative eigenvalues from the terms with negative
eigenvalues, so that we can write $\xi = \xi_+ - \xi_-$, where $\xi_+ := \sum_{x_i \ge 0} x_i \alpha_i$, and $\xi_- = \sum_{x_i < 0} (-x_i) \alpha_i$. Clearly,
$\xi_+, \xi_- \in \mathsf{St}_+(A)$. In order to achieve the supremum of $(a|\xi)$ we must have $(a|\xi_-) = 0$. Moreover,
$$(a|\xi_+) = \sum_{x_i \ge 0} x_i (a|\alpha_i) \le \sum_{x_i \ge 0} x_i$$
since $(a|\alpha_i) \le 1$ for every i. The supremum of $(a|\xi_+)$ is achieved by $a = \sum_{x_i \ge 0} \alpha_i^\dagger$. Hence $\sup_a (a|\xi) = \sum_{x_i \ge 0} x_i$.
By a similar argument, one shows that $\inf_a (a|\xi) = \sum_{x_i < 0} x_i$. Therefore
$$\|\xi\| = \sum_{x_i \ge 0} x_i + \sum_{x_i < 0} (-x_i) = \sum_{i=1}^{d} |x_i|.$$
For p ≥ 1, the p-norm of a vector $x \in \mathbb{R}^d$ is defined as $\|x\|_p := \left( \sum_{i=1}^{d} |x_i|^p \right)^{1/p}$, thus we have
$\|\xi\| = \|x\|_1$, where x is the spectrum of ξ.
In sharp theories with purification we have an additional norm, the dagger norm, defined in
Section 5.1. The dagger norm of a vector $\xi \in \mathsf{St}_{\mathbb{R}}(A)$ is $\|\xi\|_\dagger = \sqrt{\sum_{i=1}^{d} x_i^2}$, where the $x_i$’s are the
eigenvalues of ξ. It is obvious from the very definition that $\|\xi\|_\dagger = \|x\|_2$. Thanks to these results
following from diagonalisation, we can derive the standard bounds between the two norms, by making
use of the well-known bounds $\|x\|_2 \le \|x\|_1 \le \sqrt{d}\, \|x\|_2$, which imply
$$\|\xi\|_\dagger \le \|\xi\| \le \sqrt{d}\, \|\xi\|_\dagger. \tag{A1}$$
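Since both norms depend only on the spectrum of ξ, the bounds (A1) can be checked directly on a list of eigenvalues. The following is a minimal sketch; the function names and the sample spectrum are illustrative assumptions, not taken from the text.

```python
from math import sqrt


def operational_norm(spectrum):
    """Operational norm of xi: the 1-norm of its eigenvalues."""
    return sum(abs(x) for x in spectrum)


def dagger_norm(spectrum):
    """Dagger norm of xi: the 2-norm of its eigenvalues."""
    return sqrt(sum(x * x for x in spectrum))


def satisfies_a1(spectrum, tol=1e-12):
    """Check dagger norm <= operational norm <= sqrt(d) * dagger norm."""
    d = len(spectrum)
    op, dg = operational_norm(spectrum), dagger_norm(spectrum)
    return dg <= op + tol and op <= sqrt(d) * dg + tol


# Assumed example: a vector of St_R(A) with one negative eigenvalue.
spectrum = [0.5, -0.25, 0.25]
print(operational_norm(spectrum), dagger_norm(spectrum), satisfies_a1(spectrum))
```

The upper bound is saturated by a flat spectrum such as that of the invariant state, and the lower bound by a spectrum with a single non-zero entry.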
Note that, unlike Ref. [70], here the bounds are derived without assuming Bit Symmetry [4,71].
If we take ξ to be a normalised state ρ, its eigenvalues form a probability distribution, and we
have $\|\rho\|_\dagger \le 1$, with equality if and only if ρ is pure. Note that $\|\rho\|_\dagger$ is a Schur-convex function [72] of
the eigenvalues of ρ, so it is a purity monotone [30]. As such, it attains its minimum on the invariant
state, which is $\|\chi\|_\dagger = \frac{1}{\sqrt{d}}$, so for every normalised state one has
$$\frac{1}{\sqrt{d}} \le \|\rho\|_\dagger \le 1,$$
consistently with the bounds (A1). The square of the dagger norm, still a Schur-convex function,
was called purity in Refs. [70,73]. Consequently $1 - \|\rho\|_\dagger^2$ is a measure of mixedness, sometimes
called the impurity I (ρ) of ρ. The impurity can be extended to subnormalised states by defining it as
$I(\rho) := (\operatorname{Tr} \rho)^2 - \|\rho\|_\dagger^2$ [4].
The two norms behave differently under channels applied to states. In Ref. [19] it was shown that
in causal theories the operational norm of a state ρ is preserved by channels: $\|\mathcal{C}\rho\| = \|\rho\|$ for every
channel $\mathcal{C}$, because channels are such that $u\mathcal{C} = u$.
Instead the dagger norm shows a different behaviour. To describe it, it is useful to divide channels
into two classes: unital and non-unital channels [49].
Proposition A2. If $\mathcal{D}$ is a unital channel, then $\|\mathcal{D}\rho\|_\dagger \le \|\rho\|_\dagger$, for every normalised state ρ.
Proof. Unital channels can be chosen as free operations for the resource theory of purity [49].
In Ref. [49] it was shown that the spectrum of $\mathcal{D}\rho$ is majorised by the spectrum of ρ (see Ref. [72] for
a definition of majorisation and Schur-convex functions). Since the dagger norm is a Schur-convex
function, we have $\|\mathcal{D}\rho\|_\dagger \le \|\rho\|_\dagger$.
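The step from majorisation to the norm inequality can be illustrated at the level of spectra. The sketch below is a classical stand-in built on assumptions of our own: the unital channel is replaced by a doubly stochastic matrix acting on a probability vector, which is a standard way to produce a majorised vector. The matrix, the input spectrum and the function names are hypothetical.

```python
from math import sqrt


def majorises(p, q, tol=1e-12):
    """True if p majorises q: partial sums of sorted p dominate those of sorted q."""
    ps, qs = sorted(p, reverse=True), sorted(q, reverse=True)
    run_p = run_q = 0.0
    for a, b in zip(ps, qs):
        run_p += a
        run_q += b
        if run_p < run_q - tol:
            return False
    return abs(run_p - run_q) <= tol  # totals must coincide


def dagger_norm(spectrum):
    """2-norm of the spectrum, a Schur-convex function."""
    return sqrt(sum(x * x for x in spectrum))


def apply(matrix, vector):
    """Matrix-vector product for nested lists."""
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]


# Assumed example: rows and columns of this matrix all sum to one.
bistochastic = [[0.5, 0.5, 0.0],
                [0.25, 0.25, 0.5],
                [0.25, 0.25, 0.5]]
p = [0.7, 0.2, 0.1]
q = apply(bistochastic, p)

print(majorises(p, q))
print(dagger_norm(q) <= dagger_norm(p))
```

The second check is the spectral counterpart of Proposition A2: mixing the eigenvalues cannot increase the dagger norm.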
Definition A2. Given two normalised states ρ and σ, the dagger fidelity is defined as
$$F_\dagger(\rho, \sigma) = \frac{\langle \rho, \sigma \rangle}{\|\rho\|_\dagger \|\sigma\|_\dagger}.$$
The dagger fidelity measures the overlap between two states. It shares some properties with the
fidelity in quantum theory (cf. for instance Ref. [74]), despite not coinciding with it. The first, obvious
one, is that F† (ρ, σ) = F† (σ, ρ).
To prove the other properties we need the following lemma, generalising one of the results
of Ref. [30].
Lemma A1. Let $\{\rho_i\}_{i=1}^n$ be perfectly distinguishable states. Then $(\rho_i^\dagger|\rho_j) = \|\rho_i\|_\dagger^2\,\delta_{ij}$.
Proof. Clearly what we need to prove is that $(\rho_i^\dagger|\rho_j) = 0$ if $i \neq j$. Let $\{a_i\}_{i=1}^n$ be the perfectly
distinguishing test, and let $\rho_i$ be diagonalised as $\rho_i = \sum_{k=1}^{r_i} p_{k,i}\,\alpha_{k,i}$, where $p_{k,i} > 0$ for all $k = 1, \ldots, r_i$.
We have $(a_i|\rho_i) = 1$, hence by Proposition 2 there exists a non-disturbing pure transformation $\mathcal{T}_i$ such
that $\mathcal{T}_i =_{\rho_i} \mathcal{I}$. Specifically, we have that $\mathcal{T}_i\alpha_{k,i} = \alpha_{k,i}$. Moreover, if $i \neq j$, we have $(u|\mathcal{T}_i|\rho_j) \le (a_i|\rho_j) = 0$,
whence $(u|\mathcal{T}_i|\rho_j) = 0$. This means that $\mathcal{T}_i\rho_j = 0$ for all $j \neq i$.
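Now, consider
$$(\alpha_{k,i}^\dagger|\mathcal{T}_i|\alpha_{k,i}) = (\alpha_{k,i}^\dagger|\alpha_{k,i}) = 1,$$
where we have used the fact that $\mathcal{T}_i\alpha_{k,i} = \alpha_{k,i}$. Since $\alpha_{k,i}^\dagger\mathcal{T}_i$ is a pure effect, it must be $\alpha_{k,i}^\dagger\mathcal{T}_i = \alpha_{k,i}^\dagger$ by
Theorem 6. By linearity we have $\rho_i^\dagger\mathcal{T}_i = \rho_i^\dagger$. Now, using this fact, for all $j \neq i$
$$(\rho_i^\dagger|\rho_j) = (\rho_i^\dagger|\mathcal{T}_i|\rho_j) = 0,$$
because $\mathcal{T}_i\rho_j = 0$.
Recalling that $(\rho^\dagger|\sigma) = \langle\rho, \sigma\rangle$, this lemma means that perfectly distinguishable states form an
orthogonal set. Specifically, if the states are pure, the set is orthonormal.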
The following proposition extends and generalises the properties of the self-dualising inner
product of Ref. [71].
Proposition A3. The dagger fidelity has the following properties, for all normalised states ρ and σ.
1. 0 ≤ F† (ρ, σ) ≤ 1;
2. F† (ρ, σ ) = 0 if and only if ρ and σ are perfectly distinguishable;
3. F† (ρ, σ ) = 1 if and only if ρ = σ;
4. F† (U ρ, U σ) = F† (ρ, σ ), for every reversible channel U .
Note that Property 3 captures the sharpness of the dagger for all normalised states [69].
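The four properties can be illustrated in the classical special case. This is a sketch under the assumption that states are probability vectors, the inner product is the ordinary dot product, perfectly distinguishable states are those with disjoint supports, and reversible channels are permutations; the vectors and the helper names are hypothetical.

```python
import math

def inner(p, q):
    # Assumption: the self-dualising inner product is the dot product.
    return sum(a * b for a, b in zip(p, q))

def dagger_fidelity(p, q):
    # F(p, q) = <p, q> / (||p|| ||q||)
    return inner(p, q) / math.sqrt(inner(p, p) * inner(q, q))

def permute(p, perm):
    # A reversible classical channel: relabel the outcomes.
    return [p[i] for i in perm]

rho = [0.5, 0.5, 0.0, 0.0]
sigma = [0.0, 0.0, 0.25, 0.75]   # disjoint support from rho
tau = [0.4, 0.4, 0.1, 0.1]       # overlaps with rho
perm = [2, 0, 3, 1]

print(dagger_fidelity(rho, sigma))            # perfectly distinguishable pair
print(dagger_fidelity(rho, rho))              # identical states
print(round(dagger_fidelity(rho, tau), 4))    # strictly between 0 and 1
```

Applying the same permutation to both arguments leaves the value unchanged, matching Property 4.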
A property involving tensor product of states is the following.
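Proof. Let us prove the result for ρ and σ pure; the general result will follow by linearity. By Purity
Preservation, $\rho \otimes \sigma$ and $\rho^\dagger \otimes \sigma^\dagger$ are pure, and one has $(\rho^\dagger \otimes \sigma^\dagger|\rho \otimes \sigma) = 1$. By Theorem 6,
$(\rho \otimes \sigma)^\dagger = \rho^\dagger \otimes \sigma^\dagger$.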
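$$F_\dagger(\rho_1 \otimes \rho_2, \sigma_1 \otimes \sigma_2) = \frac{\langle\rho_1 \otimes \rho_2, \sigma_1 \otimes \sigma_2\rangle}{\|\rho_1 \otimes \rho_2\|_\dagger\,\|\sigma_1 \otimes \sigma_2\|_\dagger}.$$
Furthermore,
$$\|\rho_1 \otimes \rho_2\|_\dagger = \sqrt{\langle\rho_1 \otimes \rho_2, \rho_1 \otimes \rho_2\rangle} = \sqrt{\langle\rho_1, \rho_1\rangle\langle\rho_2, \rho_2\rangle} = \|\rho_1\|_\dagger\,\|\rho_2\|_\dagger.$$
Hence
$$F_\dagger(\rho_1 \otimes \rho_2, \sigma_1 \otimes \sigma_2) = \frac{\langle\rho_1, \sigma_1\rangle}{\|\rho_1\|_\dagger\,\|\sigma_1\|_\dagger} \cdot \frac{\langle\rho_2, \sigma_2\rangle}{\|\rho_2\|_\dagger\,\|\sigma_2\|_\dagger} = F_\dagger(\rho_1, \sigma_1)\,F_\dagger(\rho_2, \sigma_2).$$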
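Definition A3. Given the transformation $\mathcal{A} \in \mathrm{Transf}(A, B)$, its dagger (or adjoint) is a linear transformation
$\mathcal{A}^\dagger$ from B to A defined as
$$(\mathcal{A}^\dagger \otimes \mathcal{I}_S)\,|\rho) := \Big((\rho^\dagger|\,(\mathcal{A} \otimes \mathcal{I}_S)\Big)^\dagger, \qquad \text{(A2)}$$
for every system S and every state $\rho \in \mathrm{St}_1(B \otimes S)$.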
This definition specifies the dagger of a transformation completely, thanks to Equation (2).
Note that Lemma 2 allows us to formulate Equation (10) in terms of effects and their daggers:
$$(a|b^\dagger) = (b|a^\dagger)$$
for all effects a and b. In this way, Definition A3 can be recast in equivalent terms by taking b as the
term in round brackets in the RHS of Equation (A2). This yields
$$(E|\,(\mathcal{A}^\dagger \otimes \mathcal{I}_S)\,|\rho) = (\rho^\dagger|\,(\mathcal{A} \otimes \mathcal{I}_S)\,|E^\dagger), \qquad \text{(A3)}$$
for every system S, every state ρ ∈ St1 (B ⊗ S), and every effect E ∈ Eff (A ⊗ S).
The dagger of a transformation may not be a physical transformation, i.e., it may send physical
states to non-physical ones. Indeed, the action of A† ⊗ I on a generic state (the LHS of Equation (A2))
is defined as the dagger of an effect. However, not all daggers of effects are physical states. For instance,
take the deterministic effect $u = \sum_{i=1}^d \alpha_i^\dagger$, where $\{\alpha_i\}_{i=1}^d$ is a pure maximal set. Its dagger is
$u^\dagger = \sum_{i=1}^d \alpha_i = d\chi$, which is a supernormalised (and hence non-physical) state.
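As a worked instance of this remark, consider a qubit. The following lines are only an illustration, assuming that in quantum theory the dagger sends the effect $\mathrm{Tr}(P\,\cdot)$ to the state $P$ for a rank-one projector $P$; the choice of the computational basis is arbitrary.

```latex
% Illustrative qubit instance (assumption: the dagger maps the effect Tr(P .) to the state P)
\begin{align*}
u &= \alpha_1^\dagger + \alpha_2^\dagger,
  \qquad \alpha_1 = |0\rangle\langle 0|,\quad \alpha_2 = |1\rangle\langle 1|,\\
u^\dagger &= \alpha_1 + \alpha_2 = \mathbb{1} = 2\,\chi,
  \qquad \chi = \tfrac{1}{2}\,\mathbb{1},\\
\operatorname{Tr} u^\dagger &= 2 > 1 .
\end{align*}
```

The trace exceeds one, so $u^\dagger$ is supernormalised, as stated above for general $d$.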
For channels, we can give a necessary condition for the existence of a physical dagger of the channel.
Proposition A5. Let C ∈ Transf (A, B) be a channel. If C † is a physical transformation, then C is unital, and
C † itself is a unital channel.
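Proof. If $\mathcal{C}^\dagger$ is a physical transformation, then, for every normalised state $\rho \in \mathrm{St}_1(B)$, we have
$\|\mathcal{C}^\dagger\rho\| \le 1$, or in other words, $(u|\mathcal{C}^\dagger|\rho) \le 1$. By Equation (A3), $(u|\mathcal{C}^\dagger|\rho) = (\rho^\dagger|\mathcal{C}|u^\dagger)$, so the condition
$\|\mathcal{C}^\dagger\rho\| \le 1$ is equivalent to
$$(\rho^\dagger|\mathcal{C}|\chi) \le \frac{1}{d}, \qquad \text{(A4)}$$
with equality if and only if $\mathcal{C}^\dagger$ is a channel. Suppose by contradiction that $\mathcal{C}$ is not unital; then
$\mathcal{C}\chi = \rho_0 \neq \chi$. Diagonalise $\rho_0$ as $\rho_0 = \sum_{i=1}^d p_i\alpha_i$, where $p_1 \ge p_2 \ge \ldots \ge p_d \ge 0$, and $p_1 > \frac{1}{d}$.
Then taking ρ to be $\alpha_1$ in $(\rho^\dagger|\mathcal{C}|\chi)$ yields $p_1$, but $p_1 > \frac{1}{d}$, contradicting Equation (A4).
Since $\mathcal{C}$ is unital, we have that
$$(\rho^\dagger|\mathcal{C}|\chi) = (\rho^\dagger|\chi) = \frac{1}{d}\,\mathrm{Tr}\,\rho = \frac{1}{d},$$
showing that $\mathcal{C}^\dagger$ is itself a channel. Let us prove it is unital. The action of $\mathcal{C}^\dagger$ on χ is defined in
Equation (A2), so
$$\mathcal{C}^\dagger\chi = \big(\chi^\dagger\mathcal{C}\big)^\dagger = \frac{1}{d}\,(u\mathcal{C})^\dagger = \frac{1}{d}\,u^\dagger = \chi,$$
where we have used the fact that $\mathcal{C}$ is a channel, so $u\mathcal{C} = u$. This proves that $\mathcal{C}^\dagger$ is unital.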
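Proposition A5 has a transparent classical counterpart, sketched below under the assumption that classical channels are column-stochastic matrices and that the dagger with respect to the dot product is the matrix transpose. The two example matrices are hypothetical.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def is_channel(m, tol=1e-12):
    # Column-stochastic: non-negative entries, every column sums to one (u C = u).
    non_negative = all(x >= -tol for row in m for x in row)
    normalised = all(abs(sum(col) - 1.0) <= tol for col in transpose(m))
    return non_negative and normalised

def is_unital(m, tol=1e-12):
    # For a square matrix, C chi = chi holds exactly when every row sums to one.
    return all(abs(sum(row) - 1.0) <= tol for row in m)

unital = [[0.9, 0.1],
          [0.1, 0.9]]     # doubly stochastic bit-flip noise
reset = [[1.0, 1.0],
         [0.0, 0.0]]      # sends every state to outcome 0: a channel, but not unital

for name, c in (("unital", unital), ("reset", reset)):
    print(name, is_channel(c), is_unital(c), is_channel(transpose(c)))
```

The transpose of the reset channel maps the point distribution on outcome 0 to the vector (1, 1), which is supernormalised, so the dagger is not physical in that case.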
We can prove that the dagger of a transformation has some nice properties.
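Proposition A6. For every transformation $\mathcal{A} \in \mathrm{Transf}(A, B)$, one has $(\mathcal{A}^\dagger)^\dagger = \mathcal{A}$.
Proof. By Equation (A3), given any system S, any state $\rho \in \mathrm{St}_1(A \otimes S)$, and any effect $E \in \mathrm{Eff}(B \otimes S)$,
we have
$$(E|\,\big((\mathcal{A}^\dagger)^\dagger \otimes \mathcal{I}_S\big)\,|\rho) = (\rho^\dagger|\,(\mathcal{A}^\dagger \otimes \mathcal{I}_S)\,|E^\dagger). \qquad \text{(A5)}$$
A linear extension of Equation (A3) to cover the case when $E^\dagger$ is not a physical state, applied to the
RHS of Equation (A5), yields
$$(\rho^\dagger|\,(\mathcal{A}^\dagger \otimes \mathcal{I}_S)\,|E^\dagger) = (E|\,(\mathcal{A} \otimes \mathcal{I}_S)\,|\rho).$$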
We can give a characterisation of the dagger of reversible channels, which are unital channels.
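Proof. We have
$$(E|\,(\mathcal{U}^\dagger \otimes \mathcal{I}_S)\,|\rho) = (\rho^\dagger|\,(\mathcal{U} \otimes \mathcal{I}_S)\,|E^\dagger),$$
for any S, ρ, E. Recalling Lemma 2, the RHS is $\langle\rho, (\mathcal{U} \otimes \mathcal{I})E^\dagger\rangle$. By Proposition 4, $\langle\rho, (\mathcal{U} \otimes \mathcal{I})E^\dagger\rangle =
\langle(\mathcal{U}^{-1} \otimes \mathcal{I})\rho, E^\dagger\rangle$, and by symmetry of the inner product we have that
$$\langle(\mathcal{U}^{-1} \otimes \mathcal{I})\rho, E^\dagger\rangle = \langle E^\dagger, (\mathcal{U}^{-1} \otimes \mathcal{I})\rho\rangle = (E|\,(\mathcal{U}^{-1} \otimes \mathcal{I}_S)\,|\rho),$$
whence $\mathcal{U}^\dagger = \mathcal{U}^{-1}$.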
In particular we have that the dagger of the SWAP channel between two systems is the SWAP with
the input and output systems reversed.
The orthogonal projectors of Section 5.2, on the other hand, are self-adjoint on a single system.
Proposition A8. Given the orthogonal projector $P_I$ on a face $F_I$, we have $P_I^\dagger \doteq P_I$.
Proof. For every ρ and E, we have $(E|P_I^\dagger|\rho) = (\rho^\dagger|P_I|E^\dagger)$. The RHS is $\langle\rho, P_I E^\dagger\rangle$. By the properties
of projectors,
$$\langle\rho, P_I E^\dagger\rangle = \langle P_I\rho, E^\dagger\rangle = \langle E^\dagger, P_I\rho\rangle = (E|P_I|\rho).$$
This shows that $P_I^\dagger \doteq P_I$.
Finally we prove some properties of the dagger with respect to compositions. We need an easy
lemma first.
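Lemma A3. For every $\mathcal{A} \in \mathrm{Transf}(A, B)$, every system S, and every vector $\xi \in \mathrm{St}_{\mathbb{R}}(A \otimes S)$ we have
$$\big[(\mathcal{A} \otimes \mathcal{I}_S)\,|\xi)\big]^\dagger = (\xi^\dagger|\,(\mathcal{A}^\dagger \otimes \mathcal{I}_S).$$
$$\big[(\mathcal{A} \otimes \mathcal{I}_S)\,|\xi)\big]^\dagger = \big[\big((\mathcal{A}^\dagger)^\dagger \otimes \mathcal{I}_S\big)\,|\xi)\big]^\dagger = \Big(\big[(\xi^\dagger|\,(\mathcal{A}^\dagger \otimes \mathcal{I}_S)\big]^\dagger\Big)^\dagger.$$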
Now we can state the main results. The first concerns sequential composition.
Proposition A9. For all transformations A ∈ Transf (A, B), B ∈ Transf (B, C), one has (BA)† = A† B † .
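Proof. Take any system S, any state $\rho \in \mathrm{St}_1(C \otimes S)$, and any effect $E \in \mathrm{Eff}(A \otimes S)$. By Equation (A3)
we have
$$(E|\,\big((\mathcal{B}\mathcal{A})^\dagger \otimes \mathcal{I}_S\big)\,|\rho) = (\rho^\dagger|\,(\mathcal{B}\mathcal{A} \otimes \mathcal{I}_S)\,|E^\dagger) = (\rho^\dagger|\,(\mathcal{B} \otimes \mathcal{I}_S)(\mathcal{A} \otimes \mathcal{I}_S)\,|E^\dagger).$$
Define ξ as $\xi := (\mathcal{A} \otimes \mathcal{I})E^\dagger$, so
$$(E|\,\big((\mathcal{B}\mathcal{A})^\dagger \otimes \mathcal{I}_S\big)\,|\rho) = (\rho^\dagger|\,(\mathcal{B} \otimes \mathcal{I}_S)\,|\xi) = (\xi^\dagger|\,(\mathcal{B}^\dagger \otimes \mathcal{I}_S)\,|\rho).$$
By Lemma A3, $\xi^\dagger = \big[(\mathcal{A} \otimes \mathcal{I})E^\dagger\big]^\dagger = (E|\,(\mathcal{A}^\dagger \otimes \mathcal{I})$, then
$$(E|\,\big((\mathcal{B}\mathcal{A})^\dagger \otimes \mathcal{I}_S\big)\,|\rho) = (E|\,(\mathcal{A}^\dagger \otimes \mathcal{I}_S)(\mathcal{B}^\dagger \otimes \mathcal{I}_S)\,|\rho),$$
therefore $(\mathcal{B}\mathcal{A})^\dagger = \mathcal{A}^\dagger\mathcal{B}^\dagger$.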
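Lemma A4. For every $\mathcal{A} \in \mathrm{Transf}(A, B)$ and all systems S and S′, we have $(\mathcal{I}_S \otimes \mathcal{A} \otimes \mathcal{I}_{S'})^\dagger = \mathcal{I}_S \otimes \mathcal{A}^\dagger \otimes \mathcal{I}_{S'}$.
Proof. As a first step, let us prove that, for every system S, we have $(\mathcal{A} \otimes \mathcal{I}_S)^\dagger = \mathcal{A}^\dagger \otimes \mathcal{I}_S$. Take any
system S′, any state $\rho \in \mathrm{St}_1(B \otimes S \otimes S')$, and any effect $E \in \mathrm{Eff}(A \otimes S \otimes S')$. Equation (A3) yields
$$(E|\,\big((\mathcal{A} \otimes \mathcal{I}_S)^\dagger \otimes \mathcal{I}_{S'}\big)\,|\rho) = (\rho^\dagger|\,\big((\mathcal{A} \otimes \mathcal{I}_S) \otimes \mathcal{I}_{S'}\big)\,|E^\dagger) = (\rho^\dagger|\,(\mathcal{A} \otimes \mathcal{I}_{S \otimes S'})\,|E^\dagger).$$
$$(\rho^\dagger|\,(\mathcal{A} \otimes \mathcal{I}_{S \otimes S'})\,|E^\dagger) = (E|\,(\mathcal{A}^\dagger \otimes \mathcal{I}_{S \otimes S'})\,|\rho),$$
$$\mathcal{I}_S \otimes \mathcal{A} = \mathrm{SWAP}\,(\mathcal{A} \otimes \mathcal{I}_S)\,\mathrm{SWAP}.$$
To get the thesis, note that $(\mathcal{I}_S \otimes \mathcal{A} \otimes \mathcal{I}_{S'})^\dagger = [(\mathcal{I}_S \otimes \mathcal{A}) \otimes \mathcal{I}_{S'}]^\dagger$. We have just proved that
$(\mathcal{A} \otimes \mathcal{I}_S)^\dagger = \mathcal{A}^\dagger \otimes \mathcal{I}_S$ and that $(\mathcal{I}_S \otimes \mathcal{A})^\dagger = \mathcal{I}_S \otimes \mathcal{A}^\dagger$, therefore we conclude that $(\mathcal{I}_S \otimes \mathcal{A} \otimes \mathcal{I}_{S'})^\dagger = \mathcal{I}_S \otimes \mathcal{A}^\dagger \otimes \mathcal{I}_{S'}$.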
Proposition A10. Let A ∈ Transf (A, B), and B ∈ Transf (C, D). We have (A ⊗ B)† = A† ⊗ B † .
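Proof. Take any system S, any state $\rho \in \mathrm{St}_1(B \otimes D \otimes S)$, and any effect $E \in \mathrm{Eff}(A \otimes C \otimes S)$; we have
$$(E|\,\big((\mathcal{A} \otimes \mathcal{B})^\dagger \otimes \mathcal{I}_S\big)\,|\rho) = (\rho^\dagger|\,(\mathcal{A} \otimes \mathcal{B} \otimes \mathcal{I}_S)\,|E^\dagger) = (\rho^\dagger|\,(\mathcal{A} \otimes \mathcal{I}_D \otimes \mathcal{I}_S)(\mathcal{I}_A \otimes \mathcal{B} \otimes \mathcal{I}_S)\,|E^\dagger).$$
Writing $\xi := (\mathcal{I}_A \otimes \mathcal{B} \otimes \mathcal{I}_S)E^\dagger$, this becomes
$$(E|\,\big((\mathcal{A} \otimes \mathcal{B})^\dagger \otimes \mathcal{I}_S\big)\,|\rho) = (\rho^\dagger|\,(\mathcal{A} \otimes \mathcal{I}_D \otimes \mathcal{I}_S)\,|\xi) = (\xi^\dagger|\,(\mathcal{A}^\dagger \otimes \mathcal{I}_D \otimes \mathcal{I}_S)\,|\rho).$$
By Lemmas A3 and A4, we have that $\xi^\dagger = (E|\,(\mathcal{I}_A \otimes \mathcal{B}^\dagger \otimes \mathcal{I}_S)$, so
$$(E|\,\big((\mathcal{A} \otimes \mathcal{B})^\dagger \otimes \mathcal{I}_S\big)\,|\rho) = (E|\,(\mathcal{A}^\dagger \otimes \mathcal{B}^\dagger \otimes \mathcal{I}_S)\,|\rho).$$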
This means that the dagger respects the composition of diagrams, and corresponds to the action
of flipping a diagram with respect to a vertical axis.
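Both composition rules can be checked in a concrete matrix model. The sketch below assumes real matrices with the transpose playing the role of the dagger, sequential composition given by the matrix product and parallel composition by the Kronecker product; the integer matrices are arbitrary test data.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    # Sequential composition: first b, then a.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    # Parallel composition: rows indexed by (i, k), columns by (j, l).
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

A = [[1, 2], [3, 4], [5, 6]]   # a map from a 2-level to a 3-level system
B = [[1, 0, 2], [0, 1, 1]]     # a map from a 3-level to a 2-level system
C = [[0, 1], [2, 3]]           # a map acting on a separate system

# (B A)^dagger = A^dagger B^dagger: the order of sequential composition is reversed.
sequential_ok = transpose(matmul(B, A)) == matmul(transpose(A), transpose(B))
# (A (x) C)^dagger = A^dagger (x) C^dagger: parallel composition is preserved.
parallel_ok = transpose(kron(A, C)) == kron(transpose(A), transpose(C))

print(sequential_ok, parallel_ok)
```

Reversal of the sequential order with the parallel order kept is what one expects from flipping a diagram so that inputs and outputs are exchanged.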
References
1. Feynman, R.P.; Leighton, R.; Sands, M. The Feynman Lectures on Physics. The Definitive and Extended Edition;
Addison Wesley: Boston, MA, USA, 2005.
2. Sorkin, R.D. Quantum mechanics as quantum measure theory. Mod. Phys. Lett. A 1994, 9, 3119–3127.
3. Sorkin, R.D. Quantum Classical Correspondence: The 4th Drexel Symposium on Quantum Nonintegrability;
Chapter Quantum Measure Theory and Its Interpretation; International Press: Boston, MA, USA, 1997;
pp. 229–251.
4. Barnum, H.; Müller, M.P.; Ududec, C. Higher-order interference and single-system postulates characterizing
quantum theory. New J. Phys. 2014, 16, 123029.
5. Bolotin, A. On the ongoing experiments looking for higher-order interference: What are they really testing?
arXiv 2016, arXiv:1611.06461.
6. Dakić, B.; Paterek, T.; Brukner, Č. Density cubes and higher-order interference theories. New J. Phys. 2014,
16, 023028.
7. Lee, C.M.; Selby, J.H. Deriving Grover’s lower bound from simple physical principles. New J. Phys. 2016,
18, 093047.
8. Lee, C.M.; Selby, J.H. Generalised phase kick-back: The structure of computational algorithms from physical
principles. New J. Phys. 2016, 18, 033023.
9. Lee, C.M.; Selby, J.H. Higher-order interference in extensions of quantum theory. Found. Phys. 2017,
47, 89–112.
10. Niestegge, G. Three-slit experiments and quantum nonlocality. Found. Phys. 2013, 43, 805–812.
11. Ududec, C. Perspectives on the Formalism of Quantum Theory. Ph.D. Thesis, University of Waterloo,
Waterloo, ON, Canada, 2012.
12. Ududec, C.; Barnum, H.; Emerson, J. Probabilistic Interference in Operational Models. 2009, in preparation.
13. Ududec, C.; Barnum, H.; Emerson, J. Three slit experiments and the structure of quantum theory. Found. Phys.
2011, 41, 396–405.
14. Lee, C.M.; Selby, J.H. A no-go theorem for theories that decohere to quantum mechanics. arXiv 2017,
arXiv:1701.07449.
15. Barnum, H.; Barrett, J.; Leifer, M.; Wilce, A. Generalized no-broadcasting theorem. Phys. Rev. Lett. 2007,
99, 240501.
16. Barnum, H.; Wilce, A. Information processing in convex operational theories. Electron. Notes Theor. Comput. Sci.
2011, 270, 3–15.
17. Barrett, J. Information processing in generalized probabilistic theories. Phys. Rev. A 2007, 75, 032304.
18. Barrett, J.; de Beaudrap, N.; Hoban, M.J.; Lee, C.M. The computational landscape of general physical theories.
arXiv 2017, arXiv:1702.08483.
19. Chiribella, G.; D’Ariano, G.M.; Perinotti, P. Probabilistic theories with purification. Phys. Rev. A 2010,
81, 062348.
20. Chiribella, G.; D’Ariano, G.M.; Perinotti, P. Informational derivation of quantum theory. Phys. Rev. A 2011,
84, 012311.
21. Chiribella, G.; Spekkens, R.W. (Eds.) Quantum Theory: Informational Foundations and Foils; Fundamental
Theories of Physics; Springer: Dordrecht, The Netherlands, 2016; Volume 181.
22. Dakić, B.; Brukner, Č. Quantum Theory and Beyond: Is Entanglement Special; Cambridge University Press:
Cambridge, UK, 2011; pp. 365–392.
23. Hardy, L. Quantum Theory From Five Reasonable Axioms. arXiv 2001, arXiv:quant-ph/0101012.
24. Hardy, L. Foliable Operational Structures for General Probabilistic Theories; Cambridge University Press:
Cambridge, UK, 2011; pp. 409–442.
25. Lee, C.M.; Barrett, J. Computation in generalised probabilistic theories. New J. Phys. 2015, 17, 083001.
26. Lee, C.M.; Hoban, M.J. Bounds on the power of proofs and advice in general physical theories. Proc. R. Soc. A
2016, 472, 20160076.
27. Lee, C.M.; Hoban, M.J. The information content of systems in general physical theories. In Proceedings of
the 7th International Workshop on Physics and Computation, Manchester, UK, 14 July 2016; Volume 214,
pp. 22–28.
28. Masanes, L.; Müller, M.P. A derivation of quantum theory from physical requirements. New J. Phys. 2011,
13, 063001.
29. Hardy, L. Reformulating and reconstructing quantum theory. arXiv 2011, arXiv:1104.2066.
30. Chiribella, G.; Scandolo, C.M. Entanglement as an axiomatic foundation for statistical mechanics. arXiv
2016, arXiv:1608.04459.
31. Krumm, M.; Barnum, H.; Barrett, J.; Müller, M.P. Thermodynamics and the structure of quantum theory.
New J. Phys. 2017, 19, 043025.
32. Jin, F.; Liu, Y.; Geng, J.; Huang, P.; Ma, W.; Shi, M.; Duan, C.; Shi, F.; Rong, X.; Du, J. Experimental test of
Born’s rule by inspecting third-order quantum interference on a single spin in solids. Phys. Rev. A 2017,
95, 012107.
33. Kauten, T.; Keil, R.; Kaufmann, T.; Pressl, B.; Brukner, Č.; Weihs, G. Obtaining tight bounds on higher-order
interferences with a 5-path interferometer. New J. Phys. 2017, 19, 033017.
34. Park, D.K.; Moussa, O.; Laflamme, R. Three path interference using nuclear magnetic resonance: A test of
the consistency of Born’s rule. New J. Phys. 2012, 14, 113025.
35. Sinha, A.; Vijay, A.H.; Sinha, U. On the superposition principle in interference experiments. Sci. Rep. 2015,
5, 10304.
36. Sinha, U.; Couteau, C.; Jennewein, T.; Laflamme, R.; Weihs, G. Ruling out multi-order interference in
quantum mechanics. Science 2010, 329, 418–421.
37. Barnum, H.; Graydon, M.; Wilce, A. Composites and categories of Euclidean Jordan algebras. arXiv 2016,
arXiv:1606.09331.
38. Chiribella, G. Dilation of states and processes in operational-probabilistic theories. In Proceedings of the
11th workshop on Quantum Physics and Logic, Kyoto, Japan, 4–6 June 2014; Volume 172, pp. 1–14.
39. Chiribella, G.; D’Ariano, G.M.; Perinotti, P. Quantum Theory: Informational Foundations and Foils; Chapter
Quantum from Principles; Springer: Dordrecht, The Netherlands, 2016; pp. 171–221.
40. Hardy, L. Quantum Theory: Informational Foundations and Foils; Chapter Reconstructing Quantum Theory;
Springer: Dordrecht, The Netherlands, 2016; pp. 223–248.
41. Abramsky, S.; Coecke, B. A categorical semantics of quantum protocols. In Proceedings of the 19th Annual
IEEE Symposium on Logic in Computer Science, Turku, Finland, 13–17 July 2004; pp. 415–425.
42. Coecke, B. Kindergarten quantum mechanics: Lecture notes. AIP Conf. Proc. 2006, 810, 81–98.
70. Müller, M.P.; Oppenheim, J.; Dahlsten, O.C.O. The black hole information problem beyond quantum theory.
J. High Energy Phys. 2012, 2012, 9.
71. Müller, M.P.; Ududec, C. Structure of reversible computation determines the self-duality of quantum theory.
Phys. Rev. Lett. 2012, 108, 130401.
72. Marshall, A.W.; Olkin, I.; Arnold, B.C. Inequalities: Theory of Majorization and Its Applications; Springer Series
in Statistics; Springer: New York, NY, USA, 2011.
73. Müller, M.P.; Dahlsten, O.C.O.; Vedral, V. Unifying typical entanglement and coin tossing: On randomization
in probabilistic theories. Commun. Math. Phys. 2012, 316, 441–487.
74. Wilde, M.M. Quantum Information Theory, 2nd ed.; Cambridge University Press: Cambridge, UK, 2017.
75. Selinger, P. Dagger compact closed categories and completely positive maps. Electron. Notes Theor. Comput. Sci.
2007, 170, 139–163.
© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
entropy
Article
Leaks: Quantum, Classical, Intermediate and More
John Selby 1 and Bob Coecke 2, *
1 Department of Physics, Imperial College London, Kensington, London SW7 2AZ, UK;
[email protected]
2 Department of Computer Science, University of Oxford, Oxford OX1 3PA, UK
* Correspondence: [email protected]; Tel.: +44-7881-333990
Abstract: We introduce the notion of a leak for general process theories and identify quantum
theory as a theory with minimal leakage, while classical theory has maximal leakage. We provide
a construction that adjoins leaks to theories, an instance of which describes the emergence of classical
theory by adjoining decoherence leaks to quantum theory. Finally, we show that defining a notion
of purity for processes in general process theories has to make reference to the leaks of that theory,
a feature missing in standard definitions; hence, we propose a refined definition and study the
resulting notion of purity for quantum, classical and intermediate theories.
1. Introduction
Can we explain why the world is quantum by finding some sense in which quantum theory is an
optimal theory? Broadcasting distinguishes quantum theory from classical theory in that quantum
states cannot be broadcast [1], but neither can the states of many other theories [2,3]. Non-locality is
a measure of non-classicality, and quantum theory is non-local, but not maximally so [4]. Therefore,
is there some manner in which we can uniquely single out quantum theory? In this paper, we show
that quantum theory is a leak-free theory, whilst classical theory is maximally leaking. We formalise
the notion of a leak, which can roughly be thought of as a ‘one-sided broadcasting map’, within
the process-theoretic framework [3,5,6] as a particular type of process, which, as the name suggests,
accounts for leaking state-data into the environment.
Moreover, there is a natural way to introduce leaks to any theory, and by doing so, we obtain new
theories. We call this the leak construction. In particular, classical theory can be obtained from quantum
theory in this manner, where, in this example, the leaking is then nothing but decoherence [7,8]. Hence,
the concept of a leak allows us to generalise decoherence to arbitrary process theories. Besides classical
theory, any theory characterised by some finite-dimensional C*-algebra can be obtained in this manner
from quantum theory. In fact, as we show in a follow-up paper [9], only C*-algebras can be obtained
in this manner. Leaks therefore capture the operational content of finite-dimensional C*-algebras
on-the-nose, in a manner that does not involve any additive structure, nor a ∗-operation.
Finally, we observe that defining purity of processes in process theories with leaks is problematic;
in particular, this is the case for classical theory. Making explicit use of the concept of a leak, we
therefore propose a new definition that makes sense for arbitrary processes in arbitrary process theories.
Related Work
As explained in detail in the follow-up paper [9], the leak construction is related to the
“constructions of classical system types” in [10–12]. More specifically, in the case of quantum theory, we
exactly obtain the same result, but in a much simpler way, with much less use of structure and guided
by a clear operational meaning. The notion of a leak is closely related to the decomposability
of a state-space [13] in the generalised probabilistic theory framework as, at least under some
standard assumptions, such as the “no-restriction hypothesis”, each is equivalent to the existence of a
non-disturbing measurement as discussed in [14].
the resulting diagram should also be a process. To be mathematically more precise, the data that make
up a diagram are:
Hence, two diagrams are equal when these data match up.
By a circuit [3,6], we mean a diagram that can be constructed by means of the obvious operations
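of parallel composition ⊗ and sequential composition ◦ of boxes. For example, the following diagram
is a circuit:
[Diagram: boxes f and h side by side, with g applied afterwards to part of their outputs; this equals $(1 \otimes g) \circ (f \otimes h)$, where 1 denotes a plain wire.]
[Diagram: the composite system A ⊗ B is drawn as the pair of wires A and B.]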
Remark 1. A process theory with circuits as diagrams can also be defined as a strict symmetric monoidal
category. Strictness means that associativities and unit laws hold on-the-nose, unlike the symmetric monoidal
categories of concrete mathematical models where non-trivial associativity and unit natural isomorphisms are
required. Fortunately, by Mac Lane’s strictification theorem [15], every such category is categorically equivalent
(although not isomorphic) to a strict one, which means that for all practical purposes, it can be thought of as
a strict one.
A state is a process without inputs; an effect is a process without outputs; and a number is a
process with neither inputs nor outputs. One special number is the empty diagram:
Entropy 2017, 19, 174
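[Equation (1), diagram: the discarding effect on A ⊗ B is defined as the discarding effects on A and on B side by side.]
[Equation (2), diagram: a process f is causal if discarding all of its outputs equals discarding all of its inputs,]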
and a theory is causal if all of the processes of the theory are causal. Therefore, except for the fact
that it composes, discarding is not subject to any defining constraints. In a sense, its behaviour is
entirely implicit within its role within the defining equation of causality. In particular, by Equation (2)
where f is taken to be an effect, it immediately follows that the only effects in a causal theory are the
discarding effects. In this form, the axiom of causality traces back to [16]. When restricting to causal
processes, a process theory is non-signalling [17]; hence, the causality of a theory is vital to guarantee
compatibility with relativity.
Example 1 (Classical probability theory). When viewing probability theory as a process theory, systems are
n-state classical systems and boxes are n × m stochastic matrices, and so, in particular, states are probability
distributions. Discarding is given by marginalisation, and so, causality boils down to the fact that the entries of
a probability distribution add up to one and that the entries in each column of a stochastic matrix add up to one.
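Example 1 can be made concrete with a few lines of code. This is a minimal sketch, assuming column-stochastic matrices acting on column vectors, with discarding represented by a row of ones; the matrices `f`, `g` and the state `p` are invented for illustration.

```python
def matmul(a, b):
    # Sequential composition: first b, then a.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    # Parallel composition of two processes.
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def discard(n):
    # Marginalisation over an n-state system: a single row of ones.
    return [[1.0] * n]

def is_causal(m, tol=1e-12):
    # Causality: discarding the output equals discarding the input.
    return all(abs(x - 1.0) <= tol for x in matmul(discard(len(m)), m)[0])

f = [[0.5, 0.1],
     [0.3, 0.2],
     [0.2, 0.7]]            # process from a 2-state to a 3-state system
g = [[0.9, 0.4, 0.0],
     [0.1, 0.6, 1.0]]       # process from a 3-state to a 2-state system
p = [[0.25], [0.75]]        # state: a probability distribution as a column
event = [[0.5, 0.1],
         [0.3, 0.2]]        # sub-stochastic: a single outcome, not causal

print(is_causal(matmul(g, f)), is_causal(kron(f, g)), is_causal(event))
```

Causality is preserved by both kinds of composition, while the sub-stochastic `event` fails the test, anticipating the non-causal processes discussed further on.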
Example 2 (Quantum theory). Quantum theory as a process theory has finite dimensional Hilbert spaces H
as its systems and completely positive trace preserving (CPTP) maps:
ξ : B(H) → B(H′)
as its processes. Causality for density operators means having trace one and for completely positive maps means
being trace-preserving. One can also include classical data as additional systems, and then measurements and
controlled operations are also processes. If this is the case, we will often denote the classical systems as dotted
wires to distinguish them from quantum wires. Specifically, measurements are processes from quantum to
classical systems where the probabilities of obtaining the different outcomes are encoded in the classical system.
Causality then implies that, for projective measurements, the projectors form a resolution of the identity and,
for general measurements that the POVM elements sum to discarding. A full description and a pedagogical
introduction to this theory is in [3,6,18].
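The causality condition for quantum processes can be tested on a small example. The sketch below assumes a Kraus representation of the channel and uses amplitude damping with a hypothetical parameter `gamma`; neither is taken from the paper. It checks that the Kraus operators satisfy the completeness relation, so that the trace is preserved, and that a single Kraus branch on its own is not trace-preserving.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(m):
    # Conjugate transpose.
    return [[complex(m[j][i]).conjugate() for j in range(len(m))]
            for i in range(len(m[0]))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

gamma = 0.3  # hypothetical damping probability
K0 = [[1.0, 0.0], [0.0, math.sqrt(1 - gamma)]]
K1 = [[0.0, math.sqrt(gamma)], [0.0, 0.0]]

def apply_channel(rho):
    # rho -> sum_k K_k rho K_k^dagger
    out = [[0j, 0j], [0j, 0j]]
    for K in (K0, K1):
        out = add(out, matmul(matmul(K, rho), dagger(K)))
    return out

completeness = add(matmul(dagger(K0), K0), matmul(dagger(K1), K1))
rho = [[0.5, 0.5], [0.5, 0.5]]                           # a trace-one density matrix
rho_out = apply_channel(rho)                             # causal: trace stays one
single_outcome = matmul(matmul(K1, rho), dagger(K1))     # one branch: trace below one

print(abs(trace(rho_out)), abs(trace(single_outcome)))
```

The single branch is completely positive but not trace-preserving, which is the kind of process that a non-causal extension of the theory is meant to include.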
Typically, as will be the case in the examples below, we will want to describe both causal and
non-causal processes. We therefore will still, for each system, have a discarding map, which specifies
the causal processes, but there will also be other processes that will not satisfy Equation (2). There are
two main reasons for this. The first is to allow us to discuss events, i.e., processes that we cannot make
happen deterministically, but that can occur as a particular outcome in some experiment. This lets us
obtain the probability of a specific outcome and, via suitable renormalisation, describe
post-selection. The second reason is mathematical simplicity:
it is often much easier to define the process theory, or various structures within it, in the non-causal
setting and then to restrict to the causal sub-theory when necessary.
Example 3 (Non-causal extension of quantum theory). To describe non-causal processes in quantum theory,
rather than taking processes as completely positive trace-preserving maps, we instead just require that they are
completely positive. It is very standard within quantum theory to consider such processes, for example Dirac
bras are non-causal or, more generally, individual POVM elements are non-causal.
An important tool is the Choi–Jamiolkowski isomorphism between transformations and bipartite states.
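One direction of this isomorphism can be realised causally, using the Bell state, which we represent, up to
a normalisation factor, with a cup-shaped wire:
[Diagram: the cup-shaped wire scaled by 1/D, together with the diagrammatic definition of the number D.]
which allows us to “bend wires up”:
[Diagram: f ↦ (1/D) times f with its input wire bent up by the cup.]
This associates with each (causal) process a (causal) bipartite state. The other direction is however not realisable
causally, as it relies on the Bell effect, which we represent with a cap-shaped wire:
[Diagram: ρ ↦ D times ρ with one of its wires bent down by the cap.]
The fact that this is an isomorphism provides us with the following intuitive diagrammatic rule (justifying the
representation of these as a cup and cap):
[Equation (3), diagram: a cup and a cap joined along one wire equal a plain wire.]
It is then clear that the cap cannot be causal (even up to a rescaling) as, if it were, then the identity transformation
would be separable, i.e.:
[Diagram: a chain of three equalities rewriting the plain wire as a separable process involving a state ρ.]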
where in the second step we relied on the fact that by causality, all effects must be discarding, so in particular,
the cap, as well as Equation (1).
Example 4 (Non-causal extension of classical theory). We can similarly extend classical theory, taking
processes as n × m matrices with positive real elements as opposed to stochastic matrices. This again allows us
to discuss particular outcomes of measurements, which may not happen with certainty, and moreover, gives us
a classical equivalent of the Choi–Jamiolkowski isomorphism where rather than using the Bell state and effect,
we use the perfectly correlated state and effect, again denoted by a cup and a cap. These can be defined in terms of
the orthonormal basis states and effects as:
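$$\tfrac{1}{n}\,(i, j\,|\,\cup) = \frac{\delta_{ij}}{n} \quad\text{and}\quad (\cap\,|\,i, j) = \delta_{ij}$$
[Here ∪ and ∩ stand for the cup and cap diagrams, evaluated on the basis effects and basis states i, j.]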
respectively. It is simple to check that these also satisfy Equation (3) as we would expect from the choice of the
diagrammatic representation.
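The check mentioned above can be carried out explicitly. The sketch below assumes that the unnormalised cup has components δ_ij as a column vector on two n-state systems and that the cap has components δ_ij as a row vector; the helper names are illustrative. It composes a cup on the last two of three wires with a cap on the first two and compares the result with the identity.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

n = 3
# Cup: bipartite state with components delta_ij (column vector of length n^2).
cup = [[1.0 if i == j else 0.0] for i in range(n) for j in range(n)]
# Cap: bipartite effect with components delta_ij (row vector of length n^2).
cap = [[1.0 if i == j else 0.0 for i in range(n) for j in range(n)]]

# Snake: (cap on wires 1,2, identity on wire 3) after (identity on wire 1, cup on wires 2,3).
snake = matmul(kron(cap, identity(n)), kron(identity(n), cup))

# The perfectly correlated state is the cup divided by n, so that it is normalised.
correlated_state = [[x[0] / n] for x in cup]

print(snake == identity(n), sum(x[0] for x in correlated_state))
```

Only the rescaled cup is a normalised state; the cap with components δ_ij is not causal, in line with the discussion of Example 3.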
This forms the basic structures needed to describe the physical content of process theories;
however, we will need some further tools for the proofs. These are all defined in the standard way for
categorical quantum mechanics and surveyed in Appendix A for those unfamiliar with the field.
[Equation (5), diagram: a leak followed by discarding of its leaked output equals the plain wire.]
When we have multiple leaks around, we may often represent them with different colours to
distinguish them.
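[Diagram: a leak with leaked system L1 ⊗ L2, defined from leaks with leaked systems L1 and L2.]
since we have:
[Diagram: a chain of equalities verifying the leak condition for the leak with leaked system L1 ⊗ L2.]
[Diagram: a leak on A ⊗ B with leaked system L1 ⊗ L2, defined from a leak on A and a leak on B.]
since we have:
[Diagram: a chain of equalities verifying the leak condition for the leak on A ⊗ B.]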
[copy] : X → X × X :: x ↦ (x, x)
since if we discard a copy, we are back with what we started off with. In fact, strictly speaking, what
we are dealing with here is not a copying operation: while it copies pure classical states, it does
not do so for impure ones. It is instead broadcasting; that is, besides Equation (5), discarding
is also a left counit for the leaking process:
[Equation (6), diagram: the leak followed by discarding of its other, non-leaked output also equals the plain wire.]
Note that this requires L := A in Equation (4). This is the maximal possible leak for any system, as all
of the information about the ingoing state is leaked out.
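The difference between copying and broadcasting can be seen on a two-state system. This is a sketch under the assumption that the copying map is represented by the matrix with entries δ_ik δ_jk acting on probability vectors; the helper names are illustrative.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

n = 2
# Copy map x -> (x, x): rows are indexed by output pairs (i, j), columns by the input k.
copy = [[1.0 if (i == k and j == k) else 0.0 for k in range(n)]
        for i in range(n) for j in range(n)]
discard = [[1.0] * n]

keep_first = matmul(kron(identity(n), discard), copy)    # discard the second output
keep_second = matmul(kron(discard, identity(n)), copy)   # discard the first output

p = [[0.5], [0.5]]          # an impure state
joint = matmul(copy, p)     # perfectly correlated pair
product = kron(p, p)        # what genuine copying of p would give

print(keep_first == identity(n), keep_second == identity(n), joint == product)
```

Both marginals reproduce the input, so the map broadcasts, but the joint output is the perfectly correlated state rather than two independent copies.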
On the other hand, quantum theory does not allow for broadcasting [1]. In fact, the only kind
of leak quantum theory admits is constant leaking. This immediately follows from the following
fact about quantum processes, which states that any dilation of a pure process, i.e., representation as
a process with an extra output that is discarded, must separate:
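[Equation (7), diagram: if a pure process f equals g with its extra output discarded, then g equals f side by side with a state ρ,]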
with ρ causal. That is, if a reduced process f is pure, then the process g we started from must separate.
Hence, since the identity is pure, by Proposition 3, it follows from the defining equation of a
leak (5) that any leak for quantum theory must be constant, that is of the form:
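[Equation (8), diagram: the plain wire side by side with a state ρ on the leaked system.]
[Equation (9): diagram involving the state ρ, not recoverable from the extraction.]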
Remark 2. In quantum theory, Proposition 3 can actually be taken as a definition of the purity of processes, that
is a quantum process f is pure if and only if all dilations of f separate. However, in theories with non-constant
leaks, this definition must be revised as we discuss in detail in Section 7.
Of course, (8) is also a leak for classical probability theory, and another example arises by
combining broadcasting and a constant:
[Equation (10), diagram: a leak built from broadcasting together with a constant state ρ.]
4. Quality of a Leak
For the sake of simplicity of the argument, we will restrict ourselves to a special kind of
process theories that admit the notion of a feedback wire. Explicitly spelling out the process-theoretic
characterisation of a feedback wire as in [19] goes beyond the scope of this paper. It suffices to know
that they exist in both quantum and classical theory, where they can be constructed in the obvious way
using the cups and caps of Examples 3 and 4. The behaviour of such a feedback wire is that of a wire
of the shape:
In particular, by means of such a wire, we can feed an output of a process back into it as an input:
[Diagram: a process f with input A whose output B is fed back into it as an input, leaving the output C.]
some other system type. We therefore want to consider maximising over potential restoration maps
r : L → A, where r is taken to be causal. We call this notion the quality of a leak:
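[Diagram: the quality Q of a leak, defined as the maximum over r of a number built from the leak, the map r and a feedback wire.]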
If the structure of the numbers in a process theory is sufficiently rich, e.g., they are the real
numbers or probabilities, one can moreover renormalise this quantity as follows:
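[Equation (11), diagram: the renormalised quality, a ratio of two differences with Q in the numerator and the circle in the denominator; the subtracted terms are not recoverable from the extraction.]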
where the circle indicates the feedback-loop applied to the identity. As a leak, the quality of
broadcasting is one, since we have:
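[Diagram: evaluation of the quality of broadcasting using Equation (6).]
[Diagram: a chain of equalities for a leak involving the state ρ, using Equation (9).]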
We therefore see that quantum theory is a minimally leaking theory as the renormalised quality
for any leak is zero, whilst classical theory is maximal as every system has a leak with renormalised
quality of one. In the next section, we consider how to increase the amount of leaking for a theory,
providing a process-theoretic perspective on the quantum to classical transition.
Example 5. If a process theory admits sums (cf. [6] or Appendix A), then set:
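[Diagram: a leak defined as a sum of two terms, one of which involves a weight q and a state ρ; the remaining details are not recoverable from the extraction.]
[Equation (12), diagram: a leak on A with leaked system L equals the copying leak on A followed by a process l from A to L on the leaked output.]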
[Diagram: definition of the process l.]
Despite the fact that this is defined using a non-causal process, the composite process l is actually causal:
[Diagram: a chain of three equalities showing that l followed by discarding equals discarding.]
We can then use the matrix representation of the leak (see Appendix A):
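$$\text{leak} := \sum_{ijk} \Delta^{ij}_k\; |i)\,|j)\,(k|$$
where $\Delta^{ij}_k \in \mathbb{R}^+$, $|i)$ and $|j)$ are basis states of the two outputs and $(k|$ is a basis effect on the input. The leak condition then implies that:
$$\sum_j \Delta^{ij}_k = \delta_{ki}$$
and so:
$$\Delta^{ij}_k = \Delta^{ij}_k\,\delta_{ki}$$
[Diagram: the copying leak followed by l] $= \sum_{ijk} \Delta^{ij}_k\,\delta_{ki}\; |i)\,|j)\,(k| = \sum_{ijk} \Delta^{ij}_k\; |i)\,|j)\,(k| =$ [Diagram: the leak].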
Proposition 5. Denoting the classical system by a dotted line and the quantum system by a solid line;
all composite classical quantum systems have leaks of the form:
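[Equation (13), diagram: the general form of a leak, with leaked system L, on a composite classical-quantum system.]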
Proof. Note that any composite leak defines a quantum leak as:
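[Diagram: a quantum leak defined from the composite leak, involving the factor 1/D and a state ρ.]
[Diagram: a definition followed by a chain of equalities involving the leaked system L and the state ρ; not recoverable in detail from the extraction.]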
The bottom line is that all of these leaks involve the copying leak as the fundamental ingredient.
This is not all too surprising, since, as we showed in the previous section, it stands for maximal leakage.
The processes l and L then play the role of reducing the leakage, the extremal case being that in which
l and L are constant, producing a constant leak.
6. The Leak-Construction
We now show how one can construct new process theories from old ones by introducing leaks.
This is done by inserting particular processes of the old theory of the form (15) on all of the wires.
The processes (14), to which we refer in the old theory as pre-leaks, then become leaks in the new
theory. Hence, the leak construction turns pre-leaks into leaks.
Theorem 1. Given any process theory and for each system a causal process:
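[Diagram (14): a process from A to A ⊗ L_A.]
[Diagram (15): the associated process from A to A, whose idempotence is used in the proof below.]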
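[Equation (16), diagram: the process (14) for A ⊗ B, with leaked system L_{A⊗B}, is defined as the processes (14) for A and for B side by side, with leaked systems L_A and L_B.]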
we can construct a new process theory in which each process (14) is a leak for the system A. This construction
goes as follows:
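[Diagram (17): the form taken by a process f in the new theory.]
[Equation (18): diagram not recoverable from the extraction.]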
discarding is preserved by the leak-construction. Given the form Equation (17) of the processes in the
theory and due to the idempotence of Equation (15), plain wires have taken the form Equation (15),
so the defining equation of a leak Equation (5) is satisfied. To consider the pre-leak in the new theory, we
must apply the leak construction Equation (17), and using the condition for composites Equation (16),
we get the following process in the new theory:
[Diagram: the pre-leak (14) with the construction (17) applied, simplified in two steps using Equations (18) and (15).]
which is the form of a plain wire in the new theory, and so, this construction does turn pre-leaks into
leaks. It is moreover straightforward to see that we again obtain a process theory.
Sometimes the leak-construction does nothing, in particular, when the pre-leaks are already leaks:
Example 6 (Trivial). A simple example of the leak construction is the one where the pre-leaks are taken to
already be leaks, since then (17) will reduce to the processes f themselves.
The main motivating example for this construction is of course the following:
: B(H) → B(H ⊗ H) :: |i⟩⟨i| ↦ |i⟩⟨i| ⊗ |i⟩⟨i|
applied to the process theory of quantum processes (i.e., Example 2), we obtain classical probability theory
(i.e., Example 1).
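As a rough numerical illustration of this example, the sketch below applies a copying pre-leak to a density matrix and then discards the leaked system. The function names are hypothetical, and the treatment of off-diagonal entries (they are sent to zero, i.e., the pre-leak is modelled as "measure in the basis and copy") is our assumption, since the text only specifies the action on the basis projectors. Under that assumption, the resulting map keeps the diagonal of ρ and is idempotent, which is the property the leak construction relies on.

```python
def copy_pre_leak(rho):
    """Map an n x n matrix to an n^2 x n^2 matrix on H (x) H.

    Assumption: |i><i| -> |i><i| (x) |i><i| and off-diagonal entries -> 0.
    """
    n = len(rho)
    out = [[0j] * (n * n) for _ in range(n * n)]
    for i in range(n):
        out[i * n + i][i * n + i] = rho[i][i]
    return out


def discard_second(big, n):
    """Partial trace over the second tensor factor (the leaked system)."""
    return [
        [sum(big[i * n + k][j * n + k] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def leak_idempotent(rho):
    """Pre-leak followed by discarding the leaked output."""
    return discard_second(copy_pre_leak(rho), len(rho))


# The |+><+| state loses its coherences and becomes a uniform distribution.
plus = [[0.5, 0.5], [0.5, 0.5]]
once = leak_idempotent(plus)
twice = leak_idempotent(once)
print(once)
print(once == twice)
```

Only the diagonal survives, so the states that are left unchanged by this map can be identified with probability distributions over the chosen basis.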
In the above construction, it is really the idempotents rather than the specific pre-leaks that
determine the theory that is obtained. We can therefore have several different perspectives on the
“cause” of this idempotent, by considering different pre-leaks from which it could be obtained. Firstly,
we can always take the trivial case, where the pre-leak is just the idempotent itself, i.e., taking the
leaked system as the empty system. There are however three alternate forms that always exist in
quantum theory and that are more insightful.
Example 8. Firstly we can consider the purification f of the idempotent, in the sense of [16]:
= f
This corresponds to the idea that information can never be fundamentally destroyed, only discarded, and so,
we can see this leaking of information into some causally-separated system leading to decoherence. Another
standard way to represent a general process is, via Stinespring dilation [20], as a reversible interaction with
an environment:
U
=
s
and so, we can equivalently view decoherence as arising due to a reversible interaction with some uncontrolled
environment [8]. A final example, suggested to us by Rob Spekkens, is that the idempotent can be viewed as
describing a system that lacks a reference frame [21]; the leaked system would then correspond to the reference
system itself. This is the subject of ongoing work and is discussed in the Conclusion.
Example 7 leaves open the question of whether any theories lying between classical and quantum theory
can be obtained from this leak construction. This question is answered in a forthcoming
paper, where the key result is the following theorem:
Theorem 2. The leak construction applied to quantum processes (i.e., Example 2) gives all C*-algebras and
C*-algebras only.
Therefore, despite the weak structure of a leak, for the specific case of quantum theory, we obtain
precisely the C*-algebras via the leak construction. This leads one to contemplate the view that the
operational essence of (finite dimensional) C*-algebras is entirely captured by leaks and that the
additional structure of C*-algebras is merely an artefact of the Hilbert space representation.
Remark 3. The leak-construction does not apply to Example 5, since only for c = 0, 1, we have idempotence of (15).
Remark 4. For a process theory in which all systems are compositions A⊗n of one atomic system A, it suffices
to pick a single process (14) for the system A (where L A will be of the form A⊗n , since all other such processes
arise then by coherence (16)).
Remark 6. The construction in Theorem 1, when modified by not fixing a pre-leak for each type, but rather
considering all pairs of a system and a corresponding pre-leak, is known as the Karoubi envelope, or Cauchy
completion, or splitting of idempotents. More details on this can be found in [9].
Side information
about process f
f = g
Lack of side-information for a process would imply that g must separate such that the
side-information is independent of the process f . Indeed, this must be the case for any such g, i.e.:
ρ
f = g =⇒ g = f (19)
or in other words, all dilations of f must separate. As mentioned in Remark 2, the separability of
dilations (cf. Proposition 3) has been proposed as a definition of process-purity. Indeed for the case of
quantum theory, this corresponds to the expected notion of purity, namely that the CPTP map must
have Kraus rank 1. Remarkably, however, in the form of (19), this definition does not extend to general
processes of classical probability theory. Nor, in fact, does it extend to any theory that has broadcasting:
Proposition 6. If a non-trivial theory has broadcasting and one defines purity by means of (19), then plain
wires (i.e., identity processes) are not pure.
Proof. Assuming identities are pure and applying (19) to the defining equation of a leak (5), we obtain:
= ρ (20)
that is, it is a constant leak. However, then, from the second defining equation of broadcasting,
we obtain:
ρ
(6) (20)
= =
that is, each plain wire is a constant process. Hence, the theory is trivial: as a consequence,
all processes must then be constant, since for (causal) processes, we have:
ρ
ρ
f = = =
f f
Hence, in a non-trivial theory with broadcasting, identities cannot be pure in the sense of (19).
From the first part of this proof, namely that this definition of purity implies that leaks must be
constant, it follows that this issue arises in any theory with non-constant leaks. We can think of this
as the fact that, if a system has a leak, then there is irreducible side-information contained within the
system itself:
Side information
about system A
=
A A
Fortunately, leaks also allow us to fix this problem. Firstly, let us suppose that a theory has leaks
and also has a pure process f . Then, clearly, the following is a dilation of f :
where l is causal. One may therefore consider explicitly bringing leaks into play in the definition of
purity. A first step in this direction is to weaken (19) as follows:
f = g =⇒ ∃ , & l : g = f (21)
However, now, we have the opposite problem: all classical processes, including all states, are pure!
(See Appendix B). It is clear that we are missing a constraint. The original idea was that for a process to
be pure, it should have no side-information that some eavesdropper could take advantage of. However,
we have shown that for some systems, there is irreducible side-information represented by leakage.
Therefore, to ensure that the eavesdropper cannot gain information or influence the process, we must
demand that the process does not interact with this irreducible side-information, such that leaking
before or after is equivalent:
f
∀ ∃ and ∀ ∃ such that = (22)
f
Hence, we propose the following definition of process-purity, which packages these two conditions,
(21) and (22), into a neat form:
f
f = g =⇒ ∃ & : g = = (23)
f
This ensures that the only side-information is this irreducible kind, i.e., system leakage,
and moreover, that pure processes do not interact with this irreducible side information. To further
motivate this definition, we will show that it provides a sensible definition for quantum, classical and
composite systems. However, first, note that for states, this definition reduces to:
ψ = =⇒ σ = ψ ρ
σ
This is the same as the original definition, and so, we see that it is only for general processes that
this new definition is necessary. Similarly, in the case of quantum theory, it is only the first condition
that provides a non-trivial constraint:
Example 10 (quantum purity). As for quantum theory, the only leaks are constant leaks, Condition (21) in
Definition 2 reduces to (19), while Condition (22) becomes trivial.
Whilst, in the classical case, as we have mentioned above, (21) is satisfied by all classical processes,
and so, it is only (22) that needs to be considered:
Example 11 (classical purity). All pure classical processes, between an n and m state system, are of the form:
n (24)
n
r
n
= and =
Proof. We prove here that pure classical processes must be of this form and leave the proof that any
process of this form is pure to Appendix C.
First consider the condition:
f
∀ ∃ such that =
f
l
:=
l
f
f = f = =
f
$$f_i^j = f_i^j \, l_i^j$$
Causality of l then implies that, for each j, there can only be a single i where $l_i^j = 1$, and so, for all other
i, we must have $f_i^j = 0$. This means that in each row of $f_i^j$ there is at most a single non-zero element.
We can run through this argument in the opposite direction using the condition:
f
∀ ∃ such that =
f
which shows that $f_i^j$ can have at most a single non-zero element in each column. This is precisely
what is enforced by the black/white dot in the above form; the value of the non-zero elements is then
determined by the state r. Hence, we can write f in the desired form.
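The matrix condition derived in this proof is easy to test mechanically. The following is a minimal sketch with hypothetical helper names; it checks only the necessary condition obtained above (at most one non-zero entry in each row and each column) and, separately, causality, under our assumption that columns index inputs so that a causal process has columns summing to one.

```python
def nonzeros_per_line(matrix):
    """Count the non-zero entries in every row and every column."""
    rows = [sum(1 for x in row if x != 0) for row in matrix]
    cols = [sum(1 for x in col if x != 0) for col in zip(*matrix)]
    return rows, cols


def has_pure_classical_shape(matrix):
    """Necessary condition from the proof: <= 1 non-zero per row and column."""
    rows, cols = nonzeros_per_line(matrix)
    return max(rows, default=0) <= 1 and max(cols, default=0) <= 1


def is_causal(matrix, tol=1e-12):
    """Assumption: columns index inputs, so causal means column sums equal 1."""
    return all(abs(sum(col) - 1.0) <= tol for col in zip(*matrix))


weighted = [[0.3, 0, 0], [0, 0, 0.7]]    # passes the shape test, not causal
discard = [[1, 1]]                       # two non-zeros in one row: fails
embedding = [[1, 0], [0, 1], [0, 0]]     # causal and of the pure shape

print(has_pure_classical_shape(weighted), is_causal(weighted))
print(has_pure_classical_shape(discard))
print(has_pure_classical_shape(embedding), is_causal(embedding))
```

When both tests pass, every column contains exactly one entry equal to 1 and no row is hit twice, so the matrix represents an injective function on basis elements.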
Example 12. If we consider purity for causal classical processes, then we find that the pure processes are those
that are reversible (i.e., are isometries).
Proof. The definition of purity, and the standard form for classical leaks, requires that:
l
f =
f
l
l
= = f = =
f f
Finally, we consider the composite case, where the conjunction of (21) and (22) is necessary:
f
(25)
f (26)
i
Proof. Again, we prove the interesting direction here that pure processes on composite systems must
be of this form and, again, leave the other direction to Appendix C.
Note that a generic process can be written as:
An (almost) identical argument to the classical case shows that if this is pure, it can be written as:
We therefore move on to considering the other part of the definition of purity, namely that any dilation
can be written as a leak. This means that any dilation of this process can be written as:
f l f l
= ∑ i
i i i
r
r
f := fi = gi
i
defines a dilation of the whole process, which must be able to be written as a leak:
f l
∑ gi
i
= ∑ i
i i i i
r r
Therefore, each gi must separate, and hence, the f i are each pure quantum processes.
Proposition 7. The pure quantum to classical or classical to quantum maps are separable.
Then, using the above result regarding pure maps for composite quantum classical systems, we have,
f f f
ĩ
= = ĩ ĩ
r r r
ĩ
This means that there is no pure way to transform between classical and quantum information.
8. Conclusions
In this paper, we introduce the concept of leaks to generalised process theories, the definition
of which can be thought of as a “one-way” broadcasting map. Leaks prove to be very useful for
understanding various aspects of quantum theory from a physically well-motivated perspective.
In particular, we show:
• that quantum theory is a leak-free theory, whilst classical theory is maximally leaking, giving
a clear separation between the theories for which quantum theory is optimal.
• how to construct sub-theories via a “leak construction”, which can be thought of as the sub-theories
that can be obtained from a dynamical decoherence mechanism. For quantum theory, we can
obtain classical theory, composite quantum classical theory and, generally, finite dimensional
C*-algebras from this construction [9].
• a characterisation of the leaks and pure processes for quantum, classical and composite systems; in
particular, we demonstrate that there is no pure way to transform quantum systems into classical
systems or vice versa.
• that leaks are essential to defining purity of processes; we therefore introduce a novel definition
of purity of processes, which makes sense both for quantum theory and for classical theory.
Future Work
In this paper, we have shown how classical theory emerges from quantum theory due to the leak
construction, providing a process-theoretic perspective on why the world on a large scale appears to
us to be classical. It is natural to ask: Is there some deeper theory of nature than quantum theory from
which quantum theory emerges in an analogous way? This is the subject of a forthcoming paper [23].
A second, related question would be to ask: What does it imply about a theory if classical
theory can be obtained from it via a leak construction? Is the ability for this to happen in quantum theory special, or is this a
generic feature of general theories?
We have also shown that quantum theory is minimally leaking and classical theory maximally;
moreover, if we start from a process theory describing finite dimensional C*-algebras, then quantum
theory is singled out as the unique minimally-leaking theory. Can this idea lead to a complete
reconstruction of quantum theory [24]?
where G is a group associated with a reference frame for a particular degree of freedom, Ug is the
representation of G on the system of interest and g the state of the reference system. Note, however,
that making sense of this integral for general symmetry groups requires the reference be an infinite
dimensional quantum system and so is beyond the scope of this paper. One could replace, at least
for compact groups, the integral by a finite convex mixture (using the results of [16], Corollary 33
from Caratheodory’s theorem), for which the resulting idempotent would be the same. This can be
thought of as there only being a finite set of possible orientations for the reference frame. However,
a comprehensive understanding of the connections here would require consideration of the infinite
dimensional case. Moreover, we know that in the finite dimensional case, the leak construction leads
to C*-algebraic systems only; however, it remains an interesting open question as to what the leak
construction leads to for infinite dimensional systems.
Acknowledgments: We thank Aleks Kissinger, Dan Marsden, Rob Spekkens and Sean Tull for useful feedback.
John Selby was supported by the EPSRC (Engineering and Physical Sciences Research Council) through the
Controlled Quantum Dynamics Centre for Doctoral Training, and Bob Coecke is supported by the U.S. Air Force
Office of Scientific Research.
Author Contributions: Both authors contributed equally to all aspects of this work. Both authors have read and
approved the final manuscript.
Conflicts of Interest: The authors declare no conflict of interest.
c
c
∑ bi = ∑ bi
i i
a a
In particular, in classical probability theory, we can take sums of diagrams where the sum is
the standard sum of matrices. In fact, this provides us with a matrix calculus for our diagrams.
In particular, we have a basis and co-basis for each system, denoted:
# $n # $n
i and j
i =1 j =1
j
= δij
i
i
= ∑
i i
i
∑
i i i
f = f := ∑ f ji
ij j
j
∑
j j
where it is simple to check that sequential composition then coincides with a matrix multiplication,
parallel composition with the matrix tensor product and diagrammatic sum with the sum of matrices.
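The correspondence stated here can be checked on small examples. The sketch below uses exact fractions and hypothetical helper names (`compose`, `tensor`, `add`, `is_causal`); it assumes the convention that columns index inputs, so that causal classical processes are column-stochastic matrices. It verifies that sequential and parallel composition interact as expected (the interchange law) and that composition distributes over sums.

```python
from fractions import Fraction as F


def compose(g, f):
    """Sequential composition 'g after f' as matrix multiplication."""
    return [
        [sum(g[i][k] * f[k][j] for k in range(len(f))) for j in range(len(f[0]))]
        for i in range(len(g))
    ]


def tensor(f, g):
    """Parallel composition as the Kronecker (matrix tensor) product."""
    return [[a * b for a in row_f for b in row_g] for row_f in f for row_g in g]


def add(f, g):
    """Diagrammatic sum as the entrywise sum of matrices."""
    return [[a + b for a, b in zip(rf, rg)] for rf, rg in zip(f, g)]


def is_causal(m):
    """Assumption: columns index inputs, so each column must sum to 1."""
    return all(sum(col) == 1 for col in zip(*m))


f = [[F(1, 2), F(1, 3)], [F(1, 2), F(2, 3)]]
g = [[F(1, 4), F(1)], [F(3, 4), F(0)]]
h = [[F(1), F(1, 5)], [F(0), F(4, 5)]]

# Interchange law: (g (x) h) after (f (x) f) equals (g after f) (x) (h after f).
print(compose(tensor(g, h), tensor(f, f)) == tensor(compose(g, f), compose(h, f)))
# Composition distributes over sums.
print(compose(h, add(f, g)) == add(compose(h, f), compose(h, g)))
# Causality is preserved by both kinds of composition.
print(is_causal(compose(g, f)), is_causal(tensor(f, h)))
```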
Definition A2. For each classical system type, we have a family of spiders diagrammatically defined by, firstly:
···
···
··· ···
··· ··· =
··· ···
···
and secondly, that the symmetries of the representation as spiders are respected. Alternately, spiders can be
defined via the matrix representation as:
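$$\text{spider} \;:=\; \sum_i \; \underbrace{|i\rangle \otimes \cdots \otimes |i\rangle}_{\text{outputs}} \; \underbrace{\langle i| \otimes \cdots \otimes \langle i|}_{\text{inputs}}$$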
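A small sketch of the matrix form of spiders may help fix ideas. The function `spider(n, num_in, num_out)` below is a hypothetical helper that builds, for an n-state classical system, the 0/1 matrix with a single 1 connecting the basis element (i, ..., i) on the inputs to (i, ..., i) on the outputs; the ordering of tensor factors is our convention. The example checks that two spiders connected by at least one wire fuse into a single spider.

```python
def spider(n, num_in, num_out):
    """Matrix of a spider on an n-state system (rows: outputs, columns: inputs)."""
    rows, cols = n ** num_out, n ** num_in
    m = [[0] * cols for _ in range(rows)]
    for i in range(n):
        # Index of the basis element (i, i, ..., i) written in base n.
        r = sum(i * n ** p for p in range(num_out))
        c = sum(i * n ** p for p in range(num_in))
        m[r][c] = 1
    return m


def compose(g, f):
    """Sequential composition 'g after f' as matrix multiplication."""
    return [
        [sum(g[i][k] * f[k][j] for k in range(len(f))) for j in range(len(f[0]))]
        for i in range(len(g))
    ]


copy = spider(2, 1, 2)       # one input, two outputs
merge = spider(2, 2, 1)      # two inputs, one output
identity = spider(2, 1, 1)   # a plain wire

# Copying and then merging fuses into a plain wire.
print(compose(merge, copy) == identity)
# The spider with one input and no outputs is the row of ones (discarding).
print(spider(2, 1, 0))
```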
This family of maps is particularly important as, for classical theory at least, they allow us to define
various concepts that we have used throughout the paper in a unified way. Firstly, the broadcasting
map can now be seen as just an example spider with one input and two outputs, but moreover,
we have:
= = =
The feedback-loop we introduced can also now be interpreted as the composite of two spiders:
We moreover want to consider a way to join spiders of different dimensionality (denoted by using
a different colour), which is exactly what the black/white dots achieve.
Definition A3. Diagrammatically, the black/white dots are any process satisfying:
··· ···
···
··· = = ···
··· ···
···
···
···
···
···
which is equivalent to how they were introduced in Example 11. Alternatively, their matrix representation is:
π1
m i
l
:= ∑ (A1)
i =1 i
n
π2
requiring that l ≤ Min(n, m) and πi are arbitrary permutations of the basis elements. These are then just
matrices with elements {0, 1} with at most a single one in each row and column.
f = F =⇒ F = f
and then, it is simple to check this satisfies the above equation and, moreover, is causal.
where:
fi := f
i
∀ l ∃ l˜
l l˜
=
s
l
l˜ := + ∑
j∈ J j
where J = Ker .
That (21) is satisfied is also simple if, using the purity of the f i , we can write the dilation as:
fi si
∑ i
i
l si
:= where l := ∑
i i
References
1. Barnum, H.; Caves, C.M.; Fuchs, C.A.; Jozsa, R.; Schumacher, B. Noncommuting mixed states cannot be
broadcast. Phys. Rev. Lett. 1996, 76, 2818.
2. Barnum, H.; Barrett, J.; Leifer, M.; Wilce, A. A generalized no-broadcasting theorem. Phys. Rev. Lett. 2007,
99, 240501.
3. Coecke, B.; Kissinger, A. Categorical quantum mechanics I: Causal quantum processes. In Categories for the
Working Philosopher; Landry, E., Ed.; Oxford University Press: Oxford, UK, 2016.
4. Popescu, S.; Rohrlich, D. Quantum nonlocality as an axiom. Found. Phys. 1994, 24, 379–385.
5. Abramsky, S.; Coecke, B. A categorical semantics of quantum protocols. In Proceedings of the 19th Annual
IEEE Symposium on Logic in Computer Science (LICS), Washington, DC, USA, 13–17 July 2004; pp. 415–425.
6. Coecke, B.; Kissinger, A. Picturing Quantum Processes. A First Course in Quantum Theory and Diagrammatic
Reasoning; Cambridge University Press: Cambridge, UK, 2016.
7. Kuperberg, G. The capacity of hybrid quantum memory. IEEE Trans. Inf. Theory 2003, 49, 1465–1473.
8. Zurek, W.H. Quantum darwinism. Nat. Phys. 2009, 5, 181–188.
9. Coecke, B.; Selby, J.; Tull, S. Two roads to classicality. arXiv 2017, arXiv:1701.07400.
10. Selinger, P. Idempotents in Dagger Categories (Extended Abstract). Electron. Notes Theor. Comput. Sci. 2008,
210, 107–122.
11. Heunen, C.; Kissinger, A.; Selinger, P. Completely positive projections and biproducts. In Proceedings of the
10th International Workshop on Quantum Physics and Logic, Barcelona, Spain, 17–19 July 2013.
12. Cunningham, O.; Heunen, C. Axiomatizing complete positivity. arXiv 2015, arXiv:1506.02931.
13. Barrett, J. Information processing in generalized probabilistic theories. Phys. Rev. A 2007, 75, 032304.
14. Richens, J.; Selby, J.; Al-Safi, S. Entanglement is an inevitable feature of any non-classical theory. arXiv 2016,
arXiv:1610.00682.
15. Mac Lane, S. Categories for the Working Mathematician; Springer: Berlin/Heidelberg, Germany, 1998.
16. Chiribella, G.; D’Ariano, G.M.; Perinotti, P. Probabilistic theories with purification. Phys. Rev. A 2010,
81, 062348.
17. Coecke, B. Terminality implies non-signalling. arXiv 2014, arXiv:1405.3681.
18. Coecke, B.; Kissinger, A. Categorical quantum mechanics II: Classical-quantum interaction. arXiv 2016,
arXiv:1605.08617.
19. Joyal, A.; Street, R.; Verity, D. Traced monoidal categories. In Mathematical Proceedings of the Cambridge
Philosophical Society; Cambridge University Press: Cambridge, UK, 1996; Volume 119, pp. 447–468.
20. Stinespring, W.F. Positive functions on C*-algebras. Proc. Am. Math. Soc. 1955, 6, 211–216.
21. Bartlett, S.D.; Rudolph, T.; Spekkens, R.W. Reference frames, superselection rules, and quantum information.
Rev. Mod. Phys. 2007, 79, 555.
22. Chiribella, G.; D’Ariano, G.M.; Perinotti, P. Quantum from principles. In Quantum Theory: Informational
Foundations and Foils; Springer: Berlin/Heidelberg, Germany, 2016; pp. 171–221.
23. Lee, C.M.; Selby, J.H. A no-go theorem for post-quantum theories that decohere to quantum theory. arXiv
2017, arXiv:1701.07449.
24. Selby, J.; Scandolo, C.M.; Coecke, B. Quantum theory from diagrammatic postulates. Forthcoming submitted.
© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
entropy
Article
Measurement Uncertainty Relations for Position and
Momentum: Relative Entropy Formulation
Alberto Barchielli 1,2,3 , Matteo Gregoratti 1,2, * and Alessandro Toigo 1,3
1 Dipartimento di Matematica, Politecnico di Milano, Piazza Leonardo da Vinci 32, I-20133 Milano, Italy;
[email protected] (A.B.); [email protected] (A.T.)
2 Istituto Nazionale di Alta Matematica (INDAM-GNAMPA), 00185 Roma, Italy
3 Istituto Nazionale di Fisica Nucleare (INFN), Sezione di Milano, 20133 Milano, Italy
* Correspondence: [email protected]; Tel.: +39-0223994569
Abstract: Heisenberg’s uncertainty principle has recently led to general measurement uncertainty
relations for quantum systems: incompatible observables can be measured jointly or in sequence only
with some unavoidable approximation, which can be quantified in various ways. The relative entropy
is the natural theoretical quantifier of the information loss when a ‘true’ probability distribution
is replaced by an approximating one. In this paper, we provide a lower bound for the amount
of information that is lost by replacing the distributions of the sharp position and momentum
observables, as they could be obtained with two separate experiments, by the marginals of any
smeared joint measurement. The bound is obtained by introducing an entropic error function,
and optimizing it over a suitable class of covariant approximate joint measurements. We fully exploit two
cases of target observables: (1) n-dimensional position and momentum vectors; (2) two components
of position and momentum along different directions. In (1), we connect the quantum bound to the
dimension n; in (2), going from parallel to orthogonal directions, we show the transition from highly
incompatible observables to compatible ones. For simplicity, we develop the theory only for Gaussian
states and measurements.
1. Introduction
Uncertainty relations for position and momentum [1] have always been deeply related to the
foundations of Quantum Mechanics. For several decades, their axiomatization has been of ‘preparation’
type: an inviolable lower bound for the widths of the position and momentum distributions, holding in
any quantum state. Uncertainty relations of this kind, now known as preparation uncertainty
relations (PURs), were later extended to arbitrary sets of n ≥ 2 observables [2–5]. All PURs
trace back to Robertson’s celebrated formulation [6] of Heisenberg’s uncertainty principle:
for any two observables, represented by self-adjoint operators A and B, the product of the variances
of A and B is bounded from below by the expectation value of their commutator; in formulae,
$\mathrm{Var}_\rho(A)\,\mathrm{Var}_\rho(B) \ge \frac{1}{4}\,|\mathrm{Tr}\{\rho[A,B]\}|^2$, where $\mathrm{Var}_\rho$ is the variance of an observable measured in any
system state ρ. In the case of position Q and momentum P, this inequality gives Heisenberg’s relation
$\mathrm{Var}_\rho(Q)\,\mathrm{Var}_\rho(P) \ge \frac{\hbar^2}{4}$. About 30 years after Heisenberg and Robertson’s formulation, Hirschman
attempted a first statement of position and momentum uncertainties in terms of informational
quantities. This led him to a formulation of PURs based on Shannon entropy [7]; his bound was
later refined [8,9], and extended to discrete observables [10]. Other entropic quantities have also been
used [11]. We refer to [12,13] for an extensive review on entropic PURs.
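The step from Robertson’s inequality to Heisenberg’s relation can be written out explicitly. The short derivation below assumes a single degree of freedom with the canonical commutation relation $[Q, P] = i\hbar$ and a normalized state, $\mathrm{Tr}\{\rho\} = 1$:

```latex
\begin{align*}
\mathrm{Var}_\rho(Q)\,\mathrm{Var}_\rho(P)
  &\ge \tfrac{1}{4}\,\bigl|\mathrm{Tr}\{\rho\,[Q,P]\}\bigr|^2
   = \tfrac{1}{4}\,\bigl|i\hbar\,\mathrm{Tr}\{\rho\}\bigr|^2
   = \frac{\hbar^2}{4}.
\end{align*}
```

The bound is state independent here only because the commutator is a multiple of the identity; for a general pair A, B the right-hand side depends on ρ.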
However, Heisenberg’s original intent [1] was more focused on the unavoidable disturbance that
a measurement of position produces on a subsequent measurement of momentum [14–21]. Trying to
give a better understanding of his idea, more recently new formulations were introduced, based
on a ‘measurement’ interpretation of uncertainty, rather than giving bounds on the probability
distributions of the target observables. Indeed, with the modern development of the quantum theory
of measurement and the introduction of positive operator valued measures and instruments [3,22–26],
it became possible to deal with approximate measurements of incompatible observables and to
formulate measurement uncertainty relations (MURs) for position and momentum, as well as for more
general observables. The MURs quantify the degree of approximation (or inaccuracy and disturbance)
made by replacing the original incompatible observables with a joint approximate measurement of them.
A very rich literature on this topic flourished in the last 20 years, and various kinds of MURs have been
proposed, based on distances between probability distributions, noise quantifications, conditional
entropy, etc. [12,14–21,27–32].
In this paper, we develop a new information-theoretical formulation of MURs for position and
momentum, using the notion of the relative entropy (or Kullback-Leibler divergence) of two probabilities.
The relative entropy $S(p\|q)$ is an informational quantity which is precisely tailored to quantify the
amount of information that is lost by using an approximating probability q in place of the target
one p. Although classical and quantum relative entropies have already been used in the evaluation of
the performances of quantum measurements [24,27,30,33–40], their first application to MURs is very
recent [41].
In [41], only MURs for discrete observables were considered. The present work is a first attempt
to extend that information-theoretical approach to the continuous setting. This extension is not trivial
and reveals peculiar problems, that are not present in the discrete case. However, the nice properties of
the relative entropy, such as its scale invariance, allow for a satisfactory formulation of the entropic
MURs also for position and momentum.
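The scale invariance mentioned here can be seen concretely for one-dimensional Gaussian densities, for which the relative entropy has a standard closed form. The sketch below is only an illustration: the function names are hypothetical, the result is expressed in nats, and the rescaling x ↦ c·x stands for a change of measurement units.

```python
import math


def gaussian_relative_entropy(mean_p, var_p, mean_q, var_q):
    """S(p||q) in nats for p = N(mean_p, var_p) and q = N(mean_q, var_q)."""
    ratio = var_p / var_q
    shift = (mean_p - mean_q) ** 2 / var_q
    return 0.5 * (ratio - 1.0 - math.log(ratio) + shift)


def rescale(mean, var, c):
    """Parameters of the density obtained from the change of variable x -> c*x."""
    return c * mean, c * c * var


p = (0.3, 2.0)
q = (-0.1, 3.5)
original = gaussian_relative_entropy(*p, *q)
rescaled = gaussian_relative_entropy(*rescale(*p, 7.0), *rescale(*q, 7.0))

print(original, rescaled)   # agree up to rounding: the units drop out
# The relative entropy grows without bound as the variance of p shrinks
# while q is held fixed.
print(gaussian_relative_entropy(0.0, 1e-2, 0.0, 1.0))
print(gaussian_relative_entropy(0.0, 1e-6, 0.0, 1.0))
```

Only the ratio of the variances and the mean shift measured in units of the standard deviation of q enter the formula, which is why a common rescaling of both distributions leaves the value unchanged.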
We deal with position and momentum in two possible scenarios. Firstly, we consider the case
of n-dimensional position and momentum, since it allows us to treat either scalar particles, or vector
ones, or even the case of multi-particle systems. This is the natural level of generality, and our
treatment extends without difficulty to it. Then, we consider a couple made up of one position and
one momentum component along two different directions of the n-space. In this case, we can see
how our theory behaves when one moves with continuity from a highly incompatible case (parallel
components) to a compatible case (orthogonal ones).
The continuous case needs much care when dealing with arbitrary quantum states and
approximating observables. Indeed, it is difficult to evaluate or even bound the relative entropy
if some assumption is not made on probability distributions. In order to overcome these technicalities
and focus on the quantum content of MURs, in this paper we consider only the case of Gaussian
preparation states and Gaussian measurement apparatuses [2,4,5,42–45]. Moreover, we identify the
class of the approximate joint measurements with the class of the joint POVMs satisfying the same
symmetry properties of their target position and momentum observables [3,23]. We are supported in
this assumption by the fact that, in the discrete case [41], symmetry covariant measurements turn out to
be the best approximations without any hypothesis (see also [17,19,20,29,32] for a similar appearance
of covariance within MURs for different uncertainty measures).
We now sketch the main results of the paper. In the vector case, we consider approximate joint
measurements M of the position Q ≡ ( Q1 , . . . , Qn ) and the momentum P ≡ ( P1 , . . . , Pn ). We find
the following entropic MUR (Theorem 5, Remark 14): for every choice of two positive thresholds
Entropy 2017, 19, 301
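$\epsilon_1, \epsilon_2$, with $\epsilon_1 \epsilon_2 \ge \hbar^2/4$, there exists a Gaussian state ρ with position variance matrix $A^\rho \ge \epsilon_1 \mathbb{1}$ and
momentum variance matrix $B^\rho \ge \epsilon_2 \mathbb{1}$ such that
$$S(Q_\rho \| M_{1,\rho}) + S(P_\rho \| M_{2,\rho}) \ge n\,(\log e)\left\{ \ln\left(1 + \frac{\hbar}{2\sqrt{\epsilon_1 \epsilon_2}}\right) - \frac{\hbar}{\hbar + 2\sqrt{\epsilon_1 \epsilon_2}} \right\} \qquad (1)$$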
for all Gaussian approximate joint measurements M of Q and P. Here Qρ and Pρ are the distributions
of position and momentum in the state ρ, and Mρ is the distribution of M in the state ρ, with marginals
M1,ρ and M2,ρ ; the two marginals turn out to be noisy versions of Qρ and Pρ . The lower bound is
strictly positive and it linearly increases with the dimension n. The thresholds $\epsilon_1$ and $\epsilon_2$ are peculiar to
the continuous case and they have a classical explanation: the relative entropy $S(p\|q) \to +\infty$ if the
variance of p vanishes faster than the variance of q, so that, given M, it is trivial to find a state ρ enjoying
(1) if arbitrarily small variances are allowed. What is relevant in our result is that the total loss of
information $S(Q_\rho\|M_{1,\rho}) + S(P_\rho\|M_{2,\rho})$ exceeds the lower bound even if we forbid target distributions
with small variances.
The MUR (1) shows that there is no Gaussian joint measurement which can approximate arbitrarily
well both Q and P. The lower bound (1) is a consequence of the incompatibility between Q and P and,
indeed, it vanishes in the classical limit h̄ → 0. Both the relative entropies and the lower bound in (1)
are scale invariant. Moreover, for fixed $\epsilon_1$ and $\epsilon_2$, we prove the existence and uniqueness of an optimal
approximate joint measurement, and we fully characterize it.
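To get a feeling for the size and behaviour of the lower bound in (1), the following sketch evaluates it numerically in nats (so the factor log e equals 1). It also compares it, for n = 1, with a deliberately simplified noise model that is our own assumption and not the general setting of the theorem: each marginal is the target Gaussian convolved with independent zero-mean Gaussian noise, the two noise variances have product ħ²/4, and the state has variances exactly equal to the thresholds. All function names are hypothetical.

```python
import math


def g(x):
    """The function ln(1 + x) - x / (1 + x), non-negative for x >= 0."""
    return math.log1p(x) - x / (1.0 + x)


def mur_lower_bound(n, eps1, eps2, hbar=1.0):
    """Right-hand side of (1) in nats, with x = hbar / (2 sqrt(eps1 eps2))."""
    x = hbar / (2.0 * math.sqrt(eps1 * eps2))
    return n * g(x)


def noisy_marginal_loss(var_target, var_noise):
    """S(N(0, a) || N(0, a + s)) in nats: information lost to added noise."""
    return 0.5 * g(var_noise / var_target)


def best_loss_in_toy_model(eps1, eps2, hbar=1.0):
    """Grid search over the position noise s_q, with s_p = hbar^2 / (4 s_q)."""
    best = math.inf
    for k in range(-3000, 3001):
        s_q = 10.0 ** (k / 1000.0)
        s_p = hbar ** 2 / (4.0 * s_q)
        total = noisy_marginal_loss(eps1, s_q) + noisy_marginal_loss(eps2, s_p)
        best = min(best, total)
    return best


print(mur_lower_bound(1, 0.8, 0.5))          # bound for n = 1
print(best_loss_in_toy_model(0.8, 0.5))      # numerically close to the bound
print(mur_lower_bound(3, 0.8, 0.5))          # three times the n = 1 value
print(mur_lower_bound(1, 0.8, 0.5, hbar=0))  # vanishes in the classical limit
```

In this toy model the minimum is reached when the two noise-to-signal ratios coincide, which reproduces the expression inside the braces of (1); the bound depends on the thresholds only through their product, consistent with scale invariance.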
In the scalar case, we consider approximate joint measurements M of the position Qu = u · Q
along the direction u and the momentum Pv = v · P along the direction v, where u · v = cos α. We find
two different entropic MURs. The first entropic MUR in the scalar case is similar to the vector case
(Theorem 3, Remark 11). The second one is (Theorem 1):
for all Gaussian states ρ and all Gaussian joint approximate measurements M of $Q_u$ and $P_v$. This lower
bound holds for every Gaussian state ρ without constraints on the position and momentum variances
$\mathrm{Var}(Q_{u,\rho})$ and $\mathrm{Var}(P_{v,\rho})$; it is strictly positive unless u and v are orthogonal, but it is state dependent.
Again, the relative entropies and the lower bound are scale invariant.
The paper is organized as follows. In Section 2, we introduce our target position and momentum
observables, we discuss their general properties and define some related quantities (spectral measures,
mean vectors and variance matrices, PURs for second order quantum moments, Weyl operators,
Gaussian states). Section 3 is devoted to the definitions and main properties of the relative and
differential (Shannon) entropies. Section 4 is a review on the entropic PURs in the continuous
case [7–9,46], with a particular focus on their lack of scale invariance. This is a flaw due to the
very definition of differential entropy, and one of the reasons that lead us to introduce relative entropy
based MURs. In Section 5 we construct the covariant observables which will be used as approximate
joint measurements of the position and momentum target observables. Finally, in Section 6 the main
results on MURs that we sketched above are presented in detail. Some conclusions are discussed in
Section 7.
Each of the vector operators has n components; it could be the case of a single particle in one or
more dimensions (n = 1, 2, 3), or several scalar or vector particles, or the quadratures of n modes of the
electromagnetic field. We assume the Hilbert space H to be irreducible for the algebra generated by
the canonical operators Q and P. An observable of the quantum system H is identified with a positive
operator valued measure (POVM); in the paper, we shall consider observables with outcomes in $\mathbb{R}^k$
endowed with its Borel σ-algebra $\mathcal{B}(\mathbb{R}^k)$. The use of POVMs to represent observables in quantum
theory is standard and the definition can be found in many textbooks [22,23,26,47]; the alternative
name “non-orthogonal resolutions of the identity” is also used [3–5]. Following [5,23,26,31], a sharp
observable is an observable represented by a projection valued measure (pvm); it is standard to identify
a sharp observable on the outcome space $\mathbb{R}^k$ with the k self-adjoint operators corresponding to it by
the spectral theorem. Two observables are jointly measurable or compatible if there exists a POVM having
them as marginals. Because of the non-vanishing commutators, each pair $Q_i$, $P_i$, as well as the
vectors Q, P, are not jointly measurable.
We denote by T (H) the trace class operators on H, by S ⊂ T (H) the subset of the statistical
operators (or states, preparations), and by L(H) the space of the linear bounded operators.
In the Dirac notation, if $|x\rangle$ and $|p\rangle$ are the improper position and momentum eigenvectors,
these densities take the expressions $f(x|\rho) = \langle x|\rho|x\rangle$ and $g(p|\rho) = \langle p|\rho|p\rangle$, respectively. The mean
vectors and the variance matrices of these distributions will be given in (7) and (8).
In this case we have [ Qu , Pv ] = ih̄ cos α, so that Qu and Pv are not jointly measurable, unless the
directions u and v are orthogonal.
Their pvm’s are denoted by Qu and Pv , their distributions in a state ρ by Qu,ρ and Pv,ρ , and their
corresponding probability densities by f u (•|ρ) and gv (•|ρ): ∀ A, B ∈ B(R),
\[
Q_{u,\rho}(A) = \operatorname{Tr}\{Q_u(A)\rho\} = \int_A f_u(x|\rho)\,dx, \qquad
P_{v,\rho}(B) = \operatorname{Tr}\{P_v(B)\rho\} = \int_B g_v(p|\rho)\,dp.
\]
Of course, the densities in the scalar case are marginals of the densities in the vector case.
Means and variances will be given in (11).
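Then, the mean vector and the variance matrix of the position Q in the state ρ ∈ S2 are
\[
a_i^\rho := \int_{\mathbb{R}^n} x_i\, f(x|\rho)\,dx \equiv \operatorname{Tr}\{\rho Q_i\}, \qquad
A_{ij}^\rho := \int_{\mathbb{R}^n} (x_i - a_i^\rho)(x_j - a_j^\rho)\, f(x|\rho)\,dx \equiv \operatorname{Tr}\{\rho\,(Q_i - a_i^\rho)(Q_j - a_j^\rho)\}. \tag{7}
\]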
Since there is no joint measurement for the position Q and momentum P, the quantum covariances
$C^\rho_{ij}$ are not covariances of a joint distribution, and thus they do not have a classical probabilistic
interpretation.
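By means of the moments above, we construct the three real n × n matrices $A^\rho$, $B^\rho$, $C^\rho$,
the 2n-dimensional vector $\mu^\rho$ and the symmetric 2n × 2n matrix $V^\rho$, with
\[
\mu^\rho := \begin{pmatrix} a^\rho \\ b^\rho \end{pmatrix}, \qquad
V^\rho := \begin{pmatrix} A^\rho & C^\rho \\ (C^\rho)^T & B^\rho \end{pmatrix}. \tag{10}
\]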
We say V ρ is the quantum variance matrix of position and momentum in the state ρ. In [2]
dimensionless canonical operators are considered, but apart from this, our matrix V ρ corresponds to
their “noise matrix in real form”; the name “variance matrix” is also used [44,48].
In a similar way, we can introduce all the moments related to the position Qu and momentum Pv
introduced in (6). For ρ ∈ S2 , the means and variances are respectively
Similarly to (9), we have also the ‘quantum covariance’ u · C ρ v ≡ v · (C ρ ) T u. Then, we collect the
two means in a single vector and we introduce the variance matrix:
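\[
\mu^\rho_{u,v} := \begin{pmatrix} u\cdot a^\rho \\ v\cdot b^\rho \end{pmatrix}, \qquad
V^\rho_{u,v} := \begin{pmatrix} u\cdot A^\rho u & u\cdot C^\rho v \\ u\cdot C^\rho v & v\cdot B^\rho v \end{pmatrix}. \tag{12}
\]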
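Proposition 1. Let $V = \begin{pmatrix} A & C \\ C^T & B \end{pmatrix}$ be a real symmetric 2n × 2n block matrix with the same dimensions of
a quantum variance matrix. Define
\[
V_\pm := \begin{pmatrix} A & C \pm \frac{i\hbar}{2}\mathbb{1} \\ C^T \mp \frac{i\hbar}{2}\mathbb{1} & B \end{pmatrix} \equiv V \pm \frac{i}{2}\,\Omega,
\qquad\text{with}\qquad
\Omega := \begin{pmatrix} 0 & \hbar\mathbb{1} \\ -\hbar\mathbb{1} & 0 \end{pmatrix}. \tag{13}
\]
Then
\[
V = V^\rho \ \text{for some state } \rho \in S_2 \iff V_+ \ge 0 \iff V_- \ge 0. \tag{14}
\]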
The inequalities (14) for V± tell us exactly when a (positive semi-definite) real matrix V is the quantum
variance matrix of position and momentum in a state ρ. Moreover, they are the multidimensional
version of the usual uncertainty principle expressed through the variances [2,3,5], hence they represent
a form of PURs. The block matrix Ω in the definition of V± is useful to compress formulae involving
position and momentum; moreover, it makes simpler to compare our equations with their frequent
dimensionless versions (with h̄ = 1) in the literature [43,44].
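For a single degree of freedom (n = 1) the criterion (14) can be checked by hand, because V_+ is then a 2 × 2 Hermitian matrix. The following Python sketch is only an illustration: the function name and the default ħ = 1 are assumptions made here, not notation from the text.

```python
def is_quantum_variance_1d(a, b, c, hbar=1.0):
    """Return True if V = [[a, c], [c, b]] satisfies V_+ >= 0 (case n = 1).

    Hypothetical helper. Here V_+ = [[a, c + i*hbar/2], [c - i*hbar/2, b]],
    a 2x2 Hermitian matrix, which is positive semi-definite exactly when
    both diagonal entries and the determinant are non-negative.
    """
    det_v_plus = a * b - c * c - hbar * hbar / 4.0
    return a >= 0 and b >= 0 and det_v_plus >= 0
```

With ħ = 1, the choice a = b = 1/2, c = 0 sits exactly on the boundary of the admissible set, while a = b = 0.1 violates it.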
Proof. Equivalences (14) are well known, see e.g., [3] (Section 1.1.5), [5] (Equation (2.20)), and [2]
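(Theorem 2). Then $V = \frac{1}{2} V_+ + \frac{1}{2} V_- \ge 0$.
By using the real block vector $\begin{pmatrix} \alpha u \\ \beta v \end{pmatrix}$, with arbitrary α, β ∈ R and given u, v ∈ R^n,
the semi-positivity (14) implies
\[
\begin{pmatrix} u\cdot Au & u\cdot Cv \pm \frac{i\hbar}{2}\, u\cdot v \\ v\cdot C^T u \mp \frac{i\hbar}{2}\, v\cdot u & v\cdot Bv \end{pmatrix} \ge 0,
\qquad \forall u \in \mathbb{R}^n,\ \forall v \in \mathbb{R}^n,
\]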
which in turn implies A ≥ 0, B ≥ 0 and (15). Then, by choosing u = v = ui , where u1 , . . . , un are the
eigenvectors of A (since A is a real symmetric matrix, ui ∈ Rn for all i), one gets the strict positivity of
all the eigenvalues of A; analogously, one gets B > 0.
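Inequality (15), evaluated at the directions u and v of (6), becomes the uncertainty rule à la Robertson [6] for the
observables in (6) (a position component and a momentum component spanning an arbitrary angle α):
\[
\operatorname{Var}(Q_{u,\rho})\,\operatorname{Var}(P_{v,\rho}) \ge (u\cdot C^\rho v)^2 + \frac{\hbar^2}{4}(\cos\alpha)^2. \tag{16}
\]
Inequality (16) is equivalent to
\[
V^\rho_{u,v} \pm \frac{i\hbar}{2}\cos\alpha \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \ge 0. \tag{17}
\]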
Since V± are block matrices, their positive semi-definiteness can be studied by means of the Schur
complements [49–51]. However, as V± are complex block matrices with a very peculiar structure,
special results hold for them. Before summarizing the properties of V± in the next proposition, we need
a simple auxiliary algebraic lemma.
Lemma 1. Let A and B be complex self-adjoint matrices such that A ≥ B ≥ 0. Then det A ≥ det B ≥ 0,
and the equality det A = det B holds iff A = B.
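A small numerical instance may help to fix the content of Lemma 1. The sketch below uses two real symmetric 2 × 2 matrices chosen here purely for illustration; the helper names are hypothetical, and the positivity test relies on the standard fact that a symmetric 2 × 2 matrix is positive semi-definite when its diagonal entries and determinant are non-negative.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def is_psd2(m):
    """Positive semi-definiteness test for a real symmetric 2x2 matrix."""
    return m[0][0] >= 0 and m[1][1] >= 0 and det2(m) >= 0


def sub2(a, b):
    """Entrywise difference a - b of two 2x2 matrices."""
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]


# Illustrative matrices (an assumption of this sketch, not from the text).
A = [[2.0, 1.0], [1.0, 2.0]]
B = [[1.0, 0.0], [0.0, 1.0]]

# A >= B >= 0 means that both A - B and B are positive semi-definite.
ordered = is_psd2(sub2(A, B)) and is_psd2(B)
```

Here A − B has eigenvalues 0 and 2, so the ordering holds, and the determinants are ordered as the lemma states; they differ because A ≠ B.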
Proof. Let λi↓ ( A) and λi↓ ( B) be the ordered decreasing sequences of the eigenvalues of A and B,
respectively. Then, by Weyl’s inequality, A ≥ B ≥ 0 implies λi↓ ( A) ≥ λi↓ ( B) ≥ 0 for every i [52]
(Section III.2). This gives the first statement. Moreover, if A ≥ B ≥ 0 and det A = det B, we get
λi↓ ( A) = λi↓ ( B) for every i. Then A = B because A − B ≥ 0 and Tr{ A − B} = 0.
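Proposition 2. Let $V = \begin{pmatrix} A & C \\ C^T & B \end{pmatrix}$ be a real symmetric 2n × 2n matrix with the same dimensions of
a quantum variance matrix. Then V_+ ≥ 0 (or, equivalently, V_− ≥ 0) if and only if A > 0 and
\[
B \ge \Big(C^T \mp \frac{i\hbar}{2}\mathbb{1}\Big) A^{-1} \Big(C \pm \frac{i\hbar}{2}\mathbb{1}\Big)
\equiv C^T A^{-1} C + \frac{\hbar^2}{4} A^{-1} \mp \frac{i\hbar}{2}\big(A^{-1} C - C^T A^{-1}\big). \tag{18}
\]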
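\[
(\det A)(\det B) \ge \det V = (\det A)\,\det\big(B - C^T A^{-1} C\big) \ge \Big(\frac{\hbar}{2}\Big)^{2n}, \tag{20}
\]
\[
\det V = \Big(\frac{\hbar}{2}\Big)^{2n} \iff B = C^T A^{-1} C + \frac{\hbar^2}{4} A^{-1} \implies CA = AC^T, \tag{21}
\]
\[
(\det A)(\det B) = \Big(\frac{\hbar}{2}\Big)^{2n} \iff B = \frac{\hbar^2}{4} A^{-1},\quad C = 0. \tag{22}
\]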
Proof. Since we already know that V+ ≥ 0 implies the invertibility of A, the equivalence between (14)
and (18) with A > 0 follows from [49] (Theorem 1.12 p. 34) (see also [50] (Theorem 11.6) or [51] (Lemma 3.2)).
In (19), the first inequality follows by summing up the two inequalities in (18). The last two ones
are immediate by the positivity of A−1 .
The equality in (20) is Schur’s formula for the determinant of block matrices ([49], Theorem 1.1 p. 19).
Then, the first inequality is immediate by the lemma above and the trivial relation B ≥ B − C T A−1 C;
the second one follows from (19):
\[
B - C^T A^{-1} C \ge \frac{\hbar^2}{4} A^{-1}
\implies
\det\big(B - C^T A^{-1} C\big) \ge \det\Big(\frac{\hbar^2}{4} A^{-1}\Big) = \frac{(\hbar/2)^{2n}}{\det A}.
\]
The equality $\det V = (\hbar/2)^{2n}$ is equivalent to $\det\big(B - C^T A^{-1} C\big) = \det\big(\frac{\hbar^2}{4} A^{-1}\big)$; since the latter
two determinants are evaluated on ordered positive matrices by (19), they coincide if and only if
the respective arguments are equal (Lemma 1); this shows the equivalence in (21). Then, by (18),
the self-adjoint matrix $\frac{i\hbar}{2}\big(A^{-1} C - C^T A^{-1}\big)$ is both positive semi-definite and negative semi-definite;
hence it is null, that is, $CA = AC^T$.
Finally, $B = \frac{\hbar^2}{4} A^{-1}$ gives $(\det A)(\det B) = (\hbar/2)^{2n}$ trivially. Conversely, $(\det A)(\det B) = (\hbar/2)^{2n}$
implies $\det B = \det\big(B - C^T A^{-1} C\big)$ by (20); since $B \ge B - C^T A^{-1} C \ge 0$ by (19), Lemma 1 then implies
$C^T A^{-1} C = 0$ and so C = 0.
By (18) and (19), every time three matrices A, B, C define the quantum variance matrix of a state
ρ, the same holds for A, B, C̃ = 0. This fact can be used to characterize when two positive matrices
A and B are the diagonal blocks of some quantum variance matrix, or two positive numbers cQ and c P
are the position and momentum variances of a quantum state along the two directions u and v.
Proposition 3. Two real matrices A > 0 and B > 0, having the dimension of the square of a length and
momentum, respectively, are the diagonal blocks of a quantum variance matrix $V^\rho$ if and only if
\[
B \ge \frac{\hbar^2}{4} A^{-1}.
\]
Two real numbers c_Q > 0 and c_P > 0, having the dimension of the square of a length and momentum,
respectively, are such that c_Q = Var(Q_{u,ρ}) and c_P = Var(P_{v,ρ}) for some state ρ if and only if
\[
c_Q\, c_P \ge \Big(\frac{\hbar}{2}\cos\alpha\Big)^2.
\]
Proof. For A and B, the necessity follows from (19). The sufficiency comes from (18) by choosing
\[
V^\rho = \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}.
\]
For c_Q and c_P, the necessity follows from (15). The sufficiency comes from (18) with $V = \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}$
and for example the following choices of A and B:
\[
A = c_Q\, uu^T + \frac{\hbar^2}{4c_P}\, vv^T + A', \qquad
B = \frac{\hbar^2}{4c_Q}\, uu^T + c_P\, vv^T + B',
\]
where A′ and B′ are any two scalar multiples of the orthogonal projection onto {u, v}^⊥ satisfying
$B' \ge \frac{\hbar^2}{4} A'^{-1}$ when restricted to {u, v}^⊥;
• if cos α ∉ {0, ±1}, we choose
2 3
1 2
A = cQ uu T − (uv T + vu T ) + vv T + A
cos α (cos α) 2
2 3
cP (sin α)2 + (cos α)4 T 1
B= uu − (uv T + vu T ) + vv T + B ,
(sin α)4 (cos α) 2 cos α
due to this property, the Weyl operators are also known as displacement operators.
With a slight abuse of notation, we shall sometimes use the identification
\[
W(x, p) \equiv W\begin{pmatrix} x \\ p \end{pmatrix}, \tag{26}
\]
where $\begin{pmatrix} x \\ p \end{pmatrix}$ is a block column vector belonging to the phase-space R^n × R^n ≡ R^{2n}; here, the first block
x is a position and the second block p is a momentum.
By means of the Weyl operators, it is possible to define the characteristic function of any
trace-class operator.
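Definition 2. For any operator ρ ∈ T(H), its characteristic function is the complex valued function
$\hat\rho : \mathbb{R}^{2n} \to \mathbb{C}$ defined by
\[
\hat\rho(w) := \operatorname{Tr}\{\rho\, W(-\Omega w)\}, \qquad w \equiv \begin{pmatrix} k \\ l \end{pmatrix}. \tag{27}
\]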
Note that k is the inverse of a length and l is the inverse of a momentum, so that w is a block
vector living in the space R2n ≡ Rn × Rn regarded as the dual of the phase-space.
Instead of the characteristic function, sometimes the so called Weyl transform Tr {W ( x, p)ρ} is
introduced [4,44].
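By [4] (Proposition 5.3.2, Theorem 5.3.3), we have $\hat\rho \in L^2(\mathbb{R}^{2n})$ and the following trace formula
holds: ∀ρ, σ ∈ T(H),
\[
\operatorname{Tr}\{\sigma^* \rho\} = \Big(\frac{\hbar}{2\pi}\Big)^{n} \int_{\mathbb{R}^{2n}} \overline{\hat\sigma(w)}\, \hat\rho(w)\, dw. \tag{28}
\]
As a corollary [4] (Corollary 5.3.4), we have that a state ρ ∈ S is pure if and only if
\[
\Big(\frac{\hbar}{2\pi}\Big)^{n} \int_{\mathbb{R}^{2n}} |\hat\rho(w)|^2\, dw = 1.
\]
By [53] (Lemma 3.1) or [26] (Proposition 8.5.(e)), the trace formula also implies
\[
\frac{1}{(2\pi\hbar)^n} \int_{\mathbb{R}^{2n}} W(x, p)\, \rho\, W(x, p)^*\, dx\, dp = \operatorname{Tr}\{\rho\}\, \mathbb{1}, \qquad \forall \rho \in T(H). \tag{29}
\]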
Moreover, the following inversion formula ensures that the characteristic function $\hat\rho$ completely
characterizes the state ρ [4] (Corollary 5.3.5):
\[
\rho = \Big(\frac{\hbar}{2\pi}\Big)^{n} \int_{\mathbb{R}^{2n}} W(\Omega w)\, \hat\rho(w)\, dw, \qquad \forall \rho \in T(H).
\]
The last two integrals are defined in the weak operator topology.
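Finally, for ρ ∈ S2, the moments (7)–(10) can be expressed as in [4] (Section 5.4):
\[
-i\, \frac{\partial \hat\rho(w)}{\partial w_i}\bigg|_{0} = \mu_i^\rho, \qquad
-\frac{\partial^2 \hat\rho(w)}{\partial w_i\, \partial w_j}\bigg|_{0} = V_{ij}^\rho + \mu_i^\rho \mu_j^\rho. \tag{30}
\]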
The condition $V^\rho_+ \ge 0$ is necessary and sufficient in order that the function (31) defines the
characteristic function of a quantum state [4] (Theorem 5.5.1), [5] (Theorem 12.17). Therefore,
Gaussian states are exactly the states whose characteristic function is the exponential of a second order
polynomial [4] (Equation (5.5.49)), [5] (Equation (12.80)).
We shall denote by G the set of the Gaussian states; we have G ⊂ S2 ⊂ S. By (30), the vectors
aρ , bρ and the matrices Aρ , Bρ , C ρ characterizing a Gaussian state ρ are just its first and second order
quantum moments introduced in (7)–(9). By (31), the corresponding distributions of position and
momentum are Gaussian, namely
Proof. The trace formula (28) and (31) give $\operatorname{Tr}\{\rho^2\} = \dfrac{(\hbar/2)^n}{\sqrt{\det V^\rho}}$, and this implies the statement.
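The purity formula used in this proof is easy to evaluate for one degree of freedom. The sketch below is an illustration under stated assumptions: n = 1, ħ = 1 by default, and a hypothetical function name; the entries a, b, c are the blocks of the 2 × 2 variance matrix of a Gaussian state.

```python
import math


def gaussian_purity_1d(a, b, c, hbar=1.0):
    """Tr{rho^2} = (hbar/2) / sqrt(det V) for a one-mode Gaussian state.

    Hypothetical helper for n = 1, where V = [[a, c], [c, b]] is assumed
    to be an admissible quantum variance matrix (a*b - c*c >= hbar^2/4).
    """
    det_v = a * b - c * c
    return (hbar / 2.0) / math.sqrt(det_v)
```

For a = b = ħ/2 and c = 0 the determinant equals (ħ/2)², so the purity is 1; doubling both variances halves it.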
Proposition 5 (Minimum uncertainty states). For ρ ∈ S2, we have $(\det A^\rho)(\det B^\rho) = (\hbar/2)^{2n}$ if and only if
ρ is a pure Gaussian state and it factorizes into the product of minimum uncertainty states up to a rotation of R^n.
Proof. If $(\det A^\rho)(\det B^\rho) = (\hbar/2)^{2n}$, then the equivalence (22) gives $B^\rho = \frac{\hbar^2}{4}(A^\rho)^{-1}$, so that the
variance matrices $A^\rho$ and $B^\rho$ have a common eigenbasis u_1, . . . , u_n. Thus, all the corresponding
couples of position $Q_{u_i}$ and momentum $P_{u_i}$ have minimum uncertainties: $\operatorname{Var}(Q_{u_i})\operatorname{Var}(P_{u_i}) = \frac{\hbar^2}{4}$.
Therefore, if we consider the factorization of the Hilbert space H = H1 ⊗ · · · ⊗ Hn corresponding
to the basis u1 , . . . , un , all the partial traces of the state ρ on each factor Hi are minimum uncertainty
states. Since for n = 1 the minimum uncertainty states are pure and Gaussian, the state ρ is a pure
product Gaussian state.
The converse is immediate.
The value +∞ is allowed for S(p‖q); the usual convention 0 log(0/0) = 0 is understood.
The relative entropy (33) is the amount of information that is lost when q is used to approximate
p [54] (p. 51). Of course, if x is dimensioned, then the densities f and g have the same dimension
(that is, the inverse of x), and the argument of the logarithm is dimensionless, as it must be.
(i) S(p‖q) ≥ 0.
(ii) S(p‖q) = 0 ⇐⇒ p = q ⇐⇒ f = g a.e.
(iii) S(p‖q) is invariant under a change of the unit of measurement.
(iv) If p = N(a; A) and q = N(b; B) with invertible variance matrices A and B, then
\[
2\, S(p\|q) = (\log e)\Big\{ (a - b)\cdot B^{-1}(a - b) + \operatorname{Tr}\big[ B^{-1} A - \mathbb{1} \big] \Big\} + \log\frac{\det B}{\det A}. \tag{34}
\]
As S(p‖q) is scale invariant, it quantifies a relative error for the use of q as an approximation of p,
not an absolute one.
Let us employ the relative entropy to evaluate the effect of an additive Gaussian noise ν ∼ N(b; β²)
on an independent Gaussian random variable X. If X ∼ N(a; α²), then X + ν ∼ N(a + b; α² + β²),
and the relative entropy of the true distribution of X with respect to its disturbed version X + ν is
\[
S(X\|X + \nu) = \frac{\log e}{2}\, \frac{b^2 - \beta^2}{\alpha^2 + \beta^2} + \frac{1}{2} \log\frac{\alpha^2 + \beta^2}{\alpha^2}.
\]
This expression vanishes if the noise becomes negligible with respect to the true distribution, that
is if β²/α² → 0 and b²/α² → 0. On the other hand, S(X‖X + ν) diverges if the noise becomes too
strong with respect to the true distribution, or, in other words, if the true distribution becomes too
peaked with respect to the noise, that is, β²/α² → +∞ or b²/α² → +∞.
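The two limiting behaviours just described can be checked numerically. The following Python sketch evaluates the expression for S(X‖X + ν) in natural units (nats, so that log e = 1); the function name and the choice of units are assumptions made for this illustration.

```python
import math


def noise_relative_entropy(alpha2, b, beta2):
    """S(X || X + nu) in nats for X ~ N(a; alpha2) and nu ~ N(b; beta2).

    Hypothetical helper: alpha2 > 0 is the variance of X, b the mean of
    the noise and beta2 >= 0 its variance. The mean a of X cancels out.
    """
    total = alpha2 + beta2
    return 0.5 * (b * b - beta2) / total + 0.5 * math.log(total / alpha2)
```

With unit signal variance, a noise variance of 10⁻⁶ gives a loss of order 10⁻¹³ nats, whereas a noise variance of 10⁶ gives several nats.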
This quantity is commonly used in the literature, even if it lacks many of the nice properties of
the Shannon entropy for discrete random variables. For example, H ( X ) is not scale invariant, and it
can be negative [56] (p. 244).
Since the density f enters in the logarithm argument, the definition of H ( X ) is meaningful only
when f is dimensionless, which is the same as X being dimensionless. Note that, if X is dimensioned
and c > 0 is a real parameter making X̃ = cX a dimensionless random variable, then
\[
H(\tilde X) = -\int_{\mathbb{R}^n} \frac{f(u/c)}{c^n} \log\frac{f(u/c)}{c^n}\, du = -\int_{\mathbb{R}^n} f(x) \log\frac{f(x)}{c^n}\, dx.
\]
In the following, we shall consider the differential entropy only for dimensionless random vectors X.
\[
H(X) \le \frac{1}{2} \log\big( (2\pi e)^n \det A \big) = \frac{n}{2} \log(2\pi e) + \frac{1}{2} \operatorname{Tr}\log A.
\]
The equality holds iff X is Gaussian with variance matrix A and arbitrary mean vector a.
(ii) If X = (X_1, . . . , X_n) is an absolutely continuous random vector, then
\[
H(X) \le \sum_{i=1}^{n} H(X_i).
\]
Remark 1. In property (i) we have used the following well-known matrix identity, which follows by diagonalization:
Remark 2. Property (i) yields that the differential entropy of a Gaussian random variable X ∼ N(a; α²) is
\[
H(X) = \frac{1}{2} \log\big( 2\pi e\, \alpha^2 \big),
\]
which is an increasing function of the variance α2 , and thus it is a measure of the uncertainty of X. Note that
H ( X ) ≥ 0 iff α2 ≥ 1/(2πe).
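The sign change of the Gaussian differential entropy at α² = 1/(2πe) can be seen directly. The sketch below works in nats and uses a hypothetical function name; both are assumptions of this illustration.

```python
import math


def gaussian_differential_entropy(alpha2):
    """H(X) = (1/2) ln(2*pi*e*alpha2) in nats, for X ~ N(a; alpha2)."""
    return 0.5 * math.log(2.0 * math.pi * math.e * alpha2)


# Variance at which the entropy vanishes, as stated in Remark 2.
threshold = 1.0 / (2.0 * math.pi * math.e)
```

The entropy is zero at the threshold, positive for unit variance, and negative for very peaked distributions.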
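\[
\tilde Q := \sqrt{\frac{\kappa}{\hbar}}\; Q, \qquad \tilde P := \frac{\lambda}{\sqrt{\hbar\kappa}}\; P
\quad\Longrightarrow\quad [\tilde Q_i, \tilde P_j] = i\lambda\, \delta_{ij}. \tag{35}
\]
We use a unique dimensional constant κ, in order to respect rotation symmetry and not to
distinguish different particles. Anyway, there is no natural link between the parameter multiplying Q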
and the parameter multiplying P; this is the reason for introducing λ. As we see from the commutation
rules, the constant λ plays the role of a dimensionless version of h̄; in the literature on PURs, often λ = 1
is used [8,9,12,46].
Thus, the bound (37) arises from quantum relations between Q and P; indeed, there would be no
lower bound for (36) if we could take both det Aρ and det Bρ arbitrarily small.
By item (ii) of Proposition 7, the differential entropy for the distribution of a random vector is
smaller than the sum of the entropies of its marginals; however, the final bound (37) is a tight bound
for both $H(\tilde Q_\rho) + H(\tilde P_\rho)$ and $\sum_{i=1}^{n} H(\tilde Q_{i,\rho}) + \sum_{i=1}^{n} H(\tilde P_{i,\rho})$.
By the results of [8,9], the same bound (37) is obtained even if the minimization is done over all
the states, not only the Gaussian ones.
The uncertainty result (37) depends on λ, this being a consequence of the lack of scale invariance
of the differential entropy; note that the bound is positive if and only if λ > 1/(πe). Sometimes in
the literature the parameter h̄ appears in the argument of the logarithm [27,30]; this fact has to be
interpreted as the appearance of a parameter with the numerical value of h̄, but without dimensions.
In this sense the formulation (37) is consistent with both the cases with λ = 1 or λ = h̄. Sometimes the
smaller bound ln 2π appears in place of log πe [10]; this is connected to a state dependent formulation
of the entropic PUR [12] (Section V.B).
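\[
\tilde Q_u = \sqrt{\frac{\kappa}{\hbar}}\; Q_u, \qquad \tilde P_v = \frac{\lambda}{\sqrt{\hbar\kappa}}\; P_v
\quad\Longrightarrow\quad [\tilde Q_u, \tilde P_v] = i\lambda \cos\alpha. \tag{38}
\]
\[
\sqrt{\frac{\kappa}{\hbar}}\; u\cdot a^\rho, \qquad \frac{\lambda}{\sqrt{\hbar\kappa}}\; v\cdot b^\rho, \qquad
\operatorname{Var}(\tilde Q_{u,\rho}) = \frac{\kappa}{\hbar}\; u\cdot A^\rho u, \qquad
\operatorname{Var}(\tilde P_{v,\rho}) = \frac{\lambda^2}{\hbar\kappa}\; v\cdot B^\rho v,
\]
with $\sqrt{\operatorname{Var}(\tilde Q_{u,\rho}) \operatorname{Var}(\tilde P_{v,\rho})} \ge \lambda\, |\cos\alpha| / 2$.
As in the vector case, the total preparation uncertainty is quantified by the sum of the two
differential entropies $H(\tilde Q_{u,\rho}) + H(\tilde P_{v,\rho})$. For ρ ∈ G, Proposition 7 gives
\[
H(\tilde Q_{u,\rho}) + H(\tilde P_{v,\rho}) = \log\Big( 2\pi e \sqrt{\operatorname{Var}(\tilde Q_{u,\rho}) \operatorname{Var}(\tilde P_{v,\rho})} \Big). \tag{39}
\]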
which depends on λ, but not on κ. Of course, because of (39), for Gaussian states a lower bound
for the sum $H(\tilde Q_{u,\rho}) + H(\tilde P_{v,\rho})$ is equivalent to a lower bound for the product $\operatorname{Var}(\tilde Q_{u,\rho}) \operatorname{Var}(\tilde P_{v,\rho})$.
By the generalization of the results of [8,9] given in [46], the bound (40) is obtained also when the
minimization is done over all the states.
Let us note that the bound in (40) is positive for |λ cos α| > 1/(πe), and it goes to −∞ for
α → π/2, which is the case of compatible Qu,ρ and Pv,ρ . In the case α = 0, the bound (40) is the same
as (37) for n = 1.
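The behaviour of the bound described above can be reproduced numerically. In the sketch below the explicit form ln(πeλ|cos α|) is not quoted from the text: it is inferred here by inserting the variance inequality λ|cos α|/2 into the Gaussian formula (39), and it is consistent with the stated positivity threshold |λ cos α| > 1/(πe). Natural logarithms and the function names are further assumptions.

```python
import math


def gaussian_entropy_sum(var_q, var_p):
    """Right-hand side of (39) in nats, for dimensionless variances."""
    return math.log(2.0 * math.pi * math.e * math.sqrt(var_q * var_p))


def entropic_bound(lam, alpha):
    """Inferred lower bound ln(pi*e*lam*|cos(alpha)|) on the entropy sum.

    Assumes cos(alpha) != 0; the bound diverges to -infinity otherwise.
    """
    return math.log(math.pi * math.e * lam * abs(math.cos(alpha)))
```

For λ = 1 and α = 0 a state with both variances equal to 1/2 attains the bound, and the bound becomes strongly negative as α approaches π/2.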
Definition 4. Given a bi-observable M : B(R^{2m}) → L(H), the characteristic function of M is the operator
valued function $\hat M : \mathbb{R}^{2m} \to L(H)$, with
\[
\hat M(k, l) = \int_{\mathbb{R}^{2m}} e^{i(k\cdot x + l\cdot p)}\, M(dx\, dp). \tag{41}
\]
In this definition the dimensions of the vector variables k and l are the inverses of a length and
momentum, respectively, as in the definition of the characteristic function of a state (27). This definition
is given so that $\operatorname{Tr}\{\hat M(k, l)\rho\}$ is the usual characteristic function of the probability distribution $M_\rho$
on R^{2m}.
and they are taken as the transformation property defining the following class of POVMs on
R2n [23,26,44,53,57].
have the same symmetry properties of Q and P, respectively. Although Q and P are not jointly
measurable, the following well-known result says that there are plenty of covariant phase-space
observables [4] (Theorem 4.8.3), [63,64]. In (43) below, we use the parity operator Π on H, which is
such that
Π W ( x, p) Π = W (− x, − p) = W ( x, p)∗ . (42)
Proposition 8. The covariant phase-space observables are in one-to-one correspondence with the states on H,
so that we have the identification S ∼ C; such a correspondence σ ↔ M^σ is given by
\[
M^\sigma(B) = \int_B M^\sigma(x, p)\, dx\, dp, \quad \forall B \in B(\mathbb{R}^{2n}), \qquad
M^\sigma(x, p) = \frac{1}{(2\pi\hbar)^n}\, W(x, p)\, \Pi \sigma \Pi\, W(x, p)^*. \tag{43}
\]
The characteristic function (41) of a measurement Mσ ∈ C has a very simple structure in terms of
the characteristic function (27) of the corresponding state σ ∈ S.
In (44) we have used the identification (26). The characteristic function of a state is introduced in (27).
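Then, we get
\[
\begin{aligned}
\hat M^\sigma(k, l) &= \frac{1}{(2\pi\hbar)^n} \int_{\mathbb{R}^{2n}} e^{i(k\cdot x + l\cdot p)}\, W(x, p)\, \Pi\sigma\Pi\, W(x, p)^*\, dx\, dp \\
&= \frac{1}{(2\pi\hbar)^n} \int_{\mathbb{R}^{2n}} W(-\hbar l, \hbar k)\, W(x, p)\, W(-\hbar l, \hbar k)^*\, \Pi\sigma\Pi\, W(x, p)^*\, dx\, dp \\
&= W(-\hbar l, \hbar k)\, \operatorname{Tr}\{ W(-\hbar l, \hbar k)^*\, \Pi\sigma\Pi \},
\end{aligned}
\]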
where we used the formula (29). By (42) and the definition (27), we get (44). Again by (27), we get (45).
In terms of probability densities, measuring Mσ on the state ρ yields the density function
$h^\sigma(x, p|\rho) = \operatorname{Tr}\{M^\sigma(x, p)\rho\}$. Then, by (45), the densities of the marginals $M^\sigma_{1,\rho}$ and $M^\sigma_{2,\rho}$ are the convolutions
where f and g are the sharp densities introduced in (5). By the arbitrariness of the state ρ, the marginal
POVMs of M^σ turn out to be the convolutions (or ‘smearings’)
\[
M_1^\sigma(A) = \int_A dx \int_{\mathbb{R}^n} f(x - x'|\sigma)\, Q(dx'), \qquad
M_2^\sigma(B) = \int_B dp \int_{\mathbb{R}^n} g(p - p'|\sigma)\, P(dp')
\]
for all A, B ∈ B(R) and x, p ∈ Rn . We employ covariance to define our class of approximate joint
measurements of Qu and Pv .
So, our approximate joint measurements of Qu and Pv will be all the bi-observables in the class Cu,v .
Example 1. The marginal of a covariant phase-space observable M^σ along the directions u and v is
a (u, v)-covariant bi-observable. Actually, it can be proved that, if cos α ≠ 0, all (u, v)-covariant bi-observables
can be obtained in this way.
It is useful to work with a little more generality, and merge Definitions 5 and 6 into a single notion
of covariance.
Thus, approximate joint observables of Q_u and P_v are just J-covariant observables on R^2 for the
choice of the 2 × 2n matrix
\[
J = \begin{pmatrix} u^T & 0^T \\ 0^T & v^T \end{pmatrix}. \tag{47}
\]
On the other hand, covariant phase-space observables constitute the class of 12n -covariant
observables on R2n , where 12n is the identity map of R2n .
for two vectors $a^M, b^M \in \mathbb{R}^m$, a real 2m × 2n matrix $J^M$ and a real symmetric 2m × 2m matrix $V^M$ satisfying
the condition
\[
V^M \pm \frac{i}{2}\, J^M\, \Omega\, (J^M)^T \ge 0. \tag{49}
\]
We set $\mu^M = \begin{pmatrix} a^M \\ b^M \end{pmatrix}$. The triple $(\mu^M, V^M, J^M)$ is the set of the parameters of the Gaussian observable M.
In this definition, the vector aM has the dimension of a length, and bM of a momentum; similarly,
the matrices J M , V M decompose into blocks of different dimensions. The condition (49) is necessary
and sufficient in order that the function (48) defines the characteristic function of a POVM.
For unbiased Gaussian measurements, i.e., Gaussian bi-observables with aM = bM = 0,
the previous definition coincides with the one of [5] (Section 12.4.3). It is also a particular case of the
more general definition of Gaussian observables on arbitrary (not necessarily symplectic) linear spaces
that is given in [43,44]. We refer to [5,44] for the proof that Equation (48) is actually the characteristic
function of a POVM.
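Measuring the Gaussian observable M on the Gaussian state ρ yields the probability distribution
$M_\rho$ whose characteristic function is
\[
\begin{aligned}
\operatorname{Tr}\{\hat M(k, l)\rho\}
&= \hat\rho\Big( (J^M)^T \begin{pmatrix} k \\ l \end{pmatrix} \Big)
\exp\Big\{ i \begin{pmatrix} k^T & l^T \end{pmatrix} \begin{pmatrix} a^M \\ b^M \end{pmatrix}
- \frac{1}{2} \begin{pmatrix} k^T & l^T \end{pmatrix} V^M \begin{pmatrix} k \\ l \end{pmatrix} \Big\} \\
&= \exp\Big\{ i \begin{pmatrix} k^T & l^T \end{pmatrix} \Big[ \begin{pmatrix} a^M \\ b^M \end{pmatrix} + J^M \begin{pmatrix} a^\rho \\ b^\rho \end{pmatrix} \Big]
- \frac{1}{2} \begin{pmatrix} k^T & l^T \end{pmatrix} \Big[ V^M + J^M V^\rho (J^M)^T \Big] \begin{pmatrix} k \\ l \end{pmatrix} \Big\};
\end{aligned}
\]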
Proposition 10. Suppose M is a Gaussian bi-observable on R2m with parameters (μM , V M , J M ). Let J be any
2m × 2n real matrix. Then, the POVM M is a J-covariant observable if and only if J M = J.
Proof. For x, p ∈ R^n, we let M′ and M″ be the two POVMs on R^{2m} given by
\[
M'(Z) = W(x, p)\, M(Z)\, W(x, p)^*, \qquad
M''(Z) = M\Big( Z + J \begin{pmatrix} x \\ p \end{pmatrix} \Big), \qquad \forall Z \in B(\mathbb{R}^{2m}).
\]
By the commutation relations (24) for the Weyl operators, we immediately get
# , -$
k
M " (k, l )W ( x, p)∗ = exp
" (k, l ) = W ( x, p)M −i x T p T Ω −1 − Ω ( J M ) T " (k, l )
M
l
# $
x
= exp −i k T l T J M " (k, l );
M
p
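we have also
\[
\hat M''(k, l) = \int_{\mathbb{R}^{2m}} \exp\Big\{ i \begin{pmatrix} k^T & l^T \end{pmatrix} \Big[ \begin{pmatrix} x' \\ p' \end{pmatrix} - J \begin{pmatrix} x \\ p \end{pmatrix} \Big] \Big\}\, M(dx'\, dp')
= \exp\Big\{ -i \begin{pmatrix} k^T & l^T \end{pmatrix} J \begin{pmatrix} x \\ p \end{pmatrix} \Big\}\, \hat M(k, l).
\]
Since $\hat M(k, l) \ne 0$ for all k, l, by comparing the last two expressions we see that M′ = M″ if and
only if
\[
\exp\Big\{ -i \begin{pmatrix} k^T & l^T \end{pmatrix} J^M \begin{pmatrix} x \\ p \end{pmatrix} \Big\}
= \exp\Big\{ -i \begin{pmatrix} k^T & l^T \end{pmatrix} J \begin{pmatrix} x \\ p \end{pmatrix} \Big\},
\qquad \forall x, p \in \mathbb{R}^n,\ \forall k, l \in \mathbb{R}^m,
\]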
Vector Observables
Let us point out the structure of the Gaussian approximate joint measurements of Q and P.
Proposition 11. A bi-observable M^σ ∈ C is Gaussian if and only if the state σ is Gaussian. In this case,
the covariant bi-observable M^σ is Gaussian with parameters
\[
\mu^{M^\sigma} = \mu^\sigma, \qquad V^{M^\sigma} = V^\sigma, \qquad J^{M^\sigma} = \mathbb{1}_{2n}.
\]
Proof. By comparing (31), (44) and (48), and using the fact that W ( x1 , p1 ) ∝ W ( x2 , p2 ) if and only if
x1 = x2 and p1 = p2 , we have the first statement. Then, for σ ∈ G, we see immediately that Mσ is
a Gaussian observable with the above parameters.
We call C^G the class of the Gaussian covariant phase-space observables. By (50), observing M^σ
on a Gaussian state ρ ∈ G yields the normal probability distribution $M^\sigma_\rho = N(\mu^\rho + \mu^\sigma;\ V^\rho + V^\sigma)$,
with marginals
\[
M^\sigma_{1,\rho} = N(a^\rho + a^\sigma;\ A^\rho + A^\sigma), \qquad
M^\sigma_{2,\rho} = N(b^\rho + b^\sigma;\ B^\rho + B^\sigma). \tag{51}
\]
Scalar Observables
We now study the Gaussian approximate joint measurements of the target observables Q_u and P_v
defined in (6).
Proposition 12. A Gaussian bi-observable M with parameters $(\mu^M, V^M, J^M)$ is in C_{u,v} if and only if $J^M = J$,
where J is given by (47). In this case, the condition (49) is equivalent to
\[
V^M_{11} \ge 0, \qquad V^M_{22} \ge 0, \qquad V^M_{11} V^M_{22} \ge \frac{\hbar^2}{4}(\cos\alpha)^2 + (V^M_{12})^2. \tag{52}
\]
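The equivalence stated in Proposition 12 can be probed numerically for the 2 × 2 case. The sketch below compares the scalar inequalities (52) with the sign of the smallest eigenvalue of the Hermitian matrix obtained from (49) when J is given by (47), whose off-diagonal entry is V_12 + (iħ/2) cos α. The function names and the default ħ = 1 are assumptions of this illustration.

```python
import math


def min_eigenvalue_49(v11, v22, v12, alpha, hbar=1.0):
    """Smallest eigenvalue of V^M + (i*hbar/2) * [[0, cos a], [-cos a, 0]].

    For a 2x2 Hermitian matrix [[p, w], [conj(w), q]] the eigenvalues are
    (p + q)/2 -+ sqrt(((p - q)/2)^2 + |w|^2); here w = v12 + i*hbar*cos(a)/2.
    """
    off_sq = v12 ** 2 + (0.5 * hbar * math.cos(alpha)) ** 2
    return 0.5 * (v11 + v22 - math.sqrt((v11 - v22) ** 2 + 4.0 * off_sq))


def satisfies_52(v11, v22, v12, alpha, hbar=1.0):
    """Direct check of the three scalar inequalities in (52)."""
    bound = (hbar * math.cos(alpha)) ** 2 / 4.0 + v12 ** 2
    return v11 >= 0 and v22 >= 0 and v11 * v22 >= bound
```

Away from the boundary the two tests agree: the smallest eigenvalue is non-negative exactly for the parameter choices that pass (52).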
Proof. The first statement follows from Proposition 10. Then, the matrix inequality (49) reads
\[
V^M \pm \frac{i\hbar}{2} \begin{pmatrix} 0 & \cos\alpha \\ -\cos\alpha & 0 \end{pmatrix} \ge 0,
\]
We write $C^G_{u,v}$ for the class of the Gaussian (u, v)-covariant phase-space observables. An observable
$M \in C^G_{u,v}$ is thus characterized by the couple $(\mu^M, V^M)$. From (50) with $J^M = J$ given by (47),
we get that measuring $M \in C^G_{u,v}$ on a Gaussian state ρ yields the probability distribution
$M_\rho = N(\mu^\rho_{u,v} + \mu^M;\ V^\rho_{u,v} + V^M)$ with $\mu^\rho_{u,v}$ and $V^\rho_{u,v}$ given by (12). Its marginals with respect to the first
and second entry are, respectively,
\[
M_{1,\rho} = N\big( u\cdot a^\rho + a^M;\ \operatorname{Var}(Q_{u,\rho}) + V^M_{11} \big), \qquad
M_{2,\rho} = N\big( v\cdot b^\rho + b^M;\ \operatorname{Var}(P_{v,\rho}) + V^M_{22} \big). \tag{53}
\]
Example 2. Let us construct an example of an approximate joint measurement of Qu and Pv , by using a noisy
measurement of position along u followed by a sharp measurement of momentum along v. Let Δ be a positive
real number yielding the precision of the position measurement, and consider the POVM M on R2 given by
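\[
M(A \times B) = \frac{1}{\sqrt{2\pi\Delta}} \int_A \exp\Big\{ -\frac{(x - Q_u)^2}{4\Delta} \Big\}\, P_v(B)\, \exp\Big\{ -\frac{(x - Q_u)^2}{4\Delta} \Big\}\, dx, \qquad \forall A, B \in B(\mathbb{R}).
\]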
Example 3. Let us consider the case α = ±π/2; now the target observables Q_u and P_v are compatible and
we can define a pvm M on R^2 by setting M(A × B) = Q_u(A) P_v(B) for all A, B ∈ B(R). Its characteristic
function is
\[
\hat M(k, l) = \int_{\mathbb{R}} e^{ikx}\, Q_u(dx) \int_{\mathbb{R}} e^{ilp}\, P_v(dp) = e^{i(kQ_u + lP_v)} = W(-\hbar l v, \hbar k u).
\]
Then, $M \in C^G_{u,v}$ with parameters $a^M = 0$, $b^M = 0$, $V^M = 0$ and $J^M = J$ given by (47). Note that M can be
regarded as the limit case of the observables of the previous example when cos α = 0 and Δ ↓ 0.
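so that its marginal distributions $M_{1,\rho}$ and $M_{2,\rho}$ are normal with means $u\cdot a^\rho + a^M$ and $v\cdot b^\rho + b^M$
and variances
\[
\operatorname{Var}(M_{1,\rho}) = \operatorname{Var}(Q_{u,\rho}) + V^M_{11}, \qquad
\operatorname{Var}(M_{2,\rho}) = \operatorname{Var}(P_{v,\rho}) + V^M_{22}. \tag{54}
\]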
Let us recall that |u| = 1, |v| = 1, u · v = cos α, and that by (16) and (52), we have
Definition 9. Given the preparation ρ ∈ S and the covariant bi-observable M ∈ Cu,v , the error function for
the scalar case is the sum of the two relative entropies:
The relative entropy is invariant under a change of the unit of measurement, so that the error
function is scale invariant, too; indeed, it quantifies a relative error, not an absolute one. In the Gaussian
case the error function can be explicitly computed.
Proposition 13 (Error function for the scalar Gaussian case). For ρ ∈ G and $M \in C^G_{u,v}$, the error function is
\[
S(\rho, M) = \frac{\log e}{2} \big[ s(x) + s(y) + \Delta(\rho, M) \big], \tag{57}
\]
where
\[
x := \frac{V^M_{11}}{\operatorname{Var}(Q_{u,\rho})}, \qquad
y := \frac{V^M_{22}}{\operatorname{Var}(P_{v,\rho})}, \qquad
\Delta(\rho, M) := \frac{(a^M)^2}{\operatorname{Var}(M_{1,\rho})} + \frac{(b^M)^2}{\operatorname{Var}(M_{2,\rho})},
\]
and s : [0, +∞) → [0, +∞) is the following C^∞ strictly increasing function with s(0) = 0:
\[
s(x) := \ln(1 + x) - \frac{x}{1 + x}. \tag{58}
\]
Proof. The statement follows by a straightforward combination of (32), (34), (53) and (56).
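Formula (57) translates directly into a few lines of code. The sketch below evaluates the error function in nats (log e = 1) from the sharp variances and the parameters of the Gaussian bi-observable; the function and argument names are hypothetical, and the variances of the marginals are taken from (54).

```python
import math


def s(x):
    """The function s(x) = ln(1 + x) - x / (1 + x) of (58)."""
    return math.log1p(x) - x / (1.0 + x)


def error_function(var_q, var_p, v11, v22, a_m=0.0, b_m=0.0):
    """S(rho, M) of (57) in nats.

    var_q, var_p: sharp variances Var(Q_u) and Var(P_v) in the state rho.
    v11, v22:     diagonal entries of V^M (noise added by the measurement).
    a_m, b_m:     biases a^M and b^M of the measurement.
    """
    x = v11 / var_q
    y = v22 / var_p
    bias = a_m ** 2 / (var_q + v11) + b_m ** 2 / (var_p + v22)
    return 0.5 * (s(x) + s(y) + bias)
```

A noiseless unbiased measurement gives zero error, and for fixed measurement noise the error shrinks as the sharp variances grow, in line with the remark that S(ρ, M) quantifies a relative error.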
Note that the error function does not depend on the mixed covariances $u\cdot C^\rho v$ and $V^M_{12}$. Note also
that, if we select a possible approximation M, then the error function S(ρ, M) decreases for states ρ
with increasing sharp variances Var(Q_{u,ρ}) and Var(P_{v,ρ}): the loss of information decreases when the
sharp distributions make the approximation error negligible. Finally, note that
Theorem 1 (State dependent MUR, scalar observables). For every ρ ∈ G and $M \in C^G_{u,v}$,
with
\[
z_\rho := \frac{\hbar\, |\cos\alpha|}{2 \sqrt{\operatorname{Var}(Q_{u,\rho}) \operatorname{Var}(P_{v,\rho})}} \in [0, 1]. \tag{61}
\]
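The lower bound is tight and the optimal measurement is unique: $c_\rho(\alpha) = S(\rho, M_*)$, for a unique
$M_* \in C^G_{u,v}$; such a Gaussian (u, v)-covariant bi-observable is characterized by
\[
\mu^{M_*} = 0, \qquad V^{M_*}_{12} = 0, \qquad
V^{M_*}_{11} = \frac{\hbar}{2} \sqrt{\frac{\operatorname{Var}(Q_{u,\rho})}{\operatorname{Var}(P_{v,\rho})}}\; |\cos\alpha|, \qquad
V^{M_*}_{22} = \frac{\hbar}{2} \sqrt{\frac{\operatorname{Var}(P_{v,\rho})}{\operatorname{Var}(Q_{u,\rho})}}\; |\cos\alpha|. \tag{62}
\]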
\[
S(\rho, M_*) = \frac{\log e}{2} \big[ s(x) + s(z_\rho^2 / x) \big], \qquad x = \frac{V^{M_*}_{11}}{\operatorname{Var}(Q_{u,\rho})},
\]
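Having x > 0, we immediately get that x = z_ρ gives the unique minimum. Thus
\[
S(\rho, M) \ge S(\rho, M_*) = s(z_\rho) \log e = \log(1 + z_\rho) - \frac{z_\rho}{1 + z_\rho} \log e,
\]
and
\[
V^{M_*}_{11} = z_\rho \operatorname{Var}(Q_{u,\rho}) \equiv \frac{\hbar}{2} \sqrt{\frac{\operatorname{Var}(Q_{u,\rho})}{\operatorname{Var}(P_{v,\rho})}}\; |\cos\alpha|, \qquad
V^{M_*}_{22} = z_\rho \operatorname{Var}(P_{v,\rho}) \equiv \frac{\hbar}{2} \sqrt{\frac{\operatorname{Var}(P_{v,\rho})}{\operatorname{Var}(Q_{u,\rho})}}\; |\cos\alpha|,
\]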
Remark 3. The minimum information loss c_ρ(α) depends on both the preparation ρ and the angle α. When
α ≠ ±π/2, that is when the target observables are not compatible, c_ρ(α) is strictly greater than zero. This is
a peculiar quantum effect: given ρ, u and v, there is no Gaussian approximate joint measurement of Q_u and P_v
that can approximate them arbitrarily well. On the other hand, in the limit α → ±π/2, the lower bound c_ρ(α)
goes to zero; so, the case of commuting target observables is approached with continuity.
Remark 4. The lower bound cρ (α) goes to zero also in the classical limit h̄ → 0. This holds for every angle
α and every Gaussian state ρ.
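The limits discussed in Remarks 3 and 4 can be explored numerically. The sketch below computes z_ρ from (61), the minimum loss c_ρ(α) = s(z_ρ) in nats, and the optimal noise variances from (62); the function names, the use of natural logarithms and the default ħ = 1 are assumptions made here.

```python
import math


def s(x):
    """The function s(x) = ln(1 + x) - x / (1 + x) of (58)."""
    return math.log1p(x) - x / (1.0 + x)


def z_rho(var_q, var_p, alpha, hbar=1.0):
    """The parameter z_rho of (61)."""
    return hbar * abs(math.cos(alpha)) / (2.0 * math.sqrt(var_q * var_p))


def minimum_loss(var_q, var_p, alpha, hbar=1.0):
    """Lower bound c_rho(alpha) = s(z_rho), expressed in nats."""
    return s(z_rho(var_q, var_p, alpha, hbar))


def optimal_noise(var_q, var_p, alpha, hbar=1.0):
    """Diagonal entries (V11, V22) of the optimal measurement, as in (62)."""
    z = z_rho(var_q, var_p, alpha, hbar)
    return z * var_q, z * var_p
```

For a minimum uncertainty state with α = 0 one finds z_ρ = 1 and a loss of ln 2 − 1/2 nats; the loss becomes negligible both for α close to π/2 and for small ħ.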
Remark 5. Another case in which c_ρ(α) → 0 is the limit of large uncertainty states, that is, if we let the product
Var(Q_{u,ρ}) Var(P_{v,ρ}) → +∞: our entropic MUR disappears because, roughly speaking, the variance of (at
least) one of the two target observables goes to infinity, its relative entropy vanishes by itself, and an optimal
covariant bi-observable M∗ has to take care of (at most) only the other target observable.
Remark 6. Actually, something similar to the previous remark happens also at the macroscopic limit,
and does not require the measuring instrument to be an optimal one; indeed, unbiasedness is enough in
this case. This happens because the error function S(ρ, M) quantifies a relative error; even if the measurement
approximation M is fixed, such an error can be reduced by suitably changing the preparation ρ. Indeed, if we
consider the position and momentum of a macroscopic particle, for instance the center of mass of many particles,
it is natural that its state has much larger position and momentum uncertainties than the intrinsic uncertainties
of the measuring instrument; that is, $\dfrac{V^M_{11}}{\operatorname{Var}(Q_{u,\rho})} \ll 1$ and $\dfrac{V^M_{22}}{\operatorname{Var}(P_{v,\rho})} \ll 1$, implying that the error function (57) is
negligible. In practice, this is a classical case: the preparation has large position and momentum uncertainties
and the measuring instrument is relatively good. In this situation we do not see the difference between the joint
measurement of position and momentum and their separate sharp observations.
Remark 7. The optimal approximating joint measurement $M_* \in C^G_{u,v}$ is unique; by (62) it depends on the
preparation ρ one is considering, as well as on the directions u and v. A realization of M∗ is the measuring
procedure of Example 2.
Remark 8. The MUR (59) is scale invariant, as both the error function S(ρ, M) and the lower bound cρ (α) are such.
log e
sup inf S(ρ, M) = 1 − . (63)
ρ∈G M∈Cu,v
G 2
and we evaluate the error made in approximating Qu and Pv with the marginals of a (u, v)-covariant
bi-observable by maximizing the error function over all these states.
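For Gaussian M, depending on the choice of the thresholds ε_1 and ε_2, the divergence
$D^G_\epsilon(Q_u, P_v \| M)$ can be easily computed or at least bounded.
(i) For $\epsilon_1 \epsilon_2 \ge \frac{\hbar^2}{4}(\cos\alpha)^2$, the divergence $D^G_\epsilon(Q_u, P_v \| M)$ is given by
\[
D^G_\epsilon(Q_u, P_v \| M) = S(\rho_\epsilon(u, v), M) = \frac{\log e}{2} \big[ s(x_\epsilon) + s(y_\epsilon) + \Delta(\epsilon; M) \big], \tag{66}
\]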
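where $\rho_\epsilon(u, v)$ is any Gaussian state with $\operatorname{Var}(Q_{u,\rho_\epsilon(u,v)}) = \epsilon_1$ and $\operatorname{Var}(P_{v,\rho_\epsilon(u,v)}) = \epsilon_2$, and
\[
x_\epsilon := \frac{V^M_{11}}{\epsilon_1}, \qquad
y_\epsilon := \frac{V^M_{22}}{\epsilon_2}, \qquad
\Delta(\epsilon; M) := \frac{(a^M)^2}{V^M_{11} + \epsilon_1} + \frac{(b^M)^2}{V^M_{22} + \epsilon_2}.
\]
(ii) For $\epsilon_1 \epsilon_2 < \frac{\hbar^2}{4}(\cos\alpha)^2$, the divergence $D^G_\epsilon(Q_u, P_v \| M)$ is bounded from below by
\[
D^G_\epsilon(Q_u, P_v \| M) \ge S(\rho_\epsilon(u, v), M) = \frac{\log e}{2} \big[ s(x_\epsilon) + s(y_\epsilon) + \Delta(\epsilon; M) \big], \tag{67}
\]
where $\rho_\epsilon(u, v)$ is any Gaussian state with $\operatorname{Var}(Q_{u,\rho_\epsilon(u,v)}) = \epsilon_1$ and $\operatorname{Var}(P_{v,\rho_\epsilon(u,v)}) = \frac{\hbar^2}{4\epsilon_1}(\cos\alpha)^2$, and
\[
x_\epsilon := \frac{V^M_{11}}{\epsilon_1}, \qquad
y_\epsilon := \frac{4\epsilon_1 V^M_{22}}{\hbar^2 (\cos\alpha)^2}, \qquad
\Delta(\epsilon; M) := \frac{(a^M)^2}{V^M_{11} + \epsilon_1} + \frac{(b^M)^2}{V^M_{22} + \frac{\hbar^2 (\cos\alpha)^2}{4\epsilon_1}}.
\]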
Proof. By Proposition 3, maximizing the error function over the states in $G^\epsilon_{u,v}$ is the same as
maximizing (57) over the parameters Var(Q_{u,ρ}) and Var(P_{v,ρ}) satisfying (55) and (64) (note that
in the bias Δ(ρ, M), the variances Var(M_{1,ρ}) and Var(M_{2,ρ}) depend on Var(Q_{u,ρ}) and Var(P_{v,ρ}) by (54)).
(i) In the case $\epsilon_1 \epsilon_2 \ge \frac{\hbar^2}{4}(\cos\alpha)^2$, the thresholds themselves satisfy the Heisenberg uncertainty relation,
and so equality (66) follows from the expression (57) and the fact that the functions s(x), s(y), Δ(ρ, M)
are decreasing in Var(Q_{u,ρ}) and Var(P_{v,ρ}).
(ii) In the case $\epsilon_1 \epsilon_2 < \frac{\hbar^2}{4}(\cos\alpha)^2$, we have to take into account the relation (55) for Var(Q_{u,ρ})
and Var(P_{v,ρ}): the supremum of S(ρ, M) is achieved when $\operatorname{Var}(Q_{u,\rho}) \operatorname{Var}(P_{v,\rho}) = \frac{\hbar^2}{4}(\cos\alpha)^2$,
with Var(Q_{u,ρ}) ≥ ε_1 and Var(P_{v,ρ}) ≥ ε_2. Then inequality (67) follows by choosing Var(Q_{u,ρ}) = ε_1
and $\operatorname{Var}(P_{v,\rho}) = \frac{\hbar^2}{4\epsilon_1}(\cos\alpha)^2$.
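Remark 10. The conditions on the states $\rho_\varepsilon(u,v)$ do not depend on $M$, but only on the parameters defining
$G^\varepsilon_{u,v}$. Thus, in the case $\varepsilon_1 \varepsilon_2 \ge \frac{\hbar^2}{4} (\cos\alpha)^2$, any choice of $\rho_\varepsilon(u,v)$ yields a state which is the worst one for every
\[
c^G_{\mathrm{inc}}(Q_u, P_v; \varepsilon) := \inf_{M \in C^G_{u,v}} D^G_\varepsilon(Q_u, P_v \| M) \equiv \inf_{M \in C^G_{u,v}} \, \sup_{\rho \in G^\varepsilon_{u,v}} S(\rho, M). \tag{68}
\]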
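Again, depending on the choice of the thresholds $\varepsilon_1$ and $\varepsilon_2$, the entropic incompatibility degree
$c^G_{\mathrm{inc}}(Q_u, P_v; \varepsilon)$ can be easily computed or at least bounded.
Theorem 3. (i) For $\varepsilon_1 \varepsilon_2 \ge \frac{\hbar^2}{4} (\cos\alpha)^2$, the incompatibility degree $c^G_{\mathrm{inc}}(Q_u, P_v; \varepsilon)$ is given by
\[
c^G_{\mathrm{inc}}(Q_u, P_v; \varepsilon) = (\log e) \left[ \ln\left( 1 + \frac{\hbar |\cos\alpha|}{2\sqrt{\varepsilon_1 \varepsilon_2}} \right) - \frac{\hbar |\cos\alpha|}{2\sqrt{\varepsilon_1 \varepsilon_2} + \hbar |\cos\alpha|} \right]. \tag{69}
\]
The infimum in (68) is attained and the optimal measurement is unique, in the sense that
\[
c^G_{\mathrm{inc}}(Q_u, P_v; \varepsilon) = D^G_\varepsilon(Q_u, P_v \| M_\varepsilon) \tag{70}
\]
\[
a^{M_\varepsilon} = 0, \quad b^{M_\varepsilon} = 0, \quad
V^{M_\varepsilon}_{11} = \frac{\hbar}{2} \sqrt{\frac{\varepsilon_1}{\varepsilon_2}} \, |\cos\alpha|, \quad
V^{M_\varepsilon}_{22} = \frac{\hbar}{2} \sqrt{\frac{\varepsilon_2}{\varepsilon_1}} \, |\cos\alpha|, \quad
V^{M_\varepsilon}_{12} = 0. \tag{71}
\]
(ii) For $\varepsilon_1 \varepsilon_2 < \frac{\hbar^2}{4} (\cos\alpha)^2$, the incompatibility degree $c^G_{\mathrm{inc}}(Q_u, P_v; \varepsilon)$ is bounded from below by
\[
c^G_{\mathrm{inc}}(Q_u, P_v; \varepsilon) \ge (\log e) \left[ \ln(2) - \frac{1}{2} \right]. \tag{72}
\]
where the state $\rho_\varepsilon(u,v)$ is defined in item (ii) of Theorem 2 and $M_\varepsilon$ is the bi-observable in $C^G_{u,v}$ such that
\[
a^{M_\varepsilon} = 0, \quad b^{M_\varepsilon} = 0, \quad
V^{M_\varepsilon}_{11} = \varepsilon_1, \quad
V^{M_\varepsilon}_{22} = \frac{\hbar^2}{4\varepsilon_1} (\cos\alpha)^2, \quad
V^{M_\varepsilon}_{12} = 0. \tag{74}
\]
Proof. (i) In the case $\varepsilon_1 \varepsilon_2 \ge \frac{\hbar^2}{4} (\cos\alpha)^2$, due to (66), the proof is the same as that of Theorem 1 with
the replacements $\operatorname{Var}(Q_{u,\rho}) \to \varepsilon_1$ and $\operatorname{Var}(P_{v,\rho}) \to \varepsilon_2$.
(ii) In the case $\varepsilon_1 \varepsilon_2 < \frac{\hbar^2}{4} (\cos\alpha)^2$, starting from (67), the proof is the same as that of Theorem 1 with
the replacements $\operatorname{Var}(Q_{u,\rho}) \to \varepsilon_1$ and $\operatorname{Var}(P_{v,\rho}) \to \frac{\hbar^2}{4\varepsilon_1} (\cos\alpha)^2$.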
Remark 11 (State independent MUR, scalar observables). By means of the above results, we can formulate
a state independent entropic MUR for the position $Q_u$ and the momentum $P_v$ in the following way. Given two
positive thresholds $\varepsilon_1$ and $\varepsilon_2$, there exists a preparation $\rho_\varepsilon(u,v) \in G^\varepsilon_{u,v}$ (introduced in Theorem 2) such that,
for all Gaussian approximate joint measurements $M$ of $Q_u$ and $P_v$, we have
The inequality follows by (66) and (69) in the case $\varepsilon_1 \varepsilon_2 \ge \frac{\hbar^2}{4} (\cos\alpha)^2$, and (73) in the case
$\varepsilon_1 \varepsilon_2 < \frac{\hbar^2}{4} (\cos\alpha)^2$.
What is relevant is that, for every approximate joint measurement $M$, the total information loss $S(\rho, M)$
does exceed the lower bound (75) even if the set of states $G^\varepsilon_{u,v}$ forbids preparations $\rho$ with too peaked target
distributions. Indeed, without the thresholds $\varepsilon_1$, $\varepsilon_2$, it would be trivial to exceed the lower bound (75), as we
noted in Section 6.1.2.
We also remark that, once $\varepsilon_1$ and $\varepsilon_2$ are chosen, we found a single state $\rho_\varepsilon(u,v)$ in $G^\varepsilon_{u,v}$ that satisfies (75) for every
$M$, so that $\rho_\varepsilon(u,v)$ is a ‘bad’ state for all Gaussian approximate joint measurements of position and momentum.
When $\varepsilon_1 \varepsilon_2 \ge \frac{\hbar^2}{4} (\cos\alpha)^2$, the optimal approximate joint measurement $M_\varepsilon$ is unique in the class of
Gaussian $(u,v)$-covariant bi-observables; it depends only on the class of preparations $G^\varepsilon_{u,v}$: it is the best
measurement for the worst choice of the preparation in the class $G^\varepsilon_{u,v}$.
Remark 12. The entropic incompatibility degree $c^G_{\mathrm{inc}}(Q_u, P_v; \varepsilon)$ is strictly positive for $\cos\alpha \neq 0$ (incompatible
target observables) and it goes to zero in the limits $\alpha \to \pm\pi/2$ (compatible observables), $\hbar \to 0$ (classical limit),
and $\varepsilon_1 \varepsilon_2 \to \infty$ (large uncertainty states).
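The limits listed in Remark 12 can be checked numerically from the closed form (69). The following is only a minimal sketch: the function name `c_inc_scalar` is hypothetical, units with ħ = 1 are used by default, and the logarithm `log` is assumed to be taken in base 2 (so that `log e` is log₂ e).

```python
import math

# Assumption: 'log' in the text is the base-2 logarithm, so log e = log2(e).
LOG_E = math.log2(math.e)


def c_inc_scalar(eps1, eps2, alpha, hbar=1.0):
    """Right-hand side of (69); requires eps1*eps2 >= (hbar*cos(alpha))**2 / 4."""
    k = hbar * abs(math.cos(alpha))
    root = 2.0 * math.sqrt(eps1 * eps2)
    if root < k:
        raise ValueError("(69) requires eps1*eps2 >= hbar^2 cos^2(alpha) / 4")
    z = k / root
    # ln(1 + z) - z / (1 + z), with z / (1 + z) = k / (root + k)
    return LOG_E * (math.log1p(z) - z / (1.0 + z))


# Illustrative values (not taken from the paper):
print(c_inc_scalar(0.5, 0.5, 0.0))            # parallel directions, threshold case
print(c_inc_scalar(0.5, 0.5, math.pi / 2))    # orthogonal directions: essentially zero
print(c_inc_scalar(1e6, 1e6, 0.0))            # large thresholds: close to zero
```

The same function also shows the classical limit: decreasing `hbar` at fixed thresholds drives the value to zero.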
Remark 13. The scale invariance of the relative entropy extends to the error function $S(\rho, M)$, hence to the
divergence $D^G_\varepsilon(Q_u, P_v \| M)$ and the entropic incompatibility degree $c^G_{\mathrm{inc}}(Q_u, P_v; \varepsilon)$, as well as the entropic MUR (75).
Proposition 14 ([50–52,65]). Let $M_1$ and $M_2$ be $n \times n$ complex matrices such that $M_1 > M_2 > 0$. Then,
we have $0 < M_1^{-1} < M_2^{-1}$. Moreover, if $s : \mathbb{R}_+ \to \mathbb{R}$ is a strictly increasing continuous function, we have
$\operatorname{Tr}\{s(M_1)\} > \operatorname{Tr}\{s(M_2)\}$.
As in the scalar case, the error function is scale invariant, it quantifies a relative error, and we
always have $S(\rho, M^\sigma) > 0$ because position and momentum are incompatible. Indeed, since the marginals
of a bi-observable $M^\sigma \in C$ turn out to be convolutions of the respective sharp observables $Q$ and $P$ with
some probability densities on $\mathbb{R}^n$, $Q_\rho \neq M^\sigma_{1,\rho}$ and $P_\rho \neq M^\sigma_{2,\rho}$ for all states $\rho$; this is an easy consequence,
for instance, of Problem 26.1, p. 362, in [66].
In the Gaussian case the error function can be explicitly computed.
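Proposition 15 (Error function for the vector Gaussian case). For $\rho, \sigma \in G$, the error function has the two
equivalent expressions:
\begin{align}
S(\rho, M^\sigma) &= \frac{\log e}{2} \Big[ \operatorname{Tr}\big\{ s(E_{\rho,\sigma}) + s(F_{\rho,\sigma}) \big\} + a^\sigma \cdot (A^\rho + A^\sigma)^{-1} a^\sigma + b^\sigma \cdot (B^\rho + B^\sigma)^{-1} b^\sigma \Big] \tag{77a} \\
&= \frac{\log e}{2} \Big[ \operatorname{Tr}\big\{ s(N^{-1}_{\rho,\sigma}) + s(R^{-1}_{\rho,\sigma}) \big\} + a^\sigma \cdot (A^\rho + A^\sigma)^{-1} a^\sigma + b^\sigma \cdot (B^\rho + B^\sigma)^{-1} b^\sigma \Big], \tag{77b}
\end{align}
where the function $s$ is defined in (58), and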
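\[
Q_\rho = N(a^\rho; A^\rho), \qquad M^\sigma_{1,\rho} = N(a^\rho + a^\sigma; A^\rho + A^\sigma),
\]
\[
P_\rho = N(b^\rho; B^\rho), \qquad M^\sigma_{2,\rho} = N(b^\rho + b^\sigma; B^\rho + B^\sigma).
\]
\[
S(Q_\rho \| M^\sigma_{1,\rho}) = \frac{1}{2} \log \frac{\det(A^\rho + A^\sigma)}{\det A^\rho}
+ \frac{\log e}{2} \Big\{ \operatorname{Tr}\big[ (A^\rho + A^\sigma)^{-1} A^\rho - \mathbb{1} \big] + a^\sigma \cdot (A^\rho + A^\sigma)^{-1} a^\sigma \Big\}.
\]
We can transform this equation by using
\[
\frac{\det(A^\sigma + A^\rho)}{\det A^\rho} = \det\big[ (A^\rho)^{-1/2} (A^\sigma + A^\rho) (A^\rho)^{-1/2} \big] = \det\big( \mathbb{1} + E_{\rho,\sigma} \big),
\]
\[
\ln \det\big( \mathbb{1} + E_{\rho,\sigma} \big) = \operatorname{Tr}\big\{ \ln\big( \mathbb{1} + E_{\rho,\sigma} \big) \big\},
\]
\[
\operatorname{Tr}\big[ (A^\rho + A^\sigma)^{-1} A^\rho - \mathbb{1} \big]
= \operatorname{Tr}\big[ (A^\rho)^{1/2} (A^\rho + A^\sigma)^{-1} (A^\rho)^{1/2} - \mathbb{1} \big]
= - \operatorname{Tr}\big[ (\mathbb{1} + E_{\rho,\sigma})^{-1} E_{\rho,\sigma} \big].
\]
This gives
\[
S(Q_\rho \| M^\sigma_{1,\rho}) = \frac{\log e}{2} \Big[ \operatorname{Tr}\{ s(E_{\rho,\sigma}) \} + a^\sigma \cdot (A^\rho + A^\sigma)^{-1} a^\sigma \Big].
\]
In the same way a similar expression is obtained for $S(P_\rho \| M^\sigma_{2,\rho})$ and (77a) is proved.
On the other hand, by using
\[
\ln \frac{\det(A^\sigma + A^\rho)}{\det A^\rho} = \ln \frac{\det\big( \mathbb{1} + N_{\rho,\sigma} \big)}{\det N_{\rho,\sigma}}
= \ln \det\big( \mathbb{1} + N^{-1}_{\rho,\sigma} \big) = \operatorname{Tr}\big\{ \ln\big( \mathbb{1} + N^{-1}_{\rho,\sigma} \big) \big\},
\]
\[
\operatorname{Tr}\big[ (A^\rho + A^\sigma)^{-1} A^\rho - \mathbb{1} \big] = - \operatorname{Tr}\big[ (A^\rho + A^\sigma)^{-1} A^\sigma \big]
= - \operatorname{Tr}\big[ \big( \mathbb{1} + N^{-1}_{\rho,\sigma} \big)^{-1} N^{-1}_{\rho,\sigma} \big],
\]
and the analogous expressions involving $B^\rho$ and $R_{\rho,\sigma}$, one gets (77b).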
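The algebra in this proof can be sanity-checked in the simplest case n = 1, where all matrices are positive numbers. The sketch below is an illustration under stated assumptions, not the authors' computation: it takes s(x) = ln(1 + x) − x/(1 + x) (the form consistent with the identities above), assumes base-2 logarithms, and uses hypothetical function names. It compares the standard Gaussian relative entropy, its rewriting through s, and a direct numerical integration of the relative entropy.

```python
import math

LOG_E = math.log2(math.e)  # assumption: 'log' is base 2


def s(x):
    # assumed form of the function s of (58): ln(1 + x) - x / (1 + x)
    return math.log1p(x) - x / (1.0 + x)


def rel_entropy_closed(A, A_s, a_s):
    """S(N(a; A) || N(a + a_s; A + A_s)) for n = 1, standard Gaussian formula."""
    return (0.5 * math.log2((A + A_s) / A)
            + 0.5 * LOG_E * (A / (A + A_s) - 1.0 + a_s ** 2 / (A + A_s)))


def rel_entropy_s_form(A, A_s, a_s):
    """Same quantity written with s(E), where E = A_s / A for n = 1."""
    return 0.5 * LOG_E * (s(A_s / A) + a_s ** 2 / (A + A_s))


def rel_entropy_numeric(A, A_s, a_s, a=0.0, steps=20000, width=12.0):
    """Midpoint-rule integration of p * log2(p / q) over a +/- width*sqrt(A)."""
    def ln_pdf(x, mean, var):
        return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)

    lo = a - width * math.sqrt(A)
    h = 2.0 * width * math.sqrt(A) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h
        ln_p = ln_pdf(x, a, A)
        ln_q = ln_pdf(x, a + a_s, A + A_s)
        total += math.exp(ln_p) * (ln_p - ln_q) * h
    return LOG_E * total


# Illustrative values (not taken from the paper):
print(rel_entropy_closed(0.7, 0.3, 0.4),
      rel_entropy_s_form(0.7, 0.3, 0.4),
      rel_entropy_numeric(0.7, 0.3, 0.4))
```

The three printed numbers should agree up to the integration error, which illustrates the step from the determinant/trace expression to the term Tr{s(E)} in (77a).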
The above equality follows since the monotonicity of $s$ (Proposition 14) implies that the trace term
in (77a) attains its minimum when $B^\sigma = \frac{\hbar^2}{4} (A^\rho)^{-1}$. However, it remains an open problem to explicitly
compute the infimum over the matrices $A^\sigma$ when the preparation $\rho$ is arbitrary.
Nevertheless, the computations can be done at least for a preparation $\rho_*$ of minimum uncertainty
(Proposition 5). Indeed, by (22) we get
\[
\inf_{\sigma \in G} S(\rho_*, M^\sigma) = \frac{\log e}{2} \inf_{A^\sigma} \operatorname{Tr}\big\{ s\big( E_{\rho,\sigma} \big) + s\big( E^{-1}_{\rho,\sigma} \big) \big\}.
\]
Now we can diagonalize $E_{\rho,\sigma}$ and minimize over its eigenvalues; since $s(x) + s(x^{-1})$ attains its
minimum value at $x = 1$, this procedure gives $E_{\rho,\sigma} = \mathbb{1}$. So, by denoting by $\sigma_*$ the state giving the
minimum, we have
\[
A^{\sigma_*} = A^{\rho_*}, \qquad B^{\sigma_*} = B^{\rho_*} = \frac{\hbar^2}{4} (A^{\rho_*})^{-1}, \tag{79}
\]
\[
\inf_{\sigma \in G} S(\rho_*, M^\sigma) = S(\rho_*, M^{\sigma_*}) = n \, s(1) \log e. \tag{80}
\]
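The claim that s(x) + s(1/x) is minimized at x = 1 is easy to confirm numerically. The sketch below is illustrative only: it assumes s(x) = ln(1 + x) − x/(1 + x) and base-2 logarithms, and the grid and the helper name `min_error` are hypothetical choices.

```python
import math

LOG_E = math.log2(math.e)  # assumption: 'log' is base 2


def s(x):
    # assumed form of the function s of (58)
    return math.log1p(x) - x / (1.0 + x)


def f(x):
    """Contribution of one eigenvalue x of E to the trace term."""
    return s(x) + s(1.0 / x)


# Coarse grid search over x in (0, 10]; the grid contains x = 1.
grid = [0.01 * k for k in range(1, 1001)]
x_min = min(grid, key=f)


def min_error(n):
    """Value n * s(1) * log e of (80) for n degrees of freedom."""
    return n * s(1.0) * LOG_E


print(x_min, f(x_min), 2.0 * s(1.0))
print(min_error(1), min_error(3))
```

With E equal to the identity, each of the n eigenvalues contributes 2 s(1) to the trace, and the prefactor (log e)/2 gives the value in (80).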
For an arbitrary $\rho \in G$, we can use the last formula to deduce an upper bound for $\inf_{\sigma \in G} S(\rho, M^\sigma)$.
Indeed, if $\rho_*$ is a minimum uncertainty state with $A^{\rho_*} = A^\rho$, then $B^\rho \ge \frac{\hbar^2}{4} (A^\rho)^{-1} = B^{\rho_*}$ by (19),
and, using again the state $\sigma_*$ of (79), we find
The second inequality in the last formula follows from (77b), (78b) and the monotonicity of s
(Proposition 14).
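\[
G_\varepsilon := \{ \rho \in G : A^\rho \ge \varepsilon_1 \mathbb{1}, \ B^\rho \ge \varepsilon_2 \mathbb{1} \}, \qquad \varepsilon \equiv (\varepsilon_1, \varepsilon_2), \quad \varepsilon_i > 0. \tag{81}
\]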
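As in the scalar case, when $M^\sigma$ is Gaussian, depending on the choice of the product $\varepsilon_1 \varepsilon_2$, we can
compute the divergence $D^G_\varepsilon(Q, P \| M^\sigma)$ or at least bound it from below.
(i) For $\varepsilon_1 \varepsilon_2 \ge \frac{\hbar^2}{4}$, the divergence $D^G_\varepsilon(Q, P \| M^\sigma)$ is given by
\[
D^G_\varepsilon(Q, P \| M^\sigma) = S(\rho_\varepsilon, M^\sigma) = \frac{\log e}{2} \Big[ \operatorname{Tr}\big\{ s(A^\sigma / \varepsilon_1) + s(B^\sigma / \varepsilon_2) \big\}
+ a^\sigma \cdot (A^\sigma + \varepsilon_1 \mathbb{1})^{-1} a^\sigma + b^\sigma \cdot (B^\sigma + \varepsilon_2 \mathbb{1})^{-1} b^\sigma \Big], \tag{83}
\]

where $\rho_\varepsilon$ is any Gaussian state with $A^{\rho_\varepsilon} = \varepsilon_1 \mathbb{1}$ and $B^{\rho_\varepsilon} = \frac{\hbar^2}{4\varepsilon_1} \mathbb{1}$.
Proof. (i) In the case $\varepsilon_1 \varepsilon_2 \ge \frac{\hbar^2}{4}$, for $\rho \in G_\varepsilon$ we have $N_{\rho,\sigma} \ge \varepsilon_1 (A^\sigma)^{-1}$ and $R_{\rho,\sigma} \ge \varepsilon_2 (B^\sigma)^{-1}$;
by Proposition 14 we get
\[
\operatorname{Tr}\{ s(N^{-1}_{\rho,\sigma}) \} \le \operatorname{Tr}\{ s(A^\sigma / \varepsilon_1) \}, \qquad
\operatorname{Tr}\{ s(R^{-1}_{\rho,\sigma}) \} \le \operatorname{Tr}\{ s(B^\sigma / \varepsilon_2) \},
\]
\[
(A^\rho + A^\sigma)^{-1} \le (\varepsilon_1 \mathbb{1} + A^\sigma)^{-1}, \qquad
(B^\rho + B^\sigma)^{-1} \le (\varepsilon_2 \mathbb{1} + B^\sigma)^{-1}.
\]
By using these inequalities in the expression (77b), we get (83).
(ii) In the case $\varepsilon_1 \varepsilon_2 < \frac{\hbar^2}{4}$, the lower bound (84) follows by evaluating $S(\rho, M^\sigma)$ at the state $\rho = \rho_\varepsilon \in G_\varepsilon$
with $A^{\rho_\varepsilon} = \varepsilon_1 \mathbb{1}$ and $B^{\rho_\varepsilon} = \frac{\hbar^2}{4\varepsilon_1} \mathbb{1}$.
Note that $\rho_\varepsilon$ does not depend on $\sigma$, but only on the parameters defining $G_\varepsilon$: again, in the
case $\varepsilon_1 \varepsilon_2 \ge \frac{\hbar^2}{4}$, the error attains its maximum at a state which is independent of the approximate
measurement.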
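\[
c^G_{\mathrm{inc}}(Q, P; \varepsilon) := \inf_{\sigma \in G} D^G_\varepsilon(Q, P \| M^\sigma) \equiv \inf_{\sigma \in G} \, \sup_{\rho \in G_\varepsilon} S(\rho, M^\sigma). \tag{85}
\]
Theorem 5. (i) For $\varepsilon_1 \varepsilon_2 \ge \frac{\hbar^2}{4}$, the incompatibility degree $c^G_{\mathrm{inc}}(Q, P; \varepsilon)$ is given by
\[
c^G_{\mathrm{inc}}(Q, P; \varepsilon) = n (\log e) \left[ \ln\left( 1 + \frac{\hbar}{2\sqrt{\varepsilon_1 \varepsilon_2}} \right) - \frac{\hbar}{2\sqrt{\varepsilon_1 \varepsilon_2} + \hbar} \right]. \tag{86}
\]
The infimum in (85) is attained and the optimal measurement is unique, in the sense that
\[
c^G_{\mathrm{inc}}(Q, P; \varepsilon) = D^G_\varepsilon(Q, P \| M^{\sigma_\varepsilon}) \tag{87}
\]
\[
a^{\sigma_\varepsilon} = 0, \quad b^{\sigma_\varepsilon} = 0, \quad
A^{\sigma_\varepsilon} = \frac{\hbar}{2} \sqrt{\frac{\varepsilon_1}{\varepsilon_2}} \, \mathbb{1}, \quad
B^{\sigma_\varepsilon} = \frac{\hbar}{2} \sqrt{\frac{\varepsilon_2}{\varepsilon_1}} \, \mathbb{1}, \quad
C^{\sigma_\varepsilon} = 0. \tag{88}
\]
(ii) For $\varepsilon_1 \varepsilon_2 < \frac{\hbar^2}{4}$, the incompatibility degree $c^G_{\mathrm{inc}}(Q, P; \varepsilon)$ is bounded from below by
\[
c^G_{\mathrm{inc}}(Q, P; \varepsilon) \ge n (\log e) \left[ \ln(2) - \frac{1}{2} \right]. \tag{89}
\]
where the preparation $\rho_\varepsilon$ is defined in item (ii) of Theorem 4 and $\sigma_\varepsilon$ is the state in $G$ such that
\[
a^{\sigma_\varepsilon} = 0, \quad b^{\sigma_\varepsilon} = 0, \quad
A^{\sigma_\varepsilon} = \varepsilon_1 \mathbb{1}, \quad
B^{\sigma_\varepsilon} = \frac{\hbar^2}{4\varepsilon_1} \mathbb{1}, \quad
C^{\sigma_\varepsilon} = 0. \tag{91}
\]
Proof. (i) In the case $\varepsilon_1 \varepsilon_2 \ge \frac{\hbar^2}{4}$, from the expression (83) we get immediately $a^\sigma = 0$, $b^\sigma = 0$ and
by (19) we have $B^\sigma \ge \frac{\hbar^2}{4} (A^\sigma)^{-1}$. So, by (83) and Propositions 3 and 14, we get $B^\sigma = \frac{\hbar^2}{4} (A^\sigma)^{-1}$,
and
\[
\inf_{\sigma \in G} \, \sup_{\rho \in G_\varepsilon} S(\rho, M^\sigma) = \frac{\log e}{2} \inf_{A^\sigma} \operatorname{Tr}\left\{ s(A^\sigma / \varepsilon_1) + s\left( \frac{\hbar^2}{4\varepsilon_2} (A^\sigma)^{-1} \right) \right\}.
\]
By minimizing over all the eigenvalues of $A^\sigma$, we get the minimum (86), which is attained if and
only if $A^\sigma$ is as in (88). Hence, $A^\sigma$ and $B^\sigma$ are as in (88). This implies that any optimal state $\sigma$ is
a minimum uncertainty state; so, $C^\sigma = 0$ and the state $\sigma$ is unique.
(ii) In the case $\varepsilon_1 \varepsilon_2 < \frac{\hbar^2}{4}$, by (19) and Proposition 14, inequality (84) implies
\[
\inf_{\sigma \in G} \, \sup_{\rho \in G_\varepsilon} S(\rho, M^\sigma) \ge \frac{\log e}{2} \inf_{A^\sigma} \operatorname{Tr}\left\{ s(A^\sigma / \varepsilon_1) + s\left( \varepsilon_1 (A^\sigma)^{-1} \right) \right\}.
\]
By minimizing over all the eigenvalues of $A^\sigma$, we get (89). Then (89) holds for $\rho_\varepsilon$ as in item (ii) of
Theorem 4 and $\sigma_\varepsilon$ in (91).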
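The minimization over the eigenvalues of $A^\sigma$ in part (i) of this proof can be reproduced numerically for a single eigenvalue. The sketch below is a hedged illustration: it assumes s(x) = ln(1 + x) − x/(1 + x), uses a generic golden-section search (not a method taken from the paper), and all function names are hypothetical. It compares the numerical minimizer with the value (ħ/2)√(ε₁/ε₂) appearing in (88).

```python
import math


def s(x):
    # assumed form of the function s of (58)
    return math.log1p(x) - x / (1.0 + x)


def trace_term(a, eps1, eps2, hbar=1.0):
    """Contribution of one eigenvalue a of A^sigma, with B^sigma = hbar^2 / (4 a)."""
    return s(a / eps1) + s(hbar ** 2 / (4.0 * eps2 * a))


def golden_min(fun, lo, hi, tol=1e-12):
    """Golden-section search for the minimizer of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if fun(c) < fun(d):
            hi, d = d, c
            c = hi - inv_phi * (hi - lo)
        else:
            lo, c = c, d
            d = lo + inv_phi * (hi - lo)
    return 0.5 * (lo + hi)


def optimal_eigenvalue(eps1, eps2, hbar=1.0):
    """Numerical minimizer, searching over t = ln(a) in [-20, 20]."""
    t = golden_min(lambda t: trace_term(math.exp(t), eps1, eps2, hbar), -20.0, 20.0)
    return math.exp(t)


def closed_form_eigenvalue(eps1, eps2, hbar=1.0):
    """Eigenvalue of A^sigma in (88)."""
    return 0.5 * hbar * math.sqrt(eps1 / eps2)


# Illustrative thresholds with eps1 * eps2 >= hbar^2 / 4 (not taken from the paper):
a_num = optimal_eigenvalue(2.0, 0.5)
print(a_num, closed_form_eigenvalue(2.0, 0.5))
print(trace_term(a_num, 2.0, 0.5), 2.0 * s(0.5))
```

The search is done in ln(a) because the derivative of the objective with respect to ln(a) is x²/(1 + x)² − y²/(1 + y)², with x = a/ε₁ and y = ħ²/(4ε₂a), which changes sign exactly once, at x = y.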
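Remark 14 (State independent MUR, vector observables). By means of the above results, we can formulate
the following state independent entropic MUR for the position $Q$ and momentum $P$. Given two positive
thresholds $\varepsilon_1$ and $\varepsilon_2$, there exists a preparation $\rho_\varepsilon \in G_\varepsilon$ (introduced in Theorem 4) such that, for all Gaussian
approximate joint measurements $M^\sigma$ of $Q$ and $P$, we have
\[
S(Q_{\rho_\varepsilon} \| M^\sigma_{1,\rho_\varepsilon}) + S(P_{\rho_\varepsilon} \| M^\sigma_{2,\rho_\varepsilon}) \ge
\begin{cases}
n (\log e) \left[ \ln\left( 1 + \dfrac{\hbar}{2\sqrt{\varepsilon_1 \varepsilon_2}} \right) - \dfrac{\hbar}{2\sqrt{\varepsilon_1 \varepsilon_2} + \hbar} \right], & \text{if } \varepsilon_1 \varepsilon_2 \ge \dfrac{\hbar^2}{4}, \\[2ex]
n (\log e) \left[ \ln(2) - \dfrac{1}{2} \right], & \text{if } \varepsilon_1 \varepsilon_2 < \dfrac{\hbar^2}{4}.
\end{cases} \tag{92}
\]
The inequality follows by (83) and (86) for $\varepsilon_1 \varepsilon_2 \ge \frac{\hbar^2}{4}$, and (90) for $\varepsilon_1 \varepsilon_2 < \frac{\hbar^2}{4}$.
Thus, also in the vector case, for every approximate joint measurement $M^\sigma$, the total information loss
$S(\rho, M^\sigma)$ does exceed the lower bound (92) even if $G_\varepsilon$ forbids preparations $\rho$ with too peaked target distributions.
Moreover, once $\varepsilon_1$ and $\varepsilon_2$ are chosen, one can fix again a single ‘bad’ state $\rho_\varepsilon$ in $G_\varepsilon$ that satisfies (92) for all Gaussian
approximate joint measurements $M^\sigma$ of $Q$ and $P$.
Whenever $\varepsilon_1 \varepsilon_2 \ge \frac{\hbar^2}{4}$, the optimal approximating joint measurement $M^{\sigma_\varepsilon}$ is unique in the class of Gaussian
covariant bi-observables; it corresponds to a minimum uncertainty state $\sigma_\varepsilon$ which depends only on the chosen
class of preparations $G_\varepsilon$, that is, on the thresholds $\varepsilon_1$ and $\varepsilon_2$: $M^{\sigma_\varepsilon}$ is the best measurement for the worst choice of
the preparation in that class.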
Remark 15. For n = 1, the vector lower bound in (92) reduces to the scalar lower bound found in (75) for two
parallel directions u and v; for n ≥ 1, the bound linearly increases with n.
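The two branches of the bound (92) and its linear growth with n can be tabulated with a few lines of code. This is a sketch under assumptions: `mur_lower_bound` is a hypothetical name, the logarithm is taken in base 2, and ħ = 1 by default.

```python
import math

LOG_E = math.log2(math.e)  # assumption: 'log' is base 2


def mur_lower_bound(n, eps1, eps2, hbar=1.0):
    """Right-hand side of (92) for n degrees of freedom."""
    if eps1 * eps2 >= hbar ** 2 / 4.0:
        z = hbar / (2.0 * math.sqrt(eps1 * eps2))
        # z / (1 + z) equals hbar / (2 sqrt(eps1 eps2) + hbar)
        return n * LOG_E * (math.log1p(z) - z / (1.0 + z))
    return n * LOG_E * (math.log(2.0) - 0.5)


# Illustrative thresholds (not taken from the paper):
print(mur_lower_bound(1, 0.5, 0.5))   # exactly at eps1 * eps2 = hbar^2 / 4
print(mur_lower_bound(1, 0.1, 0.1))   # below the threshold: constant branch
print(mur_lower_bound(3, 2.0, 2.0))   # n = 3, above the threshold
```

At ε₁ε₂ = ħ²/4 the argument z equals 1, so the first branch reduces to n (log e)(ln 2 − 1/2) and the two branches join continuously.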
Remark 16. The entropic incompatibility degree $c^G_{\mathrm{inc}}(Q_u, P_v; \varepsilon)$ is strictly positive for $\cos\alpha \neq 0$ (incompatible
target observables) and it goes to zero in the limit $\alpha \to \pm\pi/2$ (compatible observables), $\hbar \to 0$ (classical limit),
and $\varepsilon_1 \varepsilon_2 \to \infty$ (large uncertainty states).
Remark 17. Similarly to Remark 6 for scalar target observables, also the MUR (92) is actually ineffective for
macroscopic systems. Indeed, suppose we are concerned with position and momentum of a macroscopic particle,
say the center of mass of a multi-particle system (in this case $n = 3$). The states $\rho$ which can be prepared in
practice have macroscopic widths, say $\rho \in G_\varepsilon$ with ‘large’ thresholds and $\varepsilon_1 \varepsilon_2 \gg \hbar^2/4$. Then, we consider
a measuring instrument $M^{\sigma_*}$ having a high precision with respect to this class of states, but not necessarily
attaining a precision near the quantum limits. For instance, let us take $M^{\sigma_*} \in C^G$ with $A^{\sigma_*} = \delta_1 \mathbb{1}$, $B^{\sigma_*} = \delta_2 \mathbb{1}$,
and $0 < \delta_1 \ll \varepsilon_1$, $0 < \delta_2 \ll \varepsilon_2$; we assume $M^{\sigma_*}$ is also unbiased: $a^{\sigma_*} = 0$, $b^{\sigma_*} = 0$. Obviously, $\delta_1 \delta_2 \ge \hbar^2/4$
must hold. Then, $\forall \rho \in G_\varepsilon$ by (77a) and (78a) we have
\[
E_{\rho,\sigma_*} = \delta_1 (A^\rho)^{-1} \le \frac{\delta_1}{\varepsilon_1} \mathbb{1}, \qquad
F_{\rho,\sigma_*} = \delta_2 (B^\rho)^{-1} \le \frac{\delta_2}{\varepsilon_2} \mathbb{1},
\]
\[
0 < S(\rho, M^{\sigma_*}) = \frac{\log e}{2} \operatorname{Tr}\big\{ s(E_{\rho,\sigma_*}) + s(F_{\rho,\sigma_*}) \big\} \le \frac{n \log e}{2} \left[ s(\delta_1/\varepsilon_1) + s(\delta_2/\varepsilon_2) \right].
\]
By (58) the function $s$ is increasing and it behaves as $s(x) \simeq x^2/2$ in a neighborhood of zero; in the present
case $\delta_1/\varepsilon_1 \ll 1$ and $\delta_2/\varepsilon_2 \ll 1$, thus implying that the error function is negligible. This is practically a
‘classical’ case: the preparation has ‘large’ position and momentum uncertainties and the measuring instrument
is ‘relatively good’. In this situation we do not see the difference between the joint measurement of position and
momentum and their separate sharp distributions. Of course the bound (92) continues to hold, but it is also
negligible since $\varepsilon_1 \varepsilon_2 \gg \hbar^2/4$.
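The order of magnitude argument in Remark 17 can be illustrated numerically. The sketch below assumes s(x) = ln(1 + x) − x/(1 + x) and base-2 logarithms; `error_upper_bound` is a hypothetical helper for the last inequality of the remark, and the ratios used are purely illustrative.

```python
import math

LOG_E = math.log2(math.e)  # assumption: 'log' is base 2


def s(x):
    # assumed form of the function s of (58)
    return math.log1p(x) - x / (1.0 + x)


def error_upper_bound(n, delta1, delta2, eps1, eps2):
    """(n log e / 2) [s(delta1/eps1) + s(delta2/eps2)], as in Remark 17."""
    return 0.5 * n * LOG_E * (s(delta1 / eps1) + s(delta2 / eps2))


# Small-x behaviour: s(x) is close to x^2 / 2.
for x in (1e-1, 1e-2, 1e-3):
    print(x, s(x), 0.5 * x * x)

# Illustrative ratios delta_i / eps_i = 1e-6 for n = 3 (not taken from the paper):
print(error_upper_bound(3, 1e-6, 1e-6, 1.0, 1.0))
```

Since the bound scales like the square of the ratios δᵢ/εᵢ, an instrument that is only moderately precise relative to the preparation already makes the error function numerically negligible.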
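Remark 18. Also in the vector case, the scale invariance of the relative entropy extends to the error function
$S(\rho, M^\sigma)$, the divergence $D^G_\varepsilon(Q, P \| M^\sigma)$ and the entropic incompatibility degree $c^G_{\mathrm{inc}}(Q, P; \varepsilon)$, as well as the
entropic MUR (92). Indeed, let us consider the dimensionless versions of position and momentum (35) and
their associated projection valued measures $\tilde{Q}$, $\tilde{P}$ introduced in Section 4. Accordingly, we rescale the joint
measurement $M^\sigma$ of (43) in the same way, obtaining the POVM
\[
\tilde{M}^\sigma(B) = \int_B \tilde{M}^\sigma(\tilde{x}, \tilde{p}) \, d\tilde{x} \, d\tilde{p},
\]
\[
\tilde{M}^\sigma(\tilde{x}, \tilde{p}) = \frac{1}{(2\pi\lambda)^n} \exp\left\{ \frac{i}{\lambda} \left( \tilde{p} \cdot \tilde{Q} - \tilde{x} \cdot \tilde{P} \right) \right\} \Pi \sigma \Pi \exp\left\{ - \frac{i}{\lambda} \left( \tilde{p} \cdot \tilde{Q} - \tilde{x} \cdot \tilde{P} \right) \right\}.
\]
\[
S(\tilde{Q}_\rho \| \tilde{M}^\sigma_{1,\rho}) + S(\tilde{P}_\rho \| \tilde{M}^\sigma_{2,\rho}) = S(Q_\rho \| M^\sigma_{1,\rho}) + S(P_\rho \| M^\sigma_{2,\rho}). \tag{93}
\]
Then, the scale invariance holds for the entropic divergence and incompatibility degree, too:
\[
\tilde{D}^G_{\tilde{\varepsilon}}(\tilde{Q}, \tilde{P} \| \tilde{M}^\sigma) = D^G_\varepsilon(Q, P \| M^\sigma), \qquad
\tilde{c}^G_{\mathrm{inc}}(\tilde{Q}, \tilde{P}; \tilde{\varepsilon}) = c^G_{\mathrm{inc}}(Q, P; \varepsilon),
\]
where $\tilde{\varepsilon}_1 := \frac{\kappa \varepsilon_1}{\hbar}$ and $\tilde{\varepsilon}_2 := \frac{\lambda^2 \varepsilon_2}{\kappa \hbar}$. In particular $\tilde{\varepsilon}_1 \tilde{\varepsilon}_2 \ge \frac{\lambda^2}{4} \iff \varepsilon_1 \varepsilon_2 \ge \frac{\hbar^2}{4}$ and, in this case, we have
\[
n (\log e) \, s\left( \frac{\lambda}{2\sqrt{\tilde{\varepsilon}_1 \tilde{\varepsilon}_2}} \right) = \tilde{c}^G_{\mathrm{inc}}(\tilde{Q}, \tilde{P}; \tilde{\varepsilon}) = c^G_{\mathrm{inc}}(Q, P; \varepsilon) = n (\log e) \, s\left( \frac{\hbar}{2\sqrt{\varepsilon_1 \varepsilon_2}} \right).
\]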
7. Conclusions
We have extended the relative entropy formulation of MURs given in [41] from the case of discrete
incompatible observables to a particular instance of continuous target observables, namely the position
and momentum vectors, or two components of them along two possibly non parallel directions.
The entropic MURs we found share the nice property of being scale invariant and well-behaved in the
classical and macroscopic limits. Moreover, in the scalar case, when the angle spanned by the position
and momentum components goes to ±π/2, the entropic bound correctly reflects their increasing
compatibility by approaching zero with continuity.
Although our results are limited to the case of Gaussian preparation states and covariant Gaussian
approximate joint measurements, we conjecture that the bounds we found still hold for arbitrary states
and general (not necessarily covariant or Gaussian) bi-observables. Let us see with some more detail
how this should work in the case when the target observables are the vectors Q and P.
The most general procedure should be to consider the error function $S(Q_\rho \| M_{1,\rho}) + S(P_\rho \| M_{2,\rho})$ for
an arbitrary POVM $M$ on $\mathbb{R}^n \times \mathbb{R}^n$ and any state $\rho \in S$. First of all, we need states for which neither the
position nor the momentum dispersion is too small; the obvious generalization of the test states (81) is
\[
S_\varepsilon := \{ \rho \in S_2 : A^\rho \ge \varepsilon_1 \mathbb{1}, \ B^\rho \ge \varepsilon_2 \mathbb{1} \}, \qquad \varepsilon_i > 0.
\]
Then, the most general definitions of the entropic divergence and incompatibility degree are:
\[
D_\varepsilon(Q, P \| M) := \sup_{\rho \in S_\varepsilon} \big\{ S(Q_\rho \| M_{1,\rho}) + S(P_\rho \| M_{2,\rho}) \big\}, \tag{94}
\]
It may happen that Qρ is not absolutely continuous with respect to M1,ρ , or Pρ with respect to
M2,ρ ; in this case, the error function and the entropic divergence take the value +∞ by definition.
So, we can restrict to bi-observables that are (weakly) absolutely continuous with respect to the
Lebesgue measure. However, the true difficulty is that, even with this assumption, here we are not
able to estimate (94), hence (95). It could be that the symmetrization techniques used in [17,19] can be
extended to the present setting, and one can reduce the evaluation of the entropic incompatibility index
to optimizing over all covariant bi-observables. Indeed, in the present paper we a priori selected only
covariant approximating measurements; we would like to understand if, among all approximating
measurements, the relative entropy approach selects covariant bi-observables by itself. However, even
if M is covariant, there remains the problem that we do not know how to evaluate (94) if ρ and M
are not Gaussian. It is reasonable to expect that some continuity and convexity arguments should
apply, and the bounds in Theorem 5 might be extended to the general case by taking dense convex
combinations. Also the techniques used for the PURs in [8,9] could be of help in order to extend what
we did with Gaussian states to arbitrary states. This leads us to conjecture:
\[
c^G_{\mathrm{inc}}(Q, P; \varepsilon) = c_{\mathrm{inc}}(Q, P; \varepsilon). \tag{96}
\]
Conjecture (96) is also supported since the uniqueness of the optimal approximating bi-observable
in Theorem 5(i) is reminiscent of what happens in the discrete case of two Fourier conjugated mutually
unbiased bases (MUBs); indeed, in the latter case, the optimal bi-observable is actually unique among
all the bi-observables, not only the covariant ones (see [41] (Theorem 5)).
Similar considerations obviously apply also to the case of scalar target observables. We leave
a deeper investigation of equality (96) to future work.
As a final consideration, one could be interested in finding error/disturbance bounds involving
sequential measurements of position and momentum, rather than considering all their possible
approximate joint measurements. As sequential measurements are a proper subset of the set of
all the bi-observables, optimizing only over them should lead to bounds that are greater than cinc .
This is why an error/disturbance entropic bound, denoted by ced and distinct
from cinc , was introduced in [41]. However, it was also proved that the equality cinc = ced holds when
one of the target observables is discrete and sharp. Now, in the present paper, only sharp target
observables are involved; although the argument of [41] cannot be extended to the continuous setting,
the optimal approximating joint observables we found in Theorems 3(i) and 5(i) actually are sequential
measurements. Indeed, the optimal bi-observable in Theorem 3(i) is one of the POVMs described in
Examples 2 and 3 (see (74)); all these bi-observables have a (trivial) sequential implementation in terms
of an unsharp measurement of Qu followed by sharp Pv . On the other hand, in the vector case, it was
shown in ([67], Corollary 1) that all covariant phase-space observables can be obtained as a sequential
measurement of an unsharp version of the position Q followed by the sharp measurement of the
momentum P. Therefore, cinc = ced also for target position and momentum observables, in both the
scalar and vector case.
References
1. Heisenberg, W. Über den anschaulichen Inhalt der quantentheoretischen Kinematik und Mechanik.
Zeitschr. Phys. 1927, 43, 172–198.
2. Simon, R.; Mukunda, N.; Dutta, B. Quantum-noise matrix for multimode systems: U (n) invariance, squeezing,
and normal forms. Phys. Rev. A 1994, 49, 1567–1583.
3. Holevo, A.S. Statistical Structure of Quantum Theory; Lecture Notes in Physics Monographs 67; Springer:
Berlin, Germany, 2001.
4. Holevo, A.S. Probabilistic and Statistical Aspects of Quantum Theory; Quaderni della Normale; Edizioni della
Normale: Pisa, Italy, 2011.
5. Holevo, A.S. Quantum Systems, Channels, Information; De Gruyter: Berlin, Germany, 2012.
6. Robertson, H. The uncertainty principle. Phys. Rev. 1929, 34, 163–164.
7. Hirschman, I.I. A note on entropy. Am. J. Math. 1957, 79, 152–156.
8. Beckner, W. Inequalities in Fourier analysis. Ann. Math. 1975, 102, 159–182.
9. Białynicki-Birula, I.; Mycielski, J. Uncertainty relations for information entropy in wave mechanics.
Commun. Math. Phys. 1975, 44, 129–132.
10. Maassen, H.; Uffink, J.B.M. Generalized entropic uncertainty relations. Phys. Rev. Lett. 1988, 60, 1103–1106.
11. Gibilisco, P.; Isola, T. On a refinement of Heisenberg uncertainty relation by means of quantum Fisher
information. J. Math. Anal. Appl. 2011, 375, 270–275.
12. Coles, P.J.; Berta, M.; Tomamichel, M.; Wehner, S. Entropic uncertainty relations and their applications.
Rev. Mod. Phys. 2017, 89, 015002.
13. Wehner, S.; Winter, A. Entropic uncertainty relations—A survey. New J. Phys. 2010, 12, 025009.
14. Ozawa, M. Position measuring interactions and the Heisenberg uncertainty principle. Phys. Lett. A 2002,
299, 1–7.
15. Ozawa, M. Physical content of Heisenberg’s uncertainty relation: Limitation and reformulation. Phys. Lett. A
2003, 318, 21–29.
16. Ozawa, M. Universally valid reformulation of the Heisenberg uncertainty principle on noise and disturbance
in measurement. Phys. Rev. A 2003, 67, 042105.
17. Werner, R.F. The uncertainty relation for joint measurement of position and momentum. Quantum Inf. Comput.
2004, 4, 546–562.
18. Busch, P.; Heinonen, T.; Lahti, P. Heisenberg’s Uncertainty Principle. Phys. Rep. 2007, 452, 155–176.
19. Busch, P.; Lahti, P.; Werner, R. Measurement uncertainty relations. J. Math. Phys. 2014, 55, 042111.
20. Busch, P.; Lahti, P.; Werner, R. Quantum root-mean-square error and measurement uncertainty relations.
Rev. Mod. Phys. 2014, 86, 1261–1281.
21. Ozawa, M. Heisenberg’s original derivation of the uncertainty principle and its universally valid reformulations.
Curr. Sci. 2015, 109, 2006–2016.
22. Davies, E.B. Quantum Theory of Open Systems; Academic: London, UK, 1976.
23. Busch, P.; Grabowski, M.; Lahti, P. Operational Quantum Physics; Springer: Berlin, Germany, 1997.
24. Barchielli, A.; Gregoratti, M. Quantum Trajectories and Measurements in Continuous Time: The Diffusive Case;
Lecture Notes in Physics; Springer: Berlin/Heidelberg, Germany, 2009; Volume 782.
25. Heinosaari, T.; Ziman, M. The Mathematical Language of Quantum Theory: From Uncertainty to Entanglement;
Cambridge University Press: Cambridge, UK, 2012.
26. Busch, P.; Lahti, P.; Pellonpää, J.-P.; Ylinen, K. Quantum Measurement; Springer: Berlin, Germany, 2016.
27. Buscemi, F.; Hall, M.J.W.; Ozawa, M.; Wilde, M.M. Noise and disturbance in quantum measurements:
An information-theoretic approach. Phys. Rev. Lett. 2014, 112, 050401.
28. Busch, P.; Heinosaari, T.; Schultz, J.; Stevens, N. Comparing the degrees of incompatibility inherent in
probabilistic physical theories. Europhys. Lett. 2013, 103, 10002.
29. Busch, P.; Lahti, P.; Werner, R. Proof of Heisenberg’s error-disturbance relation. Phys. Rev. Lett. 2013, 111, 160405.
30. Coles, P.J.; Furrer, F. State-dependent approach to entropic measurement-disturbance relations. Phys. Lett. A
2015, 379, 105–112.
31. Heinosaari, T.; Schultz, J.; Toigo, A.; Ziman, M. Maximally incompatible quantum observables. Phys. Lett. A
2014, 378, 1695–1699.
32. Werner, R.F. Uncertainty relations for general phase spaces. Front. Phys. 2016, 11, 110305.
33. Buscemi, F.; Das, S.; Wilde, M.M. Approximate reversibility in the context of entropy gain, information gain,
and complete positivity. Phys. Rev. A 2016, 93, 062314.
34. Barchielli, A.; Lupieri, G. Instrumental processes, entropies, information in quantum continual measurements.
Quantum Inf. Comput. 2004, 4, 437–449.
35. Barchielli, A.; Lupieri, G. Instruments and channels in quantum information theory. Opt. Spectrosc. 2005, 99,
425–432.
36. Barchielli, A.; Lupieri, G. Quantum measurements and entropic bounds on information transmission.
Quantum Inf. Comput. 2006, 6, 16–45.
37. Barchielli, A.; Lupieri, G. Instruments and mutual entropies in quantum information. Banach Center Publ.
2006, 73, 65–80.
38. Barchielli, A.; Lupieri, G. Entropic bounds and continual measurements. In Quantum Probability and Infinite
Dimensional Analysis; QP-PQ: Quantum Probability and White Noise Analysis; Accardi, L., Freudenberg, W.,
Schürmann, M., Eds.; World Scientific: Singapore, 2007; Volume 20, pp. 79–89.
39. Barchielli, A.; Lupieri, G. Information gain in quantum continual measurements. In Quantum Stochastic and
Information; Belavkin, V.P., Guţǎ, M., Eds.; World Scientific: Singapore, 2008; pp. 325–345.
40. Maccone, L. Entropic information-disturbance tradeoff. EPL 2007, 77, 40002.
41. Barchielli, A.; Gregoratti, M.; Toigo, A. Measurement uncertainty relations for discrete observables: Relative
entropy formulation. arXiv 2016, arXiv:1608.01986.
42. Braunstein, S.L.; van Loock, P. Quantum information with continuous variables. Rev. Mod. Phys. 2005, 77,
513–577.
43. Heinosaari, T.; Kiukas, J.; Schultz, J. Breaking Gaussian incompatibility on continuous variable quantum
systems. J. Math. Phys. 2015, 56, 082202.
44. Kiukas, J.; Schultz, J. Informationally complete sets of Gaussian measurements. J. Phys. A Math. Theor. 2013,
46, 485303.
45. Weedbrook, C.; Pirandola, S.; García-Patrón, R.; Cerf, N.J.; Ralph, T.C.; Shapiro, J.H.; Lloyd, S. Gaussian
quantum information. Rev. Mod. Phys. 2012, 84, 621–669.
46. Huang, Y. Entropic uncertainty relations in multidimensional position and momentum spaces. Phys. Rev. A
2011, 83, 052124.
47. Heinosaari, T.; Miyadera, T.; Ziman, M. An invitation to quantum incompatibility. J. Phys. A Math. Theor.
2016, 49, 123001.
48. Simon, R.; Sudarshan, E.C.G.; Mukunda, N. Gaussian-Wigner distributions in quantum mechanics and
optics. Phys. Rev. A 1987, 36, 3868–3880.
49. Horn, R.A.; Zhang, F. Basic Properties of the Schur Complement. In The Schur Complement and Its Applications;
Zhang, F., Ed.; Numerical Methods and Algorithms; Springer: Berlin, Germany, 2005; pp. 17–46.
50. Petz, D. Quantum Information Theory and Quantum Statistics; Springer: Berlin, Germany, 2008.
51. Carlen, E. Trace Inequalities and Quantum Entropy: An Introductory Course. In Entropy and the Quantum;
Contemporary Mathematics; American Mathematical Society: Providence, RI, USA, 2010; Volume 529,
pp. 73–140.
52. Bhatia, R. Matrix Analysis; Springer: New York, NY, USA, 1997.
53. Werner, R.F. Quantum harmonic analysis on phase spaces. J. Math. Phys. 1983, 25, 1404–1411.
54. Burnham, K.P.; Anderson, D.R. Model Selection and Multimodel Inference—A Practical Information—Theoretic
Approach; Springer: New York, NY, USA, 2002.
55. Topsøe, F. Basic concepts, identities and inequalities—The toolkit of Information Theory. Entropy 2001, 3,
162–190.
56. Cover, T.M.; Thomas, J.A. Elements of Information Theory, 2nd ed.; Wiley: Hoboken, NJ, USA, 2006.
57. Carmeli, C.; Heinonen, T.; Toigo, A. Position and momentum observables on R and on R3 . J. Math. Phys.
2004, 45, 2526–2539.
58. Barchielli, A.; Lupieri, G. Quantum stochastic calculus, operation valued stochastic processes and continual
measurements in quantum mechanics. J. Math. Phys. 1985, 26, 2222–2230.
59. Barchielli, A.; Lupieri, G. A quantum analogue of Hunt’s representation theorem for the generator of
convolution semigroups on Lie groups. Probab. Theory Rel. Fields 1991, 88, 167–194.
60. Barchielli, A.; Holevo, A.S.; Lupieri, G. An analogue of Hunt’s representation theorem in quantum probability.
J. Theor. Probab. 1993, 6, 231–265.
61. Holevo, A.S. Investigations in the General Theory of Statistical Decisions. Proc. Steklov Inst. Math. 1978, 124,
1–140.
62. Holevo, A.S. Infinitely divisible measurements in quantum probability theory. Theory Probab. Appl. 1986, 31,
493–497.
63. Cassinelli, G.; De Vito, E.; Toigo, A. Positive operator valued measures covariant with respect to an irreducible
representation. J. Math. Phys. 2003, 44, 4768–4775.
64. Kiukas, J.; Lahti, P.; Ylinen, K. Normal covariant quantization maps. J. Math. Anal. Appl. 2006, 319, 783–801.
65. Ohya, M.; Petz, D. Quantum Entropy and Its Use; Springer: Berlin, Germany, 1993.
66. Billingsley, P. Probability and Measure, 2nd ed.; Wiley: New York, NY, USA, 1986.
67. Carmeli, C.; Heinonen, T.; Toigo, A. Sequential measurements of conjugate observables. J. Phys. A Math. Theor.
2011, 44, 285304.
© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
entropy
Article
Planck-Scale Soccer-Ball Problem: A Case of
Mistaken Identity
Giovanni Amelino-Camelia 1,2
1 Dipartimento di Fisica, Università di Roma “La Sapienza”, P.le A. Moro 2, 00185 Roma, Italy;
[email protected]
2 Istituto Nazionale di Fisica Nucleare (INFN), Sez. Roma1, P.le A. Moro 2, 00185 Roma, Italy
Abstract: Over the last decade, it has been found that nonlinear laws of composition of momenta are
predicted by some alternative approaches to “real” 4D quantum gravity, and by all formulations of
dimensionally-reduced (3D) quantum gravity coupled to matter. The possible relevance for rather
different quantum-gravity models has motivated several studies, but this interest is being tempered
by concerns that a nonlinear law of addition of momenta might inevitably produce a pathological
description of the total momentum of a macroscopic body. I here show that such concerns are
unjustified, finding that they are rooted in failure to appreciate the differences between two roles
for laws of composition of momenta in physics. Previous results relied exclusively on the role of a
law of momentum composition in the description of spacetime locality. However, the notion of total
momentum of a multi-particle system is not a manifestation of locality, but rather reflects translational
invariance. By working within an illustrative example of quantum spacetime, I show explicitly that
spacetime locality is indeed reflected in a nonlinear law of composition of momenta, but translational
invariance still results in an undeformed linear law of addition of momenta building up the total
momentum of a multi-particle system.
1. Introduction
An emerging characteristic of quantum-gravity research over the last decade has been a gradual
shift of focus toward manifestations of the Planck scale on momentum space, particularly pronounced
in some approaches to quantum gravity. For some research lines based on spacetime noncommutativity,
several momentum-space structures have been in focus, including the possibility of deformed laws
of composition of momenta, which shall be here of interest. While deformed laws of composition
of momenta are found to be inevitable in some approaches based on spacetime noncommutativity
(e.g., [1–6]), the situation is less certain in the loop-quantum-gravity approach. For “real” 4D loop
quantum gravity, the relevant issues are partly obscured by our present limited understanding of
the semiclassical limit of that theory [7], but some indirect arguments suggest that a nonlinear law of
composition of momenta might arise [8,9]. These arguments find further strength in results on 3D loop
quantum gravity, where the simplifications afforded by that dimensionally-reduced model allow one
to rigorously show that indeed the nonlinearities on momentum space are present (e.g., [10]). Actually,
evidence is growing that in all alternative formulations of 3D quantum gravity coupled to matter there
are nonlinearities in momentum space, including nonlinear laws of composition of momenta (e.g., [11]).
The role played by nonlinearities on momentum space is also noteworthy in two recently-proposed
approaches to the quantum-gravity problem: the one based on group field theory [12] and the one
based on the relative-locality framework [13].
Due to the lack of experimental guidance, a variety of approaches to quantum gravity are
being developed, and in most cases the different approaches have very little in common. This of
course endows with additional reasons of interest any result which is found to apply to more than
one approach. Indeed, there has been growing interest in the conceptual implications and possible
phenomenological implications [14] of nonlinear laws on momentum space and particularly nonlinear
laws of composition of momenta. However, this interest is being tempered by concerns that a nonlinear
law of addition of momenta might inevitably produce a pathological description of the total momentum
of a macroscopic body [15–23] (also see References [24–26] for a related discussion focused within the
novel relative-locality framework). This issue has often been labelled as the “soccer-ball problem” [17]:
the quantum-gravity pictures lead one to expect nonlinearities of the law of composition of momenta
which are suppressed by the Planck scale (∼10^28 eV) and would be unobservably small for particles
at energies we presently can access, but in the analysis of a macroscopic body (e.g., a soccer ball),
one might have to add up very many of such minute nonlinearities, ultimately obtaining results in
conflict with observations [15–23].
If this so-called “soccer-ball problem” really was a scientific problem (a case of actual conflict with
experimental data), we could draw rather sharp conclusions about several areas of quantum-gravity
research. Perhaps most notably we should consider as ruled out large branches of research on
quantum-gravity based on spacetime noncommutativity and we should consider the whole effort
of research on dimensionally-reduced 3D quantum gravity as completely unreliable in forming
an intuition for “real” 4D quantum gravity. However, I here show that previous discussions of
this soccer-ball problem [15–26] failed to appreciate the differences between two roles for laws of
composition of momentum in physics. Previous results supporting a nonlinear law of addition of
momenta relied exclusively on the role of a law of momentum composition in the description of
spacetime locality. The notion of total momentum of a multi-particle system is not a manifestation
of locality, but rather reflects translational invariance in interacting theories. After being myself
confused about these issues for quite some time [17] I feel I am now in a position to articulate the
needed discussion at a completely general level. However, considering the tone and content of the
bulk of literature that precedes this contribution of mine I find it is best to opt here instead for a very
explicit discussion based on illustrative examples of calculations performed within a specific simple
model affected by nonlinearities for a law of composition of momenta. The model I focus on has
2 + 1-dimensional pure-spatial κ-Minkowski noncommutativity [1–6], with the time coordinate left
unaffected by the deformation and the two spatial coordinates, $x_1$ and $x_2$, governed by
$$[x_1, x_2] = i \ell \, x_1 \tag{1}$$
(with the deformation scale $\ell$ expected to be of the order of the inverse of the Planck scale).
In the next section I briefly review within this example of quantum spacetime previous
arguments showing that spacetime locality is reflected in a nonlinear law of composition of momenta.
Then, Section 3 takes off from known results on translational invariance for κ-Minkowski noncommutative
spacetimes and builds on those to achieve the first ever example of translationally-invariant interacting
two-particle system in κ-Minkowski. This allows me to explicitly verify that the conserved charge
associated with that translational invariance (the total momentum of the two-particle system) adds
linearly the momenta of the two particles involved. Section 4 offers some closing remarks.
Entropy 2017, 19, 400
and that the notion of integration on such a noncommutative space preserves many of the standard
properties including [1,3]
$$\int d^4x \, e^{i k_\mu x^\mu} = (2\pi)^4 \, \delta^{(4)}(k) . \tag{2}$$
It is a rather standard exercise for practitioners of spacetime noncommutativity to use these tools
in order to enforce locality within actions describing classical fields. For example, one might want to
introduce in the action the product of three (possibly identical, but in general different) fields, Φ, Ψ,
Υ, insisting on locality in the sense that the three fields be evaluated “at the same quantum point x”;
i.e., Φ( x ) Ψ( x ) Υ( x ). There is still no consensus on how one should formulate the more interesting
quantum-field version of such theories, and it remains unclear to which extent and in which way our
ordinary notion of locality is generalized by the requirement of evaluating “at the same quantum point
x ” fields intervening in a product such as Φ( x ) Ψ( x ) Υ( x ). Nonetheless, for the classical-field case
there is a sizable body of literature consistently adopting this prescription for locality. Important for
my purposes here is the fact that with such a prescription, locality inevitably leads to a nonlinear law
of composition of momenta, as I show explicitly in the following example:
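$$
\begin{aligned}
\int d^4x \, \Phi(x)\, \Psi(x)\, \Upsilon(x)
&= \int d^4x \int d^4k \int d^4p \int d^4q \; \tilde{\Phi}(k)\, \tilde{\Psi}(p)\, \tilde{\Upsilon}(q)\, e^{i k_\mu x^\mu} e^{i p_\nu x^\nu} e^{i q_\rho x^\rho} \\
&= \int d^4x \int d^4k \int d^4p \int d^4q \; \tilde{\Phi}(k)\, \tilde{\Psi}(p)\, \tilde{\Upsilon}(q)\, e^{i (k \oplus p \oplus q)_\mu x^\mu} \\
&= (2\pi)^4 \int d^4k \int d^4p \int d^4q \; \tilde{\Phi}(k)\, \tilde{\Psi}(p)\, \tilde{\Upsilon}(q)\, \delta^{(4)}(k \oplus p \oplus q)
\end{aligned}
\tag{3}
$$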
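$$(k \oplus p)_2 = k_2 + p_2 \tag{5}$$
$$(k \oplus p)_1 = \frac{k_2 + p_2}{1 - e^{\ell (k_2 + p_2)}} \left[ \frac{1 - e^{\ell k_2}}{k_2 \, e^{\ell p_2}} \, k_1 + \frac{1 - e^{\ell p_2}}{p_2} \, p_1 \right] \tag{6}$$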
This result is rooted in one of the most studied aspects of such noncommutative spacetimes,
which is their “generalized star product” [1–3]. This is essentially a characterization of the properties
of products of exponentials induced by rules of noncommutativity of type (1). Specifically, one
easily arrives at (3) (with ⊕ such that, in particular, (6) holds) by just observing that from the
defining commutator (1) it follows that [2,3] (Equation (7) is a particular example of application of
the Baker–Campbell–Hausdorff formula for products of exponentials of noncommuting variables.
In general, the Baker–Campbell–Hausdorff formula involves an infinite series of nested commutators,
but the case of noncommutativity (1) is one of the cases for which the series of nested commutators
can be resummed explicitly [2,3]):
The so-called soccer-ball problem concerns the acceptability of laws of composition of type (6).
Since one assumes that the deformation scale $\ell$ is on the order of the inverse of the Planck scale,
applying (6) to microscopic/fundamental particles has no sizable consequences: of course (6) gives us
back to good approximation $(k \oplus p)_1 \simeq k_1 + p_1$ whenever $|\ell k_2| \ll 1$ and $|\ell p_2| \ll 1$. However, if a law
of composition such as (6) should be used also when we add very many microparticle momenta in
obtaining the total momentum of a multiparticle system (such as a soccer ball), then the final result
could be pathological [15–26] even when each microparticle in the system has momentum much
smaller than $1/\ell$.
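To make the concern concrete, the following is a minimal numerical sketch of what would happen if a composition law of type (6) were naively iterated over many constituents. It assumes the form $(k \oplus p)_2 = k_2 + p_2$ and $(k \oplus p)_1 = \frac{k_2 + p_2}{1 - e^{\ell(k_2+p_2)}} \big[ \frac{1 - e^{\ell k_2}}{k_2 e^{\ell p_2}} k_1 + \frac{1 - e^{\ell p_2}}{p_2} p_1 \big]$; the function names, the choice of units with $\ell = 1$, and the sample momenta are illustrative assumptions, not part of the original analysis.

```python
import math

def f(x, ell):
    """(1 - exp(ell * x)) / x, with its x -> 0 limit (-ell). Requires ell != 0."""
    return -ell if x == 0.0 else -math.expm1(ell * x) / x

def compose(k, p, ell):
    """Deformed sum k (+) p of two momenta given as (first, second) components.

    Assumed form of the composition law (hypothetical helper, see lead-in):
      (k (+) p)_2 = k_2 + p_2
      (k (+) p)_1 = [f(k_2) e^{-ell p_2} k_1 + f(p_2) p_1] / f(k_2 + p_2)
    """
    k1, k2 = k
    p1, p2 = p
    q2 = k2 + p2
    q1 = (f(k2, ell) * math.exp(-ell * p2) * k1 + f(p2, ell) * p1) / f(q2, ell)
    return (q1, q2)

def compose_many(momenta, ell):
    """Left fold of the deformed sum over a list of momenta."""
    total = momenta[0]
    for p in momenta[1:]:
        total = compose(total, p, ell)
    return total

if __name__ == "__main__":
    ell = 1.0                      # units in which the deformation scale is 1
    particle = (1e-3, 1e-3)        # each constituent is far below 1/ell
    for n in (2, 10, 100, 1000):
        q1, _ = compose_many([particle] * n, ell)
        linear = n * particle[0]
        print(n, q1, linear, q1 / linear)
```

Under these assumptions, folding $N$ identical momenta $(k_1, k_2)$ gives $N k_1 e^{-\ell (N-1) k_2}$ for the first component, so the departure from the linear sum is negligible for a pair of particles but becomes of order one once $N \ell k_2 \sim 1$, which is the pathology feared in the soccer-ball discussions.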
$$[p_1, x_1] = i , \qquad [p_2, x_1] = 0 , \qquad [p_2, x_2] = i , \tag{8}$$
$$[p_1, x_2] = -i \ell \, p_1 , \tag{9}$$
One easily finds that by combining (1), (8), and (9), all Jacobi identities are satisfied [4–6].
Additional intuition for these nonstandard properties of the momenta p j comes from actually looking
at which formulation of translation transformations preserves the form of the noncommutativity of
coordinates (1). Evidently, the standard description
$$x_2 \to x'_2 = x_2 + a_2 , \qquad x_1 \to x'_1 = x_1 + a_1$$
$$[x'_1, x'_2] = [x_1 + a_1, \; x_2 + a_2 - \ell a_1 p_1] = i \ell x_1 - \ell a_1 [x_1, p_1] = i \ell (x_1 + a_1) = i \ell x'_1 \tag{12}$$
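The commutators used above can be checked symbolically. The sketch below assumes a concrete momentum-space representation that is not taken from the text: momenta act by multiplication, $x_1 = -i \partial_{p_1}$, and $x_2 = -i \partial_{p_2} + i \ell p_1 \partial_{p_1}$. With that hypothetical choice it verifies $[x_1, x_2] = i \ell x_1$, the relations $[p_1, x_1] = i$, $[p_2, x_1] = 0$, $[p_2, x_2] = i$, $[p_1, x_2] = -i \ell p_1$, and the form invariance of the coordinate commutator under $x_1 \to x_1 + a_1$, $x_2 \to x_2 + a_2 - \ell a_1 p_1$.

```python
import sympy as sp

p1, p2, ell, a1, a2 = sp.symbols("p1 p2 ell a1 a2", real=True)
f = sp.Function("f")(p1, p2)  # generic test function on momentum space

# Assumed (hypothetical) representation: momenta multiply, coordinates differentiate.
def P1(g): return p1 * g
def P2(g): return p2 * g
def X1(g): return -sp.I * sp.diff(g, p1)
def X2(g): return -sp.I * sp.diff(g, p2) + sp.I * ell * p1 * sp.diff(g, p1)

# Deformed translation: x1' = x1 + a1, x2' = x2 + a2 - ell * a1 * p1.
def X1t(g): return X1(g) + a1 * g
def X2t(g): return X2(g) + a2 * g - ell * a1 * P1(g)

def residual(A, B, rhs):
    """[A, B] f - rhs, expanded; zero means the commutation relation holds."""
    return sp.expand(A(B(f)) - B(A(f)) - rhs)

residuals = {
    "[x1, x2] = i ell x1": residual(X1, X2, sp.I * ell * X1(f)),
    "[p1, x1] = i": residual(P1, X1, sp.I * f),
    "[p2, x1] = 0": residual(P2, X1, 0),
    "[p2, x2] = i": residual(P2, X2, sp.I * f),
    "[p1, x2] = -i ell p1": residual(P1, X2, -sp.I * ell * P1(f)),
    "[x1', x2'] = i ell x1'": residual(X1t, X2t, sp.I * ell * X1t(f)),
}

for name, value in residuals.items():
    print(name, "->", "holds" if value == 0 else value)
```

Since all of these relations hold in one explicit representation, the Jacobi identities are satisfied automatically there; this is a consistency check on the algebra, not a derivation of it.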
All this about translation transformations in certain noncommutative spacetimes is well known
(e.g., [4–6]). The part which I am here going to contribute is to show how this is relevant for the mentioned
much-debated issue about the total momentum of a multi-particle system. My starting point is that in
order for us to be able to even contemplate the total momentum of a multiparticle system, we must be
dealing with a case where translational invariance is ensured: total momentum is the conserved charge
for a translationally invariant multi-particle system. Surely the introduction of translationally invariant
multi-particle systems must involve some subtleties due to the noncommutativity of coordinates,
and these subtleties are directly connected to the new properties of translation transformations (9),
but they are not directly connected to the properties of the star product (7) and the associated law
of composition of momenta (6). For my purposes, also considering the heated debate that precedes
this contribution of mine, it is best to show the implications of this point very simply and explicitly,
focusing on a system of two particles interacting via a harmonic potential.
I start by noticing that evidently one does not achieve translational invariance through a
description of the form
$$x_2 + \ell x_1 p_1 \;\to\; x_2 - \ell a_1 p_1 + \ell (x_1 + a_1) p_1 = x_2 + \ell x_1 p_1$$
It is interesting for my purposes to see which conserved charge is associated with this invariance
under translations of the hamiltonian H. This conserved charge will describe the total momentum
of the two-particle system governed by H (i.e., the center-of-mass momentum). It is easy to see that
this conserved charge is just the standard $p^A + p^B$. For the second component, one trivially finds
that indeed
$$[p_2^A + p_2^B, H] = 0$$
and the same result also applies to the first component:
where the only non-trivial observation I have used is that (1) leads to $[p_1, x_2 + \ell x_1 p_1] = -i \ell p_1 + i \ell p_1 = 0$.
The result (15) shows that indeed $p^A + p^B$ is the momentum of the center of mass of my
translationally-invariant two-particle system; i.e., it is the total momentum of the system.
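The mechanism behind this result can be illustrated symbolically for two particles. The sketch below does not reproduce the Hamiltonian of Equation (14); it only assumes, hypothetically, that the interaction is built from the relative quantities $x_1^A - x_1^B$ and $(x_2^A + \ell x_1^A p_1^A) - (x_2^B + \ell x_1^B p_1^B)$, and it uses an assumed momentum-space representation ($x_1 = -i\partial_{p_1}$, $x_2 = -i\partial_{p_2} + i\ell p_1 \partial_{p_1}$ for each particle). It then checks that both components of the linear sum $p^A + p^B$ commute with these building blocks.

```python
import sympy as sp

ell = sp.symbols("ell", real=True)
pA1, pA2, pB1, pB2 = sp.symbols("pA1 pA2 pB1 pB2", real=True)
f = sp.Function("f")(pA1, pA2, pB1, pB2)  # generic two-particle test function

# Assumed (hypothetical) single-particle representation of the coordinates.
def x1(q1):
    return lambda g: -sp.I * sp.diff(g, q1)

def x2(q1, q2):
    return lambda g: -sp.I * sp.diff(g, q2) + sp.I * ell * q1 * sp.diff(g, q1)

def invariant(q1, q2):
    """x2 + ell * x1 * p1 for one particle (p1 acts first, then x1)."""
    return lambda g: x2(q1, q2)(g) + ell * x1(q1)(q1 * g)

# Relative quantities assumed to enter the interaction.
def rel1(g): return x1(pA1)(g) - x1(pB1)(g)
def rel2(g): return invariant(pA1, pA2)(g) - invariant(pB1, pB2)(g)

# Linear total momentum, component by component.
def tot1(g): return (pA1 + pB1) * g
def tot2(g): return (pA2 + pB2) * g

def commutator(A, B):
    return sp.expand(A(B(f)) - B(A(f)))

checks = {
    "[P1_tot, rel1]": commutator(tot1, rel1),
    "[P1_tot, rel2]": commutator(tot1, rel2),
    "[P2_tot, rel1]": commutator(tot2, rel1),
    "[P2_tot, rel2]": commutator(tot2, rel2),
}
# For contrast: a single-particle momentum does not commute with rel2.
single = commutator(lambda g: pA2 * g, rel2)

for name, value in checks.items():
    print(name, "=", value)
print("[pA2, rel2] =", single)
```

Any Hamiltonian assembled from the single-particle momenta and these relative quantities then commutes with $p^A + p^B$, while the momentum of one particle alone is not conserved, as expected for an interacting system.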
The concerns about total momentum that had been voiced in discussions of the Planck-scale
soccer-ball problem were rooted in the different sum of momenta relevant for locality, the ⊕ sum
discussed in the previous section. It was feared that one should obtain the total momentum by
combining single-particle momenta with the nonlinear ⊕ sum. The result (15) shows that this
expectation was incorrect. One can also directly verify that indeed $p^A \oplus p^B$ is not a conserved
charge for my translationally-invariant two-particle system, and specifically, taking into account (6),
one finds that
$$[(p^A \oplus p^B)_1, H] \neq 0$$
This completes my thesis, but in closing this section I should warn readers of the fact that while the
picture emerging from my analysis is rather compelling, one should not forget that the interpretation
of the notion of total momentum in a noncommutative spacetime remains affected by some open issues
(see Reference [14] and references therein). Even the physical meaning of having noncommutative
spacetime coordinates is still being debated. In the shadow of these interpretational issues, we
cannot even be sure that the Hamiltonian of Equation (14) has physical (observable) consequences
different from an ordinary harmonic-oscillator theory. Nonetheless, my analysis contributes to this
ongoing debate by exposing two notions of momentum conservation: one connected to locality, and
one connected with translational invariance. Evidently, if interpreted in the standard way, these two
notions could be mutually incompatible: in the analysis of a chain of events one might naturally
want to insist on overall total-momentum conservation, but in some parts of the chain of events
the conserved quantity might be the one coming from locality, while in other parts of the chain of
events the conserved quantity might be the one coming from translational invariance. Addressing this
apparent puzzle might require a totally new interpretation of the notion of momentum of a particle
in a quantum spacetime, while failing to address it might be a mortal blow to the whole research
area. While in part my results are sub judice because of these interpretational issues, my analysis
nonetheless firmly establishes the main conceptual point I am making, which concerns the differences
between “composition of momentum appearing in locality analyses” and “composition of momentum
appearing in translational-invariance analyses”—two notions which are usually confused with each
other due to the fact that in a classical spacetime they coincide.
the formulation of second quantization with κ-Minkowski noncommutativity. As a matter of fact, I here
provided the first ever translationally-invariant formulation of an interacting theory in κ-Minkowski.
All previous attempts had been made within quantum field theory, and led to unsatisfactory results,
particularly concerning global translational invariance. Perhaps the results I here reported could
provide guidance for improving upon previous attempts at formulating interacting quantum field
theories in κ-Minkowski. In particular, it might be appropriate to make room for some novel notion
of “coincidence of points”—a possibility which had not been considered in previous attempts. I see
a hint pointing in this direction in the structure of my translationally-invariant harmonic potential:
unlike standard harmonic potentials, the potential in my Equation (14) does not vanish when the
coordinates of the particles coincide: the potential in Equation (14) vanishes for $x_1^A = x_1^B$ and $x_2^A = x_2^B$
only if the momenta also coincide ($p_1^A = p_1^B$). This is reminiscent of some results obtained within the
recently-proposed relative-locality framework [13], where the only meaningful notion of “coincidence”
is a phase-space notion (not a notion that could be formulated exclusively in spacetime). This suggests
that one could perhaps improve upon previous attempts to formulate interacting quantum field
theories in κ-Minkowski by exploiting quantum-field-theory results being developed [29] for the
relative-locality framework.
Another direction for future studies which might bring some enlightenment concerns building
interacting theories with full relativistic covariance. Herein I focused on translation transformations
because it was sufficient for the purposes of my study, but it would be interesting to ask what
additional constraints would arise if one insists on full relativistic covariance (including boosts and
spatial rotations) rather than just translational invariance. For the law of composition of momenta
based on locality, a fully consistent relativistic picture is already known [13,14,29], and its consistency
with κ-Minkowski noncommutativity is well established. Important insight might be gained by
establishing whether or not analogous results are available for the law of composition of momenta
based on translational invariance of my interacting Hamiltonian.
References
1. Majid, S. Meaning of Noncommutative Geometry and the Planck-Scale Quantum Group. Lect. Notes Phys.
2000, 541, 227.
2. Kosinski, P.; Lukierski, J.; Maslanka, P. Local Field Theory on κ-Minkowski Space, Star Products and
Noncommutative Translations. Czechoslov. J. Phys. 2000, 50, 1283–1290.
3. Agostini, A.; Lizzi, F.; Zampini, A. Generalized Weyl systems and kappa-Minkowski space. Mod. Phys. Lett. A
2002, 17, 2105–2126.
4. Lukierski, J.; Ruegg, H.; Zakrzewski, W.J. Classical and Quantum Mechanics of Free κ Relativistic Systems.
Ann. Phys. 1995, 243, 90–116.
5. Amelino-Camelia, G.; Lukierski, J.; Nowicki, A. Distance Measurement and κ-Deformed Propagation of
Light and Heavy Probes. Int. J. Mod. Phys. A 1999, 14, 4575–4588.
6. Kowalski-Glikman, J.; Nowak, S. Doubly Special Relativity theories as different bases of κ-Poincaré algebra.
Phys. Lett. B 2002, 539, 126–132.
7. Rovelli, C. Loop Quantum Gravity. Living Rev. Relativ. 2008, 11, 5.
8. Smolin, L. Quantum gravity with a positive cosmological constant. arXiv 2002, arXiv:hep-th/0209079.
9. Amelino-Camelia, G.; Smolin, L.; Starodubtsev, A. Quantum symmetry, the cosmological constant and
Planck scale phenomenology. Class. Quant. Grav. 2004, 21, 3095–3110.
10. Noui, K. Three Dimensional Loop Quantum Gravity: Particles and the Quantum Double. J. Math. Phys. 2006,
47, 102501.
11. Freidel, L.; Livine, E.R. 3-D quantum gravity and non-commutative quantum field theory. Phys. Rev. Lett.
2006, 96, 221301.
12. Oriti, D.; Ryan, J. Group field theory formulation of 3D quantum gravity coupled to matter fields.
Class. Quant. Grav. 2006, 23, 6543–6576.
13. Amelino-Camelia, G.; Freidel, L.; Kowalski-Glikman, J.; Smolin, L. The principle of relative locality.
Phys. Rev. D 2011, 84, 084010.
14. Amelino-Camelia, G. Quantum Spacetime Phenomenology. Living Rev. Relativ. 2013, 16, 5.
15. Lukierski, J. From noncommutative space-time to quantum relativistic symmetries with fundamental mass
parameter. In Proceedings of the Second International Symposium on Quantum Theory (QTS2), Krakow,
Poland, 18–21 July 2001.
16. Maggiore, M. The Atick-Witten free energy, closed tachyon condensation and deformed Poincare’ symmetry.
Nucl. Phys. B 2002, 69, 647.
17. Amelino-Camelia, G. Doubly-Special Relativity: First Results and Key Open Problems. Int. J. Mod. Phys. D
2002, 11, 1643.
18. Kowalski-Glikman, J. Introduction to Doubly Special Relativity. Lect. Notes Phys. 2005, 669, 131–159.
19. Girelli, F.; Livine, E.R. Physics of Deformed Special Relativity. Braz. J. Phys. 2005, 35, 432–438.
20. Jacobson, T.; Liberati, S.; Mattingly, D. Lorentz violation at high energy: Concepts, phenomena and
astrophysical constraints. Ann. Phys. 2006, 321, 150–196.
21. Hossenfelder, S. Multi-Particle States in Deformed Special Relativity. Phys. Rev. D 2007, 75, 105005.
22. Mignemi, S. Doubly special relativity and translation invariance. Phys. Lett. B 2009, 672, 186–189.
23. Magpantay, J.A. Dual doubly special relativity. Phys. Rev. D 2011, 84, 024016.
24. Amelino-Camelia, G.; Freidel, L.; Kowalski-Glikman, J.; Smolin, L. Relative locality and the soccer ball
problem. Phys. Rev. D 2011, 84, 087702.
25. Hossenfelder, S. Comment on “Relative locality and the soccer ball problem”. Phys. Rev. D 2013, 88, 028701.
26. Amelino-Camelia, G.; Freidel, L.; Kowalski-Glikman, J.; Smolin, L. Noisy soccer balls. Phys. Rev. D 2013,
88, 028702.
27. Snyder, H.S. Quantized Space-Time. Phys. Rev. 1947, 71, 38.
28. Yang, C.N. On Quantized Space-Time. Phys. Rev. 1947, 72, 874.
29. Freidel, L.; Rempel, T. Scalar Field Theory in Curved Momentum Space. arXiv 2013, arXiv:1312.3674.
© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
entropy
Article
Structure of Multipartite Entanglement in Random
Cluster-Like Photonic Systems
Mario Arnolfo Ciampini 1, *, Paolo Mataloni 1 and Mauro Paternostro 2
1 Dipartimento di Fisica, Sapienza Università di Roma, Piazzale Aldo Moro 5, Rome 00185, Italy;
[email protected]
2 Centre for Theoretical Atomic, Molecular and Optical Physics, School of Mathematics and Physics,
Queen’s University Belfast, Belfast BT7 1NN, UK; [email protected]
* Correspondence: [email protected]; Tel.: +39-06-4991-3526
Abstract: Quantum networks are natural scenarios for the communication of information among
distributed parties, and the arena of promising schemes for distributed quantum computation.
Measurement-based quantum computing is a prominent example of how quantum networking,
embodied by the generation of a special class of multipartite states called cluster states, can be used
to achieve a powerful paradigm for quantum information processing. Here we analyze randomly
generated cluster states in order to address the emergence of correlations as a function of the density
of edges in a given underlying graph. We find that the most widespread multipartite entanglement
does not correspond to the highest amount of edges in the cluster. We extend the analysis to higher
dimensions, finding similar results, which suggest the establishment of small world structures in
the entanglement sharing of randomised cluster states, which can be exploited in engineering more
efficient quantum information carriers.
1. Introduction
In 1929, the Hungarian author Karinthy famously set out the concept of six degrees of separation [1],
the conjecture according to which any two living entities on Earth are distant by no more than five
intermediate steps. This concept was reprised and developed later on more rigorous sociological and
statistical grounds. Remarkably, for instance, a variation of the six degrees was unveiled by the group
of Barabasi in 1999 [2], who predicted that any page in the World Wide Web can be reached from any
other one with only nineteen intermediate steps (or clicks) on average.
As counterintuitive as these results might look, they are actually based on a very solid concept in
graph theory, namely the emergence of small worlds from connected networks. A small-world network
is a type of mathematical graph in which most nodes are not neighbours of one another, but can be
reached from every other one by a small number of steps that actually grows logarithmically with
the number of nodes themselves. The six and nineteen degrees of separation highlighted above are
different yet similar manifestations of the emergence of small worlds in a network.
Can these concepts be exported to the quantum domain? While the theory of quantum networks
has found fertile applications in quantum communication [3] and ground-breaking results in the
proposal of quantum repeaters for the faithful long-haul transport of quantum information [4,5], the
implications of the emergence of small worlds have been far less explored, and mostly confined to
studies of excitation-transport and the analysis of the transition from localised to delocalised regimes
in spatially extended interacting-particle models [6,7].
Here, inspired by the analogy between classical network bonds and the correlations set between
two elements of a given network of quantum particles, we aim at exploring different aspects.
In particular, motivated by the current experimental state-of-the-art in linear optics, which makes
available controllable networks of interconnected information carriers, we address the emergence
of typical lengths in the entanglement established by a random set of unitary gates applied to the
elements of a given graph. Specifically, we focus on a class of operations and networks,
i.e., those typically put in place in the procedure for the creation of so-called cluster states, which are
resources for measurement-based quantum computing [8].
Such a computational paradigm, which has been demonstrated to be equivalent to any circuital quantum
computing protocol, is of fundamental importance in quantum information processing. Linear-optics
measurement-based quantum information processing has emerged as a promising avenue for the
exploration of controllable quantum protocols. Encoding and entangling qubits in more than one
degree of freedom of photons is a promising avenue for the generation of medium-to-large scale
photonic cluster states: hyperentanglement-based protocols have so far allowed for the creation of
cluster states of up to 6 qubits [9], which have been used to validate fundamental one-way quantum
algorithms [10,11].
In this paper, by randomising the application of the elementary gates needed to engineer a cluster
state of a given size, we induce the establishment of small worlds in the underlying network of a given
physical system, and address how the spreading of entanglement across the network itself is affected by
the degree of stochasticity of such gates. We unveil an interesting hierarchy with which entanglement
appears in subnetworks of growing size: only a sufficient degree of determinism allows for the settling
of multipartite entanglement within a given cluster lattice, the threshold for k-element entanglement
depending neatly on the number of elements k itself. Moreover, we illustrate a fundamental difference
between the phenomenology illustrated in this paper and recently introduced concepts of classical
entanglement percolation [12].
The significance of this study goes beyond the context set by cluster states and measurement-based
quantum information processing and addresses the fundamental concept of entanglement [13]. In fact,
the emergence of different lengths at which bipartite and multipartite entanglement emerge from
a set of entangling transformations applied to the elements of a given network, provides insightful
information on the entanglement sharing structure. In turn, such information could be used to design
better resources for quantum information protocols, obtained by applying only a small subset of
entangling operations, rather than the whole set determined by the size of the network itself, and nevertheless
bearing entanglement-sharing properties very close to those of the fully connected network.
The remainder of this paper is organised as follows. In Section 2.1 we present randomly generated
cluster states as the platform for our investigation; in Section 2.2 we focus our attention on four-qubit
cluster states, presenting a rich analysis on the interplay between stochasticity of the gates used to set
the network and the settling of bipartite and multipartite entanglement. In Section 2.3 we extend our
analysis to larger networks.
2. Results
2.1. Theoretical Framework
The approach that we use in order to investigate the core question of our work can be schematised
as follows:
1. We set the value of the threshold q and generate a suitable number of random variables $p_{ij} \in [0, 1]$,
which embody the probabilities to apply the gate $\mathrm{CPHASE}_{i,j}(\pi)$ to the pair of qubits $(e_i, e_j)$.
2. We compare $p_{ij}$ to q. Should it be $p_{ij} < q$ ($p_{ij} > q$), $\mathrm{CPHASE}_{i,j}(\pi)$ is (not) applied. We exhaust the
number of all inequivalent pairs of qubits in the network. This produces the network state $|\psi\rangle_\Sigma$,
where $\Sigma = \{e_1, \ldots, e_N\}$ is the set of qubits of the register.
3. We compute the reduced density matrices $\rho_\sigma = \mathrm{Tr}_{\Sigma \setminus \sigma}[|\psi\rangle\langle\psi|_\Sigma]$ that are obtained upon tracing the
overall state over all qubits but those in the subset $\sigma \subset \Sigma$.
4. We calculate the percent fraction of such reductions that are entangled at the set value of q.
Entropy 2017, 19, 473
5. In order to eliminate any dependence on the specific random pattern of applications of the joint
gate, we repeat the procedure above for a number $Q \gg 1$ of instances.
6. When Q is reached, we change q and repeat the protocol from point 1 to 5.
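The six steps above can be sketched in a few lines of NumPy. This is only an illustrative state-vector implementation under stated assumptions: the register starts in the product state of $|+\rangle$ on every qubit, $\mathrm{CPHASE}(\pi)$ is applied as a sign flip on basis states where both qubits are 1, and a reduction is counted as mixed when its purity $\mathrm{Tr}[\rho_\sigma^2]$ falls below 1 within a numerical tolerance. The function names, the tolerance, and the sample sizes are hypothetical choices, not those of the original numerics.

```python
import itertools
import numpy as np

def random_cluster_state(n, q, rng):
    """Steps 1-2: start from |+>^n and apply CPHASE(pi) to each pair with p_ij < q."""
    dim = 2 ** n
    psi = np.full(dim, dim ** -0.5, dtype=complex)
    basis = np.arange(dim)
    edges = []
    for i, j in itertools.combinations(range(n), 2):
        if rng.random() < q:                      # p_ij drawn uniformly in [0, 1)
            both_one = ((basis >> i) & 1) & ((basis >> j) & 1)
            psi = psi * (1 - 2 * both_one)        # sign flip where both qubits are 1
            edges.append((i, j))
    return psi, edges

def purity(psi, n, keep):
    """Step 3: purity Tr[rho^2] of the reduction onto the qubits listed in `keep`."""
    axes = [n - 1 - k for k in keep]              # qubit k is bit k of the basis index
    tensor = np.moveaxis(psi.reshape([2] * n), axes, list(range(len(axes))))
    m = tensor.reshape(2 ** len(axes), -1)
    rho = m @ m.conj().T
    return float(np.real(np.trace(rho @ rho)))

def mixed_fraction(n, q, k, instances, rng, tol=1e-9):
    """Steps 4-5: percentage of k-qubit reductions with purity < 1, over many instances."""
    subsets = list(itertools.combinations(range(n), k))
    mixed = 0
    for _ in range(instances):
        psi, _edges = random_cluster_state(n, q, rng)
        mixed += sum(purity(psi, n, s) < 1 - tol for s in subsets)
    return 100.0 * mixed / (instances * len(subsets))

if __name__ == "__main__":
    rng = np.random.default_rng(seed=1)
    # Step 6: scan the threshold q.
    for q in (0.2, 0.5, 0.8, 1.0):
        print(q, mixed_fraction(4, q, 1, 200, rng), mixed_fraction(4, q, 2, 200, rng))
```

A dense state vector keeps the sketch short and is adequate for the small registers discussed here; for large N a stabilizer or graph-based description would be the natural replacement.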
Needless to say, the number of applications of CPHASEi,j (π ) at a set value of the threshold
depends strongly on the actual value of q itself: the larger the chosen value of q, the higher the number
of gate applications. This is illustrated in Figure 1, where we show the different configurations achieved
for a network of N = 8 elements for q = 0.2, 0.5 and 1, which is associated with a fully connected graph.
It is important to remark that, in our notation as well as in Figure 1, a bond connecting elements ei and
e j only means that gate CPHASEi,j (π ) was applied, and does not imply the existence of entanglement
between such elements.
Figure 1. Examples of instances of N = 8 qubit random cluster states. For (a–c) we have taken
q = 0.2, q = 0.5, and q = 1, respectively.
of the reduced density matrix $\rho_\sigma$, and use the fact that, given the overall pure nature of $|\psi\rangle_\Sigma$, a value of
$P_\sigma < 1$ necessarily implies entanglement in the bipartition $(\Sigma \setminus \sigma)|\sigma$. We have thus implemented the
protocol illustrated in Section 2.1 by calculating, in step 4, the percentage of reductions with $P_\sigma < 1$.
In order to illustrate the salient features of our analysis, we now address explicitly the case of
N = 4, for which Σ = {e1 , . . . , e4 }. The state that would be produced by applying CPHASEi,j (π ) gates
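to every pair of qubits in the network, which would correspond to choosing q = 1, reads
$$
\begin{aligned}
|\psi\rangle_\Sigma
&= \frac{1}{\sqrt{2}} \hat{H}_{e_4} \left( |\phi^+\rangle_{e_1 e_4} |\phi^-\rangle_{e_2 e_3} + |\psi^+\rangle_{e_1 e_4} |\psi^-\rangle_{e_2 e_3} \right) \\
&= \frac{1}{\sqrt{2}} \hat{H}_{e_3} \left( |\phi^-\rangle_{e_1 e_2} |\phi^+\rangle_{e_3 e_4} - |\psi^+\rangle_{e_1 e_2} |\psi^-\rangle_{e_3 e_4} \right) \\
&= \frac{1}{\sqrt{2}} \hat{H}_{e_2} \left( |\phi^-\rangle_{e_1 e_3} |\phi^+\rangle_{e_2 e_4} - |\psi^+\rangle_{e_1 e_2} |\psi^-\rangle_{e_2 e_4} \right) \\
&= \frac{1}{\sqrt{2}} \hat{H}_{e_1} \left( |\phi^-\rangle_{e_1 e_2} |\phi^+\rangle_{e_3 e_4} - |\psi^+\rangle_{e_1 e_2} |\psi^-\rangle_{e_3 e_4} \right)
\end{aligned}
\tag{2}
$$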
where $\hat{H}_{e_j}$ is the Hadamard gate on qubit $e_j$ and we have introduced the Bell states
$|\phi^\pm\rangle_{e_i e_j} = (|00\rangle \pm |11\rangle)_{e_i e_j}/\sqrt{2}$, $|\psi^\pm\rangle_{e_i e_j} = (|01\rangle \pm |10\rangle)_{e_i e_j}/\sqrt{2}$. The orthogonality of Bell states ensures that entanglement
exists in the three inequivalent bipartitions $(e_i, e_j)|(e_k, e_l)$. Moreover, it is equally straightforward to
check that any single-qubit reduction is maximally mixed. Therefore, also the bipartitions ei |(e j , ek , el )
are entangled. This implies that for q = 1 we expect all six bipartitions that can be identified to be
inseparable and the state to be genuinely multipartite entangled. The purity of the associated reduced
states is thus necessarily smaller than one. However, for q < 1 the number of mixed-state reductions is
not necessarily as large as six, and our calculations aim at quantifying the percentage of such reduced
states as q is varied.
The results of such calculations are presented in Figure 2 (blue and red dots), where each data
point is the result of an average over Q = 5000 random instances, a sample-size that was large
enough to ensure convergence of the numerics. The error bars attached to each point show the
uncertainty associated with the averages, calculated as the standard deviation of each Q-sized sample
divided by $\sqrt{Q}$. Clearly, for q = 0 the state of the network is deterministically found to be the
factorised initial state $\otimes_{j=1}^{4} |+\rangle_{e_j}$, while for q = 1 we retrieve the result anticipated above (Equation (2)).
In between such extreme situations, the number of inseparable two-vs.-two and one-vs.-three qubits
bipartitions (equivalently, mixed two-qubit and one-qubit states) grows monotonically with q, albeit at
slightly different rates. In particular, we find that the percentage fraction of inseparable two-vs.-two
(three-vs.-one) qubits bipartitions exceeds 99.9% at q = 0.82 ± 0.01 (q = 0.89 ± 0.01), as shown by the
vertical dashed line marked as T2 (T3 ) in Figure 2. The nominal positions (uncertainties) of T2,3 have
been obtained as the average (standard deviations) over 100 analytical non-linear interpolations of
the results of our simulations, each producing the functions f 2,3 (q) (whose averages are shown by the
blue and red lines in Figure 2) that have been used to solve numerically the equations f 2,3 (q) = 99.9.
Quite clearly, $T_2 \neq T_3$ beyond statistical errors, which implies that the random network at hand requires
a higher threshold in q to produce a complete set of inseparable one-vs.-three qubits bipartitions.
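The two statistical ingredients of this paragraph, the error bar on each data point and the location of the 99.9% threshold, can be sketched as follows. The error bar is the sample standard deviation divided by $\sqrt{Q}$ (the use of the unbiased estimator, `ddof=1`, is an assumption). For the threshold, the sketch swaps in a plain linear interpolation between neighbouring data points in place of the analytical non-linear interpolations used in the text, so it should be read as a rough stand-in; all function names are hypothetical.

```python
import numpy as np

def mean_and_standard_error(samples):
    """Average of a Q-sized sample and its standard deviation divided by sqrt(Q)."""
    s = np.asarray(samples, dtype=float)
    return s.mean(), s.std(ddof=1) / np.sqrt(s.size)

def threshold_crossing(q_values, fractions, level=99.9):
    """Smallest q at which the (piecewise-linear) curve reaches `level`, or None.

    Assumes q_values is sorted in increasing order.
    """
    q = np.asarray(q_values, dtype=float)
    f = np.asarray(fractions, dtype=float)
    above = np.nonzero(f >= level)[0]
    if above.size == 0:
        return None                       # the curve never reaches the level
    i = above[0]
    if i == 0:
        return float(q[0])
    # Linear interpolation between the last point below and the first point at/above.
    return float(q[i - 1] + (level - f[i - 1]) * (q[i] - q[i - 1]) / (f[i] - f[i - 1]))
```

Repeating `threshold_crossing` on resampled data sets and taking the mean and spread of the results would mimic the way the nominal positions and uncertainties of the thresholds are quoted above.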
Figure 2. We study the percentage fraction of mixed-state reductions that can be identified in a network
of N = 4 elements, against the threshold parameter q. The blue (red) dots show the results of the
numerical experiment aimed at quantifying the fraction of mixed two-qubit (one-qubit) reductions.
The orange points identify the values of the percentage fraction F2 of two-qubit reductions whose
purity is exactly 1/4. The solid lines are non-linear interpolations of the data points. Each point is the
result of an average over a sample of Q = 5000 elements. Error bars show the standard deviations
associated with such averages. Dashed lines T2,3 identify the value of q at which the number of mixed
two- and one-qubit reductions is at least 99.9% of the possible ones. The line labelled max[F2 ] identifies
the value of q at which the maximum of F2 occurs.
Needless to say, the empirical rule of “no free lunch” applies here as well: the establishment of
multipartite entanglement in the network under scrutiny has to come at the expense of something
else, in light of the monogamy of entanglement. The specific algorithm at hand allows us to explore
who pays the toll represented by the establishment of genuine multipartite entanglement in the
random network.
In particular, we expect bipartite entanglement to be affected by the emergence of multipartite entanglement.
Such expectation is corroborated by the analysis summarized by the orange dots and curve in Figure 2,
which show the percentage fraction F2 of two-vs.-two qubits reductions of random states at a given
value of q that have purity exactly equal to 1/4, which is the lowest a two-qubit state can achieve and
witnesses maximum entanglement across the (ei , e j )|(ek , el ) bipartition. Quite intuitively, F2 grows at
small values of q: a low threshold implies a very small probability of applying multiple CPHASE gates,
which inevitably favours the construction of maximally entangled two-qubit states. For q close to 1, we have
a large probability that one qubit is affected by multiple CPHASE gates. Intuitively, this should be
able to set strong multipartite entanglement and deplete the degree of bipartite entanglement, and we expect F2
to decrease accordingly. Indeed, we know that at q = 1 we have a genuinely multipartite entangled state.
The orange dots in Figure 2 confirm such expectation, and show the occurrence of a maximum of F2
that is close, yet not identical, to the chosen thresholds T2,3 discussed above (we have that max[F2 ]
occurs at q = 0.72 ± 0.01).
Of course, counting the number of reductions that are in mixed states does not provide full
information about the multipartite nature of the entanglement that is established among the elements of
the network. We recall that a pure N-partite state is called genuinely multipartite entangled if it is
not separable with respect to any of the possible bipartitions of its N elements. One can thus check
the multipartite nature of the entanglement of a given pure state by counting the number of separable
bipartitions that can be drawn. As each instance of our random sample is a pure state, we have decided
to approach this task by using the N-partite generalisation of negativity defined as
$$E_N = \sqrt[N]{\prod_{\{\sigma\}} E_{\sigma|\Sigma\setminus\sigma}} , \qquad (3)$$
where Eσ|Σ\σ is the negativity of the partially transposed density matrix of the bipartition σ|Σ\σ and
the product extends to all the bipartitions. We recall the definition of negativity as
$$E_{\sigma|\Sigma\setminus\sigma} = \max\Big[0,\, -2 \sum_j \lambda^-_j\Big] \qquad (4)$$
with $\{\lambda^-_j\}$ the set of negative eigenvalues of the partially transposed (with respect to any of the
subparties) density matrix of the bipartition σ|Σ\σ. The geometric average upon which Equation (3)
is built is null whenever at least one of the bipartitions of the network is positive under partial
transposition. Therefore, for pure states, only if all bipartitions are certified inseparable according
to the partial transposition criterion is the state of the network genuinely multipartite entangled.
The situation is much more difficult when mixed states are considered, for which the non-nullity of
the quantity in Equation (3) is no guarantee of the existence of genuine multipartite entanglement in
a given state [14].
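As an illustration of Equations (3) and (4), the following Python sketch computes the negativity of every bipartition of an n-qubit density matrix and combines the values into E_N. It assumes NumPy is available; the function names and the numerical tolerance are illustrative choices and are not taken from the numerical code used for the figures. A four-qubit GHZ state serves only as a simple check.

```python
import itertools
import numpy as np

def partial_transpose(rho, subset, n):
    """Transpose the qubits listed in `subset` of an n-qubit density matrix."""
    tensor = rho.reshape([2] * (2 * n))      # axes: n row indices, n column indices
    perm = list(range(2 * n))
    for k in subset:
        perm[k], perm[k + n] = perm[k + n], perm[k]   # swap row/column index of qubit k
    return tensor.transpose(perm).reshape(2 ** n, 2 ** n)

def negativity(rho, subset, n, tol=1e-12):
    """Equation (4): max[0, -2 * (sum of negative eigenvalues)]."""
    eigs = np.linalg.eigvalsh(partial_transpose(rho, subset, n))
    return max(0.0, -2.0 * eigs[eigs < -tol].sum())

def bipartitions(n):
    """Yield each bipartition once, as the part that contains qubit 0."""
    for size in range(0, n - 1):
        for extra in itertools.combinations(range(1, n), size):
            yield (0,) + extra

def multipartite_negativity(rho, n):
    """Equation (3): N-th root of the product over all bipartitions."""
    product = 1.0
    for subset in bipartitions(n):
        product *= negativity(rho, subset, n)
    return product ** (1.0 / n)

# Check on a four-qubit GHZ state and on a product state.
ghz = np.zeros(16)
ghz[0] = ghz[15] = 1 / np.sqrt(2)
rho_ghz = np.outer(ghz, ghz)

zero = np.zeros(16)
zero[0] = 1.0
rho_product = np.outer(zero, zero)

e4_ghz = multipartite_negativity(rho_ghz, 4)
e4_product = multipartite_negativity(rho_product, 4)
```

For the GHZ state every one of the seven bipartitions has unit negativity, so the product (and hence E_4) equals 1, whereas a single separable bipartition is enough to make the product vanish.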
Figure 3 shows the behavior of E4 against q. While for q > 0 we always have four-partite
entanglement (in line with the finding in Figure 2), it is remarkable that q = 1 is not associated with
the largest degree of four-partite negativity, which actually occurs at q = 0.72 ± 0.01.
Figure 3. Average four-partite negativity E4 plotted against q obtained for a sample of Q = 5000 random
network states. The error bars are the standard deviations associated with the averages. The orange
solid line is a non-linear interpolating function whose maximum is achieved at q = 0.72 ± 0.01 (vertical
dashed line).
We continue the assessment of the four-partite case by pointing out the differences between the
average behavior of the figures of merit addressed herein and the values taken by such indicators over
the average state of the network. The latter is defined as the state obtained upon averaging over Q
random instances of network states. Formally, by assuming all instances to be equally likely to occur
(which is entailed by choosing the probabilities to apply gates CPHASEi,j (π ) uniformly), the physical
state of the system is described by the density matrix
$$\rho_\Sigma = \frac{1}{Q} \sum_{j=1}^{Q} |\psi\rangle\langle\psi|_{\Sigma,j} , \qquad (5)$$
where $|\psi\rangle\langle\psi|_{\Sigma,j}$ is the jth random state of the Q-sized sample.
With the exception of the cases associated with q = 0, 1 (when we sum identically prepared states),
by averaging we lose the purity of the network state: PΣ reaches values as low as 0.14 for q = 0.5
(cf. Inset (a) of Figure 4), which is however larger than the minimum purity 1/16 achievable by a
four-qubit state. Despite being mixed, the average state of the network preserves significant quantum
coherences as quantified by the measure proposed in [15] and formalised as
$$C = \sum_{i \neq j} |(\rho_\Sigma)_{ij}| ,$$
with |(ρΣ )ij | the off-diagonal elements of the density matrix ρΣ . The behavior of C against q is shown
in Inset (b) in Figure 4: a minimum of the measure of coherence is achieved in correspondence of the
minimum purity. However, such a minimum is strictly non-null, thus leaving open the possibility of
dealing with a (mixed) state of the network exhibiting a non-trivial entanglement structure. Such a
possibility is confirmed by the analysis of E4 (cf. main panel of Figure 4), which is a growing function
of q (similar trends are exhibited by both the two-vs.-two qubits entanglement E(ei ,e j )|(ek ,el ) , and the
one-vs.-three qubits one E(ei )|(e j ,ek ,el ) ). Nothing remarkable in the behavior of E4 appears to be related
to the value of q = 0.5, although the function changes concavity in correspondence to such a value
of the probability threshold. It should be noticed that, as anticipated, in such an average-state case
EN cannot be interpreted as a quantifier of genuine multipartite entanglement. Indeed, the revelation
of multipartite entanglement in general multiparty mixed states requires a more refined approach
(see [16] for a recent assessment of this point and the provision of useful criteria). Nevertheless, this
figure of merit is still very useful for our analysis, as it provides valuable information on the average
amount of bipartite entanglement within the statistically average state of the network, and we will thus
make further use of EN in the remainder of this work. Finally, the non-nullity of either E(ei )|(e j ,ek ,el ) ’s
or E(ei ,e j )|(ek ,el ) ’s does not exclude the possibility of facing bound entanglement (i.e., non-distillable
entanglement) of the negative-partial-transposition nature [17] in those bipartitions, an issue that goes
beyond the scope of this work.
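The purity and coherence of an equally weighted mixture, as in Equation (5), can be sketched numerically as follows. The snippet assumes NumPy and assumes that the coherence measure of [15] meant here is the sum of the absolute values of the off-diagonal density-matrix elements; the function names and the single-qubit example states are hypothetical and chosen only to keep the arithmetic easy to verify by hand.

```python
import numpy as np

def average_state(states):
    """Equal-weight mixture of pure states, in the spirit of Equation (5)."""
    return sum(np.outer(s, s.conj()) for s in states) / len(states)

def purity(rho):
    """Tr(rho^2); equals 1 for pure states and 1/dim for the maximally mixed state."""
    return float(np.real(np.trace(rho @ rho)))

def l1_coherence(rho):
    """Sum of |rho_ij| over all off-diagonal elements (assumed form of the measure)."""
    return float(np.abs(rho).sum() - np.abs(np.diag(rho)).sum())

# Hypothetical single-qubit sample: |+> and |0> with equal weight.
plus = np.array([1.0, 1.0]) / np.sqrt(2)
zero = np.array([1.0, 0.0])
rho = average_state([plus, zero])      # [[0.75, 0.25], [0.25, 0.25]]

p = purity(rho)                        # 0.75
c = l1_coherence(rho)                  # 0.5
```

Averaging reduces the purity below 1 while leaving some off-diagonal weight, which mirrors the qualitative behaviour described for the average network state.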
Figure 4. Main panel: Logarithmic plot of the entanglement within the average state ρΣ of an
N = 4 random network against the threshold probability q. The red dots show the value taken
by the four-partite negativity E4 , while the blue and orange ones are for the entanglement within
the bipartitions (ei , e j )|(ek , el ) and (ei )|(e j , ek , el ). The lines connecting the dots are simply guides to
the eye. Inset (a): Purity PΣ of the average state against q. The dashed horizontal line shows the
minimum purity of a four-qubit state. Inset (b): Values taken by the measure of coherence C against
the threshold probability.
To finish the study of this paradigmatic case, we report in the main panel of Figure 5 the behavior
of E3 in the four three-qubit reduced states that can be singled out from our network. We have used
the tripartite version of Equation (3) to quantify the entanglement and changed our notation so as to
make explicit the triplets of elements of the network that we have considered. Moreover, by tracing
out two elements, we have evaluated the residual two-qubit entanglement, whose average across the
six two-qubit reductions is displayed in the inset of Figure 5. The general trend of such figures of
merit follows the expectation that, in the large-q region, the entanglement in the reductions is depleted
to favour the emergence of multipartite entanglement. Moreover, their quantitative value is, in general, very
small. A point of note is that the peaks of three- and two-qubit negativity do not occur at the same
value of q, thus suggesting an interesting hierarchy of values of q at which the various structures of
entanglement across the system are triggered or destroyed.
Figure 5. Main panel: E3 in the three-qubit reductions (extracted from an N = 4 network) identified in
the legend, plotted against q. Each plot is an average over Q = 5000 realisations of the random network
state (we omit the error bars for clarity of presentation). Inset: Mean bipartite negativity E bip averaged
over the six two-qubit reduced states that can be singled out from our network. Same conditions as in
the main panel.
Figure 6. We study the percentage fraction of mixed-state reductions that can be identified in a network
of N = 5 elements, against the threshold parameter q. The red dots show the results of the numerical
experiment aimed at quantifying the fraction of mixed two- and three-qubit reductions, which actually
coincide. The purple dots show the results for the one-qubit reductions. The orange points identify the
values of the percentage fraction F2 of two-qubit reductions whose purity is exactly 1/4. The solid lines
are non-linear interpolations of the data points. Each point is the result of an average over a sample of
Q = 10^4 elements. Error bars show the standard deviations associated with such averages. Dashed
lines T2,3 (T4 ) identify the value of q at which the number of mixed two- and three-qubit (one-qubit)
reductions is at least 99.9% of the possible ones. The line labelled max[F2 ] identifies the value of q at
which the maximum of F2 occurs.
Figure 7. Main panel: Logarithmic plot of the entanglement within the average state ρΣ of an
N = 5 random network against the threshold probability q. The red dots show the value taken by E5 ,
while the blue and orange ones are for the entanglement within the bipartitions (ei , e j )|(ek , el , em ) and
(ei )|(e j , ek , el , em ). The lines connecting the dots are simply guides to the eye. Inset (a): Purity PΣ of
the average state against q. The dashed horizontal line shows the minimum purity of a four-qubit state.
Inset (b): Values taken by the measure of coherence C against the threshold probability.
The trend is clear: as we look into larger networks, the value of Tk (k = 2, 3, . . . ) decreases.
Table 1. The table shows the threshold value of q at which the fraction of progressively larger reductions
in an N-element random network is at least 99.9%. Black squares stand for unavailable data at that
size of the network. As before, max[F2 ] is the value of q at which the maximum of F2 occurs.
N        4      5      6      ···    9
max F2   0.72   0.66   0.64          0.40
T2       0.82   0.67   0.57          0.39
T3       0.89   0.67   0.54          0.31
T4       ■      0.818  0.57          0.27
T5       ■      ■      0.75          0.27
T6       ■      ■      ■             0.31
T7       ■      ■      ■             0.39
T8       ■      ■      ■             0.40
such a percentage remains always very small, regardless of q, showing that no classical entanglement
percolation effect occurs, as there is no value of q at which long-distance entanglement within the
network is set deterministically. The results should be considered as canonical, qualitatively valid
regardless of the actual choice of N, and indicative of the profound differences between the situation
addressed here and the study in [12].
3. Discussion
We have studied the entanglement sharing structure among the elements of a qubit network
subjected to probabilistic CPHASE gates. We have highlighted the existence of statistically inequivalent
thresholds in the probability of application of the gates for the settling of entanglement in various
subsets of network elements, thus unveiling an interesting hierarchy in the entanglement distribution
pattern of a given network. The phenomenology that we have highlighted cannot be understood
in terms of the statistical properties of an intuitive, yet too naive, reference state such as the one
obtained by averaging over all the elements of the random set of states generated in our numerical
experiments: the above-mentioned hierarchy is a statistical feature of random networks rather than a
property of the statistically average state of the network. Remarkably, small-world structures in the
entanglement sharing of the random set of network states appear to emerge. This is an interesting
feature that deserves more attention and upon which we plan to focus our forthcoming (theoretical
and experimental) efforts.
References
1. Karinthy, F. Láncszemek. In Minden Masképpen van, 1929. Available online:
http://mek.oszk.hu/15500/15588/15588.pdf (accessed on 5 September 2017). (In Hungarian)
2. Albert, R.; Jeong, H.; Barabasi, A.-L. Internet: Diameter of the World-Wide Web. Nature 1999, 401, 130–131.
3. Kimble, H.J. The quantum internet. Nature 2008, 453, 1023–1030.
4. Munro, W.J.; Harrison, K.A.; Stephens, A.M.; Devitt, S.J.; Nemoto, K. From quantum multiplexing to
high-performance quantum networking. Nat. Photonics 2010, 4, 792–796.
5. Epping, M.; Kampermann, H.; Bruß, D. Robust entanglement distribution via quantum network coding.
New J. Phys. 2016, 18, 103052.
6. Zhu, C.P.; Xiong, S.-J. Localization-delocalization transition of electron states in a disordered quantum
small-world network. Phys. Rev. B 2000, 62, 14780.
7. Giraud, O.; Georgeot, B.; Shepelyansky, D.L. Tuning clustering in random networks with arbitrary degree
distributions. Phys. Rev. E 2005, 72, 036203.
8. Briegel, H.J.; Browne, D.E.; Dür, W.; Raussendorf, R.; Van den Nest, M. Measurement-based quantum
computation. Nat. Phys. 2009, 5, 19.
9. Vallone, G.; Donati, G.; Ceccarelli, R.; Mataloni, P. Six-qubit two-photon hyperentangled cluster states:
Characterization and application to quantum computation. Phys. Rev. A 2010, 81, 052301.
10. Vallone, G.; Pomarico, E.; De Martini, F.; Mataloni, P. One-way quantum computation with two-photon
multiqubit cluster states. Phys. Rev. A 2008, 78, 042335.
11. Ciampini, M.A.; Orieux, A.; Paesani, S.; Sciarrino, F.; Corrielli, G.; Crespi, A.; Ramponi, R.; Osellame, R.;
Mataloni, P. Path-polarization hyperentangled and cluster states of photons on a chip. Light Sci. Appl. 2016,
5, e16064.
12. Acín, A.; Cirac, J.I.; Lewenstein, M. Entanglement Percolation in Quantum Networks. Nat. Phys. 2007, 3, 256.
13. Horodecki, R.; Horodecki, P.; Horodecki, M.; Horodecki, K. Quantum entanglement. Rev. Mod. Phys. 2009,
81, 865.
14. Huber, M.; Mintert, F.; Gabriel, A.; Hiesmayr, B.C. Detection of high-dimensional genuine multipartite
entanglement of mixed states. Phys. Rev. Lett. 2010, 104, 210501.
15. Baumgratz, T.; Cramer, M.; Plenio, M.B. Quantifying Coherence. Phys. Rev. Lett. 2014, 113, 140401.
16. Lancien, C.; Gühne, O.; Sengupta, R.; Huber, M. Relaxations of separability in multipartite systems:
Semidefinite programs, witnesses and volumes. J. Phys. A Math. Theor. 2015, 48, 505302.
17. Horodecki, P.; Horodecki, R. Distillation and bound entanglement. Quant. Inf. Comp. 2001, 1, 45.
© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
entropy
Article
Non-Causal Computation
Ämin Baumeler 1 and Stefan Wolf 2, *
1 Faculty of Informatics, Università della Svizzera italiana, 6900 Lugano, Switzerland; [email protected]
2 Facoltà indipendente di Gandria, 6978 Gandria, Switzerland
* Correspondence: [email protected]; Tel.: +41-58-666-4000
Abstract: Computation models such as circuits describe sequences of computation steps that
are carried out one after the other. In other words, algorithm design is traditionally subject to the
restriction imposed by a fixed causal order. We address a novel computing paradigm beyond quantum
computing, replacing this assumption by mere logical consistency: We study non-causal circuits, where
a fixed time structure within a gate is locally assumed whilst the global causal structure between the
gates is dropped. We present examples of logically consistent non-causal circuits outperforming
all causal ones; they imply that suppressing loops entirely is more restrictive than just avoiding
the contradictions they can give rise to. That fact is already known for correlations as well as for
communication, and we here extend it to computation.
1. Introduction
Computations, understood as realized through Turing machines, billiard or ballistic computers [1],
circuits, lists of computer instructions, or otherwise, are often designed to have a linear (i.e., causal)
time flow: After a fundamental operation is carried out, the program counter moves to the next
operation, and so forth. Surely, this is in agreement with our everyday experience; after you finish
reading this sentence, you continue to the next (hopefully), or do something else (in that case: goodbye!).
What sorts of computation become admissible if one drops the assumption of a linear time flow and reduces it to
mere logical consistency? One could imagine that a linear time flow restricts computation strictly beyond
what would be allowed from the purely logical point of view. Indeed, we show this to be true. If the assumption
of a linear time flow is dropped, a variable of the computational device could depend on “past” as
well as “future” computation steps. Such a dependence can be interpreted as loops in the time flow,
e.g., generated by a closed timelike curve [2]. There are two fundamental issues that might make loops
logically inconsistent. One is the liability to the grandfather antinomy. In a loop-like information flow,
multiple contradicting values could potentially be assigned to a variable—the variable is overdetermined.
The other issue is underdetermination: a variable could take multiple consistent values, yet the model
of computation cannot predict which actual value it takes. This underdetermination is also known
as the information antinomy. To overcome both issues, we restrict ourselves to models of computation
where the assumption of a linear time flow is dropped and replaced by the assumption of logical
consistency: All variables are neither overdetermined nor underdetermined. We call such models
of computation non-causal. Our main result is that non-causal models of computation are strictly more
powerful than the traditional causal ones. Therefore, causality is a stronger assumption than logical
consistency in the context of computation. Similar results are also known with respect to quantum
computation [3–7], correlations [5,8–11] as well as communication [12]. As we will show later, such
circuits are “programmed” by introducing a contradiction if an undesired result is found. This is like
guessing the solution to a problem and killing one's own grandfather in the event that the guess was
wrong (similar to “quantum suicide” [13] or “anthropic computing” [14]).
The article is structured as follows. First, we discuss the assumption of logical consistency
in more depth, then we describe a non-causal circuit model of computation and give a few examples
of problems that can be solved more efficiently. We continue by describing other non-causal models
of computations: the non-causal Turing machine and non-causal billiard computer. We conclude
by showing how to efficiently find a satisfying assignment to a SAT formula if the number of satisfying
assignments is previously known.
2. Logical Consistency
Let ρt be the ensemble of all variables (also called state) of a computational model at a time t.
In general, ρt depends on ρt−1 , ρt−2 , . . . . Without loss of generality, assume that ρt depends on ρt−1
only (i.e., the computation is described by a Markov chain). These dependencies are depicted
in Figure 1a. In a non-causal model, however, the values that are assigned to the variables at time t could
in principle depend on “future” time-steps; e.g., the assignment ρ0 could depend on ρm , which results
in a Markovian “bracelet” or circle (see Figure 1b).
Figure 1. Causal and non-causal computation. The arrows point in the direction of computation.
(a) The values that are assigned to the variables of a computational model at time t depend on ρt−1 .
(b) Cyclic dependencies of the values that are assigned to the variables at different steps during the
computation.
A computational model is not overdetermined if and only if the values that are assigned to the
variables do not contradict each other. This is equivalent to the existence of a fixed point [15]
of the Markov chain that results from cutting the “bracelet” at an arbitrary position (see Figure 1b).
Let f be a function that describes the behaviour of this Markov chain. Then, the computational model
is not overdetermined if and only if ∃ x : f ( x ) = x.
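A computational model is not underdetermined if and only if there exists at most one fixed point [15]:
|{ x | x = f ( x )}| ≤ 1 .
A computational model is thus logically consistent (neither overdetermined nor underdetermined) if and
only if there exists exactly one fixed point:
∃!x : f ( x ) = x .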
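The three cases can be made concrete by counting fixed points by brute force over a finite domain. The following Python sketch is only an illustration; the helper names `fixed_points` and `classify` are hypothetical.

```python
def fixed_points(f, domain):
    """All x in the domain with f(x) == x."""
    return [x for x in domain if f(x) == x]

def classify(f, domain):
    """Label the loop obtained by feeding the output of f back into its input."""
    count = len(fixed_points(f, domain))
    if count == 0:
        return "overdetermined"        # grandfather antinomy: no consistent value
    if count == 1:
        return "logically consistent"  # exactly one value can travel on the loop
    return "underdetermined"           # information antinomy: several values fit

bits = [0, 1]
not_loop = classify(lambda x: 1 - x, bits)   # NOT gate fed back into itself
id_loop = classify(lambda x: x, bits)        # identity gate fed back into itself
const_loop = classify(lambda x: 0, bits)     # gate that always outputs 0
```

The NOT loop has no fixed point, the identity loop has two, and only the constant gate yields a unique, and therefore logically consistent, value on the wire.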
Entropy 2017, 19, 326
Figure 2. (a) Overdetermined circuit: The bit 0 is mapped to 1 and vice versa; i.e., there is no consistent
assignment of a value that travels on the wire. (b) Information antinomy: Both 0 and 1 could potentially
travel on the wire, yet the circuit does not specify which.
We model a gate G by a Markov matrix Ĝ with 0–1 entries. Without loss of generality, assume
that the input and output dimensions of a gate are equal. The Markov matrix of the ID gate on a single
bit (see Figure 2b) is
$$\mathbb{1} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} ,$$
and the Markov matrix of the NOT gate on a single bit (see Figure 2a) is
$$\hat{N} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} .$$
Values are modeled by vectors; e.g., in a binary setting, the value 0 is represented by the
vector (1, 0) T and the value 1 is represented by the vector (0, 1) T . In general, an n-dimensional
variable with value i is modeled by the n-dimensional vector i with a 1 at position i, and where all
other entries are 0. A gate is applied to a value via the matrix-vector multiplication; i.e., the output
of G on input a is x = Ĝa. Let F and G be two gates. The Markov matrix of the parallel composition
of both gates is F̂ ⊗ Ĝ. They are composed sequentially with a wire that takes the d-dimensional output
of F and forwards it as input to G. By this, we obtain a new gate H = G ◦ F which represents the
sequential composition. The sequentially composed gate is
$$\hat{H} = \sum_{v=0}^{d-1} \hat{G}\, v v^T \hat{F} = \hat{G} \hat{F} .$$
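These composition rules translate directly into matrix operations. The sketch below assumes NumPy; the helper names `value`, `parallel`, and `sequential` are illustrative and do not appear in the text.

```python
import numpy as np

def value(i, d=2):
    """Unit vector representing the value i of a d-dimensional variable."""
    v = np.zeros(d, dtype=int)
    v[i] = 1
    return v

ID = np.array([[1, 0], [0, 1]])    # Markov matrix of the ID gate
NOT = np.array([[0, 1], [1, 0]])   # Markov matrix of the NOT gate

def parallel(F, G):
    """Parallel composition: Kronecker (tensor) product of the two matrices."""
    return np.kron(F, G)

def sequential(G, F):
    """Sequential composition H = G after F: plain matrix product."""
    return G @ F

# Applying a gate is a matrix-vector product: NOT maps 0 to 1.
out = NOT @ value(0)

# Summing over all values v on the connecting wire reproduces the matrix product.
d = 2
wire_sum = sum(NOT @ np.outer(value(v), value(v)) @ NOT for v in range(d))
```

Here `wire_sum` equals `sequential(NOT, NOT)`, which is the identity, because the projectors onto the wire values sum to the identity matrix.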
By using these rules of composition, a causal circuit can always be modeled by a single gate.
A closed circuit is a circuit where all wires are connected to gates on both sides. Let H be the gate that
describes the composition of all gates for a given causal circuit. We can transform any such circuit into
a closed non-causal circuit by connecting all outputs from H with all inputs to H. A logically consistent
closed circuit is thus a circuit where a unique assignment of a value c to the looping wire exists:
$$\exists!\, c : c^T \hat{H} c = 1 . \qquad (1)$$
In other words, the described closed circuit is logically consistent if and only if the diagonal
of Ĥ consists of 0’s with a single 1. The position of the 1-entry represents the fixed point and the
value c on the looping wire. Note that for a given closed circuit, the gate H is not unique, but might
depend on where the “cut” is introduced. An open circuit is a circuit where some wires are not
connected to a gate on one side. Thus, such a circuit has either an input a, an output x, or both.
A logically consistent open circuit, therefore, is a circuit where for any choice of input a, a unique
assignment of a value c to the looping wire and to the output x exists, such that
$$(x \otimes c)^T \hat{H} (a \otimes c) = 1 ,$$
$$\hat{D}_i = \sum_{v=0}^{d-1} i\, v^T$$
to the input and output wires of C (see Figure 3a,b). The gate Di unconditionally outputs the value i.
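The diagonal criterion for closed circuits and the constant gate can be checked with a few lines of Python. This is a minimal sketch assuming NumPy; `looping_value` and `constant_gate` are hypothetical helper names, and the gate matrices act on column vectors as in the text.

```python
import numpy as np

def looping_value(H):
    """Return the unique fixed point of a 0-1 Markov matrix H, or None.

    H describes a closed circuit after cutting the loop; the circuit is
    logically consistent iff the diagonal of H holds exactly one 1.
    """
    hits = np.flatnonzero(np.diag(H) == 1)
    return int(hits[0]) if len(hits) == 1 else None

def constant_gate(i, d):
    """Gate that outputs the value i for every d-dimensional input."""
    D = np.zeros((d, d), dtype=int)
    D[i, :] = 1   # every column is the unit vector for the value i
    return D

NOT = np.array([[0, 1], [1, 0]])
ID = np.eye(2, dtype=int)

not_result = looping_value(NOT)                  # None: no fixed point
id_result = looping_value(ID)                    # None: two fixed points
const_result = looping_value(constant_gate(2, 4))
```

A looped constant gate is the simplest logically consistent closed circuit: its only fixed point is the value it outputs.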
There is an ambiguity on which wires are regarded as “looping”. We show that two different
representations H and H′ of the same closed non-causal circuit C yield the same computation
(the difference between H and H′ is the identification of the looping wires). Different H and H′
that represent the same non-causal circuit C can be written as H = Q ◦ R and H′ = R ◦ Q. For H,
the looping wires are those that exit Q and enter R, and for H′, vice versa. From Equation (1), we have
$$\exists!\, c : c^T \hat{H} c = c^T \hat{Q} \hat{R} c = c^T \hat{Q} \Big( \sum_e e e^T \Big) \hat{R} c = 1 .$$
where e∗ is the specific value on the wire exiting R and entering Q. Conversely,
holds. The only way H and H′ each have a unique fixed point is with the identification e∗ = e .
Therefore, both representations H and H′ assign the same values to the wires. By the above translation
from open to closed circuits, we see that the same reasoning can be applied to open circuits.
Figure 3. (a) Open circuit C with input a. (b) Closed circuit Ci with a = i → c a = ci . (c) The big
box represents a non-causal comb (note that combs obey causality; the higher-order transformations
described here are equivalent to combs, yet where the causality assumption is dropped) that transforms
a gate (H′) to a new gate, the composition.
$$\operatorname{Tr} \hat{H} = 1 , \qquad (2)$$
$$\forall i, j : \hat{H}_{i,j} \ge 0 ,$$
that is, the diagonal of Ĥ consists of non-negative numbers (probabilities) that add up to 1. Equation (2)
can be interpreted as “the average number of fixed points is 1”. To see this, we decompose H as a convex
combination of deterministic matrices
$$\hat{H} = \sum_i p_i \hat{H}_i ,$$
$$\operatorname{Tr} \hat{H} = \sum_i p_i \operatorname{Tr} \hat{H}_i = 1 .$$
For an arbitrary deterministic matrix D̂, the expression Tr D̂ represents the number of fixed points,
with which we arrive at the stated interpretation.
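The interpretation of the trace as an average number of fixed points can be checked on a small example. The sketch below assumes NumPy; the three deterministic maps and their weights are hypothetical and chosen so that the average comes out as exactly one.

```python
import numpy as np

def deterministic_matrix(f, d):
    """0-1 Markov matrix of the map f on {0, ..., d-1}; column i encodes f(i)."""
    M = np.zeros((d, d))
    for i in range(d):
        M[f(i), i] = 1
    return M

d = 3
H_a = deterministic_matrix(lambda i: [0, 1, 0][i], d)   # two fixed points (0 and 1)
H_b = deterministic_matrix(lambda i: 0, d)              # one fixed point (0)
H_c = deterministic_matrix(lambda i: (i + 1) % d, d)    # no fixed point

weights = [0.25, 0.5, 0.25]
H = weights[0] * H_a + weights[1] * H_b + weights[2] * H_c

# Tr of each deterministic matrix counts its fixed points;
# Tr H is their weighted average: 0.25 * 2 + 0.5 * 1 + 0.25 * 0 = 1.
counts = [int(np.trace(M)) for M in (H_a, H_b, H_c)]
average = float(np.trace(H))
```

The mixture satisfies Equation (2) even though only one of its three components has a unique fixed point on its own.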
An open non-causal circuit can be represented by a non-causal comb [5] G which is a higher-order
transformation: G transforms the gate H′ to a new gate (see Figure 3c). The non-causal comb G,
for instance, could connect the output from H′ with the input of H′, as long as the composition remains
logically consistent.
4. Computational Advantage
The logical consistency requirement forces the value on a looping wire to be the unique fixed
point of the transformation. This can be exploited for finding fixed points of a black box, which yields
an advantage in higher-order computation. Suppose we are given a black box B that takes (produces)
a d-dimensional input (output) and has a unique fixed point x previously unknown to us. As a Markov
matrix, B is
$$\hat{B} = \sum_{i=0}^{d-1} e_i\, i^T , \quad \text{with } |\{ i \mid e_i = i \}| = 1 .$$
Our task is to find the fixed point x in as few queries as possible. If we solve this task with a causal
circuit, then, in the worst case, d − 1 queries are needed. In contrast, with a non-causal circuit, a single
query suffices. The reason for this is that the black box is queried with the fixed point only. Any other
query would lead to a logical contradiction, and therefore does not occur. For that purpose, we just
connect the output of B with the input of B and use a second wire to read out the value (see Figure 4a).
This circuit is logically consistent because
$$\forall a,\ \exists!\, c, x : (x \otimes c)^T \hat{C} (\mathbb{1} \otimes \hat{B})(a \otimes c) = (x \otimes c)^T \hat{C} (a \otimes \hat{B} c) = 1 ,$$
where Ĉ is the CNOT gate and 1 is the identity. However, this construction only works if B has a unique
fixed point. Suppose B2 has two fixed points. In that case, the circuit from Figure 4b can be used to find
both fixed points with two queries. In addition to short-cutting the black boxes, we need to introduce
a gate G that ensures a unique fixed point of the whole circuit. The gate G works in the following way:
$$\hat{G} = \sum_{e,\; c-a < c'-b} (a \otimes b \otimes c \otimes c' \otimes 0)(a \otimes b \otimes c \otimes c' \otimes e)^T + \sum_{e,\; c-a \ge c'-b} (a \otimes b \otimes c \otimes c' \otimes \bar{e})(a \otimes b \otimes c \otimes c' \otimes e)^T ,$$
where e is binary, ē = e ⊕ 1, the addition is carried out modulo 2, and 0 is a 2-dimensional vector
representing the value 0. In words, if the value c on the upper wire is less than the value c′ on the
lower wire, and e is 0, then we get a fixed point on the third wire of G (variable e in Figure 4b).
Otherwise, the bit on the third wire gets flipped—no fixed point. This guarantees that all loops
together have a unique fixed point. Ironically, the gate G suppresses certain fixed points on the previous
loops by introducing a logical inconsistency at a later point in the circuit. This resembles “anthropic
computing” [14], where one guesses the solution to a problem and commits suicide if the guess
was wrong—a recipe to solve NP-complete problems in the relative-state interpretation of quantum
mechanics [16] and where consciousness follows only those branches where the programmer remains
alive. Such a construction can be used to find the fixed points of a black box with a few fixed points
and where the number of fixed points is known. For a large number n of fixed points (e.g., n = d/2),
we can use the probabilistic approach to non-causal circuits. Let Bn be a black box with n fixed points
and input and output spaces of dimension d. The Markov matrix of Bn is
$$\hat{B}_n = \sum_{i=0}^{d-1} e_i\, i^T , \quad \text{with } |\{ i \mid e_i = i \}| = n .$$
We construct a randomized gate where the average number of fixed points is one:
$$\hat{B} = \frac{1}{n} \hat{B}_n + \frac{n-1}{n} \hat{N} ,$$
with
$$\hat{N} = \sum_{i=0}^{d-1} \bar{i}\, i^T , \qquad \bar{i} = i \oplus 1 .$$
i =0
The gate N̂ can be understood as a d-dimensional generalization of the NOT gate for bits:
The input is increased by one modulo d. Such an N̂ has no fixed points. The mixture B̂ is logically
consistent, because
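$$\operatorname{Tr}\Big( \frac{1}{n} \hat{B}_n + \frac{n-1}{n} \hat{N} \Big) = \frac{1}{n} \operatorname{Tr} \hat{B}_n + \frac{n-1}{n} \operatorname{Tr} \hat{N} = 1 .$$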
This means that we can use the circuit from Figure 4a to find a random fixed point of Bn .
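A small numerical check of this mixture is sketched below, assuming NumPy. The black box (d = 6 with the two fixed points 0 and 1) is a hypothetical example, and the helper names are illustrative. The diagonal of the mixture then carries weight 1/n on each fixed point of the black box and zero elsewhere.

```python
import numpy as np

def black_box_matrix(mapping):
    """0-1 Markov matrix of a black box given as a list: input i -> mapping[i]."""
    d = len(mapping)
    M = np.zeros((d, d))
    for i, e_i in enumerate(mapping):
        M[e_i, i] = 1
    return M

def shift_matrix(d):
    """d-dimensional generalisation of NOT: the input is increased by one modulo d."""
    M = np.zeros((d, d))
    for i in range(d):
        M[(i + 1) % d, i] = 1
    return M

# Hypothetical black box with d = 6 and n = 2 fixed points (0 and 1).
mapping = [0, 1, 3, 2, 5, 4]
d = len(mapping)
n = sum(1 for i, e_i in enumerate(mapping) if e_i == i)

B_n = black_box_matrix(mapping)
N = shift_matrix(d)
mixture = B_n / n + (n - 1) / n * N

diagonal = np.diag(mixture)          # 1/n on each fixed point, 0 elsewhere
trace = float(np.trace(mixture))     # average number of fixed points
```

Since the shift gate contributes nothing to the diagonal, the diagonal entries form a uniform distribution over the fixed points of the black box.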
Figure 4. Fixed point search for a black box with one and a black box with two fixed points. (a) The
output x is the fixed point c added to the input a. (b) Circuit for finding a fixed point for a black box
with two fixed points.
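The consistency condition for the circuit of Figure 4a can be simulated classically by enumerating all candidate values on the wires. The Python sketch below is not the non-causal circuit itself but a brute-force check of which assignments it admits; it assumes that the d-dimensional version of the CNOT gate adds the looping value to the input modulo d, and the black box used is a hypothetical example.

```python
from itertools import product

def consistent_assignments(black_box, d, a):
    """All pairs (c, x) compatible with the loop and the read-out wire.

    c is the value on the looping wire, x the output for the input a.
    The loop demands c == black_box(c); the read-out gives x = a + c (mod d).
    """
    result = []
    for c, x in product(range(d), repeat=2):
        looped = black_box(c)                       # value leaving the black box
        if looped == c and x == (a + looped) % d:   # loop closes and read-out matches
            result.append((c, x))
    return result

# Hypothetical black box on d = 4 values with the unique fixed point 3.
table = [1, 2, 0, 3]
box = lambda c: table[c]

for_input_0 = consistent_assignments(box, 4, 0)
for_input_2 = consistent_assignments(box, 4, 2)
```

For every input exactly one assignment survives, and with a = 0 the output directly reveals the fixed point. The enumeration itself costs d² evaluations, which is the price of simulating the loop causally.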
We apply these tools to find solutions to instances of search problems with a known number
of solutions, and where a guess for a solution can be verified efficiently by a verifier V. In other words,
we can find solutions to NP search problems, yet where the number of solutions to an instance must
be known to us in advance. Note that the following construction does not solve a decision problem,
but rather finds the solution. Suppose an instance I of a problem Π has a unique solution. We replace
the gate B of Figure 4a with a new gate V′ that acts in the following way: it takes a guess c for a solution
to Π( I ) as input, and runs V to verify c. If V accepts c, then V′ outputs c, and otherwise, V′ outputs c ⊕ 1,
where the addition is carried out modulo d. Such a circuit has a unique fixed point c, which equals the
solution of Π( I ). This, for instance, could be applied to a SAT formula for which a unique assignment
of values to variables exists that makes the formula true. Note that this approach does not prove
an advantage in finding satisfying assignments for SAT formulas, even if the number of these satisfying
assignments is previously known; currently, we do not know how difficult or easy it is to solve such
instances causally.
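The behaviour of the verifier-based gate can be illustrated on a toy SAT instance. In the Python sketch below the three-variable formula, the integer encoding of assignments, and the function names are all hypothetical; the loop is again simulated by exhaustive enumeration, which takes exponential time and is only meant to show that the fixed points of the gate coincide with the satisfying assignments.

```python
N_VARS = 3
D = 2 ** N_VARS   # dimension of the looping wire: one value per assignment

def decode(c):
    """Interpret the integer c as a list of N_VARS bits (least significant first)."""
    return [(c >> k) & 1 for k in range(N_VARS)]

def verifier(c):
    """Toy formula x0 AND (NOT x1) AND (x1 OR NOT x2), satisfied only by (1, 0, 0)."""
    x0, x1, x2 = decode(c)
    return bool(x0 and not x1 and (x1 or not x2))

def verifier_gate(c):
    """Output the guess unchanged if it is accepted, otherwise c + 1 modulo D."""
    return c if verifier(c) else (c + 1) % D

# Values that can travel consistently on the loop: fixed points of the gate.
loop_values = [c for c in range(D) if verifier_gate(c) == c]
solution = decode(loop_values[0]) if len(loop_values) == 1 else None
```

Because rejected guesses are shifted to a different value, they can never be fixed points, so the only consistent value on the loop is the encoding of the satisfying assignment.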
fashion and by generating a history tape [17], where no memory position gets overwritten. An example
of a non-causal Turing machine is where the history tape is non-causal in the sense that symbols can
be read “before” they are written.
The billiard computer is a model of computation on a billiard table [1]. Before the computation
starts, obstacles are placed on the table in such a way that the induced reflections of the balls and
the collisions among the balls result in the desired computation. A non-causal version of a billiard
computer is a billiard table where the holes are connected with closed timelike curves (CTCs) [2] that
are logically consistent. Now, a billiard ball could also collide with its younger self; this introduces
a non-causal effect. Echeverria, Klinkhammer, and Thorne [2] showed that solutions to CTC-dynamics
that are not overdetermined exist. However, all solutions that they found are underdetermined.
The non-causal circuits presented in this work indicate that logically consistent non-causal billiard
computers are also admissible.
References
1. Fredkin, E.; Toffoli, T. Conservative logic. Int. J. Theor. Phys. 1982, 21, 219–253.
2. Echeverria, F.; Klinkhammer, G.; Thorne, K.S. Billiard balls in wormhole spacetimes with closed timelike
curves: Classical theory. Phys. Rev. D 1991, 44, 1077–1099.
3. Chiribella, G. Perfect discrimination of no-signalling channels via quantum superposition of causal structures.
Phys. Rev. A 2012, 86, 040301.
4. Colnaghi, T.; D’Ariano, G.M.; Facchini, S.; Perinotti, P. Quantum computation with programmable
connections between gates. Phys. Lett. A 2012, 376, 2940–2943.
5. Chiribella, G.; D’Ariano, G.M.; Perinotti, P.; Valiron, B. Quantum computations without definite causal
structure. Phys. Rev. A 2013, 88, 022318.
6. Araújo, M.; Costa, F.; Brukner, Č. Computational Advantage from Quantum-Controlled Ordering of Gates.
Phys. Rev. Lett. 2014, 113, 250402.
7. Procopio, L.M.; Moqanaki, A.; Araújo, M.; Costa, F.; Alonso Calafell, I.; Dowd, E.G.; Hamel, D.R.;
Rozema, L.A.; Brukner, Č.; Walther, P. Experimental superposition of orders of quantum gates. Nat. Commun.
2015, 6, 7913.
8. Oreshkov, O.; Costa, F.; Brukner, Č. Quantum correlations with no causal order. Nat. Commun. 2012, 3, 1092.
9. Baumeler, Ä.; Feix, A.; Wolf, S. Maximal incompatibility of locally classical behavior and global causal order
in multiparty scenarios. Phys. Rev. A 2014, 90, 042106.
10. Baumeler, Ä.; Wolf, S. The space of logically consistent classical processes without causal order. New J. Phys.
2016, 18, 013036.
11. Branciard, C.; Araújo, M.; Feix, A.; Costa, F.; Brukner, Č. The simplest causal inequalities and their violation.
New J. Phys. 2016, 18, 013008.
12. Feix, A.; Araújo, M.; Brukner, Č. Quantum superposition of the order of parties as a communication
resource. Phys. Rev. A 2015, 92, 052326.
13. Tegmark, M. The Interpretation of Quantum Mechanics: Many Worlds or Many Words? Fortschr. Phys.
1998, 46, 855–862.
14. Aaronson, S. Guest Column: NP-complete problems and physical reality. ACM SIGACT News 2005, 36, 30–52.
15. Baumeler, Ä.; Wolf, S. Device-independent test of causal order and relations to fixed-points. New J. Phys.
2016, 18, 035014.
16. Everett, H. “Relative State” Formulation of Quantum Mechanics. Rev. Mod. Phys. 1957, 29, 454–462.
17. Bennett, C.H. Logical Reversibility of Computation. IBM J. Res. Dev. 1973, 17, 525–532.
18. Valiant, L.G.; Vazirani, V.V. NP is as easy as detecting unique solutions. Theor. Comput. Sci. 1986, 47, 85–93.
19. Deutsch, D. Quantum mechanics near closed timelike lines. Phys. Rev. D 1991, 44, 3197–3217.
20. Aaronson, S.; Watrous, J. Closed timelike curves make quantum and classical computing equivalent.
Proc. R. Soc. A Math. Phys. Eng. Sci. 2009, 465, 631–647.
21. Brun, T.A.; Wilde, M.M.; Winter, A. Quantum State Cloning Using Deutschian Closed Timelike Curves.
Phys. Rev. Lett. 2013, 111, 190401.
© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
entropy
Article
The Many Classical Faces of Quantum Structures
Chris Heunen
School of Informatics, University of Edinburgh, 10 Crichton Street, Edinburgh EH8 9AB, UK;
[email protected]; Tel.: +44-131-650-5132
Academic Editors: Giacomo Mauro D’Ariano, Paolo Perinotti, Jay Lawrence and Giorgio Kaniadakis
Received: 9 January 2017; Accepted: 23 March 2017; Published: 29 March 2017
Abstract: Interpretational problems with quantum mechanics can be phrased precisely by only
talking about empirically accessible information. This prompts a mathematical reformulation of
quantum mechanics in terms of classical mechanics. We survey this programme in terms of algebraic
quantum theory.
Keywords: algebraic quantum theory; C*-algebra; Gelfand duality; classical context; Bohrification
1. Introduction
The mathematical formalism of quantum mechanics is open to interpretation. For example,
the possibility of deterministic hidden variables, the uncertainty principle, the measurement problem,
and the reality of the wave function, are all up for debate. (The first and the last of course
have rigorous restrictions: hidden variables by the Bell inequalities [1] and the Kochen–Specker
theorem [2], discussed below, and reality of the wave function by the Pusey–Barrett–Rudolph
theorem [3].) Classical mechanics shares none of those interpretational questions. This article surveys
a mathematical reformulation of quantum mechanics in terms of classical mechanics, intended to
bring the interpretational issues with the former to a head. This programme proposes to replace the
usual notion of state space of a quantum-mechanical system by a new one, in a way that avoids the
interpretational questions above and leaves classical systems unaffected:
• known obstructions to hidden variable interpretations merely say that states cannot be located
with exact precision in the state space, and are circumvented via open regions of states;
• the uncertainty principle cannot be expressed and therefore poses no interpretational problem;
• the measurement problem is obviated because the new notion of state space incorporates all
classical data resulting from possible measurements.
If we also take dynamics into account, the new notion of configuration space, called an active lattice:
This programme branches into a number of related themes, spread over the literature; see the extensive
bibliography. The aim of this article is to bring all these active developments together to give an
overview. There are hardly any new results. Instead, the novelty lies in rephrasing foundations to
give an accessible, coherent, and complete overview of the current state-of-the-art. To do so, we will
have to be rather brief and refer to references for many technical details. Nevertheless, there is a
novel contribution regarding topological structure of the new notion of configuration space. We will
use an n-level physical system as a running example to illustrate new notions (though many results
have exceptions for n ≤ 2, and most interesting features occur in infinite dimension). The rest of this
introduction summarizes the framework and discusses four salient features, before giving an overview
of the rest of this article.
Entropy 2017, 19, 144
noncommutative topology. In the case of a discrete space X with n points, this simply says that up to
isomorphism $\mathbb{C}^n$ is the only commutative C*-algebra of dimension n, and that functions n → n are the
only way to describe deterministic evolutions.
For more information, we refer to [14–17] in addition to references above.
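The finite case can be made concrete. The sketch below is an illustration under our own assumptions (the helper name `pullback_matrix` and the sample function f are hypothetical): it treats elements of the algebra of complex functions on n points as vectors with pointwise operations, and checks numerically that precomposition with a function f on n points preserves products, the unit, and adjoints.

```python
import numpy as np


def pullback_matrix(f):
    """Matrix of the linear map sending a function a on n points to a∘f, (a∘f)[i] = a[f[i]]."""
    n = len(f)
    m = np.zeros((n, n))
    for i, target in enumerate(f):
        m[i, target] = 1.0
    return m


f = [1, 1, 2]                      # a hypothetical deterministic evolution on 3 points
m = pullback_matrix(f)
a = np.array([1 + 2j, -1j, 3.0])   # two elements of the algebra, as vectors of values
b = np.array([0.5, 2 - 1j, 1j])

preserves_products = np.allclose(m @ (a * b), (m @ a) * (m @ b))
preserves_unit = np.allclose(m @ np.ones(3), np.ones(3))
preserves_adjoints = np.allclose(m @ a.conj(), (m @ a).conj())
```

The check only covers the direction from functions to algebra maps; the converse, that every such algebra map arises this way, is the content of the duality and is not verified here.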
2. Invariants
Bohr’s doctrine of classical concepts teaches that a quantum system can only be empirically
understood through its classical subsystems. These classical subsystems should therefore contain all
the physically relevant information about the quantum system.
Definition 1. For a unital C*-algebra A, write C( A) for its family of commutative unital C*-subalgebras C
(with the same unit as A). We may think of it either as a partially ordered set by inclusion, or as a diagram that
remembers that the points of the partially ordered set are C*-algebras C.
For example, the partially ordered set C( A) of a 2-level system A = M2 (C) has Hasse diagram
[Figure: Hasse diagram of C( A) for A = M2 (C), drawn as a row of points • • • ···]
such that every set S ⊆ A of pairwise commeasurable elements is contained in a set T ⊆ A of pairwise
commeasurable elements that forms a commutative C*-algebra under the above operations.
Of course, any commutative C*-algebra is a piecewise C*-algebra. More generally, the normal
elements (those commuting with their own adjoint) of any C*-algebra A form a piecewise C*-algebra.
For an n-level system A = Mn (C), the piecewise C*-algebra consists of all normal n-by-n matrices,
together with their norms and adjoints, as well as the knowledge of how commuting elements
add and multiply. Notice that C( A) makes perfect sense for any piecewise C*-algebra A. To make
precise how we can reconstruct the piecewise structure of A from C( A), we will use the language of
category theory [26]. C*-algebras, with ∗-homomorphisms between them, form a category. We can also
make piecewise C*-algebras into a category with the following arrows: (total) functions f : A → B that
preserve commeasurability and the algebraic operations, whenever defined.
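For the n-level system, the piecewise structure can be explored numerically. The following sketch is only an illustration: the helper names are hypothetical, and we assume that two normal matrices count as commeasurable exactly when they commute. It shows that sums and products of commuting normal matrices are again normal, whereas the sum of two non-commuting normal matrices can fail to be normal, which is why the operations are only defined piecewise.

```python
import numpy as np


def is_normal(a):
    """A matrix is normal when it commutes with its own adjoint."""
    return np.allclose(a @ a.conj().T, a.conj().T @ a)


def commeasurable(a, b):
    """Assumption of this sketch: normal matrices are commeasurable when they commute."""
    return np.allclose(a @ b, b @ a)


# A commuting pair of normal matrices: sum and product stay normal.
a = np.diag([1, 1j])
b = np.diag([2, 3])
commuting_pair_ok = commeasurable(a, b) and is_normal(a + b) and is_normal(a @ b)

# A non-commuting pair of normal matrices whose sum is not normal.
x = np.array([[0, 1], [1, 0]])
iz = np.diag([1j, -1j])
both_normal = is_normal(x) and is_normal(iz)
sum_is_normal = is_normal(x + iz)
```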
The precise notion we need is that of a colimit. Suffice it to say here, a colimit, when it exists, is a
universal solution that compatibly pastes together a given diagram into a single object. Thinking of A
as the whole and C( A) as its parts, we would like to know whether the whole is determined by the
parts. The following theorem says that C( A) indeed contains enough information to reconstruct A as a
piecewise C*-algebra.
Theorem 1 ([27]). Every piecewise C*-algebra is the colimit of its commutative C*-subalgebras in the category
of piecewise C*-algebras.
This means that the diagram C( A) determines the piecewise C*-algebra A: if C( A) and C( B) are
isomorphic diagrams, then A and B are isomorphic piecewise C*-algebras. Moreover, the previous
theorem gives a concrete way to reconstruct A from C( A). For the n-level system A = Mn (C),
this means we can reconstruct from C( A) the normal n-by-n matrices, as well as sums and products
of commuting ones. An important point to note here is that the reconstruction is happening in the
setting of piecewise C*-algebras. We could not have taken the colimit in the category of commutative
C*-algebras instead. Indeed, the Kochen–Specker theorem can be reformulated in terms of colimits as
follows. The reformulation might not look much like the original, but it is nevertheless
equivalent, and more suited to our purposes; see also ([2], p. 66).
Theorem 2 ([2,28]). If n ≥ 3, then the colimit of C(Mn (C)) in the category of commutative C*-algebras is the
degenerate, 0-dimensional, C*-algebra.
In fact, the colimit of C( A) degenerates for many more C*-algebras A than just Mn (C), such as any
C*-algebra of the form Mn ( B) for some C*-algebra B, or any W*-algebra that has no direct summand
C or M2 (C) [29,30].
As mentioned in the introduction, Gelfand duality is a functor from the category of commutative
C*-algebras to the category of compact Hausdorff topological spaces. That is, a systematic way
to assign a space to a C*-algebra, that respects functions. Interpreted physically: any classical
system is determined by a configuration space in a way that respects operations on the system.
The previous theorem can be used to show that there is no such configuration space determining
quantum systems—at least, if the notion of configuration space is to be a conservative extension of the
classical notion. The latter can be made precise as a continuous functor from the category of compact
Hausdorff spaces to some category with a degenerate space like the empty set, more precisely, a strict
initial object 0.
Theorem 3 ([29]). Suppose there exist a category conservatively extending that of compact Hausdorff spaces
and a functor F completing the following square.
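\[
\begin{array}{ccc}
\text{commutative C*-algebras} & \xrightarrow{\ \mathrm{Spec}\ } & \text{compact Hausdorff spaces} \\
{\scriptstyle\subseteq}\big\downarrow & & \big\downarrow \\
\text{C*-algebras} & \xrightarrow{\ F\ } & \text{?}
\end{array}
\]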
Asking the functor on the right to be continuous is appropriate to model the classical limit
of quantum systems converging to a classical one, because then the state space of the product of
two limiting classical systems should be computed as the classical limit of the joint quantum systems.
In fact, the proof in [29] holds if the category on the bottom right has limits, and the functor on the
right reflects them. However, one might still wonder if it is reasonable to ask the diagram to commute
on the nose. Instead, we could ask it to commute up to a natural isomorphism. This is precisely the
way out we will explore in Sections 3 and 5.
This rules out many possible quantum configuration spaces that have been proposed for the
bottom right role in the square; in particular many generalized notions of topological spaces, such as
sets, topological spaces themselves, pointfree topological spaces, ringed spaces, quantales, toposes,
categories of sheaves, and many more [28,29,31]. In particular, the state space of a C*-algebra,
as discussed in the introduction, will not do for us, even though it is one of the most important
tools associated with a C*-algebra [32]. That explains why we deliberately talk about “configuration
spaces”. In the classical case, the two notions coincide. The previous theorem shows that serious
notions of quantum configuration space must be less conservative. This points the way towards good
candidates: Sections 3 and 5 will cover two that do fit the bill.
The question of noncommutative extensions of Gelfand duality is also very interesting from a
purely mathematical perspective. As mentioned in the introduction, C*-algebra theory can be regarded
as noncommutative topology. Adding more structure than mere topology leads to noncommutative
geometry, which is a rich field of study [33]. However, it takes place entirely on the algebraic side.
Finding the right notion of quantum configuration space could reintroduce geometric intuition, which
is usually very powerful [34,35]. For example, in certain cases, extensions of C( A) can be used to
compute the K-theory of A, which is a way to study homotopies of the configuration space underlying
A, that includes many local-to-global principles [36]. Similarly, closed ideals of a W*-algebra A, that are
important because they correspond to open subsets in the classical case, are in bijection with certain
piecewise ideals of C( A) [37].
So far, we have considered C( A) as a diagram of parts of the whole. We finish this section by
considering it as a mere partially ordered set, where we forget that elements have the structure
of commutative C*-algebras. That is, we only consider the shape of how the parts fit together.
This information is already enough to determine the piecewise structure of A, but as a Jordan algebra.
(In fact, considering C( A) as a mere partially ordered set gives precisely the same information as
considering it as a diagram [38]. This justifies Definition 1.) The self-adjoint elements of a C*-algebra
form a Jordan algebra under the product $a \circ b = \tfrac{1}{2}(ab + ba)$; this even gives a so-called JB-algebra.
In fact, any JB-algebra is a subalgebra of the direct sum of one of this form and an exceptional one, such
as the hermitian octonionic matrices H3 (O) [39]. For example, the n-level system gives the JB-algebra of hermitian
n-by-n matrices multiplied via anticommutators. Piecewise Jordan algebras and their homomorphisms
are defined analogously to Definition 2. The structure of quantum observables leads naturally to
the axioms of Jordan algebras [8]. (Modern mathematical physics tends to prefer C*-algebras, as their
theory is slightly less complicated, and the connections to Jordan algebras are so tight anyway [39].)
The following theorem justifies that point of view.
Theorem 4 ([40]). Let A and B be C*-algebras. If C( A) and C( B) are isomorphic partially ordered sets, then
A and B are isomorphic as piecewise Jordan algebras.
Corollary 1 ([42,43]). Let A and B be typical AW*-algebras. If C( A) and C( B) are isomorphic partially ordered
sets, then A and B are isomorphic as Jordan algebras.
Whereas the C*-algebra product is associative but need not be commutative, the Jordan product
is commutative but need not be associative; commutative C*-subalgebras correspond to associative
Jordan subalgebras. Indeed, the previous theorem generalizes to Jordan algebras in those terms [44].
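The contrast between the two products can be seen already in the 2-level system. The sketch below is an illustration with hypothetical helper names; it uses the Pauli matrices X and Z as hermitian elements and the Jordan product a ◦ b = (ab + ba)/2, and checks that the product is commutative but fails to be associative on a specific triple.

```python
import numpy as np


def jordan(a, b):
    """Jordan product a ∘ b = (ab + ba) / 2."""
    return (a @ b + b @ a) / 2


X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

commutative = np.allclose(jordan(X, Z), jordan(Z, X))

left = jordan(jordan(X, X), Z)    # (X∘X)∘Z = 1∘Z = Z
right = jordan(X, jordan(X, Z))   # X∘(X∘Z) = X∘0 = 0, since X and Z anticommute
associative_on_this_triple = np.allclose(left, right)
```

On commuting hermitian matrices the Jordan product agrees with the ordinary matrix product and is therefore associative, in line with the correspondence between commutative C*-subalgebras and associative Jordan subalgebras.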
3. Toposes
In this section, we consider C( A) as a diagram. That is, we regard it as an operation that assigns
to each classical subsystem C ∈ C( A) of the quantum system A a classical system C. What kind of
operation is this diagram C → C? We can think of it as a set S(C ) that varies with the context C ∈ C( A).
Moreover, this contextual set respects coarse-graining: if C ⊆ D, then S(C ) ⊆ S( D ). That is, when the
measurement context C grows to include more observables, the information contained in the set S(C )
assigned to it grows accordingly. For example, for a 2-level system A = M2 (C), this comes
down to a choice of set S(u) for each unitary u ∈ U (2), that all include a fixed set S(0). Hence, these
contextual sets are functors S from C( A), now regarded as a partially ordered set, to the category of
sets and functions. The totality of all such functors forms a category. In fact, contextual sets form a
particularly nice category, namely a topos.
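As a data structure, a contextual set of the kind described here is simply a family of sets indexed by contexts and compatible with inclusion. The following sketch uses a finite toy poset shaped like the 2-level example (one bottom context below several maximal ones); the labels, the sample sets, and the function name are all hypothetical, and the real C( A) has a continuum of maximal contexts.

```python
# Toy poset of contexts: a bottom context "0" below three maximal contexts.
contexts = ["0", "u1", "u2", "u3"]


def below(c, d):
    """Order relation of the toy poset: c <= d."""
    return c == d or c == "0"


def respects_coarse_graining(assignment):
    """Check that C <= D implies S(C) is a subset of S(D)."""
    return all(
        assignment[c] <= assignment[d]
        for c in contexts
        for d in contexts
        if below(c, d)
    )


# Every S(u) contains the fixed set S(0), so this is a valid contextual set.
S = {"0": {"a"}, "u1": {"a", "b"}, "u2": {"a", "c"}, "u3": {"a", "d"}}

# Here S(0) contains an element missing from S(u2), so coarse-graining fails.
T = {"0": {"a", "z"}, "u1": {"a", "z", "b"}, "u2": {"a", "c"}, "u3": {"a", "z"}}

S_is_contextual_set = respects_coarse_graining(S)
T_is_contextual_set = respects_coarse_graining(T)
```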
A topos is a category that shares a lot of the properties of the category of sets and functions.
In particular, one can do mathematics inside a topos: we may think about objects of a topos as sets, that
we may specify and manipulate using logical formulae. Of course, this internal perspective comes
with some caveats. Most notably, if a proof is to hold in the internal language of any topos, it has to be
constructive: we are not allowed to use the axiom of choice or proofs by contradiction, and have to be
careful about real numbers. We cannot go into more detail here, but for more information on topos
theory, see [45].
One particular object of interest in the topos of contextual sets over C( A) is our canonical
contextual set C → C. It turns out that, according to the logic of the topos of contextual sets, this object
is a commutative C*-algebra.
Theorem 5 ([19]). Let A be a C*-algebra. In the topos of contextual sets over C( A), the canonical contextual
set C → C is a commutative C*-algebra.
3. The quantum system A turns into a classical one given by the canonical contextual set C → C.
Corollary 2 ([48]). Let A be a C*-algebra. In the topos of contextual sets over C( A), there is a compact
Hausdorff locale X such that the canonical contextual set is of the form C ( X ).
For example, if A is the 2-level system M2 (C), then X is the contextual set S that assigns to
u ∈ U (2) the orthonormal basis of C2 corresponding to u, and that assigns to 0 the zero vector in C2 ,
where S(u) locally carries the structure of a 2-element discrete space, and S(0) carries the structure of
a 1-element discrete space. We will call this locale X the spectral contextual set. In general, it is not just
the contextual set C → Spec(C ). However, it does resemble that if we think about bundles instead of
contextual sets [49,50]: a bundle is a map of locales into the locale of ideals of C( A), and by restricting
the intuitionistic logic of a topos further to so-called geometric logic, the bundle corresponding to the
spectral contextual set does have fibre Spec(C ) over C. Also, if we reverse the partial order on C( A),
the assignment C → Spec(C) plays the role of the canonical contextual set. So there are two approaches:
• Either one uses C( A); the canonical contextual set C → C is a commutative C*-algebra, and the
spectral contextual set X does not take a canonical form [19,51–55].
• Or one uses the opposite order; the spectral contextual set X is a locale of the canonical form
C → Spec(C ), and the commutative C*-algebra C ( X ) does not take a canonical form [56–59].
For a comparison, see [60]. For this overview article, the choice of direction does not matter so
much. In any case, X is an object inside the topos of contextual sets, and as such we may reason about
it as a locale. In particular, we may wonder whether it is a topological space, that is, whether it does
in fact have enough points. It turns out that the Kochen–Specker Theorem 2 can be reformulated
as saying that not only does X not have enough points, in fact it has no points at all. In terms of
bundles: the canonical bundle has no global sections. This illustrates the need for locales rather than
topological spaces.
Proposition 1 ([23]). Let A be a C*-algebra satisfying the Kochen–Specker Theorem 2. In the topos of contextual
sets over C( A), the spectral contextual set has no points.
Thus, Bohrification turns a quantum system A into a locale X inside the topos of contextual sets
over C( A). There is an equivalence between locales X inside such a topos over C( A), and certain
continuous functions from a locale Spec( A) to C( A) outside the topos [61]. This gives a way to cut
out the whole topos detour, and assign to the quantum system A a configuration space that we will
temporarily call Spec( A) for the rest of this section.
Proposition 2 ([62]). For any C*-algebra A, the internal locale X is determined by a continuous function from
some locale Spec( A) to C( A).
In many cases, Spec( A) will in fact have enough points, i.e., will be a topological space [60,62]—despite
Proposition 1. The construction A → Spec( A) circumvents the obstruction of Theorem 3 for several
reasons. First, when the C*-algebra A is commutative, Spec( A) turns out to be a locale based on
C( A), rather than on A itself; therefore what we are currently denoting by Spec( A) does not match
the Gelfand spectrum of A. Second, the construction A → Spec( A) is only partially functorial: if we
regard C( A) as a locale, the construction only respects functions that reflect commutativity [27], and to
get functoriality we have to regard C( A) as a localed topos, that is, a topos with a locale in it [63].
We can only touch on it briefly here, but one of the main features of building the topos of contextual
sets over C( A) and distilling the configuration space Spec( A) is that they encode a contextual logic.
This logic is intuitionistic, and therefore very different from traditional quantum logic [52]. The latter
concerns the set Proj( A) of yes–no questions on the quantum system A; more precisely, the set of
sharp observables with two outcomes. These correspond to projections: p ∈ A satisfying $p^2 = p = p^*$.
They are partially ordered by p ≤ q when pq = p, which should be read as saying that p implies q.
Similarly, least upper bounds in Proj( A) are logical disjunctions [11]. In an n-level system A = Mn (C),
projections correspond to subspaces of Cn , regarded logically as the set of (pure) states where the
proposition is true; the order becomes inclusion of subspaces; and the disjunction of subspaces is
their linear span. AW*-algebras A are determined to a great extent by their projections, and indeed
the quantum logic Proj( A) carries precisely the same amount of information as C( A) [64]. For more
information about this topos-theoretic approach to quantum logic, we refer to [19,49,51–54,56–58].
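The traditional quantum logic of an n-level system is easy to experiment with. The sketch below is a minimal illustration with hypothetical helper names: projections are built from spanning vectors via a singular value decomposition, the order is tested through pq = p, and the disjunction is computed as the projection onto the linear span of the two ranges.

```python
import numpy as np


def projection_onto(columns, tol=1e-10):
    """Orthogonal projection onto the span of the given column vectors."""
    u, s, _ = np.linalg.svd(columns, full_matrices=False)
    basis = u[:, s > tol]          # orthonormal basis of the column space
    return basis @ basis.conj().T


def is_projection(p):
    """Check p^2 = p = p*."""
    return np.allclose(p @ p, p) and np.allclose(p, p.conj().T)


def implies(p, q):
    """The order p <= q, defined by pq = p."""
    return np.allclose(p @ q, p)


def join(p, q):
    """Least upper bound: projection onto the span of both ranges."""
    return projection_onto(np.hstack([p, q]))


e = np.eye(3)
p = projection_onto(e[:, [0]])      # proposition "state lies on the first axis"
q = projection_onto(e[:, [1]])      # proposition "state lies on the second axis"
r = projection_onto(e[:, [0, 1]])   # the plane spanned by both axes

join_is_span = np.allclose(join(p, q), r)
```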
To connect contextual sets to probabilities and the Born rule, we have to translate states of A into
some notion based on the spectral contextual set X, and observables of A into some notion based
on the canonical contextual set C → C. For the latter, one has to resort to approximations, as not
every a ∈ A will be present in each C ∈ C( A); this process is sometimes called daseinisation [57]. The
former has a satisfying solution in terms of piecewise states: piecewise linear (completely) positive maps
A → C.
(The cited references consider W*-algebras, but the proof holds for AW*-algebras because
Corollary 5 does so, see Section 5. The same goes for the references in Corollary 3.) By Gleason’s
theorem (see Section 5), we can say more for AW*-algebras. See also [25].
Corollary 3 ([66,67]). There is a bijective correspondence between states of a typical AW*-algebra A, and states
of the canonical contextual set C → C inside the topos of contextual sets over C( A).
In the n-level system A = Mn (C) for n ≥ 3, this means that n-by-n density matrices correspond
precisely to a choice of probability distribution over m points that is consistent over all unitaries
u ∈ U (n) and partitions of n points into m equivalence classes.
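One direction of this correspondence can be sketched numerically: a density matrix induces, for each orthonormal basis and each partition of its n vectors, a probability distribution, and these distributions agree under coarse-graining. The helper name, the sample density matrix, and the choice of the discrete Fourier basis below are assumptions for illustration; the converse direction rests on Gleason's theorem and is not checked here.

```python
import numpy as np


def context_distribution(rho, u, partition):
    """Probabilities tr(rho P_block), one per block of basis vectors (columns of u)."""
    probabilities = []
    for block in partition:
        projection = sum(np.outer(u[:, i], u[:, i].conj()) for i in block)
        probabilities.append(np.trace(rho @ projection).real)
    return np.array(probabilities)


rho = np.diag([0.5, 0.3, 0.2]).astype(complex)   # a sample 3-by-3 density matrix
u = np.fft.fft(np.eye(3)) / np.sqrt(3)           # unitary whose columns form a basis

fine = context_distribution(rho, u, [[0], [1], [2]])   # m = 3 points
coarse = context_distribution(rho, u, [[0, 1], [2]])   # m = 2 points
```

Merging two basis vectors into one block adds their probabilities, which is the consistency condition across partitions mentioned above.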
Combining daseinisation with the above results gives rise to a contextual Born rule, justifying
the Bohrification procedure of Theorem 5 [50]. Summarizing, we can formulate the physics of the
quantum system A completely in terms of C( A) and its topos of contextual sets, and work within there
as if dealing with a classical system.
To end this section, let us mention some other related work. The “amount of nonclassicality” of
the contextual logic of A discussed above measures the computational power of the quantum system A [68].
For philosophical aspects of Bohrification and related constructions, see [69,70]. Similar contextual
ideas have been used to model quantum numbers [71]. Transferring C*-algebras between different
toposes has been used successfully before in so-called Boolean-valued analysis [72–74]. Finally,
contextuality and the Kochen–Specker theorem can be formulated more generally than in algebraic
quantum theory [75].
4. Domains
The partially ordered set C( A) of empirically accessible classical contexts C of a quantum system
A embodies coarse-graining. As in the introduction, we think of each C ∈ C( A) as consisting of
compatible observables that we can measure together in a single experiment. Larger experiments,
involving more observables, should give us more information, and this is reflected in the partial order:
if C ⊆ D, then D contains more observables, and hence provides more information. If A itself is
noncommutative, the best we can do is approximate it with larger and larger commutative subalgebras
C. This sort of informational approximation is studied in computer science under the name domain
theory [76,77]. This section discusses the domain-theoretic properties of C( A). Domain theory is mostly
concerned with partial orders where every element can be approximated by finite ones, as those are
the ones we can measure in practice, leading to the following definitions.
Definition 3. A partially ordered set $(\mathcal{C}, \le)$ is directed complete when every ascending chain $\{D_i\}$ has a least
upper bound $\bigvee_i D_i$. An element C approximates D, written $C \ll D$, when $D \le \bigvee_i D_i$ implies $C \le D_i$ for
any chain $\{D_i\}$ and some i. An element C is finite when $C \ll C$. A continuous domain is a directed complete
partially ordered set, every element of which satisfies $D = \bigvee \{C \mid C \ll D\}$. An algebraic domain is a directed
complete partially ordered set, every element of which is approximated by finite ones: $D = \bigvee \{C \mid C \ll C \le D\}$.
Lemma 1 ([65,78]). If A is a C*-algebra, then C( A) is a directed complete partially ordered set, in which $\bigvee_i C_i$
is the norm-closure of $\bigcup_i C_i$.
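The approximation relation of Definition 3 can be computed directly on a finite poset, where every chain has a largest element that serves as its least upper bound. The sketch below uses a hypothetical toy poset (one bottom context below three maximal ones) and hypothetical function names. It only illustrates the definitions: in a finite poset every element turns out to be finite, so the interesting behaviour of C( A) in infinite dimension is not captured.

```python
from itertools import combinations


def chains(elements, leq):
    """All nonempty chains (totally ordered subsets) of a finite poset."""
    for size in range(1, len(elements) + 1):
        for subset in combinations(elements, size):
            if all(leq(a, b) or leq(b, a) for a, b in combinations(subset, 2)):
                yield subset


def top_of(chain, leq):
    """In a finite chain, the least upper bound is its largest element."""
    return next(x for x in chain if all(leq(y, x) for y in chain))


def approximates(c, d, elements, leq):
    """c approximates d: whenever d lies below the bound of a chain, c lies below a member."""
    return all(
        any(leq(c, x) for x in chain)
        for chain in chains(elements, leq)
        if leq(d, top_of(chain, leq))
    )


elements = ["0", "u1", "u2", "u3"]


def leq(a, b):
    """Order of the toy poset: "0" is below everything, maximal elements are incomparable."""
    return a == b or a == "0"


every_element_finite = all(approximates(x, x, elements, leq) for x in elements)
bottom_approximates_u1 = approximates("0", "u1", elements, leq)
u1_approximates_u2 = approximates("u1", "u2", elements, leq)
```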
The previous proposition does not generalize to arbitrary C*-algebras, which need not have
a decomposition as a direct sum of factors. One might expect that C( A) is a domain when A
is approximately finite-dimensional, as this would match with the intuition of approximation using
practically obtainable information. However, there also needs to be a large enough supply of projections
for this to work; see also Section 3. It turns out that the correct notion is that of scattered C*-algebras [81],
that is, C*-algebras A for which every positive map A → C is a sum of pure ones. The n-level system
A = Mn (C ) is scattered.
Theorem 7 ([38]). A C*-algebra A is scattered if and only if C( A) is a continuous domain if and only if C( A)
is an algebraic domain.
Corollary 4 ([77]). For a scattered C*-algebra A, the Lawson topology makes X = C( A) compact Hausdorff.
Hence to each scattered C*-algebra A we may assign a commutative C*-algebra C ( X ).
The assignment A → C (C( A)) is not functorial, does not leave commutative C*-algebras invariant,
and of course only works for scattered C*-algebras A in the first place [38]. Hence there is no
contradiction with Theorem 3.
One can also furnish C( A) with a topology inspired by the topology of A itself. We will use the
topology induced by the following variation on the Hausdorff metric; similar variations are named
after Banach–Mazur, Kadets [82], Gromov–Hausdorff, Effros–Maréchal [83], and Kadison–Kastler [84].
See also [85]. Define the distance between C, D ∈ C( A) to be
\[
d(C, D) = \max\Bigl\{ \sup_{c \in C,\ \|c\| \le 1}\ \inf_{d \in D,\ \|d\| \le 1} \|c - d\|,\ \sup_{d \in D,\ \|d\| \le 1}\ \inf_{c \in C,\ \|c\| \le 1} \|c - d\| \Bigr\}.
\]
Now if C and D are generated by projections p and q, and A is represented on a Hilbert space H, then
is the Hausdorff distance between p( H ) and q( H ). It follows that the distance between C and D
is $\max(\|p - q\|, \|(1 - p) - q\|, \|p - (1 - q)\|, \|(1 - p) - (1 - q)\|) = \max(\|p - q\|, \|(1 - p) - q\|)$.
This topology on C( A) matches the case of the 2-level system A = M2 (C), where C( A) is in bijection
with the one-point compactification of the real projective plane $\mathbb{RP}^2$ [50].
5. Dynamics
So far, we have only considered kinematics of the quantum system A, by looking for configuration
spaces based on C( A). It is clear, however, that C( A) in itself is not enough to reconstruct all of A.
For a counterexample, observe that any C*-algebra A has an opposite C*-algebra $A^{\mathrm{op}}$ in which the
multiplication is reversed. Clearly, C( A) and C($A^{\mathrm{op}}$) are isomorphic as partially ordered sets, but
there exist C*-algebras A that are not isomorphic to $A^{\mathrm{op}}$ as C*-algebras [86]. So we need to add more
information to C( A) to be able to reconstruct A as a C*-algebra, which is the topic of this section. To do
so, we bring dynamics into the picture. For motivation of why dynamics and configuration spaces
should go together, see also [87].
We begin by viewing dynamics as a time-dependent group of evolutions. The traditional view is
that the 1-parameter group consists of unitary evolutions of the Hilbert space. For an n-level system,
these 1-parameter groups are continuous homomorphisms R → U (n). In algebraic quantum theory,
it becomes a 1-parameter group of isomorphisms A → A of the C*-algebra.
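For an n-level system both pictures can be written down explicitly. The sketch below assumes, for illustration, that the unitary group is generated by a hermitian matrix h via u(t) = exp(ith), and that the induced evolution of the algebra is conjugation a ↦ u(t) a u(t)*; the function names and the sample matrices are hypothetical. It checks the group law, unitarity, and that conjugation respects products.

```python
import numpy as np


def evolution(h, t):
    """u(t) = exp(i t h) for hermitian h, computed from the spectral decomposition."""
    eigenvalues, v = np.linalg.eigh(h)
    return (v * np.exp(1j * t * eigenvalues)) @ v.conj().T


def automorphism(h, t, a):
    """Induced evolution of the algebra: a -> u(t) a u(t)*."""
    u = evolution(h, t)
    return u @ a @ u.conj().T


h = np.array([[1, 1j], [-1j, 2]])      # a sample hermitian generator
s, t = 0.3, 1.1

group_law = np.allclose(evolution(h, s + t), evolution(h, s) @ evolution(h, t))
unitary = np.allclose(evolution(h, t) @ evolution(h, t).conj().T, np.eye(2))

a = np.array([[0, 1], [2, 3j]])
b = np.array([[1j, 0], [1, 1]])
multiplicative = np.allclose(
    automorphism(h, t, a @ b), automorphism(h, t, a) @ automorphism(h, t, b)
)
```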
The group Aut( A) inherits the pointwise norm topology from A, that has subbasis
for f ∈ Aut( A), ε > 0, and S ⊆ A finite, and makes conjugation U ( A) → Aut( A) continuous [88].
We can similarly consider 1-parameter groups of isomorphisms C( A) → C( A) of partially ordered
sets.
Similarly, Aut(C( A)) becomes a topological group with subbasis
The following theorem shows that both notions in fact coincide. A factor is an algebra with trivial
center, that is, a single superselection sector: the n-level system Mn (C) is a factor, but Mm (C) ⊕ Mn (C)
is not, because its center is two-dimensional. More precisely, the following theorem shows that the
only freedom between the two notions in the previous definition lies in permutations of the center,
because Aut( A) ≅ Aut(C( A)) for typical AW*-factors.
So C*-dynamics of A can be completely justified in terms of C( A). This also justifies our choice of
the topology on C( A) induced by the Hausdorff metric. See also [91]. Equilibrium states are described
in algebraic quantum theory by Kubo–Martin–Schwinger states, and these can be described in terms of
C( A) as well, see [92].
We now switch gears. By Stone’s theorem, 1-parameter groups of unitaries $e^{ith}$ in certain W*-algebras
correspond to self-adjoint (possibly unbounded) observables h. Thus, we may forget about the explicit
dependence on a time parameter and consider single self-adjoint elements of C*-algebras. In fact, we
will mostly be interested in symmetries: self-adjoint unitary elements $s = s^* = s^{-1}$.
Symmetries are tightly linked to projections. Every projection p gives rise to a symmetry 1 − 2p,
and every symmetry s comes from a projection (1 − s)/2. As they are unitary, the symmetries of
a C*-algebra A generate a subgroup Sym( A) of the unitary group. For a commutative C*-algebra
A = C ( X ), symmetries compose, so that Sym( A) consists of symmetries only. For an n-level system
A = Mn (C), it turns out that Sym( A) consists of those unitaries u ∈ U (n) whose determinant is 1
or −1. This ‘orientation’ is what we will add to C( A) to make it into a full invariant of A. See also [93].
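The passage between projections and symmetries is simple enough to verify by hand or by machine. The following sketch, with hypothetical helper names and sample projections in the 2-level system, checks that 1 − 2p is a self-adjoint unitary, that (1 − s)/2 recovers the projection, and that a product of two symmetries is a unitary with determinant 1 or −1.

```python
import numpy as np


def symmetry_of(p):
    """Symmetry 1 - 2p associated with a projection p."""
    return np.eye(len(p)) - 2 * p


def projection_of(s):
    """Projection (1 - s) / 2 associated with a symmetry s."""
    return (np.eye(len(s)) - s) / 2


p = np.array([[0.5, 0.5], [0.5, 0.5]])   # projection onto the line spanned by (1, 1)
q = np.array([[1.0, 0.0], [0.0, 0.0]])   # projection onto the first axis

s, t = symmetry_of(p), symmetry_of(q)

self_adjoint = np.allclose(s, s.conj().T)
squares_to_one = np.allclose(s @ s, np.eye(2))
round_trip = np.allclose(projection_of(s), p)

product = s @ t                           # an element of the group generated by symmetries
product_is_unitary = np.allclose(product @ product.conj().T, np.eye(2))
product_determinant = np.linalg.det(product)
```

Note that the product of two symmetries is in general not itself a symmetry, which is why Sym( A) is defined as the subgroup they generate.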
Having enough symmetries means having enough projections. Therefore, we now consider
AW*-algebras rather than general C*-algebras. For commutative AW*-algebras C ( X ), the Gelfand
spectrum X is not just compact Hausdorff, but Stonean, or extremally disconnected, in the sense that
the closure of an open set is still open. (For comparison, the Lawson topology in Corollary 4 is
totally disconnected, in the sense that connected components are singleton sets, which is weaker
than Stonean).
Gelfand duality restricts to commutative AW*-algebras and Stonean spaces. Another way to
put this is to say that the projections Proj( A) of a commutative AW*-algebra A form a complete
Entropy 2017, 19, 144
Boolean algebra, and vice versa, every complete Boolean algebra gives a commutative AW*-algebra.
The appropriate homomorphisms between AW*-algebras are normal, meaning that they preserve
least upper bounds of projections [94]. There are versions of Definition 2 for piecewise AW*-algebras,
and piecewise complete Boolean algebras, too [94]. One could also define a piecewise Stonean space,
but the following lemma suffices here.
Lemma 2 ([94]). The category of piecewise complete Boolean algebras and the category of piecewise AW*-algebras
are equivalent.
The orthocomplement p → 1 − p makes sense for the projections Proj( A) of any C*-algebra A.
We can now make precise what equivariance under symmetries achieves: it makes the difference
between being able to recover Jordan structure and C*-algebra structure.
Proposition 5 ([43,94]). Let A and B be typical AW*-algebras, and suppose that f : Proj( A) → Proj( B)
preserves least upper bounds and orthocomplements. Then f extends to a Jordan homomorphism A → B.
It extends to a homomorphism if additionally f((1 − 2p)(1 − 2q)) = (1 − 2f(p))(1 − 2f(q)).
To arrive at a good configuration space for A, we can package all this information up. We saw that
Proj( A) embeds in Sym( A). Conversely, Sym( A) acts on Proj( A): a symmetry s and a projection p
give rise to a new projection sps. In this way, Proj( A) acts on itself, and we may forget about Sym( A).
Including this action leads to the notion of an active lattice AProj( A). More precisely, an active lattice
consists of a complete orthomodular lattice P, a group G generated by 1 − 2p for p ∈ P within the
unitary group of the piecewise AW*-algebra A( P) with projections P, and an action of G on P that
becomes conjugation on A( P). The active lattice of an n-level system A = Mn (C) has, for P, the lattice
of subspaces of Cn ; for G, the group {u ∈ U (n) | det(u) = ±1}; the injection P → G sends V ⊆ Cn
to the reflection in V; and u ∈ G acts on V ∈ P as uVu∗ = {uvu∗ | v ∈ V } ⊆ Cn . For morphisms
of active lattices, we refer to [94], but let us point out that thanks to Lemma 2 they can be phrased
in terms of projections alone, just like the above definition of the active lattice itself. See also [95].
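The action of projections on projections can be illustrated for a 3-level system. This is a minimal sketch assuming NumPy; the helper projection_onto and the two chosen vectors are hypothetical and serve only as an example. It checks that conjugating a projection p by the symmetry s = 1 − 2q yields another projection of the same rank.

```python
import numpy as np

def projection_onto(vector):
    """Rank-one projection onto the line spanned by `vector`."""
    v = np.asarray(vector, dtype=complex)
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())

p = projection_onto([1, 0, 0])
q = projection_onto([1, 1, 0])

# The symmetry (reflection) associated with q acts on p by conjugation.
s = np.eye(3) - 2 * q
sps = s @ p @ s

is_idempotent = np.allclose(sps @ sps, sps)
is_self_adjoint = np.allclose(sps, sps.conj().T)
same_rank = np.isclose(np.trace(sps).real, np.trace(p).real)
```

In this particular example the reflection swaps the first two coordinate axes up to sign, so sps is the projection onto the second coordinate axis.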
We can now make precise that we can reconstruct an AW*-algebra A from its active lattice AProj( A).
Up to now, we have mostly considered reconstructions of the form “if some structures based on A
and B are isomorphic, then so are A and B”. The following theorem gives a much stronger form of
reconstruction. Recall that a functor F is fully faithful when it gives a bijection between morphisms
A → B and F ( A) → F ( B).
Theorem 9 ([94]). The functor that assigns to an AW*-algebra A its active lattice AProj( A) is fully faithful.
It follows immediately that if A and B are AW*-algebras with isomorphic active lattices
AProj( A) ≅ AProj( B), then A ≅ B as AW*-algebras. That is, its active lattice completely
determines an AW*-algebra. We can therefore think of active lattices as configuration spaces. As mentioned
before, Proj( A) contains precisely the same information as C( A), so we could phrase active lattices in
terms of C( A) as well. This configuration space circumvents the obstruction of Theorem 3, because
active lattices are not a conservative extension of the “passive lattices” coming from compact Hausdorff
spaces. Another thing to note about the previous theorem is that it has no need to except atypical cases
such as M2 (C). Finally, let us point out that functoriality of A → AProj( A) is nontrivial [96].
To get a good notion of configuration space for general quantum systems, we would eventually
like to pass from AW*-algebras to C*-algebras. One way to think about this step is as refining an
underlying carrying set to a topological space, that is, moving from algebras ℓ∞( X ) of all (bounded)
functions on the set X to algebras C ( X ) of continuous functions on the topological space X. One might
hope that AW*-algebras or W*-algebras play the former role in a noncommutative generalization, and to
some extent this works [97,98]. Unfortunately, the Kadison–Singer problem raises rigorous obstructions
Corollary 5 ([43]). Any normal piecewise Jordan homomorphism between typical AW*-algebras is a
Jordan homomorphism.
6. Characterization
Now that we have seen that most of the algebraic quantum theory of A can be phrased in terms
of C( A) only, let us try to axiomatize C( A) itself. Given any partially ordered set, when is it of the
form C( A) for some quantum system A? An answer to this question would, for example, make
Theorem 9 into an equivalence of categories, bringing configuration spaces for quantum systems on a
par with Gelfand duality for classical systems. An axiomatization would also open up the possibility
of generalizations that might go beyond algebraic quantum theory.
We start with the classical case, of commutative C*-algebras C ( X ). By Gelfand duality, any
C ∈ C(C ( X )) corresponds to a quotient X/∼. In turn, the equivalence relation corresponds to a
partition of X into equivalence classes. Partitions are partially ordered by refinement: if C ⊆ D, then
any equivalence class in the partition corresponding to D is contained in an equivalence class of the
partition corresponding to C. Hence axiomatizing C(C ( X )) comes down to axiomatizing partition
lattices, and this has been well-studied, both in the finite-dimensional case [103,104], and in the general
case [105]. The list of axioms is too long to reproduce here, but let us remark that it is based on a
definition of points of the partition lattice. In the case of a finite partition lattice, the points are simply
the atoms, that is, the minimal nonzero elements. So for a classical system Cn with n states, the elements
of the partition lattice C(Cn )op are the ways to partition a set of n points into m equivalence classes;
the atoms put two of the n points in an equivalence class and all the others in their own equivalence
class of one point each. The other axioms are geometric in nature.
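For small n, the description of atoms above can be verified by brute force. The sketch below is illustrative only: the function names and the choice n = 4 are assumptions, not part of the cited axiomatizations. It enumerates all partitions of an n-point set and selects those that merge exactly two points.

```python
from math import comb

def set_partitions(elements):
    """Yield every partition of the list `elements` as a list of blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or give it a block of its own.
        yield [[first]] + partition

def is_atom(partition):
    """An atom has one block of two points; every other block is a singleton."""
    sizes = sorted(len(block) for block in partition)
    return sizes == [1] * (len(partition) - 1) + [2]

n = 4
partitions = list(set_partitions(list(range(n))))
atoms = [p for p in partitions if is_atom(p)]
expected_atoms = comb(n, 2)  # one atom per unordered pair of points
```

Since an atom is determined by the pair of points it identifies, the number of atoms equals the number of unordered pairs, which the enumeration confirms.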
Lemma 3 ([64]). A partially ordered set is isomorphic to C(C ( X )) for a compact Hausdorff space X if and only
if it is opposite to a partition lattice whose points are in bijection with X.
Thanks to (a variation of) Lemma 2, the same strategy applies to piecewise Boolean algebras B.
Write C( B) for the partially ordered set of Boolean subalgebras of B. The downset of an element D of a
partially ordered set consists of all elements C ≤ D. In fact, the idea that any quantum logic (piecewise
Boolean algebra) should be seen as many classical sublogics (Boolean algebras) pasted together, is not
new, and drives much of the research in that area [27,106–109].
Theorem 10 ([110]). A partially ordered set is isomorphic to C( B) for a piecewise Boolean algebra B if
and only if:
• it is an algebraic domain;
• any nonempty subset has a greatest lower bound;
• a set of atoms has an upper bound whenever each pair of its elements does;
• the downset of each compact element is isomorphic to the opposite of a finite partition lattice.
In the case of a classical system with n states, B is the powerset of n points, and the above
conditions merely say that C( B)op is a partition lattice.
Just like in Section 3, if we consider C( B) as a diagram rather than a mere partially ordered set,
we can reconstruct B. Starting from just the partially ordered set C( B), the same issues surface as in
Sections 2 and 5, about Jordan structure versus full algebra structure. In the current piecewise Boolean
setting, it can be solved neatly by adding an orientation to C( B) [110]. This comes down to making a
consistent choice of atom in the Boolean subalgebras with two atoms, corresponding to the atypical
cases for AW*-algebras before.
Returning to C*-algebras, Lemma 3 reduces the question of characterizing C( A) for a C*-algebra A
to finding relationships between C( A) and C(C ) for C ∈ C( A). One prototypical case where we know
such a relationship is for the n-level system A = Mn (C). Namely, inspired by the previous section,
there is an action of the unitary group U (n) on C( A): if u ∈ U (n) is some rotation, and C ∈ C( A)
is diagonal in some basis, then also the rotation uCu∗ is diagonal in the rotated basis and therefore
is in C( A) again. In fact, any C ∈ C( A) will be a rotation of an element of C( A) that is diagonal in
the standard basis. Therefore, we can recognize C(Mn (C)) as a semidirect product of C(Cn ) and U (n).
Such semidirect products can be axiomatized; for details, we refer to [64]. This can be generalized
to C*-algebras A that have a weakly terminal commutative C*-subalgebra D, in the sense that any
C ∈ C( A) allows an injection C → D. This includes all finite-dimensional C*-algebras, as well as
algebras of all bounded operators on a Hilbert space. For example, for the n-level system A = Mn (C),
the matrices that are diagonal in the standard basis form a terminal subalgebra Cn .
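The unitary action on commutative subalgebras of Mn (C) can be seen concretely. The following is a minimal numerical sketch, assuming NumPy; the random seed, the diagonal matrices a and b, and the variable names are illustrative assumptions. It conjugates two diagonal matrices by a unitary u and checks that their images still commute, that conjugation is multiplicative, and that conjugating back with u∗ returns a diagonal matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3

# A unitary u, obtained from the QR decomposition of a random complex matrix.
z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
u, _ = np.linalg.qr(z)
u_is_unitary = np.allclose(u @ u.conj().T, np.eye(n))

# Two elements of the subalgebra of matrices diagonal in the standard basis.
a = np.diag([1.0, 2.0, 3.0])
b = np.diag([0.5, -1.0, 4.0])

# Their images in the rotated subalgebra.
ua = u @ a @ u.conj().T
ub = u @ b @ u.conj().T

rotated_commute = np.allclose(ua @ ub, ub @ ua)
multiplicative = np.allclose(u @ (a @ b) @ u.conj().T, ua @ ub)

# Rotating back recovers a matrix that is diagonal in the standard basis.
back = u.conj().T @ ua @ u
diagonal_again = np.allclose(back, np.diag(np.diag(back)))
```

The rotated matrices are in general no longer diagonal in the standard basis, but they form a commutative subalgebra that is diagonal in the basis given by the columns of u.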
However, the mere partially ordered set C( A) cannot detect this unitary action. For this we
need injections rather than inclusions. Therefore, we now switch to a category C ( A) of commutative
C*-subalgebras, with injective ∗-homomorphisms between them. For A = Mn (C), these morphisms
consist of a rotation in U (n) followed by an inclusion C^k → C^l with k ≤ l. The following theorem
characterizes this category C ( A) up to equivalence. This is the same as characterizing C( A) up
to Morita equivalence, meaning that it determines the topos of contextual sets on C( A) discussed
in Section 3 up to categorical equivalence, rather than determining C( A) itself up to equivalence.
To phrase the following theorem, we introduce the monoid S( X ) of continuous surjections X → X on
a compact Hausdorff space X. In the finite-dimensional case, this is just the symmetric group S(n).
Because of our switch from C( A) to C ( A), it plays the role of the unitary group we need.
Theorem 11 ([64]). Suppose that a C*-algebra A has a weakly terminal commutative C*-subalgebra C ( X ).
A category is equivalent to C ( A) if and only if it is equivalent to a semidirect product of C(C ( X )) and S( X ).
7. Generalizations
As mentioned in the introduction, the idea to describe quantum structures in terms of their
classical substructures applies very generally. This final section discusses to what extent algebraic
quantum theory is special, by considering a generalization as an example of another framework.
Namely, we consider categorical quantum mechanics [117]. This approach formulates quantum
theory in terms of the category of Hilbert spaces, and then abstracts away to more general categories
with the same structures. Specifically, what is retained is the notion of a tensor product to be able
to build compound systems, the notion of entanglement in the form of objects that form a duality
under the tensor product, and the notion of reversibility in the sense that every map between Hilbert
spaces has an adjoint in the reverse direction. It turns out that these primitives suffice to derive a lot
of quantum-mechanical features, such as scalars, the Born rule, no-cloning, quantum teleportation,
and complementarity. As a case in point, one can define so-called Frobenius algebras in any category
with this structure, which is important because of the following proposition.
The point is that these notions make sense in any category with a tensor product, entanglement,
and reversibility. A different example of such a category is that of sets with relations between them.
That is, objects are sets X, and arrows X → Y are relations R ⊆ X × Y. For the tensor product, we
take the Cartesian product of sets, which makes every object dual to itself and thereby provides the
structure of entanglement, and time reversibility is given by taking the opposite relation R† ⊆ Y × X.
Two relations R ⊆ X × Y and S ⊆ Y × Z compose to S ◦ R = {( x, z) | ∃y : ( x, y) ∈ R, (y, z) ∈ S}.
We may regard this as a toy example of possibilistic quantum theory: rather than complex matrices,
we now care about entries ranging over {0, 1}. A groupoid is a small category, every arrow of which is
an isomorphism; they may be considered as a multi-object generalization of groups.
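Composition and reversal of relations are easy to make concrete. The sketch below represents a relation as a Python set of pairs; the functions compose and dagger and the sample relations r and s are hypothetical names introduced only for illustration. It computes S ◦ R as defined above and checks that taking opposites reverses the order of composition.

```python
def compose(s, r):
    """Return S after R: all (x, z) with (x, y) in R and (y, z) in S for some y."""
    return {(x, z) for (x, y1) in r for (y2, z) in s if y1 == y2}

def dagger(r):
    """Return the opposite relation, swapping the two components of each pair."""
    return {(y, x) for (x, y) in r}

# R relates X = {1, 2} to Y = {"a", "b"}; S relates Y to Z = {"p", "q"}.
r = {(1, "a"), (2, "a"), (2, "b")}
s = {("a", "p"), ("b", "q")}

s_after_r = compose(s, r)

# Reversal is an involution and is contravariant with respect to composition.
involutive = dagger(dagger(r)) == r
contravariant = dagger(s_after_r) == compose(dagger(r), dagger(s))
```

Viewing each relation as a matrix with entries in {0, 1}, compose corresponds to matrix multiplication with "or" and "and" in place of addition and multiplication, and dagger corresponds to the transpose.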
Theorem 13 ([120]). Frobenius algebras in the category of sets and relations correspond to groupoids.
Algebraic quantum theory, as set out in the introduction, makes perfect sense in categories such
as sets and relations as well [121]. However, in this generality, it is not true that all classical subsystems
determine a quantum system at all. The previous theorem provides a counterexample. In commutative
groupoids, there can only be arrows X → X, for arrows g : X → Y between different objects cannot
commute with their inverse, as g ◦ g^{−1} = 1_Y and g^{−1} ◦ g = 1_X. Therefore, any arrow between different
objects in a groupoid can never be recovered from any commutative subgroupoid.
Similarly, quantum logic, as discussed in Section 3, makes perfect sense in this general categorical
setting [122]. Moreover, it matches neatly with algebraic quantum theory via taking projections [123].
However, it is no longer true that commutative subalgebras correspond to Boolean sublattices. Again,
a counterexample can be found using Theorem 13 [124].
One could object that commutativity might be too narrow a notion of classicality. However,
consider broadcastability instead: classical information can be broadcast, but quantum information
cannot. More precisely, a Frobenius algebra A is broadcastable when there exists a completely positive
map A → A ⊗ A such that both partial traces are the identity A → A. Again, this makes perfect
sense in general categories. It turns out that the broadcastable objects in the category of sets and
relations are the groupoids that are totally disconnected, in the sense that there are no arrows g : X → Y
between different objects [117]. So even with this more liberal operational notion of classicality, classical
subsystems do not determine a quantum system.
This breaks a well-known information-theoretic characterization of quantum theory, which is
phrased in terms of C*-algebras [125,126]. Hence there is something about (algebraic) quantum
theory beyond the categorical properties of having tensor products, entanglement, and reversibility,
that underwrites Bohr’s doctrine of classical concepts. It relates to characterizing unitary groups,
as discussed in Section 6. We close this overview by raising the interesting interpretational question of
just what this defining property is.
References
1. Bell, J.S. On the Einstein Podolsky Rosen paradox. Physics 1964, 1, 195–200.
2. Kochen, S.; Specker, E. The problem of hidden variables in quantum mechanics. J. Math. Mech. 1967, 17, 59–87.
3. Pusey, M.; Barrett, J.; Rudolph, T. On the reality of the quantum state. Nat. Phys. 2012, 8, 475–478.
4. Busch, P.; Grabowski, M.; Lahti, P.J. Operational Quantum Physics; Springer: Berlin/Heidelberg, Germany, 1995.
5. Keyl, M. Fundamentals of quantum information theory. Phys. Rep. 2002, 369, 431–548.
6. Kadison, R.V.; Ringrose, J.R. Fundamentals of the Theory of Operator Algebras; Number 15–16 in Graduate
Studies in Mathematics; Academic Press: Cambridge, MA, USA, 1983.
7. Berberian, S.K. Baer ∗ -Rings; Springer: Berlin/Heidelberg, Germany, 1972.
8. Emch, G.G. Mathematical and Conceptual Foundations of 20th-Century Physics, 1st ed.; North-Holland: Amsterdam,
The Netherlands, 1984.
9. Davies, E.B. Quantum Theory of Open Systems; Academic Press: Cambridge, MA, USA, 1976.
10. Earman, J. Superselection rules for philosophers. Erkenn 2008, 69, 377–414.
11. Rédei, M. Quantum Logic in Algebraic Approach; Springer: Cham, The Netherlands, 1998.
12. Haag, R. Local Quantum Physics; Texts and Monographs in Physics; Springer: Berlin/Heidelberg, Germany, 1996.
13. Strocchi, F. An Introduction to the Mathematical Structure of Quantum Mechanics; World Scientific: Singapore, 2008.
14. Emch, G.G. Algebraic Methods in Statistical Mechanics and Quantum Field Theory; Wiley: Hoboken, NJ, USA, 1972.
15. Alberti, P.M.; Uhlmann, A. Existence and density theorems for stochastic maps on commutative C*-algebras.
Math. Nachr. 1980, 97, 279–295.
16. Landsman, N.P. Mathematical Topics between Classical and Quantum Mechanics; Springer: Berlin/Heidelberg,
Germany, 1998.
17. Weaver, N. Mathematical Quantization; Chapman & Hall: London, UK, 2001.
18. Bohr, N. Chapter Discussion with Einstein on epistemological problems in atomic physics. In Albert Einstein:
Philosopher-Scientist; Cambridge University Press: Cambridge, UK, 1949.
19. Heunen, C.; Landsman, N.P.; Spitters, B. A topos for algebraic quantum theory. Commun. Math. Phys. 2009,
291, 63–110.
20. Kadison, R.V.; Singer, I.M. Extensions of pure states. Am. J. Math. 1959, 81, 383–400.
21. Marcus, A.; Spielman, D.A.; Srivastava, N. Interlacing families II: Mixed characteristic polynomials and the
Kadison–Singer problem. Ann. Math. 2015, 182, 327–350.
22. Altepeter, J.B.; James, D.F.V.; Kwiat, P.G. Qubit quantum state tomography. In Quantum State Estimation;
Springer: Berlin/Heidelberg, Germany, 2004.
23. Butterfield, J.; Isham, C.J. A topos perspective on the Kochen–Specker theorem: I. Quantum States as
Generalized Valuations. Int. J. Theor. Phys. 1998, 37, 2669–2733.
24. Constantin, C.M.; Döring, A. Contextual entropy and reconstruction of quantum states. arXiv 2012,
arXiv:1208.2046.
25. Hamhalter, J.; Turilova, E. Orthogonal measures on state spaces and context structure of quantum theory.
Int. J. Theor. Phys. 2016, 55, 3353–3365.
26. Mac Lane, S. Categories for the Working Mathematician, 2nd ed.; Springer: Berlin/Heidelberg, Germany, 1971.
27. Berg, B.; Heunen, C. Noncommutativity as a colimit. Appl. Categorical Struct. 2012, 20, 393–414.
28. Reyes, M.L. Obstructing extensions of the functor Spec to noncommutative rings. Isr. J. Math. 2012, 192,
667–698.
29. Berg, B.; Heunen, C. Extending obstructions to noncommutative functorial spectra. Theory Appl. Categories
2014, 29, 457–474.
30. Döring, A. Kochen–Specker theorem for von Neumann algebras. Int. J. Theor. Phys. 2005, 44, 139–160.
31. Reyes, M.L. Sheaves that fail to represent matrix rings. In Ring theory and Its Applications; American Mathematical
Society: Providence, RI, USA, 2014; Volume 609, pp. 285–297.
32. Alfsen, E.M.; Shultz, F.W. State Spaces of Operator Algebras: Basic Theory, Orientations, and C*-Products;
Birkhäuser: Basel, Switzerland, 2001.
33. Connes, A. Noncommutative Geometry; Academic Press: Cambridge, MA, USA, 1994.
34. Akemann, C.A. The general Stone–Weierstrass problem. J. Funct. Anal. 1969, 4, 277–294.
35. Giles, R.; Kummer, H. A non-commutative generalization of topology. Indiana Univ. Math. J. 1971, 21, 91–102.
36. De Silva, N. From topology to noncommutative geometry: K-theory. arXiv 2014, arXiv:1408.1170.
37. De Silva, N.; Soares Barbosa, R. Partial and total ideals of von Neumann algebras. arXiv 2014, arXiv:1408.1172.
38. Heunen, C.; Lindenhovius, A.J. Domains of commutative C*-subalgebras. In Proceedings of the 2015
30th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), Kyoto, Japan, 6–10 July 2015;
pp. 450–461.
39. Hanche-Olsen, H.; Størmer, E. Jordan Operator Algebras; Pitman Advanced Publishing Program: Boston, MA,
USA, 1984.
40. Hamhalter, J. Isomorphisms of ordered structures of abelian C*-subalgebras of C*-algebras. J. Math. Anal. Appl.
2011, 383, 391–399.
41. Kaplansky, I. Projections in Banach algebras. Ann. Math. 1951, 53, 235–249.
42. Döring, A.; Harding, J. Abelian subalgebras and the Jordan structure of von Neumann algebras. arXiv 2015,
arXiv:1009.4945.
43. Hamhalter, J. Dye’s theorem and Gleason’s theorem for AW*-algebras. J. Math. Anal. Appl. 2015, 422, 1103–1115.
44. Hamhalter, J.; Turilova, E. Structure of associative subalgebras of Jordan operator algebras. Q. J. Math. 2013,
64, 397–408.
45. Johnstone, P.T. Sketches of an Elephant: A Topos Theory Compendium; Clarendon Press: Oxford, UK, 2002.
46. Landsman, N.P. Bohrification: From Classical Concepts to Commutative Operator Algebras; Springer:
Berlin/Heidelberg, Germany, 2017.
47. Johnstone, P.T. Stone Spaces; Number 3 in Cambridge Studies in Advanced Mathematics; Cambridge
University Press: Cambridge, UK, 1982.
48. Banaschewski, B.; Mulvey, C.J. A globalisation of the Gelfand duality theorem. Ann. Pure Appl. Log. 2006,
137, 62–103.
49. Spitters, B.; Vickers, S.; Wolters, S. Gelfand spectra in Grothendieck toposes using geometric mathematics.
Electron. Proc. Theor. Comput. Sci. 2014, 158, 77–107.
50. Fauser, B.; Raynaud, G.; Vickers, S. The Born rule as structure of spectral bundles. Electron. Proc. Theor.
Comput. Sci. 2012, 95, 81–90.
51. Heunen, C.; Landsman, N.P.; Spitters, B. Bohrification. In Deep Beauty: Understanding the Quantum World
through Mathematical Innovation, Halvorson, H., Ed.; Cambridge University Press: Cambridge, UK, 2011;
pp. 217–313.
52. Caspers, M.; Heunen, C.; Landsman, N.P.; Spitters, B. Intuitionistic quantum logic of an n-level system.
Found. Phys. 2009, 39, 731–759.
53. Heunen, C.; Landsman, N.P.; Spitters, B. Bohrification of operator algebras and quantum logic. Synthese
2012, 186, 719–752.
54. Wolters, S. Topos models for physics and topos theory. J. Math. Phys. 2013, 55, 082110.
55. Nuiten, J. Bohrification of local nets. Electron. Proc. Theor. Comput. Sci. 2011, 95, 211–218.
56. Döring, A.; Isham, C.J. Topos Methods in the Foundations of Physics. In Deep Beauty: Understanding the
Quantum World through Mathematical Innovation, Halvorson, H., Ed.; Cambridge University Press: Cambridge,
UK, 2011.
57. Döring, A.; Isham, C.J. New Structure for Physics; Chapter What is a thing? Topos theory in the foundations
of physics. In Lecture Notes in Physics; Springer: Berlin/Heidelberg, Germany, 2011; Volume 813; pp. 753–940.
58. Döring, A.; Isham, C.J. A topos foundation for theories of physics. J. Math. Phys. 2008, 49, 053515.
59. Flori, C. A First Course in Topos Quantum Theory; Lecture Notes in Physics; Springer: Berlin/Heidelberg,
Germany, 2013; Volume 868.
60. Wolters, S. A comparison of two topos-theoretic approaches to quantum theory. Commun. Math. Phys. 2013,
317, 3–53.
61. Joyal, A.; Tierney, M. An Extension of the Galois Theory of Grothendieck (Memoirs of the American Mathematical
Society); Proquest Info & Learning: Ann Arbor, MI, USA, 1984; Volume 51.
62. Heunen, C.; Landsman, N.P.; Spitters, B.; Wolters, S. The Gelfand spectrum of a noncommutative C*-algebra:
A topos-theoretic approach. J. Aust. Math. Soc. 2011, 90, 39–52.
63. Berg, B.; Heunen, C. Erratum to: Noncommutativity as a colimit. Appl. Categorical Struct. 2013, 21, 103–104.
64. Heunen, C. Characterizations of categories of commutative C*-subalgebras. Commun. Math. Phys. 2014,
331, 215–238.
65. Spitters, B. The space of measurement outcomes as a spectral invariant for non-commutative algebras.
Found. Phys. 2012, 42, 896–908.
66. De Groote, H.F. Observables IV: The presheaf perspective. arXiv 2007, arXiv:0708.0677.
67. Döring, A. Quantum states and measures on the spectral presheaf. Adv. Sci. Lett. 2009, 2, 291–301.
68. Loveridge, L.; Dridi, R.; Raussendorf, R. Topos logic in measurement-based quantum computation. Proc. R.
Soc. A 2015, 471, 20140716.
69. Heunen, C.; Landsman, N.P.; Spitters, B. The principle of general tovariance. Int. Fall Workshop Geom. Phys.
2008, 1023, 93–102.
70. Epperson, M.; Zafiris, E. Foundations of Relational Realism: A Topological Approach to Quantum Mechanics and
the Philosophy of Nature; Lexington: Lanham, MD, USA, 2013.
71. Adelman, M.; Corbett, J.V. A sheaf model for intuitionistic quantum mechanics. Appl. Categorical Struct.
1995, 3, 79–104.
72. Takeuti, G. C*-algebras and Boolean-valued analysis. Jpn. J. Math. 1983, 9, 207–245.
73. Ozawa, M. A transfer principle from von Neumann algebras to AW*-algebras. J. Lond. Math. Soc. 1985,
32, 141–148.
74. Ozawa, M. A classification of type I AW*-algebras and Boolean-valued analysis. J. Math. Soc. Jpn. 1984,
36, 589–608.
75. Abramsky, S.; Brandenburger, A. The sheaf-theoretic structure of non-locality and contextuality. New J. Phys.
2011, 13, 113036.
76. Abramsky, S.; Jung, A. Domain Theory. In Handbook of Logic in Computer Science; Oxford University Press:
Oxford, UK, 1994; Volume 3.
77. Gierz, G.; Hofmann, K.H.; Keimel, K.; Lawson, J.D.; Mislove, M.W.; Scott, D.S. Continuous Lattices and Domains;
Number 93 in Encyclopedia of Mathematics and its Applications; Cambridge University Press: Cambridge,
UK, 2003.
78. Döring, A.; Barbosa, R.S. Unsharp values, domains and topoi. In Quantum Field Theory and Gravity:
Conceptual and Mathematical Advances in the Search for a Unified Framework; Springer: Berlin/Heidelberg,
Germany, 2011; pp. 65–96.
79. Lindenhovius, A.J. Classifying finite-dimensional C*-algebras by posets of their commutative C*-subalgebras.
Int. J. Theor. Phys. 2015, 54, 4615–4635.
80. Lindenhovius, A.J. C( A). Ph.D. Thesis, Radboud University, Nijmegen, The Netherlands, 5 July 2016.
81. Jensen, H.E. Scattered C*-algebras. Math. Scand. 1977, 41, 308–314.
82. Kalton, N.J.; Ostrovskii, M.I. Distances between Banach spaces. Forum Math. 1999, 11, 17–48.
83. Haagerup, U.; Winsløw, C. The Effros–Maréchal topology in the space of von Neumann algebras. Am. J. Math.
1998, 120, 567–617.
84. Kadison, R.V.; Kastler, D. Perturbations of von Neumann algebras I: Stability of type. Am. J. Math. 1972,
94, 38–54.
85. Chetcuti, E.; Hamhalter, J.; Weber, H. The order topology for a von Neumann algebra. Stud. Math. 2015,
230, 95–120.
86. Connes, A. A factor not anti-isomorphic to itself. Ann. Math. 1975, 101, 536–554.
87. Spekkens, R.W. The paradigm of kinematics and dynamics must yield to causal structure. Foundational
Questions Institute essay contest winner. arXiv 2013, arXiv:1209.0023.
88. Moffat, J. Groups of Automorphisms of Operator Algebras. Ph.D. Thesis, University of Newcastle upon
Tyne, Newcastle, UK, 1974.
89. Hamhalter, J.; Turilova, E. Automorphisms of ordered structures of abelian parts of operator algebras and
their role in quantum theory. Int. J. Theor. Phys. 2014, 53, 3333–3345.
90. Döring, A. Flows on generalised Gelfand spectra of nonabelian unital C*-algebras and time evolution of
quantum systems. arXiv 2012, arXiv:1212.4882.
91. Heunen, C.; Lindenhovius, A.J. Domains of commutative C*-subalgebras. arXiv 2015, arXiv:1504.02730.
92. Geloun, J.B.; Flori, C. Topos analogues of the KMS state. arXiv 2012, arXiv:1207.0227.
93. Alfsen, E.M.; Shultz, F.W. Orientation in operator algebras. Proc. Natl. Acad. Sci. USA 1998, 95, 6596–6601.
94. Heunen, C.; Reyes, M.L. Active lattices determine AW*-algebras. J. Math. Anal. Appl. 2014, 416, 289–313.
95. Chevalier, G. Automorphisms of an orthomodular poset of projections. Int. J. Theor. Phys. 2005, 44, 985–998.
96. Heunen, C.; Reyes, M.L. Diagonalizing matrices over AW*-algebras. J. Funct. Anal. 2013, 264, 1873–1898.
97. Kornell, A. Quantum Collections. arXiv 2012, arXiv:1202.2994.
98. Kornell, A. V*-algebras. arXiv 2015, arXiv:1502.01516.
99. Heunen, C.; Reyes, M.L. On discretization of C*-algebras. J. Oper. Theory 2017, 77, 19–37.
100. Mackey, G.W. The Mathematical Foundations of Quantum Mechanics; W. A. Benjamin: New York, NY, USA, 1963.
101. Bunce, L.J.; Wright, J.D.M. The Mackey–Gleason problem. Bull. Am. Math. Soc. 1992, 26, 288–293.
102. Hamhalter, J. Quantum Measure Theory; Springer: Berlin/Heidelberg, Germany, 2004.
103. Birkhoff, G. Lattice Theory; American Mathematical Society: Providence, RI, USA, 1948.
104. Stonesifer, J.R.; Bogart, K.P. Characterizations of partition lattices. Algebra Univers. 1984, 19, 92–98.
105. Firby, P.A. Lattices and compactifications I. Proc. Lond. Math. Soc. 1973, 27, 22–50.
106. Gudder, S.P. Partial algebraic structures associated with orthomodular posets. Pac. J. Math. 1972, 41, 717–730.
107. Finch, P.D. On the structure of quantum logic. J. Symb. Log. 1969, 34, 415–425.
108. Hughes, R.I.G. Omnibus review. J. Symb. Log. 1985, 50, 558–566.
109. Scheibe, E. The Logical Analysis of Quantum Mechanics; Pergamon Press: Oxford, UK, 1973.
110. Heunen, C. Piecewise Boolean algebras and their domains. Lect. Notes Comput. Sci. 2014, 8573, 208–219.
111. Flori, C.; Fritz, T. Compositories and gleaves. Theory Appl. Categories 2016, 31, 928–988.
112. Morris, S.A. A characterization of the topological group of real numbers. Bull. Aust. Math. Soc. 1986,
34, 473–475.
113. Kadison, R.V. Infinite unitary groups. Trans. Am. Math. Soc. 1952, 72, 386–399.
114. Marcus, M.; Newman, M. Some results on unitary matrix groups. Linear Algebra Its Appl. 1970, 3, 173–178.
115. Kerr, D.; Lupini, M.; Phillips, N.C. Borel complexity and automorphisms of C*-algebras. J. Funct. Anal. 2015,
268, 3767–3789.
116. Heunen, C. On the functor ℓ2. In Computation, Logic, Games, and Quantum Foundations; Springer:
Berlin/Heidelberg, Germany, 2013; pp. 107–121.
117. Heunen, C.; Vicary, J. Categories for Quantum Theory: An Introduction; Oxford University Press: Oxford, UK, 2017.
118. Vicary, J. Categorical formulation of finite-dimensional quantum algebras. Commun. Math. Phys. 2011,
304, 765–796.
119. Abramsky, S.; Heunen, C. H*-algebras and nonunital Frobenius algebras: First steps in infinite-dimensional
categorical quantum mechanics. Clifford Lect. AMS Proc. Symp. Appl. Math. 2012, 71, 1–24.
120. Heunen, C.; Contreras, I.; Cattaneo, A.S. Relative Frobenius algebras are groupoids. J. Pure Appl. Algebra
2013, 217, 114–124.
121. Coecke, B.; Heunen, C.; Kissinger, A. Categories of quantum and classical channels. Quantum Inf. Process.
2016, 15, 5179–5209.
122. Heunen, C.; Jacobs, B. Quantum logic in dagger kernel categories. Order 2010, 27, 177–212.
123. Heunen, C. Complementarity in categorical quantum mechanics. Found. Phys. 2012, 42, 856–873.
124. Coecke, B.; Heunen, C.; Kissinger, A. Chapter Compositional Quantum Logic. In Computation, Logic, Games,
and Quantum Foundations; Springer: Berlin/Heidelberg, Germany, 2013; pp. 21–36.
125. Clifton, R.; Bub, J.; Halvorson, H. Characterizing quantum theory in terms of information-theoretic
constraints. Found. Phys. 2003, 33, 1561–1591.
126. Heunen, C.; Kissinger, A. Can quantum theory be characterized by information-theoretic constraints? arXiv
2016, arXiv:1604.05948.
© 2017 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
entropy
Review
Quantum Theory from Rules on
Information Acquisition
Philipp Andres Höhn 1,2
1 Vienna Center for Quantum Science and Technology, University of Vienna, Boltzmanngasse 5, 1090 Vienna,
Austria; [email protected]
2 Institute for Quantum Optics and Quantum Information, Austrian Academy of Sciences, Boltzmanngasse 3,
1090 Vienna, Austria
Abstract: We summarize a recent reconstruction of the quantum theory of qubits from rules
constraining an observer’s acquisition of information about physical systems. This review is
accessible and fairly self-contained, focusing on the main ideas and results and not the technical
details. The reconstruction offers an informational explanation for the architecture of the theory
and specifically for its correlation structure. In particular, it explains entanglement, monogamy and
non-locality compellingly from limited accessible information and complementarity. As a by-product,
it also unravels new ‘conserved informational charges’ from complementarity relations that
characterize the unitary group and the set of pure states.
1. Introduction
Why is the physical world described by quantum theory? If we wish to sensibly address this
question, we have to step beyond quantum theory and to consider it within a landscape of alternative
theories. This, after all, permits us to ponder how the world could have been different, possibly
described by modifications of quantum theory. Such an endeavor forces us to leave the usual textbook
formulation of quantum theory, and everything we take for granted about it, behind and to develop
a more general language that also applies to alternative theories. Ideally, this language should be
operational, encompassing the interactions of some observer with physical systems in a plethora of
conceivable, physically-distinct worlds.
If we wish to also provide a possible answer to the above question, we then have to find
physical properties of quantum theory that single it out, at least within the given landscape of
alternatives. In particular, the goal should be to find an operational justification for the textbook
axioms, i.e., ultimately for complex Hilbert spaces, unitary dynamics, tensor product structure for
composite systems, Born rule, and so on. The result would be a reconstruction of quantum theory from
operational axioms [1–10] and should ideally yield a better understanding of what quantum theory
tells us about Nature and why it is the way it is.
In this manuscript, we shall review and summarize how the quantum formalism for arbitrarily
many qubits can be reconstructed from operational rules restricting an observer’s acquisition of
information about a set of observed systems [1,2]. The goal of this summary is to provide a
didactical and easily-accessible overview of this reconstruction. Its underlying framework is especially
engineered for unraveling the architecture of quantum theory, and so many reconstruction steps are
instructive for understanding the origin of quantum properties. As we shall see, this reconstruction
provides a transparent, informational explanation for the structure of qubit quantum theory and
especially also for its paradigmatic features, such as entanglement, monogamy and non-locality.
The approach also produces novel ‘conserved informational charges’, indeed appearing in quantum
theory, that turn out to characterize the unitary group and the set of pure states and which might find
practical applications in quantum information.
The premise of the summarized approach is to only speak about information that the
observer has access to. It is thus purely operational and survives without any ontological
commitments. This approach is inspired, in part, by Rovelli’s relational quantum mechanics [11]
and the Brukner–Zeilinger informational interpretation of quantum theory [12,13]; this successful
reconstruction can be viewed as a completion of these ideas for qubit systems.
The rest of the manuscript is organized as follows. In Section 2, we review the landscape of alternative
theories; in Section 3, we formulate the operational quantum axioms; in Section 4, we summarize the
key steps of the reconstruction itself and, finally, conclude in Section 5.
Entropy 2017, 19, 98
to identify O’s ‘catalog of knowledge’ about the given $S_a$, i.e., the collection $\{y_i\}_{\forall Q_i \in \mathcal{Q}}$, with the
state of $S_a$ relative to O. This is a state of information and an element of Σ. Conversely, any element in
Σ assigns a probability $y_i$ to each $Q_i \in \mathcal{Q}$. Thus, we identify Σ with the state space of $S_a$.
The state $\{y_i\}_{\forall Q_i \in \mathcal{Q}}$ is the prior state for the single $S_a$ to be interrogated next, but also coincides
with the state O assigns to the ensemble $\{S_a\}$ (which may only contain a single member) given that its
members are identically prepared [1].
(maximally) independent if, after having asked $Q_i$ to S in the state of no information, the posterior
probability $y_j = \tfrac{1}{2}$. That is, if the answer to $Q_i$ relative to the state of no
information tells O ‘nothing’ about the answer to $Q_j$.
dependent if, after having asked $Q_i$ to S in the state of no information, the posterior
probability $y_j \neq \tfrac{1}{2}$ (if $y_j = 0$ or $1$, they are maximally dependent). That
is, if the answer to $Q_i$ relative to the state of no information gives O at
least partial information about the answer to $Q_j$.
(maximally) compatible if O may know the answers to both $Q_i$, $Q_j$ simultaneously, i.e., if there
exists a state in Σ such that $y_i$, $y_j$ can be simultaneously zero or one.
(maximally) complementary if every state in Σ, which features $y_i = 0, 1$, necessarily implies $y_j = \tfrac{1}{2}$.
Notice that complementarity implies independence (but not vice versa).
(One can also define partial compatibility similarly [1].) These relations shall be symmetric; e.g., Qi is
independent of Q j if and only if Q j is independent of Qi , etc.
We impose a final condition on the posterior state update rule: if Qi , Q j are maximally compatible
and independent, then asking Qi shall not change y j , i.e., O’s information about Q j .
with $0 \le \alpha(y_i) \le 1$ bit and $\alpha(y) = 0$ bit $\Leftrightarrow y = \tfrac{1}{2}$ and $\alpha(1) = \alpha(0) = 1$ bit. O’s total information
about a $S_a$ must be a function of the state; we make an additive ansatz:
$$ I(\vec{y}) := \sum_{i=1}^{D} \alpha(y_i). \tag{1} $$
for otherwise O could, for some states, reduce his total information about such a set by asking another
question from it. These complementarity inequalities represent informational uncertainty relations that
describe how the information gain about one question enforces an information loss about questions
complementary to it (see also the state ‘collapse’ in Section 2.4).
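$$ \mathcal{Q}_{AB} = \mathcal{Q}_A \cup \mathcal{Q}_B \cup \tilde{\mathcal{Q}}_{AB}, \tag{3} $$
$$ Y(Q|\vec{y}) = Y(\vec{q}|\vec{y}) = \tfrac{1}{2}\left(\vec{q}\cdot(2\vec{y}-\vec{1}) + 1\right), \tag{4} $$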
where $\vec{q} \in \mathbb{R}^D$ is a question vector encoding $Q \in \mathcal{Q}$ and $\vec{1}$ is a vector with each coefficient equal to one
in the basis corresponding to $\mathcal{Q}_M$. This equation gives rise to (part of) the Born rule.
Suppose $Q, Q' \in \mathcal{Q}$ were both encoded by the same $\vec{q}$. Then, by (4), they would be probabilistically
indistinguishable, and O must view them as logically equivalent. O is free to remove any such
redundancy from his description of Q upon which every permissible question vector q will encode
a unique Q ∈ Q. Finally, for every Q ∈ Q, there exists a state yQ , which is the updated posterior
state of Sa after O received a ‘yes’ answer to the single question Q from Sa in the (prior) state of no
information. O had zero bits of information before, and yQ encodes a single independent question
answer, so we naturally require that it encodes one independent bit. Hence, for every Q ∈ Q, there
exists $\vec{y}_Q \in \Sigma$ with $I(\vec{y}_Q) = 1$ bit, such that $Y(Q|\vec{y}_Q) = 1$. (In quantum theory, the $\vec{y}_Q$ will only turn
out to be pure states for a single qubit; e.g., for two qubits and Q = ‘Is the spin of Qubit 1 up in
z-direction?’, represented by the rank-two projector $P_{z_1} = \tfrac{1}{2}(\mathbf{1} + \sigma_z \otimes \mathbf{1}_{2\times 2})$, $\vec{y}_Q$ corresponds to the
mixed state $\rho_{z_1} = \tfrac{1}{4}(\mathbf{1} + \sigma_z \otimes \mathbf{1}_{2\times 2})$. Clearly, $\mathrm{tr}(P_{z_1}\rho_{z_1}) = 1$.)
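The parenthetical example can be checked numerically. The following is a minimal sketch, assuming the standard matrix representation of $\sigma_z$ and NumPy; the variable names (`P_z1`, `rho_z1`, `info_bits`) are illustrative and not taken from [1,2]. It confirms that the projector has rank two, that the state is mixed, that the ‘yes’ probability is one, and that the state carries one bit under the quadratic measure derived later in the text.

```python
import numpy as np
from itertools import product

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULIS = [I2, X, Y, Z]

# sigma_z on Qubit 1, identity on Qubit 2
sz_1 = np.kron(Z, I2)
P_z1 = 0.5 * (np.eye(4) + sz_1)     # rank-two projector for the question Q
rho_z1 = 0.25 * (np.eye(4) + sz_1)  # posterior state after a 'yes' answer

prob_yes = np.trace(P_z1 @ rho_z1).real
rank = np.linalg.matrix_rank(P_z1)
purity = np.trace(rho_z1 @ rho_z1).real

# Sum of squared Bloch components over the 15 non-trivial Pauli strings
info_bits = sum(
    np.trace(rho_z1 @ np.kron(PAULIS[a], PAULIS[b])).real ** 2
    for a, b in product(range(4), repeat=2)
    if (a, b) != (0, 0)
)

print(prob_yes, rank, purity, info_bits)
```

Only the $\sigma_z \otimes \mathbf{1}$ component of the Bloch vector is non-zero here, which is why the state holds exactly one independent bit while being mixed.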
Rule 1. (Limited information) “The observer O can acquire maximally N ∈ N independent bits of
information about the system S N at any moment of time.”
There exists a maximal set Qi , i = 1, . . . , N, of N mutually maximally independent and compatible questions
in Q N .
Rule 2. (Complementarity) “The observer O can always get up to N new independent bits of
information about the system S N . However, whenever O asks S N a new question, he experiences no
net loss in his total amount of information about S N .”
There exists another maximal set $Q'_i$, $i = 1, \ldots, N$, of N mutually maximally independent and compatible
questions in $\mathcal{Q}_N$, such that $Q_i$, $Q'_i$ are maximally complementary and $Q_i$, $Q'_{j \neq i}$ are maximally compatible.
The peculiar mathematical form of Rule 2 becomes intuitive upon recalling that S N is a composite
system, such that complementarity should exist per elementary system [1].
Rules 1 and 2 are conceptually inspired by (non-technical) proposals made by Rovelli [11] and
Zeilinger and Brukner [12,13]. These rules say nothing about what happens in-between interrogations.
Naturally, we demand O not to gain or lose information without asking questions.
Rule 3. (Information preservation) “The total amount of information O has about (an otherwise
non-interacting) S N is preserved in-between interrogations.”
I (y) is constant in time in-between interrogations for (an otherwise non-interacting) S N .
Hence, O’s total information I (y) is a ‘conserved charge’ of any time evolution TΔt ∈ T N .
The more interactions to which O may subject S N are available, the more ways in which any state
may, in principle, change in time and, thus, the more ‘interesting’ O’s world. We therefore demand
that any time evolution is physically realizable as long as it is consistent with the other rules (since
Σ N , T N are interdependent, this is distinct from ‘maximizing the number’ of states).
Rule 4. (Time evolution) “O’s ‘catalog of knowledge’ about S N evolves continuously in time in-between
interrogations, and every consistent such evolution is physically realizable.”
T N is the maximal set of transformations TΔt on states such that, for any fixed state y, TΔt (y) is continuous in
Δt and compatible with Rules 1–3 (and the structure of the theory landscape).
(If we did not require this ‘maximality’ of T N , we would still ultimately obtain a linear, unitary
evolution, but not necessarily the full unitary group. This is the sole reason for demanding ‘maximality’.
Note that Rules 3 and 4 are not equivalent to the axiom of ‘continuous reversibility’ of generalized
probabilistic theories [3–5].)
We shall also allow O to ask any question to S N which ‘makes (probabilistic) sense’.
Rule 5. (Question unrestrictedness) “Every question that yields legitimate probabilities for every way of
preparing S N is physically realizable by O.”
Every question vector $\vec{q} \in \mathbb{R}^{D_N}$ that satisfies $Y(\vec{q}|\vec{y}) \in [0, 1]$ $\forall\, \vec{y} \in \Sigma_N$ and for which there exists $\vec{y}_Q \in \Sigma_N$
with $I(\vec{y}_Q) = 1$ bit, such that $Y(\vec{q}|\vec{y}_Q) = 1$, corresponds to a $Q \in \mathcal{Q}_N$.
(Without Rule 5, we would still obtain the structure of an informationally complete set $\mathcal{Q}_{M_N}$,
finding that it encodes a basis of projective Pauli operator measurements [2]; Rule 5 legalizes all
such measurements.)
These five rules turn out to leave two solutions for the triple $(\mathcal{Q}_N, \Sigma_N, \mathcal{T}_N)$. Remarkably, they
cannot distinguish between complex and real numbers. Namely, the two solutions are qubit quantum theory and
rebit quantum theory, the latter describing two-level systems over real Hilbert spaces [1,2]. Since the latter is both
mathematically and physically a subcase of the former, these five rules can be regarded as sufficient.
However, if one also wishes to discriminate rebits from qubits operationally, then an extra rule, adapted from [3–5]
and imposed solely for this purpose (it is partially redundant), succeeds.
Rule 6. (Tomographic locality) “O can determine the state of the composite system S N by interrogating only
its subsystems.”
As shown in [1,2], Rules 1–6 are equivalent to the textbook axioms. More precisely:
Claim. The only solution to Rules 1–6 is qubit quantum theory where:
quantum theory, and so many reconstruction steps are actually quite instructive. We now provide a
summary of key results and reconstruction steps from [1,2] (to which we refer for technical details)
needed for proving the claim of the previous section.
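$$ \begin{array}{cc|c} Q_i & Q_j & Q_i * Q_j \\ \hline 0 & 1 & a \\ 1 & 0 & a \\ 1 & 1 & b \\ 0 & 0 & b \end{array} \qquad a \neq b, \quad a, b \in \{0, 1\}. \tag{5} $$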
Hence, ∗ is either the XNOR ↔ (for a = 0, b = 1) or its negation, the XOR ⊕ (for a = 1, b = 0). Up to
an overall negation ¬, the two connectives are logically equivalent, and so, we henceforth make the
convention to only build up composite questions (for informationally complete sets) using the XNOR.
The composite question $Q_{ij} := Q_i \leftrightarrow Q_j$ is a ‘correlation question’, representing “are the answers to
$Q_i$, $Q_j$ the same?” Ultimately, in quantum theory, ↔ will turn out to correspond to the tensor product
⊗ in $\sigma_i \otimes \sigma_j$ where $\sigma_i$ is a Pauli matrix; $Q_{ij}$ will then correspond to “are the spins of Qubit 1 in the i-
and of Qubit 2 in the j-direction correlated?”
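The restriction to XNOR and XOR in (5) can be illustrated by brute force. The sketch below assumes that, in the state of no information, the answers to the two questions are uniformly distributed and uncorrelated (an assumption made for illustration), and tests all 16 binary connectives for pairwise independence of the output from each input. The helper name `is_pairwise_independent` is hypothetical.

```python
from itertools import product

def is_pairwise_independent(table):
    """table maps (q_i, q_j) -> output bit of a binary connective.

    The output is independent of an input if, for each fixed value of
    that input, the output is 'yes' in exactly half of the rows.
    """
    for position in (0, 1):      # condition on Q_i, then on Q_j
        for value in (0, 1):
            rows = [key for key in table if key[position] == value]
            if 2 * sum(table[key] for key in rows) != len(rows):
                return False
    return True

inputs = list(product((0, 1), repeat=2))  # (0,0), (0,1), (1,0), (1,1)
survivors = []
for outputs in product((0, 1), repeat=4):
    table = dict(zip(inputs, outputs))
    if is_pairwise_independent(table):
        survivors.append(outputs)

# Outputs are listed in the order of `inputs`
print(survivors)  # [(0, 1, 1, 0), (1, 0, 0, 1)], i.e. XOR and XNOR
```

Only the two truth tables with a ≠ b survive, in line with (5).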
Since O is only allowed to connect compatible questions logically, there can be no edge between
individual questions of the same system.
Using only Rules 1 and 2 and logical arguments, the following result is proven in [1]:
Lemma 1. $Q_i$, $Q'_j$, $Q_{ij}$ are pairwise independent for all $i, j = 1, \ldots, D_1$ and will thus be part of an
informationally complete set $\mathcal{Q}_{M_2}$. Furthermore:
(i) $Q_i$ is compatible with $Q_{ij}$, $\forall\, j = 1, \ldots, D_1$, and complementary to $Q_{kj}$, $\forall\, k \neq i$ and $\forall\, j = 1, \ldots, D_1$.
That is, graphically, an individual question $Q_i$ is compatible with a correlation question $Q_{ij}$ if and only if
its corresponding vertex is a vertex of the edge corresponding to $Q_{ij}$. By symmetry, the analogous result
holds for $Q'_j$.
(ii) $Q_{ij}$ and $Q_{kl}$ are compatible if and only if $i \neq k$ and $j \neq l$. That is, graphically, $Q_{ij}$ and $Q_{kl}$ are compatible
if their corresponding edges do not intersect in a vertex and complementary if they intersect in one vertex.
For example, Q1 in the third question graph above is compatible with Q11 and complementary to
Q22 , while Q11 and Q22 are compatible and Q11 and Q31 are complementary.
This lemma has a striking consequence: it implies entanglement. Indeed, since, e.g., Q11 and
Q22 are independent and compatible, O may spend his maximally accessible amount of N = 2
independent bits of information (Rule 1) over correlation questions only. Since non-intersecting edges
do not share a common vertex, the lemma implies that no individual question is simultaneously
compatible with two correlation questions that are compatible. Hence, when knowing the answers to
Q11 , Q22 , O will be entirely ignorant about the individual questions; O has then maximal information
about S2 , but purely composite information. This is entanglement in the very sense of Schrödinger
(“...the best possible knowledge of a whole does not necessarily include the best possible knowledge of all its
parts...” [14]). For example, in quantum theory, a state with Q11 = Q22 = ‘yes’ will coincide with a
Bell state having the spins of Qubits 1 and 2 correlated in x- and y-direction (and anti-correlated in
z-direction). Of course, there is nothing special about Q11 , Q22 , and the argument works similarly for
other composite question pairs and can be extended also to states with non-maximal entanglement
(see [1] for details).
For systems with limited information content, entanglement is therefore a direct consequence of
complementarity; without it there would be no independent and compatible composite questions
sufficient to saturate the information limit [1]. For instance, two classical bits satisfy Rule 1, as well,
but admit no complementarity so that $\mathcal{Q}^{\mathrm{cbit}}_{M_2} = \{Q_1, Q'_1, Q_{11}\}$ and the maximum amount of N = 2
independent bits cannot be spent on composite questions only.
We also note that Rules 1 and 2 offer a simple, intuitive explanation for monogamy of entanglement.
Consider, for a moment, N = 3 elementary systems S A , SB , SC , and suppose S A and SB are maximally
entangled (say, because O received the answer Q11 = Q22 = ‘yes’ from S AB ). Noting that S AB
is a composite bipartite system inside the tripartite S ABC , O has then already spent his maximal
amount of information of N = 2 independent bits, which he may know about S AB and can therefore
not know anything else that is independent, including non-trivial correlations with SC , about the
pair. To saturate the N = 3 independent bit limit for the tripartite system S ABC , he may then only
inquire about individual information about SC . This is monogamy in its extreme form: the maximally
entangled pair S AB cannot be entangled with any other system SC . This heuristic argument can be
made rigorous in terms of the compatibility and independence structure of questions for N ≥ 3 and
can be extended to the non-extremal case using informational monogamy inequalities [1].
Theorem 1. $D_1 = 2$ or $3$.
Proof. Consider the N = 2 case. Lemma 1 implies that any maximal set of pairwise compatible
correlation questions has $D_1$ elements. Indeed, there are maximally $D_1$ non-intersecting edges between
the $D_1$ vertices of System 1 and the $D_1$ vertices of System 2; e.g., the $D_1$ ‘diagonal’ questions
$Q_{11}, Q_{22}, Q_{33}, \ldots, Q_{D_1 D_1}$
are pairwise independent and compatible. The constraints on the posterior state update rule in
Section 2.4 entail that they are also mutually compatible (Specker’s principle) [1] such that O may
simultaneously know the answers to all $D_1$ questions $Q_{ii}$. Since O may not know more than N = 2 independent
bits (Rule 1), the $D_1$ questions $Q_{ii}$ cannot be mutually independent if $D_1 > 2$. Thus, assuming the $Q_{ii}$ are of
equivalent status, the answers to any pair of them, say $Q_{11}$, $Q_{22}$, must imply the answers to all others,
say $Q_{ii}$, $i = 3, \ldots, D_1$. Hence, $Q_{jj} = Q_{11} * Q_{22}$, $j \neq 1, 2$, for a connective ∗ that preserves pairwise
independence of $Q_{11}$, $Q_{22}$, $Q_{jj}$. Reasoning as in (5) implies that either:
$$ Q_{jj} = Q_{11} \leftrightarrow Q_{22} \quad \text{or} \quad Q_{jj} = \neg(Q_{11} \leftrightarrow Q_{22}), \tag{6} $$
so that for $D_1 > 3$ the $Q_{jj}$, $j = 3, \ldots, D_1$, could not be pairwise independent. Arguing identically for all
other sets of $D_1$ pairwise independent and compatible $Q_{ij}$, we conclude that $D_1 \le 3$.
This theorem has several crucial repercussions. We may already suggestively call D1 = 2 and
D1 = 3 the ‘rebit’ (two-level systems over real Hilbert spaces) and ‘qubit’ case, respectively. Reasoning
as in (6) shows that the Qij are logically closed under ↔; as demonstrated in [1]:
Theorem 2. If $D_1 = 3$, then $\mathcal{Q}_{M_2} := \{Q_i, Q'_j, Q_{ij}\}_{i,j=1,2,3}$ is logically closed under ↔ and, thus, constitutes
an informationally complete set for N = 2 with $D_2 = 15$.
If $D_1 = 2$, then $\mathcal{Q}_{M_2} = \{Q_i, Q'_j, Q_{ij}, Q_{11} \leftrightarrow Q_{22}\}_{i,j=1,2}$ is logically closed under ↔ and, thus, constitutes
an informationally complete set for N = 2 with $D_2 = 9$. Furthermore, $Q_{11} \leftrightarrow Q_{22}$ is complementary to the
individual questions $Q_i$, $Q'_j$, $i, j = 1, 2$.
Indeed, D2 = 9, 15 are the correct numbers of degrees of freedom for N = 2 rebits and qubits,
respectively. However, since the composite question Q11 ↔ Q22 is complementary to all individual
questions in the rebit case (this is not true in the qubit case!), it is impossible for O to do ensemble state
tomography by asking only individual questions Qi , Qj , thereby violating Rule 6. We are left with the
qubit case and shall henceforth ignore rebits (for rebits see [1]).
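The two values of $D_2$ in Theorem 2 follow from a simple count of the questions listed there. The tally below is a sketch, writing $Q'_j$ for the individual questions of the second system (a notational assumption); the comparison with $4^2 - 1$ anticipates the general qubit result $D_N = 4^N - 1$.

```latex
\begin{align*}
D_1 = 3:\quad D_2 &= \underbrace{3}_{Q_i} + \underbrace{3}_{Q'_j}
                     + \underbrace{3 \cdot 3}_{Q_{ij}} = 15 = 4^2 - 1, \\
D_1 = 2:\quad D_2 &= \underbrace{2}_{Q_i} + \underbrace{2}_{Q'_j}
                     + \underbrace{2 \cdot 2}_{Q_{ij}}
                     + \underbrace{1}_{Q_{11} \leftrightarrow Q_{22}} = 9 .
\end{align*}
```

These agree with the number of real parameters of a normalized $4 \times 4$ density matrix over the complex numbers ($16 - 1$) and over the real numbers ($10 - 1$), respectively.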
4.4. Ruling out Local Hidden Variables and the Correlation Structure for N = 2
Using (6) and repeating the argument leading to it for ‘non-diagonal’ $Q_{ij}$ shows that either:
$$ Q_{11} \leftrightarrow Q_{22} = Q_{12} \leftrightarrow Q_{21} \quad \text{or} \quad Q_{11} \leftrightarrow Q_{22} = \neg(Q_{12} \leftrightarrow Q_{21}). \tag{7} $$
The first case (without relative negation) is the case of classical logic and compatible with local hidden
variables for the individual questions $Q_i$, $Q'_j$. Namely, note that $Q_{11} \leftrightarrow Q_{22} = Q_{12} \leftrightarrow Q_{21}$ can be
rewritten in terms of the individuals as:
$$ (Q_1 \leftrightarrow Q'_1) \leftrightarrow (Q_2 \leftrightarrow Q'_2) = (Q_1 \leftrightarrow Q'_2) \leftrightarrow (Q_2 \leftrightarrow Q'_1). \tag{8} $$
Suppose for a moment that $Q_1$, $Q'_1$, $Q_2$, $Q'_2$ had simultaneous definite values (although not accessible
to O). It is easy to convince oneself that any distribution of simultaneous truth values over the $Q_i$, $Q'_j$
satisfies (8) [1]. In fact, (8) is a classical logical identity and can be argued to follow from classical
rules of inference [1]. However, it involves complementary individual questions, thereby violating
our premise from Section 2.7 that O may apply classical rules of inference exclusively to mutually
compatible questions. This classical case is thus ruled out.
One can check that the second case, Q11 ↔ Q22 = ¬( Q12 ↔ Q21 ), does not admit a local hidden
variable interpretation, but is consistent with the structure of the theory landscape and rules [1].
Since one of the two cases (7) must be true, we conclude that this second case holds. In fact, for any
complementary pairs $Q$, $Q'$ and $Q''$, $Q'''$ such that both $Q$ and $Q'$ are compatible with both $Q''$, $Q'''$,
one finds similarly [1]:
$$ (Q \leftrightarrow Q'') \leftrightarrow (Q' \leftrightarrow Q''') = \neg\big( (Q \leftrightarrow Q''') \leftrightarrow (Q' \leftrightarrow Q'') \big). \tag{9} $$
This precludes reasoning classically about the distribution of truth values over O’s questions.
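That no classical assignment of truth values can reproduce the negated case is easy to confirm by enumeration. The sketch below assumes hypothetical simultaneous values for two individual questions per system (named `q1`, `q2` for system 1 and `q1p`, `q2p` for system 2) and compares the two composite expressions $Q_{11} \leftrightarrow Q_{22}$ and $Q_{12} \leftrightarrow Q_{21}$ for all 16 assignments.

```python
from itertools import product

def xnor(a, b):
    """Logical equivalence: 1 if both answers agree, else 0."""
    return int(a == b)

classical_case = []  # Q11 <-> Q22 equals Q12 <-> Q21
negated_case = []    # Q11 <-> Q22 equals NOT(Q12 <-> Q21)
for q1, q1p, q2, q2p in product((0, 1), repeat=4):
    lhs = xnor(xnor(q1, q1p), xnor(q2, q2p))  # Q11 <-> Q22
    rhs = xnor(xnor(q1, q2p), xnor(q2, q1p))  # Q12 <-> Q21
    classical_case.append(lhs == rhs)
    negated_case.append(lhs == 1 - rhs)

print(all(classical_case))  # True: the un-negated relation is a classical identity
print(any(negated_case))    # False: no assignment satisfies the negated relation
```

Both sides reduce to the parity of the same four bits, which is why the identity holds for every assignment and the relative negation cannot be realized by pre-assigned values.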
Equation (9) permits us to unravel the complete correlation structure for Q M2 . In fact, it turns
out that there are two distinct representations of this correlation structure: one corresponding to
quantum theory in its standard representation, the other to its ‘mirror’ representation, related by a
passive (not a physical) transformation, reassigning Q1 → ¬ Q1 (in quantum theory tantamount to a
partial transpose on qubit 1) [1]. The two distinct representations turn out to be physically equivalent,
and so, a convention has to be made. Choosing the ‘standard’ case and using (9), one finds that
the compatibility and correlation structure of Q M2 can be represented graphically as in Figure 1.
For $Q$, $Q'$, $Q''$ compatible, we shall henceforth distinguish between:
Figure 1. The compatibility and correlation structure of the informationally complete set Q M2 for the
N = 2 qubit case. Two questions are compatible if connected by a triangle edge and complementary
otherwise. Red and green triangles denote odd and even correlation, respectively; e.g., Q33 = ¬( Q11 ↔
Q22 ) = Q12 ↔ Q21 . (Taken from [1].)
One can easily check that quantum theory satisfies this correlation structure for projective spin
measurements if one replaces i = 1, 2, 3 by x, y, z. For instance, Q11 = Q22 = ‘yes’ implies, by Figure 1,
the dependent Q33 = ‘no’. In quantum theory, this corresponds to the (unnormalized) Bell state with
spin correlation in the x- and y-direction and anti-correlated spins in the z-direction:
$$ |x_+ x_+\rangle - |x_- x_-\rangle = -i\,|y_+ y_+\rangle + i\,|y_- y_-\rangle = |z_+ z_-\rangle + |z_- z_+\rangle. $$
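The three expressions for this Bell state, and the correlations they encode, can be verified directly. The following is a minimal sketch, assuming the usual eigenvectors of the Pauli matrices with $|z_+\rangle = (1, 0)^T$; the helper `expectation` and all variable names are illustrative.

```python
import numpy as np

s = 1 / np.sqrt(2)
z_up = np.array([1, 0], dtype=complex)
z_dn = np.array([0, 1], dtype=complex)
x_up, x_dn = s * (z_up + z_dn), s * (z_up - z_dn)
y_up, y_dn = s * (z_up + 1j * z_dn), s * (z_up - 1j * z_dn)

# The three (unnormalized) forms of the state
via_x = np.kron(x_up, x_up) - np.kron(x_dn, x_dn)
via_y = -1j * np.kron(y_up, y_up) + 1j * np.kron(y_dn, y_dn)
via_z = np.kron(z_up, z_dn) + np.kron(z_dn, z_up)

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

psi = via_z / np.linalg.norm(via_z)

def expectation(op):
    return np.vdot(psi, op @ psi).real

correlations = [expectation(np.kron(P, P)) for P in (X, Y, Z)]
individuals = [expectation(np.kron(P, I2)) for P in (X, Y, Z)] + \
              [expectation(np.kron(I2, P)) for P in (X, Y, Z)]

print(np.round(correlations, 12))  # +1, +1, -1 for xx, yy, zz
print(np.round(individuals, 12))   # all zero
```

The vanishing individual expectation values reflect that O's two independent bits are spent entirely on correlation questions.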
of individual questions, where $\mu_a = 0, 1, 2, 3$ and $Q_0 :=$ ‘yes’. The conjunction yields ‘yes’ and ‘no’ if
an even and odd number of $Q_{\mu_a}$ = ‘no’, respectively, and thus, does not represent “are the answers to
all $Q_{\mu_a}$ the same?” As shown in [1], these conjunctions are informationally complete:
Theorem 3. (Qubits) The $4^N - 1$ questions $Q_{\mu_1 \cdots \mu_N}$, $\mu = 0, 1, 2, 3$ (we deduct the trivial question $Q_{000 \cdots 000}$),
are pairwise independent and logically closed under ↔ and, thus, form an informationally complete set $\mathcal{Q}_{M_N}$
with $D_N = 4^N - 1$. Moreover, $Q_{\mu_1 \cdots \mu_N}$ and $Q_{\nu_1 \cdots \nu_N}$ are compatible if they differ by an even number (including
zero) of non-zero indices and complementary otherwise.
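The compatibility rule of Theorem 3 can be compared with the commutation structure of Pauli strings. The sketch below rests on two assumptions made for illustration: that compatible questions correspond to commuting Pauli strings, and that "differ by an even number of non-zero indices" means an even number of positions at which both indices are non-zero and unequal. The function names are hypothetical.

```python
import numpy as np
from itertools import combinations, product

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULIS = [I2, X, Y, Z]

def pauli_string(indices):
    """Tensor product sigma_{mu_1} x ... x sigma_{mu_N}."""
    op = np.array([[1.0 + 0j]])
    for mu in indices:
        op = np.kron(op, PAULIS[mu])
    return op

def rule_says_compatible(mu, nu):
    differing = sum(1 for a, b in zip(mu, nu) if a != 0 and b != 0 and a != b)
    return differing % 2 == 0

def question_labels(n_qubits):
    # Drop the trivial all-zero label
    return [l for l in product(range(4), repeat=n_qubits) if any(l)]

def rule_matches_commutation(n_qubits):
    labels = question_labels(n_qubits)
    ops = {l: pauli_string(l) for l in labels}
    for mu, nu in combinations(labels, 2):
        commute = np.allclose(ops[mu] @ ops[nu], ops[nu] @ ops[mu])
        if commute != rule_says_compatible(mu, nu):
            return False
    return True

for n in (2, 3):
    print(n, len(question_labels(n)), rule_matches_commutation(n))
```

For N = 2 and N = 3 the label counts are 15 and 63, matching $4^N - 1$, and the index rule agrees with commutation for every pair.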
where $T(\Delta t) \subset \mathcal{T}_N$ defines a one-parameter matrix group [1]. Suppose $T(\Delta t)$, $T'(\Delta t') \in \mathcal{T}_N$ correspond
to two distinct interactions to which O may subject $S_N$. By Rule 4, $T(\Delta t) \cdot T'(\Delta t')$ must likewise be
contained in $\mathcal{T}_N$, and since both $T$, $T'$ are invertible, also the entire set $\mathcal{T}_N$ must be a group. We shall
henceforth often represent states with Bloch vectors $\vec{r}$.
Rules 3 and 4, together with elementary operational conditions on the information measure,
enforce it to be quadratic, $\alpha(y_i) = (2 y_i - 1)^2$, so that O’s total information (1):
$$ I_N(\vec{y}) = \sum_{i=1}^{4^N - 1} (2 y_i - 1)^2 = |\vec{r}|^2 \tag{12} $$
is simply the square norm of the Bloch vector [1]. Interestingly, this derivation would not work
without the continuity of time evolution (Rule 4). Crucially, (12) is not the Shannon entropy (see [1]
for a discussion about why the Shannon entropy is also conceptually not suitable for quantifying O’s
information). This reconstruction thereby corroborates an earlier proposal for a quadratic information
measure for quantum theory by Brukner and Zeilinger [13,15,16].
This quadratic information measure becomes key for the remaining steps of the reconstruction.
Given that (12) is a ‘conserved charge’ of time evolution (Rule 3), we can already infer that $\mathcal{T}_N \subset \mathrm{SO}(4^N - 1)$
because time evolution must be connected to the identity.
questions from $\mathcal{Q}_{M_N}$. Thus, O’s total information (12) is $2^N - 1$ bits in this case. It contains dependent
bits of information because the questions in $\mathcal{Q}_{M_N}$ are pairwise, but not all mutually independent.
Thanks to Rule 3, this is invariant under time evolution.
This allows us to distinguish two kinds of states [1]; $\vec{y}$ is called a:
pure state if
$$ I_N(\vec{y}) = \sum_{i=1}^{4^N - 1} (2 y_i - 1)^2 = (2^N - 1)\ \text{bits}, \tag{13} $$
mixed state if
$$ 0\ \text{bit} \le I_N(\vec{y}) = \sum_{i=1}^{4^N - 1} (2 y_i - 1)^2 < (2^N - 1)\ \text{bits}. \tag{14} $$
The square length of the Bloch vector thus corresponds to the number of answered questions. The state
of no information $\vec{y} = \tfrac{1}{2}\vec{1}$ has length zero bits.
As can be easily checked, quantum theory satisfies this characterization. In particular, an N-qubit
density matrix, corresponding to a pure state, has a Bloch vector with square norm equal to $2^N - 1$.
This peculiar mathematical fact now has a clear informational interpretation.
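The statement about the Bloch-vector norm can be checked for random states. This is a sketch under the assumption that the Bloch components are the expectation values of the $4^N - 1$ non-trivial Pauli strings; `bloch_information` is an illustrative name, not a function from [1,2].

```python
import numpy as np
from itertools import product

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULIS = [I2, X, Y, Z]

def pauli_string(indices):
    op = np.array([[1.0 + 0j]])
    for mu in indices:
        op = np.kron(op, PAULIS[mu])
    return op

def bloch_information(rho, n_qubits):
    """Sum of squared Pauli expectation values, i.e. |r|^2 in (12)."""
    total = 0.0
    for label in product(range(4), repeat=n_qubits):
        if any(label):  # skip the identity string
            total += np.trace(rho @ pauli_string(label)).real ** 2
    return total

rng = np.random.default_rng(0)
pure_info, mixed_info = {}, {}
for n in (1, 2, 3):
    dim = 2 ** n
    psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    psi /= np.linalg.norm(psi)
    pure_info[n] = bloch_information(np.outer(psi, psi.conj()), n)
    mixed_info[n] = bloch_information(np.eye(dim) / dim, n)

print(pure_info)   # approximately 1, 3, 7 bits
print(mixed_info)  # 0 bits for the state of no information
```

The pure-state values follow from $\mathrm{Tr}\,\rho^2 = 2^{-N}(1 + |\vec{r}|^2) = 1$, which the numerical check reproduces.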
4.8. The Bloch Ball and Unitary Group for a Single Qubit from a Conserved Informational Charge
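Since $D_1 = 3$ (cf. Section 4.3), we have that $\mathcal{Q}_{M_1} = \{Q_1, Q_2, Q_3\}$ is a maximal set of mutually
complementary questions, i.e., no further $Q \in \mathcal{Q}_1$ can be added to $\mathcal{Q}_{M_1}$ without destroying mutual
complementarity in the set (cf. Section 4.1). According to (13), a pure state satisfies:
$$ I_{N=1}(\vec{y}) = \sum_{i=1}^{3} (2 y_i - 1)^2 = r_1^2 + r_2^2 + r_3^2 = 1\ \text{bit}. \tag{15} $$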
For later, we thus observe: for pure states, the maximal mutually complementary set carries exactly 1 bit of
information, and this is a conserved charge of time evolution (Rule 3).
Rule 1 implies that, e.g., the pure state $\vec{y}^{\,*} = (1, 0, 0)$ exists in $\Sigma_1$, and we know $\mathcal{T}_1 \subset \mathrm{SO}(3)$.
However, it is clear that applying any $T \in \mathrm{SO}(3)$ to $\vec{y}^{\,*}$, according to (11), yields only states that are
also compatible with all Rules 1–3 (and the landscape). Hence, by Rule 4, we must actually have
$\mathcal{T}_1 = \mathrm{SO}(3) \simeq \mathrm{PSU}(2)$. Clearly, $\mathcal{T}_1$ then generates all quantum pure states from $\vec{y}^{\,*}$, i.e., it yields the
entire Bloch sphere (the image of any legal state under a legal time evolution is also a legal state).
Recalling that $\Sigma_1$ is convex, we obtain that $\Sigma_1 = B^3 \simeq$ convex hull of $\mathbb{CP}^1$ is the entire unit Bloch ball
with mixed states (14) lying inside; the completely mixed state equals the state of no information at the
center. $\Sigma_1$, $\mathcal{T}_1$ coincide exactly with the set of density matrices $\rho = \tfrac{1}{2}(\mathbf{1} + \vec{r} \cdot \vec{\sigma})$ and the set of unitary
transformations $\rho \mapsto U \rho\, U^\dagger$, $U \in \mathrm{SU}(2)$, respectively, for a single qubit in its adjoint (i.e., Bloch vector)
representation, where $\vec{\sigma} = (\sigma_1, \sigma_2, \sigma_3)$ is the vector of Pauli matrices. Finally, from the assumptions in
Section 2.8 and Rule 5, it is also clear that $\mathcal{Q}_1 = \{\vec{q} \in \mathbb{R}^3 \mid |\vec{q}|^2 = 1\ \text{bit}\} \simeq \mathbb{CP}^1$. This coincides with the
set of projectors $P_{\vec{q}} = \tfrac{1}{2}(\mathbf{1} + \vec{q} \cdot \vec{\sigma})$ onto the +1 eigenspaces of the Pauli operators $\vec{q} \cdot \vec{\sigma}$. Noting that:
$$ \mathrm{Tr}(\rho\, P_{\vec{q}}) = \tfrac{1}{2}\left(1 + \vec{r} \cdot \vec{q}\right) \equiv Y(Q|\vec{y}) \tag{16} $$
we also recover that (4) yields the Born rule for projective measurements. We thus have the claim of
Section 3 for N = 1 (for details see [1,2]).
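The agreement between the trace formula and the probability function (4) for a single qubit can be illustrated numerically. The sketch below assumes the relation $\vec{r} = 2\vec{y} - \vec{1}$ between Bloch vector and probability vector, as used in (12); the helper `pauli_dot` and the sampled vectors are illustrative.

```python
import numpy as np

sigma = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)
I2 = np.eye(2, dtype=complex)

def pauli_dot(v):
    """v . sigma for a real 3-vector v."""
    return np.tensordot(v, sigma, axes=1)

rng = np.random.default_rng(42)
r = rng.normal(size=3)
r *= 0.7 / np.linalg.norm(r)   # a mixed state inside the Bloch ball
q = rng.normal(size=3)
q /= np.linalg.norm(q)         # a unit question vector

rho = 0.5 * (I2 + pauli_dot(r))
P_q = 0.5 * (I2 + pauli_dot(q))

born = np.trace(rho @ P_q).real          # left-hand side of (16)
y = 0.5 * (r + 1.0)                      # probabilities of the three basis questions
Y_rule = 0.5 * (q @ (2 * y - 1.0) + 1)   # right-hand side via (4)

print(born, Y_rule)
```

The two numbers coincide up to rounding, and `P_q` squares to itself, as expected of a projector.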
4.9. Unitary Group and Density Matrices for Two Qubits from Conserved Informational Charges
Also for N = 2, it is rewarding to consider maximal mutually complementary sets within Q M2 .
Using Lemma 1, one can check that there are exactly six maximal complementarity sets containing five
questions and twenty containing three [2]; e.g., two graphical representatives are:
[Figure: question graphs of two representative maximal complementarity sets, one with five questions and one with three.]
The six maximal complementarity sets of five elements can be represented as a lattice of pentagons;
see Figure 2 (which also contains four green triangles, each representing one of the twenty maximal
complementarity sets of three questions) [2].
Figure 2. The six maximal complementarity sets represented as pentagons. Two questions are complementary
if they share a pentagon or are connected by an edge and compatible otherwise. Every pentagon is connected
to all of the other five because any Q ∈ Q M2 is contained in precisely two pentagons. The red arrows represent
the information swap (21) between Pentagons 1 and 2 that preserves all pentagon equalities (18) and defines the
time evolution generator (22). (Figure adapted from [2]. Reprinted with permission from [P. Höhn and C. Wever,
Phys. Rev. A95, 012102 2017.] Copyright (2017) by the American Physical Society.)
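Each of these sets has to satisfy the complementarity inequalities (2); specifically $0\ \text{bits} \le
I(\mathrm{Pent}_a) := \sum_{i \in \mathrm{Pent}_a} r_i^2 \le 1\ \text{bit}$ for the information carried by the five questions in pentagon a. Since
any $Q \in \mathcal{Q}_{M_2}$ is contained in precisely two pentagons (cf. Figure 2), we find:
$$ \sum_{a=1}^{6} I(\mathrm{Pent}_a) = 2 \Big( \sum_{i=1,2,3} \big( r_{i_1}^2 + r_{i_2}^2 \big) + \sum_{i,j=1,2,3} r_{ij}^2 \Big) = 2\, I_{N=2}(\vec{r}). \tag{17} $$
For pure states, $I_{N=2}(\vec{r}_{\mathrm{pure}}) = 3$ bits, so the six pentagon informations sum to 6 bits while each is at most 1 bit. This produces the pentagon equalities [2]:
$$ I(\mathrm{Pent}_a) = 1\ \text{bit}, \quad a = 1, \ldots, 6. \tag{18} $$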
Any pure state must satisfy (18), and T2 evolves pure states to pure states (Rule 3). Hence, in analogy
to N = 1: for pure states, these six maximal mutually complementary sets carry exactly one bit of information,
and these are six conserved charges of time evolution. There are further interesting constraints on the
distribution of O’s information over Q M2 [2].
It can be straightforwardly checked that quantum theory actually satisfies (18). Indeed, in the
case of quantum theory, the identity for Pent1 reads in more familiar language (pure states):
etc. Remarkably, these identities of quantum theory seem not to have been reported before in
the literature. These novel conserved informational charges are a prediction of our reconstruction,
underscoring the benefits of taking this informational approach. Additionally, these informational
charges are indispensable for deriving the unitary group and the state space, as we shall now see.
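The counting of maximal complementarity sets and the pentagon equalities can be reproduced by brute force in the quantum representation. The sketch below assumes that complementary questions correspond to anticommuting two-qubit Pauli operators and that the Bloch components are their expectation values; names such as `pentagons` and `maximal_triples` are illustrative.

```python
import numpy as np
from itertools import combinations, product

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
P = [I2, X, Y, Z]

labels = [l for l in product(range(4), repeat=2) if any(l)]   # 15 questions
ops = {l: np.kron(P[l[0]], P[l[1]]) for l in labels}
anti = {(a, b): np.allclose(ops[a] @ ops[b], -ops[b] @ ops[a])
        for a in labels for b in labels}

def mutually_complementary(subset):
    return all(anti[a, b] for a, b in combinations(subset, 2))

# Five-element sets of mutually complementary questions
pentagons = [s for s in combinations(labels, 5) if mutually_complementary(s)]

# Three-element sets that cannot be extended by any further question
maximal_triples = [
    s for s in combinations(labels, 3)
    if mutually_complementary(s)
    and not any(mutually_complementary(s + (c,)) for c in labels if c not in s)
]

# Information carried by each pentagon for a random pure two-qubit state
rng = np.random.default_rng(7)
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi /= np.linalg.norm(psi)
pentagon_info = [
    sum(np.vdot(psi, ops[l] @ psi).real ** 2 for l in pent)
    for pent in pentagons
]

print(len(pentagons), len(maximal_triples))
print(np.round(pentagon_info, 12))
```

The search returns six five-element sets and twenty maximal three-element sets, and each pentagon carries one bit for the sampled pure state.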
Using that $I(\mathrm{Pent}_a(\vec{r}))$ is conserved under $\mathcal{T}_2 \subset \mathrm{SO}(15)$ entails (with new index $i = 1, \ldots, 15$):
$$ \sum_{i \in \mathrm{Pent}_a,\ 1 \le j \le 15} r_i\, G_{ij}\, r_j = 0, \quad a = 1, \ldots, 6, \tag{19} $$
where $T(\Delta t) = \exp(\Delta t\, G)$ for $G \in \mathfrak{so}(15)$ [2]. The correlation structure of Figure 1 enforces [2]:
Each of the 15 $Q_i \in \mathcal{Q}_{M_2}$ is complementary to eight others, and since $G_{ij} = -G_{ji}$, there could be
maximally $15 \cdot 8 / 2 = 60$ linearly independent $G_{ij}$ of $\mathcal{T}_2$.
These are constructed as follows. For every pair of pentagons, there is a unique information swap
transformation that preserves (18). For instance, the red arrows in Figure 2 represent the complete
information swap between pentagons Pent1 and Pent2 (←→ is not the XNOR):
that keeps all other components fixed. (18) are preserved because every swap in (21) occurs within a
pentagon. The correlation structure of Figure 1 fixes the corresponding generator to [2]:
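$$ G^{\mathrm{Pent}_1, \mathrm{Pent}_2}_{ij} = \delta_{i2}\, \delta_{j(31)} - \delta_{i3}\, \delta_{j(21)} + \delta_{i(12)}\, \delta_{j3'} - \delta_{i(13)}\, \delta_{j2'} - (i \longleftrightarrow j). \tag{22} $$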
One can repeat the argument for all 15 pentagon pairs, producing 15 linearly independent generators [2].
Remarkably, they turn out to coincide exactly with the adjoint representation of the 15 fundamental
generators of SU(4) [2]. In particular, (22) is the generator of entangling unitaries leaving r11 invariant.
The other 45 independent generators satisfying (20) are ruled out by the correlation structure so
that $\mathcal{T}_2$ cannot be generated by anything other than these 15 pentagon swaps [2]. One can show that
the exponentiation of (linear combinations of) these 15 pentagon swaps generates PSU(4) and that
this group abides by all rules and forms a maximal subgroup of SO(15) [2]. Rule 4 then implies
$\mathcal{T}_2 \simeq \mathrm{PSU}(4)$, which is the correct set of unitary transformations $\rho \mapsto U \rho\, U^\dagger$, $U \in \mathrm{SU}(4)$, for
two qubits.
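The identification of the pentagon swap (22) with an SU(4) generator in the adjoint representation can be made concrete. The sketch below is an illustration under stated assumptions: the Hamiltonian is taken to be $H = -\tfrac{1}{2}\,\sigma_x \otimes \sigma_x$ (the sign is a convention chosen here), evolution follows $\dot\rho = -i[H, \rho]$ with $\hbar = 1$, and the indices 3 and 2 paired with (12) and (13) in (22) are read as individual questions about the second qubit. Labels $(a, b)$ stand for $\sigma_a \otimes \sigma_b$ with $\sigma_0 = \mathbf{1}$.

```python
import numpy as np
from itertools import product

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
P = [I2, X, Y, Z]

labels = [l for l in product(range(4), repeat=2) if any(l)]
ops = [np.kron(P[a], P[b]) for a, b in labels]
idx = {l: k for k, l in enumerate(labels)}

def adjoint_generator(H):
    """Matrix G with dr/dt = G r, induced by d(rho)/dt = -i [H, rho]."""
    G = np.zeros((15, 15))
    for j, sj in enumerate(ops):
        drho = -1j * (H @ sj - sj @ H)
        for i, si in enumerate(ops):
            G[i, j] = np.trace(si @ drho).real / 4
    return G

G = adjoint_generator(-0.5 * np.kron(X, X))

# The four terms of (22); qubit-1 questions are (i, 0), qubit-2 questions (0, j)
expected = np.zeros((15, 15))
expected[idx[2, 0], idx[3, 1]] += 1
expected[idx[3, 0], idx[2, 1]] -= 1
expected[idx[1, 2], idx[0, 3]] += 1
expected[idx[1, 3], idx[0, 2]] -= 1
expected = expected - expected.T   # the antisymmetrizing "- (i <-> j)" term

print(np.allclose(G, expected))
print(np.count_nonzero(np.round(G, 12)))
```

Under these conventions the adjoint generator has exactly the eight non-zero entries of (22), and its row for the component $r_{11}$ vanishes, consistent with $r_{11}$ being left invariant.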
It turns out that the set of Bloch vectors satisfying all six pentagon equalities (18) and the
conservation equations (19) for the 15 pentagon swaps splits into two sets on each of which T2 = PSU(4)
acts transitively [2]. These two sets correspond precisely to the two possible conventions of building
up composite questions either using the XNOR or XOR (cf. Section 4.1) and are therefore physically
equivalent. Adhering to the XNOR convention, we conclude that the surviving set of Bloch vectors
solving (18) and (19) is the set of N = 2 states admitted by the rules. Indeed, it coincides exactly
with the set of quantum pure states, which forms a CP3 of which PSU(4) is the isometry group [2].
Employing convexity of Σ2 , one finally finds:
Entropy 2017, 19, 98
Concluding, the new conserved informational charges (18), in analogy to (15) for N = 1, define
both the unitary group and the set of states for two qubits (for neglected details, see [2]).
which agrees with the set of normalized N-qubit density matrices (for details, see [2]).
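As shown in [2], this set is isomorphic to the set of projectors $P_{\vec q} = \tfrac{1}{2}(\mathbb{1} + \vec q \cdot \vec\sigma)$ onto the +1 eigenspaces
of the Pauli operators $\vec q \cdot \vec\sigma = \sum_{\mu_1 \cdots \mu_N} q_{\mu_1 \cdots \mu_N}\, \sigma_{\mu_1 \cdots \mu_N}$, where $\sigma_{\mu_1 \cdots \mu_N} = \sigma_{\mu_1} \otimes \cdots \otimes \sigma_{\mu_N}$ and $\sigma_0 = \mathbb{1}$.
Noting that $q_{\mu_1 \cdots \mu_N}$ corresponds to (10) reveals that the XNOR at the question level corresponds to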
the tensor product ⊗ at the operator level. One also finds that (16) again holds, such that (4) yields
the Born rule for projective measurements for arbitrary N (for the neglected details and many further
interesting properties of Q N , we refer to [2]).
ρ ( t ) = U ( t ) ρ (0) U † ( t ), (24)
$$i\,\frac{\partial \rho}{\partial t} = [H, \rho]. \tag{25}$$
We have therefore also recovered the correct time evolution equation for quantum states.
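As a numerical illustration of how (24) implies (25), the sketch below evolves a single-qubit state with $U(t) = \exp(-iHt)$ and compares a finite-difference estimate of $i\,\partial\rho/\partial t$ with the commutator $[H, \rho]$. The Hamiltonian $H = (\omega/2)\sigma_z$, the value of `OMEGA`, the initial state and all helper names are assumptions chosen for this example only (with $\hbar = 1$, as in (25)).

```python
import cmath


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


OMEGA = 1.3  # hypothetical frequency, units with hbar = 1
H = [[OMEGA / 2, 0.0], [0.0, -OMEGA / 2]]  # H = (OMEGA / 2) * sigma_z
RHO0 = [[0.5, 0.5], [0.5, 0.5]]  # initial pure state |+><+|


def unitary(t):
    """U(t) = exp(-i H t) for the diagonal Hamiltonian above."""
    return [[cmath.exp(-1j * OMEGA * t / 2), 0.0],
            [0.0, cmath.exp(1j * OMEGA * t / 2)]]


def rho(t):
    """Equation (24): rho(t) = U(t) rho(0) U(t)^dagger."""
    u = unitary(t)
    return matmul(matmul(u, RHO0), dagger(u))


def von_neumann_residual(t, dt=1e-6):
    """Largest entry of |i d(rho)/dt - [H, rho]|, using a central difference."""
    plus, minus, now = rho(t + dt), rho(t - dt), rho(t)
    h_rho, rho_h = matmul(H, now), matmul(now, H)
    return max(
        abs(1j * (plus[i][j] - minus[i][j]) / (2 * dt) - (h_rho[i][j] - rho_h[i][j]))
        for i in range(2) for j in range(2)
    )


if __name__ == "__main__":
    print(von_neumann_residual(0.7))
```

The residual is limited only by the finite-difference step and floating-point rounding, which is what one expects if (25) is the differential form of (24).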
5. Conclusions
We have reviewed and summarized the key steps from [1,2] necessary to prove the claim of
Section 3. This yields a reconstruction of the explicit formalism of qubit quantum theory from rules
constraining an observer’s acquisition of information about a system [1,2]. The derivation corroborates
the consistency of interpreting the state as the observer’s ‘catalog of knowledge’ and shows that it
is sufficient to speak only about the information accessible to him for reproducing quantum theory.
In fact, for qubits, this derivation accomplishes an informational reconstruction of the type proposed in
Rovelli’s relational quantum mechanics [11] and in the Brukner-Zeilinger informational interpretation
of quantum theory [12,13].
As a key benefit, this reconstruction also provides a novel informational explanation for the
architecture of qubit quantum theory. In particular, it explains the logical structure of a basis of spin
measurements, the dimensionality and structure of quantum state spaces, the correlation structure
and the unitarity of time evolution from the perspective of information acquisition. This unravels
previously unknown structural properties: conserved ‘informational charges’ from complementarity
relations define and explain the unitary group and the set of pure states.
Acknowledgments: The author thanks Christopher S. P. Wever for an enjoyable collaboration on [2]. The project
leading to this publication has received funding from the European Union’s Horizon 2020 research and innovation
program under the Marie Sklodowska-Curie Grant Agreement No. 657661.
Conflicts of Interest: The author declares no conflict of interest.
References
1. Höhn, P.A. Toolbox for reconstructing quantum theory from rules on information acquisition. arXiv 2014,
arXiv:1412.8323.
2. Höhn, P.A.; Wever, C.S.P. Quantum theory from questions. Phys. Rev. A 2017, 95, 012102.
3. Hardy, L. Quantum Theory From Five Reasonable Axioms. arXiv 2001, arXiv:quant-ph/0101012.
4. Dakic, B.; Brukner, C. Quantum Theory and Beyond: Is Entanglement Special? In Deep Beauty; Halvorson, H., Ed.;
Cambridge University Press: Cambridge, UK, 2011; p. 365.
5. Masanes, L.; Müller, M.P. A derivation of quantum theory from physical requirements. New J. Phys. 2011,
13, 063001.
6. Chiribella, G.; D’Ariano, G.M.; Perinotti, P. Informational derivation of quantum theory. Phys. Rev. A 2011,
84, 012311.
7. Barnum, H.; Müller, M.P.; Ududec, C. Higher-order interference and single-system postulates characterizing
quantum theory. New J. Phys. 2014, 16, 123029.
8. De la Torre, G.; Masanes, L.; Short, A.J.; Müller, M.P. Deriving Quantum Theory from Its Local Structure and
Reversibility. Phys. Rev. Lett. 2012, 109, 090403.
9. Goyal, P. From information geometry to quantum theory. New J. Phys. 2010, 12, 023012.
10. Appleby, M.; Fuchs, C.A.; Stacey, B.C.; Zhu, H. Introducing the Qplex: A Novel Arena for Quantum Theory.
arXiv 2016, arXiv:1612.03234.
11. Rovelli, C. Relational quantum mechanics. Int. J. Theor. Phys. 1996, 35, 1637–1678.
12. Zeilinger, A. A Foundational Principle for Quantum Mechanics. Found. Phys. 1999, 29, 631–643.
13. Brukner, C.; Zeilinger, A. Information and fundamental elements of the structure of quantum theory. In Time,
Quantum and Information; Castell, L., Ischebeck, O., Eds.; Springer: Berlin/Heidelberg, Germany, 2003.
14. Schrödinger, E. Discussion of Probability Relations between Separated Systems. Math. Proc. Camb. Philos. Soc.
1935, 31, 555–563.
15. Brukner, C.; Zeilinger, A. Operationally Invariant Information in Quantum Measurements. Phys. Rev. Lett.
1999, 83, 3354.
16. Brukner, C.; Zeilinger, A. Conceptual inadequacy of the Shannon information in quantum measurements.
Phys. Rev. A 2001, 63, 022113.
17. Bremner, M.J.; Dawson, C.M.; Dodd, J.L.; Gilchrist, A.; Harrow, A.W.; Mortimer, D.; Nielsen, M.A.; Osborne, T.J.
Practical Scheme for Quantum Computation with Any Two-Qubit Entangling Gate. Phys. Rev. Lett. 2002,
89, 247902.
18. Harrow, A.W. Exact universality from any entangling gate without inverses. Quant. Inf. Comput. 2009,
9, 773–777.
© 2017 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
entropy
Brief Report
Test of the Pauli Exclusion Principle in the VIP-2
Underground Experiment
Catalina Curceanu 1,2,3, *,‡,§ , Hexi Shi 1,4, *,§ , Sergio Bartalucci 1 , Sergio Bertolucci 5 ,
Massimiliano Bazzi 1 , Carolina Berucci 1,4 , Mario Bragadireanu 1,3 , Michael Cargnelli 1,4 ,
Alberto Clozza 1 , Luca De Paolis 1 , Sergio Di Matteo 6 , Jean-Pierre Egger 7 , Carlo Guaraldo 1 ,
Mihail Iliescu 1 , Johann Marton 1,4 , Matthias Laubenstein 8 , Edoardo Milotti 9 , Marco Miliucci 1 ,
Andreas Pichler 1,4 , Dorel Pietreanu 1,3 , Kristian Piscicchia 2,1 , Alessandro Scordo 1 ,
Diana Laura Sirghi 1,3 , Florin Sirghi 1,3 , Laura Sperandio 1 , Oton Vazquez Doce 1,10 ,
Eberhard Widmann 4 and Johann Zmeskal 1,4
1 Laboratori Nazionali di Frascati, INFN, I-00044 Frascati, Italy
2 CENTRO FERMI - Museo Storico della Fisica e Centro Studi e Ricerche ‘Enrico Fermi’, I-00184 Rome, Italy
3 Institutul National pentru Fizica si Inginerie Nucleara Horia Hulubei, IFIN-HH,
R-077125 Magurele, Romania
4 Stefan-Meyer-Institute for Subatomic Physics, Austrian Academy of Science, A-1090 Vienna, Austria
5 Dipartimento di Fisica e Astronomia, Università di Bologna, I-40127 Bologna, Italy
6 Institut de Physique UMR CNRS-UR1 6251, Université de Rennes1, F-35042 Rennes, France
7 Institut de Physique, Université de Neuchâtel, CH-2000 Neuenburg, Switzerland
8 Laboratori Nazionali del Gran Sasso, INFN, I-67100 Assergi L’Aquila, Italy
9 Dipartimento di Fisica, Università di Trieste and INFN-Sezione di Trieste, I-34127 Trieste, Italy
10 Excellence Cluster Universe, Technische Universität München, D-85748 Garching, Germany
* Correspondence: C.C. and H.S.; Tel.: +39-06-9403-2321 (C.C.)
† This paper is an extended version of our paper published in the XIV International Conference on Topics in
Astroparticle and Underground Physics (TAUP2015), 7–11 September 2015, Torino, Italy.
‡ Current address: Laboratori Nazionali di Frascati, INFN, Via E. Fermi 40, I-00044, Frascati, Italy.
§ These authors contributed equally to this work.
Abstract: The validity of the Pauli exclusion principle—a building block of Quantum Mechanics—is
tested for electrons. The VIP (violation of Pauli exclusion principle) and its follow-up VIP-2
experiments at the Laboratori Nazionali del Gran Sasso search for X-rays from copper atomic
transitions that are prohibited by the Pauli exclusion principle. The candidate events—if they
exist—originate from the transition of a 2p orbit electron to the ground state which is already occupied
by two electrons. The present limit on the probability for Pauli exclusion principle violation for
electrons set by the VIP experiment is $4.7 \times 10^{-29}$. We report a first result from the VIP-2 experiment
that improves on the VIP limit and supports the final goal of a gain of two orders of magnitude
in the long run.
1. Introduction
The Pauli exclusion principle (PEP) states that in a system there cannot be two (or more) fermions
with all quantum numbers identical, and is a fundamental principle in physics. The validity of the PEP
is the basis of the periodic table of elements, electric conductivity in metals, the degeneracy pressure
which makes white dwarfs and neutron stars stable, as well as many other phenomena in physics,
chemistry, and biology. In quantum mechanics (QM), the states of particles are described in terms of
wave functions. For identical particles, with respect to their permutation, the states are necessarily
either symmetric for bosons, or antisymmetric for fermions. This “symmetrization postulate” [1]
excludes the mixing of different symmetrization groups, and it is at the basis of the PEP. Messiah
and Greenberg noted in [2] that this superselection rule “does not appear as a necessary feature of
the QM description of nature”. In this context, the violation of PEP is equivalent to the violation of
spin-statistics [3], and experimentally to the existence of states of particles that follow statistics other
than the fermionic or the bosonic ones.
Exhaustive reviews of the experimental and theoretical searches for a small violation of the PEP
or the violation of spin-statistics can be found, for example, in [3,4]. We first point out that there
is no established model in quantum field theory that can explicitly include small violations of the
PEP. Secondly, although many experimental searches present limits for the violation, the parameters
that quantify the limits are model/system-dependent and are not generally comparable. Moreover,
in order to search for states that are in a mixed symmetry, it is crucial to introduce new states into
the system, among which the PEP-violating states may be found. Ramberg and Snow [5] took this
argument into account by running a high electric DC current through a copper conductor, and they
searched for X-rays from transitions that are PEP-forbidden after electrons are captured by copper
atoms. In particular, they searched for PEP-violating transitions from the 2p level to the 1s level, which
is already occupied by two electrons. Due to the shielding effect of the additional electron in the
ground level, the energy of such anomalous transitions deviates from the copper Kα X-ray at 8 keV
by about 300 eV [6], a shift that is distinguishable in precision spectroscopic measurements. Since the new
electrons from the current are supposed to have no a-priori established symmetry with the electrons
inside the copper atoms, the detection of the energy-shifted X-rays is an explicit indication of the
violation of spin-statistics, and thus the violation of the PEP for electrons.
We note that one known system in which the dichotomy of fermions and bosons does
not hold is two-dimensional condensed matter, through the (fractional) quantum
Hall effect [7]. Particles that are neither fermions nor bosons, and that may exist in electronic
systems confined to two spatial dimensions, have been constructed theoretically and investigated
in the laboratory, with results in close agreement with theory, as reviewed in [8]. The physics of this
special system is exciting in itself and may provide hints for searches for the violation of the PEP in
other systems.
In Section 2, we will introduce the VIP (violation of Pauli exclusion principle) and VIP-2
experiments at Laboratori Nazionali del Gran Sasso (LNGS), and in Section 3 the first results from the
physics run of VIP-2 in 2016, which already improved the best result previously achieved by the VIP
experiment with 3 years of data collection. The paper ends with conclusions and future perspectives.
Entropy 2017, 19, 300
2. VIP-2 Experiment
The first experiment performed in the LNGS-INFN underground laboratory, the VIP
experiment, used a method similar to that of Ramberg and Snow, and the same definition of
the parameter representing the probability that the PEP is violated, to allow a direct comparison of the
experimental results. An improvement in sensitivity was achieved firstly by performing the experiment
in the low-radioactivity laboratory at LNGS, which offers excellent shielding against
cosmic rays. Secondly, the use of charge-coupled devices (CCDs) as the X-ray detectors, with a
typical energy resolution of 320 eV at 8 keV, increased the precision in the definition of the region of
interest in which to search for anomalous X-rays. The VIP experiment set the limit for the probability of PEP
violation for electrons at $4.7 \times 10^{-29}$ [9–11].
By using new X-ray detectors and an active shielding of scintillators, the VIP-2 experiment plans
to further improve the sensitivity by two orders of magnitude. Firstly, the layout of the copper strip
target and of the X-ray detectors has been changed, which allows a larger
acceptance for the X-ray detection. Secondly, a DC current of 100 A is applied instead of 40 A,
which introduces more than twice as many new electrons into the copper strip. Finally, in addition to
the improved passive shielding surrounding the setup to reduce the background generated by
environmental radiation, the use of silicon drift detectors (SDDs) as the X-ray detectors allows the
implementation of an active shielding using scintillators, as illustrated in Figure 1a, which removes
the background induced by high-energy charged particles that are not shielded. More details of
the detectors and the VIP-2 setup are given in [12–15].
Figure 1. (a) The design of the core components of the VIolation of Pauli exclusion principle 2
(VIP-2) setup, including the silicon drift detectors (SDDs) as the X-ray detector, the scintillators as
active shielding with silicon photomultiplier readout; (b) a picture of the VIP-2 setup in operation at
the underground laboratory of Gran Sasso.
The VIP-2 trigger logic was implemented using the Nuclear Instrumentation Module (NIM)
standard modules, and it is defined by either an event at any SDD or a coincidence between two layers
of the veto detector. A Versa Module Europa (VME) based data acquisition system for the detectors was
constructed. It records the energy deposit of the six SDDs from the output of a CAEN 568 spectroscopy
amplifier which processes the analog signals of the SDD preamplifier output. The charge-to-digital
converter (QDC) signals of the 32 scintillator channels and the timing information of the SDDs with respect to
the main trigger are recorded in the data as well. The data acquisition computer transfers data from
the VME whenever there is one event ready in the memories of the modules, and clears the registers
of the VME when the data transfer is done. During the whole communication process between the
computer and the VME controller, the trigger logic is prohibited from receiving further events. The
user interface of the LabVIEW-based data-taking program can be remotely accessed and controlled
from the computer terminals outside the Gran Sasso laboratory.
The temperatures of the SDDs, the copper conductor, the cooling system, as well as the ambient
temperature and vacuum pressure of the setup are monitored by a slow control system. The slow
control, which can be accessed from remote terminals, also controls the DC power supply to switch
the current applied to the copper strip on and off. A closed-circuit chiller coupled to a cooling pad
attached to the copper strips keeps the strips at a constant temperature below 25 °C when a DC
current of up to 100 A is applied. The temperature of the SDDs’ holder frame changed by less than 2
K when the 100 A current was applied to the copper strip. At this level of temperature variation, the
change in the energy resolution of the SDDs is negligible.
In November 2015, after having performed exhaustive tests in the laboratory, the VIP-2 setup
was transported and mounted in the Gran Sasso underground laboratory, as shown in Figure 1b.
After tuning and optimization, we started the first campaign of data taking with the
complete detector system in October 2016. The energy calibration of the SDDs was performed in situ, by placing a
weak iron-55 source covered by a 25 μm-thick titanium foil near the detectors. The manganese K-series
X-rays from the source partly pass through the foil and partly irradiate it, generating titanium
K-series X-rays. These fluorescence X-rays are detected by the SDDs at an overall rate of about 2 Hz,
and provide reference energy peaks to calibrate the digitized SDD signals to an energy scale.
Following notation similar to that used in the Ramberg–Snow and VIP experiment papers,
the number of possible PEP-violating events, $\Delta N_X$, is related to the $\beta^2/2$ parameter giving the
probability of PEP violation [16]:
$$\Delta N_X \ge \frac{1}{2}\beta^2\, N_{\mathrm{new}}\, \frac{1}{10}\, N_{\mathrm{int}} \times (\text{detection efficiency factor}) = \beta^2\, \frac{(\Sigma I \Delta t)\, D}{e \mu}\, \frac{1}{20} \times (\text{detection efficiency factor}). \tag{1}$$
Furthermore, the number of new electrons that pass through the conductor,
$$N_{\mathrm{new}} = \frac{1}{e} \sum I \Delta t, \tag{2}$$
is given by the electric charge $e$ of the electron, the intensity $I$ of the applied DC current, and the
duration time Δt of the measurement. The minimum number of internal scattering processes between
a new electron and the atoms of the copper lattice, Nint , is of order D/μ, where D is the length of the
copper strip (10 cm), and μ is the mean free path of electrons in copper. We follow the same assumption
used in the VIP paper [17], that the capture probability of a new electron by an atom of the copper
lattice is greater than 1/10 of the scattering probability.
[Figure 2 plot: two panels labelled “34 days” and “28 days”, each marked “Preliminary”; x-axis: Energy [eV], 3000 to 11000; vertical axis with decade ticks up to 10^5; marked features: Ti, Mn and Cu lines and the ROI.]
Figure 2. The energy spectra from all the SDDs, for data with and without applied DC current to
the copper strip, taken during the physics run in late 2016 at the Laboratori Nazionali del Gran
Sasso (LNGS).
The detection efficiency factor is evaluated with a Monte Carlo simulation based on Geant4.10
with a realistic detector configuration, taking into account: the transmission rate of a copper Kα X-ray
that originates at a random position inside the copper strip and reaches the surface; the geometrical
acceptance of the six SDD detectors for photons coming from the surface of the copper strip;
and the detection efficiency of a copper Kα X-ray by the 450 μm-thick SDD unit. The value is determined
to be about 1%.
With $D = 10$ cm, $\mu = 3.9 \times 10^{-6}$ cm, $e = 1.602 \times 10^{-19}$ C, $I = 100$ A, and normalizing the
measurement time with current to 34 days, using the three sigma upper bound of $\Delta N_X = 41 \pm 66$ to
give a 99.7% C.L., we get an upper limit for the $\beta^2/2$ parameter:
$$\frac{\beta^2}{2} \le \frac{3 \times 66}{4.7 \times 10^{30}} = 4.2 \times 10^{-29}. \tag{3}$$
[Figure 3 plot: limits on β²/2; labelled points: VIP 2011, this work, VIP-2 goal; vertical axis ticks at 10^−29 and 10^−31.]
Figure 3. All the past results from Pauli exclusion principle (PEP) violation tests for electrons with
a copper conductor, together with the result from this work and the anticipated goal of the VIP-2
experiment. Note that the result of this work comes from two months of data collection, and it is
already compatible with the VIP result from three years of operation.
We conclude with the words of Lev Okun from his 1987 paper [18]: “The special place enjoyed by
the Pauli principle in modern theoretical physics does not mean that this principle does not require further and
exhaustive experimental tests. On the contrary, it is specifically the fundamental nature of the Pauli principle
which would make such tests, over the entire periodic table, of special interest”.
Abbreviations
PEP Pauli Exclusion Principle
VIP(-2) experiment VIolation of Pauli principle (-2) experiment
CCD Charge Coupled Device
SDD Silicon Drift Detector
NIM Nuclear Instrumentation Module
VME Versa Module Europa
QDC Charge-to-Digital Converter
LNGS Laboratori Nazionali del Gran Sasso
FWHM Full Width Half Maximum
ROI Region of Interest
References
1. Messiah, A.M.L. Quantum Mechanics, Volume II; North-Holland: Amsterdam, The Netherlands, 1962; p. 595.
2. Messiah, A.M.L.; Greenberg, O.W. Symmetrization Postulate and Its Experimental Foundation. Phys. Rev.
1964, 136, B248.
3. Greenberg, O.W. Theories of Violation of Statistics. AIP Conf. Proc. 2000, 545, 113, doi: 10.1063/1.1337721.
4. Elliott, S.R.; LaRoque, B.H.; Gehman, V.M.; Kidd, M.F.; Chen, M. An Improved Limit on
Pauli-Exclusion-Principle Forbidden Atomic Transitions. Found. Phys. 2012, 42, 1015–1030.
5. Ramberg, E.; Snow, G.A. Experimental Limit on a Small Violation of the Pauli Principle. Phys. Lett. B 1990,
238, 438–441.
6. Curceanu, C.; De Paolis, L.; Di Matteo, S.; Di Matteo, H.; Sperandio, S. Evaluation of the X-ray
Transition Energies for the Pauli-Principle-Violating Atomic Transitions in Several Elements by Using the
Dirac-Fock Method. Available online: http://www.lnf.infn.it/sis/preprint/detail.php?id=5330 (accessed on
23 June 2017).
7. Prange, R.; Girvin, S.M. The Quantum Hall Effect; Springer: New York, NY, USA, 1990.
8. Stern, A. Anyons and the quantum Hall effect—A pedagogical review. Ann. Phys. 2008, 323, 204–249.
9. Curceanu, C.; Bartalucci, S.; Bertolucci, S.; Bragadireanu, M.; Cargnelli, M.; Di Matteo, S.; Egger, J.-P.;
Guaraldo, C.; Iliescu, M.; Ishiwatari, T.; et al. Experimental tests of quantum mechanics—Pauli exclusion
principle violation (the VIP experiment) and future perspective. J. Phys. Conf. Ser. 2011, 306, 012036,
doi:10.1088/1742-6596/306/1/012036.
10. Bartalucci, S.; Bertolucci, S.; Bragadireanu, M.; Cargnelli, M.; Curceanu, C.; Di Matteo, S.; Egger, J.-P.;
Guaraldo, C.; Iliescu, M.; Ishiwatari, T.; et al. The VIP experimental limit on the Pauli exclusion principle
violation by electrons. Found. Phys. 2009, 40, 765–775.
11. Sperandio, L. New Experimental Limit on the Pauli Exclusion Principle Violation by Electrons From the VIP
Experiment. Ph.D. Thesis, Tor Vergata University, Rome, Italy, 2008.
12. Shi, H.; Bartalucci, S.; Bertolucci, S.; Berucci, C.; Bragadireanu, A.M.; Cargnelli, M.; Clozza, A.; Curceanu, C.;
De Paolis, L.; Di Matteo, S.; et al. Searches for the Violation of Pauli Exclusion Principle at LNGS in VIP(-2)
experiment. J. Phys. Conf. Ser. 2016, 718, 042055, doi:10.1088/1742-6596/718/4/042055.
13. Pichler, A.; Bartalucci, S.; Bazzi, M.; Bertolucci, S.; Berucci, C.; Bragadireanu, M.; Cargnelli, M.; Clozza, A.;
Curceanu, C.; De Paolis, L.; et al. Application of photon detectors in the VIP-2 experiment to test the Pauli
Exclusion Principle. J. Phys. Conf. Ser. 2016, 718, 052030, doi:10.1088/1742-6596/718/5/052030.
14. Shi, H.; Bartalucci, S.; Bertolucci, S.; Berucci, C.; Bragadireanu, A.M.; Cargnelli, M.; Clozza, A.; Curceanu, C.;
De Paolis, L.; Di Matteo, S.; et al. Testing the Pauli Exclusion Principle for Electrons at LNGS. Phys. Procedia
2015, 62, 522–559.
15. Marton, J.; Bartalucci, S.; Bertolucci, S.; Berucci, C.; Bragadireanu, M.; Cargnelli, M.; Curceanu, C.;
Di Matteo, S.; Egger, J.-P.; Guaraldo, C.; et al. Testing the Pauli Exclusion Principle for Electrons. J. Phys.
Conf. Ser. 2013, 447, 012060, doi:10.1088/1742-6596/335/1/012060.
16. Greenberg, O.W.; Mohapatra, R.N. Local Quantum Field Theory of Possible Violation of the Pauli Principle.
Phys. Rev. Lett. 1987, 59, 2507.
17. VIP Collaboration; Bartalucci, S.; Bertolucci, S.; Bragadireanu, M.; Cargnelli, M.; Catitti, M.; Curceanu, C.;
Di Matteo, S.; Egger, J.-P.; Guaraldo, C.; et al. New experimental limit on the Pauli exclusion principle
violation by electrons. Phys. Lett. B 2006, 641, 18–22.
18. Okun, L. Possible violation of the Pauli principle in atoms. JETP Lett. 1987, 46, 529–532.
© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
entropy
Article
CSL Collapse Model Mapped with the
Spontaneous Radiation
Kristian Piscicchia 1,2, *, Angelo Bassi 3,4 , Catalina Curceanu 2,1 , Raffaele Del Grande 2 ,
Sandro Donadi 5 , Beatrix C. Hiesmayr 6 and Andreas Pichler 7
1 CENTRO FERMI—Museo Storico della Fisica e Centro Studi e Ricerche “Enrico Fermi”, 00184 Rome, Italy
2 Istituto Nazionale di Fisica Nucleare (INFN), Laboratori Nazionali di Frascati, 00044 Frascati, Italy
3 Department of Physics, University of Trieste, 34151 Miramare-Trieste, Italy
4 Istituto Nazionale di Fisica Nucleare, Sezione di Trieste, Via Valerio 2, 34127 Trieste, Italy
5 Institute of Theoretical Physics, Ulm University, Albert-Einstein-Allee 11 D, 89069 Ulm, Germany
6 Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090 Vienna, Austria
7 Stefan-Meyer-Institut für Subatomare Physik, 1090 Vienna, Austria
* Correspondence: [email protected]; Tel.: +39-06-9403-2654
Abstract: In this paper, new upper limits on the parameters of the Continuous Spontaneous
Localization (CSL) collapse model are extracted. To this end, the X-ray emission data collected by
the IGEX collaboration are analyzed and compared with the spectrum of the spontaneous photon
emission process predicted by collapse models. This study yields the most stringent limits,
compared with any other method, within a relevant range of the CSL model parameters.
The collapse rate $\lambda$ and the correlation length $r_C$ are mapped, thus allowing the exclusion of
a broad range of the parameter space.
the model. The parameter $\lambda$ has the dimensions of a rate and sets the strength of the collapse, while $r_C$ is
a correlation length which determines the spatial resolution of the collapse: for superpositions with size
much smaller than $r_C$, the collapse is much weaker than when the superposition has a
delocalization much larger than $r_C$. The originally proposed values for $\lambda$ and $r_C$ are [8] $\lambda = 10^{-16}\,\mathrm{s}^{-1}$,
$r_C = 10^{-7}$ m. Higher values for $\lambda$ were, however, put forward [11], up to $\lambda = 10^{-8 \pm 2}\,\mathrm{s}^{-1}$.
The interaction with the noise field causes an extra emission of electromagnetic radiation for
charged particles [7], which is not predicted by standard quantum mechanics. Such an effect is known
as spontaneous radiation emission. We show that the measurement of the radiation allows for a mapping
of the two relevant parameters λ and rC (see also Ref. [12]) into a two-dimensional parameter space,
i.e., we can present an exclusion plot. This gives a considerable reduction of the possible values in the
parameter space of collapse models.
$$\frac{d\Gamma(E)}{dE} = \frac{e^2 \lambda}{4\pi^2 r_C^2 m_N^2 E}, \tag{1}$$
where $e$ is the charge of the proton, $m_N$ represents the nucleon mass and $E$ is the energy of the emitted
photon. In the non-mass proportional case, the rate takes the expression:
$$\frac{d\Gamma(E)}{dE} = \frac{e^2 \lambda}{4\pi^2 r_C^2 m_e^2 E}, \tag{2}$$
Entropy 2017, 19, 319
3. A New Limit on λ
In this work, the X-ray emission spectrum measured by the IGEX experiment [19] is analysed
in order to set a more stringent limit on the collapse rate parameter $\lambda$. IGEX is a low-background
experiment based on low-activity germanium detectors, originally dedicated to the search for
neutrinoless double beta decay (ββ0ν). The published data set [20] refers to an exposure of 80 kg day, and was
conceived to search for a dark matter WIMP signal originating from elastic scattering that produces
Ge nuclear recoils.
For the measurement in Ref. [20], one of the IGEX detectors of 2.2 kg (active mass of about
2 kg) was used. The detector, the cryostat and the shielding were fabricated following ultra-low
background techniques, in order to minimize the emission from radionuclides, which represents the main
background source in the measured X-ray spectrum (shown in Figure 1 as a black distribution).
Moreover, a cosmic muon veto covered the top and the sides of the shield. The experiment had
an overburden of 2450 m.w.e., reducing the muon flux to $2 \times 10^{-7}\,\mathrm{cm}^{-2}\,\mathrm{s}^{-1}$. The two
main sources of inefficiency are the muon veto anti-coincidence and the pulse shape
analysis. The probability of rejecting non-coincident events with the muon veto was found to be less
than 0.01. The loss of efficiency introduced by the pulse shape analysis was found to be negligible for
events above 4 keV.
Figure 1. Fit of the X-ray emission spectrum measured by the IGEX experiment [19,20], using the
theoretical fit function Equation (7). The black line corresponds to the experimental distribution; the red
dashed line represents the fit. See the text for more details.
The X-ray spectrum (Figure 1) spans the interval 4.5–48.5 keV, which is compatible with
the non-relativistic assumption for electrons, used to derive Equations (1) and (2).
The X-ray spectrum is fitted in the interval $\Delta E$ by minimising a $\chi^2$ function. The expected number of
counts for each bin of 1 keV is assumed to be described by the theoretical prediction of Equations (1) and (2):
$$\frac{d\Gamma(E)}{dE} = \frac{\alpha(\lambda)}{E}. \tag{7}$$
The $\chi^2$ minimisation presumes that the bin contents $y_i$ (number of counts in the energy
bin $E_i$) follow Gaussian distributions. Strictly speaking, the $y_i$ are Poissonian stochastic variables;
nevertheless, the approximation is reasonable for $y_i \ge 5$; this constraint is then used for the fit.
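To make the fitting step concrete, the sketch below minimises $\chi^2 = \sum_i (y_i - \alpha/E_i)^2/\sigma_i^2$ for the one-parameter model of Equation (7). It is only an illustration under stated assumptions: the variance of each bin is approximated by its content ($\sigma_i^2 = y_i$), the bin centres are hypothetical, and the counts are synthetic, noise-free values generated from a chosen `ALPHA_TRUE`; they are not the IGEX data.

```python
import math


def fit_alpha(energies_kev, counts, min_count=5.0):
    """Closed-form chi^2 minimum for the model y_i = alpha / E_i.

    Returns (alpha_hat, sigma_alpha). Bins with fewer than `min_count`
    counts are dropped, mirroring the y_i >= 5 constraint in the text.
    """
    bins = [(e, y) for e, y in zip(energies_kev, counts) if y >= min_count]
    # With sigma_i^2 = y_i, setting d(chi^2)/d(alpha) = 0 gives
    # alpha = sum(1/E_i) / sum(1/(E_i^2 y_i)).
    s1 = sum(1.0 / e for e, _ in bins)
    s2 = sum(1.0 / (e * e * y) for e, y in bins)
    alpha_hat = s1 / s2
    sigma_alpha = 1.0 / math.sqrt(s2)  # from the curvature of chi^2
    return alpha_hat, sigma_alpha


# Synthetic, noise-free illustration (NOT measured data).
ALPHA_TRUE = 115.0
ENERGIES_KEV = [e + 0.5 for e in range(5, 48)]  # hypothetical 1 keV bin centres
COUNTS = [ALPHA_TRUE / e for e in ENERGIES_KEV]

ALPHA_HAT, SIGMA_ALPHA = fit_alpha(ENERGIES_KEV, COUNTS)

if __name__ == "__main__":
    print(ALPHA_HAT, SIGMA_ALPHA)
```

Because the model is linear in $\alpha$, no iterative minimiser is needed; on noise-free input the estimator returns the value used to generate the counts.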
The result of the fit is shown in Figure 1 (red dashed line). For the free parameter of the fit, the
minimization gives the value $\alpha(\lambda) = 115 \pm 17$, corresponding to a reduced $\chi^2/(n.d.f. - n.p.) = 0.9$,
where $n.d.f.$ represents the number of degrees of freedom and $n.p.$ is the number of free parameters of the fit.
$\alpha(\lambda)$ is also considered to follow a Gaussian distribution to a good approximation. An upper limit
can then be set as $\alpha(\lambda) \le 143$ with a probability of 95%. Correspondingly, an upper limit on the
parameter $\lambda$ can be extracted using Equations (1) and (2):
$$\frac{d\Gamma(E)}{dE} = c\, \frac{e^2 \lambda}{4\pi^2 r_C^2 m^2 E} \le \frac{143}{E}, \tag{8}$$
In order to obtain the limits in Equations (10) and (11), two implicit assumptions are made on
the experimental input [20]. First, the measured spectrum is assumed to be background free, that is
to say that the upper limit on λ corresponds to the case in which all the measured X-ray emission
would be produced by spontaneous emission processes. This ansatz is conservative, and is imposed
by our ignorance regarding the contribution from known emission processes to the measured rate.
The second assumption, which is consistent with the analysis presented in Ref. [20], is that the detector
efficiency, in the range ΔE, is one, and that the inefficiencies which are introduced by the muon
veto anticoincidence and the pulse shape analysis, performed to extract the experimental spectrum in
Ref. [20], are very small for events above 4 keV.
With these assumptions in mind, the measured X-ray counts in the range ΔE can be re-analysed
in terms of their low-count Poissonian statistics. The numbers of counts yi in each energy bin Ei can
be considered as independent stochastic variables following the distributions:
$$G(y_i \mid P, \Lambda_i) = \frac{\Lambda_i^{y_i}\, e^{-\Lambda_i}}{y_i!}, \tag{12}$$
where P denotes the Poisson distribution function. The expected numbers of counts per bin Λi are
indicated with capital letters, not to be confused with the spontaneous collapse rate λ. Let us define:
$$y = \sum_{i=1}^{n} y_i, \qquad \Lambda = \sum_{i=1}^{n} \Lambda_i \tag{13}$$
where n is the total number of 1 keV bins in the range ΔE, y and Λ are the total number of counts and
the expected number of total counts, respectively. Here, y is distributed according to a Poissonian of
parameter Λ(λ), where the dependence on the collapse rate parameter, which follows the theoretical
input, was explicitly indicated.
According to the Bayes theorem, the probability distribution function of Λ(λ), given the measured
y, assuming a uniform prior, is given by:
which means that G (λ) is proportional to a gamma probability distribution. Due to the assumption
that the background is negligible, Λ(λ) also represents the expected number of total signal counts ys ,
where ys is a Poissonian variable. Thus, according to Equation (8):
$$\Lambda(\lambda) = y_s + 1 = \sum_{i=1}^{n} c\,\frac{e^2 \lambda}{4\pi^2 r_C^2 m^2 E_i} + 1 = \sum_{i=1}^{n} \frac{\alpha(\lambda)}{E_i} + 1. \tag{15}$$
Substituting Equation (15) into Equation (14), the probability distribution function for the collapse
rate parameter can then be obtained:
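$$G(\lambda \mid G(y \mid P, \Lambda)) \propto \left( \sum_{i=1}^{n} \frac{\alpha(\lambda)}{E_i} + 1 \right)^{y} e^{-\left( \sum_{i=1}^{n} \frac{\alpha(\lambda)}{E_i} + 1 \right)}, \tag{16}$$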
where the measured total number of counts is y = 130. Calculating the cumulative distribution function:
$$\int_{0}^{\lambda_0} G(\lambda \mid G(y \mid P, \Lambda))\, d\lambda, \tag{17}$$
the following upper limits can be obtained on the collapse rate parameter, setting rC to the value
10−7 m, corresponding to a probability level of 95%
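The cumulative-distribution step can be illustrated at the level of the expected total counts Λ. The sketch below assumes a posterior density proportional to Λ^y e^(−Λ), that is, a Poisson likelihood with a uniform prior, and finds the value of Λ below which 95% of the probability lies for y = 130. Converting this into a limit on λ requires the constants and bin energies of Equation (15), which are not reproduced here. The function names are illustrative.

```python
import math

def posterior_cdf(upper, counts):
    """P(Lambda <= upper) for the density Lambda**counts * exp(-Lambda) / counts!.

    For integer `counts` this gamma CDF equals one minus a Poisson sum.
    """
    tail = sum(math.exp(k * math.log(upper) - upper - math.lgamma(k + 1))
               for k in range(counts + 1))
    return 1.0 - tail

def upper_limit(counts, confidence=0.95):
    """Solve posterior_cdf(limit, counts) = confidence by bisection."""
    lo, hi = 1e-9, 10.0 * (counts + 1)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if posterior_cdf(mid, counts) < confidence:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

lambda_total_limit = upper_limit(130)  # y = 130 measured counts
print(lambda_total_limit)
```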
Figure 2. Mapping of the λ − rC Continuous Spontaneous Localization (CSL) parameters: the originally
proposed theoretical values (GRW, Adler) are shown as black points; the region excluded by theory
(theory) is represented in gray. The excluded region according to our analysis is shown in cyan for the
non-mass proportional case (n-m-p) and in magenta for the mass proportional case (m-p).
Acknowledgments: We acknowledge the support of the CENTRO FERMI - Museo Storico della Fisica e Centro
Studi e Ricerche “Enrico Fermi” (Open Problems in Quantum Mechanics project) and of the EU COST
Action CA 15220. Furthermore, this paper was made possible through the support of a
grant from the Foundational Questions Institute, FQXi (“Events” as we see them: experimental test of the collapse
models as a solution of the measurement problem) and a grant from the John Templeton Foundation (ID 58158).
The opinions expressed in this publication are those of the authors and do not necessarily reflect the views of the
John Templeton Foundation. Beatrix C. Hiesmayr gratefully acknowledges the support of the Austrian Science
Fund (FWF-P26783). S. Donadi acknowledges the support of Trieste University and Istituto Nazionale di Fisica
Nucleare (INFN).
Author Contributions: Kristian Piscicchia, Catalina Curceanu, Raffaele Del Grande and Andreas Pichler analyzed
the data; Angelo Bassi, Sandro Donadi and Beatrix C. Hiesmayr gave the theoretical support for data analyses and
interpretation; Kristian Piscicchia and Catalina Curceanu wrote the paper. All authors have read and approved
the final manuscript.
References
1. Bassi, A.; Ghirardi, G.C. Dynamical reduction models. Phys. Rep. 2003, 379, 257–426.
2. Pearle, P. Collapse Models Open Systems and Measurements in Relativistic Quantum Field Theory; Lecture Notes in
Physics; Breuer, H.-P., Petruccione, F., Eds.; Springer: Berlin/Heidelberg, Germany, 1999; Volume 526.
3. Diósi, L. Models for Universal Reduction of Macroscopic Quantum Fluctuations. Phys. Rev. A 1989, 40, 1165.
4. Bassi, A. Collapse Models: Analysis of the Free Particle Dynamics. Available online: https://arxiv.org/abs/
quant-ph/0410222.pdf (accessed on 25 March 2009).
5. Adler, S.L. Quantum Theory as an Emergent Phenomenon; Cambridge University Press: Cambridge, UK, 2004;
Chapter 6.
6. Weber, T. Quantum mechanics with spontaneous localization revisited. Il Nuovo Cimento B 1991, 106, 1111–1124.
7. Fu, Q. Spontaneous radiation of free electrons in a nonrelativistic collapse model. Phys. Rev. A 1997, 56, 1806.
8. Ghirardi, G.; Rimini, A.; Weber, T. Unified dynamics for microscopic and macroscopic systems. Phys. Rev. D
1986, 34, 470.
9. Pearle, P. Combining stochastic dynamical state-vector reduction with spontaneous localization. Phys. Rev. A
1989, 39, 2277.
10. Ghirardi, G.C.; Pearle, P.; Rimini, A. Markov processes in Hilbert space and continuous spontaneous
localization of systems of identical particles. Phys. Rev. A 1990, 42, 78.
11. Adler, S.L. Lower and Upper Bounds on CSL Parameters from Latent Image Formation and IGM Heating.
J. Phys. A 2007, 40, 2935–2958.
12. Curceanu, C.; Hiesmayr, B.C.; Piscicchia, K. X-rays help to unfuzzy the concept of measurement. J. Adv. Phys.
2015, 4, 263–266.
13. Adler, S.L.; Ramazanoglu, F.M. Photon emission rate from atomic systems in the CSL model. J. Phys. A 2007,
40, 13395–13406.
14. Adler, S.L.; Bassi, A.; Donadi, S. On spontaneous photon emission in collapse models. J. Phys. A 2013,
46, 245304.
15. Donadi, S.; Bassi, A.; Deckert, D.-A. On the spontaneous emission of electromagnetic radiation in the CSL
model. Ann. Phys. 2014, 340, 70–86.
16. Miley, H.S.; Avignone, F.T., III; Brodzinski, R.L.; Collar, J.I.; Reeves, J.H. Suggestive evidence for the two
neutrino double beta decay of Ge-76. Phys. Rev. Lett. 1990, 65, 3092.
17. Laloë, F.; Mullin, W.J.; Pearle, P. Heating of trapped ultracold atoms by collapse dynamics. Phys. Rev. A 2014,
90, 52119.
18. Collett, B.; Pearle, P.; Avignone, F.; Nussinov, S. Constraint on collapse models by limit on spontaneous X-ray
emission in Ge. Found. Phys. 1995, 25, 1399–1412.
19. Aalseth, C.E.; Avignone, F.T., III; Brodzinski, R.L.; Collar, J.I.; Garcia, E.; González, D.; Hasenbalg, F.;
Hensley, W.K.; Kirpichnikov, I.V.; Klimenko, A.A.; et al. Neutrinoless double-beta decay of Ge-76: First results
from the International Germanium Experiment (IGEX) with six isotopically enriched detectors. IGEX Collab.
Phys. Rev. C 1999, 59, 2108.
20. Morales, A.; Aalseth, C.E.; Avignone, F.T., III; Brodzinski, R.L.; Cebrian, S.; Garcia, E.; Irastorza, I.G.;
Kirpichnikov, I.V.; Klimenko, A.A.; Miley, H.S.; et al. Improved constraints on WIMPs from the international
Germanium experiment IGEX. IGEX Collab. Phys. Lett. B 2002, 532, 8–14.
21. Toroš, M.; Gasbarri, G.; Bassi, A. Bounds on Collapse Models from Matter-Wave Interferometry. Available
online: https://arxiv.org/pdf/1601.03672.pdf (accessed on 31 May 2017).
22. Carlesso, M.; Bassi, A.; Falferi, P.; Vinante, A. Experimental bounds on collapse models from gravitational
wave detectors. Phys. Rev. D 2016, 94, 124036.
© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Article
Quantum Information: What Is It All About?
Robert B. Griffiths
Department of Physics, Carnegie Mellon University, Pittsburgh, PA 15213, USA
Abstract: This paper answers Bell’s question: What does quantum information refer to? It is about
quantum properties represented by subspaces of the quantum Hilbert space, or their projectors,
to which standard (Kolmogorov) probabilities can be assigned by using a projective decomposition
of the identity (PDI or framework) as a quantum sample space. The single framework rule of
consistent histories prevents paradoxes or contradictions. When only one framework is employed,
classical (Shannon) information theory can be imported unchanged into the quantum domain.
A particular case is the macroscopic world of classical physics whose quantum description needs
only a single quasiclassical framework. Nontrivial issues unique to quantum information, those with
no classical analog, arise when aspects of two or more incompatible frameworks are compared.
1. Introduction
A serious study of the relationship between quantum information and quantum foundations
needs to address Bell’s rather disparaging question, “Quantum information ... about what?” found
in the third section of his polemic against the role of measurement in standard (textbook) quantum
mechanics [1]. The basic issue has to do with quantum ontology, “beables” in Bell’s language. I believe
a satisfactory answer to Bell’s question is available, indeed was already available (in a somewhat
preliminary form) at the time he was writing. (If he was aware of it, Bell did not mention it in any of
his publications.) Further developments have occurred since, and I have found this approach to be
of some value in addressing some of the foundational issues which have come up during my own
research on quantum information. So I hope the remarks which follow may assist others who find the
textbook (both quantum and quantum information) presentations confusing or inadequate, and are
looking for something better.
Here is a summary of the remainder of this paper. The discussion begins in Section 2 by asking
Bell’s question about classical (Shannon) information: what is it all about? That theory works very
well in the world of macroscopic objects and properties. Hence if classical physics is fundamentally
quantum mechanical, as I and many others believe, and if Shannon’s approach is, as a consequence,
quantum information theory applied to the domain of macroscopic phenomena, we are already
half way to answering Bell’s question. The other half requires extending Shannon’s ideas into the
microscopic domain where classical physics fails and quantum theory is essential. This is possible,
Section 3, using a consistent formulation of standard (Kolmogorov) probability theory applied to
the quantum domain. Current quantum textbooks do not provide this, though their discussion of
measurements, Section 4, gives some useful hints. The basic approach in Section 3 follows von
Neumann: Hilbert subspaces, or their projectors, represent quantum properties, and a projective
decomposition of the identity (PDI) provides a quantum sample space. By not following Birkhoff
and von Neumann, but instead using a simplified form of quantum logic, Section 5, one has, in the
“single framework rule” of consistent histories, a means of escaping the well-known paradoxes that
inhabit the quantum foundations swamp. Section 6 argues that when quantum theory is equipped with
(standard!) probabilities, quantum information theory is identical to Shannon’s theory in the domain of
macroscopic (classical) physics, as one might have expected, since only a single quasiclassical quantum
framework (PDI) is needed for a quantum mechanical description. However, classical information
theory also applies, unchanged, in the microscopic quantum domain if only a single framework is
needed. Section 7 provides a perspective on the highly nontrivial problems that are unique to quantum
information and lack any simple classical analog: they arise when one wants to compare (not combine!)
two or more incompatible frameworks applied to a particular situation.
3. Quantum Probabilities
If we want quantum information theory to look something like Shannon’s theory, the first task is
to identify a quantum sample space of mutually-exclusive properties to which probabilities can be
assigned. The task will be simplest if these quantum probabilities obey the same rules as their classical
counterparts. In particular, since Shannon’s theory employs expressions like pj log(pj), it would
Entropy 2017, 19, 645
be nice if the quantum probabilities were nonnegative real numbers, in contrast to the negative
quasiprobabilities sometimes encountered in discussions of quantum foundations.
Can we identify a plausible sample space which relative to the quantum Hilbert space plays
a similar role to a tiling of a classical phase space? (In what follows, I will assume that the quantum
Hilbert space is a finite-dimensional complex vector space with an inner product. Thus, all subspaces
are closed, and we can ignore certain mathematical subtleties needed for a precise discussion of
infinite-dimensional spaces.) A useful beginning is suggested by the quantum textbook approach to
probabilities given by the Born rule. Let A be an observable, a Hermitian operator on the quantum
Hilbert space, and let
$$A = \sum_j a_j P_j \tag{1}$$
be its spectral representation: the aj are its eigenvalues and the Pj are projectors, orthogonal projection
operators, which form a projective decomposition of the identity I (PDI):
where [φ] is a convenient abbreviation for the Dirac dyad |φ⟩⟨φ|.
According to the textbooks, given a normalized ket |ψ⟩, the probability that when A is measured
the outcome is aj, is given by the Born rule:
where the final equality applies only when Pj is the rank one projector in (3). Now a measurement of
A will yield just one eigenvalue, not many, so these eigenvalues correspond to the mutually-exclusive
properties Pj in the PDI used in (4). The idea that a quantum property should be associated with
a subspace of the Hilbert space, or the corresponding projector, goes back at least to von Neumann,
see Section III.5 of his oft-cited (but little read) book [2].
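The role of a PDI as a sample space can be made concrete for a spin-half particle. The sketch below is an illustration rather than part of the original text: it takes A = Sx (in units of h̄), builds its two rank-one projectors, checks that they are mutually exclusive and sum to the identity, and evaluates the Born probabilities as ⟨ψ|Pj|ψ⟩ for an arbitrarily chosen ket. All function and variable names are assumptions made for this example.

```python
import math

def dyad(ket):
    """Projector [phi] = |phi><phi| for a normalized ket, as a nested list."""
    return [[a * b.conjugate() for b in ket] for a in ket]

def matmul(m1, m2):
    n = len(m1)
    return [[sum(m1[i][k] * m2[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expectation(op, ket):
    """<psi|op|psi> for a normalized ket."""
    n = len(ket)
    return sum(ket[i].conjugate() * op[i][j] * ket[j]
               for i in range(n) for j in range(n))

s = 1.0 / math.sqrt(2.0)
x_plus = [complex(s), complex(s)]    # eigenvector for Sx = +1/2, written in the Sz basis
x_minus = [complex(s), complex(-s)]  # eigenvector for Sx = -1/2
pdi = [dyad(x_plus), dyad(x_minus)]
eigenvalues = [0.5, -0.5]

# Completeness: the projectors sum to the identity.
total = [[pdi[0][i][j] + pdi[1][i][j] for j in range(2)] for i in range(2)]
# Mutual exclusivity: the product of distinct projectors vanishes.
product = matmul(pdi[0], pdi[1])
# Spectral representation: A = sum_j a_j P_j reproduces Sx.
sx = [[sum(a * p[i][j] for a, p in zip(eigenvalues, pdi)) for j in range(2)]
      for i in range(2)]

# Born probabilities for an example ket (the angle is chosen arbitrarily).
psi = [complex(math.cos(math.pi / 8)), complex(math.sin(math.pi / 8))]
probabilities = [expectation(p, psi).real for p in pdi]
print(probabilities, sum(probabilities))
```

The probabilities are nonnegative and sum to one, as required of a standard (Kolmogorov) probability assignment on this sample space.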
The projector Pj has eigenvalues 0 and 1, so it resembles an indicator function on the classical phase
space. In fact, a PDI divides up the Hilbert space into a set of mutually exclusive subspaces (Pj Pk = 0 for
j ≠ k), somewhat like a tiling of the classical phase space, whereas I = ∑j Pj tells us this tiling is complete:
no part of the Hilbert space has been left out. Thus, the PDI is a plausible candidate for a quantum sample
space. The event algebra will then consist of the projectors in the PDI along with other projectors formed
from their sums, including I, along with the zero operator. The result is a commutative Boolean algebra.
We already have one scheme, (4), for assigning probabilities to elements of the PDI, and thus, by additivity,
to all the projectors in the event algebra. In particular, for j ≠ k,
4. Quantum Measurements
There is, of course, more to be said, and it can be motivated by noting that a carefully written
quantum textbook is likely to assign the probability p j not to the microscopic property of the measured
system, represented by Pj , but instead to the macroscopic measurement outcome, the pointer position in the
picturesque, albeit archaic, language of quantum foundations. However, in the above presentation,
it looks as if the probability is assigned directly to the microscopic property. Was this a mistake? Not if
one believes, as I do, that a properly constructed and calibrated apparatus designed to measure some
quantum observable can actually do what it was designed to do. Furthermore, if there is a one-to-one
correspondence between prior properties and later pointer positions, the probability p j will be the
same for both.
In support of my belief that quantum measurements measure something, I note that this is
assumed by my colleagues who do experiments at accelerator laboratories. They think that when they
detect a fast muon emerging from an energetic collision, there really was a fast muon that approached
and triggered their detector. Are they being naive? I do not think so. In passing, I note that these
colleagues do not seem to worry about the “collapse” of the muon wavefunction produced by its
interaction with the detector; they are less interested in what happened to the muon after it left their
measuring device, and more interested in knowing what it was doing before it arrived there.
In addition, the notion that outcome j corresponds to the earlier property Pj can in certain cases
be tested by preparing a particle which has the property Pj (see Section IV C of [3] on the topic of
preparation), sending it into the measurement apparatus, and seeing whether the result is that the
pointer points to j. Given that the apparatus has been tested and calibrated in this way, is not the
experimenter justified in thinking that the particle had the property indicated by the pointer in a run
in which the particle was not prepared in one of the Pj states? Justified or not, this is how many
of my colleagues who carry out experiments do interpret things, and if they did not it would be
difficult to draw interesting conclusions from their data. Quantum physics can hardly be called an
experimental science if experiments designed to reveal prior microscopic properties do not actually do
so! For additional details on the topic of what quantum measurements measure, including POVM and
weak measurements, see [3].
There is, to be sure, a conceptual difficulty lurking in the background if we assume that
measurements reveal prior microscopic properties. A hint is provided by the (correct) statement
in textbooks that the x and z components of spin angular momentum, Sx and Sz , of a spin-half
particle cannot be measured simultaneously. True, but what principle lies behind this? If we assume
that experimenters really do understand something about what their devices measure, their inability
to carry out such a simultaneous measurement might plausibly be explained by the fact that there is
nothing there to be measured. Even very skilled experimenters cannot measure what is not there; indeed,
this could be one thing that distinguishes them from less capable colleagues.
The Hilbert space of a spin-half particle is two-dimensional, and while it contains two subspaces
corresponding to Sx = ±1/2 (in units of h̄), and another two corresponding to Sz = ±1/2, there is no
subspace which can plausibly be associated with, to take an example, “Sx = +1/2 AND Sz = −1/2”.
Hence if we assume that quantum measurements measure microscopic properties represented by
subspaces of the quantum Hilbert space (or their projectors), we have a ready explanation for what lies
behind the assertion that Sx and Sz cannot both be measured simultaneously. This is one way in which
quantum mechanics is very different from classical mechanics.
5. Incompatible Properties
a classical property F takes one of two values, 0 and 1, while a quantum projector P has eigenvalues
that are either 0 or 1. In addition, the negation “NOT F” of a classical property has an indicator function
I (γ) − F (γ), where I (γ) is the function which is equal to 1 everywhere on the phase space. Similarly,
the negation “NOT P” of a quantum projector P is the projector I − P, with I the quantum identity
operator. However, the analogy begins to break down when we consider the conjunction “F AND G” of
two classical properties: the property which is true if and only if both F and G are true. It corresponds
to the intersection of the two subsets of phase space points associated with F and G, and its indicator
is the product F (γ) G (γ) of the two indicators. So we might expect that the conjunction “P AND Q” of
two quantum properties P and Q would be represented by the product PQ. Indeed, this is the case if
the projectors P and Q commute, PQ = QP, in which case PQ is again a projector. However, if PQ is
not equal to QP, then neither product is a projector, and it is not obvious how to define “P AND Q”.
The point can be illustrated using Sx and Sz for a spin-half particle. The projectors representing
Sx = +1/2 and −1/2 are [x+] = |x+⟩⟨x+| and [x−], where |x+⟩ and |x−⟩ are the eigenvectors
corresponding to Sx = +1/2 and −1/2. Since ⟨x+|x−⟩ = 0 (distinct eigenvalues mean the
eigenvectors are orthogonal), [x+][x−] = [x−][x+] = 0. Thus, these projectors commute, and the
property “Sx = +1/2 AND Sx = −1/2” is represented by the zero operator on the Hilbert space:
the property that is always false and thus never occurs. Also [x+] + [x−] = I, so these two
mutually-exclusive properties constitute a PDI, a quantum sample space. Likewise the projectors [z+]
and [z−] that correspond to Sz = +1/2 and −1/2 form a PDI.
However, neither [x+] nor [x−] commutes with either [z+] or [z−], so we cannot assign a quantum
property to “Sx = +1/2 AND Sz = −1/2” by taking the product of the projectors. Again, this is
consistent with the idea that the reason a simultaneous measurement of Sx and Sz is impossible is that
there is nothing there to be measured.
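The commutation statements in this section can be verified directly with 2 × 2 matrices written in the Sz basis. The following sketch uses illustrative helper names and checks that the Sx projectors commute with each other, that no Sx projector commutes with an Sz projector, and that a product of noncommuting projectors is not itself a projector.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def commute(p, q):
    return close(matmul(p, q), matmul(q, p))

def is_projector(p):
    """A projector is Hermitian and equal to its own square."""
    return close(matmul(p, p), p) and close(dagger(p), p)

# Projectors in the Sz basis.
z_plus = [[1.0, 0.0], [0.0, 0.0]]
z_minus = [[0.0, 0.0], [0.0, 1.0]]
x_plus = [[0.5, 0.5], [0.5, 0.5]]
x_minus = [[0.5, -0.5], [-0.5, 0.5]]

print(commute(x_plus, x_minus))                # True: the product is the zero operator
print(commute(x_plus, z_minus))                # False
print(is_projector(matmul(x_plus, z_minus)))   # False: no candidate for "Sx=+1/2 AND Sz=-1/2"
```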
are part of the macroscopic world where classical physics is an adequate approximation to quantum
physics, and noncommutation can be ignored for all practical purposes. (More in Section 6 below.) I call
this the “black box” approach to quantum foundations. One starts with the preparation of a microscopic
quantum state using a macroscopic apparatus, and then a later measurement of the state using another
macroscopic apparatus, and what lies in between—well, that is inside the black box, and we will say as
little as possible about it. A quantum |ψ⟩? That is just a symbolic way of representing the preparation
procedure. A PDI {Pj}? That is nothing but a mathematical tool for calculating the probabilities of
measurement outcomes. The black box approach has the advantage that it avoids the problem of
noncommuting quantum projectors. Its disadvantage is that it provides no way of understanding in
physical terms what is going on at the microscopic level inside the box.
A third approach was popularized by Bell and his followers: replace the noncommuting Hilbert
space projectors with commuting hidden variables. In essence, assume that in some way classical
physics applies at the microscopic level. However, if, as I believe, noncommutation of projectors and
PDI’s marks the frontier between classical and quantum physics, one should not be surprised that
an approach which is fundamentally classical—assumes a classical sample space, as is evident from
the way the mysterious symbol λ is employed in formulas—results in the famous Bell inequality that
disagrees with both quantum mechanical calculations and experimental results. (Nonlocal influences
can be ignored, since they do not exist; see [6].)
with the idea, which I have elsewhere called unicity (Section 27.3 of [5]), that at every instant of time
there is a single unique “state of the universe” which, even if we do not know what it is, determines
all physical properties. What might be its quantum counterpart? A “wavefunction of the universe”?
If there really is something of that sort, it is likely to be a horrible, uninterpretable superposition
of different pointer positions at the end of a measurement, or some other form of Schrödinger cat.
The corresponding projector will then not commute with properties that might resemble something
in the ordinary macroscopic world, and the single framework rule will then prevent discussing the
world of everyday experience. I do not see any way in which a single quantum state could plausibly
represent the “true state of the world”, and I believe unicity must be abandoned in the transition from
classical to quantum physics.
In practice, the choice of which framework to use will depend on the problem one is interested
in. Consider, for example, a situation in which a spin-half particle is prepared in an eigenstate of Sx ,
say Sx = +1/2, before being sent through a magnetic field-free region (so its spin direction will not
change) into an Sz measuring device. The outcome of the measurement will be either Sz = +1/2
or Sz = −1/2; let us assume the latter. This means we can say that Sz was −1/2 just before the
measurement took place. However, is it possible that the particle had both Sx = +1/2 (because it
was prepared in this state) and Sz = −1/2 (the value measured later) at the same time, just before
the measurement was made? This makes no sense, as the properties are incompatible. There is
one framework in which at the intermediate time Sx = +1/2, reflecting its earlier preparation,
and a different, incompatible framework in which at the intermediate time Sz = −1/2, reflecting the
outcome of the later measurement. These frameworks cannot be combined, and each has its own
uses. If we are concerned about whether Sx was perturbed (say by a stray magnetic field), then the Sx
framework is helpful, while if we want to identify what the measurement measured, the Sz framework
is helpful. In textbook quantum mechanics, only the Sx framework is employed. Nothing wrong with
that, except that one cannot discuss in what way the measurement measures something, leaving the
poor student rather confused.
This example suggests that the liberty to choose different frameworks is not as dangerous as
it might at first appear. A particular choice yields some type of information, and a different choice
may yield something different. By looking at a coffee cup from above you can tell if it contains some
coffee, while to see if there is a crack in the bottom you need to look from below. The oddity about the
quantum world is not that different views, different frameworks, are possible. Instead, it is that certain
frameworks cannot be combined into a consistent quantum description, because they are incompatible.
For another, less trivial, example of a case in which choosing alternative frameworks proved useful,
see the end of Section 7.
classical dynamics. See [11]; Chapters 7, 17, 18 of [12]; Chapter 26 of [5]; and Section 4 of [10].
Consequently, we can immediately claim that all of classical information theory, all seventeen chapters
of Cover and Thomas [13], or name your favorite reference, are a valid part of quantum information
theory when it is applied to macroscopic properties and processes. In this domain, we understand
quite well what quantum information is all about: its probabilities refer to quasiclassical properties and
processes, all the things for which classical physics provides a satisfactory approximation to a more
exact quantum description.
It is worth remarking, in passing, that using a quasiclassical framework provides a solution to
the infamous measurement problem of quantum foundations: what to do with a wavefunction which is
a coherent superposition of states in which the pointer points in two (or more) directions. While in the
CH approach there is nothing inherently wrong with such a thing, it can be ignored if one wants to
describe the usual macroscopic outcomes of laboratory experiments. Use a quasiclassical framework,
and the problems represented by Schrödinger’s cat are absent—and, by the single framework rule,
they are excluded from the description.
In addition, Shannon’s theory can be employed, unchanged, in situations in which some or all of
the properties being discussed are microscopic, quantum properties, provided the discussion is restricted
to a single framework. This includes what I have elsewhere [14] referred to as the second measurement
problem: inferring from the measurement outcome (the pointer position) something about the earlier
microscopic state of the system being measured. It can be analyzed in a manner which demonstrates
that my colleagues who carry out experiments at accelerator laboratories are not being foolish when
they assert that a fast muon has triggered their detector. The measurement apparatus is, in effect,
an information channel leading from microscopic quantum properties at the input to macroscopic
quantum properties (pointer positions) at the output.
and so forth. Of course, one has to assume that the channel continues to behave in the same way,
at least in a probabilistic sense, during successive runs, but the same is true for a classical channel.
Suppose Joe has built what he claims is a perfect channel, but we want to test it. This is
straightforward for a 1-bit classical channel: send in a series of 0s and 1s, and see if what emerges from
the channel is the same as what was sent in. A one-qubit quantum channel is more complicated. If we
test it using a sequence of states in which Sz = +1/2 or −1/2, and what emerges is the same as what
went in, this is not sufficient, as it could very well be the case that if one sends in Sx = +1/2 it will
emerge with Sx either +1/2 or −1/2 in a completely random fashion, uncorrelated with the input.
So we have to check something in addition to Sz . Does this mean we have to carry out experiments with
Sw = +1/2 and −1/2 for every possible spin component w? That would take a lot of time, and is not
necessary. It suffices to check both Sz = ±1/2 and Sx = ±1/2. This result is far from obvious, and to
derive it one must use principles of quantum mechanics which have no classical analog. Quantum
information theorists need not fear unemployment; we will be kept busy for a long time.
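Why a test with Sz eigenstates alone is insufficient can be seen with a simple hypothetical example of a faulty channel: one that completely dephases the qubit in the Sz basis. The sketch below (names are illustrative) computes the probability that the state emerging from the channel still has the property that was sent in. It illustrates only the need to check Sx in addition to Sz; it does not derive the sufficiency of checking these two components.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def projector(ket):
    """|ket><ket| for a real, normalized ket."""
    return [[x * y for y in ket] for x in ket]

def perfect_channel(rho):
    return [row[:] for row in rho]

def dephasing_channel(rho):
    """Hypothetical faulty channel: keeps Sz populations, erases coherences."""
    return [[rho[0][0], 0.0], [0.0, rho[1][1]]]

def survival_probability(ket, channel):
    """Probability that the output has the same property as the input ket."""
    p = projector(ket)
    prod = matmul(p, channel(p))
    return prod[0][0] + prod[1][1]  # trace of P * channel(P)

s = 1.0 / math.sqrt(2.0)
z_plus, z_minus = [1.0, 0.0], [0.0, 1.0]
x_plus, x_minus = [s, s], [s, -s]

for name, ket in (("z+", z_plus), ("z-", z_minus), ("x+", x_plus), ("x-", x_minus)):
    print(name, survival_probability(ket, dephasing_channel))
```

The faulty channel passes every Sz test with probability one, yet an Sx input emerges with Sx = +1/2 or −1/2 with equal probability, uncorrelated with what was sent.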
As another example, consider teleportation, often presented as an instance of the mysterious
and almost magical way in which quantum mechanics goes beyond classical physics. A standard
textbook presentation of a protocol to teleport one qubit, e.g., Section 1.3.7 of [15], consists in applying
unitary time evolution to an initial quantum state, followed by a measurement which collapses it.
The measurement has four possible outcomes, and the result is communicated from A to B through
two uses of a perfect one-bit classical channel. The end result of the protocol is a quantum state
transmitted unchanged from A to B; in effect, a perfect one-qubit quantum channel. The student will
certainly learn something by working through the formulas in the textbook, but this is of limited
value in developing an intuition about microscopic quantum processes. My own approach [16]
to understanding teleportation employs two incompatible frameworks. One framework shows
how information about Sx is transmitted from Alice to Bob with the assistance of one use of the
classical channel, and the other how Sz information is transmitted with the help of the other use of
the classical channel. Similar ideas (but without referring to frameworks) will be found in [17,18].
This way of “opening the black box” should, I think, assist students in gaining a better intuition
for microscopic quantum processes, and I hope it will become more widespread in the quantum
information community, where research, or at least its publication, is still dominated by the “shut up
and calculate” mentality encouraged by textbooks.
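For readers who wish to check the textbook protocol numerically, the sketch below follows the standard state-vector description of one-qubit teleportation, not the framework analysis of [16]. Alice's two qubits are projected onto each of the four Bell states, and Bob applies the correction indicated by the two classical bits. The labelling of the bits as (x_flip, z_flip) and all function names are assumptions made for this example.

```python
import math

SQRT_HALF = 1.0 / math.sqrt(2.0)

# Bell states of Alice's two qubits, indexed [a][a_prime], keyed by the
# (x_flip, z_flip) bits she sends to Bob over the two classical channel uses.
BELL_OUTCOMES = {
    (0, 0): [[SQRT_HALF, 0.0], [0.0, SQRT_HALF]],   # (|00> + |11>)/sqrt(2)
    (0, 1): [[SQRT_HALF, 0.0], [0.0, -SQRT_HALF]],  # (|00> - |11>)/sqrt(2)
    (1, 0): [[0.0, SQRT_HALF], [SQRT_HALF, 0.0]],   # (|01> + |10>)/sqrt(2)
    (1, 1): [[0.0, SQRT_HALF], [-SQRT_HALF, 0.0]],  # (|01> - |10>)/sqrt(2)
}

def teleport(psi):
    """Return {bits: (probability, Bob's corrected ket)} for each Bell outcome."""
    # amp[a][ap][b]: input qubit a, with Alice's half ap and Bob's half b
    # of the shared pair (|00> + |11>)/sqrt(2).
    amp = [[[psi[a] * SQRT_HALF * (1.0 if ap == b else 0.0) for b in range(2)]
            for ap in range(2)] for a in range(2)]
    results = {}
    for (x_flip, z_flip), bell in BELL_OUTCOMES.items():
        # Project Alice's qubits onto the Bell state (its amplitudes are real);
        # what remains is Bob's unnormalized ket.
        bob = [sum(bell[a][ap] * amp[a][ap][b] for a in range(2) for ap in range(2))
               for b in range(2)]
        prob = sum(abs(c) ** 2 for c in bob)
        bob = [c / math.sqrt(prob) for c in bob]
        if x_flip:                      # Bob applies X
            bob = [bob[1], bob[0]]
        if z_flip:                      # then Bob applies Z
            bob = [bob[0], -bob[1]]
        results[(x_flip, z_flip)] = (prob, bob)
    return results

psi_in = [0.6 + 0.0j, 0.8j]  # arbitrary normalized input ket
outcomes = teleport(psi_in)
for bits, (prob, bob) in outcomes.items():
    print(bits, prob, bob)
```

Each of the four outcomes occurs with probability 1/4, and after the correction Bob's ket coincides with the input, which is the sense in which the protocol acts as a perfect one-qubit channel.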
The preceding example could be easily dismissed in that it did not lead (directly, at least) to
any new results in quantum information: the original teleportation protocol [19] appeared fourteen
years in advance of my analysis. Hence it may be worth mentioning another example. A student
and I were trying to understand Shor’s algorithm for factoring numbers, which ends with a quantum
Fourier transform followed by measurements of each of the qubits in the standard |0⟩, |1⟩ basis
(|z+⟩, |z−⟩ for a spin-half particle). We noted that if you suppose that the final measurement reveals
a property that the qubit possessed before the measurement, there is a way of looking at the problem
that leads to an alternative and simpler way to carry out the algorithm [20]. Our perspective required
using a framework incompatible with that employed in the standard textbook approach: unitary time
development right up to the moment when measurement “collapses” the wavefunction—which,
when done properly, leads to the same final answer. I was pleased that Nielsen and Chuang mentioned
our work (Exercise 4.35 on p. 188, and see p. 246 of [15]), but disappointed in that they presented it
as part of one more phenomenological principle, rather than as a way of gaining insight by using
measurement outcomes to infer something about what happened earlier.
In my opinion, the discipline of quantum information could benefit from paying attention to
the developments in quantum foundations mentioned above. If you open your favorite book on
quantum information you will discover that measurements are quite firmly embedded in the discussion,
and this in the manner of other textbooks in which measurements do not actually measure something,
but instead enter as a primitive concept without further definition, a rule for carrying out calculations
which requires no real physical understanding of processes at the microscopic quantum level. My guess
Entropy 2017, 19, 645
is that if quantum information texts were to provide a consistent discussion of microscopic properties
and processes, it could lead to some new and interesting advances, and perhaps even some new
insights into quantum foundations.
8. Conclusions
Bell’s question, “Quantum information ... about what?” can be given a quite definite answer.
It is about physical properties and processes, which in quantum theory are represented by subspaces
of the quantum Hilbert space, and to which standard (Kolmogorov) probabilities can be assigned,
using sample spaces constructed from projective decompositions of the identity operator (PDI’s).
The single framework rule of consistent histories forbids combining incompatible PDI’s or frameworks,
resulting in a consistent theory not troubled by unresolved quantum paradoxes. From a quantum
perspective, classical (Shannon) information theory is the application of quantum information theory
to the domain of macroscopic properties and processes, where a single quasiclassical quantum
framework is sufficient for all practical purposes, and therefore quantum incompatibilities can be
ignored. However, in addition, all the ideas of classical information, and in particular its probabilistic
formulation, can be imported unchanged into the microscopic quantum domain, as long as one is
considering only a single quantum framework.
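The role of PDIs as sample spaces can be made concrete with a small numerical sketch. The helper names below (`pdi_from_basis`, `is_pdi`, `born_probabilities`, `compatible`) and the spin-half example are assumptions chosen for illustration. Within a single PDI the Born weights form an ordinary probability distribution, while the Sz and Sx decompositions contain projectors that do not commute and so cannot be merged into one sample space.

```python
import numpy as np


def pdi_from_basis(basis):
    """Rank-one projectors onto the columns of an orthonormal basis matrix."""
    return [np.outer(basis[:, k], basis[:, k].conj())
            for k in range(basis.shape[1])]


def is_pdi(projectors):
    """Check mutual orthogonality, idempotence, and summation to the identity."""
    dim = projectors[0].shape[0]
    sums_to_identity = np.allclose(sum(projectors), np.eye(dim))
    orthogonal = all(
        np.allclose(P @ Q, P if j == k else np.zeros_like(P))
        for j, P in enumerate(projectors)
        for k, Q in enumerate(projectors)
    )
    return sums_to_identity and orthogonal


def born_probabilities(projectors, psi):
    """Probabilities <psi|P|psi> assigned to the elements of one PDI."""
    return np.array([np.real(np.vdot(psi, P @ psi)) for P in projectors])


def compatible(pdi_a, pdi_b):
    """Two PDIs are compatible when every pair of projectors commutes."""
    return all(np.allclose(P @ Q, Q @ P) for P in pdi_a for Q in pdi_b)


z_basis = np.eye(2, dtype=complex)
x_basis = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
pdi_z = pdi_from_basis(z_basis)
pdi_x = pdi_from_basis(x_basis)

psi = np.array([np.cos(0.3), np.sin(0.3)], dtype=complex)
p_z = born_probabilities(pdi_z, psi)   # a Kolmogorov distribution on {z+, z-}
p_x = born_probabilities(pdi_x, psi)   # a different distribution on {x+, x-}
```

Each of `p_z` and `p_x` is a legitimate classical distribution on its own; what has no meaning under the single framework rule is a joint distribution over both.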
That there are many distinct frameworks available in quantum theory, frameworks which
cannot be combined but can be compared, represents the new frontier of information theory that
is specifically quantum, where classical ideas no longer suffice. At this point new problems arise
in the process of comparing (but not combining) different incompatible quantum frameworks.
They have no analogs in classical information theory, and some of them are very difficult.
Progress in this domain might well benefit were textbooks to abandon their
outdated “black box” approach to quantum theory, in which “measurement” is an undefined
primitive and measurements do not actually measure anything, but are simply a calculational tool to
collapse wavefunctions. It is past time to open the black box with tools that can consistently handle
noncommuting projectors. Consistent histories provide one approach for doing this; if the reader can
come up with something better, so much the better.
Acknowledgments: Major contributions to the consistent histories interpretation of quantum mechanics have been
made over the years by Roland Omnès, Murray Gell-Mann, James Hartle, and, more recently, Richard Friedberg
and Pierre Hohenberg. We may not agree about everything, but I have certainly reaped great benefit from
conversations with and publications by these colleagues, and it is a pleasure to thank them. I am also grateful for
comments from three anonymous referees.
Conflicts of Interest: The author declares no conflict of interest.
References
1. Bell, J.S. Against measurement. In Sixty-Two Years of Uncertainty; Miller, A.I., Ed.; Plenum Press: New York,
NY, USA, 1990; pp. 17–31. Reprinted in Speakable and Unspeakable in Quantum Mechanics, 2nd ed.; Cambridge
University Press: Cambridge, UK, 2004; pp. 213–231.
2. Von Neumann, J. Mathematical Foundations of Quantum Mechanics; Princeton University Press: Princeton, NJ,
USA, 1955.
3. Griffiths, R.B. What quantum measurements measure. Phys. Rev. A 2017, 96, 032110.
4. Birkhoff, G.; von Neumann, J. The logic of quantum mechanics. Ann. Math. 1936, 37, 823–843.
5. Griffiths, R.B. Consistent Quantum Theory; Cambridge University Press: Cambridge, UK, 2002.
6. Griffiths, R.B. Quantum locality. Found. Phys. 2011, 41, 705–733.
7. Griffiths, R.B. The New Quantum Logic. Found. Phys. 2014, 44, 610–640.
8. Isham, C.J. Quantum logic and the histories approach to quantum theory. J. Math. Phys. 1994, 35, 2157–2185.
9. Griffiths, R.B. The Consistent Histories Approach to Quantum Mechanics. Stanford Encyclopedia of
Philosophy. 2014. Available online: http://plato.stanford.edu/entries/qm-consistent-histories/ (accessed on
29 November 2017).
10. Griffiths, R.B. A consistent quantum ontology. Stud. Hist. Philos. Mod. Phys. 2013, 44, 93–114.
11. Gell-Mann, M.; Hartle, J.B. Classical equations for quantum systems. Phys. Rev. D 1993, 47, 3345–3382.
12. Omnès, R. Understanding Quantum Mechanics; Princeton University Press: Princeton, NJ, USA, 1999.
13. Cover, T.M.; Thomas, J.A. Elements of Information Theory, 2nd ed.; Wiley: New York, NY, USA, 2006.
14. Griffiths, R.B. Consistent quantum measurements. Stud. Hist. Philos. Mod. Phys. 2015, 52, 188–197.
15. Nielsen, M.A.; Chuang, I.L. Quantum Computation and Quantum Information; Cambridge University Press:
Cambridge, UK, 2000.
16. Griffiths, R.B. Types of quantum information. Phys. Rev. A 2007, 76, 062320.
17. Renes, J.M.; Dupuis, F.; Renner, R. Efficient polar coding of quantum information. Phys. Rev. Lett. 2012,
109, 050504.
18. Coles, P.J.; Piani, M. Complementary sequential measurements generate entanglement. Phys. Rev. A 2014,
89, 010302.
19. Bennett, C.H.; Brassard, G.; Crépeau, C.; Jozsa, R.; Peres, A.; Wootters, W.K. Teleporting an unknown
quantum state via dual classical and Einstein-Podolsky-Rosen channels. Phys. Rev. Lett. 1993, 70, 1895–1899.
20. Griffiths, R.B.; Niu, C.-S. Semiclassical Fourier transform for quantum computation. Phys. Rev. Lett. 1996, 76,
3228–3231.
© 2017 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
entropy
Article
Entropic Phase Maps in Discrete Quantum Gravity
Benjamin F. Dribus
Department of Mathematics, William Carey University, 710 William Carey Parkway, Hattiesburg, MS 39401,
USA; [email protected] or [email protected]; Tel.: +1-985-285-5821
Abstract: Path summation offers a flexible general approach to quantum theory, including quantum
gravity. In the latter setting, summation is performed over a space of evolutionary pathways in a
history configuration space. Discrete causal histories called acyclic directed sets offer certain advantages
over similar models appearing in the literature, such as causal sets. Path summation defined in terms
of these histories enables derivation of discrete Schrödinger-type equations describing quantum
spacetime dynamics for any suitable choice of algebraic quantities associated with each evolutionary
pathway. These quantities, called phases, collectively define a phase map from the space of evolutionary
pathways to a target object, such as the unit circle $S^1 \subset \mathbb{C}$, or an analogue such as $S^3$ or $S^7$. This paper
explores the problem of identifying suitable phase maps for discrete quantum gravity, focusing on a
class of $S^1$-valued maps defined in terms of “structural increments” of histories, called terminal states.
Invariants such as state automorphism groups determine multiplicities of states, and induce families
of natural entropy functions. A phase map defined in terms of such a function is called an entropic
phase map. The associated dynamical law may be viewed as an abstract combination of Schrödinger’s
equation and the second law of thermodynamics.
Keywords: quantum gravity; discrete spacetime; causal sets; path summation; entropic gravity
1. Introduction
[Figure labels: “Nearby” paths; action deviates.]
Figure 1. In a fixed spacetime background, the Lagrangian $L$ “chooses” the classical path $\gamma_{CL}$ via
Hamilton’s principle; in a background independent theory, different paths imply different spacetimes.
In the corresponding quantum theory, the behavior of the particle depends on contributions
from every possible path. To quantify this dependence, one defines a phase map Θ on a space of paths
in spacetime, given by Feynman’s formula
$$\Theta(\gamma) = e^{\frac{i}{\hbar} S(\gamma)}, \tag{1}$$
where $i = \sqrt{-1}$ and $\hbar$ is Planck’s reduced constant. For convenience, I use the term “phase” for the
value $e^{\frac{i}{\hbar} S(\gamma)}$ itself, rather than for the “angle” $\frac{1}{\hbar} S(\gamma)$ in the complex exponential. One then performs a
path integral to “sum together” these phases. Feynman’s path integral for paths in a subset $R$ of $\mathbb{R}^4$ is
the prototypical example. Its value is interpreted as a complex quantum amplitude for $R$, encoding the
probability that the particle follows a path through R. Due to Hamilton’s principle, phases for paths
near the classical path γCL combine via constructive interference to yield relatively large amplitudes for
neighborhoods of γCL , while phases for faraway paths destructively interfere. Schrödinger’s equation
for ordinary nonrelativistic quantum theory
$$i\hbar \frac{\partial \psi}{\partial t} = H\psi, \tag{2}$$
may be derived from Feynman’s path integral [1]. Here, ψ is the state function for the particle, and H
is the Hamiltonian operator.
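The interference claim can be checked numerically in a toy setting. The sketch below is an illustration under stated assumptions, not part of the original argument: it takes a free particle of mass $m$ moving at speed $v$ for a time $T$, restricts attention to the one-parameter family of paths $x(t) = x_{CL}(t) + a \sin(\pi t / T)$, and uses the resulting action $S(a) = \frac{1}{2} m v^2 T + \frac{m \pi^2 a^2}{4T}$. The function names, the value of $\hbar$, and the window width are all hypothetical choices.

```python
import numpy as np


def action(a, m=1.0, T=1.0, v=1.0):
    """Free-particle action along x_cl(t) + a*sin(pi*t/T) (assumed path family)."""
    return 0.5 * m * v**2 * T + m * np.pi**2 * a**2 / (4.0 * T)


def mean_phase(a_center, half_width=0.5, n=2001, hbar=0.01):
    """Average of Feynman phases exp(i*S/hbar) over a window of deviations a."""
    a = np.linspace(a_center - half_width, a_center + half_width, n)
    return np.mean(np.exp(1j * action(a) / hbar))


# Window centred on the classical path (a = 0): the action is stationary there,
# so the phases vary slowly and add constructively.
near = abs(mean_phase(0.0))

# Window of the same width far from the classical path: the phases rotate
# rapidly and largely cancel.
far = abs(mean_phase(3.0))

print(f"|mean phase| near classical path: {near:.4f}")
print(f"|mean phase| far from it:         {far:.4f}")
```

Under these assumptions the near-classical window dominates the faraway one by more than an order of magnitude, which is the qualitative content of the constructive and destructive interference described above.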
Entropy 2017, 19, 322
possesses natural order-theoretic structure from which evolutionary relationships may be deduced in
a self-evident way. This is the case for discrete causal theory.
Figure 2. $\mathbb{R}^{3+1}$ partitioned via sequences of spatial sections $\{\sigma_k\}$ and $\{\sigma'_k\}$; evolutionary pathways
defined by $\{\sigma_k\}$ and $\{\sigma'_k\}$. Both pathways share the same “limit history” $\mathbb{R}^{3+1}$.
$$\psi^-_{R;\theta}(r) = \theta(r) \sum_{r^- \prec r} \psi^-_{R;\theta}(r^-). \tag{4}$$
The meaning of this equation is explained in Section 2, and more thoroughly in [14], but I briefly
describe its content here. The function $\psi^-_{R;\theta}$ is a generalized state function, called the past state function,
while R is a set of relations representing natural relationships between pairs of histories in S,
called co-relative histories. Sequences of co-relative histories fit together to define evolutionary pathways
in S, called co-relative kinematics. The relations $r$ and $r^-$ are elements of R representing specific
co-relative histories. The precursor symbol ≺ in the expression $r^- \prec r$ indicates that the evolutionary
relationship represented by $r$ is a possible sequel to the evolutionary relationship represented by $r^-$.
Remaining to be identified in Equation (4) is the relation function θ, which is the entity of principal
interest in this paper. This function assigns to each element r of R a phase θ (r ) belonging to some target
object T. The most obvious choice for T is the unit circle $S^1$, viewed as a subobject of the complex
field $\mathbb{C}$, and this is the target object focused on here. However, other choices may be studied in more
general contexts. For reasons explained in [14], the unit spheres $S^3$ and $S^7$, viewed as subobjects of the
quaternions $\mathbb{H}$ and octonions $\mathbb{O}$, respectively, are potentially interesting alternatives. At a finer level
of detail, it may be appropriate to consider discrete subobjects of $S^1$, $S^3$, or $S^7$, which possess interesting
algebraic properties. Alternatively, T might be an object at a higher level of algebraic hierarchy, such as
a monoidal category. In any case, T must possess a “multiplicative” operation, enabling the factor $\theta(r)$
to multiply the sum $\sum_{r^- \prec r} \psi^-_{R;\theta}(r^-)$ in Equation (4). Extending θ via this operation, as described below,
defines a phase map Θ on the space of co-relative kinematics in S. The form of Equation (4) assumes
that θ generates Θ in this way; otherwise, the equation must be generalized. Under this assumption,
θ provides specific dynamical content to the equation, and thereby defines a quantum dynamical law
governing fundamental spacetime structure.
The elements of the relation set R in Equation (4) encode information up to first order at the
quantum level, in the sense that they represent individual stages of evolution in S. Hence, θ is
analogous to an infinitesimal path functional on S, i.e., a generalized Lagrangian. Similarly, Θ may
be regarded as a generalized action. However, to simplify the form of Equation (4), the appropriate
analogue of the exponentiation appearing in Feynman’s phase map (1) is “built in” to the definition
of θ. Hence, the quantities I call “phases” throughout the remainder of the paper are analogous
to Feynman’s complex exponentials $e^{\frac{i}{\hbar} S(\gamma)}$ themselves, not to the corresponding “angles” $\frac{1}{\hbar} S(\gamma)$.
The phase Θ(γ) of a co-relative kinematics γ is therefore a product of phases θ (r ) of individual relations
r along γ, rather than a sum or integral. More precisely, one may define a concatenation product $\sqcup$ joining
co-relative kinematics “end-to-end”, under which γ may be factored into a product of individual
relations $\gamma = \dots \sqcup r_0 \sqcup r_1 \sqcup r_2 \sqcup \dots$ Extending θ multiplicatively then means that $\Theta(\gamma) = \prod_k \theta(r_k)$,
where the product is in the target object T. Questions of convergence are important in general, but are
not examined here, since one may go quite far under finiteness assumptions.
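Under finiteness assumptions, the relationship between Equation (4) and the multiplicative phase map can be checked directly on a toy example. The sketch below is illustrative only: the four-relation set `PREDECESSORS`, the angles in `ANGLES`, and in particular the initial condition (a relation with no precursor is assigned its own phase) are assumptions introduced here, not taken from [14]. It solves the recursion and compares the result with an explicit sum of $\Theta(\gamma) = \prod_k \theta(r_k)$ over all chains terminating at a given relation.

```python
import cmath
from functools import lru_cache

# Hypothetical finite relation set: each relation maps to its direct
# precursors, i.e. the relations r_minus with r_minus ≺ r.
PREDECESSORS = {
    "r0": [],
    "r1": ["r0"],
    "r2": ["r0"],
    "r3": ["r1", "r2"],
}

# Hypothetical angles; theta(r) is the corresponding point on the unit circle.
ANGLES = {"r0": 0.0, "r1": 0.4, "r2": 1.1, "r3": -0.7}


def theta(r):
    return cmath.exp(1j * ANGLES[r])


@lru_cache(maxsize=None)
def psi_minus(r):
    """Past state function from the recursion psi(r) = theta(r) * sum psi(r_minus)."""
    precursors = PREDECESSORS[r]
    if not precursors:
        return theta(r)          # assumed initial condition for minimal relations
    return theta(r) * sum(psi_minus(p) for p in precursors)


def chains_ending_at(r):
    """All chains from a minimal relation up to r, as lists of relations."""
    precursors = PREDECESSORS[r]
    if not precursors:
        return [[r]]
    return [chain + [r] for p in precursors for chain in chains_ending_at(p)]


def Theta(chain):
    """Phase of a chain: product of the phases of its relations."""
    phase = 1 + 0j
    for r in chain:
        phase *= theta(r)
    return phase


chains = chains_ending_at("r3")
path_sum = sum(Theta(chain) for chain in chains)
recursive = psi_minus("r3")
```

In this example `recursive` and `path_sum` agree, which is the sense in which the local relation function θ generates the path sum over Θ.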
This paper explores the problem of identifying suitable phase maps for discrete quantum gravity,
focusing on a class of $S^1$-valued maps defined in terms of terminal states Δ of histories D along
evolutionary pathways γ in a history configuration space S. Here, S is a kinematic scheme of star
finite acyclic directed sets D, γ is a co-relative kinematics, and Δ encodes “recent” causes and effects
in D. Invariants such as state automorphism groups Aut(Δ) determine multiplicities of states, and induce
natural families of entropy functions. Resolution entropy is defined via a “coarse-graining” procedure
called causal atomic resolution, analogous to conventional partitioning of state space into families
of states sharing “macroscopic” properties. Superset entropy is defined by counting the number of
ways in which a terminal state Δ may embed into a larger state Δ′ called a superset of Δ. A large
state automorphism group Aut(Δ) corresponds to a small number of such supersets, and therefore
implies low entropy. Labeled entropy is defined by counting the number of ways to label elements
of Δ; again, large Aut(Δ) implies low entropy. Symmetry entropy, by contrast, is defined by counting
the elements of Aut(Δ) itself, so large Aut(Δ) implies high entropy in this context. A primitive
version of symmetry entropy is discussed in Section 8.2 of [14]. A phase map defined in terms of such
entropic quantities, or related quantities such as entropy per unit volume, is called an entropic phase map.
The resulting version of Equation (4) may be viewed as an abstract combination of Schrödinger’s
equation and the second law of thermodynamics, which arises entirely from the structure of S.
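The opposite dependence of labeled and symmetry entropy on Aut(Δ) can be seen by brute force on very small directed sets. The following sketch is an illustration under assumptions: the precise definitions belong to Section 3.4, and taking each entropy to be the natural logarithm of the corresponding count is a choice made here for concreteness. The number of distinct labelings of an $n$-element set follows from the orbit-stabilizer theorem as $n!/|\mathrm{Aut}(\Delta)|$.

```python
from itertools import permutations
from math import factorial, log


def automorphisms(elements, relations):
    """All bijections of the element set that map the relation set onto itself."""
    relations = set(relations)
    found = []
    for image in permutations(elements):
        f = dict(zip(elements, image))
        if {(f[x], f[y]) for x, y in relations} == relations:
            found.append(f)
    return found


def labeling_count(elements, relations):
    """Number of distinct labelings, n! / |Aut|, by orbit-stabilizer."""
    return factorial(len(elements)) // len(automorphisms(elements, relations))


def symmetry_entropy(elements, relations):
    # Assumed form: logarithm of the order of the automorphism group.
    return log(len(automorphisms(elements, relations)))


def labeled_entropy(elements, relations):
    # Assumed form: logarithm of the number of distinct labelings.
    return log(labeling_count(elements, relations))


# Three hypothetical three-element states.
chain = (["a", "b", "c"], {("a", "b"), ("b", "c")})     # rigid: |Aut| = 1
vee = (["x", "y", "z"], {("x", "y"), ("x", "z")})       # one swap: |Aut| = 2
antichain = (["u", "v", "w"], set())                     # fully symmetric: |Aut| = 6

for name, (elements, relations) in [("chain", chain), ("vee", vee),
                                    ("antichain", antichain)]:
    print(name,
          len(automorphisms(elements, relations)),
          labeling_count(elements, relations))
```

The rigid chain has the most labelings and the smallest automorphism group, while the antichain is the reverse, matching the qualitative statements above.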
Section 2 presents the necessary background from discrete causal theory [14] to support
the development and description of these ideas. Section 2.1 briefly outlines the conceptual and
philosophical foundations of discrete causal theory. Section 2.2 describes the classical version of
the theory, expressed in terms of countable star finite acyclic directed sets. Section 2.3 sketches the
theory of relation space, which addresses certain technical difficulties in earlier versions of the theory
such as causal set theory. Section 2.4 describes the basics of discrete quantum causal theory. Section 3
examines entropy and the second law of thermodynamics in a broad context, introduces discrete causal
analogues of familiar thermodynamic ideas such as state space, and develops the specific notions of
entropy mentioned above. Section 3.1 discusses entropy in general terms under a broad framework
called entropy systems. Section 3.2 describes associated versions of the second law. Section 3.3 introduces
discrete causal state spaces. Section 3.4 defines resolution, superset, labeled, and symmetry entropies.
Section 4 introduces entropic phase maps, and examines some of their properties. Section 4.1 describes
some simple versions of these maps explicitly. Section 4.2 discusses the problem of obtaining suitable
interference effects analogous to those induced for Feynman’s phase map by Hamilton’s principle.
Section 4.3 discusses some possible objections to the idea of entropic phase maps, and briefly
examines an alternative approach involving a more conventional notion of action. Section 4.4 offers
concluding remarks, and mentions some mathematical problems whose solution would enhance the
study of entropic phase maps.
not imply the view that quantum gravity necessarily forbids such structure. Countability and/or star
finiteness may also be relaxed, though in my opinion there is limited motivation for doing so.
The following definitions are adapted from Sections 3.6 and 3.7 of [14]:
Definition 1. A directed set $(D, \prec)$ is a set $D$ equipped with a binary relation $\prec$. A morphism from a
directed set $(D, \prec)$ to a directed set $(D', \prec')$ is a set map $f : D \to D'$ such that $f(x) \prec' f(y)$ whenever $x \prec y$.
The category of directed sets D is the category whose objects are directed sets and whose morphisms are
morphisms of directed sets. A subobject of a directed set $(D, \prec)$ is a directed set $(D', \prec')$, where $D'$ is a subset
of $D$, and where $\prec'$ is a subset of $\prec$ consisting of relations between pairs of elements of $D'$. The causal dual of
a directed set $(D, \prec)$ is the directed set $(D, \prec^*)$, where $x \prec^* y$ if and only if $y \prec x$.
Definition 2. A multidirected set $(M, R, i, t)$ consists of a set of elements $M$, a set of relations $R$, and initial
and terminal element maps $i : R \to M$ and $t : R \to M$. A morphism from a multidirected set
$(M, R, i, t)$ to a multidirected set $(M', R', i', t')$ consists of a map of elements $f_{\mathrm{ELT}} : M \to M'$ and a
map of relations $f_{\mathrm{REL}} : R \to R'$, such that $f_{\mathrm{ELT}}(i(r)) = i'(f_{\mathrm{REL}}(r))$ and $f_{\mathrm{ELT}}(t(r)) = t'(f_{\mathrm{REL}}(r))$ for
each $r$ in $R$. The category of multidirected sets M is the category whose objects are multidirected sets and
whose morphisms are morphisms of multidirected sets. A subobject of a multidirected set $(M, R, i, t)$ is a
multidirected set $(M', R', i', t')$, where $M'$ and $R'$ are subsets of $M$ and $R$, respectively, and where $i'$ and $t'$
are the restrictions of $i$ and $t$ to $R'$. The causal dual of a multidirected set $(M, R, i, t)$ is the multidirected
set $(M, R, t, i)$.
Definition 3. A chain in a multidirected set $(M, R, i, t)$ is a sequence of relations $\dots, r_k, r_{k+1}, \dots$ such that
$t(r_k) = i(r_{k+1})$. The past of an element $x$ of $(M, R, i, t)$ is the set of all elements $w$ in $M$ such that there exists a
chain $r_0, \dots, r_N$ with $i(r_0) = w$ and $t(r_N) = x$. The future of $x$ is the set of all elements $y$ in $M$ such that there
exists a chain $r_0, \dots, r_N$ with $i(r_0) = x$ and $t(r_N) = y$. An antichain in $(M, R, i, t)$ is a subset $\sigma$ of $M$ with no
chain connecting any pair of its elements, distinct or otherwise. The past relation set $R^-(x)$ of an element $x$
in $M$ is the set of all relations $r$ in $R$ such that $t(r) = x$. The future relation set $R^+(x)$ of $x$ is the set of all
relations $r$ in $R$ such that $i(r) = x$. The relation set $R(x)$ of $x$ is the union $R^-(x) \cup R^+(x)$.
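Definitions 2 and 3 translate almost directly into a small data structure. The class below is a minimal sketch for finite multidirected sets only; the class name, method names, and the three-element example with two parallel relations are hypothetical choices made for illustration.

```python
class MultidirectedSet:
    """Finite multidirected set (M, R, i, t); relation names are hypothetical."""

    def __init__(self, elements, relations):
        # `relations` maps a relation name r to the pair (i(r), t(r)).
        self.elements = set(elements)
        self.i = {r: pair[0] for r, pair in relations.items()}
        self.t = {r: pair[1] for r, pair in relations.items()}

    def past_relations(self, x):       # R^-(x): relations terminating at x
        return {r for r in self.t if self.t[r] == x}

    def future_relations(self, x):     # R^+(x): relations beginning at x
        return {r for r in self.i if self.i[r] == x}

    def relation_set(self, x):         # R(x) = R^-(x) ∪ R^+(x)
        return self.past_relations(x) | self.future_relations(x)

    def future(self, x):
        """Elements reachable from x by a chain of one or more relations."""
        reached, frontier = set(), [x]
        while frontier:
            u = frontier.pop()
            for r in self.future_relations(u):
                v = self.t[r]
                if v not in reached:
                    reached.add(v)
                    frontier.append(v)
        return reached

    def causal_dual(self):             # (M, R, t, i): swap the two maps
        return MultidirectedSet(
            self.elements, {r: (self.t[r], self.i[r]) for r in self.i})

    def past(self, x):                 # the past is the future in the causal dual
        return self.causal_dual().future(x)


# Two distinct relations a and b from x to y, and one relation c from y to z.
M = MultidirectedSet(
    {"x", "y", "z"},
    {"a": ("x", "y"), "b": ("x", "y"), "c": ("y", "z")},
)
```

Storing the maps $i$ and $t$ separately, rather than a set of ordered pairs, is what allows the parallel relations `a` and `b` to coexist; a binary relation on the elements could not distinguish them.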
For both directed sets and multidirected sets, an isomorphism is an invertible morphism, and an
automorphism is a self-isomorphism. Isomorphic sets are usually considered to be equivalent. It is
often convenient to denote a directed set or multidirected set by just D or M, respectively, or to write
D = ( D, ≺) or M = ( M, R, i, t) to indicate that a set D or M is equipped with such structure. Similarly,
the causal dual of a directed set D may be denoted by D ∗ , and the causal dual of a multidirected set M
by M∗ . A directed set D = ( D, ≺) may be recognized as a multidirected set whose set of relations is
the binary relation ≺, and whose initial and terminal element maps are defined by setting i ( x ≺ y) = x
and t( x ≺ y) = y. For multidirected sets, the notation x ≺ y remains useful to indicate the existence of
a relation r such that i (r ) = x and t(r ) = y, even though no binary relation is involved. The necessity
to study multidirected sets arises at the quantum level, via iteration of structure.
A well-motivated version of discrete classical causal theory is defined by the axioms in Definition 4,
adapted from Definition 4.10.1 of [14]. Symbols and terms are further discussed below.
Definition 4. Five axioms for discrete classical causal theory are the following:
1. Binary axiom: Classical spacetime may be modeled as a directed set D = ( D, ≺), whose elements
represent events, and whose relations represent causal relationships between pairs of events.
2. Generalized measure axiom: D is equipped with a set function μ from the power set P( D ) of D to
the extended real numbers R ∪ {∞}, which assigns finite positive values to nonempty finite subsets of D,
and infinite values to infinite subsets of D.
3. Countability: D is countable.
4. Star finiteness: For every element x of D, the star St( x ) = { x } ∪ R( x ) of x is finite.
5. Acyclicity: D possesses no cycles, i.e., sequences of relations $x_0 \prec \dots \prec x_N$ with $x_0 = x_N$.
The binary axiom specifies both a mathematical structure and a physical interpretation of
this structure. The generalized measure axiom imposes no mathematical conditions on the remaining
axioms, so it is allowed a range of possible versions, each specified by a choice of μ. The most attractive
choices are similar to the counting measure used in early versions of causal set theory, which assigns
to each subset of D its number of elements in fundamental units. The function μ is unrelated to the
family of measures μ for an entropy system, introduced in Section 3.1. Since the star St( x ) of x is just
{ x } ∪ R( x ), star finiteness is equivalent to finiteness of relation sets R( x ). The physical meaning of
this condition is that every event has only a finite number of direct causes and effects. The reason for
using St( x ) rather than R( x ) involves topological bookkeeping that plays no direct role in this paper.
The meanings of countability and acyclicity are self-evident. The discreteness of D is encoded in the
generalized measure axiom and the axiom of star finiteness.
Figure 3, adapted from Figure 3.6.5 of [14], illustrates different types of directed sets and
multidirected sets. Elements are represented by nodes, and relations by directed edges. In the
third and fourth diagrams, directions of relations are indicated by arrows, while in the first and
second diagrams, directions are inferred via an “up the page” convention analogous to the convention
for the direction of time in Minkowski spacetime diagrams. This convention applies only to
acyclic directed sets. The first diagram illustrates a causal set, i.e., a countable, irreflexive, transitive,
interval finite directed set (C, ≺CS ). Irreflexivity means that C contains no “self-relations” x ≺CS x.
Transitivity means that if x ≺CS y and y ≺CS z, then x ≺CS z. Irreflexivity and transitivity together
imply acyclicity. Transitivity leads to trouble in distinguishing between direct and indirect causation
in causal set theory [14,20]. Interval finiteness means that only a finite number of elements y lie
between any two elements x and z of C, in the sense that x ≺CS y ≺CS z. Interval finiteness and
star finiteness are incomparable, i.e., neither condition implies the other. An important class of
causal sets that are generally not star finite are those induced by randomly “sprinkling” elements
into a Lorentzian manifold. These sets are useful to illustrate metric recovery results, but they are
not regarded as physically realistic, even in causal set theory. Star finite objects are preferred as
the actual workhorses for quantum gravity [2,21,22]. The second diagram in Figure 3 illustrates a
nontransitive acyclic directed set; in particular, the two relations x ≺ y and y ≺ z do not imply a
relation x ≺ z. The physical interpretation of this set still recognizes x as a cause of z, but not a direct
cause. This is analogous to the relationship between a grandparent and grandchild. The third diagram
illustrates a directed set D with cycles, including the “self-relation” t ≺ t and the “reciprocal relations”
u ≺ v ≺ u. Such sets are not studied in this paper, but remain interesting in more general contexts.
The fourth diagram illustrates a multidirected set M whose relation structure is more complicated than
any binary relation on its set of elements. For example, there are two distinct relations in M from x to
y. In discrete causal theory, multiple relations between pairs of elements arise at the quantum level,
where a given pair of histories may exhibit multiple direct evolutionary relationships.
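The distinctions drawn in this paragraph can be tested mechanically on finite examples. The sketch below represents a directed set as a set of ordered pairs; the helper names and the tiny example sets are assumptions for illustration. It shows that the nontransitive set $x \prec y \prec z$ is acyclic, that its transitive closure adds the relation $x \prec z$ and thereby erases the distinction between direct and indirect causes, and that self-relations and reciprocal relations are detected as cycles.

```python
def is_acyclic(elements, relations):
    """Kahn-style check: repeatedly remove elements with no incoming relation."""
    indegree = {x: 0 for x in elements}
    for _, y in relations:
        indegree[y] += 1
    ready = [x for x in elements if indegree[x] == 0]
    removed = 0
    while ready:
        x = ready.pop()
        removed += 1
        for a, b in relations:
            if a == x:
                indegree[b] -= 1
                if indegree[b] == 0:
                    ready.append(b)
    return removed == len(elements)   # anything left over lies on a cycle


def is_transitive(relations):
    return all((a, d) in relations
               for a, b in relations for c, d in relations if b == c)


def transitive_closure(relations):
    closure = set(relations)
    while True:
        new = {(a, d) for a, b in closure for c, d in closure if b == c} - closure
        if not new:
            return closure
        closure |= new


# Nontransitive acyclic directed set: x is a cause of z, but not a direct cause.
elements = {"x", "y", "z"}
nontransitive = {("x", "y"), ("y", "z")}
closed = transitive_closure(nontransitive)

# Directed set with a self-relation and a pair of reciprocal relations.
cyclic_elements = {"t", "u", "v"}
cyclic = {("t", "t"), ("u", "v"), ("v", "u")}
```

After closure, the pair `("x", "z")` is indistinguishable from the two direct relations, which is the bookkeeping problem attributed above to transitivity in causal set theory.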
Absent from Definition 4 is any specification of classical dynamics. This reflects the philosophy that
physics at the fundamental scale should be described in quantum-theoretic terms. Classical equations of
motion should emerge at larger scales from underlying quantum dynamics, according to a generalized
version of the correspondence principle. All histories obeying suitable axioms should contribute to
this dynamics, with contributions of “well-behaved” histories reinforced via constructive interference,
and contributions of “pathological” histories damped out. There should be no artificial distinction
between “on-shell” histories that obey preconceived classical dynamics, and “off-shell” histories that
do not. All permissible histories should begin on an equal footing, just as all permissible paths begin
on equal footing in conventional path integration.
Figure 3. Causal set; acyclic directed set; directed set; multidirected set.
Structurally attractive models need not be relevant to the actual universe. Genuinely interesting
models exhibit solid connections to established physics. For discrete causal theory, such connections
are provided by the metric recovery theorems of Hawking [23] and Malament [24], and their
generalizations [25–27]. Informally, these theorems state that the causal structure of relativistic spacetime
determines its geometric structure up to scale. The causal metric hypothesis [14–16] strengthens and
generalizes this statement by removing dependence on relativity and the caveat “up to scale”.
If spacetime is precisely smooth and Lorentzian to arbitrary scales, then the causal metric hypothesis
is not quite true, due to this missing scale data. Hence, the hypothesis relies on the assumption that
such data arises in the actual universe from some natural source other than a Lorentzian metric.
What Finkelstein [3,4], Myrheim [28], ’t Hooft [29], Sorkin [2], and others realized by around
1980 was that discrete causal structure supplies its own natural notion of scale via enumeration
of fundamental elements. Later, it became popular to admit fluctuations in the sizes of elements to
preserve systematic Lorentz invariance [30,31]. The generalized measure axiom in Definition 4 further
relaxes this picture to allow the possible contribution of relation structure in determining volume.
However, the basic lesson of metric recovery is unchanged by these modifications: discrete causal
structure supplies natural scale data absent in continuous causal structure. Hence, Lorentzian geometry
at large scales may be reasonably attributed to discrete causal structure at the fundamental scale.
Definition 5. Let M = ( M, R, i, t) be a multidirected set, and let r0 and r1 be elements of its relation set R.
The induced relation involves a new use of the precursor symbol ≺. Figure 4, adapted from
Figure 5.1.3 of [14], illustrates the relation space R( D ) over an acyclic directed set D. The left-hand
diagram shows the construction of an individual relation r0 ≺ r1 , while the right-hand diagram shows
R( D ) as a whole. More generally, R( M ) may be identified with the line digraph [32] over the directed
multigraph corresponding to M. Theorem 6 gives the essential properties of relation space.
Theorem 6. Passage to relation space defines a functor R from the category M of multidirected sets to the
category D of directed sets. This functor sends acyclic multidirected sets to irreducible acyclic directed sets,
and preserves star finiteness.
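Theorem 6 can be spot-checked on a small example. The sketch below assumes the line digraph reading of relation space mentioned above, in which the elements of $R(M)$ are the relations of $M$ and $r_0 \prec r_1$ holds exactly when $t(r_0) = i(r_1)$; the function names and the four-relation example are hypothetical. The example multidirected set is acyclic but reducible, since it contains a direct relation from $x$ to $z$ alongside chains through $y$, and the code confirms that its relation space is acyclic and irreducible.

```python
def relation_space(initial, terminal):
    """Induced relation on relations: r0 ≺ r1 when t(r0) == i(r1) (assumed)."""
    return {(r0, r1) for r0 in initial for r1 in initial
            if terminal[r0] == initial[r1]}


def transitive_closure(pairs):
    closure = set(pairs)
    while True:
        new = {(a, d) for a, b in closure for c, d in closure if b == c} - closure
        if not new:
            return closure
        closure |= new


def is_acyclic(pairs):
    """No element precedes itself through any chain."""
    return all(a != b for a, b in transitive_closure(pairs))


def is_irreducible(pairs):
    """No relation a ≺ b is duplicated by a longer chain from a to b."""
    closure = transitive_closure(pairs)
    nodes = {x for pair in closure for x in pair}
    return not any((a, c) in closure and (c, b) in closure
                   for a, b in pairs for c in nodes)


# Hypothetical multidirected set: two parallel relations a, b from x to y,
# a relation c from y to z, and a direct relation d from x to z.
initial = {"a": "x", "b": "x", "c": "y", "d": "x"}
terminal = {"a": "y", "b": "y", "c": "z", "d": "z"}

element_pairs = {(initial[r], terminal[r]) for r in initial}
induced = relation_space(initial, terminal)
print(sorted(induced))
```

Here the only induced relations are `a ≺ c` and `b ≺ c`; the relation `d` becomes an isolated element of relation space, so the reducibility present at the level of elements does not carry over.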
Figure 4. Induced relation between relations $r_0$ and $r_1$ in a directed set D; global view of R(D).
Figure 5. Cauchy surface σ in a globally hyperbolic manifold X, intersected by two causal curves;
maximal antichain σ in a directed set D, permeated by two chains.
In discrete causal theory, a typical maximal antichain σ in a typical directed set D is permeable,
meaning that chains in D may pass through σ from past to future without intersecting σ. In causal
set theory [33], this phenomenon is referred to as “missing links”; the antichain σ is compared to
a “sieve” [34], which is “by-passed” by a “large amount of geometric information”. “Thickened
antichains”, obtained by adding limited quantities of past and future elements to σ, typically suffer
from the same problem. Hence, maximal antichains are not good analogues of Cauchy surfaces in
causal set theory, and the same statement applies to discrete causal theory in general. The right-hand
diagram in Figure 5 illustrates a pair of chains permeating a maximal antichain σ in an acyclic
directed set. The dashed lines connecting the elements of σ are a visual aid, not part of the structure.
Permeability means that information can leak through σ, for example, from w to z. Besides posing a
general obstacle to discrete causal dynamics, this problem also has a specific bearing on the definition
and analysis of entropic quantities, again typified in the causal set context [35,36]. Fortunately, however,
this problem disappears upon passage to relation space.
Theorem 7. Maximal antichains in relation space are impermeable. That is, if σ is a maximal antichain in
the relation space R( M ) over a multidirected set M, and if γ is a chain of relations in R( M) beginning at an
element in the past of σ and terminating at an element in the future of σ, then γ intersects σ.
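The contrast between permeable antichains in a directed set and impermeable antichains in its relation space can be verified exhaustively on a small example. The following sketch is a brute-force check, not a proof: the six-element directed set, the function names, and the reading of the induced relation as $r_0 \prec r_1$ when the terminal element of $r_0$ is the initial element of $r_1$ are assumptions made for illustration.

```python
from itertools import combinations


def transitive_closure(pairs):
    closure = set(pairs)
    while True:
        new = {(a, d) for a, b in closure for c, d in closure if b == c} - closure
        if not new:
            return closure
        closure |= new


def maximal_antichains(nodes, pairs):
    """All antichains to which no further node can be added (acyclic case)."""
    closure = transitive_closure(pairs)
    nodes = sorted(nodes)

    def related(a, b):
        return (a, b) in closure or (b, a) in closure

    antichains = [set(s) for k in range(1, len(nodes) + 1)
                  for s in combinations(nodes, k)
                  if all(not related(a, b) for a, b in combinations(s, 2))]
    return [s for s in antichains
            if all(any(related(x, y) for y in s) for x in nodes if x not in s)]


def is_permeable(pairs, sigma):
    """True if some chain runs from the past of sigma to its future, avoiding sigma."""
    closure = transitive_closure(pairs)
    past = {u for u, s in closure if s in sigma}
    future = {v for s, v in closure if s in sigma}
    avoiding = transitive_closure(
        {(a, b) for a, b in pairs if a not in sigma and b not in sigma})
    return any(u in past and v in future for u, v in avoiding)


# Hypothetical acyclic directed set D, given by its relations.
D_nodes = {"p", "a", "b", "c", "d", "q"}
D_pairs = {("p", "a"), ("a", "b"), ("a", "d"), ("c", "d"), ("d", "q")}

# Relation space R(D): elements are the relations of D themselves.
R_nodes = set(D_pairs)
R_pairs = {(r0, r1) for r0 in D_pairs for r1 in D_pairs if r0[1] == r1[0]}

sigma = {"b", "c"}   # maximal antichain in D bypassed by the chain p ≺ a ≺ d ≺ q
permeable_in_D = is_permeable(D_pairs, sigma)
permeable_in_R = [is_permeable(R_pairs, s)
                  for s in maximal_antichains(R_nodes, R_pairs)]
```

In this example `sigma` is permeable in D, while every maximal antichain found in R(D) is impermeable, consistent with Theorem 7.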
Figure 6. Four co-relative histories sharing a common cobase with two elements x and y and one
relation x ≺ y; morphisms (transitions) representing these co-relative histories.
Individual morphisms in the category D of directed sets do not always uniquely represent
evolutionary relationships, due to symmetries. For example, the co-relative history h3 in Figure 6 is
represented by two different morphisms $\tau_3$ and $\tau_3'$, due to the symmetry interchanging the two
maximal elements of its target history. Hence, co-relative histories are defined as equivalence
classes of morphisms. It is convenient to restrict attention to special morphisms called transitions,
which represent “growth” of directed sets. This idea is made precise in Definition 8, adapted from
Definition 6.3.4 of [14]. Co-relative histories are then introduced in Definition 9, adapted from
Definition 6.4.3 of [14].
At a less formal level, the condition that τ is a monomorphism means that τ does not “erase”
details of the source D. The “proper” condition means that τ encodes nontrivial change. The “full”
condition means that τ does not “edit” details of D. The “originary” condition means that τ does not
add “prehistory” to D. These conditions support the desired evolutionary interpretation.
The subscripts i and t in the expression h : Di ⇒ Dt stand for “initial” and “terminal”.
This notation is different from the notation for arbitrary transitions in Definition 8, since Sections 3 and 4
feature auxiliary transitions related to h that do not belong to the equivalence class defining h.
The proper, full, and originary conditions in Definition 9 allow the unadorned term “co-relative history”
to mean something more general, but co-relative histories in this paper always satisfy these conditions,
except in the context of superset microstates in Definition 15, where they need not be full.
Each transition in the equivalence class defining h is said to represent h. The “double arrow” notation ⇒
emphasizes that h may be represented by more than one transition, but often h is uniquely represented
due to the rigidity of typical “large” directed sets [37], which plays an important role in Sections 3 and 4.
It is useful to think of h as “adding elements and relations to Di to produce Dt ”, but one cannot
always identify specific elements and relations as “the ones added” since h is an equivalence class.
Multiple inequivalent transitions, and hence multiple co-relative histories, may exist between a given
pair of directed sets, even a pair differing by a single element. This implies multidirected structure at
the quantum level.
Choosing a suitable family K of directed sets, together with a suitable family H of co-relative histories
between pairs of members of K, one obtains a structure S called a kinematic scheme, which serves as a history
configuration space. The word “kinematic” means that S encodes possible behavior, without identifying
what specific behavior is determined or favored under specific conditions. The latter question involves
dynamics. As an analogy, relativistic kinematics describes possible particle paths, e.g., ruling out
spacelike motion, but the paths of specific particles depend on dynamical information. S possesses
natural multidirected structure induced by H, elaborated below. Sequences of co-relative histories in
S define evolutionary pathways called co-relative kinematics, abstractly analogous to particle paths in
conventional path summation. The conditions that S must satisfy to qualify as a kinematic scheme are
that H must include enough co-relative histories to describe the evolution of any history in K, and
K must contain all “ancestors” of its members. These conditions are made precise in Definition 10,
adapted from Definitions 7.4.1 and 7.4.7 of [14]. An additional desirable property, called the generational
property, allows each co-relative history in H to be “factored into generations”. However, this property
is not studied in this paper, and it is preferable to omit it from the definition.
Definition 10. A kinematic scheme is a pair S = (K, H), where K is a class of directed sets, and H is a
class of co-relative histories between pairs of members of K satisfying the following properties:
Figure 7, adapted from Figure 7.5.2 of [14], illustrates a portion of a kinematic scheme SPS called
the positive sequential kinematic scheme, which serves as a source of examples throughout the remainder
of the paper. SPS is modeled after a kinematic scheme of finite causal sets appearing implicitly in
Sorkin and Rideout’s theory of sequential growth dynamics [38]. Similar structures appear elsewhere in
the work of Sorkin [39], Isham [40–43], Markopoulou [7], and others. The objects illustrated inside
each large open node in the figure are members of the class K of directed sets of SPS , which is the
class of finite acyclic directed sets. This class is more restrictive than the class specified by Definition 4,
which requires only countability. The edges connecting the large open nodes represent members of the
class H of co-relative histories of SPS , which are those that “add a single new element to their targets”.
This means that if h : Di ⇒ Dt belongs to H, and if τ : Di → Dt is a transition representing h,
then the complement of τ ( Di ) in Dt is a singleton. The gray-colored nodes illustrate how the set of four
co-relative histories appearing in Figure 6 embeds into SPS . The thickened edges illustrate a co-relative
kinematics in SPS , whereby the empty set 1 evolves into a directed set D with four elements and
three relations. The specific transition or transitions representing each co-relative history illustrated
in the figure may be inferred in a straightforward manner from the directed structures of its cobase
and target; for example, there is a unique transition τ representing the final co-relative history in the
co-relative kinematics terminating at D. The “new element added by τ”, i.e., the complement of the
image of τ, is the top-right element indicated by the arrow.
Figure 7. Positive sequential kinematic scheme SPS (first four generations); gray nodes show the four
co-relative histories from Figure 6; thickened edges illustrate a co-relative kinematics.
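The rule that members of H "add a single new element" can be made concrete by growing small directed sets one element at a time. The following is a minimal sketch under stated assumptions: directed sets are encoded as edge tuples on the labels 0..n-1, the new element is always maximal (so no "prehistory" is added), and isomorphism classes are identified by a brute-force canonical form. The function names are hypothetical, and the code counts classes of directed sets per generation rather than co-relative histories.

```python
from itertools import combinations, permutations

def canonical(n, edges):
    """Label-independent form of a directed set on elements 0..n-1 (brute force over relabelings)."""
    return min(
        tuple(sorted((p[a], p[b]) for a, b in edges))
        for p in permutations(range(n))
    )

def children(n, edges):
    """Isomorphism classes obtained by adding one new maximal element, labeled n.

    The new element may directly follow any subset of the existing elements,
    but it never precedes them.
    """
    found = set()
    for k in range(n + 1):
        for preds in combinations(range(n), k):
            grown = set(edges) | {(a, n) for a in preds}
            found.add(canonical(n + 1, grown))
    return found

def generations(depth):
    """Classes of finite acyclic directed sets with 0, 1, ..., depth elements."""
    levels = [{()}]  # generation 0: the empty set, with no relations
    for n in range(depth):
        grown = set()
        for edges in levels[-1]:
            grown |= children(n, edges)
        levels.append(grown)
    return levels

sizes = [len(level) for level in generations(4)]
print(sizes)  # [1, 1, 2, 6, 31]
```

The counts agree with the known numbers of unlabeled acyclic directed graphs on up to four nodes. The brute-force canonical form is practical only for very small sets.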
Given a kinematic scheme S = (K, H), it is useful to associate an abstract multidirected set
M(S) with S, where each member D of K is represented by an element x ( D ) of M(S), and where
each member h : Di ⇒ Dt of H is represented by a relation r (h) from x ( Di ) to x ( Dt ) in M(S).
M(S) is called the underlying multidirected set of S. Chains in M(S) represent co-relative kinematics
in S. The left-hand diagram in Figure 8, adapted from Figure 7.5.4 of [14], illustrates a portion of the
underlying multidirected set M(SPS ) of the positive sequential kinematic scheme SPS . The chain from
x (1) to x ( D ) represents the co-relative kinematics from 1 to D illustrated in Figure 7. This diagram
illustrates the permeability problem in the context of kinematic schemes; the three nodes connected by
the auxiliary dashed lines represent a maximal antichain in M(SPS ), which is permeated by the chain
from x (1) to x ( D ). It is therefore necessary to work in relation space to properly formulate the theory
of path summation. The right-hand diagram in Figure 8 illustrates part of the relation space R(M(SPS )).
The dark square nodes represent a maximal antichain, which is impermeable by Theorem 7.
Figure 8. Portion of M(SPS ) illustrating the permeability problem; corresponding portion of R(M(SPS ))
showing an impermeable maximal antichain.
While one could choose to perform path summation over a particular acyclic directed set,
the resulting theory would be background dependent, and hence unsuitable for quantum gravity.
Path summation in the background independent context involves summing phases Θ(γ) associated
with co-relative kinematics γ in a kinematic scheme S. As explained in Section 1.4, these phases are
analogous to Feynman’s phases $e^{\frac{i}{\hbar} S(\gamma)}$. Under modest assumptions, Θ(γ) is a product of phases θ(r)
of individual relations representing individual co-relative histories. The relation function θ determines
a specific form for Equation (4)
$$\psi^{-}_{R;\theta}(r) = \theta(r) \sum_{r^{-} \prec r} \psi^{-}_{R;\theta}(r^{-}),$$
reproduced here for convenience. The setup for deriving this equation is illustrated in Figure 9,
adapted from Figure 6.9.2 of [14], where the derivation is carried out in detail. The auxiliary shading
represents a finite subobject R of the relation space R(M(S)). A choice of maximal antichain σ
partitions R into a disjoint union R = R− ∪ σ ∪ R+ , where σ represents a choice of “present”, and R±
are the corresponding past and future regions. The function $\psi^{-}_{R;\theta}$ is called the past state function,
because it depends on all chains in R− , which terminate at elements of σ. Here, one such chain
γ is shown, terminating at an element r ∈ σ, with penultimate element r− . This chain may be
factored into a concatenation product of γ− and r, where γ− is the subchain of γ terminating at r− ,
and this factorization induces a factorization Θ(γ) = Θ(γ− )θ(r) of phases. The value $\psi^{-}_{R;\theta}(r)$
is defined to be the sum ∑_γ Θ(γ) of the phases of all maximal chains γ in R− terminating at r.
Mathematically, Equation (4) merely organizes the factorizations Θ(γ) = Θ(γ− )θ(r) for all such γ.
These chains represent co-relative kinematics in the corresponding region of S that lead to the target
history of the co-relative history represented by r. Generalizing to the case of infinite R raises
questions of convergence. From an abstract perspective, the function $\psi^{-}_{R;\theta}$ plays a role similar to
that of Feynman’s “wave function” ([1], Section 5), except that no limiting process is necessary to
define it, and no normalization constant is required. However, the structural context in which $\psi^{-}_{R;\theta}$
arises is much different than in Feynman’s original non-relativistic background dependent setup,
where evolutionary pathways are represented by paths in a fixed copy of R4 . In the present discrete
background independent context, each step along a chain represents a co-relative history, interpreted as
the evolution of one spacetime into another. Equation (4) describes how the value of $\psi^{-}_{R;\theta}$ changes when
the evolutionary pathways involved are extended by one additional relation r, which corresponds
to multiplying the associated phases by θ (r ). Abstractly, it arises in almost the same manner as the
ordinary Schrödinger equation under Feynman’s derivation ([1], Section 6), in which segmented
paths approximating continuous evolutionary processes are extended via a time-stepping method.
For Equation (4), however, no approximation is involved, so no limiting process is necessary.
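The bookkeeping role of Equation (4) can be illustrated numerically on a small finite region. The sketch below is not taken from [14]: the five relations, their predecessor lists, and the angles defining θ are hypothetical, and the base case (a relation with no predecessors contributes the single chain consisting of itself, so its value is θ(r)) is an assumption made so that the recursion and the direct chain sum agree.

```python
import cmath

def past_state(preds, theta):
    """Evaluate psi(r) = theta(r) * sum of psi(r_minus) over direct predecessors r_minus of r.

    Assumption: if r has no predecessors, the only maximal chain ending at r is (r,),
    so psi(r) = theta(r).
    """
    psi = {}
    def value(r):
        if r not in psi:
            below = preds[r]
            psi[r] = theta[r] * (sum(value(p) for p in below) if below else 1)
        return psi[r]
    for r in preds:
        value(r)
    return psi

def maximal_chains(preds, r):
    """All chains ending at r that start at a relation with no predecessors."""
    if not preds[r]:
        return [[r]]
    return [c + [r] for p in preds[r] for c in maximal_chains(preds, p)]

def chain_sum(preds, theta, r):
    """Directly sum the phases Theta(gamma), each a product of theta over the chain."""
    total = 0
    for chain in maximal_chains(preds, r):
        phase = 1
        for s in chain:
            phase *= theta[s]
        total += phase
    return total

# Hypothetical finite past region: each key lists its direct predecessors.
preds = {"a": [], "b": [], "c": ["a", "b"], "d": ["a"], "e": ["c", "d"]}
angles = {"a": 0.3, "b": 1.1, "c": 2.0, "d": -0.7, "e": 0.5}
theta = {r: cmath.exp(1j * t) for r, t in angles.items()}  # unit-modulus phases

psi = past_state(preds, theta)
```

Here `psi["e"]` collects the three maximal chains a-c-e, b-c-e, and a-d-e, and it matches `chain_sum(preds, theta, "e")` up to rounding. No limiting process enters, since the region is finite.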
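[Figure 9: finite subobject R (all nodes in the shaded region) of the relation space R(M(S)); maximal antichain σ (“present”) separating the past region R− from the future region R+ ; chain γ with penultimate element r− and subchain γ− .]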
A few further remarks regarding Equation (4) may be helpful. First, it is illuminating to
spell out how the equation can describe quantum-theoretic behavior specifically. This depends partly
on the general properties of path summation, and partly on the choice of relation function θ that
determines the phase associated with each evolutionary pathway. Like virtually any formula involving
path summation over a history configuration space, Equation (4) combines contributions from many
distinct processes involving many distinct histories. This is a familiar feature of quantum-theoretic
superposition, but is not unique to the quantum realm. For example, classical stochastic models such
as Sorkin and Rideout’s theory of sequential growth dynamics [38] organize information in a similar
manner at an abstract level, but are decidedly non-quantum. The classical nature of the latter theory
arises from the assignment of real probabilities, rather than quantum amplitudes, to evolutionary
pathways. Similarly, Feynman’s derivation [1] could just as easily be used to produce a continuous
classical stochastic model, with real probabilities assigned to subspaces of a path space. What leads
to Schrödinger’s equation specifically under Feynman’s setup is Feynman’s choice of phase map, which
produces the type of interference effects necessary to describe quantum-theoretic behavior. Similar
considerations apply in the discrete causal context. For different choices of θ, Equation (4) could be
used to describe a classical stochastic model, or a quantum-theoretic model, or neither. This highlights
why the choice of phase map is so crucial to the theory. As described in Section 1.4, the most
obvious choice of target object for a quantum-theoretic phase map is the choice made by Feynman,
namely, S1 . Alternative choices can be interesting, but this paper focuses on S1 -valued phase maps
almost exclusively. Second, due to the quantum-gravity-related focus of this paper, it is worth
noting that Equation (4) shares certain similarities with the Wheeler–DeWitt equation, but these are not
explored here. Third, allowing cycles complicates the picture, and this generalization is not considered
here. Fourth, many different kinematic schemes typically share a given class K of directed sets, and
different schemes offer different perspectives regarding the evolution of families of histories. Physical
predictions must be independent of these choices, and this is expressed by saying that the theory must
be covariant. In practical terms, this means that if one changes S, then one generally must change θ to
compensate. This paper mostly ignores covariance issues.
Figure 10 illustrates a sequential growth process in SPS , in which a history D7 with seven elements
evolves into a history D11 with eleven elements via a sequence of co-relative histories labeled h7 to h10 .
These co-relative histories are represented by relations r (h7 ) to r (h10 ) in R(M(SPS )), abbreviated by r7
to r10 . This growth process serves as a source of examples in Sections 3 and 4. Each pair of consecutive
histories in Figure 10 encodes the same type of information associated with a single square node
in Figure 9, since these nodes represent co-relative histories. Given such a process, the goal is to
define phases measuring the “favorabilities” of each co-relative history. The black nodes and edges
represent the first-degree terminal states T 1 ( D7 ) to T 1 ( D11 ) of the histories D7 to D11 , which encode the
first-order information in each history, i.e., the “physically new” information, consisting of only the
most recent causes and effects. First-degree terminal states are featured repeatedly in Chapters 7 and 8
of [14], where they are described via terminology such as “structural increments” or “generations”.
By definition, only one element in each history is “new” from the perspective of the sequential growth
process itself; these new elements are indicated by arrows. However, this process is merely one way of
describing the evolution of D11 , and therefore involves arbitrary extraphysical choices regarding the
order of appearance of elements. Terminal states T n ( D ) of degree n are introduced in Definition 13.
For n > 1, there is a distinction between degree and order; for example, second-degree terminal states
may encode information of arbitrarily high order. It is convenient to use the abbreviation Δk for T 1 ( Dk ),
which highlights the fact that Δk is a “structural increment” of Dk . To avoid clutter, only Δ8 is labeled
in the figure. The symbol Δ is used in later sections to denote states of arbitrary degree.
Figure 10. Sequence of co-relative histories in SPS ; terminal states indicated by dark nodes and edges;
“new elements” added by each co-relative history indicated by arrows.
First-degree terminal states are analogous to “present states” in conventional physics, involving data
up to first order, such as position and velocity. Familiar notions of entropy are associated with such
“present states”, not with entire histories. In particular, the second law of thermodynamics compares
the entropy of a “present state” to that of “previous states”; it does not involve a “higher-dimensional
entropy” associated with the entire history leading up to the present state. The evolution of physical
systems does not seem to be sensitive to details of the distant past; otherwise, one could not perform
reliable experiments without knowing the exact history of each piece of experimental equipment.
More formally, Lagrangians are typically assumed to depend on information only up to first order.
The form of Equation (4) imposes an analogous assumption at the level of kinematic schemes, since
the relation function θ is analogous to a Lagrangian on S. As discussed in Section 3.3, higher-order
information at the level of individual histories is not a priori irrelevant in discrete causal theory, but
contributions from the distant past likely play a negligible dynamical role. Hence, the simplest “serious”
entropic phase maps are defined in terms of first-degree terminal states, and more-sophisticated phase
maps may be regarded as refinements of such maps.
3.1. Entropy
Entropy, in the statistical sense pioneered by Boltzmann, may be understood very generally in
terms of the distinguishability of objects described at two different levels of detail, one regarded
as fine, and the other regarded as coarse. The prototypical application of this idea occurs in
statistical thermodynamics, in which the fine level of detail for a system, such as a fixed quantity of ideal
gas, is described in terms of microscopic data, such as the positions and momenta of individual molecules,
while the coarse level of detail is described in terms of macroscopic data, such as pressure, volume,
and temperature. Each possible choice of macroscopic data defines a coarse description of the system,
called a macrostate, while each possible choice of microscopic data defines a fine description, called a
microstate. Each macrostate generally corresponds to many different microstates, since many different
choices of microscopic data may be approximated by identical macroscopic data. The entropy of
a macrostate measures the quantity of corresponding microstates in a manner that is additive for
composite systems. In more general terms, objects distinguishable at some fine level of detail may be
indistinguishable at some coarser level, and a notion of entropy may be associated with the two levels
to quantify this difference in distinguishability. In particular, generalizations of Boltzmann entropy
such as Gibbs, Shannon, and Rényi entropies fall under the same conceptual umbrella. Measures of
entropy familiar in ordinary quantum theory, such as von Neumann entropy, are less relevant, since
they depend on specific algebraic apparatus less general than the path summation approach.
In statistical thermodynamics, the state space for a system is an abstract space parameterizing the set
of possible microstates of the system for some choice of fine detail. A choice of coarse detail partitions
state space into a family of subsets representing the possible macrostates of the system, where the points
of each subset parameterize the microstates associated with the corresponding macrostate. Such a
partition is called a coarse-graining of the state space. The left-hand diagram in Figure 11 illustrates such
a coarse-graining, where the cells representing macrostates are separated by solid lines. Dotted lines and
labels are explained below. Such a planar diagram could be interpreted literally as encoding the possible
position and momentum of a single particle moving in one real dimension, but all such diagrams in
this paper are schematic. Conventional state spaces are real manifolds, and therefore exhibit notions
of proximity, volume, and other topological and metric structure. However, their dimensions are
typically quite large, and this implies properties that are not well-represented by planar diagrams;
for example, each region typically has very many neighbors. Even in 24-dimensional Euclidean
space, each sphere in the regular packing induced by the Leech lattice is tangent to 196,560 neighbors;
one may imagine the situation in 10^24-dimensional space. Abstract metric-related ideas remain useful
for describing the properties of discrete causal state spaces, but planar diagrams only roughly represent
these notions.
Figure 11. Partitions of state space; conventional state spaces exhibit regions of very different sizes;
state space inducing an “inverse second law of thermodynamics”.
Generalizing the thermodynamic picture, any set S of objects may be partitioned into a family
of subsets P, where the objects belonging to each subset are regarded as equivalent at a coarse level
of detail. More generally still, one may consider a strictly partially ordered family Π := { Pα }α∈ A of
partitions Pα of S for some index set A, where by definition Pα ≺ P β if Pα ≠ P β and if every member
of Pα is a union of members of P β . In this case, P β is called a refinement of Pα . Here, ≺ does not
represent causal structure, and superscript indices are used to distinguish information filtering from
mere enumeration. One may define equivalence relations ∼α on S for each α in A, where s ∼α s′ if s
and s′ belong to the same subset under Pα . If Pα ≺ P β , then Pα induces a quotient partition Pαβ of the
quotient set S β := S/ ∼ β in an obvious way. Any such choice of Pα and P β may be used to define
notions of coarse and fine detail. Returning to Figure 11 in this more abstract setting, the large regions
bordered by solid lines in the left-hand diagram represent a choice Pα of coarse detail for a set S,
while the small regions bordered by dotted lines represent a choice P β of fine detail. Here, Pα and
P β each partition S into subsets of roughly equal size, but a typical coarse-graining in conventional
thermodynamics exhibits vast differences in the sizes of regions, and correlations exist involving
proximity and size. The middle diagram in Figure 11 illustrates such a coarse-graining. As emphasized
by Penrose [44], such details are crucial for understanding whether a typical system can be expected to
exhibit a systematic increase in entropy. For example, the right-hand diagram in Figure 11 illustrates a
state space that induces an “inverse second law of thermodynamics”, in the sense that a typical path in
this space moves from larger to smaller cells. If Pα ≺ P β , and if each member of Pα is a finite union of
members of P β , then one may define multiplicities and entropies via counting: if V ⊂ S is a member
of Pα , and if V = ∪_{k=1}^{K} W k for members W k of P β , then the multiplicity μαβ (V ) of V is K, and the
entropy eαβ (V ) of V is log K. The choice of notation for μαβ and eαβ is intended to emphasize the
relative viewpoint: multiplicities and entropies are properly understood in terms of natural relationships
between levels of detail, not in terms of any specific level of detail. For the set V shown in the left-hand
diagram in Figure 11, the entropy is eαβ (V ) = log 7, since P β subdivides V into seven regions. In more
general settings, it may be necessary to measure the sizes of members of Pαβ via some measure μαβ
other than the counting measure.
Definition 11. An entropy system (S, Π, μ) consists of a set S, a set Π := { Pα }α∈ A of partitions Pα of S for
some index set A, strictly partially ordered by refinement, and a family μ of measures μαβ on the quotient sets S β ,
one for each relation Pα ≺ P β in Π. Each such relation induces an entropy quadruple (S, Pα , P β , μαβ ).
The entropy of a member V of Pα is eαβ (V ) := log μαβ (V β ), where V β ⊂ S β is the image of V under the
quotient map S → S β , and where log ∞ is understood to mean ∞.
μαβ need only be a totally ordered set. One may also abstain from using logarithms to “rescale” μαβ .
However, it suffices here to consider only the counting measure on a finite set or the Lebesgue measure
on a finite-dimensional real manifold, and logarithms are useful for producing quantities that are
additive for composite systems. The symbol “e” is used for entropy instead of the familiar “h”
because “h” is used here to represent co-relative histories. Figure 12 illustrates a simple entropy system
(S, Π, μ) whose underlying set S is the unit interval [0, 1] in R. The set Π of partitions of S has members
P0 , P1 , P2 , and P3 , which subdivide S into segments of equal lengths 1, 1/2, 1/3, and 1/6, respectively.
P0 is the trivial partition, under which S represents a single macrostate. The strict partial order ≺
on Π consists of five individual relations P0 ≺ P1 , P0 ≺ P2 , P0 ≺ P3 , P1 ≺ P3 , and P2 ≺ P3 ,
each of which induces an entropy quadruple. The quotient sets S0 , S1 , S2 , and S3 have 1, 2, 3 and
6 elements, respectively. There are two nontrivial quotient partitions, P13 and P23 , which subdivide
the quotient set S3 into equal-sized subsets with 3 and 2 elements, respectively. Multiplicities and
entropies of some representative subsets of S with respect to different entropy quadruples are also listed.
For example, the subset U = (1/2, 1] of S has measure μ13 (U ) = 3 and entropy e13 (U ) = log 3 with
respect to the entropy quadruple (S, P1 , P3 , μ13 ).
[Figure 12: the set S = [0, 1]; partitions P0 to P3 ; quotient sets S0 to S3 ; quotient partitions P13 and P23 ; representative subsets U and V.]

Quadruple              Measure          Entropy
(S, P0 , P1 , μ01 )    μ01 (S) = 2      e01 (S) = log 2
(S, P0 , P2 , μ02 )    μ02 (S) = 3      e02 (S) = log 3
(S, P0 , P3 , μ03 )    μ03 (S) = 6      e03 (S) = log 6
(S, P1 , P3 , μ13 )    μ13 (U ) = 3     e13 (U ) = log 3
(S, P2 , P3 , μ23 )    μ23 (V ) = 2     e23 (V ) = log 2

μ12 and e12 are undefined.
Partial order: P0 ≺ P1 ≺ P3 and P0 ≺ P2 ≺ P3 ; P1 and P2 are incomparable.
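The multiplicities and entropies of this example can be reproduced by direct counting. The sketch below is a discrete stand-in, not the author's construction: the unit interval is replaced by six atoms 0..5 (the segments of length 1/6), `U` stands in for (1/2, 1], and the choice of which cell of P2 plays the role of `V` is hypothetical.

```python
from math import log

def refines(coarse, fine):
    """True if coarse ≺ fine: the partitions differ and every fine cell lies inside a coarse cell."""
    return coarse != fine and all(any(w <= v for v in coarse) for w in fine)

def multiplicity(cell, fine):
    """Counting measure: number of fine cells contained in the given coarse cell."""
    return sum(1 for w in fine if w <= cell)

def entropy(cell, fine):
    """Entropy of a coarse cell relative to the fine partition."""
    return log(multiplicity(cell, fine))

def blocks(size):
    """Partition the six atoms 0..5 into consecutive blocks of the given size."""
    return frozenset(frozenset(range(i, i + size)) for i in range(0, 6, size))

# P0, P1, P2, P3 have cells of length 1, 1/2, 1/3, 1/6, i.e., 6, 3, 2, 1 atoms.
P = {0: blocks(6), 1: blocks(3), 2: blocks(2), 3: blocks(1)}
order = sorted((a, b) for a in P for b in P if refines(P[a], P[b]))

S = frozenset(range(6))
U = frozenset({3, 4, 5})  # stands in for (1/2, 1], a cell of P1
V = frozenset({0, 1})     # hypothetical choice of one cell of P2

print(order)
print(multiplicity(U, P[3]), multiplicity(V, P[3]))
```

The list `order` contains exactly the five relations named in the text, and the pair (1, 2) is absent because neither P1 nor P2 refines the other, which is why μ12 and e12 are left undefined.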
The motivation for adopting such a general viewpoint is that multiple “levels” of entropy are
evident in discrete causal theory. An important example involves the nth-degree terminal states
T n ( D ) mentioned in Section 2.4 and formally introduced in Definition 13. Given two directed sets D
and D′, it may be the case that T n ( D ) and T n ( D′ ) are isomorphic, while T n+1 ( D ) and T n+1 ( D′ ) differ.
In this case, D and D′ are indistinguishable at the level of detail specified by the index value n,
but become distinguishable at the finer level of detail specified by the index value n + 1. On the level
of individual elements, two elements x and y belonging to a subobject Δ of a directed set D may be
“locally indistinguishable”, in the sense that they are interchanged by an automorphism of Δ, but may
be “globally distinguishable”, in the sense that no such automorphism extends to an automorphism
of D. More generally, one may consider chains of subobjects Δ = Δ1 ⊂ Δ2 ⊂ ... ⊂ Δn ⊂ D containing x
and y, some of which possess automorphism groups interchanging x and y, and some of which do not.
Of obvious interest is the case in which Δ1 is a low-order terminal state of a history, and Δn for n > 1
are progressive “thickenings” of Δ.
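The distinction between local and global indistinguishability of elements can be tested directly on small directed sets. The following is a minimal sketch with a hypothetical three-element set D and a hypothetical subobject Δ = {x, y}; automorphisms are found by brute force over all relabelings, which is feasible only for very small sets.

```python
from itertools import permutations

def automorphisms(nodes, edges):
    """All bijections of nodes that map the set of relations onto itself."""
    nodes = sorted(nodes)
    edges = set(edges)
    result = []
    for image in permutations(nodes):
        f = dict(zip(nodes, image))
        if {(f[a], f[b]) for a, b in edges} == edges:
            result.append(f)
    return result

def interchanged(nodes, edges, x, y):
    """True if some automorphism swaps x and y."""
    return any(f[x] == y and f[y] == x for f in automorphisms(nodes, edges))

def restrict(edges, sub):
    """Relations of the subobject on the elements in sub."""
    return {(a, b) for a, b in edges if a in sub and b in sub}

# Hypothetical example: a precedes x, while y is unrelated to both.
D_nodes = {"a", "x", "y"}
D_edges = {("a", "x")}
delta = {"x", "y"}

locally_swappable = interchanged(delta, restrict(D_edges, delta), "x", "y")
globally_swappable = interchanged(D_nodes, D_edges, "x", "y")
print(locally_swappable, globally_swappable)
```

In this example x and y are interchanged by an automorphism of Δ, which has no relations, but no automorphism of D swaps them, because only x has a predecessor.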
While entropy is defined by associating entire families of “fine” states with individual
“coarse” states, it is sometimes interesting to compare the amount of detail encoded by specific pairs
of states. It is then natural to relate such “local comparisons” to the “global comparisons” leading to
entropy systems. In this context, one need not distinguish a priori between macrostates and microstates;
states are defined individually by specifying varying degrees and types of information about an object
or system, and are then compared and categorized. Given two such states Δ and Δ′, it is sometimes
possible to unambiguously identify Δ′ as more detailed than Δ, or vice versa. In other cases, Δ and
Δ′ are incomparable, in the sense that Δ contains more of one type of information, while Δ′ contains
more of another. In this setting, one may recognize a natural partial order ≺ on the family of states
under consideration, where Δ ≺ Δ′ if and only if Δ′ is unambiguously more detailed than Δ. This type
of partial order is different from the partial orders on sets of partitions in Definition 11, but the two
types of structure are related. For example, given an entropy quadruple (S, Pα , P β , μαβ ), the set Pα ∪ P β
is a subset of the power set P(S) of all subsets of S. The relation Pα ≺ P β means that every member V
of Pα is a union of members W of P β . One may define an induced relation on Pα ∪ P β , also denoted
by ≺, where V ≺ W if and only if V is a proper superset of W. Hence, a single relation between
two partitions induces a partial order on a corresponding family of subsets. This partial order is of a
special type, with maximal chain length 1, because its only relations are those of the form V ≺ W for
V ∈ Pα and W ∈ P β such that W ⊂ V. However, one may easily define partially ordered sets with
longer chains by considering sequences of partitions ... ≺ Pn ≺ Pn+1 ≺ ...
Working in the opposite direction, one may begin with a partial order ≺ on an arbitrary set Σ.
Here, Σ is viewed as an abstract analogue of a family of states encoding various types and quantities
of detail, while ≺ is viewed as an abstract analogue of the partial order relating pairs of states Δ and Δ′
whenever Δ′ is unambiguously more detailed than Δ. One may partition Σ into a family of antichains
σ with respect to ≺. There are generally many different choices of partition, each analogous to a frame
of reference in relativity. In the entropic setting, elements of a given antichain σ are viewed as abstract
analogues of states sharing an equal level of detail. In the simplest case, the antichains σ “foliate” Σ,
in the sense that each nonextremal antichain σk has an unambiguous maximal predecessor σk−1 and
minimal successor σk+1 . More generally, the antichains σ form a partially ordered family. In either case,
the partition defines an atomic decomposition of Σ with respect to ≺, an idea revisited in a different
context in Section 3.3. In many cases, detail may be quantified in a variety of different ways, and this
leads to the consideration of families {≺α }α∈ A of partial orders on Σ. Such families are themselves
partially ordered via the order-theoretic version of refinement, under which ≺α precedes ≺ β if and only
if Δ ≺ β Δ′ whenever Δ ≺α Δ′. An antichain with respect to ≺ β is then automatically an antichain with
respect to ≺α , so any partition of Σ induced by ≺ β refines at least one such partition induced by ≺α .
In this manner, the partial ordering by refinement of the family of partitions induced by {≺α }α∈ A
respects the partial ordering on {≺α }α∈ A itself. Hence, entropy systems defined in terms of such
partitions automatically respect the order-theoretic structure of Σ.
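A partition into antichains, and the way antichains behave when one partial order extends another, can be sketched on a small example. The code below assumes a finite set with hypothetical states s1 to s4 and uses layering by height, which is only one of the many possible choices of partition mentioned above. Each order is given as a transitive set of pairs.

```python
from itertools import combinations

def height_layers(elements, less):
    """Partition a finite strict partial order into antichains by height.

    less is a set of pairs (a, b) meaning a ≺ b, assumed transitive.
    """
    height = {}
    def h(x):
        if x not in height:
            below = [a for a, b in less if b == x]
            height[x] = 1 + max((h(a) for a in below), default=-1)
        return height[x]
    layers = {}
    for x in elements:
        layers.setdefault(h(x), set()).add(x)
    return [layers[k] for k in sorted(layers)]

def is_antichain(subset, less):
    """True if no two members of subset are related in either direction."""
    return not any((a, b) in less or (b, a) in less for a, b in combinations(subset, 2))

# Hypothetical states; less_beta contains every relation of less_alpha.
sigma_set = ["s1", "s2", "s3", "s4"]
less_alpha = {("s1", "s3")}
less_beta = less_alpha | {("s1", "s4"), ("s2", "s4")}

layers_alpha = height_layers(sigma_set, less_alpha)
layers_beta = height_layers(sigma_set, less_beta)
print(layers_alpha, layers_beta)
```

Every layer of `layers_beta` is also an antichain with respect to `less_alpha`, so the partition induced by the finer order is itself a valid partition into antichains for the coarser order, even though it differs from `layers_alpha`.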
Figure 13. Curve in state space along which entropy increases; map from a linearly ordered set into an
entropy quadruple, showing no discernible second law.
The abstract analogue of a directed curve in state space is a map γ from a linearly ordered set L
into an entropy quadruple S = (S, Pα , P β , μαβ ). Such a map is illustrated in the right-hand diagram
in Figure 13. Here, L is drawn to suggest an interval in R, but in more general settings L may be
a non-continuous object such as an interval in Q, a discrete object such as an interval in Z, a finite
object such as the set {0, ..., N }, or even a transfinite object, such as the long line. The notion of an
increasing function requires similar generalization beyond the familiar setting of real analysis. Even in
conventional thermodynamics, strict definition of an increasing function must be relaxed, since the
second law is understood not as a prescription that entropy must increase over any time interval, but as
a description of the fact that entropy does increase with overwhelming likelihood over sufficiently
long time intervals. The map γ in the figure passes through cells of multiplicities 5, 2, 3, 7, 6, 6,
7 (again), 4, 2, 4, and 6 (again). Hence, the associated system does not obey a discernible version of the
second law. In the general case, it seems preferable to describe a variety of ways to define a version of
the second law for such a system than to isolate a particular choice via formal definition. An individual
map γ from a totally ordered set L into an entropy quadruple S = (S, Pα , P β , μαβ ) obeys a strict
version of the second law if for every pair of subsets V and V′ of S belonging to Pα , and for every
pair of elements ℓ and ℓ′ in L, with ℓ preceding ℓ′, such that γ(ℓ) ∈ V and γ(ℓ′) ∈ V′, it is true that μαβ (V ) ≤ μαβ (V′ ).
Intuitively, this means that γ never passes from a large cell into a smaller cell. There are various
ways to relax this strict description. If L possesses a metric, then one may specify a rule relating the
size of the interval (ℓ, ℓ′) to the probability that μαβ (V ) ≤ μαβ (V′ ). If the target object of μαβ also
possesses a metric, then one may define something like a derivative, i.e., a rule relating the sizes
of the intervals (μαβ (V ), μαβ (V′ )) to the sizes of the corresponding intervals (ℓ, ℓ′). More generally,
a region U of S obeys a version of the second law if a typical map γ : L → S originating in U obeys
an individual version of the second law. The word “typical” may be made precise in terms of a
generalized measure on the space of maps γ. It is sometimes necessary to restrict attention to special
maps to obtain a clear pattern; for example, some entropy quadruples exhibit entropy increases along
typical “short curves”, but not along typical “long curves”. In particular, some cosmological models
posit a reversal of the second law in the distant past and/or future.
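The strict and relaxed versions described above can be checked mechanically for a finite path. The following Python sketch assumes that γ is recorded simply as the list of multiplicities of the cells it visits, in order; the function names are illustrative and do not come from the paper.

```python
from itertools import combinations

def obeys_strict_second_law(multiplicities):
    """True if the path never moves from a cell to one of smaller multiplicity."""
    return all(a <= b for a, b in zip(multiplicities, multiplicities[1:]))

def fraction_of_nondecreasing_pairs(multiplicities):
    """Fraction of (earlier, later) pairs whose multiplicity does not decrease.

    This is one crude way to quantify a relaxed version of the law.
    """
    pairs = list(combinations(multiplicities, 2))  # order of the path is preserved
    if not pairs:
        return 1.0
    return sum(a <= b for a, b in pairs) / len(pairs)

# Multiplicities of the cells visited by the map in Figure 13, as listed in the text.
figure_13_path = [5, 2, 3, 7, 6, 6, 7, 4, 2, 4, 6]
print(obeys_strict_second_law(figure_13_path))
print(fraction_of_nondecreasing_pairs(figure_13_path))
```

Checking consecutive cells suffices for the strict version because the condition is transitive along the path.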
which consists of all maximal elements of D, all relations terminating at these elements, and all
initial elements of these relations. Knowledge of T 1 ( D ) generally does not enable recovery of D.
One may propose a choice of classical dynamics implying such a relationship for very special classes of
directed sets, for example, by abstracting the Einstein–Hilbert action from general relativity, which takes
the form
$$S_{EH} = \frac{c^4}{16\pi G} \int_X R \sqrt{-\det(g)}\, d^4x, \qquad (5)$$
in the simple vacuum case with zero cosmological constant. Here, g is a Lorentzian metric on a
4-dimensional manifold X, R is the curvature scalar arising from the metric connection, G is Newton’s
gravitational constant, and c is the speed of light. Yet despite interesting efforts in this direction,
for example, in causal set theory [45–47], such a strategy is dubious due to the amount of geometric
structure taken for granted in relativity. Geometric data such as metrics and curvature, and even
“pre-geometric” data such as dimension and topology, are emergent notions in discrete causal theory.
Action functionals in this context must be defined more fundamentally, and cannot be expected to
produce straightforward analogues of deterministic, time-symmetric Euler–Lagrange-type equations
that uniquely determine classical dynamics via information up to first order. In particular, elements of
a directed set D that are indistinguishable up to first order, i.e., permuted by an automorphism
of T 1 ( D ), may be distinguishable when one considers higher-order information. It is therefore necessary
to consider higher-degree terminal states in what follows. The form of Equation (4) does assume
that first-order information suffices at the level of kinematic schemes, in the sense that the phase of
an arbitrary co-relative kinematics is the product of the phases of its individual co-relative histories.
This picture may be generalized without leaving the general framework of path summation, but such
generalization is not undertaken here. In any case, the latter phases do generally depend nontrivially
on information above first order in the corresponding cobases and targets.
The simplest discrete causal analogues of familiar thermodynamic state spaces are nth-order state
spaces Dn , whose elements represent isomorphism classes of countable star finite acyclic directed sets Δ
with maximal chain length n. Equivalently, Rn (Δ) is a nonempty antichain. It is useful to preface formal
definitions involving Dn with some informal remarks. First, while the notion of order identifying a
state Δ as a member of Dn is intrinsic to Δ itself, the desired interpretation of Δ is as a terminal state of
a history D, containing information encoded by chains of length at most n terminating at maximal
elements of D. Second, it is usually impossible to choose a member of Dn that includes all such
information for n > 1, because chains of length at most n terminating at different maximal elements of
D may intersect to produce longer chains, thereby defining a higher-order state. One might consider
re-defining Dn to include such states, requiring only that each element be connected to a maximal
element by at least one chain of length at most n. In physical terms, such states are still composed of
elements exerting “recent influence”, but may contain chains of arbitrary length. However, such a
definition would not be ideal for the desired applications. For example, it would allow any countable
star finite acyclic directed set in which all chains are bounded above to be converted to a member
of D1 or D2 by adding new relations terminating at new maximal elements, thereby flouting the
intuition that low-order states should be “causally simple”. It is preferable to define a separate notion
called degree, which facilitates the definition of terminal states containing all information up to a
given order in a particular history. Following this idea, Definition 13 introduces special states T n ( D ),
called nth-degree terminal states, which include all information encoded in chains of length at most n
terminating at a maximal element in D. Third, as mentioned in Section 2.4, the distinction between
order and degree does not arise for n = 1; the first-degree terminal state T 1 ( D ) of D automatically
belongs to D1 . Fourth, the nth superset microstates introduced in Definition 18 are constructed by
adding n “prehistorical” elements to a state, which may not increase its maximal chain length at all.
These subtleties reflect the fact that more than one natural-number grading is useful in studying
discrete causal state spaces.
It is useful to define terminal states in terms of transitions between pairs of histories, using the
relative viewpoint. Though the ultimate goal is to use information encoded in terminal states to assign
phases to sequences of co-relative histories, i.e., co-relative kinematics, the states of principal interest
in studying a given co-relative history h : Di ⇒ Dt are typically not those induced by transitions
representing h. This is because the “physically new” structure associated with Di and Dt is more
meaningful than whatever structure h “adds to” Di to produce Dt . For example, each co-relative
history h : Di ⇒ Dt in SPS adds only one element to Di , so most of the physically new structure in Dt
is typically already present in Di . Yet what one is really interested in is whether or not the physically
new structure in Dt is “more favorable” than the physically new structure in Di ; i.e., one wishes to
compare terminal states of Di and Dt . These may be defined in terms of auxiliary transitions that are
determined by h, but do not represent h under Definition 9. First, however, one must define terminal
states associated with arbitrary transitions.
Definition 12. Let τ : D → D′ be a transition of acyclic directed sets. The subobject Δτ of D′ consisting of
all elements of D′ − τ ( D ), all relations terminating at such elements, and all initial elements of such relations,
is called the terminal state of τ. If Rn (Δτ ) is a nonempty antichain, then the order ord(Δτ ) of Δτ is n.
Despite the relative nature of Definition 12, it is convenient to refer to Δτ as a terminal state of
the target set D′ in many cases. Δτ does not include relations between elements of τ ( D ); it includes
only relations that are “new” with respect to τ. If the context is expanded to include cycles, a different
definition of order is necessary. For example, one may define ord(Δτ ) to be the maximal length
of non-self-intersecting chains in Δτ . Here, however, I focus almost exclusively on the acyclic case.
Any directed set D is itself the terminal state of the unique transition 1 → D . This transition may be
denoted by τ1 when the choice of target set D is obvious. As mentioned above, it is useful to define
special terminal states that encode all information up to order n in a given history.
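Definition 12 can be illustrated for finite sets with a short Python sketch. It assumes that the transition is an inclusion, so that it is specified by the set `image` of target elements coming from the source, and that the order equals the maximal chain length counted in relations. The function names and the toy example are hypothetical.

```python
from functools import lru_cache

def terminal_state_of_transition(target_elements, target_relations, image):
    """Terminal state of an inclusion-like transition, following Definition 12.

    Keeps the elements outside `image`, the relations terminating at them,
    and the initial elements of those relations.
    """
    new_elements = set(target_elements) - set(image)
    relations = {(a, b) for (a, b) in target_relations if b in new_elements}
    elements = new_elements | {a for (a, _) in relations}
    return elements, relations

def order(elements, relations):
    """Maximal chain length (number of relations) of a finite acyclic directed set."""
    successors = {x: [] for x in elements}
    for a, b in relations:
        successors[a].append(b)

    @lru_cache(maxsize=None)
    def longest_chain_from(x):
        return max((1 + longest_chain_from(y) for y in successors[x]), default=0)

    return max((longest_chain_from(x) for x in elements), default=0)

# Toy target set: the source is the chain 0 -> 1, and elements 2 and 3 are new.
target_elements = {0, 1, 2, 3}
target_relations = {(0, 1), (1, 2), (0, 3), (2, 3)}
state_elements, state_relations = terminal_state_of_transition(
    target_elements, target_relations, image={0, 1}
)
print(sorted(state_elements), sorted(state_relations))
print(order(state_elements, state_relations))
```

In this toy case the relation (0, 1) is dropped because it lies entirely within the image of the source, which matches the remark that the terminal state contains only relations that are new with respect to the transition.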
Definition 13. Let D be an acyclic directed set in which every chain is bounded above.
1. The nth-degree terminal state T n ( D ) of D is the subobject of D consisting of all elements connected to
a maximal element of D by a chain of length at most n, together with all relations in such chains.
2. The nth-degree initial state I n ( D ) of D is the subobject of D constructed by deleting all non-minimal
elements of T n ( D ) from D, together with all relations in D terminating at such elements.
3. The nth-degree transition τDn : I n ( D ) → D associated with D is the inclusion map I n ( D ) → D.
The boundedness hypothesis in Definition 13 is included to rule out situations in which D has
maximal elements but also has chains “extending to infinity”, since it is awkward to exclude such
chains from consideration when studying terminal behavior. Such histories are not considered here.
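For a finite history, the nth-degree terminal state of Definition 13 can be computed by a breadth-first search backward from the maximal elements. The sketch below assumes a finite acyclic directed set given as a list of elements and a set of ordered pairs, with chain length counted in relations; the function name is illustrative.

```python
def nth_degree_terminal_state(elements, relations, n):
    """Elements within n relations of a maximal element, plus the relations on such chains."""
    predecessors = {x: set() for x in elements}
    has_successor = set()
    for a, b in relations:
        predecessors[b].add(a)
        has_successor.add(a)
    maximal = set(elements) - has_successor

    # distance[x] = minimal length of a chain from x to some maximal element
    distance = {m: 0 for m in maximal}
    frontier = set(maximal)
    for d in range(1, n + 1):
        next_frontier = set()
        for b in frontier:
            for a in predecessors[b]:
                if a not in distance:
                    distance[a] = d
                    next_frontier.add(a)
        frontier = next_frontier

    # A relation (a, b) lies on a qualifying chain when b is within n - 1 of a maximal element.
    kept = {(a, b) for (a, b) in relations if b in distance and distance[b] <= n - 1}
    return set(distance), kept

# Toy history: the chain 0 -> 1 -> 2 -> 3 together with the extra relation 4 -> 3.
toy_elements = range(5)
toy_relations = {(0, 1), (1, 2), (2, 3), (4, 3)}
print(nth_degree_terminal_state(toy_elements, toy_relations, 1))
print(nth_degree_terminal_state(toy_elements, toy_relations, 2))
```

For n = 1 this returns the maximal elements, the relations terminating at them, and the initial elements of those relations.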
Definition 14. The nth-order state space Dn is the set of all isomorphism classes of countable star finite
acyclic directed sets Δ such that Rn (Δ) is a nonempty antichain. The finite-order state space D is the disjoint
union $\coprod_{n=0}^{\infty} D_n$, and the (total, countable, acyclic) state space D is the set of all isomorphism classes of
Since the elements and relations in a member Δ of Dn are assumed to possess no internal structure,
one might expect Δ to be treated as a microstate. However, since discrete causal theory does not
rule out the dynamical relevance of information above order n at the level of individual histories,
data describing how Δ might fit into a larger history can be important in determining future behavior
influenced by Δ. Such data defines an even finer level of detail than Δ itself, permitting Δ to be viewed
as a macrostate. Ambiguity regarding the status of Δ is not surprising, due to the relative nature
of entropy. Figure 14 illustrates four different methods of defining coarse and fine levels of detail
using Dn . Informal discussion of these methods then precedes formal treatment in Definition 15.
The first diagram shows a third-order state Δ embedded in a history D. In this case, Δ does not
contain all the third-order information in D; in particular, it is not the third-degree terminal state T 3 ( D )
of D. The second diagram illustrates one way to treat Δ as a microstate, called a resolution microstate,
by approximating its structure via the method of causal atomic resolution, introduced in [14]. This method
involves choosing special subsets of Δ, called causal atoms, which serve as individual elements of
a coarser directed set. Such a choice defines a causal atomic decomposition of Δ. A sequence of such
decompositions is a causal atomic resolution, with each subsequence defining “initial” and “terminal”
levels of detail, and hence a notion of entropy. More generally, one may define partially ordered
families of decompositions, also called resolutions, which induce entropy systems. The resolution
in the figure involves a single decomposition, and hence just two levels of detail. Causal atomic
resolution provides perhaps the most obvious discrete causal analogue of conventional coarse-graining.
In particular, it involves actual approximation, meaning that the information contained in a causal
atomic decomposition is not only incomplete, but also imprecise. However, there is generally no
canonical choice of resolution for a given state, and different resolutions may be very dissimilar.
Further, resolutions reaching far above the fundamental scale can produce objects that are obviously
“too granular” to resemble physical spacetime. Members of Dn are usually treated as macrostates in
this paper, but methods such as causal atomic resolution remain worthy of further study in more
general entropic settings.
[Figure 14 graphic omitted. Panel labels: history D with terminal state Δ; atomic resolution; superset microstate; labeled microstate; symmetry microstate.]
Figure 14. History D and terminal state Δ; causal atomic resolution of Δ; superset microstate of Δ;
labeled microstate of Δ; symmetry microstate of Δ.
The third diagram in Figure 14 illustrates the most obvious way to treat a member Δ of Dn as
a macrostate, by adding “prehistory” to define larger states called superset microstates. Different superset
microstates of Δ impose different constraints on the family of histories of which Δ could be a
terminal state. In particular, the superset Δ′ of Δ shown in the diagram is induced by the history D.
At a higher level of detail, Δ′ may itself be viewed as a macrostate, with its own superset microstates
adding more prehistory. One may imagine “flipping over” this diagram to obtain a co-relative
history η : Δ∗ ⇒ Δ′∗ between the causal duals Δ∗ and Δ′∗ of Δ and Δ′, and this is how superset
microstates are formalized in Definition 15. Hence, the convenient term “superset” is not quite precise,
because co-relative histories involve equivalence classes. Naïve amalgamation of superset microstates
produces a state space with an infinite number of elements in each cell, since one may always add more
prehistory to a directed set. This leads a priori to infinite multiplicities and entropies for finite states.
However, supersets adding “recent” data are expected to dominate dynamically, and families of
superset microstates may be filtered to reflect this expectation. In the case of finite states, one may
work with finite families of microstates defined in terms of numbers of elements and relations,
lengths of chains, sizes of antichains, and similar quantities. Here, I focus on families defined via the
number of prehistorical elements added to Δ. The quantity of superset microstates of a given type
is decreased by symmetries of Δ, which render equivalent different subsets of Δ. This meshes with
the intuition that high-entropy states should be “disordered”. For example, if Δ is an antichain of
cardinality K with automorphism group Aut(Δ) ≅ SK , then there is only one way to add a single
prehistorical element and k relations to Δ for any k ≤ K, since the terminal elements of these relations
in Δ may be exchanged for any other k elements of Δ under Aut(Δ). By contrast, there are $\binom{K}{k}$ ways to
add such an element and relations to Δ if Aut(Δ) is trivial.
The fourth and fifth diagrams in Figure 14 illustrate contrasting ways to treat a member Δ of Dn as
a macrostate by focusing on its symmetries directly. Under the method illustrated in the fourth diagram,
a microstate of Δ is simply a copy of Δ labeled via a map L → Δ, where L is a set of consecutive
natural numbers starting with zero, and where two labelings are regarded as equivalent if they are
related by an automorphism of Δ. Such a microstate is called a labeled microstate. The number of labeled
microstates associated with a state Δ of cardinality K ranges from 1 if Aut(Δ) ≅ SK to K! if Aut(Δ)
is trivial. This method agrees qualitatively with the superset approach in the sense that high-entropy
states are those for which Aut(Δ) is small. The method illustrated in the fifth diagram essentially
reverses this relationship. Here, one begins with an arbitrary labeling L → Δ̃, where Δ̃ is the subset
of Δ not fixed by Aut(Δ). Automorphisms of Δ convert this labeling to other labelings, each of which represents
a symmetry microstate. Such a microstate may be viewed as a “mode of symmetry breaking”, since it
breaks the symmetries of Δ in a specific way. For a finite state Δ, the number of symmetry microstates is
just |Aut(Δ)|, so high-entropy states are those for which Aut(Δ) is large. More generally, one may work
with non-surjective partial labelings L → Δ̃ that leave a subgroup of Aut(Δ) unbroken. The labeling
in the figure is of this type, since there remains an automorphism of Δ interchanging the elements
indicated by arrows. The set of such partial labelings is partially ordered by extension, which is
interesting from the perspective of state-specific detail discussed at the end of Section 3.1. While it
is counterintuitive to associate high entropy with symmetry, there are arguments for entertaining
such possibilities. Symmetry is central to the theory of “elementary” particles, so certain special
structures that are locally symmetric, at least at measurable scales, are favored by the actual dynamics
of the physical universe. Such structures may be “attached” to underlying causal structure via
auxiliary algebraic information, but the strong interpretation of the causal metric hypothesis demands
an emergent description of both spacetime symmetries and internal symmetries. The most obvious
way to satisfy this demand is to incorporate some type of symmetry data directly into Equation (4).
Notions of entropy associated with superset microstates and/or labeled microstates might accomplish
a similar purpose, since their enumeration depends largely on symmetry considerations. Regardless of
the type of entropy chosen, an attractive though speculative idea is that elementary particles might
arise via local entropic traps, whereby certain regular structures that are small by conventional measures
but large compared to the fundamental scale might be very stable from an entropic perspective.
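The opposite behavior of labeled and symmetry microstates can be seen in a small brute-force computation. The Python sketch below assumes a finite state with elements 0, ..., K − 1 and relations given as ordered pairs. The symmetry count |Aut(Δ)| is taken from the text; the labeled count K!/|Aut(Δ)| is an orbit-counting consequence of the equivalence of labelings under Aut(Δ), consistent with the two extreme values quoted above. All function names are illustrative.

```python
from itertools import permutations
from math import factorial, log

def automorphisms(size, relations):
    """All permutations of range(size) mapping the relation set onto itself (brute force)."""
    relation_set = set(relations)
    return [
        p for p in permutations(range(size))
        if {(p[a], p[b]) for a, b in relation_set} == relation_set
    ]

def microstate_counts(size, relations):
    """Return (number of labeled microstates, number of symmetry microstates)."""
    aut_order = len(automorphisms(size, relations))
    labeled = factorial(size) // aut_order  # K! complete labelings modulo Aut
    symmetry = aut_order                    # |Aut| as stated in the text
    return labeled, symmetry

for name, size, relations in [
    ("antichain of 4", 4, []),
    ("chain 0 -> 1 -> 2", 3, [(0, 1), (1, 2)]),
    ("two elements below a third", 3, [(0, 2), (1, 2)]),
]:
    labeled, symmetry = microstate_counts(size, relations)
    print(name, labeled, symmetry, log(labeled), log(symmetry))
```

The brute-force search over all K! permutations is only practical for very small states; it is meant to make the counting concrete, not to scale.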
A mathematical result important in the study of superset microstates, labeled microstates,
and symmetry microstates is Bender and Robinson’s proof [37] that a typical acyclic directed set
D has trivial automorphism group, i.e., is rigid. This result applies asymptotically under modest
assumptions about the number of relations in D. However, these assumptions fail to hold for a typical
low-order terminal state Δ, since such a state has unusually large “spatial size” and small “causal size”,
and typically lacks enough relations to “bind elements in place”. Hence, Aut(Δ) is often nontrivial
for such a state. The extreme case is a zeroth-order state, whose automorphism group is the entire
symmetric group permuting its elements transitively. However, states tend to become increasingly
rigid as their order increases. Bender and Robinson’s result enables rough enumerations of the
number of high-order superset microstates and labeled microstates for a state Δ of a given cardinality.
It also suggests a novel explanation for why the details of the distant past seem to be irrelevant to
future dynamics, namely, because relatively few additional generations of elements must be added to
a typical low-order state to break most of its symmetries.
Definition 15. Dn , D, and D may be used to define finer state spaces, for which their members are macrostates.
1. The nth-order superset state space DnSUP is the set of full, originary co-relative histories η : Δ∗ ⇒ Δ′∗ ,
where Δ is a member of Dn and Δ′ is a member of D. Its elements are called superset microstates.
The corresponding finite-order superset state space DSUP and (total, countable, acyclic) superset
state space DSUP are defined in the obvious ways.
2. The nth-order labeled state space DnLAB is the set of complete labelings of members Δ of Dn , where two
labelings of Δ are considered to be equivalent if they are related by an element of Aut(Δ). Its elements
are called labeled microstates. The corresponding finite-order labeled state space DLAB and
(total, countable, acyclic) labeled state space DLAB are defined in the obvious ways.
3. The nth-order symmetry state space DnSYM is the set of partial labelings of members Δ of Dn induced
by applying elements of Aut(Δ) to arbitrary initial labelings of the subsets Δ̃ of Δ not fixed by Aut(Δ).
Its elements are called symmetry microstates. The corresponding finite-order symmetry state space
DSYM and (total, countable, acyclic) symmetry state space DSYM are defined in the obvious ways.
[Figure 15 graphic omitted. Labels: state Δ7; some superset microstates; possible prehistorical relations; prehistorical element.]
Figure 15. 22 of the 96 superset microstates of Δ7 given by adding one prehistorical element.
For a state Δτ of cardinality K, the number of superset microstates adding a single element
is “roughly” 2^K , if one ignores the contribution of symmetries. This reflects the idea that one may
choose any family of elements in Δτ to be in the direct future of the single prehistorical element,
since 2^K is the sum of the binomial coefficients $\binom{K}{k}$ for 0 ≤ k ≤ K. Nontrivial symmetries of Δτ reduce
this number; in particular, the numbers of superset microstates of the first-degree terminal states Δ7 to
Δ11 in Figure 10 are 96, 64, 72, 144, and 132. Ignoring symmetries need not yield exactly 2^K microstates,
due to a curious graph-theoretic phenomenon called pseudosimilarity, whereby one directed set may
be a terminal state of another in multiple distinct ways, even if the two sets differ by only a single
element. Figure 16 illustrates this subtlety via an example provided by Brendan McKay, in which
augmenting two copies of a state Δτ by a single prehistorical element in two different ways produces
isomorphic supersets. The drawing emphasizes the latter isomorphism; the fact that the black nodes
and edges represent two copies of the same state Δτ may be seen by matching up the elements labeled
x and y.
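The way symmetries reduce the rough count 2^K can be sketched by counting orbits of subsets under Aut(Δτ). The Python code below is a simplified illustration under stated assumptions: the state is finite with elements 0, ..., K − 1, every subset (including the empty one) is an admissible direct future for the prehistorical element, and two choices give the same microstate exactly when an automorphism carries one subset to the other. It has not been checked against the values 96, 64, 72, 144, and 132 quoted above, and the function names are hypothetical.

```python
from itertools import combinations, permutations

def automorphisms(size, relations):
    """All permutations of range(size) mapping the relation set onto itself (brute force)."""
    relation_set = set(relations)
    return [
        p for p in permutations(range(size))
        if {(p[a], p[b]) for a, b in relation_set} == relation_set
    ]

def count_subset_orbits(size, relations):
    """Number of subsets of the state's elements, counted up to automorphism."""
    autos = automorphisms(size, relations)
    seen = set()
    orbits = 0
    for k in range(size + 1):
        for subset in combinations(range(size), k):
            chosen = frozenset(subset)
            if chosen in seen:
                continue
            orbits += 1
            # Mark the whole orbit of this subset as already counted.
            for p in autos:
                seen.add(frozenset(p[x] for x in chosen))
    return orbits

print(count_subset_orbits(4, []))                # antichain: one orbit per subset size
print(count_subset_orbits(3, [(0, 1), (1, 2)]))  # rigid chain: all 2^3 subsets distinct
```

For an antichain of cardinality K this gives K + 1 orbits, one for each k, in line with the earlier observation that there is only one way to attach k relations when Aut(Δ) is the full symmetric group.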
[Figure 16 graphic omitted. Labels: copies of Δτ; pseudosimilar elements x and y.]
Figure 16. McKay’s example: a superset may induce multiple microstates via pseudosimilarity.
Figure 17 illustrates a small region of D1SYM whose macrostates are the first-degree terminal
states Δ7 to Δ11 appearing in the sequential growth process from Figure 10. The left-hand diagram
reproduces this process. In the middle diagram, Δ7 to Δ11 are represented by large cells labeled 7
to 11, subdivided into smaller cells representing symmetry microstates. Because the histories D7
to D11 are rigid, D1SYM accurately reflects relative distinguishability properties between terminal
states and their histories in this case, since every state symmetry is broken by its ambient history.
The figure highlights the fact that symmetry microstates of a given terminal state are isomorphic as
partially labeled directed sets, which raises the question of how they are distinct. The answer is that
there are multiple ways to break the automorphisms of the original states involved, even though
the resulting objects remain isomorphic. D1SYM generally has “too many microstates” for terminal
states of nonrigid histories, since it includes symmetry breaking information for symmetries that
remain unbroken. This issue may be addressed by restricting the class of permissible labelings.
The right-hand diagram represents the sequential growth process abstractly via a “curve” in D1SYM .
Since D1SYM encodes information only up to first order at the level of individual histories, the entire
curve is necessary to reconstruct the evolution of D11 . The corresponding regions of D1SUP and D1LAB
are much too large and cluttered to illustrate here, but the basic structural aspects are similar.
[Figure 17 graphic omitted. Labels: histories D7 to D11 with co-relative histories h7 to h10; terminal states Δ7 and Δ8 (black); cells 7 to 11; other possible first-degree states.]
Figure 17. Sequential growth process from Figure 10; region of D1SYM through which this process moves;
abstract view of the process.
Definitions 14 and 15 identify discrete causal state spaces as sets, but one may recognize additional
“geometric” structure on these spaces defined in terms of discrete operations that convert one state
to another. It is useful to define such operations for multidirected sets in general.
Definition 16. Let M and M′ be multidirected sets. Elementary operations on such sets are defined as follows:
The absolute distance d( M, M′ ) between M and M′ is the minimal number of elementary operations required
to convert M to M′ , if this number is finite. Otherwise, d( M, M′ ) = ∞.
Notions of distance between pairs of states facilitate useful analogues of familiar evolutionary
ideas. For example, in conventional thermodynamics, one may ask why every system does not
immediately transition to the cell in state space representing thermal equilibrium. The answer is
that curves in state space are continuous in this context, so a typical system beginning far from
thermal equilibrium must pass through a sequence of intervening macrostates before reaching it.
Although literal continuity does not apply in the discrete causal context, similar ideas may be
invoked whenever one can define notions of distance and neighbors. In particular, even if a given
co-relative history is “favored” from a purely entropic perspective, it may be “costly” in the sense
that it entails direct passage between widely separated regions of a discrete causal state space.
Similarly, “short” paths between a given pair of states might be favored over “long” paths that
involve drastic changes in structure. These ideas are revisited in Section 4.2 in the context of
spacetime expansion, and again in Section 4.3 in the context of discrete causal action principles.
Alternative, relative notions of distance between pairs of directed or multidirected sets may
be defined in terms of “ambient” structure from a configuration space. In the case of directed sets,
such structure may originate from a kinematic scheme.
Definition 17. Let S = (K, H) be a kinematic scheme, and let D be a member of K in which every chain is
bounded above. Let T n ( D ) be the nth-degree terminal state of D, and let Δ be any other element of D.
1. The directed distance dS,D ( T n ( D ), Δ) between T n ( D ) and Δ in S with respect to D is the minimal
length of chains x ( D ) ≺ x ( D1 ) ≺ ... ≺ x ( D N ) in M(S), where T n ( D N ) = Δ.
2. The undirected distance S,D ( T n ( D ), Δ) between T n ( D ) and Δ in S with respect to D is the minimal
length of undirected paths x ( D ), x ( D1 ), ..., x ( D N ) in M(S) with initial element x ( D ) and terminal
element x ( D N ), where T n ( D N ) = Δ.
The directed and undirected distances depend on a choice of D because T n ( D ) and Δ may appear as
terminal states of many different histories in S. If T n ( D ) = T n ( D1 ) = T n ( D2 ), then it may be easier to
reach a history with nth-degree terminal state Δ from D1 than from D2 . The distinction between a chain
x ( D ) ≺ x ( D1 ) ≺ ... ≺ x ( D N ) and an undirected path x ( D ), x ( D1 ), ..., x ( D N ) is that chains respect
the directions of relations in M(S), while undirected paths generally do not. States close together in
an undirected sense may be far apart in a directed sense, since undirected paths are more general
than chains. Dependence on D implies that both distances are inherently asymmetric. It is reasonable to
expect that they may closely approximate more conventional notions of distance for suitable
classes of “large” directed sets, but this topic is not further explored here.
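The contrast between directed and undirected distance in Definition 17 amounts to two breadth-first searches over the same graph. The Python sketch below assumes that a finite fragment of M(S) is given as a list of ordered pairs of history labels, that path length is counted in relations, and that the histories whose nth-degree terminal state is Δ have already been collected in `targets`. The labels and the helper name are hypothetical.

```python
from collections import deque

def minimal_length(relations, source, targets, directed=True):
    """Length of a shortest chain (directed) or undirected path from source to any target."""
    neighbors = {}
    for a, b in relations:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set())
        if not directed:
            neighbors[b].add(a)  # ignore the direction of the relation

    distance = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        if x in targets:
            return distance[x]
        for y in neighbors.get(x, ()):
            if y not in distance:
                distance[y] = distance[x] + 1
                queue.append(y)
    return float("inf")

# Toy fragment: D evolves to C through A and B, while E evolves directly to both D and C.
toy_relations = [("D", "A"), ("A", "B"), ("B", "C"), ("E", "D"), ("E", "C")]
print(minimal_length(toy_relations, "D", {"C"}, directed=True))
print(minimal_length(toy_relations, "D", {"C"}, directed=False))
```

In the toy fragment the undirected path through E is shorter than any chain, and no chain leads from C back to D, which illustrates both the gap between the two notions and their asymmetry.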
was to add detail to terminal states via partial labelings specifying symmetry breaking information,
leading to the spaces DnSYM , DSYM , and DSYM of symmetry microstates.
Before explaining how discrete causal entropies may be defined via these four approaches,
I mention progress in the study of causal set entropy by Sorkin and collaborators [35,36]. This work
exhibits interesting relationships with analogous continuum-based notions, is supported by numerical
simulations involving “low-dimensional” causal sets, and incorporates covariance considerations.
However, it is very different in its assumptions and emphasis from the approaches examined in
this paper. First, the entropies involved are defined in terms of auxiliary fields on causal sets,
and are therefore not completely background independent quantities. Sorkin does consider causal set
“vacuum solutions”, whose entropies may be attributed solely to causal structure, but entropies associated
with nontrivial interactions typically involve large quantities of extra-causal data. Second, pre-packaged
quantum-theoretic machinery such as Hilbert spaces, operator algebras, density matrices, and von
Neumann-type entropy are applied to individual causal sets under this approach, rather than
emerging naturally from a history configuration space. Third, the permeability problem and other
technical obstructions arising in the absence of relation space methods render it difficult to define
terminal states or associated entropic data in this setting. The resulting measures of entropy are
a priori “higher-dimensional”, and can be associated only indirectly with conventional notions of
time-dependent entropy and the second law of thermodynamics. Fourth, many of the cases considered
under this approach involve special causal sets of the type mentioned in Section 2.2, induced by
sprinkling elements into relativistic spacetime manifolds. Such causal sets are naturally limited in their
potential to reveal structural features beyond the scope of general relativity.
I give only a brief sketch of how one may construct entropy systems via resolution microstates.
For simplicity, I describe this construction in terms of an individual nth-order state space Dn . The first
step is to choose a resolution of each state Δ in this space. In the simplest case, these resolutions may
be chosen to consist of single causal atomic decompositions. A choice of such decompositions defines
a coarse-graining of Dn , which induces an entropy quadruple, while a choice of resolutions involving
longer sequences of decompositions, or partially ordered families of decompositions, defines an
entropy system. In the general case, one may define a partially ordered family of equivalence relations
on Dn , specified by treating states as equivalent if their resolutions agree beyond a certain level
of detail. The associated equivalence classes then define partitions of Dn , and their cardinalities
define multiplicities. The resulting notion of entropy is called resolution entropy. One may choose to
define resolutions in such a way that each decomposition reduces the maximal length of chains in
each state by a specified quantity. For example, the decomposition illustrated in the second diagram in
Figure 14 converts a “fine” third-order state to a “rough” first-order state. An analogue of resolution
entropy appears in Sorkin’s approach to causal set entropy [35,36], but involves a random “decimation”
version of coarse-graining that does not incorporate causal structure in the same way that causal
atomic resolution does. It also involves “higher-dimensional” entropy, rather than entropy associated
with terminal states. However, numerical examples do hint at interesting universal behavior for this
type of entropy, and this evidence provides motivation for studying resolution entropy in more detail.
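Once a coarse-graining is fixed, the multiplicities and entropies follow by counting equivalence classes. The Python sketch below assumes a finite collection of states and a function `coarse_grain` that sends each state to a hashable description of its coarse image; the toy coarse-graining is a hypothetical stand-in for a causal atomic decomposition, which is not implemented here.

```python
from collections import Counter
from math import log

def resolution_entropies(states, coarse_grain):
    """Map each coarse cell to (multiplicity, entropy), where entropy = log(multiplicity)."""
    multiplicity = Counter(coarse_grain(state) for state in states)
    return {cell: (count, log(count)) for cell, count in multiplicity.items()}

# Hypothetical example: four fine states and the coarse states they decompose to.
toy_coarse_image = {"s1": "A", "s2": "A", "s3": "B", "s4": "A"}
cells = resolution_entropies(toy_coarse_image.keys(), toy_coarse_image.get)
print(cells)
```

A longer resolution would apply the same counting at each level of detail, yielding one partition per level.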
Numerous questions must be answered, however, before one may have confidence in the
resolution approach. The most basic is how sensitive resolution entropy is to changes of resolution,
since resolutions generally involve arbitrary extraphysical choices regarding the organization
of information. Another question, already mentioned in Section 3.3, is how one may reconcile
the increasing “granularity” produced by multi-level resolutions with the basic philosophy of
metric recovery, under which discrete causal structure at the fundamental scale should produce
effectively smooth structure at sufficiently large scales. A third issue arises from the empirical
dynamical irrelevance of details of the distant past. If only very low-order terminal states play
a substantial dynamical role in the future evolution of histories, then repeated causal atomic
decompositions of dynamically relevant states will produce antichains at relatively fine levels of detail.
Antichains possess no internal structure besides cardinality, which seems much too crude to determine
meaningful dynamics, especially locally. Therefore, the utility of resolution entropy seems to be
limited by the “causal depth” of relevant information. This issue does not necessarily disqualify the
resolution approach, however, due to the scales involved. In particular, the difference in magnitude
between the Planck scale and presently-measurable scales suggests than information up to order 1010
or 1015 could be relevant without producing noticeable deviations from the empirical obsolescence
of high-order information. A resolution involving decompositions similar to the one illustrated in
Figure 14 would require perhaps 30 decompositions to cover 10–15 orders of magnitude, and could
therefore contain a large quantity of information. However, such illustrations involving small histories
can be misleading; for example, it would not be surprising if each element in a typical physically
realistic history were directly related to $10^{10}$ or more other elements. Such large numbers of relations
would affect the qualitative properties of realistic resolutions.
Superset microstates offer a variety of different ways to define entropy systems via the state spaces
DSUP
n ,D
SUP , and DSUP . I begin by discussing simple notions of entropy involving individual partitions
of these spaces. For simplicity, I focus on the case of finite states. Let $\Delta$ be such a state, and consider
all superset microstates $\eta : \Delta^* \Rightarrow \Delta'^*$ adding a single prehistorical element to $\Delta$. The number of
such microstates is the cardinality of the future relation set $R^+(x(\Delta^*))$ in $M(S_{\mathrm{PS}})$, since the number
of different ways in which $\Delta$ can be the terminal state of a history with one additional element is
the same as the number of ways in which $\Delta^*$ can evolve into a history with one additional element.
As a reminder, $x(\Delta^*)$ is the element in the underlying multidirected set $M(S_{\mathrm{PS}})$ of $S_{\mathrm{PS}}$ representing $\Delta^*$,
and $R^+(x(\Delta^*))$ is the set of relations in $M(S_{\mathrm{PS}})$ beginning at $x(\Delta^*)$, each of which represents a co-relative
history with cobase $\Delta^*$. The first superset multiplicity $\mu^1_{\mathrm{SUP}}(\Delta)$ of $\Delta$ is then defined to be the number
$|R^+(x(\Delta^*))|$ of such microstates $\eta$, and the first superset entropy $e^1_{\mathrm{SUP}}(\Delta)$ is defined to be $\log \mu^1_{\mathrm{SUP}}(\Delta)$.
Following essentially the same reasoning, nth superset multiplicities and entropies may be defined.
Definition 18. The nth superset multiplicity $\mu^n_{\mathrm{SUP}}(\Delta)$ of a finite state $\Delta$ is the number of co-relative histories
$\eta : \Delta^* \Rightarrow \Delta'^*$, where the complement of the image of $\Delta^*$ under any transition representing $\eta$ has cardinality $n$.
The nth superset entropy $e^n_{\mathrm{SUP}}(\Delta)$ of $\Delta$ is $\log \mu^n_{\mathrm{SUP}}(\Delta)$.
An interesting entropy system on $D_{\mathrm{SUP}}$ is given by filtering superset microstates $\eta : \Delta^* \Rightarrow \Delta'^*$ by
both the number of prehistorical elements added to $\Delta$ by $\eta$, and the order of the resulting supersets $\Delta'$.
DSUP has a natural partition whose members are the infinite sets CSUP (Δ) parameterizing all full,
originary co-relative histories η with cobase Δ∗ and target belonging to D. One may partition each set
$C_{\mathrm{SUP}}(\Delta)$ by numbers of elements added to $\Delta$, or by orders of supersets $\Delta'$, or by both. A general way to
formalize the idea that two superset microstates $\eta_1 : \Delta^* \Rightarrow \Delta_1^*$ and $\eta_2 : \Delta^* \Rightarrow \Delta_2^*$ of $\Delta$ are equivalent
up to a given level of detail is to specify a common interpolating microstate $\eta_3 : \Delta^* \Rightarrow \Delta_3^*$, characterized by
the property that $\eta_1$ and $\eta_2$ both factor through $\eta_3$. This means that there exist pairs of transitions
$\Delta^* \xrightarrow{\tau_3} \Delta_3^* \xrightarrow{\tau_1} \Delta_1^*$ and $\Delta^* \xrightarrow{\tau_3'} \Delta_3^* \xrightarrow{\tau_2} \Delta_2^*$, where $\tau_3$ and $\tau_3'$ both represent $\eta_3$, and where the
compositions $\tau_1 \circ \tau_3$ and $\tau_2 \circ \tau_3'$ represent $\eta_1$ and $\eta_2$, respectively. Informally, this means that besides
being supersets of Δ, the states Δ1 and Δ2 also share common prehistorical elements. One may then
define equivalence relations ∼m and ∼n on DSUP , for each m, n ∈ N, where η1 ∼m η2 if η1 and η2
factor through a common interpolating microstate η3 adding m prehistorical elements to Δ, and where
η1 ∼n η2 if η1 and η2 factor through a common interpolating microstate η3 whose superset has order n.
Equivalence relations ∼(m,n) combine these two requirements. The corresponding partitions P(m,n) are
partially ordered lexicographically; i.e., $P_{(m,n)} \prec P_{(m',n')}$ if and only if $m < m'$, or $m = m'$ and $n < n'$.
It is convenient to denote the pair (m, n) by the single symbol α, regarded as an element of N2 = N × N.
Informally, the partition Pα groups together superset microstates that agree both up to a given number
of prehistorical elements and a given order.
Definition 19. Let α = (m, n) ∈ N2 , and let ΠLEX := { Pα }α∈N2 be the set of partitions Pα of DSUP defined
by taking superset microstates $\eta_1$ and $\eta_2$ of $\Delta$ to be equivalent if they factor through a common interpolating
microstate $\eta_3 : \Delta^* \Rightarrow \Delta_3^*$ of $\Delta$ represented by a transition $\tau_3 : \Delta^* \to \Delta_3^*$ such that $|\Delta_3^* - \tau_3(\Delta^*)| = m$
and $\mathrm{ord}(\Delta_3) = n$. Let $\sim_\alpha$ be the corresponding equivalence relation, and for any subset $V \subset D_{\mathrm{SUP}}$, let $V^\alpha$ be
the corresponding quotient set. For any relation Pα ≺ P β under the lexicographic order induced by N2 , and for
any subset V belonging to Pα , let μαβ (V β ) be the cardinality of V β . Let μLEX be the family of measures μαβ .
Then the triple (DSUP , ΠLEX , μLEX ) is called the lexicographic superset entropy system.
The measures μαβ (V β ) may take on infinite values; for example, there are infinitely many ways
to add a single prehistorical element to N. Definition 19 does not specify the number of relations
added to Δ by each microstate, or the maximal sizes of antichains in the corresponding supersets,
or any of a variety of other basic combinatorial data that may be used to partition DSUP in
different ways. Using such quantities, one may define alternative entropy systems, involving,
for example, “higher-dimensional” lexicographic orders. This particular entropy system merely
formalizes some of the simpler properties that may be used to organize families of superset microstates.
Labeled microstates also induce a variety of entropic notions. The most obvious is given by simply
counting the number of equivalence classes of labelings of a state Δ. If Δ has cardinality K, then its total
number of labelings is K!. These labelings are partitioned by the action of Aut(Δ) into equivalence
classes of cardinality |Aut(Δ)|, so the number of such classes is K!/|Aut(Δ)|.
Definition 20. The labeled multiplicity μLAB (Δ) of a state Δ of cardinality K is K!/|Aut(Δ)|. The labeled
entropy eLAB (Δ) of Δ is log μLAB (Δ) = log K! − log |Aut(Δ)|.
It is sometimes desirable to decompose the subset CLAB (Δ) of DLAB consisting of all equivalence
classes of labelings of Δ. This may be accomplished via equivalence classes of partial labelings
of Δ, i.e., labelings of special subsets U of Δ. To yield a suitable version of equivalence, U must be a
union of orbits under Aut(Δ), and the labeling must be by consecutive natural numbers beginning
with zero. The set of equivalence classes of such partial labelings is partially ordered by extension of
class representatives. A labeling $\ell$ of $U$ corresponds to a subset $C_{\mathrm{LAB}}(\ell)$ of $C_{\mathrm{LAB}}(\Delta)$ defined by labelings
of $\Delta$ extending $\ell$. Letting $U$ and $\ell$ vary, one obtains a family of sets $\{C_{\mathrm{LAB}}(\ell)\}$ that cover $C_{\mathrm{LAB}}(\Delta)$,
generally in a highly redundant fashion. A partition of $C_{\mathrm{LAB}}(\Delta)$ induced by partial labelings of $\Delta$ is defined
to be a partition whose members are open sets in the topology on $C_{\mathrm{LAB}}(\Delta)$ generated by $\{C_{\mathrm{LAB}}(\ell)\}$,
i.e., unions of finite intersections of members of $\{C_{\mathrm{LAB}}(\ell)\}$. Choosing such a partition for each $\Delta$
defines a partition of DLAB , and the collection of all such partitions forms a “large” entropy system.
Smaller subsystems may be more convenient to work with in practice.
Definition 21. Let Δ be a member of D, and let CLAB (Δ) be the subset of DLAB consisting of all equivalence
classes of labelings of Δ. Let ΠLAB (Δ) be the set of partitions of CLAB (Δ) induced by partial labelings of Δ,
and let ΠLAB be the set of partitions of DLAB constructed from the partitions ΠLAB (Δ), partially ordered
by refinement. For any relation Pα ≺ P β in ΠLAB , and for any subset V belonging to Pα , let μαβ (V β ) be the
cardinality of the quotient set V β of V under the equivalence relation ∼ β induced by P β . Let μLAB be the family
of measures μαβ . Then the triple (DLAB , ΠLAB , μLAB ) is called the labeled entropy system.
Symmetry microstates share entropic similarities with labeling microstates, since both approaches
involve labelings. The principal differences are that symmetry microstates label only elements of a state
Δ that are not fixed by its automorphisms, and labelings related by automorphisms are not considered
to be equivalent. It is convenient to fix an arbitrary “initial” labeling on the set Δ̃ of elements of Δ not
fixed by Aut(Δ), i.e., the union of nonsingleton orbits under Aut(Δ). A labeling of Δ̃ is then considered
permissible if it is generated by applying an element of Aut(Δ) to this initial labeling. The number of
such labelings is just the order |Aut(Δ)| of Aut(Δ).
Definition 22. The symmetry multiplicity μSYM (Δ) of a finite state Δ is |Aut(Δ)|. The symmetry
entropy eSYM (Δ) of Δ is log μSYM (Δ) = log |Aut(Δ)|.
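As a small sanity check on Definitions 20 and 22, the following Python sketch counts automorphisms by brute force for tiny finite directed sets and derives both multiplicities from the count. The function names and the three example states (a four-element "trident", a three-element antichain, and a three-element chain) are hypothetical illustrations rather than objects defined in the text, and enumerating all K! permutations is feasible only for very small K.

```python
from itertools import permutations
from math import factorial, log

def automorphism_count(n, relations):
    """Count permutations of range(n) mapping the relation set onto itself."""
    rel = set(relations)
    count = 0
    for perm in permutations(range(n)):
        image = {(perm[a], perm[b]) for (a, b) in rel}
        if image == rel:
            count += 1
    return count

def multiplicities(n, relations):
    """Return (labeled multiplicity, symmetry multiplicity) for a finite state."""
    aut = automorphism_count(n, relations)
    mu_lab = factorial(n) // aut   # K!/|Aut|, as in Definition 20
    mu_sym = aut                   # |Aut|, as in Definition 22
    return mu_lab, mu_sym

def entropies(n, relations):
    """Return (labeled entropy, symmetry entropy) as natural logarithms."""
    mu_lab, mu_sym = multiplicities(n, relations)
    return log(mu_lab), log(mu_sym)

# Hypothetical example states, given as lists of relations (a, b) meaning a precedes b.
TRIDENT = [(0, 1), (0, 2), (0, 3)]   # one element related to three maximal elements
ANTICHAIN = []                       # three unrelated elements
CHAIN = [(0, 1), (1, 2)]             # a rigid three-element chain

if __name__ == "__main__":
    for name, n, rel in [("trident", 4, TRIDENT),
                         ("antichain", 3, ANTICHAIN),
                         ("chain", 3, CHAIN)]:
        mu_lab, mu_sym = multiplicities(n, rel)
        print(name, mu_lab, mu_sym, mu_lab * mu_sym == factorial(n))
```

In each example the product of the two multiplicities equals K!, which is what the two definitions imply for a state of cardinality K.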
By Definitions 20 and 22, μLAB (Δ)μSYM (Δ) = K! for a state Δ of cardinality K. Processes exhibiting
an increase in eLAB therefore exhibit a decrease in eSYM for a fixed state cardinality, and vice versa,
although “expanding universes” may exhibit simultaneous increases in both types of entropy. As in
the case of labeled microstates, it is sometimes desirable to decompose the subset CSYM (Δ) of DSYM
consisting of all permissible labelings of Δ̃. This may be accomplished by partially labeling Δ̃ in a
suitable manner; in particular, the set U of elements labeled must be a union of nonsingleton orbits
under $\mathrm{Aut}(\Delta)$. Such a labeling $\ell$ defines a subset $C_{\mathrm{SYM}}(\ell)$ of $C_{\mathrm{SYM}}(\Delta)$ consisting of all labelings of $\tilde{\Delta}$
extending $\ell$. The set of all such labelings for all such $U$ is partially ordered by extension. The collection
of sets $\{C_{\mathrm{SYM}}(\ell)\}$ defines a family of partitions of $D_{\mathrm{SYM}}$, and hence an entropy system.
Definition 23. Let Δ be a member of D, and let CSYM (Δ) be the subset of DSYM consisting of all permissible
labelings of the set Δ̃ of elements of Δ not fixed by Aut(Δ), with respect to an arbitrary initial labeling.
Let ΠSYM (Δ) be the set of partitions of CSYM (Δ) induced by partial labelings of Δ̃, and let ΠSYM be the set of
partitions of DSYM constructed from the partitions ΠSYM (Δ), partially ordered by refinement. For any relation
Pα ≺ P β in ΠSYM , and for any subset V belonging to Pα , let μαβ (V β ) be the cardinality of the quotient set V β
of V under the equivalence relation ∼ β induced by P β . Let μSYM be the family of measures μαβ . Then the triple
(DSYM , ΠSYM , μSYM ) is called the symmetry entropy system.
It may often suffice on physical grounds to restrict attention to notions of entropy more specific
than those associated with the entropy systems of Definitions 19, 21 and 23, although it may be
necessary to supersede the simplistic notions of Definitions 18, 20 and 22. For superset microstates,
weighted sums of entropies can be useful to naturally distill finite entropic values from infinite families
of microstates. Abstractly, such sums are analogous to Gibbs or Shannon entropies. A practical
reason to study such sums is to quantify the degree to which prehistorical data of various orders is
dynamically relevant. A simple example of such a weighted sum is
$$e(\Delta) = \sum_{n=1}^{\infty} \frac{e^n_{\mathrm{SUP}}(\Delta)}{n^4}, \tag{6}$$
where the denominator $n^4$ dominates the rapid growth of $e^n_{\mathrm{SUP}}(\Delta)$ as $n$ increases. For both
labeled microstates and symmetry microstates, symmetry considerations are paramount. Interesting
generalizations of Definitions 20 and 22 include those involving the study of symmetries that are
broken or preserved by specific prehistorical information. This leads to the concept of extension groups,
which measure how many automorphisms of a terminal state extend to automorphisms of a specified
superset. One may formalize this idea in terms of pairs of transitions (τ1 , τ2 ), where τ1 specifies a
terminal state Δτ1 , and τ2 specifies a superset Δτ2 of Δτ1 that breaks some of the symmetries of Δτ1 .
Finiteness assumptions may be added as necessary.
Definition 24. Let $\tau$, $\tau_1$ and $\tau_2$ be transitions of directed sets with sources $D$, $D_1$ and $D_2$, and common
target $D'$. Assume that $\tau_2(D_2) \subset \tau_1(D_1)$ in $D'$. Let $\Delta_\tau$, $\Delta_{\tau_1}$ and $\Delta_{\tau_2}$ be the terminal states of $\tau$, $\tau_1$, and $\tau_2$.
The generational automorphism groups discussed in Section 8.2 of [14] are special cases of
state automorphism groups. The quantities $\mu^{\tau_1\tau_2}_{\mathrm{SYM}}$ and $e^{\tau_1\tau_2}_{\mathrm{SYM}}$ may be derived from the symmetry
entropy system, if desired. $E^{\tau_1\tau_2}$ is generally not a normal subgroup of $\mathrm{Aut}(\Delta_{\tau_1})$. The superset $\Delta_{\tau_2}$ may
acquire “new” symmetries that do not extend nontrivial symmetries of Δτ1 , but this is atypical due
to rigidity. Since the purpose of studying entropic phase maps is to assign quantum-theoretic phases
to co-relative kinematics, it is necessary to adapt the preceding notions to apply to co-relative histories
h : Di ⇒ Dt in a kinematic scheme S. The states of principal interest in this context are terminal states
of the cobase Di and target Dt of h. For generality, it is convenient to work with an unspecified entropy
function on a subset of D. Again, finiteness assumptions may be added as necessary.
Definition 25. Let $h : D_i \Rightarrow D_t$ be a co-relative history. Let $\Delta_{\tau_i}$ and $\Delta_{\tau_t}$ be terminal states of $D_i$
and $D_t$, respectively. Let $e$ be an entropy function on a subset of D.
1. The initial entropy $e_i^{\tau_i}(h)$ of $h$ with respect to $\tau_i$ is $e(\Delta_{\tau_i})$.
2. The terminal entropy $e_t^{\tau_t}(h)$ of $h$ with respect to $\tau_t$ is $e(\Delta_{\tau_t})$.
3. The relative entropy $e^{\tau_i\tau_t}(h)$ of $h$ with respect to the pair $(\tau_i, \tau_t)$ is $e(\Delta_{\tau_t}) - e(\Delta_{\tau_i})$.
It is useful to specialize Definition 25 to the case where τi and τt are transitions of specific degrees,
as specified in Definition 13.
Definition 26. Let h : Di ⇒ Dt be a co-relative history, and let e be an entropy function on a subset of D.
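$$\Theta_e(\gamma) = \prod_{k=0}^{N} \exp\bigl( i\, e^{\tau_{ik}\tau_{tk}}(h_k) \bigr), \tag{7}$$
where $\Delta_{\tau_{ik}}$ and $\Delta_{\tau_{tk}}$ are terminal states of $D_{ik}$ and $D_{tk}$ with respect to transitions $\tau_{ik}$ and $\tau_{tk}$.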
This approach restricts attention to causal Schrödinger-type equations of the form given in Equation (4),
since this equation is defined in terms of a relation function θ, rather than a possibly nonmultiplicative
phase map. Since the target of hk coincides with the cobase of hk+1 , it is often reasonable to choose
$\tau_{i(k+1)} = \tau_{tk}$. With these choices, the product in Equation (7) telescopes to yield the simpler expression
$$\Theta_e(\gamma) = \exp\bigl( i\,\bigl( e(\Delta_{\tau_{tN}}) - e(\Delta_{\tau_{i0}}) \bigr) \bigr). \tag{8}$$
This telescoping property implies that the value of Θe is independent of the choice of chain γ in
R(M(S)) between r (h0 ) and r (h N ), a feature revisited in Section 4.2. It is sometimes convenient to use
the shorthand eτi0 τtN (γ) for the entropic quantity e(ΔτtN ) − e(Δτi0 ) multiplying i in the exponential in
Equation (8), which generalizes the expression eτi τt (h) = e(Δτt ) − e(Δτi ) appearing in Definition 25
for a single co-relative history $h : D_i \Rightarrow D_t$. The simplest such phase maps $\Theta_e$ are given by choosing
$\Delta_{\tau_{ik}}$ and $\Delta_{\tau_{tk}}$ to be the mth-degree terminal states $T^m(D_{ik})$ and $T^m(D_{tk})$ defined via the mth-degree
transitions $\tau_{ik} = \tau^m_{D_{ik}}$ and $\tau_{tk} = \tau^m_{D_{tk}}$ under Definition 13, for some natural number $m$. I focus principally
on phase maps of this form in what follows. The primitive phase maps discussed in Section 8.2 of [14]
are defined exclusively in terms of terminal states of transitions representing the co-relative histories
h0 , ..., h N . The approach described here is more general.
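The telescoping step from Equation (7) to Equation (8) can be checked numerically. The sketch below assigns arbitrary, hypothetical entropy values to the states visited by a chain, imposes the choice that the initial terminal state of each history equals the final terminal state of its predecessor, and compares the full product of phase factors with the single telescoped exponential. The function names are illustrative only.

```python
import cmath
import random

def product_phase(initial_entropies, terminal_entropies):
    """Product over k of exp(i * (e_terminal_k - e_initial_k)), as in Equation (7)."""
    phase = 1 + 0j
    for e_i, e_t in zip(initial_entropies, terminal_entropies):
        phase *= cmath.exp(1j * (e_t - e_i))
    return phase

def telescoped_phase(first_initial_entropy, last_terminal_entropy):
    """Single exponential exp(i * (e_last - e_first)), as in Equation (8)."""
    return cmath.exp(1j * (last_terminal_entropy - first_initial_entropy))

def random_chain(n_histories, seed=0):
    """Hypothetical entropies for a chain of histories h_0, ..., h_N.

    The initial entropy of h_{k+1} is set equal to the terminal entropy
    of h_k, mirroring the choice that makes the product telescope.
    """
    rng = random.Random(seed)
    states = [rng.uniform(0.0, 5.0) for _ in range(n_histories + 1)]
    return states[:-1], states[1:]

if __name__ == "__main__":
    initial, terminal = random_chain(7)
    lhs = product_phase(initial, terminal)
    rhs = telescoped_phase(initial[0], terminal[-1])
    print(abs(lhs - rhs))
```

The two values agree up to floating-point rounding, whatever intermediate entropies are chosen, which is the chain independence described above.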
Referring to Section 3.4, there are many possible ways to define an entropy function e to determine
specific content for Equation (7) or Equation (8). No specific examples involving resolution entropy are
computed here, since the details of this approach are outside the scope of this paper. In rough terms,
however, the multiplicities assigned to terminal states in this context are the numbers of such states
sharing common resolutions, and the corresponding entropies are the logarithms of these multiplicities.
An obvious qualitative conclusion that may be drawn in this context is that maximizing the entropic
quantity eτi0 τtN (γ) = e(ΔτtN ) − e(Δτi0 ) tends to favor “expanding universe” scenarios, in which the
cardinality of ΔτtN exceeds that of Δτi0 , provided that the sizes of causal atoms are roughly equal
in decompositions of states of different sizes. This qualitative relationship may be understood by
“inverting” the decomposition process, replacing each element in a directed set with a causal atom;
there are clearly more ways to do this for larger sets. Qualitative entropic preference for expanding
universe scenarios is in fact a generic feature of discrete causal notions of entropy; this is a posteriori
obvious on basic enumerative grounds. Cosmological observations do favor accelerating expansion
of spacetime, but the correspondence between large universes and high overall entropy is much too
general to favor discrete causal theory specifically. Conventional thermodynamic systems exhibit
increasing entropy without acquiring new degrees of freedom, and this suggests examining the notion
of entropy per unit volume to “correct” for differences in the sizes of states. This idea is revisited in more
detail below. It should also be emphasized that the quantity eτi0 τtN (γ) appears here in a role analogous
to that of the classical action S in Feynman’s phase map, which is typically minimized for favored
trajectories under Hamilton’s principle of stationary action. This suggests the possibility of adding a
minus sign to the exponents in Equations (7) and (8), thus treating eτi0 τtN (γ) as a “negative action”.
Regardless of this choice, the quantity eτi0 τtN (γ) must obey some analogue of stationary action to
produce suitable interference effects, for example, by exhibiting similar values for similar states of
high entropy. This nontrivial requirement is elaborated in Section 4.2.
A simple specific choice for the entropy function $e$ in Equations (7) and (8) is the nth superset
entropy function $e^n_{\mathrm{SUP}}$ of Definition 18. Choosing $\Delta_{\tau_{i0}} = T^m(D_{i0})$ and $\Delta_{\tau_{tN}} = T^m(D_{tN})$ in Equation (8)
yields the phase map
$$\Theta_e(\gamma) = \exp\bigl( i\,\bigl( e^n_{\mathrm{SUP}}(T^m(D_{tN})) - e^n_{\mathrm{SUP}}(T^m(D_{i0})) \bigr) \bigr). \tag{9}$$
Even this simple phase map is difficult to compute exactly for arbitrary values of m and n, since it
requires calculating all possible ways to add n prehistorical elements and an unspecified number
of relations to T m ( Di0 ) and T m ( DtN ). However, a few special cases may be computed, and rough
qualitative conclusions may be drawn. Beginning with m = 0, T 0 ( Di0 ) and T 0 ( DtN ) are just antichains
consisting of the maximal elements of Di0 and DtN , respectively. In the finite case, their cardinalities
are natural numbers Ki0 and KtN . If also n = 0, then
$$\Theta_e(\gamma) = \exp\bigl( i\,\bigl( e^0_{\mathrm{SUP}}(T^0(D_{tN})) - e^0_{\mathrm{SUP}}(T^0(D_{i0})) \bigr) \bigr) = \exp\bigl( i\,( \log 1 - \log 1 ) \bigr) = e^0 = 1,$$
for any choice of γ, since there is exactly one way to add zero elements to each of the directed
sets T 0 ( Di0 ) and T 0 ( DtN ). More generally, trivial supersets produce trivial superset entropies.
Taking m = 0 and n = 1 in Equation (9) still involves zeroth-degree terminal states, but adds nontrivial
information to these states. The first superset multiplicity $\mu^1_{\mathrm{SUP}}(T^0(D_{i0}))$ of $T^0(D_{i0})$ under Definition 18
is $K_{i0} + 1$, because a superset of an antichain given by adding a single prehistorical element is
determined up to isomorphism by its number of relations, which may range from 0 to $K_{i0}$ in this case.
Similarly, the multiplicity $\mu^1_{\mathrm{SUP}}(T^0(D_{tN}))$ is $K_{tN} + 1$, so with these choices
$$\Theta_e(\gamma) = \exp\bigl( i\,\bigl( \log(K_{tN} + 1) - \log(K_{i0} + 1) \bigr) \bigr).$$
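The count of K + 1 for an antichain can be confirmed by brute force for small K. The sketch below is a simplified model, assuming that a one-element superset of a K-element antichain is specified by the subset of antichain elements related to the new prehistorical element, and that two such subsets give isomorphic supersets exactly when some permutation of the antichain carries one to the other. The function names are hypothetical.

```python
from itertools import combinations, permutations
from math import log
import cmath

def first_superset_multiplicity_antichain(K):
    """Count isomorphism classes of one-element supersets of a K-element antichain."""
    classes = set()
    for size in range(K + 1):
        for future in combinations(range(K), size):
            # Canonical representative of the orbit of `future` under all
            # relabelings of the antichain (feasible only for small K).
            canonical = min(
                tuple(sorted(perm[x] for x in future))
                for perm in permutations(range(K))
            )
            classes.add(canonical)
    return len(classes)

def antichain_phase(K_initial, K_terminal):
    """Phase exp(i (log(K_terminal + 1) - log(K_initial + 1))) for m = 0, n = 1."""
    return cmath.exp(1j * (log(K_terminal + 1) - log(K_initial + 1)))

if __name__ == "__main__":
    for K in range(1, 6):
        print(K, first_superset_multiplicity_antichain(K))
    print(antichain_phase(5, 10))
```

Each orbit is determined by the size of the subset, so the enumeration returns K + 1, and the exponent of the phase grows with the ratio of the two antichain sizes.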
Here, the entropic preference for “expanding universe” scenarios is quantitatively obvious, and the
same effect clearly extends to higher-order states and higher-index superset entropy functions,
since there are typically more ways to add families of prehistorical elements to large directed sets than to
small ones. Conventional thermodynamics suggests that working with zeroth-degree terminal states is
likely inadequate to determine relevant entropic quantities, so a more serious treatment involves states
of higher degree. Substituting first-degree terminal states T 1 ( Di0 ) and T 1 ( DtN ) into Equation (9) yields
the most obvious discrete causal analogue of conventional thermodynamic entropy in the superset
context. Zeroth superset entropies offer no useful information, so the first interesting case is given by
setting m = n = 1. This requires computing the number of ways to add a single prehistorical element
to a first-degree terminal state of cardinality K, an interesting enumerative problem. Referring to the
discussion following Figure 15, a very rough estimate of this number is $2^K$, assuming that the state is
nearly rigid. This produces an estimate of
$$\Theta_e(\gamma) \approx \exp\bigl( i\,(K_{tN} - K_{i0}) \log 2 \bigr)$$
for the resulting phase map, which again suggests an entropic preference for “expanding universe”
scenarios. Applying higher-index entropy maps $e^n_{\mathrm{SUP}}$ in this context leads to further intricate
enumerations, but rough estimates may again be formulated. Ignoring symmetries, overcounting,
and multidirected structure of the type illustrated by McKay’s example in Figure 16, the nth superset
multiplicity $\mu^n_{\mathrm{SUP}}(\Delta)$ of a state $\Delta$ of cardinality $K$ and arbitrary order is roughly
$$\mu^n_{\mathrm{SUP}}(\Delta) \approx \prod_{k=1}^{n} 2^{K+k-1} = 2^{\binom{n}{2} + Kn} = 2^{\frac{n^2}{2} + O(n)}, \tag{10}$$
which corresponds to superset entropies of roughly $n^2 \log \sqrt{2} + O(n)$. This estimate is derived
by adding prehistorical elements sequentially, and naïvely multiplying together the estimated
multiplicities at each step. The factor $n^2$ explains the choice of denominators $n^4$ in the summands in
Equation (6), which offers a simple way to ensure convergence of the series. Equation (10) yields better
estimates for higher-order states, which are typically more rigid. For zeroth-order states, it is a very
poor estimate, particularly for low-index superset entropies. For first-degree terminal states, its overall
accuracy depends on the asymptotic behavior of automorphism groups of states of increasing size.
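The arithmetic behind the rigid estimate and the convergence of the weighted sum in Equation (6) can be illustrated as follows. The sketch assumes that the kth prehistorical element added may relate to any subset of the K + k - 1 elements already present, so that the first step reproduces the estimate of 2^K, and it substitutes the resulting rough entropies into the weighted sum. The function names and the sample value K = 10 are hypothetical.

```python
from math import comb, log

def rigid_exponent(K, n):
    """Base-2 exponent of the rigid estimate after n sequential additions."""
    return sum(K + k - 1 for k in range(1, n + 1))

def estimated_superset_entropy(K, n):
    """Rough nth superset entropy: log of 2 ** rigid_exponent(K, n)."""
    return rigid_exponent(K, n) * log(2)

def weighted_entropy_partial_sum(K, terms):
    """Partial sum of estimated entropies divided by n**4, n = 1..terms."""
    return sum(estimated_superset_entropy(K, n) / n**4
               for n in range(1, terms + 1))

if __name__ == "__main__":
    K = 10
    for n in (1, 2, 5, 20):
        # The closed form binom(n, 2) + K*n matches the sequential sum.
        print(n, rigid_exponent(K, n), comb(n, 2) + K * n)
    for terms in (10, 100, 1000, 10000):
        print(terms, weighted_entropy_partial_sum(K, terms))
```

Because the estimated entropies grow quadratically in n, the summands decay like the inverse square of n, and the partial sums settle quickly. A denominator of degree 3 or lower would not guarantee this under the same estimate.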
The mathematical interest of terminal states of low but nonzero degree arises largely from the
fact that their behavior is balanced between the rigidity of high-order states and the transitivity of
zeroth-order states in a group-theoretic sense. Estimates assuming rigidity, such as Equation (10),
are naturally rough in this context, but can nonetheless provide useful upper bounds. As in the case
of resolution entropy, conventional thermodynamic analogies suggest studying entropies per unit
volume in the superset context. The necessity of demonstrating suitable interference effects under path
summation also remains central. Since there is generally no natural limit to “how far back in time” one
may extend supersets, filtering methods associated with the lexicographic superset entropy system of
Definition 19, such as the weighted sum of entropies in Equation (6), are of interest for organizing
relevant information, while respecting the relative insignificance of the distant past, and producing
finite values for physically meaningful quantities.
The labeled entropy function eLAB of Definition 20 offers another choice for the entropy function e
in Equations (7) and (8). A trivial case is when Δτi0 = T 0 ( Di0 ) and ΔτtN = T 0 ( DtN ). Since these states
are antichains, they are transitive under their automorphism groups; i.e., each consists of a single orbit.
Hence, all labelings of these states are equivalent, so their labeled multiplicities are equal to 1, and their
labeled entropies are equal to zero. Thus, Θe (γ) = e0 = 1 for any choice of γ. For higher-degree states,
the situation is more interesting. Referring again to Definition 20, the labeled multiplicity μLAB (Δ) of
an arbitrary state Δ of cardinality K is K!/|Aut(Δ)|. In particular, the multiplicity of 1 for a zeroth-order
state may be interpreted as the ratio K!/K!. This ratio typically increases toward K! for a sequence of
states of increasing order, since such states tend to become increasingly rigid. For such a sequence
constructed by adding new levels of structure to an initial state, the state cardinality K in the ratio
K!/|Aut(Δ)| is itself an increasing function, but this ratio is particularly interesting in the study of
entropy per unit volume, which corrects for increasing K. Low-order states often possess nontrivial
automorphism groups, and the computation of labeled entropies for such states leads to interesting
enumerative problems. The dynamical insignificance of the distant past suggests that these states
are also the most interesting from an evolutionary perspective. For high-degree states $T^m(D_{i0})$ and
$T^m(D_{tN})$ of cardinalities $K_{i0}$ and $K_{tN}$, abbreviated to $K$ and $K'$ for legibility, typical labeled multiplicities
are approximately $K!$ and $K'!$ by rigidity, and the corresponding entropies are approximately
$$e_{\mathrm{LAB}}(T^m(D_{i0})) \approx \log K! = K \log K - K + O(\log K)$$
and
$$e_{\mathrm{LAB}}(T^m(D_{tN})) \approx \log K'! = K' \log K' - K' + O(\log K'),$$
by Stirling’s approximation. These estimates lead to a phase map with values of roughly
$$\Theta_e(\gamma) \approx \exp\bigl( i \log(K'!/K!) \bigr) \approx \exp\bigl( i\,( K' \log K' - K \log K ) \bigr), \tag{11}$$
where the last expression omits the linear and logarithmic terms in Stirling’s approximation,
since rigidity is only generic and asymptotic. As in previous examples, maximizing the entropic
quantity $e^{\tau_{i0}\tau_{tN}}(\gamma) \approx K' \log K' - K \log K$ in this context favors “expanding universe” scenarios.
More sophisticated phase maps involving filtering methods such as weighted sums associated with
the labeled entropy system of Definition 21 are also of interest in this context.
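To see how much the omitted Stirling terms matter in Equation (11), one can compare the exact exponent log(K'!/K!) with the leading-order expression K' log K' - K log K. The sketch below does this with the log-gamma function; the sample cardinalities are hypothetical, and the function names are illustrative.

```python
from math import lgamma, log

def log_factorial(K):
    """Natural logarithm of K!, computed via the log-gamma function."""
    return lgamma(K + 1)

def labeled_exponent_exact(K, K_prime):
    """log(K'!/K!), the exponent for rigid states of cardinalities K and K'."""
    return log_factorial(K_prime) - log_factorial(K)

def labeled_exponent_leading(K, K_prime):
    """Leading Stirling terms only: K' log K' - K log K."""
    return K_prime * log(K_prime) - K * log(K)

if __name__ == "__main__":
    for K, K_prime in [(10, 12), (1000, 1100), (10**6, 2 * 10**6)]:
        exact = labeled_exponent_exact(K, K_prime)
        leading = labeled_exponent_leading(K, K_prime)
        # Restoring the linear term -(K' - K) recovers the exact value
        # up to a logarithmic correction.
        print(K, K_prime, exact, leading, leading - (K_prime - K))
```

The leading-order expression overshoots the exact exponent by roughly K' - K, so it conveys the growth of the exponent with state size, not the precise position of the phase on the circle.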
Phase maps derived from symmetry entropies may be treated in a similar manner, although high
labeled entropies correspond to low symmetry entropies, and vice versa, after accounting for the
cardinalities of the states under consideration. If $e = e_{\mathrm{SYM}}$, then the symmetry multiplicities of the
zeroth-degree states $T^0(D_{i0})$ and $T^0(D_{tN})$ of cardinalities $K$ and $K'$ are $K!$ and $K'!$, so the corresponding
phase $\Theta_e(\gamma) = \exp\bigl( i \log(K'!/K!) \bigr)$ is the same as the estimate given in Equation (11) for the phase
induced by labeled entropies of nearly-rigid states $T^m(D_{i0})$ and $T^m(D_{tN})$ of the same cardinalities.
Conversely, for nearly-rigid states, phase values induced by symmetry entropies are near e0 = 1.
Again, the most interesting behavior occurs for terminal states of relatively low but nonzero degree,
which possess limited but nontrivial causal structure, and have limited but nontrivial symmetries.
More sophisticated phase maps may be constructed in terms of the symmetry entropy system of
Definition 23. For example, it is interesting to compare entropies associated with terminal states of
different degrees for the same history, using the relative notions introduced in Definition 24.
the path summation approach to succeed. Much of the appeal of entropic phase maps in this setting
arises from the fact that the idea of entropy is sufficiently general to produce a variety of discrete causal
quantities with interesting interference-related behavior that may resemble that of S, while remaining
sufficiently specific to offer meaningful physical interpretations. This is not to suggest that S is similar
to conventional entropy in other ways; indeed, S is a cumulative quantity that is typically minimized
by favored processes, which are typically time-symmetric, while entropy is conventionally understood
as an instantaneous quantity whose increase is observed to follow, and in some settings is believed to
possibly generate, the arrow of time. It is the role of discrete causal entropy in producing desirable
interference effects that must be “action-like” in the context of entropic phase maps. This is one reason
why it is reasonable to simultaneously entertain essentially opposite versions of entropy in this setting,
such as labeled entropy and symmetry entropy. In a similar manner, discrete causal action principles
need not closely resemble conventional motion-related or metric-related action principles in general,
provided that they play an analogous abstract role. The action principles discussed in Section 4.3 are
chosen with conventional definitions in mind, but many other choices are possible.
It is therefore interesting to explore which, if any, discrete causal notions of entropy can produce
“clustering effects” for phases that mimic stationary action in a suitable manner. I begin with a
simple “very early universe scenario” in SPS , involving a toy co-relative kinematics represented by
a chain γ = r (h0 ) ≺ ... ≺ r (h N ) of relations r (hk ) in R(M(SPS )) representing co-relative histories
hk : Dik ⇒ Dtk for 0 ≤ k ≤ N. In the general telescoping entropic phase map
$$\Theta_e(\gamma) = \exp\bigl( i\,\bigl( e(\Delta_{\tau_{tN}}) - e(\Delta_{\tau_{i0}}) \bigr) \bigr)$$
of Equation (8), I choose $e$ to be the symmetry entropy function $e_{\mathrm{SYM}}$ of Definition 22, and $\Delta_{\tau_{i0}}$ and
$\Delta_{\tau_{tN}}$ to be zeroth-degree terminal states $T^0(D_{i0})$ and $T^0(D_{tN})$ of cardinalities 5 and 10, respectively.
With these choices, $\Theta_e(\gamma) = \exp\bigl( i\,(\log 10! - \log 5!) \bigr) = e^{i(10.3169\ldots)}$. Phases determined by this
particular map are very unstable for small changes in the sizes of $T^0(D_{i0})$ and $T^0(D_{tN})$. For example,
adding one additional element to $T^0(D_{tN})$ yields a phase of $e^{i(12.7148\ldots)}$, which is separated from
$\Theta_e(\gamma)$ by an angle of about $3\pi/4$ on $S^1$. More generally, since $\log (K+1)! - \log K! = \log(K+1)$,
adding even a single additional maximal element to an arbitrary zeroth-order terminal state
produces a much different symmetry multiplicity, and this behavior only increases for large histories.
Working with entropy per unit volume, instead of raw entropy, trades this instability for a profound,
and perhaps excessive, stability. By Stirling’s approximation, the entropy per unit volume of T 0 ( DtN )
is roughly log | T 0 ( DtN )| in this example, a quantity which is very stable under small changes in the
size of $T^0(D_{tN})$. Using ballpark figures for fundamental units, the observable universe may possess
a spatial volume of about $10^{180}$ in a suitable frame of reference, and treating Hubble’s “constant” as
actually constant gives a doubling time of about $10^{60}$. Depending on the choice of kinematic scheme,
one may therefore imagine a chain of perhaps $10^{60}$ to $10^{180}$ co-relative histories leading to a change in
entropy per unit volume of about log 2. Hence, this simplistic notion of entropy per unit volume does
not seem to change very rapidly in the actual universe.
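The numbers quoted in this toy scenario are easy to reproduce. The sketch below evaluates the symmetry entropy log K! of an antichain with the log-gamma function, recomputes the exponents for terminal cardinalities 10 and 11 with initial cardinality 5, measures their separation on the unit circle, and compares the stability of entropy per unit volume. The function names are hypothetical, and "volume" is taken to be the cardinality K of the antichain.

```python
from math import lgamma, pi

def symmetry_entropy_antichain(K):
    """log K!, the symmetry entropy of a K-element antichain."""
    return lgamma(K + 1)

def exponent(K_initial, K_terminal):
    """Entropic quantity multiplying i in the telescoping phase map."""
    return (symmetry_entropy_antichain(K_terminal)
            - symmetry_entropy_antichain(K_initial))

def circular_distance(a, b):
    """Angle between exp(i a) and exp(i b) on the unit circle, in [0, pi]."""
    d = abs(a - b) % (2 * pi)
    return min(d, 2 * pi - d)

def entropy_per_unit_volume(K):
    """Symmetry entropy of a K-element antichain divided by K."""
    return symmetry_entropy_antichain(K) / K

if __name__ == "__main__":
    base = exponent(5, 10)      # log 10! - log 5!
    bumped = exponent(5, 11)    # one additional maximal element
    print(base, bumped, circular_distance(base, bumped), 3 * pi / 4)
    big = 10**6
    print(entropy_per_unit_volume(big + 1) - entropy_per_unit_volume(big))
```

The raw exponents differ by log 11, close to 3π/4, while the per-unit-volume quantity for a large antichain changes by roughly the reciprocal of its cardinality when one element is added.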
The chain independence property for the general telescoping entropic phase map Θe of
Equation (8) is at least superficially attractive in the path summation context, since it suggests large
amplitudes for processes possessing large numbers of evolutionary pathways. What is really needed,
however, is a stronger property that produces “nearly identical phases” for “nearly identical physics”,
rather than merely producing identical phases for alternative descriptions of identical physics. A class
of maps that often exhibits this type of behavior is the class of telescoping multiplicity phase maps
$$\Theta_\mu(\gamma) = \exp\bigl( i\,\mu(\Delta_{\tau_{tN}})/\mu(\Delta_{\tau_{i0}}) \bigr). \tag{12}$$
Even a modest increase in entropy between Δτi0 and ΔτtN corresponds to a ratio μ(ΔτtN )/μ(Δτi0 )
that is near zero. Phases Θμ (γ) for chains γ exhibiting large increases in entropy therefore
constructively interfere, clustering near the complex number ei0 = 1. Similar behavior is not evident in
Equation (8), because the entropic quantity $e^{\tau_{i0}\tau_{tN}}(\gamma) = e(\Delta_{\tau_{tN}}) - e(\Delta_{\tau_{i0}})$ in the exponent of $\Theta_e$ typically
has nonnegligible magnitude compared to the circumference $2\pi$ of $S^1$. Hence, two chains $\gamma$ and $\gamma'$
with “similar” final co-relative histories exhibiting large but distinct entropies may possess phases
$\Theta_e(\gamma)$ and $\Theta_e(\gamma')$ far apart on $S^1$, which does not suggest encouraging interference properties for $\Theta_e$.
For example, suppose that $\Delta_{\tau_{i0}}$ is rigid, and compare two different chains $\gamma$ and $\gamma'$ with final co-relative
histories $h_N$ and $h'_N$ exhibiting symmetry multiplicities $\mu_{\mathrm{SYM}}(\Delta_{\tau_{tN}}) = K$ and $\mu_{\mathrm{SYM}}(\Delta'_{\tau_{tN}}) = 6K$.
Here, $\Delta_{\tau_{tN}}$ and $\Delta'_{\tau_{tN}}$ may be nearly-identical first-degree terminal states, differing, for example, by a
single “trident-shaped” component contributing a symmetry factor of $S_3$. However, the difference
between the entropic quantities $e^{\tau_{i0}\tau_{tN}}(\gamma)$ and $e^{\tau_{i0}\tau_{tN}}(\gamma')$ in $\Theta_e(\gamma)$ and $\Theta_e(\gamma')$ is $\log 6$, which translates
to an angular separation exceeding $\pi/2$. This example suggests that very similar processes can
destructively interfere under $\Theta_e$. In contrast, the angular separation between $\Theta_\mu(\gamma')$ and $\Theta_\mu(\gamma)$ in
this example is $1/6K$, so that both phases are very near $e^{i0} = 1$ for large $K$. Unfortunately, the map
Θμ in Equation (12) seems to exhibit too much constructive interference, in the sense that it assigns
a phase near 1 to every chain involving a modest increase in entropy. The precedent of Feynman’s
i
phase map Θ(γ) = e h̄ S(γ) suggests that the entropic quantities multiplying i in a phase map should
not be uniformly small for “physically reasonable” chains. Indeed, by scaling the classical action S
by Planck’s reduced constant h̄, Feynman’s map allows these multipliers to differ appreciably for
modestly different paths describing the behavior of systems for which quantum effects are noticeable,
such as the motion of individual electrons.
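The $\log 6$ comparison for the additive map can be checked numerically. The following Python sketch assumes that symmetry entropy is the natural logarithm of the symmetry multiplicity and that a rigid initial state has multiplicity 1; the function names and the value of K are illustrative choices, not taken from the paper.

```python
import cmath
import math


def theta_e(mu_initial, mu_final):
    """Additive entropic phase exp(i * (e_final - e_initial)), with e = log(mu) (assumed)."""
    return cmath.exp(1j * (math.log(mu_final) - math.log(mu_initial)))


def angular_separation(z1, z2):
    """Angle between two unit complex numbers, in [0, pi]."""
    return abs(cmath.phase(z2 / z1))


K = 1000  # arbitrary illustrative multiplicity
phase_a = theta_e(1, K)      # chain gamma: rigid initial state, final multiplicity K
phase_b = theta_e(1, 6 * K)  # chain gamma': final multiplicity 6K

separation = angular_separation(phase_a, phase_b)
print(separation, math.log(6), math.pi / 2)
```

Under these assumptions the separation equals $\log 6$ independently of K, and because it exceeds $\pi/2$ the real part of `phase_b / phase_a` is negative, which is the sense in which the two contributions tend to cancel.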
It seems, then, that the “additive recipe” of Equation (8) may produce too little constructive
interference, while the “multiplicative recipe” of Equation (12) may produce too much. There are many
possible ways to address this issue. It should be noted that the problem with Equation (12) seems to
be much more serious, producing an obviously wrong answer, whereas for Equation (8) it is merely
unclear what the interference behavior looks like for physically realistic histories. If one chooses,
then, to study modifications of Equation (8), there are at least two obvious methods to explore.
First, one may adjust Θe via a positive real-valued scale factor s, analogous to h̄. The resulting phase
map is of the form
$$\Theta_s(\gamma) = \exp\Big( \frac{i}{s}\big( e(\Delta_{\tau_{t_N}}) - e(\Delta_{\tau_{i_0}}) \big) \Big). \tag{13}$$
Choosing s > 1 produces more tightly-clustered phases, thereby increasing constructive interference
for similar processes. The obvious question then becomes how to choose s in a non-arbitrary manner.
This immediately suggests a second method of modifying Θe , by adjusting the entropies e(Δτi0 ) and
e(ΔτtN ) individually, via information derived in a natural manner from the co-relative histories h0
and h N . An interesting variant of this approach, foreshadowed above, is to focus on entropy per
unit volume, rather than raw entropy. This involves completely different considerations than does
the conventional thermodynamic study of a variable-volume system, such as a quantity of gas in
a chamber compressed by a piston. Such a system is background dependent and does not involve
spacetime expansion. In the present more-fundamental setting, the study of entropy per unit volume
is partly motivated by the idea that the production of “new spacetime” ought to involve some “cost”,
or obey some analogue of continuity. In particular, one does not observe immediate runaway expansion
of spacetime, even though this tends to produce a large increase in entropy. A general phase map for
finite states defined in terms of entropy per unit volume is the telescoping map
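$$\Theta_{e/V}(\gamma) = \exp\Big( i\big( e(\Delta_{\tau_{t_N}})/|\Delta_{\tau_{t_N}}| - e(\Delta_{\tau_{i_0}})/|\Delta_{\tau_{i_0}}| \big) \Big). \tag{14}$$
For an “early universe scenario” involving a version of this map, let $\Delta_{\tau_{i_0}}$ and $\Delta_{\tau_{t_N}}$ be first-degree
terminal states $T^1(D_{i_0})$ and $T^1(D_{t_N})$ of cardinalities 10 and 20, respectively, and suppose that
$|\mathrm{Aut}(\Delta_{\tau_{i_0}})| = 10^2$ and $|\mathrm{Aut}(\Delta_{\tau_{t_N}})| = 10^4$. Then using $e = e_{\mathrm{SYM}}$ in Equation (14) yields
$$\Theta_{e/V}(\gamma) = \exp\Big( i\big( \log(10^4)/20 - \log(10^2)/10 \big) \Big) = e^{i0} = 1.$$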
A similar process represented by a chain $\gamma'$ whose final co-relative history has the same size
for its first-degree terminal state but twice the symmetry multiplicity produces a phase of
$\Theta_{e/V}(\gamma') \approx e^{i(0.0346\ldots)}$. The angular difference of $0.0346\ldots$ between these two values is much smaller
than the corresponding difference of $\log 2 = 0.6931\ldots$ produced by $\Theta_e$. Hence, $\Theta_{e/V}$ offers an example
of how one may increase constructive interference effects via natural information associated with
evolutionary processes. Precise characterization of these effects in physically realistic scenarios depends
on asymptotic behavior of large states. For example, working with symmetry entropy, states that
are “too rigid” will typically produce values near $e^{i0} = 1$ under Equation (14), regardless of the
process involved. On the other hand, states that are “too free” will produce phases for similar processes
insufficiently close to generate adequate constructive interference. Other state-specific modifications
of Equation (8) are also worth considering. For example, natural data associated with states may
be used to determine weights in more sophisticated phase maps involving weighted sums, such as
generalizations of the map given by Equation (6). This is analogous to assigning density functions to
state spaces or weights to individual outcomes in Gibbs or Shannon entropy.
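The “early universe” arithmetic for Equation (14) can be reproduced in a few lines. This Python sketch takes symmetry entropy to be the natural logarithm of the automorphism group size, as in the worked example; the function name and argument order are hypothetical.

```python
import cmath
import math


def theta_e_per_volume(aut_initial, size_initial, aut_final, size_final):
    """Phase of Equation (14) with e = log|Aut| (symmetry entropy, as in the example)."""
    exponent = math.log(aut_final) / size_final - math.log(aut_initial) / size_initial
    return cmath.exp(1j * exponent)


# Example from the text: cardinalities 10 and 20, |Aut| = 10^2 and 10^4.
base = theta_e_per_volume(10**2, 10, 10**4, 20)
# Similar chain whose final state has twice the symmetry multiplicity.
doubled = theta_e_per_volume(10**2, 10, 2 * 10**4, 20)

angle_per_volume = abs(cmath.phase(doubled / base))
angle_raw = math.log(2)  # separation under the raw entropy map of Equation (8)
print(angle_per_volume, angle_raw)
```

The per-volume separation is $\log(2)/20$, which reproduces the quoted $0.0346\ldots$, against $\log 2 = 0.6931\ldots$ for raw entropy.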
Figure 18. A process beginning with the initial history $D_7$ from the evolutionary process illustrated in Figure 10. Subsequent histories
in the present process are much different; each is constructed by adding a new element related to
all previously-existing elements. New elements are illustrated by large black nodes. This process
is visually suggestive of gravitational collapse, leading to a “black hole” represented by the chain of
new elements. This analogy is motivated by the fact that causal influence flows exclusively toward
the “black hole”. The automorphism groups $\mathrm{Aut}(T^1(D_k))$ are large symmetric groups; in fact, they are
the largest possible automorphism groups for states of cardinality $|T^1(D_k)|$ that are not antichains.
In particular, they are much larger than the corresponding groups associated with the process illustrated
in Figure 10. Hence, the present process maximizes symmetry entropy for first-degree terminal states.
[Figure 18 labels: histories $D_7$ through $D_{11}$, connected by co-relative histories $h_7$ through $h_{10}$.]
Since gravitational collapse is an important feature of general relativity, one should expect
such processes to be favored for certain histories that are large in ordinary terms but small on
cosmological scales. Similarly, one should expect “expanding universe” scenarios such as those
discussed in Section 4.1 to be favored in an appropriate cosmological sense. However, one should
not expect extreme versions of such processes to dominate all others in every situation, and such
behavior would disqualify any choice of dynamics producing it. Generalizing the present example,
it would discredit the entire idea of entropic phase maps if gravitational collapse scenarios were
found to entropically dominate all other evolutionary pathways combined. Rough computations
suggest that this is not the case. For example, beginning with a history D, one may estimate its
number of direct descendants in $S_{PS}$, along with the possible sizes of their first-degree terminal state
automorphism groups. If $D$ has cardinality $K$, then there exists one direct descendant $D'$ of $D$ in $S_{PS}$ for
which $\mathrm{Aut}(T^1(D'))$ is isomorphic to $S_K$, with cardinality $K!$, namely, the directed set $D'$ with one new
element related to all elements of $D$. The co-relative history $D \Rightarrow D'$ represents the beginning of the
global gravitational collapse scenario for $D$. Similarly, there are typically about $K$ direct descendants
of $D$ constructed by adding one new element connected to $K - 1$ elements of $D$. There may be fewer
such descendants, due to symmetries, but this is atypical due to rigidity. The first-degree terminal
state automorphism groups of these direct descendants may be as large as $S_{K-1}$, with cardinalities as
large as $(K - 1)!$, though they may be smaller due to symmetry breaking by the “excluded element”.
Next, there are typically about $\binom{K}{2}$ direct descendants of $D$ in $S_{PS}$ constructed by adding one new
element connected to $K - 2$ elements of $D$, with first-degree terminal state automorphism groups
of cardinalities as large as $(K - 2)!$. Continuing this rough enumeration leads to an overestimate of the sum of the
symmetry multiplicities for first-degree terminal states over all direct descendants of $D$ in $S_{PS}$:
$$\text{multiplicity sum} \approx \sum_{k=0}^{K} \binom{K}{k} (K - k)! = \sum_{k=0}^{K} \frac{K!}{k!}.$$
The ratio of the individual multiplicity associated with the beginning of gravitational collapse to the
overall multiplicity sum is therefore roughly
$$K! \Big/ \sum_{k=0}^{K} \frac{K!}{k!} = 1 \Big/ \frac{1}{K!} \sum_{k=0}^{K} \frac{K!}{k!} = 1 \Big/ \sum_{k=0}^{K} \frac{1}{k!} \approx \frac{1}{e} = 0.3678\ldots$$
Though this ratio is actually somewhat larger due to symmetry considerations, as well as the tiny
effect of truncating the rapidly convergent series for e, this computation suggests that the gravitational
collapse scenario does not always entropically dominate all other evolutionary pathways in the case of
symmetry entropy.
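The rough enumeration can be evaluated exactly for small K to see how quickly the ratio approaches $1/e$. The Python sketch below implements only the overestimate described in the text (it does not enumerate actual descendants or their automorphism groups); the function names are illustrative.

```python
from math import comb, e, factorial


def multiplicity_sum(K):
    """Rough overestimate: sum over k of C(K, k) * (K - k)!, which equals the sum of K!/k!."""
    return sum(comb(K, k) * factorial(K - k) for k in range(K + 1))


def collapse_ratio(K):
    """Share of the multiplicity sum carried by the single 'collapse' descendant (K!)."""
    return factorial(K) / multiplicity_sum(K)


for K in (2, 5, 10, 50):
    print(K, collapse_ratio(K))
print("1/e =", 1 / e)
```

The ratio starts above $1/e$ for small K and converges to it rapidly, in line with the remark about truncating the series for $e$.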
A much more general objection to the idea of entropic phase maps, already mentioned
in Section 4.2, is that it forces together notions that are only distantly related in conventional
situations where the path summation approach to quantum theory is known to succeed and where
the second law of thermodynamics is known to hold. In particular, the interference behavior
of Feynman’s phase map for paths in $\mathbb{R}^4$ is not closely related to conventional entropic data.
As explained in Section 1.2, Feynman’s map $\Theta(\gamma) = e^{\frac{i}{\hbar}S(\gamma)}$ is determined by the classical action
$S(\gamma) = \int_\gamma L\,dt$, where $L$ is the Lagrangian. Hamilton’s principle states that the classical path $\gamma_{CL}$
renders $S(\gamma)$ stationary, and for “sufficiently short” paths, $S(\gamma)$ is generally minimized by $\gamma_{CL}$.
In this context, the Lagrangian L is symmetric under time reversal, so Hamilton’s principle certainly
does not imply the second law. While paths favored by Hamilton’s principle typically do exhibit
increases in entropy in realistic scenarios, this behavior may be attributed to auxiliary details such as
where these paths originate in state space. However, time reversal of a classical system, which generally
involves a systematic decrease in entropy, obeys the equations of motion determined by L just as well
as does the original system. Hence, an analogy between “high entropy” and “stationary action” is not
necessarily motivated by established physics in any compelling way. From this viewpoint, it is not at
all obvious that discrete causal analogues of Feynman’s phase map should depend directly on entropy.
The answer to this objection, already summarized in Section 4.2, is that discrete causal entropy
is neither expected, nor required, to play an “action-like” role in every sense. Nor must it resemble
conventional thermodynamic entropy in the sense of approximation, under which macrostates are
defined via imprecise, rather than merely incomplete, data. Indeed, the only version of entropy
introduced in Section 3 that fits this description is resolution entropy. The remaining versions all
differ from conventional thermodynamic entropy in at least two important respects: first, they do
not involve actual approximation; second, they depend nontrivially on information above first order
at the level of individual histories. More generally, discrete causal entropy must be “action-like”
only in that it produces desirable interference effects, and it must be “entropic” only in that it arises
via comparison of levels of detail under the basic framework of entropy systems. Regardless of
such conventional analogies, combinatorial data encoded in terminal states is likely, on basic
structural grounds, to determine discrete causal dynamics in the background independent setting.
The entropic notions introduced in Section 3.4 enjoy the additional benefits of possessing clear
physical meaning and suggesting effects that are known to be among the most universal in physics.
Hence, these notions stand out from among a relatively limited assortment of reasonable alternatives
for determining specific data for path summation.
Nevertheless, it is illuminating to briefly examine an alternative approach to path summation
in the discrete causal context, expressed via discrete causal action principles related more directly
to conventional motion-related or metric-related ideas. This involves defining discrete causal
“Lagrangians” and “actions” that mimic their conventional counterparts as closely as possible, in the
sense that they are defined in terms of specific “alterations” of individual histories. This is a much
narrower prescription than that of the relation function θ in Equation (4), which is “Lagrangian-like”
in an abstract sense regardless of its actual information content. An immediate difficulty with
this strategy is that notions such as energy, metric structure, and curvature, which are central to
conventional definitions of L and S, are themselves emergent in discrete causal theory. The same
is true of related quantities such as mass and momentum, which are often used to determine
these notions. In partially-background-dependent versions of discrete causal theory, such as quantum
causal set theory, “nongravitational matter” is ascribed to auxiliary fields and particles existing on
directed sets, and it is not too difficult to define reasonable analogues of L and S in this setting.
However, the situation is subtler in the perfectly-background-independent context under the strong
version of the causal metric hypothesis. As explained in Section 3.3, a popular problem in the
study of discrete gravity is how to abstract and generalize the Einstein–Hilbert action SEH [45–47].
However, the metric g and the scalar curvature R used to define SEH are unlikely to possess meaningful
direct analogues at the fundamental scale, where even primitive notions such as dimension and
topological structure are relatively obscure. Success in abstracting such quantities would accomplish
only part of the desired objective in any case, since a genuinely fundamental theory of spacetime
should explain the origins of more basic geometric and pre-geometric properties.
For these reasons, it seems preferable to work at a more conceptual level in defining discrete
causal analogues of L and S. The conceptual content of Hamilton’s principle is that nature is
basically conservative; it favors as little overall alteration as possible in evolving from one state
to another. Setting aside conventional ideas involving the conversion of one type of energy into another,
or the overall motion represented by a path between two points in a manifold, one may formulate discrete
causal action principles embodying this basic concept, hypothesizing that the resulting dynamics will
faithfully preserve the desired physical meaning as one works up from the fundamental scale. In this
context, the most natural discrete causal analogues of L and S are functionals that describe the extent
to which a given history or terminal state is altered in a process leading to another history or terminal
state. One way of describing such alteration is in terms of the elementary operations introduced in
Definition 16, which define the absolute distance between pairs of directed or multidirected sets. There
are at least two possible choices for how to quantify such an action: one may either count the number of
elementary operations necessary to convert one state $\Delta$ to another state $\Delta'$, ignoring ambient histories,
or one may count the number of operations involved in converting a history with terminal state $\Delta$ to a
history with terminal state $\Delta'$. The difference between these two notions of action is analogous to the
difference between absolute distance in Definition 16 and scheme-dependent distances in Definition 17.
Definition 27. Let $h : D_i \Rightarrow D_t$ be a co-relative history in a kinematic scheme $S$. Let $\Delta_{\tau_i}$ and $\Delta_{\tau_t}$ be terminal
states of $D_i$ and $D_t$ with respect to transitions $\tau_i$ and $\tau_t$, respectively.
1. The state-level Lagrangian quantity $L_{\tau_i\tau_t}(h)$ of $h$ with respect to the pair $(\tau_i, \tau_t)$ is the number of
elementary operations necessary to convert $\Delta_{\tau_i}$ to $\Delta_{\tau_t}$.
2. The history-level Lagrangian $L$ is the functional assigning to each co-relative history $h$ the number of
elementary operations involved in converting $D_i$ to $D_t$, i.e., the number of elements and relations added to
$D_i$ by $h$.
Both $L_{\tau_i\tau_t}(h)$ and $L$ may take on either finite or infinite values in this general setting, though it is
often useful and appropriate to impose finiteness conditions. $L_{\tau_i\tau_t}(h)$ is called a “Lagrangian quantity”
rather than a “Lagrangian” because it depends on choices of transitions $\tau_i$ and $\tau_t$. One may specialize
this definition to define standard Lagrangian functionals. For example, one might define the first-degree
state-level Lagrangian $L^1$ to be the functional assigning the state-level Lagrangian quantity $L_{\tau^1_{D_i}\tau^1_{D_t}}(h)$ to
each co-relative history $h : D_i \Rightarrow D_t$. The history-level quantity $L$ seems much more natural than the
state-level quantity $L_{\tau_i\tau_t}(h)$ in a structural sense. An unattractive aspect of $L_{\tau_i\tau_t}(h)$ is that a sequence
of elementary operations converting $\Delta_{\tau_i}$ to $\Delta_{\tau_t}$ typically identifies structural components of these
two sets that arise from different parts of their corresponding histories. For example, the first-degree
terminal state $\Delta_7$ of the history $D_7$ appearing in the evolutionary process illustrated in Figure 10 may
be converted into the first-degree terminal state $\Delta_8$ by a sequence of three elementary operations,
but only at the expense of identifying “early” structure in $D_7$ with “later” structure in $D_8$.
A good motivation to study state-level quantities such as $L_{\tau_i\tau_t}(h)$ despite this awkwardness is
that they are related to conventional evolutionary ideas in certain important ways. For example,
one may imagine a history in which “nothing changes”, in the sense that each terminal state of a
given degree “exactly replicates itself”. The simplest example is given by sequential growth of a chain;
at each stage of evolution, the first-degree terminal state of this chain consists of a single relation
connecting its penultimate element to its terminal element. Such a “frozen” or “static” history exhibits
a value of zero at every stage of evolution for an appropriate uniform choice of state-level Lagrangian
quantities $L_{\tau_i\tau_t}(h)$, such as those induced by the first-degree state-level Lagrangian $L^1$. This agrees
with the naïve idea of dynamical stasis for this history. By contrast, the value $L(h)$ of the history-level
Lagrangian $L$ at every stage $h$ of the evolution of such a history is a nonzero constant, and a similar
average value for $L(h)$ occurs in “non-static” histories adding roughly the same number of elements
and relations at each evolutionary stage. Such histories may exhibit extreme structural differences
among generations, which may be essentially invisible to L. More generally, state-level quantities
may often detect interesting changes that are invisible to history-level quantities. A closely-related
issue is the problem of how to obtain suitable analogues of conventional evolutionary continuity.
As explained in Section 3.3, the conventional entropic preference for thermal equilibrium is balanced
by the continuity of evolution curves in state space and the fact that such curves may not originate near
the cell representing thermal equilibrium. The same topic was revisited in Section 4.2 in the context of
entropy per unit volume and spacetime expansion. Dynamics that explicitly resists drastic changes in
state-level quantities seems a priori more likely to avoid serious pathologies along these lines than
dynamics defined in terms of history-level quantities.
Each discrete causal Lagrangian induces a corresponding discrete causal action by summing
Lagrangian quantities over sequences of co-relative histories.
Definition 28. Let $S$ be a kinematic scheme, and let $\gamma = r(h_0) \prec \dots \prec r(h_N)$ be a chain in $R(M(S))$
representing a co-relative kinematics in $S$, where each relation $r(h_k)$ represents a co-relative history $h_k : D_{i_k} \Rightarrow D_{t_k}$.
Let $\Delta_{\tau_{i_k}}$ and $\Delta_{\tau_{t_k}}$ be terminal states of $D_{i_k}$ and $D_{t_k}$ with respect to transitions $\tau_{i_k}$ and $\tau_{t_k}$.
1. The state-level action quantity $S_{\{\tau_{i_k}\},\{\tau_{t_k}\}}(\gamma)$ along $\gamma$ with respect to the pair of sequences of transitions
$\{\tau_{i_k}\} = \{\tau_{i_0}, \dots, \tau_{i_N}\}$ and $\{\tau_{t_k}\} = \{\tau_{t_0}, \dots, \tau_{t_N}\}$ is the sum
$$S_{\{\tau_{i_k}\},\{\tau_{t_k}\}}(\gamma) = \sum_{k=0}^{N} L_{\tau_{i_k}\tau_{t_k}}(h_k).$$
2. The history-level action $S$ is the functional assigning to each chain $\gamma$ the number of elementary operations
involved in converting $D_{i_0}$ to $D_{t_N}$, i.e., the number of elements and relations added to $D_{i_0}$ by the sequence
of co-relative histories $h_0, \dots, h_N$.
As in the case of Lagrangians, the history-level action $S$ seems to be much more natural in a
basic structural sense than the state-level action quantity $S_{\{\tau_{i_k}\},\{\tau_{t_k}\}}(\gamma)$. One obvious complication
involving the latter quantity is that fewer elementary operations are typically required to convert
a state $\Delta$ directly to a state $\Delta''$ than to first convert $\Delta$ to an “interpolating state” $\Delta'$, then convert $\Delta'$
to $\Delta''$. However, the awkwardness of $S_{\{\tau_{i_k}\},\{\tau_{t_k}\}}(\gamma)$ may be ameliorated to some extent by specifying
a uniform choice of transitions $\{\tau_{i_k}\}$ and $\{\tau_{t_k}\}$, for example, first-degree transitions. The resulting
first-degree state-level action functional may be denoted by $S^1$. Again, a good motivation for considering
state-level functionals is that they are more closely related to conventional evolutionary ideas in
certain respects than are history-level functionals. In particular, the history-level functional S does not
distinguish between co-relative kinematics involving state-replicating “static histories” and co-relative
kinematics involving histories in which considerable state-level change occurs, provided that the same
total number of elements and relations are added over the course of each process.
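The history-level quantities of Definitions 27 and 28 are simple enough to compute directly for a toy case. The Python sketch below treats a directed set as a pair of sets (elements, relations) and grows a chain using nearest-neighbour relations only; this representation, the absence of transitive closure, and all names are illustrative assumptions rather than constructions from the paper.

```python
def history_level_lagrangian(initial, terminal):
    """Count elements and relations added in passing from `initial` to `terminal`.

    Each directed set is a pair (elements, relations) of Python sets; this
    representation is an illustrative assumption.
    """
    (elems_i, rels_i), (elems_t, rels_t) = initial, terminal
    return len(elems_t - elems_i) + len(rels_t - rels_i)


def grow_chain(n):
    """Directed chain 0 -> 1 -> ... -> n-1 with only nearest-neighbour relations."""
    return set(range(n)), {(j, j + 1) for j in range(n - 1)}


histories = [grow_chain(n) for n in range(2, 7)]   # successive histories (hypothetical labels)
steps = list(zip(histories, histories[1:]))        # consecutive co-relative histories h_k
lagrangians = [history_level_lagrangian(a, b) for a, b in steps]

action_as_sum = sum(lagrangians)
action_direct = history_level_lagrangian(histories[0], histories[-1])
print(lagrangians, action_as_sum, action_direct)
```

In this toy model each stage contributes the same nonzero value, and the history-level action equals the sum of the per-stage values, so it records only the total amount of added structure.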
Discrete causal Lagrangians and actions defined in terms of elementary operations on directed
sets supply dynamical alternatives to entropic phase maps under the path summation approach
to quantum theory. For example, one might define an action-induced phase map $\Theta(\gamma) = e^{iS^1(\gamma)}$
using the first-degree state-level action functional $S^1$ introduced above. This raises the obvious
question of how these two general types of dynamics compare. For example, one may consider
the gravitational collapse scenario illustrated in Figure 18. The value of the first-degree state-level
Lagrangian $L^1$ at the $k$th stage of evolution is 2, because the $k$th first-degree terminal state $\Delta_k$
differs from the $(k + 1)$st first-degree terminal state $\Delta_{k+1}$ by a single element and a single relation,
up to isomorphism. However, the elements and relations that are identified under such a comparison
are completely different from the perspective of the entire terminal history $D_{k+1}$. The value of the
history-level Lagrangian $L$ at the $k$th stage of evolution is $(k + 1)$, because one new element and
$k$ new relations are added to the initial history $D_k$. The state automorphism group $\mathrm{Aut}(\Delta_k)$ of $\Delta_k$,
meanwhile, is typically isomorphic to $S_{k-1}$, of cardinality $(k - 1)!$, and the state automorphism
group $\mathrm{Aut}(\Delta_{k+1})$ of $\Delta_{k+1}$ is typically isomorphic to $S_k$, of cardinality $k!$. The ratio of the symmetry
multiplicities $\mu_{\mathrm{SYM}}(\Delta_{k+1})/\mu_{\mathrm{SYM}}(\Delta_k)$ is therefore typically $k$, and the corresponding increase in
symmetry entropy is typically $\log k$.
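The quantities quoted for the collapse scenario can be tabulated side by side. This Python sketch simply encodes the typical values stated above, under the simplifying assumption that the relevant automorphism groups are exactly the symmetric groups $S_{k-1}$ and $S_k$; the function name is hypothetical.

```python
import math


def collapse_stage(k):
    """Typical quantities at the k-th stage of the collapse scenario (idealized).

    D_k has k elements and the new element is related to all of them.
    """
    history_level = 1 + k                  # one new element plus k new relations
    state_level = 2                        # terminal state gains one element and one relation
    aut_before = math.factorial(k - 1)     # |S_{k-1}|
    aut_after = math.factorial(k)          # |S_k|
    entropy_gain = math.log(aut_after) - math.log(aut_before)
    return history_level, state_level, aut_after // aut_before, entropy_gain


for k in (7, 8, 9, 10):
    print(k, collapse_stage(k))
```

The history-level Lagrangian grows linearly with k and the symmetry entropy gain grows as $\log k$, while the first-degree state-level Lagrangian stays fixed at 2.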
Interesting structural relationships exist between the Lagrangians and actions introduced in this
section and the entropic notions developed in Section 3. Here, I can only offer vague sketches of a few
of these relationships. For example, the construction of superset microstates may be expressed via
“elementary operations” at the level of kinematic schemes. In particular, the first superset multiplicity
$\mu^1_{\mathrm{SUP}}(\Delta)$ in Definition 18 is the number $|R^+(x(\Delta^*))|$ of relations in $M(S_{PS})$ beginning at the element
$x(\Delta^*)$ representing the causal dual $\Delta^*$ of a state $\Delta$. If this multiplicity is $N$, then one may imagine
a “growth process” for $S_{PS}$ that adds the $N$ co-relative histories represented by the elements of
$R^+(x(\Delta^*))$ at some stage of growth. This corresponds to a “history-level action” of roughly $2N$ for
the corresponding stage of growth of $M(S_{PS})$, ignoring multidirected structure, so in this case large
entropy corresponds to large action. However, since supersets encode “growth into the past”, one might
argue for associating a minus sign with this “action”, reversing this relationship. Relative notions
of symmetry entropy such as those introduced in Definition 24 also involve supersets, and may
therefore be related to such higher-level “action”. However, the most basic question in comparing
a “non-entropic” discrete causal action principle to a choice of discrete causal entropy is whether or
not such a principle, together with the structure of an appropriate discrete causal state space, at least
favors increasing entropy, regardless of whether or not it favors the maximal possible increase at each
evolutionary stage. In this context, an action principle applied to a state space may lead indirectly
to a version of the second law of thermodynamics, even if it is not derived from, or equivalent to,
such a law. This is certainly the case for conventional thermodynamics based on Newtonian physics
applied to ordinary state spaces. Corresponding relationships between discrete causal action principles
and discrete causal entropy remain mostly unexplored.
$$\psi^-_{R;\theta}(r) = \theta(r) \sum_{r^- \prec r} \psi^-_{R;\theta}(r^-),$$
reproduced here from Equation (4). In physical terms, a suitable phase map must produce interference
effects that reinforce “reasonable” evolutionary processes, while damping out pathological processes.
In the case of entropic phase maps, this means that the entropic quantities defining these maps
should satisfy a property analogous to Hamilton’s principle of stationary action. In other respects,
these quantities need not resemble the classical action that determines Feynman’s phase map.
In particular, they need not be directly associated with familiar motion-related concepts such as
potential and kinetic energy, which define classical Lagrangians and actions in Newtonian mechanics,
or with metric structure, which determines the Einstein–Hilbert action in general relativity.
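The recursion reproduced from Equation (4) can be illustrated on a very small example. The Python sketch below evaluates it on a four-node diamond; the dictionary-based encoding, the boundary amplitudes supplied for nodes without predecessors, and the specific phases are all illustrative assumptions, since the recursion by itself does not fix them.

```python
import cmath


def path_sum(predecessors, theta, initial):
    """Evaluate psi(r) = theta(r) * sum of psi(r_minus) over direct predecessors r_minus.

    predecessors: dict mapping each node to a list of its direct predecessors.
    theta:        dict mapping each node to a complex phase.
    initial:      dict giving psi for nodes without predecessors (assumed boundary data).
    """
    psi = {}

    def amplitude(r):
        if r not in psi:
            if predecessors[r]:
                psi[r] = theta[r] * sum(amplitude(p) for p in predecessors[r])
            else:
                psi[r] = initial[r]
        return psi[r]

    for r in predecessors:
        amplitude(r)
    return psi


# Diamond-shaped toy example: a precedes b and c, which both precede d.
predecessors = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
initial = {"a": 1.0}

aligned = {"a": 1, "b": cmath.exp(0.3j), "c": cmath.exp(-0.3j), "d": 1}
opposed = {"a": 1, "b": 1, "c": -1, "d": 1}

psi_aligned = path_sum(predecessors, aligned, initial)
psi_opposed = path_sum(predecessors, opposed, initial)
print(abs(psi_aligned["d"]), abs(psi_opposed["d"]))
```

Nearby phases on the two branches reinforce one another at the final node, while opposite phases cancel, which is the interference behavior a suitable phase map is expected to exploit.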
Entropy systems, introduced in Section 3.1, offer a general approach to entropy and the second law
of thermodynamics. Conventional versions of the second law involve notions of entropy associated
with “present states”, not with entire histories. In the discrete causal context, this suggests defining
entropies for terminal states of histories, which encode “recent” causes and effects. Such states are
defined in Section 3.3 in terms of transitions between pairs of directed sets. Aside from their evident
physical importance, such states are mathematically interesting due to their symmetry properties,
which exhibit a balance between the typical rigidity of general acyclic directed sets demonstrated
by Bender and Robinson [37], and the transitivity of antichains under their automorphism groups.
There are a variety of ways to define entropies for such states, all of which involve comparing
distinguishability properties of states at different levels of detail. Since multiple such levels merit
simultaneous consideration in discrete causal theory, a sufficiently general approach to discrete
causal entropy requires the use of entropy systems, which organize such levels in a systematic way.
Given two levels of detail, descriptions of a system at the coarser level are called macrostates,
while descriptions at the finer level are called microstates. The corresponding notion of entropy
measures the quantity of microstates corresponding to each macrostate in a manner that is additive
for composite systems. An important distinction between conventional thermodynamics and discrete
causal theory is that precise information up to first order typically suffices to determine future evolution
in the former setting, while higher-order information at the level of individual histories is a priori
relevant in the latter setting. In both cases, however, empirical evidence suggests that details of the
distant past should exert negligible influence on future events.
Four general methods of defining discrete causal macrostates and microstates, along with their
associated notions of entropy, and the resulting entropic phase maps, are examined in this paper.
Spaces of states are studied in Section 3.3, entropies in Section 3.4, and phase maps in Section 4.1.
The first method uses the theory of causal atomic resolution, whereby causal structure at the
fundamental scale is approximated by families of coarser causal structures constructed from special
subsets of directed sets, called causal atoms. This leads to the notion of resolution entropy.
This approach is very similar to coarse-graining of state space in conventional thermodynamics; in
particular, it involves actual approximation. The second method supplements the information encoded in
terminal states by describing how they may embed into larger states called supersets. This leads to the
notion of superset entropy. The level of detail in the original states is regarded as “coarse” because it is
incomplete, not because it is approximate. Supersets offer finer detail in the sense that they encode
more complete information. The third method measures distinguishability properties intrinsic to states
by counting the number of distinct ways in which they may be labeled. This leads to the notion of
labeled entropy. Labeled entropy is maximal for states lacking nontrivial symmetries, which meshes
with the intuition that high-entropy states should be “disordered”. The fourth method follows
essentially the opposite approach, by counting symmetries. This leads to the notion of symmetry
entropy. Like superset entropy, both labeled entropy and symmetry entropy involve organizing precise
but incomplete information, rather than actual approximation.
Computation of entropic phase maps in physically realistic situations is analytically involved,
and most of the results in this paper involve toy examples or qualitative results. Many of these appear
in Sections 4.1, 4.2 and 4.3. Discrete causal versions of the second law of thermodynamics favor
expanding universe scenarios, but this conclusion is obvious on basic enumerative grounds, and does
not favor discrete causal theory over other theories in any specific way. There is some evidence that
raw measures of entropy may be too sensitive to minor changes in structure to produce desirable
interference effects. The notion of entropy per unit volume seems more stable in this regard, and is also
attractive in other respects. Since the theory of entropic phase maps is almost completely unexplored,
many versions of the approach can likely be eliminated without serious effort. Symmetry entropy is
doubtful on conventional grounds, and also seems to be vulnerable to pathological instabilities such as
universal gravitational collapse scenarios. However, the idea is not obviously unworkable, and the
desire to model symmetric structures in nature, such as “elementary” particles, renders such notions
worth entertaining. Discrete causal action principles involving elementary operations on directed sets
offer an alternative to entropic phase maps in the path summation context. Relationships exist between
these two approaches, but the details of these connections are unclear at present.
Problems that must be solved to further develop the theory of entropic phase maps
include the enumeration of certain classes of acyclic directed sets, and the computation of their
automorphism groups. These problems may be approached from a mathematical perspective via
the theory of random graphs, and interesting and important results of this nature may be found in
the graph-theoretic literature. However, most of these results are developed from a perspective very
different than the study of fundamental spacetime structure, and the perception of what problems
are interesting is different in this setting as well. Hence, it is not easy to mine the existing body of
graph theory for such results, and many physically relevant topics remain underdeveloped. This is
likely due both to the difficulty of the problems and to differences in emphasis. Particularly useful in this
context would be a thorough analysis of families of directed graphs corresponding to nth-order states.
Entropy 2017, 19, 322
For example, how would one compute the average number of superset microstates adding $10^3$ elements
to a first-order state of cardinality $10^4$? What is the average size of the automorphism group of a
first-order state with $10^9$ elements and $10^{12}$ relations? For a fixed degree $n$, how does the average size
of $\mathrm{Aut}(T^n(D))$ scale with the cardinality of $D$? For a fixed ratio of order to cardinality for states $\Delta$,
how does the average size of $\mathrm{Aut}(\Delta)$ scale with the cardinality of $\Delta$? Going beyond average quantities,
how are the numbers of superset microstates, or the sizes of state automorphism groups, distributed for
certain classes of states? Are they randomly scattered, or do they tend to cluster around certain values?
Many questions of this nature must be answered before the physical implications of entropic phase
maps can be understood in any detail. Computational resources may also be used to compile numerical
evidence about the behavior of various entropic phase maps for relatively small histories. For example,
it would be very interesting to compute some of the entropic quantities examined in this paper for the
first few generations of the positive sequential kinematic scheme SPS .
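As a hedged illustration of the kind of small-scale numerical evidence suggested above, the following sketch enumerates all labeled acyclic digraphs on a few elements by brute force and averages the sizes of their automorphism groups. Modeling finite acyclic directed sets as labeled acyclic digraphs is an assumption made here for illustration, and the function names are hypothetical; the sketch does not implement entropic phase maps or the scheme discussed in the text.

```python
from itertools import permutations, product


def is_acyclic(n, edges):
    """Strip elements with no incoming relation until none remain (Kahn-style)."""
    remaining = set(range(n))
    while remaining:
        sources = {v for v in remaining
                   if not any((u, v) in edges for u in remaining)}
        if not sources:
            return False  # every remaining element has a predecessor: a cycle exists
        remaining -= sources
    return True


def labeled_acyclic_digraphs(n):
    """Yield the edge set of every labeled acyclic digraph on n elements."""
    pairs = [(u, v) for u in range(n) for v in range(n) if u != v]
    for mask in product((0, 1), repeat=len(pairs)):
        edges = frozenset(p for p, bit in zip(pairs, mask) if bit)
        if is_acyclic(n, edges):
            yield edges


def automorphism_count(n, edges):
    """Count relabelings of the elements that map the edge set onto itself."""
    return sum(
        1 for perm in permutations(range(n))
        if frozenset((perm[u], perm[v]) for u, v in edges) == edges
    )


def automorphism_statistics(n):
    """Return (number of labeled acyclic digraphs, mean automorphism group size)."""
    sizes = [automorphism_count(n, e) for e in labeled_acyclic_digraphs(n)]
    return len(sizes), sum(sizes) / len(sizes)


for n in range(1, 5):
    count, mean_aut = automorphism_statistics(n)
    print(n, count, round(mean_aut, 3))
```

Exhaustive enumeration of this kind grows far too quickly to reach the cardinalities mentioned above, which is why asymptotic results from random graph theory would be needed there.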
Acknowledgments: The author thanks Brendan McKay, Johnny Feng, Jessica Garriga, Kiran Bist, and Stephanie
Dribus for useful discussions.
Conflicts of Interest: The author declares no conflict of interest.
References
1. Feynman, R. Space-Time Approach to Non-Relativistic Quantum Mechanics. Rev. Mod. Phys. 1948, 20, 367.
2. Bombelli, L.; Lee, J.; Meyer, D.; Sorkin, R. Space-Time as a Causal Set. Phys. Rev. Lett. 1987, 59, 521.
3. Finkelstein, D. Space-Time Code. Phys. Rev. 1969, 184, 1261.
4. Finkelstein, D. “Superconducting” Causal Nets. Int. J. Theor. Phys. 1988, 27, 473–519.
5. Knuth, K.H.; Bahreyni, N. A potential foundation for emergent space-time. J. Math. Phys. 2014, 55, 112501.
6. Ambjorn, J.; Dasgupta, A.; Jurkiewicz, J.; Loll, R. A Lorentzian cure for Euclidean troubles. Nucl. Phys. B
Proc. Suppl. 2002, 106, 977–979.
7. Markopoulou, F. Quantum Causal Histories. Class. Quantum Gravity 2000, 17, 2059.
8. Rovelli, C. Quantum Gravity. In Cambridge Monographs on Mathematical Physics; Cambridge University Press:
Cambridge, UK, 2004.
9. Thiemann, T. Modern Canonical Quantum General Relativity. In Cambridge Monographs on Mathematical Physics;
Cambridge University Press: Cambridge, UK, 2007.
10. D’Ariano, G.M.; Perinotti, P. Derivation of the Dirac Equation from Principles of Information Processing.
Phys. Rev. A 2014, 90, 062106.
11. Finster, F. Causal Fermion Systems: An Overview. In Quantum Mathematical Physics; Springer: Berlin,
Germany, 2016.
12. Finster, F. The Continuum Limit of Causal Fermion Systems: From Planck Scale Structures to Macroscopic
Physics. In Fundamental Theories of Physics; Springer: Berlin, Germany, 2016.
13. Chen, H.; Sasakura, N.; Sato, Y. Emergent Classical Geometries on Boundaries of Randomly Connected
Tensor Networks. arXiv 2016, arXiv:1601.04232.
14. Dribus, B.F. Discrete Causal Theory: Emergent Spacetime and the Causal Metric Hypothesis; Springer: Berlin,
Germany, 2017.
15. Dribus, B.F. On the Foundational Assumptions of Modern Physics. In Questioning the Foundations, the Frontiers
Collection; Springer: Berlin, Germany, 2015; pp. 45–60.
16. Dribus, B.F. On the Axioms of Causal Set Theory. arXiv 2013, arXiv:1311.2148.
17. D’Ariano, G.M.; Chiribella, G.; Perinotti, P. Quantum Theory From First Principles; Cambridge University Press:
Cambridge, UK, 2017.
18. Knuth, K.H. Information-based Physics: An observer-centric foundation. Contemp. Phys. 2014, 55, 12–32.
19. Verlinde, E. On the origin of gravity and the laws of Newton. J. High Energy Phys. 2011, 4, 29.
20. Kleitman, D.J.; Rothschild, B.L. Asymptotic Enumeration of Partial Orders on a Finite Set. Trans. Am.
Math. Soc. 1975, 205, 205–220.
21. Moore, C. Comment on “Space-Time as a Causal Set”. Phys. Rev. Lett. 1988, 60, 655.
22. Bombelli, L.; Lee, J.; Meyer, D.; Sorkin, R. Bombelli et al. Reply to Comment on “Space-Time as a Causal
Set”. Phys. Rev. Lett. 1988, 60, 656.
392
Entropy 2017, 19, 322
23. Hawking, S.W.; King, A.R.; McCarthy, P.J. A new topology for curved space-time which incorporates
the causal, differential, and conformal structures. J. Math. Phys. 1976, 17, 174–181.
24. Malament, D.B. The class of continuous timelike curves determines the topology of spacetime. J. Math. Phys.
1977, 18, 1399–1404.
25. Martin, K.; Panangaden, P. A Domain of Spacetime Intervals in General Relativity. Commun. Math. Phys.
2006, 267, 563–586.
26. Bombelli, L.; Meyer, D. Origin of Lorentzian geometry. Phys. Lett. A 1989, 141, 226–228.
27. Parrikar, O.; Surya, S. Causal topology in future and past distinguishing spacetimes. Class. Quantum Gravity
2011, 28, 155020.
28. Myrheim, J. Statistical Geometry. Available online: https://cds.cern.ch/record/293594/files/197808143.pdf
(accessed on 30 June 2017).
29. Hooft, G. Quantum Gravity: A Fundamental Problem and some Radical Ideas. In Recent Developments in
Gravitation; Springer: New York, NY, USA, 1978; pp. 323–345.
30. Ahmed, M.; Dodelson, S.; Greene, P.B.; Sorkin, R. Everpresent Λ. Phys. Rev. D 2004, 69, 103523.
31. Bombelli, L.; Henson, J.; Sorkin, R. Discreteness without symmetry breaking: A theorem. Mod. Phys. Lett. A
2009, 24, 2579–2587.
32. Harary, F.; Norman, R.Z. Some Properties of Line Digraphs. Rediconti del Circolo Matematico di Palermo
1960, 9, 161–168.
33. Major, S.A.; Rideout, D.; Surya, S. Spatial Hypersurfaces in Causal Set Cosmology. Class. Quantum Gravity
2006, 23, 4743–4751.
34. Surya, S. Directions in Causal Set Quantum Gravity. In Recent Research in Quantum Gravity; Dasgupta, A., Ed.;
Nova Science Publishing Incorporated: Hauppauge, NY, USA, 2012.
35. Sorkin, R. Expressing entropy globally in terms of (4D) field-correlations. J. Phys. Conf. Ser. 2014, 484, 012004.
36. Sorkin, R.; Yazdi, Y. Entanglement Entropy in Causal Set Theory. arXiv 2016, arXiv:1611.10281v1.
37. Bender, E.A.; Robinson, R.W. The Asymptotic Number of Acyclic Digraphs II. J. Comb. Theory Ser. B
1988, 44, 363–369.
38. Rideout, D.; Sorkin, R. Classical sequential growth dynamics for causal sets. Phys. Rev. D 2000, 61, 024002.
39. Sorkin, R. Toward a Fundamental Theorem of Quantal Measure Theory. Math. Struct. Comput. Sci.
2012, 22, 816–852.
40. Isham, C. Quantum Logic and the Histories Approach to Quantum Theory. J. Math. Phys. 1994, 35, 2157.
41. Isham, C. Topos Theory and Consistent Histories: The Internal Logic of the Set of all Consistent Sets. Int. J.
Theor. Phys. 1997, 36, 785.
42. Isham, C. Quantising on a Category. Found. Phys. 2005, 35, 271–297.
43. Isham, C. Topos Methods in the Foundations of Physics. In Deep Beauty: Understanding the Quantum World
through Mathematical Innovation; Halvorson, H., Ed.; Cambridge University Press: Cambridge, UK, 2011.
44. Penrose, R. Cycles of Time; Vintage Books: New York, NY, USA, 2010.
45. Benincasa, D.M.T.; Dowker, F. Scalar Curvature of a Causal Set. Phys. Rev. Lett. 2010, 104, 181301.
46. Glaser, L. A closed form expression for the causal set D’Alembertian. Class. Quantum Gravity 2014, 31, 5007.
47. Aslanbeigi, S.; Saravani, M.; Sorkin, R. Generalized Causal Set d’Alembertians. arXiv 2014, arXiv:1403.1622.
© 2017 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
entropy
Article
Nonclassicality by Local Gaussian Unitary
Operations for Gaussian States
Yangyang Wang 1,† , Xiaofei Qi 1,2, *,† and Jinchuan Hou 1,3,†
1 Department of Mathematics, Shanxi University, Taiyuan 030006, China; [email protected] (Y.W.);
[email protected] (J.H.)
2 Institute of Big Data Science and Industry, Shanxi University, Taiyuan 030006, China
3 Department of Mathematics, Taiyuan University of Technology, Taiyuan 030024, China
* Correspondence: [email protected]; Tel.:+86-351-7010555
† These authors contributed equally to this work.
Received: 19 January 2018; Accepted: 6 April 2018; Published: 11 April 2018
Abstract: A measure of nonclassicality N in terms of local Gaussian unitary operations for bipartite
Gaussian states is introduced. N is a faithful quantum correlation measure for Gaussian states as
product states have no such correlation and every non-product Gaussian state contains it. For any
bipartite Gaussian state ρ AB , we always have 0 ≤ N (ρ AB ) < 1, where the upper bound 1 is sharp.
An explicit formula of N for (1 + 1)-mode Gaussian states and an estimate of N for (n + m)-mode
Gaussian states are presented. A criterion of entanglement is established in terms of this correlation.
The quantum correlation N is also compared with entanglement, Gaussian discord and Gaussian
geometric discord.
1. Introduction
The presence of correlations in bipartite quantum systems is one of the main features of quantum
mechanics. The most important one among such correlations is entanglement [1]. However, recently
much attention has been devoted to the study and the characterization of quantum correlations
that go beyond the paradigm of entanglement, being necessary but not sufficient for its presence.
Non-entangled quantum correlations also play important roles in various quantum communications
and quantum computing tasks [2–5].
For the last two decades, various methods have been proposed to quantify quantum correlations,
such as quantum discord (QD) [6,7], geometric quantum discord [8,9], measurement-induced
nonlocality (MIN) [10] and measurement-induced disturbance (MID) [11] for discrete-variable systems.
It is also important to develop new simple criteria for witnessing correlations beyond entanglement for
continuous-variable systems. In this direction, Giorda, Paris [12] and Adesso, Datta [13] independently
introduced the definition of Gaussian QD for Gaussian states and discussed its properties. Adesso
and Girolami in [14] proposed the concept of Gaussian geometric discord (GD) for Gaussian states.
Measurement-induced disturbance of Gaussian states was studied in [15], while MIN for Gaussian
states was discussed in [16]. For other related results, see [17,18] and the references therein. Note
that not every quantum correlation defined for discrete-variable systems has a Gaussian analogue for
continuous-variable systems [16]. On the other hand, the values of Gaussian QD and Gaussian GD are
very difficult to compute, and the known formulas cover only some (1 + 1)-mode Gaussian states.
Little information is revealed by Gaussian QD and GD. The purpose of this paper is to introduce a new
where $z = (x_1, y_1, \ldots, x_n, y_n)^{T} \in \mathbb{R}^{2n}$, with $\mathbb{R}$ the field of real numbers and $(\cdot)^{T}$ the transposition,
and $W(z) = \exp(iR^{T}z)$ is the Weyl operator. Let $R = (R_1, R_2, \ldots, R_{2n})^{T} = (\hat{Q}_1, \hat{P}_1, \ldots, \hat{Q}_n, \hat{P}_n)^{T}$.
As usual, $\hat{Q}_i$ and $\hat{P}_i$ stand respectively for the position and momentum operators for each
$i \in \{1, 2, \ldots, n\}$. They satisfy the Canonical Commutation Relation (CCR) in natural units ($\hbar = 1$)
$$[\hat{Q}_i, \hat{P}_j] = i\delta_{ij} I, \qquad [\hat{Q}_i, \hat{Q}_j] = [\hat{P}_i, \hat{P}_j] = 0, \qquad i, j = 1, 2, \ldots, n.$$
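Gaussian states: $\rho$ is called a Gaussian state if $\chi_\rho(z)$ is of the form
$$\chi_\rho(z) = \exp\left[-\tfrac{1}{4} z^{T}\Gamma z + i d^{T} z\right],$$
where
$$d = (\langle\hat{R}_1\rangle, \langle\hat{R}_2\rangle, \ldots, \langle\hat{R}_{2n}\rangle)^{T} = (\mathrm{Tr}(\rho R_1), \mathrm{Tr}(\rho R_2), \ldots, \mathrm{Tr}(\rho R_{2n}))^{T} \in \mathbb{R}^{2n}$$
is called the mean or the displacement vector of $\rho$ and $\Gamma = (\gamma_{kl}) \in M_{2n}(\mathbb{R})$ is the covariance matrix
(CM) of $\rho$ defined by $\gamma_{kl} = \mathrm{Tr}[\rho(\Delta\hat{R}_k \Delta\hat{R}_l + \Delta\hat{R}_l \Delta\hat{R}_k)]$ with $\Delta\hat{R}_k = \hat{R}_k - \langle\hat{R}_k\rangle$ ([22–24]). Here, $M_{l\times k}(\mathbb{R})$
stands for the set of all $l$-by-$k$ real matrices and, when $l = k$, we write $M_{l\times k}(\mathbb{R})$ as $M_l(\mathbb{R})$. Note
that the CM $\Gamma$ of a state is symmetric and must satisfy the uncertainty principle $\Gamma + i\Delta \geq 0$, where
$\Delta = \oplus_{i=1}^{n} \Delta_i$ with $\Delta_i = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ for each $i$. From the diagonal terms of the above inequality, one can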
Entropy 2018, 20, 266
easily derive the usual Heisenberg uncertainty relation for position and momentum, $V(\hat{Q}_i)V(\hat{P}_i) \geq 1$
with $V(\hat{R}_i) = \langle(\Delta\hat{R}_i)^2\rangle$ [25].
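Now assume that $\rho_{AB}$ is any $(n + m)$-mode Gaussian state. Then, the CM $\Gamma$ of $\rho_{AB}$ can be
written as
$$\Gamma = \begin{pmatrix} A & C \\ C^{T} & B \end{pmatrix}, \tag{1}$$
where $A \in M_{2n}(\mathbb{R})$, $B \in M_{2m}(\mathbb{R})$ and $C \in M_{2n\times 2m}(\mathbb{R})$. Particularly, if $n = m = 1$, by means of local
Gaussian unitary (symplectic at the CM level) operations, $\Gamma$ has a standard form:
$$\Gamma_0 = \begin{pmatrix} A_0 & C_0 \\ C_0^{T} & B_0 \end{pmatrix}, \tag{2}$$
where $A_0 = \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix}$, $B_0 = \begin{pmatrix} b & 0 \\ 0 & b \end{pmatrix}$, $C_0 = \begin{pmatrix} c & 0 \\ 0 & d \end{pmatrix}$, $\Gamma_0 > 0$, $\det\Gamma_0 \geq 1$ and
$\det\Gamma_0 + 1 \geq \det A_0 + \det B_0 + 2\det C_0$ ([26–29]).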
Gaussian unitary operations. Let us consider an $n$-mode continuous-variable system with
$R = (\hat{Q}_1, \hat{P}_1, \ldots, \hat{Q}_n, \hat{P}_n)^{T}$. For a unitary operator $U$, the unitary operation $\rho \to U\rho U^{\dagger}$ is said to
be Gaussian if its output is a Gaussian state whenever its input is a Gaussian state, and such $U$ is called
a Gaussian unitary operator. It is known that a unitary operator $U$ is Gaussian if and only if
$$U^{\dagger} R U = SR + m,$$
for some vector $m$ in $\mathbb{R}^{2n}$ and some $S \in \mathrm{Sp}(2n, \mathbb{R})$, the symplectic group of all $2n \times 2n$ real matrices $S$
that satisfy
$$S \in \mathrm{Sp}(2n, \mathbb{R}) \Leftrightarrow S\Delta S^{T} = \Delta.$$
Thus, every Gaussian unitary operator $U$ is determined by some affine symplectic map $(S, m)$ acting
on the phase space, and can be denoted by $U = U_{S,m}$ ([23,24]).
The following well-known facts for Gaussian states and Gaussian unitary operations are useful
for our purpose.
Lemma 1 ([23]). For any $(n + m)$-mode Gaussian state $\rho_{AB}$, write its CM $\Gamma$ as in Equation (1). Then, the CMs
of the reduced states $\rho_A = \mathrm{Tr}_B\,\rho_{AB}$ and $\rho_B = \mathrm{Tr}_A\,\rho_{AB}$ are the matrices $A$ and $B$, respectively.
Lemma 3 ([23,24]). Assume that $\rho$ is any $n$-mode Gaussian state with CM $\Gamma$ and displacement vector $d$,
and $U_{S,m}$ is a Gaussian unitary operator. Then, the characteristic function of the Gaussian state $\sigma = U\rho U^{\dagger}$ is
of the form $\exp(-\tfrac{1}{4} z^{T}\Gamma_\sigma z + i d_\sigma^{T} z)$, where $\Gamma_\sigma = S\Gamma S^{T}$ and $d_\sigma = m + Sd$.
$$\mathcal{N}(\rho_{AB}) = \frac{1}{2} \sup_{U} \left\|\rho_{AB} - (I \otimes U)\rho_{AB}(I \otimes U^{\dagger})\right\|_2^2, \tag{3}$$
where the supremum is taken over all Gaussian unitary operators $U \in \mathcal{B}(H_B)$ satisfying $U\rho_B U^{\dagger} = \rho_B$,
and $\rho_B = \mathrm{Tr}_A(\rho_{AB})$ is the reduced state. Here, $\mathcal{B}(H_B)$ is the set of all bounded linear operators acting on $H_B$.
Observe that N (ρ AB ) = 0 holds for every product state. Thus, the product state contains no
such correlation.
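Remark 1. For any Gaussian state $\rho_{AB}$, there exist many Gaussian unitary operators $U$ such that $U\rho_B U^{\dagger} = \rho_B$. This
ensures that the definition of the quantity $\mathcal{N}(\rho_{AB})$ makes sense for each Gaussian state $\rho_{AB}$.
To see this, we need the Williamson Theorem ([31]), which states that, for any $n$-mode Gaussian state
$\rho \in \mathcal{S}(H)$ with CM $\Gamma_\rho$, there exists a $2n \times 2n$ symplectic matrix $S$ such that $S\Gamma_\rho S^{T} = \oplus_{i=1}^{n} v_i I_2$ with
$v_i \geq 1$. The diagonal matrix $\oplus_{i=1}^{n} v_i I_2$ and the $v_i$ are called respectively the Williamson form and the
symplectic eigenvalues of $\Gamma_\rho$. By the Williamson Theorem, there exists a Gaussian unitary operator
$U = U_{S,m} = U_{S,-Sd}$ such that $U\rho U^{\dagger} = \otimes_{i=1}^{n} \rho_i$, where the $\rho_i$ are thermal states. Let $S_\theta = \oplus_{i=1}^{n} S_{\theta_i}$ with
$$S_{\theta_i} = \begin{pmatrix} \cos\theta_i & \sin\theta_i \\ -\sin\theta_i & \cos\theta_i \end{pmatrix}, \qquad \theta_i \in [0, \tfrac{\pi}{2}].$$
Then, $S_\theta$ is a symplectic matrix, and the corresponding
Gaussian unitary operator $U_{S_\theta,0} = U_{S_\theta}$ has the form $U_{S_\theta} = \otimes_{i=1}^{n} U_{S_{\theta_i}} = \otimes_{i=1}^{n} \exp(\theta_i \hat{a}_i^{\dagger}\hat{a}_i)$. It is easily
checked that $S_\theta(\oplus_{i=1}^{n} v_i I)S_\theta^{T} = \oplus_{i=1}^{n} v_i I$, and so $U_{S_\theta}(\otimes_{i=1}^{n}\rho_i)U_{S_\theta}^{\dagger} = \otimes_{i=1}^{n}\rho_i$. Now, write $W = U^{\dagger}U_{S_\theta}U$.
Obviously, $W$ is Gaussian unitary and satisfies $W\rho W^{\dagger} = U^{\dagger}U_{S_\theta}U\rho U^{\dagger}U_{S_\theta}^{\dagger}U = \rho$.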
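The two matrix identities used in this remark can be checked numerically for a single mode. The sketch below is only an illustration under stated assumptions: the helper names are hypothetical, the angle and symplectic eigenvalue are arbitrary sample values, and small pure-Python matrix routines are used instead of a linear algebra library. It verifies that a rotation preserves the symplectic form and the Williamson form of a thermal state, and that it does not preserve a squeezed covariance matrix.

```python
import math


def matmul(x, y):
    """Product of two matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def transpose(x):
    return [list(row) for row in zip(*x)]


def congruence(s, m):
    """Return S M S^T, the action of a symplectic matrix S on a CM M."""
    return matmul(matmul(s, m), transpose(s))


def rotation(theta):
    """One-mode rotation matrix with the sign convention used in the remark."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]


def close(x, y, tol=1e-12):
    return all(abs(p - q) <= tol
               for rx, ry in zip(x, y) for p, q in zip(rx, ry))


DELTA = [[0.0, 1.0], [-1.0, 0.0]]   # one-mode symplectic form
theta, v = 0.7, 2.5                  # sample angle and symplectic eigenvalue
S = rotation(theta)

thermal_cm = [[v, 0.0], [0.0, v]]       # Williamson form of a thermal state
squeezed_cm = [[2.0, 0.0], [0.0, 1.0]]  # not proportional to the identity

print(close(congruence(S, DELTA), DELTA))              # rotation is symplectic
print(close(congruence(S, thermal_cm), thermal_cm))    # thermal CM is invariant
print(close(congruence(S, squeezed_cm), squeezed_cm))  # squeezed CM is not
```

The last line shows why the reduction to Williamson form matters: a rotation fixes the covariance matrix only when it is a multiple of the identity on that mode.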
We first prove that N is local Gaussian unitary invariant for all quantum states.
commutes with ρ B when W runs over all Gaussian unitary operators commuting with σB . Hence,
by Equation (3), we have
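$$\begin{aligned}
\mathcal{N}(\sigma_{AB}) &= \frac{1}{2} \sup_{W} \left\|\sigma_{AB} - (I \otimes W)\sigma_{AB}(I \otimes W^{\dagger})\right\|_2^2 \\
&= \frac{1}{2} \sup_{W} \left\|(U \otimes V)\rho_{AB}(U^{\dagger} \otimes V^{\dagger}) - (I \otimes W)(U \otimes V)\rho_{AB}(U^{\dagger} \otimes V^{\dagger})(I \otimes W^{\dagger})\right\|_2^2 \\
&= \sup_{W} \left\{\mathrm{Tr}(\rho_{AB}^2) - \mathrm{Tr}\left(\rho_{AB}(I \otimes V^{\dagger}WV)\rho_{AB}(I \otimes V^{\dagger}W^{\dagger}V)\right)\right\} \\
&= \sup_{W} \left\{\mathrm{Tr}(\rho_{AB}^2) - \mathrm{Tr}\left(\rho_{AB}(I \otimes W)\rho_{AB}(I \otimes W^{\dagger})\right)\right\} \\
&= \frac{1}{2} \sup_{W} \left\|\rho_{AB} - (I \otimes W)\rho_{AB}(I \otimes W^{\dagger})\right\|_2^2 \\
&= \mathcal{N}(\rho_{AB})
\end{aligned}$$
as desired.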
The next theorem shows that N (ρ AB ) is a faithful nonclassicality measure for Gaussian states.
Proof of Theorem 1. By Definition 1, the “if” part is apparent. Let us check the “only if” part. Since the
mean of any Gaussian state can be transformed to zero under some local Gaussian unitary operation,
it is sufficient by Proposition 1 to consider those Gaussian states whose means are zero. In the sequel,
assume that $\rho_{AB}$ is an $(n + m)$-mode Gaussian state with zero mean vector and CM $\Gamma = \begin{pmatrix} A & C \\ C^{T} & B \end{pmatrix}$
as in Equation (1), such that $\mathcal{N}(\rho_{AB}) = 0$.
By Lemma 1, the CM of $\rho_B$ is $B$. According to the Williamson Theorem, there exists a
symplectic matrix $S_0$ such that $S_0 B S_0^{T} = \oplus_{i=1}^{m} v_i I$ and $U_0\rho_B U_0^{\dagger} = \otimes_{i=1}^{m}\rho_i$, where $U_0 = U_{S_0,0}$ and
the $\rho_i$ are thermal states. Write $\sigma_{AB} = (I \otimes U_0)\rho_{AB}(I \otimes U_0^{\dagger})$. It follows from Proposition 1 that
$\mathcal{N}(\sigma_{AB}) = \mathcal{N}(\rho_{AB}) = 0$. Obviously, $\sigma_{AB}$ has a CM of the form
$$\Gamma' = \begin{pmatrix} A & C' \\ C'^{T} & \oplus_{i=1}^{m} v_i I \end{pmatrix}.$$
Note that $I - S_\theta^{T}$ is an invertible matrix if we take $\theta_i \in (0, \tfrac{\pi}{2})$ for each $i$. Then, it follows from
$C' = C' S_\theta^{T}$ that we must have $C' = 0$. Thus, $\sigma_{AB}$ is a product state by Lemma 2, and, consequently,
$\rho_{AB} = (I \otimes U_0^{\dagger})\sigma_{AB}(I \otimes U_0)$ is also a product state.
We can give an analytic formula of N (ρ AB ) for (1+1)-mode Gaussian state ρ AB . Since N is locally
Gaussian unitary invariant, it is enough to assume that the mean vector of ρ AB is zero and the CM
is standard.
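Theorem 2. For any $(1 + 1)$-mode Gaussian state $\rho_{AB}$ with CM $\Gamma$ whose standard form is $\Gamma_0 = \begin{pmatrix} A_0 & C_0 \\ C_0^{T} & B_0 \end{pmatrix}$
as in Equation (2), we have
$$\mathcal{N}(\rho_{AB}) = \frac{1}{\sqrt{(ab - c^2)(ab - d^2)}} - \frac{1}{\sqrt{\left(ab - \frac{c^2}{2}\right)\left(ab - \frac{d^2}{2}\right)}}. \tag{4}$$
Particularly, since $\det\Gamma_0 = (ab - c^2)(ab - d^2) = 1$ for a pure state, $\mathcal{N}(\rho_{AB}) = 1 - \dfrac{1}{\sqrt{\left(ab - \frac{c^2}{2}\right)\left(ab - \frac{d^2}{2}\right)}}$
whenever $\rho_{AB}$ is pure.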
Proof of Theorem 2. By Proposition 1, we may assume that the mean vector of $\rho_{AB}$ is zero. Let $U_{S,m}$
be a Gaussian unitary operator such that $U_{S,m}\rho_B U_{S,m}^{\dagger} = \rho_B$. Then, $S$ and $m$ meet the conditions
$SB_0S^{T} = B_0$ and $Sd_B + m = d_B = 0$. It follows that $m = 0$. Thus, we can denote
$U_{S,m}$ by $U_S$.
As $S\Delta S^{T} = \Delta$, there exists some $\theta \in [0, \tfrac{\pi}{2}]$ such that $S = S_\theta = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}$. Thus, the CM of the
Gaussian state $(I \otimes U_S)\rho_{AB}(I \otimes U_S^{\dagger})$ is
$$\Gamma_\theta = \begin{pmatrix}
a & 0 & c\cos\theta & -c\sin\theta \\
0 & a & d\sin\theta & d\cos\theta \\
c\cos\theta & d\sin\theta & b & 0 \\
-c\sin\theta & d\cos\theta & 0 & b
\end{pmatrix}.$$
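$$\begin{aligned}
\mathcal{N}(\rho_{AB}) &= \frac{1}{2} \sup_{U_{S,m}} \left\|\rho_{AB} - (I \otimes U_{S,m})\rho_{AB}(I \otimes U_{S,m}^{\dagger})\right\|_2^2 \\
&= \sup_{U_{S,m}} \left\{\mathrm{Tr}(\rho_{AB}^2) - \mathrm{Tr}\left(\rho_{AB}(I \otimes U_{S,m})\rho_{AB}(I \otimes U_{S,m}^{\dagger})\right)\right\} \\
&= \sup_{\theta \in [0, \frac{\pi}{2}]} \left\{\frac{1}{\sqrt{\det\Gamma}} - \frac{1}{\sqrt{\det[(\Gamma + \Gamma_\theta)/2]}}\right\} \\
&= \max_{\theta \in [0, \frac{\pi}{2}]} \left\{\frac{1}{\sqrt{a^2b^2 - ab(c^2 + d^2) + c^2d^2}} - \frac{1}{\sqrt{[ab - c^2(1 + \cos\theta)/2][ab - d^2(1 + \cos\theta)/2]}}\right\} \\
&= \frac{1}{\sqrt{(ab - c^2)(ab - d^2)}} - \frac{1}{\sqrt{(ab - c^2/2)(ab - d^2/2)}}.
\end{aligned}$$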
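The maximisation over θ in this chain can be reproduced numerically. The following is a minimal sketch under stated assumptions: the function names and the sample entries a, b, c, d are hypothetical (chosen to satisfy the standard-form conditions quoted after Equation (2)), and the determinant is computed by a small hand-written elimination routine. It builds the 4 × 4 matrix Γ_θ displayed in the proof, evaluates the bracketed expression on a grid of angles, and compares the largest value with the closed form of Equation (4).

```python
import math


def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(row) for row in m]
    n, sign, result = len(a), 1.0, 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-15:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for k in range(col, n):
                a[r][k] -= factor * a[col][k]
    return sign * result


def gamma_theta(a, b, c, d, theta):
    """CM of the locally rotated state, as displayed in the proof."""
    ct, st = math.cos(theta), math.sin(theta)
    return [[a, 0.0, c * ct, -c * st],
            [0.0, a, d * st, d * ct],
            [c * ct, d * st, b, 0.0],
            [-c * st, d * ct, 0.0, b]]


def objective(a, b, c, d, theta):
    """1/sqrt(det Gamma) - 1/sqrt(det[(Gamma + Gamma_theta)/2])."""
    g0 = gamma_theta(a, b, c, d, 0.0)  # theta = 0 gives the standard form
    gt = gamma_theta(a, b, c, d, theta)
    avg = [[(x + y) / 2 for x, y in zip(r0, rt)] for r0, rt in zip(g0, gt)]
    return 1 / math.sqrt(det(g0)) - 1 / math.sqrt(det(avg))


def n_closed_form(a, b, c, d):
    """Closed form of Equation (4)."""
    return (1 / math.sqrt((a * b - c * c) * (a * b - d * d))
            - 1 / math.sqrt((a * b - c * c / 2) * (a * b - d * d / 2)))


a, b, c, d = 2.0, 1.5, 1.0, -0.5  # sample standard-form entries (assumed)
thetas = [k * (math.pi / 2) / 200 for k in range(201)]
values = [objective(a, b, c, d, t) for t in thetas]
print(max(values), n_closed_form(a, b, c, d))
```

On this grid the largest value is attained at the last angle, θ = π/2, where cos θ vanishes and the bracketed expression reduces to the closed form.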
For the general (n + m)-mode case, it is difficult to give an analytic formula of N (ρ AB ) for all
(n + m)-mode Gaussian states ρ AB . However, we are able to give an estimate of N (ρ AB ).
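Theorem 3. For any $(n + m)$-mode Gaussian state $\rho_{AB}$ with CM $\Gamma = \begin{pmatrix} A & C \\ C^{T} & B \end{pmatrix}$ as in Equation (1),
we have
$$0 \leq \mathcal{N}(\rho_{AB}) \leq \frac{1}{\sqrt{\det\Gamma}} - \frac{1}{\sqrt{(\det A)(\det B)}} < 1. \tag{5}$$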
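Proof of Theorem 3. By Proposition 1, without loss of generality, we may assume that the mean of
$\rho_{AB}$ is 0. Let $U_{S,m}$ be a Gaussian unitary operator such that $U_{S,m}\rho_B U_{S,m}^{\dagger} = \rho_B$. Then, the CM and the
mean of the Gaussian state $(I \otimes U_{S,m})\rho_{AB}(I \otimes U_{S,m}^{\dagger})$ are
$$\Gamma_U = \begin{pmatrix} A & CS^{T} \\ SC^{T} & B \end{pmatrix}$$
and 0, respectively.
Note that, for any $n$-mode Gaussian states $\rho$, $\sigma$ with CMs $V_\rho$, $V_\sigma$ and means $d_\rho$, $d_\sigma$, respectively, it is
shown in [32] that
$$\mathrm{Tr}(\rho\sigma) = \frac{1}{\sqrt{\det[(V_\rho + V_\sigma)/2]}} \exp\left[-\frac{1}{2}\,\delta d^{T}\,[(V_\rho + V_\sigma)/2]^{-1}\,\delta d\right], \quad \text{where } \delta d = d_\rho - d_\sigma. \tag{6}$$
Hence,
$$\begin{aligned}
\mathcal{N}(\rho_{AB}) &= \frac{1}{2} \sup_{U} \left\|\rho_{AB} - (I \otimes U)\rho_{AB}(I \otimes U^{\dagger})\right\|_2^2 \\
&= \sup_{U} \left\{\mathrm{Tr}(\rho_{AB}^2) - \mathrm{Tr}\left(\rho_{AB}(I \otimes U)\rho_{AB}(I \otimes U^{\dagger})\right)\right\} \\
&= \sup_{S} \left\{\frac{1}{\sqrt{\det\Gamma}} - \frac{1}{\sqrt{\det[(\Gamma + \Gamma_U)/2]}}\right\}.
\end{aligned}$$
Since $A > 0$, $B > 0$ and
$$\frac{\Gamma + \Gamma_U}{2} = \begin{pmatrix} A & \frac{C + CS^{T}}{2} \\ \frac{C^{T} + SC^{T}}{2} & B \end{pmatrix},$$
by Fischer’s inequality (p. 506, [33]), we have
$\det\frac{\Gamma + \Gamma_U}{2} \leq (\det A)(\det B)$. Thus, we get $\mathcal{N}(\rho_{AB}) \leq \frac{1}{\sqrt{\det\Gamma}} - \frac{1}{\sqrt{(\det A)(\det B)}}$. If $\rho_{AB}$ is a pure state, then
$1 = \mathrm{Tr}(\rho_{AB}^2) = \frac{1}{\sqrt{\det\Gamma}}$, which gives $\mathcal{N}(\rho_{AB}) \leq 1 - \frac{1}{\sqrt{(\det A)(\det B)}}$.
Notice that, by Equation (6), we have $\frac{1}{\sqrt{\det\Gamma}} = \mathrm{Tr}(\rho_{AB}^2) \leq 1$. This implies that
$\mathcal{N}(\rho_{AB}) \leq \frac{1}{\sqrt{\det\Gamma}} - \frac{1}{\sqrt{(\det A)(\det B)}} < 1$ since $\det A > 0$ and $\det B > 0$, that is, the inequality (5) is true.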
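To see that the upper bound 1 is sharp, consider the two-mode squeezed vacuum state
$\rho(r) = S(r)|00\rangle\langle 00|S^{\dagger}(r)$, where $S(r) = \exp(-r\hat{a}_1\hat{a}_2 + r\hat{a}_1^{\dagger}\hat{a}_2^{\dagger})$ is the two-mode squeezing
operator with squeezing parameter $r \geq 0$ and $|00\rangle$ is the vacuum state ([24]). The CM
of $\rho(r)$ is $\frac{1}{2}\begin{pmatrix} A_0 & B_0 \\ B_0 & A_0 \end{pmatrix}$, where
$$A_0 = \begin{pmatrix} \exp(-2r) + \exp(2r) & 0 \\ 0 & \exp(-2r) + \exp(2r) \end{pmatrix} \quad \text{and} \quad B_0 = \begin{pmatrix} -\exp(-2r) + \exp(2r) & 0 \\ 0 & \exp(-2r) - \exp(2r) \end{pmatrix}.$$
By Theorem 2, it is easily calculated that
$$\mathcal{N}(\rho(r)) = 1 - \frac{8}{6 + \exp(-4r) + \exp(4r)}.$$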
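As a quick consistency check, the closed form above can be compared with Equation (4) evaluated on the standard-form entries read off from the stated CM, namely a = b = cosh 2r and c = −d = sinh 2r. This is a minimal sketch with hypothetical function names; the sample values of r are arbitrary.

```python
import math


def n_measure(a, b, c, d):
    """Equation (4) for a (1+1)-mode Gaussian state in standard form."""
    return (1 / math.sqrt((a * b - c * c) * (a * b - d * d))
            - 1 / math.sqrt((a * b - c * c / 2) * (a * b - d * d / 2)))


def tmsv_standard_form(r):
    """Entries (a, b, c, d) of the two-mode squeezed vacuum CM given above."""
    return math.cosh(2 * r), math.cosh(2 * r), math.sinh(2 * r), -math.sinh(2 * r)


def n_tmsv(r):
    """Closed form stated in the text for the two-mode squeezed vacuum."""
    return 1 - 8 / (6 + math.exp(-4 * r) + math.exp(4 * r))


for r in (0.0, 0.5, 1.0, 2.0, 4.0):
    print(r, n_measure(*tmsv_standard_form(r)), n_tmsv(r))
```

Both columns agree to numerical precision and approach 1 as r grows, which is the sense in which the upper bound is sharp without being attained.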
$$\sup_{\rho_{AB}\ \text{is separable}} \mathcal{N}(\rho_{AB}) \leq d < 1.$$
If this is true, then $\rho_{AB}$ is entangled when $\mathcal{N}(\rho_{AB}) > d$. This will give a criterion of entanglement
for $(n + m)$-mode Gaussian states in terms of the correlation $\mathcal{N}$. Though we cannot give a mathematical
proof, we show that this is true for $(1 + 1)$-mode separable Gaussian states with $d \leq \frac{1}{10}$ by a
numerical approach. (Firstly, we randomly generated one million, five million, ten million, fifty million,
one hundred million, and five hundred million separable Gaussian states with $a$, $b$, $|c|$, $|d|$ ranging from 1
to 2, respectively. We found that the maximum of $\mathcal{N}$ is smaller than 0.09. Secondly, we used the same
method and extended the range to 5. Then, the maximum of $\mathcal{N}$ is smaller than 0.1. Thirdly, using the
same method and extending the range to 10, 100, 1000 and 10,000, respectively, we found that the maximum
of $\mathcal{N}$ is still smaller than 0.1. We repeated the above computations ten times, and the result was
the same.)
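A small-scale version of this numerical experiment can be sketched as follows. It is not the authors' code, and several choices are assumptions made for illustration: the function names, the sample size, the seed, and the sampling ranges (|c| and |d| are drawn from 0 up to the range limit rather than from 1). Physical states are selected with the standard-form conditions quoted after Equation (2), and separability is tested with the positive-partial-transpose condition for two-mode Gaussian states [27], which flips the sign of det C_0.

```python
import math
import random


def n_measure(a, b, c, d):
    """Equation (4) for a (1+1)-mode Gaussian state in standard form."""
    return (1 / math.sqrt((a * b - c * c) * (a * b - d * d))
            - 1 / math.sqrt((a * b - c * c / 2) * (a * b - d * d / 2)))


def is_physical(a, b, c, d):
    """Standard-form conditions: Gamma_0 > 0, det >= 1, and the bound on det C_0."""
    det_gamma = (a * b - c * c) * (a * b - d * d)
    return (a * b > c * c and a * b > d * d and det_gamma >= 1
            and det_gamma + 1 >= a * a + b * b + 2 * c * d)


def is_separable(a, b, c, d):
    """PPT test: the same bound with the sign of det C_0 = c*d reversed."""
    det_gamma = (a * b - c * c) * (a * b - d * d)
    return det_gamma + 1 >= a * a + b * b - 2 * c * d


def max_n_over_separable(samples, upper, seed=0):
    """Return (largest N found, number of accepted separable states)."""
    rng = random.Random(seed)
    best, accepted = 0.0, 0
    for _ in range(samples):
        a, b = rng.uniform(1, upper), rng.uniform(1, upper)
        c, d = rng.uniform(-upper, upper), rng.uniform(-upper, upper)
        if is_physical(a, b, c, d) and is_separable(a, b, c, d):
            accepted += 1
            best = max(best, n_measure(a, b, c, d))
    return best, accepted


for upper in (2.0, 5.0, 10.0):
    print(upper, max_n_over_separable(100_000, upper))
```

Rejection sampling of this kind only probes the bound; it cannot replace the proof that the text notes is still missing.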
It follows from Theorem 1 that the quantum correlation $\mathcal{N}$ exists in all entangled Gaussian
states and in all separable Gaussian states other than product states. In addition, Proposition 2 can be
viewed as a sufficient condition for the entanglement of two-mode Gaussian states: if $\mathcal{N}(\rho_{AB}) > 0.1$,
then $\rho_{AB}$ is entangled.
To gain insight into the behavior of the quantum correlation $\mathcal{N}$ and to compare it with
entanglement and the discords, we consider a class of physically relevant states, the squeezed thermal
states (STS). This kind of Gaussian state is used by many authors to illustrate the behavior of several
interesting quantum correlations [12,13]. Recall that a two-mode Gaussian state $\rho_{AB}$ is an STS if
$\rho_{AB} = S(r)\,\nu_1(\bar{n}_1) \otimes \nu_2(\bar{n}_2)\,S(r)^{\dagger}$, where $\nu_i(\bar{n}_i) = \sum_k \frac{\bar{n}_i^{k}}{(1 + \bar{n}_i)^{k+1}} |k\rangle\langle k|$ is the thermal state with thermal
photon number $\bar{n}_i$ ($i = 1, 2$) and $S(r) = \exp\{r(\hat{a}_1^{\dagger}\hat{a}_2^{\dagger} - \hat{a}_1\hat{a}_2)\}$ is the
two-mode squeezing operator.
Particularly, when $\bar{n}_1 = \bar{n}_2 = 0$, $\rho_{AB}$ is a pure two-mode squeezed vacuum state, also known as an
Einstein–Podolsky–Rosen (EPR) state [24]. When $\bar{n}_1 > 0$ or $\bar{n}_2 > 0$, $\rho_{AB}$ is a mixed Gaussian state.
For fixed $r$, $\rho_{AB}$ is separable (not in product form) for large enough $\bar{n}_1$, $\bar{n}_2$. Notice that if $\rho$ is an STS with
the CM $\Gamma_0$ in the standard form in Equation (2), then $c = -d$. In this case, by Theorem 2, we have
$$\mathcal{N}(\rho_{AB}) = \frac{1}{ab - c^2} - \frac{1}{ab - c^2/2}. \tag{7}$$
Using this parametrization, one gets $a = 2\bar{n}_r + 1 + 2\bar{n}_1(1 + \bar{n}_r) + 2\bar{n}_2\bar{n}_r$, $b = 2\bar{n}_r + 1 + 2\bar{n}_2(1 + \bar{n}_r) +
2\bar{n}_1\bar{n}_r$ and $c = -d = 2(1 + \bar{n}_1 + \bar{n}_2)\sqrt{\bar{n}_r(1 + \bar{n}_r)}$, where $\bar{n}_r = \sinh^2 r$ ([12]). Especially, if $\bar{n}_1 = \bar{n}_2 = \bar{n}$,
then $\rho_{AB}$ is called a symmetric squeezed thermal state (SSTS). Now assume that $\rho_{AB}$ is an SSTS. Then,
$\rho_{AB}$ is a mixed state if and only if $\bar{n} > 0$. The global purity of $\rho_{AB}$ is $\mu = \mathrm{Tr}(\rho_{AB}^2) = \frac{1}{(1 + 2\bar{n})^2}$ and the
smallest symplectic eigenvalue $\bar{v}_-$ of the CM of the partial transpose $\rho_{AB}^{T_B}$ is $\bar{v}_- = \frac{1 + 2\bar{n}}{\exp(2r)}$. Moreover, $\rho_{AB}$ is entangled if and
only if $\bar{v}_- < 1$.
We first discuss the relation between $\mathcal{N}$ and entanglement by considering SSTSs. Regard
$\mathcal{N}(\rho_{AB})$ as a function of $\mu$ and $\bar{v}_-$. From Figure 1a, we see that the value of $\mathcal{N}$ at
separable SSTSs is always smaller than 0.06, which supports Proposition 2. From Figure 1b,
for fixed purity $\mu$, $\mathcal{N}$ turns out to be a decreasing function of $\bar{v}_-$. However, for fixed $\bar{v}_-$, $\mathcal{N}$ tends to 0
when $\mu$ increases.
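The observation about separable SSTSs can be reproduced from the formulas alone. The sketch below is an illustration under stated assumptions: the function names and the grid in the thermal photon number and the squeezing parameter are arbitrary choices, and the square root in the expression for c follows the parametrization of [12]. It evaluates Equation (7) and scans the region where the smallest symplectic eigenvalue of the partial transpose is at least 1.

```python
import math


def sts_standard_form(n1, n2, r):
    """Standard-form entries (a, b, c) of a squeezed thermal state; d = -c."""
    nr = math.sinh(r) ** 2
    a = 2 * nr + 1 + 2 * n1 * (1 + nr) + 2 * n2 * nr
    b = 2 * nr + 1 + 2 * n2 * (1 + nr) + 2 * n1 * nr
    c = 2 * (1 + n1 + n2) * math.sqrt(nr * (1 + nr))
    return a, b, c


def n_sts(n1, n2, r):
    """Equation (7): the measure N for a squeezed thermal state."""
    a, b, c = sts_standard_form(n1, n2, r)
    return 1 / (a * b - c * c) - 1 / (a * b - c * c / 2)


def purity_ssts(n):
    """Global purity of a symmetric squeezed thermal state."""
    return 1 / (1 + 2 * n) ** 2


def smallest_pt_eigenvalue(n, r):
    """Smallest symplectic eigenvalue of the partially transposed CM."""
    return (1 + 2 * n) / math.exp(2 * r)


def max_n_over_separable_ssts(n_max=3.0, r_max=1.5, step=0.01):
    """Largest N on a grid of separable SSTSs (those with eigenvalue >= 1)."""
    best = 0.0
    for i in range(int(round(n_max / step)) + 1):
        for j in range(int(round(r_max / step)) + 1):
            n, r = i * step, j * step
            if smallest_pt_eigenvalue(n, r) >= 1:
                best = max(best, n_sts(n, n, r))
    return best


print(max_n_over_separable_ssts())
```

On this grid the largest value found lies between 0.05 and 0.06 and occurs close to the separability boundary, consistent with the statement drawn from Figure 1a.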
Figure 1. (a) N (ρ AB ) for separable SSTSs as a function of μ and v̄− ; (b) from top to bottom,
v̄− = 1.0, 1.2, 1.5, 2.0.
For the entangled SSTS, one sees from Figure 2a,b that the value of $\mathcal{N}$ ranges from 0 to 1. This reveals
that, for some entangled SSTSs, $\mathcal{N}$ can be smaller than $\frac{1}{10}$. Thus, Proposition 2 is only a necessary
condition for a Gaussian state to be separable. For fixed purity $\mu$, from Figures 1b and 2b, $\mathcal{N}(\rho_{AB})$
increases when entanglement increases (that is, $\bar{v}_- \to 0$) and $\lim_{\mu \to 1,\, \bar{v}_- \to 0} \mathcal{N} = 1$. However, for fixed
$\bar{v}_-$, the behavior of $\mathcal{N}$ as a function of $\mu$ is more complex.
Figure 2. (a) N (ρ AB ) for entangled SSTS as a function of μ and v̄− ; (b) from top to bottom,
v̄− = 0.1, 0.2, 0.5, 0.8.
Figure 3. N (ρ AB ) for SSTS as a function of n̄ and r. (a) from top to bottom n̄ = 0, 0.5, 1, 2, 3; (b) from
top to bottom r = 0.5, 1, 5, 10, 20.
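where the infimum is taken over all one-mode Gaussian states $\omega$, $f(x) = \frac{x+1}{2}\log\frac{x+1}{2} - \frac{x-1}{2}\log\frac{x-1}{2}$, $v_-$
and $v_+$ are the symplectic eigenvalues of the CM of $\rho_{AB}$, and $E_\omega = A_0 - C_0(B_0 + \Gamma_\omega)^{-1}C_0^{T}$ with $\Gamma_\omega$ the
CM of $\omega$. Let $\alpha = \det A_0$, $\beta = \det B_0$, $\gamma = \det C_0$, $\delta = \det\Gamma_0$; then we have [13]
$$\inf_{\omega} \det E_\omega = \begin{cases}
\dfrac{2\gamma^2 + (\beta - 1)(\delta - \alpha) + 2|\gamma|\sqrt{\gamma^2 + (\beta - 1)(\delta - \alpha)}}{(\beta - 1)^2} & \text{if } (\delta - \alpha\beta)^2 \leq (1 + \beta)\gamma^2(\alpha + \delta), \\[2ex]
\dfrac{\alpha\beta - \gamma^2 + \delta - \sqrt{\gamma^4 + (\delta - \alpha\beta)^2 - 2\gamma^2(\alpha\beta + \delta)}}{2\beta} & \text{otherwise.}
\end{cases} \tag{9}$$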
$$D_G(\rho_{AB}) = \frac{1}{ab - c^2} - \frac{9}{\left(\sqrt{4ab - 3c^2} + \sqrt{ab}\right)^2}. \tag{11}$$
Clearly, our formula (7) for $\mathcal{N}$ is simpler than formula (11) for $D_G$.
Figures 4 and 5 are plotted in terms of the photon number $\bar{n}$ and the squeezing parameter $r$. Figure 4 shows
that, for the case of SSTS and for $0 < r \leq 2.5$, we have $D_G(\rho_{AB}) < \mathcal{N}(\rho_{AB})$. This means that $\mathcal{N}$ is
better than $D_G$ when they are used to detect the correlation that they describe in the SSTS with $r < 2.5$.
Figure 5a reveals that, for the case of nonsymmetric STS and for $r = 0.5$, we have $D_G(\rho_{AB}) < \mathcal{N}(\rho_{AB})$;
that is, $\mathcal{N}$ is better in this situation too. However, for $r = 5$, $\mathcal{N}$ and $D_G$ cannot be compared with each
other globally, which suggests that one may use $\max\{\mathcal{N}(\rho_{AB}), D_G(\rho_{AB})\}$ to detect the correlation.
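The comparison for symmetric states in the range 0 < r ≤ 2.5 can be checked directly from Equations (7) and (11). The sketch below is a minimal illustration with hypothetical function names and an arbitrary set of sample points; it covers only symmetric squeezed thermal states and makes no claim about the nonsymmetric case or about larger r.

```python
import math


def ssts_entries(n, r):
    """Standard-form entries (a, c) of a symmetric squeezed thermal state (a = b)."""
    nr = math.sinh(r) ** 2
    a = 2 * nr + 1 + 2 * n * (1 + nr) + 2 * n * nr
    c = 2 * (1 + 2 * n) * math.sqrt(nr * (1 + nr))
    return a, c


def n_measure(n, r):
    """Equation (7) evaluated for an SSTS."""
    a, c = ssts_entries(n, r)
    return 1 / (a * a - c * c) - 1 / (a * a - c * c / 2)


def geometric_discord(n, r):
    """Equation (11), the Gaussian geometric discord, evaluated for an SSTS."""
    a, c = ssts_entries(n, r)
    ab = a * a
    return 1 / (ab - c * c) - 9 / (math.sqrt(4 * ab - 3 * c * c) + math.sqrt(ab)) ** 2


for n in (0.0, 0.5, 1.0, 3.0):
    for r in (0.1, 0.5, 1.0, 2.5):
        print(n, r, geometric_discord(n, r), n_measure(n, r))
```

At each of these sample points the second printed value exceeds the first, in line with what the text reads off from Figure 4.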
Figure 5. Comparison of $\mathcal{N}(\rho_{AB})$ with $D_G(\rho_{AB})$ for nonsymmetric STS. (a) and (b) correspond to nonsymmetric STS with
r = 0.5 and r = 5, respectively.
5. Conclusions
In conclusion, we introduce a measure of quantum correlation by N for bipartite quantum states
in continuous-variable systems. This measure is introduced by performing Gaussian unitary operations
to a subsystem and the value of it is invariant for all quantum states under local Gaussian unitary
operations. N exists in all (n + m)-mode Gaussian states except product ones. In addition, N takes
values in [0, 1) and the upper bound 1 is sharp. An analytical formula of N for any (1 + 1)-mode
Gaussian states is obtained. Moreover, for any (n + m)-mode Gaussian states, an estimate of N
is established in terms of its covariance matrix. Numerical evidence shows that the inequality
N (ρ AB ) ≤ 0.1 holds for any (1 + 1)-mode separable Gaussian states ρ AB , which can be viewed as a
criterion of entanglement. It is worth noting that Gaussian QD, Gaussian GD and N measure the same
quantum correlation for (1 + 1)-mode Gaussian states. However, N is easier to calculate and can be
applied to any (n + m)-mode Gaussian states.
Acknowledgments: The authors would like to thank the anonymous referees for helpful comments and
suggestions that improved the original paper. This work is partially supported by the Natural Science Foundation
of China (11671006, 11671294) and the Outstanding Youth Foundation of Shanxi Province (201701D211001).
Author Contributions: Yangyang Wang completed the proofs of the main theorems. The rest of the work in this paper was
accomplished by Xiaofei Qi and Jinchuan Hou.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Horodecki, R.; Horodecki, P.; Horodecki, M.; Horodecki, K. Quantum entanglement. Rev. Mod. Phys. 2009,
81, 865.
2. Dakić, B.; Lipp, Y.O.; Ma, X.; Ringbauer, M.; Kropatschek, S.; Barz, S.; Paterek, T.; Vedral, V.; Zeilinger, A.;
Brukner, Č.; et al. Quantum discord as resource for remote state preparation. Nat. Phys. 2012, 8, 666–670.
3. Madhok, V.; Datta, A. Interpreting quantum discord through quantum state merging. Phys. Rev. A 2011,
83, 032323.
4. Cavalcanti, D.; Aolita, L.; Boixo, S.; Modi, K.; Piani, M.; Winter, A. Operational interpretations of quantum
discord. Phys. Rev. A 2011, 83, 032324.
5. Datta, A.; Shaji, A.; Caves, C.M. Quantum discord and the power of one qubit. Phys. Rev. Lett. 2008,
100, 050502.
6. Ollivier, H.; Zurek, W.H. Quantum Discord: A Measure of the Quantumness of Correlations. Phys. Rev. Lett.
2001, 88, 017901.
7. Dakić, B.; Vedral, V.; Brukner, Č. Necessary and Sufficient Condition for Nonzero Quantum Discord.
Phys. Rev. Lett. 2010, 105, 190502.
8. Luo, S.; Fu, S. Geometric measure of quantum discord. Phys. Rev. A 2010, 82, 034302.
9. Miranowicz, A.; Horodecki, P.; Chhajlany, R.W.; Tuziemski, J.; Sperling, J. Analytical progress on symmetric
geometric discord: Measurement-based upper bounds. Phys. Rev. A 2012, 86, 042123.
10. Luo, S.; Fu, S. Measurement-induced nonlocality. Phys. Rev. Lett. 2011, 82, 120401.
11. Luo, S. Using measurement-induced disturbance to characterize correlations as classical or quantum.
Phys. Rev. A 2008, 77, 022301.
12. Giorda, P.; Paris, M.G.A. Gaussian Quantum Discord. Phys. Rev. Lett. 2010, 105, 020503.
13. Adesso, G.; Datta, A. Quantum versus Classical Correlations in Gaussian States. Phys. Rev. Lett. 2010,
105, 030501.
14. Adesso, G.; Girolami, D. Gaussian geometric discord. Int. J. Quantum Inf. 2011, 9, 1773–1786.
15. Mišta, L., Jr.; Tatham, R.; Girolami, D.; Korolkova, N.; Adesso, G. Measurement-induced disturbances and
nonclassical correlations of Gaussian states. Phys. Rev. A 2011, 83, 042325.
16. Ma, R.F.; Hou, J.C.; Qi, X.F. Measurement-induced nonlocality for Gaussian states. Int. J. Theor. Phys. 2017,
56, 1132–1140.
17. Farace, A.; de Pasquale, A.; Rigovacca, L.; Giovannetti, V. Discriminating strength: A bona fide measure of
non-classical correlations. New J. Phys. 2014, 16, 073010.
18. Rigovacca, L.; Farace, A.; de Pasquale, A.; Giovannetti, V. Gaussian discriminating strength. Phys. Rev. A
2015, 92, 042331.
19. Fu, L. Nonlocal effect of a bipartite system induced by local cyclic operation. Europhys. Lett. 2006, 75, 1.
20. Datta, A.; Gharibian, S. Signatures of nonclassicality in mixed-state quantum computation. Phys. Rev. A
2009, 79, 042325.
21. Gharibian, S. Quantifying nonclassicality with local unitary operations. Phys. Rev. A 2012, 86, 042106.
22. Braunstein, S.L.; van Loock, P. Quantum information with continuous variables. Rev. Mod. Phys. 2005,
77, 513.
23. Wang, X.B.; Hiroshima, T.; Tomita, A.; Hayashi, M. Quantum information with Gaussian states. Phys. Rep.
2007, 448, 1–111.
24. Weedbrook, C.; Pirandola, S.; García-Patrón, R.; Cerf, N.J.; Ralph, T.C.; Shapiro, J.H.; Lloyd, S. Gaussian
quantum information. Rev. Mod. Phys. 2012, 84, 621.
25. Simon, R.; Mukunda, N.; Dutta, B. Quantum-noise matrix for multimode systems: U(n) invariance, squeezing,
and normal forms. Phys. Rev. A 1994, 49, 1567.
26. Duan, L.M.; Giedke, G.; Cirac, J.I.; Zoller, P. Inseparability Criterion for Continuous Variable Systems.
Phys. Rev. Lett. 2000, 84, 2722.
27. Simon, R. Peres-Horodecki Separability Criterion for Continuous Variable Systems. Phys. Rev. Lett. 2000,
84, 2726.
28. Serafini, A. Multimode Uncertainty Relations and Separability of Continuous Variable States. Phys. Rev. Lett.
2006, 96, 110402.
29. Pirandola, S.; Serafini, A.; Lloyd, S. Correlation matrices of two-mode bosonic systems. Phys. Rev. A 2009,
79, 052327.
30. Anders, J. Estimating the degree of entanglement of unknown Gaussian states. arXiv 2012, arXiv:quant-ph/
0610263v1.
31. Williamson, J. On the algebraic problem concerning the normal forms of linear dynamical systems.
Am. J. Math. 1936, 58, 141–163.
32. Marian, P.; Marian, T.A. Uhlmann fidelity between two-mode Gaussian states. Phys. Rev. A 2012, 86, 022340.
33. Horn, R.A.; Johnson, C.R. Matrix Analysis; Cambridge University Press: Cambridge, UK, 2012.
34. Holevo, A.S. Quantum Systems, Channels, Information: A Mathematical Introduction; De Gruyter: Berlin,
Germany, 2012.
35. Peres, A. Separability Criterion for Density Matrices. Phys. Rev. Lett. 1997, 77, 1413.
36. Horodecki, M.; Horodecki, P.; Horodecki, R. Separability of mixed states: necessary and sufficient conditions.
Phys. Lett. A 1996, 1, 223.
37. Werner, R.F.; Wolf, M.M. Bound Entangled Gaussian States. Phys. Rev. Lett. 2001, 86, 3658.
38. Giedke, G.; Cirac, J.I. Characterization of Gaussian operations and distillation of Gaussian states. Phys. Rev. A
2002, 66, 032316.
39. Fiurášek, J.; Mišta, L., Jr. Gaussian localizable entanglement. Phys. Rev. A 2007, 75, 060302.
© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
entropy
Article
Entropic Updating of Probabilities and
Density Matrices
Kevin Vanslette
Department of Physics, University at Albany (SUNY), Albany, NY 12222, USA
Abstract: We find that the standard relative entropy and the Umegaki entropy are designed for the
purpose of inferentially updating probabilities and density matrices, respectively. From the same set
of inferentially guided design criteria, both of the previously stated entropies are derived in parallel.
This formulates a quantum maximum entropy method for the purpose of inferring density matrices
in the absence of complete information.
Keywords: probability theory; entropy; quantum relative entropy; quantum information; quantum
mechanics; inference
1. Introduction
We design an inferential updating procedure for probability distributions and density matrices
such that inductive inferences may be made. The inferential updating tools found in this derivation take
the form of the standard and quantum relative entropy functionals, and thus we find the functionals
are designed for the purpose of updating probability distributions and density matrices, respectively.
Previously formulated design derivations, which found the entropy to be a tool for inference, originally
required five design criteria (DC) [1–3]; this was reduced to four in [4–6] and then to three in [7].
We reduce the number of required DC to two while also providing the first design derivation of
the quantum relative entropy, using the same design criteria and inferential principles in both instances.
The designed quantum relative entropy takes the form of Umegaki’s quantum relative entropy,
and thus it has the “proper asymptotic form of the relative entropy in quantum (mechanics)” [8–10].
Recently, Wilming et al. [11] gave an axiomatic characterization of the quantum relative entropy that
“uniquely determines the quantum relative entropy”. Our derivation differs from theirs, again in
that we design the quantum relative entropy for a purpose, but also in that our DCs are imposed on
what turns out to be the functional derivative of the quantum relative entropy rather than on the
quantum relative entropy itself. The use of a quantum entropy for the purpose of inference has a long
history: Jaynes [12,13] invented the notion of the quantum maximum entropy method [14], and it
was carried forward by [15–22] and many others. However, we find the quantum relative entropy to be the
suitable entropy for updating density matrices, rather than the von Neumann entropy [23], as is suggested
in [24]. I believe the present article provides the desired motivation for why the appropriate quantum
relative entropy for updating density matrices, from prior to posterior, should be logarithmic in form
while also providing a solution for updating non-uniform prior density matrices [24]. The relevant
results of these papers may be found using the quantum relative entropy with suitably chosen prior
density matrices.
It should be noted that because the relative entropies were reached by design, they may be
interpreted as such, “the relative entropies are tools for updating”, which means we no longer need to
attach an interpretation ex post facto—as a measure of disorder or amount of missing information. In this
sense, the relative entropies were built for the purpose of saturating their own interpretation [4,7], and,
therefore, the quantum relative entropy is the tool designed for updating density matrices.
This article takes an inferential approach to probabilities and density matrices that is expected
to be notionally consistent with the Bayesian derivations of Quantum Mechanics, such as Entropic
Dynamics [7,25–27], as well as Bayesian interpretations of Quantum Mechanics, such as QBism [28].
The quantum maximum entropy method is, however, expected to be useful independent of one’s
interpretation of Quantum Mechanics because the entropy is designed at the level of density matrices
rather than being formulated from arguments about the “inner workings” of Quantum Mechanics.
This inferential approach is, at the very least, verbally convenient so we will continue writing in
this language.
A few applications of the quantum maximum entropy method are given in another article [29].
By maximizing the quantum relative entropy with respect to a “data constraint” and the appropriate
prior density matrix, the Quantum Bayes Rule [30–34] (a positive-operator valued measure (POVM)
measurement and collapse) is derived. The quantum maximum entropy method can reproduce the
density matrices in [35,36] that are cited as “Quantum Bayes Rules”, but the required constraints
are difficult to motivate; however, it is expected that the results of this paper may be useful for
further understanding Machine Learning techniques that involve the quantum relative entropy [37].
The Quantum Bayes Rule derivation in [29] is analogous to the standard Bayes Rule derivation from
the relative entropy given in [38], as was suggested to be possible in [24]. This article provides the
foundation for [29], and thus, the quantum maximum entropy method unifies a few topics in Quantum
Information and Quantum Measurement through entropic inference.
As is described in this article and in [29], the quantum maximum entropy method is able to
provide solutions even if the constraints and prior density matrix in question do not all mutually
commute. This might be useful for subjects as far reaching as [39], which seeks to use Quantum Theory
as a basis for building models for cognition. The immediate correspondence is that the quantum
maximum entropy method might provide a solution toward addressing the empirical evidence for
noncommutative cognition, which is how one’s cognition changes when addressing questions in
permuted order [39]. A simpler model for noncommutative cognition may also be possible by applying
sequential updates via the standard maximum entropy method with their order permuted. Sequential
updating does not, in general, give the same resultant probability distribution when the updating order
is permuted—this is argued to be a feature of the standard maximum entropy method [40]. Similarly,
sequential updating in the quantum maximum entropy method also has this feature, but it should be
noted that the noncommutativity of sequential updating is different in principle than simultaneously
updating with respect to expectation values of noncommuting operators.
The remainder of the paper is organized as follows: first, we will discuss some universally
applicable principles of inference and motivate the design of an entropy function able to rank
probability distributions. This entropy function will be designed such that it is consistent with
inference by applying a few reasonable design criteria, which are guided by the aforementioned
principles of inference. Using the same principles of inference and design criteria, we find the form
of the quantum relative entropy suitable for inference. The solution to an example of updating 2 × 2
prior density matrices with respect to expectation values over spin matrices that do not commute
with the prior via the quantum maximum entropy method is given in Appendix B. We end with
concluding remarks (I thank the reviewers for providing several useful references in this section).
Entropy 2017, 19, 664
in the form of data [38,40]. In the spirit of the derivation we will carry on as if the maximum entropy
method were not known and show how it may be derived as an application of inference.
Given a probability distribution ϕ(x) over a general set of propositions x ∈ X, it is self-evident
that if new information is learned, we are entitled to assign a new probability distribution ρ(x) that
somehow reflects this new information while also respecting our prior probability distribution ϕ(x).
The main question we must address is: “Given some information, to what posterior probability
distribution ρ(x) should we update our prior probability distribution ϕ(x)?”, that is,
$$\phi(x) \xrightarrow{\;*\;} \rho(x)?$$
This specifies the problem of inductive inference. Since “information” has many colloquial,
yet potentially conflicting, definitions, we remove potential confusion by defining information
operationally (∗) as the rationale that causes a probability distribution to change (inspired by and
adapted from [7]). Directly from [7]:
Our goal is to design a method that allows a systematic search for the preferred posterior
distribution. The central idea, first proposed in [4], is disarmingly simple: to select the
posterior, first rank all candidate distributions in increasing order of preference and then pick
the distribution that ranks the highest. Irrespective of what it is that makes one distribution
preferable over another (we will get to that soon enough), it is clear that any ranking
according to preference must be transitive: if distribution ρ1 is preferred over distribution
ρ2 , and ρ2 is preferred over ρ3 , then ρ1 is preferred over ρ3 . Such transitive rankings are
implemented by assigning to each ρ( x ) a real number S[ρ], which is called the entropy of ρ,
in such a way that if ρ1 is preferred over ρ2 , then S[ρ1 ] > S[ρ2 ]. The selected distribution
(one or possibly many, for there may be several equally preferred distributions) is that
which maximizes the entropy functional.
This simple statement provides the foundation for inference [7]. If the updating of probability
distributions is to be done objectively, then possibilities should not be needlessly ruled out or
suppressed. Being informationally stingy, in that we should only update probability distributions
when the information requires it, pushes inductive inference toward objectivity. Thus, using the
Principle of Minimal Updating (PMU) helps formulate a pragmatic (and objective) procedure for making
inferences using (informationally) subjective probability distributions [41].
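As a minimal sketch of the ranking idea quoted above, the snippet below scores a handful of candidate posteriors and selects the highest-ranked one. The scoring function is an assumption at this point in the argument: it uses the logarithmic relative entropy that this section goes on to derive. The prior, the candidate list, and the function names are hypothetical and chosen only for illustration.

```python
import math

def relative_entropy(rho, prior):
    """S[rho, prior] = -sum_i rho_i ln(rho_i / prior_i); terms with rho_i = 0 contribute 0."""
    return -sum(r * math.log(r / p) for r, p in zip(rho, prior) if r > 0.0)

def select_posterior(candidates, prior):
    """Rank the candidates by entropy and return the highest-ranked one."""
    return max(candidates, key=lambda rho: relative_entropy(rho, prior))

# Hypothetical three-state example with a uniform prior.
prior = [1 / 3, 1 / 3, 1 / 3]
values = [0.0, 1.0, 2.0]

# Hand-picked candidates that all satisfy the same constraint <A> = 1.2.
candidates = [
    [0.2, 0.4, 0.4],
    [0.3, 0.2, 0.5],
    [0.4, 0.0, 0.6],
    [0.1, 0.6, 0.3],
]

best = select_posterior(candidates, prior)
print(best)
```

The ranking is transitive by construction because each candidate is mapped to a single real number. A finite candidate list only approximates the search; the full method maximizes over all distributions compatible with the constraints.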
This method of inference is only as universal and general as its ability to apply equally well to
any specific inference problem. The notion of “specificity” is the notion of statistical independence;
a special case is only special in that it is separable from other special cases. The notion that systems
may be “sufficiently independent” plays a central and deep-seated role in science, and the idea that
some things can be neglected, and that not everything matters, is implemented by imposing criteria
that tell us how to handle independent systems [7]. Ironically, the property universally shared by all
specific inference problems is their ability to be independent of one another: they share independence.
Thus, a universal inference scheme based on the PMU permits:
And,
Subsystem Independence: When two systems are a priori believed to be independent and we only
receive information about one, then the state of knowledge of the other system remains unchanged.
The PIs are special cases of the PMU that ultimately take the form of design criteria in this design
derivation. The process of constraining the form of S[ρ, ϕ] by imposing design criteria may be viewed
as the process of eliminative induction, and after sufficient constraining, a single form for the entropy
remains. Thus, the justification behind the surviving entropy is not that it leads to demonstrably correct
inferences, but, rather, that all other candidate entropies demonstrably fail to perform as desired [7].
Rather than the design criteria instructing one how to update, they instruct in what instances one should
not update. That is, rather than justifying one way to skin a cat over another, we tell you when not to
skin it, which is operationally unique—namely you don’t do it—luckily enough for the cat.
(The notation will be as follows: we denote priors by ϕ, candidate posteriors by lower case ρ, and the
selected posterior by upper case P.) We emphasize the point is not that we make the unwarranted
assumption that keeping ϕ( x |D) unchanged is guaranteed to lead to correct inferences. It need not;
induction is risky. The point is, rather, that, in the absence of any evidence to the contrary, there is no
reason to change our minds and the prior information takes priority.
DC1 Implementation
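Consider the set of microstates xi ∈ X belonging to either of two non-overlapping domains D or
its complement D′, such that X = D ∪ D′ and ∅ = D ∩ D′. For convenience, let ρ(xi) = ρi. Consider
the following constraints:
$$\rho(D) = \sum_{i \in D} \rho_i, \qquad \rho(D') = \sum_{i \in D'} \rho_i, \tag{2}$$
such that ρ(D) + ρ(D′) = 1, and the following “local” expectation value constraints over D and D′,
$$\langle A \rangle = \sum_{i \in D} \rho_i A_i, \qquad \langle A' \rangle = \sum_{i \in D'} \rho_i A_i. \tag{3}$$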
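$$0 = \delta\Big( S - \lambda\Big[\rho(D) - \sum_{i \in D} \rho_i\Big] - \mu\Big[\langle A \rangle - \sum_{i \in D} \rho_i A_i\Big] - \lambda'\Big[\rho(D') - \sum_{i \in D'} \rho_i\Big] - \mu'\Big[\langle A' \rangle - \sum_{i \in D'} \rho_i A_i\Big] \Big),$$
and, thus, the entropy is maximized when the following differential relationships hold:
$$\frac{\delta S}{\delta \rho_i} = \lambda + \mu A_i \quad \forall\, i \in D, \tag{4}$$
$$\frac{\delta S}{\delta \rho_i} = \lambda' + \mu' A_i \quad \forall\, i \in D'. \tag{5}$$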
Equations (2)–(5) are n + 4 equations we must solve to find the four Lagrange multipliers {λ, λ′, μ, μ′}
and the n probability values {ρi} associated to the n microstates {xi}. If the subdomain constraint
DC1 is imposed in the most restrictive case, then it will hold in general. The most restrictive case
requires splitting X into a set of {Di} domains such that each Di singularly includes one microstate xi.
This gives,
$$\frac{\delta S}{\delta \rho_i} = \lambda_i + \mu_i A_i \quad \text{in each } D_i. \tag{6}$$
Because the entropy S = S[ρ1 , ρ2 , ...; ϕ1 , ϕ2 , ...] is a functional over the probability of each microstate’s
posterior and prior distribution, its variational derivative is also a function of said probabilities
in general,
$$\frac{\delta S}{\delta \rho_i} \equiv \varphi_i(\rho_1, \rho_2, \ldots; \phi_1, \phi_2, \ldots) = \lambda_i + \mu_i A_i \quad \text{for each } (i, D_i). \tag{7}$$
DC1 is imposed by constraining the form of φi(ρ1, ρ2, ...; ϕ1, ϕ2, ...) = φi(ρi; ϕ1, ϕ2, ...) to ensure that
changes in Ai → Ai + δAi have no influence over the value of ρj in domain Dj, through φi, for i ≠ j.
If there is no new information about propositions in Dj, its distribution should remain equal to ϕj
by the PMU. We further restrict φi such that an arbitrary variation of ϕj → ϕj + δϕj (a change in the
prior state of knowledge of the microstate j) has no effect on ρi for i ≠ j and therefore DC1 imposes
φi = φi(ρi, ϕi), as is guided by the PMU. At this point, it is easy to generalize the analysis to continuous
microstates such that the indices become continuous i → x, sums become integrals, and discrete
probabilities become probability densities ρi → ρ( x ).
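The effect of DC1 can be checked numerically. The sketch below assumes the exponential (canonical-like) posterior that this section arrives at later, applied with a quantity A that refers only to microstates in D; the prior, the values of A, and the multiplier value 0.7 are hypothetical. It shows that probabilities conditional on D′ are left untouched while those conditional on D change.

```python
import math

def update(prior, A, alpha):
    """Tilt the prior by exp(alpha * A_i) and renormalize."""
    weights = [p * math.exp(alpha * a) for p, a in zip(prior, A)]
    Z = sum(weights)
    return [w / Z for w in weights]

def conditional(dist, domain):
    """Probabilities conditional on the given list of microstate indices."""
    total = sum(dist[i] for i in domain)
    return [dist[i] / total for i in domain]

prior = [0.10, 0.20, 0.30, 0.15, 0.25]
D = [0, 1, 2]
D_prime = [3, 4]
A = [1.0, 2.0, 3.0, 0.0, 0.0]  # the information refers to D only

posterior = update(prior, A, alpha=0.7)

print(conditional(prior, D_prime), conditional(posterior, D_prime))
print(conditional(prior, D), conditional(posterior, D))
```

Only the conditional probabilities within D′ are compared here; the total weight ρ(D′) is free to change unless it is fixed by its own constraint, as in (2).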
Remark
We are designing the entropy for the purpose of ranking posterior probability distributions (for the
purpose of inference); however, the highest ranked distribution is found by setting the variational
derivative of S[ρ, ϕ] equal to the variations of the expectation value constraints by the Lagrange
multiplier method,
$$\frac{\delta S}{\delta \rho(x)} = \lambda + \sum_i \mu_i A_i(x). \tag{8}$$
Therefore, the real quantity of interest is δS/δρ(x) rather than the specific form of S[ρ, ϕ]. All forms of S[ρ, ϕ]
that give the correct form of δS/δρ(x) are equally valid for the purpose of inference. Thus, every design
criterion may be imposed on the variational derivative of the entropy rather than on the entropy itself,
which is what we do. When maximizing the entropy, for convenience, we will let,
$$\frac{\delta S}{\delta \rho(x)} \equiv \varphi_x(\rho(x), \phi(x)), \tag{9}$$
and further use the shorthand φx (ρ, ϕ) ≡ φx (ρ( x ), ϕ( x )), in all cases.
DC1’: In the absence of new information, our new state of knowledge ρ( x ) is equal to the old state of
knowledge ϕ( x ).
This is a special case of DC1, and is implemented differently than in [7]. The PMU is in principle
a statement about informational honesty; that is, one should not “jump to conclusions” in light
of new information, and in the absence of new information, one should not change their state of
knowledge. If no new information is given, the prior probability distribution ϕ(x) does not change,
that is, the posterior probability distribution ρ(x) = ϕ(x) is equal to the prior probability. If we
maximize the entropy without applying constraints,
$$\frac{\delta S}{\delta \rho(x)} = 0, \tag{10}$$
then DC1’ imposes the condition
$$\frac{\delta S}{\delta \rho(x)} = \varphi_x(\rho, \phi) = \varphi_x(\phi, \phi) = 0, \tag{11}$$
for all x in this case. This special case of DC1 and the PMU turns out to be incredibly constraining,
as we will see over the course of DC2.
Comment
If the variable x is continuous, DC1 requires that information referring to points infinitely close to,
but just outside, the domain D has no influence on probabilities conditional on D [7].
This may seem surprising as it may lead to updated probability distributions that are discontinuous.
Is this a problem? No.
In certain situations (e.g., physics) we might have explicit reasons to believe that conditions of
continuity or differentiability should be imposed and this information might be given to us in a variety
of ways. The crucial point, however—and this is a point that we keep and will keep reiterating—is
that unless such information is explicitly given, we should not assume it. If the new information leads
to discontinuities, so be it.
DC2: Subsystem Independence
DC2 imposes the second instance of when one should not update—the Subsystem PI.
We emphasize that DC2 is not a consistency requirement. The argument we deploy is not that both
the prior and the new information tells us the systems are independent, in which case consistency
requires that it should not matter whether the systems are treated jointly or separately. Rather, DC2
refers to a situation where the new information does not say whether the systems are independent
or not, but information is given about each subsystem. The updating is being designed so that the
independence reflected in the prior is maintained in the posterior by default via the PMU and the
second clause of the PIs [7].
The point is not that when we have no evidence for correlations we draw the firm conclusion that
the systems must necessarily be independent. They could indeed have turned out to be correlated and
then our inferences would be wrong. Again, induction involves risk. The point is rather that if the
joint prior reflects independence and the new evidence is silent on the matter of correlations, then the
prior independence takes precedence. As before, in this case subdomain independence, the probability
distribution should not be updated unless the information requires it [7].
DC2 Implementation
Consider a composite system, x = ( x1 , x2 ) ∈ X = X1 × X2 . Assume that all prior evidence led
us to believe the subsystems are independent. This belief is reflected in the prior distribution: if the
individual system priors are ϕ1 ( x1 ) and ϕ2 ( x2 ), then the prior for the whole system is their product
ϕ1 ( x1 ) ϕ2 ( x2 ). Further suppose that new information is acquired such that ϕ1 ( x1 ) would by itself be
updated to P1(x1) and that ϕ2(x2) would by itself be updated to P2(x2). By design, the implementation
of DC2 constrains the entropy functional such that, in this case, the joint product prior ϕ1 ( x1 ) ϕ2 ( x2 )
updates to the selected product posterior P1 ( x1 ) P2 ( x2 ) [7].
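The behavior DC2 is designed to produce can be illustrated numerically. The sketch below assumes the exponential posterior form that the section derives later, with one expectation-value constraint per subsystem; the priors, the quantities A1 and A2, and the multiplier values are hypothetical. Updating the joint product prior gives the product of the individually updated posteriors.

```python
import math

def tilt(prior, A, alpha):
    """Tilt the prior by exp(alpha * A_i) and renormalize."""
    weights = [p * math.exp(alpha * a) for p, a in zip(prior, A)]
    Z = sum(weights)
    return [w / Z for w in weights]

prior1, A1, alpha1 = [0.5, 0.3, 0.2], [0.0, 1.0, 2.0], 0.4
prior2, A2, alpha2 = [0.6, 0.4], [-1.0, 1.0], -0.9

# Update each subsystem on its own.
P1 = tilt(prior1, A1, alpha1)
P2 = tilt(prior2, A2, alpha2)

# Update the joint system, flattened over pairs (x1, x2).
joint_prior = [p1 * p2 for p1 in prior1 for p2 in prior2]
joint_exponent = [alpha1 * a1 + alpha2 * a2 for a1 in A1 for a2 in A2]
joint = tilt(joint_prior, joint_exponent, 1.0)

product = [p1 * p2 for p1 in P1 for p2 in P2]
max_gap = max(abs(j - p) for j, p in zip(joint, product))
print(max_gap)
```

The factorization holds because the joint normalization is the product of the two subsystem normalizations when the exponent is a sum of a function of x1 and a function of x2.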
The argument below is considerably simplified if we expand the space of probabilities to include
distributions that are not necessarily normalized. This does not represent any limitation because a
normalization constraint may always be applied. We consider a few special cases below:
Case 1: We receive the extremely constraining information that the posterior distribution for system 1
is completely specified to be P1 ( x1 ) while we receive no information at all about system 2. We treat
the two systems jointly. Maximize the joint entropy S[ρ( x1 , x2 ), ϕ( x1 ) ϕ( x2 )] subject to the following
constraints on the ρ( x1 , x2 ) :
$$\int dx_2\, \rho(x_1, x_2) = P_1(x_1). \tag{12}$$
This equation must hold for all choices of x2 and all choices of the prior ϕ2 ( x2 ) as λ1 ( x1 ) is independent
of x2. Suppose we had chosen a different prior ϕ2′(x2) = ϕ2(x2) + δϕ2(x2) that disagrees with ϕ2(x2).
For all x2 and δϕ2 ( x2 ), the multiplier λ1 ( x1 ) remains unchanged as it constrains the independent
ρ( x1 ) → P1 ( x1 ). This means that any dependence that the right-hand side might potentially have had
on x2 and on the prior ϕ2 ( x2 ) must cancel out. This means that
Since ϕ2 is arbitrary in f , suppose further that we choose a constant prior set equal to one,
ϕ2 ( x2 ) = 1, therefore
The left-hand side does not depend on x2 , and therefore neither does the right-hand side. An argument
exchanging systems 1 and 2 gives a similar result.
Case 1—Conclusion: When the system 2 is not updated the dependence on ϕ2 and x2 drops out,
As we seek the general functional form of φx1 x2 , and because the x2 dependence drops out of (19)
and the x1 dependence drops out of (20) for arbitrary ϕ1 , ϕ2 and ϕ12 = ϕ1 ϕ2 , the explicit coordinate
dependence in φ consequently drops out of both such that,
$$\varphi_{x_1 x_2} \to \varphi, \tag{21}$$
Again, this is one constraint for each value of x1 and one constraint for each value of x2 , which,
therefore, require the separate multipliers μ1 ( x1 ) and μ2 ( x2 ). Maximizing S with respect to these
constraints is then,
$$0 = \delta\left[ S - \int dx_1\, \mu_1(x_1) \left( \int dx_2\, \rho(x_1, x_2) - P_1(x_1) \right) - \int dx_2\, \mu_2(x_2) \left( \int dx_1\, \rho(x_1, x_2) - P_2(x_2) \right) \right], \tag{23}$$
leading to
μ1 ( x1 ) + μ2 ( x2 ) = φ (ρ( x1 , x2 ), ϕ1 ( x1 ) ϕ2 ( x2 )) . (24)
The updating is being designed so that ϕ1 ϕ2 → P1 P2 , as the independent subsystems are being updated
based on expectation values which are silent about correlations. DC2 thus imposes,
μ1 ( x1 ) + μ2 ( x2 ) = φ ( P1 ( x1 ) P2 ( x2 ), ϕ1 ( x1 ) ϕ2 ( x2 )) . (25)
The left-hand side is independent of x2 so we can perform a trick similar to the one we used before.
Suppose we had chosen a different constraint P2′(x2) that differs from P2(x2) and a new prior ϕ2′(x2)
that differs from ϕ2(x2) except at the value x̄2. At the value x̄2, the multiplier μ1(x1) remains unchanged
for all P2′(x2), ϕ2′(x2), and thus x2. This means that any dependence that the right-hand side might
potentially have had on x2 and on the choice of P2(x2), ϕ2(x2) must cancel out, leaving μ1(x1)
unchanged. That is, the Lagrange multiplier μ2(x2) “pushes out” these dependences such that
$$\varphi(P_1(x_1) P_2(x_2), \phi_1(x_1) \phi_2(x_2)) - \mu_2(x_2) = g(P_1(x_1), \phi_1(x_1)). \tag{27}$$
μ2 ( x2 ) = φ ( P2 ( x2 ), ϕ2 ( x2 )) . (29)
Case 2—Conclusion: Substituting back into (25) gives us a functional equation for φ ,
φ ( P1 P2 , ϕ1 ϕ2 ) = φ ( P1 , ϕ1 ) + φ ( P2 , ϕ2 ) . (30)
The general solution for this functional equation is derived in Appendix A.3, and is
$$\varphi(\rho, \phi) = a_1 \ln(\rho(x)) + a_2 \ln(\phi(x)), \tag{31}$$
where a1, a2 are constants. The constants are fixed by using DC1’. Letting ρ1(x1) = ϕ1(x1) = ϕ1 gives
φ( ϕ, ϕ) = 0 by DC1’, and, therefore,
φ( ϕ, ϕ) = ( a1 + a2 ) ln( ϕ) = 0, (32)
so we are forced to conclude a1 = −a2 for arbitrary ϕ. Letting a1 ≡ A = −|A| such that we are really
maximizing the entropy (although this is purely aesthetic) gives the general form of φ to be
$$\varphi(\rho, \phi) = -|A| \ln\left( \frac{\rho(x)}{\phi(x)} \right). \tag{33}$$
As long as A ≠ 0, the value of A is arbitrary as it always can be absorbed into the Lagrange multipliers.
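A quick numerical check makes the eliminative character of the argument concrete. The sketch below evaluates the functional equation (30) for the logarithmic form (33) and for a hypothetical alternative, φ = ρ − ϕ, which satisfies DC1’ but is not singled out by the text. The sample values and function names are assumptions made for illustration.

```python
import math

def phi_log(rho, prior, A=1.0):
    """Logarithmic form (33): -|A| ln(rho / prior)."""
    return -abs(A) * math.log(rho / prior)

def phi_linear(rho, prior):
    """Hypothetical alternative that vanishes at rho = prior, as DC1' requires."""
    return rho - prior

def additivity_gap(phi, P1, P2, f1, f2):
    """|phi(P1 P2, f1 f2) - (phi(P1, f1) + phi(P2, f2))|, zero when (30) holds."""
    return abs(phi(P1 * P2, f1 * f2) - (phi(P1, f1) + phi(P2, f2)))

P1, P2, f1, f2 = 0.3, 0.7, 0.5, 0.2
gap_log = additivity_gap(phi_log, P1, P2, f1, f2)
gap_linear = additivity_gap(phi_linear, P1, P2, f1, f2)
print(gap_log, gap_linear)
```

Both candidates pass DC1’, but only the logarithmic form also satisfies the subsystem condition, which is the sense in which competing forms are eliminated rather than the surviving form being proven correct.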
The general form of the entropy designed for the purpose of inference of ρ is found by integrating φ,
and, therefore,
$$S(\rho(x), \phi(x)) = -|A| \int dx \left( \rho(x) \ln\frac{\rho(x)}{\phi(x)} - \rho(x) \right) + C[\phi]. \tag{34}$$
The constant in ρ, C[ϕ], will always drop out when varying ρ. The apparent extra term (|A| ∫ρ(x)dx)
from integration cannot be dropped while simultaneously satisfying DC1’, which requires ρ(x) = ϕ(x)
in the absence of constraints or when there is no change to one’s information. In previous versions
where the integration term (|A| ∫ρ(x)dx) is dropped, one obtains solutions like ρ(x) = e^{−1} ϕ(x)
(independent of whether ϕ(x) was previously normalized or not) in the absence of new information.
Obviously, this factor can be taken care of by normalization, and, in this way, both forms of the
entropy are equally valid; however, this form of the entropy better adheres to the PMU through DC1’.
Given that we may regularly impose normalization, we may drop the extra ∫ρ(x)dx term and C[ϕ].
For convenience then, (34) becomes
$$S(\rho(x), \phi(x)) \to S^{*}(\rho(x), \phi(x)) = -|A| \int dx\, \rho(x) \ln\frac{\rho(x)}{\phi(x)}, \tag{35}$$
which is a special case when the normalization constraint is being applied. Given normalization is
applied, the same selected posterior ρ(x) maximizes both S(ρ(x), ϕ(x)) and S∗(ρ(x), ϕ(x)), and the
star notation may be dropped.
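The role of the extra term can be seen by varying both forms with no constraints imposed. The following worked step is a sketch that uses only (34) and (35):

```latex
\begin{align}
\frac{\delta S}{\delta\rho(x)}
  &= -|A| \left[ \ln\frac{\rho(x)}{\phi(x)} + 1 - 1 \right]
   = -|A| \ln\frac{\rho(x)}{\phi(x)} = 0
  \;\Rightarrow\; \rho(x) = \phi(x), \\
\frac{\delta S^{*}}{\delta\rho(x)}
  &= -|A| \left[ \ln\frac{\rho(x)}{\phi(x)} + 1 \right] = 0
  \;\Rightarrow\; \rho(x) = e^{-1}\phi(x).
\end{align}
```

The −ρ(x) term in (34) cancels the +1 produced by differentiating ρ ln ρ, so the unconstrained maximum sits exactly at the prior. Without it, the same result is recovered only after normalization is imposed.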
Remarks
It can be seen that the relative entropy is invariant under coordinate transformations. This implies
that a system of coordinates carries no information and it is the “character” of the probability
distributions that are being ranked against one another rather than the specific set of propositions or
microstates they describe.
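The invariance can be verified in one line. As a sketch, assume an invertible, differentiable change of coordinates x → x′(x), with primes denoting the transformed densities (notation introduced here for illustration). Probabilities, not densities, are preserved, so

```latex
\begin{align}
\rho(x)\,dx = \rho'(x')\,dx', \quad \phi(x)\,dx = \phi'(x')\,dx'
  &\;\Rightarrow\; \frac{\rho(x)}{\phi(x)} = \frac{\rho'(x')}{\phi'(x')}, \\
-|A| \int dx\, \rho(x) \ln\frac{\rho(x)}{\phi(x)}
  &= -|A| \int dx'\, \rho'(x') \ln\frac{\rho'(x')}{\phi'(x')} .
\end{align}
```

The Jacobian factors cancel in the ratio ρ/ϕ, which is why the relative entropy, unlike an entropy built from ρ alone, does not depend on the coordinates chosen.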
The general solution to the maximum entropy procedure with respect to N linear constraints in ρ,
⟨Ai(x)⟩, and normalization gives a canonical-like selected posterior probability distribution,
$$\rho(x) = \phi(x) \exp\left( \sum_i \alpha_i A_i(x) \right). \tag{36}$$
The positive constant | A| may always be absorbed into the Lagrange multipliers so we may let it equal
unity without loss of generality. DC1’ is fully realized when we maximize with respect to a constraint
on ρ(x) that is already held by ϕ(x), such as ⟨x²⟩ = ∫ x² ρ(x) dx, which happens to have the same
value as ⟨x²⟩ϕ = ∫ x² ϕ(x) dx; its Lagrange multiplier is then forcibly zero, α1 = 0 (as can be seen in
(36) using (34)), in agreement with Jaynes. This gives the expected result ρ(x) = ϕ(x) as there is no
new information. Our design has arrived at a refined maximum entropy method [12] as a universal
probability updating procedure [38].
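The updating procedure can be sketched for a discrete variable with a single constraint. The code below assumes the exponential form (36) with normalization handled by dividing by the sum of weights, and finds the multiplier by bisection, which is a numerical choice made here and not something the text prescribes. The grid, prior, and target values are hypothetical.

```python
import math

def posterior(prior, A, alpha):
    """Exponential form (36), normalized by the sum of the weights."""
    weights = [p * math.exp(alpha * a) for p, a in zip(prior, A)]
    Z = sum(weights)
    return [w / Z for w in weights]

def expectation(dist, A):
    return sum(d * a for d, a in zip(dist, A))

def solve_alpha(prior, A, target, lo=-50.0, hi=50.0, tol=1e-12):
    """Bisection on alpha; <A> increases monotonically with alpha."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if expectation(posterior(prior, A, mid), A) < target:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

x = [-1.0, 0.0, 1.0, 2.0]
A = [xi ** 2 for xi in x]          # constrain <x^2>
prior = [0.1, 0.4, 0.3, 0.2]

# Constraint already satisfied by the prior: no new information.
alpha_same = solve_alpha(prior, A, expectation(prior, A))

# Constraint that differs from the prior's value: genuine update.
alpha_new = solve_alpha(prior, A, 2.0)
updated = posterior(prior, A, alpha_new)
print(alpha_same, alpha_new, updated)
```

When the constraint value equals the prior's own expectation, the multiplier comes out numerically zero and the posterior coincides with the prior, which is the DC1’ behavior described above.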
where Tr(...) is the trace. We wish to maximize this entropy with respect to expectation value
constraints, such as ⟨A⟩ = Tr(Âρ̂) on ρ̂. Using the Lagrange multiplier method to maximize the
entropy with respect to ⟨A⟩ and normalization, and setting the variation equal to zero,
$$\delta\left( S(\hat\rho, \hat\phi) - \lambda[\mathrm{Tr}(\hat\rho) - 1] - \alpha[\mathrm{Tr}(\hat A \hat\rho) - \langle A \rangle] \right) = 0, \tag{38}$$
where λ and α are the Lagrange multipliers for the respective constraints. Because S(ρ̂, ϕ̂) is a real
number, we inevitably require δS to be real, but without imposing this directly, we find that requiring
δS to be real requires ρ̂, Â to be Hermitian. At this point, it is simpler to allow for arbitrary variations
of ρ̂ such that,
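$$\mathrm{Tr}\left( \left( \frac{\delta S(\hat\rho, \hat\phi)}{\delta\hat\rho^{T}} - \lambda \hat 1 - \alpha \hat A \right) \delta\hat\rho \right) = 0. \tag{39}$$
For these arbitrary variations, the variational derivative of S must satisfy
$$\frac{\delta S(\hat\rho, \hat\phi)}{\delta\hat\rho^{T}} = \lambda \hat 1 + \alpha \hat A \tag{40}$$
at the maximum. As in the remark earlier, all forms of S that give the correct form of δS(ρ̂, ϕ̂)/δρ̂ᵀ under
variation are equally valid for the purpose of inference. For notational convenience, we let
$$\frac{\delta S(\hat\rho, \hat\phi)}{\delta\hat\rho^{T}} \equiv \varphi(\hat\rho, \hat\phi), \tag{41}$$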
which is a matrix valued function of the posterior and prior density matrices. The form of φ(ρ̂, ϕ̂) is
already “local” in ρ̂ (the variational derivative is with respect to the whole density matrix), so we don’t
need to constrain it further as we did in the original DC1.
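The appearance of the transpose in (39) can be made explicit in components. The following is a sketch, assuming the convention (δS/δρ̂ᵀ)_{ji} ≡ ∂S/∂ρ_{ij} for the matrix-valued derivative; the component notation is introduced here for illustration.

```latex
\begin{align}
\delta S &= \sum_{ij} \frac{\partial S}{\partial \rho_{ij}}\, \delta\rho_{ij}
          = \sum_{ij} \left( \frac{\delta S}{\delta\hat\rho^{T}} \right)_{ji} \delta\rho_{ij}
          = \mathrm{Tr}\!\left( \frac{\delta S}{\delta\hat\rho^{T}}\, \delta\hat\rho \right), \\
\delta\, \mathrm{Tr}(\hat\rho) &= \mathrm{Tr}(\hat 1\, \delta\hat\rho), \qquad
\delta\, \mathrm{Tr}(\hat A \hat\rho) = \mathrm{Tr}(\hat A\, \delta\hat\rho).
\end{align}
```

Collecting the three terms under a single trace reproduces (39), and requiring it to vanish for every variation δρ̂ gives (40).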
DC1’: In the absence of new information, the new state ρ̂ is equal to the old state ϕ̂
Applied to the ranking of density matrices, in the absence of new information, the density matrix
ϕ̂ should not change, that is, the posterior density matrix ρ̂ = ϕ̂ is equal to the prior density matrix.
Maximizing the entropy without applying any constraints gives,
$$\frac{\delta S(\hat\rho, \hat\phi)}{\delta\hat\rho^{T}} = \hat 0, \tag{42}$$
and DC1’ then imposes the condition
$$\frac{\delta S(\hat\rho, \hat\phi)}{\delta\hat\rho^{T}} = \varphi(\hat\rho, \hat\phi) = \varphi(\hat\phi, \hat\phi) = \hat 0. \tag{43}$$
As in the original DC1’, if ϕ̂ is known to obey some expectation value ⟨Â⟩, and one then goes
out of their way to constrain ρ̂ to that expectation value and nothing else, it follows from the PMU that
ρ̂ = ϕ̂, as no information has been gained. This is not imposed directly but can be verified later.
DC2: Subsystem Independence
The discussion of DC2 is the same as the standard relative entropy DC2—it is not a consistency
requirement, and the updating is designed so that the independence reflected in the prior is maintained
in the posterior by default via the PMU when the information provided is silent about correlations.
DC2 Implementation
Consider a composite system living in the Hilbert space H = H1 ⊗ H2 . Assume that all prior
evidence led us to believe the systems were independent. This is reflected in the prior density matrix:
if the individual system priors are ϕ̂1 and ϕ̂2 , then the joint prior for the whole system is ϕ̂1 ⊗ ϕ̂2 .
Further suppose that new information is acquired such that ϕ̂1 would by itself be updated to ρ̂1 and that
ϕ̂2 would by itself be updated to ρ̂2. By design, the implementation of DC2 constrains the entropy
functional such that in this case, the joint product prior density matrix ϕ̂1 ⊗ ϕ̂2 updates to the product
posterior ρ̂1 ⊗ ρ̂2 so that inferences about one do not affect inferences about the other.
The argument below is considerably simplified if we expand the space of density matrices to
include density matrices that are not necessarily normalized. This does not represent any limitation
because normalization can always be easily achieved as one additional constraint. We consider a few
special cases below:
Case 1: We receive the extremely constraining information that the posterior distribution for system 1
is completely specified to be ρ̂1 while we receive no information about system 2 at all. We treat the
two systems jointly. Maximize the joint entropy S[ρ̂12 , ϕ̂1 ⊗ ϕ̂2 ], subject to the following constraints on
the ρ̂12 ,
Tr2 (ρ̂12 ) = ρ̂1 . (44)
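The partial trace in (44) can be written out in components. The sketch below uses nested lists and real symmetric 2 × 2 matrices for simplicity (density matrices are complex Hermitian in general); the helper names and the example matrices are hypothetical. It checks that tracing system 2 out of a product state returns the state of system 1.

```python
def kron(A, B):
    """Tensor (Kronecker) product of two square matrices given as nested lists."""
    n, m = len(A), len(B)
    return [[A[i // m][j // m] * B[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def partial_trace_2(rho12, d1, d2):
    """Trace out subsystem 2: (Tr_2 rho)_{ij} = sum_a rho_{(i,a),(j,a)}."""
    return [[sum(rho12[i * d2 + a][j * d2 + a] for a in range(d2))
             for j in range(d1)]
            for i in range(d1)]

# Hypothetical unit-trace, positive matrices for systems 1 and 2.
rho1 = [[0.7, 0.2], [0.2, 0.3]]
phi2 = [[0.6, 0.1], [0.1, 0.4]]

rho12 = kron(rho1, phi2)
reduced = partial_trace_2(rho12, 2, 2)
print(reduced)
```

Because every matrix element of the reduced state is fixed by (44), the constraint is matrix-valued, which is why a matrix of Lagrange multipliers is needed in the text that follows.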
Notice all of the N² elements in H1 of ρ̂12 are being constrained. We therefore need a Lagrange
multiplier which spans H1 and therefore it is a square matrix λ̂1. This is readily seen by observing the
component form expressions of the Lagrange multipliers (λ̂1)ij = λij. Maximizing the entropy with
respect to this H2 independent constraint is
$$0 = \delta\Big( S - \sum_{ij} \lambda_{ij} \big( \mathrm{Tr}_2(\hat\rho_{12}) - \hat\rho_1 \big)_{ij} \Big), \tag{45}$$
but reexpressing this with its transpose (λ̂1)ij = (λ̂1ᵀ)ji, gives
$$0 = \delta\Big( S - \mathrm{Tr}_1\big( \hat\lambda_1 [\mathrm{Tr}_2(\hat\rho_{12}) - \hat\rho_1] \big) \Big), \tag{46}$$
where we have relabeled λ̂1ᵀ → λ̂1, for convenience, as the names of the Lagrange multipliers are
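arbitrary. For arbitrary variations of ρ̂12, we therefore have
$$\hat\lambda_1 \otimes \hat 1_2 = \varphi(\hat\rho_{12}, \hat\phi_1 \otimes \hat\phi_2). \tag{47}$$
DC2 is implemented by requiring ϕ̂1 ⊗ ϕ̂2 → ρ̂1 ⊗ ϕ̂2, such that the function φ is designed to reflect
subsystem independence in this case; therefore, we have
$$\hat\lambda_1 \otimes \hat 1_2 = \varphi(\hat\rho_1 \otimes \hat\phi_2, \hat\phi_1 \otimes \hat\phi_2). \tag{48}$$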
Had we chosen a different prior ϕ̂2′ = ϕ̂2 + δϕ̂2, for all δϕ̂2 the LHS λ̂1 ⊗ 1̂2 remains unchanged given
that φ is independent of scalar functions of ϕ̂2 (I would like to thank M. Krumm for pointing this out),
as those could be lumped into λ̂1 while keeping ρ̂1 fixed. The potential dependence on scalar functions
of ϕ̂2 can be removed by imposing DC2 in a subsystem independent situation where ρ̂1 in φ need not
be fixed under variations of ϕ̂2. The resulting equation in such a situation, for instance maximizing the
entropy of an independent joint prior with respect to Tr(Â1 ⊗ 1̂2 · ρ̂12) = ⟨A⟩, facilitated by a scalar
Lagrange multiplier λ, and after imposing DC2, is
$$\lambda\, \hat A_1 \otimes \hat 1_2 = \varphi(\hat\rho_1 \otimes \hat\phi_2, \hat\phi_1 \otimes \hat\phi_2). \tag{49}$$
For subsystem independence to be imposed here, ρ̂1 must be independent of variations in ϕ̂2 , and,
therefore, in a general subsystem independent case, φ is independent of scalar functions of ϕ̂2 .
This means that any dependence that the right-hand side of (48) might potentially have had on
ϕ̂2 must drop out, meaning,
φ (ρ̂1 ⊗ ϕ̂2 , ϕ̂1 ⊗ ϕ̂2 ) = f (ρ̂1 , ϕ̂1 ) ⊗ 1̂2 . (50)
Since ϕ̂2 is arbitrary, suppose further that we choose a unit prior, ϕ̂2 = 1̂2 , and note that ρ̂1 ⊗ 1̂2 and
ϕ̂1 ⊗ 1̂2 are block diagonal in H2 . Because the LHS is block diagonal in H2 ,
f(ρ̂1, ϕ̂1) ⊗ 1̂2 = φ(ρ̂1 ⊗ 1̂2, ϕ̂1 ⊗ 1̂2). (51)
Entropy 2017, 19, 664
The RHS is block diagonal in H2 and, because the function φ is understood to be a power series
expansion in its arguments,
f(ρ̂1, ϕ̂1) ⊗ 1̂2 = φ(ρ̂1 ⊗ 1̂2, ϕ̂1 ⊗ 1̂2) = φ(ρ̂1, ϕ̂1) ⊗ 1̂2. (52)
This gives
λ̂1 ⊗ 1̂2 = φ (ρ̂1 , ϕ̂1 ) ⊗ 1̂2 , (53)
and, therefore, the 1̂2 factors out and λ̂1 = φ (ρ̂1 , ϕ̂1 ). A similar argument exchanging systems 1 and 2
shows λ̂2 = φ (ρ̂2 , ϕ̂2 ).
Case 1—Conclusion: The analysis leads us to conclude that when the system 2 is not updated,
the dependence on ϕ̂2 drops out,
φ(ρ̂1 ⊗ ϕ̂2, ϕ̂1 ⊗ ϕ̂2) = φ(ρ̂1, ϕ̂1) ⊗ 1̂2, (54)
and, similarly,
φ(ϕ̂1 ⊗ ρ̂2, ϕ̂1 ⊗ ϕ̂2) = 1̂1 ⊗ φ(ρ̂2, ϕ̂2). (55)
Case 2: Now consider a different special case in which the marginal posterior distributions for systems
1 and 2 are both completely specified to be ρ̂1 and ρ̂2 , respectively. Maximize the joint entropy,
S[ρ̂12, ϕ̂1 ⊗ ϕ̂2], subject to the following constraints on ρ̂12,
Tr2(ρ̂12) = ρ̂1 and Tr1(ρ̂12) = ρ̂2, (56)
where Tri(...) is the partial trace, that is, a trace over the vectors in
Hi. Here, each expectation value constrains the entire space Hi, where ρ̂i lives. The Lagrange
multipliers must span their respective spaces, so we implement the constraint with the Lagrange
multiplier operator μ̂i , then,
0 = δ S − Tr1 (μ̂1 [Tr2 (ρ̂12 ) − ρ̂1 ]) − Tr2 (μ̂2 [Tr1 (ρ̂12 ) − ρ̂2 ]) . (57)
By design, DC2 is implemented by requiring ϕ̂1 ⊗ ϕ̂2 → ρ̂1 ⊗ ρ̂2 in this case; therefore, we have
μ̂1 ⊗ 1̂2 + 1̂1 ⊗ μ̂2 = φ(ρ̂1 ⊗ ρ̂2, ϕ̂1 ⊗ ϕ̂2). (59)
Write (59) as
μ̂1 ⊗ 1̂2 = φ (ρ̂1 ⊗ ρ̂2 , ϕ̂1 ⊗ ϕ̂2 ) − 1̂1 ⊗ μ̂2 . (60)
The LHS is independent of changes that might occur in H2 on the RHS of (60). This means that any
variation of ρ̂2 and ϕ̂2 must be “pushed out” by μ̂2, which removes the dependence on ρ̂2 and ϕ̂2 in φ.
Any dependence that the RHS might potentially have had on ρ̂2 , ϕ̂2 must cancel out in a general
subsystem independent case, leaving μ̂1 unchanged. Consequently,
φ (ρ̂1 ⊗ ρ̂2 , ϕ̂1 ⊗ ϕ̂2 ) − 1̂1 ⊗ μ̂2 = g(ρ̂1 , ϕ̂1 ) ⊗ 1̂2 . (61)
Because g(ρ̂1, ϕ̂1) is independent of arbitrary variations of ρ̂2 and ϕ̂2 on the LHS above, the equation is
satisfied equally well for all choices. The form of g(ρ̂1, ϕ̂1) reduces to the form of f(ρ̂1, ϕ̂1) from Case 1 when
ρ̂2 = ϕ̂2 = 1̂2 and, similarly, DC1’ gives μ̂2 = 0. Therefore, the Lagrange multiplier is
μ̂1 ⊗ 1̂2 = φ(ρ̂1, ϕ̂1) ⊗ 1̂2, (62)
and a similar analysis for μ̂2 gives
1̂1 ⊗ μ̂2 = 1̂1 ⊗ φ(ρ̂2, ϕ̂2). (63)
Case 2—Conclusion: Substituting back into (59) gives us a functional equation for φ ,
φ(ρ̂1 ⊗ ρ̂2 , ϕ̂1 ⊗ ϕ̂2 ) = φ(ρ̂1 , ϕ̂1 ) ⊗ 1̂2 + 1̂1 ⊗ φ(ρ̂2 , ϕ̂2 ), (64)
which is
φ(ρ̂1 ⊗ ρ̂2 , ϕ̂1 ⊗ ϕ̂2 ) = φ(ρ̂1 ⊗ 1̂2 , ϕ̂1 ⊗ 1̂2 ) + φ(1̂1 ⊗ ρ̂2 , 1̂1 ⊗ ϕ̂2 ). (65)
The general solution to this matrix valued functional equation is derived in Appendix A.5 and is
$$\phi(\hat{\rho}, \hat{\varphi}) = \tilde{A} \ln(\hat{\rho}) + \tilde{B} \ln(\hat{\varphi}), \qquad (66)$$
where Ã is a “super-operator” having constant coefficients and twice the number of indices as ρ̂
and ϕ̂, as discussed in the Appendix (i.e., $(\tilde{A} \ln(\hat{\rho}))_{ij} = \sum_{k\ell} A_{ijk\ell} (\log(\hat{\rho}))_{k\ell}$, and similarly for B̃ ln(ϕ̂)).
DC1’ imposes
$$\phi(\hat{\varphi}, \hat{\varphi}) = \tilde{A} \ln(\hat{\varphi}) + \tilde{B} \ln(\hat{\varphi}) = \hat{0}, \qquad (67)$$
which is satisfied in general when Ã = −B̃, and, now,
$$\phi(\hat{\rho}, \hat{\varphi}) = \tilde{A} \big( \ln(\hat{\rho}) - \ln(\hat{\varphi}) \big). \qquad (68)$$
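We may fix the constant Ã by substituting our solution into the RHS of Equation (64), which is equal
to the RHS of Equation (65),
$$\begin{aligned}
&\tilde{A}_1 \big( \ln(\hat{\rho}_1) - \ln(\hat{\varphi}_1) \big) \otimes \hat{1}_2 + \hat{1}_1 \otimes \tilde{A}_2 \big( \ln(\hat{\rho}_2) - \ln(\hat{\varphi}_2) \big) \\
&\quad = \tilde{A}_{12} \big( \ln(\hat{\rho}_1 \otimes \hat{1}_2) - \ln(\hat{\varphi}_1 \otimes \hat{1}_2) \big) + \tilde{A}_{12} \big( \ln(\hat{1}_1 \otimes \hat{\rho}_2) - \ln(\hat{1}_1 \otimes \hat{\varphi}_2) \big),
\end{aligned} \qquad (69)$$
where Ã12 acts on the joint space of 1 and 2, and Ã1 and Ã2 act on the single subspaces 1 and 2, respectively.
Using the well-known log tensor product identity in this case, ln(ρ̂1 ⊗ 1̂2) = ln(ρ̂1) ⊗ 1̂2 (the proof is
demonstrated by taking the log of ρ̂1 ⊗ 1̂2 ≡ exp(ρ̂1′) ⊗ 1̂2 = exp(ρ̂1′ ⊗ 1̂2) and substituting ρ̂1′ = log(ρ̂1)),
the RHS of Equation (69) becomes
$$= \tilde{A}_{12} \big( \ln(\hat{\rho}_1) \otimes \hat{1}_2 - \ln(\hat{\varphi}_1) \otimes \hat{1}_2 \big) + \tilde{A}_{12} \big( \hat{1}_1 \otimes \ln(\hat{\rho}_2) - \hat{1}_1 \otimes \ln(\hat{\varphi}_2) \big). \qquad (70)$$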
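The log tensor product identity used to pass from (69) to (70) can be checked numerically. The sketch below is an illustration under stated assumptions: the helper `herm_log` (a matrix logarithm for Hermitian positive-definite matrices via eigendecomposition) and the example matrix are hypothetical choices, not taken from the text.

```python
import numpy as np

def herm_log(m):
    """Matrix logarithm of a Hermitian positive-definite matrix."""
    w, v = np.linalg.eigh(m)
    # v @ diag(log w) @ v^dagger
    return (v * np.log(w)) @ v.conj().T

# Hypothetical positive-definite matrix on system 1; identity on a 3-level system 2.
rho1 = np.array([[0.7, 0.2], [0.2, 0.3]])
id2 = np.eye(3)

lhs = herm_log(np.kron(rho1, id2))   # ln(rho1 (x) 1_2)
rhs = np.kron(herm_log(rho1), id2)   # ln(rho1) (x) 1_2
max_deviation = np.max(np.abs(lhs - rhs))
```

The quantity `max_deviation` should be at the level of floating-point rounding, in line with the identity.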
As Ã12, Ã1, and Ã2 are constant tensors, inspecting the above equalities determines the form of
the tensor to be Ã = A 1̃, where A is a scalar constant and 1̃ is the super-operator identity over the
appropriate (joint) Hilbert space.
Because our goal is to maximize the entropy function, we let the arbitrary constant A = −|A| and
distribute 1̃ identically, which gives the final functional form,
φ(ρ̂, ϕ̂) = −|A| (ln(ρ̂) − ln(ϕ̂)). (73)
Integrating φ with respect to ρ̂ gives
S(ρ̂, ϕ̂) = −|A| Tr(ρ̂ log ρ̂ − ρ̂ log ϕ̂ − ρ̂) + C[ϕ̂] = −|A| S_U(ρ̂, ϕ̂) + |A| Tr(ρ̂) + C[ϕ̂], (74)
where S_U(ρ̂, ϕ̂) is Umegaki’s form of the relative entropy [42–44], the extra |A|Tr(ρ̂) from integration
is an artifact present for the preservation of DC1’, and C[ϕ̂] is a constant in the sense that it drops out
under arbitrary variations of ρ̂. This entropy leads to the same inferences as Umegaki’s form of the
entropy, with the added bonus that ρ̂ = ϕ̂ in the absence of constraints or changes in information, rather
than ρ̂ = e⁻¹ϕ̂, which would be given by maximizing Umegaki’s form of the entropy. In this sense,
the extra |A|Tr(ρ̂) only improves the inference process, as it more readily adheres to the PMU through
DC1’; however, because S_U ≥ 0, we now have S(ρ̂, ϕ̂) ≤ Tr(ρ̂) + C[ϕ̂], which is of little nuisance.
In the spirit of this derivation, we will keep the Tr(ρ̂) term there, but, for all practical purposes of
inference, as long as there is a normalization constraint, it plays no role, and we find (letting | A| = 1
and C [ ϕ̂] = 0),
S(ρ̂, ϕ̂) → S∗ (ρ̂, ϕ̂) = −SU (ρ̂, ϕ̂) = −Tr(ρ̂ log ρ̂ − ρ̂ log ϕ̂), (75)
Umegaki’s form of the relative entropy. S∗ (ρ̂, ϕ̂) is an equally valid entropy because, given normalization
is applied, the same selected posterior ρ̂ maximizes both S(ρ̂, ϕ̂) and S∗ (ρ̂, ϕ̂).
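A short worked variation may make the role of the extra Tr(ρ̂) term explicit. It assumes unconstrained Hermitian variations δρ̂ and the standard identity δTr(ρ̂ ln ρ̂) = Tr[(ln ρ̂ + 1̂) δρ̂]; it is a sketch of the unconstrained stationarity conditions only, with no normalization constraint imposed.

```latex
\begin{aligned}
\delta S &= -|A|\,\mathrm{Tr}\big[(\ln\hat{\rho} + \hat{1} - \ln\hat{\varphi} - \hat{1})\,\delta\hat{\rho}\big]
          = -|A|\,\mathrm{Tr}\big[(\ln\hat{\rho} - \ln\hat{\varphi})\,\delta\hat{\rho}\big] = 0
          \;\Rightarrow\; \hat{\rho} = \hat{\varphi}, \\
\delta(-S_U) &= -\mathrm{Tr}\big[(\ln\hat{\rho} + \hat{1} - \ln\hat{\varphi})\,\delta\hat{\rho}\big] = 0
          \;\Rightarrow\; \ln\hat{\rho} = \ln\hat{\varphi} - \hat{1}
          \;\Rightarrow\; \hat{\rho} = e^{-1}\hat{\varphi}.
\end{aligned}
```

Once a normalization constraint is added, its Lagrange multiplier absorbs the constant shift by 1̂, so both functionals select the same normalized posterior.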
3.2. Remarks
Due to the universality and the equal application of the PMU by using the same design criteria
for both the standard and quantum case, the quantum relative entropy reduces to the standard relative
entropy when [ρ̂, ϕ̂] = 0 or when the experiment being performed, ρ̂ → ρ(a) = Tr(ρ̂|a⟩⟨a|), is known.
The quantum relative entropy we derive has the correct asymptotic form of the standard relative
entropy in the sense of [8–10]. Further connections will be illustrated in a follow-up article that is
concerned with direct applications of the quantum relative entropy. Because the two entropies are derived
in parallel, we expect the well-known inferential results and consequences of the relative entropy to
have a quantum relative entropy representation.
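The reduction to the standard relative entropy for commuting matrices can be illustrated numerically. This is a minimal sketch: the function names `umegaki` and `kl_divergence`, the probability vectors, and the rotation angle are all hypothetical choices made for the example.

```python
import numpy as np

def herm_log(m):
    """Matrix logarithm of a Hermitian positive-definite matrix."""
    w, v = np.linalg.eigh(m)
    return (v * np.log(w)) @ v.conj().T

def umegaki(rho, phi):
    """S_U(rho, phi) = Tr(rho log rho - rho log phi) for positive-definite inputs."""
    return float(np.real(np.trace(rho @ (herm_log(rho) - herm_log(phi)))))

def kl_divergence(p, q):
    """Standard relative entropy sum_i p_i log(p_i / q_i)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.sum(p * np.log(p / q)))

p = np.array([0.6, 0.3, 0.1])
q = np.array([0.2, 0.5, 0.3])

# Rotate both diagonal matrices by the same orthogonal matrix so that they still commute.
theta = 0.4
u = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])
rho = u @ np.diag(p) @ u.T
phi = u @ np.diag(q) @ u.T

commutator_norm = np.linalg.norm(rho @ phi - phi @ rho)
quantum_value = umegaki(rho, phi)
classical_value = kl_divergence(p, q)
```

With a vanishing commutator, `quantum_value` and `classical_value` should agree up to rounding; for non-commuting inputs no such agreement is expected.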
Maximizing the quantum relative entropy with respect to some constraints ⟨Âi⟩, where {Âi} are
a set of arbitrary Hermitian operators, and normalization ⟨1̂⟩ = 1, gives the following general solution
for the posterior density matrix:
$$\hat{\rho} = \exp\Big( \alpha_0 \hat{1} + \sum_i \alpha_i \hat{A}_i + \ln(\hat{\varphi}) \Big) = \frac{1}{Z} \exp\Big( \sum_i \alpha_i \hat{A}_i + \ln(\hat{\varphi}) \Big) \equiv \frac{1}{Z} \exp\big( \hat{C} \big), \qquad (76)$$
where αi are the Lagrange multipliers of the respective constraints and normalization may be factored
out of the exponential in general because the identity commutes universally. If ϕ̂ ∝ 1̂, it is well
known that the analysis arrives at the same expression for ρ̂ after normalization, as it would if the
von Neumann entropy were used, and thus one can find expressions for thermalized quantum states
ρ̂ = (1/Z) e^{−βĤ}. The remaining problem is to solve for the N Lagrange multipliers using their N associated
expectation value constraints. In principle, their solution is found by computing Z and using standard
methods from Statistical Mechanics,
$$\langle \hat{A}_i \rangle = \frac{\partial}{\partial \alpha_i} \ln(Z), \qquad (77)$$
and inverting to find αi = αi(⟨Âi⟩), which has a unique solution due to the joint concavity (convexity
depending on the sign convention) of the quantum relative entropy [8,9] when the constraints are
linear in ρ̂. The simple proof that (77) is monotonic in α, and therefore invertible, is that its derivative is
∂⟨Âi⟩/∂αi = ⟨Âi²⟩ − ⟨Âi⟩² ≥ 0. Between the Zassenhaus formula [45],
$$e^{t(\hat{A}+\hat{B})} = e^{t\hat{A}}\, e^{t\hat{B}}\, e^{-\frac{t^2}{2}[\hat{A},\hat{B}]}\, e^{\frac{t^3}{6}\left(2[\hat{B},[\hat{A},\hat{B}]] + [\hat{A},[\hat{A},\hat{B}]]\right)} \cdots, \qquad (78)$$
and Horn’s inequality [46–48], the solutions to (77) lack a certain calculational elegance because it is
difficult to express the eigenvalues of Ĉ = log(ϕ̂) + ∑ αi Âi (in the exponential) in simple terms of the
eigenvalues of the Âi’s and ϕ̂, in general, when the matrices do not commute. The solution requires
solving the eigenvalue problem for Ĉ, such that the exponential of Ĉ may be taken and evaluated in
terms of the eigenvalues of the αi Âi’s and the prior density matrix ϕ̂. A pedagogical exercise is starting
with a prior that is a mixture of spin-z up and down, ϕ̂ = a|+⟩⟨+| + b|−⟩⟨−| (a, b ≠ 0), and maximizing
the quantum relative entropy with respect to an expectation of a general Hermitian operator with
which the prior density matrix does not commute. This example for spin is given in Appendix B.
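The structure of (76) and its relation to ln Z can be explored numerically for a single constraint. The sketch below is an illustration under assumptions: the helper names (`herm_fun`, `maxent_posterior`), the prior, and the choice of σx as constraint operator are hypothetical. With Ĉ = αÂ + ln(ϕ̂) and Z = Tr exp(Ĉ), the derivative of ln Z with respect to α is compared against Tr(ρ̂Â) by a central finite difference.

```python
import numpy as np

def herm_fun(m, f):
    """Apply a scalar function f to a Hermitian matrix through its eigendecomposition."""
    w, v = np.linalg.eigh(m)
    return (v * f(w)) @ v.conj().T

def maxent_posterior(alpha, a_op, phi):
    """Return (rho, Z) with rho = exp(alpha * A + ln(phi)) / Z."""
    c_op = alpha * a_op + herm_fun(phi, np.log)
    unnormalized = herm_fun(c_op, np.exp)
    z = float(np.real(np.trace(unnormalized)))
    return unnormalized / z, z

# Hypothetical inputs: a prior diagonal in the z basis and the non-commuting operator sigma_x.
phi = np.diag([0.8, 0.2])
sigma_x = np.array([[0.0, 1.0], [1.0, 0.0]])

def expectation(alpha):
    rho, _ = maxent_posterior(alpha, sigma_x, phi)
    return float(np.real(np.trace(rho @ sigma_x)))

def log_z(alpha):
    return np.log(maxent_posterior(alpha, sigma_x, phi)[1])

alpha0, h = 0.7, 1e-5
dlogz = (log_z(alpha0 + h) - log_z(alpha0 - h)) / (2 * h)  # central difference
alphas = np.linspace(-3.0, 3.0, 61)
values = np.array([expectation(a) for a in alphas])        # expectation as a function of alpha
```

Here `dlogz` should match `expectation(alpha0)` to finite-difference accuracy, and `values` should increase with α, which is what makes the map from the expectation value back to α invertible in this example.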
4. Conclusions
This approach emphasizes the notion that entropy is a tool for performing inference and
downplays counter-notional issues that arise if one interprets entropy as a measure of disorder,
a measure of distinguishability, or an amount of missing information [7]. Because the same design
criteria, guided by the PMU, are applied equally well to the design of a relative and quantum relative
entropy, we find that both the relative and quantum relative entropy are designed for the purpose of
inference. Because the quantum relative entropy is the functional that fits the requirements of a tool
designed for the inference of density matrices, we now know what it is and how to use it, namely by formulating
an inferential quantum maximum entropy method. This article provides the foundation for [29], which,
in particular, derives the Quantum Bayes Rule and collapse as special cases of the quantum maximum
entropy method, as was sought in [24], analogous to the treatment in [38,40] for deriving Bayes Rule using
the standard maximum entropy method. The quantum maximum entropy method thereby unifies
a few topics in Quantum Information and Quantum Measurement through entropic inference.
Acknowledgments: I must give ample acknowledgment to Ariel Caticha who suggested the problem of justifying
the form of the quantum relative entropy as a criterion for ranking of density matrices. He cleared up several
difficulties by suggesting that design constraints be applied to the variational derivative of the entropy rather
than the entropy itself. In addition, he provided substantial improvements to the method for imposing DC2 that
led to the functional equations for the variational derivatives (φ12 = φ1 + φ2 )—with more rigor than in earlier
versions of this article. His time and guidance are all greatly appreciated—thanks, Ariel. I would also like to
thank M. Krumm, the reviewers, as well as our information physics group at UAlbany for our many intriguing
discussions about probability, inference, and quantum mechanics.
Conflicts of Interest: The author declares no conflict of interest.
Appendix A
The Appendix loosely follows the relevant sections in [49], and then uses the methods reviewed to
solve the relevant functional equations for φ. The last section is an example of the quantum maximum
entropy method applied to a mixed spin state.
Theorem A1. If Cauchy’s functional equation
f(x + y) = f(x) + f(y) (A1)
is satisfied for all real x, y, and if the function f(x) is (a) continuous at a point, (b) nonnegative for small positive
x’s, or (c) bounded in an interval, then
f(x) = cx (A2)
is the solution to (A1) for all real x. If (A1) is assumed only over all positive x, y, then under the same
conditions, (A2) holds for all positive x.
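Proof. The most natural assumption for our purposes is that f(x) is continuous at a point (which later
extends to continuity at all points, as given by Darboux [50]). Cauchy solved the functional equation by
induction. In particular, Equation (A1) implies
f(∑i xi) = ∑i f(xi), (A3)
and, setting all xi = x,
f(nx) = n f(x). (A4)
Writing x = (m/n)t for positive integers m and n gives n f(x) = f(nx) = f(mt) = m f(t), so that
f((m/n)t) = (m/n) f(t). With t = 1 and c = f(1), this is f(r) = cr for positive rational r; since (A1) also implies
f(0) = 0 and f(−x) = −f(x), it holds for all rational r, and continuity extends the result to all real x,
f(x) = cx, (A7)
which is the general solution of the linear functional equation. In principle, c can be complex.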
The importance of Cauchy’s solution is that it can be used to give general solutions to the following
Cauchy equations:
f ( x + y) = f ( x ) f ( y ), (A8)
f ( xy) = f ( x ) + f ( y ), (A9)
f ( xy) = f ( x ) f ( y ), (A10)
by performing consistent substitutions until they have the same form as (A1), as given by Cauchy. We will
briefly discuss the first two.
Theorem A2. The general solution of f(x + y) = f(x) f(y), for all real or for all positive x, y, among functions
that are continuous at one point, is f(x) = e^{cx}; in addition to the exponential solution, the solution f(0) = 1 and
f(x) = 0 for x > 0 is in these classes of functions.
The first functional equation, f(x + y) = f(x) f(y), is solved by first noting that f(x) is strictly positive for real x, y,
which can be shown by considering x = y,
f(2x) = f(x)² > 0. (A11)
If there exists an x0 with f(x0) = 0, then it follows that f(x) = f((x − x0) + x0) = f(x − x0) f(x0) = 0, a trivial solution, hence the reason
why the possibility of being equal to zero is excluded above. Given that f(x) is nowhere zero, we are justified in
taking the natural logarithm of f(x), due to its positivity, f(x) > 0. Letting g(x) = ln(f(x)), this gives
g(x + y) = g(x) + g(y), (A13)
which is Cauchy’s linear equation, and thus has the solution g(x) = cx. Because g(x) = ln(f(x)), one finds in
general that f(x) = e^{cx}.
Theorem A3. If the functional equation f ( xy) = f ( x ) + f (y) is valid for all positive x, y then its general
solution is f ( x ) = c ln( x ) given it is continuous at a point. If x = 0 (or y = 0) are valid, then the general
solution is f ( x ) = 0. If all real x, y are valid except 0, then the general solution is f ( x ) = c ln(| x |).
In particular, we are interested in the functional equation f ( xy) = f ( x ) + f (y) when x, y are positive.
In this case, we can again follow Cauchy and substitute x = eu and y = ev to get,
f ( e u e v ) = f ( e u ) + f ( e v ), (A14)
and letting g(u) = f (eu ) gives g(u + v) = g(u) + g(v). Again, the solution is g(u) = cu and, therefore,
the general solution is f(x) = c ln(x) when we substitute for u. If x could equal 0, then f(0) = f(x) + f(0),
which has the trivial solution f(x) = 0. The general solution for x ≠ 0, y ≠ 0 and x, y positive is therefore
f(x) = c ln(x).
which is the Cauchy linear functional equation having solution F ( x1 , 0, ..., 0) = c1 x1 , where F ( x1 , 0, ..., 0)
is assumed to be continuous or at least measurable majorant. Similarly,
F ( x1 , x2 , ..., xn ) = ∑ ci xi , (A19)
φ ( ρ1 ρ2 , ϕ1 ϕ2 ) = φ ( ρ1 , ϕ1 ) + φ ( ρ2 , ϕ2 ). (A20)
F ( x1 y1 , x2 y2 ) = F ( x1 , x2 ) + F ( y1 , y2 ), (A21)
F′(x1′ + y1′, x2′ + y2′) = a1(x1′ + y1′) + a2(x2′ + y2′) = a1 ln(x1 y1) + a2 ln(x2 y2) = F(x1 y1, x2 y2), (A23)
where xi′ = ln(xi) and yi′ = ln(yi).
In such a case, when ϕ(x0) = 0 for some value x0 ∈ X, we may let ϕ(x0) = ε, where ε is as close to
zero as we could possibly want; the trivial general solution φ = 0 is saturated by the special case
when ρ = ϕ from DC1’. Here, we return to the text.
f ( X̂ + Ŷ ) = f ( X̂ ) + f (Ŷ ), (A25)
where X̂ and Ŷ are n × n square matrices. Rewriting the matrix functional equation in terms of its
components gives
f ij ( x11 + y11 , x12 + y12 , ..., xnn + ynn ) = f ij ( x11 , x12 , ..., xnn ) + f ij (y11 , y12 , ..., ynn ) (A26)
for i, j = 1, ..., n. We find it convenient to introduce super indices, A = (i, j) and B = (ℓ, k), such that
the component equation becomes
f_A = ∑_B c_AB x_B, (A28)
and resembles the solution for the linear transformation of a vector from [49]. In general, we will be
discussing matrices X̂ = X̂1 ⊗ X̂2 ⊗ ... ⊗ X̂N which stem from tensor products of density matrices.
In this situation, X̂ can be thought of as a 2N-index tensor, or as a z × z matrix where z = ∏_{i=1}^{N} n_i is the
product of the ranks of the matrices in the tensor product, or even as a vector of length z². In such
a case, we may abuse the super index notation where A and B lump together the appropriate number
of indices such that (A28) is the form of the solution for the components in general. The matrix form of
the general solution is
f(X̂) = C̃ X̂, (A29)
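The super-index notation can be illustrated with a short numerical sketch, in which a four-index array of constant coefficients acts on a matrix and is equivalently viewed as a square matrix acting on the flattened argument. The random coefficients and the name `c_super` are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3

# Hypothetical constant coefficients c_{ij,kl}: four indices for n x n matrices.
c = rng.normal(size=(n, n, n, n))

def f(x):
    """Linear map f(X)_{ij} = sum_{kl} c_{ijkl} X_{kl}."""
    return np.einsum("ijkl,kl->ij", c, x)

x = rng.normal(size=(n, n))
y = rng.normal(size=(n, n))

# The linear form satisfies the matrix Cauchy equation f(X + Y) = f(X) + f(Y).
additive_gap = np.max(np.abs(f(x + y) - (f(x) + f(y))))

# Super-index view: A = (i, j) and B = (k, l) turn c into an n^2 x n^2 matrix acting on vec(X).
c_super = c.reshape(n * n, n * n)
flat_gap = np.max(np.abs(c_super @ x.reshape(-1) - f(x).reshape(-1)))
```

Both gaps should vanish up to rounding, which is the sense in which the super-operator has twice the number of indices as its argument.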
These density matrices are Hermitian, positive semi-definite, have positive eigenvalues, and are not
equal to 0̂. Because every invertible matrix can be expressed as the exponential of some other matrix,
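we can substitute ρ̂1 = e^{ρ̂1′}, and so on for all four density matrices, giving
$$\phi\big( e^{\hat{\rho}_1'} \otimes e^{\hat{\rho}_2'},\; e^{\hat{\varphi}_1'} \otimes e^{\hat{\varphi}_2'} \big) = \phi\big( e^{\hat{\rho}_1'} \otimes \hat{1}_2,\; e^{\hat{\varphi}_1'} \otimes \hat{1}_2 \big) + \phi\big( \hat{1}_1 \otimes e^{\hat{\rho}_2'},\; \hat{1}_1 \otimes e^{\hat{\varphi}_2'} \big). \qquad (A31)$$
and
$$e^{\hat{\rho}_1'} \otimes \hat{1}_2 = e^{\hat{\rho}_1' \otimes \hat{1}_2}, \qquad (A33)$$
$$G\big( \hat{\rho}_1' \otimes \hat{1}_2 + \hat{1}_1 \otimes \hat{\rho}_2',\; \hat{\varphi}_1' \otimes \hat{1}_2 + \hat{1}_1 \otimes \hat{\varphi}_2' \big) = G\big( \hat{\rho}_1' \otimes \hat{1}_2,\; \hat{\varphi}_1' \otimes \hat{1}_2 \big) + G\big( \hat{1}_1 \otimes \hat{\rho}_2',\; \hat{1}_1 \otimes \hat{\varphi}_2' \big). \qquad (A35)$$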
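$$= (c_1 + c_z)|+\rangle\langle +| + (c_x - i c_y)|+\rangle\langle -| + (c_x + i c_y)|-\rangle\langle +| + (c_1 - c_z)|-\rangle\langle -|, \qquad (A41)$$
Maximizing the entropy with respect to this general expectation value and normalization is:
$$0 = \delta\Big( S - \lambda[\mathrm{Tr}(\hat{\rho}) - 1] - \alpha\big( \mathrm{Tr}(\hat{\rho}\, c_\mu \hat{\sigma}_\mu) - c \big) \Big), \qquad (A43)$$
$$\hat{\rho} = \frac{1}{Z} \exp\big( \alpha c_\mu \hat{\sigma}_\mu + \log(\hat{\varphi}) \big). \qquad (A44)$$
Letting Ĉ = α cμ σ̂μ + log(ϕ̂) gives
$$\hat{\rho} = \frac{1}{Z} e^{\hat{C}} = \frac{1}{Z} U e^{U^{-1} \hat{C} U} U^{-1} = \frac{1}{Z} U e^{\hat{\lambda}} U^{-1} = \frac{e^{\lambda_+}}{Z} U|\lambda_+\rangle\langle\lambda_+|U^{-1} + \frac{e^{\lambda_-}}{Z} U|\lambda_-\rangle\langle\lambda_-|U^{-1}, \qquad (A46)$$
$$\lambda_\pm = \lambda \pm \delta\lambda, \qquad (A47)$$
$$\lambda = \alpha c_1 + \frac{1}{2} \log(ab), \qquad (A48)$$
and
$$\delta\lambda = \frac{1}{2} \sqrt{ \Big( 2\alpha c_z + \log\big(\tfrac{a}{b}\big) \Big)^2 + 4\alpha^2 (c_x^2 + c_y^2) }. \qquad (A49)$$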
Because λ± and a, b, c1, cx, cy, cz are real, δλ is real and ≥ 0. The normalization constraint specifies the
Lagrange multiplier Z,
$$1 = \mathrm{Tr}(\hat{\rho}) = \frac{e^{\lambda_+} + e^{\lambda_-}}{Z}, \qquad (A50)$$
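so Z = e^{λ+} + e^{λ−} = 2e^{λ} cosh(δλ). The expectation value constraint specifies the Lagrange multiplier α,
$$c = \mathrm{Tr}(\hat{\rho}\, c_\mu \sigma_\mu) = \frac{\partial}{\partial\alpha} \log(Z) = c_1 + \tanh(\delta\lambda) \frac{\partial}{\partial\alpha} \delta\lambda, \qquad (A51)$$
which becomes
$$c = c_1 + \frac{\tanh(\delta\lambda)}{2\delta\lambda} \Big( 2\alpha (c_x^2 + c_y^2 + c_z^2) + c_z \log\big(\tfrac{a}{b}\big) \Big),$$
or
$$c = c_1 + \tanh\!\left( \frac{1}{2} \sqrt{ \Big( 2\alpha c_z + \log\big(\tfrac{a}{b}\big) \Big)^2 + 4\alpha^2 (c_x^2 + c_y^2) } \right) \frac{ 2\alpha (c_x^2 + c_y^2 + c_z^2) + c_z \log\big(\tfrac{a}{b}\big) }{ \sqrt{ \Big( 2\alpha c_z + \log\big(\tfrac{a}{b}\big) \Big)^2 + 4\alpha^2 (c_x^2 + c_y^2) } }. \qquad (A52)$$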
This equation is monotonic in α, and therefore α is uniquely specified by the value of c. Ultimately, this is
a consequence of the concavity of the entropy. The specific proof of (A52)’s monotonicity is below:
Proof. For ρ̂ to be Hermitian, Ĉ is Hermitian and δλ = ½√f(α) is real. Furthermore, because δλ
is real, f(α) ≥ 0 and thus δλ ≥ 0. Because f(α) is quadratic in α and positive, it may be written in
vertex form,
$$f(\alpha) = a(\alpha - h)^2 + k, \qquad (A53)$$
where a > 0, k ≥ 0, and (h, k) are the (x, y) coordinates of the minimum of f(α). Notice that the form
of (A52) is
$$F(\alpha) = \frac{ \tanh\big( \tfrac{1}{2} \sqrt{f(\alpha)} \big) }{ \sqrt{f(\alpha)} } \times \frac{\partial f(\alpha)}{\partial \alpha}. \qquad (A54)$$
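Making the change of variables α′ = α − h centers the function such that f(α′) = f(−α′) is symmetric
about α′ = 0. We can then write
$$F(\alpha') = \frac{ \tanh\big( \tfrac{1}{2} \sqrt{f(\alpha')} \big) }{ \sqrt{f(\alpha')} } \times 2a\alpha', \qquad (A55)$$
where the derivative has been computed. Because f(α′) is positive, symmetric, and monotonically
increasing on the (symmetric) half-plane (for α′ greater than or less than zero),
$S(\alpha') \equiv \tanh\big( \tfrac{1}{2} \sqrt{f(\alpha')} \big) / \sqrt{f(\alpha')}$ is
also positive and symmetric, but it is unclear whether S(α′) is strictly monotonic in the half-plane or
not. We may restate
$$F(\alpha') = 2a\alpha' S(\alpha'). \qquad (A56)$$
We are now in a convenient position to perform the derivative test for monotonic functions:
$$\begin{aligned}
\frac{\partial}{\partial\alpha'} F(\alpha') &= 2a S(\alpha') + 2a\alpha' \frac{\partial}{\partial\alpha'} S(\alpha') \\
&= 2a S(\alpha') \left( 1 - \frac{a\alpha'^2}{a\alpha'^2 + k} \right) + a\, \frac{a\alpha'^2}{a\alpha'^2 + k} \left( 1 - \tanh^2\!\Big( \tfrac{1}{2} \sqrt{a\alpha'^2 + k} \Big) \right) \\
&\geq 2a S(\alpha') \left( 1 - \frac{a\alpha'^2}{a\alpha'^2 + k} \right) \geq 0
\end{aligned} \qquad (A57)$$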
because a, k, S(α′), and therefore aα′²/(aα′² + k) are all > 0. The function of interest F(α′) is therefore monotonic
for all α′, and therefore it is monotonic for all α, completing the proof that there exists a unique real
Lagrange multiplier α in (A52).
Although (A52) is monotonic in α, it is seemingly a transcendental equation. This can be solved
graphically for the given values c, c1, cx, cy, cz, i.e., given the Hermitian matrix and its expectation value
are specified. Equation (A52) and the eigenvalues take a simpler form when a = b = 1/2 because, in this
instance, ϕ̂ ∝ 1̂ and commutes universally so it may be factored out of the exponential in (A44).
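As an alternative to a graphical solution, the monotonicity of (A52) suggests a simple bisection. The sketch below is a hedged illustration: the function names, the bracketing interval, and all numerical values are hypothetical, and it assumes a ≠ b so that δλ never vanishes. The result is cross-checked against a direct eigendecomposition of Ĉ = α cμ σ̂μ + log(ϕ̂).

```python
import numpy as np

def delta_lambda(alpha, a, b, cx, cy, cz):
    """Half the eigenvalue splitting of C, as in (A49)."""
    log_ratio = np.log(a / b)
    return 0.5 * np.sqrt((2 * alpha * cz + log_ratio) ** 2
                         + 4 * alpha**2 * (cx**2 + cy**2))

def constraint_value(alpha, a, b, c1, cx, cy, cz):
    """Right-hand side of (A52): the expectation value c as a function of alpha."""
    dl = delta_lambda(alpha, a, b, cx, cy, cz)
    numer = 2 * alpha * (cx**2 + cy**2 + cz**2) + cz * np.log(a / b)
    return c1 + np.tanh(dl) * numer / (2 * dl)

def solve_alpha(c, a, b, c1, cx, cy, cz, lo=-50.0, hi=50.0, tol=1e-12):
    """Bisection on the monotonically increasing map alpha -> c(alpha)."""
    g = lambda al: constraint_value(al, a, b, c1, cx, cy, cz) - c
    if g(lo) > 0 or g(hi) < 0:
        raise ValueError("target c is not bracketed; widen [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical prior weights, operator coefficients, and target expectation value.
a, b = 0.8, 0.2
c1, cx, cy, cz = 0.1, 0.5, -0.3, 0.4
target = 0.35
alpha = solve_alpha(target, a, b, c1, cx, cy, cz)

# Cross-check by exponentiating C directly in the spin-z basis.
op = np.array([[c1 + cz, cx - 1j * cy], [cx + 1j * cy, c1 - cz]])
c_mat = alpha * op + np.diag(np.log([a, b]))
w, v = np.linalg.eigh(c_mat)
rho = (v * np.exp(w)) @ v.conj().T
rho /= np.trace(rho).real
check = np.trace(rho @ op).real
```

The target must lie strictly between the eigenvalues of the constrained operator, c1 ± √(cx² + cy² + cz²); otherwise no finite α satisfies the constraint and the bracket check raises an error.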
References
1. Shore, J.E.; Johnson, R.W. Axiomatic derivation of the Principle of Maximum Entropy and the Principle of
Minimum Cross-Entropy. IEEE Trans. Inf. Theory 1980, 26, 26–37.
2. Shore, J.E.; Johnson, R.W. Properties of Cross-Entropy Minimization. IEEE Trans. Inf. Theory 1981, 27, 472–482.
3. Csiszár, I. Why least squares and maximum entropy: An axiomatic approach to inference for linear inverse
problems. Ann. Stat. 1991, 19, 2032.
4. Skilling, J. The Axioms of Maximum Entropy. In Maximum-Entropy and Bayesian Methods in Science and
Engineering; Erickson, G.J., Smith, C.R., Eds.; Kluwer Academic Publishers: Dordrecht, The Netherlands, 1988.
5. Skilling, J. Classic Maximum Entropy. In Maximum-Entropy and Bayesian Methods in Science and Engineering;
Kluwer Academic Publishers: Dordrecht, The Netherlands, 1988.
6. Skilling, J. Quantified Maximum Entropy. In Maximum-Entropy and Bayesian Methods in Science and Engineering;
Fougére, P.F., Ed.; Kluwer Academic Publishers: Dordrecht, The Netherlands, 1990.
7. Caticha, A. Entropic Inference and the Foundations of Physics (Monograph Commissioned by the 11th
Brazilian Meeting on Bayesian Statistics—EBEB-2012). Available online: http://www.albany.edu/physics/
ACaticha-EIFP-book.pdf (accessed on 30 November 2017).
8. Hiai, F.; Petz, D. The Proper Formula for Relative Entropy and its Asymptotics in Quantum Probability.
Commun. Math. Phys. 1991, 143, 99–114.
9. Petz, D. Characterization of the Relative Entropy of States of Matrix Algebras. Acta Math. Hung. 1992, 59,
449–455.
10. Ohya, M.; Petz, D. Quantum Entropy and Its Use; Springer: New York, NY, USA, 1993; ISBN 0-387-54881-5.
11. Wilming, H.; Gallego, R.; Eisert, J. Axiomatic Characterization of the Quantum Relative Entropy and Free
Energy. Entropy 2017, 19, 241.
12. Jaynes, E.T. Information Theory and Statistical Mechanics. Phys. Rev. 1957, 106, 620–630.
13. Jaynes, E.T. Probability Theory: The Logic of Science; Cambridge University Press: Cambridge, UK, 2003.
14. Jaynes, E.T. Information Theory and Statistical Mechanics II. Phys. Rev. 1957, 108, 171–190.
15. Balian, R.; Vénéroni, M. Incomplete descriptions, relevant information, and entropy production in collision
processes. Ann. Phys. 1987, 174, 229–224.
16. Balian, R.; Balazs, N.L. Equiprobability, inference and entropy in quantum theory. Ann. Phys. 1987, 179,
97–144.
17. Balian, R. Justification of the Maximum Entropy Criterion in Quantum Mechanics. In Maximum Entropy
and Bayesian Methods; Skilling, J., Ed.; Kluwer Academic Publishers: Dordrecht, The Netherlands, 1989;
pp. 123–129.
18. Balian, R. On the principles of quantum mechanics. Am. J. Phys. 1989, 57, 1019–1027.
19. Balian, R. Gain of information in a quantum measurement. Eur. J. Phys. 1989, 10, 208–213
20. Balian, R. Incomplete descriptions and relevant entropies. Am. J. Phys. 1999, 67, 1078–1090.
21. Blankenbecler, R.; Partovi, H. Uncertainty, Entropy, and the Statistical Mechanics of Microscopic Systems.
Phys. Rev. Lett. 1985, 54, 373–376.
22. Blankenbecler, R.; Partovi, H. Quantum Density Matrix and Entropic Uncertainty. In Proceedings of the
Fifth Workshop on Maximum Entropy and Bayesian Methods in Applied Statistics, Laramie, WY, USA,
5–8 August 1985.
23. Von Neumann, J. Mathematische Grundlagen der Quantenmechanik; Springer: Berlin, Germany, 1932.
English Translation: Mathematical Foundations of Quantum Mechanics; Princeton University Press: Princeton,
NJ, USA, 1983.
24. Ali, S.A.; Cafaro, C.; Giffin, A.; Lupo, C.; Mancini, S. On a Differential Geometric Viewpoint of Jaynes’
Maxent Method and its Quantum Extension. AIP Conf. Proc. 2012, 1443, 120–128.
25. Caticha, A. Entropic Dynamics: Quantum Mechanics from Entropy and Information Geometry.
Available online: https://arxiv.org/abs/1711.02538 (accessed on 30 November 2017).
26. Reginatto, M.; Hall, M.J.W. Quantum-classical interactions and measurement: A consistent description using
statistical ensembles on configuration space. J. Phys. Conf. Ser. 2009, 174, 012038.
27. Reginatto, M.; Hall, M.J.W. Information geometry, dynamics and discrete quantum mechanics.
AIP Conf. Proc. 2013, 1553, 246–253.
28. Caves, C.; Fuchs, C.; Schack, R. Quantum probabilities as Bayesian probabilities. Phys. Rev. A 2002, 65, 022305.
29. Vanslette, K. The Quantum Bayes Rule and Generalizations from the Quantum Maximum Entropy Method.
Available online: https://arxiv.org/abs/1710.10949 (accessed on 30 November 2017).
30. Schack, R.; Brun, T.; Caves, C. Quantum Bayes rule. Phys. Rev. A 2001, 64, 014305.
31. Korotkov, A. Continuous quantum measurement of a double dot. Phys. Rev. B 1999, 60, 5737–5742.
32. Korotkov, A. Selective quantum evolution of a qubit state due to continuous measurement. Phys. Rev. B
2000, 63, 115403.
33. Jordan, A.; Korotkov, A. Qubit feedback and control with kicked quantum nondemolition measurements:
A quantum Bayesian analysis. Phys. Rev. B 2006, 74, 085307.
34. Hellmann, F.; Kamiński, W.; Kostecki, P. Quantum collapse rules from the maximum relative entropy
principle. New J. Phys. 2016, 18, 013022.
35. Warmuth, M. A Bayes Rule for Density Matrices. In Advances in Neural Information Processing Systems 18,
Proceedings of the Neural Information Processing Systems Conference, Montréal, QC, Canada, 7–12 December 2005;
Neural Information Processing Systems Foundation, Inc.: La Jolla, CA, USA, 2015.
36. Warmuth, M.; Kuzmin, D. A Bayesian Probability Calculus for Density Matrices. Mach. Learn. 2010, 78,
63–101.
37. Tsuda, K. Machine learning with quantum relative entropy. J. Phys. Conf. Ser. 2009, 143, 012021.
38. Giffin, A.; Caticha, A. Updating Probabilities. Presented at the 26th International Workshop on Bayesian
Inference and Maximum Entropy Methods (MaxEnt 2006), Paris, France, 8–13 July 2006.
39. Wang, Z.; Busemeyer, J.; Atmanspacher, H.; Pothos, E. The Potential of Using Quantum Theory to Build
Models of Cognition. Top. Cogn. Sci. 2013, 5, 672–688.
40. Giffin, A. Maximum Entropy: The Universal Method for Inference. Ph.D. Thesis, University at Albany
(SUNY), Albany, NY, USA, 2008.
41. Caticha, A. Toward an Informational Pragmatic Realism. Minds Mach. 2014, 24, 37–70.
42. Umegaki, H. Conditional expectation in an operator algebra, IV (entropy and information). Ködai Math.
Sem. Rep. 1962, 14, 59–85.
43. Uhlmann, A. Relative entropy and the Wigner-Yanase-Dyson-Lieb concavity in an interpolation theory.
Commun. Math. Phys. 1977, 54, 21–32.
44. Schumacher, B.; Westmoreland, M. Relative entropy in quantum information theory. In Proceedings of the
AMS Special Session on Quantum Information and Computation, Washington, DC, USA, 19–21 January 2000.
45. Suzuki, M. On the Convergence of Exponential Operators—The Zassenhaus Formula, BCH Formula and
Systematic Approximants. Commun. Math. Phys. 1977, 57, 193–200.
46. Horn, A. Eigenvalues of sums of Hermitian matrices. Pac. J. Math. 1962, 12, 225–241.
47. Bhatia, R. Linear Algebra to Quantum Cohomology: The Story of Alfred Horn’s Inequalities. Am. Math. Mon.
2001, 108, 289–318.
48. Knutson, A.; Tao, T. Honeycombs and Sums of Hermitian Matrices. Not. AMS 2001, 48, 175–186.
49. Aczél, J. Lectures on Functional Equations and Their Applications; Academic Press Inc.: New York, NY, USA,
1966; Volume 19, pp. 31–44, 141–145, 213–217, 301–302, 347–349.
50. Darboux, G. Sur le théorème fondamental de la géométrie projective. Math. Ann. 1880, 17, 55–61.
© 2017 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
entropy
Article
Finding a Hadamard Matrix by Simulated
Quantum Annealing
Andriyan Bayu Suksmono
Telecommunication Engineering Scientific and Research Group (TESRG), School of Electrical Engineering and
Informatics and The Research Center on Information and Communication Technology (PPTIK-ITB),
Institut Teknologi Bandung, Jl. Ganesha No.10, Bandung 40132, Indonesia
Abstract: Hard problems have recently become an important issue in computing. Various methods,
including a heuristic approach that is inspired by physical phenomena, are being explored. In this
paper, we propose the use of simulated quantum annealing (SQA) to find a Hadamard matrix,
which is itself a hard problem. We reformulate the problem as an energy minimization of spin
vectors connected by a complete graph. The computation is conducted based on a path-integral
Monte-Carlo (PIMC) SQA of the spin vector system, with an applied transverse magnetic field whose
strength is decreased over time. In the numerical experiments, the proposed method is employed to
find low-order Hadamard matrices, including the ones that cannot be constructed trivially by the
Sylvester method. The scaling property of the method and the measurement of residual energy after
a sufficiently large number of iterations show that SQA outperforms simulated annealing (SA) in
solving this hard problem.
Keywords: quantum annealing; adiabatic quantum computing; hard problems; Hadamard matrix;
binary optimization
1. Introduction
1.1. Background
Finding a solution to a hard problem is a challenging task in computing. Such a problem is
characterized by a complexity that grows faster than any polynomial in the size of the input.
A particularly important class is that of NP (non-deterministic polynomial) problems, in which
verifying a solution can be conducted in polynomial time, whereas finding the solution is of
exponential order. Examples of such problems are, among others, the TSP (traveling salesman problem),
SAT (Boolean satisfiability), graph coloring, graph isomorphism, and subset sums.
An interesting approach to the hard problems is a method inspired by physical phenomena, such
as classical annealing (CA) or quantum annealing (QA). Both CA and QA are physical processes that
obtain an ordered (physical) system from an unordered one, which can be done either thermally (as is
the case in CA) or quantum-mechanically (as is the case in QA). To simulate the physical processes on
a (classical/non-quantum) computer, numerical methods, such as MC (Monte Carlo) for CA and PIMC
(path-integral Monte Carlo) for QA, have been developed. The algorithm or computational method
inspired by classical/thermal annealing is called simulated annealing (SA), whereas the one based
on quantum annealing is called simulated quantum annealing (SQA). Both of these methods make
use of the methods in numerical CA or numerical QA. They encode the problem into a Hamiltonian
of a spin system [1] and then evolve the system from a high energy state down to the ground state.
The annealing process enables the system to avoid local minima trapping and therefore is capable of
achieving a global optimum, which represents the best solution of the problem. The main difference
between SA and SQA is in the evolution of the systems; whereas SA uses classical/thermal annealing,
SQA employs a quantum mechanism.
In SA [2–4], one starts the system in total randomness with regard to a high temperature state.
The temperature is then lowered and the system is evolved, which causes the energy to decrease so
that the system becomes increasingly ordered. To avoid local-optima trapping, a particular updating
rule, such as the Metropolis rule [2], is applied. The rule allows the system to (sometimes) move to a higher
energy state. Upon completion of the algorithm, the system ideally reaches the ground state, at which point
a solution is found.
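The SA procedure described above can be sketched in a few lines. This is a generic illustration and not the implementation used in this paper: the Ising-type energy, the geometric cooling schedule, the parameter values, and the function names are all assumptions chosen for a toy example (a ferromagnet on a complete graph, whose ground state has all spins aligned).

```python
import math
import random

def ising_energy(spins, couplings):
    """E = -sum_{i<j} J_ij s_i s_j for spins in {-1, +1} and symmetric couplings."""
    n = len(spins)
    return -sum(couplings[i][j] * spins[i] * spins[j]
                for i in range(n) for j in range(i + 1, n))

def simulated_annealing(couplings, t_start=5.0, t_end=0.01, sweeps=200, seed=0):
    rng = random.Random(seed)
    n = len(couplings)
    spins = [rng.choice((-1, 1)) for _ in range(n)]  # random high-temperature start
    energy = ising_energy(spins, couplings)
    for step in range(sweeps):
        # Geometric cooling schedule (one common choice, assumed here).
        temp = t_start * (t_end / t_start) ** (step / (sweeps - 1))
        for _ in range(n):
            i = rng.randrange(n)
            # Energy change caused by flipping spin i.
            field = sum(couplings[i][j] * spins[j] for j in range(n) if j != i)
            delta = 2.0 * spins[i] * field
            # Metropolis rule: always accept downhill moves, sometimes accept uphill ones.
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                spins[i] = -spins[i]
                energy += delta
    return spins, energy

# Toy problem: 8 spins, all pairs coupled ferromagnetically (J_ij = 1).
n = 8
couplings = [[0.0 if i == j else 1.0 for j in range(n)] for i in range(n)]
spins, energy = simulated_annealing(couplings)
```

For this toy ferromagnet the ground-state energy is minus the number of spin pairs, −28. Harder instances with many local minima generally need slower cooling or more sweeps, which is where the comparison with SQA becomes relevant.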
In [5], Kadowaki and Nishimori introduced quantum fluctuations to replace the thermal
fluctuations in SA to accelerate the convergence. They applied the method on an Ising model, where a
transverse field plays the role of temperature in classical SA, enabling the system to achieve the ground
state with greater probability. Santoro et al. [6] compared classical and quantum Monte Carlo annealing
protocols on a two-dimensional Ising model. They found that the quantum Monte Carlo annealing is
superior to classical annealing. In [7], Boixo et al. show experimental results on a 108 qubit D-Wave One,
which is a kind of hardware implementation of QA. A strong correlation between D-Wave and SQA,
compared to the device with classical annealing, was found, which indicates that the D-Wave performs
quantum annealing. This result raised the important issue of whether QA actually outperforms SA [8].
Rønnow et al. [9] showed how quantum speedup should be defined and measured. In an experiment
with random spin glass instances on 503 qubits of D-Wave Two, they did not find any evidence of
such speedup.
Regardless of these issues, different results have been achieved via SQA. Isakov et al. [10] performed
quantum Monte Carlo (QMC) simulations and found that the QMC tunneling rate scaled
with system size. They also found a quadratic speedup in QMC simulations when, instead of
periodic conditions, open boundary conditions were employed. In [11], Mazzola et al. demonstrated
that QMC simulations can recover the scaling of ground-state tunneling rates, which validates QA in
terms of solving combinatorial problems.
Some classes of hard problems, including ones with exponential or combinatorial complexity,
have been a subject of interest in SQA research. Martonak et al. [12] introduced an application of SQA
to solve the TSP. They found that a PIMC algorithm was more efficient than SA in terms of
finding an approximately minimal tour in a given graph. SQA has also been used to successfully address
other hard problems related to graphs, such as graph coloring [13] and graph isomorphism [14].
In this paper, we propose SQA as a means of finding a Hadamard matrix (H-matrix). Previously, in [15],
we successfully employed SA to perform a similar task, in which low-order H-matrices were found.
Compared to existing H-matrix construction methods, an SA-based method is more general in terms
of its capability of finding (or constructing probabilistically) an m = 4k order H-matrix, without any
restriction on the property of the order m, whereas the Sylvester method requires m = 2^n, where k and
n are positive integers. This paper extends this classical SA method to its quantum version, where
PIMC based on Suzuki–Trotter formulation [16,17] is employed to simulate the quantum process.
Entropy 2018, 20, 141
One of the most important issues in the theory of H-matrices is their existence. Any H-matrix of order 2^l,
with l a positive integer, can be constructed using Sylvester’s method. Furthermore, if there is an H-matrix
of order m > 2, it can be shown that m = 4k for a positive integer k. On the other hand, no one yet knows
whether an H-matrix of order 4k always exists [20,21]. The latter statement is known as the Hadamard matrix
conjecture. Up to this writing, the smallest order 4k for which no H-matrix is known is 668.
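Sylvester's doubling construction mentioned above can be sketched in a few lines. The helper names below are illustrative; the check simply verifies that all entries are ±1 and that distinct rows are orthogonal.

```python
def sylvester(n):
    """Return the Sylvester Hadamard matrix of order 2**n as a list of rows."""
    h = [[1]]
    for _ in range(n):
        # Doubling step: H_2m = [[H, H], [H, -H]].
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def is_hadamard(h):
    """Check that h is square, has entries +1/-1, and mutually orthogonal rows."""
    m = len(h)
    if any(len(row) != m for row in h):
        return False
    if any(x not in (1, -1) for row in h for x in row):
        return False
    for i in range(m):
        for j in range(i + 1, m):
            if sum(a * b for a, b in zip(h[i], h[j])) != 0:
                return False
    return True
```

The construction only reaches orders that are powers of two, which is why orders such as 12 or 20 need other methods.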
Various construction methods have been proposed [24–29]. Nevertheless, these methods force
the order m to follow a particular rule. In [15], a general m = 4k order algorithm employing SA is
proposed. The method works on a special H-matrix called a seminormalized Hadamard (SH) matrix,
in which the first column is a 4k order unity vector v0 = (1, …, 1)^T, and the rest are 4k order SH
vectors vi ∈ V.
A brute-force method needs to verify all NB binary matrices of order 4k to find an H-matrix,
where NB(4k) = 2^(16k^2) [15]. Let all matrices constructed where v0 is the first column and a combination
of vi ∈ V constitutes the remaining (4k − 1) columns be called quasi-SH (QSH) matrices. Since there are
4k 4k
NV = C (4k, 2k ) SH vectors, there are about NQU (4k) ≈ 8k23/2 unique QSH-matrices. Although the
number has been greatly reduced compared to NB , exhaustive checking still requires a great amount
of computational resources. The SA method proposed in [15] is capable of finding a few low-order SH
matrices in a more reasonable time.
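To get a feel for the sizes involved, the two counts that can be read unambiguously from the text, NB(4k) = 2^(16k^2) and NV = C(4k, 2k), can be evaluated directly. The function names are illustrative only.

```python
from math import comb


def n_binary(k):
    """Number of 4k x 4k matrices with entries +1/-1: 2**(16 k^2)."""
    return 2 ** (16 * k * k)


def n_sh_vectors(k):
    """Number of SH vectors of order 4k, N_V = C(4k, 2k)."""
    return comb(4 * k, 2 * k)


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(4 * k, n_binary(k), n_sh_vectors(k))
```

Even for order 12 (k = 3) the brute-force count has 44 decimal digits, whereas there are only 924 SH vectors to choose columns from.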
Following the convention in our previous paper [15], the role of the spin, i.e., its ±1 eigenvalues,
is replaced by SH spin vectors vi ∈ V. To find a 4k order SH-matrix, one needs (4k − 1) fully connected
SH spin vectors, which initially are set randomly. With a defined energy E( Q ), the SH spin vectors are
randomly changed in accordance with conditions whereby a transition into another SH spin vector is
allowed but a transition into a non-SH-spin-vector is forbidden.
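One way to realize a transition that never leaves the set of SH vectors is sketched below. It assumes that an SH vector of order 4k is a ±1 vector with exactly 2k entries equal to +1 (so that it is orthogonal to v0), which is consistent with the count NV = C(4k, 2k) given above; the swap move itself is an illustrative choice, not necessarily the update used in [15].

```python
import random


def random_sh_vector(order, rng):
    """Return a random balanced +1/-1 vector of even length `order`."""
    half = order // 2
    v = [1] * half + [-1] * half
    rng.shuffle(v)
    return v


def sh_move(v, rng):
    """Swap one +1 entry with one -1 entry, so the result stays balanced."""
    plus = [i for i, x in enumerate(v) if x == 1]
    minus = [i for i, x in enumerate(v) if x == -1]
    i, j = rng.choice(plus), rng.choice(minus)
    w = list(v)
    w[i], w[j] = w[j], w[i]
    return w
```

Because the move exchanges a +1 with a −1, the entry sum stays zero and no explicit rejection of non-SH vectors is needed.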
2. Methods
where Jij is the coupling strength between the spin at site i and the spin at site j, hj is the magnetic
field strength at site j, and {σ̂_i^z, σ̂_i^x} are the Pauli matrices at site i. In SQA, quantum fluctuations are
introduced through a transverse magnetic field Γ. The Hamiltonian of the system takes the following
form [5]:
\[
\hat{H}_{QA} = -\sum_{i \neq j} J_{ij}\,\hat{\sigma}_i^z \hat{\sigma}_j^z - \sum_{i} h_i\,\hat{\sigma}_i^z - \Gamma \sum_{i} \hat{\sigma}_i^x. \tag{2}
\]
In Equation (2), the transverse field is changed (reduced) over time, i.e., Γ ≡ Γ(t). On the right-hand
side of the equation, the first two terms correspond to the potential energy Ĥpot, while the third one
is the Hamiltonian introduced by the transverse field, which is related to the kinetic energy Ĥkin; i.e., we
can define
\[
\hat{H}_{pot} \equiv -\sum_{i \neq j} J_{ij}\,\hat{\sigma}_i^z \hat{\sigma}_j^z - \sum_{i} h_i\,\hat{\sigma}_i^z \tag{3}
\]
To simulate a quantum system described by Equation (5) using the classical method, we have to
formulate PIMC by introducing imaginary time. The system can then be approximated by the Suzuki–Trotter
transform by adding one dimension in the imaginary time direction, which, for ( P × N ) degrees of
freedom, takes the following form [13,30]:
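\[
H_{ST} = \frac{1}{P}\sum_{p=1}^{P} H_{pot}\left(\{S_{i,p}\}\right) - J_{\Gamma}\left(\sum_{p=1}^{P-1}\sum_{i}^{N} S_{i,p}\,S_{i,p+1} + \sum_{j}^{N} S_{j,1}\,S_{j,P}\right) \tag{6}
\]
where N is the number of spins in the lattice, P is the number of Trotter replicas, Si = ±1 are the
eigenvalues of the spin matrices, and
\[
J_{\Gamma} = -\frac{PT}{2}\,\ln\tanh\left(\frac{\Gamma}{PT}\right) > 0 \tag{7}
\]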
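The replica coupling of Equation (7) is easy to evaluate numerically, which makes its behaviour under a decreasing transverse field visible. The following is a small sketch; the function and argument names are illustrative.

```python
import math


def j_gamma(gamma, n_replicas, temperature):
    """Replica coupling J_Gamma = -(P*T/2) * ln(tanh(Gamma / (P*T)))."""
    pt = n_replicas * temperature
    # tanh(x) lies in (0, 1) for x > 0, so the logarithm is negative and
    # J_Gamma is positive. For very large Gamma/(P*T), tanh rounds to 1.0
    # in floating point and the coupling evaluates to 0.0.
    return -0.5 * pt * math.log(math.tanh(gamma / pt))
```

As Γ is reduced, J_Γ grows, so the replicas are pulled toward a common configuration at the end of the schedule.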
where vi · v j denotes the inner product of the vector vi with v j .
Figure 1 shows an Ising system with four SH spin vectors with an additional Trotter’s dimension.
In the lower part of Figure 1a, each circle represents a binary spin, whereas the solid line represents the
connection among the spins. Interacting spin i with binary variable Si and spin j with binary variable
S j contributes the term Jij Si S j to the Hamiltonian. For a 4k order case, every 4k non-connected spins
are grouped into one SH vector vi , which is illustrated as a dashed line. To simplify the diagram, each
SH vector is represented by a filled circle; thus, we obtain the upper part of Figure 1a, which is called a
slice or a replica. In the PIMC, the slice is replicated P-times, and these slices are arranged as layers
in imaginary time. Each neighboring SH vector in a replica, i.e., vi,p with vi,p−1 and vi,p with vi,p+1 ,
interacts. The extension (in imaginary time) is illustrated in Figure 1b. The Hamiltonian in Equation (6)
becomes a Hamiltonian of an SH vector spin system HQV that can be rewritten as follows:
\[
H_{QV} = \frac{1}{P}\sum_{p=1}^{P} H_{pot}\left(\{v_{i,p}\}\right) - J_{\Gamma}\left(\sum_{p=1}^{P-1}\sum_{i} v_{i,p}\cdot v_{i,p+1} + \sum_{i} v_{i,1}\cdot v_{i,P}\right) \tag{9}
\]
where JΓ ≡ JΓ(t) and Hpot({v_{i,p}}) represents the complete-graph connections among the SH spin vectors,
similar to Equation (8), which is given by
\[
H_{pot}\left(\{v_{i,p}\}\right) = \sum_{i \neq j} v_{i,p}\cdot v_{j,p} + \sum_{i} \mathbf{1}\cdot v_{i,p} - 16k^{2}. \tag{10}
\]
The evolution of HQV in Equation (9) leads to the solution to the H-matrix search problem.
Figure 1. Connection diagrams of the spins and spin vectors. We consider a four-order SH vector
in this example: (a) four SH spins are connected by a complete graph K4 , and each column is then
grouped into a single SH spin vector; (b) an extension of fully connected SH spin vectors into a Trotter
dimension (imaginary time) τ.
We will now formulate the SQA method for finding the H-matrix into an algorithm, which
is displayed as pseudo-code in Algorithm 1. It takes the matrix order, the number of replicas,
the initial temperature, the initial value of Γ, and the number of iterations and sub-iterations as inputs.
This algorithm yields either an SH-matrix or a QSH-matrix that has more orthogonal column vectors
than the initial one. The algorithm starts with a random initialization of replicas with QSH-matrices,
which are (4k − 1) sets of SH vectors, and then calculates its initial energy. Following the schedule of a
linear transverse field, a trial transition is performed for each replica. The acceptance and rejection
of the transition is based on the Metropolis criterion. The iteration will be stopped when either the
number of maximum iterations is reached or an SH-matrix is found.
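The stopping test in the description above needs an energy that vanishes exactly when all columns are mutually orthogonal. The sketch below uses a simple stand-in, the sum of |vi · vj| over distinct column pairs; this is an assumption made for illustration and is not claimed to be identical to Equation (10).

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthogonality_energy(columns):
    """Sum of |v_i . v_j| over distinct column pairs (assumed stand-in energy).

    The value is zero if and only if every pair of columns is orthogonal.
    """
    energy = 0
    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            energy += abs(dot(columns[i], columns[j]))
    return energy


def is_solution(columns):
    """Stopping test: all column pairs are orthogonal."""
    return orthogonality_energy(columns) == 0
```

In a replica-based search, this quantity would be evaluated per replica, and the loop would stop as soon as any replica reaches zero.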
Figure 2. Energy evolution during the SQA algorithm runs to find an SH-matrix of order 12. Four curves
are drawn in the graph, which are the mean potential energy Epmean , the minimum potential energy
Epmin, the replica energy Erep, and the standard deviation of the potential energy Epstd. When Epmin
equals zero, the iteration is stopped since an SH-matrix has been found. The Epstd curve indicates high
variation in the configuration of replicas at the initial stage, which is then reduced in later stages.
Figure 3. The initial state of the found H-matrix: (a) The QSH-matrix, white squares indicate +1, black
squares indicate −1. (b) Orthogonality indicator, gray squares show the non-orthogonality condition
of related pair of vectors.
Figure 4. Indicator matrices of the replica content: (a) the first replica at the initial stage; (b) the last
replica at the initial stage; (c) the first replica at the final stage, and (d) the last replica at the final stage.
The matrices at the initial stages show most of the vectors as non-orthogonal, whereas those at the final
stages show most of the vectors as orthogonal.
Figure 5. Final results: (a) the found H-matrix and (b) its orthogonality indicator. The diagonal form of
the indicator matrix indicates that all of the column vectors are now orthogonal.
Figure 6. The effect of the replica number P in the algorithm: although ideally a large P is desired,
it also needs to be adjusted to the problem. Variation in replica energy (in terms of the standard deviation
of the energy across the replicas) when searching for an H-matrix is shown. The numbers of replicas
P = 10, 15, 20 yield large variations up to the end of the iteration, whereas P = 5 yields a better result
with steady values at the end. In all of these cases, for the construction of a 12-order H-matrix, the total
maximum iteration is set to 20,000, consisting of a global iteration count of 20,000 for each P.
Since initially the replicas were set randomly, they will have almost identical energy, so variation
in the energy will be very low. In later iterations, the value will increase as a new configuration is
explored, and this will be followed by a decrease, which indicates that the replicas have become
homogeneous. This cycle of increasing–decreasing energy should be observed if P is chosen properly
with respect to the dimension of the problem (H-order) and a sufficient number of iterations. When P
is too small, the system will perform akin to classical SA, whereas a P that is too large will cause the
system to fail. The figure shows that, for a given number of maximum iterations 20,000, the number of
replicas P = 5 is the most suitable; anything higher is too high. This also shows that frequent updates
on a limited number of replicas, compared to less frequent updates on a larger number of replicas,
better achieve convergence.
\[
P_{thresh}(t) = 1 - \frac{1}{2}\,e^{-\frac{1}{T(t)}} \tag{11}
\]
the threshold will start a bit higher than 0.5, which asymptotically approaches 1.0 at the end of iteration
time t. Figure 7 shows the curve of T (t), Pthresh (t), Γ(t), and JΓ (t).
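The behaviour of the threshold can be checked numerically. The sketch below assumes the form Pthresh = 1 − (1/2) exp(−1/T) together with a linear temperature schedule; the schedule helper and its parameter names are illustrative assumptions.

```python
import math


def linear_temperature(t, t_max, temp_start, temp_end):
    """Linear schedule from temp_start at t = 0 to temp_end at t = t_max."""
    return temp_start + (temp_end - temp_start) * t / t_max


def p_thresh(temperature):
    """Threshold 1 - 0.5 * exp(-1/T); requires temperature > 0."""
    return 1.0 - 0.5 * math.exp(-1.0 / temperature)
```

For a large initial temperature the exponential is close to one and the threshold sits just above 0.5; as T approaches zero the exponential vanishes and the threshold tends to 1.0.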
Figure 7. The annealing schedules in SA and SQA: (a) Linear temperature schedule and corresponding
threshold schedule in SA. (b) Linear transverse-field Γ(t) and corresponding JΓ (t) in SQA.
The experiments were repeated 10 times for each case. The averages of the residual errors for each
iteration number are plotted in Figure 8 for both SA and SQA.
The figure shows that, although initially the residual error of SQA is larger than SA, the slope
is steeper. With a higher number of iterations, which in this case is around 100,000, SQA is superior.
Considering that SQA shows the least amount of error among the replica slices, it seems that variation
in the replica is an ideal solution. In SA, once a solution is selected, the change in spin configuration
will be less significant by the time the system reaches a lower energy state. Therefore, in terms of
finding an H-matrix, SQA is superior to SA.
Figure 8. Residual energy left by the SA and SQA algorithms. The QAP curve shows when the
horizontal axis accounts for the MCS (the Monte Carlo step); i.e., the number of iterations in the SQA
curve is divided by the number of slices P. The figure shows that SQA outperforms SA in finding an
H-matrix. Even when the number of steps is counted without the MCS, SQA eventually outperforms SA
at higher iterations, demonstrated by the steeper slope of the SQA performance curve, compared to SA.
In the second experiment, both SA and SQA were applied to matrices with an increasing size
(order). Figure 9 shows a graph of computational gain, which is defined as the ratio of the number of
SA iterations to the number of SQA iterations needed to achieve 50 percent of the residual energy of the
initial mean energy of all replicas. The horizontal axis shows the order of the H-matrix, from 4 to 20,
whereas the vertical axis shows the computational gain. The gain grows with the order of the H-matrix,
which shows that speedup increases with problem size. Based on this curve, we observe that SQA
outperforms SA for the Hadamard search problem.
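The gain defined above can be computed from two residual-energy traces as sketched below. The traces in the usage note are made-up numbers for illustration only, not data from the experiments, and the function names are hypothetical.

```python
def iterations_to_fraction(residuals, fraction=0.5):
    """First iteration index at which the residual drops to fraction * initial.

    Returns None if the trace never reaches that level.
    """
    target = fraction * residuals[0]
    for iteration, value in enumerate(residuals):
        if value <= target:
            return iteration
    return None


def computational_gain(sa_residuals, sqa_residuals, fraction=0.5):
    """Ratio of SA iterations to SQA iterations needed to reach the level."""
    it_sa = iterations_to_fraction(sa_residuals, fraction)
    it_sqa = iterations_to_fraction(sqa_residuals, fraction)
    if it_sa is None or not it_sqa:
        # Undefined if either run never reaches the level (or reaches it at 0).
        return None
    return it_sa / it_sqa
```

For example, with the hypothetical traces [10, 9, 8, 7, 6, 5, 4] for SA and [10, 7, 5, 3] for SQA, the 50 percent level is reached at iterations 5 and 2, giving a gain of 2.5.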
Figure 9. Curve of computational gain, which is the ratio of the number of SA iterations to the
number of SQA iterations needed for the algorithm to achieve 50 percent of its initial residual error.
The horizontal axis represents the problem size, which is the order of the H-matrix. The figure shows
that the gain grows non-linearly with problem size, indicating that SQA outperforms SA.
4. Conclusions
We here propose a new method of finding an H-matrix based on SQA. We have formulated the
method into an algorithm, which has been implemented, tested, and analyzed. Low-order H-matrices,
including one of order 12 that cannot be constructed via the Sylvester method, were found. We have
also discussed the advantages of the method over classical SA. Measurements of the residual error and
the relative running time on an increasing order of H-matrices indicate that SQA is superior to SA in
solving the Hadamard search problem.
Acknowledgments: This research was funded by ITB Grant of Research P3MI 2017. The author would like to thank
ITB (Institut Teknologi Bandung) for their continuous support to his research. He also thanks Donny Danudirjdo
and Andika Triwidada for their assistance in the manuscript layout and English editing.
Conflicts of Interest: The author declares no conflict of interest.
References
1. Lucas, A. Ising formulations of many NP problems. Front. Phys. 2014, 2, 5, doi:10.3389/fphy.2014.00005.
2. Metropolis, N.; Rosenbluth, A.W.; Rosenbluth, M.N.; Teller, A.H. Equation of state calculations by fast
computing machines. J. Chem. Phys. 1953, 21, 1087, doi:10.1063/1.1699114.
3. Kirkpatrick, S.; Gelatt, C.D., Jr.; Vecchi, M.P. Optimization by simulated annealing. Science 1983, 220, 671–680,
doi:10.1126/science.220.4598.671.
4. Cerny, V. Thermodynamical approach to the traveling salesman problem: An efficient simulation algorithm.
J. Optim. Theory Appl. 1985, 45, 41–51, doi:10.1007/BF00940812.
5. Kadowaki, T.; Nishimori, H. Quantum annealing in the transverse Ising model. Phys. Rev. E 1998, 58, 5355,
doi:10.1103/PhysRevE.58.5355.
6. Santoro, G.E.; Martonak, R.; Tosatti, E.; Car, R. Theory of quantum annealing of an Ising spin glass. Science
2002, 295, 2427–2430, doi:10.1126/science.1068774.
7. Boixo, S.; Rønnow, T.F.; Isakov, S.V.; Wang, Z.; Wecker, D.; Lidar, D.A.; Martinis, J.M.; Troyer, M. Evidence for
quantum annealing with more than one hundred qubits. Nat. Phys. 2014, 10, 218–224, doi:10.1038/nphys2900.
8. Heim, B.; Rønnow, T.F.; Isakov, S.V.; Troyer, M. Quantum versus classical annealing of Ising spin glasses.
Science 2015, 348, 215–217, doi:10.1126/science.aaa4170.
9. Rønnow, T.F.; Wang, Z.; Job, J.; Boixo, S.; Isakov, S.V.; Wecker, D.; Martinis, J.M.; Lidar, D.A.; Troyer, M.
Defining and detecting quantum speedup. Science 2014, 345, 420–424, doi:10.1126/science.1252319.
10. Isakov, S.V.; Mazzola, G.; Smelyanskiy, V.N.; Jiang, Z.; Boixo, S.; Neven, H.; Troyer, M. Understanding
Quantum Tunneling through Quantum Monte Carlo Simulation. Phys. Rev. Lett. 2016, 117, 180402,
doi:10.1103/PhysRevLett.117.180402.
11. Mazzola, G.; Smelyanskiy, V.N.; Troyer, M. Quantum Monte Carlo Tunneling from quantum chemistry to
quantum annealing. Phys. Rev. B 2017, 96, 134305, doi:10.1103/PhysRevB.96.134305.
12. Martonak, R.; Santoro, G.E.; Tosatti, E. Quantum annealing of the traveling-salesman problem. Phys. Rev. E
2004, 70, doi:10.1103/PhysRevE.70.057701.
13. Titiloye, O.; Crispin, A. Quantum annealing of the graph coloring problem. Discret. Optim. 2011, 8, 376–384,
doi:10.1016/j.disopt.2010.12.001.
14. Zick, K.M.; Shehab, O.; French, M. Experimental quantum annealing: Case study involving the graph
isomorphism problem. Sci. Rep. 2015, 5, 11168, doi:10.1038/srep11168.
15. Suksmono, A.B. Finding a Hadamard matrix by simulated annealing of spin-vectors. J. Phys. Conf. Ser. 2017,
856, 012012, doi:10.1088/1742-6596/856/1/012012.
16. Suzuki, M. Relationship between d-dimensional quantal spin systems and (d+1)-dimensional Ising systems:
Equivalence, critical exponents and systematic approximants of the partition function and spin correlations.
Prog. Theor. Phys. 1976, 56, 1454–1469, doi:10.1143/PTP.56.1454.
17. Trotter, H.F. On the product of semi-groups of operators. Proc. Am. Math. Soc. 1959, 10, 545–551,
doi:10.1090/S0002-9939-1959-0108732-6.
18. Sylvester, J.J. Thoughts on inverse orthogonal matrices, simultaneous sign successions, and tessellated
pavements in two or more colours, with applications to Newton’s Rule, ornamental tile-work, and the theory
of numbers. Lond. Edinb. Dublin Philos. Mag. J. Sci. 1867, 34, 461–475.
19. Hadamard, J. Resolution d’une question relative aux determinants. Bull. Sci. Math. 1893, 17, 240–246.
20. Hedayat, A.; Wallis, W.D. Hadamard Matrices and Their Applications. Ann. Stat. 1978, 6, 1184–1238.
21. Horadam, K.J. Hadamard Matrices and Their Applications; Princeton University Press: Princeton, NJ, USA,
2007; ISBN 978-1-40-084290-2.
22. Garg, V. Wireless Communications and Networking; Morgan Kaufmann: San Francisco, CA, USA, 2007;
ISBN 978-0-12-373580-5.
23. Seberry, J.; Wysocki, B.J.; Wysocki, T.A. On some applications of Hadamard matrices. Metrika 2005, 62,
221–239, doi:10.1007/s00184-005-0415-y.
24. Paley, R.E.A.C. On Orthogonal Matrices. J. Math. Phys. 1933, 12, 311–320, doi:10.1002/sapm1933121311.
25. Dade, E.C.; Goldberg, K. The construction of Hadamard matrices. Mich. Math. J. 1959, 6, 247–250,
doi:10.1307/mmj/1028998229.
26. Williamson, J. Hadamard’s determinant theorem and the sum of four squares. Duke Math. J. 1944, 11, 65–81,
doi:10.1215/S0012-7094-44-01108-7.
27. Bush, K.A. Unbalanced Hadamard matrices and finite projective planes of even order. J. Comb. Theory Ser. A
1971, 11, 38–44, doi:10.1016/0097-3165(71)90005-7.
28. Bush, K.A. Atti del Convegno di Geometria Combinatoria e sue Applicazioni; University Perugia: Perugia, Italy,
1971, Volume 131.
29. Wallis, J.S. On the existence of Hadamard matrices. J. Comb. Theory A 1976, 21, 188–195,
doi:10.1016/0097-3165(76)90062-5.
30. Battaglia, D.A.; Santoro, G.E.; Tosatti, E. Optimization by quantum annealing: Lessons from hard satisfiability
problems. Phys. Rev. E 2005, 71, 066707, doi:10.1103/PhysRevE.71.066707.
© 2018 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Quantum Genetic Learning Control of Quantum
Ensembles with Hamiltonian Uncertainties
Ameneh Arjmandzadeh and Majid Yarahmadi *
Department of Mathematics and Computer sciences, Lorestan University, Khorramabad, Lorestan 465, Iran;
[email protected]
* Correspondence: [email protected]; Tel.: +98-916-665-3079
Abstract: In this paper, a new method is designed for controlling a quantum ensemble whose members
have uncertainties in their Hamiltonian parameters. By combining sampling-based
learning control (SLC) with a new quantum genetic algorithm (QGA) method, control
of an ensemble of two-level quantum systems with Hamiltonian uncertainties is achieved.
An SLC algorithm is designed to transfer the ensemble members simultaneously to a desired state.
To reduce the transfer error significantly, an optimization problem is defined. Considering the
advantages of QGA and the nature of the problem, the optimization problem is solved using the QGA
method. For this purpose, N samples are generated by sampling the uncertainty parameters
from a uniform distribution, and an augmented system is constructed. By using QGA
in the training step, the best control signal is obtained. To test and validate
the method, the obtained control is applied to some randomly selected samples. A couple of
examples are simulated to investigate the proposed model. The results of the simulations indicate
the effectiveness and the advantages of the proposed method.
Keywords: quantum control; quantum genetic algorithm; sampling-based learning control (SLC)
1. Introduction
In quantum phenomena, as in classical systems, uncertainties and noise are
unavoidable. For example, in superconducting qubits, the coupling energy of a Josephson junction may
have fluctuations [1,2]. Noises and fluctuations may exist in magnetic fields and electric fields in cavity
quantum electrodynamics (QED) [3,4]. The spins of an ensemble in nuclear magnetic resonance (NMR)
experiments may not be exactly known with respect to the strength of the applied radio frequency
field [5].
The classification of inhomogeneous quantum ensembles is a significant issue which has many
applications in the discrimination of atoms (or molecules), the separation of isotopic molecules,
and quantum information extraction. Thus, treating the quantum systems with uncertainties is an
important and applicable subject which needs to be considered.
A quantum ensemble consists of a large number of single quantum systems. In the practical world,
some of the quantum systems exist in the form of quantum ensembles. Each single quantum system in
a quantum ensemble is referred to as a member of the ensemble [6]. Quantum ensembles have wide
applications in emerging quantum technology, including long-distance quantum communication [7],
quantum computation [8], and magnetic resonance imaging [9].
Control of inhomogeneous quantum ensembles is an important issue in practical applications.
Control of inhomogeneous quantum systems for discrimination between two or more similar systems,
for instance, is an attractive field of study [10]. In practical applications, the members of quantum
ensembles could have variations in some parameters of dynamic systems. These situations are referred
to as inhomogeneous quantum ensembles [6].
There are many approaches to solving quantum control problems with
uncertainties. For instance, an optimal control for NMR pulse sequences has been designed by applying
gradient algorithms [11]. Additionally, a sequential convex programming method has been proposed for
designing robust quantum manipulations [12]. Dong and his colleagues extended
the variable structure control approach with sliding modes to improve the robustness of quantum
systems, presenting a sliding mode control method for two-level quantum systems that treats
bounded uncertainties in the system Hamiltonian [13]. In addition to these works, a Lyapunov control
method has been presented to attain universal quantum control [14]. Sampling-based
learning control (SLC) of inhomogeneous quantum ensembles was first presented in [6] to
compensate for parameter dispersion. As an important application, a sampling-based learning
controller has been used for the control design of superconducting quantum systems [15]. The construction
of universal quantum gates using sampling-based learning control has been presented in order to
find robust optimal control fields in the presence of different fluctuations and uncertainties [16].
Furthermore, an extended sampling-based learning control for designing a robust quantum unitary
transformation in quantum information processing has been presented and implemented [17]. In other
applications, a sampling-based robust control is used to prevent a control field from failing in
laser-assisted collisions [18].
In [19], a systematic sampling-based learning control method with gradient-based learning
algorithms for steering the components of inhomogeneous quantum ensembles with uncertainties to
the same ideal state is investigated by Dong and coworkers. There are some challenges in gradient
algorithms. For instance, they may fall into a local optimum depending on the initial choices of
problem variables or, in complex situations, function derivatives may not be easily found.
Genetic-type algorithms (GAs) have been used in optimization problem-solving. For this purpose,
by applying cross-over and mutation operators on current solutions, new solutions are generated
and, statistically, they are moving toward optimal solutions in the search space. The set of solutions,
however, converges to an optimum solution according to the principle of the Darwinian theory
of evolution.
The quantum genetic algorithm (QGA) was introduced by Narayanan and Moore [20]. The QGA,
with even a smaller population, presents a great ability of global optimization and good robustness.
Therefore, as compared with the common genetic algorithm, QGA has greater effectiveness [21,22].
QGAs are mostly constructed based on qubits (or quantum bits) and state superposition in quantum
mechanics. In contrast to classical representations of chromosomes (a binary string, for instance), here
they are represented by vectors of qubits (quantum registers).
In this paper, a hybrid method based on the SLC method and QGA is used for controlling quantum
systems with uncertainties. Specifically, artificial samples are generated by sampling the
uncertainty parameters in the system model, and an augmented system is constructed from these
samples in the training step. Then, to train a control law with the desired performance for the
augmented system, QG (quantum genetic) learning and optimization algorithms are used. In the
testing step, a set of selected uncertainty samples is used to evaluate the control performance.
Additionally, the QGA is improved to attain better results. In [22], a quantum
mutation operation is added to the conventional quantum genetic algorithm as an improvement.
Quantum mutation, by swapping the values of the probability amplitudes (α, β) of a qubit, can completely
reverse the individual’s evolutionary direction. In this paper, the mutation operation is implemented
on measured qubits (bit strings), which is more effective than adding quantum mutation. A reduction in
the number of learning iterations, the test error, and the training error, as well as an increase in the
fidelity index, are advantages of the proposed method.
This paper is organized as follows: Section 2 presents the quantum control model and formulates
the control problem; A quantum genetic learning ensemble control algorithm is designed in Section 3;
Simulation results and control performance are illustrated in Section 4; Conclusions are presented in
Section 5.
Entropy 2017, 19, 376
2. Problem Formulation
In this paper, a finite-dimensional (N-level) closed quantum system with a state in an underlying
Hilbert space is considered. The states can be written as a superposition of eigenstates as follows:
\[
|\psi(t)\rangle = \sum_{i=1}^{N} c_i(t)\,|\phi_i\rangle \tag{1}
\]
where the complex numbers ci(t) satisfy ∑_{i=1}^{N} |ci(t)|^2 = 1 and {|φi⟩}_{i=1}^{N} are the eigenstates of the N-level
quantum system [23]. Usually, the states of two-level quantum systems are considered as arrows from
the origin to points on the Bloch sphere [24].
The dynamical equation can be described as the following Schrödinger equation:
\[
i\hbar\,\frac{d}{dt}|\psi(t)\rangle = H(t)\,|\psi(t)\rangle, \qquad |\psi(t=0)\rangle = |\psi_0\rangle \tag{2}
\]
where ħ is the reduced Planck constant (ħ = 1 is assumed in this paper), H(t) is the system Hamiltonian, and i = √−1.
The dynamics of the system are governed under the following Hamiltonian:
\[
H(t) = H_0 + H_c(t) = H_0 + \sum_{m=1}^{M} u_m(t)\,H_m \tag{3}
\]
where H0 is the free Hamiltonian of the system and Hc (t) is the time-dependent control Hamiltonian
that represents the interaction of the system with the external control fields um (t), m = 1, 2, . . . , M
(scalar functions). Additionally, Hm for m = 1, 2, . . . , M are Hermitian operators.
In practical applications, there exist external disturbances affecting the control fields. Assume
that the system Hamiltonian is disturbed as follows:
\[
H_{\Theta}(t) = f_0(\theta_0)\,H_0 + \sum_{m=1}^{M} f_m(\theta_m)\,u_m(t)\,H_m \tag{4}
\]
Suppose that an ensemble of similar members with different Hamiltonians is given. The main
objective is to drive the members from an initial state to a desired state. To control the ensemble,
one can select a set of samples instead of all ensemble members and create an augmented system to
be controlled. Let { HΘn , n = 1, 2, . . . , N } be the Hamiltonian of the selected samples, where N is the
number of the training samples. The augmented system is constructed as follows:
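\[
\frac{d}{dt}
\begin{pmatrix}
|\psi_1(t)\rangle \\ |\psi_2(t)\rangle \\ \vdots \\ |\psi_N(t)\rangle
\end{pmatrix}
= -i
\begin{pmatrix}
H_{\Theta_1}(t)\,|\psi_1(t)\rangle \\ H_{\Theta_2}(t)\,|\psi_2(t)\rangle \\ \vdots \\ H_{\Theta_N}(t)\,|\psi_N(t)\rangle
\end{pmatrix}, \tag{6}
\]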
where Θn ∈ {(θ_{0n_0}, θ_{1n_1}, …, θ_{Mn_M})}, n_0 = 1, 2, …, N_0, …, n_M = 1, 2, …, N_M, and N = ∏_{j=0}^{M} N_j
is the number of training samples. The task is to find the best control u∗ such that the
performance function
\[
J(u) = \frac{1}{N}\sum_{n=1}^{N}\left|\left\langle \psi_n(T)\,\middle|\,\psi_{n\,\mathrm{target}}\right\rangle\right|^{2} \tag{7}
\]
for each control strategy in u = {um (t), m = 1, 2, . . . , M }, is maximized. Thus, the control problem can
be formulated as a maximization problem as follows:
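\[
\begin{aligned}
\max_{u}\quad & J(u) = \frac{1}{N}\sum_{n=1}^{N}\left|\left\langle \psi_n(T)\,\middle|\,\psi_{n\,\mathrm{target}}\right\rangle\right|^{2} \\
\text{s.t.}\quad & \frac{d}{dt}
\begin{pmatrix}
|\psi_1(t)\rangle \\ |\psi_2(t)\rangle \\ \vdots \\ |\psi_N(t)\rangle
\end{pmatrix}
= -i
\begin{pmatrix}
H_{\Theta_1}(t)\,|\psi_1(t)\rangle \\ H_{\Theta_2}(t)\,|\psi_2(t)\rangle \\ \vdots \\ H_{\Theta_N}(t)\,|\psi_N(t)\rangle
\end{pmatrix}, \\
& |\psi(t=0)\rangle = |\psi_0\rangle, \quad H_{\Theta_n}(t) = f_{0,n}(\theta_{0,n})\,H_0 + \sum_{m=1}^{M} f_{m,n}(\theta_{m,n})\,u_m(t)\,H_m, \quad n = 1, 2, \ldots, N
\end{aligned}
\tag{8}
\]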
So, for t ∈ [0, Δt] considering Equation (9), the objective function of Equation (8), changes to:
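\[
\max\; J(u) = \frac{1}{N}\sum_{n=1}^{N}\left|\left\langle \psi_n(0)\right| e^{-i\left(H_0 f_0(\theta_{0n_0}) + \sum_{m=1}^{M} u_m f_m(\theta_{mn_m}) H_m\right)\Delta t}\left|\psi_{n\,\mathrm{target}}\right\rangle\right|^{2}. \tag{10}
\]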
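For a two-level system with one constant control, an averaged fidelity of this kind can be evaluated in closed form. The sketch below makes several assumptions that are not stated in the text above: H0 = σz/2, a single control Hamiltonian H1 = σx/2, and f0(θ) = f1(θ) = θ. It propagates the initial state forward and then takes the squared overlap with the target. All function names are illustrative.

```python
import math
from itertools import product


def propagate(state, cz, cx, dt):
    """Apply exp(-i (cz*sigma_z + cx*sigma_x) dt) to a two-level state (a, b)."""
    a, b = state
    r = math.hypot(cz, cx)
    if r == 0.0:
        return (a, b)
    c, s = math.cos(r * dt), math.sin(r * dt) / r
    # (cz*sigma_z + cx*sigma_x) acting on (a, b) gives (cz*a + cx*b, cx*a - cz*b).
    return (c * a - 1j * s * (cz * a + cx * b),
            c * b - 1j * s * (cx * a - cz * b))


def fidelity(psi, target):
    """Squared modulus of the overlap <psi|target>."""
    overlap = sum(x.conjugate() * y for x, y in zip(psi, target))
    return abs(overlap) ** 2


def objective(u, thetas0, thetas1, psi0, target, dt):
    """Average fidelity over the N = N0 * N1 sampled ensemble members."""
    samples = list(product(thetas0, thetas1))
    total = 0.0
    for th0, th1 in samples:
        # Assumed model: H = th0 * sigma_z / 2 + th1 * u * sigma_x / 2.
        psi = propagate(psi0, 0.5 * th0, 0.5 * th1 * u, dt)
        total += fidelity(psi, target)
    return total / len(samples)
```

With no dispersion a resonant π pulse gives unit fidelity, whereas a spread in θ1 lowers the average, which is what the learning step tries to compensate for.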
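Hence, [0, T] is divided into Q subintervals of equal length Δt = T/Q, and the controls um(t), m = 1, 2, …, M,
are assumed to be constant in each subinterval. Let |ψ_n^{j−1}(0)⟩ be the initial state of the control
system in the j-th subinterval; then, for the j-th subinterval, the following problem must be solved:
\[
\max\; J_j(u) = \left|\left\langle e^{-i\left(H_0 f_0(\theta_{0n_0}) + \sum_{m=1}^{M} u_m f_m(\theta_{mn_m}) H_m\right)\Delta t}\,\psi_n^{\,j-1}(0)\,\middle|\,\psi^{\,j}_{n\,\mathrm{target}}\right\rangle\right|^{2}, \tag{11}
\]
where
\[
\left|\psi^{\,j}_{n\,\mathrm{target}}\right\rangle = \frac{|\psi_n(0)\rangle + j\,\dfrac{|\psi_{n\,\mathrm{target}}\rangle - |\psi_n(0)\rangle}{Q}}{\left\|\,|\psi_n(0)\rangle + j\,\dfrac{|\psi_{n\,\mathrm{target}}\rangle - |\psi_n(0)\rangle}{Q}\,\right\|} \tag{12}
\]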
is the target state of the j-th subinterval for the n-th sample. In each subinterval, Equation (11) is
solved by the QGA and the best control $u_m^{*j}$, $m = 1, \ldots, M$, is obtained. Then, for $j = 1, \ldots, Q$,
\[
|\psi_n^{j}\rangle = e^{-i\left(H_0 f_0(\theta_{0n_0}) + \sum_{m=1}^{M} u_m^{*j} f_m(\theta_{mn_m}) H_m\right)\Delta t}\, |\psi_n^{j-1}(0)\rangle
\tag{13}
\]
is the state reached under the optimal control $u_m^{*j}$, $m = 1, 2, \ldots, M$, in the j-th subinterval. It is
taken as the initial state of the next subinterval, that is, $|\psi_n^{j}(0)\rangle = |\psi_n^{j}\rangle$ is the initial state of the
(j + 1)-th subinterval, and the process continues.
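The intermediate targets of Equation (12) can be sketched as below, under the assumption that the denominator of Equation (12) is the norm of its numerator, so that each intermediate target is a normalized point on the straight line from the initial state to the final target. The function name is hypothetical.

```python
import numpy as np

def intermediate_target(psi0, target, j, q):
    """Normalized state a fraction j/q of the way from psi0 to target."""
    v = psi0 + (j / q) * (target - psi0)
    return v / np.linalg.norm(v)

# Example with the two-level states used in the first simulation (Q = 20).
psi0 = np.array([1, 0], dtype=complex)
target = np.array([0, 1], dtype=complex)
targets = [intermediate_target(psi0, target, j, 20) for j in range(1, 21)]
```

Note that this interpolation is ill-defined if the straight line passes through the zero vector (for instance when the target equals minus the initial state); that case does not arise in the two examples of this paper.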
where m indicates the number of genes in each chromosome and k represents the number of qubits
encoding each gene. In the initial generation (when t = 0), the quantum encoding (α, β) of each individual
in the population is initialized with $(1/\sqrt{2}, 1/\sqrt{2})$, which means that the superposed state collapses
into each basis state with equal probability.
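The initialization and the measurement (collapse) of a quantum population can be sketched as follows. This is an illustrative sketch with real amplitudes only; the array layout and the function names are assumptions, and the population size 100 and chromosome length 24 are the values quoted in the simulation section.

```python
import numpy as np

def init_population(size, n_qubits):
    """Every qubit starts with (alpha, beta) = (1/sqrt(2), 1/sqrt(2))."""
    return np.full((size, n_qubits, 2), 1.0 / np.sqrt(2.0))

def measure(population, rng):
    """Collapse each qubit to a bit; the bit is 1 with probability beta**2."""
    beta_sq = population[..., 1] ** 2
    return (rng.random(beta_sq.shape) < beta_sq).astype(int)

rng = np.random.default_rng(0)
population = init_population(100, 24)
bits = measure(population, rng)  # binary population, one row per individual
```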
where $(\alpha_i, \beta_i)^T$ and $(\alpha_i', \beta_i')^T$ are the probability amplitudes of the i-th qubit in a chromosome before
and after the quantum rotating gate update, respectively, and $\theta_i$ is the rotation angle.
Table 1 presents the updating strategies for the chromosomes. The value and the sign of $\theta_i$ are
determined by the adjustment strategy. Here, $x_i$ is the i-th bit of the current chromosome; $\mathrm{Ref}_i$ is the
i-th bit of the current optimal binary solution, called the reference binary solution, toward whose
corresponding chromosome all quantum chromosomes should be steered; $f(x)$ is the fitness function;
$s(\alpha_i, \beta_i)$ is the direction of rotation; and $\Delta\theta_i$ is the increment of the i-th rotation
angle. The value of $\Delta\theta_i$ is a constant and is usually around 0.01π. The overall process in the QGA is similar
to that of GAs, but differs in how one generation is changed into the next. In fact, a new
generation P(t) is obtained by applying quantum rotating gates to the individuals.
                                        s(α_i, β_i)
x_i  Ref_i  f(x) > f(Ref)  Δθ_i   α_i β_i > 0   α_i β_i < 0   α_i = 0   β_i = 0
0    0      FALSE          0      0             0             0         0
0    0      TRUE           0      0             0             0         0
0    1      FALSE          Δθ_i   +1            −1            0         ±1
0    1      TRUE           Δθ_i   −1            +1            ±1        0
1    0      FALSE          Δθ_i   −1            +1            ±1        0
1    0      TRUE           Δθ_i   +1            −1            0         ±1
1    1      FALSE          0      0             0             0         0
1    1      TRUE           0      0             0             0         0
Additionally, Figure 1 gives a schematic diagram of the proposed method. In this diagram,
first, a random population of quantum chromosomes P(t) is generated. A binary population Pb(t) is then
obtained by measuring the present population. After evaluating Pb(t) and identifying the best solution
Ref, all of the quantum chromosomes are rotated toward the chromosome corresponding to
Ref, according to Table 1. This process generates a new population with better fitness. As indicated in
Figure 1, the above steps are repeated until the stop criterion is satisfied for all j = 1, 2, . . . , Q.
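The update rule of Table 1 can be sketched for a single qubit as follows. The sketch assumes the standard 2 × 2 rotation matrix that is commonly used for the quantum rotating gate in QGAs and real amplitudes; the function names are hypothetical. The two non-trivial patterns of Table 1 are read here as "rotate toward bit 1" and "rotate toward bit 0".

```python
import math
import random

def direction(x, ref, x_better, alpha, beta, rng):
    """Rotation direction s(alpha, beta) read off Table 1 (0 means no rotation)."""
    if x == ref:
        return 0
    # Rows (0, 1, FALSE) and (1, 0, TRUE) share one pattern, the other two rows the opposite one.
    toward_one = (x == 0 and not x_better) or (x == 1 and x_better)
    prod = alpha * beta
    if prod > 0:
        return 1 if toward_one else -1
    if prod < 0:
        return -1 if toward_one else 1
    if alpha == 0:
        return 0 if toward_one else rng.choice([-1, 1])
    return rng.choice([-1, 1]) if toward_one else 0  # remaining case: beta == 0

def rotate(alpha, beta, theta):
    """Apply the rotation gate [[cos, -sin], [sin, cos]] to (alpha, beta)."""
    c, s = math.cos(theta), math.sin(theta)
    return c * alpha - s * beta, s * alpha + c * beta

rng = random.Random(0)
alpha = beta = 1.0 / math.sqrt(2.0)
step = direction(0, 1, False, alpha, beta, rng) * 0.01 * math.pi
alpha, beta = rotate(alpha, beta, step)  # probability of measuring 1 increases
```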
4. Simulation Results
In this section, two examples are simulated. Assume that all of the control signals are bounded in
a known interval $[u_{\min}, u_{\max}]$.
The objective and protocol of the simulation are as follows:
Let $[u_{\min}, u_{\max}] = [-4, 6]$ and let the initial state be $|\psi(0)\rangle = (0, 0, 1)$ in real coordinates (i.e., $|\psi_0\rangle = [1\ 0]^T$).
The time interval [0, 5] (T = 5) is divided into Q = 20 time slices of length Δt = 0.25. Additionally, the
quantum genetic populations are the input control signals. The number of generations, the size
of the population, and the length of each quantum chromosome are 200, 100, and 24, respectively.
The mutation rate is 0.05 and the selection percentage of individuals is 50%. The stop condition for
the iterative algorithm is $|1 - J(u)| < \varepsilon$ ($\varepsilon = 0.001$). The objective
is to transfer all of the initial states to the target state $|\psi(T)\rangle = (0, 0, -1)$ (i.e., $|\psi_T\rangle = [0\ 1]^T$) with
maximum fidelity. The value of $\Delta\theta_i$ is set to 0.01π.
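\[
i\frac{d}{dt}|\psi(t)\rangle = \Big( f_0(\theta_0) H_0 + \sum_{m=1}^{2} f_m(\theta_m)\, u_m(t)\, H_m \Big) |\psi(t)\rangle, \qquad
|\psi(t = 0)\rangle = |\psi_0\rangle,
\tag{17}
\]
where $H_0 = \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \frac{1}{2}\sigma_z$ is the free Hamiltonian and
$H_1 = \frac{1}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \frac{1}{2}\sigma_x$, $H_2 = \frac{1}{2}\sigma_y = \frac{1}{2}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$.
Additionally, $\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $\sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$, and $\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ are the Pauli matrices.
Assume that the system's state is written as
\[
|\psi(t)\rangle = c_1(t)|1\rangle + c_2(t)|2\rangle,
\tag{18}
\]
where $B = \{|1\rangle, |2\rangle\}$ is the orthonormal basis of the corresponding Hilbert space.
Let $C(t) = (c_1(t), c_2(t))$, where the $c_i(t)$ are complex time-dependent coefficients. Therefore, Equation (17)
is equivalent to
\[
i\dot{C}(t) = \Big( f_0(\theta_0) H_0 + \sum_{m=1}^{2} f_m(\theta_m)\, u_m(t)\, H_m \Big) C(t).
\tag{19}
\]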
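In this example, let $f_m(\theta_m) = (1 - 2\theta_m^2)\exp(-\theta_m^2/2)$ be the Mexican hat wavelet functions for
m = 1, 2 and $f_0(\theta_0) = 1$ on $[1 - E, 1 + E]$ for E = 0.21. After sampling the uncertainty parameters,
every sample can be described as follows:
\[
\begin{pmatrix} \dot{c}_1(t) \\ \dot{c}_2(t) \end{pmatrix}
= -i
\begin{pmatrix} 0.5 f_0(\theta_0) & G(\theta_1, \theta_2) \\ G^*(\theta_1, \theta_2) & -0.5 f_0(\theta_0) \end{pmatrix}
\begin{pmatrix} c_1(t) \\ c_2(t) \end{pmatrix}
\tag{20}
\]
where $G(\theta_1, \theta_2) = 0.5\,( f_1(\theta_1) u_1(t) - i f_2(\theta_2) u_2(t) )$ and $\theta_i \in [1 - E, 1 + E]$. Additionally, $G^*$ is the
complex conjugate of G. The diagonal entries $\pm 0.5 f_0(\theta_0)$ follow from $H_0 = \frac{1}{2}\sigma_z$ in Equation (19).
To construct an augmented system for the training step of the SLC method,
consider N training samples that are selected through sampling the uncertainties, as follows:
\[
\begin{pmatrix} \dot{c}_{1,n}(t) \\ \dot{c}_{2,n}(t) \end{pmatrix}
= -i
\begin{pmatrix} 0.5 f_0(\theta_{0,n}) & G(\theta_{1,n}, \theta_{2,n}) \\ G^*(\theta_{1,n}, \theta_{2,n}) & -0.5 f_0(\theta_{0,n}) \end{pmatrix}
\begin{pmatrix} c_{1,n}(t) \\ c_{2,n}(t) \end{pmatrix},
\quad n = 1, 2, \ldots, N.
\tag{21}
\]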
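The matrix of one sample can be cross-checked numerically by building it directly from Equation (19), with H_0, H_1, H_2 equal to half the Pauli matrices and the Mexican hat coefficient as printed in the text. The sketch below is only an illustration; the function names and the parameter values are assumptions.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def mexican_hat(theta):
    """f_m(theta) = (1 - 2 theta^2) exp(-theta^2 / 2), as written in the text."""
    return (1.0 - 2.0 * theta ** 2) * np.exp(-theta ** 2 / 2.0)

def two_level_hamiltonian(theta, u, f0=1.0):
    """f0*H0 + f1(theta_1)*u1*H1 + f2(theta_2)*u2*H2 with H0, H1, H2 = sigma/2."""
    f1, f2 = mexican_hat(theta[0]), mexican_hat(theta[1])
    return f0 * 0.5 * SZ + f1 * u[0] * 0.5 * SX + f2 * u[1] * 0.5 * SY

# Hypothetical sample and control values inside the stated intervals.
h = two_level_hamiltonian(theta=(1.0, 1.1), u=(2.0, -3.0))
g = h[0, 1]  # plays the role of G(theta_1, theta_2)
```

The off-diagonal entry reproduces G = 0.5 (f_1 u_1 − i f_2 u_2), and the matrix is Hermitian by construction.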
The results of simulation are illustrated in Figure 2. Figure 2a illustrates the control signals um (t),
m = 1, 2 obtained in the training step.
Figure 2. Control of an ensemble of two-level quantum systems with uncertainties: (a) control signals
um (t), m = 1, 2; (b) fidelity function (fitness function performance); (c) simultaneous steering of
ensemble members to the desired state.
Figure 2b illustrates the mean of the fidelity function over all states, which serves as the fitness function of the
QGA. Finally, Figure 2c illustrates the simultaneous steering of the ensemble members to the desired state.
As the simulation results indicate, 25 training samples are steered to the target state with a fidelity
of 0.9986 and an error of 0.001. When the control signals found in the training step are applied
to 200 test samples, the fidelity is 0.9968 and the corresponding error is 0.003.
Example 2: The second example is a three-level quantum system with uncertainties in the Hamiltonian parameters;
such systems are found widely in natural and artificial atoms. Some atoms can be described by a V-type three-level
quantum system model. Robust preparation of this class of states is important for practical applications
of quantum technology. The SLC, combined with the QGA, is used for a V-type quantum control system. Assume
the state of the system is written as:
\[
|\psi(t)\rangle = c_1(t)|1\rangle + c_2(t)|2\rangle + c_3(t)|3\rangle
\tag{22}
\]
with $B = \{|1\rangle, |2\rangle, |3\rangle\}$ the orthonormal basis of the corresponding Hilbert space.
Let $C(t) = (c_1(t), c_2(t), c_3(t))$, where the $c_i(t)$ are complex numbers. Then we have
\[
i\dot{C}(t) = \Big( f_0(\theta_0) H_0 + \sum_{m=1}^{4} f_m(\theta_m)\, u_m(t)\, H_m \Big) C(t).
\tag{23}
\]
After sampling the uncertainty parameters, every sample can be described as follows:
\[
\begin{pmatrix} \dot{c}_1(t) \\ \dot{c}_2(t) \\ \dot{c}_3(t) \end{pmatrix}
= -i
\begin{pmatrix}
1.5 f_0(\theta_0) & G(\theta_1, \theta_2) & G(\theta_3, \theta_4) \\
G^*(\theta_1, \theta_2) & f_0(\theta_0) & 0 \\
G^*(\theta_3, \theta_4) & 0 & 0
\end{pmatrix}
\begin{pmatrix} c_1(t) \\ c_2(t) \\ c_3(t) \end{pmatrix},
\tag{25}
\]
where $G(\theta_1, \theta_2) = f_1(\theta_1) u_1(t) - i f_2(\theta_2) u_2(t)$, $G(\theta_3, \theta_4) = f_3(\theta_3) u_3(t) - i f_4(\theta_4) u_4(t)$, and
$\theta_i \in [1 - E, 1 + E]$. $E \in [0, 1]$ is a given constant and $G^*$ is the complex conjugate of G. To allow comparison
with previous work, the uncertainty coefficients are chosen as in [19],
that is, $f_m(\theta_m) = \theta_m$ and $f_0(\theta_0) = \theta_0$ have uniform distributions over [0.79, 1.21]. To construct an
augmented system for the training step of the SLC design, we choose N training samples (denoted as
n = 1, 2, . . . , N) through sampling the uncertainties as follows:
\[
\begin{pmatrix} \dot{c}_{1,n}(t) \\ \dot{c}_{2,n}(t) \\ \dot{c}_{3,n}(t) \end{pmatrix}
= -i
\begin{pmatrix}
1.5 f_0(\theta_{0,n}) & G(\theta_{1,n}, \theta_{2,n}) & G(\theta_{3,n}, \theta_{4,n}) \\
G^*(\theta_{1,n}, \theta_{2,n}) & f_0(\theta_{0,n}) & 0 \\
G^*(\theta_{3,n}, \theta_{4,n}) & 0 & 0
\end{pmatrix}
\begin{pmatrix} c_{1,n}(t) \\ c_{2,n}(t) \\ c_{3,n}(t) \end{pmatrix}
\tag{26}
\]
where $G(\theta_{1,n}, \theta_{2,n}) = f_1(\theta_{1,n}) u_1(t) - i f_2(\theta_{2,n}) u_2(t)$ and $G(\theta_{3,n}, \theta_{4,n}) = f_3(\theta_{3,n}) u_3(t) - i f_4(\theta_{4,n}) u_4(t)$.
Now, the objective is to find a robust control strategy $u(t) = \{u_m(t), m = 1, 2, 3, 4\}$ to drive
the quantum system from $|\psi_0\rangle = |1\rangle$ (i.e., $C_0 = (1, 0, 0)$) to $|\psi_{\mathrm{target}}\rangle = \frac{1}{\sqrt{2}}(|2\rangle + |3\rangle)$
(i.e., $C_{\mathrm{target}} = (0, 1/\sqrt{2}, 1/\sqrt{2})$). The general conditions are the same as those of the previous
example, except that Q = 10. Regardless of the initial values, the results always converge, and the method is more
precise than the gradient method, as shown in Table 2.
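For orientation, the sample matrix of Equation (25) with $f_m(\theta_m) = \theta_m$, together with the initial and target coefficient vectors, can be written down as in the following sketch. The function name and the numerical values of θ and u are placeholders chosen for illustration, not values from the paper.

```python
import numpy as np

def v_type_hamiltonian(theta, u):
    """3x3 matrix of Equation (25) with f_m(theta_m) = theta_m.

    theta = (theta_0, ..., theta_4), u = (u_1, ..., u_4).
    """
    th0, th1, th2, th3, th4 = theta
    g12 = th1 * u[0] - 1j * th2 * u[1]
    g34 = th3 * u[2] - 1j * th4 * u[3]
    return np.array([[1.5 * th0, g12, g34],
                     [np.conj(g12), th0, 0.0],
                     [np.conj(g34), 0.0, 0.0]], dtype=complex)

c0 = np.array([1, 0, 0], dtype=complex)                     # |psi_0> = |1>
c_target = np.array([0, 1, 1], dtype=complex) / np.sqrt(2)  # (|2> + |3>)/sqrt(2)

h = v_type_hamiltonian((1.0, 0.9, 1.1, 1.2, 0.8), (1.0, -2.0, 0.5, 3.0))
initial_fidelity = abs(np.vdot(c_target, c0)) ** 2
```

Since the initial and target vectors are orthogonal, the fidelity starts at zero and the controls have to build up the overlap over the Q time slices.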
The training error is computed as $|1 - J(u^*(T))|$, in which $J(u^*(T))$ is the fidelity function for the
training samples. To calculate the test error, the optimal control $u^*$ is applied to test samples
that are selected randomly, and $|1 - J(u^*(T))|$ is computed for these test samples
as a test error index. The method presented in this paper always converges and does not depend
on the initial choice of $u = \{u_m(t), m = 1, 2, \ldots, M\}$. Figure 3a–c shows the control signals um (t),
m = 1, 2, 3, 4 and the fidelity function for steering the training samples simultaneously to the target state.
Figure 3. Control of an ensemble of three-level quantum systems with uncertainties: (a) control signals
um (t), m = 1, 2; (b) control signals um (t), m = 3, 4; (c) fidelity function.
The training samples are steered to the target state with a fidelity of 0.9982. The control values
found in training step are applied on 200 testing samples. The fidelity amount of 0.9954 is achieved
with a test error equal to 0.005. Figure 3a,b show the control signals u1 (t), u2 (t), u3 (t), and u4 (t)
through the time interval [0, 5] and Figure 3c illustrates the mean of the fidelity function of all of the
states as a fitness function of QGA.
5. Conclusions
In this paper, a new quantum genetic sampling-based learning controller is designed. For this
purpose, an unconstrained nonlinear optimization problem is formulated and solved by a new quantum
genetic algorithm. All of the members of an inhomogeneous quantum ensemble are transferred to a known
target state simultaneously. In this method, controller performance is independent of the initial input
values, which is an important advantage of the proposed method over gradient-based
learning methods. Additionally, transfer errors and the number of learning iterations are reduced
significantly. Two examples, for two- and three-level quantum systems, are simulated
using the proposed method. The simulation results indicate the advantages and efficiency of the
presented method.
Acknowledgments: The authors are very grateful to the editor and anonymous reviewers for their suggestions in
improving the quality of the paper.
Author Contributions: They conceived of the presented idea and developed the theory of the presented paper.
Both authors discussed the results and contributed to the final manuscript. Both authors have read and approved
the final manuscript.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Shnirman, A.; Schön, G.; Hermon, Z. Quantum manipulations of small Josephson Junctions. Phys. Rev. Lett.
1997, 79, 2371–2374. [CrossRef]
2. Makhlin, Y.; Schön, G.; Shnirman, A. Josephson junction quantum logic gates. Comput. Phys. Commun. 2000,
127, 156–164. [CrossRef]
3. Giovannetti, V.; Vitali, D.; Tombesi, P.; Ekert, A. Scalable quantum computation with cavity QED systems.
Phys. Rev. A 2000, 62, 032306. [CrossRef]
4. Shu, J.; Zou, X.; Xiao, Y.; Guo, G. Quantum phase gate of photonic qubits in a cavity QED system. Phys. Rev. A
2007, 74, 044302. [CrossRef]
5. Li, J.S.; Khaneja, N. Control of inhomogeneous quantum ensembles. Phys. Rev. A 2006, 73, 030302. [CrossRef]
6. Chen, C.; Dong, D.; Long, R.; Petersen, I.R.; Rabitz, H.A. Sampling-based learning control of inhomogeneous
quantum ensembles. Phys. Rev. A 2014, 89, 023402. [CrossRef]
7. Duan, L.M.; Lukin, M.D.; Cirac, J.I.; Zoller, P. Long-distance quantum communication with atomic ensembles
and linear optics. Nature 2001, 414, 413–418. [CrossRef] [PubMed]
8. Cory, D.G.; Fahmy, A.F.; Havel, T.F. Ensemble quantum computing by NMR spectroscopy. Proc. Natl. Acad.
Sci. USA 1997, 94, 1634–1639. [CrossRef] [PubMed]
9. Li, J.S.; Ruths, J.; Yu, T.Y.; Arthanari, H.; Wagner, G. Optimal pulse design in quantum control: A unified
computational method. Proc. Natl. Acad. Sci. USA 2011, 108, 1879–1884. [CrossRef] [PubMed]
10. Mitra, A.; Rabitz, H. Mechanistic Analysis of Optimal Dynamic Discrimination of Similar Quantum Systems.
J. Phys. Chem. A 2004, 108, 4778–4785. [CrossRef]
11. Khaneja, N.; Reiss, T.; Kehlet, C.; Schulte-Herbrüggen, T.; Glaser, S.J. Optimal control of coupled spin
dynamics: Design of NMR pulse sequences by gradient ascent algorithm. J. Magn. Reson. 2005, 172, 296–305.
[CrossRef] [PubMed]
12. Kosut, R.L.; Grace, M.D.; Brif, C. Robust control of quantum gates via sequential convex programming.
Phys. Rev. A 2013, 88, 1–12. [CrossRef]
13. Dong, D.; Petersen, I.R. Sliding mode control of two-level quantum systems. Automatica 2012, 48, 725–735.
[CrossRef]
14. Hou, S.C.; Wang, L.C.; Yi, X.X. Realization of quantum gates by Lyapunov control. Phys. Lett. A 2014, 378,
699–704. [CrossRef]
15. Dong, D.; Chen, C.; Qi, B.; Petersen, I.R.; Nori, F. Robust manipulation of superconducting qubits in the
presence of fluctuations. Sci. Rep. 2015, 5, 7873. [CrossRef] [PubMed]
16. Dong, D.; Wu, C.; Chen, C.; Qi, B.; Petersen, I.R.; Nori, F. Learning robust pulses for generating universal
quantum gates. Sci. Rep. 2015, 6, 36090. [CrossRef] [PubMed]
17. Wu, C.; Qi, B.; Chen, C. Robust learning control design for quantum unitary transformations.
IEEE Trans. Cybern. 2016, 99, 1–13. [CrossRef] [PubMed]
18. Zhang, W.; Dong, D.; Petersen, I.R.; Rabitz, H.A. Sampling-based robust control in synchronizing collision
with shaped laser pulses: An application. RSC Adv. 2016, 6, 92962–92969. [CrossRef]
19. Dong, D.; Mabrok, M.A.; Petersen, I.R.; Qi, B.; Chen, C.; Rabitz, H. Sampling-Based Learning Control for
Quantum Systems with Uncertainties. IEEE Trans. Control Syst. Technol. 2015, 23, 2155–2166. [CrossRef]
20. Narayanan, A.; Moore, M. Quantum-inspired genetic algorithm. In Proceedings of the IEEE International
Conference on Evolutionary Computation, Nagoya, Japan, 20–22 May 1996.
21. Laboudi, Z.; Chikhi, S. Comparison of Genetic Algorithm and Quantum Genetic Algorithm. Int. Arab J.
Inf. Technol. 2012, 9, 243–249.
22. Wang, H.; Liu, J.; Zhi, J.; Fu, C. The Improvement of Quantum Genetic Algorithm and Its Application on
Function Optimization. Math. Probl. Eng. 2013, 2013, 1–10. [CrossRef]
23. Wu, C.; Chen, C.; Qi, B.; Dong, D. Robust quantum operation for two-level systems using sampling-based
learning control. In Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics,
Hongkong, China, 9–12 October 2015.
24. Wang, L.C.; Hou, S.C.; Yi, X.X.; Dong, D.; Petersen, I.R. Optimal Lyapunov quantum control of two-level
systems: Convergence and extended techniques. Phys. Lett. A 2014, 378, 1074–1080. [CrossRef]
25. Nielsen, M.A.; Chuang, I.L. Distance Measures for Quantum Information; Cambridge University Press:
Cambridge, UK, 2000.
26. Lahoz-Beltra, R. Quantum Genetic Algorithms for Computer Scientists. Computers 2016, 5, 24. [CrossRef]
© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
entropy
Article
Discrete Wigner Function Derivation of the
Aaronson–Gottesman Tableau Algorithm
Lucas Kocia, Yifei Huang and Peter Love *
Department of Physics, Tufts University, Medford, MA 02155, USA; [email protected] (L.K.);
[email protected] (Y.H.)
* Correspondence: [email protected]; Tel.: +1-617-627-3029 (ext. 7-1065)
Abstract: The Gottesman–Knill theorem established that stabilizer states and Clifford operations can
be efficiently simulated classically. For qudits with odd dimension three and greater, stabilizer states
and Clifford operations have been found to correspond to positive discrete Wigner functions and
dynamics. We present a discrete Wigner function-based simulation algorithm for odd-d qudits that
has the same time and space complexity as the Aaronson–Gottesman algorithm for qubits. We show
that the efficiency of both algorithms is due to harmonic evolution in the symplectic structure of
discrete phase space. The differences between the Wigner function algorithm for odd-d and the
Aaronson–Gottesman algorithm for qubits are likely due only to the fact that the Weyl–Heisenberg
group is not in SU (d) for d = 2 and that qubits exhibit state-independent contextuality. This may
provide a guide for extending the discrete Wigner function approach to qubits.
1. Introduction
The cost of brute-force classical simulation of the time evolution of n-qubit states grows
exponentially with n. An important exception to this involves the set of Clifford operators acting
on stabilizer states. This set of states plays an important role in quantum error correction [1] and is
closed under action by Clifford gates. Efficient simulation of such systems was demonstrated with the
tableau algorithm of Aaronson and Gottesman [1,2] for qubits (d = 2). Finding the underlying reason
for why such an efficient algorithm is possible for Clifford circuit simulation has since been the subject
of much study [3–5].
Recent progress has been the result of work by Wootters [6], Gross [7], Veitch et al. [8,9],
Mari et al. [4], and Howard et al. [5], who have formulated a new perspective based on the
discrete phase spaces of states and operators in finite Hilbert spaces using discrete Wigner functions.
In odd-dimensional systems, they have shown that stabilizer states have positive-definite discrete
Wigner functions and that Clifford operators are positive-definite maps. This implies that Clifford
circuits are non-contextual and are efficiently simulatable on classical computers. In odd-dimensional
systems, stabilizer states have been shown to be the discrete analogue to Gaussian states in continuous
systems [7] and Clifford group gates have been shown to have underlying harmonic Hamiltonians
that preserve the discrete Weyl phase space points [10]. This means Clifford circuits are expressible by
path integrals truncated at order $\hbar^0$ and are thus manifestly classical [10,11].
This poses the question: what is the relationship between past efficient algorithms for Clifford
circuits and the propagation of discrete Wigner functions of stabilizer states under Clifford operators?
In the present paper, we show that the original Aaronson–Gottesman tableau algorithm for qubit
stabilizer states is actually equivalent to such a discrete Wigner function propagation and that the
tableau matrix coincides with the discrete Wigner function of a stabilizer state. We accomplish this by
first developing a Wigner function-based algorithm that classically simulates stabilizer state evolution
under Clifford gates and measurements in the Ẑ Pauli basis for odd d. We then show its equivalence
to the well-known Aaronson–Gottesman tableau algorithm [2] for qubits (d = 2). Both algorithms
require $O(n^2)$ dits to represent n stabilizer states, O(n) operations per Clifford operator, and both
deterministic and random measurements require $O(n^2)$ operations.
The Aaronson–Gottesman tableau algorithm makes use of the Heisenberg representation.
This means that time evolution is accomplished by updating an associated tableau or matrix
representation of the Clifford operators instead of the stabilizer states themselves. The algorithm we
present is framed in the Schrödinger picture and involves evolving the Wigner function of stabilizer
states. By demonstrating that the two algorithms are equivalent, we show that the formulation of
Clifford simulation in the Heisenberg picture is a choice and not a necessity for its efficient simulation.
Furthermore, by instead working in the Schrödinger picture we are able to more easily reveal the purely
classical basis of both algorithms and the physically intuitive phase space structures and symplectic
properties on which they rely.
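\[
\hat{F}_j = \frac{1}{\sqrt{d}} \sum_{k_j, l_j \in \mathbb{Z}/d\mathbb{Z}} e^{-\frac{2\pi i}{d} k_j l_j}\, |k_1, \ldots, k_j, \ldots, k_n\rangle \langle l_1, \ldots, l_j, \ldots, l_n| .
\]
This is the d-dimensional equivalent of the Hadamard gate and allows us to define the Pauli $\hat{X}_j$ operator
as follows:
\[
\hat{X}_j \equiv \hat{F}_j \hat{Z}_j \hat{F}_j^{\dagger}.
\tag{2}
\]
\[
\hat{X}_j^{\delta q}\, |k_1, \ldots, k_j, \ldots, k_n\rangle \equiv |k_1, \ldots, k_j \oplus \delta q, \ldots, k_n\rangle,
\tag{3}
\]
\[
\hat{Z}_j = e^{\frac{2\pi i}{d} \hat{q}_j}
\tag{4}
\]
and
\[
\hat{X}_j = e^{-\frac{2\pi i}{d} \hat{p}_j}.
\tag{5}
\]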
Thus, we can refer to the X̂ j basis as the momentum (p j ) basis, which is equivalent to the Fourier
transform of the q j basis:
p̂ j = F̂j q̂ j F̂j† . (6)
Entropy 2017, 19, 353
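The Wigner function $W_\Psi(p, q)$ of a pure state $|\Psi\rangle$ is defined on this discrete Weyl phase space:
\[
W_\Psi(p, q) = d^{-n} \sum_{\xi_q \in (\mathbb{Z}/d\mathbb{Z})^n} e^{-\frac{2\pi i}{d} \xi_q \cdot p}\;
\Psi\!\left(q + \frac{(d+1)\,\xi_q}{2}\right) \Psi^{*}\!\left(q - \frac{(d+1)\,\xi_q}{2}\right).
\tag{7}
\]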
This is equivalent to the discrete Wigner function introduced by Gross [7]. We will shortly be interested
in the discrete Wigner function of stabilizer states. However, first, we introduce the effect that the
Clifford gates have in this discrete Weyl phase space.
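To make Equation (7) concrete, the following sketch evaluates it for a single qudit (n = 1) of odd dimension d by direct summation. Since d is odd, (d + 1)/2 is an integer that acts as the inverse of 2 modulo d, so all arguments of Ψ are taken modulo d. The function name is an assumption, and the triple loop is written for clarity rather than speed.

```python
import numpy as np

def discrete_wigner(psi):
    """W[p, q] of Equation (7) for one qudit with odd dimension d = len(psi)."""
    d = len(psi)
    assert d % 2 == 1, "this sketch is restricted to odd d"
    half = (d + 1) // 2  # multiplicative inverse of 2 modulo d
    w = np.zeros((d, d))
    for p in range(d):
        for q in range(d):
            total = 0j
            for xi in range(d):
                phase = np.exp(-2j * np.pi * xi * p / d)
                total += phase * psi[(q + half * xi) % d] * np.conj(psi[(q - half * xi) % d])
            w[p, q] = (total / d).real  # the sum is real for any psi
    return w

# A position eigenstate |q0 = 1> for d = 3: W is 1/d on the line q = 1, zero elsewhere.
e1 = np.zeros(3, dtype=complex)
e1[1] = 1.0
w = discrete_wigner(e1)
```

Both a position eigenstate and a uniform superposition give non-negative Wigner functions supported on a line in phase space, in line with the statement that stabilizer states have non-negative discrete Wigner functions for odd d.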
where $x \equiv (p, q)$. When considering Clifford gate propagation, we can restrict to a set of gates that
generate the Clifford group. One such set of generators is made up of the phase-shift gate $\hat{P}_i$,
the Hadamard gate $\hat{F}_i$, and the controlled-not (CNOT) $\hat{C}_{ij}$ (which act on the ith and jth qudits).
The phase shift $\hat{P}_i$ is a one-qudit gate with the underlying Hamiltonian $H_{\hat{P}_i} = -\frac{d+1}{2} q_i^2 + \frac{d+1}{2} q_i$ [10].
Without loss of generality, we will instead consider
which we will refer to as the phase-shift gate in this paper. We note that the usual phase-shift can be
obtained from the new one within the Clifford group:
where $[\hat{P}_i', \hat{Z}_i] = [\hat{P}_i, \hat{Z}_i] = 0$. Hence, $\hat{P}_i'$ is an adequate replacement generator for $\hat{P}_i$, and we will use
it instead of $\hat{P}_i$ from now on. Since its Hamiltonian has no linear term ($H_{\hat{P}_i'} = -q_i^2$), this leads to an
easier presentation ahead since $\alpha_{\hat{P}_i'} = 0$. The corresponding equations of motion for $\hat{P}_i'$ are $\dot{p}_i = 2q_i$
and $\dot{q}_i = 0$. Hence, for Δt = 1,
\[
\left(M_{\hat{P}_i'}\right)_{j,k} = \delta_{j,k} + 2\,\delta_{i,j}\,\delta_{n+i,k}.
\tag{12}
\]
The Hadamard gate $\hat{F}_i$ is a one-qudit gate and has the underlying Hamiltonian
$H_{\hat{F}_i} = -\frac{\pi}{4}(p_i^2 + q_i^2)$ [10]. The corresponding equations of motion are $\dot{p}_i = \frac{\pi}{2} q_i$ and $\dot{q}_i = -\frac{\pi}{2} p_i$.
Hence, for Δt = 1,
\[
\left(M_{\hat{F}_i}\right)_{j,k} = \delta_{j,k} - \delta_{i,j}\delta_{i,k} - \delta_{n+i,j}\delta_{n+i,k} + \delta_{i,j}\delta_{n+i,k} - \delta_{n+i,j}\delta_{i,k},
\tag{13}
\]
and $\alpha_{\hat{F}_i} = 0$.
Finally, the two-qudit CNOT $\hat{C}_{ij}$ on control qudit i and second qudit j has the corresponding
Hamiltonian $H_{\hat{C}_{ij}} = p_i q_j$ [10]. The corresponding equations of motion are $(\dot{p}_i, \dot{p}_j) = -(0, p_i)$ and
$(\dot{q}_i, \dot{q}_j) = (q_j, 0)$. Hence, for Δt = 1,
\[
\left(M_{\hat{C}_{ij}}\right)_{k,l} = \delta_{k,l} - \delta_{i,k}\delta_{j,l} + \delta_{n+j,k}\delta_{n+i,l},
\tag{14}
\]
and $\alpha_{\hat{C}_{ij}} = 0$.
Theorem 1. The discrete Wigner function $W_\Psi(x)$ of a stabilizer state Ψ for any odd d and n qudits is $\delta_{\Phi \cdot x,\, r}$
for a 2n × 2n matrix Φ and a 2n-vector r with entries in $\mathbb{Z}/d\mathbb{Z}$.
An equivalent form was proven by Gross [7], who also showed that these discrete Wigner functions of
stabilizer states are non-negative. In particular, if we begin with a stabilizer state defined as $|\Psi_0\rangle = |q_0\rangle$,
then $W_{\Psi_0}(x) = \delta_{\Phi_0 \cdot x,\, r_0}$, where $\Phi_0 = \begin{pmatrix} 0 & 0 \\ 0 & I_n \end{pmatrix}$ for $I_n$ the n × n identity matrix, and $r_0 = (0, q_0)$.
The Kronecker delta function sets this linear system of equations equal to $r_t$. In this way, an affine
map (a linear transformation displaced from the origin by $r_t$) is defined. This system of equations
must be updated after every unitary propagation and measurement.
Since the Wigner functions $W_\Psi(x)$ of stabilizer states propagate under M as $W_\Psi(M x)$,
it follows that
\[
\Phi_t \to \Phi_t M_t^{-1}.
\tag{16}
\]
(The importance of the vector $r_t$ and when it must be updated will become evident when we consider
random measurements.) Hence, after n operations $M_1, M_2, \ldots, M_n$,
\[
M_t^{-1} = M_1^{-1} M_2^{-1} \cdots M_n^{-1}.
\tag{17}
\]
\[
J = \begin{pmatrix} 0 & -I_n \\ I_n & 0 \end{pmatrix}.
\tag{18}
\]
Thus, the stability matrices M for $\hat{F}_i$, $\hat{P}_i$ and $\hat{C}_{ij}$ given in Equations (12)–(14) differ from their
inverses only by sign changes in their off-diagonal elements:
\[
\left(M_{\hat{P}_i}^{-1}\right)_{j,k} = \delta_{j,k} - 2\,\delta_{i,j}\,\delta_{n+i,k},
\tag{19}
\]
\[
\left(M_{\hat{F}_i}^{-1}\right)_{j,k} = \delta_{j,k} - \delta_{i,j}\delta_{i,k} - \delta_{n+i,j}\delta_{n+i,k} - \delta_{i,j}\delta_{n+i,k} + \delta_{n+i,j}\delta_{i,k},
\tag{20}
\]
and
\[
\left(M_{\hat{C}_{ij}}^{-1}\right)_{k,l} = \delta_{k,l} + \delta_{i,k}\delta_{j,l} - \delta_{n+j,k}\delta_{n+i,l}.
\tag{21}
\]
We assume the quantum state is initialized in the computational basis state $\Psi_0 = \underbrace{|0\rangle \otimes \cdots \otimes |0\rangle}_{n}$
and so initially we should set $\Phi_0 = \begin{pmatrix} 0 & 0 \\ 0 & I_n \end{pmatrix}$ and $r_0 = 0$. The initial stabilizer state is $W_{\Psi_0} = \delta_{q_t, 0}$.
However, it will become clear when we discuss measurements that it is practically useful to instead set
\[
\Phi_0 = \begin{pmatrix} I_n & 0 \\ 0 & I_n \end{pmatrix},
\tag{22}
\]
thereby setting $W_{\Psi_0} = \delta_{(p_t, q_t),(0,0)}$, which is not a true Wigner function. This new matrix $\Phi_0$ is equivalent to
the last matrix if the first n rows in $\Phi_t x$ and $r_t$ are ignored, which is the same as ignoring $p_0(p_t, q_t)$. In fact,
we have two Wigner functions here: one defined by the first n rows and another by the last n rows.
We proceed in this manner, ignoring the first n rows, until their usefulness becomes apparent.
For n qudits, unitary propagation requires $O(n^2)$ dits of storage to track $\Phi_t$ and $r_t$. More precisely,
since $\Phi_t$ is a 2n × 2n matrix and $r_t$ is a 2n-vector, 2n(2n + 1) dits of storage are necessary.
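CNOT from control i to target j ($\hat{C}_{ij}$). For all $k \in \{1, \ldots, 2n\}$, set $\Phi_{k,j} \to \Phi_{k,j} \oplus \Phi_{k,i}$ and
$\Phi_{k,n+i} \to \Phi_{k,n+i} \ominus \Phi_{k,n+j}$, where ⊕ and ⊖ denote addition and subtraction modulo d.
This confirms that unitary propagation in this scheme requires O(n) operations.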
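The O(n) column updates can be sketched as in-place operations on Φ. The CNOT rule is the one stated above; the phase-shift and Hadamard rules in this sketch are derived here from Φ → ΦM⁻¹ with the inverse matrices of Equations (19) and (20), so they should be read as a reconstruction under that assumption. Indices are zero-based and the function names are hypothetical.

```python
import numpy as np

def apply_phase(phi, i, d):
    """Phi -> Phi M^{-1} for Equation (19): column q_i loses twice column p_i."""
    n = phi.shape[1] // 2
    phi[:, n + i] = (phi[:, n + i] - 2 * phi[:, i]) % d

def apply_hadamard(phi, i, d):
    """Phi -> Phi M^{-1} for Equation (20): columns p_i and q_i swap with one sign flip."""
    n = phi.shape[1] // 2
    old_p = phi[:, i].copy()
    phi[:, i] = phi[:, n + i]
    phi[:, n + i] = (-old_p) % d

def apply_cnot(phi, i, j, d):
    """Phi -> Phi M^{-1} for Equation (21), control i and target j (i != j)."""
    n = phi.shape[1] // 2
    phi[:, j] = (phi[:, j] + phi[:, i]) % d
    phi[:, n + i] = (phi[:, n + i] - phi[:, n + j]) % d

# Start from Equation (22) for two qutrits and apply a short circuit.
n, d = 2, 3
phi = np.eye(2 * n, dtype=int)
apply_hadamard(phi, 0, d)
apply_cnot(phi, 0, 1, d)
```

Each update touches a constant number of columns of length 2n, which is where the O(n) cost per gate comes from.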
3.3. Measurement
The outcome of a measurement Ẑi on a stabilizer state can be either random or deterministic.
As described above, the bottom half of $\Phi_t$ defines $q_{0j}$ for $j \in \{1, \ldots, n\}$, each of which is a linear
combination of the $q_{ti}$ and $p_{ti}$. The entries in the (n + j)th row of $\Phi_t$ give the coefficients of $p_{ti}$ and $q_{ti}$ in
$q_{0j}$ for $j \in \{1, \ldots, n\}$. If the coefficient of $p_{ti}$ in any $q_{0j}$ is non-zero, then the measurement $\hat{Z}_i$ will be
random. If all coefficients of $p_{ti}$ are zero in $q_{0j}$ for all j, then the measurement of $\hat{Z}_i$ will be deterministic.
This can be seen from the fact that if our stabilizer state $|\Psi\rangle$ is an eigenstate of $\hat{Z}_i$, then $\hat{Z}_i|\Psi\rangle = e^{i\phi}|\Psi\rangle$
for some $\phi \in \mathbb{R}$, and (discrete) Wigner functions do not change under a global phase. Thus, measuring
$\hat{Z}_i$ leaves the Wigner function of $|\Psi\rangle$ invariant if the measurement is deterministic. Since $\hat{Z}_i$ is a boost
operator that increments the momentum of a state by one, its effect on the linear system of equations
specified by the Wigner function is:
\[
\Phi_t \begin{pmatrix} p_{t1} \\ \vdots \\ p_{ti} \\ \vdots \\ p_{tn} \\ q_t \end{pmatrix}
= \begin{pmatrix} r_{tp} \\ r_{tq} \end{pmatrix}
\;\xrightarrow[\hat{Z}_i]{}\;
\Phi_t \begin{pmatrix} p_{t1} \\ \vdots \\ p_{ti} + 1 \\ \vdots \\ p_{tn} \\ q_t \end{pmatrix}
= \begin{pmatrix} r_{tp} \\ r_{tq} \end{pmatrix}.
\tag{23}
\]
Thus, if the lower half of the ith column of $\Phi_t$ is zero, then $\hat{Z}_i$ leaves the Wigner function invariant
(and so the measurement is deterministic). Verifying that these coefficients are all zero takes O(n)
operations for each $\hat{Z}_i$.
In other words, to see if a given measurement of $\hat{Z}_i$ is random or deterministic, a search must be
performed for non-zero elements $\Phi_{t\,n+j,i}$. If such a non-zero element exists, then the measurement
is random, since it means that the final momentum of qudit i affects the state of the stabilizer and so
its position must be undetermined (by Heisenberg's uncertainty principle). If no such non-zero
element exists, then the measurement $\hat{Z}_i$ is deterministic. We now describe the algorithm in detail for
these two cases:
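The test that separates the two cases can be sketched as a scan of the lower half of one column of Φ. The function name is an assumption and indices are zero-based.

```python
import numpy as np

def measurement_is_random(phi, i):
    """True if some row n+j of phi has a non-zero coefficient of p_i (column i)."""
    n = phi.shape[1] // 2
    return bool(np.any(phi[n:, i] != 0))

# With the initial matrix of Equation (22), every Z measurement is deterministic.
phi0 = np.eye(4, dtype=int)
outcomes = [measurement_is_random(phi0, i) for i in range(2)]
```

The scan reads n entries, which matches the O(n) cost quoted above for deciding between the two cases.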
so that the former now specifies $q_{0j}(p_t, q_t)$ while the latter specifies $q_{0i}(p_t, q_t)$. $p_{0i}$ has also been
updated by replacing the jth row in the first half of $\Phi_t$ with the (n + j)th row we just changed. Again,
this row now describes $p_{0i}(p_t, q_t)$ while the ith row now specifies $p_{0j}(p_t, q_t)$. Overall, this takes $O(n^2)$
operations since we are replacing O(n) rows with O(n) entries.
\[
\Phi_t x_t = r_t,
\tag{24}
\]
\[
\begin{pmatrix} p_0(p_t, q_t) \\ q_0(p_t, q_t) \end{pmatrix} = \begin{pmatrix} r_{tp} \\ r_{tq} \end{pmatrix},
\tag{25}
\]
where we are interested in linear combinations of the bottom half, $q_0(p_t, q_t)$, to solve for the
measurement outcome $q_{ti}$:
\[
\sum_{j=1}^{n} c_{ij}\, q_{0j} = q_{ti},
\tag{26}
\]
Lemma 1. The coefficient in front of $p_{ti}$ in the row of $\Phi_t$ that specifies $p_{0j}(p_t, q_t)$, $\Phi_{t\,ji}$, is equal to the coefficient
$c_{ij}$ in front of $q_{0j}$ that makes up $q_{ti}$ in Equation (26). Equivalently,
\[
c_{ij} = q_{0j} \cdot q_{ti}(p_0, q_0) = p_{0j}(p_t, q_t) \cdot p_{ti} = \Phi_{t\,ji}.
\tag{27}
\]
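$M_t^{-1} = -J M_t^{T} J$ since $M_t$ is symplectic. This means that we can express the matrix inversion
as follows:
\[
\begin{pmatrix} p_0 \\ q_0 \end{pmatrix} = M_t^{-1} \begin{pmatrix} p_t \\ q_t \end{pmatrix}
\tag{29}
\]
\[
= -J M_t^{T} J \begin{pmatrix} p_t \\ q_t \end{pmatrix}
\tag{30}
\]
\[
= -J \begin{pmatrix} (M_t)_{11} & (M_t)_{12} \\ (M_t)_{21} & (M_t)_{22} \end{pmatrix}^{T} J \begin{pmatrix} p_t \\ q_t \end{pmatrix}
\tag{31}
\]
\[
= \begin{pmatrix} (M_t)_{22} & (-M_t)_{12} \\ (-M_t)_{21} & (M_t)_{11} \end{pmatrix} \begin{pmatrix} p_t \\ q_t \end{pmatrix}.
\tag{32}
\]
Therefore, $\left(M_t^{-1}\right)_{11\,i,j} = (M_t)_{22\,i,j}$, and so
\[
c_{ij} = q_{0j} \cdot q_{ti}(p_0, q_0) = p_{0j}(p_t, q_t) \cdot p_{ti} = \Phi_{t\,ji}.
\tag{33}
\]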
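The identity $M_t^{-1} = -J M_t^T J$ that the proof relies on can be checked numerically modulo d for a product of generator matrices, as in the sketch below. The function name is an assumption, and the example matrices are the stability matrices of Equations (12)–(14) written out by hand for two qudits with zero-based indices.

```python
import numpy as np

def symplectic_inverse(m, d):
    """Inverse of a symplectic matrix modulo d, computed as -J M^T J."""
    n = m.shape[0] // 2
    z, e = np.zeros((n, n), dtype=int), np.eye(n, dtype=int)
    j = np.block([[z, -e], [e, z]])
    return (-j @ m.T @ j) % d

d = 5
p = np.eye(4, dtype=int); p[0, 2] = 2                                # phase shift on qudit 0
f = np.eye(4, dtype=int); f[1, 1] = 0; f[3, 3] = 0; f[1, 3] = 1; f[3, 1] = -1  # Hadamard on qudit 1
c = np.eye(4, dtype=int); c[0, 1] = -1; c[3, 2] = 1                  # CNOT, control 0, target 1
m_t = (p @ f @ c) % d
m_t_inv = symplectic_inverse(m_t, d)
```

No Gaussian elimination is involved: the inverse is obtained by transposing and rearranging blocks, which is the efficiency gain attributed to the symplectic structure.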
This property can also be seen in the drawing of phase space shown in Figure 1. There, initial
perpendicular $p_{0j}$ and $q_{0j}$ manifolds are drawn along with harmonically evolved $p_{ti}$ and $q_{ti}$ manifolds,
which remain perpendicular to each other and make an angle α to the first $p_{0j}$ and $q_{0j}$ manifolds,
respectively. The projection of $q_{ti}(p_0, q_0)$ onto $q_{0j}$ can be represented as the length b of the side of a right triangle
adjacent to the angle α, with the opposite side set to some length a. The projection of $p_{0j}(p_t, q_t)$
onto $p_{ti}$ is similarly represented by the length b′ of the side of a right triangle adjacent to the angle α, with the
opposite side also set to length a. It follows that the third angle β in both triangles must be the same,
and so by the law of sines
\[
\frac{a}{\sin\alpha} = \frac{b}{\sin\beta} = \frac{b'}{\sin\beta}.
\tag{34}
\]
Therefore, b = b′ and so these two projections are equal to one another. In the discrete Weyl phase
space such manifolds must lie along grid phase points and obey the periodicity in $x_p$ and $x_q$, but the
premise is the same.
Overall, the procedure outlined in Lemma 1 for deterministic measurements takes $O(n^2)$
operations since Equation (27) is a sum of O(n) vectors made up of O(n) components. Therefore, the
overall measurement protocol takes $O(n^2)$ operations. Note that this formulation of the algorithm
shows that it is the symplectic structure on phase space and the linear transformation under harmonic
evolution that allows the inversion (Equation (32)) to be performed efficiently.
Figure 1. The initial perpendicular manifolds p0 j and q0 j and the harmonically evolved perpendicular
manifolds pt i and qt i . Description of the various lengths and angles are given in the text in the proof
of Lemma 1.
for propagation and measurement for n qubits. The algorithm has been proven to be extendable
to d > 2 [12] and similar algorithms have been formulated in d > 2 [13]. Alternatives have also
been developed to the tableau formalism, though they prove to be equally efficient in worst-case
scenarios [14]. However, we are not aware of any direct extension of the Aaronson–Gottesman tableau
algorithm to dimensions greater than two. In this and the next section, we will show that the Wigner
algorithm presented in Section 3 is equivalent to the Aaronson–Gottesman tableau algorithm extended
to odd d.
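Definition 1. A set of operators that satisfies $S = \{\hat{g} \in P \text{ such that } \hat{g}|\psi\rangle = |\psi\rangle\}$ are called the stabilizers
of state $|\psi\rangle$, where P is the set of Pauli operators, each of which has the form $e^{\frac{\pi i}{2}\alpha}\, \hat{P}_1 \otimes \cdots \otimes \hat{P}_n$, where
$\alpha \in \{0, 1, 2, 3\}$, for n qubits with $\hat{P}_i \in \{\hat{I}_i, \hat{Z}_i, \hat{X}_i, \hat{Y}_i\}$.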
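As a small illustration of Definition 1, the sketch below builds Pauli operators of the form $e^{\frac{\pi i}{2}\alpha} \hat{P}_1 \otimes \cdots \otimes \hat{P}_n$ as explicit matrices and tests whether they stabilize a given state. The two-qubit Bell state is chosen here as an example and does not come from the text; the function names are assumptions.

```python
from functools import reduce
import numpy as np

PAULIS = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def pauli_operator(label, alpha=0):
    """exp(i*pi*alpha/2) * P_1 x ... x P_n for a label such as 'XZ', alpha in {0, 1, 2, 3}."""
    op = reduce(np.kron, [PAULIS[ch] for ch in label])
    return (1j ** alpha) * op

def stabilizes(op, psi):
    """True if op |psi> = |psi>."""
    return bool(np.allclose(op @ psi, psi))

bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)  # (|00> + |11>)/sqrt(2)
checks = {
    "XX": stabilizes(pauli_operator("XX"), bell),
    "ZZ": stabilizes(pauli_operator("ZZ"), bell),
    "-YY": stabilizes(pauli_operator("YY", alpha=2), bell),
}
```

The example also shows why the phase has to be tracked: YY acts on this state with eigenvalue −1, so it is −YY (α = 2) that belongs to the stabilizer.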
For the sake of completeness, we present here a summary of the qubit Aaronson–Gottesman
algorithm, in order to compare it to our odd d qudit algorithm. For more details, see [1,2].
Each n-qubit stabilizer state is uniquely determined by $2^n$ Pauli operators. There are only n
generators of this Abelian group of $2^n$ operators. Therefore, an n-qubit stabilizer state is defined by
the n generators of its stabilizer. Every element in this set of generators, $\{\hat{g}_1, \hat{g}_2, \ldots, \hat{g}_n\}$, is in the
Pauli group, and each generator has the form:
Any unitary propagation by Clifford operators or measurement of the stabilizer state changes at least
some of the P̂ij elements of the n generators of the state’s stabilizer. This includes the ±1 phase in
Equation (35), which must also be kept track of in Aaronson–Gottesman’s algorithm.
Definition 2. Destabilizers { ĝ1 , . . . , ĝn } are the operators that generate the full Pauli group with the stabilizers
{ ĝ1 , . . . , ĝn }. They have the following properties:
To incorporate the destabilizers, a tableau becomes useful to see how they play a role in updating
the stabilizer generators during measurement [2].
Aaronson–Gottesman defined such a 2n × (2n + 1) binary tableau matrix as:
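$$\left(\begin{array}{cccccc|c}
x_{11} & \cdots & x_{1n} & z_{11} & \cdots & z_{1n} & r_1 \\
\vdots & & \vdots & \vdots & & \vdots & \vdots \\
x_{n1} & \cdots & x_{nn} & z_{n1} & \cdots & z_{nn} & r_n \\
x_{(n+1)1} & \cdots & x_{(n+1)n} & z_{(n+1)1} & \cdots & z_{(n+1)n} & r_{n+1} \\
\vdots & & \vdots & \vdots & & \vdots & \vdots \\
x_{(2n)1} & \cdots & x_{(2n)n} & z_{(2n)1} & \cdots & z_{(2n)n} & r_{2n}
\end{array}\right).$$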
This matrix contains 2n rows. The first n rows denote the destabilizers while rows (n + 1) to
2n represent the stabilizers ĝ1 to ĝn . The (2n + 1)th bit in each row denotes the phase (−1)^{r_i} for each
generator. We encode the jth Pauli operator in the ith row as shown in Table 2.
Table 2. Binary representation of the Pauli operators and the Pauli group phase used in their tableau representation.
4.3. Measurement
To describe the measurement part of the algorithm, we need to first define a rowsum operation in
the tableau that corresponds to multiplying two Pauli operators together. As defined in [2]:
Rowsum: To sum row i and j, first update the bits that represent operators by xik ⊕ x jk and zik ⊕ z jk
for k = 1, . . . , n. To calculate the resultant phase, Aaronson and Gottesman first defined the
following function:
$$f(x_{ik}, x_{jk}, z_{ik}, z_{jk}) =
\begin{cases}
0 & \text{if } x_{ik} = z_{ik} = 0, \\
z_{jk} - x_{jk} & \text{if } x_{ik} = z_{ik} = 1, \\
z_{jk}(2x_{jk} - 1) & \text{if } x_{ik} = 1,\ z_{ik} = 0, \\
x_{jk}(1 - 2z_{jk}) & \text{if } x_{ik} = 0,\ z_{ik} = 1.
\end{cases} \tag{36}$$
Since each stabilizer generator is the tensor product of n single qubit Pauli operators (see Equation (35)),
they must be multiplied together to obtain the phase:
$$\begin{cases}
0 & \text{if } 2r_i + 2r_j + \sum_{k=1}^{n} f(x_{ik}, x_{jk}, z_{ik}, z_{jk}) \equiv 0 \pmod 4, \\
1 & \text{if } 2r_i + 2r_j + \sum_{k=1}^{n} f(x_{ik}, x_{jk}, z_{ik}, z_{jk}) \equiv 2 \pmod 4.
\end{cases} \tag{37}$$
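As a rough illustration of the rowsum operation, the following Python sketch implements Equation (36) and the phase rule of Equation (37) for two tableau rows. The row layout [x_1, ..., x_n, z_1, ..., z_n, r], the function names, and the choice to raise an error for anti-commuting rows are assumptions made for this example only. The phase bits enter doubled (2r_i + 2r_j), which is the convention of [2].

```python
def f(x_i, x_j, z_i, z_j):
    """Exponent of i picked up on one qubit, Equation (36); argument order as in the text."""
    if x_i == 0 and z_i == 0:
        return 0
    if x_i == 1 and z_i == 1:
        return z_j - x_j
    if x_i == 1 and z_i == 0:
        return z_j * (2 * x_j - 1)
    return x_j * (1 - 2 * z_j)  # remaining case: x_i == 0, z_i == 1


def rowsum(row_i, row_j, n):
    """Return the tableau row encoding the product of rows i and j.

    Assumed (hypothetical) layout of a row: [x_1..x_n, z_1..z_n, r], all bits.
    """
    total = 2 * row_i[2 * n] + 2 * row_j[2 * n]
    for k in range(n):
        total += f(row_i[k], row_j[k], row_i[n + k], row_j[n + k])
    if total % 2 != 0:
        # An odd exponent of i means the two Pauli strings anti-commute.
        raise ValueError("rows anti-commute: product is not +/- a Pauli string")
    bits = [a ^ b for a, b in zip(row_i[:2 * n], row_j[:2 * n])]
    return bits + [(total % 4) // 2]


# Example on two qubits: (X (x) X) times (Z (x) Z) equals -(Y (x) Y).
xx = [1, 1, 0, 0, 0]
zz = [0, 0, 1, 1, 0]
product = rowsum(xx, zz, n=2)  # x bits 1,1; z bits 1,1; phase bit 1
```

The example row encodes Y as x = z = 1, as in the binary representation of the Pauli operators used by the tableau.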
Having defined the rowsum function, let us now consider a measurement of Ẑi on qubit i.
For d = 2, Pauli group operators can only commute or anti-commute with each other. If Ẑi
anti-commutes with one or more of the generators, then the measurement is random. If Ẑi commutes
with all of the generators, then the measurement is deterministic. We consider these two cases:
where c j = 1 or 0.
where we used part (ii) of Definition 2 of the destabilizers and Equation (39). The last equality requires
ck = 1.
Therefore, to find the deterministic measurement outcome, the stabilizers whose corresponding
destabilizer anti-commutes with the measurement operation Ẑi must be multiplied together. Every
row (n + j) in the bottom half of the tableau, such that x ji = 1 (for j ∈ {1, . . . , n}), can be added up
together and stored in a temporary register. The resultant phase ±1 of this sum is the measurement
result we are looking for.
Checking if each destabilizer commutes or anti-commutes with Ẑi takes a constant number of
operations. One multiplication takes O(n) operations, and there are O(n) multiplications needed.
Therefore, a measurement takes O(n²) operations overall.
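The deterministic branch described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the tableau is assumed to be a list of 2n rows [x_1, ..., x_n, z_1, ..., z_n, r] with destabilizers first, qubit indices are 0-based, and the names `measure_z` and `rowsum` are hypothetical. The phase rule uses the doubled phase bits of [2].

```python
def f(x_i, x_j, z_i, z_j):
    """Single-qubit phase exponent of Equation (36)."""
    if (x_i, z_i) == (0, 0):
        return 0
    if (x_i, z_i) == (1, 1):
        return z_j - x_j
    if (x_i, z_i) == (1, 0):
        return z_j * (2 * x_j - 1)
    return x_j * (1 - 2 * z_j)


def rowsum(row_i, row_j, n):
    """Product of two commuting tableau rows (the total exponent is then even)."""
    total = 2 * row_i[2 * n] + 2 * row_j[2 * n]
    total += sum(f(row_i[k], row_j[k], row_i[n + k], row_j[n + k]) for k in range(n))
    bits = [a ^ b for a, b in zip(row_i[:2 * n], row_j[:2 * n])]
    return bits + [(total % 4) // 2]


def measure_z(tableau, n, a):
    """Classify a Z measurement on qubit a and return the outcome bit if it is deterministic."""
    # Random case: some stabilizer generator has an X component on qubit a.
    if any(tableau[n + j][a] for j in range(n)):
        return ("random", None)
    # Deterministic case: multiply the stabilizers whose destabilizer
    # anti-commutes with Z on qubit a into a temporary register.
    scratch = [0] * (2 * n + 1)
    for j in range(n):
        if tableau[j][a] == 1:
            scratch = rowsum(scratch, tableau[n + j], n)
    return ("deterministic", scratch[2 * n])


# |01>: destabilizers X1, X2; stabilizers +Z1, -Z2.
state_01 = [[1, 0, 0, 0, 0], [0, 1, 0, 0, 0], [0, 0, 1, 0, 0], [0, 0, 0, 1, 1]]
# Bell state: destabilizers Z1, X2; stabilizers XX, ZZ.
bell = [[0, 0, 1, 0, 0], [0, 1, 0, 0, 0], [1, 1, 0, 0, 0], [0, 0, 1, 1, 0]]
```

The loop performs at most n calls to `rowsum`, each costing O(n), which is where the O(n²) count for a deterministic measurement comes from.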
5. Discussion
As we made clear throughout Section 4, the scaling of the number of required operations with
respect to number of qudits n is exactly the same in the (d = 2) Aaronson–Gottesman algorithm as
in the (odd d) Wigner algorithm presented in Section 3. The two algorithms also require the same
number of dits of temporary storage for performing the deterministic measurement. Moreover, there
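is a correspondence between the tableau employed by Aaronson–Gottesman and the matrix Φ_t and
vector r_t we use. In particular, the tableau is equal to (Φ_t, r_t):
$$\boldsymbol{\Phi}_t = \begin{pmatrix}
\dfrac{\partial \boldsymbol{p}_0}{\partial \boldsymbol{p}_t} & \dfrac{\partial \boldsymbol{p}_0}{\partial \boldsymbol{q}_t} \\[2ex]
\dfrac{\partial \boldsymbol{q}_0}{\partial \boldsymbol{p}_t} & \dfrac{\partial \boldsymbol{q}_0}{\partial \boldsymbol{q}_t}
\end{pmatrix}
\equiv
\begin{pmatrix}
x_{11} & \cdots & x_{1n} & z_{11} & \cdots & z_{1n} \\
\vdots & & \vdots & \vdots & & \vdots \\
x_{n1} & \cdots & x_{nn} & z_{n1} & \cdots & z_{nn} \\
x_{(n+1)1} & \cdots & x_{(n+1)n} & z_{(n+1)1} & \cdots & z_{(n+1)n} \\
\vdots & & \vdots & \vdots & & \vdots \\
x_{(2n)1} & \cdots & x_{(2n)n} & z_{(2n)1} & \cdots & z_{(2n)n}
\end{pmatrix} \tag{41}$$
and
$$\boldsymbol{r}_t = \begin{pmatrix} \boldsymbol{r}_p \\ \boldsymbol{r}_q \end{pmatrix}
\equiv
\begin{pmatrix} r_1 \\ \vdots \\ r_n \\ r_{n+1} \\ \vdots \\ r_{2n} \end{pmatrix}. \tag{42}$$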
This can be seen through the following equation:
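$$\exp\left(\frac{2\pi i}{d} \sum_{j=1}^{2n} {\Phi_t}_{n+i,j}\, \hat{x}_j\right) |\Psi_t\rangle
= \prod_{j=1}^{2n} \exp\left(\frac{2\pi i}{d}\, {\Phi_t}_{n+i,j}\, \hat{x}_j\right) |\Psi_t\rangle
= \exp\left(\frac{2\pi i}{d}\, {r_t}_i\right) |\Psi_t\rangle, \tag{43}$$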
where x̂ ≡ ( p̂, q̂). Multiplying the right-hand sides of the first and the second equalities by
exp(−(2πi/d) r_ti), it follows that
$$\exp\left(-\frac{2\pi i}{d}\, {r_t}_i\right) \prod_{j=1}^{2n} \exp\left(\frac{2\pi i}{d}\, {\Phi_t}_{n+i,j}\, \hat{x}_j\right) |\Psi_t\rangle = \hat{g}_i |\Psi_t\rangle = |\Psi_t\rangle. \tag{44}$$
In other words, r_ti specifies the phase exp(−(2πi/d) r_ti) of the ith stabilizer, which is itself specified by
Φ_t n+i,j for j ∈ {1, . . . , 2n}. These are the same roles for r and the tableau in the Aaronson–Gottesman
tableau algorithm [2].
Indeed, both algorithms check the bottom half of their matrices for non-zero elements Φ_n+j,i to
determine if a measurement on the ith qudit will be random or not. They also use a very similar
protocol to determine the outcome of deterministic measurements. The Wigner-based algorithm
motivates these manipulations in terms of the symplectic structure of Weyl phase space and the
relationship between the two Wigner functions specified by the top and bottom of Φ, providing a
strong physical intuition for their effects. Aaronson and Gottesman motivate these manipulations
using the anti-commutation relations between the stabilizer and destabilizer generators. In addition,
the latter half of both the Wigner function’s r t and Aaronson–Gottesman’s r are used to determine
measurement outcomes. The only fundamental algorithmic difference between the approaches is that
the Wigner-based algorithm does not require updates of r t during unitary propagation. The reason
for this lies in the fact that Aaronson–Gottesman’s algorithm deals with systems with d = 2 while the
Wigner-based algorithm is restricted to odd d.
In particular, for the one-qubit Clifford group gate operator Â ∈ { P̂i , F̂i }, i ∈ {1, . . . , n}, the
Aaronson–Gottesman algorithm specifies that for a q- or p-state, its Wigner function evolves by:
$$W_\Psi\left(\mathcal{M}_{\hat{A}}\, \boldsymbol{x}\right). \tag{45}$$
However, for |r⟩ = (1/√2)(|0⟩ ± i|1⟩), a Y-state which is diagonal in the pq plane, its Wigner function
must first be translated:
$$W_\Psi\left(\mathcal{M}_{\hat{A}}\, \boldsymbol{x} + \boldsymbol{\beta}\right), \tag{46}$$
where the translation β can be (1, 0) or (0, 1) equivalently. There is a similar state-dependence for the
two-qubit CNOT gate Ĉij .
This demonstrates that the Aaronson–Gottesman algorithm depends on the qubit
stabilizer state it is acting on. On the other hand, the Wigner function algorithm on odd d qudit stabilizer
states is state-independent. This likely is a consequence of the fact that the Weyl–Heisenberg group, which
is made up of the boost and shift operators defined in Equations (4) and (5) that underlie the discrete
Wigner formulation, is a subgroup of U (d) instead of SU (d) for d = 2 [15]. Furthermore, qubits
exhibit state-independent contextuality while odd d qudits do not [16]. Recent progress on this subject
relating non-contextuality to classical simulatability for qubits can be found in [17,18].
Figure 2. A decomposition of the two qutrit Wigner function into nine 3 × 3 grids, where each 3 × 3
grid denotes the value of the Wigner function at all p_t1 and q_t1 for a fixed value of p_t2 and q_t2 denoted
by the external axes. This organization is used in Figure 3 below.
Figure 3. The Wigner function of two qutrits initially prepared in (a) the state |0⟩ ⊗ |0⟩. (1) This is
evolved under F̂1 to produce (b) (1/√3)(|0⟩ + |1⟩ + |2⟩) ⊗ |0⟩. (2) Subsequently, this state is evolved under
Ĉ12 producing (c) the Bell state (1/√3)(|00⟩ + |11⟩ + |22⟩). (3) Qutrit 1 is then measured producing the
random outcome 1, which collapses qutrit 2 into the same state, so that (d) |1⟩ ⊗ |1⟩ results. The black
color indicates the Wigner function specified by the lowest n rows of δ_{Φ_t x, r_t}, and the gray color
indicates the Wigner function specified by the highest n rows (q0 ( pt , qt ) and p0 ( pt , qt ), respectively).
The evolution and algorithmic implementation are explained in the text.
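We begin with
$$W_\Psi(\boldsymbol{x}) =
\delta_{\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} p_{t1} \\ p_{t2} \\ q_{t1} \\ q_{t2} \end{pmatrix},
\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}}
= \delta_{\begin{pmatrix} p_{t1} \\ p_{t2} \\ q_{t1} \\ q_{t2} \end{pmatrix},
\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}}, \tag{47}$$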
denoting an initially prepared state of |0⟩ ⊗ |0⟩. This is clear in Figure 3a by the black band that lies
along all Weyl phase space points with q_t1 = 0 and q_t2 = 0. On the other hand, the gray manifold is
perpendicular to the black one, and lies along Weyl phase space points with p_t1 = 0 and p_t2 = 0.
Acting on this state with F̂1 produces
(1/√3)(e^{(2πi/3) 0×0} |0⟩ + e^{(2πi/3) 1×0} |1⟩ + e^{(2πi/3) 2×0} |2⟩) ⊗ |0⟩. Applying
the algorithm specified at the end of Section 3.2, we find:
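$$W_\Psi(\boldsymbol{x}) =
\delta_{\begin{pmatrix} 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} p_{t1} \\ p_{t2} \\ q_{t1} \\ q_{t2} \end{pmatrix},
\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}}
= \delta_{\begin{pmatrix} -q_{t1} \\ p_{t2} \\ p_{t1} \\ q_{t2} \end{pmatrix},
\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}}. \tag{48}$$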
Thus, the momentum of qutrit 1 is now determined and is 0 while the second qutrit is unchanged.
This can be seen in Figure 3b, where the q_t2 values of the non-zero Weyl phase space points are the
same, while the state has rotated by −π/2 in (p_t1, q_t1)-space. A similar transformation has occurred
for the perpendicular gray manifold.
Acting next with Ĉ12 produces the Bell state (1/√3)(|00⟩ + |11⟩ + |22⟩), which is represented by the
following Wigner function:
$$W_\Psi(\boldsymbol{x}) =
\delta_{\begin{pmatrix} 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & -1 & 1 \end{pmatrix}
\begin{pmatrix} p_{t1} \\ p_{t2} \\ q_{t1} \\ q_{t2} \end{pmatrix},
\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}}
= \delta_{\begin{pmatrix} -q_{t1} \\ p_{t2} \\ p_{t1} + p_{t2} \\ -q_{t1} + q_{t2} \end{pmatrix},
\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}}. \tag{49}$$
The entanglement between the two qutrits is evident in both of their dependence on each other’s
momenta and positions, p_t1 = −p_t2 and q_t1 = q_t2, specified by the last two rows. Figure 3c
shows that the state is still representable as lines in Weyl phase space, except they now traverse
through the different planes of (q_t1, p_t1) associated with each value of (q_t2, p_t2). However, if you
consider the left column in Figure 3c corresponding to q_t2 = 0, you can see that the only black Weyl
phase points are at q_t1 = 0. Similarly, the middle column corresponding to q_t2 = 1 shows that
q_t1 = 1, and the right column corresponding to q_t2 = 2 shows that q_t1 = 2 too, confirming that
|Φ⟩ = (1/√3)(|00⟩ + |11⟩ + |22⟩). Thus, the entanglement of the two qutrits’ positions is clearly evident
in this figure of the Wigner function.
We then proceed to measure qutrit 1. Since the lower two equations involve p_t1, we know that
this is a random measurement. Let us pick the outcome to be 1 and set the third row as such, replacing
the first row with the old third row. This collapses qutrit 2 into the same state:
$$W_\Psi(\boldsymbol{x}) =
\delta_{\begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & -1 & 1 \end{pmatrix}
\begin{pmatrix} p_{t1} \\ p_{t2} \\ q_{t1} \\ q_{t2} \end{pmatrix},
\begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}}
= \delta_{\begin{pmatrix} p_{t1} + p_{t2} \\ p_{t2} \\ q_{t1} \\ -q_{t1} + q_{t2} \end{pmatrix},
\begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}}. \tag{50}$$
The lower two rows show that now q_t1 = 1, as we chose, and q_t2 = q_t1 = 1. The collapse of qutrit 2
into |1⟩ can also be seen in Figure 3c by the fact that q_t1 = 1 only in the 3 × 3 grids that correspond
to q_t2 = 1 too.
Finally, the fact that a measurement of q_t2 would be deterministic at this point can be seen in
the fact that p_t2 is not present in the last two rows of Φ_t. Furthermore, it is clear, since the first row
has a coefficient of 1 in front of p_t1, that the corresponding third row must be added with weight 1 to
the fourth row to obtain this deterministic measurement outcome of q_t2 = 1. This can also be seen in
Figure 3 by finding the projection of p_01 onto p_t2, which are shown by the gray manifolds in panels (a)
and (d), respectively. They are collinear and so the projection is equal to 1. (Perpendicular manifolds
correspond to a projection of 0, and those that lie π/4 diagonally with respect to each other have a
projection equal to 2 in this discrete geometry.)
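The two-qutrit walk-through above can be replayed with a few lines of Python. This is only a sketch under stated assumptions: the update matrices `A_F1` and `A_C12` are not taken from the algorithm of Section 3.2 but are simply read off from Equations (48) and (49) as right-multipliers of Φ_t in the ordering (p_t1, p_t2, q_t1, q_t2), all arithmetic is done modulo d = 3, and the helper names are hypothetical.

```python
D = 3  # qutrits
N = 2  # number of qutrits


def matmul_mod(a, b, d=D):
    """Matrix product of two square integer matrices, reduced modulo d."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) % d for j in range(size)]
            for i in range(size)]


def is_random(phi, i, n=N, d=D):
    """A q-measurement on qutrit i is random if a bottom-half row involves p_i."""
    return any(phi[n + j][i] % d != 0 for j in range(n))


IDENTITY = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]  # Equation (47)
# Assumed right-multipliers, inferred from Equations (48) and (49):
A_F1 = [[0, 0, -1, 0], [0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]]
A_C12 = [[1, 1, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, -1, 1]]

phi_b = matmul_mod(IDENTITY, A_F1)   # after F_1, Equation (48)
phi_c = matmul_mod(phi_b, A_C12)     # Bell state, Equation (49)
bell_is_random = is_random(phi_c, 0)  # bottom rows involve p_t1

# State after measuring qutrit 1 with outcome 1, copied from Equation (50).
phi_d = [[1, 1, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 2, 1]]
r_d = [0, 0, 1, 0]
q2_is_random = is_random(phi_d, 1)   # p_t2 is absent from the bottom rows

# Adding the third row to the fourth row with weight 1 isolates q_t2.
combined_row = [(a + b) % D for a, b in zip(phi_d[2], phi_d[3])]
q2_outcome = (r_d[2] + r_d[3]) % D
```

Entries equal to −1 appear as 2 after reduction modulo 3, so `phi_c` matches the matrix of Equation (49) up to that relabelling.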
7. Conclusions
In summary, we introduced an algorithm that efficiently simulates stabilizer state evolution
under Clifford gates and measurements in the Ẑ Pauli basis for odd d qudits. We accomplished
this by relying on the phase-space perspective of stabilizer states as discrete Gaussians and
Clifford operators as having underlying harmonic Hamiltonians. We showed the equivalence of
our algorithm, through Equations (43) and (44), to the well-known Aaronson–Gottesman tableau
algorithm [2] for qubits, revealing that Aaronson–Gottesman’s tableau corresponds to a discrete
Wigner function. As a consequence, we revealed the physically intuitive phase space perspective of
Aaronson–Gottesman’s algorithm, as well as its extension to higher odd d.
This work illustrates that no efficiency advantage is gained by using the Heisenberg representation
for stabilizer propagation. Equation (44) indicates that the Heisenberg representation is equivalent to
the Schrödinger representation in this context; evolving the operators is just as efficient as evolving the
states, as perhaps expected.
Lastly, the correspondence between the Wigner-based algorithm and the Aaronson–Gottesman
tableau algorithm may point the direction on how to resolve the long-standing issue of describing
the Wigner–Weyl–Moyal and center-chord formalism for d = 2 systems. We have shown that
the Aaronson–Gottesman algorithm is essentially a d = 2 treatment of the Wigner approach.
The salient difference appears to be the state-dependence of this evolution, and likely is related
to the state-independent contextuality that qubits exhibit, which odd d qudits do not. Exploring the
details of this state-dependence is a promising subject of future study.
Acknowledgments: This work was supported by the Air Force Office of Scientific Research (AFOSR) award
No. FA9550-12-1-0046.
Author Contributions: All authors contributed to the work presented here.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Gottesman, D. The Heisenberg Representation of Quantum Computers. arXiv 1998, arXiv:quant-ph/9807006.
2. Aaronson, S.; Gottesman, D. Improved simulation of stabilizer circuits. Phys. Rev. A 2004, 70, 052328.
3. Gottesman, D. Fault-tolerant quantum computation with higher-dimensional systems. In Quantum
Computing and Quantum Communications; Springer: Heidelberg, Germany, 1999; pp. 302–313.
4. Mari, A.; Eisert, J. Positive Wigner functions render classical simulation of quantum computation efficient.
Phys. Rev. Lett. 2012, 109, 230503.
5. Howard, M.; Wallman, J.; Veitch, V.; Emerson, J. Contextuality supplies the ‘magic’ for quantum computation.
Nature 2014, 510, 351–355.
6. Wootters, W.K. A Wigner-function formulation of finite-state quantum mechanics. Ann. Phys. 1987, 176, 1–21.
7. Gross, D. Hudson’s theorem for finite-dimensional quantum systems. J. Math. Phys. 2006, 47, 122107.
8. Veitch, V.; Ferrie, C.; Gross, D.; Emerson, J. Negative quasi-probability as a resource for quantum computation.
New J. Phys. 2012, 14, 113011.
9. Veitch, V.; Wiebe, N.; Ferrie, C.; Emerson, J. Efficient simulation scheme for a class of quantum optics
experiments with non-negative Wigner representation. New J. Phys. 2013, 15, 013037.
10. Kocia, L.; Love, P. Semiclassical Formulation of Gottesman–Knill and Universal Quantum Computation.
arXiv 2016, arXiv:1612.05649.
11. Koh, D.E.; Penney, M.D.; Spekkens, R.W. Computing quopit Clifford circuit amplitudes by the
sum-over-paths technique. arXiv 2017, arXiv:1702.03316.
12. De Beaudrap, N. A linearized stabilizer formalism for systems of finite dimension. arXiv 2011, arXiv:1102.3354.
13. Yoder, T.J. A Generalization of the Stabilizer Formalism for Simulating Arbitrary Quantum Circuits.
2012. Available online: https://pdfs.semanticscholar.org/b200/efe1709d07ffc1b5b7bd90e61c09e2729bdf.pdf
(accessed on 6 July 2017).
14. Anders, S.; Briegel, H.J. Fast simulation of stabilizer circuits using a graph-state representation. Phys. Rev. A
2006, 73, 022334.
15. Bengtsson, I.; Zyczkowski, K. On discrete structures in finite Hilbert spaces. arXiv 2017, arXiv:1701.07902.
16. Mermin, N.D. Hidden variables and the two theorems of John Bell. Rev. Mod. Phys. 1993, 65, 803.
17. Raussendorf, R.; Browne, D.E.; Delfosse, N.; Okay, C.; Bermejo-Vega, J. Contextuality as a resource for qubit
quantum computation. arXiv 2015, arXiv:1511.08506.
18. Kocia, L.; Love, P. Discrete Wigner Formalism for Qubits and the Non-Contextuality of Clifford Operations
on Qubit Stabilizer States. arXiv 2017, arXiv:1705.08869.
© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
entropy
Article
Concepts and Criteria for Blind Quantum Source
Separation and Blind Quantum Process Tomography
Alain Deville 1, * and Yannick Deville 2
1 Institut Matériaux Microélectronique et Nanosciences de Provence (IM2NP), Aix-Marseille Université,
13397 Marseille, France
2 Institut de Recherche en Astrophysique et Planétologie (IRAP), Université de Toulouse, 31400 Toulouse,
France; [email protected]
* Correspondence: [email protected]; Tel.: +33-5-61-33-28-24
Abstract: Blind Source Separation (BSS) is an active domain of Classical Information Processing,
with well-identified methods and applications. The development of Quantum Information Processing
has made possible the appearance of Blind Quantum Source Separation (BQSS), with a recent
extension towards Blind Quantum Process Tomography (BQPT). This article investigates the use of
several fundamental quantum concepts in the BQSS context and establishes properties already used
without justification in that context. It mainly considers a pair of electron spins initially separately
prepared in a pure state and then submitted to an undesired exchange coupling between these spins.
Some consequences of the existence of the entanglement phenomenon, and of the probabilistic aspect
of quantum measurements, upon BQSS solutions, are discussed. An unentanglement criterion is
established for the state of an arbitrary qubit pair, expressed first with probability amplitudes and
secondly with probabilities. The interest of using the concept of a random quantum state in the
BQSS context is presented. It is stressed that the concept of statistical independence of the sources,
widely used in classical BSS, should be used with care in BQSS, and possibly replaced by some
disentanglement principle. It is shown that the coefficients of the development of any qubit pair pure
state over the states of an orthonormal basis can be expressed with the probabilities of results in the
measurements of well-chosen spin components.
Keywords: blind source separation (BSS); qubit pair; exchange coupling; entangled pure state;
unentanglement criterion; probabilities in quantum measurements; independence of random
quantum sources
1. Introduction
The book entitled “Do we really understand quantum mechanics?” [1] was published five years
ago. Some forty years earlier, its author, Laloë, had co-authored a treatise on quantum mechanics,
together with Cohen-Tannoudji, later a Nobel laureate, and Diu [2]. While this recent book illustrates
the present strong interest in the foundations of Quantum Theory (QT), already in 1929, Dirac could
claim: “The general theory of quantum mechanics is now almost complete” and “The underlying physical laws
necessary for the mathematical theory of a large part of physics and the whole of chemistry are thus completely
known” [3]. Since that time, the development of both telecommunications through electromagnetic
waves and solid state electronics favoured the appearance first of classical Information Theory, and then
of Quantum Information Theory and Processing (QIT, QIP).
This special issue, Quantum Information and Foundations, in the Quantum Information Section
of Entropy, reflects the existence of links between QIP/QIT and the foundations of QT. An instance
of such links is given by the approach adopted e.g., in Timpson’s Thesis [4]. This methodology, in
the framework of Philosophy of Science, is difficult because of its rather general character. For the
last decade, we have been following another approach. Starting from a problem in the domain of
classical information processing, namely Source Separation (SS) with its more difficult so-called Blind
version (BSS), introduced around 1985 and now a mature field [5,6], we are developing its quantum
counterpart, which we proposed to call Blind Quantum Source Separation (BQSS). Each step of this
more pedestrian approach may be controlled, presently e.g., through simulations. This approach has
been achieved in our 2007 paper introducing BQSS [7], and in those describing the solutions which we
have built since then (see e.g., [6,8–14]), and which led to our recent introduction of Blind Quantum
Process Tomography (cf. [12,14] and more explanations at the end of this section and in Part A.2 of
the Appendix).
A short presentation of the problem of classical (i.e., non quantum) or conventional BSS, and of
its interest, is needed here. In BSS, typically, at first, a set of users (the Writer) presents a set of
simultaneous signals (input signals, or sources) at the input of a multi-user communication system
(the Mixer). The sources, constrained to possess some general properties (e.g., mutual statistical
independence), are combined (mixed, in the SS sense) in the Mixer, often specified through a model,
e.g., the linear memoryless one (cf. Chapter 11 from [15]). Another set of users (the Reader) receives the
signals arriving at the Mixer output. The Writer possibly knows the sources, but the Reader does not
know them, and cannot access the inputs of the Mixer. That Mixer uses one or several parameter values,
unknown to the Reader, who only knows some of its general properties. The Reader’s final task is the
restoration of the sources (possibly up to some so-called acceptable indeterminacies) from the signals
at the Mixer output, during the inversion phase. An intermediate task is the determination of the
unknown parameters of the Mixer, or of its inverse. Before receiving the signals to be separated at the
Mixer output, derived from the sources sent by the Writer, the Reader therefore enters an “adaptation
phase”, during which he knows that the Writer is sending one (or possibly a limited number of)
signal(s) submitted to some definite, and known by the Reader, constraints. The particular signal sent
is not known by the Reader (blind separation problem), who knows the class of the input signal(s)
and the signal(s) at the Mixer output in the adaptation phase, and, of course, the mixed signals to be
separated in the inversion phase.
Conventional BSS is already used to extract some or all source signals in various application
fields, e.g., in some audio systems, or when using radio-frequency signals to transmit digital data, or
in the biomedical field, in the processing of signals such as electrocardiograms, electroencephalograms
or magnetoencephalograms, as explained in Part A.1 of the Appendix. More information on the
applications of conventional BSS may be found in our previous papers [11,14], in [6], and in the papers
or books they cite.
BSS is moreover closely linked to a well-known domain of signal processing technology called system
identification. More precisely, BSS is linked to Blind Mixture Identification (BMI), as briefly explained in
Part A.1 of the Appendix and developed in [6], and BSS may be used in the corresponding applications.
Conventional (B)SS has favoured the introduction of concepts and the development of specific
methods [5,6]. Its extension to the quantum domain seems suitable for at least three reasons. First,
the source concept may be extended from a classical to a quantum context. Secondly, as any classical
phenomenon, conventional (B)SS may be seen as the limit of a quantum phenomenon. When
developing solutions to the BQSS problem, it seems legitimate to try and import concepts and
methods from the classical to the quantum SS domain. However, the presence of entanglement
in a quantum approach should be clearly identified and the consequences of its existence should not
be underestimated. In addition, the concepts of quantum sources and of their statistical independence
deserve some discussion, and consequences of the probabilistic aspect of the results of measurements
in the quantum domain must be drawn. Furthermore, last but not least, since some of the basic
concepts of QT are still open to discussion, when e.g., using measurements, even in an abstract process,
the adopted point of view should once be made explicit, in order to minimize confusion. The nature
of this special issue gave us the opportunity to clarify concepts and justify properties already used
in our previous papers upon BQSS, a task postponed up to now, and which should be of use in the
Entropy 2017, 19, 311
BQSS domain, and maybe in other fields. These two motivations stimulate a third natural one, namely
the hope of extending the field of BSS applications toward the quantum world. In the following
sections, in order to illustrate our methods and help reading, some aspects or results of our previous
papers will be occasionally presented, but the building of any specific BQSS solution is outside their
scope. The reader interested in the results from simulations may consult [8,11], obtained through BQSS
methods with classical processing, and [14], with quantum processing in the forward path. This recent
paper moreover contains a table with a detailed comparison of the key features and performance of
the existing methods.
In all of our previous papers, we considered two distinguishable qubits numbered 1 and 2,
and we presently keep this situation. When it is meaningful to speak of the state of a quantum system,
and specifically if this system is a qubit, this state may be either pure or mixed. In order to avoid any
confusion with the meaning of a mixture in the SS context, if it is needed to speak of a (quantum)
mixed state in the following, we will systematically speak of a statistical mixture. A typical situation
is the following one: at an initial time t0 , the Writer prepares both qubits, each in a given pure state,
described by some ket. This ket carries information, an idea contained in the expression “quantum
source”. The initial state | Ψ(t0 ) > of the qubit pair is then the tensor product of the corresponding
kets. The time between t0 (writing) and t1 (reading) is supposed to be short enough for the qubit pair
to be treated as isolated, a choice already made by Feynman [16,17] in the context of the quantum
computer, and presently refined at the beginning of Section 4.1 for qubits physically realized with spins.
At any time t between t0 and t1 , the state of the qubit pair may then be described by a ket | Ψ(t) >.
In the Schrödinger picture, this time evolution of the pair is described by a time-dependent unitary
operator U (t0 , t1 ). It is assumed that an undesired coupling exists between these qubits. Because of
this undesired coupling, as time goes on the state of the pair generally becomes entangled. Coupling
is then interpreted as a mixing (in the SS sense), realized by an abstract Mixer depending upon one
or several parameter values, unknown to the Reader, who only knows some general properties of
that Mixer. It is said that the input of the Mixer receives state | Ψ(t0 ) >, and that its output provides
state | Ψ(t) >. It should be well appreciated that inverting U (t0 , t1 ) in order to get | Ψ(t0 ) > from
| Ψ(t1 ) > is not that easy, because U (t0 , t1 ) is unknown (blind QSS). In Section 2, it is first explained why
both state and process quantum tomography are unable to solve this BQSS problem, and secondly
why the Schmidt criterion is ill-suited for following the degree of entanglement of | Ψ(t1 ) > during
the adaptation phase. The Peres–Horodecki criterion [18,19] is valid for separable statistical mixtures
of bipartite systems, and not specifically for unentangled pure states. A better suited unentanglement
criterion is therefore established in Section 2.
In Section 3, a model situation, for a single spin and then for a pair of spins, in inhomogeneous
magnetic fields with random directions, allows us to speak of random and possibly independent
variables, in that quantum context. We explain why, although this random quantum state corresponds
to a statistical mixture, it is simpler, in the BQSS context, to speak of a random pure state than to
introduce a density operator. In Section 4, we first make brief comments about the description of
quantum states (including the existence of statistical mixtures as source states, in a more general
context), about the act of measurement and about the physical realization of qubits with electron
spins. We then discuss questions related to the probabilities of the possible results obtained in
measurements of spin components, in the context of spins 1/2 as qubits. We first present their use
when the Reader makes measurements at the Mixer output in order to restore the sources (cf. Figure 1).
These measurements establish a link between the output of the Mixer and the classical world. It is
stressed that while the macroscopic support of the results of measurements has a classical behaviour,
the probabilities of these results obey quantum laws. We then establish an unentanglement criterion
using probabilities, equivalent to the one established in Section 2 for the probability amplitudes ci .
It is shown that the ci coefficients can be expressed as functions of the probabilities of results in the
measurements of well-chosen spin components. In Section 5, we derive the expression of the above
unentanglement criterion for all possible source states, at the output of the so-called separating system,
with respect to the parameters of both the cylindrical Heisenberg coupling, an abstract Mixer largely
used in our previous papers, and that separating system.
[Block diagram; recoverable labels: | ψ(t0) >, mixing, | ψ(t) >, p, classical processing, y.]
In Part A.2 of the Appendix, the question of the applications of BQSS is addressed. Partly
because the appearance of BQSS is recent, the subject of its applications is presently largely speculative.
Two main subdomains should be distinguished. The first one is BQSS in a strict sense. It aims
at recovering the source states and is the quantum counterpart of conventional BSS. The second
subdomain focuses on an intermediate step possibly found in methods developed for BQSS and aiming
at the knowledge of the mixer function or of its inverse. The corresponding classical problem is known
as Blind Mixture Identification (BMI), a subfield of System Identification. The non-blind quantum
version of System Identification is that already mentioned and well-established field of QIP called
Quantum Process Tomography (as opposed to Quantum State Tomography). We recently introduced
the quantum version of BMI, which we proposed to call Blind Quantum Process Tomography (BQPT).
[Block diagram; recoverable labels: | ψ(t0) >, mixing, | ψ(t) >, quantum processing, | φ >, classical processing.]
Figure 2. Block diagram of a system using BQSS, with quantum processing in the forward path
(no cloning [14], with permission from Elsevier).
From now on, the state spaces of two arbitrary qubits, called qubits 1 and 2, are denoted as
E1 and E2 , respectively. The possible (pure) states of the pair are the kets in E1 ⊗ E2 . We assume
that the qubits are physically realized with spins 1/2, which, e.g., allows us to speak of the spin
component s1z or s2z, but many results established hereafter remain true without this assumption. We
introduce the orthonormal basis B+ = {|++⟩, |+−⟩, |−+⟩, |−−⟩}, where e.g., |+−⟩ means
|1+⟩ ⊗ |2−⟩ and |i,+⟩, |i,−⟩ are normed eigenkets of the siz component of the (reduced) spin s⃗i
(with i = 1, 2), for the eigenvalues +1/2 and −1/2, respectively. Any pure pair state, entangled or not,
may be expanded in B+ as
$$|\Psi\rangle = c_1|{+}{+}\rangle + c_2|{+}{-}\rangle + c_3|{-}{+}\rangle + c_4|{-}{-}\rangle. \tag{1}$$
the quite numerous steps of the iterative adaptation algorithm. We avoid these issues as follows. Since
the qubit pair is in a pure state, its partial traces ρ1 and ρ2 satisfy
$$\mathrm{Tr}\,\rho_1^2 = \mathrm{Tr}\,\rho_2^2, \tag{2}$$
and the common value for Tr ρ1² and Tr ρ2² is 1 if and only if the pure state is unentangled (cf. [21]).
One could think of using Tr ρ1² − 1 as a cost function. However, Tr ρ1² depends upon the ci, which
suggests trying to establish an unentanglement criterion using the ci explicitly. To this end, we
consider state |Ψ⟩ defined through Equation (1). When it is assumed that |Ψ⟩ is unentangled, i.e., that
it can be written as
$$|\Psi\rangle = (a|{+}\rangle + b|{-}\rangle) \otimes (c|{+}\rangle + d|{-}\rangle), \tag{3}$$
then, in Equation (1), c1 = ac, c2 = ad, c3 = bc, c4 = bd, so c1 c4 and c2 c3 are both equal to abcd:
$$c_1 c_4 = c_2 c_3. \tag{4}$$
which means that |Ψ⟩ is then unentangled. If Equation (4) is satisfied and c1 = 0, then c2 = 0 and
c3 ≠ 0, or c3 = 0 and c2 ≠ 0, or c2 = c3 = 0, and in each case |Ψ⟩ is unentangled. Therefore, if the qubit
pair is in a pure state |Ψ⟩ written as in Equation (1), then:
$$|\Psi\rangle \text{ is unentangled} \iff c_1 c_4 = c_2 c_3. \tag{6}$$
This unentanglement criterion for a qubit pair pure state was used without justification in [9,10].
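The link between the purity-based test and the amplitude-based test can be checked numerically. The following Python sketch is an illustration only, not part of the methods of [9,10]. It assumes the standard two-qubit identity Tr ρ1² = 1 − 2|c1 c4 − c2 c3|², and the function names (`is_unentangled`, `purity_of_qubit_1`, `product_state`) are hypothetical.

```python
import math


def product_state(a, b, c, d):
    """Coefficients (c1, c2, c3, c4) of (a|+> + b|->) (x) (c|+> + d|->), as in Equation (3)."""
    return (a * c, a * d, b * c, b * d)


def is_unentangled(coeffs, tol=1e-12):
    """Amplitude criterion of Equation (4): c1*c4 == c2*c3 (up to a numerical tolerance)."""
    c1, c2, c3, c4 = coeffs
    return abs(c1 * c4 - c2 * c3) < tol


def purity_of_qubit_1(coeffs):
    """Tr(rho_1^2), where rho_1 is the partial trace of |Psi><Psi| over qubit 2."""
    c1, c2, c3, c4 = (complex(x) for x in coeffs)
    r11 = abs(c1) ** 2 + abs(c2) ** 2                 # <+|rho_1|+>
    r22 = abs(c3) ** 2 + abs(c4) ** 2                 # <-|rho_1|->
    r12 = c1 * c3.conjugate() + c2 * c4.conjugate()   # <+|rho_1|->
    return r11 ** 2 + r22 ** 2 + 2 * abs(r12) ** 2


s = 1 / math.sqrt(2)
unentangled = product_state(0.6, 0.8j, s, 1j * s)   # normed product state
bell = (s, 0, 0, s)                                 # maximally entangled state
generic = (0.5, 0.5j, 0.5, -0.5j)                   # another entangled state
```

For the product state both tests agree (purity 1, c1 c4 = c2 c3), while for the two entangled states the purity drops to 1/2 and the amplitude criterion fails.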
In Equation (1), |Ψ⟩ was expanded in the standard basis. It is possible instead to introduce e.g.,
the normed eigenvectors of s1x and s2x, or more generally those of s1u and s2v, the components of the
spins along respective arbitrary directions u⃗(θ1E, ϕ1E) and v⃗(θ2E, ϕ2E), defined through their Euler
angles. For each component, the possible results are again ±1/2. The possible results for the pair
may be symbolically written as (+u + v), (+u − v), (−u + v) and (−u − v), and the corresponding
probabilities as P1uv , P2uv , P3uv , P4uv. Equation (1) is replaced by
With the same reasoning within the new basis, (6) is replaced by
Furthermore, in Quantum SS with abstract qubits corresponding to physical spins 1/2, the word
“source” does not refer to some atomic beam delivering atoms carrying an electron or nuclear magnetic
moment, but still means “source signal”, then referring to some information from the quantum states
of these qubits.
In conventional SS, an important concept is that of statistical independence of the sources, at the
root of the frequent use of Independent Component Analysis (ICA) [27]. In [7,8,11], we postulated
the existence of statistically independent quantum sources when using the classical-processing SS
defined at the beginning of Section 2. Hereafter, we show that statistical independence may exist in
that context. Quantum Mechanics (QM) does e.g., consider random operators, the matrix elements
of which are random quantities (see the random lattice operators F (q) in the quantum description
of the motions of nuclear moments in liquids, in the study of Spin-Lattice Relaxation (SLR), in [28]).
As a simple model situation, a magnetic moment μ⃗ associated with a single electron spin 1/2, with
μ⃗ = −G s⃗ (isotropic g tensor), placed in a Stern–Gerlach device, is now introduced. The static field
is B⃗0 = B0 Z⃗, with amplitude B0. The system of interest consists of this spin and the magnet. Writing
the Zeeman Hamiltonian as h = −μ⃗ · B⃗0 = G B0 sZ indicates that while the spin is a quantum object,
the magnetic field is treated classically. The Writer first prepares the spin in the |+Z⟩ eigenstate of sZ
(eigenvalue +1/2). The moment is then received by the Reader, who is assumed not to know the direction
of B⃗0, and who chooses some direction attached to the Laboratory as the quantization direction, called z
(unit vector u⃗z), and introduces a Laboratory-tied cartesian reference frame xyz, used to define θE and
ϕE, the Euler angles of Z⃗. Since the field is treated classically, θE and ϕE behave as classical variables,
while sZ is an operator. The Reader measures sz = s⃗ · u⃗z (eigenstates: |+⟩ and |−⟩), and is interested
in the probability p+z of getting +1/2. An elementary calculation indicates that
$$|{+}Z\rangle = r|{+}\rangle + \sqrt{1 - r^2}\, e^{i\phi}|{-}\rangle, \tag{9}$$
with
$$r = \cos\frac{\theta_E}{2}, \quad \phi = \phi_E, \tag{10}$$
and therefore p+z = cos²(θE/2). Once the direction of the magnetic field has been chosen, state
|+Z⟩ is then unambiguously defined. If this direction has a deterministic nature, r and ϕ are
deterministic variables, and |+Z⟩ may then be called a deterministic quantum state. If θE and ϕE,
defining the direction of B⃗0 chosen by the Writer, obey probabilistic laws, one may consider that
the quantum quantities r and ϕ, which depend upon the classical Random Variables (RV) θ E and
ϕ E , do possess the properties of conventional, i.e., classical, RV. It may e.g., happen that they be
uncorrelated, or even independent (which happens if θ E and ϕ E are independent). In addition, if θ E
and ϕ E depend on time in a random way, r and ϕ are then random time functions. We are not strictly
facing the quantum equivalent of a classical situation here. Rather, the stochastic character of the field
direction, with classical nature, is reflected in the random behaviour of the quantum state expressed
through Equation (9). Therefore, rather than a random operator, we meet here a random quantum
state. The concept of a random state, if not the expression, was already used e.g., in the early and
canonical books [29,30]. The probability p+z , presently a function of the RV θ E , is itself an RV. This
results from both the randomness of the field direction and the standard probabilistic interpretation of
QM. Probabilities of results of measurements for a qubit pair were treated as RV, without the present
justification, in most of our previous papers, including [7,8,11].
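The notion of a random pure state can be made concrete with a short simulation. The Python sketch below draws the classical random variables θE and ϕE and builds the corresponding state of Equations (9) and (10). The uniform distributions, the seed and the function names are assumptions made for illustration only; the text does not specify any particular law for the field direction.

```python
import cmath
import math
import random


def plus_Z_state(theta_E, phi_E):
    """Coefficients (alpha, beta) of |+Z> = r|+> + sqrt(1 - r^2) e^{i phi}|->, Equations (9)-(10)."""
    r = math.cos(theta_E / 2)
    return r, math.sqrt(1 - r * r) * cmath.exp(1j * phi_E)


def p_plus_z(theta_E):
    """Probability of getting +1/2 when the Reader measures s_z."""
    return math.cos(theta_E / 2) ** 2


rng = random.Random(0)  # fixed seed, for reproducibility only
samples = []
for _ in range(5):
    theta_E = rng.uniform(0.0, math.pi)      # assumed law for theta_E
    phi_E = rng.uniform(0.0, 2 * math.pi)    # assumed law for phi_E, independent of theta_E
    alpha, beta = plus_Z_state(theta_E, phi_E)
    # Each draw of the classical field direction yields a different pure state,
    # so p_plus_z(theta_E) is itself a random variable.
    samples.append((theta_E, phi_E, alpha, beta, p_plus_z(theta_E)))
```

Each entry of `samples` is a normed pure state whose coefficients, and whose measurement probability p+z, inherit their randomness from the classical angles.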
If one measures the scalar observable O when the spin is in the state |Ψ⟩ = α|+⟩ + β|−⟩ = Σk fk |ϕk⟩
(where k is associated with + and −), had the fk been deterministic the mean value would have been:
Since the fk are random, one must moreover calculate the statistical mean, denoted as $\overline{\langle\Psi|O|\Psi\rangle}$:
where ρ is the density operator, the matrix elements of which, in the (|+⟩, |−⟩) basis, are $\rho_{l,k} = \overline{f_k^* f_l}$.
Therefore, it is in principle possible to presently introduce a density operator, which is a non-random
operator (its matrix elements are not random quantities, but statistical averages). However, this does
not present any interest, since in the BQSS problem examined up to now, the Reader knows that e.g.,
qubit 1 has been prepared in a pure state, but does not know the values of the ρij coefficients in any
basis, and is consequently unable to choose a basis in which ρ would be diagonal. It is simpler to keep
speaking of a random pure state.
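The relation between a random pure state and the associated (non-random) density operator can be illustrated numerically. The sketch below assumes the standard textbook expressions ⟨Ψ|O|Ψ⟩ = Σ fk* fl Okl and Tr(ρO), with ρl,k the statistical mean of fk* fl estimated here by a sample average; the sampling law and all function names are hypothetical.

```python
import cmath
import math
import random


def random_pure_state(rng):
    """One draw (f_+, f_-) of a random pure state, parametrized as in Equation (9)."""
    theta = rng.uniform(0.0, math.pi)     # assumed law
    phi = rng.uniform(0.0, 2 * math.pi)   # assumed law
    r = math.cos(theta / 2)
    return (complex(r), math.sqrt(1 - r * r) * cmath.exp(1j * phi))


def expectation(f, O):
    """<Psi|O|Psi> = sum over k, l of conj(f_k) * f_l * O[k][l], for one given state."""
    return sum(f[k].conjugate() * f[l] * O[k][l] for k in range(2) for l in range(2))


def density_matrix(states):
    """rho[l][k] = sample mean of conj(f_k) * f_l over the draws."""
    n = len(states)
    return [[sum(f[k].conjugate() * f[l] for f in states) / n for k in range(2)]
            for l in range(2)]


def trace_rho_O(rho, O):
    """Tr(rho O) = sum over k, l of rho[l][k] * O[k][l]."""
    return sum(rho[l][k] * O[k][l] for k in range(2) for l in range(2))


rng = random.Random(1)
states = [random_pure_state(rng) for _ in range(200)]
s_x = [[0.0, 0.5], [0.5, 0.0]]   # spin component s_x in the (|+>, |->) basis
rho = density_matrix(states)
mean_of_expectations = sum(expectation(f, s_x) for f in states) / len(states)
mean_from_rho = trace_rho_O(rho, s_x)
```

The two means coincide, which is the sense in which ρ summarizes the statistics of the random state; the individual draws, however, remain pure states.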
As a model situation, we now consider two spins 1/2 numbered 1 and 2, each with conditions
similar to the previous ones, with fields along directions with respective unit vectors Z⃗1(θ1E, ϕ1E) and
Z⃗2(θ2E, ϕ2E), and each spin initially prepared in the state
$$|\psi_i(t_0)\rangle = r_i|i+\rangle + \sqrt{1 - r_i^2}\, e^{i\phi_i}|i-\rangle, \quad i = 1, 2, \tag{13}$$
where |i+⟩ and |i−⟩ are the eigenkets of siz, the component of s⃗i along the quantization direction,
for the eigenvalues 1/2 and −1/2, respectively. For the same reason, if the field directions are
random, r1 , ϕ1 , r2 and ϕ2 have the properties of conventional RV. If (θ1E , ϕ1E ) and (θ2E , ϕ2E ) are
mutually statistically independent, the same is then true for the couples of RV (r1 , ϕ1 ) and (r2 , ϕ2 ).
In addition, if e.g., θ1E and ϕ1E are independent, the same is true for r1 and ϕ1 (cf. Equation (10)). These
properties are of major importance for our quantum-source independent component analysis (QSICA)
methods described in [11]. We may then say that the initial state of each qubit is random, i.e., that in
Equation (13) ri and ϕi are RV. When considering the preparation of a pair of qubits each in a pure state,
one may assume either a deterministic or a random direction for each magnetic field. This discussion
shows that the relevant concept, in the latter case, is that of random quantum states, rather than that of
random quantum operators mentioned earlier in this section.
Keeping our assumption of a pair of qubits each prepared in a pure state, we now consider
the second approach for the adaptation and inversion phases (cf. the beginning of Section 2 and
Figure 2), with a quantum state |Φ⟩ present at the output of the inverting block. The presence of |Φ⟩
and the Reader’s final aim, the recovery of the initial pure state, prompts the Reader: (1) to speak of
a deterministic or random pure state, rather than to use a density operator; (2) to consider that the
first constraint to be respected in BQSS is then the very existence of an unentangled state at the output
of this inverting block. If unentanglement has first been achieved, then and only then is it possible
to speak of a deterministic or random state for each part of that product state. While entanglement
has no classical counterpart, the following point may be noted here: if a bipartite system is in a pure
(deterministic) state |Φ⟩, to which a density operator ρ = |Φ⟩⟨Φ| corresponds, |Φ⟩ is unentangled
if and only if the partial traces ρ1 and ρ2 satisfy the equality ρ = ρ1 ⊗ ρ2 [31]. This unentanglement
condition is reminiscent of the relation ρ = ρ1 · ρ2 between ρ, the joint probability density function
of independent classical RV X1 and X2 , and ρ1 and ρ2 , the respective marginal probability density
functions. Presently, operators replace functions, a tensor product replaces the ordinary product,
and this reminiscence reflects the existence of a classical analogue to unentangled states. Condition (4)
for unentanglement was established using spins 1/2, but is valid for any pair of two-level systems.
This discussion suggests that, in the BQSS problem, when considering a pair of qubits prepared in a
pure state, and moreover using the second approach of Section 2 for adaptation and inversion, instead
of trying to directly import ICA methods into the BQSS context, one should focus on disentanglement
at the output of the inverting block, which recently led us to introduce a disentanglement-based
separation principle [9,10].
In the next section, use will be made of the number of real independent parameters necessary to
define an arbitrary normed ket |Ψ⟩ in E1 ⊗ E2, written as in Equation (1), and a ket in E1 ⊗ E2 forced to be
unentangled. These numbers are specified hereafter. An arbitrary normed ket |Ψ⟩ in E1 ⊗ E2 depends
upon the four complex quantities c1 to c4 linked through two relations between real numbers (∑i |ci|²
is equal to 1, and |Ψ⟩ and e^{iϕ}|Ψ⟩, with ϕ an arbitrary real quantity, should be considered identical).
An arbitrary normed ket |Ψ⟩ in E1 ⊗ E2 therefore depends upon six real independent parameters. If it
is forced to be unentangled, it has to satisfy the equality c1 c4 = c2 c3 between complex quantities.
An unentangled normed ket |Ψ⟩ therefore depends upon four real parameters. This corresponds to the
fact that |Ψ⟩ is then restricted to the form |Ψ⟩ = |ψ1⟩ ⊗ |ψ2⟩, where the normed kets |ψ1⟩ and |ψ2⟩,
describing the state of qubits 1 and 2, respectively, each depend upon two real parameters (r1, ϕ1),
(r2, ϕ2) (cf. Equation (13)).
a characteristic time called T1 ) [28,36]. In our previous papers and in the present one, starting from
time t0 when the Writer operates, then, at the chosen time scale, the qubit pair is assumed to be isolated
from its environment.
In the ESR/NMR domain, a well-known situation exists when a collection of identical (nuclear
or electron) spins placed in a fixed resonant magnetic field are transiently submitted to an intense,
oscillating magnetic field with a frequency equal to (or near) its resonant value, and with well-chosen
polarization. If each spin is coupled to the magnetic fields only, at the end of the pulse the density
matrix (written in the basis in which the static Zeeman Hamiltonian is diagonal) describing the state of
these spins possesses non-diagonal elements, called coherences. If a weak internal coupling (spin-spin
coupling) such as the dipolar magnetic coupling exists between the spins, and if it is able to manifest
itself at a time scale allowing one to neglect SLR, it progressively induces a decrease of the coherences,
a reversible phenomenon allowing spin echo techniques.
There is presently a second reason for referring to these behaviours in the MR domain, namely the
fact that DiVincenzo suggested the use of electron spins for the physical realization of qubits more than
twenty years ago [37]. Between two neighbouring electron spins, there may exist a strong exchange
interaction, a strictly quantum phenomenon historically first identified by Heisenberg in magnetically
ordered materials. This is the first reason for our choice of a Heisenberg coupling in the BQSS problem.
The second one is that, on the formal side, the version of the Heisenberg Hamiltonian with spherical or
cylindrical symmetry, simple enough to be used in theoretical works, may serve as a benchmark in
that BQSS problem. It should be recalled that an Ising coupling, simpler to manipulate theoretically
than the Heisenberg one, was present in the DiVincenzo 1995 paper, where it helped in the operating
process, while the presence of the Heisenberg coupling is undesired and should be compensated for in
the BQSS context.
It is well-known that the ESR lines of transition ions in insulators at moderate concentrations are
broadened by the dipolar magnetic coupling between the electron spins, the exchange interaction being
negligible then. In concentrated samples, exchange is stronger than dipolar coupling and produces a
narrowing of the lines [36]. Dipolar coupling is long ranged and anisotropic, which would lead to
heavy theoretical treatments when considering a three-dimensional configuration in the BQSS context.
Future technological developments could possibly make e.g., the consideration of a planar square
lattice of dipolar coupled spins meaningful in that context.
p2 depends upon a mixing parameter v = sgn(cos ΔE) sin ΔE, with [8] ΔE = −Jxy(t1 − t0)/ħ. This
expression for ΔE may be visualized as the opposite of the phase rotation Δφ = ω(t1 − t0) between
states coupled by a Hamiltonian term with energy Jxy, during the time interval (t1 − t0), with ω given
by the Planck–Einstein relation ω = Jxy/ħ. Probability p2 satisfies
$$p_2 = r_1^2(1 - r_2^2)(1 - v^2) + (1 - r_1^2)\,r_2^2\,v^2 - 2 r_1 r_2 \sqrt{1 - r_1^2}\,\sqrt{1 - r_2^2}\,\sqrt{1 - v^2}\; v \sin\Delta_I \tag{15}$$
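Equation (15) is easy to evaluate numerically, which helps to see its limiting cases. The Python sketch below simply transcribes the formula; the function names are hypothetical, `delta_I` is taken as a plain real input because its definition is not reproduced here, and the behaviour of sgn at cos ΔE = 0 is left to the floating-point sign.

```python
import math


def mixing_parameter(delta_E):
    """v = sgn(cos(Delta_E)) * sin(Delta_E)."""
    return math.copysign(1.0, math.cos(delta_E)) * math.sin(delta_E)


def p2(r1, r2, v, delta_I):
    """Direct transcription of Equation (15)."""
    direct = r1 ** 2 * (1 - r2 ** 2) * (1 - v ** 2)
    swapped = (1 - r1 ** 2) * r2 ** 2 * v ** 2
    cross = (2 * r1 * r2 * math.sqrt(1 - r1 ** 2) * math.sqrt(1 - r2 ** 2)
             * math.sqrt(1 - v ** 2) * v * math.sin(delta_I))
    return direct + swapped - cross


# Limiting cases: no mixing (v = 0) and complete exchange (|v| = 1).
no_mixing = p2(0.6, 0.3, 0.0, 1.0)       # equals r1^2 (1 - r2^2)
full_exchange = p2(0.6, 0.3, 1.0, 1.0)   # equals (1 - r1^2) r2^2
```

With v = 0 the expression reduces to r1²(1 − r2²), the probability of the result (+1/2, −1/2) for the uncoupled product state of Equation (13), and with |v| = 1 the roles of the two qubits are exchanged.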
Equation (16) together with (17) is however weaker than condition c1 c4 = c2 c3 , as can be tested by
considering the following state:
$$|\Psi_{i-i11}\rangle = \frac{1}{2}\left(i|{+}{+}\rangle - i|{+}{-}\rangle + |{-}{+}\rangle + |{-}{-}\rangle\right). \tag{18}$$
$$|\Psi_{i-i11}\rangle = \frac{1}{2}\left(|{+}x,{+}x\rangle + i|{+}x,{-}x\rangle - |{-}x,{+}x\rangle + i|{-}x,{-}x\rangle\right). \tag{19}$$
Equation (19) shows that the four probabilities Pix attached to |Ψi−i11⟩ are all equal to 1/4. Therefore,
|Ψi−i11⟩ satisfies (16) and (17), while being entangled.
The two qubits being in the state |Ψ⟩ expressed through (1), one may decide to treat the three
orthogonal directions on the same footing, measuring successively sx for both spins, then, in a new
set of preparations/measurements, sy for both spins, and finally sz for both spins. The probabilities
of obtaining (1/2, 1/2), (1/2, −1/2), (−1/2, 1/2), (−1/2, −1/2), respectively, when measuring s1k
and s2k (with k successively equal to x, y, and z), will be denoted as P1k, P2k, P3k and P4k. For e.g.,
the entangled state |Ψi−i11⟩, as P1z P4z = P2z P3z and P1x P4x = P2x P3x, the hope is that entanglement can
be detected thanks to P1y P4y ≠ P2y P3y, but, in fact, the four Piy are equal to 1/4. Therefore, measuring
the same spin component for both qubits, successively for x, y and z, does not yield an
unentanglement criterion.
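The behaviour of the state of Equation (18) under same-axis measurements can be verified directly. The Python sketch below projects the state onto the eigenkets of the chosen spin components; it assumes the usual conventions |±x⟩ = (|+⟩ ± |−⟩)/√2 and |±y⟩ = (|+⟩ ± i|−⟩)/√2, and the names `BRAS` and `joint_probabilities` are hypothetical.

```python
import math

S = 1 / math.sqrt(2)
# Components of the bras <+k| and <-k| in the standard (|+>, |->) basis.
BRAS = {
    "z": ((1, 0), (0, 1)),
    "x": ((S, S), (S, -S)),
    "y": ((S, -1j * S), (S, 1j * S)),
}


def joint_probabilities(coeffs, axis1, axis2):
    """Probabilities of the results (+,+), (+,-), (-,+), (-,-) for spin 1 along axis1 and spin 2 along axis2."""
    c1, c2, c3, c4 = coeffs
    amp = ((c1, c2), (c3, c4))   # amp[m][n] is the coefficient of |m> (x) |n>
    probs = []
    for bra1 in BRAS[axis1]:
        for bra2 in BRAS[axis2]:
            a = sum(bra1[m] * bra2[n] * amp[m][n] for m in range(2) for n in range(2))
            probs.append(abs(a) ** 2)
    return tuple(probs)


psi = (0.5j, -0.5j, 0.5, 0.5)   # coefficients c1..c4 of the state of Equation (18)
same_axis = {k: joint_probabilities(psi, k, k) for k in ("x", "y", "z")}
amplitude_gap = abs(psi[0] * psi[3] - psi[1] * psi[2])   # |c1 c4 - c2 c3|
```

For each of the three axes the four probabilities equal 1/4, so every product equality P1k P4k = P2k P3k holds, although |c1 c4 − c2 c3| = 1/2 shows that the state is entangled.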
However, since two spins are present, there is still the possibility of not systematically measuring
the same spin component for both spins. One chooses to measure successively sz for both spins, then
s1z and s2x in a new set of preparations/measurements, and finally s1z and s2y . The presence of the
s1z measurement in each of these sets corresponds to recognizing that (1) uses the standard basis.
The probabilities of obtaining (1/2, 1/2), (1/2, −1/2),(−1/2, 1/2), (−1/2, −1/2), respectively, when
measuring s1i and s2j (with i = z, x, or y, and j = z, x, or y) will be denoted as P1ij , P2ij , P3ij and P4ij .
Denoting the ci introduced in Equation (1) as ci = ρi e^{iψi}, then from Equation (4) it is known that |Ψ⟩ is
unentangled if and only if
$$\rho_1\rho_4 = \rho_2\rho_3 \quad \text{and} \quad \psi_1 + \psi_4 = \psi_2 + \psi_3 \pmod{2\pi}. \tag{20}$$
Measuring {s1z, s2z} allows us to know the squared moduli |ci|² = ρi² in (1), and to express the first equality
in Equation (20) as
P1zz P4zz = P2zz P3zz . (21)
The Pkzx and Pkzy (with k = 1 to 4), when expressed as functions of the moduli ρl and angles ψm,
depend upon trigonometric functions of the ψm angles. For instance, for any state |Ψ⟩, entangled or not
When expressing unentanglement through probabilities, one then has to try and respect both
cos α = cos β and sin α = sin β with α and β values compatible with the equality ψ1 + ψ4 = ψ2 + ψ3 , rather
than to respect the equality ψ1 + ψ4 = ψ2 + ψ3 (mod 2π) itself. If it is first known that simultaneously
P1zz P4zz = P2zz P3zz and P1zx P4zx = P2zx P3zx are true, then one immediately deduces that
cos(ψ1 − ψ2 ) = cos(ψ3 − ψ4 ). In addition, if P1zy P4zy = P2zy P3zy replaces the second equality, one
deduces that sin(ψ1 − ψ2 ) = sin(ψ3 − ψ4 ). Therefore, when the three equalities between probability
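products are satisfied, then ρ1 ρ4 = ρ2 ρ3 and ψ1 + ψ4 = ψ2 + ψ3 (mod 2π). Conversely, if |Ψ⟩ is
unentangled, then Equation (8) implies that P1zj P4zj = P2zj P3zj, with j = z, x, y respectively. Finally,
$$|\Psi\rangle \text{ is unentangled} \iff P_{1zj}P_{4zj} = P_{2zj}P_{3zj}, \quad j = z, x, y. \tag{23}$$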
The equivalence therefore is between a single relation between probability amplitudes and a
triplet of relations between probabilities. This criterion, although established in the context of BQSS,
has the same general validity as Equation (4).
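The triplet of probability relations can be tried on sample states. The Python sketch below computes the probabilities Pkzj for j = z, x, y by projection and tests the three product equalities; it uses the conventions |±x⟩ = (|+⟩ ± |−⟩)/√2 and |±y⟩ = (|+⟩ ± i|−⟩)/√2, exact amplitudes instead of estimated probabilities, and hypothetical function names. It illustrates the criterion on examples and does not replace the proof.

```python
import math

S = 1 / math.sqrt(2)
# Components of the bras <+k| and <-k| in the standard (|+>, |->) basis.
BRAS = {
    "z": ((1, 0), (0, 1)),
    "x": ((S, S), (S, -S)),
    "y": ((S, -1j * S), (S, 1j * S)),
}


def joint_probabilities(coeffs, axis1, axis2):
    """Probabilities of (+,+), (+,-), (-,+), (-,-) for spin 1 along axis1 and spin 2 along axis2."""
    c1, c2, c3, c4 = coeffs
    amp = ((c1, c2), (c3, c4))
    return tuple(
        abs(sum(b1[m] * b2[n] * amp[m][n] for m in range(2) for n in range(2))) ** 2
        for b1 in BRAS[axis1] for b2 in BRAS[axis2]
    )


def probability_criterion(coeffs, tol=1e-12):
    """True when P1zj*P4zj == P2zj*P3zj holds for j = z, x and y."""
    for j in ("z", "x", "y"):
        P1, P2, P3, P4 = joint_probabilities(coeffs, "z", j)
        if abs(P1 * P4 - P2 * P3) > tol:
            return False
    return True


def amplitude_criterion(coeffs, tol=1e-12):
    """True when c1*c4 == c2*c3, Equation (4)."""
    c1, c2, c3, c4 = coeffs
    return abs(c1 * c4 - c2 * c3) < tol


product = (0.6 * S, 0.6 * 1j * S, 0.8j * S, 0.8j * 1j * S)   # (0.6|+> + 0.8i|->) (x) (|+> + i|->)/sqrt(2)
psi_18 = (0.5j, -0.5j, 0.5, 0.5)                             # entangled state of Equation (18)
bell = (S, 0, 0, S)                                          # entangled Bell state
```

On these three states the probability-based and amplitude-based tests give the same verdict; in particular the state of Equation (18), which escapes same-axis measurements, is detected by the {s1z, s2x} measurement.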
Use of criterion (23) necessitates successive measurements first of s1z and s2z , then (after new
preparations) of s1z and s2x , and finally (again after new preparations) of s1z and s2y , in order to
successively estimate first the Pizz probabilities, then the Pizx and finally the Pizy . One must measure
s1z each time, because (1) getting e.g., (+1/2, −1/2) when measuring s1z and s2z is an event to be
distinguished from the one realized when measuring s1z and s2x and getting (+1/2, −1/2), (2) results
of measurements of s1z and s2x are independent only if |Ψ⟩ is unentangled, which precisely cannot be
assumed when Equation (23) is to be used.
The two distinguishable spins were made to play different roles in the process, which led to
Equation (23) (systematic measurement of s1z ). This dissymmetry is only partial, as Equation (23) can
be replaced by a version obtained by exchanging the spin numbers. The next subsection makes a
symmetrical use of measurements of spin components, allowing one to get the values of both the ρi
moduli and the ψi angles for the ci coefficients in Equation (1).
Similarly, when measuring s1z and s2y , the probabilities of getting (1/2, 1/2) and (−1/2, 1/2) are,
respectively,
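$$P_{1zy} = \frac{1}{2}|c_1 - i c_2|^2, \quad P_{3zy} = \frac{1}{2}|c_3 - i c_4|^2, \tag{26}$$
which leads to
$$\sin(\psi_1 - \psi_2) = -\frac{2P_{1zy} - P_{1zz} - P_{2zz}}{2\sqrt{P_{1zz}P_{2zz}}}, \quad \sin(\psi_3 - \psi_4) = -\frac{2P_{3zy} - P_{3zz} - P_{4zz}}{2\sqrt{P_{3zz}P_{4zz}}}. \tag{27}$$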
Expressions (25) and (27) allow us to know both (ψ1 − ψ2 ) and (ψ3 − ψ4 ) (mod 2π).
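The recovery of a phase difference from measured probabilities can be sketched as follows. The sine relation is Equation (27); the cosine relation used below, cos(ψ1 − ψ2) = (2P1zx − P1zz − P2zz)/(2√(P1zz P2zz)) with P1zx = |c1 + c2|²/2, is an assumption derived here in the same way and stands in for Equation (25). Exact probabilities replace experimental estimates, and the function name is hypothetical.

```python
import cmath
import math


def phase_difference_12(P1zz, P2zz, P1zx, P1zy):
    """psi_1 - psi_2 in (-pi, pi], from probabilities of the {s1z, s2z}, {s1z, s2x} and {s1z, s2y} measurements."""
    denom = 2 * math.sqrt(P1zz * P2zz)           # requires P1zz > 0 and P2zz > 0
    cos_d = (2 * P1zx - P1zz - P2zz) / denom     # assumed cosine relation
    sin_d = -(2 * P1zy - P1zz - P2zz) / denom    # Equation (27)
    return math.atan2(sin_d, cos_d)


# Hypothetical coefficients c1 and c2 of a state written as in Equation (1).
c1 = 0.6 * cmath.exp(0.7j)
c2 = 0.5 * cmath.exp(-1.1j)
P1zz, P2zz = abs(c1) ** 2, abs(c2) ** 2
P1zx = 0.5 * abs(c1 + c2) ** 2
P1zy = 0.5 * abs(c1 - 1j * c2) ** 2              # Equation (26)
recovered = phase_difference_12(P1zz, P2zz, P1zx, P1zy)
```

Using both the cosine and the sine removes the sign ambiguity that either relation alone would leave, which is why the phase difference is obtained modulo 2π.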
Now, exchanging the roles of spins 1 and 2, we successively measure {s1x , s2z } and (after new
preparations) {s1y , s2z }. The probabilities of getting (1/2, 1/2) in these measurements are, respectively,
$$P_{1xz} = \frac{1}{2}|c_1 + c_3|^2, \quad P_{1yz} = \frac{1}{2}|c_1 - i c_3|^2, \tag{28}$$
which leads to
Similarly, the state at the Mixer output at time t, here denoted as |Ψ(t)⟩, is given by
Equation (1), with the values of the coefficients ci (in the B+ basis) taken at t and denoted as ci(t).
The coupling-induced transition from state |Ψ(t0)⟩ to |Ψ(t)⟩ is interpreted as the transformation
induced by the Mixer, leading to the appearance of |Ψ(t)⟩ at its output. In the same basis, |Ψ(t)⟩
is described by the column vector C+(t) given by (30), with t replacing t0. In the matrix formalism,
the relation between C+(t0) and C+(t) is written as
$$C_+(t) = M\,C_+(t_0), \tag{31}$$
where the square fourth-order matrix M describes the effect of the coupling. In [8], it was shown that
when the coupling may be described by a Heisenberg cylindrical Hamiltonian, then M = QDQ⁻¹,
where Q = Q⁻¹ is a square matrix with the following non-zero matrix elements:
$$Q_{11} = Q_{44} = 1, \quad Q_{22} = -Q_{33} = Q_{23} = Q_{32} = \frac{1}{\sqrt{2}}, \tag{32}$$
and D is a diagonal square matrix with its diagonal elements equal to Dii = e^{−iωi(t−t0)} (i = 1...4),
the ωi being real quantities depending upon Jz and Jxy, with generally unknown numerical values.
The input of the inverting block then receives this state |Ψ(t)⟩. Its output provides a state |Φ⟩ described
in the B+ basis by a column vector C, with
$$C = U\,C_+(t), \tag{33}$$
where the square matrix U (Unmixing matrix) describes the effect of the inverting block of the
separating system. If it is possible to choose U in the form U = M⁻¹, then |Φ⟩ will be equal to
|Ψ(t0)⟩. However, strictly speaking, operating this way is impossible because M = QDQ, and D
is unknown. In [9], the inverting block was formally built using a chain of quantum gates globally
realizing matrix U in the form U = Q D̃ Q, where D̃ is a diagonal matrix with its four diagonal elements
D̃ii (i = 1...4) equal to
$$\tilde{D}_{ii} = e^{i\gamma_i}, \quad \gamma_i: \text{free real parameters}. \tag{34}$$
D̃D = Δ is therefore a diagonal matrix with diagonal elements Δii = e^{iδi}, where
$$\delta_i = \gamma_i - \omega_i (t - t_0). \tag{35}$$
The D̃ matrix and the adaptation phase were introduced because it is not possible to modify the
values of the D matrix. In the following discussion, it is assumed that the ωi are time-independent and
that the adaptation phase has been successful with respect to unentanglement, i.e., that it has been
possible to adjust the γi in such a way that, in the inversion phase, if the Writer has prepared each
qubit of the qubit pair in an arbitrary pure state at time t0, we are then sure that state |Φ⟩ at the output
of the inverting block is unentangled. The column vectors C+(t0) and C are associated with |Ψ(t0)⟩
and |Φ⟩ respectively, and C = QΔQC+(t0) is therefore the column vector
$$\begin{pmatrix} e^{i\delta_1} c_1(t_0) \\ \left[e^{i\delta_2}\left(c_2(t_0) + c_3(t_0)\right) + e^{i\delta_3}\left(c_2(t_0) - c_3(t_0)\right)\right]/2 \\ \left[e^{i\delta_2}\left(c_2(t_0) + c_3(t_0)\right) - e^{i\delta_3}\left(c_2(t_0) - c_3(t_0)\right)\right]/2 \\ e^{i\delta_4} c_4(t_0) \end{pmatrix}. \tag{36}$$
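The matrix relations leading to Equation (36) can be checked with a few lines of Python. The sketch below builds Q from Equation (32), forms M = QDQ and U = Q D̃ Q with arbitrary illustrative values for ωi(t − t0) and γi (these numbers are hypothetical), and compares U M C+(t0) with the closed form of Equation (36). Plain nested lists are used so that no external library is needed.

```python
import cmath
import math

S = 1 / math.sqrt(2)
# Matrix Q of Equation (32); it is its own inverse.
Q = [[1, 0, 0, 0],
     [0, S, S, 0],
     [0, S, -S, 0],
     [0, 0, 0, 1]]


def matmul(A, B):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def matvec(A, x):
    """Product of a 4x4 matrix and a length-4 vector."""
    return [sum(A[i][j] * x[j] for j in range(4)) for i in range(4)]


def diag_phases(angles):
    """Diagonal matrix with entries exp(i * angle)."""
    return [[cmath.exp(1j * angles[i]) if i == j else 0 for j in range(4)] for i in range(4)]


def equation_36(delta, c):
    """Closed form of C = Q Delta Q C+(t0)."""
    e1, e2, e3, e4 = (cmath.exp(1j * d) for d in delta)
    c1, c2, c3, c4 = c
    return [e1 * c1,
            (e2 * (c2 + c3) + e3 * (c2 - c3)) / 2,
            (e2 * (c2 + c3) - e3 * (c2 - c3)) / 2,
            e4 * c4]


omega_t = [0.4, 1.3, -0.7, 2.1]   # hypothetical values of omega_i (t - t0)
gamma = [0.9, 0.2, 1.5, -0.3]     # hypothetical values of the free parameters gamma_i
delta = [g - w for g, w in zip(gamma, omega_t)]             # Equation (35)
M = matmul(Q, matmul(diag_phases([-w for w in omega_t]), Q))  # mixing matrix M = Q D Q
U = matmul(Q, matmul(diag_phases(gamma), Q))                  # unmixing matrix U = Q D~ Q
c_in = [0.5, 0.5j, -0.5, 0.5j]    # some normed input vector C+(t0)
C_out = matvec(U, matvec(M, c_in))
```

`C_out` coincides with `equation_36(delta, c_in)`, which reflects the fact that the two inner factors Q cancel and only the products Δii = D̃ii Dii remain.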
$$e^{i(\delta_1+\delta_4)} c_1 c_4 = \frac{1}{4}\left[2c_2c_3\left(e^{i2\delta_2} + e^{i2\delta_3}\right) + \left(c_2^2 + c_3^2\right)\left(e^{i2\delta_2} - e^{i2\delta_3}\right)\right] \tag{37}$$
(ci meaning ci(t0), for i = 1 to 4). We want this relation to be satisfied for any unentangled |Ψ(t0)⟩.
Starting with a |Ψ(t0)⟩ state with c2(t0)c3(t0) ≠ 0 and remembering that c1(t0)c4(t0) = c2(t0)c3(t0),
Equation (37) may then be written
Equation (38) is required to be fulfilled for all possible states |Ψ(t0)⟩ with c2(t0)c3(t0) ≠ 0, and
for fixed δi values (defined once and for all during the adaptation phase). The left-hand term does not
depend upon the ci(t0), whereas its right-hand term does depend upon them. Therefore, Equation (38)
is satisfied only if
$$e^{i2\delta_2} - e^{i2\delta_3} = 0, \quad \text{i.e.,} \quad \delta_3 - \delta_2 = m\pi, \quad m: \text{integer}, \tag{39}$$
If Equations (39) and (40) and relation c1(t0)c4(t0) = c2(t0)c3(t0) are inserted into Equation (36),
it is easy to write |Φ⟩ as a product state, which confirms that if Equations (39) and (40) are fulfilled,
then |Φ⟩ is indeed unentangled.
If one now supposes e.g., a |Ψ(t0)⟩ with c3(t0) = 0, c2(t0) ≠ 0, c4(t0) ≠ 0, and therefore c1(t0) = 0,
then in order for |Φ⟩ to be unentangled Equation (37) has to be fulfilled. Putting c1(t0) = c3(t0) = 0
into Equation (37) leads to Equation (39), and the δi are then not subject to any other constraint.
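The effect of the constraints on the δi can be explored numerically. In the Python sketch below, Equation (39) is taken from the text, while the second constraint is written as δ1 + δ4 = 2δ2 + 2nπ, which is what Equation (37) reduces to once Equation (39) holds; this form is an assumption standing in for Equation (40). All function names and numerical values are hypothetical.

```python
import cmath
import math


def output_coefficients(delta, c):
    """Column vector of Equation (36) for given delta_i and input coefficients c_i(t0)."""
    e1, e2, e3, e4 = (cmath.exp(1j * d) for d in delta)
    c1, c2, c3, c4 = c
    return (e1 * c1,
            (e2 * (c2 + c3) + e3 * (c2 - c3)) / 2,
            (e2 * (c2 + c3) - e3 * (c2 - c3)) / 2,
            e4 * c4)


def is_unentangled(C, tol=1e-12):
    """Criterion of Equation (4) applied to the output coefficients."""
    return abs(C[0] * C[3] - C[1] * C[2]) < tol


def product_state(a, b, c, d):
    """Coefficients of (a|+> + b|->) (x) (c|+> + d|->)."""
    return (a * c, a * d, b * c, b * d)


def adapted_deltas(delta1, delta2, m, n=0):
    """delta_i satisfying Equation (39) and the assumed form of Equation (40)."""
    return (delta1, delta2, delta2 + m * math.pi, 2 * delta2 - delta1 + 2 * n * math.pi)


source = product_state(0.6, 0.8j, 0.28, 0.96 * cmath.exp(0.4j))   # unentangled input
adapted_ok = all(
    is_unentangled(output_coefficients(adapted_deltas(0.3, 1.1, m), source))
    for m in (0, 1, 2, 3)
)
# Violating Equation (39) (delta_3 - delta_2 = 0.5) leaves the output entangled.
misadapted_ok = is_unentangled(output_coefficients((0.3, 1.1, 1.6, 1.9), source))
# Special input with c1 = c3 = 0: Equation (39) alone is sufficient.
special = product_state(0.6, 0.8, 0.0, 1.0)
special_ok = is_unentangled(output_coefficients((0.3, 1.1, 1.1 + math.pi, 5.0), special))
```

With adapted δi the output is a product state for every tested m, while a mismatch in δ3 − δ2 produces an entangled output; for the special input with c1(t0) = c3(t0) = 0 the sum δ1 + δ4 plays no role.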
6. Conclusions
Conventional BSS is a mature field of Signal Processing, with various applications. Its extension
into a quantum context has been developing for a decade, first through the creation of theoretical
methods for Blind Quantum Source Separation (BQSS), with classical and/or quantum processing,
and recently through the use of BQSS in the exploration of Blind Quantum Process Tomography
(BQPT). The present paper examined in detail concepts (e.g., those of quantum sources and of
their independence) and established properties (e.g., an unentanglement criterion) introduced in
our previous papers. In the BQSS context, with qubits supposed to be realized with spins 1/2, one
has to face two major consequences of the quantum behaviour. First, if each qubit of a spin qubit
pair is initially prepared in a pure state, and the time evolution of the pair state is governed by some
undesired coupling between the spins, the Reader at the Mixer output accesses an unknown generally
entangled qubit pair quantum state. This entangled state may be sent to a quantum processing system
in order to restore the initially prepared state. Writing the output state of this processing system as
e.g., |Φ⟩ = ∑i ci |i⟩ in the standard basis, with well-ordered basis states, we showed that this state is
unentangled if and only if c1 c4 = c2 c3 , a constraint between probability amplitudes. Secondly, results
of measurements of the qubit spin components have a probabilistic nature, and the corresponding
probabilities follow quantum properties even when processed with classical means. This article shows
precautions to be taken when trying to extend to Blind Quantum SS the concept of source statistical
independence used in conventional BSS. Using the probabilities Pizj of getting the different possible
results when measuring s1z and s2j , successively with j = z, x and y, it is shown that the above
unentanglement criterion may be written as { P1zj P4zj = P2zj P3zj }, a set of three constraints between
probabilities. This unentanglement criterion has already been used in the adaptation phase of Blind
Quantum SS, through a disentanglement-based separation principle, before restoration of the initial
unentangled state. The already developed BQSS/BQPT methods do not depend on some specific
interpretation of Quantum Theory, while respecting its general postulates.
Acknowledgments: This theoretical study was performed without financial support. The costs to publish in open
access were handled by Yannick Deville, in the framework of the research activities and projects that he is heading
in his lab (Institut de Recherche en Astrophysique et Planétologie).
Author Contributions: This theoretical study was performed by Alain Deville and Yannick Deville, in connection
with the research activities about related topics that they also performed together (see above-mentioned papers).
Both authors participated in writing this paper.
Conflicts of Interest: The authors declare no conflict of interest.
the speech signal. The denoised speech output of this BSS system is then provided to the ASR system
(see [11] and references therein).
When using radio-frequency signals to transmit digital data, reception antennas may
simultaneously receive several mixed data streams. BSS is then applied to first unmix these signals.
Each extracted signal may then be separately used as required in the considered application. Its use in
the radio-frequency identification (RFID) system instance is briefly presented in [11].
The biomedical field makes a systematic use of signals such as electrocardiograms (ECGs) or
electroencephalograms (EEGs), processed by human experts or computers. This “main task” is often
difficult because each signal in the recorded set is a mixture of various contributions, and the information
of interest thus cannot be easily extracted from any such mixed signal. Again, a solution to this problem
consists of pre-processing the original recordings by means of BSS methods, so as to extract each
signal component of interest separately on each output of this BSS system. In [11], information
is given about the extraction of foetus’s heartbeats from ECG recordings which were mixtures of
large-magnitude mother’s heartbeats, low-magnitude foetus’s heartbeats and noise components.
These foetus’s heartbeats were hardly visible in the original recordings.
BSS is closely related to the so-called Blind System Identification (BSI). The problem of
describing an unknown classical (i.e., non quantum) system through a realistic model is called system
identification. When e.g., this system may be described by a matrix, the task is the determination of
its matrix elements. In Blind System Identification, some properties of the input signals are known,
but the input signals themselves are unknown. Methods for BSS often include the determination of
the unknown mixer function or of its inverse. This is a kind of BSI problem, called Blind Mixture
Identification (BMI).
to benefit from the fact that BQSS avoids the intrinsic complexity of standard QPT methods. For more
details about the applications of BQSS and BQPT, the interested reader may refer to [11,14], and to
references therein.
References
1. Laloë, F. Comprenons-Nous Vraiment la Mécanique Quantique; EDP Sciences: Les Ulis, France, 2011;
English version: Do We Really Understand Quantum Mechanics? Cambridge University Press: Cambridge,
UK, 2012.
2. Cohen-Tannoudji, C.; Diu, B.; Laloë, F. Mécanique Quantique; Hermann: Paris, France, 1973; English version:
Quantum Mechanics; John Wiley: New York, NY, USA, 1977.
3. Dirac, P. Quantum Mechanics of Many-Electron Systems. Proc. R. Soc. A 1929, 123, 714–733.
4. Timpson, C.G. Quantum Information Theory and the Foundations of Quantum Mechanics. Ph.D. Thesis,
University of Oxford, Oxford, UK, 2004.
5. Comon, P.; Jutten, C. (Eds.) Handbook of Blind Source Separation: Independent Component Analysis and Applications;
Academic Press: Oxford, UK, 2010.
6. Deville, Y. Blind Source Separation and Blind Mixture Identification Methods. In Wiley Encyclopedia of Electrical
and Electronics Engineering; Webster, J., Ed.; Wiley: Hoboken, NJ, USA, 2016; pp. 1–33.
7. Deville, Y.; Deville, A. Blind separation of quantum states: Estimating two qubits from an isotropic Heisenberg
spin coupling model. In Proceedings of the 7th International Conference on Independent Component
Analysis and Signal Separation, London, UK, 9–12 September 2007; Davies, M.E., James, C.J., Abdallah, S.A.,
Plumbley, M.D., Eds.; Springer: Berlin, Germany, 2007; pp. 706–713.
8. Deville, Y.; Deville, A. Classical-processing and quantum-processing signal separation methods for qubit
uncoupling. Quantum Inf. Process. 2012, 11, 1311–1347.
9. Deville, Y.; Deville, A. A quantum-feedforward and classical-feedback separating structure adapted with
monodirectional measurements; blind qubit uncoupling capability and links with ICA. In Proceedings
of the 23rd IEEE International Workshop on Machine Learning for Signal Processing, Southampton, UK,
22–25 September 2013.
10. Deville, Y.; Deville, A. Blind qubit state disentanglement with quantum processing: Principle, criterion
and algorithm using measurements along two directions. In Proceedings of the 2014 IEEE International
Conference on Acoustics, Speech and Signal Processing, Florence, Italy, 4–9 May 2014; pp. 6262–6266.
11. Deville, Y.; Deville, A. Quantum-Source Independent Component Analysis and Related Statistical Blind Qubit
Uncoupling Methods. In Blind Source Separation: Advances in Theory, Algorithms and Applications; Naik, G.R.,
Wang, W., Eds.; Springer: Berlin, Germany, 2014; pp. 3–37.
12. Deville, Y.; Deville, A. From blind quantum source separation to blind quantum process tomography.
In Proceedings of the 12th International Conference on Latent Variable Analysis and Signal Separation,
Liberec, Czech Republic, 25–28 August 2015; Vincent, E., Yeredor, A., Koldovský, Z., Tichavský, P., Eds.;
Springer: Berlin, Germany, 2015; pp. 184–192.
13. Deville, Y.; Deville, A. Blind quantum computation: Blind quantum source separation and blind quantum
process tomography. In Proceedings of the 19th Conference on Quantum Information Processing, Banff, AB,
Canada, 10–15 January 2016.
14. Deville, Y.; Deville, A. Blind quantum source separation: Quantum-processing qubit uncoupling systems
based on disentanglement. Digit. Signal Process. 2017, 67, 30–51.
15. Deville, Y. Traitement du Signal: Signaux Temporels et Spatiotemporels—Analyse des Signaux, Théorie de
L’information, Traitement D’antenne, Séparation Aveugle de Sources; Ellipses Editions Marketing: Paris, France,
2011. (In French)
16. Feynman, R.P. Quantum Mechanical Computers. Opt. News 1985, 11, 11–20.
17. Feynman, R.P. Feynman Lectures on Computation; Perseus Publishing: Cambridge, MA, USA, 1996.
18. Peres, A. Separability Criterion for Density Matrices. Phys. Rev. Lett. 1996, 77, 1413–1415.
19. Horodecki, M.; Horodecki, P.; Horodecki, R. Separability of mixed states: Necessary and sufficient conditions.
Phys. Lett. A 1996, 223, 1–8.
20. Nielsen, M.A.; Chuang, I.L. Quantum Computation and Quantum Information; Cambridge University Press:
Cambridge, UK, 2000.
21. Buchleitner, A.; Viviescas, C.; Tiersch, M. (Eds.) Entanglement and Decoherence (Lecture Notes in Physics);
Springer: Berlin, Germany, 2009.
22. Köhler, J.; Disselhorst, J.A.J.M.; Donckers, M.C.J.M.; Groenen, E.J.J.; Schmidt, J.; Moerner, W.E. Magnetic
resonance of a single molecular spin. Nature 1993, 363, 242–244.
23. Gruber, A.; Dräbenstedt, A.; Tietz, C.; Fleury, L.; Wrachtrup, J.; von Borczyskowski, C. Scanning Confocal
Optical Microscopy and Magnetic Resonance on Single Defect Centers. Science 1997, 276, 2012–2014.
24. Rugar, D.; Budakian, R.; Mamin, H.J.; Chui, B.W. Single spin detection by magnetic resonance force microscopy.
Nature 2004, 430, 329–332.
25. Otte, A.F. Can data be stored in a single magnetic atom? Europhys. News 2008, 38, 31–34.
26. Bienfait, A.; Pla, J.J.; Kubo, Y.; Stern, M.; Zhou, X.; Lo, C.C.; Weis, C.D.; Schenkel, T.; Thewalt, M.L.W.;
Vion, D.; et al. Reaching the quantum limit of sensitivity in electron spin resonance. arXiv 2015, arXiv:1507.06831.
27. Hyvärinen, A.; Karhunen, J.; Oja, E. Independent Component Analysis; Wiley: New York, NY, USA, 2001.
28. Abragam, A. The Principles of Nuclear Magnetism; Oxford University Press: Oxford, UK, 1961.
29. Tolman, R.C. The Principles of Statistical Mechanics; Oxford University Press: Oxford, UK, 1938; p. 327.
30. Von Neumann, J. Les Fondements Mathématiques de la Mécanique Quantique; Alcan: Paris, France, 1946; Editions
Jacques Gabay: Paris, France, 1988. (In French)
31. Barnett, S.M. Quantum Information; Oxford University Press: Oxford, UK, 2009.
32. Fuchs, C.A.; Peres, A. Quantum theory needs no “interpretation”. Phys. Today 2000, 53, 70–71.
33. Margenau, H. Quantum-Mechanical description. Phys. Rev. 1936, 49, 240–242.
34. Margenau, H. Critical Points in Modern Physical Theory. Philos. Sci. 1937, 4, 337–370.
35. Feynman, R.P. Statistical Mechanics; Basic Books: New York, NY, USA, 1972.
36. Abragam, A.; Bleaney, B. Electron Paramagnetic Resonance of Transition Ions; Oxford University Press: Oxford,
UK, 1970.
37. DiVincenzo, D.P. Quantum Computation. Science 1995, 270, 255–261.
38. Fazekas, P. Electron Correlation and Magnetism; World Scientific: Hackensack, NJ, USA, 1999.
© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).