Entanglement-Enabled Advantage For Learning A Bosonic Random Displacement Channel
Changhun Oh,1, ∗ Senrui Chen,1, ∗ Yat Wong,1 Sisi Zhou,2, 3, 4 Hsin-Yuan Huang,3, 5, 6 Jens A.H. Nielsen,7
Zheng-Hao Liu,7 Jonas S. Neergaard-Nielsen,7 Ulrik L. Andersen,7 Liang Jiang,1, † and John Preskill3, ‡
1
Pritzker School of Molecular Engineering, The University of Chicago, Chicago, Illinois 60637, USA
2
Perimeter Institute for Theoretical Physics, Waterloo, Ontario N2L 2Y5, Canada
3
Institute for Quantum Information and Matter,
California Institute of Technology, Pasadena, CA 91125, USA
4
Department of Physics and Astronomy and Institute for Quantum Computing,
University of Waterloo, Ontario N2L 2Y5, Canada
5
Center for Theoretical Physics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
6
Google Quantum AI, Venice, CA, USA
7
Center for Macroscopic Quantum States (bigQ),
Department of Physics, Technical University of Denmark,
arXiv:2402.18809v1 [quant-ph] 29 Feb 2024
Quantum science and technology holds promise to revolutionize how we understand and interact with nature, enabling computational speedups [1], classically impossible communication tasks [2, 3], and measurements with unprecedented sensitivity [4–6]. Rapid progress during the noisy intermediate-scale quantum (NISQ) era [7] has brought these promises closer to reality, but the challenge remains to demonstrate rigorous quantum advantage for practical problems.

Over the past few years, there has been ongoing theoretical and experimental progress in exploring quantum computational advantage [8–16]. Another recent line of research seeks quantum advantage in learning [17–24], revealing that access to quantum memory enables us to learn properties of nature more efficiently. Specifically, Refs. [18, 19] establish a framework for proving exponential separation in sample complexity between learning with and without a coherently controllable quantum memory. In contrast to its computational counterpart, this entanglement-enabled advantage in learning can be proven without invoking computational assumptions and can sometimes be more experimentally accessible. A proof-of-principle experiment has been conducted on Google’s superconducting quantum processor using 40 qubits [18].

Most learning tasks studied so far are restricted to discrete-variable (DV) systems. It is natural to ask whether entanglement-enabled advantage can also be realized for learning properties of bosonic continuous-variable (CV) systems. This is particularly interesting and important because CV systems are ubiquitous in nature and have many applications in quantum information science, such as quantum sensing [6, 8, 25–27]. However, generalizing the results in DV systems to CV systems can be difficult because bosonic systems have infinite-dimensional Hilbert spaces, making it challenging to formulate rigorous results concerning the complexity of learning properties of these systems. Recent progress has been achieved in studies of entanglement-enhanced learning of CV-state characteristic functions [28]; however, the lower bounds obtained so far apply to a restricted class of learning strategies rather than to general entanglement-free schemes.

In this work, we rigorously establish an entanglement-enabled advantage in learning a probabilistic mixture of n-mode displacement operations, called a bosonic random displacement channel. Specifically, we show that any schemes without ancillary quantum memory require a number of samples exponential in n to learn the characteristic functions of the channel with reasonably good precision and high success probability. On the contrary, we present a simple scheme utilizing entanglement with ancillary quantum memory (i.e., entanglement-assisted) that can complete the same learning task with a sample
Fourier transforming to invert this relation, we obtain
\[
\lambda(\beta) = e^{e^{-2r}|\beta|^2} \int d^{2n}\zeta \, p_{\rm EA}(\zeta) \, e^{\zeta^\dagger\beta - \beta^\dagger\zeta}
\]

In particular, if we choose the squeezing parameter as r = Ω(log n), the sample complexity $N_{\rm EA} = O(\epsilon^{-2}\log\delta^{-1})$ of the entanglement-assisted scheme becomes independent
FIG. 2. Comparison between (a) the true distribution, (b) TMSV+BM, and (c) Vacuum+Heterodyne strategies. The left panel represents the probability distribution of the true distribution and measurement probability distributions for each scheme. The right panel represents the characteristic function of probability distributions.

entanglement-assisted sample complexity given in Eq. (8), Theorem 2 establishes a separation exponential in n for cutoff coefficient κ = O(1) and squeezing parameter r = Ω(log n). The intuition underlying this theorem is that displacement operators D̂(β) do not generally commute with each other. Consequently, entanglement-free measurements can resolve λ(β) for only a small portion of β space. We sketch the proof below and leave the full details to SM S3 [29].

Proof Sketch. Our proof extends the techniques of

probability: (1) Set Λ = Λ0; (2) Set Λ = Λγ, for γ sampled from a zero-mean homogeneous Gaussian distribution whose variance is determined by κ. Next, Alice allows Bob to use the channel Λ N times, and Bob uses his favorite entanglement-free scheme to learn from these channel uses. After Bob has finished all quantum measurements and keeps only classical data, Alice reveals some auxiliary information to Bob, who is then asked to decide whether Alice has chosen (1) or (2).

Given a learning scheme satisfying the assumptions of Theorem 2, Bob can guess correctly with high probability. This means that the outcome distributions of Bob’s scheme under hypotheses (1) and (2) must have a sufficiently large total variation distance (TVD). On the other hand, we can upper bound the contribution from each use of Λ to the TVD to be exponentially small, where we use a technique inspired by Ref. [35] which derived the maximum fidelity of Gaussian random displacement channels. Therefore, the number of channel uses N must be exponentially large to ensure a large enough TVD, which gives us the desired lower bound.

Effect of loss. Now, for practical applications, we study how the entanglement-assisted scheme is affected by photon loss, a dominant noise source in optical platforms (see SM S2 D for a discussion of more general noise models, such as phase diffusion). Photon loss transforms the relevant bosonic operator â to $\sqrt{T}\hat{a} + \sqrt{1-T}\hat{e}$, where T is the transmission rate and ê is the environmental mode, i.e., 1 − T is the loss rate. We consider two different places where the loss occurs: one is before applying the channel with loss rate 1 − Tb to model the preparation imperfection, and the other is after applying the channel and before the perfect BM with loss rate 1 − Ta, which models the finite efficiency of detection, i.e., an imperfect BM [27]. As before, we derive the relation between the measurement probability distribution and the characteristic function of the channel (with appropriate rescaling of the phase):
\[
\lambda(\beta) = e^{e^{-2r_{\rm eff}}|\beta|^2} \int d^{2n}\zeta \, p_{\rm loss}(\zeta) \, e^{(\zeta^\dagger\beta - \beta^\dagger\zeta)/\sqrt{T_a}}, \tag{14}
\]
FIG. 3. (a) Comparison of TMSV+BM (with different loss rates), Vacuum+Heterodyne, and the entanglement-free lower
bound at κ = 1. The task is to estimate any λ(β) such that |β|2 ≤ κn with precision ε = 0.2 and success probability 1 − δ = 2/3.
The orange region represents a rigorous advantage over all entanglement-free schemes. The blue region represents an advantage
over noiseless Vacuum+Heterodyne. (b) Comparison of the TMSV+BM scheme with squeezing parameter r = 1.0 and loss rate
1 − T = 0.1 with the entanglement-free lower bound of Theorem 2. (See SM S3 A for further practical considerations.) The task
is the same as (a). The brown solid contour lines represent the sample complexity of TMSV+BM given by Theorem 3. The blue
dashed contour lines represent the ratio of sample complexity between the entanglement-free lower bound and TMSV+BM,
indicating the entanglement-enabled advantage.
$N = 8 e^{2e^{-2r_{\rm eff}}|\beta|^2} \epsilon^{-2} \log 4\delta^{-1}$, where r_eff is defined according to Eq. (15).

Thus, when |β|² ≤ κn with a constant κ > 0, Tb = 1 − O(1/n), Ta = 1 − O(1/n) and r = Ω(log n), the sample complexity becomes $N = O(\epsilon^{-2}\log\delta^{-1})$ as in the lossless case. For practically relevant squeezing and including loss prior to Bell measurement, we compare the sample complexity for the lossy TMSV+BM protocol and the lossless entanglement-free lower bound in Fig. 3, finding a significant entanglement-enabled advantage in realistic experimental settings. Specifically, for reasonable parameter choices such as squeezing parameter r = 1, loss rate 10%, and κ = O(1), we can achieve a factor of $10^4$ ($10^8$) advantage for around n = 30 (60) modes. Although the $10^9$ number of samples required to achieve the advantage seems large, state-of-the-art quantum optics experiments (e.g., Refs. [36, 37]) can attain such a number of samples in a reasonable time with a high sampling rate of up to 160 GHz.

Discussion. We proved that schemes that exploit entanglement with an ancillary quantum memory can learn n-mode bosonic random displacement channels with exponentially fewer samples compared to entanglement-free schemes. Our results show that the information-theoretic framework for learning studied in DV quantum systems [17, 19] can be generalized to the CV setting and have powerful implications. We anticipate that these techniques can be applied to other CV learning tasks as well. In addition, our analysis suggests that the separation in sample complexity between entanglement-assisted and entanglement-free protocols may be realized in the near future.

Apart from their theoretical interest, random displacement channels can also be practically relevant in, e.g., modeling noise in bosonic systems. As in the qubit case [38], we expect that noise tailoring methods can transform more general noise models into random displacement channels; therefore efficiently learning random displacement channels can be useful for benchmarking CV quantum systems [39, 40] and for error mitigation.

Displacement estimation is also studied in quantum metrology (see e.g. [41–43]). A task often considered in metrology is learning an unknown unitary displacement or phase transformation acting independently on each mode [36, 42, 44–47], whereas the task analyzed in this work is learning an unknown mixture of multimode displacements. Furthermore, while the goal in metrology is typically to learn one or a few parameters, in our case, the parameter space is very large. Therefore, the methodology in the two settings is quite different. Connections between metrology and bosonic channel learning are worthy of further exploration.

We thank Mankei Tsang, Yuxin Wang, Ronald de Wolf, Mingxing Yao, Ming Yuan for insightful discussions. C.O., S.C., Y.W., L.J. acknowledge support from the ARO (W911NF-23-1-0077), ARO MURI (W911NF-21-1-0325), AFOSR MURI (FA9550-19-1-0399, FA9550-21-1-0209), NSF (OMA-1936118, ERC-1941583, OMA-2137642), NTT Research, Packard Foundation (2020-71479). J.P. acknowledges support from the U.S. Department of Energy Office of Science, Office of Advanced Scientific Computing Research (DE-NA0003525, DE-SC0020290), the U.S. Department of Energy, Office of Science, National Quantum Information Science Research Centers, Quantum Systems Accelerator, and the National Science Foundation (PHY-1733907). The Institute for Quantum Information and Matter is an NSF Physics Frontiers Center. S.Z. acknowledges funding provided by the Institute for Quantum Information and Matter and Perimeter Institute for Theoretical Physics, a research institute supported in part by the Government of Canada through the Department of Innovation, Science and Economic Development Canada and by the Province of Ontario through the Ministry of Colleges and Universities. J.A.H.N, Z.L., J.S.N and U.L.A acknowledge support from DNRF (bigQ, DNRF142), IFD (photoQ) and EU (CLUSTEC, ClusterQ ERC-101055224, GTGBS MC-101106833).

∗ These authors contributed equally to this work: C.O. ([email protected]); S.C. (csen[email protected]).
† [email protected]
‡ [email protected]

[1] M. A. Nielsen and I. Chuang, Quantum computation and quantum information (2002).
[2] N. Gisin and R. Thew, Quantum communication, Nature Photonics 1, 165 (2007).
[3] H. J. Kimble, The quantum internet, Nature 453, 1023 (2008).
[4] V. Giovannetti, S. Lloyd, and L. Maccone, Quantum metrology, Physical Review Letters 96, 010401 (2006).
[5] V. Giovannetti, S. Lloyd, and L. Maccone, Advances in quantum metrology, Nature Photonics 5, 222 (2011).
[6] E. Polino, M. Valeri, N. Spagnolo, and F. Sciarrino, Photonic quantum metrology, AVS Quantum Science 2, 024703 (2020).
[7] J. Preskill, Quantum computing in the NISQ era and beyond, Quantum 2, 79 (2018).
[8] S. Aaronson and A. Arkhipov, The computational complexity of linear optics, in Proceedings of the forty-third annual ACM symposium on Theory of computing (2011) pp. 333–342.
[9] S. Boixo, S. V. Isakov, V. N. Smelyanskiy, R. Babbush, N. Ding, Z. Jiang, M. J. Bremner, J. M. Martinis, and H. Neven, Characterizing quantum supremacy in near-term devices, Nature Physics 14, 595 (2018).
[10] F. Arute, K. Arya, R. Babbush, D. Bacon, J. C. Bardin, R. Barends, R. Biswas, S. Boixo, F. G. Brandao, D. A. Buell, et al., Quantum supremacy using a programmable superconducting processor, Nature 574, 505 (2019).
[11] Y. Wu, W.-S. Bao, S. Cao, F. Chen, M.-C. Chen, X. Chen, T.-H. Chung, H. Deng, Y. Du, D. Fan, et al., Strong quantum computational advantage using a superconducting quantum processor, Physical Review Letters 127, 180501 (2021).
[12] H.-S. Zhong, H. Wang, Y.-H. Deng, M.-C. Chen, L.-C. Peng, Y.-H. Luo, J. Qin, D. Wu, X. Ding, Y. Hu, et al., Quantum computational advantage using photons, Science 370, 1460 (2020).
[13] H.-S. Zhong, Y.-H. Deng, J. Qin, H. Wang, M.-C. Chen, L.-C. Peng, Y.-H. Luo, D. Wu, S.-Q. Gong, H. Su, et al., Phase-programmable Gaussian boson sampling using stimulated squeezed light, Physical Review Letters 127, 180502 (2021).
[14] L. S. Madsen, F. Laudenbach, M. F. Askarani, F. Rortais, T. Vincent, J. F. Bulmer, F. M. Miatto, L. Neuhaus, L. G. Helt, M. J. Collins, et al., Quantum computational advantage with a programmable photonic processor, Nature 606, 75 (2022).
[15] A. Morvan, B. Villalonga, X. Mi, S. Mandra, A. Bengtsson, P. Klimov, Z. Chen, S. Hong, C. Erickson, I. Drozdov, et al., Phase transition in random circuit sampling, arXiv preprint arXiv:2304.11119 (2023).
[16] Y.-H. Deng, Y.-C. Gu, H.-L. Liu, S.-Q. Gong, H. Su, Z.-J. Zhang, H.-Y. Tang, M.-H. Jia, J.-M. Xu, M.-C. Chen, J. Qin, L.-C. Peng, J. Yan, Y. Hu, J. Huang, H. Li, Y. Li, Y. Chen, X. Jiang, L. Gan, G. Yang, L. You, L. Li, H.-S. Zhong, H. Wang, N.-L. Liu, J. J. Renema, C.-Y. Lu, and J.-W. Pan, Gaussian boson sampling with pseudo-photon-number-resolving detectors and quantum computational advantage, Physical Review Letters 131, 150601 (2023).
[17] H.-Y. Huang, R. Kueng, and J. Preskill, Information-theoretic bounds on quantum advantage in machine learning, Physical Review Letters 126, 190505 (2021).
[18] H.-Y. Huang, M. Broughton, J. Cotler, S. Chen, J. Li, M. Mohseni, H. Neven, R. Babbush, R. Kueng, J. Preskill, and J. R. McClean, Quantum advantage in learning from experiments, Science 376, 1182 (2022).
[19] S. Chen, J. Cotler, H.-Y. Huang, and J. Li, Exponential separations between learning with and without quantum memory, in 2021 IEEE 62nd Annual Symposium on Foundations of Computer Science (FOCS) (IEEE, 2022) pp. 574–585.
[20] M. C. Caro, Learning quantum processes and Hamiltonians via the Pauli transfer matrix, arXiv preprint arXiv:2212.04471 (2022).
[21] S. Bubeck, S. Chen, and J. Li, Entanglement is necessary for optimal quantum property testing, in 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS) (IEEE, 2020) pp. 692–703.
[22] D. Aharonov, J. Cotler, and X.-L. Qi, Quantum algorithmic measurement, Nature Communications 13, 1 (2022).
[23] S. Chen, S. Zhou, A. Seif, and L. Jiang, Quantum advantages for Pauli channel estimation, Physical Review A 105, 032435 (2022).
[24] Z. M. Rossi, J. Yu, I. L. Chuang, and S. Sugiura, Quantum advantage for noisy channel discrimination, Physical Review A 105, 032401 (2022).
[25] S. L. Braunstein and P. Van Loock, Quantum information with continuous variables, Reviews of Modern Physics 77, 513 (2005).
[26] C. Weedbrook, S. Pirandola, R. García-Patrón, N. J. Cerf, T. C. Ralph, J. H. Shapiro, and S. Lloyd, Gaussian quantum information, Reviews of Modern Physics 84, 621 (2012).
[27] A. Serafini, Quantum continuous variables: a primer of theoretical methods (CRC Press, 2017).
[28] Y.-D. Wu, G. Chiribella, and N. Liu, Quantum-enhanced learning of continuous-variable quantum states, arXiv preprint arXiv:2303.05097 (2023).
[29] Supplemental material.
[30] S. Chen and W. Gong, Futility and utility of a few ancillas for Pauli channel learning, arXiv preprint arXiv:2309.14326 (2023).
[31] S. Chen, C. Oh, S. Zhou, H.-Y. Huang, and L. Jiang, Tight bounds on Pauli channel learning without entanglement, arXiv preprint arXiv:2309.13461 (2023).
[32] W. P. Schleich, Quantum optics in phase space (John Wiley & Sons, 2011).
[33] D. Gottesman, A. Kitaev, and J. Preskill, Encoding a qubit in an oscillator, Physical Review A 64, 012310 (2001).
[34] D. Schuster, A. A. Houck, J. Schreier, A. Wallraff, J. Gambetta, A. Blais, L. Frunzio, J. Majer, B. Johnson, M. Devoret, et al., Resolving photon number states in a superconducting circuit, Nature 445, 515 (2007).
[35] C. M. Caves and K. Wódkiewicz, Fidelity of Gaussian channels, Open Systems & Information Dynamics 11, 309 (2004).
[36] X. Guo, C. R. Breum, J. Borregaard, S. Izumi, M. V. Larsen, T. Gehring, M. Christandl, J. S. Neergaard-Nielsen, and U. L. Andersen, Distributed quantum sensing in a continuous-variable entangled network, Nature Physics 16, 281 (2020).
[37] A. Inoue, T. Kashiwazaki, T. Yamashima, N. Takanashi, T. Kazama, K. Enbutsu, K. Watanabe, T. Umeki, M. Endo, and A. Furusawa, Toward a multi-core ultra-fast optical quantum processor: 43-GHz bandwidth real-time amplitude measurement of 5-dB squeezed light using modularized optical parametric amplifier with 5G technology, Applied Physics Letters 122, 104001 (2023).
[38] J. J. Wallman and J. Emerson, Noise tailoring for scalable quantum computation via randomized compiling, Physical Review A 94, 052325 (2016).
[39] Y.-D. Wu and B. C. Sanders, Efficient verification of bosonic quantum channels via benchmarking, New Journal of Physics 21, 073026 (2019).
[40] G. Bai and G. Chiribella, Test one to test many: a unified approach to quantum benchmarks, Physical Review Letters 120, 150502 (2018).
[41] H. Shi and Q. Zhuang, Ultimate precision limit of noise sensing and dark matter search, npj Quantum Information 9, 27 (2023).
[42] Q. Zhuang, Z. Zhang, and J. H. Shapiro, Distributed quantum sensing using continuous-variable multipartite entanglement, Physical Review A 97, 032329 (2018).
[43] Y. Xia, W. Li, W. Clark, D. Hart, Q. Zhuang, and Z. Zhang, Demonstration of a reconfigurable entangled radio-frequency photonic sensor network, Physical Review Letters 124, 150502 (2020).
[44] K. Duivenvoorden, B. M. Terhal, and D. Weigand, Single-mode displacement sensor, Physical Review A 95, 012305 (2017).
[45] C. Oh, C. Lee, S. H. Lie, and H. Jeong, Optimal distributed quantum sensing using Gaussian states, Physical Review Research 2, 023030 (2020).
[46] C. Oh, L. Jiang, and C. Lee, Distributed quantum phase sensing for arbitrary positive and negative weights, Physical Review Research 4, 023164 (2022).
[47] H. Kwon, Y. Lim, L. Jiang, H. Jeong, and C. Oh, Quantum metrological power of continuous-variable quantum networks, Physical Review Letters 128, 180503 (2022).
Entanglement-enabled advantage for learning a bosonic random displacement
channel: Supplemental Material
4
Department of Physics and Astronomy and Institute for Quantum Computing,
University of Waterloo, Ontario N2L 2Y5, Canada
5
Center for Theoretical Physics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
6
Google Quantum AI, Venice, CA, USA
7
Center for Macroscopic Quantum States (bigQ),
Department of Physics, Technical University of Denmark,
Building 307, Fysikvej, 2800 Kgs. Lyngby, Denmark
(Dated: March 1, 2024)
CONTENTS
S1. Preliminary
A. Fourier relation
References
S1. PRELIMINARY
In this section, we provide some identities that are frequently used in the Supplemental Material (more details can be found in Refs. [1–3]). First, an elementary operator in an n-mode bosonic system is the n-mode displacement operator $\hat{D}(\beta) := e^{\beta\hat{a}^\dagger - \beta^\dagger\hat{a}}$, i.e., $\exp[\sum_i (\beta_i \hat{a}_i^\dagger - \beta_i^* \hat{a}_i)]$, where β := (β1, . . . , βn)^T ∈ C^n, and â := (â1, . . . , ân)^T and ↠:= (â†1, . . . , â†n)^T are the annihilation and creation operators of bosons, which obey the commutation relation [âi, â†j] = δij. The displacement operators D̂(β) form an orthogonal basis of the operator space; thus, any operator Ô can be expanded in displacement operators as
\[
\hat{O} = \frac{1}{\pi^n} \int d^{2n}\beta \, \mathrm{Tr}\big[\hat{O}\hat{D}(\beta)\big] \hat{D}^\dagger(\beta), \tag{S1}
\]
where Tr[ÔD̂(β)] is called the characteristic function of an operator Ô. The n-mode displacement
operator has the following properties
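\[
\hat{D}^\dagger(\beta) = \hat{D}(-\beta), \quad \hat{D}^*(\beta) = \hat{D}(\beta^*), \quad \hat{D}^T(\beta) = \hat{D}(-\beta^*), \quad \mathrm{Tr}\big[\hat{D}(\beta)\big] = \pi^n \delta^{(2n)}(\beta), \tag{S2}
\]
\[
\hat{D}(\beta_1)\hat{D}(\beta_2) = \hat{D}(\beta_1 + \beta_2)\, e^{(\beta_2^\dagger\beta_1 - \beta_1^\dagger\beta_2)/2}, \quad \hat{D}(\alpha)\hat{D}^\dagger(\beta)\hat{D}^\dagger(\alpha) = \hat{D}^\dagger(\beta)\, e^{\alpha^\dagger\beta - \beta^\dagger\alpha}, \tag{S3}
\]
\[
\int \frac{d^{2n}\beta}{\pi^n} \hat{D}(\beta)\, O\, \hat{D}^\dagger(\beta) = \mathrm{Tr}[O]\,\mathbb{1}, \tag{S4}
\]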
πn
where the last identity is the twirling identity. We also frequently use the following identity:
Z
1
δ (2n) (α) = d2n βeβ (S5)
† α−α† β
.
π 2n
A. Fourier relation
\[
= \frac{1}{\pi^n} \int d^{2n}\beta \, \lambda(\beta)\, \mathrm{Tr}\big[\hat{\rho}\hat{D}(\beta)\big] \hat{D}^\dagger(\beta), \tag{S12}
\]
where we used the identities from Eqs. (S3),(S5). Here, the last line renders the identity
\[
\lambda(\beta) = \int d^{2n}\alpha \, p(\alpha)\, e^{\alpha^\dagger\beta - \beta^\dagger\alpha}. \tag{S13}
\]
Thus, λ(β) is the Fourier transform of p(α). Its inverse Fourier transformation gives
\[
p(\alpha) = \frac{1}{\pi^{2n}} \int d^{2n}\beta \, \lambda(\beta)\, e^{\beta^\dagger\alpha - \alpha^\dagger\beta}. \tag{S14}
\]
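As a quick numerical sanity check of Eq. (S13), the following sketch assumes a single-mode (n = 1) toy channel whose displacement density is an isotropic Gaussian, p(α) = exp(−|α|²/σ²)/(πσ²). This density, the value of σ², and all function names are illustrative assumptions and not part of the derivation above. For this density the Gaussian integral gives λ(β) = exp(−σ²|β|²), which a Monte Carlo average of the Fourier kernel should reproduce up to statistical error.

```python
import cmath
import math
import random


def sample_displacements(sigma2, num_samples, rng):
    """Draw alpha from the assumed toy density exp(-|alpha|^2/sigma2)/(pi*sigma2)."""
    s = math.sqrt(sigma2 / 2.0)  # standard deviation of Re(alpha) and Im(alpha)
    return [complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(num_samples)]


def characteristic_function_mc(alphas, beta):
    """Monte Carlo estimate of lambda(beta) = E_p[exp(alpha^* beta - beta^* alpha)]."""
    total = 0j
    for a in alphas:
        total += cmath.exp(a.conjugate() * beta - beta.conjugate() * a)
    return total / len(alphas)


rng = random.Random(0)
sigma2 = 0.5
beta = 0.6 + 0.3j
alphas = sample_displacements(sigma2, 100_000, rng)
lam_mc = characteristic_function_mc(alphas, beta)
lam_exact = math.exp(-sigma2 * abs(beta) ** 2)  # closed form for the Gaussian toy model
print(abs(lam_mc - lam_exact))
```

Each term of the average has unit modulus, so the statistical error scales as one over the square root of the number of samples.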
In this section, we consider the two-mode squeezed vacuum and Bell measurement (TMSV+BM)
strategy (see Fig. 1(a) in the main text) and prove Theorem 1 in the main text by deriving the
sample complexity of the strategy. We assume a lossless system and a perfect Bell measurement,
whereas the input squeezed state has a finite squeezing parameter r since the input squeezing
parameter r is typically upper-bounded by a constant in practice. We then analyze the effect of the
imperfect measurement and loss in Sec. S2 C.
We now derive the probability of outcomes of the strategy, i.e., the outcomes obtained by applying an n-mode channel Λ to one subsystem of a product of n TMSV states |Ψ̃⟩ and measuring in the Bell basis |Ψ(ζ)⟩⟨Ψ(ζ)|/π^n with ζ ∈ C^n:
\[
p_{\rm EA}(\zeta) = \frac{1}{\pi^n} \mathrm{Tr}\Big[ \big(|\Psi(\zeta)\rangle\langle\Psi(\zeta)|_{AB}\big) (I_A \otimes \Lambda_B) \big(|\tilde{\Psi}\rangle\langle\tilde{\Psi}|_{AB}\big) \Big]. \tag{S15}
\]
πn
To simplify the expression, we rewrite a TMSV state. To do that, let us first consider a single-mode
squeezed state |r⟩ := Ŝ(r)|0⟩:
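\[
|r\rangle\langle r| = \frac{1}{\pi} \int d^2\alpha \, \hat{D}^\dagger(\alpha)\, \mathrm{Tr}\big[\hat{D}(\alpha)|r\rangle\langle r|\big] := \frac{1}{\pi} \int d^2\alpha \, \hat{D}^\dagger(\alpha) f(\alpha, r), \tag{S16}
\]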
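where $\hat{S}(r) := \exp\big[r(\hat{a}^{\dagger 2} - \hat{a}^2)/2\big]$ is the squeezing operation and we have defined
\begin{align}
f(\alpha, r) &:= \mathrm{Tr}\big[\hat{D}(\alpha)|r\rangle\langle r|\big] = \langle 0|\hat{S}^\dagger(r)\hat{D}(\alpha)\hat{S}(r)|0\rangle = \langle 0|\hat{D}(\alpha\cosh r - \alpha^*\sinh r)|0\rangle \tag{S17} \\
&= \exp\Big[-\frac{1}{2}|\alpha\cosh r - \alpha^*\sinh r|^2\Big]. \tag{S18}
\end{align}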
Here, we have used the relation $\hat{S}^\dagger(r)\hat{a}\hat{S}(r) = \hat{a}\cosh r + \hat{a}^\dagger\sinh r$. Using the fact that a TMSV state can be generated by injecting two single-mode squeezed states into a 50:50 beam splitter, i.e., $\hat{U}_{\rm BS}(|r\rangle\langle r| \otimes |{-r}\rangle\langle{-r}|)\hat{U}_{\rm BS}^\dagger = |\tilde{\Psi}(r)\rangle\langle\tilde{\Psi}(r)|$, where $\hat{U}_{\rm BS}$ is the 50:50 beam splitter, we can rewrite the TMSV state |Ψ̃(r)⟩ as
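\begin{align}
|\tilde{\Psi}(r)\rangle\langle\tilde{\Psi}(r)| &= \frac{1}{\pi^2} \int d^2\alpha_1 d^2\alpha_2 \, \hat{U}_{\rm BS}\big[\hat{D}^\dagger(\alpha_1) \otimes \hat{D}^\dagger(\alpha_2)\big]\hat{U}_{\rm BS}^\dagger \, f(\alpha_1, r) f(\alpha_2, -r) \tag{S19} \\
&= \frac{1}{\pi^2} \int d^2\alpha_1 d^2\alpha_2 \, \hat{D}^\dagger\Big(\frac{\alpha_1 - \alpha_2}{\sqrt{2}}\Big) \otimes \hat{D}^\dagger\Big(\frac{\alpha_1 + \alpha_2}{\sqrt{2}}\Big) f(\alpha_1, r) f(\alpha_2, -r) \tag{S20} \\
&= \frac{1}{\pi^2} \int d^2\omega_1 d^2\omega_2 \, \hat{D}^\dagger(\omega_1) \otimes \hat{D}^\dagger(\omega_2) \, f\Big(\frac{\omega_1 + \omega_2}{\sqrt{2}}, r\Big) f\Big(\frac{\omega_2 - \omega_1}{\sqrt{2}}, -r\Big) \tag{S21} \\
&:= \frac{1}{\pi^2} \int d^2\omega_1 d^2\omega_2 \, \hat{D}^\dagger(\omega_1) \otimes \hat{D}^\dagger(\omega_2) \, g(\omega_1, \omega_2, r), \tag{S22}
\end{align}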
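where we defined
\begin{align}
g(\omega_1, \omega_2, r) &:= f\Big(\frac{\omega_1 + \omega_2}{\sqrt{2}}, r\Big) f\Big(\frac{\omega_2 - \omega_1}{\sqrt{2}}, -r\Big) \tag{S23} \\
&= \exp\Big[-\frac{1}{2}\Big( (|\omega_1|^2 + |\omega_2|^2)\cosh 2r - (\omega_1\omega_2 + \omega_1^*\omega_2^*)\sinh 2r \Big)\Big], \tag{S24}
\end{align}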
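which is the characteristic function of the TMSV state |Ψ̃(r)⟩ by Eq. (S1), i.e.,
\[
g(\omega_1, \omega_2, r) := \mathrm{Tr}\Big[ |\tilde{\Psi}(r)\rangle\langle\tilde{\Psi}(r)| \big(\hat{D}(\omega_1) \otimes \hat{D}(\omega_2)\big) \Big]. \tag{S25}
\]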
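The step from Eq. (S23) to Eq. (S24) can be checked numerically. The sketch below is a single-mode illustration with hypothetical function names; it evaluates g both as the product of two squeezed-vacuum characteristic functions and from the closed form, and also checks the special case g(−ω, −ω*, r) = exp(−e^{−2r}|ω|²) that appears later in Eq. (S41).

```python
import math


def f(alpha, r):
    """Squeezed-vacuum characteristic function, Eq. (S18)."""
    z = alpha * math.cosh(r) - alpha.conjugate() * math.sinh(r)
    return math.exp(-0.5 * abs(z) ** 2)


def g_from_f(w1, w2, r):
    """TMSV characteristic function as a product of two f's, Eq. (S23)."""
    s = math.sqrt(2.0)
    return f((w1 + w2) / s, r) * f((w2 - w1) / s, -r)


def g_closed_form(w1, w2, r):
    """Closed form of Eq. (S24)."""
    cross = 2.0 * (w1 * w2).real  # omega1*omega2 + its complex conjugate
    quad = (abs(w1) ** 2 + abs(w2) ** 2) * math.cosh(2 * r) - cross * math.sinh(2 * r)
    return math.exp(-0.5 * quad)


w1, w2, r = 0.3 + 0.4j, -0.2 + 0.1j, 0.7
diff_general = abs(g_from_f(w1, w2, r) - g_closed_form(w1, w2, r))

omega = 0.5 - 0.3j
special = g_from_f(-omega, -omega.conjugate(), r)
expected = math.exp(-math.exp(-2 * r) * abs(omega) ** 2)
print(diff_general, abs(special - expected))
```

The special case makes explicit why larger squeezing r flattens the Gaussian weight that multiplies λ in the Bell-measurement statistics.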
πn π n π 2n
(S37)
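\begin{align}
&= \frac{1}{\pi^{2n}} \int \frac{d^{2n}\omega \, d^{2n}\beta_1 \, d^{2n}\beta_2}{\pi^n \pi^n} \, \lambda(\beta_2)\, g(\beta_1, \beta_2, r)\, e^{\zeta^*\cdot\omega^* - \zeta\cdot\omega} \, \mathrm{Tr}\big[\hat{D}_A^\dagger(\omega)\hat{D}_A^\dagger(\beta_1)\big] \, \mathrm{Tr}\big[\hat{D}_B^T(\omega)\hat{D}_B^\dagger(\beta_2)\big] \tag{S38} \\
&= \frac{1}{\pi^{2n}} \int d^{2n}\omega \, d^{2n}\beta_1 \, d^{2n}\beta_2 \, \lambda(\beta_2)\, g(\beta_1, \beta_2, r)\, e^{\zeta^*\cdot\omega^* - \zeta\cdot\omega} \, \delta(\omega + \beta_1)\, \delta(\omega^* + \beta_2) \tag{S39} \\
&= \frac{1}{\pi^{2n}} \int d^{2n}\omega \, \lambda(-\omega^*)\, g(-\omega, -\omega^*, r)\, e^{\zeta^*\cdot\omega^* - \zeta\cdot\omega}. \tag{S40}
\end{align}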
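Hence, we finally obtain the probability of obtaining ζ from Bell measurement with an initial Bell state with finite squeezing:
\[
p_{\rm EA}(\zeta) = \frac{1}{\pi^{2n}} \int d^{2n}\omega \, \lambda(-\omega^*)\, g(-\omega, -\omega^*, r)\, e^{\zeta^*\cdot\omega^* - \zeta\cdot\omega} = \frac{1}{\pi^{2n}} \int d^{2n}\omega \, \lambda(-\omega^*)\, e^{-e^{-2r}|\omega|^2} e^{\zeta^*\cdot\omega^* - \zeta\cdot\omega}. \tag{S41}
\]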
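By inverting the relation using Fourier transformation,
\begin{align}
\int d^{2n}\zeta \, p_{\rm EA}(\zeta)\, e^{\zeta^\dagger\beta - \beta^\dagger\zeta} &= \frac{1}{\pi^{2n}} \int d^{2n}\zeta \, d^{2n}\omega \, \lambda(-\omega^*)\, g(-\omega, -\omega^*, r)\, e^{(\omega^* + \beta)\cdot\zeta^* - (\omega^* + \beta)^*\cdot\zeta} \tag{S42} \\
&= \lambda(\beta)\, g(\beta^*, \beta, r), \tag{S43}
\end{align}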
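we obtain the relation between the characteristic function of the channel λ(β) and the probability distribution of outcomes p_EA(ζ):
\[
\lambda(\beta) = \frac{1}{g(\beta^*, \beta, r)} \int d^{2n}\zeta \, p_{\rm EA}(\zeta)\, e^{\zeta^\dagger\beta - \beta^\dagger\zeta} = \exp\big(e^{-2r}|\beta|^2\big) \int d^{2n}\zeta \, p_{\rm EA}(\zeta)\, e^{\zeta^\dagger\beta - \beta^\dagger\zeta}. \tag{S44}
\]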
This expression shows that, by drawing samples ζ from the outcome distribution p_EA(ζ) in experiment and averaging the Fourier kernel $e^{\zeta^\dagger\beta - \beta^\dagger\zeta}$ over them, one can obtain an estimate of λ(β).
Now, we show that the number of samples $N \ge \frac{8}{\epsilon^2}\log\frac{4}{\delta}\, e^{2e^{-2r}|\beta|^2} = O(e^{2e^{-2r}|\beta|^2}\epsilon^{-2}\log\delta^{-1})$ suffices to ensure that the estimation error is upper-bounded by ϵ with high probability 1 − δ. This completes the proof of Theorem 1 in the main text.
Note that in an ideal case where the input squeezing parameter r can be chosen to be arbitrarily
large, the sample complexity can be reduced to N = O(1/ϵ2 ) for any β.
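To make the estimation procedure concrete, here is a minimal single-mode sketch. It assumes a toy Gaussian channel with λ(β) = exp(−σ²|β|²), for which Eq. (S43) implies that p_EA(ζ) is an isotropic complex Gaussian with E|ζ|² = σ² + e^{−2r}. The toy channel, the parameter values, and the function names are assumptions made for illustration only. The estimator is the empirical version of Eq. (S44), and the sample count follows the bound quoted above.

```python
import cmath
import math
import random


def ea_sample_count(beta_abs2, r, eps, delta):
    """Sample count from N >= (8/eps^2) log(4/delta) exp(2 e^{-2r} |beta|^2)."""
    bound = 8.0 / eps**2 * math.log(4.0 / delta) * math.exp(2.0 * math.exp(-2.0 * r) * beta_abs2)
    return math.ceil(bound)


def sample_bell_outcomes(sigma2, r, num_samples, rng):
    """Toy model (assumption): p_EA is Gaussian with E|zeta|^2 = sigma2 + exp(-2r)."""
    s = math.sqrt((sigma2 + math.exp(-2.0 * r)) / 2.0)
    return [complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(num_samples)]


def estimate_lambda(zetas, beta, r):
    """Empirical Eq. (S44): rescaled sample mean of exp(zeta^* beta - beta^* zeta)."""
    prefactor = math.exp(math.exp(-2.0 * r) * abs(beta) ** 2)
    total = sum(cmath.exp(z.conjugate() * beta - beta.conjugate() * z) for z in zetas)
    return prefactor * total / len(zetas)


rng = random.Random(1)
sigma2, r, beta = 0.5, 1.0, 0.8 + 0.2j
eps, delta = 0.1, 0.1
n_samples = ea_sample_count(abs(beta) ** 2, r, eps, delta)
zetas = sample_bell_outcomes(sigma2, r, n_samples, rng)
lam_est = estimate_lambda(zetas, beta, r)
lam_true = math.exp(-sigma2 * abs(beta) ** 2)
print(n_samples, abs(lam_est - lam_true))
```

The prefactor e^{e^{−2r}|β|²} bounds the modulus of each term of the average, which is why it enters the sample count through a Hoeffding-type argument.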
Now, let us consider the entanglement-free scheme with vacuum input and heterodyne detection
(Vacuum+Heterodyne). In general, denoting Πϕ as a POVM with an outcome ϕ and |ϕ0 ⟩⟨ϕ0 | as an
input state, the probability of a classical scheme is written as
For the Vacuum+Heterodyne scheme, we employ vacuum state input |ϕ0 ⟩ = |0⟩ and heterodyne
detection, whose POVM elements are described by projectors onto the (overcomplete) basis of
coherent states, |ζ⟩⟨ζ|/π n , where |ζ⟩ is a coherent state with complex amplitude ζ ∈ Cn . Note
that such a scheme is informationally complete in the sense that it provides distinct probability
distributions for different channels. For this scheme, we can obtain the probability distribution
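\[
p_{VH}(\zeta) = \frac{1}{\pi^{2n}} \int d^{2n}\alpha \, \lambda(\alpha)\, \mathrm{Tr}\big[|\zeta\rangle\langle\zeta|\hat{D}^\dagger(\alpha)\big]\, \mathrm{Tr}\big[|0\rangle\langle 0|\hat{D}(\alpha)\big] = \frac{1}{\pi^{2n}} \int d^{2n}\alpha \, \lambda(\alpha)\, e^{\alpha^\dagger\zeta - \zeta^\dagger\alpha} e^{-|\alpha|^2}. \tag{S52}
\]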
we obtain the final relation between the measurement probability distribution and the characteristic function of the channel:
\[
\lambda(\beta) = e^{|\beta|^2} \int d^{2n}\zeta \, p_{VH}(\zeta)\, e^{\zeta^\dagger\beta - \beta^\dagger\zeta}. \tag{S54}
\]
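This clearly shows the difference from the entanglement-assisted strategy, namely the prefactor $e^{|\beta|^2}$. Thus, similarly, after sampling ζ from experiments following p_VH(ζ) and averaging $e^{|\beta|^2} e^{\zeta^\dagger\beta - \beta^\dagger\zeta}$ over the samples, we obtain the estimate of λ(β). As for the entanglement-assisted case, for N samples $\{\zeta^{(i)}\}_{i=1}^N$, by setting the estimator $\tilde{\lambda}(\beta) = \frac{1}{N}\sum_{i=1}^N e^{|\beta|^2} e^{\zeta^{(i)\dagger}\beta - \beta^\dagger\zeta^{(i)}}$ and using the Hoeffding bound, we obtain
\[
\Pr\big[|\tilde{\lambda}(\beta) - \lambda(\beta)| \le \epsilon\big] \ge 1 - 4 e^{-\frac{N\epsilon^2}{8} e^{-2|\beta|^2}}. \tag{S55}
\]
Thus, it suffices to take
\[
N \ge \frac{8}{\epsilon^2} \log\frac{4}{\delta}\, e^{2|\beta|^2} = O\big(e^{2|\beta|^2}\epsilon^{-2}\log\delta^{-1}\big) \tag{S56}
\]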
for the estimation error to be upper-bounded by ϵ with high probability 1 − δ. It clearly shows the
significant difference of the sample complexity from the entanglement-assisted case.
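The gap between the two sufficient sample counts can be tabulated directly. The sketch below works with logarithms to avoid overflow and compares the Vacuum+Heterodyne bound of Eq. (S56) with the TMSV+BM bound derived earlier. Note that this compares two upper bounds for specific schemes; it is not the entanglement-free lower bound of Theorem 2. The function names and parameter values are illustrative assumptions.

```python
import math


def log_samples_ea(beta_abs2, r, eps, delta):
    """Natural log of (8/eps^2) log(4/delta) exp(2 e^{-2r} |beta|^2) (TMSV+BM)."""
    return math.log(8.0 / eps**2 * math.log(4.0 / delta)) + 2.0 * math.exp(-2.0 * r) * beta_abs2


def log_samples_vh(beta_abs2, eps, delta):
    """Natural log of (8/eps^2) log(4/delta) exp(2 |beta|^2), Eq. (S56)."""
    return math.log(8.0 / eps**2 * math.log(4.0 / delta)) + 2.0 * beta_abs2


def log10_ratio(beta_abs2, r, eps=0.2, delta=1.0 / 3.0):
    """log10 of (Vacuum+Heterodyne samples) / (TMSV+BM samples)."""
    diff = log_samples_vh(beta_abs2, eps, delta) - log_samples_ea(beta_abs2, r, eps, delta)
    return diff / math.log(10.0)


ratio_no_squeezing = log10_ratio(30.0, 0.0)  # r = 0: both bounds coincide
ratio_r1 = log10_ratio(30.0, 1.0)            # r = 1 and |beta|^2 = 30
print(ratio_no_squeezing, ratio_r1)
```

The ratio depends only on 2(1 − e^{−2r})|β|², so it grows exponentially in |β|² for any fixed r > 0 and vanishes at r = 0.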
We now consider the effect of imperfections and prove Theorem 3 in the main text. More
specifically, we consider the cases where photon loss occurs before and after applying the random
displacement channel we want to learn and a regularized Bell measurement. Here, the photon loss
before and after the random displacement channel models an imperfect input state preparation
and imperfect Bell measurement. On the other hand, we introduce parameter s to regularize the
Bell measurement POVM by a general-dyne measurement POVM, where we recover the perfect
Bell measurement by taking s → ∞. By considering the regularized Bell measurement POVM,
we assume the same condition as the lower bound in Sec. S3 in that the measurement POVM is
normalizable, i.e., its norm is finite.
Let us first consider the effect of the loss channel on a single-mode displacement operator.
Using the equivalent description of a loss channel L by a beam splitter interaction with a vacuum
environment, we can show that
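\begin{align}
\mathcal{L}[\hat{D}^\dagger(\alpha)_S] &= \mathrm{Tr}_E\big[\hat{U}_T (\hat{D}^\dagger(\alpha)_S \otimes |0\rangle\langle 0|_E) \hat{U}_T^\dagger\big] \tag{S57} \\
&= \int \frac{d^2 z}{\pi} e^{-\frac{1}{2}|z|^2} \, \mathrm{Tr}_E\big[\hat{U}_T \hat{D}^\dagger(\alpha)_S \otimes \hat{D}^\dagger(z)_E \hat{U}_T^\dagger\big] \tag{S58} \\
&= \int \frac{d^2 z}{\pi} e^{-\frac{1}{2}|z|^2} \, \mathrm{Tr}_E\big[\hat{D}^\dagger(\sqrt{T}\alpha - \sqrt{1-T}z)_S \otimes \hat{D}^\dagger(\sqrt{T}z + \sqrt{1-T}\alpha)_E\big] \tag{S59} \\
&= T^{-1} e^{-\frac{1-T}{2T}|\alpha|^2} \hat{D}^\dagger(\alpha/\sqrt{T})_S, \tag{S60}
\end{align}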
where ÛT is the beam splitter interaction with the environment with the transmission rate T , and
thus 1 − T is the loss rate. Here, we used the identity [1]
\[
|0\rangle\langle 0| = \int \frac{d^{2n}z}{\pi^n} e^{-\frac{1}{2}|z|^2} \hat{D}^\dagger(z). \tag{S61}
\]
Recall that the input state is two-mode squeezed states with a finite squeezing parameter, which
is written as
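\[
|\tilde{\Psi}(r)\rangle\langle\tilde{\Psi}(r)| = \frac{1}{\pi^{2n}} \int d^{2n}\omega_1 d^{2n}\omega_2 \, g(\omega_1, \omega_2, r)\, \hat{D}^\dagger(\omega_1) \otimes \hat{D}^\dagger(\omega_2). \tag{S62}
\]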
Let us first study how the displacement operator transforms over the channels.
First, after a loss channel with loss rate 1 − Tb , the random displacement channel Λ and another
loss channel with loss rate 1 − Ta , an n-mode displacement operator transforms as
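\begin{align}
&\mathcal{L}^{T_a}_{AB} (I_A \otimes \Lambda_B) \mathcal{L}^{T_b}_{AB} \big(\hat{D}^\dagger(\omega_1)_A \otimes \hat{D}^\dagger(\omega_2)_B\big) \tag{S63} \\
&= T_b^{-2n} e^{-\frac{1-T_b}{2T_b}(|\omega_1|^2 + |\omega_2|^2)} \mathcal{L}^{T_a}_{AB} (I_A \otimes \Lambda_B) \big(\hat{D}^\dagger(\omega_1/\sqrt{T_b})_A \otimes \hat{D}^\dagger(\omega_2/\sqrt{T_b})_B\big) \tag{S64} \\
&= T_b^{-2n} e^{-\frac{1-T_b}{2T_b}(|\omega_1|^2 + |\omega_2|^2)} \lambda(\omega_2/\sqrt{T_b}) \, \mathcal{L}^{T_a}_{AB} \big(\hat{D}^\dagger(\omega_1/\sqrt{T_b})_A \otimes \hat{D}^\dagger(\omega_2/\sqrt{T_b})_B\big) \tag{S65} \\
&= (T_b T_a)^{-2n} e^{-\frac{1-T_b}{2T_b}(|\omega_1|^2 + |\omega_2|^2)} e^{-\frac{1-T_a}{2T_a T_b}(|\omega_1|^2 + |\omega_2|^2)} \lambda(\omega_2/\sqrt{T_b}) \big(\hat{D}^\dagger(\omega_1/\sqrt{T_b T_a})_A \otimes \hat{D}^\dagger(\omega_2/\sqrt{T_b T_a})_B\big). \tag{S66}
\end{align}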
Now we implement the regularized Bell measurement. Recall that the perfect Bell measurement can
be conducted by applying a 50:50 beam splitter and then performing homodyne detection. Here, we
will regularize the homodyne detection by general-dyne detection and tracing out some quadratures.
After 50:50 beam splitters ÛBS , the displacement operators transform as
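\[
\hat{D}^\dagger(\omega_1/\sqrt{T_b T_a})_A \otimes \hat{D}^\dagger(\omega_2/\sqrt{T_b T_a})_B \to \hat{D}^\dagger\Big(\frac{\omega_1 - \omega_2}{\sqrt{2 T_b T_a}}\Big)_A \otimes \hat{D}^\dagger\Big(\frac{\omega_1 + \omega_2}{\sqrt{2 T_b T_a}}\Big)_B. \tag{S67}
\]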
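\[
\big\{\hat{\Pi}^A_\alpha(-s) \otimes \hat{\Pi}^B_\beta(s)\big\}_{\alpha,\beta} \quad \text{where} \quad \hat{\Pi}_\gamma(s) := \frac{1}{\pi^n} \hat{D}(\gamma)\, |s\rangle\langle s|\, \hat{D}^\dagger(\gamma), \tag{S68}
\]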
πn
where s ≥ 0 is the squeezing parameter for the Bell measurement. We note that this measurement
corresponds to a special type of general-dyne measurement [3] and that when s → ∞, we recover
the Bell measurement studied in Sec. S2 A. Then, the output probability is written as
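\[
q(\alpha, \beta) := \frac{1}{\pi^{2n}} \int d^{2n}\omega_1 d^{2n}\omega_2 \, g(\omega_1, \omega_2, r)\, \mathrm{Tr}\Big[\hat{\Pi}^A_\alpha(-s) \otimes \hat{\Pi}^B_\beta(s)\, \hat{U}_{\rm BS}\, \mathcal{L}^{T_a}_{AB} (I_A \otimes \Lambda_B) \mathcal{L}^{T_b}_{AB} \big(\hat{D}^\dagger(\omega_1)_A \otimes \hat{D}^\dagger(\omega_2)_B\big)\, \hat{U}_{\rm BS}^\dagger\Big]. \tag{S69}
\]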
Here, we have
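\begin{align}
\mathrm{Tr}\big[\hat{\Pi}_\alpha(-s)\hat{D}^\dagger(\omega)\big] &= \frac{1}{\pi^n} \langle -s|\hat{D}^\dagger(\alpha)\hat{D}^\dagger(\omega)\hat{D}(\alpha)|-s\rangle = \frac{1}{\pi^n} e^{-\frac{1}{2}\left(|\omega|^2\cosh 2s + (\omega^2 + \omega^{*2})\sinh 2s/2\right)} e^{\omega^\dagger\alpha - \alpha^\dagger\omega}, \tag{S70} \\
\mathrm{Tr}\big[\hat{\Pi}_\beta(s)\hat{D}^\dagger(\omega)\big] &= \frac{1}{\pi^n} e^{-\frac{1}{2}\left(|\omega|^2\cosh 2s - (\omega^2 + \omega^{*2})\sinh 2s/2\right)} e^{\omega^\dagger\beta - \beta^\dagger\omega}, \tag{S71}
\end{align}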
and
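\begin{align}
&\mathrm{Tr}\Big[\hat{\Pi}_\alpha(-s)\hat{D}^\dagger\Big(\frac{\omega_1 - \omega_2}{\sqrt{2T_aT_b}}\Big)\Big] \mathrm{Tr}\Big[\hat{\Pi}_\beta(s)\hat{D}^\dagger\Big(\frac{\omega_1 + \omega_2}{\sqrt{2T_aT_b}}\Big)\Big] \tag{S72} \\
&= \frac{1}{\pi^{2n}} e^{-\frac{1}{2}\left[\frac{|\omega_1|^2 + |\omega_2|^2}{T_aT_b}\cosh 2s - \frac{\omega_1\cdot\omega_2 + \omega_1^*\cdot\omega_2^*}{T_aT_b}\sinh 2s\right]} \exp\Big[\omega_1^\dagger\frac{\alpha + \beta}{\sqrt{2T_aT_b}} + \omega_2^\dagger\frac{\beta - \alpha}{\sqrt{2T_aT_b}} - \text{c.c.}\Big] \tag{S73} \\
&= \frac{1}{\pi^{2n}} g\big(\omega_1/\sqrt{T_aT_b}, \omega_2/\sqrt{T_aT_b}, s\big) \exp\Big[\omega_1^\dagger\frac{\alpha + \beta}{\sqrt{2T_aT_b}} + \omega_2^\dagger\frac{\beta - \alpha}{\sqrt{2T_aT_b}} - \text{c.c.}\Big]. \tag{S74}
\end{align}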
π 2Ta Tb 2Ta Tb
We now trace out one of two quadratures for each mode. To do that, we take integral over the
imaginary part of α and the real part of β. Here, we define α := αr + iαi and β := βr + iβi . If we
take the integral, the relevant part reduces to
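\begin{align}
\frac{1}{\pi^n} \int d^n\alpha_i \exp\Big[\alpha\frac{\omega_1^* - \omega_2^*}{\sqrt{2T_aT_b}} - \alpha^*\frac{\omega_1 - \omega_2}{\sqrt{2T_aT_b}}\Big] &= \delta\big(\mathrm{Re}(\omega_1 - \omega_2)/\sqrt{2T_aT_b}\big)\, e^{-i\sqrt{2}\,\mathrm{Im}(\omega_1 - \omega_2)\alpha_r/\sqrt{T_aT_b}}, \tag{S75} \\
\frac{1}{\pi^n} \int d^n\beta_r \exp\Big[\beta\frac{\omega_1^* + \omega_2^*}{\sqrt{2T_aT_b}} - \beta^*\frac{\omega_1 + \omega_2}{\sqrt{2T_aT_b}}\Big] &= \delta\big(\mathrm{Im}(\omega_1 + \omega_2)/\sqrt{2T_aT_b}\big)\, e^{i\sqrt{2}\,\mathrm{Re}(\omega_1 + \omega_2)\beta_i/\sqrt{T_aT_b}}. \tag{S76}
\end{align}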
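Thus, by defining the measurement output variable as $\xi = -\alpha_r + i\beta_i$, the output probability is
given by
\begin{align}
& q(\xi) \tag{S79} \\
&= \int d^n\alpha_i\, d^n\beta_r\, q(\alpha,\beta) \tag{S80} \\
&= \frac{1}{\pi^{2n}} \int d^n\alpha_i\, d^n\beta_r\, d^{2n}\omega_1\, d^{2n}\omega_2\, (T_bT_a)^{-2n}\, e^{-\left(\frac{1-T_b}{2T_b}+\frac{1-T_a}{2T_aT_b}\right)(|\omega_1|^2+|\omega_2|^2)}\, g(\omega_1,\omega_2,r)\, \lambda(\omega_2/\sqrt{T_b}) \nonumber \\
&\quad\times \mathrm{Tr}\left[\hat\Pi_\alpha(-s)\hat D^\dagger\!\left(\frac{\omega_1-\omega_2}{\sqrt{2T_aT_b}}\right)\right] \mathrm{Tr}\left[\hat\Pi_\beta(s)\hat D^\dagger\!\left(\frac{\omega_1+\omega_2}{\sqrt{2T_aT_b}}\right)\right] \tag{S81} \\
&= \frac{1}{\pi^{2n}} \int d^{2n}\omega_1\, d^{2n}\omega_2\, (T_bT_a)^{-2n}\, e^{-\left(\frac{1-T_b}{2T_b}+\frac{1-T_a}{2T_aT_b}\right)(|\omega_1|^2+|\omega_2|^2)}\, \lambda(\omega_2/\sqrt{T_b})\, g(\omega_1,\omega_2,r)\, g\big(\omega_1/\sqrt{T_aT_b},\, \omega_2/\sqrt{T_aT_b},\, s\big) \nonumber \\
&\quad\times \delta\!\left(\frac{\omega_1-\omega_2^*}{\sqrt{2T_aT_b}}\right) e^{-i\sqrt2\,\mathrm{Im}(\omega_1-\omega_2)\cdot\alpha_r/\sqrt{T_aT_b}}\, e^{i\sqrt2\,\mathrm{Re}(\omega_1+\omega_2)\cdot\beta_i/\sqrt{T_aT_b}} \tag{S82} \\
&= \left(\frac{2}{T_aT_b}\right)^{n} \frac{1}{\pi^{2n}} \int d^{2n}\omega\, e^{-\left(\frac{1-T_b}{T_b}+\frac{1-T_a}{T_aT_b}\right)|\omega|^2}\, \lambda(\omega^*/\sqrt{T_b})\, g(\omega,\omega^*,r)\, g\big(\omega/\sqrt{T_aT_b},\, \omega^*/\sqrt{T_aT_b},\, s\big)\, e^{\sqrt2(\xi\cdot\omega-\omega^*\cdot\xi^*)/\sqrt{T_aT_b}} \tag{S83} \\
&= \left(\frac{2}{T_aT_b}\right)^{n} \frac{1}{\pi^{2n}} \int d^{2n}\omega\, e^{-\left(\frac{1-T_b}{T_b}+\frac{1-T_a}{T_aT_b}\right)|\omega|^2}\, \lambda(\omega^*/\sqrt{T_b})\, e^{-(e^{-2r}+e^{-2s}/T_aT_b)|\omega|^2}\, e^{\sqrt2(\xi\cdot\omega-\omega^*\cdot\xi^*)/\sqrt{T_aT_b}} \tag{S84} \\
&= \left(\frac{2}{T_a}\right)^{n} \frac{1}{\pi^{2n}} \int d^{2n}\omega\, e^{-\left((1-T_b)+\frac{1-T_a}{T_a}\right)|\omega|^2}\, \lambda(\omega^*)\, e^{-(T_be^{-2r}+e^{-2s}/T_a)|\omega|^2}\, e^{\sqrt2(\xi\cdot\omega-\omega^*\cdot\xi^*)/\sqrt{T_a}}. \tag{S85}
\end{align}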
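Here, for consistency with Sec. S2 A, we rescale $\sqrt2\,\xi = \zeta$ and define $p_{\rm loss}(\zeta)$ such that
\[
p_{\rm loss}(\zeta) = \left(\frac{1}{T_a}\right)^{n} \frac{1}{\pi^{2n}} \int d^{2n}\omega\, e^{-\left((1-T_b)+\frac{1-T_a}{T_a}\right)|\omega|^2}\, \lambda(\omega^*)\, e^{-(T_be^{-2r}+e^{-2s}/T_a)|\omega|^2}\, e^{(\omega\cdot\zeta-\omega^*\cdot\zeta^*)/\sqrt{T_a}}, \tag{S86}
\]
where the $2^n$ factor is canceled because of the relation $2^n d^{2n}\xi = d^{2n}\zeta$. (This rescaling is needed because the
convention for the Bell measurement outcome ζ in Sec. S2 A differs from ξ in this section by a factor of $\sqrt2$.)
Therefore, after Fourier transformation, we obtain
\[
\int d^{2n}\zeta\, p_{\rm loss}(\zeta)\, e^{(\zeta^\dagger\beta-\beta^\dagger\zeta)/\sqrt{T_a}} = \lambda(\beta)\, e^{-(T_be^{-2r}+e^{-2s}/T_a)|\beta|^2}\, e^{-(1-T_b)|\beta|^2}\, e^{-\frac{1-T_a}{T_a}|\beta|^2}, \tag{S87}
\]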
where we defined an effective squeezing parameter reff due to all kinds of imperfections via
\[
e^{-2r_{\rm eff}} := (T_be^{-2r} + T_a^{-1}e^{-2s}) + (1-T_b) + \frac{1-T_a}{T_a}. \tag{S90}
\]
To estimate any $\lambda(\beta)$, one simply obtains $N$ samples $\{\zeta^{(i)}\}_{i=1}^{N}$ from $p_{\rm loss}(\zeta)$ and sets the estimator
to be $\tilde\lambda(\beta) := \frac{1}{N}\sum_{i=1}^{N} e^{e^{-2r_{\rm eff}}|\beta|^2}\, e^{(\zeta^{(i)\dagger}\beta-\beta^\dagger\zeta^{(i)})/\sqrt{T_a}}$. By Hoeffding's inequality, as in the
ideal case, averaging over $N \ge 8\epsilon^{-2}\log(4/\delta)\, e^{2e^{-2r_{\rm eff}}|\beta|^2}$ copies is sufficient to estimate $\lambda(\beta)$ to
additive error $\epsilon$ with probability at least $1-\delta$.
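As a minimal numerical sketch of Eq. (S90) and the Hoeffding-type sample count quoted above, the following Python snippet evaluates the effective squeezing and the sufficient number of samples. The function and argument names are our own illustrative choices, not notation from the derivation.

```python
import math

def effective_squeezing(r, s, T_a, T_b):
    """Effective squeezing r_eff defined through Eq. (S90).

    r: TMSV squeezing, s: measurement squeezing,
    T_b / T_a: transmissivity before / after the channel.
    """
    envelope = (T_b * math.exp(-2 * r) + math.exp(-2 * s) / T_a
                + (1 - T_b) + (1 - T_a) / T_a)
    return -0.5 * math.log(envelope)

def samples_needed(eps, delta, beta_sq, r_eff):
    """Sufficient N from N >= 8 eps^-2 log(4/delta) exp(2 e^{-2 r_eff} |beta|^2)."""
    growth = math.exp(2 * math.exp(-2 * r_eff) * beta_sq)
    return math.ceil(8 / eps**2 * math.log(4 / delta) * growth)

if __name__ == "__main__":
    r_eff = effective_squeezing(r=1.0, s=3.0, T_a=0.95, T_b=0.95)
    print(r_eff, samples_needed(eps=0.2, delta=1 / 3, beta_sq=10.0, r_eff=r_eff))
```

Without loss and with a very large s, the effective squeezing reduces to r; any loss lowers it and therefore raises the sample count.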
The imperfections thus enter only through the envelope of the Fourier transform. In particular,
when s → ∞, the effective squeezing parameter under loss is given by
\[
r_{\rm eff} = -\frac12 \log\left(T_be^{-2r} + (1-T_b) + \frac{1-T_a}{T_a}\right). \tag{S91}
\]
and for photon loss after the channel without other imperfections, the envelope is given by
\[
e^{[e^{-2r}+(1-T_a)/T_a]|\beta|^2}. \tag{S93}
\]
In this section, we study more general sources of noise other than finite squeezing and photon
loss. To begin with, consider the case where we use an arbitrary input state while the CV Bell
measurement is still employed. To this end, note that Eq. (S44) from Sec. S2 A does not use any
special properties of TMSV, and actually holds for any 2n-mode input state, i.e.,
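\[
\lambda(\beta) = \frac{1}{g_{\hat\rho}(\beta^*,\beta)} \int d^{2n}\zeta\, p_{\rm EA}(\zeta)\, e^{\zeta^\dagger\beta-\beta^\dagger\zeta}, \tag{S94}
\]
where g is the characteristic function of the input state ρ̂:
\[
g_{\hat\rho}(\omega_1,\omega_2) = \mathrm{Tr}\big[\hat\rho\, \hat D(\omega_1)\otimes\hat D(\omega_2)\big]. \tag{S95}
\]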
The fact that the same relation holds by replacing the g function properly indicates that for different
types of input states, we still have a very similar form of an unbiased estimator for N samples:
\[
\tilde\lambda(\beta) = \frac{1}{N}\sum_{i=1}^{N} \frac{1}{g_{\hat\rho}(\beta^*,\beta)}\, e^{\zeta^{(i)\dagger}\beta-\beta^\dagger\zeta^{(i)}}. \tag{S96}
\]
It implies that the sampling complexity is determined by the function gρ̂ (β ∗ , β). More specifically,
by the Hoeffding inequality, the number of samples to achieve an error ϵ with high probability 1 − δ
is given by
Therefore, as long as the function g of the input state is sufficiently large for the β of interest, we
can still expect the scheme to be sample efficient.
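The estimator of Eq. (S96) can be sketched in a few lines of Python. This is an illustration only: samples are represented as lists of complex numbers (one entry per mode), and the value of g is passed in by the caller. The sample-count helper assumes the same Hoeffding constant as in the lossy analysis, with 1/g bounding the magnitude of each term; it is not a transcription of the elided equation.

```python
import cmath
import math

def estimate_lambda(samples, beta, g):
    """Empirical mean of exp(zeta^dagger beta - beta^dagger zeta) / g, as in Eq. (S96)."""
    total = 0j
    for zeta in samples:
        # zeta^dagger beta - beta^dagger zeta is purely imaginary, so each term has modulus 1/g.
        phase = sum(z.conjugate() * b - b.conjugate() * z for z, b in zip(zeta, beta))
        total += cmath.exp(phase) / g
    return total / len(samples)

def sufficient_samples(eps, delta, g):
    """Assumed Hoeffding-type count: N >= 8 eps^-2 log(4/delta) / g^2."""
    return math.ceil(8 / eps**2 * math.log(4 / delta) / g**2)

if __name__ == "__main__":
    demo = [[0.1 + 0.2j], [-0.3 + 0.1j]]  # two hypothetical single-mode outcomes
    print(estimate_lambda(demo, [1.0 + 0.0j], g=0.8), sufficient_samples(0.2, 1 / 3, 0.8))
```

The 1/g² dependence makes explicit why the scheme stays sample efficient only while g remains bounded away from zero on the queried β.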
Such a general form enables us to analyze the effect of general noise on input states. Let us
again focus on TMSV states but assume a noise channel N . Then, the characteristic function g of
the noisy TMSV states can be written as
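\[
g_{\mathcal N(\tilde\Psi)}(\omega_1,\omega_2) = \mathrm{Tr}\big[\mathcal N(|\tilde\Psi(r)\rangle\langle\tilde\Psi(r)|)\, \hat D(\omega_1)\otimes\hat D(\omega_2)\big]. \tag{S99}
\]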
As discussed, it suffices to analyze how the characteristic function changes by noise to ensure that
a significant advantage is still maintained for noisy states. In typical experiments, while the CV
Bell measurement noise can be modeled by photon loss, as we considered already in the previous
section, other types of noise may exist in the TMSV state preparation procedure. An example is
with a given |β|2 . We fix the squeezing parameter r = 1.5 of the input TMSV states and set the number of
modes n = 50. For the standard deviation ∆ = 1◦ of phase noise, following Gaussian distributions, one may
observe that the characteristic function is almost identical to the noiseless case (∆ = 0◦ ). For the standard
deviation ∆ = 2◦ as well, the effect is not very significant.
where ω1 e−iϕA and ω2 e−iϕB are interpreted as vectors obtained by an elementwise product, the
corresponding g function for TMSV states is written as
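\begin{align}
& g_{\mathcal N(\tilde\Psi)}(\omega_1,\omega_2) \tag{S104} \\
&= \mathrm{Tr}\big[\mathcal N_\Delta(|\tilde\Psi(r)\rangle\langle\tilde\Psi(r)|)\, \hat D(\omega_1)\otimes\hat D(\omega_2)\big] \tag{S105} \\
&= \frac{1}{\pi^{2n}} \int d^{2n}\beta_1\, d^{2n}\beta_2\, g_{\tilde\Psi}(\beta_1,\beta_2) \int d^n\phi_A\, d^n\phi_B\, \frac{e^{-\frac{|\phi_A|^2+|\phi_B|^2}{2\Delta^2}}}{(2\pi\Delta^2)^n}\, \mathrm{Tr}\big[\hat D^\dagger(\beta_1e^{-i\phi_A})\otimes\hat D^\dagger(\beta_2e^{-i\phi_B})\, \hat D(\omega_1)\otimes\hat D(\omega_2)\big] \tag{S106} \\
&= \int d^{2n}\beta_1\, d^{2n}\beta_2\, g_{\tilde\Psi}(\beta_1,\beta_2) \int d^n\phi_A\, d^n\phi_B\, \frac{e^{-\frac{|\phi_A|^2+|\phi_B|^2}{2\Delta^2}}}{(2\pi\Delta^2)^n}\, \delta(\omega_1-\beta_1e^{-i\phi_A})\, \delta(\omega_2-\beta_2e^{-i\phi_B}) \tag{S107} \\
&= \int d^n\phi_A\, d^n\phi_B\, \frac{e^{-\frac{|\phi_A|^2+|\phi_B|^2}{2\Delta^2}}}{(2\pi\Delta^2)^n}\, g_{\tilde\Psi}(\omega_1e^{i\phi_A},\omega_2e^{i\phi_B}). \tag{S108}
\end{align}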
Thus, the effect of phase noise is to transform the g function as a mixture with random phases. We
present examples to illustrate the effect of the phase noise on the sample complexity in Fig. S1.
We have chosen the parameters $\beta \in \mathbb{C}^n$ of two extreme cases as $\beta := (|\beta|/\sqrt n, \ldots, |\beta|/\sqrt n)$ and
$\beta := (|\beta|, 0, \ldots, 0)$. Recall that the typical choice of the regime of interest in the main text is
|β|2 ≤ κn; here, the range |β|2 ∈ [0, 130] in the figure covers up to κ = 2.5 for n = 50. We see that
the advantages of the entanglement-assisted scheme look robust against small-phase diffusion noise.
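The phase average in Eq. (S108) is easy to probe by Monte Carlo. The sketch below assumes the standard TMSV characteristic function $g(\omega_1,\omega_2,r) = \exp\{-\tfrac12[(|\omega_1|^2+|\omega_2|^2)\cosh 2r - (\omega_1\cdot\omega_2+\mathrm{c.c.})\sinh 2r]\}$, which matches the form used in Eqs. (S73) and (S74), and evaluates it at $(\beta^*, \beta)$ with independent Gaussian phases on every mode. Function and argument names are illustrative.

```python
import math
import random

def noisy_tmsv_g(beta_sq_per_mode, r, delta_rad, shots=20000, seed=0):
    """Monte Carlo estimate of the phase-averaged g(beta^*, beta) of Eq. (S108).

    beta_sq_per_mode: list of |beta_j|^2, one entry per mode.
    delta_rad: standard deviation of each phase, in radians.
    """
    rng = random.Random(seed)
    total_sq = sum(beta_sq_per_mode)
    acc = 0.0
    for _ in range(shots):
        overlap = 0.0
        for b2 in beta_sq_per_mode:
            phi_a = rng.gauss(0.0, delta_rad)
            phi_b = rng.gauss(0.0, delta_rad)
            # Re[(beta_j^* e^{i phi_a}) (beta_j e^{i phi_b})] = |beta_j|^2 cos(phi_a + phi_b)
            overlap += b2 * math.cos(phi_a + phi_b)
        acc += math.exp(-total_sq * math.cosh(2 * r) + overlap * math.sinh(2 * r))
    return acc / shots

if __name__ == "__main__":
    n, beta_sq = 50, 100.0
    spread = [beta_sq / n] * n            # beta = (|beta|/sqrt(n), ..., |beta|/sqrt(n))
    peaked = [beta_sq] + [0.0] * (n - 1)  # beta = (|beta|, 0, ..., 0)
    for deg in (0.0, 1.0, 2.0):
        d = math.radians(deg)
        print(deg, noisy_tmsv_g(spread, 1.5, d, shots=2000), noisy_tmsv_g(peaked, 1.5, d, shots=2000))
```

At zero phase noise the routine returns $e^{-e^{-2r}|\beta|^2}$ exactly. Since every cosine is at most one, phase noise can only lower g and hence raise the sample complexity.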
In this section, we prove the fundamental limit on general entanglement-free schemes for learning
n-mode random displacement channels. In this work, we will focus on the ancilla-free schemes
without concatenation. This means that, for each copy of the channel, we act it on some input state
and apply a destructive POVM measurement right after. The input states and measurements can
be adaptively chosen depending on previous measurement outcomes. See Fig. S2. Bounds for such
schemes have been investigated in different tasks [4–6]. One can also study ancilla-free protocols
with concatenation, the lower bounds for which have been obtained in several recent works [6–8], but
we leave that for future study, as continuous-variable systems add an additional level of complexity.
Throughout this work, we assume entanglement-free schemes to have no concatenation.
FIG. S2. Schematics for entanglement-free schemes. In this work we assume no concatenation is allowed.
Such a scheme can be completely specified by a collection of input states and POVM measurements that
adaptively depend on the measurement outcomes from the previous round.
Theorem S1. Let Λ be an arbitrary n-mode random displacement channel (n ≥ 8) and consider
an entanglement-free scheme that uses N copies of Λ. After all measurements are completed, the
scheme receives the query $\beta \in \mathbb{C}^n$ and returns an estimate $\tilde\lambda(\beta)$ of Λ’s characteristic function $\lambda(\beta)$.
Suppose that, with success probability at least 2/3, $|\tilde\lambda(\beta) - \lambda(\beta)| \le \epsilon \le 0.24$ for all β such that
$|\beta|^2 \le n\kappa$. Then $N \ge 0.01\,\epsilon^{-2}(1 + 1.98\kappa)^n$.
Recall that an entanglement-assisted scheme can achieve the same task using O(ϵ−2 ) copies of
channels given sufficient squeezing and κ = O(1). Therefore, we establish an exponential separation
between learning bosonic random displacement channels with and without entanglement. In the
following, we start proving this result in Sec. S3 A and present a core lemma in Sec. S3 B. We also
prove a bound for learning with Gaussian schemes in Sec. S3 C, which might be of independent
interest.
Before proceeding, let us specify some regularity conditions. We will only work with proper
(i.e., normalizable) vectors in the Hilbert space and bounded operators acting on the Hilbert
space. That is to say, every quantum state we consider can be expressed as a density operator
ρ̂ with trace 1, and every POVM element Ê is a bounded positive semi-definite operator satisfying
$\hat E \le \hat I$. Perhaps the most representative example that does not admit the above form is the perfect
homodyne detection projector |x⟩⟨x|, which represents projection onto the quadrature x. While |x⟩
is not a proper vector in the relevant Hilbert space, it can be treated as a limit of proper vectors in
any physical setting. Concretely, homodyne detection is implemented by applying a 50:50 beam
splitter between the input state and a strong local oscillator [3], and the above improper projector
|x⟩⟨x| is obtained by taking the limit where the power of the oscillator goes to infinity. Therefore, in
a reasonable physical setup, the actual projector is constructed with a proper vector in the Hilbert
space, thus satisfying our assumptions. We emphasize that our entanglement-assisted strategy also
satisfies the same assumption as we regularize the Bell measurements with general-dyne detection
with a parameter s < ∞, see Sec. S2 C.
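Given a positive integer n ≥ 8 and ϵ ≤ 0.24, we introduce a family of “3-peak” random displacement
channels, defined by their characteristic functions,
\[
\Lambda_\gamma:\quad \lambda_\gamma(\beta) := e^{-\frac{|\beta|^2}{2\sigma^2}} + 2i\epsilon_0\, e^{-\frac{|\beta-\gamma|^2}{2\sigma^2}} - 2i\epsilon_0\, e^{-\frac{|\beta+\gamma|^2}{2\sigma^2}}, \qquad \gamma \in \mathbb{C}^n, \tag{S109}
\]
where $\epsilon_0 := \epsilon/0.98 \le 0.25$. The corresponding distributions of displacements, computed via Fourier
transformation, are
\[
\Lambda_\gamma:\quad p_\gamma(\alpha) = \left(\frac{2\sigma^2}{\pi}\right)^{n} e^{-2\sigma^2|\alpha|^2}\Big(1 + 4\epsilon_0 \sin\big(2(\mathrm{Im}[\gamma]\cdot\mathrm{Re}[\alpha] - \mathrm{Re}[\gamma]\cdot\mathrm{Im}[\alpha])\big)\Big), \tag{S110}
\]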
from which we see that the typical strength of displacement is of order 1/σ. Roughly, the smaller σ
is, the more energy the channel carries. We define Λdep := Λ0 as the CV analog of the depolarizing
channel, and the other Λγ can be viewed as perturbed depolarizing channels. The set of 3-peak
channels with parameters (ϵ, σ) is denoted as $\Lambda^{\epsilon,\sigma}_{\text{3-peak}}$. With this, we are going to prove a strictly
stronger result than Theorem S1. That is, even if one knows the channel to be estimated is from
the restricted family $\Lambda^{\epsilon,\sigma}_{\text{3-peak}}$, an exponential lower bound still applies.
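To make the 3-peak family concrete, the following Python sketch evaluates the characteristic function of Eq. (S109) and the displacement density of Eq. (S110) for vectors given as lists of complex numbers. The helper names are ours. The sketch only evaluates the two closed forms; it does not perform the Fourier transformation numerically.

```python
import math

def norm_sq(v):
    return sum(abs(x) ** 2 for x in v)

def lam3(beta, gamma, eps0, sigma):
    """Characteristic function lambda_gamma(beta) of Eq. (S109)."""
    s = 2 * sigma**2
    minus = norm_sq([b - g for b, g in zip(beta, gamma)])
    plus = norm_sq([b + g for b, g in zip(beta, gamma)])
    return math.exp(-norm_sq(beta) / s) + 2j * eps0 * (math.exp(-minus / s) - math.exp(-plus / s))

def density3(alpha, gamma, eps0, sigma):
    """Displacement density p_gamma(alpha) of Eq. (S110)."""
    n = len(alpha)
    arg = 2 * sum(g.imag * a.real - g.real * a.imag for a, g in zip(alpha, gamma))
    gauss = (2 * sigma**2 / math.pi) ** n * math.exp(-2 * sigma**2 * norm_sq(alpha))
    return gauss * (1 + 4 * eps0 * math.sin(arg))

if __name__ == "__main__":
    gamma = [1 + 1j, 0.5j]
    print(lam3([0j, 0j], gamma, 0.25, 0.3))   # normalization: lambda(0) = 1
    print(lam3(gamma, gamma, 0.25, 0.3))      # side peak probed at beta = gamma
```

Because $4\epsilon_0 \le 1$, the sinusoidal modulation never drives the density negative, which is why the constraint $\epsilon_0 \le 0.25$ appears in the definition.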
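If there exists an entanglement-free scheme that, after learning from N copies of an n-mode
random displacement channel $\Lambda \in \Lambda^{\epsilon,\sigma}_{\text{3-peak}}$ and then receiving a query $\beta \in \mathbb{C}^n$, can return an
estimate $\tilde\lambda(\beta)$ of $\lambda(\beta)$ such that $|\tilde\lambda(\beta) - \lambda(\beta)| \le \epsilon$ with probability at least 2/3 for all β such that
$|\beta|^2 \le n\kappa$, then
\[
N \ge 0.01\,\epsilon^{-2}\left(1 + \frac{1.98\kappa}{1+2\sigma^2}\right)^{n}. \tag{S112}
\]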
It is not hard to see that a σ > 0 satisfying the assumptions can always be found for any κ > 0.
Indeed, Theorem S1 follows from Theorem S2 by choosing σ → 0. Note that Theorem S2 does not
place any constraint on the input state and measurement. This means that learning a finite-energy
random displacement channel is hard without ancilla even given energy-unbounded input state and
measurement. Also, Theorem S2 enables an experimental test, as it only requires generating finite
displacement with high probability. The practical performance of this bound with σ = 0.3 is shown
in Fig. S3.
Proof of Theorem S2. Now we introduce the following game between Alice and Bob that helps
reduce the learning task to a partially-revealed hypothesis testing task [9]. First, Alice samples
s ∈ {±1} with equal probability and γ ∈ Cn according to the multivariate normal distribution q(γ)
defined as
\[
q(\gamma) := \left(\frac{1}{2\pi\sigma_\gamma^2}\right)^{n} e^{-\frac{|\gamma|^2}{2\sigma_\gamma^2}}, \tag{S113}
\]
FIG. S3. Learning random displacement channels from the family with σ = 0.3 as in Theorem S2. (In the
main text, we set σ = 0.) All κ shown in the figure satisfies Eq. (S111). (a) Comparison of TMSV+BM (with
different loss rates), Vacuum+Heterodyne, and the entanglement-free lower bound at κ = 1. The task is to
estimate any λ(β) such that |β|2 ≤ κn with precision ε = 0.2 and success probability 1 − δ = 2/3. The orange
region represents a rigorous advantage over any entanglement-free schemes. The blue region represents an
advantage over Vacuum+heterodyne. (b) Comparison of the TMSV+BM scheme with squeezing parameter
r = 1.0 and loss rate 1 − T = 0.1 with the entanglement-free lower bound of Theorem 2. The task is the same
as (a). The brown solid contour lines represent the sample complexity of TMSV+BM given by Theorem 3.
The blue dashed contour lines represent the ratio of sample complexity between the entanglement-free lower
bound and TMSV+BM, which clearly indicates the entanglement-enabled advantages.
where we will set $2\sigma_\gamma^2 := 0.99\kappa$ to ensure the tail probability, i.e., $\Pr[|\gamma|^2 > \kappa n]$, to be sufficiently
small. Next, Alice does one of the following with equal probability:
Bob then measures the N copies of the channels Alice prepared. After Bob has finished the
measurements and retains only classical information, Alice reveals the value of γ to Bob. Now Bob
is asked to distinguish between the two hypotheses: whether Alice has prepared copies of Λdep or
Λsγ . Crucially, Bob must have completed all quantum measurements before Alice reveals γ, and
can only perform classical post-processing after that.
We first argue that if there is a scheme satisfying the assumptions of Theorem S2, then Bob can
use it to win the game with an average probability much better than random guess. Bob’s strategy
is as follows: If the γ he received satisfies 2σ 2 < |γ|2 ≤ κn, use the scheme to query λ(γ). Note
that for any γ ∈ Cn :
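\[
|\lambda_{\rm dep}(\gamma) - \lambda_{\pm\gamma}(\gamma)| = \frac12\,|\lambda_\gamma(\gamma) - \lambda_{-\gamma}(\gamma)| = 2\epsilon_0\left(1 - e^{-\frac{4|\gamma|^2}{2\sigma^2}}\right). \tag{S114}
\]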
For |γ|2 > 2σ 2 , the R.H.S. is lower bounded by 2ϵ0 × 0.98 = 2ϵ. By assumption, this allows Bob to
distinguish among {Λdep , Λγ , Λ−γ } and thus guess correctly with at least 2/3 chance; For other γ,
The first inequality is shown in Sec. S4. The second inequality requires n ≥ 8 and 2σ 2 ≤ 0.99κ := 2σγ2 .
Bob’s average success probability is lower bounded by
\[
\Pr[\text{Success}] \ge \Pr\big[2\sigma^2 < |\gamma|^2 \le \kappa n\big] \times 2/3 + \big(1 - \Pr\big[2\sigma^2 < |\gamma|^2 \le \kappa n\big]\big) \times 1/2. \tag{S120}
\]
Now we investigate the probability distribution of Bob’s measurement outcomes for any γ. For
any adaptive entanglement-free strategy, one specifies an input state and a POVM for the ith
copy of Λ that can depend on previous measurement outcomes. We denote the ith measurement
outcomes as $o_i$ and the outcomes up to the $i$th round as $o_{<i} = [o_1, \ldots, o_{i-1}]$. The latter is added as
a superscript to the $i$th input state $\hat\rho^{o_{<i}}$ and POVM element $\hat E^{o_{<i}}_{o_i}$ to emphasize their adaptive
nature. With these notations, the probability of obtaining outcomes $o_{1:N}$ on N copies of Λ is
\[
p(o_{1:N}|\Lambda) = \prod_{i=1}^{N} \mathrm{Tr}\Big[\hat E^{o_{<i}}_{o_i}\, \Lambda(\hat\rho^{o_{<i}})\Big]. \tag{S121}
\]
For a fixed γ, let $p_1(o_{1:N}) := p(o_{1:N}|\Lambda_{\rm dep})$ and $p_{2,\gamma}(o_{1:N}) := \mathbb{E}_{s=\pm1}\, p(o_{1:N}|\Lambda_{s\gamma})$, which are the distributions
of Bob’s outcomes under the two hypotheses, respectively, conditioned on the γ he received.
According to the property of total variation distance, the maximal probability that Bob can
distinguish p1 and p2,γ is bounded by
\[
\Pr[\text{Success}|\gamma] \le \frac12\big(1 + \mathrm{TVD}(p_1, p_{2,\gamma})\big), \tag{S122}
\]
where TVD is the total variation distance defined as
\[
\mathrm{TVD}(p_1, p_{2,\gamma}) := \sum_{o_{1:N}} \max\{0,\; p_1(o_{1:N}) - p_{2,\gamma}(o_{1:N})\}. \tag{S123}
\]
We note that the sum over o1:N should be understood as integral for continuous-variable outcomes.
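For discrete outcome distributions, Eqs. (S122) and (S123) can be evaluated directly. The following minimal Python sketch (function names are ours) treats the two hypotheses as lists of probabilities over the same outcome set; for continuous outcomes the sum would be replaced by an integral.

```python
def total_variation(p1, p2):
    """TVD of Eq. (S123): sum over outcomes of max(0, p1 - p2)."""
    return sum(max(0.0, a - b) for a, b in zip(p1, p2))

def success_upper_bound(p1, p2):
    """Upper bound of Eq. (S122) on the probability of telling p1 from p2."""
    return 0.5 * (1 + total_variation(p1, p2))

if __name__ == "__main__":
    p_dep = [0.5, 0.5]
    p_alt = [0.25, 0.75]
    print(total_variation(p_dep, p_alt), success_upper_bound(p_dep, p_alt))
```

Identical distributions give a bound of 1/2 (random guessing), while disjoint distributions give a bound of 1.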
Thus, the average probability that Bob can win the game is upper bounded by
\[
\Pr[\text{Success}] = \mathbb{E}_{\gamma\sim q} \Pr[\text{Success}|\gamma] \le \frac12\big(1 + \mathbb{E}_\gamma \mathrm{TVD}(p_1, p_{2,\gamma})\big). \tag{S124}
\]
Combining Eq. (S120) and Eq. (S124), we get
In the following, we show by direct calculation that this is impossible unless N is exponentially
large in n, which yields a desired lower bound for the sample complexity.
Thanks to convexity, we can assume pure input states and rank-1 measurements without decreasing
the TVD, i.e., the kth round’s input state and POVM are written as $|A^{o_{<k}}\rangle$ and $\{|B^{o_{<k}}_{o_k}\rangle\langle B^{o_{<k}}_{o_k}|\}_{o_k}$,
which are conditioned on the previous measurement outcomes $o_{<k}$. Here, the input state has unit
length and $\sum_{o_k} |B^{o_{<k}}_{o_k}\rangle\langle B^{o_{<k}}_{o_k}| = \hat I$. We note that since any density matrix is trace-class, a spectral
decomposition always exists. On the other hand, a POVM element can be a non-compact operator
and might not have a spectral decomposition, but it is known that it can always be decomposed
into rank-1 projectors with positive coefficients (see [10, Theorem 6]). Thus, taking both the input
state and the measurement elements to be rank-1 is indeed justified.
Now, let us rewrite the probabilities as
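\begin{align}
p_1(o_{1:N}) &= \prod_{k=1}^{N} \langle B^{o_{<k}}_{o_k}|\Lambda_{\rm dep}(|A^{o_{<k}}\rangle\langle A^{o_{<k}}|)|B^{o_{<k}}_{o_k}\rangle \tag{S126} \\
&= \prod_{k=1}^{N} \left[\frac{1}{\pi^n}\int d^{2n}\beta_k\, \lambda_{\rm dep}(\beta_k)\, \langle B^{o_{<k}}_{o_k}|\hat D^\dagger(\beta_k)|B^{o_{<k}}_{o_k}\rangle \langle A^{o_{<k}}|\hat D(\beta_k)|A^{o_{<k}}\rangle\right], \tag{S127} \\
p_{2,\gamma}(o_{1:N}) &= \mathbb{E}_{s=\pm1} \prod_{k=1}^{N} \langle B^{o_{<k}}_{o_k}|\Lambda_{s\gamma}(|A^{o_{<k}}\rangle\langle A^{o_{<k}}|)|B^{o_{<k}}_{o_k}\rangle \tag{S128} \\
&= \mathbb{E}_{s=\pm1} \prod_{k=1}^{N} \left[\frac{1}{\pi^n}\int d^{2n}\beta_k\, \lambda_{s\gamma}(\beta_k)\, \langle B^{o_{<k}}_{o_k}|\hat D^\dagger(\beta_k)|B^{o_{<k}}_{o_k}\rangle \langle A^{o_{<k}}|\hat D(\beta_k)|A^{o_{<k}}\rangle\right]. \tag{S129}
\end{align}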
Let $\lambda^{\rm add}_\gamma(\beta_k) := \lambda_\gamma(\beta_k) - \lambda_{\rm dep}(\beta_k)$. The difference of the probabilities can then be written as
where
where the second line uses the AM-GM inequality and the fact that the expression inside the bracket
is the ratio of two conditional probabilities and is thus non-negative; the third line uses the fact that
$\mathrm{Im}\, G^{o_{\le k}}_\sigma(\gamma) = -\mathrm{Im}\, G^{o_{\le k}}_\sigma(-\gamma)$; the fourth line uses $\sqrt{1-x} \ge 1-x$, ∀ 0 ≤ x ≤ 1; and the final line
uses the inequality $\prod_i (1-x_i) \ge 1 - \sum_i x_i$ for all $0 \le x_i \le 1$. Thus, we can get rid of the maximum
in the expression of the average TVD as
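\[
\mathbb{E}_\gamma \mathrm{TVD}(p_1, p_{2,\gamma}) \le \sum_{o_{1:N}} p_1(o_{1:N}) \sum_{k=1}^{N} 16\epsilon_0^2\, \mathbb{E}_\gamma |G^{o_{\le k}}_\sigma(\gamma)|^2. \tag{S142}
\]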
To further upper bound the R.H.S., we need the following Lemma S1. The lemma is analogous
to Pauli twirling in discrete-variable systems but also takes finite energy into consideration. The
proof of Lemma S1 is given in Sec. S3 B; Alternatively, when the input states and measurements are
restricted to be Gaussian, a more straightforward calculation is possible, yielding different bounds,
which we will present in Sec. S3 C.
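Lemma S1. For any $|A^{o_{<k}}\rangle$, $|B^{o_{<k}}_{o_k}\rangle$ we have
\[
\mathbb{E}_\gamma |G^{o_{\le k}}_\sigma(\gamma)|^2 \le \left(\frac{1+2\sigma^2}{1+2\sigma^2+4\sigma_\gamma^2}\right)^{n}, \tag{S143}
\]
given that $\sigma^2 \le \max\left\{\frac12 - 2\sigma_\gamma^2,\; \sigma_\gamma^2\left(\sqrt{1+\frac{1}{4\sigma_\gamma^4}} - 1\right)\right\}$.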
Combining this with the lower bound in Eq. (S125) and substituting ϵ = 0.98ϵ0 ,
\[
N \ge 0.01\,\epsilon^{-2}\left(1 + \frac{4\sigma_\gamma^2}{1+2\sigma^2}\right)^{n}. \tag{S145}
\]
By substituting 2σγ2 = 0.99κ, we obtain the lower bound as claimed in Theorem S2.
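The resulting bound is straightforward to evaluate. The sketch below (with our own function name) computes the right-hand side of Eq. (S145) after substituting $2\sigma_\gamma^2 = 0.99\kappa$, which reproduces Eq. (S112) and reduces to the bound of Theorem S1 at σ = 0.

```python
def theorem_s2_lower_bound(eps, kappa, n, sigma):
    """Right-hand side of Eq. (S145) with 2 sigma_gamma^2 = 0.99 kappa."""
    sigma_gamma_sq = 0.99 * kappa / 2
    base = 1 + 4 * sigma_gamma_sq / (1 + 2 * sigma**2)
    return 0.01 * eps**-2 * base**n

if __name__ == "__main__":
    for n in (10, 30, 50):
        print(n, theorem_s2_lower_bound(0.2, 1.0, n, 0.0), theorem_s2_lower_bound(0.2, 1.0, n, 0.3))
```

A larger σ (a lower-energy channel family) shrinks the base of the exponential but leaves the scaling exponential in n.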
In Fig. S3, we compare the upper bound of the TMSV+BM scheme to the derived lower bound
of entanglement-free schemes. In contrast to the main text, we set σ = 0.3 to consider a more
practical case for experimental realization in the near future. To see how much energy is required to
realize the 3-peak channel, one can easily check that for a given σ, the corresponding single-mode
depolarizing channel Λ0 transforms a vacuum input state to a thermal state of mean photon number
1/2σ 2 . Since this channel is a product channel, it implies that we need 1/2σ 2 average photons per
mode. For our choice σ = 0.3, 1/2σ 2 ≈ 5.56. Since the envelope determined by σ has a larger
contribution than γ that determines the oscillation, we are required to produce approximately
1/2σ 2 photon number on average. It is worth emphasizing that for κ ≤ 2.5, the choice satisfies the
condition of Theorem S2.
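The two numerical claims above (the mean photon number $1/2\sigma^2$ and the validity of the choice σ = 0.3 up to κ = 2.5) can be checked with a short Python sketch. It tests only the condition on σ stated in Lemma S1, with $2\sigma_\gamma^2 = 0.99\kappa$; function names are illustrative.

```python
import math

def mean_photon_number(sigma):
    """Mean photon number per mode that Lambda_0 imparts to vacuum: 1 / (2 sigma^2)."""
    return 1 / (2 * sigma**2)

def lemma_s1_condition_holds(sigma, kappa):
    """Check sigma^2 <= max{1/2 - 2 sg2, sg2 (sqrt(1 + 1/(4 sg2^2)) - 1)}, sg2 = 0.99 kappa / 2."""
    sg2 = 0.99 * kappa / 2
    branch1 = 0.5 - 2 * sg2
    branch2 = sg2 * (math.sqrt(1 + 1 / (4 * sg2**2)) - 1)
    return sigma**2 <= max(branch1, branch2)

if __name__ == "__main__":
    print(mean_photon_number(0.3))
    for kappa in (0.5, 1.0, 2.5, 3.0):
        print(kappa, lemma_s1_condition_holds(0.3, kappa))
```

For σ = 0.3 the condition holds at κ = 2.5 and fails at κ = 3, in line with the stated range.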
B. Proof of Lemma S1
In this section we prove Lemma S1. Let |A⟩ , |B⟩ be arbitrary normalized pure states in the
n-mode bosonic Hilbert space. Define
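\[
G(\beta) := \langle B|\hat D^\dagger(\beta)|B\rangle\langle A|\hat D(\beta)|A\rangle, \tag{S146}
\]
\[
G_\sigma(\gamma) := \frac{[N_\sigma * G](\gamma)}{[N_\sigma * G](0)} := \frac{\int d^{2n}\beta\, \exp\!\left(-|\beta-\gamma|^2/2\sigma^2\right) G(\beta)}{\int d^{2n}\beta\, \exp\!\left(-|\beta|^2/2\sigma^2\right) G(\beta)}. \tag{S147}
\]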
Here ∗ stands for convolution. We are going to prove the following inequality
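\[
\mathbb{E}_\gamma |G_\sigma(\gamma)|^2 = \frac{\mathbb{E}_\gamma |[N_\sigma * G](\gamma)|^2}{|[N_\sigma * G](0)|^2} \le \left(\frac{1+2\sigma^2}{1+2\sigma^2+4\sigma_\gamma^2}\right)^{n}, \tag{S148}
\]
where $\gamma \sim q(\gamma) := \left(\frac{1}{2\pi\sigma_\gamma^2}\right)^{n} \exp\!\left(-\frac{|\gamma|^2}{2\sigma_\gamma^2}\right)$ and $\sigma^2 \le \max\left\{\frac12 - 2\sigma_\gamma^2,\; \sigma_\gamma^2\left(\sqrt{1+\frac{1}{4\sigma_\gamma^4}} - 1\right)\right\}$.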
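\[
[N_\sigma * G](\gamma) = \frac{1}{\pi^{2n}}\int d^{2n}\omega\, e^{\omega^\dagger\gamma-\gamma^\dagger\omega}\, F_{N_\sigma * G}(\omega) = \frac{1}{\pi^{2n}}\int d^{2n}\omega\, e^{\omega^\dagger\gamma-\gamma^\dagger\omega}\, F_{N_\sigma}(\omega) F_G(\omega), \tag{S149}
\]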
where the last equality uses the convolution theorem [11]. The Fourier component of Nσ is
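\[
F_{N_\sigma}(\omega) = \int d^{2n}\beta\, e^{-\frac{|\beta|^2}{2\sigma^2}}\, e^{\beta^\dagger\omega-\omega^\dagger\beta} = (2\pi\sigma^2)^n e^{-2\sigma^2|\omega|^2}. \tag{S150}
\]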
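\begin{align}
&= \int d^{2n}\beta\, \langle B|\hat D^\dagger(\beta)|B\rangle\langle A|\hat D^\dagger(\omega)\hat D(\beta)\hat D(\omega)|A\rangle \tag{S152} \\
&= \pi^n |\langle B|\hat D(\omega)|A\rangle|^2, \tag{S153}
\end{align}
where the second line uses $\hat D^\dagger(\omega)\hat D(\beta)\hat D(\omega) = e^{\beta^\dagger\omega-\omega^\dagger\beta}\hat D(\beta)$, and the last line is by Eq. (S1). Thus,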
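\begin{align}
& |[N_\sigma * G](\gamma)|^2 \tag{S154} \\
&= \left|\frac{1}{\pi^{2n}}\int d^{2n}\omega\, e^{\omega^\dagger\gamma-\gamma^\dagger\omega}\, F_{N_\sigma}(\omega) F_G(\omega)\right|^2 \tag{S155} \\
&= \frac{1}{\pi^{4n}}\int d^{2n}\omega\, d^{2n}\omega'\, e^{(\omega-\omega')^\dagger\gamma-\gamma^\dagger(\omega-\omega')}\, F_{N_\sigma}(\omega) F^*_{N_\sigma}(\omega') F_G(\omega) F^*_G(\omega') \tag{S156} \\
&= (2\sigma^2)^{2n}\int d^{2n}\omega\, d^{2n}\omega'\, e^{(\omega-\omega')^\dagger\gamma-\gamma^\dagger(\omega-\omega')}\, e^{-2\sigma^2(|\omega|^2+|\omega'|^2)}\, |\langle B,B|\hat D(\omega)\otimes\hat D(\omega')|A,A\rangle|^2. \tag{S157}
\end{align}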
FIG. S4. Schematics for Eq. (S164) to (S169). Here, each line represents n-mode state, and we omit the
phase factors for simplicity.
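\begin{align}
\mathbb{E}_\gamma |[N_\sigma * G](\gamma)|^2 &= (2\sigma^2)^{2n}\int d^{2n}\omega\, d^{2n}\omega'\, e^{-2\sigma_\gamma^2|\omega-\omega'|^2}\, e^{-2\sigma^2(|\omega|^2+|\omega'|^2)}\, |\langle B,B|\hat D(\omega)\otimes\hat D(\omega')|A,A\rangle|^2 \tag{S158} \\
&= (2\sigma^2)^{2n}\int d^{2n}\alpha\, d^{2n}\beta\, e^{-4\sigma_\gamma^2|\beta|^2}\, e^{-2\sigma^2(|\alpha|^2+|\beta|^2)}\, |\langle B,B|\hat U_{BS}^\dagger \hat D(\alpha)\otimes\hat D(\beta)\hat U_{BS}|A,A\rangle|^2. \tag{S159}
\end{align}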
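Here, we changed variables as $\omega = (\alpha+\beta)/\sqrt2$ and $\omega' = (\alpha-\beta)/\sqrt2$, i.e., $(\omega+\omega')/\sqrt2 = \alpha$ and
$(\omega-\omega')/\sqrt2 = \beta$, and chose the 50:50 beam splitter such that
\[
\hat U_{BS}^\dagger \hat D(\alpha)\otimes\hat D(\beta)\hat U_{BS} = \hat D\big((\alpha+\beta)/\sqrt2\big)\otimes\hat D\big((\alpha-\beta)/\sqrt2\big). \tag{S160}
\]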
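\begin{align}
|[N_\sigma * G](0)|^2 &= (2\sigma^2)^{2n}\int d^{2n}\omega\, d^{2n}\omega'\, e^{-2\sigma^2(|\omega|^2+|\omega'|^2)}\, |\langle B,B|\hat D(\omega)\otimes\hat D(\omega')|A,A\rangle|^2 \tag{S161} \\
&= (2\sigma^2)^{2n}\int d^{2n}\alpha\, d^{2n}\beta\, e^{-2\sigma^2(|\alpha|^2+|\beta|^2)}\, |\langle B,B|\hat U_{BS}^\dagger \hat D(\alpha)\otimes\hat D(\beta)\hat U_{BS}|A,A\rangle|^2. \tag{S162}
\end{align}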
To further simplify the expressions, note that by applying the convolution theorem to Eq. (S151),
we have
\[
|\langle B|\hat D(\alpha)|A\rangle|^2 = \pi^n \int d^{2n}\beta\, W_A(\beta)\, W_B(\beta-\alpha), \tag{S163}
\]
where WA and WB are the Wigner functions of the states |A⟩ and |B⟩, respectively. Here, note the
sign in the arguments due to the complex conjugate of the characteristic function, ⟨B|D̂† (β)|B⟩, in
Eq. (S151). Thus, by defining the 2n-mode states |a⟩ := ÛBS |A, A⟩ and |b⟩ := ÛBS |B, B⟩, we have
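\begin{align}
&\int d^{2n}\alpha\, d^{2n}\beta\, e^{-4\sigma_\gamma^2|\beta|^2}\, e^{-2\sigma^2(|\alpha|^2+|\beta|^2)}\, |\langle B,B|\hat U_{BS}^\dagger \hat D(\alpha)\otimes\hat D(\beta)\hat U_{BS}|A,A\rangle|^2 \tag{S164} \\
&= \pi^{2n}\int d^{2n}\omega_1\, d^{2n}\omega_2\, d^{2n}\alpha\, d^{2n}\beta\, e^{-2\sigma^2|\alpha|^2}\, e^{-(4\sigma_\gamma^2+2\sigma^2)|\beta|^2}\, W_a(\omega_1,\omega_2)\, W_b(\omega_1-\alpha,\omega_2-\beta) \tag{S165} \\
&= \pi^{2n}\int d^{2n}\omega_1\, d^{2n}\omega_2\, d^{2n}\gamma_1\, d^{2n}\gamma_2\, e^{-2\sigma^2|\omega_1-\gamma_1|^2}\, e^{-(4\sigma_\gamma^2+2\sigma^2)|\omega_2-\gamma_2|^2}\, W_a(\omega_1,\omega_2)\, W_b(\gamma_1,\gamma_2) \tag{S166} \\
&= \pi^{2n}\int d^{2n}\alpha_1\, d^{2n}\alpha_2\, d^{2n}\beta_1\, d^{2n}\beta_2\, e^{-4\sigma^2|\alpha_1|^2}\, e^{-(8\sigma_\gamma^2+4\sigma^2)|\alpha_2|^2}\, W_a\!\left(\frac{\alpha_1+\beta_1}{\sqrt2},\frac{\alpha_2+\beta_2}{\sqrt2}\right) W_b\!\left(\frac{\beta_1-\alpha_1}{\sqrt2},\frac{\beta_2-\alpha_2}{\sqrt2}\right) \tag{S167} \\
&= \pi^{2n}\int d^{2n}\alpha_1\, d^{2n}\alpha_2\, e^{-4\sigma^2|\alpha_1|^2}\, e^{-(8\sigma_\gamma^2+4\sigma^2)|\alpha_2|^2}\, W_d(\alpha_1,\alpha_2) \tag{S168} \\
&= \pi^{2n}(1+2\sigma^2)^{-n}(1+2\sigma^2+4\sigma_\gamma^2)^{-n}\, \mathrm{Tr}\left[\hat\rho_d \left(\frac{1-2\sigma^2}{1+2\sigma^2}\right)^{\hat n_1} \otimes \left(\frac{1-2\sigma^2-4\sigma_\gamma^2}{1+2\sigma^2+4\sigma_\gamma^2}\right)^{\hat n_2}\right], \tag{S169}
\end{align}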
where we used the convolution theorem for the first equality, and we changed the variables for the
second and third equalities, and
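\[
W_d(\alpha_1,\alpha_2) = \int d^{2n}\beta_1\, d^{2n}\beta_2\, W_a\!\left(\frac{\alpha_1+\beta_1}{\sqrt2},\frac{\alpha_2+\beta_2}{\sqrt2}\right) W_b\!\left(\frac{\beta_1-\alpha_1}{\sqrt2},\frac{\beta_2-\alpha_2}{\sqrt2}\right) \tag{S170}
\]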
is the Wigner function of the state ρ̂d obtained by applying a 50:50 beam splitter to the state |a⟩
and |b⟩ and tracing out half of the output. For the last equality, n̂1 and n̂2 are the sum of the
photon number operators for the first and second n modes, respectively, and we use the following
correspondence between the Wigner function and the operator
\[
\frac{e^{-4x|\alpha|^2}}{\pi^n} \iff \frac{[(1-2x)/(1+2x)]^{\hat n}}{(1+2x)^n}, \tag{S171}
\]
for any x > 0 (see e.g. [12, Eq. (3.6.39)]). Note that, when x < 1/2, the R.H.S. is proportional
to a thermal state. Similar methods have been used to prove the maximum fidelity of Gaussian
channels [13]. We illustrate the procedure in Fig. S4. With the same logic, we have
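\begin{align}
&\int d^{2n}\alpha\, d^{2n}\beta\, e^{-2\sigma^2(|\alpha|^2+|\beta|^2)}\, |\langle B,B|\hat U_{BS}^\dagger \hat D(\alpha)\otimes\hat D(\beta)\hat U_{BS}|A,A\rangle|^2 \tag{S172} \\
&= \pi^{2n}(1+2\sigma^2)^{-2n}\, \mathrm{Tr}\left[\hat\rho_d \left(\frac{1-2\sigma^2}{1+2\sigma^2}\right)^{\hat n_1} \otimes \left(\frac{1-2\sigma^2}{1+2\sigma^2}\right)^{\hat n_2}\right]. \tag{S173}
\end{align}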
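Hence, we have
\[
\mathbb{E}_\gamma |G_\sigma(\gamma)|^2 = \left(\frac{1+2\sigma^2}{1+2\sigma^2+4\sigma_\gamma^2}\right)^{n} \frac{\mathrm{Tr}\left[\hat\rho_d \left(\frac{1-2\sigma^2}{1+2\sigma^2}\right)^{\hat n_1} \otimes \left(\frac{1-2\sigma^2-4\sigma_\gamma^2}{1+2\sigma^2+4\sigma_\gamma^2}\right)^{\hat n_2}\right]}{\mathrm{Tr}\left[\hat\rho_d \left(\frac{1-2\sigma^2}{1+2\sigma^2}\right)^{\hat n_1} \otimes \left(\frac{1-2\sigma^2}{1+2\sigma^2}\right)^{\hat n_2}\right]}. \tag{S174}
\]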
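We now consider two parameter regimes. First, if $2\sigma^2 + 4\sigma_\gamma^2 \le 1$, the operators on the R.H.S. are
positive-semidefinite, and it is not hard to see, by monotonicity, that
\[
\frac{\mathrm{Tr}\left[\hat\rho_d \left(\frac{1-2\sigma^2}{1+2\sigma^2}\right)^{\hat n_1} \otimes \left(\frac{1-2\sigma^2-4\sigma_\gamma^2}{1+2\sigma^2+4\sigma_\gamma^2}\right)^{\hat n_2}\right]}{\mathrm{Tr}\left[\hat\rho_d \left(\frac{1-2\sigma^2}{1+2\sigma^2}\right)^{\hat n_1} \otimes \left(\frac{1-2\sigma^2}{1+2\sigma^2}\right)^{\hat n_2}\right]} \le 1. \tag{S175}
\]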
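Second, if $2\sigma^2 + 4\sigma_\gamma^2 > 1$ but $2\sigma^2 \le 2\sigma_\gamma^2\left(\sqrt{1+\frac{1}{4\sigma_\gamma^4}} - 1\right) \le 1$ (the last inequality holds for all
$\sigma_\gamma > 0$), the above can be bounded as
\begin{align}
\frac{\mathrm{Tr}\left[\hat\rho_d \left(\frac{1-2\sigma^2}{1+2\sigma^2}\right)^{\hat n_1} \otimes \left(\frac{1-2\sigma^2-4\sigma_\gamma^2}{1+2\sigma^2+4\sigma_\gamma^2}\right)^{\hat n_2}\right]}{\mathrm{Tr}\left[\hat\rho_d \left(\frac{1-2\sigma^2}{1+2\sigma^2}\right)^{\hat n_1} \otimes \left(\frac{1-2\sigma^2}{1+2\sigma^2}\right)^{\hat n_2}\right]}
&\le \frac{\mathrm{Tr}\left[\hat\rho_d \left(\frac{1-2\sigma^2}{1+2\sigma^2}\right)^{\hat n_1} \otimes \left|\frac{1-2\sigma^2-4\sigma_\gamma^2}{1+2\sigma^2+4\sigma_\gamma^2}\right|^{\hat n_2}\right]}{\mathrm{Tr}\left[\hat\rho_d \left(\frac{1-2\sigma^2}{1+2\sigma^2}\right)^{\hat n_1} \otimes \left(\frac{1-2\sigma^2}{1+2\sigma^2}\right)^{\hat n_2}\right]} \tag{S176} \\
&= \frac{\mathrm{Tr}\left[\hat\rho_d \left(\frac{1-2\sigma^2}{1+2\sigma^2}\right)^{\hat n_1} \otimes \left(\frac{1-2\Sigma^2}{1+2\Sigma^2}\right)^{\hat n_2}\right]}{\mathrm{Tr}\left[\hat\rho_d \left(\frac{1-2\sigma^2}{1+2\sigma^2}\right)^{\hat n_1} \otimes \left(\frac{1-2\sigma^2}{1+2\sigma^2}\right)^{\hat n_2}\right]} \tag{S177} \\
&\le 1, \tag{S178}
\end{align}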
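where, in the second line, we define $-\frac{1-2\sigma^2-4\sigma_\gamma^2}{1+2\sigma^2+4\sigma_\gamma^2} := \frac{1-2\Sigma^2}{1+2\Sigma^2}$, i.e., $\Sigma^2 = \frac{1}{4(\sigma^2+2\sigma_\gamma^2)}$. In the third line,
we use $\Sigma^2 \ge \sigma^2$, which can be easily verified under our assumptions for σ. Therefore, as long as
$\sigma^2 \le \max\left\{\frac12 - 2\sigma_\gamma^2,\; \sigma_\gamma^2\left(\sqrt{1+\frac{1}{4\sigma_\gamma^4}} - 1\right)\right\}$, we have the following bound
\[
\mathbb{E}_\gamma |G_\sigma(\gamma)|^2 \le \left(\frac{1+2\sigma^2}{1+2\sigma^2+4\sigma_\gamma^2}\right)^{n}. \tag{S179}
\]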
This completes the proof of Lemma S1. Note that the equality can be achieved if ρ̂d is chosen to be
the vacuum state. One can verify this holds when |A⟩ = |B⟩ = |α⟩ for some coherent state |α⟩.
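The equality case can be probed numerically. The sketch below relies on a closed form that we derive here as an assumption of the example: for $|A\rangle = |B\rangle$ a coherent state, $G(\beta) = e^{-|\beta|^2}$, and Gaussian convolution in Eq. (S147) gives $G_\sigma(\gamma) = e^{-|\gamma|^2/(1+2\sigma^2)}$. The code then averages $|G_\sigma(\gamma)|^2$ over $\gamma \sim q$ by Monte Carlo and compares with the right-hand side of Eq. (S179). Function names are illustrative.

```python
import math
import random

def lemma_s1_rhs(sigma, sigma_gamma_sq, n):
    """Right-hand side of Eq. (S179)."""
    return ((1 + 2 * sigma**2) / (1 + 2 * sigma**2 + 4 * sigma_gamma_sq)) ** n

def mc_coherent_state_average(sigma, sigma_gamma_sq, n, shots=100000, seed=1):
    """Monte Carlo average of |G_sigma(gamma)|^2 = exp(-2|gamma|^2 / (1 + 2 sigma^2)).

    gamma is drawn from q(gamma): real and imaginary parts of each of the n
    components are independent normals with variance sigma_gamma_sq.
    """
    rng = random.Random(seed)
    std = math.sqrt(sigma_gamma_sq)
    acc = 0.0
    for _ in range(shots):
        gamma_sq = sum(rng.gauss(0.0, std) ** 2 for _ in range(2 * n))
        acc += math.exp(-2 * gamma_sq / (1 + 2 * sigma**2))
    return acc / shots

if __name__ == "__main__":
    print(mc_coherent_state_average(0.3, 0.5, n=2), lemma_s1_rhs(0.3, 0.5, 2))
```

The two printed numbers should agree up to Monte Carlo error, illustrating that coherent-state probes saturate the bound.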
In this section, we give a lower bound for a specific class of scheme, the Gaussian schemes,
which may be of independent interest. An ancilla-free Gaussian scheme is specified by collections of
adaptively chosen Gaussian input states and Gaussian measurements. Again, thanks to convexity,
we consider only pure input states and rank-1 POVM measurements. A Gaussian input state
can be expressed as |A⟩ = D̂(ω)|Ā⟩, where |Ā⟩ is a centered (i.e., zero-mean) Gaussian state; A
Gaussian POVM can be written as
\[
\hat\Pi(\alpha) = |B\rangle\langle B| = \frac{1}{\pi^n}\hat D(\alpha)|\bar B\rangle\langle\bar B|\hat D^\dagger(\alpha), \tag{S180}
\]
for outcomes α ∈ Cn , where |B̄⟩ is a centered Gaussian state. We refer the readers to Ref. [1–3] for
more details about Gaussian quantum information.
n ≥ 8, ϵ ≤ 0.24. (S181)
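If there exists an entanglement-free Gaussian scheme that, after learning from N copies of an
n-mode random displacement channel $\Lambda \in \Lambda^{\epsilon,\sigma}_{\text{3-peak}}$ and then receiving a query $\beta \in \mathbb{C}^n$, can return
an estimate $\tilde\lambda(\beta)$ of $\lambda(\beta)$ such that $|\tilde\lambda(\beta) - \lambda(\beta)| \le \epsilon$ with probability at least 2/3 for all β such that
$|\beta|^2 \le n\kappa$, then
\[
N \ge 0.01\,\epsilon^{-2} \min\left\{\left(1 + \frac{0.99\kappa}{\sigma^2}\right)^{n/2},\; \left(1 + \frac{1.98\kappa}{1+2\sigma^2}\right)^{n}\right\}. \tag{S182}
\]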
A few remarks before presenting the proof: When κ = O(1) and σ 2 ≪ κ, the second expression
in the minimization dominates, and we recover Theorem S2. On the other hand, the bound for
Gaussian schemes also holds for arbitrarily large σ, though the lower bound then enters a different
branch and the separation from entanglement-assisted schemes becomes weaker.
Proof. Consider the same partially-revealed hypothesis-testing task and the same strategy used by
Bob in the proof of Theorem S2. Recall that the average TVD under the two hypotheses is lower
bounded by
To upper bound the average TVD, recall the following bound derived in Eq. (S142),
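\[
\mathbb{E}_\gamma \mathrm{TVD}(p_1, p_{2,\gamma}) \le \sum_{o_{1:N}} p_1(o_{1:N}) \sum_{k=1}^{N} 16\epsilon_0^2\, \mathbb{E}_\gamma |G^{o_{\le k}}_\sigma(\gamma)|^2, \tag{S184}
\]
with $G^{o_{\le k}}_\sigma$ defined as
\[
G^{o_{\le k}}_\sigma(\gamma) = \frac{\int d^{2n}\beta\, e^{-\frac{|\beta-\gamma|^2}{2\sigma^2}}\, G^{o_{\le k}}(\beta)}{\int d^{2n}\beta'\, e^{-\frac{|\beta'|^2}{2\sigma^2}}\, G^{o_{\le k}}(\beta')}, \tag{S185}
\]
\[
G^{o_{\le k}}(\beta) = \frac{\langle B^{o_{<k}}_{o_k}|\hat D^\dagger(\beta)|B^{o_{<k}}_{o_k}\rangle}{\langle B^{o_{<k}}_{o_k}|B^{o_{<k}}_{o_k}\rangle} \cdot \langle A^{o_{<k}}|\hat D(\beta)|A^{o_{<k}}\rangle. \tag{S186}
\]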
Note that this bound holds for any σ. Now we calculate the R.H.S. with Gaussian schemes. First
compute Go≤k (β),
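\[
G^{o_{\le k}}(\beta) = \langle\bar B|\hat D^\dagger(\alpha)\hat D^\dagger(\beta)\hat D(\alpha)|\bar B\rangle\langle A|\hat D(\beta)|A\rangle = \langle\bar B|\hat D^\dagger(\beta)|\bar B\rangle\langle\bar A|\hat D(\beta)|\bar A\rangle\, e^{\beta^\dagger(\alpha-\omega)-(\alpha-\omega)^\dagger\beta}. \tag{S187}
\]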
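Here, without loss of generality, we can always write $|\bar{A}\rangle = \hat{U}_{\mathrm{BS}_A}\hat{U}_{\mathrm{sq}_A}|0\rangle$, where $\hat{U}_{\mathrm{BS}_A}$ represents
the unitary operator for a beam-splitter network and $\hat{U}_{\mathrm{sq}_A}$ represents the product of single-mode
squeezing operations. Similarly, $|\bar{B}\rangle = \hat{U}_{\mathrm{BS}_B}\hat{U}_{\mathrm{sq}_B}|0\rangle$.
To simplify $G^{o_{\le k}}(\beta)$, using $\hat{a} := (\hat{x} + i\hat{p})/\sqrt{2}$, we can rewrite the displacement operator as
\[ \hat{D}(\beta) := \exp\left(\beta\hat{a}^{\dagger} - \beta^{\dagger}\hat{a}\right) = \exp\left(\sqrt{2}i\,\mathrm{Im}\,\beta\cdot\hat{x} - \sqrt{2}i\,\mathrm{Re}\,\beta\cdot\hat{p}\right) = \exp\left(\sqrt{2}i\,v\cdot\hat{q}\right), \tag{S188} \]
where, as the last equality implies, $v := v(\beta) = (\mathrm{Im}\,\beta, -\mathrm{Re}\,\beta)$ and $\hat{q} := (\hat{x}, \hat{p})$.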
Now, let us introduce the symplectic matrix S that describes the dynamics of quadrature operators
under Gaussian unitary operation Û :
Since the Gaussian unitary operation we consider is written as $\hat{U}_{\mathrm{BS}}\hat{U}_{\mathrm{sq}}$, the symplectic matrix can
be decomposed as $S = S_{\mathrm{BS}}S_{\mathrm{sq}}$. Here, $S_{\mathrm{BS}}$ is an orthogonal matrix and $S_{\mathrm{sq}}$ can be explicitly written
as $\mathrm{diag}(e^{r_1},\dots,e^{r_n},e^{-r_1},\dots,e^{-r_n})$, where $r_1,\dots,r_n \ge 0$ represent the squeezing parameters of each
mode. We use r for the squeezing parameters of |A⟩ and s for those of |B̄⟩. After the symplectic transformation
S, the displacement operator transforms as
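\[ \exp\left(\sqrt{2}i\,v^{T}\hat{q}\right) \to \exp\left(\sqrt{2}i\,v\cdot(S\hat{q})\right) = \exp\left(\sqrt{2}i\,(S^{T}v)\cdot\hat{q}\right), \tag{S191} \]
and
\[ \langle 0|\exp\left(\sqrt{2}i\,(S^{T}v)\cdot\hat{q}\right)|0\rangle = e^{-\frac{1}{2}|S^{T}v|^{2}}. \tag{S192} \]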
Thus,
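\[ \langle\bar{A}|\hat{D}(\beta)|\bar{A}\rangle = \langle 0|\hat{U}^{\dagger}_{\mathrm{sq}_A}\hat{U}^{\dagger}_{\mathrm{BS}_A}\hat{D}(\beta)\hat{U}_{\mathrm{BS}_A}\hat{U}_{\mathrm{sq}_A}|0\rangle = e^{-\frac{1}{2}|S_A^{T}v|^{2}}, \tag{S193} \]
and
\[ \langle\bar{B}|\hat{D}^{\dagger}(\beta)|\bar{B}\rangle = \langle\bar{B}|\hat{D}(\beta)|\bar{B}\rangle = e^{-\frac{1}{2}|S_B^{T}v'|^{2}} = e^{-\frac{1}{2}|S_B^{T}v|^{2}}, \tag{S194} \]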
where v := v(β). The first equality holds because ⟨B̄|D̂†(β)|B̄⟩ is real. We can also write
the phase factor as
\[ \exp\left[\beta^{\dagger}(\alpha-\omega) - (\alpha-\omega)^{\dagger}\beta\right] = \exp\left(2i\,v^{T}u\right), \tag{S195} \]
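\begin{align}
G^{o_{\le k}}(\beta) &= \exp\left[-\frac{1}{2}v^{T}\left(S_A S_A^{T} + S_B S_B^{T}\right)v + 2i\,v^{T}u\right] \tag{S197}\\
&:= \exp\left[-\frac{1}{2}v^{T}\Sigma v + 2i\,v^{T}u\right] \tag{S198}\\
&= \exp\left[-\frac{1}{2}(Ov)^{T}D(Ov) + 2i\,(Ov)\cdot(Ou)\right] \tag{S199}\\
&= \prod_{i=1}^{2n}\exp\left[-\frac{d_i}{2}\left(v_i' - \frac{2i}{d_i}u_i'\right)^{2} - \frac{2u_i'^{2}}{d_i}\right], \tag{S200}
\end{align}
where $\Sigma := S_A S_A^{T} + S_B S_B^{T} = O^{T}DO$ with $O$ orthogonal and $D = \mathrm{diag}(d_1,\dots,d_{2n})$, and $v' := Ov$, $u' := Ou$.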
Hence, by the Williamson decomposition [3], the spectrum of Σ is composed of pairs such that the
product of the ith and (i + n)th eigenvalues of this matrix is no smaller than 4. Without loss of
generality, we label the eigenvalues d1 , ..., d2n in such a way that di di+n ≥ 4 for all 1 ≤ i ≤ n.
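As a sanity check of the pairing property $d_i d_{i+n} \ge 4$, the following sketch treats only the single-mode case (n = 1), where the beam-splitter network reduces to a phase rotation and the product of the two eigenvalues of Σ equals its determinant. The helper names and the sampled parameter ranges are our own assumptions for illustration; the general n-mode statement relies on the Williamson decomposition and is not reproduced here.

```python
import math
import random

def single_mode_symplectic(theta, r):
    """2x2 symplectic matrix S = R(theta) diag(e^r, e^-r): squeezing, then rotation."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * math.exp(r), -s * math.exp(-r)],
            [s * math.exp(r), c * math.exp(-r)]]

def gram(S):
    """Return S S^T for a 2x2 matrix S."""
    (a, b), (c, d) = S
    return [[a * a + b * b, a * c + b * d],
            [a * c + b * d, c * c + d * d]]

def sigma_eigenvalues(theta_a, r, theta_b, s):
    """Eigenvalues (larger, smaller) of Sigma = S_A S_A^T + S_B S_B^T for one mode."""
    A = gram(single_mode_symplectic(theta_a, r))
    B = gram(single_mode_symplectic(theta_b, s))
    m11, m12, m22 = A[0][0] + B[0][0], A[0][1] + B[0][1], A[1][1] + B[1][1]
    mean = 0.5 * (m11 + m22)
    gap = math.hypot(0.5 * (m11 - m22), m12)
    return mean + gap, mean - gap

random.seed(0)
products = []
for _ in range(1000):
    d_plus, d_minus = sigma_eigenvalues(
        random.uniform(0, math.pi), random.uniform(0, 1.5),
        random.uniform(0, math.pi), random.uniform(0, 1.5))
    products.append(d_plus * d_minus)
min_product = min(products)

# Identical states for A and B give Sigma = 2 S S^T, whose determinant is exactly 4.
equal_case = sigma_eigenvalues(0.3, 0.7, 0.3, 0.7)
print(min_product, equal_case[0] * equal_case[1])
```

In this single-mode setting the product saturates the value 4 when the two states share the same squeezing and rotation, and exceeds it otherwise.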
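Now let us compute $G_{\sigma}^{o_{\le k}}(\gamma)$.
\[ G_{\sigma}^{o_{\le k}}(\gamma) = \frac{\int d^{2n}\beta\, e^{-\frac{|\beta-\gamma|^{2}}{2\sigma^{2}}}\, G^{o_{\le k}}(\beta)/(2\pi\sigma^{2})^{n}}{\int d^{2n}\beta'\, e^{-\frac{|\beta'|^{2}}{2\sigma^{2}}}\, G^{o_{\le k}}(\beta')/(2\pi\sigma^{2})^{n}}. \tag{S202} \]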
Therefore, we obtain
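\[ G_{\sigma}^{o_{\le k}}(\gamma) = \prod_{i=1}^{2n}\exp\left[\frac{1}{2}\left(\frac{-d_i z_i'^{2} + 4i\,u_i' z_i'}{1 + d_i\sigma^{2}}\right)\right], \quad\text{and}\quad \left|G_{\sigma}^{o_{\le k}}(\gamma)\right| = \prod_{i=1}^{2n}\exp\left[\frac{-d_i z_i'^{2}}{2(1 + d_i\sigma^{2})}\right]. \tag{S206} \]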
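A single factor of Eq. (S206) can be checked numerically. The sketch below assumes that, in the rotated coordinates, each factor of $G^{o_{\le k}}$ has the form $\exp(-d v^{2}/2 + 2iuv)$ (cf. Eq. (S200)) and compares a one-dimensional trapezoidal evaluation of the ratio in Eq. (S202) with the closed form. Function names, grid settings, and the test values are illustrative assumptions.

```python
import cmath
import math

def smoothed_ratio(d, u, z, sigma2, half_width=12.0, steps=4001):
    """One-dimensional version of the ratio in Eq. (S202), evaluated numerically."""
    def integral(center):
        # Trapezoidal rule for exp(-(v-center)^2/(2 sigma^2)) * exp(-d v^2/2 + 2i u v).
        h = 2.0 * half_width / (steps - 1)
        total = 0j
        for j in range(steps):
            v = -half_width + j * h
            weight = 0.5 if j in (0, steps - 1) else 1.0
            total += weight * cmath.exp(
                -(v - center) ** 2 / (2 * sigma2) - 0.5 * d * v * v + 2j * u * v)
        return total * h
    return integral(z) / integral(0.0)

def closed_form(d, u, z, sigma2):
    """One factor of Eq. (S206): exp[(-d z^2 + 4 i u z) / (2 (1 + d sigma^2))]."""
    return cmath.exp(0.5 * (-d * z * z + 4j * u * z) / (1 + d * sigma2))

# Illustrative values only.
d, u, z, sigma2 = 1.3, 0.4, 0.7, 0.5
numeric = smoothed_ratio(d, u, z, sigma2)
exact = closed_form(d, u, z, sigma2)
modulus = math.exp(-d * z * z / (2 * (1 + d * sigma2)))
print(abs(numeric - exact), abs(abs(numeric) - modulus))
```

The normalization $(2\pi\sigma^{2})^{n}$ cancels in the ratio, so it is omitted from the numerical integrals.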
Substituting this back to Eq. (S184), we obtain the following upper bound for the average TVD
\[ \mathbb{E}_{\gamma}\,\mathrm{TVD}(p_1, p_{2,\gamma}) \le 16N\epsilon_0^{2}\sqrt{\prod_{i=1}^{2n}\frac{1}{1 + 2d_i\sigma_{\gamma}^{2}/(1 + d_i\sigma^{2})}}, \tag{S208} \]
which, combined with the lower bound Eq. (S183) and substituting ϵ0 = ϵ/0.98, yields the following
sample complexity bound
\[ N \ge 0.01\,\epsilon^{-2}\sqrt{\prod_{i=1}^{2n}\left(1 + \frac{2d_i\sigma_{\gamma}^{2}}{1 + d_i\sigma^{2}}\right)}. \tag{S209} \]
To find a lower bound independent of di ’s, focus on the following product, for any 1 ≤ i ≤ n,
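\[ \left(1 + \frac{2d_i\sigma_{\gamma}^{2}}{1 + d_i\sigma^{2}}\right)\left(1 + \frac{2d_{i+n}\sigma_{\gamma}^{2}}{1 + d_{i+n}\sigma^{2}}\right). \tag{S210} \]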
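This is an increasing function in $d_i$ and $d_{i+n}$. We know the spectrum satisfies $d_i d_{i+n} \ge 4$. Hence,
we can lower bound it by setting $d_{i+n}/2 = 2/d_i := d > 0$, which leads to
\[ \left(1 + \frac{2d_i\sigma_{\gamma}^{2}}{1 + d_i\sigma^{2}}\right)\left(1 + \frac{2d_{i+n}\sigma_{\gamma}^{2}}{1 + d_{i+n}\sigma^{2}}\right) \ge \frac{(d + 2\sigma^{2} + 4\sigma_{\gamma}^{2})\left(1 + 2d(\sigma^{2} + 2\sigma_{\gamma}^{2})\right)}{(d + 2\sigma^{2})(1 + 2d\sigma^{2})}. \tag{S211} \]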
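Meanwhile, when d → 0 or d → ∞, it becomes $1 + 2\sigma_{\gamma}^{2}/\sigma^{2}$. We thus have the following lower bound,
\[ \left(1 + \frac{2d_i\sigma_{\gamma}^{2}}{1 + d_i\sigma^{2}}\right)\left(1 + \frac{2d_{i+n}\sigma_{\gamma}^{2}}{1 + d_{i+n}\sigma^{2}}\right) \ge \min\left\{1 + \frac{2\sigma_{\gamma}^{2}}{\sigma^{2}},\ \left(1 + \frac{4\sigma_{\gamma}^{2}}{1 + 2\sigma^{2}}\right)^{2}\right\}, \tag{S213} \]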
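The minimization over d behind Eq. (S213) can be explored numerically. The sketch below scans the right-hand side of Eq. (S211) over a log-spaced grid of d and compares the result with the claimed minimum. The function names, the grid range, and the two parameter choices are our own illustrative assumptions.

```python
def pair_product(d, sigma2, sigma_gamma2):
    """Right-hand side of Eq. (S211) as a function of d > 0."""
    num = (d + 2 * sigma2 + 4 * sigma_gamma2) * (1 + 2 * d * (sigma2 + 2 * sigma_gamma2))
    den = (d + 2 * sigma2) * (1 + 2 * d * sigma2)
    return num / den

def claimed_minimum(sigma2, sigma_gamma2):
    """Right-hand side of Eq. (S213)."""
    edge = 1 + 2 * sigma_gamma2 / sigma2                      # limit d -> 0 or d -> infinity
    interior = (1 + 4 * sigma_gamma2 / (1 + 2 * sigma2)) ** 2  # value at d = 1
    return min(edge, interior)

def scanned_minimum(sigma2, sigma_gamma2, points=4001):
    """Minimum of Eq. (S211) over a log-spaced grid of d in [1e-6, 1e6]."""
    grid = (10 ** (-6 + 12 * j / (points - 1)) for j in range(points))
    return min(pair_product(d, sigma2, sigma_gamma2) for d in grid)

# Illustrative parameters: small sigma^2 (minimum at d = 1) and large sigma^2 (edge limit).
gap_small = scanned_minimum(0.01, 0.5) - claimed_minimum(0.01, 0.5)
gap_large = scanned_minimum(10.0, 0.5) - claimed_minimum(10.0, 0.5)
print(gap_small, gap_large)
```

Because the expression depends on d only through d + 1/d and is monotone in that combination, its infimum is either the value at d = 1 or the limiting value at the edges, which is what the scan reflects. In the second case the infimum is approached but not attained, so the scanned value sits slightly above the bound.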
In this section, we derive a condition under which truncating a multivariate normal
distribution removes probability mass of at most 0.5, which is used to derive the lower bound for
entanglement-free schemes. Consider a multivariate normal distribution:
\[ q(x) = \left(\frac{1}{2\pi\sigma^{2}}\right)^{n}\exp\left(-\frac{|x|^{2}}{2\sigma^{2}}\right), \tag{S215} \]
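where $x \in \mathbb{R}^{2n}$. Although the main text considers $\gamma \in \mathbb{C}^{n}$, the two descriptions are equivalent. Now,
we consider the distribution truncated to $|x|^{2} \le R^{2}$ for a given R:
\begin{align}
\int_{|x|\le R} dx\, q(x) &= \int_{|x|\le R} dx \left(\frac{1}{2\pi\sigma^{2}}\right)^{n}\exp\left(-\frac{|x|^{2}}{2\sigma^{2}}\right) \tag{S216}\\
&= \left(\frac{1}{2\pi\sigma^{2}}\right)^{n}\int_{0}^{R} dr \int d\Omega_{2n}\, r^{2n-1}\exp\left(-\frac{r^{2}}{2\sigma^{2}}\right) \tag{S217}\\
&= 1 - \frac{\Gamma\left(n, \frac{R^{2}}{2\sigma^{2}}\right)}{\Gamma(n)}, \tag{S218}
\end{align}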
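where we have used the following integrals:
\[ \int d\Omega_{n} = \frac{2\pi^{n/2}}{\Gamma(n/2)}, \qquad \int_{0}^{R} dr\, r^{2n-1}\exp\left(-\frac{r^{2}}{2\sigma^{2}}\right) = 2^{n-1}\sigma^{2n}\left[\Gamma(n) - \Gamma\left(n, \frac{R^{2}}{2\sigma^{2}}\right)\right], \tag{S219} \]
and
\[ \Gamma(n, x) = \int_{x}^{\infty} t^{n-1}e^{-t}\,dt \tag{S220} \]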
is the (upper) incomplete gamma function and Γ(n) = Γ(n, 0). Therefore, the tail probability is
given by $\Gamma\left(n, \frac{R^{2}}{2\sigma^{2}}\right)/\Gamma(n)$. In the main text and the proof of the sample complexity lower bound for
entanglement-free schemes, we choose $2\sigma^{2} = 0.99\kappa$ and $R^{2} = \kappa n$. For our purpose, it suffices to
show that $\Gamma(n, n/0.99)/\Gamma(n) \le 0.5$. To see this, we use the following inequality [14]:
\[ \frac{\Gamma(n, kn)}{\Gamma(n)} \le \left(ke^{1-k}\right)^{n}, \quad \forall\, k > 1. \tag{S221} \]
First notice that for k = 1/0.99 and n = 14000, $(ke^{1-k})^{n} \le 0.492$. Now, for every n < 14000, one
can numerically verify $\Gamma(n, n/0.99)/\Gamma(n) \le 0.5$ (see Fig. S5). For n > 14000, the upper bound $(ke^{1-k})^{n}$
monotonically decreases with n, so we also have $\Gamma(n, n/0.99)/\Gamma(n) \le 0.492$. Combining these cases
completes the proof.
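The numerical part of this argument can be reproduced for a few values of n with a short sketch. It uses the identity $\Gamma(n, x)/\Gamma(n) = e^{-x}\sum_{j=0}^{n-1} x^{j}/j!$, valid for integer n ≥ 1, and evaluates the sum in log space. The function names and the sampled values of n are our own choices; this is not the code used to produce Fig. S5.

```python
import math

def regularized_upper_gamma(n, x):
    """Gamma(n, x)/Gamma(n) for integer n >= 1, via e^{-x} * sum_{j<n} x^j / j!."""
    # Work with log-terms and factor out the largest one to avoid overflow.
    log_terms = [j * math.log(x) - math.lgamma(j + 1) - x for j in range(n)]
    peak = max(log_terms)
    return math.exp(peak) * sum(math.exp(t - peak) for t in log_terms)

def chernoff_bound(n, k):
    """Right-hand side of Eq. (S221): (k e^{1-k})^n."""
    return math.exp(n * (math.log(k) + 1.0 - k))

k = 1 / 0.99
# A few sample values of n (illustrative, not the full range n < 14000).
tails = {n: regularized_upper_gamma(n, k * n) for n in (1, 8, 100, 1000)}
bound_at_14000 = chernoff_bound(14000, k)
print(tails, bound_at_14000)
```

For n = 1 the tail reduces to $e^{-k}$, which gives a quick consistency check of the helper.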
FIG. S5. Numerical verification that the Gaussian tail probability is upper bounded by 0.5 for n up to 14000.
[1] A. Ferraro, S. Olivares, and M. G. Paris, Gaussian states in continuous variable quantum information,
arXiv preprint quant-ph/0503237 (2005).
[2] C. Weedbrook, S. Pirandola, R. García-Patrón, N. J. Cerf, T. C. Ralph, J. H. Shapiro, and S. Lloyd,
Gaussian quantum information, Reviews of Modern Physics 84, 621 (2012).
[3] A. Serafini, Quantum continuous variables: a primer of theoretical methods (CRC press, 2017).
[4] H.-Y. Huang, R. Kueng, and J. Preskill, Information-theoretic bounds on quantum advantage in machine
learning, Physical Review Letters 126, 190505 (2021).
[5] D. Aharonov, J. Cotler, and X.-L. Qi, Quantum algorithmic measurement, Nature Communications 13,
1 (2022).
[6] S. Chen, S. Zhou, A. Seif, and L. Jiang, Quantum advantages for Pauli channel estimation, Physical
Review A 105, 032435 (2022).
[7] S. Chen, C. Oh, S. Zhou, H.-Y. Huang, and L. Jiang, Tight bounds on Pauli channel learning without
entanglement, arXiv preprint arXiv:2309.13461 (2023).
[8] S. Chen and W. Gong, Futility and utility of a few ancillas for Pauli channel learning, arXiv preprint
arXiv:2309.14326 (2023).
[9] H.-Y. Huang, M. Broughton, J. Cotler, S. Chen, J. Li, M. Mohseni, H. Neven, R. Babbush, R. Kueng,
J. Preskill, and J. R. McClean, Quantum advantage in learning from experiments, Science 376, 1182
(2022).
[10] K. Kornelson and D. Larson, Rank-one decomposition of operators and construction of frames, Contem-
porary Mathematics 345, 203 (2004).
[11] K. R. Castleman, Digital image processing (Prentice Hall Press, 1996).
[12] S. Barnett and P. M. Radmore, Methods in theoretical quantum optics, Vol. 15 (Oxford University Press,
2002).
[13] C. M. Caves and K. Wódkiewicz, Fidelity of Gaussian channels, Open Systems & Information Dynamics
11, 309 (2004).
[14] M. Ghosh, Exponential tail bounds for chisquared random variables, Journal of Statistical Theory and
Practice 15, 1 (2021).