1 Introduction

Measurements are one of the crucial elements of quantum theory. Compared to the usual notions of observables as self-adjoint operators, or of von Neumann measurements, there exists a more general description of measurements, the so-called positive operator-valued measures (POVMs). POVMs are defined by a set of positive operators summing up to the identity operator [1]. Being more general, POVMs outperform projective measurements for many tasks in quantum information theory, including quantum tomography [2], unambiguous discrimination of quantum states [3], state estimation [4], quantum key distribution [5,6,7], information acquisition from a quantum source [8], Bell inequalities [9,10,11] or device-independent quantum information protocols [12, 13].

The current era of quantum computation, dubbed noisy intermediate-scale quantum (NISQ) by John Preskill [14], is characterized by devices that still provide limited resources. Firstly, their size is limited to a small number of qubits. Secondly, the number of computation steps (circuit depth) and their precision are still strongly limited by decoherence in current technology, making the number of effectively usable qubits much smaller.

The final restriction concerns implementable quantum measurements: current devices are tuned to perform only projective measurements in the computational basis. While projective measurements are the ideal, in reality the measurements are noisy as well. At the lowest level of physical qubits one could possibly tune the measurements to perform an arbitrary measurement; in this paper, we do not address this topic.

Our aim is to provide a technique for implementing POVM measurements with the stated limited resources using the gate formalism, focusing mostly on the circuit depth/width trade-off. We do not concentrate on noisy measurements [15, 16]; the implementation is analyzed in the idealized scenario of perfect implementation. It shows the fundamental possibility of such a realization and provides the basic building blocks to do so.

Suppose we want to perform an n-outcome (POVM) measurement in a d-dimensional Hilbert space, \(\textsf{A}=\{A_j\}_{j=0}^{n-1}\). By a naive interpretation of the Naimark theorem [1], one needs an ancillary Hilbert space of dimension up to dn. This is a single-step measurement procedure, where the measurement on the whole dn-dimensional space at some point provides complete outcome information; this, however, requires \(\log _2 (dn)\) qubits, and the decomposition into basic gates will increase the depth of the circuit considerably. So far the best approach needs an additional Hilbert space of the same dimension as the original Hilbert space, irrespective of the number of outcomes of the measurement in question [17].

At the other end lies the result of Ref. [18], where the spatial resources (system dimension) are traded for a decreased success probability. In that paper, the whole measurement is performed on the original d-dimensional Hilbert space, but it is successful only with probability 1/d. The lowered success rate is in general inevitable.

At one extreme, a simple dilation may need more resources than are available. At the other extreme, measurements limited to the original Hilbert space decrease the probability of success, and one does not have direct access to the measurement \(\textsf{A}\): one can only reconstruct the statistics by post-selecting the obtained data, and the per-shot relation of the outcomes to the measurement \(\textsf{A}\) cannot be interpreted directly. We would like to explore how to retain reasonable memory requirements (number of additional qubits) while not decreasing the success probability. We explore only the technical possibilities; the paper does not discuss the philosophy of what constitutes a direct measurement of a POVM.

Fig. 1

An example of a coarse-graining illustrating a possible implementation of a POVM in a sequential way utilizing two-outcome measurements. Imagine a measurement \(\textsf{A}=\{A_j\}_{j\in [6]}\) with six outcomes labeled a to f. Measurement \(\textsf{B}=\{B_0,B_1\}\) is a coarse-graining of \(\textsf{A}\) having two outcomes, one being a collective outcome for the labels a and b of the measurement \(\textsf{A}\) and the other one being a collective outcome for the labels c to f of the measurement \(\textsf{A}\). The idea of the paper is to use such two-outcome coarse-grainings in a sequential way to perform the measurement \(\textsf{A}\). In this case, if the measurement \(\textsf{B}\) gives the collective label for a and b, it is followed by the measurement \(\textsf{C}'\) giving a definitive answer a or b. If, however, the \(\textsf{B}\)-measurement gives the collective label for c to f, it is followed first by the measurement \(\textsf{C}''\), and based on its outcome either \(\textsf{D}'\) or \(\textsf{D}''\) is performed and provides one of the definitive outcomes of the measurement \(\textsf{A}\).

Inspired by [19], we will concentrate on the next simplest model after the no-ancilla approach. We will use only a single ancillary qubit, which allows us to perform simple (two-outcome) measurements. We shall explore this option with the aim of determining practical ways of performing complex measurements as a sequence of simple ones, as depicted in Fig. 1. A similar approach using a single ancilla was presented in [20], with the difference that our approach does not require post-selection. The price to pay in our case is a setup requiring deeper circuits and longer coherence times.

As noted, our approach is not entirely new and partial results can be found in other works, most importantly [19], where the foundation for the sequential POVM implementation was laid. Other results contain some steps of the construction but often lack generality; quite often they are tailored to specific qubit implementations. We compare the different approaches in Sect. 2.4. In this paper, we provide a general gate implementation procedure for any d-dimensional qubit-based system, including the description of all steps of the process and all building blocks in one place. One can also view our results as partially following from more general approaches of single-qubit ancilla-driven computation [21, 22].

1.1 Generalized measurements

Generalized measurements, or positive operator-valued measures (POVMs), are a general way of describing measurements in quantum theory. In the finite-outcome case we are about to study, an n-outcome POVM \(\textsf{A}\) is represented by a set of operators \(\textsf{A}=\{A_j\}_{j \in [n]}\), where \([n]=\{0,1,\ldots ,n-1\}\). Operator \(A_j\) corresponds to the outcome j; given a state \(\rho \) on which we perform measurement \(\textsf{A}\), outcome j is obtained with probability \(p(A_j|\rho )\) given by the Born formula, \(p(A_j|\rho )=\textrm{tr}\left[ A_j\rho \right] \). This demands that the operator \(A_j\) is positive semi-definite, \(A_j\ge 0\); these operators are called effects.

We also require that the probabilities sum up to one,

$$\begin{aligned} 1=\sum _{j\in [n]} p(A_j|\rho ) = \sum _{j\in [n]}\textrm{tr}\left[ A_j\rho \right] = \textrm{tr}\left[ \rho \sum _{j\in [n]} A_j\right] . \end{aligned}$$

As this has to hold for all states \(\rho \), it follows that the sum of the POVM effects equals identity,

$$\begin{aligned} \sum _{j\in [n]} A_j = \mathbbm {1}. \end{aligned}$$

Note that von Neumann (projective) measurements are a special case of POVMs, as any projective measurement is described by a set of mutually orthogonal projections, which are also effects.
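The two defining conditions and the Born formula can be checked numerically. The following is a minimal sketch using NumPy; the helper names `is_povm`, `born_probabilities`, and `trine_povm`, as well as the three-outcome qubit example itself, are our illustrative assumptions and are not part of the construction presented in this paper.

```python
import numpy as np

def is_povm(effects, tol=1e-9):
    """True if every effect is positive semi-definite and they sum to the identity."""
    d = effects[0].shape[0]
    positive = all(np.linalg.eigvalsh(A).min() >= -tol for A in effects)
    normalized = np.allclose(sum(effects), np.eye(d), atol=tol)
    return positive and normalized

def born_probabilities(effects, rho):
    """Born formula p(A_j | rho) = tr[A_j rho] for every outcome j."""
    return np.array([np.trace(A @ rho).real for A in effects])

def trine_povm():
    """Illustrative three-outcome qubit POVM (an assumption, not from the text)."""
    effects = []
    for j in range(3):
        theta = 2 * np.pi * j / 3
        psi = np.array([np.cos(theta / 2), np.sin(theta / 2)])
        effects.append((2 / 3) * np.outer(psi, psi))
    return effects

A = trine_povm()
rho = np.array([[1.0, 0.0], [0.0, 0.0]])   # the state |0><0|
p = born_probabilities(A, rho)             # three probabilities summing to one
```

For this example the effects are not projections, yet the probabilities for the state \(|0\rangle \) evaluate to 2/3, 1/6 and 1/6, which sum to one as required.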

For the purposes of this paper, we also define the notion of coarse-graining as exemplified in Fig. 1. Let us have a partition \(P=\{P_k\}_{k\in [w]}\) (for some number of parts w) of the outcomes of the measurement \(\textsf{A}=\{A_j\}_{j\in [n]}\), namely \(P_k\subseteq [n]\) such that \(\cup _{k} P_k=[n]\) and \(P_j\cap P_k=\emptyset \) for all \(j\ne k\). A coarse-graining is a measurement \(\textsf{B}=\{B_k\}_{k}\) that combines outcomes according to the given partition P, \(B_k=\sum _{j\in P_k} A_j\). Later in this work we will restrict ourselves to dichotomic coarse-grainings, where the partition has only two parts.

Fig. 2

Examples of possible measurement procedures. Panel a depicts an outcome-decreasing procedure for the measurement \(\textsf{A}\) from Fig. 1, where every measurement eliminates one outcome. In this case, the measurement eliminating label x is denoted by \(\textsf{B}^x\). Panel b shows a binary-search procedure for the same measurement \(\textsf{A}\), in which the number of possible outcomes is (roughly) halved in every step. Dark regions correspond to measured effects and circled outputs represent labels of the given measurement.

We will also use the term fine-graining, which is the opposite of coarse-graining, i.e., it corresponds to splitting the effects \(B_k\) into sub-effects, providing a finer measurement. For example, given a measurement \(\textsf{A}=\{A_0, A_1, A_2, A_3\}\), a coarse-graining can be a measurement \(\textsf{B}=\{B_0,B_1\}\), where \(B_0=A_0+A_1+A_3\) and \(B_1=A_2\). Measurement \(\textsf{A}\) is then a fine-graining of \(\textsf{B}\).

We also note that occasionally we will distinguish between an outcome and its label: while the effects of a POVM are in this paper usually indexed by the elements of the set [n], in general they can be assigned different labels (as can be seen in Fig. 1 or Fig. 2). This is particularly useful when we need more descriptive information about the outcome, e.g., in the case of collective outcomes. It can also be useful in the case of the binary (dichotomic, or two-outcome) measurements that we use, where the particular effects are indexed by the measurement outcome value from the computational basis (either 0 or 1), but the labels provide an interpretation of the given outcome. This distinction will not always be followed in this paper, and sometimes the labels will be the same as the outcomes. It should, however, be clear from the context which form of labeling or indexing is meant.
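As a small illustration of the definition, a coarse-graining can be computed by summing the effects within each part of the partition. The sketch below is written under our own assumptions: the function name `coarse_grain` and the four-outcome qubit POVM are hypothetical and serve only to reproduce the example \(B_0=A_0+A_1+A_3\), \(B_1=A_2\).

```python
import numpy as np

def coarse_grain(effects, partition):
    """Return B_k = sum_{j in P_k} A_j after checking that `partition`
    really is a partition of the outcome indices [n]."""
    n = len(effects)
    flat = sorted(j for part in partition for j in part)
    if flat != list(range(n)):
        raise ValueError("parts must be disjoint and cover all outcomes")
    return [sum(effects[j] for j in part) for part in partition]

# Hypothetical four-outcome qubit POVM: z- and x-basis projectors, each weighted by 1/2.
ket0, ket1 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
plus, minus = (ket0 + ket1) / np.sqrt(2), (ket0 - ket1) / np.sqrt(2)
A = [0.5 * np.outer(v, v) for v in (ket0, ket1, plus, minus)]

# Dichotomic coarse-graining with parts {0, 1, 3} and {2}.
B = coarse_grain(A, [[0, 1, 3], [2]])
```

The resulting pair `B` is again a POVM, since the parts are disjoint and cover all outcomes.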

1.2 Naimark dilation theorem

To put our result into perspective, in this subsection we present a mathematical model of performing a POVM measurement by extending the given Hilbert space to a higher-dimensional Hilbert space and subsequently performing a von Neumann measurement on the larger space.

Let \(\{F_i\}_{i\in [n]}\) be a POVM acting on Hilbert space \(\mathcal {H}_A\) of dimension \(d_A\). Then, there exists a projective measurement \(\{P_i\}_{i\in [n]}\) acting on the Hilbert space \(\mathcal {H}_{A'}\) of dimension \(d_{A'}\) and an isometry \(S:\mathcal {H}_A\longrightarrow \mathcal {H}_{A'}\) such that for all i

$$\begin{aligned} F_i=S^{\dagger }P_i S. \end{aligned}$$
(1)

A naive (and inefficient) way to construct such projective measurement and isometry is to let \(\mathcal {H}_{A'}=\mathcal {H}_A\otimes \mathcal {H}_B\), \(P_i=I_A\otimes |i\rangle _B\langle i|\), and

$$\begin{aligned} S=\sum _{i\in [n]}\sqrt{F_i}_A\otimes |i\rangle _B. \end{aligned}$$
(2)

This construction, however, requires a system of large dimension, namely \(d_{A'}=nd_A\). This approach to POVMs can be turned into a physical implementation by extending the isometry S to a unitary operation U that fulfills

$$\begin{aligned} S=U(I_A\otimes |0\rangle _B). \end{aligned}$$
(3)
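The naive construction of Eqs. (1) and (2) is easy to verify numerically. The following sketch assumes NumPy and a hypothetical four-outcome qubit POVM; the helper names `psd_sqrt` and `naive_isometry` are ours. It builds S as a \(nd_A\times d_A\) matrix and checks Eq. (1) for every outcome.

```python
import numpy as np

def psd_sqrt(M):
    """Square root of a positive semi-definite matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(M)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T

def naive_isometry(effects):
    """S = sum_i sqrt(F_i) (x) |i>_B as in Eq. (2); shape (d*n, d)."""
    n = len(effects)
    basis = np.eye(n)
    return sum(np.kron(psd_sqrt(F), basis[:, [i]]) for i, F in enumerate(effects))

# Hypothetical four-outcome qubit POVM used only for illustration.
ket0, ket1 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
plus, minus = (ket0 + ket1) / np.sqrt(2), (ket0 - ket1) / np.sqrt(2)
F = [0.5 * np.outer(v, v) for v in (ket0, ket1, plus, minus)]

d, n = 2, len(F)
S = naive_isometry(F)
# Projectors P_i = I_A (x) |i><i|_B on the dilated space.
P = [np.kron(np.eye(d), np.outer(np.eye(n)[i], np.eye(n)[i])) for i in range(n)]
recovered = [S.conj().T @ P_i @ S for P_i in P]   # should reproduce F_i, Eq. (1)
```

Here the dilated space has dimension \(nd_A=8\), i.e., two additional qubits are needed for a single-qubit system.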

A more dimension-efficient approach was designed by Peres [1], where the construction requires dimension

$$\begin{aligned} d_{A'} = \sum _{i\in [n]} {{\,\textrm{rank}\,}}{F_i}. \end{aligned}$$

In [17], the authors provide another construction of dilation requiring an ancillary system of the same dimension as the original system irrespective of the number of outcomes.

In this work, we will similarly extend the studied system, but only by a single qubit. This dilation by a qubit, however, limits the possibilities for our intended measurements. Namely, we cannot expect to be able to perform a measurement with more than two outcomes (on the ancillary qubit system). This, in turn, determines how we approach the problem of measuring more outcomes: we will look at the possibility of splitting the measurement into a sequence of two-outcome measurements.

Note that the particularities of the device used may play a role in the size of the required ancilla. A single ancillary qubit suffices only if it can be dynamically reset. If this is not the case, one needs a separate single-qubit ancilla for each measurement in the sequence.

1.3 Measurements with state changes

POVMs describe measurements only from the perspective of outcomes and their probabilities. They do not, however, describe what happens to the measured state. Since in the sequential implementation we want to reuse the system for fine-graining, we also need to be able to describe measurements with a state change.

In the case of von Neumann measurements, the change to the state \(\rho \) when outcome j corresponding to projector \(P_j\) is measured, is given as \(\tilde{\rho }_j = P_j\rho P_j\). The operator \(\tilde{\rho }_j\) is not normalized, and its normalization provides both the outcome state

$$\begin{aligned} \rho _j = \frac{P_j\rho P_j}{\textrm{tr}\left[ P_j\rho \right] } \end{aligned}$$

and the probability of getting this outcome, \(p(P_j|\rho )=\textrm{tr}\left[ P_j\rho \right] \).

For a general POVM, we will describe these measurement-induced state changes as instruments. An instrument \(\mathcal {I}\), corresponding to the measurement \(\textsf{A}\) is a set of completely positive trace non-increasing maps \(\mathcal {I}=\{\mathcal {I}_j\}_j\) such that

$$\begin{aligned} \textrm{tr}\left[ \mathcal {I}_j(\rho )\right] = \textrm{tr}\left[ A_j\rho \right] \end{aligned}$$
(4)

which has to hold for all states \(\rho \). The positivity of \(A_j\) translates to the requirement that \(\mathcal {I}_j\) is completely positive, while the summation condition for \(\textsf{A}\) translates to the requirement that the sum of the \(\mathcal {I}_j\)’s is a channel (completely positive trace-preserving map), which implies the trace non-increasing property of the particular \(\mathcal {I}_j\)’s. As before, the operators \(\tilde{\rho }_j=\mathcal {I}_j(\rho )\), representing what happens to the state \(\rho \) when outcome j is observed, are not normalized, with the probability \(p(A_j|\rho )\) of obtaining the outcome being the normalization factor, i.e., the outcome state is given as

$$\begin{aligned} \rho _j = \frac{1}{p(A_j|\rho )}\mathcal {I}_j(\rho ) = \frac{\mathcal {I}_j(\rho )}{\textrm{tr}\left[ \mathcal {I}_j(\rho )\right] } = \frac{\mathcal {I}_j(\rho )}{\textrm{tr}\left[ A_j\rho \right] }. \end{aligned}$$

We will use measurement as the name for both POVMs and instruments; however, it should be clear which one is used.

An important thing to note is that while for von Neumann measurements the presented state change is unique, in the general case of POVMs, the choice is not unique. Different choices can affect the state in different ways and, in particular, can lead to various degrees of state disturbances. For example, the instrument

$$\begin{aligned} \mathcal {I}_j(\rho ) = \textrm{tr}\left[ A_j\rho \right] \omega \end{aligned}$$

for some state \(\omega \) destroys the original state \(\rho \) completely. It is therefore natural to try to find the least disturbing choices, especially, when the resulting state is to be used later.

In [23, Prop. 5.17], it was shown that the so-called Lüders measurements (or instruments), given by the prescription

$$\begin{aligned} \mathcal {I}_j(\rho )\equiv \mathcal {L}_j(\rho )=A_j^{1/2}\rho A_j^{1/2}, \end{aligned}$$
(5)

are the least disturbing in the following sense: any measurement can be realized as a Lüders measurement followed by some outcome-dependent state change. This makes them a straightforward choice in our endeavor.
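To make the difference between instruments concrete, the sketch below compares the Lüders update of Eq. (5) with the state-destroying instrument \(\mathcal {I}_j(\rho )=\textrm{tr}\left[ A_j\rho \right] \omega \) mentioned above. It assumes NumPy; the effect, the input state, the choice of \(\omega \), and the function names are hypothetical choices made only for illustration.

```python
import numpy as np

def psd_sqrt(M):
    """Square root of a positive semi-definite matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(M)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T

def luders_update(A_j, rho):
    """Return (probability, normalized post-measurement state) for Eq. (5)."""
    root = psd_sqrt(A_j)
    unnormalized = root @ rho @ root
    p = np.trace(unnormalized).real
    return p, (unnormalized / p if p > 1e-12 else None)

def destroying_update(A_j, rho, omega):
    """Instrument tr[A_j rho] * omega: same statistics, but the state is replaced."""
    return np.trace(A_j @ rho).real, omega

A_j = np.diag([0.8, 0.2])                      # hypothetical qubit effect
rho = 0.5 * np.array([[1.0, 1.0], [1.0, 1.0]])  # the state |+><+|
omega = 0.5 * np.eye(2)                        # hypothetical fixed output state

p_luders, rho_luders = luders_update(A_j, rho)
p_destroy, rho_destroy = destroying_update(A_j, rho, omega)
```

Both instruments yield the same outcome probability, 0.5 in this example, but only the Lüders output still depends on the input state; here it remains pure.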

Note that there are a number of possible additional criteria that may give preference to instruments other than Lüders instruments, such as hardware specifics, noise considerations, or preferences following from a particular measurement goal. This is, however, beyond the scope of this paper.

2 Measuring with limited resources

As we noted before, current quantum devices provide us with highly limited resources. If a desired measurement is more complicated, these resources might not even allow us to implement it. A straightforward idea is to split the measurement into a sequence of binary measurements as depicted in Fig. 1. While in the classical world such an action poses no problems, in the quantum case we know that every measurement disturbs the measured state. The question arises whether it is possible to devise a general procedure that would allow us to implement the measurement in such a sequential way. The question has two parts: (i) whether such splitting is theoretically possible in general, and (ii) if so, whether the procedure is implementable.

The answer to the first question has to a large extent been provided in [19]. In this paper, we present a slightly different way of obtaining the result, and later we apply this procedure to study unambiguous state discrimination. We start by showing that the Lüders measurements allow for fine-graining of results, answering point (i). Then we show how to implement qubit-assisted measurements, which in turn allows us to implement this procedure on current quantum devices based on qubit registers. This answers point (ii).

2.1 POVM as a sequence of binary measurements

Let us consider a measurement \(\textsf{A}\) and its coarse-graining \(\textsf{B}\). We will consider only a two-outcome coarse-graining, as (i) we want to study the possibilities of a single ancillary qubit, which distinguishes only two outcomes, and (ii) the extension of the analysis to a higher number of outcomes is straightforward. Let us have \(\textsf{B}=\{B,\mathbbm {1}-B\}\), with Q being the subset of outcome indices of measurement \(\textsf{A}\) that defines B, i.e., \(B=\sum _{j\in Q} A_j\).

Let us now take the case when the effect B was measured on the input state \(\rho \) and the state change is described by the Lüders measurement from Eq. (5). The (unnormalized) state now is \(\tilde{\rho }=B^{1/2}\rho B^{1/2}\). If we now want to fine-grain the results to obtain information about outcomes from Q, we cannot perform measurement \(\textsf{A}\) on the state \(\tilde{\rho }\) any more. We need to design a new measurement adjusted for the fact that the previous measurement \(\textsf{B}\) has already been done, the particular outcome corresponding to the effect B has been obtained, and the state has changed.

In fact, what we want is to find measurement \(\textsf{A}'=\{A'_j\}_{j\in Q}\) such that the following holds

$$\begin{aligned} \textrm{tr}\left[ \tilde{\rho }A'_j\right] =\textrm{tr}\left[ \rho A_j\right] . \end{aligned}$$
(6)

Expanding the left hand side we see

$$\begin{aligned} \textrm{tr}\left[ \tilde{\rho }A'_j\right] =\textrm{tr}\left[ B^{1/2}\rho B^{1/2}A'_j\right] =\textrm{tr}\left[ \rho B^{1/2}A'_jB^{1/2}\right] . \end{aligned}$$

Since this has to be equal to \(\textrm{tr}\left[ \rho A_j\right] \) for all \(\rho \), we obtain the condition for \(A'_j\) stating that

$$\begin{aligned} A_j=B^{1/2}A'_j B^{1/2}. \end{aligned}$$
(7)

Let us take the Moore-Penrose pseudoinverse of \(B^{1/2}\), which we denote simply as \(B^{-1/2}\). We see that \(A'_j\) given by

$$\begin{aligned} A'_j=B^{-1/2}A_j B^{-1/2} \end{aligned}$$
(8)

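satisfies Eq. (7): the product \(B^{1/2}B^{-1/2}\) is the projection onto the support of B, which leaves each \(A_j\) with \(j\in Q\) unchanged because \(0\le A_j\le B\). Since \(A_j\ge 0\) and \(B^{-1/2}\) is self-adjoint, we also have \(A'_j\ge 0\). It remains to verify the normalization,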

$$\begin{aligned} \sum _{j\in Q} A'_j=\sum _{j\in Q}B^{-1/2}A_j B^{-1/2} = B^{-1/2}BB^{-1/2}=\mathbbm {1}_B, \end{aligned}$$

where \(\mathbbm {1}_B\) is the identity (projection) on the support of B. We need not be concerned with the rest of the Hilbert space, as for the operator supports we have \({{\,\textrm{supp}\,}}A_j\subseteq {{\,\textrm{supp}\,}}B\) and also \({{\,\textrm{supp}\,}}\tilde{\rho }\subseteq {{\,\textrm{supp}\,}}B\). This means that while the transformed state loses information outside \({{\,\textrm{supp}\,}}B\), the subsequent measurements act only within \({{\,\textrm{supp}\,}}B\) anyway. So we see that Eq. (8) is sufficient to define the subsequent measurement that fulfills the condition from Eq. (6). As for the particular implementation, \(\mathbbm {1}-\mathbbm {1}_B\) can be defined as an additional outcome in order for \(\textsf{A}'\) to be a POVM on the whole Hilbert space, but the probability of obtaining this outcome will be zero. Alternatively, one can supplement the effects of \(\textsf{A}'\) by parts from the orthocomplement of the support of B so that they sum up to identity. This will not change the probabilities, as \(\tilde{\rho }\) does not have support in this part of the space.

We can consider a number of possible strategies for partitioning a particular measurement into a measurement tree as exemplified in Fig. 1. Their usefulness may depend on hardware specifics, noise considerations, or preferences following from a particular measurement goal. Disregarding these considerations, two partitioning strategies stand out.
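The relations in Eqs. (6) to (8) can be verified on a small example. The sketch below assumes NumPy and a hypothetical three-outcome qutrit POVM chosen so that B is rank-deficient, which exercises the pseudoinverse; the helper names `psd_power` and `refine` are our assumptions.

```python
import numpy as np

def psd_power(M, power, tol=1e-12):
    """M**power restricted to the support of a PSD matrix M
    (Moore-Penrose pseudoinverse convention for negative powers)."""
    w, v = np.linalg.eigh(M)
    f = np.zeros_like(w)
    mask = w > tol
    f[mask] = w[mask] ** power
    return (v * f) @ v.conj().T

def refine(effects, Q):
    """Return B = sum_{j in Q} A_j and the refined effects A'_j of Eq. (8)."""
    B = sum(effects[j] for j in Q)
    inv_root = psd_power(B, -0.5)
    return B, {j: inv_root @ effects[j] @ inv_root for j in Q}

# Hypothetical qutrit POVM: A_0 and A_1 live in span{|0>, |1>}, A_2 completes it.
e0 = np.array([1.0, 0.0, 0.0])
plus = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
A = [0.5 * np.outer(e0, e0), 0.5 * np.outer(plus, plus)]
A.append(np.eye(3) - A[0] - A[1])

# Random input state (fixed seed, illustrative only).
rng = np.random.default_rng(0)
G = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
rho = G @ G.conj().T
rho /= np.trace(rho).real

Q = [0, 1]
B, A_prime = refine(A, Q)
root = psd_power(B, 0.5)
rho_tilde = root @ rho @ root                       # Lüders update, unnormalized
lhs = {j: np.trace(rho_tilde @ A_prime[j]).real for j in Q}   # left side of Eq. (6)
rhs = {j: np.trace(rho @ A[j]).real for j in Q}               # right side of Eq. (6)
support_projector = sum(A_prime.values())           # equals 1_B, not the full identity
```

In this example the refined effects sum to the projection onto \(\textrm{span}\{|0\rangle ,|1\rangle \}\) rather than to the full identity, in line with the discussion of \(\mathbbm {1}_B\) above.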

Outcome-decreasing procedure: In every step, we try to rule out one of the labels. Given a measurement \(\textsf{A}=\{A_j\}_{j\in [n]}\), in step j we perform the measurement \(\textsf{B}^j\) deciding between the outcome j corresponding to \(A_j\) and the outcomes corresponding to the effects \(A_{j+1},\dots ,A_{n-1}\). If outcome j is obtained, the measurement process can be terminated, since a definitive answer has been obtained. This effectively means that conditioning on the previous outcomes is not required: once a definitive answer is obtained, the measurement procedure can continue as planned, and we simply disregard the results obtained after the definitive answer. This implies that static circuits are enough for the implementation. The drawback of this procedure is the large resulting circuit depth, as one needs to perform \(n-1\) consecutive steps of the measurement process. The procedure is depicted in Fig. 2a.

Binary-search procedure: In this procedure we split the outcomes of the current measurement (roughly) in half, and based on the obtained outcome we choose the next measurement to be done. This procedure is depicted in Fig. 2b and was already presented in [19, 24]. Compared to the outcome-decreasing procedure, it is more efficient in circuit depth, as the number of steps one needs to make is roughly \(\log _2 n\). The price to pay is that one needs to be able to condition subsequent measurements on the previous outcomes. This option is becoming available in current quantum devices, but it still might have unreasonable time demands or low quality.

There are, naturally, many other ways to approach the coarse-graining procedure. There seem to be only a few special cases in which a non-adaptive approach is possible (the outcome-decreasing procedure or the implementation of symmetric informationally complete POVMs as, e.g., in [25]); in general the procedures need to be adaptive (requiring dynamic circuits), with the later measurements depending on previous results.
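The step counts of the two strategies can be compared with a few lines of code. The sketch below is purely combinatorial and uses only the Python standard library; the tree representation as nested tuples and the function names are our assumptions. Note that it always halves the labels, whereas Fig. 1 shows a different, uneven split.

```python
from math import ceil, log2

def binary_search_tree(labels):
    """Nested tuples: every internal node splits its labels (roughly) in half."""
    if len(labels) == 1:
        return labels[0]
    mid = len(labels) // 2
    return (binary_search_tree(labels[:mid]), binary_search_tree(labels[mid:]))

def depth(tree):
    """Number of binary measurements on the longest root-to-leaf path."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(depth(tree[0]), depth(tree[1]))

def outcome_decreasing_steps(n):
    """The chain of Fig. 2a rules out one label per step."""
    return n - 1

labels = list("abcdef")
tree = binary_search_tree(labels)
binary_depth = depth(tree)                       # 3 for six outcomes
chain_depth = outcome_decreasing_steps(len(labels))   # 5 for six outcomes
```

For six outcomes the binary-search tree needs at most three sequential binary measurements, compared with five for the outcome-decreasing chain, at the cost of conditioning on earlier outcomes.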

2.2 Qubit implementation of Lüders measurements

The remaining question is whether there is a simple and efficient realization of an arbitrary binary measurement. In this part, we shall show that using one ancillary qubit, we can represent any two-outcome measurement \(\textsf{B}=\{B,\mathbbm {1}-B\}\) as a rotation of the original system to the eigenbasis of B, followed by a controlled unitary with the ancillary qubit as the target, and finalized by a measurement on the ancillary qubit and a rotation of the system back. We will further assume that the successful measurement outcome in the \(\textsf{B}\) measurement is 1, while the outcome 0 corresponds to the effect \(\mathbbm {1}-B\).

Fig. 3

Coupling scheme for binary measurements. a In the general setting, the (not necessarily single-qubit) state \(\rho \) is coupled by a unitary U to an ancillary qubit system prepared in the state \(|0\rangle \), which is measured in the computational z basis afterward. Given that the outcome of the measurement is j, the (unnormalized) output state is \(\tilde{\rho }_j\). b The scheme in the case of Lüders measurements can be decomposed into a rotation of \(\rho \) to the eigenbasis of B by \(U_B\) (and back at the end). The rest of the coupling unitary is a general control operation of the form \(V=\sum _k |k\rangle \!\langle k|\otimes V_k\). When some \(V_k\) is the identity, it can be excluded from the circuit.

Let us present a detailed description. We start with a general coupling construction as in Fig. 3a, where the original state (Hilbert space \(\mathcal {H}_1\)) is coupled to an ancillary single-qubit state (Hilbert space \(\mathcal {H}_2\)), which is measured afterward. Without loss of generality, we assume that the ancillary qubit is prepared in the state \(|0\rangle \) and its measurement is in the computational (z) basis. In fact, this is compatible with the default settings of most contemporary quantum computers. This construction gives us the following conditions for U acting on \(\mathcal {H}_1\otimes \mathcal {H}_2\); for the two outcome cases (producing either state \(\tilde{\rho }_1\) or \(\tilde{\rho }_0\)) we have:

$$\begin{aligned} B^{1/2}\rho B^{1/2}&= {}_2\langle 1|U|0\rangle _2 \rho \, {}_2\langle 0| U^\dagger |1\rangle _2,\\ (\mathbbm {1}-B)^{1/2}\rho (\mathbbm {1}-B)^{1/2}&= {}_2\langle 0| U|0\rangle _2\rho \, {}_2\langle 0| U^\dagger |0\rangle _2, \end{aligned}$$

where we explicitly use indexing marking the original system (1) and the ancillary qubit (2) where necessary. These two equations are, in particular, implied by the following conditions:

$$\begin{aligned} {}_2\langle 1| U|0\rangle _2 = B^{1/2}\ \ \text {and}\ \ {}_2\langle 0| U|0\rangle _2 = (\mathbbm {1}-B)^{1/2}. \end{aligned}$$
(9)

For the moment we leave open the question whether for each B it is possible to construct a unitary U satisfying these conditions.

Since B is an effect, it can be diagonalized (in the computational basis) by a unitary transformation we denote \(U_B\). This unitary diagonalizes at the same time both \(B^{1/2}\) and \((\mathbbm {1}-B)^{1/2}\). By emphasizing the diagonal form by the corresponding lower index, we have

$$\begin{aligned} B^{1/2}_\textrm{diag}&= U_B B^{1/2} U_B^\dagger = U_B\, {}_2\langle 1| U|0\rangle _2\, U_B^\dagger = {}_2\langle 1| (U_B \otimes \mathbbm {1}) U (U_B^\dagger \otimes \mathbbm {1}) |0\rangle _2, \\ (\mathbbm {1}-B)^{1/2}_\textrm{diag}&= U_B (\mathbbm {1}-B)^{1/2} U_B^\dagger = U_B\, {}_2\langle 0| U|0\rangle _2\, U_B^\dagger = {}_2\langle 0| (U_B \otimes \mathbbm {1}) U (U_B^\dagger \otimes \mathbbm {1}) |0\rangle _2. \end{aligned}$$

Denoting \(V=(U_B \otimes \mathbbm {1}) U (U_B^\dagger \otimes \mathbbm {1})\), which is unitary if (and only if) U is unitary, we can write it in block form with respect to the tensor product \(\mathcal {H}_2\otimes \mathcal {H}_1\) (note that we swapped the order of the Hilbert spaces in the tensor product to achieve a more comprehensible form)

$$\begin{aligned} V = \begin{pmatrix} (\mathbbm {1}-B)^{1/2}_\textrm{diag}&{} \quad V_{01} \\ B^{1/2}_\textrm{diag}&{} \quad V_{11} \end{pmatrix}. \end{aligned}$$
(10)

where \(V_{01}\) and \(V_{11}\) are unknown submatrices that we aim to complete in such a way that V is unitary. We will proceed to show that for each measurement \(\textsf{B}\) it is possible to construct such a unitary matrix V. This will also show that U satisfying conditions (9) exists.

Denoting the columns of V as \(v_k\), it is easy to see that for \(j\ne k\) in the known part we have \(v_j^*v_k=0\), as both matrices \(B^{1/2}_\textrm{diag}\) and \((\mathbbm {1}-B)^{1/2}_\textrm{diag}\) are diagonal. Denoting the (real) eigenvalues of B as \(\lambda _k \in [0,1]\), for the column norm we have

$$\begin{aligned} v_k^*v_k&= \left[ (1-\lambda _k)^{1/2} \right] ^*(1-\lambda _k)^{1/2} + \left( \lambda _k^{1/2} \right) ^*\left( \lambda _k^{1/2} \right) = (1-\lambda _k) + \lambda _k = 1. \end{aligned}$$

So we see that the left part of V fulfills the conditions for unitarity. It remains to find \(V_{01}\) and \(V_{11}\) such that all columns of V are orthonormal. There is freedom in the choice, but we can choose

$$\begin{aligned} V_{01}=-B^{1/2}_\textrm{diag}\qquad \text {and}\qquad V_{11}=(\mathbbm {1}-B)^{1/2}_\textrm{diag}. \end{aligned}$$

If we now write the matrix V in the original tensor order \(\mathcal {H}_1\otimes \mathcal {H}_2\), it has a block-diagonal structure which is easily interpreted as a controlled operation of the form

$$\begin{aligned} V = \sum _k |k\rangle \!\langle k|\otimes V_k \end{aligned}$$
(11)

with \(2\times 2\) matrices

$$\begin{aligned} V_k = \begin{pmatrix} (1-\lambda _k)^{1/2} &{}\quad -\lambda _k^{1/2}\\ \lambda _k^{1/2} &{}\quad (1-\lambda _k)^{1/2} \end{pmatrix}, \end{aligned}$$
(12)

where \(k\in [d]\). Note that if \({{\,\textrm{rank}\,}}B = r\), then r of these matrices are non-trivial, while the rest of them equals \(\mathbbm {1}_2\); implementation then requires r controlled operations for the r non-trivial matrices.

To sum up, the procedure from Fig. 3a can in particular be constructed as in Fig. 3b, in which we first rotate the original state to the B-eigenbasis, then perform controlled operations with the ancillary qubit as target, and finally measure the ancillary qubit and rotate the original system back from the B-eigenbasis. The presented construction is general and works in any dimension \(d=2^m\) of the original system.
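The construction of this subsection can be assembled and checked classically. The sketch below assumes NumPy, the tensor order \(\mathcal {H}_1\otimes \mathcal {H}_2\) (system first, ancilla second), and a hypothetical rank-two effect on two qubits; the function names `luders_coupling` and `kraus` are ours. It builds V from Eqs. (11) and (12), recovers U, and tests the conditions of Eq. (9).

```python
import numpy as np

def psd_sqrt(M):
    """Square root of a positive semi-definite matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(M)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T

def luders_coupling(B):
    """Return (U_B, V, U) for the effect B in system (x) ancilla ordering."""
    d = B.shape[0]
    lam, vecs = np.linalg.eigh(B)
    lam = np.clip(lam, 0.0, 1.0)          # guard against rounding outside [0, 1]
    U_B = vecs.conj().T                   # U_B B U_B^dagger is diagonal
    V = np.zeros((2 * d, 2 * d), dtype=complex)
    for k in range(d):
        c, s = np.sqrt(1 - lam[k]), np.sqrt(lam[k])
        V_k = np.array([[c, -s], [s, c]])            # Eq. (12)
        proj_k = np.zeros((d, d))
        proj_k[k, k] = 1.0
        V += np.kron(proj_k, V_k)                    # Eq. (11)
    rot = np.kron(U_B, np.eye(2))
    U = rot.conj().T @ V @ rot            # inverts V = (U_B (x) 1) U (U_B^dagger (x) 1)
    return U_B, V, U

def kraus(U, j):
    """Operator <j|U|0> on the system, with the ancilla as the second tensor factor."""
    d = U.shape[0] // 2
    return U.reshape(d, 2, d, 2)[:, j, :, 0]

# Hypothetical effect on two qubits with eigenvalues 0.7, 0.2, 0, 0 (rank 2).
B_qubit = np.array([[0.6, 0.2], [0.2, 0.3]])
B = np.kron(B_qubit, np.diag([1.0, 0.0]))

U_B, V, U = luders_coupling(B)
K1, K0 = kraus(U, 1), kraus(U, 0)   # should equal B^{1/2} and (1 - B)^{1/2}
```

For the zero eigenvalues the blocks \(V_k\) reduce to \(\mathbbm {1}_2\), so only two controlled operations are non-trivial in this example, in agreement with the rank argument above.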

In the next part dealing with application to unambiguous state discrimination, we will consider the simplest case, when the original system is a qubit. In such case, we can rewrite this controlled operation as

$$\begin{aligned} V = |0\rangle \!\langle 0|\otimes V_0 + |1\rangle \!\langle 1|\otimes V_1 = (|0\rangle \!\langle 0|\otimes \mathbbm {1}+ |1\rangle \!\langle 1|\otimes V_1V_0^\dagger ) (\mathbbm {1}\otimes V_0), \end{aligned}$$
(13)

which can be realized as a composition of a unitary transformation \(V_0\) on the ancillary qubit and a standard qubit controlled-\((V_1V_0^\dagger )\) operation (see Fig. 4).

Fig. 4

The measurement scheme can be simplified even further in the case of a qubit system, when the general controlled operation can be constructed as a composition of \(V_0\) on the ancilla followed by a standard controlled-\((V_1V_0^\dagger )\) operation.
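The factorization in Eq. (13) can be confirmed numerically. The sketch below assumes NumPy and two arbitrarily chosen eigenvalues; the helper names are ours. It also uses the observation that each block of Eq. (12) has the form of a standard \(R_y(\theta _k)\) rotation with \(\theta _k=2\arcsin \lambda _k^{1/2}\), so that \(V_1V_0^\dagger \) is an \(R_y\) rotation by the angle difference.

```python
import numpy as np

def V_block(lam):
    """The 2x2 block of Eq. (12) for an eigenvalue lam in [0, 1]."""
    c, s = np.sqrt(1 - lam), np.sqrt(lam)
    return np.array([[c, -s], [s, c]])

def ry(theta):
    """Standard single-qubit rotation about the y axis."""
    return np.array([[np.cos(theta / 2), -np.sin(theta / 2)],
                     [np.sin(theta / 2),  np.cos(theta / 2)]])

lam0, lam1 = 0.2, 0.7                     # hypothetical eigenvalues of B
V0, V1 = V_block(lam0), V_block(lam1)
P0, P1 = np.diag([1.0, 0.0]), np.diag([0.0, 1.0])
I2 = np.eye(2)

# Left-hand side of Eq. (13): system qubit as control, ancilla as target.
V = np.kron(P0, V0) + np.kron(P1, V1)
# Right-hand side: V0 on the ancilla, then a controlled-(V1 V0^dagger).
controlled = np.kron(P0, I2) + np.kron(P1, V1 @ V0.T)   # V0 is real, so V0^dagger = V0.T
factored = controlled @ np.kron(I2, V0)

theta0, theta1 = 2 * np.arcsin(np.sqrt(lam0)), 2 * np.arcsin(np.sqrt(lam1))
```

Under these assumptions the qubit case needs one single-qubit \(R_y\) rotation on the ancilla and one controlled-\(R_y\) gate in addition to the basis changes \(U_B\) and \(U_B^\dagger \).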

2.3 Final overview of the procedure

Previous steps provide a detailed description of the particularities of the implementation procedure. Now we provide a complete and cohesive view on the procedure to give a higher-level overview of how it works. The work flow is as follows:

  1. Choose measurement \(\textsf{A}\) for implementation.

  2. Decide on binary-tree coarse-graining of \(\textsf{A}\).

  3. Compute corresponding measurements for each partial step.

  4. For each partial dichotomic measurement \(\textsf{B}=\{B,\mathbbm {1}-B\}\) compute diagonalization unitary \(U_B\), and \(U_B^\dagger \) and the set of coupling unitaries \(V_k\) using Eq. (12).

  5. Build the circuit stitching constructions from Fig. 3b for particular measurements.

The first two steps are to be chosen based on demands of the application and can reflect also specific considerations based on the QPU design and its calibration. How to choose coarse-graining is not the aim of this paper. Steps 3 a 4 are tasks from linear algebra relying on the ability to perform spectral decomposition of used operators. In the simplest tasks one can use analytic approach, but in general the decomposition complexity is of the order \(O(d^3)\) where \(d=2^m\) is the dimension of the m-qubit Hilbert space. For small problems of the NISQ era this is easily computable.
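Steps 2 and 3 of the work flow can be sketched classically as a recursion over the measurement tree. The code below assumes NumPy, a halving strategy for the coarse-graining, and a hypothetical six-outcome qubit POVM built from the three Pauli eigenbases; all function names are ours. Instead of sampling outcomes, it propagates the unnormalized Lüders states along every branch and returns the probability of reaching each leaf, which should coincide with the Born probabilities of the original measurement.

```python
import numpy as np

def psd_power(M, power, tol=1e-12):
    """M**power on the support of a PSD matrix (pseudoinverse for negative powers)."""
    w, v = np.linalg.eigh(M)
    f = np.zeros_like(w)
    mask = w > tol
    f[mask] = w[mask] ** power
    return (v * f) @ v.conj().T

def sequential_probabilities(effects, rho):
    """effects: dict label -> effect valid at the current tree node;
    rho: unnormalized state reaching that node. Returns dict label -> probability."""
    labels = list(effects)
    if len(labels) == 1:
        return {labels[0]: np.trace(effects[labels[0]] @ rho).real}
    mid = len(labels) // 2
    result = {}
    for Q in (labels[:mid], labels[mid:]):          # dichotomic coarse-graining
        B = sum(effects[j] for j in Q)
        root, inv_root = psd_power(B, 0.5), psd_power(B, -0.5)
        rho_Q = root @ rho @ root                   # Lüders update, Eq. (5)
        refined = {j: inv_root @ effects[j] @ inv_root for j in Q}   # Eq. (8)
        result.update(sequential_probabilities(refined, rho_Q))
    return result

# Hypothetical six-outcome qubit POVM: eigenvectors of Z, X, Y with weight 1/3 each.
s = 1 / np.sqrt(2)
vectors = [np.array([1, 0]), np.array([0, 1]),
           np.array([s, s]), np.array([s, -s]),
           np.array([s, 1j * s]), np.array([s, -1j * s])]
A = {j: np.outer(v, v.conj()) / 3 for j, v in enumerate(vectors)}

rng = np.random.default_rng(7)                      # fixed seed, illustrative state
G = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
rho = G @ G.conj().T
rho /= np.trace(rho).real

sequential = sequential_probabilities(A, rho)
direct = {j: np.trace(A[j] @ rho).real for j in A}
```

Every eigendecomposition here acts on a \(d\times d\) matrix, which is the source of the \(O(d^3)\) cost per tree node mentioned above. Step 4 would additionally extract \(U_B\) and the blocks \(V_k\) from the same eigendecomposition of each B.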

Practically demanding is the last step, which in general requires the ability to evaluate circuits conditioned on previous outcomes. This capability is not generally available on current devices, and where it is implemented, it is not yet well suited to this kind of implementation. Therefore, at present it is necessary to ease the demands that this paper sets. We see a few possibilities for how to do that: (i) find use cases where the conditioning is not needed, such as quantum state discrimination, or (ii) use outcome-eliminating coarse-graining that can be evaluated without conditioning, or (iii) create a set of circuits implementing possible paths. The last point cannot be recommended as it removes the faithfulness of the measurement (realization requires post-selection), while maintaining large depths of the circuits. However, we believe that the progress in the development of QPUs will lead to improvement of intermediate measurements.

2.4 Comparison to other works

Let us now discuss the relation of our approach to other works. As mentioned in the introduction, the approach is not entirely new. However, we present the implementation in its entirety, from the theoretical analysis of the procedure, down to providing elementary building blocks for the implementation. To better describe how the current paper fits into known results, we provide a list of relevant works in Table 1.

The top part of the table lists works in the direction of sequential POVM implementation. Our approach is inspired by [19] where a solid theoretical background for the coarse-grained procedure was given, but without particularities for the implementation. In addition, the presentation was using binary coarse-graining. A similar approach was also used in [24], where the authors focused on quantum state discrimination. Their results are not universal, but rather tailored to the task. In [26] the authors presented a constructive way of implementing POVMs, but only on a single qubit. Their realization requires \(O(\log _2 n)\) ancillary qubits.

Some of the ideas for sequential implementation of POVMs are also present in papers practically utilizing the measurements, but as we can see in Table 1 (bottom part), most of these are limited to one or two qubits. An exception is [27], which works in any dimension but is specifically designed for performing so-called informationally complete POVMs. All these implementations are platform specific and only hint at a general implementation at best.

There are also other POVM implementation techniques that are based on different paradigms. All these implementations are suitable for POVMs of any dimension and any number of outcomes. In particular, in [17] one of the results states that a POVM in a d-dimensional Hilbert space can be performed as a specific Naimark-type dilation with the ancillary Hilbert space of the same dimension as the original system. This work also provides a constructive proof that can be adapted for practical implementation. In references [18, 20] a different approach is used, with [20] requiring an additional qubit. Both approaches provide shallow circuits, but the price to pay is the probabilistic nature of the implementation, which requires post-selection. This means that these implementations do not represent the desired POVMs faithfully. Finally, in reference [28] the authors present a construction requiring \(O(\log _2 n)\) ancillary qubits where the measurement is mapped to a problem of a particular state preparation which is measured on ancillas that provide the measurement outcome.

Table 1 Overview of related works

We conclude that while the literature is rather rich with implementations of POVMs, they are mostly limited in one way or another. Some results are just proofs of principle, some are limited in scope. In this paper, we present a complete and faithful construction for POVMs on any number m of qubits (i.e., the dimension is \(d=2^m\)) and having arbitrarily (finitely) many outcomes n.

3 Quantum unambiguous state discrimination as a sequential measurement process

3.1 Quantum unambiguous state discrimination

Quantum state discrimination is a task in which we are provided a state from a set of states \(\{\rho _j\}_{j\in [w]}\) with probabilities \(\{p_j\}_{j\in [w]}\) and our task is to determine, which state was presented to us. Due to the particularities of quantum mechanics, this task is not as straightforward as in the classical case—in quantum theory one cannot distinguish non-orthogonal states perfectly. This task is therefore of high importance.

A particular situation of unambiguous state discrimination was introduced in [36, 37]. In this setting we want to distinguish particular states without errors, i.e., if we are given a definite answer about the state, it needs to be correct. The price to pay for this requirement is the necessity for an inconclusive outcome. When we obtain this result, we cannot say anything particular about the presented state.

This particular task has many extensions, but for the sake of exemplifying the framework from the previous section we will deal with the most basic setting of being presented with two pure qubit states \(\{|\psi _1\rangle ,|\psi _2\rangle \}\) with equal probabilities. Our task is to determine, which state was given to us.

Let us first denote \(|\psi _j^\perp \rangle \) as states perpendicular to \(|\psi _j\rangle \) and \(P_j\), \(P_j^\perp \) as the corresponding projectors. The unambiguous state discrimination measurement \(\textsf{A}\) has three outcomes 1, 2, and ? (for the inconclusive outcome) with corresponding effects,

$$\begin{aligned} A_1 = \lambda P_2^\perp ,\quad A_2 = \lambda P_1^\perp ,\quad A_? = \mathbbm {1}- A_1 - A_2. \end{aligned}$$

The choice for effects \(A_1\) and \(A_2\) is logical, as it tells us that the presented state is not the other one. Effect \(A_?\) corresponds to the inconclusive answer ? and \(\lambda \in [0;1]\) is such a parameter that \(A_?\ge 0\).

Fig. 5

Depiction of an unambiguous state discrimination in the Bloch picture

In order to analyze this situation, let us parametrize the problem (see also Fig. 5). In the qubit case we can always pre-process presented states so that they would be easily described as

$$\begin{aligned} |\psi _1\rangle&= \cos \omega |0\rangle + \sin \omega |1\rangle ,\\ |\psi _2\rangle&= \cos \omega |0\rangle - \sin \omega |1\rangle , \end{aligned}$$

with \(\omega \in [0;\pi /4]\). The case of \(\omega = 0\) corresponds to \(|\psi _1\rangle =|\psi _2\rangle \), while \(\omega = \pi /4\) describes orthogonal states. We will disregard the pre-processing routine as it is not relevant for this work.

Within this parametrization we have

$$\begin{aligned} |\psi _1^\perp \rangle&= \sin \omega |0\rangle - \cos \omega |1\rangle ,\\ |\psi _2^\perp \rangle&= \sin \omega |0\rangle + \cos \omega |1\rangle . \end{aligned}$$

By minimizing the probability for the inconclusive outcome ? we find the optimal choice of \(\lambda \) to be

$$\begin{aligned} \lambda = \frac{1}{2\cos ^2\omega }. \end{aligned}$$

We can now explicitly express

$$\begin{aligned} A_{1,2} = \frac{1}{2} \begin{pmatrix} \tan ^2\omega &{} \quad \pm \tan \omega \\ \pm \tan \omega &{} \quad 1 \end{pmatrix},\quad A_? = \begin{pmatrix} 1-\tan ^2\omega &{} \quad 0 \\ 0 &{}\quad 0 \end{pmatrix}. \end{aligned}$$

By construction we see that \(A_{1,2}\) are multiples of projectors and we can also observe that \(A_?\) is a multiple of a projector, a measurement in the z direction.

An important quantity for us is the probability of inconclusive result,

$$\begin{aligned} p_{?} = p_1 p(A_?|\rho _{1}) + p_2 p(A_?|\rho _{2}) = \frac{1}{2} \textrm{tr}\left[ A_?P_1\right] + \frac{1}{2} \textrm{tr}\left[ A_?P_2\right] = \cos 2\omega . \end{aligned}$$

The probability of conclusive result (labeled !) is

$$\begin{aligned} p_! = 1-p_? = 1-\cos 2\omega = 2\sin ^2\omega . \end{aligned}$$
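The expressions above can be verified with a few lines of NumPy. The sketch below builds the effects directly from their definitions \(A_1 = \lambda P_2^\perp \), \(A_2 = \lambda P_1^\perp \) with \(\lambda = 1/(2\cos ^2\omega )\); the function name `usd_effects` and the sample value of \(\omega \) are illustrative assumptions.

```python
import numpy as np

def usd_effects(omega):
    """States and effects of unbiased unambiguous discrimination on a qubit."""
    c, s = np.cos(omega), np.sin(omega)
    psi1, psi2 = np.array([c, s]), np.array([c, -s])
    psi1_perp, psi2_perp = np.array([s, -c]), np.array([s, c])
    lam = 1.0 / (2.0 * c ** 2)
    A1 = lam * np.outer(psi2_perp, psi2_perp)   # outcome 1 excludes |psi_2>
    A2 = lam * np.outer(psi1_perp, psi1_perp)   # outcome 2 excludes |psi_1>
    A_inc = np.eye(2) - A1 - A2                 # inconclusive effect A_?
    return (psi1, psi2), (A1, A2, A_inc)

omega = 0.3                                     # any value in (0, pi/4)
(psi1, psi2), (A1, A2, A_inc) = usd_effects(omega)

# Probability of the inconclusive result for equal priors.
p_inconclusive = 0.5 * psi1 @ A_inc @ psi1 + 0.5 * psi2 @ A_inc @ psi2
p_conclusive = 1.0 - p_inconclusive
```

For this parametrization `A_inc` comes out diagonal, and `p_inconclusive` agrees with \(\cos 2\omega \).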

Before analyzing particular sequential measurement scenarios, let us set the notation a bit. In the first case, we will consider first the measurement \(\textsf{B}=\{A_?,\mathbbm {1}-A_?\}\), where we will denote corresponding outcomes as ? for the inconclusive answer and ! for the conclusive answer. We shall call this measurement conclusiveness measurement as it tells us, whether in the subsequent measurement we will obtain a conclusive result or not. In the second case we will start with measurement \(\textsf{B}=\{A_1, \mathbbm {1}-A_1\}\), with corresponding outcomes 1 and \(1'\) (representing the result ‘not 1’). We will call this measurement state 1 measurement (or for brevity just state measurement) that either tells us whether we were given state \(|\psi _1\rangle \) or whether we should continue with the measurement with possibility of obtaining result 2.

3.2 Conclusiveness measurement as the first measurement

Let us first look at the symmetric case in which we perform conclusiveness measurement first (see Fig. 6) and then perform the outcome measurement. In this case we coarse-grain the measurement \(\textsf{A}=\{A_1,A_2,A_?\}\) by \(\textsf{B}=\{A_?,\mathbbm {1}-A_?\}\). The unitaries used in the construction of the coupling from Eq. (13) and the basis transformation \(U_B\) are determined to be

$$\begin{aligned} U_B = \mathbbm {1},\qquad V_1 = \mathbbm {1},\qquad V_0 = \begin{pmatrix} \tan \omega &{} \quad -\sqrt{1-\tan ^2\omega }\\ \sqrt{1-\tan ^2\omega } &{}\quad \tan \omega \end{pmatrix}. \end{aligned}$$

The pre-measurement state after the coupling transformation is

$$\begin{aligned} |\psi '_{1,2}\rangle = -\sqrt{2}\sin \omega |\pm \rangle \otimes |0\rangle + \sqrt{\cos 2\omega }|0\rangle \otimes |1\rangle . \end{aligned}$$
(14)
Fig. 6

Unambiguous state discrimination where the conclusiveness measurement is performed first. The two situations show probabilities when (a) the state \(|\psi _1\rangle \) or (b) the state \(|\psi _2\rangle \) is on input. Denoted probabilities are conditional, i.e., relating only to the particular measurement

Denoting \(A_!=\mathbbm {1}-A_?\), the conclusive result ! is obtained with probability \(p(A_!|\psi _1) = p(A_!|\psi _2) = 2\sin ^2\omega \) if we measure the ancillary qubit in the state \(|0\rangle \). The (normalized) post-measurement states are \(|\tilde{\psi }_1\rangle =|+\rangle \) for the initial state \(|\psi _1\rangle \) and \(|\tilde{\psi }_2\rangle =|-\rangle \) for the initial state \(|\psi _2\rangle \). These states are orthogonal and, hence, perfectly distinguishable. Indeed, the measurement that shall be performed based on Eq. (8) is \(\textsf{A}'=\{ P_+, P_- \}\) for the corresponding definitive outcomes 1 and 2; operators \(P_\pm \) are projectors into the \(\sigma _x\) eigenvectors, i.e., states \(|\pm \rangle \) and the measurement can be performed on the system qubit. Altogether we have

$$\begin{aligned} p(A_1|\psi _1)&= p(A_!|\psi _1)p(A_1'|\tilde{\psi }_1)=p(A_!|\psi _1)p(P_+|+)=p(A_!|\psi _1)=2\sin ^2\omega ,\\ p(A_1|\psi _2)&= p(A_!|\psi _2)p(A_1'|\tilde{\psi }_2)=p(A_!|\psi _2)p(P_+|-)=0,\\ p(A_2|\psi _1)&= p(A_!|\psi _1)p(A_2'|\tilde{\psi }_1)=p(A_!|\psi _1)p(P_-|+)=0,\\ p(A_2|\psi _2)&= p(A_!|\psi _2)p(A_2'|\tilde{\psi }_2)=p(A_!|\psi _2)p(P_-|-)=p(A_!|\psi _2)=2\sin ^2\omega . \end{aligned}$$

As for the inconclusive outcome, the measurement on the ancilla measures the state \(|1\rangle \) corresponding to this outcome with probability \(p(A_?|\psi _1)=p(A_?|\psi _2)=\cos 2\omega \). The (normalized) post-measurement state is afterward the same for both initial states, \(|\tilde{\psi }_1\rangle =|\tilde{\psi }_2\rangle =|0\rangle \), and thus loses all the information about the original state (as noted above).
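The conclusiveness-first scheme can be simulated directly. The following sketch assumes the ordering system \(\otimes \) ancilla, the ancilla prepared in \(|0\rangle \), and the unitaries \(U_B=V_1=\mathbbm {1}\) and \(V_0\) given above; it checks only probabilities and post-measurement states up to a phase. The function name `run` is hypothetical.

```python
import numpy as np

omega = 0.3
t = np.tan(omega)
V0 = np.array([[t, -np.sqrt(1 - t ** 2)],
               [np.sqrt(1 - t ** 2), t]])
P0, P1 = np.diag([1.0, 0.0]), np.diag([0.0, 1.0])
V = np.kron(P0, V0) + np.kron(P1, np.eye(2))    # V_1 = 1 and U_B = 1

psi1 = np.array([np.cos(omega), np.sin(omega)])
psi2 = np.array([np.cos(omega), -np.sin(omega)])
plus = np.array([1.0, 1.0]) / np.sqrt(2)
minus = np.array([1.0, -1.0]) / np.sqrt(2)

def run(psi):
    """Return p(!), p(?), and the normalized system states for both outcomes."""
    state = (V @ np.kron(psi, [1.0, 0.0])).reshape(2, 2)  # index [system, ancilla]
    conclusive, inconclusive = state[:, 0], state[:, 1]   # ancilla |0> and |1>
    p_c = np.vdot(conclusive, conclusive).real
    p_i = np.vdot(inconclusive, inconclusive).real
    return p_c, p_i, conclusive / np.sqrt(p_c), inconclusive / np.sqrt(p_i)

p_c1, p_i1, post1, inc1 = run(psi1)
p_c2, p_i2, post2, inc2 = run(psi2)

# Follow-up projective measurement {P_+, P_-} on the system qubit.
p_1_given_psi1 = p_c1 * abs(plus @ post1) ** 2
p_2_given_psi2 = p_c2 * abs(minus @ post2) ** 2
p_1_given_psi2 = p_c2 * abs(plus @ post2) ** 2
```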

This known result was originally presented in [38]. The computation serves as a formalized way of obtaining the coupling transformation. Comparing the details of the two results, one finds that \(A=2\omega \) and the pre-measurement state from [38] equals the state from Eq. (14) up to an unimportant local phase which is due to a slightly different choice of completing unitary V in Eq. (10).

3.3 State measurement as the first measurement

With the framework presented in the previous section, we are able to choose also a different measurement as the first one; it does not have to be the conclusiveness measurement. Let us choose the state 1 measurement (see Fig. 7), i.e., we want to know whether the presented state \(|\psi \rangle \) is \(|\psi _1\rangle \). If we are presented the answer 1, we know that the given state was \(|\psi _1\rangle \); in the opposite case of the answer labeled as \(1'\) (meaning not 1) we need to perform a second measurement that shall tell us whether the state was \(|\psi _2\rangle \) or the outcome is inconclusive ?.

Fig. 7

Unambiguous state discrimination where the measurement for the outcome 1 is performed first. The two situations show probabilities when (a) the state \(|\psi _1\rangle \) or (b) the state \(|\psi _2\rangle \) is on input. Denoted probabilities are conditional, i.e., relating only to the particular measurement

In the previous case of conclusiveness measurement as the first one to be performed, we saw that in the first step we either got an inconclusive answer, in which case the outcome states for both input states were the same, or we got a conclusive answer, in which case the outcome states for the two input states were orthogonal, and thus perfectly distinguishable.

What can we expect in this scenario, where we first test whether the first state is on the input? Let us assume first, that we got the first state on the input; the test shall then show with probability \(p(A_!|\psi _1)\) that we have the state \(|\psi _1\rangle \) and with complementary probability \(1-p(A_!|\psi _1)=p(A_?|\psi _1)\) we obtain outcome \(1'\) and we need to perform the second measurement on the post-measurement state \(|\tilde{\psi }_1\rangle \), where answer that we have the state \(|\psi _2\rangle \) has to have zero probability, \(p(A_2'|\tilde{\psi }_1)=0\), and so outcome ? will always be given.

In the case we are presented with the state \(|\psi _2\rangle \), the first measurement has to always provide the outcome \(1'\), since \(p(A_1|\psi _2)=0\); the subsequent measurement on the post-measurement state \(|\tilde{\psi }_2\rangle \) shall afterward lead to the conclusive answer 2 with probability \(p(A_2'|\tilde{\psi }_2)=p(A_2|\psi _2)\) and with probability \(p(A_?'|\tilde{\psi }_2)=p(A_?|\psi _2)\) lead to the inconclusive outcome ?. Let us confirm these expectations.

In this case we coarse-grain the measurement \(\textsf{A}= \{A_1, A_2, A_?\}\) by \(\textsf{B}= \{A_1, \mathbbm {1}- A_1\}\). The unitaries used in the construction of the coupling in Eq. (13) and the basis transformation \(U_B\) are determined to be

$$\begin{aligned} U_B&= |0\rangle \langle \psi _2^\perp | + |1\rangle \langle \psi _2| = \begin{pmatrix} \sin \omega &{} \cos \omega \\ \cos \omega &{} -\sin \omega \end{pmatrix},\\ V_0&= \frac{1}{\sqrt{2}\cos \omega } \begin{pmatrix} \sqrt{\cos 2\omega } &{} -1 \\ 1 &{} \sqrt{\cos 2\omega } \end{pmatrix},\\ V_1&=\mathbbm {1}. \end{aligned}$$

The second measurement \(\textsf{A}'=\{A',\mathbbm {1}-A'\}\) is determined based on Eq. (8). Choosing effect \(A'\) to correspond to the outcome 2, and the complementary effect to the outcome ? we obtain

$$\begin{aligned} A'\equiv A_2' = \left( P_2 + \frac{1}{\sqrt{1-\lambda }}P_2^\perp \right) \lambda P_1^\perp \left( P_2 + \frac{1}{\sqrt{1-\lambda }}P_2^\perp \right) \end{aligned}$$
(15)

First, supposing we are given the state \(|\psi _2\rangle \) on the input, the pre-measurement state is

$$\begin{aligned} |\psi '_2\rangle = |1\rangle \otimes |0\rangle . \end{aligned}$$

Since we have \(p(A_1|\psi _2)=0\), the ancillary qubit measurement will measure the state \(|0\rangle \) with probability 1; this state corresponds to the outcome \(1'\) and we need to follow with the second measurement \(\textsf{A}'\) on the post-measurement state \(|\tilde{\psi }_2\rangle =|\psi _2\rangle \). This is the same state as the presented state, as it is the 0-eigenstate of the measurement effect \(B=A_1\) (equivalently, the 1-eigenstate of \(\mathbbm {1}-A_1\)). The probability that the effect \(A'_2\) will be measured in the second measurement, i.e., state 2 will be determined, is directly computed using the Born rule,

$$\begin{aligned} p(A_2'|\tilde{\psi }_2) = \textrm{tr}\left[ P_2 A'_2\right] = \lambda \textrm{tr}\left[ P_1^\perp P_2\right] = 2\sin ^2\omega . \end{aligned}$$

Since we always end up doing this measurement, initial state \(|\psi _2\rangle \) leads to the definitive answer 2 with conclusive probability \(p(A_2|\psi _2)=2\sin ^2\omega \) and to the inconclusive answer ? with probability \(p(A_?|\psi _2) = 1 - p(A_2|\psi _2)\).

Now, suppose the state \(|\psi _1\rangle \) is presented on the input. Particular computations are more extensive than in the previous case, but still straightforward. The pre-measurement state is

$$\begin{aligned} |\psi '_1\rangle = \sqrt{2}\sin \omega |0\rangle \otimes |1\rangle + \sqrt{\cos 2\omega }\left( \sqrt{2}\sin \omega |0\rangle + \sqrt{\cos 2\omega } |1\rangle \right) \otimes |0\rangle . \end{aligned}$$

In the case state \(|1\rangle \) is measured on the ancillary qubit, outcome 1 is assumed and this happens with probability \(p(A_1|\psi _1)=2\sin ^2\omega \). With probability \(p(\mathbbm {1}-A_1|\psi _1)=p(A_?|\psi _1)=\cos 2\omega \) outcome \(1'\) is provided and we follow with measurement \(\textsf{A}'\). The post-measurement state is

$$\begin{aligned} |\tilde{\psi }_1\rangle = U_B^\dagger \left( \sqrt{2}\sin \omega |0\rangle + \sqrt{\cos 2\omega } |1\rangle \right) . \end{aligned}$$

After some computation we find that the probability that measurement \(\textsf{A}'\) yields outcome 2 is \(p(A_2'|\tilde{\psi }_1)=0\), and so the inconclusive outcome ? is always measured.
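The computation for the state-first scheme can be reproduced numerically. The sketch below assumes the ordering system \(\otimes \) ancilla, the ancilla prepared in \(|0\rangle \), ancilla outcome \(|1\rangle \) for the answer 1, and a correction by \(U_B^\dagger \) on the system before the follow-up measurement of Eq. (15); the function name `run` is hypothetical.

```python
import numpy as np

omega = 0.3
c, s = np.cos(omega), np.sin(omega)
psi1, psi2 = np.array([c, s]), np.array([c, -s])
psi1_perp, psi2_perp = np.array([s, -c]), np.array([s, c])
lam = 1.0 / (2.0 * c ** 2)

U_B = np.array([[s, c], [c, -s]])
V0 = np.array([[np.sqrt(np.cos(2 * omega)), -1.0],
               [1.0, np.sqrt(np.cos(2 * omega))]]) / (np.sqrt(2) * c)
V = np.kron(np.diag([1.0, 0.0]), V0) + np.kron(np.diag([0.0, 1.0]), np.eye(2))
W = V @ np.kron(U_B, np.eye(2))                 # basis change, then coupling

# Follow-up effect A_2' from Eq. (15).
P2, P2_perp = np.outer(psi2, psi2), np.outer(psi2_perp, psi2_perp)
S = P2 + P2_perp / np.sqrt(1.0 - lam)
A2_prime = S @ (lam * np.outer(psi1_perp, psi1_perp)) @ S

def run(psi):
    """Return p(1), p(1'), and p(2 | 1') for the input state psi."""
    state = (W @ np.kron(psi, [1.0, 0.0])).reshape(2, 2)  # index [system, ancilla]
    p_1 = np.sum(state[:, 1] ** 2)              # ancilla |1>: outcome 1
    p_not1 = np.sum(state[:, 0] ** 2)           # ancilla |0>: outcome 1'
    post = U_B.T @ state[:, 0] / np.sqrt(p_not1)  # U_B is real, so U_B^dagger = U_B^T
    return p_1, p_not1, post @ A2_prime @ post

p1_a, pnot1_a, p2_a = run(psi1)                 # input |psi_1>
p1_b, pnot1_b, p2_b = run(psi2)                 # input |psi_2>
```

The overall probability of the answer 2 is the product `pnot1 * p2`, which should vanish for the input \(|\psi _1\rangle \) and equal \(2\sin ^2\omega \) for the input \(|\psi _2\rangle \).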

3.4 Generalization into biased case

In the introduction of the problem, we included a possibility to provide states with unequal probabilities. Suppose now that state \(|\psi _1\rangle \) is sampled with probability p and the state \(|\psi _2\rangle \) with probability \(1-p\). Analysis of the situation is similar to the one provided above with the difference that

$$\begin{aligned} A_1 = \lambda _1 P_2^\perp ,\quad A_2 = \lambda _2 P_1^\perp ,\quad A_? = \mathbbm {1}- A_1 - A_2. \end{aligned}$$

In this case the optimized parameters are \(\lambda _1\) and \(\lambda _2\), both from the interval [0; 1], such that \(A_?\ge 0\). In the optimization one minimizes the inconclusive result

$$\begin{aligned} p_? = p_1 p(A_?|\psi _1) + p_2 p(A_?|\psi _2) = p\textrm{tr}\left[ A_?P_1\right] +(1-p)\textrm{tr}\left[ A_?P_2\right] \end{aligned}$$

or maximizes the success rate \(p_{suc} = 1-p_?\).

Denoting the threshold value for p as \(p_\theta =\cos ^2 2\omega / (1+\cos ^2 2\omega )\), the solution is split into two cases, either (i) \(p\le p_\theta \) or \(p\ge 1-p_\theta \), or (ii) \(p_\theta \le p \le 1-p_\theta \).

The case (i) is not relevant for this paper, as in this case one of the states is preferred to such a degree that the most suitable measurement is a projective one for that state, i.e., one of the \(\lambda \)’s is 0, while the other one is 1. This means that the presented sequential procedure of implementing POVMs is not needed to implement the optimal measurement.

In the other case (ii), all the effects of the measurement are non-trivial, with

$$\begin{aligned} \lambda _1 = \frac{1}{\sin ^2 2\omega }\left( 1-\cos 2\omega \sqrt{\frac{1-p}{p}}\right) ,\quad \lambda _2 = \frac{1}{\sin ^2 2\omega }\left( 1-\cos 2\omega \sqrt{\frac{p}{1-p}}\right) , \end{aligned}$$

and \(A_1\) and \(A_2\) are, again, multiples of projectors. As the optimization procedure for \(A_?\ge 0\) is such that one of the eigenvalues of \(A_?\) is zero, it implies that \(A_?\) is a multiple of a projector as well. Hence, all the effects are always multiples of projectors in this task. This means that in both implementations either \(V_0=\mathbbm {1}\) or \(V_1=\mathbbm {1}\) and we can disregard the corresponding controlled operation.
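The parameters of case (ii) can be evaluated numerically to confirm that \(A_?\) is positive with one vanishing eigenvalue. This is a minimal sketch; the function names `biased_lambdas` and `biased_effects` and the sample values of \(\omega \) and p are illustrative assumptions, and \(\omega >0\) is required to avoid a division by zero.

```python
import numpy as np

def biased_lambdas(omega, p):
    """Return (lambda_1, lambda_2) in regime (ii), p_theta <= p <= 1 - p_theta."""
    c2, s2_sq = np.cos(2 * omega), np.sin(2 * omega) ** 2
    p_theta = c2 ** 2 / (1 + c2 ** 2)
    if not (p_theta <= p <= 1 - p_theta):
        raise ValueError("p lies in regime (i); a projective measurement is optimal")
    lam1 = (1 - c2 * np.sqrt((1 - p) / p)) / s2_sq
    lam2 = (1 - c2 * np.sqrt(p / (1 - p))) / s2_sq
    return lam1, lam2

def biased_effects(omega, p):
    """Return the effects (A_1, A_2, A_?) for prior p of |psi_1>."""
    c, s = np.cos(omega), np.sin(omega)
    psi1_perp, psi2_perp = np.array([s, -c]), np.array([s, c])
    lam1, lam2 = biased_lambdas(omega, p)
    A1 = lam1 * np.outer(psi2_perp, psi2_perp)
    A2 = lam2 * np.outer(psi1_perp, psi1_perp)
    return A1, A2, np.eye(2) - A1 - A2

omega, p = 0.5, 0.35
A1, A2, A_inc = biased_effects(omega, p)
eigenvalues_inc = np.linalg.eigvalsh(A_inc)     # ascending order
```

For \(p=1/2\) both parameters reduce to \(\lambda = 1/(2\cos ^2\omega )\) of the unbiased case.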

Furthermore, we can look at the follow-up measurement \(\textsf{A}'\). In both cases, conclusiveness as well as state measurement schemes, this measurement is projective. The reasoning has two steps. In the first step we show that the observed effect \(A'\) (which is either \(A_1'\) or \(A_2'\)) is a multiple of a projector. In the second step we show that the multiplication factor has to be 1, i.e., the measurement is projective.

For the first part, the measured effect \(A'\) is obtained by a general scheme induced by Eq. (8)

$$\begin{aligned} A' = (\mathbbm {1}- A_0)^{-1/2} A (\mathbbm {1}- A_0)^{-1/2}, \end{aligned}$$

where \(A'\) is the effect to be observed in the follow-up measurement, A is the intended effect (in our case a multiple of a projector, one of the effects \(A_2\) or \(A_?\)), and \(A_0\) is the effect for the definitive outcome from the first measurement (either \(A_?\) or \(A_1\)). In this general notation, we have \(A_0=\lambda P_0\) and \(A=\mu P\) where \(\lambda \) and \(\mu \) are some multiplicative constants and P and \(P_0\) are projectors. We can see that

$$\begin{aligned} A'\cdot A'&= (\mathbbm {1}- A_0)^{-1/2} A (\mathbbm {1}- A_0)^{-1} A (\mathbbm {1}- A_0)^{-1/2} \\&= (\mathbbm {1}- A_0)^{-1/2} \kappa A (\mathbbm {1}- A_0)^{-1/2} = \kappa A', \end{aligned}$$

where we used that \(PQP=\tilde{\kappa }P\) holds for a rank-one projector P, an operator Q (here \(Q=(\mathbbm {1}- A_0)^{-1}\)), and some factor \(\tilde{\kappa }\). We see that \(A'\cdot A' = \kappa A'\) and so \(A'\) is a multiple of a projector.

In the second step we argue that \(\kappa =1\). If this were not the case, then by denoting \(A'=\tau R\) for some \(\tau \) and projector R we could exchange the measurement \(\textsf{A}'\) for the projective measurement \(\{R, \mathbbm {1}- R\}\). This would allow us to obtain higher probabilities for the intended effects (either \(A_1\) or \(A_2\)), which would increase the success rate. Since our measurements are optimal, \(\textsf{A}'\) has to be projective.
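The projective character of the follow-up measurement can also be checked numerically for a sample biased instance. The sketch below restates the regime (ii) parameters \(\lambda _1\), \(\lambda _2\) so that it is self-contained; the helper names `inv_sqrt` and `follow_up_effect` and the sample values of \(\omega \) and p are assumptions made for illustration.

```python
import numpy as np

def inv_sqrt(M):
    """Inverse square root of a positive-definite Hermitian matrix."""
    w, v = np.linalg.eigh(M)
    return v @ np.diag(w ** -0.5) @ v.conj().T

def follow_up_effect(A, A0):
    """A' = (1 - A0)^(-1/2) A (1 - A0)^(-1/2)."""
    R = inv_sqrt(np.eye(A0.shape[0]) - A0)
    return R @ A @ R

omega, p = 0.5, 0.35                            # a point inside regime (ii)
c, s = np.cos(omega), np.sin(omega)
c2, s2_sq = np.cos(2 * omega), np.sin(2 * omega) ** 2
lam1 = (1 - c2 * np.sqrt((1 - p) / p)) / s2_sq
lam2 = (1 - c2 * np.sqrt(p / (1 - p))) / s2_sq
A1 = lam1 * np.outer([s, c], [s, c])            # lambda_1 * P_2^perp
A2 = lam2 * np.outer([s, -c], [s, -c])          # lambda_2 * P_1^perp
A_inc = np.eye(2) - A1 - A2

# Conclusiveness measurement first: A0 = A_?, intended effect A_1.
spectrum_conclusiveness_first = np.linalg.eigvalsh(follow_up_effect(A1, A_inc))
# State 1 measurement first: A0 = A_1, intended effect A_2.
spectrum_state_first = np.linalg.eigvalsh(follow_up_effect(A2, A1))
```

In both cases the spectrum of \(A'\) should consist of the eigenvalues 0 and 1 up to rounding, i.e., \(A'\) is a rank-one projector.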

Fig. 8

Measurement scheme for the (biased) qubit unambiguous state discrimination. Whichever measurement is chosen on the ancilla, the scheme has the same structure, with the first part performing the measurement of the coarse graining. Since the follow-up measurement is always projective, it can be performed on the system qubit by some rotation W into the corresponding basis. Note that this setup does not require classical conditioning inside the quantum circuit, but if the result on the ancilla is for the chosen definitive outcome, the results on the system qubit are disregarded. Also, the ancilla measurement can be postponed so that both measurements are performed at the end

This discussion leads us to a simplified measurement scheme, where the final measurement can be performed on the original system without the need of an ancilla. Altogether, only one ancillary qubit (without intermediate reset) is needed. The situation is depicted in Fig. 8.

The implementation within the scheme we presented is not expressible in as simple a manner as in the unbiased case, but from Fig. 8 we see that it can be used in a systematic and computationally accessible way to determine particular elements of the circuit.

3.5 Usefulness of the POVM implementation scheme

In the previous section, we have seen that the scheme for the biased unambiguous discrimination of qubit states can be implemented in a very simple way requiring only a single ancillary qubit. The analysis shows that the presented scheme can produce circuit settings in a systematic way based on the preferred choice of the measurement on the ancilla. In this case it might not be clear which choice will produce the best result, and the factors affecting the decision are numerous and beyond the scope of this paper. However, let us illustrate why having this choice is useful. We do so by providing yet another generalization of the problem of unambiguous state discrimination beyond the qubit case. Due to the complexity of the problem, we shall provide only an estimate of resources without delving into the intricacies of particular computations.

Following the results of [39, 40], unambiguous discrimination is possible only for linearly independent states and, thus, in dimension d we can distinguish unambiguously only up to d states. Together with the inconclusiveness outcome, we need \(d+1\) outcomes of the measurement. Translated to qubits, for the implementation of unambiguous state discrimination on m qubits, at least one additional qubit is necessary. This in general is not sufficient, but for our analysis of just a single partial measurement, we need to consider just one ancillary qubit.

In such a case one just needs to decide which outcome measurement is to be assigned to the ancillary qubit. A straightforward observation suggests there is space for optimization: considering Eq. (11) we see that in the coupling scheme the number of controlled operations equals the rank of the observed effect. In the standard unambiguous state discrimination [39] the effects for the conclusive outcomes are multiples of projections and, hence, rank-1, while the inconclusive effect \(A_?\) has, in general, rank \(d-1\). In higher dimensions this suggests an advantage of assigning one of the conclusive outcomes to the ancilla.

To quantify it, let us restrict to a dimension \(d=2^m\) of \(m\ge 2\) qubits. The problem of implementing the unambiguous discrimination of d states is split into (i) the part of coupling the system to the ancilla and performing chosen measurement on the ancilla, (ii) performing the rest of the measurement, and finally, (iii) the part for interpreting the result—if the ancillary measurement provides definitive answer, keep it and discard measurement outcomes on the system, and if the measurement on the ancilla does not provide a definitive answer, deduce correct outcome from the remaining measurements.

We are especially interested in point (i). Since our aim is to obtain a definitive outcome, choosing the inconclusive outcome for the ancilla seems to be less useful than picking one of the definitive outcomes. For one of the definitive outcomes, which are multiples of a projection, we need to perform only a single multi-controlled operation, while for the inconclusive outcome, this can be as many as \(d-1\) multi-controlled operations. Two-qubit operations bring considerable noise into practical computation and the rule of thumb is to minimize their number. In this case, using results of [41], the number of CNOTs to decompose an m-qubit multi-controlled real-valued operation would be \(16m-24\).

To reduce the number of CNOTs, it is therefore better to perform definitive measurements on the ancilla. This rough estimate minimizes the number of two-qubit operations. However, there can be other factors involving the decision for the ancilla-designated measurement, such as particularities of studied problem, or specific noise profiles of the quantum device.
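The rough estimate above can be tabulated with a short script. This is only a back-of-the-envelope sketch: it takes the count \(16m-24\) per multi-controlled operation and the number of such operations (1 for a definitive outcome, up to \(d-1\) for the inconclusive one) at face value from the text, and ignores all other parts of the circuit. The function names `cnot_estimate` and `compare` are hypothetical.

```python
def cnot_estimate(m, rank):
    """Rough CNOT count: one multi-controlled operation per unit of rank."""
    if m < 2:
        raise ValueError("the estimate is stated for m >= 2 qubits")
    per_operation = 16 * m - 24   # count per multi-controlled operation, as quoted from [41]
    return rank * per_operation

def compare(m):
    """CNOT estimates for assigning a definitive or the inconclusive outcome to the ancilla."""
    d = 2 ** m
    return {
        "definitive": cnot_estimate(m, 1),        # rank-1 effect
        "inconclusive": cnot_estimate(m, d - 1),  # rank up to d - 1
    }

estimates = {m: compare(m) for m in range(2, 6)}
```

Under these assumptions the gap between the two choices grows as \(d-1=2^m-1\), which is the reason for preferring a definitive outcome on the ancilla in this estimate.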

4 Discussion

We have presented a method of transforming complex general quantum measurements into a sequence of simple atomic measurements. Similarly to [19] we have provided a framework for the analysis, which we used to study quantum unambiguous state discrimination in its simplest setting on a qubit (analytically for the unbiased case and in less detail for the biased case). We were able to show that this framework describes almost exactly the construction of Peres [38] in the case we perform the conclusiveness measurement first.

However, the framework allows us to choose any other measurement on the ancilla, e.g., the measurement of whether we are presented with the first state. For such a case we devised the measurement procedure. The reason why one might opt for this option might be the following. Quantum measurements, even the simple ones, are quite demanding from the experimental point of view. In the current noisy quantum devices this means that during the measurement we lose a lot of quantum resources (coherence in particular). It might therefore be beneficial to perform measurements that give us the largest amount of information or the most useful information as early in the process of measurement as possible.

Take, for example, a case where we are presented with two states that are close to orthogonal. In such a case the measurement for conclusiveness will be followed by the state measurement with high probability, but the state presented to this second measurement will carry more noise due to short coherence times. If we instead perform the measurement for the first state, we will be presented with a definite answer with much higher probability and the second measurement will be performed less frequently. This may lead to a higher overall success rate.

At this point the procedure might not seem very useful, as in this simplest setting the measurement on the original system can be performed irrespective of the measurement on the qubit system. However, in constructions of measurements with larger number of outcomes, this flexibility might become important, as the measurements to be performed will be conditioned on the previous outcomes. The situation becomes apparent already for \(n=4\) outcomes, where subsequent measurements depend on the previous outcome.

This paper, however, does not present results on which choice for the ancillary measurement is optimal; it offers only a systematic construction based on the choice. The depth of the circuit may depend, in addition to the choice of the ancillary measurement, also on many other factors, such as the form of subsequent measurement, the ability to efficiently decompose different parts of implementation circuit, or even technical parameters of the particular quantum device.

We hope that this procedure can offer us more precise measurement processes on current quantum devices by offering a flexible and generic approach to implementing POVMs.