1. Introduction
The computation of channel capacity has always been a core problem in information theory. The well-known Blahut–Arimoto algorithm [1,2] was proposed in 1972 to compute the capacity of the discrete memoryless classical channel. Inspired by this algorithm, we propose an algorithm of Blahut–Arimoto type to compute the capacity of the discrete memoryless classical-quantum channel. The classical-quantum channel [3] can be considered as a mapping $x \mapsto \rho_x$ of an input alphabet $\mathcal{X} = \{1, 2, \ldots, n\}$ to a set of quantum states in a finite-dimensional Hilbert space $\mathcal{H}$. The state of a quantum system is given by a density operator $\rho$, which is a positive semi-definite operator with trace equal to one. Let $\mathcal{D}(\mathcal{H})$ denote the set of all density operators acting on a Hilbert space $\mathcal{H}$ of dimension $m$. If the source emits a letter $x$ with probability $p_x$, the output is $\rho_x$; thus, the outputs form an ensemble $\{p_x, \rho_x\}$.
In 1998, Holevo showed [4] that the classical capacity of the classical-quantum channel is the maximization of a quantity called the Holevo information over all input distributions. The Holevo information $\chi(\{p_x, \rho_x\})$ of an ensemble $\{p_x, \rho_x\}$ is defined as
$$\chi(\{p_x, \rho_x\}) := S\Big(\sum_x p_x \rho_x\Big) - \sum_x p_x S(\rho_x), \tag{1}$$
where $S(\cdot)$ is the von Neumann entropy, which is defined on positive semidefinite matrices:
$$S(\rho) := -\mathrm{Tr}[\rho \log \rho]. \tag{2}$$
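For illustration, Equations (1) and (2) translate directly into a few lines of Python (a minimal sketch; the function names are ours, and we use the base-2 logarithm as in the rest of the paper):

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -Tr[rho log2 rho], computed from the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]           # by convention, 0 log 0 = 0
    return float(-np.sum(evals * np.log2(evals)))

def holevo_information(p, rhos):
    """chi({p_x, rho_x}) = S(sum_x p_x rho_x) - sum_x p_x S(rho_x)."""
    sigma = sum(px * rho for px, rho in zip(p, rhos))
    return von_neumann_entropy(sigma) - sum(
        px * von_neumann_entropy(rho) for px, rho in zip(p, rhos))
```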
Due to the concavity of the von Neumann entropy [5], the Holevo information is always non-negative. The Holevo quantity is concave in the input distribution [5], and thus the maximization of Equation (1) over $p$ is a convex optimization problem. However, it is not a straightforward convex optimization problem. In 2014, Sutter et al. [6] proposed an algorithm based on the duality of convex programming and smoothing techniques [7] with a complexity of $O\big((n \vee m)\, m^3 (\log n)^{1/2} / \varepsilon\big)$, where $n \vee m := \max\{n, m\}$ and $\varepsilon$ is the desired accuracy.
For discrete memoryless classical channels, the capacity can be computed efficiently by the Blahut–Arimoto (BA) algorithm [1,2,8]. In 1998, H. Nagaoka [9] proposed a quantum version of the BA algorithm. In his work, he considered the quantum-quantum channel, and this problem was proved to be NP-complete in 2008 [10]. Despite the NP-completeness, an example is given in [11] of a qubit quantum channel that requires four inputs to maximize the Holevo capacity. Further research on Nagaoka's algorithm was presented in [12], where the algorithm was implemented to check the additivity of quantum channels. In [9], Nagaoka also mentioned an algorithm concerning the classical-quantum channel; however, its speed of convergence was not studied there, and the details of the proof were not presented either. In this paper, we show that, with proper manipulations, the BA algorithm can be applied to computing the capacity of the classical-quantum channel with an input constraint efficiently. The remainder of this article is structured as follows. In Section 2, we propose the algorithm and show how it works. In Section 3, we provide the convergence analysis of the algorithm. In Section 4, we present numerical experiments on the BA algorithm to see how well it performs. In Section 5, we propose an approximate solution for a special case: the binary-input, two-dimensional-output case.
Notations: The logarithm with base 2 is denoted by $\log$. The space of all Hermitian operators of dimension $m$ is denoted by $\mathcal{H}_m$. The set of all density matrices of dimension $m$ is denoted by $\mathcal{D}_m$. Each letter $x$ is mapped to a density matrix $\rho_x$; thus, the classical-quantum channel can be represented as a set of density matrices $\{\rho_x\}_{x=1}^n$. The set of all probability distributions of length $n$ is denoted by $\Delta_n$. The von Neumann entropy of a density matrix $\rho$ is denoted by $S(\rho)$. The relative entropy between two distributions $p, q \in \Delta_n$ is denoted by $D(p\|q) := \sum_x p_x(\log p_x - \log q_x)$ if $\mathrm{supp}(p) \subseteq \mathrm{supp}(q)$, and $D(p\|q) := +\infty$ otherwise. The relative entropy between two density matrices $\rho, \sigma \in \mathcal{D}_m$ is denoted by $D(\rho\|\sigma) := \mathrm{Tr}[\rho(\log\rho - \log\sigma)]$ if $\mathrm{supp}(\rho) \subseteq \mathrm{supp}(\sigma)$, and $D(\rho\|\sigma) := +\infty$ otherwise.
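The two relative entropies can also be evaluated numerically, as in the following sketch (our own illustration, valid under the stated support assumptions; the code does not attempt to detect the $+\infty$ case):

```python
import numpy as np

def rel_entropy(p, q):
    """D(p||q) for probability vectors; assumes supp(p) is contained in supp(q)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return float(np.sum(p[mask] * (np.log2(p[mask]) - np.log2(q[mask]))))

def quantum_rel_entropy(rho, sigma, tol=1e-12):
    """D(rho||sigma) = Tr[rho(log2 rho - log2 sigma)], same support assumption."""
    def log2m(a):                          # matrix log2 on the support of a
        w, v = np.linalg.eigh(a)
        w = np.where(w > tol, np.log2(np.maximum(w, tol)), 0.0)
        return (v * w) @ v.conj().T
    return float(np.real(np.trace(rho @ (log2m(rho) - log2m(sigma)))))
```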
2. Blahut–Arimoto Algorithm for Classical-Quantum Channel
First, we write down the primal optimization problem:
$$\begin{aligned} \text{maximize} \quad & \chi(\{p_x, \rho_x\}) \\ \text{subject to} \quad & \sum_x p_x s_x \le S, \\ & p \in \Delta_n, \end{aligned} \tag{3}$$
where $\Delta_n := \{p \in \mathbb{R}^n : p_x \ge 0, \sum_x p_x = 1\}$, $s = (s_1, \ldots, s_n)$ is a positive real vector, and $S > 0$. We denote the maximal value of Equation (3) as $C(S)$. In this optimization problem, we are to maximize the Holevo quantity with respect to the input distribution $p$. In practice, the preparation of a signal state $\rho_x$ has a cost, represented by $s_x$, which varies with $x$. We would like to bound the expected cost of the resource by some quantity, which is represented by the inequality constraint in Equation (3).
Lemma 1 ([6]). Let a set $G$ be defined as $G := \arg\max_{p \in \Delta_n} \chi(\{p_x, \rho_x\})$ and let $S^* := \min_{p \in G} \sum_x p_x s_x$. Then, if $S \ge S^*$, the inequality constraint in the primal problem is inactive; and, if $S < S^*$, the inequality constraint in the primal problem is equivalent to $\sum_x p_x s_x = S$.

Now, we assume that $S < S^*$. The Lagrange dual problem of Equation (3) is
$$\min_{\lambda \ge 0} \max_{p \in \Delta_n} \left[ \chi(\{p_x, \rho_x\}) + \lambda\Big(S - \sum_x p_x s_x\Big) \right]. \tag{4}$$
Lemma 2. Strong duality holds between Equations (3) and (4).

Proof. The lemma follows from the standard strong duality result of convex optimization theory ([13], Chapter 5.2.3). □
Define the following functions. Let
$$J(p, q, \lambda) := \sum_x p_x \Big( D(\rho_x \| \sigma_q) - \log\frac{p_x}{q_x} - \lambda s_x \Big) + \lambda S \tag{5}$$
and
$$F(p, \lambda) := \chi(\{p_x, \rho_x\}) + \lambda\Big(S - \sum_x p_x s_x\Big), \tag{6}$$
where $\sigma_q := \sum_x q_x \rho_x$.
Lemma 3. For fixed $p$, $\max_{q \in \Delta_n} J(p, q, \lambda) = F(p, \lambda)$.

Proof. Actually, we can prove a stronger lemma (the following lemma was proposed in [9], but no proof was given, perhaps due to space limitations). We now restate the lemma in [9] and give the proof. □
Lemma 4. For fixed $\{\rho_x\}$, we have
$$D(p \| q) \ge D(\sigma_p \| \sigma_q), \tag{7}$$
where $\sigma_p := \sum_x p_x \rho_x$ and $\sigma_q := \sum_x q_x \rho_x$.

Proof. Considering Equation (7), we have
$$D(p \| q) = D(\rho_{XB} \| \sigma_{XB}),$$
where $\rho_{XB} := \sum_x p_x |x\rangle\langle x| \otimes \rho_x$ and $\sigma_{XB} := \sum_x q_x |x\rangle\langle x| \otimes \rho_x$ are classical-quantum states [5]. Let the quantum channel $\mathcal{N}$ be the partial trace channel over the $X$ system, so that $\mathcal{N}(\rho_{XB}) = \sigma_p$ and $\mathcal{N}(\sigma_{XB}) = \sigma_q$; then, by the monotonicity of quantum relative entropy ([5], Theorem 11.8.1), we have
$$D(p \| q) = D(\rho_{XB} \| \sigma_{XB}) \ge D\big(\mathcal{N}(\rho_{XB}) \,\|\, \mathcal{N}(\sigma_{XB})\big) = D(\sigma_p \| \sigma_q). \qquad \square$$
Notice that, if we substitute $\sigma_p = \sum_x p_x \rho_x$ and $\sigma_q = \sum_x q_x \rho_x$ into Equation (7), then, with some calculation (using the identity $F(p,\lambda) - J(p,q,\lambda) = D(p\|q) - D(\sigma_p\|\sigma_q)$), Equation (7) becomes Lemma 3. Thus, Lemma 3 is a straightforward corollary of Lemma 4.
Theorem 1. The dual problem in Equation (4) is equivalent to
$$\min_{\lambda \ge 0} \max_{p, q \in \Delta_n} J(p, q, \lambda). \tag{8}$$

Proof. It follows from Equation (5) and Lemma 3 that
$$\max_{q \in \Delta_n} J(p, q, \lambda) = F(p, \lambda) \quad \text{for every } p \in \Delta_n.$$
Hence,
$$\min_{\lambda \ge 0} \max_{p, q \in \Delta_n} J(p, q, \lambda) = \min_{\lambda \ge 0} \max_{p \in \Delta_n} F(p, \lambda),$$
which is exactly Equation (4). □
The BA algorithm is an alternating optimization algorithm; i.e., to optimize $J(p, q, \lambda)$, each iteration step fixes one variable and optimizes the function over the other. Now, we use the BA algorithm to find $\max_{p, q \in \Delta_n} J(p, q, \lambda)$. The iteration procedure is
$$q^{(t)} = \arg\max_{q \in \Delta_n} J(p^{(t)}, q, \lambda) = p^{(t)}, \qquad p^{(t+1)} = \arg\max_{p \in \Delta_n} J(p, q^{(t)}, \lambda),$$
where the first maximization follows from Lemma 3 and $t = 0, 1, 2, \ldots$.
To get $\arg\max_{p \in \Delta_n} J(p, q, \lambda)$ for fixed $q$, we can use the Lagrange function
$$L(p, \mu) = J(p, q, \lambda) + \mu\Big(\sum_x p_x - 1\Big),$$
setting the gradient with respect to $p$ to zero. By combining this with the normalization condition, we have (taking the natural logarithm for convenience):
$$p_x = \frac{q_x \exp\big(D(\rho_x \| \sigma_q) - \lambda s_x\big)}{\sum_{x'} q_{x'} \exp\big(D(\rho_{x'} \| \sigma_q) - \lambda s_{x'}\big)}.$$
Thus, we can summarize the procedure as Algorithm 1 below.
Algorithm 1 Blahut–Arimoto algorithm for the discrete memoryless classical-quantum channel.
set $t = 0$ and $p^{(0)}_x = 1/n$ for $x = 1, \ldots, n$;
repeat
$\quad p^{(t+1)}_x = \dfrac{p^{(t)}_x \exp\big(D(\rho_x \| \sigma^{(t)}) - \lambda s_x\big)}{\sum_{x'} p^{(t)}_{x'} \exp\big(D(\rho_{x'} \| \sigma^{(t)}) - \lambda s_{x'}\big)}$, where $\sigma^{(t)} := \sum_x p^{(t)}_x \rho_x$; $t \leftarrow t + 1$;
until convergence.
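As a concrete illustration, a minimal Python sketch of Algorithm 1 for the unconstrained case ($\lambda = 0$) might look as follows; the function names and convergence test are ours, the update is carried out in natural logarithms as in the derivation above, and the stopping rule anticipates the upper and lower bounds discussed in Section 4:

```python
import numpy as np

def logm_psd(a, tol=1e-12):
    """Natural-base matrix logarithm on the support of a PSD matrix."""
    w, v = np.linalg.eigh(a)
    w = np.where(w > tol, np.log(np.maximum(w, tol)), 0.0)
    return (v * w) @ v.conj().T

def ba_capacity(rhos, eps=1e-6, max_iter=100000):
    """Algorithm 1 with no input constraint; returns (capacity in bits, p)."""
    n = len(rhos)
    p = np.full(n, 1.0 / n)                                # p^(0): uniform
    lower = 0.0
    for _ in range(max_iter):
        sigma = sum(px * r for px, r in zip(p, rhos))      # sigma^(t)
        log_sigma = logm_psd(sigma)
        d = np.array([np.real(np.trace(r @ (logm_psd(r) - log_sigma)))
                      for r in rhos])                      # D(rho_x||sigma^(t)), nats
        lower, upper = float(p @ d), float(d.max())        # chi(p^(t)) <= C <= max_x d[x]
        if upper - lower < eps:                            # stopping rule (cf. Section 4)
            break
        w = p * np.exp(d)                                  # BA update with lambda = 0
        p = w / w.sum()
    return lower / np.log(2), p                            # convert nats to bits
```

For $\lambda > 0$, the update simply acquires the extra factor $\exp(-\lambda s_x)$, as in Algorithm 1.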
Lemma 5. Let $p^{\lambda} := \arg\max_{p \in \Delta_n} F(p, \lambda)$ for a given $\lambda$; then, $E(\lambda) := \sum_x p^{\lambda}_x s_x$ is a decreasing function of $\lambda$.

Proof. For convenience, we write $\chi(\lambda) := \chi(\{p^{\lambda}_x, \rho_x\})$. Notice that $\chi(\lambda) - \lambda E(\lambda) = \max_{p \in \Delta_n} \big[\chi(\{p_x, \rho_x\}) - \lambda \sum_x p_x s_x\big]$ by the definition of $p^{\lambda}$.

For $\lambda_1 < \lambda_2$, if $E(\lambda_1) < E(\lambda_2)$, then, by the definition of $p^{\lambda_1}$ and $p^{\lambda_2}$, we have
$$\chi(\lambda_2) - \lambda_2 E(\lambda_2) \le \chi(\lambda_1) - \lambda_1 E(\lambda_1) - (\lambda_2 - \lambda_1) E(\lambda_2) < \chi(\lambda_1) - \lambda_2 E(\lambda_1),$$
which is a contradiction to the fact that $p^{\lambda_2}$ is an optimizer of $F(p, \lambda_2)$. Thus, we must have $E(\lambda_1) \ge E(\lambda_2)$ if $\lambda_1 < \lambda_2$. □
We do not need to solve the optimization problem in Equation (8) directly, because from Lemma 1 we can see that the statement "$p^*$ is an optimal solution" is equivalent to "$\sum_x p^*_x s_x = S$ and $p^*$ maximizes $\chi(\{p_x, \rho_x\})$ over the distributions with cost $S$", which is also equivalent to "$E(\lambda) = S$ and $p^*$ maximizes $F(p, \lambda)$" for the corresponding $\lambda$. Thus, if, for some $\lambda$, a $p$ maximizes $F(p, \lambda)$ and $\sum_x p_x s_x = S$, then the capacity is $C(S) = \chi(\{p_x, \rho_x\})$; such a $\lambda$ is easy to find, since $E(\lambda)$ is a decreasing function of $\lambda$, and, to reach an $\varepsilon$ accuracy, we need $O(\log(1/\varepsilon))$ steps using the bisection method.
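A sketch of this outer bisection loop is given below (our own construction; `ba_solve` reruns Algorithm 1 for a fixed $\lambda$, and the initial bracket `[0, lam_hi]` is assumed to contain the target $\lambda$):

```python
import numpy as np

def ba_solve(rhos, s, lam, n_iter=2000):
    """Run Algorithm 1 with fixed lambda; return the resulting distribution."""
    n = len(rhos)
    p = np.full(n, 1.0 / n)
    def logm(a, tol=1e-12):
        w, v = np.linalg.eigh(a)
        w = np.where(w > tol, np.log(np.maximum(w, tol)), 0.0)
        return (v * w) @ v.conj().T
    for _ in range(n_iter):
        sigma = sum(px * r for px, r in zip(p, rhos))
        log_sigma = logm(sigma)
        d = np.array([np.real(np.trace(r @ (logm(r) - log_sigma))) for r in rhos])
        w = p * np.exp(d - lam * s)               # update with the cost term
        p = w / w.sum()
    return p

def find_lambda(rhos, s, S, lam_hi=100.0):
    """Bisection on lambda, using the monotonicity of E(lambda) from Lemma 5."""
    s = np.asarray(s, float)
    lo, hi = 0.0, lam_hi                          # assumes E(lam_hi) <= S <= E(0)
    for _ in range(60):                           # interval shrinks by 2^-60
        lam = 0.5 * (lo + hi)
        p = ba_solve(rhos, s, lam)
        if p @ s > S:                             # expected cost still too high,
            lo = lam                              # ... so increase lambda
        else:
            hi = lam
    return lam, p
```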
4. Numerical Experiments on BA Algorithm
We only performed experiments on the BA algorithm with no input constraint (the BA algorithm with an input constraint is essentially a combination of runs of the BA algorithm with no input constraint). We studied the relation between the iteration complexity and $n$ and $m$ (i.e., the input size and the output dimension) when the algorithm reaches a certain accuracy. Since we do not know the true capacity of a given channel, we used the following theorem to bound the error of the algorithm.
Theorem 8. With the iteration procedure in the BA Algorithm 1, $\max_x D(\rho_x \| \sigma^{(t)})$ converges to $C$ from above.

Proof. Following from Algorithm 1, Corollary 1, and Theorem 3, we have
$$\lim_{t \to \infty} \frac{p^{(t+1)}_x}{p^{(t)}_x} = \lim_{t \to \infty} \frac{\exp\big(D(\rho_x \| \sigma^{(t)})\big)}{Z^{(t)}},$$
where $Z^{(t)} := \sum_{x'} p^{(t)}_{x'} \exp\big(D(\rho_{x'} \| \sigma^{(t)})\big)$ and $p^{(t)}$ converges to $p^*$, an optimal distribution. The limit above is 1 if $p^*_x > 0$ and does not exceed 1 if $p^*_x = 0$. Thus,
$$\lim_{t \to \infty} D(\rho_x \| \sigma^{(t)}) \le \lim_{t \to \infty} \log Z^{(t)} = C$$
for every $x$, with equality if $p^*_x > 0$. This proves
$$\lim_{t \to \infty} \max_x D(\rho_x \| \sigma^{(t)}) = C.$$
For any $t$ and any optimal distribution $p^*$, we have
$$\max_x D(\rho_x \| \sigma^{(t)}) \ge \sum_x p^*_x D(\rho_x \| \sigma^{(t)}) = D(\sigma^* \| \sigma^{(t)}) + \sum_x p^*_x D(\rho_x \| \sigma^*) = D(\sigma^* \| \sigma^{(t)}) + C \ge C,$$
where $\sigma^* := \sum_x p^*_x \rho_x$. The first equality requires some calculation, and the second equality follows since $p^*$ is an optimal distribution. This means $\max_x D(\rho_x \| \sigma^{(t)})$ converges to $C$ from above. □
Thus, our accuracy criterion was as follows: for a given classical-quantum channel, we ran the BA algorithm (with no input constraint) until $\max_x D(\rho_x \| \sigma^{(t)}) - \chi(\{p^{(t)}_x, \rho_x\})$ was less than the target accuracy $\varepsilon$, and recorded the number of iterations. At this point, the error was of order $\varepsilon$ at most, since $\max_x D(\rho_x \| \sigma^{(t)})$ and $\chi(\{p^{(t)}_x, \rho_x\})$ converge to the true capacity from above and below, respectively.
We performed the following numerical experiments: for given values of the input size $n$, the output dimension $m$, and the accuracy, we generated 200 classical-quantum channels randomly, recorded the numbers of iterations, and then calculated the average number of iterations over these 200 experiments. The results are shown in Figure 1. Note that an accuracy of $\varepsilon$ in Figure 1 means we ran the BA algorithm until $\max_x D(\rho_x \| \sigma^{(t)}) - \chi(\{p^{(t)}_x, \rho_x\})$ was less than $\varepsilon$, so the error between the true capacity and the computed value was of order $\varepsilon$ at most.
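The paper's sampling procedure for the random channels is not spelled out here; one standard choice, shown purely as an illustration, is to draw each output state from the Hilbert–Schmidt (Ginibre) ensemble:

```python
import numpy as np

def random_density_matrix(m, rng):
    """A random m x m density matrix from the Hilbert-Schmidt (Ginibre) ensemble."""
    g = rng.normal(size=(m, m)) + 1j * rng.normal(size=(m, m))
    rho = g @ g.conj().T                      # positive semi-definite by construction
    return rho / np.real(np.trace(rho))       # normalize to unit trace

def random_cq_channel(n, m, seed=None):
    """A random classical-quantum channel: n letters mapped to m-dimensional states."""
    rng = np.random.default_rng(seed)
    return [random_density_matrix(m, rng) for _ in range(n)]
```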
We can see in Figure 1 that the iteration complexity scales better as the accuracy and the input size increase. We can also see that, for a given input size $n$ and accuracy, the output dimension has very little influence on the iteration complexity, which means the iteration complexity also scales well as the output dimension $m$ increases. Compare this with our theoretical analysis of the iteration complexity in Theorem 4: to reach $\varepsilon$ accuracy, we need $\log n / \varepsilon$ iterations; the numerical experiments showed that the number of iterations was far smaller than $\log n / \varepsilon$ to reach $\varepsilon$ accuracy, whether the output quantum states were independent or not (cases in Figure 1). The reason for this is that the inequalities in the proof of Theorem 4 are quite loose; thus, Theorem 4 only provides a very loose upper bound on the iteration complexity. We may also conjecture that the relation in Equation (20) holds generally, although we cannot prove it yet.
Next, we examined the running time of the BA algorithm. There are three methods to compute the classical-quantum channel capacity: the BA algorithm, the duality and smoothing technique [6], and the method of Fawzi et al. [17]. In [17], a Matlab code package called CvxQuad is provided, which accurately approximates the relative entropy function via semidefinite programming, so that many quantum capacity quantities can be computed using the convex optimization tool CVX. Here, we compared the running times of the above three methods. For each input size $n$ and output dimension $m$, we generated a classical-quantum channel randomly, computed the channel capacity using the three methods, and recorded the running time of each method. The results are shown in Figure 2.
In Figure 2, we can see that the BA algorithm was the fastest method. The duality and smoothing method was rather slow, and we did not record its running time for larger values of $n$ and $m$ because it took too long. We can also notice that the running time of the CvxQuad method was extremely sensitive to the output dimension, which is not a surprise because the solvers CVX relies on are second-order methods. Thus, our BA algorithm was significantly faster than the other two methods as $n$ and $m$ became large.