1. Introduction
The computation of channel capacity has always been a core problem in information theory. The well-known Blahut–Arimoto algorithm [1,2] was proposed in 1972 to compute the capacity of the discrete memoryless classical channel. Inspired by this algorithm, we propose an algorithm of Blahut–Arimoto type to compute the capacity of the discrete memoryless classical-quantum channel. The classical-quantum channel [3] can be considered as a mapping $x \mapsto \rho_x$ of an input alphabet $\mathcal{X} = \{1, 2, \ldots, n\}$ to a set of quantum states in a finite-dimensional Hilbert space $\mathcal{H}$. The state of a quantum system is given by a density operator $\rho$, which is a positive semi-definite operator with trace equal to one. Let $\mathcal{D}(\mathcal{H})$ denote the set of all density operators acting on a Hilbert space $\mathcal{H}$ of dimension $m$. If the source emits a letter $x$ with probability $p_x$, the output is $\rho_x$; thus, the outputs form an ensemble $\{p_x, \rho_x\}$.
In 1998, Holevo showed [4] that the classical capacity of the classical-quantum channel is the maximization of a quantity called the Holevo information over all input distributions. The Holevo information $\chi(\{p_x, \rho_x\})$ of an ensemble $\{p_x, \rho_x\}$ is defined as
$$\chi(\{p_x, \rho_x\}) := S\Big(\sum_x p_x \rho_x\Big) - \sum_x p_x S(\rho_x), \tag{1}$$
where $S(\cdot)$ is the von Neumann entropy, which is defined on positive semidefinite matrices:
$$S(\rho) := -\mathrm{Tr}[\rho \log \rho]. \tag{2}$$
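For illustration, Equations (1) and (2) translate directly into a few lines of Python (a minimal sketch; the function names are ours, and we use the base-2 logarithm as in the rest of the paper):

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -Tr[rho log2 rho], computed from the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]           # by convention, 0 log 0 = 0
    return float(-np.sum(evals * np.log2(evals)))

def holevo_information(p, rhos):
    """chi({p_x, rho_x}) = S(sum_x p_x rho_x) - sum_x p_x S(rho_x)."""
    sigma = sum(px * rho for px, rho in zip(p, rhos))
    return von_neumann_entropy(sigma) - sum(
        px * von_neumann_entropy(rho) for px, rho in zip(p, rhos))
```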
Due to the concavity of the von Neumann entropy [5], the Holevo information is always non-negative. The Holevo quantity is concave in the input distribution [5], and thus the maximization of Equation (1) over $p$ is a convex optimization problem. However, it is not a straightforward convex optimization problem. In 2014, Sutter et al. [6] proposed an algorithm based on the duality of convex programming and smoothing techniques [7] with a complexity of $O\big((n \vee m)\, m^3 (\log n)^{1/2} / \varepsilon\big)$, where $n \vee m := \max\{n, m\}$ and $\varepsilon$ is the desired accuracy.
For discrete memoryless classical channels, the capacity can be computed efficiently by the Blahut–Arimoto (BA) algorithm [1,2,8]. In 1998, H. Nagaoka [9] proposed a quantum version of the BA algorithm. In his work, he considered the quantum-quantum channel, and this problem was proved to be NP-complete in 2008 [10]. Despite the NP-completeness, an example is given in [11] of a qubit quantum channel that requires four inputs to maximize the Holevo capacity. Further research on Nagaoka's algorithm was presented in [12], where the algorithm was implemented to check the additivity of quantum channels. In [9], Nagaoka also mentioned an algorithm concerning the classical-quantum channel; however, its speed of convergence was not studied there, and the details of the proof were not presented either. In this paper, we show that, with proper manipulations, the BA algorithm can be applied to computing the capacity of the classical-quantum channel with an input constraint efficiently. The remainder of this article is structured as follows. In Section 2, we propose the algorithm and show how it works. In Section 3, we provide the convergence analysis of the algorithm. In Section 4, we present numerical experiments on the BA algorithm to see how well it performs. In Section 5, we propose an approximate solution for a special case: the binary-input, two-dimensional-output case.
Notations: The logarithm with base 2 is denoted by $\log$. The space of all Hermitian operators of dimension $m$ is denoted by $\mathcal{H}_m$. The set of all density matrices of dimension $m$ is denoted by $\mathcal{D}_m$. Each letter $x$ is mapped to a density matrix $\rho_x$; thus, the classical-quantum channel can be represented as a set of density matrices $\{\rho_x\}_{x=1}^n$. The set of all probability distributions of length $n$ is denoted by $\Delta_n$. The von Neumann entropy of a density matrix $\rho$ is denoted by $S(\rho)$. The relative entropy between two distributions $p, q \in \Delta_n$ is denoted by $D(p\|q) := \sum_x p_x(\log p_x - \log q_x)$ if $\mathrm{supp}(p) \subseteq \mathrm{supp}(q)$, and $D(p\|q) := +\infty$ otherwise. The relative entropy between two density matrices $\rho, \sigma \in \mathcal{D}_m$ is denoted by $D(\rho\|\sigma) := \mathrm{Tr}[\rho(\log\rho - \log\sigma)]$ if $\mathrm{supp}(\rho) \subseteq \mathrm{supp}(\sigma)$, and $D(\rho\|\sigma) := +\infty$ otherwise.
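The two relative entropies can also be evaluated numerically, as in the following sketch (our own illustration, valid under the stated support assumptions; the code does not attempt to detect the $+\infty$ case):

```python
import numpy as np

def rel_entropy(p, q):
    """D(p||q) for probability vectors; assumes supp(p) is contained in supp(q)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return float(np.sum(p[mask] * (np.log2(p[mask]) - np.log2(q[mask]))))

def quantum_rel_entropy(rho, sigma, tol=1e-12):
    """D(rho||sigma) = Tr[rho(log2 rho - log2 sigma)], same support assumption."""
    def log2m(a):                          # matrix log2 on the support of a
        w, v = np.linalg.eigh(a)
        w = np.where(w > tol, np.log2(np.maximum(w, tol)), 0.0)
        return (v * w) @ v.conj().T
    return float(np.real(np.trace(rho @ (log2m(rho) - log2m(sigma)))))
```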
2. Blahut–Arimoto Algorithm for Classical-Quantum Channel
First, we write down the primal optimization problem:
$$\begin{aligned} \text{maximize} \quad & \chi(\{p_x, \rho_x\}) \\ \text{subject to} \quad & \sum_x p_x s_x \le S, \\ & p \in \Delta_n, \end{aligned} \tag{3}$$
where $\Delta_n := \{p \in \mathbb{R}^n : p_x \ge 0, \sum_x p_x = 1\}$, $s = (s_1, \ldots, s_n)$ is a positive real vector, and $S > 0$. We denote the maximal value of Equation (3) as $C(S)$. In this optimization problem, we are to maximize the Holevo quantity with respect to the input distribution $p$. In practice, the preparation of a signal state $\rho_x$ has a cost, represented by $s_x$, which varies with $x$. We would like to bound the expected cost of the resource by some quantity, which is represented by the inequality constraint in Equation (3).
Lemma 1 ([6]). Let a set $G$ be defined as $G := \arg\max_{p \in \Delta_n} \chi(\{p_x, \rho_x\})$ and let $S^* := \min_{p \in G} \sum_x p_x s_x$. Then, if $S \ge S^*$, the inequality constraint in the primal problem is inactive; and, if $S < S^*$, the inequality constraint in the primal problem is equivalent to $\sum_x p_x s_x = S$.

Now, we assume that $S < S^*$. The Lagrange dual problem of Equation (3) is
$$\min_{\lambda \ge 0} \max_{p \in \Delta_n} \left[ \chi(\{p_x, \rho_x\}) + \lambda\Big(S - \sum_x p_x s_x\Big) \right]. \tag{4}$$
Lemma 2. Strong duality holds between Equations (3) and (4).

Proof. The lemma follows from the standard strong duality result of convex optimization theory ([13], Chapter 5.2.3). □
Define the following functions. Let
$$J(p, q, \lambda) := \sum_x p_x \Big( D(\rho_x \| \sigma_q) - \log\frac{p_x}{q_x} - \lambda s_x \Big) + \lambda S \tag{5}$$
and
$$F(p, \lambda) := \chi(\{p_x, \rho_x\}) + \lambda\Big(S - \sum_x p_x s_x\Big), \tag{6}$$
where $\sigma_q := \sum_x q_x \rho_x$.
Lemma 3. For fixed $p$, $\max_{q \in \Delta_n} J(p, q, \lambda) = F(p, \lambda)$.

Proof. Actually, we can prove a stronger lemma (the following lemma was proposed in [9], but no proof was given, perhaps due to space limitations). We now restate the lemma in [9] and give the proof. □
Lemma 4. For fixed $\{\rho_x\}$, we have
$$D(p \| q) \ge D(\sigma_p \| \sigma_q), \tag{7}$$
where $\sigma_p := \sum_x p_x \rho_x$ and $\sigma_q := \sum_x q_x \rho_x$.

Proof. Considering Equation (7), we have
$$D(p \| q) = D(\rho_{XB} \| \sigma_{XB}),$$
where $\rho_{XB} := \sum_x p_x |x\rangle\langle x| \otimes \rho_x$ and $\sigma_{XB} := \sum_x q_x |x\rangle\langle x| \otimes \rho_x$ are classical-quantum states [5]. Let the quantum channel $\mathcal{N}$ be the partial trace channel over the $X$ system, so that $\mathcal{N}(\rho_{XB}) = \sigma_p$ and $\mathcal{N}(\sigma_{XB}) = \sigma_q$; then, by the monotonicity of quantum relative entropy ([5], Theorem 11.8.1), we have
$$D(p \| q) = D(\rho_{XB} \| \sigma_{XB}) \ge D\big(\mathcal{N}(\rho_{XB}) \,\|\, \mathcal{N}(\sigma_{XB})\big) = D(\sigma_p \| \sigma_q). \qquad \square$$
Notice that, if we substitute $\sigma_p = \sum_x p_x \rho_x$ and $\sigma_q = \sum_x q_x \rho_x$ into Equation (7), then, with some calculation (using the identity $F(p,\lambda) - J(p,q,\lambda) = D(p\|q) - D(\sigma_p\|\sigma_q)$), Equation (7) becomes Lemma 3. Thus, Lemma 3 is a straightforward corollary of Lemma 4.
Theorem 1. The dual problem in Equation (4) is equivalent to
$$\min_{\lambda \ge 0} \max_{p, q \in \Delta_n} J(p, q, \lambda). \tag{8}$$

Proof. It follows from Equation (5) and Lemma 3 that
$$\max_{q \in \Delta_n} J(p, q, \lambda) = F(p, \lambda) \quad \text{for every } p \in \Delta_n.$$
Hence,
$$\min_{\lambda \ge 0} \max_{p, q \in \Delta_n} J(p, q, \lambda) = \min_{\lambda \ge 0} \max_{p \in \Delta_n} F(p, \lambda),$$
which is exactly Equation (4). □
The BA algorithm is an alternating optimization algorithm; i.e., to optimize $J(p, q, \lambda)$, each iteration step fixes one variable and optimizes the function over the other. Now, we use the BA algorithm to find $\max_{p, q \in \Delta_n} J(p, q, \lambda)$. The iteration procedure is
$$q^{(t)} = \arg\max_{q \in \Delta_n} J(p^{(t)}, q, \lambda) = p^{(t)}, \qquad p^{(t+1)} = \arg\max_{p \in \Delta_n} J(p, q^{(t)}, \lambda),$$
where the first maximization follows from Lemma 3 and $t = 0, 1, 2, \ldots$.
To get $\arg\max_{p \in \Delta_n} J(p, q, \lambda)$ for fixed $q$, we can use the Lagrange function
$$L(p, \mu) = J(p, q, \lambda) + \mu\Big(\sum_x p_x - 1\Big),$$
setting the gradient with respect to $p$ to zero. By combining this with the normalization condition, we have (taking the natural logarithm for convenience):
$$p_x = \frac{q_x \exp\big(D(\rho_x \| \sigma_q) - \lambda s_x\big)}{\sum_{x'} q_{x'} \exp\big(D(\rho_{x'} \| \sigma_q) - \lambda s_{x'}\big)}.$$
Thus, we can summarize the procedure as Algorithm 1 below.
Algorithm 1 Blahut–Arimoto algorithm for the discrete memoryless classical-quantum channel.
set $t = 0$ and $p^{(0)}_x = 1/n$ for $x = 1, \ldots, n$;
repeat
$\quad p^{(t+1)}_x = \dfrac{p^{(t)}_x \exp\big(D(\rho_x \| \sigma^{(t)}) - \lambda s_x\big)}{\sum_{x'} p^{(t)}_{x'} \exp\big(D(\rho_{x'} \| \sigma^{(t)}) - \lambda s_{x'}\big)}$, where $\sigma^{(t)} := \sum_x p^{(t)}_x \rho_x$; $t \leftarrow t + 1$;
until convergence.
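As a concrete illustration, a minimal Python sketch of Algorithm 1 for the unconstrained case ($\lambda = 0$) might look as follows; the function names and convergence test are ours, the update is carried out in natural logarithms as in the derivation above, and the stopping rule anticipates the upper and lower bounds discussed in Section 4:

```python
import numpy as np

def logm_psd(a, tol=1e-12):
    """Natural-base matrix logarithm on the support of a PSD matrix."""
    w, v = np.linalg.eigh(a)
    w = np.where(w > tol, np.log(np.maximum(w, tol)), 0.0)
    return (v * w) @ v.conj().T

def ba_capacity(rhos, eps=1e-6, max_iter=100000):
    """Algorithm 1 with no input constraint; returns (capacity in bits, p)."""
    n = len(rhos)
    p = np.full(n, 1.0 / n)                                # p^(0): uniform
    lower = 0.0
    for _ in range(max_iter):
        sigma = sum(px * r for px, r in zip(p, rhos))      # sigma^(t)
        log_sigma = logm_psd(sigma)
        d = np.array([np.real(np.trace(r @ (logm_psd(r) - log_sigma)))
                      for r in rhos])                      # D(rho_x||sigma^(t)), nats
        lower, upper = float(p @ d), float(d.max())        # chi(p^(t)) <= C <= max_x d[x]
        if upper - lower < eps:                            # stopping rule (cf. Section 4)
            break
        w = p * np.exp(d)                                  # BA update with lambda = 0
        p = w / w.sum()
    return lower / np.log(2), p                            # convert nats to bits
```

For $\lambda > 0$, the update simply acquires the extra factor $\exp(-\lambda s_x)$, as in Algorithm 1.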
Lemma 5. Let $p^{\lambda} := \arg\max_{p \in \Delta_n} F(p, \lambda)$ for a given $\lambda$; then, $E(\lambda) := \sum_x p^{\lambda}_x s_x$ is a decreasing function of $\lambda$.

Proof. For convenience, we write $\chi(\lambda) := \chi(\{p^{\lambda}_x, \rho_x\})$. Notice that $\chi(\lambda) - \lambda E(\lambda) = \max_{p \in \Delta_n} \big[\chi(\{p_x, \rho_x\}) - \lambda \sum_x p_x s_x\big]$ by the definition of $p^{\lambda}$.

For $\lambda_1 < \lambda_2$, if $E(\lambda_1) < E(\lambda_2)$, then, by the definition of $p^{\lambda_1}$ and $p^{\lambda_2}$, we have
$$\chi(\lambda_2) - \lambda_2 E(\lambda_2) \le \chi(\lambda_1) - \lambda_1 E(\lambda_1) - (\lambda_2 - \lambda_1) E(\lambda_2) < \chi(\lambda_1) - \lambda_2 E(\lambda_1),$$
which is a contradiction to the fact that $p^{\lambda_2}$ is an optimizer of $F(p, \lambda_2)$. Thus, we must have $E(\lambda_1) \ge E(\lambda_2)$ if $\lambda_1 < \lambda_2$. □
We do not need to solve the optimization problem in Equation (8) directly, because from Lemma 1 we can see that the statement "$p^*$ is an optimal solution" is equivalent to "$\sum_x p^*_x s_x = S$ and $p^*$ maximizes $\chi(\{p_x, \rho_x\})$ over the distributions with cost $S$", which is also equivalent to "$E(\lambda) = S$ and $p^*$ maximizes $F(p, \lambda)$" for the corresponding $\lambda$. Thus, if, for some $\lambda$, a $p$ maximizes $F(p, \lambda)$ and $\sum_x p_x s_x = S$, then the capacity is $C(S) = \chi(\{p_x, \rho_x\})$; such a $\lambda$ is easy to find, since $E(\lambda)$ is a decreasing function of $\lambda$, and, to reach an $\varepsilon$ accuracy, we need $O(\log(1/\varepsilon))$ steps using the bisection method.
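A sketch of this outer bisection loop is given below (our own construction; `ba_solve` reruns Algorithm 1 for a fixed $\lambda$, and the initial bracket `[0, lam_hi]` is assumed to contain the target $\lambda$):

```python
import numpy as np

def ba_solve(rhos, s, lam, n_iter=2000):
    """Run Algorithm 1 with fixed lambda; return the resulting distribution."""
    n = len(rhos)
    p = np.full(n, 1.0 / n)
    def logm(a, tol=1e-12):
        w, v = np.linalg.eigh(a)
        w = np.where(w > tol, np.log(np.maximum(w, tol)), 0.0)
        return (v * w) @ v.conj().T
    for _ in range(n_iter):
        sigma = sum(px * r for px, r in zip(p, rhos))
        log_sigma = logm(sigma)
        d = np.array([np.real(np.trace(r @ (logm(r) - log_sigma))) for r in rhos])
        w = p * np.exp(d - lam * s)               # update with the cost term
        p = w / w.sum()
    return p

def find_lambda(rhos, s, S, lam_hi=100.0):
    """Bisection on lambda, using the monotonicity of E(lambda) from Lemma 5."""
    s = np.asarray(s, float)
    lo, hi = 0.0, lam_hi                          # assumes E(lam_hi) <= S <= E(0)
    for _ in range(60):                           # interval shrinks by 2^-60
        lam = 0.5 * (lo + hi)
        p = ba_solve(rhos, s, lam)
        if p @ s > S:                             # expected cost still too high,
            lo = lam                              # ... so increase lambda
        else:
            hi = lam
    return lam, p
```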
4. Numerical Experiments on BA Algorithm
We only performed experiments on the BA algorithm with no input constraint (the BA algorithm with an input constraint is essentially a combination of runs of the BA algorithm with no input constraint). We studied the relation between the iteration complexity and $n$ and $m$ (i.e., the input size and the output dimension) when the algorithm reaches a certain accuracy. Since we do not know the true capacity of a given channel, we used the following theorem to bound the error of the algorithm.
Theorem 8. With the iteration procedure in the BA Algorithm 1, $\max_x D(\rho_x \| \sigma^{(t)})$ converges to $C$ from above.

Proof. Following from Algorithm 1, Corollary 1, and Theorem 3, we have
$$\lim_{t \to \infty} \frac{p^{(t+1)}_x}{p^{(t)}_x} = \lim_{t \to \infty} \frac{\exp\big(D(\rho_x \| \sigma^{(t)})\big)}{Z^{(t)}},$$
where $Z^{(t)} := \sum_{x'} p^{(t)}_{x'} \exp\big(D(\rho_{x'} \| \sigma^{(t)})\big)$ and $p^{(t)}$ converges to $p^*$, an optimal distribution. The limit above is 1 if $p^*_x > 0$ and does not exceed 1 if $p^*_x = 0$. Thus,
$$\lim_{t \to \infty} D(\rho_x \| \sigma^{(t)}) \le \lim_{t \to \infty} \log Z^{(t)} = C$$
for every $x$, with equality if $p^*_x > 0$. This proves
$$\lim_{t \to \infty} \max_x D(\rho_x \| \sigma^{(t)}) = C.$$
For any $t$ and any optimal distribution $p^*$, we have
$$\max_x D(\rho_x \| \sigma^{(t)}) \ge \sum_x p^*_x D(\rho_x \| \sigma^{(t)}) = D(\sigma^* \| \sigma^{(t)}) + \sum_x p^*_x D(\rho_x \| \sigma^*) = D(\sigma^* \| \sigma^{(t)}) + C \ge C,$$
where $\sigma^* := \sum_x p^*_x \rho_x$. The first equality requires some calculation, and the second equality follows since $p^*$ is an optimal distribution. This means $\max_x D(\rho_x \| \sigma^{(t)})$ converges to $C$ from above. □
Thus, our accuracy criterion was as follows: for a given classical-quantum channel, we ran the BA algorithm (with no input constraint) until $\max_x D(\rho_x \| \sigma^{(t)}) - \chi(\{p^{(t)}_x, \rho_x\})$ was less than the target accuracy $\varepsilon$, and recorded the number of iterations. At this point, the error was of order $\varepsilon$ at most, since $\max_x D(\rho_x \| \sigma^{(t)})$ and $\chi(\{p^{(t)}_x, \rho_x\})$ converge to the true capacity from above and below, respectively.
We performed the following numerical experiments: for given values of the input size $n$, the output dimension $m$, and the accuracy, we generated 200 classical-quantum channels randomly, recorded the numbers of iterations, and then calculated the average number of iterations over these 200 experiments. The results are shown in Figure 1. Note that an accuracy of $\varepsilon$ in Figure 1 means we ran the BA algorithm until $\max_x D(\rho_x \| \sigma^{(t)}) - \chi(\{p^{(t)}_x, \rho_x\})$ was less than $\varepsilon$, so the error between the true capacity and the computed value was of order $\varepsilon$ at most.
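The paper's sampling procedure for the random channels is not spelled out here; one standard choice, shown purely as an illustration, is to draw each output state from the Hilbert–Schmidt (Ginibre) ensemble:

```python
import numpy as np

def random_density_matrix(m, rng):
    """A random m x m density matrix from the Hilbert-Schmidt (Ginibre) ensemble."""
    g = rng.normal(size=(m, m)) + 1j * rng.normal(size=(m, m))
    rho = g @ g.conj().T                      # positive semi-definite by construction
    return rho / np.real(np.trace(rho))       # normalize to unit trace

def random_cq_channel(n, m, seed=None):
    """A random classical-quantum channel: n letters mapped to m-dimensional states."""
    rng = np.random.default_rng(seed)
    return [random_density_matrix(m, rng) for _ in range(n)]
```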
We can see in Figure 1 that the iteration complexity scales better as the accuracy and the input size increase. We can also see that, for a given input size $n$ and accuracy, the output dimension has very little influence on the iteration complexity, which means the iteration complexity also scales well as the output dimension $m$ increases. Compare this with our theoretical analysis of the iteration complexity in Theorem 4: to reach $\varepsilon$ accuracy, we need $\log n / \varepsilon$ iterations; the numerical experiments showed that the number of iterations was far smaller than $\log n / \varepsilon$ to reach $\varepsilon$ accuracy, whether the output quantum states were independent or not (cases in Figure 1). The reason for this is that the inequalities in the proof of Theorem 4 are quite loose; thus, Theorem 4 only provides a very loose upper bound on the iteration complexity. We may also conjecture that the relation in Equation (20) holds generally, although we cannot prove it yet.
Next, we examined the running time of the BA algorithm. There are three methods to compute the classical-quantum channel capacity: the BA algorithm, the duality and smoothing technique [6], and the method of Fawzi et al. [17]. In [17], a Matlab code package called CvxQuad is provided, which accurately approximates the relative entropy function via semidefinite programming, so that many quantum capacity quantities can be computed using the convex optimization tool CVX. Here, we compared the running times of the above three methods. For each input size $n$ and output dimension $m$, we generated a classical-quantum channel randomly, computed the channel capacity using the three methods, and recorded the running time of each method. The results are shown in Figure 2.
In Figure 2, we can see that the BA algorithm was the fastest method. The duality and smoothing method was rather slow, and we did not record its running time for larger values of $n$ and $m$ because it took too long. We can also notice that the running time of the CvxQuad method was extremely sensitive to the output dimension, which is not a surprise because the solvers CVX relies on are second-order methods. Thus, our BA algorithm was significantly faster than the other two methods as $n$ and $m$ became large.