

Linear Network Error Correction Coding: A Revisit
Xuan Guang and Raymond W. Yeung
arXiv:2103.08081v1 [cs.IT] 15 Mar 2021

Abstract

We consider linear network error correction (LNEC) coding when errors may occur on the edges of a communication network whose topology is known. In this paper, we first revisit and explore the
framework of LNEC coding, and then unify two well-known LNEC coding approaches. Furthermore,
by developing a graph-theoretic approach to the framework of LNEC coding, we obtain a significantly
enhanced characterization of the error correction capability of LNEC codes in terms of the minimum
distances at the sink nodes. In LNEC coding, the minimum required field size for the existence of LNEC
codes, in particular LNEC maximum distance separable (MDS) codes, which are among the most important optimal codes, is an open problem not only of theoretical interest but also of practical importance, because
it is closely related to the implementation of the coding scheme in terms of computational complexity and
storage requirement. By applying the graph-theoretic approach, we obtain an improved upper bound on
the minimum required field size. The improvement over the existing results is in general significant. The
improved upper bound, which is graph-theoretic, depends only on the network topology and the required error correction capability, but not on a specific code construction. However, this bound is not given
in an explicit form. We thus develop an efficient algorithm that can compute the bound in linear time. In
developing the upper bound and the efficient algorithm for computing this bound, various graph-theoretic
concepts are introduced. These concepts appear to be of fundamental interest in graph theory and they
may have further applications in graph theory and beyond.

I. INTRODUCTION

In 1956, the problem of maximizing the rate of flow from a source node to a sink node through a
network was considered independently by Elias et al. [1] and Ford and Fulkerson [2], where, regardless
of whether the flow is a commodity flow or an information flow, the value of the maximum flow is
equal to the capacity of a minimum cut separating the sink node from the source node. This result is the

This paper was presented in part at the 2020 IEEE International Symposium on Information Theory.

DRAFT

celebrated max-flow min-cut theorem, proved in [1] and [2]. In 2000, Ahlswede et al. [3] put forward the
general concept of network coding that allows the intermediate nodes in a noiseless network to process
the received information. In particular, they focused on the single-source network coding problem on a
general network and proved that if coding is applied at the nodes in a network, rather than routing only,
the single source node can multicast messages to all the sink nodes at the theoretically maximum rate, i.e.,
the smallest minimum cut capacity between the source node and a sink node, as the alphabet size of both
the information source and the channel transmission symbol tends to infinity. This result can be regarded
as the max-flow min-cut theorem for information flow from a source node multicasting to multiple sink
nodes through a network, as well as a generalization of the classical max-flow min-cut theorem from
a source node to a sink node through a network. The idea of network coding dates back to
Celebiler and Stette’s work [4] in 1978, where they proposed a scheme that can improve the efficiency of
a two-way satellite communication system by performing the addition of two bits onboard the satellite. In
1999, Yeung and Zhang [5] investigated the general coding problem in a satellite communication system
and obtained an inner bound and an outer bound on the capacity region. Shortly after [3], Li et al. [6]
proved that linear network coding with a finite alphabet is sufficient for optimal multicast by means of
a vector space approach. Independently, Koetter and Médard [7] developed an algebraic characterization
of linear network coding by means of a matrix approach. The above two approaches correspond to the
global and local descriptions of linear network coding, respectively. For comprehensive discussions of
network coding, we refer the reader to [8]–[12].
In the paradigm of network coding, network error correction is necessary when errors may occur on the
edges of a communication network. For example, network transmission may suffer from random errors
caused by channel (edge in networks) noise, erasure errors caused by link failure or buffer overflow,
corruption errors caused by malicious attack, etc. In general, the problem induced by errors in network
coding can be more serious than the one in a classical point-to-point communication system, because
errors will be propagated by the coding operations at the intermediate nodes. Even a single error occurring
on an edge has the potential of polluting all the “downstream” messages. The network coding techniques
for combating network errors are referred to as network error correction coding. In particular, the linear
network coding techniques for combating network errors are referred to as linear network error correction
(LNEC) coding, which was introduced in [13] and investigated widely in the literature, e.g., [14]–[22].
A very special case of network error correction coding over the simplest network is depicted in Fig. 1,
where the network consists of only two nodes, a source node s and a sink node t, connected by multiple
parallel edges from s to t. This special case of network error correction coding can be regarded as the
model of classical coding theory (cf. [23], [24]), a very rich field of research originating from


Fig. 1: An equivalent model of the classical coding theory.

Shannon’s seminal work [25] in 1948.

A. Related Works

Network error correction coding was first considered by Cai and Yeung [13]. Subsequently, they
further developed network error correction coding in their two-part paper [14], [15] as a generalization
of algebraic coding from the point-to-point setting to the network setting. In particular, three important
bounds in algebraic coding, the Hamming bound, the Gilbert-Varshamov bound, and the Singleton bound,
are generalized for network error correction coding, where the error correction capabilities at all the sink
nodes are the same. Subsequently, the Singleton bound was refined independently by Zhang [16] and
Yang et al. [17], where the error correction capabilities at the sink nodes can be different. This refined
Singleton bound shows that sink nodes with larger maximum flow values from the source node can
have potentially higher error correction capability. Similar refinements for the Hamming bound and the
Gilbert-Varshamov bound were also provided in [17]. In the rest of the paper, the refined Singleton bound
will be called the Singleton bound for network error correction coding.
Two frameworks of LNEC coding were developed in [16] and [26]. In order to characterize the error
correction capability of an LNEC code, Zhang [16] directly defined a minimum distance at each sink node
by using the concept of the rank of an error pattern, which can be regarded as a “measure”
of the error pattern. Subsequently, Guang et al. [21] proved that this minimum distance can also be obtained
by using other measures of error patterns. Yang et al. [17] considered multiple weight measures on the
error vectors occurring in the network to characterize the error correction capability of an LNEC code. They
further proved that these weight measures induce the same minimum weight decoder. The construction of
LNEC codes has been investigated in the literature. In [17], [21], [27], different constructions of LNEC
maximum distance separable (MDS) codes were put forward, where LNEC MDS codes are among the most
important optimal codes, achieving the Singleton bound with equality. These constructions also imply


the tightness of the Singleton bound. Besides, the construction in [21] can also be applied to construct
a general LNEC code with any admissible requirement of the rate and error correction capability, which
includes LNEC MDS codes as a special case. Further, Guang et al. [28] considered the problem of network
error correction coding when the information rate changes over time. To efficiently solve this problem,
local-encoding-preserving LNEC coding was put forward, where a family of LNEC codes is called local-
encoding-preserving if all the LNEC codes in this family share a common local encoding kernel at each
intermediate node in the network. In order to achieve the maximum error correction capability for each
possible rate, an efficient approach was also provided to construct a family of local-encoding-preserving
LNEC MDS codes with all the admissible rates.
A common assumption in the above discussion is that the network topology is known. As such, we
can construct a deterministic LNEC code based on the network topology, and use this code for network
transmission. By contrast, for the case that the network topology is unavailable, it is impossible to
construct an LNEC code based on the network topology. Network error correction coding without this
assumption has been investigated in the literature. One approach is random LNEC coding [20], [21],
[28]–[30], which uses the same idea as random network coding, first studied by Ho et al. [31]. To be
specific, this approach applies random network coding to build the extended global encoding kernels for
each sink node, which form a matrix for decoding the source message with error correction. Another
approach is subspace coding [18], [19], [32], which is an end-to-end approach for error correction with
random linear network coding employed within the network. To be specific, in this approach, random
linear network coding over a network is abstracted as an operator channel in Kötter and Kschischang’s
work [18]. The source node, as the transmitter of this operator channel, emits a vector space modulated
by a source message. A sink node, as a receiver of this channel, receives a vector space which is possibly
corrupted by network errors. A new metric, called subspace distance, is used to measure the discrepancy
between the two vector spaces for network error correction. With this metric, efficient coding and decoding
schemes based on rank-metric and subspace codes were proposed in [18], [19], [33].
Another line of research considers adversarial attacks, in which various adversarial models were
investigated in the context of network coding [33]–[37]. In particular, for the Byzantine attack in which an
adversary is able to modify the messages transmitted on the edges of a network [34]–[36], network error
correction coding can be applied to combat the attack by regarding the malicious messages injected into
the network by the adversary as errors. For example, Jaggi et al. [35] proposed a distributed polynomial-
time algorithm for correcting the corruption errors, which can achieve successful decoding with a high
probability when the sizes of the base field and the source message packet are sufficiently large. A
cryptographic technique for public-key systems is also used in their coding scheme. Specifically, a


redundancy matrix, which plays the role of a parity-check, needs to be published in advance to all
the parties including the source node, the sink nodes and the adversaries before employing (random)
linear network coding within the network.

B. Contributions and Organization of the Paper

In this paper, we first revisit and further explore the framework of LNEC coding and network error
correction on a network whose topology is known. Then, we show that the two well-known LNEC
approaches developed in [16] and [17] are in fact equivalent. By developing a graph-theoretic approach,
we can enhance the characterization of error correction capability of LNEC codes in terms of the minimum
distances at the sink nodes. Briefly speaking, in order to ensure that an LNEC code can correct up to r
errors at a sink node t, it suffices to ensure that this code can correct every error vector in a “reduced
set of error vectors”. In general, the size of this reduced set is considerably smaller than the number of
error vectors with Hamming weight not larger than r . This result has the important implication that the
computational complexities for decoding and code construction can be significantly reduced.
In LNEC coding, the minimum required field size for the existence of LNEC codes, in particular LNEC
MDS codes, is an open problem not only of theoretical interest but also of practical importance, because
it is closely related to the implementation of the coding scheme in terms of computational complexity and
storage requirement [14]–[17], [21]. However, the existing upper bounds on the minimum required field
size for the existence of LNEC (MDS) codes are typically too large for implementation. In this paper, we
show that the required field size for the existence of LNEC (MDS) codes can be reduced significantly.
To be specific, by applying our graph-theoretic approach, we prove an improved upper bound on the
minimum required field size. The improvement over the existing results is in general significant. This
new bound, which is graph-theoretic, depends on the network topology and the requirement of error
correction capability but not on a specific code construction. As mentioned, our upper bound is graph-
theoretic, but it is not given in an explicit form. Thus, we develop an efficient algorithm to compute the
bound, whose computational complexity is linear in the number of edges of the network.
The paper is organized as follows. In Section II, we formally present the network model and linear
network coding. The necessary notation and definitions are also introduced. In Section III, we revisit and
explore the framework of LNEC coding, and then unify two well-known LNEC coding approaches. In
Section IV, we develop a graph-theoretic approach with which we can enhance the characterization of
error correction capability of LNEC codes. The improved upper bound on the minimum required field
size for the existence of LNEC codes, in particular LNEC MDS codes, is obtained in Section V. This is


followed by the development of an efficient algorithm for computing the improved bound. We conclude
in Section VI with a summary of our results.

II. PRELIMINARIES

A. Network Model

Let G = (V, E) be a finite directed acyclic graph with a single source s and a set of sink nodes
T ⊆ V \ {s}, where V and E are the sets of nodes and edges, respectively. For a directed edge e ∈ E from
node u to node v, the tail and the head of e are denoted by tail(e) = u and head(e) = v, respectively.
Further, for a node v, let Out(v) = {e ∈ E : tail(e) = v} and In(v) = {e ∈ E : head(e) = v}, which
are the set of output edges and the set of input edges of v, respectively. Without loss of generality, assume
that there are no input edges for the source node s and no output edges for any sink node t ∈ T . The
capacity of each edge is taken to be 1, i.e., a symbol taken from an alphabet is transmitted on each edge
e ∈ E for each use of e. Further, parallel edges between two adjacent nodes are allowed.
In the network G, if a sequence of edges (e1 , e2 , · · · , em ) satisfies tail(ek+1 ) = head(ek ) for all
k = 1, 2, · · · , m − 1, then (e1 , e2 , · · · , em ) is called a path from the node tail(e1 ) (or the edge e1 ) to
the node head(em ) (or the edge em ). In particular, a single edge e is regarded as a path from tail(e) to
head(e) (or from e to itself). For two nodes u and v , a cut separating v from u is a set of edges whose
removal disconnects v from u, i.e., no paths exist from u to v upon deleting the edges in this set. The
capacity of this cut separating v from u is defined as the number of edges in the cut. The minimum of
the capacities of all cuts separating v from u is called the minimum cut capacity separating v from u.
Further, a cut is called a minimum cut separating v from u if its capacity achieves this minimum cut
capacity. If u and v are two nodes such that v is disconnected from u, i.e., no path exists from u to v
in the network G, we adopt the convention that the minimum cut capacity separating v from u is 0 and
the empty set of edges is the minimum cut separating v from u.
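The cut definitions above can be made concrete with a short computation. The sketch below is a hypothetical illustration, not code from the paper: it computes the minimum cut capacity separating a sink from a source as a maximum flow (by the max-flow min-cut theorem recalled in the Introduction), using an Edmonds-Karp style search over unit-capacity edges; the graph, node names, and function name are all assumptions made here for illustration.

```python
from collections import deque, defaultdict

def min_cut_capacity(edges, src, dst):
    """Minimum cut capacity separating dst from src, computed as the
    maximum flow (max-flow min-cut theorem). `edges` is a list of
    directed unit-capacity edges (u, v); parallel edges are allowed."""
    cap = defaultdict(int)
    adj = defaultdict(set)
    for u, v in edges:
        cap[(u, v)] += 1          # parallel edges accumulate capacity
        adj[u].add(v)
        adj[v].add(u)             # residual (reverse) direction
    flow = 0
    while True:
        # BFS for an augmenting path in the residual graph
        parent = {src: None}
        q = deque([src])
        while q and dst not in parent:
            u = q.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    q.append(v)
        if dst not in parent:
            return flow
        # augment by 1 along the path (all capacities are integral)
        v = dst
        while parent[v] is not None:
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1

# Two edge-disjoint paths from s to t, so the min-cut capacity is 2.
edges = [('s', 'a'), ('a', 't'), ('s', 'b'), ('b', 't'), ('a', 'b')]
print(min_cut_capacity(edges, 's', 't'))  # 2
```

Disconnected pairs fall out naturally: if no path exists from src to dst, the BFS never reaches dst and the returned capacity is 0, matching the convention adopted above.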
These concepts can be extended from separating a node v from another node u to separating a nonempty
subset of nodes V̂ from a node u (u ∉ V̂), and separating an edge subset ξ from a node u as follows.
We first consider a nonempty subset of non-source nodes V̂ ⊆ V. We create a new node v_V̂, and for
every node v in V̂, add a “super-edge” of infinite capacity from v to v_V̂ (which is equivalent to adding
an infinite number of parallel edges from v to v_V̂). A cut separating V̂ from u is defined as a cut of
finite capacity separating v_V̂ from u. We can naturally extend the definitions of the capacity of a cut, the
minimum cut capacity, and the minimum cut to the case of V̂. Next, we consider an edge subset ξ ⊆ E.
We first subdivide each edge e ∈ ξ by creating a node v_e and splitting e into two edges e^1 and e^2 such
that tail(e^1) = tail(e), head(e^2) = head(e), and head(e^1) = tail(e^2) = v_e. Let V_ξ = {v_e : e ∈ ξ}.


Fig. 2: The network G. Fig. 3: The network modification.

Then a cut separating the edge subset ξ from the node u is defined as a cut separating V_ξ from u,
where, whenever e^1 or e^2 appears in the cut, we replace it by e. By definition, ξ is itself a cut separating ξ from
u. Similarly, the minimum cut capacity separating ξ from u is defined as the minimum cut capacity
separating V_ξ from u. Also, a cut separating ξ from u achieving this minimum cut capacity is called a
minimum cut separating ξ from u. If an edge set A ⊆ E is a cut separating a node v (resp. a set of nodes
V̂ or a set of edges ξ) from another node u, then we say that the edge set A separates v (resp. V̂ or
ξ) from u. Note that if A separates v (resp. V̂ or ξ) from u, then every path from u to v (resp. V̂ or
ξ) passes through at least one edge in A. We now use a network in [38] (Figs. 2 and 3) as an example
to illustrate the above graph-theoretic concepts.

Example 1. We consider node u and an edge subset ξ = {e5, e7} in the network G depicted in Fig. 2.
For edge e5, we first create a node v_{e5} and split e5 into two edges e5^1 and e5^2 with tail(e5^1) = tail(e5),
head(e5^2) = head(e5), and head(e5^1) = tail(e5^2) = v_{e5}. The same subdivision operation is applied to
edge e7 as depicted in Fig. 3. Let V_ξ = {v_{e5}, v_{e7}}. Now, finding a cut separating ξ from u is
equivalent to finding a cut separating V_ξ from u. Toward this end, we first create a new node v_ξ and
add 2 super-edges with infinite capacity from v_{e5} to v_ξ and from v_{e7} to v_ξ, respectively. By definition,
a cut of finite capacity separating v_ξ from u is a cut separating V_ξ from u and so a cut separating ξ
from u. For example, the edge subset {e3, e4} is such a cut. Further, the edge subset {e5^1} is also a cut
separating V_ξ from u. By definition, since e5^1 appears in the cut {e5^1} and e5 ∈ ξ, we replace e5^1 by
e5, so that {e5} is a cut separating ξ from u. We further see that {e5} is a minimum cut separating ξ


from u that achieves the minimum cut capacity 1 separating ξ from u.
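The construction of Example 1 (subdividing each edge in ξ and joining the new nodes to a super-node by infinite-capacity super-edges) can be sketched in code. The DAG below is a small hypothetical network, not the network of Figs. 2 and 3, whose full topology is not reproduced here; a large finite capacity stands in for the infinite capacity of the super-edges, and the function and node names are assumptions for illustration.

```python
from collections import deque, defaultdict

INF = 10**6  # stands in for the infinite capacity of super-edges

def max_flow(cap_edges, src, dst):
    """Edmonds-Karp on integer capacities given as {(u, v): capacity}."""
    cap = defaultdict(int, cap_edges)
    adj = defaultdict(set)
    for u, v in cap_edges:
        adj[u].add(v)
        adj[v].add(u)  # residual direction
    flow = 0
    while True:
        parent = {src: None}
        q = deque([src])
        while q and dst not in parent:
            u = q.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    q.append(v)
        if dst not in parent:
            return flow
        # bottleneck along the augmenting path
        path, v = [], dst
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        aug = min(cap[e] for e in path)
        for u, v in path:
            cap[(u, v)] -= aug
            cap[(v, u)] += aug
        flow += aug

# A small hypothetical DAG: e1 = (u,a), e2 = (u,b), e3 = (a,c), e4 = (b,c).
# We want the minimum cut capacity separating xi = {e3, e4} from u.
cap = {('u', 'a'): 1, ('u', 'b'): 1}
# Subdivide each edge e in xi at a new node v_e ...
cap[('a', 'v3')] = 1; cap[('v3', 'c')] = 1   # e3 split into e3^1, e3^2
cap[('b', 'v4')] = 1; cap[('v4', 'c')] = 1   # e4 split into e4^1, e4^2
# ... and join the new nodes to a super-node by infinite-capacity edges.
cap[('v3', 'w')] = INF
cap[('v4', 'w')] = INF

print(max_flow(cap, 'u', 'w'))  # 2: e.g. {e1, e2} (or {e3, e4}) is a minimum cut
```

The super-edges never appear in a finite-capacity cut, so the computed value is exactly the minimum cut capacity separating ξ from u after the halves e^1, e^2 are mapped back to e.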

Due to the acyclicity of the network G, we can fix an ancestral order on the edges in E that is
consistent with the natural partial order of the edges. Throughout this paper, we use this order to index
the coordinates of all the vectors and the rows/columns of all the matrices in the paper. If the columns of
a matrix L are indexed by a subset of edges ξ , then we use a symbol with subscript e, say ℓe , to denote
the column indexed by the edge e ∈ ξ ; if the rows of a matrix L are indexed by the subset of edges ξ ,
then we use a symbol followed by e in a pair of brackets, say ℓ(e), to denote the row indexed by e ∈ ξ .

B. Linear Network Coding

In this subsection, we consider the linear network coding model. On the network G, the source node s
is required to multicast the source message to each node in T , or equivalently, each node in T is required
to decode with zero error the source message generated by the source node s. For a sink node t ∈ T , we
use Ct to denote the minimum cut capacity separating t from the source node s. Linear network coding
over a finite field is sufficient for achieving mint∈T Ct , the theoretical maximum rate at which the source
node s can multicast the source message to all the sink nodes in T [6], [7].
Let ω be the (information) rate of the source (ω ≤ min_{t∈T} C_t), or equivalently, the source node s
generates ω symbols in an alphabet per unit time. To facilitate our discussion, we introduce ω imaginary
source edges connecting to s, denoted by d′_1, d′_2, · · · , d′_ω, respectively, and let In(s) = {d′_1, d′_2, · · · , d′_ω}.
As such, we assume that the ω source symbols are transmitted to s on the ω imaginary source edges.
Now, we state the definition of a linear network code.

Definition 1. Let Fq be a finite field of order q , where q is a prime power. An Fq -valued rate-ω linear
network code C on the network G = (V, E) consists of an Fq -valued |In(v)| × |Out(v)| matrix Kv =
[kd,e ]d∈In(v),e∈Out(v) for each non-sink node v in V , i.e.,

C = {K_v : v ∈ V \ T},

where Kv is called the local encoding kernel of C at v , and kd,e ∈ Fq is called the local encoding
coefficient for the adjacent edge pair (d, e).

For a linear network code C , the local encoding kernels induce a column ω -vector fe for each edge e
in E , called the global encoding kernel of e, which can be calculated recursively according to the given
ancestral order of edges in E by
f_e = Σ_{d ∈ In(tail(e))} k_{d,e} · f_d,   (1)


with the boundary condition that f_{d′_i}, 1 ≤ i ≤ ω, form the standard basis of the vector space F_q^ω. The set
of global encoding kernels for all e ∈ E, i.e., {f_e : e ∈ E}, is also used to represent this linear network
code C. However, we remark that a set of global encoding kernels {f_e : e ∈ E} may correspond to
more than one set of local encoding kernels {K_v : v ∈ V \ T}.

In using this rate-ω linear network code C, let x = (x_1, x_2, · · · , x_ω) ∈ F_q^ω be the row vector of ω
source symbols generated by the source node s, which is called the source message vector, or simply
the source message. Without loss of generality, we assume that x_i is transmitted on the ith imaginary
channel d′_i, 1 ≤ i ≤ ω. We use y_e to denote the symbol transmitted on e, ∀ e ∈ In(s) ∪ E. With y_{d′_i} = x_i,
1 ≤ i ≤ ω, each y_e for e ∈ E can be calculated recursively according to the given ancestral order of
edges in E by the equation

y_e = Σ_{d ∈ In(tail(e))} k_{d,e} · y_d.   (2)

In fact, y_e is a linear combination of the ω source symbols x_i, 1 ≤ i ≤ ω, which can be seen as follows.
First, it is readily seen that y_{d′_i} = x · f_{d′_i} (= x_i), 1 ≤ i ≤ ω. Then it can be shown by induction via (1)
and (2) that

y_e = x · f_e,   ∀ e ∈ E.   (3)

For each sink node t ∈ T, we define the matrix F_t = [f_e : e ∈ In(t)]. The sink node t can decode
the source message vector with zero error if and only if F_t is full rank, i.e., Rank(F_t) = ω. We say that
a rate-ω linear network code C is decodable for T if for each sink node t ∈ T, the rank of the matrix
F_t is equal to the rate ω of the code, i.e., Rank(F_t) = ω, ∀ t ∈ T.¹ We refer the reader to [8]–[12] for
comprehensive discussions of linear network coding.
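The recursion (1), the transmission rule (3), and the decodability condition Rank(F_t) = ω can be exercised on the familiar butterfly network, an example not drawn from this paper. The sketch below assumes GF(2), one standard choice of local encoding coefficients (XOR at the bottleneck node), and hypothetical edge and node names.

```python
OMEGA = 2  # rate omega: two source symbols per unit time, over GF(2)

# Butterfly network, edges listed in an ancestral order.
edges = [('e1', 's', 'a'), ('e2', 's', 'b'), ('e3', 'a', 'c'), ('e4', 'a', 't1'),
         ('e5', 'b', 'c'), ('e6', 'b', 't2'), ('e7', 'c', 'd'),
         ('e8', 'd', 't1'), ('e9', 'd', 't2')]

# Local encoding coefficients k_{d,e} (one standard choice: XOR at node c).
k = {('d1', 'e1'): 1, ('d2', 'e1'): 0, ('d1', 'e2'): 0, ('d2', 'e2'): 1,
     ('e1', 'e3'): 1, ('e1', 'e4'): 1, ('e2', 'e5'): 1, ('e2', 'e6'): 1,
     ('e3', 'e7'): 1, ('e5', 'e7'): 1, ('e7', 'e8'): 1, ('e7', 'e9'): 1}

in_edges = {'s': ['d1', 'd2']}      # d1, d2: imaginary source edges
f = {'d1': (1, 0), 'd2': (0, 1)}    # boundary condition: standard basis

for name, u, v in edges:
    # recursion (1): f_e = sum over d in In(tail(e)) of k_{d,e} * f_d, in GF(2)
    fe = [0] * OMEGA
    for d in in_edges[u]:
        fe = [(a + k[(d, name)] * b) % 2 for a, b in zip(fe, f[d])]
    f[name] = tuple(fe)
    in_edges.setdefault(v, []).append(name)

# (3): y_e = x . f_e -- with source message x = (x1, x2), e7 carries x1 XOR x2.
x = (1, 0)
y_e7 = sum(a * b for a, b in zip(x, f['e7'])) % 2
print(f['e7'], y_e7)                   # (1, 1) 1
print([f[e] for e in in_edges['t1']])  # [(1, 0), (1, 1)] -> Rank(F_t1) = 2
```

Both sinks see two linearly independent global encoding kernels, so F_t1 and F_t2 have rank ω = 2 and the code is decodable for T = {t1, t2}.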

III. LINEAR NETWORK ERROR CORRECTION CODING REVISITED

A. Linear Network Error Correction Coding

In this subsection, we present the linear network error correction (LNEC) coding model. We first
consider using an Fq -valued rate-ω linear network code C on the network G = (V, E) to multicast the
source message to the sink nodes in T . When the symbol ye is transmitted on edge e, an error ze ∈ Fq
may occur.2 As a result, the output of edge e becomes ỹe = ye + ze . The error ze is treated as a message

¹ When the set of sink nodes T is clear from the context, we say that the linear network code C is “decodable” instead of
“decodable for T” for simplicity.
² If no error occurs on the edge e, then z_e = 0.


called the error message on edge e. We write all the errors on the edges in E as an Fq -valued row
|E|-vector z = (ze : e ∈ E) and call z the error vector.
To take into account the effect of the errors on the network G, we can modify the linear network
code C to a rate-ω LNEC code on G. Before describing the modification, we first present the extended
network G̃ = (Ṽ, Ẽ) of G, which was introduced in [16]. In the original network G, for each edge e ∈ E,
we introduce an imaginary edge e′ such that head(e′) = tail(e), which is called the imaginary error edge
for edge e. Similar to the source message generated by the source node s, we also assume that the error
z_e is transmitted to tail(e) through the imaginary error edge e′. The original network G together with
all the imaginary error edges e′, e ∈ E, forms the extended network of G, denoted by G̃ = (Ṽ, Ẽ), where
Ṽ = V and Ẽ = E ∪ E′ with E′ ≜ {e′ : e ∈ E}, the set of all the imaginary error edges. Clearly,
the extended network G̃ is also acyclic due to the acyclicity of the original network G. As for linear
network coding, we introduce ω imaginary source edges d′_1, d′_2, · · · , d′_ω connecting to the source node s
in the extended network G̃, where ω is the rate of the source, and let In(s) = {d′_1, d′_2, · · · , d′_ω}. For every
non-source node v on G̃, we use In(v) to denote the set of “real” input edges of v, i.e., the imaginary
error edges connected to v are not included in In(v). Now, we modify the rate-ω linear network code C
on G into a rate-ω linear network code on G̃ by setting the local encoding coefficients with respect to
each imaginary error edge e′ ∈ E′ as follows:

k_{e′,d} = 1 if d = e, and k_{e′,d} = 0 if d ∈ Out(tail(e)) \ {e}.   (4)

This modified linear network code on G̃ is called the corresponding F_q-valued rate-ω LNEC code on
the original network G. In the following, we define the global encoding kernels of such a rate-ω LNEC
code on G in terms of the local encoding coefficients.

Definition 2. Let F_q be a finite field of order q, where q is a prime power. An F_q-valued rate-ω LNEC
code on the network G = (V, E) consists of a column (ω + |E|)-vector f̃_e for each edge e in E, called
the extended global encoding kernel of e, whose components are indexed by the ω imaginary source
edges in In(s) and the |E| imaginary error edges in E′, such that
1) f̃_{d′_i} = 1_{d′_i}, 1 ≤ i ≤ ω, and f̃_{e′} = 1_{e′}, e′ ∈ E′, form the standard basis of the vector space F_q^{ω+|E|}, where
1_d, d ∈ In(s) ∪ E′, is a column (ω + |E|)-vector whose component indexed by d is equal to 1 while
all other components are equal to 0;
2) For each edge e ∈ E, f̃_e is calculated recursively according to the given ancestral order of edges
in E by

f̃_e = Σ_{d ∈ In(tail(e))} k_{d,e} · f̃_d + 1_{e′},   (5)

where k_{d,e} ∈ F_q is the local encoding coefficient for the adjacent edge pair (d, e).

In using this rate-ω LNEC code on G, let x = (x_1, x_2, · · · , x_ω) be the source message vector and
z = (z_e : e ∈ E) be the error vector. For each imaginary source edge d′_i, 1 ≤ i ≤ ω, and each imaginary
error edge e′ ∈ E′, we have, respectively,

ỹ_{d′_i} = x_i and ỹ_{e′} = z_e.

The symbol ỹ_e, the output of edge e ∈ E, is recursively calculated by

ỹ_e = Σ_{d ∈ In(tail(e))} k_{d,e} · ỹ_d + z_e   (6)

according to the given ancestral order of edges in E. Comparing (5) with (6), we obtain that

ỹ_e = (x z) · f̃_e,   ∀ e ∈ In(s) ∪ Ẽ.   (7)

Before discussing how to use this LNEC code to correct errors on the network, we first introduce some
notation to be used frequently throughout the paper. For an edge e ∈ Ẽ, we write f̃_e as

f̃_e = (f̃_e(d′_1) · · · f̃_e(d′_ω) f̃_e(e′_1) · · · f̃_e(e′_{|E|}))^⊤ = (f_e; g_e),   (8)

where

f_e = (f̃_e(d′_1) f̃_e(d′_2) · · · f̃_e(d′_ω))^⊤ and g_e = (f̃_e(e′_1) f̃_e(e′_2) · · · f̃_e(e′_{|E|}))^⊤.   (9)

Further, for a sink node t ∈ T, we let F̃_t = [f̃_e : e ∈ In(t)], an (ω + |E|) × |In(t)| matrix, and
use row_t(d′) to denote the row vector of F̃_t indexed by the imaginary edge d′ ∈ In(s) ∪ E′, i.e.,
row_t(d′) = (f̃_ê(d′) : ê ∈ In(t)). Then, we write

F̃_t = (F_t; G_t),   (10)

where

F_t = [row_t(d′_1); · · · ; row_t(d′_ω)] and G_t = [row_t(e′_1); · · · ; row_t(e′_{|E|})]   (11)

are two matrices of sizes ω × |In(t)| and |E| × |In(t)|, respectively, with “;” denoting vertical stacking.


B. Network Error Correction



We consider an F_q-valued rate-ω LNEC code C̃ = {f̃_e : e ∈ E} on the network G = (V, E). We
first assume that C̃ is decodable for the set of sink nodes T, i.e., Rank(F_t) = ω, ∀ t ∈ T. Herein, the
decodability property is necessary, because otherwise, even if no errors occur on the network, at least
one of the sink nodes in T cannot decode the source message with zero error.
Let z = (z_e : e ∈ E) ∈ F_q^{|E|} be an error vector and ρ ⊆ E be an edge subset. We say that z matches ρ
if z_e = 0 for all e ∈ E \ ρ, i.e.,

z ∈ {z′ = (z′_e : e ∈ E) ∈ F_q^{|E|} : z′_e = 0, ∀ e ∈ E \ ρ}.   (12)

For notational convenience, we write (12) as z ∈ ρ in the rest of the paper. This abuse of notation should
cause no ambiguity and greatly simplifies the notation.
We now consider network error correction. We assume that a sink node t knows the extended global
encoding kernels of the input edges of t, i.e., F̃_t. For a source message vector x ∈ F_q^ω on d′_i, 1 ≤ i ≤ ω,
and an error vector z ∈ F_q^{|E|} on e′ ∈ E′, we denote by ỹ_e(x, z) the symbol transmitted on an edge e.
Further, we let

ỹ_t(x, z) ≜ (ỹ_e(x, z) : e ∈ In(t)),

and by (7), we have

ỹ_t(x, z) = (x z) · F̃_t.   (13)

When x and z are clear from the context, we write ỹ_e and ỹ_t to simplify the notation.
At the sink node t, the source message vector x and the error vector z are unknown, while F̃_t and ỹ_t are
known. We attempt to decode x by “solving” for x in the equation ỹ_t = (x z) · F̃_t, in which x and z are
regarded as variables.
We let Z be a set of error vectors. We say that the rate-ω LNEC code C̃ corrects any error vector in Z
at the sink node t if for any 2 pairs (x z) and (x′ z′) such that ỹ_t(x, z) = ỹ_t(x′, z′), where x, x′ ∈ F_q^ω
and z, z′ ∈ Z, we have

x = x′.

As such, we see that any source message vector x ∈ F_q^ω can be decoded with zero error regardless of which
error vector in Z occurs in the network.
Next, we consider the error correction capability of an Fq-valued rate-ω LNEC code C̃ = { f̃e : e ∈ E } on the network G = (V, E), i.e., the possible set of error vectors for each sink node t ∈ T in which any error vector can be corrected by C̃ at t. We first define two types of vector spaces for the code C̃, which play a crucial role for network error correction [16], [21], [22].

Definition 3. Consider a sink node t ∈ T and an edge subset ρ ⊆ E. At the sink node t, the message space and the error space of ρ are defined, respectively, by

    Φ(t) = ⟨ rowt(d′i) : 1 ≤ i ≤ ω ⟩  and  ∆(t, ρ) = ⟨ rowt(e′) : e ∈ ρ ⟩.³   (14)
With Definition 3, we readily see that

    Φ(t) = { x · Ft : all source message vectors x ∈ F_q^ω },               (15)

and

    ∆(t, ρ) = { z · Gt : all error vectors z ∈ F_q^{|E|} such that z ∈ ρ }.  (16)
For a source message vector x ∈ F_q^ω and an error vector z ∈ F_q^{|E|} such that z ∈ ρ, by (7), (8) and (9), we have

    ỹe = (x z) · f̃e = x · fe + z · ge,  ∀ e ∈ E.

By (10) and (11), we immediately have

    ỹt = (x z) · F̃t = x · Ft + z · Gt.                                      (17)

Thus, we observe that the "effect" of x (i.e., x · Ft) at t belongs to Φ(t) by (15), and the "effect" of z ∈ ρ (i.e., z · Gt) at t belongs to ∆(t, ρ) by (16). Briefly speaking, if the "effect" z · Gt of the error vector z at t can be removed from ỹt, then, together with Rank(Ft) = ω, the source message vector x can be decoded with zero error. This will become clear in the following discussions.
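The decomposition (17) is easy to check numerically for a toy code over a prime field F_q; the matrices Ft and Gt below are made-up illustrative values, not derived from any particular network:

```python
q = 5  # a prime field size, chosen arbitrarily for illustration

def mat_vec(v, M, q):
    """Row vector v times matrix M over F_q (M given as a list of rows)."""
    return [sum(v[i] * M[i][j] for i in range(len(M))) % q
            for j in range(len(M[0]))]

def add(u, v, q):
    """Componentwise sum of two row vectors over F_q."""
    return [(a + b) % q for a, b in zip(u, v)]

# Toy parameters: omega = 2, |E| = 3, |In(t)| = 2 (made-up matrices).
Ft = [[1, 0], [0, 1]]            # omega x |In(t)|
Gt = [[1, 1], [0, 1], [1, 0]]    # |E| x |In(t)|
x = [3, 4]                       # source message vector in F_q^omega
z = [1, 0, 2]                    # error vector in F_q^{|E|}

# The received vector at sink t is y_t = x * Ft + z * Gt, cf. (17).
y_t = add(mat_vec(x, Ft, q), mat_vec(z, Gt, q), q)
print(y_t)  # [1, 0] over F_5
```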
With the equation (17), the "effect" x·Ft of a source message vector x can be regarded as its "codeword" at the sink node t, where Ft is regarded as the "generator matrix" at t. So Φ(t) can be regarded as the "codebook" at t. We now consider 2 different codewords x·Ft and x′·Ft (i.e., x ≠ x′ by Rank(Ft) = ω). Based on the above discussions, either x or x′ cannot be decoded with zero error if and only if there exist 2 vectors z and z′ such that ỹt(x, z) = ỹt(x′, z′), or equivalently,

    (x − x′) · Ft = (z′ − z) · Gt.                                          (18)

Hence, we define the distance between 2 codewords x · Ft and x′ · Ft as follows:

    d(t)(x · Ft, x′ · Ft) = min { |ρ| : ∃ an error vector z ∈ ρ s.t. (x − x′) · Ft = z · Gt }.   (19)

³ Here we use ⟨L⟩ to denote the subspace spanned by the vectors in a set L of vectors.
Before proving that d(t)(·, ·) is a metric, we first extend the distance between 2 codewords to the distance between 2 vectors in F_q^{|In(t)|}.

Definition 4. Consider a rate-ω LNEC code C̃ on the network G and a sink node t ∈ T. For any 2 vectors ỹt and ỹt′ in F_q^{|In(t)|}, the distance between ỹt and ỹt′ is defined as

    d(t)(ỹt, ỹt′) = min { |ρ| : ∃ an error vector z ∈ ρ s.t. ỹt − ỹt′ = z · Gt }.   (20)

In (20), when ỹt = ỹt′, the edge subset ρ that achieves the minimum is the empty set with the error vector z being the all-zero vector. By (5) and (11), we can obtain that the |In(t)| × |In(t)| submatrix [ rowt(e′) : e ∈ In(t) ] of Gt is an identity matrix (cf. the proof of Theorem 3 in Section IV for more details). So for any 2 vectors ỹt and ỹt′ in F_q^{|In(t)|}, there must exist an error vector z such that ỹt − ỹt′ = z · Gt. Then, the distance d(t)(·, ·) is well-defined.

Proposition 1. The distance d(t)(·, ·) defined on the vector space F_q^{|In(t)|} is a metric, i.e., the following 3 conditions are satisfied for arbitrary vectors ỹt, ỹt′ and ỹt′′ in F_q^{|In(t)|}:

1) (Positive Definiteness) d(t)(ỹt, ỹt′) ≥ 0, and d(t)(ỹt, ỹt′) = 0 if and only if ỹt = ỹt′;
2) (Symmetry) d(t)(ỹt, ỹt′) = d(t)(ỹt′, ỹt);
3) (Triangle Inequality) d(t)(ỹt, ỹt′′) ≤ d(t)(ỹt, ỹt′) + d(t)(ỹt′, ỹt′′).

Proof: See Appendix A.

Thus, the pair ( F_q^{|In(t)|}, d(t)(·, ·) ) forms a metric space. Furthermore, we naturally define the minimum distance of the codebook Φ(t), denoted by d_min^{(t)}, as

    d_min^{(t)} = min_{x, x′ ∈ F_q^ω : x ≠ x′} d(t)(x · Ft, x′ · Ft).

We continue to consider the distance between two codewords:

    d(t)(x · Ft, x′ · Ft) = min { |ρ| : ∃ an error vector z ∈ ρ s.t. (x − x′) · Ft = z · Gt }
                          = min { |ρ| : (x − x′) · Ft ∈ ∆(t, ρ) }
                          = d(t)(0, (x − x′) · Ft),                          (21)

where 0 stands for the all-zero row |In(t)|-vector. In the rest of the paper, we always use 0 to denote an all-zero (row or column) vector, whose dimension should be clear from the context. By (21), we rewrite d_min^{(t)} as:

    d_min^{(t)} = min_{x, x′ ∈ F_q^ω : x ≠ x′} d(t)(0, (x − x′) · Ft)
                = min_{x ∈ F_q^ω \ {0}} d(t)(0, x · Ft)
                = min_{x ∈ F_q^ω \ {0}} min { |ρ| : x · Ft ∈ ∆(t, ρ) }
                = min { |ρ| : Φ(t) ∩ ∆(t, ρ) ≠ {0} }.                        (22)

In the rest of the paper, we use (22) as the definition of the minimum distance of a rate-ω LNEC code C̃ on the network G at the sink node t ∈ T, which is more convenient for discussion. We thus write this definition as follows.

Definition 5. Consider a rate-ω LNEC code C̃ on the network G. The minimum distance of C̃ at a sink node t is defined as

    d_min^{(t)} = min { |ρ| : Φ(t) ∩ ∆(t, ρ) ≠ {0} }.                        (23)
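For very small parameters, the minimum distance in Definition 5 can be evaluated by brute force: enumerate edge subsets ρ in order of increasing size and test whether some nonzero codeword x · Ft lies in ∆(t, ρ), the span of the rows of Gt indexed by ρ. A sketch (the matrices are illustrative, and the exhaustive search is only practical for toy instances):

```python
from itertools import combinations, product

def dmin(Ft, Gt, q):
    """Brute-force minimum distance (23): the smallest |rho| such that
    Phi(t) and Delta(t, rho) share a nonzero vector, over the prime field F_q."""
    omega, E = len(Ft), len(Gt)

    def mat_vec(v, M):
        # Row vector v times matrix M over F_q.
        return tuple(sum(v[i] * M[i][j] for i in range(len(M))) % q
                     for j in range(len(M[0])))

    for r in range(1, E + 1):
        for rho in combinations(range(E), r):
            G_rho = [Gt[e] for e in rho]
            # Delta(t, rho): all F_q-linear combinations of the rows of Gt in rho.
            delta = {mat_vec(z, G_rho) for z in product(range(q), repeat=r)}
            for x in product(range(q), repeat=omega):
                if any(x) and mat_vec(x, Ft) in delta:
                    return r
    return None  # Phi(t) meets no Delta(t, rho) nontrivially

# Toy example over F_2: omega = 1, |E| = |In(t)| = 2.
Ft = [[1, 1]]
Gt = [[1, 0], [0, 1]]
print(dmin(Ft, Gt, 2))  # the codeword (1,1) needs both unit rows, so d = 2
```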

Furthermore, it is not difficult to see that the distance d(t)(·, ·) defined in (20) is equivalent to the distance measure defined in Definition 1 in [17], while the minimum distance d_min^{(t)} defined in (23) is the same as the minimum distance defined in Definition 7 in [16] (see Proposition 2 in [21]). Thus, the 2 LNEC approaches developed in [17] and [16] are in fact equivalent.

For a rate-ω LNEC code C̃, the minimum distance d_min^{(t)} at each sink node t ∈ T characterizes its error correction capability. More precisely, C̃ can correct up to ⌊(d_min^{(t)} − 1)/2⌋ errors at each sink node t ∈ T (cf. [14], [16], [17], [21], [22]).


To see this, we consider 2 arbitrary pairs (x1 z1) and (x2 z2) of source message vector and error vector such that the Hamming weight wH(zi) ≤ ⌊(d_min^{(t)} − 1)/2⌋, i = 1, 2, and (x1 z1) · F̃t = (x2 z2) · F̃t, or equivalently,

    (x1 − x2) · Ft = (z2 − z1) · Gt.                                         (24)

Let ρi = { e ∈ E : zi,e ≠ 0 }, where zi = (zi,e : e ∈ E), i = 1, 2. Clearly, zi ∈ ρi and |ρi| ≤ ⌊(d_min^{(t)} − 1)/2⌋, i = 1, 2. Further, let ρ = ρ1 ∪ ρ2. Then,

    |ρ| ≤ |ρ1| + |ρ2| ≤ d_min^{(t)} − 1,

and z2 − z1 ∈ ρ. By the definition of d_min^{(t)} (cf. (23)), we immediately have

    Φ(t) ∩ ∆(t, ρ) = {0}.                                                   (25)

Together with (x1 − x2) · Ft ∈ Φ(t) and (z2 − z1) · Gt ∈ ∆(t, ρ), we obtain that

    (x1 − x2) · Ft = (z2 − z1) · Gt = 0.
It thus follows from Rank(Ft) = ω that x1 = x2. In other words, C̃ can correct up to ⌊(d_min^{(t)} − 1)/2⌋ errors at each t ∈ T. We state this result formally in the next theorem. Let r be a nonnegative integer and H(r) be the collection of all edge subsets of size up to r, i.e.,

    H(r) = { ρ ⊆ E : |ρ| ≤ r }.                                             (26)

Theorem 2. Consider an Fq-valued rate-ω LNEC code C̃ on the network G. Let t be a sink node with dim(Φ(t)) = ω. At this sink node t, the LNEC code C̃ can correct any error vector in the set

    { z ∈ F_q^{|E|} : z ∈ ρ for some ρ ∈ H(⌊(d_min^{(t)} − 1)/2⌋) }.         (27)
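As a sanity check on Theorem 2, the number of error vectors in the set (27) is just the number of vectors in F_q^{|E|} of Hamming weight at most r = ⌊(d_min^{(t)} − 1)/2⌋, which can be counted directly; the parameters below are illustrative:

```python
from math import comb

def num_correctable(E, q, d_min):
    """Number of error vectors z in F_q^E of Hamming weight at most
    r = floor((d_min - 1) / 2), i.e., z in rho for some rho in H(r)."""
    r = (d_min - 1) // 2
    # Choose a support of size i, then a nonzero symbol on each support edge.
    return sum(comb(E, i) * (q - 1) ** i for i in range(r + 1))

# Example: |E| = 21 edges, binary field, minimum distance 3, so r = 1.
print(num_correctable(21, 2, 3))  # 1 all-zero vector + 21 single-edge errors = 22
```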

Next, we present the Singleton bound on the minimum distance d_min^{(t)} at the sink node t ∈ T:

    d_min^{(t)} ≤ Ct − ω + 1                                                 (28)

(cf. [16], [17], [21], [22]). If an Fq-valued rate-ω LNEC code C̃ not only is decodable but also satisfies the Singleton bound (28) with equality for each sink node t ∈ T, i.e.,

    dim(Φ(t)) = ω  and  d_min^{(t)} = Ct − ω + 1,  ∀ t ∈ T,                   (29)

then C̃ is called maximum distance separable (MDS) for T. Then, in terms of the error correction capability given in Theorem 2, an Fq-valued rate-ω LNEC MDS code has the maximum error correction capability at each sink node.

IV. ENHANCED CHARACTERIZATION OF LNEC CAPABILITY

We first introduce a number of graph-theoretic concepts that will be used frequently in the sequel.
We continue to consider a finite directed acyclic network G = (V, E). The reverse network G⊤ of G is
obtained from G by reversing the direction of every edge on G. It is evident that a subset of E is a cut
separating a node v from a node u on G if and only if this subset of E is a cut separating u from v on
G⊤ . Inspired by this observation, for an edge subset ρ and a non-source node u, a subset of E is called
a cut separating u from ρ on G if this edge subset is a cut separating ρ from u on G⊤ (cf. Section II-A).
The capacity of the cut separating u from ρ on G is accordingly defined as the number of edges in the cut.
The minimum of the capacities of all cuts separating u from ρ on G is called the minimum cut capacity
separating u from ρ, denoted by mincut(ρ, u). On the network G, a cut separating u from ρ is called a
minimum cut separating u from ρ if its capacity achieves the minimum cut capacity mincut(ρ, u).
Further, we say that a minimum cut separating u from ρ on G is primary if it separates u from all
the minimum cuts that separate u from ρ on G. The concept of primary minimum cut was introduced
by Guang and Yeung [39], where its existence and uniqueness were proved. Finally, we say that an edge
Fig. 4: The network G.

subset ρ is primary for u if ρ is the primary minimum cut separating u from ρ. We now use the following
example to illustrate these concepts.

Example 2. Consider the network G depicted in Fig. 4. On the network G, we consider an edge subset
ρ = {e2 , e5 } and a node t1 . On the reverse network G⊤ of G depicted in Fig. 5, we note that the edge
subset η = {e14 , e16 } is a cut separating ρ from t1 . So the edge subset η is a cut separating t1 from
ρ on G (see Fig. 4). It can be checked that η is actually a minimum cut separating t1 from ρ on G.
Furthermore, the unique primary minimum cut on G separating t1 from ρ is the edge subset {e18 , e20 },
which implies that {e18 , e20 } is primary for t1 .
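The quantity mincut(ρ, t) can be computed by a standard reduction to unit-capacity max-flow: split each edge e ∈ E into a unit-capacity arc (e_in, e_out), connect e_out to d_in whenever head(e) = tail(d), attach a super-source to the edges of ρ, and attach the input edges of t to a super-sink. Below is a sketch using the Edmonds-Karp algorithm on a small hypothetical network (not the network G of Fig. 4):

```python
from collections import defaultdict, deque

def mincut(edges, rho, t):
    """Minimum number of edges of E whose removal separates t from rho.
    `edges` maps an edge name to its (tail, head) node pair."""
    cap = defaultdict(int)
    adj = defaultdict(set)
    INF = len(edges) + 1

    def link(u, v, c):
        cap[(u, v)] += c
        adj[u].add(v)
        adj[v].add(u)  # residual arc

    for e, (tail, head) in edges.items():
        link((e, "in"), (e, "out"), 1)      # cutting edge e costs 1
        if head == t:
            link((e, "out"), "T", INF)      # e is an input edge of t
        for d, (d_tail, _) in edges.items():
            if d_tail == head:
                link((e, "out"), (d, "in"), INF)
    for e in rho:
        link("S", (e, "in"), INF)

    flow = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph.
        parent = {"S": None}
        queue = deque(["S"])
        while queue and "T" not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if "T" not in parent:
            return flow  # max flow = capacity of a minimum cut
        # Every augmenting path crosses a unit arc, so push 1 unit.
        v = "T"
        while parent[v] is not None:
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1

# Hypothetical network: s -> a, s -> b, a -> t, b -> t, a -> b.
edges = {"e1": ("s", "a"), "e2": ("s", "b"), "e3": ("a", "t"),
         "e4": ("b", "t"), "e5": ("a", "b")}
print(mincut(edges, {"e1"}, "t"))        # 1: {e1} itself is a minimum cut
print(mincut(edges, {"e1", "e2"}, "t"))  # 2: e.g. {e3, e4} or {e1, e2}
```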

We now consider a sink node t on the network G. Let r be a nonnegative integer not larger than Ct, the minimum cut capacity separating t from the source node s. We define the following two collections of edge subsets on G:

    Et(r) = { ρ ⊆ E : mincut(ρ, t) ≤ r }                                     (30)

and

    At(r) = { ρ ⊆ E : |ρ| = r and ρ is primary for t }.⁴                      (31)
Fig. 5: The reverse network G⊤ of G.

We now present the following theorem, which is one of the main results of this paper.

Theorem 3. Consider a rate-ω LNEC code over a finite field Fq on the network G. Then for a sink node t with Ct ≥ ω and a nonnegative integer r ≤ Ct − ω,

    Φ(t) ∩ ∆(t, ρ) = {0},  ∀ ρ ∈ Et(r),                                      (32)

if and only if

    Φ(t) ∩ ∆(t, ρ) = {0},  ∀ ρ ∈ At(r).                                      (33)

In order to prove Theorem 3, we need the following lemma.

Lemma 4. For an edge subset ρ and a sink node t, and any integer r such that mincut(ρ, t) ≤ r ≤ Ct,⁵ there exists a size-r primary edge subset η for t such that η separates t from ρ.

Proof: Consider an arbitrary edge subset ρ with mincut(ρ, t) ≤ r. For the case of mincut(ρ, t) = r, the lemma is evidently true by the existence of the primary minimum cut separating t from ρ. It thus
⁴ Note that At(r) = Et(r) = ∅ when r = 0.
⁵ If Ct < mincut(ρ, t), then no such integer r exists.
suffices to consider the case of mincut(ρ, t) < r. For this case, since r ≤ Ct, we claim that there exists an edge subset ρ̂ satisfying ρ ⊆ ρ̂ and mincut(ρ̂, t) = r. Indeed, note that when we add an edge e to ρ, the minimum cut capacity mincut(ρ ∪ {e}, t) separating t from ρ ∪ {e} satisfies

    mincut(ρ, t) ≤ mincut(ρ ∪ {e}, t) ≤ mincut(ρ, t) + 1,                    (34)

i.e., the minimum cut capacity can be increased by at most 1. Note that ρ ⊆ E and we have mincut(E, t) ≥ Ct. Thus we see that for any r ≤ Ct, in view of (34), we can always add edges to ρ one by one to form an edge subset ρ̂ until mincut(ρ̂, t) = r. Clearly, ρ ⊆ ρ̂. Thus, the primary minimum cut separating t from ρ̂, denoted by η, separates t from ρ. Together with the fact that the primary minimum cut η separating t from ρ̂ is primary for t and |η| = mincut(ρ̂, t) = r, the lemma is proved.

With Lemma 4, we are now ready to prove Theorem 3.

Proof of Theorem 3: The "only if" part (i.e., (32) ⇒ (33)) is evident since At(r) ⊆ Et(r). We now prove the "if" part (i.e., (33) ⇒ (32)). We consider an arbitrary edge subset ρ ∈ Et(r). Then,

    mincut(ρ, t) ≤ r ≤ Ct − ω ≤ Ct.

By Lemma 4, there exists a primary edge subset η in At(r) such that η separates t from ρ. For a directed path P = (e1, e2, · · · , em), m ≥ 1, on the extended network G̃, we define

    KP = 1                           if m = 1;
    KP = ∏_{i=1}^{m−1} k_{ei,ei+1}   if m ≥ 2.                               (35)

We consider an imaginary error edge e′ ∈ E′, which is associated with the edge e ∈ E, and an edge ê ∈ E. By applying (5) recursively according to the given ancestral order on the edges in E, it is not difficult to obtain that

    f̃ê(e′) = ∑_{P: a directed path from e′ to ê} KP,                        (36)

where if no directed paths exist from e′ to ê, we see that f̃ê(e′) = 0 by (36). Continuing from (36), we obtain that

    f̃ê(e′) = ∑_{P: a directed path from e′ to ê passing through e} KP
              + ∑_{P: a directed path from e′ to ê not passing through e} KP   (37)
            = ∑_{P: a directed path from e′ to ê passing through e} KP,        (38)
where the last equality (38) is justified as follows. First, if no directed paths exist from e′ to ê, then we easily see that

    ∑_{P: a directed path from e′ to ê passing through e} KP = 0 = f̃ê(e′).

Thus, the equality (38) is satisfied. Otherwise, we consider the two cases below.

Case 1: e = ê.

In this case, we note that (e′, e) is the unique directed path from e′ to e by the acyclicity of the extended network G̃. Then, there does not exist a path from e′ to e not passing through e. This immediately implies that the second term in (37) is 0, i.e.,

    ∑_{P: a directed path from e′ to ê not passing through e} KP = 0.

We thus have proved the equality (38) in this case. Further, we have

    f̃e(e′) = ∑_{P: a directed path from e′ to e passing through e} KP = K_{(e′,e)} = k_{e′,e} = 1.

Case 2: e ≠ ê.

If there does not exist a path from e′ to ê not passing through e, then, similar to the above discussion in Case 1, the second term in (37) is 0 and so we have proved the equality (38). Otherwise, each directed path P from e′ to ê not passing through the edge e can be regarded as the concatenation of two sub-paths, where one is a length-2 path (e′, c) from e′ to some edge c ∈ Out(tail(e)) \ {e}; the other is a directed path from c to ê, denoted by P_{c→ê}. Note that these two paths overlap on the edge c. Together with k_{e′,c} = 0 as c ∈ Out(tail(e)) \ {e} (cf. (4)), it follows from (35) that

    KP = K_{(e′,c)} · K_{P_{c→ê}} = k_{e′,c} · K_{P_{c→ê}} = 0 · K_{P_{c→ê}} = 0.   (39)

This implies that the second term in (37) is 0, and thus we have proved the equality (38). In particular, we note that the above argument also applies to the special case that ê ∈ Out(tail(e)) \ {e}. To be specific, in (39), K_{P_{c→ê}} = 1 if c = ê (cf. (35)).

Now, continuing from (38), we have

    f̃ê(e′) = ∑_{P: a directed path from e′ to ê passing through e} KP
            = k_{e′,e} · ∑_{P: a directed path from e to ê} KP
            = ∑_{P: a directed path from e to ê} KP,                          (40)

where the last equality follows from k_{e′,e} = 1 (cf. (4)). Note that (40) continues to hold when there exists no directed path from e′ to ê.
Next, we will prove that ∆(t, ρ) ⊆ ∆(t, η), where we recall that ρ is any edge subset in Et(r) and η is any primary edge subset in At(r) such that η separates t from ρ. Toward this end, we consider two cases for an edge e ∈ ρ.

Case 1: e ∈ In(t), i.e., e ∈ ρ ∩ In(t).

We first claim that e ∈ η, because otherwise η cannot separate t from {e} (which is a subset of ρ) and thus cannot separate t from ρ, a contradiction. Now, we consider the row vector rowt(e′) = ( f̃ê(e′) : ê ∈ In(t) ), where e′ is the imaginary error edge associated with e. By the above claim that e ∈ η, we immediately obtain that rowt(e′) ∈ ∆(t, η) (cf. (14)).
Case 2: e ∉ In(t), i.e., e ∈ ρ \ In(t).

We consider an arbitrary edge ê ∈ In(t). If there exists a directed path P from e to ê, then this path P has length at least 2 and can be regarded as the concatenation of two sub-paths, where one is a length-2 path (e, d) from e to some edge d ∈ Out(head(e)); the other is a directed path P_{d→ê} from d to ê. By (35), we have

    KP = K_{(e,d)} · K_{P_{d→ê}} = k_{e,d} · K_{P_{d→ê}}.                    (41)

On the other hand, if there exists no path from e to ê, then we readily see that for any edge d ∈ Out(head(e)), there exists no path from d to ê, either.
Then, continuing from (40), we obtain that

    f̃ê(e′) = ∑_{d ∈ Out(head(e))} ∑_{P: a directed path from e to ê via d} KP
            = ∑_{d: ∃ a path from e to ê via d} ∑_{P: a path from e to ê via d} KP
              + ∑_{d: ∄ a path from e to ê via d} ∑_{P: a path from e to ê via d} KP      (42)
            = ∑_{d: ∃ a path from e to ê via d} ( k_{e,d} · ∑_{P_{d→ê}: a path from d to ê} K_{P_{d→ê}} )
              + ∑_{d: ∄ a path from e to ê via d} ( k_{e,d} · ∑_{P_{d→ê}: a path from d to ê} K_{P_{d→ê}} ),  (43)

where d ranges over Out(head(e)), and the last equality (43) is explained as follows. We first consider the first term in (42). By (41), we immediately obtain that

    ∑_{d: ∃ a path from e to ê via d} ∑_{P: a path from e to ê via d} KP
      = ∑_{d: ∃ a path from e to ê via d} ( k_{e,d} · ∑_{P_{d→ê}: a path from d to ê} K_{P_{d→ê}} ).   (44)
Next, we consider the second term in (42). We note that for an edge d ∈ Out(head(e)), there exists no path from e to ê via d if and only if there exists no path from d to ê. As such, we obtain that

    ∑_{d: ∄ a path from e to ê via d} ∑_{P: a path from e to ê via d} KP = 0            (45)

and

    ∑_{d: ∄ a path from e to ê via d} ( k_{e,d} · ∑_{P_{d→ê}: a path from d to ê} K_{P_{d→ê}} ) = 0,   (46)

where d ranges over Out(head(e)). Combining (44), (45) and (46), we immediately prove the equality (43), and we further obtain that

    f̃ê(e′) = ∑_{d ∈ Out(head(e))} ( k_{e,d} · ∑_{P: a directed path from d to ê} KP )
            = ∑_{d ∈ Out(head(e))} k_{e,d} · f̃ê(d′),                                   (47)

where the equality (47) again follows from (40) with d in place of e. In particular, the equality (47) holds when there exists no path from e′ to ê, with

    f̃ê(e′) = 0  and  f̃ê(d′) = 0,  ∀ d ∈ Out(head(e)).



Now, for the row vector rowt(e′) = ( f̃ê(e′) : ê ∈ In(t) ), by (47) we obtain that

    rowt(e′) = ( f̃ê(e′) : ê ∈ In(t) )
             = ( ∑_{d ∈ Out(head(e))} k_{e,d} · f̃ê(d′) : ê ∈ In(t) )
             = ∑_{d ∈ Out(head(e))} k_{e,d} · ( f̃ê(d′) : ê ∈ In(t) )
             = ∑_{d ∈ Out(head(e))} k_{e,d} · rowt(d′).                       (48)

Further, for any d ∈ Out(head(e)), if no path exists from d to the sink node t, by (40) we have

    f̃ê(d′) = 0,  ∀ ê ∈ In(t),

implying that rowt(d′) = 0. Thus, continuing from (48), we obtain that

    rowt(e′) = ∑_{d ∈ Out(head(e))} k_{e,d} · rowt(d′)
             = ∑_{d ∈ Out(head(e)): ∃ a path from d to t} k_{e,d} · rowt(d′).   (49)
In (49), for each d in the summation, apply (49) recursively for rowt(d′) by letting e be d until all the edges d in the summation are in η. Then we obtain that rowt(e′) is a linear combination of rowt(d′), d ∈ η, i.e., rowt(e′) ∈ ∆(t, η).

Now, we combine the above two cases and immediately obtain that rowt(e′) ∈ ∆(t, η) for all e ∈ ρ, or equivalently, ∆(t, ρ) ⊆ ∆(t, η). Then (33) implies that Φ(t) ∩ ∆(t, ρ) = {0}. We thus have proved the "if" part and also the theorem.

Recall the definition of H(r) in (26). We immediately obtain the following corollary.

Corollary 5. Consider a rate-ω LNEC code over a finite field Fq on the network G. Then for a sink node t with Ct ≥ ω and a nonnegative integer r ≤ Ct − ω, the conditions (32), (33) and the condition

    Φ(t) ∩ ∆(t, ρ) = {0},  ∀ ρ ∈ H(r)                                        (50)

are all equivalent.

Proof: Note that At(r) ⊆ H(r) ⊆ Et(r). Hence, we obtain that (32) ⇒ (50) and (50) ⇒ (33). Together with (33) ⇔ (32) from Theorem 3, the corollary is proved.

Together with the equivalence of (32) and (33) in Theorem 3 and the discussion above Definition 4, we see that at a sink node t, the "effect" of any error vector z ∈ ρ for an edge subset ρ ∈ Et(r) is equal to the "effect" of an error vector z′ ∈ η for a primary edge subset η ∈ At(r) such that η separates t from ρ, i.e., z · Gt = z′ · Gt. Thus, to ensure that an LNEC code C̃ can correct any error vector in the set of error vectors

    Z(Et(r)) ≜ { z ∈ F_q^{|E|} : z ∈ ρ for some ρ ∈ Et(r) },                  (51)

we only need to ensure that the code C̃ can correct any error vector in the reduced set of error vectors

    Z(At(r)) ≜ { z ∈ F_q^{|E|} : z ∈ ρ for some ρ ∈ At(r) }.                  (52)

Thus we have proved the following important consequence.

Theorem 6. Consider an Fq-valued rate-ω LNEC code on a network G = (V, E). For a sink node t ∈ T with dim(Φ(t)) = ω, the LNEC code can correct at t any error vector in the set Z(At(r)) if and only if this code can correct at t any error vector in the set Z(Et(r)).
By combining Theorem 3 with Theorem 6, we immediately enhance Theorem 2 in the following corollary.

Corollary 7. Consider an Fq-valued rate-ω LNEC code on the network G. For a sink node t ∈ T with dim(Φ(t)) = ω, the LNEC code can correct any error vector in the following set of error vectors:

    Z(Et(⌊(d_min^{(t)} − 1)/2⌋)) = { z ∈ F_q^{|E|} : z ∈ ρ for some ρ ∈ Et(⌊(d_min^{(t)} − 1)/2⌋) }.   (53)

Proof: By Theorem 2, a rate-ω LNEC code C̃ can correct at the sink node t any error vector in the set

    { z ∈ F_q^{|E|} : z ∈ ρ for some ρ ∈ H(r∗) },

where we let r∗ = ⌊(d_min^{(t)} − 1)/2⌋ for notational simplicity. It follows from At(r∗) ⊆ H(r∗) that the LNEC code C̃ can correct at t any error vector in the set Z(At(r∗)). By Theorem 6, C̃ can correct at t any error vector in the set Z(Et(r∗)). We thus have proved the corollary.

We now use the following example to illustrate the enhanced characterization of the capability of an
LNEC code as asserted in Theorem 6 and Corollary 7.

Example 3. Recall the network G = (V, E) depicted in Fig. 4, where s is the single source node and T = {t1, t2} is the set of sink nodes with Ct1 = Ct2 = 5. We consider a rate-3 LNEC code C̃ on G such that d_min^{(t1)} = d_min^{(t2)} = 3. Such a code exists because it satisfies the Singleton bound in (28).

Due to the symmetry of the problem, we only consider the sink node t1 and let r = ⌊(d_min^{(t1)} − 1)/2⌋ = 1. We say an edge subset ρ ⊆ E is t1-correctable for this LNEC code C̃ if any error vector z ∈ ρ can be corrected at t1 using C̃. It follows from Theorem 2 that all 21 edge subsets in H(1) = { ρ ⊆ E : |ρ| ≤ 1 } are t1-correctable, where clearly, |H(1)| = |E| = 21.
e in terms of
We now consider the enhanced characterization of the capability of an LNEC code C
Et1 (r) and At1 (r) (cf. Theorem 6 and Corollary 7). We first partition E into two edge-disjoint sets

Etc1 , e9 , e11 , e15 , e19 , e21 and E \ Etc1 .

Note that Etc1 is precisely the set of edges in E such that there exists no path from this edge to t1 .
Accordingly, H (1) is partitioned into two disjoint collections of size-1 edge subsets
 
{e9 }, {e11 }, {e15 }, {e19 }, {e21 } and {e} : e ∈ E \ Etc1 .

The set of all size-1 primary edge subsets for t1 is given by


n o
At1 (1) = {e1 }, {e4 }, {e6 }, {e10 }, {e12 }, {e18 }, {e20 } .
Consider all the 16 size-1 edge subsets, each of which consists of one edge in E \ E^c_{t1}. We see that {e10} is the primary minimum cut separating t1 from {e3}; {e18} is the primary minimum cut separating t1 from {e2}, {e7}, {e8} and {e16}, respectively; and {e20} is the primary minimum cut separating t1 from {e5}, {e13}, {e14} and {e17}, respectively. For i = 1, 4, 6, 12, {ei} is the primary minimum cut separating t1 from only {ei} itself.

We write ρ ∼t1 η for two edge subsets ρ and η of E if ρ and η have the same primary minimum cut with respect to t1; e.g., {e2} ∼t1 {e7}, where {e18} is the common primary minimum cut separating t1 from {e2} and {e7}. It was proved in [40] that "∼t1" is an equivalence relation. With the relation "∼t1", H(1) can be partitioned into 8 equivalence classes

    { {e1} }, { {e4} }, { {e6} }, { {e12} }, { {e3}, {e10} }, { {e2}, {e7}, {e8}, {e16}, {e18} },
    { {e5}, {e13}, {e14}, {e17}, {e20} }  and  { {e9}, {e11}, {e15}, {e19}, {e21} },          (54)

where for the 5 edge subsets {e9}, {e11}, {e15}, {e19} and {e21} in the last equivalence class, the empty set of edges is their common primary minimum cut with respect to t1.

Furthermore, it is not difficult to see that any union of the edge subsets in an equivalence class still has the common primary minimum cut with respect to t1; e.g., {e18} is the common primary minimum cut separating t1 from {e2, e7} and {e7, e8, e16, e18}. Moreover, for any union ρ of the edge subsets in an equivalence class and any edge subset µ of E^c_{t1} (which is also a union of the edge subsets in the last equivalence class in (54)), we have

    mincut(ρ ∪ µ, t1) = mincut(ρ, t1).

For example, let ρ = {e2, e7} and µ = {e9}. Then, {e18} is the (primary) minimum cut separating t1 from ρ ∪ µ, and

    mincut(ρ ∪ µ, t1) = mincut({e2, e7, e9}, t1) = mincut({e2, e7}, t1) = 1.

Based on the above discussion, by means of a simple calculation, we can obtain that the size of Et1 (1)
is equal to 2,239, which is considerably larger than |H (1)| = 21. It follows from Corollary 7 that all
the 2,239 nonempty edge subsets in Et1 (1) are t1 -correctable. On the other hand, by Theorem 6, in order
to ensure that all the 2,239 nonempty edge subsets in Et1 (1) are t1 -correctable, it suffices to guarantee
that the 7 edge subsets in At1 (1) are t1 -correctable.

V. FIELD SIZE REDUCTION FOR LNEC CODES

A. Improved Upper Bound on the Minimum Required Field Size

The minimum required field size for the existence of LNEC codes, particularly LNEC MDS codes, is an open problem not only of theoretical interest but also of practical importance, because it is closely related to the implementation of code constructions in terms of computational complexity and storage requirements. In this subsection, we will present an improved upper bound on the minimum required field size, which shows that the field size required for the existence of LNEC codes can in general be reduced significantly. This new bound is graph-theoretic: it depends only on the network topology and the required error correction capability, not on the specific code construction.

Theorem 8. Let Fq be a finite field of order q, where q is a prime power. Let T be the set of sink nodes on the network G with Ct ≥ ω, ∀ t ∈ T. For each t ∈ T, let βt be a nonnegative integer not larger than Ct − ω. Then, there exists an Fq-valued rate-ω LNEC code on G with the minimum distance at t not smaller than βt + 1 for each t ∈ T, i.e., d_min^{(t)} ≥ βt + 1, ∀ t ∈ T, if the field size q satisfies

    q > ∑_{t∈T} |At(βt)|.                                                    (55)

Proof: To prove Theorem 8, we need to prove that if (55) is satisfied for the field Fq, then there exists an Fq-valued rate-ω LNEC code on G such that for each sink node t ∈ T,

    dim(Φ(t)) = ω  and  d_min^{(t)} ≥ βt + 1.                                 (56)

By Definition 5, (56) is equivalent to the condition that for each t ∈ T,

    dim(Φ(t)) = ω  and  Φ(t) ∩ ∆(t, ρ) = {0},  ∀ ρ ⊆ E with |ρ| ≤ βt.         (57)

We further write the second condition in (57) as

    Φ(t) ∩ ∆(t, ρ) = {0},  ∀ ρ ∈ H(βt),

which, by Corollary 5, is equivalent to

    Φ(t) ∩ ∆(t, ρ) = {0},  ∀ ρ ∈ At(βt).

Based on the above discussion, in order to prove the theorem, it suffices to prove that if the field Fq satisfies (55), then there exists an Fq-valued rate-ω LNEC code on G such that for each sink node t ∈ T,

    dim(Φ(t)) = ω  and  Φ(t) ∩ ∆(t, ρ) = {0},  ∀ ρ ∈ At(βt).                  (58)

This statement can be proved by using a standard argument (e.g., the proof of Theorem 1 in [16] and the proof of Theorem 5 in [21]). We omit the details here.

A straightforward upper bound on the minimum required field size for the existence of a rate-ω LNEC code with the minimum distance d_min^{(t)} ≥ βt + 1 for each t ∈ T (where βt is a nonnegative integer not larger than Ct − ω) is ∑_{t∈T} C(|E|, βt), where C(·, ·) denotes the binomial coefficient. Such a code can correct at t an arbitrary error vector in the set Z(Et(⌊βt/2⌋)) for each t ∈ T. Subsequently, this upper bound was improved in [21] (cf. [21, Theorem 8]), as presented in the following proposition. To our knowledge, this is the best known upper bound on the minimum required field size for the existence of such a rate-ω LNEC code.

Proposition 9. Let Fq be a finite field of order q, where q is a prime power. Let T be the set of sink nodes on the network G with Ct ≥ ω, ∀ t ∈ T. For each t ∈ T, let βt be a nonnegative integer not larger than Ct − ω. Then, there exists an Fq-valued rate-ω LNEC code on G with the minimum distance d_min^{(t)} ≥ βt + 1 for each t ∈ T if the field size q satisfies

    q > ∑_{t∈T} |Rt(βt)|,                                                    (59)

where

    Rt(βt) = { ρ ⊆ E : |ρ| = mincut(ρ, t) = βt }.                             (60)


We readily see that At(βt) ⊆ Rt(βt) ⊆ { ρ ⊆ E : |ρ| = βt } and so

    ∑_{t∈T} |At(βt)| ≤ ∑_{t∈T} |Rt(βt)| ≤ ∑_{t∈T} C(|E|, βt).

The improvement of our bound ∑_{t∈T} |At(βt)| in Theorem 8 over ∑_{t∈T} |Rt(βt)| (and also over ∑_{t∈T} C(|E|, βt)) is in general significant, as illustrated by Example 4 below. The only case in which ∑_{t∈T} |At(βt)| has no improvement over ∑_{t∈T} |Rt(βt)|, i.e., ∑_{t∈T} |At(βt)| = ∑_{t∈T} |Rt(βt)|, is that for each sink node t, every edge subset ρ with |ρ| = mincut(ρ, t) = βt is primary for t, i.e., ρ is the unique minimum cut separating t from itself. This condition holds only for very special networks. For example, consider the network depicted in Fig. 1, which consists of only two nodes, a source node s and a sink node t, connected by multiple parallel edges from s to t. In this network, for any positive integer βt not larger than |E|, i.e., βt ≤ |E| (where in fact |E| is the number of parallel edges from s to t), we readily see that each edge subset ρ ⊆ E of size βt is primary for t. This immediately implies that

    |At(βt)| = |Rt(βt)| = C(|E|, βt),  ∀ βt ≤ |E|.

Example 4. Recall the network G = (V, E) depicted in Fig. 4, where s is the single source node and T = {t1, t2} is the set of sink nodes with Ct1 = Ct2 = 5. Let the rate ω = 3 and βt1 = βt2 = 2, two nonnegative integers not larger than Ct1 − ω and Ct2 − ω, respectively. We consider an Fq-valued rate-3 LNEC code with d_min^{(t1)} ≥ βt1 + 1 and d_min^{(t2)} ≥ βt2 + 1. This code can correct at the sink node ti an arbitrary error vector in the set Z(Eti(1)) for i = 1, 2. We now focus on the field size q for the existence of such a code.
We first calculate the straightforward bound ∑_{t∈T} C(|E|, βt) on the field size q as follows:

    ∑_{t∈T} C(|E|, βt) = 2 · C(21, 2) = 420.                                  (61)
Next, we calculate the bound ∑_{t∈T} |Rt(βt)| on the field size q in Proposition 9. By (60) and βt1 = 2, we obtain that

    Rt1(2) = { {e1, e2}, {e1, e3}, {e1, e4}, {e1, e5}, {e1, e6}, {e1, e7}, {e1, e8}, {e1, e10},
               {e1, e12}, {e1, e13}, {e1, e14}, {e1, e16}, {e1, e17}, {e1, e18}, {e1, e20}, {e2, e3},
               {e2, e4}, {e2, e5}, {e2, e6}, {e2, e10}, {e2, e12}, {e2, e13}, {e2, e14}, {e2, e17},
               {e2, e20}, {e3, e4}, {e3, e5}, {e3, e6}, {e3, e7}, {e3, e8}, {e3, e12}, {e3, e13},
               {e3, e14}, {e3, e16}, {e3, e17}, {e3, e18}, {e3, e20}, {e4, e5}, {e4, e6}, {e4, e7},
               {e4, e8}, {e4, e10}, {e4, e12}, {e4, e13}, {e4, e14}, {e4, e16}, {e4, e17}, {e4, e18},
               {e4, e20}, {e5, e6}, {e5, e7}, {e5, e8}, {e5, e10}, {e5, e12}, {e5, e16}, {e5, e18},
               {e6, e7}, {e6, e8}, {e6, e10}, {e6, e12}, {e6, e13}, {e6, e14}, {e6, e16}, {e6, e17},
               {e6, e18}, {e6, e20}, {e7, e10}, {e7, e12}, {e7, e13}, {e7, e14}, {e7, e17}, {e7, e20},
               {e8, e10}, {e8, e12}, {e8, e13}, {e8, e14}, {e8, e17}, {e8, e20}, {e10, e12}, {e10, e13},
               {e10, e14}, {e10, e16}, {e10, e17}, {e10, e18}, {e10, e20}, {e12, e13}, {e12, e14}, {e12, e16},
               {e12, e17}, {e12, e18}, {e12, e20}, {e13, e16}, {e13, e18}, {e14, e16}, {e14, e18}, {e16, e17},
               {e16, e20}, {e17, e18}, {e18, e20} }

with |Rt1(2)| = 99. By the symmetry of the network G, we also have |Rt2(2)| = 99. So, the bound (59) in Proposition 9 is

    |Rt1(2)| + |Rt2(2)| = 198,                                               (62)

which is smaller than 420 from (61).


Next, we present the set At1(2) of all the primary edge subsets for t1 of size βt1 = 2 as follows:

At1(2) = { {e1, e4}, {e1, e10}, {e1, e12}, {e1, e20}, {e6, e10}, {e6, e12}, {e6, e18},
{e6, e20}, {e10, e12}, {e10, e18}, {e10, e20}, {e12, e18}, {e12, e20}, {e18, e20} }.

Then, |At1 (2)| = 14. We also have |At2 (2)| = |At1 (2)| = 14. Thus, the improved bound (55) in Theorem 8
is
|At2 (2)| + |At1 (2)| = 28,


which is considerably smaller than 198 from (62).

On the other hand, by the definition of primary edge subset in the paragraph immediately above
Example 2, it is not difficult to see that for a sink node t, any βt of the |In(t)| input edges of t form a
size-βt primary edge subset for t. We thus immediately obtain a lower bound on the size of At (βt ) as
presented in the following corollary.

Corollary 10. For a sink node t, let βt be a nonnegative integer not larger than Ct − ω. Then

|At(βt)| ≥ \binom{|In(t)|}{βt}.

Continuing from Example 4, by this corollary, the size 14 of A_{ti}(β_{ti}) is lower bounded by \binom{|In(ti)|}{β_{ti}} = 10
for i = 1, 2.
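The counts appearing in Example 4 and in this corollary reduce to a few binomial evaluations. The short Python sketch below (ours, for illustration only) checks the straightforward bound (61) and the lower bound of Corollary 10 for Example 4, where |E| = 21, |In(ti)| = 5, and βti = 2.

```python
from math import comb

# Straightforward bound (61): sum over the two sinks of binom(|E|, beta_t),
# with |E| = 21 and beta_t = 2 for both sinks.
straightforward = 2 * comb(21, 2)
print(straightforward)  # 420

# Corollary 10 lower bound for Example 4: binom(|In(t_i)|, beta_{t_i})
# with |In(t_i)| = 5 and beta_{t_i} = 2; Example 4 gives |A_{t_i}(2)| = 14.
lower = comb(5, 2)
print(lower, lower <= 14)  # 10 True
```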
Next, we will present an improved upper bound on the minimum required field size for the existence
of a rate-ω LNEC MDS code in the following theorem, which is a consequence of Theorem 8. First, we
recall that an Fq-valued rate-ω LNEC code C̃ is MDS if this code C̃ is decodable for T and satisfies the
Singleton bound (28) with equality, i.e.,

dim(Φ(t)) = ω and d^{(t)}_min = Ct − ω + 1, ∀ t ∈ T.

Theorem 11. Let Fq be a finite field of order q, where q is a prime power, and T be the set of sink
nodes on the network G with Ct ≥ ω, ∀ t ∈ T. There exists an Fq-valued rate-ω LNEC MDS code on
G if the field size q satisfies

q > ∑_{t∈T} |At(δt)|, (63)

where δt ≜ Ct − ω is called the redundancy of the sink node t ∈ T.


The best known upper bound ∑_{t∈T} |Rt(δt)| on the minimum required field size for the existence
of a rate-ω LNEC MDS code was presented in [21] (cf. [21, Theorem 5]). The bound in Theorem 11
improves upon this bound, and the improvement is in general significant. In fact, the LNEC code considered
in Example 4 is MDS and we have seen that the improvement is significant. Furthermore, similar to
Corollary 10, a lower bound on the size of At(δt) is given as follows.

Corollary 12. For a sink node t, the size of At(δt) is lower bounded by \binom{|In(t)|}{δt}, i.e.,

|At(δt)| ≥ \binom{|In(t)|}{δt}.


We recall the discussion immediately above Example 4. Together with the fact that |In(t)| = |E| for
any network as depicted in Fig. 1, the discussion shows that the lower bound in Corollary 12 is tight,
i.e.,

|At(δt)| = \binom{|In(t)|}{δt}, ∀ δt ≤ |In(t)|.

Further, since network error correction coding over such a network depicted in Fig. 1 can be regarded
as the model of classical coding theory, |At(δt)| = \binom{|In(t)|}{δt} is an upper bound on the minimum required
field size for the existence of an [|In(t)|, |In(t)| − δt] linear MDS code, where |In(t)| and |In(t)| − δt
are the length and dimension of the code, respectively. In general, linear MDS codes with field size
smaller than this bound exist. For example, let |E| = n and δt = n − k, where k (k ≤ n) is the designed
dimension of the code. Then, there exists an [n, k] linear MDS code over a finite field Fq if q ≥ n − 1.
A well-known conjecture on the field size for the existence of linear MDS codes is the following.

MDS Conjecture ([24, Chapter 7.4]): If there is a nontrivial [n, k] linear MDS code over Fq , then
n ≤ q + 1, except when q is even and k = 3 or k = q − 1, in which case n ≤ q + 2.

B. Efficient Algorithm for Computing the Improved Bound

In the last subsection, an improved upper bound on the minimum required field size for the existence of
LNEC codes was obtained. This bound is graph-theoretic: it depends only on the network
topology and the required error correction capability of the LNEC code. However, it is not given in a
form that is readily computable. Accordingly, in this subsection we develop an efficient algorithm
to compute this bound.
Let t be a sink node on the network G = (V, E) and r be a nonnegative integer not larger than Ct − ω .
We first develop an efficient algorithm for computing At (r). An implementation of the algorithm is given
in Algorithm 1.

Algorithm Verification:

1) In Lines 1 and 2, initialize two sets A (r) and B to the empty set and the set of all size-r edge
subsets of Et , respectively, where Et denotes the set of edges in E from which t is reachable, i.e.,
for each e ∈ Et , there exists a directed path from e to t on the network G.
2) In Lines 4 and 5, arbitrarily choose an edge subset η ∈ B and find the primary minimum cut
separating t from η , denoted by ρ. We note that for each edge subset η , the primary minimum cut
separating t from η exists and is unique.


Algorithm 1: Algorithm for computing At(r)

Input: The network G = (V, E), a sink node t and a nonnegative integer r.
Output: At(r), the set of all the size-r primary edge subsets for t.

begin
1   Set A(r) = ∅;
2   Set B = {η ⊆ Et : |η| = r}, where Et is the set of the edges in E from which t is reachable;
    // If there exists a directed path from an edge e to t, we say t is reachable from e or e can reach t.
3   while B ≠ ∅ do
4       choose an edge subset η in B;
5       find the primary minimum cut ρ separating t from η;
        // The primary minimum cut ρ separating t from η is a primary edge subset for t.
6       if |ρ| ≠ r then  // Namely, |ρ| < r.
7           remove η from B;
        else  // Namely, |ρ| = r.
8           add ρ to A(r);
9           partition Et into two parts Et,ρ and E^c_{t,ρ} = Et \ Et,ρ;
            // Here, Et,ρ is the set of the edges from which t is reachable upon deleting the edges in ρ.
            // Note that ρ ⊆ E^c_{t,ρ}.
10          for each µ ∈ B do
11              if µ ⊆ E^c_{t,ρ} then
12                  remove µ from B;
                end
            end
        end
    end
13  Return A(r).
    // After the "while" loop, A(r) contains all the size-r primary edge subsets for t, i.e., A(r) = At(r).
end


3) We note that

|ρ| = mincut(η, t) ≤ |η| = r, (64)

and then consider two cases below.

Case 1: If |ρ| ≠ r, which implies |ρ| < r by (64), then the “if” statement (Line 7) is executed. In
this case, we readily see that ρ is not a size-r primary edge subset for t. Then, we remove η from
B and go back to Line 3 for checking whether the updated B is empty or not.
Case 2: If |ρ| = r, which implies that ρ is a size-r primary edge subset for t, then the “else”
statement (Lines 8–12) is executed. To be specific, in Line 8, add this size-r primary edge subset ρ
to A(r). In Line 9, partition the edge set Et into two disjoint subsets: Et,ρ and E^c_{t,ρ} ≜ Et \ Et,ρ,
where Et,ρ is the set of edges from which t is reachable upon deleting the edges in ρ. Note that
ρ ⊆ E^c_{t,ρ}. Next, for the “for” loop (Lines 10–12), all the edge subsets in B that are subsets of
E^c_{t,ρ} are removed. By Lemma 4, it is not difficult to see that each edge subset η in B, regardless
of whether mincut(η, t) = r or mincut(η, t) < r, is a subset of E^c_{t,ρ} if and only if ρ separates t
from η. This immediately implies that after this “for” loop, all the edge subsets in B from which ρ
separates t are removed from B, and none of the other size-r primary edge subsets are removed
from B. Thus, we see that in each iteration, exactly one size-r primary edge subset for t is added
to A(r).
4) Repeat Steps 2) and 3) above until B is empty and output A (r) in Line 13, which is now equal to
At (r).
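The loop structure verified above can be mirrored in a short Python sketch. The two helpers, primary_min_cut_fn and reachable_fn, are assumed to be supplied (e.g., by the algorithms developed later in this section); their names and the data layout are ours, for illustration only.

```python
from itertools import combinations

def primary_subsets(edges_reaching_t, r, primary_min_cut_fn, reachable_fn):
    """Sketch of Algorithm 1: collect all size-r primary edge subsets for a
    sink t. primary_min_cut_fn(eta) must return the unique primary minimum
    cut separating t from eta; reachable_fn(rho) must return E_{t,rho}.
    Both helpers are assumed given."""
    A = set()                                     # A(r), initially empty
    B = {frozenset(eta) for eta in combinations(edges_reaching_t, r)}
    while B:
        eta = next(iter(B))                       # Line 4: pick any eta
        rho = frozenset(primary_min_cut_fn(eta))  # Line 5
        if len(rho) != r:                         # Lines 6-7: mincut < r
            B.discard(eta)
        else:                                     # Lines 8-12
            A.add(rho)
            complement = set(edges_reaching_t) - set(reachable_fn(rho))
            # remove every mu that rho separates from t (eta is one of them)
            B = {mu for mu in B if not mu <= complement}
    return A

# Classical-coding sanity check: when every edge enters t directly, every
# size-r subset is its own primary minimum cut, so |A_t(r)| = binom(|E|, r).
E = ["e1", "e2", "e3", "e4"]
result = primary_subsets(E, 2, lambda eta: eta,
                         lambda rho: set(E) - set(rho))
print(len(result))  # 6
```

The sanity check at the end matches the observation preceding Example 4: in the classical setting, every size-r edge subset is primary.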

In Algorithm 1, the two crucial steps are i) to find the primary minimum cut ρ separating t from an
edge subset η in B (Line 5), and ii) to partition Et into Et,ρ and E^c_{t,ρ} (Line 9). We first consider the
step of partitioning Et into Et,ρ and E^c_{t,ρ}. Toward this end, it suffices to determine the edge set Et,ρ, i.e.,
to find all the edges that can reach t upon deleting the edges in ρ. This can be implemented efficiently
by Algorithm 2 below.
Algorithm 2 extends from the sink node t and identifies an increasing number of edges that can reach
t. At any point during the execution of the algorithm, all the nodes in the network can be in one of two
states: marked or unmarked. The marked nodes are those from which t is reachable, and the unmarked
nodes are those yet to be classified. The edges in the set E-SET at this point have been identified to
be those from which t is reachable. The set N-SET contains marked nodes whose input edges have not
been processed. When a node v ∈ N-SET is selected in Line 6, all the input edges of v that are not in ρ
are added to E-SET in the “for” loop (Lines 7–11). Since v ∈ N-SET, we see that v is marked and so


Algorithm 2: Algorithm for partitioning Et into Et,ρ and E^c_{t,ρ}

Input: The network G = (V, E) and a primary edge subset ρ for t.


Output: Et,ρ , the set of all the edges that can reach t upon deleting the edges in ρ.

begin
1 Unmark all nodes in V ;
2 mark sink node t;
3 set an edge-set E-SET = ∅;
4 set a node-set N-SET = {t};
5 while N-SET ≠ ∅ do
6 select a node v in N-SET;
7 for each node u incident to an edge (u, v) not in ρ do
8 add all parallel edges leading from u to v and not in ρ to E-SET;
9 if u is unmarked then
10 mark node u;
11 add node u to N-SET;
end
end
12 delete node v from N-SET;
end
13 Return E-SET.
// After the “while” loop, E-SET contains all the edges that can reach t upon deleting the edges in ρ, i.e.,
E-SET = Et,ρ .

end

t is reachable from v . This implies that t is reachable from all these input edges and they are added to
E-SET in Line 8. The node u incident to an edge (u, v) can reach t via node v . If u is unmarked, then
mark u in Line 10. Otherwise, u has already been marked and so t is reachable from u. After the “for”
loop (Lines 7–11), all the input edges of v that are not in ρ are added to E-SET and all the nodes u
incident to an edge (u, v) are marked. Now, the node v has been processed and is removed from N-SET
in Line 12. The algorithm terminates when the set of nodes N-SET is empty. At this point, all the nodes
that can reach t have been marked and processed, and the edge set E-SET contains all the edges that


can reach t upon deleting the edges in ρ, namely that E-SET = Et,ρ . Now, we consider the complexity
of Algorithm 2. We can readily see that the algorithm traverses all the edges in Et,ρ exactly once, and
thus Algorithm 2 can find the edge set Et,ρ in O(|Et,ρ |) time.
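Algorithm 2 is essentially a backward breadth-first search from t that skips the edges in ρ. A minimal Python sketch is given below; the data layout (edges as (tail, head) pairs indexed by position, ρ as a set of indices) is ours, for illustration.

```python
from collections import deque

def reachable_edge_set(edges, t, rho):
    """Sketch of Algorithm 2: return E_{t,rho}, the indices of the edges
    from which t is still reachable after deleting the edges in rho.
    edges is a list of (tail, head) pairs; parallel edges are allowed."""
    # Group edge indices by head node so we can traverse backward from t.
    in_edges = {}
    for idx, (u, v) in enumerate(edges):
        in_edges.setdefault(v, []).append(idx)

    marked = {t}        # nodes from which t is reachable (t itself is marked)
    e_set = set()       # E-SET: edges identified as reaching t
    n_set = deque([t])  # N-SET: marked nodes whose input edges are unprocessed
    while n_set:
        v = n_set.popleft()
        for idx in in_edges.get(v, []):
            if idx in rho:
                continue                 # deleted edges are skipped
            e_set.add(idx)               # this edge can reach t via v
            u = edges[idx][0]
            if u not in marked:          # first time we see u: mark it
                marked.add(u)
                n_set.append(u)
    return e_set

# Tiny example: two parallel paths s->a->t and s->b->t; delete edge (a, t).
edges = [("s", "a"), ("a", "t"), ("s", "b"), ("b", "t")]
print(sorted(reachable_edge_set(edges, "t", {1})))  # [2, 3]
```

In the example, deleting (a, t) also disconnects (s, a) from t, which is why only the path through b survives; each edge is touched at most once, matching the O(|Et,ρ|) running time stated above.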

Next, we consider the other crucial step of finding the primary minimum cut ρ separating t from an
edge subset η in B . Guang and Yeung [39] proved that in the augmenting path algorithm [1], [2] (also
see [41, Chapter 6.5] and [42, Chapter 7.2]) for finding the maximum flow from the source node s to
a non-source node t on a directed acyclic network, the last step for determining the termination of the
algorithm in fact finds the primary minimum cut separating t from s. Based on this result, we can develop
an efficient algorithm for directly finding the primary minimum cut separating t from η , which avoids
reversing the network G to G⊤ and then finding minimum cuts separating η from t on G⊤ .
On the network G, we first subdivide each edge e ∈ η by creating a node ve for e and splitting e
into two edges e1 and e2 with tail(e1 ) = tail(e), head(e2 ) = head(e), and head(e1 ) = tail(e2 ) = ve .
Then, we create a new node vη and add a new “super-edge” with infinite capacity from vη to ve for
every node ve , e ∈ η . By the definition of a cut separating t from η in the first paragraph of Section III,
we can readily see that a cut of finite capacity separating t from vη is a cut separating t from η on G,
and vice versa (where, whenever e1 or e2 appears in the cut, replace it by e). As such, for the purpose
of finding the primary minimum cut separating t from η on G, we only need to consider algorithms
for finding the primary minimum cut separating t from vη . Furthermore, for the sake of computational
efficiency, in finding the primary minimum cut separating t from η (or equivalently, the primary minimum
cut separating t from vη), it suffices to set the capacities of all the newly added “super-edges” ê from
vη to ve, e ∈ η, to one rather than infinity. In fact, the primary minimum cut separating t from vη does
not contain any newly added super-edge, whether its capacity is finite or infinite. To see this, suppose ρ
is the primary minimum cut separating t from vη and assume that it contains a newly added super-edge
ê from vη to ve. Now, we replace ê by e2 in ρ to form a new edge subset ρ′, where we recall that e2
is the edge obtained by splitting e with tail(e2) = ve and head(e2) = head(e). We can see that ρ′ ≠ ρ
and ρ′ separates t from ρ. Thus ρ′ also separates t from vη. This contradicts the assumption that ρ is the
primary minimum cut separating t from vη.
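The transformation just described can be sketched as follows; the node label v_eta, the v_e tuples, and the returned (tail, head, label) triple format are our own illustrative conventions, not notation from the paper.

```python
def build_G_t_eta(edges, eta):
    """Subdivide each edge e in eta via a new node v_e into e^1, e^2, and
    add a node 'v_eta' with a unit-capacity super-edge to every v_e.
    edges: list of (tail, head) pairs; eta: set of edge indices.
    Returns a list of (tail, head, label) triples for the modified network."""
    new_edges = []
    for idx, (u, v) in enumerate(edges):
        if idx in eta:
            ve = ("v", idx)                            # the new node v_e
            new_edges.append((u, ve, f"e{idx}^1"))     # e^1: tail(e) -> v_e
            new_edges.append((ve, v, f"e{idx}^2"))     # e^2: v_e -> head(e)
            new_edges.append(("v_eta", ve, f"super_e{idx}"))  # unit capacity
        else:
            new_edges.append((u, v, f"e{idx}"))
    return new_edges

# Subdividing one of two edges yields 1 + 3 = 4 edges in the new network.
out = build_G_t_eta([("s", "a"), ("a", "t")], {0})
print(len(out))  # 4
```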
Let G = (V, E) be a directed acyclic network with a sink node t and a non-sink node n. Denote
by Cn,t the minimum cut capacity separating t from n, i.e., Cn,t = mincut(n, t). By the max-flow
min-cut theorem [1], [2], the value v(F) of a maximum flow F from n to t is equal to the minimum
cut capacity Cn,t, i.e., v(F) = Cn,t. Since all the edges in the network G have unit-capacity, Cn,t is a
positive integer and the maximum flow F can be decomposed into Cn,t edge-disjoint paths from n to t.


Such Cn,t edge-disjoint paths can be found in polynomial time in |E| [41], [42]. Algorithm 3 below is
an implementation of the algorithm for finding the primary minimum cut separating t from n.

Algorithm 3: Algorithm for finding the primary minimum cut separating t from another node n

Input: The network G = (V, E) with a maximum flow F from a node n to the sink node t
(n ≠ t). For every edge e in the corresponding Cn,t (≜ mincut(n, t)) edge-disjoint paths,
the flow value is equal to 1, i.e., F(e) = 1; otherwise, the flow value is equal to 0, i.e.,
F(e) = 0.
Output: The primary minimum cut separating t from n.

begin
1   Set S = {t};
2   for each node v ∈ S do
3       if ∃ a node u ∈ V \ S s.t. either ∃ a reverse edge e ∈ Et from u to v s.t. F(e) = 0 or ∃
        a forward edge e ∈ Et from v to u s.t. F(e) = 1 then
4           replace S by S ∪ {u}.
        end
    end
5   Return ρ = {e : tail(e) ∈ V \ S and head(e) ∈ S}.
end
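A direct Python transcription of Algorithm 3 is given below. It assumes a maximum flow from n to t is already available as a 0/1 value per unit-capacity edge; the (tail, head) edge layout and function name are ours, for illustration.

```python
def primary_min_cut(edges, flow, t):
    """Sketch of Algorithm 3: given edges as (tail, head) pairs and a 0/1
    maximum flow from n to t on unit-capacity edges, grow S from {t} along
    flow-0 reverse edges and flow-1 forward edges, then return the indices
    of the edges crossing from V \\ S into S, which form the primary
    minimum cut separating t from n."""
    S = {t}
    changed = True
    while changed:                       # repeat Lines 2-4 until S is stable
        changed = False
        for i, (u, v) in enumerate(edges):
            if v in S and u not in S and flow[i] == 0:
                S.add(u)                 # reverse edge into S with no flow
                changed = True
            elif u in S and v not in S and flow[i] == 1:
                S.add(v)                 # forward edge out of S with flow 1
                changed = True
    # Line 5: rho = {e : tail(e) in V \ S and head(e) in S}
    return {i for i, (u, v) in enumerate(edges) if u not in S and v in S}

# Two edge-disjoint paths n->a->t and n->b->t, both saturated by the flow:
# S stays {t}, and the minimum cut closest to t is {(a, t), (b, t)}.
edges = [("n", "a"), ("a", "t"), ("n", "b"), ("b", "t")]
print(sorted(primary_min_cut(edges, [1, 1, 1, 1], "t")))  # [1, 3]
```

Among all minimum cuts of this toy network ({(n,a),(n,b)} is another one), the algorithm returns the one closest to t, which is the defining property of the primary minimum cut.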

Example 5. We continue to consider the network G = (V, E) depicted in Fig. 4. In this example, we
will illustrate Algorithm 3, which finds the primary minimum cut separating the sink node t1 from the edge
subset η = {e2, e4}. Let Gt1,η be the network modified from G as illustrated in Fig. 6. Specifically, from
the network G, we delete the edges not connected to t1 (i.e., the edges not in Et1); subdivide e2 into
two edges e2^1 and e2^2 connected by a newly created node ve2 and subdivide e4 into two edges e4^1 and e4^2
connected by a newly created node ve4; and create a node vη with two unit-capacity output edges ê2 and
ê4 leading from vη to ve2 and from vη to ve4, respectively. Further, a maximum flow F from vη to t1 is
depicted in Fig. 6, where all the edges with flow value 1 are marked in thick lines. In the following, we
illustrate Algorithm 3, which outputs the primary minimum cut separating t1 from vη in Gt1,η, from which
we can immediately obtain the primary minimum cut separating t1 from η in G.

• Algorithm 3 starts with the sink node t1 . First, we see that e6 = (i1 , t1 ), e10 = (i3 , t1 ) and e12 =
(i4 , t1 ) are 3 reverse edges incident to t1 with flow value 0. Thus, the condition of the “if” statement


Fig. 6: The network Gt1,η.

in Line 3 is satisfied. We further see that e18 = (i7 , t1 ) and e20 = (i9 , t1 ) are 2 reverse edges incident
to t1 with flow value 1, which do not satisfy the condition of the “if” statement in Line 3. Hence,
update S = {t1 } to {t1 , i1 , i3 , i4 }.
• We then consider the node i1 ∈ S. The edge e1 = (s, i1) with s ∈ V \ S and i1 ∈ S is a reverse
edge with flow value 0 and thus the condition of the “if” statement in Line 3 is satisfied. The edge
e7 = (i1, i6) is a forward edge from i1 to i6 with i6 ∈ V \ S and F(e7) = 0. So the condition of
the “if” statement in Line 3 is not satisfied. Then, update S to {t1, s, i1, i3, i4}.
• For i3 ∈ S , the edge e3 = (s, i3 ) is the only edge incident to i3 but the tail node s is already in S .
So, the condition of the “if” statement in Line 3 is not satisfied. Similarly, for s ∈ S , no node in
V \ S satisfying the condition of the “if” statement in Line 3 exists.
• For i4 ∈ S , the edge e13 = (i4 , i8 ) is a forward edge from i4 ∈ S to i8 ∈ V \ S with flow value 1,
which satisfies the condition of the “if” statement in Line 3. Then, update S to {t1 , s, i1 , i3 , i4 , i8 }.
• For i8 ∈ S, the edge e14 = (i5, i8) is a reverse edge from i5 ∈ V \ S to i8 ∈ S with flow value
0, and the edge e17 = (i8, i9) is a forward edge from i8 ∈ S to i9 ∈ V \ S with flow value 1.
Thus, both i5 and i9 satisfy the condition of the “if” statement in Line 3. Then, update S to
{t1, s, i1, i3, i4, i5, i8, i9}.


• Now, we see that no new node in V \ S satisfying the condition of the “if” statement in Line 3
exists. Algorithm 3 terminates and returns the edge set ρ below:
 
ρ = {e : tail(e) ∈ V \ S and head(e) ∈ S} = {e4^2 = (ve4, i4), e18 = (i7, t1)}.

We readily see that ρ is the primary minimum cut separating t1 from vη on Gt1 ,η .

By the definition of a cut separating a node from an edge subset in Section IV, the edge subset {e4, e18}
is the primary minimum cut separating t1 from η on G.

The computational complexity of Algorithm 3 is at most O(|Et |) since in the algorithm, each edge in
Et is examined at most once. If we use the augmenting path algorithm to find Cn,t edge-disjoint paths
from n to t, then Algorithm 3 is already incorporated, and the total complexity for finding the primary
minimum cut separating t from n is at most O(Cn,t · |Et |), because the path augmentation approach
requires at most O(|Et |) time as mentioned and the number of the path augmentations is upper bounded
by the minimum cut capacity Cn,t .

Now, we can analyze the total complexity of Algorithm 1 for computing At(r). By combining the
foregoing discussions, we see that the complexity of Algorithm 1 is linear time in |Et|. This is elaborated
as follows: i) The complexity for finding the primary minimum cut ρ separating t from an edge subset
η (Line 5 in Algorithm 1) is at most O(|Et|); ii) The complexity for partitioning Et into two parts
Et,ρ and E^c_{t,ρ} (Line 9 in Algorithm 1) is at most O(|Et,ρ|), not larger than O(|Et|); iii) Removing all
the edge subsets in B that are subsets of E^c_{t,ρ} (Lines 10–12 in Algorithm 1) can be implemented by
creating an appropriate data structure to avoid computational complexity; iv) The “while” loop (Line 3 in
Algorithm 1) is executed |At(r)| times.6 So the complexity of Algorithm 1 is at most O(|At(r)| · |Et|),
that is, linear time in |Et|.

VI. C ONCLUSION

In this paper, we revisited and explored the framework of LNEC coding and network error correction on
a network of which the topology is known. Then, we showed that the two well-known LNEC approaches in
the literature are in fact equivalent. Further, we enhanced the characterization of error correction capability
of LNEC codes in terms of the minimum distances at the sink nodes by developing a graph-theoretic
approach. Based on this result, the computational complexities for decoding and code construction can
be significantly reduced.

6 Here, it suffices to consider edge subsets η ∈ B with mincut(η, t) = r.


In LNEC coding, the minimum required field size for the existence of LNEC codes, in particular LNEC
MDS codes, is an open problem not only of theoretical interest but also of practical importance. However,
the existing upper bounds on the minimum required field size for the existence of LNEC (MDS) codes
are typically too large for implementation. In this paper, we proved an improved upper bound on the
minimum required field size, which shows that the required field size for the existence of LNEC (MDS)
codes can be reduced significantly in general. This new bound only depends on the network topology
and the requirement of error correction capability but not on a specific code construction. However, it is
not given in an explicit form. Thus, we developed an efficient algorithm that computes the upper bound
in time linear in the number of edges in the network. In developing the upper bound and the efficient
algorithm for computing this bound, various graph-theoretic concepts are introduced. These concepts
appear to be of fundamental interest in graph theory and they may have further applications in graph
theory and beyond.

A PPENDIX A
P ROOF OF P ROPOSITION 1

The positive definiteness and symmetry are straightforward. To complete the proof, we only need to
prove the triangle inequality. Consider three arbitrary vectors ỹt, ỹt′ and ỹt′′ in Fq^{|In(t)|}. Let

d^{(t)}(ỹt, ỹt′) = d1 and d^{(t)}(ỹt′, ỹt′′) = d2.

Let ρ1 ⊆ E be an edge subset with |ρ1 | = d1 such that there exists an error vector z1 ∈ ρ1 satisfying

ỹt − ỹt′ = z1 · Gt , (65)

and similarly ρ2 ⊆ E be an edge subset with |ρ2 | = d2 such that there exists an error vector z′ ∈ ρ2
satisfying

ỹt′ − ỹt′′ = z′ · Gt . (66)

Combining (65) and (66), we immediately obtain that

ỹt − ỹt′′ = (ỹt − ỹt′ ) + (ỹt′ − ỹt′′ )

= (z1 + z′ ) · Gt . (67)

Further, we let z1 + z′ ≜ (ze : e ∈ E) and ρ ≜ {e ∈ E : ze ≠ 0}. Clearly, z1 + z′ ∈ ρ and

|ρ| ≤ |ρ1| + |ρ2| = d1 + d2.


Together with the definition in (20), we immediately see that

d^{(t)}(ỹt, ỹt′′) ≤ |ρ| ≤ d1 + d2 = d^{(t)}(ỹt, ỹt′) + d^{(t)}(ỹt′, ỹt′′).

We thus have proved the triangle inequality and also Proposition 1.

R EFERENCES

[1] P. Elias, A. Feinstein, and C. E. Shannon, “A note on maximum flow through a network,” IRE Trans. Inf. Theory, vol. 2,
no. 4, pp. 117–119, April 1956.
[2] L. R. Ford Jr. and D. R. Fulkerson, “Maximal flow through a network,” Canadian Journal of Mathematics, vol. 8, no. 3,
pp. 399-404, 1956.
[3] R. Ahlswede, N. Cai, S.-Y. R. Li, and R. W. Yeung, “Network information flow,” IEEE Trans. Inf. Theory, vol. 46, no. 4,
pp. 1204-1216, Jul. 2000.
[4] M. Celebiler, G. Stette, “On increasing the down-link capacity of a regenerative satellite repeater in point-to-point
communications,” Proceedings of the IEEE, vol. 66, no. 1, pp. 98-100, Jan. 1978.
[5] R. W. Yeung and Z. Zhang, “Distributed source coding for satellite communications,” IEEE Trans. Inf. Theory, vol. 45, no.
4, pp. 1111-1120, May 1999.
[6] S.-Y. R. Li, R. W. Yeung, and N. Cai, “Linear network coding,” IEEE Trans. Inf. Theory, vol. 49, no. 2, pp. 371-381, Jul.
2003.
[7] R. Koetter and M. Médard, “An algebraic approach to network coding,” IEEE/ACM Trans. Netw., vol. 11, no. 5, pp. 782-795,
Oct. 2003.
[8] R. W. Yeung, S.-Y. R. Li, N. Cai, and Z. Zhang, “Network coding theory,” Foundations and Trends in Communications and
Information Theory, vol. 2, nos.4 and 5, pp. 241-381, 2005.
[9] R. W. Yeung, Information Theory and Network Coding. New York: Springer, 2008.
[10] C. Fragouli and E. Soljanin, “Network coding fundamentals,” Foundations and Trends in Networking, vol. 2, no.1, pp.
1-133, 2007.
[11] C. Fragouli and E. Soljanin, “Network coding applications,” Foundations and Trends in Networking, vol. 2, no.2, pp.
135-269, 2007.
[12] T. Ho and D. S. Lun, Network Coding: An Introduction. Cambridge, U.K.: Cambridge Univ. Press, 2008.
[13] N. Cai and R. W. Yeung, “Network coding and error correction,” in Proc. IEEE Information Theory Workshop 2002,
Bangalore, India, Oct. 2002, pp. 119-122.
[14] R. W. Yeung and N. Cai, “Network error correction, part I: Basic concepts and upper bounds,” Communications in
Information and Systems, vol. 6, pp. 19-36, 2006.
[15] N. Cai and R. W. Yeung, “Network error correction, part II: Lower bounds,” Communications in Information and Systems,
vol. 6, pp. 37-54, 2006.
[16] Z. Zhang, “Linear network error correction codes in packet networks,” IEEE Trans. Inf. Theory, vol. 54, no. 1, pp. 209-218,
Jan. 2008.
[17] S. Yang, R. W. Yeung, and C. K. Ngai, “Refined Coding Bounds and Code Constructions for Coherent Network Error
Correction,” IEEE Trans. Inf. Theory, vol. 57, no. 3, pp. 1409-1424, Mar. 2011.
[18] R. Koetter and F. Kschischang, “Coding for errors and erasures in random network coding,” IEEE Trans. Inf. Theory, vol.
54, no. 8, pp. 3579-3591, Aug. 2008.


[19] D. Silva, F. Kschischang, and R. Kötter, “A Rank-Metric Approach to Error Control in Random Network Coding,” IEEE
Trans. Inf. Theory, vol. 54, no. 9, pp. 3951-3967, Sep. 2008.
[20] Z. Zhang, “Theory and applications of network error correction coding,” Proceedings of the IEEE, vol. 99, no. 3, pp.
406-420, March 2011.
[21] X. Guang, F.-W. Fu, and Z. Zhang, “Construction of Network Error Correction Codes in Packet Networks,” IEEE Trans.
Inf. Theory, vol. 59, no. 2, pp. 1030-1047, Feb. 2013.
[22] X. Guang and Z. Zhang, Linear Network Error Correction Coding. New York: Springer, 2014.
[23] F. J. MacWilliams and N. J. A. Sloane, The Theory of Error Correcting Codes. Amsterdam, The Netherlands: North-
Holland, 1977.
[24] W. C. Huffman and V. Pless, Fundamentals of Error Correcting Codes. Cambridge, U.K.: Cambridge Univ. Press, 2003.
[25] C. E. Shannon, “A Mathematical Theory of Communication,” Bell Sys. Tech. Journal, 27: 379–423, 623–656, 1948.
[26] S. Yang, R. W. Yeung, and Z. Zhang, “Weight properties of network codes,” European Transactions on Telecommunications,
vol. 19, no. 4, pp. 371–383, 2008.
[27] R. Matsumoto, “Construction algorithm for network error-correcting codes attaining the singleton bound,” IEICE Trans.
Fundamentals, vol. E90-A, no. 9, pp. 1729–1735, Nov. 2007.
[28] X. Guang, F.-W. Fu, and Z. Zhang, “Variable-rate linear network error correction MDS codes,” IEEE Trans. Inf. Theory,
vol. 62, no. 6, pp. 3147–3164, June 2016.
[29] H. Balli, X. Yan, and Z. Zhang, “On randomized linear network codes and their error correction capabilities,” IEEE Trans.
Inf. Theory, vol. 55, no. 7, pp. 3148–3160, July 2009.
[30] N. Cai, “Valuable messages and random outputs of edges in linear network coding,” in Proc. IEEE Int. Symp. Information
Theory, Seoul, Korea, June 2009, pp. 413–417.
[31] T. Ho, R. Koetter, M. Médard, M. Effros, J. Shi, and D. Karger, “A random linear network coding approach to multicast,”
IEEE Trans. Inf. Theory, vol. 52, no. 10, pp. 4413–4430, Oct. 2006.
[32] A. Khaleghi, D. Silva, and F. R. Kschischang, “Subspace codes,” in Cryptography and Coding 2009, M. G. Parker Ed.,
Lecture Notes in Computer Science, vol. 5921, pp. 1-21, 2009.
[33] D. Silva and F. R. Kschischang, “On metrics for error correction in network coding,” IEEE Trans. Inf. Theory, vol. 55,
no. 12, pp. 5479–5490, Dec. 2009.
[34] T. Ho, B. Leong, R. Koetter, M. Médard, M. Effros, and D. Karger, “Byzantine modification detection in multicast
networks with random network coding,” IEEE Trans. Inf. Theory, vol. 54, no. 6, pp. 2798–2803, June 2008.
[35] S. Jaggi, M. Langberg, S. Katti, T. Ho, D. Katabi, M. Medard, and M. Effros, “Resilient network coding in the presence
of byzantine adversaries,” IEEE Trans. Inf. Theory, vol. 54, no. 6, pp. 2596–2603, Jun. 2008.
[36] L. Nutman and M. Langberg, “Adversarial models and resilient schemes for network coding,” in Proc. IEEE Int. Symp.
Inf. Theory (ISIT), Toronto, ON, Canada, July 2008, pp. 171–175.
[37] O. Kosut, L. Tong, and D. N. C. Tse, “Polytope codes against adversaries in networks,” IEEE Trans. Inf. Theory, vol. 60,
no. 6, pp. 3308–3344, June 2014.
[38] X. Guang, R. W. Yeung, and F.-W. Fu, “Local-Encoding-Preserving Secure Network Coding,” IEEE Trans. Inf. Theory,
vol. 66, no. 10, pp. 5965–5994, Oct. 2020.
[39] X. Guang and R. W. Yeung, “Alphabet Size Reduction for Secure Network Coding: A Graph Theoretic Approach,” IEEE
Trans. Inf. Theory, vol. 64, no. 6, pp. 4513–4529, June 2018.
[40] X. Guang, J. Lu, and F.-W. Fu, “Small field size for secure network coding,” IEEE Commun. Lett., vol. 19, no. 3, pp.
375–378, March 2015.


[41] R. K. Ahuja, T. L. Magnanti, and J. B. Orlin, Network Flows: Theory, Algorithms, and Applications. Englewood Cliffs,
NJ: Prentice-Hall, 1993.
[42] J. A. Bondy and U. S. R. Murty, Graph Theory. Springer, 2008.

